Mac OS X Internals: A Systems Approach
9.8. Signals
Besides Mach exception handling, Mac OS X provides Unix-style signals as well, with the latter built atop the former.
An implementation of the signal mechanism involves two well-defined phases: signal generation and signal delivery. Signal generation is the occurrence of an event that warrants a signal. Signal delivery is the invocation of the signal's disposition, that is, the carrying out of the associated signal action. Each signal has a default action, which can be one of the following on Mac OS X.
A signal can have its default action overridden by a user-specified handler. The sigaction() system call can be used to assign signal actions, which can be specified as SIG_DFL (use the default action), SIG_IGN (ignore the signal), or a pointer to a signal handler function (catch the signal). A signal can also be blocked, wherein it remains pending until it is unblocked or the corresponding signal action is set to SIG_IGN. The sigprop array [bsd/sys/signalvar.h] categorizes the known signals and their default actions.
    // bsd/sys/signalvar.h
    #define SA_KILL    0x01 // terminates process by default
    #define SA_CORE    0x02 // ditto and dumps core
    #define SA_STOP    0x04 // suspend process
    #define SA_TTYSTOP 0x08 // ditto, from tty
    #define SA_IGNORE  0x10 // ignore by default
    #define SA_CONT    0x20 // continue if suspended
    int sigprop[NSIG + 1] = {
        0,               // unused
        SA_KILL,         // SIGHUP
        SA_KILL,         // SIGINT
        SA_KILL|SA_CORE, // SIGQUIT
        ...
        SA_KILL,         // SIGUSR1
        SA_KILL,         // SIGUSR2
    };
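The three kinds of signal action described above can be tried out from user space. The following is a minimal sketch, assuming a POSIX system; the names on_usr1(), install_actions(), and count_caught_usr1() are illustrative and do not come from the Mac OS X sources. It catches SIGUSR1 with a handler and sets SIGUSR2 to SIG_IGN, so that SIGUSR2's default action (termination, per SA_KILL in sigprop) is not taken.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t usr1_count = 0;

/* Handler: touches only a sig_atomic_t counter, which is safe in a handler. */
static void on_usr1(int signo)
{
    (void)signo;
    usr1_count++;
}

/* Catch SIGUSR1 with on_usr1() and ignore SIGUSR2. */
void install_actions(void)
{
    struct sigaction sa;
    int rc;

    memset(&sa, 0, sizeof sa);
    sigemptyset(&sa.sa_mask);

    sa.sa_handler = on_usr1;            /* catch the signal */
    rc = sigaction(SIGUSR1, &sa, NULL);
    assert(rc == 0);

    sa.sa_handler = SIG_IGN;            /* ignore the signal */
    rc = sigaction(SIGUSR2, &sa, NULL);
    assert(rc == 0);
    (void)rc;
}

/* Raise both signals in the calling thread; return how many SIGUSR1s were caught. */
int count_caught_usr1(void)
{
    usr1_count = 0;
    install_actions();
    raise(SIGUSR1);     /* handler runs before raise() returns */
    raise(SIGUSR2);     /* ignored, so the process is not terminated */
    raise(SIGUSR1);     /* handler is still installed: no reset to SIG_DFL */
    return (int)usr1_count;
}
```

Because the handler was installed through sigaction(), it remains in place after the first delivery, so the second SIGUSR1 is caught as well.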
The following exceptional cases should be noted about blocking, catching, and ignoring signals.
The signal(3) man page provides a list of supported signals and their default actions.
The Mach exception-handling facility was designed to address several problems with the prevailing signal mechanisms in Unix systems. As Unix systems have evolved, the design and implementation of signal mechanisms have improved too. Let us look at some aspects of signals in the context of Mac OS X.
9.8.1. Reliability
Early signal implementations were unreliable in that a signal's action was reset to the default action whenever that signal was caught. If there were two or more successive occurrences of the same signal, there was a race condition: the kernel would reset the signal handler, and before the program could reinstall the user-defined handler, the default action would be invoked. Since the default action of many signals is to terminate the process, this was a severe problem. POSIX.1 included a reliable signal mechanism based on the signal mechanisms in 4.2BSD and 4.3BSD. The new mechanism requires the use of the newer sigaction(2) interface instead of the older signal(3) interface. Mac OS X provides both interfaces, although signal(3) is implemented in the system library as a wrapper call to sigaction(2).
9.8.2. The Number of Signals
Although the number of signal types available in Unix systems has increased over the years, often there are hard upper bounds because of the data types that kernels use to represent signal types. Mac OS X uses a 32-bit unsigned integer to represent a set of signals, with one bit per signal number, allowing a maximum of 32 signals. Mac OS X 10.4 has 31 signals.
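One way to see where the limit of 32 comes from is to model a signal set as a 32-bit mask with one bit per signal number, in the style of the BSD sigmask() macro. The sketch below is only a model: model_sigset_t and the model_* functions are hypothetical names, not the kernel's definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a signal set held in a 32-bit unsigned integer. */
typedef uint32_t model_sigset_t;

/* Bit for signal number signo; signal numbers are 1-based, so signal n uses bit n-1. */
static model_sigset_t model_sigmask(int signo)
{
    assert(signo >= 1 && signo <= 32);   /* no bit exists for a 33rd signal */
    return (model_sigset_t)1 << (signo - 1);
}

model_sigset_t model_sigaddset(model_sigset_t set, int signo)
{
    return set | model_sigmask(signo);
}

model_sigset_t model_sigdelset(model_sigset_t set, int signo)
{
    return set & ~model_sigmask(signo);
}

int model_sigismember(model_sigset_t set, int signo)
{
    return (set & model_sigmask(signo)) != 0;
}
```

In this model, pending and blocked signals can be combined with plain bitwise operations; for example, the deliverable signals are the pending set ANDed with the complement of the blocked set.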
9.8.3. Application-Defined Signals
POSIX.1 provides two application-defined signals, SIGUSR1 and SIGUSR2, which can be used by the programmer for arbitrary purposes, for example, as a rudimentary IPC mechanism.
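As an illustration of SIGUSR1 used as a rudimentary IPC mechanism, the following sketch has a child process notify its parent. It assumes a POSIX system; the function and variable names are made up for this example. The parent blocks SIGUSR1 before calling fork() and then waits with sigsuspend(), so the notification cannot slip in before the parent is ready to wait for it.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t got_usr1 = 0;

static void note_usr1(int signo)
{
    (void)signo;
    got_usr1 = 1;
}

/* Fork a child that sends SIGUSR1 to the parent; return 1 once the parent saw it. */
int wait_for_child_notification(void)
{
    struct sigaction sa;
    sigset_t block, old, waitmask;
    pid_t child;
    int rc;

    got_usr1 = 0;
    memset(&sa, 0, sizeof sa);
    sigemptyset(&sa.sa_mask);
    sa.sa_handler = note_usr1;
    rc = sigaction(SIGUSR1, &sa, NULL);
    assert(rc == 0);

    /* Keep SIGUSR1 pending until we explicitly wait for it. */
    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    rc = sigprocmask(SIG_BLOCK, &block, &old);
    assert(rc == 0);
    (void)rc;

    child = fork();
    if (child < 0) {
        sigprocmask(SIG_SETMASK, &old, NULL);
        return -1;
    }
    if (child == 0) {                    /* child: notify the parent and exit */
        kill(getppid(), SIGUSR1);
        _exit(0);
    }

    /* Parent: atomically unblock SIGUSR1 and sleep until the handler has run. */
    waitmask = old;
    sigdelset(&waitmask, SIGUSR1);
    while (!got_usr1)
        sigsuspend(&waitmask);

    sigprocmask(SIG_SETMASK, &old, NULL);
    waitpid(child, NULL, 0);             /* reap the child */
    return 1;
}
```

The signal carries no payload here; it only says that something happened, which is why such use is rudimentary compared with pipes or Mach messages.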
Mac OS X 10.4 does not support real-time signals, which were originally defined as part of the Real-time Signals Extension in POSIX.4. Real-time signals are application-defined signals and can vary in number (ranging from SIGRTMIN to SIGRTMAX) across systems that provide them. Other characteristics distinguish real-time signals from regular signals. For example, real-time signals are delivered in a guaranteed order: Multiple simultaneously pending real-time signals of the same type are delivered in the order they were sent, whereas simultaneously pending real-time signals of different types are delivered in the order of their signal numbers (lowest numbered first).
9.8.4. Signal-Based Notification of Asynchronous I/O
Mac OS X provides the asynchronous I/O (AIO) family of functions, also defined as part of POSIX.4. When an asynchronous event (such as a completed read or write) occurs, a program can receive a notification through one of the following mechanisms:
Mac OS X 10.4 supports only SIGEV_NONE and SIGEV_SIGNAL. Figure 9-39 shows a contrived program that uses the lio_listio() system call to submit an asynchronous read operation, while requesting notification of read completion through the SIGUSR1 signal. Multiple read or write operations, up to AIO_LISTIO_MAX (16), can be submitted in a single call through lio_listio().
Figure 9-39. Signal notification of asynchronous I/O completion
9.8.5. Signals and Multithreading
The signal mechanism does not lend itself well to a multithreaded environment. Traditional signal semantics require exceptions to be handled serially, which is problematic when a multithreaded application generates exception signals. For example, if several threads hit breakpoints while debugging a multithreaded application, only one breakpoint can be reported to the debugger, which will therefore not have access to the entire state of the process.
Modern-day operating systems have to deal with several common and system-specific problems in their signal implementations. A representative multithreaded signal implementation in a modern Unix system has per-thread signal masks, allowing threads to block signals independently of other threads in the same process. Mac OS X provides the pthread_sigmask() system call to examine or change (or both) the calling thread's signal mask.
If a signal is generated because of a trap, such as an illegal instruction or an arithmetic exception (i.e., the signal is synchronous), it is sent to the thread that caused the trap. Others (typically asynchronous signals) are delivered to the first thread that is not blocking the signal. Note that signals such as SIGKILL, SIGSTOP, and SIGTERM affect the entire process.
9.8.6. Signal Actions
A signal action can be carried out only by the process (technically, a thread within that process) to which the signal was delivered. Unlike Mach exceptions, which can be handled by any thread in any task (with prior arrangement), no process can execute a signal handler on another process's behalf. This is problematic when the complete register context of an exception is desirable or the exception may have corrupted the resources of the victim process. Debuggers have been historically difficult to implement on Unix systems because of limitations in prevailing signal mechanisms.
POSIX.1 allows a process to declare a signal to have its handler execute on an alternate stack, which can be defined and examined using sigaltstack(2). When changing a signal action through sigaction(2), the sa_flags field of the sigaction structure can have the SA_ONSTACK flag set to cause delivery of the signal in question on the alternate stack, provided an alternate stack has been declared with sigaltstack().
    int sigaltstack(const struct sigaltstack *newss, struct sigaltstack *oldss);
    // bsd/sys/signal.h
    struct sigaltstack {
        user_addr_t ss_sp;    // signal stack base
        user_size_t ss_size;  // signal stack length
        int         ss_flags; // SS_DISABLE and/or SS_ONSTACK
    };
    #define SS_ONSTACK  0x0001 // take signal on signal stack
    #define SS_DISABLE  0x0004 // disable taking signals on alternate stack
    #define MINSIGSTKSZ 32768  // (32KB) minimum allowable stack
    #define SIGSTKSZ    131072 // (128KB) recommended stack size
If the signal handler needs exception context, the kernel must explicitly save the context and pass it to the handler for examination. For example, POSIX.1 stipulates that the signal-catching function (handler) for a signal will be entered differently based on whether the SA_SIGINFO flag is set for the signal or not.
    // SA_SIGINFO is cleared for this signal (no context passed)
    void sig_handler(int signo);
    // SA_SIGINFO is set for this signal (context passed)
    void sig_handler(int signo, siginfo_t *info, void *context);
The siginfo_t structure on a system must at least contain the signal number, the cause of the signal, and the signal value.
    // bsd/sys/signal.h
    // kernel representation of siginfo_t
    typedef struct __user_siginfo {
        int               si_signo;  // signal number
        int               si_errno;  // errno association
        int               si_code;   // signal code
        pid_t             si_pid;    // sending process
        uid_t             si_uid;    // sender's real user ID
        int               si_status; // exit value
        user_addr_t       si_addr;   // faulting instruction
        union user_sigval si_value;  // signal value
        user_long_t       si_band;   // band event for SIGPOLL
        user_ulong_t      pad[7];    // reserved
    } user_siginfo_t;
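From user space, the three-argument handler form is selected by setting SA_SIGINFO and filling in sa_sigaction instead of sa_handler. The following sketch, assuming a POSIX system and using illustrative names, sends SIGUSR1 to the calling process with kill() and records the si_signo and si_pid fields that the handler receives.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

static volatile sig_atomic_t seen_signo = 0;
static volatile sig_atomic_t seen_pid = 0;

/* Three-argument handler form, used because SA_SIGINFO is set. */
static void on_siginfo(int signo, siginfo_t *info, void *context)
{
    (void)signo;
    (void)context;
    seen_signo = info->si_signo;              /* signal number */
    seen_pid = (sig_atomic_t)info->si_pid;    /* sending process */
}

/* Return 1 if the handler saw SIGUSR1 sent by this very process, 0 otherwise. */
int siginfo_reports_sender(void)
{
    struct sigaction sa;
    int rc;

    memset(&sa, 0, sizeof sa);
    sigemptyset(&sa.sa_mask);
    sa.sa_sigaction = on_siginfo;
    sa.sa_flags = SA_SIGINFO;
    rc = sigaction(SIGUSR1, &sa, NULL);
    assert(rc == 0);

    seen_signo = 0;
    seen_pid = 0;
    /* In a single-threaded process with SIGUSR1 unblocked, the signal is
       delivered before kill() returns. */
    rc = kill(getpid(), SIGUSR1);
    assert(rc == 0);
    (void)rc;

    return seen_signo == SIGUSR1 && seen_pid == (sig_atomic_t)getpid();
}
```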
When a signal handler is invoked, the current user context is saved and a new context is created. The context argument to sig_handler() can be cast to a pointer to an object of type ucontext_t. It refers to the receiving process's user context that was interrupted when the signal was delivered. The ucontext_t structure contains a data structure of type mcontext_t, which represents the machine-specific register state of the context.
    // kernel representation of 64-bit ucontext_t
    struct user_ucontext64 {
        // SA_ONSTACK set?
        int                     uc_onstack;
        // set of signals that are blocked when this context is active
        sigset_t                uc_sigmask;
        // stack used by this context
        struct user_sigaltstack uc_stack;
        // pointer to the context that is resumed when this context returns
        user_addr_t             uc_link;
        // size of the machine-specific representation of the saved context
        user_size_t             uc_mcsize;
        // machine-specific representation of the saved context
        user_addr_t             uc_mcontext64;
    };
    // kernel representation of 64-bit PowerPC mcontext_t
    struct mcontext64 {
                                         // size_in_units_of_natural_t =
        struct ppc_exception_state64 es; // PPC_EXCEPTION_STATE64_COUNT +
        struct ppc_thread_state64    ss; // PPC_THREAD_STATE64_COUNT +
        struct ppc_float_state       fs; // PPC_FLOAT_STATE_COUNT +
        struct ppc_vector_state      vs; // PPC_VECTOR_STATE_COUNT
    };
The type and the amount of context made available to a signal handler depend on the operating system and the hardware; the context is not guaranteed against corruption.
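The machine-specific part of the context differs from system to system, but the uc_sigmask field is portable enough for a small experiment. The sketch below assumes a POSIX system on which ucontext_t is visible through signal.h; the names are illustrative. The handler compares the signal mask in effect while it runs with the mask saved in the interrupted context.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t blocked_in_handler = -1;
static volatile sig_atomic_t blocked_in_saved_context = -1;

static void on_usr1_ctx(int signo, siginfo_t *info, void *context)
{
    ucontext_t *uc = (ucontext_t *)context;   /* the interrupted user context */
    sigset_t now;

    (void)info;
    sigprocmask(SIG_BLOCK, NULL, &now);       /* query only; the mask is unchanged */
    blocked_in_handler = sigismember(&now, signo);
    blocked_in_saved_context = sigismember(&uc->uc_sigmask, signo);
}

/* Return 1 if SIGUSR1 is blocked inside its handler but not in the saved context. */
int saved_context_has_old_mask(void)
{
    struct sigaction sa;
    sigset_t unblock;
    int rc;

    /* Make sure SIGUSR1 is unblocked in the context that will be interrupted. */
    sigemptyset(&unblock);
    sigaddset(&unblock, SIGUSR1);
    rc = sigprocmask(SIG_UNBLOCK, &unblock, NULL);
    assert(rc == 0);

    memset(&sa, 0, sizeof sa);
    sigemptyset(&sa.sa_mask);
    sa.sa_sigaction = on_usr1_ctx;
    sa.sa_flags = SA_SIGINFO;                 /* SA_NODEFER deliberately not set */
    rc = sigaction(SIGUSR1, &sa, NULL);
    assert(rc == 0);
    (void)rc;

    raise(SIGUSR1);
    return blocked_in_handler == 1 && blocked_in_saved_context == 0;
}
```

The saved mask is the one that is restored when the handler returns, which is why the signal being handled does not appear in it even though it is blocked for the duration of the handler.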
Mac OS X does not provide the POSIX getcontext() and setcontext() functions for retrieving and setting, respectively, the current user context of the calling thread. As we saw earlier, thread_get_state() and thread_set_state() are used for this purpose. Other related functions such as makecontext() and swapcontext() are also not available on Mac OS X. In any case, the getcontext() and setcontext() routines have been marked as obsolescent in SUSv3[13] and can be replaced using POSIX thread functions.
[13] Single UNIX Specification, Version 3.
9.8.7. Signal Generation and Delivery
The kill() system call, which is used to send a signal to one or more processes, is invoked with two arguments: a process ID (pid) and a signal number. It sends the specified signal (provided that the caller's credentials permit it) to one or more processes based on whether the given pid is positive, 0, -1, or otherwise negative. The details of kill()'s behavior are described in the kill(2) man page. The killpg() system call sends the given signal to the given process group. For a certain combination of its arguments, kill() is equivalent to killpg().
The implementations of both system calls on Mac OS X use the psignal() internal function [bsd/kern/kern_sig.c] to send the signal. psignal() is a simple wrapper around psignal_lock() [bsd/kern/kern_sig.c]. If the signal has an associated action, psignal_lock() adds the signal to the set of pending signals for the process. Figure 9-40 shows the important functions that are part of the signal mechanism in the kernel.
Figure 9-40. Implementation of the signal mechanism in the kernel
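The four pid cases that kill() distinguishes can be summarized in a few lines. The following sketch uses a hypothetical enumeration and helper names purely for illustration; the kill(2) man page remains the authority on permissions and corner cases. The second function uses signal number 0, with which kill() performs its existence and permission checks without sending anything.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical labels for the targets selected by kill()'s pid argument. */
enum kill_target {
    KILL_ONE_PROCESS,     /* pid > 0:  the process whose ID is pid */
    KILL_OWN_GROUP,       /* pid == 0: the caller's own process group */
    KILL_ALL_PERMITTED,   /* pid == -1: every process the caller may signal */
    KILL_GIVEN_GROUP      /* pid < -1: the process group whose ID is -pid */
};

enum kill_target classify_kill_pid(pid_t pid)
{
    if (pid > 0)
        return KILL_ONE_PROCESS;
    if (pid == 0)
        return KILL_OWN_GROUP;
    if (pid == -1)
        return KILL_ALL_PERMITTED;
    return KILL_GIVEN_GROUP;          /* same target as killpg(-pid, sig) */
}

/* Return 1 if the process exists and the caller is permitted to signal it. */
int can_signal(pid_t pid)
{
    assert(pid > 0);
    return kill(pid, 0) == 0;         /* signal 0: checks only, nothing is sent */
}
```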
psignal_lock() calls get_signalthread() [bsd/kern/kern_sig.c] to select a thread for signal delivery. get_signalthread() examines the threads within the process, normally selecting the first thread that is not blocking the signal. Sending signals to the first thread allows single-threaded programs to be linked with multithreaded libraries. If get_signalthread() returns successfully, a specific asynchronous system trap (AST_BSD) is set for the thread. psignal_lock() then processes the signal, performing signal-specific actions as necessary and allowed. In particular, psignal_lock() examines the following fields of the uthread structure, possibly modifying uu_siglist and uu_sigwait:
Before the thread returns to user space from the kernel (after a system call or trap), the kernel checks the thread for pending BSD ASTs. If it finds any, the kernel calls bsd_ast() [bsd/kern/kern_sig.c] on the thread.
    // bsd/kern/kern_sig.c
    void
    bsd_ast(thread_t thr_act)
    {
        ...
        if (CHECK_SIGNALS(p, current_thread(), ut)) {
            while ((signum = issignal(p)))
                postsig(signum);
        }
        ...
    }
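The thread-selection policy attributed to get_signalthread() above, namely picking the first thread that is not blocking the signal, can be modeled in a few lines. This is a simplified user-space model under stated assumptions, not the kernel's code: struct model_thread and model_select_thread() are hypothetical, and each thread is reduced to a 32-bit mask of blocked signals.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-thread state: bit (n - 1) set means signal n is blocked. */
struct model_thread {
    uint32_t sigmask;
};

/* Index of the first thread not blocking signo, or -1 if every thread blocks it. */
int model_select_thread(const struct model_thread *threads, size_t count, int signo)
{
    uint32_t bit;
    size_t i;

    assert(signo >= 1 && signo <= 32);
    bit = (uint32_t)1 << (signo - 1);

    for (i = 0; i < count; i++) {
        if ((threads[i].sigmask & bit) == 0)
            return (int)i;            /* this thread can take the signal */
    }
    return -1;                        /* the signal would stay pending */
}
```

In this model, a single-threaded program always has its one thread at index 0, which is consistent with the remark above about linking single-threaded programs with multithreaded libraries.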
psignal_lock() does not send signals to the kernel task, zombie processes, or a process that has invoked the reboot() system call. The CHECK_SIGNALS() macro [bsd/sys/signalvar.h] ensures that the thread is active (not terminated) and then calls the SHOULDissignal() macro to determine whether there are signals to be delivered based on the following quick checks.
When called in a loop, issignal() [bsd/kern/kern_sig.c] keeps returning a signal number if the current process has received a signal that should be caught, should cause termination of the process, or should interrupt the current system call. issignal() performs a variety of processing depending on the type of the signal, whether the signal is masked, whether the signal has the default action, and so on. For example, if the process has a pending SIGSTOP with the default action, issignal() processes the signal immediately and clears the signal. No signal number is returned in this case.
Signals that have actions (including the default action of terminating the process) are returned and are processed by postsig() [bsd/kern/kern_sig.c]. postsig() either terminates the process if the default action warrants it or calls sendsig() [bsd/dev/ppc/unix_signal.c] to arrange for the process to run a signal handler. This arrangement primarily involves population of ucontext and mcontext structures (32-bit or 64-bit, as appropriate) that contain the context information required by the handler to run within the thread in user space. The context is copied to user space and various registers are set up, including SRR0, which contains the address at which the handler will start execution. Finally, postsig() calls thread_setstatus() [osfmk/kern/thread_act.c] to set the thread's machine state. thread_setstatus() is a trivial wrapper around the thread_set_state() Mach routine.
9.8.8. Mach Exceptions and Unix Signals Living Together
When the kernel starts up, bsdinit_task() [bsd/kern/bsd_init.c] calls ux_handler_init() [bsd/uxkern/ux_exception.c] to initialize the Unix exception handler. ux_handler_init() starts a kernel thread that runs ux_handler() [bsd/uxkern/ux_exception.c], an internal kernel exception handler that provides Unix compatibility by converting Mach exception messages to Unix signals. ux_handler() allocates a port set for receiving messages and then allocates an exception port within the set. The global name for this port is contained in ux_exception_port. The exception ports of both the host and the BSD init task (that would eventually run launchd) are set to ux_exception_port. Since launchd is the ultimate ancestor of all Unix processes, and task exception ports are inherited across fork(), most exceptions that have signal analogs are converted to signals by default.
The message-handling loop of ux_handler() is the typical Mach exception handler loop: An exception message is received, exc_server() is called, and a reply message is sent. If there is an error in receiving a message because it is too large, the message is ignored. Any other error in message reception results in a kernel panic. The corresponding call to catch_exception_raise() causes an exception to be converted to a Unix signal and code by calling ux_exception() [bsd/uxkern/ux_exception.c]. Finally, the resultant signal is sent to the appropriate thread.
    // bsd/uxkern/ux_exception.c
    kern_return_t
    catch_exception_raise(...)
    {
        ...
        if (th_act != THR_ACT_NULL) {
            ut = get_bsdthread_info(th_act);
            // convert {Mach exception, code, subcode} to {Unix signal, uu_code}
            ux_exception(exception, code[0], code[1], &ux_signal, &ucode);
            // send signal
            if (ux_signal != 0)
                threadsignal(th_act, ux_signal, ucode);
            thread_deallocate(th_act);
        }
        ...
    }
ux_exception() first calls machine_exception() [bsd/dev/ppc/unix_signal.c] to attempt a machine-dependent translation of the given Mach exception and code to a Unix signal and code.
The translation is as follows:
If machine_exception() fails to translate a Mach exception, ux_exception() itself translates exceptions as shown in Table 9-7.
The difference between SIGBUS and SIGSEGV must be carefully noted. Both correspond to a bad memory access, but for different reasons. A SIGBUS (bus error) occurs when the memory is valid in that it is mapped, but the victim is not allowed to access it. Accessing page 0, which is normally mapped into each address space with all access to it disallowed, will result in a SIGBUS. In contrast, a SIGSEGV (segmentation fault) occurs when the memory address is invalid in that it is not even mapped.
The automatic conversion of Mach exceptions to signals does not preclude user-level handling of the Mach exceptions underlying those signals. If there exists a task-level or thread-level exception handler, it will receive the exception message instead of ux_handler(). Thereafter, the user's handler can handle the exception entirely, performing any cleanup or corrective actions, or it may forward the initial exception message to ux_handler(), which would cause the exception to be converted to a signal after all. This is what the GNU debugger (GDB) does.
Moreover, instead of forwarding the initial exception message, a user's exception handler can also send a new message to ux_handler(). This would require send rights to ux_exception_port, which is the original task exception port before the task-level or thread-level exception handler is installed by the user. A rather convoluted way of sending a software signal to a process would be to package and send the relevant information in a Mach exception message. (The exception type, code, and subcode would be EXC_SOFTWARE, EXC_SOFT_SIGNAL, and the signal number, respectively.)
9.8.9. Exceptions, Signals, and Debugging
Even though signal mechanisms in modern Unix systems have greatly improved, the relative cleanliness of Mach's exception-handling mechanism is still evident, especially when it comes to debugging. Since exceptions are essentially queued messages, a debugger can receive and record all exceptions that have occurred in a program since it was last examined. Multiple excepting threads can remain suspended until the debugger has dequeued and examined all exception messages. Such examination may include retrieving the victim's entire exception context. These features allow a debugger to determine a program's state more precisely than traditional signal semantics would allow.
Moreover, an exception handler runs in its own thread, which may be in the same task or in a different task altogether. Therefore, exception handlers do not require the victim thread's resources to run. Even though Mac OS X does not support distributed Mach IPC, Mach's design does not preclude exception handlers from running on a different host.
We saw that exception handlers can be designated in a fine-grained manner, as each exception type can have its own handler, which may further be per-thread or per-task. It is worthwhile to note that a thread-level exception handler is typically suitable for error handling, whereas a task-level handler is typically suitable for debugging. Task-level handlers also have the debugger-friendly property that they remain in effect across a fork() because task-level exception ports are inherited by the child process.
9.8.10. The ptrace() System Call
Mac OS X provides the ptrace() system call for process tracing and debugging, although certain ptrace() requests that are supported on FreeBSD are not implemented on Mac OS X, for example, PT_READ_I, PT_READ_D, PT_WRITE_I, PT_WRITE_D, PT_GETREGS, PT_SETREGS, and several others. Operations equivalent to those missing from the Mac OS X implementation of ptrace() can typically be performed through Mach-specific routines. For example, reading or writing program memory can be done through Mach VM routines.[14] Similarly, thread registers can be read or written through Mach thread routines.
[14] Note, however, that the Mach VM routines are not optimal for operating on small amounts of data.
Moreover, ptrace() on Mac OS X provides certain requests that are specific to Mac OS X, such as those listed here.
If PT_SIGEXC is applied to a process, when there is a signal to be delivered, issignal() [bsd/kern/kern_sig.c] calls do_bsdexception() [bsd/kern/kern_sig.c] to generate a Mach exception message instead. The exception's type, code, and subcode are EXC_SOFTWARE, EXC_SOFT_SIGNAL, and the signal number, respectively. do_bsdexception(), which is analogous to the doexception() function we saw in Section 9.7.2, calls bsd_exception() [osfmk/kern/exception.c]. The latter calls one of the exception_raise functions.