Mac OS X Internals: A Systems Approach

6.6. System Call Processing

In traditional UNIX, a system call is one of a well-defined set of functions that allow a user process to interact with the kernel. A user process invokes a system call to request the kernel to perform one or more operations on its behalf. The kernel may perform the requested operations after validating input parameters to the system call, if any, and perhaps several other checks. A system call may involve exchange of data (typically at least a return value) between the kernel and the user process.

Our definition of a Mac OS X system call is a function callable through the sc instruction. Note that it is legal to use the sc instruction from within the kernel. It is also possible to call an internal function that implements a system call directly from within the kernel. Nevertheless, a typical invocation of a system call is from user space.

Since the Mac OS X kernel is an amalgamation of entities with quite different personalities and characteristics, it is interesting to ask which portions of xnu these system calls provide entry to: BSD, Mach, the I/O Kit, or something else? The answer is: all of them, and more.

Based on how they are processed, Mac OS X system calls can be categorized as ultra-fast traps, firmware calls, and normal system calls. Figure 6-12 shows the key code paths involved in system call processing. The figure should be followed beginning at the "Start" label.

Figure 6-12. Details of system call processing in Mac OS X

We can also categorize Mac OS X system calls based on what they do, that is, based on their flavors. The following categorization also largely captures the division based on the kernel subsystems that these system calls provide access to.

  • BSD system calls are the Unix system calls, although several system calls in this category either have Mac OS X-specific nuances or are seen only on Mac OS X. The BSD system calls substantially contribute to Mac OS X's POSIX compatibility at the system call level.

  • Mach system calls, or Mach traps, serve as the building blocks for exporting a variety of Mach functionality through kernel calls, which are invoked from user space via Mach IPC. In particular, unlike BSD, Mach's kernel functionality is often accessed through kernel-user IPC, rather than having separate system calls for individual pieces of functionality.

  • I/O Kit traps constitute a subset of the Mach traps.

  • PowerPC-only special system calls include special-purpose calls for diagnostics, for performance monitoring, for access to the kernel's virtual machine monitor, and for calls related to the Blue Box.

  • PowerPC-only ultra-fast traps are system calls that perform very little work, do not have the context save/restore overhead of regular system calls, and return very quickly. Another type of fast system call is a fastpath call, which is conceptually similar to an ultra-fast call but performs somewhat more work and therefore does not return as rapidly.

  • Certain system calls can be optimized using the commpage feature on Mac OS X. The commpage area of memory contains frequently used code and data. These entities are grouped together as a set of consecutive pages that are mapped by the kernel into the address space of each process. The gettimeofday() system call is optimized in this manner: when a user process executes gettimeofday(), its commpage implementation is attempted first, failing which the actual system call is executed. We will see details of this mechanism later in this chapter.
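
The "try user space first, then fall back" pattern described for gettimeofday() can be sketched in portable C. This is only an illustration under stated assumptions: struct fake_commpage, fast_gettimeofday(), and stub_gettimeofday() are hypothetical names invented here, and the structure does not reflect the real commpage layout or the real library stub.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/time.h>

/* Hypothetical stand-in for data the kernel would publish in a shared page.
 * The field names are illustrative, not the real commpage layout. */
struct fake_commpage {
    bool           valid;   /* false: the published timestamp cannot be used */
    struct timeval stamp;   /* last time value published by the "kernel" */
};

/* Fast path: succeeds only when the shared data is usable. Returns 0 on
 * success and -1 to tell the caller to fall back to the system call. */
int fast_gettimeofday(const struct fake_commpage *cp, struct timeval *tv)
{
    if (cp == NULL || !cp->valid)
        return -1;
    *tv = cp->stamp;
    return 0;
}

/* Library-style wrapper: try user space first, then the real call.
 * *used_fast_path reports which route was taken. */
int stub_gettimeofday(const struct fake_commpage *cp,
                      struct timeval *tv, bool *used_fast_path)
{
    assert(tv != NULL && used_fast_path != NULL);

    if (fast_gettimeofday(cp, tv) == 0) {
        *used_fast_path = true;      /* never entered the kernel */
        return 0;
    }
    *used_fast_path = false;         /* fast path declined: real system call */
    return gettimeofday(tv, NULL);
}
```

Calling stub_gettimeofday() with a page whose valid flag is set returns the published value without a kernel transition; clearing the flag makes the same call go through gettimeofday().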

As shown in Figure 6-12, the details of how each system call category is handled in the kernel differ. Nevertheless, all system calls are invoked from user space via the same basic mechanism. Each category uses one or more unique ranges of system call numbers. In a typical invocation of any type of system call, the calling entity in user space places the system call number in GPR0 and executes the sc instruction. These statements must be qualified with the following points to avoid confusion.

  • User programs normally do not call a system call directly; a library stub invokes the sc instruction.

  • Some program invocations of system calls may never transition to the kernel because they are handled by a library entirely in user space. The commpage-optimized calls are examples of this.

  • Regardless of the programmer-visible invocation mechanism, a system call that does transition to the kernel always involves execution of the sc instruction.
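
The difference between going through a library stub and supplying the system call number yourself can be observed on any POSIX-like system that provides the generic syscall() wrapper. The sketch below is not PowerPC-specific and does not execute sc on other architectures; it assumes a platform where <sys/syscall.h> defines SYS_getpid (true on Mac OS X and Linux). The function names pid_via_stub() and pid_via_number() are made up for this example.

```c
#define _GNU_SOURCE        /* exposes syscall() on glibc; harmless elsewhere */
#include <assert.h>
#include <sys/syscall.h>   /* SYS_getpid */
#include <sys/types.h>     /* pid_t */
#include <unistd.h>        /* getpid(), syscall() */

/* Usual route: the C library stub loads the call number and traps. */
pid_t pid_via_stub(void)
{
    return getpid();
}

/* Explicit route: the caller supplies the call number, and the generic
 * syscall() wrapper places it where the kernel expects it before trapping. */
pid_t pid_via_number(void)
{
    long result = syscall(SYS_getpid);
    assert(result >= 0);   /* getpid has no failure case */
    return (pid_t)result;
}
```

Both functions should report the same process ID, since they reach the same kernel handler; only the user-space path to the trap instruction differs.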

The kernel's hardware vector for the system call exception maps the system call number in GPR0 to an index into the first-level dispatch table containing handlers for various types of system calls. It then branches to the call handler. Figure 6-13 shows details of this mapping.

Figure 6-13. Mapping an incoming system call number to an index in the first-level system call dispatch table

The first-level dispatch table, scTable, also resides in low memory. As Figure 6-13 shows, it maps ultra-fast system calls to their individual handlers, routes all other valid system calls to a normal dispatcher, and sends the call to WhoaBaby if an impossible index is encountered. The dispatcher for normal system calls sets the exception code in GPR11 to T_SYSTEM_CALL. Such calls, including BSD and Mach system calls, are next processed by .L_exception_entry(), the exception-handling code common to most exceptions. As shown in Figure 6-11, .L_exception_entry() branches to xcpSyscall to handle the T_SYSTEM_CALL exception code. xcpSyscall hands over the processing of most system calls to shandler() [osfmk/ppc/hw_exception.s].
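
The routing just described can be modeled as a small table of function pointers. The following is a simplified user-space sketch, not the kernel's code: the call numbers (HYPO_UFT_A, HYPO_UFT_B), the table layout, the index computation, and the value of T_SYSTEM_CALL_SKETCH are all hypothetical, since the real mapping is the one given in Figure 6-13.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All constants below are invented for illustration. */
enum { HYPO_UFT_A = 0x1000, HYPO_UFT_B = 0x1001 };  /* "ultra-fast" numbers */
enum { T_SYSTEM_CALL_SKETCH = 1 };                   /* stand-in exception code */

enum route { ROUTE_NONE, ROUTE_UFT_A, ROUTE_UFT_B, ROUTE_NORMAL, ROUTE_WHOA_BABY };

/* Minimal model of the state the vector code works with. */
struct cpu_sketch {
    uint32_t   gpr0;    /* incoming system call number */
    uint32_t   gpr11;   /* exception code set by the normal dispatcher */
    enum route route;   /* which handler ran (for inspection only) */
};

typedef void (*sc_handler)(struct cpu_sketch *);

void uft_a(struct cpu_sketch *c)     { c->route = ROUTE_UFT_A; }
void uft_b(struct cpu_sketch *c)     { c->route = ROUTE_UFT_B; }
void whoa_baby(struct cpu_sketch *c) { c->route = ROUTE_WHOA_BABY; }

/* Normal calls are tagged with the exception code and handed on. */
void normal_dispatch(struct cpu_sketch *c)
{
    c->gpr11 = T_SYSTEM_CALL_SKETCH;
    c->route = ROUTE_NORMAL;
}

/* Slot 3 is never produced by sc_index(); it models an "impossible" index. */
static const sc_handler sc_table_sketch[] = {
    normal_dispatch, uft_a, uft_b, whoa_baby
};
#define SC_TABLE_COUNT (sizeof sc_table_sketch / sizeof sc_table_sketch[0])

/* Map a call number to a table index: ultra-fast calls get their own slot,
 * everything else shares the normal dispatcher's slot. */
size_t sc_index(uint32_t number)
{
    if (number == HYPO_UFT_A) return 1;
    if (number == HYPO_UFT_B) return 2;
    return 0;
}

/* Branch through the table; out-of-range indexes also end in whoa_baby(). */
void sc_dispatch_index(struct cpu_sketch *c, size_t index)
{
    assert(c != NULL);
    if (index >= SC_TABLE_COUNT)
        whoa_baby(c);
    else
        sc_table_sketch[index](c);
}

void sc_dispatch(struct cpu_sketch *c)
{
    sc_dispatch_index(c, sc_index(c->gpr0));
}
```

In this model only the normal dispatcher touches gpr11, which mirrors the point that ultra-fast handlers bypass the common exception path, while any index that no valid call can produce lands in whoa_baby().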

User-Level System Call Emulation

Remnants of the Mach multiserver emulation facility can be seen in the xnu kernel. A typical multiserver configuration consisted of the Mach kernel along with one or more servers, as well as an emulation library that intercepted system calls for emulated processes, redirecting them to the appropriate emulation services. A user-process address space consisted of user code representing the application and the emulation library. xnu includes some of this code inherited from Mach, such as the task_set_emulation() and task_set_emulation_vector() functions. However, the code is not functional in xnu.
