Mac OS X Internals: A Systems Approach

9.1. Introduction

Running even the most trivial C program on Mac OS X leads to the invocation of dozens of system calls as the runtime environment loads it, prepares it for execution, and executes it. Consider this simple example.

    // empty.c
    main()
    {
    }

    $ gcc -o empty empty.c
    $ ktrace ./empty
    $ kdump | grep CALL | wc -l
          49

Although our trivial program has an empty user-visible body, it still needs to be prepared by dyld so that the empty body can be executed. This preparation involves numerous steps, such as initializing Pthreads-related and Mach-related data structures for the new program. For example, dyld invokes a Mach trap to set the "self" value for the program's thread being run, initializes the special Mach ports in the application, and reserves the zeroth page so that it may not be allocated by the program. Consequently, there is a variety of communication between various bodies of user-space code and the kernel. Graphical interface systems make heavy use of communication between their components and with the rest of the system.

Nontrivial applications might comprise multiple threads, perhaps even multiple processes, that may need to communicate with each other in arbitrary ways, thus necessitating interfaces for such communication. Often, processes that are not part of the same program must communicate with each other too. The Unix command pipeline abstraction exemplifies such communication:

$ find . -type f | grep kernel | sort | head -5
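Each `|` in such a pipeline is backed by an unnamed pipe connecting a writer process to a reader process. The following is a minimal sketch of that arrangement, assuming a single parent and child rather than a full shell pipeline; `send_through_pipe()` is a hypothetical helper name introduced here for illustration, not a system interface.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * send_through_pipe() is a hypothetical helper, not a system API.
 * A child process writes msg into an unnamed pipe; the parent reads it
 * back into buf. Returns the number of bytes received, or -1 on error.
 */
ssize_t
send_through_pipe(const char *msg, char *buf, size_t buflen)
{
    int fds[2];                 /* fds[0]: read end, fds[1]: write end */
    pid_t pid;
    size_t total = 0;
    ssize_t n;

    if (buflen == 0 || pipe(fds) < 0)
        return -1;

    pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {             /* child: the writer side of the pipe */
        close(fds[0]);
        if (write(fds[1], msg, strlen(msg)) < 0)
            _exit(1);
        close(fds[1]);
        _exit(0);
    }

    /* parent: the reader side of the pipe */
    close(fds[1]);              /* so that read() can see end-of-file */
    while (total < buflen - 1 &&
           (n = read(fds[0], buf + total, buflen - 1 - total)) > 0)
        total += (size_t)n;
    buf[total] = '\0';

    close(fds[0]);
    waitpid(pid, NULL, 0);
    return (ssize_t)total;
}
```

A caller would pass a message and a buffer, for example `send_through_pipe("Hello, World!", buf, sizeof(buf))`. Closing the unused end of the pipe in each process matters: the reader sees end-of-file only once every copy of the write end has been closed.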

It is worthwhile to question what qualifies as communication. In some cases, the line between communication and information sharing may be blurred. The Mac OS X pbcopy command-line utility is a Cocoa program that copies its standard input and places it in a pasteboard. It can handle ASCII data, Encapsulated PostScript (EPS), Rich Text Format (RTF), Portable Document Format (PDF), and so on. The pbpaste command retrieves data from a pasteboard and writes it to its standard output. These utilities allow command-line programs to communicate in a copy-and-paste way with other command-line or graphical programs. The following is a contrived (and expensive) way to print "Hello, World!" from the shell:

    $ echo 'Hello, World!' | pbcopy
    $ pbpaste
    Hello, World!

For the purpose of this chapter, we understand interprocess communication (IPC) to be a well-defined mechanism, with a programming interface, for transferring information between two or more entities. Historically, the communicating entities were processes, hence the term interprocess. Since the early days of timesharing systems, a variety of computing resources have been associated with processes. IPC is also a means of sharing these resources. As we saw in Chapter 7, a runnable entity can take many forms in Mac OS X. Consequently, IPC can occur between any of these runnable entities: for example, threads in the same task, threads in different tasks, and threads in the kernel.

Depending on the type of IPC, communicating parties may require some form of synchronization for the IPC mechanism to operate correctly. For example, if multiple processes are sharing a file or a region of memory, they must synchronize with each other to ensure that shared information is not modified and read simultaneously, as it could briefly be in an inconsistent state. In general, IPC may require, or consist of, one or more of the following operations:

  • Sharing of data

  • Transfer of data

  • Sharing of resources

  • Synchronization between IPC participants

  • Synchronous and asynchronous notifications

  • Control operations, such as a debugger shepherding a target process

The term IPC is often used synonymously with message passing, which could be thought of as one specific (and rather popular) IPC mechanism.

9.1.1. The Evolution of IPC

Early IPC mechanisms used files as the communication medium, an approach that did not work well owing to the slowness of disks and the large windows for race conditions between programs. This was followed by shared memory approaches, wherein processes used commonly accessible regions of memory to implement ad hoc IPC and synchronization schemes. Eventually, IPC mechanisms became an abstraction provided by the operating system itself.

MULTICS IPC

Michael J. Spier and Elliott I. Organick described a general-purpose IPC facility in their 1969 paper titled "The MULTICS Interprocess Communication Facility."[1] A MULTICS process was defined as a "hardware-level" process whose address space was a collection of named segments, each with defined access, and over which a single execution point was free to fetch instructions and make data references. The MULTICS central supervisor program (the kernel) ensured that at most one execution point was ever awarded to an address space. With this definition of a process, MULTICS IPC was defined as an exchange of data communications among cooperating processes. This was achieved by an exchange of messages in a commonly accessible mailbox, a shared database whose identity was known to each IPC participant by common convention.

The MULTICS IPC facility was part of the central supervisor. It was one of the earliest examples of a completely generalized, modular interface available to programmers.

[1] "The MULTICS Interprocess Communication Facility," by Michael J. Spier and Elliott I. Organick. In Proceedings of the Second ACM Symposium on Operating Systems Principles (Princeton, NJ: ACM, 1969, pp. 83-91).

9.1.2. IPC in Mac OS X

Mac OS X provides a large number of IPC mechanisms, some with interfaces available at multiple layers of the system. The following are examples of IPC mechanisms/interfaces in Mac OS X:

  • Mach IPC, the lowest-level IPC mechanism and the direct basis for many higher-level mechanisms

  • Mach exceptions

  • Unix signals

  • Unnamed pipes

  • Named pipes (fifos)

  • XSI/System V IPC

  • POSIX IPC

  • Distributed Objects

  • Apple Events

  • Various interfaces for sending and receiving notifications, such as notify(3) and kqueue(2)

  • Core Foundation IPC mechanisms

Note that the term notification is context-dependent. For example, Mach can send notifications when a Mach port is deleted or destroyed. The application environments provide interfaces for sending and receiving intraprocess and interprocess notifications.

Each of these mechanisms has certain benefits, shortcomings, and caveats. A programmer may need to use a particular mechanism, or perhaps even multiple mechanisms, depending on the program's requirements and the system layer it targets.

In the rest of this chapter, we will look at these IPC mechanisms. Those that are common across several platforms (such as System V IPC), and therefore abundantly documented elsewhere, will be discussed only briefly.

An important IPC mechanism that we will not cover in this chapter is that provided by the ubiquitous BSD sockets. Similarly, we will also not discuss the older OpenTransport API, a subset of which is provided by Mac OS X as a compatibility library for legacy applications.

Since IPC usually goes hand in hand with synchronization, we will also look at the important synchronization mechanisms available on Mac OS X.
