Telling the Monkeys What to Do
Most computers spend a lot of time doing nothing. If you start a system monitor tool and watch the CPU utilization, you'll see what I mean -- it's rare to see one hit 100%, even when you are running multiple programs.[1] There are just too many delays built into software: disk accesses, network traffic, database queries, waiting for users to click a button, and so on. In fact, the majority of a modern CPU's capacity is often spent in an idle state; faster chips help absorb peaks in processing demand, but much of their power still goes unused.
[1] To watch on Windows, click the Start button, select Programs/Accessories/System Tools/System Monitor, and monitor Processor Usage. The graph rarely climbed above 50% on my laptop machine while writing this (at least until I typed while 1: pass in a Python interactive session console window).
Early on in computing, programmers realized that they could tap into such unused processing power by running more than one program at the same time. By dividing the CPU's attention among a set of tasks, its capacity need not go to waste while any given task is waiting for an external event to occur. The technique is usually called parallel processing, because tasks seem to be performed at once, overlapping and parallel in time. It's at the heart of modern operating systems, and it gave rise to the multiple-active-window computer interfaces we've all come to take for granted. Even within a single program, dividing processing into tasks that run in parallel can make the overall system faster, at least as measured by the clock on your wall.
Just as importantly, modern software systems are expected to be responsive to users, regardless of the amount of work they must perform behind the scenes. It's usually unacceptable for a program to stall while busy carrying out a request. Consider an email-browser user interface, for example: when asked to fetch email from a server, the program must download message text over a network connection. If you have enough email and a slow enough Internet link, that step alone can take minutes to finish. But while the download task proceeds, the program as a whole shouldn't stall -- it must still respond to screen redraws, mouse clicks, and so on.
Parallel processing comes to the rescue here too. By performing such long-running tasks in parallel with the rest of the program, the system at large can remain responsive no matter how busy some of its parts may be.
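To make the idea concrete before we dig into details, here is a minimal sketch of running a long task in a spawned thread so that the main flow of control stays free; the fetch_mail function and its sleep-based "download" are hypothetical stand-ins for real work:

    import threading, time

    def fetch_mail():
        time.sleep(5)                     # pretend to download messages over a slow link
        print('mail downloaded')

    worker = threading.Thread(target=fetch_mail)
    worker.start()                        # run the download in parallel

    while worker.is_alive():              # meanwhile, keep handling other events
        print('still responsive: redraws, clicks, ...')
        time.sleep(1)

    worker.join()

We'll meet the threading module properly later in this chapter; the point here is simply that the while loop keeps running while the download is in progress.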
There are two built-in ways to get tasks running at the same time in Python -- process forks and spawned threads. Functionally, both rely on underlying operating system services to run bits of Python code in parallel. Procedurally, they are very different in terms of interface, portability, and communication. At this writing, process forks don't work on Windows (more on this in a later note), but Python's thread support works on all major platforms. Moreover, there are additional Windows-specific ways to launch programs that are similar to forks.
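As a quick preview, and with the caveat that both calls are covered at length in the sections that follow, the following sketch contrasts the two techniques; it assumes a Unix-like platform, since os.fork is unavailable under standard Windows Python:

    import os, threading

    def task(label):
        print('%s running in process %s' % (label, os.getpid()))

    # process fork: copy the calling program; parent and child run in parallel
    pid = os.fork()
    if pid == 0:                          # in the new child process
        task('forked child')
        os._exit(0)                       # exit the child immediately
    else:                                 # in the original parent process
        os.waitpid(pid, 0)                # wait for the child to finish

    # spawned thread: run a function in parallel within the same process
    thread = threading.Thread(target=task, args=('spawned thread',))
    thread.start()
    thread.join()

Notice that the fork produces a second process with its own copy of the program, while the thread runs inside the original process -- a distinction that shapes how the two communicate, as we'll see.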
In this chapter, which continues our look at the system interfaces available to Python programmers, we explore Python's built-in tools for starting programs in parallel, as well as for communicating with those programs. In some sense, we've already started doing so -- the os.system and os.popen calls introduced and applied in the prior chapter are a fairly portable way to spawn and speak with command-line programs, too. Here, our emphasis is on introducing more direct techniques -- forks, threads, pipes, signals, and Windows-specific launcher tools. In the next chapter (and in the remainder of this book), we use these techniques in more realistic programs, so be sure you understand the basics here before flipping ahead.
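For reference, here is the sort of shell-command spawning we have already seen; the python -c command line is just an illustration, and the more direct tools introduced in this chapter give finer control than these calls do:

    import os

    status = os.system('python -c "print(2 ** 8)"')   # run a command, get its exit status

    pipe = os.popen('python -c "print(2 ** 8)"')      # run a command, read its output
    print(pipe.read())
    pipe.close()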