Running Multiple Tasks in Parallel

Problem

Your build process takes too long to run. Rake finishes copying one set of files only to start copying another set. You could save time by running these tasks in parallel, instead of stringing them one after another.

Solution

Define a task using the multitask function instead of task. Each of that task's prerequisites will be run in a separate thread.

In this code, I'll define two long-running tasks:

task :copy_docs do
  # Simulate a large disk copy.
  sleep 5
end

task :compile_extensions do
  # Simulate a C compiler compiling a bunch of files.
  sleep 10
end

task :build_serial => [:copy_docs, :compile_extensions]
multitask :build_parallel => [:copy_docs, :compile_extensions]

The build_serial task runs in about 15 seconds, but the build_parallel task does the same thing in about 10 seconds.
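The timing difference can be checked from a plain Ruby script as well as from the command line. The following is a minimal sketch, not part of the original recipe: it shortens the sleeps to fractions of a second, and the task names with `_serial` and `_parallel` suffixes are made up so that each build gets its own fresh prerequisites (a Rake task runs only once per process).

```ruby
require "rake"
require "benchmark"

# Define a separate pair of prerequisites for each build, so neither
# build benefits from work the other has already done.
%w[serial parallel].each do |kind|
  task("copy_docs_#{kind}") { sleep 0.2 }           # stand-in for a disk copy
  task("compile_extensions_#{kind}") { sleep 0.4 }  # stand-in for a compile
end

task "build_serial" => %w[copy_docs_serial compile_extensions_serial]
multitask "build_parallel" => %w[copy_docs_parallel compile_extensions_parallel]

serial_time   = Benchmark.realtime { Rake::Task["build_serial"].invoke }
parallel_time = Benchmark.realtime { Rake::Task["build_parallel"].invoke }

puts format("serial: %.2fs, parallel: %.2fs", serial_time, parallel_time)
```

The serial build should take roughly the sum of the two sleeps, and the parallel build roughly the longer of the two.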

Discussion

A multitask runs just like a normal task, except that each of its dependencies runs in a separate thread. When running the dependencies of a multitask, Rake first finds any common secondary dependencies of these dependencies, and runs them first. It then spawns a separate thread for each dependency, so that they can run simultaneously.

Consider three tasks, ice_cream, cheese, and yogurt, all of which have a dependency on buy_milk. You can run the first three tasks in separate threads with a multitask, but Rake will run buy_milk before creating the threads. Otherwise, ice_cream, cheese, and yogurt would all trigger buy_milk, wasting time.
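The dairy example might be written as follows. This is a sketch under a few assumptions: the multitask name `:dairy` and the `log` queue are invented for illustration, and the task bodies only record that they ran.

```ruby
require "rake"

# A Queue is safe to push to from several threads at once.
log = Queue.new

task :buy_milk do
  log << :buy_milk
end

# Each product depends on :buy_milk.
[:ice_cream, :cheese, :yogurt].each do |name|
  task name => :buy_milk do
    log << name
  end
end

# Hypothetical top-level task: its three prerequisites run in threads.
multitask :dairy => [:ice_cream, :cheese, :yogurt]

Rake::Task[:dairy].invoke

# Drain the queue into an array to inspect what ran, and in what order.
events = []
events << log.pop until log.empty?
puts events.inspect
```

However the three product tasks are interleaved, :buy_milk appears once in the output, ahead of all of them.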

When your tasks spend a lot of time blocking on I/O operations (as many Rake tasks do), using a multitask can speed up your builds. Unfortunately, it can also cause the same problems you'll see with any multithreaded code. If you've got a fancy Rakefile, in which the tasks keep state inside Ruby data structures, you'll need to synchronize access to those data structures to prevent multithreading problems.
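One common way to synchronize is to guard the shared structure with a Mutex. The sketch below assumes a Rakefile that collects the names of finished tasks in an array; the task names, the `built` array, and the `built_lock` mutex are all hypothetical.

```ruby
require "rake"

built = []              # shared state, touched by several task threads
built_lock = Mutex.new  # guards every access to built

[:docs, :extensions, :assets].each do |name|
  task name do
    sleep 0.1  # stand-in for real work
    # Only one thread at a time may modify the shared array.
    built_lock.synchronize { built << name }
  end
end

multitask :build_all => [:docs, :extensions, :assets]

Rake::Task[:build_all].invoke
puts built.sort.inspect
```

The order in which names are appended depends on thread scheduling, which is why the example sorts the array before printing it.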

You may also have problems converting a task to a multitask if your dependencies are set up incorrectly. Take the following example:

task :build => [:compile_extensions, :run_tests, :generate_rdoc]

The unit tests can't run if the compiled extensions aren't available, so :compile_extensions shouldn't be in this list at all: it should be a dependency of :run_tests. You might not notice this problem as long as you're using task (because :compile_extensions runs before :run_tests anyway), but if you switch to a multitask, your tests will start failing. Fixing your dependencies will solve the problem.
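A corrected version of the dependencies might look like the sketch below. The task bodies are placeholders that only record their names in a hypothetical `order` queue, so the ordering can be inspected afterward.

```ruby
require "rake"

order = Queue.new  # thread-safe record of which task ran when

task :compile_extensions do
  order << :compile_extensions
end

# The tests need the compiled extensions, so say so explicitly.
task :run_tests => :compile_extensions do
  order << :run_tests
end

task :generate_rdoc do
  order << :generate_rdoc
end

# :compile_extensions is no longer listed here; it runs as a
# prerequisite of :run_tests, in that task's thread.
multitask :build => [:run_tests, :generate_rdoc]

Rake::Task[:build].invoke

ran = []
ran << order.pop until order.empty?
puts ran.inspect
```

With the dependency declared on :run_tests, the extensions are always compiled before the tests start, whether :build is a task or a multitask.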

The multitask method is available only in Rake 0.7.0 and higher.
