Looping Through Multiple Iterables in Parallel

Problem

You want to traverse multiple iteration methods simultaneously, probably to match up the corresponding elements in several different arrays.

Solution

The SyncEnumerator class, defined in the generator library, makes it easy to iterate over a bunch of arrays or other Enumerable objects in parallel. Its each method yields a series of arrays, each array containing one item from each underlying Enumerable object:

    require 'generator'

    enumerator = SyncEnumerator.new(%w{Four seven}, %w{score years}, %w{and ago})
    enumerator.each do |row|
      row.each { |word| puts word }
      puts '---'
    end
    # Four
    # score
    # and
    # ---
    # seven
    # years
    # ago
    # ---

    enumerator = SyncEnumerator.new(%w{Four and}, %w{score seven years ago})
    enumerator.each do |row|
      row.each { |word| puts word }
      puts '---'
    end
    # Four
    # score
    # ---
    # and
    # seven
    # ---
    # nil
    # years
    # ---
    # nil
    # ago
    # ---
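Current Ruby releases no longer ship the generator library, so the code above needs an older interpreter. As a rough sketch of the same row-by-row behavior, the hypothetical helper each_row below (an assumption for illustration, not part of any library) drives one built-in Enumerator per collection and pads exhausted collections with nil, mirroring the SyncEnumerator output shown above.

```ruby
# Hypothetical helper, not part of Ruby's standard library: yields one row
# per step, padding exhausted collections with nil as SyncEnumerator does.
def each_row(*enumerables)
  enums = enumerables.map(&:to_enum)   # one external iterator per collection
  loop do
    any_left = false
    row = enums.map do |e|
      begin
        value = e.next
        any_left = true
        value
      rescue StopIteration
        nil                            # this collection has run out
      end
    end
    break unless any_left              # stop once every collection is done
    yield row
  end
end

rows = []
each_row(%w{Four and}, %w{score seven years ago}) { |row| rows << row }
rows.each { |row| p row }
```

The any_left flag, rather than a nil check, decides when to stop, so collections that legitimately contain nil are still traversed in full.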

You can reproduce the workings of a SyncEnumerator by wrapping each of your Enumerable objects in a Generator object. This code acts like SyncEnumerator#each, only it yields each individual item instead of arrays containing one item from each Enumerable:

    def interosculate(*enumerables)
      generators = enumerables.collect { |x| Generator.new(x) }
      done = false
      until done
        done = true
        generators.each do |g|
          if g.next?
            yield g.next
            done = false
          end
        end
      end
    end

    interosculate(%w{Four and}, %w{score seven years ago}) do |x|
      puts x
    end
    # Four
    # score
    # and
    # seven
    # years
    # ago

 

Discussion

Any object that implements the each method can be wrapped in a Generator object. If you've used Java, think of a Generator as being like a Java Iterator object. It keeps track of where you are in a particular iteration over a data structure.

Normally, when you pass a block into an iterator method like each, that block gets called for every element in the iterator without interruption. No code outside the block will run until the iterator is done iterating. You can stop the iteration by writing a break statement inside the code block, but you can't restart a broken iteration later from the same place unless you use a Generator.
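The contrast can be made concrete with a small sketch. It assumes a current Ruby and uses the built-in Enumerator class as a stand-in for Generator, since both hand out one element per call to next; the variable names are illustrative only.

```ruby
words = %w{Four score and seven}

# Internal iteration: once we break, the position is lost.
seen = []
words.each do |w|
  seen << w
  break if seen.size == 2
end
# Calling words.each again would start over at "Four".

# External iteration: the Enumerator remembers where it stopped.
e = words.each                 # each without a block returns an Enumerator
first_two = [e.next, e.next]
# ... unrelated code can run here ...
rest = [e.next, e.next]        # picks up where the previous calls left off

p first_two
p rest
```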

Think of an iterator method like each as a candy dispenser that pours out all its candy in a steady stream once you push the button. The Generator class lets you turn that candy dispenser into one which dispenses only one piece of candy every time you push its button. You can carry this new dispenser around and ration your candy more easily.

In Ruby 1.8, the Generator class uses continuations to achieve this trick. It sets bookmarks for jumping out of an iteration and then back in. When you call Generator#next the generator "pumps" the iterator once (yielding a single element), sets a bookmark, and returns control back to your code. The next time you call Generator#next, the generator jumps back to its previously set bookmark and "pumps" the iterator once more.

Ruby 1.9 uses a more efficient implementation based on threads. This implementation calls each Enumerable object's each method (triggering the never-ending stream of candy), but it does so in a separate thread for each object. After each piece of candy comes out, Ruby freezes time (pauses the thread) until the next time you call Generator#next.
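To illustrate the pause-and-resume idea, here is a minimal sketch that uses neither continuations nor threads. It swaps in Ruby's built-in Fiber class, which can suspend a block mid-iteration and resume it later. The Dispenser class is hypothetical, named after the candy analogy, and is not how the generator library itself is written.

```ruby
# Hypothetical class for illustration only. It pauses an #each call with a
# Fiber instead of the continuations or threads described above.
class Dispenser
  def initialize(enumerable)
    @fiber = Fiber.new do
      # Fiber.yield hands one item to the caller and freezes #each in place.
      enumerable.each { |item| Fiber.yield item }
      raise StopIteration, "no more candy"
    end
  end

  # Resume the frozen #each call just long enough to produce one more item.
  def next
    @fiber.resume
  end
end

d = Dispenser.new(%w{Four score and})
a = d.next
b = d.next
puts a
puts b
```

Each call to next "pumps" the underlying each exactly once; between calls, the iteration simply sits suspended inside the fiber.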

It's simple to wrap an array in a generator, but if that's all there were to generators, you wouldn't need to mess around with Generators or even SyncEnumerators. It's easy to simulate the behavior of SyncEnumerator for arrays by starting an index into each array and incrementing it whenever you want to get another item from a particular array. Generators are truly useful in their ability to turn any type of iteration into a single-item candy dispenser.
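The index-based simulation for plain arrays might look like the sketch below. The names cursors and take_next are hypothetical, chosen for this illustration.

```ruby
arrays  = [%w{Four and}, %w{score seven years ago}]
cursors = Array.new(arrays.size, 0)   # one index per array, all starting at 0

# Hypothetical helper: fetch the next item from array number n and advance
# only that array's cursor. Returns nil once the array is exhausted.
take_next = lambda do |n|
  item = arrays[n][cursors[n]]
  cursors[n] += 1
  item
end

first  = take_next.call(0)   # from the first array
second = take_next.call(1)   # from the second array
third  = take_next.call(1)   # the second array again; the first is untouched

p [first, second, third]
p cursors
```

This only works because arrays support random access by index. An arbitrary each method offers no such access, which is the gap a generator fills.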

Suppose that you want to use the functionality of a generator to iterate over an array, but you have an unusual type of iteration in mind. For instance, consider an array that looks like this:

l = ["junk1", 1, "junk2", 2, "junk3", "junk4", 3, "junk5"]

Let's say you'd like to iterate over the list but skip the "junk" entries. Wrapping the list in a generator object doesn't work; it gives you all the entries:

    g = Generator.new(l)
    g.next                                 # => "junk1"
    g.next                                 # => 1
    g.next                                 # => "junk2"

It's not difficult to write an iterator method that skips the junk. Now, we don't want an iterator method; we want a Generator object. But the iterator method is a good starting point: at least it proves that the iteration we want can be implemented in Ruby.

    def l.my_iterator
      each { |e| yield e unless e =~ /^junk/ }
    end

    l.my_iterator { |x| puts x }
    # 1
    # 2
    # 3

Here's the twist: when you wrap an array in a Generator or a SyncEnumerator object, you're actually wrapping the array's each method. The Generator doesn't just happen to yield elements in the same order as each: it's actually calling each, but using continuation (or thread) trickery to pause the iteration after each call to Generator#next.

By defining an appropriate code block and passing it into the Generator constructor, you can make a generator object out of any piece of iteration code, not only the each method. The generator will know to call and interrupt that block of code, just as it knows to call and interrupt each when you pass an array into the constructor. Here's a generator that iterates over our array the way we want:

    g = Generator.new { |g| l.each { |e| g.yield e unless e =~ /^junk/ } }
    g.next                                 # => 1
    g.next                                 # => 2
    g.next                                 # => 3

The Generator constructor can take a code block that accepts the generator object itself as an argument. This code block performs the iteration that you'd like to have wrapped in a generator. Note the basic similarity of the code block to the body of the l#my_iterator method. The only difference is that instead of the yield keyword we call the Generator#yield method, which handles some of the work involved with setting up and jumping to the continuations (Generator#next handles the rest of the continuation work).
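On current Ruby releases, which no longer ship the generator library, the closest built-in counterpart is Enumerator.new. The sketch below is an analogue rather than the recipe's own code: the block receives a yielder object, and calling yield on it plays the role that g.yield plays in the Generator version.

```ruby
l = ["junk1", 1, "junk2", 2, "junk3", "junk4", 3, "junk5"]

# The block runs lazily: it advances only when someone asks for an item.
e = Enumerator.new do |yielder|
  # to_s keeps the pattern test safe for the non-String entries.
  l.each { |x| yielder.yield x unless x.to_s =~ /^junk/ }
end

picked = [e.next, e.next, e.next]
p picked
```

Once the block finishes, a further call to e.next raises StopIteration, which is how the caller learns that the iteration is over.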

Once you see how this works, you can eliminate some duplicate code by wrapping the l#my_iterator method itself in a Generator:

    g = Generator.new { |g| l.my_iterator { |e| g.yield e } }
    g.next                                 # => 1
    g.next                                 # => 2
    g.next                                 # => 3
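With the built-in Enumerator class the same wrapping can be done without writing a block at all, because Object#to_enum accepts the name of the iterator method to wrap. This is a sketch of that analogue on a current Ruby, not something the generator library provides.

```ruby
l = ["junk1", 1, "junk2", 2, "junk3", "junk4", 3, "junk5"]

# The same singleton iterator as in the recipe, with to_s guarding the
# pattern test against non-String entries.
def l.my_iterator
  each { |e| yield e unless e.to_s =~ /^junk/ }
end

e = l.to_enum(:my_iterator)   # wrap my_iterator rather than the default #each
firsts = [e.next, e.next]
all = l.to_enum(:my_iterator).to_a

p firsts
p all
```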

Here's a version of the interosculate method that can wrap methods as well as arrays. It accepts any combination of Enumerable objects and Method objects, turns each one into a Generator object, and loops through all the Generator objects, getting one element at a time from each:

    def interosculate(*iteratables)
      generators = iteratables.collect do |x|
        if x.is_a? Method
          Generator.new { |g| x.call { |e| g.yield e } }
        else
          Generator.new(x)
        end
      end
      done = false
      until done
        done = true
        generators.each do |g|
          if g.next?
            yield g.next
            done = false
          end
        end
      end
    end

Here, we pass interosculate an array and a Method object, so that we can iterate through two arrays in opposite directions:

    words1 = %w{Four and years}
    words2 = %w{ago seven score}
    interosculate(words1, words2.method(:reverse_each)) { |x| puts x }
    # Four
    # score
    # and
    # seven
    # years
    # ago
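For readers on a Ruby without the generator library, the same round-robin idea can be sketched with built-in Enumerators. The method name interleave is hypothetical, and this is an approximation of interosculate under that assumption rather than a drop-in replacement.

```ruby
# Hypothetical counterpart to interosculate, built on Enumerator.
def interleave(*iterables)
  enums = iterables.map do |x|
    if x.is_a?(Method)
      # Wrap the iterator method: forward each item it yields to the yielder.
      Enumerator.new { |y| x.call { |e| y.yield e } }
    else
      x.to_enum
    end
  end

  # Take one item from each enumerator in turn until all are exhausted.
  until enums.empty?
    enums = enums.select do |e|
      begin
        item = e.next
      rescue StopIteration
        next false               # exhausted: drop it from the rotation
      end
      yield item
      true
    end
  end
end

words1 = %w{Four and years}
words2 = %w{ago seven score}
out = []
interleave(words1, words2.method(:reverse_each)) { |x| out << x }
puts out
```

Exhausted enumerators are dropped from the rotation, so collections of unequal length keep contributing items until the longest one runs out, as in the original method.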

 
