Keeping Multiple Values for the Same Hash Key

Problem

You want to build a hash that might have duplicate values for some keys.

Solution

The simplest way is to create a hash that initializes missing values to empty arrays. You can then append items onto the automatically created arrays:

hash = Hash.new { |hash, key| hash[key] = [] } raw_data = [ [1, 'a'], [1, 'b'], [1, 'c'], [2, 'a'], [2, ['b', 'c']], [3, 'c'] ] raw_data.each { |x,y| hash[x] << y } hash # => {1=>["a", "b", "c"], 2=>["a", ["b", "c"]], 3=>["c"]}

 

Discussion

A hash maps any given key to only one value, but that value can be an array. This is a common phenomenon when reading data structures from the outside world. For instance, a list of tasks with associated priorities may contain multiple items with the same priority. Simply reading the tasks into a hash keyed on priority would create key collisions, and obliterate all but one task with any given priority.

It's possible to subclass Hash to act like a normal hash until a key collision occurs, and then start keeping an array of values for the key that suffered the collision:

class MultiValuedHash < Hash def []=(key, value) if has_key?(key) super(key, [value, self[key]].flatten) else super end end end hash = MultiValuedHash.new raw_data.each { |x,y| hash[x] = y } hash # => {1=>["c", "b", "a"], 2=>["b", "c", "a"], 3=>"c"}

This saves a little bit of memory, but it's harder to write code for this class than for one that always keeps values in an array. There's also no way of knowing whether a value [1,2,3] is a single array value or three numeric values.

See Also

Категории