Keeping Multiple Values for the Same Hash Key
Problem
You want to build a hash that might have duplicate values for some keys.
Solution
The simplest way is to create a hash that initializes missing values to empty arrays. You can then append items onto the automatically created arrays:
hash = Hash.new { |hash, key| hash[key] = [] } raw_data = [ [1, 'a'], [1, 'b'], [1, 'c'], [2, 'a'], [2, ['b', 'c']], [3, 'c'] ] raw_data.each { |x,y| hash[x] << y } hash # => {1=>["a", "b", "c"], 2=>["a", ["b", "c"]], 3=>["c"]}
Discussion
A hash maps any given key to only one value, but that value can be an array. This is a common phenomenon when reading data structures from the outside world. For instance, a list of tasks with associated priorities may contain multiple items with the same priority. Simply reading the tasks into a hash keyed on priority would create key collisions, and obliterate all but one task with any given priority.
It's possible to subclass Hash to act like a normal hash until a key collision occurs, and then start keeping an array of values for the key that suffered the collision:
class MultiValuedHash < Hash def []=(key, value) if has_key?(key) super(key, [value, self[key]].flatten) else super end end end hash = MultiValuedHash.new raw_data.each { |x,y| hash[x] = y } hash # => {1=>["c", "b", "a"], 2=>["b", "c", "a"], 3=>"c"}
This saves a little bit of memory, but it's harder to write code for this class than for one that always keeps values in an array. There's also no way of knowing whether a value [1,2,3] is a single array value or three numeric values.
See Also
- Recipe 5.2, "Creating a Hash with a Default Value," explains the technique of the dynamic default value in more detail, and explains why you must initalize the empty list within a code blocknever within the arguments to Hash.new