Creating a Hash with a Default Value
Credit: Ben Giddings
Problem
You're using a hash, and you don't want to get nil as a value when you look up a key that isn't present in the hash. You want to get some more convenient value instead, possibly one calculated dynamically.
Solution
A normal hash has a default value of nil:
h = Hash.new
h[1]                                # => nil
h['do you have this string?']       # => nil
There are two ways of creating default values for hashes. If you want the default value to be the same object for every hash key, pass that value into the Hash constructor.
h = Hash.new("nope")
h[1]                                # => "nope"
h['do you have this string?']       # => "nope"
If you want the default value for a missing key to depend on the key or the current state of the hash, pass a code block into the hash constructor. The block will be called each time someone requests a missing key.
h = Hash.new { |hash, key| (key.respond_to? :to_str) ? "nope" : nil }
h[1]                                # => nil
h['do you have this string']        # => "nope"
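As a further sketch of a key-dependent default, the block below derives the value from the missing key itself. The hash name lengths is made up for illustration. Because the block only returns a value and never assigns it, the hash stays empty until you store something explicitly.

```ruby
# Hypothetical example: the default is computed from the missing key.
lengths = Hash.new { |hash, key| key.to_s.length }
lengths['apple']          # => 5
lengths[:kiwi]            # => 4
lengths.size              # => 0 (the block returned values but stored nothing)

lengths['apple'] = 99     # an explicit assignment takes priority over the block
lengths['apple']          # => 99
lengths.size              # => 1
```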
Discussion
The first type of custom default value is most useful when you want a default value of zero. For example, this form can be used to calculate the frequency of certain words in a paragraph of text:
text = 'The rain in Spain falls mainly in the plain.'
word_count_hash = Hash.new 0        # => {}
# Split on runs of nonword characters; a missing key starts counting from 0.
text.split(/\W+/).each { |word| word_count_hash[word.downcase] += 1 }
word_count_hash
# => {"rain"=>1, "plain"=>1, "in"=>2, "mainly"=>1, "falls"=>1,
#     "the"=>2, "spain"=>1}
What if you wanted to make lists of the words starting with a given character? Your first attempt might look like this:
first_letter_hash = Hash.new([])
text.split(/\W+/).each { |word| first_letter_hash[word[0,1].downcase] << word }
first_letter_hash
# => {}
first_letter_hash["m"]
# => ["The", "rain", "in", "Spain", "falls", "mainly", "in", "the", "plain"]
What's going on here? Most of those words don't start with "m".
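What happened is that the array you passed into the Hash constructor is being used for every default value. first_letter_hash["m"] is now a reference to that array, as are first_letter_hash["f"] and even first_letter_hash[1006]. Each << call modifies that one shared array in place, and nothing is ever assigned to a key, which is why the hash itself still looks empty.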
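One way to confirm the sharing is to compare object identity. This is a minimal sketch; the hash name shared is an assumption made for illustration.

```ruby
# Hypothetical example: every missing key returns the very same array object.
shared = Hash.new([])
shared["m"] << "mainly"             # appends to the single default array
shared["m"].equal?(shared["f"])     # => true (same object, not just equal contents)
shared[1006]                        # => ["mainly"]
shared.default                      # => ["mainly"]
shared.keys                         # => [] (no key was ever stored)
```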
This is a case where you need to pass in a block to the Hash constructor. The block is run every time the Hash can't find a key. This way you can create a different array each time.
first_letter_hash = Hash.new { |hash, key| hash[key] = [] }
text.split(/\W+/).each { |word| first_letter_hash[word[0,1].downcase] << word }
first_letter_hash
# => {"m"=>["mainly"], "p"=>["plain"], "f"=>["falls"], "r"=>["rain"],
#     "s"=>["Spain"], "i"=>["in", "in"], "t"=>["The", "the"]}
first_letter_hash["m"]
# => ["mainly"]
When a letter can't be found in the hash, Ruby calls the block passed into the Hash constructor. That block puts a new array into the hash, using the missing letter as the key. Now the letter is bound to a unique array, and words can be added to that array normally.
Note that if you want to add the array to the hash so it can be used later, you must assign it within the block of the Hash constructor. Otherwise you'll get a new, empty array every time you access first_letter_hash["m"]. The words you want to append to the array will be lost.
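The difference between the two kinds of block can be sketched side by side. The names forgetful and remembering are hypothetical, chosen only for this illustration.

```ruby
# Hypothetical example: this block builds a new array but never stores it.
forgetful = Hash.new { |hash, key| [] }
forgetful["m"] << "mainly"          # appended to a throwaway array
forgetful["m"]                      # => [] (a fresh array; the word is gone)
forgetful.size                      # => 0

# This block assigns the new array to the key before returning it.
remembering = Hash.new { |hash, key| hash[key] = [] }
remembering["m"] << "mainly"
remembering["m"]                    # => ["mainly"]
remembering.size                    # => 1
```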
See Also
- This technique is used in recipes like Recipe 5.6, "Keeping Multiple Values for the Same Hash Key," and Recipe 5.12, "Building a Histogram."