Stripping Duplicate Elements from an Array

Problem

You want to strip all duplicate elements from an array, or prevent duplicate elements from being added in the first place.

Solution

Use Array#uniq to create a new array based on an existing array but with no duplicate elements. Array#uniq! strips duplicate elements from the existing array in place (and returns nil if there was nothing to strip).

survey_results = [1, 2, 7, 1, 1, 5, 2, 5, 1]
distinct_answers = survey_results.uniq     # => [1, 2, 7, 5]
survey_results.uniq!
survey_results                             # => [1, 2, 7, 5]
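
When "duplicate" should mean something looser than exact equality, Array#uniq also accepts a block (Ruby 1.9.2 and later) and compares the block's return values instead of the elements themselves. The following is a minimal sketch; the respondents list is made up for illustration.

```ruby
# Hypothetical data: the same names typed with different capitalization.
respondents = ["alice", "Bob", "ALICE", "bob", "Carol"]

# Elements whose block values match are treated as duplicates;
# the first occurrence of each is the one that is kept.
distinct_names = respondents.uniq { |name| name.downcase }
# => ["alice", "Bob", "Carol"]
```

The original array is left untouched, just as with a plain call to uniq.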

To ensure that duplicate values never get into your list, use a Set instead of an array. If you try to add a duplicate element to a Set, nothing will happen.

require 'set'
survey_results = [1, 2, 7, 1, 1, 5, 2, 5, 1]
distinct_answers = survey_results.to_set
# => #<Set: {1, 2, 7, 5}>
games = [["Alice", "Bob"], ["Carol", "Ted"],
         ["Alice", "Mallory"], ["Ted", "Bob"]]
players = games.inject(Set.new) { |set, game| game.each { |p| set << p }; set }
# => #<Set: {"Alice", "Bob", "Carol", "Ted", "Mallory"}>
players << "Ted"
# => #<Set: {"Alice", "Bob", "Carol", "Ted", "Mallory"}>
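
If you need to know whether an element was actually new, Set#add? may be more convenient than <<: it returns the set when the element was added and nil when it was already present. A small sketch, using illustrative variable names that are not part of the recipe:

```ruby
require 'set'

answers = Set.new([1, 2, 7])
size_before = answers.size

answers << 2                                 # duplicate: silently ignored
unchanged = (answers.size == size_before)    # true

added_new = !answers.add?(5).nil?            # true: 5 was not in the set
added_dup = !answers.add?(5).nil?            # false: 5 is already there
```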


Discussion

The common element between these two solutions is the hash (see Chapter 5). Array#uniq iterates over an array, using each element as a key in a hash, and checks that hash to see whether it has already encountered the element earlier in the iteration. A Set keeps the same kind of hash from the beginning, and rejects elements already in the hash. You see something that acts like an array, but it won't accept duplicates. In either case, two objects are considered "duplicates" if they would be the same hash key: that is, they return the same value from hash and are equal according to eql?. For most built-in objects this agrees with ==, but not always: 1 == 1.0 is true, yet 1 and 1.0 are distinct hash keys.
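
The hash-based approach described above can be sketched in a few lines. This is an illustration of the idea only, not Ruby's actual implementation, and uniq_sketch is a made-up name.

```ruby
# Hypothetical re-implementation of the idea behind Array#uniq.
def uniq_sketch(array)
  seen = {}                          # elements encountered so far, as hash keys
  array.select do |element|
    next false if seen.key?(element) # already seen: drop it
    seen[element] = true             # first sighting: remember it and keep it
  end
end

survey_results = [1, 2, 7, 1, 1, 5, 2, 5, 1]
uniq_sketch(survey_results)          # => [1, 2, 7, 5]

# Hash keys are compared with hash and eql?, so 1 and 1.0 are both kept
# even though 1 == 1.0 is true. Array#uniq behaves the same way.
uniq_sketch([1, 1.0]).size           # => 2
[1, 1.0].uniq.size                   # => 2
```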

The return value of Array#uniq is itself an array, and nothing prevents you from adding duplicate elements to it later on. If you want to start enforcing uniqueness in perpetuity, you should turn the array into a Set instead of calling uniq. Requiring the set library will define a new method Enumerable#to_set, which does this.
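
To make the difference concrete, here is a short sketch that reuses the survey data from the Solution; the names as_array and as_set are illustrative.

```ruby
require 'set'

survey_results = [1, 2, 7, 1, 1, 5, 2, 5, 1]

as_array = survey_results.uniq
as_array << 7        # nothing stops the duplicate from coming back
as_array             # => [1, 2, 7, 5, 7]

as_set = survey_results.to_set
as_set << 7          # already present, so the set is unchanged
as_set.size          # => 4
```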

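Array#uniq preserves the original order of the array (that is, the first instance of an object remains in its original location). In Ruby 1.8 a Set has no order, because its internal implementation is a hash and Ruby 1.8 hashes are unordered. Since Ruby 1.9, hashes remember insertion order, so a Set iterates over its elements in the order they were first added. To get array-like order in a Set on an older interpreter, combine this recipe with Recipe 5.8 and subclass Set to use an OrderedHash: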

class OrderedSet < Set
  def initialize
    @hash ||= OrderedHash.new
  end
end
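
On Ruby 1.9 or later, where Hash preserves insertion order, a plain Set already iterates in the order elements were first added. The following quick check assumes such an interpreter.

```ruby
require 'set'

ordered = Set.new
[5, 1, 7, 1, 2, 5].each { |n| ordered << n }

# Each element appears once, in the order it was first added.
ordered.to_a   # => [5, 1, 7, 2]
```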

A related problem that often comes up is stripping all instances of a particular value from an array. Ruby provides Array#delete for this task, and Array#compact for the special case of removing nil values. Note that compact returns a new array, while delete modifies the array in place.

a = [1, 2, nil, 3, 3, nil, nil, nil, 5]
a.compact                                  # => [1, 2, 3, 3, 5]
a.delete(3)
a                                          # => [1, 2, nil, nil, nil, nil, 5]
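
The sketch below contrasts the destructive and non-destructive ways of doing this, and shows the return values worth knowing about. Array#reject is not mentioned in the recipe; it is included here as one non-destructive alternative to delete.

```ruby
a = [1, 2, nil, 3, 3, nil, 5]

# reject returns a new array and leaves the original alone.
without_threes = a.reject { |x| x == 3 }   # => [1, 2, nil, nil, 5]

# delete modifies the array and returns the deleted value,
# or nil if the value was not found.
deleted = a.delete(3)                      # => 3
missing = a.delete(42)                     # => nil

# compact! is the in-place counterpart of compact.
a.compact!
a                                          # => [1, 2, 5]
```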
