Making a Copy of an Object
Problem
You want to make a copy of an existing object: a new object that can be modified separately from the original.
Solution
Ruby provides two ways of doing this. If you only want to have to remember one way, remember Object#clone:
    s1 = 'foo'                  # => "foo"
    s2 = s1.clone               # => "foo"
    s1[0] = 'b'
    [s1, s2]                    # => ["boo", "foo"]
Discussion
Ruby has two object-copy methods: a quick one and a thorough one. The quick one, Object#dup, creates a new instance of an object's class, then sets all of the new object's instance variables so that they reference the same objects as the original does. Finally, it makes the new object tainted if the old object was tainted.
The downside of dup is that it creates a new instance of the object's original class. If you open up a specific object and give it a singleton method, you implicitly create a metaclass, an anonymous subclass of the original class. Calling dup on the object will yield a copy that lacks the singleton methods. The other object-copy method, Object#clone, makes a copy of the metaclass and instantiates the copy, instead of instantiating the object's original class.
    material = 'cotton'
    class << material
      def definition
        puts 'The better half of velour.'
      end
    end

    material.definition
    # The better half of velour.

    'cotton'.definition
    # NoMethodError: undefined method 'definition' for "cotton":String

    material.clone.definition
    # The better half of velour.

    material.dup.definition
    # NoMethodError: undefined method 'definition' for "cotton":String
Object#clone is also stricter about propagating Ruby's internal flags: it propagates both an object's tainted state and its frozen state, while Object#dup propagates only the tainted state. If you want to make an unfrozen copy of a frozen object, use Object#dup.
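The frozen-state difference is easy to check directly. The following is a minimal sketch with illustrative variable names. It assumes Ruby 2.4 or later for the last step, where clone accepts a freeze: false keyword; earlier versions do not have it.

```ruby
original = 'cotton'.freeze

cloned = original.clone   # clone carries the frozen state over
duped  = original.dup     # dup returns an unfrozen copy

cloned.frozen?            # => true
duped.frozen?             # => false

duped << ' candy'         # allowed, because the dup is not frozen
duped                     # => "cotton candy"
original                  # => "cotton"

# Ruby 2.4 and later: clone can be asked not to propagate the frozen state.
thawed = original.clone(freeze: false)
thawed.frozen?            # => false
```

Unlike dup, the clone(freeze: false) form still keeps any singleton methods defined on the original.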
Object#clone and Object#dup both perform shallow copies: they copy an object without copying the objects its instance variables refer to. You end up with two objects whose instance variables point to the same objects, so an in-place modification made through one object's instance variable is visible through the other's. This can cause problems if you're not expecting it:
    class StringHolder
      attr_reader :string
      def initialize(string)
        @string = string
      end
    end

    s1 = StringHolder.new('string')
    s2 = s1.dup
    s3 = s1.clone

    s1.string[1] = 'p'
    s2.string                   # => "spring"
    s3.string                   # => "spring"
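One way to confirm the sharing is to compare object identity with equal?. This sketch uses a hypothetical Label class (not part of the recipe) with a writable attribute, to show that an in-place modification is shared while reassigning an instance variable is not.

```ruby
# Label is a hypothetical class used only for this illustration.
class Label
  attr_accessor :text

  def initialize(text)
    @text = text
  end
end

l1 = Label.new(String.new('string'))
l2 = l1.dup

# Both objects hold a reference to the very same String.
l1.text.equal?(l2.text)   # => true

# Modifying that String in place is visible through both objects.
l1.text[1] = 'p'
l2.text                   # => "spring"

# Reassigning the instance variable affects only the object it was set on.
l1.text = 'strung'
l2.text                   # => "spring"
l1.text.equal?(l2.text)   # => false
```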
If you want to do a deep copy, an easy (though not particularly quick) way is to serialize the object to a binary string with Marshal, then load a new object from the string:
    class Object
      def deep_copy
        Marshal.load(Marshal.dump(self))
      end
    end

    s1 = StringHolder.new('string')
    s2 = s1.deep_copy
    s1.string[1] = 'p'
    s1.string                   # => "spring"
    s2.string                   # => "string"
Note that this will only work on an object that has no singleton methods:
    class << s1
      def definition
        puts "We hold strings so you don't have to."
      end
    end

    s1.deep_copy
    # TypeError: singleton can't be dumped
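The Marshal round trip also works on nested collections, and it does not require reopening Object. A minimal sketch follows; the marshal_copy helper and the sample data are illustrative names, not part of the recipe. It also shows that singleton methods are not the only obstacle: a Proc, for example, cannot be dumped either.

```ruby
# Hypothetical helper: a standalone method instead of a change to Object.
def marshal_copy(object)
  Marshal.load(Marshal.dump(object))
end

fabrics = { 'natural' => ['cotton', 'wool'], 'synthetic' => ['nylon'] }

shallow = fabrics.dup
deep    = marshal_copy(fabrics)

# The shallow copy shares its nested arrays; the deep copy does not.
shallow['natural'].equal?(fabrics['natural'])   # => true
deep['natural'].equal?(fabrics['natural'])      # => false

# Changes to the deep copy, at any level, leave the original alone.
deep['natural'] << 'silk'
deep['natural'].first.upcase!
deep['natural']      # => ["COTTON", "wool", "silk"]
fabrics['natural']   # => ["cotton", "wool"]

# Objects that Marshal cannot serialize, such as a Proc, raise TypeError.
begin
  marshal_copy(proc { 'velour' })
rescue TypeError => e
  error_class = e.class   # => TypeError
end
```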
When an object is cloned or duplicated, Ruby creates a new instance of its class or superclass, but without calling the initialize method. If you want to define some code to run when an object is cloned or duplicated, define an initialize_copy method. This is a hook method that gives you a chance to modify the copy before Ruby passes it back to whoever called clone or dup. If you want to simulate a deep copy without using Marshal, this is your chance to modify the copy's instance variables:
    class StringHolder
      def initialize_copy(from)
        @string = from.string.dup
      end
    end

    s1 = StringHolder.new('string')
    s2 = s1.dup
    s3 = s1.clone

    s1.string[1] = "p"
    s2.string                   # => "string"
    s3.string                   # => "string"
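The same hook extends to objects that hold collections. The sketch below uses a hypothetical Wardrobe class (not part of the recipe). It copies the array and each string in it, and calls super so that any initialize_copy behavior defined further up the class hierarchy still runs.

```ruby
# Wardrobe is a hypothetical class used only for this illustration.
class Wardrobe
  attr_reader :items

  def initialize(*items)
    @items = items
  end

  # Runs on the new copy for both dup and clone; `original` is the source.
  def initialize_copy(original)
    super
    @items = original.items.map(&:dup)
  end
end

w1 = Wardrobe.new(String.new('shirt'), String.new('scarf'))
w2 = w1.dup
w3 = w1.clone

w1.items << 'hat'          # changes only w1's array
w1.items[0][1] = 'k'       # changes only w1's first string

w1.items   # => ["skirt", "scarf", "hat"]
w2.items   # => ["shirt", "scarf"]
w3.items   # => ["shirt", "scarf"]
```

This copies exactly two levels. Objects nested more deeply would still be shared unless their classes define initialize_copy as well.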
This table summarizes the differences between clone, dup, and the deep-copy technique that uses Marshal.
| | Object#clone | Object#dup | Deep copy with Marshal |
|---|---|---|---|
| Same instance variables? | New references to the same objects | New references to the same objects | New objects |
| Same metaclass? | Yes | No | Yes[5] |
| Same singleton methods? | Yes | No | N/A[6] |
| Same frozen state? | Yes | No | No |
| Same tainted state? | Yes | Yes | Yes |
[5] Marshal can't serialize an object whose metaclass is different from its original class.
[6] Marshal can't serialize an object whose metaclass is different from its original class.
See Also
- Recipe 13.2, "Serializing Data with Marshal"