Serializing Data with Marshal

Problem

You want to serialize a data structure to disk faster than YAML can do it. You don't care about the readability of the serialized data structure, or its portability to other programming languages.

Solution

Use the Marshal module, built into Ruby. It works more or less like YAML, but it's much faster. The Marshal.dump method transforms a data structure into a binary string, which you can write to a file and reconstitute later with Marshal.load.

Marshal.dump(10)                          # => "\004\010i\017"
Marshal.dump('ten')                       # => "\004\010\"\010ten"
Marshal.dump('10')                        # => "\004\010\"\a10"
Marshal.load(Marshal.dump(%w{Brush up your Shakespeare}))
# => ["Brush", "up", "your", "Shakespeare"]
require 'set'
Marshal.load(Marshal.dump(Set.new([1, 2, 3])))
# => #<Set: {1, 2, 3}>
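The Solution mentions writing the dumped data to a file and loading it back later. Here is a minimal sketch of that round trip. The hash contents and the file name data.marshal are made up for illustration, and the file lives in a temporary directory so the example cleans up after itself.

```ruby
require 'tmpdir'

# Hypothetical data and file name, chosen only for illustration.
data = { 'title' => 'Brush up your Shakespeare', 'acts' => [1, 2, 3] }

restored = Dir.mktmpdir do |dir|
  path = File.join(dir, 'data.marshal')

  # Marshal output is binary, so open the file in binary mode.
  File.open(path, 'wb') { |f| Marshal.dump(data, f) }

  # Marshal.load accepts an open IO object as well as a string.
  File.open(path, 'rb') { |f| Marshal.load(f) }
end

restored == data       # => true
restored.equal?(data)  # => false (it's a copy, not the same object)
```

Passing the filehandle as the second argument to Marshal.dump writes the bytes straight to the file, which avoids building the whole binary string in memory first.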

Discussion

Marshal is what most programmers coming from other languages expect from a serializer. It's fast (much faster than YAML), and it produces unreadable blobs of binary data. It can serialize almost anything that YAML can (see Recipe 13.1 for examples), and it can also handle a few cases that YAML can't. For instance, you can use Marshal to serialize a reference to a class:

Marshal.dump(Set)                         # => "\004\010c\010Set"
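As a quick check of what that dumped string contains, the following sketch loads it back. Assuming the Set class is still defined in the loading program, you should get the very same class object, not a copy of it.

```ruby
require 'set'

dumped = Marshal.dump(Set)
klass  = Marshal.load(dumped)

dumped.bytesize                   # => 7 (two version bytes, a type tag, a length, and "Set")
klass.equal?(Set)                 # => true
klass.new([1, 2, 3]).include?(2)  # => true
```

Seven bytes is enough to hold the class's name and nothing else, so none of the class's methods travel with it.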

Note that the serialized version of Set is little more than a reference to the class. Like YAML, Marshal depends on the presence of the original classes, and you can't deserialize a reference to a class you don't have.[4] With YAML, you'll get an unresolved YAML::Object; with Marshal, you get an ArgumentError:

[4] This also means that if you add methods to a class, then serialize the class, your methods don't get saved.

#!/usr/bin/ruby -w
Marshal.load("\004\010c\010Set")
# ArgumentError: undefined class/module Set
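If your program might load data written by a different program, you may want to rescue that ArgumentError instead of letting it crash. The sketch below simulates the situation inside a single script: Sonnet is a hypothetical class, defined only so that an instance can be dumped and then removed to mimic a program that never defined it.

```ruby
# Hypothetical class, defined only so that we can dump an instance of it.
class Sonnet
  def initialize(number)
    @number = number
  end
end

data = Marshal.dump(Sonnet.new(18))

# Simulate loading the data in a program that never defined Sonnet.
Object.send(:remove_const, :Sonnet)

result =
  begin
    Marshal.load(data)
  rescue ArgumentError => e
    "could not load: #{e.message}"
  end

result   # => "could not load: undefined class/module Sonnet"
```

In a real program the usual fix is to require the file that defines the class before calling Marshal.load.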

Like YAML, Marshal only serializes data structures. It can't serialize Ruby code (like Proc objects), or resources allocated by other processes (like filehandles or database connections). However, the two libraries differ in their error handling. YAML tends to serialize as much as it can: it can serialize a File object, but when you deserialize it, you get an object that doesn't point to any actual file. Marshal just gives you an error when you try to serialize a file:

open('output', 'w') { |f| Marshal.dump(f) }
# TypeError: can't dump File
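The same TypeError covers Proc objects and other IO objects, so one rescue clause is enough to find out whether something can be dumped. Here is a small sketch; try_dump is a hypothetical helper, not part of the Marshal module.

```ruby
# Hypothetical helper: returns :ok if the object can be dumped,
# or the error message if Marshal refuses it.
def try_dump(object)
  Marshal.dump(object)
  :ok
rescue TypeError => e
  e.message
end

try_dump([1, 'two', :three])   # => :ok
try_dump(proc { 1 + 1 })       # returns a TypeError message: code can't be dumped
try_dump($stdout)              # returns a TypeError message: IO objects can't be dumped
```

An array or hash that contains one of these objects fails the same way, because Marshal has to dump every element.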

See Also
