Serializing Data with YAML
Problem
You want to serialize a data structure and use it later. You may want to send the data structure to a file, then load it into a program written in a different programming language.
Solution
The simplest way is to use the built-in yaml library. When you require yaml, all Ruby objects sprout to_yaml methods that convert them to the YAML serialization format. A YAML string is human-readable, and it intuitively corresponds to the object from which it was derived:
require 'yaml'
10.to_yaml # => "--- 10\n"
'ten'.to_yaml # => "--- ten\n"
'10'.to_yaml # => "--- \"10\"\n"
Arrays are represented as bulleted lists:
puts %w{Brush up your Shakespeare}.to_yaml
# ---
# - Brush
# - up
# - your
# - Shakespeare
Hashes are represented as colon-separated key-value pairs:
puts({ 'star' => 'hydrogen', 'gold bar' => 'gold' }.to_yaml)
# ---
# star: hydrogen
# gold bar: gold
More complex Ruby objects are represented in terms of their classes and member variables:
require 'set'
puts Set.new([1, 2, 3]).to_yaml
# --- !ruby/object:Set
# hash:
#   1: true
#   2: true
#   3: true
You can dump a data structure to a file with YAML.dump, and load it back with YAML.load:
users = [{:name => 'Bob', :permissions => ['Read']},
         {:name => 'Alice', :permissions => ['Read', 'Write']}]

# Serialize
open('users', 'w') { |f| YAML.dump(users, f) }

# And deserialize
users2 = open("users") { |f| YAML.load(f) }
# => [{:permissions=>["Read"], :name=>"Bob"},
#     {:permissions=>["Read", "Write"], :name=>"Alice"}]
YAML implementations are available for Perl, Python, Java, PHP, JavaScript, and OCaml, so if you stick to the "standard" data types (strings, arrays, and so on), the serialized file will be portable across programming languages.
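As a rough illustration of sticking to the "standard" data types, the sketch below dumps a hash built only from strings, numbers, booleans, and arrays, then reads it back with YAML.safe_load, which builds only plain types. The inventory hash and its keys are made up for this example; the check for "!ruby/" tags is just one simple heuristic for spotting Ruby-specific content, not a complete portability test.

```ruby
require 'yaml'

# Hypothetical data using only "standard" types:
# strings, integers, floats, booleans, and arrays.
inventory = {
  'name'   => 'widget',
  'count'  => 3,
  'price'  => 9.99,
  'active' => true,
  'tags'   => ['small', 'blue']
}

yaml_text = YAML.dump(inventory)
puts yaml_text

# safe_load builds only plain types, so a clean round trip suggests
# the document does not rely on Ruby-specific classes.
restored = YAML.safe_load(yaml_text)

# Ruby-specific objects are marked with "!ruby/..." tags in the output.
portable = !yaml_text.include?('!ruby/')
puts "round trip ok: #{restored == inventory}, portable: #{portable}"
```

A file like this should be readable by a YAML library in another language, since nothing in it depends on Ruby classes.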
Discussion
If you've ever used Python's pickle module or serialized a Java object, you know how convenient it is to be able to dump an object to disk and load it back later. You don't have to define a custom data format or write an XML generator: you just shove the object into a file or a database, and read it back later. The only downside is that the serialized file is usually a binary mess that can only be understood by the serialization library.
YAML is a human-readable and somewhat cross-language serialization standard. Its format describes the simple data structures common to all modern programming languages. YAML can serialize and deserialize any combination of strings, booleans, numbers, dates and times, arrays (possibly nested arrays), and hashes (again, possibly nested ones).
You can also use YAML to serialize Ruby-specific objects: symbols, ranges, and regular expressions. Indeed, you can use YAML to serialize instances of custom classes: YAML serializes the class of the object and the values of its instance variables. There's no guarantee, though, that other programming languages will understand what you mean.[3]
[3] Ruby can also read YAML descriptions of Perl's regular expressions.
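Here is a minimal sketch of serializing an instance of a custom class. The Point class is hypothetical, defined only for this example. One assumption worth flagging: in newer versions of Ruby's YAML library, YAML.load may refuse to build arbitrary classes, so the sketch falls back to YAML.unsafe_load where that method exists. Use it only on data you trust.

```ruby
require 'yaml'

# Hypothetical class, defined only for this illustration.
class Point
  attr_reader :x, :y

  def initialize(x, y)
    @x = x
    @y = y
  end
end

point_yaml = Point.new(1, 2).to_yaml
puts point_yaml # includes a "!ruby/object:Point" tag plus the values of x and y

# Newer YAML libraries restrict YAML.load to plain types; unsafe_load
# (where available) restores arbitrary classes. Trusted input only.
loader = YAML.respond_to?(:unsafe_load) ? :unsafe_load : :load
point2 = YAML.public_send(loader, point_yaml)

puts point2.class
puts point2.x + point2.y
```

The class name and the instance variables are all that is written out, which is why the Point class must already be defined in the session that loads the data.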
Not only is YAML human-readable, it's human-writable. You can write YAML files in a text editor and load them into Ruby as objects. If you're having trouble with the YAML representation of a particular data structure, your best bet is to define a simple version of that data structure in an irb session, dump it to YAML, and work from there.
quiz_question = ['What color is Raedon?', ['Blue', 'Albino', '*Yellow']]
puts quiz_question.to_yaml
# ---
# - What color is Raedon?
# - - Blue
#   - Albino
#   - "*Yellow"
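Going the other direction, the sketch below loads a YAML document written by hand. The document text and its keys (title, rounds, questions) are invented for this example; in practice the text would come from a file you edited yourself.

```ruby
require 'yaml'

# A hand-written YAML document (hypothetical content). Note that
# "*Yellow" must be quoted, because a leading * has special meaning in YAML.
quiz_text = <<~TEXT
  title: Quiz night
  rounds: 3
  questions:
    - What color is Raedon?
    - - Blue
      - Albino
      - "*Yellow"
TEXT

quiz = YAML.safe_load(quiz_text)

puts quiz['title']
puts quiz['rounds'] + 1 # rounds was loaded as an Integer, not a String
puts quiz['questions'][1].inspect
```

Indentation carries the structure here, so a misaligned line is the most likely cause of a parse error or an unexpected result.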
Before you get drunk with power, you should know that YAML shares the limitations of other serialization schemes. Most obviously, you can only deserialize objects in an environment like the one in which you serialized them. Suppose you convert a Set object to YAML in one Ruby session:
require 'yaml'
require 'set'
set = Set.new([1, 2, 3])
open("set", "w") { |f| YAML.dump(set, f) }
In another Ruby session, you might try to convert the YAML back into a Set, without first requiring the set library:
# Bad code -- don't try this!
require 'yaml'
set = open("set") { |f| YAML.load(f) }
# => #<YAML::Object ...>
Instead of a Set, you've got an unresolved object of class YAML::Object. The set has been loaded from the file and deserialized, but Ruby can't resolve its class name.
YAML can only serialize data; it can't serialize Ruby code or system resources (such as filehandles or open sockets). This means some objects can't be fully converted to YAML. The following code successfully serializes and deserializes a File object, but the deserialized File isn't open and doesn't point to anything in particular:
handle = open('a_file', 'w')
handle.path
# => "a_file"
handle2 = YAML.load(YAML.dump(handle))
# => #<File ...>
The essence of the File object (its handle to a file on disk, granted by the operating system) has been lost.
Objects that contain Ruby code will lose their code when dumped to YAML. This means that Proc and Binding objects will turn up empty. Objects with singleton methods will be dumped without them. Classes can't be dumped to YAML at all. But these are all edge cases. Most data structures, even complex ones, can be serialized to YAML and stay readable to boot.
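The loss of singleton methods can be seen in a small sketch like the one below. The Greeter class and its shout method are hypothetical. As an assumption about newer Ruby versions, the sketch uses YAML.unsafe_load where it exists, since YAML.load may refuse to rebuild custom classes there.

```ruby
require 'yaml'

# Hypothetical class, defined only for this illustration.
class Greeter
  def initialize(name)
    @name = name
  end
end

greeter = Greeter.new('Alice')

# A singleton method belongs to this one object. It is code, not data,
# so it is not part of what YAML writes out.
def greeter.shout
  "HELLO, #{@name.upcase}!"
end

# Newer YAML libraries need unsafe_load to rebuild custom classes.
loader = YAML.respond_to?(:unsafe_load) ? :unsafe_load : :load
copy = YAML.public_send(loader, YAML.dump(greeter))

puts greeter.respond_to?(:shout) # the original object has the method
puts copy.respond_to?(:shout)    # the reloaded copy does not
puts copy.instance_variable_get(:@name)
```

The instance variable survives the round trip, but the method defined on the individual object does not.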
See Also