Object Serialization
Two streams in java.io, ObjectInputStream and ObjectOutputStream, are run-of-the-mill byte streams and work like the other input and output streams. However, they are special in that they can read and write objects.
The key to writing an object is to represent its state in a serialized form sufficient to reconstruct the object as it is read. This process of writing and reading objects is called object serialization. Object serialization is essential to building all but the most transient applications. You can use object serialization in the following ways:
- Remote method invocation (RMI): communication between objects via sockets.
The client and server programs in Putting It All Together [1] use RMI to communicate. You can see object serialization used in that example to pass various objects back and forth between the client and the server. Refer to the online version of this tutorial to read about the example's use of RMI and object serialization.
[1] http://java.sun.com/docs/books/tutorial/together/index.html
- Lightweight persistence: the archival of an object for use in a later invocation of the same program.
You need to know about object serialization from two points of view. First, you need to know how to serialize objects by writing them to an ObjectOutputStream and reading them in again, using an ObjectInputStream. The next section, Serializing Objects, shows you how. Second, you will want to know how to write a class so that its instances can be serialized. You can read how to do this in the section after that, Providing Object Serialization for Your Classes.
Serializing Objects
Reconstructing an object from a stream requires that the object first be written to a stream. So let's start there.
How to Write to an ObjectOutputStream
Writing objects to a stream is a straightforward process. For example, the following gets the current time in milliseconds by constructing a Date object and then serializes that object:
    FileOutputStream out = new FileOutputStream("theTime");
    ObjectOutputStream s = new ObjectOutputStream(out);
    s.writeObject("Today");
    s.writeObject(new Date());
    s.flush();
ObjectOutputStream must be constructed on another stream. This code constructs an ObjectOutputStream on a FileOutputStream, thereby serializing the object to a file named theTime. Next, the string Today and a Date object are written to the stream with the writeObject method of ObjectOutputStream.
The writeObject method serializes the specified object, traverses its references to other objects recursively, and writes them all. In this way, relationships between objects are maintained.
ObjectOutputStream implements the DataOutput interface that defines many methods for writing primitive data types, such as writeInt, writeFloat, or writeUTF. You can use these methods to write primitive data types to an ObjectOutputStream.
The writeObject method throws a NotSerializableException if it's given an object that is not serializable. An object is serializable only if its class implements the Serializable interface.
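To see this rule in action, here's a minimal sketch that tries to write two objects and reports whether the stream accepted them. The classes PlainPoint and SerializablePoint and the helper canSerialize are hypothetical names made up for this illustration, and the sketch writes to an in-memory ByteArrayOutputStream rather than a file so that it leaves nothing behind.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical classes for illustration; none of these names come from the text.
class NotSerializableDemo {

    // Does not implement Serializable, so writeObject rejects it.
    static class PlainPoint {
        int x = 1, y = 2;
    }

    // Implements Serializable, so writeObject accepts it.
    static class SerializablePoint implements Serializable {
        private static final long serialVersionUID = 1L;
        int x = 1, y = 2;
    }

    // Returns true if the object could be written, false if the stream
    // rejected it with a NotSerializableException.
    static boolean canSerialize(Object obj) throws IOException {
        try (ObjectOutputStream s =
                 new ObjectOutputStream(new ByteArrayOutputStream())) {
            s.writeObject(obj);
            return true;
        } catch (NotSerializableException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(canSerialize(new SerializablePoint())); // prints true
        System.out.println(canSerialize(new PlainPoint()));        // prints false
    }
}
```

Because NotSerializableException is a subclass of IOException, code that only catches IOException will also catch it; catching it separately, as here, lets you tell a class problem from an I/O problem.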
How to Read from an ObjectInputStream
Once you've written objects and primitive data types to a stream, you'll likely want to read them out again and reconstruct the objects. This is also straightforward. Here's code that reads in the String and the Date objects that were written to the file named theTime in the previous example:
    FileInputStream in = new FileInputStream("theTime");
    ObjectInputStream s = new ObjectInputStream(in);
    String today = (String)s.readObject();
    Date date = (Date)s.readObject();
Like ObjectOutputStream, ObjectInputStream must be constructed on another stream. In this example, the objects were archived in a file, so the code constructs an ObjectInputStream on a FileInputStream. Next, the code uses ObjectInputStream's readObject method to read the String and the Date objects from the file. The objects must be read from the stream in the same order in which they were written. Note that the return value from readObject is an object that is cast to and assigned to a specific type.
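Putting the writing and reading halves together, a complete round trip through a file might look like the following sketch. The class name TimeArchiveDemo and its save and load methods are assumptions made for this illustration, and the sketch uses a temporary file and try-with-resources (which closes, and therefore flushes, each stream) instead of the fixed file name theTime.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.Date;

// Hypothetical helper class; the names are not part of java.io.
class TimeArchiveDemo {

    // Writes a String and then a Date to the given file.
    static void save(File file, String label, Date date) throws IOException {
        try (ObjectOutputStream s =
                 new ObjectOutputStream(new FileOutputStream(file))) {
            s.writeObject(label);
            s.writeObject(date);
        } // closing the stream also flushes it
    }

    // Reads the two objects back in the same order they were written.
    static Object[] load(File file) throws IOException, ClassNotFoundException {
        try (ObjectInputStream s =
                 new ObjectInputStream(new FileInputStream(file))) {
            String label = (String) s.readObject();
            Date date = (Date) s.readObject();
            return new Object[] { label, date };
        }
    }

    public static void main(String[] args) throws Exception {
        File file = File.createTempFile("theTime", ".ser");
        file.deleteOnExit();
        save(file, "Today", new Date());
        Object[] restored = load(file);
        System.out.println(restored[0] + ": " + restored[1]);
    }
}
```

If load read the Date first, the cast to Date would fail with a ClassCastException, because the first object in the stream is the String.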
The readObject method deserializes the next object in the stream and traverses its references to other objects recursively to deserialize all objects that are reachable from it. In this way, it maintains the relationships between the objects.
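One way to observe that relationships are maintained is to serialize an array that refers to the same object twice and check that the restored array still does. This is only a sketch: SharedReferenceDemo and roundTrip are hypothetical names, and the stream lives in a byte array rather than a file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.Date;

// Hypothetical helper class for illustration.
class SharedReferenceDemo {

    // Serializes the object to a byte array and reads it straight back.
    static Object roundTrip(Object obj)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                 new ByteArrayInputStream(bytes.toByteArray()))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Date shared = new Date();
        Date[] pair = { shared, shared };       // two references, one object
        Date[] copy = (Date[]) roundTrip(pair);

        System.out.println(copy[0] == copy[1]); // prints true: still one object
        System.out.println(copy[0] == shared);  // prints false: it is a new copy
    }
}
```

The restored array holds a single new Date that both elements refer to, rather than two independent copies.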
ObjectInputStream implements the DataInput interface that defines methods for reading primitive data types. The methods in DataInput parallel those defined in DataOutput for writing primitive data types. They include such methods as readInt, readFloat, and readUTF. Use these methods to read primitive data types from an ObjectInputStream.
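As a small sketch of mixing primitive data with objects, the following writes an int, a float, a UTF string, and a Date, and then reads them back with the matching read methods in the same order. The class PrimitiveRoundTripDemo and its write and read methods are hypothetical, and the data goes to a byte array instead of a file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.Date;

// Hypothetical helper class for illustration.
class PrimitiveRoundTripDemo {

    // Writes three primitive values followed by one object.
    static byte[] write(int count, float ratio, String name, Date when)
            throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream s = new ObjectOutputStream(bytes)) {
            s.writeInt(count);
            s.writeFloat(ratio);
            s.writeUTF(name);
            s.writeObject(when);
        }
        return bytes.toByteArray();
    }

    // Reads the values back with the parallel methods, in the same order.
    static String read(byte[] data)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream s =
                 new ObjectInputStream(new ByteArrayInputStream(data))) {
            int count = s.readInt();
            float ratio = s.readFloat();
            String name = s.readUTF();
            Date when = (Date) s.readObject();
            return count + " " + ratio + " " + name + " " + when.getTime();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] data = write(3, 0.5f, "widget", new Date(1000L));
        System.out.println(read(data)); // prints 3 0.5 widget 1000
    }
}
```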
Providing Object Serialization for Your Classes
An object is serializable only if its class implements the Serializable interface. Thus, if you want to serialize the instances of one of your classes, the class must implement the Serializable interface. The good news is that Serializable is an empty interface. That is, it doesn't contain any method declarations; its purpose is simply to identify classes whose objects are serializable.
Implementing the Serializable Interface
Here's the complete definition of the Serializable interface:
    package java.io;
    public interface Serializable {
        //there's nothing in here!
    }
Making instances of your classes serializable is easy. You just add the implements Serializable clause to your class declaration, like this:
public class MySerializableClass implements Serializable { ... }
You don't have to write any methods. When an instance of this class is passed to writeObject, the stream applies its default serialization, the same behavior that the defaultWriteObject method of ObjectOutputStream performs. This default automatically writes out everything required to reconstruct an instance of the class, including the following:
- Class of the object
- Class signature
- Values of all non-transient and non-static members, including members that refer to other objects
Instances are deserialized the same way: readObject applies the default mechanism, the same behavior that the defaultReadObject method in ObjectInputStream performs.
For many classes, the default behavior is good enough. However, default serialization can be slow, and a class might want more explicit control over the serialization.
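As an illustration of the default behavior, the sketch below serializes an object that refers to another serializable object, without writing any serialization methods. Person, Address, and copy are hypothetical names invented for this example, and the serialVersionUID fields are an optional, commonly recommended addition rather than something the default mechanism requires.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical classes for illustration.
class DefaultSerializationDemo {

    static class Address implements Serializable {
        private static final long serialVersionUID = 1L;
        String city;
        Address(String city) { this.city = city; }
    }

    static class Person implements Serializable {
        private static final long serialVersionUID = 1L;
        String name;
        int age;
        Address home; // a member that refers to another object
        Person(String name, int age, Address home) {
            this.name = name;
            this.age = age;
            this.home = home;
        }
    }

    // Round-trips a Person through a byte array using default serialization.
    static Person copy(Person original)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(original);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                 new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Person) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Person p = copy(new Person("Ada", 36, new Address("London")));
        System.out.println(p.name + ", " + p.age + ", " + p.home.city);
        // prints Ada, 36, London
    }
}
```

Note that Address must be serializable too; otherwise writing the Person would fail when the stream reached the home field.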
Customizing Serialization
You can customize serialization for your classes by providing two methods for it: writeObject and readObject. The writeObject method controls what information is saved and is typically used to append additional information to the stream. The readObject method either reads the information written by the corresponding writeObject method or can be used to update the state of the object after it has been restored.
The writeObject method must be declared exactly as shown in the following example and should call the stream's defaultWriteObject as the first thing it does to perform default serialization. Any special arrangements can be handled afterward:
    private void writeObject(ObjectOutputStream s) throws IOException {
        s.defaultWriteObject();
        //customized serialization code
    }
The readObject method must read in everything written by writeObject in the same order in which it was written. Also, the readObject method can perform calculations or update the state of the object. Here's the readObject method that corresponds to the writeObject method just shown:
    private void readObject(ObjectInputStream s)
            throws IOException, ClassNotFoundException {
        s.defaultReadObject();
        //customized deserialization code
        ...
        //followed by code to update the object, if necessary
    }
The readObject method must be declared exactly as shown.
The writeObject and readObject methods are responsible for serializing only the immediate class. Any serialization required by the superclasses is handled automatically. However, a class that needs to explicitly coordinate with its superclasses to serialize itself can do so by implementing the Externalizable interface.
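Here's a sketch of how the two methods might work together in practice. The hypothetical Inventory class lets default serialization handle its items array, appends one extra string to the stream by hand, and rebuilds a derived count after reading. All names are assumptions made for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical classes for illustration.
class CustomSerializationDemo {

    static class Inventory implements Serializable {
        private static final long serialVersionUID = 1L;
        private final String[] items;     // handled by default serialization
        private transient String note;    // written and read by hand
        private transient int itemCount;  // derived; rebuilt after reading

        Inventory(String note, String... items) {
            this.note = note;
            this.items = items.clone();
            this.itemCount = items.length;
        }

        String note() { return note; }
        int itemCount() { return itemCount; }

        private void writeObject(ObjectOutputStream s) throws IOException {
            s.defaultWriteObject(); // default serialization first
            s.writeUTF(note);       // then the additional information
        }

        private void readObject(ObjectInputStream s)
                throws IOException, ClassNotFoundException {
            s.defaultReadObject();    // restores items
            note = s.readUTF();       // same order as in writeObject
            itemCount = items.length; // update the derived state
        }
    }

    // Round-trips an Inventory through a byte array.
    static Inventory copy(Inventory original)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(original);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                 new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Inventory) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Inventory restored = copy(new Inventory("spring stock", "nails", "screws"));
        System.out.println(restored.note() + ": " + restored.itemCount());
        // prints spring stock: 2
    }
}
```

The caller still uses the ordinary writeObject and readObject methods of the streams; the stream finds and invokes the private methods of the class itself.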
Implementing the Externalizable Interface
For complete, explicit control of the serialization process, a class must implement the Externalizable interface. For Externalizable objects, only the identity of the object's class is automatically saved by the stream. The class is responsible for writing and reading its contents, and it must coordinate with its superclasses to do so.
Here's the complete definition of the Externalizable interface that extends Serializable:
    package java.io;
    public interface Externalizable extends Serializable {
        public void writeExternal(ObjectOutput out) throws IOException;
        public void readExternal(ObjectInput in)
            throws IOException, java.lang.ClassNotFoundException;
    }
The following holds for an Externalizable class:
- It must implement the java.io.Externalizable interface.
- It must implement a writeExternal method to save the state of the object. Also, it must explicitly coordinate with its supertype to save its state.
- It must implement a readExternal method to read the data written by the writeExternal method from the stream and restore the state of the object. It must explicitly coordinate with the supertype to restore its state.
- If an externally defined format is being written, the writeExternal and readExternal methods are solely responsible for that format.
The writeExternal and readExternal methods are public and carry the risk that a client may be able to write or read information in the object other than by using its methods and variables. These methods must be used only when the information held by the object is not sensitive or when exposing that information would not present a security risk.
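The requirements above can be sketched as follows. The hypothetical Circle class saves and restores both its own radius and the name field inherited from its superclass Shape, because nothing is written automatically except the identity of the class. One detail this section does not spell out: the stream creates an Externalizable object by calling its public no-argument constructor before calling readExternal, so the class must provide one.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

// Hypothetical classes for illustration.
class ExternalizableDemo {

    // An ordinary superclass; its state is NOT saved automatically.
    static class Shape {
        protected String name;
        Shape() { }
        Shape(String name) { this.name = name; }
    }

    static class Circle extends Shape implements Externalizable {
        private double radius;

        public Circle() { } // required: used by the stream when reading

        Circle(String name, double radius) {
            super(name);
            this.radius = radius;
        }

        double radius() { return radius; }
        String name() { return name; }

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            out.writeUTF(name);      // superclass state, saved explicitly
            out.writeDouble(radius); // this class's own state
        }

        @Override
        public void readExternal(ObjectInput in)
                throws IOException, ClassNotFoundException {
            name = in.readUTF();     // same order as writeExternal
            radius = in.readDouble();
        }
    }

    // Round-trips a Circle through a byte array.
    static Circle copy(Circle original)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(original);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                 new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Circle) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Circle c = copy(new Circle("unit", 1.5));
        System.out.println(c.name() + " " + c.radius()); // prints unit 1.5
    }
}
```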
Protecting Sensitive Information
When developing a class that provides controlled access to resources, you must take care to protect sensitive information and functions. During deserialization, the private state of the object is restored from a stream. For example, a file descriptor contains a handle that provides access to an operating system resource. Because state is restored from a stream, anyone who can forge a stream could forge a file descriptor, which would allow some forms of illegal access. Therefore, the serializing runtime must take the conservative approach and not trust the stream to contain only valid representations of objects. To avoid compromising a class, you must ensure either that the sensitive state of an object is not restored from the stream or that the class reverifies it.
Several techniques are available to protect sensitive data in classes. The easiest is to mark variables that contain sensitive data as private transient. Transient and static variables are not serialized or deserialized, so marking the variables this way prevents their state from appearing in the stream and from being restored during deserialization. Because the writing and reading of private variables cannot be superseded outside of the class, the class's transient variables are safe.
Particularly sensitive classes should not be serialized at all. To accomplish this, the class should implement neither the Serializable nor the Externalizable interface.
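A brief sketch shows the effect of the transient keyword. The hypothetical Login class keeps its password in a private transient field; after a round trip the user name is restored but the password is null, and the password text does not appear in the stream's bytes. The streamContains helper is a crude check written for this example, not a library method.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.charset.StandardCharsets;

// Hypothetical classes for illustration.
class TransientFieldDemo {

    static class Login implements Serializable {
        private static final long serialVersionUID = 1L;
        private final String user;
        private transient String password; // kept out of the stream

        Login(String user, String password) {
            this.user = user;
            this.password = password;
        }

        String user() { return user; }
        String password() { return password; }
    }

    static byte[] serialize(Login login) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(login);
        }
        return bytes.toByteArray();
    }

    static Login deserialize(byte[] data)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (Login) in.readObject();
        }
    }

    // Crude check: does the text appear anywhere in the raw stream bytes?
    static boolean streamContains(byte[] data, String text) {
        return new String(data, StandardCharsets.ISO_8859_1).contains(text);
    }

    public static void main(String[] args) throws Exception {
        byte[] data = serialize(new Login("alice", "s3cret"));
        Login restored = deserialize(data);
        System.out.println(restored.user());               // prints alice
        System.out.println(restored.password());           // prints null
        System.out.println(streamContains(data, "s3cret")); // prints false
    }
}
```

A class that uses this technique has to cope with the transient field being null (or zero) after deserialization, for example by asking for the password again.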
Some classes may find it beneficial to allow writing and reading but to specifically handle and revalidate the state as it is deserialized. The class should implement writeObject and readObject methods to save and restore only the appropriate state. If access should be denied, throwing a NotSerializableException will prevent further access.
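Revalidation during deserialization might look like the following sketch. The hypothetical Balance class rechecks in readObject the same rule its constructor enforces and rejects a bad stream. This sketch throws InvalidObjectException, another java.io exception suited to reporting invalid state, rather than the NotSerializableException mentioned above. To simulate a forged stream, the forgeNegative helper overwrites the last four bytes of a valid stream, which for this particular class hold the int field.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical classes for illustration.
class RevalidationDemo {

    // Invariant: the balance is never negative.
    static class Balance implements Serializable {
        private static final long serialVersionUID = 1L;
        private final int cents;

        Balance(int cents) {
            if (cents < 0) {
                throw new IllegalArgumentException("negative balance");
            }
            this.cents = cents;
        }

        int cents() { return cents; }

        // The stream is untrusted input, so recheck the constructor's rule.
        private void readObject(ObjectInputStream s)
                throws IOException, ClassNotFoundException {
            s.defaultReadObject();
            if (cents < 0) {
                throw new InvalidObjectException("negative balance in stream");
            }
        }
    }

    static byte[] serialize(Object obj) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        return bytes.toByteArray();
    }

    static Object deserialize(byte[] data)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    // Simulates a forged stream: sets the four bytes of the int field
    // (the last four bytes of a serialized Balance) to 0xFF, that is, -1.
    static byte[] forgeNegative(byte[] valid) {
        byte[] forged = valid.clone();
        for (int i = forged.length - 4; i < forged.length; i++) {
            forged[i] = (byte) 0xFF;
        }
        return forged;
    }

    public static void main(String[] args) throws Exception {
        byte[] valid = serialize(new Balance(250));
        System.out.println(((Balance) deserialize(valid)).cents()); // prints 250
        try {
            deserialize(forgeNegative(valid));
        } catch (InvalidObjectException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Without the check in readObject, the forged stream would produce a Balance that the constructor could never have created.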