Files and Streams

Java views each file as a sequential stream of bytes (Fig. 14.2). Every operating system provides a mechanism to determine the end of a file, such as an end-of-file marker or a count of the total bytes in the file that is recorded in a system-maintained administrative data structure. A Java program processing a stream of bytes simply receives an indication from the operating system when the program reaches the end of the stream; the program does not need to know how the underlying platform represents files or streams. In some cases, the end-of-file indication occurs as an exception. In other cases, the indication is a return value from a method invoked on a stream-processing object.
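The two kinds of end-of-file indication can be seen side by side in a minimal sketch. It assumes an in-memory ByteArrayInputStream as a stand-in for a file, and the class and method names (EndOfFileDemo, countBytes, countInts) are hypothetical. Method read of InputStream signals the end of the stream with the return value -1, whereas method readInt of DataInputStream signals it by throwing an EOFException.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

class EndOfFileDemo {
    // End of stream reported as a return value: read() returns -1.
    static int countBytes(InputStream in) throws IOException {
        int count = 0;
        while (in.read() != -1) {
            count++;
        }
        return count;
    }

    // End of stream reported as an exception: readInt() throws EOFException.
    static int countInts(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        int count = 0;
        try {
            while (true) {
                data.readInt(); // consumes four bytes per call
                count++;
            }
        } catch (EOFException endOfFile) {
            // no more complete ints are available; fall through and return
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        byte[] eightBytes = new byte[8]; // stand-in for the contents of a file
        System.out.println(countBytes(new ByteArrayInputStream(eightBytes)));
        System.out.println(countInts(new ByteArrayInputStream(eightBytes)));
    }
}
```

With eight bytes of input, the first loop counts eight bytes and the second counts two four-byte ints before the exception ends the loop.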

Figure 14.2. Java's view of a file of n bytes.

File streams can be used to input and output data as either characters or bytes. Streams that input and output bytes to files are known as byte-based streams, storing data in its binary format. Streams that input and output characters to files are known as character-based streams, storing data as a sequence of characters. For instance, if the value 5 were being stored using a byte-based stream, it would be stored in the binary format of the numeric value 5, or 101. If the value 5 were being stored using a character-based stream, it would be stored in the binary format of the character 5, or 00000000 00110101 (this is the binary for the numeric value 53, which represents the character 5 in the Unicode character set). The difference between the numeric value 5 and the character 5 is that the numeric value can be used as an integer in calculations, whereas the character 5 is simply a character that can be used in a string of text, as in "Sarah Miller is 15 years old". Files that are created using byte-based streams are referred to as binary files, while files created using character-based streams are referred to as text files. Text files can be read by text editors, while binary files are read by a program that converts the data to a human-readable format.
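The difference between the two representations of 5 can be checked with a short sketch. It writes to an in-memory ByteArrayOutputStream rather than a file, and it assumes the UTF-16BE encoding so that the character occupies the two bytes shown above (other encodings, such as UTF-8, would store the character 5 in a single byte). The class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class ByteVersusCharacterDemo {
    // Byte-based output: stores the numeric value itself in one byte.
    static byte[] asBinary(int value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(value); // writes the low-order eight bits of value
        return out.toByteArray();
    }

    // Character-based output: stores the character codes of the digits.
    static byte[] asText(int value) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // UTF-16BE is an assumption chosen to match the two-byte form in the text
        try (Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_16BE)) {
            writer.write(Integer.toString(value));
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(Arrays.toString(asBinary(5))); // [5]
        System.out.println(Arrays.toString(asText(5)));   // [0, 53]
        System.out.println(Integer.toBinaryString(5));    // 101
        System.out.println(Integer.toBinaryString('5'));  // 110101
    }
}
```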

A Java program opens a file by creating an object and associating a stream of bytes or characters with it. The classes used to create these objects are discussed shortly. Java can also associate streams with different devices. In fact, Java creates three stream objects that are associated with devices when a Java program begins executing: System.in, System.out and System.err. Object System.in (the standard input stream object) normally enables a program to input bytes from the keyboard; object System.out (the standard output stream object) normally enables a program to output data to the screen; and object System.err (the standard error stream object) normally enables a program to output error messages to the screen. Each of these streams can be redirected. For System.in, this capability enables the program to read bytes from a different source. For System.out and System.err, this capability enables the output to be sent to a different location, such as a file on disk. Class System provides methods setIn, setOut and setErr to redirect the standard input, output and error streams, respectively.
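As a rough illustration of redirection, the following sketch uses System.setOut to send standard output to an in-memory buffer and then restores the original stream. The buffer is an assumption made to keep the example self-contained; wrapping a FileOutputStream in the PrintStream instead would send the output to a file on disk. The class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

class RedirectDemo {
    // Runs the task with System.out redirected and returns what it printed.
    static String captureOutput(Runnable task) {
        PrintStream original = System.out; // remember the real standard output
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        System.setOut(new PrintStream(buffer, true)); // true enables autoflush
        try {
            task.run();
        } finally {
            System.out.flush();
            System.setOut(original); // always restore the original stream
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        String captured = captureOutput(() -> System.out.print("hello"));
        System.out.println("Captured: " + captured);
    }
}
```

Restoring the original stream in a finally block keeps later output on the screen even if the task throws an exception.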

Java programs perform file processing by using classes from package java.io. This package includes definitions for stream classes, such as FileInputStream (for byte-based input from a file), FileOutputStream (for byte-based output to a file), FileReader (for character-based input from a file) and FileWriter (for character-based output to a file). Files are opened by creating objects of these stream classes, which inherit from classes InputStream, OutputStream, Reader and Writer, respectively (these classes will be discussed later in this chapter). Thus, the methods of these stream classes can all be applied to file streams as well.
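A minimal sketch of the four file stream classes follows. It assumes a temporary file created with File.createTempFile, and the class and method names are hypothetical. Methods write and read used here are inherited from (or override those of) the superclasses OutputStream, InputStream, Writer and Reader.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.util.Arrays;

class FileStreamDemo {
    // Byte-based: writes each value (0-255) as one byte, then reads them back.
    static int[] roundTripBytes(File file, int[] values) throws IOException {
        try (FileOutputStream out = new FileOutputStream(file)) {
            for (int value : values) {
                out.write(value);
            }
        }
        int[] result = new int[values.length];
        try (FileInputStream in = new FileInputStream(file)) {
            for (int i = 0; i < result.length; i++) {
                result[i] = in.read();
            }
        }
        return result;
    }

    // Character-based: writes text, then reads it back one character at a time.
    static String roundTripText(File file, String text) throws IOException {
        try (FileWriter writer = new FileWriter(file)) {
            writer.write(text);
        }
        StringBuilder contents = new StringBuilder();
        try (FileReader reader = new FileReader(file)) {
            int character;
            while ((character = reader.read()) != -1) {
                contents.append((char) character);
            }
        }
        return contents.toString();
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("streams", ".tmp"); // hypothetical file
        file.deleteOnExit();
        System.out.println(Arrays.toString(roundTripBytes(file, new int[] {1, 2, 3})));
        System.out.println(roundTripText(file, "Hello"));
    }
}
```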

Java contains classes that enable the programmer to perform input and output of objects or variables of primitive data types. The data will still be stored as bytes or characters behind the scenes, allowing the programmer to read or write data in the form of integers, strings, or other data types without having to worry about the details of converting such values to byte-format. To perform such input and output, objects of classes ObjectInputStream and ObjectOutputStream can be used together with the byte-based file stream classes FileInputStream and FileOutputStream (these classes will be discussed in more detail shortly). The complete hierarchy of classes in package java.io can be viewed in the online documentation at

java.sun.com/j2se/5.0/docs/api/java/io/package-tree.html

Each indentation level in the hierarchy indicates that the indented class extends the class under which it is indented. For example, class InputStream is a subclass of Object. Click a class's name in the hierarchy to view the details of the class.
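To illustrate the object and primitive-type I/O mentioned before the hierarchy link, here is a small sketch that wraps the byte-based file streams in ObjectOutputStream and ObjectInputStream. The temporary file, the sample values and the class and method names are assumptions made for illustration.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

class ObjectStreamDemo {
    // Writes a primitive int and a String object without manual byte conversion.
    static void save(File file, int age, String name) throws IOException {
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new FileOutputStream(file))) {
            out.writeInt(age);
            out.writeObject(name);
        }
    }

    // Reads the values back in the same order in which they were written.
    static String load(File file) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in =
                 new ObjectInputStream(new FileInputStream(file))) {
            int age = in.readInt();
            String name = (String) in.readObject();
            return name + " is " + age + " years old";
        }
    }

    public static void main(String[] args) throws Exception {
        File file = File.createTempFile("objects", ".ser"); // hypothetical file
        file.deleteOnExit();
        save(file, 15, "Sarah Miller");
        System.out.println(load(file)); // Sarah Miller is 15 years old
    }
}
```

The resulting file is a binary file: opening it in a text editor shows mostly unreadable data, which is why a program is needed to convert it back to a human-readable form.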

As you can see in the hierarchy, Java offers many classes for performing input/output operations. We use several of these classes in this chapter to implement file-processing programs that create and manipulate sequential-access files and random-access files (discussed in Section 14.7). We also include a detailed example on class File, which is useful for obtaining information about files and directories. In Chapter 24, Networking, we use stream classes extensively to implement networking applications. Several other classes in the java.io package that we do not use in this chapter are discussed briefly in Section 14.8.

In addition to the classes in this package, character-based input and output can be performed with classes Scanner and Formatter. Class Scanner is used extensively to input data from the keyboard. As we will see, this class can also read data from a file. Class Formatter enables formatted data to be output to the screen or to a file in a manner similar to System.out.printf. Chapter 28, Formatted Output, presents the details of formatted output with System.out.printf. All these features can be used to format text files as well.
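The following sketch shows one way classes Formatter and Scanner might be paired to write and read a text file. The record layout (an account number followed by a name and a balance), the temporary file and the class and method names are all hypothetical.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.Formatter;
import java.util.Scanner;

class ScannerFormatterDemo {
    // Writes one formatted line of text, using printf-style format specifiers.
    static void writeRecord(File file, int account, String name, double balance)
            throws FileNotFoundException {
        try (Formatter output = new Formatter(file)) {
            output.format("%d %s %.2f%n", account, name, balance);
        }
    }

    // Reads the account number and name back as whitespace-separated tokens.
    static String readAccountAndName(File file) throws FileNotFoundException {
        try (Scanner input = new Scanner(file)) {
            int account = input.nextInt();
            String name = input.next();
            return account + ":" + name;
        }
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("clients", ".txt"); // hypothetical file
        file.deleteOnExit();
        writeRecord(file, 100, "Jones", 24.98);
        System.out.println(readAccountAndName(file));
    }
}
```

Because the output is plain text, the file written by Formatter can also be opened and inspected in any text editor.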
