Java InstantCode. Developing Applications Using Java NIO
The java.nio.charset package contains classes and methods for character set conversion, that is, for encoding and decoding data. The package provides the Charset, CharsetEncoder, and CharsetDecoder classes to convert data between bytes and coded characters. Character data is encoded for transmission over a network or for storage in a file.
A character set is a set of characters, such as the letters A to Z and a to z, digits, and special characters. A coded character set assigns a numeric value to each character in the character set. An encoding scheme maps the values of a coded character set to sequences of octets. An octet is a group of 8 bits.
A charset is the combination of a coded character set and an encoding scheme. The standard charsets are US-ASCII, ISO-8859-1, UTF-8, UTF-16BE, UTF-16LE, and UTF-16.
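To make the idea of an encoding scheme concrete, the following minimal sketch encodes the single character é (U+00E9) with several of the standard charsets and prints the resulting octets in hexadecimal. The class name EncodingSchemes and the helper toHex() are hypothetical names chosen for this illustration; only Charset.forName() and Charset.encode() come from the package itself.

```java
import java.nio.ByteBuffer;
import java.nio.charset.Charset;

final class EncodingSchemes {

    // Encodes the text with the named charset and returns the octets as
    // space-separated hexadecimal values.
    static String toHex(String text, String charsetName) {
        ByteBuffer bytes = Charset.forName(charsetName).encode(text);
        StringBuilder hex = new StringBuilder();
        while (bytes.hasRemaining()) {
            if (hex.length() > 0) {
                hex.append(' ');
            }
            // Mask with 0xFF so that the signed byte prints as an unsigned octet.
            hex.append(String.format("%02X", bytes.get() & 0xFF));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        String text = "\u00E9"; // the character é
        String[] names = {"ISO-8859-1", "UTF-8", "UTF-16BE", "UTF-16LE", "UTF-16"};
        for (String name : names) {
            System.out.println(name + ": " + toHex(text, name));
        }
        // ISO-8859-1: E9
        // UTF-8: C3 A9
        // UTF-16BE: 00 E9
        // UTF-16LE: E9 00
        // UTF-16: FE FF 00 E9   (the encoder writes a big-endian byte-order mark first)
    }
}
```

The same coded character (value 0xE9) produces a different octet sequence under each encoding scheme, which is why the charset used to write data must also be used to read it back.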
Note | To learn more about java.nio.charset package, see: http://java.sun.com/j2se/1.4.2/docs/api/java/nio/charset/package-summary.html |
Figure 1-3 shows the class diagram for the java.nio.charset package:
Charset
The Charset class defines methods to retrieve various names associated with a character set. In addition, the Charset class defines methods to create encoders and decoders. The most commonly used methods of the Charset class are:
-
aliases() : Returns an object of the Set interface that indicates the aliases of a charset.
-
availableCharsets() : Returns an object of the SortedMap interface. The availableCharsets() method constructs a sorted map from canonical charset names to charset objects. The map contains one entry for each charset that the current JVM supports.
-
decode() : Returns an object of the CharBuffer class. The decode() method decodes bytes in the charset into Unicode characters.
-
displayName() : Returns a string that represents a human-readable name for the charset.
-
encode() : Returns an object of the ByteBuffer class. The encode() method encodes Unicode characters into bytes in the charset.
-
forName() : Returns an object of the Charset class for the named charset. The name can be the canonical name of the charset or one of its aliases. In addition, the forName() method throws the IllegalCharsetNameException exception if the given charset name is illegal and the UnsupportedCharsetException exception if there is no support for the named charset.
-
hashCode() : Returns an int value that indicates the calculated hash code for the charset. Hash-based collections, such as HashMap and HashSet, use the hash code to locate the charset object.
-
newDecoder() : Returns an object of the CharsetDecoder class. The newDecoder() method constructs a new decoder for the charset.
-
newEncoder() : Returns an object of the CharsetEncoder class. The newEncoder() method constructs a new encoder for the charset.
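The following is a minimal sketch of how the Charset methods listed above fit together. The class name CharsetTour and its helper methods are hypothetical names used only for illustration, and the example assumes that the JVM registers "latin1" as an alias of ISO-8859-1, which is the case for the standard JDK.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;

final class CharsetTour {

    // Resolves a canonical name or an alias to the canonical charset name.
    static String canonicalName(String nameOrAlias) {
        return Charset.forName(nameOrAlias).name();
    }

    // Checks the map returned by availableCharsets(), which is keyed by canonical name.
    static boolean isAvailable(String canonicalName) {
        return Charset.availableCharsets().containsKey(canonicalName);
    }

    // Encodes the text into bytes and decodes the bytes back with the same charset.
    static String roundTrip(String text, String charsetName) {
        Charset charset = Charset.forName(charsetName);
        ByteBuffer bytes = charset.encode(text);
        CharBuffer chars = charset.decode(bytes);
        return chars.toString();
    }

    public static void main(String[] args) {
        Charset latin1 = Charset.forName("latin1"); // lookup by alias
        System.out.println(latin1.name());          // ISO-8859-1
        System.out.println(latin1.displayName());
        System.out.println(latin1.aliases());
        System.out.println(isAvailable("UTF-8"));
        System.out.println(roundTrip("caf\u00E9", "UTF-8"));
    }
}
```

The encode() and decode() convenience methods of the Charset class replace input that cannot be converted with the replacement value of the charset instead of reporting an error. When errors need to be detected, the CharsetEncoder and CharsetDecoder classes are the better fit.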
CharsetDecoder
The CharsetDecoder class defines methods that transform a sequence of bytes into a sequence of 16-bit Unicode characters. The source buffer is a byte buffer and the resultant buffer is a character buffer. The most commonly used methods in the CharsetDecoder class are:
-
charset() : Returns the object of the Charset class that created the decoder.
-
decode() : Returns an object of the CoderResult class. The decode() method decodes as many bytes as possible from the source buffer and writes the resultant characters to the target buffer. In addition, the decode() method throws the IllegalStateException exception if a decode operation is already in progress.
-
decodeLoop() : Returns an object of the CoderResult class. The decodeLoop() method decodes one or more bytes into one or more characters.
-
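flush() : Returns an object of the CoderResult class. The flush() method flushes the decoder, that is, it writes any output that the decoder still holds internally to the target buffer after the last input has been decoded.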
-
isAutoDetecting() : Returns a Boolean value. The isAutoDetecting() method indicates whether the decoder implements an auto-detecting charset.
-
isCharsetDetected() : Returns a Boolean value that indicates whether or not the decoder has detected a charset.
-
replacement() : Returns a string value that indicates the replacement value of the decoder.
-
reset() : Returns an object of the CharsetDecoder class. The reset() method resets the decoder.
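A complete decoding operation with the methods listed above typically follows the sequence reset(), decode(), and flush(). The following minimal sketch illustrates that sequence with a deliberately small target buffer, so that the decode() method has to be called more than once. The class name DecoderSketch and the helper drain() are hypothetical, and reporting errors through IllegalArgumentException is an assumption of this example, not a requirement of the API.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;

final class DecoderSketch {

    // Decodes all bytes with the named charset, draining a small target
    // buffer whenever the decoder reports an overflow.
    static String decode(byte[] data, String charsetName) {
        CharsetDecoder decoder = Charset.forName(charsetName).newDecoder();
        ByteBuffer in = ByteBuffer.wrap(data);
        CharBuffer out = CharBuffer.allocate(4); // small on purpose
        StringBuilder text = new StringBuilder();

        decoder.reset();
        while (true) {
            // true: no further input will follow the bytes in the source buffer.
            CoderResult result = decoder.decode(in, out, true);
            drain(out, text);
            if (result.isUnderflow()) {
                break; // all input has been consumed
            }
            if (result.isError()) {
                throw new IllegalArgumentException("Cannot decode input: " + result);
            }
            // Otherwise the result is an overflow: the target buffer was full, so decode again.
        }
        // Write any output that the decoder still holds internally.
        while (decoder.flush(out).isOverflow()) {
            drain(out, text);
        }
        drain(out, text);
        return text.toString();
    }

    // Moves the characters written so far into the builder and empties the buffer.
    private static void drain(CharBuffer out, StringBuilder text) {
        out.flip();
        text.append(out);
        out.clear();
    }

    public static void main(String[] args) {
        byte[] utf8 = {'c', 'a', 'f', (byte) 0xC3, (byte) 0xA9};
        System.out.println(decode(utf8, "UTF-8")); // café
    }
}
```

A new decoder reports malformed input and unmappable characters by default, so invalid input ends up in the isError() branch instead of being silently replaced.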
CharsetEncoder
The CharsetEncoder class defines methods that transform a sequence of 16-bit Unicode characters into a sequence of bytes. The source buffer is a character buffer, and the resultant buffer is a byte buffer. The most commonly used methods in the CharsetEncoder class are:
-
canEncode() : Returns a Boolean value that indicates whether the encoder can encode the given character.
-
charset() : Returns the object of the Charset class that created the encoder.
-
encode() : Returns an object of the CoderResult class. The encode() method encodes the characters from the source buffer and writes the resultant bytes in the target buffer. In addition, the encode() method throws the IllegalStateException exception if an encode operation is already in progress.
-
encodeLoop() : Returns an object of the CoderResult class. The encodeLoop() method encodes one or more characters into one or more bytes.
-
flush() : Returns an object of the CoderResult class. The flush() method flushes the encoder.
-
replacement() : Returns a byte array that indicates the replacement value of the encoder.
-
reset() : Returns an object of the CharsetEncoder class. The reset() method resets the encoder.
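The following minimal sketch illustrates the canEncode(), encode(), flush(), and replacement() methods of the CharsetEncoder class. The class name EncoderSketch and its helper methods are hypothetical. Sizing the target buffer with maxBytesPerChar(), a further method of the CharsetEncoder class, is a design choice of this example that avoids the overflow condition, and throwing IllegalArgumentException on errors is likewise an assumption.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.util.Arrays;

final class EncoderSketch {

    // Reports whether every character of the text can be encoded in the named charset.
    static boolean canEncode(String text, String charsetName) {
        return Charset.forName(charsetName).newEncoder().canEncode(text);
    }

    // Encodes the whole text in one call to encode().
    static byte[] encode(String text, String charsetName) {
        CharsetEncoder encoder = Charset.forName(charsetName).newEncoder();
        CharBuffer in = CharBuffer.wrap(text);
        // Worst-case size, so that the target buffer cannot overflow.
        int capacity = (int) Math.ceil(text.length() * encoder.maxBytesPerChar());
        ByteBuffer out = ByteBuffer.allocate(capacity);

        // true: no further input will follow the characters in the source buffer.
        CoderResult result = encoder.encode(in, out, true);
        if (result.isError()) {
            throw new IllegalArgumentException("Cannot encode input: " + result);
        }
        encoder.flush(out);

        // Copy the bytes that were written into an array of the exact size.
        out.flip();
        byte[] bytes = new byte[out.remaining()];
        out.get(bytes);
        return bytes;
    }

    public static void main(String[] args) {
        System.out.println(canEncode("abc", "US-ASCII"));     // true
        System.out.println(canEncode("\u00E9", "US-ASCII"));  // false
        System.out.println(Arrays.toString(encode("\u00E9", "UTF-8")));
        // The replacement value of an encoder is a byte array, not a string.
        byte[] replacement = Charset.forName("US-ASCII").newEncoder().replacement();
        System.out.println(Arrays.toString(replacement));
    }
}
```

Calling canEncode() before encode() is a simple way to find out whether the chosen charset can represent the text before any bytes are written.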
CoderResult
The CoderResult class defines the result state of a coder. A coder can be an encoder or a decoder. A coder consumes bytes or characters from an input buffer, translates them, and writes the resulting characters or bytes to an output buffer. The translation process can stop for one of four reasons: underflow, overflow, a malformed-input error, or an unmappable-character error. The most commonly used methods in the CoderResult class are:
-
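isError() : Returns a Boolean value that indicates whether the object describes an error condition. The isError() method returns true if the object indicates a malformed-input error or an unmappable-character error, otherwise the method returns false.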
-
isMalformed() : Returns a Boolean value that indicates whether the object describes a malformed-input error. A malformed-input error occurs when an input byte sequence is not legal for the specified charset, or when an input character sequence is not a legal 16-bit Unicode sequence. The isMalformed() method returns true if the object indicates a malformed-input error, otherwise the method returns false.
-
isOverflow() : Returns a Boolean value that indicates whether the object describes an overflow condition. An overflow condition occurs when the output buffer does not have enough room left to store more translated data. The isOverflow() method returns true if the object indicates overflow, otherwise the method returns false.
-
isUnderflow() : Returns a Boolean value that indicates whether the object describes an underflow condition. An underflow condition occurs when the input buffer has been completely consumed, or when the remaining input is not sufficient and additional input is required. The isUnderflow() method returns true if the object indicates underflow, otherwise the method returns false.
-
isUnmappable() : Returns a Boolean value that indicates whether the object describes an unmappable-character error. The unmappable-character error occurs when an input character or byte sequence is valid but cannot map to an output byte or character sequence. The isUnmappable() method returns true if the object indicates unmappable-character error, otherwise the method returns false.
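The four result states described above can be reproduced with a few small inputs. The following is a minimal sketch; the class name CoderResultStates and the helpers describe(), decodeOnce(), and encodeOnce() are hypothetical names introduced for this illustration. The length() method of the CoderResult class, which returns the length of the erroneous input, is used in addition to the methods listed above.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CoderResult;

final class CoderResultStates {

    // Turns a coder result into a short label that names the reason the coder stopped.
    static String describe(CoderResult result) {
        if (result.isUnderflow()) {
            return "underflow";
        }
        if (result.isOverflow()) {
            return "overflow";
        }
        if (result.isMalformed()) {
            return "malformed, length " + result.length();
        }
        if (result.isUnmappable()) {
            return "unmappable, length " + result.length();
        }
        return "unknown";
    }

    // Runs a single decode() call with a target buffer of the given capacity.
    static String decodeOnce(byte[] input, int outCapacity, String charsetName) {
        CoderResult result = Charset.forName(charsetName).newDecoder()
                .decode(ByteBuffer.wrap(input), CharBuffer.allocate(outCapacity), true);
        return describe(result);
    }

    // Runs a single encode() call with a target buffer of the given capacity.
    static String encodeOnce(String text, int outCapacity, String charsetName) {
        CoderResult result = Charset.forName(charsetName).newEncoder()
                .encode(CharBuffer.wrap(text), ByteBuffer.allocate(outCapacity), true);
        return describe(result);
    }

    public static void main(String[] args) {
        byte[] abc = {'a', 'b', 'c'};
        System.out.println(decodeOnce(abc, 8, "US-ASCII"));                  // underflow
        System.out.println(decodeOnce(abc, 2, "US-ASCII"));                  // overflow
        System.out.println(decodeOnce(new byte[] {(byte) 0xFF}, 8, "UTF-8")); // malformed
        System.out.println(encodeOnce("\u00E9", 8, "US-ASCII"));             // unmappable
    }
}
```

Underflow is the normal outcome once all input has been translated, whereas overflow asks the caller to drain the output buffer and invoke the coder again.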