Characters and Strings
The Java platform contains three classes that you can use when working with character data:
- Character [1]: A class whose instances can hold a single character value. This class also defines handy methods that can manipulate or inspect single-character data.
- String [2]: A class for working with immutable (unchanging) data composed of multiple characters.
- StringBuffer [3]: A class for storing and manipulating mutable data composed of multiple characters.
[1] http://java.sun.com/j2se/1.3/docs/api/java/lang/Character.html
[2] http://java.sun.com/j2se/1.3/docs/api/java/lang/String.html
[3] http://java.sun.com/j2se/1.3/docs/api/java/lang/StringBuffer.html
Characters
An object of Character type contains a single character value. You use a Character object instead of a primitive char variable when an object is required: for example, when passing a character value into a method that changes the value or when placing a character value into a data structure, such as a vector, that requires objects.
The following sample program, CharacterDemo [4], creates a few character objects and displays some information about them.
[4] CharacterDemo.java is included on the CD and is available online. See Code Samples (page 174). This program requires Java 2 SDK 1.2 to run because it uses the compareTo method, which was added to the Character class for that release.

    public class CharacterDemo {
        public static void main(String args[]) {
            Character a = new Character('a');
            Character a2 = new Character('a');
            Character b = new Character('b');

            int difference = a.compareTo(b);

            if (difference == 0) {
                System.out.println("a is equal to b.");
            } else if (difference < 0) {
                System.out.println("a is less than b.");
            } else if (difference > 0) {
                System.out.println("a is greater than b.");
            }
            System.out.println("a is "
                + ((a.equals(a2)) ? "equal" : "not equal")
                + " to a2.");
            System.out.println("The character " + a.toString() + " is "
                + (Character.isUpperCase(a.charValue()) ? "upper" : "lower")
                + "case.");
        }
    }

The following is the output from this program:

    a is less than b.
    a is equal to a2.
    The character a is lowercase.

The CharacterDemo program calls the following constructors and methods provided by the Character class:
- Character(char): The Character class's only constructor, which creates a Character object containing the value provided by the argument. Once a Character object has been created, the value it contains cannot be changed.
- compareTo(Character) [1]: An instance method that compares the values held by two character objects: the object on which the method is called (a in the example) and the argument to the method (b in the example). This method returns an integer indicating whether the value in the current object is greater than, equal to, or less than the value held by the argument. A letter is greater than another letter if its numeric value is greater.
- equals(Object): An instance method that compares the value held by the current object with the value held by another. This method returns true if the values held by both objects are equal.
- toString(): An instance method that converts the object to a string. The resulting string is one character in length and contains the value held by the character object.
- charValue(): An instance method that returns the value held by the character object as a primitive char value.
- isUpperCase(char): A class method that determines whether a primitive char value is uppercase. This is one of many Character class methods that inspect or manipulate character data. Table 23 lists several other useful class methods the Character class provides.
[1] The compareTo method was added to the Character class for Java 2 SDK v. 1.2.
Table 23. Useful Class Methods in the Character Class

Method | Description |
---|---|
boolean isUpperCase(char); boolean isLowerCase(char) | Determines whether the specified primitive char value is upper- or lowercase, respectively. |
char toUpperCase(char); char toLowerCase(char) | Returns the upper- or lowercase form of the specified primitive char value. |
boolean isLetter(char); boolean isDigit(char); boolean isLetterOrDigit(char) | Determines whether the specified primitive char value is a letter, a digit, or a letter or a digit, respectively. |
boolean isWhitespace(char) [a] | Determines whether the specified primitive char value is white space according to the Java platform. |
boolean isSpaceChar(char) [b] | Determines whether the specified primitive char value is a white-space character according to the Unicode specification. |
boolean isJavaIdentifierStart(char) [c]; boolean isJavaIdentifierPart(char) [d] | Determines whether the specified primitive char value can be the first character in a legal identifier or be a part of a legal identifier, respectively. |
[a] Added to the Java platform for the 1.1 release. Replaces isSpace(char), which is deprecated.
[b] Added to the Java platform for the 1.1 release.
[c] Added to the Java platform for the 1.1 release. Replaces isJavaLetter(char), which is deprecated.
[d] Added to the Java platform for the 1.1 release. Replaces isJavaLetterOrDigit(char), which is deprecated.
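As a rough illustration of the class methods in Table 23, the following sketch labels a few sample characters. The class name CharacterMethodsSketch and the helper describe are hypothetical names invented for this example; only the Character methods themselves come from the table.

```java
public class CharacterMethodsSketch {
    // Hypothetical helper: builds a short label for c using the
    // Character class methods listed in Table 23.
    static String describe(char c) {
        StringBuffer sb = new StringBuffer();
        sb.append(c).append(':');
        if (Character.isLetter(c)) sb.append(" letter");
        if (Character.isDigit(c)) sb.append(" digit");
        if (Character.isUpperCase(c)) sb.append(" uppercase");
        if (Character.isLowerCase(c)) sb.append(" lowercase");
        if (Character.isWhitespace(c)) sb.append(" whitespace");
        if (Character.isJavaIdentifierStart(c)) sb.append(" identifier-start");
        return sb.toString();
    }

    public static void main(String[] args) {
        char[] samples = { 'a', 'Q', '7', ' ', '_' };
        for (int i = 0; i < samples.length; i++) {
            System.out.println(describe(samples[i]));
        }
        // toUpperCase and toLowerCase return a converted char value.
        System.out.println(Character.toUpperCase('a'));  // A
        System.out.println(Character.toLowerCase('Q'));  // q
    }
}
```

Because these are class methods, they are called on the Character class itself and take a primitive char; no Character object is needed.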
Strings and String Buffers
The Java platform provides two classes, String and StringBuffer, that store and manipulate strings: character data consisting of more than one character. The String class provides for strings whose value will not change. For example, if you write a method that requires string data and the method is not going to modify the string in any way, pass a String object into the method. The StringBuffer class provides for strings that will be modified; you use string buffers when you know that the value of the character data will change. You typically use string buffers for constructing character data dynamically: for example, when reading text data from a file. Because strings are constants, they are more efficient to use than are string buffers and can be shared. So it's important to use strings when you can.
Following is a sample program called StringsDemo [1], which reverses the characters of a string. This program uses both a string and a string buffer.
[1] StringsDemo.java is included on the CD and is available online. See Code Samples (page 174). Note that instead of explicitly writing code to reverse the characters of a string, you should use the reverse method in the StringBuffer class.

    public class StringsDemo {
        public static void main(String[] args) {
            String palindrome = "Dot saw I was Tod";
            int len = palindrome.length();
            StringBuffer dest = new StringBuffer(len);

            for (int i = (len - 1); i >= 0; i--) {
                dest.append(palindrome.charAt(i));
            }
            System.out.println(dest.toString());
        }
    }

The output from this program is:
doT saw I was toD
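For comparison, here is a minimal sketch of the same reversal done with the reverse method of the StringBuffer class instead of an explicit loop. The class name ReverseSketch and its helper are hypothetical names used only for this illustration.

```java
public class ReverseSketch {
    // Hypothetical helper: copies s into a string buffer, reverses the
    // buffer in place, and converts the result back to a string.
    static String reverse(String s) {
        return new StringBuffer(s).reverse().toString();
    }

    public static void main(String[] args) {
        System.out.println(reverse("Dot saw I was Tod"));  // doT saw I was toD
    }
}
```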
In addition to highlighting the differences between strings and string buffers, this section discusses several features of the String and StringBuffer classes: creating strings and string buffers, using accessor methods to get information about a string or string buffer, and modifying a string buffer.
Creating Strings and String Buffers
A string is often created from a string literal: a series of characters enclosed in double quotes. For example, when it encounters the following string literal, the Java platform creates a String object whose value is Gobbledygook.
"Gobbledygook"
The StringsDemo program uses this technique to create the string referred to by the palindrome variable:
String palindrome = "Dot saw I was Tod";
You can also create String objects as you would any other Java object: using the new keyword and a constructor. The String class provides several constructors that allow you to provide the initial value of the string, using different sources, such as an array of characters, an array of bytes, or a string buffer. Table 24 shows the constructors provided by the String class.
Table 24. Constructors in the String Class [a]

Constructor | Description |
---|---|
String() | Creates an empty string. |
String(byte[]); String(byte[], int, int); String(byte[], int, int, String); String(byte[], String) | Creates a string whose value is set from the contents of an array of bytes. The two integer arguments, when present, set the offset and the length, respectively, of the subarray from which to take the initial values. The String argument, when present, specifies the character encoding to use to convert bytes to characters. |
String(char[]); String(char[], int, int) | Creates a string whose value is set from the contents of an array of characters. The two integer arguments, when present, set the offset and the length, respectively, of the subarray from which to take the initial values. |
String(String) | Creates a string whose value is set from another string. Using this constructor with a literal string argument is not recommended, because it creates two identical strings. |
String(StringBuffer) | Creates a string whose value is set from a string buffer. |
[a] The String class defines other constructors not listed in this table. Those constructors have been deprecated, and their use is not recommended.
Here's an example of creating a string from a character array:

    char[] helloArray = { 'h', 'e', 'l', 'l', 'o' };
    String helloString = new String(helloArray);
    System.out.println(helloString);

The last line of this code snippet displays: hello.
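The offset and length arguments described in Table 24 can be tried out with a small sketch like the one below. The class name StringConstructorsSketch and the helper fromRange are hypothetical; the constructors they call are the ones listed in the table.

```java
public class StringConstructorsSketch {
    // Hypothetical helper: builds a string from count characters of
    // source, starting at index offset.
    static String fromRange(char[] source, int offset, int count) {
        return new String(source, offset, count);
    }

    public static void main(String[] args) {
        char[] helloArray = { 'h', 'e', 'l', 'l', 'o' };

        System.out.println(new String(helloArray));         // hello
        System.out.println(fromRange(helloArray, 1, 3));    // ell
        System.out.println(new String(new StringBuffer("hello")));  // hello
        System.out.println(new String().length());          // 0
    }
}
```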
You must always use new to create a string buffer. The StringBuffer class has three constructors, as described in Table 25.
Table 25. Constructors in the StringBuffer Class

Constructor | Description |
---|---|
StringBuffer() | Creates an empty string buffer whose initial capacity is 16 characters. |
StringBuffer(int) | Creates an empty string buffer with the specified initial capacity. |
StringBuffer(String) | Creates a string buffer whose value is initialized by the specified String. The capacity of the string buffer is the length of the original string plus 16. |
The StringsDemo program creates the string buffer referred to by dest, using the constructor that sets the buffer's capacity:

    String palindrome = "Dot saw I was Tod";
    int len = palindrome.length();
    StringBuffer dest = new StringBuffer(len);

This code creates the string buffer with an initial capacity equal to the length of the string referred to by the name palindrome. This ensures only one memory allocation for dest because it's just big enough to contain the characters that will be copied to it. By initializing the string buffer's capacity to a reasonable first guess, you minimize the number of times memory must be allocated for it. This makes your code more efficient because memory allocation is a relatively expensive operation.
Getting the Length of a String or a String Buffer
Methods used to obtain information about an object are known as accessor methods. One accessor method that you can use with both strings and string buffers is the length method, which returns the number of characters contained in the string or the string buffer. After the following two lines of code have been executed, len equals 17:

    String palindrome = "Dot saw I was Tod";
    int len = palindrome.length();

In addition to length, the StringBuffer class has a method called capacity, which returns the amount of space allocated for the string buffer rather than the amount of space used. For example, the capacity of the string buffer referred to by dest in the StringsDemo program never changes, although its length increases by 1 for each iteration of the loop. Figure 48 shows the capacity and the length of dest after nine characters have been appended to it.
Figure 48. A string buffer's length is the number of characters it contains; a string buffer's capacity is the number of character spaces that have been allocated.
The String class doesn't have a capacity method, because a string cannot change.
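The difference between length and capacity can be observed directly. The following sketch mimics the first nine iterations of the StringsDemo loop (copying forward rather than backward, which does not affect the counts) and also prints the initial capacities produced by the other two constructors. CapacitySketch and fill are hypothetical names introduced for this illustration.

```java
public class CapacitySketch {
    // Hypothetical helper: appends the first count characters of text to
    // a buffer created with the given initial capacity, then reports
    // the buffer's length and capacity as a two-element array.
    static int[] fill(int initialCapacity, String text, int count) {
        StringBuffer dest = new StringBuffer(initialCapacity);
        for (int i = 0; i < count; i++) {
            dest.append(text.charAt(i));
        }
        return new int[] { dest.length(), dest.capacity() };
    }

    public static void main(String[] args) {
        String palindrome = "Dot saw I was Tod";

        int[] result = fill(palindrome.length(), palindrome, 9);
        System.out.println("length = " + result[0]
            + ", capacity = " + result[1]);   // length = 9, capacity = 17

        // Default capacity, and capacity when initialized from a string.
        System.out.println(new StringBuffer().capacity());            // 16
        System.out.println(new StringBuffer(palindrome).capacity());  // 33
    }
}
```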
Getting Characters by Index from a String or a String Buffer
You can get the character at a particular index within a string or a string buffer by using the charAt accessor. The index of the first character is 0; the index of the last is length()-1. For example, the following code gets the character at index 9 in a string:

    String anotherPalindrome = "Niagara. O roar again!";
    char aChar = anotherPalindrome.charAt(9);

Indices begin at 0, so the character at index 9 is 'O', as illustrated in Figure 49:
Figure 49. Use the charAt method to get a character at a particular index.
The figure also shows that to compute the index of the last character of a string, you have to subtract 1 from the value returned by the length method.
If you want to get more than one character from a string or a string buffer, you can use the substring method. The substring method has two versions, as shown in Table 26.
Table 26. The substring Methods in the String and StringBuffer Classes [a]

Method | Description |
---|---|
String substring(int); String substring(int, int) | Returns a new string that is a substring of this string or string buffer. The first integer argument specifies the index of the first character. The second integer argument is the index of the last character plus 1; that is, the character at the second index is not included. The length of the substring is therefore the second int minus the first int. If the second integer is not present, the substring extends to the end of the original string. |
[a] The substring methods were added to the StringBuffer class for Java 2 SDK 1.2.
The following code gets from the Niagara palindrome the substring that extends from index 11 up to, but not including, index 15, which is the word "roar":

    String anotherPalindrome = "Niagara. O roar again!";
    String roar = anotherPalindrome.substring(11, 15);

Remember that indices begin at 0 (Figure 50).
Figure 50. Use the substring method to get part of a string or string buffer.
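The index arithmetic can be checked with a short sketch: the end index is exclusive, so the length of the result is the end index minus the begin index. SubstringSketch and slice are hypothetical names used only here.

```java
public class SubstringSketch {
    // Hypothetical helper: returns the characters of s from index begin
    // up to, but not including, index end.
    static String slice(String s, int begin, int end) {
        return s.substring(begin, end);
    }

    public static void main(String[] args) {
        String anotherPalindrome = "Niagara. O roar again!";

        String roar = slice(anotherPalindrome, 11, 15);
        System.out.println(roar);            // roar
        System.out.println(roar.length());   // 4, which is 15 - 11

        // The one-argument form runs to the end of the string.
        System.out.println(anotherPalindrome.substring(11));  // roar again!

        // StringBuffer offers the same methods (since Java 2 SDK 1.2).
        StringBuffer buffer = new StringBuffer(anotherPalindrome);
        System.out.println(buffer.substring(11, 15));         // roar
    }
}
```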
Searching for a Character or a Substring within a String
The String class provides two accessor methods that return the position within the string of a specific character or substring: indexOf and lastIndexOf. The indexOf method searches forward from the beginning of the string, and lastIndexOf searches backward from the end of the string. Table 27 describes the various forms of the indexOf and the lastIndexOf methods.
Table 27. The indexOf and lastIndexOf Methods in the String Class

Method | Description |
---|---|
int indexOf(int); int lastIndexOf(int) | Returns the index of the first (last) occurrence of the specified character. |
int indexOf(int, int); int lastIndexOf(int, int) | Returns the index of the first (last) occurrence of the specified character, searching forward (backward) from the specified index. |
int indexOf(String); int lastIndexOf(String) | Returns the index of the first (last) occurrence of the specified string. |
int indexOf(String, int); int lastIndexOf(String, int) | Returns the index of the first (last) occurrence of the specified string, searching forward (backward) from the specified index. |
The StringBuffer class does not support the indexOf or the lastIndexOf methods. If you need to use these methods on a string buffer, first convert the string buffer to a string by using the toString method.
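Here is a small sketch of the search methods in Table 27, including the toString conversion suggested above for string buffers. The class IndexOfSketch and its helpers count and indexInBuffer are hypothetical names made up for this example.

```java
public class IndexOfSketch {
    // Hypothetical helper: counts occurrences of c in s by repeatedly
    // calling indexOf(int, int), resuming just past each match.
    static int count(String s, char c) {
        int n = 0;
        int from = s.indexOf(c);
        while (from != -1) {
            n++;
            from = s.indexOf(c, from + 1);
        }
        return n;
    }

    // Hypothetical helper: searches a string buffer by first converting
    // it to a string.
    static int indexInBuffer(StringBuffer buffer, String target) {
        return buffer.toString().indexOf(target);
    }

    public static void main(String[] args) {
        String s = "Niagara. O roar again!";

        System.out.println(s.indexOf('a'));        // 2
        System.out.println(s.lastIndexOf('a'));    // 18
        System.out.println(s.indexOf('a', 3));     // 4
        System.out.println(s.indexOf("roar"));     // 11
        System.out.println(s.indexOf('z'));        // -1 (not found)
        System.out.println(count(s, 'a'));         // 6
        System.out.println(indexInBuffer(new StringBuffer(s), "again"));  // 16
    }
}
```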
The following class, Filename [1], illustrates the use of lastIndexOf and substring to isolate different parts of a file name.
[1] Filename.java is included on the CD and is available online. See Code Samples (page 174).
Note
The methods in the following class don't do any error checking and assume that their argument contains a full directory path and a file name with an extension. If these methods were production code, they would verify that their arguments were properly constructed.

    // This class assumes that the string used to initialize
    // fullPath has a directory path, filename, and extension.
    // The methods won't work if it doesn't.
    public class Filename {
        private String fullPath;
        private char pathSeparator, extensionSeparator;

        public Filename(String str, char sep, char ext) {
            fullPath = str;
            pathSeparator = sep;
            extensionSeparator = ext;
        }

        public String extension() {
            int dot = fullPath.lastIndexOf(extensionSeparator);
            return fullPath.substring(dot + 1);
        }

        public String filename() {
            int dot = fullPath.lastIndexOf(extensionSeparator);
            int sep = fullPath.lastIndexOf(pathSeparator);
            return fullPath.substring(sep + 1, dot);
        }

        public String path() {
            int sep = fullPath.lastIndexOf(pathSeparator);
            return fullPath.substring(0, sep);
        }
    }

Here's a small program, named FilenameDemo [1], that constructs a Filename object and calls all its methods:
[1] FilenameDemo.java is included on the CD and is available online. See Code Samples (page 174).

    public class FilenameDemo {
        public static void main(String[] args) {
            Filename myHomePage = new Filename("/home/mem/index.html", '/', '.');
            System.out.println("Extension = " + myHomePage.extension());
            System.out.println("Filename = " + myHomePage.filename());
            System.out.println("Path = " + myHomePage.path());
        }
    }

And here's the output from FilenameDemo:

    Extension = html
    Filename = index
    Path = /home/mem

As shown in Figure 51, our extension method uses lastIndexOf to locate the last occurrence of the period (.) in the file name. Then substring uses the return value of lastIndexOf to extract the file name extension: that is, the substring that follows the period and runs to the end of the string. This code assumes that the file name has a period in it. If the file name does not have a period, lastIndexOf returns -1; the extension method then calls substring(0) and returns the entire string rather than an extension, and the filename method passes -1 as the end index to substring, which throws a StringIndexOutOfBoundsException.
Figure 51. The use of lastIndexOf and substring in the extension method in the Filename class.
Also, notice that the extension method uses dot + 1 as the argument to substring. If the period character (.) is the last character of the string, dot + 1 is equal to the length of the string, which is 1 larger than the largest index into the string (because indices start at 0). This is a legal argument to substring because that method accepts an index equal to but not greater than the length of the string and interprets it to mean "the end of the string."
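The note before the Filename class says that production code would verify its arguments. One possible way to do that is sketched below; it is an assumption about what such checks might look like, not part of the Filename class. The class SafeFilenameSketch, its static helpers, and the choice to return an empty string when no extension is found are all hypothetical.

```java
public class SafeFilenameSketch {
    // Hypothetical helper: returns the extension, or "" if the last path
    // component contains no extension separator.
    static String extension(String fullPath, char pathSep, char extSep) {
        int dot = fullPath.lastIndexOf(extSep);
        int sep = fullPath.lastIndexOf(pathSep);
        // A separator that appears before the last path separator
        // belongs to a directory name, not to the file name.
        if (dot == -1 || dot < sep) {
            return "";
        }
        return fullPath.substring(dot + 1);
    }

    // Hypothetical helper: returns the file name without its extension.
    // Works even when there is no path separator (sep is -1, so the
    // substring starts at index 0) or no extension.
    static String filename(String fullPath, char pathSep, char extSep) {
        int dot = fullPath.lastIndexOf(extSep);
        int sep = fullPath.lastIndexOf(pathSep);
        if (dot == -1 || dot < sep) {
            dot = fullPath.length();
        }
        return fullPath.substring(sep + 1, dot);
    }

    public static void main(String[] args) {
        System.out.println(extension("/home/mem/index.html", '/', '.'));  // html
        System.out.println(filename("/home/mem/index.html", '/', '.'));   // index
        System.out.println(filename("/home/mem/README", '/', '.'));       // README
        System.out.println(extension("/home/mem/README", '/', '.').length());  // 0
    }
}
```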
Comparing Strings and Portions of Strings
The String class has several methods for comparing strings and portions of strings. Table 28 lists and describes these methods.
Table 28. Methods in the String Class for Comparing Strings [a]

Method | Description |
---|---|
boolean endsWith(String); boolean startsWith(String); boolean startsWith(String, int) | Returns true if this string ends with or begins with the substring specified as an argument to the method. The integer argument, when present, indicates the offset within the original string at which to begin looking. |
int compareTo(String); int compareTo(Object) *; int compareToIgnoreCase(String) * | Compares two strings lexicographically and returns an integer indicating whether this string is greater than (result is > 0), equal to (result is = 0), or less than (result is < 0) the argument. The Object argument must be a String; otherwise, the method throws a ClassCastException. The compareToIgnoreCase method ignores case; thus, "a" and "A" are considered equal. |
boolean equals(Object); boolean equalsIgnoreCase(String) | Returns true if the argument is a string that contains the same sequence of characters as this string. If the Object argument is not a String, equals returns false. The equalsIgnoreCase method ignores case; thus, "a" and "A" are considered equal. |
boolean regionMatches(int, String, int, int); boolean regionMatches(boolean, int, String, int, int) | Tests whether the specified region of this string matches the specified region of the String argument. The boolean argument indicates whether case should be ignored; if true, the case is ignored when comparing characters. |
[a] Methods marked with * were added to the String class for Java 2 SDK 1.2.
The following program, RegionMatchesDemo [1], uses the regionMatches method to search for a string within another string:
[1] RegionMatchesDemo.java is included on the CD and is available online. See Code Samples (page 174).

    public class RegionMatchesDemo {
        public static void main(String[] args) {
            String searchMe = "Green Eggs and Ham";
            String findMe = "Eggs";
            int len = findMe.length();
            boolean foundIt = false;

            // Try each starting index at which findMe could still fit,
            // and stop as soon as a match is found.
            int i = 0;
            int last = searchMe.length() - len;
            while (i <= last && !foundIt) {
                if (searchMe.regionMatches(i, findMe, 0, len)) {
                    foundIt = true;
                } else {
                    i++;
                }
            }
            if (foundIt) {
                System.out.println(searchMe.substring(i, i + len));
            }
        }
    }

The output from this program is Eggs.
The program steps through the string referred to by searchMe one character at a time. For each character, the program calls the regionMatches method to determine whether the substring beginning with the current character matches the string for which the program is looking.
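The five-argument form of regionMatches from Table 28 supports the same kind of search while ignoring case. The sketch below is one way to write it; the class ComparisonSketch and the helper indexOfIgnoreCase are hypothetical names, and returning -1 for "not found" is a convention borrowed from indexOf.

```java
public class ComparisonSketch {
    // Hypothetical helper: returns the index of the first region of
    // searchMe that matches findMe when case is ignored, or -1 if
    // there is no such region.
    static int indexOfIgnoreCase(String searchMe, String findMe) {
        int len = findMe.length();
        int last = searchMe.length() - len;
        for (int i = 0; i <= last; i++) {
            // The leading true asks regionMatches to ignore case.
            if (searchMe.regionMatches(true, i, findMe, 0, len)) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String searchMe = "Green Eggs and Ham";

        System.out.println(indexOfIgnoreCase(searchMe, "eggs"));  // 6
        System.out.println(indexOfIgnoreCase(searchMe, "HAM"));   // 15
        System.out.println(indexOfIgnoreCase(searchMe, "Spam"));  // -1

        // A few of the other comparison methods from Table 28.
        System.out.println("a".compareTo("b") < 0);                    // true
        System.out.println("apple".compareToIgnoreCase("APPLE") == 0); // true
        System.out.println("index.html".endsWith(".html"));            // true
        System.out.println("index.html".startsWith("dex", 2));         // true
    }
}
```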
Manipulating Strings
The String class has several methods that appear to modify a string. Of course, strings can't be modified, so what these methods really do is create and return a second string that contains the result, as indicated in Table 29.
Table 29. Methods in the String Class for Manipulating Strings

Method | Description |
---|---|
String concat(String) | Concatenates the String argument to the end of this string. If the length of the argument is 0, the original string object is returned. |
String replace(char, char) | Replaces all occurrences of the character specified as the first argument with the character specified as the second argument. If no replacements are necessary, the original string object is returned. |
String trim() | Removes white space from both ends of this string. |
String toLowerCase(); String toUpperCase() | Converts this string to lower- or uppercase. If no conversions are necessary, these methods return the original string. |
Here's a small program, BostonAccentDemo [1], that uses the replace method to translate a string into the Bostonian dialect:
[1] BostonAccentDemo.java is included on the CD and is available online. See Code Samples (page 174).

    public class BostonAccentDemo {
        private static void bostonAccent(String sentence) {
            char r = 'r';
            char h = 'h';
            String translatedSentence = sentence.replace(r, h);
            System.out.println(translatedSentence);
        }

        public static void main(String[] args) {
            String translateThis = "Park the car in Harvard yard.";
            bostonAccent(translateThis);
        }
    }

The replace method switches all the r's to h's in the sentence string so that the output of this program is:
Pahk the cah in Hahvahd yahd.
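To see that these methods return a second string and leave the original untouched, consider the following sketch. ImmutableStringSketch and its bostonAccent helper are hypothetical names; the helper simply returns the translated string instead of printing it.

```java
public class ImmutableStringSketch {
    // Hypothetical helper: returns a new string with every 'r'
    // replaced by 'h'. The argument itself is not modified.
    static String bostonAccent(String sentence) {
        return sentence.replace('r', 'h');
    }

    public static void main(String[] args) {
        String original = "  Park the car.  ";
        String trimmed = original.trim();
        String translated = bostonAccent(trimmed);

        System.out.println("[" + original + "]");    // [  Park the car.  ]
        System.out.println("[" + trimmed + "]");     // [Park the car.]
        System.out.println("[" + translated + "]");  // [Pahk the cah.]

        // Concatenating an empty string returns the original object,
        // as described in Table 29.
        System.out.println(trimmed.concat("") == trimmed);  // true
    }
}
```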
Modifying String Buffers
As you know, string buffers can change. The StringBuffer class provides various methods for modifying the data within a string buffer. Table 30 summarizes the methods used to modify a string buffer.
Table 30. Methods for Modifying a String Buffer [a]

Method | Description |
---|---|
StringBuffer append(boolean); StringBuffer append(char); StringBuffer append(char[]); StringBuffer append(char[], int, int); StringBuffer append(double); StringBuffer append(float); StringBuffer append(int); StringBuffer append(long); StringBuffer append(Object); StringBuffer append(String) | Appends the argument to this string buffer. The data is converted to a string before the append operation takes place. |
StringBuffer delete(int, int) *; StringBuffer deleteCharAt(int) * | Deletes the specified character(s) in this string buffer. |
StringBuffer insert(int, boolean); StringBuffer insert(int, char); StringBuffer insert(int, char[]); StringBuffer insert(int, char[], int, int) *; StringBuffer insert(int, double); StringBuffer insert(int, float); StringBuffer insert(int, int); StringBuffer insert(int, long); StringBuffer insert(int, Object); StringBuffer insert(int, String) | Inserts the second argument into the string buffer. The first integer argument indicates the index before which the data is to be inserted. The data is converted to a string before the insert operation takes place. |
StringBuffer replace(int, int, String) *; void setCharAt(int, char) | Replaces the specified character(s) in this string buffer. |
StringBuffer reverse() | Reverses the sequence of characters in this string buffer. |
[a] Methods marked with * were added to the StringBuffer class for Java 2 SDK 1.2.
You saw the append method in action in the StringsDemo program at the beginning of this section. Here's a program, InsertDemo [1], that uses the insert method to insert a string into a string buffer:
[1] InsertDemo.java is included on the CD and is available online. See Code Samples (page 174).

    public class InsertDemo {
        public static void main(String[] args) {
            StringBuffer palindrome = new StringBuffer(
                "A man, a plan, a canal; Panama.");
            palindrome.insert(15, "a cat, ");
            System.out.println(palindrome);
        }
    }

The output from this program is still a palindrome:
A man, a plan, a cat, a canal; Panama. [2]
[2] Palindrome by Jim Saxe.
With insert, you specify the index before which you want the data inserted. In the example, 15 specifies that "a cat, " is to be inserted before the first a in a canal. To insert data at the beginning of a string buffer, use an index of 0. To add data at the end of a string buffer, use an index equal to the current length of the string buffer or use append.
If the operation that modifies a string buffer causes the size of the string buffer to grow beyond its current capacity, the string buffer allocates more memory. As mentioned previously, memory allocation is a relatively expensive operation, and you can make your code more efficient by initializing a string buffer's capacity to a reasonable first guess.
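The other methods in Table 30 work in a similar way. The sketch below applies several of them to one buffer and records the contents after each step; StringBufferEditsSketch and steps are hypothetical names, and the starting text is an arbitrary example.

```java
public class StringBufferEditsSketch {
    // Hypothetical helper: applies a series of edits to one string
    // buffer and returns a snapshot of its contents after each edit.
    static String[] steps() {
        String[] snapshots = new String[6];
        StringBuffer sb = new StringBuffer("Hello, World");

        sb.setCharAt(0, 'J');        // replace one character
        snapshots[0] = sb.toString();

        sb.replace(7, 12, "Java");   // replace indices 7 through 11
        snapshots[1] = sb.toString();

        sb.deleteCharAt(5);          // remove the comma
        snapshots[2] = sb.toString();

        sb.insert(0, 42);            // the int is converted to "42"
        snapshots[3] = sb.toString();

        sb.delete(0, 2);             // remove indices 0 and 1
        snapshots[4] = sb.toString();

        sb.reverse();
        snapshots[5] = sb.toString();
        return snapshots;
    }

    public static void main(String[] args) {
        String[] snapshots = steps();
        for (int i = 0; i < snapshots.length; i++) {
            System.out.println(snapshots[i]);
        }
        // Prints, one per line:
        // Jello, World
        // Jello, Java
        // Jello Java
        // 42Jello Java
        // Jello Java
        // avaJ olleJ
    }
}
```

Unlike the String methods in the previous section, every call here changes the same buffer object, which is why the snapshots must be taken with toString after each step.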
Strings and the Compiler
The compiler uses the String and the StringBuffer classes behind the scenes to handle literal strings and concatenation. As you know, you specify literal strings between double quotes:
"Hello World!"
You can use literal strings anywhere you would use a String object. For example, System.out.println accepts a string argument, so you could use a literal string there:
System.out.println("Might I add that you look lovely today.");
You can also use String methods directly from a literal string:
int len = "Goodbye Cruel World".length();
Because the compiler automatically creates a new string object for every literal string it encounters, you can use a literal string to initialize a string:
String s = "Hola Mundo";
The preceding construct is equivalent to, but more efficient than, this one, which ends up creating two identical strings:
String s = new String("Hola Mundo"); //don't do this
You can use + to concatenate strings:

    String cat = "cat";
    System.out.println("con" + cat + "enation");

Behind the scenes, the compiler uses string buffers to implement concatenation. The preceding example compiles to:

    String cat = "cat";
    System.out.println(new StringBuffer().append("con").
        append(cat).append("enation").toString());

You can also use the + operator to append to a string values that are not themselves strings:
System.out.println("You're number " + 1);
The compiler implicitly converts the nonstring value (the integer 1 in the example) to a string object before performing the concatenation operation.
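The following sketch compares the + operator with the explicit StringBuffer form described above, assuming the translation the text gives; it checks only that both produce the same string, not what a particular compiler actually generates. ConcatenationSketch and its two helpers are hypothetical names.

```java
public class ConcatenationSketch {
    // Hypothetical helper: concatenation written with the + operator.
    static String withOperator(String cat, int n) {
        return "con" + cat + "enation #" + n;
    }

    // Hypothetical helper: the same concatenation written out with an
    // explicit string buffer.
    static String withBuffer(String cat, int n) {
        return new StringBuffer().append("con").append(cat)
            .append("enation #").append(n).toString();
    }

    public static void main(String[] args) {
        System.out.println(withOperator("cat", 1));  // concatenation #1
        System.out.println(withBuffer("cat", 1));    // concatenation #1

        // + is evaluated left to right, so numbers that appear before
        // the first string are added arithmetically.
        System.out.println(1 + 2 + " apples");       // 3 apples
        System.out.println("apples: " + 1 + 2);      // apples: 12
    }
}
```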
Summary of Characters and Strings
Use a Character object to contain a single character value, use a String object to contain a sequence of characters that won't change, and use a StringBuffer object to construct or to modify a sequence of characters dynamically. Refer to the tables listed in Table 31 for details about constructors and methods in these classes.
Number | Table Title |
---|---|
Table 23 | Useful Class Methods in the Character Class |
Table 24 | Constructors in the String Class |
Table 25 | Constructors in the StringBuffer Class |
Table 26 | The substring Methods in the String and StringBuffer Classes |
Table 27 | The indexOf and lastIndexOf Methods in the String Class |
Table 28 | Methods in the String Class for Comparing Strings |
Table 29 | Methods in the String Class for Manipulating Strings |
Table 30 | Methods for Modifying a String Buffer |
Here's a fun program, Palindrome [1], that determines whether a string is a palindrome. This program uses many methods from the String and the StringBuffer classes:
[1] Palindrome.java is included on the CD and is available online. See Code Samples (page 174). To save space, we've removed the comments from the code. The online version is well commented.

    public class Palindrome {
        public static boolean isPalindrome(String stringToTest) {
            String workingCopy = removeJunk(stringToTest);
            String reversedCopy = reverse(workingCopy);
            return reversedCopy.equalsIgnoreCase(workingCopy);
        }

        protected static String removeJunk(String string) {
            int i, len = string.length();
            StringBuffer dest = new StringBuffer(len);
            char c;

            for (i = (len - 1); i >= 0; i--) {
                c = string.charAt(i);
                if (Character.isLetterOrDigit(c)) {
                    dest.append(c);
                }
            }
            return dest.toString();
        }

        protected static String reverse(String string) {
            StringBuffer sb = new StringBuffer(string);
            return sb.reverse().toString();
        }

        public static void main(String[] args) {
            String string = "Madam, I'm Adam.";

            System.out.println();
            System.out.println("Testing whether the following "
                + "string is a palindrome:");
            System.out.println(" " + string);
            System.out.println();

            if (isPalindrome(string)) {
                System.out.println("It IS a palindrome!");
            } else {
                System.out.println("It is NOT a palindrome!");
            }
            System.out.println();
        }
    }

The output from this program is:

    Testing whether the following string is a palindrome:
     Madam, I'm Adam.

    It IS a palindrome!

Questions and Exercises: Characters and Strings
Questions
1: What is the initial capacity of the following string buffer?

    StringBuffer sb = new StringBuffer("Able was I ere I saw Elba.");

2: Consider the following line of code:

    String hannah = "Did Hannah see bees? Hannah did.";

3: In the following program, what is the value of result after each numbered line executes?

    public class ComputeResult {
        public static void main(String[] args) {
            String original = "software";
            StringBuffer result = new StringBuffer("hi");
            int index = original.indexOf('a');

    /*1*/   result.setCharAt(0, original.charAt(0));
    /*2*/   result.setCharAt(1, original.charAt(original.length()-1));
    /*3*/   result.insert(1, original.charAt(4));
    /*4*/   result.append(original.substring(1,4));
    /*5*/   result.insert(3, (original.substring(index, index+2) + " "));

            System.out.println(result);
        }
    }

Exercises
1: Show two ways to concatenate the following two strings together to get "Hi, mom.":

    String hi = "Hi, ";
    String mom = "mom.";

2: Write a program that computes your initials from your full name and displays them.
3: An anagram is a word or a phrase made by transposing the letters of another word or phrase; for example, "parliament" is an anagram of "partial men," and "software" is an anagram of "swear oft." Write a program that figures out whether one string is an anagram of another string. The program should ignore white space and punctuation.
Answers
You can find answers to these Questions and Exercises online:
http://java.sun.com/docs/books/tutorial/java/data/QandE/characters-answers.html