Strings and StringBuilders

As described in Appendix A, there is some special syntax for using the String class. The double-quote syntax allows us to create an instance of the String class without using the keyword new. The + operator allows us to easily concatenate Strings. Thus, we can say

String sport = "foot" + "ball";

instead of the much more cumbersome:

String sport = new String(new char[] {'f', 'o', 'o', 't'});
sport = sport.concat(new String(new char[] {'b', 'a', 'l', 'l'}));

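We saw in Section 2.1 that Strings are immutable. Once a String is created, it cannot be modified. Immutability allows Java to sometimes save space by not storing redundant copies of identical Strings. Because this sharing happens only some of the time, and == compares references rather than contents, it is dangerous to use == to compare Strings; use equals() instead.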
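The danger of == can be seen in a small sketch. The class name StringEqualityDemo and the helper buildAtRunTime() are hypothetical names chosen for this illustration; the helper uses concat() so that the String is assembled while the program runs rather than written as a literal.

```java
class StringEqualityDemo {
    // Hypothetical helper: joins two Strings at run time with concat(),
    // which creates a new String whenever the argument is nonempty.
    static String buildAtRunTime(String first, String second) {
        return first.concat(second);
    }

    public static void main(String[] args) {
        String literal = "football";
        String built = buildAtRunTime("foot", "ball");
        System.out.println(literal.equals(built)); // true: same characters
        System.out.println(literal == built);      // false: two distinct instances
    }
}
```

Both Strings contain the same characters, so equals() is the comparison that reflects what we usually mean.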

The immutability of Strings sometimes has a cost in efficiency. Consider the toString() method from our ArrayList class in Section 5.3, reproduced in Figure 13-1.

Figure 13-1. The toString() method from the ArrayList class uses Strings.

public String toString() {
    String result = "[ ";
    for (int i = 0; i < size; i++) {
        result += data[i] + " ";
    }
    return result + "]";
}

Every time we use the + operator, a new String must be created and the contents of the old String copied into it. It would be better if we could avoid some of this copying (Figure 13-2). The built-in StringBuilder class allows us to do just this.

Figure 13-2. The toString() method of an instance of our ArrayList class returns a String such as "[ a b c d ]". Using Strings (top), it is necessary to create a new String instance every time new characters are added. A StringBuilder (bottom) stretches like an ArrayList, so it is not necessary to copy the array every time we add new characters.

If a String is like an array of characters, a StringBuilder is like an ArrayList of characters. We can optionally specify the capacity of a StringBuilder as an argument to the constructor, but it can stretch when necessary. Because the capacity of a StringBuilder doubles when it runs out of room, appending a new character takes constant amortized time. Creating a new String with an extra character, on the other hand, takes time linear in the number of characters previously in the String.
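One rough way to observe this stretching is to append characters one at a time and count how often the value returned by capacity() changes. The class and method names below are hypothetical, and the exact growth policy is an implementation detail of the Java library, so the sketch prints the counts rather than assuming particular values.

```java
class CapacityGrowthDemo {
    // Appends n characters one at a time and counts how many times
    // the StringBuilder's capacity changes along the way.
    static int countCapacityChanges(int n) {
        StringBuilder builder = new StringBuilder();
        int changes = 0;
        int lastCapacity = builder.capacity();
        for (int i = 0; i < n; i++) {
            builder.append('x');
            if (builder.capacity() != lastCapacity) {
                changes++;
                lastCapacity = builder.capacity();
            }
        }
        return changes;
    }

    public static void main(String[] args) {
        for (int n = 10; n <= 100000; n *= 10) {
            System.out.println(n + " appends, "
                + countCapacityChanges(n) + " capacity changes");
        }
    }
}
```

If the capacity roughly doubles each time, the number of changes grows only logarithmically with the number of appends, which is why the copying cost averages out to a constant per character.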

An improved version of the toString() method using a StringBuilder is given in Figure 13-3. We should generally write our toString() methods this way.

Figure 13-3. The toString() method using a StringBuilder.

public String toString() {
    StringBuilder result = new StringBuilder("[ ");
    for (int i = 0; i < size; i++) {
        result.append(data[i] + " ");
    }
    result.append("]");
    return result.toString();
}
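To try this method outside the ArrayList class, it can be placed in a small stand-in class. TinyList below is a hypothetical class invented for this sketch, not the ArrayList from Section 5.3; it keeps only the data and size fields that toString() needs. As a possible refinement, the loop appends the element and the space separately, so that no temporary String is built for data[i] + " ".

```java
class TinyList {
    private final Object[] data; // backing array, standing in for ArrayList's data
    private final int size;      // number of elements in use

    TinyList(Object... elements) {
        data = elements.clone();
        size = elements.length;
    }

    @Override
    public String toString() {
        StringBuilder result = new StringBuilder("[ ");
        for (int i = 0; i < size; i++) {
            // Two appends instead of one append of data[i] + " ".
            result.append(data[i]).append(' ');
        }
        return result.append(']').toString();
    }

    public static void main(String[] args) {
        System.out.println(new TinyList("a", "b", "c", "d")); // prints [ a b c d ]
    }
}
```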

Some of the methods from the String and StringBuilder classes are given in Figure 13-4. There are more methods not listed here; we'll leave the details for the API. The discussion that follows highlights some information that should let us use these classes more effectively.

Figure 13-4. UML class diagram of the String and StringBuilder classes, the Object class, and some associated interfaces.


Because instances of the String class are immutable, none of the listed methods of the String class have a return type of void. There would generally be no point in a method which neither returns a value nor modifies the object on which it is invoked. Instead, many of these methods return new Strings.

The contains() method returns true if its argument is a substring of this. A substring is a consecutive sequence of 0 or more characters within a String. For example, the invocation

"bookkeeper".contains("ookkee")

returns true.

The substring() method, given two arguments start and end, returns the substring from index start up to but not including index end. Thus,

"sesquipedalian".substring(3, 7)

returns characters 3 through 6, that is, "quip".

A substring starting at index 0 is called a prefix. The method startsWith() determines whether its argument is a prefix. A substring running up against the other end of a String is called a suffix. The method endsWith() determines whether its argument is a suffix.
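The relationships among substrings, prefixes, and suffixes can be sketched with the four methods just described. The class SubstringDemo and its classify() method are hypothetical names used only for this illustration.

```java
class SubstringDemo {
    // Describes how part relates to text, using only
    // contains(), startsWith(), and endsWith().
    static String classify(String text, String part) {
        if (!text.contains(part)) {
            return "not a substring";
        }
        boolean prefix = text.startsWith(part);
        boolean suffix = text.endsWith(part);
        if (prefix && suffix) {
            return "prefix and suffix";
        }
        if (prefix) {
            return "prefix";
        }
        if (suffix) {
            return "suffix";
        }
        return "interior substring";
    }

    public static void main(String[] args) {
        System.out.println(classify("bookkeeper", "ookkee")); // interior substring
        System.out.println(classify("bookkeeper", "book"));   // prefix
        System.out.println(classify("bookkeeper", "keeper")); // suffix
        System.out.println(classify("bookkeeper", "beep"));   // not a substring
        System.out.println("sesquipedalian".substring(3, 7)); // quip
    }
}
```

Note that the empty String, being a sequence of 0 characters, counts as both a prefix and a suffix of every String.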

To avoid keeping redundant copies of identical Strings, the String class maintains a pool of instances. If a String expression involving no variables is identical to some instance in this pool, the value of the expression is a reference to the existing instance instead of a new String. If it is not, a new instance is added to the pool. To cause any other String instance to be treated this way, we can invoke its intern() method. Thus, if two Strings a and b are equals(), then the following expression is true:

a.intern() == b.intern()
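A minimal sketch of this behavior follows. InternDemo and sameInstanceAfterIntern() are hypothetical names; both Strings are assembled with concat() so that neither is taken from the pool to begin with.

```java
class InternDemo {
    // True if the pooled representatives of a and b are the same instance.
    static boolean sameInstanceAfterIntern(String a, String b) {
        return a.intern() == b.intern();
    }

    public static void main(String[] args) {
        String a = "foot".concat("ball");
        String b = "footb".concat("all");
        System.out.println(a.equals(b));                   // true: same characters
        System.out.println(a == b);                        // false: separate instances
        System.out.println(sameInstanceAfterIntern(a, b)); // true: one pooled instance
    }
}
```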

The trim() method returns a new String with any spaces, tabs, or newline characters removed from the ends. For example,

" dramatic pause ".trim()

is "dramatic pause".
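The following sketch shows that trim() affects only the ends and leaves the original String untouched. TrimDemo and bracket() are hypothetical names; the brackets simply make the remaining whitespace visible.

```java
class TrimDemo {
    // Wraps the trimmed text in brackets so the ends are easy to see.
    static String bracket(String raw) {
        return "[" + raw.trim() + "]";
    }

    public static void main(String[] args) {
        String raw = "\t dramatic pause \n";
        System.out.println(bracket(raw));  // prints [dramatic pause]
        System.out.println(raw.length());  // prints 18: raw itself is unchanged
    }
}
```

The space between the two words survives because it is not at either end.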

Many of the methods in the StringBuilder class have the return type StringBuilder. These methods actually modify the instance on which they are invoked. They could have the return type void, but for lack of anything else useful to return, they return the object on which they are invoked.
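Returning the object itself makes it possible to chain several invocations in one expression. The sketch below, with the hypothetical names ChainingDemo and buildSport(), uses the append() and insert() methods of StringBuilder and checks that the returned reference is the very instance that was modified.

```java
class ChainingDemo {
    // Builds "football!" by chaining calls on a single StringBuilder.
    static String buildSport() {
        return new StringBuilder("ball")
            .insert(0, "foot") // modifies the builder and returns it
            .append('!')       // so the next call applies to the same builder
            .toString();
    }

    public static void main(String[] args) {
        StringBuilder builder = new StringBuilder("ball");
        StringBuilder returned = builder.insert(0, "foot");
        System.out.println(returned == builder); // true: the same instance
        System.out.println(builder);             // prints football
        System.out.println(buildSport());        // prints football!
    }
}
```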

Exercises

13.1

We can't say

String state = "stressed".reverse();  

because the String class has no reverse() method. Show how to accomplish the same thing using the StringBuilder class.

 
13.2

Create an immutable version of the Die class from Chapter 1. The roll() method, instead of modifying the current instance, should return a new one. What methods need to be removed?

13.3

Look up the getChars() method of the String class in the API. This method has a return type of void. What is the point of this, when Strings are immutable?

String Matching
