Synchronized Collections
The synchronized collection classes include Vector and Hashtable, part of the original JDK, as well as their cousins added in JDK 1.2, the synchronized wrapper classes created by the Collections.synchronizedXxx factory methods. These classes achieve thread safety by encapsulating their state and synchronizing every public method so that only one thread at a time can access the collection state.
5.1.1. Problems with Synchronized Collections
The synchronized collections are thread-safe, but you may sometimes need to use additional client-side locking to guard compound actions. Common compound actions on collections include iteration (repeatedly fetch elements until the collection is exhausted), navigation (find the next element after this one according to some order), and conditional operations such as put-if-absent (check if a Map has a mapping for key K, and if not, add the mapping (K,V)). With a synchronized collection, these compound actions are still technically thread-safe even without client-side locking, but they may not behave as you might expect when other threads can concurrently modify the collection.
Listing 5.1 shows two methods that operate on a Vector, getLast and delete-Last, both of which are check-then-act sequences. Each calls size to determine the size of the array and uses the resulting value to retrieve or remove the last element.
Listing 5.1. Compound Actions on a Vector that may Produce Confusing Results.
public static Object getLast(Vector list) { int lastIndex = list.size() - 1; return list.get(lastIndex); } public static void deleteLast(Vector list) { int lastIndex = list.size() - 1; list.remove(lastIndex); } |
These methods seem harmless, and in a sense they arethey can't corrupt the Vector, no matter how many threads call them simultaneously. But the caller of these methods might have a different opinion. If thread A calls getLast on a Vector with ten elements, thread B calls deleteLast on the same Vector, and the operations are interleaved as shown in Figure 5.1, getLast tHRows ArrayIndexOutOfBoundsException. Between the call to size and the subsequent call to get in getLast, the Vector shrank and the index computed in the first step is no longer valid. This is perfectly consistent with the specification of Vectorit throws an exception if asked for a nonexistent element. But this is not what a caller expects getLast to do, even in the face of concurrent modification, unless perhaps the Vector was empty to begin with.
Figure 5.1. Interleaving of Getlast and Deletelast that throws ArrayIndexOut-OfBoundsException.
Because the synchronized collections commit to a synchronization policy that supports client-side locking, [1] it is possible to create new operations that are atomic with respect to other collection operations as long as we know which lock to use. The synchronized collection classes guard each method with the lock on the synchronized collection object itself. By acquiring the collection lock we can make getLast and deleteLast atomic, ensuring that the size of the Vector does not change between calling size and get, as shown in Listing 5.2.
[1] This is documented only obliquely in the Java 5.0 Javadoc, as an example of the correct iteration idiom.
The risk that the size of the list might change between a call to size and the corresponding call to get is also present when we iterate through the elements of a Vector as shown in Listing 5.3.
This iteration idiom relies on a leap of faith that other threads will not modify the Vector between the calls to size and get. In a single-threaded environment, this assumption is perfectly valid, but when other threads may concurrently modify the Vector it can lead to trouble. Just as with getLast, if another thread deletes an element while you are iterating through the Vector and the operations are interleaved unluckily, this iteration idiom throws ArrayIndexOutOfBoundsException.
Listing 5.2. Compound Actions on Vector Using Client-side Locking.
public static Object getLast(Vector list) { synchronized (list) { int lastIndex = list.size() - 1; return list.get(lastIndex); } } public static void deleteLast(Vector list) { synchronized (list) { int lastIndex = list.size() - 1; list.remove(lastIndex); } } |
Listing 5.3. Iteration that may Throw ArrayIndexOutOfBoundsException.
for (int i = 0; i < vector.size(); i++) doSomething(vector.get(i)); |
Even though the iteration in Listing 5.3 can throw an exception, this doesn't mean Vector isn't thread-safe. The state of the Vector is still valid and the exception is in fact in conformance with its specification. However, that something as mundane as fetching the last element or iteration throw an exception is clearly undesirable.
The problem of unreliable iteration can again be addressed by client-side locking, at some additional cost to scalability. By holding the Vector lock for the duration of iteration, as shown in Listing 5.4, we prevent other threads from modifying the Vector while we are iterating it. Unfortunately, we also prevent other threads from accessing it at all during this time, impairing concurrency.
Listing 5.4. Iteration with Client-side Locking.
synchronized (vector) { for (int i = 0; i < vector.size(); i++) doSomething(vector.get(i)); } |
5.1.2. Iterators and Concurrentmodificationexception
We use Vector for the sake of clarity in many of our examples, even though it is considered a "legacy" collection class. But the more "modern" collection classes do not eliminate the problem of compound actions. The standard way to iterate a Collection is with an Iterator, either explicitly or through the for-each loop syntax introduced in Java 5.0, but using iterators does not obviate the need to lock the collection during iteration if other threads can concurrently modify it. The iterators returned by the synchronized collections are not designed to deal with concurrent modification, and they are fail-fastmeaning that if they detect that the collection has changed since iteration began, they throw the unchecked ConcurrentModificationException.
These fail-fast iterators are not designed to be foolproofthey are designed to catch concurrency errors on a "good-faith-effort" basis and thus act only as early-warning indicators for concurrency problems. They are implemented by associating a modification count with the collection: if the modification count changes during iteration, hasNext or next throws ConcurrentModificationException. However, this check is done without synchronization, so there is a risk of seeing a stale value of the modification count and therefore that the iterator does not realize a modification has been made. This was a deliberate design tradeoff to reduce the performance impact of the concurrent modification detection code.[2]
[2] ConcurrentModificationException can arise in single-threaded code as well; this happens when objects are removed from the collection directly rather than through Iterator.remove.
Listing 5.5 illustrates iterating a collection with the for-each loop syntax. Internally, javac generates code that uses an Iterator, repeatedly calling hasNext and next to iterate the List. Just as with iterating the Vector, the way to prevent ConcurrentModificationException is to hold the collection lock for the duration of the iteration.
Listing 5.5. Iterating a List with an Iterator.
List widgetList = Collections.synchronizedList(new ArrayList()); ... // May throw ConcurrentModificationException for (Widget w : widgetList) doSomething(w); |
There are several reasons, however, why locking a collection during iteration may be undesirable. Other threads that need to access the collection will block until the iteration is complete; if the collection is large or the task performed for each element is lengthy, they could wait a long time. Also, if the collection is locked as in Listing 5.4, doSomething is being called with a lock held, which is a risk factor for deadlock (see Chapter 10). Even in the absence of starvation or deadlock risk, locking collections for significant periods of time hurts application scalability. The longer a lock is held, the more likely it is to be contended, and if many threads are blocked waiting for a lock throughput and CPU utilization can suffer (see Chapter 11).
An alternative to locking the collection during iteration is to clone the collection and iterate the copy instead. Since the clone is thread-confined, no other thread can modify it during iteration, eliminating the possibility of ConcurrentModificationException. (The collection still must be locked during the clone operation itself.) Cloning the collection has an obvious performance cost; whether this is a favorable tradeoff depends on many factors including the size of the collection, how much work is done for each element, the relative frequency of iteration compared to other collection operations, and responsiveness and throughput requirements.
5.1.3. Hidden Iterators
While locking can prevent iterators from throwing ConcurrentModificationException, you have to remember to use locking everywhere a shared collection might be iterated. This is trickier than it sounds, as iterators are sometimes hidden, as in HiddenIterator in Listing 5.6. There is no explicit iteration in HiddenIterator, but the code in bold entails iteration just the same. The string concatenation gets turned by the compiler into a call to StringBuilder.append(Object), which in turn invokes the collection's toString methodand the implementation of toString in the standard collections iterates the collection and calls toString on each element to produce a nicely formatted representation of the collection's contents.
The addTenThings method could throw ConcurrentModificationException, because the collection is being iterated by toString in the process of preparing the debugging message. Of course, the real problem is that HiddenIterator is not thread-safe; the HiddenIterator lock should be acquired before using set in the println call, but debugging and logging code commonly neglect to do this.
The real lesson here is that the greater the distance between the state and the synchronization that guards it, the more likely that someone will forget to use proper synchronization when accessing that state. If HiddenIterator wrapped the HashSet with a synchronizedSet, encapsulating the synchronization, this sort of error would not occur.
Just as encapsulating an object's state makes it easier to preserve its invariants, encapsulating its synchronization makes it easier to enforce its synchronization policy. |
Listing 5.6. Iteration Hidden within String Concatenation. Don't Do this.
public class HiddenIterator { @GuardedBy("this") private final Set set = new HashSet(); public synchronized void add(Integer i) { set.add(i); } public synchronized void remove(Integer i) { set.remove(i); } public void addTenThings() { Random r = new Random(); for (int i = 0; i < 10; i++) add(r.nextInt()); System.out.println("DEBUG: added ten elements to " + set); } } |
Iteration is also indirectly invoked by the collection's hashCode and equals methods, which may be called if the collection is used as an element or key of another collection. Similarly, the containsAll, removeAll, and retainAll methods, as well as the constructors that take collections are arguments, also iterate the collection. All of these indirect uses of iteration can cause ConcurrentModificationException.
Concurrent Collections
|