4.5. Documenting Synchronization Policies

Documentation is one of the most powerful (and, sadly, most underutilized) tools for managing thread safety. Users look to the documentation to find out if a class is thread-safe, and maintainers look to the documentation to understand the implementation strategy so they can maintain it without inadvertently compromising safety. Unfortunately, both of these constituencies usually find less information in the documentation than they'd like.

Document a class's thread safety guarantees for its clients; document its synchronization policy for its maintainers.

Each use of synchronized, volatile, or any thread-safe class reflects a synchronization policy defining a strategy for ensuring the integrity of data in the face of concurrent access. That policy is an element of your program's design, and should be documented. Of course, the best time to document design decisions is at design time. Weeks or months later, the details may be a blur, so write it down before you forget.

Crafting a synchronization policy requires a number of decisions: which variables to make volatile, which variables to guard with locks, which lock(s) guard which variables, which variables to make immutable or confine to a thread, which operations must be atomic, etc. Some of these are strictly implementation details and should be documented for the sake of future maintainers, but some affect the publicly observable locking behavior of your class and should be documented as part of its specification.

At the very least, document the thread safety guarantees made by a class. Is it thread-safe? Does it make callbacks with a lock held? Are there any specific locks that affect its behavior? Don't force clients to make risky guesses. If you don't want to commit to supporting client-side locking, that's fine, but say so. If you want clients to be able to create new atomic operations on your class, as we did in Section 4.4, you need to document which locks they should acquire to do so safely. If you use locks to guard state, document this for future maintainers, because it's so easy: the @GuardedBy annotation will do the trick. If you use more subtle means to maintain thread safety, document them because they may not be obvious to maintainers.
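As a minimal sketch of what such documentation might look like, the class below states its guarantees for clients in the Javadoc and records its locking policy for maintainers with @GuardedBy. HitCounter is a hypothetical class invented for illustration, and the annotation is declared locally as a stand-in for net.jcip.annotations.GuardedBy so that the example compiles without an external library.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Local stand-in for net.jcip.annotations.GuardedBy (an assumption of this
// sketch); it only documents the policy and enforces nothing at run time.
@Target({ElementType.FIELD, ElementType.METHOD})
@Retention(RetentionPolicy.SOURCE)
@interface GuardedBy {
    String value();
}

/**
 * Counts hits (hypothetical example class).
 *
 * Thread safety: this class is thread-safe. All state is guarded by the
 * intrinsic lock of the instance. Client-side locking is supported: a
 * caller may hold that lock (synchronized on the instance) to make a
 * compound action such as "reset if nonzero" atomic. No callbacks are
 * made while the lock is held.
 */
final class HitCounter {
    @GuardedBy("this") private long hits;

    public synchronized void increment() {
        hits++;
    }

    public synchronized long get() {
        return hits;
    }

    public synchronized void reset() {
        hits = 0;
    }
}
```

Because the Javadoc commits to the intrinsic lock, a client can write `synchronized (counter) { if (counter.get() > 0) counter.reset(); }` and rely on the check and the reset happening atomically with respect to the other methods.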

The current state of affairs in thread safety documentation, even in the platform library classes, is not encouraging. How many times have you looked at the Javadoc for a class and wondered whether it was thread-safe?[8] Most classes don't offer any clue either way. Many official Java technology specifications, such as servlets and JDBC, woefully underdocument their thread safety promises and requirements.

[8] If you've never wondered this, we admire your optimism.

While prudence suggests that we not assume behaviors that aren't part of the specification, we have work to get done, and we are often faced with a choice of bad assumptions. Should we assume an object is thread-safe because it seems that it ought to be? Should we assume that access to an object can be made thread-safe by acquiring its lock first? (This risky technique works only if we control all the code that accesses that object; otherwise, it provides only the illusion of thread safety.) Neither choice is very satisfying.

To make matters worse, our intuition may often be wrong about which classes are "probably thread-safe" and which are not. As an example, java.text.SimpleDateFormat isn't thread-safe, but the Javadoc neglected to mention this until JDK 1.4. That this particular class isn't thread-safe comes as a surprise to many developers. How many programs mistakenly create a shared instance of a non-thread-safe object and use it from multiple threads, unaware that this might cause erroneous results under heavy load?

The problem with SimpleDateFormat could be avoided by not assuming a class is thread-safe if it doesn't say so. On the other hand, it is impossible to develop a servlet-based application without making some pretty questionable assumptions about the thread safety of container-provided objects like HttpSession. Don't make your customers or colleagues have to make guesses like this.
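Returning to SimpleDateFormat: one way to act on "don't assume it is thread-safe" is to confine each formatter to a single thread. The sketch below does this with a ThreadLocal; the Timestamps class, the pattern, and the choice of UTC are assumptions made for illustration, and other remedies (a new formatter per call, or locking around a shared one) would serve equally well.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

/**
 * Formats timestamps (hypothetical example class).
 *
 * Thread safety: this class is thread-safe. SimpleDateFormat is not, so
 * each thread is given its own instance and none is ever shared.
 */
final class Timestamps {
    // One formatter per thread; instances never escape their thread.
    private static final ThreadLocal<SimpleDateFormat> FORMAT =
        ThreadLocal.withInitial(() -> {
            SimpleDateFormat f =
                new SimpleDateFormat("yyyy-MM-dd HH:mm:ss", Locale.ROOT);
            f.setTimeZone(TimeZone.getTimeZone("UTC"));
            return f;
        });

    private Timestamps() {
    }

    static String format(Date date) {
        return FORMAT.get().format(date);
    }
}
```

Note that the Javadoc states the guarantee outright, so callers of Timestamps.format are not left to guess the way users of SimpleDateFormat once were.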

4.5.1. Interpreting Vague Documentation

Many Java technology specifications are silent, or at least unforthcoming, about thread safety guarantees and requirements for interfaces such as ServletContext, HttpSession, or DataSource.[9] Since these interfaces are implemented by your container or database vendor, you often can't look at the code to see what it does. Besides, you don't want to rely on the implementation details of one particular JDBC driver; you want to be compliant with the standard so your code works properly with any JDBC driver. But the words "thread" and "concurrent" do not appear at all in the JDBC specification, and appear frustratingly rarely in the servlet specification. So what do you do?

[9] We find it particularly frustrating that these omissions persist despite multiple major revisions of the specifications.

You are going to have to guess. One way to improve the quality of your guess is to interpret the specification from the perspective of someone who will implement it (such as a container or database vendor), as opposed to someone who will merely use it. Servlets are always called from a container-managed thread, and it is safe to assume that if there is more than one such thread, the container knows this. The servlet container makes available certain objects that provide service to multiple servlets, such as HttpSession or ServletContext. So the servlet container should expect to have these objects accessed concurrently, since it has created multiple threads and called methods like Servlet.service from them that could reasonably be expected to access the ServletContext.

Since it is impossible to imagine a single-threaded context in which these objects would be useful, one has to assume that they have been made thread-safe, even though the specification does not explicitly require this. Besides, if they required client-side locking, on what lock should the client code synchronize? The documentation doesn't say, and it seems absurd to guess. This "reasonable assumption" is further bolstered by the examples in the specification and official tutorials that show how to access ServletContext or HttpSession and do not use any client-side synchronization.

On the other hand, the objects placed in the ServletContext or HttpSession with setAttribute are owned by the web application, not the servlet container. The servlet specification does not suggest any mechanism for coordinating concurrent access to shared attributes. So attributes stored by the container on behalf of the web application should be thread-safe or effectively immutable. If all the container did was store these attributes on behalf of the web application, another option would be to ensure that they are consistently guarded by a lock when accessed from servlet application code. But because the container may want to serialize objects in the HttpSession for replication or passivation purposes, and the servlet container can't possibly know your locking protocol, you should make them thread-safe.
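One simple way to satisfy this requirement is to store only immutable objects as attributes. The following is a sketch under assumptions: UserPreferences and its fields are hypothetical, and the class implements Serializable only because the container may serialize session contents as described above.

```java
import java.io.Serializable;

/**
 * User display preferences (hypothetical example class), intended to be
 * stored as a session attribute.
 *
 * Thread safety: immutable, and therefore safe to share between request
 * threads and the container without any locking.
 */
final class UserPreferences implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String theme;
    private final int pageSize;

    UserPreferences(String theme, int pageSize) {
        this.theme = theme;
        this.pageSize = pageSize;
    }

    String theme() {
        return theme;
    }

    int pageSize() {
        return pageSize;
    }

    // "Modification" returns a new object; the caller replaces the attribute.
    UserPreferences withTheme(String newTheme) {
        return new UserPreferences(newTheme, pageSize);
    }
}
```

A servlet would then update the preference by storing a replacement object, for example `session.setAttribute("prefs", prefs.withTheme("dark"))`, rather than mutating an object that other threads or the container may be reading. The attribute name "prefs" is likewise an assumption.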

One can make a similar inference about the JDBC DataSource interface, which represents a pool of reusable database connections. A DataSource provides service to an application, and it doesn't make much sense in the context of a single-threaded application. It is hard to imagine a use case that doesn't involve calling getConnection from multiple threads. And, as with servlets, the examples in the JDBC specification do not suggest the need for any client-side locking in the many code examples using DataSource. So, even though the specification doesn't promise that DataSource is thread-safe or require container vendors to provide a thread-safe implementation, by the same "it would be absurd if it weren't" argument, we have no choice but to assume that DataSource.getConnection does not require additional client-side locking.

On the other hand, we would not make the same argument about the JDBC Connection objects dispensed by the DataSource, since these are not necessarily intended to be shared by other activities until they are returned to the pool. So if an activity that obtains a JDBC Connection spans multiple threads, it must take responsibility for ensuring that access to the Connection is properly guarded by synchronization. (In most applications, activities that use a JDBC Connection are implemented so as to confine the Connection to a specific thread anyway.)
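The thread-confined usage mentioned above might be sketched as follows. ConfinedWork, ConnectionTask, and withConnection are hypothetical names introduced for illustration; only DataSource.getConnection and Connection.close are actual JDBC calls. The Connection is obtained, used, and returned within a single method call on a single thread, so it is never shared.

```java
import java.sql.Connection;
import java.sql.SQLException;
import javax.sql.DataSource;

/**
 * Runs a unit of work with a Connection confined to the calling thread
 * (hypothetical helper class).
 */
final class ConfinedWork {
    /** A unit of work that needs a Connection (hypothetical interface). */
    interface ConnectionTask<T> {
        T run(Connection connection) throws SQLException;
    }

    private ConfinedWork() {
    }

    static <T> T withConnection(DataSource dataSource, ConnectionTask<T> task)
            throws SQLException {
        // The DataSource is assumed to tolerate concurrent getConnection
        // calls; the Connection stays a local variable of this thread.
        try (Connection connection = dataSource.getConnection()) {
            return task.run(connection);
        } // close() hands the Connection back, typically to the pool
    }
}
```

The confinement holds only as long as the task does not pass the Connection to another thread or store it where another thread can reach it; if it does, the task takes on the responsibility for synchronization.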
