Making a Resource Thread-Safe

Problem

You are using multiple threads in a program and you need to ensure a resource is not modified by more than one thread at a time. In general, this process is called making the resource thread-safe, or serializing access to it.

Solution

Use the class mutex, defined in boost/thread/mutex.hpp, to synchronize access among threads. Example 12-2 shows a simple use of a mutex object to control concurrent access to a queue.

Example 12-2. Making a class thread-safe

#include <list>
#include <string>
#include <boost/thread/thread.hpp>
#include <boost/thread/mutex.hpp>

// A simple queue class; don't do this, use std::queue
template<typename T>
class Queue {
public:
   Queue() {}
  ~Queue() {}

   void enqueue(const T& x) {
      // Lock the mutex for this queue
      boost::mutex::scoped_lock lock(mutex_);
      list_.push_back(x);
      // A scoped_lock is automatically destroyed (and thus unlocked)
      // when it goes out of scope
   }

   T dequeue() {
      boost::mutex::scoped_lock lock(mutex_);

      if (list_.empty())
         throw "empty!";        // This leaves the current scope, so the
      T tmp = list_.front();    // lock is released
      list_.pop_front();
      return(tmp);
   }  // Again: when scope ends, mutex_ is unlocked

private:
   std::list<T> list_;
   boost::mutex mutex_;
};

Queue<std::string> queueOfStrings;

void sendSomething() {
   std::string s;

   for (int i = 0; i < 10; ++i) {
      queueOfStrings.enqueue("Cyrus");
   }
}

void recvSomething() {
   std::string s;

   for (int i = 0; i < 10; ++i) {
      try {s = queueOfStrings.dequeue();} catch(...) {}
   }
}

int main() {
   boost::thread thr1(sendSomething);
   boost::thread thr2(recvSomething);
   thr1.join();
   thr2.join();
}


Discussion

Making classes, functions, blocks of code, or other objects thread-safe is at the heart of multithreaded programming. If you are designing a piece of software to be multithreaded, chances are that each thread will have its own set of resources, such as stack and heap objects, operating system resources, and so on. But sooner or later you will need to share something among threads. It may be a shared queue of incoming work requests (as in a multithreaded web server) or something as simple as an output stream (as in a log file, or even cout). The standard way of coordinating the safe sharing of resources is with a mutex, which provides mutually exclusive access to something.

The rest of this discussion describes what a mutex is in general and how to use boost::mutex in particular to serialize access to resources. I use the concept/model approach terminology that I mentioned briefly in the introduction to this chapter. A concept is an abstract (language-independent) description of something, and a model of a concept is its concrete representation in C++ class form. A refinement of a concept is a given concept with some additional or augmented behavior.

Concurrent programming is a complicated subject, though, and there are more techniques than can fit in a single recipe. There are many patterns to choose from, and different strategies suit different applications. If you plan to do a significant amount of multithreaded programming, or if you are designing performance-critical applications, you ought to pick up a good book on multithreaded patterns. Many of the problems that make debugging multithreaded programs so difficult can be averted with careful, tedious design.

Using mutexes

The mutex concept is simple: a mutex is something that represents a resource and can be locked or unlocked by only one thread at a time. It is a flag used to coordinate access to a resource by multiple consumers. In the Boost Threads library, the mutex concept is modeled by the boost::mutex class. In Example 12-2, write access to the Queue class is maintained with a mutex member variable:

boost::mutex mutex_;

mutex_ must be locked by any member function that changes the state of the queue. The mutex object itself has no knowledge of what it's protecting; it's just a locked/unlocked flag shared by all consumers of some resource.

In Example 12-2, when a Queue member function needs to change the state of the object, it must first lock mutex_. Only one thread at a time can lock it, which is what prevents multiple threads from modifying the state of a Queue object at once. Thus, a mutex is a simple signaling mechanism, but it is more than just a bool or an int, because locking a mutex must itself be a serialized operation, something only the operating system kernel can guarantee. If you try doing the same thing with a bool, it won't work, because nothing prevents multiple threads from modifying the state of a bool at the same time. (Different operating systems provide this guarantee in different ways, which is why it is not easy to implement a portable threading library.)
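To see why, here is a sketch of the do-it-yourself approach. Everything in it (the names, the flag, the free-standing list) is invented for illustration, and it is broken by design:

#include <list>
#include <string>

// BROKEN, for illustration only: a plain bool cannot serialize access.
std::list<std::string> list_;
bool inUse = false;      // shared "lock" flag

void brokenEnqueue(const std::string& x) {
   while (inUse) {}      // two threads can both see inUse == false here...
   inUse = true;         // ...and both set it, because the test and the set
                         // are separate steps that can be interleaved
   list_.push_back(x);   // now two threads may mutate the list at once
   inUse = false;
}

The test and the set are two distinct operations, so two threads can pass the while loop before either one sets the flag; a mutex makes that acquire step indivisible.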

mutex objects are locked and unlocked using several different locking strategies, the simplest of which is the scoped_lock. A scoped_lock is a class that, when constructed using a mutex argument, locks it until the lock is destroyed. Look at the enqueue member function in Example 12-2 to see how scoped_lock works with a mutex:

void enqueue(const T& x) {
   boost::mutex::scoped_lock lock(mutex_);
   list_.push_back(x);
}  // unlocked!

When lock is destroyed, mutex_ is unlocked. If the lock is constructed on a mutex that is already locked by another thread, the current thread goes into a wait state until the lock becomes available.

This design may seem a little odd at first: why not just have lock and unlock methods on mutex? The approach of using a scoped_lock class that locks on construction and unlocks on destruction is more convenient and less error-prone. A scoped_lock holds the mutex for its own lifetime, which means you don't have to explicitly unlock anything on every control path. If, on the other hand, you had to unlock a locked mutex explicitly, you would have to ensure that any exception thrown in your function (or anywhere above it on the call stack) is caught and the mutex unlocked. With a scoped_lock, whether an exception is thrown or the function returns, the scoped_lock object is automatically destroyed and the mutex is unlocked.
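To make the hazard concrete, here is a sketch using a hypothetical ManualMutex member that exposes lock and unlock methods (boost::mutex deliberately provides no such members):

// Hypothetical ManualMutex with lock()/unlock(), shown only to
// illustrate the failure mode that scoped_lock is designed to avoid.
void enqueue(const T& x) {
   manualMutex_.lock();
   list_.push_back(x);    // if this throws (e.g., std::bad_alloc), the
   manualMutex_.unlock(); // unlock below never runs, and the mutex stays
}                         // locked forever, blocking every other thread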

Using a mutex will get the job done, but it leaves a little to be desired. It makes no distinction between reading and writing, which is significant, because it is inefficient to force threads to wait in line to use a resource when many or all of them are doing a read-only operation, which should not require exclusive access. For this, the Boost Threads library provides read_write_mutex. Example 12-3 shows how you might implement Example 12-2 using a read_write_mutex with a front member function that allows the caller to retrieve a copy of the first item on the queue without popping it.

Example 12-3. Using a read/write mutex

#include <list>
#include <string>
#include <boost/thread/thread.hpp>
#include <boost/thread/read_write_mutex.hpp>

template<typename T>
class Queue {
public:
   Queue() :  // Use a read/write mutex and give writers priority
      rwMutex_(boost::read_write_scheduling_policy::writer_priority) {}
  ~Queue() {}

   void enqueue(const T& x) {
      // Use a r/w lock since enqueue updates the state
      boost::read_write_mutex::scoped_write_lock writeLock(rwMutex_);
      list_.push_back(x);
   }

   T dequeue() {
      // Again, use a write lock
      boost::read_write_mutex::scoped_write_lock writeLock(rwMutex_);

      if (list_.empty())
         throw "empty!";
      T tmp = list_.front();
      list_.pop_front();
      return(tmp);
   }

   T getFront() {
      // This is a read-only operation, so you only need a read lock
      boost::read_write_mutex::scoped_read_lock readLock(rwMutex_);

      if (list_.empty())
         throw "empty!";
      return(list_.front());
   }

private:
   std::list<T> list_;
   boost::read_write_mutex rwMutex_;
};

Queue<std::string> queueOfStrings;

void sendSomething() {
   std::string s;

   for (int i = 0; i < 10; ++i) {
      queueOfStrings.enqueue("Cyrus");
   }
}

void checkTheFront() {
   std::string s;

   for (int i = 0; i < 10; ++i) {
      try {s = queueOfStrings.getFront();} catch(...) {}
   }
}

int main() {
   boost::thread thr1(sendSomething);

   boost::thread_group grp;
   grp.create_thread(checkTheFront);
   grp.create_thread(checkTheFront);
   grp.create_thread(checkTheFront);
   grp.create_thread(checkTheFront);

   thr1.join();
   grp.join_all();
}

There are a few things I should point out here. Notice that now I am using a read_write_mutex, like this:

boost::read_write_mutex rwMutex_;

The locks are also different when you're using read/write mutexes. In Example 12-3, when I want to lock the Queue for writing, I create a scoped_write_lock:

boost::read_write_mutex::scoped_write_lock writeLock(rwMutex_);

And when I just need to read the Queue, I use a scoped_read_lock:

boost::read_write_mutex::scoped_read_lock readLock(rwMutex_);

Read/write locks are handy, but they don't prevent you from shooting yourself in the foot. There is no compile-time check on the resource represented by rwMutex_ to make sure you're not changing it when you hold only a read lock. Take extra care to ensure that a thread modifies the state of an object only when it holds a write lock, because the compiler won't catch the mistake for you.
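For instance, nothing stops a member function like the following from compiling. It is a hypothetical addition to Example 12-3's Queue (brokenPop is not part of the recipe), shown only to illustrate the foot-gun:

// Compiles without complaint, but is wrong: a read lock confers no const-ness.
void brokenPop() {
   boost::read_write_mutex::scoped_read_lock readLock(rwMutex_);
   list_.pop_front();  // mutates shared state under a read lock; other threads
                       // holding read locks may be traversing the list right now
}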

Exactly how these locks are scheduled is determined by the scheduling policy you chose when you constructed the mutex. There are four that are provided by the Boost Threads library:


reader_priority

Threads waiting for a read lock will be granted the lock before those waiting for a write lock.


writer_priority

Threads waiting for a write lock will be granted the lock before those waiting for a read lock.


alternating_single_read

Alternate between read and write locks. Grant a single reader a read lock when it is the readers' "turn." This policy gives writers priority in general. For example, if the mutex is write-locked, and there are several pending read locks and one pending write lock, one read lock will be granted, then the waiting write lock will be granted, then all remaining read locks will be granted. This assumes no new locks are requested during this period.


alternating_many_reads

Alternate between read and write locks. Grant all readers' locks when it is the readers' "turn." In other words, this policy empties the queue of all waiting read locks in between write locks.

Each of these policies has different pros and cons, and they will perform differently depending on your application. Deciding which policy to use takes careful consideration, because simply going with reader or writer priority can result in starvation, which I describe in more detail below.
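For reference, selecting a policy looks like this; each construction mirrors the constructor call used in Example 12-3:

// A read/write mutex is bound to one scheduling policy at construction time.
boost::read_write_mutex m1(boost::read_write_scheduling_policy::reader_priority);
boost::read_write_mutex m2(boost::read_write_scheduling_policy::writer_priority);
boost::read_write_mutex m3(boost::read_write_scheduling_policy::alternating_single_read);
boost::read_write_mutex m4(boost::read_write_scheduling_policy::alternating_many_reads);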

Dangers

There are three basic problems that occur when you are programming with multiple threads: deadlock, starvation, and race conditions. There are techniques for avoiding each of them, with varying degrees of sophistication that are beyond the scope of this recipe. I will describe what each problem is so you know what to watch out for, but if you plan on doing multithreaded application development, you should do some homework on multithreaded patterns first.

Deadlock is a situation that involves at least two threads and two resources. Consider two threads, A and B, and two resources, X and Y, where A has a lock on X and B has a lock on Y. A deadlock occurs when A tries to lock Y and B tries to lock X. If threads are not designed to break the deadlock somehow, then they will wait forever.
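In code, the classic form of this deadlock looks something like the following sketch (the thread and mutex names are invented to match the A/B/X/Y description above):

#include <boost/thread/thread.hpp>
#include <boost/thread/mutex.hpp>

boost::mutex mutexX, mutexY;   // resources X and Y

void threadA() {
   boost::mutex::scoped_lock lockX(mutexX);  // A locks X...
   boost::mutex::scoped_lock lockY(mutexY);  // ...then blocks waiting for Y
}

void threadB() {
   boost::mutex::scoped_lock lockY(mutexY);  // B locks Y...
   boost::mutex::scoped_lock lockX(mutexX);  // ...then blocks waiting for X
}
// If A holds X while B holds Y, neither second constructor ever returns.

One common defense is to have every thread acquire locks in the same global order (say, always X before Y); the try mutexes described next are another.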

The Boost Threads library lets you avoid deadlocks with refinements to the mutex and locking concepts. A try mutex is one that supports attempts at locking, using a try lock that either succeeds or fails immediately, without blocking to wait for the lock to become available. Using models of these concepts in the form of try_mutex and scoped_try_lock, your code can go do something else if the resource it needs is locked. There is a further refinement of the try lock: the timed lock, with which a thread can give up after blocking for a specific amount of time. I do not discuss timed locks in detail here; have a look at the Boost Threads documentation for details.

For example, suppose that in the Queue class from Example 12-2 you wanted to use a try mutex so that dequeue returns a bool indicating whether it was able to dequeue the first item. This way, consumers of dequeue don't have to wait around if the queue is locked. Here's how you could rewrite dequeue:

bool dequeue(T& x) {
   boost::try_mutex::scoped_try_lock lock(tryMutex_);

   if (!lock.locked())
      return(false);
   else {
      if (list_.empty())
         throw "empty!";
      x = list_.front();
      list_.pop_front();
      return(true);
   }
}

private:
   boost::try_mutex tryMutex_;
   // ...

The mutex and lock being used here are different from those in Example 12-2. Be sure to correctly qualify the names of the mutex and lock classes you are using; otherwise, you won't get the behavior you expect.

When you serialize access to something, you tell consumers of it to line up and wait their turn. If each of them stays in the same position in line, everybody gets a chance to use the resource. But if you let some consumers cut in line, it is possible that those at the back of the line never get their turn. This is starvation.

When using a mutex, consumers wait in a group, not a line: there is no guaranteed order among threads that are waiting for a lock. For read/write mutexes, the Boost Threads library uses the four scheduling policies described earlier, so be aware of what the different policies mean and what your threads are doing. If you are using writer_priority and you have lots of threads creating write locks, your readers will starve; the same goes for reader_priority, since these scheduling policies always prefer one type of lock over the other. If testing reveals that one kind of thread isn't making as much progress as it should, consider switching to the alternating_many_reads or alternating_single_read policy. You specify the policy when constructing a read/write mutex.

Finally, a race condition is a situation where your code has made an assumption about the order or atomicity of locks that has proven false. For example, consider a consumer of the Queue class that interrogates the element on the front and conditionally dequeues it:

if (q.getFront() == "Cyrus") {
   str = q.dequeue();
   // ...

This code works fine in a single-threaded environment, because q won't be modified between the first and second lines. When using multiple threads, however, you have to account for the possibility that another thread modifies q at any moment; in fact, you should assume that shared objects are modified whenever a thread doesn't have them locked. After the call to getFront, another thread can come along and dequeue the next item from q, which means the subsequent call to dequeue gets something unexpected or nothing at all. Both getFront and dequeue lock the single mutex used to modify q, but in between the two calls the mutex is unlocked, and if another thread is waiting on it, that thread may snatch it up first.

A solution for this particular race condition is to ensure that the lock is held for the duration of the entire operation. Create a member function called dequeueIfEquals that dequeues the next object only if it equals the argument. dequeueIfEquals can take a lock like anything else:

T dequeueIfEquals(const T& t) {
   boost::mutex::scoped_lock lock(mutex_);

   if (list_.front() == t)
      // ...
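The recipe leaves the body unfinished. One plausible completion, shown as a sketch rather than the book's own code, borrows the bool-and-reference interface from the try-lock version of dequeue above:

// A possible completion (sketch): report success via bool rather than throwing.
bool dequeueIfEquals(const T& t, T& x) {
   boost::mutex::scoped_lock lock(mutex_);  // held for the whole test-and-pop

   if (list_.empty() || !(list_.front() == t))
      return(false);
   x = list_.front();
   list_.pop_front();
   return(true);                            // lock released when scope ends
}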

There are other kinds of race conditions, but this example should give you a general idea of what to watch out for. As the number of threads and shared resources increases, race conditions become more subtle and difficult to catch, so take special care to design them out from the beginning.

Ensuring serialized access to resources is the most difficult part of multithreading, because when you don't get it right, debugging can be a nightmare. Since a multithreaded program is inherently nondeterministic (threads can execute in a different order and for different lengths of time on each run), it is painful to pinpoint exactly where or how a thread modifies something it shouldn't. More so than with single-threaded programming, a sound design will minimize debugging and rework.
