SLaks.Blog

Making the world a better place, one line of code at a time

About Concurrent Collections

Posted on Sunday, November 20, 2011, at 6:42:00 PM UTC

One of the most useful additions to the .Net 4.0 base class library is the System.Collections.Concurrent namespace, which contains an all-new set of lock-free thread.

However, these collections are noticeably different from their classical counterparts.  There is no simple ConcurrentList<T> that you can drop into your code so that it will become thread-safe.  Instead, the new namespace has a queue, a stack, and some new thing called a bag, as well as ConcurrentDictionary<TKey, TValue> that largely resembles classical dictionaries.  It also has a BlockingCollection<T> class that wraps a concurrent collection and blocks until operations can succeed.

Many people have complained that Microsoft chose to provide an entirely new set of classes, rather than adding synchronized versions of the existing collections. 

In truth, however, creating this new set of classes is the correct – and, in fact, only – choice.  Ordinary synchronized are rarely useful and will not make a program thread-safe.  In general, slapping locks everywhere does not make a program thread-safe!  Rather, that will either not help (if there aren’t enough locks) or, if there are enough locks, result in deadlocks or a program that never runs more than one thread at a time.  (depending on how many different objects get locked)

Collections in particular have two fundamental issues when used on multiple threads.

The first, and simpler, issue is that the collection classes themselves are not thread-safe.  If two threads add to a List<T> at the same exact time, one thread is likely to overwrite the other thread’s value.  A synchronized version of List<T> with a lock around every method would solve this problem.

The bigger issue is that any code that uses a List<T> is unlikely to be thread-safe, even if the list itself is thread-safe.  For example. you can never enumerate over a multi-threaded list, because another thread may change the list at any time, invalidating the enumerator.  This issue could be solved by taking a read lock for the lifetime of the enumerator.  However, that is also a bad idea, since if any client code forgets to dispose the enumerator, the collection will deadlock when written to.

You also cannot use indices.  It is never safe to get, set, or remove an item at an index, because another thread might remove that item or clear the entire collection between your index check and the operation.

To solve all of these problems, you need thread-safe collections that provide atomic operations.  This is why all of the concurrent collections have such strange methods, including TryPop, AddOrUpdate, and TryTake.  These methods perform their operations atomically, and return false if the collection was empty (as appropriate; consult the documentation for actual detail).  Thus, the new concurrent collections can be used reliably in actual multi-threaded code without a separate layer of locks.

Categories: .Net, multi-threading, thread-safety Tweet this post

comments powered by Disqus