Data races in the JDK!

Eric | June 14, 2008

At the moment I am doing some more work on evaluating tracematches ahead-of-time. One tracematch patten that we use in our benchmarks we called ASyncIter, a simplified version of which looks as follows:

tracematch(Collection c) {
	sym sync after returning:
		call(* Collections.synchr*(..)) && args(c);
	sym iter before:
		call(* Collection.iterator()) && target(c);

	sync iter {
		if(!Thread.holdsLock(c))
			error(``Have to synchronize iterator at ''+thisJoinPoint);
        }
}

This tracematch reports an error if you create a synchronized collection and then iterate over this collection without holding a lock on the collection object. According to the JDK javadoc this is forbidden as it can lead to a race condition. One has to use synchronized collections as follows:

Collection c = Collections.synchronizedCollection(myCollection);
...
synchronized(c) {
	Iterator i = c.iterator(); // Must be in the synchronized block
	while (i.hasNext())
		foo(i.next());
}

The problem

Fair enough. If one is aware of this problem one can certainly write code like that and using the ASyncIter tracematch one can even check whether this idiom is adhered to. However I was absolutely stunned to find out that Koushik Sen found violations of this pattern even in the current version of the JDK! Yes that’s right, even the people at Sun got it wrong (and IBM’s JDK get’s it wrong too)!

Check out this code here:

public static void main(String[] args) {
	List l1 = new LinkedList(), l2 = new LinkedList();
	for(int i=0;i<100000;i++) {
		l1.add(""+i);
	}
	for(int i=0;i<100000;i++) {
		l2.add(""+i);
	}
	//reverse to make the search take a little longer
	Collections.reverse(l2);
	final List sl1 = Collections.synchronizedList(l1);
	final List sl2 = Collections.synchronizedList(l2);

	new Thread() {
		public void run() {
			//synchronized (sl2) {
				sl1.containsAll(sl2);
			//}
		}
	}.start();
	new Thread() {
		public void run() {
			try {
				Thread.sleep(100);
			} catch (InterruptedException e) {
			}
			sl2.add("evil");
		}
	}.start();
}

The code fills two linked lists. We reverse one of them so that if I call l1.containsAll(l2) this call takes a little longer (note that search in lists is slow anyway). Then I synchronize both lists by calling Collections.synchronizedList(..). This returns a synchronized wrapper object. So actually the concurrent modification performed by the second Thread in line 27 should do no harm. So you would think! This is what you get when you run this code:

Exception in thread "Thread-0" java.util.ConcurrentModificationException
	at java.util.LinkedList$ListItr.checkForComodification(LinkedList.java:761)
	at java.util.LinkedList$ListItr.next(LinkedList.java:696)
	at java.util.AbstractCollection.containsAll(AbstractCollection.java:278)
	at java.util.Collections$SynchronizedCollection.containsAll(Collections.java:1584)
	at Main$1.run(Main.java:24)

How come? The problem is that java.util.Collections.SynchronizedCollection.containsAll(Collection) only synchronizes on “this”, i.e. the synchronized collection which you call containsAll on. It does not however synchronize on the collection that you pass in as an argument (sl2 in our case). Therefore when I call sl2.add("evil") this call executes immediately and is not blocked until containsAll returns!

A possible solution

So how can one fix this? That’s easy: Just change the implementation of SynchronizedCollection.containsAll(Collection)  to:

public boolean containsAll(Collection<?> coll) {
    synchronized(mutex) {
	synchronized(coll) { //added
		return c.containsAll(coll);
	}
    }
}

Apparently somebody noted this problem before and I am stunned about the reply that Sun’s developers gave (“not a problem”).

How did Koushik Sen find this?

Yes that’s an interesting one. Koushik Sen wrote a really great paper for this year’s PLDI in which he proposes RaceFuzzer, a tool that filters out real data races from possible data races (such as the ones reported by Racer) at runtime. The tool basically stops the threads that could be involved in a race just before the race is going to happen (or not) and then triggers the race (if there is one) by progressing the threads step-by-step in a random order from there on. As you can see this works beautifully!