Effective Java: Concurrency

Concurrency

Objective: Threads enable multiple tasks to proceed concurrently. Concurrent programming is more difficult than single-threaded programming because more things can go wrong and failures are hard to reproduce.

Still, concurrency is a must in modern programming if you want to make use of multicore processors. This section provides advice to help you write clear, correct, well-documented concurrent programs.

Key topics:

  1. Synchronize Access to Sharable Mutable Data
  2. Excessive Synchronization
  3. Concurrency Utilities
  4. Concurrency and Lazy Initialization

Estimated time: 15-30 minutes.

Synchronize Access to Sharable Mutable Data

Which program below won’t stop as expected?

Don’t use volatile

import java.util.concurrent.TimeUnit;

public class StopThread {
    private static boolean stopRequested;

    public static void main(String[] args) throws InterruptedException {
        Thread thread = new Thread(() -> {
            int i = 0;
            while (!stopRequested) i++;
        });
        thread.start();

        TimeUnit.SECONDS.sleep(1);
        stopRequested = true;
    }
}

Use volatile

import java.util.concurrent.TimeUnit;

public class StopThread {
    private static volatile boolean stopRequested;

    public static void main(String[] args) throws InterruptedException {
        Thread thread = new Thread(() -> {
            int i = 0;
            while (!stopRequested) i++;
        });
        thread.start();

        TimeUnit.SECONDS.sleep(1);
        stopRequested = true;
    }
}

Can you spot a bug in one of the following snippets?

Use volatile

private static volatile int nextSerialNumber = 0;

public static int generateSerialNumber() {
    return nextSerialNumber++;
}

Use atomic facilities

private static final AtomicLong nextSerialNumber = new AtomicLong();

public static long generateSerialNumber() {
    return nextSerialNumber.getAndIncrement();
}

When multiple threads share mutable data, each thread that reads or writes the data must perform synchronization. Otherwise there is no guarantee that one thread's changes will be visible to other threads, which can cause liveness and safety failures. These failures are among the most difficult to debug.

In the first example above, volatile is a correct alternative to synchronization because you need only inter-thread communication, not mutual exclusion. Without it, the background thread may never see the updated value of stopRequested and so may never stop, which is a liveness failure. In the second example, volatile is not enough because the increment operator (++) is not atomic. It performs two operations on nextSerialNumber: (1) it reads the value, and (2) it writes back a new value equal to the old value plus one. If a second thread reads the field between the time the first thread reads the old value and the time it writes back the new one, both threads see the same value and return the same serial number. This is a safety failure.
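To make the difference concrete, here is a minimal sketch that hammers the atomic version from several threads and counts how many distinct serial numbers come back. The class name SerialNumberDemo and the helper countDistinct are hypothetical names introduced for this illustration; only generateSerialNumber mirrors the snippet above.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical demo class; only generateSerialNumber comes from the text above.
class SerialNumberDemo {
    private static final AtomicLong nextSerialNumber = new AtomicLong();

    static long generateSerialNumber() {
        return nextSerialNumber.getAndIncrement(); // atomic read-modify-write
    }

    // Starts `threads` workers that each draw `perThread` numbers,
    // then reports how many distinct numbers were handed out.
    static int countDistinct(int threads, int perThread) {
        Set<Long> seen = ConcurrentHashMap.newKeySet();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++)
                    seen.add(generateSerialNumber());
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println(countDistinct(4, 10_000)); // 40000: no duplicates
    }
}
```

If you swap the AtomicLong for a volatile int incremented with ++, the count can come out lower than the number of calls, although a lost update is not guaranteed to show up on any particular run.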

Note that the best way to avoid safety failures is not to share mutable data: either share only immutable data or don't share at all, that is, confine mutable data to a single thread. If you adopt this policy, document it carefully so that it is maintained as your program evolves. It is also crucial to understand the frameworks and libraries you're using, because they may introduce threads that you are unaware of.

Excessive Synchronization

Assume that you are implementing an observable set wrapper that enables clients to subscribe/unsubscribe to notifications when elements are added to the set or removed from the set, as follows:

@FunctionalInterface
public interface SetObserver<E> {
    // Invoked when an element is added to the observable set
    void added(ObservableSet<E> set, E element);
}

// ForwardingSet is the reusable forwarding (wrapper) class from Effective Java
public class ObservableSet<E> extends ForwardingSet<E> {

    private final List<SetObserver<E>> observers = new ArrayList<>();

    public ObservableSet(Set<E> set) { super(set); }

    public void addObserver(SetObserver<E> observer) { … }
    public void removeObserver(SetObserver<E> observer) { … }

    private void notifyElementAdded(E element) { … }

    @Override public boolean add(E element) {
        boolean added = super.add(element);
        if (added)
            notifyElementAdded(element);
        return added;
    }

}

// An example of use of ObservableSet
public static void main(String[] args) {
    ObservableSet<Integer> set = new ObservableSet<>(new HashSet<>());
    set.addObserver(new SetObserver<>() {
        public void added(ObservableSet<Integer> s, Integer e) {
            System.out.println(e);
            if (e == 5)
                s.removeObserver(this);
        }
    });
    for (int i = 0; i < 10; i++)
        set.add(i);
}

Can you spot a bug in one of the following implementations of notifyElementAdded?

Do not use snapshot of observers

private void notifyElementAdded(E element) {
    synchronized (observers) {
        for (SetObserver<E> observer : observers)
            observer.added(this, element);
    }
}

Use snapshot of observers

private void notifyElementAdded(E element) {
    List<SetObserver<E>> snapshot = null;
    synchronized (observers) {
        snapshot = new ArrayList<>(observers);
    }
    for (SetObserver<E> observer : snapshot)
        observer.added(this, element);
}

observer.added is an alien method: it is supplied by clients, so you don't know, and can't know, what it does. You are therefore better off NOT invoking any alien method from within synchronized code. The first implementation above throws a ConcurrentModificationException: notifyElementAdded is iterating over observers inside the synchronized block when it invokes the observer's added method, which calls the observable set's removeObserver method, which in turn calls observers.remove. Because Java locks are reentrant, holding the lock does not stop the same thread from modifying the list in the middle of the iteration. This example shows only an exception, but in other circumstances an alien call can cause catastrophic problems such as deadlock.

It is usually not too difficult to fix this sort of problem by moving alien method invocations out of synchronized blocks, as the second implementation above does. Note that this implementation can be improved further by using CopyOnWriteArrayList from the java.util.concurrent package, which makes the explicit snapshot and synchronization unnecessary. You should make use of this package when working on concurrency. Here are some other best practices:
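The CopyOnWriteArrayList variant mentioned above might look like the following sketch. To keep it self-contained it does not extend ForwardingSet; NotifyingSet, ElementObserver, and NotifyingSetDemo are hypothetical stand-ins for ObservableSet, SetObserver, and the example main method, and the element set is guarded by a simple lock chosen only for this illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical names: cut-down stand-ins for SetObserver and ObservableSet.
interface ElementObserver<E> {
    void added(NotifyingSet<E> set, E element);
}

class NotifyingSet<E> {
    private final Set<E> elements = new HashSet<>();
    // Each iteration works on a snapshot of the backing array,
    // so no explicit lock or manual copy is needed for observers.
    private final List<ElementObserver<E>> observers = new CopyOnWriteArrayList<>();

    void addObserver(ElementObserver<E> observer) { observers.add(observer); }
    boolean removeObserver(ElementObserver<E> observer) { return observers.remove(observer); }

    boolean add(E element) {
        boolean added;
        synchronized (elements) { added = elements.add(element); }
        if (added)
            for (ElementObserver<E> observer : observers)
                observer.added(this, element); // alien call, no lock held
        return added;
    }
}

class NotifyingSetDemo {
    // Returns the elements the observer saw before it unsubscribed itself.
    static List<Integer> run() {
        List<Integer> seen = new ArrayList<>();
        NotifyingSet<Integer> set = new NotifyingSet<>();
        set.addObserver(new ElementObserver<Integer>() {
            @Override public void added(NotifyingSet<Integer> s, Integer e) {
                seen.add(e);
                if (e == 5)
                    s.removeObserver(this);
            }
        });
        for (int i = 0; i < 10; i++)
            set.add(i);
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(run()); // [0, 1, 2, 3, 4, 5]
    }
}
```

The trade-off is that every addObserver or removeObserver call copies the whole array, which suits observer lists because they are typically read far more often than they are modified.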

  • You should do as little work as possible inside synchronized regions: obtain the lock, examine shared data, transform it if necessary, and drop the lock. If you must perform time-consuming tasks then find a way to move them out of synchronized blocks without violating guidelines in the previous sub-section.
  • If you are writing a mutable class, you can omit all synchronization, document that the class is not thread-safe, and allow clients to synchronize externally if they want to. StringBuilder and ThreadLocalRandom follow this guideline.
  • If you are writing a mutable class, you should choose to synchronize internally only if you can achieve significantly higher concurrency than clients could by locking the whole object externally. If you do so, use good techniques such as lock splitting, lock striping, and nonblocking concurrency control. Classes in the java.util.concurrent package follow this guideline.

Concurrency Utilities

Which of the following implementations, both of which simulate the behavior of String.intern, is faster?

Don’t use Map.get

private static final ConcurrentMap<String, String> map = new ConcurrentHashMap<>();

public static String intern(String s) {
    String previousValue = map.putIfAbsent(s, s);
    return previousValue == null ? s : previousValue;
}

Use Map.get

private static final ConcurrentMap<String, String> map = new ConcurrentHashMap<>();

public static String intern(String s) {
    String result = map.get(s); // cheap read; succeeds once s is interned
    if (result == null) {
        result = map.putIfAbsent(s, s); // returns null if s was absent
        if (result == null)
            return s;
    }
    return result;
}

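The second implementation is usually faster than the first because ConcurrentHashMap is optimized for retrieval operations such as get, so it pays to call get first and fall back to putIfAbsent only when the key turns out to be missing. (Effective Java reports that this version ran over six times faster than String.intern on the author's machine.) You should have a deep understanding of the java.util.concurrent package so that you can make use of its many concurrency utilities. Here are some other recommendations: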

  • Prefer executors, tasks, and streams, to threads wherever possible.
  • Prefer  ConcurrentHashMap  to  Collections.synchronizedMap .
  • If you have to use lock fields, you should always declare them as  final .
  • If you have to fix a program that barely works (e.g., some of its threads are unnecessarily runnable), don't depend on the thread scheduler (e.g., by relying on Thread.yield or thread priorities), because the resulting program will be neither robust nor portable. Consider restructuring the application, using the guidelines in this section, to reduce the number of concurrently runnable threads.
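As an illustration of the first recommendation, the sketch below hands a batch of tasks to an executor instead of creating and joining threads by hand. ExecutorDemo, sumOfSquares, and the pool size of 4 are arbitrary choices made for this example, not something prescribed by the text.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical demo class for the "prefer executors and tasks" advice.
class ExecutorDemo {
    // Sums the squares of 1..n by submitting one task per number to a fixed pool.
    static long sumOfSquares(int n) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Callable<Long>> tasks = new ArrayList<>();
            for (int i = 1; i <= n; i++) {
                final long value = i;
                tasks.add(() -> value * value); // the unit of work is a task, not a thread
            }
            long total = 0;
            for (Future<Long> result : pool.invokeAll(tasks))
                total += result.get();
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown(); // let the worker threads exit
        }
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(10)); // 385
    }
}
```

Because the pool size is fixed, the number of concurrently runnable threads stays bounded no matter how many tasks are submitted, which is the kind of restructuring the last bullet above recommends.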

Whether or not you use concurrency utilities to avoid implementing internal synchronization, you should always document what level of thread safety your class supports, so that clients can use it safely from multiple threads. Here are some common levels:

  • Immutable: Instances of this class appear constant, so no external synchronization is necessary. Examples include String, Long, and BigInteger.
  • Unconditionally thread-safe: Instances of this class are mutable, but no external synchronization is necessary because the class has sufficient internal synchronization. Examples include AtomicLong and ConcurrentHashMap.
  • Conditionally thread-safe: Like unconditionally thread-safe, but some methods require external synchronization. Examples include collections returned by the Collections.synchronized wrappers, whose iterators require external synchronization.
  • Not thread-safe: Instances of this class are mutable, and to use them concurrently clients must surround each method invocation or invocation sequence with external synchronization. Examples include general-purpose collections such as ArrayList and HashMap.
  • Thread-hostile: This class is unsafe for concurrent use even if every method invocation is surrounded by external synchronization; it must be fixed or deprecated. Examples include the non-atomic generateSerialNumber from the first sub-section.
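The conditionally thread-safe level is the easiest to get wrong, so here is a small sketch of what it asks of clients, using a map from Collections.synchronizedMap. The class SynchronizedMapDemo and its methods record and total are hypothetical names made up for this illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class showing client-side locking for a conditionally thread-safe map.
class SynchronizedMapDemo {
    private static final Map<String, Integer> counts =
            Collections.synchronizedMap(new HashMap<>());

    static void record(String key) {
        counts.merge(key, 1, Integer::sum); // single call: synchronized by the wrapper
    }

    static int total() {
        int sum = 0;
        // Iteration is a sequence of calls, so the wrapper's documentation
        // requires the client to hold the map's own lock for the whole traversal.
        synchronized (counts) {
            for (int value : counts.values())
                sum += value;
        }
        return sum;
    }

    public static void main(String[] args) {
        record("a");
        record("a");
        record("b");
        System.out.println(total()); // 3
    }
}
```

Documentation for a conditionally thread-safe class should state which invocation sequences need a lock and which lock to acquire, as the Collections.synchronizedMap documentation does for iteration.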

Concurrency and Lazy Initialization

Which of the following implementations would be preferable?

Use synchronized accessors

private FieldType field;

private synchronized FieldType getField() {
    if (field == null)
        field = computeFieldValue();
    return field;
}

Use single-check idiom

private volatile FieldType field;

private FieldType getField() {
    FieldType result = field; // read the volatile field only once
    if (result == null) // Single check, no locking
        field = result = computeFieldValue();
    return result;
}

Use double-check idiom

private volatile FieldType field;

private FieldType getField() {
    FieldType result = field; // read the volatile field only once on the fast path
    if (result != null) // First field check, no locking
        return result;

    synchronized (this) {
        if (field == null) // Second field check, with locking
            field = computeFieldValue();
        return field;
    }
}

In general, you should initialize most fields normally (for example, private final FieldType field = computeFieldValue();). If you have to initialize a field lazily to achieve your performance goals or to break a harmful initialization circularity, use an appropriate lazy initialization technique. The synchronized accessor (the first implementation above) is the simplest correct option. For performance on instance fields, use the double-check idiom (the third implementation above), or the single-check idiom (the second implementation above) if you can tolerate repeated initialization. For static fields, however, use the lazy initialization holder class idiom, as follows:

private static class FieldHolder {
    static final FieldType field = computeFieldValue();
}

private static FieldType getField() {
    return FieldHolder.field;
}
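The holder class idiom relies on the JVM initializing a class only when it is first used. The sketch below makes that visible by counting initializations; LazyConfig, Holder, compute, and the counter are hypothetical names added for illustration, with a String standing in for FieldType.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; String stands in for FieldType.
class LazyConfig {
    // Counts how many times the expensive computation has run.
    static final AtomicInteger initializations = new AtomicInteger();

    private static class Holder {
        // Runs once, when Holder is first initialized by getValue().
        static final String VALUE = compute();
    }

    private static String compute() {
        initializations.incrementAndGet();
        return "expensive value";
    }

    static String getValue() {
        return Holder.VALUE; // first access triggers Holder's class initialization
    }

    public static void main(String[] args) {
        System.out.println(initializations.get()); // 0: nothing computed yet
        getValue();
        getValue();
        System.out.println(initializations.get()); // 1: computed exactly once
    }
}
```

No synchronized or volatile appears in the accessor because class initialization is already guaranteed to be thread-safe, so later calls cost no more than a plain field read.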
