Data structures in the multicore age

Posted on June 21, 2022

Shavit, [2011] “Data structures in the multicore age” is a relatively well-known popular science article from a decade ago by one of the concurrency coryphaei on the future of multithreaded (library) programming.

It starts with a motivation of Amdahl’s Law about the bottleneck of parallel computing. Then it mentions that correctness in the multithreaded setting breaks down into two aspects: safety (relative to the consistency model, e.g. linearizability) and liveness (progress conditions - lock-free/wait-free and more exotic). The tools for dealing with complexity are also changing - the stalls model is added, we need to keep a close eye on contention. A prediction is made that the development of algorithms and data structures will go down the way of relaxing guarantees (it seems that after 10+ years these are still pretty niche things).

Most of the article is spent on an example of the process of “weakening” a multi-threaded stack in Java, going through several phases:

  1. Stack under spinlock - linearizable, non-parallel (deadlock-free), bus congestion due to uncontrolled spins (usually solved by backoff circuits).
  2. Lock-free stack - specializes the lock to a CAS at the top of the stack.
  3. Elimination backoff stack - removes the sequential bottleneck, obtaining the so-called “dual” structure - i.e. allowing tracks to leave “antidata” (requests) and transfer elements directly without writing to the stack, still linearizable.
  4. Elimination tree - duality does not work well in the case of packets of identical operations, so we can weaken the consistency to the quiescent model (illusion of linearity only during periods of inactivity) by splitting the stack into several parallel ones and covering them with balancers.
  5. Stack pool - ultimately the problem is the sequential logic of the stack itself, so further performance gains can be achieved by removing the balancing machinery and turning the stack into a carefully architected, weakly-ordered structure, which the author believes should be sufficient for most applications.

Other structures should be similarly relaxed, e.g. by switching to hashing instead of exact queries. This seems to have partly come true with the popularization of probabilistic structures like the Bloom filter.