Sequential & Causal Consistency

Problem

Linearizability makes every operation pay for global real-time agreement: a quorum round-trip per operation, latency floored at the cross-region RTT, and rejected operations during a partition. Most applications never use that real-time guarantee. A user needs to see their own edits in the order they made them, and needs to never see a reply before the message it answers. Neither requires that a read on one continent instantly reflect a write that completed a millisecond ago on another. Spending linearizable cost everywhere buys a guarantee the application ignores, at the price of latency and availability it would rather keep.

What's actually needed is weaker: preserve the order a single client issued its operations, and preserve cause before effect, without forcing all replicas to agree on a single real-time timeline.

Solution

Two models, each relaxing linearizability further than the last: the first gives up real time, the second also gives up the single total order.

Sequential consistency keeps a single total order that every process agrees on and that respects each process's own program order, but abandons real time. All clients observe operations as if interleaved into one sequence; within that sequence, each client's own operations stay in the order it issued them. Two operations on different clients that don't causally interact can be globally ordered either way, even contradicting wall-clock order, as long as everyone sees the same chosen order. The catch is that maintaining one agreed total order still costs coordination. A known lower bound says reads and writes cannot both be local under it: the combined latency of a read and a write is at least on the order of the network delay. That cost is why it's deployed less than either neighbor in the hierarchy.
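As an illustration of the definition above, here is a minimal brute-force sketch that checks whether a small history is sequentially consistent. The history encoding, the function name sequentially_consistent, and the assumption that unwritten keys read as 0 are all invented for this example; the search is exponential and only meant for toy histories.

```python
from typing import Dict, List, Tuple

# An operation is ("w", key, value) for a write or ("r", key, value) for a
# read that returned `value`. Unwritten keys are assumed to read as 0.
Op = Tuple[str, str, int]


def sequentially_consistent(programs: List[List[Op]]) -> bool:
    """True if one total order keeps every program order and lets each
    read return the latest preceding write to its key."""

    def search(positions: Tuple[int, ...], memory: Dict[str, int]) -> bool:
        if all(pos == len(prog) for pos, prog in zip(positions, programs)):
            return True  # every operation found a place in the order
        for i, prog in enumerate(programs):
            pos = positions[i]
            if pos == len(prog):
                continue  # this process has nothing left to schedule
            kind, key, value = prog[pos]
            if kind == "r" and memory.get(key, 0) != value:
                continue  # the read would see the wrong value at this point
            next_memory = {**memory, key: value} if kind == "w" else memory
            next_positions = positions[:i] + (pos + 1,) + positions[i + 1:]
            if search(next_positions, next_memory):
                return True
        return False

    return search(tuple(0 for _ in programs), {})


# A stale read is allowed: the read can be ordered before the write,
# whatever the wall clock said.
stale_read = [[("w", "x", 1)], [("r", "x", 0)]]

# Two readers that disagree on the order of two writes are not allowed.
disagreeing_readers = [
    [("w", "x", 1)],
    [("w", "x", 2)],
    [("r", "x", 1), ("r", "x", 2)],
    [("r", "x", 2), ("r", "x", 1)],
]

print(sequentially_consistent(stale_read))           # True
print(sequentially_consistent(disagreeing_readers))  # False
```

The first history would violate linearizability if the read started after the write completed, yet it is sequentially consistent. The second fails because no single order can satisfy both readers.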

Causal consistency goes further and only orders operations related by cause and effect. Causality is program order within a process, plus reads-from across processes (a read of a value depends on the write that produced it), plus transitivity. Operations with no causal link are concurrent and may be observed in different orders at different replicas. The mechanism is dependency tracking: each write carries the versions it causally depends on, and a replica holds an incoming write until those dependencies are present locally, then applies it. Reads stay local and the system stays available under partition. Causal consistency is also, up to minor variants, the strongest model shown to be compatible with availability and convergence.
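One common way to represent the causality relation described above is a vector clock. The sketch below is a minimal version under that assumption; the helper names tick, merge, and relation are illustrative and not taken from any particular system. A local event increments the process's own entry (program order), reading a value merges the writer's clock (reads-from), and comparing clocks entry by entry gives transitivity for free.

```python
from typing import Dict

VectorClock = Dict[str, int]  # process id -> number of events seen from it


def tick(clock: VectorClock, process: str) -> VectorClock:
    """Record one new local event at `process`."""
    return {**clock, process: clock.get(process, 0) + 1}


def merge(a: VectorClock, b: VectorClock) -> VectorClock:
    """Entry-wise maximum: everything seen by either clock."""
    return {p: max(a.get(p, 0), b.get(p, 0)) for p in a.keys() | b.keys()}


def relation(a: VectorClock, b: VectorClock) -> str:
    """Classify a relative to b: 'before', 'after', 'equal', or 'concurrent'."""
    ids = a.keys() | b.keys()
    a_le_b = all(a.get(p, 0) <= b.get(p, 0) for p in ids)
    b_le_a = all(b.get(p, 0) <= a.get(p, 0) for p in ids)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"      # a causally precedes b
    if b_le_a:
        return "after"       # b causally precedes a
    return "concurrent"      # no causal link; replicas may order them either way


post = tick({}, "alice")               # alice writes a post
reply = tick(merge({}, post), "bob")   # bob reads the post, then replies
follow_up = tick(reply, "bob")         # bob writes again (program order)
unrelated = tick({}, "carol")          # carol writes without seeing either

print(relation(post, reply))       # before
print(relation(post, follow_up))   # before (by transitivity)
print(relation(reply, unrelated))  # concurrent
```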

Tradeoffs

| Property | Effect |
| --- | --- |
| Real-time freshness | Neither guarantees a read sees the globally newest write; you can read a stale view that is still internally consistent |
| Availability | Causal allows local reads and writes and survives partitions, unlike linearizable systems |
| Sequential's cost | A single agreed total order still requires coordination and carries an operation-latency lower bound, so it can't make reads and writes both local |
| Dependency metadata | Causal tracking (vector clocks or dependency lists) grows with the number of writers and replicas; this is the main implementation burden |
| Concurrent writes | Causally unrelated writes can apply in different orders at different replicas, so you need convergent conflict resolution (last-writer-wins or merge) to stop replicas diverging |
| Scope | A per-object or per-session guarantee with no cross-object atomicity |
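To make the concurrent-writes row concrete, here is a small sketch of last-writer-wins resolution. It assumes each write carries a (logical timestamp, writer id) pair, with the writer id breaking ties; the class name LwwStore and the sample data are invented for illustration. The loop applies the same three writes in every possible order and collects the final states.

```python
from itertools import permutations
from typing import Dict, Tuple

Version = Tuple[int, str]  # (logical timestamp, writer id); the id breaks ties


class LwwStore:
    """Keeps, per key, only the write with the highest version."""

    def __init__(self) -> None:
        self.data: Dict[str, Tuple[Version, str]] = {}

    def apply(self, key: str, version: Version, value: str) -> None:
        current = self.data.get(key)
        # Tuples compare lexicographically, so the rule is deterministic
        # and independent of arrival order.
        if current is None or version > current[0]:
            self.data[key] = (version, value)


writes = [
    ("title", (7, "dc-eu"), "Draft"),
    ("title", (7, "dc-us"), "Final"),
    ("title", (6, "dc-ap"), "Outline"),
]

outcomes = set()
for order in permutations(writes):
    replica = LwwStore()
    for key, version, value in order:
        replica.apply(key, version, value)
    outcomes.add(replica.data["title"])

print(outcomes)  # a single element: every arrival order converges
```

Every order ends in the same state, which is the convergence property. The cost is that the losing writes ("Draft", "Outline") are silently discarded; a merge function is the alternative when both sides must be kept.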

Implementations

Minimal pseudocode (causal)

# each write carries the versions it causally depends on
def put(key, value, ctx):  # ctx = versions this client has seen
    v = next_version()
    deps = ctx.snapshot()  # everything observed so far
    store[key] = (value, v)
    replicate(key, value, v, deps)
    ctx.observe(key, v)
    return v

# a replica makes a remote write visible only once its causes are present
def on_replicated(key, value, v, deps):
    # pass a callable so the condition is re-checked as dependencies arrive
    wait_until(lambda: all(present_locally(d) for d in deps))
    # concurrent writes to the same key still need convergent resolution here
    store[key] = (value, v)

The load-bearing idea: ship each write's dependencies with it, and delay visibility until those dependencies are satisfied locally.
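A runnable sketch of that idea, under simplifying assumptions: versions are (counter, origin) pairs, dependencies are explicit sets of versions, and a replica buffers writes in memory instead of blocking. The Write and Replica types are hypothetical and written for this example only.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Set, Tuple

Version = Tuple[int, str]  # (counter, origin replica)


@dataclass(frozen=True)
class Write:
    key: str
    value: str
    version: Version
    deps: FrozenSet[Version]  # versions this write causally depends on


class Replica:
    def __init__(self) -> None:
        self.store: Dict[str, Tuple[str, Version]] = {}
        self.applied: Set[Version] = set()
        self.pending: List[Write] = []

    def receive(self, write: Write) -> None:
        """Buffer an incoming write, then apply whatever has become ready."""
        self.pending.append(write)
        self._drain()

    def _drain(self) -> None:
        progress = True
        while progress:  # applying one write may unblock others
            progress = False
            for w in list(self.pending):
                if w.deps <= self.applied:  # all causes are present locally
                    self._apply(w)
                    self.pending.remove(w)
                    progress = True

    def _apply(self, w: Write) -> None:
        current = self.store.get(w.key)
        if current is None or w.version > current[1]:  # highest version wins
            self.store[w.key] = (w.value, w.version)
        self.applied.add(w.version)


comment = Write("comment:1", "Lunch?", (1, "dc-a"), frozenset())
reply = Write("reply:1", "Sure!", (2, "dc-a"), frozenset({(1, "dc-a")}))

remote = Replica()
remote.receive(reply)                      # arrives first, out of causal order
visible_early = "reply:1" in remote.store  # False: held until the comment lands
remote.receive(comment)                    # unblocks the buffered reply
visible_late = "reply:1" in remote.store   # True
```

Buffering keeps the replica responsive to reads while it waits, at the cost of holding pending writes in memory until their dependencies arrive.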

COPS

The reference design for geo-replicated causal consistency (Lloyd et al., SOSP 2011). COPS provides "causal+" consistency, meaning causal ordering plus convergent handling of concurrent writes so replicas don't diverge. The client library tracks the versions a client has read and written as its context; each put returns a version and is tagged with dependencies drawn from that context. Replication to a remote datacenter carries that dependency metadata, and the remote cluster applies a write only after its dependencies have arrived. A variant of the system, COPS-GT, adds get transactions, which return a causally consistent snapshot across multiple keys.

MongoDB causal sessions

Causal consistency is opt-in per session, available since 3.6. Within a causally consistent session a client gets read-your-writes, monotonic reads, monotonic writes, and writes-follow-reads, provided the session's operations use majority read and write concerns. The implementation uses a cluster time derived from a hybrid logical clock plus an operation-time token (operationTime). The client gossips those timestamps with each request, and the server waits until its data is at least as fresh as the token before answering, so the client never observes an effect before its cause.
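The token mechanism can be illustrated with a toy model. This is not MongoDB code or its driver API: Node, Session, and the integer op_time are hypothetical stand-ins for a replica, a client session, and the operation-time token. The point is only the shape of the protocol, in which the client carries the highest time it has seen and a lagging node waits until it has caught up before answering.

```python
import threading
from typing import Dict, Optional, Tuple


class Node:
    """Toy replica that remembers the logical time of its latest applied write."""

    def __init__(self) -> None:
        self.data: Dict[str, str] = {}
        self.applied_time = 0
        self._changed = threading.Condition()

    def apply(self, key: str, value: str, op_time: Optional[int] = None) -> int:
        """A primary assigns the next time; a secondary reuses the given one."""
        with self._changed:
            if op_time is None:
                op_time = self.applied_time + 1
            self.data[key] = value
            self.applied_time = max(self.applied_time, op_time)
            self._changed.notify_all()
            return op_time

    def read(self, key: str, after_time: int = 0,
             timeout: float = 2.0) -> Tuple[Optional[str], int]:
        """Block until this node has applied everything up to after_time."""
        with self._changed:
            caught_up = self._changed.wait_for(
                lambda: self.applied_time >= after_time, timeout)
            if not caught_up:
                raise TimeoutError("node did not catch up to the session token")
            return self.data.get(key), self.applied_time


class Session:
    """Carries the highest logical time this client has observed."""

    def __init__(self) -> None:
        self.token = 0

    def write(self, node: Node, key: str, value: str) -> None:
        self.token = max(self.token, node.apply(key, value))

    def read(self, node: Node, key: str) -> Optional[str]:
        value, seen = node.read(key, after_time=self.token)
        self.token = max(self.token, seen)  # keeps later reads monotonic
        return value


primary, secondary = Node(), Node()
session = Session()
session.write(primary, "bio", "hello")

stale = secondary.read("bio")[0]  # no token: answers at once with old data

# Replication reaches the secondary slightly later, on another thread.
threading.Timer(0.05, secondary.apply,
                args=("bio", "hello", session.token)).start()
fresh = session.read(secondary, "bio")  # waits for the token, then answers
```

Without the token the secondary answers immediately from stale data; with it, the read is delayed until the session's own write is visible, which is the read-your-writes guarantee.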

Social-graph systems

Large social graphs typically run on causal or session guarantees rather than linearizability, because the dominant requirements are seeing your own actions immediately and never seeing a child object before its parent (a reply without its comment, a like on a post you can't see). Stores in this space favor local reads with read-your-writes within a region and accept that a write made elsewhere may take time to appear, trading global real-time freshness for the latency and availability that a social feed actually needs.