Replication II: Multi-Leader, Leaderless, Quorums, and Conflicts
Once more than one node can accept writes, availability rises and conflict resolution becomes part of the application contract.
Multi-writer replication moves the hard problem from routing writes to reconciling histories. If two writes can be concurrent, the system must preserve, order, merge, or reject them.
Multi-leader: local writes, global conflicts
Multi-leader replication allows writes in more than one location. It is attractive for multi-region systems: users write to a nearby region, then leaders replicate to each other. It can also support offline or disconnected operation.
The cost is concurrent writes to the same logical record. Two regions can update the same profile before hearing from each other. Last-write-wins is simple but can lose data because clock order is not causality. Better systems detect concurrency with version vectors or logical clocks, then resolve conflicts explicitly.
Conflict resolution belongs in the data model. A shopping cart can merge item sets. A bank transfer cannot merge by averaging balances. A collaborative document may use CRDTs or operational transforms. "Multi-leader" is not a free availability button; it is a promise that conflicts are acceptable and resolvable.
Leaderless: quorum reads and writes
Leaderless systems let clients write to multiple replicas and read from multiple replicas. With N replicas, a write quorum W, and read quorum R, the familiar condition W + R > N means every successful read intersects with every successful write in at least one replica, assuming clean failures and no sloppy shortcuts.
This does not magically give linearizability. Concurrent writes, failed writes, stale hinted handoff replicas, and clock-based resolution can still surprise you. But quorums give a tunable availability/latency/durability surface. Set W=1 for fast writes with weaker durability. Set R=1 for fast reads with staleness risk. Increase both for stronger detection of stale data.
Repair and reconciliation
Leaderless systems rely on repair. Read repair fixes stale replicas when reads notice disagreement. Anti-entropy background jobs compare replicas and sync missing data. Hinted handoff stores writes temporarily for unavailable nodes and replays later.
These mechanisms mean correctness is not just on the write path. It is also in background convergence. If repair falls behind, the system may remain available but increasingly inconsistent. For ML systems, this shows up as feature drift across regions or duplicated/late events in logs.
Trade-offs
| Choice | Buys | Costs |
|---|---|---|
| Multi-leader | Local low-latency writes and region independence | Conflict detection/resolution required |
| Leaderless quorum | No single write leader, tunable availability | More complex reads, repair, and anomaly surface |
| Last-write-wins | Simple automatic resolution | Can silently discard concurrent updates |
| Application merge | Preserves intent when domain supports it | Requires domain-specific conflict semantics |
You should be able to name the contract this mechanism offers, the workload or invariant that justifies it, and the bill it sends somewhere else: read latency, write latency, storage, availability, freshness, or operational complexity.
If conflicts are possible but not represented, the system resolves them accidentally. Accident usually means dropping a write, trusting a bad clock, or letting two replicas tell different stories forever.
Design prompts
- For a shopping cart, what conflict merge rule is safe? For account balance, why is it unsafe?
- What do
N,R, andWbuy in a leaderless store? - How can feature values differ across regions even when every region is healthy?