all_lessons / data_intensive_systems / 09 · partitioning lesson 9 / 16 · ~11 min

Partitioning: Hash, Range, Hot Keys, and Rebalancing

Partitioning splits data and load across machines. The shard key is a performance contract: it decides locality, balance, and which queries become scatter-gather.

First principle

Partitioning is easy if every key is equally popular and every query touches one key. Real systems are hard because data and attention are skewed.

key --> partition function --> shard hash(user_id): balanced point lookups, poor range locality range(timestamp): efficient ranges, hot newest range directory: explicit mapping, flexible, extra metadata service

Why partition?

A single machine has finite disk, memory, CPU, and network. Partitioning splits a dataset into subsets so different machines can store and serve different parts. Replication copies data for availability; partitioning divides data for capacity and load.

The goal is even distribution of both bytes and requests. Those are different. A shard may hold average bytes but receive celebrity-level traffic. Another may be large but cold. A good partitioning strategy considers storage volume, query volume, and operation shape.

Hash vs range partitioning

Hash partitioning applies a hash function to the key and assigns hash ranges to partitions. It spreads keys evenly when the hash is good. It is excellent for point lookups by key and poor for range queries because adjacent keys are intentionally scattered.

Range partitioning assigns contiguous key ranges to partitions. It preserves order, so range scans are efficient. But sorted order can create hot spots. If keys are timestamps, the newest partition receives most writes. If keys are sequential ids, the tail shard burns.

Compound keys can combine ideas: partition by tenant hash, sort by timestamp within tenant. That makes "latest events for tenant X" local while spreading tenants across shards.

Secondary indexes and rebalancing

Partitioning primary data is only the first layer. Secondary indexes complicate everything. A local secondary index lives on the same shard as the primary data, making writes local but queries across secondary fields scatter across shards. A global secondary index partitions by the indexed term, making reads direct but writes update a second distributed structure.

Rebalancing moves partitions when nodes are added or removed. Moving one key at a time is expensive; moving fixed partitions or virtual nodes is more manageable. But rebalancing itself consumes network and disk bandwidth, so automatic rebalancing must be rate-limited and observable.

Trade-offs

ChoiceBuysCosts
Hash partitioningEven key distribution and simple point lookupsRange queries scatter; hot keys remain hot
Range partitioningEfficient range scans and sorted accessSequential writes and skew can create hot ranges
Local secondary indexWrite path stays localReads by secondary field scatter-gather
Global secondary indexSecondary reads are targetedWrites fan out and index consistency lags
What you can now decide

You should be able to name the contract this mechanism offers, the workload or invariant that justifies it, and the bill it sends somewhere else: read latency, write latency, storage, availability, freshness, or operational complexity.

What breaks if you skip this?

A bad shard key turns scale-out into a single overloaded machine with extra steps. The system has many nodes, but the hot key, hot tenant, or hot range decides throughput.

Design prompts

  1. Shard a multi-tenant feature store. What is the shard key and what query becomes hard?
  2. Why does hashing not solve celebrity hot keys?
  3. When should a secondary index be global instead of local?