Range vs Hash Partitioning

Problem

Splitting a dataset across nodes needs a rule that maps each key to a partition. The consistent-hashing piece covered how a key's position becomes a node; this is the prior question of what the key's position should be, and the choice decides two things that pull against each other. One is how evenly load spreads: you want writes and reads scattered across partitions rather than piling onto one. The other is whether keys that are queried together stay physically together, so that a scan over a range of keys reads a few neighboring partitions instead of the whole cluster. No single mapping rule gives you both at once, and the two common rules sit at opposite ends.

Solution

Range partitioning keeps keys in sorted order and cuts the keyspace into contiguous ranges, each owned by a partition. Adjacent keys land on the same or neighboring partitions, so a range scan ("all events for user X between two times", "all rows from A to M") touches only the few partitions that cover the range and reads them in order. The cost is that load follows the key distribution. When writes target one end of the keyspace, which is exactly what monotonically increasing keys like timestamps and auto-increment IDs do, every write lands on the partition that owns the current maximum, creating a hotspot that adding nodes won't relieve because the hot range is a single partition. Ranges are split as they grow and reassigned to even things out, so you don't have to pre-size them.

Hash partitioning runs the key through a hash and partitions on the result, placing the key by hash(key) through a modulus or a ring. The hash throws away ordering and scatters adjacent keys uniformly, which spreads load evenly and dissolves the monotonic-key hotspot, since consecutive timestamps now hash to unrelated partitions. The cost is the mirror image: a range scan has to query every partition, because keys that were adjacent are now everywhere, so range queries become a scatter-gather across the whole cluster or aren't supported well at all.

The usual compromise is a compound key: hash a partition key to spread load across partitions, then keep a sort key ordered within each partition. You get even distribution across partition keys and cheap ordered scans within a single partition key, which covers the common access pattern of "read this user's items in order" without a full-cluster scan. It only helps when scans stay within one partition key; a scan across partition keys still fans out.

Tradeoffs

DimensionRangeHash
Range / scan queriesEfficient: a few contiguous partitions, read in orderScatter-gather across all partitions, or unsupported
Load distributionFollows the data; sequential keys create hotspotsEven by construction
Point lookupsFineFine
Sequential-write workloadAll writes hit the leading-edge partitionSpread across partitions
RebalancingSplit large or hot ranges and move themGoverned by the hash ring; about 1/N of keys move
Key-design burdenChoose or auto-pick split points; guard against monotonic keysChoose a high-cardinality partition key

Compound keys take the right column for spread and the left column for in-partition scans, at the price of cross-partition scans staying expensive.

Implementations

Minimal pseudocode

import bisect
# range: partitions own contiguous [start, next_start) ranges, keys sorted
def route_range(splits, key): # splits: sorted [(start_key, partition)]
starts = [s for s, _ in splits]
i = bisect.bisect_right(starts, key) - 1
return splits[i][1]
def scan_range(splits, lo, hi): # only the partitions covering [lo, hi)
return [p for (start, p) in splits if start < hi] # ...trimmed to overlap
# hash: partition by hash of key; ordering is destroyed, load is even
def route_hash(ring, key):
return ring.route(hash(key)) # the consistent-hashing ring
# compound: hash the partition key to spread, sort key keeps order within it
def route_compound(ring, partition_key, sort_key):
partition = ring.route(hash(partition_key))
return partition, sort_key # ordered scans only within one partition_key

HBase / Bigtable (range)

Bigtable, and its open-source counterpart HBase, store rows sorted lexicographically by row key and partition them into contiguous ranges, called tablets in Bigtable and regions in HBase. A range splits once it grows past a threshold and is reassigned to balance the cluster, so a scan over neighboring row keys is sequential and cheap. The standard hazard is the one above: monotonically increasing row keys funnel every write into the last range, so practitioners salt or field-swap the key to scatter writes deliberately. HBase docs: https://hbase.apache.org/book.html#regions.arch.

DynamoDB (hash)

DynamoDB hashes the partition key to place an item, spreading items across partitions evenly and eliminating sequential-write hotspots, at the cost that scanning across partition keys in order requires a full table Scan. It recovers in-partition locality with an optional sort key: items sharing a partition key are stored ordered, and a Query can range-scan them efficiently. This is the compound-key pattern made concrete. Docs: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/HowItWorks.Partitions.html.

Spanner (range with splits)

Spanner partitions rows into contiguous key ranges called splits, ordered by primary key, so it keeps range-scan locality the way Bigtable does. It rebalances by splitting and relocating ranges based on both size and load, and the load-based splitting catches hotspots that size alone would miss. Even so, Google's own schema guidance still warns against monotonically increasing primary keys at high write rates, because a hot range can be split but the leading edge keeps moving to whichever split owns the maximum. Docs: https://cloud.google.com/spanner/docs/schema-design.