TTL & Invalidation | Microservices Course

Problem

Every caching pattern in this series assumes something retires stale entries, and waves at "invalidate the key" or "until the TTL" without saying how that actually works. A cached entry is a copy of data that lives somewhere else, and it goes wrong the instant the source changes. So you need a mechanism to retire copies that have gone stale, and the two available mechanisms sit at opposite ends of a difficulty curve.

Time-based expiry is trivial: stamp each entry with a lifetime and drop it when the lifetime passes. The catch is that the entry can be wrong for up to that lifetime, and the cache has no way to tell that it went stale early. Explicit invalidation is the opposite: retire the entry the moment the source changes. That gives precise freshness, but it requires knowing every cache entry derived from a given piece of data and reaching all of them, possibly across many nodes, regions, and separate caches, with no missed message and no race that reinserts the old value afterward. A missed or misordered invalidation leaves stale data sitting in the cache, and to a user a stale read can be indistinguishable from lost data.

Solution

Pick the weakest mechanism that meets your freshness requirement, and reach for the precise one only where you must.

TTL attaches an expiry to each entry; the cache drops or refreshes it once the time passes. It needs no coordination with the source and no knowledge of what changed, which is why it's the default. You're trading away freshness for that simplicity: the entry can serve a stale value for up to its TTL, and it gets refetched on expiry even when nothing changed. Tune the TTL to the staleness the use case tolerates.
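As a rough illustration of the TTL mechanism, here is a minimal in-process sketch. The class name `TTLCache` and the injectable `clock` parameter are assumptions made for this example, not part of any particular cache library; real caches such as Redis or memcached handle expiry internally.

```python
import time


class TTLCache:
    """Tiny in-process cache where every entry carries its own expiry time.

    Hypothetical example class; `clock` is injectable so expiry can be
    demonstrated without sleeping.
    """

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # key -> (value, expires_at)

    def set(self, key, value, ttl):
        # Stamp the entry with the moment it stops being servable.
        self._entries[key] = (value, self._clock() + ttl)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Expired: drop it lazily so the caller refetches from the source.
            del self._entries[key]
            return None
        # Not expired: served as-is, even if the source changed in the meantime.
        return value
```

Note what the sketch cannot do: between `set` and expiry, `get` has no way of knowing whether the source changed, which is exactly the staleness window the TTL bounds.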

Explicit invalidation retires the affected entries when the source changes. Two techniques make this tractable instead of a manual hunt for every copy:

Versioned or content-hashed keys. Rather than mutating the value at a fixed key, put a version or a hash of the content into the key itself. A change writes a new key, so readers of the new version and readers still holding the old one never collide, and you never have to find and delete the old entry; it just ages out under its TTL or eviction. Making keys immutable sidesteps invalidation entirely. Long-lived static assets work this way: the file is served with a far-future expiry and the URL changes whenever a build changes the content.
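A minimal sketch of versioned keys, assuming the current version number is stored alongside the source record (modelled here by the hypothetical `current_version` dict). `VersionedCache`, `read_user`, and the `load` callback are illustrative names, not a real API.

```python
class VersionedCache:
    """Cache whose keys embed a version, so stored entries are never mutated."""

    def __init__(self):
        self._store = {}

    @staticmethod
    def key(user_id, version):
        return f"user:{user_id}:v{version}"

    def put(self, user_id, version, value):
        # setdefault: an existing versioned entry is never overwritten.
        self._store.setdefault(self.key(user_id, version), value)

    def get(self, user_id, version):
        return self._store.get(self.key(user_id, version))


# Assumption: stands in for a version column kept with the source record.
current_version = {42: 1}
cache = VersionedCache()


def read_user(user_id, load):
    """Look up the current version first, then read the matching key."""
    version = current_version[user_id]
    value = cache.get(user_id, version)
    if value is None:
        value = load(user_id, version)  # hypothetical loader for the source
        cache.put(user_id, version, value)
    return value
```

A write bumps `current_version[42]` and nothing else; the next `read_user(42, load)` builds a key that has never been cached and therefore misses. The cost is visible too: every read has to learn the current version from somewhere before it can form the key.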

Tag (surrogate-key) purge. Tag each cached object with one or more keys naming the data it depends on. The relationship is many-to-many: an object can carry several tags, and a tag can mark many objects. When a piece of data changes, issue one purge for its tag and every object carrying that tag is dropped, without enumerating the individual objects or their URLs. This is the production answer to a page that depends on several underlying records.
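The many-to-many bookkeeping behind a tag purge can be sketched with two indexes, one in each direction. This is a single-process illustration under assumed names (`TaggedCache`, `purge_tag`); a CDN maintains the equivalent index across its edge nodes.

```python
from collections import defaultdict


class TaggedCache:
    """In-memory cache with a two-way index between keys and tags."""

    def __init__(self):
        self._values = {}                     # key -> value
        self._tags_by_key = {}                # key -> set of tags
        self._keys_by_tag = defaultdict(set)  # tag -> set of keys

    def set(self, key, value, tags=()):
        tags = set(tags)
        self._unindex(key)  # forget tags from any previous write of this key
        self._values[key] = value
        self._tags_by_key[key] = tags
        for tag in tags:
            self._keys_by_tag[tag].add(key)

    def get(self, key):
        return self._values.get(key)

    def purge_tag(self, tag):
        """Drop every entry carrying `tag`; return how many were dropped."""
        keys = self._keys_by_tag.pop(tag, set())
        for key in keys:
            self._values.pop(key, None)
            # Remove the dropped key from the indexes of its other tags.
            for other in self._tags_by_key.pop(key, set()):
                if other != tag:
                    self._keys_by_tag[other].discard(key)
        return len(keys)

    def _unindex(self, key):
        for tag in self._tags_by_key.pop(key, set()):
            self._keys_by_tag[tag].discard(key)
```

The caller of `purge_tag("user:1")` never names a page. The discipline lies entirely on the write side: an entry that was stored without the right tag will not be found by the purge.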

A purge comes in two strengths. A hard purge makes the object immediately inaccessible, so the next read goes to the origin as a miss. A soft purge marks it stale, letting the cache keep serving the old copy while it refetches, which spreads origin load instead of spiking it.
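The difference between the two strengths can be sketched as follows. `PurgeableCache`, `read`, and the `refresh_queue` list are hypothetical names for this example, and the refetch is only queued rather than run in the background as a real cache would do.

```python
class PurgeableCache:
    """Cache that supports both hard and soft purges of a key."""

    def __init__(self):
        self._entries = {}  # key -> [value, is_stale]

    def set(self, key, value):
        self._entries[key] = [value, False]

    def hard_purge(self, key):
        # The entry is gone: the next read is a miss.
        self._entries.pop(key, None)

    def soft_purge(self, key):
        # The entry stays but is flagged, so it can be served while refetching.
        if key in self._entries:
            self._entries[key][1] = True

    def get(self, key):
        """Return (value, state) where state is 'hit', 'stale', or 'miss'."""
        entry = self._entries.get(key)
        if entry is None:
            return None, "miss"
        value, is_stale = entry
        return value, ("stale" if is_stale else "hit")


def read(cache, key, fetch_from_origin, refresh_queue):
    """Serve a read, going to the origin only on a true miss."""
    value, state = cache.get(key)
    if state == "miss":
        value = fetch_from_origin(key)  # reader waits on the origin
        cache.set(key, value)
    elif state == "stale" and key not in refresh_queue:
        refresh_queue.append(key)       # serve the old copy now, refetch later
    return value
```

After a hard purge every concurrent reader takes the `miss` branch and hits the origin; after a soft purge they all take the `stale` branch and a single refresh is queued.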

The mechanism is the easy part. The hard part is correctness: making sure each invalidation actually reaches every copy and that no concurrent read reinserts a stale value behind it. At large scale this is hard enough that you measure whether invalidation worked rather than assuming it did.

Tradeoffs

Property	Effect
TTL	No coordination and trivial to implement; an entry can be stale for up to its TTL and is refetched on expiry even when unchanged
Explicit invalidation	Near-immediate freshness; requires the data-to-entry mapping and must reach every copy across every node and region
Versioned keys	No invalidation needed and no races, since old keys age out on their own; readers must learn the new key, and old versions occupy the cache until eviction
Tag purge	Invalidates many derived entries in one request without listing them; requires disciplined tagging at write time
Hard vs soft purge	Hard purge is immediately correct but sends every reader to the origin at once; soft purge serves stale during the refetch and smooths origin load
Correctness	A missed or racing invalidation leaves silently stale data, so detecting it needs its own monitoring system
Fan-out cost	More cache copies means more places an invalidation must reach and more chances to miss one

Implementations

Minimal pseudocode

# TTL: bound staleness with no coordination
cache.set(key, val, ttl=300)                 # entry self-expires after 300s

# versioned key: change the key instead of invalidating it
key = f"user:{user_id}:v{version}"           # a write bumps version -> new key
cache.set(key, val, ttl=300)                 # the old key ages out on its own

# tag purge: drop everything derived from a datum in one call
cache.set(page_key, html, tags={f"user:{user_id}", "homepage"})

def on_change(user_id):
    cache.purge_tag(f"user:{user_id}")       # drops every entry carrying the tag

Surrogate-key / cache-tag purge (Fastly, Cloudflare)

Fastly lets the origin tag any response with a Surrogate-Key header listing one or more keys, then purge by key: a single request drops every object carrying that key. Keys and objects are many-to-many, so a news site can tag a page with author/joe and category/tech and retire every page touching either without knowing their URLs. A batch request can purge up to 256 keys at once, and a soft purge marks objects stale rather than deleting them, so the edge serves the old copy while refetching. Cloudflare's equivalent is the Cache-Tag header with purge by tag or prefix. Fastly docs: https://www.fastly.com/documentation/guides/full-site-delivery/purging/purging-with-surrogate-keys/.
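On the origin side, the two pieces an application typically needs are the tagging header and a way to split a large purge into batches. The sketch below makes no HTTP calls and does not model Fastly's purge API; the helper names are hypothetical, the space-separated header format follows Fastly's documented Surrogate-Key convention, and the batch size of 256 is the limit quoted above.

```python
def surrogate_key_header(keys):
    """Build the response header that tags an object with its surrogate keys."""
    keys = list(keys)
    for key in keys:
        # Keys are space-separated in the header, so a key cannot contain whitespace.
        if not key or any(ch.isspace() for ch in key):
            raise ValueError(f"invalid surrogate key: {key!r}")
    return {"Surrogate-Key": " ".join(keys)}


def purge_batches(keys, batch_size=256):
    """Split keys into de-duplicated batches of at most `batch_size`."""
    unique = list(dict.fromkeys(keys))  # drop duplicates, keep first-seen order
    return [unique[i:i + batch_size] for i in range(0, len(unique), batch_size)]


# Tag an article page with the records it depends on.
headers = surrogate_key_header(["author/joe", "category/tech"])
```

Each list returned by `purge_batches` would become the body of one batch purge request; how that request is authenticated and sent is left to the provider's documentation.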

Meta's invalidation pipeline and Polaris

Meta's read caches (TAO for the social graph, a large memcache tier for the rest) invalidate on write, using per-object version numbers to reject stale writes and leases to stop a miss from stampeding the store or writing back a stale value (the lease mechanism comes from the "Scaling Memcache at Facebook" work). Their hard problem is verifying that an invalidation actually left every cache consistent, since a single missed message can produce a stale read that looks like data loss. To measure that, Meta runs Polaris, a separate service fed by its own invalidation event stream that watches caches for consistency violations and reports how close to fully consistent the fleet actually is. Meta's writeup: https://engineering.fb.com/2022/06/08/core-infra/cache-made-consistent/.
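The lease idea can be illustrated with a much-simplified, single-process sketch. It is not Meta's implementation: `LeasedCache` and its methods are hypothetical names, and a production version would also need lease expiry and a wait-and-retry policy for readers that get no lease.

```python
import itertools


class LeasedCache:
    """Cache where only the holder of a valid lease may fill a missing key."""

    def __init__(self):
        self._values = {}
        self._leases = {}                 # key -> outstanding lease token
        self._tokens = itertools.count(1)

    def get(self, key):
        """Return (value, lease).

        On a hit the lease is None. On a miss the first caller receives a
        lease token; later callers receive (None, None) and should retry.
        """
        if key in self._values:
            return self._values[key], None
        if key in self._leases:
            return None, None             # someone else is already fetching
        token = next(self._tokens)
        self._leases[key] = token
        return None, token

    def set_with_lease(self, key, value, token):
        """Store the value only if the lease is still valid."""
        if token is None or self._leases.get(key) != token:
            return False                  # lease was voided by an invalidation
        del self._leases[key]
        self._values[key] = value
        return True

    def invalidate(self, key):
        self._values.pop(key, None)
        self._leases.pop(key, None)       # voids any in-flight fill
```

The race it closes: a reader misses, reads the old row from the store, and is delayed; meanwhile a writer updates the row and invalidates the key. When the reader finally calls `set_with_lease` with the old value, its token no longer matches and the stale fill is rejected. Handing out one lease per key also keeps a burst of misses from stampeding the store.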

Content-hashed asset URLs

This is the versioned-key idea in everyday use. A build tool emits files like app.4f3a9c.js, where the hash is derived from the contents, and they are served with a long cache lifetime and marked immutable. A deploy changes the contents, so the hash and therefore the URL change, and clients request the new URL while the old file ages out of the CDN untouched. No purge is ever issued, which makes it the cheapest invalidation strategy: none at all. The HTTP semantics: https://developer.mozilla.org/en-US/docs/Web/HTTP/Caching.
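What a build tool does when it fingerprints an asset can be approximated in a few lines. This is a sketch, not any bundler's actual algorithm: `hashed_name`, the choice of SHA-256, and the 8-character digest length are assumptions for illustration. The `Cache-Control` value uses the standard `max-age` and `immutable` directives.

```python
import hashlib
from pathlib import PurePosixPath


def hashed_name(filename, content, digest_len=8):
    """Insert a digest of the file contents before the file extension."""
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    path = PurePosixPath(filename)
    return str(path.with_name(f"{path.stem}.{digest}{path.suffix}"))


# Headers for fingerprinted assets: cache for a year and never revalidate,
# which is safe because the bytes behind a given URL never change.
IMMUTABLE_HEADERS = {"Cache-Control": "public, max-age=31536000, immutable"}

old_url = hashed_name("static/app.js", b"console.log('v1');")
new_url = hashed_name("static/app.js", b"console.log('v2');")
```

`old_url` and `new_url` differ only in the digest segment, and rebuilding with identical contents reproduces the same name. The HTML that references the asset is the one thing that still needs a short TTL or a purge, since it is what tells clients which URL is current.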