Negative Caching | Microservices Course

Problem

Every caching pattern so far stores successful results, which leaves a hole: a lookup that finds nothing produces no value to cache, so it falls through to the origin on every repeat. A key absent from the store, a query with an empty result, a 404: none of these leaves anything behind, and the next identical request pays for the full lookup again to learn the same nothing. When the same missing key is requested over and over, whether by a buggy client retrying a typo'd URL, a crawler, or a popular link to something that was deleted, the cache never helps and the origin takes every hit.

This is also a clean path for deliberate load. Requesting keys you know aren't cached, because they don't exist, walks straight past the cache to the store every time, so an attacker generating random nonexistent keys can bypass the cache entirely and pound the origin (cache penetration). The empty result is exactly the case the cache ignores, which is what makes it a weak point.

Solution

Cache the absence. Store a marker meaning "this key does not exist" so repeated misses for the same key are answered from the cache instead of the store. The complication that the positive case doesn't have is that a negative entry is a claim about something that isn't there, and the claim turns wrong the instant the data is created. A positive value stays valid until it changes; a negative value is invalidated by a creation you may not have seen. So you bound it: give negatives a deliberately short TTL, shorter than positives, so a newly created record becomes visible quickly. The trade is a brief window where you keep answering "not found" after the thing exists, bought in exchange for absorbing the repeated-miss storm.

Two refinements make this safe in practice.

First, clear the negative entry when the missing thing is created. If the write path that inserts the record also deletes its negative marker, the staleness window collapses to zero wherever you control the write, and the short TTL is only a fallback for creations you don't see.

Second, cache authoritative absence, not transient failure. A 404 or an empty result is the store saying "definitively not here," which is a fact worth caching. A timeout or a 5xx is the store saying "I couldn't tell you," and caching that pins a temporary outage as though it were a fact, so you keep returning failure after the origin has recovered. DNS draws exactly this line: a name-does-not-exist answer is cacheable, while a server failure is handled separately and far more cautiously.
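The distinction can be sketched as follows. This is a minimal illustration, not a definitive implementation: `OriginUnavailable`, `lookup`, and the `read_origin` callable are hypothetical names, the cache is a plain dict, and TTLs are left out to keep the focus on what gets cached.

```python
class OriginUnavailable(Exception):
    """Hypothetical error: the origin timed out or answered with a 5xx."""


MISSING = object()  # sentinel for "the origin said this key does not exist"


def lookup(key, cache, read_origin):
    """Return the value for key, or None if it is authoritatively absent.

    read_origin(key) is assumed to return a value, return None for a
    definitive "not found", or raise OriginUnavailable for a transient failure.
    """
    if key in cache:
        hit = cache[key]
        return None if hit is MISSING else hit

    # If this raises OriginUnavailable, the exception propagates and nothing
    # is written to the cache, so the next request asks the origin again.
    row = read_origin(key)

    # Only a definitive answer (a value or a confirmed absence) is cached.
    cache[key] = MISSING if row is None else row
    return row
```

Because the cache write comes after the origin call, a failure leaves the cache untouched and the caller sees the error rather than a pinned "not found".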

For the adversarial case there's a structural defense beyond per-key negatives: a Bloom filter (or similar membership sketch) holding every key that exists. A definite "not in the set" answer rejects a request before it touches the cache or the store. Per-key negatives only help with repeats of the same missing key, so an attacker who varies the key on every request defeats them; the filter is unaffected, because it can answer for a key it has never been asked about.

Tradeoffs

Property	Effect
Repeated-miss load	A hot missing key is answered from cache instead of re-querying the origin each time; the main reason to do this
Bounded staleness	A negative entry must expire quickly or be cleared on create, or a newly created record stays invisible for the TTL
Authoritative vs transient	Safe to cache definitive absence; caching a transient failure pins an outage and is the classic footgun
Adversarial misses	Per-key negatives don't stop an attacker varying the key; a membership filter does, at the cost of maintaining the filter
Memory	Negative entries occupy cache space too, so a flood of distinct missing keys can evict useful positive entries unless capped or filtered
Invalidation coupling	The write that creates a record must remember to clear its negative entry, one more place to get invalidation wrong
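One way to address the memory row is to keep negatives in their own bounded structure, so a flood of distinct missing keys can only evict other negatives. The sketch below assumes a hypothetical `BoundedNegativeCache` class with oldest-first eviction and takes the current time as an argument; it is an illustration, not a recommendation of specific sizes.

```python
from collections import OrderedDict


class BoundedNegativeCache:
    """Holds only "known missing" markers, capped at max_entries."""

    def __init__(self, max_entries=10_000):
        self.max_entries = max_entries
        self._expires = OrderedDict()  # key -> expires_at, oldest insert first

    def add(self, key, ttl, now):
        self._expires[key] = now + ttl
        self._expires.move_to_end(key)         # refreshed keys count as newest
        while len(self._expires) > self.max_entries:
            self._expires.popitem(last=False)  # evict the oldest marker

    def is_known_missing(self, key, now):
        expires_at = self._expires.get(key)
        if expires_at is None:
            return False
        if expires_at <= now:
            del self._expires[key]             # expired: forget the marker
            return False
        return True

    def discard(self, key):
        """Call from the write path when the key is created."""
        self._expires.pop(key, None)
```

Positive entries live in a separate cache, so the cap on negatives bounds how much damage a miss flood can do to the hit rate.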

Implementations

Minimal pseudocode

MISSING = object()                       # sentinel, distinct from "not cached"

def get(key):
    val = cache.get(key)
    if val is MISSING:
        return None                      # negative hit: known-absent, no origin call
    if val is not None:
        return val                       # positive hit
    row = origin.read(key)               # miss: ask the store
    if row is None:
        cache.set(key, MISSING, ttl=30)  # short negative TTL
    else:
        cache.set(key, row, ttl=300)     # longer positive TTL
    return row

def on_create(key, row):
    cache.set(key, row, ttl=300)         # overwrite any negative marker with the new value
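To see the staleness window concretely, here is a runnable variant of the same logic. Everything beyond the pseudocode is an assumption made for the demo: `TTLCache` is a toy in-memory cache, the store is a plain dict, and the current time is passed in as `now` so no real waiting is needed.

```python
MISSING = object()  # sentinel: "known absent", distinct from "not cached"


class TTLCache:
    """Toy in-memory cache. Time is passed in as `now` for a deterministic demo."""

    def __init__(self):
        self._data = {}  # key -> (value, expires_at)

    def get(self, key, now):
        entry = self._data.get(key)
        if entry is None or entry[1] <= now:
            self._data.pop(key, None)    # drop expired entries lazily
            return None
        return entry[0]

    def set(self, key, value, ttl, now):
        self._data[key] = (value, now + ttl)

    def delete(self, key):
        self._data.pop(key, None)


def get(key, cache, store, now, neg_ttl=30, pos_ttl=300):
    val = cache.get(key, now)
    if val is MISSING:
        return None                      # negative hit
    if val is not None:
        return val                       # positive hit
    row = store.get(key)                 # miss: ask the store (a dict here)
    if row is None:
        cache.set(key, MISSING, neg_ttl, now)
    else:
        cache.set(key, row, pos_ttl, now)
    return row


cache, store = TTLCache(), {}

first = get("user:7", cache, store, now=0)    # miss; negative entry valid until t=30
store["user:7"] = {"name": "Ada"}             # created without telling the cache
stale = get("user:7", cache, store, now=10)   # inside the window: still "not found"
fresh = get("user:7", cache, store, now=31)   # marker expired: the record is visible

get("user:8", cache, store, now=40)           # negative entry for another key
store["user:8"] = {"name": "Bo"}
cache.delete("user:8")                        # write path clears the marker
immediate = get("user:8", cache, store, now=41)
```

Here `stale` is `None` even though the record exists, which is the window the short TTL bounds, while `immediate` returns the new record straight away because the write path removed the marker.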

DNS negative caching (NXDOMAIN / NODATA)

A resolver caches the two authoritative "no" answers, NXDOMAIN (the name does not exist) and NODATA (the name exists but has no record of the requested type), so a flood of queries for a missing name stops hammering the authoritative servers. The bound is carried by the zone's SOA record, which the authoritative server places in the authority section of the negative response specifically so the answer can be cached, and the negative TTL is the lesser of the SOA's MINIMUM field and the SOA record's own TTL. RFC 2308 repurposed that MINIMUM field to mean exactly "how long to cache negatives," and the footgun follows directly: set it too high and an NXDOMAIN lingers in resolvers after you create the record, delaying propagation. The same RFC draws the authoritative-versus-failure line that RFC 9520 later tightened. Spec: https://www.rfc-editor.org/rfc/rfc2308.html.
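The TTL rule is simple enough to write down. In this sketch the function name and the optional `resolver_cap` parameter are illustrative assumptions; the cap stands in for the ceiling that many resolvers apply on top of whatever the zone advertises.

```python
def negative_ttl(soa_record_ttl, soa_minimum, resolver_cap=None):
    """Seconds a resolver may cache an NXDOMAIN or NODATA answer.

    RFC 2308 rule: the lesser of the SOA record's own TTL and its MINIMUM
    field. resolver_cap models an optional local ceiling (an assumption here).
    """
    ttl = min(soa_record_ttl, soa_minimum)
    if resolver_cap is not None:
        ttl = min(ttl, resolver_cap)
    return ttl


print(negative_ttl(soa_record_ttl=3600, soa_minimum=300))     # 300
print(negative_ttl(soa_record_ttl=60, soa_minimum=86400))     # 60
print(negative_ttl(86400, 86400, resolver_cap=10800))         # 10800
```

The second call shows why both fields matter: a large MINIMUM does nothing if the SOA record itself is served with a short TTL.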

CDN 404 caching

CDNs cache error responses on a short per-status TTL so a storm of requests for a missing or moved URL is absorbed at the edge instead of reaching the origin. Google Cloud CDN exposes this directly as negative caching, letting you set a TTL per status code, with defaults around 120 seconds for 404 and 410 and 60 seconds for 405 and 501, capped at 30 minutes, and only for the authoritative codes (the 404/410/451 family and redirects) rather than arbitrary errors. Cloudflare does the same by default, caching 404 and 410 for a few minutes and 5xx errors for about a minute, overridable with a Cache-Control header. The deliberately short defaults are the "bound it" principle built in. Docs: https://docs.cloud.google.com/cdn/docs/using-negative-caching.
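A per-status policy of this kind can be modeled as a small lookup table. The sketch below is not any CDN's configuration API: the names are hypothetical, and the numbers are simply the Google Cloud CDN figures quoted above.

```python
# Default negative TTLs in seconds, per status code (figures from the text above).
NEGATIVE_TTL_SECONDS = {404: 120, 410: 120, 405: 60, 501: 60}
MAX_NEGATIVE_TTL_SECONDS = 30 * 60  # the 30-minute ceiling


def negative_ttl_for(status, overrides=None):
    """Return the cache TTL for an error status, or None if it is not cached."""
    table = {**NEGATIVE_TTL_SECONDS, **(overrides or {})}
    ttl = table.get(status)
    if ttl is None:
        return None                        # e.g. 500 or 503: not in the table
    return min(ttl, MAX_NEGATIVE_TTL_SECONDS)


print(negative_ttl_for(404))                # 120
print(negative_ttl_for(503))                # None
print(negative_ttl_for(404, {404: 7200}))   # 1800, clamped to the ceiling
```

Keeping transient statuses out of the table is the same authoritative-versus-failure rule, expressed as configuration.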

Bloom filter membership guard

This is the defense against the adversarial penetration case, and it sits in front of the cache rather than inside it. Maintain a Bloom filter of every key that exists; on a request, test the filter first. A Bloom filter has no false negatives, so as long as every created key is added to the filter before it becomes readable, a "not present" verdict is definite and lets you reject the request immediately without touching the cache or the store. Its false positives (an occasional "maybe present" for an absent key) only cost a normal cache-then-store lookup. This holds up against random nonexistent keys in a way per-key negative entries cannot, since the filter answers for the whole keyspace at once. Reference: https://en.wikipedia.org/wiki/Bloom_filter.
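A minimal sketch of such a guard follows. The class, its sizing defaults, and `guarded_get` are illustrative assumptions: bit positions are derived from a single SHA-256 digest by double hashing, and a production system would more likely use an existing Bloom filter library with parameters chosen for its key count and target false-positive rate.

```python
import hashlib


class BloomFilter:
    """Fixed-size Bloom filter over string keys (illustrative sizing)."""

    def __init__(self, size_bits=1 << 16, num_hashes=4):
        self.size = size_bits
        self.k = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, key):
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1   # odd step for double hashing
        return [(h1 + i * h2) % self.size for i in range(self.k)]

    def add(self, key):
        for p in self._positions(key):
            self.bits[p // 8] |= 1 << (p % 8)

    def might_contain(self, key):
        # False means definitely absent; True means "maybe present".
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(key))


def guarded_get(key, existing_keys, lookup):
    """Reject definitely-absent keys before the cache or the store is touched."""
    if not existing_keys.might_contain(key):
        return None
    return lookup(key)   # normal cache-then-store path


existing = BloomFilter()
existing.add("user:1")
existing.add("user:2")

print(guarded_get("user:1", existing, lambda k: {"id": k}))   # {'id': 'user:1'}
```

The write path has to call `add` for every new key, which is the maintenance cost noted in the tradeoffs table; a standard Bloom filter also cannot remove keys, so deleted records simply fall back to the normal lookup path.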