Why Cache, and What to Cache
Most slow Django views are slow for the same reason: they make the database do work that would not change between requests. A category page lists the same products for every anonymous visitor for the next five minutes. A homepage runs the same six aggregate queries every refresh. A REST endpoint serialises the same paginated list of articles thousands of times an hour. None of that work needs to happen more than once per refresh window.
Caching trades freshness for latency. The contract is: this answer is correct enough for the next N seconds — serve it from RAM. The art is choosing N, choosing what to cache, and choosing where in the request pipeline to cache it.
A useful prioritisation: cache the output that's most expensive to generate and most reused across requests. Anonymous list pages are the obvious win. Authenticated personalised dashboards are usually not — different output per user, smaller hit rate, bigger invalidation surface.
Cache Backends: Pick One
Django ships with several built-in backends. The choice matters because the wrong backend silently turns your cache into an in-process map that other workers can't see — a common cause of "I cached it but the next refresh recomputed everything."
| Backend | Shared? | Persistent? | Use for |
|---|---|---|---|
| LocMemCache | No — per-process | No | Tests and single-process dev only. Never production with Gunicorn workers. |
| DatabaseCache | Yes | Yes | Small sites with no Redis budget. Hits the DB you're trying to relieve — use sparingly. |
| FileBasedCache | One host | Yes | Single-server deployments without Redis. Slower than Redis; fine for low-traffic edges. |
| RedisCache (Django ≥ 4) | Yes | Configurable | Default choice. Multi-worker, multi-host, fast, with TTL and atomic ops. |
| PyMemcacheCache | Yes | No | Memcached shops. Slightly faster than Redis on pure SET/GET; no Lua, no pub/sub. |
For 95% of Django apps, Redis is the right pick: shared across workers and hosts, fast enough that you'll never feel it, supports atomic increments and Lua scripts for stampede protection, and runs comfortably on a £6/month managed instance.
Setting Up the Redis Backend
Django 4+ ships django.core.cache.backends.redis.RedisCache in the box,
backed by redis-py. You don't need django-redis any more
unless you want its extras (custom serialisers, connection pool config, master/slave
routing). Start with the built-in and add the third-party only when you have a reason.
# settings.py
import os

CACHES = {
    "default": {
        "BACKEND": "django.core.cache.backends.redis.RedisCache",
        "LOCATION": os.environ["REDIS_URL"],  # redis://:password@host:6379/1
        "OPTIONS": {
            # Passed through to redis-py's connection pool. The logical DB
            # comes from the URL path (/1), so no separate "db" option needed.
            "socket_timeout": 2,
            "socket_connect_timeout": 1,
            "retry_on_timeout": True,
        },
        "KEY_PREFIX": "wf",  # avoids key collisions when Redis is shared
        "VERSION": 1,        # bump to invalidate every cached entry at once
        "TIMEOUT": 300,      # default 5 min — override per call
    },
    "sessions": {  # separate logical DB for sessions
        "BACKEND": "django.core.cache.backends.redis.RedisCache",
        # Assumes REDIS_URL ends in "/1"; swaps in logical DB 2 for sessions.
        "LOCATION": os.environ["REDIS_URL"].replace("/1", "/2"),
        "KEY_PREFIX": "wf-sess",
        "TIMEOUT": 60 * 60 * 24 * 14,  # 2 weeks
    },
}

SESSION_ENGINE = "django.contrib.sessions.backends.cache"
SESSION_CACHE_ALIAS = "sessions"
SESSION_ENGINE = "django.contrib.sessions.backends.cache"
SESSION_CACHE_ALIAS = "sessions"
Two things worth doing on day one: set KEY_PREFIX so multiple apps can
share a Redis instance without colliding, and use separate logical DBs (or separate
Redis instances) for the page cache and the session store. Sessions surviving a cache
FLUSHDB is usually what you want.
Sanity-check the connection
from django.core.cache import cache
cache.set("ping", "pong", 30)
assert cache.get("ping") == "pong"
Run this in ./manage.py shell against staging before you wire up any caching.
If it raises a connection error, fix that first. With the built-in Redis backend a
bad connection raises loudly in every view; with some other backends (memcached in
particular) errors can surface as plain cache misses, and you'll think the cache is
just cold.
Per-View Caching
The simplest cache in Django: wrap a view in @cache_page(N) and the entire
rendered response is stored under a key derived from the URL. Subsequent requests for
the same URL skip the view function entirely.
# views.py
from django.shortcuts import render
from django.views.decorators.cache import cache_page
from django.views.decorators.vary import vary_on_headers

from .models import Article

CACHE_TTL_5MIN = 60 * 5

@cache_page(CACHE_TTL_5MIN)
def article_list(request):
    articles = Article.objects.published().select_related("author")[:50]
    return render(request, "blog/list.html", {"articles": articles})

# Vary by Accept-Language so EN and FR get separate cache entries
@cache_page(60 * 60)  # 1 hour
@vary_on_headers("Accept-Language")
def landing_page(request):
    return render(request, "marketing/landing.html")
@cache_page is great when a URL maps 1:1 to a body of output. It's a trap
when the response depends on things not in the URL: the logged-in user, a
cookie, a feature flag. The cache key won't reflect those, so every user sees the
response that the first user happened to trigger. The fix is either: (a) don't use
@cache_page on personalised views, or (b) add @vary_on_cookie
/ @vary_on_headers for the dimensions that matter.
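If a lightly personalised view is still worth caching, varying on the cookie gives each session its own entry. A sketch — the view name and template are hypothetical, and note the hit rate drops to roughly one entry per user:
# Hypothetical per-user caching: vary the cache key on the Cookie header.
from django.shortcuts import render
from django.views.decorators.cache import cache_page
from django.views.decorators.vary import vary_on_cookie

@cache_page(60)  # keep the TTL short: one cache entry per distinct cookie
@vary_on_cookie
def dashboard(request):
    return render(request, "accounts/dashboard.html")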
Site-wide cache middleware (with care)
# settings.py
MIDDLEWARE = [
    "django.middleware.cache.UpdateCacheMiddleware",     # FIRST
    # ... your existing middleware ...
    "django.middleware.common.CommonMiddleware",
    "django.middleware.cache.FetchFromCacheMiddleware",  # LAST
]

CACHE_MIDDLEWARE_ALIAS = "default"
CACHE_MIDDLEWARE_SECONDS = 60 * 5
CACHE_MIDDLEWARE_KEY_PREFIX = "site"
The site-wide cache caches every GET/HEAD response that doesn't already set
Cache-Control: private. On a site with both anonymous list pages and a
logged-in admin, this is almost always wrong. Prefer per-view @cache_page
or template fragments unless you have a static marketing site.
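If you do run it anyway, individual views can opt out explicitly. never_cache sets Cache-Control: private (among other directives), which the cache middleware respects; the view name here is illustrative:
# Opting one view out of the site-wide cache.
from django.shortcuts import render
from django.views.decorators.cache import never_cache

@never_cache
def account_settings(request):
    return render(request, "account/settings.html")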
Template Fragment Caching
When a page has a personalised header but a shared body — the common case — fragment caching lets you cache just the expensive bit.
{% load cache %}

<header>Hello {{ request.user.username }}</header>  {# never cached #}

{% cache 300 sidebar_popular request.LANGUAGE_CODE %}
  {# Cached for 5 min, separate entry per language #}
  <aside class="popular">
    {% for post in popular_posts %}
      <a href="{{ post.url }}">{{ post.title }}</a>
    {% endfor %}
  </aside>
{% endcache %}
The tag signature is {% cache TIMEOUT FRAGMENT_NAME [var1] [var2] ... %}.
The variables become part of the cache key, so the same fragment can have many distinct
cached versions — one per language, per category, per A/B-test bucket, whatever you
pass.
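Fragments can also be evicted on demand: Django exposes the key-building logic, so a signal handler can delete a fragment rather than waiting out the TTL. Here the vary-on value matches a language code as passed to the tag above:
from django.core.cache import cache
from django.core.cache.utils import make_template_fragment_key

# Same fragment name and vary-on arguments as {% cache 300 sidebar_popular ... %}
key = make_template_fragment_key("sidebar_popular", ["en"])
cache.delete(key)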
Include the version of the underlying data
{# In your view, pass a "data version" — a timestamp or a Redis counter. #}
{% cache 3600 article_body article.id article.updated_at.timestamp %}
{{ article.body|safe }}
{% endcache %}
Including article.updated_at in the key means an edited article gets a new
cache key automatically — the old fragment expires naturally after its TTL, and the new
version starts caching from the next request. This pattern avoids the entire "how do I
invalidate a fragment when the model changes" problem.
The Low-Level Cache API
django.core.cache.cache exposes a small, Pythonic API: get,
set, add, get_or_set, delete,
get_many, set_many, incr, decr,
touch. Use it when you're caching a Python object (a list of dicts, a
computed score) rather than HTML.
from django.core.cache import cache
from django.db.models import Count

from .models import Author

# Read with a default
top_authors = cache.get("top_authors", default=[])

# Get-or-compute in one call. The callable is invoked only on a miss.
def _compute_top_authors():
    return list(
        Author.objects.annotate(n=Count("articles"))
        .order_by("-n")
        .values("id", "name", "n")[:10]
    )

top_authors = cache.get_or_set(
    "top_authors",
    _compute_top_authors,  # callable, not the result — lazy on miss
    timeout=60 * 15,       # 15 minutes
)
get_or_set is the workhorse. Always pass a callable, not a
pre-computed value — otherwise you pay the cost of computing the value on every
hit, defeating the cache entirely.
Batched reads with get_many
# Build N keys; round-trip Redis once instead of N times.
ids = [a.id for a in articles]
keys = [f"article:view_count:{i}" for i in ids]
counts = cache.get_many(keys)  # {"article:view_count:42": 17, ...}

for a in articles:
    a.view_count = counts.get(f"article:view_count:{a.id}", 0)
Atomic counters
# Page view counter. incr() is atomic; the first-view initialisation has a
# tiny race (two processes can both set 1), which for analytics costs at
# most one lost count.
def record_view(article_id: int) -> int:
    key = f"article:view_count:{article_id}"
    try:
        return cache.incr(key)
    except ValueError:
        # Key doesn't exist yet — initialise with a long TTL
        cache.set(key, 1, timeout=60 * 60 * 24)
        return 1
Counters are a great use of Redis. They run at hundreds of thousands of operations per second, you don't write to the database on every request, and you can periodically flush the totals back to PostgreSQL with a Celery beat task.
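A sketch of that flush task, assuming Celery is wired up and Article has an integer view_count field — both are assumptions, not something the snippets above define:
# Hypothetical Celery task: drain cached counters into PostgreSQL.
from celery import shared_task
from django.core.cache import cache
from django.db.models import F

from .models import Article

@shared_task
def flush_view_counts():
    # Fine for modest tables; page through the ids for very large ones.
    ids = Article.objects.values_list("id", flat=True)
    keys = {f"article:view_count:{pk}": pk for pk in ids}
    counts = cache.get_many(list(keys))
    for key, n in counts.items():
        Article.objects.filter(pk=keys[key]).update(view_count=F("view_count") + n)
        # Small race: views recorded between get_many() and delete() are lost.
        cache.delete(key)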
Caching the ORM
Django itself does not cache QuerySets across requests — a QuerySet's built-in result cache lives on the QuerySet instance and dies with it, usually the moment the view returns. To survive across requests, you have to do it yourself.
Pattern: cache a serialised list
from django.core.cache import cache
from django.utils import timezone

from .models import Article

def featured_articles(limit: int = 10) -> list[dict]:
    key = f"featured_articles:v1:{limit}"

    def _load():
        qs = (
            Article.objects
            .filter(featured=True, published_at__lte=timezone.now())
            .select_related("author")
            .order_by("-published_at")[:limit]
        )
        # Serialise to plain dicts — pickling Django model instances
        # works but ties the cache to the model class. Plain dicts
        # survive model changes and are smaller.
        return [
            {"id": a.id, "title": a.title, "slug": a.slug,
             "author": a.author.name, "published_at": a.published_at.isoformat()}
            for a in qs
        ]

    return cache.get_or_set(key, _load, timeout=60 * 10)
Two things to notice. First, the key includes v1 — bump it to
v2 and you've invalidated every cached version of this function with no
Redis call. Second, the cached value is a list of plain dicts, not model instances.
Pickling model instances works but couples your cache to your model classes, and
changing a field can corrupt every old cached entry.
Pattern: cached_property for per-instance, per-request computation
from functools import cached_property

from django.db import models

class Article(models.Model):
    body = models.TextField()
    # ...

    @cached_property
    def word_count(self) -> int:
        return len(self.body.split())
cached_property is in-memory and per-instance — the computation runs once
per object, in the current Python process, and is thrown away when the instance is
garbage-collected. Use it for expensive computations on a model that gets accessed many
times in one request (e.g. a serializer reading the same property repeatedly). For
anything that should survive across requests, use the Redis cache.
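One property worth knowing: deleting the attribute clears the per-instance cache, so the next access recomputes it:
article = Article.objects.first()
_ = article.word_count   # computed once, stored on the instance
del article.word_count   # invalidates the per-instance cache
_ = article.word_count   # recomputed on this access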
Pattern: a per-row cache helper
from django.core.cache import cache

from .models import Article

def cached_article(pk: int) -> dict | None:
    """Get an article as a dict, cached by primary key."""
    key = f"article:{pk}:v1"

    def _load():
        try:
            a = Article.objects.select_related("author").get(pk=pk)
        except Article.DoesNotExist:
            return None
        return {"id": a.id, "title": a.title, "body": a.body,
                "author": a.author.name, "updated_at": a.updated_at.isoformat()}

    return cache.get_or_set(key, _load, timeout=60 * 30)

def invalidate_article(pk: int) -> None:
    cache.delete(f"article:{pk}:v1")
Wire invalidate_article() to a post_save / post_delete
signal on the model and the cache stays consistent without anyone having to remember
it (see the signal-based invalidation pattern below).
Cache Keys: Boring is Better
Bad key design is the most common reason production caches go wrong. A few rules that have held up across years of Django projects:
- Embed a version segment. Every key has :v1 at the end. When the shape of the cached value changes, bump it.
- Encode every input. If the value depends on a user, a tenant, a language, a filter — put it in the key. The cache will happily serve user A's data to user B if you forget.
- Hash long inputs. A querystring with 12 filters becomes a 200-char key. SHA-1 it, prefix with the path. Memcached has a 250-char limit; Redis doesn't, but you still want short keys for monitoring.
- Namespace by resource. article:, user_profile:, search:. Makes redis-cli SCAN MATCH 'article:*' useful for ops.
import hashlib
from urllib.parse import urlencode

def search_cache_key(query: str, filters: dict, page: int) -> str:
    payload = urlencode(sorted(filters.items())) + f"&q={query}&p={page}"
    digest = hashlib.sha1(payload.encode()).hexdigest()[:16]
    return f"search:v2:{digest}"
Cache Invalidation
"There are only two hard things in Computer Science: cache invalidation and naming things." The advice is overstated, but the work is real. Three patterns cover almost every case.
1. TTL-only ("ignore the problem")
Set a TTL short enough that stale data doesn't matter. Works for almost everything: list pages, search results, aggregates, trending posts. The cache eventually catches up; you don't have to write any invalidation code.
2. Versioned keys ("just bump the integer")
Store a counter in the cache. Include it in keys. To invalidate everything that depends on a resource, increment the counter — every key that referenced the old value is now unreachable.
from django.core.cache import cache

def category_version(category_id: int) -> int:
    # timeout=None means "cache forever" (until evicted or flushed)
    return cache.get_or_set(f"ver:category:{category_id}", 1, timeout=None)

def category_articles_cache_key(category_id: int) -> str:
    v = category_version(category_id)
    return f"category:{category_id}:articles:v{v}"

# In a signal handler: invalidate by bumping the version.
def invalidate_category(category_id: int) -> None:
    try:
        cache.incr(f"ver:category:{category_id}")
    except ValueError:
        cache.set(f"ver:category:{category_id}", 2, timeout=None)
3. Signal-based explicit invalidation
# signals.py
from django.core.cache import cache
from django.db.models.signals import post_save, post_delete
from django.dispatch import receiver

from .cache_utils import invalidate_article
from .models import Article

@receiver([post_save, post_delete], sender=Article)
def clear_article_cache(sender, instance, **kwargs):
    invalidate_article(instance.pk)
    # Also bust list pages that contain this article
    try:
        cache.incr("ver:article_list")
    except ValueError:
        cache.set("ver:article_list", 2, timeout=None)
Cache Stampede Protection
A cache stampede (aka dogpile, thundering herd) is what happens when a popular key expires and 200 concurrent requests all simultaneously decide to recompute it. Your database goes from 50 QPS to 50,000 QPS for two seconds; the recomputation finishes, everyone writes the same value to Redis, and the herd disperses. On the next expiry, it happens again.
Two robust mitigations:
1. Distributed lock around the recompute
import time

from django.core.cache import cache

def get_or_compute_locked(key: str, ttl: int, compute):
    """Single-flight: only one process recomputes; others wait briefly and re-read."""
    cached = cache.get(key)
    if cached is not None:
        return cached

    lock_key = f"{key}:lock"
    # add() returns False if the key already exists — atomic SET NX in Redis.
    got_lock = cache.add(lock_key, "1", timeout=30)
    if not got_lock:
        # Another worker is computing. Wait briefly, then re-read.
        for _ in range(20):  # ~2 s max
            time.sleep(0.1)
            cached = cache.get(key)
            if cached is not None:
                return cached
        return compute()  # give up, compute ourselves

    try:
        value = compute()
        cache.set(key, value, timeout=ttl)
        return value
    finally:
        cache.delete(lock_key)
2. Probabilistic early refresh ("XFetch")
Instead of every request waiting for expiry, occasionally refresh before it. The probability of an early refresh rises as the value gets closer to its TTL — so under steady load, the cache rarely actually expires.
import math
import random
import time

from django.core.cache import cache

def xfetch(key: str, ttl: int, beta: float, compute):
    """Cache with probabilistic early recomputation (Vattani et al., 2015)."""
    payload = cache.get(key)  # {"value": ..., "delta": ..., "expires_at": ...}
    now = time.time()
    # log(random()) is negative, so subtracting it shifts "now" forward by a
    # random amount scaled by the recomputation cost: the closer to expiry,
    # the likelier this check fails and triggers an early recompute.
    if payload and now - beta * payload["delta"] * math.log(random.random()) < payload["expires_at"]:
        return payload["value"]

    started = time.time()
    value = compute()
    delta = time.time() - started  # cost of recomputation in seconds
    cache.set(key, {"value": value, "delta": delta, "expires_at": now + ttl}, timeout=ttl)
    return value
beta is a tuning knob (typical 1.0–2.0): higher means refresh earlier. The
lock approach is simpler and good for most cases; XFetch is worth it for very hot keys
where even brief contention would hurt.
Monitoring & Sizing
A cache you don't measure is a cache you're guessing about. Three numbers to track:
- Hit rate. Redis INFO stats exposes keyspace_hits and keyspace_misses. Anything below ~80% on a steady-state app means TTLs are too short or keys are too fragmented. (A Python sketch for this follows the commands below.)
- Memory and evictions. used_memory approaching maxmemory, combined with non-zero evicted_keys, means Redis is throwing away entries to make room. Either size up or shorten TTLs on low-value keys.
- Latency. A healthy Redis responds in <1 ms on a LAN. If cache.get() shows up in your APM as 20 ms+, you have a network problem, not a cache problem.
redis-cli INFO stats | grep -E 'keyspace_(hits|misses)|evicted'
# Top memory hogs
redis-cli --bigkeys
# Watch live commands (do NOT leave running in prod)
redis-cli MONITOR | head -50
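The same counters make a dashboard metric with one INFO call from Python. A sketch using redis-py directly, reusing the REDIS_URL from the settings above:
# Hypothetical metrics helper: hit rate straight from Redis stats.
import os

import redis

def cache_hit_rate() -> float:
    r = redis.Redis.from_url(os.environ["REDIS_URL"])
    stats = r.info("stats")
    hits, misses = stats["keyspace_hits"], stats["keyspace_misses"]
    return hits / (hits + misses) if (hits + misses) else 0.0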
Configure maxmemory with the allkeys-lru eviction policy
(recent keys survive, old ones get evicted first). This makes the cache self-managing
under memory pressure — entries you stopped using disappear without you having to
remember to clean them up.
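In redis.conf (or via CONFIG SET on a running instance) that is two directives; the 512mb figure is a placeholder to size for your workload:
# redis.conf: cap memory and evict least-recently-used keys first
maxmemory 512mb
maxmemory-policy allkeys-lru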
Production Checklist
- Configured CACHES explicitly — never relying on the LocMemCache default in production.
- Separate Redis logical DBs for the page cache and the session store, so a cache flush doesn't log everyone out.
- Every key has a version segment. Bump it instead of writing migration-style cache busters.
- Every personalised value's cache key includes the user/tenant. No exceptions — leaking one user's data to another is the worst possible cache bug.
- get_or_set with a callable, not get-then-set with a precomputed value.
- Signal-based invalidation on model save/delete for any model whose data you cache by primary key.
- Stampede protection on the top 5 hottest keys — either lock or XFetch. The rest can ride bare TTLs.
- Hit rate dashboard in Grafana / Datadog, alerting when it drops below ~70% on cacheable endpoints.
- maxmemory set with allkeys-lru eviction. Alert on non-zero evicted_keys sustained over five minutes.
- Connection timeouts set (socket_timeout=2) so a Redis stall fails fast instead of blocking every worker.
- Tests use LocMemCache or a per-test Redis DB, not the production cache. cache.clear() in tearDown.
- Graceful degradation — if cache.get() raises, the view should still render. Wrap the cache call, log the exception, fall through (a sketch of such a wrapper follows this list).
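A minimal sketch of that last item, assuming you would rather serve an uncached response than a 500 when Redis is down. The helper name is mine, not Django's:
# Hypothetical wrapper: treat any cache backend error as a miss.
import logging

from django.core.cache import cache

logger = logging.getLogger(__name__)

def safe_get_or_set(key, compute, timeout=300):
    try:
        value = cache.get(key)
        if value is not None:
            return value
    except Exception:
        logger.exception("cache read failed for %s", key)
        return compute()  # degrade: serve uncached rather than erroring

    value = compute()
    try:
        cache.set(key, value, timeout=timeout)
    except Exception:
        logger.exception("cache write failed for %s", key)
    return value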