// ARTICLE · MAY 15, 2026

Complete Django Caching Guide: Redis, Per-View, Template, ORM.

Caching is the cheapest performance gain available to a Django app — done right, it turns a 300 ms database-bound view into a 4 ms cache hit. Done badly, it serves stale data, masks bugs, and stampedes Redis under load. This guide walks every layer Django gives you: the Redis backend, @cache_page, template fragments, the low-level API, ORM patterns, invalidation, and stampede protection.

Python · Django · Performance · Redis
· ~16 min read · Rizwan Mansuri

Why Cache, and What to Cache

Most slow Django views are slow for the same reason: they make the database do work that would not change between requests. A category page lists the same products for every anonymous visitor for the next five minutes. A homepage runs the same six aggregate queries every refresh. A REST endpoint serialises the same paginated list of articles thousands of times an hour. None of that work needs to happen more than once per refresh window.

Caching trades freshness for latency. The contract is: this answer is correct enough for the next N seconds — serve it from RAM. The art is choosing N, choosing what to cache, and choosing where in the request pipeline to cache it.

[Figure: request flow through the cache layers — browser HTTP cache → CDN / nginx edge cache → UpdateCache middleware (per-site, per-view) → {% cache %} template fragments → low-level cache.get_or_set() against Redis keys + TTL → and, when no layer answers, the Django view with its ORM queries against PostgreSQL (~5–250 ms per query).]
Django's caching layers, ordered cheapest-to-most-expensive from left to right. A hit at any layer short-circuits everything to its right; a hit at the CDN never touches your worker, and a hit at the low-level cache still avoids the database.

A useful prioritisation: cache the output that's most expensive to generate and most reused across requests. Anonymous list pages are the obvious win. Authenticated personalised dashboards are usually not — different output per user, smaller hit rate, bigger invalidation surface.


Cache Backends: Pick One

Django ships with several built-in backends. The choice matters because the wrong backend silently turns your cache into an in-process map that other workers can't see — a common cause of "I cached it but the next refresh recomputed everything."

| Backend | Shared? | Persistent? | Use for |
| --- | --- | --- | --- |
| LocMemCache | No — per-process | No | Tests and single-process dev only. Never production with Gunicorn workers. |
| DatabaseCache | Yes | Yes | Small sites with no Redis budget. Hits the DB you're trying to relieve — use sparingly. |
| FileBasedCache | One host | Yes | Single-server deployments without Redis. Slower than Redis; fine for low-traffic edges. |
| RedisCache (Django ≥ 4) | Yes | Configurable | Default choice. Multi-worker, multi-host, fast, with TTL and atomic ops. |
| PyMemcacheCache | Yes | No | Memcached shops. Slightly faster than Redis on pure SET/GET; no Lua, no pub/sub. |

For 95% of Django apps, Redis is the right pick: shared across workers and hosts, fast enough that you'll never feel it, supports atomic increments and Lua scripts for stampede protection, and runs comfortably on a £6/month managed instance.


Setting Up the Redis Backend

Django 4+ ships django.core.cache.backends.redis.RedisCache in the box, backed by redis-py. You don't need django-redis any more unless you want its extras (custom serialisers, connection pool configuration, primary/replica routing). Start with the built-in backend and add the third-party package only when you have a reason.

# settings.py
import os

CACHES = {
    "default": {
        "BACKEND": "django.core.cache.backends.redis.RedisCache",
        "LOCATION": os.environ["REDIS_URL"],  # redis://:password@host:6379/1
        "OPTIONS": {
            "db": 1,
            "socket_timeout": 2,
            "socket_connect_timeout": 1,
            "retry_on_timeout": True,
        },
        "KEY_PREFIX": "wf",        # avoids key collisions when Redis is shared
        "VERSION": 1,              # bump to invalidate every cached entry at once
        "TIMEOUT": 300,            # default 5 min — override per call
    },
    "sessions": {                   # separate logical DB for sessions
        "BACKEND": "django.core.cache.backends.redis.RedisCache",
        "LOCATION": os.environ["REDIS_URL"].replace("/1", "/2"),
        "KEY_PREFIX": "wf-sess",
        "TIMEOUT": 60 * 60 * 24 * 14,  # 2 weeks
    },
}

SESSION_ENGINE = "django.contrib.sessions.backends.cache"
SESSION_CACHE_ALIAS = "sessions"

Two things worth doing on day one: set KEY_PREFIX so multiple apps can share a Redis instance without colliding, and use separate logical DBs (or separate Redis instances) for the page cache and the session store. Sessions surviving a cache FLUSHDB is usually what you want; if losing sessions to eviction or a restart would be unacceptable, use the django.contrib.sessions.backends.cached_db engine instead, which writes through to the database.

Sanity-check the connection

from django.core.cache import cache

cache.set("ping", "pong", 30)
assert cache.get("ping") == "pong"

Run this in python manage.py shell against staging before you wire up any caching. If it raises a connection error, fix that first. Depending on the backend, a broken connection either raises on every cache call or quietly behaves like a permanent cache miss — which is easy to mistake for a cold cache.


Per-View Caching

The simplest cache in Django: wrap a view in @cache_page(N) and the entire rendered response is stored under a key derived from the URL. Subsequent requests for the same URL skip the view function entirely.

# views.py
from django.shortcuts import render
from django.views.decorators.cache import cache_page
from django.views.decorators.vary import vary_on_headers

from .models import Article

CACHE_TTL_5MIN = 60 * 5

@cache_page(CACHE_TTL_5MIN)
def article_list(request):
    articles = Article.objects.published().select_related("author")[:50]
    return render(request, "blog/list.html", {"articles": articles})


# Vary by Accept-Language so EN and FR get separate cache entries
@cache_page(60 * 60)            # 1 hour
@vary_on_headers("Accept-Language")
def landing_page(request):
    return render(request, "marketing/landing.html")

@cache_page is great when a URL maps 1:1 to a body of output. It's a trap when the response depends on things not in the URL: the logged-in user, a cookie, a feature flag. The cache key won't reflect those, so every user sees the response that the first user happened to trigger. The fix is either: (a) don't use @cache_page on personalised views, or (b) add @vary_on_cookie / @vary_on_headers for the dimensions that matter.

Site-wide cache middleware (with care)

# settings.py
MIDDLEWARE = [
    "django.middleware.cache.UpdateCacheMiddleware",   # FIRST
    # ... your existing middleware ...
    "django.middleware.common.CommonMiddleware",
    "django.middleware.cache.FetchFromCacheMiddleware", # LAST
]

CACHE_MIDDLEWARE_ALIAS = "default"
CACHE_MIDDLEWARE_SECONDS = 60 * 5
CACHE_MIDDLEWARE_KEY_PREFIX = "site"

The site-wide cache caches every GET/HEAD response that doesn't already set Cache-Control: private. On a site with both anonymous list pages and a logged-in admin, this is almost always wrong. Prefer per-view @cache_page or template fragments unless you have a static marketing site.


Template Fragment Caching

When a page has a personalised header but a shared body — the common case — fragment caching lets you cache just the expensive bit.

{% load cache %}

<header>Hello {{ request.user.username }}</header>  {# never cached #}

{% cache 300 sidebar_popular request.LANGUAGE_CODE %}
  {# Cached for 5 min, separate entry per language #}
  <aside class="popular">
    {% for post in popular_posts %}
      <a href="{{ post.url }}">{{ post.title }}</a>
    {% endfor %}
  </aside>
{% endcache %}

The tag signature is {% cache TIMEOUT FRAGMENT_NAME [var1] [var2] ... %}. The variables become part of the cache key, so the same fragment can have many distinct cached versions — one per language, per category, per A/B-test bucket, whatever you pass.

Include the version of the underlying data

{# In your view, pass a "data version" — a timestamp or a Redis counter. #}
{% cache 3600 article_body article.id article.updated_at.timestamp %}
  {{ article.body|safe }}
{% endcache %}

Including article.updated_at in the key means an edited article gets a new cache key automatically — the old fragment expires naturally after its TTL, and the new version starts caching from the next request. This pattern avoids the entire "how do I invalidate a fragment when the model changes" problem.


The Low-Level Cache API

django.core.cache.cache exposes a small, Pythonic API: get, set, add, get_or_set, delete, get_many, set_many, incr, decr, touch. Use it when you're caching a Python object (a list of dicts, a computed score) rather than HTML.

from django.core.cache import cache
from django.db.models import Count

from .models import Author

# Read with a default
top_authors = cache.get("top_authors", default=[])

# Atomic get-or-compute. The callable is invoked only on miss.
def _compute_top_authors():
    return list(
        Author.objects.annotate(n=Count("articles"))
                       .order_by("-n")
                       .values("id", "name", "n")[:10]
    )

top_authors = cache.get_or_set(
    "top_authors",
    _compute_top_authors,    # callable, not the result — lazy on miss
    timeout=60 * 15,         # 15 minutes
)

get_or_set is the workhorse. Always pass a callable, not a pre-computed value — otherwise you pay the cost of computing the value on every hit, defeating the cache entirely.

Batched reads with get_many

# Build N keys; round-trip Redis once instead of N times.
ids = [a.id for a in articles]
keys = [f"article:view_count:{i}" for i in ids]
counts = cache.get_many(keys)        # {"article:view_count:42": 17, ...}

for a in articles:
    a.view_count = counts.get(f"article:view_count:{a.id}", 0)

Atomic counters

# Page view counter — atomic increments, no read-modify-write race
def record_view(article_id: int) -> int:
    key = f"article:view_count:{article_id}"
    try:
        return cache.incr(key)
    except ValueError:
        # Key doesn't exist yet — initialise with a long TTL. Two workers
        # racing here can both set 1 and drop a single count; fine for
        # view counters, use cache.add() plus a retry of incr() if not.
        cache.set(key, 1, timeout=60 * 60 * 24)
        return 1

Counters are a great use of Redis. They run at hundreds of thousands of operations per second, you don't write to the database on every request, and you can periodically flush the totals back to PostgreSQL with a Celery beat task.


Caching the ORM

Django itself does not cache QuerySets across requests — a QuerySet's result cache lives on the QuerySet object and is gone the moment the view returns. To survive across requests, you have to do it yourself.

Pattern: cache a serialised list

from django.core.cache import cache
from django.utils import timezone

from .models import Article

def featured_articles(limit: int = 10) -> list[dict]:
    key = f"featured_articles:v1:{limit}"
    def _load():
        qs = (Article.objects
              .filter(featured=True, published_at__lte=timezone.now())
              .select_related("author")
              .order_by("-published_at")[:limit])
        # Serialise to plain dicts — pickling Django model instances
        # works but ties cache to the model class. Plain dicts survive
        # model changes and are smaller.
        return [
            {"id": a.id, "title": a.title, "slug": a.slug,
             "author": a.author.name, "published_at": a.published_at.isoformat()}
            for a in qs
        ]
    return cache.get_or_set(key, _load, timeout=60 * 10)

Two things to notice. First, the key includes v1 — bump it to v2 and you've invalidated every cached version of this function with no Redis call. Second, the cached value is a list of plain dicts, not model instances. Pickling model instances works but couples your cache to your model classes, and changing a field can corrupt every old cached entry.

Pattern: cached_property for per-instance, per-request computation

from functools import cached_property

from django.db import models

class Article(models.Model):
    body = models.TextField()
    # ...

    @cached_property
    def word_count(self) -> int:
        return len(self.body.split())

cached_property is in-memory and per-instance — the computation runs once per object, in the current Python process, and is thrown away when the instance is garbage-collected. Use it for expensive computations on a model that gets accessed many times in one request (e.g. a serializer reading the same property repeatedly). For anything that should survive across requests, use the Redis cache.

Pattern: a per-row cache helper

def cached_article(pk: int) -> dict | None:
    """Get an article as a dict, cached by primary key."""
    key = f"article:{pk}:v1"
    def _load():
        try:
            a = Article.objects.select_related("author").get(pk=pk)
        except Article.DoesNotExist:
            return None
        return {"id": a.id, "title": a.title, "body": a.body,
                "author": a.author.name, "updated_at": a.updated_at.isoformat()}
    return cache.get_or_set(key, _load, timeout=60 * 30)


def invalidate_article(pk: int) -> None:
    cache.delete(f"article:{pk}:v1")

Wire invalidate_article() to a post_save / post_delete signal on the model and the cache stays consistent without anyone having to remember it (see the signal-based invalidation pattern below).


Cache Keys: Boring is Better

Bad key design is the most common reason production caches go wrong. A few rules that have held up across years of Django projects:

  • Embed a version segment. Every key has :v1 at the end. When the shape of the cached value changes, bump it.
  • Encode every input. If the value depends on a user, a tenant, a language, a filter — put it in the key. The cache will happily serve user A's data to user B if you forget.
  • Hash long inputs. A querystring with 12 filters becomes a 200-char key. SHA-1 it, prefix with the path. Memcached has a 250-char limit; Redis doesn't but you still want short keys for monitoring.
  • Namespace by resource. article:, user_profile:, search:. Makes redis-cli SCAN MATCH article:* useful for ops.

import hashlib
from urllib.parse import urlencode

def search_cache_key(query: str, filters: dict, page: int) -> str:
    payload = urlencode(sorted(filters.items())) + f"&q={query}&p={page}"
    digest = hashlib.sha1(payload.encode()).hexdigest()[:16]
    return f"search:v2:{digest}"

Cache Invalidation

"There are only two hard things in Computer Science: cache invalidation and naming things." The joke (usually credited to Phil Karlton) is overused, but the work is real. Three patterns cover almost every case.

1. TTL-only ("ignore the problem")

Set a TTL short enough that stale data doesn't matter. Works for almost everything: list pages, search results, aggregates, trending posts. The cache eventually catches up; you don't have to write any invalidation code.

2. Versioned keys ("just bump the integer")

Store a counter in the cache. Include it in keys. To invalidate everything that depends on a resource, increment the counter — every key that referenced the old value is now unreachable.

from django.core.cache import cache

def category_version(category_id: int) -> int:
    return cache.get_or_set(f"ver:category:{category_id}", 1, timeout=None)

def category_articles_cache_key(category_id: int) -> str:
    v = category_version(category_id)
    return f"category:{category_id}:articles:v{v}"

# In signal handler: invalidate by bumping the version.
def invalidate_category(category_id: int) -> None:
    try:
        cache.incr(f"ver:category:{category_id}")
    except ValueError:
        cache.set(f"ver:category:{category_id}", 2, timeout=None)

3. Signal-based explicit invalidation

# signals.py
from django.core.cache import cache
from django.db.models.signals import post_save, post_delete
from django.dispatch import receiver

from .cache_utils import invalidate_article
from .models import Article


@receiver([post_save, post_delete], sender=Article)
def clear_article_cache(sender, instance, **kwargs):
    invalidate_article(instance.pk)
    # Also bust list pages that contain this article
    try:
        cache.incr("ver:article_list")
    except ValueError:
        cache.set("ver:article_list", 2, timeout=None)

Cache Stampede Protection

A cache stampede (aka dogpile, thundering herd) is what happens when a popular key expires and 200 concurrent requests all simultaneously decide to recompute it. Your database goes from 50 QPS to 50,000 QPS for two seconds; the recomputation finishes, everyone writes the same value to Redis, and the herd disperses. On the next expiry, it happens again.

Two robust mitigations:

1. Distributed lock around the recompute

import time

from django.core.cache import cache

def get_or_compute_locked(key: str, ttl: int, compute):
    """Single-flight: only one process recomputes; others wait briefly and re-read."""
    cached = cache.get(key)
    if cached is not None:
        return cached

    lock_key = f"{key}:lock"
    # add() returns False if the key already exists — atomic SET NX in Redis.
    got_lock = cache.add(lock_key, "1", timeout=30)
    if not got_lock:
        # Another worker is computing. Wait briefly, then re-read.
        for _ in range(20):                  # ~2 s max
            time.sleep(0.1)
            cached = cache.get(key)
            if cached is not None:
                return cached
        return compute()                     # give up, compute ourselves

    try:
        value = compute()
        cache.set(key, value, timeout=ttl)
        return value
    finally:
        cache.delete(lock_key)

2. Probabilistic early refresh ("XFetch")

Instead of every request waiting for expiry, occasionally refresh before it. The probability of an early refresh rises as the value gets closer to its TTL — so under steady load, the cache rarely actually expires.

import math
import random
import time

from django.core.cache import cache

def xfetch(key: str, ttl: int, beta: float, compute):
    """Cache with probabilistic early recomputation (Vattani et al., 2015)."""
    payload = cache.get(key)                 # {"value": ..., "delta": ..., "expires_at": ...}
    now = time.time()
    # Serve the cached value unless we probabilistically decide to refresh
    # early. -log(random()) is positive and occasionally large, so the
    # refresh probability rises as expiry approaches.
    if payload and (now - beta * payload["delta"] * math.log(random.random())) < payload["expires_at"]:
        return payload["value"]

    started = time.time()
    value = compute()
    delta = time.time() - started            # cost of recomputation in seconds
    cache.set(key, {"value": value, "delta": delta, "expires_at": now + ttl}, timeout=ttl)
    return value

beta is a tuning knob (typical 1.0–2.0): higher means refresh earlier. The lock approach is simpler and good for most cases; XFetch is worth it for very hot keys where even brief contention would hurt.


Monitoring & Sizing

A cache you don't measure is a cache you're guessing about. Three numbers to track:

  • Hit rate. Redis INFO stats exposes keyspace_hits and keyspace_misses. Anything below ~80% on a steady-state app means TTLs are too short or keys are too fragmented.
  • Memory and evictions. used_memory approaching maxmemory, combined with non-zero evicted_keys, means Redis is throwing away entries to make room. Either size up or shorten TTLs on low-value keys.
  • Latency. A healthy Redis responds in <1 ms on a LAN. If cache.get() shows up in your APM as 20 ms+, you have a network problem, not a cache problem.

# Hit/miss and eviction counters
redis-cli INFO stats | grep -E 'keyspace_(hits|misses)|evicted'

# Top memory hogs
redis-cli --bigkeys

# Watch live commands (do NOT leave running in prod)
redis-cli MONITOR | head -50

Configure maxmemory with the allkeys-lru eviction policy (least-recently-used keys are evicted first, so recently touched entries survive). This makes the cache self-managing under memory pressure — entries you stopped using disappear without you having to remember to clean them up.


Production Checklist

  • Configured CACHES explicitly — never relying on the LocMemCache default in production.
  • Separate Redis logical DBs for the page cache and the session store, so a cache flush doesn't log everyone out.
  • Every key has a version segment. Bump it instead of writing migration-style cache busters.
  • Every personalised value's cache key includes the user/tenant. No exceptions — leaking one user's data to another is the worst possible cache bug.
  • get_or_set with a callable, not get-then-set with a precomputed value.
  • Signal-based invalidation on model save/delete for any model whose data you cache by primary key.
  • Stampede protection on the top 5 hottest keys — either lock or XFetch. The rest can ride bare TTLs.
  • Hit rate dashboard in Grafana / Datadog, alerting when it drops below ~70% on cacheable endpoints.
  • maxmemory set with allkeys-lru eviction. Alert on non-zero evicted_keys sustained over five minutes.
  • Connection timeouts set (socket_timeout=2) so a Redis stall fails fast instead of blocking every worker.
  • Tests use LocMemCache or a per-test Redis DB, not the production cache. cache.clear() in tearDown.
  • Graceful degradation — if cache.get() raises, the view should still render. Wrap the cache call, log the exception, fall through.