Performance Profiling Methodology and Findings (2026-02-22)
This document summarizes the profiling methodology and findings for the current feature branch performance work, including cache correctness fixes, API caching verification, and frontend/case-page profiling.
Scope
Profiled and verified:
API endpoints:
/api//api/cases//api/laws//api/law_books/
Frontend pages:
/(frontpage)/case/(case list)/case/foo-case(case detail, fixture data)
The work included both correctness fixes (cache variance/safety) and performance improvements (query reductions, caching).
Environment and Tooling
Date: February 22, 2026
Python env:
.venvProfiling packages installed with
uv:django-silkdjango-querycountgprof2dot
Commands used to install profiling packages
source .venv/bin/activate
uv pip install django-silk==5.4.3 django-querycount==0.8.3 gprof2dot
Important Runtime Setup Notes
Env variable names (important)
django-configurations uses different env names for different settings types in this project:
Database override:
DATABASE_URL=...Profiling toggles:
DJANGO_PROFILING_ENABLED=1Querycount toggle:
DJANGO_QUERYCOUNT_ENABLED=1File cache location:
DJANGO_FILE_CACHE_LOCATION=/tmp/...
Using DJANGO_DATABASE_URL does not override the DB here.
Temp SQLite DB location under escalated commands
When profiling with escalated commands, the process did not see the same SQLite file contents created in the sandbox workspace path. The reliable approach was:
create/migrate/load the profiling DB in
/tmprun
DevConfigurationagainst/tmp/oldp-dev-profile.sqlite3
Methodology
1. Validate profiling middleware stack
Enabled Silk + QueryCount via env vars
Verified:
middleware activation
/silk/route registrationX-DjangoQueryCount-*headers on responsesSilk request recording in DB
2. Build a temporary profiling DB
Using a fresh SQLite DB for reproducible profiling:
TMPDIR=/tmp SQLITE_TMPDIR=/tmp \
DJANGO_CONFIGURATION=DevConfiguration \
DATABASE_URL=sqlite:////tmp/oldp-dev-profile.sqlite3 \
DJANGO_PROFILING_ENABLED=1 \
DJANGO_QUERYCOUNT_ENABLED=1 \
DJANGO_FILE_CACHE_LOCATION=/tmp/django_cache_devprofile \
.venv/bin/python manage.py migrate --noinput
Load fixture data:
TMPDIR=/tmp SQLITE_TMPDIR=/tmp \
DJANGO_CONFIGURATION=DevConfiguration \
DATABASE_URL=sqlite:////tmp/oldp-dev-profile.sqlite3 \
DJANGO_PROFILING_ENABLED=1 \
DJANGO_QUERYCOUNT_ENABLED=1 \
DJANGO_FILE_CACHE_LOCATION=/tmp/django_cache_devprofile \
.venv/bin/python manage.py loaddata \
locations/countries.json \
locations/states.json \
locations/cities.json \
courts/courts.json \
sources/default.json \
laws/laws.json \
cases/cases.json
3. Run profiled local server (DevConfiguration)
TMPDIR=/tmp SQLITE_TMPDIR=/tmp \
DJANGO_CONFIGURATION=DevConfiguration \
DATABASE_URL=sqlite:////tmp/oldp-dev-profile.sqlite3 \
DJANGO_PROFILING_ENABLED=1 \
DJANGO_QUERYCOUNT_ENABLED=1 \
DJANGO_FILE_CACHE_LOCATION=/tmp/django_cache_devprofile \
.venv/bin/python manage.py runserver 127.0.0.1:8014 --noreload
4. Measure cold vs warm requests (2-pass)
Two sequential requests per endpoint were issued with curl, capturing headers and timing:
status
X-DjangoQueryCount-CountVaryCache-ControlETagtime_total
Example:
curl -sS -o /dev/null -D - -w 'time_total=%{time_total}\n' \
http://127.0.0.1:8014/api/laws/
5. Use Silk SQL traces for root-cause analysis
For hotspots (especially /case/foo-case), Silk SQL traces were inspected to attribute remaining queries to:
view code
model methods
template-triggered calls
Findings
A. DevConfiguration profiling (with real cache behavior)
Endpoint results (cold -> warm)
Endpoint |
Querycount |
Time |
|---|---|---|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Interpretation
Cache behavior is working for API and frontend pages under
DevConfiguration.Cached API endpoints show the expected
Varyheaders (includingAuthorization,Cookie,Accept-Language,Host).Frontpage and case list show significant query-count reductions on warm requests.
Case detail remained the main hotspot after the initial pass.
B. Case Detail (/case/foo-case) Deep Dive and Optimizations
Warm-request SQL (before follow-up case optimizations)
Silk showed 3 application SQL queries on warm anonymous requests:
public annotation markers (
item.get_markers(request))related cases (
item.get_related())references query (
item.get_grouped_references()via template include)
This showed that the existing shared case cache only covered:
case object
reference markers
It did not yet cache template-driven shared data (references, related_cases) or anonymous marker/content rendering.
Optimization 2: Cache anonymous public markers + rendered content
Implemented anonymous-only shared caches for:
public markers list
annotated rendered content (
insert_markers(...)result)
This is safe for anonymous users because marker visibility is public-only.
Final /case/foo-case result (anonymous)
cold:
12querycount,~0.613swarm:
2querycount,~0.049s
Silk warm SQL after this step:
0 application SQL queries
Remaining
querycountis profiling instrumentation writes (Silk)
C. Cache Correctness Fixes Verified During Profiling
The branch now includes cache correctness protections that were important to verify while profiling:
API cache variance includes:
AuthorizationCookieAccept-LanguageHost
Role-based page cache keys vary by:
method
host
language
path
role
Role cache skips responses that:
set cookies
vary on cookie
are not cache-safe
This avoids cross-user/session and cross-locale/domain cache contamination while keeping aggressive caching enabled.
D. Remaining Bottlenecks / Next Targets
After the case-detail follow-up optimizations:
Warm anonymous
/case/foo-caseis effectively DB-free.Remaining performance work is more likely CPU/template/middleware/compression than DB on warm hits.
Best next profiling target:
/search/?q=...validate search facet cache hit behavior
inspect cache-miss CPU hotspots (facet processing and template rendering)
E. Continued Profiling: Search and Law Pages (same temp DB, DevConfiguration)
This follow-up pass profiled:
/search/?q=gg/search/autocomplete?q=gg/law/gg/(law book page)/law/gg/artikel-1(law detail page)
using the same profiling DB and the same DevConfiguration methodology as above, with a fresh file cache directory.
Endpoint results (cold -> warm)
Endpoint |
Querycount |
Time |
|---|---|---|
|
|
|
|
|
|
|
|
|
|
|
|
Silk request metadata (application SQL focus)
querycount includes profiling/instrumentation overhead, so Silk request metadata and SQL traces were used to measure application SQL:
/search/?q=ggcold:
meta_num_queries = 1warm:
meta_num_queries = 1warm Silk SQL count:
0
/search/autocomplete?q=ggcold:
meta_num_queries = 1warm:
meta_num_queries = 1warm Silk SQL count:
0
/law/gg/cold:
meta_num_queries = 226warm:
meta_num_queries = 1warm Silk SQL count:
0
/law/gg/artikel-1cold:
meta_num_queries = 14warm:
meta_num_queries = 1warm Silk SQL count:
0
Interpretation
Search and law pages are all effectively DB-free on warm cache hits (application SQL
0in Silk traces)./search/?q=ggshows a large cold-to-warm drop (~683ms -> ~42ms), which confirms the role-level page cache is paying off./law/gg/is now the standout cold-path hotspot (very high SQL query volume before cache).
Search profiling caveat (important)
In this local profiling environment, Elasticsearch was not reachable and the server logged connection failures for the search requests. The search views still returned 200 via the application’s error-handling/fallback path.
Implication:
the search endpoint timings above are still useful for measuring page-cache impact and local fallback rendering behavior
they do not represent true indexed-search performance under a healthy Elasticsearch backend
F. New Bottleneck Identified: Cold /law/gg/ Page (LawBook.sections N+1)
Observed pattern
Silk SQL trace for cold /law/gg/ showed:
203SQL rows recorded226application SQL queries (meta_num_queries)198duplicate queries of the same form:
SELECT laws_lawbook.id, laws_lawbook.sections FROM laws_lawbook WHERE laws_lawbook.id = ... LIMIT 21
Root cause
This is a template-driven N+1 on the law book page:
oldp/apps/laws/templates/laws/book.htmlloops overitemsand callsitem.get_section(oldp/apps/laws/templates/laws/book.html:18,oldp/apps/laws/templates/laws/book.html:21)Law.get_section()callsself.book.get_sections()(oldp/apps/laws/models.py:378)get_sections()accessesself.sections(oldp/apps/laws/models.py:142)In
view_book, the queryset defersbook__sectionsviaLaw.defer_fields_list_view(oldp/apps/laws/views.py:107,oldp/apps/laws/models.py:285)
Because book__sections is deferred, each item.get_section call triggers a lazy refresh of LawBook.sections, causing repeated identical queries.
Recommended fix (next pass)
Either of these should remove the N+1 on cold /law/gg/:
Do not defer
book__sectionsinview_book(or in the list-view defer set used there)Precompute section labels once in the view (e.g.,
sections = book.get_sections()) and use that in the template instead ofitem.get_section
G. Smaller Cold-Path Pattern: Duplicate get_revision_dates() Queries on Law Pages
Both cold /law/gg/ and /law/gg/artikel-1 traces show duplicate revision-date queries:
SELECT laws_lawbook.revision_date ... WHERE code = ... ORDER BY revision_date DESC(executed twice)
Likely cause:
templates call
book.get_revision_datesmultiple times (iteration + length check), e.g.laws/book.html:43andlaws/book.html:49LawBook.get_revision_dates()returns a queryset each time and does not cache/materialize (oldp/apps/laws/models.py:186)
This is a smaller issue than the book.sections N+1 and mostly affects cold requests, but it is still a clean optimization candidate (cache on instance or materialize once in the view).
H. Implemented Follow-up Fix for Law Pages and Re-Profiling Results
A follow-up patch was applied to remove the template-driven duplicate method calls on law pages without undefering book__sections on every Law row.
Implemented changes
view_book(...)materialize
revision_dates = list(book.get_revision_dates())oncematerialize
itemslist oncecompute
item.display_sectionfrombook.get_sections()once in the view (avoiditem.get_section()in template loop)
view_law(...)materialize
revision_dates = list(book.get_revision_dates())oncematerialize
related_laws = item.get_related()once
templates updated to use
revision_dates,related_laws, anditem.display_section
This removes:
the
/law/gg/coldLawBook.sectionsN+1duplicate
get_revision_dates()calls in both law templatesduplicate
item.get_related()calls inlaws/law.html
Re-profiled results (same temp DB, DevConfiguration)
Endpoint |
Before (cold -> warm) |
After (cold -> warm) |
|---|---|---|
|
|
|
|
|
|
|
|
|
|
|
|
Silk application SQL (post-fix)
/law/gg/cold:
meta_num_queries = 5(down from226)warm:
meta_num_queries = 1(profiling write only)warm Silk SQL count:
0
/law/gg/artikel-1cold:
meta_num_queries = 11(down from14)warm:
meta_num_queries = 1(profiling write only)warm Silk SQL count:
0
Remaining cold law-page queries after the fix
The largest law-book cold-path issue is resolved. Remaining cold queries on /law/gg/ are now small and expected (book lookup, count check, revision dates, law list query).
On /law/gg/artikel-1, the remaining cold queries include:
law + book fetches
related laws query
referencing cases query
revision-date query
These are much smaller than the removed book.sections N+1 and mostly only matter on cache misses.
I. Final Consolidated Post-Fix Sweep (API + Frontend + Laws + Search)
After the case-detail and law-page follow-up fixes, a final DevConfiguration profiling sweep was run against a fresh file-cache directory over:
/api//api/cases//api/laws//api/law_books///case//case/foo-case/law/gg//law/gg/artikel-1/search/?q=gg/search/autocomplete?q=gg
Header/timing results (cold -> warm)
Endpoint |
Querycount |
Time |
|---|---|---|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Silk request metadata summary (application SQL)
For the latest warm requests in this sweep:
all endpoints above had
warm_sql_count = 0in Silkmeta_num_queries = 1for most warm endpoints (profiling instrumentation write)querycountheader values > 1 on warm responses are instrumentation/middleware overhead, not app SQL
Selected cold meta_num_queries from this final sweep:
/api/cases/:3/api/laws/:3/api/law_books/:4/:5/case/:4/case/foo-case:6/law/gg/:6/law/gg/artikel-1:10
Practical conclusion
For the anonymous request paths profiled here, the current branch has reached a strong state:
warm cache hits are effectively database-free across API and frontend pages tested
remaining performance work is mostly cold-path optimization and CPU/template/render work
J. Profiling Still Worth Doing (Not Completed in This Session)
The highest-value remaining profiling work is:
Search with a real Elasticsearch backend available
current local search timings reflect fallback/error-handling behavior because Elasticsearch was unreachable
true search miss/hit performance (including facet processing under real results) still needs validation
Authenticated and staff variants
cache variance/correctness has been fixed, but performance for:
authenticated non-staff users
staff users (extra annotation/marker queries and controls)
has not been profiled end-to-end in this session
CPU profiling on warm pages
DB work is mostly eliminated on warm hits
next likely costs: template render, marker insertion, middleware stack, compression
Silk Python profiler reported environment conflicts earlier (
Another profiling tool is already active), so a dedicated CPU-profiler pass may need a cleaner runtime
K. Phase 1 (Conservative) Implementation Update: Anonymous Search/API Speed
Implemented Django-only anonymous optimizations focused on public search endpoints and shared request patterns:
Added dispatch caching + safe
Varyheaders to public search APIs:/api/cases/search//api/laws/search/
Hardened autocomplete cache behavior:
normalized query handling (trimmed input)
blank-query short-circuit (
{"results": []})cache key now varies by host and language
cache key normalization is case-insensitive (improves hit rate for anon traffic)
Small cold-path optimization for court case list pages:
CourtCasesListViewnow fetchescourtwithselect_related("state")
Validation completed in this session
Targeted tests (mocked search backend, no Elasticsearch required):
search API cache headers + response caching behavior
autocomplete normalization cache hit behavior
autocomplete blank-query short-circuit
autocomplete cache-key host/language variance
Validation still pending (for this phase)
ES-backed profiling of
/api/cases/search/and/api/laws/search/under a reachable Elasticsearch backendlocal environment still lacks reachable ES, so only mocked/test validation was performed for the search APIs
Test command used
source .venv/bin/activate
python manage.py test \
oldp.apps.search.tests.test_api_cache_headers \
oldp.apps.search.tests.test_views.MockedSearchViewsTestCase.test_autocomplete_blank_query_short_circuits \
oldp.apps.search.tests.test_views.MockedSearchViewsTestCase.test_autocomplete_cache_key_varies_by_host_and_language \
oldp.apps.search.tests.test_views.MockedSearchViewsTestCase.test_autocomplete_cache_normalizes_query \
--verbosity 2
L. Phase 2 (Conservative) Implementation Update: Public Docs/Autocomplete + Small View Wins
Implemented additional low-risk anonymous optimizations that do not require Elasticsearch:
Wrapped public API schema/docs routes with
cache_per_role(settings.CACHE_TTL):/api/schema.json/api/schema.yaml/api/schema/(Swagger UI)/api/docs/(ReDoc)
Wrapped courts autocomplete endpoints with
cache_per_role(settings.CACHE_TTL):/court/autocomplete//court/autocomplete/state/
Small cold-path query reduction:
CourtCasesListViewcourt lookup now usesselect_related("state")
Homepage anonymous rendering tightened:
latest cases query is sliced in the view (only fetch 10 rows)
homepage count cache keys now vary by host/language
Phase 2 profiling run (DevConfiguration, temp SQLite, no ES dependency)
Measured endpoints (cold -> warm):
Endpoint |
Querycount |
Time |
|---|---|---|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Silk application SQL summary (Phase 2 endpoints)
All warm requests above had:
warm_sql_count = 0(application SQL)meta_num_queries = 1for most endpoints (profiling instrumentation write)
Selected cold meta_num_queries:
/api/schema.json:1/api/schema.yaml:1/court/autocomplete/:2/court/autocomplete/state/:3/court/ag-aalen/:4
Important header caveat for drf-yasg schema/docs
Even though internal Django caching now speeds up repeated schema/docs requests, drf-yasg responses still emit cache-preventing headers such as:
Cache-Control: max-age=0, no-cache, no-store, must-revalidate, private
Implication:
internal server-side caching works (as shown by timing improvements and Silk warm SQL = 0)
downstream/browser caching is still effectively disabled for these endpoints
If desired later, a follow-up can adjust schema/docs response headers explicitly (carefully) to enable browser/proxy caching.
Caveats
django-querycountnumbers include profiling middleware/instrumentation overhead (especially Silk writes), so warm cached requests will not necessarily show0.Timing is local development timing (SQLite + dev server), useful for relative comparisons and regression detection, not production latency estimates.
Files Changed During This Profiling Work (Relevant to Findings)
oldp/settings.py(profiling/querycount runtime activation + querycount config keys)oldp/utils/cache_per_user.py(cache correctness + variance)oldp/apps/search/views.py(facet caching)oldp/apps/laws/api_views.py(LawBookViewSet.defer(...))oldp/apps/cases/api_views.py(API cache vary headers)oldp/api/views.py(API cache vary headers)oldp/apps/cases/views.py(case detail caching improvements)oldp/apps/cases/templates/cases/case.html(avoid template-triggered related query)