Performance Profiling Methodology and Findings (2026-02-22)

This document summarizes the profiling methodology and findings for the current feature branch performance work, including cache correctness fixes, API caching verification, and frontend/case-page profiling.

Scope

Profiled and verified:

API endpoints:
- /api/
- /api/cases/
- /api/laws/
- /api/law_books/
Frontend pages:
- / (frontpage)
- /case/ (case list)
- /case/foo-case (case detail, fixture data)

The work included both correctness fixes (cache variance/safety) and performance improvements (query reductions, caching).

Environment and Tooling

Date: February 22, 2026
Python env: .venv
Profiling packages installed with uv:
- django-silk
- django-querycount
- gprof2dot

Commands used to install profiling packages

source .venv/bin/activate
uv pip install django-silk==5.4.3 django-querycount==0.8.3 gprof2dot

Important Runtime Setup Notes

Env variable names (important)

django-configurations uses different env names for different settings types in this project:

Database override: DATABASE_URL=...
Profiling toggles: DJANGO_PROFILING_ENABLED=1
Querycount toggle: DJANGO_QUERYCOUNT_ENABLED=1
File cache location: DJANGO_FILE_CACHE_LOCATION=/tmp/...

Using DJANGO_DATABASE_URL does not override the DB here.

Temp SQLite DB location under escalated commands

When profiling with escalated commands, the process did not see the same SQLite file contents created in the sandbox workspace path. The reliable approach was:

create/migrate/load the profiling DB in /tmp
run DevConfiguration against /tmp/oldp-dev-profile.sqlite3

Methodology

1. Validate profiling middleware stack

Enabled Silk + QueryCount via env vars
Verified:
- middleware activation
- /silk/ route registration
- X-DjangoQueryCount-* headers on responses
- Silk request recording in DB

2. Build a temporary profiling DB

Using a fresh SQLite DB for reproducible profiling:

TMPDIR=/tmp SQLITE_TMPDIR=/tmp \
DJANGO_CONFIGURATION=DevConfiguration \
DATABASE_URL=sqlite:////tmp/oldp-dev-profile.sqlite3 \
DJANGO_PROFILING_ENABLED=1 \
DJANGO_QUERYCOUNT_ENABLED=1 \
DJANGO_FILE_CACHE_LOCATION=/tmp/django_cache_devprofile \
.venv/bin/python manage.py migrate --noinput

Load fixture data:

TMPDIR=/tmp SQLITE_TMPDIR=/tmp \
DJANGO_CONFIGURATION=DevConfiguration \
DATABASE_URL=sqlite:////tmp/oldp-dev-profile.sqlite3 \
DJANGO_PROFILING_ENABLED=1 \
DJANGO_QUERYCOUNT_ENABLED=1 \
DJANGO_FILE_CACHE_LOCATION=/tmp/django_cache_devprofile \
.venv/bin/python manage.py loaddata \
  locations/countries.json \
  locations/states.json \
  locations/cities.json \
  courts/courts.json \
  sources/default.json \
  laws/laws.json \
  cases/cases.json

3. Run profiled local server (`DevConfiguration`)

TMPDIR=/tmp SQLITE_TMPDIR=/tmp \
DJANGO_CONFIGURATION=DevConfiguration \
DATABASE_URL=sqlite:////tmp/oldp-dev-profile.sqlite3 \
DJANGO_PROFILING_ENABLED=1 \
DJANGO_QUERYCOUNT_ENABLED=1 \
DJANGO_FILE_CACHE_LOCATION=/tmp/django_cache_devprofile \
.venv/bin/python manage.py runserver 127.0.0.1:8014 --noreload

4. Measure cold vs warm requests (2-pass)

Two sequential requests per endpoint were issued with curl, capturing headers and timing:

status
X-DjangoQueryCount-Count
Vary
Cache-Control
ETag
time_total

Example:

curl -sS -o /dev/null -D - -w 'time_total=%{time_total}\n' \
  http://127.0.0.1:8014/api/laws/

5. Use Silk SQL traces for root-cause analysis

For hotspots (especially /case/foo-case), Silk SQL traces were inspected to attribute remaining queries to:

view code
model methods
template-triggered calls

Findings

A. `DevConfiguration` profiling (with real cache behavior)

Endpoint results (cold -> warm)

Endpoint	Querycount	Time
`/api/`	`2 -> 2`	`0.173s -> 0.043s`
`/api/cases/`	`6 -> 4`	`0.089s -> 0.042s`
`/api/laws/`	`6 -> 2`	`0.115s -> 0.066s`
`/api/law_books/`	`6 -> 2`	`0.070s -> 0.036s`
`/`	`8 -> 2`	`0.528s -> 0.044s`
`/case/`	`6 -> 2`	`0.109s -> 0.091s`
`/case/foo-case` (before case-detail follow-up optimizations)	`12 -> 8`	`0.091s -> 0.091s`

Interpretation

Cache behavior is working for API and frontend pages under DevConfiguration.
Cached API endpoints show the expected Vary headers (including Authorization, Cookie, Accept-Language, Host).
Frontpage and case list show significant query-count reductions on warm requests.
Case detail remained the main hotspot after the initial pass.

B. Case Detail (`/case/foo-case`) Deep Dive and Optimizations

Warm-request SQL (before follow-up case optimizations)

Silk showed 3 application SQL queries on warm anonymous requests:

public annotation markers (item.get_markers(request))
related cases (item.get_related())
references query (item.get_grouped_references() via template include)

This showed that the existing shared case cache only covered:

case object
reference markers

It did not yet cache template-driven shared data (references, related_cases) or anonymous marker/content rendering.

Optimization 2: Cache anonymous public markers + rendered content

Implemented anonymous-only shared caches for:

public markers list
annotated rendered content (insert_markers(...) result)

This is safe for anonymous users because marker visibility is public-only.

Final `/case/foo-case` result (anonymous)

cold: 12 querycount, ~0.613s
warm: 2 querycount, ~0.049s

Silk warm SQL after this step:

0 application SQL queries
Remaining querycount is profiling instrumentation writes (Silk)

C. Cache Correctness Fixes Verified During Profiling

The branch now includes cache correctness protections that were important to verify while profiling:

API cache variance includes:
- Authorization
- Cookie
- Accept-Language
- Host
Role-based page cache keys vary by:
- method
- host
- language
- path
- role
Role cache skips responses that:
- set cookies
- vary on cookie
- are not cache-safe

This avoids cross-user/session and cross-locale/domain cache contamination while keeping aggressive caching enabled.

D. Remaining Bottlenecks / Next Targets

After the case-detail follow-up optimizations:

Warm anonymous /case/foo-case is effectively DB-free.
Remaining performance work is more likely CPU/template/middleware/compression than DB on warm hits.
Best next profiling target:
- /search/?q=...
- validate search facet cache hit behavior
- inspect cache-miss CPU hotspots (facet processing and template rendering)

E. Continued Profiling: Search and Law Pages (same temp DB, `DevConfiguration`)

This follow-up pass profiled:

/search/?q=gg
/search/autocomplete?q=gg
/law/gg/ (law book page)
/law/gg/artikel-1 (law detail page)

using the same profiling DB and the same DevConfiguration methodology as above, with a fresh file cache directory.

Endpoint results (cold -> warm)

Endpoint	Querycount	Time
`/search/?q=gg`	`2 -> 2`	`0.683s -> 0.042s`
`/search/autocomplete?q=gg`	`2 -> 4`	`0.064s -> 0.036s`
`/law/gg/`	`408 -> 2`	`1.615s -> 0.045s`
`/law/gg/artikel-1`	`22 -> 2`	`0.168s -> 0.034s`

Silk request metadata (application SQL focus)

querycount includes profiling/instrumentation overhead, so Silk request metadata and SQL traces were used to measure application SQL:

/search/?q=gg
- cold: meta_num_queries = 1
- warm: meta_num_queries = 1
- warm Silk SQL count: 0
/search/autocomplete?q=gg
- cold: meta_num_queries = 1
- warm: meta_num_queries = 1
- warm Silk SQL count: 0
/law/gg/
- cold: meta_num_queries = 226
- warm: meta_num_queries = 1
- warm Silk SQL count: 0
/law/gg/artikel-1
- cold: meta_num_queries = 14
- warm: meta_num_queries = 1
- warm Silk SQL count: 0

Interpretation

Search and law pages are all effectively DB-free on warm cache hits (application SQL 0 in Silk traces).
/search/?q=gg shows a large cold-to-warm drop (~683ms -> ~42ms), which confirms the role-level page cache is paying off.
/law/gg/ is now the standout cold-path hotspot (very high SQL query volume before cache).

Search profiling caveat (important)

In this local profiling environment, Elasticsearch was not reachable and the server logged connection failures for the search requests. The search views still returned 200 via the application’s error-handling/fallback path.

Implication:

the search endpoint timings above are still useful for measuring page-cache impact and local fallback rendering behavior
they do not represent true indexed-search performance under a healthy Elasticsearch backend

F. New Bottleneck Identified: Cold `/law/gg/` Page (`LawBook.sections` N+1)

Observed pattern

Silk SQL trace for cold /law/gg/ showed:

203 SQL rows recorded
226 application SQL queries (meta_num_queries)
198 duplicate queries of the same form:

SELECT laws_lawbook.id, laws_lawbook.sections FROM laws_lawbook WHERE laws_lawbook.id = ... LIMIT 21

Root cause

This is a template-driven N+1 on the law book page:

oldp/apps/laws/templates/laws/book.html loops over items and calls item.get_section (oldp/apps/laws/templates/laws/book.html:18, oldp/apps/laws/templates/laws/book.html:21)
Law.get_section() calls self.book.get_sections() (oldp/apps/laws/models.py:378)
get_sections() accesses self.sections (oldp/apps/laws/models.py:142)
In view_book, the queryset defers book__sections via Law.defer_fields_list_view (oldp/apps/laws/views.py:107, oldp/apps/laws/models.py:285)

Because book__sections is deferred, each item.get_section call triggers a lazy refresh of LawBook.sections, causing repeated identical queries.

Recommended fix (next pass)

Either of these should remove the N+1 on cold /law/gg/:

Do not defer book__sections in view_book (or in the list-view defer set used there)
Precompute section labels once in the view (e.g., sections = book.get_sections()) and use that in the template instead of item.get_section

G. Smaller Cold-Path Pattern: Duplicate `get_revision_dates()` Queries on Law Pages

Both cold /law/gg/ and /law/gg/artikel-1 traces show duplicate revision-date queries:

SELECT laws_lawbook.revision_date ... WHERE code = ... ORDER BY revision_date DESC (executed twice)

Likely cause:

templates call book.get_revision_dates multiple times (iteration + length check), e.g. laws/book.html:43 and laws/book.html:49
LawBook.get_revision_dates() returns a queryset each time and does not cache/materialize (oldp/apps/laws/models.py:186)

This is a smaller issue than the book.sections N+1 and mostly affects cold requests, but it is still a clean optimization candidate (cache on instance or materialize once in the view).

H. Implemented Follow-up Fix for Law Pages and Re-Profiling Results

A follow-up patch was applied to remove the template-driven duplicate method calls on law pages without undefering book__sections on every Law row.

Implemented changes

view_book(...)
- materialize revision_dates = list(book.get_revision_dates()) once
- materialize items list once
- compute item.display_section from book.get_sections() once in the view (avoid item.get_section() in template loop)
view_law(...)
- materialize revision_dates = list(book.get_revision_dates()) once
- materialize related_laws = item.get_related() once
templates updated to use revision_dates, related_laws, and item.display_section

This removes:

the /law/gg/ cold LawBook.sections N+1
duplicate get_revision_dates() calls in both law templates
duplicate item.get_related() calls in laws/law.html

Re-profiled results (same temp DB, `DevConfiguration`)

Endpoint	Before (cold -> warm)	After (cold -> warm)
`/law/gg/` querycount	`408 -> 2`	`10 -> 2`
`/law/gg/` time	`1.615s -> 0.045s`	`0.676s -> 0.056s`
`/law/gg/artikel-1` querycount	`22 -> 2`	`20 -> 2`
`/law/gg/artikel-1` time	`0.168s -> 0.034s`	`0.154s -> 0.037s`

Silk application SQL (post-fix)

/law/gg/
- cold: meta_num_queries = 5 (down from 226)
- warm: meta_num_queries = 1 (profiling write only)
- warm Silk SQL count: 0
/law/gg/artikel-1
- cold: meta_num_queries = 11 (down from 14)
- warm: meta_num_queries = 1 (profiling write only)
- warm Silk SQL count: 0

Remaining cold law-page queries after the fix

The largest law-book cold-path issue is resolved. Remaining cold queries on /law/gg/ are now small and expected (book lookup, count check, revision dates, law list query).

On /law/gg/artikel-1, the remaining cold queries include:

law + book fetches
related laws query
referencing cases query
revision-date query

These are much smaller than the removed book.sections N+1 and mostly only matter on cache misses.

I. Final Consolidated Post-Fix Sweep (API + Frontend + Laws + Search)

After the case-detail and law-page follow-up fixes, a final DevConfiguration profiling sweep was run against a fresh file-cache directory over:

/api/
/api/cases/
/api/laws/
/api/law_books/
/
/case/
/case/foo-case
/law/gg/
/law/gg/artikel-1
/search/?q=gg
/search/autocomplete?q=gg

Header/timing results (cold -> warm)

Endpoint	Querycount	Time
`/api/`	`2 -> 2`	`0.162s -> 0.040s`
`/api/cases/`	`6 -> 2`	`0.080s -> 0.036s`
`/api/laws/`	`6 -> 2`	`0.117s -> 0.052s`
`/api/law_books/`	`6 -> 2`	`0.067s -> 0.039s`
`/`	`8 -> 2`	`0.514s -> 0.041s`
`/case/`	`6 -> 2`	`0.158s -> 0.045s`
`/case/foo-case`	`14 -> 2`	`0.118s -> 0.049s`
`/law/gg/`	`10 -> 2`	`0.121s -> 0.038s`
`/law/gg/artikel-1`	`20 -> 2`	`0.204s -> 0.035s`
`/search/?q=gg`	`2 -> 2`	`0.088s -> 0.046s`
`/search/autocomplete?q=gg`	`2 -> 4`	`0.061s -> 0.040s`

Silk request metadata summary (application SQL)

For the latest warm requests in this sweep:

all endpoints above had warm_sql_count = 0 in Silk
meta_num_queries = 1 for most warm endpoints (profiling instrumentation write)
querycount header values > 1 on warm responses are instrumentation/middleware overhead, not app SQL

Selected cold meta_num_queries from this final sweep:

/api/cases/: 3
/api/laws/: 3
/api/law_books/: 4
/: 5
/case/: 4
/case/foo-case: 6
/law/gg/: 6
/law/gg/artikel-1: 10

Practical conclusion

For the anonymous request paths profiled here, the current branch has reached a strong state:

warm cache hits are effectively database-free across API and frontend pages tested
remaining performance work is mostly cold-path optimization and CPU/template/render work

J. Profiling Still Worth Doing (Not Completed in This Session)

The highest-value remaining profiling work is:

Search with a real Elasticsearch backend available
- current local search timings reflect fallback/error-handling behavior because Elasticsearch was unreachable
- true search miss/hit performance (including facet processing under real results) still needs validation
Authenticated and staff variants
- cache variance/correctness has been fixed, but performance for:
  - authenticated non-staff users
  - staff users (extra annotation/marker queries and controls)
- has not been profiled end-to-end in this session
CPU profiling on warm pages
- DB work is mostly eliminated on warm hits
- next likely costs: template render, marker insertion, middleware stack, compression
- Silk Python profiler reported environment conflicts earlier (Another profiling tool is already active), so a dedicated CPU-profiler pass may need a cleaner runtime

K. Phase 1 (Conservative) Implementation Update: Anonymous Search/API Speed

Implemented Django-only anonymous optimizations focused on public search endpoints and shared request patterns:

Added dispatch caching + safe Vary headers to public search APIs:
- /api/cases/search/
- /api/laws/search/
Hardened autocomplete cache behavior:
- normalized query handling (trimmed input)
- blank-query short-circuit ({"results": []})
- cache key now varies by host and language
- cache key normalization is case-insensitive (improves hit rate for anon traffic)
Small cold-path optimization for court case list pages:
- CourtCasesListView now fetches court with select_related("state")

Validation completed in this session

Targeted tests (mocked search backend, no Elasticsearch required):

search API cache headers + response caching behavior
autocomplete normalization cache hit behavior
autocomplete blank-query short-circuit
autocomplete cache-key host/language variance

Validation still pending (for this phase)

ES-backed profiling of /api/cases/search/ and /api/laws/search/ under a reachable Elasticsearch backend
- local environment still lacks reachable ES, so only mocked/test validation was performed for the search APIs

Test command used

source .venv/bin/activate
python manage.py test \
  oldp.apps.search.tests.test_api_cache_headers \
  oldp.apps.search.tests.test_views.MockedSearchViewsTestCase.test_autocomplete_blank_query_short_circuits \
  oldp.apps.search.tests.test_views.MockedSearchViewsTestCase.test_autocomplete_cache_key_varies_by_host_and_language \
  oldp.apps.search.tests.test_views.MockedSearchViewsTestCase.test_autocomplete_cache_normalizes_query \
  --verbosity 2

L. Phase 2 (Conservative) Implementation Update: Public Docs/Autocomplete + Small View Wins

Implemented additional low-risk anonymous optimizations that do not require Elasticsearch:

Wrapped public API schema/docs routes with cache_per_role(settings.CACHE_TTL):
- /api/schema.json
- /api/schema.yaml
- /api/schema/ (Swagger UI)
- /api/docs/ (ReDoc)
Wrapped courts autocomplete endpoints with cache_per_role(settings.CACHE_TTL):
- /court/autocomplete/
- /court/autocomplete/state/
Small cold-path query reduction:
- CourtCasesListView court lookup now uses select_related("state")
Homepage anonymous rendering tightened:
- latest cases query is sliced in the view (only fetch 10 rows)
- homepage count cache keys now vary by host/language

Phase 2 profiling run (`DevConfiguration`, temp SQLite, no ES dependency)

Measured endpoints (cold -> warm):

Endpoint	Querycount	Time
`/api/schema.json`	`2 -> 2`	`1.599s -> 0.055s`
`/api/schema.yaml`	`4 -> 2`	`1.564s -> 0.046s`
`/api/schema/`	`2 -> 2`	`0.057s -> 0.043s`
`/api/docs/`	`4 -> 2`	`0.046s -> 0.044s`
`/court/autocomplete/?q=ag`	`4 -> 2`	`0.045s -> 0.031s`
`/court/autocomplete/state/?q=ba`	`6 -> 2`	`0.047s -> 0.031s`
`/court/ag-aalen/`	`6 -> 2`	`0.427s -> 0.033s`

Silk application SQL summary (Phase 2 endpoints)

All warm requests above had:

warm_sql_count = 0 (application SQL)
meta_num_queries = 1 for most endpoints (profiling instrumentation write)

Selected cold meta_num_queries:

/api/schema.json: 1
/api/schema.yaml: 1
/court/autocomplete/: 2
/court/autocomplete/state/: 3
/court/ag-aalen/: 4

Important header caveat for drf-yasg schema/docs

Even though internal Django caching now speeds up repeated schema/docs requests, drf-yasg responses still emit cache-preventing headers such as:

Cache-Control: max-age=0, no-cache, no-store, must-revalidate, private

Implication:

internal server-side caching works (as shown by timing improvements and Silk warm SQL = 0)
downstream/browser caching is still effectively disabled for these endpoints

If desired later, a follow-up can adjust schema/docs response headers explicitly (carefully) to enable browser/proxy caching.

Caveats

django-querycount numbers include profiling middleware/instrumentation overhead (especially Silk writes), so warm cached requests will not necessarily show 0.
Timing is local development timing (SQLite + dev server), useful for relative comparisons and regression detection, not production latency estimates.

Files Changed During This Profiling Work (Relevant to Findings)

oldp/settings.py (profiling/querycount runtime activation + querycount config keys)
oldp/utils/cache_per_user.py (cache correctness + variance)
oldp/apps/search/views.py (facet caching)
oldp/apps/laws/api_views.py (LawBookViewSet.defer(...))
oldp/apps/cases/api_views.py (API cache vary headers)
oldp/api/views.py (API cache vary headers)
oldp/apps/cases/views.py (case detail caching improvements)
oldp/apps/cases/templates/cases/case.html (avoid template-triggered related query)

Performance Profiling Methodology and Findings (2026-02-22)

Scope

Environment and Tooling

Commands used to install profiling packages

Important Runtime Setup Notes

Env variable names (important)

Temp SQLite DB location under escalated commands

Methodology

1. Validate profiling middleware stack

2. Build a temporary profiling DB

3. Run profiled local server (DevConfiguration)

4. Measure cold vs warm requests (2-pass)

5. Use Silk SQL traces for root-cause analysis

Findings

A. DevConfiguration profiling (with real cache behavior)

Endpoint results (cold -> warm)

Interpretation

B. Case Detail (/case/foo-case) Deep Dive and Optimizations

Warm-request SQL (before follow-up case optimizations)

Optimization 1: Cache shared references + related cases

Optimization 2: Cache anonymous public markers + rendered content

Final /case/foo-case result (anonymous)

C. Cache Correctness Fixes Verified During Profiling

D. Remaining Bottlenecks / Next Targets

E. Continued Profiling: Search and Law Pages (same temp DB, DevConfiguration)

Endpoint results (cold -> warm)

Silk request metadata (application SQL focus)

Interpretation

Search profiling caveat (important)

F. New Bottleneck Identified: Cold /law/gg/ Page (LawBook.sections N+1)

Observed pattern

Root cause

Recommended fix (next pass)

G. Smaller Cold-Path Pattern: Duplicate get_revision_dates() Queries on Law Pages

H. Implemented Follow-up Fix for Law Pages and Re-Profiling Results

Implemented changes

Re-profiled results (same temp DB, DevConfiguration)

Silk application SQL (post-fix)

Remaining cold law-page queries after the fix

I. Final Consolidated Post-Fix Sweep (API + Frontend + Laws + Search)

Header/timing results (cold -> warm)

Silk request metadata summary (application SQL)

Practical conclusion

J. Profiling Still Worth Doing (Not Completed in This Session)

K. Phase 1 (Conservative) Implementation Update: Anonymous Search/API Speed

Validation completed in this session

Validation still pending (for this phase)

Test command used

L. Phase 2 (Conservative) Implementation Update: Public Docs/Autocomplete + Small View Wins

Phase 2 profiling run (DevConfiguration, temp SQLite, no ES dependency)

Silk application SQL summary (Phase 2 endpoints)

Important header caveat for drf-yasg schema/docs

Caveats

Files Changed During This Profiling Work (Relevant to Findings)

3. Run profiled local server (`DevConfiguration`)

A. `DevConfiguration` profiling (with real cache behavior)

B. Case Detail (`/case/foo-case`) Deep Dive and Optimizations

Final `/case/foo-case` result (anonymous)

E. Continued Profiling: Search and Law Pages (same temp DB, `DevConfiguration`)

F. New Bottleneck Identified: Cold `/law/gg/` Page (`LawBook.sections` N+1)

G. Smaller Cold-Path Pattern: Duplicate `get_revision_dates()` Queries on Law Pages

Re-profiled results (same temp DB, `DevConfiguration`)

Phase 2 profiling run (`DevConfiguration`, temp SQLite, no ES dependency)