Performance Optimization Audit

This document identifies performance bottlenecks in OLDP and provides actionable, code-level recommendations. Findings are organized by severity.

Context: OLDP serves legal documents (cases, laws, courts). Data is updated approximately once per week, making it an excellent candidate for aggressive caching.

Profiling Report

Measured profiling methodology and branch verification results (API + frontend + case detail) are documented in:

Performance Profiling Methodology and Findings (2026-02-22)

Infrastructure-level tuning beyond Django (Gunicorn, Nginx, MariaDB, Elasticsearch, Redis, plus Docker Compose examples) is documented in:

Performance Optimization Beyond Django

1. Database Query Issues (HIGH)

1.2 N+1 in `get_content_as_html()`

oldp/apps/cases/models.py:202-224 — Calls get_reference_markers() and get_markers(), each triggering separate DB queries. This method is invoked on every case detail page view (oldp/apps/cases/views.py:91).

Fix: Prefetch reference markers and annotation markers on the case queryset in the detail view, or cache the rendered HTML output per case.

1.3 Inefficient `get_latest_law_book()`

oldp/apps/laws/views.py:61-75 — Uses len(candidates) which forces full queryset evaluation instead of using .exists() or .first().

Fix:

def get_latest_law_book(book_slug):
    candidate = LawBook.objects.filter(slug=book_slug, latest=True).first()
    if candidate is None:
        logger.info("Law book not found: %s", book_slug)
        raise Http404()
    # Check for duplicates (should not happen)
    count = LawBook.objects.filter(slug=book_slug, latest=True).count()
    if count > 1:
        logger.warning("Book has more than one instance with latest=true: %s", book_slug)
    return candidate

1.4 Uncached `Law.get_next()` and `has_next()`

oldp/apps/laws/models.py:328-347 — Each call to get_next() or has_next() triggers a separate DB query. When both are called (e.g., in templates), two queries are executed for the same information.

Fix: Cache the result of get_next() on the instance and derive has_next() from it:

def get_next(self):
    if not hasattr(self, "_next_cache"):
        try:
            self._next_cache = Law.objects.get(previous=self.id)
        except Law.DoesNotExist:
            self._next_cache = None
        except Law.MultipleObjectsReturned:
            logger.error(f"Multiple laws found with previous={self.id}")
            self._next_cache = Law.objects.filter(previous=self.id).first()
    return self._next_cache

def has_next(self):
    return self.get_next() is not None

1.5 `Law.get_latest_revision_url()` — 2 DB Queries Per Call

oldp/apps/laws/models.py:397-414 — Fetches the latest book and then checks for law existence on every call. Particularly expensive if called in list contexts.

Fix: Cache on the instance, or avoid calling in list views entirely.

1.6 Missing `select_related` in API Views

Location	Missing	Fix
`oldp/apps/laws/api_views.py:49-50`	`LawViewSet.get_queryset()`	Add `.select_related("book")`
`oldp/apps/annotations/api_views.py:32-33`	`AnnotationLabelViewSet.get_queryset()`	Add `.select_related("owner")`
`oldp/apps/annotations/api_views.py:62-64`	`CaseAnnotationViewSet` queryset	Add `"label__owner"` to existing `select_related`

1.7 Missing `defer()` on Heavy Text Fields

Location	Issue	Fix
`oldp/apps/laws/views.py:107`	`view_book()` loads all Law fields including `content` for list display	Add `.defer("content", "footnotes")`
`oldp/apps/laws/api_views.py:97`	`LawBookViewSet` loads `changelog`, `footnotes`, `sections` (large JSON)	Add `.defer("changelog", "footnotes", "sections")` for list action
`oldp/apps/laws/sitemaps.py:10-11`	`LawSitemap` queryset loads `content`	Add `.defer("content", "footnotes")`

Note: CaseSitemap (oldp/apps/cases/sitemaps.py:10-12) correctly uses .defer(*Case.defer_fields_list_view).

2. Caching Gaps (HIGH)

2.1 Views Missing Cache Entirely

View	Location	Impact
`CourtCasesListView`	`oldp/apps/courts/views.py:55-84`	DB queries on every page load
`CustomSearchView`	`oldp/apps/search/views.py:55-87`	Elasticsearch queries on every request
`autocomplete_view`	`oldp/apps/search/views.py:200-212`	New `SearchQuerySet` on every keystroke

Fix for CourtCasesListView: Add @method_decorator(cache_page(settings.CACHE_TTL)) or use the @cache_per_user decorator.

Fix for autocomplete_view: Cache results by query string in Django’s cache framework:

from django.core.cache import cache

def autocomplete_view(request):
    query = request.GET.get("q", "")
    cache_key = f"autocomplete:{query}"
    suggestions = cache.get(cache_key)
    if suggestions is None:
        try:
            sqs = SearchQuerySet().autocomplete(title=query)[:5]
            suggestions = [result.title for result in sqs]
        except Exception as e:
            logger.error("Autocomplete search failed: %s", str(e))
            suggestions = []
        cache.set(cache_key, suggestions, timeout=settings.CACHE_TTL)
    return JsonResponse({"results": suggestions})

2.2 API Endpoints with Insufficient or No Caching

Endpoint	Location	Current	Recommended
Case API	`oldp/apps/cases/api_views.py:88`	60 seconds	15 minutes (`settings.CACHE_TTL`)
Law API	`oldp/apps/laws/api_views.py`	None	Add `cache_page(settings.CACHE_TTL)` on dispatch
LawBook API	`oldp/apps/laws/api_views.py`	None	Add `cache_page(settings.CACHE_TTL)` on dispatch
Court API	`oldp/api/views.py:19-43`	None	Add `cache_page(settings.CACHE_TTL)` on dispatch
City API	`oldp/api/views.py:91-97`	None	Add `cache_page(settings.CACHE_TTL)` on dispatch
State API	`oldp/api/views.py:100-106`	None	Add `cache_page(settings.CACHE_TTL)` on dispatch

2.3 No Template Fragment Caching

Zero {% cache %} tags found across all templates. Static fragments like the navbar, footer, and sidebar re-render on every request.

Fix: Wrap stable template fragments with Django’s {% cache %} tag:

{% load cache %}
{% cache 3600 navbar %}
  ... navbar HTML ...
{% endcache %}

2.4 Homepage Counts on Every Cache Miss

oldp/apps/homepage/views.py:24-25 — Law.objects.all().count() and Case.get_queryset(request).count() execute on every cache miss. The @cache_per_user decorator mitigates this but each unique user still triggers these queries.

Fix: Cache the counts separately with a longer TTL (e.g., 1 hour), since exact counts are not critical:

from django.core.cache import cache

laws_count = cache.get("homepage_laws_count")
if laws_count is None:
    laws_count = Law.objects.count()
    cache.set("homepage_laws_count", laws_count, timeout=3600)

3. Elasticsearch / Search (MEDIUM)

3.1 `datetime.now()` in Queryset

oldp/apps/search/views.py:82 — datetime.datetime.now() is called on every search request for the date facet end boundary.

Fix: Use a fixed date or compute it once per day via caching.

3.3 Autocomplete Without Caching

oldp/apps/search/views.py:206 — Creates a new SearchQuerySet() per request with no caching. See fix in section 2.1 above.

4. Middleware & HTTP (MEDIUM)

4.1 GZip Compression Disabled

oldp/settings.py:138 — GZipMiddleware is commented out. Responses are sent uncompressed, increasing transfer sizes.

Fix: Uncomment 'django.middleware.gzip.GZipMiddleware' and place it first in the middleware list. Alternatively, enable gzip at the reverse proxy level (nginx).

4.2 No Conditional Request Support

No ConditionalGetMiddleware is configured. The server cannot return 304 Not Modified for unchanged content, forcing full response re-transmission.

Fix: Add 'django.middleware.http.ConditionalGetMiddleware' to MIDDLEWARE.

4.3 FlatpageFallbackMiddleware on Every 404

oldp/settings.py:137 — FlatpageFallbackMiddleware triggers a DB query on every 404 response to check for a matching flatpage.

Impact: Low on most requests but adds latency on every 404 (bots, broken links, etc.).

4.4 Static File Hashing Disabled

oldp/settings.py:291 — CompressedManifestStaticFilesStorage (whitenoise) is commented out. Static files are served without content hashes, preventing long-lived browser cache headers.

Fix: Uncomment the whitenoise storage backend:

STATICFILES_STORAGE = 'whitenoise.storage.CompressedManifestStaticFilesStorage'

5. Context Processor Overhead (LOW)

oldp/apps/lib/context_processors.py:31 — reverse("flatpages", ...) is called on every request to resolve the API info URL.

oldp/apps/lib/context_processors.py:35 — get_version() is called on every request.

Fix: Compute these once at module load time:

_API_INFO_URL = None
_APP_VERSION = None

def global_context_processor(request):
    global _API_INFO_URL, _APP_VERSION
    if _API_INFO_URL is None:
        _API_INFO_URL = reverse("flatpages", kwargs={"url": "/api/"})
    if _APP_VERSION is None:
        _APP_VERSION = get_version()
    return {
        ...
        "api_info_url": _API_INFO_URL,
        "app_version": _APP_VERSION,
    }

6. SQL Injection Risk + Performance (LOW)

oldp/apps/sources/views.py:43 — Raw SQL with Python string formatting:

where_clause = ' WHERE c.created_date > "{}"'.format(diff_str)

This is a SQL injection vulnerability. Although diff_str is derived from datetime.timedelta, the pattern is dangerous and should be replaced.

Fix: Use parameterized queries:

if "delta" in date_range:
    diff = today - datetime.timedelta(**date_range["delta"])
    where_clause = " WHERE c.created_date > %s"
    params = [diff.strftime("%Y-%m-%d")]
else:
    where_clause = ""
    params = []

# ...
cursor.execute(query, params)

7. Deployment / Infrastructure Recommendations

7.1 Cache Backend

The default cache backend is file-based (oldp/settings.py:255). For production, use Redis for significantly better performance:

CACHES = {
    "default": {
        "BACKEND": "django.core.cache.backends.redis.RedisCache",
        "LOCATION": "redis://127.0.0.1:6379",
    }
}

7.2 Cache TTL

CACHE_TTL = 60 * 15 (15 minutes) at oldp/settings.py:254. Given weekly data updates, this could be extended to 1-4 hours for most views, with explicit cache invalidation on data import.

7.3 Database Query Logging

Enable query logging in development to catch N+1 issues early:

# settings_dev.py
LOGGING["loggers"]["django.db.backends"] = {
    "level": "DEBUG",
    "handlers": ["console"],
}

Or use django-debug-toolbar to inspect query counts per request.

Summary — Priority Matrix

Priority	Finding	Effort	Impact
HIGH	N+1 in `get_related()` (1.1)	Low	High — reduces queries from N+1 to 1
HIGH	Missing caching on search views (2.1)	Low	High — ES queries are expensive
HIGH	API caching gaps (2.2)	Low	High — repeated identical API calls
HIGH	Missing `select_related` in API views (1.6)	Low	Medium — extra query per serialized object
HIGH	Missing `defer()` on heavy fields (1.7)	Low	Medium — reduces memory and transfer
MEDIUM	GZip compression disabled (4.1)	Low	Medium — reduces response sizes ~60-70%
MEDIUM	Static file hashing disabled (4.4)	Low	Medium — enables long-lived browser caching
MEDIUM	Search facet processing (3.2)	Medium	Medium — heavy Python on every search
MEDIUM	Template fragment caching (2.3)	Medium	Medium — avoids re-rendering static HTML
LOW	Context processor overhead (5)	Low	Low — minor per-request cost
LOW	SQL injection in sources (6)	Low	Low (security fix, staff-only view)
LOW	`get_latest_law_book` queryset eval (1.3)	Low	Low — minor per-request cost