Performance Optimization Audit
This document identifies performance bottlenecks in OLDP and provides actionable, code-level recommendations. Findings are organized by severity.
Context: OLDP serves legal documents (cases, laws, courts). Data is updated approximately once per week, making it an excellent candidate for aggressive caching.
Profiling Report
Measured profiling methodology and branch verification results (API + frontend + case detail) are documented in:
Infrastructure-level tuning beyond Django (Gunicorn, Nginx, MariaDB, Elasticsearch, Redis, plus Docker Compose examples) is documented in:
1. Database Query Issues (HIGH)
1.2 N+1 in get_content_as_html()
oldp/apps/cases/models.py:202-224 — Calls get_reference_markers() and
get_markers(), each triggering separate DB queries. This method is invoked on
every case detail page view (oldp/apps/cases/views.py:91).
Fix: Prefetch reference markers and annotation markers on the case queryset in the detail view, or cache the rendered HTML output per case.
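To make the effect of prefetching concrete, here is a minimal stdlib sketch. It uses sqlite3 and a hypothetical `markers` table, not the OLDP models or the Django ORM. It contrasts one lookup per case with a single batched `IN` query grouped in Python, which is roughly the strategy Django's `prefetch_related()` applies.

```python
import sqlite3

# Hypothetical schema standing in for cases and their markers.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE markers (id INTEGER PRIMARY KEY, case_id INTEGER, text TEXT);
    INSERT INTO markers (case_id, text) VALUES (1, 'm1'), (1, 'm2'), (2, 'm3');
    """
)

query_count = 0


def run(sql, params=()):
    """Execute one statement and count it, like a query log would."""
    global query_count
    query_count += 1
    return conn.execute(sql, params).fetchall()


def markers_n_plus_one(case_ids):
    """One query per case: the pattern to avoid."""
    return {
        cid: [text for (text,) in run(
            "SELECT text FROM markers WHERE case_id = ? ORDER BY id", (cid,))]
        for cid in case_ids
    }


def markers_batched(case_ids):
    """One query for all cases, grouped in Python afterwards."""
    placeholders = ",".join("?" for _ in case_ids)
    grouped = {cid: [] for cid in case_ids}
    rows = run(
        "SELECT case_id, text FROM markers "
        f"WHERE case_id IN ({placeholders}) ORDER BY id",
        list(case_ids),
    )
    for case_id, text in rows:
        grouped[case_id].append(text)
    return grouped


slow = markers_n_plus_one([1, 2, 3])
queries_per_case = query_count  # 3: one query per case
query_count = 0
fast = markers_batched([1, 2, 3])
queries_batched = query_count  # 1: a single IN query
```

Both functions return the same mapping; only the number of round trips differs, and the gap grows with the number of cases.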
1.3 Inefficient get_latest_law_book()
oldp/apps/laws/views.py:61-75 — Uses len(candidates) which forces full queryset
evaluation instead of using .exists() or .first().
Fix:
def get_latest_law_book(book_slug):
    # Fetch at most two rows in a single query: enough to detect duplicates.
    candidates = list(
        LawBook.objects.filter(slug=book_slug, latest=True).order_by("pk")[:2]
    )
    if not candidates:
        logger.info("Law book not found: %s", book_slug)
        raise Http404()
    if len(candidates) > 1:
        # Should not happen; log it and fall back to the first match.
        logger.warning("Book has more than one instance with latest=true: %s", book_slug)
    return candidates[0]
1.4 Uncached Law.get_next() and has_next()
oldp/apps/laws/models.py:328-347 — Each call to get_next() or has_next()
triggers a separate DB query. When both are called (e.g., in templates), two
queries are executed for the same information.
Fix: Cache the result of get_next() on the instance and derive has_next()
from it:
def get_next(self):
    if not hasattr(self, "_next_cache"):
        try:
            self._next_cache = Law.objects.get(previous=self.id)
        except Law.DoesNotExist:
            self._next_cache = None
        except Law.MultipleObjectsReturned:
            logger.error(f"Multiple laws found with previous={self.id}")
            self._next_cache = Law.objects.filter(previous=self.id).first()
    return self._next_cache

def has_next(self):
    return self.get_next() is not None
1.5 Law.get_latest_revision_url() — 2 DB Queries Per Call
oldp/apps/laws/models.py:397-414 — Fetches the latest book and then checks for
law existence on every call. Particularly expensive if called in list contexts.
Fix: Cache on the instance, or avoid calling in list views entirely.
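A minimal sketch of per-instance caching, assuming the URL can be treated as fixed for the lifetime of the object. `LawSketch`, its counter, and the URL shape are hypothetical stand-ins, not the real `Law` model.

```python
from functools import cached_property


class LawSketch:
    """Hypothetical stand-in for Law; not the real model."""

    lookups = 0  # counts how often the expensive lookup runs

    def __init__(self, slug):
        self.slug = slug

    def _compute_latest_revision_url(self):
        # Placeholder for the two DB queries described above.
        LawSketch.lookups += 1
        return f"/law/{self.slug}/latest"  # hypothetical URL shape

    @cached_property
    def latest_revision_url(self):
        # Runs once per instance; later accesses reuse the stored value.
        return self._compute_latest_revision_url()


law = LawSketch("bgb")
first = law.latest_revision_url
second = law.latest_revision_url
```

Django also ships `django.utils.functional.cached_property` with the same behavior. Note that this only helps repeated access on one object; in list views each row is a separate instance, so avoiding the call remains the better option there.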
1.7 Missing defer() on Heavy Text Fields
| Location | Issue | Fix |
|---|---|---|
| | | Add |
| | | Add |
| | | Add |
Note: CaseSitemap (oldp/apps/cases/sitemaps.py:10-12) correctly uses
.defer(*Case.defer_fields_list_view).
2. Caching Gaps (HIGH)
2.1 Views Missing Cache Entirely
| View | Location | Impact |
|---|---|---|
| | | DB queries on every page load |
| | | Elasticsearch queries on every request |
| | | New |
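Fix for CourtCasesListView: Decorate the class with
@method_decorator(cache_page(settings.CACHE_TTL), name="dispatch") (the name
argument is required when method_decorator is applied to a class), or use the
@cache_per_user decorator.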
Fix for autocomplete_view: Cache results by query string in Django’s cache
framework:
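from django.conf import settings
from django.core.cache import cache
from django.http import JsonResponse
from haystack.query import SearchQuerySet  # assuming django-haystack provides SearchQuerySet here

def autocomplete_view(request):
    query = request.GET.get("q", "")
    cache_key = f"autocomplete:{query}"
    suggestions = cache.get(cache_key)
    if suggestions is None:
        try:
            sqs = SearchQuerySet().autocomplete(title=query)[:5]
            suggestions = [result.title for result in sqs]
        except Exception as e:
            logger.error("Autocomplete search failed: %s", e)
            # Return without caching so a transient failure is not served for a full TTL.
            return JsonResponse({"results": []})
        cache.set(cache_key, suggestions, timeout=settings.CACHE_TTL)
    return JsonResponse({"results": suggestions})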
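The cache key above embeds the raw query string. Some backends restrict keys: memcached, for example, rejects keys longer than 250 characters or containing whitespace. One hedged option is to normalize and hash the query first. The helper name is hypothetical, and lowercasing assumes autocomplete matching is case-insensitive.

```python
import hashlib


def autocomplete_cache_key(query, prefix="autocomplete"):
    """Build a backend-safe cache key for a user-supplied query (illustrative helper)."""
    # Collapse whitespace and case so trivially different inputs share one entry.
    normalized = " ".join(query.split()).lower()
    # Hash to keep the key short and free of spaces or control characters.
    digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
    return f"{prefix}:{digest}"
```

In the view, `cache_key = autocomplete_cache_key(query)` would replace the f-string. Normalizing also raises the hit rate, since "Miet Recht" and "miet  recht" map to the same entry.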
2.2 API Endpoints with Insufficient or No Caching
| Endpoint | Location | Current | Recommended |
|---|---|---|---|
| Case API | | 60 seconds | 15 minutes ( |
| Law API | | None | Add |
| LawBook API | | None | Add |
| Court API | | None | Add |
| City API | | None | Add |
| State API | | None | Add |
2.3 No Template Fragment Caching
No {% cache %} tags are used in any template. Static fragments like the
navbar, footer, and sidebar re-render on every request.
Fix: Wrap stable template fragments with Django’s {% cache %} tag:
{% load cache %}
{% cache 3600 navbar %}
... navbar HTML ...
{% endcache %}
2.4 Homepage Counts on Every Cache Miss
oldp/apps/homepage/views.py:24-25 — Law.objects.all().count() and
Case.get_queryset(request).count() execute on every cache miss. The
@cache_per_user decorator mitigates this, but because entries are cached per
user, each distinct user still triggers both queries.
Fix: Cache the counts separately with a longer TTL (e.g., 1 hour), since exact counts are not critical:
from django.core.cache import cache

laws_count = cache.get("homepage_laws_count")
if laws_count is None:
    laws_count = Law.objects.count()
    cache.set("homepage_laws_count", laws_count, timeout=3600)
3. Elasticsearch / Search (MEDIUM)
3.1 datetime.now() in Queryset
oldp/apps/search/views.py:82 — datetime.datetime.now() is called on every
search request for the date facet end boundary.
Fix: Use a fixed date or compute it once per day via caching.
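One way to read "compute it once per day" is to round the boundary to a day, so that every search issued on the same day sends an identical date range and the resulting query becomes cacheable. The sketch below assumes the facet only needs day-level precision; the function name is hypothetical.

```python
import datetime


def facet_end_boundary(now=None):
    """Return the start of the next day, so the value changes only once per day."""
    now = now or datetime.datetime.now()
    tomorrow = now.date() + datetime.timedelta(days=1)
    return datetime.datetime.combine(tomorrow, datetime.time.min)


morning = facet_end_boundary(datetime.datetime(2024, 3, 5, 9, 30))
evening = facet_end_boundary(datetime.datetime(2024, 3, 5, 23, 59, 59))
```

Both calls return midnight on 2024-03-06, so the two searches share one date range.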
3.2 Heavy Facet Processing in Python
oldp/apps/search/views.py:88-161 — get_search_facets() performs nested
iteration over facet data, building URL parameters in Python on every search
request.
Fix: Cache the facet processing result keyed by the query + selected facets combination.
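A possible shape for such a key, assuming selected facets arrive as a list of "field:value" strings. Both the helper name and that input format are assumptions. Sorting makes the key independent of the order in which facets were clicked.

```python
import hashlib
import json


def facet_cache_key(query, selected_facets):
    """Deterministic key for a query plus its selected facets (hypothetical helper)."""
    payload = json.dumps(
        {"q": query, "facets": sorted(selected_facets)},
        sort_keys=True,
    )
    return "search_facets:" + hashlib.sha256(payload.encode("utf-8")).hexdigest()


key_a = facet_cache_key("mietrecht", ["court:1", "year:2020"])
key_b = facet_cache_key("mietrecht", ["year:2020", "court:1"])
```

`key_a` and `key_b` are identical, so both requests would hit the same cached facet result.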
3.3 Autocomplete Without Caching
oldp/apps/search/views.py:206 — Creates a new SearchQuerySet() per request
with no caching. See fix in section 2.1 above.
4. Middleware & HTTP (MEDIUM)
4.1 GZip Compression Disabled
oldp/settings.py:138 — GZipMiddleware is commented out. Responses are sent
uncompressed, increasing transfer sizes.
Fix: Uncomment 'django.middleware.gzip.GZipMiddleware' and place it near the top
of the middleware list, ahead of any middleware that reads or modifies the
response body. Alternatively, enable gzip at the reverse proxy level (nginx).
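To estimate what compression would save on a given page before enabling it, the response body can be run through the stdlib gzip module. The markup below is a made-up, highly repetitive sample, not real OLDP output, so it compresses far better than a typical page would.

```python
import gzip


def compression_ratio(body: bytes) -> float:
    """Return compressed size divided by original size (lower is better)."""
    return len(gzip.compress(body)) / len(body)


# Repetitive list markup stands in for a rendered page.
sample_html = (
    "<li class='case'><a href='/case/123/'>Example case title</a></li>\n" * 200
).encode("utf-8")
ratio = compression_ratio(sample_html)
```

Running the same function over a few saved responses from the real site gives a more honest figure than a synthetic sample.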
4.2 No Conditional Request Support
No ConditionalGetMiddleware is configured. The server cannot return 304 Not Modified for unchanged content, forcing full response re-transmission.
Fix: Add 'django.middleware.http.ConditionalGetMiddleware' to MIDDLEWARE.
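The mechanism behind a 304 response can be sketched in a few lines. This is a simplified illustration of ETag validation, not Django's implementation: it ignores weak validators, `If-None-Match: *`, and `Last-Modified`, and the function names are hypothetical.

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Derive a quoted ETag from the response body."""
    return '"' + hashlib.sha256(body).hexdigest() + '"'


def conditional_response(body: bytes, if_none_match=None):
    """Return (status, headers, payload); the payload is empty on a 304."""
    etag = make_etag(body)
    if if_none_match is not None:
        client_tags = [tag.strip() for tag in if_none_match.split(",")]
        if etag in client_tags:
            return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, body


page = b"<html>case 123</html>"
status_first, headers_first, payload_first = conditional_response(page)
status_repeat, _, payload_repeat = conditional_response(page, headers_first["ETag"])
```

The ETag is computed from the rendered body, so the view still runs on every request. The saving is in bytes transferred, which complements server-side caching but does not replace it.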
4.3 FlatpageFallbackMiddleware on Every 404
oldp/settings.py:137 — FlatpageFallbackMiddleware triggers a DB query on every
404 response to check for a matching flatpage.
Impact: Low on most requests but adds latency on every 404 (bots, broken links, etc.).
4.4 Static File Hashing Disabled
oldp/settings.py:291 — CompressedManifestStaticFilesStorage (whitenoise) is
commented out. Static files are served without content hashes, preventing
long-lived browser cache headers.
Fix: Uncomment the whitenoise storage backend:
STATICFILES_STORAGE = 'whitenoise.storage.CompressedManifestStaticFilesStorage'
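STATICFILES_STORAGE was deprecated in Django 4.2 and removed in 5.1 in favor of the STORAGES setting. If the project runs on Django 4.2 or later, the equivalent configuration would look roughly like this. The "default" entry shown is Django's stock file storage and is an assumption about the project's media setup.

```python
# settings.py (Django 4.2+)
STORAGES = {
    "default": {
        # Assumed: the project uses Django's default file storage for media.
        "BACKEND": "django.core.files.storage.FileSystemStorage",
    },
    "staticfiles": {
        "BACKEND": "whitenoise.storage.CompressedManifestStaticFilesStorage",
    },
}
```

Either form requires running `collectstatic` at deploy time so the manifest of hashed file names exists.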
5. Context Processor Overhead (LOW)
oldp/apps/lib/context_processors.py:31 — reverse("flatpages", ...) is called on
every request to resolve the API info URL.
oldp/apps/lib/context_processors.py:35 — get_version() is called on every
request.
Fix: Compute these once, on first use, and reuse the module-level values:
from django.urls import reverse

_API_INFO_URL = None
_APP_VERSION = None

def global_context_processor(request):
    global _API_INFO_URL, _APP_VERSION
    # Resolve lazily: calling reverse() at import time can run before the URLconf is ready.
    if _API_INFO_URL is None:
        _API_INFO_URL = reverse("flatpages", kwargs={"url": "/api/"})
    if _APP_VERSION is None:
        _APP_VERSION = get_version()
    return {
        # ... existing context entries ...
        "api_info_url": _API_INFO_URL,
        "app_version": _APP_VERSION,
    }
6. SQL Injection Risk + Performance (LOW)
oldp/apps/sources/views.py:43 — Raw SQL with Python string formatting:
where_clause = ' WHERE c.created_date > "{}"'.format(diff_str)
This pattern is a SQL injection risk. Although diff_str is currently derived from
datetime.timedelta, string-formatted SQL becomes exploitable as soon as an
untrusted value reaches it, so the pattern should be replaced.
Fix: Use parameterized queries:
if "delta" in date_range:
    diff = today - datetime.timedelta(**date_range["delta"])
    where_clause = " WHERE c.created_date > %s"
    params = [diff.strftime("%Y-%m-%d")]
else:
    where_clause = ""
    params = []
# ...
cursor.execute(query, params)
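The effect of binding parameters can be checked in isolation with the stdlib sqlite3 module. This sketch uses a hypothetical `cases` table and sqlite3's `?` placeholder; Django's cursor uses `%s` as shown above. A hostile string is compared as plain data instead of being parsed as SQL.

```python
import datetime
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cases (id INTEGER PRIMARY KEY, created_date TEXT)")
conn.executemany(
    "INSERT INTO cases (created_date) VALUES (?)",
    [("2024-01-01",), ("2024-02-15",), ("2024-03-01",)],
)


def count_created_after(cutoff):
    """Count rows newer than cutoff; the value travels as a bound parameter."""
    sql = "SELECT COUNT(*) FROM cases c WHERE c.created_date > ?"
    return conn.execute(sql, [cutoff]).fetchone()[0]


recent = count_created_after(datetime.date(2024, 2, 1).strftime("%Y-%m-%d"))  # 2
# With string formatting this input would match every row; bound, it matches none.
hostile = count_created_after('9999" OR "1"="1')  # 0
```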
7. Deployment / Infrastructure Recommendations
7.1 Cache Backend
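The default cache backend is file-based (oldp/settings.py:255), which stores
every entry as a separate file on disk. For production, use Redis for
significantly better performance. The built-in backend below requires Django 4.0
or later and the redis-py package:
CACHES = {
    "default": {
        "BACKEND": "django.core.cache.backends.redis.RedisCache",
        "LOCATION": "redis://127.0.0.1:6379",
    }
}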
7.2 Cache TTL
CACHE_TTL = 60 * 15 (15 minutes) at oldp/settings.py:254. Given weekly data
updates, this could be extended to 1-4 hours for most views, with explicit cache
invalidation on data import.
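One hedged way to implement invalidation on import is a version prefix: every key includes a data version, and the import job bumps it, so older entries are never read again and simply age out. The class below is an in-memory stand-in written for illustration; it is not Django's cache API.

```python
class VersionedCache:
    """In-memory stand-in; a real deployment would wrap Django's cache instead."""

    def __init__(self):
        self._store = {}
        self._version = 1

    def _key(self, key):
        # Prefix every key with the current data version.
        return f"v{self._version}:{key}"

    def get(self, key, default=None):
        return self._store.get(self._key(key), default)

    def set(self, key, value):
        self._store[self._key(key)] = value

    def bump_version(self):
        """Call after a data import: entries written earlier become unreachable."""
        self._version += 1


cache = VersionedCache()
cache.set("homepage_laws_count", 1200)
before_import = cache.get("homepage_laws_count")
cache.bump_version()  # the weekly import finished
after_import = cache.get("homepage_laws_count")
```

A blunter alternative is to call Django's `cache.clear()` at the end of the import, which drops every entry in that cache at once.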
7.3 Database Query Logging
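Enable query logging in development to catch N+1 issues early (Django only logs
SQL statements when DEBUG = True):
# settings_dev.py
# Assumes a "console" handler is already defined in LOGGING["handlers"].
LOGGING["loggers"]["django.db.backends"] = {
    "level": "DEBUG",
    "handlers": ["console"],
}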
Or use django-debug-toolbar to inspect query counts per request.
Summary — Priority Matrix
| Priority | Finding | Effort | Impact |
|---|---|---|---|
| HIGH | N+1 in | Low | High: reduces queries from N+1 to 1 |
| HIGH | Missing caching on search views (2.1) | Low | High: ES queries are expensive |
| HIGH | API caching gaps (2.2) | Low | High: repeated identical API calls |
| HIGH | Missing | Low | Medium: extra query per serialized object |
| HIGH | Missing | Low | Medium: reduces memory and transfer |
| MEDIUM | GZip compression disabled (4.1) | Low | Medium: reduces response sizes ~60-70% |
| MEDIUM | Static file hashing disabled (4.4) | Low | Medium: enables long-lived browser caching |
| MEDIUM | Search facet processing (3.2) | Medium | Medium: heavy Python on every search |
| MEDIUM | Template fragment caching (2.3) | Medium | Medium: avoids re-rendering static HTML |
| LOW | Context processor overhead (5) | Low | Low: minor per-request cost |
| LOW | SQL injection in sources (6) | Low | Low (security fix, staff-only view) |
| LOW | | Low | Low: minor per-request cost |