Architecture overview

OLDP is a Django project organised into a set of focused apps under oldp/apps/. This page gives a high-level map of those components, the core data model, and how a request or a piece of content flows through the system. For setup see Getting Started; for the surrounding projects see The OLDP ecosystem.

Applications

Each Django app under oldp/apps/ owns one part of the domain:

| App | Responsibility |
| --- | --- |
| cases | Court decisions (Case, RelatedCase): metadata, HTML content, file number, ECLI, review status. |
| laws | Legislation: LawBook (a collection with revisions) and Law (a section), plus RelatedLaw. |
| courts | Court entities and their geography: Court, State, City, Country. |
| references | Citation extraction and linking between content items (Reference, reference markers). |
| search | Full-text search via Haystack + Elasticsearch; SearchableContent mixin, SearchQuery. |
| annotations | User annotations on content: AnnotationLabel, CaseAnnotation, CaseMarker. |
| sources | Provenance of imported content: Source and the SourceContent mixin. |
| accounts | User accounts, API tokens and granular per-token permissions (extends django-allauth). |
| processing | The content-processing pipeline (ContentProcessor, input handlers, timeouts). |
| mcp | Model Context Protocol server exposing OLDP to AI agents. |
| lib | Shared utilities, notably markers.py (reference-marker insertion with drift recovery). |
| topics | Topic / category taxonomy. |
| homepage | Landing page and static page rendering. |
| contact | Contact / feedback form. |

Data model

The core entities and their relationships:

[Figure: OLDP database schema]

  • Case ↔ Court: every Case belongs to a Court. Courts are organised hierarchically by Country, State and City, and are classified by court type and level of appeal (the German court types are contributed by the oldp-de theme).

  • Law ↔ LawBook: a LawBook (e.g. BGB) groups many Law sections and is versioned by revision date, with a latest flag marking the current revision. Each Law links to its previous and next section.

  • Reference: extracted citations are stored as Reference rows. Law citations are keyed by a (law_book_slug, law_section_slug) pair rather than a numeric id, so they survive law-book revisions and database rebuilds.

  • Source: imported content carries a Source (the crawler/provider it came from) via the SourceContent mixin; this is how oldp-ingestor provenance is tracked.

  • Annotations: AnnotationLabel defines labels (colour, trust level, value type); CaseAnnotation and CaseMarker attach labelled, positioned markers to case text.
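To see why slug-keyed law citations survive revisions, consider the following minimal sketch. It uses plain dataclasses rather than the real Django models; the class names, field names other than law_book_slug and law_section_slug, and the example slugs are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class LawBookRevision:
    """One revision of a law book (hypothetical stand-in for LawBook)."""
    pk: int               # numeric id, may change on re-import or rebuild
    slug: str             # stable identifier, e.g. "bgb" (assumed format)
    revision_date: date
    latest: bool          # marks the current revision


@dataclass(frozen=True)
class LawSection:
    """One section of a revision (hypothetical stand-in for Law)."""
    pk: int
    book_pk: int
    slug: str             # stable identifier, e.g. "823" (assumed format)


@dataclass(frozen=True)
class LawReference:
    """A citation keyed by slugs instead of numeric ids."""
    law_book_slug: str
    law_section_slug: str


def resolve(ref: LawReference,
            books: List[LawBookRevision],
            sections: List[LawSection]) -> Optional[LawSection]:
    """Resolve a reference against the revision flagged as latest."""
    current = next(
        (b for b in books if b.slug == ref.law_book_slug and b.latest), None
    )
    if current is None:
        return None
    return next(
        (s for s in sections
         if s.book_pk == current.pk and s.slug == ref.law_section_slug),
        None,
    )


# Two revisions of the same book; the numeric ids differ, the slugs do not.
books = [
    LawBookRevision(pk=1, slug="bgb", revision_date=date(2020, 1, 1), latest=False),
    LawBookRevision(pk=2, slug="bgb", revision_date=date(2024, 1, 1), latest=True),
]
sections = [
    LawSection(pk=10, book_pk=1, slug="823"),
    LawSection(pk=20, book_pk=2, slug="823"),
]

ref = LawReference(law_book_slug="bgb", law_section_slug="823")
resolved = resolve(ref, books, sections)
```

Here the same reference resolves to the section of whichever revision currently carries the latest flag, so adding a new revision requires no change to stored references.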

Several content models share cross-cutting mixins: SearchableContent (indexable in Elasticsearch), ReferenceContent (has extractable citations), SourceContent (has provenance), and AnnotationContent (can be annotated).

The relational schema is stored in MySQL or SQLite; see Database for the schema and encoding notes.

Request flow

OLDP serves three surfaces from the same models:

  1. Web frontend — server-rendered Django templates for /case/, /law/, /court/, /search/ and static pages. The look and feel is supplied by a theme (see Themes).

  2. REST API — built on Django REST Framework with per-app viewsets and serializers, token authentication, and Swagger/ReDoc schema. See REST API.

  3. MCP server — exposes discovery, search, retrieval and cross-reference tools to AI agents at /mcp. See MCP Server.

Anonymous responses on public paths are made CDN-cacheable by AnonymousPublicCacheMiddleware (configurable — see Configuration).
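As an illustration of the REST API surface, the sketch below builds (but does not send) a token-authenticated request using only the standard library. The base URL, the cases/ path and the token value are placeholders, and the "Token" keyword is the Django REST Framework default for token authentication; a deployment may configure a different scheme, so check REST API for the actual details.

```python
from urllib.request import Request

API_ROOT = "https://example.org/api"   # hypothetical base URL
API_TOKEN = "0123456789abcdef"         # placeholder, not a real token


def build_api_request(path: str, token: str = API_TOKEN) -> Request:
    """Build a GET request carrying a DRF-style token header."""
    url = f"{API_ROOT}/{path.lstrip('/')}"
    headers = {
        # "Token <key>" is the default keyword of DRF's TokenAuthentication.
        "Authorization": f"Token {token}",
        "Accept": "application/json",
    }
    return Request(url, headers=headers)


request = build_api_request("cases/")
```

Passing the resulting object to urllib.request.urlopen would perform the call against a real instance.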

Processing pipeline

Raw imported content is refined by the processing pipeline in the processing app. A ContentProcessor reads items from the database or filesystem and runs a sequence of processing steps (e.g. resolve the court, extract references) inside per-item transactions with timeout guards. Steps run from the Django admin or via management commands (process_cases, process_laws, process_courts, process_references). See Processing and, for writing your own step, Development.
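The shape of such a pipeline can be sketched in a few lines of plain Python. This is not the ContentProcessor implementation: the class, the step functions, the court lookup table and the citation pattern are all hypothetical, the per-item transaction is approximated by working on a copy, and the timeout is a simple time budget checked between steps.

```python
import re
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List

Item = Dict[str, str]
Step = Callable[[Item], Item]

COURTS = {"Bundesgerichtshof": "bgh"}  # hypothetical lookup table


class BudgetExceeded(Exception):
    """Raised when an item exceeds its processing budget."""


def resolve_court(item: Item) -> Item:
    """Map the raw court name to a known court, or fail the item."""
    name = item["court_name"]
    if name not in COURTS:
        raise ValueError(f"unknown court: {name}")
    return {**item, "court": COURTS[name]}


def extract_references(item: Item) -> Item:
    """Find citations of the form '§ 823 BGB' (simplified pattern)."""
    found = re.findall(r"§\s*(\d+)\s+([A-Z][A-Za-z]+)", item["text"])
    refs = ", ".join(f"{book.lower()}/{section}" for section, book in found)
    return {**item, "references": refs}


@dataclass
class SketchProcessor:
    steps: List[Step]
    budget_seconds: float = 5.0
    processed: List[Item] = field(default_factory=list)
    failed: List[str] = field(default_factory=list)

    def process(self, items: Iterable[Item]) -> None:
        for item in items:
            started = time.monotonic()
            working = dict(item)  # copy, so a failed item leaves no partial result
            try:
                for step in self.steps:
                    working = step(working)
                    if time.monotonic() - started > self.budget_seconds:
                        raise BudgetExceeded(step.__name__)
            except Exception as exc:
                # One bad item must not abort the rest of the run.
                self.failed.append(f"{item.get('id', '?')}: {exc!r}")
                continue
            self.processed.append(working)


processor = SketchProcessor(steps=[resolve_court, extract_references])
processor.process([
    {"id": "1", "court_name": "Bundesgerichtshof", "text": "Liability under § 823 BGB."},
    {"id": "2", "court_name": "Unknown Court", "text": "No citations here."},
])
```

The first item passes both steps; the second fails at court resolution and is recorded in failed without stopping the run.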

Search and the citation graph

Search is powered by Elasticsearch through Haystack. Cases and laws are indexed via search_indexes.py in their respective apps. Beyond full-text search, the case index stores the citation graph — the cited_laws and cited_cases fields — which powers the “referenced by” / “citing cases” panels and the REST and MCP cross-reference endpoints. These fields are documented authoritatively in Elasticsearch → Index fields; see also Searching for the user-facing filters.
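Conceptually, a "citing cases" panel is a reverse lookup over the cited_laws and cited_cases fields. The sketch below shows that inversion in memory; the document layout and the value format of the fields (for example "bgb/823") are assumptions, and a real deployment would answer the same question with an Elasticsearch query rather than a Python loop.

```python
from collections import defaultdict
from typing import Any, Dict, Hashable, List

# Hypothetical index documents; only the two field names come from the text above.
CASE_INDEX: List[Dict[str, Any]] = [
    {"id": 1, "cited_laws": ["bgb/823"], "cited_cases": []},
    {"id": 2, "cited_laws": ["bgb/823", "bgb/249"], "cited_cases": [1]},
    {"id": 3, "cited_laws": [], "cited_cases": [1, 2]},
]


def invert(documents: List[Dict[str, Any]], field_name: str) -> Dict[Hashable, List[int]]:
    """Map each cited target to the ids of the cases that cite it."""
    citing: Dict[Hashable, List[int]] = defaultdict(list)
    for doc in documents:
        for target in doc[field_name]:
            citing[target].append(doc["id"])
    return dict(citing)


cases_citing_law = invert(CASE_INDEX, "cited_laws")
cases_citing_case = invert(CASE_INDEX, "cited_cases")
```

With this sample data, the law key "bgb/823" maps to cases 1 and 2, and case 1 is cited by cases 2 and 3.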

Themes

The frontend’s templates and static assets can be overridden by a theme package without touching the core. The reference theme is oldp-de, which supplies the German UI, German court types, and German-specific settings. A theme is activated by pointing DJANGO_SETTINGS_MODULE at the theme’s settings module (e.g. oldp_de.settings). See The OLDP ecosystem.
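Theme activation relies on Django's standard DJANGO_SETTINGS_MODULE mechanism. The following sketch mirrors the setdefault pattern found in a typical manage.py, applied to a plain dictionary so it can be run in isolation; the core module name oldp.settings is an assumption, while oldp_de.settings is the example named above.

```python
from typing import MutableMapping

CORE_SETTINGS = "oldp.settings"       # assumed name of the core settings module
THEME_SETTINGS = "oldp_de.settings"   # theme settings module named above


def select_settings(environ: MutableMapping[str, str],
                    default: str = CORE_SETTINGS) -> str:
    """Return the settings module to load, honouring a value already set."""
    return environ.setdefault("DJANGO_SETTINGS_MODULE", default)


themed_env = {"DJANGO_SETTINGS_MODULE": THEME_SETTINGS}
plain_env: dict = {}

themed_choice = select_settings(themed_env)
plain_choice = select_settings(plain_env)
```

In an entry point such as manage.py the same function would be called with os.environ, so exporting DJANGO_SETTINGS_MODULE before starting the process is enough to switch themes.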

Configuration

All runtime behaviour is controlled through environment variables and django-configurations classes. See the Configuration reference for the complete list and Deployment for production setup.