The OLDP ecosystem

OLDP is the core of a small ecosystem of independent, MIT-licensed projects. The core platform (this repository) stores and serves legal data; the sister projects feed data in, theme the web interface, and convert exported dumps into dataset formats. Each is a separately installable Python package.

Projects at a glance

Project | Role | Entry point | Repository
--- | --- | --- | ---
oldp | Core Django web app: models, REST API, search, MCP server | Django app / manage.py | openlegaldata/oldp
oldp-de | German theme & country-specific settings (templates, static assets, German court types) | Django settings module oldp_de.settings | openlegaldata/oldp-de
oldp-ingestor | Scrapers & API clients that pull German laws and court decisions from 13+ sources and push them into OLDP | CLI: oldp-ingestor | openlegaldata/oldp-ingestor
oldp-toolkit | Dump preprocessing: converts OLDP data dumps into HuggingFace / Parquet / JSONL datasets | CLI: oldpt | openlegaldata/oldp-toolkit

How data flows

   External sources                  ┌─────────────────────────────┐
 (RIS, GII, Bayern, NRW,             │            OLDP             │
  Juris, EUR-Lex, …)                 │   (this repository)         │
        │                            │                             │
        │   scrape / API fetch       │  • Cases, Laws, Courts      │
        ▼                            │  • References / citations   │
 ┌───────────────┐   REST API POST   │  • Elasticsearch search     │
 │ oldp-ingestor │ ────────────────▶ │  • REST API + MCP server    │
 └───────────────┘                   │  • Themed by  ◀── oldp-de   │
                                     └──────────────┬──────────────┘
                                                    │  dump_api_data
                                                    ▼
                                         gzipped JSONL snapshot
                                          (+ manifest.json)
                                                    │
                                                    ▼
                                         ┌────────────────────┐
                                         │    oldp-toolkit    │
                                         └─────────┬──────────┘
                                                   ▼
                                   HuggingFace dataset / Parquet
                                  (openlegaldata/court-decisions-germany)

oldp-de: German theme

oldp-de is a pluggable Django theme package that adapts OLDP for German legal data without modifying the core platform. It provides:

  • Alternative Django templates and static assets (German UI), which the template loader resolves ahead of the core templates.

  • German-specific configuration classes (DevDEConfiguration, ProdDEConfiguration) layered on top of OLDP’s base settings.

  • The courts_de app, contributing 40+ German court types (AG, LG, OLG, BGH, BVerfG, BAG, BSG, BFH, …) and ECLI mapping data.

Deploy it by pointing DJANGO_SETTINGS_MODULE at oldp_de.settings. See Architecture → Themes and the Configuration reference.
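
The template override described above relies on ordered lookup: the theme's directories are searched before the core's, and the first match wins. The following is a minimal pure-Python sketch of that idea, not Django's actual loader code; the function name resolve_template and the directory layout are hypothetical and only illustrate the precedence rule.

```python
from pathlib import Path
from typing import Optional, Sequence


def resolve_template(name: str, search_dirs: Sequence[Path]) -> Optional[Path]:
    """Return the first existing file called `name`, searching dirs in order.

    Hypothetical helper: it mimics first-match template lookup and is not
    part of Django or of oldp-de.
    """
    for directory in search_dirs:
        candidate = directory / name
        if candidate.is_file():
            return candidate
    return None  # no directory provides this template
```

Called with the theme directory listed before the core directory, a template that exists in both resolves to the theme's copy, while a template that exists only in the core still resolves to the core's copy. This is why a theme only needs to ship the templates it wants to change.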

oldp-ingestor: data ingestion

oldp-ingestor is a standalone CLI tool that fetches German laws and court decisions from external providers and writes them into OLDP through the REST API (or to local JSON files).

  • Providers for RIS, GII, RII, Bayern, NRW, Niedersachsen, EUR-Lex and several Juris state variants, built on a shared HTTP / scraper / Playwright base.

  • Polite by default: request pacing, exponential backoff, and Retry-After handling.

  • Authenticates against OLDP with an API token (see REST API → Authentication).
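
The pacing behaviour listed above (exponential backoff plus Retry-After handling) can be sketched as a small delay calculation. This is an illustrative sketch under stated assumptions, not oldp-ingestor's actual implementation: the function name retry_delay and the parameters base, cap and jitter are hypothetical, and only the delay-seconds form of Retry-After is handled.

```python
import random
from typing import Optional


def retry_delay(attempt: int, retry_after: Optional[str] = None,
                base: float = 1.0, cap: float = 60.0,
                jitter: float = 0.0) -> float:
    """Seconds to wait before retry number `attempt` (0-based).

    Hypothetical helper, not taken from oldp-ingestor.
    """
    if retry_after is not None:
        try:
            # A server-provided Retry-After (in seconds) takes precedence.
            return float(max(0, int(retry_after)))
        except ValueError:
            pass  # HTTP-date form is not handled here; fall back to backoff
    # Exponential backoff: base, 2*base, 4*base, ... limited to `cap`.
    delay = min(cap, base * (2 ** attempt))
    # Optional random jitter so parallel clients do not retry in lockstep.
    return delay + random.uniform(0.0, jitter)
```

A caller would sleep for retry_delay(attempt, response_headers.get("Retry-After")) between attempts. Capping the delay keeps a long run of failures from producing unbounded waits.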

The data it writes lands in OLDP’s content models and is then refined by the processing pipeline (court resolution, reference extraction).
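
To make the write path concrete, here is a hedged sketch of how a client might build an authenticated POST request using only the standard library. Everything specific in it is an assumption for illustration: the /api/cases/ path, the "Token" authorization scheme, and the payload fields are not confirmed by this page, so check the REST API documentation for the real endpoint and schema. The request is only constructed, not sent.

```python
import json
import urllib.request


def build_case_request(base_url: str, token: str,
                       payload: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request carrying an API token.

    Hypothetical helper: path and header scheme are assumptions.
    """
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/cases/",  # assumed endpoint path
        data=body,
        method="POST",
        headers={
            "Authorization": f"Token {token}",  # assumed token scheme
            "Content-Type": "application/json",
        },
    )
```

Passing the returned object to urllib.request.urlopen would perform the call. Keeping construction separate from sending makes the request easy to inspect or test without network access.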

oldp-toolkit: dump preprocessing

oldp-toolkit consumes the gzipped JSONL snapshots produced by OLDP’s dump_api_data command (see Data Dumps & Bulk Downloads) and converts them into distribution formats:

  • HTML → Markdown conversion and inline legal-reference extraction.

  • Output as HuggingFace Hub dataset, HuggingFace on-disk dataset, Parquet, or JSONL.

  • Powers the public openlegaldata/court-decisions-germany dataset.
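
The input side of this conversion is a gzipped JSONL file, that is, one JSON object per line inside a gzip stream. The sketch below shows how such a file can be read with the standard library alone. It is not oldp-toolkit's code; the function name iter_records is hypothetical, and no particular record fields are assumed.

```python
import gzip
import json
from pathlib import Path
from typing import Iterator


def iter_records(path: Path) -> Iterator[dict]:
    """Yield one dict per non-empty line of a gzipped JSONL file.

    Hypothetical helper for illustration only.
    """
    # "rt" decompresses and decodes to text, so lines can be parsed directly.
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines
                yield json.loads(line)
```

Because the function is a generator, records are processed one at a time, so a large snapshot never has to be held in memory in full.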