The problem
The Digital Prosopography of the Roman Republic (DPRR) is a dataset of every known political figure in the Roman Republic — magistrates, priests, senators, their offices, dates, and family relationships. It's built from decades of scholarship, primarily Broughton's Magistrates of the Roman Republic (1951–1986). The data is available as RDF, queryable via SPARQL.
The problem: SPARQL is not something most people can write from scratch. The DPRR ontology has its own vocabulary, its own conventions, its own quirks. Even researchers who know the dataset well spend time wrangling queries. And the upstream site experiences frequent downtime.
I wanted to see if an LLM could bridge that gap — take a natural language question about Roman Republican prosopography and turn it into a valid SPARQL query, execute it, and synthesize the results with appropriate scholarly caveats.
The approach
Projects like sparql-llm from the Swiss Institute of Bioinformatics have explored this space with a full retrieval pipeline — vector stores for query examples, schema indexing, validation feedback loops. I wanted something more lightweight: skip the retrieval infrastructure and just give the LLM direct access to the schema and a query executor.
I built dprr-mcp, an MCP server that bundles the DPRR RDF dataset in a local Oxigraph store and exposes three tools:
- `get_schema` — returns the DPRR ontology: prefixes, classes, properties, and query tips
- `validate_sparql` — syntax check with auto-repair for missing PREFIX declarations, plus semantic validation against the ontology
- `execute_sparql` — full validation + execution against the local store
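The PREFIX auto-repair in the validator is the kind of small fix that saves a round trip: if the model uses a known prefix without declaring it, the server can prepend the declaration instead of failing. A minimal sketch of that idea, assuming a hypothetical prefix table (the `vocab` namespace URI here is a placeholder, not necessarily the real DPRR namespace):

```python
import re

# Hypothetical prefix table; the real server would derive this from the
# DPRR ontology that get_schema returns.
KNOWN_PREFIXES = {
    "vocab": "http://romanrepublic.ac.uk/rdf/ontology#",  # placeholder URI
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
}

def repair_prefixes(query: str) -> str:
    """Prepend PREFIX declarations for any known prefix the query
    uses but does not declare."""
    declared = set(re.findall(r"PREFIX\s+(\w+):", query, re.IGNORECASE))
    used = set(re.findall(r"\b(\w+):\w", query)) - declared
    header = "".join(
        f"PREFIX {p}: <{KNOWN_PREFIXES[p]}>\n"
        for p in sorted(used) if p in KNOWN_PREFIXES
    )
    return header + query
```

The function is idempotent: running it on an already-repaired query declares nothing twice, so the validator can apply it unconditionally before parsing.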
The key design decision: don't try to answer questions directly from the data. Instead, give the LLM the tools to generate, validate, and execute structured queries. The LLM handles the natural language → SPARQL translation. The SPARQL engine handles correctness. The LLM then synthesizes the results.
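In outline, the workflow the model is steered through looks something like the sketch below. The tool bodies are stubs standing in for the real server, and the SPARQL predicate names are invented for illustration; the actual DPRR vocabulary differs.

```python
# Illustrative shape of the three-tool workflow. Tool bodies are stubs;
# in the real server they wrap an Oxigraph store and the DPRR ontology.

# The model's translation of a question like "Who held the office of
# praetor in 150 BC?" (predicate names are made up for illustration).
EXAMPLE_QUERY = """
SELECT ?name WHERE {
  ?post a vocab:PostAssertion ;
        vocab:office "praetor" ;
        vocab:year -149 ;
        vocab:person ?p .
  ?p vocab:name ?name .
}
"""

def get_schema() -> str:
    return "prefixes, classes, properties, query tips"

def validate_sparql(query: str) -> dict:
    # Real version: syntax check, PREFIX auto-repair, ontology check.
    return {"valid": True, "repaired_query": query}

def execute_sparql(query: str) -> list:
    # Real version: validate, then run against the local store.
    return [{"name": "C. Livius Drusus"}, {"name": "P. Sextilius"}]

def answer(question: str) -> list:
    schema = get_schema()            # 1. ground the model in the ontology
    query = EXAMPLE_QUERY            # 2. model writes SPARQL (fixed here)
    report = validate_sparql(query)  # 3. catch errors before execution
    assert report["valid"]
    return execute_sparql(report["repaired_query"])  # 4. fetch real triples
```

The division of labor is the point: step 2 is the only place the model generates anything, and steps 3 and 4 are deterministic, so the results it synthesizes from can only come from the store.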
This is retrieval-augmented generation in the literal sense — the model retrieves structured data through a formal query language, not a vector similarity search. No embeddings. No chunking. No fuzzy matching. The query either returns the right triples or it doesn't.
What works
Ask "Who held the office of praetor in 150 BC?" and the system generates a SPARQL query against the DPRR ontology, validates it, executes it, and returns a synthesized response with source citations. It correctly identifies C. Livius Drusus and P. Sextilius, notes the uncertainty flags on their dates, cites Broughton and Brennan, and — critically — explains why only two names appear when six praetors were elected annually.
The three-tool pipeline (schema → validate → execute) forces the LLM through a structured workflow. It can't hallucinate data because the data comes from SPARQL results. It can't write invalid queries because the validator catches them. The ontology schema grounds its understanding of what the data actually contains.
Some examples of what the pipeline produces:
- Praetorships of the Middle to Late Republic — a stacked bar chart of 996 praetor records across 23 decades (264–31 BC), broken down by certainty category, with a searchable data table linking back to DPRR
- Family Trees — prosopographic family reconstructions rendered as SVG diagrams with sourced data tables, including the Fulvii Flacci & Calpurnii Pisones and the Atilii Reguli
What doesn't
The system inherits every limitation of its source data, and this is where it gets interesting.
DPRR is a database of secondary sources. It encodes what Broughton, Zmeskal, and Rüpke concluded about the ancient evidence — not the ancient evidence itself. Every query result should be understood as "according to Broughton, as digitized by the DPRR team," not "according to the ancient sources."
I wrote a Caveat Utilitor document that catalogs the risks:
- Elite bias — the dataset covers only the political upper strata. Women appear almost exclusively through family relationships to male office-holders.
- Argument from silence — zero results for a query doesn't mean something didn't happen. The fasti are fragmentary. Broughton synthesized rather than exhausted.
- Nomenclatural ambiguity — Roman naming conventions create fundamental identification problems. A name that appears to indicate a family connection may reflect coincidence or conventions we don't fully understand.
- Temporal unevenness — the late Republic is densely documented. The early Republic rests on traditions of questionable historicity.
The LLM handles these caveats well when prompted — it includes uncertainty flags, cites the specific secondary sources behind each assertion, and explains gaps. But it wouldn't do this naturally. The system prompt and the Caveat Utilitor document do the heavy lifting. Left to its own devices, the model would present DPRR query results as facts rather than scholarly interpretations.
Why MCP
MCP made this project straightforward. The server runs as a standalone HTTP process. Claude Code or Claude Desktop connects to it over the network. No plugin architecture, no custom API integration, no SDK bindings. Define tools, implement handlers, deploy.
The three-tool design maps cleanly to MCP's tool model: each tool has a clear input/output contract, the LLM decides when and how to call them, and the server handles all the SPARQL and RDF complexity behind the interface.
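In MCP's tool model, each tool is declared with a name, a description, and a JSON Schema for its input. A sketch of how the three tools might be declared (descriptions abbreviated, schemas simplified):

```python
# The three tools as MCP-style declarations: name, description, and a
# JSON Schema input contract. Schemas here are a simplified sketch.
TOOLS = [
    {
        "name": "get_schema",
        "description": "Return the DPRR ontology: prefixes, classes, "
                       "properties, and query tips.",
        "inputSchema": {"type": "object", "properties": {}},
    },
    {
        "name": "validate_sparql",
        "description": "Syntax check with PREFIX auto-repair, plus "
                       "semantic validation against the ontology.",
        "inputSchema": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
    {
        "name": "execute_sparql",
        "description": "Validate and execute a SPARQL query against "
                       "the local store.",
        "inputSchema": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
]
```

Because the contracts are this small, the client-side model needs no special knowledge of SPARQL engines or RDF stores; it only ever sees three typed function calls.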
Bundling the data locally (with auto-download from GitHub releases) solved the upstream reliability problem. The server initializes in seconds and doesn't depend on romanrepublic.ac.uk being up.
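The bundling pattern itself is simple: fetch the RDF dump once and reuse the cached copy on every subsequent startup. A sketch, with the release URL as a placeholder and the fetcher injected so the cache logic is testable without a network:

```python
from pathlib import Path

# Placeholder URL; the real server pulls the dump from a GitHub release.
DATA_URL = "https://example.com/dprr-dump.nt.gz"

def ensure_dataset(cache: Path, fetch) -> Path:
    """Download the dataset if the cached copy is missing; otherwise
    return the cached path without touching the network."""
    if not cache.exists():
        cache.parent.mkdir(parents=True, exist_ok=True)
        cache.write_bytes(fetch(DATA_URL))
    return cache
```

Loading the cached dump into an embedded store on startup is what keeps initialization to seconds and removes the runtime dependency on the upstream site.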
The meta-lesson
The interesting thing about this project isn't the Roman Republic. It's the pattern.
Structured data with a formal query language is a better RAG target than unstructured text. You don't need vector embeddings when you have a schema. You don't need fuzzy matching when you have SPARQL. The LLM's job becomes translation (natural language → query language) and synthesis (structured results → readable narrative), both of which it's good at.
But the hardest part wasn't the technical pipeline. It was the epistemological framing — making sure the system represents what it actually knows versus what it appears to know. A query that returns two praetors for 150 BC is only useful if the response explains that six were elected annually and the gap reflects source survival, not historical reality.
That's a human-in-the-loop problem. The system prompt, the Caveat Utilitor document, the uncertainty flags — all of that is domain expertise encoded as guardrails. The LLM is good at following them. It's not good at inventing them.
The tools are ready. The question is whether the people using them know enough to set the right boundaries.
The code is on GitHub: gillisandrew/dprr-mcp
References
- Mouritsen, H., Mayfield, J. & Bradley, J. (2017). Digital Prosopography of the Roman Republic (DPRR). King's Digital Lab, King's College London.
- Bradley, J. (2020). A Prosopography as Linked Open Data. Digital Humanities Quarterly, 14(2).
- Broughton, T. R. S. (1951–1986). The Magistrates of the Roman Republic. 3 vols. American Philological Association / Scholars Press.
- Brennan, T. C. (2000). The Praetorship in the Roman Republic. 2 vols. Oxford University Press.
- Rüpke, J. (2008). Fasti Sacerdotum: A Prosopography of Pagan, Jewish, and Christian Religious Officials in the City of Rome, 300 BC to AD 499. Oxford University Press.
- Zmeskal, K. (2009). Adfinitas: Die Verwandtschaften der senatorischen Führungsschicht der römischen Republik von 218–31 v. Chr. Verlag Karl Stutz.
- Déjean, S. et al. (2024). sparql-llm: SPARQL query generation for RDF knowledge graphs using LLMs. Swiss Institute of Bioinformatics.