METHODS · December 2025

Methodology

Data sources and research methods behind our analysis of Latin publishing, digitization, and translation.

Primary Data Source

Universal Short Title Catalogue (USTC)

All bibliographic data comes from the Universal Short Title Catalogue, maintained by the University of St Andrews. The USTC is the most comprehensive database of early European printing (1450–1700).

USTC Statistics (as of 2025)

  • 1.65 million editions catalogued
  • 7 million surviving copies located
  • 10,000+ libraries, archives, and museums worldwide
  • 500,000+ digital links to scanned copies

Source: USTC About Page

We queried the USTC for records where the language field contains "Latin," yielding 533,307 Latin-language edition records (about 32.3% of the 1.65 million editions catalogued).

Editions vs. Works: Our Analysis

The USTC counts editions (individual printings), not unique works. A popular text like Cicero's De Officiis might appear in 500+ editions. To estimate unique works, we analyzed the USTC Latin dataset:

Total Latin editions                 533,307
Unique author+title combinations     ~150,000
Unique author names                  ~52,000
Unique title strings                 ~99,000

The ~150,000 unique author+title pairs are our best proxy for "works," though this figure still overcounts because:

  • The same work printed under variant titles is counted multiple times
  • Spelling and abbreviation differences across editions create spurious distinct pairs
  • Anonymous works are harder to deduplicate

Conservative estimate: 80,000–120,000 unique Latin works in USTC. Dividing 533,307 editions by a midpoint of ~100,000 works gives an average of roughly 5 editions per work, though the distribution is highly skewed: canonical authors have hundreds of editions, while most works have just 1–2.
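
The per-work average can be checked with a short calculation. The following is a minimal sketch in Python that uses only the edition count and the work-count range stated in this section; the function and variable names are ours, chosen for illustration.

```python
# Figures quoted in this section; the averages are derived here.
LATIN_EDITIONS = 533_307
WORKS_LOW, WORKS_HIGH = 80_000, 120_000


def editions_per_work(editions: int, works: int) -> float:
    """Mean number of editions per unique work."""
    return editions / works


# More unique works means fewer editions each, so the high work count
# gives the low end of the average and vice versa.
fewest = editions_per_work(LATIN_EDITIONS, WORKS_HIGH)
most = editions_per_work(LATIN_EDITIONS, WORKS_LOW)
midpoint = editions_per_work(LATIN_EDITIONS, (WORKS_LOW + WORKS_HIGH) // 2)

print(f"{fewest:.1f} to {most:.1f} editions per work (midpoint {midpoint:.1f})")
# prints: 4.4 to 6.7 editions per work (midpoint 5.3)
```

This is a mean only. It says nothing about the skew described above, where a few canonical works account for a large share of editions.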

Analysis: Computed from USTC Latin export (December 2025) using author_name + std_title deduplication.
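
The counting step can be illustrated with a small sketch. It is not the script we ran: the sample rows are made up, and only the field names author_name and std_title come from the export described above. It shows how exact-match deduplication treats a variant title or a missing author as a separate "work," which is why the pair count overstates the number of works.

```python
from collections import Counter

# Made-up sample rows for illustration; only the field names
# author_name and std_title are taken from the text.
records = [
    {"author_name": "Cicero, Marcus Tullius", "std_title": "De officiis"},
    {"author_name": "Cicero, Marcus Tullius", "std_title": "De officiis"},
    {"author_name": "Cicero, Marcus Tullius", "std_title": "De officiis libri tres"},
    {"author_name": "Erasmus, Desiderius", "std_title": "Moriae encomium"},
    {"author_name": "", "std_title": "Moriae encomium"},  # anonymous record
]


def summarize(rows):
    """Count editions, exact author+title pairs, authors, and titles."""
    pairs = Counter((r["author_name"], r["std_title"]) for r in rows)
    return {
        "editions": len(rows),
        "author_title_pairs": len(pairs),
        "authors": len({r["author_name"] for r in rows if r["author_name"]}),
        "titles": len({r["std_title"] for r in rows}),
        "max_editions_for_one_pair": max(pairs.values()),
    }


summary = summarize(records)
print(summary)
```

In this sample, five editions of two works yield four author+title pairs: the variant Cicero title and the anonymous Erasmus record are each counted separately.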

Digitization Data

USTC Digital Links (~27% of all editions)

The USTC itself provides the most reliable digitization data. According to their website:

"The USTC hosts links to more than half a million digital scans, currently tagged to some 450,000 editions."

This means ~27% of all catalogued editions (450,000 / 1,650,000) have at least one digital scan available through USTC links, which point to major digitization projects.

English Translation Data

We compiled counts from major Latin-English scholarly translation series and digital libraries:

Digital Libraries

Source                     Texts   Notes
Perseus Digital Library    631     Latin works with translations (Scaife Viewer)
The Philological Museum    ~200    British neo-Latin with translations (Dana Sutton)

The Philological Museum

Dana Sutton's Philological Museum at the University of Birmingham is a major open-access resource. It contains critical editions of ~200 British neo-Latin texts (plays, poems, letters, essays) from the 16th–17th centuries, most with facing-page English translations. The associated Analytic Bibliography indexes 79,760 neo-Latin texts freely available online (though most are Latin-only scans without translations).

Renaissance & Neo-Latin

Series                         Vols   Notes
I Tatti Renaissance Library    100    Italian Renaissance Latin (2001–)

Classical Latin

Series                             Vols   Notes
Loeb Classical Library (Latin)     ~158   520+ total (Greek & Latin), 1912–
Aris & Phillips Classical Texts    170+   Greek & Latin, 1979– (~80 Latin)
Penguin Classics (Latin)           114    Latin language filter

Medieval Latin

Series                                       Vols   Notes
Oxford Medieval Texts                        103    Facing-page translations, 1967–
Dumbarton Oaks Medieval Library              ~50    Latin subseries, 2010–
Toronto Medieval Latin Texts                 37     Pedagogical editions, 1972–
Liverpool Translated Texts for Historians    86     Late Antique/Medieval, includes Greek (~50 Latin)

Patristic & Church Fathers

Series                                          Vols   Notes
Fathers of the Church (CUA)                     147    Latin & Greek, 1947– (~80 Latin)
Ancient Christian Writers (Paulist)             76     Latin & Greek, 1946– (~40 Latin)
Ante-Nicene / Nicene and Post-Nicene Fathers    38     Public domain, 1885–1900
Classics of Western Spirituality                130+   Mixed languages, 1978– (~40 Latin)

The Missing Database

A Gap in Research Infrastructure

No comprehensive database exists of which Latin works have been translated into English. The series counts above are our best effort to estimate coverage, but they represent a patchwork, not a systematic catalogue.

Existing resources fall short of this goal.

Building this database is a potential output of this project. Cross-referencing USTC records with translation series catalogues would produce the first comprehensive map of what's accessible.
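
One way such a cross-reference could work is to reduce each author and title to a crude comparison key and look for exact key matches. The sketch below is a hypothetical illustration, not an existing tool: the record IDs, sample rows, and the match_key function are all assumptions, and the normalization rules (stripping diacritics, folding u/v and i/j, keeping only the surname) are one plausible choice among many.

```python
import re
import unicodedata


def match_key(author: str, title: str) -> tuple[str, str]:
    """Reduce an author/title pair to a crude comparison key (hypothetical scheme)."""

    def norm(text: str) -> str:
        # Strip diacritics, lowercase, and fold the u/v and i/j spelling variants.
        text = unicodedata.normalize("NFKD", text)
        text = "".join(ch for ch in text if not unicodedata.combining(ch))
        text = text.lower().replace("v", "u").replace("j", "i")
        # Keep only runs of ASCII letters, joined by single spaces.
        return " ".join(re.findall(r"[a-z]+", text))

    # Use only the surname (text before the first comma) to tolerate name variants.
    return norm(author.split(",")[0]), norm(title)


# Made-up catalogue rows: (record id, author, title). The ids are not real USTC numbers.
ustc_records = [
    ("rec-1", "Erasmus, Desiderius", "Moriae encomium"),
    ("rec-2", "Vives, Juan Luis", "De institutione feminae christianae"),
    ("rec-3", "Valla, Lorenzo", "Elegantiae linguae latinae"),
]

# Made-up translation-series entries: (author, Latin title of the translated work).
translated = [
    ("Erasmus", "Moriae Encomium"),
    ("Vives", "De Institutione Feminae Christianae"),
]

translated_keys = {match_key(a, t) for a, t in translated}
matched = [rid for rid, a, t in ustc_records if match_key(a, t) in translated_keys]
unmatched = [rid for rid, a, t in ustc_records if match_key(a, t) not in translated_keys]

print("translated:", matched)
print("untranslated:", unmatched)
```

Exact key matching will miss works catalogued under long or variant titles, so a real pipeline would likely need fuzzy matching and manual review on top of this.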

Our Estimate (Rough)

Named series above (Latin-specific)                ~1,000
Perseus Digital Library (unique works)             ~400
Other academic presses, dissertations, journals    ~500
Very rough total                                   ~1,500–2,000

This estimate has significant uncertainty. Without a systematic catalogue, we cannot know the true number.

Translation Coverage of Unique Works

Given our midpoint estimate of ~100,000 unique Latin works in USTC and ~1,500–2,000 translated works, translation coverage is approximately 1.5–2% of unique works. The vast majority of Renaissance Latin literature remains untranslated.
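
The coverage figure, and its sensitivity to the work-count estimate, can be reproduced with a few lines of arithmetic. This is a minimal sketch using only numbers stated on this page; the names are ours.

```python
# Estimates quoted on this page.
TRANSLATED_LOW, TRANSLATED_HIGH = 1_500, 2_000
WORKS_LOW, WORKS_MID, WORKS_HIGH = 80_000, 100_000, 120_000


def coverage_pct(translated: int, works: int) -> float:
    """Share of unique works with an English translation, in percent."""
    return 100 * translated / works


# Headline figure: translated range over the midpoint work count.
headline = (coverage_pct(TRANSLATED_LOW, WORKS_MID),
            coverage_pct(TRANSLATED_HIGH, WORKS_MID))

# Widest plausible range: fewest translations over most works,
# and most translations over fewest works.
widest = (coverage_pct(TRANSLATED_LOW, WORKS_HIGH),
          coverage_pct(TRANSLATED_HIGH, WORKS_LOW))

print(f"headline: {headline[0]:.2f}% to {headline[1]:.2f}%")  # 1.50% to 2.00%
print(f"widest:   {widest[0]:.2f}% to {widest[1]:.2f}%")      # 1.25% to 2.50%
```

Even at the most generous end of both ranges, coverage stays at about 2.5%.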

OCR and Searchability

Having a digital scan does not mean the text is searchable. OCR (Optical Character Recognition) quality varies dramatically for early printed books.

OCR Accuracy Research

Academic studies on OCR accuracy for historical prints show:

  • Modern documents: 99%+ character accuracy
  • 19th-century prints: 98%+ accuracy with general OCR models
  • Early modern prints (pre-1800): error rates of 40%+ (accuracy below 60%) with untrained models, due to blackletter fonts, abbreviations, and regional type variations
  • With specialized training: 94–98% accuracy achievable on individual books

Sources: Springmann & Lüdeling (2017), Reul et al. (2017)
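
For readers unfamiliar with how figures like these are typically derived: character accuracy is commonly computed as one minus the edit distance between the OCR output and a hand-checked transcription, divided by the length of the transcription. The sketch below assumes that definition; the cited studies may differ in detail, and the sample strings are made up.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def character_accuracy(truth: str, ocr: str) -> float:
    """1 - (edit distance / length of the ground-truth text)."""
    if not truth:
        raise ValueError("ground-truth text must not be empty")
    return 1 - levenshtein(truth, ocr) / len(truth)


# Made-up example: a long s misread as "f" twice in one 14-letter word.
truth = "prudentissimus"
ocr = "prudentiffimus"
accuracy = character_accuracy(truth, ocr)
print(f"{accuracy:.1%}")
```

Two wrong characters in a 14-letter word already put accuracy below 86%, which shows how quickly systematic misreadings of early typefaces degrade searchability.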

Text Creation Partnership (TCP)

The Text Creation Partnership has produced the gold standard for early modern text transcription:

  • 70,000+ transcribed and encoded texts
  • 1 billion+ searchable words
  • 99.995% accuracy (double-keyed transcription)
  • ~60,000 texts from EEBO-TCP specifically

Source: TCP Website

Critical Limitation

EEBO-TCP focuses on English-language books from the British Isles. Latin works from continental Europe—the vast majority of Latin printing—are not covered by TCP transcription efforts.

Summary of Key Findings

Metric                                              Value              Source
Total USTC editions (all languages)                 1.65 million       USTC website
Latin editions (USTC)                               533,307            USTC query
Unique author+title pairs (Latin)                   ~150,000           Our analysis
Estimated unique Latin works                        ~80,000–120,000    Estimated (dedup)
Editions with digital scans (all languages)         ~450,000 (27%)     USTC website
High-quality transcriptions (English texts only)    ~60,000            EEBO-TCP
Latin works with English translations               ~1,500–2,000       Series counts
Translation coverage (of unique works)              ~1.5–2%            Calculated

References
