Methodology: How We Estimated Digitization Rates

Data sources and research methods behind our analysis of Latin publishing, digitization, and translation.

Primary Data Source

Universal Short Title Catalogue (USTC)

All bibliographic data comes from the Universal Short Title Catalogue, maintained by the University of St Andrews. The USTC is the most comprehensive database of early European printing (1450–1700).

USTC Statistics (as of 2025)

1.65 million editions catalogued
7 million surviving copies located
10,000+ libraries, archives, and museums worldwide
500,000+ digital links to scanned copies

Source: USTC About Page

We queried the USTC for records where the language field contains "Latin," yielding 533,307 Latin-language edition records (about 32.3% of the total catalogue).

Editions vs. Works: Our Analysis

The USTC counts editions (individual printings), not unique works. A popular text like Cicero's De Officiis might appear in 500+ editions. To estimate unique works, we analyzed the USTC Latin dataset:

Total Latin editions	533,307
Unique author+title combinations	~150,000
Unique author names	~52,000
Unique title strings	~99,000

The ~150,000 unique author+title pairs are our best proxy for "works," though this still overcounts because:

The same work with variant titles counts multiple times
Spelling and abbreviation differences across editions create spurious distinct pairs
Anonymous works are harder to deduplicate

Conservative estimate: 80,000–120,000 unique Latin works in USTC. At the ~100,000 midpoint, the average work appears in roughly 5 editions, though the distribution is highly skewed: canonical authors have hundreds of editions while most works have just 1–2.

Analysis: Computed from USTC Latin export (December 2025) using author_name + std_title deduplication.
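The deduplication step can be sketched as follows. This is a minimal illustration, not the exact script used: the field names author_name and std_title come from the export described above, but the normalize helper (case folding, accent and punctuation stripping) and the sample rows are assumptions added for illustration.

```python
import unicodedata
from collections import Counter


def normalize(text):
    """Assumed cleanup: lowercase, drop accents and punctuation, collapse spaces."""
    decomposed = unicodedata.normalize("NFKD", text or "")
    no_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = "".join(
        ch if ch.isalnum() or ch.isspace() else " " for ch in no_marks.lower()
    )
    return " ".join(cleaned.split())


def count_pairs(records):
    """Count editions per normalized (author_name, std_title) pair."""
    return Counter(
        (normalize(r.get("author_name")), normalize(r.get("std_title")))
        for r in records
    )


# Hypothetical sample rows, not real USTC data.
sample = [
    {"author_name": "Cicero, Marcus Tullius", "std_title": "De officiis"},
    {"author_name": "Cicero, Marcus Tullius", "std_title": "De Officiis."},
    {"author_name": "Erasmus, Desiderius", "std_title": "Adagia"},
    {"author_name": "", "std_title": "Adagia"},  # anonymous edition stays separate
]

pairs = count_pairs(sample)
unique_pairs = len(pairs)                      # proxy for "works"
editions_per_pair = len(sample) / unique_pairs
print(unique_pairs, round(editions_per_pair, 2))
```

Applied to the reported figures, the same ratio (533,307 editions over roughly 100,000 works) gives about 5.3 editions per work. Note that the anonymous "Adagia" row is not merged with the Erasmus row, which is one way the pair count overstates the number of works.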

Digitization Data

USTC Digital Links (~27% of all editions)

The USTC itself provides the most reliable digitization data. According to their website:

"The USTC hosts links to more than half a million digital scans, currently tagged to some 450,000 editions."

This means ~27% of all catalogued editions (450,000 / 1,650,000) have at least one digital scan available through USTC links. These links point to major digitization projects including:

Google Books — 40+ million books scanned as of 2019
HathiTrust — 19+ million digitized items from 219 research libraries
Bavarian State Library (MDZ) — 10,000+ incunabula, ~300,000 16th–17th century books
Internet Archive — 3.8+ million scanned books
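The ~27% figure is simple division of the two USTC totals quoted above. A short sketch makes the arithmetic explicit; the variable names are ours, and the 500,000 link count is treated as a lower bound because the USTC says "more than half a million."

```python
editions_total = 1_650_000      # editions catalogued, all languages (USTC, 2025)
editions_with_scan = 450_000    # editions tagged with at least one digital link
digital_links = 500_000         # "more than half a million" scans (lower bound)

share_digitized = editions_with_scan / editions_total
links_per_digitized_edition = digital_links / editions_with_scan

print(f"{share_digitized:.1%}")                # 27.3%
print(f"{links_per_digitized_edition:.2f}")    # 1.11
```

Both inputs cover all languages. The USTC does not report a Latin-only count of digitized editions here, so the 27% should not be read as a Latin-specific rate.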

English Translation Data

We compiled counts from major Latin-English scholarly translation series and digital libraries:

Digital Libraries

Source	Texts	Notes
Perseus Digital Library	631	Latin works with translations (Scaife Viewer)
The Philological Museum	~200	British neo-Latin with translations (Dana Sutton)

The Philological Museum

Dana Sutton's Philological Museum at the University of Birmingham is a major open-access resource. It contains critical editions of ~200 British neo-Latin texts (plays, poems, letters, essays) from the 16th–17th centuries, most with facing-page English translations. The associated Analytic Bibliography indexes 79,760 neo-Latin texts freely available online (though most are Latin-only scans without translations).

Renaissance & Neo-Latin

Series	Vols	Notes
I Tatti Renaissance Library	100	Italian Renaissance Latin (2001–)

Classical Latin

Series	Vols	Notes
Loeb Classical Library (Latin)	~158	520+ total (Greek & Latin), 1912–
Aris & Phillips Classical Texts	170+	Greek & Latin, 1979– (~80 Latin)
Penguin Classics (Latin)	114	Latin language filter

Medieval Latin

Series	Vols	Notes
Oxford Medieval Texts	103	Facing-page translations, 1967–
Dumbarton Oaks Medieval Library	~50	Latin subseries, 2010–
Toronto Medieval Latin Texts	37	Pedagogical editions, 1972–
Liverpool Translated Texts for Historians	86	Late Antique/Medieval, includes Greek (~50 Latin)

Patristic & Church Fathers

Series	Vols	Notes
Fathers of the Church (CUA)	147	Latin & Greek, 1947– (~80 Latin)
Ancient Christian Writers (Paulist)	76	Latin & Greek, 1946– (~40 Latin)
Ante-Nicene/Nicene Post-Nicene Fathers	38	Public domain, 1885–1900
Classics of Western Spirituality	130+	Mixed languages, 1978– (~40 Latin)

The Missing Database

A Gap in Research Infrastructure

No comprehensive database exists of which Latin works have been translated into English. The series counts above are our best effort to estimate coverage, but they represent a patchwork, not a systematic catalogue.

Existing resources fall short:

Catalogus Translationum et Commentariorum — tracks Latin translations of Greek classics, not translations from Latin
UNESCO Index Translationum — only covers 1979–present, not historical translations
Publisher series catalogues — scattered across dozens of presses with no aggregation

Building this database is a potential output of this project. Cross-referencing USTC records with translation series catalogues would produce the first comprehensive map of what's accessible.
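A rough sketch of what that cross-referencing could look like is given below. Everything in it is hypothetical: the record layouts, the match_key helper, and the placeholder rows are assumptions for illustration, and no such pipeline exists yet. Only the USTC field names author_name and std_title are taken from our export.

```python
def match_key(author, title):
    """Crude match key: lowercase alphanumerics only (an assumed simplification)."""
    def squash(text):
        return "".join(ch for ch in text.lower() if ch.isalnum())
    return (squash(author), squash(title))


def flag_translated(ustc_works, series_entries):
    """Pair each USTC work with the translation series that list it, if any."""
    index = {}
    for entry in series_entries:
        key = match_key(entry["author"], entry["title"])
        index.setdefault(key, []).append(entry["series"])
    return [
        (work, index.get(match_key(work["author_name"], work["std_title"]), []))
        for work in ustc_works
    ]


# Placeholder rows, invented for illustration only.
ustc_works = [
    {"author_name": "Exemplum, Auctor", "std_title": "De re ficta"},
    {"author_name": "Alter, Scriptor", "std_title": "Opusculum"},
]
series_entries = [
    {"author": "Exemplum, Auctor", "title": "De Re Ficta", "series": "Series X"},
]

flagged = flag_translated(ustc_works, series_entries)
translated = [work for work, series in flagged if series]
print(len(translated), "of", len(flagged), "works matched a translation")
```

Exact-key matching like this would miss variant titles and spellings, so a real implementation would likely need fuzzy matching or shared authority identifiers.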

Our Estimate (Rough)

Named series above (Latin-specific)	~1,000
Perseus Digital Library (unique works)	~400
Other academic presses, dissertations, journals	~500
Very Rough Total	~1,500–2,000

This estimate has significant uncertainty. Without a systematic catalogue, we cannot know the true number.

Translation Coverage of Unique Works

Given our estimate of ~100,000 unique Latin works in USTC and ~1,500–2,000 translated works, translation coverage is approximately 1.5–2% of unique works. The vast majority of Renaissance Latin literature remains untranslated.
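The arithmetic behind these figures can be laid out in a few lines. This is a sketch using only the rough numbers from the table above; the variable names are ours.

```python
components = {
    "named_series_latin": 1_000,
    "perseus_unique_works": 400,
    "other_presses_dissertations_journals": 500,
}
raw_sum = sum(components.values())          # 1,900 before any overlap is removed

translated_low, translated_high = 1_500, 2_000   # reported "very rough total"
unique_works = 100_000                           # midpoint of 80,000-120,000

coverage_low = translated_low / unique_works
coverage_high = translated_high / unique_works

print(raw_sum)                                        # 1900
print(f"{coverage_low:.1%} to {coverage_high:.1%}")   # 1.5% to 2.0%
```

The component sum (1,900) sits near the top of the reported 1,500–2,000 range. The components are not independent (a work can appear in both Perseus and a print series), so the raw sum is best read as an upper figure rather than a point estimate.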

Translation Coverage by Period

Estimating translation coverage requires defining both the corpus size (denominator) and the number of available translations (numerator) for each historical period. This is complicated by the lack of any systematic scholarly survey. The estimates below synthesize available data from standard reference catalogues and translation series counts.

Classical Latin (~80% of major literary works)

Corpus Size

The Packard Humanities Institute (PHI) Latin Texts database contains 362 authors and claims to include "essentially all Latin literary texts written before A.D. 200"—approximately 7.5 million words. This represents the "canon" of Classical Latin literature.

Translation Coverage

For this canonical subset, translation coverage is genuinely high. The Loeb Classical Library provides ~158 Latin volumes, Aris & Phillips has published 170+ volumes (Greek and Latin combined), and major authors (Cicero, Virgil, Ovid, Livy) exist in multiple competing translations.

Important Caveat

The ~80% figure applies only to major literary works. The full Classical Latin corpus is far larger: the Corpus Inscriptionum Latinarum catalogs 180,000+ inscriptions, with fewer than 500 translated online (<1%). Including inscriptions, fragments, technical literature, and documentary papyri, coverage of the complete Classical Latin corpus is perhaps 15–25%.

Patristic Latin / Church Fathers (~35%)

Corpus Size

The Clavis Patrum Latinorum (CPL) lists 2,348 entries for Latin Christian texts from Tertullian to Bede, with each entry often subdividing into dozens of individual sermons, letters, or treatises. The Patrologia Latina comprises 221 volumes totaling over 177,000 pages.

Translation Coverage

Major English translation series have produced approximately 350–400 volumes combined: Fathers of the Church (147), Ancient Christian Writers (76), Ante-Nicene/Nicene-Post-Nicene Fathers (38), Popular Patristics Series (63), and Corpus Christianorum in Translation (49).

Scholarly Estimate

One scholarly assessment states: "I would venture to guess—but a studied guess—that 60–70% of the extant patristic literature has never been translated into English." Taken at face value, that leaves 30–40% translated, which is the basis for the ~35% figure above. Even Augustine, with 132 works totaling 5 million surviving words, is only now receiving his first complete English translation, through the ongoing New City Press project (44–45 of 49 planned volumes as of 2024).

Medieval Latin (~10%)

Corpus Size

The scale of Medieval Latin defies easy quantification. Sharpe's Handlist of Latin Writers of Great Britain and Ireland alone identifies 5,200+ Latin works—and covers only the British Isles. A 2009 study by Buringh and van Zanden estimated that 2.9 million medieval manuscripts survive globally.

Translation Coverage

Major translation series have produced perhaps 250–350 Medieval Latin volumes over six decades, including Oxford Medieval Texts (103), Dumbarton Oaks Medieval Library (~50 Latin volumes), and Toronto Medieval Latin Texts (37).

Scholarly Estimate

A 2023 scholarly assessment from Found in Antiquity states: "I have heard it estimated that over 90–95% of our surviving Latin texts remain untranslated... The percentage of literary texts from Antiquity and even more so from the Middle Ages that have never been translated into any modern language is overwhelming." Even Thomas Aquinas—arguably the best-translated medieval author—has "a substantial amount" of writings still untranslated per the Oxford Handbook of Aquinas.

Renaissance & Early Modern Latin (~2%)

Corpus Size

The Universal Short Title Catalogue now contains 1.65 million editions for 1450–1700. Our analysis identifies ~533,000 Latin editions, representing approximately 80,000–120,000 unique works after deduplication. The Oxford Handbook of Neo-Latin notes that the Neo-Latin corpus "is currently simply unquantifiable" but "dwarfs that of Latin in all other periods combined"—approximately 95% of all extant Latin texts date from the Renaissance onward.

Translation Coverage

Translation resources remain modest: the I Tatti Renaissance Library reached its 100th volume in March 2025; the Collected Works of Erasmus has published 66 of 89 planned volumes for one author's output alone. A realistic count of English translations yields 1,500–2,000 works.

Calculation

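Applying our methodology: 2,000 translations ÷ 100,000 unique works = ~2%. Measured against editions rather than works (~533,000 Latin editions in the USTC), the figure drops to ~0.4%. The ~2% figure is thus the more generous of the two, which makes it a conservative statement of how much remains untranslated.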
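The sensitivity of this calculation to the choice of denominator can be checked with a short sketch. The coverage function and variable names are ours; the inputs are the 2,000-translation figure, the 80,000–120,000 unique-work range, and the 533,307 Latin edition count from the USTC query reported in the summary table.

```python
def coverage(translations, denominator):
    """Share of the corpus with an English translation."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return translations / denominator


translations = 2_000
denominators = {
    "unique works (low)": 80_000,
    "unique works (midpoint)": 100_000,
    "unique works (high)": 120_000,
    "Latin editions": 533_307,
}

results = {name: coverage(translations, n) for name, n in denominators.items()}
for name, share in results.items():
    print(f"{name}: {share:.1%}")
```

Across the plausible range of unique-work counts the result stays between roughly 1.7% and 2.5%. Only switching from works to editions changes the order of magnitude, which is why the choice of unit matters more than the exact work count.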

The Translation Decay Curve

As the volume of surviving Latin text grows from one period to the next, the estimated share available in English translation falls sharply:

Period	Estimated Coverage	Corpus Source
Classical	~80% (literary canon)	PHI Latin Texts
Patristic	~35%	Clavis Patrum Latinorum
Medieval	~10%	Sharpe's Handlist + ms. surveys
Renaissance	~2%	USTC

All figures are order-of-magnitude estimates with substantial uncertainty (±50%). No systematic scholarly survey of translation coverage exists for any period.
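To show what the stated uncertainty implies, the sketch below applies a ±50% band to each point estimate. Reading "±50%" as relative to the point estimate, and capping the upper bound at 100%, are our assumptions; the source table does not specify either.

```python
estimates = {
    "Classical (literary canon)": 80.0,
    "Patristic": 35.0,
    "Medieval": 10.0,
    "Renaissance": 2.0,
}


def band(point, relative_error=0.5):
    """Return (low, high) in percent, assuming a relative error and a 100% cap."""
    low = point * (1 - relative_error)
    high = min(point * (1 + relative_error), 100.0)
    return low, high


bands = {period: band(value) for period, value in estimates.items()}
for period, (low, high) in bands.items():
    print(f"{period}: {low:g}% to {high:g}%")
```

Even with bands this wide, the Renaissance range (1% to 3%) does not overlap the Medieval range (5% to 15%), so the ordering of the periods is more robust than any individual figure.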

OCR and Searchability

Having a digital scan does not mean the text is searchable. OCR (Optical Character Recognition) quality varies dramatically for early printed books.

OCR Accuracy Research

Academic studies on OCR accuracy for historical prints show:

Modern documents: 99%+ character accuracy
19th century prints: 98%+ with general OCR models
Early modern prints (pre-1800): 40%+ error rates with untrained models due to blackletter fonts, abbreviations, and regional type variations
With specialized training: 94–98% accuracy achievable on individual books

Sources: Springmann & Lüdeling (2017), Reul et al. (2017)
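For readers unfamiliar with the metric, character accuracy is commonly computed from the edit distance between an OCR output and a hand-checked reference transcription. The sketch below shows that common definition; it is an assumption that the cited studies use exactly this formula, and the sample strings are invented.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_accuracy(reference, ocr_output):
    """1 minus the character error rate, floored at 0."""
    if not reference:
        raise ValueError("reference transcription must not be empty")
    error_rate = levenshtein(reference, ocr_output) / len(reference)
    return max(0.0, 1 - error_rate)


# Invented example: long s misread as f, a typical early-print OCR error.
reference = "ipsum esse"
ocr_output = "ipfum effe"
accuracy = character_accuracy(reference, ocr_output)
print(f"{accuracy:.0%}")   # 70%
```

Three substitutions in ten characters give 70% accuracy here, which illustrates how a single systematic misreading can push error rates for early modern type far above those for modern print.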

Text Creation Partnership (TCP)

The Text Creation Partnership has produced the gold standard for early modern text transcription:

70,000+ transcribed and encoded texts
1 billion+ searchable words
99.995% accuracy (double-keyed transcription)
~60,000 texts from EEBO-TCP specifically

Source: TCP Website

Critical Limitation

EEBO-TCP focuses on English-language books from the British Isles. Latin works from continental Europe—the vast majority of Latin printing—are not covered by TCP transcription efforts.

Confidence Levels

Not all claims on this site carry equal certainty. We assign explicit confidence levels to help readers assess our estimates:

Claim	Confidence	Basis
533,307 Latin editions in USTC	HIGH	Direct USTC database query
80,000–120,000 unique works	MEDIUM	Author+title deduplication estimate
1,500–2,000 English translations exist	MEDIUM	Aggregated series counts
~2% overall translation rate	MEDIUM	Derived from above estimates
Field-by-field translation rates	LOW-MED	Extrapolated from series focus areas
Specific work counts by field	LOW	No systematic catalog exists

We report translation rates rather than work counts on the homepage because the numerator (translated works) is more tractable than the denominator (total works by field).

Summary of Key Findings

Metric	Value	Source
Total USTC editions (all languages)	1.65 million	USTC website (2025)
Latin editions (USTC)	533,307	USTC query (Dec 2025)
Unique author+title pairs (Latin)	~150,000	Our deduplication analysis
Estimated unique Latin works	80,000–120,000	Conservative estimate after variant titles
Editions with digital scans (all languages)	~450,000 (27%)	USTC digital links
High-quality transcriptions (English texts only)	~60,000	EEBO-TCP
Latin works with English translations	1,500–2,000	Aggregated series counts
Translation coverage (of unique works)	~1.5–2%	Derived (translations ÷ unique works)

References

Universal Short Title Catalogue. About Page. University of St Andrews.
"Celebrating 30 years of USTC." St Andrews Staff News. October 2025. (1.65 million editions, 7 million copies, 10,000+ institutions)
Perseus Digital Library. Scaife Viewer. Tufts University. 631 Latin works.
Text Creation Partnership. EEBO-TCP. University of Michigan / ProQuest.
Springmann, U. & Lüdeling, A. (2017). "OCR of historical printings with an application to building diachronic corpora." Digital Humanities Quarterly 11(2).
I Tatti Renaissance Library. Harvard University Press. 100 volumes as of March 2025.
Loeb Classical Library. Harvard University Press. ~520 volumes total (Greek and Latin).
Oxford Medieval Texts. Oxford University Press. 103 volumes.
Fathers of the Church. Catholic University of America Press. 147 volumes.
Aris & Phillips Classical Texts. Liverpool University Press. 170+ volumes.
Sutton, Dana F. (ed.). The Philological Museum. University of Birmingham. ~200 British neo-Latin texts with translations.

Corpus Catalogues

Packard Humanities Institute. PHI Latin Texts. 362 authors, ~7.5 million words of Classical Latin.
Dekkers, E. & Gaar, A. (eds.). Clavis Patrum Latinorum. Brepols. 2,348 entries for Latin Christian texts (Tertullian to Bede).
Corpus Inscriptionum Latinarum. Berlin-Brandenburgische Akademie der Wissenschaften. 180,000+ Latin inscriptions.
Sharpe, R. (1997). A Handlist of the Latin Writers of Great Britain and Ireland before 1540. Pontifical Institute of Mediaeval Studies. 5,200+ works (British Isles only).
Knight, S. & Tilg, S. (eds.). (2015). The Oxford Handbook of Neo-Latin. Oxford University Press.
Buringh, E. & van Zanden, J.L. (2009). "Charting the 'Rise of the West': Manuscripts and Printed Books in Europe, A Long-Term Perspective." Journal of Economic History 69(2), 409–445. (2.9 million surviving medieval manuscripts estimated)

Scholarly Estimates on Translation Coverage

Found in Antiquity. (2023). "On Translation Coverage of Latin Texts". (Estimate: 90–95% of surviving Latin texts remain untranslated)
Patristics scholars estimate 60–70% of extant patristic literature has never been translated into English. (Scholarly consensus cited in multiple sources)
Stump, E. & Kretzmann, N. (eds.). (2001). The Cambridge Companion to Augustine. Cambridge University Press. (Notes on Augustine's 132 works, 5 million words)
Davies, B. & Stump, E. (eds.). (2012). The Oxford Handbook of Aquinas. Oxford University Press. (Notes on untranslated Aquinas writings)
