Transkribus
The Center implemented Transkribus in the Summer of 2025, together with staff from our Partner institutions, for various pilot projects. The new technology is also being incorporated into reference and research requests in the Lillian Goldman Reading Room, Ackman & Ziff Genealogy Institute, and throughout the Center. This is truly a new endeavor where the Center community will need to learn as a group how to most effectively use Transkribus. To contact the Transkribus User Group at the Center to ask questions, seek advice, or share successes, feel free to email the distribution list.
Collectively, the Center has an Epoch Plan subscription, which gives the Center community access to 15 seats (project log-ins that can leverage language super models), 60,000 credits toward transcription projects, and 1TB of file storage.
To log into Transkribus, please visit https://app.transkribus.org.
Overview
Transkribus harnesses artificial intelligence to help decipher digitized handwritten and printed historical texts. Transkribus employs a credit-based system for analyzing the structure, layout, and handwriting found in uploaded images.
For most projects using Transkribus, whether utilizing a language model or training one of your own, these are the basic workflow steps:
- Uploading an image or PDF file
- Recognition
- Editing the transcribed text
- Sharing or exporting a file
Each step in the workflow presents options for maximizing the accuracy of the transcribed text. With options comes troubleshooting; consult the Help Center for additional assistance.
Just as with the early days of digitization at the Center, best practices will emerge for undertaking new Transkribus projects and integrating newly transcribed and newly discovered information into our other shared library systems. Together, we will discuss, record, and disseminate these recommendations within the wiki.
Credit Usage
Recognition type: credit consumption
- Handwritten Text + Lines: 1 credit
- Printed Text + Lines: 0.5 credits
- Lines Recognition: 0.25 credits
- Tables Recognition: 1 credit
- Fields Recognition: 1 credit
Credits can only be allocated by an administrator. Please contact Metadata & Discovery Services to discuss your project, coordinate an invite for a project seat, or to allocate credits for working in Transkribus.
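Before contacting an administrator, it can help to estimate how many credits a project will need. The following is a minimal sketch in Python, not an official Transkribus tool. It assumes the rates listed above are charged per page, and the function name, dictionary keys, and page counts are hypothetical placeholders chosen for illustration.

```python
# Rates copied from the Credit Usage list above.
# Assumption: each rate applies per page processed.
CREDIT_RATES = {
    "handwritten": 1.0,   # Handwritten Text + Lines
    "printed": 0.5,       # Printed Text + Lines
    "lines": 0.25,        # Lines Recognition
    "tables": 1.0,        # Tables Recognition
    "fields": 1.0,        # Fields Recognition
}

# Total credits in the Center's subscription, as stated on this page.
SUBSCRIPTION_CREDITS = 60_000


def estimate_credits(pages_by_type):
    """Return the estimated credits for a mapping of recognition type -> page count."""
    total = 0.0
    for recognition_type, pages in pages_by_type.items():
        if recognition_type not in CREDIT_RATES:
            raise ValueError(f"Unknown recognition type: {recognition_type}")
        total += CREDIT_RATES[recognition_type] * pages
    return total


# Hypothetical project: 1,200 handwritten pages and 800 printed pages.
project = {"handwritten": 1200, "printed": 800}
needed = estimate_credits(project)
share = needed / SUBSCRIPTION_CREDITS

print(f"Estimated credits: {needed}")
print(f"Share of the shared subscription: {share:.1%}")
```

Including an estimate like this in a request gives Metadata & Discovery Services a concrete figure to weigh against the shared pool of credits.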
Resources
Read-Coop, the cooperative that created Transkribus, offers many recorded webinars and tutorials on using the tool. The Transkribus Help Center also offers extensive documentation and a search bar for troubleshooting.
Starting to use Transkribus
The Transkribus team offers a YouTube playlist for learning how to use Transkribus; see Getting Started with Transkribus.
More Advanced Webinars
- Transkribus Table Models Webinar (English)
- Using Public AI Models with Transkribus Webinar (English)
- Training a custom Transkribus model
- Expert Text Recognition Model Training with Transkribus (English Webinar)
- Publishing with Transkribus Sites Webinar (English)
- Baseline Models & Complex Layouts Webinar (English)
Past User Conferences
Other pilot projects and use cases
- Historical Archives of the Supreme Court of Louisiana: Transkribus
- Transkribus as an Institutional Service
- From Digitization and Images to Text and Content: Transkribus as a Case Study
- Transforming Scholarship in the Archives Through Handwritten Text Recognition: Transkribus as a Case Study
- Make a complete collection accessible with Transkribus: A best-practice example from the Tyrolean State Archives
- Automated Transcription of Non-Latin Script Periodicals: A Case Study in the Ottoman Turkish Print Archive
More information on Language Models and Super Models in Transkribus
Selected List of Available Large Language Super Models
Depending on the scope and desired outcome of a Transkribus project, using an existing language super model may be easier than training a custom model.
- The Text Titan I (GER, DUT, FRE, FIN, ENG, SWE)
- Dutch Dean (DUT)
- Dansk Dokumentalist (DAN)
- German Genius (GER)
- Polski Bizon (POL)
- English Elder (ENG)
- Faucon Français (FRE)
- Spanish Sage (SPA)
A complete list of language models is available here.
To request exported files
Transkribus project users should primarily utilize human-legible JPG files (image/jpeg) for work within the system. A smaller file size speeds file uploads and minimizes shared storage space on Transkribus servers. Metadata & Discovery Services can assist with exporting Rosetta files and converting large preservation master files (usually TIFF files) into more lightweight JPG files.
Please use the CJH Help Desk to contact Metadata & Discovery Services with a list of the IEs (intellectual entities in Rosetta) that you need for your project. The files will be exported from Rosetta, reformatted to JPG, and placed in a mapped SFTP location for you to retrieve. Please note provenance information for your future reference, such as the original collection, location, and filename of the exported materials.
Upload the requested files to Transkribus or move them to a local desktop to work on your project. Files will be deleted from the SFTP server after 30 days.
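One way to keep the provenance information mentioned above is a small CSV manifest saved alongside the retrieved files. The sketch below is only an illustration in Python, not a Center or Transkribus tool. The column names, the function name, and the sample record are hypothetical placeholders; the 30-day retention period comes from the text above.

```python
import csv
import io
from datetime import date, timedelta
from pathlib import Path

# From the text above: files are deleted from the SFTP server after 30 days.
SFTP_RETENTION_DAYS = 30

# Hypothetical column names for a provenance manifest.
FIELDS = ["collection", "location", "original_filename", "jpg_filename", "retrieve_by"]


def build_manifest(records, export_date):
    """Return CSV text with one row per exported file and a retrieve-by date."""
    retrieve_by = export_date + timedelta(days=SFTP_RETENTION_DAYS)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for record in records:
        original = record["original_filename"]
        writer.writerow({
            "collection": record["collection"],
            "location": record["location"],
            "original_filename": original,
            # Assumes the converted file keeps the same base name with a .jpg extension.
            "jpg_filename": Path(original).with_suffix(".jpg").name,
            "retrieve_by": retrieve_by.isoformat(),
        })
    return buffer.getvalue()


# Placeholder record for illustration only; not a real collection or file.
sample = [{
    "collection": "Example Collection",
    "location": "Box 1, Folder 2",
    "original_filename": "example_0001.tif",
}]

manifest = build_manifest(sample, export_date=date(2025, 7, 1))
print(manifest)
```

Writing the returned text to a file in the same folder as the JPGs keeps the record with the images after the SFTP copies are deleted.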