Transkribus
Ethical Guidelines examples for Use of Artificial Intelligence in Archives
- National Library of Scotland: https://data.nls.uk/about/ai-statement/
- Ethics of Artificial Intelligence: https://www.unesco.org/en/artificial-intelligence/recommendation-ethics
Revision as of 18:41, 10 June 2025
The Center implemented Transkribus in the summer of 2025 for various pilot projects, working together with staff from our Partner institutions. The new technology is also being incorporated into reference and research requests in the Lillian Goldman Reading Room, the Ackman & Ziff Genealogy Institute, and throughout the Center.
Collectively, the Center has an Epoch Plan subscription, which gives the Center community access to 15 seats (projects that can leverage language super models), 60,000 credits toward transcription projects, and 1 TB of file storage.
To log in to Transkribus, please visit https://app.transkribus.org
Overview
Transkribus harnesses artificial intelligence to help decipher digitized handwritten and printed historical texts.
For almost any project using Transkribus, whether it uses an existing language model or trains one of your own, the basic workflow steps are:
- Uploading an image or PDF file
- Running recognition
- Editing the transcribed text
- Sharing or exporting a file
Each step within the workflow presents options for maximizing the accuracy of the transcribed text. With options comes troubleshooting; consult the Help Center for additional assistance.
Credit Usage
Recognition Type: Credit Consumption
- Handwritten Text + Lines: 1 credit
- Printed Text + Lines: 0.5 credits
- Lines Recognition: 0.25 credits
- Tables Recognition: 1 credit
- Fields Recognition: 1 credit
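The credit rates listed above can be turned into a rough budget check before starting a project. The following is a minimal sketch, not an official Transkribus tool: it assumes each rate is charged per page (the list does not state the unit), and the dictionary keys and the function name `estimate_credits` are hypothetical names chosen for illustration.

```python
from decimal import Decimal

# Credit rates copied from the list above. Treating each rate as a
# per-page cost is an assumption; the list does not state the unit.
CREDIT_RATES = {
    "handwritten_text_lines": Decimal("1"),
    "printed_text_lines": Decimal("0.5"),
    "lines": Decimal("0.25"),
    "tables": Decimal("1"),
    "fields": Decimal("1"),
}

# The Center's shared credit allotment, as stated earlier on this page.
SUBSCRIPTION_CREDITS = Decimal("60000")


def estimate_credits(page_counts):
    """Return the total credits for a mapping of recognition type -> pages."""
    total = Decimal("0")
    for recognition_type, pages in page_counts.items():
        if recognition_type not in CREDIT_RATES:
            raise ValueError(f"Unknown recognition type: {recognition_type}")
        if pages < 0:
            raise ValueError("Page counts cannot be negative")
        total += CREDIT_RATES[recognition_type] * pages
    return total


if __name__ == "__main__":
    # Hypothetical project: 1,200 handwritten pages and 3,000 printed pages.
    plan = {"handwritten_text_lines": 1200, "printed_text_lines": 3000}
    needed = estimate_credits(plan)
    print(f"Estimated credits: {needed}")
    print(f"Share of the Center's allotment: {needed / SUBSCRIPTION_CREDITS:.1%}")
```

Because the credits are shared across the Center community, an estimate like this is most useful for comparing projects against each other rather than as an exact forecast.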
Resources
Read-Coop, the cooperative that created Transkribus, offers many recorded webinars and tutorials on using the artificial intelligence tool. There is also the Transkribus Help Center, which offers extensive documentation and a search bar for troubleshooting.
Starting to use Transkribus
The Transkribus team offers a YouTube playlist that will help with learning how to use Transkribus AI; please see Getting Started with Transkribus.
More Advanced Webinars
- Transkribus Table Models Webinar (English)
- Using Public AI Models with Transkribus Webinar (English)
- Training a custom Transkribus model
- Expert Text Recognition Model Training with Transkribus (English Webinar)
- Publishing with Transkribus Sites Webinar (English)
- Baseline Models & Complex Layouts Webinar (English)
Past User Conferences
Other pilot projects and use cases
- Historical Archives of the Supreme Court of Louisiana: Transkribus
- Transkribus as an Institutional Service
- From Digitization and Images to Text and Content: Transkribus as a Case Study
- Transforming Scholarship in the Archives Through Handwritten Text Recognition: Transkribus as a Case Study
- Make a complete collection accessible with Transkribus: A best-practice example from the Tyrolean State Archives
- Automated Transcription of Non-Latin Script Periodicals: A Case Study in the Ottoman Turkish Print Archive
More information on Language Models and Super Models in Transkribus
Selected List of Available Large Language Super Models
Depending on the scope and desired outcome of a Transkribus project, using an existing language super model may be easier than training a model of your own.
- The Text Titan I (GER, DUT, FRE, FIN, ENG, SWE)
- Dutch Dean (DUT)
- Dansk Dokumentalist (DAN)
- German Genius (GER)
- Polski Bizon (POL)
- English Elder (ENG)
- Faucon Français (FRE)
- Spanish Sage (SPA)
A complete list of language models is available here.
To request exported files
Transkribus project users should primarily use human-legible JPG files (image/jpeg) for work within the system. A smaller file size speeds file uploads and reduces the use of shared storage space on Transkribus servers. Metadata & Discovery Services can assist with exporting Rosetta files and converting large preservation master files (usually TIFF files) into more lightweight JPG files. Please contact the department with a list of the IEs (intellectual entities) that you need for your project. The files will be exported from Rosetta, reformatted to JPG, and placed in a mapped SFTP location for you to retrieve.
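Once the files have been retrieved, it can be worth confirming that everything in the folder really is a JPG, and how much storage it will occupy, before uploading. The sketch below is one possible approach and is not part of the department's export process. It identifies JPG files by the standard three-byte signature at the start of the file; the function names `is_jpeg` and `summarize_folder` and the folder path are hypothetical.

```python
from pathlib import Path

# Every JPEG file begins with these three bytes.
JPEG_SIGNATURE = b"\xff\xd8\xff"


def is_jpeg(path):
    """Return True if the file starts with the JPEG signature."""
    with open(path, "rb") as handle:
        return handle.read(3) == JPEG_SIGNATURE


def summarize_folder(folder):
    """Return (jpeg_count, total_jpeg_bytes, non_jpeg_names) for a folder."""
    jpeg_count = 0
    total_bytes = 0
    non_jpeg_names = []
    for entry in sorted(Path(folder).iterdir()):
        if not entry.is_file():
            continue  # skip subfolders
        if is_jpeg(entry):
            jpeg_count += 1
            total_bytes += entry.stat().st_size
        else:
            non_jpeg_names.append(entry.name)
    return jpeg_count, total_bytes, non_jpeg_names


if __name__ == "__main__":
    # Hypothetical local copy of the retrieved files.
    count, size, others = summarize_folder("retrieved_files")
    print(f"{count} JPG files, {size / 1_000_000:.1f} MB in total")
    if others:
        print("Not JPG (check before uploading):", ", ".join(others))
```

Checking the file signature rather than the file extension catches files that were renamed without being converted, such as a TIFF saved with a .jpg extension.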