This model is trained to recognize the Gothic typefaces used in Castilian incunabula. It is based on twenty books printed in the workshops of Stanislao Polono & Meinardo Ungut (Seville: RGP, CNB, JOS, CLS, SPO), Fadrique de Basel (Burgos: LES, AYL, C87, AXP), Pablo Hurus (Zaragoza: ERI, APL, SVH, ACM, VTS, LIM), Cuatro Compañeros (Seville: CAR), Juan de Burgos (Burgos: AUG, BMP), Pedro Hagenbach (Toledo: CUR), and Guillén de Brocar (Pamplona: GEN), all printed between 1487 and 1499. For the correspondence between these acronyms (developed by the Hispanic Seminary of Medieval Studies) and the actual works and copies used, see Fradejas Rueda & Cossío Olavide (2025).
The samples from these twenty editions include ten folios (verso-recto) per copy, transcribed according to the HSMS transcription system. This means that all abbreviations are expanded and enclosed between < > signs, and superscript letters are followed by a grave accent. The model was trained for 207 epochs on a dataset of 180,152 words and 24,061 lines (mostly from 200 double-column folio openings). The training CER is 0.2%, with a validation CER of 0.77% and a WER of 3.21%.
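For readers unfamiliar with the two metrics, the sketch below shows the textbook definitions of character error rate (CER) and word error rate (WER) as an edit distance divided by the length of the reference. It is only an illustration: the function names are hypothetical, the sample line is invented, and Transkribus may normalize text or aggregate over lines differently when it reports the figures above.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of words)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate: character edits divided by reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


def wer(reference, hypothesis):
    """Word error rate: word edits divided by number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must not be empty")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


# Invented sample line using the < > convention for an expanded abbreviation.
reference = "q<ue> el rey"
hypothesis = "q<ue> el roy"  # one wrong character in the last word

print(f"CER: {cer(reference, hypothesis):.2%}")
print(f"WER: {wer(reference, hypothesis):.2%}")
```

A single wrong character makes the whole word count as an error, which is why WER is typically several times higher than CER, as it is for this model.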
Developed within the project 7PartidasDigital (PID2020-112621GB-I00; funded by MCIN/AEI/10.13039/501100011033) by José Manuel Fradejas Rueda.
You can use this model to automatically transcribe handwritten documents with Handwritten Text Recognition in Transkribus.