Printed Latin and Greek (also German, English, Italian) 15th-19th century | PyLaia

Model details

Creator(s)

NOSCEMUS project

Language(s)

German, Greek Ancient (to 1453)

Centuries

15th-19th

CER on Validation Set

0.8%

Size (Nr. of Words)

667,127

Model ID

52640

About this Model

The “NOSCEMUS General Model” is tailored towards recognizing Latin prints from the early modern period. Although the model is designed to recognize Latin prints set in Antiqua-based typefaces, it is also capable of recognizing passages in Greek and passages set in (German) Fraktur.

In creating the Ground Truth, the following transcription guidelines were followed:
– ligatures (e.g. Æ or æ, Œ or œ) and standard abbreviations (e.g. -que, -us, -tur, …mm…, …nn…) have been expanded
– long s (ſ) was transcribed as a normal s
– small caps were transcribed as majuscules
– special characters and diacritics (e.g. &, ë, ï or ę) were kept
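
If you prepare your own Ground Truth for fine-tuning or evaluation, the character-level parts of these guidelines can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `normalize_transcription` and the `LIGATURES` table are hypothetical names, not part of Transkribus or the NOSCEMUS tooling, and the choice to expand Æ to "AE" (rather than "Ae") is a guess. Abbreviation expansion and small caps depend on the page image and context, so they are not covered here.

```python
# Hypothetical helper illustrating the character-level guidelines above.
# It is NOT part of Transkribus or the NOSCEMUS project tooling.

# Assumed expansions; the guidelines say ligatures were expanded but do not
# state the capitalization used for the upper-case forms.
LIGATURES = {"Æ": "AE", "æ": "ae", "Œ": "OE", "œ": "oe"}
LONG_S = "\u017f"  # the long s character (ſ)


def normalize_transcription(text: str) -> str:
    """Expand ligatures and replace long s; leave everything else as-is."""
    for ligature, expansion in LIGATURES.items():
        text = text.replace(ligature, expansion)
    # Long s becomes a normal s. Diacritics and '&' are deliberately
    # untouched, because the guidelines keep them.
    return text.replace(LONG_S, "s")


if __name__ == "__main__":
    print(normalize_transcription("Cæ\u017far & poëta"))  # prints "Caesar & poëta"
```

Applying the same normalization to reference transcriptions before comparing them with the model output avoids counting guideline differences (for example ſ versus s) as recognition errors.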

The model was released by Stefan Zathammer and is based on training data from the Digital Sourcebook of the NOSCEMUS project (https://transkribus.eu/r/noscemus/#).

If you use the Noscemus model as a base model for your own model, or if your edition is based on a transcription made with the help of the Noscemus model, you are kindly requested to mention the Noscemus model.

The NOSCEMUS project (https://www.uibk.ac.at/projects/noscemus) has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No. 741374).

Try it out

Printed Latin and Greek (also German, English, Italian) 15th-19th century | PyLaia is freely available to everyone

You can use this model to automatically transcribe printed documents with Handwritten Text Recognition in Transkribus.