DEK German Combined

Model details

Creator(s)

Achim Rabus, Martin Meindl & Milanka Matić-Chalkitis (MultiHTR project)

Language(s)

Centuries

CER on Validation Set

9.5%

Size (Nr. of Words)

144,709

Model ID

47882

About this Model

This is the first version of a combined model for the Deutsche Einheitskurzschrift (DEK), trained on both natural and synthetic data. The natural ground-truth (GT) data consists of several diaries written by a private individual and was kindly provided by the German Diary Archive (DTA) (https://tagebucharchiv.de/). Special thanks go to the director of the DTA, Marlene Kayen. The synthetic training data (electronically available longhand texts converted into German standard shorthand) consists of Goethe's “Faust” (https://jens-wawrczeck.de/stenogenerator/goethe/Faust%201%20(Goethe)%20-%20A4%20oL.pdf and https://www.projekt-gutenberg.org/goethe/faust1/) and Grimm's fairy tales. The model was trained by Achim Rabus. Martin Meindl and Milanka Matić-Chalkitis also contributed to the creation of this model as part of the MultiHTR project at the Department of Slavic Languages and Literatures of the University of Freiburg (Germany). The model is suitable for transcribing natural manuscripts written in DEK. It may also be useful as a base model for other German shorthand systems.
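For readers unfamiliar with the metric listed in the model details, the character error rate (CER) is conventionally the Levenshtein edit distance between the recognized text and the ground truth, divided by the length of the ground truth. The following is a minimal sketch of that convention, not the evaluation code used by Transkribus; the function names `edit_distance` and `cer` are hypothetical.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance: minimum number of character insertions,
    deletions, and substitutions needed to turn hypothesis into reference."""
    # prev[j] holds the distance between the reference prefix processed
    # so far and the first j characters of the hypothesis.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution_cost = 0 if ref_char == hyp_char else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate relative to the ground-truth length."""
    if not reference:
        raise ValueError("reference (ground truth) must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)
```

Under this definition, a CER of 9.5% corresponds to roughly one character edit for every ten or eleven ground-truth characters.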

Try it out

DEK German Combined is freely available to everyone

You can use this model to automatically transcribe handwritten documents with Handwritten Text Recognition in Transkribus.