Creator(s)
Polina Staroverova, Alena Kuznetsova, Natalia Perkova, Dmitri Sichinava
Language(s)
Russian
Centuries
CER on Validation Set
2.4%
Size (Nr. of Words)
23,291
Model ID
48282
This model was trained as a student project in the “Digital Humanities” master’s program between November 2021 and January 2022.
The text corpus for the model consists of books published after Peter I's reform of Russian orthography by the following printing houses:
the printing house of the Academy of Sciences in St. Petersburg, that of the Imperial Moscow University, that of Vilkovsky and Galchenko, and that of the Land Cadet Corps. The corpus also includes some decrees printed in civil script.
The training sources are books scanned by Rusneb (https://rusneb.ru/) and by Google Books.
The model shows good results on Russian-language material, but it does not recognize the other languages that can occur in texts of this period.
You can use this model to automatically transcribe handwritten documents with Handwritten Text Recognition in Transkribus.
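
For readers who want to compare their own results against the 2.4% CER reported for the validation set, the following is a minimal sketch of how a character error rate is commonly computed: the Levenshtein (edit) distance between the reference transcription and the model output, divided by the length of the reference. This is an assumption about the general metric, not a description of how Transkribus computes its figure, and the function names and sample strings are hypothetical.

```python
def levenshtein(reference: str, hypothesis: str) -> int:
    """Minimum number of character insertions, deletions and
    substitutions needed to turn `hypothesis` into `reference`."""
    # previous[j] holds the distance between the reference prefix
    # processed so far and the first j characters of the hypothesis.
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution_cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,                      # deletion
                current[j - 1] + 1,                   # insertion
                previous[j - 1] + substitution_cost,  # match or substitution
            ))
        previous = current
    return previous[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance normalised by the reference length."""
    if not reference:
        raise ValueError("reference transcription must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


# Hypothetical example: one wrong final letter in a five-letter word.
ground_truth = "указъ"
model_output = "указь"
print(f"CER: {character_error_rate(ground_truth, model_output):.1%}")
```

Under this definition, a CER of 2.4% corresponds to roughly 24 character-level errors per 1,000 characters of reference text on material similar to the validation set.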