Creator(s)
Gabriela Elgarrista, Frédérique Mélanie-Becquet (LATTICE CNRS), Carmen Brando (EHESS)
Language(s)
Centuries
19th, 20th
CER on Validation Set
0.3%
Size (Nr. of Words)
72,674
Model ID
48293
HTR+ model trained on ground truth data produced by transcribing and manually segmenting a sample of 169 pages from these old address books, in particular the 1898 and 1923 volumes. The model is suitable for automatically transcribing the 1903 and 1913 volumes, as well as any other document printed in two columns in the Latin alphabet, especially in French. The sample was selected alphabetically, because that is how the information in this document is organised. The brackets present in the document were not segmented. The sample is split into 118 pages for training and 51 pages for validation.
You can use this model to automatically transcribe handwritten documents with Handwritten Text Recognition in Transkribus.
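For readers unfamiliar with the CER figure listed above, the character error rate is commonly defined as the character-level edit distance between a reference transcription and the model output, divided by the length of the reference. The sketch below illustrates that common definition in Python. It is not taken from Transkribus, whose exact computation (for example, its handling of whitespace or normalisation) may differ, and the function names `levenshtein` and `cer` as well as the sample strings are hypothetical.

```python
def levenshtein(reference: str, hypothesis: str) -> int:
    """Minimum number of character insertions, deletions and
    substitutions needed to turn `hypothesis` into `reference`."""
    # previous[j] holds the distance between the reference prefix
    # processed so far and the first j characters of the hypothesis.
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution_cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,                      # deletion
                current[j - 1] + 1,                   # insertion
                previous[j - 1] + substitution_cost,  # match or substitution
            ))
        previous = current
    return previous[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: edit distance divided by reference length."""
    if not reference:
        raise ValueError("reference transcription must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Hypothetical address-book line; the output differs by one character.
    ground_truth = "Dupont, rue de Rivoli 12"
    model_output = "Dupont, rue de Rivoll 12"
    print(f"CER: {cer(ground_truth, model_output):.2%}")
```

Under this definition, a CER of 0.3% corresponds to roughly three character errors per thousand reference characters on the validation pages.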