Bringing the old writings closer to us: Deep learning and symbolic methods in deciphering old Cyrillic Romanian documents
Published in: | Memoirs of the Scientific Sections of the Romanian Academy Vol. XLVI; pp. 87 - 125 |
---|---|
Main Authors: | , , , , , , , |
Format: | Journal Article |
Language: | English |
Published: | Publishing House of the Romanian Academy, 01-11-2023 |
Summary: | The paper addresses the problem of transliterating scanned copies of old Romanian books from the Cyrillic script into the Latin script. The motivation for this endeavor and the intended beneficiaries of such a technology are enumerated. A number of peculiarities of these documents that make automatic processing difficult are then illustrated with examples. The proposed technology is presented as a pipeline of modules, each applying AI or symbolic methods. The components are then discussed individually, and solutions are presented. The research is presented as work in progress, leaving room for further enhancements. The data used to train and evaluate the modules originates from the earlier DeLORo project. |
---|---|
ISSN: | 1224-1407 2343-7049 |