CNN-optimized text recognition with binary embeddings for Arabic expiry date recognition
Recognizing Arabic dot-matrix digits is a challenging problem due to the unique characteristics of dot-matrix fonts, such as irregular dot spacing and varying dot sizes. This paper presents an approach for recognizing Arabic digits printed in dot matrix format. The proposed model is based on convolu...
Saved in:
Published in: | Journal of Electrical Systems and Information Technology Vol. 11; no. 1; pp. 11 - 18 |
---|---|
Main Authors: | , |
Format: | Journal Article |
Language: | English |
Published: |
Berlin/Heidelberg
Springer Berlin Heidelberg
01-12-2024
Springer Nature B.V SpringerOpen |
Subjects: | |
Online Access: | Get full text |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Summary: | Recognizing Arabic dot-matrix digits is a challenging problem due to the unique characteristics of dot-matrix fonts, such as irregular dot spacing and varying dot sizes. This paper presents an approach for recognizing Arabic digits printed in dot matrix format. The proposed model is based on convolutional neural networks (CNN) that take the dot matrix as input and generate embeddings that are rounded to generate binary representations of the digits. The binary embeddings are then used to perform Optical Character Recognition (OCR) on the date images. To overcome the challenge of the limited availability of dotted Arabic expiration date images, we developed a True Type Font (TTF) for generating synthetic images of Arabic dot-matrix characters. The model was trained on a synthetic dataset of 3287 images and 658 synthetic images for testing, representing realistic expiration dates from 2019 to 2027 in the format of
yyyy/mm/dd
and
yy/mm/dd
. Our model achieved an accuracy of 98
.
94% on the expiry date recognition with Arabic dot matrix format using fewer parameters and less computational resources than traditional CNN-based models. By investigating and presenting our findings comprehensively, we aim to contribute substantially to the field of OCR and pave the way for advancements in Arabic dot-matrix character recognition. Our proposed approach is not limited to Arabic dot matrix digit recognition but can be also extended to text recognition tasks, such as text classification and sentiment analysis. |
---|---|
ISSN: | 2314-7172 2314-7172 |
DOI: | 10.1186/s43067-024-00136-2 |