Bangla PDF Speaker: A Complete Computer Application to Convert Bangla PDF to Speech
Published in: | 2021 International Conference on Automation, Control and Mechatronics for Industry 4.0 (ACMI) pp. 1 - 5 |
---|---|
Main Authors: | , |
Format: | Conference Proceeding |
Language: | English |
Published: | IEEE 08-07-2021 |
Subjects: | |
Summary: | This paper presents a complete computer application that converts Bangla PDF documents to Bangla speech. In the proposed technique, images are extracted from the PDF, processed, and sent to an OCR engine to extract text. The extracted text is then normalized and sent to a text-to-speech (TTS) engine to generate speech. Image processing is a key component of the application, as it greatly increases the efficiency of the OCR engine. We propose a novel threshold selection method that detects the type of noise in the extracted image and selects the threshold for binary transformation accordingly. This solves the problem of choosing an appropriate threshold for different images and increases the overall accuracy and efficiency of the application. Text normalization is another feature that improves performance: normalizing the text extracted by the OCR engine lets the TTS engine pronounce it more accurately according to context. Finally, we present experimental results showing 80.804% accuracy on text extraction from the PDF file and a score of 3.92 (out of 5) for the generated speech in human evaluation. |
---|---|
DOI: | 10.1109/ACMI53878.2021.9528221 |
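
The summary says that each extracted page image is converted to binary with a per-image threshold before OCR. The record does not describe the paper's noise-aware selection method, so the sketch below substitutes Otsu's method, a standard global threshold chosen from the grey-level histogram, purely to illustrate what "selecting a threshold for binary transformation" can look like. The function names `otsu_threshold` and `binarize` are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def otsu_threshold(pixels: Sequence[int]) -> int:
    """Return the grey level t (0-255) that maximises between-class variance.

    Pixels with value <= t form the dark class; the rest form the light class.
    NOTE: this is Otsu's method, used here as a stand-in. It is NOT the
    noise-aware threshold selection proposed in the paper.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # pixel count of the dark class so far
    sum0 = 0  # sum of grey levels in the dark class so far
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue  # dark class is still empty
        w1 = total - w0
        if w1 == 0:
            break  # light class is empty; no further split is possible
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        # Between-class variance, up to a constant factor of 1 / total**2.
        var_between = w0 * w1 * (mean0 - mean1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(image: List[List[int]], threshold: int) -> List[List[int]]:
    """Map each pixel to 255 (white) if it exceeds the threshold, else 0."""
    return [[255 if p > threshold else 0 for p in row] for row in image]


if __name__ == "__main__":
    page = [[10, 200], [200, 10]]  # toy 2x2 greyscale "page"
    t = otsu_threshold([p for row in page for p in row])
    print(t, binarize(page, t))
```

In the pipeline the summary describes, the binarized image would then go to the OCR engine, and the recognised text would be normalized before being passed to the TTS engine.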