Tools for Developing OCRs for Indian Scripts

Bibliographic Details
Published in: 2003 Conference on Computer Vision and Pattern Recognition Workshop Vol. 3; p. 33
Main Authors: Kumar, M N S S K Pavan, Kiran, S S Ravi, Nayani, Abhishek, Jawahar, C V, Narayanan, P J
Format: Conference Proceeding
Language: English
Published: IEEE 01-06-2003
Description
Summary: Development of OCRs for Indian scripts is an active area of research today. Indian scripts present great challenges to an OCR designer due to the large number of letters in the alphabet, the sophisticated ways in which they combine, and the complicated graphemes that result. The problem is compounded by the unstructured manner in which popular fonts are designed. At the same time, the different Indian scripts share a great deal of common structure. In this paper, we argue that a number of automatic and semi-automatic tools can ease the development of recognizers for new font styles and new scripts. We briefly discuss three such tools that we developed and show how they have helped build new OCRs. An integrated approach to the design of OCRs for all Indian scripts has great benefits. We are building OCRs for many Indian languages following this approach, as part of a system that provides tools for creating content in these languages.
ISBN: 0769519008, 9780769519005
ISSN: 1063-6919
DOI: 10.1109/CVPRW.2003.10023