Handwritten Document Offline Text Line Segmentation

Bibliographic Details
Published in: Digital Image Computing: Techniques and Applications (DICTA'05), p. 27
Main Authors: Weliwitage, C., Harvey, A.L., Jennings, A.B.
Format: Conference Proceeding
Language: English
Published: IEEE 01-12-2005
Description
Summary: This paper describes cut text minimization (CTM), a method for segmenting handwritten text into text lines. Results are given for applying the CTM method to NIST database examples of fifty-two-word handwritten paragraphs from the American Constitution. The method first uses a modified projection method to obtain starting points. An optimisation technique then varies the cutting angle and start location to minimize the number of text pixels cut while tracking between the text lines. The method also attempts to track around projecting ascenders and descenders using a line-following technique. A comparison with the projection method is given. The results show that the method is successful on quite distorted documents: it correctly cuts the text block into text lines with minimal incorrect partitioning of data into adjacent lines, even when text lines have varying slope and strokes penetrate into the space of adjacent lines. The CTM method does not assume that text lines have a constant slope.
ISBN: 0769524672, 9780769524672
DOI: 10.1109/DICTA.2005.42
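
Illustration: the summary describes two stages, starting points taken from a projection and then a search over cutting angle and start location that minimizes the number of text pixels cut. The sketch below is one possible reading of that idea, not the authors' implementation. It assumes a binary image stored as a list of rows (1 = ink), uses a plain thresholded projection instead of the paper's modified projection, searches a small brute-force grid of straight cuts, and leaves out the line-following step around ascenders and descenders. All function names, the threshold `frac`, and the search ranges are hypothetical.

```python
import math


def demo_page(width=20, height=20):
    """Synthetic page: two 4-pixel-thick text bands that drift downwards."""
    img = [[0] * width for _ in range(height)]
    for x in range(width):
        for top in (2, 10):
            for y in range(top + x // 4, top + x // 4 + 4):
                img[y][x] = 1
    return img


def row_projection(img):
    """Horizontal projection profile: ink pixels per row."""
    return [sum(row) for row in img]


def start_rows(profile, frac=0.3):
    """Midpoints of low-ink row runs lying between two text bands.

    `frac` is an assumed threshold relative to the profile maximum.
    """
    limit = frac * max(profile)
    starts, run_begin = [], None
    # The sentinel value closes a low-ink run that reaches the last row.
    for y, v in enumerate(profile + [limit + 1]):
        if v <= limit:
            if run_begin is None:
                run_begin = y
        elif run_begin is not None:
            # Skip the top and bottom margins; keep interior gaps only.
            if run_begin > 0 and y < len(profile):
                starts.append((run_begin + y - 1) // 2)
            run_begin = None
    return starts


def cut_cost(img, y0, slope):
    """Ink pixels crossed by the straight cut y = y0 + slope * x."""
    height = len(img)
    cost = 0
    for x in range(len(img[0])):
        y = math.floor(y0 + slope * x + 0.5)  # round to the nearest row
        if 0 <= y < height:
            cost += img[y][x]
    return cost


def best_cut(img, y_start, max_offset=3,
             slopes=(-0.3, -0.2, -0.1, 0.0, 0.1, 0.2, 0.3)):
    """Grid search over start offset and slope for the cheapest cut.

    Ties are broken in favour of the smallest offset, then the flattest slope.
    Returns (start_row, slope, cost).
    """
    best = None
    for dy in range(-max_offset, max_offset + 1):
        for s in slopes:
            cand = (cut_cost(img, y_start + dy, s), abs(dy), abs(s),
                    y_start + dy, s)
            if best is None or cand < best:
                best = cand
    cost, _, _, y0, slope = best
    return y0, slope, cost


if __name__ == "__main__":
    page = demo_page()
    for start in start_rows(row_projection(page)):
        print(best_cut(page, start))
```

On the synthetic page every horizontal cut between the two bands crosses ink, while a suitably sloped cut crosses none, which mirrors the summary's point that the method does not assume a constant (or zero) line slope.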