INTERACTIVE THINNING FOR SEGMENTATION-BASED AND SEGMENTATION-FREE SINDHI OCR

D. N. HAKRO, S. A. AWAN, M. MEMON, A. M. AAMUR, G. N. MOJAI

Abstract


Optical Character Recognition (OCR) converts text captured as an image into editable text. Thinning can be applied in the preprocessing stage, or after words and characters have been segmented from a text image, when features are extracted to differentiate the characters. Thinning reduces the thickness of strokes to obtain a one-pixel-wide skeleton of the character image. In this paper we present, step by step, an iterative and interactive thinning algorithm for Sindhi script. The algorithm removes pixels while keeping the connectivity and pattern of the image intact. The process can be stopped and the connectivity patterns checked with a pixel-based editor. The algorithm can be used with both segmentation-based and segmentation-free Sindhi OCR. The algorithm and its accompanying application were tested on Sindhi text lines and individual characters, and the results are presented. Both the algorithm and the application can also be applied to the scripts of other languages. The presented work is part of research on Sindhi OCR.
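The abstract does not give the authors' deletion rules, so the following is only an illustrative sketch of iterative, connectivity-preserving thinning that can be paused between passes. It substitutes the classic Zhang-Suen rules for the paper's own algorithm. The names `thinning_passes` and `Grid` are hypothetical, and the input is assumed to be a binary image with 1 for ink and 0 for background.

```python
from typing import Iterator, List

Grid = List[List[int]]  # assumption: 1 = ink (foreground), 0 = background


def _neighbours(img: Grid, r: int, c: int) -> List[int]:
    """Return the 8 neighbours P2..P9, clockwise, starting from north."""
    return [
        img[r - 1][c], img[r - 1][c + 1], img[r][c + 1], img[r + 1][c + 1],
        img[r + 1][c], img[r + 1][c - 1], img[r][c - 1], img[r - 1][c - 1],
    ]


def thinning_passes(image: Grid) -> Iterator[Grid]:
    """Yield a snapshot after every Zhang-Suen sub-iteration that removed pixels.

    Stand-in for the paper's algorithm, not a reproduction of it.
    """
    h, w = len(image), len(image[0])
    # Pad with a background border so neighbour lookups stay in range.
    img = [[0] * (w + 2)] + [[0] + list(row) + [0] for row in image] + [[0] * (w + 2)]
    changed = True
    while changed:
        changed = False
        for step in (0, 1):
            to_clear = []
            for r in range(1, h + 1):
                for c in range(1, w + 1):
                    if img[r][c] != 1:
                        continue
                    p = _neighbours(img, r, c)
                    b = sum(p)  # number of ink neighbours
                    # Number of 0 -> 1 transitions around the pixel.
                    a = sum(p[i] == 0 and p[(i + 1) % 8] == 1 for i in range(8))
                    p2, p4, p6, p8 = p[0], p[2], p[4], p[6]
                    if step == 0:
                        cond = p2 * p4 * p6 == 0 and p4 * p6 * p8 == 0
                    else:
                        cond = p2 * p4 * p8 == 0 and p2 * p6 * p8 == 0
                    # a == 1 ensures deleting the pixel cannot split a stroke.
                    if 2 <= b <= 6 and a == 1 and cond:
                        to_clear.append((r, c))
            # Delete in bulk so every decision in a pass sees the same image.
            for r, c in to_clear:
                img[r][c] = 0
            if to_clear:
                changed = True
                yield [row[1:-1] for row in img[1:-1]]
```

Because the function is a generator, a caller can stop after any pass, inspect or hand-edit the intermediate image, and feed the edited image back into `thinning_passes` to continue. This loosely mirrors the stop-and-check workflow the abstract describes.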



Copyright (c) 2015 Sindh University Research Journal - SURJ (Science Series)

Copyright © 2017 University of Sindh, Jamshoro. All Rights Reserved.
Printing and Publication by: Sindh University Press.