Segmentation of Sindhi Handwritten Text

S. A. AWAN, D. N. HAKRO, Z. H. ABRO, A. H. JALBANI

Abstract


Optical Character Recognition (OCR) and Intelligent Character Recognition (ICR) are two emerging areas concerned with understanding document text and converting it into editable text. A change of language script in a text image poses various challenges and demands sophisticated algorithms and approaches to overcome them, especially for the Arabic script and the languages that have adopted it, such as Sindhi, Urdu, Pashto and Farsi. Sindhi has a very rich literature and needs powerful OCR and ICR systems to keep pace with languages for which these technologies are already mature, such as English, Latin, Russian and Korean. This study presents a segmentation algorithm for segmenting handwritten text into lines, words and characters. Input images written by various subjects were scanned, preprocessed and tested with the segmentation algorithm. Line segmentation achieved 100% accuracy and word segmentation achieved 95% accuracy. Character segmentation achieved an acceptable accuracy of 81%.
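
The abstract does not describe the segmentation algorithm itself. As an illustration only, the sketch below shows projection-profile segmentation, a common baseline for splitting a binarized page into lines and then into horizontally separated components. It is not the authors' method. The function names, the 0/1 image representation (1 = ink) and the min_gap parameter are all assumptions made for this example. Because Sindhi is cursive and written right to left, a sketch like this can only propose candidate word boundaries; splitting connected characters needs further processing.

```python
from typing import List, Tuple

Image = List[List[int]]  # assumed representation: rows of 0/1 pixels, 1 = ink


def find_runs(profile: List[int]) -> List[Tuple[int, int]]:
    """Return half-open (start, end) index ranges where the profile is non-zero."""
    runs = []
    start = None
    for i, value in enumerate(profile):
        if value > 0 and start is None:
            start = i
        elif value == 0 and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs


def segment_lines(image: Image) -> List[Tuple[int, int]]:
    """Split a page into text lines using the horizontal projection profile."""
    row_profile = [sum(row) for row in image]
    return find_runs(row_profile)


def segment_columns(image: Image, top: int, bottom: int,
                    min_gap: int = 1) -> List[Tuple[int, int]]:
    """Split one line band (rows top..bottom-1) at blank column gaps.

    Gaps narrower than min_gap (a hypothetical tuning parameter) are
    treated as spacing inside a word and merged away.
    """
    width = len(image[0]) if image else 0
    col_profile = [sum(image[r][c] for r in range(top, bottom))
                   for c in range(width)]
    merged: List[Tuple[int, int]] = []
    for start, end in find_runs(col_profile):
        if merged and start - merged[-1][1] < min_gap:
            merged[-1] = (merged[-1][0], end)
        else:
            merged.append((start, end))
    return merged


if __name__ == "__main__":
    page = [
        [0, 0, 0, 0, 0, 0],
        [1, 1, 0, 0, 1, 0],
        [1, 0, 0, 0, 1, 1],
        [0, 0, 0, 0, 0, 0],
        [0, 1, 1, 1, 0, 0],
    ]
    for top, bottom in segment_lines(page):
        print((top, bottom), segment_columns(page, top, bottom))
```

The column ranges are returned left to right; for a right-to-left script they would be read in reverse order to follow the writing direction.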




Copyright (c) 2018 Sindh University Research Journal - SURJ (Science Series)

 Copyright © University of Sindh, Jamshoro. 2017 All Rights Reserved.
Printing and Publication by: Sindh University Press.