PASHTO ISOLATED CHARACTER RECOGNITION USING K-NN CLASSIFIER

N. AHMAD, A. A. KHAN, S. A. R. ABID, M. YASIR, NASIM-ULLAH

Abstract


This paper presents the development of an Optical Character Recognition (OCR) system for printed Pashto text. The lack of a standard database for the Pashto language is also addressed by developing a medium-sized database with 25 different variations and a total of 1125 entries. In the proposed approach, individual Pashto characters are recognized using both high-level and low-level features. The high-level features are based on structural information from the characters, and the resulting binary trees uniquely classify each character. This approach, though quite robust, is slightly affected by variations in size, orientation, and writing style. An alternative low-level feature approach based on K-Nearest Neighbors has also been used, giving an overall word recognition rate of 74.8%.
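As a rough illustration of the K-Nearest Neighbors idea mentioned in the abstract, the following minimal Python sketch classifies a feature vector by majority vote among its k closest training samples. The abstract does not state the features, the value of k, or the distance metric used by the authors, so the flattened 2x2 pixel features, the Euclidean distance, k = 3, the function name knn_classify, and the labels "class_a" and "class_b" are all assumptions made for this example only.

```python
from collections import Counter
import math


def knn_classify(query, samples, labels, k=3):
    """Return the majority label among the k training samples nearest to query."""
    # Pair each training label with its Euclidean distance to the query
    # (the distance metric is an assumption; the abstract does not name one).
    distances = [
        (math.dist(query, sample), label)
        for sample, label in zip(samples, labels)
    ]
    # Sort by distance only, so the nearest samples come first.
    distances.sort(key=lambda pair: pair[0])
    # Majority vote over the k nearest neighbours.
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Made-up toy data: 2x2 binary glyph images flattened to 4 pixel features.
samples = [
    (1, 0, 1, 0), (1, 0, 1, 1), (1, 1, 1, 0),  # hypothetical "class_a"
    (0, 1, 0, 1), (1, 1, 0, 1), (0, 1, 1, 1),  # hypothetical "class_b"
]
labels = ["class_a", "class_a", "class_a", "class_b", "class_b", "class_b"]

predicted = knn_classify((1, 0, 1, 0), samples, labels, k=3)
print(predicted)
```

In a real system the feature vectors would come from the character images in the database rather than from hand-written tuples, and k would typically be chosen by validation.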



Copyright (c) 2016 Sindh University Research Journal - SURJ (Science Series)

 Copyright © University of Sindh, Jamshoro. 2017 All Rights Reserved.
Printing and Publication by: Sindh University Press.