URDU OCR

Javed, Muhammad Mursaleen; Aslam, Muhammad

URI: http://hdl.handle.net/123456789/1836

Date: 2013

Abstract:

The idea is to develop an OCR system for the Urdu language. No such tool is available on the market to help Urdu publishers digitize existing Urdu books for Urdu readers. Urdu OCR is an ambitious goal that demands a lot of effort. The available research literature helped us find our direction. As a team, we studied the available research papers and discussed the possible ways to implement the idea. After these discussions, we decided to try each possibility, and these trial implementations led us to a decent solution. We used the ACCORD Library for image processing. We focused on pixels and extracted the black pixels from the image. These details are then processed for learning purposes (acquiring new characters and combinations) and compared with the learned data. Provided images are first processed to resolve image quality issues; each separate connected ligature is then segmented and stored with a unique code. The segmented parts are learned and saved to a defined location, and are used to recognize words and symbols. After this process, the verified words are presented in an Urdu text editor. Finally, we would like to draw the attention and help of the people concerned regarding future work to make our solution more capable.
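
The pipeline described in the abstract (extract black pixels, segment each connected ligature, store it under a code, then compare new input with the learned data) can be illustrated with a minimal Python sketch. This is not the authors' implementation, which used the ACCORD Library; every name here (`binarize`, `connected_components`, `shape_key`, `learn`, `recognize`) is hypothetical, as are the fixed threshold of 128, the 8-connectivity rule, and the exact-match comparison, none of which the abstract specifies.

```python
from collections import deque


def binarize(gray, threshold=128):
    """Mark pixels darker than the threshold as ink (1), the rest as background (0)."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def connected_components(binary):
    """Group ink pixels into 8-connected components (candidate ligatures)."""
    rows = len(binary)
    cols = len(binary[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill starting from an unvisited ink pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and binary[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(pixels)
    return components


def shape_key(pixels):
    """Assumed 'unique code': pixel offsets relative to the bounding-box corner."""
    top = min(y for y, _ in pixels)
    left = min(x for _, x in pixels)
    return frozenset((y - top, x - left) for y, x in pixels)


def learn(store, pixels, label):
    """Save a segmented shape under its code, with the text it stands for."""
    store[shape_key(pixels)] = label


def recognize(store, pixels):
    """Look up a segmented shape; returns None if it was never learned."""
    return store.get(shape_key(pixels))
```

Because the code is position-independent, a shape learned in one place on the page is found again elsewhere. Exact matching is only a stand-in: scanned text would need a tolerant comparison, since the same ligature rarely produces identical pixels twice.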