Abstract:
An optical character recognizer (OCR) is software that recognizes the characters of a given language. In general, it takes an image containing text and recognizes that text based on its training. For languages such as English, French, and German, where the set of character shapes is limited, it is possible to train a classifier that separates and recognizes the individual characters. For languages such as Urdu, Arabic, and Persian, however, characters form nearly countless shapes when joined with others, so such a classifier fails to meet the requirement; machine learning techniques and advanced, modified forms of neural networks can be used instead. Our implemented project takes a frame containing Urdu text, recognizes the ligatures, extracts every shape of every character from all ligatures by moving a window of indefinite size over the text, and generates their Unicode code points.
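
The window-based extraction step can be illustrated with a minimal sketch. It is not the project's implementation: it assumes that "indefinite size" means a window whose width varies between a minimum and a maximum, that the window moves right to left (the reading direction of Urdu), and that a trained classifier is available. The names `recognize_line`, `toy_classify`, and `TOY_TEMPLATES` are hypothetical, and the toy template lookup merely stands in for the neural network.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# Hypothetical classifier interface: receives a slice of image columns and
# returns (character, confidence), or None when nothing is recognized.
Classifier = Callable[[Sequence[Sequence[int]]], Optional[Tuple[str, float]]]


def recognize_line(columns: Sequence[Sequence[int]],
                   classify: Classifier,
                   min_width: int = 1,
                   max_width: int = 4) -> List[str]:
    """Slide a variable-width window right to left over the image columns
    and return the Unicode code points of the recognized characters."""
    code_points = []
    right = len(columns)  # start at the right edge of the text line
    while right > 0:
        best = None  # (confidence, width, character) of the best match so far
        for width in range(min_width, min(max_width, right) + 1):
            result = classify(columns[right - width:right])
            if result is not None:
                char, confidence = result
                if best is None or confidence > best[0]:
                    best = (confidence, width, char)
        if best is None:
            right -= 1  # no width matched: skip one column and retry
        else:
            _, width, char = best
            code_points.append(f"U+{ord(char):04X}")
            right -= width  # consume the columns covered by the match
    return code_points


# Toy stand-in for a trained model (assumption): exact template lookup.
TOY_TEMPLATES = {
    ((1, 0), (1, 1)): "\u0628",  # ARABIC LETTER BEH
    ((0, 1),): "\u0627",         # ARABIC LETTER ALEF
}


def toy_classify(window):
    key = tuple(tuple(column) for column in window)
    char = TOY_TEMPLATES.get(key)
    return (char, 1.0) if char is not None else None


# Columns are stored left to right; the scan consumes them right to left.
print(recognize_line([(0, 1), (1, 0), (1, 1)], toy_classify))
# → ['U+0628', 'U+0627']
```

Trying every width at each position and keeping the most confident match is only one possible policy; a real system would also need to handle overlapping strokes and diacritics, which this sketch ignores.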