dc.contributor.author | Safia Shabbir | |
dc.contributor.author | Nizwa Javed | |
dc.contributor.author | Imran Siddiqi | |
dc.contributor.author | Khurram Khurshid | |
dc.date.accessioned | 2018-09-24T10:29:00Z | |
dc.date.available | 2018-09-24T10:29:00Z | |
dc.date.issued | 2017 | |
dc.identifier.uri | http://hdl.handle.net/123456789/7468 | |
dc.description.abstract | Clustering is a pivotal step in any Optical Character Recognition (OCR) or Word Spotting system. It serves as a base for the classification and indexing of words or characters, depending upon the level of segmentation. Researchers have applied various clustering methodologies to document images in Latin-based scripts. However, for the Urdu language, which belongs to the family of Arabic and Persian, clustering-based indexing systems have not been extensively researched. In this paper, we present a comprehensive study of various known clustering techniques applied to printed Urdu document images. The images are segmented into ligatures or partial words, which are then grouped together using different clustering methods. The performance of these methods is evaluated using the Calinski-Harabasz, Davies-Bouldin and Dunn indexes. | en_US |
dc.language.iso | en | en_US |
dc.publisher | Bahria University Islamabad Campus | en_US |
dc.subject | Department of Computer Science CS | en_US |
dc.title | Comparative Study on Clustering Techniques for Urdu Ligatures in Nastaliq Font | en_US |
dc.type | Article | en_US |
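
The abstract names three cluster validity indexes without defining them. The following is a minimal sketch of their standard textbook definitions, not the authors' implementation. The function names, the 2-D toy points standing in for ligature feature vectors, and the use of Euclidean distance are all assumptions made for illustration; the record does not state which features or distance measure the paper uses.

```python
import math
from itertools import combinations


def centroid(points):
    """Mean of a list of equal-length coordinate tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def dunn_index(clusters):
    """Smallest between-cluster distance divided by largest cluster diameter.

    Higher is better. Assumes at least two clusters and at least one
    cluster with two or more distinct points.
    """
    min_between = min(
        math.dist(p, q)
        for a, b in combinations(clusters, 2)
        for p in a
        for q in b
    )
    max_diameter = max(
        math.dist(p, q) for c in clusters for p, q in combinations(c, 2)
    )
    return min_between / max_diameter


def davies_bouldin_index(clusters):
    """Average, over clusters, of the worst scatter-to-separation ratio.

    Lower is better. Scatter is the mean distance of points to their centroid.
    """
    centers = [centroid(c) for c in clusters]
    scatter = [
        sum(math.dist(p, m) for p in c) / len(c)
        for c, m in zip(clusters, centers)
    ]
    k = len(clusters)
    worst = [
        max(
            (scatter[i] + scatter[j]) / math.dist(centers[i], centers[j])
            for j in range(k)
            if j != i
        )
        for i in range(k)
    ]
    return sum(worst) / k


def calinski_harabasz_index(clusters):
    """Ratio of between-cluster to within-cluster dispersion.

    Higher is better. Each dispersion is normalised by its degrees of freedom.
    """
    all_points = [p for c in clusters for p in c]
    n, k = len(all_points), len(clusters)
    overall = centroid(all_points)
    centers = [centroid(c) for c in clusters]
    between = sum(
        len(c) * math.dist(m, overall) ** 2 for c, m in zip(clusters, centers)
    )
    within = sum(
        math.dist(p, m) ** 2 for c, m in zip(clusters, centers) for p in c
    )
    return (between / (k - 1)) / (within / (n - k))


# Hypothetical toy data: the same four points grouped well and badly.
tight = [[(0, 0), (0, 1)], [(10, 0), (10, 1)]]
loose = [[(0, 0), (10, 0)], [(0, 1), (10, 1)]]

for name, grouping in (("tight", tight), ("loose", loose)):
    print(
        name,
        dunn_index(grouping),
        davies_bouldin_index(grouping),
        calinski_harabasz_index(grouping),
    )
```

On this toy data the compact, well-separated grouping scores higher on the Dunn and Calinski-Harabasz indexes and lower on the Davies-Bouldin index than the poor grouping, which is the direction each index is meant to reward.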