Comparative Study on Clustering Techniques for Urdu Ligatures in Nastaliq Font

Safia Shabbir; Nizwa Javed; Imran Siddiqi; Khurram Khurshid

dc.contributor.author	Safia Shabbir
dc.contributor.author	Nizwa Javed
dc.contributor.author	Imran Siddiqi
dc.contributor.author	Khurram Khurshid
dc.date.accessioned	2018-09-24T10:29:00Z
dc.date.available	2018-09-24T10:29:00Z
dc.date.issued	2017
dc.identifier.uri	http://hdl.handle.net/123456789/7468
dc.description.abstract	Clustering is a pivotal step in any Optical Character Recognition (OCR) or word spotting system. It serves as a basis for the classification and indexing of words or characters, depending on the level of segmentation. Researchers have applied various clustering methodologies to Latin-script document images. However, for the Urdu language, which belongs to the family of Arabic and Persian, clustering-based indexing systems have not been extensively researched. In this paper, we present a comprehensive study of various known clustering techniques applied to printed Urdu document images. The images are segmented into ligatures (partial words), which are then grouped using different clustering methods. The performance of these methods is evaluated using the Calinski-Harabasz, Davies-Bouldin and Dunn indexes.	en_US
dc.language.iso	en	en_US
dc.publisher	Bahria University Islamabad Campus	en_US
dc.subject	Department of Computer Science CS	en_US
dc.title	Comparative Study on Clustering Techniques for Urdu Ligatures in Nastaliq Font	en_US
dc.type	Article	en_US