dc.contributor.author | Mubashir Ali | |
dc.contributor.author | Shehzad Khalid | |
dc.contributor.author | Muhammad Haneef Saleemi | |
dc.date.accessioned | 2017-12-26T12:23:03Z | |
dc.date.available | 2017-12-26T12:23:03Z | |
dc.date.issued | 2014 | |
dc.identifier.issn | 2090-4274 | |
dc.identifier.uri | http://hdl.handle.net/123456789/5195 | |
dc.description.abstract | Stemming is one of the most important pre-processing steps in text mining, and it boosts the performance of information retrieval (IR) systems. It is equally important for many other research areas, such as natural language processing (NLP) and text categorization. The main objective of stemming is to reduce the many grammatical forms of a word (for example, variations in part of speech, gender, or tense) to their stem or root form. Because of the rich morphological structure of the Urdu language, developing an Urdu stemmer for an information retrieval system is a challenging task. In this paper, we propose an effective rule-based stemming method for the Urdu language that copes with the challenges of Urdu morphology. The proposed Urdu stemmer generates the stems of Urdu words as well as borrowed words (words from other languages such as Arabic, Persian, and Turkish). The proposed method is compared with an existing Urdu stemming technique, the Light Weight Stemmer for Urdu Language, to demonstrate the advantage of the proposed stemmer over its competitor. | en_US |
dc.language.iso | en | en_US |
dc.publisher | Bahria University Islamabad Campus | en_US |
dc.subject | Department of Computer Engineering CE | en_US |
dc.title | A Novel Stemming Approach for Urdu Language | en_US |
dc.type | Article | en_US |
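
The abstract describes rule-based stemming only in general terms, so the following is a minimal sketch of the generic suffix-stripping idea and not the method proposed in the paper. The suffix list, the `MIN_STEM_LENGTH` threshold, and the function name `stem` are all assumptions chosen for illustration; the paper's actual rules are not given in this record.

```python
# Illustrative suffix-stripping stemmer. This is NOT the rule set from the
# paper; the suffixes and the length threshold below are assumptions.

# A few common Urdu plural/oblique endings, used here only as examples.
SUFFIXES = ["یں", "وں", "اں"]

# Assumed safeguard: never strip a suffix if fewer than this many
# characters would remain, so very short words are left untouched.
MIN_STEM_LENGTH = 2


def stem(word: str) -> str:
    """Return the word with the first matching suffix removed, if any."""
    # Try longer suffixes first so the most specific rule wins.
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        remaining = len(word) - len(suffix)
        if word.endswith(suffix) and remaining >= MIN_STEM_LENGTH:
            return word[:remaining]
    # No rule applied: the word is returned unchanged.
    return word


if __name__ == "__main__":
    for w in ["کتاب" + "یں", "کتاب" + "وں", "کتاب"]:
        print(w, "->", stem(w))
```

Under these assumed rules, the inflected forms built from کتاب ("book") reduce to the same stem, while a short word such as ماں is left unchanged by the length check. A practical stemmer would need a much larger rule set plus exception handling for borrowed words.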