Parts-of-speech tagger (post) for Urdu language Using semi-supervised machine learning model (T-0738) (MFN 6885)

dc.contributor.author Kinza Faisal Jamal, 01-241161-008
dc.date.accessioned 2018-08-29T07:47:10Z
dc.date.available 2018-08-29T07:47:10Z
dc.date.issued 2018
dc.identifier.uri http://hdl.handle.net/123456789/7374
dc.description Supervised by Dr. Raja Muhammad Suleman en_US
dc.description.abstract Natural Language Processing (NLP) is the study of interaction between humans and machines through natural language. Natural language is extremely rich in form and structure, and is highly ambiguous. Many techniques have been introduced in NLP to resolve ambiguity; one of the primary methods is Parts of Speech (POS) tagging. A POS tagger is software that assigns each word its respective part-of-speech tag. A lot of work has been done on POS tagging for English and European languages, but the Urdu language has limited POS taggers and resources. The current POS taggers for Urdu have multiple issues, such as dependence on lexical databases and a lack of contextual prediction of words. Moreover, all POS taggers for the Urdu language have been built using Supervised Machine Learning, which depends on the availability of completely annotated datasets. This research proposes a Semi-Supervised Machine Learning model-based Parts of Speech Tagger (POST) that does not depend on a lexical database or a completely annotated corpus; instead, a partially annotated corpus is used to train the model. The model used is the Maximum Entropy Markov Model (MEMM). The model gives promising results with an accuracy of ~93%, which is significant when compared to the results of the existing POS taggers for Urdu that employ a Supervised Machine Learning approach. en_US
dc.language.iso en en_US
dc.publisher Software Engineering, Bahria University Engineering School Islamabad en_US
dc.relation.ispartofseries MS SE;T-0738
dc.subject Software Engineering en_US
dc.title Parts-of-speech tagger (post) for Urdu language Using semi-supervised machine learning model (T-0738) (MFN 6885) en_US
dc.type MS Thesis en_US

