| dc.contributor.author | Mohsin Hassan Khan, 01-244151-039 | |
| dc.date.accessioned | 2017-08-02T06:50:55Z | |
| dc.date.available | 2017-08-02T06:50:55Z | |
| dc.date.issued | 2017 | |
| dc.identifier.uri | http://hdl.handle.net/123456789/3514 | |
| dc.description | Supervised by Dr. Raja M. Suleman | en_US |
| dc.description.abstract | Natural language processing (NLP) is a computational technique for automatically analyzing and representing human language. NLP has been employed in many applications, such as information retrieval, information processing, language translation, and automated answer grading. The main problem in NLP is the high level of uncertainty in natural language, which makes automated analysis and extraction of useful information very difficult. Several approaches have been developed for automated grading. Latent Semantic Analysis (LSA) is one of the most widely used approaches for automated text matching. LSA is a corpus-based approach that evaluates similarity on the basis of semantic relations among words and ignores the structural composition of a sentence. Because of this structure blindness, LSA can treat a logically wrong answer as a correct one. LSA cannot recognize sentences that are semantically related but inverse of each other [8]. Furthermore, LSA cannot handle “gaming the system”, where a user provides only a list of keywords without proper sentence structure. The goal of our research is to develop an algorithm, Extended Latent Semantic Analysis (xLSA), which focuses on the syntactic composition of a sentence and overcomes LSA’s syntactic blindness problem. xLSA examines sentences and checks that proper sentence structure exists, to address the “gaming the system” problem. xLSA analyzes text inputs to recognize their dependency structure and then decomposes each sentence to identify its subject, verb, and object. Sentences are then compared, and an approximation of the syntactic and semantic space is generated for similar texts. xLSA computes the semantic similarity score of two sentences and also identifies inverse sentences, negative sentences, and “gaming the system”. We have tested xLSA with 200 semantically similar sentences from two corpora [28] [29]. Results show that xLSA outperforms traditional LSA and identifies inverse sentences, negative sentences, and lists of keywords that lack proper sentence structure. | en_US |
| dc.language.iso | en | en_US |
| dc.publisher | Software Engineering, Bahria University Engineering School Islamabad | en_US |
| dc.relation.ispartofseries | MS SE;T-0722 | |
| dc.subject | Software Engineering | en_US |
| dc.title | A Novel Approach to Manage LSA's Syntactical Blindness Problem (T-0722) (MFN 5883) | en_US |
| dc.type | MS Thesis | en_US |