| dc.contributor.author | Aftab Alam Janisar, 01-241171-002 | |
| dc.date.accessioned | 2023-02-23T10:17:00Z | |
| dc.date.available | 2023-02-23T10:17:00Z | |
| dc.date.issued | 2019 | |
| dc.identifier.uri | http://hdl.handle.net/123456789/14961 | |
| dc.description | Supervised by Dr. Hammad Afzal | en_US |
| dc.description.abstract | In recent years, researchers have been increasingly drawn to sentiment analysis, and to hate speech detection in particular, because the spread of hate speech on social media has received considerable attention in other languages. Hate speech has a great impact on society; hateful words harm the dignity of others. Detecting hate speech is important to keep hateful words from escalating into crimes. In this research, we developed a framework for hate speech detection in the Pashto language. Because no related data was available, we created a corpus from data collected from Twitter. Most research in this domain has been done for other languages, where hate speech detection is quite mature; for morphologically rich languages, and Pashto in particular, little work has been done. We collected tweets related to ethnicity and religion. The collected data was annotated manually, and each tweet was categorized as hate or not hate by comparing it with offensive content. To study the impact of different features, we ran experiments with existing classifiers: SVM, Naïve Bayes, Decision Tree, and KNN. On the dataset of 500, SVM produced the highest result among all the classifiers, 74%. On the dataset of 1500, KNN and Decision Tree produced the same result, 65.0%. On the dataset of 2800, Decision Tree produced the highest result, 72%, and SVM produced 71.9%. | en_US |
| dc.language.iso | en | en_US |
| dc.publisher | Software Engineering, Bahria University Engineering School Islamabad | en_US |
| dc.relation.ispartofseries | MS-SE;T-2053 | |
| dc.subject | Software Engineering | en_US |
| dc.title | A Framework to Detect Hate Speech in the Pashto Language from Social Media | en_US |
| dc.type | MS Thesis | en_US |