| dc.contributor.author | Abdur Rehman Anwar, 01-134142-199 | |
| dc.contributor.author | Suneel Kumar, 01-134142-202 | |
| dc.date.accessioned | 2018-08-09T10:11:34Z | |
| dc.date.available | 2018-08-09T10:11:34Z | |
| dc.date.issued | 2018 | |
| dc.identifier.uri | http://hdl.handle.net/123456789/7222 | |
| dc.description | Supervised by Dr. Arif Ur Rahman | en_US |
| dc.description.abstract | Text forensics is a field of research that develops techniques for analyzing text for purposes such as identifying the characteristics of a document's author. It can be applied in areas such as security and marketing, and several international competitions include tasks related to text forensics. The Author Profile Identification System (APIS) was developed in view of the importance of this field. APIS is a desktop application designed to identify social aspects of authors, such as gender and age group, by processing the text of a document. It uses two machine learning classification algorithms, K-Nearest Neighbor (K-NN) and Naive Bayes, to classify text written by different authors and then identify their social aspects. The system is trained on a dataset of 681,288 posts across 19,320 documents written by 19,320 authors. Features such as pronouns, assent words, negations, determiners, prepositions, blog words, and hyperlinks are automatically collected for each document and author in the dataset. To analyze a new text document, the system extracts the same features from it and uses them to identify the author's profile attributes. | en_US |
| dc.language.iso | en | en_US |
| dc.publisher | Bahria University Islamabad Campus | en_US |
| dc.relation.ispartofseries | BS (CS);P-6739 | |
| dc.subject | Computer science | en_US |
| dc.title | Author Profile Identification System | en_US |
| dc.type | Project Reports | en_US |