| dc.description.abstract |
The spread of the internet and the growth of technology have had a major effect on social interaction. Social media has become an increasingly popular way to communicate: information is easy to share, and people use digital and social media to express their thoughts and find inspiration. However, misinformation has made information from social media highly doubtful. The topic of misinformation has drawn attention from both academic groups and the public. Misinformation can sway public opinion, which can shift outcomes against opposing parties. False information has become an important research topic because of the large volume of false content available on social networks, where any user can easily spread it. As a result, misinformation has become a problem for professionals, organizers, and societies, and it is essential to assess the credibility and validity of the articles shared on social media. The core challenge is to distinguish real information from false information. The extensive propagation of misinformation on social networks has recently attracted considerable attention in academia. Recent studies focus on article content, such as the title and description, which has limited their effectiveness. However, there are two commonly agreed-upon features of misinformation: the title or text of an article, and the user engagement with it. In this experiment, we focus on both the content of misinformation and its context. For the social context, we extract the different user engagements with articles, for example with tweets: read-only, retweet, like, and reply. From the user context we calculate user credibility and combine it with the article content. 
After combining both feature sets, we apply three NLP feature extraction techniques, i.e., TF-IDF, count vectorizer, and hashing vectorizer, and then use several machine learning classifiers to classify articles as real or fake: SVM, Naive Bayes, Logistic Regression, KNN, Random Forest, Decision Tree, and Gradient Boosting. The proposed model achieves its highest accuracy, 93.4%, with count-vector features and the random forest classifier. The proposed method was tested on a real-world dataset, FakeNewsNet, whose repository we refined to match the features we require. The dataset contains 23000+ articles with millions of user engagements. Our findings indicate that the proposed methods are effective for classifying misinformation on social networks. |
en_US |