Cyber Competence OSINT: The Development of Cyber-Crime Mitigation using Crawler Enhancement

Sami Ullah, 01-247201-013

URI: http://hdl.handle.net/123456789/13020

Date: 2022

Abstract:

Both industry and academia have emphasized sentiment analysis methods for detecting illegal and hateful content on social media sites, particularly in tweets on Twitter. Robust, state-of-the-art solutions based on Machine Learning and Deep Learning have been implemented on tweet datasets and handle Twitter jargon. Many social media sites, above all Twitter, are used to spread hate speech, and studies of tweets have found a substantial correlation between the spread of hate speech and tweet sentiment. Because of the importance of social media and the societal consequences of hate speech, a solution is needed that supports hate speech identification and sentiment analysis across several social media platforms. However, the algorithms previously implemented for detecting such content are often limited to a few classes and content types, and need to be extended with an advanced, robust Transformer-based solution. Neural Network-based solutions have emerged as the state of the art for sequence modeling and content detection. Unfortunately, these methods have disadvantages such as difficulty with long-term dependencies and a lack of parallelization. This study presents an enhanced Twitter crawler that requests the hidden JSON responses directly, which allows data to be collected more efficiently. For the next page, a cursor is appended to the link to get the next response, so instead of scrolling manually with bots, the crawler requests the link again with a new cursor to get more data. Following that, BERT, the Transformer-based approach proposed by Google, was used for sentiment analysis and hateful content detection on Twitter. A new dataset of more than 43,000 tweets was created, including tweets from the Twitter handles of two well-known Pakistani politicians: the Minister of Information and Broadcasting of Pakistan and the Spokesperson of the Pakistan Muslim League.
The implemented Transformer-based BERT technique was compared with Neural Network and other state-of-the-art models for detecting sentiment and hateful content in Twitter documents. According to the study's findings, the unsupervised BERT Transformer technique outperforms the other baseline algorithms used for detecting this content on Twitter.
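
The cursor-based paging described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation and does not reproduce Twitter's actual endpoints: the URL, the query parameter name `cursor`, and the response fields `items` and `next_cursor` are all assumptions chosen for illustration, and the network call is replaced by a hypothetical in-memory `fake_fetch` so the example runs offline.

```python
from typing import Callable, Dict, Iterator, Optional
from urllib.parse import urlencode


def build_url(base_url: str, cursor: Optional[str]) -> str:
    """Append the cursor as a query parameter when one is available."""
    if cursor is None:
        return base_url
    # "cursor" is an assumed parameter name, not a documented one.
    return f"{base_url}?{urlencode({'cursor': cursor})}"


def crawl(
    base_url: str,
    fetch_json: Callable[[str], Dict],
    max_pages: int = 100,
) -> Iterator[str]:
    """Yield items page by page, following cursors instead of scrolling."""
    cursor = None
    for _ in range(max_pages):
        page = fetch_json(build_url(base_url, cursor))
        yield from page.get("items", [])
        # Assumed field name: the response carries the cursor for the next page.
        cursor = page.get("next_cursor")
        if not cursor:
            break  # no further pages


# Hypothetical stand-in for the JSON responses a real endpoint would return.
FAKE_PAGES = {
    "https://example.invalid/timeline": {
        "items": ["tweet 1", "tweet 2"], "next_cursor": "abc"},
    "https://example.invalid/timeline?cursor=abc": {
        "items": ["tweet 3"], "next_cursor": "def"},
    "https://example.invalid/timeline?cursor=def": {
        "items": ["tweet 4"], "next_cursor": None},
}


def fake_fetch(url: str) -> Dict:
    """Return a canned response instead of performing an HTTP request."""
    return FAKE_PAGES[url]


tweets = list(crawl("https://example.invalid/timeline", fake_fetch))
print(tweets)  # ['tweet 1', 'tweet 2', 'tweet 3', 'tweet 4']
```

In a real crawler, `fake_fetch` would be replaced by a function that performs the HTTP request and parses the JSON body; the `max_pages` limit is a safeguard added here against an endpoint that never stops returning cursors.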