Abstract:
As the Internet gained popularity in the 1990s, it was quickly recognized as an
outstanding advertising medium. At almost no cost, an individual can use the
Internet to send email messages, tweets, and Facebook messages to a vast
number of people. These messages can also contain unsolicited advertisements, which
are known as spam. Twitter spam has become a major issue these days.
According to Twitter's spam rules, spam includes tweets that contain distinctive words
from a trending topic, repeated tweets, and URLs that lead users to completely
unrelated websites. The dataset consists of tweets about “iPhone”, collected from
Twitter through its API and then pre-processed.
In this paper, content-based features that identify spam tweets are selected
using R. The machine learning algorithms applied to detect spam tweets
are Naive Bayes, Logistic Regression, KNN, Decision Tree, and Support Vector
Machine, which achieve accuracies of about 89%, 86%, 85%, 86%, and 86.8%, respectively.
It is therefore concluded that Naive Bayes gives the best accuracy among the
algorithms compared.
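
As a rough illustration of what content-based features for spam tweets might look like, the sketch below derives a few signals from the three spam rules mentioned above: trending words, repeated tweets, and embedded URLs. It is written in Python for brevity rather than in R, and it is not the feature set used in this paper. The function names, the feature names (`url_count`, `hashtag_count`, `trending_hits`, `is_repeated`), and the sample tweets are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical patterns; real tweets may need more careful tokenization.
URL_RE = re.compile(r"https?://\S+")
HASHTAG_RE = re.compile(r"#\w+")


def normalize(text):
    """Lowercase, drop URLs, and collapse whitespace so that tweets
    differing only in their link compare as equal."""
    text = URL_RE.sub("", text.lower())
    return " ".join(text.split())


def extract_features(tweets, trending_words):
    """Return one feature dict per tweet (illustrative features only)."""
    # Count how often each normalized tweet body occurs in the collection.
    seen = Counter(normalize(t) for t in tweets)
    rows = []
    for t in tweets:
        tokens = set(re.findall(r"\w+", t.lower()))
        rows.append({
            "url_count": len(URL_RE.findall(t)),
            "hashtag_count": len(HASHTAG_RE.findall(t)),
            # Number of distinct trending words that appear in the tweet.
            "trending_hits": len(tokens & trending_words),
            # True when the same text (ignoring URLs) appears more than once.
            "is_repeated": seen[normalize(t)] > 1,
        })
    return rows


# Made-up sample data, not taken from the paper's dataset.
sample = [
    "Win a free iPhone now http://spam.example/a",
    "Win a free iPhone now http://spam.example/b",
    "My iPhone battery lasts all day #happy",
]
rows = extract_features(sample, {"iphone", "free"})
```

Feature rows of this kind could then be passed to any of the classifiers listed above. Whether a URL actually leads to an unrelated website cannot be judged from the tweet text alone, so this sketch only counts URLs.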