Real-Time Twitter Spam Detection and Sentiment Analysis using Machine Learning and Deep Learning Techniques.
Anisha P RodriguesRoshan FernandesAakash AAbhishek BAdarsh ShettyAtul KKuruva LakshmannaR Mahammad ShafiPublished in: Computational intelligence and neuroscience (2022)
In this modern world, we are accustomed to a constant stream of data. Major social media sites like Twitter, Facebook, or Quora face a huge dilemma as a lot of these sites fall victim to spam accounts. These accounts are made to trap unsuspecting genuine users by making them click on malicious links or keep posting redundant posts by using bots. This can greatly impact the experiences that users have on these sites. A lot of time and research has gone into effective ways to detect these forms of spam. Performing sentiment analysis on these posts can help us in solving this problem effectively. The main purpose of this proposed work is to develop a system that can determine whether a tweet is "spam" or "ham" and evaluate the emotion of the tweet. The extracted features after preprocessing the tweets are classified using various classifiers, namely, decision tree, logistic regression, multinomial naïve Bayes, support vector machine, random forest, and Bernoulli naïve Bayes for spam detection. The stochastic gradient descent, support vector machine, logistic regression, random forest, naïve Bayes, and deep learning methods, namely, simple recurrent neural network (RNN) model, long short-term memory (LSTM) model, bidirectional long short-term memory (BiLSTM) model, and 1D convolutional neural network (CNN) model are used for sentiment analysis. The performance of each classifier is analyzed. The classification results showed that the features extracted from the tweets can be satisfactorily used to identify if a certain tweet is spam or not and create a learning model that will associate tweets with a particular sentiment.