Abstract:
Online social networks have become a part of billions of people’s daily lives. Through their
interactions on such networks, these people come across a wide variety of information.
A better understanding of how particular information propagates and diffuses through these
networks gives a more accurate picture of people and their interaction with information.
Twitter is an extremely popular online social network that carries a large volume of
information every day, including information about particular people, events, organizations,
situations, and interest groups. The diffusion of such information depends on multiple
factors, and much research has examined the intricacies of how information travels through
online social networks, especially Twitter. Although considerable work has been done to
understand information diffusion on Twitter, very little of it focuses on Twitter in the
Pakistani context. Users of social networks in Pakistan could potentially be very different
from those in other communities. This research attempts to explore this phenomenon in greater
depth and study information diffusion on Twitter in Pakistan, focusing specifically on the
characteristics that enable a particular tweet to be retweeted more than others. A dataset of
trending Pakistani tweets is collected over a period of 3 months using the Twitter API. Various
user, content, and time-based characteristics are defined, and the correlation of each of these
characteristics with the retweet count is calculated using multiple correlation algorithms as
well as a trained neural network, in order to determine what effect each characteristic of a
tweet has on its potential to diffuse further into the network.
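
The correlation step described above can be illustrated with a minimal sketch. The abstract
does not name the correlation algorithms used, so Pearson and Spearman are assumed here purely
as common examples. The feature names (followers, hashtags, hour_posted) and the toy records
are hypothetical and are not taken from the study's dataset. The neural network component is
not covered.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # correlation is undefined for a constant sequence
    return cov / sqrt(var_x * var_y)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the full run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


def correlate_features(tweets, feature_names, target="retweet_count"):
    """Correlate each named feature with the target across all tweets."""
    target_values = [t[target] for t in tweets]
    scores = {}
    for name in feature_names:
        values = [t[name] for t in tweets]
        scores[name] = {
            "pearson": pearson(values, target_values),
            "spearman": spearman(values, target_values),
        }
    return scores


# Toy records invented for illustration only (hypothetical feature names).
toy_tweets = [
    {"followers": 120, "hashtags": 0, "hour_posted": 9, "retweet_count": 2},
    {"followers": 950, "hashtags": 2, "hour_posted": 13, "retweet_count": 15},
    {"followers": 4300, "hashtags": 1, "hour_posted": 20, "retweet_count": 40},
    {"followers": 15000, "hashtags": 3, "hour_posted": 21, "retweet_count": 310},
]

if __name__ == "__main__":
    features = ["followers", "hashtags", "hour_posted"]
    for name, score in correlate_features(toy_tweets, features).items():
        print(name, score)
```

Using a rank-based measure alongside Pearson is one reasonable choice for retweet data, since
retweet counts tend to be heavily skewed and a rank correlation is less sensitive to a few
very large values.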