Abstract:
Social network analytics is an active research area that attracts considerable attention. A social network has a graph-based topology, with vertices V representing users and edges E representing links or interactions between those users. Predicting missing links from the current state of the network is known as the link prediction problem, and it has numerous applications. The emergence of signed social networks offers further insight into social graphs, since they can represent real-world relationships with both positive (friend) and negative (foe) links. Analyses of signed networks suggest that negative links carry significant importance compared with positive ones. The major obstacle to using them constructively is that most social media sites do not let users specify negative links explicitly. There is thus a gap between the significance of negative links and their availability in real-life datasets, so it is essential to investigate how negative links can be predicted from existing network data. Even for a network of modest size with n nodes there are O(n²) feasible links, so it is often challenging to assess the pairwise likelihood of link formation in any exhaustive way. The missing value estimation problem is closely related to link prediction, and it is equally challenging to apply complex models, e.g., latent factor models, to large and sparse networks, because signed social networks are highly imbalanced. Many link prediction methods assess the tendency of link formation over the entire network while ignoring the underlying sub-graphs. In practical applications, it is important to search the entire social network extensively while taking all underlying sub-graphs into account.
In this study, to scale up negative link prediction, we propose an ensemble approach that breaks the negative link prediction problem into multiple sub-problems, each capturing the importance of negative links within one sub-graph. Each sub-problem is addressed independently by applying a probabilistic latent factor model. The ensemble approach incorporates the characteristics of negative link prediction without sacrificing predictive accuracy on positive links. We evaluate the ensemble approach on real datasets to demonstrate its scalability, robustness, and correctness.
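
The decomposition into sub-problems can be illustrated with a rough sketch. It is not the implementation evaluated in this study. It makes several assumptions: sub-graphs are drawn as uniform random node subsets, a small L2-regularized matrix factorization (the MAP form of a Gaussian latent factor model) stands in for the probabilistic latent factor model, and per-sub-graph scores are combined by simple averaging. The names `fit_latent_factors` and `ensemble_scores`, all hyperparameters, and the six-node toy network are hypothetical.

```python
import numpy as np

def fit_latent_factors(A, mask, rank=2, steps=200, lr=0.05, reg=0.01, seed=0):
    """Fit A ~ U V^T on observed entries only (mask == True) by gradient descent.

    Assumption: squared loss with L2 regularization, used here as a simple
    stand-in for a probabilistic latent factor model.
    """
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    U = 0.1 * rng.standard_normal((n, rank))
    V = 0.1 * rng.standard_normal((n, rank))
    for _ in range(steps):
        R = mask * (U @ V.T - A)  # residual restricted to observed pairs
        # Simultaneous update: both gradients use the old U and V.
        U, V = U - lr * (R @ V + reg * U), V - lr * (R.T @ U + reg * V)
    return U @ V.T

def ensemble_scores(A, mask, n_subgraphs=20, size=4, seed=0):
    """Average latent-factor scores over randomly sampled node-induced sub-graphs."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    total = np.zeros((n, n))
    count = np.zeros((n, n))
    for k in range(n_subgraphs):
        nodes = rng.choice(n, size=size, replace=False)  # hypothetical sampling rule
        idx = np.ix_(nodes, nodes)
        # Each sub-problem is solved independently of the others.
        total[idx] += fit_latent_factors(A[idx], mask[idx], seed=k)
        count[idx] += 1
    # Pairs never covered by any sub-graph keep a neutral score of 0.
    return np.divide(total, count, out=np.zeros_like(total), where=count > 0)

# Toy signed network (hypothetical): two camps, friends inside, foes across.
s = np.array([1.0, 1.0, 1.0, -1.0, -1.0, -1.0])
A = np.outer(s, s)                   # +1 = friend, -1 = foe
mask = ~np.eye(6, dtype=bool)        # self-links are not observed
mask[0, 3] = mask[3, 0] = False      # hide one cross-camp (negative) link
scores = ensemble_scores(A, mask)    # lower score = more likely a negative link
```

Because each sub-graph is fitted independently, the loop in `ensemble_scores` could be run in parallel, which is the property that makes this kind of decomposition attractive for large networks.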