In Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy
Santosh K C
will defend his dissertation
Anomalous Behavior Analysis in Social Networks and Consumer Review Websites
The web and social media influence every aspect of today’s world, producing a tremendous amount of data that requires new insights to understand current society. Social relationships and business strategies are shaped by the views expressed in these new media. Consumer opinions, expressed as reviews of products and services, are becoming highly influential in e-commerce. Information garnered from social media guides people's social lives. The use of and dependence on these media have led a small yet powerful group of users to exploit them to sway public sentiment for selfish gains. To check the infiltration of these anomalous users, we face two challenges: (1) studying opinions to learn their behaviors and (2) detecting opinion spam to reduce their effects. We study the behaviors of anomalous users in reviews collected from Yelp and Amazon and in social data from Twitter. Yelp is one of the leading consumer websites and filters users who are potential spammers using its own in-house algorithm. Using Yelp reviews, we explore the temporal behaviors of spammers who try to promote their restaurants. We study the correlation of various review time series to characterize the types of spamming policies. For different spamming policies, we further characterize their approaches to rating and popularity and uncover latent factors involved in buffered and reduced spamming. Social spammers penetrate networks easily and are difficult to filter because they are sophisticated and continuously adapt to changing filtering algorithms. Reflexive reciprocity (users following back those who follow them, as a courtesy) helps them sneak in and spread their agendas. To learn these behaviors on social media, we study anomalous users on Twitter. We categorize the spammers based on their success rate in creating social relationships. We further study their success rate based on their fraudulence and content-posting activities.
We find that successful spammers have a stronger friendship base and post a mix of spam and non-spam content to appear more trustworthy and avoid getting caught. We exploit the behaviors learned from Yelp and Twitter to build spam detection algorithms. We combine temporal features with previously studied linguistic and behavioral features to beat state-of-the-art spam detection approaches. To detect social spammers on Twitter, we combine content-based features representing behavioral patterns with a graph-based approach embodying social relationships. We learn biased random walks over the nodes and use a neural-network language model to learn their latent embeddings, which significantly improves classification of normal and anomalous users. We further characterize the review system of Amazon, a leading online marketplace. We speculate that competition among similar products may drive spamming intended to strengthen one's own product or undermine a competitor's. We use verified purchases on Amazon as a popularity index to evaluate models of popularity prediction. We also model popularity-based competition among similar products. Our research on real-world data from social networks and online review websites analyzes the behaviors of anomalies on the web. We encode these behaviors to develop computer algorithms capable of discovering them.
Date: Wednesday, August 15, 2018
Time: 1:00 PM
Place: PGH 550
Advisor: Dr. Arjun Mukherjee
Faculty, students, and the general public are invited.