Computer Science Seminar - University of Houston

Computer Science Seminar

Faculty Candidate 2014

Opinion Mining for the Internet: Models, Algorithms and Predictive Analytics

When: Monday, March 31, 2014
Where: PGH 232
Time: 11:00 AM

Speaker: Dr. Arjun Mukherjee, University of Illinois at Chicago

Host: Prof. Christoph Eick

The massive amounts of user-generated content in social media offer new forms of actionable intelligence. Public sentiments in debates, blogs, and news comments are crucial to governmental agencies for passing new bills/policies, gauging social unrest, and predicting elections and socio-economic indices. The goal of my research is to build robust statistical models for opinion mining with applications to marketing, social, and behavioral sciences. To achieve this goal, a number of research challenges need to be addressed. The first challenge is fine-grained information extraction that can capture diverse types of opinions (e.g., agreement/disagreement, contention/controversy) and various other latent sentiments expressed in social conversations and discussions. The state-of-the-art machinery (e.g., topic modeling) falls short for such a task. I develop several novel knowledge-induced sentiment topic models which respect notions of human semantics. The second challenge is that social sentiments are inherently dynamic and change over time. To leverage the sentiments over time for predictive analytics (e.g., predicting financial markets), I develop Bayesian nonparametric topic-based sentiment time-series and vector autoregression models. The third challenge is to filter deceptive opinion spam/fraud. It is estimated that 15-20% of opinions on the Web are fake. Hence, detecting opinion spam is a precondition for reliable opinion mining.

In this talk, I will present novel statistical models for sentiment analysis and talk about two key frameworks: (1) Semi-supervised graphical models for mining fine-grained opinions in social conversations, and (2) Bayesian nonparametrics, sentiment time-series, and vector autoregression models for stock market prediction.

In the latter part of the talk, I will discuss the problem of opinion spam and shed light on some techniques for filtering it. The focus will be on modeling collusion and combating group spam in e-Commerce reviews. The talk will conclude with a discussion of my ongoing research and future research vision in opinion contagions, forecasting socio-economic indices, and healthcare.

Bio:

Arjun Mukherjee is currently a Ph.D. candidate at the University of Illinois at Chicago. Previously, he was a research intern fellow at Microsoft Research and the Indian Statistical Institute. He is the recipient of several highly competitive fellowships such as Dean’s Scholar, Chancellors Fellow, and Provost and Deiss Fellow. His research spans areas such as Bayesian inference, statistical data mining, machine learning, natural language processing, and social and information sciences, with a particular emphasis on solving big-data problems in social media and the Web. His work has addressed a wide variety of social computing problems including (1) modeling opinion spam, deception, and user behaviors; (2) sentiment analysis and graphical models of social conversations; and (3) financial market prediction using social sentiments. His work has been published in leading Computer Science venues such as KDD, WWW, ACL, EMNLP, CIKM, and IJCAI.

Further information about him may be found at: http://www.cs.uic.edu/~amukherj.