Fan Yang Researches Machine Learning and Deep Learning Methods to Detect Falsehoods
When Fan Yang was looking for Ph.D. programs, he was still trying to decide what research he wanted to pursue.
While considering the University of Houston, he met with computer science assistant professor Arjun Mukherjee, who was working on research involving machine learning. Mukherjee was also working on computer programs that detect language and sentiment in humans.
Although machine learning was not as prevalent as it is now, Yang’s wife encouraged him to attend UH with Mukherjee as his mentor.
With internships at Visa and Amazon, and now a job at Amazon, the decision paid off.
Identifying Falsehoods in Text
At UH, Yang worked diligently in the computer science Ph.D. program at the College of Natural Sciences and Mathematics before graduating in spring 2020.
His research focused on using machine learning and deep learning methods to detect false information in writing, specifically through claim verification and satire detection in online articles.
For example, in a paper presented at the Conference on Empirical Methods in Natural Language Processing, Yang proposed a four-level hierarchical network for satirical news detection. The model detected satirical news effectively and incorporated a mechanism to reveal satirical cues.
Yang also recently worked on cross-domain sentiment analysis. Domains are essentially the different media people use to communicate, such as books, television and the internet. Sentiment analysis is the automated process of identifying opinions in text from customers and labeling them as positive, negative or neutral.
Yang published in at least four science journals on these topics with Mukherjee and has a few more papers under review.
He credits his mentor not only for guiding him through the difficult first two years of his Ph.D., but also for instilling a sense of purpose.
“Dr. Mukherjee always tells me, and other lab mates, to have a vision about our projects,” said Yang. “He always said, we must understand what’s the real problem that our society is facing. And try to work on that problem so we can make a way for others.”
Mukherjee said he has known since 2014, when he first interviewed Yang, that he was an exceedingly sharp, committed and focused candidate.
“He was one of the very few people I have worked with in over a decade who could go deep and innovate from the smallest spark of ideas that I used to provide during our meetings,” said Mukherjee. “He possesses the ‘one-man army’ attitude in research, which is very helpful, because he was able to learn all the applied skills required to meet the research goal.”
Using Skills in Industry
Both of Yang’s internships showed him the operations behind highly successful businesses.
At Visa, he worked on the fraud detection team, focusing on credit card misuse, in which fraudsters exploit people’s credit card information in small increments.
“It may be a small amount of money,” he said, “but it may be multiple times.”
His project tried to mimic that behavior by developing a model that generates misuse behavior. Based on the model, Yang’s team tried to identify which features could help spot the misuse earlier.
At Amazon, he was part of the team that built language understanding models for the virtual assistant Alexa. The team tried to figure out the context of a voice command and the intent of the command.
“We’re trying to see – knowing the history of a specific user,” he said, “can we understand, or can we have a better prediction of what that user wants, to provide a better service.”
Better Understanding All Languages
Yang performed so well at his Amazon internship that he was offered a position on the Alexa team and began his new job this month.
Moving forward, Yang said he wants to continue pursuing his fundamental research goal – to build a machine that can better understand all human languages.
“Human language is very complex,” he said. “Researchers are working hard to build machines for language understanding. Even though some machines work well for English, it doesn’t mean they can understand other, low-resource languages. That’s a very big issue and there are a lot of opportunities in this area.”
Yang also continues to collaborate with Mukherjee on research and new projects.
- Rebeca Trejo, College of Natural Sciences and Mathematics
July 20, 2020