[Seminar] Adapting Neural Models to Address Challenges in Information Extraction from Social Media Data
Friday, February 5, 2021
11:00 am - 12:00 pm
Online via MS Teams
Social media data poses several interesting challenges to information extraction technology. My group has been studying how and why sequence labelling methods perform worse on social media data than the same models do on more carefully edited text, such as newswire data. These studies have informed our design choices for models that are more robust to naturalistic data, including data that contains language switching. My goal is to help broaden the range of language abilities covered by NLP technology.
During this talk, I’ll briefly discuss the proposals we have developed, which include simple adaptations to contextualized embeddings and a subword tokenization approach that is more flexible than the byte-pair encoding commonly used in language models. I’ll conclude with a discussion of possible research lines for the near future.
About the Speaker
Thamar Solorio is an Associate Professor in the Department of Computer Science at the University of Houston (UH). She holds graduate degrees in Computer Science from the Instituto Nacional de Astrofísica, Óptica y Electrónica, in Puebla, Mexico. Her research interests include information extraction from social media data, enabling technology for code-switched data, stylistic modeling of text, and, more recently, multimodal approaches to online content understanding. She is the founder and director of the Research in Text Understanding and Language Analysis Lab at UH. She is the recipient of an NSF CAREER award for her work on authorship attribution and of the 2014 Emerging Leader ABIE Award in Honor of Denice Denton. She is an elected board member of the North American Chapter of the Association for Computational Linguistics (2020-2021). Her research is currently funded by the National Science Foundation and Adobe, and in the past she has received support from the Office of Naval Research and the Defense Advanced Research Projects Agency (DARPA).