Thesis Defense
In Partial Fulfillment of the Requirements for the Degree of Master of Science
Michael Yantosca
will defend his thesis
ARTIC: An Adaptive Real-Time Imprecise Computation Pipeline for Audio Analysis
Abstract
One of the more complex issues facing natural language processing (NLP) is how to deal with overlapped speech, i.e., when two or more speakers interfere with or talk over each other, and the more general case of co-channel speech, i.e., when two or more speakers are present in an audio stream regardless of interference. Frequently, one speaker is selected as the primary speaker for the purpose of analysis, with the other speakers relegated to the category of interfering speakers. Despite the breadth of research into overlapped speech detection, few efforts have been made to preserve the speech of so-called interfering speakers. A compelling case can be made for a more comprehensive analysis of co-channel speech in the fields of computational linguistics, accessibility automation, and entertainment, particularly under real-time constraints. Currently available open-source audio libraries, while technically capable of supporting such research, are cumbersome to work with. To address this, this work introduces the Adaptive Real-Time Imprecise Computation (ARTIC) pipeline for audio analysis, a simple but flexible approach to stream processing that tracks computation times and deadlines for the various pipeline stages. It lets the user specify automatic precision reductions to avoid projected deadline misses as well as automatic precision increases to combat underutilization. A proof of concept is tested with the intent to build upon this groundwork for a more comprehensive project with the goal of multi-speaker interference detection and, eventually, speaker separation.
Date: Friday, November 15, 2019
Time: 11:00 AM - 12:30 PM
Place: PGH 550
Advisor: Dr. Albert M.K. Cheng
Faculty, students, and the general public are invited.