Computer Science Seminar

Faculty Candidate 2014

Frontiers of Learning from Imperfect Data

When: Friday, March 28, 2014
Where: Technology 101
Time: 11:00 AM

Speaker: Dr. Andrew Wan, Simons Institute for the Theory of Computing

Host: Prof. Carlos Ordonez

When faced with real-world, imperfect data, what kinds of structures (i.e., classifiers) can we hope to infer and use to make predictions about future data? The agnostic learning model (Kearns, Schapire, and Sellie 94; Haussler 92) is a generalization of the classical PAC model (Valiant 84) that gives a theoretical framework for studying this problem and designing algorithms, which are provably efficient and correct. In this talk I will present both positive and negative results, depending on the kind of "access" the learner has to the data. The positive result shows that if the algorithm can request the label of specific examples (not necessarily receiving the correct answer), then almost all DNF formulas can be learned in the agnostic setting. The challenging problem of learning DNFs was first posed by Valiant in the introduction of his PAC framework, and has since been intensively studied in numerous variants of the model. I will then present a negative result for algorithms that use only statistical access (a precise definition will be giving) to the data, showing that many classifiers learnable in the PAC model are not learnable in the agnostic one. Furthermore, there has been essentially only one algorithm for learning in this setting, and we explain this by showing that no algorithm can do better.

The techniques underlying this work have rich connections to numerous areas in theoretical computer science, such as complexity and differential privacy, and I will briefly discuss my work in these areas.

Bio:
Andrew is currently a Google Research Fellow at the Simons Institute for the Theory of Computing. He received his Ph.D. in 2010 from Columbia University under the supervision of Tal Malkin and Rocco Servedio, after which he was an assistant professor at IIIS Tsinghua University and a postdoctoral fellow with the Theory of Computation Group at Harvard.

Department of Computer Science

Events

Computer Science Seminar

Frontiers of Learning from Imperfect Data