Title: On Recent Developments in Learning with Agnostic Noise
Speaker: Elad Verbin, ITCS, Tsinghua University
Date: Wednesday, Feb 20, 2008
Time: 11:00-12:00
Venue: Room 3464, HKUST

Abstract:

In this talk I will give an overview of some exciting recent developments in learning Boolean functions with noise. In ordinary, noise-free learning, the goal is to learn a function, given samples of the function at randomly selected points. In noisy learning, the problem is made harder by allowing a fraction of the examples to be wrong. If the wrong examples are chosen at random, the noise is called "random"; if the examples are corrupted by a malicious adversary, the noise is called "agnostic".

The subject of computation with noisy input is ubiquitous in computer science, in topics such as coding theory and approximation algorithms, as well as in practical areas such as computational biology and vision. Learning with noise has been considered extremely difficult (especially in the agnostic setting), but in recent years we have seen a large influx of positive results (algorithms), as well as some negative results (hardness results). I will describe some of these. I will also outline connections to coding theory and computational complexity.

Here is an example of one of the problems I will discuss in the talk: learning parity functions with noise. In the problem of learning parity (without noise), one gets samples of the form (x, f(x)), where x is an n-bit string chosen uniformly at random and f is the XOR function on an unknown subset of the bits. The algorithm can request as many samples as it wants (but cannot specify the x's; they are picked at random). The goal is to return f with high probability. Learning parity without noise is easy to do in time $O(n^3)$, using $O(n)$ samples, via Gaussian elimination (a small illustrative sketch of this noiseless case appears after the abstract). Now suppose that 1% noise is introduced, so that each sample returns f(x) with probability 0.99 and 1-f(x) with probability 0.01. In this case the problem becomes extremely difficult. The fastest known algorithm is due to Blum, Kalai and Wasserman, and runs in time $2^{O(n/\log n)}$. The problem is believed to be hard (though even defining what hardness means here is non-trivial). In discussing this problem, I will describe some possible consequences of parity with noise being easy or hard. Either answer would be good: easiness would give us a holy hammer for use in learning theory, while hardness would imply simple and efficient cryptographic primitives, among other consequences.

Some parts of this talk are based on joint work with Adam Kalai and Yishay Mansour. The talk will not assume any prior knowledge of computational learning theory.
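
The abstract notes that noiseless parity learning reduces to solving a linear system over GF(2). The sketch below is a minimal illustration of that reduction, not material from the talk: the secret parity, sample count, and function names are illustrative assumptions, and it assumes enough random samples for the system to have full column rank.

```python
# Minimal sketch: learning a parity function from noiseless samples
# by Gaussian elimination over GF(2). Names and parameters are illustrative.
import numpy as np

rng = np.random.default_rng(0)

def sample(secret, m):
    """Draw m noiseless samples (x, f(x)) with f(x) = <secret, x> mod 2."""
    X = rng.integers(0, 2, size=(m, secret.size))
    return X, (X @ secret) % 2

def learn_parity(X, y):
    """Recover the hidden parity vector s with X @ s = y (mod 2),
    assuming the sample matrix has full column rank."""
    X, y = X.copy() % 2, y.copy() % 2
    m, n = X.shape
    pivots, row = [], 0
    for col in range(n):
        # Find a row at or below `row` with a 1 in this column.
        p = next((r for r in range(row, m) if X[r, col] == 1), None)
        if p is None:
            continue  # free column; unlikely with enough random samples
        X[[row, p]], y[[row, p]] = X[[p, row]], y[[p, row]]
        # Eliminate this column from every other row (full reduction).
        for r in range(m):
            if r != row and X[r, col] == 1:
                X[r] = (X[r] + X[row]) % 2
                y[r] = (y[r] + y[row]) % 2
        pivots.append((row, col))
        row += 1
    s = np.zeros(n, dtype=int)
    for r, c in pivots:
        s[c] = y[r]
    return s

if __name__ == "__main__":
    n = 16
    secret = rng.integers(0, 2, size=n)   # the unknown parity f
    X, y = sample(secret, 4 * n)          # O(n) random samples suffice whp
    assert np.array_equal(learn_parity(X, y), secret)
```

With a 1% noise rate, the same linear-algebra approach breaks down, which is exactly the gap the Blum-Kalai-Wasserman $2^{O(n/\log n)}$ algorithm addresses.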