Title: UBIQUITOUS PATTERN MATCHING AND ITS APPLICATIONS (Biology, Security, Multimedia) Speaker: Wojciech Szpankowski Purdue University Date: Friday, May 19, 2006 Time 11-12 Venue: Room 1504 , HKUST Abstract: Pattern matching comes in many flavors: In the string matching problem, for a given string one counts the number of its occurrences in a text. In the subsequence pattern matching, also known as the hidden word problem, we search for a given subsequence rather than a string. Finally, in the repetitive pattern matching we aim at determining how long it takes for a prefix of a text to reappear somewhere in the text. We apply probabilistic and analytic tools of combinatorics and analysis of algorithms to discover general laws of pattern occurrences. An immediate consequence of our results is the possibility to set thresholds at which the appearance of a pattern in given text starts being `meaningful'. In this talk, we first discuss some applications of string matching methodology to biological sequence analysis; in particular, to the problem of finding weak signals and avoiding artifacts. We then use the approach of hidden words to construct a reliable threshold for the intrusion detection in detecting anomalies. Then, we present a video compression scheme based on a two-dimensional self-repetitive pattern matching (i.e., a lossy extension of the Lempel-Ziv scheme). We conclude this talk with a novel application of the repetitive pattern matching to an error-resilient Lempel-Ziv77 scheme. Bio: Wojciech Szpankowski received his M.S. and Ph.D. degrees in Electrical and Computer Engineering from the Technical University of Gdansk in 1976 and 1980, respectively. Currently, he is Professor of Computer Science (and by courtesy of Electrical & Computing Engineering) at Purdue University. In 1992 he was Professeur Invite at INRIA-Rocquencourt, France, and in 1999 he was Visiting Professor at Stanford University. Dr. Szpankowski's research interests cover analysis of algorithms, information theory, bioinformatics, analytic combinatorics, stability problems, and applied probability. He has published over 150 papers on these topics. His recent work is mostly devoted to probabilistic analysis and design of algorithms on sequences and applications of analytic tools to problems of information theory and biology. His book "Average Case Analysis of Algorithms on Sequences" was published by John Wiley & Sons in 2001. Dr. Szpankowski has been a guest editor for several journals including the IEEE Transactions on Information Theory. He is on the editorial boards of Theoretical Computer Science, IEEE Trans. Information Theory, ACM Trans. Algorithms, Combinatorics, Probability, and Computing, and Foundation and Trends in Communications and Information Theory. He is a Fellow of the IEEE.