Large Text Indexing Zhang Yan Department of Computer Science HKUST November 7, 2003 11-11:50 AM Room 3501, HKUST Abstract -------- Search engines have become more and more popular in recent years. The basic function of a search engine is to search for a pattern in the text database. Classical pattern matching algorithms, such as the KMP algorithm, can search for a pattern of length p in a raw database of size n in O(p+n) time. However, as the size of databases becoming very huge, the O(n) factor in the searching time becomes unacceptable. To tackle this problem, we need some auxiliary data structure built on the raw text database to facilitate pattern matching. Such data structures and algorithms are, in general, called "indexing" techniques. In this talk, we will survey different indexing techniques for different purposes. The talk combines lectures from two summer schools, the EEF summer school in 2002 and the Lipari summer school in 2003.