One-Unambiguous Regular Languages Yo-Sub Han HKUST Date: Friday, May 14, 2004. Time: 11-12 Venue: Room 3464, HKUST Abstract -------- The ISO standard for the Standard Generalized Markup Language (SGML) provides a syntactic meta-language for the definition of textual markup systems. In the standard, the right-hand sides of productions are based on regular expressions, although only regular expressions that denote words unambiguously, in the sense of the ISO standard, are allowed. In general, a word that is denoted by a regular expression is witnessed by a sequence of occurrences of symbols in the regular expression that match the word. In an unambiguous regular expression as defined by Book et al. each word has at most one witness. But the SGML standard also requires that a witness be computed incrementally from the word with a one-symbol lookahead; we call such regular expressions 1-unambiguous. A regular language is a 1-unambiguous language if it is denoted by some 1-unambiguous regular expression. We give a Kleene theorem for 1-unambiguous languages and characterize 1-unambiguous regular language in terms of structural properties of the minimal deterministic automata that recognize them. As a result we are able to prove the decidability of whether a given regular expression denotes a 1-unambiguous language; if it does, then we can construct an equivalent 1-unambiguous regular expression in worst-case optimal time. The paper being reported is by Bruggemann-Klein and Wood.