One-Unambiguous Regular Languages
Yo-Sub Han
HKUST

Date:  Friday, May 14, 2004.
Time:  11-12
Venue: Room 3464, HKUST


Abstract
--------
The ISO standard for the Standard Generalized Markup Language (SGML)
provides a syntactic meta-language for the definition of textual
markup systems. In the standard, the right-hand sides of productions
are based on regular expressions, although only regular expressions
that denote words unambiguously, in the sense of the ISO standard, are
allowed. In general, a word that is denoted by a regular expression is
witnessed by a sequence of occurrences of symbols in the regular
expression that match the word. In an unambiguous regular expression
as defined by Book et al. 
each word has at most one witness. But the SGML standard also requires
that a witness be computed incrementally from the word with a
one-symbol lookahead; we call such regular expressions 1-unambiguous.
A regular language is a 1-unambiguous language if it is denoted by some
1-unambiguous regular expression. We give a Kleene theorem for
1-unambiguous languages and characterize 1-unambiguous regular
language in terms of structural properties of the minimal
deterministic automata that recognize them. As a result we are able to
prove the decidability of whether a given regular expression denotes a
1-unambiguous language; if it does, then we can construct an
equivalent 1-unambiguous regular expression in worst-case optimal
time. 

The paper being reported is by Bruggemann-Klein and Wood.