Fifth Workshop on Syntax, Semantics and Structure in Statistical Translation (SSST-5)

ACL HLT 2011 / SIGMT / SIGLEX Workshop
23 June 2011, Portland, Oregon, USA

The Fifth Workshop on Syntax, Semantics and Structure in Statistical Translation (SSST-5) seeks to build on the foundations established in the first four SSST workshops, which brought together a large number of researchers working on diverse aspects of structure and representation in relation to statistical machine translation. Each year's program has comprised high-quality papers on current work, spanning topics including: new grammatical models of translation; new learning methods for syntax-based models; formal properties of synchronous/transduction grammars (hereafter S/TGs); discriminative training of models incorporating linguistic features; using S/TGs for semantics and generation; and syntax- and semantics-based evaluation of machine translation.

The need for structural mappings between languages is widely recognized in the fields of statistical machine translation and spoken language translation, and there is a growing consensus that these mappings are appropriately represented using a family of formalisms that includes synchronous/transduction grammars and their tree-transducer equivalents. To date, flat-structured models, such as the word-based IBM models of the early 1990s or the more recent phrase-based models, remain widely used. But tree-structured mappings arguably offer a much greater potential for learning valid generalizations about relationships between languages.

Within this area of research there is a rich diversity of approaches. There is active research ranging from formal properties of S/TGs to large-scale end-to-end systems. There are approaches that make heavy use of linguistic theory, and approaches that use little or none. There is theoretical work characterizing the expressiveness and complexity of particular formalisms, as well as empirical work assessing their modeling accuracy and descriptive adequacy across various language pairs. There is work being done to invent better translation models, and work to design better algorithms. Recent years have seen significant progress on all these fronts. In particular, systems based on these formalisms are now top contenders in MT evaluations.

At the same time, SMT has seen a movement toward semantics over the past five years, which has been reflected at recent SSST workshops. The issues of deep syntax and shallow semantics are closely linked. Semantic SMT research now includes semantic role labeling (SRL) for MT evaluation, SRL for SMT, and word sense disambiguation (WSD) for SMT.

In order to emphasize structure and representation at semantic and not only syntactic levels, “Semantics” has been explicitly added to the name of this year's Workshop (the acronym remains SSST), and is a special workshop theme. Special sessions will be devoted to the Semantics theme.

Special Theme: Semantics in SMT

The need for semantic modeling in MT is becoming increasingly obvious to the MT community: even as BLEU scores steadily improve, crucial errors of meaning still hurt the quality of current SMT systems. At the same time, there is renewed interest in the semantics community in designing models that are directly relevant to NLP applications. However, semantic models designed for standalone tasks do not easily fit into current MT architectures. With this year's special theme, we seek to bridge this gap by bringing together researchers working on semantics and on translation, in order to encourage cross-pollination of ideas and to share insights into the needs of MT and what current developments in semantics have to offer.

We particularly encourage the submission of papers addressing the following issues:


Session 1
09:00 Opening Remarks
09:15 Automatic Projection of Semantic Structures: an Application to Pairwise Translation Ranking
Daniele Pighin and Lluís Màrquez
09:40 Structured vs. Flat Semantic Role Representations for Machine Translation Evaluation
Chi-kiu Lo and Dekai Wu
10:05 Semantic Mapping Using Automatic Word Alignment and Semantic Role Labeling
Shumin Wu and Martha Palmer
10:30 Coffee Break / Poster Session
Incorporating Source-Language Paraphrases into Phrase-Based SMT with Confusion Networks
Jie Jiang, Jinhua Du and Andy Way
Multi-Word Unit Dependency Forest-based Translation Rule Extraction
Hwidong Na and Jong-Hyeok Lee
An Evaluation and Possible Improvement Path for Current SMT Behavior on Ambiguous Nouns
Els Lefever and Véronique Hoste
Improving Reordering for Statistical Machine Translation with Smoothed Priors and Syntactic Features
Bing Xiang, Niyu Ge and Abraham Ittycheriah

Session 2
11:00 Reestimation of Reified Rules in Semiring Parsing and Biparsing
Markus Saers and Dekai Wu
11:25 A Dependency Based Statistical Translation Model
Giuseppe Attardi, Atanas Chanev and Antonio Valerio Miceli Barone
11:50 Improving MT Word Alignment Using Aligned Multi-Stage Parses
Adam Meyers, Michiko Kosaka, Shasha Liao and Nianwen Xue

Session 3
13:50 Automatic Category Label Coarsening for Syntax-Based Machine Translation
Greg Hanneman and Alon Lavie
14:15 Utilizing Target-Side Semantic Role Labels to Assist Hierarchical Phrase-based Machine Translation
Qin Gao and Stephan Vogel
14:40 Combining statistical and semantic approaches to the translation of ontologies and taxonomies
John McCrae, Mauricio Espinoza, Elena Montiel-Ponsoda, Guadalupe Aguado-de-Cea and Philipp Cimiano
15:05 A Semantic Feature for Statistical Machine Translation
Rafael E. Banchs and Marta R. Costa-jussa
15:30 Coffee Break / Poster Session

Session 4
16:00 A General-Purpose Rule Extractor for SCFG-Based Machine Translation
Greg Hanneman, Michelle Burroughs and Alon Lavie
16:25 Panel Discussion


