Syntax and Structure in Statistical Translation (SSST)

NAACL-HLT 2007 / AMTA Workshop
26 April 2007, Rochester, New York

The need for structural mappings between languages is widely recognized in the fields of statistical machine translation and spoken language translation, and there is a growing consensus that these mappings are appropriately represented using a family of formalisms that includes synchronous/transduction grammars (hereafter, S/TGs) and their tree-transducer equivalents. To date, flat-structured models, such as the word-based IBM models of the early 1990s or the more recent phrase-based models, remain widely used, but tree-structured mappings arguably offer much greater potential for learning valid generalizations about relationships between languages.

Within this area of research there is a rich diversity of approaches. There is active research ranging from formal properties of S/TGs to large-scale end-to-end systems. There are approaches that make heavy use of linguistic theory, and approaches that use little or none. There is theoretical work characterizing the expressiveness and complexity of particular formalisms, as well as empirical work assessing their modeling accuracy and descriptive adequacy across various language pairs. There is work being done to invent better translation models, and work to design better algorithms. Recent years have seen significant progress on all these fronts. In particular, systems based on these formalisms are now top contenders in MT evaluations.

In response to this bustling new situation, the workshop on Syntax and Structure in Statistical Translation (SSST) seeks to bring together researchers working on diverse aspects of synchronous/transduction grammars in relation to statistical machine translation, in order to discuss current work, compare and contrast different approaches, and identify the questions most pressing for future progress in this area.

Proceedings of SSST, NAACL-HLT 2007 / AMTA Workshop on Syntax and Structure in Statistical Translation
Dekai Wu and David Chiang (editors)
Invited Talk: Tree Transducers and Tree Adjoining Grammars [slides]
William C. Rounds
David Chiang, Dan Gildea, Kevin Knight, Stuart Shieber and Dekai Wu
Chunk-Level Reordering of Source Language Sentences with Automatically Learned Rules for Statistical Machine Translation [slides]
Yuqi Zhang, Richard Zens and Hermann Ney
Extraction Phenomena in Synchronous TAG Syntax and Semantics
Rebecca Nesson and Stuart M. Shieber
Inversion Transduction Grammar for Joint Phrasal Translation Modeling [slides]
Colin Cherry and Dekang Lin
Factorization of Synchronous Context-Free Grammars in Linear Time [slides]
Hao Zhang and Daniel Gildea
Binarization, Synchronous Binarization, and Target-side Binarization [slides]
Liang Huang
Machine Translation as Tree Labeling
Mark Hopkins and Jonas Kuhn
Discriminative word alignment by learning the alignment structure and syntactic divergence between a language pair [slides]
Sriram Venkatapathy and Aravind Joshi
Generation in Machine Translation from Deep Syntactic Trees [slides]
Keith Hall and Petr Nemec
Combining morphosyntactic enriched representation with n-best reranking in statistical translation
Hélène Bonneau-Maynard, Alexandre Allauzen, Daniel Déchelotte and Holger Schwenk
A Walk on the Other Side: Using SMT Components in a Transfer-Based Translation System [slides]
Ariadna Font Llitjós and Stephan Vogel
Dependency-Based Automatic Evaluation for Machine Translation [slides]
Karolina Owczarzak, Josef van Genabith and Andy Way
Probabilistic Synchronous Tree-Adjoining Grammars for Machine Translation: The Argument from Bilingual Dictionaries
Stuart Shieber
Three models for discriminative machine translation using Global Lexical Selection and Sentence Reconstruction [slides]
Sriram Venkatapathy and Srinivas Bangalore
Comparing Reordering Constraints for SMT Using Efficient BLEU Oracle Computation [slides]
Markus Dreyer, Keith Hall and Sanjeev Khudanpur


