Second Workshop on Syntax and Structure in Statistical Translation (SSST-2)

ACL-08: HLT Workshop
20 June 2008, Columbus, Ohio

The Second Workshop on Syntax and Structure in Statistical Translation (SSST-2) seeks to build on the foundations established in the first SSST, which was held as a NAACL-HLT 2007 / AMTA workshop. SSST-1 brought together a large number of researchers working on diverse aspects of synchronous/transduction grammars (hereafter, S/TGs) in relation to statistical machine translation. Its program comprised high-quality papers on current work spanning topics including new grammatical models of translation; discriminative training of syntax-based models; using S/TGs for semantics and generation; syntax-based evaluation of machine translation; and formal properties of S/TGs. The presentations led to productive and thought-provoking discussions that compared and contrasted different approaches and identified the most pressing questions for future progress on this topic.

The need for structural mappings between languages is widely recognized in the fields of statistical machine translation and spoken language translation, and there is a growing consensus that these mappings are appropriately represented using a family of formalisms that includes synchronous/transduction grammars and their tree-transducer equivalents. To date, flat-structured models, such as the word-based IBM models of the early 1990s or the more recent phrase-based models, remain widely used, but tree-structured mappings arguably offer much greater potential for learning valid generalizations about relationships between languages.

Within this area of research there is a rich diversity of approaches. There is active research ranging from formal properties of S/TGs to large-scale end-to-end systems. There are approaches that make heavy use of linguistic theory, and approaches that use little or none. There is theoretical work characterizing the expressiveness and complexity of particular formalisms, as well as empirical work assessing their modeling accuracy and descriptive adequacy across various language pairs. There is work being done to invent better translation models, and work to design better algorithms. Recent years have seen significant progress on all these fronts. In particular, systems based on these formalisms are now top contenders in MT evaluations.

Proceedings of SSST-2, Second Workshop on Syntax and Structure in Statistical Translation
David Chiang and Dekai Wu (editors)
Invited Talk: Shuffling Non-Constituents [slides]
Jason Eisner (with David A. Smith and Roy Tromble)
Imposing Constraints from the Source Tree on ITG Constraints for SMT [slides]
Hirofumi Yamamoto, Hideo Okuma and Eiichiro Sumita
A Scalable Decoder for Parsing-Based Machine Translation with Equivalent Language Model State Maintenance [slides]
Zhifei Li and Sanjeev Khudanpur
Prior Derivation Models For Formally Syntax-Based Translation Using Linguistically Syntactic Parsing and Tree Kernels [slides]
Bowen Zhou, Bing Xiang, Xiaodan Zhu and Yuqing Gao
Generalizing Local Translation Models [slides]
Michael Subotin
A Rule-Driven Dynamic Programming Decoder for Statistical MT [slides]
Christoph Tillmann
Syntactic Reordering Integrated with Phrase-Based SMT [slides]
Jakob Elming
Experiments in Discriminating Phrase-Based Translations on the Basis of Syntactic Coupling Features [slides]
Vassilina Nikoulina and Marc Dymetman
Multiple Reorderings in Phrase-Based Machine Translation [slides]
Niyu Ge, Abe Ittycheriah and Kishore Papineni
Improving Word Alignment Using Syntactic Dependencies [slides]
Yanjun Ma, Sylwia Ozdowska, Yanli Sun and Andy Way
Inductive Detection of Language Features via Clustering Minimal Pairs: Toward Feature-Rich Grammars in Machine Translation [slides]
Jonathan H. Clark, Robert Frederking and Lori Levin
Syntax-Driven Learning of Sub-Sentential Translation Equivalents and Translation Rules from Parsed Parallel Corpora [slides]
Alon Lavie, Alok Parlikar and Vamshi Ambati


