The ISL Statistical Machine Translation System

by Dr. Stephan Vogel, Visiting Researcher, Language Technologies Institute, Carnegie Mellon University, USA

Date   :  23 Mar 2005 (Wed)
Time   :  4:00pm - 5:00pm
Venue  :  LTC, Academic Concourse, HKUST

Statistical machine translation is currently a fashionable approach. Starting from the word alignment models advocated by the IBM MT research group, major improvements have been possible by adding phrase-to-phrase translations. In the talk I will present some of the work we have done in statistical machine translation over the last couple of years.

The first part of the presentation will focus on the new phrase alignment approach, developed last year and successfully applied in recent evaluations. To find the translation for a source phrase we calculate a restricted word alignment: words inside the target phrase can align only to words inside the source phrase, and words outside the target phrase can align only to words outside the source phrase. Finding the best translation then becomes a search for the boundaries of the target phrase that give the best overall alignment score.
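
The boundary search described above can be illustrated with a small sketch. It is not the system presented in the talk: the scoring is assumed to be an IBM Model 1 style average over a word lexicon, and the names constrained_score, best_target_span, lex and floor are hypothetical, as are the toy sentences and probabilities.

```python
import math

def constrained_score(src, tgt, lex, j1, j2, i1, i2, floor=1e-6):
    """Log score of a restricted alignment (assumed IBM Model 1 style).

    Source words in [j1, j2] may align only to target words in [i1, i2];
    all other source words may align only to target words outside it.
    """
    inside_tgt = tgt[i1:i2 + 1]
    outside_tgt = tgt[:i1] + tgt[i2 + 1:]
    score = 0.0
    for j, s in enumerate(src):
        candidates = inside_tgt if j1 <= j <= j2 else outside_tgt
        if not candidates:
            # No permitted alignment partner: penalise with the floor value.
            score += math.log(floor)
            continue
        total = sum(lex.get((s, t), floor) for t in candidates)
        score += math.log(total / len(candidates))
    return score

def best_target_span(src, tgt, lex, j1, j2):
    """Try every target span and keep the boundaries with the best score."""
    best_span, best_score = None, -math.inf
    for i1 in range(len(tgt)):
        for i2 in range(i1, len(tgt)):
            sc = constrained_score(src, tgt, lex, j1, j2, i1, i2)
            if sc > best_score:
                best_span, best_score = (i1, i2), sc
    return best_span

# Toy data, invented for illustration only.
src = ["das", "haus", "ist", "klein"]
tgt = ["the", "house", "is", "small"]
lex = {("das", "the"): 0.9, ("haus", "house"): 0.9,
       ("ist", "is"): 0.9, ("klein", "small"): 0.9}

span = best_target_span(src, tgt, lex, 1, 1)   # source phrase "haus"
print(" ".join(tgt[span[0]:span[1] + 1]))
```

The exhaustive loop over target spans is quadratic in the target sentence length, which is acceptable for a sketch; a real system would likely restrict the span length.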

In the second part details of the decoder will be given. The decoder works in two stages. First, it collects potential translations for words and phrases and builds a directed graph (a translation lattice) containing all of these candidates. Then it performs a single-best or n-best search through this lattice, adding the language model probabilities. The best-path search is extended to allow for limited word reordering.
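
The two stages can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the decoder from the talk: it covers only the single-best, monotone case (no n-best list, no reordering), uses a bigram language model, and the names build_lattice, best_path, phrase_table, bigram_lm and lm_floor, along with all probabilities, are hypothetical.

```python
import math

def build_lattice(src, phrase_table, max_len=3):
    """Stage 1: one edge (start, end, target_words, log_prob) per candidate."""
    edges = []
    for start in range(len(src)):
        for end in range(start + 1, min(start + max_len, len(src)) + 1):
            for target, prob in phrase_table.get(tuple(src[start:end]), []):
                edges.append((start, end, tuple(target.split()), math.log(prob)))
    return edges

def best_path(n, edges, bigram_lm, lm_floor=1e-4):
    """Stage 2: monotone single-best search, adding bigram LM scores.

    A search state is (source position, last target word), so hypotheses
    that share both can be recombined and only the best one is kept.
    """
    by_start = {}
    for edge in edges:
        by_start.setdefault(edge[0], []).append(edge)
    chart = {(0, "<s>"): (0.0, [])}          # state -> (score, target words)
    for pos in range(n):
        # Snapshot the states at this position before extending them.
        states = [(s, v) for s, v in chart.items() if s[0] == pos]
        for (_, last), (score, words) in states:
            for _, end, target, tm_score in by_start.get(pos, []):
                new_score, prev = score + tm_score, last
                for w in target:
                    new_score += math.log(bigram_lm.get((prev, w), lm_floor))
                    prev = w
                key = (end, prev)
                if key not in chart or new_score > chart[key][0]:
                    chart[key] = (new_score, words + list(target))
    finals = [v for s, v in chart.items() if s[0] == n]
    return max(finals, key=lambda v: v[0]) if finals else None

# Toy models, invented for illustration only.
src = ["das", "haus", "ist", "klein"]
phrase_table = {
    ("das",): [("the", 0.7), ("that", 0.3)],
    ("haus",): [("house", 0.8), ("home", 0.2)],
    ("das", "haus"): [("the house", 0.6)],
    ("ist",): [("is", 0.9)],
    ("klein",): [("small", 0.6), ("little", 0.4)],
}
bigram_lm = {("<s>", "the"): 0.5, ("the", "house"): 0.4,
             ("house", "is"): 0.5, ("is", "small"): 0.3, ("is", "little"): 0.1}

edges = build_lattice(src, phrase_table)
score, words = best_path(len(src), edges, bigram_lm)
print(" ".join(words))
```

Because every edge moves forward in the source sentence, the states at a given position are final by the time they are extended, so a single left-to-right pass is enough here. Allowing reordering would require tracking which source positions are already covered instead of a single position.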


Stephan Vogel is a researcher at the Language Technologies Institute, Carnegie Mellon University, where he heads the statistical machine translation team. He received a Diploma in Physics from Philipps University Marburg, Germany, and a Master of Philosophy from the University of Cambridge, England. After working for a number of years on the history of science, he turned to computer science, especially natural language processing. Before coming to CMU, he worked for several years at the Technical University of Aachen on statistical machine translation, and also in the Interactive Systems Lab at the University of Karlsruhe.


