Chris Quirk

Microsoft Natural Language Processing Group

Toward Syntactically Informed Statistical Machine Translation

UW/Microsoft Symposium, 1/28/05

Over the past decade, we have witnessed a revolution in the field of machine translation (MT) toward statistical or corpus-based methods. Statistical machine translation (SMT) systems are now highly competitive as real-time translation systems. Yet transfer-based systems built upon traditional linguistic analysis are complementary in many ways: while SMT systems excel at domain-specific terminology and phrases, transfer-based systems succeed more often in producing grammatical and fluent translations. A natural next step is to combine the strengths of both approaches. In this talk, we describe a novel approach to translation that incorporates a source-language dependency parser as well as statistical models and an end-to-end search. This system promises to combine the power of SMT with the linguistic generality available in a parser.
