Paper: Efficient Phrase-Table Representation for Machine Translation with Applications to Online MT and Speech Translation

ACL ID N07-1062
Title Efficient Phrase-Table Representation for Machine Translation with Applications to Online MT and Speech Translation
Venue Human Language Technologies
Session Main Conference
Year 2007
Authors

In phrase-based statistical machine translation, the phrase-table requires a large amount of memory. We will present an efficient representation with two key properties: on-demand loading and a prefix tree structure for the source phrases. We will show that this representation scales well to large data tasks and that we are able to store hundreds of millions of phrase pairs in the phrase-table. For the large Chinese–English NIST task, the memory requirements of the phrase-table are reduced to less than 20MB using the new representation with no loss in translation quality and speed. Additionally, the new representation is not limited to a specific test set, which is important for online or real-time machine translation. One problem in speech translation is the matching of phrases in the inpu...
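
The two properties named in the abstract, a prefix tree over source phrases and on-demand loading, can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `OnDemandPhraseTable`, the `matches` method, the example phrase pairs, and the scores are all hypothetical, and an in-memory dictionary stands in for the on-disk table that a real system would read from.

```python
from typing import Dict, Iterator, List, Set, Tuple

Entry = Tuple[str, float]    # (target phrase, score); scores are made up
Prefix = Tuple[str, ...]     # a path of source words from the trie root


class OnDemandPhraseTable:
    """Toy prefix tree over source phrases whose nodes are loaded lazily."""

    def __init__(self, phrase_pairs: List[Tuple[str, str, float]]) -> None:
        # Assumption: `_store` plays the role of the on-disk phrase table.
        # Each trie node is keyed by its source-word prefix and holds the
        # set of possible next words plus the translations ending there.
        self._store: Dict[Prefix, Tuple[Set[str], List[Entry]]] = {(): (set(), [])}
        for source, target, score in phrase_pairs:
            words = tuple(source.split())
            for i, word in enumerate(words):
                self._store.setdefault(words[:i], (set(), []))[0].add(word)
            self._store.setdefault(words, (set(), []))[1].append((target, score))
        # Nodes actually brought into memory so far.
        self.loaded: Dict[Prefix, Tuple[Set[str], List[Entry]]] = {}

    def _load(self, prefix: Prefix) -> Tuple[Set[str], List[Entry]]:
        # On-demand loading: fetch a node only the first time it is visited.
        if prefix not in self.loaded:
            self.loaded[prefix] = self._store[prefix]
        return self.loaded[prefix]

    def matches(self, sentence: List[str], start: int) -> Iterator[Tuple[int, List[Entry]]]:
        """Yield (end, translations) for each source phrase sentence[start:end]."""
        prefix: Prefix = ()
        for end in range(start, len(sentence)):
            children, _ = self._load(prefix)
            if sentence[end] not in children:
                break  # no longer phrase shares this prefix, so stop early
            prefix = prefix + (sentence[end],)
            _, translations = self._load(prefix)
            if translations:
                yield end + 1, translations


table = OnDemandPhraseTable([
    ("das haus", "the house", 0.8),
    ("das", "the", 0.6),
    ("ein buch", "a book", 0.7),
])
found = list(table.matches("das haus ist klein".split(), 0))
```

In this sketch, a single walk down the tree finds every source phrase that starts at a given position, and only the nodes along that walk are loaded. Entries that the input never touches (here the `ein buch` branch) stay out of memory, which is the property that makes the table independent of a specific test set.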