Paper: Structural Feature Selection For English-Korean Statistical Machine Translation

ACL ID C00-1064
Title Structural Feature Selection For English-Korean Statistical Machine Translation
Venue International Conference on Computational Linguistics
Session Main Conference
Year 2000
Authors

When aligning texts in very different languages such as Korean and English, structural features beyond word or phrase give useful information. In this paper, we present a method for selecting structural features of two languages, from which we construct a model that assigns the conditional probabilities to corresponding tag sequences in bilingual English-Korean corpora. For tag sequence mapping between two languages, we first define a structural feature function which represents statistical properties of the empirical distribution of a set of training samples. The system, based on the maximum entropy concept, selects only features that produce high increases in log-likelihood of training samples. These structurally mapped features are more informative knowledge for statistical ...
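
The selection criterion described in the abstract (keep only features that yield a high increase in training log-likelihood under a maximum entropy model) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy tag pairs, the feature names, the learning rate, the step count, and the `min_gain` threshold are all assumptions made for illustration. As a simplification, each candidate's gain is estimated by fitting only that candidate's weight while previously chosen weights stay fixed.

```python
import math

def log_likelihood(samples, labels, features, weights):
    """Average conditional log-likelihood of p(y | x) under a log-linear model."""
    total = 0.0
    for x, y in samples:
        scores = {c: sum(w * f(x, c) for f, w in zip(features, weights))
                  for c in labels}
        log_z = math.log(sum(math.exp(s) for s in scores.values()))
        total += scores[y] - log_z
    return total / len(samples)

def fit_last_weight(samples, labels, features, weights, steps=200, lr=0.5):
    """Gradient ascent on the last feature's weight only (others stay fixed)."""
    weights = list(weights)
    f = features[-1]
    for _ in range(steps):
        grad = 0.0
        for x, y in samples:
            scores = {c: sum(w * g(x, c) for g, w in zip(features, weights))
                      for c in labels}
            z = sum(math.exp(s) for s in scores.values())
            expected = sum(math.exp(scores[c]) / z * f(x, c) for c in labels)
            grad += f(x, y) - expected  # empirical count minus model expectation
        weights[-1] += lr * grad / len(samples)
    return weights

def select_features(samples, labels, candidates, min_gain=0.01):
    """Greedily add the candidate with the largest log-likelihood gain."""
    chosen, feats, weights = [], [], []
    base = log_likelihood(samples, labels, feats, weights)
    while True:
        best = None
        for name, f in candidates.items():
            if name in chosen:
                continue
            trial = fit_last_weight(samples, labels, feats + [f], weights + [0.0])
            gain = log_likelihood(samples, labels, feats + [f], trial) - base
            if best is None or gain > best[0]:
                best = (gain, name, f, trial)
        if best is None or best[0] < min_gain:
            return chosen
        gain, name, f, weights = best
        chosen.append(name)
        feats.append(f)
        base += gain

# Hypothetical toy data: (English tag sequence, Korean tag) pairs.
SAMPLES = [("DT NN", "N")] * 3 + [("VB", "V")] * 3 + [("DT NN", "V")]
LABELS = ["N", "V"]
CANDIDATES = {
    "DT_NN->N": lambda x, y: 1.0 if x == "DT NN" and y == "N" else 0.0,
    "VB->V": lambda x, y: 1.0 if x == "VB" and y == "V" else 0.0,
    "never": lambda x, y: 0.0,  # never fires, so it adds no gain
}

chosen = select_features(SAMPLES, LABELS, CANDIDATES)
print(chosen)
```

In this toy setting, a feature that never fires contributes zero gain and is left out, while features that agree with the empirical distribution are added in order of their gain.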