Paper: Concept Unification Of Terms In Different Languages For IR

ACL ID P06-1081
Title Concept Unification Of Terms In Different Languages For IR
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 2006
Authors

For historical and cultural reasons, English phrases, especially proper nouns and new words, frequently appear in Web pages written primarily in Asian languages such as Chinese and Korean. Although these English terms and their equivalents in the Asian languages refer to the same concept, they are erroneously treated as independent index units in traditional Information Retrieval (IR). This paper describes the degree to which the problem arises in IR and suggests a novel technique to solve it. Our method first extracts an English phrase from Asian-language Web pages, and then unifies the extracted phrase and its equivalent(s) in the Asian language as one index unit. Experimental results show that the high precision of our conceptual unification approach greatly improves the...
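
The two steps named in the abstract (extract English phrases from an Asian-language page, then index each phrase and its equivalents as one unit) can be illustrated with a minimal sketch. This is not the authors' implementation: the equivalence table `EQUIVALENTS`, the concept ID `C_VITERBI`, the regular-expression extractor, and all function names are hypothetical, and the paper's own method for finding equivalents is not reproduced here.

```python
import re
from collections import defaultdict

# Hypothetical, hand-written equivalence table (assumption for illustration):
# each surface form maps to one shared concept ID used as the index unit.
EQUIVALENTS = {
    "viterbi": "C_VITERBI",
    "维特比": "C_VITERBI",
}

# Crude stand-in for phrase extraction: maximal runs of ASCII words.
ENGLISH_RUN = re.compile(r"[A-Za-z]+(?:\s+[A-Za-z]+)*")


def extract_english_phrases(text):
    """Return the runs of English words found in mixed-language text."""
    return [m.group(0) for m in ENGLISH_RUN.finditer(text)]


def index_terms(text):
    """Map a text to index units, replacing known surface forms by concept IDs."""
    lowered = text.lower()
    terms = set()
    # English phrases: use the concept ID if known, else the lowercased phrase.
    for phrase in extract_english_phrases(text):
        key = phrase.lower()
        terms.add(EQUIVALENTS.get(key, key))
    # Any known surface form (English or not) found by substring match.
    for surface, concept in EQUIVALENTS.items():
        if surface in lowered:
            terms.add(concept)
    return terms


def build_index(docs):
    """Build an inverted index from {doc_id: text} over unified index units."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in index_terms(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return the IDs of documents sharing at least one index unit with the query."""
    hits = set()
    for term in index_terms(query):
        hits |= index.get(term, set())
    return hits


docs = {
    1: "Viterbi 算法用于解码",  # English term inside Chinese text
    2: "维特比算法用于解码",    # Chinese equivalent only
}
index = build_index(docs)
```

With this toy table, `search(index, "Viterbi")` and `search(index, "维特比")` both retrieve documents 1 and 2, whereas an index that treated the two surface forms as independent units would return only one document per query.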