Paper: Term Extraction Through Unithood and Termhood Unification

ACL ID I08-2084
Title Term Extraction Through Unithood and Termhood Unification
Venue International Joint Conference on Natural Language Processing
Session Main Conference
Year 2008

Term Extraction (TE) is an important component of many NLP applications. In general, terms are extracted for a given text collection based on global context and frequency analysis of word/phrase associations. These extracted terms effectively represent the text content of the collection for knowledge elicitation tasks. However, they fail to capture the local contextual information of each document effectively. In this paper, we refine the state-of-the-art C/NC-Value term weighting method by considering both termhood and unithood measures, and use the previously extracted terms to direct local term extraction for each document. We performed the experiments on the Straits Times 2006 corpus and evaluated our performance using the Wikipedia termbank. The experiments s...
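
For reference, the C/NC-Value baseline named in the abstract builds on the standard C-value score of Frantzi et al., which discounts a candidate's frequency by the average frequency of the longer candidates that contain it. The following is a minimal sketch of that baseline only, not of the refined measure proposed in this paper. The function names and the toy frequency counts are hypothetical, chosen purely for illustration.

```python
import math


def contains(longer, shorter):
    """True if the word tuple `shorter` occurs contiguously inside the
    strictly longer word tuple `longer`."""
    n, m = len(longer), len(shorter)
    return m < n and any(longer[i:i + m] == shorter for i in range(n - m + 1))


def c_value(freq):
    """Standard C-value for each candidate term.

    `freq` maps a candidate (tuple of words) to its corpus frequency.
    Note that single-word candidates score 0 because log2(1) == 0.
    """
    scores = {}
    for a, f_a in freq.items():
        # Frequencies of the longer candidates in which `a` is nested.
        nesting = [f_b for b, f_b in freq.items() if contains(b, a)]
        if nesting:
            # Discount by the mean frequency of the nesting candidates.
            adjusted = f_a - sum(nesting) / len(nesting)
        else:
            adjusted = f_a
        scores[a] = math.log2(len(a)) * adjusted
    return scores


# Hypothetical counts, not taken from the paper's corpus.
freq = {
    ("term", "extraction"): 10,
    ("automatic", "term", "extraction"): 4,
    ("local", "term", "extraction"): 2,
}
scores = c_value(freq)
```

In this toy input, "term extraction" is nested in two longer candidates with mean frequency 3, so its frequency of 10 is discounted to 7 before the length weight is applied.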