Paper: Description Of The LINK System Used For MUC-5

ACL ID M93-1024
Title Description Of The LINK System Used For MUC-5
Venue Message Understanding Conference
Session Main Conference
Year 1993

are removed, as are author name lines, and COMLINE tag lines. Sentences that are too short to be interesting are removed.

The Tagger

Because the input is mixed case in this domain, and because many of the proper names that would normally be unknown to the system lexicon are capitalized, the MUC-5 LINK system uses a pre-parse tagger to process and attempt to identify capitalized words which are passed as strings from the Tokenizer. The Tagger uses heuristics (aka hacks) to break apart strings in several different ways. Some of the tags that are used include: :COMP-NAME for things that seem to be obviously company names, :LOCATION for city/state pairs, :PERSON-NAME for people names (if they have Mr, Mrs, VP, Dr in front), and :NAME for other names. Some example rules that the tagger ...
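
As a rough illustration of the four tag categories described above, the following is a minimal sketch of a heuristic tagger for capitalized strings. It is not the LINK system's actual rule set: the function name `tag_capitalized`, the company-suffix list, and the city/state pattern are all assumptions made for this example. Only the tag names and the title cues (Mr, Mrs, VP, Dr) come from the text.

```python
import re

# Title cues named in the text for :PERSON-NAME.
TITLES = ("Mr", "Mrs", "VP", "Dr")
# Assumed cue words for company names; the source does not list them.
COMPANY_SUFFIXES = ("Inc", "Corp", "Co", "Ltd")
# Assumed shape of a city/state pair, e.g. "Austin, Texas" or "Austin, TX".
CITY_STATE = re.compile(r"[A-Z][a-z]+(?: [A-Z][a-z]+)*, [A-Z][A-Za-z]+")


def tag_capitalized(string: str) -> str:
    """Return one of the tags named in the text for a capitalized string."""
    # Drop periods so that "Mr." and "Inc." match the cue lists.
    words = string.replace(".", "").split()
    if not words:
        return ":NAME"
    if words[-1] in COMPANY_SUFFIXES:
        return ":COMP-NAME"
    if len(words) > 1 and words[0] in TITLES:
        return ":PERSON-NAME"
    if CITY_STATE.fullmatch(string):
        return ":LOCATION"
    # Fallback for any other capitalized name.
    return ":NAME"


if __name__ == "__main__":
    for s in ("Applied Materials Inc.", "Mr. Smith", "Austin, Texas", "Nikon"):
        print(s, "->", tag_capitalized(s))
```

The rules are checked in a fixed order and the first match wins, which is one simple way to keep such heuristics predictable; a real tagger would likely need more cues and some handling of ambiguous strings.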