LHNCBC: Document Abstract
Year: 2002
LHNCBC-2002-025
Assessing the Consistency of a Biomedical Terminology Through Lexical Knowledge
Bodenreider O, Burgun A, Rindflesch TC
Proc. of the Workshop on Natural Language Processing in Biomedical Applications. 2002;:77-83.
Objective: In this paper, we investigate the use of lexical knowledge for determining consistency in biomedical terminologies. We focus on adjectival modification as a way of assessing whether linguistic phenomena are used systematically to represent similar lexical or semantic features in the constituent terms of a vocabulary.
Methods: Terms consisting of one or more adjectival modifiers followed by a head noun are selected from disease and procedure terms in SNOMED. After one modifier is extracted from a term, the remaining head noun, along with the other modifiers, if any, forms the context of that term. Modifiers sharing the same context are clustered together and ranked by frequency. For a pair (m1, m2) of frequently co-occurring modifiers, two terms m1c and m2c are created by systematically associating each modifier with every context c in which at least one of the modifiers appears. The existence of m1c and m2c, as well as that of the term corresponding to the context alone, is checked in both the vocabulary studied and the entire UMLS Metathesaurus. Finally, relationships between m1c and m2c and between each of these terms and their context c are studied.
Results: Four pairs of modifiers were studied: (acute, chronic), (unilateral, bilateral), (primary, secondary), and (acquired, congenital). The number of contexts studied for each pair ranged from 73 to 974. The percentage of contexts associated with both modifiers ranged from 5% to 50% in SNOMED and from 10% to 60% in UMLS. The presence of the context term varied from 31% to 64% in SNOMED and from 43% to 79% in UMLS. Finally, 172 occurrences (9%) of synonymy between a modified term and the context term were found in SNOMED, and 145 such occurrences (8%) were found in the entire Metathesaurus.
Discussion: The application of this method to discovering inconsistencies in a vocabulary is discussed, as well as differences among the pairs of qualifiers studied. Examples of inconsistencies are presented and their consequences in terms of knowledge representation are discussed.
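The Methods steps (extract a modifier, group modifiers by shared context, rank co-occurring pairs, then check whether m1c, m2c, and c exist) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The toy vocabulary, the `MODIFIERS` set, and the function names are hypothetical, and the sketch assumes that adjectives are recognized by lookup in a fixed set and that a term is rebuilt by prepending the modifier to the context.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy vocabulary and adjective list (not taken from SNOMED or UMLS).
TERMS = {
    "acute bronchitis", "chronic bronchitis", "bronchitis",
    "acute renal failure", "chronic renal failure",
    "chronic sinusitis", "sinusitis",
}
MODIFIERS = {"acute", "chronic", "renal"}


def contexts_by_modifier(terms, modifiers):
    """Map each context to the set of modifiers observed with it."""
    by_context = defaultdict(set)
    for term in terms:
        words = term.split()
        # Keep only terms of the form: one or more modifiers + head noun.
        if len(words) < 2 or not all(w in modifiers for w in words[:-1]):
            continue
        for i, word in enumerate(words[:-1]):
            # Removing one modifier leaves the context (other modifiers + head).
            context = " ".join(words[:i] + words[i + 1:])
            by_context[context].add(word)
    return by_context


def rank_pairs(by_context):
    """Count, for each modifier pair, the contexts in which both occur."""
    counts = defaultdict(int)
    for mods in by_context.values():
        for pair in combinations(sorted(mods), 2):
            counts[pair] += 1
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


def check_pair(m1, m2, by_context, terms):
    """For each context seen with m1 or m2, test whether m1c, m2c and c exist."""
    rows = []
    for context, mods in sorted(by_context.items()):
        if m1 in mods or m2 in mods:
            rows.append({
                "context": context,
                # Simplification: the modifier is prepended to the context.
                "m1c": f"{m1} {context}" in terms,
                "m2c": f"{m2} {context}" in terms,
                "c": context in terms,
            })
    return rows


if __name__ == "__main__":
    by_context = contexts_by_modifier(TERMS, MODIFIERS)
    (m1, m2), _count = rank_pairs(by_context)[0]
    for row in check_pair(m1, m2, by_context, TERMS):
        print(row)
```

A context for which only one of m1c and m2c exists, or for which the context term c is absent, is a candidate for the kind of inconsistency the abstract describes. Checking against a second, larger term set (as the paper does with the UMLS Metathesaurus) would only require passing a different `terms` argument to `check_pair`.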