Correlation identification methods based on concept co-occurrence are commonly applied to medical free text. However, concepts co-occur for different reasons, and generalizable approaches for determining the meaning of those co-occurrences are needed. In this talk, I will describe a new extraction approach that combines a medical knowledge base (UMLS) with text classification methods to identify the semantics of the relationships between co-occurring concepts in MEDLINE abstracts. The major difficulty of this approach is the lack of annotated sentences for training and testing. I will describe how we semi-automatically annotate sentences using a combination of heuristics and a partially supervised classification method. In our evaluations, we focus only on correlations between drugs or chemicals and disorders, and we limit the relationship meanings to "treats" and "causes". Based on the good performance in these evaluations, we believe that our approach shows great promise for tackling the difficult relationship-identification problem in medical free text.
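
To make the heuristic annotation step more concrete, here is a minimal Python sketch of how sentences with a co-occurring drug and disorder might receive a weak "treats" or "causes" label from cue words. It is an illustration under stated assumptions, not the method presented in the talk: the toy lexicons `DRUGS` and `DISORDERS` stand in for UMLS concept mapping, and the cue lists in `CUES` and all function names are hypothetical.

```python
import re
from itertools import product

# Hypothetical toy lexicons; a real system would map text to UMLS concepts.
DRUGS = {"aspirin", "metformin"}
DISORDERS = {"headache", "diabetes", "gastric bleeding"}

# Hypothetical cue substrings used to assign weak (heuristic) labels.
CUES = {
    "treats": ("treat", "reliev", "therapy for"),
    "causes": ("cause", "induce", "lead to", "leads to"),
}


def find_terms(sentence, lexicon):
    """Return the lexicon terms that occur as whole words in the sentence."""
    lowered = sentence.lower()
    return sorted(
        term for term in lexicon
        if re.search(r"\b" + re.escape(term) + r"\b", lowered)
    )


def heuristic_label(sentence):
    """Return a label only when exactly one relation's cues match."""
    lowered = sentence.lower()
    hits = [
        label for label, cues in CUES.items()
        if any(cue in lowered for cue in cues)
    ]
    return hits[0] if len(hits) == 1 else None


def label_cooccurrences(sentences):
    """Pair every co-occurring drug and disorder with the sentence's weak label."""
    rows = []
    for sentence in sentences:
        label = heuristic_label(sentence)
        drugs = find_terms(sentence, DRUGS)
        disorders = find_terms(sentence, DISORDERS)
        for drug, disorder in product(drugs, disorders):
            rows.append((drug, disorder, label))
    return rows


if __name__ == "__main__":
    examples = [
        "Aspirin relieves headache.",
        "Aspirin can cause gastric bleeding.",
        "Metformin and diabetes were studied.",
    ]
    for row in label_cooccurrences(examples):
        print(row)
```

Pairs left with a `None` label are the ones the heuristics cannot decide. In a pipeline like the one described above, those would presumably be passed to the partially supervised classifier rather than discarded.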