Thierry Fontenelle

Microsoft Natural Language Group

A large bilingual lexical resource for word sense disambiguation

UW/Microsoft Symposium, 3/12/04

Over the last few years, NLP researchers who have tried to exploit the linguistic information embodied in machine-readable dictionaries have focused primarily on monolingual dictionaries, somewhat neglecting bilingual ones. I will show how the collocational information contained in the Collins-Robert English-French dictionary has been made explicit to create a large database of collocational and semantic information. The database is accessible via several entry points: the English base of the collocation, the English collocator, the French translations, or the lexical-semantic relation holding between the members of a collocation pair. During my presentation, I will demonstrate how linguists, lexicographers, or translators can select a particular lexical-semantic relation and retrieve all the exponents of that relation, which can then be used to extract fragments of the lexicon that exhibit similar behavior. The emphasis will be on the flexible access possibilities, which make it possible to use the dictionary as a resource for language generation, word sense disambiguation, and translation selection.
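
To make the idea of multiple entry points concrete, here is a minimal Python sketch of a collocation store indexed by each of the four access keys named above. Everything in it is an assumption made for illustration: the `Collocation` and `CollocationIndex` names, the relation labels ("intensifier", "collective"), and the four toy records are invented and are not taken from the Collins-Robert dictionary or from the actual database.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Collocation:
    """One hypothetical record: an English collocation and its French rendering."""
    base: str          # English base of the collocation, e.g. "rain"
    collocator: str    # English collocator, e.g. "heavy"
    translation: str   # French translation of the collocator in this context
    relation: str      # assumed lexical-semantic relation label


class CollocationIndex:
    """Index the same records under every entry point (illustrative only)."""

    FIELDS = ("base", "collocator", "translation", "relation")

    def __init__(self, records):
        self._by_field = {field: defaultdict(list) for field in self.FIELDS}
        for record in records:
            for field in self.FIELDS:
                self._by_field[field][getattr(record, field)].append(record)

    def lookup(self, field, value):
        """Return all records whose `field` equals `value`."""
        return list(self._by_field[field].get(value, []))

    def exponents(self, relation):
        """Return the sorted set of collocators that express `relation`."""
        return sorted({r.collocator for r in self.lookup("relation", relation)})

    def translate(self, base, collocator):
        """Pick the French rendering of `collocator` given its base."""
        return [r.translation for r in self.lookup("base", base)
                if r.collocator == collocator]


# Toy data, invented for this sketch.
RECORDS = [
    Collocation("rain", "heavy", "forte", "intensifier"),
    Collocation("smoker", "heavy", "gros", "intensifier"),
    Collocation("wind", "strong", "fort", "intensifier"),
    Collocation("bee", "swarm", "essaim", "collective"),
]

index = CollocationIndex(RECORDS)
```

With this toy data, `index.exponents("intensifier")` gathers every collocator filed under that relation, while `index.translate("rain", "heavy")` and `index.translate("smoker", "heavy")` return different French words for the same English collocator. That second lookup is the translation-selection use in miniature: the base of the collocation disambiguates the collocator.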

