Lab member thesis topics

Utilizing Multilingual Resources for Automatic Lexical Acquisition (MA)

Michael Wayne Goodman

(This is very much a rough draft; expect further revisions later this term.) I will explore methods for using resources such as the Turing Center's Transgraph to automatically map words to lexical types, as well as methods for evaluating the performance of such a system. Some of the expected problems include: how to extract, derive, or assume syntactic constraints for words when resources are minimal; how to deal with source words that don't map to a single word in the target language (e.g., English "hurt" vs. Italian "make harm"); how to deal with small and incomplete grammars (e.g., grammars in which not all possible lexical types are represented); and when and how to solicit information from the user/linguist. (updated 2008.04.02)
I'm investigating how we can leverage the knowledge built into the lexicons of large, mature grammars to help bootstrap the lexicons of much smaller grammars. For my test, I am using the Jacy Japanese grammar as the source and the Ita Italian MMT grammar as the target. I am using the Turing Center's Transgraph project to provide word translations, and some hand-built type mappings from one grammar to the other to determine the lexical types a word can have. Because this process overgenerates, many spurious items are produced, so I need to filter the data to try to remove them. Another aspect of the project is to try to automatically learn transfer rules between the grammars involved. This becomes difficult when a source word does not transfer to a single target word, when the argument structure changes in translation, etc. (updated 2008.06.27)
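
The translate, map types, then filter pipeline described above can be sketched roughly as follows. This is only an illustrative sketch, not the actual implementation: the words, translations, and type names are made-up toy data held in plain dictionaries (they are not taken from Jacy, the Italian grammar, or Transgraph), the function names `propose_entries` and `filter_entries` are hypothetical, and the filter shown (keep a candidate only if enough distinct source words support it) is an assumed stand-in for whatever filtering the project ends up using.

```python
from collections import defaultdict


def propose_entries(source_lexicon, translations, type_map):
    """Propose (target_word, target_type) candidates.

    Returns a dict mapping each candidate to the set of source words
    that led to it. All three inputs are plain dicts (toy stand-ins).
    """
    support = defaultdict(set)
    for src_word, src_types in source_lexicon.items():
        for tgt_word in translations.get(src_word, ()):
            for src_type in src_types:
                for tgt_type in type_map.get(src_type, ()):
                    support[(tgt_word, tgt_type)].add(src_word)
    return support


def filter_entries(support, min_support=2):
    """Keep candidates backed by at least `min_support` source words."""
    return sorted(
        entry for entry, sources in support.items()
        if len(sources) >= min_support
    )


# Hypothetical toy data: source words with their source-grammar types.
source_lexicon = {
    "inu": ["src-common-noun"],
    "wanko": ["src-common-noun"],
    "hashiru": ["src-intrans-verb"],
}
# Hypothetical translations (the role a translation resource would play).
translations = {
    "inu": ["cane"],
    "wanko": ["cane"],
    "hashiru": ["correre", "corsa"],
}
# Hypothetical hand-built mapping from source types to target types.
type_map = {
    "src-common-noun": ["tgt-common-noun"],
    "src-intrans-verb": ["tgt-intrans-verb"],
}

candidates = propose_entries(source_lexicon, translations, type_map)
kept = filter_entries(candidates, min_support=2)
print(kept)
```

In this toy run the noun "corsa" is proposed as an intransitive verb because it appears as a translation of a verb, which is the kind of spurious item the filtering step is meant to catch. The support threshold removes it, but it also removes the legitimate entry for "correre", which illustrates the precision/recall trade-off any such filter has to manage.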

Generating Referring Expressions (MA)

Margaret Ann Mitchell

