Emily M. Bender

UW Linguistics

Grammar Checking in the Arboretum: Finding and Curing Trees

UW/Microsoft Symposium, 10/24/03

This talk describes an application of the LinGO English Resource Grammar (ERG) (Flickinger 2000) to the problem of grammar checking for non-native speakers. The ERG is a broad-coverage precision grammar of English in the HPSG (Head-driven Phrase Structure Grammar) framework, developed over the past nine years and comprising over 70,000 lines of code. To create a prototype grammar checker, we have augmented the ERG with mal-rules, which produce well-formed semantic representations from ill-formed syntactic input. We then treat the problem of correcting the sentence as a kind of machine translation, and generate a well-formed sentence from the semantic representation using the core grammar (without the mal-rules).
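
The parse-then-regenerate idea can be illustrated with a deliberately tiny sketch. Nothing here is the ERG or its tooling: the three-word grammar, the lexicon, and the names Semantics, parse, generate and correct are all hypothetical, and the single mal-rule (accept a subject-verb agreement error and trust the number of the subject) is an assumption chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Semantics:
    """Toy stand-in for a semantic representation (hypothetical)."""
    subject: str   # noun lemma
    plural: bool   # number of the subject
    verb: str      # verb lemma


# Toy lexicon (hypothetical): surface form -> (lemma, form is plural)
NOUNS = {"dog": ("dog", False), "dogs": ("dog", True)}
VERBS = {"barks": ("bark", False), "bark": ("bark", True)}


def parse(sentence: str, allow_mal_rules: bool) -> Optional[Semantics]:
    """Parse 'the NOUN VERB'; return None if the grammar rejects it."""
    tokens = sentence.lower().split()
    if len(tokens) != 3 or tokens[0] != "the":
        return None
    if tokens[1] not in NOUNS or tokens[2] not in VERBS:
        return None
    noun, noun_plural = NOUNS[tokens[1]]
    verb, verb_plural = VERBS[tokens[2]]
    if noun_plural != verb_plural and not allow_mal_rules:
        return None  # the core grammar rejects agreement errors
    # Mal-rule (assumption): keep the subject's number, ignore the verb's.
    return Semantics(noun, noun_plural, verb)


def generate(sem: Semantics) -> str:
    """Realize the semantics with the core grammar only (no mal-rules)."""
    noun = sem.subject + "s" if sem.plural else sem.subject
    verb = sem.verb if sem.plural else sem.verb + "s"
    return f"the {noun} {verb}"


def correct(sentence: str) -> Optional[str]:
    """Parse with mal-rules enabled, then regenerate a well-formed sentence."""
    sem = parse(sentence, allow_mal_rules=True)
    return None if sem is None else generate(sem)


print(correct("the dogs barks"))  # the dogs bark
```

Because the ill-formed input and its correction share one semantic representation, the correction step reduces to generation; the error-specific knowledge lives only in the parsing grammar.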

One consequence of the grammar's broad coverage is ambiguity: most inputs have more than one possible parse, and adding mal-rules only increases the number of analyses. We are exploring handling the ambiguity on the parsing side with a maximum entropy model trained on a treebank (Oepen et al. 2002), and on the generation side by aligning the generated output as closely as possible to the winning parse of the input.
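
The two disambiguation steps can be sketched roughly as follows. This is an illustration under stated assumptions, not the system's implementation: the feature names and weights are invented (a real maximum entropy model would estimate the weights from a treebank), and token-level similarity to the input string, computed with Python's difflib, stands in for alignment against the winning parse.

```python
import math
from difflib import SequenceMatcher

# Hypothetical feature weights; a trained model would learn these.
WEIGHTS = {"uses_mal_rule": -1.5, "head_subject": 0.8}


def parse_probabilities(candidate_features):
    """Log-linear (maximum entropy) distribution over candidate parses.

    candidate_features: one dict per parse, mapping feature name -> count.
    """
    scores = [
        sum(WEIGHTS.get(name, 0.0) * count for name, count in feats.items())
        for feats in candidate_features
    ]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(score - top) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]


def closest_output(input_sentence, generated):
    """Pick the generated sentence whose tokens best match the input.

    Assumption: string similarity approximates alignment to the parse.
    """
    source = input_sentence.lower().split()

    def similarity(candidate):
        return SequenceMatcher(None, source, candidate.lower().split()).ratio()

    return max(generated, key=similarity)


# Two candidate parses; the second one relies on a mal-rule.
probs = parse_probabilities([
    {"head_subject": 1},
    {"head_subject": 1, "uses_mal_rule": 1},
])
best = closest_output(
    "the dogs barks loudly",
    ["loudly the dogs bark", "the dogs bark loudly"],
)
```

Preferring the candidate closest to the input keeps the suggested correction minimal, so the learner sees only the change that repairs the error.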

