Jianfeng Gao

Microsoft Research

A Comparative Study of Parameter Estimation Methods for Statistical Natural Language Processing

UW/Microsoft Symposium, 04/20/07

In this talk we present a comparative study of five parameter estimation algorithms on four NLP tasks. Three of the five algorithms are well known in the computational linguistics community: Maximum Entropy (ME) estimation with L2 regularization, the Averaged Perceptron (AP), and Boosting. We also investigate two L1-regularized methods: ME estimation with the increasingly popular L1 regularization, trained using a novel optimization algorithm, and BLasso, a version of Boosting with Lasso (L1) regularization. We first evaluate all five estimators on two reranking tasks: parse selection and language model adaptation. We then apply the best of these estimators to two additional tasks involving conditional sequence models: a Conditional Markov Model (CMM) for part-of-speech tagging and a Conditional Random Field (CRF) for Chinese word segmentation. Our experiments show that, across tasks, three of the estimators (ME estimation with L1 or L2 regularization, and the Averaged Perceptron) are in a near statistical tie for first place.
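
For readers unfamiliar with the Averaged Perceptron in a reranking setting, the following is a minimal sketch of the general technique, not the implementation used in the study. It assumes each training example is a list of candidate outputs represented as sparse feature dictionaries plus the index of the correct candidate; the names `score`, `rerank`, and `train_averaged_perceptron`, the data layout, and the default number of epochs are all hypothetical choices made for illustration.

```python
from typing import Dict, List, Tuple

Features = Dict[str, float]               # sparse feature vector (assumed layout)
Example = Tuple[List[Features], int]      # (candidate list, index of correct candidate)


def score(weights: Dict[str, float], feats: Features) -> float:
    """Dot product between the weight vector and a sparse feature vector."""
    return sum(weights.get(name, 0.0) * value for name, value in feats.items())


def rerank(weights: Dict[str, float], candidates: List[Features]) -> int:
    """Return the index of the highest-scoring candidate (first one on ties)."""
    return max(range(len(candidates)), key=lambda i: score(weights, candidates[i]))


def train_averaged_perceptron(data: List[Example], epochs: int = 5) -> Dict[str, float]:
    """Perceptron updates; the returned weights are averaged over all steps."""
    weights: Dict[str, float] = {}
    totals: Dict[str, float] = {}
    steps = 0
    for _ in range(epochs):
        for candidates, gold in data:
            predicted = rerank(weights, candidates)
            if predicted != gold:
                # Move weights toward the correct candidate, away from the predicted one.
                for name, value in candidates[gold].items():
                    weights[name] = weights.get(name, 0.0) + value
                for name, value in candidates[predicted].items():
                    weights[name] = weights.get(name, 0.0) - value
            # Accumulate the current weights after every example for averaging.
            for name, value in weights.items():
                totals[name] = totals.get(name, 0.0) + value
            steps += 1
    return {name: total / steps for name, total in totals.items()}


# Toy usage: feature "b" marks the correct candidate, feature "a" the wrong one.
toy_data: List[Example] = [
    ([{"a": 1.0}, {"b": 1.0}], 1),
    ([{"a": 1.0, "c": 1.0}, {"b": 1.0, "c": 1.0}], 1),
]
toy_weights = train_averaged_perceptron(toy_data)
best = rerank(toy_weights, [{"a": 1.0}, {"b": 1.0}])
```

Averaging the weight vector over all training steps, rather than keeping only the final one, is what distinguishes this from the plain perceptron. Accumulating the full weight vector after every example is the simplest way to write it; practical implementations typically use a lazy update to avoid that cost.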
