Yan Song and Fei Xia


Domain adaptation via effective feature engineering across domains

Domain adaptation aims to bridge the performance gap that arises when training and test data come from different domains. In natural language processing, domain adaptation techniques have been applied to tasks such as POS tagging, parsing, machine translation, and sentiment analysis, and the area has been studied extensively over the past decade.

In this talk, we present two novel approaches to domain adaptation. The first addresses the limitations of two existing methods, training data selection and feature augmentation, by combining them: we use training data selection to divide the source-domain training data into two parts, pseudo-target data (the selected part) and source data (the unselected part), and then apply feature augmentation to the two parts. The second improves system performance by introducing new features that represent properties shared between the two domains. Our experiments show that these approaches boost system performance not only when the training and test data come from different domains, but also when they come from two closely related languages.
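As a rough illustration of how the first approach fits together, the sketch below splits source-domain sentences into a selected and an unselected part and then augments the features of each part. It is not the speakers' implementation. The selection score (share of a sentence's tokens that occur in the target vocabulary), the `ratio` parameter, the function names, and the copy-each-feature style of augmentation (a shared copy plus a part-specific copy) are all assumptions made for the example.

```python
def select_pseudo_target(source_sents, target_sents, ratio=0.5):
    """Split source sentences into (selected, unselected) parts.

    Assumed scoring: the fraction of a sentence's tokens that also
    appear in the target-domain vocabulary. The abstract does not
    say which selection criterion the talk uses.
    """
    target_vocab = {tok for sent in target_sents for tok in sent}

    def score(sent):
        return sum(tok in target_vocab for tok in sent) / max(len(sent), 1)

    # Highest-scoring sentences first; ties keep their original order.
    ranked = sorted(source_sents, key=score, reverse=True)
    k = int(len(ranked) * ratio)
    return ranked[:k], ranked[k:]


def augment(features, part):
    """Copy every feature into a shared version and a part-specific version.

    `part` is a hypothetical label such as "pseudo_target" or "source".
    """
    augmented = {}
    for name, value in features.items():
        augmented["general:" + name] = value
        augmented[part + ":" + name] = value
    return augmented


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    source = [
        ["the", "stock", "rose"],
        ["the", "protein", "binds"],
        ["gene", "expression", "rose"],
        ["markets", "fell"],
    ]
    target = [["protein", "expression"], ["the", "gene", "binds"]]

    pseudo_target, remaining = select_pseudo_target(source, target)
    print(pseudo_target)
    print(remaining)
    print(augment({"word=rose": 1}, "pseudo_target"))
```

A learner trained on the augmented features can weight the `general:` copies for behavior shared by both parts and the part-specific copies for behavior that differs, which is the usual motivation for this style of augmentation.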

Yan Song was a visiting student at UW in 2011-2012. After receiving his PhD from City University of Hong Kong in 2014, he joined Microsoft Search Technology Center Asia in Beijing, China, as a speech scientist. Fei Xia is an Associate Professor in the Linguistics Department at UW. She is one of the organizers of the UW/MS NLP Symposium.
