Many probabilistic models for Natural Language Processing tasks, such as language modeling, tagging, or machine translation, are adversely affected by a lack of training data. One possible approach to increasing model robustness is to adopt a different word representation in which words are encoded as feature vectors rather than indivisible units. Such a representation allows training data to be shared across words and permits a wider range of probability smoothing techniques. In addition, it opens up the possibility of sharing data between different but related languages and dialects, which is of importance for developing natural language technology for resource-poor languages. This talk will describe the use of feature-based representations for various NLP applications, in particular language modeling and statistical machine translation. It will present the basic theoretical model, data-driven optimization procedures, and experimental evaluations for a range of languages, including English, Spanish, and Arabic.
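The core idea of sharing training data across words via feature vectors can be illustrated with a toy sketch. This is not the talk's actual model; the feature set (suffix, capitalization) and function names here are illustrative assumptions only. The point is that a rare word still contributes evidence through features it shares with frequent words.

```python
# Toy sketch (assumption, not the talk's model): encode each word as a
# set of features so counts can be shared across words.
from collections import Counter

def features(word):
    # Hypothetical feature extractor: the surface form plus crude
    # morphological cues (final two characters, capitalization).
    return {
        ("word", word),
        ("suffix", word[-2:]),
        ("capitalized", word[0].isupper()),
    }

def feature_counts(corpus):
    # Accumulate feature counts over a corpus. Each word appearing
    # only once still reinforces any features it shares with others,
    # which is what enables smoothing across related words.
    counts = Counter()
    for word in corpus:
        counts.update(features(word))
    return counts

corpus = ["walked", "talked", "Paris", "jumped"]
counts = feature_counts(corpus)
# Each word occurs once, but the shared feature ("suffix", "ed")
# accumulates evidence from three of them.
print(counts[("suffix", "ed")])  # -> 3
```

A model built on such counts can back off from word-level to feature-level statistics when a word is unseen, which is one way the data sharing described above can be realized.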