Reinforcement learning has been successfully applied to the problem of mapping natural language instructions to actions with little or no supervision (Branavan et al., 2009). However, this mapping cannot be used to solve new problem instances unless we also receive a set of instructions for the new instance. We present an algorithm for generalizing the knowledge gained while learning to map instructions to actions, allowing us to solve new problem instances with no additional knowledge. The algorithm is a form of imitation learning using Counting-MLNs (C-MLNs), a novel statistical relational representation that can reason about the number of objects satisfying a formula. We give an algorithm for learning C-MLNs and apply it to the problem of generalizing instructions for the Crossblock puzzle game. We also investigate the use of C-MLNs for standard relational reinforcement learning, without natural language instructions.
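To make the central idea concrete, the sketch below shows what a "counting feature" looks like: a feature whose value is the number of objects satisfying a first-order formula in the current relational state. All names here (the `on`/`clear` predicates and the toy blocks-world state) are illustrative assumptions, not details from the paper.

```python
def count_satisfying(state, objects, formula):
    """Count the objects x for which formula(state, x) holds in this state."""
    return sum(1 for x in objects if formula(state, x))

# Toy relational state as a set of ground atoms (hypothetical blocks world).
state = {
    ("on", "a", "table"), ("on", "b", "a"), ("on", "c", "table"),
    ("clear", "b"), ("clear", "c"),
}
objects = {"a", "b", "c"}

# Formula: x is on the table AND clear, i.e. freely movable.
movable = lambda s, x: ("on", x, "table") in s and ("clear", x) in s

n = count_satisfying(state, objects, movable)
print(n)  # only "c" is both on the table and clear
```

A C-MLN-style model would attach learned weights to such counting features, so the value of a state or action can depend on *how many* objects satisfy a formula rather than merely whether any object does.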
(Joint work with Matthew Richardson.)