A central problem in grounded language acquisition is learning the correspondences between a rich set of events and the complex sentences that describe them. In this talk, I will introduce a novel approach to learning these correspondences under weak supervision derived from loose temporal alignments between events and sentences. The core idea is to exploit the underlying structure among correct, but latent, correspondences using a discriminative notion of similarity coupled with a ranking function. The algorithm reasons in terms of pairwise discriminative similarities and uses popularity metrics to learn the alignments between events and sentences, and even to discover groups of events, called macro-events, that best describe a sentence. I will present extensive evaluations on our new dataset of professional soccer commentaries. Furthermore, I will describe how this model can be applied under the general framework of Multiple Instance Learning.
Hannaneh Hajishirzi has recently joined the University of Washington as a research scientist. Prior to that, she spent a year as a postdoctoral research associate at Carnegie Mellon University and Disney Research. She received her PhD in 2011 from the Computer Science department at the University of Illinois at Urbana-Champaign. Her research interests are in Artificial Intelligence and Machine Learning, and their intersections with Natural Language Processing and Information Extraction. In particular, her current research focuses on semantic analysis of natural language texts and on designing automatic language-based interactive systems. Her prior research was on designing statistical relational frameworks to learn, control, and reason about complex dynamic domains such as commentaries, narratives, and web pages, with applications in narrative understanding, web monitoring, spam detection, and near-duplicate detection.