Mike Calcagno

Microsoft Natural Language Group

Extracting and exploiting type information in search-related scenarios

UW/Microsoft Symposium, 10/22/04

Basic search treats both the queries that users type and the documents they intend to retrieve as collections of strings, with a small amount of "syntactic" structure (e.g., proximity, word order) used to enhance relevance by favoring phrases over simple word matches in appropriate cases. I'll talk about how a similarly small amount of semantic information, namely recognizing strings in both the query and the document as being of a certain type, can be used to provide a different and richer search experience, with an emphasis on scenarios that go beyond the traditional results presentation seen in most search engines today. I'll also talk about ways to extract imperfect, yet usable, types from various unstructured information sources with minimal human intervention.

