2002 ARDA Northeast Regional Research Center Workshops

In the summer of 2002, the ARDA-funded Northeast Regional Research Center hosted three workshops, all pursued in the context of the AQUAINT program of ARDA's Information Exploitation thrust. This page provides an overview of the workshops and links to the reports and briefings they generated, including summaries of their results and any "products" that came out of their explorations.

Workshop on Reuse in Question Answering Systems
Leaders: Marc Light and Abraham Ittycheriah
Participants: Nancy McCracken, Andrew Latto

MPQA: Multi-Perspective Question Answering
Leader: Prof. Janyce Wiebe
Participants: Eric Breck, Chris Buckley, Claire Cardie, Paul Davis, Bruce Fraser, Diane Litman, David Pierce, Ellen Riloff, Theresa Wilson

TERQAS: Temporal Event Recognition for Question Answering Systems
Leader: Prof. James Pustejovsky
Participants: Luc Belanger, Jose Castano, David Day, Lisa Ferro, Robert Gaizauskas, Patrick Hanks, Bob Ingria, Graham Katz, Marcia Lazo, Dragomir Radev, Anna Rumshishky, Antonio Sanfilippo, Roser Sauri, Andrea Setzer, Ed Slavich, Beth Sundheim, Marc Verhagen

Workshop on Reuse in Question Answering Systems
Leaders: Marc Light and Abraham Ittycheriah
Participants: Nancy McCracken, Andrew Latto

An advanced Q&A system should be able to accumulate questions, answers, and other auxiliary information. This information could then be "reused" to enable the system to better answer future questions. In this way, a system could duplicate a human’s ability to gain knowledge and proficiency in an area as she or he answers questions. This workshop set out to: (1) Explicate: find questions and answers that capture the ways in which we would want an automated question answering system to reuse the answers from earlier questions in responding to later questions (focusing on a small number of domains, e.g., epidemiology, terrorism, and nuclear proliferation). (2) Categorize: find and classify different types of possible reuse. (3) Describe: write a short document discussing the categories, their characteristics, and their distributions.
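The reuse idea described above can be sketched in a few lines. This is only an illustration and not the workshop's method: the `QAStore` class, the word-overlap matching, the `min_overlap` threshold, and the sample questions and answers are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class QAStore:
    """Hypothetical store that accumulates answered questions for later reuse."""

    history: list = field(default_factory=list)  # (question, answer) pairs

    def add(self, question: str, answer: str) -> None:
        """Record a question together with the answer that was given."""
        self.history.append((question, answer))

    def related(self, question: str, min_overlap: int = 3) -> list:
        """Return earlier pairs whose question shares at least min_overlap words."""
        words = set(question.lower().rstrip("?").split())
        hits = []
        for old_question, old_answer in self.history:
            old_words = set(old_question.lower().rstrip("?").split())
            if len(words & old_words) >= min_overlap:
                hits.append((old_question, old_answer))
        return hits


# Invented entries from one of the example domains (epidemiology).
store = QAStore()
store.add("What are common symptoms of influenza?", "fever, cough, sore throat")
store.add("How is influenza transmitted?", "mainly by respiratory droplets")

reusable = store.related("Is fever among the symptoms of influenza?")
print(reusable)
```

Plain word overlap is a deliberately naive stand-in; the categories of reuse that the workshop set out to classify would call for richer matching than this.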

Browse or download the final briefing developed at the conclusion of the workshop.

View a paper derived from the efforts of this workshop.

If you are interested in obtaining any of the data generated in the course of this workshop, please contact David Day, day@mitre.org.

MPQA: Multi-Perspective Question Answering
Leader: Prof. Janyce Wiebe
Participants: Eric Breck, Chris Buckley, Claire Cardie, Paul Davis, Bruce Fraser, Diane Litman, David Pierce, Ellen Riloff, Theresa Wilson

A group of researchers and PhD students worked together to explore the area of Multi-Perspective Question Answering (MPQA). The accomplishments include a knowledge representation scheme to support manual annotation and analysis of data; a repository of linguistic clues relevant for perspective; a data corpus; a set of manually annotated data; an annotation system to support manual annotation; an application architecture; and the results of various types of evaluation.

The problem we addressed is finding and organizing expressions of opinions in the world press and other text. Our work builds toward the following tasks, which support the activities of professional information analysts:

  • Given a particular topic, event, or issue, find a range of opinions being expressed about it in the world press.
  • Once opinions have been found, cluster them and their sources in various ways. The source of an opinion or perspective is simply the person or group whose opinion or perspective it is. There are various attributes according to which opinions and their sources may be clustered, including:
      • The type of attitude that is expressed. For example, the source might be expressing a positive, negative, or uncertain attitude.
      • The basis for the opinion, such as supporting beliefs or experiences.
      • The expressive style of the sentences. The style might be sarcastic and vehement, for example, or neutral.

Once systems are developed to automate the above tasks, they may be applied to many topics and documents to build perspective profiles of various groups and sources and to observe how attitudes change over time.
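As a rough illustration of clustering opinions and their sources by the attributes above, the sketch below groups sources by attitude type. The `Opinion` record, its field names, the `cluster_by` helper, and the placeholder sources are assumptions made for this example; they are not the annotation scheme developed in the workshop.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Opinion:
    """Hypothetical record for one opinion found in a text."""

    source: str    # person or group whose opinion it is
    attitude: str  # e.g. "positive", "negative", "uncertain"
    basis: str     # e.g. "belief", "experience"
    style: str     # e.g. "neutral", "sarcastic", "vehement"


def cluster_by(opinions, attribute):
    """Group opinion sources by the value of one attribute."""
    clusters = defaultdict(list)
    for opinion in opinions:
        clusters[getattr(opinion, attribute)].append(opinion.source)
    return dict(clusters)


# Placeholder data, invented for illustration only.
opinions = [
    Opinion("Source A", "positive", "belief", "neutral"),
    Opinion("Source B", "negative", "experience", "vehement"),
    Opinion("Source C", "negative", "belief", "sarcastic"),
]

by_attitude = cluster_by(opinions, "attitude")
by_style = cluster_by(opinions, "style")
print(by_attitude)
```

Passing a different attribute name, such as "basis" or "style", regroups the same sources along another dimension.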
Browse or download the final briefing developed at the conclusion of the workshop.

View a final report on the work carried out in this workshop. (Download postscript version of report.)

View the annotation guidelines developed and used within the course of the workshop.

View an example news report of the kind annotated within this workshop. View an image of how the data was annotated using the GATE annotation tool. View an image depicting the high-level opinion annotations added to the example news report. View an image depicting the low-level opinion annotations added to the example news report.

If you are interested in obtaining any of the data generated in the course of this workshop, please contact David Day, day@mitre.org.

TERQAS: Temporal Event Recognition for Question Answering Systems
Leader: Prof. James Pustejovsky
Participants: Luc Belanger, Jose Castano, David Day, Lisa Ferro, Robert Gaizauskas, Patrick Hanks, Bob Ingria, Graham Katz, Marcia Lazo, Dragomir Radev, Anna Rumshishky, Antonio Sanfilippo, Roser Sauri, Andrea Setzer, Ed Slavich, Beth Sundheim, Marc Verhagen

From January 30, 2002, through July 22, 2002, a workshop funded through the Northeast Regional Research Center (NRRC) was held at MITRE Bedford and Brandeis University. The funding was fully sponsored by the Advanced Research and Development Activity (ARDA). This document reports its activities and accomplishments.

The purpose of this workshop was to address the problem of how to answer temporally based questions about the events and entities in text, specifically news articles. For example, questions such as those shown below are not currently supported by question answering systems.

  • Is Gates currently CEO of Microsoft?
  • When did Iraq finally pull out of Kuwait during the war in the 1990s?
  • Did the Enron merger with Dynegy take place?
What characterizes these questions as beyond the scope of current systems is the following: they refer, respectively, to the temporal aspects of the properties of the entities being questioned, the relative ordering of events in the world, and events that are mentioned in news articles but have never occurred. There has recently been a renewed interest in temporal and event-based reasoning in language and text, particularly as applied to information extraction and reasoning tasks (cf. Mani and Wilson, 2000; ACL Workshop on Spatial and Temporal Reasoning, 2001; Annotation Standards for Temporal Information in Natural Language, LREC 2002). Several papers from these workshops point to promising directions for time representation and identification (cf. Filatova and Hovy, 2001; Schilder and Habel, 2001; Setzer, 2002). Many issues relating to temporal and event identification have remained unresolved, however, and it was these issues that the workshop addressed. Specifically, the workshop goals were twofold: (a) to examine how to formally distinguish events and their temporal anchoring in language (text); and (b) to develop algorithms for ordering events in text relative to each other, and the operations for computing closure over an entire discourse of events.
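To make the notion of closure over event orderings concrete, here is a minimal sketch that derives every implied "before" relation from a set of stated ones. It assumes a single relation type and uses invented event labels; the function name `before_closure` is hypothetical, and this is not the algorithm developed in the workshop.

```python
def before_closure(pairs):
    """Return the transitive closure of a set of (earlier, later) event pairs."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        # If a is before b and b is before d, then a is before d.
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


# Orderings stated explicitly in a hypothetical text.
stated = {("invasion", "war"), ("war", "withdrawal")}

# Orderings that follow from the stated ones but were never stated.
inferred = before_closure(stated) - stated
print(inferred)
```

A fuller treatment would cover more relation types (overlap, inclusion, simultaneity) and would need to detect inconsistent orderings, both of which this sketch leaves out.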

Four basic problems in event-temporal identification were addressed in the workshop:

  • Time stamping of events (identifying an event and anchoring it in time);
  • Ordering events with respect to one another (lexical versus discourse properties of ordering);
  • Reasoning with contextually underspecified temporal expressions (temporal functions such as last week and two weeks before);
  • Reasoning about the persistence of events (how long does an event or the outcome of an event last).
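The third problem, contextually underspecified temporal expressions, can be illustrated by resolving the two example expressions against an anchor date. The sketch below is only that: the `resolve` function, the Monday-to-Sunday reading of "last week", and the choice of anchor date are assumptions made for the example, not definitions from the workshop.

```python
from datetime import date, timedelta


def resolve(expression, anchor):
    """Map a relative temporal expression to a (start, end) interval of dates.

    anchor is the date the expression is interpreted against, for example
    the publication date of a news article (an assumption of this sketch).
    """
    if expression == "last week":
        # Assumed reading: Monday through Sunday of the week before the anchor's week.
        this_monday = anchor - timedelta(days=anchor.weekday())
        start = this_monday - timedelta(weeks=1)
        return start, start + timedelta(days=6)
    if expression == "two weeks before":
        # Assumed reading: the single day exactly fourteen days before the anchor.
        day = anchor - timedelta(weeks=2)
        return day, day
    raise ValueError(f"unsupported expression: {expression}")


anchor_date = date(2002, 7, 22)  # hypothetical anchor, chosen for illustration
last_week = resolve("last week", anchor_date)
two_weeks_before = resolve("two weeks before", anchor_date)
print(last_week, two_weeks_before)
```

The same expression yields a different interval for every anchor, which is why such expressions cannot be interpreted without their context.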
Browse or download a summary briefing developed at the conclusion of the workshop.

View a final report on the work carried out in this workshop.

If you are interested in obtaining any of the data generated in the course of this workshop, please contact David Day, day@mitre.org.