2003 ARDA Workshop
Title: Preparing to Explore a New Paradigm in Information Access:
A Scenario Approach to Question-Answering
Leader: Elizabeth Liddy, Syracuse University
Workshop Team
Elizabeth D. Liddy, Professor, Syracuse University and Director of its Center for Natural Language Processing (CNLP) (Team Lead); Marc Light, Professor at University of Iowa; Nancy McCracken, Senior Researcher at CNLP; TBA Intel Analysts; TBA Subject Matter Experts
Problem
Analysts in the nation's intelligence community are reported to be adopting a new paradigm in their work, advancing from single-item searching to a scenario-based mode of investigation. That is, rather than looking for and reporting individual, unassociated facts, the analysts are interested in simultaneously exploring a fuller range of information facets with regard to a scenario involving individuals, organizations, and events. They wish to cast a wider net, with the questions forming that net, so that a fuller, richer range of information is retrieved, extracted, and stored in a scenario-like representation that includes vital background as well as a fuller picture of the current situation or scenario of interest.
Approach
Given the emerging nature of the scenario paradigm, this short workshop will focus on answering two key questions that need to be addressed before a fuller investigation can begin:
- What higher-level question-answering capabilities would this paradigm require?
- Would the substantial textual resources that the Center for Non-Proliferation Studies has gathered over time on complex, multi-faceted international issues, be an appropriate collection to support the new scenario-based question-answering paradigm?
Exploring each of these questions in more detail:
- While there was significant early research in Question Answering (QA) in the fields of logic and linguistics (Belnap, 1963; Belnap & Steel, 1976), automatic QA first became the focus of a large-scale evaluation framework in the TREC Conferences, beginning with TREC-8 in 1999 (Voorhees & Tice, 1999). The paradigm established in TREC-8 and continued in the next two TREC Conference QA tracks consists of simple, fact-based, short-answer questions. Initially, answer strings were limited to either 50 or 250 bytes. Under current TREC QA guidelines, only fragments which contain the minimal answer to the question are deemed correct. Any explanatory text, even if within the 50-byte limit, causes the answer to be scored as incorrect.
While the existing TREC QA evaluation approach has utilized very simple, short questions and has focused on a narrow definition of a useful answer, some QA system builders have begun to call for an evaluation paradigm that considers dimensions beyond short, fact-based QA (Breck et al., 2000; Liddy, 2002).
In fact, a few research and development efforts already in place deal with question types that go beyond simple factoid questions. This is accomplished either by more complex natural language processing of both sources and queries, which enables a QA system to respond with fuller information to more complex query types such as 'how' and 'why' (Liddy, 2002), or by techniques such as QA reuse, which accumulates questions, answers, and question-answering processes over time so that they can be reused to help the system better understand and answer future questions (Light et al., Submitted).
However, even these efforts need to be further developed in order to produce the advanced capabilities needed for scenario-based intelligence analysis. But what are the capabilities that will be needed? One goal of this workshop will be to involve individuals with experience in QA systems in an active brainstorming session, held once the new paradigm has been described, in order to produce a first description of the expected requirements.
- The Center for Non-Proliferation Studies (CNS) has amassed a collection of documents on complex, multi-faceted international issues related to Weapons of Mass Destruction (WMD), which they have already provided to ARDA and for which they have offered to provide Subject Matter Experts (SMEs) to assist in analysis and understanding. The CNS databases contain information compiled from hundreds of source publications, including United Nations documents, trade journals, government and defense publications, periodicals and electronic news sources, academic journals, U.S. congressional testimony, conference proceedings, book chapters, correspondence from international advisors, unpublished papers, and Internet sources. These databases contain continually updated information about the global proliferation of weapons of mass destruction (nuclear, chemical, and biological) and their delivery systems (http://cns.miis.edu). This collection may well be an ideal corpus in which to investigate scenario-based analysis. CNS has also offered their expertise to assist in the development of possible scenarios. The second goal of the workshop will be to determine the viability of these resources as the corpus on which to build and test new scenario-based QA capabilities.
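To make the answer-length constraint from the TREC QA discussion above concrete, the following is a minimal sketch in Python of a byte-limit check. The function name `fits_byte_limit` and the choice of UTF-8 encoding are assumptions made for illustration and are not drawn from the TREC guidelines. The check covers only the length of an answer string, not whether the answer would be judged correct.

```python
def fits_byte_limit(answer: str, limit: int = 50, encoding: str = "utf-8") -> bool:
    """Return True if the encoded answer string occupies at most `limit` bytes.

    Hypothetical helper: the 50- and 250-byte limits come from the text,
    while the UTF-8 encoding is an assumption made for this sketch.
    """
    return len(answer.encode(encoding)) <= limit


if __name__ == "__main__":
    candidates = [
        "Gaithersburg, MD",  # a short, factoid-style answer string
        "x" * 51,            # one byte over the 50-byte limit
    ]
    for text in candidates:
        # Report the result for both limits mentioned in the text.
        print(len(text.encode("utf-8")),
              fits_byte_limit(text, 50),
              fits_byte_limit(text, 250))
```

Counting bytes rather than characters matters for non-ASCII text, where a single character may occupy more than one byte under the assumed encoding.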
Given the serendipitous co-occurrence of the newly evolving analysis paradigm with this exceptional open-source collection and expertise in a domain of keen interest, this workshop will perform a preliminary investigation of what new types of QA would be both useful and possible, and whether the CNS corpus is appropriate. While this is a preliminary foray into the topic, it is believed that it can provide the necessary first-stage understanding that may, in fact, enable a fuller investigation to be undertaken within a future ARDA program.
The workshop will bring together researchers with expertise and interest in the topic of advanced QA and SMEs from the intelligence community and CNS. They will utilize their combined capabilities to investigate and report back on:
- The new paradigm from the intelligence analysts' perspective,
- The more advanced capabilities being developed beyond simple factoid QA,
- The nature of the CNS collection,
- The scenarios proposed by the SMEs,
- The range of QA capabilities needed to make scenario-based QA a reality.
Evaluation
Given the preliminary nature of this investigation, qualitative evaluation is most appropriate. It is expected that the workshop will be able to report on the potential for this new, more advanced scenario-based QA to become a reality for the intelligence analysis process. The workshop will present its findings to the relevant communities and elicit their insight and qualitative evaluation.
Key Tasks / Milestones:
- Understand the new scenario-based paradigm with the guidance of an intelligence analyst.
- Develop scenarios for the WMD domain in collaboration with a CNS expert.
- Brainstorm the nature and range of questions that will be useful within the new paradigm, including the re-use taxonomy developed in last year's workshop.
- Prepare the CNS collection of WMD documents for interrogation by these query types.
- Conduct preliminary combined human and computer interrogation of the CNS collection based on the SME's scenario-based paradigm and the QA experts' new question types.
- Have the intelligence analyst and SME evaluate the potential utility of the results.
- Prepare a detailed report of the potential for further, in-depth investigation of scenario-based QA on the CNS collection for a future program-level investigation.
Impact
The major outcome of this short workshop will be a detailed investigation, comparable to seedling-level DARPA efforts, which should provide insight into the potential for a new, full-scale ARDA program to produce an advanced QA capability.
References
Belnap, N. D. (1963). An analysis of questions: Preliminary report. Scientific Report TM-1287. Santa Monica, CA.
Belnap, N. D. & Steel, T. B. (1976). The logic of questions and answers. New Haven, CT: Yale University Press.
Breck, E.J., Burger, J.D., Ferro, L., Hirschman, L., House, D., Light, M. & Mani, I. (2000). How to evaluate your question answering system every day…and still get real work done. Proceedings of Language Resources and Evaluation (LREC Conference Proceedings).
Chen, J., Diekema, A., Yilmazel, O. & Liddy, E.D. (2002). Automatic Question-Answering in the Aerospace Domain. Virtual Reference Desk Conference. Chicago, IL.
Liddy, E.D. (2002). Why are People Asking these Questions?: A Call for Bringing Situation into Question-Answering System Evaluation. LREC Workshop Proceedings on Question Answering – Strategy and Resources. Grand Canary Island, Spain.
Light, M., Ittycheriah, A., Latto, A. & McCracken, N. (Submitted). Reuse in Question Answering: A Preliminary Study. Proceedings of AAAI Workshop on Question Answering. Stanford, CA.
Voorhees, E. & Tice, D. (1999). The TREC-8 question answering track evaluation. In Voorhees, E. and Harman, D. Proceedings of the Eighth Text Retrieval Conference. Gaithersburg, MD: NIST Special Publications.