NAACL 2015
TechTalks from event: NAACL 2015
6B: Discourse and Coreference
Encoding World Knowledge in the Evaluation of Local Coherence
Previous work on text coherence was primarily based on matching multiple mentions of the same entity in different parts of the text; therefore, it misses the contribution from semantically related but not necessarily coreferential entities (e.g., Gates and Microsoft). In this paper, we capture such semantic relatedness by leveraging world knowledge (e.g., Gates is the person who created Microsoft), and use two existing evaluation frameworks. First, in the unsupervised framework, we introduce semantic relatedness as an enrichment to the original graph-based model of Guinaudeau and Strube (2013). In addition, we incorporate semantic relatedness as additional features into the popular entity-based model of Barzilay and Lapata (2008). Across both frameworks, our enriched model with semantic relatedness outperforms the original methods, especially on short documents.
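To make the graph-based framing concrete, here is a minimal sketch of how local coherence can be scored from entity connections between sentences, with a hypothetical relatedness() scorer standing in for the knowledge-based enrichment the abstract describes. This is not the authors' implementation; the function names, threshold, and scoring are assumptions.

```python
# Minimal sketch, not the authors' code: coherence as the density of
# entity-based links between sentences (in the spirit of Guinaudeau and
# Strube, 2013), where a hypothetical relatedness() scorer could let
# "Gates" connect to "Microsoft" without coreference.
from itertools import combinations


def relatedness(e1, e2):
    """Placeholder scorer in [0, 1]; a knowledge base would back this.
    As written it reduces to the coreference-only baseline."""
    return 1.0 if e1 == e2 else 0.0


def coherence_score(sentence_entities, threshold=0.5):
    """sentence_entities: one set of entity strings per sentence.
    Sum relatedness-weighted links across sentence pairs, normalised
    by the number of sentences."""
    n = len(sentence_entities)
    if n < 2:
        return 0.0
    total = 0.0
    for i, j in combinations(range(n), 2):
        for e1 in sentence_entities[i]:
            for e2 in sentence_entities[j]:
                r = relatedness(e1, e2)
                if r >= threshold:
                    total += r
    return total / n


# Toy document: sentences 1 and 3 share only "Gates"; a knowledge-aware
# relatedness() would additionally link Gates to Microsoft.
doc = [{"Gates", "Microsoft"}, {"Microsoft", "Windows"}, {"Gates"}]
print(coherence_score(doc))
```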
Chinese Event Coreference Resolution: An Unsupervised Probabilistic Model Rivaling Supervised Resolvers
Recent work has successfully leveraged the semantic information extracted from lexical knowledge bases such as WordNet and FrameNet to improve English event coreference resolvers. The lack of comparable resources in other languages, however, has made the design of high-performance non-English event coreference resolvers, particularly those employing unsupervised models, very difficult. We propose a generative model for the under-studied task of Chinese event coreference resolution that rivals its supervised counterparts in performance when evaluated on the ACE 2005 corpus.
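The abstract does not spell out the model, so the following is only a generic illustration of how unsupervised event coreference can be framed as generative clustering: each mention joins the chain most likely to have generated its features, or opens a new chain. The feature set, smoothing, and threshold are placeholders, not the paper's model.

```python
# Toy sketch of unsupervised event coreference as generative clustering.
# NOT the paper's model: features, smoothing, and the new-chain threshold
# are illustrative assumptions.
from collections import Counter


def generation_prob(mention, chain, alpha=0.5):
    """Smoothed probability that an existing chain 'generates' the
    mention's features (trigger lemma, argument heads, ...)."""
    counts = Counter(f for m in chain for f in m)
    total = sum(counts.values())
    vocab = len(counts) + 1  # crude open-vocabulary smoothing
    p = 1.0
    for f in mention:
        p *= (counts[f] + alpha) / (total + alpha * vocab)
    return p


def cluster(mentions, new_chain_threshold=0.01):
    """Greedy left-to-right assignment: join the best-scoring chain,
    or open a new chain when no existing one is probable enough."""
    chains = []
    for m in mentions:
        best_p, best_chain = 0.0, None
        for c in chains:
            p = generation_prob(m, c)
            if p > best_p:
                best_p, best_chain = p, c
        if best_chain is not None and best_p > new_chain_threshold:
            best_chain.append(m)
        else:
            chains.append([m])
    return chains


# Each mention is a set of feature strings extracted upstream.
mentions = [
    {"trigger=袭击", "arg=城市"},
    {"trigger=袭击", "arg=城市", "time=周一"},
    {"trigger=会晤", "arg=领导人"},
]
print([len(c) for c in cluster(mentions)])  # -> [2, 1]
```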
Removing the Training Wheels: A Coreference Dataset that Entertains Humans and Challenges Computers
Coreference is a core NLP problem. However, newswire data, the primary source of existing coreference data, lack the richness necessary to truly solve coreference. We present a new domain with denser references---quiz bowl questions---that is challenging and enjoyable to humans, and we use the quiz bowl community to develop a new coreference dataset, together with an annotation framework that can tag any text data with coreferences and named entities. We also successfully integrate active learning into this annotation pipeline to collect documents maximally useful to coreference models. State-of-the-art coreference systems underperform a simple classifier on our new dataset, motivating non-newswire data for future coreference research.
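The active-learning component is described only at a high level; the sketch below shows one common shape such a document-selection step can take (uncertainty sampling over unlabeled documents), with the pairwise coreference model passed in as an assumed callable. It is not the paper's pipeline.

```python
# Generic uncertainty-sampling sketch for choosing which documents to
# annotate next; the predict_proba callable and document format are
# assumptions, not the paper's actual pipeline.
import math


def doc_uncertainty(mention_pairs, predict_proba):
    """Average binary entropy of the model's coreference predictions
    over a document's mention pairs (higher = less certain)."""
    if not mention_pairs:
        return 0.0
    total = 0.0
    for pair in mention_pairs:
        p = min(max(predict_proba(pair), 1e-9), 1.0 - 1e-9)
        total += -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))
    return total / len(mention_pairs)


def select_batch(unlabeled_docs, predict_proba, k=10):
    """unlabeled_docs: {doc_id: list_of_mention_pairs}. Return the k
    documents the current model is least certain about."""
    ranked = sorted(
        unlabeled_docs.items(),
        key=lambda item: doc_uncertainty(item[1], predict_proba),
        reverse=True,
    )
    return [doc_id for doc_id, _ in ranked[:k]]


# Example with a dummy model that is maximally unsure about everything:
docs = {"q1": [("he", "Gates"), ("it", "Microsoft")], "q2": []}
print(select_batch(docs, lambda pair: 0.5, k=1))  # -> ['q1']
```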