TechTalks from event: ACL-IJCNLP 2015

session 3A Language Resources

  • a new corpus and imitation learning framework for context-dependent semantic parsing Authors: Andreas Vlachos and Stephen Clark
    Semantic parsing is the task of translating natural language utterances into a machine-interpretable meaning representation. Most approaches to this task have been evaluated on a small number of existing corpora which assume that all utterances must be interpreted according to a database and typically ignore context. In this paper we present a new, publicly available corpus for context-dependent semantic parsing. The MRL used for the annotation was designed to support a portable, interactive tourist information system. We develop a semantic parser for this corpus by adapting the imitation learning algorithm DAGGER without requiring alignment information during training. DAGGER improves upon independently trained classifiers by 9.0 and 4.8 points in F-score on the development and test sets respectively.
  • it depends dependency parser comparison using a web-based evaluation tool Authors: Jinho D. Choi,
    The last few years have seen a surge in the number of accurate, fast, publicly available dependency parsers. At the same time, the use of dependency parsing in NLP applications has increased. It can be difficult for a non-expert to select a good ``off-the-shelf'' parser. We present a comparative analysis of nine leading statistical dependency parsers on a multi-genre corpus of English. For our analysis, we developed a new web-based tool that gives a convenient way of comparing dependency parser outputs. Our analysis will help practitioners choose a parser to optimize their desired speed/accuracy tradeoff, and our tool will help practitioners examine and compare parser output.
  • generating high quality proposition banks for multilingual semantic role labeling Authors: Alan Akbik,
    Semantic role labeling (SRL) is crucial to natural language understanding as it identifies the predicate-argument structure in text with semantic labels. Unfortunately, resources required to construct SRL models are expensive to obtain and simply do not exist for most languages. In this paper, we present a two-stage method to enable the construction of SRL models for resource-poor languages by exploiting monolingual SRL and multilingual parallel data. Experimental results show that our method outperforms existing methods. We use our method to generate Proposition Banks with high to reasonable quality for 7 languages in three language families and release these resources to the research community.