TechTalks from event: NAACL 2015

1A: Semantics

  • Unsupervised Induction of Semantic Roles within a Reconstruction-Error Minimization Framework Authors: Ivan Titov and Ehsan Khoddam
    We introduce a new approach to unsupervised estimation of feature-rich semantic role labeling models. Our model consists of two components: (1) an encoding component: a semantic role labeling model which predicts roles given a rich set of syntactic and lexical features; (2) a reconstruction component: a tensor factorization model which relies on roles to predict argument fillers. When the components are estimated jointly to minimize errors in argument reconstruction, the induced roles largely correspond to roles defined in annotated resources. Our method performs on par with the most accurate role induction methods on English and German, even though, unlike these previous approaches, we do not incorporate any prior linguistic knowledge about the languages. (A minimal numeric sketch of this encode/reconstruct loop follows this session's list.)
  • Predicate Argument Alignment using a Global Coherence Model Authors: Travis Wolfe, Mark Dredze, Benjamin Van Durme
    We present a joint model for predicate argument alignment. We leverage multiple sources of semantic information, including temporal ordering constraints between events. These are combined in a max-margin framework to find a globally consistent view of entities and events across multiple documents, which leads to improvements over the state of the art.
  • Improving unsupervised vector-space thematic fit evaluation via role-filler prototype clustering Authors: Clayton Greenberg, Asad Sayeed, Vera Demberg
    Most recent unsupervised methods in vector space semantics for assessing thematic fit (e.g. Erk, 2007; Baroni and Lenci, 2010; Sayeed and Demberg, 2014) create prototypical role-fillers without performing word sense disambiguation. This leads to a kind of sparsity problem: candidate role-fillers for different senses of the verb end up being measured by the same yardstick, the single prototypical role-filler. (A toy version of the clustering idea in the title follows this session's list.)
  • A Compositional and Interpretable Semantic Space Authors: Alona Fyshe, Leila Wehbe, Partha P. Talukdar, Brian Murphy, Tom M. Mitchell
    Vector Space Models (VSMs) of Semantics are useful tools for exploring the semantics of single words, and the composition of words to make phrasal meaning. While many methods can estimate the meaning (i.e. vector) of a phrase, few do so in an interpretable way. We introduce a new method (CNNSE) that allows word and phrase vectors to adapt to the notion of composition. Our method learns a VSM that is both tailored to support a chosen semantic composition operation, and whose resulting features have an intuitive interpretation. Interpretability allows for the exploration of phrasal semantics, which we leverage to analyze performance on a behavioral task. (The shape of the objective is sketched after this session's list.)
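
A minimal numeric sketch of the encode/reconstruct loop behind Titov and Khoddam's method, under toy assumptions: a softmax encoder maps an argument's feature vector to a posterior over K latent roles, and a reconstruction layer predicts the argument filler from that posterior, with both trained jointly to minimize reconstruction error. The paper's reconstruction component is a tensor factorization; it is simplified to a single matrix here, and all names and dimensions are illustrative.

    import numpy as np

    rng = np.random.default_rng(0)
    K, F, V = 5, 40, 100        # latent roles, argument features, filler vocabulary
    W = rng.normal(scale=0.1, size=(K, F))   # encoder: features -> role scores
    U = rng.normal(scale=0.1, size=(K, V))   # reconstruction: roles -> filler scores

    def softmax(z):
        e = np.exp(z - z.max())
        return e / e.sum()

    def sgd_step(x, filler, lr=0.1):
        """Encode features x into a role posterior q, reconstruct the filler
        from q, and take one gradient step on the reconstruction loss."""
        global W, U
        q = softmax(W @ x)             # role posterior given rich features
        p = softmax(q @ U)             # filler distribution given roles
        loss = -np.log(p[filler])      # reconstruction error for this argument
        dlogits = p.copy(); dlogits[filler] -= 1.0
        dU = np.outer(q, dlogits)
        dq = U @ dlogits
        dscores = q * (dq - q @ dq)    # softmax Jacobian-vector product
        W -= lr * np.outer(dscores, x); U -= lr * dU
        return loss

    # after training on (features, filler) pairs, the induced role of an
    # argument is np.argmax(W @ x); no role annotations are ever used
    x, filler = rng.normal(size=F), 7
    print(sgd_step(x, filler), sgd_step(x, filler))   # loss drops on repeat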
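
In the same vein, a toy version of the prototype-clustering idea from Greenberg, Sayeed and Demberg: rather than comparing a candidate role-filler to a single prototype averaged over all of a verb's senses, cluster the attested fillers (roughly separating senses) and take similarity to the nearest centroid. The random stand-in vectors, vocabulary and cluster count are assumptions for illustration.

    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.metrics.pairwise import cosine_similarity

    rng = np.random.default_rng(1)
    # stand-in distributional vectors for attested objects of "serve"
    # (two senses: food vs. tennis)
    attested = {w: rng.normal(size=50) for w in
                ["dinner", "soup", "breakfast", "ball", "ace", "volley"]}

    def clustered_fit(candidate, fillers, n_clusters=2):
        """Thematic fit = cosine to the NEAREST centroid of the attested
        fillers, instead of to one prototype averaged over all senses."""
        centroids = KMeans(n_clusters=n_clusters, n_init=10,
                           random_state=0).fit(np.stack(fillers)).cluster_centers_
        return cosine_similarity(candidate[None, :], centroids).max()

    prototype = np.stack(list(attested.values())).mean(axis=0)
    cand = attested["soup"] + rng.normal(scale=0.1, size=50)  # food-like candidate
    print("single prototype:",
          cosine_similarity(cand[None, :], prototype[None, :])[0, 0])
    print("nearest cluster: ", clustered_fit(cand, list(attested.values())))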
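
And the shape of the CNNSE objective from Fyshe et al. can be written down in a few lines: factor co-occurrence statistics X into interpretable loadings A and a dictionary D, while penalizing phrase rows of A that stray from a composition of their constituent words' rows. Weighted addition is used as the composition function here, and the sparsity and non-negativity constraints of the actual method are omitted; this is a hedged sketch of the loss, not the authors' implementation.

    import numpy as np

    def cnnse_style_loss(X, A, D, phrases, lam=1.0, alpha=0.5, beta=0.5):
        """X: (n_items, n_contexts) co-occurrence rows for words AND phrases;
        A: (n_items, k) interpretable loadings; D: (k, n_contexts) dictionary;
        phrases: iterable of (phrase_row, word1_row, word2_row) index triples.
        Reconstruction error plus a penalty tying each phrase's loadings to
        a weighted sum of its constituents' loadings (weighted addition)."""
        recon = np.sum((X - A @ D) ** 2)
        comp = sum(np.sum((A[p] - (alpha * A[w1] + beta * A[w2])) ** 2)
                   for p, w1, w2 in phrases)
        return recon + lam * comp

    # usage with random stand-ins: rows 0 and 1 are words, row 2 their phrase
    rng = np.random.default_rng(2)
    X, A, D = (rng.normal(size=(3, 20)), rng.normal(size=(3, 4)),
               rng.normal(size=(4, 20)))
    print(cnnse_style_loss(X, A, D, phrases=[(2, 0, 1)]))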

1B: Tagging, Chunking, Syntax and Parsing

  • Exploring Compositional Architectures and Word Vector Representations for Prepositional Phrase Attachment Authors: Yonatan Belinkov, Tao Lei, Regina Barzilay, Amir Globerson
    Prepositional phrase (PP) attachment disambiguation is a known challenge in syntactic parsing. The lexical sparsity associated with PP attachments motivates research in word representations that can capture pertinent syntactic and semantic features of the word. One promising solution is to use word vectors induced from large amounts of raw text. However, state-of-the-art systems that employ such representations yield modest gains in PP attachment accuracy. In this paper, we show that word vector representations can yield significant PP attachment performance gains. This is achieved via a non-linear architecture that is discriminatively trained to maximize PP attachment accuracy. The architecture is initialized with word vectors trained from unlabeled data, and relearns those vectors to maximize attachment accuracy. We obtain additional performance gains with alternative representations such as dependency-based word vectors. When tested on both English and Arabic datasets, our method outperforms both a strong SVM classifier and state-of-the-art parsers. For instance, we achieve 82.6% PP attachment accuracy on Arabic, while the Turbo and Charniak self-trained parsers obtain 76.7% and 80.8% respectively. (A schematic scorer of this kind is sketched after this session's list.)
  • An Incremental Algorithm for Transition-based CCG Parsing Authors: Bharat Ram Ambati, Tejaswini Deoskar, Mark Johnson, Mark Steedman
    Incremental parsers have potential advantages for applications like language modeling for machine translation and speech recognition. We describe a new algorithm for incremental transition-based Combinatory Categorial Grammar parsing. As English CCGbank derivations are mostly right branching and non-incremental, we design our algorithm based on the dependencies resolved rather than the derivation. We introduce two new actions in the shift-reduce paradigm based on the idea of revealing (Pareschi and Steedman, 1987) the required information during parsing. On the standard CCGbank test data, our algorithm achieves improvements of 0.88% in labeled and 2.0% in unlabeled F-score over a greedy non-incremental shift-reduce parser.
  • Because Syntax Does Matter: Improving Predicate-Argument Structures Parsing with Syntactic Features Authors: Corentin Ribeyre, Éric Villemonte de la Clergerie, Djamé Seddah
    Parsing full-fledged predicate-argument structures in a deep syntax framework requires graphs to be predicted. Using the DeepBank (Flickinger et al., 2012) and the Predicate-Argument Structure treebank (Miyao and Tsujii, 2005) as a test field, we show how transition-based parsers, extended to handle connected graphs, benefit from the use of topologically different syntactic features such as dependencies, tree fragments, spines or syntactic paths, bringing much-needed context to the parsing models and improving notably on long-distance dependencies and elided coordinate structures. By confirming this positive impact on an accurate 2nd-order graph-based parser (Martins and Almeida, 2014), we establish a new state of the art on these data sets.
  • Randomized Greedy Inference for Joint Segmentation, POS Tagging and Dependency Parsing Authors: Yuan Zhang, Chengtao Li, Regina Barzilay, Kareem Darwish
    In this paper, we introduce a new approach for joint segmentation, POS tagging and dependency parsing. While joint modeling of these tasks addresses the issue of error propagation inherent in traditional pipeline architectures, it also complicates the inference task. Past research has addressed this challenge by placing constraints on the scoring function. In contrast, we propose an approach that can handle arbitrarily complex scoring functions. Specifically, we employ a randomized greedy algorithm that jointly predicts segmentations, POS tags and dependency trees. Moreover, this architecture readily handles different segmentation tasks, such as morphological segmentation for Arabic and word segmentation for Chinese. The joint model outperforms the state-of-the-art systems on three datasets, obtaining a 2.1% absolute TedEval gain over the best published results in the 2013 SPMRL shared task. (The randomized greedy search is sketched after this session's list.)
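
A minimal sketch of the kind of discriminatively trained non-linear scorer Belinkov et al. describe: each candidate head for a preposition is scored by a small feed-forward network over the concatenated (head, preposition, object) word vectors, and the argmax wins. The one-hidden-layer net and all dimensions are assumptions, not the authors' exact architecture.

    import numpy as np

    rng = np.random.default_rng(0)
    d, h = 50, 64                                 # word-vector / hidden sizes (assumed)
    W1 = rng.normal(scale=0.1, size=(h, 3 * d))   # input: [head; prep; object]
    b1 = np.zeros(h)
    w2 = rng.normal(scale=0.1, size=h)            # hidden -> scalar score

    def score(head, prep, obj):
        """Non-linear attachment score for one candidate head."""
        x = np.concatenate([head, prep, obj])
        return w2 @ np.tanh(W1 @ x + b1)

    def predict_attachment(candidate_heads, prep, obj):
        """Index of the highest-scoring candidate head for this PP."""
        return int(np.argmax([score(hv, prep, obj) for hv in candidate_heads]))

    # training (omitted) would back-propagate a discriminative loss through
    # the net AND the input word vectors, "relearning" vectors first induced
    # from unlabeled text, as the abstract describes
    heads = [rng.normal(size=d) for _ in range(4)]
    print(predict_attachment(heads, rng.normal(size=d), rng.normal(size=d)))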
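
And a schematic of the randomized greedy inference in Zhang et al.: hill-climb from a random joint structure, always taking the best single-change neighbor, and keep the best result across several restarts. Because the search only ever calls score(), the scoring function can be arbitrarily complex; `initial` and `neighbors` here are placeholders for sampling and locally perturbing joint (segmentation, tags, tree) structures.

    import random

    def randomized_greedy(initial, neighbors, score, restarts=10, seed=0):
        """Hill-climbing with random restarts over joint structures.
        initial(rng) samples a random structure; neighbors(s) yields
        single-change variants; score(s) may be arbitrarily complex."""
        rng = random.Random(seed)
        best, best_score = None, float("-inf")
        for _ in range(restarts):
            current = initial(rng)
            current_score = score(current)
            while True:
                cand = max(neighbors(current), key=score, default=None)
                if cand is None or score(cand) <= current_score:
                    break                      # local optimum reached
                current, current_score = cand, score(cand)
            if current_score > best_score:
                best, best_score = current, current_score
        return best

    # toy usage: climb to the integer maximizing -(s - 7)**2
    print(randomized_greedy(initial=lambda rng: rng.randrange(21),
                            neighbors=lambda s: [s - 1, s + 1],
                            score=lambda s: -(s - 7) ** 2))   # -> 7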

1C: Information Retrieval, Text Categorization, Topic Modeling

  • A Hybrid Generative/Discriminative Approach To Citation Prediction Authors: Chris Tanner and Eugene Charniak
    Text documents of varying nature (e.g., summary documents written by analysts or published scientific papers) often cite others as a means of providing evidence to support a claim, attributing credit, or referring the reader to related work. We address the problem of predicting a document's cited sources by introducing a novel, discriminative approach which combines a content-based generative model (LDA) with author-based features. Further, our classifier is able to learn the importance and quality of each topic within our corpus -- which can be useful beyond this task -- and preliminary results suggest its metric is competitive with other standard metrics (Topic Coherence). Our flagship system, Logit-Expanded, provides state-of-the-art performance on the largest corpus ever used for this task. (A compact sketch of this hybrid setup follows this list.)
  • Weakly Supervised Slot Tagging with Partially Labeled Sequences from Web Search Click Logs Authors: Young-Bum Kim, Minwoo Jeong, Karl Stratos, Ruhi Sarikaya
    In this paper, we apply a weakly supervised learning approach for slot tagging using conditional random fields by exploiting web search click logs. We extend the constrained lattice training of Täckström et al. (2013) to non-linear conditional random fields in which latent variables mediate between observations and labels. When combined with a novel initialization scheme that leverages unlabeled data, we show that our method gives significant improvement over strong supervised and weakly supervised baselines. (The constrained-lattice objective is sketched after this list.)
  • Not All Character N-grams Are Created Equal: A Study in Authorship Attribution Authors: Upendra Sapkota, Steven Bethard, Manuel Montes, Thamar Solorio
    Character n-grams have been identified as the most successful feature in both single-domain and cross-domain Authorship Attribution (AA), but the reasons for their discriminative value are not fully understood. We identify subgroups of character n-grams that correspond to linguistic aspects commonly claimed to be covered by these features: morpho-syntax, thematic content and style. We evaluate the predictiveness of each of these groups in two AA settings: a single-domain setting and a cross-domain setting where multiple topics are present. We demonstrate that character n-grams that capture information about affixes and punctuation account for almost all of the power of character n-grams as features. Our study contributes new insights into the use of n-grams for future AA work and other classification tasks. (A toy extractor for such typed n-grams follows this list.)
  • Effective Use of Word Order for Text Categorization with Convolutional Neural Networks Authors: Rie Johnson and Tong Zhang
    A convolutional neural network (CNN) is a neural network that can make use of the internal structure of data, such as the 2D structure of image data. This paper studies CNNs for text categorization, exploiting the 1D structure (namely, word order) of text data for accurate prediction. Instead of using low-dimensional word vectors as input, as is often done, we directly apply the CNN to high-dimensional text data, which leads to directly learning embeddings of small text regions for use in classification. In addition to a straightforward adaptation of CNNs from image to text, a simple but new variation which employs bag-of-word conversion in the convolution layer is proposed. An extension that combines multiple convolution layers is also explored for higher accuracy. The experiments demonstrate the effectiveness of our approach in comparison with state-of-the-art methods. (The region-vector construction is sketched after this list.)
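
A compact sketch of the hybrid setup in Tanner and Charniak's paper: fit LDA to get per-document topic proportions (the generative part), then feed topic similarity alongside author-based features into a discriminative classifier over (document, candidate source) pairs. The tiny corpus, the two-feature representation and the sklearn components are illustrative assumptions, not the authors' pipeline.

    from sklearn.decomposition import LatentDirichletAllocation
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.linear_model import LogisticRegression

    docs = ["neural parsing with rich lexical features",
            "topic models for analyzing scholarly citation networks",
            "lexical features improve citation recommendation"]
    X = CountVectorizer().fit_transform(docs)
    theta = LatentDirichletAllocation(n_components=3,
                                      random_state=0).fit_transform(X)

    def pair_features(i, j, author_overlap):
        """Generative signal (topic similarity) + an author-based signal."""
        return [float(theta[i] @ theta[j]), author_overlap]

    # discriminative layer: label = 1 iff document i actually cites source j
    feats = [pair_features(0, 1, 0.0), pair_features(2, 1, 1.0)]
    clf = LogisticRegression().fit(feats, [0, 1])
    print(clf.predict_proba([pair_features(0, 2, 1.0)]))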
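
A sketch of the constrained-lattice objective behind Kim et al.'s training, shown here for an ordinary linear-chain CRF (the paper extends it to non-linear CRFs with latent variables): a partially labeled sequence fixes the label set at some positions and leaves it open at others, and training maximizes the log-probability mass of all label paths consistent with those constraints.

    import numpy as np
    from scipy.special import logsumexp

    def constrained_log_Z(emissions, transitions, allowed):
        """Log-sum over all label paths with path[t] in allowed[t].
        emissions: (T, L) log-potentials; transitions: (L, L) log-potentials.
        The partially-supervised objective for one sequence is
        constrained_log_Z(allowed) - constrained_log_Z(all labels), i.e. the
        log marginal probability of the constraint-consistent labelings."""
        T, L = emissions.shape
        alpha = np.full(L, -np.inf)
        alpha[list(allowed[0])] = emissions[0, list(allowed[0])]
        for t in range(1, T):
            new = np.full(L, -np.inf)
            for y in allowed[t]:
                new[y] = emissions[t, y] + logsumexp(alpha + transitions[:, y])
            alpha = new
        return logsumexp(alpha)

    # tokens with click-log labels get singleton sets; the rest get all labels
    T, L = 4, 3
    rng = np.random.default_rng(3)
    em, tr = rng.normal(size=(T, L)), rng.normal(size=(L, L))
    allowed = [{2}, set(range(L)), {0}, set(range(L))]
    print(constrained_log_Z(em, tr, allowed)
          - constrained_log_Z(em, tr, [set(range(L))] * T))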
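
A small sketch of the "typed" character n-gram distinction studied by Sapkota et al.: the same 3-gram counts differently depending on whether it sits at a word's start (affix-like prefix), end (affix-like suffix), middle, or spans punctuation. The four coarse category names are illustrative; the paper defines a finer-grained typology.

    import re
    from collections import Counter

    def typed_char_ngrams(text, n=3):
        """Count character n-grams tagged by coarse position/content type."""
        counts = Counter()
        for token in text.split():
            if len(token) < n:
                continue
            for i in range(len(token) - n + 1):
                gram = token[i:i + n]
                if re.search(r"[^\w]", gram):
                    kind = "punct"              # contains punctuation
                elif i == 0:
                    kind = "prefix"             # word-initial: affix-like
                elif i == len(token) - n:
                    kind = "suffix"             # word-final: affix-like
                else:
                    kind = "mid"
                counts[(kind, gram)] += 1
        return counts

    print(typed_char_ngrams("Don't judge a book by its cover.").most_common(5))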
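
Lastly, the core data representation behind Johnson and Zhang's CNN: instead of dense word embeddings, each convolution region is built directly from one-hot word vectors, concatenated to preserve order ("seq") or summed within the region for the bag-of-words variant ("bow"). A sketch of building those region vectors, with the vocabulary and region size assumed:

    import numpy as np

    vocab = {"i": 0, "love": 1, "this": 2, "movie": 3}
    V = len(vocab)

    def region_vectors(tokens, p=2, variant="seq"):
        """Turn a token list into region vectors of width p.
        'seq' concatenates one-hot vectors (dim p*V, order preserved);
        'bow' sums them within the region (dim V, order discarded)."""
        ids = [vocab[t] for t in tokens]
        regions = []
        for i in range(len(ids) - p + 1):
            if variant == "seq":
                v = np.zeros(p * V)
                for j, wid in enumerate(ids[i:i + p]):
                    v[j * V + wid] = 1.0
            else:
                v = np.zeros(V)
                for wid in ids[i:i + p]:
                    v[wid] += 1.0
            regions.append(v)
        return np.stack(regions)   # the convolution layer then applies one
                                   # shared linear map + nonlinearity per row

    R = region_vectors("i love this movie".split(), p=2, variant="seq")
    print(R.shape)   # (3, 8): three regions, each a 2*|V|-dim concatenation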