  • Big Learning: Algorithms, Systems, and Tools for Learning at Scale

    Driven by cheap commodity storage, fast data networks, rich structured models, and the increasing desire to catalog and share our collective experiences in real-time, the scale of many important learning problems has grown well beyond the capacity of traditional sequential systems. These “Big Learning” problems arise in many domains including bioinformatics, astronomy, recommendation systems, social networks, computer vision, web search and online advertising. Simultaneously, parallelism has emerged as a dominant, widely used computational paradigm in devices ranging from energy efficient mobile processors, to desktop supercomputers in the form of GPUs, to massively scalable cloud computing services. The Big Learning setting has attracted intense interest across industry and academia, with active research spanning diverse fields ranging from machine learning and databases to large scale distributed systems and programming languages. However, because the Big Learning setting is being studied by experts from these various communities, there is a need for a common venue to discuss recent progress, to identify pressing new challenges, and to exchange new ideas.

    This workshop aims to:

    * Bring together parallel and distributed system builders in industry and academia, machine learning experts, and end users to identify the key challenges, opportunities, and myths of Big Learning. What REALLY changes from the traditional learning setting when faced with terabytes or petabytes of data?
    * Solicit practical case studies, demos, benchmarks and lessons-learned presentations, and position papers.
    * Showcase recent and ongoing progress towards parallel ML algorithms.
    * Provide a forum for exchange regarding tools, software, and systems that address the Big Learning problem.
    * Educate the researchers and practitioners across communities on state-of-the-art solutions and their limitations, particularly focusing on key criteria for selecting task- and domain-appropriate platforms and algorithms.

  • The 4th International Workshop on Music and Machine Learning: Learning from Musical Structure

    With the current explosion and quick expansion of music in digital formats, and the computational power of modern systems, research on machine learning and music is gaining increasing popularity. As the complexity of the problems investigated by researchers on machine learning and music increases, there is a need to develop new algorithms and methods to solve these problems. The focus of this workshop is on novel methods which take into account or benefit from musical structure. MML 2011 aims to build on the previous three successful MML editions, MML’08, MML’09 and MML’10.

    It has been convincingly shown that many useful applications can be built using features derived from short musical snippets (chroma, MFCCs and related timbral features, augmented with tempo and beat representations). Given the great advances in these applications, higher level aspects of musical structure such as melody, harmony, phrasing and rhythm can now be given further attention, and we especially welcome contributions exploring these areas. The MML 2011 workshop intends to concentrate on machine learning algorithms employing higher level features and representations for content-based music processing.

    Papers in all applications on music and machine learning are welcome, including but not limited to automatic classification of music (audio and MIDI), style-based interpreter recognition, automatic composition and improvisation, music recommender systems, genre and tag prediction, score alignment, polyphonic pitch detection, chord extraction, pattern discovery, beat tracking, and expressive performance modeling. Audio demonstrations are encouraged when indicated by the content of the paper.

  • Sparse Representation and Low-rank Approximation

    Sparse representation and low-rank approximation are fundamental tools in fields as diverse as computer vision, computational biology, signal processing, natural language processing, and machine learning. Recent advances in sparse and low-rank modeling have led to increasingly concise descriptions of high dimensional data, together with algorithms of provable performance and bounded complexity. Our workshop aims to survey recent work on sparsity and low-rank approximation and to provide a forum for open discussion of the key questions concerning these dimensionality reduction techniques. The workshop will be divided into two segments, a "sparsity segment" emphasizing sparse dictionary learning and a "low-rank segment" emphasizing scalability and large data.

    The sparsity segment will be dedicated to learning sparse latent representations and dictionaries: decomposing a signal or a vector of observations as sparse linear combinations of basis vectors, atoms or covariates is ubiquitous in machine learning and signal processing. Algorithms and theoretical analyses for obtaining these decompositions are now numerous. Learning the atoms or basis vectors directly from data has proven useful in several domains and is often seen from different view points: (a) as a matrix factorization problem with potentially some constraints such as pointwise nonnegativity, (b) as a latent variable model which can be treated in a probabilistic and potentially Bayesian way, leading in particular to topic models, and (c) as dictionary learning with often a goal of signal representation or restoration. The goal of this part of the workshop is to confront these various points of view and foster exchanges of ideas among the signal processing, statistics, machine learning and applied mathematics communities.

    The low-rank segment will explore the impact of low-rank methods for large-scale machine learning. Large datasets often take the form of matrices representing either a set of real-valued features for each datapoint or pairwise similarities between datapoints. Hence, modern learning problems face the daunting task of storing and operating on matrices with millions to billions of entries. An attractive solution to this problem involves working with low-rank approximations of the original matrix. Low-rank approximation is at the core of widely used algorithms such as Principal Component Analysis and Latent Semantic Indexing, and low-rank matrices appear in a variety of applications including lossy data compression, collaborative filtering, image processing, text analysis, matrix completion, robust matrix factorization and metric learning. In this segment we aim to study new algorithms, recent theoretical advances and large-scale empirical results, and more broadly we hope to identify additional interesting scenarios for use of low-rank approximations for learning tasks.
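
    As a rough illustration of the idea of replacing a matrix with a low-rank approximation, the sketch below computes a rank-1 approximation of a small matrix by power iteration. It is not taken from the workshop material: the helper names (matvec, transpose, rank1_approximation), the example matrix, and the choice of power iteration are all assumptions made for this example, and it uses only the Python standard library.

```python
import math

def matvec(A, x):
    """Multiply matrix A (list of rows) by vector x."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def transpose(A):
    """Return the transpose of A as a list of rows."""
    return [list(col) for col in zip(*A)]

def rank1_approximation(A, iterations=100):
    """Estimate the leading singular triple (sigma, u, v) of A.

    Hypothetical helper: runs power iteration on A^T A, which converges
    to the leading right singular vector when the start vector is not
    orthogonal to it.
    """
    At = transpose(A)
    v = [1.0] * len(A[0])
    for _ in range(iterations):
        w = matvec(At, matvec(A, v))
        norm = math.sqrt(sum(t * t for t in w))
        v = [t / norm for t in w]
    Av = matvec(A, v)
    sigma = math.sqrt(sum(t * t for t in Av))
    u = [t / sigma for t in Av]
    return sigma, u, v

# Illustrative data only: an outer product of [2, 1, 3] and [1, 2, 3]
# with one slightly perturbed entry, so the matrix is nearly rank 1.
A = [[2.0, 4.0, 6.0],
     [1.0, 2.0, 3.0],
     [3.0, 6.0, 9.1]]

sigma, u, v = rank1_approximation(A)

# Rebuild the rank-1 matrix sigma * u * v^T and measure the worst entry error.
approx = [[sigma * ui * vj for vj in v] for ui in u]
max_error = max(abs(a - b) for ra, rb in zip(A, approx) for a, b in zip(ra, rb))
```

    The storage argument is the point of the example: a rank-k factorization of an m-by-n matrix needs on the order of k(m + n) numbers instead of mn, which matters once m and n reach the millions.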

  • Learning Semantics Workshop

    A key ambition of AI is to render computers able to evolve in and interact with the real world. This can be made possible only if the machine is able to produce a correct interpretation of its available modalities (image, audio, text, etc.), upon which it would then build a reasoning to take appropriate actions. Computational linguists use the term "semantics" to refer to the possible interpretations (concepts) of natural language expressions, and have shown some interest in "learning semantics", that is, finding these interpretations in an automated way. However, "semantics" are not restricted to the natural language modality, and are also pertinent to the speech and vision modalities. Hence, knowing visual concepts and the common relationships between them would certainly bring a leap forward in scene analysis and image parsing, akin to the improvement that language phrase interpretations would bring to data mining, information extraction or automatic translation, to name a few.

    Progress in learning semantics has been slow mainly because this involves sophisticated models which are hard to train, especially since they seem to require large quantities of precisely annotated training data. However, recent advances in learning with weak and limited supervision have led to the emergence of a new body of research in semantics based on multi-task/transfer learning, on learning with semi/ambiguous supervision or even with no supervision at all. The goal of this workshop is to explore these new directions and, in particular, to investigate the following questions:

    * How should meaning representations be structured to be easily interpretable by a computer and still express rich and complex knowledge?
    * What is a realistic supervision setting for learning semantics? How can we learn sophisticated representations with limited supervision?
    * How can we jointly infer semantics from several modalities?

    This workshop defines the issue of learning semantics as its main interdisciplinary subject and aims at identifying, establishing and discussing the potential, challenges and issues of learning semantics. The workshop is mainly organized around invited speakers to highlight several key current directions, but it also presents selected contributions and is intended to encourage the exchange of ideas with all the other members of the NIPS community.

  • Domain Adaptation Workshop: Theory and Application

    Despite the recent advances in domain adaptation, many of the most successful practical achievements in domain adaptation have not been robust, in part because they lack formal assumptions about when they could perform well. At the same time, some of the most influential theoretical work guarantees near optimal performance in new domains, but under assumptions that may not hold in practice.

    Our workshop will bridge theory and practice in the following ways:

    1. We will have one applied and two theoretical invited talks.

    2. We will advertise the workshop to both the applied and theoretical communities.

    3. We will have discussion sessions that emphasize both the formal assumptions underlying successful practical algorithms and new algorithms based on theoretical foundations.

    Workshop attendees should come away with an understanding of the domain adaptation problem, how it appears in practical applications and existing theoretical guarantees that can be provided in this more general setting. More importantly, attendees will be exposed to the important open problems of the field, which will encourage new collaborations and results.

  • Machine Learning in Computational Biology (MLCB) 2011

    A workshop at the annual Conference on Neural Information Processing Systems (NIPS 2011) @ Sierra Nevada, Spain, December 17, 2011.

  • NIPS 2011 Workshop on Integrating Language and Vision

    A growing number of researchers in computer vision have started to explore how language accompanying images and video can be used to aid interpretation and retrieval, as well as train object and activity recognizers. Simultaneously, an increasing number of computational linguists have begun to investigate how visual information can be used to aid language learning and interpretation, and to ground the meaning of words and sentences in perception. However, there has been very little direct interaction between researchers in these two distinct disciplines. Consequently, researchers in each area have a limited understanding of the methods in the other area, and do not optimally exploit the latest ideas and techniques from both disciplines when developing systems that integrate language and vision. The goal of this workshop is to bring together researchers in both computer vision and natural-language processing (NLP) to interact, collaborate, and discuss issues and future directions in integrating language and vision.

    Traditional machine learning for both computer vision and NLP requires manually annotating images, video, text, or speech with detailed labels, parse-trees, segmentations, etc. Methods that integrate language and vision hold the promise of greatly reducing such manual supervision by using naturally co-occurring text and images/video to mutually supervise each other.

    There is a wide range of important real-world applications that require integrating vision and language, including but not limited to: image and video retrieval, human-robot interaction, medical image processing, human-computer interaction in virtual worlds, and computer graphics generation.

  • Copulas in Machine Learning Workshop 2011

    From high-throughput biology and astronomy to voice analysis and medical diagnosis, a wide variety of complex domains are inherently continuous and high dimensional. The statistical framework of copulas offers a flexible tool for modeling highly non-linear multivariate distributions for continuous data. Copulas are a theoretically and practically important tool from statistics that explicitly allow one to separate the dependency structure between random variables from their marginal distributions. Although bivariate copulas are a widely used tool in finance, and have even been famously accused of "bringing the world financial system to its knees" (Wired Magazine, Feb. 23, 2009), the use of copulas for high dimensional data is in its infancy.
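
    The separation of dependency structure from marginal distributions can be made concrete with a small sketch. It is an illustration only, not part of the workshop material: the helper pseudo_observations, the synthetic data, and the particular transforms are assumptions chosen for this example. Each sample is mapped to rank-based values in (0, 1), which is the scale on which a copula is defined, and the marginals are then changed with strictly increasing transforms to show that these values do not move.

```python
import math
import random

def pseudo_observations(values):
    """Map a sample to (0, 1) through its ranks: u_i = rank_i / (n + 1).

    Hypothetical helper for this example; assumes no ties in the sample.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    u = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        u[idx] = rank / (n + 1)
    return u

random.seed(0)

# Synthetic, positively dependent pair (illustrative data only).
x = [random.gauss(0, 1) for _ in range(200)]
y = [xi + 0.5 * random.gauss(0, 1) for xi in x]
u, v = pseudo_observations(x), pseudo_observations(y)

# Change both marginals with strictly increasing transforms.
x2 = [math.exp(xi) for xi in x]   # log-normal marginal
y2 = [yi ** 3 for yi in y]        # heavier-tailed marginal
u2, v2 = pseudo_observations(x2), pseudo_observations(y2)

# The rank-based values, and hence the empirical dependence structure,
# are the same before and after the marginals were changed.
same_dependence = (u == u2) and (v == v2)
```

    The marginal distributions of x2 and y2 differ markedly from those of x and y, yet the paired values (u, v) that carry the dependence are unchanged, which is the property the paragraph above attributes to copulas.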

    While studied in statistics for many years, copulas have only recently been noticed by a number of machine learning researchers, with this "new" tool appearing in the recent leading machine learning conferences (ICML, UAI and NIPS). The goal of this workshop is to promote the further understanding and development of copulas for the kinds of complex modeling tasks that are the focus of machine learning. Specifically, the goals of the workshop are to:

    * draw the attention of machine learning researchers to the important framework of copulas

    * provide a theoretical and practical introduction to copulas

    * identify promising research problems in machine learning that could exploit copulas

    * bring together researchers from the statistics and machine learning communities working in this area.

    The target audience includes leading researchers from academia and industry, with the aim of facilitating cross-fertilization between different perspectives.

  • 1st Lisbon Machine Learning School

    LxMLS 2011 will take place July 20-25 (*) at Instituto Superior Tecnico, a leading Engineering and Science school in Portugal. It is organized jointly by IST, the Instituto de Telecomunicacoes and the Spoken Language Systems Lab - L2F of INESC-ID. In its debut year, the topic of the school is Learning for the Web. The school will cover a range of machine learning (ML) topics, from theory to practice, that are important in solving natural language processing (NLP) problems that arise in the analysis and use of Web data.

  • FOCS 2011

    52nd Annual IEEE Symposium on Foundations of Computer Science (FOCS 2011). Palm Springs, California, October 23-25, 2011.