
  • Columbia-Princeton Probability Day 2013

    The workshop was held in Jadwin Hall, Room A10 on the Princeton University campus.

  • Vator Splash SFO Feb 2013

    Vator (short for innovator) is one of the largest business networks dedicated to high-tech entrepreneurs. Founded and run by veteran and award-winning journalist Bambi Francisco, Vator is a community of entrepreneurs, investors, and high-tech companies, with some 100,000 members. VatorNews is Vator's news site focused on innovation, with about 500 contributors.

  • WritersUA 2013

    The Conference for Software User Assistance

  • NYU Course on Big Data, Large Scale Machine Learning

    Taught by John Langford and Yann LeCun, this course is for people interested in automatically extracting knowledge from large amounts of data. Students should have some prior knowledge or experience with basic machine learning methods.

    You must have taken a machine learning course at the undergraduate or graduate level prior to taking this course, or have industry experience with machine learning.

  • 2nd Lisbon Machine Learning School (2012)

    LxMLS 2012 took place July 19-25 at Instituto Superior Técnico (IST), a leading engineering and science school in Portugal. It was organized jointly by IST, the Instituto de Telecomunicações, and the Spoken Language Systems Lab (L2F) of INESC-ID. Information about the past edition (LxMLS 2011), including videos of the lectures, is available online.

    In its second year, the topic of the school was Taming the Social Web.

    The school covered a range of machine learning (ML) topics, from theory to practice, that are important for solving natural language processing (NLP) problems arising in the analysis and use of Web data.

  • Machine Learning in Computational Biology (MLCB) 2012

    The field of computational biology has seen dramatic growth over the past few years, in terms of newly available data, new scientific questions, and new challenges for learning and inference. In particular, biological data are often relationally structured and highly diverse, and thus well suited to approaches that combine weak evidence from multiple heterogeneous sources. These data may include sequenced genomes of a variety of organisms, gene expression data from multiple technologies, protein expression data, protein sequence and 3D structural data, protein interactions, gene ontology and pathway databases, genetic variation data (such as SNPs), and an enormous amount of textual data in the biological and medical literature. New types of scientific and clinical problems require the development of novel supervised and unsupervised learning methods that can use these growing resources. Furthermore, next-generation sequencing technologies are yielding terabyte-scale data sets that require novel algorithmic solutions.

    The goal of this workshop is to present emerging problems and machine learning techniques in computational biology. We will invite several speakers from the biology/bioinformatics community who will present current research problems in bioinformatics, and we will invite contributed talks on novel learning approaches in computational biology. We encourage contributions describing either progress on new bioinformatics problems or work on established problems using methods that are substantially different from standard approaches. Kernel methods, graphical models, feature selection, and other techniques applied to relevant bioinformatics problems would all be appropriate for the workshop. The target audience is people with an interest in machine learning and its applications to relevant problems in the life sciences.

  • NIPS 2012 Workshop on Log-Linear Models

    Exponential functions are core mathematical constructs that are key to many important applications, including speech recognition, pattern-search and logistic regression problems in statistics, machine translation, and natural language processing. Exponential functions are found in exponential families, log-linear models, conditional random fields (CRFs), entropy functions, neural networks involving sigmoid and softmax functions, and Kalman filtering or MMIE training of hidden Markov models. Many techniques have been developed in pattern recognition to construct formulations from exponential expressions and to optimize such functions, including growth transforms, EM, EBW, Rprop, bounds for log-linear models, large-margin formulations, and regularization. Optimization of log-linear models also provides important algorithmic tools for machine learning applications (including deep learning), leading to new research in such topics as stochastic gradient methods, sparse/regularized optimization methods, enhanced first-order methods, coordinate descent, and approximate second-order methods. Specific recent advances relevant to log-linear modeling include the following.

    • Effective optimization approaches, including stochastic gradient and Hessian-free methods.
    • Efficient algorithms for regularized optimization problems.
    • Bounds for log-linear models and recent convergence results.
    • Recognition of modeling equivalences across different areas, such as the equivalence between Gaussian and log-linear models (and between HMMs and HCRFs), and the equivalence between transfer entropy and Granger causality for Gaussian parameters.
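
    For concreteness, the central object can be written down in a few lines (standard definitions, not specific to this workshop): a conditional log-linear model with feature function f and weight vector w takes the form

        p(y \mid x; w) = \frac{\exp\big(w^\top f(x, y)\big)}{Z(x; w)},
        \qquad Z(x; w) = \sum_{y'} \exp\big(w^\top f(x, y')\big),

    and the gradient of the conditional log-likelihood is

        \nabla_w \log p(y \mid x; w) = f(x, y) - \mathbb{E}_{y' \sim p(\cdot \mid x; w)}\big[f(x, y')\big],

    the difference between empirical and expected feature counts, which is the quantity that the optimization methods listed above compute or approximate.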

    Though exponential functions and log-linear models are well established, research activity remains intense, due to the central importance of the area in front-line applications and the rapidly expanding size of the data sets to be processed. Fundamental work is needed to transfer algorithmic ideas across different contexts and explore synergies between them, to assimilate the influx of ideas from optimization, to assemble better combinations of algorithmic elements for tackling such key tasks as deep learning, and to address key issues such as parameter tuning.

    The workshop will bring together researchers from the many fields that formulate, use, analyze, and optimize log-linear models, with a view to exposing and studying the issues discussed above.

    Topics of possible interest for talks at the workshop include, but are not limited to, the following.

    1. Log-linear models.
    2. Using equivalences to transfer optimization and modeling methods across different applications and different classes of models.
    3. Comparison of optimization / accuracy performance of equivalent model pairs.
    4. Convex formulations.
    5. Bounds and their applications.
    6. Stochastic gradient, first-order, and approximate-second-order methods (see the sketch after this list).
    7. Efficient non-Gaussian filtering approaches (exploiting the equivalence of Gaussian generative and log-linear models, and projection onto the exponential manifold of densities).
    8. Graphical and network inference models.
    9. Missing data and hidden variables in log-linear modeling.
    10. Semi-supervised estimation in log-linear modeling.
    11. Sparsity in log-linear models.
    12. Block and novel regularization methods for log-linear models.
    13. Parallel, distributed and large-scale methods for log-linear models.
    14. Information geometry of Gaussian densities and exponential families.
    15. Hybrid algorithms that combine different optimization strategies.
    16. Connections between log-linear models and deep belief networks.
    17. Connections with kernel methods.
    18. Applications to speech / natural-language processing and other areas.
    19. Empirical contributions that compare and contrast different approaches.
    20. Theoretical contributions that relate to any of the above topics.
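
    As a minimal sketch of topic 6 in the simplest possible setting (illustrative only, not a workshop contribution): stochastic-gradient ascent on the regularized conditional log-likelihood of a multinomial logistic-regression model, the most basic conditional log-linear model. The learning rate, regularization strength, and epoch count are arbitrary assumptions.

        # Stochastic-gradient training of a multinomial logistic-regression
        # model (a conditional log-linear model). Illustrative sketch only;
        # hyperparameters are assumptions, not recommendations.
        import numpy as np

        def softmax(z):
            z = z - z.max()                    # stabilize the exponentials
            e = np.exp(z)
            return e / e.sum()

        def sgd_loglinear(X, y, num_classes, epochs=10, lr=0.1, l2=1e-4, seed=0):
            """Maximize the L2-regularized conditional log-likelihood by SGD."""
            rng = np.random.default_rng(seed)
            n, d = X.shape
            W = np.zeros((num_classes, d))
            for _ in range(epochs):
                for i in rng.permutation(n):
                    p = softmax(W @ X[i])      # model distribution p(y | x_i; W)
                    grad = -np.outer(p, X[i])  # minus the expected feature vector
                    grad[y[i]] += X[i]         # plus the empirical feature vector
                    W += lr * (grad - l2 * W)  # ascend the regularized objective
            return W

    Each update is exactly the empirical-minus-expected-features gradient written out earlier, evaluated on a single example.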

  • Big Learning: Algorithms, Systems, and Tools

    This workshop will address algorithms, systems, and real-world problem domains related to large-scale machine learning (“Big Learning”). With active research spanning machine learning, databases, parallel and distributed systems, parallel architectures, programming languages and abstractions, and even the sciences, Big Learning has attracted intense interest. The workshop will bring together experts across these diverse communities to discuss recent progress, share tools and software, identify pressing new challenges, and exchange new ideas. Topics of interest include (but are not limited to):

    - Big Data: Methods for managing large, unstructured, and/or streaming data; cleaning, visualization, and interactive platforms for data understanding and interpretation; sketching and summarization techniques (a minimal example appears after this list); sources of large datasets.

    - Models & Algorithms: Machine learning algorithms for parallel, distributed, GPGPU, or other novel architectures; theoretical analysis; distributed online algorithms; implementation and experimental evaluation; methods for distributed fault tolerance.

    - Applications of Big Learning: Practical application studies and challenges of real-world system building; insights on end-users, common data characteristics (stream or batch); trade-offs between labeling strategies (e.g., curated or crowd-sourced).

    - Tools, Software & Systems: Languages and libraries for large-scale parallel or distributed learning that leverage cloud computing, scalable storage (e.g., RDBMSs, NoSQL, graph databases), and/or specialized hardware.
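
    As a concrete instance of the sketching techniques mentioned in the Big Data topic above, here is a minimal Count-Min sketch for approximate frequency counting over a stream (an illustrative sketch; the width, depth, and hashing scheme are assumptions, not anyone's reference implementation).

        # Count-Min sketch: a fixed-size summary that answers approximate
        # frequency queries over a data stream. Queries never undercount;
        # they overcount by at most the hash-collision mass per row.
        import random

        class CountMinSketch:
            def __init__(self, width=2048, depth=5, seed=0):
                rnd = random.Random(seed)
                self.width = width
                self.salts = [rnd.getrandbits(64) for _ in range(depth)]
                self.table = [[0] * width for _ in range(depth)]

            def _cells(self, item):
                for row, salt in enumerate(self.salts):
                    yield row, hash((salt, item)) % self.width

            def add(self, item, count=1):
                for row, col in self._cells(item):
                    self.table[row][col] += count

            def query(self, item):
                # Take the minimum across rows: the least-collided counter.
                return min(self.table[row][col] for row, col in self._cells(item))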