
  • NYU Course on Big Data, Large Scale Machine Learning

    Taught by John Langford and Yann LeCun, this course is for people interested in automatically extracting knowledge from large amounts of data. Students should have prior experience with basic machine learning methods: a machine learning course at the undergraduate or graduate level, or equivalent industry experience, is required.

  • 2nd Lisbon Machine Learning School (2012)

    LxMLS 2012 took place July 19-25 at Instituto Superior Técnico, a leading engineering and science school in Portugal. It was organized jointly by IST, the Instituto de Telecomunicações, and the Spoken Language Systems Lab (L2F) of INESC-ID. Information about the previous edition (LxMLS 2011), including videos of the lectures, is available on the school's website.

    In our second year, the topic of the school is Taming the Social Web.

    The school covers a range of machine learning (ML) topics, from theory to practice, that are important for solving the natural language processing (NLP) problems that arise in the analysis and use of Web data.

  • Machine Learning in Computational Biology (MLCB) 2012

    The field of computational biology has seen dramatic growth over the past few years, in terms of newly available data, new scientific questions, and new challenges for learning and inference. In particular, biological data are often relationally structured and highly diverse, well suited to approaches that combine weak evidence from multiple heterogeneous sources. These data may include sequenced genomes of a variety of organisms, gene expression data from multiple technologies, protein expression data, protein sequence and 3D structural data, protein interactions, gene ontology and pathway databases, genetic variation data (such as SNPs), and an enormous amount of textual data in the biological and medical literature. New types of scientific and clinical problems require the development of novel supervised and unsupervised learning methods that can use these growing resources. Furthermore, next-generation sequencing technologies are yielding terabyte-scale data sets that require novel algorithmic solutions.

    The goal of this workshop is to present emerging problems and machine learning techniques in computational biology. We will invite several speakers from the biology/bioinformatics community who will present current research problems in bioinformatics, and we will invite contributed talks on novel learning approaches in computational biology. We encourage contributions describing either progress on new bioinformatics problems or work on established problems using methods that are substantially different from standard approaches. Kernel methods, graphical models, feature selection, and other techniques applied to relevant bioinformatics problems would all be appropriate for the workshop. The target audience is anyone interested in machine learning and its applications to problems in the life sciences.
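
    To make the kinds of methods the workshop invites concrete, here is a minimal sketch of a k-mer spectrum kernel, a standard kernel-method building block for comparing biological sequences; the function names, the choice of k, and the toy DNA strings are illustrative assumptions of mine, not drawn from any workshop material.

      from collections import Counter

      def kmer_counts(seq, k=3):
          # Count all overlapping k-mers (length-k substrings) in a sequence.
          return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

      def spectrum_kernel(seq_a, seq_b, k=3):
          # Spectrum kernel: inner product of the two k-mer count vectors.
          ca, cb = kmer_counts(seq_a, k), kmer_counts(seq_b, k)
          return sum(ca[m] * cb[m] for m in ca.keys() & cb.keys())

      print(spectrum_kernel("ACGTACGTAC", "ACGTTTACGT"))  # similarity of two toy DNA fragments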

  • NIPS 2012 Workshop on Log-Linear Models

    Exponential functions are core mathematical constructs that are key to many important applications, including speech recognition, pattern-search and logistic regression problems in statistics, machine translation, and natural language processing. Exponential functions are found in exponential families, log-linear models, conditional random fields (CRF), entropy functions, neural networks involving sigmoid and softmax functions, and Kalman filter or MMIE training of hidden Markov models. Many techniques have been developed in pattern recognition to construct formulations from exponential expressions and to optimize such functions, including growth transforms, EM, EBW, Rprop, bounds for log-linear models, large-margin formulations, and regularization. Optimization of log-linear models also provides important algorithmic tools for machine learning applications (including deep learning), leading to new research in such topics as stochastic gradient methods, sparse/regularized optimization methods, enhanced first-order methods, coordinate descent, and approximate second-order methods. Specific recent advances relevant to log-linear modeling include the following.
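
    For readers coming from outside these communities, the shared object behind all of the above is the conditional log-linear model. In notation of our own choosing (not the workshop's), with feature vector f(x, y), weights w, and partition function Z(x):

      \[
      p(y \mid x; w) \;=\; \frac{\exp\big(w^\top f(x,y)\big)}{\sum_{y'} \exp\big(w^\top f(x,y')\big)},
      \qquad
      \log p(y \mid x; w) \;=\; w^\top f(x,y) - \log Z(x).
      \]

    The softmax and multinomial logistic regression mentioned above are exactly this form for suitable choices of f.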

    • Effective optimization approaches, including stochastic gradient and Hessian-free methods (a minimal stochastic-gradient sketch follows this list).
    • Efficient algorithms for regularized optimization problems.
    • Bounds for log-linear models and recent convergence results.
    • Recognition of modeling equivalences across different areas, such as the equivalence between Gaussian and log-linear models/HMM and HCRF, and the equivalence between transfer entropy and Granger causality for Gaussian parameters.
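
    As a minimal illustration of the first item above, the sketch below fits a binary log-linear model (logistic regression) by stochastic gradient ascent on synthetic data; the learning rate, epoch count, and data generator are placeholder choices of mine, not recommendations from the workshop.

      import numpy as np

      rng = np.random.default_rng(0)
      X = rng.normal(size=(1000, 5))                  # synthetic features
      y = (X @ rng.normal(size=5) > 0).astype(float)  # synthetic binary labels

      w = np.zeros(5)
      lr = 0.1                                        # placeholder learning rate
      for epoch in range(20):
          for i in rng.permutation(len(X)):
              p = 1.0 / (1.0 + np.exp(-X[i] @ w))     # sigmoid: a two-class log-linear model
              w += lr * (y[i] - p) * X[i]             # stochastic gradient of the log-likelihood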

    Though exponential functions and log-linear models are well established, research activity remains intense, due to the central importance of the area in front-line applications and the rapidly expanding size of the data sets to be processed. Fundamental work is needed to transfer algorithmic ideas across different contexts and explore synergies between them, to assimilate the influx of ideas from optimization, to assemble better combinations of algorithmic elements for tackling such key tasks as deep learning, and to explore such key issues as parameter tuning.

    The workshop will bring together researchers from the many fields that formulate, use, analyze, and optimize log-linear models, with a view to exposing and studying the issues discussed above.

    Topics of possible interest for talks at the workshop include, but are not limited to, the following.

    1. Log-linear models.
    2. Using equivalences to transfer optimization and modeling methods across different applications and different classes of models.
    3. Comparison of optimization / accuracy performance of equivalent model pairs.
    4. Convex formulations.
    5. Bounds and their applications.
    6. Stochastic gradient, first-order, and approximate-second-order methods.
    7. Efficient non-Gaussian filtering approaches that exploit the equivalence between Gaussian generative and log-linear models by projecting onto an exponential manifold of densities.
    8. Graphical and network inference models.
    9. Missing data and hidden variables in log-linear modeling.
    10. Semi-supervised estimation in log-linear modeling.
    11. Sparsity in log-linear models.
    12. Block and novel regularization methods for log-linear models.
    13. Parallel, distributed and large-scale methods for log-linear models.
    14. Information geometry of Gaussian densities and exponential families.
    15. Hybrid algorithms that combine different optimization strategies.
    16. Connections between log-linear models and deep belief networks.
    17. Connections with kernel methods.
    18. Applications to speech / natural-language processing and other areas.
    19. Empirical contributions that compare and contrast different approaches.
    20. Theoretical contributions that relate to any of the above topics.

  • Big Learning: Algorithms, Systems, and Tools

    This workshop will address algorithms, systems, and real-world problem domains related to large-scale machine learning (“Big Learning”). With active research spanning machine learning, databases, parallel and distributed systems, parallel architectures, programming languages and abstractions, and even the sciences, Big Learning has attracted intense interest. This workshop will bring together experts across these diverse communities to discuss recent progress, share tools and software, identify pressing new challenges, and exchange new ideas. Topics of interest include (but are not limited to):

    - Big Data: Methods for managing large, unstructured, and/or streaming data; cleaning, visualization, and interactive platforms for data understanding and interpretation; sketching and summarization techniques (a toy sketching example follows this list); sources of large datasets.

    - Models & Algorithms: Machine learning algorithms for parallel, distributed, GPGPU, or other novel architectures; theoretical analysis; distributed online algorithms; implementation and experimental evaluation; methods for distributed fault tolerance.

    - Applications of Big Learning: Practical application studies and challenges of real-world system building; insights on end-users, common data characteristics (stream or batch); trade-offs between labeling strategies (e.g., curated or crowd-sourced).

    - Tools, Software & Systems: Languages and libraries for large-scale parallel or distributed learning that leverage cloud computing, scalable storage (e.g., RDBMSs, NoSQL, graph databases), and/or specialized hardware.
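
    To make the "sketching and summarization" bullet concrete, here is a toy Count-Min sketch, a standard streaming summary that answers approximate frequency queries in fixed memory; the width, depth, and hashing scheme are illustrative choices of my own, not tied to any particular system discussed at the workshop.

      import hashlib

      class CountMinSketch:
          def __init__(self, width=2048, depth=4):
              self.width, self.depth = width, depth
              self.table = [[0] * width for _ in range(depth)]

          def _hash(self, item, row):
              # One hash function per row, derived from MD5 of a row-tagged key.
              digest = hashlib.md5(f"{row}:{item}".encode()).hexdigest()
              return int(digest, 16) % self.width

          def add(self, item):
              for row in range(self.depth):
                  self.table[row][self._hash(item, row)] += 1

          def count(self, item):
              # Collisions only inflate counts, so the row minimum is an upper bound.
              return min(self.table[row][self._hash(item, row)]
                         for row in range(self.depth))

      cms = CountMinSketch()
      for word in ["cat", "dog", "cat"]:
          cms.add(word)
      print(cms.count("cat"))  # 2 (possibly more under collisions)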

  • Big Data Meets Computer Vision: First International Workshop on Large Scale Visual Recognition and Retrieval

    The emergence of “big data” has brought about a paradigm shift throughout computer science. Computer vision is no exception. The explosion of images and videos on the Internet and the availability of large amounts of annotated data have created unprecedented opportunities and fundamental challenges in scaling up computer vision.

    Over the past few years, machine learning on big data has become a thriving field with a plethora of theories and tools developed. Meanwhile, large scale vision has also attracted increasing attention in the computer vision community. This workshop aims to bring closer researchers in large scale machine learning and large scale vision to foster cross-talk between the two fields. The goal is to encourage machine learning researchers to work on large scale vision problems, to inform computer vision researchers about new developments on large scale learning, and to identify unique challenges and opportunities.

    This workshop will focus on two distinct yet closely related vision problems: recognition and retrieval. Both are inherently large scale. In particular, both must handle high dimensional features (hundreds of thousands to millions), a large variety of visual classes (tens of thousands to millions), and a large number of examples (millions to billions).

    This workshop will consist of invited talks, panels, discussions, and paper submissions including, but not limited to, the following topics:

    -- State of the field: What really defines large scale vision? How does it differ from traditional vision research? What are its unique challenges for large scale learning?

    -- Indexing algorithms and data structures: How do we efficiently find similar features/images/classes in a large collection, a key operation in both recognition and retrieval? (A toy indexing sketch follows this list.)

    -- Semi-supervised/unsupervised learning: Large scale data comes with different levels of supervision, ranging from fully labeled and quality controlled to completely unlabeled. How do we make use of such data?

    -- Metric learning: Retrieving visually similar images/objects requires learning a similarity metric. How do we learn a good metric from a large amount of data?

    -- Visual models and feature representations: What is a good feature representation? How do we model and represent images/videos to handle tens of thousands of fine-grained visual classes?

    -- Exploiting semantic structures: How do we exploit the rich semantic relations between visual categories to handle a large number of classes?

    -- Transfer learning: How do we handle new visual classes (objects/scenes/activities) after having learned a large number of them? How do we transfer knowledge using the semantic relations between classes?

    -- Optimization techniques: How do we perform learning with training data that do not fit into memory? How do we parallelize learning?

    -- Dataset issues: What is a good large scale dataset? How should we construct datasets? How do we avoid dataset bias?

    -- Systems and infrastructure: How do we design and develop libraries and tools to facilitate large scale vision research? What infrastructure do we need?
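
    As a toy answer to the indexing question above, the sketch below buckets high-dimensional feature vectors with random-hyperplane locality-sensitive hashing, so near-duplicate vectors tend to land in the same bucket and only a small candidate set needs exact ranking; the dimensionality, number of hash bits, and data are illustrative assumptions of mine.

      import numpy as np
      from collections import defaultdict

      rng = np.random.default_rng(0)
      dim, n_bits = 128, 16                        # illustrative sizes
      planes = rng.normal(size=(n_bits, dim))      # random hyperplanes

      def lsh_key(v):
          # Sign pattern of v against the hyperplanes; nearby vectors share keys.
          return tuple((planes @ v > 0).astype(int))

      index = defaultdict(list)
      vectors = rng.normal(size=(10000, dim))      # stand-in for image features
      for i, v in enumerate(vectors):
          index[lsh_key(v)].append(i)

      query = vectors[42] + 0.01 * rng.normal(size=dim)  # near-duplicate query
      print(42 in index[lsh_key(query)])           # usually True: small candidate set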

    The target audience of this workshop includes industry and academic researchers interested in machine learning, computer vision, multimedia, and related fields.

  • OpenCV using Python

    Learn how to develop a machine learning application using Python and OpenCV. OpenCV is a cross-platform library of programming functions for real-time computer vision. While OpenCV is written primarily in C and C++, much of the API functionality can be accessed through wrappers in other languages, including Python. This tutorial will delve into using Python to develop computer vision applications.
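
    As a small taste of what such a tutorial covers, the snippet below (a minimal example of my own, not taken from the tutorial materials) loads an image through OpenCV's Python bindings and runs Canny edge detection; the filenames are placeholders.

      import cv2

      image = cv2.imread("input.jpg")                  # placeholder filename
      gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)   # OpenCV loads images as BGR
      edges = cv2.Canny(gray, 100, 200)                # low/high hysteresis thresholds
      cv2.imwrite("edges.jpg", edges)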

  • IPC–SMTA High-Reliability Cleaning and Conformal Coating Conference 2012

    This event is focused on electronics assembly reliability and the influence of cleaning and coating on the production of reliable hardware.

    The question “How clean is clean?” is increasingly challenging to answer: conductors and circuit traces are growing narrower, and what is acceptably clean for one industry segment may be unacceptable for another.

  • The Website Is The App (And Vice Versa)

    There are two basic models for media and entertainment companies using mobile web apps built on HTML5. One model is to create a whole new experience using web technologies. This could mean re-packaging existing content, giving new life to archives, or just experimenting with a new format or media product. In my experience, some music labels do this really well. The other model is to take the entire media property and create a single, multi-use app where the website, the mobile app, and the tablet app are all one entity. Here, there are some great examples from the news media. The talk will examine case studies of both approaches and weigh their pros and cons. But the main takeaway is that with HTML5, the mobile website and the mobile app can be one and the same.