TechTalks from event: Technical session talks from ICRA 2012

The conference registration code needed to access these videos can be obtained by visiting this link: PaperPlaza. Step-by-step instructions for accessing the videos are here: step-by-step process.
Why are some of the videos missing? If you provided a consent form for your video to be published and it is still missing, please contact

Data Based Learning

  • Improving the Efficiency of Bayesian Inverse Reinforcement Learning Authors: Michini, Bernard; How, Jonathan
    Inverse reinforcement learning (IRL) is the task of learning the reward function of a Markov Decision Process (MDP) given knowledge of the transition function and a set of expert demonstrations. While many IRL algorithms exist, Bayesian IRL [1] provides a general and principled method of reward learning by casting the problem in the Bayesian inference framework. However, the algorithm as originally presented suffers from several inefficiencies that prohibit its use for even moderate problem sizes. This paper proposes modifications to the original Bayesian IRL algorithm to improve its efficiency and tractability in situations where the state space is large and the expert demonstrations span only a small portion of it. The key insight is that the inference task should be focused on states that are similar to those encountered by the expert, as opposed to making the naive assumption that the expert demonstrations contain enough information to accurately infer the reward function over the entire state space. A modified algorithm is presented and experimental results show substantially faster convergence while maintaining the solution quality of the original method.
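The abstract's key insight — focusing inference on states similar to those visited by the expert, rather than the whole state space — can be sketched as a neighborhood expansion over the MDP's transition graph. This is an illustrative reconstruction, not the authors' code; the dict-based transition representation and the `radius` parameter are our assumptions.

```python
from collections import deque

def demonstration_neighborhood(transitions, demo_states, radius):
    """BFS out to `radius` steps from the demonstrated states.

    transitions: dict mapping a state to an iterable of successor states
    Returns the set of states on which reward inference is focused.
    """
    focus = set(demo_states)
    frontier = deque((s, 0) for s in demo_states)
    while frontier:
        s, d = frontier.popleft()
        if d == radius:
            continue
        for nxt in transitions.get(s, ()):
            if nxt not in focus:
                focus.add(nxt)
                frontier.append((nxt, d + 1))
    return focus
```

A Bayesian IRL sampler would then evaluate the reward likelihood only on this focus set instead of the full state space.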
  • Learning Diffeomorphisms Models of Robotic Sensorimotor Cascades Authors: Censi, Andrea; Murray, Richard
    The problem of bootstrapping consists in designing agents that can learn from scratch the model of their sensorimotor cascade (the series of robot actuators, the external world, and the robot sensors) and use it to achieve useful tasks. In principle, we would want to design agents that can work for any robot dynamics and any robot sensor(s). One of the difficulties of this problem is the fact that the observations are very high dimensional, the dynamics is nonlinear, and there is a wide range of “representation nuisances” to which we would want the agent to be robust. In this paper, we model the dynamics of sensorimotor cascades using diffeomorphisms of the sensel space. We show that this model captures the dynamics of camera and range-finder data, that it can be used for long-term predictions, and that it can capture nonlinear phenomena such as a limited field of view. Moreover, by analyzing the learned diffeomorphisms it is possible to recover the “linear structure” of the dynamics in a manner which is independent of the commands representation.
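As a toy illustration of learning a diffeomorphism of the sensel space, the sketch below estimates, for a single command on a 1-D sensor array, the per-sensel displacement that best predicts the next observation. The paper works with camera and range-finder data; the function and its parameters here are our simplification.

```python
def learn_shift_field(pairs, max_shift=3):
    """Toy 1-D version of learning a sensel-space diffeomorphism: for each
    sensel i, find the displacement s whose source value best predicts the
    next observation across all (before, after) pairs for one command."""
    n = len(pairs[0][0])
    field = []
    for i in range(n):
        best, best_err = 0, float("inf")
        for s in range(-max_shift, max_shift + 1):
            j = i + s
            if not 0 <= j < n:
                continue
            err = sum((after[i] - before[j]) ** 2 for before, after in pairs)
            if err < best_err:
                best, best_err = s, err
        field.append(best)
    return field
```

Applying the learned field repeatedly gives the kind of long-term prediction the abstract describes; sensels whose source falls outside the array model a limited field of view.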
  • Interactive Generation of Dynamically Feasible Robot Trajectories from Sketches Using Temporal Mimicking Authors: Luo, Jingru; Hauser, Kris
    This paper presents a method for generating dynamically feasible, natural-looking robot motion from freehand sketches. Using trajectory optimization, it handles sketches that are too fast, too jerky, or pass out of reach by enforcing the constraints of the robot’s dynamic limitations while minimizing the relative temporal differences between the robot’s trajectory and the sketch. To make the optimization fast enough for interactive use, a variety of enhancements are employed, including decoupling the geometric and temporal optimizations and methods to select good initial trajectories. The technique is also applicable to transferring human motions onto robots with non-human appearance and dynamics, and we use our method to demonstrate a simulated humanoid imitating a golf swing as well as an industrial robot writing the word “hello” in cursive.
  • A Robot Path Planning Framework That Learns from Experience Authors: Berenson, Dmitry; Abbeel, Pieter; Goldberg, Ken
    We propose a framework, called Lightning, for planning paths in high-dimensional spaces that is able to learn from experience, with the aim of reducing computation time. This framework is intended for manipulation tasks that arise in applications ranging from domestic assistance to robot-assisted surgery. Our framework consists of two main modules, which run in parallel: a planning-from-scratch module, and a module that retrieves and repairs paths stored in a path library. After a path is generated for a new query, a library manager decides whether to store the path based on computation time and the generated path's similarity to the retrieved path. To retrieve an appropriate path from the library we use two heuristics that exploit two key aspects of the problem: (i) A correlation between the amount a path violates constraints and the amount of time needed to repair that path, and (ii) the implicit division of constraints into those that vary across environments in which the robot operates and those that do not. We evaluated an implementation of the framework on several tasks for the PR2 mobile manipulator and a minimally-invasive surgery robot in simulation. We found that the retrieve-and-repair module produced paths faster than planning-from-scratch in over 90% of test cases for the PR2 and in 58% of test cases for the minimally-invasive surgery robot.
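The library-manager decision described above (store a new path only if it was costly to compute and dissimilar from what retrieval offered) might be sketched as follows; the thresholds and the scalar path-distance metric are illustrative assumptions, not values from the paper.

```python
def path_distance(a, b):
    """Mean pointwise distance between two equal-length 1-D configuration paths."""
    assert len(a) == len(b)
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def maybe_store(library, new_path, retrieved_path, plan_time,
                time_threshold=1.0, similarity_threshold=0.2):
    """Toy version of the library-manager rule: keep a solved path only if
    it was expensive to compute and differs enough from what the library
    already offered."""
    if plan_time < time_threshold:
        return False
    if retrieved_path is not None and \
            path_distance(new_path, retrieved_path) < similarity_threshold:
        return False
    library.append(new_path)
    return True
```

In the real framework this decision runs after the planning-from-scratch and retrieve-and-repair modules race in parallel; here only the storage rule is shown.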
  • Evaluation of Commonsense Knowledge for Intuitive Robotic Service Authors: Ngo, Trung L.; Lee, Haeyeon; Mayama, Katsuhiro; Mizukawa, Makoto
    Human commonsense is required to improve the quality of robotic applications. However, to acquire the necessary knowledge, a robot needs to evaluate the appropriateness of the data it has collected. This paper presents an evaluation method that combines the weighting mechanism in commonsense databases with a set of weighting factors. The method was verified on our Basic-level Knowledge Network. We conducted a questionnaire to collect a commonsense data set and to estimate the weighting factors. Results showed that the proposed method was able to build a Robot Technology (RT) Ontology for a smart “Bring something” robotic service. More importantly, it allowed the robot to learn new knowledge when necessary. As an example, an intuitive human-robot interface application was developed based on our approach.
  • A Temporal Bayesian Network with Application to Design of a Proactive Robotic Assistant Authors: Kwon, Woo Young; Suh, Il Hong
    For effective human-robot interaction, a robot should be able to make predictions about future circumstances. This enables the robot to generate preparative behaviors to reduce waiting time, thereby greatly improving the quality of the interaction. In this paper, we propose a novel probabilistic temporal prediction method for proactive interaction that is based on a Bayesian network approach. In our proposed method, conditional probabilities of temporal events can be explicitly represented by defining temporal nodes in a Bayesian network. Utilizing these nodes, both temporal and causal information can be simultaneously inferred in a unified framework. An assistant robot can use the temporal Bayesian network to infer the best proactive action and the best time to act so that the waiting time for both the human and the robot is minimized. To validate our proposed method, we present experimental results for a case in which a robot assists in a human assembly task.
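As a minimal illustration of choosing when to act proactively, the sketch below picks the start time that minimizes the expected combined waiting time, given a discrete distribution over when the human will need assistance. The symmetric cost on early and late delivery is our assumption; the paper infers such timing from a temporal Bayesian network rather than a bare distribution.

```python
def best_start_time(t_probs, action_duration, candidate_starts):
    """Pick the start time minimizing expected combined waiting time.

    t_probs: dict mapping the time the human will need the part -> probability
    A delivery at s + action_duration makes the human wait if late and the
    robot wait if early; both are weighted equally (an illustrative choice).
    """
    def expected_wait(s):
        finish = s + action_duration
        return sum(p * abs(finish - t) for t, p in t_probs.items())
    return min(candidate_starts, key=expected_wait)
```

For example, if the human needs a part at time 5 or 7 with equal probability and the fetch takes 2 time units, starting at time 3 (delivering at 5) already minimizes the expected wait.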

Vision-Based Attention and Interaction

  • Computing Object-Based Saliency in Urban Scenes Using Laser Sensing Authors: Zhao, Yipu; He, Mengwen; Zhao, Huijing; Davoine, Franck; Zha, Hongbin
    It is now well established that a low-level map of a complex environment, consisting of 3D laser points, can be generated using a robot with laser scanners. Given a cloud of 3D laser points of an urban scene, this paper proposes a method for locating objects of interest, e.g. traffic signs or road lamps, by computing object-based saliency. Our major contributions are: 1) a method for extracting simple geometric features from laser data is developed, where both range images and 3D laser points are analyzed; 2) an object is modeled as a graph that describes the composition of its geometric features; 3) a graph-matching-based method is developed to locate the objects of interest in laser data. Experimental results on real laser data depicting urban scenes are presented; the efficiency as well as the limitations of the method are discussed.
  • Where Do I Look Now? Gaze Allocation During Visually Guided Manipulation Authors: Nunez-Varela, Jose; Ravindran, Balaraman; Wyatt, Jeremy
    In this work we present principled methods for the coordination of a robot's oculomotor system with the rest of its body motor systems. The problem is to decide which physical actions to perform next and where the robot's gaze should be directed in order to gain information that is relevant to the success of its physical actions. Previous work on this problem has shown that a reward-based coordination mechanism provides an efficient solution. However, that approach does not allow the robot to move its gaze to different parts of the scene, it considers the robot to have only one motor system, and assumes that the actions have the same duration. The main contributions of our work are to extend that previous reward-based approach by making decisions about where to fixate the robot's gaze, handling multiple motor systems, and handling actions of variable duration. We compare our approach against two common baselines: random and round robin gaze allocation. We show how our method provides a more effective strategy to allocate gaze where it is needed the most.
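A minimal sketch of reward-based gaze allocation: fixate the target whose observation is expected to contribute the most to the ongoing physical actions. The scoring rule (probability of relevance times reward gain) is an illustrative stand-in for the paper's formulation, which additionally handles multiple motor systems and variable action durations.

```python
def choose_fixation(targets):
    """Reward-based gaze allocation: pick the fixation target with the
    highest expected value for the current physical actions.

    targets: dict mapping a fixation target to (p_relevant, reward_gain),
    both hypothetical quantities for this sketch.
    """
    return max(targets, key=lambda t: targets[t][0] * targets[t][1])
```

The random and round-robin baselines the paper compares against would ignore these scores entirely.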
  • 3D AAM Based Face Alignment under Wide Angular Variations Using 2D and 3D Data Authors: Wang, Chieh-Chih
    Active Appearance Models (AAMs) are widely used to estimate the shape of the face together with its orientation, but AAM approaches tend to fail when the face is under wide angular variations. Although it is feasible to capture the overall 3D face structure using 3D data from range cameras, the locations of facial features are often estimated imprecisely or incorrectly due to depth measurement uncertainty. Face alignment using 2D and 3D images suffers from different issues and has varying reliability in different situations. Existing approaches introduce a weighting function to balance 2D and 3D alignments, in which the weighting function is tuned manually and the sensor characteristics are not taken into account. In this paper, we propose to balance 3D face alignment using 2D and 3D data based on the observed data and the sensors' characteristics. The feasibility of wide-angle face alignment is demonstrated using two different sets of depth and conventional cameras. The experimental results show that a stable alignment is achieved, with a maximum improvement of 26% compared to 3D AAM using 2D images and a 30% improvement over state-of-the-art 3DMM methods in terms of 3D head pose estimation.
  • Robots That Validate Learned Perceptual Models Authors: Klank, Ulrich; Mösenlechner, Lorenz; Maldonado, Alexis; Beetz, Michael
    Service robots that should operate autonomously need to perform actions reliably and be able to adapt to their changing environment using learning mechanisms. Ideally, robots should learn continuously, but this approach often suffers from problems like over-fitting, drift, or incomplete data. In this paper, we propose a method to automatically validate autonomously acquired perception models. These perception models are used to localize objects in the environment with the intention of manipulating them with the robot. Our approach verifies the learned perception models by moving the robot, trying to re-detect an object, and then trying to grasp it. From observable failures of these actions and high-level loop closures that validate eventual success, we can derive certain qualities of our models and our environment. We evaluate our approach using two different detection algorithms, one based on 2D RGB data and one based on 3D point clouds. We show that our system is able to improve perception performance significantly by learning which of the models is better in a certain situation and a specific context. We show how the additional validation allows for successful continuous learning. The strictest precondition for learning such perceptual models is correct segmentation of objects, which is evaluated in a second experiment.
  • Uncalibrated Visual Servoing for Intuitive Human Guidance of Robots Authors: Marshall, Matthew; Matthews, James; Hu, Ai-Ping; McMurray, Gary
    We propose a novel implementation of visual servoing whereby a human operator can guide a robot relative to the coordinate frame of an eye-in-hand camera. Among other applications, this can allow the operator to work in the image space of the eye-in-hand camera. This is achieved using a gamepad, a time-of-flight camera (an active sensor that creates depth data), and recursive least-squares update with Gauss-Newton control. Contributions of this paper include the use of a person to cause the control action in a visual-servoing system, and the introduction of uncalibrated position-based visual servoing. The system's efficacy is evaluated via trials involving human operators in different scenarios.
  • Leveraging RGB-D Data: Adaptive Fusion and Domain Adaptation for Object Detection Authors: Spinello, Luciano; Luber, Matthias; Arras, Kai Oliver
    Vision and range sensing belong to the richest sensory modalities for perception in robotics and related fields. This paper addresses the problem of how to best combine image and range data for the task of object detection. In particular, we propose a novel adaptive fusion approach, hierarchical Gaussian Process mixtures of experts, able to account for missing information and cross-cue data consistency. The hierarchy is a two-tier architecture that, for each modality, each frame, and each detection, computes a weight function using Gaussian Processes that reflects the confidence of the respective information. We further propose a method called cross-cue domain adaptation that makes use of large image data sets to improve the depth-based object detector, for which only a few training samples exist. In experiments that include a comparison with alternative sensor fusion schemes, we demonstrate the viability of the proposed methods and achieve significant improvements in classification accuracy.

Control and Planning for UAVs

  • Deploying the Max-Sum Algorithm for Decentralised Coordination and Task Allocation of Unmanned Aerial Vehicles for Live Aerial Imagery Collection Authors: Delle Fave, Francesco Maria; Rogers, Alex; Xu, Zhe; Sukkarieh, Salah; Jennings, Nick
    We introduce a new technique for coordinating teams of unmanned aerial vehicles (UAVs) when deployed to collect live aerial imagery of the scene of a disaster. We define this problem as one of task assignment where the UAVs dynamically coordinate over tasks representing the imagery collection requests. To measure the quality of the assignment of one or more UAVs to a task, we propose a novel utility function which encompasses several constraints, such as the task's importance and the UAVs' battery capacity so as to maximise performance. We then solve the resulting optimisation problem using a fully asynchronous and decentralised implementation of the max-sum algorithm, a well known message passing algorithm previously used only in simulated domains. Finally, we evaluate our approach both in simulation and on real hardware. First, we empirically evaluate our utility and show that it yields a better trade off between the quantity and quality of completed tasks than similar utilities that do not take all the constraints into account. Second, we deploy it on two hexacopters and assess its practical viability in the real world.
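A hedged sketch of the kind of utility described above, combining task importance with the UAV's battery capacity, together with a centralised greedy assignment used here as a stand-in for the decentralised max-sum solver (which instead exchanges messages between UAVs). The functional forms and parameter names are our assumptions, not the paper's.

```python
def task_utility(importance, travel_time, battery_remaining, cost_per_s,
                 completion_value=1.0):
    """Illustrative utility for assigning a UAV to an imagery task: value
    scales with task importance and decays with travel time; assignments
    the battery cannot support get -inf."""
    if travel_time * cost_per_s > battery_remaining:
        return float("-inf")
    return importance * completion_value / (1.0 + travel_time)

def greedy_assign(uavs, tasks):
    """Centralised greedy stand-in for decentralised max-sum: each UAV
    takes the feasible task of highest utility.

    uavs:  dict name -> (travel_times per task, battery, cost_per_s)
    tasks: dict task -> importance
    """
    assignment = {}
    for name, (travel_times, battery, cost) in uavs.items():
        scored = {t: task_utility(imp, travel_times[t], battery, cost)
                  for t, imp in tasks.items()}
        assignment[name] = max(scored, key=scored.get)
    return assignment
```

Max-sum would reach a joint assignment by passing utility messages over a factor graph between neighbouring UAVs; the greedy loop above only captures the per-agent scoring step.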
  • Mixed-Integer Quadratic Program Trajectory Generation for Heterogeneous Quadrotor Teams Authors: Mellinger, Daniel; Kushleyev, Aleksandr; Kumar, Vijay
    We present an algorithm for the generation of optimal trajectories for teams of heterogeneous quadrotors in three-dimensional environments with obstacles. We formulate the problem using mixed-integer quadratic programs (MIQPs) where the integer constraints are used to enforce collision avoidance. The method allows for different sizes, capabilities, and varying dynamic effects between different quadrotors. Experimental results illustrate the method applied to teams of up to four quadrotors ranging from 65 to 962 grams and 21 to 67 cm in width following trajectories in three-dimensional environments with obstacles with accelerations approaching 1g.
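In MIQP formulations of this kind, collision avoidance is typically encoded as a disjunction: at every timestep, each pair of vehicles must be separated by a minimum distance along at least one axis, and the integer variables select which constraint is active. The check below tests that disjunction directly; the optimization itself would require an MIQP solver such as Gurobi or CPLEX, and a per-pair `min_dist` can model heterogeneous vehicle sizes.

```python
def separated(p1, p2, min_dist):
    """True if two positions satisfy the axis-separation disjunction that
    the MIQP's integer variables encode."""
    return any(abs(a - b) >= min_dist for a, b in zip(p1, p2))

def trajectory_collision_free(traj1, traj2, min_dist):
    """Check the disjunction at every common timestep of two trajectories."""
    return all(separated(p1, p2, min_dist)
               for p1, p2 in zip(traj1, traj2))
```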
  • Safety Verification of Reactive Controllers for UAV Flight in Cluttered Environments Using Barrier Certificates Authors: Barry, Andrew J.; Majumdar, Anirudha; Tedrake, Russ
    Unmanned aerial vehicles (UAVs) have a so-far untapped potential to operate at high speeds through cluttered environments. Many of these systems are limited by their ad-hoc reactive controllers using simple visual cues like optical flow. Here we consider the problem of formally verifying an output-feedback controller for an aircraft operating in an unknown environment. Using recent advances in sums-of-squares programming that allow for efficient computation of barrier functions, we search for global certificates of safety for the closed-loop system in a given environment. In contrast to previous work, we use rational functions to globally approximate non-smooth dynamics and use multiple barrier functions to guard against more than one obstacle. We expect that these formal verification techniques will allow for the comparison, and ultimately optimization, of reactive controllers for robustness to varying initial conditions and environments.
  • On-board Velocity Estimation and Closed-loop Control of a Quadrotor UAV based on Optical Flow Authors: Grabe, Volker; Buelthoff, Heinrich H.; Robuffo Giordano, Paolo
    Robot vision has become a field of increasing importance in micro aerial vehicle robotics with the availability of small and light hardware. While most approaches rely on external ground stations because of the need for high computational power, we present a fully autonomous setup using only on-board hardware. Our work is based on the continuous homography constraint to recover ego-motion from optical flow. We are thus able to provide an efficient fallback routine for any kind of UAV (unmanned aerial vehicle), since we rely solely on a monocular camera and on-board computation. In particular, we devised two variants of the classical continuous 4-point algorithm and provide an extensive experimental evaluation against a known ground truth. The results show that our approach is able to recover the ego-motion of a flying UAV in realistic conditions while relying only on the limited on-board computational power. Furthermore, we exploit the velocity estimate to close the loop and control the motion of the UAV online.
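As a simplified cousin of the continuous-homography approach, the sketch below recovers camera linear velocity from optical flow by least squares, assuming pure translation and known per-point depth — assumptions the paper's 4-point algorithm avoids by exploiting a planar scene. Coordinates are normalized image coordinates; the sign convention is one common choice.

```python
import numpy as np

def velocity_from_flow(points, flows, depths):
    """Least-squares linear velocity (vx, vy, vz) from optical flow,
    assuming a purely translating camera and known per-point depth.

    points: list of (x, y) normalized image coordinates
    flows:  list of (u, v) measured flow, modeled as
            u = (x*vz - vx)/z,  v = (y*vz - vy)/z
    depths: list of per-point scene depths z
    """
    rows, rhs = [], []
    for (x, y), (u, v), z in zip(points, flows, depths):
        rows.append([-1.0 / z, 0.0, x / z]); rhs.append(u)
        rows.append([0.0, -1.0 / z, y / z]); rhs.append(v)
    vel, *_ = np.linalg.lstsq(np.asarray(rows), np.asarray(rhs), rcond=None)
    return vel
```

Three non-degenerate points already determine the velocity; with more points the least-squares fit averages out flow noise.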
  • Visual Terrain Classification by Flying Robots Authors: Khan, Yasir Niaz; Masselli, Andreas; Zell, Andreas
    In this paper we investigate the effectiveness of SURF features for visual terrain classification for outdoor flying robots. A quadrocopter fitted with a single camera is flown over different terrains to take images of the ground below. Each image is divided into a grid, and SURF features are calculated at the grid intersections. A classifier is then trained to differentiate between different terrain types. Classification results of the SURF descriptor are compared with results from other texture descriptors such as Local Binary Patterns and Local Ternary Patterns. Six different terrain types are considered in this approach. Random forests are used for classification on each descriptor. It is shown that SURF features perform better than the other descriptors at higher resolutions.
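For reference, one of the baseline descriptors mentioned above, the Local Binary Pattern, is simple enough to sketch directly: each pixel is encoded by thresholding its 8 neighbours against its own grey value. The bit ordering below is one common convention; the classification stage (random forests over grid-sampled descriptors) is omitted.

```python
def lbp_code(img, r, c):
    """8-neighbour Local Binary Pattern code of pixel (r, c).

    img is a 2D list of grey values; (r, c) must not lie on the border.
    Neighbours are visited clockwise from the top-left, and each sets one
    bit of the code when its value is >= the center value.
    """
    center = img[r][c]
    neighbours = [img[r-1][c-1], img[r-1][c], img[r-1][c+1], img[r][c+1],
                  img[r+1][c+1], img[r+1][c], img[r+1][c-1], img[r][c-1]]
    code = 0
    for bit, n in enumerate(neighbours):
        if n >= center:
            code |= 1 << bit
    return code
```

A terrain descriptor is then typically a histogram of these codes over an image patch, one per grid cell.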
  • Real-Time Decentralized Search with Inter-Agent Collision Avoidance Authors: Gan, Seng Keat; Fitch, Robert; Sukkarieh, Salah
    This paper addresses the problem of coordinating a team of mobile autonomous sensor agents performing a cooperative mission while explicitly avoiding inter-agent collisions during the team negotiation process. Many multi-agent cooperative approaches disregard potential hazards between agents, which are an important consideration for many systems, especially airborne ones. In this work, team negotiation is performed using a decentralized gradient-based optimization approach, while safety-distance constraints are explicitly designed and handled using Lagrangian multiplier methods. The novelty of our work is the demonstration of a decentralized form of inter-agent collision avoidance in the loop of the agents' real-time group mission optimization process, where the algorithm retains the properties of the original mission while minimizing the probability of inter-agent collisions. An explicit constraint-gradient formulation is derived and used to enhance computational efficiency and solution accuracy. The effectiveness and robustness of our algorithm have been verified in a simulated environment by coordinating a team of UAVs searching for targets in a large-scale environment.
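The Lagrangian-multiplier handling of safety-distance constraints can be illustrated on a 1-D toy problem: two agents each pull toward a goal while a dual variable enforces a minimum separation. The analytic inner minimisation and the fixed left/right ordering of the agents are simplifications for this sketch, not the paper's formulation.

```python
def constrained_meeting_points(g1, g2, d, steps=200, alpha=0.5):
    """Dual (Lagrangian-multiplier) method for the toy problem

        min (x1 - g1)^2 + (x2 - g2)^2   s.t.   x2 - x1 >= d

    Assumes agent 2 stays to the right of agent 1 so the safety-distance
    constraint is linear. Returns the constrained optimal positions.
    """
    lam = 0.0
    for _ in range(steps):
        # minimising the Lagrangian over x is analytic for this quadratic
        x1 = g1 - lam / 2.0
        x2 = g2 + lam / 2.0
        # dual ascent on the constraint residual, projected to lam >= 0
        lam = max(0.0, lam + alpha * (d - (x2 - x1)))
    return x1, x2
```

When the goals are closer than `d`, the converged multiplier pushes the agents symmetrically apart until the separation constraint is tight — the same mechanism, in many dimensions and with many pairs, that keeps the UAVs apart during negotiation.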