TechTalks from event: Technical session talks from ICRA 2012

The conference registration code needed to access these videos is available via this link: PaperPlaza. Step-by-step instructions for accessing the videos are here: step-by-step process.
Why are some of the videos missing? If you provided a consent form for your video to be published and it is still missing, please contact

Navigation and Visual Sensing

  • Navigation among Visually Connected Sets of Partially Distinguishable Landmarks Authors: Erickson, Lawrence H; LaValle, Steven M
    A robot navigates in a polygonal region populated by a set of partially distinguishable landmarks. The robot's motion primitives consist of actions of the form ``drive toward a landmark of class x''. To effectively navigate, the robot must always be able to see a landmark. Also, if the robot sees two landmarks of the same class, its motion primitives become ambiguous. Finally, if the robot wishes to navigate from landmark s_0 to landmark s_{goal} with a simple graph search algorithm, then there must be a sequence of landmarks [s_0,s_1,s_2,...,s_k=s_{goal}] in which landmark s_i is visible from s_{i-1}. Given these three conditions, how many landmark classes are required for navigation in a given polygon P? We call this minimum number of landmark classes the connected landmark class number, denoted chi_{CL}(P). We study this problem for monotone polygons, an important family of polygons that are frequently generated as intermediate steps in other decomposition algorithms. We demonstrate that for all odd k, there exists a monotone polygon M_k with (3/4)(k^2+2k+1) vertices such that chi_{CL}(M_k) ≥ k. We also demonstrate that for any n-vertex monotone polygon P, chi_{CL}(P) ≤ n/3 + 12.
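The navigation condition above amounts to searching a graph whose edges connect mutually visible landmarks. As a minimal illustration (not the authors' algorithm), a breadth-first search over a hypothetical precomputed landmark-visibility graph recovers such a sequence:

```python
from collections import deque

def landmark_route(visibility, start, goal):
    """Breadth-first search for a chain [s_0, ..., s_goal] in which
    each landmark is visible from its predecessor.

    visibility: dict mapping each landmark to the set of landmarks
    visible from it (a hypothetical precomputed visibility graph).
    """
    parent = {start: None}
    queue = deque([start])
    while queue:
        s = queue.popleft()
        if s == goal:
            # Reconstruct the landmark sequence back to the start.
            path = []
            while s is not None:
                path.append(s)
                s = parent[s]
            return path[::-1]
        for t in visibility.get(s, ()):
            if t not in parent:
                parent[t] = s
                queue.append(t)
    return None  # goal not reachable through visually connected landmarks
```
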
  • Natural Landmark Extraction in Cluttered Forested Environments Authors: Song, Meng; Sun, Fengchi; Iagnemma, Karl
    In this paper, a new systematic method for extracting tree trunk landmarks from 3D point clouds of cluttered forested environments is proposed. This purely geometric method is based on scene understanding and automatic analysis of trees. The pipeline of our method consists of three steps. First, the raw point clouds are segmented by exploiting the circular shape of trees, and segments are grouped into tree sections based on the principle of spatial proximity. Second, circles and axes are extracted from tree sections, which are subject to loss of shape information. Third, by clustering and integrating the tree sections resulting from various spatial inconsistencies, straight tree trunk landmarks are finally formed for subsequent localization. Experimental results from real forested environments are presented.
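The circle-extraction step can be illustrated with the classic Kasa algebraic least-squares circle fit applied to one horizontal slice of trunk points (a hedged sketch; the paper's exact fitting procedure may differ):

```python
import numpy as np

def fit_circle(points):
    """Algebraic (Kasa) least-squares circle fit to 2D points.

    Solves  x^2 + y^2 + a*x + b*y + c = 0  for (a, b, c), then
    recovers center (cx, cy) and radius r. `points` is an (N, 2)
    array, e.g. one horizontal slice of a tree-trunk point cloud.
    """
    pts = np.asarray(points, dtype=float)
    x, y = pts[:, 0], pts[:, 1]
    A = np.column_stack([x, y, np.ones_like(x)])
    rhs = -(x**2 + y**2)
    (a, b, c), *_ = np.linalg.lstsq(A, rhs, rcond=None)
    cx, cy = -a / 2.0, -b / 2.0
    r = np.sqrt(cx**2 + cy**2 - c)
    return cx, cy, r
```
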
  • Rapid Vanishing Point Estimation for General Road Detection Authors: Miksik, Ondrej
    This paper deals with fast vanishing point estimation for autonomous robot navigation. Preceding approaches showed suitable results, and vanishing point estimation has been used in many robotics tasks, especially the detection of ill-structured roads. The main drawback of such approaches is their computational complexity. Many papers mention the possibility of hardware acceleration; however, we believe that the biggest benefit of a vanishing point estimation algorithm is for primarily tele-operated robots (e.g., in the case of signal loss) that cannot use specialized hardware just for this feature. In this paper, we investigate an efficient implementation based on expanding Gabor wavelets into a linear combination of Haar-like box functions to perform fast filtering via the integral image trick, and we discuss the utilization of superpixels in the voting scheme to provide a significant speed-up (more than 40 times) while losing only 3-5% in precision.
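The speed-up rests on the integral-image trick: once a summed-area table is built, any axis-aligned box (and hence any Haar-like combination of boxes approximating a Gabor wavelet) can be summed in constant time, independent of box size. A minimal sketch with hypothetical inputs:

```python
import numpy as np

def integral_image(img):
    """Summed-area table with a zero first row/column for easy indexing."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1))
    ii[1:, 1:] = np.cumsum(np.cumsum(img, axis=0), axis=1)
    return ii

def box_sum(ii, r0, c0, r1, c1):
    """Sum of img[r0:r1, c0:c1] in O(1). A Haar-like filter response
    is a signed combination of a few such box sums."""
    return ii[r1, c1] - ii[r0, c1] - ii[r1, c0] + ii[r0, c0]
```
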
  • A New Tentacles-Based Technique for Avoiding Obstacles During Visual Navigation Authors: Cherubini, Andrea; Spindler, Fabien; Chaumette, Francois
    In this paper, we design and validate a new tentacle-based approach for avoiding obstacles during appearance-based navigation with a wheeled mobile robot. In the past, we have developed a framework for safe visual navigation. The robot follows a path represented as a set of key images, and during obstacle circumnavigation, the on-board camera is actuated to maintain scene visibility. In those works, the model used for obstacle avoidance was obtained using a potential vector field. Here, a more sophisticated and efficient method that exploits the robot's kinematic model and predicts collisions at look-ahead distances is designed and integrated into that framework. Outdoor experiments comparing the two models show that the new approach presents many advantages. Higher speeds and precision can be attained, very cluttered scenarios involving large obstacles can be successfully dealt with, and the control inputs are smoother.
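The look-ahead collision prediction can be illustrated with a toy unicycle-model "tentacle" check (an assumption-laden sketch with invented parameters, not the authors' implementation):

```python
import math

def tentacle_collides(v, omega, obstacles, horizon=2.0, dt=0.05, clearance=0.3):
    """Roll out the arc of a candidate (v, omega) tentacle with the
    unicycle kinematic model, and flag a collision if any look-ahead
    pose comes within `clearance` of an obstacle point."""
    x = y = theta = 0.0
    t = 0.0
    while t < horizon:
        # Forward-Euler integration of the unicycle model.
        x += v * math.cos(theta) * dt
        y += v * math.sin(theta) * dt
        theta += omega * dt
        for ox, oy in obstacles:
            if math.hypot(ox - x, oy - y) < clearance:
                return True
        t += dt
    return False
```

A planner would evaluate a fan of such tentacles and keep the collision-free one closest to the visual path.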
  • Maintaining visibility constraints during tele-echography with ultrasound visual servoing Authors: LI, Tao; Kermorgant, Olivier; Krupa, Alexandre
    This paper presents a multi-task control method for maintaining the visibility of an anatomic element of interest while a doctor tele-operates a 2D ultrasound probe held by a medical robot. The primary task automatically maintains several visual constraints that guarantee an intersection between the ultrasound image plane and the anatomic object of interest, while the secondary task allows the medical expert to manually apply probe motions through tele-operation. Unlike classical visual servoing techniques, which continually regulate the current visual features to desired values, our control approach gradually activates the regulation of one or several ultrasound visual features as they approach fixed limits, in such a way as to keep them in a safe domain. The main advantage of this approach is that it gives the clinician control of all the degrees of freedom of the probe to examine the patient while automatically preserving the visibility of the element of interest when required. Both simulations and experiments performed on an abdominal phantom demonstrate the efficiency of the visibility assistance task.
  • Simple and Robust Visual Servo Control of Robot Arms Using an On-Line Trajectory Generator Authors: Kroeger, Torsten; Padial, Jose
    Common visual servoing methods use image features to define an error signal in the feedback loops of robot motion controllers. This paper suggests a new visual servo control scheme that uses an on-line trajectory generator as an intermediate layer between image processing algorithms and robot motion controllers. The motion generation algorithm is capable of computing an entire trajectory from an arbitrary initial state of motion within one servo control cycle (typically one millisecond or less). This algorithm is fed with desired pose and velocity signals that are generated by an image processing algorithm. The advantages of this new architecture are: (a) jerk-limited and continuous motions are guaranteed independently of image processing signals, (b) kinematic motion constraints as well as physical and/or artificial workspace limits can be directly considered, and (c) the system can instantaneously and safely react to sensor failures (e.g., if cameras are covered or image processing fails). Real-world experimental results using a seven-joint robot arm are presented to underline the relevance for the field of robust sensor-guided robot motion control.
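For intuition, here is a greatly simplified one-dimensional stand-in for such an online setpoint generator: acceleration-limited only, whereas the actual algorithm is jerk-limited and multi-dimensional. Each control cycle it moves the setpoint toward the latest vision-supplied target, braking along a v = sqrt(2*a*d) curve:

```python
def step_toward(pos, vel, target, vmax, amax, dt):
    """One control-cycle update of an online setpoint generator
    (simplified: 1-DOF, acceleration-limited, no jerk limit).
    Returns the next (pos, vel) setpoint moving toward `target`."""
    dist = target - pos
    direction = 1.0 if dist >= 0 else -1.0
    # Fastest speed from which we can still brake in time: v^2 = 2*a*d.
    v_brake = (2.0 * amax * abs(dist)) ** 0.5
    v_des = direction * min(vmax, v_brake)
    # Limit acceleration while tracking the desired velocity.
    dv = max(-amax * dt, min(amax * dt, v_des - vel))
    vel += dv
    pos += vel * dt
    return pos, vel
```

Because the update is cheap, the target can change on every cycle (e.g., when a new image arrives) without discontinuities in the commanded motion.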


  • 3-D Mutual Localization with Anonymous Bearing Measurements Authors: Cognetti, Marco; Stegagno, Paolo; Franchi, Antonio; Oriolo, Giuseppe; Buelthoff, Heinrich H.
    We present a decentralized algorithm for estimating mutual 3-D poses in a group of mobile robots, such as a team of UAVs. Our algorithm uses bearing measurements reconstructed, e.g., by a visual sensor, and inertial measurements coming from the robot IMU. Since identification of a specific robot in a group would require visual tagging and may be cumbersome in practice, we simply assume that the bearing measurements are anonymous. The proposed localization method is a non-trivial extension of our previous algorithm for the 2-D case, and exhibits similar performance and robustness. An experimental validation of the algorithm has been performed using quadrotor UAVs.
  • Online Model Estimation of Ultra-Wideband TDOA Measurements for Mobile Robot Localization Authors: Prorok, Amanda; Gonon, Lukas; Martinoli, Alcherio
    Ultra-wideband (UWB) localization is a recent technology that promises to outperform many indoor localization methods currently available. Yet, non-line-of-sight (NLOS) positioning scenarios can create large biases in the time-difference-of-arrival (TDOA) measurements, and must be addressed with accurate measurement models in order to avoid significant localization errors. In this work, we first develop an efficient, closed-form TDOA error model and analyze its estimation characteristics by calculating the Cramer-Rao lower bound (CRLB). We subsequently detail how an online Expectation Maximization (EM) algorithm is adopted to find an elegant formalism for the maximum likelihood estimate of the model parameters. We perform real experiments on a mobile robot equipped with a UWB emitter, and show that the online estimation algorithm leads to excellent localization performance due to its ability to adapt to the varying NLOS path conditions over time.
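The EM fitting step can be illustrated with a generic batch EM for a 1-D two-component Gaussian mixture, a plausible stand-in for separating line-of-sight from NLOS-biased TDOA errors (the paper's online, closed-form model is more elaborate):

```python
import numpy as np

def em_two_gaussians(x, iters=100):
    """Batch EM for a 1-D two-component Gaussian mixture.
    Returns (weights, means, stds)."""
    x = np.asarray(x, dtype=float)
    # Crude initialization from the data spread.
    mu = np.array([x.min(), x.max()])
    sigma = np.array([x.std(), x.std()]) + 1e-6
    w = np.array([0.5, 0.5])
    for _ in range(iters):
        # E-step: responsibility of each component for each sample.
        pdf = (w / (sigma * np.sqrt(2 * np.pi))
               * np.exp(-0.5 * ((x[:, None] - mu) / sigma) ** 2))
        resp = pdf / pdf.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, and standard deviations.
        nk = resp.sum(axis=0)
        w = nk / len(x)
        mu = (resp * x[:, None]).sum(axis=0) / nk
        sigma = np.sqrt((resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk) + 1e-9
    return w, mu, sigma
```
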
  • Orientation Only Loop-Closing with Closed-Form Trajectory Bending Authors: Dubbelman, Gijs; Browning, Brett; Hansen, Peter; Dias, M. Bernardine
    In earlier work, closed-form trajectory bending was shown to provide an efficient and accurate out-of-core solution for loop-closing exactly sparse trajectories. Here we extend it to fuse exactly sparse trajectories, obtained from relative pose estimates, with absolute orientation data. This allows us to close the loop using absolute orientation data only. The benefit is that our approach relies neither on the observations from which the trajectory was estimated nor on the probabilistic links between poses in the trajectory. It is therefore highly efficient. The proposed method is compared against regular fusion and an iterative trajectory bending solution using a 5 km long urban trajectory. Proofs concerning the optimality of our method are provided.
  • Capping Computation Time and Storage Requirements for Appearance-Based Localization with CAT-SLAM Authors: Maddern, William; Milford, Michael J; Wyeth, Gordon
    Appearance-based localization is increasingly used for loop closure detection in metric SLAM systems. Since it relies only upon the appearance-based similarity between images from two locations, it can perform loop closure regardless of accumulated metric error. However, the computation time and memory requirements of current appearance-based methods scale linearly not only with the size of the environment but also with the operation time of the platform. These properties impose severe restrictions on long-term autonomy for mobile robots, as loop closure performance will inevitably degrade with increased operation time. We present a set of improvements to the appearance-based SLAM algorithm CAT-SLAM to constrain computation scaling and memory usage with minimal degradation in performance over time. The appearance-based comparison stage is accelerated by exploiting properties of the particle observation update, and nodes in the continuous trajectory map are removed according to minimal information loss criteria. We demonstrate constant time and space loop closure detection in a large urban environment with recall performance exceeding FAB-MAP by a factor of 3 at 100% precision, and investigate the minimum computational and memory requirements for maintaining mapping performance.
  • Improving the Accuracy of EKF-Based Visual-Inertial Odometry Authors: Li, Mingyang; Mourikis, Anastasios
    In this paper, we perform a rigorous analysis of EKF-based visual-inertial odometry (VIO) and present a method for improving its performance. Specifically, we examine the properties of EKF-based VIO, and show that the standard way of computing Jacobians in the filter inevitably causes inconsistency and loss of accuracy. This result is derived based on an observability analysis of the EKF's linearized system model, which proves that the yaw erroneously appears to be observable. In order to address this problem, we propose modifications to the multi-state constraint Kalman filter (MSCKF) algorithm, which ensure the correct observability properties without incurring additional computational cost. Extensive simulation tests and real-world experiments demonstrate that the modified MSCKF algorithm outperforms competing methods, both in terms of consistency and accuracy.

Perception for Autonomous Vehicles

  • Active Perception for Autonomous Vehicles Authors: Unterholzner, Alois; Himmelsbach, Michael; Wuensche, Hans J
    Precise perception of a vehicle's surroundings is crucial for safe autonomous driving. It requires a high sensor resolution and a large field of view. Active perception, i.e. the redirection of a sensor's focus of attention, is an approach that provides both. With active perception, however, the selection of an appropriate sensor orientation becomes necessary. This paper presents a method for determining the sensor orientation in urban traffic scenarios based on three criteria: the importance of traffic participants w.r.t. the current situation, the information available about traffic participants under alternative sensor orientations, and the sensor coverage of the vehicle's relevant surrounding area.
  • A Probabilistic Framework for Car Detection in Images using Context and Scale Authors: Held, David; Levinson, Jesse; Thrun, Sebastian
    Detecting cars in real-world images is an important task for autonomous driving, yet it remains unsolved. The system described in this paper takes advantage of context and scale to build a monocular single-frame image-based car detector that significantly outperforms the baseline. The system uses a probabilistic model to combine multiple forms of evidence for both context and scale to locate cars in a real-world image. We also use scale filtering to speed up our algorithm by a factor of 3.3 compared to the baseline. By using a calibrated camera and localization on a road map, we are able to obtain context and scale information from a single image without the use of a 3D laser. The system outperforms the baseline by an absolute 9.4% in overall average precision and 11.7% in average precision for cars smaller than 50 pixels in height, for which context and scale cues are especially important.
  • Real-Time Topometric Localization Authors: Badino, Hernan; Huber, Daniel; Kanade, Takeo
    Autonomous vehicles must be capable of localizing even in GPS-denied situations. In this paper, we propose a real-time method to localize a vehicle along a route using visual imagery or range information. Our approach is an implementation of topometric localization, which combines the robustness of topological localization with the geometric accuracy of metric methods. We construct a map by navigating the route using a GPS-equipped vehicle and building a compact database of simple visual and 3D features. We then localize using a Bayesian filter to match sequences of visual or range measurements to the database. The algorithm is reliable across wide environmental changes, including lighting differences, seasonal variations, and occlusions, achieving an average localization accuracy of 1 m over an 8 km route. The method converges correctly even with incorrect initial position estimates, solving the kidnapped robot problem.
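The filtering idea can be sketched as one step of a discrete Bayes filter over positions along the route (hypothetical likelihood scores; the paper's measurement model is richer):

```python
import numpy as np

def bayes_filter_step(belief, likelihood, motion_sigma=1.0):
    """One topometric localization update: predict motion along the
    route, then weight by the appearance likelihood of the current
    measurement.

    belief: probability over discretized route positions (sums to 1).
    likelihood: P(measurement | position), e.g. from matching features
    against the pre-built route database (hypothetical scores here).
    """
    n = len(belief)
    # Predict: blur the belief with a Gaussian kernel centered on a
    # nominal forward motion of one route position per step.
    offsets = np.arange(-2, 5)
    kernel = np.exp(-0.5 * ((offsets - 1) / motion_sigma) ** 2)
    kernel /= kernel.sum()
    predicted = np.zeros(n)
    for o, k in zip(offsets, kernel):
        predicted += k * np.roll(belief, o)  # roll wraps; fine for a sketch
    # Update: multiply by the measurement likelihood and renormalize.
    posterior = predicted * likelihood
    return posterior / posterior.sum()
```
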
  • SeqSLAM: Visual Route-Based Navigation for Sunny Summer Days and Stormy Winter Nights Authors: Milford, Michael J; Wyeth, Gordon
    Learning and then recognizing a route, whether travelled during the day or at night, in clear or inclement weather, and in summer or winter, is a challenging task for state-of-the-art algorithms in computer vision and robotics. In this paper, we present a new approach to visual navigation under changing conditions dubbed SeqSLAM. Instead of calculating the single most likely location given the current image, our approach calculates the best candidate matching location within every local navigation sequence. Localization is then achieved by recognizing coherent sequences of these “local best matches”. This approach removes the need for global matching performance by the vision front-end – instead it must only pick the best match within any short sequence of images. The approach is applicable over environment changes that render traditional feature-based techniques ineffective. Using two car-mounted camera datasets we demonstrate the effectiveness of the algorithm and compare it to one of the most successful feature-based SLAM algorithms, FAB-MAP. The perceptual change in the datasets is extreme; repeated traverses through environments during the day and then in the middle of the night, at times separated by months or years and in opposite seasons, and in clear weather and extremely heavy rain. While the feature-based method fails, the sequence-based algorithm is able to match trajectory segments at 100% precision with recall rates of up to 60%.
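The core idea, scoring whole sequences rather than single images, can be sketched as a search for the lowest-cost constant-velocity trajectory through an image-difference matrix (a simplified matcher; the published algorithm also patch-normalizes the matrix and searches over a range of velocities):

```python
import numpy as np

def best_match(diff_matrix, seq_len, v=1.0):
    """Simplified SeqSLAM-style sequence matcher.

    diff_matrix[i, j] is the appearance difference between query
    image i and database image j. For the trailing `seq_len` query
    images, sum differences along constant-velocity trajectories
    through the matrix and return the database index whose
    trajectory scores lowest, plus that score."""
    n_query, n_db = diff_matrix.shape
    rows = np.arange(seq_len)              # trailing query images
    q0 = n_query - seq_len
    best_j, best_score = -1, np.inf
    for j in range(n_db - seq_len + 1):
        cols = (j + v * rows).astype(int)  # constant-velocity path
        score = diff_matrix[q0 + rows, cols].sum()
        if score < best_score:
            best_j, best_score = j, score
    return best_j, best_score
```
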
  • Image Sequence Partitioning for Outdoor Mapping Authors: Korrapati, Hemanth; Mezouar, Youcef; Martinet, Philippe
    Most of the existing appearance-based topological mapping algorithms produce dense topological maps in which each image stands as a node in the topological graph. Sparser maps can be built by representing groups of visually similar images as nodes of a topological graph. In this paper, we present a sparse topological mapping framework which uses Image Sequence Partitioning (ISP) techniques to group visually similar images as topological graph nodes. We present four different ISP techniques and evaluate their performance. In order to take advantage of the aforementioned maps, we make use of Hierarchical Inverted Files (HIF) which enable efficient hierarchical loop closure. Outdoor experimental results demonstrating the sparsity, efficiency and accuracy achieved by the combination of ISP and HIF in performing loop closure are presented.
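An inverted file for loop closure can be illustrated in miniature with a flat index from visual words to topological nodes (the hierarchical structure and scoring in the paper are more involved; node and word ids here are hypothetical):

```python
from collections import defaultdict

def build_inverted_file(node_words):
    """node_words: {node_id: set of visual-word ids describing the
    group of images merged into that topological node}."""
    index = defaultdict(set)
    for node, words in node_words.items():
        for w in words:
            index[w].add(node)
    return index

def query(index, words, node_words):
    """Vote for nodes sharing visual words with the query image, then
    rank candidates by the fraction of their words that matched."""
    votes = defaultdict(int)
    for w in words:
        for node in index.get(w, ()):
            votes[node] += 1
    if not votes:
        return None  # no loop-closure candidate
    return max(votes, key=lambda n: votes[n] / len(node_words[n]))
```
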
  • Anytime Merging of Appearance Based Maps Authors: Erinc, Gorkem; Carpin, Stefano
    Appearance-based maps are emerging as an important class of spatial representations for mobile robots. In this paper we tackle the problem of merging together two or more appearance-based maps independently built by robots operating in the same environment. Noticing the lack of well-accepted metrics to measure the performance of map merging algorithms, we propose to use algebraic connectivity as a metric to assess the advantage gained by merging multiple maps. Next, based on this criterion, we propose an anytime algorithm aiming to quickly identify the most advantageous parts to merge. The proposed system has been fully implemented and tested in indoor scenarios; the results show that our algorithm achieves a convenient tradeoff between accuracy and speed.
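Algebraic connectivity is the second-smallest eigenvalue of the graph Laplacian (the Fiedler value): it is zero exactly when the graph is disconnected and grows as the graph becomes better connected, which is what makes it a natural score for the benefit of a merge. A small sketch of computing it for an appearance-map graph:

```python
import numpy as np

def algebraic_connectivity(adjacency):
    """Second-smallest eigenvalue of the graph Laplacian L = D - A
    for an undirected appearance-map graph given by its (symmetric)
    adjacency matrix."""
    A = np.asarray(adjacency, dtype=float)
    L = np.diag(A.sum(axis=1)) - A
    eigvals = np.sort(np.linalg.eigvalsh(L))
    return eigvals[1]
```
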