CVPR 2014 Video Spotlights
TechTalks from event: CVPR 2014 Video Spotlights
Orals 3B : Video: Events, Activities & Surveillance
- L0 Regularized Stationary Time Estimation for Crowd Group Analysis

  We tackle stationary crowd analysis in this paper, which is as important as modeling mobile groups in crowd scenes and has many applications in surveillance. Our key contribution is a robust algorithm for estimating how long a foreground pixel remains stationary. This is much more challenging than background subtraction alone, because failure at a single frame, caused by local object movement, lighting variation, or occlusion, can lead to large errors in the stationary time estimate. To achieve good results, sparsity constraints along the spatial and temporal dimensions are jointly imposed through mixed partials to shape a 3D stationary time map; this is formulated as an L0 optimization problem. Beyond background subtraction, the method distinguishes among foreground objects that are close or overlap in spatio-temporal space by using a locally shared foreground codebook. The proposed techniques are used to detect four types of stationary group activities and to analyze crowd scene structures. We also provide the first public benchmark dataset for stationary time estimation and stationary group analysis.
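As a rough illustration of the estimation task this abstract describes, the sketch below accumulates a per-pixel stationary time from a stack of binary foreground masks, using a simple temporal majority filter to suppress single-frame failures. The function name and the smoothing scheme are hypothetical; the paper's L0-regularized spatio-temporal formulation is considerably more robust than this naive baseline.

```python
import numpy as np

def stationary_time_map(masks, smooth_window=5):
    """Naive per-pixel stationary-time estimate from binary foreground
    masks of shape (T, H, W). Illustrative only: the paper instead
    enforces spatio-temporal sparsity via an L0 penalty on mixed
    partials, which handles occlusion and lighting failures jointly."""
    masks = np.asarray(masks, dtype=bool)
    T, H, W = masks.shape
    # Temporal majority filter: a pixel counts as foreground at frame t
    # if it is foreground in most frames of the surrounding window,
    # suppressing isolated single-frame dropouts.
    k = smooth_window
    padded = np.pad(masks, ((k // 2, k // 2), (0, 0), (0, 0)), mode="edge")
    smoothed = np.stack(
        [padded[t:t + k].mean(axis=0) > 0.5 for t in range(T)]
    )
    # Stationary time at frame t = length of the current run of
    # consecutive (smoothed) foreground frames at that pixel.
    time_map = np.zeros((T, H, W), dtype=np.int32)
    run = np.zeros((H, W), dtype=np.int32)
    for t in range(T):
        run = np.where(smoothed[t], run + 1, 0)
        time_map[t] = run
    return time_map
```

A pixel that stays foreground for the whole clip, apart from a one-frame dropout, still accumulates the full stationary time, which is exactly the kind of single-frame failure the abstract highlights.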
- Scene-Independent Group Profiling in Crowd

  Groups are the primary entities that make up a crowd. Understanding group-level dynamics and properties is thus scientifically important and practically useful in a wide range of applications, especially crowd understanding. In this study we show that fundamental group-level properties, such as intra-group stability and inter-group conflict, can be systematically quantified by visual descriptors. This is made possible by learning a novel Collective Transition prior, which leads to a robust approach for group segregation in public spaces. From this prior, we further devise a rich set of visual descriptors for group properties. These descriptors are scene-independent and can be effectively applied to public scenes with a variety of crowd densities and distributions. Extensive experiments on hundreds of public-scene video clips demonstrate that such property descriptors are not only useful but also necessary for group state analysis and crowd scene understanding.
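To make the notion of an intra-group stability descriptor concrete, here is a toy score measuring how well each member's nearest neighbours are preserved between the first and last frame of a set of tracked positions. Both the function and the scoring rule are hypothetical illustrations of the underlying intuition (stable groups keep their internal topology), not the paper's Collective Transition formulation.

```python
import numpy as np

def knn_stability(tracks, k=3):
    """Toy intra-group stability score for tracked 2D positions of shape
    (T, N, 2): the mean fraction of each point's k nearest neighbours
    that are the same in the first and last frames. Returns 1.0 for a
    group whose internal topology is perfectly preserved."""
    def knn_sets(points):
        # Pairwise distances with self-distance masked out.
        d = np.linalg.norm(points[:, None] - points[None, :], axis=-1)
        np.fill_diagonal(d, np.inf)
        return [set(np.argsort(row)[:k]) for row in d]

    first, last = knn_sets(tracks[0]), knn_sets(tracks[-1])
    overlaps = [len(a & b) / k for a, b in zip(first, last)]
    return float(np.mean(overlaps))
```

A group that moves rigidly (everyone translated by the same offset) scores 1.0, while members scattering independently would lose neighbours and score lower.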