TechTalks from event: CVPR 2014 Video Spotlights

Orals 3B : Video: Events, Activities & Surveillance

  • L0 Regularized Stationary Time Estimation for Crowd Group Analysis Authors: Shuai Yi, Xiaogang Wang, Cewu Lu, Jiaya Jia
    We tackle stationary crowd analysis in this paper, which is similarly important as modeling mobile groups in crowd scenes and finds many applications in surveillance. Our key contribution is to propose a robust algorithm of estimating how long a foreground pixel becomes stationary. It is much more challenging than only subtracting background because failure at a single frame due to local movement of objects, lighting variation, and occlusion could lead to large errors on stationary time estimation. To accomplish decent results, sparse constraints along spatial and temporal dimensions are jointly added by mixed partials to shape a 3D stationary time map. It is formulated as a L0 optimization problem. Besides background subtraction, it distinguishes among different foreground objects, which are close or overlapped in the spatio-temporal space by using a locally shared foreground codebook. The proposed technologies are used to detect four types of stationary group activities and analyze crowd scene structures. We provide the first public benchmark dataset1 for stationary time estimation and stationary group analysis.
  • Scene-Independent Group Profiling in Crowd Authors: Jing Shao, Chen Change Loy, Xiaogang Wang
    Groups are the primary entities that make up a crowd. Understanding group-level dynamics and properties is thus scientifically important and practically useful in a wide range of applications, especially for crowd understanding. In this study we show that fundamental group-level properties, such as intra-group stability and inter-group conflict, can be systematically quantified by visual descriptors. This is made possible through learning a novel Collective Transition prior, which leads to a robust approach for group segregation in public spaces. From the prior, we further devise a rich set of group property visual descriptors. These descriptors are scene-independent, and can be effectively applied to public-scene with variety of crowd densities and distributions. Extensive experiments on hundreds of public scene video clips demonstrate that such property descriptors are not only useful but also necessary for group state analysis and crowd scene understanding.