In-store Tracking with a Single Panoramic Camera

Video Results

The novelty of our particle filtering approach lies in the way we formulate the observation likelihoods: they account for the 3D locations of the bodies with respect to the camera, and for occlusions by other tracked human bodies as well as by static objects.

Traffic Visualization: Heat-map
Motion orientation   15 MB
Motion+skin color orientation (Kalman filter combined)   12 MB
Panoramic tracking   35 MB
Customer paths   2 MB
Panoramic tracking   42 MB
Customer paths   3 MB
Panoramic tracking   25 MB
Customer dots (colored by group)   1 MB
Body heights   5 MB
Obstacle markup dialog   149 KB
Customer paths   8 MB, 8 MB
Detecting head candidates

  Head candidates

To generate human hypotheses within a blob detected in the scene, we use a principle similar to the vertical projection histogram of the blob. Our method uses the location of the vanishing point, which we obtain for the camera during the calibration stage. The blob is projected along rays passing through the vanishing point rather than along parallel lines projected onto the horizontal axis of the image, hence the name Vanishing Point Projection (VPP) histogram. Using Bresenham's algorithm, we rasterize rays passing through the vertical vanishing point and each of the contour's vertices.
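The ray rasterization described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `bresenham` and `vpp_histogram`, the one-bin-per-contour-vertex layout, the integer vanishing point, and the choice to extend every ray to the distance of the farthest contour vertex are all assumptions made for illustration.

```python
import math

def bresenham(x0, y0, x1, y1):
    """Return the integer pixels on the segment (x0, y0)-(x1, y1), inclusive."""
    points = []
    dx, dy = abs(x1 - x0), -abs(y1 - y0)
    sx = 1 if x0 < x1 else -1
    sy = 1 if y0 < y1 else -1
    err = dx + dy
    while True:
        points.append((x0, y0))
        if x0 == x1 and y0 == y1:
            break
        e2 = 2 * err
        if e2 >= dy:
            err += dy
            x0 += sx
        if e2 <= dx:
            err += dx
            y0 += sy
    return points

def vpp_histogram(mask, contour, vp):
    """Count blob pixels along rays from the vanishing point through each vertex.

    mask    -- 2D grid indexed as mask[y][x], nonzero inside the blob
    contour -- list of (x, y) contour vertices (one histogram bin per vertex)
    vp      -- (x, y) vertical vanishing point, integer pixel coordinates
    """
    height, width = len(mask), len(mask[0])
    vx, vy = vp
    dists = [math.hypot(px - vx, py - vy) for px, py in contour]
    # Assumption: every ray is extended to the farthest vertex so that it
    # crosses the whole blob, not only the part in front of its own vertex.
    reach = max(dists)
    hist = []
    for (px, py), d in zip(contour, dists):
        count = 0
        if d > 0:
            ex = int(round(vx + (px - vx) / d * reach))
            ey = int(round(vy + (py - vy) / d * reach))
            for x, y in bresenham(vx, vy, ex, ey):
                # Pixels outside the image are skipped.
                if 0 <= x < width and 0 <= y < height and mask[y][x]:
                    count += 1
        hist.append(count)
    return hist
```

With a real blob, `contour` would come from a contour tracer and `vp` from the camera calibration; rays with high counts then point along the directions in which the blob is most elongated toward the vanishing point.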

  Vertical Projection Histogram

We isolate local maxima in the histogram by first smoothing within a local window and then averaging neighboring peaks. Because the contour is cyclic, the neighbors of the leftmost and rightmost bins wrap around to the end and the beginning of the histogram, respectively. Typically, the window size can be set to the total number of bins in the histogram divided by the maximum number of candidates allowed within one blob. This maximum is set empirically between 3 and 10, depending on the average complexity, or "crowdedness", of the scene.
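The cyclic smoothing and peak isolation can be sketched as follows. This is a simplified illustration, not the authors' procedure: the name `find_head_candidates` is hypothetical, the triangular smoothing weights are an assumption, and nearby peaks are suppressed within the window rather than averaged.

```python
def find_head_candidates(hist, max_candidates=5):
    """Return indices of local maxima in a cyclic histogram.

    hist           -- list of non-negative bin counts
    max_candidates -- maximum number of candidates allowed within one blob
    """
    n = len(hist)
    if n == 0:
        return []
    # Window size: number of bins divided by the maximum number of candidates.
    window = max(1, n // max_candidates)
    half = min(max(1, window // 2), (n - 1) // 2)

    # Cyclic smoothing: indices wrap around with the modulo operator, so the
    # first and last bins are treated as neighbors. Triangular weights are an
    # assumption; they keep a symmetric peak centered on its original bin.
    total_weight = (half + 1) ** 2
    smooth = []
    for i in range(n):
        acc = 0.0
        for k in range(-half, half + 1):
            acc += (half + 1 - abs(k)) * hist[(i + k) % n]
        smooth.append(acc / total_weight)

    # A bin is a candidate if it is positive and dominates its cyclic window.
    # Ties are broken toward the later bin so a flat top yields one candidate.
    peaks = []
    for i in range(n):
        if smooth[i] <= 0:
            continue
        left_ok = all(smooth[i] >= smooth[(i - k) % n] for k in range(1, half + 1))
        right_ok = all(smooth[i] > smooth[(i + k) % n] for k in range(1, half + 1))
        if left_ok and right_ok:
            peaks.append(i)
    return peaks

# Two well-separated bumps in a 12-bin histogram, at most 3 candidates.
print(find_head_candidates([0, 1, 5, 1, 0, 0, 0, 2, 6, 2, 0, 0], max_candidates=3))
# → [2, 8]
```

A bump that straddles the ends of the histogram is still found as a single candidate, because both the smoothing and the comparison wrap around.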





MeetCrowd Meet_WalkSplit Meet_Split_3rdGuy
Tracking by Fusing Color and Thermal Information

We present a novel model of the scene background in which each pixel is represented as a multi-modal distribution. The number of modes changes dynamically for both color and thermal input. We demonstrate how this type of fusion eliminates the influence of shadows. Based on this background model, we introduce a pedestrian tracker designed as a particle filter. It includes a number of informed reversible transformations that sample the model probability space in order to maximize the model posterior probability.

  14 MB   898 KB
  2 MB
  15 MB
  16 MB
  17 MB
Resolving Occlusions Using Multiple Body Views

What if each person could be represented by several color histograms, each corresponding to a specific orientation? We developed a method to automatically determine the number of intrinsic color views for any tracked object. The method works at run time by clustering the observed color space into an optimal number of clusters (views). We show how this more detailed representation of each body can increase the success rate when resolving person-to-person occlusions.

A sequence demonstrating switching between views. Colors represent views. Example 1
Tracking with stationary camera in arbitrary indoor environment. Colors represent different objects. Example 2
A sequence illustrating tracking through merge-split events. Colors represent different objects. Example 3