Vehicle detection and tracking using acoustic and video sensors

Title: Vehicle detection and tracking using acoustic and video sensors
Publication Type: Conference Paper
Year of Publication: 2004
Authors: Chellappa R, Qian G, Zheng Q
Conference Name: IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '04), Proceedings
Date Published: 2004/05
Keywords: acoustic signal processing; audio-visual fusion framework; beam-forming techniques; direction-of-arrival (DOA) estimation; empirical posterior probability density functions; joint audio-visual tracking; Markov chain Monte Carlo methods; Markov processes; moving object detection; multimodal sensing; optical tracking; sensor fusion; surveillance applications; surveillance systems; target detection; target tracking; vehicle detection; video signal processing

Multimodal sensing has attracted much attention for solving a wide range of problems, including target detection, tracking, classification, activity understanding, and speech recognition. In surveillance applications, different types of sensors, such as video and acoustic sensors, provide distinct observations of ongoing activities. We present a fusion framework using both video and acoustic sensors for vehicle detection and tracking. In the detection phase, a rough estimate of the target direction-of-arrival (DOA) is first obtained from acoustic data using beam-forming techniques. This initial DOA estimate designates the approximate target location in the video. Given the initial target position, the DOA is refined by moving-target detection using the video data. Markov chain Monte Carlo techniques are then used for joint audio-visual tracking. A novel fusion approach for tracking is proposed, based on the different characteristics of the audio and visual trackers. Experimental results using both synthetic and real data are presented. Improved tracking performance is observed when the empirical posterior probability density functions obtained from the two sensor types are fused.
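The fusion idea in the abstract can be illustrated with a minimal sketch: given empirical posterior densities over the target DOA from an audio tracker and a video tracker, evaluated on a shared set of particles, multiply the two densities particle-wise and renormalize so the fused posterior concentrates where both modalities agree. This is an illustrative stand-in, not the paper's implementation: the Gaussian likelihoods, noise levels, and 1-D bearing-only state are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 1-D target state: bearing (DOA) in degrees.
true_doa = 42.0

# Shared particle set of candidate DOA values over the field of view.
particles = rng.uniform(0.0, 90.0, size=5000)

# Stand-in empirical posteriors: Gaussian weights around noisy
# measurements (placeholders for the real audio/video trackers).
audio_meas = true_doa + rng.normal(0.0, 5.0)   # coarse acoustic DOA
video_meas = true_doa + rng.normal(0.0, 1.0)   # refined visual DOA

def gaussian_weight(x, mu, sigma):
    """Unnormalized Gaussian density used as a particle weight."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2)

w_audio = gaussian_weight(particles, audio_meas, 5.0)
w_video = gaussian_weight(particles, video_meas, 1.0)

# Fuse by multiplying the empirical densities particle-wise,
# then renormalizing; the product sharpens where both agree.
w_fused = w_audio * w_video
w_fused /= w_fused.sum()

# Fused DOA estimate: posterior mean over the particle set.
fused_doa = float(np.sum(w_fused * particles))
```

Because the product down-weights particles supported by only one modality, the fused estimate inherits the video tracker's precision while the coarse acoustic density suppresses visual false alarms elsewhere in the field of view.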