%0 Conference Paper
%B Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on
%D 2010
%T Robust RVM regression using sparse outlier model
%A Mitra, K.
%A Veeraraghavan, A.
%A Chellappa, Rama
%K 3D human pose
%K basis pursuit denoising
%K Bayesian approach
%K computer vision
%K Gaussian noise
%K image denoising
%K lighting estimation
%K regression analysis
%K relevance vector machine
%K robust RVM regression
%K sparse outlier model
%X Kernel regression techniques such as Relevance Vector Machine (RVM) regression, Support Vector Regression and Gaussian processes are widely used for solving many computer vision problems such as age, head pose, 3D human pose and lighting estimation. However, the presence of outliers in the training dataset makes the estimates from these regression techniques unreliable. In this paper, we propose robust versions of the RVM regression that can handle outliers in the training dataset. We decompose the noise term in the RVM formulation into a (sparse) outlier noise term and a Gaussian noise term. We then estimate the outlier noise along with the model parameters. We present two approaches for solving this estimation problem: (1) a Bayesian approach, which essentially follows the RVM framework and (2) an optimization approach based on Basis Pursuit Denoising. In the Bayesian approach, the robust RVM problem essentially becomes a bigger RVM problem with the advantage that it can be solved efficiently by a fast algorithm. Empirical evaluations and real experiments on image denoising and age estimation demonstrate the better performance of the robust RVM algorithms over that of the RVM regression.
%P 1887 - 1894
%8 2010/06//
%G eng
%R 10.1109/CVPR.2010.5539861
%0 Conference Paper
%B IEEE International Conference on Bioinformatics and Biomedicine, 2009. BIBM '09
%D 2009
%T Inexact Local Alignment Search over Suffix Arrays
%A Ghodsi, M.
%A Pop, Mihai
%K bacteria
%K Bioinformatics
%K biology computing
%K Computational Biology
%K Costs
%K DNA
%K DNA homology searches
%K DNA sequences
%K Educational institutions
%K generalized heuristic
%K genes
%K Genetics
%K genome alignment
%K Genomics
%K human
%K inexact local alignment search
%K inexact seeds
%K local alignment
%K local alignment tools
%K memory efficient suffix array
%K microorganisms
%K molecular biophysics
%K mouse
%K Organisms
%K Sensitivity and Specificity
%K sequences
%K suffix array
%K USA Councils
%X We describe an algorithm for finding approximate seeds for DNA homology searches. In contrast to previous algorithms that use exact or spaced seeds, our approximate seeds may contain insertions and deletions. We present a generalized heuristic for finding such seeds efficiently and prove that the heuristic does not affect sensitivity. We show how to adapt this algorithm to work over the memory efficient suffix array with provably minimal overhead in running time. We demonstrate the effectiveness of our algorithm on two tasks: whole genome alignment of bacteria and alignment of the DNA sequences of 177 genes that are orthologous in human and mouse. We show our algorithm achieves better sensitivity and uses less memory than other commonly used local alignment tools.
%I IEEE
%P 83 - 87
%8 2009/11/01/
%@ 978-0-7695-3885-3
%G eng
%R 10.1109/BIBM.2009.25
%0 Conference Paper
%B Computer Graphics and Image Processing (SIBGRAPI), 2009 XXII Brazilian Symposium on
%D 2009
%T Salient Clustering for View-dependent Multiresolution Rendering
%A Barni, R.
%A Comba, J.
%A Varshney, Amitabh
%K automatic segmentation algorithms
%K cluster analysis
%K computer graphics
%K face clusters
%K image representation
%K image segmentation
%K low-level human visual attention
%K mesh representation
%K mesh resolution
%K mesh saliency
%K mesh segmentation
%K pattern clustering
%K perceptual information
%K propagative mesh clustering framework
%K rendering (computer graphics)
%K salient seed selection
%K user centred design
%K user-centric system
%K view-dependent multiresolution rendering
%X Perceptual information is quickly gaining importance in mesh representation, analysis and rendering. User studies, eye tracking and other techniques are able to provide ever more useful insights for many user-centric systems, which form the bulk of computer graphics applications. In this work we build upon the concept of Mesh Saliency - an automatic measure of visual importance for triangle meshes based on models of low-level human visual attention - applying it to the problem of mesh segmentation and view-dependent rendering. We introduce a technique for segmentation that partitions an object into a set of face clusters, each encompassing a group of locally interesting features; Mesh Saliency is incorporated in a propagative mesh clustering framework, guiding cluster seed selection and triangle propagation costs and leading to a convergence of face clusters around perceptually important features. We compare our technique with different fully automatic segmentation algorithms, showing that it provides similar or better segmentation without the need for user input. We illustrate application of our clustering results through a saliency-guided view-dependent rendering system, achieving significant frame rate increases with little loss of visual detail.
%P 56 - 63
%8 2009/10//
%G eng
%R 10.1109/SIBGRAPI.2009.34
%0 Journal Article
%J Multimedia, IEEE Transactions on
%D 2008
%T Synthesis of Silhouettes and Visual Hull Reconstruction for Articulated Humans
%A Yue, Zhanfeng
%A Chellappa, Rama
%K active virtual camera
%K approximation theory
%K articulated human body
%K body part localization
%K cameras
%K circular trajectory
%K contour-based body part segmentation technique
%K edge detection
%K human pose
%K image reconstruction
%K image segmentation
%K inner distance shape context
%K pose estimation
%K shape recognition
%K silhouette similarity measurement
%K silhouette synthesis
%K turning function distance
%K turntable image collection
%K virtual reality
%K visual hull reconstruction
%X In this paper, we propose a complete framework for improved synthesis and understanding of the human pose from a limited number of silhouette images. It combines the active image-based visual hull (IBVH) algorithm and a contour-based body part segmentation technique. We derive a simple, approximate algorithm to decide the extrinsic parameters of a virtual camera, and synthesize the turntable image collection of the person using the IBVH algorithm by actively moving the virtual camera on a properly computed circular trajectory around the person. Using the turning function distance as the silhouette similarity measurement, this approach can be used to generate the desired pose-normalized images for recognition applications. In order to overcome the inability of the visual hull (VH) method to reconstruct concave regions, we propose a contour-based human body part localization algorithm to segment the silhouette images into convex body parts. The body parts observed from the virtual view are generated separately from the corresponding body parts observed from the input views and then assembled together for a more accurate VH reconstruction. Furthermore, the obtained turntable image collection helps to improve the body part segmentation and identification process. By using the inner distance shape context (IDSC) measurement, we are able to estimate the body part locations more accurately from a synthesized view where we can localize the body part more precisely. Experiments show that the proposed algorithm can greatly improve body part segmentation and hence shape reconstruction results.
%V 10
%P 1565 - 1577
%8 2008/12//
%@ 1520-9210
%G eng
%N 8
%R 10.1109/TMM.2008.2007321
%0 Conference Paper
%B Acoustics, Speech and Signal Processing, 2007. ICASSP 2007. IEEE International Conference on
%D 2007
%T Coarse-to-Fine Event Model for Human Activities
%A Cuntoor, N.P.
%A Chellappa, Rama
%K activity recognition
%K coarse-to-fine event model
%K event probabilities
%K hidden Markov model framework
%K hidden Markov models
%K human activities
%K image representation
%K image resolution
%K image sequences
%K spatial resolution reduction
%K stability
%K TSA airport tarmac surveillance dataset
%K UCF indoor human action dataset
%K video browsing
%K video sequences
%K video signal processing
%X We analyze coarse-to-fine hierarchical representation of human activities in video sequences. It can be used for efficient video browsing and activity recognition. Activities are modeled using a sequence of instantaneous events. Events in activities can be represented in a coarse-to-fine hierarchy in several ways, i.e., there may not be a unique hierarchical structure. We present five criteria and quantitative measures for evaluating their effectiveness. The criteria are minimalism, stability, consistency, accessibility and applicability. It is desirable to develop activity models that rank highly on these criteria at all levels of hierarchy. In this paper, activities are represented as a sequence of event probabilities computed using the hidden Markov model framework. Two aspects of hierarchies are analyzed: the effect of reduced frame rate on the accuracy of events detected at a finer scale; and the effect of reduced spatial resolution on activity recognition. Experiments using the UCF indoor human action dataset and the TSA airport tarmac surveillance dataset show encouraging results.
%V 1
%P I-813 - I-816
%8 2007/04//
%G eng
%R 10.1109/ICASSP.2007.366032
%0 Conference Paper
%B Computer Vision and Pattern Recognition, 2007. CVPR '07. IEEE Conference on
%D 2007
%T Epitomic Representation of Human Activities
%A Cuntoor, N.P.
%A Chellappa, Rama
%K epitomic representation
%K estimated system matrix
%K human activities modeling
%K image sequences
%K input signal statistics
%K Iwasawa matrix decomposition
%K linear dynamical systems
%K matrix decomposition
%K modelling
%K statistics
%K TSA airport surveillance dataset
%K UCF indoor human action dataset
%K video sequences
%K video signal processing
%X We introduce an epitomic representation for modeling human activities in video sequences. A video sequence is divided into segments within which the dynamics of objects is assumed to be linear and modeled using linear dynamical systems. The tuple consisting of the estimated system matrix, statistics of the input signal and the initial state value is said to form an epitome. The system matrices are decomposed using the Iwasawa matrix decomposition to isolate the effect of rotation, scaling and projective action on the state vector. We demonstrate the usefulness of the proposed representation and decomposition for activity recognition using the TSA airport surveillance dataset and the UCF indoor human action dataset.
%P 1 - 8
%8 2007/06//
%G eng
%R 10.1109/CVPR.2007.383135
%0 Conference Paper
%B Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on
%D 2007
%T Hierarchical Part-Template Matching for Human Detection and Segmentation
%A Zhe Lin
%A Davis, Larry S.
%A David Doermann
%A DeMenthon,D.
%K background subtraction
%K Bayes methods
%K Bayesian approach
%K Bayesian MAP framework
%K fine occlusion analysis
%K global likelihood re-evaluation
%K global shape template-based human detectors
%K hierarchical part-template matching
%K human detection
%K human segmentation
%K image matching
%K image segmentation
%K image sequences
%K local part-based human detectors
%K partial occlusions
%K shape articulations
%K video sequences
%X Local part-based human detectors are capable of handling partial occlusions efficiently and modeling shape articulations flexibly, while global shape template-based human detectors are capable of detecting and segmenting human shapes simultaneously. We describe a Bayesian approach to human detection and segmentation combining local part-based and global template-based schemes. The approach relies on the key ideas of matching a part-template tree to images hierarchically to generate a reliable set of detection hypotheses and optimizing it under a Bayesian MAP framework through global likelihood re-evaluation and fine occlusion analysis. In addition to detection, our approach is able to obtain human shapes and poses simultaneously. We applied the approach to human detection and segmentation in crowded scenes with and without background subtraction. Experimental results show that our approach achieves good performance on images and video sequences with severe occlusion.
%P 1 - 8
%8 2007/10//
%G eng
%R 10.1109/ICCV.2007.4408975
%0 Conference Paper
%B Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on
%D 2007
%T An Interactive Approach to Pose-Assisted and Appearance-based Segmentation of Humans
%A Zhe Lin
%A Davis, Larry S.
%A David Doermann
%A DeMenthon,D.
%K appearance-based segmentation
%K EM algorithm
%K expectation-maximisation algorithm
%K hidden feature removal
%K human segmentation
%K image segmentation
%K inference mechanisms
%K interactive segmentation
%K layered occlusion model
%K nonparametric kernel density estimator
%K pose estimation
%K pose-assisted segmentation
%K probabilistic occlusion reasoning method
%X An interactive human segmentation approach is described. Given regions of interest provided by users, the approach iteratively estimates segmentation via a generalized EM algorithm. Specifically, it encodes both spatial and color information in a nonparametric kernel density estimator, and incorporates local MRF constraints and global pose inferences to propagate beliefs over image space iteratively to determine a coherent segmentation. This ensures the segmented humans resemble the shapes of human poses. Additionally, a layered occlusion model and a probabilistic occlusion reasoning method are proposed to handle segmentation of multiple humans in occlusion. The approach is tested on a wide variety of images containing single or multiple occluded humans, and the segmentation performance is evaluated quantitatively.
%P 1 - 8
%8 2007/10//
%G eng
%R 10.1109/ICCV.2007.4409123
%0 Conference Paper
%B Acoustics, Speech and Signal Processing, 2007. ICASSP 2007. IEEE International Conference on
%D 2007
%T Markerless Monocular Tracking of Articulated Human Motion
%A Liu, Haiying
%A Chellappa, Rama
%K anatomical structure
%K articulated blob model
%K articulated human motion
%K cameras
%K gait recognition
%K global optimization
%K human motion analysis
%K image motion analysis
%K image sequences
%K linear equations
%K markerless monocular tracking
%K optical flow
%K optimisation
%K scaled orthographic projection
%K single camera
%K spatial-temporal intensity
%K Tai Chi sequences
%X This paper presents a method for tracking general 3D articulated human motion using a single camera with unknown calibration data. No markers, special clothes, or devices are assumed to be attached to the subject. In addition, both the camera and the subject are allowed to move freely, so that long-term view-independent human motion tracking and recognition are possible. We exploit the fact that the anatomical structure of the human body can be approximated by an articulated blob model. The optical flow under scaled orthographic projection is used to relate the spatial-temporal intensity change of the image sequence to the human motion parameters. These motion parameters are obtained by solving a set of linear equations to achieve global optimization. The correctness and robustness of the proposed method are demonstrated using Tai Chi sequences.
%V 1
%P I-693 - I-696
%8 2007/04//
%G eng
%R 10.1109/ICASSP.2007.366002
%0 Conference Paper
%B Image Processing, 2005. ICIP 2005. IEEE International Conference on
%D 2005
%T Pedestrian classification from moving platforms using cyclic motion pattern
%A Yang Ran
%A Qinfen Zheng
%A Weiss, I.
%A Davis, Larry S.
%A Abd-Almageed, Wael
%A Liang Zhao
%K compact shape representation
%K cyclic motion pattern
%K digital phase locked loop
%K feedback loop module
%K gait analysis
%K gait phase information
%K human body pixel oscillations
%K image sequences
%K object detection
%K pedestrian classification system
%K pedestrian detection
%K phase locked loops
%K principal gait angle
%X This paper describes an efficient pedestrian detection system for videos acquired from moving platforms. Given a detected and tracked object as a sequence of images within a bounding box, we describe the periodic signature of its motion pattern using a twin-pendulum model. Then a principal gait angle is extracted in every frame providing gait phase information. By estimating the periodicity from the phase data using a digital phase locked loop (dPLL), we quantify the cyclic pattern of the object, which helps us to continuously classify it as a pedestrian. Past approaches have used shape detectors applied to a single image or classifiers based on human body pixel oscillations, but ours is the first to integrate a global cyclic motion model and periodicity analysis. Novel contributions of this paper include: i) development of a compact shape representation of cyclic motion as a signature for a pedestrian, ii) estimation of gait period via a feedback loop module, and iii) implementation of a fast online pedestrian classification system which operates on videos acquired from moving platforms.
%V 2
%P II-854-7
%8 2005/09//
%G eng
%R 10.1109/ICIP.2005.1530190
%0 Conference Paper
%B Computer Vision and Pattern Recognition, 2005. CVPR 2005. IEEE Computer Society Conference on
%D 2005
%T Using the inner-distance for classification of articulated shapes
%A Ling, H.
%A Jacobs, David W.
%K articulated shape classification
%K dynamic programming
%K human motion silhouette dataset
%K image matching
%K inner-distance
%K Kimia silhouettes
%K landmark points
%K MPEG7 CE-Shape-1 database
%K shape descriptor
%K Swedish leaf database
%K visual databases
%X We propose using the inner-distance between landmark points to build shape descriptors. The inner-distance is defined as the length of the shortest path between landmark points within the shape silhouette. We show that the inner-distance is articulation insensitive and more effective at capturing complex shapes with part structures than Euclidean distance. To demonstrate this idea, it is used to build a new shape descriptor based on shape contexts. After that, we design a dynamic programming based method for shape matching and comparison. We have tested our approach on a variety of shape databases including an articulated shape dataset, MPEG7 CE-Shape-1, Kimia silhouettes, a Swedish leaf database and a human motion silhouette dataset. In all the experiments, our method demonstrates effective performance compared with other algorithms.
%V 2
%P 719 - 726 Vol. 2
%8 2005/06//
%G eng
%R 10.1109/CVPR.2005.362
%0 Conference Paper
%B Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on
%D 2004
%T Fusion of gait and face for human identification
%A Kale, A.
%A Roy Chowdhury, A.K.
%A Chellappa, Rama
%K access control
%K covert security
%K decision fusion
%K face recognition
%K gait analysis
%K hierarchical fusion
%K holistic fusion
%K human identification
%K intelligent environments
%K perceptual interfaces
%K score combining rules
%K sensor data fusion
%K sequential importance sampling
%K similarity scores
%K view invariant gait recognition algorithm
%X Identification of humans from arbitrary view points is an important requirement for different tasks including perceptual interfaces for intelligent environments, covert security, and access control. For optimal performance, the system must use as many cues as possible and combine them in meaningful ways. In this paper, we discuss fusion of face and gait cues for the single camera case. We present a view-invariant gait recognition algorithm. We employ decision fusion to combine the results of our gait recognition algorithm and a face recognition algorithm based on sequential importance sampling. We consider two fusion scenarios: hierarchical and holistic. The first involves using the gait recognition algorithm as a filter to pass on a smaller set of candidates to the face recognition algorithm. The second involves combining the similarity scores obtained individually from the face and gait recognition algorithms. Simple rules like the SUM, MIN and PRODUCT are used for combining the scores. The results of fusion experiments are demonstrated on the NIST database which has outdoor gait and face data of 30 subjects.
%V 5
%P V-901-4 Vol. 5
%8 2004/05//
%G eng
%R 10.1109/ICASSP.2004.1327257
%0 Conference Paper
%B Image Processing, 2004. ICIP '04. 2004 International Conference on
%D 2004
%T Multiple view tracking of humans modelled by kinematic chains
%A Sundaresan, A.
%A Chellappa, Rama
%A RoyChowdhury, R.
%K 3D motion parameters
%K calibrated cameras
%K error analysis
%K human body motion
%K image motion analysis
%K iterative methods
%K kinematic chain model
%K kinematics
%K motion estimation
%K multiple view tracking
%K perspective projection
%K pixel displacement
%K video sequences
%K video signal processing
%X We use a kinematic chain to model human body motion. We estimate the kinematic chain motion parameters using pixel displacements calculated from video sequences obtained from multiple calibrated cameras to perform tracking. We derive a linear relation between the 2D motion of pixels in terms of the 3D motion parameters of various body parts using a perspective projection model for the cameras, a rigid body motion model for the base body and the kinematic chain model for the body parts. An error analysis of the estimator is provided, leading to an iterative algorithm for calculating the motion parameters from the pixel displacements. We provide experimental results to demonstrate the accuracy of our formulation. We also compare our iterative algorithm to the noniterative algorithm and discuss its robustness in the presence of noise.
%V 2
%P 1009 - 1012 Vol. 2
%8 2004/10//
%G eng
%R 10.1109/ICIP.2004.1419472
%0 Conference Paper
%B Computer Vision and Pattern Recognition, 2004. CVPR 2004. Proceedings of the 2004 IEEE Computer Society Conference on
%D 2004
%T Role of shape and kinematics in human movement analysis
%A Veeraraghavan, A.
%A Chowdhury, A.R.
%A Chellappa, Rama
%K activity classification
%K autoregressive moving average processes
%K computer vision community
%K feature extraction
%K gait recognition algorithm
%K hidden Markov models
%K human identification algorithms
%K human motion modeling
%K human movement analysis
%K image sequences
%K Kendall shape definition
%K linear dynamical system
%K spherical manifold
%K vision based recognition algorithms
%X Human gait and activity analysis from video is presently attracting a lot of attention in the computer vision community. In this paper we analyze the role of two of the most important cues in human motion: shape and kinematics. We present an experimental framework whereby it is possible to evaluate the relative importance of these two cues in computer vision based recognition algorithms. In the process, we propose a new gait recognition algorithm by computing the distance between two sequences of shapes that lie on a spherical manifold. In our experiments, shape is represented using Kendall's definition of shape. Kinematics is represented using a linear dynamical system. We place particular emphasis on human gait. Our conclusions show that shape plays a role which is more significant than kinematics in current automated gait based human identification algorithms. As a natural extension we study the role of shape and kinematics in activity recognition. Our experiments indicate that we require models that contain both shape and kinematics in order to perform accurate activity classification. These conclusions also allow us to explain the relative performance of many existing methods in computer-based human activity modeling.
%V 1
%P I-730 - I-737 Vol. 1
%8 2004/07/02/
%G eng
%R 10.1109/CVPR.2004.1315104
%0 Conference Paper
%B Computer Vision and Pattern Recognition, 2004. CVPR 2004. Proceedings of the 2004 IEEE Computer Society Conference on
%D 2004
%T View independent human body pose estimation from a single perspective image
%A Parameswaran, V.
%A Chellappa, Rama
%K 3D coordinates
%K biomechanics
%K body-centric coordinate system
%K epipolar geometry
%K human body pose estimation
%K image motion analysis
%K model-based human tracking
%K object detection
%K optical motion capture systems
%K perspective uncalibrated camera
%K physiological models
%K polynomial equation system
%K polynomials
%K real images
%K single perspective image
%K synthetic images
%K torso twist
%X Recovering the 3D coordinates of various joints of the human body from an image is a critical first step for several model-based human tracking and optical motion capture systems. Unlike previous approaches that have used a restrictive camera model or assumed a calibrated camera, our work deals with the general case of a perspective uncalibrated camera and is thus well suited for archived video. The input to the system is an image of the human body and correspondences of several body landmarks, while the output is the set of 3D coordinates of the landmarks in a body-centric coordinate system. Using ideas from 3D model based invariants, we set up a polynomial system of equations in the unknown head pitch, yaw and roll angles. If we are able to make the often-valid assumption that the torso twist is small, there is a finite number of solutions for the head orientation, which can be computed readily. Once the head orientation is computed, the epipolar geometry of the camera is recovered, leading to solutions to the 3D joint positions. Results are presented on synthetic and real images.
%V 2
%P II-16 - II-22 Vol. 2
%8 2004/07/02/
%G eng
%R 10.1109/CVPR.2004.1315139
%0 Conference Paper
%B Image Processing, 2003. ICIP 2003. Proceedings. 2003 International Conference on
%D 2003
%T An appearance based approach for human and object tracking
%A Capellades, M. B.
%A David Doermann
%A DeMenthon,D.
%A Chellappa, Rama
%K background subtraction algorithm
%K color distributions
%K colour analysis
%K correlogram
%K frame by frame basis
%K histogram information
%K human tracking
%K image segmentation
%K image sequences
%K object detection
%K object tracking
%K video signal processing
%X A system for tracking humans and detecting human-object interactions in indoor environments is described. A combination of correlogram and histogram information is used to model object and human color distributions. Humans and objects are detected using a background subtraction algorithm. The models are built on the fly and used to track humans and objects on a frame-by-frame basis. The system is able to detect when people merge into groups and segment them during occlusion. Identities are preserved during the sequence, even if a person enters and leaves the scene. The system is also able to detect when a person deposits or removes an object from the scene. In the first case the models are used to track the object retroactively in time. In the second case the objects are tracked for the rest of the sequence. Experimental results using indoor video sequences are presented.
%V 2
%P II-85-8 Vol. 3
%8 2003/09//
%G eng
%R 10.1109/ICIP.2003.1246622
%0 Conference Paper
%B Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03). 2003 IEEE International Conference on
%D 2003
%T Combining multiple evidences for gait recognition
%A Cuntoor, N.
%A Kale, A.
%A Chellappa, Rama
%K dynamic features
%K feature extraction
%K feature sets
%K frontal views
%K gait recognition
%K height
%K human identification
%K image analysis
%K MIN rule
%K multiple evidences
%K nonprobabilistic techniques
%K probabilistic techniques
%K Product rule
%K side views
%K static features
%K Sum rule
%K sway
%K swing
%X In this paper, we systematically analyze different components of human gait for the purpose of human identification. We investigate dynamic features such as the swing of the hands/legs, the sway of the upper body and static features like height, in both frontal and side views. Both probabilistic and non-probabilistic techniques are used for matching the features. Various combination strategies may be used depending upon the gait features being combined. We discuss three simple rules: the Sum, Product and MIN rules that are relevant to our feature sets. Experiments using four different datasets demonstrate that fusion can be used as an effective strategy in recognition.
%V 3
%P III-33-6 Vol. 3
%8 2003/04//
%G eng
%R 10.1109/ICASSP.2003.1199100
%0 Conference Paper
%B Image Processing, 2003. ICIP 2003. Proceedings. 2003 International Conference on
%D 2003
%T A hidden Markov model based framework for recognition of humans from gait sequences
%A Sundaresan, Aravind
%A RoyChowdhury, Amit
%A Chellappa, Rama
%K binarized background-subtracted image
%K discrete postures
%K distance metrics
%K feature vector
%K gait analysis
%K gait recognition
%K hidden Markov models
%K human recognition
%K image sequences
%X In this paper we propose a generic framework based on hidden Markov models (HMMs) for recognition of individuals from their gait. The HMM framework is suitable, because the gait of an individual can be visualized as his adopting postures from a set, in a sequence which has an underlying structured probabilistic nature. The postures that the individual adopts can be regarded as the states of the HMM and are typical to that individual and provide a means of discrimination. The framework assumes that, during gait, the individual transitions between N discrete postures or states but it is not dependent on the particular feature vector used to represent the gait information contained in the postures. The framework, thus, provides flexibility in the selection of the feature vector. The statistical nature of the HMM lends robustness to the model. In this paper we use the binarized background-subtracted image as the feature vector and use different distance metrics, such as those based on the L_{1} and L_{2} norms of the vector difference, and the normalized inner product of the vectors, to measure the similarity between feature vectors. The results we obtain are better than the baseline recognition rates reported before.
%B Image Processing, 2003. ICIP 2003. Proceedings. 2003 International Conference on
%V 2
%P II - 93-6 vol.3
%8 2003/09//
%G eng
%R 10.1109/ICIP.2003.1246624
%0 Conference Paper
%B Proceedings. IEEE Conference on Advanced Video and Signal Based Surveillance, 2003.
%D 2003
%T Human body pose estimation using silhouette shape analysis
%A Mittal,A.
%A Liang Zhao
%A Davis, Larry S.
%K human body pose estimation; silhouette shape analysis; multiple views; likelihood function; pixel classification; image segmentation; object detection; feature extraction; 3D structure; parameter estimation; clutter; surveillance;
%X We describe a system for human body pose estimation from multiple views that is fast and completely automatic. The algorithm works in the presence of multiple people by decoupling the problems of pose estimation of different people. The pose is estimated based on a likelihood function that integrates information from multiple views and thus obtains a globally optimal solution. Other characteristics that make our method more general than previous work include: (1) no manual initialization; (2) no specification of the dimensions of the 3D structure; (3) no reliance on some learned poses or patterns of activity; (4) insensitivity to edges and clutter in the background and within the foreground. The algorithm has applications in surveillance and promising results have been obtained.
%B Proceedings. IEEE Conference on Advanced Video and Signal Based Surveillance, 2003.
%P 263 - 270
%8 2003/07//
%G eng
%R 10.1109/AVSS.2003.1217930
%0 Conference Paper
%B Computer Vision and Pattern Recognition, 2003. Proceedings. 2003 IEEE Computer Society Conference on
%D 2003
%T Learning dynamics for exemplar-based gesture recognition
%A Elgammal,A.
%A Shet,V.
%A Yacoob,Yaser
%A Davis, Larry S.
%K gesture recognition; hidden Markov models; discrete HMM; exemplar-based recognition; learning by example; system dynamics; nonparametric distribution estimation; exemplar space; temporal constraint; probabilistic framework; body pose matching; edge detection; feature extraction; view-based recognition; image motion; computer vision;
%X This paper addresses the problem of capturing the dynamics for exemplar-based recognition systems. Traditional HMMs provide a probabilistic tool to capture system dynamics, and in the exemplar paradigm, HMM states are typically coupled with the exemplars. Alternatively, we propose a non-parametric HMM approach that uses a discrete HMM with arbitrary states (decoupled from exemplars) to capture the dynamics over a large exemplar space, where a nonparametric estimation approach is used to model the exemplar distribution. This reduces the need for lengthy and non-optimal training of the HMM observation model. We use the proposed approach for view-based recognition of gestures. The approach is based on representing each gesture as a sequence of learned body poses (exemplars). The gestures are recognized through a probabilistic framework for matching these body poses and for imposing temporal constraints between different poses using the proposed non-parametric HMM.
%B Computer Vision and Pattern Recognition, 2003. Proceedings. 2003 IEEE Computer Society Conference on
%V 1
%P I-571 - I-578 vol.1
%8 2003/06//
%G eng
%R 10.1109/CVPR.2003.1211405
%0 Conference Paper
%B Multimedia and Expo, 2003. ICME '03. Proceedings. 2003 International Conference on
%D 2003
%T Shape and motion driven particle filtering for human body tracking
%A Yamamoto, T.
%A Chellappa, Rama
%K human body tracking; particle filtering framework; single static camera; 3D motion estimation; rotational motion; TV broadcast image sequences; video signal processing; filtering theory; cameras;
%X In this paper, we propose a method to recover 3D human body motion from video acquired by a single static camera. In order to estimate the complex state distribution of a human body, we adopt the particle filtering framework. We represent the human body using several layers of representation and compose the whole body step by step. In this way, more effective particles are generated and ineffective particles are removed as we process each layer. In order to deal with rotational motion, the frequency of rotation is obtained using a preprocessing operation. In the preprocessing step, the variance of the motion field in each image is computed, and the frequency of rotation is estimated. The estimated frequency is used for the state update in the algorithm. We successfully track the movement of figure skaters in a TV broadcast image sequence and recover the 3D shape and motion of the skater.
%B Multimedia and Expo, 2003. ICME '03. Proceedings. 2003 International Conference on
%V 3
%P III - 61-4 vol.3
%8 2003/07//
%G eng
%R 10.1109/ICME.2003.1221248
%0 Conference Paper
%B Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03). 2003 IEEE International Conference on
%D 2003
%T Simultaneous tracking and recognition of human faces from video
%A Zhou,Shaohua
%A Chellappa, Rama
%K face recognition; face tracking; human faces; time series model; Laplacian density; appearance changes; pose variations; illumination variations; video signal processing;
%X The paper investigates the interaction between tracking and recognition of human faces from video under a framework proposed earlier (Shaohua Zhou et al., Proc. 5th Int. Conf. on Face and Gesture Recog., 2002; Shaohua Zhou and Chellappa, R., Proc. European Conf. on Computer Vision, 2002), where a time series model is used to resolve the uncertainties in both tracking and recognition. However, our earlier efforts employed only a simple likelihood measurement in the form of a Laplacian density to deal with appearance changes between frames and between the observation and gallery images, yielding poor accuracies in both tracking and recognition when confronted by pose and illumination variations. The interaction between tracking and recognition was not well understood. We address the interdependence between tracking and recognition using a series of experiments and quantify the interacting nature of tracking and recognition.
%B Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03). 2003 IEEE International Conference on
%V 3
%P III - 225-8 vol.3
%8 2003/04//
%G eng
%R 10.1109/ICASSP.2003.1199148
%0 Conference Paper
%B Proceedings. IEEE Conference on Advanced Video and Signal Based Surveillance, 2003.
%D 2003
%T Towards a view invariant gait recognition algorithm
%A Kale, A.
%A Chowdhury, A.K.R.
%A Chellappa, Rama
%K view invariant gait recognition algorithm; biometrics (access control); gait analysis; spatio-temporal phenomenon; canonical view; camera calibration scheme; perspective projection model; optical flow; structure from motion equations; human gait; image sequences;
%X Human gait is a spatio-temporal phenomenon and typifies the motion characteristics of an individual. The gait of a person is easily recognizable when extracted from a side-view of the person. Accordingly, gait-recognition algorithms work best when presented with images where the person walks parallel to the camera image plane. However, it is not realistic to expect this assumption to be valid in most real-life scenarios. Hence, it is important to develop methods whereby the side-view can be generated from any other arbitrary view in a simple, yet accurate, manner. This is the main theme of the paper. We show that if the person is far enough from the camera, it is possible to synthesize a side view (referred to as canonical view) from any other arbitrary view using a single camera. Two methods are proposed for doing this: (i) using the perspective projection model; (ii) using the optical flow based structure from motion equations. A simple camera calibration scheme for this method is also proposed. Examples of synthesized views are presented. Preliminary testing with gait recognition algorithms gives encouraging results. A by-product of this method is a simple algorithm for synthesizing novel views of a planar scene.
%B Proceedings. IEEE Conference on Advanced Video and Signal Based Surveillance, 2003.
%P 143 - 150
%8 2003/07//
%G eng
%R 10.1109/AVSS.2003.1217914
%0 Conference Paper
%B Multimedia and Expo, 2003. ICME '03. Proceedings. 2003 International Conference on
%D 2003
%T View synthesis of articulating humans using visual hull
%A Yue,Zhanfeng
%A Liang Zhao
%A Chellappa, Rama
%K view synthesis; image-based visual hull; human body part segmentation; convex body parts; silhouette image; virtual view; texture mapping; human postures; gesture recognition; motion analysis; image reconstruction;
%X In this paper, we present a method that combines the image-based visual hull with human body part segmentation to overcome the inability of the visual hull method to reconstruct concave regions. The virtual silhouette image corresponding to the given viewing direction is first produced with the image-based visual hull. A human body part localization technique is used to segment the input images and the rendered virtual silhouette image into convex body parts. The body parts in the virtual view are generated separately from the corresponding body parts in the input views and then assembled together. The previously rendered silhouette image is used to locate the corresponding body parts in the input views and to avoid unconnected or squeezed regions in the assembled final view. Experiments show that this method improves the reconstruction of concave regions for human postures and texture mapping.
%B Multimedia and Expo, 2003. ICME '03. Proceedings. 2003 International Conference on
%V 1
%P I - 489-92 vol.1
%8 2003/07//
%G eng
%R 10.1109/ICME.2003.1220961
%0 Conference Paper
%B Multimedia and Expo, 2002. ICME '02. Proceedings. 2002 IEEE International Conference on
%D 2002
%T 3D face reconstruction from video using a generic model
%A Chowdhury, A.R.
%A Chellappa, Rama
%A Krishnamurthy, S.
%A Vo, T.
%K 3D face reconstruction; generic model; structure from motion (SfM) algorithms; Markov chain Monte Carlo (MCMC) sampling; optimization; video sequences; image signal processing; face recognition; surveillance; Markov processes; computer vision;
%X Reconstructing a 3D model of a human face from a video sequence is an important problem in computer vision, with applications to recognition, surveillance, multimedia etc. However, the quality of 3D reconstructions using structure from motion (SfM) algorithms is often not satisfactory. One common method of overcoming this problem is to use a generic model of a face. Existing work using this approach initializes the reconstruction algorithm with this generic model. The problem with this approach is that the algorithm can converge to a solution very close to this initial value, resulting in a reconstruction which resembles the generic model rather than the particular face in the video which needs to be modeled. We propose a method of 3D reconstruction of a human face from video in which the 3D reconstruction algorithm and the generic model are handled separately. A 3D estimate is obtained purely from the video sequence using SfM algorithms without use of the generic model. The final 3D model is obtained after combining the SfM estimate and the generic model using an energy function that corrects for the errors in the estimate by comparing local regions in the two models. The optimization is done using a Markov chain Monte Carlo (MCMC) sampling strategy. The main advantage of our algorithm over others is that it is able to retain the specific features of the face in the video sequence even when these features are different from those of the generic model. The evolution of the 3D model through the various stages of the algorithm is presented.
%B Multimedia and Expo, 2002. ICME '02. Proceedings. 2002 IEEE International Conference on
%V 1
%P 449 - 452 vol.1
%8 2002///
%G eng
%R 10.1109/ICME.2002.1035815
%0 Journal Article
%J Image Processing, IEEE Transactions on
%D 2002
%T A generic approach to simultaneous tracking and verification in video
%A Li,Baoxin
%A Chellappa, Rama
%K simultaneous tracking and verification; sequential Monte Carlo methods; posterior density estimation; probability density propagation; state space; object configuration; hypothesis testing; measurement vector; synthetic data; video sequences; vehicle tracking; face tracking; facial feature extraction; image sequence stabilization; temporal correspondence problem; visual tracking; performance evaluation; video signal processing;
%X A generic approach to simultaneous tracking and verification in video data is presented. The approach is based on posterior density estimation using sequential Monte Carlo methods. Visual tracking, which is in essence a temporal correspondence problem, is solved through probability density propagation, with the density being defined over a proper state space characterizing the object configuration. Verification is realized through hypothesis testing using the estimated posterior density. In its most basic form, verification can be performed as follows. Given a measurement vector Z and two hypotheses H_{0} and H_{1}, we first estimate posterior probabilities P(H_{0}|Z) and P(H_{1}|Z), and then choose the one with the larger posterior probability as the true hypothesis. Several applications of the approach are illustrated by experiments devised to evaluate its performance. The idea is first tested on synthetic data, and then experiments with real video sequences are presented, illustrating vehicle tracking and verification, human (face) tracking and verification, facial feature tracking, and image sequence stabilization.
%B Image Processing, IEEE Transactions on
%V 11
%P 530 - 544
%8 2002/05//
%@ 1057-7149
%G eng
%N 5
%R 10.1109/TIP.2002.1006400
%0 Journal Article
%J Image Processing, IEEE Transactions on
%D 2002
%T Optimal edge-based shape detection
%A Moon, H.
%A Chellappa, Rama
%A Rosenfeld, A.
%K edge-based shape detection; two-dimensional shapes; 1D optimal step edge operator; derivative of double exponential (DODE) filter; boundary contour; cross section; step function; noise power; mean squared error; error propagation; shape geometry; global contour detection; localization performance; statistical properties; imaging conditions; facial feature detection; vehicle detection; aerial images; contour tracking; video; filtering theory; optimisation;
%X We propose an approach to accurately detecting two-dimensional (2-D) shapes. The cross section of the shape boundary is modeled as a step function. We first derive a one-dimensional (1-D) optimal step edge operator, which minimizes both the noise power and the mean squared error between the input and the filter output. This operator is found to be the derivative of the double exponential (DODE) function, originally derived by Ben-Arie and Rao (1994). We define an operator for shape detection by extending the DODE filter along the shape's boundary contour. The responses are accumulated at the centroid of the operator to estimate the likelihood of the presence of the given shape. This method of detecting a shape is in fact a natural extension of the task of edge detection at the pixel level to the problem of global contour detection. This simple filtering scheme also provides a tool for a systematic analysis of edge-based shape detection. We investigate how the error is propagated by the shape geometry. We have found that, under general assumptions, the operator is locally linear at the peak of the response. We compute the expected shape of the response and derive some of its statistical properties. This enables us to predict both its localization and detection performance and adjust its parameters according to imaging conditions and given performance specifications. Applications to the problem of vehicle detection in aerial images, human facial feature detection, and contour tracking in video are presented.
%B Image Processing, IEEE Transactions on
%V 11
%P 1209 - 1227
%8 2002/11//
%@ 1057-7149
%G eng
%N 11
%R 10.1109/TIP.2002.800896
%0 Conference Paper
%B Image Processing. 2002. Proceedings. 2002 International Conference on
%D 2002
%T Probabilistic recognition of human faces from video
%A Chellappa, Rama
%A Kruger, V.
%A Zhou,Shaohua
%K face recognition; probabilistic recognition; human faces; video; Bayesian propagation; posterior distribution; sequential importance sampling; uncertainty handling; observation likelihood; still images; face gallery; still-to-video; video-to-video; NIST/USF; CMU; Bayes methods; image signal processing;
%X Most present face recognition approaches recognize faces based on still images. We present a novel approach to recognizing faces in video. In that scenario, the face gallery may consist of still images or may be derived from videos. For evidence integration, we use classical Bayesian propagation over time and compute the posterior distribution using sequential importance sampling. The probabilistic approach allows us to handle uncertainties in a systematic manner. Experimental results using videos collected by NIST/USF and CMU illustrate the effectiveness of this approach in both still-to-video and video-to-video scenarios with appropriate model choices.
%B Image Processing. 2002. Proceedings. 2002 International Conference on
%V 1
%P I-41 - I-44 vol.1
%8 2002///
%G eng
%R 10.1109/ICIP.2002.1037954
%0 Conference Paper
%B Pattern Recognition, 2002. Proceedings. 16th International Conference on
%D 2002
%T Quasi-invariants for human action representation and recognition
%A Parameswaran, V.
%A Chellappa, Rama
%K human action representation; human action recognition; quasi-invariants; canonical body poses; viewpoint change tolerance; 2D invariance; image motion analysis;
%X Although human action recognition has been the subject of much research in the past, the issue of viewpoint invariance has received scarce attention. In this paper, we present an approach to detect human action with a high tolerance to viewpoint change. Canonical body poses are modeled in a view invariant manner to enable detection from a general viewpoint. While there exist no invariants for 3D to 2D projection, there exists a wealth of techniques in 2D invariance that can be used to advantage in 3D to 2D projection. We employ 2D invariants to recognize canonical poses of the human body leading to an effective way to represent and recognize human action which we evaluate theoretically and experimentally on 2D projections of publicly available human motion capture data.
%B Pattern Recognition, 2002. Proceedings. 16th International Conference on
%V 1
%P 307 - 310 vol.1
%8 2002///
%G eng
%R 10.1109/ICPR.2002.1044699
%0 Conference Paper
%B Pattern Recognition, 2002. Proceedings. 16th International Conference on
%D 2002
%T A robust algorithm for probabilistic human recognition from video
%A Zhou,Shaohua
%A Chellappa, Rama
%K probabilistic human recognition; robust algorithm; sequential Monte Carlo methods; parameterized time series state-space model; temporal continuity; image recognition; time series;
%X Human recognition from video requires solving two tasks, recognition and tracking, simultaneously. This leads to a parameterized time series state-space model, representing both the motion and identity of the human. Sequential Monte Carlo (SMC) algorithms, like Condensation, can be developed to offer numerical solutions to this model. However, in outdoor environments, the solution is more likely to diverge from the foreground, causing failures in both recognition and tracking. In this paper we propose an approach for tackling this problem by incorporating the constraint of temporal continuity in the observations. Experimental results demonstrate improvements over the Condensation counterpart.
%B Pattern Recognition, 2002. Proceedings. 16th International Conference on
%V 1
%P 226 - 229 vol.1
%8 2002///
%G eng
%R 10.1109/ICPR.2002.1044661