TY - CONF
T1 - Kernel PLS regression for robust monocular pose estimation
T2 - Computer Vision and Pattern Recognition Workshops (CVPRW), 2011 IEEE Computer Society Conference on
Y1 - 2011
A1 - Dondera, R.
A1 - Davis, Larry S.
KW - 3D pose estimation
KW - GP regression
KW - Gaussian processes
KW - Kernel PLS regression
KW - human detection
KW - monocular images
KW - nonlinear correlations
KW - object detection
KW - projection to latent structures
KW - realistic images
KW - regression analysis
KW - rendering (computer graphics)
KW - rendering software
KW - robust pose estimation
AB - We evaluate the robustness of five regression techniques for monocular 3D pose estimation. While most of the discriminative pose estimation methods focus on overcoming the fundamental problem of insufficient training data, we are interested in characterizing performance improvement for increasingly large training sets. Commercially available rendering software allows us to efficiently generate large numbers of realistic images of poses from diverse actions. Inspired by recent work in human detection, we apply PLS and kPLS regression to pose estimation. We observe that kPLS regression incrementally approximates GP regression using the strongest nonlinear correlations between image features and pose. This provides robustness, and our experiments show kPLS regression is more robust than two GP-based state-of-the-art methods for pose estimation.
JA - Computer Vision and Pattern Recognition Workshops (CVPRW), 2011 IEEE Computer Society Conference on
M3 - 10.1109/CVPRW.2011.5981750
ER -
TY - JOUR
T1 - Social Snapshot: A System for Temporally Coupled Social Photography
JF - Computer Graphics and Applications, IEEE
Y1 - 2011
A1 - Patro, R.
A1 - Ip, Cheuk Yiu
A1 - Bista, S.
A1 - Varshney, Amitabh
KW - 3D reconstruction
KW - data acquisition
KW - social photography
KW - social sciences computing
KW - social snapshot
KW - spatiotemporal data acquisition
KW - temporally coupled photography
AB - Social Snapshot actively acquires and reconstructs temporally dynamic data. The system enables spatiotemporal 3D photography using commodity devices, assisted by their auxiliary sensors and network functionality. It engages users, making them active rather than passive participants in data acquisition.
VL - 31
SN - 0272-1716
CP - 1
M3 - 10.1109/MCG.2010.107
ER -
TY - CONF
T1 - Trainable 3D recognition using stereo matching
T2 - Computer Vision Workshops (ICCV Workshops), 2011 IEEE International Conference on
Y1 - 2011
A1 - Castillo, C. D.
A1 - Jacobs, David W.
KW - 2D images
KW - 3D object classification
KW - CMU PIE dataset
KW - face recognition
KW - image classification
KW - image descriptor
KW - image matching
KW - image processing
KW - object recognition
KW - occlusion
KW - pose estimation
KW - pose variation
KW - solid modelling
KW - stereo matching
KW - trainable 3D recognition
AB - Stereo matching has been used for face recognition in the presence of pose variation. In this approach, stereo matching is used to compare two 2-D images based on correspondences that reflect the effects of viewpoint variation and allow for occlusion. We show how to use stereo matching to derive image descriptors that can be used to train a classifier. This improves face recognition performance, producing the best published results on the CMU PIE dataset. We also demonstrate that classification based on stereo matching can be used for general object classification in the presence of pose variation. In preliminary experiments we show promising results on the 3D object class dataset, a standard and challenging 3D classification benchmark.
JA - Computer Vision Workshops (ICCV Workshops), 2011 IEEE International Conference on
M3 - 10.1109/ICCVW.2011.6130301
ER -
TY - CONF
T1 - Fast directional chamfer matching
T2 - Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on
Y1 - 2010
A1 - Liu, Ming-Yu
A1 - Tuzel, O.
A1 - Veeraraghavan, A.
A1 - Chellappa, Rama
KW - 3D distance transforms
KW - chamfer matching
KW - cost distribution
KW - cost function
KW - cost variation
KW - directional chamfer matching
KW - directional integral images
KW - edge detection
KW - edge orientation information
KW - gallery of shapes
KW - object localization
KW - piecewise smooth cost function
KW - shape matching algorithms
KW - single hand-drawn example
KW - sublinear time algorithm
KW - transforms
AB - We study the object localization problem in images given a single hand-drawn example or a gallery of shapes as the object model. Although many shape matching algorithms have been proposed for the problem over the decades, chamfer matching remains the preferred method when speed and robustness are considered. In this paper, we significantly improve the accuracy of chamfer matching while reducing the computational time from linear to sublinear (shown empirically). Specifically, we incorporate edge orientation information in the matching algorithm such that the resulting cost function is piecewise smooth and the cost variation is tightly bounded. Moreover, we present a sublinear time algorithm for exact computation of the directional chamfer matching score using techniques from 3D distance transforms and directional integral images. In addition, the smooth cost function allows us to bound the cost distribution of large neighborhoods and skip the bad hypotheses within. Experiments show that the proposed approach speeds up the original chamfer matching by up to 45×, and it is much faster than many state-of-the-art techniques while the accuracy is comparable.
JA - Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on
M3 - 10.1109/CVPR.2010.5539837
ER -
TY - CONF
T1 - Pose estimation in heavy clutter using a multi-flash camera
T2 - Robotics and Automation (ICRA), 2010 IEEE International Conference on
Y1 - 2010
A1 - Liu, Ming-Yu
A1 - Tuzel, O.
A1 - Veeraraghavan, A.
A1 - Chellappa, Rama
A1 - Agrawal, A.
A1 - Okuda, H.
KW - 3D distance transforms
KW - angular estimation
KW - binary depth edge maps
KW - cameras
KW - cost function
KW - depth edges
KW - image matching
KW - integral images
KW - location estimation
KW - multiflash camera
KW - multiview pose-refinement algorithm
KW - object detection
KW - object localization
KW - pose estimation
KW - robot vision
KW - texture edges
AB - We propose a novel solution to object detection, localization and pose estimation with applications in robot vision. The proposed method is especially applicable when the objects of interest may not be richly textured and are immersed in heavy clutter. We show that a multi-flash camera (MFC) provides accurate separation of depth edges and texture edges in such scenes. Then, we reformulate the problem as one of finding matches between the depth edges obtained in one or more MFC images and the rendered depth edges that are computed offline using 3D CAD models of the objects. In order to facilitate accurate matching of these binary depth edge maps, we introduce a novel cost function that respects both the position and the local orientation of each edge pixel. This cost function is significantly superior to the traditional chamfer cost and leads to accurate matching even in heavily cluttered scenes where traditional methods are unreliable. We present a sub-linear time algorithm to compute the cost function using techniques from 3D distance transforms and integral images. Finally, we also propose a multi-view based pose-refinement algorithm to improve the estimated pose. We implemented the algorithm on an industrial robot arm and obtained location and angular estimation accuracy of the order of 1 mm and 2° respectively for a variety of parts with minimal texture.
JA - Robotics and Automation (ICRA), 2010 IEEE International Conference on
M3 - 10.1109/ROBOT.2010.5509897
ER -
TY - CONF
T1 - Pose-robust albedo estimation from a single image
T2 - Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on
Y1 - 2010
A1 - Biswas, S.
A1 - Chellappa, Rama
KW - 3D pose
KW - albedo estimation
KW - class-specific statistics
KW - computer vision
KW - face recognition
KW - filtering theory
KW - illumination-insensitive matching
KW - pose estimation
KW - pose information
KW - pose-robust albedo estimation
KW - shape recovery
KW - single nonfrontal face image
KW - stochastic filtering
AB - We present a stochastic filtering approach to albedo estimation from a single non-frontal face image. Albedo estimation has far-reaching applications in various computer vision tasks such as illumination-insensitive matching and shape recovery. We extend a previously proposed formulation that assumes the face is in a known pose, and present an algorithm that can perform albedo estimation from a single image even when pose information is inaccurate. The 3D pose of the input face image is obtained as a byproduct of the algorithm. The proposed approach utilizes class-specific statistics of faces to iteratively improve albedo and pose estimates. Illustrations and experimental results are provided to show the effectiveness of the approach. We highlight the usefulness of the method for the task of matching faces across variations in pose and illumination. The facial pose estimates obtained are also compared against ground truth.
JA - Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on
M3 - 10.1109/CVPR.2010.5539987
ER -
TY - CONF
T1 - Robust RVM regression using sparse outlier model
T2 - Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on
Y1 - 2010
A1 - Mitra, K.
A1 - Veeraraghavan, A.
A1 - Chellappa, Rama
KW - 3D human pose
KW - Bayesian approach
KW - Gaussian noise model
KW - basis pursuit denoising
KW - computer vision
KW - image denoising
KW - lighting estimation
KW - regression analysis
KW - relevance vector machine
KW - robust RVM regression
KW - sparse outlier model
AB - Kernel regression techniques such as Relevance Vector Machine (RVM) regression, Support Vector Regression and Gaussian processes are widely used for solving many computer vision problems such as age, head pose, 3D human pose and lighting estimation. However, the presence of outliers in the training dataset makes the estimates from these regression techniques unreliable. In this paper, we propose robust versions of RVM regression that can handle outliers in the training dataset. We decompose the noise term in the RVM formulation into a (sparse) outlier noise term and a Gaussian noise term. We then estimate the outlier noise along with the model parameters. We present two approaches for solving this estimation problem: (1) a Bayesian approach, which essentially follows the RVM framework, and (2) an optimization approach based on Basis Pursuit Denoising. In the Bayesian approach, the robust RVM problem essentially becomes a bigger RVM problem with the advantage that it can be solved efficiently by a fast algorithm. Empirical evaluations and real experiments on image denoising and age estimation demonstrate that the robust RVM algorithms outperform standard RVM regression.
JA - Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on
M3 - 10.1109/CVPR.2010.5539861
ER -
TY - CONF
T1 - Visibility constraints on features of 3D objects
T2 - Computer Vision and Pattern Recognition, 2009. CVPR 2009. IEEE Conference on
Y1 - 2009
A1 - Basri, R.
A1 - Felzenszwalb, P. F.
A1 - Girshick, R. B.
A1 - Jacobs, David W.
A1 - Klivans, C. J.
KW - 3D object features
KW - COIL dataset
KW - computational complexity
KW - image-based framework
KW - iterative algorithms
KW - iterative methods
KW - NP-hard
KW - object recognition
KW - synthetic data
KW - synthetic images
KW - three-dimensional object recognition
KW - viewing sphere
KW - visibility constraints
AB - To recognize three-dimensional objects it is important to model how their appearances can change due to changes in viewpoint. A key aspect of this involves understanding which object features can be simultaneously visible under different viewpoints. We address this problem in an image-based framework, in which we use a limited number of images of an object taken from unknown viewpoints to determine which subsets of features might be simultaneously visible in other views. This leads to the problem of determining whether a set of images, each containing a set of features, is consistent with a single 3D object. We assume that each feature is visible from a disk of viewpoints on the viewing sphere. In this case we show the problem is NP-hard in general, but can be solved efficiently when all views come from a circle on the viewing sphere. We also give iterative algorithms that can handle noisy data and converge to locally optimal solutions in the general case. Our techniques can also be used to recover viewpoint information from the set of features that are visible in different images. We show that these algorithms perform well both on synthetic data and images from the COIL dataset.
JA - Computer Vision and Pattern Recognition, 2009. CVPR 2009. IEEE Conference on
M3 - 10.1109/CVPR.2009.5206726
ER -
TY - CONF
T1 - Compressed sensing for multi-view tracking and 3-D voxel reconstruction
T2 - Image Processing, 2008. ICIP 2008. 15th IEEE International Conference on
Y1 - 2008
A1 - Reddy, D.
A1 - Sankaranarayanan, A. C.
A1 - Cevher, V.
A1 - Chellappa, Rama
KW - 3D voxel reconstruction
KW - background-subtracted silhouettes
KW - compressed sensing
KW - CS theory
KW - image reconstruction
KW - multi-view estimation problems
KW - multi-view tracking
KW - random projections
KW - silhouette sparsity
KW - video coding
AB - Compressed sensing (CS) suggests that a signal, sparse in some basis, can be recovered from a small number of random projections. In this paper, we apply CS theory to sparse background-subtracted silhouettes and show the usefulness of such an approach in various multi-view estimation problems. The sparsity of the silhouette images corresponds to sparsity of object parameters (location, volume, etc.) in the scene. We use random projections (compressed measurements) of the silhouette images for directly recovering object parameters in the scene coordinates. To keep the computational requirements of this recovery procedure reasonable, we tessellate the scene into a set of non-overlapping lines and perform estimation on each of these lines. Our method is scalable in the number of cameras and utilizes very few measurements for transmission among cameras. We illustrate the usefulness of our approach for multi-view tracking and 3-D voxel reconstruction problems.
JA - Image Processing, 2008. ICIP 2008. 15th IEEE International Conference on
M3 - 10.1109/ICIP.2008.4711731
ER -
TY - CONF
T1 - Multi-Scale 3D Morse Complexes
T2 - Computational Sciences and Its Applications, 2008. ICCSA '08. International Conference on
Y1 - 2008
A1 - Comic, L.
A1 - De Floriani, Leila
KW - 3D Morse complexes
KW - critical points
KW - data analysis
KW - duality (mathematics)
KW - integral lines
KW - inverse expansion operations
KW - inverse problems
KW - mathematical morphology
KW - mathematics computing
KW - multi-scale morphology
KW - topology
AB - Morse theory studies the relationship between the topology of a manifold M and the critical points of a scalar function f defined over M. Morse and Morse-Smale complexes, defined by critical points and integral lines of f, induce a subdivision of M into regions of uniform gradient flow, representing the morphology of M in a compact way. Function f can be simplified by canceling its critical points in pairs, thus simplifying the morphological representation of M given by the Morse and Morse-Smale complexes of f. Here, we propose a compact representation for the two Morse complexes in 3D, which is based on encoding the incidence relations of their cells and on exploiting the duality between the complexes. We define cancellation operations, and their inverse expansion operations, on the Morse complexes and on their dual representation. We propose a multi-scale representation of the Morse complexes which provides a description of such complexes, and thus of the morphology of a 3D scalar field, at different levels of abstraction. This representation also allows us to perform selective refinement operations to extract descriptions of the complexes that vary in different parts of the domain, improving efficiency on large data sets and eliminating noise in the data through topology simplification.
JA - Computational Sciences and Its Applications, 2008. ICCSA '08. International Conference on
M3 - 10.1109/ICCSA.2008.10
ER -
TY - CONF
T1 - Multimodal Tracking for Smart Videoconferencing and Video Surveillance
T2 - Computer Vision and Pattern Recognition, 2007. CVPR '07. IEEE Conference on
Y1 - 2007
A1 - Zotkin, Dmitry N.
A1 - Raykar, V. C.
A1 - Duraiswami, Ramani
A1 - Davis, Larry S.
KW - 3D motion
KW - Monte-Carlo simulations
KW - least squares approximations
KW - maximum likelihood estimator
KW - microphone arrays
KW - multimodal tracking
KW - multiple cameras
KW - multiple microphone arrays
KW - nonlinear least squares problem
KW - particle filtering (numerical methods)
KW - self-calibration algorithm
KW - smart videoconferencing
KW - teleconferencing
KW - video signal processing
KW - video surveillance
AB - Many applications require the ability to track the 3-D motion of subjects. We build a particle filter based framework for multimodal tracking using multiple cameras and multiple microphone arrays. In order to calibrate the resulting system, we propose a method to determine the locations of all microphones using at least five loudspeakers, under the assumption that each loudspeaker has a microphone very close to it. We derive the maximum likelihood (ML) estimator, which reduces to solving a nonlinear least squares problem. We verify the correctness and robustness of the multimodal tracker and of the self-calibration algorithm both with Monte-Carlo simulations and on real data from three experimental setups.
JA - Computer Vision and Pattern Recognition, 2007. CVPR '07. IEEE Conference on
M3 - 10.1109/CVPR.2007.383525
ER -
TY - CONF
T1 - Simulation and Analysis of Human Walking Motion
T2 - Acoustics, Speech and Signal Processing, 2007. ICASSP 2007. IEEE International Conference on
Y1 - 2007
A1 - Nandy, K.
A1 - Chellappa, Rama
KW - 3D geometry
KW - angle sequences
KW - dynamic time warping
KW - feature extraction
KW - healthcare
KW - human walking motion
KW - image motion analysis
KW - image sequences
KW - inverse dynamics
KW - kinematic chain
KW - mechanical model
KW - recursive Newton-Euler algorithm
KW - revolute joints
KW - rigid links
KW - surveillance
KW - time series models
KW - torque sequences
KW - walking patterns
AB - Simulation and analysis of human walking motion has applications in surveillance and healthcare. In this paper we discuss an approach for modeling human walking motion using a mechanical model in the form of a kinematic chain consisting of rigid links and revolute joints. Our goal is to discriminate different types of walking motions using information such as joint torque and angle sequences extracted from the model. The angle sequences are initially extracted using 3D geometry. From these angle sequences we extract the torque sequences using a recursive Newton-Euler inverse dynamics algorithm. Time series models and dynamic time warping of the torque and angle sequences are used to characterize and discriminate different walking patterns. A forward dynamics algorithm is also presented for synthesizing different walking sequences, such as limping, from a normal walking torque sequence.
JA - Acoustics, Speech and Signal Processing, 2007. ICASSP 2007. IEEE International Conference on
VL - 1
M3 - 10.1109/ICASSP.2007.366028
ER -
TY - CONF
T1 - Headphone-Based Reproduction of 3D Auditory Scenes Captured by Spherical/Hemispherical Microphone Arrays
T2 - Acoustics, Speech and Signal Processing, 2006. ICASSP 2006 Proceedings. 2006 IEEE International Conference on
Y1 - 2006
A1 - Li, Zhiyun
A1 - Duraiswami, Ramani
KW - 3D auditory scenes
KW - array signal processing
KW - audio signal processing
KW - harmonic analysis
KW - head related transfer function
KW - headphone-based reproduction
KW - headphones
KW - hemispherical microphone arrays
KW - microphone arrays
KW - orthogonal beam-space
KW - spatial filters
KW - spherical harmonics
KW - spherical microphone arrays
AB - We propose a method to reproduce 3D auditory scenes captured by spherical microphone arrays over headphones. The algorithm employs expansions of the captured sound and the head related transfer function over the sphere and uses the orthonormality of the spherical harmonics. Using a spherical microphone array, we first record the 3D auditory scene; the recordings are then spatially filtered and reproduced through headphones in the orthogonal beam-space of the head related transfer functions (HRTFs). We use the KEMAR HRTF measurements to verify our algorithm.
JA - Acoustics, Speech and Signal Processing, 2006. ICASSP 2006 Proceedings. 2006 IEEE International Conference on
VL - 5
M3 - 10.1109/ICASSP.2006.1661281
ER -
TY - CONF
T1 - Invariant Geometric Representation of 3D Point Clouds for Registration and Matching
T2 - Image Processing, 2006 IEEE International Conference on
Y1 - 2006
A1 - Biswas, S.
A1 - Aggarwal, G.
A1 - Chellappa, Rama
KW - 3D point cloud
KW - computer graphics
KW - geophysical signal processing
KW - image matching
KW - image reconstruction
KW - image registration
KW - image representation
KW - implicit function value
KW - interpolation
KW - invariant geometric representation
KW - variational interpolation technique
AB - Though implicit representations of surfaces have often been used for various computer graphics tasks like modeling and morphing of objects, they have rarely been used for registration and matching of 3D point clouds. Unlike in graphics, where the goal is precise reconstruction, we use isosurfaces to derive a smooth and approximate representation of the underlying point cloud, which helps in generalization. Implicit surfaces are generated using a variational interpolation technique. Implicit function values on a set of concentric spheres around the object's 3D point cloud are used as features for matching. Geometric invariance is achieved by decomposing the implicit-value feature set into spherical harmonics. The decomposition provides a compact representation of 3D point clouds while achieving rotation invariance.
JA - Image Processing, 2006 IEEE International Conference on
M3 - 10.1109/ICIP.2006.312542
ER -
TY - CONF
T1 - Motion Based Correspondence for 3D Tracking of Multiple Dim Objects
T2 - Acoustics, Speech and Signal Processing, 2006. ICASSP 2006 Proceedings. 2006 IEEE International Conference on
Y1 - 2006
A1 - Veeraraghavan, A.
A1 - Srinivasan, M.
A1 - Chellappa, Rama
A1 - Baird, E.
A1 - Lamont, R.
KW - 3D tracking
KW - feature extraction
KW - image motion analysis
KW - motion analysis
KW - motion based correspondence
KW - motion features
KW - multiple dim objects
KW - video cameras
KW - video signal processing
AB - Tracking multiple objects in a video is a demanding task that is frequently encountered in systems such as surveillance and motion analysis. Tracking objects in 3D requires the use of multiple cameras, and establishing correspondence between objects seen in the various cameras is a nontrivial task. Specifically, when the targets are dim or very far away from the camera, appearance cannot be used to establish this correspondence. Here, we propose a technique to establish correspondence across cameras using the motion features extracted from the targets, even when the relative position of the cameras is unknown. Experimental results are provided for the problem of tracking multiple bees in natural flight using two cameras. The reconstructed 3D flight paths of the bees show some interesting flight patterns.
JA - Acoustics, Speech and Signal Processing, 2006. ICASSP 2006 Proceedings. 2006 IEEE International Conference on
VL - 2
M3 - 10.1109/ICASSP.2006.1660431
ER -
TY - JOUR
T1 - Numerical analysis of plasmon resonances in nanoparticles
JF - Magnetics, IEEE Transactions on
Y1 - 2006
A1 - Mayergoyz, Isaak D.
A1 - Zhang, Zhenyu
KW - 3D nanoparticles
KW - boundary integral equations
KW - eigenvalue problem
KW - eigenvalues and eigenfunctions
KW - electrostatics
KW - nanoparticles
KW - numerical analysis
KW - permittivity
KW - plasmon resonances
KW - surface plasmon resonance
AB - Plasmon (electrostatic) resonances in nanoparticles are treated as an eigenvalue problem for a specific boundary integral equation. This leads to direct calculation of resonance values of permittivity and resonance frequency. The numerical technique is illustrated by examples of calculation of resonance frequencies for three-dimensional nanoparticles.
VL - 42
SN - 0018-9464
CP - 4
M3 - 10.1109/TMAG.2006.870976
ER -
TY - CONF
T1 - On the equivalence of common approaches to lighting insensitive recognition
T2 - Computer Vision, 2005. ICCV 2005. Tenth IEEE International Conference on
Y1 - 2005
A1 - Osadchy, M.
A1 - Jacobs, David W.
A1 - Lindenbaum, M.
KW - 3D scenes
KW - Gaussian filters
KW - gradient direction difference
KW - image intensity
KW - image recognition
KW - image segmentation
KW - lighting conditions
KW - lighting insensitive recognition
KW - lighting variation
KW - monotonic cosine function
AB - Lighting variation is commonly handled by methods invariant to additive and multiplicative changes in image intensity. It has been demonstrated that comparing images using the direction of the gradient can produce broader insensitivity to changes in lighting conditions, even for 3D scenes. We analyze two common invariant approaches to image comparison: normalized correlation using small correlation windows, and comparison based on a large set of oriented difference-of-Gaussian filters. We show analytically that these methods calculate a monotonic (cosine) function of the gradient direction difference and hence are equivalent to the direction-of-gradient method. Our analysis is supported with experiments on both synthetic and real scenes.
JA - Computer Vision, 2005. ICCV 2005. Tenth IEEE International Conference on
VL - 2
M3 - 10.1109/ICCV.2005.179
ER -
TY - CONF
T1 - Moving Object Segmentation and Dynamic Scene Reconstruction Using Two Frames
T2 - Acoustics, Speech, and Signal Processing, 2005. Proceedings. (ICASSP '05). IEEE International Conference on
Y1 - 2005
A1 - Agrawal, A.
A1 - Chellappa, Rama
KW - 3D structure
KW - dynamic scene reconstruction
KW - ego-motion estimation
KW - image motion analysis
KW - image segmentation
KW - independent motion
KW - intensity images
KW - least median of squares
KW - least squares methods
KW - moving object segmentation
KW - parametric flow model
KW - static scene
KW - subspace constraints
KW - surface parallax
KW - translational motion
KW - two-frame method
KW - unconstrained motion
KW - video signal processing
JA - Acoustics, Speech, and Signal Processing, 2005. Proceedings. (ICASSP '05). IEEE International Conference on
VL - 2
M3 - 10.1109/ICASSP.2005.1415502
ER -
TY - CONF
T1 - A robust and self-reconfigurable design of spherical microphone array for multi-resolution beamforming
T2 - Acoustics, Speech, and Signal Processing, 2005. Proceedings. (ICASSP '05). IEEE International Conference on
Y1 - 2005
A1 - Li, Zhiyun
A1 - Duraiswami, Ramani
KW - 3D soundfield sampling
KW - anti-terrorism
KW - array signal processing
KW - audio signal processing
KW - beam steering
KW - beampattern optimization
KW - directivity
KW - frequency response
KW - microphone array reorganization
KW - microphone arrays
KW - multiresolution beamforming
KW - omnidirectional microphones
KW - robustness
KW - self-reconfigurable spherical microphone array
AB - We describe a robust and self-reconfigurable design of a spherical microphone array for beamforming. Our approach achieves a multi-resolution spherical beamformer with performance that is either optimal in the approximation of desired beampattern or is optimal in the directivity achieved, both robustly. Our implementation converges to the optimal performance quickly while exactly satisfying the specified frequency response and robustness constraint in each iteration step without accumulated round-off errors. The advantage of this design lies in its robustness and self-reconfiguration in microphone array reorganization, such as microphone failure, which is highly desirable in online maintenance and anti-terrorism. Design examples and simulation results are presented.
JA - Acoustics, Speech, and Signal Processing, 2005. Proceedings. (ICASSP '05). IEEE International Conference on
VL - 4
M3 - 10.1109/ICASSP.2005.1416214
ER -
TY - CONF
T1 - 3D model refinement using surface-parallax
T2 - Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on
Y1 - 2004
A1 - Agrawal, A.
A1 - Chellappa, Rama
KW - 3D model refinement
KW - adaptive windowing
KW - arbitrary surfaces
KW - camera motion estimation
KW - coarse depth map
KW - computer vision
KW - digital elevation map (DEM)
KW - epipolar field
KW - image reconstruction
KW - image sequences
KW - incomplete depth map
KW - intensity image sequence
KW - motion compensation
KW - plane-parallax recovery
KW - surface parallax
KW - urban environments
AB - We present an approach to update and refine coarse 3D models of urban environments from a sequence of intensity images using surface parallax. This generalizes the plane-parallax recovery methods to surface-parallax using arbitrary surfaces. A coarse and potentially incomplete depth map of the scene obtained from a digital elevation map (DEM) is used as a reference surface which is refined and updated using this approach. The reference depth map is used to estimate the camera motion and the motion of the 3D points on the reference surface is compensated. The resulting parallax, which is an epipolar field, is estimated using an adaptive windowing technique and used to obtain the refined depth map.
JA - Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on
VL - 3
M3 - 10.1109/ICASSP.2004.1326537
ER -
TY - CONF
T1 - Appearance-based tracking and recognition using the 3D trilinear tensor
T2 - Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on
Y1 - 2004
A1 - Shao, Jie
A1 - Zhou, S. K.
A1 - Chellappa, Rama
KW - 3D geometrical structure
KW - 3D perspective transformation
KW - adaptive algorithm
KW - affine transformation
KW - airborne video
KW - appearance templates
KW - appearance-based tracking
KW - image representation
KW - mathematical operators
KW - novel view synthesis
KW - object recognition
KW - template updating
KW - tensor estimation
KW - trilinear tensor operator
KW - video signal processing
KW - video-based recognition
AB - The paper presents an appearance-based adaptive algorithm for simultaneous tracking and recognition by generalizing the transformation model to 3D perspective transformation. A trilinear tensor operator is used to represent the 3D geometrical structure. The tensor is estimated by predicting the corresponding points using the existing affine-transformation based algorithm. The estimated tensor is used to synthesize novel views to update the appearance templates. Some experimental results using airborne video are presented.
JA - Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on
VL - 3
M3 - 10.1109/ICASSP.2004.1326619
ER -
TY - CONF
T1 - Dynamic distortion control for 3-D embedded wavelet video over multiuser OFDM networks
T2 - Global Telecommunications Conference, 2004. GLOBECOM '04. IEEE
Y1 - 2004
A1 - Su, Guan-Ming
A1 - Han, Zhu
A1 - Wu, M.
A1 - Liu, K. J. R.
KW - 3D
KW - 802.11a;
KW - channels;
KW - codec;
KW - codecs;
KW - communication;
KW - control;
KW - deviation;
KW - distortion
KW - diversity
KW - diversity;
KW - downlink
KW - dynamic
KW - embedded
KW - fairness;
KW - Frequency
KW - IEEE
KW - LAN;
KW - maximal
KW - minimax
KW - minimization;
KW - modulation;
KW - multimedia
KW - multiuser
KW - OFDM
KW - OFDM;
KW - PSNR
KW - rate
KW - reception;
KW - streaming;
KW - systems;
KW - techniques;
KW - theory;
KW - TIME
KW - transforms;
KW - video
KW - video;
KW - wavelet
KW - wireless
AB - In this paper, we propose a system to transmit multiple 3D embedded wavelet video programs over downlink multiuser OFDM. We consider the fairness among users and formulate the problem as minimizing the users' maximal distortion subject to power, rate, and subcarrier constraints. By exploring frequency, time, and multiuser diversity in OFDM and flexibility of the 3D embedded wavelet video codec, the proposed algorithm can achieve fair video qualities among all users. Compared to a scheme similar to the current multiuser OFDM standard (IEEE 802.11a), the proposed scheme outperforms it by 1-5 dB on the worst received PSNR among all users and has much smaller PSNR deviation.
JA - Global Telecommunications Conference, 2004. GLOBECOM '04. IEEE
VL - 2
M3 - 10.1109/GLOCOM.2004.1378042
ER -
TY - CONF
T1 - Multiple view tracking of humans modelled by kinematic chains
T2 - Image Processing, 2004. ICIP '04. 2004 International Conference on
Y1 - 2004
A1 - Sundaresan, A.
A1 - Chellappa, Rama
A1 - Roy-Chowdhury, A.K.
KW - 3D
KW - algorithm;
KW - analysis;
KW - body
KW - calibrated
KW - cameras;
KW - chain
KW - displacement;
KW - error
KW - estimation;
KW - human
KW - image
KW - iterative
KW - kinematic
KW - kinematics;
KW - methods;
KW - model;
KW - MOTION
KW - motion;
KW - multiple
KW - parameters;
KW - perspective
KW - Pixel
KW - processing;
KW - projection
KW - sequences;
KW - signal
KW - tracking;
KW - video
KW - view
AB - We use a kinematic chain to model human body motion. We estimate the kinematic chain motion parameters using pixel displacements calculated from video sequences obtained from multiple calibrated cameras to perform tracking. We derive a linear relation between the 2D motion of pixels in terms of the 3D motion parameters of various body parts using a perspective projection model for the cameras, a rigid body motion model for the base body and the kinematic chain model for the body parts. An error analysis of the estimator is provided, leading to an iterative algorithm for calculating the motion parameters from the pixel displacements. We provide experimental results to demonstrate the accuracy of our formulation. We also compare our iterative algorithm to the noniterative algorithm and discuss its robustness in the presence of noise.
JA - Image Processing, 2004. ICIP '04. 2004 International Conference on
VL - 2
M3 - 10.1109/ICIP.2004.1419472
ER -
TY - CONF
T1 - Robust Bayesian cameras motion estimation using random sampling
T2 - Image Processing, 2004. ICIP '04. 2004 International Conference on
Y1 - 2004
A1 - Qian, G.
A1 - Chellappa, Rama
A1 - Zheng, Qinfen
KW - 3D
KW - baseline
KW - Bayesian
KW - CAMERAS
KW - cameras;
KW - coarse-to-fine
KW - consensus
KW - density
KW - estimation;
KW - feature
KW - function;
KW - hierarchy
KW - image
KW - images;
KW - importance
KW - matching;
KW - MOTION
KW - posterior
KW - probability
KW - probability;
KW - processing;
KW - random
KW - RANSAC;
KW - real
KW - realistic
KW - sample
KW - sampling;
KW - scheme;
KW - sequences;
KW - stereo
KW - strategy;
KW - synthetic
KW - wide
AB - In this paper, we propose an algorithm for robust 3D motion estimation of wide baseline cameras from noisy feature correspondences. The posterior probability density function of the camera motion parameters is represented by weighted samples. The algorithm employs a hierarchy coarse-to-fine strategy. First, a coarse prior distribution of camera motion parameters is estimated using the random sample consensus scheme (RANSAC). Based on this estimate, a refined posterior distribution of camera motion parameters can then be obtained through importance sampling. Experimental results using both synthetic and real image sequences indicate the efficacy of the proposed algorithm.
JA - Image Processing, 2004. ICIP '04. 2004 International Conference on
VL - 2
M3 - 10.1109/ICIP.2004.1419754
ER -
TY - CONF
T1 - Robust ego-motion estimation and 3D model refinement using depth based parallax model
T2 - Image Processing, 2004. ICIP '04. 2004 International Conference on
Y1 - 2004
A1 - Agrawala, Ashok K.
A1 - Chellappa, Rama
KW - 3D
KW - algorithm;
KW - analysis;
KW - and
KW - based
KW - camera;
KW - coarse
KW - compensation;
KW - DEM;
KW - depth
KW - digital
KW - ego-motion
KW - eigen-value
KW - eigenfunctions;
KW - eigenvalues
KW - ELEVATION
KW - epipolar
KW - estimation;
KW - extraction;
KW - feature
KW - field;
KW - iteration
KW - iterative
KW - map;
KW - method;
KW - methods;
KW - model
KW - model;
KW - MOTION
KW - parallax
KW - partial
KW - range-finding;
KW - refinement;
KW - refining;
KW - surface
AB - We present an iterative algorithm for robustly estimating the ego-motion and refining and updating a coarse, noisy and partial depth map using a depth based parallax model and brightness derivatives extracted from an image pair. Given a coarse, noisy and partial depth map acquired by a range-finder or obtained from a Digital Elevation Map (DEM), we first estimate the ego-motion by combining a global ego-motion constraint and a local brightness constancy constraint. Using the estimated camera motion and the available depth map estimate, motion of the 3D points is compensated. We utilize the fact that the resulting surface parallax field is an epipolar field and, knowing its direction from the previous motion estimates, estimate its magnitude and use it to refine the depth map estimate. Instead of assuming a smooth parallax field or locally smooth depth models, we locally model the parallax magnitude using the depth map, formulate the problem as a generalized eigen-value analysis and obtain better results. In addition, confidence measures for depth estimates are provided which can be used to remove regions with potentially incorrect (and outliers in) depth estimates for robustly estimating ego-motion in the next iteration. Results on both synthetic and real examples are presented.
JA - Image Processing, 2004. ICIP '04. 2004 International Conference on
VL - 4
M3 - 10.1109/ICIP.2004.1421606
ER -
TY - CONF
T1 - A spherical microphone array system for traffic scene analysis
T2 - Intelligent Transportation Systems, 2004. Proceedings. The 7th International IEEE Conference on
Y1 - 2004
A1 - Li, Zhiyun
A1 - Duraiswami, Ramani
A1 - Grassi,E.
A1 - Davis, Larry S.
KW - -6
KW - 3D
KW - analysis;
KW - array
KW - arrays;
KW - audio;
KW - auditory
KW - beamformer;
KW - capture;
KW - dB;
KW - environment;
KW - gain;
KW - microphone
KW - NOISE
KW - noise;
KW - processing;
KW - real
KW - robust
KW - scene
KW - signal
KW - spherical
KW - system;
KW - traffic
KW - traffic;
KW - virtual
KW - white
KW - World
AB - This paper describes a practical spherical microphone array system for traffic auditory scene capture and analysis. Our system uses 60 microphones positioned on the rigid surface of a sphere. We then propose an optimal design of a robust spherical beamformer with minimum white noise gain (WNG) of -6 dB. We test this system in a real-world traffic environment. Some preliminary simulation and experimental results are presented to demonstrate its performance. This system may also find applications in broader areas such as 3D audio, virtual environment, etc.
JA - Intelligent Transportation Systems, 2004. Proceedings. The 7th International IEEE Conference on
M3 - 10.1109/ITSC.2004.1398921
ER -
TY - CONF
T1 - Uncalibrated stereo rectification for automatic 3D surveillance
T2 - Image Processing, 2004. ICIP '04. 2004 International Conference on
Y1 - 2004
A1 - Lim,S.-N.
A1 - Mittal,A.
A1 - Davis, Larry S.
A1 - Paragios,N.
KW - 3D
KW - AUTOMATIC
KW - conjugate
KW - epipolar
KW - image
KW - lines;
KW - matching;
KW - method;
KW - processing;
KW - rectification
KW - scene;
KW - stereo
KW - surveillance;
KW - uncalibrated
KW - urban
AB - We describe a stereo rectification method suitable for automatic 3D surveillance. We take advantage of the fact that in a typical urban scene, there is ordinarily a small number of dominant planes. Given two views of the scene, we align a dominant plane in one view with the other. Conjugate epipolar lines between the reference view and plane-aligned image become geometrically identical and can be added to the rectified image pair line by line. Selecting conjugate epipolar lines to cover the whole image is simplified since they are geometrically identical. In addition, the polarities of conjugate epipolar lines are automatically preserved by plane alignment, which simplifies stereo matching.
JA - Image Processing, 2004. ICIP '04. 2004 International Conference on
VL - 2
M3 - 10.1109/ICIP.2004.1419753
ER -
TY - CONF
T1 - View independent human body pose estimation from a single perspective image
T2 - Computer Vision and Pattern Recognition, 2004. CVPR 2004. Proceedings of the 2004 IEEE Computer Society Conference on
Y1 - 2004
A1 - Parameswaran, V.
A1 - Chellappa, Rama
KW - 3D
KW - analysis;
KW - biomechanics;
KW - body
KW - body-centric
KW - camera;
KW - capture
KW - coordinate
KW - coordinates;
KW - detection;
KW - epipolar
KW - equation
KW - estimation;
KW - geometry;
KW - human
KW - image
KW - image;
KW - images;
KW - model-based
KW - models;
KW - MOTION
KW - object
KW - optical
KW - perspective
KW - physiological
KW - polynomial
KW - polynomials;
KW - pose
KW - real
KW - single
KW - synthetic
KW - system;
KW - systems;
KW - torso
KW - tracking;
KW - twist;
KW - uncalibrated
AB - Recovering the 3D coordinates of various joints of the human body from an image is a critical first step for several model-based human tracking and optical motion capture systems. Unlike previous approaches that have used a restrictive camera model or assumed a calibrated camera, our work deals with the general case of a perspective uncalibrated camera and is thus well suited for archived video. The input to the system is an image of the human body and correspondences of several body landmarks, while the output is the set of 3D coordinates of the landmarks in a body-centric coordinate system. Using ideas from 3D model based invariants, we set up a polynomial system of equations in the unknown head pitch, yaw and roll angles. If we are able to make the often-valid assumption that the torso twist is small, there is a finite number of solutions for the head orientation, and these can be computed readily. Once the head orientation is computed, the epipolar geometry of the camera is recovered, leading to solutions to the 3D joint positions. Results are presented on synthetic and real images.
JA - Computer Vision and Pattern Recognition, 2004. CVPR 2004. Proceedings of the 2004 IEEE Computer Society Conference on
VL - 2
M3 - 10.1109/CVPR.2004.1315139
ER -
TY - JOUR
T1 - Wide baseline image registration with application to 3-D face modeling
JF - Multimedia, IEEE Transactions on
Y1 - 2004
A1 - Roy-Chowdhury, A.K.
A1 - Chellappa, Rama
A1 - Keaton, T.
KW - 2D
KW - 3D
KW - algorithm;
KW - baseline
KW - biometrics;
KW - Computer
KW - configuration;
KW - correspondence
KW - doubly
KW - error
KW - extraction;
KW - Face
KW - feature
KW - holistic
KW - image
KW - matching;
KW - matrix;
KW - minimization;
KW - modeling;
KW - models;
KW - normalization
KW - probability
KW - probability;
KW - procedure;
KW - processes;
KW - processing;
KW - recognition;
KW - registration;
KW - representation;
KW - sequences;
KW - shapes;
KW - Sinkhorn
KW - spatial
KW - statistics;
KW - Stochastic
KW - video
KW - vision;
KW - wide
AB - Establishing correspondence between features in two images of the same scene taken from different viewing angles is a challenging problem in image processing and computer vision. However, its solution is an important step in many applications like wide baseline stereo, three-dimensional (3-D) model alignment, creation of panoramic views, etc. In this paper, we propose a technique for registration of two images of a face obtained from different viewing angles. We show that prior information about the general characteristics of a face obtained from video sequences of different faces can be used to design a robust correspondence algorithm. The method works by matching two-dimensional (2-D) shapes of the different features of the face (e.g., eyes, nose etc.). A doubly stochastic matrix, representing the probability of match between the features, is derived using the Sinkhorn normalization procedure. The final correspondence is obtained by minimizing the probability of error of a match between the entire constellation of features in the two sets, thus taking into account the global spatial configuration of the features. The method is applied for creating holistic 3-D models of a face from partial representations. Although this paper focuses primarily on faces, the algorithm can also be used for other objects with small modifications.
VL - 6
SN - 1520-9210
CP - 3
M3 - 10.1109/TMM.2004.827511
ER -
TY - JOUR
T1 - Accurate dense optical flow estimation using adaptive structure tensors and a parametric model
JF - Image Processing, IEEE Transactions on
Y1 - 2003
A1 - Liu,Haiying
A1 - Chellappa, Rama
A1 - Rosenfeld, A.
KW - 3D
KW - accuracy;
KW - adaptive
KW - and
KW - coherent
KW - confidence
KW - dense
KW - eigenfunctions;
KW - eigenvalue
KW - eigenvalues
KW - eigenvectors;
KW - estimation;
KW - flow
KW - generalized
KW - ground
KW - image
KW - measure;
KW - model;
KW - MOTION
KW - optical
KW - parameter
KW - parametric
KW - problem;
KW - real
KW - region;
KW - sequences;
KW - structure
KW - synthetic
KW - tensor;
KW - tensors;
KW - three-dimensional
KW - truth;
AB - An accurate optical flow estimation algorithm is proposed in this paper. By combining the three-dimensional (3D) structure tensor with a parametric flow model, the optical flow estimation problem is converted to a generalized eigenvalue problem. The optical flow can be accurately estimated from the generalized eigenvectors. The confidence measure derived from the generalized eigenvalues is used to adaptively adjust the coherent motion region to further improve the accuracy. Experiments using both synthetic sequences with ground truth and real sequences illustrate our method. Comparisons with classical and recently published methods are also given to demonstrate the accuracy of our algorithm.
VL - 12
SN - 1057-7149
CP - 10
M3 - 10.1109/TIP.2003.815296
ER -
TY - CONF
T1 - Camera calibration using spheres: a semi-definite programming approach
T2 - Computer Vision, 2003. Proceedings. Ninth IEEE International Conference on
Y1 - 2003
A1 - Agrawal,M.
A1 - Davis, Larry S.
KW - 3D
KW - algorithms;calibration;cameras;computer
KW - approach;sphere
KW - calibration;camera
KW - contours;semidefinite
KW - extraction;
KW - field;ellipse;intrinsic
KW - location;spheres;vision
KW - networks;common
KW - parameters;occluding
KW - Programming
KW - target;camera
KW - view
KW - vision;feature
AB - Vision algorithms utilizing camera networks with a common field of view are becoming increasingly feasible and important. Calibration of such camera networks is a challenging and cumbersome task. The current approaches for calibration using planes or a known 3D target may not be feasible as these objects may not be simultaneously visible in all the cameras. In this paper, we present a new algorithm to calibrate cameras using occluding contours of spheres. In general, an occluding contour of a sphere projects to an ellipse in the image. Our algorithm uses the projection of the occluding contours of three spheres and solves for the intrinsic parameters and the locations of the spheres. The problem is formulated in the dual space and the parameters are solved for optimally and efficiently using semidefinite programming. The technique is flexible, accurate and easy to use. In addition, since the contour of a sphere is simultaneously visible in all the cameras, our approach can greatly simplify calibration of multiple cameras with a common field of view. Experimental results from computer simulated data and real world data, both for a single camera and multiple cameras, are presented.
JA - Computer Vision, 2003. Proceedings. Ninth IEEE International Conference on
M3 - 10.1109/ICCV.2003.1238428
ER -
TY - CONF
T1 - Human body pose estimation using silhouette shape analysis
T2 - Proceedings. IEEE Conference on Advanced Video and Signal Based Surveillance, 2003.
Y1 - 2003
A1 - Mittal,A.
A1 - Zhao, Liang
A1 - Davis, Larry S.
KW - 3D
KW - analysis;
KW - body
KW - classification;
KW - clutter;
KW - detection;
KW - estimation;
KW - extraction;
KW - feature
KW - function;
KW - human
KW - image
KW - likelihood
KW - multiple
KW - object
KW - parameter
KW - parameters;
KW - Pixel
KW - pose
KW - probability;
KW - segmentation;
KW - segmentations;
KW - SHAPE
KW - silhouette
KW - structure;
KW - surveillance;
KW - views;
AB - We describe a system for human body pose estimation from multiple views that is fast and completely automatic. The algorithm works in the presence of multiple people by decoupling the problems of pose estimation of different people. The pose is estimated based on a likelihood function that integrates information from multiple views and thus obtains a globally optimal solution. Other characteristics that make our method more general than previous work include: (1) no manual initialization; (2) no specification of the dimensions of the 3D structure; (3) no reliance on some learned poses or patterns of activity; (4) insensitivity to edges and clutter in the background and within the foreground. The algorithm has applications in surveillance and promising results have been obtained.
JA - Proceedings. IEEE Conference on Advanced Video and Signal Based Surveillance, 2003.
M3 - 10.1109/AVSS.2003.1217930
ER -
TY - CONF
T1 - Shape and motion driven particle filtering for human body tracking
T2 - Multimedia and Expo, 2003. ICME '03. Proceedings. 2003 International Conference on
Y1 - 2003
A1 - Yamamoto, T.
A1 - Chellappa, Rama
KW - 3D
KW - body
KW - broadcast
KW - camera;
KW - cameras;
KW - estimation;
KW - Filtering
KW - framework;
KW - human
KW - image
KW - MOTION
KW - motion;
KW - particle
KW - processing;
KW - rotational
KW - sequence;
KW - sequences;
KW - signal
KW - single
KW - static
KW - theory;
KW - tracking;
KW - TV
KW - video
AB - In this paper, we propose a method to recover 3D human body motion from a video acquired by a single static camera. In order to estimate the complex state distribution of a human body, we adopt the particle filtering framework. We present the human body using several layers of representation and compose the whole body step by step. In this way, more effective particles are generated and ineffective particles are removed as we process each layer. In order to deal with the rotational motion, the frequency of rotation is obtained using a preprocessing operation. In the preprocessing step, the variance of the motion field at each image is computed, and the frequency of rotation is estimated. The estimated frequency is used for the state update in the algorithm. We successfully track the movement of figure skaters in TV broadcast image sequence, and recover the 3D shape and motion of the skater.
JA - Multimedia and Expo, 2003. ICME '03. Proceedings. 2003 International Conference on
VL - 3
M3 - 10.1109/ICME.2003.1221248
ER -
TY - CONF
T1 - Using specularities for recognition
T2 - Computer Vision, 2003. Proceedings. Ninth IEEE International Conference on
Y1 - 2003
A1 - Osadchy,M.
A1 - Jacobs, David W.
A1 - Ramamoorthi,R.
KW - 3D
KW - formation;object
KW - glass;computer
KW - image
KW - information;specular
KW - light
KW - measurement;reflection;stereo
KW - objects;specular
KW - objects;wine
KW - processing;
KW - property;pottery;recognition
KW - recognition;object
KW - recognition;position
KW - reflectance
KW - reflection;compact
KW - reflection;transparent
KW - shape;Lambertian
KW - source;highlight
KW - systems;shiny
KW - vision;lighting;object
AB - Recognition systems have generally treated specular highlights as noise. We show how to use these highlights as a positive source of information that improves recognition of shiny objects. This also enables us to recognize very challenging shiny transparent objects, such as wine glasses. Specifically, we show how to find highlights that are consistent with a hypothesized pose of an object of known 3D shape. We do this using only a qualitative description of highlight formation that is consistent with most models of specular reflection, so no specific knowledge of an object's reflectance properties is needed. We first present a method that finds highlights produced by a dominant compact light source, whose position is roughly known. We then show how to estimate the lighting automatically for objects whose reflection is part specular and part Lambertian. We demonstrate this method for two classes of objects. First, we show that specular information alone can suffice to identify objects with no Lambertian reflectance, such as transparent wine glasses. Second, we use our complete system to recognize shiny objects, such as pottery.
JA - Computer Vision, 2003. Proceedings. Ninth IEEE International Conference on
M3 - 10.1109/ICCV.2003.1238669
ER -
TY - CONF
T1 - Video based rendering of planar dynamic scenes
T2 - Multimedia and Expo, 2003. ICME '03. Proceedings. 2003 International Conference on
Y1 - 2003
A1 - Kale, A.
A1 - Roy-Chowdhury, A.K.
A1 - Chellappa, Rama
KW - (computer
KW - 3D
KW - analysis;
KW - approximation;
KW - based
KW - camera;
KW - cameras;
KW - direction;
KW - dynamic
KW - graphics);
KW - image
KW - monocular
KW - MOTION
KW - perspective
KW - planar
KW - processing;
KW - rendering
KW - rendering;
KW - scenes;
KW - sequence;
KW - sequences;
KW - signal
KW - video
KW - weak
AB - In this paper, we propose a method to synthesize arbitrary views of a planar scene from a monocular video sequence of it. The 3-D direction of motion of the object is robustly estimated from the video sequence. Given this direction, any other view of the object can be synthesized through a perspective projection approach, under assumptions of planarity. If the distance of the object from the camera is large, a planar approximation is reasonable even for non-planar scenes. Such a method has many important applications, one of them being gait recognition where a side view of the person is required. Our method can be used to synthesize the side-view of the person in case he/she does not present a side view to the camera. Since the planarity assumption is often an approximation, the effects of non-planarity can lead to inaccuracies in rendering and need to be corrected for. Regions where this happens are examined and a simple technique based on weak perspective approximation is proposed to offset rendering inaccuracies. Examples of synthesized views using our method and performance evaluation are presented.
JA - Multimedia and Expo, 2003. ICME '03. Proceedings. 2003 International Conference on
VL - 1
M3 - 10.1109/ICME.2003.1220958
ER -
TY - CONF
T1 - Video synthesis of arbitrary views for approximately planar scenes
T2 - Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03). 2003 IEEE International Conference on
Y1 - 2003
A1 - Roy-Chowdhury, A.K.
A1 - Kale, A.
A1 - Chellappa, Rama
KW - (access
KW - 3D
KW - applications;
KW - approach;
KW - approximately
KW - approximation;
KW - arbitrary
KW - Biometrics
KW - control);
KW - data;
KW - direction
KW - estimation;
KW - evaluation;
KW - Gait
KW - image
KW - monocular
KW - MOTION
KW - performance
KW - perspective
KW - planar
KW - processing;
KW - projection
KW - recognition;
KW - recovery;
KW - scenes;
KW - sequence;
KW - sequences;
KW - side
KW - signal
KW - structure;
KW - Surveillance
KW - surveillance;
KW - synthesis;
KW - synthesized
KW - video
KW - view
KW - views;
AB - In this paper, we propose a method to synthesize arbitrary views of a planar scene, given a monocular video sequence. The method is based on the availability of knowledge of the angle between the original and synthesized views. Such a method has many important applications, one of them being gait recognition. Gait recognition algorithms rely on the availability of an approximate side-view of the person. From a realistic viewpoint, such an assumption is impractical in surveillance applications and it is of interest to develop methods to synthesize a side view of the person, given an arbitrary view. For large distances from the camera, a planar approximation for the individual can be assumed. In this paper, we propose a perspective projection approach for recovering the direction of motion of the person purely from the video data, followed by synthesis of a new video sequence at a different angle. The algorithm works purely in the image and video domain, though 3D structure plays an implicit role in its theoretical justification. Examples of synthesized views using our method and performance evaluation are presented.
JA - Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03). 2003 IEEE International Conference on
VL - 3
M3 - 10.1109/ICASSP.2003.1199520
ER -
TY - CONF
T1 - 3D face reconstruction from video using a generic model
T2 - Multimedia and Expo, 2002. ICME '02. Proceedings. 2002 IEEE International Conference on
Y1 - 2002
A1 - Roy-Chowdhury, A.
A1 - Chellappa, Rama
A1 - Krishnamurthy, S.
A1 - Vo, T.
KW - 3D
KW - algorithm;
KW - algorithms;
KW - analysis;
KW - Carlo
KW - chain
KW - Computer
KW - Face
KW - from
KW - function;
KW - generic
KW - human
KW - image
KW - Markov
KW - MCMC
KW - methods;
KW - model;
KW - Monte
KW - MOTION
KW - optimisation;
KW - OPTIMIZATION
KW - processes;
KW - processing;
KW - recognition;
KW - reconstruction
KW - reconstruction;
KW - sampling;
KW - sequence;
KW - sequences;
KW - SfM
KW - signal
KW - structure
KW - surveillance;
KW - video
KW - vision;
AB - Reconstructing a 3D model of a human face from a video sequence is an important problem in computer vision, with applications to recognition, surveillance, multimedia etc. However, the quality of 3D reconstructions using structure from motion (SfM) algorithms is often not satisfactory. One common method of overcoming this problem is to use a generic model of a face. Existing work using this approach initializes the reconstruction algorithm with this generic model. The problem with this approach is that the algorithm can converge to a solution very close to this initial value, resulting in a reconstruction which resembles the generic model rather than the particular face in the video which needs to be modeled. We propose a method of 3D reconstruction of a human face from video in which the 3D reconstruction algorithm and the generic model are handled separately. A 3D estimate is obtained purely from the video sequence using SfM algorithms without use of the generic model. The final 3D model is obtained after combining the SfM estimate and the generic model using an energy function that corrects for the errors in the estimate by comparing local regions in the two models. The optimization is done using a Markov chain Monte Carlo (MCMC) sampling strategy. The main advantage of our algorithm over others is that it is able to retain the specific features of the face in the video sequence even when these features are different from those of the generic model. The evolution of the 3D model through the various stages of the algorithm is presented.
JA - Multimedia and Expo, 2002. ICME '02. Proceedings. 2002 IEEE International Conference on
VL - 1
M3 - 10.1109/ICME.2002.1035815
ER -
TY - CONF
T1 - Bayesian structure from motion using inertial information
T2 - Image Processing. 2002. Proceedings. 2002 International Conference on
Y1 - 2002
A1 - Qian,Gang
A1 - Chellappa, Rama
A1 - Zheng, Qinfen
KW - 3D
KW - analysis;
KW - Bayes
KW - Bayesian
KW - camera
KW - estimation;
KW - image
KW - images;
KW - importance
KW - inertial
KW - information;
KW - methods;
KW - MOTION
KW - motion;
KW - parameter
KW - processing;
KW - real
KW - reconstruction;
KW - sampling;
KW - scene
KW - sensors;
KW - sequence;
KW - sequences;
KW - sequential
KW - signal
KW - structure-from-motion;
KW - synthetic
KW - systems;
KW - video
AB - A novel approach to Bayesian structure from motion (SfM) using inertial information and sequential importance sampling (SIS) is presented. The inertial information is obtained from camera-mounted inertial sensors and is used in the Bayesian SfM approach as prior knowledge of the camera motion in the sampling algorithm. Experimental results using both synthetic and real images show that, when inertial information is used, more accurate results can be obtained or the same estimation accuracy can be obtained at a lower cost.
JA - Image Processing. 2002. Proceedings. 2002 International Conference on
VL - 3
M3 - 10.1109/ICIP.2002.1038996
ER -
TY - CONF
T1 - Wide baseline image registration using prior information
T2 - Multimedia Signal Processing, 2002 IEEE Workshop on
Y1 - 2002
A1 - Roy-Chowdhury, A.K.
A1 - Chellappa, Rama
A1 - Keaton, T.
KW - 2D
KW - 3D
KW - algorithm;
KW - alignment;
KW - angles;
KW - baseline
KW - Computer
KW - configuration;
KW - constellation;
KW - correspondence
KW - creation;
KW - doubly
KW - error
KW - extraction;
KW - Face
KW - feature
KW - global
KW - holistic
KW - image
KW - images;
KW - matching;
KW - matrix;
KW - model
KW - models;
KW - normalization
KW - panoramic
KW - probability;
KW - procedure;
KW - processes;
KW - processing;
KW - registration;
KW - robust
KW - sequences;
KW - SHAPE
KW - signal
KW - Sinkhorn
KW - spatial
KW - statistics;
KW - stereo;
KW - Stochastic
KW - video
KW - view
KW - viewing
KW - vision;
KW - wide
AB - Establishing correspondence between features in two images of the same scene taken from different viewing angles is a challenging problem in image processing and computer vision. However, its solution is an important step in many applications like wide baseline stereo, 3D model alignment, creation of panoramic views, etc. In this paper, we propose a technique for registration of two images of a face obtained from different viewing angles. We show that prior information about the general characteristics of a face obtained from video sequences of different faces can be used to design a robust correspondence algorithm. The method works by matching 2D shapes of the different features of the face. A doubly stochastic matrix, representing the probability of match between the features, is derived using the Sinkhorn normalization procedure. The final correspondence is obtained by minimizing the probability of error of a match between the entire constellations of features in the two sets, thus taking into account the global spatial configuration of the features. The method is applied for creating holistic 3D models of a face from partial representations. Although this paper focuses primarily on faces, the algorithm can also be used for other objects with small modifications.
JA - Multimedia Signal Processing, 2002 IEEE Workshop on
M3 - 10.1109/MMSP.2002.1203242
ER -
TY - CONF
T1 - Clustering appearances of 3D objects
T2 - Computer Vision and Pattern Recognition, 1998. Proceedings. 1998 IEEE Computer Society Conference on
Y1 - 1998
A1 - Basri,R.
A1 - Roth,D.
A1 - Jacobs, David W.
KW - 3D
KW - clustering;image
KW - clustering;sequences
KW - images;unsupervised
KW - objects;local
KW - of
KW - properties;reliable
KW - recognition;
KW - sequences;object
AB - We introduce a method for unsupervised clustering of images of 3D objects. Our method examines the space of all images and partitions the images into sets that form smooth and parallel surfaces in this space. It further uses sequences of images to obtain more reliable clustering. Finally, since our method relies on a non-Euclidean similarity measure, we introduce algebraic techniques for estimating local properties of these surfaces without first embedding the images in a Euclidean space. We demonstrate our method by applying it to a large database of images.
JA - Computer Vision and Pattern Recognition, 1998. Proceedings. 1998 IEEE Computer Society Conference on
M3 - 10.1109/CVPR.1998.698639
ER -
TY - JOUR
T1 - Computing smooth molecular surfaces
JF - Computer Graphics and Applications, IEEE
Y1 - 1994
A1 - Varshney, Amitabh
A1 - Brooks, F.P.,Jr.
A1 - Wright,W. V
KW - 3D
KW - algorithm;smooth
KW - algorithms;physics
KW - analytical
KW - atoms;three
KW - complexity;computational
KW - complexity;parallel
KW - computing;
KW - computing;surface
KW - dimensional
KW - geometry;interactive
KW - geometry;parallel
KW - improvements;computation
KW - molecular
KW - rates;linear
KW - regular
KW - surface
KW - time;computational
KW - triangulation;algorithmic
KW - triangulation;computational
AB - We consider how we set out to formulate a parallel analytical molecular surface algorithm that has expected linear complexity with respect to the total number of atoms in a molecule. To achieve this goal, we avoided computing the complete 3D regular triangulation over the entire set of atoms, a process that takes time O(n log n), where n is the number of atoms in the molecule. We aim to compute and display these surfaces at interactive rates, by taking advantage of advances in computational geometry, making further algorithmic improvements and parallelizing the computations.
VL - 14
SN - 0272-1716
CP - 5
M3 - 10.1109/38.310720
ER -
TY - CONF
T1 - Finding structurally consistent motion correspondences
T2 - Pattern Recognition, 1994. Vol. 1 - Conference A: Computer Vision Image Processing., Proceedings of the 12th IAPR International Conference on
Y1 - 1994
A1 - Jacobs, David W.
A1 - Chennubhotla,C.
KW - 3D
KW - boundaries;
KW - common
KW - consistent
KW - correspondences;
KW - estimation;
KW - features;
KW - image
KW - independent
KW - linear
KW - MOTION
KW - motion;
KW - occlusion
KW - programming;
KW - scene
KW - segmentation;
KW - specularities;
KW - structurally
KW - structure;
KW - tracked
AB - Much work on deriving scene structure and motion from features assumes as input a set of tracked image features that share a common 3D motion. Producing this input requires segmenting independent motions and detecting image features that do not correspond to 3D features, originating instead in, for example, occlusion boundaries or specularities. We derive a linear program that tells when a set of tracked points might have come from 3D points that share a single motion, assuming affine motion and bounded error. We can also use linear programming to place conservative bounds on the structure of the scene that corresponds to tracked points. We implement and test this algorithm on real images.
JA - Pattern Recognition, 1994. Vol. 1 - Conference A: Computer Vision Image Processing., Proceedings of the 12th IAPR International Conference on
VL - 1
M3 - 10.1109/ICPR.1994.576388
ER -
TY - CONF
T1 - Segmenting independently moving, noisy points
T2 - Motion of Non-Rigid and Articulated Objects, 1994., Proceedings of the 1994 IEEE Workshop on
Y1 - 1994
A1 - Jacobs, David W.
A1 - Chennubhotla,C.
KW - 3D
KW - common
KW - consistent
KW - estimation;
KW - features;
KW - image
KW - independently
KW - linear
KW - MOTION
KW - motion;
KW - moving
KW - noisy
KW - point
KW - points;
KW - programming;
KW - real
KW - segmentation;
KW - sequence;
KW - sequences;
KW - video
AB - There has been much work on using point features tracked through a video sequence to determine structure and motion. In many situations, to use this work, we must first isolate subsets of points that share a common motion. This is hard because we must distinguish between independent motions and apparent deviations from a single motion due to noise. We propose several methods of searching for point sets with consistent 3D motions. We analyze the potential sensitivity of each method for detecting independent motions, and experiment with each method on a real image sequence.
JA - Motion of Non-Rigid and Articulated Objects, 1994., Proceedings of the 1994 IEEE Workshop on
M3 - 10.1109/MNRAO.1994.346249
ER -
TY - JOUR
T1 - RF scattering and radiation by using a decoupled Helmholtz equation approach
JF - Magnetics, IEEE Transactions on
Y1 - 1993
A1 - D'Angelo,J.
A1 - Mayergoyz, Issak D
KW - 3D
KW - analysis;
KW - approach;
KW - computer-efficient
KW - computing;
KW - decoupled
KW - domain;
KW - electrical
KW - electromagnetic
KW - element
KW - engineering
KW - equation
KW - finite
KW - finite-element
KW - formulation;
KW - Frequency
KW - frequency-domain
KW - Helmholtz
KW - method;
KW - Physics
KW - problems;
KW - propagation;
KW - radiation
KW - radiowave
KW - RF
KW - scattering;
KW - wave
AB - A novel finite-element formulation for the solution of 3-D RF scattering and radiation problems is presented. This formulation is based on the solution of a set of decoupled Helmholtz equations for the Cartesian components of the field vectors. This results in a robust, computer-efficient method that eliminates previous difficulties associated with `curl-curl' type partial differential equations. Although it is presented in the frequency domain, the method is easily extendible to the time domain.
VL - 29
SN - 0018-9464
CP - 2
M3 - 10.1109/20.250811
ER -
TY - CONF
T1 - Optimal matching of planar models in 3D scenes
T2 - Computer Vision and Pattern Recognition, 1991. Proceedings CVPR '91., IEEE Computer Society Conference on
Y1 - 1991
A1 - Jacobs, David W.
KW - 3D
KW - approximation;flat
KW - error;close
KW - error;model
KW - features;computerised
KW - features;optimal
KW - matching;planar
KW - models;point
KW - object;image;maximum
KW - pattern
KW - picture
KW - processing;
KW - recognition;computerised
KW - scenes;bounded
KW - sensing
AB - The problem of matching a model consisting of the point features of a flat object to point features found in an image that contains the object in an arbitrary three-dimensional pose is addressed. Once three points are matched, it is possible to determine the pose of the object. Assuming bounded sensing error, the author presents a solution to the problem of determining the range of possible locations in the image at which any additional model points may appear. This solution leads to an algorithm for determining the largest possible matching between image and model features that includes this initial hypothesis. The author implements a close approximation to this algorithm, which is O(nmε^6), where n is the number of image points, m is the number of model points, and ε is the maximum sensing error. This algorithm is compared to existing methods, and it is shown that it produces more accurate results.
JA - Computer Vision and Pattern Recognition, 1991. Proceedings CVPR '91., IEEE Computer Society Conference on
M3 - 10.1109/CVPR.1991.139700
ER -