%0 Conference Paper
%B Image Processing, 2005. ICIP 2005. IEEE International Conference on
%D 2005
%T Extracting regions of symmetry
%A Gupta, A.
%A Prasad, V. S. N.
%A Davis, Larry S.
%K feature extraction
%K image segmentation
%K natural images
%K normalized-cut segmentation algorithm
%K object detection
%K region extraction
%K spatial coherence
%K spurious detection elimination
%K symmetry detection
%X This paper presents an approach for extending the normalized-cut (n-cut) segmentation algorithm to find symmetric regions present in natural images. We use an existing algorithm to quickly detect possible symmetries present in an image. The detected symmetries are then individually verified using the modified n-cut algorithm to eliminate spurious detections. The weights of the n-cut algorithm are modified so as to include both symmetric and spatial affinities. A global parameter is defined to model the tradeoff between spatial coherence and symmetry. Experimental results indicate that the symmetry quality measure for a region segmented by our algorithm is a good indicator of the significance of the principal axis of symmetry.
%V 3
%P III - 133-6
%8 2005/09//
%G eng
%R 10.1109/ICIP.2005.1530346
%0 Conference Paper
%B Computer Vision, 2005. ICCV 2005. Tenth IEEE International Conference on
%D 2005
%T Fast multiple object tracking via a hierarchical particle filter
%A Yang, Changjiang
%A Duraiswami, Ramani
%A Davis, Larry S.
%K computer vision
%K convergence of numerical methods
%K edge detection
%K edge orientation histogram
%K fast multiple object tracking algorithm
%K hierarchical particle filter
%K image colour analysis
%K integral images
%K observation likelihood
%K particle filtering (numerical methods)
%K quasirandom sampling
%K random processes
%K visual tracking
%X A very efficient and robust visual object tracking algorithm based on the particle filter is presented. The method characterizes the tracked objects using color and edge orientation histogram features. While the use of more features and samples can improve robustness, it also increases the computational load of the particle filter. To accelerate the algorithm while retaining robustness, we adopt several enhancements. The first is the use of integral images for efficiently computing the color features and edge orientation histograms, which allows a larger number of particles and a better description of the targets. Next, the observation likelihood based on multiple features is computed in a coarse-to-fine manner, which allows the computation to quickly focus on the more promising regions. Quasi-random sampling of the particles allows the filter to achieve a higher convergence rate. The resulting tracking algorithm maintains multiple hypotheses and offers robustness against clutter or short period occlusions. Experimental results demonstrate the efficiency and effectiveness of the algorithm for single and multiple object tracking.
%V 1
%P 212 - 219 Vol. 1
%8 2005/10//
%G eng
%R 10.1109/ICCV.2005.95
%0 Conference Paper
%B Computer Vision and Pattern Recognition, 2005. CVPR 2005. IEEE Computer Society Conference on
%D 2005
%T Flattening curved documents in images
%A Liang, Jian
%A DeMenthon, D.
%A David Doermann
%K camera calibration
%K curved documents
%K document image processing
%K document pictures
%K image distortion
%K image restoration
%K OCR techniques
%K optical character recognition
%K page warping
%K printed textual content
%K scanned images
%X Compared to scanned images, document pictures captured by camera can suffer from distortions due to perspective and page warping. It is necessary to restore a frontal planar view of the page before other OCR techniques can be applied. In this paper we describe a novel approach for flattening a curved document in a single picture captured by an uncalibrated camera. To our knowledge this is the first reported method able to process general curved documents in images without camera calibration. We propose to model the page surface by a developable surface, and exploit the properties (parallelism and equal line spacing) of the printed textual content on the page to recover the surface shape. Experiments show that the output images are much more OCR friendly than the original ones. While our method is designed to work with any general developable surface, it can be adapted for typical special cases including planar pages, scans of thick books, and opened books.
%V 2
%P 338 - 345 vol. 2
%8 2005/06//
%G eng
%R 10.1109/CVPR.2005.163
%0 Conference Paper
%B Acoustics, Speech, and Signal Processing, 2005. Proceedings. (ICASSP '05). IEEE International Conference on
%D 2005
%T Moving Object Segmentation and Dynamic Scene Reconstruction Using Two Frames
%A Agrawala, Ashok K.
%A Chellappa, Rama
%K 3D scene structure
%K dynamic scene reconstruction
%K ego-motion estimation
%K image intensity constraints
%K independent object motion
%K least mean squares methods
%K least median of squares method
%K moving object segmentation
%K parallax analysis
%K parametric motion model
%K static surface parallax
%K subspace constraints
%K translational motion
%K two-frame methods
%K unconstrained motion
%K video images
%K video signal processing
%V 2
%P 705 - 708
%8 2005/03//
%G eng
%R 10.1109/ICASSP.2005.1415502
%0 Conference Paper
%B Image Processing, 2005. ICIP 2005. IEEE International Conference on
%D 2005
%T Robust observations for object tracking
%A Han, Bohyung
%A Davis, Larry S.
%K adaptive observation enhancement
%K image analysis
%K likelihood images
%K object tracking
%K particle filter framework
%K particle filtering (numerical methods)
%K principal component analysis (PCA)
%X It is a difficult task to find an observation model that will perform well for long-term visual tracking. In this paper, we propose an adaptive observation enhancement technique based on likelihood images, which are derived from multiple visual features. The most discriminative likelihood image is extracted by principal component analysis (PCA) and incrementally updated frame by frame to reduce temporal tracking error. In the particle filter framework, the feasibility of each sample is computed using this most discriminative likelihood image before the observation process. Integral image is employed for efficient computation of the feasibility of each sample. We illustrate how our enhancement technique contributes to more robust observations through demonstrations.
%V 2
%P II - 442-5
%8 2005/09//
%G eng
%R 10.1109/ICIP.2005.1530087
%0 Conference Paper
%B Image Processing, 2004. ICIP '04. 2004 International Conference on
%D 2004
%T Robust Bayesian cameras motion estimation using random sampling
%A Qian, G.
%A Chellappa, Rama
%A Qinfen Zheng
%K 3D motion estimation
%K Bayesian estimation
%K coarse-to-fine hierarchy strategy
%K feature matching
%K image processing
%K importance sampling
%K posterior probability density function
%K random sample consensus scheme (RANSAC)
%K real image sequences
%K realistic synthetic images
%K wide baseline stereo cameras
%X In this paper, we propose an algorithm for robust 3D motion estimation of wide baseline cameras from noisy feature correspondences. The posterior probability density function of the camera motion parameters is represented by weighted samples. The algorithm employs a hierarchy coarse-to-fine strategy. First, a coarse prior distribution of camera motion parameters is estimated using the random sample consensus scheme (RANSAC). Based on this estimate, a refined posterior distribution of camera motion parameters can then be obtained through importance sampling. Experimental results using both synthetic and real image sequences indicate the efficacy of the proposed algorithm.
%V 2
%P 1361 - 1364 Vol.2
%8 2004/10//
%G eng
%R 10.1109/ICIP.2004.1419754
%0 Conference Paper
%B Computer Vision and Pattern Recognition, 2004. CVPR 2004. Proceedings of the 2004 IEEE Computer Society Conference on
%D 2004
%T View independent human body pose estimation from a single perspective image
%A Parameswaran, V.
%A Chellappa, Rama
%K 3D body-centric coordinates
%K biomechanics
%K epipolar geometry
%K human body pose estimation
%K image analysis
%K model-based human tracking systems
%K object detection
%K optical motion capture systems
%K perspective uncalibrated camera
%K physiological models
%K polynomial equation system
%K real images
%K single perspective image
%K synthetic images
%K torso twist
%X Recovering the 3D coordinates of various joints of the human body from an image is a critical first step for several model-based human tracking and optical motion capture systems. Unlike previous approaches that have used a restrictive camera model or assumed a calibrated camera, our work deals with the general case of a perspective uncalibrated camera and is thus well suited for archived video. The input to the system is an image of the human body and correspondences of several body landmarks, while the output is the set of 3D coordinates of the landmarks in a body-centric coordinate system. Using ideas from 3D model-based invariants, we set up a polynomial system of equations in the unknown head pitch, yaw and roll angles. If we make the often-valid assumption that the torso twist is small, there is a finite number of solutions for the head orientation, which can be computed readily. Once the head orientation is computed, the epipolar geometry of the camera is recovered, leading to solutions to the 3D joint positions. Results are presented on synthetic and real images.
%V 2
%P II-16 - II-22 Vol.2
%8 2004/06//
%G eng
%R 10.1109/CVPR.2004.1315139
%0 Journal Article
%J Signal Processing, IEEE Transactions on
%D 2003
%T Anti-collusion fingerprinting for multimedia
%A Trappe, W.
%A M. Wu
%A Wang, Z. J.
%A Liu, K. J. R.
%K additive embedding
%K anti-collusion codes
%K averaging attack
%K binary code modulation
%K codevectors
%K colluder identification
%K combinatorial design theory
%K correlation detection
%K cost-effective attack
%K data communication
%K data compression
%K digital fingerprinting
%K Gaussian processes
%K image coding
%K logical AND operation
%K multimedia security
%K on-off keying
%K orthogonal modulation
%K real images
%K redistribution
%K security of data
%K signal detection
%K tree-structured detection algorithm
%K trees (mathematics)
%K watermarking
%X Digital fingerprinting is a technique for identifying users who use multimedia content for unintended purposes, such as redistribution. These fingerprints are typically embedded into the content using watermarking techniques that are designed to be robust to a variety of attacks. A cost-effective attack against such digital fingerprints is collusion, where several differently marked copies of the same content are combined to disrupt the underlying fingerprints. We investigate the problem of designing fingerprints that can withstand collusion and allow for the identification of colluders. We begin by introducing the collusion problem for additive embedding. We then study the effect that averaging collusion has on orthogonal modulation. We introduce a tree-structured detection algorithm for identifying the fingerprints associated with K colluders that requires O(K log(n/K)) correlations for a group of n users. We next develop a fingerprinting scheme based on code modulation that does not require as many basis signals as orthogonal modulation. We propose a new class of codes, called anti-collusion codes (ACCs), which have the property that the composition of any subset of K or fewer codevectors is unique. Using this property, we can therefore identify groups of K or fewer colluders. We present a construction of binary-valued ACC under the logical AND operation that uses the theory of combinatorial designs and is suitable for both the on-off keying and antipodal form of binary code modulation. In order to accommodate n users, our code construction requires only O(√n) orthogonal signals for a given number of colluders. We introduce three different detection strategies that can be used with our ACC for identifying a suspect set of colluders. We demonstrate the performance of our ACC for fingerprinting multimedia and identifying colluders through experiments using Gaussian signals and real images.
%V 51
%P 1069 - 1087
%8 2003/04//
%@ 1053-587X
%G eng
%N 4
%R 10.1109/TSP.2003.809378
%0 Conference Paper
%B Image Analysis and Processing, 2003.Proceedings. 12th International Conference on
%D 2003
%T Depth-first k-nearest neighbor finding using the MaxNearestDist estimator
%A Samet, Hanan
%K branch-and-bound search process
%K data mining
%K depth-first k-nearest neighbor finding
%K DNA sequences
%K images
%K maximum possible distance
%K MaxNearestDist estimator
%K parameter estimation
%K pattern matching
%K query processing
%K similarity searching
%K text documents
%K time series
%K tree searching
%K video
%X Similarity searching is an important task when trying to find patterns in applications which involve mining different types of data such as images, video, time series, text documents, DNA sequences, etc. Similarity searching often reduces to finding the k nearest neighbors to a query object. A description is given of how to use an estimate of the maximum possible distance at which a nearest neighbor can be found to prune the search process in a depth-first branch-and-bound k-nearest neighbor finding algorithm. Using the MaxNearestDist estimator (Larsen, S. and Kanal, L.N., 1986) in the depth-first k-nearest neighbor algorithm provides a middle ground between a pure depth-first and a best-first k-nearest neighbor algorithm.
%P 486 - 491
%8 2003/09//
%G eng
%R 10.1109/ICIAP.2003.1234097
%0 Journal Article
%J Pattern Analysis and Machine Intelligence, IEEE Transactions on
%D 2003
%T Properties of embedding methods for similarity searching in metric spaces
%A Hjaltason, G. R.
%A Samet, Hanan
%K complex data types
%K contractiveness
%K dimension reduction methods
%K distance distortion
%K distance evaluations
%K DNA sequences
%K documents
%K embedding methods
%K Euclidean spaces
%K FastMap
%K images
%K Lipschitz embeddings
%K metric spaces
%K MetricMap
%K multimedia databases
%K query processing
%K similarity searching
%K singular value decomposition
%K SparseMap
%X Complex data types-such as images, documents, DNA sequences, etc.-are becoming increasingly important in modern database applications. A typical query in many of these applications seeks to find objects that are similar to some target object, where (dis)similarity is defined by some distance function. Often, the cost of evaluating the distance between two objects is very high. Thus, the number of distance evaluations should be kept at a minimum, while (ideally) maintaining the quality of the result. One way to approach this goal is to embed the data objects in a vector space so that the distances of the embedded objects approximate the actual distances. Thus, queries can be performed (for the most part) on the embedded objects. We are especially interested in examining the issue of whether or not the embedding methods will ensure that no relevant objects are left out. Particular attention is paid to the SparseMap, FastMap, and MetricMap embedding methods. SparseMap is a variant of Lipschitz embeddings, while FastMap and MetricMap are inspired by dimension reduction methods for Euclidean spaces. We show that, in general, none of these embedding methods guarantee that queries on the embedded objects have no false dismissals, while also demonstrating the limited cases in which the guarantee does hold. Moreover, we describe a variant of SparseMap that allows queries with no false dismissals. In addition, we show that with FastMap and MetricMap, the distances of the embedded objects can be much greater than the actual distances. This makes it impossible (or at least impractical) to modify FastMap and MetricMap to guarantee no false dismissals.
%V 25
%P 530 - 549
%8 2003/05//
%@ 0162-8828
%G eng
%N 5
%R 10.1109/TPAMI.2003.1195989
%0 Conference Paper
%B Computer Vision and Pattern Recognition, 2003. Proceedings. 2003 IEEE Computer Society Conference on
%D 2003
%T Simultaneous pose and correspondence determination using line features
%A David, P.
%A DeMenthon, D.
%A Duraiswami, Ramani
%A Samet, Hanan
%K cluttered perspective image
%K computer vision
%K correspondence determination
%K deterministic annealing
%K feature extraction
%K joint local optimum
%K line feature matching algorithm
%K man-made environment
%K model-to-image registration problem
%K noise
%K occlusion
%K point features
%K pose determination
%K position measurement
%K real imagery
%K realistic synthetic images
%K softassign
%K SoftPOSIT algorithm
%K stereo image processing
%X We present a new robust line matching algorithm for solving the model-to-image registration problem. Given a model consisting of 3D lines and a cluttered perspective image of this model, the algorithm simultaneously estimates the pose of the model and the correspondences of model lines to image lines. The algorithm combines softassign for determining correspondences and POSIT for determining pose. Integrating these algorithms into a deterministic annealing procedure allows the correspondence and pose to evolve from initially uncertain values to a joint local optimum. This research extends to line features the SoftPOSIT algorithm proposed recently for point features. Lines detected in images are typically more stable than points and are less likely to be produced by clutter and noise, especially in man-made environments. Experiments on synthetic and real imagery with high levels of clutter, occlusion, and noise demonstrate the robustness of the algorithm.
%V 2
%P II-424 - II-431 vol.2
%8 2003/06//
%G eng
%R 10.1109/CVPR.2003.1211499
%0 Conference Paper
%B Image Processing. 2002. Proceedings. 2002 International Conference on
%D 2002
%T Anti-collusion codes: multi-user and multimedia perspectives
%A Trappe, W.
%A M. Wu
%A Liu, K. J. R.
%K anti-collusion codes
%K binary code modulation
%K combinatorial designs
%K combinatorial mathematics
%K data embedding
%K data encapsulation
%K digital fingerprinting
%K image coding
%K images
%K logical AND operation
%K message authentication
%K multimedia computing
%K multimedia content
%K performance
%K watermarking
%X Digital fingerprinting is an effective method to identify users who might try to redistribute multimedia content, such as images and video. These fingerprints are typically embedded into the content using watermarking techniques that are designed to be robust to a variety of attacks. A cheap and effective attack against such digital fingerprints is collusion, where several differently marked copies of the same content are averaged or combined to disrupt the underlying fingerprint. We present a construction of collusion-resistant fingerprints based upon anti-collusion codes (ACC) and binary code modulation. ACC have the property that the composition of any subset of K or fewer codevectors is unique. Using this property, we build fingerprints that allow for the identification of groups of K or fewer colluders. We present a construction of binary-valued ACC under the logical AND operation using the theory of combinatorial designs. Our code construction requires only O(√n) orthogonal signals to accommodate n users. We demonstrate the performance of our ACC for fingerprinting multimedia by identifying colluders through experiments using real images.
%V 2
%P II-149 - II-152 vol.2
%8 2002///
%G eng
%R 10.1109/ICIP.2002.1039909
%0 Conference Paper
%B Image Processing. 2002. Proceedings. 2002 International Conference on
%D 2002
%T Bayesian structure from motion using inertial information
%A Qian, Gang
%A Chellappa, Rama
%A Qinfen Zheng
%K 3D scene reconstruction
%K Bayes methods
%K Bayesian structure-from-motion
%K camera motion estimation
%K image sequence analysis
%K inertial information
%K inertial sensors
%K parameter estimation
%K real images
%K sequential importance sampling
%K synthetic images
%K video signal processing
%K video systems
%X A novel approach to Bayesian structure from motion (SfM) using inertial information and sequential importance sampling (SIS) is presented. The inertial information is obtained from camera-mounted inertial sensors and is used in the Bayesian SfM approach as prior knowledge of the camera motion in the sampling algorithm. Experimental results using both synthetic and real images show that, when inertial information is used, more accurate results can be obtained or the same estimation accuracy can be obtained at a lower cost.
%V 3
%P III-425 - III-428 vol.3
%8 2002///
%G eng
%R 10.1109/ICIP.2002.1038996
%0 Conference Paper
%B Motion and Video Computing, 2002. Proceedings. Workshop on
%D 2002
%T A hierarchical approach for obtaining structure from two-frame optical flow
%A Liu, Haiying
%A Chellappa, Rama
%A Rosenfeld, A.
%K computer-rendered images
%K depth variation
%K error analysis
%K face recognition
%K feature extraction
%K gesture recognition systems
%K hierarchical iterative algorithm
%K image sequences
%K inverse depth estimation
%K motion parameter estimation
%K nonlinear system
%K real images
%K structure-from-motion
%K time aliasing
%K two-frame optical flow
%K video signal processing
%X A hierarchical iterative algorithm is proposed for extracting structure from two-frame optical flow. The algorithm exploits two facts: one is that in many applications, such as face and gesture recognition, the depth variation of the visible surface of an object in a scene is small compared to the distance between the optical center and the object; the other is that the time aliasing problem is alleviated at the coarse level for any two-frame optical flow estimate so that the estimate tends to be more accurate. A hierarchical representation for the relationship between the optical flow, depth, and the motion parameters is derived, and the resulting non-linear system is iteratively solved through two linear subsystems. At the coarsest level, the surface of the object tends to be flat, so that the inverse depth tends to be a constant, which is used as the initial depth map. Inverse depth and motion parameters are estimated by the two linear subsystems at each level and the results are propagated to finer levels. Error analysis and experiments using both computer-rendered images and real images demonstrate the correctness and effectiveness of our algorithm.
%P 214 - 219
%8 2002/12//
%G eng
%R 10.1109/MOTION.2002.1182239
%0 Journal Article
%J Image Processing, IEEE Transactions on
%D 2002
%T Optimal edge-based shape detection
%A Moon, H.
%A Chellappa, Rama
%A Rosenfeld, A.
%K 1D optimal step edge operator
%K 2D shape detection
%K aerial images
%K boundary contour
%K contour tracking in video
%K cross section
%K derivative of double exponential (DODE) filter
%K edge-based shape detection
%K error propagation
%K feature extraction
%K filter output
%K filtering theory
%K global contour detection
%K human facial feature detection
%K imaging conditions
%K localization performance
%K mean squared error
%K noise power
%K optimisation
%K pixel-level edge detection
%K shape geometry
%K statistical properties
%K step function
%K vehicle detection
%X We propose an approach to accurately detecting two-dimensional (2-D) shapes. The cross section of the shape boundary is modeled as a step function. We first derive a one-dimensional (1-D) optimal step edge operator, which minimizes both the noise power and the mean squared error between the input and the filter output. This operator is found to be the derivative of the double exponential (DODE) function, originally derived by Ben-Arie and Rao (1994). We define an operator for shape detection by extending the DODE filter along the shape's boundary contour. The responses are accumulated at the centroid of the operator to estimate the likelihood of the presence of the given shape. This method of detecting a shape is in fact a natural extension of the task of edge detection at the pixel level to the problem of global contour detection. This simple filtering scheme also provides a tool for a systematic analysis of edge-based shape detection. We investigate how the error is propagated by the shape geometry. We have found that, under general assumptions, the operator is locally linear at the peak of the response. We compute the expected shape of the response and derive some of its statistical properties. This enables us to predict both its localization and detection performance and adjust its parameters according to imaging conditions and given performance specifications. Applications to the problem of vehicle detection in aerial images, human facial feature detection, and contour tracking in video are presented.
%V 11
%P 1209 - 1227
%8 2002/11//
%@ 1057-7149
%G eng
%N 11
%R 10.1109/TIP.2002.800896
%0 Conference Paper
%B Pattern Recognition, 2002. Proceedings. 16th International Conference on
%D 2002
%T Page classification through logical labelling
%A Liang, Jian
%A David Doermann
%A Ma, M.
%A Guo, J. K.
%K attributed relational graph
%K document image processing
%K experimental results
%K global constraints
%K graph theory
%K hierarchical model base
%K logical labelling
%K noise
%K optical character recognition (OCR)
%K page classification
%K technical article title pages
%K unknown document
%X We propose an integrated approach to page classification and logical labelling. Layout is represented by a fully connected attributed relational graph that is matched to the graph of an unknown document, achieving classification and labelling simultaneously. By incorporating global constraints in an integrated fashion, ambiguity at the zone level can be reduced, providing robustness to noise and variation. Models are automatically trained from sample documents. Experimental results show promise for the classification and labelling of technical article title pages, and support the idea of a hierarchical model base.
%V 3
%P 477 - 480 vol.3
%8 2002///
%G eng
%R 10.1109/ICPR.2002.1047980
%0 Conference Paper
%B Image Processing. 2002. Proceedings. 2002 International Conference on
%D 2002
%T Probabilistic recognition of human faces from video
%A Chellappa, Rama
%A Kruger, V.
%A Zhou, Shaohua
%K Bayes methods
%K Bayesian propagation
%K CMU
%K human face recognition
%K image processing
%K NIST/USF
%K observation likelihood
%K posterior probability distribution
%K probabilistic recognition
%K sequential importance sampling
%K signal processing
%K still image gallery
%K still-to-video recognition
%K uncertainty handling
%K video-to-video recognition
%X Most present face recognition approaches recognize faces based on still images. We present a novel approach to recognize faces in video. In that scenario, the face gallery may consist of still images or may be derived from videos. For evidence integration we use classical Bayesian propagation over time and compute the posterior distribution using sequential importance sampling. The probabilistic approach allows us to handle uncertainties in a systematic manner. Experimental results using videos collected by NIST/USF and CMU illustrate the effectiveness of this approach in both still-to-video and video-to-video scenarios with appropriate model choices.
%V 1
%P I-41 - I-44 vol.1
%8 2002///
%G eng
%R 10.1109/ICIP.2002.1037954
%0 Conference Paper
%B Multimedia Signal Processing, 2002 IEEE Workshop on
%D 2002
%T Wide baseline image registration using prior information
%A Chowdhury, A. M.
%A Chellappa, Rama
%A Keaton, T.
%K 2D shape matching
%K 3D model alignment
%K computer vision
%K doubly stochastic matrix
%K error probability
%K face feature extraction
%K global spatial configuration
%K holistic 3D face model creation
%K image registration
%K match probability
%K panoramic view creation
%K robust correspondence algorithm
%K signal processing
%K Sinkhorn normalization procedure
%K stochastic processes
%K video sequences
%K viewing angles
%K wide baseline stereo
%X Establishing correspondence between features in two images of the same scene taken from different viewing angles is a challenging problem in image processing and computer vision. However, its solution is an important step in many applications like wide baseline stereo, 3D model alignment, creation of panoramic views, etc. In this paper, we propose a technique for registration of two images of a face obtained from different viewing angles. We show that prior information about the general characteristics of a face obtained from video sequences of different faces can be used to design a robust correspondence algorithm. The method works by matching 2D shapes of the different features of the face. A doubly stochastic matrix, representing the probability of match between the features, is derived using the Sinkhorn normalization procedure. The final correspondence is obtained by minimizing the probability of error of a match between the entire constellations of features in the two sets, thus taking into account the global spatial configuration of the features. The method is applied for creating holistic 3D models of a face from partial representations. Although this paper focuses primarily on faces, the algorithm can also be used for other objects with small modifications.
%P 37 - 40
%8 2002/12//
%G eng
%R 10.1109/MMSP.2002.1203242
%0 Conference Paper
%B Computer Vision and Pattern Recognition, 1993. Proceedings CVPR '93., 1993 IEEE Computer Society Conference on
%D 1993
%T 2D images of 3-D oriented points
%A Jacobs, David W.
%K 2D images
%K 3-D oriented points
%K database indexing
%K image processing
%K model indexing
%K nonrigid linear transformation
%K structure-from-motion derivation
%K structure-from-motion recovery
%X A number of vision problems have been shown to become simpler when one models projection from 3-D to 2-D as a nonrigid linear transformation. These results have been largely restricted to models and scenes that consist only of 3-D points. It is shown that, with this projection model, several vision tasks become fundamentally more complex in the somewhat more complicated domain of oriented points. More space is required for indexing models in a database, more images are required to derive structure from motion, and new views of an object cannot be synthesized linearly from old views.
%P 226 - 232
%8 1993/06//
%G eng
%R 10.1109/CVPR.1993.340985