Department of Computer Science at UH

University of Houston

Department of Computer Science

In Partial Fulfillment of the Requirements for the Degree of
Master of Science

Varun Maheshwari

Will defend his thesis

Quantitative Comparison of Metrics for Human Pose Estimation


Human Pose Estimation is an important problem in computer vision that has received considerable attention in recent years. Many applications in Visual Surveillance, Human Computer Interaction, and Activity Recognition require the ability to estimate human pose. Quantifying the orientation of humans observed in images is challenging because of variations in illumination, occlusions, and the articulation of the human body. Many approaches to this problem have been investigated, most of which rely on 3D models and estimate the pose through a model fitting process. Such approaches are limited by their assumption of a static, easily removable background, and by the limited occlusion and articulation variation that can be incorporated in the 3D training dataset. Alternatively, approaches based on local image and shape features have also been proposed.

In this thesis we consider pose estimation based on image features and provide a comparative evaluation to assess the utility of three common feature descriptors and three common classifiers. Image feature based pose estimation is a multi-step process consisting of Feature Extraction, Feature Selection, and Classification. We investigate histogram-based, GIST-based, and SIFT-based feature extraction and representation algorithms. In the Feature Selection stage we reduce the dimensionality of the features using Information Gain and Principal Component Analysis. For Classification we first discretize pose into four coarse orientations: right-, left-, front-, and back-facing. Classification is studied both as a hierarchical (two-stage) solution and as a direct estimation. In the hierarchical solution, we group front- and back-facing into a single class and right- and left-facing into another class. At the first level, a classifier is trained to differentiate between these two classes. A second classifier is then trained to differentiate the classes within each group from the first level. Direct estimation simply trains a single classifier to differentiate the four classes. SVM, Decision Tree, and Random Forest classifiers are used for training and prediction, and results are presented on the publicly available PASCAL VOC 2010 dataset.
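
The hierarchical and direct classification schemes described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis implementation: the toy NearestCentroid classifier stands in for the SVM, Decision Tree, and Random Forest classifiers, the two-dimensional feature vectors are made-up placeholders for extracted image features, and all class, function, and variable names are hypothetical.

```python
from collections import defaultdict

# Coarse grouping for the two-stage scheme: front/back vs. left/right.
GROUP_OF = {
    "front": "front_back", "back": "front_back",
    "left": "left_right", "right": "left_right",
}


class NearestCentroid:
    """Toy stand-in classifier (assumption: not a classifier from the thesis)."""

    def fit(self, X, y):
        sums, counts = {}, defaultdict(int)
        for x, label in zip(X, y):
            if label not in sums:
                sums[label] = [0.0] * len(x)
            sums[label] = [s + v for s, v in zip(sums[label], x)]
            counts[label] += 1
        # Mean feature vector per class.
        self.centroids = {
            lab: [s / counts[lab] for s in vec] for lab, vec in sums.items()
        }
        return self

    def predict_one(self, x):
        def dist2(c):
            return sum((a - b) ** 2 for a, b in zip(x, c))
        return min(self.centroids, key=lambda lab: dist2(self.centroids[lab]))


class HierarchicalPoseClassifier:
    """Stage 1 picks the group; stage 2 picks the orientation within it."""

    def __init__(self, make_clf=NearestCentroid):
        self.make_clf = make_clf

    def fit(self, X, y):
        # Stage 1: relabel every sample with its coarse group.
        self.stage1 = self.make_clf().fit(X, [GROUP_OF[lab] for lab in y])
        # Stage 2: one classifier per group, trained only on that group's
        # samples (each group is assumed to have training data).
        self.stage2 = {}
        for group in set(GROUP_OF.values()):
            idx = [i for i, lab in enumerate(y) if GROUP_OF[lab] == group]
            self.stage2[group] = self.make_clf().fit(
                [X[i] for i in idx], [y[i] for i in idx]
            )
        return self

    def predict_one(self, x):
        group = self.stage1.predict_one(x)
        return self.stage2[group].predict_one(x)


# Made-up 2-D feature vectors; real inputs would be reduced image descriptors.
X = [[0.0, 0.0], [0.2, 0.1], [1.0, 0.0], [1.2, 0.1],
     [5.0, 5.0], [5.2, 5.1], [6.0, 5.0], [6.2, 5.1]]
y = ["front", "front", "back", "back", "left", "left", "right", "right"]

hierarchical = HierarchicalPoseClassifier().fit(X, y)
direct = NearestCentroid().fit(X, y)  # direct estimation: one 4-class model

query = [5.9, 5.0]
print(hierarchical.predict_one(query), direct.predict_one(query))
```

Passing a different factory as make_clf would let the same two-stage wrapper be compared across classifier types, which mirrors the comparative setup the abstract describes.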


Date: Monday, July 15, 2013
Time: 2:00 PM
Place: PGH 550

Faculty, students, and the general public are invited.
Advisor: Prof. Shishir Shah