dc.rights.license: Restricted to current Rensselaer faculty, staff and students. Access inquiries may be directed to the Rensselaer Libraries.
dc.contributor: Ji, Qiang, 1963-
dc.contributor: Sanderson, A. C. (Arthur C.)
dc.contributor: Radke, Richard J., 1974-
dc.contributor.author: Shao, Wenxin
dc.date.accessioned: 2021-11-03T08:52:18Z
dc.date.available: 2021-11-03T08:52:18Z
dc.date.created: 2017-11-10T12:34:31Z
dc.date.issued: 2017-08
dc.identifier.uri: https://hdl.handle.net/20.500.13015/2038
dc.description: August 2017
dc.description: School of Engineering
dc.description.abstract: Head pose estimation is an important computer vision task, and its applications can benefit people's daily lives. In this thesis, our goal is to solve the head pose estimation problem using deep convolutional neural networks. The first task treats head pose estimation as a classification problem. We propose a multimodal convolutional neural network (CNN) for head pose classification. The architecture of the model consists of three pathways whose inputs are the face image, the head image, and facial landmarks, which respectively capture face appearance, facial context, and facial shape. We first perform experiments on benchmark datasets, then perform head pose classification on low-quality driving videos. To deal with the noise in the videos, we propose the Max-Feature Map (MFM) with the help of Network-In-Network (NIN) for the CNN model, which is better able to handle noise and perform feature selection.
dc.description.abstract: We also use training techniques such as fine-tuning and joint training to improve performance on driving videos. The second task is to predict head pose angles using regression. We propose a deep CNN model that takes a larger face image as input and outputs all three head pose angles. The model has more layers and more parameters. Because head pose estimation needs more face images with various large head poses, face detection is an essential part of data processing. Thus, besides the face detection method used in head pose classification, we also use a deep-learning face detection method based on the region-based convolutional neural network (R-CNN). We compare the performance of these two face detection methods on several datasets with continuous head poses, including benchmark datasets, driving videos, and videos we recorded with a head tracker device. From head pose classification on benchmark datasets to head pose estimation on arbitrary data, we move to more challenging tasks step by step. The experimental results on all these tasks, and comparisons with other state-of-the-art methods, show that our methods achieve good robustness as well as better accuracy than the baseline and existing methods.
dc.language.iso: ENG
dc.publisher: Rensselaer Polytechnic Institute, Troy, NY
dc.relation.ispartof: Rensselaer Theses and Dissertations Online Collection
dc.subject: Electrical engineering
dc.title: Head pose estimation on deep CNN models
dc.type: Electronic thesis
dc.type: Thesis
dc.digitool.pid: 178523
dc.digitool.pid: 178524
dc.digitool.pid: 178525
dc.rights.holder: This electronic version is a licensed copy owned by Rensselaer Polytechnic Institute, Troy, NY. Copyright of original work retained by author.
dc.description.degree: MS
dc.relation.department: Dept. of Electrical, Computer, and Systems Engineering

