
    Head pose estimation on deep CNN models

    Author
    Shao, Wenxin
    View/Open
    178524_Shao_rpi_0185N_11110.pdf (12.68Mb)
    Other Contributors
    Ji, Qiang, 1963-; Sanderson, A. C. (Arthur C.); Radke, Richard J., 1974-
    Date Issued
    2017-08
    Subject
    Electrical engineering
    Degree
    MS
    Terms of Use
    This electronic version is a licensed copy owned by Rensselaer Polytechnic Institute, Troy, NY. Copyright of original work retained by author.
    URI
    https://hdl.handle.net/20.500.13015/2038
    Abstract
    Head pose estimation is an important computer vision task with applications that can benefit daily life. In this thesis, our goal is to solve the head pose estimation problem using deep convolutional neural networks. The first task is to perform head pose estimation as a classification task. We propose a multimodal convolutional neural network (CNN) for head pose classification. The architecture of the model consists of three pathways whose inputs are the face image, the head image, and facial landmarks, which respectively capture face appearance, facial context, and facial shape. We first perform experiments on benchmark datasets. Then we perform head pose classification on low-quality driving videos. To deal with the noise in the videos, we propose using the Max-Feature Map (MFM) together with Network-In-Network (NIN) in the CNN model, which gives it a better capability for handling noise and selecting features. We also use training techniques such as fine-tuning and joint training to improve performance on driving videos. The second task is to predict head pose angles using regression. We propose a deep CNN model that takes a larger face image as input and outputs all three head pose angles. The model has more layers and more parameters. Because head pose estimation needs more face images with various large head poses, face detection is an essential part of data processing. Thus, besides the face detection method we use in head pose classification, we also use a deep-learning face detection method based on the region-based convolutional neural network (R-CNN). We compare the performance of these two face detection methods on several datasets with continuous head poses, including benchmark datasets, driving videos, and videos we recorded with a head tracker device. From head pose classification on benchmark datasets to head pose estimation on arbitrary data, we move on to more challenging tasks step by step. The experimental results on all these tasks, and the comparison with other state-of-the-art methods, show that our methods achieve good robustness as well as better accuracy than the baseline and existing methods.
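    The abstract names the Max-Feature Map (MFM) without defining it. As a rough illustration only, the sketch below follows the commonly described formulation in which the channels of a feature tensor are split into two halves and combined with an element-wise maximum, halving the channel count. That formulation, the function name `max_feature_map`, and the nested-list data layout are assumptions for illustration; the thesis may implement MFM differently.

```python
from typing import List

# One channel is a 2-D grid of activations (rows of floats).
FeatureMap = List[List[float]]


def max_feature_map(channels: List[FeatureMap]) -> List[FeatureMap]:
    """Assumed MFM variant: pair channel i with channel i + n/2 and
    keep the element-wise maximum, so n channels become n/2."""
    n = len(channels)
    if n == 0 or n % 2 != 0:
        raise ValueError("MFM needs a non-empty, even number of channels")
    half = n // 2
    return [
        [
            [max(a, b) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(channels[i], channels[i + half])
        ]
        for i in range(half)
    ]


# Hypothetical 4-channel, 2x2 input used only to exercise the function.
example = [
    [[1, -2], [3, 0]],
    [[0, 5], [-1, 2]],
    [[4, -3], [2, 0]],
    [[-1, 6], [0, 1]],
]
reduced = max_feature_map(example)
```

    Because each output value is the larger of two competing activations, the operation acts as a learned selection between feature maps, which is one plausible reading of the abstract's claim about noise handling and feature selection.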
    Description
    August 2017; School of Engineering
    Department
    Dept. of Electrical, Computer, and Systems Engineering
    Publisher
    Rensselaer Polytechnic Institute, Troy, NY
    Relationships
    Rensselaer Theses and Dissertations Online Collection
    Access
    Restricted to current Rensselaer faculty, staff and students. Access inquiries may be directed to the Rensselaer Libraries.
    Collections
    • RPI Theses Online (Complete)