Pointing estimation for human-robot interaction using hand pose, verbal cues, and confidence heuristics
This electronic version is a licensed copy owned by Rensselaer Polytechnic Institute, Troy, NY. Copyright of original work retained by author.
People give pointing directives, both verbally and physically, frequently and effortlessly. Intelligent robots and applications need to interpret these directives in order to understand the user's intentions. This is not a trivial task, as the intended pointing direction rarely aligns with the ground truth pointing vector. In this work, I aim to interpret pointing directives by combining visual information (i.e., hand pose) with verbal information (i.e., spoken commands) in order to capture the context of the directive. While state-of-the-art methods for interpreting pointing directives use a combination of hand and head pose, this work addresses the case where head pose is unavailable due to occlusion or other system constraints. I compute confidence heuristics to determine the quality of each information source, and I evaluate the performance of these features with a collection of learning algorithms and a rule-based approach. The results reveal that confidence heuristics improve the accuracy of the models, and that the fusion of hand pose, verbal messages, and confidence heuristics can achieve satisfactory accuracy without requiring any visual information beyond the hand pose.
School of Science
Dept. of Computer Science
Rensselaer Polytechnic Institute, Troy, NY
Rensselaer Theses and Dissertations Online Collection
Restricted to current Rensselaer faculty, staff and students. Access inquiries may be directed to the Rensselaer Libraries.