Combining supervised machine learning and structured knowledge for difficult perceptual tasks

Authors

Klawonn, Matthew

Files

179631_Klawonn_rpi_0185E_11463.pdf (33.37 MB)

Other Contributors

Hendler, James A.
Fox, Peter A.
Zaki, Mohammed J., 1971-
Ji, Qiang, 1963-

Issue Date

2019-05

Keywords

Computer science

Degree

PhD

Terms of Use

Attribution-NonCommercial-NoDerivs 3.0 United States
This electronic version is a licensed copy owned by Rensselaer Polytechnic Institute, Troy, NY. Copyright of original work retained by author.

URI

https://hdl.handle.net/20.500.13015/2388

Abstract

Learning models of visual perception lies at the heart of a number of computer vision problems, including object detection, image description, motion tracking, and more. There are a variety of models which may complete such tasks, though the tasks themselves are usually assumed to be consistent in their requirements: receive visual input, and perceive some desired content in said input. Yet for certain tasks, the desired outputs are very difficult to predict given input images alone. Many perceptual tasks require not only the ability to parse content of a visual scene, but also the ability to combine visual information with auxiliary knowledge to reach conclusions. Rather than attempt to incorporate auxiliary knowledge into the parameters of a learned model, this work presents an alternative approach.
We hypothesize that there are significant benefits in training perceptual models such that they can interact successfully with external information, while keeping said information external. Towards validating this hypothesis, we take the following steps. First, we motivate representing external knowledge in a structured, symbolic form, a choice grounded in the flexibility and expressivity of knowledge representation and reasoning techniques. We then create and evaluate a novel perceptual model that produces scene graphs, an output that can be combined with structured symbolic knowledge to produce complex inferences. Experiments show that this model performs comparably to the state of the art using standard benchmarks and metrics, while holding significant advantages in the flexibility of its training setup and the variety of outputs it can produce.
To improve compatibility between the scene graph generator and any external knowledge that is available to produce inferences, we also develop a novel meta-learning method for directing the training process of learning algorithms. Specifically, our method learns in an online fashion to select training data for which a given model has good performance. When combined with the scene graph generator, this meta-learning algorithm facilitates a clean split between learned knowledge and external knowledge. The meta-learning algorithm distills information in the training data and in the external knowledge, constructing training scene graphs that are "learnable", while leaving the remaining information to be used during inferencing. We test our approach on a semantic search task, showing that the combination of learned perceptual model, meta-learning algorithm, and structured knowledge inferencing techniques performs better than any of these components does separately.

Description

May 2019
School of Science

Department

Dept. of Computer Science

Publisher

Rensselaer Polytechnic Institute, Troy, NY

Relationships

Rensselaer Theses and Dissertations Online Collection

Access

CC BY-NC-ND. Users may download and share copies with attribution in accordance with a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 License. No commercial use or derivatives are permitted without the explicit approval of the author.

Collections

RPI Theses Open Access
RPI Theses Online (Complete)