Combining supervised machine learning and structured knowledge for difficult perceptual tasks

Authors
Klawonn, Matthew
Issue Date
2019-05
Type
Electronic thesis
Thesis
Language
ENG
Keywords
Computer science
Abstract
Learning models of visual perception lies at the heart of a number of computer vision problems, including object detection, image description, motion tracking, and more. There are a variety of models which may complete such tasks, though the tasks themselves are usually assumed to be consistent in their requirements: receive visual input, and perceive some desired content in said input. Yet for certain tasks, the desired outputs are very difficult to predict given input images alone. Many perceptual tasks require not only the ability to parse content of a visual scene, but also the ability to combine visual information with auxiliary knowledge to reach conclusions. Rather than attempt to incorporate auxiliary knowledge into the parameters of a learned model, this work presents an alternative approach.
We hypothesize that there are significant benefits in training perceptual models such that they can interact successfully with external information, while keeping said information external. Toward validating this hypothesis, we take the following steps. First, we motivate representing external knowledge in a structured, symbolic form, a choice grounded in the flexibility and expressivity of knowledge representation and reasoning techniques. We then create and evaluate a novel perceptual model that produces scene graphs, an output that can be combined with structured symbolic knowledge to produce complex inferences. Experiments show that this model performs comparably to the state of the art on standard benchmarks and metrics, while holding significant advantages in the flexibility of its training setup and the variety of outputs it can produce.
In order to improve compatibility between the scene graph generator and any external knowledge available to produce inferences, we also develop a novel meta-learning method for directing the training process of learning algorithms. Specifically, our method learns in an online fashion to select training data on which a given model performs well. When combined with the scene graph generator, this meta-learning algorithm facilitates a clean split between learned knowledge and external knowledge. The meta-learning algorithm distills information in the training data and in the external knowledge, constructing training scene graphs that are "learnable" while leaving the remaining information to be used during inferencing. We test our approach on a semantic search task, showing that the learned perceptual model, meta-learning algorithm, and structured knowledge inferencing techniques perform better together than they do separately.
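The central idea above can be illustrated with a minimal sketch (not the thesis's actual implementation; all names and the rule shown are illustrative): a perceptual model emits a scene graph as a set of subject–predicate–object triples, and external symbolic knowledge is applied to that graph at inference time to derive conclusions the model was never trained to output directly.

```python
# A scene graph as a set of (subject, predicate, object) triples,
# e.g. as might be predicted from an image.
scene_graph = {
    ("person", "rides", "horse"),
    ("horse", "on", "beach"),
}

# External knowledge kept *outside* the learned model: a simple
# Horn-style rule mapping observed relations to new inferred ones.
rules = [
    # If X rides Y and Y is on Z, then X is on Z.
    lambda g: {(s, "on", z)
               for (s, p, y) in g if p == "rides"
               for (y2, p2, z) in g if p2 == "on" and y2 == y},
]

def infer(graph, rules):
    """Apply rules to a fixed point, returning the augmented graph."""
    facts = set(graph)
    changed = True
    while changed:
        changed = False
        for rule in rules:
            new = rule(facts) - facts
            if new:
                facts |= new
                changed = True
    return facts

augmented = infer(scene_graph, rules)
# ("person", "on", "beach") is now derivable without retraining the model.
```

Because the rule base lives outside the trained parameters, it can be extended or corrected without touching the perceptual model, which is the flexibility the abstract argues for.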
Description
May 2019
School of Science
Publisher
Rensselaer Polytechnic Institute, Troy, NY