Incorporating context into knowledge graph completion methods
Authors
Shirai, Sola, Shaka Ditschler
Issue Date
2024-05
Type
Electronic thesis
Thesis
Language
en_US
Keywords
Computer science
Alternative Title
Abstract
Well-structured and semantically rich resources, such as knowledge graphs (KGs) and ontologies, can support tasks like knowledge-driven predictions or recommendations. KGs in particular have gained attention for such purposes in recent years due to their increasingly large scale and availability across domains. Given the directed graph structure of KGs, expressed as (subject, predicate, object) triples connecting the subject and object entities via the predicate relation, Knowledge Graph Completion (KGC) methods can be employed to identify missing entities or links in the KG to solve high-level tasks such as those in recommendation or prediction systems. Such methods are, in a sense, using the KG to produce “new” knowledge rather than simply querying the KG for existing facts. In understanding the knowledge captured in a KG, as well as producing meaningful new knowledge for tasks such as predictions and recommendations, it is crucial to understand the context in which our information exists. However, while various context-aware applications have been developed over the years, the concept of context tends to be poorly defined and represented. Context might encapsulate information that characterizes an entity, or it might more generally describe the situation in which an activity is taking place. Context could also refer to background knowledge which somehow influences the outcome of a task. The delineation between what is or is not context is often left unclear due to the lack of any one-size-fits-all definition -- indeed, we often fall into a circular definition where what is context depends on the context. Towards addressing the problem of context and its use in knowledge-driven Artificial Intelligence (AI) applications, in this thesis we explore the development of KGC methods for producing new knowledge with an eye towards context.
Our main technical contributions are three novel KGC methods which we demonstrate and evaluate in three distinct problem areas -- personalized food recommendation, event forecasting, and tabular data management. These three problem areas present us with a variety of different challenges and considerations that must be made to successfully apply KGC to the task, limiting the applicability of existing methods. For our first contribution, in the domain of food recommendation, we introduce novel methods to identify viable ingredient substitutions through generating and embedding flow graph representations of recipe instructions. More generally, these methods can be applied to the task of modifying entities that are involved in procedural instructions, allowing us to produce “new” knowledge by modifying existing knowledge. In our second contribution, tackling the domain of event forecasting, we present a novel model to predict properties of yet-unseen events based on performing 2-hop link prediction in a causal event KG. Our approach provides inherent explainability, requires no training, and can be applied more generally to predict properties about unseen entities in KGs -- this task of inductive link prediction enables us to produce “new” knowledge by making predictions about the properties of new entities. Our third contribution, for the domain of tabular data management, is the development of a method to predict table joinability using a relation prediction model over a KG of table metadata. We present a novel pipeline to process textual metadata into embeddings which are further processed by a graph neural network model, enabling joinability prediction for any new table and its metadata. This task produces “new” knowledge by predicting the likelihood of a new property -- specifically, joinability between tables -- existing between two target entities.
Across these three contributions, we discuss a common theme of utilizing context surrounding entities and subgraphs in KGs. Further, based on our learnings from each of the technical contributions, we put forth a new ontology for context in KGC methods which can be used to explicitly model what is context in such methods. Our ontology enables users to represent and relate key concepts which are relevant to a context-aware KGC method, and more generally lays a foundation for defining context in a context-dependent manner. We demonstrate how this ontology can be applied using competency questions and our three technical contributions as examples, and examine how our model can be applied more generally to common classes of KGC methods. We conclude by discussing the broader impact and limitations of our research, as well as future directions for the topic of effectively utilizing context in knowledge-driven AI.
Description
May 2024
School of Science
Full Citation
Publisher
Rensselaer Polytechnic Institute, Troy, NY