Author
Kannan, Amar Viswanathan
Other Contributors
Hendler, James A.; McGuinness, Deborah L.; Fox, Peter A.; Si, Mei
Date Issued
2018-05
Subject
Computer science
Degree
PhD
Terms of Use
This electronic version is a licensed copy owned by Rensselaer Polytechnic Institute, Troy, NY. Copyright of original work retained by author.
Abstract
In today's age of Linked Data and Knowledge Graph proliferation, complex SPARQL queries returning results in the billions are ubiquitous. However, when such a query does not return the intended answers, the onus is on the user to resolve the schema semantics and instance data properties to recover from the failure. Given the vast scale of Knowledge Graphs, this is an ad hoc and time-consuming process that works largely by trial and error.
To address this issue, I propose a reformulation approach that takes into account both the underlying RDF Schema and the nature of the instance data expressed in the graph.
In this approach, an input SPARQL query that contains schema or taxonomic concepts and instance (entity) data is rewritten into ranked alternative reformulations that are similar to the original query. Viewing a query as a conjunction of triple patterns, the approach uses data awareness or schema awareness to decide which triple patterns to reformulate. Extending prior work on query relaxation with data availability for schema concepts, this proposal also addresses instance data (entity) elements in the query. Query log statistics show that a majority of the queries issued to knowledge graphs are entity-based. Entities do not have a taxonomy, so they end up being generalized. To address this issue, I use the RDF graph and the entity's triple statements to suggest rewrites. Once its features are identified, the entity in question is reformulated as a set of features. Since entities in large-scale graphs can have a large number of features, I introduce strategies that select and rank the top-k most relevant and informative features. The selected features are then added to the original query to create a valid reformulation. I evaluate the approach by showing that the reformulation strategy produces results that are more contextual and crisp than those of the state of the art. Finally, I incorporate both schema data awareness and instance data awareness into a combined methodology to reformulate any kind of SPARQL query.
Description
May 2018; School of Science
Department
Dept. of Computer Science
Publisher
Rensselaer Polytechnic Institute, Troy, NY
Relationships
Rensselaer Theses and Dissertations Online Collection
Access
Restricted to current Rensselaer faculty, staff and students. Access inquiries may be directed to the Rensselaer Libraries.