Schema- and data-aware query reformulation in knowledge graphs
Authors
Kannan, Amar Viswanathan
Issue Date
2018-05
Type
Electronic thesis
Thesis
Language
ENG
Keywords
Computer science
Alternative Title
Abstract
This approach reformulates an input SPARQL query that contains schema (taxonomic) concepts and instance (entity) data into ranked alternative queries that are similar to the original. Viewing a query as a conjunction of triple patterns, the approach uses data awareness or schema awareness to decide which triple patterns to reformulate. Extending prior work on query relaxation with data availability for schema concepts, this proposal also addresses instance data (entity) elements in the query. Query log statistics show that a majority of the queries issued to knowledge graphs are entity-based. Entities do not have a taxonomy, so they end up being generalized. To address this issue, I use the RDF graph and entity triple statements to suggest rewrites. Once an entity's features are identified, the entity in question is reformulated as a set of features. Since entities in large-scale graphs can have a large number of features, I introduce strategies that select the top-k most relevant and informative ranked features. These features are then added to the original query to create a valid reformulation. I then evaluate the approach by showing that the reformulation strategy produces results that are more contextual and crisp than those of the state of the art. Finally, I incorporate both schema data awareness and instance data awareness into a combined methodology to reformulate any kind of SPARQL query.
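The entity reformulation step described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the toy graph, the `ex:` names, the helper functions, and the ranking rule (prefer features shared by the fewest other entities, and skip features unique to the entity) are all assumptions made for illustration. The thesis's own feature selection strategies are not specified in this record.

```python
from collections import Counter

# Toy RDF graph as (subject, predicate, object) triples. All data is hypothetical.
GRAPH = [
    ("ex:Inception", "ex:director", "ex:Nolan"),
    ("ex:Inception", "ex:genre", "ex:SciFi"),
    ("ex:Inception", "rdf:type", "ex:Film"),
    ("ex:Interstellar", "ex:director", "ex:Nolan"),
    ("ex:Interstellar", "ex:genre", "ex:SciFi"),
    ("ex:Interstellar", "rdf:type", "ex:Film"),
    ("ex:Alien", "ex:genre", "ex:SciFi"),
    ("ex:Alien", "rdf:type", "ex:Film"),
    ("ex:Amelie", "rdf:type", "ex:Film"),
]


def entity_features(graph, entity):
    """Return the (predicate, object) pairs asserted about an entity."""
    return sorted({(p, o) for s, p, o in graph if s == entity})


def top_k_features(graph, entity, k):
    """Pick k features of the entity, rarest first (assumed ranking rule)."""
    # Support = number of distinct subjects that carry the feature.
    support = Counter((p, o) for s, p, o in set(graph))
    # A feature held only by this entity would return nothing new, so skip it.
    shared = [f for f in entity_features(graph, entity) if support[f] >= 2]
    return sorted(shared, key=lambda f: (support[f], f))[:k]


def reformulate(patterns, entity, features, var="?e"):
    """Replace the entity with a variable and add its features as patterns."""
    rewritten = [
        tuple(var if term == entity else term for term in tp) for tp in patterns
    ]
    rewritten += [(var, p, o) for p, o in features]
    return rewritten


def to_sparql(patterns, select):
    """Render triple patterns as a SPARQL SELECT (PREFIX lines omitted)."""
    body = " .\n  ".join(" ".join(tp) for tp in patterns)
    return f"SELECT {select} WHERE {{\n  {body} .\n}}"


# Original query: actors of one specific film.
query = [("ex:Inception", "ex:actor", "?actor")]
features = top_k_features(GRAPH, "ex:Inception", k=2)
reformulated = reformulate(query, "ex:Inception", features)
print(to_sparql(reformulated, "?actor"))
```

In this sketch the entity `ex:Inception` is replaced by the variable `?e`, constrained by its two rarest shared features, so the rewritten query would also match other films with the same director and genre instead of generalizing to every film.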
Description
May 2018
School of Science
Full Citation
Publisher
Rensselaer Polytechnic Institute, Troy, NY