
dc.rights.license: CC BY-NC-ND. Users may download and share copies with attribution in accordance with a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 License. No commercial use or derivatives are permitted without the explicit approval of the author.
dc.contributor: Hendler, James A.
dc.contributor: Berners-Lee, Tim
dc.contributor: Fox, Peter A.
dc.contributor: Adali, Sibel
dc.contributor.author: Williams, Gregory Todd
dc.date.accessioned: 2021-11-03T07:58:10Z
dc.date.available: 2021-11-03T07:58:10Z
dc.date.created: 2013-09-09T14:21:17Z
dc.date.issued: 2013-05
dc.identifier.uri: https://hdl.handle.net/20.500.13015/843
dc.description: May 2013
dc.description: School of Science
dc.description.abstract: The Web of Data continues to increase in size and diversity, providing access to large amounts of structured, linked data. However, existing approaches to querying this data often fail to make use of existing database access points and must resort to web crawling to collect data of interest. Furthermore, in order to provide efficient query answering over this data, existing systems are forced to construct centralized database indexes, making it difficult to maintain up-to-date data. For approaches that do utilize existing databases, disregard for fundamental design principles of the Web results in query systems that lack some basic features of their web crawling counterparts. If an efficient query answering system can be provided that does not require centralized indexing, and leverages both existing databases and static web content, users may benefit from up-to-date access to structured, disparate data.
dc.description.abstract: In this dissertation, we develop a federated query planning framework based on the RDF data model and the SPARQL query language. This framework is able to leverage the high performance of existing SPARQL databases while also providing access to linked data available as RDF documents on the web. These two access methods are used to provide a single interface to querying semantic data.
dc.description.abstract: The primary challenge of evaluating queries over both SPARQL databases and linked data is in finding an efficient execution plan. Such a plan must perform better than the naive approach of completely decomposing the query and executing each subquery against each data source or traversing linked data by web crawling. Moreover, it must allow metadata discovered during query execution to be incorporated into the existing plan.
dc.description.abstract: Given this, in this dissertation, we develop three techniques to increase performance and flexibility of federated query evaluation: we develop a federated query planning algorithm that prioritizes the execution of subqueries that have high expected value (that is, expected relevant results with low latency); we develop a re-planning algorithm, able to augment an existing query plan with newly discovered data sources and a mechanism for discovering such sources; and we develop a server-side technique to greatly enhance the web cacheability of SPARQL query results.
dc.description.abstract: To demonstrate the practicality of this federated query planning framework, we present results of empirical evaluation of the framework components over a real-world dataset of bibliographic data. These results show that the federated query planning, evaluation, and caching techniques are able to produce query results quickly and efficiently. The effects of several optimizations on the execution of federated queries are discussed, and their impact on performance is evaluated.
dc.description.abstract: Finally, the developed framework is designed using a traditional query planner, allowing it to integrate with and benefit from existing work on query planning and optimization.
dc.language.iso: ENG
dc.publisher: Rensselaer Polytechnic Institute, Troy, NY
dc.relation.ispartof: Rensselaer Theses and Dissertations Online Collection
dc.rights: Attribution-NonCommercial-NoDerivs 3.0 United States
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/3.0/us/
dc.subject: Computer science
dc.title: Planning and evaluation of federated queries on the web
dc.type: Electronic thesis
dc.type: Thesis
dc.digitool.pid: 167038
dc.digitool.pid: 167041
dc.digitool.pid: 167042
dc.rights.holder: This electronic version is a licensed copy owned by Rensselaer Polytechnic Institute, Troy, NY. Copyright of original work retained by author.
dc.description.degree: PhD
dc.relation.department: Dept. of Computer Science


