Scalable RDF query processing on clusters and supercomputers

Authors
Weaver, Jesse
Williams, Gregory
Issue Date
2009-10-26
Abstract
The proliferation of RDF data on the web has increased the need for systems that can query these data while scaling with their growing size and number. We present an application of parallel hash joins to basic graph pattern matching over large amounts of RDF data, designed for shared-nothing architectures, including high-performance clusters and the Blue Gene/L. Our approach requires neither pre-processing of the RDF data nor costly index building. Instead, we rely on a cluster's high bandwidth and fast memory to load and query data in parallel and in near real time. We present an initial evaluation of our algorithm, showing competitive results on clusters of up to 1,024 processors.
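As a rough illustration of the idea named in the abstract, the sketch below joins the solutions of two triple patterns with a hash join whose input is hash-partitioned on the join variable. It is not the authors' implementation: the function names, the sample triples, and the loop that stands in for separate processors are all assumptions made for illustration, and a real shared-nothing system would exchange the partitions over the network (for example with message passing) rather than inside one process.

```python
from collections import defaultdict

def match_pattern(triples, pattern):
    """Scan all triples (no index) and return variable bindings for one
    triple pattern. Terms starting with '?' are treated as variables."""
    results = []
    for triple in triples:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A repeated variable must bind to the same value.
                if binding.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            results.append(binding)
    return results

def partition(bindings, var, n_workers):
    """Route each binding to a worker by hashing its join-variable value."""
    parts = [[] for _ in range(n_workers)]
    for b in bindings:
        parts[hash(b[var]) % n_workers].append(b)
    return parts

def local_hash_join(left, right, var):
    """Build a hash table on the left side, then probe it with the right."""
    table = defaultdict(list)
    for b in left:
        table[b[var]].append(b)
    joined = []
    for r in right:
        for l in table.get(r[var], []):
            # Keep only pairs that agree on every shared variable.
            if all(l[k] == r[k] for k in l.keys() & r.keys()):
                joined.append({**l, **r})
    return joined

def parallel_hash_join(left, right, var, n_workers=4):
    """Hypothetical stand-in for a parallel join: each loop iteration
    plays the role of one processor joining its own partition."""
    left_parts = partition(left, var, n_workers)
    right_parts = partition(right, var, n_workers)
    joined = []
    for lp, rp in zip(left_parts, right_parts):
        joined.extend(local_hash_join(lp, rp, var))
    return joined

# Made-up sample data; the query is: ?x knows ?y . ?y worksAt ?org
TRIPLES = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("bob", "worksAt", "rpi"),
    ("carol", "worksAt", "mit"),
]
left = match_pattern(TRIPLES, ("?x", "knows", "?y"))
right = match_pattern(TRIPLES, ("?y", "worksAt", "?org"))
solutions = parallel_hash_join(left, right, "?y", n_workers=3)
```

Because both inputs are partitioned with the same hash function on `?y`, bindings that can join always land on the same worker, so each worker can join its partition without consulting the others. The order of `solutions` depends on the partitioning and should not be relied on.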