Scalable RDF query processing on clusters and supercomputers

Weaver, Jesse; Williams, Gregory

Authors

Weaver, Jesse

Williams, Gregory

Issue Date

2009-10-26

URI

http://www.cs.rpi.edu/~weavej3/papers/ssws2009.pdf
https://hdl.handle.net/20.500.13015/4655

Abstract

The proliferation of RDF data on the web has increased the need for systems that can query these data while scaling with their growing size and number. We present an application of parallel hash joins to basic graph pattern matching over large amounts of RDF data, designed for shared-nothing architectures including high-performance clusters and the Blue Gene/L. Our approach does not require any pre-processing of the RDF data or costly index building. Rather, we rely on a cluster's high bandwidth and fast memory to load and query data in parallel and in near real time. We present an initial evaluation of our algorithm showing competitive results on clusters of up to 1,024 processors.
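The general idea of a parallel hash join over unindexed triples can be illustrated with a minimal sketch. This is not the authors' implementation: the shared-nothing workers are simulated as plain lists inside one process, and the names `match`, `owner`, and `parallel_hash_join`, the toy hash function, and the sample triples are all assumptions made for illustration. Each simulated worker scans only its own triples, routes the matching bindings to the worker chosen by hashing the join variable's value, and then joins locally.

```python
from collections import defaultdict

def match(pattern, triple):
    """Return variable bindings if the triple matches the pattern, else None."""
    binding = {}
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A repeated variable must bind to the same value each time.
            if binding.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return binding

def owner(value, num_workers):
    # Hypothetical stand-in for a hash function shared by all workers.
    return sum(value.encode("utf-8")) % num_workers

def parallel_hash_join(partitions, left, right, join_var):
    """Join two triple patterns on join_var across simulated workers."""
    n = len(partitions)
    inbox_left = [[] for _ in range(n)]
    inbox_right = [[] for _ in range(n)]

    # Phase 1: every worker scans its local triples (no index) and sends
    # each binding to the worker that owns the join value.
    for local_triples in partitions:
        for triple in local_triples:
            for pattern, inbox in ((left, inbox_left), (right, inbox_right)):
                b = match(pattern, triple)
                if b is not None:
                    inbox[owner(b[join_var], n)].append(b)

    # Phase 2: every worker builds a hash table on the left bindings and
    # probes it with the right bindings it received.
    results = []
    for w in range(n):
        table = defaultdict(list)
        for a in inbox_left[w]:
            table[a[join_var]].append(a)
        for b in inbox_right[w]:
            for a in table.get(b[join_var], []):
                # Keep only pairs that agree on every shared variable.
                if all(a.get(k, v) == v for k, v in b.items()):
                    results.append({**a, **b})
    return results

# Toy data split arbitrarily across two simulated workers.
partitions = [
    [("alice", "knows", "bob"), ("bob", "name", "Bob"), ("alice", "name", "Alice")],
    [("bob", "knows", "carol"), ("carol", "name", "Carol")],
]
results = parallel_hash_join(
    partitions, ("?x", "knows", "?y"), ("?y", "name", "?n"), "?y"
)
```

Because bindings with equal join values always land on the same worker, no worker needs a global view of the data, which is the property that makes this style of join suit shared-nothing machines.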

Collections

Tetherless World Publications
