Large-scale graph analysis and matching on parallel architectures
Authors
Mandulak, Michael
Issue Date
2025-12
Type
Electronic thesis
Thesis
Language
en_US
Keywords
Computer science
Abstract
Graphs are relational data structures used to represent data in a wide variety of scientific domains, including biological and chemical interactions, network security, machine learning, social networks, and web data. As these real-world instances rapidly grow in size, scalable methods on high-performance computing architectures are required to perform efficient data analysis. However, developing these methods is challenging for several reasons: the irregularity of real-world graph data, the complexity of the relevant analysis algorithms, and the communication and synchronization overheads incurred at scale. To tackle these challenges, this thesis explores the design of parallel graph analysis methods ranging from vertex ordering to matching, with an emphasis on scalability and performance. Organized into three primary parts (general graph analysis, matching-based methods, and a combination of the two), this thesis discusses algorithmic and implementation-based optimizations on multicore, manycore, and distributed topologies. The primary contributions are as follows. For general graph analysis, this thesis presents contributions in both multithreaded vertex ordering and network sensing on GPUs. First, a parallel refinement-based ordering algorithm is presented to improve cache efficiency in popular graph analysis methods, such as PageRank. Second, in network sensing, graph properties are computed across multiple GPUs using emerging C++26 standard parallelism. The optimizations in both cases are shown to outperform comparison methods, while offering experimental insights into new approaches and forthcoming technologies, respectively. For maximum weighted matching (finding a set of vertex-disjoint edges with maximum total weight), methods are presented to efficiently implement approximation algorithms on multi-GPU topologies, with optimizations to communication and data movement.
This work is extended to applications in set similarity computation on web-based text data, specifically in data join operations. The multi-GPU methods yield the first results on billion-edge graphs, outperforming state-of-the-art parallel implementations by up to 45x. The matching-based sequential data join yields a 20x improvement over state-of-the-art filter-verify set similarity frameworks. Finally, this thesis combines graph analysis and matching on distributed frameworks, relying on 2D graph processing. This approach aims to minimize communication overheads at scale by applying sparse communication patterns within popular graph analytics. These optimizations allow simple and complex analytics to scale to up to 400 GPUs. The results show near-theoretical scaling on graphs with billions of edges while significantly outperforming similar distributed graph processing frameworks.
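For readers unfamiliar with the matching problem named in the abstract, the following is a minimal sequential sketch of one well-known approximation approach: the greedy algorithm, which guarantees at least half of the optimal total weight. It is an illustration only and is not taken from the thesis; the function name `greedy_matching`, the `(u, v, weight)` edge format, and the sample graph are all assumptions made for this example, and the multi-GPU methods described above may work quite differently.

```python
from typing import Iterable

# Hypothetical edge representation for this sketch: (u, v, weight).
Edge = tuple[int, int, float]


def greedy_matching(edges: Iterable[Edge]) -> list[Edge]:
    """Greedy 1/2-approximation for maximum weighted matching.

    Repeatedly takes the heaviest remaining edge whose endpoints are
    both still unmatched. Illustrative only; not the thesis's method.
    """
    matched: set[int] = set()
    matching: list[Edge] = []
    # Visit edges from heaviest to lightest; ties are broken by
    # endpoint ids so the result is deterministic.
    for u, v, w in sorted(edges, key=lambda e: (-e[2], e[0], e[1])):
        # Skip self-loops and edges that touch an already matched vertex.
        if u != v and u not in matched and v not in matched:
            matching.append((u, v, w))
            matched.update((u, v))
    return matching


# A path 0-1-2-3 where the heaviest edge sits in the middle.
edges = [(0, 1, 4.0), (1, 2, 5.0), (2, 3, 4.0)]
result = greedy_matching(edges)
total = sum(w for _, _, w in result)
```

On this small path the greedy choice takes the middle edge (weight 5) and blocks both outer edges, whereas the optimal matching takes the two outer edges (total weight 8). The greedy result is still within the factor-of-two guarantee, which is the kind of trade-off approximation algorithms accept in exchange for simplicity and parallelism.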
Description
December 2025
School of Science
Publisher
Rensselaer Polytechnic Institute, Troy, NY