Software/Hardware Design Techniques to Unleash the Full Potential of DRAM and Flash Memory for Big Data and AI Applications

Authors
Ma, Linsen
Issue Date
2025-05
Type
Electronic thesis
Thesis
Language
en_US
Keywords
Electrical engineering
Abstract
The exponential growth of data-intensive applications such as Artificial Intelligence (AI), Machine Learning (ML), and Big Data analytics has placed unprecedented demands on memory and storage infrastructures. Traditional architectures often struggle to meet these demands, leading to performance bottlenecks and increased operational costs. This research addresses these challenges by exploring advanced memory and storage optimization techniques that improve data processing efficiency.

This thesis first investigates methods to mitigate the performance degradation associated with integrating block data compression into in-memory key-value (KV) stores. Despite extensive prior research on in-memory KV stores, little attention has been paid to reducing their memory usage through block data compression (e.g., LZ4, ZSTD), largely due to concerns over performance penalties. We introduce design techniques that leverage decompression streaming, the latency asymmetry between compression and decompression, and the data access locality observed in real-world workloads. These techniques integrate seamlessly with conventional hash or B+-tree indexing structures, enabling broad applicability without altering existing core indexing frameworks. By addressing these performance challenges, this research makes it feasible to significantly reduce memory costs in KV stores, enhancing scalability and efficiency for applications that depend on rapid data access and cost-effective memory usage.

This thesis further investigates utilizing computational storage drives (CSDs) to reduce storage costs in security-first environments. Modern cloud computing systems face the challenging task of simultaneously achieving security, performance, and cost efficiency, particularly in data storage services.
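As a concrete illustration of the block-compressed KV design described above, here is a minimal hypothetical sketch. The class name and layout are illustrative only, and Python's stdlib zlib stands in for the LZ4/ZSTD codecs named in the abstract; the key idea shown is that a conventional hash index can remain unchanged, simply pointing each key at a location inside a compressed block.

```python
import zlib

# Hypothetical sketch of a block-compressed in-memory KV store.
# zlib stands in for LZ4/ZSTD; a plain dict plays the hash index,
# mapping key -> (block_id, offset, length) inside a compressed block.

class CompressedKVStore:
    def __init__(self, block_size=4096):
        self.block_size = block_size
        self.blocks = []          # sealed, compressed blocks
        self.index = {}           # key -> (block_id, offset, length)
        self._buf = bytearray()   # uncompressed block being filled
        self._pending = []        # (key, offset, length) for current block

    def put(self, key, value: bytes):
        off = len(self._buf)
        self._buf += value
        self._pending.append((key, off, len(value)))
        if len(self._buf) >= self.block_size:
            self._seal_block()

    def _seal_block(self):
        # Compress the filled block and publish its entries to the index.
        block_id = len(self.blocks)
        self.blocks.append(zlib.compress(bytes(self._buf)))
        for key, off, length in self._pending:
            self.index[key] = (block_id, off, length)
        self._buf = bytearray()
        self._pending = []

    def get(self, key):
        # Unsealed data still lives uncompressed in the write buffer.
        for k, off, length in self._pending:
            if k == key:
                return bytes(self._buf[off:off + length])
        block_id, off, length = self.index[key]
        data = zlib.decompress(self.blocks[block_id])
        return data[off:off + length]
```

Each `get` on a sealed block pays one whole-block decompression, which is exactly the overhead the thesis's techniques (decompression streaming, exploiting access locality) aim to amortize.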
In such environments, data typically undergoes compression to reduce storage demands, followed by encryption for security, which introduces significant complexity, performance overhead, and extra cost, especially in snapshot management. Emerging CSD technology addresses these issues by offloading computationally intensive compression tasks directly to storage devices and by exposing a virtualized logical storage space, creating new opportunities for optimization. By enabling more streamlined management of compressed and encrypted data, this work significantly reduces operational complexity and cost, ultimately supporting more secure, efficient, and cost-effective cloud storage services.

Finally, this thesis studies how to make flash memory more relevant in AI computing systems. In today's AI computing platforms, flash memory plays a secondary, supportive role, mainly serving data accesses outside the core AI training/inference operations. The emergence of AI models and workloads that demand TB-scale embedding tables presents an opportunity for flash memory to play a more essential role in the AI era. State-of-the-art SSDs (solid-state drives) are optimized for a 4KB LBA (logical block address) size and achieve only up to ~3M peak IOPS (I/O operations per second), which is far from adequate for many AI systems. Given that embedding vectors are typically smaller than 4KB (e.g., 512B~2KB), this thesis reveals an encouraging potential to make future SSDs much more AI-friendly, realized by cohesively leveraging new flash memory device features and enhancing SSD controller architecture/firmware design.
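The mismatch between 4KB LBAs and sub-4KB embedding vectors can be quantified with a back-of-envelope sketch. The ~3M IOPS figure and the 512B~2KB vector sizes come from the abstract; the function names and the "same raw bandwidth" assumption are hypothetical illustrations, not the thesis's model.

```python
# Back-of-envelope sketch of the 4KB-LBA vs. small-embedding mismatch.
# PEAK_IOPS and the vector sizes come from the abstract (~3M IOPS, 512B~2KB);
# the rest is a hypothetical illustration.

LBA_BYTES = 4096          # block size today's SSDs are optimized for
PEAK_IOPS = 3_000_000     # "~3M peak IOPS" for state-of-the-art SSDs

def read_amplification(emb_bytes: int, lba_bytes: int = LBA_BYTES) -> float:
    """Bytes transferred per useful embedding byte when each lookup
    must read a whole LBA-sized block."""
    return lba_bytes / emb_bytes

def ideal_lookup_rate(emb_bytes: int, peak_iops: int = PEAK_IOPS) -> float:
    """Lookups/sec if the drive could serve emb_bytes-sized I/Os at the
    same raw bandwidth it sustains at 4KB (peak_iops * LBA_BYTES)."""
    return peak_iops * LBA_BYTES / emb_bytes

for emb in (512, 1024, 2048):
    print(f"{emb:>5}B embedding: {read_amplification(emb):.0f}x read "
          f"amplification, up to {ideal_lookup_rate(emb) / 1e6:.0f}M lookups/s "
          f"at the same bandwidth")
```

At 512B per vector, only 1/8 of each 4KB read is useful data, so device support for smaller-than-LBA I/O could multiply the effective lookup rate severalfold at the same raw bandwidth.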
Description
May 2025
School of Engineering
Publisher
Rensselaer Polytechnic Institute, Troy, NY