
    Resource-aware distributed analytics and machine learning for hybrid edge-cloud systems

    Author
    Das, Anirban
    View/Open
    Das_rpi_0185E_11970.pdf (4.002Mb)
    Other Contributors
    Patterson, Stacy; Zaki, Mohammed J., 1971-; Varela, Carlos A.; Brunschwiler, Thomas
    Date Issued
    2021-12
    Subject
    Computer science
    Degree
    PhD
    Terms of Use
    This electronic version is a licensed copy owned by Rensselaer Polytechnic Institute (RPI), Troy, NY. Copyright of original work retained by author.
    URI
    https://hdl.handle.net/20.500.13015/6130
    Abstract
    As applications become more intelligent, data analytics and inference at the edge are proliferating as a complement to traditional computation in a centralized cloud. At the same time, distributed machine learning training at the edge of the network, near the data producers, is gaining popularity, mainly due to benefits to security, privacy, and communication costs. However, edge devices are often resource-constrained, and there may be communication bottlenecks between the edge and the cloud. Successful solutions for these edge computing workloads must address the challenges posed by constrained computation and communication resources.

    The first part of the thesis focuses on scheduling and task placement of data processing, analytics, and inference workloads. The goal is to meet a quality-of-service target, such as low latency or reduced cost, in edge-cloud architectures. We start by benchmarking the leading industry edge computing platforms that use the serverless computing paradigm as the medium of execution. We next consider single-stage serverless applications and propose a framework to execute such applications jointly across an edge device and the public cloud. The aim is to decide whether to execute user jobs at the edge or in the public cloud based on given latency or cost constraints. Finally, we consider a hybrid cloud scenario in which a private cloud takes the place of the single edge device. Here, we study the problem of task placement and scheduling of multi-stage serverless applications between a private cloud and the public cloud to minimize the cost of public cloud usage.

    In the second part of the thesis, we consider machine learning training workloads in edge-cloud platforms, specifically federated learning. As in the first part, we begin with a feasibility study of federated learning algorithms on resource-constrained devices. Next, we study an algorithm for horizontal federated learning in a hierarchical communication network. We analyze the convergence of the algorithm when the data distribution among the participants is non-IID. Our analysis shows that a non-IID data distribution can have a significant impact on the algorithm's convergence error. This insight paves the way for more sophisticated algorithm designs that reduce this performance gap. We then turn to vertical federated learning in a hierarchical network. We propose a new algorithm for model training where data is vertically partitioned across silos in the top tier and horizontally partitioned in the bottom tier among clients inside each silo. We present a theoretical analysis of our algorithm and show the dependence of the convergence rate on the number of vertical partitions, the number of local updates, and the number of clients in each hub. We close with a summary and a discussion of future research directions and open questions.
    Description
    December 2021; School of Science
    Department
    Dept. of Computer Science
    Publisher
    Rensselaer Polytechnic Institute, Troy, NY
    Relationships
    Rensselaer Theses and Dissertations Online Collection
    Access
    Restricted to current Rensselaer faculty, staff, and students in accordance with the Rensselaer Standard license. Access inquiries may be directed to the Rensselaer Libraries.
    Collections
    • RPI Theses Online (Complete)
