dc.rights.license: Restricted to current Rensselaer faculty, staff and students. Access inquiries may be directed to the Rensselaer Libraries.
dc.contributor: Carothers, Christopher D.
dc.contributor: Cutler, Barbara M.
dc.contributor: Ross, Robert B., 1972-
dc.contributor: Shephard, M. S. (Mark S.)
dc.contributor.author: Ross, Caitlin J.
dc.date.accessioned: 2021-11-03T09:13:53Z
dc.date.available: 2021-11-03T09:13:53Z
dc.date.created: 2020-06-12T12:31:10Z
dc.date.issued: 2019-08
dc.identifier.uri: https://hdl.handle.net/20.500.13015/2459
dc.description: August 2019
dc.description: School of Science
dc.description.abstract: To better our understanding of optimistic parallel discrete event simulation (PDES), a dynamic instrumentation layer was introduced into the ROSS simulation framework that allows model developers to collect a variety of metrics across the model and simulation engine software layers. Because the instrumentation can collect more data than is feasible to store to disk or to transfer over a network from the supercomputer running the simulation to another system for analysis, we also developed the ROSS In Situ Analysis system (RISA), which can perform data reduction while the simulation data resides in memory. We demonstrate the usefulness of our instrumentation and analysis tools by performing visual analyses of high performance computing (HPC) network models built on top of the ROSS framework. With the visual analysis, we are able to find load and communication imbalances in the simulation and determine their causes. In addition, we perform perturbation studies of both the ROSS instrumentation and RISA. These studies compare instrumented and non-instrumented simulations to ensure that the tools neither significantly affect simulation performance nor introduce new performance bottlenecks.
dc.description.abstract: Discrete event simulation is a cost-effective tool for exploring the design space of next generation computer systems. Optimistic synchronization algorithms for PDES, such as Time Warp, allow a model's inherent parallelism to be discovered using an out-of-order event detection and recovery scheme. When events are processed out of timestamp order, the simulation is rolled back to a prior state and events are re-executed in the correct order. Although optimistic protocols can be highly scalable, optimizing optimistic simulations to minimize the time spent rolling back is not a trivial task because of the number of factors that can affect the rollback behavior of the simulation.
dc.description.abstract: In this work, we demonstrate the efficacy of discrete event simulation in evaluating and improving the performance of parallel and distributed scientific analysis systems, such as the MG-RAST metagenomics analysis service provided by Argonne National Laboratory. We propose hardware and job scheduling changes to their system that can improve scalability under the increased user workloads anticipated in the future. We use event-driven simulation to evaluate the proposed changes and compare them to the current infrastructure and job scheduling policies. However, the simulation exhibits poor parallel performance, which limits the size of the workloads that can be simulated for MG-RAST. This highlights the need for scalable analysis and visualization tools for optimistic PDES that can be used to gain insight into the rollback behavior and performance of such simulations.
dc.description.abstract: Finally, we also explore the use of three-dimensional animations for understanding both the time series model data and the optimistic PDES performance of the CODES network models. Typically, these simulations are visualized using information visualization techniques such as parallel coordinates and radial diagrams. However, adding spatial data to the compute nodes and routers of the HPC networks enables the visualization of simulation data in a context familiar to simulation users, such as network architects. Replaying time series model data, such as network congestion, over the network visualizations has helped provide insight into hotspots that occur in HPC networks during simulation, and enables a visual comparison of different networks.
dc.language.iso: ENG
dc.publisher: Rensselaer Polytechnic Institute, Troy, NY
dc.relation.ispartof: Rensselaer Theses and Dissertations Online Collection
dc.subject: Computer science
dc.title: Performance analysis and visualization tools to support the codesign of next generation computer systems
dc.type: Electronic thesis
dc.type: Thesis
dc.digitool.pid: 179846
dc.digitool.pid: 179847
dc.digitool.pid: 179848
dc.rights.holder: This electronic version is a licensed copy owned by Rensselaer Polytechnic Institute, Troy, NY. Copyright of original work retained by author.
dc.description.degree: PhD
dc.relation.department: Dept. of Computer Science

