Author
De los Santos, Hannah
Other Contributors
Bennett, Kristin P.; Hurley, Jennifer M.; Magdon-Ismail, Malik; Zaki, Mohammed J., 1971-;
Date Issued
2020-05
Subject
Computer science
Degree
PhD;
Terms of Use
This electronic version is a licensed copy owned by Rensselaer Polytechnic Institute, Troy, NY. Copyright of original work retained by author.;
Abstract
Circadian rhythms are endogenous cycles of approximately 24 hours, reinforced by external cues such as light. Interrupting these rhythms can increase the risk of numerous health problems, including cancer and diabetes. These cycles are typically modeled as harmonic oscillators with fixed amplitude peaks, which does not reflect the damping observed in practice. As such, we present the PAICE (Pipeline for Amplitude Integration of Circadian Exploration) Suite, a set of novel modeling methods for rhythmic data and applications based upon these methods, used for the detection and understanding of circadian rhythms in large-scale time-series omics data.
The first model in the PAICE Suite, ECHO (Extended Circadian Harmonic Oscillator), uses an optimization approach that fits the solution to the differential equation corresponding to an underlying negative feedback loop with external influences to omics-scale data, thus extending the simple harmonic equation to identify and classify the degree of amplitude change. To do this, we have also developed novel starting point heuristics and weighting schemes for nonlinear least squares that find more accurate parameter values than other prevalent methods. We also extend rhythm identification outside of circadian periods through the specification of unconstrained problems, which allow for shorter and longer periods. Using biological datasets in various organisms, we find that rhythms of different Amplitude Change (AC) categories, including damping and forcing, reveal functional biological differences.
We then leverage these new AC categories to create the ECHO Native Circadian Ontological Rhythmicity Explorer (ENCORE), the second application of the PAICE Suite. ENCORE combines the power of GO and STRING with associated statistical-enrichment testing to derive biological understanding of circadian function. ENCORE also connects to outside databases, UniProt and QuickGO, to provide access to further information about genes and GO terms. We also enhanced the ENCORE workflow through 3D interaction in the Rensselaer Campfire, extending traditional user interfaces to span multiple windows through the creation of a new package and paradigm, Multi-Window Shiny. With ENCORE, we were able to extend and bolster the results of previous studies on various organisms, reaching novel conclusions that were not available with previous tools. In total, we find that the application of ENCORE to large-scale circadian data sets allows for a deeper understanding of the impact of circadian regulation on biological functions.
With the third piece of the PAICE Suite, MOSAIC (Multi-Omics Selection with Amplitude Independent Criteria), we bridge the gap between multiple omics types to reduce the noise effects prevalent in omics datasets. MOSAIC extends and augments the ECHO model by performing model selection on circadian omics data, including both non-oscillatory and oscillatory models. MOSAIC then connects multiple omics types through joint modeling, sharing the period parameter between the transcriptome and proteome to identify previously unrecovered rhythms. Using synthetic data and proteomic data from Neurospora crassa, we showed that MOSAIC's workflow finds more rhythms than other identification methods and highlights the differences in circadian regulation between different omics types. Thus, we can correctly identify significant circadian and non-circadian trends in both the transcriptome and the proteome to get at the heart of post-transcriptional regulation.
We build ECHO, ENCORE, and MOSAIC into an easily navigable set of applications, the PAICE Suite, offering an ease of use unmatched by current methods in circadian biology and leading to fundamental new biological insights and enhanced methodology in the field of machine learning.
By developing these three methods, we strengthen both modeling and interaction techniques. Novel approaches for analysis of extended harmonic oscillators address model fitting, selection, evaluation and visualization, with applicability in various fields. Since cyclic data frequently results from both natural and artificial control systems, the approaches developed here can be deployed in many other areas, including semiconductor manufacturing and cell cycle studies. The automatic starting point discovery we develop in ECHO and MOSAIC not only applies to the general models provided, but also provides intuition for any models with these specific parameters. The development of these tools on relatively low-resolution data reinforces their robustness. Further, through work with ENCORE, we enhance human-computer interaction by introducing an easily accessible multi-window paradigm, mwshiny, to take advantage of an increasingly multi-monitor world. As such, it has been adopted as the default paradigm for the Rensselaer Campfire, deployed for applications ranging from health to social dynamics.
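As a rough illustration of the extended harmonic oscillator idea that the abstract attributes to ECHO, the sketch below evaluates a cosine whose amplitude envelope shrinks or grows exponentially and labels the resulting amplitude change. The functional form, the parameter names (`amplitude`, `gamma`, `omega`, `phase`, `offset`), and the 0.03 tolerance are illustrative assumptions rather than details taken from this record, and no fitting step is shown.

```python
import math


def extended_oscillator(t, amplitude, gamma, omega, phase, offset):
    """Cosine whose envelope decays (gamma > 0) or grows (gamma < 0) over time.

    Assumed form: amplitude * exp(-gamma * t / 2) * cos(omega * t + phase) + offset.
    """
    envelope = amplitude * math.exp(-gamma * t / 2.0)
    return envelope * math.cos(omega * t + phase) + offset


def amplitude_change_category(gamma, tolerance=0.03):
    """Label the amplitude trend; the tolerance is a hypothetical cutoff."""
    if gamma > tolerance:
        return "damped"
    if gamma < -tolerance:
        return "forced"
    return "harmonic"


# Sample a 24-hour rhythm every 2 hours over 48 hours.
omega = 2.0 * math.pi / 24.0
times = [2.0 * i for i in range(25)]
damped = [extended_oscillator(t, 1.0, 0.1, omega, 0.0, 0.0) for t in times]
forced = [extended_oscillator(t, 1.0, -0.1, omega, 0.0, 0.0) for t in times]
```

With `gamma = 0` the expression reduces to the fixed-amplitude harmonic oscillator, which is why a single extra parameter is enough to distinguish damped, forced, and harmonic rhythms in this sketch.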
Description
May 2020; School of Science
Department
Dept. of Computer Science;
Publisher
Rensselaer Polytechnic Institute, Troy, NY
Relationships
Rensselaer Theses and Dissertations Online Collection;
Access
Restricted to current Rensselaer faculty, staff and students. Access inquiries may be directed to the Rensselaer Libraries.;