Data analytics of time-series for complex (biological) systems

Dhulekar, Nimit
Other Contributors
Yener, Bülent, 1959-
Stewart, Charles V.
Magdon-Ismail, Malik
Pandey, Gaurav
Issue Date
Computer science
Terms of Use
Attribution-NonCommercial-NoDerivs 3.0 United States
This electronic version is a licensed copy owned by Rensselaer Polytechnic Institute, Troy, NY. Copyright of original work retained by author.
Full Citation
Complex time-series systems such as biological networks have been studied for many years using conventional molecular and cellular techniques. However, the multiscale nature of these networks makes these techniques limited in their application. In this thesis, we present coupled interdisciplinary algorithms covering disparate concepts such as graph theory, level sets, autoregressive modeling, and domain knowledge transfer, with a view to improving the modeling and prediction of the evolution of biological networks. Applying our approaches to various modalities such as image-based and signal-based data, we demonstrate the importance of coupling these techniques to obtain a much improved holistic algorithm.
We then move on to the problem of investigating the susceptibility of heterogeneous mouse populations to Mycobacterium tuberculosis. Different mouse populations react in different ways to the bacterial disease. However, this differentiation is not completely obvious, and one of the goals of this project was to identify the set of features that can perform this separation successfully. We present results on using different feature sets and multiple supervised classification algorithms to separate the various mouse populations.
The other critical problem associated with epilepsy is seizure prediction, or early seizure detection. Looking at a raw EEG signal, it is not possible for physicians to identify early indicators of seizures. The problem is further complicated by two constraints: optimizing the seizure prediction horizon and minimizing the false positive rate. The seizure prediction horizon is the period during which predictions are made about the impending seizure. This period has to be chosen so that it gives the patient ample time to prepare for the seizure, but is not so long that it disrupts the patient's life. In terms of the success of the algorithm, missing a seizure completely is much worse than making a few incorrect predictions, and would make the system unusable in a real-world setting. At the same time, the false positive rate has to be minimized. We tackle these problems by augmenting the synchronization graphs with concepts from linear algebra and machine learning. In particular, we construct an autoregressive process on the features calculated from the synchronization graphs. The autoregression coefficients are then refined using transfer learning and manifold alignment. We then use a one-dimensional error profile that compares our prediction of the state of the system with its actual state.
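As a rough illustration of an autoregressive process on graph features and the resulting one-dimensional error profile, the following Python sketch fits a vector autoregression by ordinary least squares and reports the per-step prediction error. The function names, the model order, and the use of plain least squares are assumptions made for this example; the transfer learning and manifold alignment steps are not shown, and the abstract does not specify how the error profile is turned into a prediction.

```python
import numpy as np

def lagged_design(features, order):
    """Stack [x_{t-1}, ..., x_{t-order}] for every t >= order.

    features: array of shape (T, d), one feature vector per time window.
    Returns an array of shape (T - order, order * d).
    """
    T = features.shape[0]
    return np.hstack([features[order - k: T - k] for k in range(1, order + 1)])

def fit_autoregression(features, order=2):
    """Least-squares estimate of coefficients so that x_t ~ X_lagged @ coef."""
    X = lagged_design(features, order)
    Y = features[order:]
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return coef  # shape (order * d, d)

def error_profile(features, coef, order=2):
    """One value per time step: norm of (actual state - predicted state)."""
    predicted = lagged_design(features, order) @ coef
    return np.linalg.norm(features[order:] - predicted, axis=1)
```

One possible use, again as an assumption, is to fit the coefficients on a seizure-free stretch of the recording and watch for sustained increases in the error profile on later windows.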
Next, we tackle problems related to epileptic seizures, in particular seizure localization and early seizure prediction. Epilepsy is a growing concern among physicians, and a concentrated effort is being made to develop algorithms for these problems. Using the notion of a seizure as a synchronous event spreading to all regions of the brain, we build synchronization graphs to quantify this neural activity. The EEG electrodes inserted into the brain to record neural activity form the vertices of this graph, and an edge is added between two vertices when they record similar neural activity. By calculating features describing these time-evolving synchronization graphs, we are able to localize the seizures temporally. A tensor-based approach, with three modes consisting of electrodes, time, and features, is used to spatially localize the seizure to a particular side of the brain.
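The construction of a time-evolving synchronization graph can be sketched in Python as follows. This is a minimal illustration, not the thesis's implementation: it assumes that "similar neural activity" is measured by the absolute Pearson correlation between two electrodes within a time window, and the threshold, window length, and the three features computed here are hypothetical choices.

```python
import numpy as np

def synchronization_graph(window, threshold=0.7):
    """Boolean adjacency matrix for one window of EEG.

    window: array of shape (n_electrodes, n_samples).
    Two electrodes are connected when |correlation| >= threshold
    (an assumed similarity measure). Constant channels yield NaN
    correlations and therefore no edges.
    """
    corr = np.corrcoef(window)
    adj = np.abs(corr) >= threshold
    np.fill_diagonal(adj, False)  # no self-loops
    return adj

def graph_features(adj):
    """A few simple descriptors: mean degree, max degree, edge density."""
    n = adj.shape[0]
    degrees = adj.sum(axis=1)
    density = adj.sum() / (n * (n - 1))
    return np.array([degrees.mean(), degrees.max(), density])

def sliding_features(eeg, win, step, threshold=0.7):
    """Feature matrix with one row per window, shape (n_windows, 3)."""
    rows = []
    for start in range(0, eeg.shape[1] - win + 1, step):
        adj = synchronization_graph(eeg[:, start:start + win], threshold)
        rows.append(graph_features(adj))
    return np.array(rows)
```

Stacking per-electrode features over windows in the same way would give an array indexed by electrode, time, and feature, which matches the three modes mentioned above.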
In keeping with the primary theme of coupling algorithmic techniques, we then combine this graph-theoretical model with level sets to create an improved model. This model takes into account cellular spatial organization and provides a better method for modeling gland evolution. We demonstrate that this coupled cellular level set model simulates the growth of the tissue much better than other models currently in use.
We begin by investigating cleft formation in the first round of branching morphogenesis in the mouse submandibular salivary gland. The mouse model has been well-established in biology as a precursor testing ground before human testing. By developing a model that can predict this developmental process, we would be one step closer to building realistic models for human salivary glands. Here, we present a dynamic-graph-based algorithm that not only describes tissue evolution using novel region-of-interest features but also predicts the growth of the tissue under varying concentrations of growth factors.
May 2015
School of Science
Dept. of Computer Science
Rensselaer Polytechnic Institute, Troy, NY
Rensselaer Theses and Dissertations Online Collection
CC BY-NC-ND. Users may download and share copies with attribution in accordance with a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 License. No commercial use or derivatives are permitted without the explicit approval of the author.