Author
Park, Daniel
Other Contributors
Yener, Bülent, 1959-; Milanova, Ana; Zaki, Mohammed J., 1971-; Pendleton, Marcus;
Date Issued
2021-05
Subject
Computer science
Degree
PhD
Terms of Use
This electronic version is a licensed copy owned by Rensselaer Polytechnic Institute, Troy, NY. Copyright of original work retained by author.;
Abstract
Many in cybersecurity are beginning to refer to machine learning as a "silver bullet". Machine learning algorithms, due to their ability to recognize patterns, have been used in many cybersecurity fields, especially malware detection. Due to the large number of sophisticated malicious files created and transmitted over the internet daily, machine learning has been used to automate early threat detection and analysis using methods ranging from k-nearest neighbors to convolutional neural networks. However, it has recently become known that machine learning models, especially complex convolutional neural networks, are susceptible to adversarial examples: inputs that induce misclassification by the target model using perturbations imperceptible to the human eye. Motivated by the cybersecurity realm's recent push towards machine learning-based automation, this dissertation explores machine learning in malware detection, the vulnerability of machine learning-based malware detection models, and how these models can be defended.;
This dissertation investigates the application of machine learning algorithms at various points in the malware detection pipeline. We find that although machine learning can be used to increase automation and future threat detection, it also faces a trade-off due to its vulnerability to adversarial examples. We demonstrate both the usefulness and the vulnerability of machine learning models used in malware detection and take steps towards ensuring a safe adoption of machine learning in the cybersecurity field.;
The first contribution of this dissertation is a technique for obfuscated malware detection on low-powered internet-of-things (IoT) devices. The state of the art in malware detection uses a large number of features extracted through static and dynamic analysis to train a deep neural network. However, this is not feasible on IoT devices due to their computational and power constraints. As an alternative, we propose using Markov matrices as features for low-power malware detection. We experimentally show that the proposed method maintains a high malware detection rate with a low false-negative rate while using less power than related works.;
The second contribution is an attack against machine learning-based malware detectors that creates evasive malware samples guided by adversarial examples. We show that by exploiting adversarial examples, we can use simple dummy-code insertion to create obfuscated malware with higher evasion rates than malware obfuscated with more complex techniques, such as control-flow obfuscation. Additionally, we provide the first survey and classification of practical adversarial malware attacks, or attacks against machine learning models that produce executable malware samples.;
The third contribution is a general defense against adversarial examples. We show that output randomization can be used as a defense against multiple classes of adversarial example attacks. We explore its effectiveness in white-box and black-box adversarial settings and mathematically formulate its effect on the adversary. Unlike other randomization approaches, output randomization has low overhead and does not require additional randomization at test time. The reduced overhead allows the proposed output randomization defenses to be easily implemented and used together with other defenses, such as adversarial training.;
Description
May 2021; School of Science
Department
Dept. of Computer Science;
Publisher
Rensselaer Polytechnic Institute, Troy, NY
Relationships
Rensselaer Theses and Dissertations Online Collection;
Access
Restricted to current Rensselaer faculty, staff and students. Access inquiries may be directed to the Rensselaer Libraries.;