Animal behavior recognition from drones using computer vision techniques
Authors
Kholiavchenko, Maksim
Issue Date
2025-12
Type
Electronic thesis
Thesis
Language
en_US
Keywords
Computer science
Abstract
Understanding animal behavior in natural habitats is vital for wildlife conservation. This dissertation examines the application of computer vision methods to recognizing animal behaviors in drone footage, addressing key challenges in wildlife monitoring and conservation. To support this work, two large-scale in situ video datasets are introduced: the Kenyan Animal Behavior Recognition (KABR) dataset, which focuses on ungulates (e.g., giraffes, plains and Grévy's zebras), and the BaboonLand dataset, which focuses on baboons in their natural habitats. The KABR work introduces a method for isolating individual animals within drone videos using what are referred to as "mini-scenes": tightly cropped segments that follow a single animal, creating a consistent viewpoint that makes it easier to observe posture, movement, and subtle behavioral changes. The BaboonLand dataset builds on this by extending the task to baboon detection, tracking, and behavior recognition. It includes multiple individuals moving through varied and often challenging landscapes, such as cliffs, rocks, trees, and rivers. The use of mini-scenes helps maintain focus on each baboon across time, even when individuals overlap or move through dense environments. To further improve tracking performance in these challenging environments, a depth-aware tracking method (DA-Track) is developed. This approach incorporates features of the surrounding terrain into the tracking pipeline to improve tracking stability and identity preservation. For behavior recognition, this work introduces a novel two-stage method called Past, Present, and Future (PPF). This framework combines the speed of convolutional models with the reasoning capabilities of large vision-language models to better recognize rare and ambiguous behaviors in long-tailed datasets.
By drawing temporal context from video segments before and after the behavior of interest, the method improves recognition accuracy for infrequent but ecologically important actions. This dissertation provides practical tools for conservationists to conduct non-invasive, large-scale monitoring of animal populations, quantify subtle behavioral patterns indicative of health or stress, and make more informed decisions for habitat management and species protection. Future applications of these techniques could extend to broader species and ecosystems, ultimately supporting global efforts in biodiversity preservation and ecological sustainability.
Description
December 2025
School of Science
Publisher
Rensselaer Polytechnic Institute, Troy, NY