Comprehensive deep learning pipeline for whale shark recognition

Kholiavchenko, Maksim
Other Contributors
Ivanov, Radoslav
Gittens, Alex
Stewart, Charles V.
Issue Date
Computer science
Terms of Use
Attribution-NonCommercial-NoDerivs 3.0 United States
This electronic version is a licensed copy owned by Rensselaer Polytechnic Institute (RPI), Troy, NY. Copyright of original work retained by author.
Full Citation
The whale shark is the largest fish species in existence today. The main threat to the whale shark population is poaching. Despite conservation efforts, whale shark hunting persists in tropical countries due to human population growth and the resulting increase in demand for food. The long maturation period and slow rate of reproduction add to the whale shark population's vulnerability. Whale sharks are listed as an endangered species by the International Union for Conservation of Nature, which estimates a 50% decline in the whale shark population over the last 75 years. Whale sharks migrate over great distances in search of plankton. To date, little is known about their life cycle, behavior, and reproduction. Recognizing individual whale sharks is a starting point for studying the migrations of these animals. In this work, we present an approach for whale shark recognition based on region-of-interest detection, spot segmentation, and deep metric learning. Whale sharks are speckled with white spots and lines. These natural markings are distinctive, which makes it possible to achieve good recognition results with modern deep learning techniques. We employ a multi-stage approach. First, we prepare a novel whale shark detection dataset and train the YOLOv5s model to detect the area from the pectoral fin to the dorsal fin. This area contains a large amount of whale shark biometric information, such as uniquely patterned white spots. Second, we train a U-Net model with the SEResNet34 backbone to segment these spots on whale sharks' bodies. Third, we train an InceptionResNet embedding model that uses the spot locations together with the originally detected whale shark image to produce high-quality embeddings. Finally, we introduce an embedding-based recognition algorithm and validate its performance.
For the experiment without new individuals in the test set, our algorithm scores 93% top-1 recognition accuracy, while for the experiment with new individuals in the test set, it scores 83%.
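The abstract does not spell out the recognition algorithm, so the following is only a minimal sketch of how embedding-based recognition with possible new individuals is commonly set up, not the thesis's method. It assumes each image has already been reduced to a non-zero embedding vector, matches a query to its nearest gallery embedding by cosine distance, and treats a query farther than a threshold from every known individual as new. The function names, the threshold value, and the toy three-dimensional embeddings are all hypothetical.

```python
import math


def cosine_distance(a, b):
    """Cosine distance between two non-zero vectors (0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def recognize(query, gallery, threshold=0.5):
    """Return the identity of the nearest gallery embedding.

    Returns None (interpreted as "new individual") when even the nearest
    gallery embedding is farther away than `threshold`. The threshold is an
    illustrative assumption, not a value from the thesis.
    """
    best_id, best_dist = None, float("inf")
    for identity, embedding in gallery:
        dist = cosine_distance(query, embedding)
        if dist < best_dist:
            best_id, best_dist = identity, dist
    return best_id if best_dist <= threshold else None


def top1_accuracy(queries, gallery, threshold=0.5):
    """Fraction of queries whose predicted identity equals the true one."""
    correct = sum(
        1 for embedding, true_id in queries
        if recognize(embedding, gallery, threshold) == true_id
    )
    return correct / len(queries)


# Toy data: two known individuals and three queries, the last one unseen.
gallery = [("shark_A", [1.0, 0.0, 0.0]), ("shark_B", [0.0, 1.0, 0.0])]
queries = [
    ([0.9, 0.1, 0.0], "shark_A"),
    ([0.1, 0.9, 0.0], "shark_B"),
    ([0.0, 0.0, 1.0], None),  # true label None marks a new individual
]
print(top1_accuracy(queries, gallery))
```

In this setup the two experiments in the abstract differ only in whether the test queries include individuals absent from the gallery; with such queries present, the choice of threshold trades off missed matches against false matches.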
May 2022
School of Science
Dept. of Computer Science
Rensselaer Polytechnic Institute, Troy, NY
Rensselaer Theses and Dissertations Online Collection
CC BY-NC-ND. Users may download and share copies with attribution in accordance with a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 license. No commercial use or derivatives are permitted without the explicit approval of the author.