Image segmentation and deep metric learning for whale shark re-identification

Pianin, Eric
Other Contributors
Stewart, Charles
Gittens, Alex
Magdon-Ismail, Malik
Issue Date
Computer science
Terms of Use
This electronic version is a licensed copy owned by Rensselaer Polytechnic Institute, Troy, NY. Copyright of original work retained by author.
Full Citation
According to a study published by the National Academy of Sciences, the world is in an ongoing sixth mass extinction event. Per the study’s sample of half of known vertebrate species, 32% are experiencing population decreases [2]. These findings motivate Wildbook, a citizen-scientist platform for tracking animal demographics using artificial intelligence and computer vision techniques.
In particular, the whale shark is in dire need of accurate demographic information. Despite being the world’s largest fish, there is currently no robust estimate of its population and distribution, and the species is considered at risk of extinction [14]. Wildbook possesses a large database of whale shark images and annotated locations of spots. These spots are manually labeled within a distinct “measurement region” marked by physical boundaries on the shark’s body. The pattern defined by these spots is, in essence, a virtual fingerprint that uniquely identifies a whale shark specimen.
This thesis covers two main methods of shark re-identification. The first, described in Chapter 3, is spot detection. A fully convolutional neural network is trained to extract spot locations from cropped shark images. These locations are used as input to the Modified Groth Algorithm [1], an algorithm traditionally used for celestial navigation, which has been adapted to identify individual whale sharks based on triangles formed by their spots. This approach achieves 91.9% spot recall in the best configuration.
The second is end-to-end shark re-identification through metric learning. An embedding space is learned such that the Euclidean distance between images of the same shark identity is minimized and the Euclidean distance between images of differing identities is at least a margin α. The learned embedding delivered top-1 recall of 90.06% for previously identified sharks.
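The metric-learning objective described in the abstract can be illustrated with a minimal sketch. This record does not state the exact loss used in the thesis, so the pairwise (contrastive-style) form below is an assumption chosen only because it matches the wording: same-identity pairs are pulled together, and different-identity pairs are penalized only while they are closer than the margin α. The function names `euclidean` and `pairwise_margin_loss`, the default `alpha`, and the plain-list embeddings are hypothetical and for illustration only.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def pairwise_margin_loss(emb_a, emb_b, same_identity, alpha=1.0):
    """Contrastive-style loss for one pair of embeddings (illustrative assumption).

    Same identity: the squared distance is penalized, so it is driven toward 0.
    Different identity: a penalty applies only while the distance is below
    the margin alpha; pairs already at least alpha apart contribute 0.
    """
    d = euclidean(emb_a, emb_b)
    if same_identity:
        return d ** 2
    return max(0.0, alpha - d) ** 2
```

For example, two embeddings of different sharks that are already farther apart than `alpha` incur no loss, while two embeddings of the same shark are penalized in proportion to their squared distance.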
December 2020
School of Science
Dept. of Computer Science
Rensselaer Polytechnic Institute, Troy, NY
Rensselaer Theses and Dissertations Online Collection
Restricted to current Rensselaer faculty, staff and students. Access inquiries may be directed to the Rensselaer Libraries.