The geometry of learning and the learning of geometry

Authors
Tatro, Norman Joseph
Other Contributors
Lai, Rongjie
Kovacic, Gregor
Xu, Yangyang
Chen, Pin-Yu
Issue Date
2021-05
Keywords
Mathematics
Degree
PhD
Terms of Use
Attribution-NonCommercial-NoDerivs 3.0 United States
This electronic version is a licensed copy owned by Rensselaer Polytechnic Institute, Troy, NY. Copyright of original work retained by author.
Abstract
Overall, this work concerns the intersection of machine learning and differential geometry. It investigates ways in which these fields can mutually inform each other. Our goal is to bring a geometric paradigm to machine learning, one that provides further insight into existing machine learning models while also extending their capabilities.
Within the past decade, machine learning has emerged as a powerful tool in the new age of Big Data. As the field expands, there is greater interest both in understanding the complexities of machine learning models and in extending these models to more complex data. Geometry is a natural paradigm through which to pursue both of these interests. To this end, we address problems in the geometry of learning and the learning of geometry.
Considering the geometry of learning, we analyze the loss landscapes of neural networks to gain better insight into the networks themselves. These loss landscapes are not well understood due to their high nonconvexity. Empirically, the local minima of these loss functions can be connected by a learned curve in model space along which the loss remains nearly constant, a phenomenon known as mode connectivity. We propose a more general framework for investigating the effect of symmetry on mode connectivity by accounting for permutations of the weights of the networks being connected. To approximate the optimal weight permutation, we introduce an inexpensive heuristic referred to as neuron alignment, which promotes similarity between the distributions of intermediate activations of the models along the curve. We provide analysis establishing the benefit of alignment to mode connectivity based on this simple heuristic. Empirically, optimizing the permutation is critical for efficiently learning a simple, planar, low-loss curve between networks that generalizes well.
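The two ingredients described above, a permutation that aligns hidden units by their activations and a learned parametric curve between the two weight vectors, can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the greedy correlation matching stands in for the full neuron-alignment procedure, and the quadratic Bézier curve with a single trainable midpoint is one common parameterization of a mode-connecting curve.

```python
import numpy as np

def align_neurons(acts_a, acts_b):
    """Permute network B's hidden units so their activation patterns best
    match network A's. Greedy matching on the cross-correlation matrix is
    a cheap stand-in for the alignment heuristic described in the text."""
    n = acts_a.shape[1]
    # corr[i, j] = correlation between A's unit i and B's unit j
    corr = np.corrcoef(acts_a.T, acts_b.T)[:n, n:]
    perm = np.full(n, -1)
    used = set()
    # match the most confident rows first
    for i in np.argsort(-np.abs(corr).max(axis=1)):
        j = max((j for j in range(n) if j not in used),
                key=lambda j: corr[i, j])
        perm[i] = j
        used.add(j)
    return perm  # perm[i] = index of B's unit matched to A's unit i

def bezier_point(w_a, w_b, w_mid, t):
    """Quadratic Bezier curve in weight space joining w_a (t=0) to
    w_b (t=1); the curve is learned by training the midpoint w_mid so
    that the loss stays low along the whole path."""
    return (1 - t) ** 2 * w_a + 2 * t * (1 - t) * w_mid + t ** 2 * w_b
```

In practice one would first apply the permutation returned by `align_neurons` to the rows and columns of B's weight matrices, then optimize `w_mid` against the expected loss over t sampled in [0, 1].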
Considering the learning of geometry, we address the task of unsupervised geometric disentanglement. Geometric disentanglement, the separation of latent embeddings for intrinsic (i.e., identity) and extrinsic (i.e., pose) geometry, is a prominent task for generative models of non-Euclidean data such as 3D deformable models. It provides greater interpretability of the latent space and leads to more control in generation. We introduce architectures and feature descriptors for achieving this disentanglement in multiple settings. We propose CFAN-VAE to achieve unsupervised geometric disentanglement for genus-zero surfaces, such as human bodies. For the space of protein conformations, we introduce ProGAE to accomplish the same task given the backbone of a protein.
Description
May 2021
School of Science
Department
Dept. of Mathematical Sciences
Publisher
Rensselaer Polytechnic Institute, Troy, NY
Relationships
Rensselaer Theses and Dissertations Online Collection
Access
CC BY-NC-ND. Users may download and share copies with attribution in accordance with a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 License. No commercial use or derivatives are permitted without the explicit approval of the author.