Dissertation Defense of CMSE Sarah McGuire

Department of Computational Mathematics, Science & Engineering

Michigan State University

Dissertation Defense Notice

June 3, 2024 12:00pm, Room EGR 1502

https://msu.zoom.us/j/93169668117

Meeting ID: 931 6966 8117
Passcode: cmse

Leveraging Topological Structure of Data for Applications of Deep Learning

Sarah McGuire

Abstract:
Topological data analysis and deep learning are fields that have each seen growing interest and rapid research development. At the intersection of these fields, there is an opportunity to leverage the topological structure of data by incorporating this additional information into deep learning algorithms and methods. The design of deep learning architectures for tasks on topological objects has progressed quickly, fueled by the desire to model the higher-order interactions that often occur naturally in data.

The first focus area of this dissertation is an extension of pooling layers to simplicial complex input data. For deep learning problems on graph-structured data, pooling layers are important for downsampling, reducing computational cost, and minimizing overfitting in the model. We define a pooling layer, NervePool, for data structured as simplicial complexes, which are generalizations of graphs that include higher-dimensional simplices beyond vertices and edges; this structure allows for greater flexibility in modeling higher-order relationships. The proposed simplicial coarsening scheme is built upon partitions of vertices, which allow us to generate hierarchical representations of simplicial complexes, collapsing information in a learned fashion. NervePool builds on the learned vertex cluster assignments and extends them deterministically to the coarsening of higher-dimensional simplices. While in practice the pooling operations are computed via a series of matrix operations, the topological motivation is a set-theoretic construction based on unions of stars of simplices and the nerve complex.
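
The set-theoretic construction described above can be illustrated with a small sketch. It is not the dissertation's implementation: it assumes a hard partition of the vertices is already given (in NervePool the assignments are learned), it works directly with sets rather than matrices, and the function names `closure`, `star`, and `nerve_pool_sketch` are hypothetical. Each cluster is extended to the union of the stars of its vertices, and the pooled complex is taken to be the nerve of that cover.

```python
from itertools import combinations


def closure(maximal_simplices):
    """Return every nonempty face of the given simplices as a set of frozensets."""
    simplices = set()
    for simplex in maximal_simplices:
        simplex = tuple(simplex)
        for size in range(1, len(simplex) + 1):
            simplices.update(frozenset(face) for face in combinations(simplex, size))
    return simplices


def star(simplices, vertex):
    """Star of a vertex: all simplices of the complex that contain it."""
    return {s for s in simplices if vertex in s}


def nerve_pool_sketch(simplices, clusters):
    """Coarsen a simplicial complex given a partition of its vertices.

    Assumption (illustrative reading of the abstract): each cluster is
    extended to the union of the stars of its vertices, and the pooled
    complex is the nerve of the resulting cover. The loop below is
    exponential in the number of clusters, so it only suits tiny examples.
    """
    cover = [set().union(*(star(simplices, v) for v in cluster)) for cluster in clusters]
    pooled = set()
    for size in range(1, len(cover) + 1):
        for indices in combinations(range(len(cover)), size):
            # A set of clusters spans a pooled simplex when their cover
            # elements share at least one simplex of the original complex.
            if set.intersection(*(cover[i] for i in indices)):
                pooled.add(frozenset(indices))
    return pooled


# A filled triangle {0, 1, 2} with a two-edge tail 2-3-4.
complex_in = closure([(0, 1, 2), (2, 3), (3, 4)])
# Hypothetical hard cluster assignment of the five vertices.
clusters = [[0, 1], [2], [3, 4]]
pooled = nerve_pool_sketch(complex_in, clusters)
```

In this toy example the three clusters become the vertices of the pooled complex, and two clusters are joined by an edge exactly when their unions of stars overlap.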

The second focus of this dissertation is another input data type with topological structure, the Euler Characteristic Transform (ECT). The ECT provides a summary of the topological shape of data that is both simple to define and simple to compute for many different input data types, including images, graphs, and embedded simplicial complexes. In contrast to alternative directional transform methods in topological data analysis, the ECT is easier to compute and to represent in a format well suited to machine learning tasks. To leverage the inherent structure of ECT data on a cylinder for our input data types, we employ a particular choice of convolutional neural network (CNN) architecture for the classification of ECT data. We prove that our ECT-CNN pipeline produces equivariant representations of input data, which allows for the use of unaligned input data. We apply the ECT-CNN to two different leaf shape datasets and compare the model performance against traditionally used methods that require data to be pre-aligned, and in doing so we exhibit its efficacy in shape classification tasks.
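
As a rough illustration of why ECT data lives on a cylinder, the following sketch computes a discretized ECT for a graph embedded in the plane. It is a minimal example under stated assumptions, not the dissertation's pipeline: the function name `ect_graph`, the evenly spaced directions, the threshold grid, and the example shape are all hypothetical choices. For each direction and threshold it counts the vertices and edges lying at or below the threshold and records the Euler characteristic, vertices minus edges.

```python
import numpy as np


def ect_graph(vertices, edges, num_directions=8, num_thresholds=16, radius=None):
    """Discretized ECT of a graph embedded in the plane.

    Returns an integer array of shape (num_thresholds, num_directions).
    Columns index directions on the circle, so the column axis is periodic.
    """
    vertices = np.asarray(vertices, dtype=float)
    edges = np.asarray(edges, dtype=int).reshape(-1, 2)
    if radius is None:
        radius = np.linalg.norm(vertices, axis=1).max()
    angles = 2 * np.pi * np.arange(num_directions) / num_directions
    directions = np.stack([np.cos(angles), np.sin(angles)], axis=1)  # (D, 2)
    thresholds = np.linspace(-radius, radius, num_thresholds)        # (T,)
    heights = vertices @ directions.T                                # (V, D)
    # An edge enters the sublevel set once both of its endpoints have.
    edge_heights = np.maximum(heights[edges[:, 0]], heights[edges[:, 1]])  # (E, D)
    cutoff = thresholds[None, :, None]
    vertex_count = (heights[:, None, :] <= cutoff).sum(axis=0)       # (T, D)
    edge_count = (edge_heights[:, None, :] <= cutoff).sum(axis=0)    # (T, D)
    return vertex_count - edge_count


# Hypothetical example shape: a 4-cycle with generic vertex coordinates.
shape_vertices = np.array([[1.0, 0.25], [-0.3, 0.85], [-0.55, -0.7], [0.35, -0.1]])
cycle_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
ect = ect_graph(shape_vertices, cycle_edges, radius=2.0)

# Rotate the shape by two direction steps (90 degrees) and recompute.
theta = 2 * np.pi * 2 / 8
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta), np.cos(theta)]])
ect_rotated = ect_graph(shape_vertices @ rotation.T, cycle_edges, radius=2.0)
```

When the rotation angle is a multiple of the direction spacing, as here, rotating the shape only shifts the ECT array cyclically along its direction axis. This is the kind of structure that a CNN with periodic handling of that axis could exploit on unaligned data.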

Committee Members:

Liz Munch (chair)

Teena Gerhardt

Dan Chitwood

Vishnu Boddeti