The provisional schedule of the events is available here. The timezone is London time. The talks are held in LT 308 (third floor) in the Huxley building.
You can add the events to your calendar via iCal.
Joan Bruna: Quantitative Benefits of Rank in Attention Layers (Slides)
Abstract: Attention-based mechanisms are widely used in machine learning, most prominently in transformers and graph neural networks. However, hyperparameters such as the rank of the attention matrices and the number of heads are kept nearly the same in all realizations of this architecture, without any theoretical justification. While the total number of parameters (a proxy for ‘capacity’ in the language of modern scaling laws) depends only on the product of rank and number of heads, in this talk we will show dramatic tradeoffs between these two parameters. Namely, we present a simple and natural target function that can be represented using a single full-rank attention head for any context length, but that provably cannot be approximated by low-rank attention unless the number of heads is exponential in the embedding dimension, even for small context lengths. Moreover, we prove that, for short context lengths, adding depth allows the target to be approximated by low-rank attention. For long contexts, we conjecture that full-rank attention is necessary. Joint work with Noah Amsel and Gilad Yehudai.
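To make the rank-versus-heads tradeoff concrete, below is a minimal numpy sketch (a toy illustration, not the construction from the talk; all names and shapes are assumptions) of attention with h heads, each of rank r: the query/key parameter count 2·d·r·h depends only on the product r·h, even though, per the talk, the regimes (r = d, h = 1) and (r small, h large) can differ sharply in expressivity.

    import numpy as np

    def multihead_rank_r_attention(X, heads):
        # X: (n, d) token embeddings; each head is (Wq, Wk, Wv) with Wq, Wk of
        # shape (d, r), so each score matrix X Wq (X Wk)^T has rank at most r.
        out = np.zeros_like(X)
        for Wq, Wk, Wv in heads:
            scores = (X @ Wq) @ (X @ Wk).T / np.sqrt(Wq.shape[1])
            scores -= scores.max(axis=-1, keepdims=True)   # numerically stable softmax
            A = np.exp(scores)
            A /= A.sum(axis=-1, keepdims=True)
            out += A @ (X @ Wv)
        return out

    rng = np.random.default_rng(0)
    d, n, r, h = 16, 8, 2, 4                               # r*h = 8 < d: low-rank regime
    X = rng.standard_normal((n, d))
    heads = [(rng.standard_normal((d, r)), rng.standard_normal((d, r)),
              rng.standard_normal((d, d))) for _ in range(h)]
    print(multihead_rank_r_attention(X, heads).shape)      # (8, 16)
    # Query/key parameters total 2*d*r*h, a function of the product r*h alone;
    # full-rank single-head attention corresponds to r = d, h = 1.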
Kathryn Hess Bellwald: Cochains are all you need (Slides)
Abstract: I will present results from the thesis of my recently graduated student Kelly Maggs, a diverse collection of fruitful applications of the classical algebro-topological notion of “cochains” in signal processing, machine learning, and gene expression analysis. The first of these applications concerns the interplay between discrete Morse theory and combinatorial Hodge theory. The second involves the use of differential k-forms in Euclidean space to represent simplices in a simplicial complex and thus facilitate interpretability and geometric consistency in geometric deep learning, without message passing. The last application leads to a pipeline for detecting closed biological processes of various types (e.g., the cell cycle) from single-cell RNA-seq data, based on persistent cohomology and lead-lag theory for embedded simplicial complexes. Before sketching these applications, I will introduce the notion of cochains and explain how they encode the interplay between geometry and topology.
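As a concrete illustration of the basic objects in the abstract, here is a minimal numpy sketch (my own toy example under standard conventions, not Maggs’s constructions): coboundary operators on a small simplicial complex, the defining identity d∘d = 0, and the Hodge Laplacian on 1-cochains, whose kernel dimension recovers the rank of H^1.

    import numpy as np

    # Simplicial complex: vertices 0..3, five edges, one filled triangle (0,1,2).
    # The loop through (1,2,3) is left hollow, so H^1 has rank 1.
    vertices = [0, 1, 2, 3]
    edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3)]
    triangles = [(0, 1, 2)]

    # Coboundary d0: C^0 -> C^1, acting by (d0 f)(u, v) = f(v) - f(u) on each oriented edge.
    d0 = np.zeros((len(edges), len(vertices)))
    for i, (u, v) in enumerate(edges):
        d0[i, u], d0[i, v] = -1.0, 1.0

    # Coboundary d1: C^1 -> C^2, with alternating signs on the faces of each triangle.
    edge_index = {e: i for i, e in enumerate(edges)}
    d1 = np.zeros((len(triangles), len(edges)))
    for j, (a, b, c) in enumerate(triangles):
        d1[j, edge_index[(b, c)]] = 1.0
        d1[j, edge_index[(a, c)]] = -1.0
        d1[j, edge_index[(a, b)]] = 1.0

    assert np.allclose(d1 @ d0, 0)  # d ∘ d = 0: the defining property of a cochain complex

    # Hodge Laplacian on 1-cochains; its kernel consists of the harmonic 1-cochains,
    # which represent H^1. Its nullity counts the unfilled loops (here: 1).
    L1 = d0 @ d0.T + d1.T @ d1
    print(L1.shape[0] - np.linalg.matrix_rank(L1))  # 1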
Vishnu Jejjala: Deep Learning Topology (Slides)
Pietro Lio: Diffusion and geometric models for medicine
Peter Lu: Tutorial on Optimal Transport (Slides)
Anthea Monod: A Tropical Geometric Perspective on Deep Learning
Islem Rekik: The landscape of generative GNNs in network neuroscience
Michalis Vazirgiannis: Multimodal Graph Generative AI and applications to the biomedical domain
Abstract: Graph generative models have recently been gaining significant interest across application domains; they are commonly used to model social networks, knowledge graphs, molecules, and proteins. In this talk we will present the potential of graph generative models and our recent relevant efforts in the biomedical domain. More specifically, we present a novel architecture that generates medical records as graphs with privacy guarantees, building on and modifying the graph variational autoencoder (VAE) architecture. We train the generative model on the well-known MIMIC medical database and obtain generated data that are very similar to the real records while still providing privacy guarantees. We also achieve promising results with potential for future application in broader biomedical tasks. Finally, we present ongoing research directions for multimodal generative models involving graphs and applications to protein function text generation: the Prot2Text model.
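For context, the following is a minimal numpy sketch of a graph-VAE forward pass in the style of Kipf and Welling’s VGAE, not the privacy-preserving architecture from the talk; all layer sizes and names are illustrative assumptions.

    import numpy as np

    def gcn_layer(A_hat, X, W):
        # One graph-convolution layer: normalized adjacency x features x weights, with ReLU.
        return np.maximum(A_hat @ X @ W, 0.0)

    def vgae_forward(A, X, W1, Wmu, Wlog, rng):
        # Encode nodes to Gaussian latents with a two-layer GCN, then decode
        # edge probabilities as sigmoids of latent inner products.
        n = A.shape[0]
        A_tilde = A + np.eye(n)                    # add self-loops
        deg = A_tilde.sum(axis=1)
        A_hat = A_tilde / np.sqrt(np.outer(deg, deg))  # symmetric normalization
        H = gcn_layer(A_hat, X, W1)
        mu, logvar = A_hat @ H @ Wmu, A_hat @ H @ Wlog
        Z = mu + np.exp(0.5 * logvar) * rng.standard_normal(mu.shape)  # reparameterization
        P = 1.0 / (1.0 + np.exp(-(Z @ Z.T)))       # reconstructed edge probabilities
        return P, mu, logvar

    rng = np.random.default_rng(0)
    n, f, h, z = 6, 4, 8, 2
    A = (rng.random((n, n)) < 0.3).astype(float)
    A = np.triu(A, 1); A = A + A.T                 # random symmetric adjacency, no self-loops
    X = rng.standard_normal((n, f))
    P, mu, logvar = vgae_forward(A, X, rng.standard_normal((f, h)),
                                 rng.standard_normal((h, z)), rng.standard_normal((h, z)), rng)
    print(P.shape)  # (6, 6): sampling from P yields a generated graph

In a trained model the weights would be fit by maximizing the usual evidence lower bound (edge reconstruction likelihood minus a KL term); the privacy mechanisms described in the talk are beyond this sketch.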