We organize a one-day AlToGeLiS meeting with moderated discussion sessions at KTH Stockholm on June 17, 2022, right after the conference on Mathematics of Complex Data. Registration ended on May 31, 2022.
We organize four moderated problem discussion and working sessions on the topics algebraic statistics, applied algebraic geometry, combinatorics and geometry, and topological data analysis. Participants are warmly encouraged to share questions and problems they would like to discuss. In the morning, each speaker gives a 10-minute presentation on some aspects of their area of research, with particular emphasis on one or two open problems. Each presentation is followed by a moderated discussion in which the audience is encouraged to make comments and ask questions, which are then summarized and translated into concrete problems to work on. In the afternoon, participants choose one of these problems and work on it in teams. The day ends with a short presentation by each team and a dinner for registered participants. Please find more information about the dinner below.
Schedule
09:00-09:40h Session 1 (Kathlén Kohn)
09:45-10:25h Session 2 (Martina Scolamiero)
10:30-11:00h Fika
11:00-11:40h Session 3 (Carlos Améndola)
11:45-12:25h Session 4 (Katharina Jochemko)
12:30-14:00h Lunch
14:00-15:30h Discussion in small groups
15:30-16:00h Fika
16:00-17:00h Discussion in small groups cont’d
17:00-18:00h Wrap-up
19:00-22:00h Dinner
The discussion sessions take place in room D37.
Session 1: Applied algebraic geometry (Kathlén Kohn, moderated by Anna-Laura Sattelberger)
Our world is full of geometry, and so is the data that is being produced every day. Applied algebraic geometry employs algebraic tools to explain the geometric structures that underlie real-world phenomena, data, or methods used across the sciences and engineering. We will discuss open problems in data science and the life sciences. For instance, how can the concept of disentanglement be formally defined? One of the most ambitious challenges in learning is to develop algorithms that disentangle the different factors of variation in the data. What exactly that means depends highly on the application at hand. Several mathematical ideas have been proposed to capture the concept of disentanglement (e.g. [3, 2, 5]), but a widely accepted formal definition that applies to (almost) all practical settings is still missing. Other open problems in machine learning include the theoretical explanation of empirical observations, such as the ability of autoencoders to memorize data (by forming attractors from the training data; see [7]) or the convergence of neural networks to ‘good’ local minima [1]. Networks also play a crucial role in the life sciences, for instance in the form of biochemical reaction networks. Here, open problems include counting the connected components of steady state varieties [6, 4] and deriving a catalog of all semialgebraic sets that can be obtained from “small” chemical reaction networks [9, 8].
References
[1] B. Bah, H. Rauhut, U. Terstiege, and M. Westdickenberg: Learning deep linear neural networks: Riemannian gradient flows and convergence to global minimizers. Preprint arXiv:1910.05505, 2019.
[2] T. Cohen and M. Welling: Learning the irreducible representations of commutative Lie groups. Preprint arXiv:1402.4437, 2014.
[3] J.J. DiCarlo, D. Zoccolan, and N.C. Rust: How does the brain solve visual object recognition? Neuron, 73(3):415–434, 2012.
[4] E. Feliu and M.L. Telek: On generalizing Descartes’ rule of signs to hypersurfaces. Preprint arXiv:2107.10002, 2021.
[5] I. Higgins, D. Amos, D. Pfau, S. Racaniere, L. Matthey, D. Rezende, and A. Lerchner: Towards a definition of disentangled representations. Preprint arXiv:1812.02230, 2018.
[6] M. Pérez Millán, A. Dickenstein, A. Shiu, and C. Conradi: Chemical reaction systems with toric steady states. Bulletin of Mathematical Biology, 74(5):1027–1065, 2012.
[7] A. Radhakrishnan, M. Belkin, and C. Uhler: Overparameterized neural networks implement associative memory. Preprint arXiv:1909.12362, 2019.
[8] A. Shiu: The smallest multistationary mass-preserving chemical reaction network. In: Horimoto, K., Regensburger, G., Rosenkranz, M., Yoshida, H. (eds) Algebraic Biology, pages 172–184, 2008. Lecture Notes in Computer Science, vol 5147. Springer, Berlin, Heidelberg.
[9] A. Shiu and T. de Wolff: Nondegenerate multistationarity in small reaction networks. Discrete and Continuous Dynamical Systems - Series B, 24(6):2683–2700, 2019.
Session 2: Topological data analysis (Martina Scolamiero, moderated by Francesca Tombari)
Topological methods, in combination with classical statistical ones, have proven to be a precious resource for understanding and visualizing data in domains ranging from neuroscience and biology to sensor networks and material science. The study of a data set using topological methods is commonly called topological data analysis (TDA). A standard TDA method is persistent homology, where a data set is studied through a sequence of geometric objects parametrised by the real line. Underlying the popularity of persistent homology are efficient algorithms for computing the barcode that represents the homology of the parametrised geometric objects. In applications, however, there are many scenarios where multiple parameters are of interest, and working over a multidimensional parameter space requires the study of spaces indexed by the poset R^n [1],[2]. Another poset of interest in the TDA community is the zig-zag, used for example in density estimation [3]. The aim of this session is to discuss generalisations of persistence from both theoretical and practical perspectives. Which posets naturally parametrise spaces arising from data [3]? Conversely, for which parametrised spaces can we define computable invariants? Homological invariants for modules over posets are presented, for example, in [4], [5] and [6]. It is also of interest to define and compute a rich space of distances for comparing parametrised spaces in the TDA setting. In addition to providing a means to compare data sets, such distances can be used to define stable invariants [7].
References
[1] G. Carlsson and A. Zomorodian: The Theory of Multidimensional Persistence. Discrete Comput. Geom. 42(1), 71–93 (2009).
[2] V. de Silva, G. Carlsson, and D. Morozov: Zigzag persistent homology and real-valued functions. Proc. 25th Annual Symposium on Computational Geometry (SoCG), June 2009, pp. 247–256.
[3] R. Corbet, M. Kerber, M. Lesnick, and G. Osang: Computing the Multicover Bifiltration. Preprint arXiv:2103.07823.
[4] E. Miller: Homological algebra of modules over posets. Preprint arXiv:2008.00063.
[5] P. Bubenik and N. Milićević: Homological Algebra for Persistence Modules. Found. Comput. Math. 21, 1233–1278 (2021).
[6] W. Chachólski, A. Jin, and F. Tombari: Realisations of posets and tameness. Preprint arXiv:2112.12209.
[7] M. Scolamiero, W. Chachólski, A. Lundman, R. Ramanujam, and S. Öberg: Multidimensional Persistence and Noise. Found. Comput. Math. 17(6), 1367–1406 (2017).
Session 3: Algebraic statistics (Carlos Améndola, moderated by Pratik Misra)
Algebraic statistics is an interdisciplinary field that uses tools from computational algebra, algebraic geometry, and combinatorics to address problems in probability, statistics and related applications. The so-called “mantra” is that many statistical models of interest are semialgebraic sets, defined by polynomial equalities and inequalities. Algebraic statistics is not only concerned with understanding the algebra and geometry of the underlying statistical models, but also with applying this knowledge to improve statistical procedures and to devise new methods for analyzing data. Some of the recent relevant topics in the area with several open problems include parameter identifiability, maximum likelihood estimation and graphical models.
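As a small illustration of this mantra (an example chosen here, not taken from the session itself), consider the independence model for two binary random variables X and Y, with joint probabilities p_{ij} = P(X = i, Y = j). The variables are independent exactly when the 2×2 matrix (p_{ij}) has rank one, so the model can be written as the following semialgebraic set:

```latex
\[
  \mathcal{M} \;=\; \bigl\{\, (p_{00}, p_{01}, p_{10}, p_{11}) \in \mathbb{R}^4 \;:\;
  p_{00}\,p_{11} - p_{01}\,p_{10} = 0,\quad
  p_{00} + p_{01} + p_{10} + p_{11} = 1,\quad
  p_{ij} \ge 0 \,\bigr\}.
\]
```

Here one quadratic equation, one linear equation, and four inequalities cut out the model inside the probability simplex.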
References
[1] M. Drton: Algebraic problems in structural equation modeling. The 50th anniversary of Gröbner bases. Mathematical Society of Japan, 2018. 35-86.
[2] B. Sturmfels: Open problems in algebraic statistics. Emerging applications of algebraic geometry. Springer, New York, NY, 2009. 351-363.
[3] S. Sullivant: Algebraic statistics. Vol. 194. American Mathematical Soc., 2018.
Session 4: Combinatorics and geometry (Katharina Jochemko, moderated by Maria Dostert)
Geometric tomography is concerned with the reconstruction of shapes from geometric data such as volumes of sections and support function evaluations. These tasks arise naturally in various applications and give rise to interesting and challenging mathematical questions, for example about the uniqueness and convergence of the reconstruction; see [2] for a brief overview.
Many classical results in geometric tomography pertain to the reconstruction of general convex bodies. In contrast, reconstruction problems for combinatorially interesting classes of polytopes, parameterized families of convex bodies, and/or objects with restricted geometry are much less studied. See, e.g., [1, 3] for recent examples.
The goal of this project is to advance our knowledge in this area by investigating reconstruction problems for subclasses of convex bodies/polytopes with interesting combinatorial/geometric structure.
References
[1] Maria Dostert and Katharina Jochemko: Learning polytopes with fixed facet directions. Preprint arXiv:2201.03419 (2022).
[2] Richard Gardner: Geometric tomography. http://www.geometrictomography.com/
[3] Yong Sheng Soh and Venkat Chandrasekaran: Fitting tractable convex sets to support function evaluations. Discrete Comput. Geom. 66 (2021), no. 2, 510–551. MR 4292751
Dinner (for registered participants only): The dinner takes place at Restaurang Michelangelo. We recommend taking tunnelbana line 14 from the stop Tekniska högskolan to Gamla Stan. Tickets can be conveniently purchased via the SL app.
Organizers: Sandra Di Rocco, Anna-Laura Sattelberger, Liam Solus, and Francesca Tombari. Supported by the Brummer & Partners MathDataLab and the KTH Department of Mathematics.
We are looking forward to meeting you in Stockholm!