Machine learning in non-Euclidean spaces
The activities of our Focus Group are primarily centered on the interplay between geometry, machine learning, and computer vision. Analysis of geometric objects has been a topic of computer vision and pattern recognition since the inception of the field, and geometric principles of symmetry and invariance underpin the success of modern deep learning methods.
Focus Group Computer Vision and Machine Learning
Prof. Michael Bronstein (Imperial College / University of Lugano / Intel / Twitter), Alumnus Rudolf Diesel Industry Fellow | Prof. Daniel Cremers (TUM), Carl von Linde Senior Fellow | Host: Prof. Daniel Cremers, Computer Vision and Artificial Intelligence, TUM
Members of the group have both academic and industrial experience. Until 2019, Michael Bronstein was a principal engineer at Intel, responsible for the development of RealSense range-sensing technology; following the acquisition of his startup Fabula AI in 2019, he became head of Graph Learning Research at Twitter. Daniel Cremers is the founder and Chief Scientific Officer of the autonomous driving startup Artisense.
“Geometric deep learning” (a term coined by Michael Bronstein) approaches machine learning problems from the position of geometric priors such as symmetry and scale separation. Modern machine learning systems routinely need to deal with data in thousands or even millions of dimensions, running into a phenomenon colloquially known as the “curse of dimensionality.” Incorporating prior knowledge about the structure of the data (typically expressed through the symmetry group of the underlying domain) turns out to be a powerful geometric principle that is realized in the majority of popular deep representation learning architectures, across all sorts of data: convolutional networks, ubiquitously used for image analysis (emerging from the translational symmetry of the grid); graph neural networks, Deep Sets, and Transformer architectures (based on principles of permutation equivariance); and gated recurrent neural networks (emerging from invariance to time warping).
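The two symmetry principles named above can be checked numerically on toy layers. The following is a minimal sketch, not code from the group: the function names, the weights, and the choice of a cyclic one-dimensional grid are assumptions made purely for illustration. It tests that a circular convolution commutes with shifts of its input (translation equivariance) and that a Deep Sets-style layer commutes with reordering of its input (permutation equivariance).

```python
import random


def shift(signal, k):
    """Cyclically shift a 1D signal by k positions (a translation on a ring grid)."""
    n = len(signal)
    return [signal[(i - k) % n] for i in range(n)]


def circular_conv(signal, kernel):
    """Circular convolution: out[i] = sum_j kernel[j] * signal[(i - j) mod n]."""
    n = len(signal)
    return [
        sum(w * signal[(i - j) % n] for j, w in enumerate(kernel))
        for i in range(n)
    ]


def set_layer(features, w_self=0.5, w_agg=0.25):
    """Deep Sets-style layer: mix each element with the sum over all elements.

    The weights are arbitrary illustrative values, not learned parameters.
    """
    total = sum(features)
    return [w_self * x + w_agg * total for x in features]


def permute(items, perm):
    """Reorder items so that position i holds items[perm[i]]."""
    return [items[p] for p in perm]


def close(a, b, tol=1e-9):
    """Element-wise comparison up to floating-point rounding."""
    return len(a) == len(b) and all(abs(x - y) <= tol for x, y in zip(a, b))


rng = random.Random(0)
x = [rng.uniform(-1.0, 1.0) for _ in range(8)]
kernel = [0.2, -0.5, 0.3]
perm = list(range(len(x)))
rng.shuffle(perm)

# Translation equivariance: convolving a shifted signal equals
# shifting the convolved signal.
conv_equivariant = close(
    circular_conv(shift(x, 3), kernel),
    shift(circular_conv(x, kernel), 3),
)

# Permutation equivariance: applying the layer to reordered inputs
# equals reordering the layer's outputs.
set_equivariant = close(
    set_layer(permute(x, perm)),
    permute(set_layer(x), perm),
)

print(conv_equivariant, set_equivariant)
```

Both checks hold for any input, shift, and permutation, because each layer is built only from operations that respect the corresponding symmetry: shared weights applied at every grid position, and a sum that is indifferent to the order of its elements.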
Graph neural networks have recently taken the spotlight of the machine learning community, with successful applications ranging from novel antibiotic drug discovery to traffic prediction in Google Maps services. We believe the rapid movement of these methods from a niche interest to the spotlight of research is in part thanks to our contributions and the activities of our Focus Group.
Of particular interest to our Focus Group is bridging the gap between geometric and deep learning by generalizing neural architectures and the underpinning mathematical models to non-Euclidean domains, as well as developing next-generation machine learning methods capable of dealing with geometric data such as meshes and point clouds.
In collaboration with the startup company Ariel AI (founded by Iasonas Kokkinos, who was a visitor in summer 2018 and participated in the TUM-IAS Workshop on Machine Learning for 3D Understanding), we developed a hybrid pipeline for 3D hand pose estimation with an image CNN-based encoder and a geometric decoder [1]. A demo of this system presented at CVPR 2020 allowed the creation of realistic body avatars with fully articulated hands from video input on a mobile phone, faster than real time. Ariel AI was acquired by Snap in 2020.
Another paper [2] co-authored with the team of Nassir Navab (TUM Chair for Computer Aided Medical Procedures & Augmented Reality) and presented at MICCAI 2020 developed a new graph neural network architecture with latent graph learning for automated diagnosis of neurological disorders.
In a new book preview on Geometric Deep Learning [3] authored by Michael Bronstein with Joan Bruna (NYU), Taco Cohen (Qualcomm), and Petar Veličković (DeepMind), we provide a geometric unification for a broad class of machine learning problems and show how to derive some of the most popular deep representation learning architectures from first principles.
Efficient deep learning architectures developed in the group of Daniel Cremers [4] formed the core engine for the project Slow Down COVID-19 at Harvard, allowing the detection of COVID-19 in X-ray scans.
Figure 1
A Cross-Season Dataset for Multi-Weather SLAM in Autonomous Driving
The group of Daniel Cremers presented a novel data set covering seasonal and challenging perceptual conditions for autonomous driving [5]. Among other capabilities, it enables research on visual odometry, global place recognition, and map-based re-localization tracking. The data was collected in different scenarios and under a wide variety of weather conditions and illuminations, including day and night. This resulted in more than 350 km of recordings in nine different environments ranging from a multi-level parking garage to urban (including tunnels), countryside, and highway driving. Moreover, the team of Daniel Cremers developed self-supervised learning methods to recover dense and detailed reconstructions of the world around the car from a single car-mounted camera [6].
New ERC Proof of Concept grant awarded to Michael Bronstein in 2020
Michael Bronstein was awarded the ERC Proof of Concept grant Hyperfoods, to explore the commercial application of graph-based deep learning methods for discovering drug-like molecules in food ingredients. The project, based on previous research of Michael Bronstein on drug repositioning, took a tasty twist through collaboration with the renowned Italian chef Bruno Barbieri, who prepared a series of recipes based on ingredients the team identified. Following an ERC Starting Grant, an ERC Consolidator Grant, and two ERC Proof of Concept Grants, Hyperfoods is Michael Bronstein's fifth grant from the European Research Council.
Figure 2
New ERC Advanced Grant SIMULACRON awarded to Daniel Cremers in 2020
Daniel Cremers started on his fifth ERC grant project, following an ERC Starting Grant, an ERC Consolidator Grant, and two ERC Proof of Concept Grants. SIMULACRON is focused on inferring physical simulations of the observed world from videos. The project addresses the shortcoming that computer vision has been largely focused on recovering the 3D surface of observed objects, neglecting their underlying physical properties. Yet recovering a complete physical model of an observed action should enable us to simulate this action and so to extrapolate better into the future, allowing predictions of what will happen next.
Michael Bronstein received the Royal Academy of Engineering Silver Medal
The medal was awarded for developing “pioneering methods of graph deep learning, a new class of AI algorithms allowing to perform machine learning on complex systems of relations of interactions such as molecules, biological interactomes, and social networks.”
Figure 3
[1] D. Kulon, R. A. Guler, I. Kokkinos, M. M. Bronstein and S. Zafeiriou, “Weakly-supervised mesh-convolutional hand reconstruction in the wild”, 2020, arXiv:2004.01946.
[2] L. Cosmo, A. Kazi, S.-A. Ahmadi, N. Navab and M. M. Bronstein, “Latent-graph learning for disease prediction”, in Medical Image Computing and Computer Assisted Intervention – MICCAI 2020, Lecture Notes in Computer Science, A. L. Martel et al., Eds., Cham: Springer, 2020, vol. 12262, pp. 643-653.
[3] M. M. Bronstein, J. Bruna, T. Cohen and P. Veličković, “Geometric Deep Learning: Grids, Groups, Graphs, Geodesics, and Gauges”, 2021, arXiv:2104.13478.
[4] F. Pasa, V. Golkov, F. Pfeiffer, D. Cremers and D. Pfeiffer, “Efficient Deep Network Architectures for Fast Chest X-Ray Tuberculosis Screening and Visualization”, Scientific Reports, vol. 9, no. 1, 2019.
[5] P. Wenzel, R. Wang, N. Yang, Q. Cheng, Q. Khan, L. von Stumberg, N. Zeller and D. Cremers, “4Seasons: A Cross-Season Dataset for Multi-Weather SLAM in Autonomous Driving”, 2020, arXiv:2009.06364.
[6] F. Wimbauer, N. Yang, L. von Stumberg, N. Zeller and D. Cremers, “MonoRec: Semi-Supervised Dense Reconstruction in Dynamic Environments from a Single Moving Camera”, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021.