My name is Stefano Spigler. I am a postdoc in the Laboratory of Physics of Complex Systems at the École Polytechnique Fédérale de Lausanne (EPFL, Lausanne, Switzerland) and in the
Institut de Physique Théorique at the Commissariat à l'Énergie Atomique (CEA, Paris). My funding comes from the Simons Collaboration on Cracking the Glass Problem.
My current research focuses on the energy (loss) landscape and the learning dynamics of deep neural networks; more on this in the Research section below.
Education & academic positions
Download the full CV (pdf)
2017-2019
Postdoc
Energy landscape and learning dynamics in deep learning
In collaboration with: Matthieu Wyart & Giulio Biroli
Physics of Complex Systems Laboratory
École Polytechnique Fédérale de Lausanne
Institut de Physique Théorique
Commissariat à l'Énergie Atomique
2014-2017
PhD in Physics
Distribution of avalanches in disordered systems
Supervisor: Silvio Franz
Laboratoire de Physique Théorique et Modèles Statistiques
Université Paris Sud (Université Paris-Saclay)
Scholarship from the École Normale Supérieure
You can read my Ph.D. thesis here
2014
Internship
Laboratoire de Physique Théorique et Modèles Statistiques
Université Paris Sud (Université Paris-Saclay)
2012-2014
Master in Physics of Complex Systems (link)
Politecnico di Torino
International School for Advanced Studies (SISSA)
International Centre for Theoretical Physics (ICTP)
Université Pierre et Marie Curie (UPMC, Paris 6)
Université Paris Diderot (Paris 7)
Université Paris Sud (Paris 11)
École normale supérieure de Cachan
Ranked 1st among all the participants
You can read my M.Sc. thesis here
2009-2012
Scuola Galileiana di Studi Superiori (link)
Skills
Languages
Italian
English
French
Swedish (beginner)
Informatics
Debian/Red Hat based Linux distros,
C/C++, Python and PyTorch,
XML and derived languages, CSS, JavaScript (basic), PHP, SQL,
Mathematica (basic), LaTeX, Office suite
Research
Keywords: disordered systems, glass transition, jamming, avalanche statistics, replica symmetry breaking, neural networks, deep learning.
I had the chance to start working with the Simons Collaboration on Cracking the Glass Problem during my Ph.D., during which I studied the statistical physics of disordered systems, with a focus on the glass transition in spin glasses (link) and structural glasses (link), and on the jamming transition (link) in random packings of spheres.
In particular, I studied the statistics of the response of systems of hard spheres to a small shear strain (figure (a)); this response is random (sample-dependent) and jerky (discontinuous) (figure (b)). The distribution of the response can be computed analytically in infinite dimensions via an approach based on RSB theory (figure (c)), and in this case it displays critical properties, namely a power-law behaviour. Remarkably, this mean-field prediction holds true in finite dimensions as well! The details of my work can be found in my Ph.D. thesis and here.
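Schematically, the power-law behaviour mentioned above can be summarized as
\[
  P(\Delta) \;\sim\; \Delta^{-\tau}\, f\!\left(\Delta/\Delta_{\mathrm{c}}\right),
\]
where Δ denotes the size of a jump in the response, τ is a critical exponent, Δ_c a cutoff scale and f a rapidly decaying scaling function. This is only a generic sketch of such a scaling form; the actual exponents and scaling variables are those derived in the thesis and paper linked above.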
I have recently started a joint postdoc affiliated with the Simons Collaboration, shared between EPFL (Lausanne, Switzerland) and CEA (Paris, France). My research now focuses on the properties of the energy landscape and of the learning dynamics in deep neural networks.
Neural networks are, loosely speaking, algorithms inspired by some features of the (visual) cortex of living animals. (Supervised) neural networks can be used to do inference on data; one typical task is classification: a network is trained on a set of pictures belonging to multiple categories, so that when it is presented with a new image there is a high chance that it will correctly identify the category the image belongs to. Training is a process in which some error function is minimized with respect to the parameters of the model; the error is defined as a measure of the mismatch between the predicted category of a picture and the true one.
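To make the description above concrete, here is a minimal training-loop sketch in PyTorch (one of the tools listed among my skills); the toy network, the synthetic data and the hyper-parameters are illustrative placeholders only, not the models studied in the work below.

# Minimal sketch of the training process described above: a small
# fully-connected network is fit to synthetic "images" by minimizing a
# classification error with stochastic gradient descent. Everything here
# (architecture, data, hyper-parameters) is an illustrative placeholder.
import torch
from torch import nn

torch.manual_seed(0)

# Synthetic dataset: 512 random "images" of 28x28 pixels, 10 categories.
x = torch.randn(512, 28 * 28)
y = torch.randint(0, 10, (512,))

model = nn.Sequential(
    nn.Linear(28 * 28, 128), nn.ReLU(),
    nn.Linear(128, 10),
)
loss_fn = nn.CrossEntropyLoss()             # mismatch between predicted and true category
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

for step in range(10):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)             # error as a function of the parameters
    loss.backward()                         # gradient with respect to the parameters
    optimizer.step()                        # one step of the minimization
    print(f"step {step}: loss = {loss.item():.3f}")

Running it simply prints the value of the error (loss) after each step of the minimization.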
In order to achieve good results on complex datasets, these networks need to be fairly deep, with millions of tunable parameters to be inferred. There are several reasons to think that the energy landscape, as a function of the parameters, is rough and complex, as found in disordered glassy systems, but it turns out that it might be much smoother. The goal of my research is to characterize this landscape and the salient features of the dynamics that the system undergoes while learning (i.e., during training).
Publications
2019
M. Geiger, A. Jacot, S. Spigler, F. Gabriel, L. Sagun, S. d'Ascoli, G. Biroli, C. Hongler, M. Wyart
Scaling description of generalization with number of parameters in deep learning
to be submitted (arXiv preprint)
2018
S. Spigler, M. Geiger, S. d'Ascoli, L. Sagun, M. Baity-Jesi, G. Biroli, M. Wyart
A jamming transition from under- to over-parametrization affects loss landscape and generalization
NIPS workshop "Integration of Deep Learning Theories" (arXiv preprint)
2018
M. Geiger, S. Spigler, S. d'Ascoli, L. Sagun, M. Baity-Jesi, G. Biroli, M. Wyart
The jamming transition as a paradigm to understand the loss landscape of deep neural networks
Submitted to PR X (arXiv preprint)
2018
M. Baity-Jesi, L. Sagun, M. Geiger, S. Spigler, G.B. Arous, C. Cammarota, Y. LeCun, M. Wyart, G. Biroli
Comparing dynamics: deep neural networks versus glassy systems
ICML, PMLR 80:314-323 (PMLR)
2016
S. Franz, G. Gradenigo, S. Spigler
Random-diluted triangular plaquette model: Study of phase transitions in a kinetically constrained model
Phys. Rev. E 93(3), 032601 (PR E)