
Arnak Dalalyan, Teacher-Researcher

Advanced Grants

Arnak Dalalyan's research focuses on statistical complexity and the theoretical foundations of machine learning, particularly in the context of large-scale data. He is especially interested in the conditions under which an algorithm can produce results that are reliable, original, and resource-efficient, a central question in the era of generative models, which are used in fields ranging from artistic creation to pharmaceutical research.

After graduating from Yerevan State University, he continued his studies in France, where he obtained a PhD in statistics from the University of Le Mans in 2001, followed by an HDR (Habilitation à Diriger des Recherches, or accreditation to supervise research) from Pierre and Marie Curie University (UPMC) in 2007. After a post-doctorate at Humboldt University in Berlin, he became a lecturer at UPMC, then a professor at the École nationale des ponts et chaussées. He then joined ENSAE Paris and the CREST laboratory1, which he has headed since 2020.

CREST is a joint research unit bringing together researchers in quantitative social sciences and applied mathematics.

  • 1 CNRS / École polytechnique / Groupe des Écoles Nationales d'Économie et Statistique

SAGMOS (Statistical Analysis of Generative Models: Sampling Guarantees and Robustness)

Generative modeling algorithms can automatically produce objects such as texts, images, molecules, or pieces of music similar to those found in a dataset. Although very powerful, these algorithms often rely on large amounts of data and computing resources. The ERC project SAGMOS, led by Arnak Dalalyan, aims to analyze the mathematical properties of these methods in order to better understand their strengths and limitations, improve their efficiency, and design new ones. This analysis draws on tools from several fields of statistics and probability, including dimension reduction, nonparametric estimation, manifold learning, optimal transport, and stochastic calculus. The goal is to obtain interpretable statistical guarantees on the accuracy, robustness, creativity, and frugality of these algorithms, taking into account factors such as sample size, noise level, problem dimension, and the presence of outliers. Particular attention is paid to stability under perturbations and model misspecification, in order to contribute to the development of more reliable, explainable, and economical artificial intelligence algorithms.