Conference
Record
Recording location
Centre Inria de l'Université de Rennes
Language:
English
Credits
Anne-Laure Boulesteix (Speaker)
Image credit: Centre Inria de l'Université de Rennes
Rights holder
Centre Inria de l'Université de Rennes
DOI : 10.60527/5hkk-8e64
Cite this resource:
Anne-Laure Boulesteix. Inria. (2024, June 19). Keynote: Replicable empirical machine learning research. [Video]. Canal-U. https://doi.org/10.60527/5hkk-8e64. (Accessed November 2, 2024)

Keynote: Replicable empirical machine learning research

Recorded: June 19, 2024 - Published online: June 19, 2024
Description

In the absence of mathematical theory that addresses complex real-life settings beyond simplifying assumptions, the behavior and performance of machine learning methods often have to be assessed by applying them to real or simulated data and observing what happens. In this sense, methodological machine learning research can be viewed as an empirical science. Are the results published in this field reliable? When authors claim that their (new) method performs better than existing ones, should readers trust them? Is an independent study likely to obtain similar results? The answer to all these questions is probably "not always". The so-called replication crisis in science has drawn increasing attention across empirical research fields such as medicine and psychological science. What about good practice in methodological empirical research, that is, research that treats methods themselves as research objects? When developing and evaluating new machine learning methods, do we adhere to the good practice principles typically promoted in other fields? I argue that the machine learning community should make substantial efforts to address what may be called the replication crisis in methodological research, in particular by trying to avoid bias in comparison studies based on simulated or real data. I discuss topics such as publication bias, cherry-picking and over-optimism, experimental design, and the necessity of neutral comparison studies, and review recent positive developments towards more reliable empirical evidence. Benchmark studies comparing statistical learning methods, with a focus on high-dimensional biological data, will be used as examples.

