Notice
Keynote: Replicable empirical machine learning research
Description
In the absence of mathematical theory addressing complex real-life settings beyond simplifying assumptions, the behavior and performance of machine learning methods often have to be assessed by applying them to real or simulated data and observing what happens. In this sense, methodological machine learning research can be viewed as an empirical science. Are the results published in this field reliable? When authors claim that their (new) method performs better than existing ones, should readers trust them? Is an independent study likely to obtain similar results? The answer to all these questions is probably "not always". The so-called replication crisis in science has drawn increasing attention across empirical research fields such as medicine and psychological science. What about good-practice issues in empirical methodological research, which treats methods themselves as research objects? When developing and evaluating new machine learning methods, do we adhere to the good-practice principles typically promoted in other fields? I argue that the machine learning community should make substantial efforts to address what may be called the replication crisis in methodological research, in particular by trying to avoid bias in comparison studies based on simulated or real data. I discuss topics such as publication bias, cherry-picking and over-optimism, experimental design, and the necessity of neutral comparison studies, and I review recent positive developments towards more reliable empirical evidence. Benchmark studies comparing statistical learning methods, with a focus on high-dimensional biological data, will be used as examples.
Topic
On the same topic
-
Tutorial Track 1: Reproducible distributed environments with NixOS Compose
Presented by Quentin Guilloteau, Postdoctoral Fellow, Fernando Ayats Llamas, Research Engineer and Olivier Richard, Assistant Professor.
-
Tutorial Track 1: Reproducibility of Scientific Results using E4S Containers
Presented by Sameer Shende, Research Professor.
-
Tutorial Track 2: Fostering Reproducibility By Integrating Large Language Model and Scholarly Knowl…
Presented by Hassan Hussein, PhD Student, Vindoh Ilangovan, Researcher and Kaouter Kebaili, PhD Student.
-
Tutorial Track 3: Managing HPC Software Complexity with Spack
Presented by Massimiliano Culpo, Researcher.
-
Tutorial Track 4: Practical strategies for teaching reproducibility
Presented by Fraida Fund, Research Assistant Professor, Sarah Cohen-Boulakia, Professor and Bogdan Alexandru Stoica, PhD Student.
-
Session 4: Poster Lightning Talks
Talk 1 [00:00]: NPF: orchestrate and reproduce network experiments. By Tom Barbette. Presented by Tom Barbette, Assistant Professor. Talk 2 [03:10]: From reproducible to reusable bioinformatics
-
Keynote: Reproducibility and replicability of computer simulations
Konrad Hinsen. Since the early days of the reproducibility crisis, much progress has been made in understanding and improving computational reproducibility and replicability (R and R)...
-