Paper: Why We Must Rethink Empirical Research in Machine Learning
Quick take on Herrmann, M. et al. (2024) ‘Position: Why We Must Rethink Empirical Research in Machine Learning’, International Conference on Machine Learning, PMLR.
It is interesting that they call this perspective “empirical ML”. I rather think of their position as making ML/AI a proper branch of science rather than a branch of theoretical CS or of software engineering. The issue is quite analogous to how some practice statistics as a branch of mathematical science, whereas many in the British statistics tradition (Fisher, Box, etc.) practice statistics as a branch of science.
We’ve settled to be second rate mathematicians when we could be first-rate scientists — George Box.
I rather see the current crop of ML papers as too empirical (similar to what I see in many other sciences), lacking any formal methodology or normative principles for drawing conclusions correctly.
We believe that one of the main reasons for this is that ML stands, like few other disciplines, at the interface between formal sciences and real-world applications. Because ML has strong foundations in formal sciences such as mathematics, (theoretical) computer science (CS), and mathematical statistics, many ML researchers are accustomed to reasoning mathematically about abstract objects – ML methods – using formal proofs. On the other hand, ML can also very much be considered a (software) engineering science, to create practical systems that can learn and improve their performance by interacting with their environment. Lastly, and especially concerning experimentation in ML, there exists an applied statistical perspective with a focus on thorough inductive reasoning. With its tradition in data analysis and design of experiments, it emphasizes the empirical aspects of ML research.
A statistical perspective, which we adopt here, is very sensitive to such empirical issues – explaining/analyzing real-world phenomena and generalizing beyond a specific context (inductive reasoning) – and thus particularly suited to explain 1) why ML is faced with non-replicable research, and 2) how a more complete and nuanced understanding of empirical research in ML can help to overcome this situation. With empirical ML we thus mean in a broad sense the systematic investigation of ML algorithms, techniques, and conceptual questions through simulations, experimentation, and observation. It deals with real objects: implementations of algorithms – which are usually more complex than their theoretical counterparts (e.g., Kriegel et al., 2017) – running on physical computers; data gathered and produced/simulated in the real world; and their interplay. Rather than focusing solely on theoretical analysis and proofs, empirical research emphasizes practical evaluations using real-world and/or synthetic data. Empirical ML, as understood here, requires a mindset very different from engineering and formal sciences and a different approach to methodology to allow for the full incorporation of the uncertainties inherent in dealing with real-world entities in experiments.
https://epub.ub.uni-muenchen.de/121738/1/herrmann24b__1_.pdf
We could nearly replace “statistics” with “ML” in Box’s Science and Statistics lecture and it would apply quite well.
[This is a repost of a note, before I realized how Substack Notes work]