Yes, recently, we learned from a highly official source that statisticians are in some kind of a panic:

While the crisis of statistics has made it to the headlines, that of mathematical modelling hasn’t. Something can be learned comparing the two, and looking at other instances of production of numbers. Sociology of quantification and post-normal science can help.

While statistical and mathematical modelling share important features, they don’t seem to share the same sense of crisis. Statisticians appear mired in an academic and mediatic debate where even the concept of significance appears challenged, while more sedate tones prevail in the various communities of mathematical modelling. This is perhaps because, unlike statistics, mathematical modelling is not a discipline. It cannot discuss possible fixes in disciplinary fora under the supervision of recognised leaders. It cannot issue authoritative statements of concern from relevant institutions such as e.g., the American Statistical Association or the columns of Nature.

Andrea Saltelli, “A short comment on statistical versus mathematical modelling” at Nature

So what’s going on? Our physics color commentator Rob Sheldon offers,

The author of this article is contrasting the growing sense of panic in statisticians, with the complacency of modelers.

The panic in sociology, psychology, nutrition science, and pharmacology has been growing as more than 70% of papers with “p-values” smaller than 0.05 are discovered to be unrepeatable.

Since the “p-value” is a statistical quantity invented by Ronald Fisher and is tied to “frequentist” statistics, the competing “Bayesian” statisticians have claimed that the method is deeply flawed. That battle is not new, having been fought since the year that Fisher introduced his p-value, but until recently, had been won by the frequentists. Today, Bayesian methods are not just widely popular, but have replaced frequentists in many niche fields, so that the “irreproducibility” crisis is not simply pointing the finger at a few fraudulent bad apples, but at an entire educational system that promoted p-hacking.

By contrast, modellers have been growing in prestige and fame year upon year. For example, in 2018, nine Neanderthal genomes had been sequenced, and one Denisovan genome.

Yet we have a news item this week, typical of recent news items, which claims that Neanderthals carry 1% of their genes from previous encounters with modern humans.

How do they figure this out? Especially since we have zero genomes from Modern humans that predate Neanderthals?

Models.

But how do we know, asks Andrea Saltelli, if our models are valid? Can we run calibration tests on them with known answers? How about simple consistency checks? What about stating all our assumptions up front?

Nope, nope, and double nope. Modellers get a free pass, while statisticians get the bright lights in their eyes and the grilling from unseen questioners, with the threat of retracted papers and tenure-destroying expulsion.

Saltelli then goes on to show a rather disturbing plot. The more complicated our model becomes, the more ability it has to match our actual data. If you have only two data points, a model needs only two free parameters, and it can find a line through both those points. If you have 3 points, you can find a curve, a quadratic polynomial that will go through them. As long as you have as many free parameters as there are data points, there is always a curve that goes directly through all the points.
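To make the point about free parameters concrete, here is a minimal sketch using Lagrange interpolation, which is one standard way to build the polynomial with as many coefficients as there are data points. The function name `lagrange_interpolate` and the five data points are made up for illustration and come from neither Saltelli nor Sheldon.

```python
from fractions import Fraction


def lagrange_interpolate(points, x):
    """Evaluate at x the unique polynomial of degree < len(points)
    that passes through every (xi, yi) pair in points."""
    x = Fraction(x)
    total = Fraction(0)
    for i, (xi, yi) in enumerate(points):
        term = Fraction(yi)
        for j, (xj, _) in enumerate(points):
            if j != i:
                # Basis factor: equals 1 at x == xi and 0 at x == xj
                term *= (x - xj) / Fraction(xi - xj)
        total += term
    return total


# Hypothetical data: five arbitrary points, so five free parameters (degree 4)
points = [(0, 1), (1, 3), (2, 2), (3, 5), (4, 4)]

# The curve reproduces every data point exactly, so every residual is zero
residuals = [lagrange_interpolate(points, xi) - yi for xi, yi in points]
print(residuals)
```

A perfect fit here says nothing about whether the curve is any good between or beyond the points, which is the question taken up next.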

But is this increasingly complex mathematical model valid?

The way to test it is to find one additional point and see whether the curve fitted to the first n-1 points matches this last point. And weirdly enough, when the model has too many free parameters, it gets more and more “unstable”, more and more “wiggly” as it strains to perfectly match the previous data, with less and less likelihood of matching new data. This is what Saltelli’s disturbing plot shows: the model error is minimized somewhere in the middle of the “complexity” axis.
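The held-out test can be sketched on a toy data set. This is not Saltelli’s plot, only an illustration under stated assumptions: the four training points are hypothetical, chosen to lie on the line y = 2x with alternating noise of +1 and -1, and the fifth point is held back. The helpers `fit_polynomial` and `predict` are illustrative names, and the fit is ordinary least squares solved exactly with fractions so that rounding plays no part.

```python
from fractions import Fraction


def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial coefficients, lowest power first (exact arithmetic)."""
    m = degree + 1
    # Normal equations (A^T A) c = A^T y, where A[i][k] = xs[i] ** k
    ata = [[Fraction(sum(x ** (r + c) for x in xs)) for c in range(m)] for r in range(m)]
    aty = [Fraction(sum(y * x ** r for x, y in zip(xs, ys))) for r in range(m)]
    aug = [row + [rhs] for row, rhs in zip(ata, aty)]
    # Gauss-Jordan elimination on the augmented matrix
    for col in range(m):
        pivot = next(r for r in range(col, m) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(m):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[r][m] for r in range(m)]


def predict(coeffs, x):
    """Evaluate the fitted polynomial at x."""
    return sum(c * Fraction(x) ** k for k, c in enumerate(coeffs))


# Hypothetical data: y = 2x with alternating +1 / -1 noise
train_x, train_y = [0, 1, 2, 3], [1, 1, 5, 5]
held_x, held_y = 4, 9  # the one additional point, never used for fitting

training_sse, held_out_error = [], []
for degree in range(4):
    coeffs = fit_polynomial(train_x, train_y, degree)
    training_sse.append(sum((predict(coeffs, x) - y) ** 2 for x, y in zip(train_x, train_y)))
    held_out_error.append(abs(predict(coeffs, held_x) - held_y))

print("training SSE by degree:  ", [str(v) for v in training_sse])    # 16, 16/5, 16/5, 0
print("held-out error by degree:", [str(v) for v in held_out_error])  # 6, 2, 2, 16
```

On this data the training error only ever falls as parameters are added, reaching zero at degree 3, while the error on the held-out point is smallest at degrees 1 and 2 and then jumps at degree 3. Other data sets will give different numbers, but that is the shape of the curve being described.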

So rather than complimenting our modellers (think global climate models) for matching past data perfectly by adding in adjustable variables (aerosols, feedback), we should be suspicious that they are actually making their predictions worse by overcomplicating them.

And it isn’t just Neanderthal genetics and global climate models. This is true for every area of science, from cosmology to particle physics to cladistics and AI. This is why IBM is abandoning “Deep Mind.” The problem wasn’t fixed by throwing more complexity at it.

So rather than being complacent, modellers ought to be in as much of a panic as statisticians. Saltelli is not abandoning modelling; he just wants it to be ethical. From his concluding paragraph: “While this vision is gaining new traction [sociology of modelers working with suppliers of data and users of models] more could be done. A new ethics of quantification must be nurtured.”

Perhaps this is all part of the Paley renaissance, recognizing that the days of coddled dogmatics and their supporting cast of modellers are coming to an end.

*See also:* Confirmed: Deep Mind’s deepest mind is on leave. The chess champ computer system just never made money

*Note:* Rob Sheldon is the author of *Genesis: The Long Ascent*


Interestingly, just a few months ago Eric Holloway and I published a mechanism for testing models against their complexity. It isn’t the first (or last) word on model complexity testing, but it offers a straightforward way of criticizing overly-complex models.

Generalized Information: A Straightforward Method for Judging Machine Learning Models.

The short, short version is, we want our models to generalize our data. Generalization means that the model should be smaller than the data. It should be smaller still based on the amount of error. While this doesn’t guarantee fit, it at least seems to give a good starting point.

Rob:

There’s the case of Lord Monckton’s “simple model,” using a much simpler, and likely more apt, formula for ‘feedback,’ and which “models” recent climate well, much better than the other climate modelers.

So, instead of a “line,” we get a “cloud.” And within that “cloud” everything is related to everything else in an almost equal way. Which means you end up with no correlation at all.

From the Nature article:

From Dembski’s 1998 paper, http://www.arn.org/docs/dembski/wd_idtheory.htm

Intelligent Design as a Theory of Information

What is it for a possibility to be identifiable by means of an independently given pattern? A full exposition of specification requires a detailed answer to this question. Unfortunately, such an exposition is beyond the scope of this paper. The key conceptual difficulty here is to characterize the independence condition between patterns and information. This independence condition breaks into two subsidiary conditions: (1) a condition of stochastic conditional independence between the information in question and certain relevant background knowledge; and (2) a tractability condition whereby the pattern in question can be constructed from the aforementioned background knowledge. Although these conditions make good intuitive sense, they are not easily formalized. For the details refer to my monograph *The Design Inference*.

This is exactly what Dembski’s efforts sought to address, to which he gives a full explanation in *No Free Lunch*.

Most people, even those who should know better, can’t differentiate between knowing the pattern beforehand and then discovering it, and just discovering the pattern after the fact. For the former, the question is, “Is a pattern present prior to the generation of the pattern I now recognize?”, while for the latter, the question that is asked is simply, “Do I recognize a pattern?” They’re not the same thing.

The latter is no more than “ritual.”

News, I was taught from the beginning that models are useful [if empirically reliable and even better, accurate at predictions . . . ] rather than true. Indeed, later, I saw that a key difference for theories was that they had some possibility of being true [= accurate to reality]. Yet later, I realised there is a debate between [chastened?] scientific realists and anti-realists [think, Feyerabend, Lakatos and Kuhn] with the ghost of the pessimistic induction on the history of falsified theories haunting the discussion. It looks a lot like we are falling into an abyss of reducing everything to modelling, then locking in some models as effectively sacrosanct because of ideologies such as evolutionary materialistic scientism. Maybe, it is time for serious reconsideration. Back to my son-assigned homework, reading Bishop Berkeley: just what is matter in an era of the Casimir effect and linked quantum field theory? Lurking, and what is mind too? KF