
Proliferation of wrong papers at 95% confidence level

Kilotons of breathtaking nonsense, mostly gathering around the environmentalist cult, claim that it is possible - or desirable - to build science upon 2-sigma observations.

If you don't know what that is: a 2-sigma observation is one where the "signal" is only twice as large as the "noise", using a proper definition of their magnitudes - yet the "signal" is claimed to be real and meaningful. The probability that it is not real and that the deviation arose by chance is about 5%. (In reality, it may be much higher because of various biases, but let's generously assume it's 5%.) The idealized two-sided figure is actually closer to 4.6% but let us use the conventional figure of 5%.

So 5% of the statements claimed to be right on the basis of such statistical observations are wrong, while 95% of them are right. Is that a good enough success rate on which to build science? Someone who is not familiar with science, or with rational thinking in general, may think that getting 95% of one's statements right is good enough.

Infection spreads

However, science is not a search for completely isolated insights. Scientific insights, theories, and papers depend on each other. For example, a theory about the shrinking paths of birds due to global warming depends on 3 observations about bird behavior, 4 observations about the rise of temperatures, and so on.

Each of those depends on the union of several other papers, paleoclimatological reconstructions, the assumed validity of various computer models of the climate, and so on. Each climate model depends on 30 other assumptions. You need to rely on the work and conclusions of other scientists. In any science, this is the rule rather than the exception. Even Isaac Newton had to stand on the shoulders of giants.




Assume that your paper claims to be right at the 95% confidence level. This figure only means that you are adding a 5% chance of an error: it assumes that the papers you're building upon are right. Some of them won't be - or at least they're at risk of being wrong.

A typical paper cites several dozen other papers. However, most of the citations are cosmetic: the new paper only genuinely depends on several older papers; the others are cited, but their results are not essential for the new paper. Let's be very modest and assume that your new paper - and every typical paper in your field - only depends on two additional papers written in the previous T years (imagine T=1 or T=2) that are also at a small risk of being wrong. The period of T years will be referred to as one generation (a generation of papers).

In reality, a typical paper may depend on many more older papers, and the propagation of errors would then be much faster, but let's just assume that it depends on two.

Now, let's assume that a paper is correct if its essential "ancestor" papers are correct and if its new "added value" test is right. Of course, the conclusions of the newest paper may also be correct by "chance", despite the fact that the methodology or essential assumptions are wrong, but let's omit this whole branch of papers that are correct because the crucial errors cancel or out of sheer luck. We're only interested in science where not only the results but also the arguments are at least qualitatively correct.

Let's use the symbol P(t-T) for the probability that a typical paper written T years earlier - belonging to the previous "generation" - is correct. We said that the new paper is correct if the two ancestor papers are correct and if the new test works well - which has a 95% probability. You see that the probability that a new-generation paper is correct equals
P(t) = 0.95 * P(t-T)^2
Let's assume that the first-generation papers didn't depend on anything. So they had the following success rate:
P(0) = 0.95
The second-generation papers, written T years later, have
P(T) = 0.95^3 = 0.857
The third-generation papers, written T years afterwards, have
P(2T) = 0.95 * P(T)^2 = 0.698
The fourth-generation papers, written 3T years after the initial papers, have
P(3T) = 0.95 * P(2T)^2 = 0.463
So these fourth-generation papers already have a higher chance of being wrong than right.
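The recursion is easy to check numerically. Here is a minimal Python sketch of the toy model above (the function name and its defaults are my own labels, not anything standard):

```python
def paper_validity(p, ancestors=2, generations=4):
    """Return the probability that a paper of each generation is fully
    correct, assuming every paper adds one new statistical test passed
    at confidence p and builds on `ancestors` previous-generation papers."""
    probs = [p]  # first-generation papers depend on nothing
    for _ in range(generations - 1):
        probs.append(p * probs[-1] ** ancestors)
    return probs

# 2-sigma (95%) papers, each depending on two older ones:
for g, prob in enumerate(paper_validity(0.95)):
    print(f"P({g}T) = {prob:.3f}")
```

Running it reproduces the chain above - 0.950, 0.857, 0.698, 0.463 - so the fourth generation is already more likely to be wrong than right.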

Because in reality T can be as short as 1 or 2 years, it's clear that with these two-sigma "standards", your discipline deteriorates into complete noise and rubbish in (much) less than a decade. I really mean 10 years. In other words, logical arguments in science often (or usually) require four steps or so, so the four-floor model above is realistic, and the valid papers become a minority.

This is clearly unacceptable. Note that my estimate of the dependence on previous papers was very modest - just two previous papers were used. It was enough to see that 2-sigma standards can't suffice for a sustainable science - whatever the science studies.

But in fact, even 3 sigma fails to be enough. At least, it's surely not "safely" enough.

A 3-sigma result has a 99.7% confidence level, i.e. roughly a 1-in-300 chance of being wrong. Assume again that a paper depends on 2 others. An analogous calculation of the validity of successive generations of papers gives
P(0) = 0.997
P(T) = 0.997^3 = 0.991
P(2T) = 0.997 * P(T)^2 = 0.98
P(3T) = 0.997 * P(2T)^2 = 0.958
P(4T) = ... = 0.915
P(5T) = ... = 0.835
P(6T) = ... = 0.695
P(7T) = ... = 0.48
Eight generations - which may still fit within one decade - bring your scientific discipline into complete chaos in which most claims have nothing to do with the truth. If you assumed that each paper depended on 10 older papers, it would take only 2 or 3 generations.
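The same toy recursion can check both claims numerically - the 3-sigma chain with two ancestors and the much faster collapse with ten. (A sketch under the article's own assumptions; note that exact arithmetic gives about 0.465 for the eighth 3-sigma generation, slightly below the 0.48 obtained from rounded intermediate values, with the same conclusion.)

```python
def correctness(p, ancestors, generations):
    """Validity probability per generation: each paper adds one test
    passed at confidence p and builds on `ancestors` older papers."""
    probs = [p]
    for _ in range(generations - 1):
        probs.append(p * probs[-1] ** ancestors)
    return probs

# 3-sigma papers, two ancestors each: the eighth generation is a coin flip
three_sigma = correctness(0.997, ancestors=2, generations=8)
print(f"P(7T) = {three_sigma[-1]:.3f}")  # about 0.465

# ten ancestors per paper: the chain collapses within a few generations
ten_deps = correctness(0.997, ancestors=10, generations=4)
print([round(x, 3) for x in ten_deps])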

Clearly, it's still unacceptable.



Mankind needed more than 5 generations of intense work before it degenerated into communist and socialist swines - otherwise it couldn't have developed the industrial civilization.

You need five sigma, and even with five sigma, you must be very careful that you're not building upon excessively long "trees" of arguments that depend on each other. You had better replace the other "5-sigma" papers you're depending upon by "10-sigma" or truly "fundamental" papers.

Even with 5-sigma standards, the calculation above would tell you that after 20 generations, the papers run amok. But as time goes by, the older papers actually become more certain. So some papers that seemed to be in the 10th generation - which would only enjoy something like a 99.9% confidence level - actually become much more certain because the experimental evidence either gets more accurate or comes to depend on a much shorter pyramid of older results.

Taming the infection

The simplified calculations above of course imply that the errors propagate "exponentially", like an infection. And given the simplified assumptions, they do. But the calculation is idealized and reality is not quite as bad. What you clearly need in science is to keep any potential infection of this kind under control, so that a better model replaces the hopelessly infected one in time.

It means that the time scale calculated from the algorithm above must be longer than the time scale after which the older results become "qualitatively" more certain than they were before (e.g. after which the confidence level in standard deviations doubles). If your scientific discipline doesn't have this property, it won't be able to resist the "infection of errors" because the latter proceeds exponentially.

If your confidence level is P, a number slightly below 1, the probability that a G-th generation paper is valid is something like P^(2^G). For this to stay safely above 50% or so, take the logarithm and observe that
2^G * (-ln P)
must be smaller than approximately one (so that its exponential stays above 50% or so). It means that the number of generations after which the infection swallows your discipline is (not distinguishing e and 2 as the bases of the logarithms because it doesn't make much difference)
G = -ln(-ln P)
Note that there are two logarithms embedded into one another. Just to be sure about the numbers,
-ln(-ln 0.95) = 2.97
-ln(-ln 0.997) = 5.81
-ln(-ln 0.9999994) = 14.32
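These double-logarithm values are easy to reproduce; a quick Python check of the formula above (the function name is my own label):

```python
import math

def collapse_generations(p):
    """Approximate generation count G at which P^(2^G) drops to ~1/2,
    i.e. G = -ln(-ln(p)), ignoring O(1) constants as in the text."""
    return -math.log(-math.log(p))

for p in (0.95, 0.997, 0.9999994):
    print(f"p = {p}: G = {collapse_generations(p):.2f}")
```

This reproduces roughly 2.97, 5.81, and 14.3 for the 2-, 3-, and 5-sigma confidence levels respectively.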
Three generations of survival is clearly not enough for a viable scientific discipline, because much more accurate "proofs" usually arrive after a much longer time than 2.97 times the separation between consecutive papers, i.e. 2.97 generations.

Six generations - which is what you obtain from 3-sigma papers - is marginally enough.

However, 5-sigma confidence, with its 0.9999994 confidence level, gives you 14 generations of survival despite the infection, which is more than enough to replace the older and convoluted arguments - the assumptions or older papers you need - by more accurate or less convoluted ones.

Alternatively, you may claim that the "pyramids" of human knowledge and sensible theories that depend on each other never have more than 14 floors so that the "immunity" of the set of 5-sigma research papers against the pandemics of errors is sufficient.

However, you may see that the environmentalist sciences can be captured by an "ecological" model themselves. Because the reliance on previous 2-sigma papers is omnipresent in the discipline, purely statistical arguments show that the discipline would inevitably be plagued by an unstoppable infection of errors even if the people in it were honest and impartial.

Regardless of the character and interpretation of the hypotheses and theories, it's clear that a working scientific discipline requires at least the 5-sigma standards if its insights are going to be quasi-reliably reused in realistic, slightly longer chains of reasoning that can be as long as 6 steps or more.

People defending a 2-sigma science are loons, pseudointellectual weeds that are trying to infect not only their contaminated sub-world but all of science and all of modern civilization with a diarrhoea of bullshit.

And that's the memo.


Bonus: London Science Museum goes AGW neutral

The U.K. Times brings us some happy news: the London Science Museum goes AGW neutral, claiming that it has to respect the views of people who disagree with the orthodoxy and of those who remain unconvinced.

Also, the climate exhibition was renamed from "climate change" to "climate science" to remove the alarmist bias from the very title. This symbolic act is somewhat similar to the removal of the adjective "socialist" from the name of the Czecho(-)Slovak Republic in 1990.

Just a few months ago, the museum asked its visitors to "Prove It", collecting votes to argue that the Copenhagen accord should be as tough as possible. As you know, the poll ended up as a humiliating loss for the proponents of the climate panic, and Copenhagen has been a blessed and total failure.

Even more importantly, the ClimateGate and the IPCC scandals have made it clear to pretty much everyone that the IPCC-linked scientists are not trustworthy and can't boast any scientific integrity. So the people who lead the museum have actually learned their lesson. They have converged closer to the opinions of the staff e.g. at Pilsen's science museum/center, Techmania, who are somewhat more clearly against the AGW orthodoxy. I was there yesterday - it's a great place!

Via Climate Depot.
Reviewed by DAL on March 24, 2010
