We thought about throwing up a post discussing US ex rel Jones v. Brigham and Women's Hospital and Harvard Medical School last month but couldn't come up with a good tie-in to mass torts, toxic torts, regular-old-torts or any of the science stuff we like to write about.
US ex rel Jones is a False Claims Act case arising out of an allegation that a researcher "deliberately manipulated [data] in order to achieve statistically significant results". Had the results not been statistically significant, the researcher "could not have reported his findings in published scientific journals or to the NIH in support of an application" for the grant made the basis of the FCA claim, said the government's expert. The trial court granted summary judgment for the defendants, but the Court of Appeals for the First Circuit reversed. Scientific fraud will at last be on trial.
But ok, somebody (or somebodies) cooked the books to get a bunch of NIH grant money. We're shocked. Shocked! That a tool created for gamblers has been used to burnish entreaties to the National Institutes of Health based on probabilistic reasoning. Alas. That's where we left it. Then we read "Fraud-Detection Tool Could Shake Up Psychology".
It discusses a statistical tool that lets its user surmise whether a researcher under investigation has been cherry-picking the data, consciously or perhaps unconsciously excluding data that undermine his hypothesis. Apparently, by excluding data points that severely conflict with his hypothesis, a dishonest researcher can (surprisingly) reduce variance and thereby increase significance. The problem for the dishonest researcher is that, across all the data, a histogram of p-values from honest tests of true null hypotheses is supposed to be roughly flat (p-values under the null hypothesis are uniformly distributed), whereas the dishonest researcher's p-values pile up in a telltale bulge just below the 0.05 cutoff.
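The mechanism is easy to demonstrate with a toy simulation (ours, not the tool's; the setup and all numbers below are illustrative, and we use a normal approximation to the t-test to keep it self-contained). A simulated "honest" researcher tests a true null hypothesis and reports whatever p-value he gets; a "dishonest" one quietly deletes his most contrary observations until significance appears:

```python
import math
import random
import statistics

def p_value(xs):
    # Two-sided one-sample test of mean 0, using a normal
    # approximation to the t statistic for simplicity.
    n = len(xs)
    t = statistics.mean(xs) / (statistics.stdev(xs) / math.sqrt(n))
    return 2 * (1 - 0.5 * (1 + math.erf(abs(t) / math.sqrt(2))))

def honest_p(n=30):
    # Draw a sample where the null is true and just report the p-value.
    return p_value([random.gauss(0, 1) for _ in range(n)])

def cherry_picked_p(n=30, max_drops=6):
    # A "researcher" who wants to show the mean is positive keeps
    # deleting the most contrary (lowest) observation until p < .05
    # or his budget of quiet exclusions runs out.
    xs = sorted(random.gauss(0, 1) for _ in range(n))
    p = p_value(xs)
    drops = 0
    while p >= 0.05 and drops < max_drops:
        xs.pop(0)  # nudge the worst-fitting data point into the trash can
        drops += 1
        p = p_value(xs)
    return p

random.seed(0)
sims = 2000
honest_rate = sum(honest_p() < 0.05 for _ in range(sims)) / sims
cooked_rate = sum(cherry_picked_p() < 0.05 for _ in range(sims)) / sims
print(f"honest:        {honest_rate:.1%} 'significant'")
print(f"cherry-picked: {cooked_rate:.1%} 'significant'")
```

Run it and the honest false-positive rate hovers near the nominal 5%, while the cherry-picker "finds" an effect in a large share of his studies, despite there being nothing to find.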
There's nothing that limits the tool to empirical psychology, a discipline that has been rocked by repeated episodes of fraud involving researchers bent on generating "science" that confirms the common-sense notions of one group or another. These days pretty much every researcher has the computing power necessary to see which data are critical obstacles in the path of "proving" a hypothesis. Thereafter he can generate ad hoc rationalizations for excluding the troublesome data, ignore them altogether, or just nudge them off the table and into the trash can. Problem solved!
Now we can see who's been solving problems by ignoring (or deleting) problematic data.
So how did we get here, to a place where we need such a tool? Why are most hypotheses supported by statistical data probably false, and why is a lot of what's false actually fraudulent? US ex rel Jones manages to let slip the root cause. Rather than confining significance testing to saying something about whether a hypothesis is probably wrong, it is now commonly, and wrongly, thought "to cause proof of a particular scientific hypothesis to emerge from the data." Thus, rather than discovering whether, on our endless ordeal of trial and error, we've gone off the rails yet again (which is to say that our pet theory has likely been falsified), many have come to use these tools to establish the claim that their hypotheses have been verified (or, worse yet, proven).
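A bit of back-of-the-envelope arithmetic shows why "probably false" is not hyperbole. Suppose, for illustration (the numbers and the function name are ours), that only 1 in 10 hypotheses put to the test is actually true, that tests have 50% power, and that the significance cutoff is 0.05:

```python
def positive_predictive_value(prior, power=0.5, alpha=0.05):
    # Of all "significant" results, what fraction reflect a true hypothesis?
    true_positives = prior * power          # true hypotheses correctly detected
    false_positives = (1 - prior) * alpha   # false hypotheses slipping through
    return true_positives / (true_positives + false_positives)

# If only 1 in 10 tested hypotheses is true:
print(round(positive_predictive_value(0.10), 3))             # → 0.526, honest testing
print(round(positive_predictive_value(0.10, alpha=0.30), 3)) # → 0.156, data-dredging inflates the effective alpha
```

Even with honest testing, barely half of the "significant" findings are true; once cherry-picking inflates the effective false-positive rate, a claim "verified" by statistics is more likely false than not.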
And that gets us to the essence of science. If, per our reading of Karl Popper, all we know is what isn't so, then the way forward to truth is heralded by looking at new data and saying: "That's funny". Similarly, the claim that our dogmas, preconceived notions, biases or prejudices have been verified by science ought to immediately summon our inner skeptics. Of course, too many are content to be fed a steady diet of seeming affirmations of their own convictions. But whichever side of the falsificationist/verificationist divide you fall on, our guess is that in the years to come the application of statistical tools like the one described above will open your eyes. Many things we thought of as "science" may turn out to have been the result of an elaborate fraud. Let the chips fall where they may.