Mass Torts: State of the Art

What Is “Natural”? The FDA Thinks It Might Know.

Posted in Microbiology, Risk, The Law

By claiming on the label that their products are “natural,” manufacturers have helped themselves to some $40 billion of consumer dollars each year – making “natural” the second most lucrative label claim after those based on fat content.

While certain informative labeling serves a truly important function – helping consumers with celiac disease or with milk or peanut allergies to select products that are gluten-, lactose-, and peanut-free – the word “natural” on a food label, while clearly comforting to consumers looking to make smart choices about their consumption, has no agreed-upon meaning.  The FDA has so far declined to define what it means for a food to be “natural,” and, since 1993, its only non-binding guidance on the meaning of “natural” has been that “nothing artificial or synthetic (including all color additives regardless of source) has been included in, or has been added to, a food that would not normally be expected to be in the food.”  58 Fed. Reg. 2303, 2407 (January 6, 1993). But this rather unhelpful definition may soon change as a result of the FDA’s recent request for comments on use of the term “natural” in food labeling.

This development is likely a response to the recent wave of litigation involving false advertising claims based on the presence of bioengineered ingredients (genetically modified organisms – GMOs – such as corn grown from bioengineered seeds) in foods labeled “natural.”  Courts have had an easier time with artificial ingredients such as sodium acid pyrophosphate (“SAPP”), finding that they most likely ran afoul of the 1993 guidance and allowing false advertising claims based on the presence of such ingredients in foods labeled “natural” to proceed past motions to dismiss.  E.g., Musgrave v. ICC/Marie Callender’s Gourmet Prods. Div., No. 14-cv-02006 (N.D. Cal. Feb. 5, 2015).  But GMOs have presented a more formidable challenge.  For example, in a trio of cases – Cox v. Gruma Corp., No. 12-06502 (N.D. Cal.), Barnes v. Campbell Soup Co., No. 12-05185 (N.D. Cal.), and In re General Mills, Inc. Kix Cereal Litigation, No. 12-00249 (D.N.J.) – courts found it impossible to determine without the FDA’s guidance whether GMOs fell into the category of “artificial or synthetic” products described in the 1993 guidance.  The Cox, Barnes, and Kix Cereal courts then stayed or administratively dismissed the complaints and referred the question of whether foods containing GMOs can be labeled “natural” to the FDA – the entity the courts agreed had primary jurisdiction over the question.  In a January 6, 2014 letter to the judges, however, the FDA “respectfully” declined to define “natural.”  The FDA did indicate that if it were inclined to provide an actual definition beyond the 1993 guidance, it would not do so in the context of a lawsuit but “would likely embark on a public process such as issuing a regulation or formal guidance.”  Even before the FDA sent its letter, however, many courts had pessimistically predicted this result and declined defendants’ requests to stay or dismiss cases until the FDA weighed in on the definition of “natural.”  E.g., Gedalia v. Whole Foods Mkt. Servs., Inc., No. 4:13-CV-3517 (S.D. Tex. 2014) (“deference to the FDA would likely be unfruitful due to the agency’s long-standing reluctance to officially define the term ‘natural.’”)

Now through February 10, 2016, anyone with an opinion can weigh in on: (1) whether it is even appropriate to define the term “natural”; (2) if so, how the FDA should define it; and (3) how the FDA should determine appropriate use of the term on food labels.  In asking for comments, the FDA itself recognizes that the policy it expressed through the 1993 guidance does not address food production or processing methods – such as the use of pesticides in growing the food, or the use of thermal technologies, pasteurization, or irradiation in food manufacturing – or whether the term “natural” should indicate any nutritional or health benefits.

With respect to the contentious issue of GMOs, it is likely that two lines of thinking will dominate the conversation about whether any degree of gene modification allows a food still to be labeled “natural.”  Some will advocate the position that any modification, even one that merely enhances or suppresses certain existing genetic characteristics of a plant or an animal, excludes the food from the “natural” category.  Others will argue that only the process by which genes are extracted from the DNA of one species and artificially forced into the genes of an unrelated plant or animal makes the food “unnatural.”  The purist position – no genetic tinkering of any kind – skirts the fact that in the “natural” breeding practiced by those who do not manipulate genes through irradiation or the use of Agrobacterium, plants and animals are cross-bred by mixing whole strands of DNA and swapping many genes at once in the hope of getting a new, interesting combination of useful traits.  This approach is messier and can be nearly as rife with the “dangers” of undesirable and potentially harmful genes getting turned on in the process.

It remains to be seen whether the FDA’s newest effort will make it any easier to know when it is proper to label a food “natural.” That the agency was willing to take up the issue, especially in light of the rising number of consumer fraud food labeling class actions, is certainly encouraging.  But, given the ongoing revolutions both in science and in the public’s perception of what constitutes “natural,” it is hard to see this as much more than an attempt not to fall too far behind.

Robreno Tries to Tackle Sorites Paradox; Ford Fumbles Risk Factors

Posted in Causality, Epidemiology, Reason, Risk, The Law

Judge Robreno has done a heroic job of resolving the “elephantine mass” of asbestos litigation stuck in the federal system (MDL 875), but his attempt to resolve an ancient Greek paradox came up short. In a memorandum opinion (Mortimer v. A.O. Smith Corp., et al.) addressing Ford’s motions to exclude Plaintiff’s experts, who were intent on opining that his renal cell cancer was caused by exposure to asbestos, the judge tries to draw a distinction between “any exposure” (a/k/a “every breath”) and cumulative exposure yet misses, I think, the point. Meanwhile, Ford decided to run the risk factor play – by which I mean that it asserted as alternate causes those risk factors for renal cell cancer gleaned from plaintiff’s medical records (e.g. hypertension) while maintaining that asbestos isn’t really a risk factor for renal cell cancer and so cannot be a cause. When everyone agrees that a risk factor is the same thing as a cause, a lot of unfortunate pronouncements about science usually follow, and that’s very much what happened here.

First though the Sorites paradox. A grain of sand is not a heap of sand. Add a second grain and it’s still not a heap. Add a third and still no heap. Thus adding a grain of sand to something that isn’t a heap won’t make it one. Similarly take a heap of sand. Remove one grain and it remains a heap. Remove another and it’s still a heap. Thus removing a grain of sand from a heap won’t make it a non-heap. Swap asbestos fiber for grain of sand and you get the point of this post.
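The inductive trap can even be written down as toy code. The sketch below (a hypothetical `is_heap` predicate of my own devising, not anything from the opinion) encodes only the two Sorites rules – one grain is no heap, and one more grain never makes a heap – and so concludes that nothing is ever a heap, no matter how large:

```python
def is_heap(grains):
    # Sorites rules only: a single grain is not a heap, and adding
    # one grain to a non-heap never turns it into a heap.
    if grains <= 1:
        return False
    return is_heap(grains - 1)  # a non-heap plus one grain: still no heap

# By induction, no pile ever qualifies -- plainly absurd for a large
# pile, which is exactly the paradox.
print(any(is_heap(n) for n in range(1, 500)))  # prints False
```

A rule that only ever compares “one more grain” to “no heap yet” can never locate the threshold, which is why arguments built on it cannot be rationally resolved.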

When you don’t require plaintiff to quantify a range of exposure but instead create a rule whereby plaintiffs win with a heap but lose with just grains, you wind up with competing arguments about heaps and grains that cannot be rationally resolved. Here, Plaintiff’s experts claimed to have solved the problem of discerning when sand goes from grains to heap, calling their theory “cumulative exposure,” and claimed further that plaintiff experienced a heap of it. Ford argued that it was still just a bunch of grains of sand. Judge Robreno sided with plaintiffs.

After carefully reviewing the opinion of Dr. Frank (and that of Dr. Bralow), the Court notes that, as stated by Plaintiff, the opinions are “cumulative exposure” opinions, which are – in substance and, by definition – different from the “any exposure” opinion often proffered by experts in asbestos litigation and rejected by the Pennsylvania Supreme Court. For this reason, the Court declines to accept Defendant’s argument that the experts’ opinions are inadmissible on grounds that they are the same as the “any exposure” opinion.

Apparently the court was persuaded in part by the fact that Ford “does not dispute that there is evidence of frequent, regular and proximate exposure to asbestos from brakes for which it is liable” and the fact that Ford’s own data demonstrates that “the number of asbestos fibers to which Plaintiff would have been exposed from Ford brakes would be at least in the millions and possibly in the billions.” The problem is that neither frequency, regularity nor proximity to a billion asbestos fibers does a cumulative exposure heap make.

A single breath while resting is about 500 cc of air. The mid-range for an average person is 15 breaths per minute. That comes to about 4 billion cc of air per year. Using the typical asbestos background level of 0.0001 fibers/cc in non-industrial urban/suburban areas, that means roughly 400,000 fibers per year, and nobody thinks that’s a heap. Using 0.04 f/cc, which is below not only OSHA’s Permissible Exposure Limit (PEL) but also its “action level,” yields about 35 million fibers over the course of 240 eight-hour work days and over a billion fibers in a typical work life – and not even OSHA thinks that’s a heap.
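The arithmetic above is easy to check. The few lines of Python below simply rerun the calculation using the figures quoted in the paragraph (tidal volume, breathing rate, fiber concentrations, and an assumed 40-year working life); they are the paragraph’s inputs, not independent measurements:

```python
# Sanity-check of the fiber-count arithmetic quoted above.
BREATH_CC = 500        # resting tidal volume, cc per breath
BREATHS_PER_MIN = 15   # mid-range resting breathing rate

# Resting, around the clock, for a year:
cc_per_year = BREATH_CC * BREATHS_PER_MIN * 60 * 24 * 365
print(f"air breathed per year: {cc_per_year:,} cc")      # ~3.9 billion cc

background = 0.0001    # fibers/cc, typical urban/suburban background
print(f"background fibers per year: {cc_per_year * background:,.0f}")  # ~394,000

# Below OSHA's action level, over 240 eight-hour work days:
below_action = 0.04    # fibers/cc
cc_work_year = BREATH_CC * BREATHS_PER_MIN * 60 * 8 * 240
print(f"fibers per work year: {cc_work_year * below_action:,.0f}")  # ~34.6 million

work_life = 40         # assumed years in a typical working life
print(f"work-life fibers: {cc_work_year * below_action * work_life:,.0f}")  # ~1.4 billion
```

Even a billion-plus fibers at these concentrations sits below the regulatory action level, which is the point: raw fiber counts alone don’t make a heap.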

The only sensible course is to stop arguing about whether the fibers alleged constitute a few or a heap and to instead make plaintiffs quantify a reasonable range of exposures, compare those to reliable risk estimates for the disease in question given the putative causal agent and see if plaintiff has a causal claim that can be justified. The proposed “cumulative exposure” theory is not in fact an answer to the Sorites paradox; it’s actually just another version of the Small Glasses approach whereby plaintiffs try to change the conversation (or at least their analogies) when people start to ask whether their new causal argument isn’t really just their old tautology.

Well, this post has gone on long enough already, but I want to make one other point, and it has to do with Ford’s embrace of risk factor epidemiology. Risk factor epidemiology, it must be remembered, is not science. Don’t take my word for it; re-read The Emptiness of the Black Box. I understand completely why Ford wants to talk about obesity, high blood pressure, age and smoking – they’re all risk factors for renal cell cancer, and they’re all either under Plaintiff’s control, unavoidable, or at least not Ford’s fault. The problem is that once you play the risk-factors-are-causes game you wind up playing by plaintiffs’ rules. You forfeit the argument that observational epidemiology studies cannot demonstrate causation when you rely on observational epidemiology studies for your defense.

You also wind up with opinions like the one in Mortimer in which a court holds that a risk factor is the same thing as a cause, that Plaintiff’s experts have two reliable studies that say asbestos is a risk factor for renal cell cancer, that injuries may have more than one cause so that the problem of multiply present risk factors is no obstacle to recovery and that Plaintiff’s experts reliably reasoned their way to their opinions from these foundations. The problems with this are many but here are the main ones: (1) By definition the crowing of a rooster is a risk factor for the sun’s rise, so remember that correlation is NOT causation; (2) the vast majority of studies of renal cell cancer show that all of the proposed risk factors combined account for less than half of all cases; (3) the vast majority of renal cell risk factor epidemiology studies have failed to suggest asbestos is a risk factor thus Plaintiff’s two studies are no more “reliable” than a broken watch that’s accurate twice a day; (4) the multiple sufficient causes idea as presented here is fatally flawed – if you found in your closet a shirt that had been cut and on your dresser a knife, scissors and a razor it wouldn’t even cross your mind that they all, each and every one, caused the cut; and, (5) relying on observational epidemiological studies demonstrating weak effects is actually not a reliable way to reason about causation – if it were we wouldn’t be in the midst of a reproducibility crisis in biomedical science and billions wouldn’t have been wasted on theories arising out of nothing more than statistically significant findings – read Scientific Method – Statistical Errors again if you have any lingering doubt.

So in closing, remember that if you lie down with risk factor dogs you’ll get up with risk factor fleas.


Stacking Cans in Michigan

Posted in Reason, The Law

Streptococcus pneumoniae can cause middle ear infections. Middle ear infections can cause erysipelas (a bacterial skin infection). Streptococcus pneumoniae can cause erysipelas. Erysipelas can cause cellulitis. Cellulitis can cause bacteremia. Bacteremia can cause septicemia. Septicemia can cause pneumonia.

That’s quite a stack of cans when you’re trying to prove that a death from pneumonia was probably caused by a failure to appropriately treat a case of erysipelas caused by Streptococcus pneumoniae; especially when (a) neither plaintiff’s decedent’s middle ear infection nor her erysipelas was ever diagnosed as having been caused by Streptococcus pneumoniae; and (b) despite more than 207,000 published studies (as of today) of Streptococcus pneumoniae, ear infections, erysipelas, cellulitis, bacteremia and pneumonia, no one has ever reported a case of a skin infection caused by Streptococcus pneumoniae that conquered the tower of cans needed to produce a case of pneumonia. In fact, a number of those cans have never been observed to come in the Streptococcus pneumoniae variety – i.e. the cans exist but they’ve always involved a different bacterium. On the other hand, there’s a mountain of studies showing that Streptococcus pneumoniae is commonly found in the upper airways of healthy people, that it sometimes invades the lungs and that from there it can enter the bloodstream. Nevertheless, citing “pure science,” the Michigan Court of Appeals held that plaintiffs’ expert’s stack of cans was admissible.

Estate of Beverly Kay Garcia v. West Shore Medical Center, though cloaked in paeans to science and Daubert, is one of the most aggressively anti-science / anti-Daubert opinions that I’ve ever read. Yet it redeems itself by providing good examples of the bad arguments to which judges resort when they come up against the scientific revolution’s central tenet: “science is the belief in the ignorance of experts.”

Though the theory espoused by plaintiffs’ expert has never been tested or replicated, the appellate court held those shortcomings to be irrelevant to the issue of admissibility. Wrapping “replication” in scare quotes, the court makes the following straw man argument: “No reputable physician or scientist we can imagine would infect a patient’s skin with streptococcus [sic] pneumoniae, fail to treat …” etc. Two things. First, no experiment can ever be repeated exactly anyway; the ceteris paribus condition is aspirational at best. Second, Richard Feynman didn’t have to build another Challenger and blow it up in order to replicate the properties of the space shuttle’s O-rings when exposed to freezing temperatures. Get it? The real question is whether Streptococcus pneumoniae has the ability to do something it has never been observed to do; i.e. to scale each can in the tower of cans stacked by plaintiffs. Plaintiffs produced neither animal studies nor in vitro human skin studies suggesting that their theory might be true. They couldn’t even show that the bacterium has the genetic machinery needed to mount the expedition.

Next up is the purpose of scientific publishing and here the court gives us an especially bad argument. The question that arises once you learn plaintiffs’ expert has a novel theory is: why haven’t you published it? To deal with that awkward question the court redefined the expert’s theory from one about discovery of a heretofore unknown general property of Streptococcus pneumoniae to one about a unique and likely never to be repeated manifestation of that property; something like “the bacterium has this ability when you have someone like decedent, whose disease manifested like decedent’s, was treated like decedent and had the same outcome as decedent.” Thus recast, the court concluded that it’s “hardly surprising” that the theory wasn’t published as it was “unlikely to be of interest to the medical community, given the rarity (one hopes) of inadequately treated erysipelas.”

The obvious problem with this argument is that it makes the ability to climb the tower of cans not the result of some law of nature but rather the result of a unique set of circumstances confined to the unreachable past. That makes the theory unfalsifiable and thus pseudoscience.

The not so obvious problem is that “[e]rysipelas is a common and severe infection where the aetiology and optimal management is not well-studied.” As of this evening there are over 2,200 articles in PubMed containing the word erysipelas. A brief review of abstracts reveals erysipelas to be a problem worldwide prompting much debate about its cause and the best way to treat it. If plaintiffs’ expert really does know the answer there are thousands of patients desperate for that knowledge. The fact that he doesn’t share it outside the courtroom probably says all that needs to be said about his theory.

But that’s not all that needs to be said about Garcia. The court also has a peculiar take on peer review. It deems as peer reviewed three articles that say, when strung together: (1) “Streptococcus pneumoniae can cause a wide variety of clinical symptoms … by hematogenous spread”; (2) “[b]acteria … from hematogenous spread find their way to the lung … Once there, a combination of factors (including virulence of the infecting organism …) may lead to bacterial pneumonia”; and (3) “pneumonia” appears on a list of “possible” complications of erysipelas. Thin gruel and yawning analytical gaps for sure, but I was intrigued. A search, first for the papers by name and then by authors in online science libraries, turned up none of them.

Eventually I realized that they were nothing more than Medscape summaries. Medscape is a part of WebMD and all of their summaries, including the ones used by plaintiffs’ expert, are covered by the following disclaimer: “Your use of [our] Services is at your own risk. Without limiting the foregoing, we, our licensors, and our suppliers make no representations or warranties about the following:

The accuracy, reliability, completeness, currentness, or timeliness of the Services or information contained therein.”

If Medscape doesn’t want your doctor making medical decisions on the basis of its summaries, what makes the Michigan Court of Appeals think it’s a good idea?

Then there’s that bit about “pure science.” The sentence reads: “Here, we deal with an issue more closely akin to pure science than to epidemiologically-proven relationships.” Yikes. Apparently word that correlation (the business of epidemiology) does not imply causation hasn’t made its way that far north yet. Worse, of course, is the idea that the purest form of science can be found in the pronouncements of credentialed deep thinkers whose only methodology is reason. They string together bits of facts to fashion a compelling narrative. Just like lawyers. And maybe that’s why so many judges find their arguments persuasive. The fatal flaw, as we’ve written so many times, is that you can’t reason about what you don’t know about, and we still know so very little about nature. Playing off an article in today’s Cell, how would you reason about the role of retrotransposons in leukemia if you didn’t know retrotransposons existed and if the mechanism by which they act conflicted with your beliefs about the mechanisms of leukemia? Pure science is the hard work of figuring out how LINE-1 retrotransposons affect evolution; it’s not stacking cans.

There are other bad arguments in the opinion, but they’re of the usual variety. The court topples the – we’re trying to find “absolute truth” down here at the courthouse – straw man in one paragraph and chides defendant’s experts for not having any scientific articles saying you can’t stack cans so high in another (you’d think that 207,000 observations with zero sightings of plaintiffs’ Tower ‘O Cans would count for something). Those are just sideshows, though. Garcia is ultimately part of a troubling project that’s been underway for a few years now, one that has as its goal establishing and fortifying Junk Science of the Gaps redoubts where plaintiffs can hide from Daubert. More on that some other day.

Measuring the Expertise in Expert Opinions: A Rebuttal, by David L. Faigman

Posted in Uncategorized

With the large scandal this year involving the FBI’s bungling of hair analysis, it is easy to wonder what other types of unreliable scientific evidence have made it through our legal system. The short answer is that the failure in hair analysis is only the tip of a much more troubling iceberg – unreliable science makes it into the courts much more frequently than we like to admit (e.g., bitemark evidence, silicone implants). This blog has stated before that the answer to this problem is simple: admit only expert opinions that are “borne out by observation of predictions made by [a] theory.” This is not controversial. Indeed, it is the definition of “science” in some sense, and it is the standard baked into Daubert (and, to a lesser extent, Frye). The harder question, however, is how a system can be created to ensure that this actually happens. Who decides what is “good science” in litigation?
As litigators well know, testifying experts are chosen specifically because they support the client’s position. The “science” being presented may or may not actually be accepted by mainstream scientists. Cross-examination is the traditional method to question experts – but both sides use this technique, resulting in a confusing zero-sum game with little clarity and plenty of erroneous outcomes. What I proposed in my article in Bloomberg BNA is that the best way to ensure valid science is to have neutral and mainstream scientific peers blindly evaluate expert opinions. Indeed, peer review is the system used by scientists for centuries to validate scientific theories and hypotheses. It presents the best available option to ensure that only valid science survives admissibility under Daubert/Frye.
Peer Review as a Tool for Litigators
Daubert and Frye motions to exclude experts are arguably underutilized in the current system. Many lawyers undoubtedly fear showing their cards early or gambling on a motion that will ultimately be decided by dueling experts. In many ways, excluding experts under Daubert/Frye relies on the same arguments and questions that a litigator will submit at a deposition or under cross-examination.

Frye, which is still employed in many states, requires judges to determine whether an expert’s opinions have been generally accepted by the scientific community from which the opinion came. Unfortunately, judges have no mechanism by which to ask scientists this question. In effect, judges today typically either inquire whether other courts have accepted similar testimony from other experts, or ask the expert himself whether his methods are accepted (with the obvious “yes” response). This makes exclusion under Frye risky – how can a judge truly assess admissibility when she has no external and reliable source of information?

In an opinion concurring in part and dissenting in part, Chief Justice Rehnquist complained in Daubert that the rule as interpreted would require judges to become “amateur scientists.” And indeed it might, since it calls upon trial court judges to be “gatekeepers” and assess the validity of expert opinions by analyzing the methods and principles on which they are based. This is a tall order for judges who might come across scientific evidence ranging from acoustics to zoology. Certainly, not even scientists are able to analyze the methods and principles of other fields. Would you ask a psychologist to evaluate medical causation? As a result, Daubert motions are correspondingly risky as well – only a “sure bet” might have a chance of convincing a judge that an expert is not sound.

Ultimately, the risky nature of Frye and Daubert stems from the adversarial bias that a judge is presented with in evaluating expert evidence. Without external feedback to assess reliability, judges (like most scientists) have no “reliable” way to evaluate expert conclusions. In the scientific community, such feedback comes from disinterested peer reviewers. In the legal context, peer review of expert testimony provides a most promising avenue to make Daubert and Frye exclusion much more predictable and effective.
Fitting Peer Review into Litigation
Peer review fits into multiple places in any litigation. First, peer review can help to solidify a draft expert report. In a recent case study by JuriLytics – City of Pomona v. SQM – a groundwater expert was excluded by a district court, a ruling that was then reversed on appeal. In that case, peer review of the expert’s draft report would have addressed many of the concerns the judge had with the expert’s conclusions and might very well have saved him from exclusion – and saved the city more than $100,000 in appellate costs. The moral of the story: no matter how confident an expert or litigator is in an expert opinion, peer review will always add valuable input to an expert report. Further, reviewers would be treated like any other consulting expert – they would remain confidential under work-product protection and would not be discoverable.

Second, peer review can be used as a sword or shield when deciding whether a Daubert or Frye motion is a good avenue for pursuing (read: winning) your case. As I said before, expert exclusion has traditionally been a gamble – will a judge understand enough to be able to determine why an opinion is (un)reliable? This calculation changes drastically with peer review. Now litigators can test whether expert exclusion could be used to win the case. Positive reviews can be cited as appendices to Daubert or Frye motions. Reviewers might get deposed, but only if the reviews are employed (i.e. they are favorable).

All told, peer review is a general tool for litigators to bring credible and mainstream science to the courtroom without losing control of the litigation. Many other options, as yet unthought of, are possible using this core concept. It’s nice to know that there is an expert willing to testify to your position. But the sure bet is when you convince the judge that mainstream science is on your side.

(Many thanks Dr. Faigman – when I read that you’d clerked for Reavley I figured (as they still say down here) that your rebuttal would be well and concisely argued. — DAO)

David L. Faigman is the John F. Digardi Distinguished Professor of Law at the University of California Hastings College of the Law and the Co-Founder and CEO of JuriLytics, LLC. He holds an appointment as Professor in the School of Medicine (Dept. of Psychiatry) at the University of California, San Francisco. He received both his M.A. (Psychology) and J.D. from the University of Virginia. Professor Faigman clerked for the Honorable Thomas Reavley of the U.S. Court of Appeals for the Fifth Circuit. He is the author of numerous articles and essays. He is also the author of three books, Constitutional Fictions: A Unified Theory of Constitutional Facts (Oxford, 2008), Laboratory of Justice: The Supreme Court’s 200-Year Struggle to Integrate Science and the Law (Henry Holt & Co. 2004) and Legal Alchemy: The Use and Misuse of Science in the Law (W.H. Freeman,1999). In addition, Professor Faigman is a co-author/co-editor of the five-volume treatise Modern Scientific Evidence: The Law and Science of Expert Testimony (with Blumenthal, Cheng, Mnookin, Murphy & Sanders). The treatise has been cited widely by courts, including several times by the U.S. Supreme Court. Professor Faigman was a member of the National Academies of Science panel that investigated the scientific validity of polygraphs and he is a member of the MacArthur Law and Neuroscience Network.


Robust Misinterpretation of Confidence Intervals by Courts

Posted in Reason, The Law

“How are courts doing when it comes to interpreting the statistical data that goes into their decision-making?” That was a question posed by someone in the audience at a presentation I gave recently. I was discussing, among other things related to the perils of litigating statistical inferences, the recent paper “Robust Misinterpretation of Confidence Intervals.” It reports on the results of a study designed to determine how well researchers and students in a field that relies heavily on statistical inference actually understand their statistical tools. What it found was a widespread “gross misunderstanding” of those tools among both students and researchers. “[E]ven more surprisingly, researchers hardly outperformed the students, even though the students had not received any education on statistical inference whatsoever.” So, returning to the very good question, how are our courts doing?

To find out, I ran the simple search “confidence interval” across Google Scholar’s Case Law database with the date range set to “Since 2014.” The query returned 56 hits. Below are eight representative quotes taken from those orders, reports and opinions. Can you tell which ones are correct and which constitute a “gross misunderstanding”?

(A) “The school psychologist noted that there was a 95% confidence interval that plaintiff’s full scale IQ fell between 62 and 70 based on this testing.” And later: “A 90% confidence interval means that the investigator is 90% confident that the true estimate lies within the confidence interval”

(B) “A 95% confidence interval means that there is a 95% chance that the “true” ratio value falls within the confidence interval range.”

(C) “Once we know the SEM (standard error of measurement) for a particular test and a particular test-taker, adding one SEM to and subtracting one SEM from the obtained score establishes an interval of scores known as the 66% confidence interval. See AAMR 10th ed. 57. That interval represents the range of scores within which “we are [66%] sure” that the “true” IQ falls. See Oxford Handbook of Child Psychological Assessment 291 (D. Saklofske, C. Reynolds, & V. Schwean eds. 2013).”

(D) “Dr. Baker applied his methodology to the available academic research and came up with a confidence interval based on that research. The fact that the confidence interval is high may be a reason for the jury to disagree with his approach, but it is not an indication that Dr. Baker did not apply his method reliably.”

(E) “A 95 percent confidence interval indicates that there is a 95 percent certainty that the true population mean is within the interval.”

(F) Statisticians typically calculate margin of error using a 95 percent confidence interval, which is the interval of values above and below the estimate within which one can be 95 percent certain of capturing the “true” result.

(G) “Two fundamental concepts used by epidemiologists and statisticians to maximize the likelihood that results are trustworthy are p-values, the mechanism for determining “statistical significance,” and confidence intervals; each of these mechanisms measures a different aspect of the trustworthiness of a statistical analysis. There is some controversy among epidemiologists and biostatisticians as to the relative usefulness of these two measures of trustworthiness, and disputes exist as to whether to trust p-values as much as one would value confidence interval calculations.”

(H) The significance of this data (referring to calculated confidence intervals) is that we can be confident, to a 95% degree of certainty, that the Latino candidate received at least three-quarters of the votes cast by Latino voters when the City Council seat was on the line in the general election.

Before I give you the answers (and, thereafter, some hopefully helpful insights into confidence intervals), here is the questionnaire given to the students and researchers in the study referenced above, along with the correct answers. Thus armed, you’ll be able to judge for yourself how our courts are doing.

Professor Bumbledorf conducts an experiment, analyzes the data, and reports:

The 95% confidence interval for the mean ranges from 0.1 to 0.4

Please mark each of the statements below as “true” or “false”. False means that the statement does not follow logically from Bumbledorf’s result.

(1) The probability that the true mean is greater than 0 is at least 95%.

Correct Answer: False

(2) The probability that the true mean equals 0 is smaller than 5%.

Correct Answer: False

(3) The “null hypothesis” that the true mean equals zero is likely to be incorrect.

Correct Answer: False

(4) There is a 95% probability that the true mean lies between 0.1 and 0.4.

Correct Answer: False

(5) We can be 95% confident that the true mean lies between 0.1 and 0.4.

Correct Answer: False

(6) If we were to repeat the experiment over and over, then 95% of the time the true mean would fall between 0.1 and 0.4.

Correct Answer: False

Knowing that these statements are all false, it’s easy to see that statements (A), (B), (C), (E), (F), and (H) found in the various orders, reports and opinions are equally false. I included (D) and (G) as examples typical of those courts that were sharp enough to be wary of saying too much about what confidence intervals might be, but which fell into the same trap nonetheless. And that trap is believing that confidence intervals have anything to say about whether the parameter being estimated (again, typically a mean or average – something like the average age of recently laid-off employees) is true or even likely to be true. (G), by the way, manages to get things doubly wrong. Not only does it repeat the false claim that estimations falling within the confidence interval are “trustworthy,” it also repeats the widely held but silly claim that confidence intervals are somehow more reliable than p-values. Confidence intervals, you see, are made out of p-values (see “Problems in Common Interpretations of Statistics in Scientific Articles, Expert Reports, and Testimony” by Greenland and Poole if you don’t believe me), so the argument (albeit unintentionally) being made in (G) is that p-values are more reliable than p-values. Perhaps unsurprisingly, of the 56 hits I found only two instances of courts not getting confidence intervals wrong, and in both cases they avoided any discussion of confidence intervals and instead merely referenced the section on the topic from the Reference Manual on Scientific Evidence, Third Edition.
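The Greenland and Poole point – that a confidence interval is “made out of” p-values – can be seen directly in a few lines of code. The sketch below (toy data and a simple normal approximation, both assumed purely for illustration) shows that the 95% interval for a mean is exactly the set of candidate values a two-sided test would not reject at p < 0.05:

```python
import math
import statistics

# Toy sample, assumed for illustration only.
sample = [0.12, 0.31, 0.22, 0.40, 0.18, 0.27, 0.35, 0.15, 0.29, 0.21]
m = statistics.mean(sample)
se = statistics.stdev(sample) / math.sqrt(len(sample))

def p_value(mu0):
    """Two-sided p-value for H0: true mean == mu0 (normal approximation)."""
    z = abs(m - mu0) / se
    return 2 * (1 - 0.5 * (1 + math.erf(z / math.sqrt(2))))

z_crit = 1.959964  # standard normal 97.5th percentile
ci_low, ci_high = m - z_crit * se, m + z_crit * se

# Candidate means just inside the interval survive at p >= 0.05;
# those just outside are rejected. The interval simply inverts the test.
assert p_value(ci_low + 0.001) > 0.05 > p_value(ci_low - 0.001)
assert p_value(ci_high - 0.001) > 0.05 > p_value(ci_high + 0.001)
print(f"95% CI: ({ci_low:.3f}, {ci_high:.3f})")
```

So an argument that confidence intervals are more trustworthy than p-values is an argument that p-values are more trustworthy than p-values.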

Why do courts (and students and researchers) have such a hard time with confidence intervals? Here’s a guess. I suspect that most people have a pretty profound respect for science. As a result, when they first encounter “the null hypothesis racket” (please see Revenge of the Shoe Salesmen for details) they simply refuse to believe that a finding published in a major peer reviewed scientific journal could possibly be the result of an inferential method that would shock a Tarot Card reader. People are only just coming to realize what the editor of The Lancet wrote this Spring: scientific journals are awash in “statistical fairy tales”.

Now there’s nothing inherently suspect about confidence intervals – trouble arises only when they’re put to purposes for which they were never intended and thereafter “grossly misunderstood.” To understand what a confidence interval is, you need to know the two basic assumptions on which it rests. The first is that you know something about how the world works. So, for example, if you’re trying to estimate the true ratio of black marbles to white marbles in a railcar full of black and white marbles, you know that everything in the railcar is a marble and that each is either black or white. Therefore, whenever you take a sample of marbles you can be certain that what you’re looking at is reliably distinguishable and countable. The second is that you can sample your little corner of nature over and over again, forever, without altering it and without it changing on its own.

Without getting too deep into the weeds, those two assumptions alone ought to be enough to make you skeptical of claims about largely unexplained, extraordinarily complex processes like cancer or IQ or market fluctuations estimated from a single sample – especially when reliance on that estimate is urged “because it fell within the 95% confidence interval.” And here’s the kicker. Even in the black-and-white world of hypothetical marbles, the confidence interval says nothing about whether the sample of marbles you took is representative of, or even likely to be representative of, the true ratio of black to white marbles. All it says is that if your assumptions are correct, and given the sample size you selected, then over the course of a vast number of samples your process for capturing the true ratio (using very big nets extending two standard deviations on either side of each estimate) will catch it 95% (or 99%, or whatever percent you’d like) of the time. There is no way to know (especially after only one sample) whether the sample you just took captured the true ratio – it either did or it didn’t. Thus the confidence interval says nothing about whether you caught what you were after; it speaks instead to the size of the net you were using to try to catch it.
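The net-throwing story is easy to simulate. In the sketch below (every parameter hypothetical – in real life you never get to peek at the true mean) we repeat the sampling experiment thousands of times and count how often the net happens to land over the truth. The long-run catch rate is about 95%, yet nothing in any single interval tells you whether that interval caught anything:

```python
import random
import statistics

# Hypothetical world: we secretly know the true mean, something no real
# investigator ever does. All numbers below are assumptions for the demo.
random.seed(1)
TRUE_MEAN = 0.25
SIGMA = 0.5
N = 100          # marbles scooped per experiment
TRIALS = 10_000  # experiments

captures = 0
for _ in range(TRIALS):
    draws = [random.gauss(TRUE_MEAN, SIGMA) for _ in range(N)]
    m = statistics.mean(draws)
    half_width = 1.96 * SIGMA / N ** 0.5  # the "net": ~2 SDs either side
    if m - half_width <= TRUE_MEAN <= m + half_width:
        captures += 1

coverage = captures / TRIALS
print(f"Nets that caught the true mean: {coverage:.1%}")
```

The procedure catches the truth roughly 95% of the time over many repetitions; any one net either did or didn’t, and the interval itself can’t say which.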

Thus my take on all the judicial confusion surrounding confidence intervals: they’re like judges at a fishing competition where contestants keep showing up with nets but no fish and demanding that their catches be weighed “scientifically” according to the unique characteristics of their nets and their personal beliefs about fish. Who wouldn’t be confused?

Texas: No More No No-Duty

Posted in Reason, The Law

Until last Friday an owner whose premises harbored some known or knowable danger could not avail itself of the argument that it had no duty either to warn invitees or to render its premises safe, even when the danger was “open and obvious” and even when the invitee was aware of it. The “no-duty” rule that had once meant “no money” for plaintiffs who slipped and fell in the very spills they were paid to clean up had been abolished years before, and Texas became a “no no-duty” state. The idea behind abolishing the rule was that Texas’ then-new comparative fault scheme, especially once coupled with a bar on recovery whenever a plaintiff’s fault is greater than 50%, would sort things out fairly, and spare judges the bother of untangling knotty issues of duty in the bargain. The results were otherwise.

If a deliberating jury wants to keep turning pages until it gets to one with blanks for the money they want to give away, it’s not hard to figure out how to do it. They just have to solve this riddle: “Answer [the damages question] if you answered ‘Yes’ for Danny Defendant to [the liability question] and answered: 1) ‘No’ for Pauline Plaintiff to [the liability question], or 2) 50 percent or less for Pauline Plaintiff to [the percentage of causation question].” Without the no-duty rule to screen out obviously meritless claims, juries began to return verdicts which, when compared to the underlying facts, were simply absurd. Compounding the problem, courts struggled to reconcile the reasoning behind the reversal of a judgment for, say, a plaintiff who’d been shown a hole by a premises owner and then promptly stepped in it anyway, with a rule whose absence implies that a premises owner owed the plaintiff a duty irrespective of the hole’s obviousness or the plaintiff’s awareness of it. The effort produced a number of appellate opinions that were, to put it kindly, confusing.

With Austin v. Kroger the Texas Supreme Court has declared that the era of “no no-duty” is over. An owner still has a duty to maintain its premises in a reasonably safe condition but that duty is discharged by warning of hidden dangers. Open and obvious dangers, and those of which an invitee is aware, don’t give rise to a duty to warn or to remediate in the first place. Two exceptions remain and both are quite limited. The first involves criminal activity by a third party and arises when the premises owner “should have anticipated that the harm would occur despite the invitee’s knowledge of the risk”. The second exception arises in the context of “necessary-risk”. If an invitee is aware of a hazard and yet must cross it nonetheless then a duty to lessen the attendant risk likely remains.

All in all it’s a very good opinion though I wish they’d spent a bit of time on the issue of why an open and obvious danger, or one of which an invitee is aware, cannot give rise to a duty, because it’s vitally important to understanding the no-duty rule.

Any system that adjudicates outcomes based on fault rests upon the idea that the parties being judged have agency – that they have both the faculty of reason and the ability to act according to their reason. When a party with agency is confronted with a known and avoidable danger the risk drops to zero so long as the party acts according to her reason, which is to say “reasonably”.

Since duty (in this context, at least) manifests only when a risk rises to a level at which a reasonable person would take action (i.e. warn or remediate) there can be no duty to act in a no risk (i.e. open and obvious) situation.

So (and at last I come to the point), what always bothered me about the no no-duty rule was that it essentially denied that individuals have agency. By denying that Texans could be assumed to be reasonable – to be the sort of people who, upon seeing a hole, decide to walk around it – the rule denied that they had the faculty of reason and/or the ability to act upon it. Stranger still, the rule assumed that the typical defendant, a corporation, did have agency, which is why it got stuck with the duty. Thus corporations could be assumed to have agency, but not so the state’s citizens. Ugh. Glad that chapter’s behind us.

I Dreamed of Genie

Posted in The Law

A month has passed since the Texas Supreme Court delivered its opinion in Genie Industries, Inc. v. Matak. It’s the most thorough explication of Texas products liability jurisprudence that I’ve read in a good while. Nevertheless I struggled to come up with a blog post because I couldn’t quite be sure what to make of it. Did it really, as it seemed upon first reading, finally set the risk component of Texas’ risk/utility analysis on a solidly objective foundation? Or was that just wishful thinking; an illusion produced by the inevitable discussion of foreseeability and prior incidents within the context of a case that turned on weighing risk against utility?

The opinion draws no bright lines. However, the following two sentences finally settled the question for me:

The undisputed evidence is that Genie has sold more than 100,000 AWP model lifts all over the world, which have been used millions of times. But the record does not reflect a single misuse as egregious as that in this case.

Immediately after those words the court summed up its reasoning and concluded that the Genie lift is not unreasonably dangerous. I read these tea leaves to mean the court believed that no reasonable person could conclude that a risk of death at the level of 1 in 1 million (or less) could outweigh a product’s demonstrated utility. If so, it’s both sensible and a pretty big deal. Hard data can now trump a jury’s or judge’s subjective risk assessment.

As noted above, the opinion is a very nice summary of Texas product liability law, and below this paragraph I’ll set out, in bullet-point fashion, a CliffsNotes version. Before getting there, however, I want to touch upon the issue of misuse. The court spends some time talking about misuse but could have done a better job of saying where and exactly how it fits into a risk/utility scheme. The concept of misuse, when understood as the likelihood of misuse (gauged by the obviousness of what would follow) multiplied by the gravity of the failure produced by that misuse, is really just another dimension of risk. That means it ought to be dissolved back into the general risk construct rather than precipitated out and given a different name (and thereby causing confusion).

Here are the takeaways:

To recover on a design defect product liability claim plaintiff must prove:

(1) the product was defectively designed so as to render it unreasonably dangerous;

(2) a safer alternative design existed; and

(3) the defect was a producing cause of the injury for which the plaintiff seeks recovery.

A product is unreasonably dangerous when its risk outweighs its utility.

A safer alternative design is one that would have prevented or significantly reduced the risk of the injury, would not substantially impair the product’s utility, and was economically and technologically feasible at the time.

When weighing risk against utility consider:

(1) the utility of the product to the user and to the public as a whole weighed against the gravity and likelihood of injury from its use;

(2) the availability of a substitute product which would meet the same need and not be unsafe or unreasonably expensive;

(3) the manufacturer’s ability to eliminate the unsafe character of the product without seriously impairing its usefulness or significantly increasing its costs;

(4) the user’s anticipated awareness of the dangers inherent in the product and their avoidability because of the general public knowledge of the obvious condition of the product, or of the existence of suitable warnings or instructions; and

(5) the expectations of the ordinary consumer.


So Much For Science

Posted in Reason

Wired ran an article last week titled “Science Says American Pharoah Won’t Win The Triple Crown“. It consisted of a detailed review of the science of horse racing: the energy demands, the metabolic hurdles a horse must overcome to refuel for the next race, the microscopic injuries produced by any great exertion, the time needed for bone to remodel to meet new demands, the impossible task of balancing treatments that speed recovery of one bodily system only to slow down recovery of another, and the unique challenge posed by Belmont’s long track (1.5 miles). In the concluding paragraph the author offered the doomed American Pharoah consolation: “It’s not your fault. It’s science and those pesky fresh horses.” If you click on the link (which contains the original title) you’ll see that experience has spoiled yet another good theory and in doing so caused a new title to take the place of the old: “Update: Whoa! American Pharoah Beats Science to Win the Triple Crown“.

The point of this post is not to mock Lexi Pandell, who authored the piece. She is to be commended for having a sort of courage conspicuously absent in most of the expert witnesses I encounter. Specifically, she laid out her theory and the data that led her to it, and then made a testable prediction by which her theory could be judged. That her theory failed the test is no cause for shame – the (vast) majority of all theories meet the same fate.

Rather, the point of this post is to remind you that until it is tested, a clever argument – though painstakingly built fact by fact and expertly cemented together with analysis so that the gaps between the facts are fully and solidly filled – remains just that: a clever argument. One hundred accurate predictions add only modestly to its strength, yet a single failed prediction causes it to collapse. That is the essence of science.

And so the only flaw I found in Ms. Pandell’s piece is that she, like too many courts, mistook the impressive argument built out of studies and rhetoric for science. Science wasn’t the clever argument; science was the race.

Revenge of the Shoe Salesmen

Posted in Causality, Epidemiology, Reason

By 1990 Paul E. Meehl had had enough. He’d had enough of lazy scientists polluting the literature with studies purporting to confirm fashionable theories that in fact couldn’t even be tested; enough of cynical scientists exploiting the tendency of low-power statistical significance tests to produce false positive results just so they could churn out more of the same; and enough of too many PhD candidates, eager to get in on what Meehl called “the null hypothesis refutation racket,” who were unashamedly ignorant of the workings of the very mathematical tools they hoped to use to further muddy the waters with their own “intellectual pollution.” He called on them to give up their “scientifically feckless” enterprise and to take up honest work, something more suited to their talents – selling shoes, perhaps. The shoe salesmen, as we now know, would not give up so easily.

Rather than a tiresome diatribe in a meaningless war of words among academics, what Meehl wrote in Why Summaries Of Research On Psychological Theories Are Often Uninterpretable is one of the best explanations you’ll ever read of what went wrong with science and why. And if you’re curious about whether he (posthumously) won the argument, the results are now coming in. Of the first 100 important discoveries in the field of psychology tested to see whether they are in fact reproducible – all of which “discoveries,” by the way, were peer reviewed and published in prominent journals – only 39 passed the test. The obvious conclusion is that the literature has indeed been thoroughly polluted.

Meehl demonstrated that any time tests of statistical significance are used to test hypotheses involving complex systems where everything is correlated with everything, as in the psyche and the body, a weak hypothesis (which is to say one that is merely a suspicion not built upon other theories about underlying mechanisms that have been rigorously tested) carries an unacceptably high risk of producing a false positive result. This problem is not limited to psychology. It is estimated to arise in the biomedical sciences just as often.

Fortunately, those who in the past funded the “null hypothesis refutation racket” have begun to take notice, and action. The National Children’s Study, which recently got the axe from the NIH (following a review by the NAS), is the most notable example thus far. Criticized for years as being short on robust hypotheses and long on collecting vast amounts of data on environmental exposures and physical, behavioral and intellectual outcomes, the study was finally determined to be “unlikely to achieve the goals of providing meaningful insight into the mechanisms through which environmental factors influence health and development.” That the study would have found all sorts of statistically significant correlations between environment and outcomes was a given. That none could reliably be said to be causal was the problem.

The shoe salesmen turned scientists had a good run of it. Uncounted billions in grant money went into research founded on nothing more than the ability of computers to find correlations among random numbers and the ability of humans to weave those correlations into a plausible explanation. Scientists in the right fields, blessed with earnestness or at least the skills of an advocate, really got lucky: they became expert witnesses. But now, frustrated with research that never seems to go anywhere and alarmed that good research is being obscured by bad, funders are directing their money toward basic research. And it’s a target-rich environment. Take for example the remarkable discovery that an otherwise harmless amoeba can, for purposes known only to itself, resuscitate a moribund Listeria monocytogenes, let it grow and multiply within itself, and then release the bacteria into what was previously thought to be an L. monocytogenes-free environment.

Alas, such research is hard and its chances of success, unlike those of significance testing, are wholly unpredictable. It looks like the shoe salesmen’s luck has run out. That one of their last redoubts has turned out to be the courthouse is perhaps the most remarkable development of all.

Tracing Listeria Through Time

Posted in Uncategorized

The aspect of the recent Listeria monocytogenes outbreak that is likely to have the biggest impact on pathogen transmission litigation going forward is the ability to identify victims who acquired the infection years before the outbreak was finally recognized and the source identified. In recent years some state health departments have begun preserving samples from patients diagnosed with certain infectious diseases. Thanks to those samples, and now armed with the ability to use PFGE to “fingerprint” bacteria, the CDC has realized that though the ice cream contamination wasn’t suspected until January of this year, and wasn’t confirmed until last month, people have been getting sick from it since as far back as 2010.

L. monocytogenes infections are acknowledged to be far more widespread than what’s reflected in the CDC outbreak statistics. Most cases produce nothing more than short-term, mild, flu-like symptoms and go undetected, as patients rarely get to a physician before they’re feeling better and so diagnostic tests aren’t even run. In the very old, the very young and the immune-compromised, however, it can produce a systemic or invasive infection with a significant mortality rate. It is these cases, assuming the infection is detected (it’s estimated that at least 50% of all such cases are accurately diagnosed), that get the attention of state health departments and the CDC. The silent tragedies are the miscarriages and stillbirths caused by L. monocytogenes. An expectant mother can acquire the infection and experience nothing other than the vaguest sense of being under the weather while the bacteria launches an all-out attack on her child. The causes of those deaths regularly go undetected. This whole thing renders jokes about husbands being sent out on late-night runs for pickles and ice cream soberingly unfunny.

From the legal perspective, the creation of databases of the genetic fingerprints of pathogens will obviously increase the number of plaintiffs in the future as more silent outbreaks are discovered and previously unknown victims from the past are identified. It will also create some interesting legal issues. Take for instance Texas’ two-year statute of limitations in wrongful death cases. There’s no discovery rule to toll the claim, in large part because death is an easily appreciated clue that something has gone wrong, and those who suffered the loss have two whole years to figure out the cause. Here, though, the ability to discover the cause didn’t exist in 2010. But of course if we start to draw the line somewhere else, the debate over where is quickly overrun by the horrid thought of people digging up Granny, who died of listeriosis back in 1988, to see if the genetic fingerprint of her killer matches one from a growing list of suspects. And let’s not forget about L. monocytogenes’ aiders and abettors: the other types of bacteria with whom it conspired to form the biofilm that protected it from the disinfectants used to clean the food processing equipment. The promiscuous bugs likely acquired their special skill thanks to horizontal gene transfer, not just among their own phyla but from any passing bug with a helpful bit of code (like one that protects against the chemical agents, scrubbing and high-pressure sprays used to disinfect food processing equipment) – and none of them were spontaneously generated at the ice cream factory. They all came from somewhere else.

Ultimately, what makes claims arising out of the transmission of pathogens so different from other mass torts is that there is none of the usual causal uncertainty: the only cause of the 2010 patient’s listeriosis, for example, was Listeria monocytogenes that came from a particular flavor of ice cream that came from a particular plant. So what makes today’s news important is not that science can now answer “where did it come from and how many were infected?” but rather that science now asks in reply, “how far back do you want to go?”