If you’re feeling awash in the flood of talking points coming in waves from the Information Age, read this: Six Ways to Tell Lies From Statistics, by Betsey Stevenson and Justin Wolfers. There’s some excellent advice in this column, and it’s written in plain language with the jargon wrung out. Some points, though, could use a bit more explication.
Their first point: “1. Focus on how robust a finding is, meaning that different ways of looking at the evidence point to the same conclusion.” We find all manner of research coming from the Million Studies of One Department. If the results of a study can be repeated or replicated (and these are not necessarily the same thing), then the conclusions drawn gain credibility. If not, we have to ask whether the original publication jumped the gun, putting out “results” before ensuring that the conclusions were replicable. Compare this to the current flap over the austerity-promoting Reinhart-Rogoff study: we could look at the employment statistics from Eurozone countries, and we could see the double-dip recessionary trends in their economies, but the Reinhart-Rogoff study said everything was going to be just fine because “austerity” worked. A “robust” finding conforms to other evidence, or confirms conclusions reached from different points of view. Very rarely does it fly in the face of information from a variety of sources.
Their second point: “2. Data mavens often make a big deal of their results being statistically significant, which is a statement that it’s unlikely their findings simply reflect chance. Don’t confuse this with something actually mattering.” Amen. If we toss pennies in the air, heads and tails each show up about 50% of the time over the long run. That’s chance. The basic idea is summarized as follows:
“Statistical significance is a mathematical tool that is used to determine whether the outcome of an experiment is the result of a relationship between specific factors or merely the result of chance.” [WiseGeek]
It’s a tool. What is the tool used to do?
“In a scientific study, a hypothesis is proposed, then data is collected and analyzed. The statistical analysis of the data will produce a number that is statistically significant if it falls below a certain percentage called the confidence level or level of significance. For example, if this level is set at 5 percent and the likelihood of an event is determined to be statistically significant, the researcher is 95 percent confident that the result did not happen by chance.” [WiseGeek] (emphasis added)
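To make the coin-toss idea concrete, here is a minimal sketch (not from the column; the numbers of flips and heads are hypothetical) of how a two-sided p-value for a fair-coin hypothesis can be computed and compared against that 5 percent significance threshold:

```python
from math import comb

def binom_p_value(heads, flips, p=0.5):
    """Two-sided p-value: the probability of an outcome at least as far
    from the expected count as `heads`, assuming the coin is fair."""
    expected = flips * p
    deviation = abs(heads - expected)
    total = 0.0
    for k in range(flips + 1):
        if abs(k - expected) >= deviation:
            # Binomial probability of exactly k heads in `flips` tosses.
            total += comb(flips, k) * p**k * (1 - p)**(flips - k)
    return total

for heads in (60, 70):  # hypothetical outcomes of 100 tosses
    p = binom_p_value(heads, 100)
    print(heads, round(p, 5), "significant" if p < 0.05 else "not significant")
```

Sixty heads in a hundred tosses yields a p-value a hair above 0.05, so it would not clear the conventional bar; seventy heads would. The threshold itself is a convention, not a law of nature.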
Where the results matter more for policy, like the approval of an experimental drug for use by human beings, or where the conclusions have public safety implications, as in food testing protocols, we might want a stricter significance level, say 3 percent (a 97 percent confidence level). However, we need to be careful not to confuse statistical significance with social or political significance.
For example, suppose I hypothesize that apples are nutritionally better for us than pears, assemble a whopper of a sample size, and perform all the right statistical tests. I may come up with a statistically significant result that the apples are the winners. But what if the difference between the two fruits is so small as to be inconsequential in the overall human diet? The larger conclusion (eating fresh fruit is good for us) stands, and while my study might be informative, it’s just not that informative.
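The apples-versus-pears point can be illustrated numerically. In this sketch the “nutrition scores,” their spread, and the sample sizes are all invented for illustration; the only real lesson is that a huge sample makes a trivially small difference “statistically significant”:

```python
from math import sqrt, erf

def z_test_p(mean1, mean2, sd, n):
    """Two-sided p-value for the difference of two sample means,
    assuming equal known SD and equal sample sizes (normal approximation)."""
    z = abs(mean1 - mean2) / (sd * sqrt(2.0 / n))
    # Standard normal CDF via the error function.
    cdf = 0.5 * (1 + erf(z / sqrt(2)))
    return 2 * (1 - cdf)

# Hypothetical scores: apples 50.2, pears 50.0, SD 5 — a tiny difference.
small_study = z_test_p(50.2, 50.0, 5.0, 100)        # not significant
large_study = z_test_p(50.2, 50.0, 5.0, 1_000_000)  # "significant"
print(small_study, large_study)
```

With a hundred subjects per fruit the difference is statistical noise; with a million it is overwhelmingly “significant,” yet the difference itself (0.2 points on a 50-point scale) is exactly as inconsequential as before. Significance measures detectability, not importance.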
Their third point: “3. Be wary of scholars using high-powered statistical techniques as a bludgeon to silence critics who are not specialists.” Merely because someone can wield a regression analysis doesn’t mean his or her study is the Be-All and End-All of Research. I’d add another note of caution: beware of jargon in general. The column’s authors make a good point: “If the author can’t explain what they’re doing in terms you can understand, then you shouldn’t be convinced.” If you happen to be a reasonably well educated, reasonably intelligent individual, then a reasonably well constructed study yielding reasonable results ought to be clear to you.
Their fourth point: “4. Don’t fall into the trap of thinking about an empirical finding as ‘right’ or ‘wrong.’” This is especially difficult when the finding is “agreeable” or supports preconceived notions about a subject. Look, See, the Study Supports Me! Empirical findings (results of data analysis) don’t automatically make a conclusion “right”; they simply add an element of defensible data to the discussion.
Their fifth point: “5. Don’t mistake correlation for causation.” Yes, oh yes, oh yes. Some financial news pundits were quick to all but assert that high levels of government debt slowed economic growth. That statement, or its inference, speaks to causation, not correlation. While debt and growth may correlate, we have to ask whether other factors, not incorporated into the study and its results, are driving the conclusions. There may well be Chicken and Egg issues here. Is growth slow because the debt is increasing, or is the debt increasing because growth is already slow and governments are taking on debt to finance projects and employment to stabilize growth trends?
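One way a correlation can mislead is through a hidden third factor. This simulation is entirely hypothetical (the variable names merely echo the debt-and-growth example): a single unobserved factor drives two series in opposite directions, producing a strong correlation even though neither series causes the other:

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

random.seed(42)
# A hidden common factor (think of it as overall economic weather)
# pushes "debt" up and "growth" down; they never touch each other.
hidden = [random.gauss(0, 1) for _ in range(10_000)]
debt   = [h + random.gauss(0, 0.5) for h in hidden]
growth = [-h + random.gauss(0, 0.5) for h in hidden]
print(round(pearson(debt, growth), 2))  # strongly negative
```

The two series correlate strongly (around -0.8 with these noise levels), yet by construction there is no causal arrow between them. A regression on debt and growth alone would never reveal that.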
Their sixth point: “6. Always ask ‘so what?’” The authors of the column speak to “external usefulness,” and this is an important concept, closely related to the questions about statistical significance. Merely because a study shows statistical relationships doesn’t mean it is dealing with significant issues, and merely because a study reports results at a very high level of confidence doesn’t mean it has much, if any, utility for our discussions of public policy.
The authors provide a good hypothetical question involving the use of the Reinhart-Rogoff study, centering on the reason for the increase in indebtedness: government profligacy, or necessary expenditure for infrastructure projects?
I’d add a seventh proposition to this list: Whence comes the study? Who is paying the piper? The controversial “Success For All” program vacuumed up a hefty portion of educational research funding, and its “success” was supposedly replicated all over the country. But, wait. There was a working relationship between the researchers and the program, one that should have raised red flags about the independence of the research and the applicability of the results. [Pogrow] Does anyone truly believe that the Heritage Foundation think tank will publish a study indicating that a growth-based economic policy is better than cutting government spending?
On the other hand, the payment issue doesn’t necessarily have to be a non-starter. Facts are facts no matter their point of origin. It’s the methodology and the conclusions we need to analyze carefully, not just the origin. Once analyzed, the data and conclusions still have to pass that “So What?” test.
A Johns Hopkins School of Public Health study found that: “A study of risk factors for violent death of women in the home found that women living in homes with 1 or more guns were more than 3 times more likely to be killed in their homes.” (pdf) One side of the gun safety argument might see this as “proof” that fewer guns would make women safer in situations involving domestic violence. Another view might argue that she should have her own arsenal readily available. However, arguing that the finding is invalid merely because a university, perhaps some ‘librul bastion,’ sponsored the study isn’t productive. Facts are still facts.
When reading the pundits and the pontificators, the Disticha Moralia reminds us: “Sermo datur cunctis; animi sapientia paucis.” (Speech is given to all; wisdom of mind to few.)