Manipulating Science in the Data Age

April 27, 2019

Who are you going to believe—my academic paper/editorial/meme or your lying eyes?

It’s a pressing question in today’s world of artificial intelligence, machine learning, faked videos, and tendentious scientific claims—and particularly pressing in light of ambitious, far-reaching policy proposals based on data analytics and models.

Perhaps you remember Climategate 1.0, when emails from the UK’s East Anglia Climatic Research Unit were hacked (or leaked). Many who read through them saw clear evidence that climate researchers in the United Kingdom and the United States worked to suppress legitimate research results and data that militated against their claim of catastrophic human-caused global warming.

Among those researchers was Pennsylvania State University climatologist Michael E. Mann, who was accused of having deliberately cherry-picked tree ring data in order to assert a “hockey stick” shaped graph in which global temperature spiked over the last century or so. That cherry-picked data, it was said, served to “hide the decline” in overall global temperatures that others saw using different data sets, a phrase that went on to inspire a satirical video.

What followed were two investigations which sort of, kind of, exonerated the participants of offenses that would otherwise cut off their research funding from government agencies.

And now we have . . . drumroll . . . another email dump and Climategate 2.0. Emails that include passages like these:

Mike, The Figure you sent is very deceptive . . . there have been a number of dishonest presentations of model results by individual authors and by IPCC. —Tom Wigley, University Corporation for Atmospheric Research

Observations do not show rising temperatures throughout the tropical troposphere unless you accept one single study and approach and discount a wealth of others. This is just downright dangerous. We need to communicate the uncertainty and be honest. Phil, hopefully we can find time to discuss these further if necessary … I also think the science is being manipulated to put a political spin on it which for all our sakes might not be too clever in the long run. —Peter Thorne, UK Met Office

Wigley’s caution is particularly interesting given that, in a Climategate 1.0 email, he urged his colleagues to “get rid of” a scholarly journal editor who committed the offense of publishing research papers that did not fit the catastrophic human-caused global warming narrative.

Heaven forbid scientists actually, you know, acknowledge all the data and how we analyze it.

And “hide the decline”? It looks like that’s still going on, with the active help of the National Oceanic and Atmospheric Administration, the U.S. government agency responsible for maintaining official temperature records. Investor’s Business Daily reports:

NOAA has made repeated “adjustments” to its data, for the presumed scientific reason of making the data sets more accurate.

Nothing wrong with that. Except, all their changes point to one thing—lowering previously measured temperatures to show cooler weather in the past, and raising more recent temperatures to show warming in the recent present.

A New Deception Primer
So how do you lie in the Data Age? Let me count the ways.

Human ingenuity (not to mention our ability to deceive even ourselves) is boundless. But there are three main approaches.

First, you can simply lie about research results or other statistical data. For instance, you might deliberately mislead the public about the impact of recent tax law changes. Once you’ve achieved the election results you wanted, you might boast about it (as Matthew Yglesias did in a now-deleted tweet). Or, if you’re the New York Times, you can pretend you just realized on April 14 that a majority of Americans got a tax cut after all.

Second, you can use misleading analytic or predictive methods. Say your approach uses machine learning algorithms. These basically are different methods for discovering patterns within a set of data. A beginner classroom exercise is to apply different algorithms to the same data to see which ones correctly find a pattern that is already known—distinguishing male and female faces in photos, for instance. For extra credit, the student can analyze how the algorithms developed their results. Neural nets that analyze photos often end up depending heavily on combinations of features that we humans aren’t conscious of using in our own recognition.

“Show me the man and I will show you the crime,” boasted Lavrentiy Pavlovich Beria, head of Stalin’s secret police. But in the Data Age, it’s not always necessary to invent evidence. Often one can get a desired outcome simply by choosing the data set and algorithm to apply.
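
To make that concrete, here is a minimal sketch using made-up numbers, not real temperature records. The series is a pure up-and-down cycle with no underlying trend; the function name ols_slope and the window boundaries are hypothetical choices for illustration. The same straight-line fit reports a rise or a fall depending only on which slice of the data is handed to it.

```python
import math


def ols_slope(ys):
    """Least-squares slope of ys against the positions 0, 1, 2, ..."""
    n = len(ys)
    mean_x = (n - 1) / 2
    mean_y = sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(ys))
    den = sum((x - mean_x) ** 2 for x in range(n))
    return num / den


# Synthetic series (an assumption for illustration): a cycle that repeats
# every 40 steps, with no long-run trend built in.
series = [math.sin(2 * math.pi * t / 40) for t in range(200)]

full_slope = ols_slope(series)            # all 200 points, five full cycles
rising_slope = ols_slope(series[30:51])   # window chosen from trough to peak
falling_slope = ols_slope(series[10:31])  # window chosen from peak to trough

print("all data:      ", round(full_slope, 4))
print("rising window: ", round(rising_slope, 4))
print("falling window:", round(falling_slope, 4))
```

Nothing in the three fits is invented or altered. The only decision that differs is where the window starts and stops.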

Or, if you are applying a model of how different phenomena interact to create a complex situation, your model might be misleading or inadequate. For instance, a climate change model that failed to include variations in solar radiation might give misleading results if those variations were an important factor in the real world. If they were not, the model might be basically sound even without that factor in its calculations.
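
A toy sketch of both cases, under loudly stated assumptions: the data are synthetic, the names factor_a, factor_b, and fitted_effect_of_a are hypothetical, and the "true" effect of the included factor is set to 1.0 by construction. The model is fitted on factor A alone. When the omitted factor B matters and moves with A, the fit credits A with B's influence. When B is only a small wobble unrelated to A, leaving it out does little harm.

```python
import math


def slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


N = 200
factor_a = [t / N for t in range(N)]                        # included in the model
wobble = [math.sin(2 * math.pi * t / 20) for t in range(N)]  # short cycle, no trend


def fitted_effect_of_a(weight_b, b_tracks_a):
    """Fit outcome on factor A only, leaving factor B out of the model.

    The outcome is built as 1.0 * A + weight_b * B, so the true effect of A
    is 1.0 in every scenario (an assumption of this sketch).
    """
    factor_b = [(a if b_tracks_a else 0.0) + 0.1 * w
                for a, w in zip(factor_a, wobble)]
    outcome = [1.0 * a + weight_b * b for a, b in zip(factor_a, factor_b)]
    return slope(factor_a, outcome)


important_and_correlated = fitted_effect_of_a(weight_b=2.0, b_tracks_a=True)
important_but_unrelated = fitted_effect_of_a(weight_b=2.0, b_tracks_a=False)
unimportant = fitted_effect_of_a(weight_b=0.0, b_tracks_a=True)

print("B matters and tracks A:", round(important_and_correlated, 2))
print("B is a wobble unrelated to A:", round(important_but_unrelated, 2))
print("B has no effect:", round(unimportant, 2))
```

In the first scenario the fitted effect of A lands near three times its true value, which is the kind of misleading result the paragraph above describes.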

There’s one complication: knowing what matters gets tricky in the case of what are technically called complex systems. These systems can respond non-linearly to some events; the best-known case is the so-called “butterfly effect,” in which a small change produces a very large effect. And if the system elements are adaptive (i.e., able to vary their individual responses to what is happening around them), it gets interesting and sometimes non-intuitive pretty quickly. Think of preference cascades and elections.
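
The butterfly effect is easy to reproduce in a few lines. The sketch below uses the logistic map, a standard textbook toy and not a climate model; the growth parameter 3.9, the starting value 0.2, and the one-in-a-billion nudge are all assumptions chosen for illustration. Two runs that begin almost identically stay together for a few steps and then part ways completely.

```python
def logistic_path(x0, r=3.9, steps=100):
    """Iterate x -> r * x * (1 - x), starting from x0, and return every value."""
    path = [x0]
    for _ in range(steps):
        path.append(r * path[-1] * (1.0 - path[-1]))
    return path


baseline = logistic_path(0.2)
nudged = logistic_path(0.2 + 1e-9)  # starting point differs by one part in a billion

# Distance between the two runs at every step.
gaps = [abs(a - b) for a, b in zip(baseline, nudged)]

early_gap = gaps[5]       # still tiny after five steps
largest_gap = max(gaps)   # how far apart the runs eventually get

print("gap after 5 steps:", early_gap)
print("largest gap within 100 steps:", largest_gap)
```

A measurement error far too small to notice at the start ends up dominating the result, which is why long-range prediction in such systems is so hard to check.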

Killing Science—and Self-Government
The third way to lie, mislead, or be mistaken in the Data Age goes to the heart of things. Garbage data in, garbage results out. This is the claim at the center of Climategate 1.0: that poorly chosen, and sometimes deliberately cherry-picked or even modified, data was fed into climate models to produce predetermined outcomes.

What critics of the anthropogenic global warming hypothesis note is that those pushing for its acceptance have in many cases refused to make their data, models, and intermediate results available for review by other scholars. Since the legitimacy of science rests on the ability of other researchers to validate (or offer critiques of) research outcomes, such a refusal is deadly to the enterprise of science as a whole.

And at a practical level, it does more than undercut the authority of science—it makes it impossible for us to discuss the pros and cons, and the tradeoffs, associated with policy initiatives.

And so we get both sides of the climate issue making emotion-laden claims, pointing to weather as if this week’s temperatures said anything about a whole planet’s complex, dynamic climate system, and demanding that government “Take Action Now,” either to force far-reaching, disruptive changes with major second- and third-order side effects on us all, or to cut government funding for science research and wash our hands of it.

For “climate change” substitute “gun violence.” Or “education outcomes.” Or “poverty and inequality.” Or any other issue in which the overall phenomenon is the result of many factors interacting.

If we have any hope whatever of bettering our society and our world, it must—must—start with honesty about what we know and what we don’t. It must include the ability of people with differing policy preferences and priorities to examine data and analyses in detail. And it must rest on a degree of humility that is increasingly an endangered species.

Manipulating science, or simply the facts around tax legislation, can be tempting as a means to a desired goal. But it is indeed “too clever in the long run” for the Data Age.

And it is deadly to the consent of the governed in a republic.

