Manipulating Science in the Data Age

Who are you going to believe—my academic paper/editorial/meme or your lying eyes?

It’s a pressing question in today’s world of artificial intelligence, machine learning, faked videos, and tendentious scientific claims—and particularly pressing in light of ambitious, far-reaching policy proposals based on data analytics and models.

Perhaps you remember Climategate 1.0, when emails from the UK’s East Anglia Climatic Research Unit were hacked (or leaked). Many who read through them saw clear evidence that climate researchers in the United Kingdom and the United States worked to suppress legitimate research results and data that militated against their claim of catastrophic human-caused global warming.

Among those researchers was Pennsylvania State University climatologist Michael E. Mann, who was accused of having deliberately cherry-picked tree ring data in order to assert a “hockey stick” shaped graph in which global temperature spiked over the last century or so. That cherry-picked data, it was said, served to “hide the decline” in overall global temperatures that others saw using different data sets, leading to this satirical video.

What followed were two investigations which sort of, kind of, exonerated the participants of offenses that would otherwise cut off their research funding from government agencies.

And now we have . . . drumroll . . . another email dump and Climategate 2.0. Emails that include passages like these:

Mike, The Figure you sent is very deceptive . . . there have been a number of dishonest presentations of model results by individual authors and by IPCC. Tom Wigley, University Corporation for Atmospheric Research

Observations do not show rising temperatures throughout the tropical troposphere unless you accept one single study and approach and discount a wealth of others. This is just downright dangerous. We need to communicate the uncertainty and be honest. Phil, hopefully we can find time to discuss these further if necessary … I also think the science is being manipulated to put a political spin on it which for all our sakes might not be too clever in the long run. —Peter Thorne, UK Met Office

Wigley’s caution is particularly interesting given that, in a Climategate 1.0 email, he urged his colleagues to “get rid of” a scholarly journal editor who committed the offense of publishing research papers that did not fit the catastrophic human-caused global warming narrative.

Heaven forbid scientists actually, you know, acknowledge all the data and how we analyze it.

And “hide the decline”? It looks like that’s still going on, with the active help of the National Oceanic and Atmospheric Administration, the U.S. government agency responsible for maintaining official temperature records. Investor’s Business Daily reports:

NOAA has made repeated “adjustments” to its data, for the presumed scientific reason of making the data sets more accurate.

Nothing wrong with that. Except, all their changes point to one thing—lowering previously measured temperatures to show cooler weather in the past, and raising more recent temperatures to show warming in the recent present.

A New Deception Primer
So how do you lie in the Data Age? Let me count the ways.

Human ingenuity (not to mention our ability to deceive even ourselves) is boundless. But there are three main approaches.

First, you can simply lie about research results or other statistical data. For instance, you might deliberately mislead the public about the impact of recent tax law changes. Once you’ve achieved the election results you wanted, you might boast about it (as Matthew Yglesias did in a now-deleted tweet). Or, if you’re the New York Times, you can pretend you just realized on April 14 that a majority of Americans got a tax cut after all.

Second, you can use misleading analytic or predictive methods. Say your approach uses machine learning algorithms. These basically are different methods for discovering patterns within a set of data. A beginner classroom exercise is to apply different algorithms to the same data to see which ones correctly find a pattern that is already known—distinguishing male and female faces in photos, for instance. For extra credit, the student can analyze how the algorithms developed their results. Neural nets that analyze photos often end up depending heavily on combinations of features that we humans aren’t conscious of using in our own recognition.

“Show me the man and I will show you the crime,” boasted Lavrentiy Pavlovich Beria, head of Stalin’s secret police. But in the Data Age, it’s not always necessary to invent evidence. Often one can get a desired outcome simply by choosing the data set and algorithm to apply.
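To see how much the choice of data window alone can matter, here is a minimal sketch in Python. The series is synthetic (a pure cycle invented for illustration, not real temperature measurements), and the helper names `ols_slope` and `window_slope` are hypothetical.

```python
import math


def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


# ASSUMPTION: a made-up series, a cycle with a 40-step period and no
# built-in long-run trend. Peaks fall at steps 0 and 40, troughs at 20 and 60.
steps = list(range(80))
values = [math.cos(2 * math.pi * t / 40) for t in steps]


def window_slope(start, end):
    """Fit a straight line to the slice [start, end) of the series."""
    return ols_slope(steps[start:end], values[start:end])


full_slope = window_slope(0, 80)      # the whole record
rising_slope = window_slope(20, 41)   # trough to next peak
falling_slope = window_slope(0, 21)   # peak to next trough

print(f"whole record: {full_slope:+.4f}")
print(f"steps 20-40:  {rising_slope:+.4f}")
print(f"steps 0-20:   {falling_slope:+.4f}")
```

Fitted over the whole record, the slope is close to zero. Fitted from a trough to the next peak it is clearly positive, and from a peak to the next trough clearly negative. Same data, same method, opposite headlines.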

Or if you are applying a model of how different phenomena interact to create a complex situation, your model might be misleading or inadequate. For instance, a climate change model that failed to include variations in solar radiation might give misleading results if those variations were an important factor in the real world. If they were not, the model might be basically sound even though it leaves that factor out of its calculations.
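A toy calculation makes the point about a missing factor concrete. This is only a sketch under stated assumptions: the outcome is invented as a weighted sum of two factors, the weights (1.0 and 2.0) are arbitrary, and the names `ols_slope` and `fit_one_factor` are hypothetical. It is not a climate model.

```python
def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


def fit_one_factor(included, omitted, w_included=1.0, w_omitted=2.0):
    """Build outcomes from BOTH factors, then fit using only the included one.

    ASSUMPTION: the 'real world' here is the invented rule
    outcome = w_included * included + w_omitted * omitted.
    """
    outcomes = [w_included * a + w_omitted * b
                for a, b in zip(included, omitted)]
    return ols_slope(included, outcomes)


included = [float(t) for t in range(10)]

# Case 1: the omitted factor varies, and moves together with the included one.
slope_when_omitted_varies = fit_one_factor(included, [0.5 * a for a in included])

# Case 2: the omitted factor is constant, so leaving it out costs nothing here.
slope_when_omitted_flat = fit_one_factor(included, [3.0] * len(included))

print(f"true weight of included factor: 1.0")
print(f"estimate, omitted factor varies: {slope_when_omitted_varies:.2f}")
print(f"estimate, omitted factor flat:   {slope_when_omitted_flat:.2f}")
```

When the left-out factor varies along with the included one, the one-factor fit credits the included factor with twice its true effect. When the left-out factor is flat, the same one-factor fit recovers the true weight.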

There’s one complication: knowing what matters gets tricky in the case of what are technically called complex systems. These systems have non-linear responses to some events, the so-called “butterfly effect” in which a small change produces a very large effect. And if the system elements are adaptive—i.e., able to vary their individual responses to what is happening around them—it gets interesting and sometimes non-intuitive pretty quickly. Think of preference cascades and elections.
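The butterfly effect is easy to demonstrate with the logistic map, a standard textbook toy for non-linear behavior. The sketch below is illustrative only: the starting value 0.2, the nudge of one part in ten billion, and the function name `logistic_trajectory` are all assumptions chosen for the example, and the map stands in for no real-world system.

```python
def logistic_trajectory(x0, steps, r=4.0):
    """Iterate the logistic map x -> r * x * (1 - x) and return every value."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


# Two runs whose starting points differ by 1e-10 (an arbitrary tiny nudge).
base = logistic_trajectory(0.2, 100)
nudged = logistic_trajectory(0.2 + 1e-10, 100)

# Gap between the two runs at each step.
gaps = [abs(a - b) for a, b in zip(base, nudged)]
early_gap = gaps[5]
largest_gap = max(gaps)

print(f"gap after 5 steps:        {early_gap:.3e}")
print(f"largest gap in 100 steps: {largest_gap:.3f}")
```

After a handful of steps the two runs are still indistinguishable for practical purposes. Within a hundred steps they bear no resemblance to each other, even though the rule is fixed and nothing random was added.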

Killing Science—and Self-Government
The third way to lie, mislead, or be mistaken in the Data Age goes to the heart of things. Garbage data in, garbage results out. This is the claim at the center of Climategate 1.0—that poorly chosen, and sometimes deliberately cherry picked or even modified, data was fed into climate models to produce predetermined desired outcomes.

What critics of the anthropogenic global warming hypothesis note is that those pushing for its acceptance have in many cases refused to make their data, models, and intermediate results available for review by other scholars. Since the legitimacy of science rests on the ability of other researchers to validate (or offer critiques of) research outcomes, such a refusal is deadly to the enterprise of science as a whole.

And at a practical level, it does more than undercut the authority of science—it makes it impossible for us to discuss the pros and cons, and the tradeoffs, associated with policy initiatives.

And so we get both sides of the climate issue making emotion-laden claims, pointing to weather as if this week’s temperatures said anything about a whole planet’s complex, dynamic climate system, and demanding that government “Take Action Now”—either to force far reaching, disruptive changes with major second and third order side effects on us all, or to cut government funding for science research and wash our hands of it.

For “climate change” substitute “gun violence.” Or “education outcomes.” Or “poverty and inequality.” Or any other issue in which the overall phenomenon is the result of many factors interacting.

If we have any hope whatever at bettering our society and our world, it must—must—start with honesty about what we know and what we don’t. It must include the ability of people with differing policy preferences and priorities to examine data and analyses in detail. And it must rest on a degree of humility that increasingly is an endangered species.

Manipulating science, or simply the facts around tax legislation, can be tempting as a means to a desired goal. But it is indeed “too clever in the long run” for the Data Age.

And it is deadly to the consent of the governed in a republic.


About Robin Burk

Robin Burk started her career wearing bell bottom jeans in the basement of the Pentagon, where she had the challenging privilege of interacting with computing legend Grace Hopper, and in Silicon Valley, where she wrote one of the first commercially deployed Internet protocol software stacks. The remainder of her first career half was spent in roles through senior executive in small and mid-sized tech companies serving defense and national security customers in the US and abroad. After the attacks of 9/11 Robin taught in two departments at the U.S. Military Academy (West Point). Returning to the Beltway area, she grew a fledgling research grant program in the new discipline of complex network systems at the Defense Threat Reduction Agency, center of U.S. counterWMD expertise, then led a team that addressed national security and commercial applications at a major R&D organization. Today her passion is helping organizations and individuals make the best responses to disruptive tech-driven change. Along the way she picked up a PhD in artificial intelligence and some DOD civilian medals. She is currently being trained by a young English Cocker Spaniel whose canine appreciation for social compacts rivals that of Confucius and his followers.