December 01, 2015

Ethics and Raw Data

Scientists have a well-established tradition of disseminating their research results. Those results often include collected data, which researchers are generally willing to share. There are limits, of course. If research involves human subjects, their privacy must be protected. If data contains trade secrets, employers may insist that researchers maintain confidentiality. And it is easy to imagine other exceptions, such as those related to human safety, legal requirements, business contracts, etc. However, scientists who do basic research—research intended to develop, test, or extend general theories—are likely to have broad discretion concerning the use of their data. The following discussion will focus on this kind of scientist.

Although research scientists are accustomed to sharing their research results, in some situations they may be reluctant to give others unrestricted access to their data. This can happen when scientists are on opposing sides of a political or ideological debate, such as the debate over climate change. When a scientific disagreement has important public policy ramifications, scientists can become as polarized as the public itself. They may split into opposing camps and regard each other as competitors rather than colleagues. In extreme cases, they may even disparage each other’s integrity and competence. In this atmosphere, conflict replaces cooperation, and scientists may be as interested in discrediting other scientists’ research as they are in conducting their own.

One way in which researchers criticize other scientists’ work is to accuse them of ignoring or misinterpreting important raw data. Thus, a scientist whose research is under attack may be tempted to put the raw data off limits to opposing scientists, or at least to impede their access. But is a general policy of scientists denying other scientists access to their raw data justified? Or should researchers freely share all their data with other scientists?

The so-called “Climategate” affair, which garnered significant press back in 2009, serves as a strong illustration of how the scientific community can become politicized and polarized. A large number of especially sensitive emails of the Climatic Research Unit (CRU) at the University of East Anglia were hacked and released to the public, prompting an intense and widespread reaction. Climate-change skeptics charged that the emails revealed a conspiracy among climatologists to suppress and manipulate temperature data. Critics also used the emails to argue that climatologists were attempting to discredit and marginalize climate change skeptics. For example, they cited messages in which two CRU scientists discussed how to handle skeptics who asked to see their raw temperature data. In one email, a CRU scientist mentioned that he was having trouble with a certain journal editor who had instituted a new policy requiring that authors of papers submitted for publication make their raw data available to reviewers. He declared indignantly that he would not submit any more papers to the journal unless the editor reversed his policy.

It is not hard to sympathize with the CRU scientists. They had good reason to fear that skeptics would use their raw temperature data to mislead the public about the evidence for climate change. In the past, skeptical scientists and their journalistic allies had exploited the fact that climatologists use “adjusted” temperature data to support their conclusions. The skeptics called the climatologists’ use of adjusted data a “scandal,” an obvious attempt to perpetrate a fraud on the public, and argued that the real data (i.e., the raw data) proved that global temperatures were not increasing at all. In their opinion, climatologists were manipulating temperature data to make the data appear to support climate change when in fact the opposite was true. Even today, a number of conservative politicians and commentators remain convinced that scientists are part of a vast conspiracy to manipulate and indoctrinate the electorate. One U.S. Senator has for years maintained that the entire climate change movement is a “hoax.” Unsurprisingly, such accusations resonate with many non-scientists, because increases in worldwide temperatures are apparent in the adjusted data but not in the raw data.

These are some of the challenges climate change proponents are up against. If they are to win the public policy debate, they must counter these attitudes and convince non-scientists that adjusting data is a legitimate scientific practice. Again, let’s consider climate change. In order to study global climate, climatologists have measured temperatures at multiple locations around the globe and collected data over many years. During that time, newer, more accurate thermometers replaced older ones. Consequently, not all thermometer readings could be directly compared. This meant that scientists needed a way to homogenize different sets of temperature data. For example, where simple mathematical relationships existed among readings from different thermometers, it made sense to take the newest type of thermometer as the standard and calculate “equivalent” temperatures for the older thermometers. This is a classic case of data adjustment. Furthermore, changes in the physical characteristics of the locations where scientists collected temperature data required another type of data adjustment. For example, new buildings might be built near a particular thermometer and cast new shadows, change wind patterns, etc. It is perfectly reasonable to try to compensate for such changes by making suitable adjustments to the raw temperature data. Which adjustments are suitable, however, might be very difficult to say, and scientists themselves might honestly disagree about which adjustments are best.
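To make the first kind of adjustment concrete, here is a minimal sketch in Python. It assumes, purely for illustration, that an old and a new thermometer were read side by side for a while and that the relationship between them is roughly linear. The function names and the sample readings are hypothetical and do not come from any actual climatological data set or method; real homogenization procedures are considerably more involved.

```python
from statistics import mean


def fit_linear(old_readings, new_readings):
    """Least-squares fit of new ~ slope * old + offset from paired readings.

    Assumption for this sketch: both thermometers were read at the same
    times during an overlap period, so the two lists line up pairwise.
    """
    if len(old_readings) != len(new_readings) or len(old_readings) < 2:
        raise ValueError("need at least two paired readings of equal length")
    mean_old = mean(old_readings)
    mean_new = mean(new_readings)
    spread = sum((x - mean_old) ** 2 for x in old_readings)
    if spread == 0:
        raise ValueError("old readings must not all be identical")
    covariance = sum(
        (x - mean_old) * (y - mean_new)
        for x, y in zip(old_readings, new_readings)
    )
    slope = covariance / spread
    offset = mean_new - slope * mean_old
    return slope, offset


def to_equivalent(old_readings, slope, offset):
    """Convert old-thermometer readings to 'equivalent' new-thermometer values."""
    return [slope * reading + offset for reading in old_readings]


if __name__ == "__main__":
    # Hypothetical overlap period: the old instrument reads 0.5 degrees low.
    old_overlap = [10.0, 15.0, 20.0, 25.0]
    new_overlap = [10.5, 15.5, 20.5, 25.5]
    slope, offset = fit_linear(old_overlap, new_overlap)

    # Raw historical readings taken only with the old instrument.
    raw_history = [12.0, 18.0, 22.0]
    print(to_equivalent(raw_history, slope, offset))
```

Even in this toy version, the adjusted values depend on choices a critic could question, such as which overlap period to use and whether a straight line is the right relationship. That is one reason the raw readings matter to anyone who wants to check the result.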

Considering the many ways skeptics can criticize how scientists adjust their raw data, one may wonder whether anyone, scientists included, should be allowed unrestricted access to it. Why not just make one’s raw data off limits to other scientists, especially those who are out to cause trouble? Although adopting such a policy might be tempting, there are compelling reasons not to do so. Even though critics are sometimes unfair and insincere in their criticisms, it is often a good idea for scientists to reconsider and reevaluate the methods they are using to interpret their raw data. Doing so can help researchers avoid myopia, groupthink, and similar pitfalls. Of course, when a scientist’s research is entangled in heated political and ideological controversies and that research and the scientist’s integrity are attacked, sharing raw data with scientific and political opponents may feel like unilateral disarmament. Yet in the long run, withholding and suppressing raw data is likely to discredit scientists and their research in the eyes of the public. This happened in the Climategate affair to a considerable degree.

There are also deeper, ethical reasons why research scientists should freely share their data with other scientists. As a profession, scientists have an unstated but generally accepted mission in a free society. In a nutshell, this mission is to endeavor to identify and understand the fundamental “building blocks” and dynamics of the natural universe. Experience indicates that facilitating and encouraging the flow of ideas and information within the scientific community is the most effective way to do this. The free exchange of information and ideas is an ideal to which the vast majority of scientists aspire. It is the foundation of scientists’ ethical obligation to allow each other unrestricted access to the results of their research, including their data. This obligation to share research results should always take precedence over extraneous, competing considerations, such as political or ideological allegiances or personal animosity toward other scientists. This is why it was wrong for the CRU scientists to put obstacles in the way of the researchers who requested access to their temperature data. As a general rule, scientific data, including raw data, should be available for everyone to see.

Ted Lockhart is Emeritus Professor of Philosophy at Michigan Technological University. He is the author of Moral Uncertainty and Its Consequences (Oxford Univ. Press, 2000).