Fixing the IPCC (and the Motley CRU) Part 2

[Oops, this post initially appeared dated Feb 10th because that’s when I created the draft it’s based on. I’m deleting and reposting it because it’s now Feb 15th and, more to the point, so that Part 2 appears after Part 1!]

As promised a few days ago, I’m now delving into what has gone wrong with climate science.

The “CRU hack” and the Guardian Investigation

It all started to go wrong with the email leak from the Climate Research Unit (CRU) at the University of East Anglia (UEA). The issues have recently been thoroughly analysed by Fred Pearce in the Guardian.

Hitting the ground running with a front-page lead, Fred reported in Part 1 of his investigation how Phil Jones, the currently suspended head of the CRU, had, back in 1990, published a paper – referenced in 2007 in the latest IPCC report, known as Assessment Report 4 or simply AR4 to its friends – which included data from some Chinese weather stations. The problem is that the precise location of these weather stations is now uncertain. This is important as an increasing urban heat island effect could be mistaken for a warming climate. Independent climate scientists wanted the Chinese data and CRU couldn’t supply it. It’s no surprise that we’re now being asked to believe that the urban heat island effect is a general problem. In actual fact, the issue has been debated for decades and the data corrected. That’s why the missing information about the Chinese data is such a problem in the first place.

The first problem is that data is not being made freely available. There is now general agreement that it should be, otherwise results of analyses are simply not reproducible.

Part 2 of the investigation showed how the emails revealed alleged attempts to prevent certain papers from being published in peer-reviewed journals and to exclude papers from certain journals from the IPCC process.

This second problem is entirely different to the first, and not so easily solved. In effect, scientists are being asked to endorse papers they don’t believe in. This puts them in a dilemma, and there is no clear prescription for how to resolve it. One has to have some sympathy for Phil Jones, although anyone who puts “HIGHLY CONFIDENTIAL” in the subject line of an email should obviously be fired on the spot for gross stupidity.

The third part of Fred’s investigation looked at how Freedom of Information (FoI) requests were handled by the CRU. What’s really needed is the attention of a great satirist, but to summarise the madness, there were at least three aspects that caused friction, whether or not there was willingness to share data in the first place:
– the volume of FoI requests became onerous for the scientists involved – why apparently no-one thought of simply appointing an administrator to deal with them is anyone’s guess.
– the FoI requests asked for code used to analyse data as well as the data itself. Not only did the scientists involved regard this as their intellectual property (and the result of thousands of hours of work), it defeats the object for it to be released. It’s important for others to be able to access data and analyse it independently, but independently is the operative word. Using a slight variant of their software is not independent and could simply reproduce the same errors.
– the FoI requests asked for emails. The ones subsequently leaked. But the value of these is entirely contingent. They represent recorded private conversations that could almost as easily have been carried out verbally, with no record, or for that matter by coded text messages between untraceable mobile phones. Obviously it’s sensible to be careful what you write in emails because they may become public, but for the law to specifically allow them to be requested under FoI will, in the long run, simply inconvenience those affected, who will be unable to have private conversations in the most convenient way. If their emails are to be treated as public property, surely the next logical step is for the spooks to follow climate scientists about and record their every word.

The third problem is a tricky one – because one response to the IT revolution has been to implement a raft of poorly drafted and generally over-specified laws relating to information, instead of the minimum necessary. Scientists can already publish as much of their reasoning as they wish, but beyond that the only aspect of scientific work that we should insist is made public is the raw data. (Though we must insist on seeing all the raw data – including when, where and by whom it was collected.)

The final part of the Guardian investigation looked into how the emails were leaked. Interesting, but not relevant to a discussion of how to fix the process.

The latest twist is that Professor Jones claims to have “lost track” of data (though it’s not clear exactly what he’s referring to). I’d say his excuse seems reasonably plausible. The FoI amounts to retrospective legislation, after all.

Nevertheless, the sloppiness is inexcusable. If scientific results are not reproducible, they are worthless. Climate studies are based on large amounts of historic data, generally collected by third parties. We’re not talking about personal lab notebooks or electronic data collected by the researcher (though these should also be retained). To ensure reproducibility we need some separation of responsibility between archivists and researchers:
– data collectors report their measurements (light snow here in London just now!) to anyone who wants them.
– these are collated by archivists together with the vast existing database of historic data and published (open source if they’re to be useful).
– to be valid, any paper must specify precisely what data it references and how it has been analysed.
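
To make the last point a little more concrete, here is a rough Python sketch of one way a paper could pin down “precisely what data” it used: a manifest listing each data file with its size and SHA-256 checksum, which an archivist or an independent team could later check against their own copy. This is purely my own illustration, not anything the CRU or any archive is known to do, and the function names (`build_manifest`, `verify_manifest`) and the `station_data` directory mentioned below are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_dir):
    """List every file under data_dir with its size and checksum.

    Hypothetical helper: the result is what a paper might publish
    alongside its results to identify the exact data it analysed.
    """
    root = Path(data_dir)
    entries = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            entries.append({
                "file": path.relative_to(root).as_posix(),
                "bytes": path.stat().st_size,
                "sha256": sha256_of_file(path),
            })
    return entries


def verify_manifest(data_dir, manifest):
    """Return the names of files that are missing or whose checksum differs."""
    root = Path(data_dir)
    problems = []
    for entry in manifest:
        path = root / entry["file"]
        if not path.is_file() or sha256_of_file(path) != entry["sha256"]:
            problems.append(entry["file"])
    return problems
```

A researcher might call `build_manifest("station_data")` when the analysis is run and publish the result; anyone holding the archive can then call `verify_manifest` on their copy, and an empty list means they are looking at the same files. It says nothing about whether the analysis is right, only that everyone is starting from the same data.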

Nothing else is needed. Inspecting computer code is not necessary. Far better for several independent teams to analyse the same data. If source code is released – and open source development of climate models may well make sense – then we need to be cautious. We might end up with the same underlying code in all the models, resulting in the same errors. We’re not trying to make a computer operating system, which just has to be good enough. We need the right answer.

And private email correspondence between scientists has no bearing at all on whether their results are valid.