Saturday, November 21, 2009

dummygrid

I suspect that some of the most interesting information to come out of the Hadley/CRU leaked files will come from analysis of the source code. From the file quick_interp_tdm2.pro:
dummygrid=fltarr(144,72,12) ; set up a "zero anomaly" grid for infilling spaces with missing data
; missing data defined as areas of grid further than the decay distance from any
; real station point
Is this a "smoking gun"? Of course not. Does it raise questions? It seems to. The obvious question for CRU is just how much "missing data" they typically see, and therefore how much of each grid ends up infilled with a zero anomaly rather than a real measurement.
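For readers who don't speak IDL: fltarr(144,72,12) allocates a 144 x 72 x 12 floating-point array initialised to zero, i.e. a 2.5-degree global grid (360/144 = 2.5, 180/72 = 2.5) with twelve monthly layers, every cell starting at an anomaly of exactly zero. Here is a minimal sketch of how such a "zero anomaly" grid could be used for infilling. To be clear, this is my illustration, not the CRU routine: the variable names dist_to_station and interp_anom, and the 1200 km decay distance, are assumptions I've made for the example.

; Illustrative sketch only, not the CRU code.
; dist_to_station: assumed 144x72 array of distances (km) to the nearest real station.
; interp_anom:     assumed 144x72x12 array of interpolated anomalies from real stations.
dummygrid = fltarr(144,72,12)   ; 2.5-degree grid, 12 monthly layers, all cells zero
decay_dist = 1200.0             ; assumed correlation decay distance in km
for m = 0, 11 do begin
  slice = dummygrid[*,*,m]                              ; one monthly 144x72 layer
  anom  = interp_anom[*,*,m]                            ; interpolated anomalies for that month
  near  = where(dist_to_station lt decay_dist, count)   ; cells close enough to a station
  if count gt 0 then slice[near] = anom[near]           ; keep real interpolated values there
  dummygrid[*,*,m] = slice                              ; everything else stays at 0.0
endfor

Every cell left at 0.0 is reported as "no change from the baseline climate", which is exactly why the amount of missing data matters.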

Background on missing data here.

This is interesting, too, from README_GRIDDING.TXT:
Bear in mind that there is no working synthetic method for cloud, because Mark New
lost the coefficients file and never found it again (despite searching on tape
archives at UEA) and never recreated it. This hasn't mattered too much, because
the synthetic cloud grids had not been discarded for 1901-95, and after 1995
sunshine data is used instead of cloud data anyway.
Hmmm ...

UPDATE 2 December 2009 16:26: I've distilled the (scientific) reasons to be skeptical about AGW here.

UPDATE 24 November 2009 10:15: Welcome, visitors from Bishop Hill. Take a look around. I have rather a lot of posts on Climate Change here, although I suspect you'll have seen much of it before.

UPDATE 24 November 2009 11:40: The lighter side of CRU code here.

3 comments:

Francis Turner said...

I don't think that is particularly bad. You have to have a way to handle missing data and to interpolate across it, and it makes sense to first set up an array of cleared values.

Having said that, this file (http://di2.nu/foia/cru-code/idl/pro/quick_interp_tdm2.pro) really is a wonder. You should have gone down a few lines further; see my post from yesterday: http://www.di2.nu/200911/23a.htm

Sai Tan said...

I am a Business Intelligence Program Manager. I build systems that analyse company data and produce real-time reports, historical analysis, and predictive analytics. I have been doing it for over 10 years. I am not the best, but I am pretty good. Companies trust my work and make strategic decisions with long-term implications based upon it.

I say this to put my comments in context.

I am shocked at the sloppy coding and use of data exposed in the CRU leaks. I would not accept it in any of my work. Data sets are corrupted or missing, and the metadata is inconsistent at best. There seem to be no data dictionaries and no universal layer.

This should have been world's best practice. It is far from it.

No meaningful predictive analysis can be derived from this data.

Borepatch said...

Francis, the issue is, as Sai Tan points out, the lack of Quality Control procedures. The details of the code are much less of an issue.

What's disturbing is the idea that probably nobody did any QA on the climate models at all. It's not like this is Rocket Science or something.