On June 4th, 2010 HM Government released Combined Online Information System (COINS) database, the main database used by HM Treasury budgeting. I’ve written a brief overview of the data here. This post is about a little programming project I set myself to actually do something with this data.
The COINS data is in a raw format and needs to be converted to a more useful format before any useful reports can be generated. Mainly out of curiosity, I set myself a project to do something useful with this data. I wanted to do three things:
- Extract useful subsets of the data that were small enough to be loaded into Excel.
- Load the COINS data into an SQL database.
- Recreate some of the government PESA(Public Expenditure Statistical Analyses) reports from the COINS data and check that my results are the same.
The project to do all this is written in python, and I’ve placed it on github here
Extracting useful subsets of the COINS data
The COINS fact_table_extract_2008_09.txt file contains 2,043,129 records that have a non-zero value field: of these 20,566 are outturn records, 1,610,900 are forecast records, 13,733 are plans records, and 397,930 are snapshot records. Older Excel spreadsheets are limited to 65535 rows, and I wanted to produce a useful data set that fit within this limit. The non-zero outturn records easily fit, I’ve made them available on rapidshare: coins2008_09 and coins2009_10.
Loading the COINS data into an SQL database.
Checking my reports against published government data
HM Treasury produces Public Expenditure Statistical Analyses(PESA) reports. I wanted to recreate some of these reports from the COINS data.
I’ve produced outturn Resource, Capital, DEL and AME reports by departmental group from the 2008-09 COINS data, and checked it against the estimated 2008-09 outturn from the PESA 2009 report and the actual 2008-09 outturn from the PESA 2010 report.
The PESA data used was:
The spreadsheet PESA 2010, Chapter 1 tables which contains tables of DEL, AME and TME grouped by Resource and Capital expenditure (from webpage PESA 2010 section 1 – Budgets). 2008-09 subtotals by departmental group are provided in Column G (2008-09 outturn) of the worksheets:
- Table 1.3 Resource budgets, 2004-05 to 2008-09 (by departmental group)
- Table 1.6 Capital budgets, 2004-05 to 2008-09 (by departmental group)
The spreadsheet PESA 2010, Chapter 1 tables which contains tables of DEL, AME and TME grouped by Resource and Capital expenditure (from webpage Public Expenditure Statistical Analyses 2009). 2008-09 subtotals by departmental group are provided in Column G (2008-09 estimated outturn) of the worksheets:
- Table 1.5 Resource budgets, 2003-04 to 2010-10
- Table 1.10 Capital budgets, 2003-04 to 2010-11
The following tables give my results and the PESA result side by side for comparison.
|Children, Schools and Families||46,848||46,848||46,848|
|Innovation, Universities and Skills||N/A||N/A||16,666|
|CLG Communities and Local Government||28,755||4,105
|Law Officers’ Departments||723||724||733|
|Foreign and Commonwealth Office||2,027||2,027||2,025|
|Energy and Climate Change||292||293||1,016|
|Business, Enterprise and Regularity Reform||175||N/A||1,594|
|Environment, Food and Rural Affairs||2,449||2,446||2,654|
|Culture, Media and Sport||1,455||1,456||1,633|
|Work and Pensions||7,937||7,937||8,059|
|Northern Ireland Executive||7,926||7,926||8,117|
|Northern Ireland Office||1,177||1,177||1,342|
|Children, Schools and Families||-2||10,652||10,914|
|Innovation, Universities and Skills||N/A||N/A||274|
|CLG Communities and Local Government||1,282||621
|Law Officers’ Departments||9||9||N/A|
|Foreign and Commonwealth Office||-28||-28||-7|
|Energy and Climate Change||2,403||2,403||4,589|
|Business, Enterprise and Regularity Reform||1||N/A||819|
|Environment, Food and Rural Affairs||0||0||-52|
|Culture, Media and Sport||3,890||3,890||3,911|
|Work and Pensions||135,344||135,344||135,546|
|Northern Ireland Executive||6,711||6,104||8,142|
|Northern Ireland Office||395||396||264|
|Children, Schools and Families||5,519||5,519||5,634|
|Innovation, Universities and Skills||0||N/A||2,123|
|CLG Communities and Local Government||7,233||7,112
|Law Officers’ Departments||8||9||12|
|Foreign and Commonwealth Office||226||227||218|
|Energy and Climate Change||1,667||1,667||1,688|
|Business, Enterprise and Regularity Reform||2||N/A||16|
|Environment, Food and Rural Affairs||609||610||618|
|Culture, Media and Sport||841||842||791|
|Work and Pensions||91||91||57|
|Northern Ireland Executive||1,233||1,233||1,137|
|Northern Ireland Office||67||68||78|
|Children, Schools and Families||0||N/A||N/A|
|Innovation, Universities and Skills||0||N/A||4,230|
|CLG Communities and Local Government||3,766||516||-|
|Law Officers’ Departments||0||N/A||N/A|
|Foreign and Commonwealth Office||41||N/A||0|
|Energy and Climate Change||-35||-279||-419|
|Business, Enterprise and Regularity Reform||-210||N/A||N/A|
|Environment, Food and Rural Affairs||-12||1||0|
|Culture, Media and Sport||571||572||717|
|Work and Pensions||126||136||142|
|Northern Ireland Executive||611||378||413|
|Northern Ireland Office||0||N/A||N/A|
There is good agreement between some of my calculated results and the PESA reports, but there are also some significant differences between the figures.
In the “Resource DEL” and “Capital DEL” tables there is generally good agreement, except for Health, Transport, Defence, Cabinet Office and Independent Bodies departmental groups. The Cabinet Office and Independent Bodies discrepancies could be because some departmental data is missing from the COINS data (see next section). I think the differences in the values for Health are probably because the PESA reports do not include the figures for Scotland, but I haven’t checked this. I have no explanation for the differences between the figures for Transport and Defence.
The agreement for the “Resource AME” and “Capital AME” tables is less good. Some of this disagreement is for the same reasons as the “DEL” differences, but I suspect the main reason for the differences is that my SQL query for the “AME” figures is not correct. (Any corrections to my SQL queries are most appreciated, as are any bug reports or corrections to my programs.)
Missing COINS departmental data
The COINS data does not seem to have spending figures for the following government departments:
- Assets Recovery Agency
- Department for Business, Enterprise and Regulatory Reform
- Office of Communications
- Forrestry Commision
- National Investment and Loans Office
- Royal Mint
- Office of Government Commerce
- Security and Intelligence Agencies
This may because the departments have been renamed, or subsumed into other departments.
I’ve satisfied my curiosity about the COINS data. I’ve managed to recreate some of the PESA reports, and although there are differences between my reports and the PESA reports, the differences are mostly explainable. I’m not Cliff Stoll, and won’t be tracking down the exact nature of the differences. I don’t plan to do any further work on the COINS data, but may look at future data that is released, especially if it is more detailed.
Feel free to use my programs if you wish – it would nice if you credit me if you do so. I’m also interested in other programming projects that use the COINS data – feel free to post a comment with a link if you do anything interesting with the COINS data.
The success of the Olympus and Panasonic mirrorless interchangeable lens cameras shows that there is a demand for this type of camera – a small, light quality camera. Something smaller and lighter than a DSLR, but with better image quality and flexibility than a compact camera.
I’ve argued before that there is a proportion of this market segment for whom size and weight is a priority. It seems that Sony gets this: its recently announced NEX cameras demonstrate what can be done when size and weight are important design criteria.
But it’s not the weight of the size and weight of the camera that is important. It is the weight of the camera/lens/battery/memory card package that counts. Micro Four Thirds cameras have an inherent advantage here, since, for a given image quality, lenses for a m4/3 sensor can be made smaller and lighter than lenses for a APS-C image sensor.
This blog post collects together camera and lens sizes and weights. If camera size and weight is an important consideration for you, then these tables may be of some help. Note I make no attempt to compare the quality or ergonomics of any of these cameras, there are plenty of camera review sites that do that.
The following table gives the sizes of various mirrorless cameras, and their weights with various lenses. The pancake lens is the manufacturer’s wide angle prime lens. The zoom lens is the manufacture’s nearest equivalent to a 28-84mm (full frame) lens. The superzoom is the manufacture’s nearest equivalent to a 28-300mm (full frame) lens. All weights include batteries.
|Olympus E-P1||355g||426g||505g||645g||121 x 70 x 36 mm||230,000|
|Olympus E-P2||355g||426g||505g||645g||121 x 70 x 36 mm||230,000|
|Olympus E-PL1||334g||406g||484g||624g||115 x 72 x 42 mm||230,000|
|Panasonic GF1||315g||415g||480g||775g||119 x 71 x 36 mm||460,000|
|Sony NEX-3||297g||371g||511g||821g||117 × 62 × 33 mm||920,000|
|Sony NEX-5||287g||361g||501g||811g||111 × 59 × 38 mm||920,000|
|Samsung NX10||414g||499g||612g||831g||123 × 87× 40 mm||614,000|
The overall lightest camera/lens combination is the Sony NEX-5 with 16mm lens. This is thanks to the low weight of the Sony camera. With zoom lenses, the inherent size/weight advantage of micro4/3 comes into play: the Panasonic GF1 provides the lightest camera with standard zoom combination and the Olympus E-PL1 provides the lightest camera with superzoom combination. (Of course slightly lighter combinations could be obtained by mixing Panasonic and Olympus lenses and bodies.)
Note that Samsung does not seem to have got it. The NX10 is barely smaller or lighter than some of the smallest DSLRs (see next section).
For comparison, here are the sizes and weight of some of the smaller and lighter DSLRs:
|Canon EOS 1000D||502g||N/A||702g||1097g||126 x 98 x 65 mm||230,000|
|Nikon D60||522g||N/A||787g||1082g||126 x 94 x 64 mm||230,000|
|Olympus E-450||426g||521g||616g||N/A||130 x 91 x 53 mm||230,000|
And for further comparison, here are the weights and dimensions of some of the higher end compact cameras:
|Canon PowerShot G11 [28-140mm]||375g||112 x 76 x 48 mm||461,000|
|Canon PowerShot S90 [28-105mm]||195g||100 x 58 x 31 mm||461,000|
|Leica X1 [35mm]||306g||124 x 60 x 32 mm||230,000|
|Nikon Coolpix P6000 [28-112mm]||280g||107 x 66 x 42 mm||230,000|
|Panasonic Lumix DMC-LX3 [24-60mm]||265g||109 x 60 x 27 mm||460,000|
For reference, here are the weights and sizes of the lenses available for mirrorless interchangeable lens cameras:
|Olympus 9-18mm[18-36mm] f4.0-4.6||155g||57 x50 mm||52mm|
|Olympus 14-42mm[28-84mm] f3.5-5.6||150g||62 x 44 mm||40.5mm|
|Olympus 14-150mm[28-300mm] f4.0-5.6||290g||64 x 83 mm||58mm|
|Olympus 17mm[34mm] f2.8||71g||57 x 22 mm||37 mm|
|Panasonic 7-14mm[14-28mm] f4.0||300g||70 x 83 mm||-|
|Panasonic 14-42mm[28-84mm] f3.5-5.6||165g||61 x 64 mm||52 mm|
|Panasonic 14-45mm[28-84mm] f3.5-5.6||195g||60 x 60 mm||52 mm|
|Panasonic 14-140mm[28-280mm] f4.0-5.8||460g||70 x 84 mm||62 mm|
|Panasonic 20mm[40mm] f1.7||100g||63 x 26 mm||46 mm|
|Panasonic 45mm[90mm] f2.8||225g||63 x 63 mm||46 mm|
|Panasonic 45-200mm[90-400mm] f4.0-5.6||380g||70 x 100 mm||52 mm|
|Samsung 18-55mm[28-85mm] F3.5-5.6||198g||63 x 65 mm||58 mm|
|Samsung 30mm[46mm] f2||85g||62 x 22 mm||43 mm|
|Samsung 50-200mm[77-308mm] F4-5.6||417g||70 x 101 mm||52 mm|
|Sony 16mm[24mm] f2.8||74g||62 x 23 mm||49 mm|
|Sony 18-55[27-83mm] f2.5-5.6||214g||62 x 60 mm||49 mm|
|Sony 18-200[27-300mm] f3.5-6.3||524g||76 x 99 mm||67 mm|
Films are sometimes re-released as a director’s cut – a version that more closely reflects the director’s vision for the film, free from the commercial pressures of the studio.
Science programme makers could take a cue from filmmakers and produce two cuts: one for the general public and one for enthusiasts.
Take Wonders of the Solar System. This has all the makings of a great science programme: an enthusiastic and knowledgable presenter in Professor Brian Cox, great pictures of space and the earth, travel to interesting locations, interviews with scientific experts and fantastic computer simulations. Yet the programme is hard to watch – it’s like eating cabbage: I do it because I know I ought to, not because I want to. And it’s not just me. A friend of mine, who is also a science and technology enthusiast, said he fell asleep during the programme.
The main problem is that the programme is constrained to a 1-hour slot, and is edited to fit that slot. This means there is a lot of filler to sit through. Shots of Brian Cox walking up to a telescope, unnecessary shots of him driving in his car (and, of all forms of filler, presenters driving in their car is the most clichéd and overused), scenes that are portmanteaux of earlier scenes, and so on.
The second problem is that the programme, by necessity, targets a wide audience: since there are relatively few science programmes on television, such programmes have to target both the general public and those who have a deeper understanding of science. This means these programmes generally need to include explanations of the basics, and even when these explanations are good, they are not that interesting to people who already understand what is being explained. A good example is when Brian Cox explained the size of the solar system, by placing planets in their relative positions on a table, and then driving to the position of the Oort cloud. This was a dramatic illustration of the solar system’s size (and, incidentally, appropriate use of a shot of him driving), but not that interesting to someone who understands the vastness of space.
Both these problems could be solved by having two cuts of the programme – a standard version and a scientist’s cut. The scientist’s cut would differ from the standard cut in that it would not include the filler and the basic explanations. But it might include more detail in other areas.
Programme lengths no longer have to be totally dictated by the TV schedule. Perhaps the main showing needs to fit a one-hour slot, but repeats on BBC iPlayer certainly do not. There is no reason why two cuts of “Wonders of the Solar System” should not be available on iPlayer – the full version and the scientist’s cut. The programme is repeated on BBC1 and BBC4, often in the middle of the night – there is no reason why one of these repeats should not be a scientist’s cut. The programme is to be sold on DVD – there is no reason why there should not be a DVD scientist’s cut.
Despite the fact that it contains some wonderful material, I won’t be buying the DVD of Wonders of the Solar System. But if Professor Brian Cox was given full editorial control and was allowed to make a scientist’s cut of the programme, I would buy that.
Film enthusiasts get a special cut, science enthusiasts should get one too.
jökull is Icelandic for glacier. Glaciers do not erupt, volcanoes do.
The volcano underneath the glacier is called Eyjafjall. Note it’s Eyjafjall, not Eyjafjalla. Eyjafjalla is the genitive (possessive) plural form: Eyjafjallajökull literally means “eyja mountains’ (sic) glacier”. (Fjall is Icelandic for “mountain”, it is derived from the Old Norse, as is our English word “fell”. Eyja is Icelandic for “island”, it is derived from the Old Norse íeg as is our word “island” – which is why the “s” is not pronounced. “Ey” lives on in English placenames, such as Anglesey, Guernsey, Jersey and Sheppey.)
Talking about the Eyjafjallajökull eruption is not only inaccurate, it is also long-winded. Let’s be accurate and more concise and say “Eyjafjall”.
The Statute of Anne, the first copyright law, entered into force 300 years ago today, on the 10th April 1710. It established a copyright term of 14 years, which could be renewed for another 14 years if author was still alive when the first term expired (books already in print were granted a copyright term of 21 years).
The Copyright Act 1814 extended the copyright length to 28 years, and if the author was still alive when the term expired, the right of publication could be extended for the rest of the author’s life.
The Copyright Act 1842 extended the copyright term to the life of the author plus 7 years, or 42 years from the first date of publication, whichever was longer. Posthumously published works were provided with a 42 year copyright term.
The Copyright Act 1911 extended the term of copyright to life and fifty years (with certain exceptions).
The Copyright, Designs and Patents Act 1988 extended the copyright term to seventy years from the death of the author for literary, dramatic, musical or artistic works. If the author is unknown, copyright expires seventy years after the work is first made available to the public. If the work is computer-generated, copyright expires fifty years after the work is made.
In most countries around the world, copyright length is life of the author plus 50 years or life of the author plus 70 years
We see it time and again in the movies. A disaster is about to happen: a nuclear bomb is about to explode in a major city, a deadly virus has been released from a laboratory, a giant meteor is about to strike the earth, an alien spaceship is spotted heading towards the earth. And the reaction from the president is always the same: “We must not tell the public or there will be mass panic and hysteria”.
Except there won’t.
The assumption of public panic is a useful plot device for disaster movies, but as an instrument for policy formulation it is defective. Policy makers seem to assume that the public will panic, but research shows the opposite. Baruch Fischhoff, professor of social and decision sciences at Carnegie Mellon University and president of the Society for Risk Analysis, says “people, however stressed, almost always keep their wits and elevate their humanity”. In his article A Hero in Every Aisle Seat he says:
Studies of civilians’ intense experiences in the London Blitz; the cities of Japan and Germany in World War II; the 1947 smallpox outbreak in New York; the earthquake in Kobe, Japan, in 1995; and even fires have found that people, however stressed, almost always keep their wits and elevate their humanity.
In his article Panic: myth or reality, Lee Clarke, Associate Professor, Department of Sociology, Rutgers University states:
Before, during and after disasters, the ‘general public’ warrants trust and respect. Panic is often used as a justification by high-level decision makers to deny knowledge and access to the public, on the presumption that people cannot handle bad news. Research on how people respond to life threatening disasters and the stories form the World Trade Center show that people handle even the most terrifying news civilly and cooperatively. Our leaders would do well to see us as partners in recovery rather than a ‘constituency’ to be handled.
In The Swine Flu Panic That Wasn’t. Mass hysteria fails to materialize. Again. Jesse Walker says:
People are sharing information, they’re seeking out information, they’re asking questions about whether or not they have the symptoms,” says Jeannette Sutton, a researcher at the Natural Hazards Center at the University of Colorado at Boulder. “Those are not incidents of panic or hysteria. That’s rational thinking, where people are asking questions and trying to make decisions based on the information they have available to them.”
It’s not as though there haven’t been any destructive overreactions to the H1N1 flu. It’s just that they’ve come from officials, not the general public.
A John Hopkins University study dispels panic myth and suggests ways to involve the public in response to a bioterrorist attack.
Planners and policy makers have long discounted the public’s ability to participate in a response to bioterrorism, because of a belief that an attack would create mass panic and social disorder.
However, researchers at the Johns Hopkins Bloomberg School of Public Health, who reviewed the public’s response to the terrorist attacks of September 11th, the recent anthrax mailings, and other disasters concluded that the public does not react with panic but with effective and adaptive action and can be an valuable response force and that should be considered in biodefense planning.
It is a myth that a community’s first response to a crisis is panic.
The public are much smarter than portrayed – we pay much more attention to the actual reported numbers than the apocalyptic predictions of the media and politicians.
The assumption of public panic results in poor policy decisions. It’s not the public who panics, it’s the politicians.
The first major criminal trial to be held without a jury in Britain for more than 350 years (the Heathrow Airport Robbery Trial) concluded yesterday. Until this case, and since the abolition of the Court of Star Chamber in 1641, all serious criminal offences on indictment were tried by jury. Trial by jury traces its roots back to Article 39 of the Magna Carta signed by King John in 1215.
Trials on indictment without a jury were made possible by The Criminal Justice Act 2003. A judge may now order a jury-less trial in the specific cases of complex fraud and jury tampering.
There has been much discussion about the ramifications of the removal of the right to trial by jury in criminal trials. I think that the right to trial by jury of ones peers is an important right, and I think it was wrong to remove that right.
But, in the context of our situation where we have had our right to trial by jury removed, I want to talk about something else – the form of those jury-less trials.
The jury should be replaced, not eliminated
In a jury trial, broadly speaking, the judge rules on matters of law and is responsible for sentencing; the jury decides matters of fact (by evaluating the evidence). What’s more the jury consists of individuals with varying opinions and backgrounds – the debate that occurs between jurors in the jury room is an essential part of establishing a verdict. A judge sitting on their own has nobody to challenge their assumptions.
When a case is tried by a single judge, both the form of the trial and the dynamic of the courtroom is radically changed.
Part 7 (Trials on indictment without a jury) of The Criminal Justice Act 2003 addresses the problems of complex and lengthy trials and the problems of jury tampering. These are problems resulting from having citizen jurors, not problems resulting from a having a jury per se.
Rather than solving the problems by complete removal of the jury, I believe that the less drastic measure of replacing the citizen jury by an appointed jury should be adopted. In particular:
- The jury should be replaced, not combined with the role of judge. That is there should be a judge who acts as judge in the case, and a separate judge who sits unrobed in the jury box and acts as proxy for the jury. Being unrobed is important, since it is a constant reminder to the court that the judge is acting as jury, not as judge and jury.
- Having said (2), and mentioned the importance of debate within the jury, I think that if the jury is replaced, then it should actually be replaced by at least two judges who sit unrobed in the jury box.
- The sitting judge should treat the appointed jury of judges just like a citizen jury.
- Having three judges for a trial is undoubtedly expensive, but the expense serves as a disincentive to removing a citizen jury.