On June 4th, 2010 HM Government released Combined Online Information System (COINS) database, the main database used by HM Treasury budgeting. This is a significant milestone in the opening up of UK government data. The data is available here
This post gives a brief overview of that data, and links to sources that help understand that data. I’ve also done a blog post about a little programming project I did to recreate the government Public Expenditure Statistical Analyses(PESA) reports from the COINS data, see: Using the COINS data to recreate PESA reports
Overview of COINS data
The structure of the COINS data is explained in HM Treasury – Understanding the COINS data
Initially the government released four files in zip format. These unzipped to give:
A Very brief summary of the file structure
The released files contain the COINS data in unicode format, with fields separated by @ and one record on each line. All amounts are in thousands of pounds.
The Treasury maintains data for recorded spending (outturn), forecast spending (estimated outturn for the latest year) and planned spending (up to three years ahead). The COINS data also includes snapshots of the spending data. The Data_type field has values that reflect this, including “Outturn”, “Forecast Outturn ” (eg “Forecast Outturn March”), “Plans” and “Snapshot”.
A clear distinction is made between current and capital spending, based on Generally Accepted Accounting Practice (GAAP). The Resource_Capital field has a value of “Resource” or “Capital” to reflect this.
Departments are given firm three year spending limits called Departmental Expenditure Limits(DELs) within which they prioritise resources and plan ahead. Spending that cannot reasonably be subject to firm multi-year limits, or that relates to certain non-cash transactions, is included in Annually Managed Expenditure(AME). DEL and AME together make up Total Managed Expenditure(TME). The Budget_Boundary field has a value of DEL, AME or Not DEL/AME to reflect this.
Spending is by department. The Department_code field reflects this.
- Where Does My Money Go?
- Where Does My Money Go? – A User Guide to COINS
- Open Knowledge Foundation Blog – The Hunt For COINS
- Open Knowledge Foundation Blog – COINS: A Users Guide
- What Do They Know COINS database schema
- EtherPad – Coins open notepad
- Google spreadsheet of COINS Schema overview:
- Open Knowledge Foundation(okfn) coins python scripts
HM Treasury publish documents that may help to understand COINS and, and how that data may be used and aggregated. These include:
- Public spending planning and control – a brief introduction
- Consolidated Budgeting Guidance
- Classification papers
- Information on the Supply Estimates process at: Financial reporting – Parliamentary Supply Estimates
HM Treasury publishes a number of documents based COINS data. Public Spending Statistics are available here: Statistics on Public Finance and Spending.
On June 4th, 2010 HM Government released Combined Online Information System (COINS) database, the main database used by HM Treasury budgeting. I’ve written a brief overview of the data here. This post is about a little programming project I set myself to actually do something with this data.
The COINS data is in a raw format and needs to be converted to a more useful format before any useful reports can be generated. Mainly out of curiosity, I set myself a project to do something useful with this data. I wanted to do three things:
- Extract useful subsets of the data that were small enough to be loaded into Excel.
- Load the COINS data into an SQL database.
- Recreate some of the government PESA(Public Expenditure Statistical Analyses) reports from the COINS data and check that my results are the same.
The project to do all this is written in python, and I’ve placed it on github here
Extracting useful subsets of the COINS data
The COINS fact_table_extract_2008_09.txt file contains 2,043,129 records that have a non-zero value field: of these 20,566 are outturn records, 1,610,900 are forecast records, 13,733 are plans records, and 397,930 are snapshot records. Older Excel spreadsheets are limited to 65535 rows, and I wanted to produce a useful data set that fit within this limit. The non-zero outturn records easily fit, I’ve made them available on rapidshare: coins2008_09 and coins2009_10.
Loading the COINS data into an SQL database.
Checking my reports against published government data
HM Treasury produces Public Expenditure Statistical Analyses(PESA) reports. I wanted to recreate some of these reports from the COINS data.
I’ve produced outturn Resource, Capital, DEL and AME reports by departmental group from the 2008-09 COINS data, and checked it against the estimated 2008-09 outturn from the PESA 2009 report and the actual 2008-09 outturn from the PESA 2010 report.
The PESA data used was:
The spreadsheet PESA 2010, Chapter 1 tables which contains tables of DEL, AME and TME grouped by Resource and Capital expenditure (from webpage PESA 2010 section 1 – Budgets). 2008-09 subtotals by departmental group are provided in Column G (2008-09 outturn) of the worksheets:
- Table 1.3 Resource budgets, 2004-05 to 2008-09 (by departmental group)
- Table 1.6 Capital budgets, 2004-05 to 2008-09 (by departmental group)
The spreadsheet PESA 2010, Chapter 1 tables which contains tables of DEL, AME and TME grouped by Resource and Capital expenditure (from webpage Public Expenditure Statistical Analyses 2009). 2008-09 subtotals by departmental group are provided in Column G (2008-09 estimated outturn) of the worksheets:
- Table 1.5 Resource budgets, 2003-04 to 2010-10
- Table 1.10 Capital budgets, 2003-04 to 2010-11
The following tables give my results and the PESA result side by side for comparison.
|Children, Schools and Families||46,848||46,848||46,848|
|Innovation, Universities and Skills||N/A||N/A||16,666|
|CLG Communities and Local Government||28,755||4,105
|Law Officers’ Departments||723||724||733|
|Foreign and Commonwealth Office||2,027||2,027||2,025|
|Energy and Climate Change||292||293||1,016|
|Business, Enterprise and Regularity Reform||175||N/A||1,594|
|Environment, Food and Rural Affairs||2,449||2,446||2,654|
|Culture, Media and Sport||1,455||1,456||1,633|
|Work and Pensions||7,937||7,937||8,059|
|Northern Ireland Executive||7,926||7,926||8,117|
|Northern Ireland Office||1,177||1,177||1,342|
|Children, Schools and Families||-2||10,652||10,914|
|Innovation, Universities and Skills||N/A||N/A||274|
|CLG Communities and Local Government||1,282||621
|Law Officers’ Departments||9||9||N/A|
|Foreign and Commonwealth Office||-28||-28||-7|
|Energy and Climate Change||2,403||2,403||4,589|
|Business, Enterprise and Regularity Reform||1||N/A||819|
|Environment, Food and Rural Affairs||0||0||-52|
|Culture, Media and Sport||3,890||3,890||3,911|
|Work and Pensions||135,344||135,344||135,546|
|Northern Ireland Executive||6,711||6,104||8,142|
|Northern Ireland Office||395||396||264|
|Children, Schools and Families||5,519||5,519||5,634|
|Innovation, Universities and Skills||0||N/A||2,123|
|CLG Communities and Local Government||7,233||7,112
|Law Officers’ Departments||8||9||12|
|Foreign and Commonwealth Office||226||227||218|
|Energy and Climate Change||1,667||1,667||1,688|
|Business, Enterprise and Regularity Reform||2||N/A||16|
|Environment, Food and Rural Affairs||609||610||618|
|Culture, Media and Sport||841||842||791|
|Work and Pensions||91||91||57|
|Northern Ireland Executive||1,233||1,233||1,137|
|Northern Ireland Office||67||68||78|
|Children, Schools and Families||0||N/A||N/A|
|Innovation, Universities and Skills||0||N/A||4,230|
|CLG Communities and Local Government||3,766||516||–|
|Law Officers’ Departments||0||N/A||N/A|
|Foreign and Commonwealth Office||41||N/A||0|
|Energy and Climate Change||-35||-279||-419|
|Business, Enterprise and Regularity Reform||-210||N/A||N/A|
|Environment, Food and Rural Affairs||-12||1||0|
|Culture, Media and Sport||571||572||717|
|Work and Pensions||126||136||142|
|Northern Ireland Executive||611||378||413|
|Northern Ireland Office||0||N/A||N/A|
There is good agreement between some of my calculated results and the PESA reports, but there are also some significant differences between the figures.
In the “Resource DEL” and “Capital DEL” tables there is generally good agreement, except for Health, Transport, Defence, Cabinet Office and Independent Bodies departmental groups. The Cabinet Office and Independent Bodies discrepancies could be because some departmental data is missing from the COINS data (see next section). I think the differences in the values for Health are probably because the PESA reports do not include the figures for Scotland, but I haven’t checked this. I have no explanation for the differences between the figures for Transport and Defence.
The agreement for the “Resource AME” and “Capital AME” tables is less good. Some of this disagreement is for the same reasons as the “DEL” differences, but I suspect the main reason for the differences is that my SQL query for the “AME” figures is not correct. (Any corrections to my SQL queries are most appreciated, as are any bug reports or corrections to my programs.)
Missing COINS departmental data
The COINS data does not seem to have spending figures for the following government departments:
- Assets Recovery Agency
- Department for Business, Enterprise and Regulatory Reform
- Office of Communications
- Forrestry Commision
- National Investment and Loans Office
- Royal Mint
- Office of Government Commerce
- Security and Intelligence Agencies
This may because the departments have been renamed, or subsumed into other departments.
I’ve satisfied my curiosity about the COINS data. I’ve managed to recreate some of the PESA reports, and although there are differences between my reports and the PESA reports, the differences are mostly explainable. I’m not Cliff Stoll, and won’t be tracking down the exact nature of the differences. I don’t plan to do any further work on the COINS data, but may look at future data that is released, especially if it is more detailed.
Feel free to use my programs if you wish – it would nice if you credit me if you do so. I’m also interested in other programming projects that use the COINS data – feel free to post a comment with a link if you do anything interesting with the COINS data.