
it doesn't have to be that big

Published on Nov 18, 2015

A presentation to the 2015 Big Data Conference in Sydney on 9 Aug 15

PRESENTATION OUTLINE

it doesn't have to be that big

data-based decision making at the coalface
Today's conference is about big data, and I want to acknowledge its importance at the outset. It seems almost self-evident to me that the utility provided by the effective analysis and application of big data will increasingly be the key element in decision making, in and out of government, over the medium to long term.

Some of you may be aware of the recent discussion on the future of artificial intelligence and its application to autonomous weapons. While frightening to many, some of us may recall that this problem has been solved for us, in fiction at least, by the Three Laws of Robotics, devised by the science fiction author Isaac Asimov. I commend them to you.

Of course, hard-core science fiction fans, those who remember it before it was lumped in with fantasy in the bookstores, will recall other works of Isaac Asimov, especially the Foundation series. Drawing on Gibbon's magisterial The History of the Decline and Fall of the Roman Empire, Asimov tells the story of a decaying galactic empire that can only be saved by the science of psychohistory - the use of data, huge quantities of data, collected from observations of human history and behaviour over a hyperextended time scale. What was psychohistory if not big data on a galactic scale?

While we might be moving toward the galactic scale for big data, we are not there yet. The purpose of my presentation this morning is to demonstrate that even little data can advance the functions of government, giving us a taste of where we can go. After all, wasn't the smallest droid, R2-D2, the one who made the most difference?

Scope

  • Evidence Based Policy
  • Open Data & Transparency
  • Myth Busting
  • Better Value
  • Looking Forward
In today's presentation, I will cover the themes on this slide.

prove it

evidence based policy
Government isn't easy. The problems faced every day at the federal level range from the hard to the wicked. The set of policy choices is large, the combinations and permutations are many, and the consequences can be dramatic.

What can we do to help us make the right decisions? The answer is evidence-based policy making.

This isn't a new idea. Six years ago, the Productivity Commission was already doing work on it http://www.pc.gov.au/research/completed/strengthening-evidence

This work is worth reading, if for nothing else, for its revelation that the liberalisation of abortion in the 1970s had a greater effect on reducing crime in New York under Mayor Rudolph Giuliani in the late 1990s than did his policing policies.

Of course, the retrospectoscope is a powerful tool. The challenge for us in government is to develop lead indicators of policy success or, at least, faster analysis methodologies for lag indicators. This is the role we need big data, or even little data, to fill now.
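What might testing a candidate lead indicator look like in practice? The sketch below is a minimal illustration, with invented series names and figures, of one common approach: correlate the candidate indicator against the outcome at increasing leads and see where the relationship peaks.

```python
import pandas as pd

# Hypothetical monthly series: a candidate lead indicator (job ads)
# and an outcome (employment) that we only observe with a lag.
df = pd.DataFrame({
    "job_ads":    [120, 135, 150, 160, 155, 170, 180, 175, 190, 200, 195, 210],
    "employment": [1000, 1005, 1012, 1025, 1038, 1041, 1055, 1070, 1068, 1082, 1095, 1093],
})

# Correlate the indicator against the outcome k months ahead; a peak
# at k > 0 suggests the indicator leads the outcome by k months.
for k in range(4):
    r = df["job_ads"].corr(df["employment"].shift(-k))
    print(f"lead of {k} months: r = {r:.2f}")
```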
Photo by Me2 (Me Too)

in the public interest

open data and transparency
The history of open data can be traced back to when I was born, in 1958 https://en.wikipedia.org/wiki/Open_data

Its recent emphasis owes a lot to work in the open government sphere. I like to think of this work as having several different themes - the first of which is transparency about government.

The second is about putting the data government collects to better use. This is a form of recycling.

Data.gov.au demonstrates what we have been doing in this space recently http://www.finance.gov.au/taxonomy/term/1274/

One of our most significant stores of such data is 16 years of contract data from AusTender. http://data.gov.au/dataset/historical-australian-government-contract-data
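For those who want to get at that data programmatically: data.gov.au is built on the open-source CKAN platform, so a dataset's metadata, including links to the downloadable files, can be fetched over the standard CKAN action API. A minimal sketch, assuming the usual CKAN endpoint layout:

```python
import requests

# data.gov.au runs on CKAN; package_show returns a dataset's metadata,
# including its downloadable resources (e.g. CSV files).
API = "https://data.gov.au/api/3/action/package_show"
resp = requests.get(API, params={"id": "historical-australian-government-contract-data"})
resp.raise_for_status()

for res in resp.json()["result"]["resources"]:
    print(res.get("format"), res.get("url"))
```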


Photo by Pedro Vezini

that's not right

MYTH BUSTING with Data
In government procurement, among a range of activities, we are using open data to bust myths critical of policy and practice http://www.finance.gov.au/taxonomy/term/1456/
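As an illustration of what this looks like in practice, the sketch below tests one common myth, that a handful of suppliers win almost all contract value, against the published contract data. The file and column names here are hypothetical; the real AusTender export headers will differ.

```python
import pandas as pd

# Hypothetical file and column names for the published contract data.
contracts = pd.read_csv("austender_contracts.csv")

# Myth to test: "a handful of suppliers win almost all the value."
by_supplier = contracts.groupby("supplier_name")["contract_value"].sum()
top10_share = by_supplier.nlargest(10).sum() / by_supplier.sum()
print(f"Top 10 suppliers' share of total contract value: {top10_share:.1%}")
```

Whether that share turns out to be high or low, the point is the same: the claim stops being folklore and becomes a number anyone can check.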
Photo by pixbymaia

chasing better value

saving taxpayers' money
One of the major tasks of the Department of Finance is pursuing the efficient use of taxpayers' money. And one of the ways we do this is by conducting coordinated procurements. http://www.finance.gov.au/tpd/

The process we use for developing coordinated procurements focuses heavily on evidence, and thus on data. We begin with a somewhat informal pre-scoping study, designed to determine whether an opportunity might exist. If it does, we begin a more formal scoping study, modifying agencies' procurement behaviour in order to determine the best way to approach the issue.

All of these activities require the collection of detailed data.

Once a procurement arrangement is under way, we also continue to collect data.
Photo by nur_h

travel savings

Travel data is a good example of this work.

Each trip is, in effect, a best and final offer contest between vendors. Our lowest practical fare rule for domestic travel and our best fare of the day rule for international travel drive the behaviour of travellers and require competitive pricing from vendors.

We use the data collected to determine whether we are making the best use of the arrangement and to identify opportunities for further savings. Monitoring the data lets us address issues with particular routes, informs future procurement behaviour, and allows us to better structure our arrangements.
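A minimal sketch of that kind of route-level monitoring, using hypothetical booking records (the real data fields differ):

```python
import pandas as pd

# Hypothetical booking records: the fare actually paid versus the
# lowest practical fare available at booking time, for each route.
bookings = pd.read_csv("travel_bookings.csv")  # columns: route, fare_paid, lowest_fare

bookings["missed_saving"] = bookings["fare_paid"] - bookings["lowest_fare"]
by_route = (bookings.groupby("route")["missed_saving"]
            .agg(["sum", "mean", "count"])
            .sort_values("sum", ascending=False))
print(by_route.head())  # routes where the fare rules are least well observed
```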

We don't just collect quantitative data, either. We consult widely before we go to market for a new or refreshed activity to ensure we get the best outcome.
Photo by Henrik Sjodin

software and hardware

In ICT we have achieved significant procurement outcomes. Detailed analysis of our Microsoft software usage allowed us to save $205m over seven years, first by understanding our usage and then by negotiating the best possible deal, using models we developed to show how the various options would play out.
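To give a flavour of that modelling, not our actual model, here is a toy sketch: cost each licensing option against measured usage and compare the totals over the term. Every figure below is invented for illustration.

```python
# Toy comparison of licensing options over a multi-year term.
# All usage counts and prices are invented, not the real figures.
usage = {"desktop_users": 90_000, "server_cals": 60_000}

options = {
    "per_seat": lambda u: u["desktop_users"] * 450 + u["server_cals"] * 120,
    "enterprise_agreement": lambda u: 38_000_000,  # flat annual fee
}

YEARS = 7
for name, annual_cost in options.items():
    total = annual_cost(usage) * YEARS
    print(f"{name}: ${total / 1e6:.0f}m over {YEARS} years")
```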

In internet-based network connections, we've saved $90m against a target of $54m through better procurement based on data collection and analysis.

My colleagues across Finance also use benchmarking data to examine the state of government ICT http://www.finance.gov.au/sites/default/files/Australian%20Government%20ICT...
Photo by Daniel Dionne

where to now?

Looking Forward
I have demonstrated how we are using data now. The challenge in the future will be multifaceted:
- how will we cope with the amount of data the Internet of Things will provide?
- how will we integrate non-quantitative data sets, particularly unstructured data, into our data analysis activities?
- how will we move from the analysis of lag indicators to being predictive, using real-time analysis of lead indicators?
- how will we know, quickly, what is good and bad data?
- and where will we get the data scientists we need for this work?
Photo by enigmabadger

ANY questions

it doesn't have to be that big
Questions from the floor
Photo by CDRaff