A four-step approach to collaborative data projects

by Eddie Copeland

Having absorbed at least some of the functions of what would have been a London Office of Data Analytics, the London Office of Technology and Innovation (LOTI) has a major role to play in supporting public sector data collaboration.

LOTI already has projects underway that aim to improve information governance across London’s public sector, and to inject the boroughs’ perspective into the GLA’s discovery phase for a new London City DataStore.

We hope these projects will make it easier to share data legally, ethically and securely. Yet, perhaps more fundamentally, it’s important for public sector organisations to understand when and how data can help.

In this article (which is an update to a blog I wrote at Nesta in April 2018), I sketch out a four-step method for projects that involve using data from multiple public sector bodies.

The four-step method

I’ve previously summarised the process of determining whether data can help address a specific problem or question in the following four steps.

Can I address this problem with data? Diagram by Eddie Copeland originally created at Nesta: bit.ly/IdeaOnAPage

Let’s explore each step in turn and unpick why the order matters.

1 — Specific Problem

Public sector organisations face many large and complex problems. When stated in general terms like “Homelessness is growing in our area” or “Our current approach to adult social care is unsustainable”, it’s hard to identify specific ways in which data (or any other tool or method for that matter) might help lead to a better outcome.

It’s therefore important to be clear on what specific, actionable problem or question you would like to solve or answer.

To do that, it’s helpful to be aware of five problem types that lend themselves to data-enabled solutions.

The following list comes from the analytics team in New Orleans’ Office of Performance and Accountability (see their version here):

  1. Targets are difficult to identify within a broader population
  2. Services do not categorise high priority cases early
  3. Resources are overly focused on reactive services
  4. Repeated decisions made without access to all relevant information
  5. Assets or staff are scheduled or deployed without the input of latest service data

Do any of these problem types describe issues you face?

2 — Defined action

I often speak to organisations who say they hope to tackle a problem by “getting the data together and seeing what it shows”.

Within just one team or organisation, that may sometimes work. But for collaborative data projects, I’d advise against this approach.

Instead, ask: What would you do differently if you had better information (or all the information in the world) on the problem you’re trying to solve?

Remember, data is not the intervention; it’s just a means to the end you want to achieve. So what do you want to achieve?

The table below offers example opportunities that relate to the problem types outlined in step 1. Again, this comes from the New Orleans analytics team:

NOLAlytics Solving real-world problems with data — https://datadriven.nola.gov/nolalytics/

Being clear about what you want to do is vital for at least three reasons:

  1. If your solution will eventually require using any personal or sensitive information sourced from more than one team or organisation, you’ll need to be able to state exactly why you need it and what you’ll do with it in order to identify a specific legal gateway to permit the sharing and processing of that data under the terms of GDPR (see more in step 4).
  2. If you can’t think of anything your organisation and its partners could do differently even if you had unlimited information, you risk investing a lot of time and resources for little result by proceeding further. (I say this based on experience. In late 2016, Nesta worked with local authorities across the North East of England to explore how their collective data might help them better understand and tackle issues related to alcohol abuse. After six months of collecting and visualising data on a regional map, local public sector organisations realised they couldn’t determine how it would enable them to do anything differently.)
  3. To look at point 2 a different way, you may also realise that a lack of relevant data is not the real barrier you face. Perhaps you need more time, staff or money. Or perhaps you’d be better served by a different innovation approach, like those outlined below.
Nesta Landscape of Innovation Approaches: https://www.nesta.org.uk/blog/landscape-of-innovation-approaches/

I’ll say it once more. Don’t skip this step!

3 — Data Product

Once you’ve established what you’d like to do differently, it can be tempting to jump straight to looking at what data you need.

But pause for a moment. This step, which I learned from Mike Flowers, can help make that conversation much more focused and productive.

Ask: What would a person need to see on a screen in order to enable the actions defined in the previous step?

It’s unlikely that whoever is doing the action (e.g. a frontline worker or service manager) will want a spreadsheet or raw data. Instead they’ll want the data conveyed in a more intelligible way that provides real insight. That’s what we mean by a ‘data product’.

A data product could be a map, a heatmap, a prioritised list, a dashboard, an alert and so on. Certain data products are suited to certain problem and opportunity types.

NOLAlytics Solving real-world problems with data — https://datadriven.nola.gov/nolalytics/

You might like to try to visualise this by drawing a specific person in a specific place, doing a specific thing based on specific information. (You get the gist… be specific.)

This way, you can start to see whether an insight from a particular data product could enable one or more of the actions you outlined in step 2. By clearly defining your data product, it will be much easier to identify what data you need in the fourth and final step.
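
As one illustration, here is a minimal Python sketch of a prioritised list, one of the data product types mentioned above. The case records, field names and scoring weights are hypothetical and invented purely for illustration; a real product would use criteria agreed with the people who will take the action defined in step 2.

```python
from dataclasses import dataclass


@dataclass
class Case:
    # Hypothetical fields, for illustration only
    reference: str
    days_since_last_inspection: int
    open_complaints: int


def priority_score(case: Case) -> int:
    # Assumed weighting: one open complaint counts as much as
    # 30 days without an inspection
    return case.days_since_last_inspection + 30 * case.open_complaints


def prioritised_list(cases: list[Case], top_n: int = 10) -> list[Case]:
    """Return the top_n cases, highest priority score first."""
    return sorted(cases, key=priority_score, reverse=True)[:top_n]


cases = [
    Case("A-001", days_since_last_inspection=400, open_complaints=0),
    Case("A-002", days_since_last_inspection=90, open_complaints=5),
    Case("A-003", days_since_last_inspection=30, open_complaints=1),
]

# What a frontline worker might see on screen: a short, ordered list
for case in prioritised_list(cases, top_n=2):
    print(case.reference, priority_score(case))
```

Notice that writing even a toy version forces the questions this step is meant to raise: who reads the list, how long should it be, and which fields does the score need?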

4 — Accessible data

In this step, ask: What data do you need to create the data product, does it exist, can you get it, and can you use it?

Data can come from many different sources.

It’s sensible to first explore whether you can create your data product using open data. If it’s open, you can skip much of the complexity relating to information governance outlined below.

If open data doesn’t meet your needs, look at what data is held by your organisation and other public sector organisations. Further sources of data may be businesses, universities, charities or even citizens themselves.

If the data you need doesn’t exist, or is held by an organisation that’s unwilling to share it, you may wish to consider:

  1. Are there other datasets that might contribute a similar type of information, or act as a proxy measure?
  2. Could you start collecting this data so that analysis becomes more feasible in future?

Unless all the data is open data, you’ll need to follow a robust process to determine whether it’s legal, ethical and secure to use it. It’s well worth engaging with your Information Governance Lead as early as possible to get their expert advice on how to do this.
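
The questions in this step can be recorded in a simple inventory. The Python sketch below is one hypothetical way to do that: each dataset the data product needs is listed with whether it exists, whether you can get it and whether it is open data, and a helper reports the blockers and the datasets that need an information governance process. The dataset names and field names are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class DatasetNeed:
    name: str
    exists: bool
    obtainable: bool  # the holder is willing and able to share it
    is_open: bool     # published as open data


def review(needs):
    """Summarise blockers and which datasets need governance checks."""
    missing = [n.name for n in needs if not n.exists]
    unavailable = [n.name for n in needs if n.exists and not n.obtainable]
    # Anything obtainable that is not open data needs the legal,
    # ethical and security checks described in this step
    needs_governance = [
        n.name for n in needs if n.exists and n.obtainable and not n.is_open
    ]
    return {
        "missing": missing,
        "unavailable": unavailable,
        "needs_governance": needs_governance,
    }


needs = [
    DatasetNeed("published licensing register", exists=True, obtainable=True, is_open=True),
    DatasetNeed("council tax records", exists=True, obtainable=True, is_open=False),
    DatasetNeed("partner agency referrals", exists=True, obtainable=False, is_open=False),
    DatasetNeed("tenant survey responses", exists=False, obtainable=False, is_open=False),
]

print(review(needs))
```

Datasets listed as missing or unavailable are the ones to reconsider using the two questions above: is there a proxy, or could you start collecting the data now?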

Beyond that, the first step is to carry out a Data Protection Impact Assessment (DPIA, sometimes called a Privacy Impact Assessment or ‘PIA’). A DPIA is a standard series of screening questions that guides you through the potential risks and benefits of sharing personal data. The DPIA also prompts you to develop mitigation strategies to minimise potential downsides of information sharing.

This editable DPIA is provided by the Information Commissioner’s Office (ICO).

If you must use personal data, an important step is to identify the legal gateways that grant your organisation the permission or authority to pursue certain objectives which could be supported by the sharing of personal data. You must have a valid, lawful basis in order to process personal data.

There are six available lawful bases for processing (Consent, Contract, Legal Obligation, Vital Interests, Public Task, Legitimate Interests). No single basis is ‘better’ or more important than the others. Which basis is most appropriate to use will depend on your purpose and relationship with the individuals to whom the data relates.

If the data is personal, it may be possible to remove personally-identifiable information and aggregate the data to reduce the risks associated with using it. Good guidance on data anonymisation and pseudonymisation is available in the Research Ethics Guidebook.
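
To make those two techniques concrete, here is a minimal Python sketch using only the standard library. It replaces an identifier with a keyed hash (pseudonymisation) and counts records per area while hiding small counts (aggregation). The key, the threshold of 5 and the `ward` field are assumptions for illustration, not recommendations; the right choices should be agreed with your Information Governance Lead. Bear in mind that pseudonymised data generally still counts as personal data.

```python
import hashlib
import hmac
from collections import Counter

# Hypothetical key: in practice, store it securely and never hard-code it
SECRET_KEY = b"replace-with-a-securely-stored-key"
# Assumed suppression threshold, to be agreed locally
MIN_CELL_SIZE = 5


def pseudonymise(identifier: str) -> str:
    """Replace an identifier with a keyed hash so records can still be linked."""
    return hmac.new(SECRET_KEY, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def aggregate_by_area(records):
    """Count records per ward, replacing small counts with None."""
    counts = Counter(record["ward"] for record in records)
    return {
        ward: (n if n >= MIN_CELL_SIZE else None)
        for ward, n in counts.items()
    }


# Invented example records
records = [{"person_id": f"P{i}", "ward": "Ward A"} for i in range(6)]
records += [{"person_id": f"Q{i}", "ward": "Ward B"} for i in range(2)]

pseudonymised = [
    {"person_key": pseudonymise(r["person_id"]), "ward": r["ward"]} for r in records
]
summary = aggregate_by_area(pseudonymised)
print(summary)
```

The same input always produces the same keyed hash, so records about one person can be linked across datasets without revealing the original identifier to the analyst.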

If there is a legal way to share the data, organisations should ask themselves about the ethics of doing so. (“Just because you can share it, should you share it?”) The Open Data Institute’s Data Ethics Canvas and UK Statistics Authority’s Data Ethics Self-assessment are useful tools for this purpose.

The next step is to set out a common set of rules and conditions for sharing the data in the form of an Information Sharing Agreement (ISA). The typical elements to be covered in an ISA are:

  • The purpose of the sharing
  • The potential recipients and the circumstances in which they will have access
  • The exact data to be shared
  • Data quality — accuracy, relevance, usability, etc.
  • Data security
  • Retention of shared data
  • Individuals’ rights — procedures for dealing with access requests, queries and complaints
  • Review of effectiveness/termination of the sharing agreement
  • Sanctions for failure to comply with the agreement or breaches by individual staff
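
The elements above can double as a checklist when reviewing a draft agreement. The Python sketch below is a hypothetical illustration of that idea: the section keys are simply my shorthand for the list above, not a standard schema, and the draft content is invented.

```python
# Shorthand keys for the typical ISA elements listed above (not a standard schema)
REQUIRED_SECTIONS = [
    "purpose",
    "recipients_and_access",
    "data_to_be_shared",
    "data_quality",
    "data_security",
    "retention",
    "individuals_rights",
    "review_and_termination",
    "sanctions",
]


def missing_sections(draft: dict) -> list:
    """Return the required sections that are absent or left blank in a draft."""
    return [
        section
        for section in REQUIRED_SECTIONS
        if not (draft.get(section) or "").strip()
    ]


# Invented draft with two sections filled in and one left blank
draft_isa = {
    "purpose": "Identify households eligible for support",
    "data_to_be_shared": "Address and tenancy start date",
    "retention": "   ",
}

print(missing_sections(draft_isa))
```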

(As part of our project to standardise Information Governance across London’s public sector, members of LOTI intend to trial the use of the Information Sharing Gateway, a tool that helps standardise the process of creating, reviewing and agreeing ISAs.)

The diagram below shows how you may have to adapt your data product based on what data you can use.


Conclusion

I’ll close by pointing out that while the order of the four-step process is important, it’s visualised as a circle for a reason. You’ll often need to run through the steps multiple times to find a viable match between a problem you want to solve and a data product that enables a real solution.

Does this overview resonate with your work? I’d welcome readers’ thoughts and comments.

Find me on Twitter.

Thank you to Hilary Simpson, Michelle Eaton, Camilla Bertoncin, Mike Flowers and Oliver Wise, whose ideas have significantly influenced this article.
