A growing number of governments and public sector organisations are adopting artificial intelligence tools and techniques to assist with their activities. This ranges from traffic light management systems that speed emergency crews through congested city streets, to chatbots that intelligently answer queries on government websites.
The application of AI that seems likely to cause citizens most concern is where machine learning is used to create algorithms that automate or assist with decision making and assessments by public sector staff. While some such decisions and assessments are minor in their impact, such as whether to issue a parking fine, others have potentially life-changing consequences, like whether to offer an individual council housing or give them probation. The logic that sits behind those decisions is therefore of serious consequence.
A considerable amount of work has already been done to encourage or require good practice in the use of data and the analytics techniques applied to it. The UK government’s Data Science Ethical Framework, for example, outlines six principles for responsible data science initiatives: 1) Start with a clear user need and public benefit; 2) Use data and tools which have the minimum intrusion necessary; 3) Create robust data science models; 4) Be alert to public perceptions; 5) Be as open and accountable as possible; and 6) Keep data secure.
Meanwhile, data protection laws mandate certain practices around the use of personal data. Not least is the EU’s General Data Protection Regulation, which comes into force on 25 May 2018, placing greater obligations on organisations that collect or process personal data. GDPR will also effectively create a “right to explanation,” whereby a user can ask for an explanation of an algorithmic decision that was made about them.
But is this enough?
Some organisations, including Microsoft, have suggested that data scientists should sign up to something akin to a Hippocratic oath for data, pledging to do no harm with their technical wizardry.
While debate may continue on the pros and cons of creating more robust codes of practice for the private sector, a stronger case can surely be made for governments and the public sector. After all, an individual can opt out of using a corporate service whose approach to data they do not trust. They do not have that same luxury with services and functions where the state is the monopoly provider. As Robert Brauneis and Ellen P. Goodman have written, “In the public sector, the opacity of algorithmic decision making is particularly problematic both because governmental decisions may be especially weighty, and because democratically-elected governments bear special duties of accountability.”
With this in mind, I have drafted below 10 principles that might go into a Code of Standards for Government and Public Sector Use of AI in Algorithmic Decision Making. This is very much a working draft. I would welcome readers’ comments and ideas for what should be added, omitted or edited. I have outlined specific questions against some principles. Please comment via Twitter, or edit this Google Docs version.
One note: below, I sometimes mention the need for organisations to ‘publish’ various details. Ideally, this would entail releasing information directly to citizens. In some cases, however, that may not be possible or desirable, such as where algorithms are used to detect fraud. In those cases, I would propose that the same details be made available to auditors who can assure that the Code is being adhered to. Auditors may come from data protection authorities, like the UK’s Information Commissioner’s Office, or from new teams formed within the established regulators of different public sector bodies.
E.g. “This algorithm assesses the probability that a particular property is an unlicensed House in Multiple Occupation (HMO) based on an analysis of the physical characteristics of known past cases of HMOs in Cardiff recorded between January 2016 and December 2017. Its objective is to assist local authority building inspectors in identifying properties that should be prioritised for risk-based inspections. Its intended impact is to increase the number of high risk properties that are inspected so that HMO licences can be enforced.”
Rationale: If we are to ask public sector staff to use algorithms responsibly to complement or replace some aspect of their decision making, it is vital they have a clear understanding of what they are intended to do, and in what contexts they might be applied. In the example given above, it would be clear to a building inspector that the algorithm does not take into account details of a property’s ownership, and is only based on past cases in Cardiff from two years’ worth of records. In this way, they can understand its strengths and limitations for informing their decisions and actions.
Questions: How do we strike the right balance between simple language and accurately describing how an algorithm works? Should we also make some requirements about specifying where in a given process an algorithm is introduced?
Rationale: Public sector organisations should prove that they have considered the inevitable biases in the data on which an algorithm was (or is continuously) trained and the assumptions used in their model. Having done this, they should outline the steps they have taken to mitigate any negative consequences that could follow, to demonstrate their understanding of the algorithm’s potential impact. The length and detail of the risk assessment should be linked to the likelihood and potential severity of producing a negative outcome for an individual.
Questions: Where are risk assessments based on the potential of negative outcomes already deployed effectively in the public sector?
Rationale: Given the rising usage of algorithms by the public sector, only a small number could reasonably be audited. By applying an Algorithmic Risk Scale (based on the risk assessment conducted for Principle 2), public sector organisations could help auditors focus their attention on instances with the potential to cause the most harm.
Questions: How would such a scale be categorised? Who could assess what level a particular algorithm should be categorised as?
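Purely as an illustration of how such a scale might be categorised: the sketch below scores each algorithm on the likelihood and potential severity of a negative outcome for an individual (the two factors named in the risk assessment rationale above) and maps the product to a band. The 1 to 5 ratings, the band thresholds and every name in the code are assumptions made for this example; they are not taken from any existing framework.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class RiskAssessment:
    """Hypothetical summary of a risk assessment for one algorithm."""
    algorithm: str
    likelihood: int  # assumed scale: 1 (rare) to 5 (almost certain)
    severity: int    # assumed scale: 1 (minor impact) to 5 (life-changing)

    def score(self) -> int:
        # Reject ratings outside the assumed 1-5 range before multiplying.
        for name, value in (("likelihood", self.likelihood),
                            ("severity", self.severity)):
            if not 1 <= value <= 5:
                raise ValueError(f"{name} must be between 1 and 5, got {value}")
        return self.likelihood * self.severity


def risk_band(assessment: RiskAssessment) -> str:
    """Map a likelihood x severity score to an illustrative band."""
    score = assessment.score()
    if score >= 15:   # assumed threshold
        return "high"
    if score >= 6:    # assumed threshold
        return "medium"
    return "low"


def audit_order(assessments: Iterable[RiskAssessment]) -> List[RiskAssessment]:
    """Sort assessments so that auditors see the highest scores first."""
    return sorted(assessments, key=lambda a: a.score(), reverse=True)


# Made-up ratings for the two kinds of decision mentioned in the introduction.
parking = RiskAssessment("parking fine triage", likelihood=2, severity=1)
probation = RiskAssessment("probation assessment", likelihood=3, severity=5)
queue = audit_order([parking, probation])
```

A multiplicative score is only one option. A scale could equally treat any life-changing outcome as high risk regardless of likelihood, which is one of the questions the scale's designers would need to settle.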
Rationale: Transparency on what data is used by an algorithm is important for a number of reasons. First, to check whether an algorithm is discriminating on inappropriate grounds (e.g. based on a person’s ethnicity or religion). Second, to ensure that an algorithm could not be using a proxy measure to infer personal details from other data (e.g. guessing someone’s religion based on their name or country of origin). Third, to ensure the data being used are those that citizens would deem acceptable, thereby supporting the Data Science Ethical Framework’s second and fourth principles to: “Use data and tools which have the minimum intrusion necessary” and “Be alert to public perceptions”.
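The second point, about proxy measures, can be made concrete with a rough check. The sketch below asks how much better one can guess a protected attribute when an input field is known than when it is not. The function name and the toy values are invented for illustration, and the check is deliberately crude: it is measured on the same records it learns from, so a real audit would need held-out data and more careful statistics.

```python
from collections import Counter, defaultdict
from typing import Hashable, Sequence, Tuple


def proxy_strength(feature: Sequence[Hashable],
                   protected: Sequence[Hashable]) -> Tuple[float, float]:
    """Return (baseline, informed) accuracy for guessing the protected attribute.

    baseline: accuracy of always guessing its most common value.
    informed: accuracy of guessing its most common value within each
              distinct value of the feature.
    A large gap suggests the feature could act as a proxy.
    """
    if len(feature) != len(protected) or not feature:
        raise ValueError("inputs must be non-empty and of equal length")

    total = len(protected)
    baseline = Counter(protected).most_common(1)[0][1] / total

    # Count protected values separately for each feature value.
    groups = defaultdict(Counter)
    for f, p in zip(feature, protected):
        groups[f][p] += 1
    correct = sum(c.most_common(1)[0][1] for c in groups.values())
    return baseline, correct / total


# Toy, made-up records: does country of origin help guess religion?
country = ["A", "A", "B", "B", "A", "B"]
religion = ["x", "x", "y", "y", "x", "x"]
baseline, informed = proxy_strength(country, religion)
```

Here the gap between `informed` and `baseline` is the signal of interest. What size of gap should count as unacceptable is a policy judgement, not something the code can decide.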
Rationale: For citizens to have recourse to complain about an algorithmic decision they deem unfair (e.g. they are denied council housing or probation), they need to be aware that an algorithm was involved. This might work in a similar way to warnings that a credit check will be performed when a person applies for a new credit card.
Questions: In what instances, if any, would this be a reasonable requirement? Does this wrongly place higher standards on algorithms than we place on human decision making?
Rationale: It has sometimes been suggested that the code of algorithms used by government and the public sector should be made open so that their logic and function can be assessed and verified by auditors.
This now seems impractical, for at least four reasons. First, the complexity of modern algorithms is such that there are not enough people who would understand the code. Second, with neural networks there is no one location of decision making in the code. Third, algorithms that use machine learning constantly adapt their code based on new inputs. And fourth, it is unrealistic to expect that every algorithm used by the public sector will be open source; some black box proprietary systems seem inevitable.
Instead, auditors should have the ability to run different inputs into the algorithm and confirm that it does what it claims. If this cannot be done in the live system, then a sandbox version running identical code should be required. Testing should focus on algorithms scored at the upper end of the Algorithmic Risk Scale outlined in Principle 3.
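A minimal sketch of what such input testing could look like, assuming the sandbox exposes the algorithm as a function that takes a record and returns a score. The auditor changes one field at a time and records every case where the output moves. The stand-in `sandbox_hmo_score`, its field names and its weights are hypothetical; it deliberately uses ownership, which the HMO example description says the algorithm does not take into account, so that the audit has something to find.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Record = Dict[str, object]
Finding = Tuple[Record, object, float, float]


def sensitivity_audit(model: Callable[[Record], float],
                      records: Iterable[Record],
                      field: str,
                      alternatives: Iterable[object],
                      tolerance: float = 1e-9) -> List[Finding]:
    """Return every case where changing only `field` changes the model output."""
    alternatives = list(alternatives)
    findings: List[Finding] = []
    for record in records:
        original = model(record)
        for value in alternatives:
            if value == record.get(field):
                continue  # same value, nothing to compare
            changed = model({**record, field: value})  # copy with one field altered
            if abs(changed - original) > tolerance:
                findings.append((record, value, original, changed))
    return findings


def sandbox_hmo_score(record: Record) -> float:
    """Hypothetical stand-in for a sandboxed algorithm under audit."""
    score = 0.1
    if record["bedrooms"] >= 5:
        score += 0.4
    if record["owner_type"] == "company":  # contradicts the published description
        score += 0.2
    return score


test_records = [{"bedrooms": 6, "owner_type": "individual"}]
findings = sensitivity_audit(sandbox_hmo_score, test_records,
                             field="owner_type",
                             alternatives=["individual", "company"])
```

A non-empty `findings` list would tell the auditor that the algorithm responds to a field its published description says it ignores. The auditor learns this without reading any source code, which is the point of testing by inputs.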
Rationale: Given the specialist skills required, most public sector organisations are likely to need to hire external expertise to develop algorithms, or pay for the services of organisations that offer their own algorithms as part of software-as-a-service solutions. In order to maintain public trust, such procurements cannot be absolved of the need to meet the principles in this code.
Rationale: This would be a powerful way to ensure that the leadership of each organisation has a strong incentive only to deploy algorithms whose functions and impacts on individuals they sufficiently understand.
Questions: Would this kind of requirement deter public sector bodies from even trying algorithms? Can and should we distinguish between an algorithmic decision and the actions that are taken as a result of it?
Rationale: On the assumption that some people will be adversely affected by the results of algorithmic decision making, a new insurance scheme should be established by public sector bodies to ensure that citizens are able to receive appropriate compensation.
Rationale: This final evaluation step is vital for three reasons. First, to ensure that the algorithm is used as per the function and objectives stated in Principle 1. Second, to help organisations learn about the strengths and limitations of algorithms so they can be improved. Third, to ensure that examples of best practice can more easily be verified, distilled and spread throughout the public sector.
I would welcome your suggestions and edits to this Code here.
Image credit: qimono on Pixabay | CC0 Creative Commons