West Midlands Police strive to get offender prediction system ready for implementation

Police forces across England and Wales are poised to develop AI models to assess reoffending risk for convicted offenders. Bearing the brunt of a wave of criticism, partly due to its laudable efforts to be more transparent than other forces, is West Midlands Police (WMP). It is building such a system right now and expects it to kick off later in the year – but only if it can accommodate the concerns of its ethics committee. 

One key concern of the ethics committee is that the new model could wrongly label innocent ex-offenders as ‘harmful’. Ethical scrutiny of the system appears thorough. The ethics committee, established and overseen by Tom McNeil, a strategic adviser to the West Midlands Police and Crime Commissioner (PCC), has so far given the system a hard time. It rejected two previous AI models proposed this year. 

Machine learning model accuracy is defined by multiple factors. One is how reliably the model identifies true positives. Another is how accurately it identifies true negatives.

According to a statement made to E&T by Davin Parrott, a principal data scientist at WMP, the statistical ‘sensitivity’ of the latest model is 75 per cent. Sensitivity describes the proportion of actual positive cases that are predicted as positive. Parrott’s statement implies that WMP’s current prototype misses one in four ex-offenders who will go on to reoffend. Although the model has not been ‘base-tested’, the figure raises questions over accuracy, and over whether a design that tolerates missed reoffenders (false negatives) in order to avoid false accusations (false positives) could fall foul of public expectations. 

Parrott explains that given the ‘imbalances’ in the data, he and his team were more concerned with avoiding falsely identifying people as high harm when they are not: “so we placed greater weight on the specificity than on the sensitivity”. 

This is mirrored in the model’s accuracy in predicting actual negative cases correctly as negative, expressed as ‘specificity’. At 99.5 per cent, it is nearly perfect: in other words, the model wrongly flags roughly one person as ‘high-harm’ out of every 200 ex-offenders who will not reoffend.
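
To make the two figures concrete, here is a minimal sketch (illustrative only, not WMP’s code) of how sensitivity and specificity are computed from a confusion matrix; the counts below are invented so that they reproduce the 75 per cent and 99.5 per cent values quoted above.

```python
# Illustrative only: hypothetical confusion-matrix counts, not WMP data.
# Sensitivity = TP / (TP + FN): share of actual reoffenders flagged as 'high harm'.
# Specificity = TN / (TN + FP): share of non-reoffenders correctly left unflagged.

true_positives = 75      # reoffenders the model flags
false_negatives = 25     # reoffenders the model misses (1 in 4 at 75% sensitivity)
true_negatives = 1990    # non-reoffenders correctly left unflagged
false_positives = 10     # non-reoffenders wrongly flagged (1 in 200 at 99.5% specificity)

sensitivity = true_positives / (true_positives + false_negatives)
specificity = true_negatives / (true_negatives + false_positives)

print(f"sensitivity: {sensitivity:.1%}")   # 75.0%
print(f"specificity: {specificity:.1%}")   # 99.5%
```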

However, it remains uncertain how the model will perform under real-life conditions; so far, it has not been piloted. Current policy foresees that if model accuracy drops below a certain threshold – predictions of high harm must be at least 20 per cent better than selecting people at random – the model would need to be altered or the project stopped, says McNeil.
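
How such a threshold might be checked can be sketched as follows. This is an illustration under an assumed reading of the policy – that the hit rate among people the model flags as high harm must exceed the base rate achieved by random selection by at least 20 per cent – and the function and numbers are hypothetical, not WMP’s monitoring code.

```python
# A minimal monitoring sketch, assuming '20 per cent better than random' means
# the hit rate among flagged people must exceed the overall base rate by 20%.
# The interpretation, function name and figures are illustrative assumptions.

def passes_monitoring_threshold(flagged_outcomes, all_outcomes, margin=0.2):
    """Outcomes are lists of 1 (went on to commit high-harm reoffending) or 0."""
    base_rate = sum(all_outcomes) / len(all_outcomes)            # random-selection hit rate
    model_rate = sum(flagged_outcomes) / len(flagged_outcomes)   # hit rate among flagged people
    return model_rate >= base_rate * (1 + margin)

# Hypothetical numbers: 5% base rate, 40% hit rate among those flagged -> passes.
print(passes_monitoring_threshold([1]*4 + [0]*6, [1]*5 + [0]*95))
```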

Constant monitoring is laudable, but it also raises serious questions about whether the accuracy bar should be higher. 

How West Midlands Police is battling bias

In E&T’s conversations with architects of the WMP crime prediction system, the issue of bias came up early. Highlighted by the media and experts, criticism so far centres on racial and demographic bias, but Parrott rebuts this confidently, saying he and his team are doing everything to prevent it.

That said, the emphasis on eradicating bias appears to be a fairly recent endeavour by the team. It was only decided this year that the model must ignore the ethnicity column in the stop and search (SAS) data it uses. 

In fact, the team ran the model with and without the ethnicity data and found that including it did not improve the accuracy of the output, so they agreed to omit it.

Location data, which can act as a proxy for ethnicity, income status and other demographics, has also been dropped from the SAS data. In addition, the team will only use data points where arrests took place. 
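
The with-and-without comparison described above amounts to a simple ablation test. The sketch below illustrates the idea under stated assumptions – stop and search records already encoded as numeric features in a pandas DataFrame with a binary ‘high_harm’ label, scikit-learn as the modelling library – and the column names, classifier and synthetic data are illustrative, not WMP’s actual pipeline.

```python
# A minimal ablation sketch: compare cross-validated accuracy with and without
# sensitive columns, and omit them if they add nothing. Column names, the
# classifier and the synthetic data are illustrative assumptions.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def score_without(df: pd.DataFrame, dropped_columns: list) -> float:
    """Cross-validated accuracy after dropping the named feature columns."""
    features = df.drop(columns=["high_harm"] + dropped_columns)
    labels = df["high_harm"]
    model = RandomForestClassifier(n_estimators=200, random_state=0)
    return cross_val_score(model, features, labels, cv=5).mean()

# Synthetic stand-in data, purely for illustration.
rng = np.random.default_rng(0)
sas_df = pd.DataFrame({
    "prior_arrests": rng.integers(0, 10, 1000),
    "age": rng.integers(18, 60, 1000),
    "ethnicity": rng.integers(0, 5, 1000),   # sensitive column under test
    "location": rng.integers(0, 50, 1000),   # potential proxy variable
    "high_harm": rng.integers(0, 2, 1000),
})

baseline = score_without(sas_df, [])                           # all columns
reduced = score_without(sas_df, ["ethnicity", "location"])     # sensitive columns dropped
print(f"with sensitive columns:    {baseline:.3f}")
print(f"without sensitive columns: {reduced:.3f}")
```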

Others remain sceptical about whether such efforts are enough. Alexander Babuta, the author of a report on data analytics and algorithmic bias in policing launched last week by the Royal United Services Institute (RUSI), commented that using only parts of the data raises a separate issue. 

The problem with using police-reported crime data is that it gives only an incomplete representation of how crime is patterned, he said. “In many cases, you use arrest data to predict future crime. But arrest [data] is only a small proportion of future crime that occurs. The data would inevitably be skewed in terms of where the police chose to focus their resources on in the past. That can introduce certain biases. The prediction model could replicate or amplify those biases”.

In the WMP case, this raises questions about how the force’s data scientists can address the problem when limited to police data alone.

E&T’s evaluation suggests that imbalances remain, despite attempts by the AI architects to account for ethnicity and proxy variables. It might be hard to eliminate bias completely. Some of the latest technical approaches to removing bias from machine learning – not used by WMP – are more sophisticated. One is a “unique capability to use associative data structures [which] allows the system to run algorithms on the entire set of data, and to only use users’ analysis intent as the context, and not to limit the data”, according to Elif Tutuk, director of research at analytics company Qlik.

Regarding WMP’s prediction model, “the fact that non-arrest stop and search data was even considered is a red flag”, said Big Brother Watch investigator Griff Ferris. “That it was initially used is really concerning. Even stop and search data of arrests alone is likely to be extremely biased, based on the discriminatory use of the tactic.”

Those investigating infringements of human rights, like Ferris, have a fundamental worry about the use of crime prediction systems to reverse the presumption of innocence based on statistical prediction, “without any actual breaking of the law”. 

More specifically, experts like Ferris express concerns about the nature of the data. This involves two separate issues. Firstly, low-level drug addiction – reflected in data such as that from the drug intervention programme (DiP) – is essentially a mental health problem. Making such data the basis of crude police interventions triggered by a model’s prediction is worrying in the context of human rights. 

Despite the ethical concerns, McNeil from the WMP ethics committee can imagine use-cases where the use of data is vindicated. “[We] see that as pertinent and relevant data. A huge part of our PCC (police and crime commissioner) strategy is acknowledging the role of drug addiction in crime. We want to move the system towards a more public-health approach to support people away from crime. Therefore [we are] strongly pushing the treatment agenda and treatments away from prison. But if it were used to give people longer sentences, we would hate it”. 

Secondly, there is a concern that the force could seek to feed other data to its model – beyond its pledge to only use police datasets. Such concerns are warranted. Previously, WMP stated its intent to partner with public bodies to exchange data.

There are also other examples where data is already shared. At Bristol Council’s integrated analytics hub, data is drawn from various sources, including police data, in order to identify vulnerable individuals in need of safeguarding, said Babuta at RUSI.

But WMP intends to stick to using police data, for now at least. McNeil told E&T that “we are [a] very, very long way away from [partnerships with other public bodies] happening in practice – certainly in terms of health data.”

Using ‘other’ data from public bodies in police AI triggers another concern: “We know that people from poorer socio-economic backgrounds engage with public services more frequently. That is not because they are more involved with crime. They are just more likely to have more contact with services because they depend on those services more, e.g. housing, social services etc. Then if you are looking for risk [of reoffending] you have more data on those people. Your algorithm is probably going to identify those groups more often than people of less disadvantaged background,” Babuta argues.

It would all boil down to the way interventions by offender managers are structured, according to the ethics committee. The way the prediction system would be used, should it pass the scrutiny of the committee, lies within the workflow of offender managers.

Offender managers are a combination of well-vetted police officers and police staff. Their job is less about law enforcement than about keeping in touch with those who could be at risk of offending. “It is a form of probation, really,” McNeil said. At present, offender managers at WMP already use a scoring system, but it is “not hugely sophisticated” and not widely used, he said. The limited uptake of the existing scoring system raises questions about how well a new crime prediction system would be adopted.

Experts also question what legal responsibility the dashboard provider bears. Qlik Sense, an analytics product from US-headquartered tech company Qlik, will serve as WMP’s online dashboard. Offender managers will use it to receive the probability scores the AI model produces on the likelihood of ex-offenders reoffending. 

Qlik was asked by E&T whether it is aware of the complexities of WMP’s offender system. A spokesperson told us the firm’s software is used to help customers visualise and analyse their data sources: “Customers decide what data to use and how best to deploy the Qlik software.” 

But shirking responsibility could prove tricky. Babuta from RUSI finds it concerning that Qlik’s response takes no account of legal requirements. “Things like data protection [would need to be considered]. They would need to do a data protection impact assessment. If you are handling personal data, especially sensitive personal data, to what extent is that data shared and to whom is it made available.”

Communication problems with the public  

When talking to E&T, senior representatives at WMP criticised the way the media covered the unveiling of WMP’s reoffender crime prediction model. “There has been some misrepresentation. Some of that has been fuelled by the early conceptual discussions where we didn’t think about whether we would limit ourselves to police data or not,” Chris Todd, Chief Superintendent at West Midlands Police, told E&T with regard to the communication around NDAS (a different system from the offender prediction model WMP is developing right now). 

The dangerous lesson is that WMP’s goodwill in being transparent may have proven bad for its public relations. Despite the force publishing every set of its ethics committee minutes on its website, waves of critical media reports are still reverberating online. It could have chosen to be far more opaque with the public.

Crucially, there is evidence that other police forces may hesitate to be as transparent as WMP. Not long ago, Biometrics Commissioner Paul Wiles published a statement on automated facial recognition in which he highlighted the lack of transparency in how trials are being conducted. 

This raises questions about whether there is enough incentive for police forces to be fully transparent and whether legislation is needed to regulate disclosure. So far, there is no standard, unambiguous regulation on how police forces must share details of police AI models. As more police forces seem poised to experiment with their own reoffender AI models, this could become a more pressing problem for policymakers.

There is also a political concern. McNeil worries that if the present PCC steps down in May 2020 (which is expected) and a new PCC from a different political party takes over, he or she may “not place ethics as high on the agenda” as he and his team do. 

“We are conscious that there is an election coming up in 2020,” he continues. “I could go. That is why we, as an office, are really keen to see some national leadership, because our group could disappear in theory.” 

Ben Heubl, E&T News

Source: https://eandt.theiet.org/content/articles/2019/09/ai-offender-prediction-system-at-west-midlands-police-examined/
