Monday, July 25, 2016

ProPublica exposes racial bias in predictive analytics



A recent story from the nonprofit in-depth journalism site ProPublica quotes a warning issued in 2014 by then-Attorney General Eric Holder to the U.S. Sentencing Commission. His warning concerned a fad spreading through the criminal justice system. Said Holder:

Although these measures were created with the best of intentions, I am concerned that they inadvertently undermine our efforts to ensure individualized and equal justice. They may exacerbate unwarranted and unjust disparities that are already far too common in our criminal justice system and in our society.

The fad that so concerned Holder is, of course, predictive analytics; the same fad spreading through child welfare.
Now, ProPublica has found that Holder was right.
ProPublica looked at 7,000 cases in Broward County, Fla., which uses a secret algorithm created by a for-profit company to assign risk scores to people arrested in that county, much as Los Angeles County plans to use a secret algorithm from a for-profit company in its child abuse investigations.
According to the story, when it came to predicting violent crime, the algorithm did a lousy job in general – four times out of five, people the algorithm said would commit a violent crime within two years did not.
In addition, according to the story:
• The formula was particularly likely to falsely flag black defendants as future criminals, wrongly labeling them this way at almost twice the rate as white defendants.
• White defendants were mislabeled as low risk more often than black defendants.
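
For readers who want to see what those two error rates mean concretely, here is a minimal sketch in Python – using made-up numbers, not ProPublica’s actual data – of how a “falsely flagged” rate and a “mislabeled as low risk” rate are computed for each group:

    # Illustrative sketch only: all numbers below are hypothetical, not ProPublica's.

    def error_rates(records):
        """Return (false positive rate, false negative rate) for a list of
        (flagged_high_risk, reoffended) boolean pairs."""
        fp = sum(1 for flagged, reoffended in records if flagged and not reoffended)
        fn = sum(1 for flagged, reoffended in records if not flagged and reoffended)
        negatives = sum(1 for _, reoffended in records if not reoffended)
        positives = sum(1 for _, reoffended in records if reoffended)
        return fp / negatives, fn / positives

    # Hypothetical defendants: each pair is (flagged as high risk, actually reoffended).
    black_defendants = ([(True, False)] * 40 + [(True, True)] * 30 +
                        [(False, False)] * 60 + [(False, True)] * 20)
    white_defendants = ([(True, False)] * 20 + [(True, True)] * 20 +
                        [(False, False)] * 80 + [(False, True)] * 30)

    for name, group in [("black", black_defendants), ("white", white_defendants)]:
        fpr, fnr = error_rates(group)
        print(f"{name}: falsely flagged {fpr:.0%} of non-reoffenders, "
              f"labeled low risk {fnr:.0%} of reoffenders")

In this illustration the made-up data yield a false-flag rate for black defendants twice that of white defendants, and a higher rate of white defendants mislabeled as low risk – the same pattern of disparity ProPublica reports, though with none of its actual figures.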

The company that came up with the algorithm disputes the findings, saying its own analysis of the data found no racial disparities.

Poverty is equated with risk

Since the algorithm itself is secret, we can’t be sure why the results came out racially biased.
But Prof. Sonja Starr of the University of Michigan Law School has written that the factors used to create these sorts of algorithms typically include “unemployment, marital status, age, education, finances, neighborhood, and family background, including family members’ criminal history.”

Or as Prof. Starr put it to Bloomberg Technology: “Every mark of poverty serves as a risk factor.”
Similarly, the algorithm Los Angeles plans to use for child abuse investigations includes risk factors such as whether the child has been taken often to an emergency room or whether the child often changes schools, both factors closely correlated with poverty. Perhaps that helps explain why, when tested, the Los Angeles model apparently produced false positives a staggering 95 percent of the time.
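
One reason a figure like that 95 percent is not surprising: when the outcome a model tries to predict is rare, even a reasonably accurate model will be wrong about most of the families it flags. A back-of-the-envelope sketch in Python – every number here is hypothetical, not taken from the Los Angeles test – shows the arithmetic:

    # Back-of-the-envelope sketch; all inputs are assumed, illustrative values.

    population = 100_000        # families screened
    base_rate = 0.02            # share who actually have the predicted outcome
    sensitivity = 0.80          # share of true cases the model flags
    false_positive_rate = 0.30  # share of everyone else the model flags anyway

    true_cases = population * base_rate
    non_cases = population - true_cases

    true_positives = true_cases * sensitivity
    false_positives = non_cases * false_positive_rate
    flagged = true_positives + false_positives

    print(f"Flagged families: {flagged:,.0f}")
    print(f"Share of flags that are false alarms: {false_positives / flagged:.0%}")

With these assumed numbers, roughly 95 of every 100 flagged families are false alarms – not because the model is unusually bad, but because the flagged pool is dominated by the far larger group of families who were never going to have the predicted outcome.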

There is a similar problem when it comes to the use of “criminal history.”
As The Marshall Project and the website FiveThirtyEight explain:

Heavy policing in some neighborhoods … makes low-income and nonwhite residents more likely to be arrested, whether or not they’ve committed more or worse crimes. … Even using convictions is potentially problematic; blacks are more likely than whites to be convicted of marijuana possession, for example, even though they use the drug at rates equivalent to whites.

The same, of course, is true when it comes to “reports” alleging child abuse – some communities are much more heavily “policed” by child protective services. If anything, broad, vague definitions of “neglect” that equate neglect with poverty itself make the problem even worse in child welfare. And, of course, the problem is compounded when those most loudly beating the drum for predictive analytics don’t even understand what such reports really mean.


Predictive analytics as computerized racial profiling

The parallels to child welfare don’t end there.
• In criminal justice, the use of predictive analytics is far outrunning objective evaluation. ProPublica found that evaluations were rare and often done by the people who developed the software. ProPublica had to do its own test for racial bias because, it seems, no one else has bothered.
• Predictive analytics originally was sold in criminal justice as a benevolent intervention – meant to help agencies custom tailor rehabilitation and supportive services to the needs of high-risk defendants and reduce incarceration.

But it’s quickly metastasizing into use at all stages of the criminal justice process, including, most ominously, sentencing.
So just as predictive analytics puts black defendants at greater risk of prolonged sentences, predictive analytics in child welfare puts black children at greater risk of being sentenced to needless foster care – with all of the attendant harms in terms of abuse in foster care itself and other rotten outcomes.

But wouldn’t I consider it OK to use predictive analytics just for “prevention”? asks Daniel Heimpel, publisher of the Chronicle of Social Change (the Fox News of child welfare). The criminal justice experience makes clear that it can’t be confined to that, and there is no need. Instead of targeting individuals, you can simply bring genuine, voluntary help to poor neighborhoods, getting plenty of bang for limited bucks while limiting the risk of what amounts to computerized racial profiling.