And if you’re claiming success in reducing racial disparities by ensnaring more white children in the system instead of fewer children of color, you’re missing the point.
Pittsburgh's supposed success in reducing child welfare racial disparities consists mostly of slapping scarlet number "risk scores" on more children such as these.
“When it comes to stopping state-sanctioned violence –
whether an unjustified police shooting or child removal – shouldn’t we use the
most advanced tools at hand?” Daniel Heimpel, publisher of the Chronicle of Social Change, asks in the
conclusion of a
recent column.
Since he’s long been one of the most ardent supporters of
using predictive analytics in child welfare, [UPDATE: In a tweet, Heimpel takes issue with this characterization, which is based on my impression of years of Chronicle stories] his answer is unsurprising: “It
seems to me that predictive analytics – which has been so maligned as the
harbinger of automated racism – could actually be a key to eroding its hold.”
But the principal child welfare study Heimpel cites teaches
a very different lesson.
Whodunit vs. who might do it
Heimpel begins by suggesting that predictive analytics could
be used to find caseworkers who are racially biased – as demonstrated,
presumably, by the fact that they are outliers in the number of times they “substantiate”
alleged child abuse or neglect or remove children from families of color. He cites research showing that it is possible
to pinpoint which police officers stop and frisk African-Americans at a
disproportionate rate.
But that’s not predictive analytics. That’s just math. You’re not predicting what people are going
to do – you’re just looking at what they’ve actually done. In other words, you’re
looking for whodunit, not who might do it next week or next year. If
all the other variables, such as the nature of the allegations and the family's income, are the same, and a few workers are far more “trigger happy” about
removing children of color than most others, odds are those workers have a bias
problem.
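To make the “just math” point concrete, here is a minimal sketch, in Python, of what such an after-the-fact audit might look like. The data, the column meanings and the three-times-the-typical-rate cutoff are all invented for illustration; the point is simply that this is descriptive arithmetic over decisions already made, not a prediction of what anyone will do next.

# Hypothetical illustration of the "that's just math" point: flag workers whose
# rate of removing children of color, among otherwise comparable cases, is far
# above their peers'. This is descriptive arithmetic over past decisions,
# not a prediction about anyone's future behavior.
from statistics import median

# Invented data: (worker_id, removals of children of color, comparable cases handled)
case_history = [
    ("W01", 4, 80), ("W02", 5, 75), ("W03", 6, 90),
    ("W04", 25, 85), ("W05", 3, 70), ("W06", 5, 82),
]

rates = {worker: removals / cases for worker, removals, cases in case_history}
typical = median(rates.values())

# Flag anyone removing children at more than three times the typical rate.
flagged = [worker for worker, rate in rates.items() if rate > 3 * typical]
print(flagged)  # ['W04'] with this invented data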
Of course, there’s also an underlying assumption that child
protective services agency administrators want to find such workers and change
their behavior. It is at least as likely
that many CPS agencies would seek out and punish workers who are more cautious
than most about substantiating alleged abuse and removing children – because take-the-child-and-run
is a terrible policy for children but it’s often good politics. That’s one reason why we have foster-care panics.
In any event, predictive analytics applied to families is
very different. As I discuss
in detail here, it’s more like the dystopian sci-fi movie Minority Report.
When the images happen to be true
Heimpel writes that “The idea of using predictive analytics
in child welfare easily conjures images of child abuse investigators targeting
parents a machine deems likely to harm their children.”
Yes, it does. Because those images are accurate.
The “machine” uses a series of data points, many involving whether a family is poor, to “predict” whether that family will abuse or neglect a child in the future. But if the data points are biased – confusing poverty with neglect, for example – then the predictions are likely to be biased.
Virginia Eubanks, author of Automating Inequality, aptly calls it poverty profiling. And Prof. Dorothy Roberts, an NCCPR board member, builds on Eubanks’ analysis to show the racial bias as well.
Furthermore, when actually put into effect, these models have been shown to have enormously high rates of false positives – predicting that terrible harm would come to children when, in fact, it did not.
But what about Pittsburgh?
Heimpel cites a recent evaluation of the nation’s most
advanced predictive analytics model, one
I’ve criticized often, the Allegheny Family Screening Tool (AFST) used in
Pittsburgh and surrounding Allegheny County, Pa. For every neglect call
received by the county, AFST generates a risk score between 1 and 20 – an invisible
“scarlet number” that supposedly predicts how likely it is that a given child
will be harmed. The number then helps
call screeners decide when to screen out a call and when to send a caseworker
out to investigate.
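The county has not published the formula behind those scores, so what follows is only a schematic sketch, in Python, of how any score-driven screening step works: data points about a family go in, a number between 1 and 20 comes out, and a cutoff nudges the call toward screen-in. Every feature name, weight and threshold here is hypothetical – this is not the AFST.

# Schematic sketch only -- the actual AFST model and its inputs are not public.
# Data points about a family go in, a number from 1 to 20 comes out, and a
# cutoff nudges the screening decision. Everything here is hypothetical.

def hypothetical_risk_score(family_record: dict) -> int:
    """Stand-in for a predictive model; returns a score between 1 and 20."""
    # A real model would weight many administrative-data variables; these three
    # invented proxies are only for illustration.
    score = 1
    score += 6 if family_record.get("receives_public_benefits") else 0
    score += 5 if family_record.get("prior_hotline_calls", 0) > 2 else 0
    score += 8 if family_record.get("prior_system_involvement") else 0
    return min(score, 20)

SCREEN_IN_CUTOFF = 15  # hypothetical threshold, in the spirit of a "mandatory" screen-in range

def recommend(family_record: dict) -> str:
    score = hypothetical_risk_score(family_record)
    decision = "screen in" if score >= SCREEN_IN_CUTOFF else "screener discretion"
    return f"score {score}: {decision}"

# A family whose data points largely reflect poverty and past contact with the
# system can cross the cutoff.
print(recommend({"receives_public_benefits": True,
                 "prior_hotline_calls": 3,
                 "prior_system_involvement": True}))  # prints "score 20: screen in"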
The evaluation suggests that AFST reduced racial disparities
at one child welfare decision point – opening a case for investigation. And it did.
But in the worst possible way.
As the evaluation
itself acknowledges, this achievement was accomplished through
increases in the rate of white children determined to be in need of further child welfare intervention coupled with slight declines in the rate at which black children were screened-in for investigation. Specifically, there was an increase in the number of white children who had cases opened for services, reducing case disparities between black and white children. [Emphasis added.]
In other words, what they’re really saying in Pittsburgh is:
Great news! We’re running around
labeling so many more white parents as child abusers that we’ve reduced racial
disparities! (“Opened for services” is
a euphemism, by the way. It means the caseworker decided the allegation should
be “substantiated” and the family put under the thumb of the child protective
services agency.)
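The arithmetic behind that claimed reduction is worth spelling out. Here is a worked toy example – the numbers are invented, not Allegheny County's – showing how a Black/white disparity ratio can shrink even though not a single Black family is treated any better; all that changes is that the white screen-in rate goes up.

# Invented numbers, for illustration only -- these are not Allegheny County data.
# "Disparity" here is the ratio of the Black screen-in rate to the white rate.

def disparity(black_rate: float, white_rate: float) -> float:
    return black_rate / white_rate

# Before: Black families screened in at 40%, white families at 20%.
before = disparity(0.40, 0.20)   # 2.0

# After: the Black rate barely moves, but the white rate is pushed up to 30%.
after = disparity(0.39, 0.30)    # 1.3

print(f"disparity before: {before:.2f}, after: {after:.2f}")
# The ratio falls from 2.00 to 1.30, yet the only real change is that more
# white children have been swept into the system.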
This is rather like a child welfare system suddenly throwing
thousands more children into foster care, sending those children home after
only a few days and then saying “Great news, folks! Our average length of stay in foster care has
plummeted!”
Given all we know about the enormous harm of needless child
abuse investigations and needless foster care, the solution to racial
disparities should involve treating black families more like white families,
not the other way around.
And nowhere mentioned in the evaluation is something else
that happened after AFST was implemented – something deeply disturbing: There
was a
sharp, sudden spike in the number of children torn from their parents in
2017. In a typical year, Allegheny
County tears children from their parents about 1,000 times. In 2017 that spiked
to 1,200 before returning to 1,019 in 2018.
We don’t know if AFST contributed to the spike – the evaluation
never addresses it. But in the past the
longtime director of the Allegheny County Department of Human Services (DHS),
Marc Cherna, has taken pride in avoiding such spikes in entries. This time, there is silence.
And even the usual number of removals in Pittsburgh, about
1,000 per year, is disturbingly high. When compared to the number of impoverished children, it represents a rate-of-removal as bad as Phoenix's, which is the highest among child welfare systems in America's largest cities, and worse than Philadelphia's, which is second worst.
If anything, all this raises questions about whether Cherna, the
one-time reformer who has led Allegheny County DHS for decades, has stayed too
long.
AFST widens the net
Indeed, among the deeply disturbing findings of this
evaluation is that AFST is
widening the net of coercive, traumatic state intervention into families, with
no actual evidence that children are safer. And the results would be even
worse if not for the fact that the human beings who screen calls are “standing
up to the algorithm” more often than the county seems to have expected.
But DHS appears to want to prevent this, so the effects of AFST on families are
only likely to worsen.
A flawed measure of accuracy ...
The evaluators made their case that AFST has improved accuracy
based on the following premise: Workers who go out to investigate cases are
concluding that a greater proportion of them warrant further
intervention. And since the investigators don’t know the actual scarlet
number – somewhere between 1 and 20 for each child in the family – the evaluation
assumes AFST must be singling out a greater proportion of cases where there
really is a need for DHS to intervene.
Here’s the problem. The investigators
don’t know if the scarlet number was, say, a 6 or an 18. But the investigators
know enough for the very existence of AFST to bias their decision-making.
They know that the algorithm, which is the pride of Allegheny County and has gotten an avalanche of favorable national attention, is probably what sent them into this home in the first place. That alone is probably enough to make them more skittish about “defying” the algorithm and saying there’s no problem here. So what
the report claims is an increase in accuracy is more likely a self-fulfilling
prophecy.
...as the net grows wider
A child abuse investigation is
not a benign act. Even when it does not lead to removal, it can be
enormously traumatic for children. But under AFST this trauma is increasing.
According to the evaluation, before AFST the proportion of reports “screened
in” was declining. AFST stopped that decline. That is deeply
disturbing in itself, all the more so when combined with the one-year increase
in entries into care noted earlier.
The human factor
The one bit of good news in this evaluation is that the human
beings who do the actual screening have been less afraid to stand up to the algorithm
than I’d expected. But what’s interesting here is the fact that DHS seems
to be upset by this.
One of the biggest selling points for AFST has been that it’s
supposedly just a tool, something that gives advice to the screeners who still,
with their supervisors, are making the actual decisions. According to the
evaluation:
“…there is considerable lack of concurrence with the AFST by call screeners … only 61 percent of the referrals that scored in the ‘mandatory’ screen-in range were, in fact, screened in. Therefore, the county will continue to work with call screeners to understand why they might be making these decisions.”
That does not sound like DHS is happy with the screeners daring
to question the algorithm. It’s frightening to think of the effects on
the poorest communities in Allegheny County if DHS takes this one “brake” off
AFST.