Computational Journalism
Columbia Journalism School
Week 7: Algorithmic Accountability and Discrimination
The 24 judges of the Swiss Federal Administrative Court are randomly assigned to cases, yet they
rule at different rates on migrant deportation cases. Here are their deportation rates,
broken down by party.
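A minimal sketch of the underlying computation, in Python with made-up data; the column names are assumptions, not the court's actual schema:

```python
import pandas as pd

# Hypothetical case-level data: one row per deportation appeal.
cases = pd.DataFrame({
    "judge":    ["A", "A", "B", "B", "C", "C"],
    "party":    ["SVP", "SVP", "SP",  "SP",  "FDP", "FDP"],
    "deported": [1, 1, 0, 1, 0, 1],   # 1 = appeal denied
})

# Deportation rate per judge. Because assignment is random, large
# differences between judges are hard to explain away by case mix.
print(cases.groupby(["party", "judge"])["deported"].mean())

# Average rate by the judge's party affiliation.
print(cases.groupby("party")["deported"].mean())
```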
Florida legislators created the point system to ensure defendants committing the same crime are
treated equally by judges. But that is not what happens.
The Herald-Tribune established this by grouping defendants who committed the same crimes
according to the points they scored at sentencing. Anyone who scored from 30 to 30.9 would go
into one group, while anyone who scored from 31 to 31.9 would go in another, and so on.
We then evaluated how judges sentenced black and white defendants within each point range,
assigning a weighted average based on the sentencing gap.
If a judge wound up with a weighted average of 45 percent, it meant that judge sentenced black
defendants to 45 percent more time behind bars than white defendants.
Bias on the Bench: How We Did It, Michael Braga, Herald-Tribune
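A sketch of that grouping-and-weighting computation, with invented data and column names. The exact weighting is an assumption, since the article does not spell it out; here each point bin's gap is weighted by the number of defendants in it:

```python
import pandas as pd

# Hypothetical sentencing records for a single judge.
df = pd.DataFrame({
    "points": [30.2, 30.7, 30.5, 31.1, 31.8, 31.4],
    "race":   ["black", "white", "white", "black", "white", "black"],
    "days":   [400, 300, 280, 500, 350, 520],   # sentence length
})

# One-point bins: 30-30.9 -> 30, 31-31.9 -> 31, and so on.
df["bin"] = df["points"].astype(int)

# Mean sentence by race within each bin, and the relative gap.
means = df.pivot_table(index="bin", columns="race", values="days")
means["gap"] = means["black"] / means["white"] - 1   # 0.45 = 45% more time

# Weight each bin's gap by how many defendants fall in it.
weights = df.groupby("bin").size()
weighted_gap = (means["gap"] * weights).sum() / weights.sum()
print(f"Weighted sentencing gap: {weighted_gap:.0%}")
```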
Unadjusted disciplinary rates
A greater share of black inmates are in prison for violent offenses, and minority inmates are
disproportionately younger, factors that could explain why an inmate would be more likely to
break prison rules, state officials said. But even after accounting for these elements, the disparities
in discipline persisted, The Times found.
The disparities were often greatest for infractions that gave discretion to officers, like disobeying a
direct order. In these cases, the officer has a high degree of latitude to determine whether a rule is
broken and does not need to produce physical evidence. The disparities were often smaller,
according to the Times analysis, for violations that required physical evidence, like possession of
contraband.
The Scourge of Racial Bias in New York State's Prisons, NY Times
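"Accounting for these elements" is usually done with a regression that controls for them. The sketch below uses simulated data and assumed variable names to contrast the unadjusted rates with an adjusted comparison; it illustrates the general technique, not the Times' actual analysis:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated inmate-level data; all names and effect sizes are made up.
rng = np.random.default_rng(0)
n = 5000
df = pd.DataFrame({
    "race":    rng.choice(["black", "white"], n),
    "age":     rng.integers(18, 60, n),
    "violent": rng.integers(0, 2, n),
})
# Discipline depends on age and offense type, plus a residual race gap,
# mimicking "the disparities persisted" after controls.
logit_p = (-1.0 - 0.03 * (df["age"] - 30) + 0.4 * df["violent"]
           + 0.5 * (df["race"] == "black"))
df["disciplined"] = rng.binomial(1, 1 / (1 + np.exp(-logit_p)))

# Unadjusted rates: the raw comparison in the heading above.
print(df.groupby("race")["disciplined"].mean())

# Adjusted: control for age and offense type; the race coefficient
# that remains is the adjusted disparity.
print(smf.logit("disciplined ~ C(race) + age + violent", data=df).fit().params)
```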
Comparing more subjective offenses
From Kosinski et al., Private traits and attributes are predictable from digital records of
human behavior
Predicting gender from Twitter
Al Zamal et al., Homophily and Latent Attribute Inference: Inferring Latent
Attributes of Twitter Users from Neighbors
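A toy sketch of the idea behind neighbor-based inference: because of homophily, an aggregate of the features of the accounts a user follows carries signal about the user's own latent attribute. The data below is simulated and the classifier is a plain logistic regression, not the paper's exact setup:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n_users, n_feats, n_neighbors = 500, 20, 10

gender = rng.integers(0, 2, n_users)          # latent attribute to infer
# Each user's own features leak a little signal about the attribute.
own = rng.normal(0, 1, (n_users, n_feats)) + gender[:, None] * 0.5
# Homophily: draw each user's neighbors from the same-attribute pool,
# and represent the user by the mean of the neighbors' features.
neighbor_feats = np.array([
    own[rng.choice(np.where(gender == g)[0], n_neighbors)].mean(axis=0)
    for g in gender
])

clf = LogisticRegression().fit(neighbor_feats[:400], gender[:400])
print("held-out accuracy:", clf.score(neighbor_feats[400:], gender[400:]))
```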
Predicting race from Twitter
Northpointe response
Altering a risk algorithm to improve matters can lead to difficult stakeholder choices. If it
is essential to have conditional use accuracy equality, the algorithm will produce
different false positive and false negative rates across the protected group categories.
Conversely, if it is essential to have the same rates of false positives and false negatives
across protected group categories, the algorithm cannot produce conditional use
accuracy equality. Stakeholders will have to settle for an increase in one for a decrease
in the other.
Fairness in Criminal Justice Risk Assessments: The State of the Art, Berk et al.
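When base rates differ across groups, this tradeoff is forced by arithmetic. The sketch below (all numbers made up) fixes the positive and negative predictive values, i.e. conditional use accuracy equality, for two groups with different base rates, and the false positive and false negative rates come apart:

```python
# Equal PPV and NPV across groups: conditional use accuracy equality.
PPV, NPV = 0.7, 0.9

def error_rates(base_rate):
    # The fraction flagged positive, f, consistent with this base rate:
    # base_rate = f*PPV + (1-f)*(1-NPV), solved for f.
    f = (base_rate + NPV - 1) / (PPV + NPV - 1)
    tp, fp = f * PPV, f * (1 - PPV)
    tn, fn = (1 - f) * NPV, (1 - f) * (1 - NPV)
    return fp / (fp + tn), fn / (fn + tp)

for group, base_rate in [("A", 0.5), ("B", 0.2)]:
    fpr, fnr = error_rates(base_rate)
    print(f"group {group}: base rate {base_rate:.0%}, "
          f"FPR {fpr:.1%}, FNR {fnr:.1%}")
# group A: base rate 50%, FPR 40.0%, FNR 6.7%
# group B: base rate 20%, FPR 6.2%, FNR 41.7%
```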
Multi-stage discrimination models
All The Stops, Thomas Rhiel, Bklynr.com, 2012
In search of fairness
Benchmark Test
Outcome Test
Problem: Infra-marginality
Simoiu et al.
Infra-marginality
Outcome tests, however, are imperfect barometers of bias. To see this,
suppose that there are two, easily distinguishable types of white drivers: those
who have a 1% chance of carrying contraband, and those who have a 75%
chance. Similarly, assume that black drivers have either a 1% or 50% chance
of carrying contraband. If officers, in a race-neutral manner, search
individuals who are at least 10% likely to be carrying contraband, then
searches of whites will be successful 75% of the time whereas searches of
blacks will be successful only 50% of the time. This simple example illustrates a
subtle failure of outcome tests known as the problem of infra-marginality.
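The example translates directly into a few lines of code. One simplifying assumption below: since only one driver type per race clears the 10% threshold, the hit rate for that race is just that type's contraband probability:

```python
white_types = [0.01, 0.75]   # p(contraband) for the two white driver types
black_types = [0.01, 0.50]   # p(contraband) for the two black driver types
threshold = 0.10             # race-neutral search threshold

def hit_rate(types):
    searched = [p for p in types if p >= threshold]
    return sum(searched) / len(searched)   # expected search success rate

print("white hit rate:", hit_rate(white_types))   # 0.75
print("black hit rate:", hit_rate(black_types))   # 0.50
# Same race-neutral threshold, different hit rates: an outcome test would
# wrongly conclude officers search black drivers on weaker evidence.
```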
[Figure: search and hit rates plotted against p(contraband | r) and p(contraband | d), where r = race and d = department. Simoiu et al.]