Human Decisions and Machine Predictions

City:

New York, NY, U.S.A.

Project start and end date:

Unknown

Reference:

Kleinberg, J., Lakkaraju, H., Leskovec, J., Ludwig, J., & Mullainathan, S. (2018). Human decisions and machine predictions. Quarterly Journal of Economics. https://doi.org/10.1093/qje/qjx032

Problem description:

In the US, when someone is arrested by the police, a judge must decide whether that person will await trial at home or in jail. The decision to jail the defendant can mean several months in detention before trial, with high social and economic costs. This paper analyses whether a machine learning algorithm could be used to improve bail decision making in New York City.

Technical Solution:

Data Preparation:

·         The attributes race, ethnicity, and gender were excluded from the analysis;

·         The dataset was divided into training (40%), imputation (40%), and test (20%) sets.
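A minimal sketch of this split in Python, assuming a hypothetical case-level file and column names (the paper does not publish its code or exact schema):

```python
# Illustrative 40/40/20 split; the file name, column names, and the dropped
# protected attributes below are assumptions, not the paper's actual schema.
import pandas as pd
from sklearn.model_selection import train_test_split

cases = pd.read_csv("nyc_bail_cases.csv")  # hypothetical case-level data

# Exclude race, ethnicity, and gender from the feature set, as in the study.
cases = cases.drop(columns=["race", "ethnicity", "gender"], errors="ignore")

# 40% training, 40% imputation, 20% test.
train, rest = train_test_split(cases, train_size=0.40, random_state=0)
imputation, test = train_test_split(rest, test_size=1/3, random_state=0)
```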

 

Modeling:

·         Gradient boosted decision trees were used to combine current-offence and prior-criminal-history features in order to predict crime risk, with outcomes observed only for defendants whom judges released. A gradient boosted tree model fits many decision trees sequentially on the training data, each new tree correcting the poor predictions made by the ensemble built up to that point (a minimal sketch follows this list).

·         The model outputs a predicted probability of failure to appear (FTA) in court.
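A minimal sketch of such a model, continuing from the split sketch above; the library, hyperparameters, and feature names are illustrative assumptions rather than the paper's specification:

```python
# Gradient boosting fits shallow trees one after another; each new tree is fit
# to the errors of the ensemble so far, so poorly predicted cases receive
# progressively more attention.
from sklearn.ensemble import GradientBoostingClassifier

features = ["offense_severity", "prior_arrests", "prior_fta_count"]  # hypothetical

gbt = GradientBoostingClassifier(
    n_estimators=500,    # number of sequentially fitted trees
    learning_rate=0.05,  # shrinkage on each tree's contribution
    max_depth=3,         # shallow trees
)

# Outcomes are observed only for released defendants, so fit on that subset.
released = train[train["released"] == 1]
gbt.fit(released[features], released["fta"])

# Predicted probability of failure to appear for each test-set defendant.
fta_risk = gbt.predict_proba(test[features])[:, 1]
```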

 

Model Evaluation:

·         The fitted prediction function was evaluated to determine whether the algorithm could improve decision quality; the held-out test sample was used to measure its accuracy;

·         Aside from measuring accuracy by the area under the receiver operating characteristic curve (AUC = 0.707), the model was also assessed for the impact of unobserved characteristics (seen only by the judge, not by the model).
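A minimal sketch of the AUC computation, continuing from the sketches above (outcomes are observed only for released cases, so evaluation is restricted to them):

```python
# Evaluate predicted FTA risk against observed outcomes for released
# defendants in the held-out test set.
from sklearn.metrics import roc_auc_score

released_test = test[test["released"] == 1]
test_risk = gbt.predict_proba(released_test[features])[:, 1]
auc = roc_auc_score(released_test["fta"], test_risk)
print(f"AUC = {auc:.3f}")  # the paper reports an AUC of 0.707
```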

 

Model Validation:

·         To analyse the value added by the machine-learning algorithm, its predicted risk distribution was compared to that produced by logistic regression, a more familiar econometric method (see the sketch after this list);

·         To assess the influence of differing judge leniency on the results, balance tests were carried out on a contraction sample in which defendants are quasi-randomly assigned to judges across leniency quintiles;

·         To identify potential drivers of judge misprediction, the analysis also considered omitted payoffs, such as outcomes other than failure to appear, racial fairness, and defendant employment or family circumstances.
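A minimal sketch of the logistic-regression benchmark, again assuming the hypothetical objects defined in the earlier sketches:

```python
# Fit a logistic regression on the same released training cases and compare
# how the two models spread predicted risk across test-set defendants.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

logit = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
logit.fit(released[features], released["fta"])

logit_risk = logit.predict_proba(test[features])[:, 1]
for q in (0.50, 0.90, 0.99):
    print(f"quantile {q}: logit={np.quantile(logit_risk, q):.3f} "
          f"gbt={np.quantile(fta_risk, q):.3f}")
```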

Datasets Used:

·         Dataset of information available to judges at the time of the bail hearing (e.g., current offense, prior criminal history) in cases heard in New York City between November 1, 2008, and November 1, 2013;

·         National pretrial defendant dataset covering 1976 to 1978, drawn from several jurisdictions.

Outcome:

Based on three types of results, the algorithmic predictions were considered capable of improving judicial decisions:

·         First, many defendants identified by the algorithm as very high risk are released by the judges: the riskiest 1% of defendants are released at a 48.5% rate, and those released fail to appear in court at a 56.3% rate and are rearrested at a 62.7% rate.

·         Second, less lenient judges do not necessarily jail the riskiest defendants first. If the additional defendants they detain were instead selected by predicted risk, the same detention rate would yield a 75.8% larger reduction in crime, or the same reduction in crime could be achieved while jailing only 48.2% as many defendants.

·         Third, when all cases are re-ranked by predicted risk, an algorithmic release rule could reduce crime by 14.4% to 24.7% at the judges' current jailing rate, or reduce the jailing rate by 18.5% to 41.9% with no increase in crime (the ranking step is sketched below).
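A minimal sketch of the ranking step only, assuming the hypothetical objects above; the paper's crime- and jail-reduction figures rely on a more careful analysis (the contraction sample and imputation) because outcomes for jailed defendants are never observed:

```python
# Algorithmic rule: jail the highest-predicted-risk defendants while holding
# the jailing rate fixed at the judges' observed rate.
import numpy as np

jail_rate = 1 - test["released"].mean()   # judges' observed jailing rate
n_jail = int(round(jail_rate * len(test)))

order = np.argsort(-fta_risk)             # defendants sorted riskiest first
rule_jails = np.zeros(len(test), dtype=bool)
rule_jails[order[:n_jail]] = True

judge_jails = (test["released"] == 0).to_numpy()
print("Mean predicted risk, jailed by rule:  ", fta_risk[rule_jails].mean())
print("Mean predicted risk, jailed by judges:", fta_risk[judge_jails].mean())
```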

Issues that Arose:

·         Missing data: it is not possible to determine whether jailed defendants would have committed crimes had they been released;

·         Judges rely on several other factors that are not captured in the data;

·         Racial equity: the algorithm could reduce crime but aggravate racial disparities;

·         Judges weight different kinds of crimes differently;

·         Decisions that appear bad might actually reflect different goals;

·         Additional objectives beyond the outcome the algorithm predicts (omitted-payoff bias: outcomes other than FTA, racial fairness, and other considerations such as family or employment circumstances).

Status:

Unknown

Entered by:

Tatiana Costa Guimaraes Trindade (tatiana.costaguimaraestrindade@mail.utoronto.ca)

September 28, 2018