Data science analysis of court transcripts reveals biased jury selection

Friday, July 28, 2023

Cornell researchers have shown that data science and artificial intelligence (AI) tools can successfully identify disparate questioning – the practice of prosecutors questioning potential jurors differently in an effort to prevent women and Black people from serving on juries.

In a first-of-its-kind study, researchers used natural language processing (NLP) tools to analyze transcripts of the jury selection process. They found multiple quantifiable differences in how prosecutors questioned Black and white members of the jury pool. Once validated, this technology could provide evidence for appeals cases and be used in real time during jury selection to ensure more diverse juries.

The new study, “Quantifying disparate questioning of Black and White jurors in capital jury selection,” appeared July 14 in The Journal of Empirical Legal Studies. Anna Effenberger ’22 was first author on the study.

Striking jurors on the basis of race has been illegal since the Supreme Court's landmark Batson vs. Kentucky decision in 1986, a protection the court later extended to gender, but this type of discrimination still occurs.

“One of the things the courts have looked at is whether the prosecutor questions Black and white jurors differently,” said study co-author John Blume, the Samuel F. Leibowitz Professor of Trial Techniques at Cornell Law School and director of the Cornell Death Penalty Project. “NLP software allows you to do that on a much more sophisticated level, looking not just at the number, but at the way in which the questions are put together.”

Under the assumption that Black and female jurors will be more sympathetic to a defendant – especially a Black one – prosecutors will sometimes press them to reveal disqualifying information. A common tactic in capital cases is to provide an especially gruesome description of the execution process and then ask if the person would be willing to sentence the defendant to death. If the answer is no, that person is struck from the jury pool.

To see if NLP software could identify this and other signs of disparate questioning, Blume collaborated with Effenberger and Martin Wells, the Charles A. Alexander Professor of Statistical Sciences in the Cornell Ann S. Bowers College of Computing and Information Science and director of research in the School of Industrial and Labor Relations, to analyze transcripts from 17 capital cases in South Carolina. Their dataset included more than 26,000 questions that judges, defense attorneys, and the prosecution asked potential jurors.

The researchers looked not only at the number of questions asked of Black, white, male, and female potential jurors, but also at the topics covered, each question’s complexity, and the parts of speech used.
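To give a rough sense of what measuring a question can look like in practice, here is a minimal Python sketch. It is not the researchers' method: the article does not say which NLP tools or complexity measures the study used. The function name `question_features`, the feature names, and the example question are all hypothetical, and words per sentence is only a crude stand-in for complexity.

```python
import re
from statistics import mean


def question_features(text):
    """Return simple surface features for one question.

    Illustrative only: these are assumed proxies, not the measures
    used in the study.
    """
    # Words are runs of letters and apostrophes; punctuation is ignored.
    words = re.findall(r"[A-Za-z']+", text)
    # Sentences are split on terminal punctuation; empty pieces are dropped.
    sentences = [s for s in re.split(r"[.?!]+", text) if s.strip()]
    return {
        "n_words": len(words),
        "n_sentences": len(sentences),
        "avg_word_len": mean(len(w) for w in words) if words else 0.0,
        # Longer sentences serve here as a crude proxy for complexity.
        "words_per_sentence": len(words) / len(sentences) if sentences else 0.0,
    }


# A made-up question, not taken from any transcript.
example = "Could you vote for a death sentence? Please explain."
features = question_features(example)
```

A fuller analysis of topics or parts of speech would need a dedicated NLP library; this sketch sticks to the standard library so that it runs as written.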

“We consistently found racial differences in a number of these measures,” said Wells. “When we do job interviews, we usually have a list of questions, and we want to ask everyone the same question, and here that's not the case.”

The analysis showed significant differences in the length, complexity and sentiment of the questions prosecutors asked of Black potential jurors compared to white ones, indicating that prosecutors were likely attempting to shape jurors' responses. The questions asked by judges and the defense showed no such racial differences.
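One common way to check whether a difference between two groups of questions is larger than chance would produce is a permutation test. The Python sketch below is an assumption-laden illustration, not the study's statistical procedure, which the article does not describe. The function name `permutation_test` and the word counts are invented for the example.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_permutations=10000, seed=0):
    """Two-sided permutation test on the difference in group means.

    Returns an estimated p-value: the share of random relabelings whose
    mean difference is at least as large as the observed one.
    """
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (extreme + 1) / (n_permutations + 1)


# Made-up question lengths in words, NOT data from the study.
lengths_a = [30, 34, 29, 41, 38, 36]
lengths_b = [18, 22, 20, 17, 25, 21]
p_value = permutation_test(lengths_a, lengths_b, n_permutations=2000)
```

A small p-value suggests that the gap in average length would be unlikely if group labels were unrelated to how questions were worded. On its own it says nothing about intent.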

The study also found evidence that prosecutors had attempted to disqualify Black individuals by using their views on the death penalty. Prosecutors asked Black potential jurors – especially those who were ultimately excused from serving – more explicit and graphic questions about execution methods than they asked white potential jurors.

In six of the 17 cases analyzed in the study, a judge had later ruled that the prosecutor illegally removed potential jurors on the basis of race. By looking at the combined NLP analyses for each case, the researchers could successfully distinguish between cases that had violated Batson vs. Kentucky and ones that hadn’t.

The researchers said the findings provide proof of principle that NLP tools can successfully identify biased jury selection. Now, they hope to see similar studies performed on larger datasets with more diverse types of cases.

Once the validity of this method is established, “this could be done during jury selection almost in real time,” Wells said.

Whether used to monitor jury selection or to provide evidence for an appeal, this software could be a powerful tool to diversify juries – especially for defendants who are potentially facing the death penalty.

For this study, the Cornell Death Penalty Project funded the collection and transcription of the data. Effenberger was a Theodore Eisenberg Research Fellow.

By Patricia Waldron, a writer for the Cornell Ann S. Bowers College of Computing and Information Science.