Transparency in Sentencing


The QSIDE Institute has released a comprehensive dataset of federal court sentencing data, cobbled together from a number of sources, including PACER. The dataset has over 570,000 records, each corresponding to a single sentencing, from 2001-2018. The project represents a phenomenal investment, and it should help make sentencing more transparent. In particular, because the dataset includes judge identifiers as well as defendant characteristics, it may help address questions such as whether particular judges discriminate against minorities in sentencing.

For example, Chad Topaz has tweeted a graph from a not-yet-published analysis that, he reports, indicates that "30+ judges display … statistically significantly different sentencing behavior by defendant race." Many of these judges are more likely to give a below-Guidelines sentence to a white defendant than to a "minoritized" defendant, and many also are more likely to give an above-Guidelines sentence to a minoritized defendant than a white defendant. At this stage, the judges are not named, though the underlying data is public.

More nuanced analysis is promised, from David Abrams, Sonja Starr, and Crystal Yang, and I look forward to it. [Correction: Topaz was indicating only that these scholars would be capable of offering nuanced analysis. I agree and apologize for the misreading.] It will be difficult even from this data source, however, to make a conclusive determination of whether minority defendants are treated worse in sentencing, all else equal. Black defendants may be systematically different from white defendants in ways that the data cannot capture, so overall racial disparities may be suggestive but not conclusive. Still, large differences in sentencing controlling for available information, as have been found in past studies on smaller datasets, may be highly suggestive of discrimination. The provision of more data on this important question is a valuable contribution.

We may be able to obtain somewhat better information about variations in judicial behavior. After all, cases are generally randomized to judges. If Judge A and Judge B sit in the same district and, over a large number of cases, Judge A's differential treatment of white vs. minority defendants diverges from Judge B's to a statistically significant degree, then we can infer that one judge is more generous to white defendants than the other. But Topaz identifies "VERY IMPORTANT CAVEATS" (caps in original). Just because a disparity exists does not necessarily mean that a judge is biased. As just one possibility that Topaz notes, the judges may differ in whether they vary sentences based on whether a defendant has dependents, a variable that itself may be correlated with race. The fact that a dataset includes many control variables does not exclude the possibility that it omits other important controls, and even where a dataset includes a control, if that control is only a proxy for some underlying variable, the regression only partly controls for that underlying variable.
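To make the statistical logic concrete, here is a minimal sketch of the kind of test one might run on a single judge's record. All counts are hypothetical, and a real analysis would of course need the controls discussed above; this is only a two-proportion z-test, not the (unpublished) method the researchers actually used.

```python
from math import sqrt, erf

def two_proportion_z(success_a, n_a, success_b, n_b):
    """Two-sided two-proportion z-test.

    Returns (z, p_value). Counts here are hypothetical; a real
    sentencing analysis would need the control variables the post
    discusses.
    """
    p_a, p_b = success_a / n_a, success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Standard normal CDF via the error function.
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

# Hypothetical: a judge gave below-Guidelines sentences to 60 of 200
# white defendants but only 40 of 200 minoritized defendants.
z, p = two_proportion_z(60, 200, 40, 200)
```

With these made-up numbers the gap is statistically significant at the conventional 5% level, but, per Topaz's caveats, that alone would not show bias.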

Once data about how judges differ on the basis of race are publicly released, that may place pressure on judges to change their sentencing patterns so that they treat different races equally. One might argue that this is harmful, if it means that judges stop taking into account race-neutral variables that they genuinely care about in sentencing. But if a particular judge is an outlier in this regard, then most judges apparently agree that this consideration should not factor into sentencing. Topaz is right that judges who are systematically harsher on defendants of a particular race may not be biased (because other unidentified factors may explain their decisionmaking), but they are at least outliers. Transparency in sentencing may, like the Sentencing Guidelines, serve as a mechanism for reducing sentencing disparity.

A more worrisome possibility is that with the release of such data, judges start directly taking into account race, albeit not admitting that they are doing so. Judges might keep internal track of their record of sentencing defendants of different races. A judge who sees an apparent bias on a public dashboard, even a statistically insignificant one, might shade decisionmaking in the direction of reducing bias. (I assume that judges would not want to appear either biased for or against minority defendants, though it is theoretically possible that some judges would want to appear biased in one direction or the other.) Perhaps in the future, we might determine that there is less difference among judges in bias measures than one would expect from random chance, suggesting that judges are ignoring relevant considerations in order to appear unbiased.

This highlights a more general point about transparency: If greater amounts of data allow people to be scored based on some considerations but not others, then they will be relatively more likely to focus on those considerations and to ignore the others. Thus, if transparency leads to seemingly better performance on one variable, it may be at the expense of other variables. More alarmingly, if someone wants to make it appear that a variable does not influence the person's decisionmaking, the person counterintuitively may allow the variable to play an explicit part in decisionmaking, to counter any random fluctuations that otherwise might incorrectly be interpreted as bias.

Still, my instinct is that the benefits of transparency greatly outweigh these costs. There is an appropriate national focus on race in the criminal justice system. Unless one has a very strong prior that race does not play any role in the criminal justice system and that seeming patterns reflect spurious correlations, one should favor better data. And if that data indicates that some judges are outliers in one direction or another, it seems to me more likely that scrutiny of those judges will make sentencing genuinely consistent and fair than that it will achieve a false consistency.




  1. I hope they didn’t go to all that work just to focus solely on the idea of racist judges. This seems like it could be a great tool for holding judges accountable to public opinion. It would be amazing if they would do this for state courts. Don’t like the crime in your neighborhood? Look at an app and check which judge is too soft. Think some people are getting punished too harshly? Well you’d know who to blame.

  2. What an excellent idea! The more transparent the better.

    “A more worrisome possibility is that with the release of such data, judges start directly taking into account race, albeit not admitting that they are doing so.”

    I would argue that those judges who “discover” they punish minorities more or less harshly in a statistically significant manner have already been “taking into account race,” even if they weren’t doing it consciously. Judges are no different than anyone else, and humans have biases, even when we think we don’t.

    1. Agree 100%. My strong suspicion is that, for the vast majority of these judges, it’s all subconscious. I had a judge who, in 2 years and more than 300 hearings, did not rule for me a single time–except when one party or another also happened to support my argument. With the prior 3 judges in my courtroom, I had the highest winning percentage of any of my peers. Maybe I suddenly became an incompetent attorney. That’s certainly one explanation. But it was pretty clear to me, and–to the judge’s credit–clear to her as well, when I pointed it out and gave her a printout of the hearings and her rulings.

      We don’t know what we don’t know. And while some judges will take this info and ignore it, and others may swing the pendulum too far in the other direction, a lot of judges will take it in and become better (i.e., more fair) bench officers. Can’t see how the benefit will not outweigh any potential harm.

      1. The phrase, “We don’t know what we don’t know” is my mantra! It gives me confidence that I don’t suffer from Dunning-Kruger, if I’m being honest.

        I’ve been an English teacher/instructor for the last 20 years, and I would really like to see some data reflecting my own teaching/grading biases. There are a number of studies suggesting black students are disciplined more often and for less serious infractions than white students, so it is reasonable to assume that these same biases would manifest in more “hidden” ways, like in more subjective types of grading.

        Confident, intelligent judges would appreciate learning a little something about themselves.

  3. I thought all judges, or at least trial court judges, were Black women.

  4. ” Still, my instinct is that the benefits of transparency greatly outweigh these costs. ”

    Knowledge Is Good
    — Emil Faber, Founder

  5. When lawyers try to do statistics
    >Siri what is the law of small numbers?
    >Siri how do crime statistics work when broken down by race, severity, and repeat offender status?
    >Siri what are fat tails?

  6. Maybe I missed it, but how did they collect information on defendants’ race? I wasn’t aware that that information was either systematically collected or publicly available.

    1. I was wondering the same thing, and about the data sources more generally. Some of this info isn’t available on PACER, such as the Guidelines range and whether a sentence is within, above, or below it. Also, I’m pretty sure the Sentencing Commission has judge-specific data, but to my knowledge, the Commission doesn’t make that info publicly available. If there are publicly available data sets that can show judge-to-judge disparities with this granularity, I’d love to see it. Unfortunately, the first link in the post redirects to the landing page of the Southern Poverty Law Center’s website. If the SPLC is behind this project, I’m skeptical, to say the least.

  7. It will be interesting to see the actual analysis when published.

    If someone says ’30 out of 1000 studies showed a statistically significant (at the 5% level) correlation between eating rutabagas and nasal cancer’, well, duh – you’d expect the number to be 50 if rutabagas had nothing whatever to do with nasal cancer.

    I sometimes think we teach too much calculus and too little statistics.

    1. Sounds to me like rutabagas are pretty effective at combating cancer. 😉

  8. Statistical significance is usually set at 5%. This means that by chance – if there is no real effect, and everything is random – about 1 in 20 events will be statistically significant. There are about 670 federal district judges. So we’d expect about 32 or 33 of them to show up as statistically significant “by chance” if there was no actual racial discrepancy in sentencing. 30 is very consistent with this. The evidence would seem to suggest that there is no statistical significance.

    If one rolled one 20-sided die, it would be unusual for it to come up with a particular number. But roll 670 of them, and it’s quite reasonable to find 30 of them with that number. It’s totally consistent with the dice being completely fair.
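The false-positive arithmetic in this comment is easy to sketch. The judge count is the commenter's approximation, and the simulation below is only an illustration of the dice analogy, not an analysis of the actual dataset:

```python
import random

ALPHA = 0.05      # conventional significance threshold
N_JUDGES = 670    # commenter's approximate count of federal district judges

# Expected number of "significant" judges by pure chance if every judge
# in fact sentences identically across races:
expected = ALPHA * N_JUDGES   # about 33.5

# The dice analogy: roll 670 twenty-sided dice and count how many land
# on one particular face (probability 1/20 each).
random.seed(0)  # fixed seed so the sketch is reproducible
hits = sum(1 for _ in range(N_JUDGES) if random.randint(1, 20) == 1)
```

The count of "hits" will bounce around its mean of 33.5 from run to run, which is exactly why roughly 30 flagged judges, on its own, is consistent with pure chance absent a multiple-comparisons correction.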

    1. oops, see below (meant to reply here).

  9. Just read the whole Twitter feed… they are using adjusted p-values (due to multiple comparisons) so these seem actually statistically significant.
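The adjustment this commenter mentions can be illustrated with the simplest such correction. The Bonferroni method below is a stand-in with hypothetical p-values; the Twitter thread does not say which adjustment the researchers actually applied:

```python
def bonferroni_adjust(p_values):
    """Bonferroni-adjusted p-values: each raw p-value is multiplied by
    the number of tests, capped at 1.0. This is a simple, conservative
    multiple-comparisons correction; the actual analysis may use a
    different (less conservative) method.
    """
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

# Hypothetical raw p-values for four judges:
raw = [0.00004, 0.012, 0.03, 0.6]
adjusted = bonferroni_adjust(raw)
# Only results that stay below 0.05 after the multiplication remain
# "significant" once chance from running many tests is accounted for.
```

Under this correction, a judge who is flagged even after the penalty for running hundreds of simultaneous tests is much harder to explain away as a statistical fluke, which is the commenter's point.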
