Frank Ramsey’s Contributions to Probability (and Legal) Theory
“The life of the law has not been logic: it has been experience”.**
Introduction
Frank Ramsey was one of the first theorists, along with Bruno de Finetti,[1] to formalize the notion of subjective probability, but how did Ramsey discover this revolutionary insight in the first place – the idea that probability can consist of a subjective value or purely personal estimate? Cheryl Misak’s beautiful 2020 biography of Frank Ramsey, which is subtitled A Sheer Excess of Powers, explores this terrain as well as Ramsey’s many other scholarly contributions.[2] Misak’s book is divided into three broad temporal sections: i) “Boyhood”, which consists of three chapters devoted to the years 1903 to 1920, i.e. from the year of Ramsey’s birth up to his arrival at Cambridge University; ii) “The Cambridge Man”, which contains seven chapters that describe Ramsey’s undergraduate years at Trinity College as well as his six-month sojourn in Vienna in 1924; and, lastly, iii) “An Astonishing Half Decade”, which contains nine chapters and covers the last five years of Ramsey’s short but productive life. Likewise, I will structure my review of Misak’s work into three parts, with each part corresponding to one of the three main sections of Misak’s book. Part I of my review revisits Ramsey’s boyhood; Part II, his years at Cambridge University; and Part III, his final years. Part IV then concludes by showing how Ramsey’s work and ideas can help shed new light on the law.[3]
I. Ramsey’s Boyhood
Although there is no evidence that the young Ramsey was exposed to the rigors of probability theory by his parents Arthur and Agnes Ramsey or during his formal education at Winchester College (a demanding English boarding school for boys), two details from Ramsey’s boyhood, as recounted in Part I of Misak’s book, stood out for me the most. One was the young Ramsey’s principled opposition to the brutal system of bullying and hazing at his boarding school. Misak summarizes this savage system in the second chapter of her book. Here is just one chilling excerpt:
Each junior was the personal servant of an older student and had to “fag” or “sweat” for him. That meant cleaning the buttons and boots of his Officers’ Training Corps uniforms as well as his muddy cricket boots so they gleamed white again, as well as countless other tasks. The juniors [also] had to make the prefects’ tea, or afternoon meal, and wash up after …[4]
Ramsey, however, detested these regular hazing rituals, and towards the end of his tenure at boarding school, he wrote in his diary that he had “[d]ecided to give up sweating juniors”.[5] Furthermore, according to Misak, Ramsey even made a bargain with the younger boy assigned to him that he would not be required to do any chores at all for him; in return, the boy was to pay it forward to his assigned junior when the time came.[6] In other words, this early episode shows how Ramsey, even at a young age, was a man of principle.
The other aspect of Ramsey’s boyhood that caught my attention was his voracious reading habits.[7] Even at such an early age, Ramsey was a boy who loved the world of ideas, for in addition to his regular coursework at his boarding school, Ramsey devoured dozens of advanced works from a wide variety of fields. Among the many extracurricular books the young Ramsey is reported to have read are David Hume’s A Treatise of Human Nature, Bertrand Russell’s The Problems of Philosophy, and G.E. Moore’s Ethics. If one were to create a syllabus with the goal of imparting a well-rounded liberal arts education, one would be hard-pressed to assemble a better collection of classic works.
II. Cambridge University
Part II of Cheryl Misak’s intellectual biography of Frank Ramsey is devoted to Ramsey’s undergraduate years at Cambridge University, where the young scholar would, among other things, translate Wittgenstein’s Tractatus Logico-Philosophicus and study John Maynard Keynes’s Treatise on Probability. If there is a common or overarching theme during these formative years in Ramsey’s intellectual life (1920-24), it is his willingness to challenge the most powerful and sacrosanct ideas of such great and legendary scholars and philosophers as G.E. Moore, Bertrand Russell, and Ludwig Wittgenstein. Here, I will limit myself to just one such momentous undergraduate episode: Ramsey’s early critique of Keynes’s theory of probability.
To appreciate the young Frank Ramsey’s first foray into probability theory, I must first provide some relevant background. The great John Maynard Keynes had published his Treatise on Probability in 1921,[8] and in a review of Keynes’s work, none other than Bertrand Russell had called Keynes’s Treatise “the most important work on probability that has appeared for a very long time”,[9] adding that the “book as a whole is one which it is impossible to praise too highly”.[10] Why was Keynes’s work so highly praised at the time? Because Keynes had developed a new way of looking at the concept of probability. For Keynes, probability consisted of an objective or logical relation between evidence and hypothesis, or in the words of Misak, a relation “between any set of premises and a conclusion in virtue of which, if we know the first, we will be warranted in accepting the second with some particular degree of belief”.[11]
Ramsey, however, immediately identified two big blind spots in Keynes’s conception of probability.[12] One was Keynes’s admission that not all probabilities are numerical or measurable, especially when the truth values of our underlying premises are in dispute. In that case, when we have no idea whether our premises are true or not, Keynes’s approach does not allow us to measure the probabilities of our conclusions. For Ramsey, by contrast, all probabilities should be measurable. The other problem with Keynes’s theory – a deeper problem to boot – was the objective nature of his view of probability: the idea that all statements or propositions stand in logical relation to each other. Ramsey, by contrast, denied the existence of these logical relations altogether. For Ramsey, probability was based on experience, not logic. That is, far from being an objective relation, the strength or weakness of the relationship between two propositions also depended on psychological factors: on one’s personal experiences and subjective beliefs.[13]
Yet, as the saying goes, “it takes a theory to beat a theory”,[14] and at this stage of his promising career the young Ramsey had yet to develop his own full-fledged theory of probability. Ramsey would finally get around to doing so in the last half decade of his short life, but before proceeding, I will take a short detour to recount two important personal episodes that occurred during this middle stage of Ramsey’s short life: his six-month sojourn in Vienna,[15] and his secret love affair with Lettice Baker,[16] who deserves a biography of her own.
In brief, upon the completion of his undergraduate studies, Ramsey had decided to spend an extended period of time in Vienna to undergo psychoanalysis.[17] At that time, according to Misak, “taking the cure in Vienna” was a common pastime for many young Cambridge academics.[18] Furthermore, Frank Ramsey took full advantage of all that the former imperial capital had to offer: deep discussions with members of the legendary Vienna Circle of anti-metaphysical philosophers, cultured nights at the world-famous Opera, and even some sordid sexual escapades with a Viennese prostitute.
Moreover, within a month of his return to England, in the fall of 1924, Ramsey met Lettice Baker at a Moral Sciences Club meeting at Trinity College.[19] Shortly thereafter, Ramsey asked her out to tea, and they quickly fell in love. For my part, what struck me the most about the Ramsey-Baker love affair was how they had to keep their amorous relationship a secret until they were legally wed in September of 1925, but more importantly for the world of ideas, Ramsey’s greatest scholarly contributions, including his theory of subjective probability (a theory that is especially relevant to law, as I shall show below), were right around the corner.
III. Ramsey’s Swan Song
The third and last part of Cheryl Misak’s beautiful biography of Frank Ramsey (“An Astonishing Half Decade”) covers the last five years of Ramsey’s fleeting life. During the last half decade of his short but productive life, Ramsey made major contributions to a wide variety of fields, including economics, mathematics, and philosophy, but I shall focus here on his contributions to probability theory, for it was during this time that Ramsey developed his own full-fledged theory of subjective or psychological probability.
Ramsey developed his new approach to chance in a paper titled Truth and Probability, which he presented for the first time at a meeting of the Moral Sciences Club in November of 1926.[20] In this remarkable paper, which was eventually published posthumously in 1931,[21] Ramsey sketched out an entirely new and revolutionary way of looking at probability. We can summarize Ramsey’s picture of probability in ten words: “probabilities are beliefs and beliefs, in turn, are metaphorical bets”,[22] or to quote Ramsey himself, “Whenever we go to the station we are betting that a train will really run, and if we had not a sufficient degree of belief in this [outcome] we should decline this bet and stay at home”.[23] On this subjective view of probability, one can measure the strength of a person’s beliefs in betting terms, or again in Ramsey’s own words: a “probability of 1/3 is clearly related to the kind of belief [that] would lead to a bet of 2 to 1”.[24] Most importantly, Ramsey also showed how one’s bets – i.e., one’s subjective or personal probabilities – should obey the formal axioms of probability theory.[25]
It is hard to overstate the importance of Ramsey’s subjective betting paradigm.[26] First and foremost, Ramsey’s subjective approach fills a huge gap left open by standard probability theory based on frequencies, for as Misak herself correctly notes, frequentist methods can neither “provide an account of partial belief, nor an account of how an individual should make one-off decisions”.[27] This blind spot is so enormous and so well-known by now that I won’t belabor it here; it suffices to say that Ramsey’s subjective picture of probability delivers a serious blow to standard or frequentist models.
Secondly, Ramsey’s betting paradigm fills another huge blind spot in probability theory. Before Ramsey, the conventional wisdom, so to speak, was that some probabilities (especially personal probabilities) were not measurable in any rigorous way. After Ramsey, by contrast, we are fully able to express any person’s partial beliefs, even his subjective ones, using numerical values. How? By converting one’s beliefs into betting odds. (As an added bonus, Ramsey’s intellectual framework even enables us to determine whether our subjective beliefs are rational or not via now-standard Dutch book arguments).[28] In addition, Ramsey’s betting paradigm also provides the intellectual foundations for modern prediction markets,[29] one of the most promising and exciting market innovations of our time.[30] Perhaps we should rename prediction markets “Ramsey markets” in honor of Frank Ramsey.
But what do Ramsey’s ideas have to offer lawyers, judges, and legal theorists?
IV. A “Ramsey Model” of Judicial Voting
As it happens, Frank Ramsey’s subjective picture of probability has both normative and descriptive implications for law. To recap, according to Ramsey’s subjective picture of probability, probabilities are not an external or objective property of the real world. Instead, probabilities are internal entities – the subjective expression of one’s personal view of the world. Moreover, one’s subjective belief about the probability of a proposition’s truth can be expressed in numerical terms: one’s “degree of belief”.[31]
In other words, the probability of a particular proposition being true is just one’s degree of belief or level of confidence in the truth of that proposition. On this subjective view of probability, even if two people’s judgments about the probability of a proposition are vastly different at time t1, after evidence for (or against) the statement/hypothesis contained in the proposition is introduced at time t2, rational individuals should then revise their initial degrees of beliefs, and moreover, their degrees of belief will tend to converge as more and more new evidence becomes available. Isn’t this subjective convergence toward truth a good description of how common law judges and juries decide cases?
In addition to its descriptive power, Ramsey’s approach to probability may also have some normative implications for law. Consider judicial voting on multi-member panels, such as the Supreme Court of the United States. An appellate judge’s vote contains information (independent of whatever reasons the judge may give to justify his vote), so a Ramsian judge should update his priors before casting a final and decisive vote, especially in close cases.[32] But is there any way of operationalizing these Bayesian insights? There is: “Ramsey voting”, “Ramsian voting”, or what I have also called “Bayesian verdicts”[33] or “Bayesian voting”.[34] In plain English, what if an appellate judge, when deciding a case, were required not only to state the reasons for his vote but also to express his degree of belief in the soundness or correctness of his decision?
To the point, a Ramsian system of judicial voting would look much different from the existing system of binary judicial voting in appellate cases. In place of emitting a simple binary vote either affirming or reversing a lower court’s decision, a Ramsian judge would rate or score the strength of the legal arguments of the parties. With Ramsian voting, a judge would have to assign a numerical score reflecting his relative degree of belief or credence in what the proper outcome of an issue or case should be, depending on whether the judge is engaged in outcome-voting or issue-voting.[35]
To understand how Ramsian voting could work in practice, recall that one’s degree of belief can be expressed in numerical terms anywhere in the range from 0 to 1.
In summary, the higher the score, the greater the judge’s credence or degree of belief. A score below 0.5, for example, would mean that the party with the burden of persuasion is not expected to prevail. A score above 0.5, by contrast, indicates that the party is expected to prevail, while a score of 0.5 means the judge is undecided about which party should prevail. Bayesian or Ramsian voting thus recognizes the subjective as well as the interdependent nature of law and legal interpretation.
This alternative method of voting goes by various names, including range voting,[36] utilitarian voting,[37] score voting,[38] point voting,[39] and cardinal voting,[40] just to name a few variants. For my part, I have used the term “Bayesian voting” not only because judicial decision-making in close cases is ultimately a subjective exercise in legal reasoning, but also to emphasize the close connection between this proposed method of judicial voting and the theory of subjective probability developed by such giants as Frank P. Ramsey and Bruno de Finetti.[41] To sum up, the subjective approach to probability can be used to improve collective decision making, including legal adjudication. Perhaps we should rename these alternative methods of voting “Ramsey voting” or “Ramsian voting” in honor of Frank Ramsey’s contributions to probability theory.
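The scoring scheme described in this Part can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the three-judge panel is invented, and the aggregation rule (a simple average of the judges’ scores, in the spirit of range or score voting) is my assumption, since the text above does not specify how individual scores would be combined.

```python
from statistics import mean


def interpret_score(score: float) -> str:
    """Read a credence between 0 and 1 the way the text describes."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("a degree of belief must lie between 0 and 1")
    if score > 0.5:
        return "party with the burden is expected to prevail"
    if score < 0.5:
        return "party with the burden is not expected to prevail"
    return "undecided"


def panel_outcome(scores: list[float]) -> tuple[float, str]:
    """ASSUMED aggregation rule: average the panel's scores, then interpret."""
    average = mean(scores)
    return average, interpret_score(average)


# Hypothetical three-judge panel: one confident judge, two mildly doubtful ones.
average, reading = panel_outcome([0.9, 0.4, 0.45])
print(f"average credence {average:.2f}: {reading}")
```

Under a binary vote this hypothetical panel would split 2 to 1 against the party with the burden, whereas the averaged credence exceeds 0.5, which shows how scores can carry information that a bare vote discards.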
Conclusion
I will conclude my review of Misak with a confession and a conjecture. My confession is that I am befuddled by two aspects of Ramsey’s personal biography, of Ramsey the man: his decision to squander no less than six months of his fleeting life to undergo psychoanalysis in Vienna and the open nature of Ramsey’s marriage to Lettice Baker, for I agree with Sir Karl Popper that psychoanalysis is unfalsifiable pseudoscience,[42] and for me, love is not a matter of degree; true love requires one to give up one’s outside options – what game theorists refer to as credible commitment.[43]
Now that I have made my confession, I will proceed to my conjecture. What if romantic love were a matter of degree, like truth, instead of an all-or-nothing proposition?[44] Also, what if it was Ramsey’s extended exposure to psychoanalysis during his six-month sojourn in Vienna that somehow inspired him to develop his subjective approach to probability? After all, beliefs and desires – the raw materials, so to speak, of psychoanalysis – also play a critical role in Ramsey’s subjective theory of probability. If so, his sojourn in Vienna was not a waste of time after all; it was a necessary precondition of his contributions to the world of probability theory.
Thank you, Cheryl Misak, for sharing your Frank Ramsey with us.
* * *
Appendix: A “Ramsey Model” of Litigation
To further illustrate the relevance of the ideas of Frank Ramsey to law, this Appendix presents a simple analytical model of litigation contests and applies the model to both criminal and civil cases.[45] In both types of cases, litigation is a game with two possible outcomes: positive (+) and negative (–).[46] Seen from the moving party’s perspective, for example, a positive outcome occurs when the moving party successfully imposes civil or criminal liability on the defendant; a negative outcome, when the defendant is able to avoid the imposition of liability.[47]
Before proceeding, notice that the relevant rules of procedure (i.e., the rules of the litigation process) – as well as the scope and legal meaning of wrongful acts and the types of legal liability imposed on wrongful actors – are not relevant to my simplified model.[48] In place of traditional legal analysis, this Ramsey-inspired model abstracts from the morass of legal materials and takes these features of the legal landscape as a given. Stated formally, these details are exogenous or external to the model. Having stated these simplifying assumptions, I now proceed to apply Bayes’ theorem to the litigation process as follows:
Pr(guilty|+) = [Pr(+|guilty) × Pr(guilty)] ÷ Pr(+)
In plain English, we want to find the posterior probability, Pr(guilty|+), that a defendant has really committed a wrongful act, given that he has been found liable at trial.[49]
My Ramsian approach to litigation contests thus takes into account both the possibility of a false positive (i.e., the imposition of liability when the defendant has not committed any wrongful act) and the possibility of a false negative (i.e., no liability even though the defendant has, in fact, committed a wrongful act).[50] The purpose of this stylized Ramsian model, however, is not to explore the many systemic imperfections (procedural, practical, or otherwise) in the existing legal system that contribute to the problem of false positives and negatives.[51] Instead, the goal of the model is to solve for Pr(guilty|+) and answer the following key question: how reliable is the legal system? That is, how likely is it that a defendant who is found liable is, in fact, actually guilty of committing a wrongful act?
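As a rough sketch, the formula for Pr(guilty|+) can be written as a small Python function. The function and parameter names are hypothetical, and the input values in the usage line are chosen only for illustration; they are not the values assumed in the scenarios of this appendix.

```python
def posterior_guilty(p_pos_given_guilty: float,
                     p_pos_given_innocent: float,
                     p_guilty: float) -> float:
    """Return Pr(guilty|+) by Bayes' theorem.

    p_pos_given_guilty   -- Pr(+|guilty), the true-positive rate
    p_pos_given_innocent -- Pr(+|innocent), the false-positive rate
    p_guilty             -- Pr(guilty), the prior probability of guilt
    """
    for value in (p_pos_given_guilty, p_pos_given_innocent, p_guilty):
        if not 0.0 <= value <= 1.0:
            raise ValueError("probabilities must lie between 0 and 1")

    true_positive = p_pos_given_guilty * p_guilty
    false_positive = p_pos_given_innocent * (1.0 - p_guilty)

    # Pr(+) is the sum of true positives and false positives.
    p_positive = true_positive + false_positive
    if p_positive == 0.0:
        raise ValueError("Pr(+) is zero, so Pr(guilty|+) is undefined")
    return true_positive / p_positive


# Illustrative inputs only (assumed, not taken from the scenarios in the text).
print(round(posterior_guilty(0.8, 0.2, 0.5), 3))
```

Writing the denominator as the sum of true and false positives keeps the function in step with the expanded form of Bayes’ theorem.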
The remainder of this appendix will consider four possible scenarios or types of litigation games: i) non-random adjudication with risk-averse or virtuous moving parties; ii) non-random adjudication with risk-loving or less-than-virtuous moving parties; iii) random adjudication with risk-averse moving parties; and iv) random adjudication with risk-loving moving parties. This schema may thus be depicted in tabular form as follows:
Type of litigation game | Type of moving party
non-random adjudication | risk-averse
non-random adjudication | risk-loving
random adjudication | risk-averse
random adjudication | risk-loving
In summary, the adjudication variable in our model refers to the reliability or screening effectiveness of the process of adjudication. Non-random adjudication refers to litigation games that are 90% sensitive and 90% specific,[52] an assumption based on the classic and oft-repeated legal maxim “it is better that ten guilty men escape than that one innocent suffer”.[53] Random adjudication, by contrast, occurs when litigation games are only 50% sensitive and 50% specific and thus no more reliable than the toss of a coin.[54]
In addition, the terms risk-averse and virtuous, as applied to moving parties, refer to plaintiffs and prosecutors who play the litigation game only when they are at least 90% certain that the named defendant has committed a wrongful act, while the terms risk-loving and less-than-virtuous refer to plaintiffs and prosecutors who are willing to play the litigation game even when they are only 60% certain that the named defendant has committed a wrongful act. Stated colloquially, virtuous moving parties are civil plaintiffs who rarely file frivolous claims and criminal prosecutors who rarely abuse their discretion; by contrast, less-than-virtuous moving parties are more willing to gamble on litigation outcomes than their more virtuous colleagues.
Non-random Adjudication with Risk-averse Moving Parties
Suppose the litigation game is 90% sensitive and 90% specific, that is, suppose the process of litigation is able to determine correctly, at least 90% of the time, when a defendant has committed a wrongful act, and suppose that the process will also determine correctly, again at least 90% of the time, when a defendant has not, in fact, committed a wrongful act.[55] But even with a 90% accuracy rate, this model of non-random adjudication still suffers from a 10% error rate. Given this error rate, we must turn to Bayes’ rule to determine the posterior probability that liability will nevertheless be incorrectly imposed on an innocent defendant (i.e., the probability that a defendant who has not committed a wrongful act will be incorrectly classified as a wrongful or guilty defendant). To apply Bayes’ theorem, we must find the prior probability that any given defendant, selected at random, has in fact committed a wrongful act. What is this prior probability?
First, let the term “guilty” stand for a guilty defendant, let “innocent” represent an innocent defendant, and let the + symbol indicate the event of a positive litigation outcome for the plaintiff or prosecutor, as the case may be. That is, from the plaintiff or prosecutor’s perspective, a positive outcome, or +, occurs when liability is eventually imposed on the defendant. We now proceed to find the values for Pr(+|guilty), Pr(+|innocent), Pr(guilty), Pr(innocent), and Pr(+). To begin with, Pr(+|guilty) is the probability that a guilty defendant will be found guilty at the end of a litigation contest. Since we have assumed that the litigation process is 90% sensitive, the value for Pr(+|guilty) is equal to 0.9. By the same token, Pr(+|innocent), the probability that a particular litigation contest will produce a false positive (i.e., the probability that liability will be imposed on an innocent defendant), is equal to 0.1. This value is 0.1 since, given that the litigation process is 90% specific, the litigation game produces false positives only 10% of the time.
Now suppose that plaintiffs and prosecutors are risk-averse or virtuous parties, that is, assume that plaintiffs and prosecutors alike are willing to play the litigation game only when they are at least 90% certain that the named defendant has, in fact, committed an unlawful wrongful act. (This risk-averse conduct is considered virtuous in my model since such moving parties are less willing than their risk-loving colleagues to gamble on the outcome of litigation, or expressed in legal language, virtuous civil plaintiffs rarely file frivolous claims and virtuous criminal prosecutors rarely abuse their discretion).[56]
Accordingly, given these stringent assumptions (i.e., risk-averse moving parties and non-random adjudication), the prior probability that a given defendant is guilty is 90%, or stated formally, letting A stand for the event that the defendant is guilty, Pr(A) = Pr(guilty) = 0.9. Summing up, Pr(A) or Pr(guilty) is the prior probability, in the absence of any additional information, that a particular defendant has committed a wrongful act. As stated above, this term is equal to 0.9 since we have assumed that 90% of all named defendants are guilty. Likewise, letting B stand for the event that the defendant is innocent, we determine Pr(B) or Pr(innocent), the prior probability that a particular defendant has not committed any wrongful act. This is simply 1 – Pr(guilty) or 0.1, since 1 – 0.9 = 0.1.
Lastly, Pr(+) refers to the prior probability of a positive litigation outcome (again, positive from the plaintiff’s or prosecutor’s perspective), in the absence of any information about the defendant’s guilt or innocence. This value is found by adding the probability that a true positive result will occur (0.9 × 0.9 = 0.81), plus the probability that a false positive will happen (0.1 × 0.1 = 0.01) and is thus equal to 0.81 plus 0.01 = 0.82. Stated formally, Pr(+) = [Pr(+|guilty) × Pr(guilty)] plus [Pr(+|innocent) × Pr(innocent)]. That is, the prior probability of a positive litigation outcome, Pr(+), is the sum of true positives and false positives and, given our assumptions above, is equal to 0.82 or 82%.
Having translated all the relevant terms of Bayes’ theorem, we may now restate our Bayesian model of litigation contests and find the posterior probability, Pr(guilty|+), that civil or criminal liability will be correctly imposed on a guilty defendant (i.e., the probability that a defendant who is found liable has, in fact, committed a wrongful act):
Pr(guilty|+) = [Pr(+|guilty) × Pr(guilty)] ÷ Pr(+)
= [Pr(+|guilty) × Pr(guilty)] ÷ ([Pr(+|guilty) × Pr(guilty)] + [Pr(+|innocent) × Pr(innocent)])
= (0.9 × 0.9) ÷ [(0.9 × 0.9) + (0.1 × 0.1)]
= 0.81 ÷ 0.82 ≈ 0.988
In other words, given our rosy assumptions above, the outcome of any particular litigation game will be highly accurate. Specifically, the probability that a defendant who is found liable for a wrongful act is actually guilty of committing such wrongful act is close to 99%, though there is still a 1% probability that an innocent defendant will nonetheless be found liable. But what happens when the litigation game is played by strategic plaintiffs or zealous prosecutors? That is, what happens when plaintiffs file a greater proportion of frivolous claims (relative to the optimal level of frivolous claims) or when prosecutors routinely ‘overcharge’ criminal defendants with extraneous or vague offenses (e.g., conspiracy)? I turn to this possibility below.
Non-random Adjudication with Risk-loving Moving Parties
Suppose the litigation game is still highly sensitive and specific as before (i.e., 90% sensitive and 90% specific), but that plaintiffs and prosecutors are risk-loving or less-than-virtuous actors. Specifically, assume that the moving parties are willing to play the litigation game even when they are only 60% certain (instead of 90% certain, as I assumed earlier) that the named defendant has committed a wrongful act. (Such behavior is less-than-virtuous in our model because the moving party is less concerned with the defendant’s actual guilt than a risk-averse or virtuous moving party). The intuition behind this revised assumption is that, in reality, the litigation game might be played by litigants (as well as judges) who are engaged in rent-seeking and self-serving behavior.[57] Thus, with risk-loving moving parties, the prior probability, Pr(guilty), that a given defendant is guilty is now only 60%, while the prior probability, Pr(innocent), that a particular defendant has not committed a wrongful act is 1 – Pr(guilty), or 1 – 0.6 = 0.4. Stated formally: Pr(guilty) = 0.6, and Pr(innocent) = 0.4.
Next, what is the probability that a guilty defendant will be found guilty, or Pr(+|guilty)? In this variation of the model, the value for Pr(+|guilty) is equal to 0.9 since we continue to assume the litigation game is 90% sensitive. Pr(+|innocent), the probability that a particular litigation game will produce a false positive (i.e., the probability that liability will be imposed on an innocent defendant), remains 0.1. Lastly, recall that Pr(+) is the probability that a true positive result will occur (in this case, 0.9 × 0.6 = 0.54), plus the probability that a false positive will happen (0.1 × 0.4 = 0.04), and is thus equal to 0.54 plus 0.04 = 0.58. Stated formally, Pr(+) = [Pr(+|guilty) × Pr(guilty)] plus [Pr(+|innocent) × Pr(innocent)] = 0.54 plus 0.04 = 0.58.
Given these revised assumptions – non-random adjudication and less-than-virtuous plaintiffs – we now find the posterior probability that liability will be correctly imposed on a guilty or wrongful defendant as follows:
Pr(guilty|+) = [Pr(+|guilty) × Pr(guilty)] ÷ Pr(+)
= [Pr(+|guilty) × Pr(guilty)] ÷ ([Pr(+|guilty) × Pr(guilty)] + [Pr(+|innocent) × Pr(innocent)])
= (0.9 × 0.6) ÷ [(0.9 × 0.6) + (0.1 × 0.4)]
= 0.54 ÷ 0.58 ≈ 0.931
In this case, despite the presence of risk-loving moving parties, the outcome of any particular litigation game will still be highly reliable. Specifically, although there is a 7% chance that an innocent defendant will be found liable, the posterior probability that a defendant who is found liable for a wrongful act is actually guilty is still 93%. But now, consider what happens when litigation is a crapshoot, that is, stated formally, what happens when the litigation game is only 50% sensitive and 50% specific?
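The arithmetic of this subsection can also be read as simple head counts. The sketch below assumes a hypothetical docket of 1,000 defendants (the docket size and variable names are illustrative assumptions) and applies the 60% prior and the 90% sensitivity and specificity assumed in this subsection.

```python
# Hypothetical docket size, chosen only to make the percentages easy to read.
cases = 1000

# Risk-loving moving parties: 60% of named defendants are actually guilty.
guilty = cases * 60 // 100          # 600 guilty defendants
innocent = cases - guilty           # 400 innocent defendants

# Non-random adjudication: 90% sensitive and 90% specific.
true_positives = guilty * 90 // 100      # guilty defendants found liable
false_positives = innocent * 10 // 100   # innocent defendants found liable

found_liable = true_positives + false_positives
share_guilty = true_positives / found_liable

print(f"{found_liable} found liable, {true_positives} of them guilty "
      f"({share_guilty:.3f})")
```

Of the 580 defendants found liable in this hypothetical docket, 540 are guilty and 40 are innocent, which matches the posterior probability of roughly 93% and the error rate of roughly 7% reported in the text.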
Random Adjudication with Risk-averse Moving Parties
Suppose now that litigation outcomes are only 50% sensitive and 50% specific; that is, courts decide correctly only half of the time when a defendant has committed a wrongful act, and only half of the time when he has not. The process of litigation is thus random, no better than a coin toss, since courts will correctly determine with one-half probability, or p = 0.5, whether a defendant has committed a wrongful act.
Given this inherent randomness along with the presence of virtuous plaintiffs, we now turn to Bayes’ rule to determine the posterior probability that liability will be incorrectly imposed on an innocent defendant (i.e., the probability that a defendant who has not committed a wrongful act will be incorrectly classified as a wrongful or guilty defendant). Again, let “guilty” stand for a guilty defendant, “innocent” an innocent defendant, and the symbol + the event of a positive litigation outcome for the moving party (plaintiff or prosecutor). Next, we find the values for Pr(guilty), Pr(innocent), Pr(+|guilty), Pr(+|innocent), and Pr(+).
First, assuming that plaintiffs and prosecutors are virtuous or risk-averse actors and thus are willing to play the litigation game only when they are at least 90% certain that the named defendant is guilty, then Pr(guilty), the prior probability in the absence of other information that a particular defendant has committed a wrongful act, will be equal to 0.9, or stated formally, Pr(guilty) = 0.9. Likewise, Pr(innocent), the prior probability in the absence of other information that a particular defendant has not committed a wrongful act, is simply 1 – Pr(guilty) or 0.1, since 1 – 0.9 = 0.1.
Next, Pr(+|guilty), the probability that liability will be imposed on a defendant who is actually guilty, is 0.5 since the litigation game in this variation of the model is purely random (i.e., 50% sensitive). Similarly, Pr(+|innocent), the probability that liability will be imposed on an innocent defendant, is also 0.5 since, given these revised assumptions, the litigation game will produce a false positive half of the time the game is played against an innocent defendant.
Lastly, recall that Pr(+) is the sum of true positives and false positives, that is, the prior probability of a positive litigation outcome, positive from the plaintiff’s or prosecutor’s perspective, in the absence of any information about the defendant’s guilt or innocence. Specifically, given our assumptions above, this value is equal to 0.5, that is, 0.5 × 0.9 = 0.45 (true positives) plus 0.5 × 0.1 = 0.05 (false positives). Thus, the prior probability of a positive litigation outcome, Pr(+), absent any information about the defendant’s guilt or innocence, is equal to 50%.
With these assumptions (random adjudication and virtuous or risk-averse plaintiffs), we apply Bayes’ theorem as follows:
Pr(guilty|+) = [Pr(+|guilty) × Pr(guilty)] ÷ Pr(+) = {see below}
= [Pr(+|guilty) × Pr(guilty)] ÷ ([Pr(+|guilty) × Pr(guilty)] +
[Pr(+|innocent) × Pr(innocent)])
= (0.5 × 0.9) ÷ [(0.5)(0.9) + (0.5)(0.1)]
= 0.45 ÷ 0.50 = 0.9
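The arithmetic above can also be checked mechanically. The following is only a minimal sketch in Python, not part of the original model: the function name `posterior_guilt` and its parameter names are hypothetical labels chosen for illustration, with `sensitivity` standing for Pr(+|guilty) and `false_positive_rate` for Pr(+|innocent).

```python
import math


def posterior_guilt(prior_guilty, sensitivity, false_positive_rate):
    """Return Pr(guilty|+) using Bayes' theorem.

    prior_guilty        -- Pr(guilty), the moving party's certainty threshold
    sensitivity         -- Pr(+|guilty)
    false_positive_rate -- Pr(+|innocent)
    (All three names are illustrative assumptions, not the paper's notation.)
    """
    for value in (prior_guilty, sensitivity, false_positive_rate):
        if not 0.0 <= value <= 1.0:
            raise ValueError("probabilities must lie between 0 and 1")

    true_positives = sensitivity * prior_guilty
    false_positives = false_positive_rate * (1.0 - prior_guilty)
    pr_positive = true_positives + false_positives  # Pr(+), by total probability
    if pr_positive == 0.0:
        raise ValueError("Pr(+) is zero, so the posterior is undefined")
    return true_positives / pr_positive


# Random adjudication (a coin toss) with virtuous moving parties.
posterior = posterior_guilt(prior_guilty=0.9, sensitivity=0.5, false_positive_rate=0.5)
print(round(posterior, 3))  # 0.9
```

Under these stated assumptions, the sketch reproduces the value of 0.9 obtained by hand; changing `prior_guilty` to another certainty threshold shows how the posterior moves with the moving party's willingness to litigate.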
This result is perhaps the most surprising one thus far. Even when the litigation game is a purely random process, no better than a coin toss, the outcome of any individual litigation game will still be highly reliable, given the presence of virtuous moving parties. Specifically, under this scenario there is a 90% probability that a defendant who is found liable for a wrongful act is, in fact, actually guilty. (In other words, even when litigation is random, there is only a 10% chance that an innocent defendant will be found guilty or civilly or criminally liable). Although this value is less than the corresponding values for Pr(guilty|+) in the previous two permutations of the model (subsections ii and iii above), this difference is marginal at best, considering the enormous qualitative differences between non-random adjudication and a purely random legal system. The present permutation of the model, however, assumes the presence of virtuous plaintiffs and prosecutors. What happens when the litigation game is purely random and the moving parties are less-than-virtuous? We explore this intriguing possibility in subsection iv below.
Random Adjudication with Risk-loving Moving Parties
Now suppose the litigation game is still a crapshoot but that plaintiffs and prosecutors are risk-loving or ‘less-than-virtuous’; that is, assume that the moving parties are more willing to gamble than their virtuous colleagues. Specifically, we will assume that the litigation game is 50% sensitive and 50% specific, and that plaintiffs and prosecutors are willing to play the litigation game even when they are only 60% certain that the named defendant has committed a wrongful act. Although these assumptions do not appear to be plausible (unless we assume the presence of risk-loving actors and picture litigants as pure gamblers), this permutation of the model, however implausible, may nevertheless provide an instructive counter-factual or hypothetical illustration of the Bayesian approach to litigation.
Given these revised assumptions (i.e., random results and risk-loving or less-than-virtuous actors), we can once again turn to Bayes’ theorem to determine the posterior probability that liability will be incorrectly imposed on an innocent defendant (i.e., the probability that a defendant who has not committed a wrongful act will be incorrectly classified as a wrongful or guilty defendant). Once again, “guilty” stands for a guilty defendant, “innocent” indicates an innocent defendant, and the symbol + represents the event of a positive litigation outcome for the plaintiff or prosecutor.
As such, Pr(guilty), the prior probability (in the absence of any additional information) that a particular defendant has committed a wrongful act, is equal to 0.6, while Pr(innocent), the prior probability that a particular defendant has not committed a wrongful act, is 0.4 (i.e., 1 – Pr(guilty), or 1 – 0.6). Next, Pr(+|guilty), the probability that liability will be imposed on a defendant who is actually guilty, and Pr(+|innocent), the probability that liability will be imposed on an innocent defendant, are both equal to 0.5 since, given our assumptions, this version of the litigation game is purely random. Lastly, Pr(+), the sum of true positives and false positives, is also 0.5 since, given our assumptions above, 0.5 × 0.6 = 0.3 (true positives) and 0.5 × 0.4 = 0.2 (false positives), or put another way, the prior probability of a positive litigation outcome (again, from the plaintiff’s or prosecutor’s perspective), absent any information about the defendant’s guilt or innocence, is equal to 50%.
Therefore, given random adjudication and risk-loving plaintiffs, we now apply Bayes’ theorem as follows:
Pr(guilty|+) = [Pr(+|guilty) × Pr(guilty)] ÷ Pr(+) = {see below}
= [Pr(+|guilty) × Pr(guilty)] ÷ ([Pr(+|guilty) × Pr(guilty)] +
[Pr(+|innocent) × Pr(innocent)])
= (0.5 × 0.6) ÷ [(0.5)(0.6) + (0.5)(0.4)]
= 0.3 ÷ 0.5 = 0.6
What is most surprising about this result is the ability of the litigation process to produce reliable results more than half the time, even when the underlying litigation game itself is purely random and even when the actors are less than virtuous. Specifically, the probability that the outcome of any individual litigation game will be accurate is 60%, even though the underlying litigation game is purely random, no more reliable than a coin toss. One way of explaining this potential paradox is to take another look at the Pr(guilty) term: the prior probability in the absence of additional information that a defendant selected at random is guilty (i.e., the prior probability that a particular defendant has committed a wrongful act). This prior probability term exerts a decisive influence in the fourth permutation of our model precisely because the outcome of litigation is purely random. That is, when litigation is a crapshoot, or to be more precise, when litigation is a coin toss, both the prior and posterior probabilities of the defendant’s guilt are the same. Here, since Pr(guilty) = 0.6, then Pr(guilty|+) = 0.6.
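The equality of the prior and posterior probabilities under random adjudication can be shown in general terms. The short derivation below is a sketch offered for illustration only; the symbols p and s are shorthand introduced here (they do not appear in the original model), with p = Pr(guilty) and s = Pr(+|guilty) = Pr(+|innocent), and it assumes s > 0.

```latex
\begin{align*}
\Pr(\text{guilty} \mid +)
  &= \frac{\Pr(+ \mid \text{guilty})\,\Pr(\text{guilty})}
          {\Pr(+ \mid \text{guilty})\,\Pr(\text{guilty})
           + \Pr(+ \mid \text{innocent})\,\Pr(\text{innocent})} \\
  &= \frac{s\,p}{s\,p + s\,(1 - p)}
   = \frac{s\,p}{s}
   = p .
\end{align*}
```

With s = 0.5, setting p = 0.9 returns 0.9 and setting p = 0.6 returns 0.6, which matches the two random-adjudication permutations worked out above. On these assumptions, a verdict that is equally likely for guilty and innocent defendants carries no information, so the posterior simply restates the moving party's prior.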
* J.D., Yale Law School. B.A., University of California at Santa Barbara. Associate Professor, Pontificia Universidad Católica de Puerto Rico (PUCPR). Associate Instructor of Law, University of Central Florida (UCF).
** O.W. Holmes, Jr., The Common Law, Dover, New York, 1991 [1881], p. 1.
[1] For Bruno de Finetti’s landmark contributions to our understanding of probability, see B. de Finetti, Theory of Probability: A Critical Introductory Treatment, Wiley, New Jersey, 1974.
[2] C. Misak, Frank Ramsey: A Sheer Excess of Powers, Oxford University Press, Oxford, 2020.
[3] In addition, to further demonstrate the relevance of Frank Ramsey to law and legal theory, this review includes a technical appendix that presents a probabilistic model of litigation inspired by Ramsey’s probabilistic ideas.
[4] See C. Misak, op. cit., p. 30.
[5] Id., p. 49.
[6] Ibid.
[7] Id., pp. 48-49.
[8] J.M. Keynes, A Treatise on Probability, Dover, New York, 2004 [1921].
[9] See B. Russell, ‘Review: A Treatise on Probability by John Maynard Keynes’, in The Mathematical Gazette, n. 32, 1948, p. 152.
[10] Ibid.
[11] See C. Misak, op. cit., p. 113 (emphasis added).
[12] Id., pp. 114-115. For the record, Ramsey published his review of Keynes’s Treatise on Probability in the January 1922 issue of Cambridge Magazine, i.e. while he was still an undergraduate!
[13] For students of the common law (myself included), Ramsey’s approach to probability may sound familiar. See, e.g., O.W. Holmes, op. cit., p. 1 (“The life of the law has not been logic: it has been experience”).
[14] See, e.g., L. Solum, ‘Legal Theory Lexicon: It Takes a Theory to Beat a Theory’, Legal Theory Blog, 21 October 2012, available at https://perma.cc/82QH-8A4B.
[15] See C. Misak, op. cit., ch. 7.
[16] See id., pp. 205-208.
[17] It was during this time that Ramsey received the news of his appointment to a lectureship at King’s College. See C. Misak, op. cit., pp. 178-181.
[18] Id., p. 161.
[19] According to Misak, it was at this meeting that G.E. Moore read his now famous paper ‘A defence of common sense’. See C. Misak, op. cit., p. 205. Moore’s paper is reprinted in G.E. Moore, Philosophical Papers, Allen & Unwin, Crows Nest, 1959.
[20] See C. Misak, op. cit., p. 263.
[21] See F. Ramsey, ‘Truth and Probability’, in R.B. Braithwaite (edited by), The Foundations of Mathematics and Other Logical Essays, Routledge & Paul, London, 1931.
[22] See, e.g., R.T. Cox, ‘Probability, Frequency, and Reasonable Expectation’, in American Journal of Physics, n. 14, 1946, p. 1.
[23] See C. Misak, op. cit., p. 268.
[24] Id., p. 271.
[25] See generally F. Ramsey, ‘Truth and Probability’, op. cit. See also F. MacBride, et al., ‘Frank Ramsey’, in E.N. Zalta & U. Nodelman (edited by), The Stanford Encyclopedia of Philosophy, 14 August 2019, available at https://plato.stanford.edu/entries/ramsey/. Misak’s biography itself also includes a separate summary by Nils-Eric Sahlin of the technical details of Ramsey’s subjective or betting approach to probability. See C. Misak, op. cit., pp. 272-273.
[26] See generally A. Hájek, ‘Interpretations of Probability’, in E.N. Zalta (edited by), The Stanford Encyclopedia of Philosophy, 28 August 2019, available at https://plato.stanford.edu/archives/fall2019/entries/probability-interpret/.
[27] See C. Misak, op. cit., p. 266.
[28] In brief, the term “Dutch book” originates from the worlds of gambling and bookmaking. In the context of probability theory, a Dutch book refers to a series of bets that are structured in such a way that a bettor is guaranteed to either win or lose regardless of the outcome of an event. See generally Susan Vineberg, ‘Dutch Book Arguments’, in E.N. Zalta, U. Nodelman (edited by), The Stanford Encyclopedia of Philosophy, 14 May 2022, available at https://plato.stanford.edu/archives/spr2016/entries/dutch-book/.
[29] In summary, a prediction market is a type of financial market that allows individuals to trade contracts that are based on the outcome of future events. These events can range from political elections and sports results to economic indicators and even natural phenomena. See, e.g., J. Wolfers, E. Zitzewitz, ‘Prediction Markets’, in Journal of Economic Perspectives, n. 18, 2004, p. 107. For an extension of the prediction market model to the retrodiction of past events, see F.E. Guerra-Pujol, ‘Truth Markets’, Social Science Research Network, 24 January 2023, available at https://perma.cc/255J-6PVK.
[30] Among other things, prediction markets are designed to aggregate and utilize the collective wisdom and information of participants in order to generate more accurate predictions about the likelihood of various outcomes than other methods like polling or experts can. See, e.g., K.J. Arrow et al., ‘The Promise of Prediction Markets’, in Science, n. 320, 2008, p. 877.
[31] See generally S.G. Vick, Degrees of Belief, ASCE Press, Reston, 2002, ch. 2.
[32] Broadly speaking, a law case or legal issue is described as a ‘close case’ when its outcome is contested, leading to a split decision among the judges who are deciding the case or issue, with some favoring one outcome and others favoring a different outcome. For some examples of close cases, see E.A. Posner, A. Vermeule, ‘The Votes of Other Judges’, in Georgetown Law Journal, 2016, pp. 178-179.
[33] See F.E. Guerra-Pujol, ‘Why Don’t Juries Try Range Voting’, in Criminal Law Bulletin, n. 51, 2015, p. 680.
[34] See F.E. Guerra-Pujol, ‘The Case for Bayesian Judges’, in Journal of Legal Metrics, n. 6, 2019, p. 13.
[35] For an extended discussion of issue-voting versus outcome-voting by courts, see D. Post, S.C. Salop, ‘Rowing against the Tidewater: A Theory of Voting by Multijudge Panels’, in Georgetown Law Journal, n. 80, 1992, p. 743.
[36] See W.D. Smith, ‘Range Voting’, unpublished manuscript, 28 November 2000, available at https://www.rangevoting.org/WarrenSmithPages/homepage/rangevote.pdf.
[37] See C. Hillinger, ‘The Case for Utilitarian Voting’, Munich Discussion Paper 2005-11, May 2005, available at https://perma.cc/657U-VNMD.
[38] See, e.g., Wikipedia, ‘Score Voting’, Wikipedia: The Free Encyclopedia, 20 July 2023, available at https://en.wikipedia.org/wiki/Score_voting.
[39] See A. Hylland, R. Zeckhauser, ‘A Mechanism for Selecting Public Goods when Preferences Must Be Elicited’, Kennedy School of Government Discussion Paper 70D, December 1980, available at https://rangevoting.org/HylZeck1980.pdf.
[40] See K.J. Arrow, Social Choice and Individual Values, 2 ed., Yale University Press, New Haven, 1970.
[41] See, e.g., M.C. Galavotti, ‘The Notion of Subjective Probability in the Work of Ramsey and de Finetti’, in Theoria, n. 57, 1991, p. 239.
[42] See generally K. Popper, Conjectures and Refutations, 2 ed., Routledge, London, 2002.
[43] See generally T. Schelling, The Strategy of Conflict, 2 ed., Harvard University Press, Cambridge, 1980.
[44] An economist might ask: “What is the ‘optimal’ level of love?”.
[45] This Appendix is based on my previous work. See F.E. Guerra-Pujol, ‘A Bayesian Model of the Litigation Game’, in European Journal of Legal Studies, n. 4, 2011, p. 220.
[46] Put another way, litigation (whether civil or criminal) is a contest in which the moving party, the plaintiff or prosecutor, attempts to impose civil or criminal liability on the defendant for the commission of a wrongful act, and like the term litigation, I define wrongful act broadly to include both civil wrongs, such as torts and breaches of contract, as well as criminal behavior.
[47] Seen from the defendant’s perspective, litigation is a contest in which the defendant attempts to avoid the imposition of liability or minimize his liability.
[48] My model also ignores the temporal dimension of adjudication; instead, I assume for simplicity that litigation is an instantaneous event, like a coin toss or the roll of a die.
[49] Ideally, of course, liability should be imposed only when a defendant has actually committed a wrongful act, and conversely, no liability should be imposed on innocent defendants (stated formally, in an ideal or perfect legal system the value for Pr(guilty|+) should be equal to or close to one: Pr(A|B) ≈ 1), but in reality false negatives and false positives will occur for a wide variety of reasons, such as heightened pleading standards and abuse of discovery in civil actions and prosecutorial discretion and prosecutorial misconduct in criminal cases.
[50] In the context of litigation, a false positive or Type I error occurs when a defendant who has not committed a wrongful act is nevertheless found liable for the commission of such act. By contrast, a false negative or Type II error occurs when a tortious or guilty defendant is able to avoid the imposition of liability. Stated colloquially, some guilty defendants will be able to avoid the imposition of liability, while some innocent ones will be punished.
[51] See, e.g., M. Galanter, ‘Why the “Haves” Come Out Ahead: Speculations on the Limits of Legal Change’, in Law & Society Review, n. 9, 1974, p. 95.
[52] The adjudication variable can never be either 100% sensitive or 100% specific since errors are inevitable in any process of adjudication, regardless of the litigation procedures that are in place.
[53] See W. Blackstone, Commentaries on the Laws of England, 2007 [1769], Vol. 4, p. 358, quoted in A. Volokh, ‘N Guilty Men’, in University of Pennsylvania Law Review, n. 146, 1997, p. 173.
[54] See, e.g., F.E. Guerra-Pujol, ‘Chance and Litigation’, in Boston University Public Interest Law Journal, n. 21, 2011, p. 45.
[55] The intuition behind this assumption (non-random adjudication) is that reliable legal procedures will tend to produce just and fair results (Hart & Sacks 1994). The existence of reliable adjudication procedures in which liability is imposed only on guilty defendants is not a sufficient condition for justice. When a defendant has broken an unjust or unfair law (licensure requirements and racial segregation laws quickly come to mind), for example, justice would be better served by an unreliable adjudication procedure (i.e., by not enforcing the unjust or unfair law in the first place). But putting aside the underlying meaning of justice, such a litigation game appears to be a highly accurate one, since it will correctly determine with 90% probability, or nine times out of 10, whether the defendant has or has not committed a wrongful act, an essential precondition before liability may justly be imposed.
[56] I will relax these assumptions later.
[57] For further exploration of the problem of rent-seeking in law, see G. Tullock, The Logic of Law, Basic Books, New York, 1971. In principle, a more hard-core risk-loving moving party might be willing to gamble on the litigation game even when he or she is only 50% certain of the outcome. Nevertheless, I assume that a risk-loving moving party requires a 60% probability of a positive litigation outcome simply because he or she must expend resources to play the litigation game. Put another way, since litigation is not costless, the higher the cost of playing the litigation game, the more risk-averse an otherwise risk-loving moving party will be.