Making Sense of the Legal Analytics Revolution (Part 1 of 2)

October 18, 2021

Reprinted with permission from my article in the October 2017 issue of ALI CLE’s The Practical Lawyer.

“In God we trust. All others must bring data.”

— Professor William Edwards Deming

All of us who often speak and write about the ongoing revolution in data analytics for litigation have heard it from at least some of our fellow lawyers: “Interesting, but so what?”

Here’s the answer in a nutshell. One often hears that business hates litigation because it’s enormously expensive and risky. There’s a degree of truth to that, but it’s far from the whole truth. Business doesn’t dislike expense or risk per se. Business dislikes unquantified expense and risk. As the maxim often (incorrectly) attributed to Peter Drucker goes, “You can’t manage what you can’t measure.”

Don’t believe me? If your client offers to sell an investment bank a two billion dollar package of mortgages, the bank gets nervous. But tell the bank that based on the past ten years of data, 65.78 percent of the mortgages will be paid off early, 24.41 percent will be paid off on time, and 9.81 percent will default, and they know how to deal with that.

It’s the same thing in litigation. For generations, most facts that would help a business person understand the risks involved have been solely anecdotal: this judge is somewhat pro-plaintiff or pro-defendant; the opposing counsel has a reputation for being aggressive or smart (or not); juries in this jurisdiction often make runaway damage awards or are notoriously parsimonious. But every one of those anecdotal impressions and bits of conventional wisdom can be approached from a data-driven perspective, quantified and proven (or disproven). Do that, and we’ve taken a giant step towards approaching litigation the way a business person approaches business—by quantifying and managing every aspect of the risk.

I hear lawyers talking about “early adopters” of data analytics tools in litigation, but the truth is, we’re not early adopters by a long shot. The business world has been investing billions in data analytics tools for a generation in order to understand and manage its risks.

Tech companies use algorithms to choose among job applicants and assign “flight risk” scores to employees according to how likely each is thought to be to leave. Billions of dollars in stock are traded every day by algorithms designed to predict gains and reduce risk. The websites of Netflix and Amazon (among many others) track what you look at and buy or rent in order to recommend additional choices you’ll be interested in. In 2009, Google developed a model using search data which predicted the spread of a flu epidemic virtually in real time. UPS has saved millions by placing monitors in its trucks to predict mechanical failures and schedule preventive maintenance. The company’s algorithm for planning drivers’ optimal routes shaved 30 million miles off drivers’ routes in a single year. Early in his term as New York Mayor, Michael Bloomberg created an analytics task force that crunched massive amounts of data gathered from all over the city to determine which illegal conversions (structures cut up into many smaller units without the appropriate inspections and licensing) were most likely to be fire hazards. Political campaigns now routinely use mountains of data not only to identify persuadable voters, but to determine the method most likely to work with each one.

The application of data analytic techniques to the study of judicial decision making arguably begins with a 1922 article for the Illinois Law Review by political scientist Charles Grove Haines. Haines reviewed over 15,000 cases of defendants convicted of public intoxication in the New York magistrate courts. He showed that one judge discharged only one of 566 cases, another 18 percent of his cases, and still another fully 54 percent. Haines argued that his data showed that case results were reflecting to some degree the “temperament . . . personality . . . education, environment, and personal traits of the magistrates.”

In 1948, political scientist C. Herman Pritchett published The Roosevelt Court: A Study in Judicial Politics and Values, 1937-1947. Pritchett presented a series of charts showing how often various combinations of Justices had voted together in different types of cases. He contended that the sharp increase in the dissent rate at the U.S. Supreme Court in the late 1930s undercut the “formalist” philosophy that law was an objective reality which judges merely found and declared.

Another landmark in the judicial analytics literature, the U.S. Supreme Court Database, traces its beginnings to the work of Professor Harold Spaeth about three decades ago. Professor Spaeth created a database which classified every vote by a Supreme Court Justice in every argued case for the past five decades. Today, thanks to the work of Spaeth and his colleagues Professors Jeffrey Segal, Lee Epstein and Sarah Benesh, the database has been expanded to encompass more than two hundred data points from every case the Supreme Court has decided since 1791. The Supreme Court Database is the foundation of most data analytic studies of the Supreme Court’s work.

Professors Spaeth and Segal also wrote another classic, The Supreme Court and the Attitudinal Model, in which they proposed a model arguing that a judge’s personal characteristics (ideology, background, gender, and so on) and so-called “panel effects” (the impact of having judges of divergent backgrounds deciding cases together as a single, institutional decision maker) could reliably predict case outcomes.

The data analytic approach began to attract attention in the appellate bar in 2013, with the publication of The Behavior of Federal Judges: A Theoretical & Empirical Study of Rational Choice. Judge Richard Posner and Professors Lee Epstein and William Landes applied various regression techniques to a theory of judicial decision making with its roots in microeconomic theory, discussing a wide variety of issues from the academic literature.

Although the litigation analytics industry is changing rapidly, the four principal vendors are Lex Machina, Ravel Law, Bloomberg Litigation Analytics and Premonition Analytics. Lex Machina and Ravel Law began as startups (indeed, both began at Stanford Law School), but LexisNexis has now purchased both companies. Lex Machina is fully integrated with the Lexis platform, and Ravel will be integrated in the coming months. Although there are certain areas of overlap, each of the four analytics vendors has taken a somewhat different approach and offers unique advantages. For example, Premonition’s database not only covers most state and all federal courts, but also offers data on courts in the United Kingdom, Ireland, Australia, the Netherlands and the Virgin Islands.

The role of analytics in litigation begins with the earliest moments of a lawsuit. If you’re representing the defendant, Bloomberg and Lex Machina both offer useful tools for evaluating the plaintiff. How often does the plaintiff file litigation, and in what areas of the law? Were earlier lawsuits filed in different jurisdictions from your new case, and if so, why? Scanning your opponent’s filings in cases in other jurisdictions can sometimes reveal useful admissions or contradictory positions. If your case is a putative class action, these searches can help determine at the earliest moment whether the named plaintiff has filed other actions, perhaps against other members of your client’s industry. Have the plaintiff’s earlier actions ended in trials, settlements or dismissals? This can give counsel an early indication of just how aggressive the plaintiff is likely to be.

All four major vendors have useful tools for researching the judge assigned to a new case. Ravel Law has analytics for every federal judge and magistrate in the country, as well as all state appellate judges. State court analytics research is always a challenge because of the number of states whose dockets are not yet available in electronic form, but Premonition Analytics claims to have as large a state-court database as Lexis, Westlaw and Bloomberg combined. How much experience does your judge have in the area of law your case involves compared to other judges in the jurisdiction? How often does the judge grant partial or complete dismissals or summary judgments early on? How often does the judge preside over jury trials? Were there jury awards in any of those trials, and how do they compare to other judges’ trials? What is defendants’ winning percentage in recent years before your judge? Ravel Law and Bloomberg can provide data on how often your trial judge’s opinions are cited by other courts (an indicator of how well respected the judge is by his or her peers), as well as how often the judge is appealed, and how many of those appeals have been partially or completely successful. The data can be narrowed by date in order to focus on the most recent decisions, as well as by area of law. Say your assigned judge appears to be more frequently appealed and reversed than his or her colleagues in the jurisdiction. Are the reversals evenly distributed across time, or concentrated in any particular area of law? If your judge’s previous decisions in the area of law where your case arises have been reversed unusually often, it can influence how you conduct the litigation. Counsel can keep all this data current through Premonition’s Vigil court alert system, which patrols Premonition’s immense litigation database and can give counsel hourly alerts and updates, keyed to party name, judge, attorney or case type, from federal, state and county courts.

Many jurisdictions give parties one opportunity, before any substantive ruling is made, to seek recusal of the assigned judge as a matter of right, without proof of prejudice. Data-driven judge research can help inform your decision as to whether to exercise that right.

Lex Machina’s analytics platform focuses on several specific areas of law, giving counsel a wealth of information for researching a jurisdiction (additional databases on more areas of law will be coming soon). For example, in antitrust, cases are tagged to distinguish among class actions, government enforcement, Robinson-Patman Act cases and others. The platform is integrated with the MDL database, linking procedurally connected cases. The database reflects both damages (whether through a jury award or a settlement) and additional remedies, such as divestiture and injunction. Cases are also tagged by the specific antitrust issue, such as Sherman Act Section 1, Clayton Act Section 7, the rule of reason or antitrust exemptions. The commercial litigation data includes the nature of the resolution, any compensatory or punitive damages, and the legal finding (contract breach, rescission, unjust enrichment, trade secret misappropriation, and many more). The copyright database similarly tracks damages, findings and remedies, and allows users to exclude “copyright troll” filings from their data. Lex Machina’s federal employment law database includes tags for the type of damages (backpay, liquidated damages, punitive damages and emotional distress), the nature of any finding, and the remedy given. The patent litigation database includes many similar fields, but also a patent portfolio evaluator, isolating which patents have been litigated, and a patent similarity engine, which finds new patents and tracks their litigation history. The securities litigation database enables users to focus on the type of alleged violation, tracking the most relevant outcomes, and the trademark litigation database contains data for the legal issues and findings, damages and remedies in each case.

Analytics research is important for the plaintiffs’ bar as well. Bloomberg’s Legal Analytics platform is integrated with its enormous library of corporate data covering 70,000 publicly held and 3.5 million private companies. Counsel can survey a company’s litigation history, and the information is keyed to the underlying dockets. The data can be focused by jurisdiction or date, as well as to include or exclude subsidiaries. Lex Machina’s Comparator app can compare not only the length of time particular judges’ cases tend to take to reach key milestones but also previous outcomes, including damages awards and attorneys’ fees awards. A plaintiffs’ firm can use such data in cases where there are multiple possible venues to select the jurisdiction likely to deliver the most favorable result in the shortest time.
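For readers who want to see what a venue comparison of this kind reduces to, here is a minimal sketch in Python. It is not any vendor's method or API: the venue names, the case records and the function name `median_days_by_venue` are all invented for illustration, and it looks at only one milestone (days from filing to trial).

```python
from statistics import median

# Hypothetical records: (venue, days from filing to trial).
# These figures are invented for illustration, not drawn from any real docket.
CASES = [
    ("District X", 410), ("District X", 520), ("District X", 465),
    ("District Y", 700), ("District Y", 640), ("District Y", 810),
    ("District Y", 655),
]

def median_days_by_venue(cases):
    """Group cases by venue and return the median days to trial for each."""
    by_venue = {}
    for venue, days in cases:
        by_venue.setdefault(venue, []).append(days)
    return {venue: median(days) for venue, days in by_venue.items()}

result = median_days_by_venue(CASES)
for venue, days in sorted(result.items(), key=lambda item: item[1]):
    print(f"{venue}: median {days} days to trial")
```

The median is used rather than the mean so that one unusually slow case does not distort the picture. A real comparison would also weigh outcomes, not just speed, as the paragraph above notes.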

One bit of conventional wisdom that is commonly heard in the defense bar is that defendants should generally remove cases to federal court when they have the right to do so because juries are less prone to extreme verdicts and the judges are more favorable to defendants. Although comprehensive data on state court trial judges is still less common than data on federal judges, all four major analytic platforms can help evaluate courts and compare judges, giving a client a data-driven basis for making the removal decision.

Researching your opposing counsel is important for both defendants and plaintiffs. How aggressive is opposing counsel likely to be? Bloomberg Analytics covers more than 7,000 law firms, and enables users to focus results by clients, date and jurisdiction. Is your opposing counsel in front of your judge all the time? If so, that can inform decisions like whether to seek of-right substitution of the judge or remove the case. What were the results of those earlier lawsuits? Reviewing opposing counsel’s client list can suggest how experienced opposing counsel is in the area of law where your case arises. Lex Machina’s Law Firms Comparator also enables the user to compare opposing counsel to their peers, and get an idea of what opposing counsel’s approach to the lawsuit is likely to be. Lex Machina’s app enables counsel to compare opposing counsel’s previous cases by open and terminated cases, days elapsed to key events in the case, case resolutions and case results. In preparing this article, I reviewed a report generated by Lex Machina’s Law Firms Comparator and learned several things I didn’t know about my own firm’s practice. Ravel Law’s Firm Analytics enables counsel to study similar data about one’s opponent, focused by practice area, court, judge, time or proceeding, or all of the above. Firm Analytics also compares opposing counsel to other law firms in the jurisdiction, showing whether counsel appears before the trial judge frequently, and whether they tend to win (or lose) more often than comparable firms. All this information gives counsel a tremendous leg up as far as estimating how expensive the litigation is likely to be.

As you begin to develop the facts of a case, motions begin to suggest themselves. Is your client’s connection to the jurisdiction sufficiently tenuous to support a motion to dismiss for lack of personal jurisdiction, or for change of venue? Has the plaintiff failed to satisfy the Twombly/Iqbal standard by stating a plausible claim? Discovery motions to compel and for protective orders are commonplace, and inevitably defense counsel will face the question of whether to file a motion for summary judgment.

Ravel Law’s platform has extensive resources for motions research. For every federal judge, the system can show you how likely the judge is to grant, partially grant or deny a total of 90+ motions: not just the easy ones like motions for summary judgment or to dismiss, but motions to stay proceedings or remand to state court, motions to certify for interlocutory appeal, motions for attorneys’ fees, motions to compel or for an injunction and motions in limine. This can be an enormous savings in both time and money for your clients. Even where examining the facts suggests that a motion for summary judgment might be in order, that calculus might look very different when one learns that the trial judge has granted only 18 percent of the summary judgment motions brought before him or her since 2010.
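The arithmetic behind a figure like that 18 percent is simple, and a short Python sketch makes it concrete. Everything here is an assumption for illustration: the judges, the rulings and the function name `grant_rates` are invented, and this is not how any vendor's platform is implemented. The sketch counts only outright grants; partial grants are tallied in the total but not as grants.

```python
from collections import Counter, defaultdict

# Hypothetical rulings: (judge, motion type, outcome). Invented for illustration.
RULINGS = (
    [("Judge A", "summary judgment", "granted")] * 2
    + [("Judge A", "summary judgment", "denied")] * 9
    + [("Judge B", "summary judgment", "granted")] * 3
    + [("Judge B", "summary judgment", "denied")] * 3
    + [("Judge A", "motion to dismiss", "granted")] * 4
)

def grant_rates(rulings, motion_type):
    """Return {judge: (granted, total, rate)} for a single motion type."""
    counts = defaultdict(Counter)
    for judge, mtype, outcome in rulings:
        if mtype == motion_type:
            counts[judge][outcome] += 1
    rates = {}
    for judge, outcomes in counts.items():
        total = sum(outcomes.values())
        granted = outcomes["granted"]  # outright grants only
        rates[judge] = (granted, total, granted / total)
    return rates

rates = grant_rates(RULINGS, "summary judgment")
for judge, (granted, total, rate) in sorted(rates.items()):
    print(f"{judge}: granted {granted} of {total} ({rate:.0%})")
```

With so few rulings per judge, a percentage like this is only a rough signal; the more motions in the sample, the more weight the number can bear.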

Image courtesy of Flickr by Adam Moss (no changes).