The Collection

The Genesis of the Database

The U.S. Supreme Court Database traces its history back four decades, to when Harold J. Spaeth asked the National Science Foundation to fund a database so rich in content that multiple users (even those with vastly distinct projects and purposes in mind) could draw on it. Professor Spaeth’s goal was at once refreshingly simple and extremely ambitious: to produce a database that would include and classify every single vote by a Supreme Court justice in all argued cases over a five-decade period. After securing the funding, Spaeth collected and coded the data, performed reliability checks, and eventually assembled the Database. In the late 1980s, he made it (and the documentation necessary to use it) publicly available.

Since then, the Database has been updated at the end of each Supreme Court term. It has also been backdated to 1791, so that today there are two versions of the Database: the Modern Database, which covers 1946 to the present, and the Legacy Database, which begins in 1791 and ends with the 1945 term. These Databases can be easily combined.
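As a rough illustration of what combining the two Databases might look like, the sketch below stacks the rows of two CSV exports into a single list. It rests on assumptions that are not stated on this page: that each Database has been downloaded as a CSV file, that both files carry identical column headers, and that the columns are named caseId and term (placeholders chosen for this example). The function name combine_releases and the two sample rows are likewise hypothetical, not real records.

```python
import csv
import io


def combine_releases(legacy_file, modern_file):
    """Stack the rows of two CSV files that share the same columns.

    Hypothetical helper: both arguments are open text files (or any
    iterable of lines) whose first line is a header row.
    """
    legacy = csv.DictReader(legacy_file)
    modern = csv.DictReader(modern_file)
    legacy_rows = list(legacy)
    modern_rows = list(modern)
    # Refuse to combine files whose headers differ, rather than
    # silently producing rows with mismatched fields.
    if legacy.fieldnames != modern.fieldnames:
        raise ValueError("the two files do not share the same columns")
    # Legacy rows (through the 1945 term) come first, then Modern rows.
    return legacy_rows + modern_rows


# Placeholder rows for demonstration only; the column names and
# values are assumptions, not taken from the actual Databases.
legacy_csv = io.StringIO("caseId,term\nlegacy-1,1945\n")
modern_csv = io.StringIO("caseId,term\nmodern-1,1946\n")

combined = combine_releases(legacy_csv, modern_csv)
print(len(combined))
```

With real downloads, the two io.StringIO objects would be replaced by files opened with open(path, newline=""), where the paths point to wherever the exports were saved.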

Both versions of the Database house 247 pieces of information for each case, roughly broken down into six categories:

Identification variables (e.g., citations and docket numbers)
Background variables (e.g., how the Court took jurisdiction, origin and source of the case, the reason the Court agreed to decide it)
Chronological variables (e.g., the date of decision, term of Court, natural court)
Substantive variables (e.g., legal provisions, issues, direction of decision)
Outcome variables (e.g., disposition of the case, winning party, formal alteration of precedent, declaration of unconstitutionality)
Voting and opinion variables (e.g., how the individual justices voted, their opinions and agreement scores)

The success of Spaeth’s Database is without question. Virtually all systematic analyses of the contemporary Supreme Court and its members have relied on it. This holds for research conducted by social scientists and their graduate students and, increasingly, by legal academics; and it holds for quantitative and qualitative studies, as well as those more descriptive in nature. In fact, several inventories of peer-reviewed journals show that it is the rare article on the Court that derives its data from an alternative source. Monographs published by top presses also regularly rely on the Database, and the many numerical studies of the Court receiving public attention in recent years have made liberal use of the data it houses. By the same token, journalists seeking to illuminate dimensions of the Court’s work regularly deploy Spaeth’s product; indeed, Linda Greenhouse, the Pulitzer Prize-winning reporter, once referred to it as “a computerized treasure trove…created under a grant from the National Science Foundation,” and has cited it (or research relying on it) in her writings.

In short, the U.S. Supreme Court Database has not just helped fill gaps in our knowledge. It is one of those rare creatures in the law and social science world: an invention that has substantially advanced a large area of study, inspiring research by scholars hailing from no fewer than three and as many as seven disciplines.

This Website

This website houses the most recent versions of the Modern and Legacy Databases. It also includes earlier releases of each so that users can replicate any analysis conducted with earlier versions of the data, whether they downloaded (but discarded) those versions or used them on the site.