Lower Court Compliance with Precedent
Americans appear to be complacent in the belief that the courts—and the U.S. Supreme Court in particular—constitute a powerful branch of government. We rarely consider the degree to which the courts actually possess power, and the public continues to come to the courts with its problems assuming that an effective judicial remedy will be forthcoming (Scheingold 2004). Indeed, some even lament the “excessive power” that this unelected branch wields in society (Ely 1980). However, as Hamilton famously put it, the Court has “no influence over either the sword or the purse … It may truly be said to have neither force nor will, but merely judgment” (Hamilton). Though there is a Supreme Court police force, the mission of that force is to protect the justices, the Court, and visitors to the Court, not to compel other actors to follow the will of the Court.1 In other words, the Court makes decisions but those decisions are not self-executing; the Court’s rulings are given effect only through the actions of others, and the reaction of those other actors is not always perfectly in concert with the Court’s rulings (Baum 2002). This is reflected in the famous quote attributed to President Andrew Jackson regarding a decision of the Court under Chief Justice Marshall: “John Marshall has made his decision, now let him enforce it!” The historical evidence indicates that Jackson made no such statement (Boller and George 1989, 53). However, it persists in the political folklore, certainly in part because it reflects the Court’s predicament when it comes to enforcement of its decisions (if not the accuracy of the president’s words). How, then, in the absence of the usual carrots and sticks relied upon to induce others to “fall in line,” has the Supreme Court become such a significant player in American politics? How does it have impact in American society?
Our focus in this chapter is on one key dimension of the Court’s power: its ability to induce lower courts to abide by its precedents, hence increasing its influence and impact. As a preliminary matter, we note that the question of what impact the Supreme Court has is certainly much broader than the extent to which the nation’s high court can command lower court compliance. Understanding the impact of the Court requires understanding the ability of the Court to exert influence in society writ large.2 Accordingly, alternative foci for the study of the Court’s impact include, for example, the Court’s influence on congressional behavior (Martin 2001), the development and administration of higher education admissions policies (Taylor, Haynie, and Sill 2008), or the presence of small-town displays of religious symbols in late December (Segal, Spaeth, and Benesh 2005, 368).3 Scholars vigorously debate whether the Court can indeed exert a broad impact in society given its enforcement challenges (Hall 2011; Rosenberg 2008; cf. McCann 1994), and some even suggest that the Court, when it makes controversial decisions in a given issue area, can make matters worse by creating a backlash (Keck 2009). Nonetheless, lower court compliance certainly affects Supreme Court impact—without compliance by lower courts and other actors affected by Supreme Court decisions and without their responsiveness to changes in Supreme Court doctrine, the Court can have no impact.4 Thus, the appeal of focusing on lower court compliance lies in the fact that the lower courts constitute “the first link in the chain of events that gives a judicial decision its impact” (Canon and Johnson 1999, 29).
We begin our consideration of compliance by examining the leading models of lower court compliance. We then discuss the legitimacy of the Supreme Court and the power it derives from that legitimacy, considering the extent to which the mere moral force of Supreme Court authority might compel compliance with its rulings by the lower state and federal courts, its most important agents. Our inquiry is ultimately directed at the compelling question of whether it is inevitable that the lower courts will faithfully implement the precedents of the Supreme Court.
Compliance with the Supreme Court
Scholars studying the Supreme Court originally paid little heed to how the Court’s rulings were treated by lower courts. Lower court compliance was simply taken as a given. As students of the Court became aware that compliance was not, in fact, a given (perhaps prompted by the Court’s efforts to desegregate public schools), they devoted increasing time and attention to understanding whether and how lower courts responded to the rulings of the nation’s highest court. Not surprisingly, then, early analyses of lower court compliance centered on evident noncompliance on the part of lower courts with Supreme Court precedents in controversial areas of the law, such as civil rights and liberties (see, for example, Murphy 1959; Peltason 1961). In contrast, most contemporary research on compliance has found that lower courts (as well as other actors, for that matter) generally comply with Supreme Court rulings across a variety of issue areas and under a variety of circumstances (see, for example, Benesh and Reddick 2002; Luse et al. 2009; Songer and Sheehan 1990). Of course, a lower court need not completely or overtly thwart Supreme Court precedent in order to provide a less-than-faithful application of it. Indeed, the lower courts have several options available to them should they find themselves faced with a Supreme Court precedent they find unappealing. They can interpret the decision narrowly, limiting the application of the precedent based on very specific factual differences between the precedential case and the case to which the precedent ostensibly applies (Canon and Johnson 1999, 92–114). They may cite their own opinions in lieu of citing the “offending” precedent (Manwaring 1968).5 They may attempt to distinguish their case from the one for which Supreme Court precedent is available (Caminker 1994). They may dispose of the case on procedural grounds (e.g., finding that the parties do not have standing to bring suit) (Canon and Johnson 1999, 92–114). They may criticize the Supreme Court while following it (Tarr 1977). Or, the lower courts may simply ignore the existence of the offending precedent (Reddick and Benesh 2000).
Given the wealth of tools at the disposal of lower courts to avoid adherence to Supreme Court precedents, the widespread compliance that has been documented seems like an enigma. To unravel that puzzle, scholars have employed a variety of frameworks for understanding the relationship between the Supreme Court and the lower courts and the dynamics of compliance. Communications theory, for example, draws attention to the clarity of the transmission of precedent from superior to inferior courts (Canon 1999, 442–443). An obvious prerequisite for compliance in that theory is that those responsible for complying are aware of the precedents with which they are to comply. Awareness of a decision, however, is unlikely to be problematic for lower court judges, as litigants (through their attorneys) and law clerks will likely call all relevant cases to the judges’ attention. Judges also have enhanced abilities to conduct their own online searches for case law, making it quite unlikely that a judge will not know about an applicable precedent (Cross et al. 2010). Indeed, given modern technology, lower courts probably hear of Supreme Court decisions immediately, long before they consider cases to which they may apply.
Alternatively, organizational theories treat the Supreme Court and the lower courts as a single organization, applying theories of organizational behavior borrowed from the public administration literature. A key concept in this line of research is the notion of decisional (or organizational) inertia, which is the tendency of organizational routines and standard operating procedures to be sticky (i.e., slow to change) (Baum 1976). In this framework, precedents from the Supreme Court can be seen as disruptive forces, requiring lower courts to change their own standard operating procedures. How compliant lower courts are, then, is a function of how different the edicts contained in new Supreme Court precedents are from those embedded in old Supreme Court precedents or the precedent of the lower court itself. The common law nature of the American legal system,6 however, means that courts (both inferior and superior) are always engaged in some level of adjustment in their decisions over time. Hence, a new precedent may well be adopted far more easily by a lower court than a new regulation may be by an administrative agency.
Though theoretical frameworks drawn from communications theory and organizational behavior have appealing aspects to them, they do not explicitly take into account the hierarchical nature of the judiciary. The Supreme Court is the institution at the apex of the federal judiciary and the only court created explicitly by the Constitution.7 The most compelling frameworks for understanding compliance, then, pay particular attention to this hierarchy of justice.8 Principal–agent theory and team theory take this hierarchical perspective into account, and are the theories most employed in recent research on lower court compliance with Supreme Court precedent. We discuss each in turn.
Principal–agent theory originated as a way of understanding the relationship between buyers and sellers in economic transactions (see, for example, Ross 1973). Buyers are the principals, who wish to obtain the highest quality goods or services at the lowest possible cost. Sellers are the agents, who wish to maximize profits. Profit maximization may take the form of selling more goods or services than the principal needs or maximizing prices while minimizing quality. Note that the principals and agents have different goals and that these goals may, in fact, conflict with one another. To illustrate this theory, consider the relationship between an individual in the market for a used car (a principal) and an individual who sells used cars (an agent). The buyer wants to obtain the best car possible for the lowest price possible, while the seller wants to obtain the biggest profit possible from the sale. The problem for the buyer is that she does not have access to the same information about the used cars for sale that the seller does. Given these conflicting goals, that difference in information gives an advantage to the seller. The seller (the agent), after all, knows about the histories of the cars on sale while the buyer (the principal) does not. In short, there is an information asymmetry that benefits the seller.
This concept of information asymmetry led political scientists to adopt the principal–agent framework to illuminate the relationship between bureaucracies and the legislatures that create and maintain them (Mitnick 1975). Legislators delegate authority to bureaucrats to implement policies, but bureaucratic interests may diverge from those of the legislators because they develop independent expertise about policy implementation or attract constituencies with preferences that diverge from those of the legislators (Waterman and Meier 1998, 176). This permits bureaucratic agents to shirk, that is, to direct their efforts at goals that differ from those of their legislative principals (Brehm and Gates 1997, 21). Principals can reduce shirking by monitoring the behavior of their agents, but monitoring is costly in terms of time and energy. And, if taken to an extreme, monitoring can be costly enough to obviate any benefit from delegation; if a principal must monitor each and every action of an agent, then the principal might as well do the tasks herself!
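The monitoring trade-off described above can be made concrete with a small numerical sketch. Everything in it is an illustrative assumption rather than part of the principal–agent literature cited here: the function name, the dollar-like values, the assumption that losses from shirking fall in proportion to the share of actions monitored, and the assumption that monitoring costs rise with the square of that share. Under those assumptions, a little monitoring leaves the principal better off than none, while monitoring every action wipes out the gain from delegating at all.

```python
def net_benefit_of_delegation(monitored_share, task_value=100.0,
                              max_shirking_loss=40.0, full_monitoring_cost=100.0):
    """Principal's net gain from delegating a task (hypothetical model).

    monitored_share: fraction of the agent's actions the principal checks (0 to 1).
    All default values are made-up illustrative numbers.
    """
    if not 0.0 <= monitored_share <= 1.0:
        raise ValueError("monitored_share must be between 0 and 1")
    # Assumption: only unmonitored actions are shirked, so the loss
    # shrinks linearly as monitoring increases.
    shirking_loss = max_shirking_loss * (1.0 - monitored_share)
    # Assumption: monitoring gets disproportionately expensive as it
    # approaches full coverage (quadratic cost).
    monitoring_cost = full_monitoring_cost * monitored_share ** 2
    return task_value - shirking_loss - monitoring_cost


# Search a coarse grid of monitoring levels for the best one.
grid = [i / 100 for i in range(101)]
best_share = max(grid, key=net_benefit_of_delegation)

print(net_benefit_of_delegation(0.0))  # no monitoring
print(net_benefit_of_delegation(1.0))  # monitoring every action
print(best_share)                      # best level on the grid
```

With these invented numbers, monitoring every action costs exactly as much as the task is worth, which is the sense in which the principal "might as well do the tasks herself." Different cost assumptions would move the best monitoring level, but not the basic tension between oversight and the benefits of delegation.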
The first scholars to explicitly apply the principal–agent framework to the study of the judicial hierarchy were Songer, Segal, and Cameron (1994), but it has been used by numerous scholars since then (see, for example, Benesh 2002; Benesh and Martinek 2002; Brent 2003; Westerland et al. 2010). As Brent (2003) notes:
That judicial scholars should be attracted to principal–agent theory is not surprising, because it accurately describes many of the essential features of the American judiciary. The judiciary is a hierarchy in which the subordinate actors (agents) are charged with the responsibility of implementing policy devised by actors at higher levels (principals). Judicial principals engage in only sporadic, inconsistent oversight of their agents … , permitting those agents to pursue their own goals when those goals conflict with those of the principal.
The Supreme Court—by both practice and design—is not intended to “right every wrong” or adjudicate every dispute. It explicitly disavows the role of error corrector. Indeed, it hears few cases each year and exercises virtually unrestricted discretion in determining which, of the thousands of appeals it receives, to hear (Pacelle 1995; H.W. Perry 1991). As a consequence, the Court has few opportunities to monitor the thousands of cases decided by the lower federal courts and state courts of last resort,9 and it embraces even fewer of the opportunities it does have. In short, lower courts may have a good deal of latitude to shirk in the application of Supreme Court precedents, should they choose to do so.10
Of course, a principal’s need to minimize shirking by monitoring agents is lessened the more information the principal has when it comes to picking agents. This leads to the concept of adverse selection, which arises when a principal must select an agent without full information about that agent. Ideally, a principal should have complete information about the skills, abilities, and preferences of the agent(s) it is selecting. This would permit the principal to select the agent that is least likely to act contrary to the principal’s interests (i.e., the lower court judges most likely to faithfully apply Supreme Court precedents, in the context of the judiciary). Here, the utility of the principal–agent framework for understanding superior–inferior court interactions is compromised in that the Supreme Court has nothing to do with the selection of its agents (the lower courts) at either the state or the federal level. Like their superiors on the Supreme Court, lower federal court judges undergo selection via a constitutionally mandated process in which the president makes a nomination that must be confirmed by the Senate.11 With regard to state court judges, the majority of such judges are elected by state or sub-state electorates, and those that do come to the state bench via a non-elective mechanism are still in no way beholden to the members of the Supreme Court for their appointments (see American Judicature Society 2011). In short, the Supreme Court faces an extreme adverse selection problem: it must rely on agents chosen for it by others, and, additionally, it is powerless to remove wayward agents from office.
Although the fit between the judicial hierarchy and the principal–agent framework is sometimes argued to be uncomfortable, the framework nonetheless is of considerable value for understanding lower court compliance given the explicit hierarchical setup of the judiciary. The Supreme Court is “the boss” of the lower courts,12 and it is an authoritative one at that. Accordingly, lower court judges should, for the most part, heed the policy prescriptions of the Supreme Court precisely because the Supreme Court is their superior. From this perspective, the judiciary is much like an administrative agency, with the lower courts acting as bureaucrats, implementing to the best of their ability the policy enactments of the Supreme Court, and using their expertise to fill in gaps where they arise.
An alternative to principal–agent theory is team theory. Like principal–agent theory, team theory has its origins in economics (Marschak and Radner 1972). Specifically, team theory was advanced as a model of decision-making in organizations. The basic logic of the team model is that different members of a team (an organization) possess different information relevant to the achievement of the team’s goals and, further, that different team members control different decisions of the organization. In short, both information and decision-making authority are decentralized. “But all team-theoretic models share one key feature: they ignore the interests of the team members—there is no shirking, free-riding, lying, lobbying or strategizing of any kind” (Gibbons 2003, 761). The interests of team members can be ignored because they are assumed to be identical (i.e., each member of the team desires the same thing, namely the success of the organization). To illustrate the team theory framework, consider a firm with a central production manager and a set of unit managers. The central production manager controls the allocation of resources among the units, deciding, for example, how much manpower is allocated to each unit. Each unit manager, however, makes decisions regarding how the manpower allocated to his unit will be deployed. Further, each unit manager has specialized knowledge about his own particular unit and what is necessary for it to function best. That is, both decision-making authority (except for the resource allocation made by the central manager) and information are spread throughout the organization. All of the managers, including the central production manager, want the company to maximize profits across all units since performance bonuses at the firm are determined by overall company performance. That is, each manager gets the same share of the available bonus money and the available bonus money is a straightforward function of the company’s total profits. In effect, all of the managers have an incentive to share information and take cues from one another to enhance the likelihood of each unit manager making the decisions for his unit that will reap the biggest profit for the company (and, hence, the biggest bonuses for the managers).
Various scholars—primarily, but not exclusively, legal scholars situated in law schools—imported team theory to describe and explain lower court compliance (Cross 2005; Kornhauser 1995; Staudt 2004). The basic assumption of the team model—the assumption of a shared organizational goal—is useful for thinking about the fidelity of lower courts to higher court rulings because it implies that adherence to precedent is simply a matter of enhancing the likelihood of correctly deciding a case. Judges at higher levels of the judicial hierarchy presumably have greater levels of expertise to select good vehicles for articulating guiding precedents and, given the luxury of discretionary dockets, more time and judicial energy to craft such precedents (Kornhauser 1995; Westerland et al.