universities face increasing pressures not only to publish in peer-reviewed academic journals, but also to obtain external income from consultancy. This often means writing a report evaluating the work of some public agency, using a mixture of quantitative and qualitative methods.1 My objective in this paper is to provide a critical review of evaluation research, contrasting this with sociological approaches to evaluation and quality assurance. I will particularly focus on legal services, and the implications for socio-legal researchers based in departments of law or social science in British universities.
The paper starts with an overview of what evaluation research involves, and how it is relevant to the delivery of legal services. It then reviews two criticisms that have been made of evaluation as a research paradigm, and explains why many university-based researchers would prefer not to conduct this kind of research. One criticism is that methodological standards are often lower than in other areas of social science, partly because most ‘small-scale’ evaluations are poorly resourced, but also because the managers and civil servants who commission these studies have little interest in how academics understand and debate methodological questions. Another is that evaluation research has a managerial bias, and researchers are required to become ‘hired hands’ and give up their intellectual independence.
The next part of the paper reviews how sociologists in different traditions have studied this new form of regulation. It starts by considering the writings of Michael Power, the 2002 Reith Lecturer Onora O’Neill, and the Foucauldian governmentality tradition.2 It suggests that these all make a powerful moral and political case against quality assurance, and also implicitly against evaluation research, but that qualitative or ethnographic research on how quality reports are produced and used, and the alleged burdens created by excessive regulation, is needed to test their general arguments. I will argue that ethnomethodology goes considerably further than these critical approaches in addressing the day-to-day work involved in quality assurance, and makes it possible to understand the deteriorating relationship between managers and professionals across the public sector.
The paper concludes by considering how socio-legal researchers should respond to the opportunities offered by evaluation research. Following Payne, Dingwall and Carter, it suggests that we should pursue the difficult course of constructive engagement, rather than withdrawing into textually-based critical scholarship.3 We face the twin challenges of not only raising our own methodological standards, which are low in relation to mainstream sociology and social policy, but also persuading the organisations commissioning research that they will benefit from more thoughtful and rigorous evaluations.
A. WHAT IS EVALUATION RESEARCH?
According to Carol Weiss, one of the most respected figures in this field, ‘evaluation is the systematic assessment of the operation and/or the outcomes of a program or policy, compared to a set of explicit or implicit standards, as a means of contributing to the improvement of the program or policy’.4 In contrast to academic research which is concerned with the production of knowledge, or perhaps delivering a general political message, the aim of an evaluation is to make a practical difference to the delivery of government programmes and policies.
This kind of applied research has a long history in America. Private foundations funded studies of the social welfare programmes that they supported in the 1940s. From the mid-1960s, the federal government evaluated the various programmes established during the War on Poverty, and ‘by the end of the 1970s evaluation had become common-place across federal agencies’.5 Although there were fewer new social programmes to evaluate during the Reagan administration, there was a revival in the 1990s. A large number of consultancy firms, in-house government agencies, and university-based researchers conduct evaluation studies for federal, state and local government. They have established several journals, such as Evaluation Review and the American Journal of Evaluation, that publish evaluation research and debate methodological issues. They also have a professional association, the American Evaluation Association, and hold regular conferences.
Evaluation took longer to develop in Britain, although the present New Labour government is committed to what is termed ‘evidence-based’ policy.6 This has involved commissioning pilot studies of new initiatives such as the New Deal, and making evaluation a statutory requirement for local government programmes under the 1998 Crime and Disorder Act. Evaluators have also benefited from a shift in philosophy across the public sector, the New Public Management, in which government services previously provided by Departments of State are devolved to quasi-independent agencies that must demonstrate they provide a good service to the customer, and value for money to the tax-payer. Britain now has its own Evaluation Society and academic journals such as Evaluation, which was established in 1994.
1. Methods of Investigation
The majority of studies have employed quantitative methods, and focus on whether government programmes make a difference. A typical research question might be whether an educational initiative to reduce juvenile delinquency had a demonstrable effect on the youths who participated. The researcher, in this case, would measure attitudes using a questionnaire before and after the programme. A major concern would be to establish a randomised experimental design, modelled on natural science, so that one could also measure a change of attitudes in youths with the same characteristics who did not attend the programme.7 It has also become increasingly common to employ qualitative methods, such as structured interviews, focus groups, and short periods of fieldwork to obtain more detailed information on whether a programme is working. In some cases, large numbers of interviews are carried out, and the data analysed into themes using software programs such as NUD*IST. In small-scale evaluations, a researcher might interview a few practitioners, attend a planning meeting or observe the delivery of some service, and draw on this data.8 To the best of my knowledge, no researchers have obtained funding in recent years to conduct ethnographic projects based on a long period of observation in one organisation.9
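The logic of the randomised before-and-after design described above can be illustrated with a minimal sketch. It is not drawn from any particular evaluation: the function names, the 1-10 attitude scale and the scores are all hypothetical, and a real study would also require a significance test and a far larger sample.

```python
import random
from statistics import mean


def assign_groups(participants, seed=0):
    """Randomly split participants into a programme group and a control group."""
    rng = random.Random(seed)  # fixed seed so the allocation can be reproduced
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def mean_change(scores):
    """Average change in attitude score; scores is a list of (before, after) pairs."""
    return mean(after - before for before, after in scores)


def programme_effect(programme_scores, control_scores):
    """Change in the programme group over and above the change in the control group."""
    return mean_change(programme_scores) - mean_change(control_scores)


# Hypothetical questionnaire scores (before, after) on an invented 1-10 scale.
programme_scores = [(3, 5), (4, 6), (2, 5)]
control_scores = [(3, 4), (4, 4), (2, 3)]

effect = programme_effect(programme_scores, control_scores)
print(f"Estimated programme effect: {effect:.2f}")
```

The point of subtracting the change in the control group is that any shift in attitudes that would have occurred anyway, for example as the youths grow older, is not credited to the programme.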
Whatever combination of quantitative and qualitative methods is used, it is important to recognise that the logic of explanation, and the assumptions informing evaluation research, are usually positivist in character. It is assumed that one can identify causal relationships between variables, and measure phenomena such as attitudes, provided that one employs the right techniques. Qualitative studies are often used to establish whether programmes are being administered correctly, so that it is then possible to conduct comparisons or ‘meta-analysis’ on a series of evaluations.
The only sustained interpretive challenge within the evaluation community has come from Egon Guba and Yvonna Lincoln.10 They argue that reality is socially constructed, so the version of the setting produced by the evaluator should have no special status. Conventional evaluations also uncritically support the official definition of reality in the sponsoring organisation, whereas the views and actions of professionals and clients are equally important in shaping outcomes.11 The objective of ‘fourth generation’ evaluation should be to promote dialogue between these different groups, and reflexive learning, rather than produce some spuriously objective account that privileges the perspective of managers. One might add, however, that only a few American evaluators have experimented with these ideas, and most evaluations are based on the assumption that it is possible to determine what happens in organisations, and make authoritative recommendations, without being troubled by the existence of different versions.
2. Some Definitional Problems
It will be apparent from this short review that the evaluation industry (Weiss calls it a ‘community’) draws on mainstream social science for its methods, but often tries to keep its distance from the theoretical and methodological debates that interest university-based researchers. This is because it is primarily concerned with providing useful knowledge in an accessible form to the organisations that provide government services. At the same time, evaluators see themselves as doing more than collecting facts, or offering recommendations based on intuition or opinion, and this is because they have a commitment to scientific method: ‘Doing evaluation through a process of research takes more time and costs more money than offhand evaluations that rely on intuition, opinion, or trained sensibility, but it provides a rigor that is missing in these more informal activities.’12
This does not fully explain how evaluation differs from applied social scientific research in general. In the United Kingdom, for example, the Economic and Social Research Council will mainly allocate funding to projects that improve economic performance, or the efficiency and effectiveness of public services. This is also evaluation research, although it is usually funded at a higher level than consultancy, and must also contribute to the development of theory or methodology in an academic discipline.
Another difficulty is whether one should include the research conducted by a large number of organisations that are concerned with evaluation, but have no interest in creating a new academic discipline. This is the growing field of quality assurance that is overseen by the General Accounting Office (GAO) in America and the Audit Commission in Britain.13 There are also a large number of inspectorates in the United Kingdom which examine the activities of schools, the social services, universities and the criminal justice system. The majority of people writing these reports are civil servants or practitioners without any background in social science, but they also conduct evaluation research.
B. EVALUATION AND LEGAL SERVICES
Socio-legal researchers should be interested in evaluation for two reasons. In the first place, it is an important, but under-researched, area of regulation. Local authorities and other agencies have a statutory obligation to conduct evaluations, and provide statistical data to inspectorates and the Audit Commission. There is almost no part of society untouched by quality assurance, and more routine evaluations are conducted than at any time in British history.14
One should also remember that, although many lawyers work in private practice, the legal system is part of the public services, and subject to the same pressures to cut costs and demonstrate value for money. Legal aid firms are now regularly assessed and audited by the Legal Services Commission (Sherr et al 1994; cf Travers 1994, Sommerlad 1999). There has also been a greater emphasis on performance targets in the courts following the Woolf Report, although the Lord Chancellor’s Department has resisted pressures to make the judiciary more accountable. Finally, law schools along with other university departments are subject to review by the Quality Assurance Agency for Higher Education (QAA), and the Research Assessment Exercise, which are intended to raise standards, allocate scarce resources, and encourage competition between institutions.
It is difficult to assess the extent to which British socio-legal researchers (who are primarily based in law departments) are engaged in doing evaluation research. This is because, unlike other areas of academic inquiry, the majority of evaluation studies are never published. One can get some idea of what research has been commissioned or conducted by government departments by looking at the publications produced by the Home Office Research Unit, the Lord Chancellor’s Department and the Law Society, or at studies cited in The Macpherson Report on the murder of Stephen Lawrence or the Auld Review of the Criminal Courts. However, a much larger number of local evaluations, commissioned by local authorities, police forces, and the voluntary sector, are never distributed beyond a local policy network.
C. OBJECTIONS TO EVALUATION RESEARCH
Although evaluation can be viewed as an opportunity to make social science interesting and relevant to a wider audience, relatively few university-based researchers have welcomed these developments.15 There are two main objections. The first is that most evaluation research is less rigorous and offers less intellectual satisfaction than publishing in peer-reviewed academic journals. The second is that there is an inevitable political bias in evaluation studies towards the needs and perspective of managers, rather than practitioners or clients: the researcher becomes a hired hand serving government, and loses the ability to represent subordinate or disadvantaged groups.
1. The Charge of Lower Standards
When discussing the issue of standards, it is important to make a distinction between the ‘small-scale’ evaluations commissioned by local agencies, and the ‘flag-ship’ projects that are more generously funded by government departments. There is, for example, no evidence to suggest evaluation research is inferior to peer-reviewed social science if one looks at the many highly rigorous quantitative studies that are published in the American evaluation journals. This has not, however, prevented some academics from complaining about the routine character of work in this field, and the need for a more thoughtful or critical approach to data collection and analysis.16
Methodological standards are considerably lower in the average small-scale study commissioned by a local agency to demonstrate its efficiency or effectiveness for the purposes of some external or internal review. Unsurprisingly, one finds that many studies measure the effect of programmes without using a randomised control group. Although a layperson might regard the findings as persuasive, from a scientific perspective they are virtually worthless.17 Similarly, one finds far-reaching conclusions being drawn from tiny samples, and less care and attention to the problems involved in measuring variables, and using appropriate statistical tests.
In the case of qualitative research, there is arguably an even greater gulf between the methods used in evaluation research, and those taught in university departments. One major difference is that the assumptions informing evaluations are broadly positivist in character, in the sense that it is assumed one can make objective findings through measuring and relating variables, whereas there are also interpretive, realist and post-modern traditions in ethnography that could be employed in studying any social setting. None of the main texts on evaluation mentions standard qualitative traditions like symbolic interactionism, ethnomethodology or conversation analysis. Even grounded theory hardly gets a mention, which is strange since this is informed to some extent by positivist assumptions, and widely used in applied social science.18
There are all kinds of methodological debates about representation in qualitative research, but these are largely irrelevant to evaluation studies. Instead extracts from interviews are used to support some argument about the processes that produce outcomes in an organisation, without considering any of the philosophical issues about representation that trouble sociologists or anthropologists. These include the possibility there might be different perspectives, or the extent to which interviews can adequately address the practical issues involved in delivering a service, each of which should be highly relevant in an evaluation.
This is one reason why academics often look down on evaluation, and it is still a small field, relative to other areas of social science, despite the pressures on universities to secure external income from consultancy. Irrespective of political considerations, which I will consider next, it is clear that one cannot usually pursue cutting-edge intellectual questions, or state-of-the-art methods in evaluation: rather, as Weiss notes, the skill lies in making ‘research simultaneously rigorous and useful’ while dealing with ‘the complexities of real people in real programs run by real organizations’.19
2. The Charge of Political Bias
Social scientists have also kept their distance from policy research because it has an inevitable managerial bias. The classic statement of this position is Alvin Gouldner’s critique of symbolic interactionists who conducted liberal studies about deviance during the 1960s.20 Gouldner saw them as making a mutually profitable compact with middle-managers in the welfare state that provided ideological support for liberal capitalism. Although, on the face of things, it appeared that they were siding with disadvantaged groups against state institutions, in fact this diverted attention away from massive structural inequalities in American society.
Most academics have liberal or left-wing political views (they are more likely to read The Guardian than The Daily Mail), so it is hardly surprising that sociologists and socio-legal researchers have maintained a critical distance towards agencies like the police, courts and social services. The best empirical studies about law and criminal justice from the late 1960s and 1970s have a strong ideological bias. This is evident in how they collect and present data selectively to support a left-wing political case.21
By contrast, evaluators generally have no qualms about serving the needs of managers, and do not question the objectives of government policy. Weiss, for example, appears to have complete faith that policies and programmes improve over time, and that people in authority can be trusted to make good decisions based on the latest scientific knowledge: ‘Evaluation is a practical craft, designed to make programmes work better and to allocate resources to better programs. Evaluators expect people in authority to use evaluation results to take wise action. They take satisfaction from the chance to contribute to social betterment.’22