Profiling Technologies and Fundamental Rights and Values: Regulatory Challenges and Perspectives from European Data Protection Authorities
© Springer Science+Business Media Dordrecht 2015. Serge Gutwirth, Ronald Leenes and Paul de Hert (eds.), Reforming European Data Protection Law, Law, Governance and Technology Series 20, DOI 10.1007/978-94-017-9385-8_1
1. Profiling Technologies and Fundamental Rights and Values: Regulatory Challenges and Perspectives from European Data Protection Authorities
Emerging Crimes Unit, UNICRI, Turin, Italy
Centre for Technology and Society (CTS), Technische Universität Berlin (TUB), Berlin, Germany
Law Department, University of Turin, Torino, Italy
Amapola Progetti per la sicurezza delle persone e delle comunità, Torino, Italy
Tilburg Institute for Law, Technology, and Society (TILT), Tilburg, The Netherlands
Francesca Bosco (Corresponding author)
This paper aims to map the field of profiling, its implications for fundamental rights and values, and the measures which are or can be taken to address the challenges of profiling practices. It presents a working definition of profiling and elaborates a typology of its basic methods. In the second section the paper gives an overview of the technological background of profiling to display how fundamental rights and values of European societies are endangered by the use of profiling. Finally, the paper presents the findings of a questionnaire addressed to European DPAs on the current and future legal framework, the domains of application, the complaints and remedies procedures regarding the use of profiling techniques, the main risks and benefits for fundamental rights, and citizens’ awareness of this topic. These findings contribute important insights for the ongoing discussion on the regulation of profiling in Europe.
The term “Big Data” is grounded in socio-technological developments which began with the invention of the computer and have unfolded a rapidly growing dynamic over the past decades. Technological advancement has fueled the digitization of our societies through increasingly powerful infrastructures based on digital devices and software. Mediated communication today has mostly become digital communication, and information has consequently become easy to process and store as data, and is at the same time fluid and persistent. New potentials of gathering data raise hopes for developing more advanced ways to manage societies. The more we know, the better we can control social processes and steer societal progress. At least that is what we are promised by “Big Data” proponents. “Big Data” appears to be a fetish, a crystal ball which allows those who use it not just to look into the future but to gain information which enables them to shape it to their needs.1
However, big data itself is not information but still mere data.2 The more data we gather, the harder it is to extract usable information, as the huge amounts of data exceed human capabilities of consideration. Consequently, data needs powerful tools to be utilized as a marketable resource. These tools are considered to be found in technologies such as data mining. They are supposed to turn “Big Data” into the new oil.3
Profiling can be understood as a specific data mining method. In this perspective profiling is regarded as a (semi-)automated process to examine large data sets in order to build classes or categories of characteristics. These can be used to generate profiles of individuals, groups, places, events or whatever is of interest. Profiles structure data to find patterns and probabilities. Using actuarial methods in this context is supposed to generate prognostic information to anticipate future trends and to forecast behavior, processes or developments. The aim is to develop strategies in order to manage uncertainties of the future in the present. In this regard, the ideology of “Big Data” and analytical tools such as profiling can be understood as an important facilitator and part of a preventive paradigm which can be found in diverse societal contexts.4
Even though the reality of profiling might not live up to the expectations of its prophets,5 the assumed potentials of gathering and processing data spawn the dream of overcoming human deficiencies with technology. At the same time, these new technologies draw fears and skepticism, as they pose threats to some of the core values and principles of European societies. Key challenges which have been identified by scholars include infringements of democratic principles and the rule of law: data gathering, exchange, and processing potentially harm central values like individual autonomy and informational self-determination as well as the fundamental rights of privacy, data protection, and non-discrimination.
This paper aims to map the field of profiling. It focuses on its implications for fundamental rights and values in different fields of application and on the assessment of the existing countermeasures to address the challenges of profiling practices. In the following section this paper proposes a working definition of profiling. The third section gives an overview of the technological evolution that laid the ground for the emergence of profiling. Afterwards, it is demonstrated how fundamental rights and values of European societies are endangered by the application of profiling in various contexts (Sect. 1.4). In Sect. 1.5 the legal regulation of profiling is sketched. Finally, the paper presents the first findings of a questionnaire carried out by the project PROFILING,6 in order to gain knowledge about European Data Protection Authorities’ awareness, attitudes, and activities regarding profiling and its societal impacts.
1.2 Profiling: Towards a Definition
Profiling is a highly evocative term with multiple meanings, used in both specialist and non-specialist contexts. Whereas the literature on statistics does not pay specific attention to definitions and tends to focus on technical aspects (e.g. data mining techniques and predictive models), providing a definition appears to be an issue among socio-legal scholars and policy makers. However, a widely shared definition has not yet emerged.
Gary T. Marx gave one of the oldest definitions of profiling in a paper that analyses systems of data searching. He defines profiling in contrast to “matching” and stresses the logic behind it: “the logic of profiling is more indirect than that of matching. It follows an inductive logic in seeking clues that will increase the probability of discovering infractions relative to random searches. Profiling permits investigators to correlate a number of distinct data items in order to assess how close a person or event comes to a predetermined characterization or model of infraction”.7 In keeping with the author’s background, this definition is closely tied to the law enforcement domain.
Almost 10 years later, Roger Clarke defined profiling as a “dataveillance technique (…) whereby a set of characteristics of a particular class of person is inferred from past experience, and data-holdings are then searched for individuals with a close fit to that set of characteristics”.8
The legal scholar Bygrave again stressed: “profiling is the inference of a set of characteristics (profile) about an individual person or collective entity and the subsequent treatment of that person/entity or other persons/entities in the light of these characteristics”.9
Later on, Mireille Hildebrandt made the most thorough effort to define profiling and its distinctive features precisely, and the working definition proposed here builds on her work. She defines profiling as “the process of ‘discovering’ patterns in data in databases that can be used to identify or represent a human or nonhuman subject (individual or group) and/or the application of profiles (sets of correlated data) to individuate and represent an individual subject or to identify a subject as a member of a group (which can be an existing community or a discovered category).”10
Profiling creates a new form of knowledge that makes visible patterns that are otherwise “invisible to the naked human eye”.11 They are based on correlations found in data sets, and cannot be “equated with causes or reasons without further inquiry; they are probabilistic knowledge.”12 Profiling represents a shift from the idea that knowledge is the result of tested hypotheses. Instead, it generates hypotheses: “the correlations as such become the ‘pertinent’ information, triggering questions and suppositions”.13 Consequently, profiling fosters new forms of generating and applying knowledge. Due to the growing capacities of databases and the capabilities of advanced analysis, profiling procedures become increasingly complex. In this context the human role in interpreting data changes significantly.
As pointed out by Hildebrandt, profiling can be categorized into non-automated, automated and autonomic profiling. Non-automated profiling is a form of reasoning that does not rely on any process of automation. Automated profiling is based on “automated functions that collect and aggregate data” and develop into “automation technologies that can move beyond advice on decision-making, taking a load of low-level and even high-level decisions out of human hands.”14 By contrast, autonomic profiling describes the process whereby the human role is minimized and the decision-making process is entirely driven by the machine.15 Autonomic profiling “goes one step further than automated profiling.”16 The machines drive the decision-making process, providing for a readjusted environment based on their profiling and without calling for human intervention. Besides their degree of automation, profiling methods can be distinguished by their object and application. Profiling can be applied as group profiling or individual profiling: the techniques that identify and represent groups can also focus on individuals.17 Moreover, profiling either relies on data collected from one single person or group and applies the information derived from data processing to the same person or group (direct profiling), or it relies on categorization and generalisation from data collected among a large population and applies it to certain persons or groups (indirect profiling). Group profiling can also be classified as distributive group profiling or non-distributive group profiling.18 A distributive group profile identifies a certain number of people having the same attributes: all the members of the group share the same characteristics. In contrast, a non-distributive group profile identifies a certain number of people who do not share all the attributes of the group’s profile.
These distinctions give an idea of the different types of profiling and their application. The forms of profiling which are the subject of this article are automated and autonomic profiling and their various forms and fields of application.
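The difference between distributive and non-distributive group profiles can be made concrete with a minimal Python sketch. The function names, attribute names and sample groups are hypothetical and chosen only for illustration; they are not taken from the literature cited here. The sketch assumes that a group profile is a plain mapping from attributes to values and checks what share of the group's members actually exhibits each attribute.

```python
from typing import Dict, List

Person = Dict[str, object]
Profile = Dict[str, object]


def attribute_shares(members: List[Person], profile: Profile) -> Dict[str, float]:
    """Share of group members exhibiting each attribute of the group profile."""
    if not members:
        raise ValueError("a group profile needs at least one member")
    return {
        attribute: sum(1 for person in members if person.get(attribute) == value)
        / len(members)
        for attribute, value in profile.items()
    }


def is_distributive(members: List[Person], profile: Profile) -> bool:
    """True only if every member shares every attribute of the profile."""
    return all(share == 1.0 for share in attribute_shares(members, profile).values())


# Hypothetical group 1: the profile holds for each member (distributive).
bachelors = [
    {"adult": True, "married": False},
    {"adult": True, "married": False},
]
bachelor_profile = {"adult": True, "married": False}

# Hypothetical group 2: the profile describes the group statistically,
# but not every member has every attribute (non-distributive).
risk_group = [
    {"smoker": True, "sedentary": True},
    {"smoker": True, "sedentary": False},
    {"smoker": False, "sedentary": True},
    {"smoker": True, "sedentary": True},
]
risk_profile = {"smoker": True, "sedentary": True}

print(is_distributive(bachelors, bachelor_profile))
print(is_distributive(risk_group, risk_profile))
print(attribute_shares(risk_group, risk_profile))
```

In the second group, applying the group profile to an individual member would attribute characteristics to some persons that they do not have, which is the practical concern with non-distributive profiles.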
The following proposed definition takes into account the preceding evolution of technologies in which profiling is embedded and focuses on the purpose profiling is being used for. It will be the basis for this paper:
Profiling is a technique of (partly) automated processing of personal and/or non-personal data, aimed at producing knowledge by inferring correlations from data in the form of profiles that can subsequently be applied as a basis for decision-making.
A profile is a set of correlated data that represents an (individual or collective) subject.
Constructing profiles is the process of discovering unknown patterns between data in large data sets that can be used to create profiles.
Applying profiles is the process of identifying and representing a specific individual or group as fitting a profile and of taking some form of decision based on this identification and representation.
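The two steps of this working definition, constructing and applying profiles, can be sketched in a few lines of Python. This is a deliberately simplified illustration under stated assumptions and not a description of any real profiling system: the records, the attribute names, the `flagged` outcome, the lift measure and the thresholds `min_lift`, `min_support` and `min_matches` are all hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple

Record = Dict[str, str]
Profile = Dict[Tuple[str, str], float]


def construct_profile(records: List[Record], outcome_key: str, outcome_value: str,
                      min_lift: float = 1.5, min_support: int = 2) -> Profile:
    """Constructing a profile: keep attribute values that co-occur with the
    outcome clearly more often than the base rate (a correlation, not a cause)."""
    base_rate = sum(r[outcome_key] == outcome_value for r in records) / len(records)
    if base_rate == 0:
        return {}
    seen: Counter = Counter()
    hits: Counter = Counter()
    for record in records:
        for key, value in record.items():
            if key == outcome_key:
                continue
            seen[(key, value)] += 1
            if record[outcome_key] == outcome_value:
                hits[(key, value)] += 1
    profile: Profile = {}
    for pair, count in seen.items():
        if count < min_support:
            continue  # too few observations to say anything
        lift = (hits[pair] / count) / base_rate
        if lift >= min_lift:
            profile[pair] = lift
    return profile


def apply_profile(subject: Record, profile: Profile, min_matches: int = 2) -> bool:
    """Applying a profile: represent a subject as fitting it and decide on that basis."""
    matches = [pair for pair in profile if subject.get(pair[0]) == pair[1]]
    return len(matches) >= min_matches


# Hypothetical training records
records = [
    {"payment": "cash", "booking": "late", "flagged": "yes"},
    {"payment": "cash", "booking": "late", "flagged": "yes"},
    {"payment": "cash", "booking": "early", "flagged": "no"},
    {"payment": "card", "booking": "late", "flagged": "no"},
    {"payment": "card", "booking": "early", "flagged": "no"},
    {"payment": "card", "booking": "early", "flagged": "no"},
]

profile = construct_profile(records, "flagged", "yes")
print(sorted(profile))
print(apply_profile({"payment": "cash", "booking": "late"}, profile))
print(apply_profile({"payment": "cash", "booking": "early"}, profile))
```

The subject to whom the profile is applied need not be part of the data from which it was constructed, and the decision rests on correlation alone, which is why the resulting knowledge is probabilistic.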
1.3 Societal Consequences of Digitization
Advanced data analysis tools have established new social practices of knowledge production and have created new types of knowledge. We argue that the practices of profiling have facilitated and are part of a broader societal paradigm of prevention. We will elaborate on the societal implications of changing social practices through emerging profiling technologies as a ground for the examination of threats to fundamental rights and values of European societies in Sect. 1.4.
Observations made by human beings need to be written down to be made explicit. The written documentation of observations can be regarded as a first step to enable a generalized and objectified way of keeping information and exchanging it between individuals and institutions.19 Digitized information, however, can be processed and analysed automatically, so that it is easier and cheaper to store, process and analyse. An illustrative example of how exhaustive and expansive the detailed documentation of people’s activities and behaviour has become is the comparison of the digital data the NSA stores with the amount of files the Stasi – the German Democratic Republic’s domestic secret service – produced. All the information captured throughout the Stasi’s history would fill about 48,000 cabinets covering approximately 0.019 km². The NSA’s planned data centre in Utah will host about 5 zettabytes of data, which could roughly be converted into about 42 quadrillion file cabinets covering 17 million km² – bigger than the European continent.20 The example also shows the differing efforts needed to collect and archive data depending on whether analog or digital data processing is used. While the Stasi needed to install microphones and hire staff to monitor and document people’s behaviour in order to gain information about their habits, attitudes and social networks, in a digitized world a lot of that information can be monitored and stored on the fly through sensors, log data or user-generated content. This shows that the digitization of communication and transactions not only produces more data but also provides new kinds of information21 which can be used to extract knowledge about individuals: their social relations, interests and activities. Once stored and made accessible via computer networks, data becomes easily exchangeable worldwide. At the same time it becomes hard to grasp how data is exchanged, which information is gained and by whom.
Furthermore, specific media can store specific data. Certain elements which can be archived on paper cannot be archived digitally and vice versa. Moreover, certain information can hardly be digitized or digitally analyzed, e.g. hand-written information and smells. Archives thereby have a filtering function which shapes the accessibility of information as data. But simplified storage and exchange of data are only one aspect of the ongoing process of digitization of everyday life. Beyond that, advanced methods of data analysis have fundamentally changed the procedures of knowledge production through automation.
Another effect of the digitization of data becomes evident when we think of the different haptic and cognitive perceptions of digital versus analog files and folders. Different items and elements can be put in an analog or digital file, and at the same time, the availability of and the access to certain kinds of information fundamentally change. In other words: accessing information at a (real) desktop is very different from accessing information when sitting in front of a computer screen. Paper folders can be touched and felt; digital files are browsed on a screen and can be searched by keywords. Consequently, the way of reasoning changes, as first findings of one of the case studies conducted in PROFILING show.22 More of the analyst’s interaction is oriented towards computer interfaces and is thus influenced by the way user interfaces are designed, information is presented, and searches can be conducted.23 The transformation of the human role in knowledge production processes is even more significant when it comes to examining large-scale databases. Learning algorithms are trained on specific data sets to build categories or to find patterns in the data. Assumptions or hypotheses made by the analyst play a minor role during data processing; they are to a certain degree hidden in the process of writing and training the algorithms. Finally, hypotheses are derived “from the material”.24 As a consequence, implicit assumptions driving the actors during the selection of training data, the preprocessing of target data and the choice of suitable algorithms become invisible, and the outcomes produced by “the data” seem objectified.
Subjective assumptions and social norms are hidden in the technology during the process of automation, while outcomes based on computed models and databases are often perceived as solid statistics and thus more objective than human interpretation.25 This perception of computer-generated models as objectified knowledge supports the thesis of a general tendency of technology to make social norms more durable26 and, more specifically, the thesis that social sorting becomes strengthened if mediated through technology.27 Profiles, as mentioned above, can be seen as hypotheses. These hypotheses are inductive, as they are not necessarily developed on the basis of a theory or a common sense expectation, but often emerge in the process of data mining. This can be regarded as a shift from a more traditional, rather assumption-driven approach to a discovery-driven approach to knowledge generation.28 This shift results not only from growing data capabilities and advancing technological methods. Lyon argues that the conceptualization of social threats as actuarially identifiable and addressable risks and the desire for intelligence-led management of populations play a key role in the spread of profiling technologies.29 In this context data mining is considered a key technology for risk assessment in various fields of application such as eHealth, airport security, and policing. Profiling techniques are used to identify categories and groups in order to assess risks and probabilities of certain future developments. The generated profiles can then be used to sort individuals, groups, events or processes in order to make them addressable for specific practices.30 In this regard profiling is a technology to structure potential futures in order to make them governable in the present.
Therefore profiling is an important practice of a broader societal preventive paradigm, which is based on probabilistic knowledge used to manage social processes in the form of risk management.31 Profiling technologies thereby provide means of control, which can be exercised for care and protection or for coercion and repression.32
1.4 Profiling as a Threat for Fundamental Rights and Values
Even though the results of data mining are often of limited reliability,33 proponents claim that the potentials for managing social and technological processes in more efficient ways through data gathering and analysis are immense. They expect that the growing amount of data and increasingly advanced tools for examination will provide information which will allow organisations to identify, target, and act upon undesirable developments at an early stage – preferably before they occur. Preemptive policing, early detection of pandemic risks, and the prevention of tax fraud are examples of the societal benefits of the use of sophisticated data mining methods. Yet there is a downside to these opportunities implied by the technological evolution of digitization: it threatens key aspects of fundamental citizen rights, such as the rights to privacy, data protection and non-discrimination, and core values of European societies – democracy, the rule of law, autonomy and self-determination. As societies rely more and more on profiling methods to steer social and technological processes, the urgency of dealing with these threats grows.
1.4.1 Fundamental Values
The clash between liberal democracy34 and profiling is brought about by their inherent characteristics. Profiling is considered a glamour technology: it conveys the idea that human beings can attain unforeseeable knowledge that allows making better decisions. But the dark side of profiling is that it makes “invisible all what cannot be translated into machine-readable data.”35 This means that the decision-making process is prone to be biased in the data collection phase and, because of the complexity of the applied algorithms, human beings cannot properly intervene to repair this bias. Consequently, “as far as the governance of people and things becomes dependent on these advanced profiling technologies, new risks will emerge in the shadow of the real time models and simulations these technologies make possible. What has been made invisible can grow like weeds.”36 In other words, failing to consider some aspects of an issue can lead, at best, to ineffective and wrong decisions or, at worst, to serious risks and damage for the population.37
Not only is human intervention reduced during the decision-making process, but citizens also have hardly any access to the procedure behind the construction and application of profiles. This seriously hampers the quality of a liberal democracy because of the unbalanced distribution of power38 and knowledge asymmetries39 between ordinary citizens on the one hand and the government on the other. Knowledge asymmetries are a common phenomenon, but they reach a new peak with profiling technologies. In most cases, citizens are not aware of the information circulating and how it could be used in the future. In particular, when profiles are constructed from data that does not stem from the data subjects themselves, information is used to take decisions about them without their involvement. So there is no easy protection on the horizon. Moreover, some sophisticated profiling technologies like Behavioural Biometric Profiling (BBP) “do not require identification at all”40 and thereby aggravate this problem.
If the position that citizens enjoy versus the state is one of the indicators of the quality of a liberal democracy, the governmental use of profiling techniques seriously challenges some essential democratic features. This is not only related to the recognition of rights by the state, but also to the opportunities these rights entail for the full and free development and expression of citizens’ personalities and their effective participation in democratic life. The fundamental values of autonomy and self-determination belong in this framework. Against the backdrop of the discussion about profiling, self-determination acquires the specific meaning of informational self-determination, which means that an individual needs to have control over the data and information produced by and on him/her. This control is “a precondition for him/her to live an existence that may be said ‘self-determined’.”41 As shown in the prior section, the digitization of everyday life has led to opaque ways of data gathering, exchange and processing. Consequently, technologies like profiling do not leave much space for autonomy and self-determination.42
As in any other field, the application of profiling in healthcare can be helpful, yet harmful. eHealth and mHealth (electronic health and mobile health) technologies enable constant monitoring and profiling of persons’ physical conditions, their activities, medical treatment, or diet. That way e- and mHealth applications might help people to adopt healthier lifestyles as well as improve cures for illnesses and the individual treatment of diseases. At the same time there is potential for gathering information about patients’ lifestyles from a hard-to-grasp range of sources that could be used for an actuarial assessment of lifestyles to build risk categories which are not only used for “individualized” treatments, but also to offer “individual” insurance fees or other incentives to make clients adopt certain lifestyles. Yet the categories on which these incentives are based, being created by profiling, are anything but individual. They derive from abstract calculations conducted under the premise of profit maximization and transfer this economic logic to individual lifestyle choices by rewarding behaviours assessed as low risk or healthy, while sanctioning the ones which are considered as increasing the risk of accidents or diseases. Even though profiling in this context is supposed to empower healthy lifestyles, it also undermines individuals’ autonomy. It facilitates the economization of everyday life by addressing individuals as dividuals – bundles of risks and behavioural probabilities, reducing them to profiles.43 eHealth is only one area in which this logic is executed. Risk factors or behavioural probabilities, which are identified and addressed, vary contextually as aims and scopes of profiling agents differ. “Although we are constantly being monitored in some way or another we do not live in an Orwellian ‘Big Brother’ dystopia.
[…] Rather, an intricate network of small surveillance societies exists, often overlapping, connectable or connected, but each with their own features and rules.”44 What links these small surveillance societies is the idea of creating knowledge gathered from certain populations which allows steering individuals, groups, and social processes. At this point autonomy and informational self-determination are closely interwoven, as putting one at risk can jeopardize the other.
In policing, the development of preventive measures is a key argument for the implementation of growing capacities of gathering, exchanging and analyzing information. In Germany, police forces host large numbers of distinct databases for various purposes. They are fed and maintained by different institutions, such as the federal police organizations, state police organizations, or domestic secret services. The rules for gathering and exchanging data as well as for the access to the information for different institutions are hardly comprehensible. They are defined by federal data protection and criminal justice law (e.g., Bundesdatenschutzgesetz, Bundeskriminalamtgesetz, Strafprozessordnung), and various other laws and orders at state and federal level.45 Beyond that, several technical orders and so-called “Errichtungsanordnungen” determine the architecture, use and purposes of databases installed by the police.46 As stated by the German Government, this opaque framework still lacks a legal definition that covers data mining measures like profiling.47 This results in serious threats to informational self-determination, and in particular cases it affects citizens’ political participation and ultimately even the development of a liberal democracy. For example, the German Federal Criminal Police Office, Bundeskriminalamt (BKA), maintains databases for politically motivated offenders (distinguished as left, right and foreign offenders), which are fed by and accessible to the state criminal police offices (Landeskriminalamt, LKA). The information stored can be used, for example, to reconstruct social networks, allocate people to groups or institutions, or to identify people to be kept away from certain events of special interest, for instance NATO or G8 summits.
First findings of interviews conducted within a PROFILING case study48 with activists who are involved in civil rights groups show that interviewees considered data gathering, exchange and its use in policing practice as non-transparent and therefore intimidating, especially for people who are just starting to join civil rights groups. (Potential) activists do not know if and which information is gathered at which events, for which reasons, to whom this information is accessible, and how it might be used – or if it could lead to further police measures. This uncertainty may hinder the exercise of civil rights or lead to adaptive behaviour. Persons might change their behaviour in order not to seem conspicuous or suspicious and to avoid being linked with e.g. civil rights groups. Even though the technology used in this context cannot be considered fully automated profiling, the computer-assisted data storage and representation already leads to opaque structures which undermine informational self-determination and restrain citizens’ political participation. Furthermore, it indicates challenges emerging from “predictive policing” approaches which aim at using (semi-)automatically generated profiles to score the risk of certain groups and individuals committing particular crimes.
1.4.2 Fundamental Rights
The fundamental values presented before are closely interrelated with the right to privacy and data protection and with the protection from discrimination. As clearly underlined by Rodotà, “the strong protection of personal data continues to be a ‘necessary utopia’ if one wishes to safeguard the democratic nature of our political systems.”49 Data protection is necessary in a democratic society, as Rouvroy and Poullet pointed out, to sustain a vivid democracy. The right to non-discrimination is equally important.50 It is not by chance that the European Court of Justice, in two recent profiling-related cases,51 has invoked both data protection and anti-discrimination legislation to protect citizens’ rights.
1.4.2.1 The Right to Privacy and the Right to Data Protection
Leaving aside all difficulties of defining the various notions of privacy,52 it is useful to briefly revisit the interplay between privacy and data protection. Following Gellert and Gutwirth, most privacy definitions53 can be summarized in either the problem of being left alone, or the question of how to cope with information stemming from social interaction in a way that certain areas of one’s personal life are hidden from unwanted views.54 Data protection law, however, is made to ease the free flow of information by safeguarding personal data. In this respect privacy is a matter of opacity while data protection is related to transparency.55 In the field of profiling it is highly relevant to consider the scope of both terms: while privacy is broader in the sense that it covers more than mere personal data, the misuse of personal data can affect much more than someone’s privacy. As outlined above, various technologies nowadays potentially create digital data which can be part of automated processing and profiling. Accordingly, the concepts of privacy and data protection are increasingly challenged by the capabilities of data usage and analytics. The concepts evolve over time as technologies develop and have to catch up with the constant progress: “its content varies from the circumstances, the people concerned and the values of the society or the community.”56 Moreover, profiling technologies, as shown in this paper, lead to more black boxing and more opacity of data processing. It is in fact questionable how the factual use of data can be made transparent.
In order to build an exhaustive framework of the threats to the right to privacy and the right to data protection, the OECD Privacy Principles57 are taken as a term of reference, being one of the most comprehensive and commonly used privacy frameworks.58
These principles include (1) Collection Limitation Principle: data should be obtained by lawful and fair means and with the knowledge or consent of the data subject; (2) Data Quality Principle: data which are to be used should be accurate, complete and kept up-to-date; (3) Purpose Specification Principle and (4) Use Limitation Principle: the purposes for which data are collected should be specified, and the data should only be used for the specified purposes; (5) Security Safeguards Principle: personal data should be protected by reasonable security safeguards; (6) Openness Principle: there should be a general policy of openness about developments, practices and policies with respect to personal data; (7) Individual Participation Principle: individuals should have the right (a) to obtain the data stored relating to them; (b) to be informed about data relating to them; (c) to be given reasons if a request made under subparagraphs (a) and (b) is denied, and to be able to challenge such denial; and (d) to challenge data relating to them and, if the challenge is successful, to have the data erased, rectified, completed or amended; (8) Accountability Principle: a data controller should be accountable for complying with measures which give effect to the principles stated above.59
RFID-enabled travel cards (as used in many metropolises, e.g. the Oyster Card in London and the Octopus Card in Hong Kong) can serve as an example of how new technologies challenge the right to privacy and data protection. The cards contain personal information about their holders so that they can be allocated to a single person to avoid abuse by others. Beyond that, the RFID chips can be used to generate sophisticated traveler profiles,60 or even consumer profiles where the cards can also be used to pay in shops. Furthermore, traveling profiles could be used to find suspicious traveling patterns, revealing potentially deviant behaviour (e.g. people who use uncommon numbers and combinations of subway stations, indicating activities from drug dealing to infidelities, as illustrated in Doctorow’s novel “Little Brother”). This shows that data which is not conceived as sensitive or potentially harmful can become so through combination with other data.61 Even data which is anonymized or de-identified can be used to generate outcomes which lead to issues ranging from privacy infringements to discrimination. Furthermore, the effectiveness of anonymization and de-identification is doubted by scholars. Big Data analytics allow unpredictable inferences to be drawn from information and thereby undermine strategies of de-identification, as identities can be reconstructed by combining anonymized data.62 New technologies such as RFID chips make it difficult to keep track of which information is collected for which purposes and of the factual use of such data. The temptation for those gathering data to use it in new ways and generate new knowledge is high, and becoming aware of such (unspecified) use can be very difficult.
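How combination with other data can undermine de-identification may be sketched in a few lines of Python. This is a minimal illustration only: the trip log, the card pseudonyms, the station names and the auxiliary sightings are all hypothetical, and the sketch merely assumes that an observer knows a few places and times at which a named person travelled.

```python
# Hypothetical de-identified trip log: (card_pseudonym, station, hour_of_day).
# All values are invented for illustration.
TRIPS = [
    ("card_17", "Central", 8), ("card_17", "Harbour", 18), ("card_17", "Airport", 6),
    ("card_42", "Central", 8), ("card_42", "Museum", 18), ("card_42", "Park", 12),
    ("card_99", "Harbour", 18), ("card_99", "Park", 12), ("card_99", "Central", 9),
]

# Hypothetical auxiliary knowledge about one named person, e.g. from public
# posts or shop receipts: places and times at which she was seen.
KNOWN_SIGHTINGS = {"Alice": [("Central", 8), ("Harbour", 18)]}


def candidate_cards(trips, sightings):
    """Return the pseudonyms whose trips contain every known sighting."""
    by_card = {}
    for card, station, hour in trips:
        by_card.setdefault(card, set()).add((station, hour))
    return sorted(c for c, seen in by_card.items() if set(sightings) <= seen)


matches = {name: candidate_cards(TRIPS, s) for name, s in KNOWN_SIGHTINGS.items()}
print(matches)
```

In this toy log two sightings are enough to leave a single candidate pseudonym, after which every other trip recorded for that card (here an early journey to the airport) becomes attributable to the named person.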
The discussions about putting data protection into practice through measures of accountability aim at making the use of data proactively transparent and traceable, but the practical implementation is complicated.63 There is a general lack of transparency in profiling techniques,64 and data processors’ accountability is also challenged by the opaque practices and black-boxed technologies inherent to data mining and profiling. As a result, both the Security Safeguards Principle and the Openness Principle are far from being taken into consideration. Individuals become more and more transparent, while public bodies, and even private companies, become more and more intrusive, moving along legal borderlines.
The Right to Non-discrimination
The right to non-discrimination “emanates from the general postulate of the equal dignity of human beings.”65 It constitutes a general principle in EU law and has more recently been enshrined as a fundamental right in Article 21 of the Charter of Fundamental Rights of the European Union. It consists of a general principle of equality (i.e. similar situations have to be treated in the same way and different situations have to be treated differently) and of specific provisions developed in anti-discrimination legislation related to certain protected grounds (e.g. age, race, gender, religion, sexual orientation, etc.) and specific domains of application (i.e. labour market, vocational training, education, social security, health care, access to goods and services, criminal law).
The basic conceptual distinction in EU law is that between direct and indirect discrimination, both of which are prohibited. Direct discrimination occurs when a person is treated less favourably than another and this difference is based directly on a forbidden ground. Indirect discrimination occurs when apparently neutral criteria, practices or procedures have a discriminating effect on people from a particular protected group. This distinction is highly relevant in the context of profiling because the classification and categorization made by profiling techniques rarely occur directly on forbidden grounds. More often the categorization is based on algorithms that classify attributes which can act as proxies for a protected ground. As stated by Romei and Ruggieri, “the naive approach of deleting attributes that denote protected groups from the original dataset does not prevent a classifier to indirectly learn discriminatory decisions, since other attributes strongly correlated with them could be used as a proxy by the model extraction algorithm.”66 The best-known example is that of “redlining”, which is explicitly forbidden by US law. The term denotes the practice of denying products and services in particular neighbourhoods, marked with a red line on a map. Due to racial segregation or the increasing demographic concentration of people of similar social class, employment condition and even nationality, people living in a particular neighbourhood may belong to a specific racial group or an ethnic minority. Hence, the use of an apparently neutral attribute such as ZIP code may result in indirect discrimination. In general, profiling applied to marketing (web marketing, loan market, price determination, etc.) can easily hide practices of indirect discrimination. For this reason research on data mining techniques that prevent discrimination (a kind of “discrimination proof data mining”) is a fruitful field.67
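The proxy effect described above can be illustrated with a minimal Python sketch. The applicant records, ZIP codes, group labels and the `approve` rule are all hypothetical and invented for illustration; nothing here is taken from Romei and Ruggieri or from a real dataset. The decision rule never sees the protected attribute, yet approval rates differ sharply between groups because ZIP code is correlated with group membership.

```python
from collections import defaultdict

# Hypothetical applicants: (zip_code, protected_group). Illustrative data only.
APPLICANTS = (
    [("10001", "A")] * 8 + [("10001", "B")] * 2 +
    [("20002", "A")] * 2 + [("20002", "B")] * 8
)

REDLINED_ZIPS = {"20002"}  # assumption: neighbourhoods the rule excludes


def approve(zip_code):
    """Decision rule that uses ZIP code only; the group is never consulted."""
    return zip_code not in REDLINED_ZIPS


def approval_rates(applicants):
    """Return the share of approved applicants per protected group."""
    approved = defaultdict(int)
    total = defaultdict(int)
    for zip_code, group in applicants:
        total[group] += 1
        approved[group] += approve(zip_code)
    return {g: approved[g] / total[g] for g in total}


rates = approval_rates(APPLICANTS)
# Ratio of the lower approval rate to the higher one (1.0 would mean parity).
disparity = min(rates.values()) / max(rates.values())
print(rates, disparity)
```

With this invented data, group A is approved in 8 of 10 cases and group B in 2 of 10, although the protected attribute was never an input to the rule. Comparing outcome rates per group in this way is one simple check among several; it is shown here only to make the proxy mechanism concrete.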
Another example is the smart approach to border surveillance. It relies on the use of technologies to automatically check passengers at the border (so-called smart borders). This use of technology consists of databases, sophisticated tools such as body and iris scanners, and comprehensive surveillance programmes (e.g. Eurosur) whose final aim is to speed up border crossing for bona fide travellers, fight illegal migration and enhance security. The proposed databases (Passenger Name Record, Registered Traveller Programme, Entry/Exit System) rely on an extensive collection of personal and non-personal data in order to differentiate between welcome and unwelcome travellers. Besides the risks related to privacy and data protection due to the use of biometrics and the lack of respect for the principles of purpose-binding and use limitation, the opacity of the logic behind the data mining procedure is in itself hard to reconcile with the obligation not to discriminate on prohibited grounds and, above all, raises huge concerns regarding respect for human dignity.
The manifold risks which profiling poses to fundamental values and rights, as well as the complex effects of implementing this technology, show that providing adequate measures to protect European values and rights is a challenge. The next section gives a brief overview of the state of this process in Europe.
1.5 So Far so Good – Regulating Profiling
In the current EU data protection legislation the word profiling does not appear. However, Article 15 of the Directive 95/46/EC (hereinafter, Data Protection Directive, DPD) concerns ‘automated individual decisions’ and thus is closely related to profiling. According to Article 15(1): “every person has the right not to be subject to a decision which produces legal effects concerning him or significantly affects him and which is based solely on automated processing of data intended to evaluate certain personal aspects relating to him, such as his performance at work, creditworthiness, reliability, conduct, etc.” At the same time, Article 15(2) states an exception: “a person may nevertheless be subjected to an automated individual decision if that decision is taken: (a) in the course of the entering into or performance of a contract, provided the request for the entering into or the performance of the contract, lodged by the data subject, has been satisfied or that there are suitable measures to safeguard his legitimate interests, such as arrangements allowing him to put his point of view; or (b) is authorized by a law which also lays down measures to safeguard the data subject’s legitimate interests”.
In the light of Article 15 of the DPD, it is relevant whether the processing is meant to evaluate a certain aspect of the person’s behaviour, character or identity on which a decision can be based. A decision based on a profile can comply with the law, but a natural person has to be involved in the process. To sum up, Article 15 does not take the form of a direct prohibition on a particular type of decision-making; rather, it directs each EU Member State to confer on persons a right to prevent them from being subjected to purely automated decisions in general.68