Chapter 3 | Data Ethics Guidebook

The 'Informed' Part of Informed Consent:
Data Fluency for End Users

By MJ Petroni and Jessica Long, with Steven Tiell and Harrison Lynch

Informed consent requires a new level of Data Fluency from both end-users and organizations. For consent to be informed, end-users must understand what data is being gathered, who will have access to it, and how it will be used.

Ethically, however, it is also incumbent upon organizations gathering data to identify potential harms and share them with users. These harms should be presented in a form users can understand and be weighted by their potential impact and the statistical likelihood of their occurrence. Companies managing private data could, for example, alert users to the “side effects” of sharing data in the same way drug companies must in their advertisements. To request and achieve informed consent, organizations must first understand and then clearly communicate how they and their partners will 1) use data and 2) ensure that the data will not be accessed or used in ways that fall outside the scope that has been, or will be, communicated to end-users. This is the 'consent' element.
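One way to read "weighted by their potential impact and the statistical likelihood of their occurrence" is as a simple expected-impact score. The sketch below is only an illustration of that reading: the `PotentialHarm` structure, the 0 to 10 impact scale, and the sample values are all assumptions made for this example, not something the guidebook prescribes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PotentialHarm:
    """One harm an organization might disclose (hypothetical structure)."""
    description: str
    impact: float      # assumed severity on a 0-10 scale
    likelihood: float  # assumed probability of occurrence, 0.0-1.0

    @property
    def weight(self) -> float:
        # Expected-impact weighting: severity multiplied by probability.
        return self.impact * self.likelihood


def rank_harms(harms):
    """Return harms ordered from highest to lowest weight."""
    return sorted(harms, key=lambda h: h.weight, reverse=True)


# Illustrative values only; they are not drawn from any real assessment.
harms = [
    PotentialHarm("Targeted ads based on message content", impact=3, likelihood=0.9),
    PotentialHarm("Account data exposed in a breach", impact=9, likelihood=0.05),
    PotentialHarm("Data resold to an unknown third party", impact=6, likelihood=0.2),
]

for harm in rank_harms(harms):
    print(f"{harm.weight:.2f}  {harm.description}")
```

A multiplicative score is only one possible choice. A rare but severe harm ranks low under it, so an organization might reasonably decide to disclose high-impact harms prominently regardless of their score.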

Common sense dictates that users should also derive some value from disclosing their data. If they do not, companies will struggle to receive consent no matter how good the messaging around data usage is. The requirements for informed consent fall under the larger concept of “Data Fluency”: awareness of how digital data affects all parts of modern life. Discussion of Data Fluency raises questions about the feasibility of, and responsibility for, education around data collection and use at both the public and enterprise level. How much time and attention should people reasonably be expected to devote to understanding the intricacies and implications of data collection and use? What level of responsibility should be placed on organizations seeking consent?

Selecting “accept” on an End-User License Agreement (EULA) may count as informed consent from a legal perspective. But is the information in the small print really accessible to most users in a way that satisfies the ethical challenges surrounding data monetization, data sharing, and the myriad other ways that individual data is used or could be used in the future? The uses of data, and the distinctions of who benefits from those uses (and how), are constantly evolving, and imagining those uses is part of Data Fluency. As a result, it isn’t easy to define an endpoint for consent in this context. There is perhaps no such thing as being truly data-fluent in the general sense; we can only attend to specific uses and types of data and, on an organizational level, commit to transparency and to iterating on consent agreements as organizations continue to explore the value and dangers of personal data as a resource in the digital age. End-user license agreements are premised on the idea that organizations and end-users are digitally literate (prepared to imagine the impact of their disclosures) and that both speak the same language about data. Without that awareness and that shared language, EULAs do not deliver informed consent.

With the information and understanding required for consent and ethical use of data constantly changing and hard to measure, best practices for organizations that collect and use data focus on transparency and communicating intent. From a customer and partner relationship perspective, there’s an obvious benefit for organizations that make a visible and genuine effort to provide information about their data use. Critically, that information must be provided in terms that everyday users can understand and accept—or reject—with confidence. One might also propose that the process of converting the most common jargon in EULAs and Terms of Service (TOS) documents to everyday language would go a long way toward having people within organizations understand, and be honest with themselves, about the ethical nuances of data collection and use.

Many companies have a disincentive to disclose data uses, for two reasons. First, their data use may not be in the obvious interest of the user (e.g., marketing/advertising emails). Second, understanding how data is collected, transformed, and protected (or made vulnerable) can scare users. Whether or not that fear is justified is not up to companies to decide, but it can be influenced through education provided in good faith.

An example of the gray area created by a lack of Data Fluency can be seen in misunderstandings between Google and some of its users about how it was (or was not) using the content of customer emails to provide targeted ads through free versions of its Gmail service.⁷ Because users did not understand (and Google did not effectively communicate) where customer data was being processed and what the implications of that were for users, stories of upset customers raised skepticism about the integrity of Google’s handling of private information.

Ethical practice is particularly complex when intent, consent, and benefit are subject to very different interpretations. Consider Facebook’s massive, longitudinal study on the impact a positively or negatively skewed news-feed had on a user’s own posts.⁸ When this study was announced (after the fact), there was immediate backlash. While Facebook may have been within the bounds of their EULA to conduct this study, the response shows that they had misjudged how users would react. Users responded poorly to the news that their behavior was being studied based on manipulating the information being presented in their feeds. In addition, it was unclear whether their unwitting participation in the study would lead to better products and services (which might at least provide some positive outcome) or if their results would be used to steer spending or ad placement (which might make the study feel exploitative). This study existed in a controlled environment with an institutional review board (IRB), responsible for ensuring study participants were treated fairly, but the response when the information was made public was not entirely positive.⁹ In response to this reaction, Facebook has taken steps to publish a framework that details the guidelines and best practices they will utilize in research, to prevent miscommunication around future studies.¹⁰, ¹¹ However, typical A/B (and multiple variable) software testing is not required to undergo these same review processes. When changing variables ‘A’ and ‘B’ in ways that could have real impacts on the emotions of users (or in the physical world), organizations need to be clear about how they intend to use the resulting data.

Data transformation and use

Informed consent requires sufficient understanding by all parties of how data will be transformed into meaningful information. It is in this transformation that many of the unintended consequences of data sharing and data collaboration take form. Use implies access, but the real issue at hand is accessing data to transform it into something else—information. This information is then used on its own, as insight, or to trigger actions. Whether those actions are executed by humans after manual digestion of those insights, or those actions are a response to logic programmed in advance by humans, the world of human ethics collides with the world of binary logic in a messy, hard-to-track decision tree with effects often more evocative of chaos theory than simple branched charts. Given this complexity, the best approach to managing user expectations around data transformation and use of resulting information is to provide clarity at the time of data collection as to intended and potential future uses. The goal is to ensure that meaningful consent is achieved.
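The recommendation to state intended and potential future uses at the time of collection can be sketched as a consent record that later data uses are checked against. Everything here is hypothetical: the `ConsentRecord` schema, the `check_use` helper, and the use labels are assumptions chosen for illustration, and a real system would need far richer semantics than string matching.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class ConsentRecord:
    """What a user agreed to at the time of collection (hypothetical schema)."""
    user_id: str
    granted_on: date
    intended_uses: set = field(default_factory=set)
    potential_future_uses: set = field(default_factory=set)

    def permits(self, use: str) -> bool:
        # A use is in scope only if it was disclosed when consent was given.
        return use in self.intended_uses or use in self.potential_future_uses


def check_use(record: ConsentRecord, use: str) -> str:
    """Report whether a proposed use fits the scope the user consented to."""
    if record.permits(use):
        return "allowed"
    return "needs renewed consent"


# Illustrative record; the use labels are made up for this example.
record = ConsentRecord(
    user_id="user-001",
    granted_on=date(2016, 6, 1),
    intended_uses={"service_delivery"},
    potential_future_uses={"aggregate_analytics"},
)

for proposed in ("service_delivery", "aggregate_analytics", "ad_targeting"):
    print(proposed, "->", check_use(record, proposed))
```

Treating an undisclosed use as "needs renewed consent" rather than silently allowing it mirrors the text's point that use outside the communicated scope should send the organization back to the user.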

Complexities of law and jurisdiction

While it is a common method of securing consent from a legal perspective, requiring users to grant use of data before accessing necessary services is often not in the user’s best interests. Moreover, such agreements may play on a lack of data fluency, even more so with use of complex legal language. End-User License Agreements (EULAs) and Terms of Service (TOS) agreements are the places where most of these data exchange agreements occur. The Electronic Frontier Foundation has been warning users of the rights they relinquish in these agreements since at least 2005. Case law in the United States and other jurisdictions provides limited protections to end-users and corporations, so the enforceability of EULAs and TOS agreements varies by jurisdiction.

There has been widespread debate over enforceability within complex data relationships—especially those in which companies, users, and strategic data partners may exist in jurisdictions with conflicting case law. Perhaps most notable is the European Union’s decision to insist that all European user data be stored on servers housed in the EU, both to protect users from environments with less privacy-focused regulatory controls than the EU, and also to prevent government-sponsored surveillance and tampering.¹² However, this approach is incomplete because it is based on the belief that data being used is primarily at rest—stored statically—and primarily accessed by the same parties who stored it, on behalf of the same users. When data is often in motion between various servers and algorithms with massively complex interdependencies across state and corporate lines, such regulation provides little real protection for users. At the other end of the spectrum, strict interpretation could silo data about users in a way that limits meaningful use of the data by the people furnishing, storing or transforming it.

References

7. Seshagiri, A. (2014, October 1). Claims That Google Violates Gmail User Privacy. The New York Times. Retrieved June 1, 2016.

8. Kramer, A. D., Guillory, J. E., & Hancock, J. T. (2014). Experimental evidence of massive-scale emotional contagion through social networks. PNAS, 111(29), 10779.

9. Goel, V. (2014, June 29). Facebook Tinkers With Users’ Emotions in News Feed Experiment, Stirring Outcry. The New York Times. Retrieved June 1, 2016.

10. Schroepfer, M. (2014, October 2). Research at Facebook [Facebook Newsroom]. Retrieved April 3, 2016.

11. Jackman, M., & Kanerva, L. (2016). Evolving the IRB: Building Robust Review for Industry Research. Washington and Lee Law Review Online, 72(3), 442.