Ethical Data Collection for Regular Developers

by Colin Fleming

In his talk at RubyConf 2018, Colin Fleming addresses the critical theme of ethical data collection, drawing on his work in a particularly sensitive context: the DC Abortion Fund. He underscores the profound responsibilities developers hold in handling data ethically while navigating the risks and potential harms that can arise. Fleming begins with a philosophical thought experiment involving a teleportation device that, despite its benefits, demands fatal ethical compromises, symbolizing the weight of ethical decisions in technology.

Key points from the discussion include:

  • Ethical Duality of Software: Software can be transformative but also dangerous, necessitating careful consideration of developers' responsibilities.
  • Public Perception and Accountability: Ongoing industry scandals have decreased trust in technology; developers must accept accountability for their choices and the systems they create.
  • Framework for Ethical Decision-Making: Fleming suggests defining ethics rooted in shared and individual values, incorporating principles from the Association for Computing Machinery (ACM) code of ethics as a guide.
  • Values of the DC Abortion Fund: Key values include ensuring individual agency in making personal choices and recognizing the need for transparency regarding data collection, highlighting the tension between asking necessary questions and respecting client comfort.
  • Case Studies and Decision Points: The team discusses real-world scenarios illustrating dilemmas faced at the DC Abortion Fund, such as recalibrating their approach to collecting sensitive information in response to changing social and political contexts.
  • Balancing Data Collection and Patient Respect: Decisions around data retention and use are weighed against the potential risks posed to individuals, ensuring that actions align with their established values.

Fleming emphasizes the importance of maintaining a mindset prepared to confront ethical questions, encouraging developers to reflect on their values and the power dynamics involved in their work. He concludes by asserting the need for practical approaches to ethical dilemmas, advocating for direct engagement with the ethical implications of data usage, and reinforcing the importance of accountability in the tech industry.

00:00:15.560 Thanks very much, Jamie, and thank you all for coming and joining us for a drink. I'm Colin, and it's really lovely to see you all. In this session I'm going to be talking about data ethics and how we can reason about the way we handle data in our work. Every talk in this track has set a high bar, so I have some tough acts to follow, but this talk serves as a connective thread between them. As developers, we have the chance to work on projects that can significantly impact people's lives. Sometimes the software we create is truly transformative; other times, it can be quite dangerous. A lot of the time, it sits somewhere in between, or embodies both at once.
00:01:07.820 This talk is about what that duality means for us in our everyday jobs and how to handle that responsibility. Just a quick note: although I'll touch on some heavy topics, including the work of the DC Abortion Fund, I'm not out to ruin anyone's good time at this conference. If anyone feels triggered by the subject matter, I will not be offended if you want to step out.
00:01:46.409 Let’s dive in. A philosopher in the 90s presented a thought experiment: imagine an invention that guarantees personal freedom of movement, allowing individuals to teleport anywhere instantaneously, reminiscent of the Star Trek transporter. This would eliminate distances that typically restrict our movements, offering incredible advantages. However, such freedom comes with a severe cost—the fuel powering this invention necessitates the annual sacrifice of 30,000 randomly selected individuals. If you're tasked with writing software for this invention, what would you do? Would you proceed, believing the freedom it offers outweighs the cost of human sacrifice? Or would you reject involvement due to the ethical implications of risking lives, despite the incredible benefits it promises? Perhaps you'd consider a compromise, restricting access to a select few and thus reducing the death toll, albeit at the expense of equitable access. As developers, how prepared do we feel to tackle such ethical questions with the tools we currently have?
00:02:37.950 This thought experiment serves to highlight how cars function. They grant us the freedom to travel anywhere, empowering us significantly, albeit at the cost of fatalities on the road that we have accepted as a given in exchange for that freedom. While most of us may not work on systems that reshape the world as dramatically as cars do, the software we create still involves varying degrees of risk to users. Our work can lead to negative outcomes for individuals, particularly when human data is involved. It’s crucial that we, as developers, ensure we treat people ethically and responsibly while using their data. There has never been a better time to critically examine our approach to people’s data, given the current state of our industry.
00:04:25.560 The public perception of technology has suffered significantly due to various scandals and mishaps rooted in misuse of data that places individuals at risk. Companies like Experian analyze consumer financial records, Facebook monetizes data to optimize advertising, and Palantir employs algorithms to dictate police presence in neighborhoods. As we discuss these ethical dilemmas, it's essential to recognize that developers like you and me are the ones who implement these systems. So as we confront these ethical questions, we must understand both our shared responsibility and our power to influence outcomes.
00:05:05.340 I believe that facing ethical dilemmas in our work is a common occurrence, and we should tackle them head-on. Many systems utilizing human data offer opportunities for genuine improvement in people's lives. It’s gratifying that our careers provide us with such potential; however, we must not shy away from recognizing the associated risks and ethical implications. My premise is that these ethical complications are akin to technical issues. Just as failing to write efficient code can have consequences, ignoring the risks we impose on individuals when using their data can yield harmful results.
00:06:01.080 I want to set the stage for some critical thinking that prepares us for the ethical challenges we will undoubtedly encounter in our data work. We'll begin by defining ethics as developers to guide our inquiries into our work. Next, we will identify our core values, so we know what is vital to us as individuals and as a team. Different groups uphold diverse values, so we will start from a broader professional level and focus down to a team context. Finally, we'll review real decision points encountered in a recent project at the DC Abortion Fund, applying this framework to the complex challenges that arose.
00:07:40.290 Now, please don't share this with any philosophy professors, but here’s a working definition of ethics we can use throughout this talk: ethics involves making decisions based on our shared and personal values. Two key points resonate with me about this definition. First, ethics is rooted in shared and individual values. It signifies that doing the right thing is about interpreting the values of our community, our profession, and ourselves. Second, ethics entails action; we are empowered to take decisive steps according to our values.
00:08:50.080 I’ll employ a humorous tweet from the internet as an opportunity to take a quick sip of water while we reflect on this point. This example illustrates the power of our actions, underscoring that even in straightforward situations, such as an online survey, the decision to acknowledge a product lies with us. Likewise, as developers, we carry the ethical responsibility concerning how we manage data and cannot delegate that accountability upwards, even if pressured by management or facing tight deadlines.
00:09:59.280 I'm emphasizing that there's real strength in being responsible for implementing these systems. This work provides us the chance to shape the world according to our ethical principles. Regarding shared values, it’s vital to understand that this concept can differ significantly among individuals. If you seek common ground, I highly recommend the Association for Computing Machinery (ACM) code of ethics and professional conduct. This comprehensive document addresses shared values specifically for developers, offering a foundational guide.
00:11:14.790 Now, let’s conduct an exercise to establish our core values, starting with the ACM. While I won’t dissect it line by line, I will summarize the essentials: first, everyone is a stakeholder in computing, given that modern life is powered by technology; second, we must avoid harm and respect individuals’ privacy. Subsequent sections emphasize that a reactive, 'yo-yo' approach is insufficient in ethical programming, particularly regarding security and protecting individuals. This framework serves as a starting point for developers while allowing us to incorporate our organizational and personal values.
00:12:21.200 For example, the DC Abortion Fund’s values stem from its founding mission to eliminate financial barriers to abortion. It exists to help individuals afford abortion services when they otherwise could not. Deciding whether to terminate a pregnancy can be profoundly challenging for many, and it’s essential that people maintain the agency to make their own choices. From this mission, we derive two pertinent values: first, as we provide funding for abortion care, we aim to empower individuals to make their own choices; second, it is acceptable to ask personal questions if transparently aiming to improve the service provided.
00:13:49.110 However, asking questions introduces trade-offs: while more probing questions help us understand the people we serve, they can also create stress for individuals already in sensitive situations. Our engineering team therefore has to weigh the value of those insights against the discomfort that invasive questions may impose.
00:14:54.530 As an engineering team, we have our own values. First, we acknowledge that collecting data carries inherent risks, and for every data point we gather, we must have a compelling rationale that justifies the risk imposed on our patients. This is at odds with the opportunity to collect extraneous data for later analysis. Second, as we work with a population often facing financial difficulty and experiencing anxiety, we must avoid exploiting them for our research and optimization goals. This situation can limit our ability to innovate or refine our product.
00:16:35.710 Nevertheless, we hold a firm agreement as a team to not overreach with our data collection—strictly adhering to our principles is more important than pursuing wide-ranging optimization.
00:17:10.880 Now that we have identified our values on individual, organizational, and team levels, let’s explore some case studies. We’ll apply a rough framework of questions derived from our previously identified values to ensure our actions align with our ethical commitments. For instance, we might ask: Is what we’re proposing in line with our values? In cases of conflict between our values, how do we resolve these differences? These inquiries allow us to measure our decisions against our ethics and help navigate murky waters.
00:18:14.560 Asking whether our approach is ultimately fair toward patients is also paramount. People who contact the DC Abortion Fund come to us because they need help, and that dependence creates a power imbalance. We need certain information to assist them effectively, but collecting it can conflict with our commitment to empowering our clients. Hence, sometimes we need to prioritize our core value of facilitating client agency over collecting data.
00:19:28.780 Next, we need to consider the potential harm associated with our decisions. This encompasses everything from minor inconveniences to significant safety risks, and recognizing these dangers is critical to informed decision-making. When we weighed whether to hold on to personal data indefinitely, we grappled with those ethical responsibilities and ultimately decided to dispose of personally identifiable information after a set period.
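To make that kind of retention decision concrete, here is a minimal sketch of what a scheduled purge might look like in a Rails-style codebase. The model name, column names, and 90-day window are assumptions for illustration, not DCAF's actual schema or policy.

```ruby
require "active_record"

# Hypothetical retention sweep: once a case record is older than the agreed
# retention window, null out anything that could identify the caller while
# keeping non-identifying fields needed for aggregate reporting.
# Assumes an ActiveRecord database connection is already established.
class CaseRecord < ActiveRecord::Base; end

RETENTION_WINDOW = 90 * 24 * 60 * 60 # 90 days in seconds (an assumed policy)

def scrub_expired_pii!
  cutoff = Time.now - RETENTION_WINDOW
  CaseRecord.where("created_at < ?", cutoff).find_each do |record|
    record.update!(name: nil, phone: nil, email: nil, notes: nil)
  end
end
```

A job like this could run nightly; the point is that the retention period is enforced by code rather than by someone remembering to clean up.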
00:20:36.380 The final element emphasizes accountability: if we make a mistake with data handling, what are the repercussions? Mismanagement can lead to severe consequences, so it's essential that we maintain strong safeguards. In our case, the processes we implemented, such as data shredding, helped us mitigate some of these risks and reassured us that we would not put our users in precarious positions.
00:21:29.180 Given the business aspects of data collection, the questions we pose also shift. The DC Abortion Fund operates without the direct profit motive of for-profit organizations, allowing us to prioritize service delivery. If the situation were different, we might be tempted to normalize less ethical practices under financial pressure. Put simply, the values of a mission-driven organization help us make ethical decisions.
00:22:57.500 Reflecting on our previous decision to collect sensitive information regarding patient immigration status, we initially believed that the risk was minimal and the data could improve service delivery. However, this changed rapidly as policy shifts at the national level led to an increase in operations targeting undocumented individuals. We quickly recognized our judgment was flawed and determined we could no longer justify retaining such sensitive data. Ultimately, while some individuals may have received reduced services as a result, we felt it necessary to eliminate potential points of exposure and unnecessary risk.
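As a rough illustration of what "no longer retaining" a field can look like in practice, a one-way migration can both drop the field from the schema and destroy the data already collected. The table and column names below are hypothetical, chosen only to mirror the immigration-status example above.

```ruby
require "active_record"

# Hypothetical one-way migration: stop storing immigration status at all.
# Dropping the column removes both the question's backing storage and the
# data that had already been collected.
class RemoveImmigrationStatusFromCaseRecords < ActiveRecord::Migration[5.2]
  def up
    remove_column :case_records, :immigration_status
  end

  def down
    # Deliberately irreversible: this data should not be reintroduced.
    raise ActiveRecord::IrreversibleMigration
  end
end
```

A schema change alone would not touch backups or exports, so a decision like this also implies reviewing any copies of the data that exist outside the primary database.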
00:24:57.370 Moreover, the ethical implications of predictive modeling with collected data drew skepticism from our team. Despite the potential benefits in providing tailored support, we acknowledged the inherent risks in applying opaque algorithms to assess patient needs. We felt this approach could perpetuate inequities and were uncomfortable accepting these moral dilemmas.
00:27:25.950 These scenarios highlight that as developers, we have a duty to navigate ethical landmines in our work, just as we would address technical challenges. While dilemmas do not possess a one-size-fits-all resolution, acknowledging our core values and understanding power dynamics in our systems can guide us through complex decisions. Ultimately, we must confront these ethical questions head-on, instead of shying away from them.
00:29:57.390 In summary, we need to cultivate a mindset ready to face ethical challenges similar to how we would approach technical issues, ensuring we don't inadvertently harm others. A practical approach involves reflecting upon our values and understanding power structures at play. While we lack straightforward formulas for ethical conduct, our experience and practice can lead us to better decision-making as individuals and collectives. Let’s commit to addressing ethical concerns surrounding data directly, ensuring we act in ways that reflect our values.
00:32:21.310 Lastly, I want to extend my gratitude to Jamie, the curator of this track, as well as the technical staff responsible for making this event successful. Thank you for attending this discussion on data ethics. Please enjoy this picture of Olga, a cat who truly enjoys attention. My contact information is displayed, so feel free to reach out if you have any thoughts or questions. I appreciate you all being here.