Consequences of an Insightful Algorithm

EuRuKo 2016

00:00:03.670 Our next speaker is Carina C. Zona. She is a developer, advocate, and certified sex educator. She is also the founder of CallbackWomen, an evangelist for Ruby Together, and a co-organizer of We So Crafty. So let's welcome her to the stage.
00:00:32.020 Hi! As she said, my name is Carina C. Zona, and you can find me pretty much everywhere on the internet as cczona, including on Twitter. This talk is a toolkit for empathetic coding. We'll be delving into some specific examples of uncritical programming and the painful results that can arise from benignly intended actions.
00:00:50.360 Before we get started, I want to offer a content warning because I will be discussing some intense topics. These topics include grief, post-traumatic stress disorder, depression, miscarriage, infertility, sexual history, consent, stalking, racial profiling, and the Holocaust. If you would rather not think about these things right now, it's perfectly fine to step out and grab another cup of coffee. I won’t be at all unhappy with that. You will have about ten minutes or so before I delve into those subjects.
00:01:37.539 Algorithms impose consequences on people all the time. We're able to extract incredibly precise insights about an individual, but the question is: do we have a right to know what they do not consent to share, even when they are willingly sharing data that leads us there? This also raises the question: how do we mitigate against unintended consequences?
00:01:44.060 When we talk about algorithms, we usually think in terms of patterns of instructions articulated in code, math, or formulas. The classic thing we associate with algorithms is Big-O notation, right? Bubble sorts and their friends. However, algorithms are far more expansive than this. Essentially, an algorithm is just any step-by-step set of operations for predictably achieving an outcome.
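To make the "classic" sense of the word concrete, here is a minimal sketch of a bubble sort in Python. The function name is just an illustration; the point is only that it is a fixed sequence of steps that predictably produces the same outcome for the same input.

```python
def bubble_sort(items):
    """Return a sorted copy of items by repeatedly swapping adjacent pairs."""
    result = list(items)  # work on a copy so the caller's list is untouched
    for end in range(len(result) - 1, 0, -1):
        swapped = False
        for i in range(end):
            if result[i] > result[i + 1]:
                result[i], result[i + 1] = result[i + 1], result[i]
                swapped = True
        if not swapped:  # a full pass with no swaps means the list is sorted
            break
    return result


print(bubble_sort([5, 2, 9, 1]))  # prints [1, 2, 5, 9]
```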
00:02:05.299 In our daily lives, we encounter algorithms all the time. Things like recipes, directions on a map, or even the process of crocheting a shawl are all algorithms. Deep learning, in particular, is a field of machine learning focused on algorithms for training artificial neural networks rapidly. It has gained great popularity for mining data.
00:02:27.500 Although deep learning has been around in academia since at least the 1980s, only recent breakthroughs, starting around 2012, have made it possible to use deep learning in production at scale rather than in a theoretical academic framework. These advancements have enabled us to extract more sophisticated insights than ever from vast amounts of data.
00:02:47.709 Here's a basic overview of the process: inputs comprise any collection of data, which can include words, images, sounds, objects, or even abstract concepts. The training data does not need to be labeled; instead, we can simply throw raw data into the process without needing specific categorizations. Execution occurs when a series of functions is run repeatedly in what we call a black box, while outputs are predictions of useful properties for drawing conclusions about future similar inputs.
00:03:20.090 Deep learning relies on artificial neural networks to discover patterns within a training dataset automatically and applies those discoveries to infer insights about future data. In our industry, this presents a significant breakthrough for handling big data, and the consequences are quite exciting. Many companies are adopting this technology and doing amazing things with it.
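The overall shape of that process (unlabeled inputs, repeated execution, outputs applied to future data) can be sketched in a few lines. Note the assumption: this is not a neural network. It swaps in one-dimensional k-means clustering, a far simpler unsupervised technique, purely because it fits on a slide. The function names and the sample numbers are made up for illustration.

```python
def fit_centroids(values, k=2, iterations=10):
    """Find k group centers in unlabeled numbers (1-D k-means, k >= 2)."""
    ordered = sorted(values)
    # Deterministic start: spread the initial centers across the sorted data.
    centroids = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            groups[nearest].append(v)
        # Move each center to the mean of its group; keep it if the group is empty.
        centroids = [sum(g) / len(g) if g else c for g, c in zip(groups, centroids)]
    return centroids


def predict(centroids, value):
    """Assign a new, unseen value to the index of the nearest learned center."""
    return min(range(len(centroids)), key=lambda i: abs(value - centroids[i]))


# Training data carries no labels; the grouping is discovered, not supplied.
training = [1.0, 1.2, 0.8, 9.0, 9.5, 10.1]
centroids = fit_centroids(training)
print(predict(centroids, 1.1), predict(centroids, 8.7))  # prints 0 1
```

Even in this toy, nobody told the code what the two groups mean. A deep network does the same kind of thing across thousands of factors, which is where the black box comes from.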
00:03:44.989 However, it’s essential to note that deep learning is based on a black box. The neural network drills down to thousands of subtle factors it believes to have predictive value, and we don’t know what those factors are. This technology is driving significant advancements in various areas, including medicine, pharmaceuticals, emotion detection, sentiment analysis, and even self-driving cars.
00:04:02.980 Today, we'll explore concrete examples involving ad targeting, behavioral prediction, recommendation systems, image classification, and face recognition. First, however, I want to show you a whimsical example: this is MarI/O, an artificial neural network that teaches itself how to play Super Mario World. It starts with no knowledge of the game world, movement, rules, or scores and instead manipulates numbers, noticing that certain actions yield interesting outcomes. This self-training enables it to identify patterns and gain insights to play the game.
00:06:47.660 Now let's play a game called 'Data Mining Fail.' This is a bingo-style game where we examine case studies highlighting insightful algorithms and their pitfalls. Are you ready? Here are some user stories we will analyze. Target, a large department store chain in the U.S., sought to detect when customers were in their second trimester of pregnancy. This is a critical time, as buying patterns can change significantly during this period, representing a powerful opportunity for targeted marketing.
00:08:18.410 One day, the marketers at Target posed an interesting question to one of their programmers: how would you determine if a customer is pregnant without asking them? It turns out that frequent purchases of moisturizer are a significant indicator. However, even though they identified this pattern, we must question the ethics of targeting someone who may not wish to disclose their pregnancy status.
00:09:04.550 Some time later, a man came into one of the stores angrily complaining about pregnancy-related flyers being sent to his teenage daughter, questioning whether Target was promoting sex or pregnancy. The store manager, not responsible for these marketing decisions, apologized, but the next day the man returned, having spoken to his daughter, who was indeed pregnant. This incident underscores the tension between data-driven marketing and individual privacy.
00:10:04.410 In response to the feedback, Target modified their approach, mixing unrelated coupons in with those for pregnancy-related items so that customers would not be alarmed. The rationale was to maintain a level of deception: if customers do not perceive that they have been 'spied on,' the targeting seems to work better. But is this ethical? It certainly raises questions about the trade-offs between effective marketing and customer autonomy.
00:10:50.260 Similarly, Shutterfly attempted to target customers who had recently had a baby to encourage them to send thank you cards to family and friends. Unfortunately, their mass outreach missed those experiencing infertility or loss. Feedback from users highlighted the insensitivity of these marketing messages, which proved detrimental to those grieving or unable to conceive.
00:11:40.720 Mark Zuckerberg shared his personal experience of miscarriages when he announced on Facebook that he and his wife were expecting a child, pointing to the complex emotions surrounding such memories. Facebook’s 'Year in Review' feature had been programmed to display joyful memories, but this can unintentionally cause pain for individuals grappling with loss.
00:12:50.690 This phenomenon, dubbed 'inadvertent algorithmic cruelty,' occurs when an algorithm functions correctly in most cases but does not account for alternative scenarios. Eric Meyer coined this term after a uniquely painful experience: Facebook repeatedly displayed images of his deceased daughter as it rotated between fun backgrounds.
00:13:46.480 Meyer has called upon us to increase awareness and consideration of failure modes, edge cases, and worst-case scenarios. My first recommendation for all of us is to exercise humility. We cannot intuit emotions and private subjectivity; we are looking at external indicators and assuming they reflect what is inside.
00:14:24.110 In December 2014, Eric's blog post gained considerable attention within both the industry and broader media, so many people should have been aware of the need to avoid blindsiding individuals with unpleasant content during sensitive periods.
00:14:31.480 Three months after Eric's experience, Facebook introduced a similar feature called 'On This Day,' which provides reminders of various trivial events that occurred on the same day in past years. However, instead of being a comforting reminder, these reflections can trigger emotional responses. Sometimes, the reminders prompt painful memories that users might not wish to confront again.
00:15:12.790 It is crucial to recognize that we need to learn from our mistakes and those of others. Doing so helps illuminate the subtle consequences of both harmful and benign actions. Fitbit initially included features designed to track users' sexual activity, but this data was defaulted to public access, leading to potential privacy concerns.
00:16:58.390 Uber's internal tool, called 'God View,' allowed employees to track the movements of customers in real time. That access was abused by managers for non-operational purposes. While such tools have their utility, it is imperative to limit their use to the right purposes.
00:17:07.970 Additionally, OkCupid used to share insights with users drawn from its collective dataset. Contrast this with Uber's approach, which often operated in a bubble, focused solely on metrics rather than on enriching user experiences. It is important to call out algorithmic choices that intrude on people's privacy without justification.
00:17:53.670 In one notable study, a black woman searching for her name on Google discovered that ads suggested she had a criminal record, demonstrating racial bias inherent in algorithmic processes. This bias does not end with data inputs and clicks, but rather, it reflects broader social biases, echoing societal flaws.
00:18:49.650 Joanne McNeil introduced the term 'accidental algorithmic run-ins,' summarizing the challenges posed by careless classification, which often leads to scenarios that are difficult for individuals to navigate. This highlights the risks inherent in recommendation systems, especially when they superficially categorize users based on limited data.
00:19:27.170 Face recognition technologies have also raised ethical concerns. Services like Flickr and Google Photos have made mistakes, leading to humorous yet troubling misclassifications. These mistakes reflect the bias embedded in our training sets and algorithms, which means they often miss the subtleties of human experience.
00:20:07.900 Such misclassifications demonstrate the risks of algorithmic hubris: a failure to acknowledge human insight, and a reliance instead on machine judgment that can lead to disturbing results. Google Photos faced backlash for mislabeling images inappropriately, highlighting the dangers of applying algorithms uncritically.
00:20:34.960 The failure to account for historical biases in the development of photographic film has had a lasting effect, and it illustrates the consequences that arise when social biases are reflected in technical artifacts. This shaping of technology often obscures the gaps where individuals with diverse backgrounds are overlooked during development.
00:21:06.320 Consumer lenders, for example, often assess creditworthiness based on a handful of key factors, which may not accurately represent an individual's financial habits or history. Algorithms must not only measure outputs but also recognize the diverse contexts from which biases may stem, ensuring a more equitable approach to data analysis.
00:21:16.780 Moreover, biases ingrained within data are influenced both by how the information has been collected and how we interpret it. Each assumption we make throughout the life of data collection impacts algorithmic outputs, leading to potential discrimination against certain vulnerable groups.
00:21:40.300 As we incorporate various factors into our models, we must remain vigilant about interpreting them fairly to prevent obfuscation. We must challenge the notion that algorithms can provide infallible outcomes; instead, they should augment human understanding.
00:21:55.480 Taking measures against reinforcing privilege is crucial in avoiding perpetuating systemic inequalities. Algorithms that categorize or assess qualities should inherently provide avenues for correcting and addressing bias that may arise as a result of their operations.
00:22:07.960 Auditing outcomes of algorithms is essential for ensuring the absence of bias. Employing methods such as sending in two distinct applicants under similar circumstances helps test for visible discrepancies and allows us to identify areas needing improvement.
00:22:20.660 To summarize, scrutinizing the results of algorithms demands continuous diligence, especially in relationships, job hunting, and other applications influenced by biases inherent within datasets. We must eschew unexamined practices which favor the reinforcement of categories over authentic evaluation.
00:22:37.960 Fostering a culture of accountability requires that we remain conscious of the implications of our decisions. We need to ask, 'What impact do our technologies have on those they affect?' There should always be empirical evidence supporting claims that algorithms facilitate equitable outcomes.
00:23:05.970 Cultivating diversity within teams responsible for developing algorithms is paramount. Diverse perspectives allow for more holistic understandings of how technology interacts with society, which ultimately leads to samples being less influenced by fixed assumptions and blind spots.
00:23:15.530 Placing decision-making authority under diverse leadership ensures that biases originating from specific demographics don’t inadvertently shape entire systems. Prioritizing cultural inclusion dismantles groupthink and fosters environments where varied experiences contribute to better decision-making processes.
00:23:40.780 Through regular audits of algorithmic decisions, we can check that unintended biases are not skewing outcomes in anyone's favor. For example, sending in paired applications bearing different demographic details can reveal discrepancies and promote fairness in evaluation.
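A paired audit of this kind can be sketched roughly as follows. Everything here is hypothetical: the `Applicant` fields, the `approve` function standing in for the system under audit, and its deliberately biased scoring rule exist only so the audit has something to find. In practice the decision function would be the black box you are testing.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Applicant:
    income: int
    years_employed: int
    postal_code: str  # hypothetical stand-in for a demographic detail


def approve(applicant):
    """Stand-in for the system under audit (hypothetical, deliberately biased)."""
    score = applicant.income / 1000 + 5 * applicant.years_employed
    if applicant.postal_code == "B":
        score -= 20  # the hidden penalty the audit should surface
    return score >= 60


def paired_audit(applicants, field_name, value_a, value_b, decide=approve):
    """Return applicants whose decision changes when only one field changes."""
    flagged = []
    for person in applicants:
        as_a = replace(person, **{field_name: value_a})
        as_b = replace(person, **{field_name: value_b})
        if decide(as_a) != decide(as_b):
            flagged.append(person)
    return flagged


pool = [
    Applicant(income=50_000, years_employed=3, postal_code="A"),
    Applicant(income=90_000, years_employed=4, postal_code="A"),
    Applicant(income=20_000, years_employed=1, postal_code="A"),
]
discrepancies = paired_audit(pool, "postal_code", "A", "B")
print(len(discrepancies))  # prints 1
```

Only the first applicant is flagged, because that is the one close enough to the threshold for the hidden penalty to flip the decision. That is also a caution: an audit with too few pairs, or pairs far from the decision boundary, can miss a bias that is really there.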
00:24:10.530 Lastly, moving towards implementing artificial intelligence requires collectively addressing the shadows of systemic biases already present in our data sources. This endeavor can only succeed through ongoing conversations, transparency, and commitments to challenge ingrained practices.
00:24:46.910 Informed consent must be central to our practices; we should establish 'opt-out' methods guarding the rights of individuals before any data is generated. It is essential that consent defaults are predicated on privacy considerations rather than curated experiences aimed at coercive data collection.
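One way to read "consent defaults predicated on privacy" is that nothing is recorded or shared until a person has said yes, and that they can withdraw at any point. The sketch below assumes that reading. The feature names, the `ConsentRecord` class, and the `record_event` helper are all hypothetical and are not taken from any real product.

```python
from dataclasses import dataclass, field

# Hypothetical feature names, for illustration only.
KNOWN_FEATURES = {"activity_sharing", "memory_reminders", "ad_personalization"}


@dataclass
class ConsentRecord:
    # Starts empty: the default for every feature is "no consent given".
    granted: set = field(default_factory=set)

    def opt_in(self, feature):
        if feature not in KNOWN_FEATURES:
            raise ValueError(f"unknown feature: {feature}")
        self.granted.add(feature)

    def opt_out(self, feature):
        self.granted.discard(feature)  # withdrawing is always allowed

    def allows(self, feature):
        return feature in self.granted


def record_event(consent, feature, event, store):
    """Store the event only with consent; report whether it was stored."""
    if not consent.allows(feature):
        return False
    store.append((feature, event))
    return True


consent = ConsentRecord()
store = []
before = record_event(consent, "activity_sharing", "walked 5 km", store)
consent.opt_in("activity_sharing")
during = record_event(consent, "activity_sharing", "walked 5 km", store)
consent.opt_out("activity_sharing")
after = record_event(consent, "activity_sharing", "walked 5 km", store)
print(before, during, after, len(store))  # prints False True False 1
```

The design choice worth noticing is that the check happens before the data is written, not when it is displayed. Data that was never generated cannot later be exposed by a changed default.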
00:25:06.430 Organizations like Facebook are making strides in asking users for input on features like 'On This Day' and 'Year in Review,' but we must continue pursuing features that prioritize personal agency while reducing the burden of emotional labor, facilitating genuine engagement.
00:25:32.730 It's necessary to implement systems that cultivate meaningful engagement with consent rather than presenting automated prompts demanding users keep track of what they wish to avoid. For instance, social networks could implement mechanisms that allow users to recommend accounts to follow rather than relying solely on algorithmic suggestions.
00:26:10.600 Transparency regarding how algorithms operate informs users, enhancing their trust and public accountability. Many companies maintain proprietary secrets, believing it constitutes a competitive advantage, yet it can foster mistrust and a lack of accountability.
00:27:03.870 Pushing for algorithmic transparency ensures that products align better with user needs. Demonstrating how algorithms can yield valid and trustworthy insights fosters confidence in the technology, and ultimately enhances its efficacy and utility.
00:27:23.200 Finally, professionals in technology must approach their work with a commitment to ethical practices, understanding that the algorithms they create can have profound effects on people's lives. Being aware of the consequences and potential pitfalls, and bringing empathy to our coding practices, helps us to serve the communities impacted.
00:28:11.140 We should strive to embrace a culture of responsibility, ensuring what we build aligns with broader societal expectations of equitable treatment. The algorithms should support, rather than undermine, social equity throughout all applications that involve human interactions and experiences.
00:28:59.180 In conclusion, we find ourselves at a crossroads where we can either learn from previous mistakes or persist with a limited lens of perception. As coders and developers, we carry the responsibility to contribute positively to the technology shaping our world. We must refuse to perpetuate systems that impose unauthorized consequences on people's lives.
00:29:18.060 Thank you for taking the time to engage with these concepts. We need to be empathetic coders who understand the implications of our work. Let’s strive to be advocates for responsible technology, mindful of how our creations can impact those around us.
00:30:00.000 Thank you for your attention.