Consequences of an Insightful Algorithm

EuRuKo 2016

00:00:03.670 Our next speaker is Carina C. Zona. She is a developer, she's an advocate, and a
00:00:10.820 certified sex educator. She is also the founder of CallbackWomen, evangelist for
00:00:17.180 Ruby Together, and co-organizer of WeSoCrafty. So let's welcome her on stage.
00:00:32.020 Hi. As she said, my name is Carina C. Zona. You can find me pretty much
00:00:37.340 everywhere on the internet at @cczona, including on Twitter. This talk
00:00:43.610 is a toolkit for empathetic coding. We'll be delving into some specific examples of uncritical programming and
00:00:50.360 the painful results that can arise from doing things in ways that are completely benignly intended. And
00:00:57.160 because of that, I want to start here with a content warning, because I'm going to be delving into some examples that deal
00:01:02.450 with some pretty intense topics. They're not the point of it, but they do come up. So some of the things that you'll be
00:01:08.420 hearing about deal with grief, post-traumatic stress disorder, depression, miscarriage, infertility,
00:01:13.990 sexual history, consent, stalking, racial profiling, and the Holocaust. If you would
00:01:21.229 rather not be thinking about those things right now, it is fine to go and get another cup of coffee. I will not be at all unhappy with that. Go and enjoy
00:01:28.789 the hall. It will take about ten minutes or so before I even get into those, so you have some time to think about it.
00:01:37.539 Algorithms impose consequences on people all the time. We're able to extract
00:01:44.060 incredibly precise insights about an individual, but the question is: do we have a right to know what they don't
00:01:50.600 consent to share, even when they're willingly consenting to share data that leads us there? And so that also raises
00:01:57.469 the question: how do we mitigate unintended consequences? When we talk about that word, algorithms,
00:02:05.299 we're usually thinking in terms of patterns of instructions articulated in code or math or formulas,
00:02:12.590 and of course the classic thing we associate with algorithms is Big O, right? Bubble sort and its friends. But
00:02:19.760 algorithms are really far more expansive than this. Generically, an algorithm is just any step-by-step set of operations
00:02:27.500 for predictably arriving at an outcome. That's it. So in real life we also have
00:02:33.110 algorithms all the time: patterns of instructions articulated in other ways, such as a recipe, or
00:02:41.049 directions on a map, or a pattern for crocheting a shawl.
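(To make that concrete: here is a minimal sketch, in Python, of the bubble sort just mentioned. It is purely illustrative, not code from the talk; it just shows that an algorithm is nothing more than a step-by-step set of operations that predictably arrives at an outcome.)

```python
def bubble_sort(items):
    """A step-by-step 'recipe': repeatedly compare neighbors and swap them
    until no swaps are needed, predictably arriving at a sorted list."""
    items = list(items)          # work on a copy
    swapped = True
    while swapped:
        swapped = False
        for i in range(len(items) - 1):
            if items[i] > items[i + 1]:
                items[i], items[i + 1] = items[i + 1], items[i]
                swapped = True
    return items

print(bubble_sort([5, 1, 4, 2, 8]))   # => [1, 2, 4, 5, 8]
```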
00:02:47.709 Deep learning is a field of machine learning, which is algorithms for fast,
00:02:53.989 trainable artificial neural networks, and deep learning is very hot right now for
00:02:59.840 mining data. Essentially, it has been around in academia for a
00:03:06.530 long time, since the 1980s at least, but it's only been some really recent breakthroughs, starting around 2012, 2013,
00:03:13.540 that have really made it possible to start using deep learning in production at scale, rather than in this sort of
00:03:20.090 theoretical place that it's been in academia. Those recent advancements have been making it possible to suddenly
00:03:26.629 extract considerably more sophisticated insights than anything we've been capable of doing before, and that's out
00:03:33.620 of the vastness of even big data, in production, at scale, in virtually real time.
00:03:39.549 So here's the basic process. Inputs are just any collection of data, and they can
00:03:44.989 be all sorts of things: words, images, sounds, objects, even abstract concepts.
00:03:50.079 And this training data doesn't even have to be labeled as such. It doesn't
00:03:55.340 have to be picture of bird, picture of dog, picture of window. You just throw pictures at it; they don't need to be labeled or
00:04:01.819 categorized in any way. Execution, then, is just running a series of functions of functions of functions
00:04:07.819 repeatedly in a black box. And then finally, outputs are predictions of
00:04:13.250 properties that are useful for drawing intuitions about similar future inputs. But remember: similar. The training data
00:04:20.539 set must look a lot like the data set that it will be doing analysis on, and we'll get back to that later.
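(As a rough sketch of that input, black box, output shape, here is a tiny neural network in Python with numpy. It is a toy, supervised example rather than the unlabeled case she describes, and none of it comes from the talk: random points go in, a function of a function is run repeatedly while weights are adjusted, and what comes out are predictions about similar future inputs.)

```python
import numpy as np

rng = np.random.default_rng(0)

# Inputs: any collection of data; here, 200 random 2-D points.
X = rng.normal(size=(200, 2))
# The pattern to be discovered: is the point above the line y = x?
y = (X[:, 1] > X[:, 0]).astype(float).reshape(-1, 1)

# The "black box": two layers, i.e. a function of a function.
W1, b1 = rng.normal(size=(2, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 1)), np.zeros(1)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

lr = 0.5
for _ in range(2000):                       # run the functions repeatedly,
    h = np.tanh(X @ W1 + b1)                # adjusting the weights a little
    p = sigmoid(h @ W2 + b2)                # each time
    grad_out = p - y                        # gradient of cross-entropy loss
    grad_W2 = h.T @ grad_out / len(X)
    grad_h = grad_out @ W2.T * (1 - h ** 2)
    grad_W1 = X.T @ grad_h / len(X)
    W2 -= lr * grad_W2; b2 -= lr * grad_out.mean(0)
    W1 -= lr * grad_W1; b1 -= lr * grad_h.mean(0)

# Outputs: predictions about similar future inputs.
test = np.array([[0.1, 0.9], [0.9, 0.1]])
print(sigmoid(np.tanh(test @ W1 + b1) @ W2 + b2).round(2))  # typically close to [[1.], [0.]]
```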
00:04:27.310 So deep learning relies on artificial neural networks: just automated discovery
00:04:33.229 of patterns within a training data set, and it applies those discoveries to draw intuitions about future data. For our
00:04:40.039 industry that means it's presenting us with a breakthrough for dealing with big data, and it's really exciting. There are already
00:04:47.119 so many companies that are adopting this and being able to do fantastically interesting things. But notice what this
00:04:52.819 means: deep learning is premised on a black box. The neural network has drilled
00:04:58.909 down to tens of thousands or even hundreds of thousands of incredibly subtle factors that it believes have
00:05:05.869 predictive value, and we don't know what they are. So this stuff is driving major advances
00:05:12.949 right now in all sorts of areas, including medicine, pharmaceuticals, emotion detection, all sorts of NLP stuff,
00:05:20.029 face identification, voice-related things, fraud detection
00:05:25.039 in transactions, sentiment analysis, even self-driving cars, including Tesla's, which are
00:05:31.009 right now using deep learning to drive. Today we're going to be looking at some
00:05:37.219 concrete examples that include ad targeting, behavioral prediction, recommendation systems, image
00:05:43.369 classification, and face recognition. But first I really want to show you a very whimsical one.
00:05:49.029 This is MarI/O. It's an ANN, an artificial neural network, that teaches itself how to play Super Mario World. It
00:05:55.389 starts with absolutely no clue whatsoever. It doesn't know about its world, it doesn't know about movement, it
00:06:00.469 doesn't know about rules and scores, it doesn't even have an understanding of gaming itself. It's just manipulating
00:06:07.039 numbers and noticing that sometimes things happen. It's looking at the outputs and it's curious. Sometimes it
00:06:15.679 notices that cumulative changes produce interesting outcomes, and so it just keeps playing like this. It's
00:06:22.669 learning movement and gameplay purely on its own, via a self-training session in which it just engages in 24 hours of
00:06:29.929 progressively more fine-grained experimentation that ultimately leads it to being able to identify patterns and
00:06:36.619 use those patterns to predict insights to play the game.
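(MarI/O itself uses neuroevolution of neural network topologies, which is far more elaborate than anything shown here. This is a deliberately crude sketch of the same idea, with a made-up scoring function standing in for the game: start with no knowledge, try progressively finer random tweaks, and keep whatever scores better.)

```python
import random

random.seed(42)

def play(policy):
    """A stand-in 'game'. The learner never sees this formula; it only
    observes the score that comes back after a run."""
    return -sum((p - t) ** 2 for p, t in zip(policy, [0.8, -0.3, 0.5]))

# Start with absolutely no clue: a random policy of three numbers.
best = [random.uniform(-1, 1) for _ in range(3)]
best_score = play(best)

for step in range(5000):
    scale = 1.0 / (1 + step / 500)                     # progressively finer experimentation
    candidate = [p + random.gauss(0, scale) for p in best]
    score = play(candidate)
    if score > best_score:                             # keep changes that produce better outcomes
        best, best_score = candidate, score

print([round(p, 2) for p in best])                     # drifts toward [0.8, -0.3, 0.5]
```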
00:06:41.979 Speaking of games, let's play one right now. It's called Data Mining Fail. It looks something like
00:06:47.660 bingo, and it looks like this.
00:06:53.830 Insightful algorithms are full of pitfalls. These are just a few. By looking
00:06:59.630 at case studies, though, we can explore some of this board. So are you ready? Are
00:07:05.080 you ready? All right, I like to hear actual answers. Here are some user stories that we're
00:07:12.110 going to be playing with. Target. Target is based in the US. It is a
00:07:17.600 large department store chain where you can get stuff for every room in your house, plus your yard and your school and anything else
00:07:25.280 you can possibly imagine, including groceries. It's a major retailer.
00:07:31.330 Throughout the industry of retail, the second trimester of pregnancy is
00:07:36.919 considered the holy grail, because it's a moment in time where a person is ready
00:07:42.770 to change every buying pattern they've made in their life. Every loyalty, every
00:07:49.130 place that they go, what they buy, their brands: it's all up for grabs all over again. And because a person's starting a
00:07:55.250 family, you're talking about being able to co-opt not just that person but everyone else in that family too. So it's
00:08:01.729 a powerful moment. If you can harness that buying power, then for a long time
00:08:07.130 you're going to have that customer. But how do you detect that a customer is in
00:08:12.320 their second trimester? One day the marketers at Target asked one of their programmers that question:
00:08:18.410 how would you figure that out? It's an interesting question. I mean, take a moment to kind of think about that: if you
00:08:24.350 couldn't ask them, what would you use? Well, it turns out that buying a lot of
00:08:32.630 moisturizer is one of the key signs. I
00:08:40.000 think there are some problems here. The question they asked is: if we wanted to figure out if a customer is pregnant,
00:08:46.400 even if she didn't want us to know, can you do that? But notice: even if she
00:08:51.560 didn't want us to know. It's an interesting challenge, but is it an ethical one?
00:08:58.990 Oh gosh. Several years ago, a man
00:09:04.550 came into one of the stores, and he was really mad. He was yelling. He said, how dare you send flyers like this to my
00:09:10.430 teenage daughter, full of stuff about pregnancy? Are you trying to tell her it's okay to have sex? Are you trying to
00:09:15.770 get her pregnant? What is this? And the manager, who's not in charge of these decisions, said, I'm very sorry, you're
00:09:22.970 absolutely right. The man went away. He came back the next day and he apologized
00:09:28.550 to the manager. He said, I had a talk with my daughter today. She is pregnant.
00:09:36.490 So the algorithm was right, or at least it was correct. But
00:09:42.100 it took something away from her. They took away her moment, her decision as to
00:09:48.140 when to have that communication. That's something you don't want to start
00:09:53.450 with someone who's angry. So Target got feedback like this, and
00:10:02.300 they decided to change things a bit. So instead of having a flyer sent out to someone that's full of things that are
00:10:08.720 clearly targeted at someone who's pregnant, they changed it to a flyer full of various completely random things that
00:10:16.100 have absolutely nothing to do with pregnancy, and then the coupons that do. So most of it is cover for what they
00:10:22.459 actually are targeting, what they actually want you to do. And their logic for this is: as long as a
00:10:29.630 pregnant woman thinks that she hasn't been spied on, as long as we don't spook her, it works.
00:10:36.250 Is this okay? It works.
00:10:45.240 Shutterfly also did something similar. They were looking to target people who
00:10:50.260 had recently had a baby and convince them to please send out lots of cards thanking people for their gifts and for
00:10:56.380 their parties and all their happy thoughts. So they sent out something like this to folks. It says: as a new parent,
00:11:02.710 hey, you have this obligation to send out cards, which we sell; time to send those thank-you notes about the birth
00:11:09.340 of your child. They sent them to a whole lot of people, and some of those people had feedback
00:11:16.840 for them too. Thanks, Shutterfly, for the congratulations on my new bundle of joy.
00:11:21.850 I'm horribly infertile, but hey, I'm adopting a kitten. I
00:11:28.950 lost a baby in November who would have been due this week. It was like hitting a
00:11:34.150 wall all over again. Shutterfly responded that the intent of
00:11:40.720 the email was to target customers who have recently had a baby. Well, yes, we got that.
00:11:48.330 The point is that there's an error rate, and they didn't consider the effect of an error.
00:11:55.980 Mark Zuckerberg became a father last year. He and his wife had announced the upcoming birth on Facebook, of course, and
00:12:02.980 with great excitement. He also revealed then that they'd had a series of miscarriages before that, and he wrote:
00:12:11.310 you feel so hopeful when you learn you're going to have a child. You start imagining who they'll become, dreaming of
00:12:18.250 hopes for their future. You start making plans, and then they're gone. It's a
00:12:24.100 lonely experience. Facebook has a feature called Year in
00:12:31.150 Review. It's been implemented in various ways over the years. A couple of years ago they decided to do it
00:12:37.780 algorithmically. What they failed to take into account is that our lives are constantly changing. Something
00:12:44.500 that was really exciting to talk about six months ago may be something that it's painful to talk about now.
00:12:50.310 Not every memory stays the joyous one that it once was.
00:12:55.680 Accidental algorithmic cruelty is the result of code that works in the overwhelming majority of cases but doesn't take into
00:13:03.220 account other use cases. Eric Meyer coined this term, and the reason he gets
00:13:09.279 to name it is because he's one of the people it happened to. This is a picture of my daughter, who is
00:13:16.360 dead. She died this year. The Year in Review ad keeps coming up in
00:13:22.180 my feed, rotating through different fun and fabulous backgrounds, as if celebrating
00:13:27.759 her death. And there's no obvious way to stop it.
00:13:35.939 Eric calls on us to increase awareness of, and consideration for, the failure
00:13:41.019 modes, the edge cases, the worst-case scenarios. And that's what I'm hoping to
00:13:46.240 do here today, and I'm hoping that you'll carry it forward to others. So with that in mind, here is my first recommendation
00:13:52.029 for all of us: be humble. We cannot intuit internal states,
00:13:58.589 emotions, private subjectivity. We're looking at external indicators as if
00:14:04.360 they can tell us what's inside. Eric's blog post was in December of 2014,
00:14:10.329 and it garnered a lot of attention, both within the industry and from broader media, so there should have been a lot of
00:14:16.990 people who knew about this. How do you avoid blindsiding someone with unpleasant stuff annually? Facebook
00:14:24.399 has done some introspecting about that, asking themselves this question of: what do we do next time?
00:14:30.269 Three months after Eric's experience, they introduced a new feature. It's
00:14:36.069 similar to Year in Review, but it's at any time of the year. It's called On This Day, and it gives you reminders of
00:14:41.290 various trivial things that may have happened on the same day in a previous year. So hey, five years ago today you became
00:14:47.499 Facebook friends with someone. Two years ago you went hiking. A year ago
00:14:53.439 today you had dim sum, yum. All right, so these are pretty simple cases, and they're really great about, you know,
00:15:01.959 I mean, this is one of the things they learned: to say, we care about you, we get it, here's a memory for you from
00:15:08.559 three years ago and we think you're gonna like it. Oh, look at that, on this day you posted a
00:15:16.520 picture. Thanks, Facebook, for picking today to hit me with this dumb feature and remind me
00:15:23.060 that my dog died three years ago. Sometimes Facebook's On This Day sends
00:15:29.390 me memories from high school, and it's triggering. I did not enjoy high school and I need to forget it.
00:15:36.250 Facebook, you do not get to decide what parts of my life I should keep fresh in my mind and which parts I walk away from.
00:15:43.730 We have to learn from mistakes, not just our own; we have to learn from
00:15:49.790 others'. We need to decide that harmful and harmless are not consequences that
00:15:55.400 somehow balance each other out. Fitbit started off a little differently
00:16:01.910 than the device we know today. Its website also was a little different when
00:16:07.670 it started out: it had a feature to track your sex life. And as you recall, Fitbit
00:16:13.490 is a social service, so you're meant to be competing on things like how many steps you took, how many calories you ate,
00:16:19.970 how many times you went running and for how long, or, in this case, compete on your sex life. The problem is
00:16:27.620 that this defaulted to public.
00:16:34.310 Here is an algorithm that decided, or assumed, that all data is equal. It's a
00:16:40.530 default that doesn't work. We have to look at individual cases and make sure
00:16:45.960 an algorithm isn't just sweeping up everything and pushing it out again.
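(One way to avoid that kind of default is to treat visibility as something the user grants per category rather than something the algorithm assumes. A minimal sketch, with hypothetical names and categories, not Fitbit's actual code:)

```python
from dataclasses import dataclass

SENSITIVE_CATEGORIES = {"sexual_activity", "health", "location_history"}

@dataclass
class Entry:
    category: str
    value: str
    visibility: str = "private"        # not all data is equal: default to private

def publish_to_feed(entry, opted_in_categories):
    """Only surface an entry publicly if the user marked it public, and never
    silently for sensitive categories they haven't explicitly opted into."""
    if entry.visibility != "public":
        return False
    if entry.category in SENSITIVE_CATEGORIES and entry.category not in opted_in_categories:
        return False
    return True

steps = Entry("steps", "10,423", visibility="public")
intimate = Entry("sexual_activity", "logged")                # stays private by default
print(publish_to_feed(steps, opted_in_categories=set()))     # True
print(publish_to_feed(intimate, opted_in_categories=set()))  # False
```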
00:16:56.150 Most of us have internal devops tools, you know, maybe for monitoring, performance tuning, business metrics,
00:17:03.350 whatever. Uber's is called God View.
00:17:08.870 It tracks the cars around, so you have a good sense of where things are in the city. But Uber didn't limit access to
00:17:16.620 just administrators, or restrict it to just operational use. Employees could freely identify any passenger by name and
00:17:24.240 monitor that person's movement in the car in real time. Drivers used to have
00:17:31.020 access to God View too. I don't know why you need to know where every other car on the road is and who's in it.
00:17:37.730 Even a job applicant was welcome to access these private records.
00:17:43.940 Meanwhile, managers felt free to abuse God View for non-operational purposes, such as stalking celebrities' rides in
00:17:51.300 real time and showing it as office party entertainment: look at this cool thing we can do.
00:17:56.990 It's an abuse of an algorithm. The algorithm is fine, it has uses, but this is
00:18:03.510 not what it should ever be used for. The research group at dating site
00:18:09.120 OkCupid also used to blog about things that they were learning from their aggregate trend data. They really love
00:18:14.490 looking at the data, and their blog focused on sharing various different insights about simple ways that an
00:18:20.400 OkCupid user could use the dating site to date better. Uber also had a blog about its data, but
00:18:28.950 it was a little different. The crucial difference is how they approached it: it wasn't about improving customers'
00:18:35.280 experience of their service. If you look closely, it says Uber can and does
00:18:42.150 track your one-night stands. It's tracking your sex life as well. This is
00:18:48.160 purely invading people's privacy, not for any operational reason, but purely for the sake of judging and shaming and
00:18:55.480 laughing. This is not a predictable consequence of signing up for a
00:19:00.520 rideshare service. Google AdWords. There was an interesting
00:19:07.690 study a few years ago at Harvard, in which the researcher herself had gone
00:19:14.680 searching for her name. She is black, her name is Latanya, and she was surprised to
00:19:20.860 see that the Google ads alongside the search results suggested that she has a
00:19:26.800 criminal record. And so she did further study on this, using two sets of
00:19:32.440 names, one set highly associated with white people, one highly associated with black
00:19:37.840 people, and separated by gender, just searching for those combined with the
00:19:43.360 last names of real academics. So what did they find? What they found was that a
00:19:49.450 black-identifying name was twenty-five percent more likely to result in an ad that implied that person had an arrest
00:19:55.750 record. And before you think to yourself, well, maybe twenty-five percent of those people did: AdWords doesn't know anything
00:20:02.470 about real life. AdWords works simply on clicks. It watches clicks and it tries to
00:20:09.280 find the ad that people will most want to click on next. So what this is
00:20:14.800 reflecting is our bias, our preconceptions, and it's repeating them,
00:20:20.620 magnifying them, honing them. The real world isn't relevant to this,
00:20:27.550 but it has real-world effects. What we see here is our collective bias
00:20:33.880 being both reflected to us and reinforced to the next person. Data is generated by people. It isn't
00:20:41.830 objective. It's constrained by our tunnel vision. It replicates our flaws. It echoes
00:20:49.720 our preconceptions.
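(AdWords' real ranking is far more complex and not public, but a toy simulation shows the feedback loop she's describing: if people click one ad even modestly more often, a click-driven selector keeps showing it more, amplifying the initial bias. Everything here, the ad names and click rates, is invented for illustration.)

```python
import random

random.seed(1)

weights = {"neutral_ad": 10.0, "arrest_record_ad": 10.0}      # start with no preference
click_rate = {"neutral_ad": 0.10, "arrest_record_ad": 0.15}   # the loaded ad gets clicked a bit more

for _ in range(20000):
    # Show ads in proportion to how often they've been clicked before...
    shown = random.choices(list(weights), weights=list(weights.values()))[0]
    # ...and reward whatever gets clicked, closing the feedback loop.
    if random.random() < click_rate[shown]:
        weights[shown] += 1.0

share = weights["arrest_record_ad"] / sum(weights.values())
print(f"'arrest record' ad now gets {share:.0%} of the exposure")  # typically well above 50%
```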
00:20:54.780 Twitter. I can already hear the, like, yeah...
00:21:02.190 Joanne McNeil coined the term accidental algorithmic run-ins, and as you can
00:21:08.770 tell, it's sort of a take-off on Eric's term. And while she didn't give a formal
00:21:14.410 definition, it can be roughly summarized as this: classifying people as similar, where careless prompts create scenarios
00:21:22.060 that are harder for people to control and prepare for.
00:21:27.120 Essentially, you're trapped by a recommendation system that's determined to show you someone similar whom you'd
00:21:34.330 actually probably want to avoid. It's a false positive that cannot easily be detected algorithmically, and
00:21:41.190 sometimes a similarity factor can be pretty trivial, like this example from Last.fm.
00:21:47.400 Just, you know, two people share a mixtape, but it was made by a
00:21:53.260 person who one of them broke up with and one of them is with currently, and so the
00:21:59.650 relationship between these people, who don't know each other at all, has some emotional baggage to it.
00:22:07.230 Sometimes the factor connecting you to the person is intensely upsetting. If
00:22:12.610 you've been stalked by a former co-worker, Twitter may reinforce this connection algorithmically, boxing you
00:22:17.890 into a past while you're trying to move on. Your affinity score with your harasser
00:22:23.830 will keep getting higher with every person who follows this person at
00:22:29.530 Twitter's recommendation. So notice, this is just like AdWords: the algorithm doubles
00:22:35.890 down on its false certainty with every action those third persons take. You're
00:22:41.500 not the one who has any control over this, and people who don't know they have control over it are the ones who are in
00:22:47.740 fact influencing it. Similarity algorithms can become, in effect, a proxy harasser, and many of
00:22:55.450 those systems give no off switch for the user.
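(No recommender exposes its internals, but the off switch she's asking for can be as simple as a user-controlled exclusion list that the scoring loop actually respects. A hypothetical sketch, not Twitter's system:)

```python
from collections import defaultdict

affinity = defaultdict(float)                  # (viewer, candidate) -> score
never_suggest = {("me", "ex_coworker")}        # the user-controlled off switch

def record_signal(viewer, candidate, weight=1.0):
    """Third parties' actions must not keep raising the score for a pairing
    that the affected user has explicitly opted out of."""
    if (viewer, candidate) in never_suggest:
        return
    affinity[(viewer, candidate)] += weight

def recommendations(viewer, k=3):
    scored = [(c, s) for (v, c), s in affinity.items()
              if v == viewer and (v, c) not in never_suggest]
    return [c for c, _ in sorted(scored, key=lambda pair: -pair[1])[:k]]

record_signal("me", "friend_of_friend")
record_signal("me", "ex_coworker")             # silently ignored
print(recommendations("me"))                   # ['friend_of_friend']
```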
00:23:02.590 Now we're going to look at some examples of face recognition: Flickr, Google Photos. So you recall when this first started
00:23:09.470 coming out, we had it on phones, and they'd give a cool square box like this: that's a face.
00:23:17.679 It's become a lot more commonplace now, and we've seen plenty of really humorous mistakes like these along the way.
00:23:24.100 This was iPhoto just a few years ago. It's a harmless mistake. It's a false
00:23:29.299 positive, but it's one that's easy to chuckle at. Here's another. You may remember this site from about a
00:23:36.440 year ago. It's Microsoft's How-Old.net, and it uses deep learning to take face recognition up to the next
00:23:43.429 level. Essentially it's drawing those intuitions about age and gender and it's
00:23:48.620 assigning tags, and inevitably it's going to make a few mistakes along the way. But on this site
00:23:55.610 that also doesn't seem like a big deal; they look pretty harmless. But there are false positives that are
00:24:02.659 not funny, such as this next one, which is also from last year. Flickr classified this as
00:24:08.350 children's playground equipment.
00:24:14.919 Dachau concentration camp, with its motto on the gate: work will set
00:24:21.320 you free. Notice something else here. The gray tags on the side are the photographer's, the
00:24:28.070 ones he manually input. The white tags are Flickr's algorithmically added ones.
00:24:33.789 This is a consequence of algorithmic hubris. It's treating human understanding
00:24:40.220 as irrelevant to machine intuition. It's treating data as inherently neutral, as
00:24:47.149 if something already known isn't worthwhile if it's something provided by a human.
00:24:59.729 Flickr tagged this man as an animal. Originally it also tagged him as an ape.
00:25:05.879 This is a comparison that has a particularly ugly history. So let's be clear: this isn't about
00:25:12.249 picking on any particular company or coder or methodology, or even this particular technology. It's about a
00:25:19.479 broader set of problems that befall us all. Here's Google Photos just a month after
00:25:26.499 that last one. Google Photos, y'all messed up. My friend's
00:25:32.019 not a gorilla. So how does this happen? For one answer,
00:25:37.599 you can go back to the 1950s in the US. In the 1950s, Kodak film was living in a
00:25:46.629 very segregated America, and they were only interested in white consumers. So they were making fine film emulsions
00:25:53.909 that were focused on getting the most accurate detail possible
00:26:00.009 from white skin. Black skin wasn't of interest to them. So for decades their algorithm for
00:26:07.989 developing film was based on finding the subtlest possible details in white skin.
00:26:14.019 And so they passed out these cards, called Shirley cards, and they were used every day by the photo lab
00:26:20.459 technicians to ensure that the film development process was always
00:26:25.839 calibrated to find exactly the same colors, exactly the same detail. You never wanted to have variation. But notice here
00:26:32.889 that black skin isn't even being tested. It's just not even relevant.
00:26:38.459 The tools used to make film, the very science of it, are not racially neutral,
00:26:44.919 and that means all of our film development and processing has been
00:26:50.200 responding to this legacy that is more than half a century old. It
00:26:55.839 contaminates our data, new as well as old, and that's a hard problem to overcome.
00:27:01.889 Essentially you have lossy data for people just because they have dark skin.
00:27:07.529 It's tempting to just avoid thinking about it altogether, and so it continues.
00:27:16.409 Affirm is a consumer lender specializing in a handful of products and firms,
00:27:21.789 rather than something where you would take a credit card and buy whatever you want anyplace. And it's also, of course,
00:27:27.279 highly tech-based. The CEO is also one of the founders of PayPal.
00:27:33.149 Affirm makes an assessment of your creditworthiness based on just a few factors: your name, email, mobile number,
00:27:39.880 birth date, and your government identification number.
00:27:47.250 It also may ask you for additional information, if necessary, from other online sources, usually social profiles
00:27:54.010 like GitHub. Okay, well, not everyone who buys stuff is
00:27:59.440 on GitHub. It also may ask borrowers to share
00:28:04.720 information as well. It looks at other behavioral factors, like how long the person took to enter all the
00:28:11.139 information that they gave. Not everyone processes information at the same rate. Does this mean that Stephen Hawking is a
00:28:18.100 bad credit risk? Algorithms like these reinforce privilege. So remember, an algorithm is
00:28:25.059 just that: a procedure
00:28:32.340 for reliably arriving at some outcome. Whereas it's up to us to take into
00:28:37.480 account what impact those outcomes lead to. And the outcome here is to reliably
00:28:43.570 identify privileged people, mostly privileged programmers, and reliably
00:28:48.820 exclude most people who don't share our abundance of privilege. Deep learning looks at such a deluge of
00:28:56.380 data points, and it learns how to assign labels to them. But we have to hang on, because as humans we know that two
00:29:02.950 identical data points can mean very different things. Understanding context is essential for
00:29:10.330 drawing accurate conclusions about what individual data points mean. Without
00:29:15.730 that, bias always runs rampant. If a machine is expected to be
00:29:22.269 infallible, it cannot also be intelligent. That's Alan Turing. The immense power of
00:29:27.530 machine intuition is irreplaceable, without a doubt, but it's not a replacement for human comprehension.
00:29:35.170 Affirm analyzes applicants' social media accounts. So do some other companies. One
00:29:40.790 of them was Germany's largest credit rating agency. It considered evaluating
00:29:45.890 applicants' Facebook relationships, which is odd, because a personal friend
00:29:52.610 is not necessarily the same as a Facebook friend. So now you're being
00:29:58.040 judged based on relationships that may not even exist. But what about when that Facebook friend
00:30:05.720 is in fact a genuine personal friend? Facebook recently defended a patent that pushes even further than Affirm does.
00:30:12.290 This patent is for making credit decisions about a person based on the unrelated credit history of their
00:30:18.410 Facebook friends. So here's an algorithm with the potential to deeply intrude on, and even alter, a person's real-life
00:30:25.790 relationships, simply to avoid being financially shamed and financially punished by an algorithm.
00:30:35.590 It's important to maintain the discipline of not trying to explain too much, says Max Levchin, Affirm's chief
00:30:42.470 executive. Adding human assumptions, he says, could introduce bias into the data analysis.
00:30:51.580 What? Dude. Data is not objective. Data always
00:30:58.880 has bias. It's inherent, at minimum, from how it was collected and interpreted, and
00:31:04.550 then every flaw and assumption in that first training data set, and the original
00:31:09.650 functions, and whoever made assumptions about them. All these things along the way are
00:31:15.230 having unrecognized influence on the algorithms and the outcomes they generate. Affirm says that its algorithm
00:31:21.830 assesses 70,000 personal qualities, whatever that means. But 70,000 factors:
00:31:27.740 how many of those have potential for some sort of discriminatory outcome? How would anyone know? It's not like someone
00:31:34.940 can tell you what criteria led to a decision. Only the black box knows. Rationales from the algorithm can only
00:31:41.720 be seen from inside the black box. So I took a photo for you of the inside
00:31:46.909 of a really, really black box. Making lending decisions inside a black
00:31:53.750 box is not a radical new business model. It's a regression. What it is disrupting is
00:32:00.440 fairness, accountability, and oversight. Algorithms always have some kind of
00:32:07.700 underlying assumption: about meaning, about accuracy, about the world in which the data was generated, about how code
00:32:14.870 should assign meaning to it. Underlying assumptions influence the outcomes and
00:32:20.090 consequences being generated, every time. Our industry is in an arms race right now, a
00:32:26.990 deep learning arms race, and it will continue to accelerate and move on. Major
00:32:32.330 players are already rolling out projects that have made big bets on this technology and its opaque intuitions.
00:32:39.110 And for the moment, yes, quality varies. But remember, this moves fast. It's all
00:32:46.580 about iteratively drawing predictive intuitions at extremely fine-grained
00:32:53.390 levels, which means every time they're growing both more precise in
00:32:59.600 correctness and more damaging in wrongness. And that's the dilemma for us.
00:33:07.390 The thing is, we do care. We care about getting this stuff right.
00:33:13.700 We want to be empathetic coders. So how do we flip that paradigm? How do we go
00:33:20.929 about solving some of this? Here are some starting points. Consider decisions' potential impacts on
00:33:28.580 others. How might a false positive affect someone, such as those Shutterfly customers or those Twitter users? How
00:33:35.330 might a false negative affect someone, for instance being denied a loan? How many
00:33:40.490 other ways can an algorithm's intuition be superficially correct and yet deeply wrong about human context, like at Dachau,
00:33:48.320 or the reminder of Eric Meyer's daughter? We need to project the likelihood of
00:33:55.970 consequences to others. We need to minimize negative consequences to others. And I'm just going to keep on saying "to
00:34:02.690 others," because it's really easy to look at these things and say it's really cool,
00:34:08.419 we're making something really great, it's moving the world forward. We have to keep
00:34:13.460 on thinking not just about what's really useful and neat and not likely to harm us, and constantly be thinking about
00:34:19.820 users and those beyond them. We can adopt the same motto that the medical field
00:34:25.760 has: first, do no harm. And we need to be really honest and
00:34:31.490 trustworthy, not just because, you know, it's the right thing to do, but also because it's a business necessity. We're
00:34:38.660 going to make big mistakes on this stuff, just like these other folks did, and we need to be able to say, when that time
00:34:44.120 comes, we're sorry, this was an honest mistake, we're fixing it immediately, we will make sure it never happens again. If we
00:34:50.810 haven't earned that trust beforehand, we could really lose big. It's important to
00:34:56.419 have that foundation of trust already there, and that's why it's also important to always
00:35:02.390 build in recourse for someone to easily correct a conclusion that was wrong.
00:35:09.220 We need to provide others with full disclosure of limitations, and we need to
00:35:15.740 call attention to signs of risk of harm to others. Because there are always limitations, right? An algorithm is
00:35:22.820 targeting a particular problem, a particular context for that problem, applying assumptions about it, and it
00:35:29.750 always includes an acceptance of a certain amount of errors or flaws or imprecision. We're never waiting until it's a
00:35:36.980 thousand percent correct every time in all circumstances. It just has to apply to solving a particular problem at
00:35:43.880 a particular moment in time. This is a chicken breast. The picture is
00:35:49.790 excerpted from a video that Emily Gorski took of an activity tracker while it was
00:35:55.220 detecting a steady pulse of 120 beats per minute from the chicken,
00:36:00.760 which I think is pretty good. The algorithms driving consumer
00:36:06.150 activity trackers are not precise or consistent. They have value, but, you know,
00:36:11.579 users do need to hear from us that they're not intended to be used for some
00:36:17.579 sort of exact purpose. And we also need to somehow make sure
00:36:23.160 that that's known to the wider public, because recently an activity tracker was used to convict a person. The data from
00:36:29.849 it said that she wasn't at the place that she said she was, and that her heart rate wasn't consistent with
00:36:35.789 her account; that's what they believed the tracker was showing. And so the kind of accuracy
00:36:41.009 that we're seeing on Emily's chicken breast can be used to devastating effect, unless it's very clearly
00:36:47.849 communicated what the limitations of this technology are. It's really more for entertainment value, or for having some
00:36:54.900 sort of baseline to compare your own day-to-day, not to say objectively, this is reality.
00:37:00.890 We need to be visionaries about creating more ways to counteract biased data, biased
00:37:06.390 analyses, biased impacts. And finally, we need to anticipate diverse ways to screw
00:37:13.289 up. Because as long as the teams charged with defining data collection, use, and analysis are less
00:37:20.579 diverse than the intended user base, we will keep failing them. We must have decision-making authority
00:37:27.989 in the hands of highly diverse teams. Culture fit is the antithesis of
00:37:34.880 diversity. Superficial variations are allowed to exist, that's all, while unique
00:37:41.160 perspectives are suppressed, because the point of culture fit, inherently, is to
00:37:46.650 avoid disruption of groupthink: we're all going to think basically the same thing and agree it's really good,
00:37:53.089 you know? One-dimensional variety is also not diversity. Diversity is wildly varied on
00:38:00.930 as many dimensions as possible. It has different origins, different ages,
00:38:06.059 different assumptions, different experiences, where there's no clear majority to be seen at all.
00:38:15.600 Audit outcomes constantly. In housing and other forms of discrimination,
00:38:21.350 auditing is used to find, essentially, what comes out of the black box. So while
00:38:26.940 you can't stand over a landlord's shoulder and ask them, are you biased, are you biased, what you can do is send in two
00:38:33.120 different people with the exact same application and see whether they both get selected. This has also been done in all
00:38:40.440 sorts of research studies, things like resumes as well, job hunting, and every
00:38:47.010 time it turns out that yes, bias can be detected reliably this way. So this is
00:38:52.320 an opportunity. Make sure you're looking at the outputs and make sure they line up with what your expectations are; make
00:38:59.100 sure you question the ones that line up with highly unlikely expectations: does it still make sense?
00:39:06.410 Because, again, all we've got is that really, really black box.
00:39:12.590 So here you have an example: something where you might have a fellowship, a job, a mortgage, and you only
00:39:19.890 have to change just one attribute, whatever it is, and just send those applications into the black box and
00:39:25.650 see what its outcomes are. Are they fair? Are they equal? Are they predictable?
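(In practical terms, a paired audit can be a small harness around the model, treated purely as a black box: feed it applications that differ in exactly one attribute and diff the outcomes. A sketch follows; the decide function is a stand-in, not any real lender's model.)

```python
import copy

def audit_pairs(decide, base_application, attribute, values):
    """Feed near-identical applications into a black-box `decide` function,
    varying only one attribute, and report any divergence in outcomes."""
    outcomes = {}
    for value in values:
        application = copy.deepcopy(base_application)
        application[attribute] = value
        outcomes[value] = decide(application)
    if len(set(outcomes.values())) > 1:
        print(f"DIVERGENCE on '{attribute}': {outcomes}")
    return outcomes

# A stand-in for the real black box -- imagine this is the lender's model.
def black_box(app):
    return "approved" if app["income"] > 40000 and app["name"] != "Latanya" else "denied"

base = {"name": "placeholder", "income": 55000, "zip": "8010"}
audit_pairs(black_box, base, "name", ["Emily", "Latanya"])
```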
00:39:31.220 You know, really obvious differences to audit for: anything that's protected by
00:39:36.960 law, at a minimum, that of course should be there. But what about subtler cues? Two identical applications for something...
00:39:45.210 sorry, yeah, why is it not working?
00:39:51.530 Okay, sorry. What meaning is that system then implicitly ascribing? So for
00:39:58.020 instance, a difference of a family name, or of personal apparel. Face recognition:
00:40:03.360 if it's in there, is it judging less of a face as somehow less data, or as different
00:40:09.600 data altogether? We could say no harm intended, but that
00:40:15.240 really is not sufficient, because as we've seen, unintended, totally
00:40:20.340 unconscious bias is surrounding us. So the rule has to be that unless you can draw a complete decision tree for every
00:40:27.720 possible combination of inputs and every possible output for them, then you need
00:40:33.390 to audit outcomes. All right, so what does that entail in
00:40:38.850 practical terms? We looked at that Google AdWords study from Harvard; that's one example of outcome auditing. Here's
00:40:46.680 another one. Carnegie Mellon made a tool called AdFisher, and it generates
00:40:52.050 pristine Google user histories that are carefully chosen trails across the web,
00:40:57.630 in such a way that an ad targeting network will infer certain interests or activities. That sounds very dry, but I
00:41:03.990 think we know what this is really doing here. Google responds with job ads: it's given templates, it fills them in. It
00:41:11.040 ended up showing ads for high-paying jobs to six hundred percent more of the men compared
00:41:16.110 to women. So the job opportunities are, again, being changed in the same kind of
00:41:21.210 way that they were for black and white people.
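(The core measurement behind a study like that is simple to reproduce in outline: log what two otherwise-identical fleets of synthetic profiles are shown, then compare the rates. The counts below are illustrative, chosen only to echo the roughly sixfold disparity she cites, not AdFisher's actual data.)

```python
from collections import Counter

# Imagine these were logged by two fleets of fresh synthetic browser profiles,
# identical except for the gender declared in their ad settings (counts are invented).
ads_shown_to_men   = ["exec_coaching"] * 1850 + ["other"] * 8150
ads_shown_to_women = ["exec_coaching"] * 320  + ["other"] * 9680

def rate(ads, target):
    return Counter(ads)[target] / len(ads)

men = rate(ads_shown_to_men, "exec_coaching")
women = rate(ads_shown_to_women, "exec_coaching")
print(f"high-paying-job ad rate: men {men:.1%}, women {women:.1%}, ratio {men / women:.1f}x")
```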
00:41:28.100 Another thing we could do is crowdsource all the things.
00:41:33.230 These case studies are highlighting for us that mining social data is full of pitfalls, many of which cannot be solved
00:41:40.050 by technology alone. So don't go it alone; bring crowds in, because this is
00:41:47.040 important. Artificial neural networks are only crude approximations of the human brain's ability to constantly develop
00:41:54.300 sophisticated algorithms. We're the real geniuses in the equation; what we ask of
00:41:59.460 a machine is consistency. Yammer has been using this for years.
00:42:05.940 This is from a talk given in 2012 by Heather Rivers at Yammer, and they crowdsource internationalization. So you
00:42:13.740 can choose at any time to start translating for the site, and you can choose a variety of languages as
00:42:21.030 well. Then once you do put in a translation, others essentially fact-check your work. So by uploading and
00:42:27.420 downloading, you're getting a more refined set of data without having to do
00:42:32.490 any work at all; the users are doing it for you. Google has similar problems. It's getting
00:42:38.070 pretty good at being able to interpret almost every language, except one brand of English: it can't
00:42:46.000 figure out what the heck Scottish people are saying. And so they've outsourced this problem to
00:42:52.089 Mechanical Turk, which is humans; it's just asking humans, a whole lot, what are
00:42:57.280 these people saying, what are these people saying, what are these people saying. So they're paying to substitute
00:43:03.339 for machines with people. They also have a crowdsourcing app, which is kind of fun to look at; I recommend
00:43:09.670 downloading it, if nothing else for some entertainment value. It gives you the chance
00:43:15.220 to, again, sort of check others' work, and I noticed that in the German-to-
00:43:20.890 English translation, "Deutschland" is translated as either America, Canada, or
00:43:26.530 South Africa, depending on the particular user who translated it. So we also need
00:43:32.980 to make instructions for the crowd pretty clear, because it's really noticeable that some users are mistakenly treating it as something else,
00:43:39.520 like a thesaurus. So instead of giving a direct translation, they'll give you, oh, it means this or this or that, and so
00:43:46.030 that's also essentially data noise that you need somebody else to catch and correct. Some users also treat it as, I'm
00:43:52.330 talking to a programmer, and so they would give responses like "I don't know." So we need to make sure that when we're
00:43:58.900 turning to the crowd, we still have an algorithm for talking to them about what it is we want. And so we always need
00:44:04.330 to iterate: get that feedback, not just the original data, but get more refined feedback on it, over and over
00:44:10.510 again. Because there are attempts to sabotage, and there are just innocent errors; all of them need to be caught.
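(A sketch of that iteration step, with made-up data: majority-vote the crowd's answers while filtering the obvious noise, the non-answers and the thesaurus-style multi-option replies, and withhold anything that hasn't been cross-checked by enough people.)

```python
from collections import Counter

NON_ANSWERS = {"i don't know", "idk", "?", ""}

def aggregate(submissions, min_votes=3):
    """Majority-vote the crowd's translations, discarding obvious noise and
    withholding anything that too few people have cross-checked."""
    cleaned = []
    for s in submissions:
        s = s.strip()
        if s.lower() in NON_ANSWERS or " or " in s.lower():
            continue                       # sabotage, confusion, thesaurus-style replies
        cleaned.append(s)
    if not cleaned:
        return None
    winner, votes = Counter(cleaned).most_common(1)[0]
    return winner if votes >= min_votes else None

crowd = ["log in", "log in", "sign in or log in", "I don't know", "log in"]
print(aggregate(crowd))                    # 'log in'
```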
00:44:16.320 We also need to cultivate informed consent. That's asking for permission with the default being no, where we
00:44:22.690 focus on the many who eagerly share themselves and are enthusiastic about giving consent to be known more and
00:44:28.780 served better. Facebook's been making adjustments along the way. On This Day and Year in Review
00:44:35.710 now ask questions like: do you want to use this feature, do you want to edit its choices, which people don't you want
00:44:42.910 to be reminded of, which dates, which posts. That sounds pretty good; it's effort, it's well intended. It still
00:44:50.020 misses the point, because you have to do things like remember who you don't want to remember; you have to make a list of
00:44:57.089 them. The newer, gentler features are all premised on listing things you don't want to
00:45:02.880 think about. Just think of a list of everything you don't ever want to have to think about:
00:45:08.869 dig up what hurts you, to fend off the algorithm. So what they're doing is still
00:45:15.420 fundamentally opt-out, not opt-in. It is still putting the burden on the user to
00:45:22.950 solve a problem created by the algorithm.
00:45:28.010 Why not just let Twitter users select the people that they recommend others follow? We already have this; it's
00:45:35.010 called FollowFriday, hashtag #FF. This data already exists. Why are you trying to
00:45:40.650 extrapolate it from a much poorer system? It's cheaper, it's faster, and it's more accurate.
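(For instance, a sketch of mining those explicit #FF endorsements instead of inferring affinity behind people's backs. The tweets, handles, and helper names here are hypothetical.)

```python
import re
from collections import Counter

FF = re.compile(r"#(followfriday|ff)\b", re.IGNORECASE)
MENTION = re.compile(r"@(\w+)")

def explicit_recommendations(tweets):
    """Build follow suggestions from what users explicitly endorse in
    #FollowFriday tweets, instead of inferring affinity behind their backs."""
    counts = Counter()
    for author, text in tweets:
        if FF.search(text):
            counts.update(h for h in MENTION.findall(text) if h != author)
    return counts.most_common()

tweets = [
    ("alice", "#FF @grace_h and @ada_l -- brilliant folks, follow them"),
    ("bob",   "Happy #FollowFriday! @ada_l always has great threads"),
    ("carol", "shipping code all day, no hashtags here"),
]
print(explicit_recommendations(tweets))    # [('ada_l', 2), ('grace_h', 1)]
```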
00:45:47.160 Or you could just have a checkbox: that pregnant person isn't the only one who might have a stake in the
00:45:52.410 pregnancy and in buying a whole lot of stuff for it. Friends and family members might also really love to have all those
00:45:57.450 coupons, and those are also people that you can be cultivating right now. Opt-in is not necessarily a bad thing; it can be more profitable.
00:46:04.079 And that's why we also need to commit to data transparency and algorithmic
00:46:11.670 transparency. And I know this is the hardest conversation to have internally, because too many companies think keeping
00:46:18.089 everything proprietary is the secret sauce, the way we win. But I also really remember that it
00:46:24.210 wasn't that long ago that we had to fight for the legitimacy of open source in our professional toolkit at all. We pushed
00:46:31.020 back, and we were right to. We're professionals. We know that transparency
00:46:36.450 is crucial for drawing insights that are genuine and useful. So we have to start the conversations. Please argue for
00:46:44.400 increasing transparency, because it's for the sake of a better product: cleaner features, fewer bugs, stronger tests,
00:46:51.569 happier users, public trust, more money, because we build stuff that matters.
00:46:58.880 Amy Hoy is harsh, but she's right: if your product has to do with something that
00:47:04.930 deeply affects people, either care, or quit, or go live in a cave and don't hurt
00:47:10.960 other people. It's so easy to unthinkingly build an
00:47:16.480 app full of data mining fail. Building differently requires awareness, critical thinking, and deciding as a team
00:47:23.410 to take a stance, to say, hey, listen, here's the deal: we do not build things here without first understanding
00:47:29.760 consequences to others. This is just our process. This is a good process.
00:47:34.950 We're hired for more than just writing code. We're not code monkeys; we're hired as professionals to apply our expertise
00:47:43.150 and judgment about how to solve problems. Our role is fundamentally to be
00:47:48.910 opinionated about how to make code serve a problem well. When we're asked to write code that
00:47:54.970 presumes to intuit people's internal lives and act on those assumptions, as professionals we have to be people's
00:48:01.330 proxies, be their advocates, stand up for them, saying no on their behalf to using
00:48:06.790 their data in ways that they have not enthusiastically and knowingly consented to. Say no to uncritically reproducing
00:48:14.230 systems that were biased to begin with. Say no to writing code that imposes
00:48:19.590 unauthorized consequences onto their lives. In short:
00:48:25.110 refuse to play along.
00:48:38.850 Thank you, Carina. This was... wow, eye-opening.