Summarized using AI

Orders of Magnitude

Davy Stevenson • April 12, 2016 • Earth

In the video titled 'Orders of Magnitude', Davy Stevenson explores how humans perceive and handle numbers across vastly different scales—from minute measurements at the atomic level to astronomical distances. The talk emphasizes that up until the 17th century, human understanding was largely limited to what could be intuitively counted or seen. However, advancements over the past few centuries have drastically reshaped our worldview, particularly through technology and exploration. Stevenson particularly addresses the cognitive limitations humans face in comprehending both very large and very small numbers, leading to misconceptions in data handling and programming.

Key points discussed include:
- Human Number Sense: Humans possess an innate ability to count, but this is limited to small quantities. We actually struggle with comprehending numbers beyond four without the help of counting systems.
- Development of Counting Systems: Stevenson illustrates how various cultures have developed numeral systems and methods for counting, exemplified by the Incas' rope counting and the evolution from Roman to Arabic numeral systems.
- Conceptual Challenges: The speaker elaborates on the difficulties in assigning meaning and understanding to large and small numbers, noting that we have only grasped these concepts in the last few centuries.
- Practical Applications in Programming: Using numerical ranges, programmers can enhance software design and logic by structuring information into manageable sections and understanding the importance of visualization over raw numbers.
- Risk and Edge Cases: Stevenson discusses risk evaluation in programming, highlighting the need to anticipate edge cases and emphasizing that a scenario that seems improbable can still occur. This is particularly relevant when discussing software uptime and system reliability.
- Expanding Human Experience: The growth of human understanding has evolved from thousands to millions, billions, and beyond, underscoring the necessity for software systems that can accommodate this expanding complexity.

In conclusion, Davy Stevenson suggests that programmers must be prepared for edge cases and that our cognitive biases regarding numbers require careful consideration in software development. By grounding our understanding in experiences and visual representations, we can better navigate the complexities of programming amid vast numerical scales.

Orders of Magnitude by Davy Stevenson

Up until the 17th century, the world was mostly limited to what we could see with the naked eye. Our understanding of things much smaller and much larger than us was limited. In the past 400 years our worldview has expanded enormously, which has led to the advent of technology, space exploration, computers and the internet. However, our brains are ill-equipped to deal with numbers at these scales, and attempt to trick us at every turn.
Software engineers deal with computers every day, and thus we are subject to both incredibly tiny and massively large numbers all the time. Learn about how your brain is fooling you when you are dealing with issues of latency, scalability, and algorithm optimization, so that you can become a better programmer.

MountainWest RubyConf 2016

00:00:00 Hey.
00:00:22 Hello, everybody! Come on in, we're going to get started. This year has been a little bit different, and for the organizing, I like to use metaphors.
00:00:38 The metaphor I've used is like a 'greatest hits' album. I wanted to bring back some of our favorite sessions and speakers and see where everyone is now.
00:00:50 You know, on all the best greatest hits albums, there's always one new song. So that’s where Davy comes in, who I have never spoken to before, but she’s awesome and she’s going to speak to us next. So thank you for coming and being the extra reason to buy the album.
00:01:03 Sounds good? Testing. Awesome.
00:01:14 Thank you so much, Mike. MountainWest has been a conference on my wish list for a long time, and it never worked out until this year. I’m just so happy to be part of the very last night of MountainWest Ruby.
00:01:37 I'm from Portland, Oregon, where our slogan is, of course, 'Keep Portland Weird'. I’m certainly keeping it weird with this feedback. We have the smallest park in the world right here; it’s one of my favorites.
00:01:55 I’ve taken many people visiting Portland to this park, which is in the middle of the street, so you have to dodge cars to get there. But I think it’s worthwhile.
00:02:03 Of course, we have Portlandia, which is about fifty percent documentary.
00:02:10 Testing, testing. Okay, so Portlandia is mostly a documentary. This is one of my favorite scenes from Portlandia—putting a bird on everything.
00:02:29 Since 1999, I have been doing my part to keep Portland weird. This is my cat, Komachi. He loves going for walks; he’s very gregarious.
00:02:36 He’s also on Twitter, but not quite as famous as Gorbypuff, though maybe one day.
00:02:41 I came by yesterday at the conference and decided to wear contacts instead of glasses, which made two or three people not recognize me. So today, I’m wearing my glasses so you can all remember me.
00:02:57 And yes, that did make me feel like a superhero.
00:03:03 I work for GitHub as an engineering manager for the team that works on the availability and performance of the Rails app. That also means I have stickers—lots of them!
00:03:21 So if anyone wants some stickers, come find me afterward. There’s a good cat one!
00:03:32 Unfortunately, we have retired all of our copyright infringing stickers, so we don’t get sued.
00:03:39 So no more Zelda stickers or Rainbow Dash stickers. I’m sorry, I don’t have any of those.
00:03:46 GitHub likes to stay in business and not get sued out of business.
00:03:56 Okay, so on to my actual talk. The title of this talk is 'Orders of Magnitude'.
00:04:03 The concept is that I'm going to be talking about big numbers, small numbers, and everything in between, but more importantly, I'm going to talk about how our brains conceptualize and process numbers.
00:04:25 Number sense is the idea that certain animals have an innate conceptualization of numbers and counting built directly into their brains.
00:04:36 They don't have to learn about numbers. It might not be surprising to think that animals such as elephants, dolphins, and many of the great apes have number sense.
00:04:47 But it's also been shown that animals such as birds, and even insects like bees, have the ability to differentiate between numbers.
00:05:00 What this usually means is that they can tell the difference between one of something versus two of something; they can even distinguish between three and four.
00:05:12 However, this ability tends to break down at around five or six. Beyond that, animals can often still differentiate between sets with a larger difference, such as eight versus twelve.
00:05:23 You might expect that humans have number sense too, right? We like to think we're pretty good at counting.
00:05:36 But science shows we’re not actually that much better than the other animals we study. Tribes that haven’t developed finger counting often struggle to determine differences between numbers above about four.
00:05:52 This can be surprising, and it exposes one of the first misconceptions we have about how our brains process numbers: in fact, our brains don't inherently understand numbers beyond about four.
00:06:01 We are born with number sense, but we must learn how to count. It’s hard to imagine life before counting.
00:06:16 But learning to count was only really advantageous to humans once we started farming, herding, or managing livestock. That was when we needed to figure out how many sheep we had.
00:06:30 So we invented things like finger counting, tallying with stones, notches on sticks, or knots on ropes to help us manage numbers larger than we could comprehend by ourselves.
00:06:44 Now, we don't always have words for these numbers, but we are able to keep track of our flock of sheep by keeping a pebble in our pocket for each one.
00:06:56 Before you think this is a naive or simplistic way of managing numbers, the Incas managed to keep track of a vast civilization based on large numbers by using what was known as rope counting.
00:07:07 They used knots in strings to keep track of their currencies and the values they had to deal with in their civilization.
00:07:20 Once we recognized the need to count, the next step was to determine that maybe we should have names for these numbers. It’s easier to say 'I have eight sheep' than saying 'I have this many sheep' while piling rocks.
00:07:35 Initially, many of the first number systems only had names for numbers like one and two. Many tribes today still only have these numbers represented with words.
00:07:52 Two examples are the San people of Namibia and various aboriginal tribes in Australia.
00:08:04 The next thing we see is that number words can depend on the context. For example, the Tsimshian language, spoken by a tribe in British Columbia, has completely distinct names for numbers depending on what you are counting.
00:08:20 So, if you are counting flat objects, you would use a completely different word for three than if you were counting men. They also had a different set for long objects and trees, but canoes were in a completely new category.
00:08:37 Canoes were vital to them, and before we start thinking this is silly, the English language has a history of this built into it directly.
00:08:56 We have many different terms for the number two, which is a relic of our ancestors dealing with numbers.
00:09:07 A 'brace' is often used to refer to a paired set of animals, such as a brace of horses to run a cart.
00:09:20 This brings us to our second misconception: counting things is easy. Even coming up with the names for numbers can be difficult.
00:09:31 The names we use to count can vary widely. Abstract number counting is actually quite a difficult thing to master.
00:09:40 Once we’ve learned to count and have the words for our numbers, the next step is to create numeral systems.
00:09:49 Of course, we are familiar with the Roman numeral system, which was an improved tallying system. Its main features are additive symbols for larger values and subtractive notation.
00:10:01 That was an advancement over pure tallying. However, interpreting Roman numerals can still take some mental energy.
00:10:18 Next, we developed the Arabic numeral system. This was a little different because each digit has its own character.
00:10:32 Then we have positional notation, in which the sequence of digits creates the numerical value. This system is much easier to parse and understand.
00:10:42 When we adopted this positional notation, the concept of exponentiation followed quickly after.
00:10:49 Now that the digits have a place, that place can itself be counted. One common way to reference exponential notation is to use 10 to the power of something.
00:11:06 Another method is using 'e' notation, which I'll be using going forward since it’s shorter and avoids the irritating superscripts.
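As a minimal sketch of the 'e' notation described here, Python accepts the same shorthand for literals and can print values back in it. The helper name `order_of_magnitude` is an assumption made for illustration and does not come from the talk.

```python
from decimal import Decimal


def order_of_magnitude(value: float) -> int:
    """Return the power of ten of the leading digit of a nonzero number."""
    if value == 0:
        raise ValueError("zero has no order of magnitude")
    # Going through repr() avoids floating-point surprises near exact powers of ten.
    return Decimal(repr(abs(value))).adjusted()


one_thousand = 1e3         # "e to the three"
one_ten_thousandth = 1e-4  # "e to the negative four"

print(f"{one_thousand:.0e}")                   # 1e+03
print(f"{one_ten_thousandth:.0e}")             # 1e-04
print(order_of_magnitude(3.5e9))               # 9
print(order_of_magnitude(one_ten_thousandth))  # -4
```

Comparing two quantities then comes down to subtracting their exponents rather than reading long strings of zeros.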
00:11:18 We also needed to assign names for the different positions, and of course, we couldn’t stick to just one naming convention; we have multiple.
00:11:29 The existence of these number-naming systems allows us to count both larger and smaller numbers, but this leads us to another lie: naming these numbers doesn't mean we understand them.
00:11:43 To recap, humans have number sense, but that only helps us up to about four. Counting is hard, and naming doesn't equal understanding.
00:11:54 So how can we apply this to our day-to-day life as programmers? One common way we do this is by breaking up our web pages into different sections. It’s much easier to understand and navigate when there are three sections instead of seventeen.
00:12:07 For example, I have a header, a right column, and then the main body. Another way we can help ourselves is by graphing data instead of presenting numbers as a table.
00:12:17 Visual representations allow us to understand the relationships between numbers instead of having to scan through a table. This impacts our design philosophies.
00:12:35 It’s a good idea to limit the number of lines within each method, which ties back to our number sense. Each method should handle only a small number of tasks.
00:12:50 It’s simpler for our brains to grasp if they are doing three things in order rather than fifteen. Similarly, this applies to good testing practices.
00:13:06 Also consider our object hierarchy; it’s easier to understand if you deal with a small subset of classes. We want to avoid complex classes.
00:13:19 Speaking of our dear friend Tenderlove, he mentioned yesterday that a number is meaningless until you know the class name associated with it.
00:13:28 This underscores the idea that having a contextual name is crucial for understanding, rather than referencing a random number.
00:13:40 Now, let’s create a baseline for what really big and small numbers are by establishing a human experience baseline.
00:13:52 For the baseline distance, we're using one meter, which is approximately the size of a human and how far you can reach.
00:14:06 So, with that baseline, what’s the smallest thing a human can experience? It’s about the width of a human hair, approximately e to the negative four.
00:14:21 On the other hand, the biggest thing we can perceive is a mountain. Mountains can be seen in the distance and climbed to understand their vastness.
00:14:40 This is approximately e to the four. In terms of time, I’ll say the human experience baseline is about one hour.
00:14:57 One minute seems far too short, so we block things into one hour or half-hour chunks, which is our natural experience of time.
00:15:14 The smallest time we can perceive is the blink of an eye, which is also about e to the negative four.
00:15:31 The longest time we can perceive is our own lifespan, which is approximately one hundred thousand hours, or e to the five.
00:15:48 This brings us to another misconception: we have direct experience with very big and small numbers. Our experience ranges from the width of a hair to a mountain.
00:16:09 However, these ranges only run from thousandths to thousands, which, in the grand scheme of things, aren’t exceedingly large or small.
00:16:22 Long ago, humans discovered that curved surfaces magnified and distorted light, but we didn’t really understand the mathematics behind it for quite some time.
00:16:39 That didn’t stop us from carving lenses to try and magnify the world around us. This is the Nimrud lens, dating back to about 750 BC.
00:16:50 These lenses were found in areas like ancient Assyria, Egypt, Greece, and Babylon. They were crude magnifiers, but they were among the first ways we could visualize beyond our naked eye.
00:17:03 Optic theory was the study of how light bends going through mirrors and lenses, where the laws of refraction were crucial to compute how lenses could enhance our view of the world.
00:17:18 Around 1590, the first microscope was invented, and by 1608, the telescope was invented, which drastically expanded our understanding of the universe.
00:17:36 Let’s return to our baselines and see how much further we've expanded our understanding since then. When it comes to smaller sizes, we discovered bacteria at e to the negative six.
00:18:03 The microprocessor memory cell right now is about e to the negative eight (the 14 nanometer resolution began shipping in 2014).
00:18:16 The current smallest gate length of our processors is e to the negative nine, which is a size of a 16 nanometer processor.
00:18:31 This is fantastically small, considering that atoms themselves are about e to the negative ten, while electrons are e to the negative eighteen.
00:18:47 On the larger scale, the moon is approximately e to the sixth and the sun is e to the ninth, and that’s not even the biggest star we study.
00:19:06 We have Rigel at e to the eleven and Betelgeuse at e to the twelve. Because I studied astrophysics, you’re going to get a quick lesson.
00:19:23 Both Betelgeuse and Rigel are in the constellation Orion. Betelgeuse is the left shoulder of Orion, while Rigel is the right foot.
00:19:37 For reference, Sirius, the brightest star in the sky, is also located in the vicinity.
00:19:56 Moving on, we've also been able to study the Pillars of Creation at e to the sixteen. These are interstellar gas and dust in the Eagle Nebula.
00:20:09 They’re so named because this mass of gas is actively creating new stars. This is only a small part of a much larger nebula.
00:20:22 The leftmost pillar in this image is about four light years long. We can also expand our sense of time.
00:20:36 As we go smaller, the single synapse in our brain is about e to the negative seven.
00:20:48 In 1980, we created processors at five megahertz, which stood at e to the negative ten.
00:20:59 Our current processors, at 3.5 gigahertz, are at e to the negative thirteen.
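The arithmetic behind these clock-speed figures can be checked with a short Python sketch. It assumes the one-hour time baseline used in the talk, and the name `cycle_time_hours` is made up for illustration.

```python
SECONDS_PER_HOUR = 3600


def cycle_time_hours(frequency_hz: float) -> float:
    """Duration of one clock cycle, expressed in hours."""
    seconds_per_cycle = 1.0 / frequency_hz
    return seconds_per_cycle / SECONDS_PER_HOUR


for label, frequency_hz in [("5 MHz", 5e6), ("3.5 GHz", 3.5e9)]:
    print(f"{label}: {cycle_time_hours(frequency_hz):.1e} hours per cycle")
# 5 MHz: 5.6e-11 hours per cycle
# 3.5 GHz: 7.9e-14 hours per cycle
```

Rounded to the nearest power of ten, those values land on the e to the negative ten and e to the negative thirteen quoted above.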
00:21:15 Looking at the larger scale, the oldest known living thing on Earth is a particular type of clonal organism that has been alive for 5,000 years, or e to the seven.
00:21:28 Humans have been around for about 200,000 years, or e to the nine, while dinosaurs lived for about 100 million years, or e to the twelve.
00:21:40 What does this tell us? It exposes another misconception: that we have been able to explore the world in great depth for a long time.
00:21:56 It’s only been over the past couple of hundred years that we've expanded beyond the range of thousands.
00:22:09 Another thing our brains are responsible for is determining risk and estimating odds of dangerous situations.
00:22:24 Clearly, we have a desire to preserve our lives, so we need to assess risks.
00:22:39 Unfortunately, we are not adept at this and tend to prioritize immediate short-term risks over more long-term ones.
00:22:53 We have a visceral reaction to immediate threats, such as being attacked by sharks or bitten by snakes.
00:23:05 Yet every day we ride in cars or smoke cigarettes. In the context of computing, this shows up when we discuss our nines.
00:23:19 When we talk about the availability of our software, we like to frame it in terms of nines, referring to the percentage of uptime.
00:23:36 For example, having one nine allows for about 36 and a half days per year in downtime.
00:23:50 Even two nines, which should be the minimum to aim for, still allows roughly three point six five days of downtime per year.
00:24:02 Four nines results in about fifty-two point five six minutes of downtime per year. And when we discuss five nines, we're talking about roughly five minutes and fifteen seconds of downtime annually.
00:24:14 So, what happens with nine nines? Anyone know how much downtime we'd be allowed? It comes to 31.5 milliseconds, or thirty-one brain synapses.
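The downtime budget for any number of nines follows from one division. Here is a minimal Python sketch, assuming a 365-day year; the function name `allowed_downtime` is hypothetical.

```python
from datetime import timedelta

SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000 seconds, ignoring leap years


def allowed_downtime(nines: int) -> timedelta:
    """Yearly downtime budget for `nines` nines of availability (2 means 99%)."""
    if nines < 1:
        raise ValueError("nines must be at least 1")
    # Each extra nine divides the allowed downtime by ten.
    return timedelta(seconds=SECONDS_PER_YEAR / 10**nines)


for nines in (1, 2, 3, 4, 5, 9):
    budget = allowed_downtime(nines)
    print(f"{nines} nines: {budget.total_seconds():.4f} seconds per year ({budget})")
```

Nine nines works out to 0.0315 seconds per year, which is the 31.5 milliseconds mentioned above.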
00:24:32 This leads back to another misconception: our brains are adept at calculating odds.
00:24:45 How many times have you heard someone say 'it will never happen'? As the number of chances for it to occur grows, 'never' is usually the wrong answer.
00:25:05 We now understand that large and small numbers are relatively recent to humanity, and our brains aren’t good at computing odds.
00:25:18 So let's talk about edge cases. This fellow, Hubert Wolfeschlegelsteinhausenbergerdorff, is said to have the longest official name.
00:25:31 He has one given name for each letter of the alphabet, and that only gets us to Zeus, before his last name and the 'Senior' at the end.
00:25:46 So, do you think your site can handle Mr. Hubert signing up? He's not likely to, since he has already passed away, but what about his son?
00:25:59 There are others like celebrities who only have one name, or even the royal families of Japan and Thailand, who traditionally also have only one name.
00:26:15 If the Emperor of Japan wants to sign up for your site, should they be allowed? And what about those with hyphenated names or emoji characters in their names?
00:26:31 Emails also present their challenges. Think about parsing email addresses that contain plus signs or ridiculous domain names.
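One way to stay out of trouble with names like these is to store a single free-form name and validate as little as possible. The Python sketch below illustrates that idea under stated assumptions; `normalize_full_name` is a hypothetical helper, not something shown in the talk.

```python
import unicodedata


def normalize_full_name(raw: str) -> str:
    """Tidy a user-supplied name without assuming how it is structured."""
    # NFC composes accents so that visually identical names compare equal.
    name = unicodedata.normalize("NFC", raw)
    # Collapse runs of whitespace, but never split into first and last name.
    name = " ".join(name.split())
    if not name:
        raise ValueError("name must not be empty")
    return name


# Single names, hyphens, apostrophes and emoji all pass through unchanged.
for example in ["Cher", "Mary-Jane O'Neil", "  Hubert   Wolfe  ", "Komachi 🐱"]:
    print(repr(normalize_full_name(example)))
```

The sketch deliberately imposes no maximum length and no required surname, since both assumptions fail for the people mentioned above.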
00:26:46 Can you write a regular expression that correctly parses all possible email addresses? Probably not.
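Since a complete regular expression is out of reach, a common alternative is a very permissive structural check followed by a confirmation message. The Python sketch below shows only the structural check; `looks_like_email` is a hypothetical name, and the rule it applies is an assumption rather than a standard.

```python
def looks_like_email(address: str) -> bool:
    """Permissive check: something, then '@', then something."""
    # Split on the last '@', because the local part may itself contain '@' when quoted.
    local_part, separator, domain = address.rpartition("@")
    return bool(separator) and bool(local_part) and bool(domain)


print(looks_like_email("davy+talks@example.museum"))  # True
print(looks_like_email("no-at-sign"))                 # False
print(looks_like_email("@example.com"))               # False
```

Whether the address actually works is then settled by sending mail to it, not by parsing it.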
00:27:01 It shows that we have to deal with many edge cases and exceptions, especially when implementing protections.
00:27:13 It’s crucial to have mutexes in place for concurrent requests, handle database transactions carefully, and ensure background jobs are queued properly.
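To make the mutex point concrete, here is a small Python sketch in which several threads stand in for concurrent requests updating shared state. The `SignupCounter` class and `simulate_requests` helper are invented for illustration.

```python
import threading


class SignupCounter:
    """Hypothetical shared counter guarded by a mutex."""

    def __init__(self) -> None:
        self._count = 0
        self._lock = threading.Lock()

    def increment(self) -> None:
        # Only one thread at a time may run this read-modify-write.
        with self._lock:
            self._count += 1

    @property
    def value(self) -> int:
        with self._lock:
            return self._count


def simulate_requests(counter: SignupCounter, workers: int = 8, per_worker: int = 1000) -> None:
    """Run `workers` threads that each increment the counter `per_worker` times."""
    def handle_requests() -> None:
        for _ in range(per_worker):
            counter.increment()

    threads = [threading.Thread(target=handle_requests) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()


counter = SignupCounter()
simulate_requests(counter)
print(counter.value)  # 8000
```

Without the lock, two threads can read the same old value and one update is lost, which is exactly the kind of rare event that becomes certain at scale.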
00:27:29 You also need to consider whether your test suite is adequately covering all possible scenarios. Are you just testing the happy path?
00:27:43 Is your deployment process hindering your ability to achieve optimal uptime? When dealing with data migrations, remember that your customers’ data is valuable.
00:27:56 Make sure that you handle it with the utmost respect.
00:28:07 The human experience has dramatically increased over the last hundred years. We have moved from experiencing things mainly in the thousands to the millions, billions, and trillions.
00:28:30 In writing scalable code, we need to be preparing for that millionth user, billionth request, and trillionth event.
00:28:47 If you take away one thing from this talk, it should be that edge cases are a certainty. Just because something seems unlikely doesn’t mean it won’t happen.
00:29:02 Ensure that your software can handle those edge cases. Thank you very much.
00:29:15 That's a great question. How has my education in astrophysics affected my career?
00:29:27 I was quite happy not to code in the programming languages used in astrophysics, which were mainly Fortran.
00:29:43 One interesting aspect of astrophysics was learning how we, as humans, tried to understand things beyond our own world.
00:29:58 When limited by the photons hitting Earth, how do you figure out the materials of the sun or how far things are?
00:30:10 Being able to creatively query that information from a narrow data set is impressive.
00:30:32 I think we mirror this creatively in programming as well, especially when debugging issues that arise in servers.
00:30:45 When faced with just a stream of electronic signals, it can be challenging to diagnose problems.
00:30:59 The answer to risk calculation of edge cases when programming? Metrics: record everything all the time.
00:31:11 When you don’t have metrics for when things are going well or poorly, you can't do effective comparisons.
00:31:27 GitHub is well known for its robust storage of information about everything happening at every level of the stack.
00:31:39 Being able to create graphs is essential; numbers are meaningless without context.
00:31:52 Seeing trends over time is critical. It’s important to maintain a thorough logging mechanism.
00:32:07 In specific areas, full logging isn’t always feasible, leading to choices like capturing only ten percent of occurrences.
00:32:20 This decision must be case-by-case based on project requirements.
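Capturing only a fraction of occurrences is usually done by random sampling. The Python sketch below keeps roughly ten percent of events; the function name `should_record` and the fixed seed are assumptions made so the example is repeatable.

```python
import random


def should_record(sample_rate: float, rng: random.Random) -> bool:
    """Return True for roughly `sample_rate` of calls (0.10 keeps about 10%)."""
    return rng.random() < sample_rate


rng = random.Random(2016)  # fixed seed so the sketch is repeatable
events = range(100_000)
recorded = [event for event in events if should_record(0.10, rng)]
print(f"kept {len(recorded)} of {len(events)} events")
```

When reading sampled metrics back, counts need to be scaled up by one over the sample rate to estimate the true totals.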
00:32:27 It's a good question. Each project may use different process improvement tools like Six Sigma.
00:32:38 In conclusion, I regret I couldn't find a tool for visualizing scales, such as millions versus billions.
00:32:51 However, I recommend exploring the xkcd comics, which effectively illustrate scales of numbers.
00:33:06 Thank you.