The Science and Magic of Debugging

00:00:10.719 It feels like I'm back at the gym or something after 20 months. How do I do this again? Hopefully, there won't be any physical ramifications from giving this talk. I won't be sore tomorrow or something. Hello! Oh, we can do better than that. RubyConf, hello! That's better, thank you. I need this energy. This is like, as I mentioned, 20 months of pent-up talk giving coming out right now. I'm really excited to be here, and I'm really glad all of you came to this talk. Feel free to grab a seat if you're still coming in.

00:00:40.239 It's been a minute, but I'm so glad to be here with all of you. I'm really excited that my first in-person talk is at RubyConf. I've been working with Ruby for years now, and amazingly, this is somehow my first RubyConf. It's amazing. It's so great to be here.

00:01:14.000 I'm going to wait for that post-caffeine energy to kick in. I'm going to need some energy from you all. Anyway, let me introduce myself. I see a lot of familiar faces in the crowd and a lot of new ones too. My name is Vaidehi. It's great to be here with all of you. By day, I'm a senior software engineer at Vimeo. We are hiring for a bunch of roles. We do use Ruby, and we will be at the job fair right after this, which I think is at 3 PM.

00:01:39.520 So if you want to learn more, come and chat with me or the rest of my team, who are around here somewhere. When I'm not at work and have free time, I really enjoy learning new things. Every once in a while, I get obsessed with learning something and dive really deep into it. Those learning adventures sometimes evolve into side projects. These are a couple of them, which maybe you've heard of. If you haven't, that's cool, but I do have some stickers. I have some Base.cs Podcast stickers and Byte Sized stickers, so come grab a sticker and talk to me afterward to learn more about these side projects.

00:02:27.760 As I was writing this talk, I thought a lot about how much this joy of learning has been instrumental in my career. Now admittedly, my desire to learn new things sometimes manifests in interesting ways. I remember when I was starting out at my very first programming job. I would get really excited whenever an outage happened during work hours, which in hindsight, what does that say about me? I also feel sorry for the engineers I worked with because that probably was terrible for them. But the reason I would get excited is that I wanted to watch other engineers debug things in real time, and those outages were a great opportunity for me to observe them.

00:05:05.760 You see, back when I was first starting out in tech, I was really intimidated by debugging, especially debugging under pressure. That terrified me, and I wanted to know how other engineers knew what to look for while debugging. Debugging was always somewhat mysterious and mystical to me in those early years of my career. I particularly admired any engineer who could systematically debug something, no matter what the bug was. Great debuggers seemed like wizards to me. When I would watch others debug, I would quietly wonder to myself, 'How did she know where to look for that? How did they know where to begin debugging with that error message?'

00:05:58.720 As I've progressed in my career and gained experience, I now find myself on the other side of that. Over the last few years, in particular, as I've mentored early-career engineers and paired with others on outages, I've noticed that they are now asking me those questions—questions that I used to quietly think to myself. It's forced me to pause and reflect on how I know what I know when it comes to debugging.

00:06:36.720 How do I know where to start looking for a bug? What exactly do I do when I debug? How can I articulate that to others? What I really wanted to know was what we are actually doing when we debug something. So, I did what I always do when I don't know the answer to something: I decided to learn about it and did a lot of furious googling in the process. I read numerous research papers and articles to better understand the cognitive process of debugging.

00:07:05.760 Before we answer any of those questions, it's important to clarify what we mean when we talk about debugging in technical terms. Debugging is the process of locating a bug in a piece of software and then correcting it. More generally, we can think of debugging as problem-solving. As developers, we spend a lot of our time problem-solving. According to a 2013 survey from the University of Cambridge Business School, software developers spend about 50 percent of their time either fixing bugs or making code function. Another way to think about that is that we spend half of our jobs just debugging stuff, which is pretty wild to consider.

00:07:48.959 When we consider how much of software development involves maintaining projects that already exist and probably were written by someone else, it makes sense that debugging is an inherent part of our jobs. However, debugging is also incredibly hard. Computer scientist Brian Kernighan has a great quote about debugging: 'Everyone knows that debugging is twice as hard as writing a program in the first place. So if you’re as clever as you can be when you write it, how will you ever debug it?' I appreciate this sentiment because I think what he’s getting at is that programming, in and of itself, is a hard skill. When we're debugging, we’re trying to untangle the knots of code, logic, and syntax within a system, and that adds another level of complexity.

00:08:47.200 When we debug something, we're holding a lot of context in our minds—context about the program, how it behaves, and how it is structured. Devon O'Dell, an engineer, has a wonderful article called 'The Debugging Mindset,' where he refers to software problems as those with combinatorial complexity. This is an astute observation because the smallest additions to our codebases can significantly impact the complexity of our systems. As debuggers and problem solvers, we end up having to hold all this complexity in our heads, but how do we do that?

00:09:20.480 We utilize a mental model to understand how something operates. A mental model is the internal image you hold in your mind of how something works. Whether we're dealing with a 100-line Ruby program or 20 different Ruby microservices talking to each other, we each have our own mental model of how things are functioning. However, mental models can only be approximations; none of us can possibly memorize every detail about our system.

00:09:58.720 When we're debugging, we are constrained by our mental models, and that can be limiting, making debugging harder. What happens in our brains when we debug something? After extensive research and reading in software debugging, I came across a paper titled 'Cognitive Process During Program Debugging,' written back in 2004. Unfortunately, I don't have enough time to delve into all the details, but I highly recommend you read it. This paper references 20 other papers, making it an excellent jumping-off point.

00:10:28.640 The authors of this paper, both computer science academics, examined the process that programmers follow when debugging problems. They looked at a case study where a programmer worked to correct a failure in their application. In this instance, the bug involved the web server talking to the database, which sometimes would be slow or not respond at all; however, the bug's details aren't the focus. What’s intriguing is the steps that the programmer took to debug the issue.

00:11:06.640 The authors of this paper found that the cognitive process followed by the programmer mapped to a framework known as Bloom's Taxonomy. If you're unfamiliar with Bloom's Taxonomy, it was proposed in 1956 by educational psychologist Benjamin Bloom of the University of Chicago and classifies cognitive skills. It’s often used by educators to derive objectives for their students as they learn new concepts.

00:11:59.200 Bloom categorized the cognitive process into six distinct levels that represent how we develop mental skills and acquire new knowledge. It essentially serves as a framework for how we achieve mastery over a subject. The model starts off simply, seeming somewhat intuitive. In developing a mental skill or acquiring new knowledge, we need to begin by remembering information and should be able to recall facts and basic concepts.

00:12:35.360 Next, we must demonstrate an understanding of the facts and concepts that we've remembered, meaning we should know what the information we recalled actually means; otherwise, it’s somewhat useless. After understanding something, we should apply that knowledge to real-life situations, being able to use the information to answer questions and help solve problems. Following that, we need to be able to analyze the information and examine the ideas behind it.

00:13:03.920 This means breaking the information down into simpler parts, questioning the information or ideas, and examining the details closely. At this stage, we should explain the connections and cause-effect relationships between different aspects of the information. Next, we have to evaluate and synthesize the information we’ve just analyzed, meaning we assess it, form an opinion based on our knowledge, and justify that opinion.

00:13:56.720 At the highest level of cognition in this framework, we take all the information we have and turn it into something new, which may involve creating a new understanding of a concept or crafting a new solution to a problem. As we ascend through this taxonomy, each level becomes more complex and abstract, requiring higher-order thinking skills to transition from one level to the next. Notably, to progress through Bloom's Taxonomy, the previous level must be mastered before moving on to the next.

00:14:46.080 The authors of the research paper discovered a connection between the cognitive framework, Bloom's Taxonomy, and the steps the programmer took to debug the problem. They noticed that every step the programmer took could be mapped to a level of Bloom's Taxonomy. One observation that stands out is that the first level of the taxonomy is not even included; this implies that a mastery of this first level is necessary before even beginning the debugging process.

00:15:18.560 The researchers concluded there was so much knowledge required before one can even begin debugging that they excluded this from their findings entirely. This speaks volumes about the baseline knowledge a programmer needs to start debugging something. Additionally, there’s noticeable activity in the fifth and sixth levels of Bloom’s Taxonomy, which refer to evaluation and creation.

00:15:40.160 This reaffirms our earlier idea that debugging necessitates a great deal of higher-order thinking. It's evident that we operate at these elevated cognitive levels when debugging. Since Bloom's Taxonomy is a framework for learning new things and acquiring knowledge, we can put these two ideas together: when we debug, we’re essentially engaged in a learning process.

00:16:03.680 We are functioning at a high cognitive level, gathering information about the program and the system surrounding it. But how do we accomplish this? How do programmers collect knowledge? How do we begin to find information about a program or system? How do we navigate through Bloom's Taxonomy?

00:16:36.160 To answer these questions, I should introduce another research paper, one that the first paper references. This paper was published back in 1991 by three computer scientists in Japan and is titled 'A General Framework for Debugging.' The researchers observed that, although debugging forms a core part of software development, there’s no established methodology for describing it, let alone for teaching it. Therefore, they set out to define the minimum requirements necessary for debugging, which they called a debugging process model.

00:17:14.240 Here you can see that process model illustrated, though it’s worth noting that back in 1991, photocopying methods might not have been very advanced. This model resembles a flow chart, and it starts with some kind of error report. This paper refers to the error report as an initial hypothesis set, but we can also view it as a literal error code, error message, or some unexpected behavior observed in our system.

00:17:57.920 After generating our first hypothesis for why the error might be occurring, we enter the hypothesis set modification phase, where it's essential to recognize that the hypothesis is itself a fluid entity. We often generate numerous hypotheses but can only focus on one at a time, leading us to the hypothesis selection phase. Here, we choose one specific hypothesis to validate.

00:18:40.160 One approach is to simplify the error condition, meaning we seek other conditions that lead to the same error and reproduce a simpler scenario reflective of that bug. Another strategy involves identifying a 'suspicious region'—starting with a broad hypothesis and refining it into a smaller, more focused scope; for instance, narrowing the assumption to pinpoint a specific class or method.
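To make the idea of simplifying the error condition concrete, here is a minimal sketch in Ruby. The `buggy_parse` method, its failure mode, and the `simplify` helper are all hypothetical stand-ins and not something from the paper: the helper keeps dropping pieces of a failing input for as long as the same failure still reproduces.

```ruby
# Hypothetical method under suspicion: raises ArgumentError on a malformed field.
def buggy_parse(fields)
  fields.map { |field| Integer(field) }
end

# Returns true if the given input still triggers the failure we are chasing.
def fails?(fields)
  buggy_parse(fields)
  false
rescue ArgumentError
  true
end

# Greedy simplification: try removing one element at a time and keep the
# smaller input whenever the failure still reproduces.
def simplify(fields)
  current = fields.dup
  changed = true
  while changed
    changed = false
    current.each_index do |i|
      candidate = current[0...i] + current[(i + 1)..]
      next unless fails?(candidate)

      current = candidate
      changed = true
      break # restart the scan on the smaller input
    end
  end
  current
end

failing_input = ["1", "2", "", "4"]
minimal_input = simplify(failing_input)
puts minimal_input.inspect
```

The smaller input is a much easier starting point for a hypothesis than the original report, which is the point of this step.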

00:19:18.000 Another tactic is to modify our code in the areas we suspect might be problematic. By making changes there, we can observe the modifications' effects on our program. If we determine a particular file appears suspicious, we might comment out portions of the code in that file and analyze what occurs when we run our program.

00:19:49.960 Ultimately, once we select a hypothesis, it’s time to verify it. The hypothesis verification phase is the part of the debugging process where we determine if our hypothesis is true, false, or possibly neither. We achieve this through several means: by examining the code before executing it, running the code with varying inputs, or a combination of these approaches. Sometimes, we modify the program and verify our hypothesis this way. At this point, two outcomes are possible: we either verify our hypothesis as true (hooray!) or discover that we cannot fix the bug right away.
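As a rough illustration of verifying a hypothesis by running code with varying inputs, here is a small Ruby sketch. The `average` method and the hypothesis that only an empty list breaks it are assumptions made up for this example.

```ruby
# Hypothetical method under suspicion.
def average(numbers)
  numbers.sum / numbers.size
end

# Hypothesis: the method only fails when it is given an empty list.
inputs = [[1, 2, 3], [10], [], [2.5, 3.5]]

# Run the suspect code against each input and record what happened.
results = inputs.map do |input|
  outcome = begin
    average(input)
  rescue ZeroDivisionError => e
    e.class
  end
  [input, outcome]
end

failing = results.select { |_input, outcome| outcome == ZeroDivisionError }
                 .map(&:first)

results.each { |input, outcome| puts "#{input.inspect} => #{outcome.inspect}" }
puts "Hypothesis holds: only the empty list fails" if failing == [[]]
```

If a non-empty input had failed too, the hypothesis would be refuted, and that result would feed the next round of debugging.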

00:20:40.880 Usually, we fail to confirm the hypothesis on the first try—or even the second or third. When that occurs, we return to the hypothesis modification phase. However, we don't go back empty-handed. If we've failed to verify our hypothesis, that refutation now becomes a fact we can pivot on during our ongoing debugging process. Each time we attempt to verify a hypothesis, we learn something. If we verify a hypothesis but don’t find the bug, our hypothesis transitions into a fact, guiding our future modifications.
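The cycle described above can be sketched in a few lines of Ruby. This is only an illustration of the loop, not the paper's own formalism: the `Hypothesis` struct, the `debug` method, and the observed numbers are all hypothetical.

```ruby
# A hypothesis pairs a description with a check that returns true
# when the evidence supports it.
Hypothesis = Struct.new(:description, :check)

# Walk the hypothesis set one hypothesis at a time. Every verification
# attempt, confirmed or refuted, is recorded as a fact.
def debug(hypotheses)
  facts = []
  remaining = hypotheses.dup
  until remaining.empty?
    hypothesis = remaining.shift # hypothesis selection
    if hypothesis.check.call     # hypothesis verification
      facts << "confirmed: #{hypothesis.description}"
      return [hypothesis, facts]
    end
    facts << "refuted: #{hypothesis.description}" # a refutation is still knowledge
  end
  [nil, facts]
end

# Hypothetical observations about a slow web server talking to a database.
observed = { cache_hit_rate: 0.97, pool_size: 5, active_connections: 5 }

hypotheses = [
  Hypothesis.new("the cache is missing too often",
                 -> { observed[:cache_hit_rate] < 0.5 }),
  Hypothesis.new("the connection pool is exhausted",
                 -> { observed[:active_connections] >= observed[:pool_size] })
]

culprit, facts = debug(hypotheses)
puts facts
puts "Next step: look into '#{culprit.description}'" if culprit
```

In real debugging the hypothesis set changes as facts accumulate; this sketch keeps it fixed to stay short.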

00:21:28.480 Each cycle of debugging allows us to re-evaluate what we understand. We continuously gather information, develop hypotheses, and then verify or refute them, over and over again. You might be thinking that this cyclical process sounds somewhat familiar, and that's a good instinct, since it mirrors the scientific method. In school, we all likely learned the scientific method, which begins with observing a problem, generating a hypothesis, testing the hypothesis through experimentation, then drawing a conclusion and refining the hypothesis as needed.

00:22:06.880 When we debug issues in our code and within complex systems, we employ a variation of this very strategy to home in on the code causing the bug or unexpected behavior. Now we understand two concepts that can enhance our understanding of the debugging process and how to apply it.

00:22:30.880 Both Bloom's Taxonomy and the scientific method offer insights into the nature of debugging. Regarding Bloom's, if debugging is fundamentally a learning process that involves gathering information, the key takeaway is that to be proficient debuggers, we need to excel at accumulating knowledge and learning about the systems we operate within.

00:23:09.680 Fortunately, in the world of Ruby, we have fantastic tools, like logging, puts debugging, gems like Pry and Byebug, and others I can't even recall at this moment, all of which enhance our ability to gather knowledge and learn more effectively. What can we take away from our variation of the scientific method?
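As a quick sketch of what gathering knowledge with these tools can look like, here is a small Ruby example using the standard library's Logger. The `apply_discount` method is hypothetical, and the log goes to an in-memory buffer only to keep the example self-contained.

```ruby
require "logger"
require "stringio"

# Log to an in-memory buffer here; $stdout or a file path works the same way.
log_buffer = StringIO.new
logger = Logger.new(log_buffer)
logger.level = Logger::DEBUG

# Hypothetical method under investigation.
def apply_discount(price, percent, logger:)
  logger.debug("apply_discount price=#{price.inspect} percent=#{percent.inspect}")
  discounted = price - (price * percent / 100)

  # puts debugging: Kernel#p prints the inspected value and returns it.
  # p discounted

  # Interactive option: uncomment to pause here in an IRB session.
  # With the Pry or Byebug gems installed, `binding.pry` or `byebug`
  # play the same role.
  # binding.irb

  discounted
end

result = apply_discount(200, 15, logger: logger)
puts result
puts log_buffer.string
```

Each of these tools answers the same question in a different way: what values does the program actually see at this point?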

00:23:54.080 If debugging compels us to form, learn from, and verify our hypotheses, we must always check our assumptions. Earlier in this talk, I mentioned that we all have our own mental models where we visualize how something works. I indicated that these models are approximations, presenting another layer of complexity to debugging.

00:24:31.680 Our mental models are approximations because they are built on assumptions that can be wrong. This means our mental models can be flawed or incorrect. To be competent debuggers, we need to be ready to question and validate these mental models as part of the debugging process.
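One way to question a mental model is to write its assumptions down as checks. The sketch below is a minimal illustration in Ruby; the `assume` helper, the `AssumptionError` class, and the `total_price` method are hypothetical names invented for this example.

```ruby
# Raised when something we "knew" about the system turns out to be false.
class AssumptionError < StandardError; end

# Hypothetical helper: state an assumption in words and check it with a block.
def assume(description)
  raise AssumptionError, "assumption failed: #{description}" unless yield
end

# Our mental model says `items` is always an Array of Hashes
# that each carry a numeric :price.
def total_price(items)
  assume("items is an Array") { items.is_a?(Array) }
  assume("every item has a numeric :price") do
    items.all? { |item| item[:price].is_a?(Numeric) }
  end
  items.sum { |item| item[:price] }
end

puts total_price([{ price: 5 }, { price: 7.5 }])

begin
  total_price([{ price: 5 }, { price: "7.50" }])
rescue AssumptionError => e
  puts e.message
end
```

When an assumption fails like this, the error points at the part of the mental model that needs updating rather than at a confusing failure further downstream.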

00:25:12.480 Where does that leave us? When I started this talk, I said I wanted to answer the question of what we do when we debug. I've discovered that debugging is not all that different from the fundamental activities we engage in daily while building software. Debugging combines learning new information and applying that knowledge to complex problems.

00:25:57.680 Just as with writing software, it follows an iterative process; we go through this cycle repeatedly. Much like programming, debugging is a learnable skill that everyone can master with the right tools and environment for success.

00:26:48.800 While crafting this talk, I contemplated that earlier version of myself who viewed debugging as a mystical ability and believed those around me were wizards. Reflecting on that experience, I empathize with anyone starting out in this field, as it's easy to watch someone else debug and wonder if you will ever learn how to do it.

00:27:33.680 What we interpret as magic or intuition is actually just experience. Julia described it perfectly when she wrote: 'I often think about how programming is running into the same bugs repeatedly and how being exposed to common bugs and being shown how to handle them gives you a huge advantage.' She's spot on.

00:28:18.640 For the early-career professionals among us, it’s essential to remember that when you see someone debug a problem, even if they don’t articulate their thought process, they’re following a process of their own. That engineer may simply have more experience than you or occupy a higher position in Bloom's Taxonomy. What seems like magic to you is, in fact, years of accumulated experience.

00:29:17.120 For those of us with experience, it's crucial to make debugging accessible. We need to demystify it so that newcomers understand debugging is a skill they can learn. We ought to explain what we know, why we know it, and how we came to acquire that knowledge, so they can adopt our processes, our mental models, and our approaches to debugging.

00:30:08.960 We must teach the skills surrounding hypothesis formation, verification, and refutation, applying that to our roles. If we’re pairing with someone, we should meet them at their current level, recognizing which Bloom's Taxonomy level they occupy, empowering them to gather the knowledge they need to ascend to the next level.

00:30:43.280 Ultimately, debugging is not magic. I know it’s part of the talk title, added to attract your attention, but the reality is that debugging is an organized process—a science. It becomes easier with practice, and like any science, it requires examining our assumptions, maintaining a flexible perspective, and being open to being proven wrong.