Debugging

Summarized using AI

GDB: A Gentle Intro

Jason Clark • November 21, 2015 • San Antonio, TX

In the video "GDB: A Gentle Intro" presented by Jason Clark at RubyConf 2015, the main topic is the use of GDB (GNU Debugger) to debug Ruby programs, particularly when encountering issues like deadlocks or hangs due to native extensions or threading issues. This presentation aims to introduce attendees to GDB without requiring prior knowledge of C programming, demonstrating how to effectively leverage GDB to troubleshoot Ruby applications.

Key points covered in the video include:
- Introduction to GDB: GDB is a powerful tool used for debugging processes at the C level. It is essential for Ruby developers, especially when dealing with the Ruby VM, which is written in C.
- Common Debugging Scenarios: Clark shares a scenario where a Ruby program is hung due to a potential deadlock, illustrating the need for a debugger to inspect what may be happening under the hood.
- Starting with GDB: Viewers learn how to attach GDB to a running Ruby process, using commands like 'gdb -p' and exploring output like backtraces with 'bt', providing vital information on where the process is paused and what functions are active.
- Understanding Threads: The presenter explains how to examine multiple threads with 'info threads' and use 'thread apply' to see backtraces from all threads, helping to identify deadlocks.
- Exploration of Ruby's uniq! method: Clark investigates a puzzling behavior of Ruby's uniq! method, using GDB to inspect the method’s internals and understand its design decisions, revealing how it can return 'nil' in specific cases.
- Utilizing Documentation: He emphasizes the importance of Ruby documentation for bridging Ruby methods to their C implementations, demonstrating how to set breakpoints in GDB for deeper analysis.
- Advanced GDB Features: Clark mentions several GDB features for effective debugging, including scripts for automating commands, managing file descriptors, and examining memory structures related to Ruby objects.
- Conclusion and Takeaway: The session concludes by encouraging developers to explore GDB for debugging Ruby processes and emphasizes the benefits of understanding the low-level operations of Ruby, which can lead to more effective debugging and problem resolution.

Overall, the talk serves as both an introduction and a practical guide for Ruby developers looking to improve their proficiency with debugging native extensions and internal Ruby mechanisms, fostering greater insight into the Ruby programming environment.

GDB: A Gentle Intro
Jason Clark • November 21, 2015 • San Antonio, TX

GDB: A Gentle Intro by Jason Clark

We love Ruby’s elegance, simplicity, and flexibility. But our favorite language perches atop a world of native code. When that other world intrudes on your peaceful Ruby, GDB, the venerable GNU debugger, is the tool to turn to.

We’ll examine setting up Ruby to work with GDB. We’ll learn the fundamental commands, and soon you’ll be debugging with ease. We’ll even peer deep into Ruby object internals and face down crashes, deadlocks, and bugs.

Whether you’re writing a native gem, hacking the Ruby VM, or just want a glimpse of the layers below, this talk is for you!

Help us caption & translate this video!

http://amara.org/v/H0nH/

RubyConf 2015

00:00:17.810 Good afternoon, everyone! Thank you for hanging tough until the last session before the closing keynote. My name is Jason Clark, and I work as an engineer at New Relic. This background informs a little bit of what we're going to talk about today.
00:00:25.230 I'm going to start off with a couple of different examples that you may be able to relate to. The first one is: has anyone here ever had a Ruby program that's running but stopped? It's still there but not doing what it's supposed to do; it’s deadlocked in some fashion or hung up somewhere, and you don't really know what's going on.
00:00:43.710 You look at the process, and while it is there, it's hardly turning in any CPU cycles. There's just no detectable activity. You check the log file, but it has nothing. Imagine you're in this scenario, and just to up the ante a little: perhaps your co-workers have been talking about adding some multithreading support to this service. They believe it will make it much faster if we put some threads in there.
00:01:14.670 Suddenly, you have this deep and abiding dread because you think you might have a deadlock. Your Ruby process is stuck, and you have no idea why, nor do you know where in the code this is happening. That's the sort of situation where GDB can come to your rescue.
00:01:35.190 GDB stands for the GNU Debugger. It is actually a very old tool; the first version was released in 1986. However, it is a C-level debugger, among other things, with lots of extensions. It allows you to dig into the native layer of code beneath where your Ruby is executing. We’re at RubyConf—why am I talking about a C debugger? Why do we care about this?
00:01:59.880 The majority of us use MRI, which is the reference implementation for Ruby, also known as Matz's Ruby. The truth is that Mats actually prefers we call it CRuby. The Ruby VM that executes your code is written in C, and GDB, as a tool, has visibility into that layer. It lets us see what's happening underneath as we try to solve these problems.
00:02:27.890 This talk is intended as a gentle introduction; you're not expected to know C. I will step you through how you can use GDB as a tool to understand what's happening in these under-the-hood processes.
00:02:52.820 With our stuck process, there are a couple of conventions we will follow during this talk. When there's a dollar sign at the beginning of a line, it indicates a command we will issue at our terminal. At our terminal, we're going to say 'gdb -p' followed by the process ID of the process that we care about.
00:03:05.980 GDB is going to throw out a fair amount of output. It will inform us that it's attaching to the process, and you may see it spit out information about symbols it's loading. Eventually, you'll arrive at a prompt within GDB where you can issue commands. GDB has taken the process that we were running, paused its execution, and is now asking us, 'What do you want to do?' The first thing you might want to know is where exactly the execution is paused.
00:03:50.420 So, the natural command to issue is 'bt' for backtrace. When you do this, the output may be much more extensive than I've shown here; GDB can generate some pretty large stack traces. However, this isn't something to be scared of. It looks complicated, but it is basically just a stack trace showing us the methods and files that we've stepped through. Let's try to break down what GDB has shown us and see where it is similar to what we know.
00:04:23.060 The output is lengthy, but it gives us nice line numbers as some of the lines may wrap, which will help in tracking frames. Each frame shows the function name associated with it. You can think of these frames a lot like our Ruby methods, which you would identify in your code. If the appropriate debugging symbols are available, it will even tell you where in the source code that came from.
00:05:09.740 For instance, you see something in our Unix system dependencies, and on the third one, that corresponds to MRI itself. If we reference line 42 of that file, we could look and see exactly where this is executing. This information is quite pertinent; we're suspecting some sort of deadlock, and MRI methods conventionally start with an 'rb_' in the function name. Here we have 'rb_mutex_lock,' which certainly smells like a deadlock.
00:05:55.149 This is a snapshot of what's going on in this process, but simply observing one thread in a lock doesn't provide us a complete picture. Let’s ask GDB what else is going on in the process. If we use 'info threads,' GDB will give us a list of all the threads in the current process and show us where they're actually executing.
00:06:22.999 The threads are numbered, and those numbers remain consistent across the lifetime of the thread. As seen in the backtrace, GDB provides the function names. Importantly, we can see that this 'pthread_cond_wait' was the top of the frame for that backtrace we were paused on, and here we have two separate threads both waiting on a condition in the pthread library.
00:06:59.999 This gives us further information and reinforces our theory about the deadlock, but we want to see how we arrived at those locations. GDB provides a command called 'thread apply' that allows us to select certain threads and have GDB run a specific command across those threads.
00:07:29.999 The command we can use is to show all the backtraces for all the threads. If we execute that command, we notice that thread number three has the 'rb_mutex_lock'—exactly where we had been paused before. Additionally, thread one is also paused at that same lock. With this information secured, we can approach our co-worker—who thought they were clever for adding multithreading—and point out that they likely introduced a deadlock. Hopefully, they are wise enough to fix it!
00:08:27.130 Now, to exit GDB, we type 'exit.' However, 'exit' isn’t actually defined, unfortunately, so we simply type 'quit' instead.
00:08:40.830 Before moving on, let’s review briefly what we’ve discussed. We have seen how to attach GDB to a running process. You can do this with any process on your system as long as you have the permissions to do so. However, it’s wise to avoid critical processes since it can be easy to make things stop or behave badly from within GDB. Nonetheless, it is tremendously useful to stop processes to inspect and debug them. We've explored several commands that allow us to look at the state of our processes and understand where our code is executing.
00:09:40.800 Lastly, having found the answers we needed, we can gracefully exit GDB and return to our tasks in the real world. That is a concrete example of using GDB to debug a live process. However, I have found GDB is also incredibly useful for exploration.
00:10:43.990 Next, we're going to discuss an intriguing problem that I encountered in Ruby itself. I didn’t quite understand why it was behaving the way it was, and we can utilize GDB to dig into the details and figure out what’s happening in the VM beneath us. The specific setup for this is the 'uniq!' method. You see, this method takes an array and is supposed to return the list of unique elements.
00:11:38.480 The bang methods, as you may know, are mutating methods. Typically, there is a non-bang counterpart available in the standard library, which means it should modify the original array in place. In this case, the actual array we have would modify its members. However, when we use 'uniq!', it also surprises us by returning the array of unique elements.
00:12:02.760 If you call 'uniq!' on an array that contains no duplicates, you get back 'nil,' which seems puzzling. When I first encountered this behavior, I was sure something strange was occurring. To investigate further, we will set up a little script that replicates this bug, allowing us to analyze what’s happening.
00:12:42.350 When we execute this script, we output the array that has been processed by 'uniq!' and observe that it returns the expected results. However, it also returns 'nil.' This was quite perplexing. How do we initialize this with GDB?
00:13:07.140 As expected, GDB has facilities to start an arbitrary process instead of attaching to one that is already running. You can achieve that using '--args' from the command line. We provide the same command you would normally use to run that Ruby script. GDB starts up and drops us to a prompt. The program is paused before anything happens.
00:13:43.240 To get it started, we instruct GDB to run the program. As before, we see a variety of outputs. There will be several outputs detailing the handling of threads and debugging symbols. Yet in the midst of all that, we see output to standard output, which matches what we anticipate from our program, confirming it's running correctly.
00:14:31.370 Now that we've established a process to run the same test case repeatedly from GDB, we want to analyze what's happening specifically when we invoke this method on an array. To do this, we need to find a way to instruct GDB to pause at the specific point of interest.
00:15:40.270 Fortunately, we have the Ruby documentation as a valuable resource to guide us through the bridge between Ruby code and C functions. For instance, if you lookup the 'uniq!' method in the Ruby documentation, you can toggle and expand a section showing the source code for that Ruby method.
00:16:46.790 While some methods might be implemented in Ruby, many core features of Ruby are actually implemented in C. For those unfamiliar or a bit rusty, critical pieces to understand include that C functions are key for pinpointing locations to debug in GDB.
00:17:32.650 Now, if we look at the code that appears when we expand the Ruby docs, we notice fundamental pieces resembling the C functions we discussed. For example, the return value is declared above the function name while 'value' denotes the type and refers to a Ruby object that is passed around throughout MRI.
00:18:49.290 We also see the code for 'uniq!' takes an array as the parameter and is designed to return a Ruby object. The key takeaway to progress with our debugging is the identifier 'rb_array_uniq_bang,' which GDB can utilize to halt execution when this method is accessed.
00:19:29.130 As mentioned before, the 'rb_' prefix is standard in MRI to indicate Ruby methods, while 'ary_' denotes functions related to the array class. This naming convention helps link C functions with their Ruby equivalents.
00:20:33.550 To invoke GDB so it breaks every time 'rb_array_uniq_bang' is called, we set that up during our initial run command. If we then execute this, rather than completing the program all the way through, GDB halts and presents us with a prompt, allowing us to inspect what is occurring around that location.
00:21:13.430 Now let's take a moment to survey where we are. Since we've gotten to this paused state with GDB, we can inspect the variables available to us and check their values. GDB will also show us the source code preceding and succeeding the line we're currently paused at.
00:22:01.020 Within this paused execution, we can display any variable at hand, including our array parameter. However, displaying this array yields an address in memory—essentially a pointer, indicating where in memory the array data is stored.
00:23:06.620 What we really desire is to see the actual values within this array. To comprehend how those values are organized in memory, we introduce the concept of a struct—a way that C organizes and groups data.
00:23:59.310 For instance, a struct called data might represent an integer in memory for us. When we create this struct, the layout corresponds to our specification, allowing us insights into how Ruby objects relate to their internal struct definitions in C.
00:24:49.750 Hence, every Ruby object operates with an underlying struct called 'RBasic.' This contains flags indicating the type of Ruby object and a pointer to its class type. This principle means that we can interpret object attributes yet require structural familiarity to unwrap their details.
00:25:56.900 If we execute GDB commands to examine the 'RB' structures, GDB displays their details neatly, allowing us to decipher what data is present in memory. However, this can be complicated when you're thinking about the underlying structure of arrays. Fortunately, Ruby introduces 'rb_obj_print,' a function that helps us print a Ruby object’s state.
00:26:56.550 By invoking this function, we can get a clear representation of our array. Yet as we pause and check, we notice that the array yields a string rather than the expected array elements. This prompts us to further investigate its origin.
00:27:55.250 Utilizing 'gdb list,' we can review the nearby code context as well as the values around us. Meanwhile, the 'display' command will print a neat representation of any local variable that’s accessible during our paused state.
00:28:44.550 As we explored the output, we noticed some unexpected values in our array, which raises questions of their origins. Following this exploration, MRI provides another helpful function, 'rb_backtrace,' which gives us Ruby's perspective on the current stack.
00:29:35.610 Upon inspecting the backtrace, we realize that the execution flow isn’t focused on our script but rather through Ruby gems that are invoked before our method executes. So, as per the user experience, Ruby preparations that run upon starting the script led it to call 'uniq!'
00:30:38.490 Knowing this allows us to address the root cause of our issues, and if we say '--disable-gems' when running our Ruby code, we find that the expected output is achieved, as we're now directly invoking our script.
00:31:19.820 Moving further into the unique bang method, it is practical to set breakpoints throughout without cluttering our execution flow. We start by checking the length of our array; if it's less than or equal to one, we return 'Qnil'—the C representation of 'nil'. This line shows the intention of design to return 'nil' in cases where the input is already unique.
00:32:54.950 Thereafter, the method checks if a block is given; otherwise, it maps the array into a hash based on the values presented. This leads us to realize that our intuition about memory efficiency might be misleading because it trades off memory usage by adding yet another structure.
00:34:31.380 This means that if the initial length matches the created hash size, we again return 'nil.' Thus, the behavior we saw arose from the function returning 'nil' meaningful contextually.
00:35:05.440 Having examined this method's internals, we can guide our process forward—back to our active execution! Using 'continue,' we can allow the program to resume its course.
00:35:45.540 Throughout this section, we’ve learned how to traverse methods inside Ruby while also utilizing GDB’s extensive features. Moreover, Ruby's documentation provides robust guidance on connecting Ruby's interface with its underlying native code.
00:36:39.740 Finally, as we conclude, I’d like to mention a few unconventional tricks you can implement once you get accustomed to operating under this level. Sometimes, GDB's output can be perplexing, especially if it conflicts with your application writing logs to standard output. A novel approach involves closing file descriptors for standard out and error, redirecting them to log files.
00:37:47.040 This achieves cleaner debugging, as it eliminates interleaved output that can distract during the inspection. GDB also allows executing arbitrary Ruby code via 'call' which provides further insights into the backtraces of all threads—albeit with certain risks.
00:39:10.390 GDB also allows scripting commands to automate tasks, and we have developed a script that captures thread backtraces for effective debugging. With a simple setup, you can customize your GDB environment to streamline commands and facilitating smoother operations with pleasant little features.
00:40:09.520 Let's wrap up with a recap: We started off talking about how GDB can effectively debug Ruby processes, especially in situations where deadlocks occur. Looking at a unique problem, we navigated the depths of initializing code, encompassing a variety of techniques and capabilities.
00:40:57.900 By exploring GDB alongside Ruby’s integration with C, we broadened our understanding of how low-level programming intersects with high-level debugging. If the content sparks curiosity, I hope that you'll ponder experimenting with GDB yourself and explore processes on your machines in the near future! Thank you!
00:42:53.800 So the question is, did I find out why uniq! returns nil? The answer is no, but it is clearly the specified behavior of that method.
00:42:56.510 During the Q&A, someone asked whether you need to recompile Ruby with a debugging flag, and the answer is yes; you would need to check specifics based on how you're installing Ruby. If you want a fast play environment, the official Docker images for Ruby on Docker Hub have all the necessary setups.
00:43:22.350 Regarding re-entrancy issues encountered while debugging, I haven’t personally run into that, but it is indeed a possibility.
00:43:36.570 Additionally, yes, you can disable the garbage collector or check its running status during debugging, which is indeed advisable when setting up test cases.
00:43:44.920 Thank you all for your attention!
Explore all talks recorded at RubyConf 2015
+80