RubyKaigi 2023

Tips and Tricks for working in the MRI Codebase

00:00:01.740 Hello everyone! I'm delighted to be here.
00:00:06.620 RubyKaigi is well known for having some really deep and incredible technical talks, and I wanted to highlight a few that all have something in common. You might have seen some of these talks today, or maybe you saw some yesterday, or perhaps you'll see some tomorrow.
00:00:15.480 The commonality you might have noticed so far between all of these talks is that they all involve, in some way, working with CRuby. This isn't a complete set, just a selection of talks from this year's conference that involve working in CRuby.
00:00:25.680 You might be sitting in one of these talks, or have been sitting in one of these talks, thinking to yourself, 'How can I dig into the CRuby codebase? Is it something I could do too? What are all these people doing and talking about? Could I contribute to CRuby?' Or maybe you learned something or saw a snippet and had a thought about changing something or doing something differently, and you thought, 'How can I test something that I've learned?'
00:01:00.840 Today, in this talk, we're going to discuss tips and tricks for working in the CRuby codebase. You might learn a few tips for eventually being able to do something similar.
00:01:12.900 I'm Jemma, and if we haven't met yet, I'd love to meet you at some point during the conference. I spoke last year at RubyKaigi about a project I was working on to implement object shapes in CRuby, which has since been merged. Through working on that, I really learned how to work within the CRuby codebase in a way that worked well for me, and I'm hoping to share some tips and tricks from that experience that you can apply.
00:01:51.240 We'll tell this as a story of three bugs. We'll look at three separate bugs that have been filed in CRuby, and from each of them, we'll learn a little more about how the codebase works and how we can work in it.
00:02:15.320 In case you didn't know, you can report bugs to CRuby at a specific URL, and there's a thorough Wiki that explains how to report the bugs. Now, let's talk about our first bug.
00:02:26.520 This bug was filed by one of my co-workers at Shopify, David Stosik, who is also a Tokyo local. He reported that 'array sum' and 'enumerable sum' sometimes show different behaviors.
00:02:36.900 A thorough bug description includes a minimal reproduction, and David provided one. To understand the bug, we first need to look at the class he uses, which is a float wrapper. The critical part of this float wrapper is the plus method, which adds its float (the float within the float wrapper) to anything that implements 'to_f', and then wraps the result in another float wrapper.
00:03:00.240 As you can imagine, we could add together two float wrappers, say with float values 1.5 and 3.5, and we would get back a float wrapper with a float value of 5.0. This plus method also allows us to add a float wrapper to a float or an integer, or anything that implements 'to_f'. Therefore, we could also add a float wrapper with a float value of 1.5 to, say, 3.5 and similarly get back a float wrapper with a float value of 5.0.
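The float wrapper described here can be sketched as follows. This is a reconstruction from the spoken description, not the code from the bug report: the instance variable name and the exact constructor are assumptions.

```ruby
# Hypothetical reconstruction of the float wrapper from the description above.
# The ivar name (@float) and constructor shape are assumptions.
class FloatWrapper
  def initialize(value)
    @float = value.to_f
  end

  def to_f
    @float
  end

  # Accepts anything that implements to_f and wraps the result again.
  def +(other)
    FloatWrapper.new(@float + other.to_f)
  end
end

wrapper_sum = FloatWrapper.new(1.5) + FloatWrapper.new(3.5)
mixed_sum   = FloatWrapper.new(1.5) + 3.5

puts wrapper_sum.to_f # 5.0
puts mixed_sum.to_f   # 5.0
```

Both additions return a new FloatWrapper rather than a bare Float, which is the property the rest of the bug depends on.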
00:03:22.500 So that’s how this float wrapper class David is using works. But what was he saying was the crux of his bug? He was saying that 'array sum' and 'enumerable sum' sometimes show different behaviors.
00:03:40.920 Let’s look at those behaviors. The first case he described is enumerable sum, which takes an array containing the element seven, and adds a float wrapper to it. As expected, we get back a float wrapper with a float value of 7. If we call array sum with the same array and pass in the same float wrapper as an argument, we instead get back just the number seven—not a float wrapper containing seven—which David accurately described with an emoji as an explosion. They show different behaviors.
00:04:11.880 The first question David asked was, 'What do the docs say?' He included this within his ticket, stating what the documentation said. If the documentation doesn’t reflect the behavior, that means one of two things is true: either the docs are inaccurate, or the behavior itself is wrong. Fixing documentation is a great way to contribute to CRuby; there's a whole documentation guide on how you can add or improve documentation.
00:04:44.640 My co-workers Stan, Peter, and I actually worked on improving the documentation briefly last year. So let's look at the expected behavior of array sum David was talking about. He included a snippet which is straight from the docs: it starts with sum as the initial value, iterates over the array to add each element to the sum, and then returns the sum. In this case, our array contains just the number seven.
00:05:00.960 If we run this, we get back the expected result, a float wrapper, which is not what array sum actually returned: just the number seven. The documentation did not match the behavior. But the crux of what we want to do here is to learn how we can understand what’s happening. We can demonstrate this behavior, but how can we go behind the scenes, see through another layer, and understand what’s actually happening?
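The documented behavior described above (start from the initial value, add each element, return the total) can be written out as a small Ruby method. This is a sketch of that description only: `documented_sum` is a hypothetical name, and the FloatWrapper here is the same assumed reconstruction as before, repeated so the snippet runs on its own.

```ruby
# Assumed reconstruction of the float wrapper (see the description earlier).
class FloatWrapper
  def initialize(value)
    @float = value.to_f
  end

  def to_f
    @float
  end

  def +(other)
    FloatWrapper.new(@float + other.to_f)
  end
end

# Hypothetical helper spelling out the loop the documentation describes.
def documented_sum(array, init)
  sum = init
  array.each { |element| sum += element }
  sum
end

result = documented_sum([7], FloatWrapper.new(0))
puts result.class # FloatWrapper
puts result.to_f  # 7.0
```

Because the loop only ever calls `+` on the running total, the wrapper type is preserved. A result that is a bare number means the real implementation took a different route.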
00:05:25.320 There are two steps I want to talk about: the first is how to build Ruby locally, and the second is how to test Ruby locally. So first, let’s talk about building. There are three main commands here: the first one should make sense; we want to clone the repo down. Next, we're going to want to run the autogen command, which will generate our configure file.
00:05:42.840 Lastly, we're going to want to configure CRuby. The prefix we're passing specifies the directory on our local machine where Ruby would be installed if we later run an install command. There are some configure flags I want to call out explicitly that can speed up installation or our builds later on: '--disable-install-doc' and 'disable gems'. Installing the documentation takes a little bit of time, and if you're not working on documentation, there's no need to include this when you build. You can add these flags right at the end of the configure line.
00:06:00.960 The second thing I wanted to touch on is setting variables for debugging. There are three I wanted to mention if you want to build Ruby in debug mode. We’ll see later on in this talk what that means, but it gives you another set of tools and lets you use all the debugging capabilities.
00:06:06.839 The first is 'debugflags', which sets the debugging options for the compiler. The '-g' flag generates debugging information, which is invaluable when using a debugger like GDB to debug the Ruby interpreter. 'optflags' sets the optimization level for the compiler, and '-O0' disables optimizations, which makes debugging easier at the cost of slower execution.
00:06:34.020 The last one, 'cppflags', sets the preprocessor options for the compiler. '-DRUBY_DEBUG=1' defines the RUBY_DEBUG macro to enable additional debugging features or checks within the source code. You’ll want to set those flags right within that configure line.
00:06:45.360 One tip I want to share is that it can be really helpful to use separate build directories with different configurations. You might want to keep a debug build alongside a non-debug build or a YJIT build, and any directory whose name starts with 'build' is ignored by Git. This allows you to organize your builds and maintain them more effectively.
00:07:02.940 So, what actually happens when we're building Ruby? There’s something worth mentioning here: we will have Ruby itself and also have miniruby available to us. As the name suggests, miniruby is much more minimal and lightweight. It does not have the standard library or some other features, which results in a much faster build time and quicker cycles during debugging.
00:07:23.160 When you don't need the standard library, it can be beneficial to just use miniruby. Ruby, by contrast, is full-featured and contains all of these components, which leads to longer build times. So, what’s the command to actually build Ruby? For Ruby, it's 'make -j', where '-j' allows us to build in parallel. For miniruby, it’s 'make miniruby -j'.
00:07:44.880 Now that we’ve talked about how to build Ruby locally, let's go into how to test Ruby locally. The Ruby test suite is in the 'test' directory, while the miniruby test suite is found in the 'bootstraptest' directory.
00:08:01.260 To run the Ruby tests, we execute 'make test-all -j'. In this case, we need to pass a number to '-j' that specifies how many processes to run. Usually, we can just use '-j' on its own to parallelize, but for 'test-all' we need to specify the number so the test runner knows it, because it cannot be inferred.
00:08:20.580 Optimally, the number of processes should be slightly more than the number of cores. btest, on the other hand, cannot be parallelized, so we simply run 'make btest' to run the lightweight miniruby tests.
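To pick a number for '-j', one option is to ask Ruby for the core count. The sketch below assumes that "slightly more than the number of cores" means cores plus one; that offset is a guess, not something stated in the talk.

```ruby
require "etc"

# Assumption: "slightly more than the number of cores" is taken as cores + 1.
cores = Etc.nprocessors
jobs  = cores + 1

# Print the command rather than running it, so the snippet has no side effects.
puts "make test-all -j#{jobs}"
```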
00:08:38.420 We can also specify specific tests or files containing many tests using the 'TEST' environment variable for miniruby. This is effective, and as you’re debugging, it can be quite useful to add more test files or cases to a test suite that already exists.
00:08:53.520 In the case of the bug David filed, we would want to run a local test file. Fortunately, David kindly provided a snippet that we can save directly into a file named 'test.rb', which is ignored by Git, and the make commands are set up to reference this file directly. For example, to run this file with Ruby, we would use 'make runruby', and with miniruby, we can use 'make run'.
00:09:12.360 Let’s do that. We talked earlier about how we want to use miniruby in this case. If we call 'make run', we get the expected result in the enumerable sum case, as we saw earlier, but in the array sum case we get 7.0.
00:09:27.960 So that's wonderful, right? We can run this test file using our local Ruby. But the next thing we want to do is find the appropriate functions to understand where the issue is happening, to figure out how we can fix it.
00:09:49.680 We’re looking for where array sum lives, as we think it may be broken. The CRuby codebase has an immense number of lines of code, and it can be a behemoth to search through, even using grep. Here, I want to mention a few things: one is the 'rb_define' prefix, which is shared by many functions, and, as the name implies, these are definitions.
00:10:21.960 For instance, there's 'rb_define_method', which we will use in a second, and many other functions like 'rb_define_singleton_method', 'rb_define_class', 'rb_define_module', etc. All of these take strings as arguments to name what they're defining. If we're looking for the sum method, the call will take the string 'sum'. A great tip here is to look for a specific function by grepping for the method name in quotes. For example, 'sum' in quotes will yield far fewer matches than just 'sum'.
00:10:50.920 If we grep for 'sum' in quotes, the first two things we'll see are the 'rb_define_method' calls for array sum and enumerable sum, where the third argument in both cases points directly to the corresponding C function we're looking for, namely 'rb_ary_sum' and 'enum_sum'.
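The quoted-name search can be illustrated with a small Ruby script over a few sample lines. The sample lines are written for illustration and only modelled on what 'array.c' and 'enum.c' contain; they are not copied from a specific revision, and `find_definitions` is a hypothetical helper.

```ruby
# Illustrative sample lines, modelled on CRuby sources (an assumption, not a
# copy of any particular revision).
SOURCE = {
  "array.c" => [
    'rb_define_method(rb_cArray, "sort", rb_ary_sort, 0);',
    'rb_define_method(rb_cArray, "sum", rb_ary_sum, -1);',
    'static VALUE rb_ary_sum(int argc, VALUE *argv, VALUE ary)',
    '/* Array#sum adds each element to the initial value */',
  ],
  "enum.c" => [
    'rb_define_method(rb_mEnumerable, "sum", enum_sum, -1);',
    'static VALUE enum_sum(int argc, VALUE *argv, VALUE obj)',
  ],
}

# Returns [file, c_function] pairs for rb_define_* calls naming method_name.
# The capture group is the third argument: the C function implementing it.
def find_definitions(source, method_name)
  pattern = /rb_define_\w+\(\s*\w+,\s*"#{Regexp.escape(method_name)}",\s*(\w+)/
  source.flat_map do |file, lines|
    lines.filter_map do |line|
      match = pattern.match(line)
      [file, match[1]] if match
    end
  end
end

all_lines      = SOURCE.values.flatten
plain_matches  = all_lines.count { |line| line.include?("sum") }
quoted_matches = all_lines.count { |line| line.include?('"sum"') }

puts "lines containing sum:   #{plain_matches}"  # 5
puts "lines containing \"sum\": #{quoted_matches}" # 2
find_definitions(SOURCE, "sum").each { |file, func| puts "#{file}: #{func}" }
```

Even in this tiny sample, the quoted search cuts the matches down to exactly the definition lines, and the third argument names the C function to open next.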
00:11:20.760 Another tip is to look in an appropriate .c file. Most times, whatever you’re investigating will have a .c file located in the root directory. For demonstration, if we list the files in the Ruby root directory, we should see the one we’re looking for, such as 'array.c'.
00:11:45.360 Once we find the function we're looking for, we can set about debugging it. In an earlier talk, I learned about building a mini Ruby debugger, which focuses on the bottom layer we're concerned with today, which is CRuby itself.
00:12:00.900 Just as we described, the two debuggers we can use here are GDB and LLDB. Let's now look into debugging with LLDB. To run our test file with LLDB using Ruby, we would use 'make lldb-ruby', and for miniruby, it's similarly 'make lldb'.
00:12:25.320 When we run our test file under LLDB, we get some output and then an LLDB prompt. We’ll need some knowledge about LLDB from here. The first thing we want to do is set a breakpoint; specifically, we want to break on the 'rb_ary_sum' function. Typing 'b' on its own lists all breakpoints, and to break at a specific file and line we can type 'b file:line', for example 'b array.c:1234'.
00:12:47.640 Alternatively, we can set a breakpoint directly on the function name by using 'b rb_ary_sum'. The next step is to run the file, using 'r' for run at the LLDB prompt. This will break right at 'rb_ary_sum', and we want to look at what it's returning since we know that wasn’t what we expected.
00:13:18.360 Rather than copying a lot of code repeatedly, we can step through to reach the return line, which returns the result of 'DBL2NUM', a numeric value. Right away, we can see our issue here: we didn’t want just a number back; we wanted a float wrapper.
00:13:42.060 One of my colleagues, Jean, has since fixed this bug by not taking the fast path if the initial value isn’t a native numeric type. If it isn’t a native numeric type, we go to the slow path and avoid calling 'DBL2NUM', which is the crux of the change.
00:14:06.480 Now, let’s move to the second bug. We had a bug report titled 'Expecting system stack error but crashing'. It concerned some code I wrote related to object shapes. I won’t delve into the specifics of my work; the crashing part is the key focus.
00:14:27.600 This bug report also contained an example we can take advantage of and plug directly into our test.rb file. If we run 'make lldb' now and execute 'run', it will take us directly to the crash, stopping the process at that point.
00:14:49.620 There are a few more LLDB commands that will be useful. 'bt' can show the backtrace, and we can pass a number of frames as an argument, for example 'bt 7' will show seven frames. If we want to navigate to a specific frame, we can use 'f number', and this will jump us to that designated frame.
00:15:14.520 In this instance, we can observe that this has to do with assertions only being called in debug mode. The person who filed this bug was running Ruby with debug mode enabled, making that assertion the breaking point.
00:15:37.320 The last two LLDB commands to remember are 'up' and 'down', which move us up or down a frame, respectively. The LLDB helpers defined in 'misc/lldb_cruby.py' can be particularly beneficial. One notably useful command is 'rp', which prints Ruby objects.
00:15:57.840 If you have some Ruby integer, for example, named 'two', with a value of two, simply printing that integer in LLDB wouldn’t yield helpful results. However, calling 'rp' on it would show the Ruby value, which is much more informative for debugging purposes.
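One reason raw printing is unhelpful: CRuby stores small integers directly in the VALUE word in a tagged form, so a plain print shows the tagged word rather than the number. The sketch below models that encoding in Ruby; the helper names are hypothetical, and it only illustrates the arithmetic, not what 'rp' does internally.

```ruby
# Models CRuby's small-integer ("fixnum") tagging: the integer is shifted left
# by one bit and the lowest bit is set. Helper names are hypothetical.
def fixnum_to_value(number)
  (number << 1) | 1
end

def value_to_fixnum(value)
  value >> 1
end

raw = fixnum_to_value(2)
puts raw                  # 5
puts value_to_fixnum(raw) # 2
```

So for the integer two, a raw print of the underlying word would be expected to show 5, while a helper that understands the encoding can show 2.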
00:16:22.320 One final tip is to use 'rb_bug' if you're repeatedly using LLDB and want to skip past points you have already examined. If you add a call to 'rb_bug' at the place you care about, the process stops right there, so you won’t have to set multiple breakpoints again.
00:16:45.180 Some folks may not enjoy debugging, and I completely understand that. If you prefer using puts for debugging, a useful tip is to create a puts line that utilizes '__FILE__' and '__LINE__'.
00:17:07.680 This will allow you to print out something like 'array.c:1298' and then you can use it in multiple places without difficulty, yet still identify where it was triggered.
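The tip above is about C, where '__FILE__' and '__LINE__' are preprocessor macros. Ruby has keywords with the same names, so the same pattern can be sketched in Ruby, for instance when debugging Ruby code rather than C. The helper name `here` is hypothetical.

```ruby
# Hypothetical helper: formats a "file:line" marker, optionally with a note.
def here(file, line, note = nil)
  location = "#{File.basename(file)}:#{line}"
  note ? "#{location} #{note}" : location
end

# The same line can be pasted in several places; each call reports where it ran.
puts here(__FILE__, __LINE__, "before the loop")
puts here(__FILE__, __LINE__, "after the loop")
```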
00:17:29.520 Now, moving on to our third bug: it is about supporting IPv4-mapped IPv6 addresses, specifically within 'IPAddr#private?'. So what we are looking for has to do with IPAddr and its 'private?' method.
00:17:53.520 We use the technique we discussed earlier and search for the method name in quotes, but we find nothing for IPAddr. If we search for 'private?' without quotes, we are led to a Ruby file, which tells us the method is defined in Ruby.
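To make the terms concrete, here is a small sketch using the 'ipaddr' standard library. The address is an arbitrary private IPv4 address chosen for illustration. The last line prints whatever the installed IPAddr version returns for the mapped form, since that is exactly the behavior the bug is about and it may differ between versions.

```ruby
require "ipaddr"

plain  = IPAddr.new("192.168.0.1") # arbitrary private IPv4 address
mapped = plain.ipv4_mapped         # the same address as IPv4-mapped IPv6

puts plain.private?         # true
puts mapped.ipv4_mapped?    # true
puts mapped.native.private? # true

# Version dependent: this is the case the bug report asks to support.
puts mapped.private?
```

Converting back with 'native' first gives the expected answer on any version, which shows what the mapped form ought to report.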
00:18:13.880 Upon inspection, we see precisely what we should change to resolve this bug. However, it is vital to note that this is in the lib directory, where items within the lib directory differ slightly from others in the Ruby codebase.
00:18:44.400 There's a documentation page regarding making changes to standard libraries, explaining that anything within the lib directory is mirrored from a different repository. Therefore, if you'd like to modify something that resides in the lib directory, it's not done within the Ruby codebase but rather within the corresponding repository.
00:19:09.120 To conclude, I want to share a useful redirect to a gem's repository. It uses the gem's metadata: 'gem.wtf' followed by a gem name, such as IRB, takes you straight to the related codebase.
00:19:29.640 That wraps up our story of three bugs, which I hope has provided greater insight into developing within CRuby.
00:19:50.520 If you're interested in learning more, I find these documents to be quite beneficial in explaining contributing to CRuby, how to make changes, build, test, and so on. I've also included a blog series that my colleague, Peter, wrote about C extensions, which may also help you as you learn to contribute.
00:20:02.099 Thank you so much for having me!