Gem Development

Source-Diving for Fun and Profit

by Kevin Kuchta

The video titled 'Source-Diving for Fun and Profit' features Kevin Kuchta at RubyConf 2019, where he discusses the valuable skill of diving into the source code of Ruby gems to troubleshoot issues and enhance development efficiency.

Kuchta shares his personal experiences with debugging, emphasizing how often developers face challenges when third-party gems fail or produce errors. He recounts common frustrations that arise when trying to fix bugs, such as relying solely on documentation or community forums like Stack Overflow. Kuchta aims to encourage developers to embrace the practice of exploring gem source code as a means to resolve issues more effectively.

Key points discussed in the video include:
- The Importance of Source Diving: Kuchta stresses that learning to explore a gem’s source code can significantly reduce time spent on debugging.
- Steps for Effective Source Diving: He introduces a simplified three-step process:
  - Take Me To Your Leader: Start by identifying the key files that contain the core logic of the gem. This involves locating the primary entry point of the gem and focusing on the largest files within the codebase to understand its structure.
  - If You See Something, Search Something: Kuchta advocates for utilizing search tools to find specific code snippets or error definitions within the gem. He provides examples of using various tools like grep, ag, and GitHub’s search capabilities to streamline this process.
  - Exploring Code for Clarity: Once the relevant sections are identified, developers should study the code in context to clarify misunderstandings or identify undocumented features.

Throughout the video, Kuchta shares anecdotes of his own encounters with bugs and how becoming comfortable with source diving allowed him to troubleshoot more efficiently. For instance, he describes instances where he discovered bugs were caused by incorrect assumptions in gem documentation or simply by inputting parameters incorrectly.

In conclusion, Kuchta emphasizes that while diving into source code won't solve every problem, it is a valuable practice that can lead to quicker problem resolution and improved understanding of third-party libraries. By fostering a culture of exploration, developers can enhance their skill set and overcome obstacles they encounter in their coding journeys.

00:00:13.299 Hi everyone, my name is Kevin Kuchta, and the other day I was writing some code.
00:00:18.580 If you've never seen me write code, this is what it looks like. Well, okay, something happened to me that happens entirely too often.
00:00:26.650 I found a bug, and when I encounter bugs, they look sort of like this.
00:00:32.469 They have long ears and waistcoats, but more importantly, this was a bug I wasn't quite sure about.
00:00:38.470 I didn't know where it was coming from. As you do, I started chasing this bug down.
00:00:44.770 You know, I chased it—chased it through the fields, drifted across a river, and ended up in the stack trace.
00:00:51.100 And in this stack trace, I lost it. It disappeared down a rabbit hole.
00:00:57.520 You might actually recognize the shape of this rabbit hole. Does this look familiar?
00:01:03.610 Notice how the lines right around here suddenly go from much shorter to much longer.
00:01:10.630 This rabbit hole is where the bug disappeared from my code into third-party gem code.
00:01:16.359 Like Alice, I walked up to the edge of this rabbit hole and tried to gather any context I could on where this bug had gone.
00:01:22.779 I wanted to know why it had disappeared, how deep this rabbit hole went, and whether I should pursue it.
00:01:28.959 I built up my courage and promptly gave up, spending the rest of the day staring at the clouds.
00:01:35.529 Thank you for coming to my talk. The end. Okay, obviously, I didn't actually do that.
00:01:41.889 This is a story that has played out many times in my career, and perhaps in yours too.
00:01:47.439 You trace the bug and find that it disappears into a gem, and you can't debug it any further.
00:01:53.889 For the first few years of my career, I sort of gave up. I turned back. I didn't have time to go down a metaphorical or literal rabbit hole.
00:02:00.520 But I still had to fix the bug, so I tried everything I could think of short of going down that rabbit hole to debug it.
00:02:07.149 I reread the documentation for the gem that was giving me trouble. I went to the GitHub issues page to see if anything was related to my problem.
00:02:13.060 I Googled everything I could associate with it. I went on Stack Overflow, asked a question, and waited for it to inevitably be closed as a duplicate of an unrelated question.
00:02:18.280 When that didn't work, I tried staring at the code really hard. When that failed, I resorted to everyone's favorite debugging technique:
00:02:25.090 rerunning the code repeatedly without changing anything to see if I got a different result.
00:02:30.130 I don't mind telling you, the fact that this has actually worked a couple of times in my career haunts me to this day.
00:02:35.980 But it usually fails, which leads me to falling back on everyone's debugging technique of last resort: screaming into the void.
00:02:42.280 That was usually the end of my process for the first few years of my career.
00:02:47.920 Until one day, in a fit of anger, I said to myself, "You know what? I'm just gonna go to GitHub, find the gem that's giving me trouble, clone the repository, and take a look."
00:02:52.930 I don't know exactly what I expected. I think I thought it would be sort of impenetrable code written by godlike programmers.
00:02:58.060 But lo and behold, I opened up this gem that was giving me trouble, and I found out it was just more Ruby code.
00:03:03.310 It was similar to what I had written. It was a little foreign, with some new idioms and new patterns, but I could read it.
00:03:10.540 I could explore this gem and figure out what was causing my problem.
00:03:16.030 Now, I'll admit, the first time this happened, the problem was just that I spelled some input wrong.
00:03:21.850 That wasn't a great use case, but other times I dove into a gem that was giving me trouble.
00:03:27.700 I found out that maybe the documentation for the gem was flat-out wrong; it stated it behaved one way but actually worked another.
00:03:33.250 Another time, a gem had a function I needed that solved my problem, but that function wasn't documented anywhere.
00:03:39.609 I was only able to find it by digging into the code of that gem.
00:03:45.819 Yet another time, the gem was taking some input with unstated assumptions about how that input was shaped.
00:03:51.790 I only knew this because I had to go into the code of the gem itself.
00:03:57.310 My point is that diving into the source code of a gem that's giving you trouble is an invaluable skill.
00:04:03.910 To that end, this talk is entitled "Source Diving for Fun and Profit." My name is Kevin Kuchta, and I am a software engineer based in San Francisco.
00:04:11.109 I genuinely feel like one of the biggest leaps I made as a growing engineer was realizing that I could just dive into the code of an external gem.
00:04:17.349 It was about getting over that hump and that fear. I knew at some level it was something I could do,
00:04:22.350 but I hadn't really internalized it and started doing it regularly.
00:04:29.070 Once I did, I felt like I leveled up noticeably as an engineer.
00:04:35.460 The goal of this talk is to help some of you get over that same hump.
00:04:41.100 Or, if you're already over it, to give you better techniques for exploring a gem that's giving you trouble.
00:04:47.100 I don't know about you, but the first time I did this, I just opened up a gem and started reading through it like it was a book, cover to cover.
00:04:53.610 Now, I have a slightly more structured process. Admittedly, my process isn't that structured.
00:04:59.990 But to give you the false impression it is, I present to you the Kevin Kuchta three-step process for source code diving, patent-pending.
00:05:05.160 The first step I like to call "Take Me To Your Leader." So, you've just downloaded a gem that you need to explore to fix a bug.
00:05:10.710 The first step, in my mind, is to grasp the shape of this gem.
00:05:16.590 To do that, find the biggest or at least the most important files—the ones with all the logic.
00:05:22.890 A lot of gems have 30 or 40 different files, but only two or three contain the heart of the gem.
00:05:28.170 A good way to find this logic is to start at the top.
00:05:34.710 Say you've just downloaded the Sidekiq gem. In this example, you open the 'lib' directory where the source code is.
00:05:40.860 There's a file named after the gem; this is the entry point of the gem, the first code that gets run.
00:05:47.310 You open that up and, lo and behold, your first try yields a file that has a pretty good amount of logic.
00:05:53.370 This is a few hundred lines long and contains lots of methods, each with actual code in them.
00:06:00.690 By contrast, here's the CanCanCan gem, which is a community fork of the CanCan gem.
00:06:06.630 There's a file in it called 'cancancan.rb.' I opened it up and found a whole lot of nothing.
00:06:12.810 This is the entire file. All it does is declare a module named 'CanCanCan' and then require another file.
00:06:19.740 Let's trace this one step further and open up the 'cancan.rb' file.
00:06:25.680 It turns out there's also a whole lot of nothing in there: just a bunch of require statements.
00:06:31.770 At this point, I could go through all these require statements to see which have actual logic, but one thing you should know about me is that I am extremely lazy.
00:06:38.660 I don't have time for that. If only there were a way to prioritize these files and figure out which ones to look at first.
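One way to see where an entry point leads is to list its require statements programmatically. The following is a minimal Ruby sketch, not something shown in the talk: the helper name `required_paths` and the regular expression are assumptions made for illustration, and the pattern only catches simple `require` and `require_relative` calls with a literal string argument.

```ruby
# Hypothetical helper: list what a gem's entry file pulls in.
# Matches lines such as:  require 'foo/bar'  or  require_relative "baz"
REQUIRE_LINE = /^\s*require(?:_relative)?\s*\(?\s*['"]([^'"]+)['"]/

def required_paths(entry_file)
  File.readlines(entry_file)
      .map { |line| line[REQUIRE_LINE, 1] } # capture group 1, or nil if no match
      .compact
end

# Usage (the path is a placeholder for wherever the gem was cloned):
#   puts required_paths("lib/some_gem.rb")
```

Requires built dynamically (for example, from a loop over file names) will not be found this way, so the list is a starting point rather than a complete map.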
00:06:44.060 A relatively dumb but surprisingly powerful technique is to look for the biggest files.
00:06:51.380 There's an easy way to do that on any UNIX-based system, like Linux or Mac. There's a nice bash one-liner.
00:06:58.670 I have no idea what it is offhand; I have to do this once every three to six months, and I always forget.
00:07:04.970 Just Google it. But if you're watching this talk after the fact, and you did Google it to arrive here, here's an actual answer.
00:07:11.120 I'll save you the trouble of Googling it; the slides for this talk will be available,
00:07:16.580 so you don't have to memorize this command. All it does is search a directory, find all the files,
00:07:22.160 sort them by line count, and then return the top ten.
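The exact one-liner from the slides is not reproduced in this transcript. As a rough Ruby equivalent of the steps just described (find the files, count their lines, sort, keep the top ten), a sketch might look like the following. The method name `largest_files` and the restriction to `.rb` files are assumptions for illustration.

```ruby
# Hypothetical helper: rank the Ruby files under a directory by line count.
# Returns up to `limit` pairs of [line_count, path], largest first.
def largest_files(dir, limit = 10)
  Dir.glob(File.join(dir, "**", "*.rb"))
     .map { |path| [File.foreach(path).count, path] }
     .sort_by { |count, path| [-count, path] } # biggest first, ties by path
     .first(limit)
end

# Usage (the directory is a placeholder for a cloned gem's lib folder):
#   largest_files("lib").each { |count, path| puts "#{count}\t#{path}" }
```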
00:07:29.060 Here are the top ten files by size, so I can start working my way down this list to explore the CanCanCan gem.
00:07:35.480 I look at the first file, 'controller_additions.rb,' and it turns out there's not much in that file; it's huge, but it's 95% comments.
00:07:41.780 The same thing with the next one. I work my way down and find that the fourth and fifth largest files,
00:07:48.290 which are 'controller_resource.rb' and 'rule.rb,' contain a lot of meat.
00:07:54.080 They hold the real logic of this gem, showing how most of it works.
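Since the largest files here turned out to be mostly comments, one possible refinement, which is not part of the talk, is to rank files by lines that are neither blank nor comment-only. The sketch below assumes the hypothetical names `code_line_count` and `meatiest_files`, and it uses a deliberately crude rule: any line whose first non-space character is `#` counts as a comment. It does not understand `=begin`/`=end` blocks or heredocs.

```ruby
# Hypothetical helper: count lines that are not blank and not comment-only.
def code_line_count(path)
  File.foreach(path).count do |line|
    stripped = line.strip
    !stripped.empty? && !stripped.start_with?("#")
  end
end

# Rank Ruby files under `dir` by that count, largest first.
def meatiest_files(dir, limit = 10)
  Dir.glob(File.join(dir, "**", "*.rb"))
     .map { |path| [code_line_count(path), path] }
     .sort_by { |count, path| [-count, path] }
     .first(limit)
end

# Usage (placeholder directory):
#   meatiest_files("lib").each { |count, path| puts "#{count}\t#{path}" }
```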
00:08:01.250 That's really all there is to what I'm calling "Take Me To Your Leader." Look for the entry points to a gem.
00:08:07.190 Look for the biggest gem files. Whatever you do, just look for the files that contain most of the logic.
00:08:12.740 That's a great way to get a feel for the rough shape of a gem before you sink your teeth into a more focused analysis.
00:08:18.050 The next step is what I'm calling "If You See Something, Search Something." This is me imploring you to use searching in a gem.
00:08:23.930 It's a surprisingly powerful tool. For example, say I've got a gem that's outputting something, like RuboCop.
00:08:30.630 Maybe I'm using it on the command line, and it prints out this error.
00:08:36.760 Now, I know what this error is, but RuboCop has about ten thousand different possible errors.
00:08:40.760 Let's say it's one I don't understand, and I want to find the exact code in the RuboCop gem that defines this error.
00:08:46.360 I go to GitHub, clone the RuboCop repository, open up the source code, and I just do a search across the entire library.
00:08:54.009 I find a file that contains the string 'useless assignment to.' It turns out this is in 'useless_assignment.rb.'
00:08:59.350 Now it points me right to the file that has all the logic that caused this error.
00:09:04.360 I sort of glossed over how we're doing the search here, but all you need for this technique is any tool that lets you search recursively across a large codebase
00:09:10.149 with a nested directory structure, like 'grep -r' or 'git grep'; any of these will work.
00:09:16.509 The one I like to use most is 'ag,' which allows you to do more intelligent searching across a codebase.
00:09:22.059 Here, I'm using 'ag' just to search for the string 'something.'
00:09:28.350 However, 'ag' is smart enough to ignore things like files mentioned in your .gitignore file.
00:09:34.149 It will not return any search results that match patterns in your .gitignore file.
00:09:39.449 'ag' has a handful of useful flags; you can do a case-insensitive search, regular expression searches, and even search by file type.
00:09:43.719 Here, I'm going to look only at Ruby files. I want to highlight that regular expression searching is a surprisingly useful way to navigate a codebase.
00:09:49.449 If I want to find all these strings, I could do a case-insensitive search with a relatively simple regular expression to find all files containing any of these three strings.
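To make the idea concrete without depending on any particular tool, here is a minimal Ruby sketch of a recursive search with the three options just mentioned: case-insensitive matching, regular expressions, and filtering by file type. It is only an illustration of the technique, not how 'ag' is implemented; the method name `search_tree` and its keyword arguments are assumptions, and unlike 'ag' it does not consult .gitignore.

```ruby
# Hypothetical helper: recursively search files under `root`.
# `pattern` may be a String (matched literally) or a Regexp.
# Returns an array of [path, line_number, matching_line] triples.
def search_tree(root, pattern, ext: nil, ignore_case: false)
  source  = pattern.is_a?(Regexp) ? pattern.source : Regexp.escape(pattern)
  options = pattern.is_a?(Regexp) ? pattern.options : 0
  options |= Regexp::IGNORECASE if ignore_case
  regex = Regexp.new(source, options)

  glob = File.join(root, "**", ext ? "*.#{ext}" : "*")
  Dir.glob(glob).sort.flat_map do |path|
    next [] unless File.file?(path) # skip directories
    File.foreach(path).with_index(1).map do |line, number|
      text = line.scrub # tolerate invalid byte sequences in odd files
      [path, number, text.strip] if text.match?(regex)
    end.compact
  end
end

# Usage (placeholder directory and search terms):
#   search_tree(".", "useless assignment to", ext: "rb", ignore_case: true)
#   search_tree(".", /foo|bar|baz/i)
```

Passing a plain string escapes it first, so characters such as `.` or `(` in an error message are matched literally; passing a Regexp keeps its own flags.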
00:09:55.569 But let me suggest a specific example. Here's a line of code—or three lines of code—that was giving me trouble earlier this summer.
00:10:01.360 This code uses the Reddit API, and it's utilizing a gem called 'Redd' to interface with that API.
00:10:07.100 What should happen is that anytime anyone makes a comment on reddit.com, it should trigger this code to print out that comment, but it wasn't working.
00:10:13.480 So, I started using searching to try to debug it. Since the method I'm calling is 'stream,' I'll first search for 'stream.'
00:10:19.050 I'll run the search across the codebase, and admittedly, my first results for this were not very promising.
00:10:25.330 I got results for this inside the README file and documentation for the 'Redd' gem.
00:10:31.830 That's not terribly useful, so I'll narrow this down using 'ag stream --ruby' to only look at Ruby files.
00:10:38.140 That gives me a much smaller number of results. For instance, I see on line 65 in 'paginated_listing.rb' a method named 'stream' being defined.
00:10:44.670 Now let's go take a look at that. This brings me to this 'stream' method that doesn't really do anything.
00:10:50.370 It just calls another method and returns the results. Let's check out that other method: '_stream.'