GoRuCo 2018

Closing Keynote: Analyzing and Reducing Ruby Memory Usage

GORUCO 2018: Closing Keynote: Analyzing and Reducing Ruby Memory Usage by Aaron Patterson


00:00:23.420 Hold on, I got my desktop background all set up. How is everyone feeling? Good? I feel amazing! I feel incredibly good. I was actually in Japan 48 hours ago and I'm super jet lagged. However, I decided to take a nap, and oh my god, naps are amazing! I feel really good now.
00:00:57.750 So, okay, I'm gonna give a talk now, and I'm kind of excited because no matter how poorly I do, I'm not going to get invited back. In fact, I'm going to make a huge faux pas right now and get my clicker out. It's in my bag, one sec. Okay, that wasn't a pun or anything, just me getting my clicker.
00:01:10.049 Okay, so this talk is titled 'Reducing Memory Usage in Ruby'. Hello, everyone! I'm so excited to be here. My name is Aaron Patterson, but you may also know me on the Internet as Tenderlove. If you don't recognize me here on stage, this is what I look like online. I do look different; that is a wig. Some people don't know that. And yes, my name is Aaron; you've caught me. I work for a tiny startup called GitHub. You may have heard of it; it's the only legit company I've ever worked for! I love using Git, but I will not force push it on you.
00:01:54.829 I use these puns at every conference now that I am working for GitHub. However, our company has recently been in the news, so I feel like I'm going to branch out soon. In fact, I think this will open a lot of doors for me. Did I say doors? I meant to say windows! I can't wait! I'm gonna make so many puns, it's going to be so good.
00:02:22.560 I have two cats. One of them is named Gorbachev Thunderhorse; this is him. You may have seen me AirDrop him to you, or you may have rejected my AirDrop. I'm not going to call out those people rejecting my AirDrops. My other cat is named SeaTac Airport Facebook YouTube Instagram Snapchat... what is another social network? I don't know! Anyway, her short name is Choo Choo, but we keep adding social networks to her name mainly because she has no idea what her real name is.
00:02:42.810 This is the very last GoRuCo, and I am so excited to be here, really happy to be in the Big Apple. Frances specifically told me not to call it the Big Apple. I'm trying to do all of the typical tourist stuff I possibly can, like going to Sbarro.
00:03:05.210 I'm staying at the Hilton, which is just up the street. I'm a member of their rewards program, called HHonors. Every time I stay at a Hilton, I make sure to say, 'Hey, I need you to enter my number in! I'm an HHonors member!'
00:03:29.540 So, I went to the Hilton today, or yesterday, and I did the same thing. The receptionist said to me, 'Well, you can’t use your points because you booked through a third party, so it’s not going to help you.' So that was my experience!
00:03:43.220 Anyway, let’s talk about GoRuCo. First, GoRuCo has always been a special conference to me, and I have a lot of firsts in my career associated with this conference. GoRuCo is the first regional Ruby conference I have ever attended. I think it actually is the first regional Ruby conference because before that, there was just RubyConf. I loved attending these Ruby conferences because, as Frances mentioned earlier, back at that time, there weren’t many technical conferences for programmers.
00:04:28.370 I happened to just get my first Ruby job. But before that, I used to be a J2EE developer. Don't tell anyone! I used to go to conferences for J2EE, but it was really not fun because I couldn't apply any of that knowledge. We used a proprietary J2EE container at work, so anything I learned couldn’t really be used in other contexts. And whenever I attended conferences, it was mainly marketing stuff, nothing practical. But going to Ruby conferences felt like a breath of fresh air.
00:05:17.880 I truly felt that going to GoRuCo was similar. GoRuCo was also the first place where I made a talk proposal; it was the first conference I ever submitted a talk to, and it was also the first conference to reject my proposal. But here I am today, delivering the last keynote at the last GoRuCo. It’s a huge honor to me!
00:06:24.360 I want to say thank you to the organizers for all of your hard work. Please give them a round of applause! They deserve it. Alright, so let’s get to the technical part of my presentation. I’m going to talk about reducing memory usage in Ruby.
00:06:45.560 I'm also going to discuss two patches that I wrote for Ruby to reduce memory usage. We will talk about the loaded features cache and a technique I call direct instruction sequence marking. We won’t look much at the code itself but rather the techniques I used to find these optimizations.
00:07:03.820 It’s more interesting for us to learn how to identify these opportunities for optimization, as we can apply that knowledge in various contexts in our applications.
00:07:32.300 The first thing we need to do when dealing with memory is to determine what our memory usage is. MRI Ruby is written in C, so we need to find memory usage in C-based programs. There are two ways I typically go about it: the first way, which is not very effective, is reading the code. I believe this is the worst way to find memory issues in a program. However, sometimes you have to read the code.
00:09:53.600 The other method, the one I prefer, is what’s called malloc stack tracing. In Ruby, we actually have two different types of allocations to worry about: allocations made by the garbage collector (GC) and allocations made with malloc. We need to keep track of both types. Tools like ObjectSpace or the allocation_tracer gem can help us inspect Ruby-level memory, but when it comes to finding issues like array bloat, we need to dig deeper with lower-level C-based tools.
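As a rough illustration of the Ruby-level view mentioned here, the following sketch uses the standard `objspace` extension. It reports what the garbage collector knows about, but it cannot say which C function called malloc, which is why the lower-level tools are still needed. The variable names are only for illustration.

```ruby
require "objspace"

# Counts of heap slots keyed by internal type (:T_STRING, :T_ARRAY, ...).
counts = ObjectSpace.count_objects

# Approximate bytes attributed to all live Ruby objects, including
# the malloc'd buffers that each object reports owning.
total_bytes = ObjectSpace.memsize_of_all

# Size attributed to one object: a large array owns a malloc'd buffer.
big_array = Array.new(10_000) { |i| i }
array_bytes = ObjectSpace.memsize_of(big_array)

puts "string slots: #{counts[:T_STRING]}"
puts "total bytes:  #{total_bytes}"
puts "array bytes:  #{array_bytes}"
```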
00:11:11.350 One of my favorite tools for this is malloc stack logging. This tool is only available on macOS, but there are equivalents on other operating systems. I'll show you how to use it on macOS, but you can apply the knowledge on Linux as well. First, you have to enable the logger by setting an environment variable (MallocStackLogging) before the process starts. Without it nothing gets recorded, which matters especially when profiling a Rails application.
00:12:48.400 We print out the PID, run the garbage collector to clean up any garbage, since we aren’t really interested in it, and then we pause the process. Here’s why: when we start a process, the memory addresses where things are allocated are randomized, so the logged addresses only make sense for that one process. We have to capture the profiling information while the process is still alive. Once the process is paused, we can dump the malloc logs using malloc_history.
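The print, collect, pause sequence can be sketched in Ruby as below. The helper name `pause_for_inspection` is hypothetical and this is not the script from the talk. It assumes the process was launched on macOS with `MallocStackLogging=1` set, and that `malloc_history` is then run against the printed PID from another terminal while the process waits.

```ruby
# Hypothetical helper: print the PID, collect garbage, then block until
# the operator presses Enter so the process stays alive for inspection.
def pause_for_inspection(input: $stdin, output: $stdout)
  output.puts "PID: #{Process.pid}"
  GC.start # drop unreachable objects so the logs reflect live data
  output.puts "Paused; press Enter to continue."
  input.gets
  nil
end

# Call it at the point you want to snapshot, for example after boot:
#   pause_for_inspection
```

The input and output streams are parameters only so the helper can be exercised without a terminal.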
00:14:29.000 If we look at the log file, each entry records whether memory was allocated or freed, the memory address, the size, and the stack trace. Replaying these entries lets us reconstruct which memory was live in the application at any point. For instance, we can find the memory usage of the program at any given time as well as identify which functions are allocating the most memory.
00:15:29.430 We need to focus on finding the callers instead of just the malloc calls themselves. By figuring out who is calling malloc, we can get an idea of the allocators in our program. For example, we might find that a significant portion of the memory allocated during Rails boot comes from a specific function such as rb_ast_newnode. This insight can then be used to refine and reduce memory usage.
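A minimal sketch of that analysis follows. It replays allocation and free events to find what is still live, then totals live bytes by caller. The record format, the sample data, and the method name `live_bytes_by_caller` are all hypothetical and simplified; real `malloc_history` output looks different and would need its own parser.

```ruby
# Hypothetical, simplified record format (NOT real malloc_history output):
#   ALLOC <address> <size> <frame> | <frame> | ...   (innermost frame first)
#   FREE  <address>
SAMPLE_LOG = <<~LOG
  ALLOC 0x1000 64 malloc | ruby_xmalloc | rb_ast_newnode
  ALLOC 0x2000 128 malloc | ruby_xmalloc | ary_resize_capa
  ALLOC 0x3000 64 malloc | ruby_xmalloc | rb_ast_newnode
  FREE 0x2000
LOG

# Replays the log and returns live bytes grouped by the first frame
# that is not one of the allocator wrappers.
def live_bytes_by_caller(log, wrappers: %w[malloc ruby_xmalloc])
  live = {} # address => [size, caller]
  log.each_line do |line|
    kind, address, size, stack = line.strip.split(" ", 4)
    case kind
    when "ALLOC"
      caller_name = stack.split(" | ").find { |frame| !wrappers.include?(frame) }
      live[address] = [Integer(size), caller_name]
    when "FREE"
      live.delete(address) # freed memory no longer counts as live
    end
  end

  totals = Hash.new(0)
  live.each_value { |size, caller_name| totals[caller_name] += size }
  totals.sort_by { |_, bytes| -bytes }.to_h
end

p live_bytes_by_caller(SAMPLE_LOG)
```

Grouping by the first non-wrapper frame is one possible choice; grouping by a deeper frame would answer a different question about who is responsible.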
00:16:33.390 Now that we know how to find where our program uses memory, let's dive into the improvements I made. The first one I want to discuss is the loaded features cache. This optimization relates to how strings are managed in Ruby. Each string in Ruby is stored using C character arrays, allowing for some memory optimizations.
00:19:05.920 When multiple Ruby objects point to the same underlying character array, we can avoid allocating a new buffer, if care is taken. However, a slice can only share the buffer when it runs to the end of the string, because C strings must end with a NUL terminator; a slice taken from the middle has to be copied into a new allocation. So the rules here are about making full use of these shared character arrays to minimize malloc calls.
00:20:56.960 The loaded features array tracks what has been required so we don't require the same file repeatedly. To make checking it fast, Ruby builds a cache of every possible argument that could be passed to require for a given file, which gives us faster lookup times. It can then efficiently check whether a file has been required before, without scanning the whole array.
00:21:43.120 The structure looks something like this: it links each file to corresponding indices in the cache, allowing quick access. The challenge with the loaded features cache is determining file equality, given variations in how files are required. Repeated array searches in loaded features can slow down Ruby's boot time significantly.
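To make the shape of that cache concrete, here is a loose Ruby model of it. It is a sketch based on the description above and not MRI's actual C implementation; the method names `feature_keys` and `build_feature_index` and the sample paths are made up for illustration. Each key is one way the file could be named in a `require` call, and each value is a list of indexes into the features array.

```ruby
# Every way a feature might be named in a `require` call: each trailing
# path suffix, with and without the file extension.
def feature_keys(path)
  ext = File.extname(path)
  parts = path.split("/")
  suffixes = (1..parts.length).map { |n| parts.last(n).join("/") }
  suffixes.flat_map { |s| ext.empty? ? [s] : [s, s.delete_suffix(ext)] }
end

# Builds a hash of key => indexes into the features array.
def build_feature_index(features)
  index = Hash.new { |hash, key| hash[key] = [] }
  features.each_with_index do |path, i|
    feature_keys(path).each { |key| index[key] << i }
  end
  index
end

features = ["/lib/ruby/set.rb", "/app/models/set.rb"]
index = build_feature_index(features)

p index["set"]           # → [0, 1]
p index["models/set.rb"] # → [1]
```

The model also shows where the memory goes: two files already produce sixteen keys, so how each key is stored matters a great deal.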
00:24:13.130 To mitigate these issues, I redesigned the loaded features cache, focusing on reducing memory overhead. Instead of allocating multiple Ruby objects for each required file, I streamlined the process, creating pointers directly to C characters. This meant fewer allocations for features required, effectively cutting memory usage in half.
00:27:40.060 Next, I want to discuss another patch that helps with direct instruction sequence marking, which requires a bit of background on Ruby’s VM architecture. The VM runs as a stack-based machine, processing operations through a series of instructions while maintaining a program counter that indicates the next instruction to execute.
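A toy stack machine makes the idea concrete. The instruction names below are invented for this sketch and are not MRI's real instruction set; the point is only the loop that fetches the instruction at the program counter, advances the counter, and operates on the stack.

```ruby
# Toy instruction set (hypothetical, not MRI's real instructions).
PROGRAM = [
  [:push, 3],
  [:push, 4],
  [:add],
  [:leave]
].freeze

def run(program)
  stack = []
  pc = 0 # program counter: index of the next instruction to execute
  loop do
    op, arg = program.fetch(pc)
    pc += 1
    case op
    when :push
      stack.push(arg)
    when :add
      right = stack.pop
      left = stack.pop
      stack.push(left + right)
    when :leave
      return stack.pop
    else
      raise ArgumentError, "unknown instruction: #{op}"
    end
  end
end

puts run(PROGRAM) # prints 7
```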
00:29:50.320 As Ruby source code is compiled, it’s transformed into an abstract syntax tree (AST), then into bytecode, which is a series of instructions in binary format. These binary instructions, in turn, translate back to operate on the stack during execution.
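On MRI you can look at the compiled instructions yourself with `RubyVM::InstructionSequence`, which is specific to CRuby. The sketch below compiles a small expression and prints its disassembly; the exact instruction names and layout differ between Ruby versions, so no sample output is shown.

```ruby
# RubyVM::InstructionSequence is specific to MRI (CRuby).
iseq = RubyVM::InstructionSequence.compile("1 + 2")

# Human-readable listing of the compiled instructions; the exact
# instruction names and layout vary between Ruby versions.
puts iseq.disasm

# Run the compiled instructions.
result = iseq.eval
puts result # prints 3
```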
00:31:40.960 Consider string literals. When Ruby compiles code containing a string literal, the literal string object is stored as an operand of the instruction. During execution, if the VM pushed that original string onto the stack without duplicating it, any change you made would mutate the literal itself, and the next run of the same code would see the changed value.
00:33:24.140 Thus, we must duplicate these objects before pushing onto the stack to avoid unintended side effects during program execution. This is important for ensuring the integrity of string objects and overall memory management during Ruby operation.
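The effect of that duplication is visible from plain Ruby. In this sketch, which assumes the file does not enable `frozen_string_literal`, each call evaluates the literal again and gets its own String, so mutating one result leaves the next call untouched. The method name `greeting` is just for illustration.

```ruby
def greeting
  "hello" # each evaluation of the literal yields a new String object
end

first = greeting
first << ", world" # mutates only this copy

second = greeting

puts first                # prints hello, world
puts second               # prints hello
puts first.equal?(second) # prints false
```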
00:35:05.949 In conclusion, we learned how Ruby's VM can handle this memory more efficiently. By removing the array bloat caused by keeping a second pointer to each of these objects, we can now find and mark them directly in the instruction sequences. The result of these patches is a notable reduction in the number of objects maintained during execution.
00:37:24.670 Through these optimizations, Ruby 2.6 promises substantial memory efficiency improvements, which developers can gain access to by upgrading today! Thank you so much for having me.