Talks

Compacting GC in Ruby 2.7

Aaron Patterson — Compacting GC in Ruby 2.7

1:15 Hugging selfie
1:55 #Pivorak rules are…
3:19 Meet the speaker
7:43 Java vs Ruby
8:34 My First Serious Ruby Program
12:00 Moral of the story
12:20 Why do I love Rails?
15:38 Comparing GC for MRI
16:26 Compaction
18:00 CPU Caches
19:30 CoW Friendliness
21:18 Eliminating Fragmentation
22:00 Two Heaps (Ruby Heap and Malloc Heap)
25:14 Two Finger Compaction
29:45 Reference Updating
35:10 Allowing Movements in C Extensions
40:18 Pure Ruby shouldn't crash
44:38 Don’t Use Object ID!
46:30 Future Plans
48:28 Questions?

Let's chat, meet, and share our ideas via all the social media:

Join us on Facebook: https://bit.ly/2WjAgVb
Tweet a bit with us there: https://bit.ly/2XmndyK
Follow us on LinkedIn: https://bit.ly/2MmSn8j
Hop in some cool articles here: https://bit.ly/2IcMgOx
Our Instagram account: https://bit.ly/2QDZCaf
Become a register member on our site and get all the benefits: https://bit.ly/313cImj

Pivorak Conf 3.0

00:00:18.430 There are some tender laughs.
00:00:27.650 We've been waiting for this moment from the very beginning of our community. Four years ago, Aaron Patterson, Thunder laughs.
00:00:46.309 I'm really, really nervous actually. Before we start, I want to take a selfie with everyone. I heard that we're supposed to hug each other here, so I want to take a photo of everybody giving a hug. Can we do that?
00:01:00.629 Okay, okay!
00:01:36.810 Alright, so first I want to say thank you so much for having me. It really, really means a lot to me.
00:01:42.370 I was actually really moved by the Pivorak rules. Ah, there we go! I think that the Pivorak rules are the best. I think they are amazing. I want to encourage all of you to become a speaker.
00:01:53.680 It's really hard for new people to give talks or to get into speaking, and I think it's because a lot of people feel they have nothing to teach anyone or don’t know what to say on stage. But the thing is, every single one of you knows something that somebody else doesn't! You might not know what that thing is, but it doesn't really matter; just get up on stage and talk about some idea you're working on, or anything you do, because someone out there won't know about it, and they will learn something.
00:02:13.570 So, please, become a speaker. Another thing I really liked about the rules is the hugs. This is good because I love hugs and I want you to give me a hug! Well, not now. We'll wait until after the party. But yes, please come give me a hug; say hello to me.
00:02:25.959 Now, this I don’t need to emphasize, but my name is Aaron Patterson, and my name on the Internet is tenderlove. If you don't recognize me in person, this is what I look like online.
00:02:38.620 I have a couple of cats. This one, his name is Gorbachev Puff-Puff Thunderhorse the Third. We call him Gorby. You can ask later why I named him that. This is that her name is SeaTac Airport; on Facebook, YouTube, Instagram, Snapchat; I think that's it. We call her just SeaTac for short.
00:03:05.140 I felt so bad introducing my cats in my presentations; at one conference, there was someone translating me into sign language, and I thought, oh man, why did I say this name? This is so terrible for the sign language people! Anyway, I have stickers of my cats, so if you'd like a sticker of my cat, just come and ask me! If you don't know what to say when you come to me, just ask for a sticker, and I'll give you one.
00:03:49.539 I started looking through my commit history this year and found out that this October will be my tenth year on the Ruby core team! I'm also on the Rails core team, responsible for developing Rails. I decided to look through my history there as well and found I've been on the core team since 2011, but my first commit was in 2009. So, almost ten years of committing, and eight years as a core team member.
00:04:15.940 I want to tell you about my history a little bit. I’m not doing this to boast, but I really want to share with you why I have been involved in the community for so long and why I have stayed so long. The first reason is that I really love Ruby.
00:04:36.170 I love the Ruby programming language. I've been doing development work since 1999. I got my first programming job then and got into Ruby around 2005 because some of my co-workers went to a conference called No Fluff, Just Stuff, where Dave Thomas gave a presentation about Ruby.
00:04:56.660 My co-workers brought it back and showed it to me. At that time, I was a reluctant Java programmer, and programming in Ruby felt like a breath of fresh air. Before Ruby, I was a Perl programmer, so Ruby felt really nice to me!
00:05:18.840 The language was just so easy; everything worked the way I thought it would. When I made a mistake, it was easy to tell why. I really like the language, as the basic patterns were simple, meaning you don't have to learn many rules to program in Ruby.
00:05:32.080 That meant my mental overhead was lower. I also didn't have to write a lot of boilerplate code, which I hated as a Java developer! Let me show you an example of what I mean.
00:05:48.300 I googled Java 1.3, as that's what I had to program in at the time. Here are two examples that do the same thing: one in Java and one in Ruby. Both examples are taking two lists of strings and converting them to integers.
00:06:12.430 In Java 1.3, we didn't have generics, so you had to pull everything out of a list, cast it, and then push it back. Now, the funny thing is that this Java example doesn't even work, because when you call parseInt, it returns an int value, which isn't an object, meaning you can't add it to an ArrayList that only supports objects!
00:06:28.220 This is what I mean when I say mental overhead. In Ruby, everything is an object, so you don't need to think about these particular differences. You just learn that one rule: everything is an object.
00:06:41.610 Now, let's talk about my first serious Ruby program that I wrote in 2005. I wanted to see the third Lord of the Rings movie, and it was being released. They were going to show all three Lord of the Rings movies in one day, and that day was my birthday!
00:06:59.960 It was also a day early. I was really excited about this! I wanted to go see these movies for my birthday, so I went to work, and they were selling movie tickets online. I tried to buy the tickets, but the website was crashing.
00:07:22.160 At that time, I was working on a Java app that took about ten minutes to compile, so every time I made a change, I had to wait ten minutes to test it. So, I decided to write a Ruby program that would try to buy those movie tickets for me.
00:07:36.280 I wrote this program and even put my actual credit card number in it, stored in a plain text file. It would make requests, and I knew what a failure case looked like, so when I got an error, I would know what to do!
00:07:54.700 I thought I'd just retry. I kept trying, and if I got a different response, I would log a message. But I'd still retry anyway. I thought maybe it would keep failing, but perhaps with a different page.
00:08:12.740 So, I ran the program while I was working on my normal Java job and forgot it was running. Then I looked in the terminal and saw a whole bunch of log messages. Oh no, something changed!
00:08:26.110 So I went to the website and tried going through the process, and it turned out that it had been succeeding over and over again! So I called my credit card company and asked a very awkward question.
00:08:48.970 I asked them, 'Hi! How many times have I charged my credit card today?' They said, 'Well, you charged it once.' I was like, 'Oh wow, that's great!'
00:09:08.009 So then I called the ticketing company to ensure I actually got the tickets. I told them, 'Hi, I've been trying to buy tickets online, and my credit card company says I got charged, but I want to verify that I actually got the tickets.'
00:09:19.050 I gave my information, and the guy on the phone said, 'Oh yeah, you bought two tickets, but it looks like we've actually tried to charge your credit card hundreds of times!' I was like, 'Oh, the website wasn't working, so I was just hitting refresh!'
00:09:38.350 The moral of the story is that I never write my credit card information into anything again without an exit case. Also, I love Rails.
00:09:50.100 Around that time, DHH made his famous blog video, and I decided to try out Rails. I fell in love with Rails; to me, it had the same features as Ruby. Everything just worked, and there was no boilerplate code.
00:10:08.970 There was something more to it than just these specific traits; it was the values of the language and the framework that really resonated with me. Ruby and Rails were trying to shift all of the burden from the programmer onto the computer.
00:10:23.970 I didn't want to do that work; I wanted the computer to figure it all out! This mentality is what I wanted to work with. I wanted the computer to do my job. Unfortunately, at that time, I was still a Java programmer.
00:10:41.320 But I thought to myself, 'I must write Rails!' This is a photo of me from 2006, and unfortunately, around that time, nobody was hiring Rails developers; it just wasn't a job that existed.
00:10:54.220 Two friends of mine, two co-workers, quit and decided to start a startup using Rails. I was so unhappy being a Java developer that I decided to join them, but that cost me 25 percent of my salary because there were no Rails jobs available.
00:11:09.970 After joining this company, I knew I had to do something about this. I wanted to increase the popularity of Ruby and Rails so more companies would start hiring Ruby developers.
00:11:29.320 It wasn't just about me getting a job; it was about making development fun and helping everyone else feel that same joy. I want to thank all of you for writing Rails applications and Ruby code.
00:11:47.160 You're helping spread these values and making it possible for us to have jobs writing code for fun! So, let’s move on to the technical portion of my presentation. I apologize; it’s almost 9:00 p.m. and this will be very technical, but I will do my best to make it understandable.
00:12:18.360 I'm going to talk to you about a compacting GC for MRI—a patch that I've been working on for three years now. This presentation is about the most difficult code I've ever written in my life, so I hope I can distill it so everyone can understand.
00:12:44.670 And if you have questions, please ask! I don’t think I’ll come to Ukraine very often, so take advantage of my time here.
00:12:57.060 First off, what is compaction? Essentially, compaction is taking allocated memory and free memory—imagine a picture of computer memory: we have allocated and free chunks. Simply put, we rearrange them so they're next to each other.
00:13:10.470 Combining these together allows us to make one allocated memory chunk and one free memory chunk. For those of you who may remember, this is like defragmenting a hard drive, but instead, we're doing it in memory.
00:13:26.410 So, why should we compact? What are the benefits of compaction? One significant benefit is that it gives us more efficient memory usage. For example, suppose we have a memory layout that looks fragmented, and we want to allocate something new, but it won't fit inside of the free area.
00:13:43.650 If we rearrange the memory so that the free memory chunks are together, then there's enough space to allocate what we need! Compaction improves memory efficiency.
00:13:57.140 Another advantage is that it helps CPU cache. When a program reads memory, it has to go through the CPU to get it; the CPU will read memory in chunks from your RAM, and that chunk is stored in the CPU cache.
00:14:09.520 So what happens is: as the program runs, the CPU stores chunks of memory to optimize access. If we can arrange memory so that allocated chunks are contiguous, it ensures that the memory reads fit inside CPU caches. This leads to better locality.
00:14:26.850 Another advantage is Copy-on-Write friendliness. At GitHub, we use a Unicorn web server that saves memory through a Copy-on-Write technique, and compaction can improve the efficiency of this copy.
00:14:42.490 Let’s say we have a parent process. When the parent forks a new child process, that child process points to the parent's memory, thereby avoiding doubling memory usage. If the child process needs to write to the memory, it copies the memory from the parent.
00:15:02.360 However, the operating system cannot copy just the amount of memory you need; it copies a fixed-width chunk, which can lead to unnecessary memory usage. Compaction solves these fragmentation problems, essentially moving memory around.
00:15:20.620 Now, in Ruby, programs have two heaps. We have the system memory, which is our total memory, and memory allocated using the malloc system call. The malloc heap is created by the operating system, out of which we allocate Ruby's objects.
00:15:41.060 So, Ruby's object heap is contained within the malloc heap, and every time we allocate a new Ruby object, it comes out of this Ruby object heap. Ideally, the Ruby heap would be the same size as the malloc heap, but plenty of data is allocated outside of the Ruby heap.
00:15:58.470 One example is a string: in this case, the Ruby object representing the string is allocated inside of Ruby's object heap, but the actual string buffer is allocated inside the malloc heap.
00:16:12.690 So, fragmentation can occur inside both of these heaps, not just Ruby's heap. For the malloc heap, we use an allocator called je_malloc; I recommend using this in production. For the Ruby heap, we use something called GC compact, which is the patch I wrote.
00:16:30.500 Now, let’s take a look at how Ruby's heap is built. Each Ruby object is represented by a fixed amount of memory, specifically 40 bytes wide. We call these chunks 40-byte slots; each slot can either be empty or filled.
00:16:43.920 These slots are contiguous on a chunk of memory, known as a page, and each page is approximately 16 kilobytes. Ruby's object heap consists of many of these 16 kilobyte pages allocated using malloc.
00:17:01.750 Each slot has a unique address, which is crucial for building a compactor. Now that we know about this heap layout, let’s look at the actual algorithm. The algorithm I use is called a two-finger compactor.
00:17:15.290 Originally implemented in 1964, this algorithm isn’t the most efficient but it is straightforward. It consists of two primary steps: moving objects and updating references. We will discuss moving objects first.
00:17:31.370 The algorithm works by using two pointers, one on each side of the heap—one called the free pointer and the other the scan pointer. The free pointer moves to the right until it finds a free slot, while the scan pointer moves left until it finds a filled slot.
00:17:46.130 Once it finds both a filled slot and a free slot, they swap places, leaving a forwarding address. The process repeats until the two pointers meet, at which point the heap is compacted.
00:18:02.070 After compaction, we need to update references. For this, we walk through each object, asking for its references and updating them. For instance, if object A points at object B, and B moves, we change A's reference to point to B's new location.
00:18:18.420 Once all references are updated, we change the 'moved' slots back to free slots. That's it! In Ruby code, I translated the algorithm into something easier to read, since this is a Ruby conference.
00:18:36.520 The reference updating code seems straightforward, but it hides a lot of complexity since we have to know how various Ruby structures, like hashes and arrays, hold references.
00:18:56.050 Updating references is about 80% of the code I wrote. Supporting C extensions is one of the hardest parts of this process. I came up with a scheme to handle C extensions and I want to introduce it to you.
00:19:15.470 The biggest challenge was figuring out where those references are stored. Known types in Ruby are straightforward; we can read their implementation. But what about unknown types?
00:19:32.900 Unknown types are generally implemented in C. For example, the JSON parser, which is a C extension, can store Ruby objects, but the garbage collector doesn't know about them, making it unable to update references.
00:19:45.420 Here's how we resolve this: all C extensions must mark references during the garbage collection process. Using the RG BGC mark function, we ensure those references are preserved, preventing any potential crashes.
00:20:04.230 During the mark phase, we introduce a pin bit table. When our BGC mark gets called, it pins certain references so they can’t move during compaction. This allows the garbage collector to work efficiently.
00:20:22.220 Another thing I introduced allows movement inside C extensions through three concepts: a compaction callback, an open marking, and a new location function.
00:20:37.370 Since the GC can’t update a C extension, we let the C extension update itself. The extension knows its own internals, enabling it to handle its memories according to our algorithm.
00:20:50.450 This is how we work together: C extensions declare a marking function and a sweep function. I introduced a third callback: a compaction callback that gets called during the next compaction.
00:21:07.300 We also adjust the marking function, telling it not to pin objects when marking references during compaction.
00:21:23.120 This ensures a fluid operation. Now, I want to address one known issue: if a C extension holds a reference and that reference isn't marked, it could lead to program failure.
00:21:39.640 I fixed this with a patch called direct marking, which is included in Ruby 2.6. This eliminates unmarked references that could fail, ensuring all Ruby objects are marked directly from instruction sequences.
00:21:56.410 Another issue was found in the JSON gem where an internal constant wasn't marked. To solve this, the authors needed to allow marking to prevent objects from crashing during garbage collection.
00:22:12.080 Generally, the rule here is simple: if you hold a reference, you must mark it! In other words, avoid writing C code; if you only write Ruby, you don't need to worry about this!
00:22:29.990 Lastly, I want to address the challenge of Object ID. The overarching theme here is that having direct memory access prevents the object from moving.
00:22:46.120 If you call Object ID in the same slot, it will return the same number, but if the object moves, the answer becomes uncertain. To address this, I implemented a system to store seen object IDs in a global hash.
00:23:05.390 If we see a specific object ID, for example, we allocate memory and update it in the hash. If that memory moves, we retrieve the information from the hash ensuring consistent results.
00:23:22.970 However, if an old object ID is reused, that leads to complications. In such cases, I increment the ID to avoid collisions.
00:23:39.770 This way we ensure there’s always a unique ID available. Once we free an object, we must also clean up the hash to maintain order. In short, the use of Object ID is fine, but avoid over-reliance on it.
00:23:58.790 Now, let’s take a look at the impact of these changes. We studied the memory layout at the beginning. I want to show you a visualization of a basic Rails application.
00:24:15.420 In this image, each column represents a page and each dot within signifies a Ruby object. Red dots are Ruby objects that cannot move; black dots are Ruby objects that can move; and white dots are free space.
00:24:33.850 As you can tell, our heap appears to be quite fragmented with a lot of free space interspersed among allocated objects.
00:24:47.870 However, after running the compactor, it looks noticeably slimmer, eliminating plenty of fragmentation and reducing the number of pages.
00:25:01.230 Let's look at a more comprehensive example, which is the GitHub application. The top column is before compaction and the bottom section is after compaction.
00:25:19.070 You can see the direct impact of our changes. This is the entire view of what it looks like; the objects have now been compacted to the side.
00:25:38.150 Now, let’s talk about future plans. First and foremost, I want to improve the performance of this patch. Currently, it’s very inefficient because it requires four steps: a full GC, moving objects, updating references, then performing a full GC again.
00:25:57.920 Reducing this to only two steps would be ideal—executing compaction right after a major GC.
00:26:06.790 I’d also like to implement a sliding compactor. The difference between it and a two-finger compactor is that instead of doing swaps, it slides everything down. This offers several advantages.
00:26:25.690 The new sliding compactor improves locality. For example, if you allocate an array containing elements, they may end up scattered throughout the memory. But with the sliding compactor, all of those elements would stay together.
00:26:44.060 Additionally, it supports variable widths. Unlike the two-finger compactor that relies on a fixed width, this sliding version provides more flexibility.
00:27:03.380 This allows for more efficient memory use without the constraints imposed by allocators. So, that is the end of my presentation.
00:27:21.920 You can see the actual bug, and it is finally merged! Thank you so much for having me here; it’s an honor!
00:27:36.340 And if you have questions, please ask.
00:27:43.980 That’s a good question! This patch only impacts the Ruby heap, not the malloc heap. The performance of the malloc heap differs from the Ruby heap.
00:28:01.600 Typically, when those issues occur, they relate to the system's malloc, which is why we switched to je_malloc; it efficiently releases memory back to the OS!
00:28:20.880 However, I must clarify that this is a manual compactor; you must specifically call GC compact. My next goal is to make it automatic, but for now, it has to be triggered manually because current inefficiencies need to be resolved first.
00:28:36.580 So, to summarize: I wanted to ensure as many objects could move as possible while also factoring in the aspects of Object ID, which is something many developers leverage.
00:28:52.240 While there may be more C extensions that reference Object IDs than we realize, it leaves an opening for improvements. The use of individual pinning may lead to unfortunate performance issues.
00:29:09.470 Numerous objects could inadvertently become pinned without realizing, leading to unnecessary memory consumption.
00:29:25.060 However, I believe that the benefits of compaction should outweigh the downsides, especially when we eliminate unnecessary memory allocations.
00:29:41.470 At GitHub, we use it in production, manually triggering compaction before the Unicorn forks a child process, effectively sharing memory to lessen the overall load.
00:29:58.320 Thank you, and I appreciate your attention today!