00:00:16.460
Calm down! Alright, we did it! We did it! Okay, okay. This talk is called 'Compacting GC for MRI.'
00:00:22.170
Hello everybody! My name is Aaron Patterson, also known as Tenderlove. If you don't recognize me in person, this is what I look like on the internet; that is my icon.
00:00:35.160
So this is my online self, but this is me in real life. Now, I have two cats, and this is one of them: Gorbachev, or Gorbachev Puff-Puff Thunderhorse the Third.
00:00:43.980
I think that some of you may not know him because I was airdropping his picture to people, and some people declined.
00:00:49.440
Maybe now you know where this is coming from. This is my other cat, SeaTac, or 'Oops Sea-Tac Airport Facebook YouTube Instagram Snapchat' as her full name. But we just call her Choo Choo.
00:01:06.720
I have stickers of my cat, so if you would like a sticker of my cat, come say hello to me! I'm a pretty, I don’t know, shy person and I know other people are too, so I have a hard time making conversations. But if you want to come say hi to me, I'll have stickers and we can talk about cats. That is something we can do together.
00:01:29.820
I work for a small startup company called GitHub. I don't know if you've heard of GitHub, but it's a place where you can host code online. You could say that working at GitHub is my active job.
00:01:41.909
Alright, okay, I'm really excited to be here in Bulgaria. It's my very first time here, and it's amazing! I'm having a great time. The food is amazing, the people are so friendly, and I just want to say thank you so much for having me here.
00:02:16.230
Whenever I go somewhere, I like to support local culture, try out the local foods, and do things that local people do. So I tried out some local drinks.
00:02:30.120
I can’t read this drink, but I had it and it was interesting. I also tried out Rakia, and there are photos missing of this, but it was very good; I enjoyed that.
00:02:57.239
I also wanted to check out local computers because I thought it would be cool to try local computers. So I tried one of these. As mentioned when I was being introduced, I also wanted to get some local cultures, so I brought home Bulgarian yogurt culture. Thank you so much for that!
00:03:30.180
I didn't know how famous Bulgarian yogurt was until I told a friend of mine that I was going to Bulgaria. She is in Japan and said, 'Oh, yogurt, yogurt!' I asked her what she was talking about, and she said, 'They're famous for yogurt.' I had no idea!
00:04:00.299
But the most famous yogurt in Japan is called Bulgarian yogurt; it literally says Bulgarian yogurt on it. That's how I learned about it.
00:04:14.040
Anyway, I’m on the Ruby core team, which is responsible for Ruby development. I actually just found out that I've been on the team for almost ten years; this October will mark my 10th year on the team.
00:04:57.270
This is my very first commit! I'm also on the Rails core team. I looked up my oldest commits on Ruby, and I found my very first commit on Rails was in 2009; it was merged by Jeremy, but I wasn't on the Rails core team until 2011.
00:05:21.570
I've been on the Rails core team for eight years. I’m not bringing this up just to say I've been doing this for a long time, but I actually wanted to discuss why I got involved in the community and why I’ve been here for so long.
00:06:04.010
The reason is that I love Ruby. I love the Ruby programming language and I’ve been doing development work since 1999. I got into Ruby programming around 2005 when someone at work attended the 'No Fluff, Just Stuff' conference where Dave Thomas gave a presentation about Ruby.
00:06:55.170
They brought it back to work and told me about it. I tried it out and thought it was amazing! At the time, I was actually a reluctant Java programmer; I previously was a Perl programmer.
00:07:16.149
I was forced to switch to Java, thinking that once Perl 6 came out, I wouldn’t have to deal with Java anymore. However, we all know how that ended up. I learned about Ruby, and it felt like a breath of fresh air as a Java developer.
00:08:34.389
Everything was so easy for me. Everything just worked the way I thought it would, and when I made a mistake, it was easy to identify why. The language was very easy to pick up, and there was no boilerplate code.
00:09:15.850
I want to show you what I mean by boilerplate code. Back then, I was doing Java 1.3, so I looked up how to do a map in Java 1.3, and I will provide an example.
00:10:56.220
Here we have a list and we're iterating over a list of strings, trying to convert them into integers. On the left is the Java example, similar to what I would have written at the time, and on the right is our Ruby example.
00:11:16.030
What is actually annoying about this example is that the Java example doesn't actually work because it returns an int, which is not an object, and you can't just put that onto an ArrayList.
00:12:21.320
This doesn't work; you need to think about primitive types versus object types. In Ruby, we don't have to think about that stuff because everything is just an object, which lowers the amount of mental overhead required for programming.
00:13:19.740
Since I am a very forgetful person, lowering that mental overhead is very important. I want to talk about my very first serious Ruby program.
00:14:32.880
In 2005, the third Lord of the Rings movie was released, and I wanted to go see it. It was going to be released on my birthday, December 16th, 2005.
00:15:05.569
All three movies were going to be played on the same day, and I wanted to go see this. I tried to buy the tickets online, but the website kept crashing.
00:15:36.710
I was at work, and our application took ten minutes to compile. Every time I made a change, I'd have to wait ten minutes. So, I decided to write a Ruby program that would automatically attempt to purchase the tickets for me.
00:16:39.550
I even put my credit card information into the Ruby program that submitted to the website. However, the problem was that while it could submit the form, I was aware of what the error page looked like, but I didn't know what the success page looked like.
00:17:29.370
So, I made it retry if there was an error. If the page changed, the program would log a message and then retry. I let the program run and forgot about it. Later, I noticed that the output in the terminal had changed, and I wondered what was happening.
00:18:48.220
When I went through the process, I discovered that it was actually succeeding over and over again! I had a mild panic attack.
00:19:02.220
I called my credit card company and asked a very shady question: 'Hi, how many times have I charged my credit card today?' They said, 'Oh, you charged your credit card once.' I felt relieved.
00:19:40.590
Then, I called the ticketing company because I wasn’t sure if they had processed my ticket purchase. I gave them my details, and they said, 'Oh yeah, you purchased two tickets, but it looks like we tried to charge your credit card hundreds of times.'
00:20:14.930
So the moral of the story is: never write infinite loops. You should always have an exit case. At the same time, I really love Rails. Around this time, DHH published the 'Whoops' video where I discovered Rails, and it had all the same features as Ruby.
00:21:20.170
It was so easy to use; everything just worked the way I thought it would work. I didn’t have to write any boilerplate code, but beyond that, what I really enjoyed was that the community valued these particular things.
00:22:04.490
The values of the community truly resonated with me. We didn't want to endure writing a myriad of XML configuration files or waiting long times for compilation. We wanted the burden to be on the computer, not on us, the programmers.
00:23:09.500
This made me feel really good, probably for the first time I genuinely enjoyed programming and thought, 'I need to do this as my day job.' This is in 2005-2006.
00:23:52.360
I thought I must write Rails! This is a picture of me from 2006. Unfortunately at that time, nobody was hiring Rails developers because it was so new. There were no companies doing this.
00:24:36.440
But I really wanted to pursue it. I ended up joining a startup where some of my coworkers quit their jobs to start a startup using Rails, and I took the leap to join them.
00:25:36.990
However, since there were no hiring opportunities for Rails developers, I had to take a salary cut, which was about 25%. That was a significant loss for me.
00:26:48.410
At that time, however, it motivated me to become involved with the Ruby and Rails communities. I wanted to improve the language and the framework, increasing the number of people developing Rails applications.
00:27:41.790
My goal was to enhance the community so that I could work somewhere else, possibly receiving a raise. I also wanted to ensure that more people could feel the productivity boost and that sense of, 'Oh, the machine should be doing the work for me, not me doing all the labor.' I wanted everyone to experience that.
00:28:34.370
This is a long-winded way of saying thank you to each of you for being Ruby and Rails developers. You are the ones creating companies, jobs, and writing Ruby code so that I can also write Ruby code during my day job.
00:29:13.210
If it weren't for you writing Ruby and Rails code, I don't think I would have the job I have today, or have the opportunity to write in the language I love so much. So thank you all.
00:30:10.200
Now, we will move on to the actual technical part of my presentation. I want to talk to you today about compacting GC for MRI, a feature I wrote for Ruby that will be shipping with Ruby 2.7.
00:30:35.960
This feature took me three years to develop. To be honest, this presentation focuses on the most difficult code I have ever written in my entire life, and I feel strange summarizing it here for you in a forty-minute presentation.
00:32:24.460
However, I am going to explain this to you, and I hope that I will do a good job.
00:32:32.220
First off, let's address the question: what is compaction? Compaction essentially refers to rearranging allocated memory.
00:33:38.080
Imagine we have memory in our computer, and we are rearranging it so that free and allocated memory are next to each other. All we're doing is taking the free chunks and the allocated chunks and trying to group them together.
00:34:24.480
You may remember the process of defragmenting your hard drive; we are doing something similar but to the memory inside of your computer.
00:35:19.820
So why bother with compaction? What are its benefits? Some benefits include efficient memory usage. For instance, if we have a memory layout that looks something like this and need to allocate a new item that is the same size as the existing free slots, it won't fit.
00:36:04.640
However, rearranging the objects can allow this new allocation to fit, enabling more efficient memory usage.
00:36:51.360
Another benefit of compaction is efficient usage of CPU caches. When your program accesses memory, it reads that memory in chunks.
00:37:28.960
If our memory is compacted, we can leverage that cache more efficiently, improving performance and locality.
00:38:04.780
Compaction also provides advantages for techniques like copy-on-write, which can increase the efficiency of our memory usage system. These advantages are primarily a solution to the problem of fragmentation.
00:38:53.900
Fragmented memory means we have chunks of free space interspersed throughout our memory. By compacting, we eliminate that fragmentation by ensuring that similar types of data are grouped together.
00:39:35.510
Before getting into the algorithms, I want to explain that our Ruby programs have two heaps: the malloc heap and Ruby's object heap.
00:40:06.920
We can visualize our system memory as a malloc heap, which allocates memory from the operating system. Inside that malloc heap is Ruby's object heap, where all Ruby objects are allocated.
00:40:55.400
Each Ruby object requires a fixed amount of memory, typically 40 bytes, which we can refer to as a slot. Each slot can be either empty or filled. Memory allocation and fragmentation can occur in both heaps, the malloc heap and Ruby's object heap.
00:41:38.600
At work, we utilize je malloc for the malloc heap, and for dealing with fragmentation within the Ruby object heap, we use GC compact, which we will discuss further.
00:42:24.050
Let's delve into Ruby's heap, which is stored inside that malloc heap. Ruby's heap consists of slots, each of which can either contain an object or be empty.
00:43:18.890
Now, let’s talk about the algorithm implemented for compaction: it’s called two-finger compaction. This method originated in Lisp and involves two main steps: moving the objects and updating their references.
00:44:05.530
Let’s say we have a heap layout with six objects, each with fixed addresses. The algorithm places two pointers on opposite sides of the heap: one (the free pointer) moves right until it finds a free slot, and the scan pointer moves left until it finds a filled slot.
00:44:55.890
When it discovers these two locations, it swaps them. For example, if a filled slot is found on the left and a free slot on the right, those two would be swapped.
00:45:40.830
After moving, a forwarding address is left stating where the object moved from, and this process continues until the two pointers meet. Once they meet, all objects have been moved.
00:46:31.800
The next step is updating the references. With an object relationship such as A pointing at D and B pointing at F, once the objects are moved, the references are outdated and need updating.
00:47:33.540
The process checks each object, and if it detects that B points at an object that has moved, it updates that reference to the new location.
00:48:03.100
After completing the updates, those new locations transform back into free slots, finishing the compaction process.
00:48:45.960
Now let’s explore the reference updating step in detail. The process involves iterating over each object within the heap and adjusting their respective references to their new locations.
00:49:29.520
Understanding how objects hold references is key. We need to know how to update references for all Ruby classes, such as hashes and arrays.
00:50:03.860
The core challenge is how C extensions interact with this process. If C objects point at Ruby objects, and those Ruby objects move, the C objects will crash.
00:50:50.480
To prevent this, I developed a scheme to identify where references are stored. For known types, we have a straightforward way of updating references.
00:51:25.610
For example, if an array is accessing a Ruby object, the GC knows how to update that reference. However, unknown types, like objects in C without Ruby visibility, pose a challenge.
00:52:10.260
In scenarios where we want to ensure C extension references do not cause crashes post-compaction, I introduced a mechanism of pinning bits.
00:52:54.160
The pin bits essentially flag whether an object can move. This way, if a C extension holds a reference to an object, and we call our mark function, we ensure it doesn't move.
00:53:30.130
This ensures that every object that has been marked with a pin bit remains stationary throughout compaction.
00:54:10.640
However, there was a limitation: if a C extension didn't mark an object, it might unintentionally hold a reference to a moved object, resulting in a crash.
00:54:55.790
To ensure that GC doesn’t lose track of hidden references, we established a rule: if a C extension references a Ruby object, it must mark that reference.
00:55:38.540
This simplicity emphasizes the rule: holding a reference requires marking.
00:56:27.330
Another challenge arises with the object ID method, which also necessitates direct memory access. Typically, the object ID corresponds to its memory location.
00:57:08.250
However, if the object moves, we must determine what value to cache. If an object ID is created and compacted, we return the location it was first assigned.
00:57:58.450
This involved creating a map of seen object IDs, ensuring that anytime the GC runs, the object IDs are updated accordingly.
00:58:40.720
I hope you all can see the multifaceted nature of garbage collection and compaction. The original purpose of this was to improve Ruby's memory efficiency, which it undoubtedly has.
00:59:21.050
To see the actual impact of these changes, I will show you a basic Rails application. The graph reflects the heap before and after compaction.
00:59:58.600
After running compaction, you can see the decline in the number of slots.
01:00:45.270
Ultimately, this compaction process reduces fragmentation significantly. Looking ahead, I seek to refine this approach further to improve performance.
01:01:32.520
This transition will include reducing the number of required steps under garbage collection operations.
01:02:10.320
I want to adopt a sliding compactor that organizes memory more effectively and promotes better locality for CPU caches.
01:02:53.169
In the future, I hope to implement a variable width allocation process, giving Ruby full control over memory, which in turn would enhance performance.
01:03:45.270
Thank you for your time. If you have any questions, I would love to engage during the Q&A.
01:04:34.360
And remember, come say hi to me later; I will have stickers!