RailsConf 2016
Tweaking Ruby GC Parameters for Fun, Speed, and Profit

by Helio Cola

Helio Cola presented 'Tweaking Ruby GC Parameters for Fun, Speed, and Profit' at RailsConf 2016. The talk focuses on the Ruby Garbage Collector (GC) and how tuning it can improve the performance of Ruby on Rails applications. Cola shares personal experience and research on the evolution of Ruby's garbage collector, and draws out practical guidance on parameter configurations that can improve application performance.

Key points include:
- Introduction to GC: Helio begins his presentation with an overview of the significance of the Ruby Garbage Collector and his journey of exploring its capabilities.
- Personal Journey: He shares his background in software development and how his curiosity about GC performance led him to advocate for optimizations in his workplace.
- History of Ruby GC: Cola explains the evolution of Ruby's garbage collection algorithms from Ruby 1.8 to 2.2, highlighting the introduction of generational and incremental collectors.
- Configuration Parameters: He delves into the various configuration parameters available in Ruby versions 2.0 and later, emphasizing the importance of tuning them based on the specific application's needs.
- Testing and Adjustments: The speaker provides insights on how to test these parameters effectively, recommending the implementation of one change at a time for measurable impact.
- Performance Results: Cola highlights the performance improvements observed after tweaking parameters, citing examples where garbage collection times dropped significantly, leading to faster application response times.
- Conclusion and Recommendations: He emphasizes the significance of documentation and knowledge sharing within teams to facilitate better performance tuning practices. Cola encourages participants to monitor their applications closely and adjust GC parameters as necessary to strike a balance between memory usage and performance efficiency.

Overall, the presentation aims to empower Ruby developers to harness the capabilities of the GC to improve their applications' speed and efficiency, while sharing the methodology for doing so effectively and thoughtfully.

00:00:09.820 All right, thank you for giving me 10 seconds of your time. I'm going to get started. Hello everybody, the title of my talk is 'Tweaking Ruby GC Parameters for Fun, Speed, and Profit.' My part here is to share how I convinced my boss to give me time to work on this topic.
00:00:39.440 Before I start, let me tell a bit of a story. Who is here for the very first time? All right, it's a pretty good audience. Something amazing about RailsConf is that last year was my first year, and I had a great time. I thought it was an awesome conference, and after it concluded, I decided I wanted to come back as a speaker.
00:01:06.439 So, to those of you here for the first time, if you like the conference and think it’s really awesome, you are not alone. A lot of people think like that. If you decide to attend next year and want to participate as a speaker, I can assure you that it is totally possible. If you have any questions about that, let me know. I can share a lot about my journey.
00:01:25.790 Now that we have that out of the way, my name is Helio Cola. I've been in software development for about 15 years. I spent about ten years working with C, C++, Solaris, and Linux environments before I switched to Ruby on Rails. It has been about five or six years that I have been working with Ruby on Rails. If you want to find me, you can look me up online.
00:01:55.100 Let's talk about the Ruby garbage collector. Throughout this presentation, I will use terms like GC, which stands for Garbage Collector, and IGC, which is the Incremental Garbage Collector. I'm also going to refer to MG, which is the restricted incremental garbage collector. My talk will cover several topics: why I’m here discussing garbage collection, a little history of how Ruby's garbage collector evolved, some configuration parameters, and how to measure and tune these parameters.
00:03:11.510 So, why am I here discussing garbage collection? A while ago, in our company, we were working on a Rails app, and we lacked insights into how the application was behaving. We decided to install some fancy monitoring tools, and that’s how it all started. I got exposed to this information, and I started investigating.
00:03:39.470 On the left side of the monitoring tool, I saw that the garbage collector (GC) runs were mostly blue, while on the right side, the runs were mostly yellow, showing minor and major collections. I noticed that on the left side, the GC was running 80 times per 100 transactions, whereas on the right side it was running only 46 times. This made me curious about why my left-side app was not performing as well as the right-side app.
00:04:11.419 This curiosity led me to research extensively. I read everything I could about the Ruby garbage collector, and I learned a lot during that time. I found that while there was plenty of documentation available, there weren't many discussions focused on the Ruby garbage collector specifically. As a result, I decided it was essential to share what I learned.
00:04:53.030 Before diving into how I approached the tweaking of GC parameters, I'd like to give you a brief overview of how the algorithms evolved. So, how many of you have ever changed the tuning of a Ruby garbage collector in production? That’s a handful of people, interesting!
00:05:15.460 Moving on to the algorithms, I'm going to skim over them. I won’t get into too much detail, but Ruby 1.8 used a simple mark-and-sweep algorithm, while 2.0 introduced bitmap marking. Ruby 2.1 introduced the generational garbage collector, which builds on the idea that 'objects die young.' If an object survives a garbage collection run, it gets promoted to an older generation.
00:06:01.540 The generational strategy allows the garbage collector to spend less time working on objects that have been around longer, as they are less likely to be garbage. As I dove deeper into my research, I found a blog that visually expressed the differences between these algorithms, leading to a significant 'aha' moment for me.
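The split between cheap minor collections and full major collections can be observed from inside a running process. The following is a minimal sketch, assuming MRI Ruby 2.2 or later, where `GC.stat` exposes the keys `:minor_gc_count`, `:major_gc_count`, and `:count`. The helper name `gc_counts` and the workload are illustrative and not taken from the talk.

```ruby
# Sketch: count minor vs. major GC runs around a burst of short-lived objects.
# Assumes MRI Ruby 2.2+ key names for GC.stat; gc_counts is a hypothetical helper.

def gc_counts
  stats = GC.stat
  {
    minor: stats[:minor_gc_count],
    major: stats[:major_gc_count],
    total: stats[:count]
  }
end

before = gc_counts

# Allocate many strings that become garbage immediately ("objects die young").
200_000.times { |i| "temporary string #{i}" }

after = gc_counts

# Difference between the two snapshots, per counter.
delta = after.each_with_object({}) { |(key, value), h| h[key] = value - before[key] }

puts "minor GC runs: #{delta[:minor]}"
puts "major GC runs: #{delta[:major]}"
puts "total GC runs: #{delta[:total]}"
```

On a workload like this one, most of the runs would be expected to be minor, but the exact numbers depend on the Ruby version and the current heap state.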
00:06:57.330 I’ll quickly show you how the mark-and-sweep algorithm works. The mark phase runs while the program is stopped, marking objects that are still referenced. The sweep phase then frees the unreferenced objects. Ruby 1.9.3 uses the same mark phase but sweeps lazily: instead of sweeping the whole heap at once, it sweeps only as new objects are requested, which shortens each pause.
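To make the two phases concrete, here is a toy mark-and-sweep over a hand-built object graph. It illustrates the idea only and is not how MRI implements its collector. The `Node` struct, the `mark` and `sweep` helpers, and the four-node graph are all made up for this sketch.

```ruby
# Toy mark-and-sweep. Illustration only; not MRI's actual implementation.
# Node, mark, and sweep are hypothetical names introduced for this example.

Node = Struct.new(:name, :refs, :marked)

# Mark phase: flag every node reachable from a root.
# Uses an explicit stack instead of recursion.
def mark(root)
  stack = [root]
  until stack.empty?
    current = stack.pop
    next if current.marked
    current.marked = true
    stack.concat(current.refs)
  end
end

# Sweep phase: anything left unmarked is garbage.
def sweep(heap)
  live, garbage = heap.partition(&:marked)
  live.each { |node| node.marked = false } # reset marks for the next cycle
  [live, garbage]
end

a = Node.new("a", [], false)
b = Node.new("b", [], false)
c = Node.new("c", [], false)
d = Node.new("d", [], false)
a.refs << b # a -> b
c.refs << d # c -> d, but nothing references c

heap  = [a, b, c, d]
roots = [a]

roots.each { |root| mark(root) }
live, garbage = sweep(heap)

puts "live:    #{live.map(&:name).join(', ')}"    # live:    a, b
puts "garbage: #{garbage.map(&:name).join(', ')}" # garbage: c, d
```

Note that `c` and `d` are collected even though `c` still references `d`: what matters is reachability from a root, not whether any reference exists.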
00:07:40.180 Ruby 2.0 improved memory management even further with a non-recursive mark phase. If you plug a Ruby 2.0 application into a fancy monitoring tool, you will see the GC runs are more descriptive. In Ruby 2.1 and later, it’s important to note that the garbage collector uses a generational approach, which improves performance by making most GC runs cheaper: minor collections examine only young objects.
00:08:56.080 Ruby 2.2 introduced the incremental garbage collector, and it also made symbols collectable, so they no longer accumulate in the Ruby VM's memory as before. The incremental garbage collector splits a long pause into several short ones, which improves the user experience.
00:09:44.310 As the Ruby garbage collector evolved, so did the configuration parameters. Ruby 2.0 had three parameters: RUBY_GC_MALLOC_LIMIT, RUBY_HEAP_MIN_SLOTS, and RUBY_FREE_MIN. The last two define how many slots to allocate during your application startup and the minimum number of slots that should be free after a GC run. Ruby 2.1 increased this to 11 configuration parameters.
00:10:58.230 These include parameters like RUBY_GC_HEAP_GROWTH_FACTOR, RUBY_GC_HEAP_OLDOBJECT_LIMIT_FACTOR, and others. During my tests, I played around with some of these parameters, although many remained at their default values because they were sufficient for my application. However, I did discover configurations that improved performance.
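As a reference, the 11 tuning variables can be listed and checked against the current environment. The sketch below uses the variable names as documented for MRI Ruby 2.1 and later; the constant `GC_TUNING_VARS` and the helper `gc_tuning_report` are hypothetical names for this example, and the script only reports what is set without changing anything.

```ruby
# Sketch: report which GC tuning variables are set in the environment.
# Variable names follow the MRI 2.1+ documentation; GC_TUNING_VARS and
# gc_tuning_report are hypothetical names introduced for this example.

GC_TUNING_VARS = %w[
  RUBY_GC_HEAP_INIT_SLOTS
  RUBY_GC_HEAP_FREE_SLOTS
  RUBY_GC_HEAP_GROWTH_FACTOR
  RUBY_GC_HEAP_GROWTH_MAX_SLOTS
  RUBY_GC_HEAP_OLDOBJECT_LIMIT_FACTOR
  RUBY_GC_MALLOC_LIMIT
  RUBY_GC_MALLOC_LIMIT_MAX
  RUBY_GC_MALLOC_LIMIT_GROWTH_FACTOR
  RUBY_GC_OLDMALLOC_LIMIT
  RUBY_GC_OLDMALLOC_LIMIT_MAX
  RUBY_GC_OLDMALLOC_LIMIT_GROWTH_FACTOR
].freeze

# Returns a hash of variable name => configured value, or "(default)" when
# the variable is not present in the given environment.
def gc_tuning_report(env = ENV)
  GC_TUNING_VARS.map { |name| [name, env.fetch(name, "(default)")] }.to_h
end

gc_tuning_report.each do |name, value|
  puts format("%-40s %s", name, value)
end
```

Printing this at boot is one way to keep a record of which defaults were overridden on a given host.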
00:11:51.740 One of the key things to remember is that if you change defaults, you should document your changes. Tune the parameters to your application's needs, because not every app is the same. You may not need to touch the parameters at all if your application runs smoothly.
00:12:25.390 To test configuration parameters on your local machine, you can set them as environment variables. You export those variables in a Linux environment, and when you start the Ruby VM, it takes them into account.
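The same effect as exporting a variable in the shell can be had by starting a child Ruby process with an explicit environment. This is a minimal sketch, assuming MRI and the standard `open3` and `rbconfig` libraries; the value `1.25` is an arbitrary example, not a recommendation from the talk.

```ruby
# Sketch: start a child Ruby VM with a GC tuning variable set.
# The value below is an arbitrary example, not a recommended setting.
require "open3"
require "rbconfig"

gc_env = { "RUBY_GC_HEAP_GROWTH_FACTOR" => "1.25" }

# The child simply echoes the variable back, to confirm it was inherited.
script = 'print ENV.fetch("RUBY_GC_HEAP_GROWTH_FACTOR", "unset")'

# Open3.capture2 accepts an environment hash as its first argument and
# returns the child's standard output plus its exit status.
output, status = Open3.capture2(gc_env, RbConfig.ruby, "-e", script)

puts "child exited successfully: #{status.success?}"
puts "child saw RUBY_GC_HEAP_GROWTH_FACTOR=#{output}"
```

The GC reads these variables when the VM starts, so they have to be in place before the process launches; assigning to `ENV` inside an already running process does not retune it.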
00:12:56.590 Back to tuning the GC, if you tweak parameters to see changes, it’s essential to implement one change at a time. This way, you can measure the impact of each configuration. When making adjustments, especially if you collect a lot of data, give your brain time to digest the information before drawing conclusions.
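Changing one parameter at a time only pays off if each run is measured the same way. The following sketch wraps a workload and reports how many GC runs it triggered and how much time they took, using `GC.count` and `GC::Profiler` from MRI. The helper name `measure_gc` and the sample workload are assumptions for illustration; a real comparison would run the application's own requests under each setting.

```ruby
# Sketch: measure GC runs and GC time for a block of work.
# measure_gc is a hypothetical helper; the workload below is a stand-in.

def measure_gc
  GC::Profiler.enable
  GC::Profiler.clear
  runs_before = GC.count
  yield
  {
    gc_runs: GC.count - runs_before,
    gc_seconds: GC::Profiler.total_time # seconds spent in profiled GC runs
  }
ensure
  GC::Profiler.disable
end

result = measure_gc do
  # Stand-in workload: build and discard many small arrays.
  100_000.times { |i| [i, i.to_s, i * 2] }
end

puts "GC runs:    #{result[:gc_runs]}"
puts "GC seconds: #{result[:gc_seconds].round(4)}"
```

Running the same workload once per configuration and recording both numbers gives a like-for-like comparison between settings.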
00:14:36.420 In conclusion, I encourage you to document your findings, share them with your team, and make small presentations or discussions about what you learned. This could lead you to opportunities to present at conferences like RailsConf in the future. It’s rewarding to share knowledge and experiences with the community.
00:15:16.340 Now let me share some examples of the parameters I changed, which helped improve performance. These include HEAP_SLOTS, HEAP_LIMIT, and HEAP_GROWTH parameters that I adjusted during my testing for the production environment.
00:15:52.240 Through benchmarks, I was able to compare the response times before and after the parameter changes. After adjusting, it was evident that my application experienced faster performance with notably reduced garbage collection times.
00:16:12.400 When looking at the metrics, I noticed a drastic reduction in the time my application spent in garbage collection after making the parameter adjustments. It seemed like a considerable improvement in response time, especially in high-load scenarios.
00:16:34.080 In my final tests, going from Ruby 2.0 to 2.1 improved the average GC duration from 80ms to 20ms per transaction, demonstrating the effectiveness of parameter tuning in real-world applications.
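For scale, the drop from 80 ms to 20 ms of GC time per transaction can be expressed as a percentage. The helper `percent_reduction` is a hypothetical name for this small worked calculation.

```ruby
# Worked arithmetic for the figures quoted above.
# percent_reduction is a hypothetical helper introduced for this example.

def percent_reduction(before, after)
  ((before - after) / before.to_f * 100).round(1)
end

reduction = percent_reduction(80, 20) # average GC ms per transaction
puts "GC time per transaction fell by #{reduction}%" # 75.0%
```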
00:18:10.240 With that said, I have covered transaction metrics and how essential it is to monitor your applications in production. Having robust monitoring tools allows you to spot issues early and adjust your parameters accordingly.
00:18:49.570 I appreciate your attention, and if you have any questions, feel free to ask. [Answering an audience question] Yes, changing GC parameters may increase memory usage: I found that allocating more memory let the GC run less frequently in some scenarios. It’s a trade-off between memory usage and time spent in GC, so it’s important to find the right balance.
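One way to keep an eye on this trade-off is to record heap occupancy alongside the GC run count. The sketch below assumes MRI Ruby 2.2 or later, where `GC.stat` exposes `:heap_live_slots` and `:heap_free_slots`; the helper `heap_snapshot` and the retained array are illustrative only.

```ruby
# Sketch: snapshot GC run count and heap slot usage before and after
# retaining a batch of objects. Assumes MRI Ruby 2.2+ GC.stat key names;
# heap_snapshot is a hypothetical helper.

def heap_snapshot
  stats = GC.stat
  {
    gc_runs: stats[:count],
    live_slots: stats[:heap_live_slots],
    free_slots: stats[:heap_free_slots]
  }
end

before = heap_snapshot

# Keep these objects referenced on purpose, so they occupy heap slots.
retained = Array.new(100_000) { |i| "retained #{i}" }

after = heap_snapshot

puts "retained objects: #{retained.size}"
puts "GC runs:    #{before[:gc_runs]} -> #{after[:gc_runs]}"
puts "live slots: #{before[:live_slots]} -> #{after[:live_slots]}"
puts "free slots: #{before[:free_slots]} -> #{after[:free_slots]}"
```

Comparing these snapshots across configurations shows whether a setting that reduces GC runs does so at the cost of a larger heap.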
00:20:30.500 In conclusion, I hope this presentation has provided you with insights into tweaking Ruby GC parameters and how this can lead to significant improvements in performance. Thank you all for your time!