The Hitchhiker's Guide to Ruby GC

Eric Weinstein • November 15, 2015 • San Antonio, TX

The video titled "The Hitchhiker's Guide to Ruby GC," presented by Eric Weinstein at RubyConf 2015, focuses on the intricacies of Ruby's garbage collection (GC) and its significant impact on performance in Ruby applications. The talk emphasizes that performance issues in Ruby are not solely due to common suspects like database queries, but often stem from the complexities of Ruby's object space and garbage collection mechanisms. The main points of the discussion include:

  • Understanding Garbage Collection: Eric begins by debunking myths about Ruby's performance, asserting that garbage collection is a critical factor often overlooked by developers.
  • Historical Context: The talk delves into the history of GC in Ruby, starting from Ruby 1.8 with a mark and sweep approach, evolving to improvements in Ruby 2.0 and 2.2. He highlights the transition from simple algorithms to more sophisticated systems, like generational garbage collection, which optimizes memory management for young vs. old objects.
  • Technical Insights: Eric explains key features such as bitmap marking, copy-on-write, and incremental major GC, detailing how these advancements improve performance and reduce pause times during garbage collection cycles.
  • Tuning Garbage Collection: He cautions against hasty optimizations, emphasizing that tuning GC should only be considered with thorough measurement and understanding of its implications.
  • Case Study: Eric describes a suite of Sinatra applications at a former employer that inflated JSON from Java services into large hashes. Moving that code to a more object-oriented design inadvertently increased object allocation and GC workload, which caused memory problems and significant slowdowns. The case illustrates the need for mindful memory management in Ruby applications.
  • Conclusion and Recommendations: Eric closes by urging developers to understand Ruby's memory management, to profile carefully before tuning any GC parameters, and to upgrade to newer Ruby versions to benefit from their GC improvements. The talk serves as a guide for Ruby developers who want to understand garbage collection and its role in application performance.

The Hitchhiker's Guide to Ruby GC by Eric Weinstein

When Ruby programs slow down, the usual culprits—database queries, superlinear time complexity—aren't always the real problem. Ruby's object space and garbage collection are a surprisingly rich and oft-misunderstood area of the language, and one where performance issues can easily hide. This talk is a brief but deep dive into the history and details of garbage collection in Ruby, including its evolution, parameter tuning, and a case study using the Unicorn web server.

RubyConf 2015

00:00:16.320 Hey everyone, can you hear me? Okay, good, rock on! I think we'll go ahead and get started.
00:00:24.359 So yes, this is the Hitchhiker's Guide to Ruby GC. I tend to speak very quickly, so if I start talking too fast, just let me know.
00:00:32.160 I get excited because, you know, Ruby is exciting, and garbage collection is exciting.
00:00:38.360 If I start talking way too fast, you guys can sort of signal me, and that would be awesome.
00:00:45.960 Unlike Justin, however, I don't have any 10x insights for you.
00:00:51.480 I do make silly jokes like Aaron does, but unlike Aaron, I'm not going to offer you life-changing insights into the Ruby Virtual Machine.
00:00:58.840 My talk is, in fact, literally garbage— or rather, it's about garbage.
00:01:06.920 Speaking of garbage, did you guys hear about the controversy around the show Dirty Jobs?
00:01:12.479 Basically, it has something to do with the huge number of microservices that they're employing.
00:01:17.680 You can laugh if you want; at least someone did!
00:01:23.320 Yeah, I also don't enjoy my puns nearly as much as Aaron does, so I'm going to skip that.
00:01:30.119 Anyway, I should say I'm learning a lot at this conference—yeah, that was a terrible joke.
00:01:38.960 I'm learning a lot at RubyConf, and I'm trying to apply it immediately.
00:01:44.280 Gary had a great talk about ideology and belief, which is something I'm trying to internalize.
00:01:50.360 But you know, it's hard, right? We don't know what we don't know, and sometimes even when we do know it, we are unable to disarm ourselves.
00:01:58.360 This is my first time giving this talk; in fact, this is only my second talk ever.
00:02:04.560 I spoke earlier this year at RailsConf, so I'm going to try to face my own imposter syndrome.
00:02:11.680 Also, you'll see that my nervousness manifests in obnoxious slide transitions.
00:02:17.360 Anyway, my name is Eric Weinstein. I work at a company called Condé Nast.
00:02:22.920 I don't know if you've heard of Condé Nast; you're probably familiar with the various brands like Wired, The New Yorker, Vogue, GQ, etc.
00:02:29.519 I really enjoy writing Ruby. I write JavaScript a lot at work, so it's nice to be able to write Ruby for some of those projects and in my own time.
00:02:35.840 The Condé Nast folks are all using Ruby and Rails, so we are hiring.
00:02:43.840 I feel obligated to say that we're hiring.
00:02:49.959 And you might be interested in this strange hash and why it's a hash rather than an object, such as a person.
00:02:55.200 This foreshadows a lot of literary devices in this talk.
00:03:00.599 Also, obligatory self-promotion: I wrote the Ruby curriculum on Codecademy a couple of years ago.
00:03:06.360 I also wrote a book called Ruby Wizardry for kids ages 8 to 12.
00:03:12.480 There's a really great Birds of a Feather session organized by Jay McGavren, who is doing a talk on method lookup later in this room.
00:03:18.480 I highly encourage you to go to that talk as well.
00:03:26.280 If you're interested in learning Ruby or teaching Ruby to kids, please come see me after the show.
00:03:31.840 No Starch Press has been delightful and is offering a 40% discount.
00:03:37.480 If anyone wants the book, use the code RUBYCONF2015 for 40% off.
00:03:43.840 Now, let’s talk about garbage collection.
00:03:49.840 There's a lot of mythology around GC and GC tuning, and it's not as bad as it seems.
00:03:56.360 For those unfamiliar, "Don't Panic" is emblazoned in large friendly letters on the cover of The Hitchhiker's Guide to the Galaxy.
00:04:02.560 This talk is as much for my benefit as it is for yours.
00:04:10.439 Part zero: because this is a computer-type conference, we should start at zero.
00:04:16.160 Ruby is not slow. Okay, sometimes it can be, but not for the reasons you think.
00:04:22.919 People may think their Ruby program is slow due to database issues or superlinear time complexity from nested loops.
00:04:28.639 Or they may claim Ruby is an interpreted language that can't possibly be fast and that we should all be using Go or Java.
00:04:34.280 What I've found is that when Ruby programs slow down, garbage collection is often the culprit.
00:04:41.880 The object space and GC are a rich part of the language.
00:04:48.360 Not surprisingly, when there's a lot of richness, there’s a lot of nuance and performance bugs can hide.
00:04:54.560 The statement shown on the screen does make some operations slower than in compiled languages.
00:05:01.520 However, we have this architecture because everything is an object.
00:05:07.840 Let's take some time to talk about the history of garbage collection in Ruby.
00:05:14.300 First, we are talking about MRI, or CRuby.
00:05:20.100 We are not discussing Rubinius or JRuby, which have different garbage collectors.
00:05:26.550 I wanted to include them in this talk, but I only have so much time.
00:05:32.380 I also don't know nearly as much about them, but if you’re interested, come find me with your cool insights.
00:05:39.750 I may make comparisons to these alternate implementations when it makes sense.
00:05:47.200 Let’s talk about Ruby 1.8.7.
00:05:53.600 Ruby 1.8.7 uses tracing rather than reference counting for garbage collection.
00:05:59.760 Tracing garbage collection involves searching for reachable objects in the object graph.
00:06:06.520 By traversing objects, we can find out if they should be marked as active.
00:06:15.200 If an object is unreachable, it is eligible for collection.
00:06:21.920 Ruby 1.8.7 used a simple mark and sweep algorithm.
00:06:28.500 Mark and sweep was invented by John McCarthy in 1958 or 1959.
00:06:35.160 It's astounding that Ruby has gone so far with such a simple garbage collection method.
00:06:41.960 In the mark and sweep method, Ruby allocates a certain amount of memory.
00:06:48.180 When there's no more free memory, Ruby looks for all the active objects, marks them as active, and sweeps the inactive ones onto a new list.
00:06:55.080 This is a broad overview of what Ruby is doing.
00:07:01.680 The important thing to know is that everything stops.
00:07:07.080 Reachability in the object graph can change during execution.
00:07:13.040 To mark and sweep effectively, the collector has to stop the world. If you hear 'stop the world garbage collection,' this is what they mean.
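The mark and sweep cycle described above can be sketched as a toy simulation. This is only an illustration under stated assumptions: `Node`, `mark`, and `sweep` are hypothetical names, and MRI's real collector is written in C and works on heap slots rather than Ruby structs.

```ruby
# Toy mark-and-sweep over a hand-built object graph.
# Node, mark, and sweep are hypothetical names for illustration only.
Node = Struct.new(:name, :refs, :marked)

# Mark phase: flag everything reachable from the roots.
def mark(roots)
  stack = roots.dup
  until stack.empty?
    node = stack.pop
    next if node.marked
    node.marked = true
    stack.concat(node.refs)
  end
end

# Sweep phase: unmarked nodes go on the free list; survivors are
# unmarked again so the next cycle starts clean.
def sweep(heap)
  free_list, live = heap.partition { |node| !node.marked }
  live.each { |node| node.marked = false }
  [live, free_list]
end

a = Node.new("a", [], false)
b = Node.new("b", [], false)
c = Node.new("c", [], false)
d = Node.new("d", [], false)
a.refs << b   # a -> b
c.refs << d   # c -> d, but nothing reachable points at c
d.refs << c   # an unreachable cycle is still garbage

mark([a])
live, free_list = sweep([a, b, c, d])
puts "live: #{live.map(&:name).inspect}, freed: #{free_list.map(&:name).inspect}"
```

Nothing else can safely run between `mark` and `sweep` here, because a change to `refs` in that window could invalidate the marks. That is the reason a simple collector has to stop the world.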
00:07:19.600 So, in Ruby 1.9.3, we got improvements with lazy mark and sweep.
00:07:26.480 This improves stop times by sweeping in phases.
00:07:31.680 It doesn't affect the total time spent collecting garbage, but it reduces the longer pauses.
00:07:39.000 If you have a highly event-driven or I/O-driven application, not sitting for several seconds while collecting garbage is a big win.
00:07:46.920 However, both 1.8.7 and 1.9.3 subverted native copy-on-write.
00:07:54.240 As for lazy sweeping, it doesn't reduce the overall pain of stopping the world; it amortizes the pain over more sweeps.
00:08:00.480 In Ruby 2.0, we introduced bitmap marking.
00:08:07.359 Instead of marking objects directly, we now have a bitmap that represents the state of objects and their eligibility for collection.
00:08:14.240 This will be extremely important later because it allows us to use copy-on-write.
00:08:20.600 I'll be covering all these improvements in depth soon.
00:08:26.000 If, like Aaron, I don't die during this presentation, I'll be turning 30 in March.
00:08:32.000 I guess I’m officially entering old age.
00:08:39.680 Let’s talk broadly about Ruby 2.1. It has generational garbage collection.
00:08:46.480 There are two generations: a young one and an old one.
00:08:52.360 In 2.1, an object that survives a collection is considered old; as of 2.2, it has to survive three.
00:08:59.360 This is based on what's called the weak generational hypothesis: objects die young.
00:09:06.000 There are many objects that appear and then are gone.
00:09:12.160 Because of this, it makes sense to do fast minor GCs frequently and the slower stop-the-world collections less frequently.
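One way to watch the two kinds of collection is to compare `GC.stat` counters before and after a workload. The sketch below assumes Ruby 2.2 or later for the `:minor_gc_count` and `:major_gc_count` keys; `gc_counts_during` is a hypothetical helper name, and the exact numbers depend on the Ruby version and heap state.

```ruby
# Count minor and major collections that happen while a block runs.
# gc_counts_during is a hypothetical helper; GC.stat is MRI's own API.
def gc_counts_during
  before = GC.stat
  yield
  after = GC.stat
  {
    minor: after[:minor_gc_count] - before[:minor_gc_count],
    major: after[:major_gc_count] - before[:major_gc_count]
  }
end

counts = gc_counts_during do
  # Many short-lived strings: the "objects die young" case.
  200_000.times { |i| "temporary string #{i}" }
end

puts "minor GCs: #{counts[:minor]}, major GCs: #{counts[:major]}"
```

For a workload like this one, minor collections would typically outnumber major ones, although that is an expectation rather than a guarantee.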
00:09:18.000 If you're interested in the RGenGC algorithm, Koichi has a talk from EuRuKo a few years ago, which is excellent.
00:09:26.800 Now, let's go into more depth about 2.2.
00:09:34.000 We'll talk about copy-on-write, bitmap marking, and the tuning of garbage collection.
00:09:40.000 Let's start with two significant topics: incremental and symbol GC.
00:09:46.080 Symbol GC is something you might be familiar with if you’ve been writing large Rails applications.
00:09:53.240 There’s a notion of symbol denial-of-service: if you generate a lot of new symbols, they never get collected.
00:10:00.520 The new feature allows the collection of some symbols when they are no longer referenced.
00:10:07.200 However, not all symbols can be collected because Ruby internally generates symbols for method names.
00:10:13.600 If you dynamically generate many methods, each one getting their own symbol, that can lead to memory issues.
00:10:19.040 The really cool addition in Ruby 2.2 is incremental major GC.
00:10:26.120 Many of you might be familiar with tricolor marking.
00:10:33.120 The idea involves three types of objects: white, gray, and black.
00:10:39.720 White objects are unmarked, gray are marked and may refer to some white objects, and black are marked without referring to any white.
00:10:45.440 At the start, all objects are white, and clearly living objects, such as those on the stack, are marked gray.
00:10:51.360 We pick a gray object, visit everything it references, and mark those objects gray.
00:10:57.120 Then we change the original object from gray to black, and repeat until no gray objects remain.
00:11:04.560 This algorithm allows us to sweep away unmarked objects.
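The tricolor steps above can be sketched in a few lines. This is a toy model, not MRI's implementation; `Obj` and `tricolor_mark` are hypothetical names introduced for illustration.

```ruby
# Toy tri-color marking over a small object graph.
Obj = Struct.new(:name, :refs, :color)

def tricolor_mark(heap, roots)
  heap.each { |o| o.color = :white }      # 1. everything starts white
  roots.each { |o| o.color = :gray }      # 2. roots are clearly alive
  gray = roots.dup
  until gray.empty?                       # 3. handle one gray object at a time
    obj = gray.shift
    obj.refs.each do |child|
      next unless child.color == :white
      child.color = :gray
      gray << child
    end
    obj.color = :black                    #    all of its references were visited
  end
  heap.select { |o| o.color == :white }   # 4. anything still white is garbage
end

root   = Obj.new("root", [], nil)
child  = Obj.new("child", [], nil)
orphan = Obj.new("orphan", [], nil)
root.refs << child

garbage = tricolor_mark([root, child, orphan], [root])
puts "garbage: #{garbage.map(&:name).inspect}"
```

Because each pass of the loop handles a single gray object, the work could in principle be paused between passes and resumed later, which is what makes an incremental collector possible.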
00:11:10.080 However, there is a bug: if a black object gains a reference to a new white object and no gray object references it, we can inadvertently reclaim a live object.
00:11:16.160 Write barriers help us prevent this issue.
00:11:23.240 For complicated reasons, CRuby does not have write barriers everywhere.
00:11:29.920 This leads to write-barrier-protected and write-barrier-unprotected objects.
00:11:36.960 The pause time corresponds to the number of living write-barrier-unprotected objects.
00:11:44.000 Most objects you create, such as strings, arrays, and hashes, are write-barrier protected.
00:11:50.960 The actual fix involves checking all unprotected black objects before collecting white ones.
00:11:57.760 This helps prevent collecting objects we shouldn’t, where they are still in use.
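The lost-object bug and the write barrier can be reproduced in a toy incremental marker. The sketch below is an assumption-laden model: `Cell`, `IncrementalMarker`, `write_ref`, and `swept_names` are hypothetical names, and it only models barrier-protected objects, not the end-of-marking check for unprotected ones.

```ruby
# Toy incremental marker with an optional write barrier.
Cell = Struct.new(:name, :refs, :color)

class IncrementalMarker
  def initialize(heap, roots, barrier:)
    @heap = heap
    @barrier = barrier
    heap.each { |c| c.color = :white }
    roots.each { |c| c.color = :gray }
    @gray = roots.dup
  end

  # One slice of marking work; the program runs between slices.
  def step
    cell = @gray.shift
    return false unless cell
    cell.refs.each do |child|
      next unless child.color == :white
      child.color = :gray
      @gray << child
    end
    cell.color = :black
    true
  end

  # Every reference store goes through here. With the barrier on, a
  # black parent that gains a white child is pushed back to gray.
  def write_ref(parent, child)
    parent.refs << child
    return unless @barrier && parent.color == :black && child.color == :white
    parent.color = :gray
    @gray << parent
  end

  # Finish marking and report what the sweep would reclaim.
  def finish
    nil while step
    @heap.select { |c| c.color == :white }.map(&:name)
  end
end

def swept_names(barrier:)
  root  = Cell.new("root", [], nil)
  fresh = Cell.new("fresh", [], nil)
  marker = IncrementalMarker.new([root, fresh], [root], barrier: barrier)
  marker.step                    # root is now black
  marker.write_ref(root, fresh)  # a black object now points at a white one
  marker.finish
end

puts "without barrier: #{swept_names(barrier: false).inspect}"
puts "with barrier:    #{swept_names(barrier: true).inspect}"
```

Without the barrier, `fresh` is reported as garbage even though `root` references it. With the barrier, `root` is revisited and `fresh` survives.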
00:12:04.600 So that's a brief history of GC in Ruby from 1.8.7 to 2.2.
00:12:11.480 Now let's talk about GC tuning.
00:12:17.360 If there's one takeaway from this talk, it should be: do not do it.
00:12:23.360 Or rather, for experts only, don’t do it yet.
00:12:29.360 This is a paraphrasing of Michael Jackson—yes, another Michael Jackson.
00:12:35.200 He said to avoid optimization unless you are certain of what you're doing.
00:12:41.680 There are a few variables you can modify to affect how garbage collection is performed.
00:12:48.480 These include the heap growth factor and max slots; lowering either of these will trigger more frequent young object garbage collection.
00:12:54.280 The default growth factor was 1.8 for a long time. Adjusting these parameters can have unintended consequences.
00:13:01.440 Also, lowering the other parameters—like the malloc limit and max—can force more frequent allocations or collections.
00:13:07.760 Again, if you trigger many major garbage collections, you’ll find yourself just sitting around collecting garbage.
00:13:13.920 It's important to realize there are no silver bullets.
00:13:20.360 If you modify any of these variables without proper measuring, you might worsen the situation.
00:13:27.000 There is a lot of mythology around garbage collection that can lead to misguided optimizations.
00:13:33.680 If you decide to tune the garbage collector, measure everything carefully before and after.
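A minimal way to follow the "measure before and after" advice is to wrap a representative workload and record how many GC runs it triggers and how long they take. In this sketch `measure_gc` is a hypothetical helper and the workload is a stand-in; `GC.count` and `GC::Profiler` are MRI APIs.

```ruby
# Record GC runs and total GC time for a block of work.
def measure_gc
  GC::Profiler.enable
  GC::Profiler.clear
  runs_before = GC.count
  yield
  {
    gc_runs: GC.count - runs_before,
    gc_seconds: GC::Profiler.total_time
  }
ensure
  GC::Profiler.disable
end

result = measure_gc do
  # Stand-in workload; replace with something representative of the app.
  100_000.times.map { |i| "row #{i}" }
end

puts "GC runs: #{result[:gc_runs]}, GC time: #{result[:gc_seconds].round(4)}s"
```

Running the same script once with default settings and once with a changed environment variable, for example `RUBY_GC_HEAP_GROWTH_FACTOR=1.2 ruby script.rb` (the file name is a placeholder), gives two sets of numbers to compare. A single run is noisy, so several runs per setting are advisable.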
00:13:39.440 Now, let me delve into the case study portion of the talk.
00:13:48.160 A couple of years ago, I was working at a company with a Ruby application that was very interesting.
00:13:55.000 It was not a Rails application, but rather a suite of several Sinatra applications that had been smashed together.
00:14:01.840 The way it worked is that users would browse to the site, and the Ruby application would make requests to several Java services.
00:14:09.600 These Java services communicated using JSON, which the Ruby app would inflate into hashes.
00:14:16.000 We ended up with huge hashes floating around, which led to people mutating them.
00:14:24.000 Someone got this crazy idea to orient our code around objects, suspecting it was a fad.
00:14:30.640 This approach tanked performance and caused memory problems.
00:14:37.040 Everyone was struggling; New Relic showed we were constantly running out of memory.
00:14:45.040 We realized that by moving towards object-oriented Ruby, we had inadvertently shot ourselves in the foot.
00:14:51.920 Let’s talk about memory and how it all relates to Ruby 1.9.
00:14:58.560 In Ruby 1.9, Ruby objects are 40-byte RVALUE structures allocated into 16-kilobyte heaps.
00:15:05.200 If you're using 64-bit architecture, you can get about 400 Ruby objects per heap.
00:15:10.920 At the start, Ruby gives you approximately 150 heaps.
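The figures above can be checked with simple arithmetic. The constants below restate the numbers from the talk; the calculation ignores any per-heap header overhead, so it lands slightly above the quoted "about 400".

```ruby
# Back-of-the-envelope numbers for the Ruby 1.9 heap layout in the talk.
RVALUE_BYTES  = 40          # one object slot on 64-bit
HEAP_BYTES    = 16 * 1024   # one 16 KB heap
INITIAL_HEAPS = 150         # approximate figure quoted in the talk

def slots_per_heap(heap_bytes, slot_bytes)
  heap_bytes / slot_bytes   # integer division: whole slots only
end

per_heap = slots_per_heap(HEAP_BYTES, RVALUE_BYTES)
puts "slots per heap: #{per_heap}"
puts "slots at startup: #{per_heap * INITIAL_HEAPS}"
```

That works out to roughly 60,000 slots at startup, which puts the half a million live objects mentioned a few lines later into perspective.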
00:15:17.520 You can see through the object space that we were creating way too many objects.
00:15:23.200 We started the web server and saw we had over half a million live objects right off the bat.
00:15:29.680 This seemed completely wrong for a Sinatra app.
00:15:36.800 We were making many more objects than needed for a simple application.
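A count like this can be taken with `ObjectSpace.count_objects`, which reports heap slots by internal type. The sketch below is not the speaker's script; `live_object_summary` is a hypothetical helper around that MRI call.

```ruby
# Snapshot how many heap slots are in use, grouped by internal type.
def live_object_summary
  counts = ObjectSpace.count_objects
  {
    live: counts[:TOTAL] - counts[:FREE],
    strings: counts.fetch(:T_STRING, 0),
    hashes: counts.fetch(:T_HASH, 0),
    plain_objects: counts.fetch(:T_OBJECT, 0)
  }
end

puts live_object_summary.inspect
```

Taking one snapshot right after boot and another after a request shows which object types are growing.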
00:15:43.200 Ruby stores short strings, up to 23 characters, directly in the object slot; longer values live elsewhere and are referenced by a pointer.
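The difference between embedded and heap-allocated strings can be observed with `ObjectSpace.memsize_of` from the `objspace` standard library. `string_bytes` is a hypothetical helper, and the exact sizes and the embedding threshold vary between Ruby versions, so the sketch prints the values rather than asserting them.

```ruby
require "objspace"

# Bytes MRI attributes to a freshly built string of the given length.
def string_bytes(length)
  ObjectSpace.memsize_of("x" * length)
end

[10, 23, 24, 1_000].each do |len|
  puts "#{len} chars: #{string_bytes(len)} bytes"
end
```

On the Ruby versions discussed in the talk, the jump would be expected between 23 and 24 characters; newer versions may embed longer strings.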
00:15:50.760 When creating many objects, the cost can rise dramatically.
00:15:58.000 And, of course, when Ruby goes through garbage collection, linked lists can become cumbersome.
00:16:04.880 When there are no more free RVALUEs, Ruby sets the FL_MARK flag on all active objects.
00:16:11.439 Then, during the sweep phase, it relinks the inactive ones into a free list.
00:16:18.040 With copy-on-write, parent and child processes can share memory until a write occurs.
00:16:24.840 If mark bits are written directly into objects shared with forked processes, every GC run dirties those pages and forces them to be copied.
00:16:31.000 Increases in the number of forks lead to an increased collection of white and black objects.
00:16:37.000 As the number of Unicorn workers increases, the GC issues compound.
00:16:44.240 This is not an inherent flaw with Ruby, but rather a result of the complexities of GC.
00:16:51.560 So, what's not inherently bad?
00:16:57.920 Memory management and bitmap marking have been implemented.
00:17:04.280 Now every heap has a header pointing to a bitmap.
00:17:10.960 This simplifies the marking process, allowing us to leverage copy-on-write effectively.
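The idea of keeping mark bits apart from the objects can be sketched with a toy heap page. `Page` is a hypothetical class, not MRI's data structure; the slots array is frozen only to show that marking never writes to it.

```ruby
# Toy heap page that stores mark bits in a separate bitmap.
class Page
  attr_reader :slots, :mark_bits

  def initialize(slots)
    @slots = slots.freeze   # object memory: never written during marking
    @mark_bits = 0          # one bit per slot, stored apart from the slots
  end

  def mark(index)
    @mark_bits |= (1 << index)
  end

  def marked?(index)
    @mark_bits[index] == 1
  end

  # Indexes of the slots a sweep would reclaim.
  def unmarked_indexes
    slots.each_index.reject { |i| marked?(i) }
  end
end

page = Page.new(%w[a b c d])
page.mark(0)
page.mark(2)
puts page.unmarked_indexes.inspect
```

In a forked process, only the memory holding the bitmap would be written during marking, so the pages holding the objects themselves could stay shared with the parent.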
00:17:16.560 Finally, we need to reduce overhead and unnecessary writes.
00:17:24.160 Unicorn does well at managing processes, but we must be mindful of how many unicorns we run.
00:17:30.440 Every GC run incrementally improves things.
00:17:36.800 Loading the application in Ruby 1.9 invoked 122 GC runs and took about 4.4 seconds.
00:17:43.680 In Ruby 2, simply loading the app only invoked 66 GC runs and took around 3 seconds.
00:17:49.600 This shows a significant improvement that we needed to take advantage of.
00:17:56.800 We upgraded to Ruby 2, which leverages copy-on-write.
00:18:02.160 This also means we adopt newer features while maintaining performance.
00:18:09.120 Number two: we profiled extensively, trying to identify and eliminate the sources of bloat.
00:18:15.920 We utilized the Ruby 2 documentation as well as the ruby-prof gem.
00:18:23.920 We tuned the GC with variables such as malloc limit, managing trigger points for full GC.
00:18:31.920 Making informed adjustments to heap slots afforded us better allocation without pitfalls.
00:18:38.320 This tuning allowed us better growth and fewer stop-the-world pauses.
00:18:45.920 As we dove deeper into Ruby optimization, we addressed a wide range of issues.
00:18:52.960 Significant credit goes to various authors and developers who provided resources, insights, and referrals.
00:18:59.040 Many thanks again for your time and taking a moment to listen to my talk.
00:19:05.600 And of course, obligatory self-promotion: that's the book I wrote.
00:19:12.160 If you're interested in Ruby Wizardry, please reach out to me or look for it through standard channels. Thank you!