Performance Optimization
Halve Your Memory Usage With These 12 Weird Tricks

Summarized using AI


Nate Berkopec • November 11, 2016 • Cincinnati, OH

In his talk at RubyConf 2016, Nate Berkopec discusses strategies to reduce memory usage in Ruby applications, offering twelve 'weird tricks.' The core idea presented is that many developers misunderstand Ruby's memory allocation and garbage collection, leading to unnecessary performance issues.

  • Understanding Memory Allocation: Berkopec begins by explaining that Ruby’s memory management and garbage collection often mislead developers into believing they have memory leaks. He emphasizes that Ruby is a garbage-collected language, which simplifies memory management but can obscure how memory actually evolves over time.

  • Common Misconceptions: He notes that many issues developers experience stem from not accurately understanding memory behavior, especially during the initial hours of application runtime. New object allocations and caching behaviors contribute to memory increases that are normal in the Ruby environment.

  • Solution Examples: Berkopec introduces several solutions, beginning with reducing the number of application instances to manage memory more effectively. Insights into how Puma and Sidekiq applications react to memory limits offer evidence of this approach in action.

  • Garbage Collection Insights: He dives deep into how garbage collection works in Ruby, clarifying that it does not always reclaim memory immediately or efficiently due to complexities like heap fragmentation. He emphasizes the need to consider the threshold-based nature of Ruby’s garbage collection, which operates on dynamic limits rather than fixed schedules.

  • Heap Fragmentation: Berkopec discusses the challenges created by heap fragmentation, a situation where memory is not optimally reused due to active references to allocated objects. Understanding this concept is crucial to tackling slow, invisible increases in memory usage that can affect application performance.

  • Long-Term Memory Monitoring: He advocates for monitoring memory usage over longer periods to avoid premature conclusions about leaks based on short-term metrics. Aiming for targeted memory usage metrics, like 300 MB per instance for Rails applications, can significantly improve performance.

In conclusion, Berkopec's talk equips Ruby developers with actionable strategies to streamline memory usage, urging them to thoroughly understand Ruby's memory management and to carefully monitor their applications over time to ensure optimal performance. This knowledge is essential to effectively debugging and enhancing Ruby programs, ultimately contributing to a more stable and efficient application environment.


RubyConf 2016 - Halve Your Memory Usage With These 12 Weird Tricks by Nate Berkopec

How does Ruby allocate memory? Ever wonder why that poor application of yours is using hundreds of megabytes of RAM? Ruby's garbage collector means that we don't have to worry about allocating memory in our programs, but an understanding of how C Ruby uses memory can help us avoid bloat and even write faster programs. We'll talk about what causes memory allocation, how the garbage collector works, and how to measure memory activity in our own Ruby programs.


00:00:14.290 I think we'll get started. I have a lot to cover today, so I'll probably need my full time allotment. Thank you for coming, it's 4:20 p.m.
00:00:19.900 Thank you for skipping your smoke breaks to attend this talk. Alright, so let's talk about halving your memory usage with twelve weird tricks. The eleventh one will shock you.
00:00:25.420 My name is Nate Berkopec, and I run an independent, one-man consultancy called Speed Shop. I work on people's Ruby and Rails applications to try to improve their performance and scalability.
00:00:36.610 So let's talk about memory. The inspiration for this talk comes from the observation that a good number of the people who come to me to fix their Ruby applications' performance have memory issues. Even if they don't come to me specifically for that, I often say we have to fix the memory issues first.
00:00:54.450 I'm also very active in reading the Puma and Sidekiq GitHub repositories. If you look at any of those, especially Sidekiq, about 90% of the issues are related to memory. For example, people say, "My app uses too much memory," or, "I switched to Puma, and now I have a memory leak," or "I switched to Sidekiq, and now I'm using 300 gigabytes of memory." About 90% of these situations are not actual leaks or bugs, but simply misunderstandings of how Ruby works with memory.
00:01:14.350 So part of this talk is going to focus on discussing those misconceptions and correcting them, while also providing some real solutions to fix the genuine problems you may have. We often think we're leaking memory all the time, but really, we're not. Ruby is a garbage-collected language. The common thought goes like this: "My memory is going up, therefore that must be a memory leak." However, that is just not the case.
00:01:32.000 As Ruby programmers, we are allowed not to think about memory, and that's a good thing. Thank goodness we don't have to call malloc and free on our own; otherwise, we'd all be C programmers. It's okay and even expected that, as Ruby programmers, we may not fully understand what's happening at the memory level.
00:01:52.630 Puma's most-commented issue ever, with 172 comments, is about memory usage increasing over time. I want you to remember the shape of that graph, because it's very interesting. The thread is dozens of people saying they switched to Puma and now they have a memory leak, or their processes are using four gigabytes of memory when Unicorn didn't do that. Most of them don't actually have a leak or a memory problem at all.
00:02:09.340 So, solution one of our twelve solutions here is to dial back the number of application instances. A lot of the people in that thread didn't actually understand how much memory one instance of their application used. This is quite common: people hit an R14 error on Heroku, or they are reaching memory limits on their AWS instance, or they are using a worker killer that terminates workers after a certain memory threshold is reached.
00:02:32.440 They often never truly learn how much memory their application would use if it simply ran for 18 or 24 hours. Does the memory usage eventually level off? People tend to look at their memory usage over too short a time frame.
00:02:57.380 What we first need to do is dial back the number of instances we are using to avoid hitting any worker killer limits or the limits of our server or container. If you are encountering either of these situations, you do not know how much memory you're actually using in the long term.
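One low-tech way to learn your real long-term usage is to log each process's resident set size over time. A minimal sketch (the `ps` flags are standard on Linux and macOS; the helper name and sampling approach are mine, not from the talk):

```ruby
# Sample this process's resident set size (RSS), in megabytes.
# `ps -o rss=` prints RSS in kilobytes for the given pid.
def rss_mb
  `ps -o rss= -p #{Process.pid}`.to_i / 1024.0
end

# Log this periodically and look at the curve over 18-24 hours,
# not over 30 minutes.
puts format("RSS: %.1f MB", rss_mb)
```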
00:03:11.770 It's bizarre to me how many Ruby applications I see that are like literal sinking ships: they're running out of memory, but it's so easy to fix. You could just turn down the amount of memory pressure you're applying. I very rarely come across someone whose application hit an out-of-memory error while running just a single instance of it.
00:03:27.550 When I talk about instances, I refer to Puma workers. Unicorn may call them workers or something similar; each forked process is part of your application. It's worth noting that threads share memory — processes only share a limited amount of it, and we will get to that in a minute.
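In Puma terms, dialing back instances means lowering the `workers` count in your config. An illustrative `config/puma.rb` sketch (the environment variable names are common Heroku/Rails conventions, not something from the talk):

```ruby
# config/puma.rb -- fewer workers means less total memory.
# Threads within a worker share memory, so they are much cheaper
# than additional forked workers.
workers Integer(ENV.fetch("WEB_CONCURRENCY", 2))

threads_count = Integer(ENV.fetch("RAILS_MAX_THREADS", 5))
threads threads_count, threads_count

preload_app!  # load the app before forking so workers share more pages
```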
00:03:41.440 The myth here is that memory usage should remain constant: people expect Ruby's memory usage at steady state to look like a long, flat line. That's not accurate; instead, it typically follows a logarithmic curve.
00:03:57.150 There's a lot of information in the shape of that curve. The first two hours after your application starts up are particularly important: code is getting loaded, and not everything gets required at boot.
00:04:13.740 Rails tries to accomplish that, but perhaps your libraries don’t, or your application code is being required in a staggered manner. As code gets loaded, our memory usage naturally increases. We may even be filling up caches; even if you don't implement application caching, you might fill up Rails’ Active Record cache, incorporated in Rails 4.2.
00:04:33.320 You might also be creating connection pools to your database. All these actions create objects, which takes time for those code paths to be activated under production load. Thus, seeing memory increase during the first hour or two is normal.
00:04:54.670 Another thing to remember is that different actions in your application require varying amounts of memory. Simple actions may allocate a couple of thousand objects, whereas more complicated ones might allocate 200 megabytes worth of objects. As you hit those complex paths, expect to see increased memory usage, as Ruby requires a larger heap to manage that action.
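You can see this spread for yourself by counting allocations around a code path with `GC.stat`. A small sketch (the helper name and the two example workloads are illustrative):

```ruby
# Count how many objects a block allocates, using the cumulative
# total_allocated_objects counter (available since Ruby 2.1).
def allocations
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

simple  = allocations { "hello".upcase }
complex = allocations { Array.new(10_000) { |i| "row-#{i}" } }
puts "simple action:  #{simple} objects"
puts "complex action: #{complex} objects"
```

Running a helper like this around your slowest controller actions quickly shows which paths force the heap to grow.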
00:05:12.470 This explains the sharp increase in memory during the first couple of hours after launching a Ruby application. I want to emphasize that it never completely levels out; it just grows more and more slowly.
00:05:28.860 I’ll explain why this happens shortly, but don't expect a steady-state Ruby application that never grows in memory usage. I should clarify that I'm primarily talking about MRI (Matz's Ruby Interpreter) and C Ruby; this discussion mainly applies to those systems.
00:05:43.250 If you are here because you work with JRuby, I apologize, as not everything I discuss regarding memory will apply to you. However, I believe most of us are running MRI in production, which is why I have tailored the talk to this environment.
00:06:01.899 Another issue with the logarithmic memory curve is that you can look at any short portion, like the memory usage over 30 minutes or an hour. It will appear as a steep, linear rise, leading you to believe there's a memory leak.
00:06:14.570 However, you must allow a Ruby process to grow for 18-24 hours. How long you need to wait depends on the load you're under. If you're subject to substantial load, like Shopify, you might see the full curve within 20 minutes because each instance serves so many requests. For most of us, though, you should wait around 12-18 hours.
00:06:28.030 The problem with using worker killers is the imposition of a hard memory cap. If you set this cap too low, you're killing the Ruby process before it reaches its steady state. For example, if the threshold is 1 gigabyte while the true steady state memory use is 2 gigabytes, you’re terminating it after 1 gigabyte.
00:06:42.990 You'll observe a sharp curve in the memory usage graph, making it appear as though you possess a leak, but that's because you aren't allowing enough time for it to prove otherwise.
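If you do use a worker killer, the cap has to sit above the true steady state. A hedged sketch using the `puma_worker_killer` gem (option names as I recall them from the gem; verify the values and API against its README before relying on this):

```ruby
# config/initializers/puma_worker_killer.rb
# The cap must be ABOVE steady-state usage, or workers get killed
# before memory ever levels off and the graph looks like a leak.
PumaWorkerKiller.config do |config|
  config.ram       = 2048  # total MB across workers; illustrative value
  config.frequency = 60    # seconds between checks
end
PumaWorkerKiller.start
```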
00:07:00.469 Consequently, my recommendation is to aim for around 300 megabytes per instance for a general Rails application. This amount is likely to be lower for certain rack applications that simply serve an API.
00:07:12.210 In most Rails applications, though, I believe 300 megabytes is a reasonable goal. I've seen instances use as much as 600 megabytes, which isn't great, but aiming for 300 is advisable. The same expectation applies to Sidekiq; Sidekiq processes should ideally operate within that range.
00:07:30.260 Solution two is to stop allocating too many objects at once. This is crucial, and I will likely spend the most time discussing this point.
00:07:43.930 There is a myth that garbage collection (GC) should automatically clean up all of our unused objects after a job or an action completes, thus reducing memory. However, garbage collection is very lazy. It doesn't run on timers; it operates based on thresholds. Various thresholds in your application trigger GC inside the Ruby VM, and it doesn't run continuously in the background, at least not the parts you care about. The sweeping phase of GC, the part that actually frees memory, activates based on these thresholds.
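You can watch this threshold-driven behavior directly: the GC counters only move when allocation pushes past a limit. A minimal demonstration:

```ruby
# GC.stat exposes how many minor GCs have run. Allocating a burst of
# short-lived objects crosses the free-slot threshold and triggers GC;
# nothing runs on a timer in between.
before = GC.stat(:minor_gc_count)
200_000.times { Object.new }
after = GC.stat(:minor_gc_count)
puts "minor GCs triggered by the allocation burst: #{after - before}"
```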
00:09:14.919 Another reason memory usage doesn’t decrease after freeing objects is something called heap fragmentation, which we will address shortly. It's especially important to understand that the free function doesn't always release memory back to the operating system, and I will elaborate on that soon.
00:09:41.240 Let's discuss the thresholds associated with garbage collection. There are three main ones: the number of free object slots the Ruby VM has, the amount of memory allocated with malloc, and growth of the heap. If we run out of slots, we need more, which triggers garbage collection: Ruby first tries to garbage collect its existing slots, finding objects it no longer requires and reclaiming those slots.
00:10:15.000 In Ruby 2.1 and later, we implemented generational garbage collection, which divides objects into old and new categories. When memory allocated on the heap exceeds a specific threshold, that can also trigger a GC. These thresholds are dynamic; they aren't fixed values but rather change.
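As an aside, the old/new split is visible in `GC.stat` too. Objects that survive enough GC runs (three, by default, in MRI) are promoted to the old generation; a sketch:

```ruby
# Retain a batch of objects and run a few GCs; survivors age and get
# promoted to the old generation, which major GC scans far less often.
before_old = GC.stat(:old_objects)
keep = Array.new(10_000) { Object.new }
3.times { GC.start }
promoted = GC.stat(:old_objects) - before_old
puts "roughly #{promoted} objects promoted to the old generation"
```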
00:10:45.000 For instance, with free slots, we might start off with 10,000 slots. Once we hit that limit, we need more, so Ruby multiplies the heap's size by a growth factor of 1.4, giving us 14,000 slots. If we run out again, it multiplies again, and so on.
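You can see the heap expanding in `GC.stat` as well. (The growth factor is tunable via the `RUBY_GC_HEAP_GROWTH_FACTOR` environment variable, and the default has varied across Ruby versions.) A sketch:

```ruby
# Retaining a large batch of objects forces Ruby to add slots; the
# available-slot count grows multiplicatively, not one slot at a time.
slots_before = GC.stat(:heap_available_slots)
keep = Array.new(500_000) { Object.new }
slots_after = GC.stat(:heap_available_slots)
puts "slots grew from #{slots_before} to #{slots_after}"
```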
00:11:27.370 Now let's touch on heap fragmentation. If you delve into the C source of Ruby's garbage collector, you'll find something called the object space, or heap (the terminology can be confusing). The Ruby object space is divided into pages, and each page contains slots for objects. Each slot is 40 bytes and holds what's known as an RVALUE, which contains the object's data or a pointer to it; a 16-kilobyte page holds roughly 400 such slots.
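MRI exposes these layout numbers at runtime, so you don't have to take them on faith. (On Ruby 3.2+, variable-width allocation renamed `RVALUE_SIZE` to `BASE_SLOT_SIZE`, and `HEAP_PAGE_SIZE` may be absent on very old versions; the sketch handles both.)

```ruby
# Print the slot and page sizes MRI was compiled with.
consts = GC::INTERNAL_CONSTANTS
slot_size = consts[:RVALUE_SIZE] || consts[:BASE_SLOT_SIZE]  # 40 on 64-bit builds
page_size = consts[:HEAP_PAGE_SIZE]
puts "slot: #{slot_size} bytes"
puts "page: #{page_size} bytes (~#{page_size / slot_size} slots)" if page_size
```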
00:12:06.710 Heap fragmentation arises when, for instance, I allocate 600,000 strings and subsequently drop most of their references. If I hold onto references to just a few of those strings, the slots they occupy remain in use, and Ruby cannot move the surviving objects to consolidate them, because the C extension API hands out raw pointers to objects.
00:12:47.760 When faced with fragmentation, Ruby can only return a page (16 kilobytes' worth of slots) to the operating system if no object on that page is still alive. If even one slot on the page holds a live object, returning that memory is impossible. Aaron Patterson has been researching ways to address this, which could substantially improve Ruby's memory management.
00:13:30.860 Regarding the object space, let's consider what happens with larger objects. An RVALUE is just 40 bytes, so if I create a 500-character string, its contents can't fit inside the slot. Instead, Ruby calls malloc to reserve space for the string data outside the object space. As a result, Ruby objects are often stored in two areas: the slot itself and, for larger data, the malloc heap.
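This split is visible with the `objspace` extension: a short string's data embeds in its slot, while a long string needs an extra malloc'ed buffer, so its reported size is larger. (On Ruby 3.2+, medium strings can embed in larger variable-width slots, but the contrast still holds.)

```ruby
require "objspace"

short = "hello"
long  = "x" * 500

puts ObjectSpace.memsize_of(short)  # fits inside its slot
puts ObjectSpace.memsize_of(long)   # slot plus a separate buffer
```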
00:14:12.930 This leads to a complication: when that malloc'ed memory surpasses its own thresholds, garbage collection is triggered in a different way. It's also crucial to understand that free isn't an absolute command; it serves more as a suggestion.
00:14:34.090 When Ruby calls free to return memory, it's hoping the operating system will reclaim it. Yet the allocator may hold on to that memory for itself, depending on which allocator is in use, and the OS may not reclaim it immediately either, which leads to fluctuations in how Ruby's memory usage appears.
00:15:12.320 For example, macOS has a mechanism known as inactive memory. No layer of the stack (the Ruby VM, the allocator, or the operating system) guarantees that the Resident Set Size (RSS) will actually decrease when Ruby tries to free memory.
00:15:47.240 Heap fragmentation also stems from the fact that Ruby can't move slots or pages around, because C extensions rely on raw pointers. Allocating many strings while also creating long-lived objects, like constants, that never get garbage collected interleaves the two in memory.
00:16:23.180 So if you allocate 600,000 strings and create constants along the way, those long-lived constants end up scattered across pages that are otherwise full of short-lived strings, pinning those pages even after the strings are freed.
00:17:12.380 Memory fragmentation therefore often shows up as a slow, quiet increase in memory usage: your application seems stable but steadily consumes more memory. That's another reason why watching Ruby memory usage over long periods, rather than short windows, is crucial.
00:38:12.980 In summary, I recommend aiming for a sensible steady-state memory target per instance. The solutions offered today come from years of practice tuning real Ruby applications, and they can streamline memory usage without adding performance overhead.
00:39:10.310 If you have any questions, please, I’d love to hear from you. Thank you so much for your time, and I hope this talk has brought you valuable insights.