MountainWest RubyConf 2015

On Memory

by John Crepezzi

In the talk titled "On Memory" presented by John Crepezzi at the MountainWest RubyConf 2015, the speaker discusses the often-overlooked topic of memory management in Ruby programming. While Ruby's garbage collector simplifies memory management for developers, it's critical to remain aware of memory usage to avoid common pitfalls such as memory bloating, leaking, and faulting.

Key points discussed in the presentation include:

- Evolution of Ruby's Garbage Collector: The garbage collector has progressed through several phases, adapting to better manage memory allocation.

- Memory Profiling Tools: Crepezzi highlights useful tools and methods for memory profiling, including the built-in ObjectSpace API, GC Profiler, and the Memory Profiler gem, which enable developers to track memory allocations and understand object life cycles within the Ruby VM.

- Impact of Measurement on Memory: He introduces the concept of the probe effect, where measuring objects can inadvertently create more objects, affecting accuracy. Solutions to mitigate this can enhance profiling results.

- Lazy Enumerables: Lazy enumerables are discussed as an efficient approach to reducing memory consumption by deferring computation until necessary. This strategy can lead to significant memory savings.

- Memoization and String Freezing: The importance of caching computation results using memoization, and optimizing string operations to prevent unnecessary allocations through string freezing, are also outlined.

- Potential Memory Leaks: The speaker emphasizes the need to be cautious of memory leaks that can result from local variables persisting longer than intended within closures in class definitions.

Overall, the talk encourages developers to proactively consider their memory management strategies, recognize potential memory issues, and utilize profiling tools to optimize Ruby applications effectively. The presentation serves as a vital reminder of the necessity to maintain awareness of object creation and memory allocation to improve performance and application efficiency.

00:00:22.660 Does everyone remember memory? It's easy to forget about memory in our day-to-day Ruby lives. Memory bloating, leaking, and faulting are problems we typically don't have to deal with anymore, because Ruby automatically collects the memory that we allocate. However, it can't collect objects that your code keeps holding references to. Today, I'm going to talk about how easy it is to forget about memory. However, I won't be delving into the story of Ruby's garbage collector.
00:00:50.690 Let me give you a brief overview. The garbage collector in Ruby has evolved over time. It has transitioned from mark-and-sweep, to the lazy sweep, followed by bitmap marking, and finally to generational garbage collection. If you look at the timeline of these changes, you'll see that we should have a new garbage collector by now. Today, I will discuss the current state of memory profiling in Ruby, some helpful tools for dealing with memory profiling, and tips and techniques that you can use in your everyday Ruby code to reduce the number of objects you allocate or to allow them to be freed more quickly.
00:01:23.020 First, a quick introduction about myself. My name is John Crepezzi. I maintain several Ruby projects and have worked on some fairly large ones, such as IceCube, which is a scheduling library. I'm also responsible for hastebin.com, which is unfortunately banned in India. I have no idea why it got banned, but it feels a bit amusing to me.
00:01:59.030 Additionally, I work at RapGenius.com, now rebranded as Genius.com. How many of you are familiar with Genius.com? Just raise your hand. Thank you! For those of you who do not know, it's a site for annotating song lyrics. You can click on different parts of the lyrics to see what the community has interpreted them to mean. You can also contribute your own explanations, and try out our beta product, which involves off-site annotations. If you go to Genius.com followed by any URL, you will get an annotated version of that page, allowing you to select text and write your own annotations to share.
00:02:34.450 Now let's dive into the state of memory profiling in Ruby. Who here is familiar with ObjectSpace? ObjectSpace is a built-in Ruby API that provides introspection into all of the different objects and counts of objects that currently exist inside the Ruby VM. This API gives you access to methods like ObjectSpace.count_objects, which will return a hash containing counts of all the different types of objects that are currently live in the Ruby VM.
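A minimal sketch of what a count_objects call looks like; the specific keys inspected here are my own illustration of MRI's type tags, not code from the talk:

```ruby
# Peek at the live object counts inside the Ruby VM.
counts = ObjectSpace.count_objects

counts[:TOTAL]    # => total heap slots (used + free)
counts[:FREE]     # => currently free slots
counts[:T_STRING] # => live String objects
counts[:T_ARRAY]  # => live Array objects
```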
00:03:03.990 If we use ObjectSpace.count_objects and perform no operations in between, we would expect the post-count minus the pre-count to equal zero. However, in some cases, we might see a discrepancy, which is an illustration of the probe effect. The probe effect occurs because, in the very act of measuring or monitoring your objects, you may inadvertently create additional objects, thus affecting your ultimate results.
00:03:31.100 For example, when you call count_objects, the first thing it does is create a hash to hold the counts, so the total count may increase by one due to that extra object. Fortunately, the API lets you pass in your own hash, created beforehand, which eliminates the probe effect and gives you more accurate results. ObjectSpace also lets you traverse each live object in the VM, so you can explore memory allocations directly.
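Here is a small sketch of that pattern; the code under measurement is an arbitrary example of my own:

```ruby
# Pre-allocate the result hashes so that the act of measuring does not
# itself allocate new objects and skew the counts (the probe effect).
before = {}
after  = {}

ObjectSpace.count_objects(before)
list = Array.new(100) { |i| "item #{i}" }  # code under measurement
ObjectSpace.count_objects(after)

puts after[:T_STRING] - before[:T_STRING]  # roughly the strings allocated above
```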
00:04:03.820 Here's an example: if we traverse the first few objects created by Ruby, we encounter the initial error classes. The API allows you to iterate through a particular class and gather statistics on instances of that class currently living in the VM. For example, if we inspect the BigDecimal class, we can observe instances and their memory consumption.
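A hedged sketch of that kind of traversal; the BigDecimal sample data is just an illustration, and memsize_of needs the objspace extension:

```ruby
require 'bigdecimal'
require 'objspace'   # provides ObjectSpace.memsize_of

decimals = Array.new(1_000) { |i| BigDecimal(i.to_s) }  # keep some instances alive

# Walk every live instance of a class and total up its rough memory use.
count = 0
bytes = 0
ObjectSpace.each_object(BigDecimal) do |bd|
  count += 1
  bytes += ObjectSpace.memsize_of(bd)
end

puts "#{count} BigDecimal instances, roughly #{bytes} bytes"
```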
00:04:42.090 There's also a lesser-known extension to ObjectSpace, loaded with require 'objspace', that adds allocation tracing. It lets you record which lines of code created particular objects, enabling you to analyze your memory allocations more effectively. You can retrieve the memory size of individual objects, track how instances of a class grow over time, and obtain a list of objects reachable from any given object.
00:05:07.050 For instance, when you create a new instance of a class, you can determine which objects maintain a reference to it. If that list is empty, the instance is ready to be garbage collected. Conversely, if you redefine or overwrite attributes of an object, you potentially change its reference list and thus its garbage collection status.
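A brief sketch of these APIs as they exist in the objspace extension; the specific calls are my own illustration of the features described, not code from the talk:

```ruby
require 'objspace'

# Record which file and line allocated each object created inside the block.
ObjectSpace.trace_object_allocations do
  obj = Object.new
  puts ObjectSpace.allocation_sourcefile(obj)  # file containing the Object.new call
  puts ObjectSpace.allocation_sourceline(obj)  # line number of that call
end

# Approximate size of an individual object, in bytes.
puts ObjectSpace.memsize_of('x' * 1_000)

# Objects directly reachable from a given object: its class plus its elements.
p ObjectSpace.reachable_objects_from(['a', 'b'])
```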
00:05:45.780 Who here has used GC, the garbage collection module? Great! Using methods like GC.stat, you can get insight into the total number of garbage collection cycles, disable or enable garbage collection, and obtain detailed information about the latest garbage collection cycle. Since Ruby 2.1, the garbage collector provides various statistics, such as how many heap pages were allocated and how many objects survived a garbage collection cycle. This data is particularly useful with the new generational garbage collection mechanism.
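A small sketch of those calls; note that the exact GC.stat key names vary between Ruby versions:

```ruby
stats = GC.stat

stats[:count]          # total GC runs so far
stats[:minor_gc_count] # minor (young generation) collections
stats[:major_gc_count] # major (full) collections

GC.disable             # turn automatic collection off (use sparingly)
GC.enable              # and back on
GC.start               # force a full collection right now
```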
00:06:23.840 Let's look at some helpful tools for memory profiling. Unfortunately, this section is somewhat short, and I may seek your assistance. One noteworthy tool is the GC Profiler, a built-in garbage collection API that enables you to control profiling. You can start the profiler, run some code, then trigger a garbage collection cycle, and receive data on when the garbage collection happened, how long it took, and how many objects were cleaned up.
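A sketch of that workflow; the throwaway allocation loop is my own placeholder:

```ruby
GC::Profiler.enable

100_000.times { "throwaway string #{rand}" }
GC.start                    # trigger a collection so there is something to report

puts GC::Profiler.result    # table of GC runs: when each happened and how long it took
GC::Profiler.disable
```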
00:06:59.280 Another excellent tool is the memory_profiler gem created by Sam Saffron, which has a straightforward API. You simply pass in a block, and it gives you a comprehensive report of memory allocation broken down by gem, file, and location. This makes it easy to pinpoint the code generating high allocations and identify problematic gems. At the end of the report, you also get a string report highlighting which strings were allocated most often and which were retained.
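A minimal usage sketch, with a placeholder block standing in for real application code:

```ruby
# gem install memory_profiler
require 'memory_profiler'

report = MemoryProfiler.report do
  10_000.times { |i| "allocated string #{i}" }  # code under measurement
end

# Breaks allocated and retained memory down by gem, file, location, and class,
# and lists the most commonly allocated strings.
report.pretty_print
```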
00:07:38.790 There are also tools like ruby-prof and memprof, which are a bit older and less useful these days, especially since memory_profiler works seamlessly with modern Ruby versions. It's worth noting that ruby-prof, although it has a memory profiling capability, may require a separately patched Ruby, which can complicate its use. What we need are more tools like memory_profiler that take advantage of ObjectSpace's capabilities.
00:08:26.350 As a recommendation for developers looking for interesting projects, consider contributing to memory profiling tools. There’s plenty of room for improvement, and it is a fascinating area ripe for exploration, especially with new features and libraries available. Now let's discuss some tips and techniques.
00:09:07.700 First, I'll cover lazy enumerables. Who here is familiar with lazy enumerables? They are a fantastic feature, especially when you know how to leverage them effectively. Commonly, when we write code, we often default to using traditional array operations, which can result in unnecessary allocations, particularly if you're filtering data but only require a small subset of it. By using lazy enumerables, Ruby will defer computation until necessary, thus minimizing memory usage.
00:09:45.600 For example, with a typical enumerable operation, we may run through an array of numbers, multiply them, filter them, and then ultimately limit the results. When using lazy enumerables, Ruby only processes elements as they are necessary for the final output, significantly reducing the number of computations and allocations.
00:10:22.880 However, be careful when building a lazy pipeline: if it lacks a terminal call such as first or to_a, nothing is evaluated at all. Used wisely, lazy enumerables can produce impressive reductions in memory and allocations. For instance, take two scenarios where we square numbers and convert them into strings; the standard approach builds large temporary arrays, while the lazy version processes each element one at a time.
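As a sketch of the difference, assuming a simple multiply-and-filter pipeline rather than the exact example from the slides:

```ruby
# Eager: map and select each build a full intermediate array before
# first(5) throws most of that work away.
eager = (1..1_000_000)
          .map    { |n| n * 2 }
          .select { |n| n % 3 == 0 }
          .first(5)

# Lazy: elements flow through the whole pipeline one at a time, and
# evaluation stops as soon as five results exist. Without a terminal
# call like first or to_a, nothing is computed at all.
lazy = (1..Float::INFINITY).lazy
         .map    { |n| n * 2 }
         .select { |n| n % 3 == 0 }
         .first(5)

p eager # => [6, 12, 18, 24, 30]
p lazy  # => [6, 12, 18, 24, 30]
```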
00:11:08.640 Now, let's see how lazy enumerables impact memory consumption. When using them, we often see a decrease in the memory required for intermediate operations, with far fewer intermediate arrays being built along the way. Although lazy operations may generate more single-element arrays, the overall savings in total memory can be substantial, sometimes offering up to a 45% decrease in bytes allocated.
00:11:48.650 Conversely, for smaller sets with fewer iterations, lazy enumeration can actually increase memory utilization. There is a threshold where you must assess the trade-offs between memory and speed, especially when comparing large data sets to small ones.
00:12:41.820 When examining allocations and object creation, it is crucial to consider how your code's memory footprint impacts performance. If you graph object creation patterns, collections made up of many small objects grow their memory cost more steeply as the collection size increases than collections of fewer, larger objects do. Each situation behaves differently depending on the balance between how quickly objects are created and how much space each one occupies.
00:13:21.420 Additionally, one important aspect of memory management is understanding how laziness operates. If you leverage lazy enumerables in situations where they fit naturally, instead of treating them as a quick 'fix-all', the benefits will be more pronounced. Establishing lazy evaluations alongside complex filtering operations provides performance benefits in terms of speed and memory allocation, which leads to a more efficient Ruby application.
00:14:07.530 Now, let's discuss memoization, an effective technique in Ruby that caches the return values of expensive computations to avoid recalculating them. This can significantly reduce overhead when implemented correctly, especially for long-running calculations. However, bear in mind that if an object holds onto large caches, they need to be structured so that old values can eventually be released, or the cache itself becomes a source of memory growth.
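A small sketch of the pattern; the Report class and expensive_calculation method are hypothetical stand-ins:

```ruby
class Report
  # Memoize: compute the value once, return the cached copy afterwards.
  # The ||= form only works when the result is never nil or false.
  def totals
    @totals ||= expensive_calculation
  end

  # When nil or false are valid results, guard with defined? instead.
  def summary
    return @summary if defined?(@summary)
    @summary = expensive_calculation
  end

  private

  def expensive_calculation
    sleep 1        # stand-in for real work
    rand(1_000)
  end
end
```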
00:14:58.000 When it comes to strings, consider using string freezing to save memory. In typical code, building combined strings through concatenation and interpolation allocates a new string object every time, which adds up to needless bloat. Freezing string literals makes them immutable and lets Ruby reuse a single object instead of allocating a fresh copy on each evaluation, which further optimizes performance and reduces memory.
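A sketch of the difference, using a hypothetical separator method as the example:

```ruby
# Without freezing, each call allocates a brand new String object.
def separator
  ', '
end

# With .freeze on the literal (optimized since Ruby 2.1), the same frozen
# object is returned every time, so repeated calls allocate nothing new.
def frozen_separator
  ', '.freeze
end
```

As a side note, later Ruby versions (2.3+) also added the # frozen_string_literal: true magic comment, which applies this behavior to every literal in a file.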
00:15:38.400 Understanding how to optimize string operations and the memory costs associated with them is vital for developers. There's a misconception about string freezing: it doesn't mean that no memory allocation happens at all; rather, Ruby optimizes frozen string literals to guard against surplus allocations. Fostering a strong understanding of these mechanisms can lead to better Ruby applications.
00:16:31.020 Last but not least, method definitions can be a source of inadvertent memory consumption. When defining classes, be cautious that local variables captured by closures may persist far longer than intended, causing memory leaks. Pay special attention to class definitions that may inadvertently trap large objects or strings in memory, and avoid using closures within class definitions unless truly necessary.
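A hypothetical sketch of that failure mode, assuming a define_method block is the kind of closure in question; the Importer class is invented for illustration:

```ruby
class Importer
  # A large local variable created in the class body...
  huge_payload = 'x' * 10_000_000

  # ...is captured by this block's closure, so it can never be garbage
  # collected for as long as the method exists, even though the method
  # never actually uses it.
  define_method(:import) do
    :done
  end
end
```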
00:17:15.100 In summary, managing memory in Ruby can easily slip through the cracks due to the garbage collector handling many operations for us. However, it’s essential to maintain awareness of how many objects we create, the memory they occupy, and their lifecycle within our applications. I hope this talk has encouraged everyone to be more considerate and thoughtful about object creation and memory management in Ruby. Thank you!
00:21:25.140 Thank you.