Memory Profiling

Summarized using AI

Tidying Active Record Allocations

Richard Schneeman • June 01, 2021 • Rotterdam, Netherlands

The video titled "Tidying Active Record Allocations" features Richard Schneeman at EuRuKo 2019, where he discusses improving the performance of Ruby applications through efficient memory management. The talk draws inspiration from Marie Kondo's organizing philosophy, focusing on identifying and optimizing memory allocations within Active Record, a component of the Ruby on Rails framework.

Key points include:

- Introduction of Richard Schneeman: A Ruby enthusiast and Heroku employee, he shares insights from his experience in programming.

- The importance of performance: Schneeman emphasizes that optimizing memory allocations is crucial for enhancing the speed of Ruby applications, introducing the audience to practical tools for achieving this.

- Memory Profiling Tools: He introduces two key profiling tools: Memory Profiler for tracking object allocations and Derailed Benchmarks for assessing performance without needing to restart the server.

- Example Benchmark: A crucial part of the presentation involved a quiz to demonstrate the differences in memory allocation between two methods for comparing values, revealing that unnecessary allocations can significantly slow down applications.

- Case Study: CodeTriage: Using his project CodeTriage as a case study, Schneeman presents findings from using Derailed Benchmarks to expose allocations at a specific line of code.

- Refactoring for Optimization: He suggests a refactor of the code to avoid unnecessary string allocations by checking for symbol existence against column names, ultimately leading to a modest performance gain.

- Statistical Significance of Changes: The talk concludes with an overview of how to check that the improvements are statistically significant using methods like the Student's t-test, supporting the claim that fewer servers are needed after the optimization.

The session encourages developers to apply these tidying techniques to their own codebases, identifying performance hotspots and optimizing object allocations. Richard's engaging presentation style and effective use of real-world examples help reinforce the lesson that tidying up allocations can lead to substantial performance improvements, even if the initial impact seems small.

In essence, the takeaways from Richard Schneeman's presentation underscore the necessity for developers to manage memory allocation intelligently in Ruby applications to optimize performance effectively.


The Life-Changing Magic of Tidying Active Record Allocations
Your app is slow. It does not spark joy. In this talk, we will use memory profiling tools to discover performance hotspots. We will use this technique with a real-world application to identify a piece of optimizable code in Active Record that leads to a patch with substantial page speed impact.

Richard Schneeman - https://twitter.com/schneems
EuRuKo 2019


00:00:06.960 Our next talk is by Richard Schneeman. When I told him I used something he built called rundoc, he expressed his joy in a rather unique way, which I thought was quite amusing. He has also taught a master course in programming and made all the material available for free online. I think that's a very nice move on his part.
00:00:15.280 Well, without further ado, please give a warm welcome to Richard.
00:00:28.720 It is time to set sail! Hello everyone, my name is Richard. Thank you for the great introduction. On the internet, I go by the name 'schneems.' Some people who know me are aware that I love Ruby so much that I actually married her. This is my wife, Ruby, and we have two wonderful children. I'm not going to talk about them, but I will mention one of my dogs. His name is Hans Peter Von Wolfe the Fifth, and we just call him Cinco for short. I maintain a service called CodeTriage, which helps people get started in open source. It sends issue contribution ideas as well as documentation ideas so you can contribute to a Ruby project.
00:00:58.960 I work for a small startup based out of San Francisco that you might have heard of; it's called Heroku. They were nice enough to pay for my flight, which is fantastic, and they also pay me to work on the Ruby build pack as well as some other open source projects. Before I get started with my talk, I want to share a bit of background. When I started working on this presentation, I honestly hurt my hands pretty bad, so much so that I had to take a couple of weeks off. Around the time I began writing this presentation, I was limited to using an eye tracker and voice controls as my input devices on a computer. Here's what that looks like.
00:02:10.800 So, how did I manage to create such an amazing presentation without the use of my hands? You can give a hand to my hands, or rather, to Caleb Thompson! Besides Caleb, I also have to thank a voice-over artist, Yuri, whose work you will experience later.
00:03:03.760 Now, I've been told that we need boat jokes since we are on a boat. Let me ask you this: why do we have a Ruby conference on a boat? Because everyone knows that Ruby is written on top of C!
00:03:40.560 In all seriousness, I have a couple of fun facts to share. For example, there is an RRS submarine named Boaty McBoatface. While it has a silly name, it has made serious contributions to climate science. I also maintain Puma, and we are about to release Puma 4 soon. Before I can release it, I need to get back to port.
00:04:11.280 This talk is primarily about performance. I have found something that I believe can make Ruby faster. I realized that the problem is simply that we need to take the sleep out.
00:05:09.360 Now, who knows who this is? This is Marie Kondo, a world-famous organizing expert. She has a Netflix show and several fantastic books. I want to share a quick clip of her working on object allocations.
00:05:56.080 Unfortunately, Marie couldn't be here with us today, so instead she sent her pet rabbit, Con Hair, to join us!
00:06:06.400 Hello and welcome, everyone. In English, my name translates to 'love it,' and I truly love colors, movies, and performance. Today, you're here to hear about the concrete method of tidying your Ruby applications.
00:06:40.240 To tidy up, first, collect your objects into a pile where you can see them clearly. Next, consider each object: how do you know if it sparks joy? If an object is very useful and keeps your code clean without causing performance problems, then it sparks joy. If an object is necessary for your code to function and removing it causes it to crash, then it also sparks joy.
00:07:01.960 This process will help us determine which objects we can consider getting rid of. We are going to use two tools for this purpose: Memory Profiler and Derailed Benchmarks, which will help us benchmark our Rails applications. We'll start out with a little quiz to test our understanding.
00:07:46.480 Here is a benchmark comparing two different methods of determining the largest value between two inputs. One method allocates an array and compares its values, while the other method uses a direct comparison without any allocation. I want to know, who thinks the array method is faster?
00:08:00.400 And who thinks the other method is faster? Correct—the second method is faster! But does anyone know how much faster? Shout out a number. The answer is that it is two times faster. Both methods execute the exact same logic, making this a significant performance difference.
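The quiz can be reproduced with a minimal sketch along these lines. The method names max_with_array and max_with_comparison and the allocations helper are assumptions made for illustration, not the code shown on the slides. The array is assigned to a local variable on purpose, since recent Ruby versions can optimize a literal [a, b].max so that no array is created.

```ruby
require "benchmark"

# Allocates a two-element Array on every call, then asks it for the max.
def max_with_array(a, b)
  pair = [a, b]
  pair.max
end

# Same result using a direct comparison and no allocation.
def max_with_comparison(a, b)
  a > b ? a : b
end

# Hypothetical helper: number of objects allocated while the block runs.
def allocations
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

ITERATIONS = 10_000

array_allocs = allocations { ITERATIONS.times { max_with_array(1, 2) } }
comparison_allocs = allocations { ITERATIONS.times { max_with_comparison(1, 2) } }

array_time = Benchmark.realtime { ITERATIONS.times { max_with_array(1, 2) } }
comparison_time = Benchmark.realtime { ITERATIONS.times { max_with_comparison(1, 2) } }

puts "array:      #{array_allocs} objects, #{array_time.round(5)}s"
puts "comparison: #{comparison_allocs} objects, #{comparison_time.round(5)}s"
```

The timings will vary from run to run and from machine to machine, while the allocation counts should stay essentially the same, which is the point made later in the talk.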
00:09:07.760 In general, touching memory will slow your program down more than performing comparisons or calculations. This is not only true in Ruby but in every programming language. For instance, if you're writing C, calling malloc is slow, and you want to avoid it. Since we know that Ruby's object allocation is expensive, we’ve seen that we can optimize some scenarios by removing allocations and opting for comparisons instead.
00:09:40.480 If we discover where many objects are being allocated, this discovery shows us where our program is likely to have performance hotspots. When we're optimizing, reducing the bytes allocated by about 1% will usually speed up our program by about the same percentage. This is typically true but not always. This simplifying assumption helps us benchmark faster as well.
00:10:13.440 Typically, when writing benchmarks, we cannot just run and time two pieces of code. There’s a lot of variance, and benchmarking may require running them thousands or even millions of times. Memory measurements are consistent: code that allocates 100 bytes will allocate 100 bytes on every run. This consistency also means we only need one measurement instead of many.
00:11:14.400 Next, we will look at the Memory Profiler gem, which lets us view all allocations. Behind the scenes, it is a wrapper around Ruby's ObjectSpace allocation tracing. If you’re profiling a Rails app, you can use Derailed Benchmarks. The gem can hit an endpoint on your application directly from the CLI, which is beneficial because you don’t have to restart the server between changes, nor do you have to refresh your browser.
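To give a feel for the allocation tracing mentioned above, here is a minimal sketch that uses only Ruby's standard objspace library, without the Memory Profiler gem. The build_greeting method is a hypothetical example, and this is not how the gem itself is implemented.

```ruby
require "objspace"

# Hypothetical method that allocates a new String.
def build_greeting(name)
  "Hello, " + name.to_s
end

file = nil
line = nil

# While tracing is active, Ruby records where each new object was allocated.
ObjectSpace.trace_object_allocations do
  greeting = build_greeting(:world)
  file = ObjectSpace.allocation_sourcefile(greeting)
  line = ObjectSpace.allocation_sourceline(greeting)
end

puts "String allocated at #{file}:#{line}"
```

With the memory_profiler gem installed, a comparable report comes from wrapping code in MemoryProfiler.report { ... } and calling pretty_print on the result.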
00:12:05.440 For our case study, I call this section 'Inadequate Record.' First, we will run Derailed against the target application, CodeTriage, to gather memory allocations. The output will show the most memory allocations at the top, which is where we will start looking.
00:12:43.440 Once we pick a file from the list, we must zoom in on it and filter our results. The filtered output is much cleaner and shows that the majority of allocations come from line 270. Let’s see what the code does by opening that file.
00:13:06.720 On line 270, we can see it’s allocating strings. To understand this better, we must know what the code is doing. This part of the code is inside the respond_to? method. When we call respond_to? on an object, we need to know if a method by that name exists. Since Active Record is backed by a database, we also need to know if a column by that name exists.
00:14:19.200 Typically when you call respond_to?, you pass in a symbol, while Active Record stores column names as strings. The code converts the symbol to a string for comparison and then iterates over the array of column names. For example, if we call respond_to? on a user with :email as a symbol, we convert it to a string and check for matches against the column names.
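The behavior described here can be modeled with a small sketch. This is a simplified illustration and not Active Record's source: COLUMN_NAMES, column_via_string? and count_allocations are hypothetical names.

```ruby
# Column names stored as strings, as described in the talk.
COLUMN_NAMES = %w[id email name].freeze

# Converts the incoming name to a String on every call, then scans the array.
# For a Symbol argument, to_s allocates a new String each time.
def column_via_string?(name)
  COLUMN_NAMES.include?(name.to_s)
end

# Hypothetical helper: number of objects allocated while the block runs.
def count_allocations
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

ITERATIONS = 10_000
allocated = count_allocations do
  ITERATIONS.times { column_via_string?(:email) }
end

puts "#{allocated} objects allocated for #{ITERATIONS} lookups"
```

Each lookup with a symbol creates a throwaway string, so the allocation count grows with the number of calls.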
00:15:00.799 Now that we understand this code, we can ask ourselves: does this object allocation spark joy? It is in use, which makes it useful, but it does a lot of allocations, making it less performant. It doesn't specifically help keep our code clean either, and we don’t know yet if it is absolutely necessary.
00:15:29.120 To figure this out, we should refactor the code to be faster while maintaining correctness. Since the name variable must be a string, we’ll need to make a conversion somewhere. My hypothesis is that instead of converting the symbol each time, we can check for a column by name as a symbol.
00:16:33.440 This allocation does not spark joy, so we should eliminate it. We never want to allocate a string if we can avoid it. We can store our column information as a hash, mapping the symbol as a key to its corresponding string, and then check for column existence without unnecessary allocations.
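The hash-based idea can be sketched as follows. This is an illustration under assumptions rather than the actual patch: COLUMN_NAMES, COLUMNS_BY_SYMBOL, column_via_symbol? and count_allocations are hypothetical names.

```ruby
COLUMN_NAMES = %w[id email name].freeze

# Built once: maps each column name as a Symbol to its String form.
COLUMNS_BY_SYMBOL = COLUMN_NAMES.each_with_object({}) do |column, lookup|
  lookup[column.to_sym] = column
end.freeze

# Symbols are looked up directly, so no String is created per call.
# Other inputs fall back to the string comparison.
def column_via_symbol?(name)
  if name.is_a?(Symbol)
    COLUMNS_BY_SYMBOL.key?(name)
  else
    COLUMN_NAMES.include?(name.to_s)
  end
end

# Hypothetical helper: number of objects allocated while the block runs.
def count_allocations
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

ITERATIONS = 10_000
allocated = count_allocations do
  ITERATIONS.times { column_via_symbol?(:email) }
end

puts "#{allocated} objects allocated for #{ITERATIONS} lookups"
```

The one-time cost of building the hash replaces a string allocation on every symbol lookup.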
00:17:12.160 This patch reduced object allocations by about 1% of total memory for CodeTriage, and rendering time became approximately 1.01 times faster. So, I think we’re basically done here.
00:18:09.920 However, we need to verify that these results are statistically significant, so it is time to talk about how to do that. The Student's t-test helps us determine whether a difference in performance results is statistically significant.
00:18:49.680 We use an advanced statistical tool, Excel, which has a function that runs a t-test. If the result is below a certain threshold, the difference is statistically significant. And yes, the changes were statistically significant, so they got merged! While achieving a 1.01x speed increase may not seem like much, it suggests that fewer servers will be necessary. In theory, if your application previously required 100 servers, after this patch it may only need 99.
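For readers without a spreadsheet at hand, the idea behind the t-test can be sketched in plain Ruby. The timings below are made-up illustrative numbers, not measurements from the talk, and the method names are hypothetical. The sketch computes the pooled two-sample t statistic and compares it with 2.306, the two-tailed critical value for 8 degrees of freedom at the 0.05 level, instead of computing a p-value as a spreadsheet would.

```ruby
# Made-up request timings in seconds, for illustration only.
BEFORE_PATCH = [10.2, 10.1, 10.3, 10.2, 10.2].freeze
AFTER_PATCH  = [10.0, 10.1, 10.0, 10.1, 10.0].freeze

def mean(values)
  values.sum / values.size.to_f
end

# Unbiased sample variance (divides by n - 1).
def sample_variance(values)
  m = mean(values)
  values.sum { |v| (v - m)**2 } / (values.size - 1)
end

# Student's two-sample t statistic with pooled variance.
def t_statistic(a, b)
  degrees = a.size + b.size - 2
  pooled = ((a.size - 1) * sample_variance(a) +
            (b.size - 1) * sample_variance(b)) / degrees
  (mean(a) - mean(b)) / Math.sqrt(pooled * (1.0 / a.size + 1.0 / b.size))
end

# Two-tailed critical value for 8 degrees of freedom, alpha = 0.05.
CRITICAL_T = 2.306

t = t_statistic(BEFORE_PATCH, AFTER_PATCH)
significant = t.abs > CRITICAL_T

puts "t = #{t.round(3)}, significant at 0.05: #{significant}"
```

The same arithmetic explains the server estimate: a workload that needed 100 servers would need about 100 / 1.01, or roughly 99, after a 1.01x speedup.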
00:19:45.600 This increase might seem modest, but we can showcase another example of using this method to tidy up. Let’s remember the steps: first, collect all your objects into one place, then evaluate whether they spark joy.
00:20:28.720 Spotting object allocation hotspots points to areas where we can optimize, and now you're ready to take some of these techniques into your own code.
00:21:17.360 Thank you very much! My name is Richard, and I go by schneems. A special acknowledgement to Caleb Thompson for his help with the slides. Without him, I wouldn't be here today. Thank you all for coming and for making EuRuKo what it is!