RailsConf 2014

Improve Performance Quick and Cheap: Optimize Memory and Upgrade to Ruby 2.1

by Alexander Dymo

The video presents a session by Alexander Dymo at RailsConf 2014, focusing on improving application performance through memory optimization and upgrading to Ruby 2.1. Dymo addresses common issues faced by developers, such as high memory consumption, slow background processes, and complex caching strategies, emphasizing that optimizing memory is crucial for enhancing application speed.

Key Points Discussed:

  • Memory Optimization Importance: Dymo argues that memory optimization is the foremost action developers can take to improve Ruby application performance. High memory allocation leads to slower garbage collection (GC), which can consume up to 70% of an application’s processing time.
  • Ruby 2.1 Enhancements: The upgrade to Ruby 2.1 is highlighted as it introduces improvements to the garbage collector, making it approximately 40% faster, thus reducing the overhead caused by memory allocation.
  • Practical Examples: Dymo shares case studies of applications that stayed on the same hardware for years by optimizing memory. In another example, a Rails app on Heroku that was upgraded to Ruby 2.1 saw about a 40% performance improvement.
  • Strategies for Memory Optimization:
    • Tuning the Garbage Collector: Adjusting settings to find a balance between the number of GC calls and peak memory usage.
    • Managing Memory Growth: Proactively monitoring and controlling Ruby process growth to prevent excessive memory consumption.
    • Manual Garbage Collection Control: Running GC between requests to manage memory effectively in web applications.
    • Using Efficient Data Handling: Utilizing databases for data processing instead of loading large datasets into memory, thus minimizing the need for Ruby to handle intensive operations.
    • Avoiding Memory Hogs: Adopting coding practices that reduce memory overhead, such as leveraging in-place string modifications and efficiently managing active record queries.
  • Tools for Optimization: Dymo discusses tools such as GC.stat and ObjectSpace for tracking garbage collection and memory allocation, and Valgrind's Massif heap profiler, to help developers analyze and improve memory usage.

Conclusions:

Dymo concludes that effective memory management not only boosts application performance but also lets applications that cannot upgrade perform adequately on older Ruby versions. He encourages developers to prioritize memory optimization alongside upgrading to newer Ruby versions.

00:00:16.590 Welcome! It's time to talk about performance. Who here thinks that Ruby is fast? Raise your hand.
00:00:25.090 Okay, a few people think that Ruby is fast, but it seems like most people here believe it's slow. I disagree with you guys! Ruby was considered slow, but it's not slow anymore, unless you allocate a lot of memory—which pretty much every application does.
00:00:35.940 This is why your applications are slow. It’s not because Ruby is slow; Ruby itself is extremely fast. The issue lies in memory allocation. Memory consumption is the number one factor affecting the speed of your application. Ruby has a huge memory overhead because every object allocates an extra 40 bytes in memory. This, combined with a slow garbage collection algorithm, results in high memory consumption.
00:01:10.869 High memory consumption means that garbage collection needs to spend more time doing its work. The size of an average Rails application is about 100 megabytes, if not more. This means that garbage collection actually has to spend a lot of time working. If you optimize your memory usage, you can reduce that time significantly. I've seen applications spending as much as 70% of their time on garbage collection—that’s just unacceptable!
00:01:39.460 By optimizing your memory, you could recover that 70% of time. This is why Ruby 2.1 is important. While it may not optimize memory directly, it improves the performance of the garbage collector itself. In fact, garbage collection in Ruby 2.1 can be about 40% faster.
00:01:52.659 Let me give you some examples to illustrate why memory optimization is crucial. Consider an application that is still running on Ruby 1.8. Despite an increasing number of requests over five years, it's still on the same hardware purchased back in 2010, which demonstrates the benefits of memory optimization.
00:02:01.210 In another instance, a modern Rails app hosted on Heroku was upgraded to Ruby 2.1, resulting in about a 40% performance improvement as expected. But what if you can't upgrade, or decide not to? The answer is simple: just optimize your memory usage.
00:02:38.170 Let me show you a simple example of how memory optimization can make a difference. This program reads a CSV file, which is actually quite large. It reads it line by line, converts strings from uppercase to title case, and outputs the results. However, it is not memory optimized—it requires a lot of memory to load the entire dataset.
00:03:01.090 Comparing Ruby versions, you can see that version 1.9 takes about 20 seconds to finish processing while Ruby 2.1 offers a 40% improvement. Many applications that were upgraded to Ruby 2.1 experienced this performance boost simply because garbage collection runs faster or less frequently.
00:03:36.430 Here’s the simplest way to optimize memory for this application: instead of storing the entire CSV in memory, I load it line by line. I also use string manipulation methods that perform in-place modifications. Although this still loads the whole CSV to parse, it leads to significant improvements.
00:04:08.850 This means that my program now runs in just 15 seconds—what's more, all major Ruby versions will perform similarly if you optimize your memory. This is great news if you cannot upgrade because even Ruby 1.8 can perform adequately if memory is optimized.
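The slides with the actual program are not part of this transcript, so the following is only a minimal sketch of the idea described above: read the input one line at a time and change case with in-place string methods. The helper name `titlecase_lines` and the sample data are hypothetical, and the sketch assumes simple comma-separated text without quoted fields that span lines.

```ruby
require "stringio"

# Hypothetical helper: title-cases every word of every line from `input`
# and appends the result to `output`, one line at a time.
def titlecase_lines(input, output)
  input.each_line do |line|
    line.downcase!                            # modifies the line in place
    line.gsub!(/\b[a-z]/) { |c| c.upcase }    # capitalizes the first letter of each word
    output << line
  end
end

# Stand-in for a large file; File.open(path) would be used the same way.
input = StringIO.new("JOHN SMITH,NEW YORK\nJANE DOE,LOS ANGELES\n")
output = StringIO.new
titlecase_lines(input, output)
puts output.string
```

Only the line currently being processed is alive at any moment, so the garbage collector never has to scan the whole dataset.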
00:04:30.110 Now, let's discuss how to optimize memory. I want to talk about five strategies to achieve this. The first is to tune your garbage collection settings. Since garbage collection is often the cause of slow performance in applications, adjusting these settings can help.
00:05:05.280 You want to find the right balance between the number of times garbage collection runs and your application's peak memory usage. By default, Ruby aims for low peak memory usage and a high number of garbage collection calls. If you want to reduce the number of garbage collection calls, you may need to accept higher peak memory usage. Measure and test to see what works best for your application.
00:06:01.250 Ruby 2.1 introduces two types of garbage collection: minor garbage collection, which works on new objects, and major garbage collection, which operates on all objects. Minor garbage collection occurs when there's no space in the heap for new object allocations or every 16 to 32 megabytes of allocated data.
00:06:30.720 Major garbage collection occurs when too many objects become old. While this is enough information for most applications, one way to reduce garbage collection calls is to increase the memory limits. Increasing memory limits will lead to less frequent garbage collection runs. You can also increase the heap space for Ruby objects, allowing Ruby to allocate objects faster.
00:07:01.500 In Ruby 2.1, five environment variables can be set to control this behavior, including memory allocation limits. However, I advise against adjusting other settings unless you fully understand their implications, as they may lead to worse performance.
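Since the advice above is to measure before and after changing any setting, here is a rough sketch of how the number of garbage collection runs could be counted with `GC.stat`. The helper name `gc_runs_during` and the workload are hypothetical. The talk does not list its five environment variables in this transcript; `RUBY_GC_MALLOC_LIMIT` and `RUBY_GC_HEAP_INIT_SLOTS`, mentioned in the comment, are two that Ruby 2.1 is known to read, and they are named only as examples.

```ruby
# Hypothetical helper: runs the block and reports how many GC runs it caused.
# The :minor_gc_count and :major_gc_count keys exist in Ruby 2.1 and later.
def gc_runs_during
  before = GC.stat
  yield
  after = GC.stat
  {
    total: after[:count] - before[:count],
    minor: after[:minor_gc_count] - before[:minor_gc_count],
    major: after[:major_gc_count] - before[:major_gc_count]
  }
end

# Placeholder workload that allocates many short-lived strings.
runs = gc_runs_during do
  200_000.times { "row " * 10 }
end
puts "GC runs: #{runs.inspect}"

# To compare settings, run the same script with different environment values,
# for example RUBY_GC_MALLOC_LIMIT or RUBY_GC_HEAP_INIT_SLOTS (example names,
# not necessarily the five the talk refers to).
```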
00:07:31.520 The second strategy is to limit the growth of your Ruby processes. Ruby processes tend to grow over time and may keep allocating space for new objects without releasing it back to the operating system. This can lead to increased memory usage over time, especially for long-running processes.
00:08:04.110 To manage memory better, you should implement an internal control mechanism. For example, monitor memory usage after each request, checking against a defined limit, and terminate the process if it exceeds that limit.
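A check like the one described could look roughly like this. It is a sketch under assumptions, not the speaker's code: the limit value, the method names, and the `after_request` hook are all hypothetical, and reading memory through `ps -o rss=` assumes a Unix-like system.

```ruby
# Hypothetical limit; pick a value that fits your own hosts.
MEMORY_LIMIT_KB = 512 * 1024

# Resident set size of the current process in kilobytes, as reported by `ps`.
# Assumes a Unix-like system where `ps -o rss=` is available.
def current_rss_kb
  `ps -o rss= -p #{Process.pid}`.to_i
end

# Returns true when the process has grown past the limit.
def over_memory_limit?(rss_kb, limit_kb = MEMORY_LIMIT_KB)
  rss_kb > limit_kb
end

# Hypothetical hook, meant to be called after each request has been served.
# Exits the worker so that a supervising process can start a fresh one.
def after_request(rss_kb = current_rss_kb)
  return unless over_memory_limit?(rss_kb)
  warn "worker #{Process.pid} uses #{rss_kb} KB, exiting"
  exit
end
```

Passing the measured value in as an argument keeps the decision logic separate from the measurement, so the limit check can be exercised without shelling out.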
00:08:35.400 Moreover, if you deploy your application on platforms like Heroku, you can use built-in monitoring tools. However, be cautious: if a process runs out of memory too quickly for monitoring to react, you may need to have the operating system kernel enforce a memory limit on it.
00:09:05.600 For background jobs, a good practice is to run the work in forked processes. A child process can consume memory freely until it completes, after which its memory is released back to the OS, which gives better memory management for long-running tasks.
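As an illustration of the forking approach, the sketch below runs a memory-heavy block in a child process and passes only the small result back through a pipe. The helper name `run_in_fork` and the workload are hypothetical, and `fork` is assumed to be available (it is not on Windows).

```ruby
# Hypothetical helper: runs the block in a forked child and returns its result.
# Memory allocated by the block is returned to the OS when the child exits.
def run_in_fork
  reader, writer = IO.pipe
  pid = fork do
    reader.close
    writer.write(Marshal.dump(yield)) # send only the result to the parent
    writer.close
    exit!(0)                          # skip at_exit handlers inherited from the parent
  end
  writer.close
  result = Marshal.load(reader.read)  # blocks until the child closes its end
  reader.close
  Process.wait(pid)
  result
end

# Placeholder job: builds a large array in the child, returns one number.
total = run_in_fork do
  rows = Array.new(100_000) { |i| "row #{i}" }
  rows.sum(&:length)
end
puts total
```

The parent process never holds the large array, so its own heap does not grow because of the job.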
00:09:35.430 The third strategy is manual control of garbage collection. I do not recommend disabling garbage collection entirely, as this can quickly lead to memory exhaustion. Instead, it’s advisable to invoke garbage collection manually between requests in your web application.
00:10:05.680 Ruby 2.1 offers built-in tools to help you in this regard. Just ensure that you maintain enough workers to serve requests effectively while some of them are dedicated to garbage collection.
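The transcript does not name the built-in tools, so the following only sketches the general idea with the core `GC.start` call. The class name `GcBetweenRequests`, the `every` parameter, and the handler interface are hypothetical. A real server would trigger the collection after the response has been sent to the client, which this simplified wrapper does not model.

```ruby
# Hypothetical wrapper around any object that responds to `call(request)`.
# Forces a garbage collection after every `every` requests.
class GcBetweenRequests
  def initialize(handler, every: 5)
    @handler = handler
    @every = every
    @served = 0
  end

  def call(request)
    response = @handler.call(request)
    @served += 1
    # Simplification: a real server would do this once the response is sent.
    GC.start if (@served % @every).zero?
    response
  end
end

# Placeholder handler standing in for the application.
app = GcBetweenRequests.new(->(request) { "handled #{request}" }, every: 5)
puts app.call("/orders")
```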
00:10:38.080 The best Ruby code may sometimes be the code that doesn’t exist at all. There are situations where other tools may accomplish your tasks more efficiently compared to writing in Ruby. For example, consider manipulation of large datasets.
00:11:09.610 If you have thousands of records you need to process, it’s not efficient to load all this data into memory in Ruby. Instead, leveraging efficient database capabilities such as PostgreSQL or Oracle can lead to performance improvements. Using features such as window functions helps in processing such operations seamlessly.
00:11:41.400 There's often a sentiment within the Ruby community suggesting that using SQL is to be avoided, but I think that SQL is a powerful tool and should be utilized. Data processing and calculations using SQL can deliver better performance than trying to manage them manually in Ruby.
00:12:13.560 Finally, avoid memory-hogging operations. Many practices increase memory usage, such as creating unnecessary duplicate objects. Always try to perform in-place operations, especially when modifying strings.
00:12:37.540 Use methods that don't create new objects unnecessarily. For example, always prefer in-place modifications using methods like 'shift' or methods that operate line by line when processing large inputs or files.
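The difference between in-place and copying string operations can be seen by comparing object identities. This is a small illustrative sketch; the choice of `downcase!` and `<<` is an assumption about which methods fit the advice, and the sample strings are made up.

```ruby
# In place: the same String object is modified, no copies are created.
name = "JOHN SMITH".dup
id_before = name.object_id
name.downcase!
name << " jr"
same_object = name.object_id == id_before

# Copying: each step allocates a new String and leaves the old one as garbage.
copy = "JOHN SMITH".dup
copy_id_before = copy.object_id
copy = copy.downcase
copy += " jr"
new_object = copy.object_id != copy_id_before

puts "in place kept the object: #{same_object}"
puts "copying made a new object: #{new_object}"
```

Both variables end up with the same text, but the copying version leaves two extra strings behind for the garbage collector.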
00:13:04.900 It's also a good idea to limit the use of Active Record when you don't need model objects instantiated. The 'update_all' method, for example, lets you execute a single SQL query without the overhead of Active Record object instantiation.
00:13:35.060 When handling large results, it’s advisable to read and iterate through them instead of loading them into memory as Active Record objects. For any result set over 1000 objects, consider using iterate instead to manage memory efficiently.
00:14:05.310 Additionally, careful use of objects can greatly optimize memory consumption. Several libraries can pull data that are less memory-efficient; therefore, it’s essential to evaluate their performance on memory consumption.
00:14:36.410 Now, let’s talk about tools for memory optimization. One of the most effective is GC.stat, which helps you understand garbage collection performance and track memory usage.
00:15:00.210 Another useful tool is the ObjectSpace module in Ruby, which provides memory usage statistics and can trace object allocations. Understanding where your memory is going is pivotal for efficient application performance.
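A short sketch of both uses of ObjectSpace follows. `ObjectSpace.count_objects` is part of core Ruby, while allocation tracing comes from the `objspace` standard library extension available since Ruby 2.1. The variable names and the traced string are only illustrative.

```ruby
require "objspace"

# Statistics: number of live objects by internal type.
counts = ObjectSpace.count_objects
puts "strings: #{counts[:T_STRING]}, total slots: #{counts[:TOTAL]}"

# Tracing: record where objects are allocated while the block runs.
file = nil
line = nil
ObjectSpace.trace_object_allocations do
  label = "order " + "total"
  file = ObjectSpace.allocation_sourcefile(label)
  line = ObjectSpace.allocation_sourceline(label)
end
puts "allocated at #{file}:#{line}"
```

Tracing adds overhead, so it is better suited to profiling sessions than to production traffic.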
00:15:30.210 To learn more about Ruby performance tuning, I am also working on a comprehensive book covering these topics in-depth. If you register on my website, I will share beta versions when it's completed.
00:15:54.020 Now, a tool I recommend for profiling memory use is Valgrind. This tool emulates your CPU and can track memory allocations during your code execution.
00:16:20.960 Run your Ruby code through Valgrind's Massif tool, which acts as a heap profiler. It traces how much memory you have allocated and at what point during execution. This can help diagnose where most memory is being consumed.
00:16:50.020 However, to analyze the output effectively, I would suggest using a visualizer tool that can illustrate memory usage over time. This will provide better insights regarding potential memory leaks.
00:17:20.650 That's all I have for now. Thank you for listening, and feel free to ask any questions.
00:28:03.669 Thank you all for your attention!