Daisuke Aritomo

Hacking and profiling Ruby for performance

RubyKaigi 2023

00:00:05.880 Hello, everyone. Thank you for being here today.
00:00:14.059 Today, I'm going to talk about hacking and profiling Ruby for performance.
00:00:22.740 My name is Daisuke Aritomo, but my friends call me osyoyu. I'm a member of the RubyKaigi NOC, which is the Network Operation Center.
00:00:28.320 It's similar to a Wi-Fi team, and I’ve been involved in dragging cables here and there since Day Zero. Additionally, I work at Cookpad.
00:00:45.300 Let me quickly introduce Cookpad. Did anyone notice the refrigerators over there? Those are actually from Cookpad.
00:00:48.000 The Ruby code controls the locks on these refrigerators. If anyone wants to build Ruby-powered refrigerators, join us! We're also organizing a work-pack Rubik event in Tokyo on May 18, which includes a hands-on part. If you're excited, please visit the Cookpad sponsor booth for details.
00:01:56.520 Now, let's start talking about the actual content of this talk. I'm going to discuss how to profile and tune a Ruby web application in just eight hours as part of a performance tuning competition for fun.
00:02:16.020 First, how many of you love the word 'performance'? Yes, everyone loves performance! There's actually a performance tuning competition called ISUCON. Contestants are given three virtual machines with finite computing resources and a very low-performing web application.
00:02:35.840 The app is so slow that it is practically unusable. Contestants have eight hours to improve its performance. They may also request a 50-second benchmark during the contest to see their progress.
00:03:04.019 When a benchmark is requested, the benchmarking server will run a large number of HTTP requests to the Ruby application, and the benchmark results will determine the contestants' scores.
00:03:25.379 The team with the highest score wins. There was another talk related to this contest, which was done today. I recommend checking it out afterward, as all talks will be available on YouTube.
00:03:45.959 The contestants are provided with a Ruby application implemented using Sinatra, running on Puma behind Nginx. MySQL is also on the server, and alternate implementations of the same app in other languages like Python, Go, and Node.js are also included.
00:03:56.760 Initially, the performance is quite poor. The challenge is to make the web app server as fast as possible within eight hours, but contestants cannot scale up their servers or purchase more servers. They can only use their skills to optimize the existing application.
00:04:40.199 The main loop involves hacking around the Ruby code, running benchmarks, and trying to improve scores repeatedly. The aim is to achieve the highest score, and while there are many ways to improve performance, today I will focus on what can be done during ISUCON.
00:05:10.860 Ruby is a great language to compete with in ISUCON because of the libraries available and how effectively you can profile code. It's essential to know Ruby's profiling features and use them to track down performance problems.
00:05:34.380 I won't be discussing configuring Nginx or Linux today, despite their importance in achieving a good score in ISUCON. This presentation focuses primarily on Ruby. However, we'll touch upon monitoring at the system level and preparations you need for your recipes.
00:05:57.260 There are many aspects we could explore, such as effective RDBMS indexes, optimizing queries, and avoiding N+1 issues. N+1 issues occur when queries are executed in loops, leading to unnecessary overhead.
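To make the N+1 pattern concrete, here is a minimal sketch. `FakeDB`, `POSTS`, and both helper methods are hypothetical names invented for illustration; the fake database only counts how many queries would have been sent, so the example runs without MySQL.

```ruby
# Hypothetical in-memory stand-in for a database; it only counts queries.
class FakeDB
  attr_reader :query_count

  def initialize
    @users = { 1 => "alice", 2 => "bob", 3 => "carol" }
    @query_count = 0
  end

  # One query per call, like `SELECT name FROM users WHERE id = ?`.
  def find_user(id)
    @query_count += 1
    @users[id]
  end

  # One query for many ids, like `SELECT id, name FROM users WHERE id IN (...)`.
  def find_users(ids)
    @query_count += 1
    @users.slice(*ids)
  end
end

# Pretend these rows came from the single "1" query that loaded the posts.
POSTS = [
  { title: "first", user_id: 1 },
  { title: "second", user_id: 2 },
  { title: "third", user_id: 1 },
  { title: "fourth", user_id: 3 },
].freeze

# N+1 pattern: one extra lookup per post inside the loop.
def author_names_n_plus_one(db, posts)
  posts.map { |post| db.find_user(post[:user_id]) }
end

# Batched pattern: collect the ids first, then fetch them in a single query.
def author_names_batched(db, posts)
  users = db.find_users(posts.map { |post| post[:user_id] }.uniq)
  posts.map { |post| users[post[:user_id]] }
end

slow_db = FakeDB.new
author_names_n_plus_one(slow_db, POSTS)
fast_db = FakeDB.new
author_names_batched(fast_db, POSTS)
puts "looped lookups: #{slow_db.query_count} queries, batched: #{fast_db.query_count} query"
```

Both versions return the same names; only the number of round trips to the database differs, and that count grows with the number of rows in the looped version.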
00:06:15.240 Replacing sub-optimal algorithms, optimizing server resources, caching, and upgrading to the latest Ruby version are also crucial steps. Ruby 3.0 and onward are touted as among the fastest versions.
00:06:43.140 Now, where should we start our optimization efforts? It's crucial to identify where to focus our hacking. Performance profiling will guide us on what areas need improvement.
00:06:59.999 Before we dive into profiling, let me explain why Ruby is a superb language for this competition. In this contest, you might encounter somewhere between 500 and 800 lines of network code. Understanding this code completely in eight hours can be quite daunting.
00:07:22.059 Fortunately, Ruby's syntax is quite compact, making it easier to read and comprehend within this limited time frame.
00:07:35.720 Ruby also provides many helpful classes and modules like `Array`, `Hash`, and `Enumerable`, whose methods help you process query results effectively. This can significantly improve both coding and debugging efficiency.
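As a small illustration of how these built-in collection methods shorten data handling, the sketch below groups rows and sums a column in memory. The `COMMENTS` rows and the `stars_by_post` helper are hypothetical, shaped like the hashes a database driver might return.

```ruby
# Hypothetical rows, shaped like the hashes a database driver might return.
COMMENTS = [
  { post_id: 1, stars: 5 },
  { post_id: 2, stars: 3 },
  { post_id: 1, stars: 4 },
].freeze

# Group the rows once, then answer per-post questions from memory
# instead of issuing another query for each post.
def stars_by_post(comments)
  comments
    .group_by { |comment| comment[:post_id] }
    .transform_values { |rows| rows.sum { |comment| comment[:stars] } }
end

p stars_by_post(COMMENTS)
```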
00:08:01.940 For the competition, one of my preferred techniques is monkey patching. It's a great tool to adjust methods on the fly and adapt them to achieve better performance.
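One way to monkey patch without editing the original class body is `Module#prepend`. The sketch below is only an illustration: `PriceCalculator` and `PriceCache` are hypothetical names, and caching is just one example of what a patch might do.

```ruby
# Hypothetical class standing in for code that ships with the application.
class PriceCalculator
  attr_reader :calls

  def initialize
    @calls = 0
  end

  # Pretend this is expensive (a slow query or a heavy computation).
  def price_for(item_id)
    @calls += 1
    item_id * 100
  end
end

# The patch: remember results per item_id, falling back to the original
# method (via super) only on a cache miss.
module PriceCache
  def price_for(item_id)
    @price_cache ||= {}
    @price_cache.fetch(item_id) { @price_cache[item_id] = super }
  end
end

# Prepending puts PriceCache in front of PriceCalculator in the method
# lookup chain, so existing callers pick up the patched behavior.
PriceCalculator.prepend(PriceCache)

calculator = PriceCalculator.new
3.times { calculator.price_for(7) }
puts "original method ran #{calculator.calls} time(s)"
```

Because the patch lives in its own module, it is easy to remove again if a benchmark shows it does not help.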
00:08:30.840 However, let's shift back to performance challenges in Ruby. There may be instances where you encounter a `NoMethodError` after starting benchmarking, indicating a typo or an undefined method in your code.
00:08:57.005 No matter where this error occurs, it is fixable. It's a matter of carefully reviewing your code. Furthermore, tools like 'RBS' and type checking can help mitigate such issues.
00:09:25.079 Performance tuning in Ruby comes down to three primary points: accurately profiling bottlenecks, effectively utilizing CPU resources, and achieving high concurrency.
00:09:55.740 Before making random improvements, it's crucial to measure the changes' impact accurately. Random optimizations can lead to minor gains, but deliberate profiling will provide a clearer picture of where improvements are truly needed.
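A minimal sketch of measuring a change before adopting it, using the standard library's `Benchmark` module. The two word-counting methods and the `WORDS` data are hypothetical stand-ins for an original implementation and a candidate replacement.

```ruby
require "benchmark"

# Hypothetical workload: 2,000 words drawn from 500 distinct values.
WORDS = Array.new(2_000) { |i| "word#{i % 500}" }.freeze

# Original version: Array#count rescans the whole array for every distinct word.
def count_words_slow(words)
  words.uniq.to_h { |word| [word, words.count(word)] }
end

# Candidate replacement: a single pass over the array.
def count_words_fast(words)
  words.tally
end

# Time both versions on the same input and return the wall-clock seconds.
def compare_word_counters
  {
    slow: Benchmark.realtime { count_words_slow(WORDS) },
    fast: Benchmark.realtime { count_words_fast(WORDS) },
  }
end

timings = compare_word_counters
puts format("slow: %.4fs, fast: %.4fs", timings[:slow], timings[:fast])
```

The absolute numbers depend on the machine, so the useful habit is comparing both versions on the same input and confirming they return the same result before keeping the faster one.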
00:10:46.739 Fortunately, there are many profiling tools available in Ruby, ranging from tracing profilers that track every single function call to sampling profilers that collect data at defined intervals. These profilers can help identify performance bottlenecks.
00:11:15.800 Tools like 'stackprof' and 'rbspy' are excellent options for sampling. Sampling profilers have lower performance overhead while still providing insightful analysis.
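To show the idea behind sampling, here is a toy sampler written in plain Ruby. It is not how stackprof works internally (real tools sample from C with far less overhead); `TinySampler`, `wait_for_io`, and `busy_work` are hypothetical names, and the 5 ms interval is an arbitrary assumption.

```ruby
# Toy sampling profiler: a background thread periodically records the
# backtrace of the thread that called #profile.
class TinySampler
  attr_reader :samples

  def initialize(interval: 0.005)
    @interval = interval
    @samples = []
  end

  # Runs the block while sampling, and returns the block's value.
  def profile
    target = Thread.current
    running = true
    sampler = Thread.new do
      while running
        sleep @interval
        stack = target.backtrace # innermost frame first
        @samples << stack if stack
      end
    end
    yield
  ensure
    running = false
    sampler&.join
  end
end

# Stand-in for waiting on a database or an HTTP response.
def wait_for_io
  sleep 0.1
end

# Pure Ruby computation for roughly 0.1 seconds.
def busy_work
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + 0.1
  count = 0
  count += 1 while Process.clock_gettime(Process::CLOCK_MONOTONIC) < deadline
  count
end

sampler = TinySampler.new
sampler.profile do
  wait_for_io
  busy_work
end

# Count how often each innermost frame was seen and show the top three.
sampler.samples.map(&:first).tally.sort_by { |_, hits| -hits }.first(3).each do |frame, hits|
  puts "#{hits} #{frame}"
end
```

The program is only interrupted at each sample, which is why this style of profiler costs much less than recording every call.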
00:11:46.420 If you're profiling, visualizing your profiling results can also be beneficial. One common visual representation is the flame graph, which displays time consumption across various functions in the application.
00:12:11.580 Each bar in the flame graph represents a function, and the length of the bar indicates how much time that function consumed.
00:12:34.740 If a bar is particularly long, that indicates potential room for optimization. For instance, if you notice that a network handler function has a long execution time, it may be time to analyze and optimize that specific part of the code.
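The bar lengths in a flame graph come from counting samples. The sketch below shows that aggregation on made-up data: `SAMPLES`, the frame names, and both helper methods are hypothetical, and a real profiler would supply the stacks.

```ruby
# Hypothetical samples, each listing frames from the outermost caller to the
# innermost callee.
SAMPLES = [
  %w[main handle_request query_db],
  %w[main handle_request query_db],
  %w[main handle_request render],
  %w[main handle_request query_db],
  %w[main gc],
].freeze

# total: samples in which the frame appears anywhere (the bar's width).
# self:  samples in which the frame is the innermost one.
def frame_weights(samples)
  weights = Hash.new { |hash, frame| hash[frame] = { total: 0, self: 0 } }
  samples.each do |stack|
    stack.uniq.each { |frame| weights[frame][:total] += 1 }
    weights[stack.last][:self] += 1
  end
  weights
end

# Print frames from widest to narrowest, with their share of all samples.
def print_report(samples)
  frame_weights(samples).sort_by { |_, weight| -weight[:total] }.each do |frame, weight|
    share = 100.0 * weight[:total] / samples.size
    puts format("%-16s total=%5.1f%% self=%d", frame, share, weight[:self])
  end
end

print_report(SAMPLES)
```

In this made-up data `handle_request` is wide only because of its callees, while `query_db` has a high self count, so `query_db` is the frame worth looking at first.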
00:13:01.600 Let's dive deeper into profiling internals.
00:13:02.120 The `rb_profile_frames` function in Ruby's C API is a powerful building block for profilers. It returns the call stack of the currently executing thread, allowing us to see precisely what was happening when a sample was taken.
00:13:29.300 When using multiple threads, results may not always reflect a complete view of the entire system. For example, if one method overlaps in execution time with other threaded operations, it may appear less significant.
00:13:55.840 In this context, threading adds complexity. Only one thread can access the CPU at a particular time due to the GVL. Other threads can perform IO operations, leading to confusion in profiling results.
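The difference between waiting and computing under the GVL can be observed with a short experiment. This is a sketch under assumptions: `wait_for_io` uses `sleep` as a stand-in for real IO, `cpu_work` is an arbitrary computation, and all method names are hypothetical.

```ruby
# Wall-clock seconds taken by the block.
def elapsed
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

# Stand-in for a database or HTTP wait; sleeping releases the GVL.
def wait_for_io
  sleep 0.1
end

# Pure Ruby computation; the thread needs the GVL for all of it.
def cpu_work
  200_000.times.sum { |i| i * i }
end

# Run the same block in `count` threads and wait for all of them.
def run_in_threads(count, &work)
  Array.new(count) { Thread.new(&work) }.each(&:join)
end

io_serial    = elapsed { 4.times { wait_for_io } }
io_threaded  = elapsed { run_in_threads(4) { wait_for_io } }
cpu_serial   = elapsed { 4.times { cpu_work } }
cpu_threaded = elapsed { run_in_threads(4) { cpu_work } }

puts format("IO:  serial %.2fs, threaded %.2fs", io_serial, io_threaded)
puts format("CPU: serial %.2fs, threaded %.2fs", cpu_serial, cpu_threaded)
```

On CRuby the threaded IO run should take roughly as long as a single wait, while the threaded CPU run is not expected to be meaningfully faster than the serial one.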
00:14:24.160 My next point is maximizing CPU utilization in Ruby. As previously mentioned, Ruby's GVL means that even if you create multiple threads, only one can be executing Ruby code at a time. Given this constraint, strategies like using multiple processes become important.
00:15:08.240 For example, if you want to achieve CPU parallelism, consider running multiple processes instead of relying merely on threads, since threads alone cannot execute Ruby code in parallel.
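A minimal sketch of process-level parallelism using `fork` and pipes. `parallel_map` and `sum_of_squares` are hypothetical helpers written for this illustration, and the code assumes a Unix-like system where `Process.fork` is available. An application server would normally manage its worker processes for you.

```ruby
# Runs the block once per input in a separate child process and returns the
# results in input order. Each child has its own GVL, so CPU-bound work can
# run in parallel across cores.
def parallel_map(inputs)
  jobs = inputs.map do |input|
    reader, writer = IO.pipe
    reader.binmode
    writer.binmode
    pid = fork do
      reader.close
      writer.write(Marshal.dump(yield(input)))
      writer.close
      exit!(0) # leave the child without running at_exit handlers
    end
    writer.close # the parent only reads
    [pid, reader]
  end

  jobs.map do |pid, reader|
    result = Marshal.load(reader.read)
    reader.close
    Process.wait(pid)
    result
  end
end

# Arbitrary CPU-bound work.
def sum_of_squares(limit)
  (1..limit).sum { |i| i * i }
end

p parallel_map([10, 100, 1_000]) { |limit| sum_of_squares(limit) }
```

Each child is a full copy of the parent process, which is why adding processes raises memory use in a way that adding threads does not.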
00:15:32.824 Memory management also plays a vital role in this balance. When employing multiple processes, higher memory consumption is inevitable. Your goal should always be to find the best performance ratio between processes and threads.
00:15:49.329 To summarize, many beneficial features exist to enable performance improvements in Ruby applications, such as the internal event handling functionalities which could assist in achieving better resource management.
00:16:34.580 As we wrap up this session, remember that Ruby's performance potential lies in its ability to utilize concurrency effectively, which can be a game changer in competitions like ISUCON.
00:16:50.400 Let us also remember the significance of performance profiling combined with accurate measurements, which is crucial in settings like the performance contest we talked about today.
00:25:42.879 Thank you for your attention. I hope you find this session informative and useful for your Ruby programming endeavors. Acknowledgments to my teammates and colleagues who've provided additional advice and support.
00:26:08.000 It's been a pleasure sharing my insights with you today. Thank you for being such an attentive audience!