Benchmarking Ruby

by Davy Stevenson

In the video "Benchmarking Ruby" presented by Davy Stevenson at RubyConf 2014, the focus is on the significance of benchmarking in Ruby programming. The talk establishes a contrast between testing, which ensures functionality, and benchmarking, aimed at assessing performance and efficiency of Ruby code. The speaker emphasizes that while testing verifies if code behaves as intended, benchmarking reveals how well the code performs under various conditions.

Key points discussed include:
- Definition and Importance: Benchmarking is crucial for understanding the performance of both your own Ruby code and the gems you depend on, ensuring that your overall application maintains efficiency.
- Benchmarking Tools: Stevenson introduces several benchmarking tools, including:
  - Benchmark: A standard library tool that helps track performance using raw time.
  - Benchmark IPS: Created by Evan Phoenix, this gem simplifies output and focuses on iterations per second to facilitate comparisons between code snippets.
  - Benchmark Big O: A gem developed by Stevenson that estimates performance as input sizes change, allowing for more detailed analysis of scaling performance.
- Method Comparisons: The video illustrates how different methods can have vastly different performance characteristics, using an example that compares the 'at' and 'index' methods.
- Best Practices: Important practices include ensuring a consistent environment free from external interference, isolating variables in benchmarks to accurately gauge performance, and validating the behavior of the methods being tested to ensure they produce expected results.
- Common Pitfalls: Stevenson warns about the risks of accidental object mutation during tests and emphasizes the importance of running benchmarks under controlled conditions for reliable results.

Stevenson concludes by encouraging the audience to use benchmarks effectively, reminding them of the importance of confirming assumptions about performance and critically interpreting results. Stevenson also advises against unnecessary benchmarking, which only adds overhead. Useful resources are provided for further exploration into efficient Ruby coding and practical benchmarking techniques.

Overall, the video serves as an engaging introduction to the topic of benchmarking in Ruby, providing both theoretical insight and practical tools for developers to optimize their code efficiently.

00:00:18.600 Um, thank you so much for being here. I'm going to talk about benchmarking in Ruby. But first, I would like to say hi to all of you. It's so great to see so many friendly and familiar faces in the audience today. I really appreciate you deciding to come to my talk. My name is Davy Stevenson, and you can find me on Twitter as @DavyStevenson. I would love to hear any of your thoughts and opinions on the talk afterwards, so feel free to contact me there or just catch me wandering around the halls for the remainder of the last day here at RubyConf.

00:00:36.680 Now, let's talk a little bit about benchmarking. What do I mean by benchmarking? I want to clarify that I'm not going to be discussing Ruby implementations or frameworks and benchmarking those. If you're interested in that sort of topic, there are plenty of great resources online, including isRubyFastYet.com and a fantastic blog post called the Ruby Web Benchmark Report made by M.D. Mark. If you want to learn more about how different Ruby implementations compete and compare against each other or how various web frameworks stack up, I would point you to those resources. Instead, I'm going to delve into how to benchmark Ruby code.

00:01:13.320 I will also discuss some common pitfalls that people encounter when benchmarking code—pitfalls that I often fall into myself. I will use these examples to illustrate my points, so you can think about these issues while writing your own benchmarks. Ultimately, I want to show you that benchmarking Ruby can be both fun and easy, making it something that you can easily incorporate into your workflow. Moreover, it helps you gain extra knowledge about the code you're writing.

00:02:15.560 So, why should you benchmark? To help answer that, consider this: why do you write tests? Many of us have fully subscribed to a test-driven culture, recognizing how important it is to write tests that verify the functionality of our code. Writing tests allows us to be confident that the code we're writing is doing exactly what we expect and enables us to refactor code with peace of mind, knowing that our tests will catch any mistakes.

00:02:34.160 Therefore, why should you benchmark your code? Benchmarking plays a completely different role than testing. It focuses on certainty and performance. You want to ensure that your code performs as expected without unexpected issues that could slow it down or make it run inefficiently. While you might benchmark your code, it's also advantageous to benchmark the code in the gems you depend on, as well as Ruby code itself. Benchmarking code you don't own gives you a broader understanding of its performance and how using that code can affect your final product.

00:03:13.360 So, how do we proceed with benchmarking in Ruby? First off, the standard library offers a great tool called Benchmark. All you need to do is require 'benchmark', and you can write a few reports easily. For example, we can benchmark an array with 10,000 integers starting at zero. After shuffling those integers, we can time how long it takes to find a particular element in the array with 'index' and how long it takes to fetch the element at a given position with 'at'. The output consists of several columns of numbers (user, system, total, and real time), but the one you are really looking for is the final column, the real elapsed time.
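
The benchmark described above might look roughly like the following sketch. The iteration count, the label width, and the position and value being looked up are illustrative choices, not values taken from the talk.

```ruby
require 'benchmark'

# Build an array of 10,000 integers starting at zero, then shuffle it.
array = (0...10_000).to_a.shuffle

# How many times each block repeats; chosen arbitrarily for this sketch.
iterations = 1_000

# Benchmark.bm prints one row per report and returns the measurements.
results = Benchmark.bm(7) do |x|
  # Array#at fetches the element stored at a given position.
  x.report('at:')    { iterations.times { array.at(5_000) } }
  # Array#index scans the array for a value and returns its position.
  x.report('index:') { iterations.times { array.index(5_000) } }
end
```

Each row of the printed report lists user, system, total, and real time in seconds; `results` holds the same measurements as `Benchmark::Tms` objects.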

00:03:40.120 It's essential to ensure that nothing else is impacting your results. While benchmarking, there are pros and cons to the built-in Benchmark library. The pros are that it is easy to use since it's included in the standard library. However, there are also cons, such as needing to manage various variables, like the size of the object you're creating and how many times you're iterating the code block. This is crucial to avoid exceedingly slow or incredibly fast results that may not be meaningful.

00:04:17.479 Many find the output difficult to read, leading to a heavy cognitive load when attempting to parse the results. Additionally, there's a lot of boilerplate code needed to set up these benchmarks. To address these issues, Evan Phoenix created a fantastic gem called Benchmark IPS, which eliminates much of the boilerplate code. Instead of calculating the raw time that a block of code takes to run, Benchmark IPS calculates the number of iterations per second that the code can perform.
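
To make the iterations-per-second idea concrete, here is a hand-rolled sketch that uses only the standard library. It is not how the Benchmark IPS gem is implemented; the `iterations_per_second` helper and its 0.2-second time budget are assumptions made for illustration, and the gem additionally handles warmup and reports a standard deviation.

```ruby
# Hypothetical helper: runs the block repeatedly for a fixed time budget
# and reports how many iterations completed per second.
def iterations_per_second(seconds: 0.2)
  count = 0
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  deadline = start + seconds
  while Process.clock_gettime(Process::CLOCK_MONOTONIC) < deadline
    yield
    count += 1
  end
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
  count / elapsed
end

array = (0...10_000).to_a.shuffle

at_ips    = iterations_per_second { array.at(5_000) }
index_ips = iterations_per_second { array.index(5_000) }

puts format('at:    %.1f i/s', at_ips)
puts format('index: %.1f i/s', index_ips)
```

With this framing a higher number is better, and no iteration count has to be picked by hand. The clock reads inside the loop add overhead to every iteration, so the absolute numbers here are rougher than the gem's.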

00:05:12.600 Switching from using the built-in Benchmark to Benchmark IPS is easy—just require 'benchmark/ips' and change the method you're calling. Benchmark IPS also includes a nifty compare function that allows for easy comparisons of the performance of different block implementations. The output of the Benchmark IPS report details both the iterations per second and standard deviation, making it easier for programmers to understand how their code performs.

00:05:45.919 This tool can show how much faster the fastest code block runs compared to the others, which can be immensely helpful. If you want quick results, focus on the number of iterations per second for comparison. It can be enlightening to see how differently certain operations perform, such as an 'at' method running 3.2 million iterations per second compared to an 'index' method, which only runs approximately 41,500 iterations per second, a gap of roughly 77 times. Understanding these differences can lead to smarter code optimization.

00:06:36.120 The pros of Benchmark IPS include being less fiddly than the built-in Benchmark, offering easier comparisons since higher numbers indicate better performance, and maintaining similar syntax to the built-in Benchmark. However, it does require you to include a separate gem, which can be a hassle for some developers. Additionally, Benchmark IPS usually offers a snapshot view of the performance, which may not always provide a complete picture.

00:07:10.680 To address this limitation, I developed a gem called Benchmark Big O. This tool builds on what Benchmark IPS provides by estimating the performance of operations as input sizes change. Instead of requiring you to specify the size and properties of the input arrays outside of the Benchmark block, Benchmark Big O allows you to generate the objects directly within the Benchmark. This is critical for accurately assessing the performance of code as the sizes and types of data structures change.

00:07:51.000 Using a generator to create the objects, we can specify exactly how to set up the array we want to benchmark. Under the hood, Benchmark Big O makes use of Benchmark IPS for its calculations, allowing for efficient performance tests. Furthermore, the results can be output in various graphical formats, showcasing performance differences effectively. By employing these visual tools, you can more easily discern performance trends and understand how different methods behave as input sizes vary.
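
The idea of measuring the same operations across growing input sizes can be sketched with the standard library alone. This does not use the Benchmark Big O gem or its API; the sizes, the repeat count, and the choice to search for the last value are assumptions made for illustration.

```ruby
require 'benchmark'

# Input sizes and repeat count are arbitrary choices for this sketch.
sizes = [1_000, 10_000, 100_000]
repeats = 100

timings = sizes.map do |size|
  # Generate a fresh input for each size. The array is left unshuffled, so
  # the value size - 1 sits in the last slot and Array#index must scan it all.
  array = (0...size).to_a
  target = size - 1

  at_time    = Benchmark.realtime { repeats.times { array.at(target) } }
  index_time = Benchmark.realtime { repeats.times { array.index(target) } }
  { size: size, at: at_time, index: index_time }
end

timings.each do |row|
  puts format('%7d  at: %.6fs  index: %.6fs', row[:size], row[:at], row[:index])
end
```

On a quiet machine the 'at' column should stay roughly flat while the 'index' column grows with the input size, which is the kind of trend a chart makes obvious.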

00:08:30.200 For example, you might notice significant performance differences when comparing the efficiency of an 'at' operation versus an 'index' operation by understanding the differences in performance scaling. This visibility allows programmers to make informed decisions regarding what approaches to use for optimal efficiency.

00:09:02.160 One important aspect to consider while benchmarking is ensuring that the environment remains consistent. Running benchmarks while other applications (like Netflix or SoundCloud) are active can produce inconsistent output. It's crucial to run benchmarks in isolated conditions to achieve reliable results. Closing down unnecessary applications can clarify your benchmark results and provide a more accurate representation of the performance.

00:09:37.680 Beyond consistent environments, it is imperative to verify the behavior of the methods you are testing. Writing tests alongside benchmarks ensures the methods yield the expected outputs and exposes any discrepancies. Confirming that each operation returns the same value on every run also confirms that the benchmark is measuring the behavior you intend to evaluate.

00:10:09.280 Furthermore, validating the scalability of the benchmarks is crucial. If the benchmarking process miscalculates based on manipulated data, the reporting could lead to inaccurate conclusions. To clarify, consider the Fibonacci sequence as an example—this serves as an accessible yet effective benchmark to measure implementation performance. When benchmarking the Fibonacci method, ensure your test cases cover valid scenarios to identify performance characteristics accurately.
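
One way to apply this to Fibonacci is to check every implementation against known values before timing anything. The two implementations below and the choice of n = 20 are illustrative assumptions, not code from the talk.

```ruby
require 'benchmark'

# Naive recursive definition, with fib(0) = 0 and fib(1) = 1.
def fib_recursive(n)
  n < 2 ? n : fib_recursive(n - 1) + fib_recursive(n - 2)
end

# Iterative version that carries the last two values forward.
def fib_iterative(n)
  a, b = 0, 1
  n.times { a, b = b, a + b }
  a
end

# Validate both implementations against known values first, so the
# benchmark compares two correct methods rather than a fast wrong one.
known = { 0 => 0, 1 => 1, 2 => 1, 10 => 55, 20 => 6765 }
known.each do |n, expected|
  raise "fib_recursive(#{n}) is wrong" unless fib_recursive(n) == expected
  raise "fib_iterative(#{n}) is wrong" unless fib_iterative(n) == expected
end

Benchmark.bm(10) do |x|
  x.report('recursive:') { fib_recursive(20) }
  x.report('iterative:') { fib_iterative(20) }
end
```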

00:10:58.320 Moreover, you should only modify one variable at a time in your benchmarks to gauge direct impacts accurately. For instance, when comparing 'each_with_object' and 'reduce', be mindful that modifying other parts of your code can skew results. Assessing performance should isolate the variable being changed while maintaining all other components constant.
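
As a sketch of what changing only one variable means here, the first two blocks below differ only in the iterator, while the third also swaps in `Hash#merge`, which allocates a new hash on every step. The word list and repeat count are assumptions made for illustration.

```ruby
require 'benchmark'

words = %w[apple banana cherry date elderberry fig grape]
repeats = 10_000

# Fair pair: both blocks mutate a single hash, so only the iterator differs.
with_object = -> { words.each_with_object({}) { |w, h| h[w] = w.length } }
with_reduce = -> { words.reduce({}) { |h, w| h[w] = w.length; h } }

# Unfair variant: changes the iterator AND the way the hash is built.
with_merge = -> { words.reduce({}) { |h, w| h.merge(w => w.length) } }

# All three must produce the same hash before their timings are compared.
expected = with_object.call
raise 'reduce result differs' unless with_reduce.call == expected
raise 'merge result differs' unless with_merge.call == expected

Benchmark.bm(18) do |x|
  x.report('each_with_object:')  { repeats.times { with_object.call } }
  x.report('reduce (mutating):') { repeats.times { with_reduce.call } }
  x.report('reduce (merge):')    { repeats.times { with_merge.call } }
end
```

Comparing the first and third rows alone would attribute the cost of the extra allocations to 'reduce' itself.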

00:11:32.400 It's also critical to avoid accidental mutation of objects in your benchmarks. Mutating an object changes its structure, leading to misleading results. Use methods like 'Array#dup' to create a working instance of your data structure before making alterations—this retains the baseline and allows for accurate comparisons.
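
A minimal sketch of the hazard, using `sort!` as an assumed example of a mutating method: without the `dup`, the first iteration would leave the array sorted and every later iteration would be timing already-sorted input. The array size and repeat count are arbitrary.

```ruby
require 'benchmark'

original = (0...1_000).to_a.shuffle
snapshot = original.dup.freeze # kept only to confirm nothing was mutated
repeats = 1_000

Benchmark.bm(12) do |x|
  # Baseline: the cost of the copy alone, so it can be read out of the row below.
  x.report('dup only:')    { repeats.times { original.dup } }
  # The in-place sort works on a fresh copy every time.
  x.report('dup + sort!:') { repeats.times { original.dup.sort! } }
  # The non-mutating sort returns a new array, so no copy is needed.
  x.report('sort:')        { repeats.times { original.sort } }
end

raise 'benchmark mutated its input' unless original == snapshot
```

The copy has a cost of its own, which is why the sketch reports it as a separate baseline row.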

00:12:05.440 Another thing to keep in mind is the effective use of randomness within your benchmark tests. If a search function takes more or less time depending on where in the array your data is located, the measurement will vary from run to run. Ensure the randomization is applied consistently in each benchmark execution, as this makes your reports more reliable.
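
One way to keep randomness consistent between runs is to shuffle with a seeded generator. The seed value and the target below are arbitrary assumptions for this sketch.

```ruby
require 'benchmark'

# A fixed seed makes the shuffle reproducible; 42 is an arbitrary choice.
seed = 42

# Two arrays shuffled with generators built from the same seed are identical.
array_a = (0...10_000).to_a.shuffle(random: Random.new(seed))
array_b = (0...10_000).to_a.shuffle(random: Random.new(seed))
raise 'seeded shuffles differ' unless array_a == array_b

# The target therefore sits at the same position on every run, so the
# time Array#index spends scanning is comparable between executions.
target = 5_000
position = array_a.index(target)

elapsed = Benchmark.realtime { 1_000.times { array_a.index(target) } }
puts format('target found at position %d, 1,000 lookups took %.6fs', position, elapsed)
```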

00:12:45.600 Lastly, be clear about which cases you intend to benchmark—be it best, average, or worst-case scenarios. This understanding can significantly influence the interpretation of your data. In the realm of Ruby, the language's internal operations are well-understood, so when benchmarking your own code, recognizing how input size variances impact performance is key. For instance, when examining a gem like 'terraformer' for its convex hull algorithm, scrutinizing performance under different input types will yield rich insights into optimizations available for your algorithms.
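
For a linear search such as `Array#index`, the cases can be pinned down explicitly by choosing where the target sits. The array size, repeat count, and labels in this sketch are illustrative assumptions.

```ruby
require 'benchmark'

# Left unshuffled, so every value sits at the position equal to itself.
array = (0...100_000).to_a
repeats = 100

targets = {
  'best (first)'   => 0,              # found on the first comparison
  'middle'         => array.size / 2, # roughly the average for a random target
  'worst (last)'   => array.size - 1, # found only after a full scan
  'worst (absent)' => -1              # never found, also a full scan
}

timings = targets.map do |label, target|
  [label, Benchmark.realtime { repeats.times { array.index(target) } }]
end.to_h

timings.each { |label, seconds| puts format('%-15s %.6fs', label, seconds) }
```

Reporting the cases separately makes it clear which scenario a given number describes.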

00:13:45.760 Consequently, you can generate benchmarks for specific edge cases and compare performance effectively. Remember, algorithms can behave unpredictably based on the data types and sizes you're testing, impacting performance results dramatically. Always verify the claims made by benchmarks with systematic tests that confirm output consistency.

00:15:01.320 In summary, as you embark on your benchmarking endeavors, remember to verify your assumptions about the performance of your code, ensure consistent conditions, and compare results across different scenarios. Benchmarks not only help illuminate performance characteristics but also encourage a deeper understanding of the Ruby programming language and other libraries you rely on.

00:15:46.000 When it comes to benchmarks, I recommend using Benchmark IPS for its ease of use and clarity in results. Similar arguments can be made for Benchmark Big O, which shines when size variances of your input are significant. Other than that, exercising judgment on whether you genuinely need to perform benchmarks is vital. Premature benchmarking yields no meaningful insights and could lead to unnecessary overhead.

00:16:36.599 For additional resources, check out Erik Michaels-Ober's talk, 'Writing Fast Ruby', for insights on writing performant code. Also explore the Fast Ruby repository for practical examples and comparisons of language features. Lastly, for Ruby on Rails practitioners, the 'derailed benchmarks' gem offers intuitive tools for analyzing performance in your applications. Continued exploration and experimentation in benchmarking will yield profound insights into optimization opportunities within your code.

00:17:27.440 I would like to thank the talented designers at The Noun Project for creating the icons that I used during this presentation. Thank you all very much for attending, and I'm available for a couple of minutes in case anyone has questions.