Talks

Parallel testing with Ractors: putting CPUs to work

Parallelizing tests is an effective way of reducing the total runtime of a test suite. Rails achieves this by forking multiple separate workers that fetch tests from a queue. Ruby 3 introduced Ractors, a new mechanism for executing code in parallel. Can they be leveraged by a test framework? And how would that compare to current parallelization solutions?

Let’s find the answers to these questions by building a test framework on top of Ractors, from scratch. We’ll compare it with the current solutions for parallelization and look at the advantages and limitations Ractors bring when used in this context.

RubyKaigi Takeout 2021: https://rubykaigi.org/2021-takeout/presentations/vinistock.html


00:00:00.399 Hello and welcome to my talk about parallel testing with Ractors and putting CPUs to work.
00:00:02.240 My name is Vinicius Stock, and I'm a senior developer at Shopify where I'm part of the Ruby types team.
00:00:05.359 We work with Sorbet, Tapioca, and everything related to typing. We are hiring, so if you're interested in joining us, please reach out.
00:00:09.679 If you want to know more about my work, you can find me at 'vinistock' on both Twitter and GitHub. During RubyKaigi 2020, we had the first Ractor demonstration by Koichi Sasada, just before Ruby 3 was released.
00:00:21.199 One of the things I really wanted to explore was whether building a parallel test framework using Ractors would be any different or provide any benefits compared to our current implementations.
00:00:35.280 Today, we're going to take a look at a few test execution strategies. We'll see an introduction to Ractors and how they communicate. We will build a test framework similar to MiniTest, parallelize it using Ractors, and then examine a few highlights, features, and limitations of the framework we built.
00:00:46.719 How do we run tests? What does it mean to execute a test? Tests are essentially pieces of code that we need to organize and run. It doesn't matter whether you're using MiniTest with test methods or RSpec with Ruby blocks; they are just pieces of code that we have to organize and execute.
00:01:08.799 It then becomes a matter of how we can do that efficiently, executing those pieces of code more quickly. This is where we start getting into parallelizing tests, and we'll consider a few execution strategies to achieve that.
00:01:30.240 The first strategy involves dividing the tests into different groups. We start by determining which tests we want to run. We randomize the tests into groups, and then each group gets assigned to a different CPU or parallel worker that will run those tests simultaneously.
00:01:46.560 It's a simple strategy, but it has one major flaw: it does not guarantee an even distribution in terms of the time it takes to run the tests. For example, if CPU 1 gets assigned more tests than CPU 2, including integration tests which are slower than unit tests, CPU 2 may finish its group significantly faster and sit idle, waiting for CPU 1 to finish.
00:02:08.080 Execution strategy number two tries to address this by using a queue mechanism. We begin by figuring out the tests we want to run and then randomize them into a queue. Each individual test example becomes an item in this queue, and our CPUs can pull items from the queue as soon as they're ready for more work.
00:02:39.680 In the case of MiniTest, each item in the queue corresponds to an individual test method. The workers can pop items from the queue as soon as they're ready to work, avoiding being idle until the suite is finished.
00:03:01.920 Historically, a single Ruby process could not put all of the available CPUs to work at once. So how do we parallelize tests today? Rails, for example, ships with a minitest plugin that allows us to run tests in parallel. It employs execution strategy number two, using a queue object to organize our tests.
00:03:30.560 For the parallel workers, it forks different Ruby processes that will run those tests in parallel. It then becomes a challenge of how to organize this and how to communicate between the workers and the main process to delegate work effectively.
00:03:50.720 Rails, in this case, uses a gem called DRb (Distributed Ruby) to handle inter-process communication. This gem provides a means of sharing a Ruby object between different processes, and in this case, the main process shares the queue object with all workers so that they can pop items from the queue as soon as they’re ready to work.
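
As a rough illustration of that setup (a simplified sketch, not Rails' actual implementation; the queue contents, worker count, and output are made up), DRb can expose a queue from the main process so that forked workers pop items from it:

    require "drb/drb"

    queue = Queue.new
    20.times { |i| queue << "test_#{i}" }

    # The main process serves the queue object over a local dRuby service.
    DRb.start_service(nil, queue)
    uri = DRb.uri

    pids = 4.times.map do
      fork do
        # Each forked worker talks to the shared queue through a DRb proxy.
        remote_queue = DRbObject.new_with_uri(uri)
        loop do
          begin
            item = remote_queue.pop(true) # non-blocking pop
          rescue ThreadError
            break                         # the queue is empty, this worker is done
          end
          puts "worker #{Process.pid} ran #{item}"
        end
      end
    end

    pids.each { |pid| Process.wait(pid) }
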
00:04:14.000 So now we have Ractors. Can they perform any differently than processes? Let's begin with a few examples of using Ractors. You can create a new Ractor simply by using 'Ractor.new' like any other Ruby object. Here, we create a worker that will receive a message and process it.
00:04:41.440 The worker will receive a message that’s sent to it. Upon receiving that message, it starts processing the information. We can wait until the worker is done and take the return value of the processing.
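
In code, that basic flow looks roughly like this (a minimal sketch of the example described above):

    # Create a worker Ractor that waits for one message and processes it.
    worker = Ractor.new do
      message = Ractor.receive  # block until the main Ractor sends something
      message.upcase            # the block's return value becomes the Ractor's result
    end

    worker.send("hello from the main ractor")

    # Wait for the worker to finish and take its return value.
    puts worker.take            # => "HELLO FROM THE MAIN RACTOR"
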
00:05:09.000 A more involved example we'll use in our framework is a worker pool. We begin with a queue of numbers from 0 to 99, create an array to store the results of our calculations, and create a pool of ten Ractors, each of which sits in an infinite loop.
00:05:19.200 Inside this loop, we receive a number and yield back that number multiplied by five. The 'Ractor.yield' method is typically used when we need to return values from a Ractor multiple times during its execution.
00:05:39.680 The worker pool facilitates the processing of numbers, and we can send items from the queue to the workers until the queue is empty. We then use 'Ractor.select' to find the first idle worker, receive the result, save it in our array, and allow that worker to continue processing more calculations.
00:06:02.000 Keep in mind that this loop ends when the queue is empty, but we can still have Ractors running in the background yielding values back to us. We still need to check on each of the workers to ensure they are done with their calculations before taking their return values.
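
Put together, the worker pool looks something like this (a sketch of the example just described):

    queue = (0..99).to_a
    results = []

    # Ten workers, each looping forever: receive a number, yield it times five.
    pool = 10.times.map do
      Ractor.new do
        loop do
          number = Ractor.receive
          Ractor.yield(number * 5)
        end
      end
    end

    # Prime every worker with one item, then keep feeding whichever worker
    # yields a result first until the queue is empty.
    pool.each { |worker| worker.send(queue.shift) }

    until queue.empty?
      worker, value = Ractor.select(*pool) # first worker with a result ready
      results << value
      worker.send(queue.shift)             # give it more work
    end

    # The queue is empty, but each worker still has one in-flight calculation.
    pool.each { |worker| results << worker.take }

    puts results.size # => 100
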
00:06:34.240 Now let's take a look at how test frameworks work so that we can build one and parallelize it. I like to divide them into three areas of concern: execution, utilities and syntax, and reporting.
00:06:47.680 The first area is execution, which involves organizing all the code pieces that need to run the tests. We create a queue and distribute work to different parallel workers. The second area, utilities and syntax, focuses on how we test our application code, whether we use assertions or expectations, and if mocks or stubs are available.
00:07:14.239 Finally, reporting concerns how we display information to the user. Let's take a look at the code based on MiniTest. We can extract a few concepts that we will use to build our framework. The test files are just regular classes in MiniTest, so we can require the files under the test folder to load them.
00:07:51.600 These classes are of special interest to us; we need to track them for later reference. Each one of these classes can define multiple test methods, so we need to loop through them and run each test defined within.
00:08:10.560 If we look inside a specific test, we start exploring how to verify our application code works. We'll look into assertions and how we define syntax to write our tests. Assertions also need to assist with execution flow.
00:08:23.040 If an assertion fails, there's no reason to continue running that test; we can mark it as a failure and move on to the next test. We need to save information about failures and any other statistics for display at the end.
00:08:41.920 Now, let's implement our test framework, starting with how to keep a reference to each one of the test classes. We will require the files under test, but how do we keep track of the classes? Ruby provides a neat trick: if we create a class method named 'inherited' in a base test class, Ruby will invoke it each time a subclass inherits from it.
00:09:03.680 We can define that all tests need to inherit from our base class, and when Ruby calls this callback, we can save the inheriting class in an array. This will keep track of all the test classes we require. But how do we model our test queue?
00:09:39.520 Since we're saving the test classes, we must remember that each class can define multiple test methods. We need each test method to become an individual item in our queue, organized as an array of tuples, where each tuple contains the test to run and its corresponding class.
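
A sketch of both ideas, with illustrative names (the real framework may name things differently):

    class TestCase
      def self.descendants
        @descendants ||= []
      end

      # Ruby invokes this callback whenever a class inherits from TestCase,
      # which lets us keep track of every test class that gets required.
      def self.inherited(test_class)
        super
        TestCase.descendants << test_class
      end
    end

    # Each test method becomes its own queue item: a [class, method_name] tuple.
    def build_queue
      TestCase.descendants.flat_map do |test_class|
        test_class.instance_methods(false)
                  .select { |name| name.to_s.start_with?("test_") }
                  .map { |method_name| [test_class, method_name] }
      end
    end
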
00:09:59.040 After building the queue with test methods, we need to define how to execute them. We create a new instance of the test class received from the queue. This ensures isolation: each test runs in its own instance, preventing conflicts between instance variables and keeping data separate for each test.
00:10:27.040 With the instance in hand, we invoke methods defined for each step of running a single test, such as setup, the test method itself, and teardown to finalize the test. But at this point, we can't assert anything, so let's build our assertions.
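
A sketch of that per-test flow (setup and teardown are treated as optional hooks here):

    # Run one item from the queue. A fresh instance per test keeps instance
    # variables isolated between tests.
    def run_test(test_class, method_name)
      instance = test_class.new

      instance.setup if instance.respond_to?(:setup)
      instance.public_send(method_name)
    ensure
      instance.teardown if instance && instance.respond_to?(:teardown)
    end
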
00:10:54.359 The most basic assertion simply checks whether something is truthy or not. There are two possibilities: if the value is truthy, the assertion passes and we continue running the test without interruption, only recording statistics. If it fails, we need to save that failure information for later and move on to the next item in the queue.
00:11:27.679 This is how the assert method would look in code: we begin by incrementing the total assertions made. In the happy path, if something is truthy, we return early. On failure, we will register that failure with relevant information, such as the failure message and the current test instance.
00:11:48.799 Each failure also lets us print progress, for example displaying an 'F' on the screen. We increment the total number of tests executed as well, since each item from the queue represents a single test.
00:12:14.400 If we reach the end of a test without failures, we increment our count of successful tests. With this structure in place, assertion failures are recorded as they happen and progress is printed either way.
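
A rough sketch of that assertion and bookkeeping logic. The Reporter class and the AssertionFailed error are illustrative names, not the framework's exact code; raising an exception is one way to abort the current test, similar to how MiniTest uses its own Assertion error:

    # Raised when an assertion fails, so the runner can abort the current
    # test and move on to the next item in the queue.
    class AssertionFailed < StandardError; end

    class Reporter
      attr_reader :assertions, :successes, :failures

      def initialize
        @assertions = 0
        @successes  = 0
        @failures   = []
      end

      def assert(test, value, message = "expected value to be truthy")
        @assertions += 1
        return if value              # happy path: just count the assertion

        @failures << [test, message] # keep the failing test and its message
        print "F"                    # progress output for the failure
        raise AssertionFailed, message
      end

      def record_success
        @successes += 1              # one more test finished without failing
        print "."
      end
    end
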
00:12:38.400 We now have the foundational blocks to build our test framework, so we need to put the pieces together. We start by requiring each one of the test files to populate our list of classes, then loop through our queue, executing each tuple with corresponding test classes.
00:13:04.560 As we execute through the queue, we're also storing the reporter information—registering all progress and failures—which will allow us to print out a summary at the end. This setup provides very basic sequential execution, but now, we're interested in parallelizing using Ractors.
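
Wiring it together for sequential execution could look like this (reusing the build_queue and run_test helpers sketched above; the test directory path is illustrative, and rescuing StandardError covers both assertion failures and unexpected errors):

    # Load every test file, which populates TestCase.descendants.
    Dir["test/**/*_test.rb"].sort.each { |file| require File.expand_path(file) }

    failures = []
    queue = build_queue.shuffle

    queue.each do |test_class, method_name|
      begin
        run_test(test_class, method_name)
        print "."
      rescue StandardError => error
        failures << "#{test_class}##{method_name}: #{error.message}"
        print "F"
      end
    end

    puts "\n#{queue.size} tests, #{failures.size} failures"
    failures.each { |failure| puts failure }
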
00:13:39.040 We return to our execution script: we require each test file and then assign our queue of tuples to a variable, randomizing the order to ensure an even distribution. Just like in the worker pool example, we will create a pool of Ractors based on the number of available processors.
00:14:01.680 Each Ractor will then loop, receiving tuples from the queue and executing the defined flow for each tuple. We can send items from the queue to idle workers until the queue is fully processed.
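
A simplified sketch of that Ractor-based runner, not the framework's exact code: Etc.nprocessors and the result tuples are assumptions here, and it reuses the build_queue helper sketched earlier:

    require "etc"

    queue = build_queue.shuffle

    # One Ractor per processor, each looping: receive a tuple, run the test,
    # yield a result back to the main Ractor.
    pool = Etc.nprocessors.times.map do
      Ractor.new do
        loop do
          test_class, method_name = Ractor.receive
          instance = test_class.new
          instance.setup if instance.respond_to?(:setup)

          result =
            begin
              instance.public_send(method_name)
              [:pass, test_class, method_name]
            rescue StandardError => error
              [:fail, test_class, method_name, error.message]
            ensure
              instance.teardown if instance.respond_to?(:teardown)
            end

          Ractor.yield(result)
        end
      end
    end

    results = []

    # Prime each worker, then keep sending items to whichever worker becomes
    # idle first, until the queue is fully processed.
    busy = []
    pool.each do |worker|
      break if queue.empty?
      worker.send(queue.shift)
      busy << worker
    end

    until queue.empty?
      worker, result = Ractor.select(*busy)
      results << result
      worker.send(queue.shift)
    end

    # Each busy worker still has one in-flight test; collect those results too.
    busy.each { |worker| results << worker.take }
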
00:14:34.079 However, running this setup initially results in errors regarding class variables used within non-main Ractors. Ruby doesn’t allow access to class variables in child Ractors, which presents a challenge when attempting to aggregate results within our singleton reporter instance.
00:15:21.440 One solution is to create multiple reporter instances responsible for aggregating information, yielding back information about each run to an aggregator. This approach allows us to aggregate information without relying on class variables, ensuring we still collect all relevant data.
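
A minimal standalone sketch of that idea, using one Ractor per test for brevity and an assumed Hash-based report format rather than the framework's real reporter:

    # A tiny test class just to have something to run.
    class SampleTest
      def assert(value)
        raise "expected truthy value" unless value
      end

      def test_passes
        assert(1 + 1 == 2)
      end

      def test_fails
        assert(1 + 1 == 3)
      end
    end

    queue = [[SampleTest, :test_passes], [SampleTest, :test_fails]]

    # Each Ractor builds its own report and returns it: no shared state,
    # no class variables.
    workers = queue.map do |test_class, method_name|
      Ractor.new(test_class, method_name) do |klass, name|
        report = { tests: 1, failures: [] }
        begin
          klass.new.public_send(name)
        rescue StandardError => error
          report[:failures] << "#{klass}##{name}: #{error.message}"
        end
        report # the Ractor's return value, collected below with #take
      end
    end

    # The main Ractor aggregates the partial reports into a single summary.
    summary = { tests: 0, failures: [] }
    workers.each do |worker|
      report = worker.take
      summary[:tests] += report[:tests]
      summary[:failures].concat(report[:failures])
    end

    puts "#{summary[:tests]} tests, #{summary[:failures].size} failures"
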
00:15:54.240 We keep aggregating the values that the Ractors yield until no information is missing, and this approach lets us run tests in parallel with Ractors without issues. Our framework is available on GitHub under the name 'Loop', where you can explore its features.
00:16:14.560 Moving on to a demonstration, Loop can run our tests in both process mode and Ractor mode. In process mode, there is a noticeable delay caused by forking separate Ruby processes, which is absent in Ractor mode.
00:16:45.840 In addition to running the tests, I will highlight the interactive reporter Loop provides. When run in interactive mode, we get paginated results detailing failures. This allows for cycling through failures, viewing highlighted files, and accessing specific errors in our favorite text editor.
00:17:09.920 Once we fix a failure, we can mark it as successful. Having both process and Ractor modes available allows for measuring and comparing performance across various test scenarios.
00:17:39.040 As we benchmark the two modes while increasing the number of tests, we observe that although process mode incurs an initial delay due to forking, the difference in performance lessens as the number of tests increases because most time is spent executing application code rather than within the testing framework.
00:18:16.640 However, our framework does have some limitations. For example, usage of class variables is restricted, since child Ractors do not have access to them; we can only use them within the main Ractor. We also cannot freely use dynamically defined methods or other forms of metaprogramming from non-main Ractors.
00:19:10.080 In addition, the code being tested must be Ractor compatible in order to execute within Ractor mode. Even though we identify several limitations, it’s essential to note that Ractors are still considered experimental and will continue to evolve.
00:19:35.680 Developer tools present a valuable opportunity to test Ractors since they do not operate in high-risk environments, allowing for experimentation to find problems, provide feedback, and ultimately contribute to the future of Ruby parallelism.
00:20:12.640 Thank you for listening, and I hope you enjoy the rest of the conference.