00:00:01.599
Hi everyone, I'm Koichi Sasada from Cookpad, Japan. Today, I will introduce the new feature in Ruby 3, named Ractor, which is an abstraction for thread-safe concurrent and parallel execution. It is currently midnight in Japan, and I'm not sure if I can stay awake, so this is a recorded video. If you have any questions, please tweet with the hashtag #Ractor, and I will answer your questions. You can also download this presentation file from this URL, which includes all the source code, so you can learn about it on your own machine.
00:00:47.120
First, I want to share the background of the Ractor project. Nowadays, CPUs have many cores, and parallel computation is needed for performance improvement. It is easy to utilize multiple CPU cores with multiple processes; however, it is difficult to write multi-process programs because inter-process communication is challenging. Moreover, multiple processes can consume a lot of computational resources such as memory. Creating many threads is another well-known way to utilize multiple cores, but our MRI Ruby interpreter does not support parallel execution with multiple threads.
00:01:34.640
Let’s look at this program on MRI. It creates 20 threads, and each thread runs a busy loop that would keep one CPU busy. Let's observe this on the test computer. The upper screen shows CPU usage for 40 logical CPUs. As you can see, a single thread utilizes 100 percent of one CPU. However, when we create 20 threads, we do not see 20 busy CPUs, because the threads cannot run simultaneously. MRI runs one thread at a time and switches between threads after a short period of time. Parallel thread execution is not allowed on MRI due to implementation limitations. I also believe that it is very hard to write a correct threaded program, because we need to care about many issues, especially synchronization.
00:02:29.040
If we forget to synchronize properly with locks, it can lead to critical bugs such as race conditions, and using locks incorrectly can lead to deadlocks and livelocks. Furthermore, these bugs are difficult to debug because of their non-deterministic nature, meaning you cannot easily reproduce the same issue. These challenges run counter to the happy programming philosophy that Ruby aims for, which is one of the reasons why I have opposed allowing parallel thread execution. I don't want to send Ruby users into threading hell.
00:03:35.680
Additionally, parallel threads need fine-grained synchronization inside the interpreter, which makes it hard to keep single-thread performance. This is another reason why it was difficult to implement parallel threads in MRI. Let me show you a simple example: a counter-incrementing program with two threads. Each thread repeatedly reads the current counter and increments it by one, one million times. This simple code should output two million; however, the actual output is often lower, and the result differs from run to run because there is no synchronization.
00:04:31.199
To correct this program, we need to introduce a mutex, and then the output is two million as expected. It is easy to introduce a mutex here because the code is small. However, real applications are larger and may use numerous libraries, making it difficult to find every threading issue in the application and the gems it depends on. Therefore, we set two goals for Ruby 3: to introduce a concurrent abstraction that is easy to use correctly, and to make it run in parallel.
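The mutex-protected counter described above might look like the following minimal sketch. The method name `count_with_mutex` and its parameters are hypothetical names chosen for illustration; they are not taken from the presentation's source code.

```ruby
# Hypothetical helper: increments a shared counter from several threads,
# guarding every increment with a Mutex so that no update is lost.
def count_with_mutex(thread_count, increments)
  counter = 0
  lock = Mutex.new

  threads = thread_count.times.map do
    Thread.new do
      increments.times do
        # Only one thread at a time may read and update the counter.
        lock.synchronize { counter += 1 }
      end
    end
  end

  threads.each(&:join)
  counter
end

total = count_with_mutex(2, 1_000_000)
puts total
```

Removing the `lock.synchronize` wrapper gives the unsynchronized version, whose result may vary from run to run.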
00:05:05.280
We propose a new concurrent abstraction called Ractor, which is an actor-like concurrent abstraction. The key idea is to limit object sharing between Ractors and to make them communicate through message passing. By the way, this project was renamed from Guild to Ractor earlier this year. Ractor introduces several concepts, which I will explain during this presentation with working code. If you want to see the specification of Ractor, please check the documents.
00:06:12.720
Firstly, we can create multiple Ractors, and each Ractor can run in parallel on a multi-core computer. The interpreter process has at least one Ractor called the main Ractor, and each Ractor can have one or more threads. However, multiple threads in the same Ractor cannot run in parallel, which is the same behavior as the current MRI.
00:06:51.200
We can create a Ractor using the `Ractor.new` method with a given block, and this block will run in the new Ractor. Inside the block, the outer self is replaced by the created Ractor object, so the block is isolated from the code around it. When you create the first Ractor, you will see an experimental warning indicating that Ractor is not yet mature in Ruby 3.0 and that the specification may change. If you have any comments, please share your feedback; your input can help improve the specification.
00:07:37.120
Let’s create 20 Ractors on the same machine, using a program in which each Ractor runs a busy loop. As you can see, the 20 Ractors utilize 20 logical CPUs, and when we increase this to 40 Ractors, you can observe that all 40 logical CPUs are busy. Ractor works fine on Windows as well, and we can observe the same CPU utilization using the Task Manager.
00:08:25.600
Now let’s move on to a more useful program. This program checks whether a given number is prime or not. You can pass parameters to the Ractors and obtain the results. The Ractor block computes the prime number check. The left program is a single-threaded sequential program, while the right one uses two Ractors to compute in parallel.
00:09:08.240
Let’s execute this on the machine. The sequential program is noticeably slower than the parallel Ractor version. Checking large prime numbers takes about 26 seconds with the sequential approach, but only about 13 seconds when using two Ractors running in parallel. Thus, we can observe a two-fold performance improvement by utilizing parallel execution with Ractors.
00:10:06.080
Next, I want to discuss the impact of Ractor on object allocation. For instance, creating one million strings using a single thread takes just 0.2 seconds, while the Ractor version takes several seconds. This performance delay is primarily due to the current implementation of object allocation. I am working to improve this performance regression before the release of Ruby 3.0.
00:10:53.920
Another key concept of Ractor is limited object sharing between Ractors. The major difficulty in thread programming comes from the shared-memory model: we need to care about the synchronization of all objects. This is why we decided to limit object sharing between Ractors. Normal objects such as strings, arrays, and hashes are not shared between Ractors, so we don't need to worry about synchronizing them. In other words, you can't introduce synchronization bugs on these objects, because synchronization isn't necessary anymore in Ractor programming.
00:11:50.880
While most objects are not shared between Ractors, there are some shareable objects, such as classes, modules, and immutable objects. Immutable objects cannot be altered after their creation, so sharing them is safe. In a Ractor program, you can build a network of Ractors using the inter-Ractor communication APIs, which allow Ractors to wait for message arrivals and thus control the program flow.
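To check which objects count as shareable, `Ractor.shareable?` can be used. The sketch below is an illustration with made-up sample values; it also uses `Ractor.make_shareable`, which deep-freezes an object.

```ruby
checks = {
  # Classes and modules are shareable.
  class_object:        Ractor.shareable?(String),
  # Integers and symbols are immutable, hence shareable.
  integer:             Ractor.shareable?(42),
  symbol:              Ractor.shareable?(:ok),
  # A frozen string, or a frozen array of immutable values, is shareable.
  frozen_string:       Ractor.shareable?("hello".dup.freeze),
  frozen_int_array:    Ractor.shareable?([1, 2, 3].freeze),
  # A mutable string is not shareable.
  mutable_string:      Ractor.shareable?("hello".dup),
  # A frozen array is still unshareable if it refers to a mutable object.
  frozen_with_mutable: Ractor.shareable?(["a".dup].freeze)
}

# make_shareable freezes the array and everything reachable from it.
deep_frozen = Ractor.make_shareable(["a".dup])

checks.each { |name, result| puts "#{name}: #{result}" }
puts Ractor.shareable?(deep_frozen)
```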
00:12:43.680
There are two types of communication APIs: push-type and take-type communication. The push-type APIs are the `Ractor#send` and `Ractor.receive` methods. You can send an object as a message to a Ractor, and that Ractor can receive the message. This follows a message-passing model, so object isolation is maintained. When you send an unshareable object, it gets copied, so the sender and the receiver never refer to the same object.
00:13:34.640
This program creates a Ractor named R1, and the main Ractor sends a string object to R1 using the send method. R1 calls the `Ractor.receive` method, which waits for a message and then returns the sent object. When we compare object IDs, the ID of the received object is different from the ID on the main Ractor, confirming that the object was copied.
00:14:28.200
The push-type communication involves an incoming port linked to an incoming queue. Sent objects are queued in this incoming queue, and any Ractor can send messages to any other Ractor. The incoming queue is infinite, so sending operations return immediately. If an incoming port is closed, sending messages will fail and raise an exception. If a Ractor terminates, the associated incoming port is also closed automatically.
00:15:43.200
The second communication type is the take-type. A Ractor can yield an object as a message using the `Ractor.yield` method, and another Ractor can take this message using the `Ractor#take` method. In contrast to push-type communication, where the sender specifies the destination, with take-type communication the taking side specifies which Ractor to receive from. In this communication type, the yielding Ractor and the taking Ractor wait for each other.
00:16:38.080
This program creates a Ractor named R1 that yields three integers: 0, 1, and 2. The block then returns the symbol :thing, and the block result is also yielded. The main Ractor can take all of them through the `take` method, retrieving the values 0, 1, 2 followed by the block result.
00:17:33.920
Take-type communication is implemented with an outgoing port: a Ractor yields a message to its outgoing port, and another Ractor takes it from there. You can connect two Ractors through outgoing and incoming ports, which enables pipeline computation. This also allows sending requests from the main Ractor to additional worker Ractors, which can run the tasks in parallel. If the outgoing port is closed, any attempt to take a message from that Ractor will result in an exception.
00:19:30.640
Multiple Ractors can take messages from one Ractor, but each individual message is taken by only one of them. In this scenario, push-type and take-type communication are used together. Load balancing can be achieved with multiple worker Ractors by sending each request to a bridge Ractor.
00:20:42.240
In this instance, each message is processed by one of the worker Ractors. You can distribute work across multiple Ractors by sending each request to the bridge Ractor. The bridge Ractor passes the requests on, and the worker Ractors return the results. Additionally, I want to discuss an important feature: exception propagation. If a Ractor terminates due to an exception, the taking Ractor will raise a `Ractor::RemoteError`, similar to how the `Thread#join` method re-raises a thread's exception.
00:22:43.200
This program illustrates how a Ractor named R1 raises a TypeError, and when the main Ractor attempts to take from R1, it receives a `Ractor::RemoteError` exception. This exception contains the original exception as its cause, which shows where the failure was raised. In doing so, the main Ractor can supervise other worker Ractors and detect their exceptions. This is essential for keeping the application stable.
00:23:53.040
To achieve object isolation between Ractors, we introduce several limitations. Non-main Ractors are restricted from using global variables, class variables, and instance variables of classes and modules. They also cannot access constants that refer to unshareable objects. The main Ractor remains compatible with the current Ruby 2.7. However, to run code on multiple Ractors, we will need to modify the application code.
00:24:39.680
While the implementation is still maturing and will display an experimental warning upon the first Ractor creation, I believe Ractor will enhance concurrent programming in Ruby by providing a means to write efficient programs with fewer thread safety concerns. Ruby 3.0 will ship with Ractors, and I hope you all enjoy the new capabilities it brings. Thank you for your attention.