RubyConf 2020
Ractor Demonstration

by Koichi Sasada

In the video 'Ractor Demonstration', Koichi Sasada introduces Ractor, a new feature in Ruby 3 designed to facilitate thread-safe concurrent and parallel execution. This presentation aims to highlight the benefits of Ractor while also addressing its limitations, providing a comprehensive understanding of how it enhances Ruby's capabilities in handling parallel programming. The background for Ractor arises from the necessity for efficient multi-core processing without succumbing to traditional threading complexities, which have posed significant challenges in the past. The key points discussed include:

  • Background and Motivation: The need for parallel computation in modern multi-core CPU environments, contrasted with the difficulties of using multiple processes or threads in the current MRI Ruby interpreter.
  • Challenges with Current Threading: Issues like race conditions, deadlocks, and the complex nature of synchronization that can complicate Ruby programming.
  • Introduction of Ractor: The Ractor model limits object sharing between Ractors and has them communicate through message passing, which prevents synchronization-related bugs.
  • Creating and Using Ractors: Demonstrations on how to create Ractors using Ractor.new, showcasing their ability to utilize CPU cores effectively.
  • Performance Comparisons: Examples comparing Ractors with sequential execution, such as the prime number checking program, which ran about twice as fast when the workload was split across two Ractors.
  • Object Sharing and Messaging: Ractors allow limited sharing of objects, emphasizing the importance of isolating mutable state to avoid bugs.
  • Communication Methods: Explanation of push-type and take-type communication methods, allowing for effective inter-Ractor message passing which maintains isolation, yet facilitates collaboration.
  • Exception Handling: Ractors can propagate exceptions back to the main Ractor to aid in handling errors across concurrent execution.
  • Limitations and Future Outlook: Discussion of the constraints imposed on Ractors, including restrictions on global and shared variables, highlighting necessary adjustments developers may need to make for compatibility.

Sasada aims to clarify that while Ractor is still in its experimental phase, it holds great potential to upgrade Ruby's parallel programming landscape, making it accessible and less error-prone for developers. The conclusion emphasizes the importance of feedback during the maturation of Ractor as it approaches its official inclusion in Ruby 3.0, encouraging the community to engage with the new feature and its capabilities.

00:00:01.599 Hi everyone, I'm Koichi Sasada from Cookpad, Japan. Today, I will introduce the new feature in Ruby 3, named Ractor, which is an abstraction for thread-safe concurrent and parallel execution. It is currently midnight in Japan, and I'm not sure if I can stay awake, so this is a recorded video. If you have any questions, please tweet with the hashtag #Ractor, and I will answer your questions. You can also download this presentation file from this URL, which includes all the source code, so you can learn about it on your own machine.
00:00:47.120 First, I want to share the background of the Ractor project. Nowadays, CPUs have many cores, and parallel computation is needed for performance improvement. It is easy to utilize multiple CPU cores with multiple processes; however, multi-process programs are difficult to write because inter-process communication is challenging. Moreover, multiple processes can consume more computational resources such as memory. Creating many threads is another well-known way to utilize multiple cores, but our MRI Ruby interpreter does not support parallel execution with multiple threads.
00:01:34.640 Let’s look at this program on MRI. It creates 20 threads, and each thread runs a busy loop that would keep a CPU core fully occupied. Let's observe this on the test computer. The upper screen shows CPU usage for 40 logical CPUs. As you can see, a single thread utilizes 100 percent of one CPU. However, when we create 20 threads, we do not see 20 busy CPUs because the threads cannot run simultaneously. Instead, MRI switches between threads at short intervals; this is how MRI's threading system works. Parallel execution is not allowed on MRI due to implementation limitations. I also believe that writing a correct threaded program is very complex because we need to care about many issues, especially synchronization.
00:02:29.040 If we forget to synchronize properly with locks, it can lead to critical bugs such as race conditions, deadlocks, and livelocks. Furthermore, these bugs are difficult to debug because of their non-deterministic nature, meaning you cannot easily reproduce the same issues. These challenges run counter to the happy programming philosophy that Ruby aims for, which is one of the reasons why I have opposed allowing parallel thread execution. I don't want to send Ruby users into threading hell.
00:03:35.680 Additionally, the fine-grained synchronization that parallel threads require makes it hard to maintain single-thread performance. This is the simple reason why it was difficult to implement parallel threads in MRI. Let me show you a simple example of a counter incrementing program with two threads. Each thread repeatedly gets the current counter and increments it by one. This simple code should output two million; however, the actual output is often lower. The outcome differs from run to run because there is no synchronization.
00:04:31.199 To correct this program, we need to introduce a mutex, and then the output becomes two million correctly. It is easy to introduce such a mutex because the code is small. However, in general, applications are larger and may use numerous libraries, making it difficult to identify threading issues in the application and its gem libraries. Therefore, we set two goals for Ruby 3: the new concurrent abstraction should be easy to use, and it should run in parallel.
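The mutex-based fix described above can be sketched as follows. This is a minimal illustration rather than the code shown in the talk; the method name `synchronized_count` and its keyword arguments are hypothetical.

```ruby
# Two threads increment a shared counter. The Mutex ensures that each
# read-modify-write step completes before the other thread can start one.
# `synchronized_count` is a hypothetical helper name, not from the talk.
def synchronized_count(threads: 2, increments: 1_000_000)
  counter = 0
  lock = Mutex.new

  workers = threads.times.map do
    Thread.new do
      increments.times do
        # Without this synchronize block, updates from the two threads
        # can interleave and some increments may be lost.
        lock.synchronize { counter += 1 }
      end
    end
  end

  workers.each(&:join)
  counter
end

puts synchronized_count
```

The lock is easy to place here because there is a single shared variable; in a larger application, finding every such shared access is the hard part.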
00:05:05.280 We propose a new concurrent abstraction called Ractor, which is an actor-like concurrent abstraction. The key idea is to limit object sharing between Ractors and to make them communicate through message passing. By the way, this project was renamed from Guild to Ractor earlier this year. Ractor introduces several concepts, which I will explain during this presentation with working code. If you want to see the specifications for Ractor, please check the documents.
00:06:12.720 Firstly, we can create multiple Ractors, and each Ractor can run in parallel on a multi-core computer. The interpreter process has at least one Ractor called the main Ractor, and each Ractor can have one or more threads. However, multiple threads in the same Ractor cannot run in parallel, which is the same behavior as the current MRI.
00:06:51.200 We can create a Ractor using the `Ractor.new` method with a given block, and this block will run in the new Ractor. Inside the block, `self` is replaced with the created Ractor rather than the outer `self`. When you create a Ractor, you will see an experimental warning indicating that Ractors are not yet mature in Ruby 3.0 and that the specifications may change. If you have any comments, please share your feedback; your input can help improve the specifications.
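As a small sketch of `Ractor.new`, the following creates one Ractor, passes it two arguments, and reports what `self` is inside the block. The helper name `inspect_new_ractor` and the label string are assumptions for illustration. The result is sent back to the creating Ractor as a message, which relies on the `send` and `receive` methods covered later in the talk.

```ruby
# `inspect_new_ractor` is a hypothetical helper name.
# Arguments given to Ractor.new are handed to the block as parameters,
# because the block cannot capture outer local variables.
def inspect_new_ractor
  Ractor.new(Ractor.current, "worker-1") do |parent, label|
    # Inside the block, self is the newly created Ractor.
    parent.send([self.equal?(Ractor.current), label])
  end
  Ractor.receive # wait for the reply from the new Ractor
end

p inspect_new_ractor
```

Running this also prints the experimental warning on standard error when the Ractor is created.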
00:07:37.120 Let’s create 20 Ractors on the same machine with the program where each Ractor runs a busy loop. As you can see, the 20 Ractors utilize 20 logical CPUs, and when we increase this to 40 Ractors, you can observe that 40 logical CPUs are busy. Ractor works fine on Windows as well, and we can observe the same CPU utilization using a task manager.
00:08:25.600 Now let’s move on to a more useful program. This program checks whether a given number is prime or not. You can pass parameters to the Ractors and obtain the results. The Ractor block computes the prime number check. The left program is a single-threaded sequential program, while the right one uses two Ractors to compute in parallel.
00:09:08.240 Let’s execute this on the machine. The sequential program is noticeably slower, while the parallel Ractor implementation significantly speeds up the computation. For instance, checking large prime numbers takes about 26 seconds with the sequential approach, but only 13 seconds when using two Ractors running in parallel. Thus, we can observe a two-fold performance improvement by utilizing parallel execution with Ractors.
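A parallel prime check along these lines might look like the sketch below. It is not the program from the slides: the trial-division `prime?` helper, the method name `parallel_prime_check`, and the sample numbers are all assumptions, and each Ractor reports back by sending a message to its creator.

```ruby
# Naive trial-division primality test (illustrative only).
def prime?(n)
  return false if n < 2
  (2..Integer.sqrt(n)).none? { |d| (n % d).zero? }
end

# Checks each number in its own Ractor and collects the answers.
# `parallel_prime_check` is a hypothetical helper name.
def parallel_prime_check(numbers)
  ractors = numbers.map do |n|
    Ractor.new(Ractor.current, n) do |parent, candidate|
      parent.send([candidate, prime?(candidate)])
    end
  end
  # One reply is expected per Ractor; replies may arrive in any order.
  ractors.map { Ractor.receive }.to_h
end

p parallel_prime_check([1_000_000_007, 1_000_000_008])
```

How much faster this is than a sequential loop depends on the number of cores and on how expensive each check is; for small inputs the cost of creating Ractors can dominate.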
00:10:06.080 Next, I want to discuss the impact of Ractor on object allocation. For instance, creating one million strings using a single thread takes just 0.2 seconds, while the Ractor version takes several seconds. This performance delay is primarily due to the current implementation of object allocation. I am working to improve this performance regression before the release of Ruby 3.0.
00:10:53.920 Another key concept of Ractor is limited object sharing between Ractors. The major difficulty in thread programming arises from the shared-memory model: we need to care about the synchronization of all objects. This is why we decided to limit object sharing between Ractors. Normal objects such as strings, arrays, and hashes are not shared between Ractors, so we don't need to worry about synchronizing them. In other words, you can't introduce synchronization-related bugs for these objects because synchronization isn’t necessary anymore in Ractor programming.
00:11:50.880 While most objects are not shared between Ractors, there are some shareable objects, such as classes, modules, and immutable objects. Immutable objects are those that cannot be altered after their creation, so sharing them is safe. In a Ractor program, you can build a network of Ractors using the inter-Ractor communication APIs, which let a Ractor wait for messages to arrive and thus control the program flow.
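The distinction between shareable and unshareable objects can be probed with `Ractor.shareable?` and `Ractor.make_shareable`. The sketch below is illustrative; the method names `shareability_report` and `outer_variable_rejected?` are hypothetical.

```ruby
# Reports which kinds of objects Ruby treats as shareable between Ractors.
def shareability_report
  {
    integer:        Ractor.shareable?(42),
    symbol:         Ractor.shareable?(:ok),
    klass:          Ractor.shareable?(String),
    frozen_string:  Ractor.shareable?("fixed".dup.freeze),
    mutable_string: Ractor.shareable?("mutable".dup),
    # Freezing only the outer array is not enough: it still refers to a
    # mutable string.
    shallow_frozen: Ractor.shareable?(["inner".dup].freeze),
    # make_shareable freezes the whole object graph.
    deep_frozen:    Ractor.shareable?(Ractor.make_shareable(["inner".dup]))
  }
end

# A block that refers to an outer local variable cannot be isolated, so
# creating the Ractor fails with ArgumentError.
def outer_variable_rejected?
  greeting = "hello".dup
  Ractor.new { greeting.upcase }
  false
rescue ArgumentError
  true
end

p shareability_report
p outer_variable_rejected?
```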
00:12:43.680 There are two types of communication APIs: push-type and take-type communication. The push-type APIs are the `Ractor#send` and `Ractor.receive` methods. You can send an object as a message, and another Ractor can receive that message. This process follows a message-passing model, ensuring that object isolation is maintained. When you send an unshareable object, it gets copied, so the sender and the receiver never hold the same mutable object.
00:13:34.640 This program creates a Ractor named R1, and the main Ractor sends a string object to R1 using the `send` method. R1 receives the object with the `Ractor.receive` method, which waits until a message arrives. Comparing the object IDs shows that the ID seen in R1 differs from the one on the main Ractor, confirming that the object was copied.
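The copy-on-send behaviour can be reproduced with a sketch like the following. It is not the slide code: the method name `send_and_compare_ids` is hypothetical, and R1 reports the object ID it saw by sending a message back to the main Ractor.

```ruby
# Sends a mutable string to another Ractor and compares object IDs.
# `send_and_compare_ids` is a hypothetical helper name.
def send_and_compare_ids
  r1 = Ractor.new(Ractor.current) do |parent|
    msg = Ractor.receive              # blocks until a message arrives
    parent.send([msg, msg.object_id]) # report what R1 actually received
  end

  original = "hello".dup              # a mutable, unshareable string
  r1.send(original)                   # the string is copied on send

  received, received_id = Ractor.receive
  {
    same_content: received == original,
    same_object:  received_id == original.object_id
  }
end

p send_and_compare_ids
```

The content matches while the object IDs differ, which is what keeps the sender free to mutate its own string afterwards without affecting the receiver.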
00:14:28.200 The push-type communication involves an incoming port linked to an incoming queue. Sent objects are queued in this incoming queue, and any Ractor can send messages to any other Ractor. The incoming queue is infinite, so sending operations return immediately. If an incoming port is closed, sending messages will fail and raise an exception. If a Ractor terminates, the associated incoming port is also closed automatically.
00:15:43.200 The second communication type is the take-type. A Ractor can yield an object as a message using the `Ractor.yield` method, and the receiving Ractor can take this message using the `Ractor#take` method. In contrast to push-type communication, where the sender names the destination, the taker specifies which Ractor to receive from. In this communication type, the two Ractors wait for each other: the yielding Ractor blocks until someone takes, and the taking Ractor blocks until something is yielded.
00:16:38.080 This program creates a Ractor named R1 that yields three integers: 0, 1, and 2. The block then evaluates to the symbol :thing, and the main Ractor can take both the yielded values and the result of the block through the `take` method, retrieving 0, 1, and 2 followed by the block result.
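A minimal sketch of take-type communication, assuming the `Ractor.yield` and `Ractor#take` methods as they exist in the Ruby 3.0 series described in this talk (later Ruby versions may not provide them). The method name `yield_and_take` is hypothetical.

```ruby
# Assumes the Ruby 3.0-era take-type API (Ractor.yield / Ractor#take).
# `yield_and_take` is a hypothetical helper name.
def yield_and_take
  r1 = Ractor.new do
    3.times { |i| Ractor.yield(i) } # each yield waits for a taker
    :thing                          # the block's value is the final message
  end

  # Three yielded integers plus the block result make four messages.
  4.times.map { r1.take }
end
```

Calling `yield_and_take` on the main Ractor collects the three yielded integers followed by the block's value.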
00:17:33.920 Take-type communication is implemented with an outgoing port: a Ractor yields a message on its outgoing port, and another Ractor takes it from there. You can connect two Ractors through their outgoing and incoming ports, enabling pipeline-style computation. This also allows the main Ractor to send requests to worker Ractors, which can run the tasks in parallel. If the outgoing port is closed, any attempt to take a message from that Ractor will result in an exception.
00:19:30.640 Multiple Ractors can take messages from one Ractor, but each individual message is taken by only one of them. In this scenario, the main Ractor can combine the push-type and take-type methods. Load balancing can be achieved by sending requests to a bridge Ractor from which multiple worker Ractors take.
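The bridge-and-worker arrangement can be sketched as below, again assuming the Ruby 3.0-era `Ractor.yield`, `Ractor#take`, and `Ractor.select` methods. The squaring task, the method name `square_with_workers`, and the worker count are placeholders chosen for illustration.

```ruby
# Assumes the Ruby 3.0-era take-type API. `square_with_workers` is a
# hypothetical helper; squaring stands in for a real task.
def square_with_workers(numbers, worker_count: 3)
  # The bridge forwards every request it receives to whichever worker
  # takes from it next.
  bridge = Ractor.new do
    loop { Ractor.yield(Ractor.receive) }
  end

  workers = worker_count.times.map do
    Ractor.new(bridge) do |source|
      loop do
        n = source.take           # only one worker gets each request
        Ractor.yield([n, n * n])  # publish the result on the outgoing port
      end
    end
  end

  numbers.each { |n| bridge.send(n) }

  # Ractor.select returns [ractor, message] for the first worker that
  # has a result ready; results can arrive in any order, so sort them.
  numbers.map { Ractor.select(*workers).last }.sort
end
```

The bridge and the workers loop forever in this sketch; they are simply left blocked and end when the main Ractor exits.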
00:20:42.240 In this instance, each request will be processed by one of the worker Ractors. The main Ractor sends a request to the bridge Ractor, and one of the workers takes it and executes the task. We can program the bridge Ractor to pass along requests and collect the results from the worker Ractors. Additionally, I want to discuss an important feature: exception propagation. If a Ractor terminates due to an exception, the taking Ractor will raise a `Ractor::RemoteError`, similar to how `Thread#join` re-raises a thread's exception.
00:22:43.200 This program illustrates how a Ractor named R1 raises a TypeError; when the main Ractor attempts to take from R1, it receives a `Ractor::RemoteError` exception. This exception carries the original exception as its cause, showing where the failure was raised. In this way, the main Ractor can supervise other worker Ractors and detect exceptions as they happen. This is essential for managing the stability of the application's execution.
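Exception propagation can be sketched as follows, assuming the Ruby 3.0-era `Ractor#take` method. The method name `supervise_failing_ractor` and the error message are hypothetical.

```ruby
# Assumes the Ruby 3.0-era Ractor#take. `supervise_failing_ractor` is a
# hypothetical helper name.
def supervise_failing_ractor
  r1 = Ractor.new { raise TypeError, "bad input" }
  r1.take # re-raises the failure in the taking Ractor
  :no_error
rescue Ractor::RemoteError => e
  # e.cause is the exception originally raised inside r1.
  [e.cause.class, e.cause.message]
end
```

The failing Ractor also reports its exception on standard error, so a supervisor built this way sees the failure both in the log and as a rescuable error.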
00:23:53.040 In our effort to achieve object isolation between Ractors, we introduce several limitations. Non-main Ractors are restricted from using global variables, class variables, and instance variables of classes and modules. They also cannot access constants that refer to unshareable objects. The main Ractor remains compatible with Ruby 2.7. However, to run code in multiple Ractors, we may need to modify the application code.
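The global-variable restriction can be observed with a sketch like this one. The global `$ractor_demo_counter` and the method name `global_access_from_ractor` are made up for the example, and the error is rescued broadly because the exact exception class is not stated in the talk.

```ruby
# Tries to read a global variable from a non-main Ractor.
# `global_access_from_ractor` and `$ractor_demo_counter` are hypothetical.
def global_access_from_ractor
  $ractor_demo_counter = 42 # allowed: this runs on the main Ractor

  Ractor.new(Ractor.current) do |parent|
    begin
      parent.send([:ok, $ractor_demo_counter])
    rescue StandardError => e
      # Reading the global raised an error inside this Ractor.
      parent.send([:error, e.class.name])
    end
  end

  Ractor.receive
end

p global_access_from_ractor
```

Code that keeps state in globals or class-level variables therefore needs to be restructured, for example by passing the state to the Ractor as a message.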
00:24:39.680 While the implementation is still maturing and will display an experimental warning upon the first Ractor creation, I believe Ractor will enhance concurrent programming in Ruby by providing a means to write efficient programs with fewer thread safety concerns. Ruby 3.0 will ship with Ractors, and I hope you all enjoy the new capabilities it brings. Thank you for your attention.