A Sneak Peek into Ractors!

by Abiodun Olowode

The video titled "A Sneak Peek into Ractors" presented by Abiodun Olowode at Helvetic Ruby 2023 focuses on the concept of Ractors in Ruby, an experimental feature that aims to facilitate parallel execution without the complexities associated with thread safety. This talk aims to boost the understanding and potentially increase the adoption of Ractors among Ruby developers by reintroducing them and discussing their advantages and drawbacks.

Key Points Discussed:

  • Introduction to Ractors:
    Ractors were introduced by Koichi Sasada to address race conditions in Ruby's concurrency model. Despite being in the Ruby ecosystem for three years, Ractors haven't seen widespread adoption, prompting the need for this discussion to drive their utilization.
  • Understanding Concurrency and Parallelism:
    • Concurrency allows multiple tasks to progress in overlapping time periods, akin to a single chef managing multiple dishes.
    • Parallelism, however, involves multiple cooks working simultaneously on their respective tasks, allowing for true multi-core resource utilization.
  • Core Usage Observations:
    • Demonstrations were made comparing the execution of threads versus Ractors, indicating that Ractors allow all cores to be utilized simultaneously, unlike threads that are limited by the Global Interpreter Lock (GIL).
  • What are Ractors?:
    Ractors are Ruby's implementation of the actor model, enabling parallel execution while reducing thread safety issues such as data races and deadlocks. Each Ractor isolates its own scope to keep its state private and communicates with other Ractors via messages.
  • Shareable and Unshareable Objects:
    Ractors operate with shareable (immutable) and unshareable (mutable) objects, making copies of mutable objects, thus ensuring that each Ractor runs independently without interference.
  • Properties of Ractors:
    • They maintain a private state not affected by external modifications.
    • Communication occurs through sending and receiving messages, which avoids traditional synchronization mechanisms.
  • Considerations for Using Ractors:
    • Speed: Ractors can improve computation times but may slow object creation due to garbage collection overhead.
    • Compatibility: Existing programs may require redesigning to use Ractors efficiently because many rely on global states, which Ractors do not support.
    • Environment: Experimental in nature, Ractors should be cautiously adopted in production environments. Starting with non-critical code segments and leveraging them in testing can be beneficial.

Conclusion and Takeaways:

  • Ractors offer a promising approach to concurrency in Ruby but still have limitations regarding integration in existing applications. Developers should carefully assess their projects' needs in terms of speed, compatibility, and execution environments before implementing Ractors.
  • Engaging with the Ruby community about Ractors offers potential for further enhancements and compatibility improvements in the future.
00:00:06.720 Hello everyone, this is my first Ruby conference, so no pressure. In 2020, Koichi Sasada introduced the Ruby community to Ractors. It's been three years, and we can say that the adoption hasn't been so great, right? Has any of you here used them?
00:00:12.160 Okay, so I'm correct. The aim of this talk is to reintroduce us to Ractors for the purpose of driving their adoption. My name is Abiodun, short form AB, so just call me Ab. In this talk, I'm going to take us on a journey from what is the go-to option in Ruby regarding concurrency to what Ractors are, why they were created, and if they're really worth trying out at all. Along with new examples and illustrations, I'm also going to be reusing one or two examples from the Ruby documentation as well as Koichi's.
00:00:37.160 Before we can accurately talk about Ractors, it's important that we go over concurrency and parallelism. So, what is concurrency? Concurrency is the ability of a program to make progress on several tasks within the same time frame, allowing them to start, run, and complete in overlapping time periods. A good example of this would be a single chef. By single, I don't mean married or not partnered; I mean one chef.
00:01:12.799 So, as I was saying, it would be a single chef managing multiple dishes by transitioning between them. Let's call this context switching. She could add salt to the rice at one point or butter to the cake at some other point. However, at no point is she focused on more than one dish. How then can this be fast if we're not doing everything at the same time? Well, while we are waiting for the rice to boil, we can also add butter to the cake, and while we're waiting for it to simmer, we can make a salad. In Ruby, a good example of this would be threads.
00:01:58.560 Context switching happens with threads due to the Global Interpreter Lock (GIL). This is a lock that every thread must acquire in order to be executed. What does this mean for us? It means that irrespective of the number of cores that you have, you would only have one thread being executed at a time. Let's carry out a core usage observation for threads. These are two methods. Now I feel like I’m so short because I have to look up there, so I’ll just look here and assume that you can see.
00:02:39.280 Okay, so we have two methods: a single loop one, which just runs a loop, and a second one that spins up ten threads and runs a loop. This is a brief overview of my hardware. I'm using a MacBook Pro that has ten cores, six for performance and four for efficiency. You are the observers, so keep your eyes on the cores. First things first, we run this single loop method. What happens? Keep your eyes on the cores.
00:03:02.920 Yeah, right there we have a peek at one core. We kill this method and run the thread usage method, which spins up ten threads. Let's keep our eyes on the cores. Yeah, we can see several cores in use there. Another one there. We should see a third one. Yes, that one there. However, we don't see all the cores busy at the same time. This might not make any sense, but I promise it will all come together when we carry out this same exercise for parallelism.
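The two methods from this demonstration are not reproduced in the transcript. A minimal sketch of what they might look like follows; the names `single_loop` and `thread_usage` and the iteration counts are assumptions, and the loop is bounded here (the talk's version was killed by hand) so the script terminates on its own.

```ruby
# Hypothetical stand-in for the talk's loop: pure CPU work, bounded so it ends.
def single_loop(iterations = 1_000_000)
  count = 0
  iterations.times { count += 1 }
  count
end

# Spins up several threads that each run the loop. On CRuby the threads take
# turns holding the interpreter lock, so the Ruby code is not run in parallel.
def thread_usage(thread_count = 10, iterations = 1_000_000)
  threads = Array.new(thread_count) { Thread.new { single_loop(iterations) } }
  threads.map(&:value) # wait for every thread and collect its result
end

results = thread_usage(10, 100_000)
puts results.length
```

To repeat the core-usage observation, raise the iteration count and watch a CPU monitor while the script runs.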
00:04:18.760 So, what is parallelism? In the context of the chef example, this would be multiple chefs handling multiple dishes at the same time. Not necessarily dancing, but you get the point. So let's carry out a core usage observation for parallelism. We have the single loop as we saw before, and now we have a new method called a Ractor usage method that spins up ten Ractors and runs the single loop. Keep your eyes on the cores; remember you are the observers.
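A comparable sketch of the Ractor version, under the same assumptions (hypothetical method names, bounded loop). The helper `ractor_result` is also an assumption added for portability: the talk-era API collects a result with `Ractor#take`, while newer Ruby versions are assumed here to expose `Ractor#value` instead.

```ruby
Warning[:experimental] = false # silence the "Ractor is experimental" notice

# Hypothetical stand-in for the talk's loop: pure CPU work, bounded so it ends.
def single_loop(iterations = 1_000_000)
  count = 0
  iterations.times { count += 1 }
  count
end

# Assumption: use Ractor#value where it exists, otherwise fall back to Ractor#take.
def ractor_result(ractor)
  ractor.respond_to?(:value) ? ractor.value : ractor.take
end

# Spins up several Ractors that each run the loop. The iteration count is
# passed as an argument because the block cannot read outer local variables.
def ractor_usage(ractor_count = 10, iterations = 1_000_000)
  ractors = Array.new(ractor_count) do
    Ractor.new(iterations) { |n| single_loop(n) }
  end
  ractors.map { |ractor| ractor_result(ractor) }
end

results = ractor_usage(10, 100_000)
puts results.length
```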
00:04:59.680 Let's see what happens. Yes, that's what happens— all the cores are busy at the same time. When we kill this, everything becomes dormant, not doing anything. Yes, so I guess this is the perfect point to ask the question: what are Ractors? According to the Ruby documentation, a Ractor is designed to provide parallel execution of Ruby without thread safety concerns. Let's divide this into two functions for the sake of this talk. The first one will be achieving parallelism, as we just saw. This means that we can have all the cores working at the same time.
00:05:57.760 The second one will be eliminating thread safety concerns. What are these concerns? We have race conditions, data races, and deadlock or livelock issues. In order to understand how Ractors aim to eliminate thread safety concerns, we need to understand how they were modeled. Ractors were actually built based on the actor model. The actor model adopts the philosophy that everything is an actor, just as we have in object-oriented programming, where everything is an object.
00:06:39.440 So, within the actor model, we can say that the fundamental agent of computation is the actor. I guess it's already getting pretty clear where the name Ractors came from. From this beautiful piece of code, we can see where the name actually emanated from. I like it a lot. In the beginning, they were introduced as guilds, but with the release of Ruby 3, they were introduced as Ractors. What does this mean for us? We can say that Ractors are basically Ruby's implementation of the actor model.
00:07:10.360 If you don't recall anything from this talk, remember this: the properties of Ractors are the same as that of actors since Ractors are Ruby actors. So, what are these properties? First things first, Ractors are unable to access any objects through variables that are not defined within their scope. This means that any variable that you wish to access within it would have to be defined during initialization.
00:07:47.599 Let's see how that works. We have an array that is going to become very famous very soon in the next few minutes. This array has three items: one, two, and three. We create a new Ractor within which we attempt to concatenate it. We try to mutate this array by adding three more items: four, five, and six. What happens? We get an error that says we cannot isolate a proc because it accesses outer variables.
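The failing attempt described here can be sketched as follows. The variable name `numbers` is an assumption; the error is rescued so the script keeps running and the message can be printed.

```ruby
Warning[:experimental] = false # silence the "Ractor is experimental" notice

numbers = [1, 2, 3]
isolation_error = nil

begin
  # The block refers to `numbers`, a local variable defined outside the Ractor.
  Ractor.new { numbers.concat([4, 5, 6]) }
rescue ArgumentError, Ractor::Error => e
  # Talk-era Rubies raise ArgumentError when the Ractor is created;
  # Ractor::Error is rescued as well in case other versions differ.
  isolation_error = e
end

puts isolation_error&.message
p numbers # the outer array was never touched
```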
00:08:34.920 This is because during the creation of a new Ractor, Ruby calls a method called Proc#isolate. It's an internal Ruby method that prevents the block from accessing any variable that has not been defined during initialization. How do we fix this? We can pass this array to the Ractor as an argument. Within this piece of code (it's not really a method), we attempt to concatenate the array, and it's actually successful, as we see we get an output of six items.
00:09:20.880 However, when we try to access the external array, we find that it still has three items. This is in opposition to what you would naturally expect from a Ruby method. Let's see a Ruby method: we have a concat_array method, and within it, we mutate this external array. When we do that, we get six items, and when we access this same array, we still get six items. That's not what happens with the Ractor.
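A minimal sketch contrasting the two behaviours just described. The method name `concat_array` comes from the talk; its body, the variable names, and the `ractor_result` helper (which picks `Ractor#value` or `Ractor#take`, whichever the running Ruby provides) are assumptions.

```ruby
Warning[:experimental] = false # silence the "Ractor is experimental" notice

# Assumption: use Ractor#value where it exists, otherwise fall back to Ractor#take.
def ractor_result(ractor)
  ractor.respond_to?(:value) ? ractor.value : ractor.take
end

# A plain method receives a reference, so the caller's array is mutated.
def concat_array(array)
  array.concat([4, 5, 6])
end

method_array = [1, 2, 3]
concat_array(method_array)
p method_array # six items: the caller sees the mutation

# Passing the array as an argument to Ractor.new hands the block its own copy.
ractor_array = [1, 2, 3]
worker = Ractor.new(ractor_array) { |array| array.concat([4, 5, 6]) }
inside = ractor_result(worker)

p inside       # six items, as seen from inside the Ractor
p ractor_array # still three items outside
```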
00:09:58.280 As we can see, we have six items, but the external array is still three items. This begs the question: what array got concatenated within the Ractor? Ractors basically operate based on the concepts of shareable and unshareable objects. Shareable objects are immutable, and those that are unshareable are mutable. See this like a child with a piece of candy who knows that a particular adult would always take a bite.
00:10:36.080 So, if you come and say, "Hello, baby, can I please see your candy?" The child says, "Nope," because they know you can mutate it. But if they trust you, like this is Daddy, oh, my favorite daddy, he's not going to do anything to my candy. When you say, "Can I see your candy?" they're like, "There you go," because it's shareable.
00:11:10.720 So, let's go back to our infamous array of three items. Let's see what actually happened here. Let's check the ID, and we see that we have two different IDs: one is 797420, and the other one is 824340. What actually happens is when you pass an object that is mutable to a Ractor, a deep copy is made because it's mutable. So it's not the same object; that's why we have two different object IDs.
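The ID check can be sketched like this. The actual ID values will differ from those on the slide; the point is only that the two IDs are not equal. The `ractor_result` helper is an assumption added for portability across Ruby versions.

```ruby
Warning[:experimental] = false # silence the "Ractor is experimental" notice

# Assumption: use Ractor#value where it exists, otherwise fall back to Ractor#take.
def ractor_result(ractor)
  ractor.respond_to?(:value) ? ractor.value : ractor.take
end

numbers = [1, 2, 3]

# The mutable array is deep-copied on its way in, so the block sees another object.
worker = Ractor.new(numbers) { |array| array.object_id }
inner_id = ractor_result(worker)
outer_id = numbers.object_id

puts "outside: #{outer_id}, inside: #{inner_id}"
puts outer_id == inner_id
```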
00:11:54.799 So how then can we make this array shareable? We can do this by utilizing the Ractor.make_shareable method, which makes this array shareable. Remember, once it's shareable, that means it's now immutable. So within this piece of code, we make this array shareable, and we try to concatenate it. What happens? We see that the ID is now the same: 824340. However, we get this error: "Cannot modify frozen array." That's because we have made it immutable by freezing it.
00:12:36.799 Ractors basically freeze any object that you try to share using this method. So if Ractors make a deep copy of unshareable objects and shareable objects are immutable, how then do we mutate states with Ractors? This brings us to the second property: Ractors communicate by sending and receiving messages, and if they have many, they process one at a time.
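A sketch of the shareable variant, assuming the same hypothetical variable names as before. The error is rescued inside the Ractor so that both the object ID and the error class can be reported back.

```ruby
Warning[:experimental] = false # silence the "Ractor is experimental" notice

# Assumption: use Ractor#value where it exists, otherwise fall back to Ractor#take.
def ractor_result(ractor)
  ractor.respond_to?(:value) ? ractor.value : ractor.take
end

# make_shareable deep-freezes the array and returns it.
numbers = Ractor.make_shareable([1, 2, 3])
outer_id = numbers.object_id

worker = Ractor.new(numbers) do |array|
  begin
    array.concat([4, 5, 6])
    [array.object_id, nil]
  rescue FrozenError => e
    # The array is frozen, so the mutation is rejected.
    [array.object_id, e.class.name]
  end
end

inner_id, error_name = ractor_result(worker)

puts outer_id == inner_id # same object this time, no copy was made
puts error_name
```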
00:13:31.160 Here, we have four Ractors: Ractor 1, Ractor 2, Ractor 3, and Ractor 4. Ractor 1 is the receiving one, while Ractors 2, 3, and 4 are the sending Ractors. They are trying to communicate with the receiving Ractor, which is Ractor 1. How do they do this? They first send messages to Ractor 1's incoming port, where the messages form a queue. Ractor 1 then takes the messages from the queue, as you can see, messages 2, 3, and 4, via Ractor.receive.
00:14:05.120 The messages are received, processed, and via yield, the result is sent back. That alone is not enough, though, because the senders still have to take the results. So I made it quite simple here: Ractor 2 sends a message to the port via Ractor.send, Ractor 1 receives it via Ractor.receive, processes it, and then via yield the result goes to the outgoing port and back to the sending Ractor.
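The send-and-receive round trip can be sketched with a single message. The Ractor name `doubler` and the doubling step are assumptions chosen for illustration; the result is collected from the Ractor's final value through a hypothetical `ractor_result` helper, which is the simplest form of the "take the result" step.

```ruby
Warning[:experimental] = false # silence the "Ractor is experimental" notice

# Assumption: use Ractor#value where it exists, otherwise fall back to Ractor#take.
def ractor_result(ractor)
  ractor.respond_to?(:value) ? ractor.value : ractor.take
end

# The receiving Ractor blocks on its incoming queue until a message arrives,
# processes it, and finishes with the processed value as its result.
doubler = Ractor.new do
  number = Ractor.receive
  number * 2
end

doubler.send(21)                # push a message to the incoming port
result = ractor_result(doubler) # pull the result back out

puts result
```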
00:14:51.440 So let's apply this to our infamous array of three items. In here, we create a new Ractor. Ractors typically terminate when the code block within them is done running. In order to keep them running, we have to put the particular part we expect to keep running in a loop. Here we have an array of three items, and via Ractor.receive, we receive a message, concatenate the array with it, and then we yield the result.
00:15:28.600 So let's see what happens. Right now, we are going to use Ractor.send, take, and yield, the push and pull types of communication. The first thing we do is we try to send an array of two items: four and five, via Ractor.send. We call take and see that we then have five items: one, two, three, four, five. That array has been mutated. When we send another array of two items: six and seven, we see that it has been further mutated, resulting in seven items.
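A sketch of a long-running Ractor that owns the array follows. The talk's version replies through yield and take; because that part of the API may not be available in every Ruby version, this sketch instead replies by sending the result back to the requesting Ractor. That swap, and the names `keeper`, `reply_to`, and `extra`, are assumptions.

```ruby
Warning[:experimental] = false # silence the "Ractor is experimental" notice

# The array lives only inside this Ractor. The loop keeps the Ractor alive
# so it can serve one message after another.
keeper = Ractor.new do
  items = [1, 2, 3]
  loop do
    reply_to, extra = Ractor.receive # wait for the next request
    items.concat(extra)              # mutate the private state
    reply_to.send(items)             # the requester receives a copy
  end
end

# Each request carries the sender (Ractor objects are shareable) and the new items.
keeper.send([Ractor.current, [4, 5]])
first = Ractor.receive

keeper.send([Ractor.current, [6, 7]])
second = Ractor.receive

p first
p second
```

The only way to change `items` is to send `keeper` a message, and the messages are handled one at a time.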
00:16:25.880 What does all of this mean? Basically, we're not here to discuss arrays, so what does this mean for us? It means that Ractors maintain a private state that cannot be modified by any system outside of them. So what do we gain from this? How many of you like football? Do you remember this game? Okay, so, basically, we experience freedom from race conditions and data races without shared states.
00:17:03.440 We reduce the likelihood of this because we would not have a Ractor trying to access a variable while another Ractor is trying to mutate it at the same time—typical data races. We also experience freedom from the overhead of using locks for synchronization. Basically, since they inherently prevent shared states via sending messages, the traditional synchronization methods using mutex and lock would no longer be necessary.
00:17:43.680 However, it's still important to ask: are we totally thread-safe? Because it looks like I've been selling gold, like Ractors are the solution to all problems in the world. But really, are we totally thread-safe? Two things: class or module objects are shareable, but they can actually be modified. Even though we said shareable objects should be immutable, these can still be modified within a multi-Ractor program.
00:18:35.520 Therefore, it's important that if you're using this, you need to be careful not to modify class or module objects, as this can introduce typical data races. The second thing is that blocking operations can still occur via waiting send, waiting yield, and waiting take. This means that we can still create programs that have deadlock or livelock issues.
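The first caveat can be checked with `Ractor.shareable?`. In this sketch the class `Counter` is hypothetical; it shows that a class counts as shareable even though it is not frozen, unlike an array, which has to be frozen before it becomes shareable.

```ruby
# Hypothetical class used only for illustration.
class Counter
end

class_shareable  = Ractor.shareable?(Counter)          # classes are shareable
module_shareable = Ractor.shareable?(Kernel)           # modules are shareable
class_frozen     = Counter.frozen?                     # yet the class is not frozen
array_shareable  = Ractor.shareable?([1, 2, 3])        # mutable array is not shareable
frozen_shareable = Ractor.shareable?([1, 2, 3].freeze) # frozen array of integers is

puts "class shareable: #{class_shareable}, frozen: #{class_frozen}"
puts "module shareable: #{module_shareable}"
puts "mutable array shareable: #{array_shareable}"
puts "frozen array shareable: #{frozen_shareable}"
```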
00:19:09.280 At the beginning, we discussed three thread safety concerns: race conditions, data races, and deadlock or livelock issues. Now we can see that it seems like we still have to tweak a few things. The next question will be: are Ractors right for my project? We have to consider three things, maybe four.
00:20:09.440 The first thing will be speed. In speed, we will look at computation and object creation. Let's start with computation. This is the tarai method, which takes a lot of computation. This is an example from Ruby documentation. In this method, we have a sequential version and a parallel one. The sequential one runs four times, and the parallel one runs four times but using Ractors. What's the result? We can see that the parallel version is almost four times as fast.
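A sketch of the computation benchmark, modelled on the tarai example in the Ruby documentation. The arguments are much smaller here so the script finishes quickly, which is an assumption of this sketch; with inputs this small the Ractor start-up cost may outweigh the gain, so timings are printed but no particular ratio is expected.

```ruby
require "benchmark"

Warning[:experimental] = false # silence the "Ractor is experimental" notice

# Assumption: use Ractor#value where it exists, otherwise fall back to Ractor#take.
def ractor_result(ractor)
  ractor.respond_to?(:value) ? ractor.value : ractor.take
end

# Takeuchi function: deliberately heavy recursion, no object allocation.
def tarai(x, y, z)
  x <= y ? y : tarai(tarai(x - 1, y, z), tarai(y - 1, z, x), tarai(z - 1, x, y))
end

sequential_results = nil
parallel_results = nil

sequential_time = Benchmark.realtime do
  sequential_results = Array.new(4) { tarai(10, 5, 0) }
end

parallel_time = Benchmark.realtime do
  ractors = Array.new(4) { Ractor.new { tarai(10, 5, 0) } }
  parallel_results = ractors.map { |ractor| ractor_result(ractor) }
end

puts format("sequential: %.3fs, parallel: %.3fs", sequential_time, parallel_time)
```

Larger arguments make the CPU-bound work dominate, which is where a speed-up like the one on the slide would be expected to show.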
00:21:09.600 This doesn't mean that Ractors are generally four times as fast, but in this case, we run them four times. So if it was five times or six times, we would be getting that same result. Now let's look at object creation. This is one of Koichi's examples. We have a method that creates objects ten million times, which is not absurd at all, and then we do this four times sequentially and four times using Ractors.
00:21:46.840 What's the result? We can see that the sequential version is actually faster than the parallel version. This is because Ractors share the same space, and for garbage collection, we have to stop all of them, which is very costly. So, you would need to identify the parts of your code that can benefit from parallel execution. However, you also need to ascertain that introducing Ractors is not actually slowing them down instead of speeding them up.
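The object-creation comparison can be sketched the same way. The method name `make_objects` and the reduced object count are assumptions (the talk used ten million); the outcome depends on the Ruby version and machine, so the sketch only prints both timings.

```ruby
require "benchmark"

Warning[:experimental] = false # silence the "Ractor is experimental" notice

# Assumption: use Ractor#value where it exists, otherwise fall back to Ractor#take.
def ractor_result(ractor)
  ractor.respond_to?(:value) ? ractor.value : ractor.take
end

# Allocation-heavy work: every iteration creates a short-lived object.
def make_objects(count)
  count.times { Object.new }
  count
end

object_count = 200_000 # the talk used ten million

sequential_time = Benchmark.realtime do
  4.times { make_objects(object_count) }
end

created = nil
parallel_time = Benchmark.realtime do
  ractors = Array.new(4) { Ractor.new(object_count) { |n| make_objects(n) } }
  created = ractors.map { |ractor| ractor_result(ractor) }
end

puts format("sequential: %.3fs, parallel: %.3fs", sequential_time, parallel_time)
```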
00:22:38.280 The second thing is compatibility. Existing programs need to be redesigned to work with Ractors because many of our programs use gems, and many gems depend on global state. Ractors do not support global states because they try to maintain a private state. So, if you're going to use them, you need to be sure that you are willing to take the time to redesign your programs.
00:23:14.720 The next thing we need to check is the environment. When you create a new Ractor, the first thing you see is that Ractors are experimental and their behavior may change in future versions of Ruby. There may also be implementation issues. When you see this, your first reaction might be, "Oops, nope, on to the next one," so I can understand why you wouldn't want to try them out.
00:23:56.120 However, in production, you can identify non-critical parts of your code, try them out, build trust, and then expand. Nevertheless, in your test environment, Ractors can actually be greatly beneficial to speed up your tests because you can use them to parallelize your tests. This way, we build trust and gain greatly.
00:24:51.360 So this is a table just summarizing everything: speed, computation – yes; object creation – nope, not for now, but we are getting there. The team is working on a solution that isolates a Ractor such that garbage collection happens within it and you don't have to stop all Ractors. So let's hope that very soon that will be released.
00:25:39.760 Secondly, compatibility: new projects – yes, because now you know how they work, so you can decide to use gems that are Ractor compatible or streamline everything you're doing towards that. For already existing projects, I would say maybe there's no hard and fast rule here, because you have to determine whether your program needs a thorough redesign or if it just works based on how you’ve already designed it.
00:26:07.760 Environment: in the test environment – yes, parallelize your tests to make them faster. In production, start with non-critical points, build trust, and then expand. Now that we have seen the good, let's start with the great.
00:26:42.960 The good, the bad, and maybe the ugly of Ractors. Let's circle back to what they are. Ractors are a parallel execution feature in Ruby inspired by the actor model, in which each Ractor operates independently, helping mitigate thread safety concerns. We may not have all our thread safety concerns eliminated yet, and our journey towards optimal speed across all operations might still be ongoing.
00:27:49.679 However, if we all come together to use Ractors and actively contribute feedback, we can enhance their functionality. We can also create more gems that are Ractor compatible and contribute to a thriving ecosystem.
00:28:01.920 Picture a parallel execution feature in Ruby that is not only lightweight, but also economical in its usage of memory and streamlined in terms of communication complexity. You won't have to use locks anymore. It's all about message passing, and it is thread safe.
00:28:54.480 That's a feature that we can all be proud of. So let's start using them! Not today; maybe tomorrow morning, let's start to use them! In the next few years, when we talk about parallelism in Ruby, it can be a party of Ractors—Ractors everywhere.
00:29:32.960 Thank you so much for listening! You can connect with me.