Actors in Ruby! Why Let Elixir Have All The Fun?

http://www.rubyconf.org.au

We all want to have performant concurrent code, but threads are such a pain. Who wants to deal with race conditions in their code?! In this talk we show a better way of doing concurrency, the Actor model. No, that’s not just for Erlang or Elixir, we can have it in Ruby too.

RubyConf AU 2017

00:00:26.310 Hello everyone! Well, 500 people is a lot when you're staring at them, so let's talk a little bit about concurrency today. For the purposes of this talk, I'm going to define concurrency as doing multiple things at sort of the same time. It's something we use every time we interact with a computer.
00:00:40.479 The fact that you can, for example, listen to music while you're restarting your slide presentation means your computer is trying to do two things at the same time. You might say, "Oh, of course, because my computer has multiple cores," but that's not necessarily true. You don't need multiple cores to achieve this; computers have been able to perform multiple tasks for quite a long time through time-slicing.
00:01:03.850 They fake it by giving each task a small slice of time in turn, which we perceive as parallel processing even though nothing is actually happening in parallel. So, if you ask a fellow Rubyist whether they bother with concurrency (you can ask the person sitting right next to you), you're likely to receive an answer that falls into one of four categories.
00:01:22.780 The first category will be, "Yes, of course I use concurrency; my app runs on Unicorn, Thin, Puma, or whatever!" For these developers, concurrency stops at choosing the right application server; it's something the server does for them, not something they implement themselves. The second group will respond, "Of course not! Ruby's not built for that!" This reflects a widely held belief that Ruby struggles with concurrency, leading many programmers to question why they should even bother writing concurrent code.
00:02:11.590 The third response comes from newly founded startups: "Who cares? I can just throw money at the problem and it goes away!" However, this is an approach that works until it doesn't, resulting in a much larger issue. The fourth answer, which seems to be growing in popularity, is, "Oh, have you tried Elixir?" Developers dissatisfied with the previous three answers often turn to languages that offer a better story for concurrency.
00:02:29.490 So why do we have such a shaky history with concurrency in Ruby, considering its ubiquity? The first reason is a well-known issue: the Global Interpreter Lock, or GIL. This is a fundamental characteristic of the standard Ruby implementation (MRI), where a lock around the entire virtual machine makes its code thread-safe by blocking parallel execution. Consequently, to spare developers from dealing with concurrency issues, Ruby essentially prevents threads from running Ruby code in parallel.
00:02:56.170 While everyone knows about the GIL and brings it up whenever concurrency is mentioned, a lesser-known fact is that it does not provide complete protection. It only applies to Ruby C code, and sometimes that code has to call back into Ruby. For instance, when you're using Cocoa libraries (or any library, really), the Ruby C code often needs to request something from Ruby, and at that point, the lock is released. Thus, it's not entirely safe.
00:03:15.580 Another lesser-known fact is that the GIL is released during I/O operations. Therefore, while Ruby does have a GIL, it allows for some concurrent I/O, even though the lock generally constrains other operations.
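As a rough illustration of concurrent I/O under the GIL, here is a minimal sketch using only the standard library. `slow_request` and `fetch_all` are hypothetical names, and `sleep` stands in for a blocking network call, on the assumption that it releases the lock the same way blocking I/O does.

```ruby
# Hypothetical stand-in for a blocking network call.
# Assumption: sleep releases the GIL, as blocking I/O does.
def slow_request(id)
  sleep 0.2
  "response #{id}"
end

# Start one thread per request, then collect the results in order.
def fetch_all(ids)
  ids.map { |id| Thread.new { slow_request(id) } }.map(&:value)
end

started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
responses = fetch_all([1, 2, 3, 4])
elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started

puts responses.inspect
puts format("took %.2fs", elapsed)
```

Four sequential calls would need about 0.8 seconds; with threads the waits overlap, so the total should stay close to the duration of a single call.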
00:03:25.510 The second reason for the challenges with concurrency in Ruby is its inclination towards forking processes. Forking is a brute-force approach; if you want to do two things, you just create two copies of the process. This approach gives developers the safety of memory isolation, since the operating system keeps each process's memory separate, but it is also quite resource-intensive. Even with optimizations like copy-on-write, it still tends to consume more memory, and it complicates the coordination of work. Developers often need to rely on system calls, message queues, or other external mechanisms because these processes are isolated from one another.
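The trade-off can be sketched with `fork` and a pipe from the standard library. `sum_in_child` is a hypothetical helper, and the example assumes a platform where `fork` is available (not Windows). The child changes its own copy of the data, and the only way to get a result back is an explicit channel.

```ruby
# Hypothetical helper: do some work in a forked child and send the
# result back through a pipe, since the two processes share no memory.
def sum_in_child(numbers)
  reader, writer = IO.pipe

  pid = fork do
    reader.close
    numbers << 1_000_000            # changes only the child's copy
    writer.write(numbers.sum.to_s)
    writer.close
  end

  writer.close
  result = reader.read.to_i         # blocks until the child closes its end
  reader.close
  Process.wait(pid)
  result
end

numbers = [1, 2, 3]
child_total = sum_in_child(numbers)

puts child_total      # the sum as seen by the child
puts numbers.inspect  # the parent's array is untouched
```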
00:05:03.360 The third reason is what I call thread phobia, which is a common affliction among developers. It's the anxiety surrounding threads, defined by Wikipedia as a reaction of fear towards them. When we discuss threads, we often conjure images of dire issues such as deadlock, impossible bugs, and race conditions—strange behaviors that might only occur under specific conditions, like every other full moon. Developers typically prefer to avoid dealing with such precarious scenarios.
00:05:52.380 This fear is reflected even in official documentation, like that of Ruby's `ConditionVariable` class, which seems overly complex and intimidating. Concepts like this can lead to more confusion than clarity, creating a frustrating experience when trying to implement concurrency.
00:06:22.600 The root of the issue doesn't lie solely with threads; if you examine the thread API, it's actually quite simple. You create a thread with `Thread.new` and pass it a block. Where developers run into trouble is when they need to manage shared mutable state. Once threads start interacting while running concurrently, complications arise.
00:07:01.800 The tools that Ruby offers out of the box, like mutexes, condition variables, and monitors, are often clunky and complicated. Developers must know a lot about concurrency to use these tools effectively, which can feel like navigating a minefield. Thankfully, there are more interesting models to explore. Concurrency is not just a Ruby problem; it's been an issue in computing since its inception, leading to various theoretical approaches to address these challenges.
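To show how small the thread API is, here is a minimal sketch. `squares_in_threads` is a hypothetical helper; each thread works only on its own argument, so nothing is shared and nothing can race.

```ruby
# Hypothetical helper: one thread per number, no shared state.
def squares_in_threads(numbers)
  threads = numbers.map do |n|
    # Arguments passed to Thread.new are handed to the block.
    Thread.new(n) { |x| x * x }
  end
  # Thread#value waits for the thread and returns the block's result.
  threads.map(&:value)
end

puts squares_in_threads([1, 2, 3]).inspect  # → [1, 4, 9]
```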
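For comparison, this is roughly what protecting shared state with a mutex looks like. `count_with_lock` is a hypothetical helper. The code is short, but every access to the shared variable has to go through the lock, and nothing in the language checks that it does.

```ruby
# Hypothetical helper: several threads increment one shared counter.
def count_with_lock(threads:, increments:)
  counter = 0
  lock = Mutex.new

  workers = threads.times.map do
    Thread.new do
      increments.times do
        # Only one thread at a time may run this block.
        lock.synchronize { counter += 1 }
      end
    end
  end

  workers.each(&:join)
  counter
end

puts count_with_lock(threads: 10, increments: 1_000)  # → 10000
```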
00:08:18.890 One such model is the Actor model. To explain the Actor model, imagine an object-oriented system with two key differences. The first difference is that every object runs in its own execution context rather than sharing a thread with other objects, so each one operates independently, in isolation. The second key difference is that all method calls are asynchronous: they don't return values to the caller but instead send messages.
00:08:51.380 If we have those two aspects, a unique context for each object and asynchronous messaging, then we can create a system based entirely on actors. An actor holds its own state and receives messages in a mailbox that functions like a message queue. The actor processes its messages one at a time in its own context, so no other actor can touch its state while it is working.
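The mailbox idea can be sketched with a thread and a `Queue` from the standard library. `CounterActor` is a hypothetical class written for illustration only; it is not how any particular actor library is implemented.

```ruby
# Hypothetical minimal actor: private state, a mailbox, and one thread
# that handles messages strictly one at a time.
class CounterActor
  def initialize
    @mailbox = Queue.new
    @count = 0
    @thread = Thread.new { run }
  end

  # Asynchronous: enqueue the message and return immediately.
  def increment(by = 1)
    @mailbox << [:increment, by]
    nil
  end

  # Ask for the current count; the answer arrives on a reply queue.
  def read
    reply = Queue.new
    @mailbox << [:read, reply]
    reply.pop
  end

  def stop
    @mailbox << [:stop]
    @thread.join
  end

  private

  # Only this loop ever touches @count, so no lock is needed.
  def run
    loop do
      message, payload = @mailbox.pop
      case message
      when :increment then @count += payload
      when :read      then payload << @count
      when :stop      then break
      end
    end
  end
end

counter = CounterActor.new
senders = 5.times.map { Thread.new { 100.times { counter.increment } } }
senders.each(&:join)
puts counter.read  # → 500
counter.stop
```

Many threads send messages at once, yet the state is only ever changed by the actor's own thread, which is why the final count is reliable without a mutex in sight.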
00:09:49.320 As a concrete example, consider an API system for comparing flight prices. If a message arrives requesting to compare prices from Melbourne to Brazil, the system realizes that determining those prices can be complex. To address this, the main actor can spawn helper actors, sending them messages to perform tasks related to fetching prices for subsets of a date range, allowing for concurrent computations.
00:10:54.610 When the helper actors finish their price calculations, they send messages back to the main actor with the results. This allows the main actor to aggregate the information and return the cheapest fare found. The beauty of the actor system is that it can handle many actors concurrently, avoiding the shared state issues that typically trouble many threading approaches.
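The fan-out and aggregation steps can be sketched as follows, using plain threads and a `Queue` in place of real actors. `fetch_price` is a hypothetical, deterministic stand-in for a slow price lookup, and `cheapest_fare` assumes a non-empty list of dates.

```ruby
require "date"

# Hypothetical stand-in for a slow lookup against an airline API.
def fetch_price(date)
  200 + (date.day * 37) % 90
end

# The "main actor": splits the date range among helpers, then
# aggregates the messages they send back. Assumes dates is non-empty.
def cheapest_fare(dates, helpers: 3)
  results = Queue.new
  slices = dates.each_slice((dates.size.to_f / helpers).ceil).to_a

  slices.each do |slice|
    Thread.new do
      # Each helper reports the best [price, date] pair for its slice.
      results << slice.map { |date| [fetch_price(date), date] }.min
    end
  end

  # Wait for one message per helper and keep the overall cheapest.
  slices.size.times.map { results.pop }.min
end

dates = (Date.new(2017, 2, 1)..Date.new(2017, 2, 28)).to_a
price, date = cheapest_fare(dates)
puts "cheapest fare: #{price} on #{date}"
```

The helpers never share state with each other; the only communication is the result message each one posts to the queue.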
00:12:32.970 Actors have a long history that is intertwined with object-oriented programming. The first actor papers date back to 1973, and the first object-oriented programming languages emerged around the same time, showing how closely connected these concepts have always been. The famous computer scientist Alan Kay has defined object-oriented programming as primarily about messaging and encapsulation, ideas that are at the heart of the actor model as well. By replacing conventional objects with actors, we maintain the benefits of object-oriented design while sidestepping some of the more complex challenges associated with concurrent programming.
00:13:54.280 In the Ruby ecosystem, there are currently two active projects exploring concurrency through the actor model. One is Concurrent Ruby, which provides definitions and tools for various concurrency-related tasks. The other project is called Celluloid, which I will focus on during this talk.
00:14:43.820 Celluloid was developed starting in 2012 and has gained significant adoption within the Ruby community. What is great about Celluloid is that it makes creating actors fairly straightforward. You write a normal Ruby class and simply include the Celluloid module, which turns it into an actor. Upon instantiation, you get a proxy that behaves just like a normal Ruby object, but whose methods you can also call asynchronously.
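To give a feel for the "include a module, get a proxy" mechanics, here is a stdlib-only imitation. `MiniActor`, `AsyncProxy`, `spawn`, and `Recorder` are all hypothetical names invented for this sketch; it is not Celluloid's implementation and leaves out most of what the real library does.

```ruby
# Hypothetical imitation of an actor mixin; this is not Celluloid.
module MiniActor
  # Turns any method call into a mailbox message and returns at once.
  class AsyncProxy
    def initialize(mailbox)
      @mailbox = mailbox
    end

    def method_missing(name, *args)
      @mailbox << [name, args]
      nil
    end

    def respond_to_missing?(_name, _include_private = false)
      true
    end
  end

  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    # Build the object, then start the thread that drains its mailbox.
    def spawn(*args)
      new(*args).tap(&:start_actor)
    end
  end

  def start_actor
    @mailbox = Queue.new
    @actor_thread = Thread.new do
      while (message = @mailbox.pop) != :shutdown
        name, args = message
        public_send(name, *args)
      end
    end
  end

  def async
    AsyncProxy.new(@mailbox)
  end

  def shutdown
    @mailbox << :shutdown
    @actor_thread.join
  end
end

# An ordinary class becomes an actor by including the module.
class Recorder
  include MiniActor
  attr_reader :lines

  def initialize
    @lines = []
  end

  def record(line)
    @lines << line
  end
end

recorder = Recorder.spawn
recorder.async.record("first")   # returns immediately
recorder.async.record("second")
recorder.shutdown                # waits until the mailbox is drained
puts recorder.lines.inspect      # → ["first", "second"]
```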
00:15:40.760 Returning to the example of comparing prices, with Celluloid, we can define a price comparator that initializes multiple worker actors to handle the computational load. Each worker can fetch prices independently, with the aggregator holding onto the results returned to it, effectively managing multiple price comparisons seamlessly.
00:17:26.550 However, one critical aspect of these actor systems is error handling. If an actor encounters an error, it’s essential to manage this gracefully. One mechanism provided is the finalizer, a callback invoked when an actor is shutting down, whether it exits normally or crashes. It can be used to deal with responses that are still queued up and to ensure proper cleanup.
00:18:28.910 Celluloid also provides a linking feature where you can tie actors together. If a linked actor fails, the parent actor is notified, allowing for a more resilient system. This approach shifts the focus away from trying to prevent errors from happening and instead ensures that when they do occur, the system can handle them appropriately.
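The notification idea can be sketched without any library: the child reports its own outcome to the parent's mailbox instead of failing silently. `spawn_linked` and the message shapes are hypothetical; a real actor library handles this wiring for you.

```ruby
# Hypothetical stand-in for linking: whatever happens to the child,
# the parent finds out through a message in its mailbox.
def spawn_linked(name, parent_mailbox, &work)
  Thread.new do
    begin
      work.call
      parent_mailbox << [:finished, name, nil]
    rescue StandardError => e
      parent_mailbox << [:crashed, name, e.message]
    end
  end
end

parent_mailbox = Queue.new
spawn_linked(:ok_worker, parent_mailbox) { 21 * 2 }
spawn_linked(:bad_worker, parent_mailbox) { raise "price feed unavailable" }

# The parent reacts to failures rather than trying to prevent them.
2.times do
  status, name, reason = parent_mailbox.pop
  if status == :crashed
    puts "#{name} crashed: #{reason}"
  else
    puts "#{name} finished"
  end
end
```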
00:19:54.340 Additionally, Celluloid supports actor pools, allowing multiple actors to handle a task and maintain a fixed number of active actors. If one fails, the system can restart it without requiring manual intervention. Finally, the supervisory model ensures there are always active actors in your pool, allowing for a self-regenerating system.
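The restart behaviour can be sketched with one thread watching another. `supervise` and its `max_restarts:` parameter are hypothetical; this is a simplified illustration of the idea and not Celluloid's supervisor.

```ruby
# Hypothetical supervisor: runs the work in a worker thread and starts
# a fresh worker whenever one dies, up to max_restarts times.
def supervise(max_restarts:, &work)
  Thread.new do
    restarts = 0
    begin
      worker = Thread.new do
        # The supervisor handles failures, so silence the default report.
        Thread.current.report_on_exception = false
        work.call(restarts)
      end
      worker.value # waits for the worker; re-raises whatever killed it
    rescue StandardError
      restarts += 1
      retry if restarts <= max_restarts
      raise
    end
  end
end

supervisor = supervise(max_restarts: 3) do |attempt|
  raise "flaky connection" if attempt < 2
  "succeeded on attempt #{attempt}"
end

puts supervisor.value  # → succeeded on attempt 2
```

The worker code contains no retry logic of its own; recovering from the crash is entirely the supervisor's job.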
00:21:35.700 When designing these systems, you can define the types of actors and how they communicate within a supervisory framework. This sort of dynamic configuration means that your actor system can adapt to various workloads, handling failures in a predictable manner. This allows for a flexible, resilient, and comprehensive framework for building concurrent Ruby applications.
00:22:29.170 It's important to note that the underlying structure is live; as your system runs, you have the ability to replace and redefine actors without halting execution. This flexibility means you can effectively upgrade and modify your system while it remains operational.
00:23:38.000 Another exciting development in the Ruby ecosystem is a proposed concurrency model from Koichi Sasada, who has suggested an approach that resembles the Actor model but introduces object ownership transfers for inter-thread communication. While this proposal is not finalized, it marks an important step toward improving concurrency in Ruby.
00:25:06.480 In conclusion, concurrency is undeniably challenging, and while it can seem daunting in Ruby due to its historical issues with threading, solutions like Celluloid offer a way to better manage concurrent tasks. As the Ruby community continues to evolve, the integration of actor models and advancements toward native concurrency models will provide new and innovative ways to handle concurrent programming.
00:26:30.420 While Ruby may not yet compete with languages like Elixir, which have been built around these concepts, tools like Celluloid present viable ways to embrace concurrency within Ruby projects. Whether it be through background jobs, microservices, or other applications, utilizing concurrency models offers exciting potential to optimize Ruby applications.
00:27:20.310 Now, let’s shift gears and open it up for some Q&A. If you already have an Elixir implementation, would you still opt for actors in Celluloid if given the choice? The reality is, if your team prefers Elixir’s concurrency model due to its maturity, there is little motivation to switch unless you have strong reasons for using Ruby.
00:28:02.570 Concerning how concurrency works in Ruby, the GIL does restrict true parallel execution. However, Ruby's capabilities shine when handling I/O operations, where you can take advantage of concurrent I/O through the actor model without being constrained by the GIL.
00:28:48.890 With regard to using Celluloid or Concurrent Ruby, both libraries offer unique features, yet Celluloid is more mature and focused specifically on actors. Therefore, if your project is actor-heavy, it stands to reason that you will benefit from using Celluloid.
00:30:21.890 As for implementation details at my company, most of our microservices are written in Elixir, and we haven’t used actors extensively in our Ruby-based projects.
00:30:43.420 Thank you for your time!