Async Ruby

Bruno Sutic

#asynchronous-programming

#rubygems

#scalability

This video, presented by Bruno Sutic at RubyConf 2021, introduces the innovative Async Ruby, a game-changing addition to the Ruby programming language. Async Ruby offers a new approach to concurrency, described as 'threads with NONE of the downsides.' Despite its significant capabilities and support from the Ruby core team, it remains relatively unknown in the Ruby community. This talk aims to highlight both basic and advanced features of Async Ruby, suitable for a wide audience.

Key points covered in the presentation include:

Asynchronous Programming Explanation: The speaker begins with an overview of asynchronous programming, demonstrating its principles through a JavaScript example involving an HTTP GET request and the use of promises, illustrating the performance advantages of asynchronous versus synchronous programming.
Limitations of Ruby's Synchronous Nature: Bruno points out that Ruby code traditionally runs synchronously, so multiple requests take longer to complete, and shows how Async Ruby lets them run concurrently.
Introduction of Async Ruby: Async Ruby is introduced as a gem that can be easily installed and used. It supports concurrent execution without the common complexities of traditional Ruby threads, such as race conditions and thread limits.
Async HTTP Examples: Several coding demonstrations show how to perform multiple HTTP requests simultaneously using Async Ruby, achieving improved execution times compared to standard synchronous Ruby methods.
Application to Other IO Operations: The talk expands beyond HTTP requests, showcasing that other networking operations, like Redis commands and system commands, can also be executed asynchronously, significantly improving performance.
Key Concepts Explained: The speaker outlines three essential concepts behind Async Ruby: the event reactor, fibers, and the fiber scheduler, explaining how these components work together to manage asynchronous execution seamlessly.
Production Readiness and Community Success: The presentation concludes with examples of successful implementations in production settings, highlighting the real-world performance improvements achieved by organizations using Async Ruby.

Overall, Async Ruby represents a significant advancement in Ruby programming that maintains backward compatibility without overhauling existing code. The talk encourages developers to explore Async Ruby to enhance performance and scalability in their applications.

00:00:10.400 Hi.

00:00:11.440 In this talk, I'm going to talk about Async Ruby. Async Ruby is an incredible addition to the Ruby language. It has been available for some time now, but relatively few people know about it, and it has remained outside the Ruby mainstream.

00:00:22.960 The goal of this talk is to give you a high-level overview of what Async Ruby is about. Whether you're a beginner or an advanced Ruby developer, I hope to share something you didn't know about Ruby. We'll explore a couple of simple examples that demonstrate the power of asynchronous programming, and we'll explain the core concepts of how it all works.

00:00:41.120 I've been a Ruby programmer for ten years now, and in my opinion, this is by far the most exciting addition to the Ruby language during this time.

00:01:05.519 My name is Bruno Sutic. I'm an early adopter of Async Ruby and have made a few small contributions to it. You can find me on GitHub as "bruno-sutic". I don't use social networks, but you can find my contact information on my webpage, brunosutic.com.

00:01:23.840 Now, before jumping into Async Ruby, let's explore what async really means. What is asynchronous programming?

00:01:30.479 I think it's widely accepted that JavaScript brought asynchronous programming into the mainstream consciousness of developers. Therefore, it seems fitting to explain asynchronous programming with a simple JavaScript example. I also assume many of you have written at least a little JavaScript because it's so unavoidable these days.

00:01:45.520 Now, let's look at this example. We're making a simple HTTP GET request to httpbin.org. We're registering a promise that runs when the response to the request is received. This function just prints the response status. The output shown below the code is as expected, which demonstrates a simple example of an async program where we're making an I/O request, and then something happens later in a callback when the request is done.

00:02:25.040 One thing to note in the output here is that the program first runs the code on the last line, which prints the string. Later, when the request is done, it prints the response status. If you think about it, it's unusual for simple programs to run backward like line 1, line 4, then back to line 2. For us developers, programs that run top to bottom are easier to understand. The point I'm trying to make here is that async programs are harder to follow and understand compared to synchronous programs, which are easier to reason about.

00:03:07.760 In the case of JavaScript, as the program becomes more complex, you may end up in an infamous state called "callback hell," "promise hell," or even "async-await hell." So then why would we want to make our programs asynchronous? Why not just stick to a linear, top-to-bottom approach? The answer is simple: performance.

00:03:30.879 To understand this, let's look at the following example with some JavaScript pseudocode. Here, we're making three HTTP GET requests, and each one takes two seconds to complete. How long will this whole program run? Surprise, surprise: the program will run for only two seconds in total. In this example, we are firing three HTTP requests at almost the same time. The trick is that waiting for the responses happens in parallel, and asynchronous programming enables this parallel waiting, which is how we achieve these significant performance gains.

00:04:11.120 If we look at the equivalent code in Ruby, we'll see that the same example takes three times longer to run. Three times two seconds equals six seconds in this case. The reason for this is that there's no parallel waiting on responses; Ruby is synchronous.

00:04:32.960 So, how do you make three, five, or even a hundred requests in Ruby more performant? You use threads. In this example, we see how to speed up our program with three requests in Ruby, and the whole program finishes in two seconds. This works. Now, you may be wondering: "Ruby isn't asynchronous by design, but it has threads. Are we good then?" If you've done any real-world programming with Ruby threads, you know the answer to that question.
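The difference between the sequential version and the threaded version can be sketched with nothing but the standard library. This is an illustration rather than the code from the slides: `fake_request` is a hypothetical stand-in that sleeps instead of making an HTTP call, and the durations are shortened so the sketch runs quickly and offline.

```ruby
# Hypothetical stand-in for an HTTP GET: it only sleeps, so no network is needed.
def fake_request(seconds)
  sleep seconds
  200 # pretend HTTP status
end

# Returns how long the given block took, in seconds.
def elapsed
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
end

# One request after another: the waiting adds up.
def sequential_time(count, seconds)
  elapsed { count.times { fake_request(seconds) } }
end

# One thread per request: the waiting overlaps.
def threaded_time(count, seconds)
  elapsed do
    threads = count.times.map { Thread.new { fake_request(seconds) } }
    threads.each(&:join)
  end
end

puts format("sequential: %.2fs", sequential_time(3, 0.2))
puts format("threads:    %.2fs", threaded_time(3, 0.2))
```

With three simulated requests, the sequential run should take roughly three times as long as the threaded one, which mirrors the six seconds versus two seconds described above.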

00:05:01.760 Threads in Ruby are hard. Specifically, there are two problems with them: the first is language-level race conditions, which are particularly nasty and hard to debug. This type of problem can occur even in the simplest thread programs. The second problem with threads is the maximum number of threads. This is relevant when you want to make a large number of parallel requests. I tried maxing out the number of threads on my mid-range MacBook, and the maximum number I was able to spawn was 2,000. That may seem like a lot, but if you have, say, a million HTTP requests to make, that number of threads is not sufficient.

00:05:39.430 Let's talk about Async Ruby. Async Ruby is a new type of concurrency in Ruby. If you've ever thought, "I want to do multiple things at the same time in Ruby," Async may be a good fit. For example, you may want to serve more requests per second with the same hardware, make more requests with your API client at the same time, or handle more WebSocket connections concurrently. Ruby has a few options when you want to perform multiple tasks simultaneously.

00:06:04.000 First, you can do more work with multiple Ruby processes. Second, there are Ractors, a new Ruby 3.0 feature, although they do not seem to be production-ready yet. The third approach, which we've already mentioned, is threads. Lastly, we now have Async.

00:06:48.080 So, what is Async Ruby, and how do you run it? Async is just a gem, and you can install it by using the command "gem install async." It's a very nice gem because Matz invited it to Ruby's standard library, though that invitation has not yet been accepted. The gem's creator is Samuel Williams, a Ruby core committer, so you can feel that the Ruby core team, including Matz himself, is backing this gem.

00:07:04.400 Async Ruby is also an ecosystem of gems. For instance, there's async-http, a powerful HTTP client; async-await, which is syntactic sugar; Falcon, a highly scalable async HTTP server built around the Async core; and async-redis and async-websocket, among others. This talk will mainly focus on the core Async gem and the accompanying Ruby language integration.

00:07:51.360 Let's do an Async Ruby example that is equivalent to the JavaScript example we had before. In this example, we're using the async-http gem. The only thing you have to know about it is that it's an HTTP client; you call "get" on it, and it makes the request. The actual code starts with a capitalized "Async" Kernel method called with a block. All the asynchronous code in a Ruby program is always wrapped in this Async block.

00:08:41.280 Async Ruby has a concept of tasks. We spin multiple tasks when we want to run things concurrently. In this example, we're running three requests at the same time, and just like in the previous JavaScript example, all three requests are started at virtually the same time. The big win here is that waiting for the responses happens in parallel. The total running time of this program is slightly more than two seconds; this is not exactly two seconds because of network latency.

00:09:00.000 This basic example shows the general structure of Async Ruby programs. You start with an Async block that contains the main task, which is usually used to spawn more Async subtasks. These subtasks run concurrently with each other and with the main task. Just to make it explicitly clear, Async tasks can be nested indefinitely. A task can spawn a subtask, which can then create a sub-subtask, and so on.

00:09:45.760 Another important point is that it's all just Ruby. Async does not implement any special DSL nor does it involve gimmicks like monkey patching. In the previous example, we only performed HTTP requests within tasks, but you can run any Ruby code anywhere—main tasks or subtasks. It's just standard Ruby code with method calls and blocks.

00:10:19.440 Okay, hopefully, you had a positive first impression of Async Ruby. Once you get a little used to how things work, you may find it really neat, and the performance benefits are remarkable. Let's now see another code example. If you're not impressed yet, this may just blow your mind.

00:10:58.000 You may not have liked that we used a new HTTP client in the first example, but the truth is that you can use Ruby's "URI.open" to achieve the same result. Here we see that two requests triggered with "URI.open" are completed in approximately two seconds; this yields the same result as before. However, "URI.open" may also not be your favorite tool.

00:11:40.080 The brilliant thing about Async Ruby is that any HTTP client is supported. Let's utilize "httparty" and see how that works. The program ran in about two seconds, which means all requests ran concurrently.

00:12:06.560 So far, we've only seen examples making HTTP requests, but what about other network requests? Let's try Redis, which has its own protocol built on top of TCP. This Redis command runs for two seconds before returning.

00:12:36.360 We run the program and it completes in about two seconds. Wow! We can also make Redis commands asynchronous. In fact, any I/O operation can be made asynchronous. All existing synchronous code is fully compatible with Async. You don't have to use only Async-specific gems like "async-http" or "async-redis"; you can continue using the libraries you're already familiar with.

00:13:16.880 Let's add another example to the mix. I'll use "net-ssh" to execute an SSH command on the remote server. This SSH command runs "sleep" on the target server, completing in about two seconds.

00:13:51.680 There you have it. We added SSH to the mix, and it works seamlessly with other network requests. You may be wondering, what about databases? We connect to databases over the network.

00:14:29.120 I'll use the "sequel" gem to check if asynchronous database operations are supported. The query you're looking at takes exactly two seconds to run. And yes, that is supported as well. Cool, right?

00:15:02.160 Let's see another example. What do you expect? Will this "sleep" increase the total program duration by two seconds? The entire program runs in about two seconds, indicating that this "sleep" ran concurrently with other tasks.

00:15:49.880 So not only can we run network I/O asynchronously, but we can also run other blocking operations async. What other often-used blocking operations do we perform in Ruby? How about spawning new child processes? I'm using a "sleep" system command in this example.

00:16:33.240 Don't get confused; this is actually running an external system command. It could be any executable; I chose "sleep" to control the duration easily. And there you have it—system commands can run asynchronously as well. We covered a lot in this last example, and hopefully, these features look exciting.

00:17:14.320 You saw something new—something really innovative in Ruby. But that's not all. Let me show you how easily Async Ruby scales. I will run every task in an Async block ten times.

00:18:01.280 Quick note about "net-ssh": I had to remove that one because I couldn’t configure the SSH settings correctly for this example. What do you think? How long will this program run? Two seconds? Yes! We're running sixty tasks, each lasting two seconds, and the total program runtime is slightly more than two and a half seconds.

00:18:54.960 How about cranking things up? Let's repeat this a hundred times. Let's see what happens. We're now running 600 concurrent operations. The total program runtime increased by a second due to the overhead of establishing so many connections.

00:19:36.080 Still, I find this pretty impressive. So there you have it—easy scaling with Async. You can increase the numbers significantly, but in my case, the Redis server and Postgres database started complaining, so I left it at that.

00:20:05.080 You could argue we could do the same thing with threads by creating 600 threads. I think that's really pushing the limits with threads. My hunch is that thread scheduling overhead would be too high. When using threads, it's more common to limit the number of threads to 50 or 100.

00:20:54.480 On the other hand, 600 concurrent Async tasks are a common occurrence. The upper limit on the number of async tasks per process is in the single-digit millions. Some users have successfully achieved this limit, which of course depends on the system and the task at hand.

00:21:19.680 For example, if you're making or receiving network requests, you’ll likely run out of ports at around 40,000 to 50,000 concurrent tasks unless you adjust your networking settings.

00:21:54.319 In any case, I hope you get the idea that Async Ruby is a very powerful tool. To me, the biggest magic is running three HTTP requests with "URI.open". Using standard Ruby, that takes six seconds, but by employing the same method within an Async block, the program runs for two seconds.

00:22:17.520 The same goes for other examples like sleep and Redis. They all normally run in a blocking manner, but when placed within an Async block, they operate asynchronously. This is a great example of keeping Ruby code fully backwards compatible.

00:23:06.400 But how does this work? There's a lot to learn about Async Ruby, but I think there are three main concepts to grasp: the event reactor, fibers, and the fiber scheduler.

00:23:59.440 Each of these three topics is quite broad, so I'll just provide a summary. Let's start with the event reactor, which is sometimes called other names such as the event system or event loop.

00:24:06.880 Every async implementation in every language, JavaScript for instance, has some kind of event reactor behind it. Async Ruby is no exception. The current version of the Async gem uses the "nio4r" gem as its event reactor backend. nio4r in turn uses libev to wrap system-native APIs like epoll on Linux and kqueue on Mac.

00:24:54.000 What does the event reactor do? It effectively waits for I/O events. When an event happens, it performs the action we've programmed it to do. On a high level, when we make an HTTP request and then wait, the event reactor can notify us when the response for that request is ready and can be read from the underlying socket. These notifications are highly efficient in terms of resource usage, allowing for high scalability.
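To make the idea of "wait for I/O events, then run the registered action" concrete, here is a toy reactor built on `IO.select` from the standard library. It is only a sketch of the general technique; it is not how nio4r or the Async gem are implemented, and `TinyReactor`, `on_readable`, and `reactor_demo` are hypothetical names. Pipes stand in for network sockets.

```ruby
# A toy event reactor: remember a callback per IO, then loop on IO.select.
class TinyReactor
  def initialize
    @handlers = {} # io => callback to run once the io is readable
  end

  def on_readable(io, &callback)
    @handlers[io] = callback
  end

  # Blocks until some registered IO is readable, runs its callback, repeats.
  def run
    until @handlers.empty?
      ready, = IO.select(@handlers.keys)
      ready.each do |io|
        callback = @handlers.delete(io)
        callback.call(io)
      end
    end
  end
end

def reactor_demo
  events = []
  reactor = TinyReactor.new
  r1, w1 = IO.pipe # pretend "connection 1"
  r2, w2 = IO.pipe # pretend "connection 2"

  reactor.on_readable(r1) { |io| events << io.readpartial(64) }
  reactor.on_readable(r2) do |io|
    events << io.readpartial(64)
    w1.syswrite("response 1") # connection 1 only becomes readable now
  end

  w2.syswrite("response 2")   # connection 2 is ready first
  reactor.run
  [r1, w1, r2, w2].each(&:close)
  events
end

p reactor_demo
```

The callbacks run in the order the data becomes available, not in the order they were registered, which is the essential behavior of an event reactor.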

00:25:46.000 For example, if you hear that a server can handle 10,000 connections at the same time or that a crawler can make a large number of concurrent requests, the event reactor is likely the technology behind that. Async does have tasks that act as wrappers around fibers, and the event reactor drives the execution of these fibers.

00:26:15.080 For instance, when a response in task 1 is ready, the event reactor resumes task or fiber number 1. Later on, when the response in task 2 is ready, it resumes task or fiber number 2. You get the idea.
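The pause-and-resume behavior of fibers can be shown with plain Ruby. In this sketch a regular method plays the role of the event reactor and decides which fiber to resume; `fiber_demo` and its `ready_order` parameter are invented for the illustration, and no real I/O takes place.

```ruby
# Each fiber "sends a request", pauses, and is resumed when its response is ready.
# ready_order lists task numbers in the order their responses arrive.
def fiber_demo(ready_order)
  log = []

  tasks = [1, 2].to_h do |n|
    fiber = Fiber.new do
      log << "task #{n}: request sent"
      Fiber.yield                      # pause until someone resumes this fiber
      log << "task #{n}: response handled"
    end
    [n, fiber]
  end

  tasks.each_value(&:resume)           # run every fiber up to its first pause
  ready_order.each { |n| tasks.fetch(n).resume } # the "reactor" resumes them
  log
end

puts fiber_demo([1, 2])
```

Inside each fiber the code still reads top to bottom; only the point at which it continues is decided from the outside.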

00:26:53.440 Due to the decision to register fibers with event reactors, we achieve an important property: code within a single task behaves completely synchronously. This means you can read it from top to bottom, which is huge. It allows our async programs to be easy to write and understand.

00:27:04.560 The code behaves asynchronously only if you use task.async; there’s no way you can encounter callback hell in Async Ruby.

00:27:38.640 The last piece of the puzzle, and the final significant concept, is the fiber scheduler. The fiber scheduler is considered one of the big features of Ruby 3.0. It provides hooks for blocking functions inside Ruby.

00:27:58.080 Examples of these blocking functions include waiting for I/O reads or writes, or waiting for a sleep function to finish. In essence, the fiber scheduler transforms blocking behavior into non-blocking behavior inside an async context.

00:28:41.920 Let’s take the sleep method as an example. If you're running "sleep 2" within an async block, instead of blocking the entire program for two seconds, the fiber scheduler will execute that sleep non-blocking. It will use the event reactor's timing features to effectively sleep in one task while allowing other tasks to run during that time.
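The sleep hook can be illustrated with a deliberately tiny scheduler written against Ruby's fiber scheduler interface. This is a sketch under stated assumptions: it requires Ruby 3.0 or newer, it supports only timed `sleep` calls, and it is not how the Async gem implements its scheduler. `SleepOnlyScheduler` and `scheduler_demo` are hypothetical names.

```ruby
# A minimal fiber scheduler that only knows how to make Kernel#sleep non-blocking.
class SleepOnlyScheduler
  def initialize
    @sleeping = {} # fiber => monotonic time at which it should wake up
  end

  # Hook used by Fiber.schedule: create a non-blocking fiber and start it.
  def fiber(&block)
    fiber = Fiber.new(blocking: false, &block)
    fiber.resume
    fiber
  end

  # Hook called instead of a blocking sleep inside a non-blocking fiber.
  def kernel_sleep(duration = nil)
    raise ArgumentError, "this sketch supports timed sleeps only" if duration.nil?

    @sleeping[Fiber.current] = now + duration
    Fiber.yield # hand control back so other fibers can run
  end

  # Required by the interface, but not needed for this sleep-only sketch.
  def block(_blocker, _timeout = nil) = raise(NotImplementedError)
  def unblock(_blocker, _fiber) = raise(NotImplementedError)
  def io_wait(_io, _events, _timeout) = raise(NotImplementedError)

  # Called when the thread finishes: wake fibers in order of their deadlines.
  def close
    until @sleeping.empty?
      fiber, wake_at = @sleeping.min_by { |_, time| time }
      delay = wake_at - now
      sleep(delay) if delay.positive? # real sleep: we are in a blocking fiber here
      @sleeping.delete(fiber)
      fiber.resume
    end
  end

  private

  def now
    Process.clock_gettime(Process::CLOCK_MONOTONIC)
  end
end

# Runs one scheduled fiber per duration and returns the order they finished in.
def scheduler_demo(durations)
  finished = []
  Thread.new do
    Fiber.set_scheduler(SleepOnlyScheduler.new)
    durations.each_with_index do |seconds, index|
      Fiber.schedule do
        sleep seconds # routed to SleepOnlyScheduler#kernel_sleep
        finished << index
      end
    end
  end.join
  finished
end

p scheduler_demo([0.3, 0.1, 0.2])
```

All three fibers sleep at the same time, so they should finish in order of their durations rather than in the order they were started, and the total runtime is close to the longest single sleep.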

00:29:20.480 So that is the significant advantage of the fiber scheduler, along with fibers and the event reactor; they make Async Ruby seem like magic.

00:29:52.080 Now that we have a high-level idea of how things work, it's time to address the big question: does it work with Ruby on Rails?

00:30:14.400 The answer is currently no. The reason is that Active Record needs more work to support the Async gem. Another big question you may have is: is Async Ruby production-ready?

00:30:44.080 The answer to that question is yes; Async Ruby is indeed production-ready, and a number of users are running it in production. Everyone who uses it has nothing but praise for Async.

00:31:01.000 Recently, I spoke with Trevor Turk from helloweather.com. They replaced Puma and Typhoeus Hydra with Falcon and async-http, which immediately cut their server costs to one-third and made their overall system more stable.

00:31:53.679 If you feel excited about what you've seen in this talk, you're probably thinking, "How do I get started?" I think the best starting point for learning Async Ruby is the Async GitHub repo. From there, you'll find a link to the project documentation.

00:32:20.560 I've already mentioned Samuel Williams, but it doesn't hurt to mention him again. He is the sole creator of the Async Ruby ecosystem and a Ruby core committer who implemented the fiber scheduler feature. Huge thanks to Samuel for making an awesome contribution to all of us Ruby developers.

00:32:57.840 I hope you liked what you saw in this talk. Async Ruby is an exciting new addition to Ruby. It introduces a new type of concurrency to the language.

00:33:23.680 As you saw, it's highly powerful and very scalable. This changes what's possible with Ruby and alters the way I think about designing programs and applications.

00:34:10.560 One of the best aspects is that it does not obsolete any of the existing code. Just like Ruby itself, the Async gem is beautifully designed and a joy to use. Happy hacking with Async Ruby!

RubyConf 2021