00:00:12.519
Good afternoon! This presentation will be in English, so I apologize to my fellow Brazilians.
00:00:14.879
My name is Matheus Richard, and you might know me from Twitter. I work for a consultancy called thoughtbot, which many of you might recognize from FactoryBot. Today, we are going to talk about asynchronous Rails.
00:00:28.199
I know it’s a late Friday, and everyone is waiting for happy hour. But bear with me; I have something special for us. You may know a lot about async Ruby or nothing at all, so I've organized the talk into three parts. Each part will build on the previous one. Even if you don't understand everything, you'll still be able to take something away.
00:00:41.280
Feel free to take notes, but I will share the slides later. Before we dive into the async train, I want us to ensure we have the basics down. Let's start with the first chapter: what is asynchronous programming?
00:01:05.920
You might have some intuition about it, like understanding it as stuff running at the same time. While that's somewhat true, we need to be more precise. We have to discuss the difference between concurrency and parallelism.
00:01:18.159
To put it simply, concurrency is taking turns while parallelism is about independence. Let's illustrate this with an example: imagine two siblings who love to play video games but have only one controller. They can take turns, where one kid plays one level, and the other plays the next level. This approach is called cooperative execution.
00:01:40.200
However, if one of the kids is selfish, they might hog the controller, and this calls for some oversight. In the asynchronous world, the one providing this oversight is called a scheduler. A scheduler decides how much time each child gets and ensures that everyone gets their fair share. This is known as preemptive execution.
00:02:00.760
Now, let's discuss another problem. What if the kids want to play different games? They could switch games during their turns, but that means they would waste time changing games instead of actually playing. This context switching can lead to inefficiencies.
00:02:24.400
Ultimately, they want to play games at the same time. In other words, they desire independence, which equals parallelism. To achieve this, we need two separate consoles. One kid wants to play a different game, and that's perfectly fine.
00:02:41.040
Now, let's quickly recap:
Concurrency means taking turns, while parallelism means independence. That's the basics. We can now move on to the second chapter: let's talk about async Ruby.
00:02:58.640
There are several Ruby implementations, including mruby, CRuby, JRuby, and TruffleRuby. This talk will focus specifically on CRuby. If you don't know which Ruby you are using, it is probably CRuby.
00:03:14.560
The landscape of Ruby and async programming is quite diverse. On the concurrency side, we have threads and fibers; meanwhile, for parallelism, we have Ractors and processes. We will discuss each of these, starting with fibers.
00:03:38.919
Fibers resemble methods, but they are created using 'Fiber.new' with a block. Instead of invoking them normally, you call them with 'resume'. What makes fibers unique is 'Fiber.yield', which pauses execution and saves the current state.
00:03:56.480
When you call 'resume' again, the fiber starts from where it last yielded. It’s similar to a checkpoint in a video game. You define a fiber that allows the kids to take turns playing, maintaining a state variable that indicates the current game stage.
00:04:13.360
We could also create a similar fiber for the other child, and they would take turns peacefully. This way, you control how execution switches between the kids. For instance, if you favor one child over the other, you can let them play more. The point is that with fibers, you are the scheduler.
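The turn-taking described above can be sketched in a few lines. This is only an illustration: the `player` helper, the names `alice` and `bob`, and the `log` array are made up here and are not from the talk's slides.

```ruby
# Each fiber plays one stage, then hands the controller back with Fiber.yield.
def player(name, log)
  Fiber.new do
    stage = 1
    loop do
      log << "#{name} plays stage #{stage}"
      stage += 1
      Fiber.yield # pause here; the next resume continues from this point
    end
  end
end

log = []
alice = player("Alice", log)
bob = player("Bob", log)

# We are the scheduler: we decide who resumes and when.
2.times do
  alice.resume
  bob.resume
end

puts log
```

Calling `alice.resume` twice in a row instead would let one kid play two stages back to back, which is the "favoring one child" case.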
00:04:32.480
Now, let's talk about threads. They are similar to fibers: they are created with 'Thread.new' and a block, but they start running automatically. If you need to wait for them to finish, you call 'join'. The question arises: what exactly is the difference between threads and fibers?
00:04:52.960
Consider a simple example where you have some code that takes a long time to execute, like calculating the Fibonacci sequence multiple times. Suppose it takes about one second per call. You might assume that by using two threads, you would execute them in parallel, expecting it to finish in one second.
00:05:19.200
However, if you try that, you'll notice it takes two seconds instead. This is due to how Ruby's threads work: they do not run Ruby code in parallel. Instead, they switch automatically when one of them is waiting on I/O. If you aren't familiar, I/O refers to any operation that waits on something outside your program, such as reading a file or making a network request.
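A rough way to try this yourself is sketched below. The `fib` helper and the value of `n` are assumptions made for illustration, and the exact timings depend on your machine; on CRuby the threaded run is typically no faster than the sequential one.

```ruby
require "benchmark"

# Deliberately slow, CPU-bound work.
def fib(n)
  n < 2 ? n : fib(n - 1) + fib(n - 2)
end

n = 27 # assumption: raise this until a single call is slow enough to measure

sequential = Benchmark.realtime { 2.times { fib(n) } }

results = nil
threaded = Benchmark.realtime do
  # Thread#value joins the thread and returns the block's result.
  results = 2.times.map { Thread.new { fib(n) } }.map(&:value)
end

puts "sequential: #{sequential.round(2)}s, threaded: #{threaded.round(2)}s"
```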
00:05:44.720
If you instead have an I/O-bound task, like a sleep call, then yes, the two threads run concurrently and finish in around one second. Importantly, the execution of Ruby code in different threads never overlaps. The green portions in the execution diagram show where Ruby is actually executing code.
00:06:06.560
This means one thread can do its work while another waits for its I/O request to complete. The same applies to database queries, file reads and writes, and sleep calls.
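Here is a minimal sketch of the I/O-bound case, using `sleep 1` as a stand-in for a network call or a database query. Two threads that each wait one second should finish in roughly one second in total, not two.

```ruby
require "benchmark"

elapsed = Benchmark.realtime do
  threads = 2.times.map do
    Thread.new { sleep 1 } # stands in for a network call or a query
  end
  threads.each(&:join) # wait for both threads to finish
end

puts "Two sleeping threads took #{elapsed.round(2)}s"
```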
00:06:27.680
Let’s consider how threads might assist our siblings. Imagine a scenario where they have a health challenge where one sibling plays a game and the other runs around a block. To simulate this, they’ll need to sleep for a second between tasks.
00:06:44.640
Without automatic switching, one child must finish playing before the next child can play, creating inefficiencies. If we use threads, as soon as one child starts running, the other can start playing, leading to much more efficient use of time.
00:07:06.560
In this threaded approach, once implemented, it would take about five seconds for the two to finish their tasks, predominantly waiting on sleep calls. However, anyone who has worked with threads before will tell you they can get complicated.
00:07:28.920
If we analyze the behavior of the code, we might see odd stage numbers being played by the children. This indicates a race condition: both threads read and update a shared variable, and a switch at the wrong moment leads to unexpected results.
00:07:52.640
To fix this, we could introduce mutexes to create synchronized blocks, ensuring that the threads won't switch during critical updates to shared variables. This would help us get the expected results in five seconds.
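A minimal sketch of that fix, assuming a shared `stage` counter (the variable names are illustrative, not the talk's code). The read, the simulated wait, and the write all happen inside `synchronize`, so no other thread can slip in between them.

```ruby
stage = 0
lock = Mutex.new

threads = 2.times.map do
  Thread.new do
    5.times do
      lock.synchronize do
        current = stage
        sleep 0.01 # without the mutex, a thread switch here loses updates
        stage = current + 1
      end
    end
  end
end

threads.each(&:join)
puts stage
```

Removing the `lock.synchronize` wrapper is an easy way to see the race: the final count usually ends up lower than ten.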
00:08:06.400
However, threads require a deeper understanding because of the complexity introduced by shared mutable state. If those challenges put you off, you might appreciate Ractors, because they eliminate shared mutable state.
00:08:28.960
With Ractors, you can't even access local variables defined outside their block, which is how they ensure thread safety. However, they are still an experimental feature in Ruby, meaning they might have unknown bugs, which makes them unsuitable for production use.
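The isolation can be seen in a short sketch like the one below. The variable names are illustrative. Reading the result is hedged on purpose: `Ractor#take` is the call on Ruby 3.0 through 3.4, while newer releases use `Ractor#value`, so the sketch checks which one is available.

```ruby
greeting = "hello"

# A Ractor block cannot see outer local variables, so this raises
# ArgumentError instead of silently sharing `greeting`.
begin
  Ractor.new { greeting.upcase }
rescue ArgumentError => e
  isolation_error = e
end

# State has to be handed over explicitly; the string is copied on the way in.
ractor = Ractor.new(greeting) { |text| text.upcase }

# Ractor#take exists on Ruby 3.0 to 3.4; newer releases use Ractor#value.
shouted = ractor.respond_to?(:value) ? ractor.value : ractor.take

puts isolation_error.class
puts shouted
```

Ruby prints a warning that Ractor is experimental when the first one is created, which matches the caveat above.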
00:08:50.480
The most reliable option for parallelism in Ruby is processes. To create a new process, you call 'Process.fork', which runs a block in a separate copy of your Ruby process.
00:09:12.000
In our sibling analogy, each child plays in their own newly created environment. Processes are like building two new houses for the kids to live in while they play.
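A small sketch of forking, assuming a Unix-like system where `fork` is available. Because the child is a separate process with its own memory, the sketch uses a pipe to send a message back to the parent; the message text is made up for illustration.

```ruby
reader, writer = IO.pipe

pid = Process.fork do
  # This block runs in the child, a separate copy of the parent process.
  reader.close
  writer.puts "child #{Process.pid} finished its level"
  writer.close
end

writer.close          # the parent only reads
message = reader.read # blocks until the child closes its end of the pipe
reader.close
Process.wait(pid)     # reap the child so it does not linger as a zombie

puts message
```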
00:09:33.920
This approach is heavy and not ideal for frequent use since it consumes significant system resources. If we map our strategies on a chart from concurrency to parallelism and from lightweight to heavyweight, we see the following:
00:09:55.680
Fibers are easy and lightweight but only concurrent. Threads offer automatic switching on I/O but are heavier. Ractors are parallel but cumbersome because of their restrictions on shared mutable state. Lastly, processes are truly parallel but quite heavy.
00:10:13.600
In Ruby 3.0, non-blocking fibers were introduced, allowing fibers to switch automatically on I/O like threads do, but they require a scheduler implementation.
00:10:34.240
Fortunately, the Async gem provides such a scheduler. Using it, fibers can operate similarly to threads but in a more lightweight manner. Let’s move on to the third section: async Rails.
00:10:54.760
How can we use async in our apps, and how does Rails itself use it? First, a disclaimer: while I’ll present code examples, focus more on the principles, since they apply to various programming languages, not just Ruby or Rails.
00:11:13.520
There are two critical principles. The first comes from advice my mother imparted: 'don’t do tomorrow what you can do today.' I’m flipping that idea. The principle here is: don't do now what you can do later.
00:11:35.680
For instance, consider a registration controller where we send a welcome email after a user registers. Is it necessary to send that email immediately? Probably not. Instead, let’s send it later. This principle applies to various tasks such as collecting statistics or processing images.
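As a sketch of what "send it later" can look like in a Rails controller: the class, mailer, and route names below are hypothetical, and the snippet assumes an Active Job queue backend is configured. It is a fragment meant to live inside a Rails app, not a standalone script.

```ruby
class RegistrationsController < ApplicationController
  def create
    @user = User.new(user_params)

    if @user.save
      # Enqueue the email with Active Job instead of sending it inline.
      UserMailer.with(user: @user).welcome_email.deliver_later
      redirect_to root_path
    else
      render :new, status: :unprocessable_entity
    end
  end

  private

  def user_params
    params.require(:user).permit(:email, :password)
  end
end
```

The request returns as soon as the job is enqueued; the email itself goes out from a background worker.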
00:12:01.760
In another example, there might be a team model linked to many player models with a dependent destroy association. When we delete a team, Rails also deletes all associated players, which can take considerable time.
00:12:25.040
Instead, by using 'dependent: :destroy_async', Rails will enqueue a job to handle the deletion of associated players later, minimizing the time it takes to delete the team itself.
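In model code, that could look roughly like this. The `Team` and `Player` models follow the example in the talk, but their exact definitions are assumed. The option is available from Rails 6.1 and relies on Active Job being set up.

```ruby
class Team < ApplicationRecord
  # Players are destroyed by a background job after the team is gone.
  has_many :players, dependent: :destroy_async
end

class Player < ApplicationRecord
  belongs_to :team
end
```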
00:12:45.760
The second principle I want to emphasize is: don't stand still; don't wait idly. This means you should avoid blocking on a synchronous wait when other code could be running in the meantime.
00:13:10.640
For example, say we have a Twitter newsletter class that generates summaries from tweets. In reality, creating these summaries can take up to two seconds if we’re waiting on I/O.
00:13:32.480
However, we can run these tasks concurrently using the Async gem. By wrapping the code for each summary in an 'Async' block, you will effectively decrease your total wait time from two seconds to approximately one second.
00:13:54.960
This works not only for small quantities but can scale up to thousands of tweets with little increase in execution time, as long as the work is I/O-bound. I have used this in an application, and it resulted in a tenfold increase in speed.
00:14:16.680
Similarly, we can speed up database queries in Rails 7. Suppose you have numerous queries that take time to run; instead of running them one after the other, call the 'load_async' method on each relation so that they execute concurrently.
00:14:39.600
This method allows you to take advantage of concurrent querying, significantly reducing your overall execution time. By utilizing 'load_async', you get improved performance without considerable overhead.
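A sketch of how this might look in a controller follows. The controller, models, and columns are hypothetical. Note also that `load_async` only runs in the background when an async query executor is configured (for example `config.active_record.async_query_executor = :global_thread_pool`); otherwise the queries run synchronously as usual.

```ruby
class DashboardController < ApplicationController
  def show
    # Both queries are scheduled right away and run in background threads.
    @recent_posts = Post.order(created_at: :desc).limit(10).load_async
    @top_players = Player.order(score: :desc).limit(10).load_async
    # The view blocks only if a result is not ready when it is first used.
  end
end
```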
00:15:02.160
However, caution is necessary when utilizing concurrent queries. Using 'load_async' extensively on busy routes might lead to contention in your database thread pool, causing delays for other users.
00:15:25.160
A good use case occurs when you have an HTTP request alongside a database query. You can execute both simultaneously to optimize performance.
00:15:48.320
Let’s now discuss lazy loading, particularly with Turbo frames in Rails applications. Imagine a dashboard application showcasing various charts where generating one query might take as long as 10 seconds.
00:16:07.440
Even with async loading, the whole page would still require that time unless we implement lazy loading. Turbo frames allow you to render elements only once they enter the viewport.
00:16:28.640
By using Turbo frames, even if some data takes longer to load, users see the rest of the page immediately while the slow sections load, minimizing the perceived wait time.
00:16:51.440
On the topic of performance enhancements, applying best practices in CSS and JavaScript loading also contributes significantly. For example, load CSS files lazily or use 'font-display: swap' to display a fallback font immediately while the preferred one loads in the background.
00:17:12.560
Image loading can also be optimized with lazy strategies, ensuring only visible images are downloaded, thus saving users both data and rendering time. Rails even allows for default lazy loading of all images.
00:17:33.920
Utilizing these approaches, your application can minimize unnecessary loading times, providing users with a more fluid and responsive experience.
00:17:55.520
Next, let’s touch on development tools that also work asynchronously. PostgreSQL, a favorite database, has a limitation when building indexes: by default, the table is locked against new writes while the index is being created.
00:18:16.960
However, creating the index concurrently lets you keep writing to the table while the index is built, though the build takes longer. Sometimes, a slower operation is preferable to an entirely unusable table.
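In a Rails migration, concurrent index creation could be sketched as follows. The table and column names are hypothetical, and the migration version should match your Rails version.

```ruby
class AddIndexToPlayersOnTeamId < ActiveRecord::Migration[7.0]
  # CREATE INDEX CONCURRENTLY cannot run inside a transaction,
  # so the migration's wrapping DDL transaction has to be disabled.
  disable_ddl_transaction!

  def change
    add_index :players, :team_id, algorithm: :concurrently
  end
end
```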
00:18:38.160
Regarding performance during testing, since Rails 6, you can run tests in parallel, leading to faster results. RSpec does not support this out of the box, but several gems bridge this gap.
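With the default Rails test framework, turning this on is a one-line change in the test helper, roughly like this:

```ruby
# test/test_helper.rb
class ActiveSupport::TestCase
  # Fork one worker per CPU core; each worker gets its own test database.
  parallelize(workers: :number_of_processors)
end
```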
00:19:05.440
As a final note, I want to highlight some pitfalls associated with async programming. With great power comes great responsibility—async code requires a robust understanding of threads and potential race conditions.
00:19:24.160
Errors may become increasingly difficult to debug, leading to situations where understanding the flow of your code becomes challenging. Before venturing into asynchronous programming, ensure your performance fundamentals are in place.
00:19:44.160
By addressing database indexing, resolving N+1 queries, and effectively implementing caching policies, you provide a solid foundation for introducing async programming.
00:19:54.880
After that, make sure you aren't relying on poorly designed algorithms: async won't save inefficient code.
00:20:11.440
If you did your homework right, then async programming could indeed be the performance upgrade that you’ve been waiting for.
00:20:31.600
That's all I have for you today, and I hope you enjoyed this session.