00:00:00.399
Our next talk is from our sponsor, Shopify, and it's titled "Fun Passing Blocks Around." Please welcome Alan Wu from Shopify.
00:00:15.759
Hi, my name is Alan. I am a member of the Ruby core team, and I work on improving the Ruby runtime at Shopify. I want to share some fun facts about the implementation details for passing blocks around in C Ruby today.
00:00:26.880
Let's get started. In Ruby, all methods implicitly take blocks, and methods can invoke the block they receive with the `yield` keyword. This is the most basic way to work with blocks, and it works well, but it has some limitations. For example, the method can't pass the block it receives to another method, nor can it save the block into a variable for later.
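A minimal sketch of this basic form. The method name `greet` is hypothetical and does not come from the talk's slides.

```ruby
# Every method can receive a block implicitly and run it with `yield`.
def greet
  yield "world"
end

result = greet { |name| "Hello, #{name}!" }
puts result
```

Note that `greet` has no name for its block, so it has nothing to hand to another method or to store.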
00:00:58.559
For more flexibility, we can give a name to the block argument in our method. The `yield` keyword still works, but it's also possible to call the block using the block argument. The block argument is either a Proc object or `nil` in case the method receives no block. Now, let's say you want to change the method to check whether a block was given. If you want to be efficient with CPU time, this is where things start to get interesting.
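The named form might look like the sketch below. The method names `twice` and `block_value` are made up for illustration.

```ruby
# Naming the block parameter with `&` makes the block available as a value.
def twice(&block)
  yield 1          # `yield` still works
  block.call(2)    # calling through the named parameter works too
end

# Returns whatever the block parameter holds.
def block_value(&block)
  block
end

seen = []
twice { |n| seen << n }

no_block = block_value                 # nil, because no block was given
some_block = block_value { :unused }   # a Proc object

puts seen.inspect
puts no_block.inspect
puts some_block.class
```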
00:01:19.920
Let's call our method a couple million times to see if we can measure how long our script takes to run before and after adding the `if` check. On my computer, the version with the `if` takes about three times longer to run. Of course, the `if` check itself takes some time to execute, but it shouldn't make the script take three times longer. So what's going on? To spoil the surprise, the version with the `if` check allocates a Proc object every time it runs.
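A rough way to reproduce this kind of measurement is sketched below. The method bodies and the iteration count are assumptions, since the slides are not part of this transcript, and the ratio will vary with the machine and the Ruby version.

```ruby
require "benchmark"

# Version without the check: only calls the block parameter.
def call_only(&block)
  block.call
end

# Version with the check: tests the block parameter before calling it.
def check_then_call(&block)
  block.call if block
end

iterations = 1_000_000  # assumed count; the talk says "a couple million"

plain_seconds = Benchmark.realtime do
  iterations.times { call_only { nil } }
end

checked_seconds = Benchmark.realtime do
  iterations.times { check_then_call { nil } }
end

puts format("call only:       %.3fs", plain_seconds)
puts format("check then call: %.3fs", checked_seconds)
```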
00:01:59.440
This is my first fun fact: some forms of block parameter usage don't allocate objects. The version without the `if` check benefits from a special optimization in the runtime. I will try to explain how the optimization works, but first I want to dive into why the optimization cannot be performed in all situations.
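One way to observe the allocations directly is to compare `GC.stat(:total_allocated_objects)` before and after a run. This is only a sketch: the helper `allocations` and both method bodies are hypothetical, and the exact counts depend on the Ruby version in use.

```ruby
# Hypothetical helper: how many objects were allocated while the block ran.
def allocations
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

def call_only(&block)
  block.call
end

def check_then_call(&block)
  block.call if block
end

call_only_count = allocations { 1_000.times { call_only { nil } } }
checked_count = allocations { 1_000.times { check_then_call { nil } } }

puts "call only:       #{call_only_count} objects"
puts "check then call: #{checked_count} objects"
```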
00:02:27.920
Normally, local variables are reserved out of a single chunk of memory. It's done this way because the reservation itself can be performed very efficiently. Reserving and releasing space essentially involve adding and subtracting from a single number. We reserve space when entering a method and release space when returning from the method. This strategy runs into trouble when local variables need to stay accessible after the method returns.
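The reservation scheme can be pictured with a toy model. This is not how the runtime is written; `ToyLocalStack` and all of its methods are invented here purely to show the "add and subtract one number" idea and why reuse of the space becomes a problem.

```ruby
# Toy model only: a fixed array of slots plus one integer "stack pointer".
class ToyLocalStack
  attr_reader :stack_pointer

  def initialize(size)
    @slots = Array.new(size)
    @stack_pointer = 0
  end

  # Entering a method: reserve `count` slots by adding to the pointer.
  def reserve(count)
    base = @stack_pointer
    @stack_pointer += count
    base
  end

  # Returning from a method: release the slots by subtracting.
  def release(count)
    @stack_pointer -= count
  end

  def write(index, value)
    @slots[index] = value
  end

  def read(index)
    @slots[index]
  end
end

stack = ToyLocalStack.new(16)

first_base = stack.reserve(1)            # a first method gets one slot for x
stack.write(first_base, :x_of_first_method)
stack.release(1)                         # the first method returns

second_base = stack.reserve(1)           # a later method gets the same slot
stack.write(second_base, :x_of_later_method)

# Reading through the first method's old slot now sees the later method's data.
stale_read = stack.read(first_base)
puts stale_read.inspect
```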
00:03:01.360
Let's look at an example. Here, we save the Proc object for the block into an instance variable for later use. We then return and call into method 3, which reuses the space method 1 used. When we call into the saved block, it would be incorrect for it to set the local variable `x` that is defined in method 3. So where is the local variable `x` defined in method 1, if it's not in the chunk of memory for local variables?
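The slide is not included in this transcript, so the following is a reconstruction under assumptions: the method bodies and the symbol values are invented, and only the overall shape (method 1 passes a block, method 2 saves it, method 3 calls it later) follows the description.

```ruby
def method1
  x = :set_by_method1
  # The block closes over method1's `x`: it returns the old value
  # and then overwrites it.
  method2 { |new_value| old = x; x = new_value; old }
end

def method2(&block)
  @saved_block = block   # keeps the block alive past method1's return
end

def method3
  x = :set_by_method3
  previous = @saved_block.call(:changed_later)  # method1 has already returned
  [previous, x]
end

method1
previous, x_in_method3 = method3
x_after = @saved_block.call(:changed_again)

puts previous.inspect       # the value method1 stored in its own x
puts x_in_method3.inspect   # method3's x is untouched by the saved block
puts x_after.inspect        # the saved block still sees method1's x
```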
00:03:56.239
C Ruby deals with this problem by dynamically allocating heap space and moving locals there when creating the Proc object for the block. In this case, the Proc object is allocated at the highlighted line. When method 2 returns to method 1, method 1 refers to freshly allocated heap space for the local variable `x`. This evacuation of local variables to the heap also happens when the Ruby program calls the `binding` method.
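The `binding` case can be seen from plain Ruby: locals stay readable and writable after the method that defined them has returned. The method name `make_binding` is hypothetical.

```ruby
def make_binding
  x = 10
  binding   # captures this method's local variables
end

captured = make_binding   # make_binding has returned at this point

first_read = captured.local_variable_get(:x)
captured.local_variable_set(:x, 11)
second_read = captured.local_variable_get(:x)

puts first_read
puts second_read
```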
00:04:06.239
But I digress; I've put more details about this operation on screen in case you are interested. That was the second fun fact I wanted to share today: local variables move when a Proc object is created for a block.
00:04:22.239
Now, let's talk about the special optimization that avoids allocating the Proc object and, hence, avoids evacuating locals. We've seen that it's problematic to refer to the chunk of memory for local variables once the method housing the block returns. However, it is acceptable to do this while the method is still running. So if we can guarantee that the method housing the block is running, we can call into the block without evacuating the locals. The runtime looks for this condition by strictly limiting the optimization to two operations on the block parameter: passing the block parameter to another method and calling the block parameter.
00:05:38.400
Performing any other operations with the block parameter defeats the optimization. While the method receiving the block is running, we also know that the method passing the block is running, because the method receiving the block must have received it from somebody. C Ruby is very conservative when looking for these two operations. For example, assigning the block parameter to a local variable defeats the optimization, and using the block parameter in an `if` condition also defeats the optimization.
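The four cases mentioned here can be written side by side. All method names are invented for illustration, and the comments restate what the talk says about each form rather than something this snippet measures.

```ruby
def receiver
  yield
end

# Eligible per the talk: the block parameter is only passed along.
def forwards(&block)
  receiver(&block)
end

# Eligible per the talk: the block parameter is only called.
def calls(&block)
  block.call
end

# Not eligible per the talk: the block parameter is assigned to a local.
def assigns(&block)
  saved = block
  saved.call
end

# Not eligible per the talk: the block parameter is used in an `if` condition.
def checks(&block)
  if block
    block.call
  end
end

results = [forwards { 1 }, calls { 2 }, assigns { 3 }, checks { 4 }]
puts results.inspect
```

All four methods return the same thing to the caller; the difference is only in whether a Proc object has to be created along the way.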
00:06:36.560
C Ruby's virtual machine uses a special instruction, `getblockparamproxy`, to implement this optimization. You can search for it in compile.c to see how it targets this optimization. The instruction pushes a unique special object that stands in for the Proc object. This proxy object has a special `call` method that behaves similarly to the `yield` keyword when called. I've been discussing this optimization a lot, and I would be remiss if I didn’t give credit to its author: this optimization was written by Koichi Sasada-san, or ko1 on GitHub.
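The instruction, spelled `getblockparamproxy` in CRuby's bytecode, can be spotted by disassembling a small program with `RubyVM::InstructionSequence`, which is specific to CRuby. The two methods in the compiled source are hypothetical, and which instruction appears for which method may differ between Ruby versions.

```ruby
# Source to compile; `calls` only calls the block parameter,
# `assigns` copies it into a local variable first.
source = <<~RUBY
  def calls(&block)
    block.call
  end

  def assigns(&block)
    saved = block
    saved.call
  end
RUBY

listing = RubyVM::InstructionSequence.compile(source).disasm

# Collect the distinct block-parameter instructions found in the listing.
block_instructions = listing.scan(/\bgetblockparam(?:proxy)?\b/).uniq

puts listing
puts block_instructions.inspect
```

On a CRuby build that has this optimization, the section of the listing for `calls` is expected to use `getblockparamproxy`, while a form that needs the real Proc object is expected to use the plain `getblockparam` instruction.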
00:07:03.360
So that was my third fun fact: C Ruby implements this allocation avoidance scheme with a special proxy object. All right, I have one last thing to discuss. You might be curious about whether this optimization is fully semantics-preserving. If it is, then all possible Ruby programs should behave the same both before and after the introduction of this optimization. This is important for maintaining compatibility with existing Ruby programs.
00:07:48.400
It turns out the optimization did introduce a behavior change, particularly in relation to the `lambda` method. To quickly summarize the issue, the `lambda` method needs the special ability to differentiate between a literal block and a block passed in with the `&` syntax. There is no straightforward way in Ruby code to know whether the block is a literal block. It's easy to see how this very specific corner case was missed while introducing this optimization.
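The distinction can be sketched as follows. The helper `wrap_with_lambda` is hypothetical, and the `rescue` is an assumption made to keep the snippet runnable across versions: depending on the Ruby release, passing a block parameter to `lambda` with `&` either returns the block as an ordinary Proc or is rejected with an error.

```ruby
# A literal block given to `lambda` produces a lambda.
from_literal = lambda { |a, b| [a, b] }

# Hypothetical helper: forwards its block parameter to `lambda` with `&`.
def wrap_with_lambda(&block)
  lambda(&block)
rescue ArgumentError
  :rejected   # assumption: some Ruby versions raise instead of returning
end

from_ampersand = wrap_with_lambda { |a, b| [a, b] }

puts from_literal.lambda?
puts from_ampersand.inspect
```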
00:08:04.720
The `lambda` issue was fixed in the 2.7 release, so this optimization is fully semantics-preserving now. Right? The answer is a definite maybe. There's an even more obscure and less important situation where the optimization poses a problem. It's so obscure and unimportant that I'm going to leave it here as a riddle for those interested to solve.
00:08:12.720
That was my fourth and final fun fact: the `lambda` method has the special ability to tell whether the block passed to it is a literal block or not. All right, that's all I have. Thank you for your attention, and I hope you have a nice day.
00:08:39.599
Thank you, Alan! We sure do have a nice day today. Thank you for the three plus one extra fun facts from Alan Wu at Shopify.