00:00:00
Ready for takeoff.
00:00:16
Oh my goodness! I should get at least six seconds back on this clock. That's not fair! They started the clock before they switched the slides up here, and I have to use all of these minutes. Thank you.
00:00:29
Thank you.
00:00:34
Oh, alright. Um, did anybody else have a hard time with these elevator buttons? When I got to the hotel, I just went to the elevator and thought, 'I don't understand where the buttons are.' One of the doors was open, so I just went in and used that.
00:00:46
Then I had to leave to meet people, and when I went to the elevator again, I thought, 'I don’t know how to use these.' It took forever for me to figure out that those buttons were actually part of the normal operation. I thought the emergency sign was the button.
00:01:05
I don’t know why I’m going on about this; I just don’t have time. So, this talk is titled: "Don't @ Me! Faster Instance Variables with Object Shapes."
00:01:11
Oh, hold on a sec, let me shut off the notifications here. Okay, sorry! Yeah, I'm very excited to be in Houston.
00:01:21
I've never been to Houston before, so I'm really happy to be here. The food is great; this is a lot of fun.
00:01:39
My name is Aaron Patterson. I usually put about 15 minutes of stand-up at the beginning of my presentations, but I just don’t have time for that here, so I cut all of it. I’m really sorry.
00:01:51
I’m part of the Ruby core team, and I’m also on the Rails core team. I go by 'tenderlove' everywhere online, so you can find me on all social media under that handle, except on LinkedIn where I use my more professional name, which is also tenderlove.
00:02:07
I work for a mom-and-pop e-commerce website called Shopify. I’m on the Ruby infrastructure team at Shopify, where we're working on various projects to improve the performance of Ruby as well as the quality of life of developers.
00:02:22
Our customers are essentially the development teams at Shopify, so we're making Ruby and Rails better so that they can do their jobs more quickly and with fewer resources. We are working on projects like YJIT, GC improvements, the variable-width allocation project, and other infrastructure improvements.
00:02:41
Today, I want to talk to you about instance variables and how they work. I was going to call this talk 'Instance Variables TMI' because I am going to share way too much information about instance variables. But instead of just presenting pure facts, I want to derive the way that instance variables work.
00:02:59
Hopefully, we can implement them together, and you’ll be able to come away with a deeper understanding of how they work and why we make different decisions regarding performance and optimization.
00:03:15
I’m also going to be talking about object shapes, which is a technique that we use for speeding up access to instance variables, as well as other things. This project has been ongoing at work; my team has been working on it, and it’s going to be shipping along with the Ruby 3.2 release.
00:03:36
I’m also going to discuss how all of these elements work together with YJIT to make instance variable access extremely fast. I'm going to cover all of this in 30 minutes.
00:03:53
First off, I want to say thanks to everyone on the Ruby infrastructure team, especially the YJIT team. I’ve been working very closely with Jemma on this project, and I also want to thank Maxime for her guidance.
00:04:00
Additionally, a shout out to John Hawthorn at GitHub, who has also been helping with this project. There have been a lot of people working together on this.
00:04:20
So let’s discuss how instance variables work. This is a joke for all the people from Seattle.
00:04:30
Just a note here: I’ll refer to them as instance variables, ivars, or IVs. Those all mean the same thing; I just need to shorten it sometimes because I only have 26 minutes left.
00:04:44
Let’s talk about implementing instance variables. Let’s say we have a very simple class like this with a few instance variables on it. If we were implementing a language, how might we store this data? I think a simple approach would be to store your instance variables in a hash table associated with the instance.
00:05:02
For example, we’ll have our instance of Hello here, and we’ll say it has a hash table associated with it. The key in the hash table will be the name of the instance variable, and the value will be the value of that instance variable.
00:05:26
Writing this code is easy to imagine. When you write an instance variable, it writes to the hash table; when you read one, it reads from the hash table. All of this seems simple to implement if we understand how hash tables work.
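As a rough illustration of that idea, here is a minimal sketch of an object that keeps its instance variables in a hash. The class name HashBackedObject and the methods ivar_set and ivar_get are made up for this example; they are not how any Ruby version actually spells this.

```ruby
# Hypothetical model: an object whose instance variables live in a hash.
class HashBackedObject
  def initialize
    @ivars = {} # instance variable name (Symbol) => value
  end

  # Writing an instance variable stores into the hash.
  def ivar_set(name, value)
    @ivars[name] = value
  end

  # Reading an instance variable looks it up in the hash.
  # Names that were never set read as nil, as in Ruby.
  def ivar_get(name)
    @ivars[name]
  end
end

hello = HashBackedObject.new
hello.ivar_set(:@foo, 1)
hello.ivar_set(:@bar, 2)
hello.ivar_get(:@foo) # => 1
```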
00:05:40
In fact, this is how instance variables were implemented in Ruby 1.8 and earlier. Those versions used a tree-walking interpreter, where we would take your code, turn it into a tree, and then walk each node in that tree to evaluate it.
00:06:03
Let's walk through an example. We have a very simple method called foo. The way it works is that before we can evaluate @foo plus @bar, we have to evaluate the children of the plus node.
00:06:30
@foo does a hash lookup to get its value, and then @bar also does a hash lookup to get its value.
00:06:38
Once we have those values, they get returned up the tree, and plus can execute to add those two values together and return that to the caller. Now Ruby 1.9 came along and introduced a virtual machine.
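Before moving on to the virtual machine, the tree-walking evaluation just described can be sketched in a few lines. This is only a toy: the node types IvarNode and PlusNode and the evaluate method are hypothetical names for this example, and the instance variables are passed in as a plain hash rather than taken from a real object.

```ruby
# Hypothetical AST nodes for the expression `@foo + @bar`.
IvarNode = Struct.new(:name)
PlusNode = Struct.new(:left, :right)

# Walk the tree: children are evaluated first, then their parent.
def evaluate(node, ivars)
  case node
  when IvarNode
    ivars[node.name] # one hash lookup per instance variable read
  when PlusNode
    evaluate(node.left, ivars) + evaluate(node.right, ivars)
  else
    raise ArgumentError, "unknown node: #{node.inspect}"
  end
end

ivars = { :@foo => 1, :@bar => 2 }
tree  = PlusNode.new(IvarNode.new(:@foo), IvarNode.new(:@bar))
evaluate(tree, ivars) # => 3
```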
00:07:02
The virtual machine compiles all of your code into bytecode and executes that bytecode. I won’t get into the compilation process, as we don’t have much time.
00:07:18
But let’s walk through how the virtual machine executes this method.
00:07:27
The compiler converts the foo method into bytecode, and the virtual machine walks through those instructions one at a time, executing them while manipulating a stack.
00:07:41
The first thing we do is get the ivar for foo, pushing the value 1 onto the stack, and then we get the ivar for bar, pushing the value 2 onto the stack. When we execute plus, it pops those two values off the stack and pushes the result.
00:08:05
If we imagine how we might implement the get-ivar instruction, it could be simple. We’ll say we take the name, which comes from the instruction, and first look up self.
00:08:18
Self will be stored in the current frame. We want to get the hash table of instance variables, and all we need to do is look up the value by name in that hash table, and push the value onto the stack.
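Here is a minimal sketch of that kind of stack machine, assuming a made-up instruction format. Frame, BYTECODE, run, and the opcode names :getivar and :plus are all hypothetical; CRuby's real instruction for this is called getinstancevariable and its frames hold much more than this.

```ruby
# Hypothetical frame: holds self (reduced here to its ivar hash) and a stack.
Frame = Struct.new(:self_ivars, :stack)

# Hypothetical bytecode for the method body `@foo + @bar`.
BYTECODE = [
  [:getivar, :@foo],
  [:getivar, :@bar],
  [:plus]
].freeze

# Execute the instructions one at a time, manipulating the frame's stack.
def run(bytecode, frame)
  bytecode.each do |opcode, operand|
    case opcode
    when :getivar
      # The name comes from the instruction; self comes from the frame.
      frame.stack.push(frame.self_ivars[operand])
    when :plus
      right = frame.stack.pop
      left  = frame.stack.pop
      frame.stack.push(left + right)
    else
      raise ArgumentError, "unknown opcode: #{opcode.inspect}"
    end
  end
  frame.stack.pop # the value left on the stack is the return value
end

frame = Frame.new({ :@foo => 1, :@bar => 2 }, [])
run(BYTECODE, frame) # => 3
```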
00:08:31
It’s easy to see how we could transition from a tree-walking interpreter to a virtual machine implementation. However, the problem with this implementation is that hashes are relatively slow compared to arrays.
00:08:48
I don’t want to say hashes are slow, but they aren't as fast as arrays. Hashes also use more memory than arrays, because a hash has to store the keys and its own bookkeeping alongside the values, while an array only stores the values.
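One rough way to see the memory difference on CRuby is ObjectSpace.memsize_of from the objspace standard library. This is only a sketch: the numbers it reports are an implementation detail that varies between Ruby versions, and the two variable names are made up for the example.

```ruby
require "objspace"

# The same two values, stored once as a hash and once as an array.
ivars_as_hash  = { :@foo => 1, :@bar => 2 }
ivars_as_array = [1, 2]

hash_bytes  = ObjectSpace.memsize_of(ivars_as_hash)
array_bytes = ObjectSpace.memsize_of(ivars_as_array)

# Exact sizes depend on the Ruby version and platform.
puts "hash:  #{hash_bytes} bytes"
puts "array: #{array_bytes} bytes"
```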
00:09:03
So could we use an array instead of a hash? Yes, we could do that. Imagine a simple class again with a couple of instance variables.
00:09:24
When we allocate a new instance of Hello, that instance must point to a class. We can ask the class, 'Do you have an index for this instance variable?'
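To make the idea concrete, here is one way it could be sketched: the class side keeps a table from instance variable names to array indexes, and each instance keeps only an array of values. IndexTable, ArrayBackedObject, index_for, and index_for! are hypothetical names for this example, and assigning a new index on first write is an assumption about how the table gets filled in.

```ruby
# Hypothetical class-level table: instance variable name => array index.
class IndexTable
  def initialize
    @indexes = {}
  end

  # "Do you have an index for this instance variable?" Returns nil if not.
  def index_for(name)
    @indexes[name]
  end

  # Returns the existing index, or assigns the next free one.
  def index_for!(name)
    @indexes[name] ||= @indexes.size
  end
end

# Hypothetical instance: points at its class's table, stores values in an array.
class ArrayBackedObject
  def initialize(table)
    @table = table # stands in for the instance's pointer to its class
    @slots = []
  end

  def ivar_set(name, value)
    @slots[@table.index_for!(name)] = value
  end

  def ivar_get(name)
    index = @table.index_for(name)
    index && @slots[index] # unknown or unset names read as nil
  end
end

hello_class = IndexTable.new
a = ArrayBackedObject.new(hello_class)
b = ArrayBackedObject.new(hello_class)
a.ivar_set(:@foo, 1)
a.ivar_set(:@bar, 2)
b.ivar_set(:@bar, 20)
a.ivar_get(:@bar) # => 2
b.ivar_get(:@bar) # => 20
b.ivar_get(:@foo) # => nil
```

The names are stored once, in the shared table, so each instance pays only for an array of values.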