MRI Magic Tricks

by Charlie Somerville

In the presentation titled "MRI Magic Tricks" by Charlie Somerville, recorded at RubyConf AU 2014, the speaker explores the intricacies of Ruby's canonical implementation, MRI (Matz's Ruby Interpreter). The talk is designed to entertain and educate attendees on some of the lesser-known internal mechanisms of Ruby while exploring specific bugs inherent to MRI that can be exploited to perform unexpected actions in the language.

Key Points Discussed:
- Introduction to the Speaker: Charlie Somerville is a member of the Ruby core team and works at GitHub, contributing to performance patches and bug fixes in Ruby.
- Focus on MRI Bugs: The presentation outlines three specific bugs within MRI, emphasizing that the techniques demonstrated are not applicable to alternative Ruby implementations like JRuby or Rubinius.
- Frozen Core: The first trick involves the ‘frozen core,’ a hidden class in MRI that houses numerous internal methods critical for Ruby's operation. Charlie discusses how to access and manipulate this class, showcasing how it can be used to redefine behaviors such as global variable aliasing and method definition logging.
- Altering Superclass Chains: Charlie explains how to handle superclass mismatches encountered in Rails applications through class duplication. By modifying the handling of class ancestry, developers can achieve a smoother class reloading experience.
- Catching Segmentation Faults: The final trick covers how to manage segmentation faults caused by buggy C extensions. By capturing these faults and triggering controlled exceptions, developers can maintain a stable Ruby execution context.

Throughout the presentation, several technical methods are exemplified, demonstrating how to push the limits of Ruby functionality. For instance, Charlie explains how to redefine core methods within the frozen core class, manipulate hash behavior, and even handle Ruby’s error management system.

Conclusion and Takeaways:
- The focus on internal operations sheds light on Ruby's design, offering insight into performance optimization and debugging techniques.
- While the content is intended for entertainment and educational purposes, Charlie urges viewers not to apply these techniques in production environments due to their potentially unstable nature.
- Attendees are encouraged to appreciate the depth of Ruby's architecture and its various quirks, which can be both a source of wonder and caution.

00:00:11.639 All right, welcome. We're going to be talking about MRI magic tricks. My name is Charlie Somerville, and you can find me on Twitter as @charliesome. You should follow me if you enjoy discussions about Ruby. I’d love to see a big number next to my name on Twitter, which would make me happy.

00:00:20.720 As Josh mentioned, I am part of the Ruby core team and contribute to Ruby with a few performance patches here and there, along with some bug fixes. I also work at GitHub on the systems team, where I get to spend a lot of time diving deep into the Ruby VM, exploring really weird behaviors, and figuring out how we can make Ruby faster for large production Rails applications.

00:00:32.920 In this talk, we're going to cover three different MRI bugs and discuss how we can exploit these bugs in MRI to really push the boundaries of the Ruby language. This will allow us to do things you never thought were possible in Ruby. I should mention that these bugs are specific to MRI and will not affect other Ruby implementations. So if you're using JRuby or Rubinius, you have nothing to worry about. Even if you're running a recent version of MRI, you should be fine.

00:01:00.600 Before we dive in, I would like to give a disclaimer: this talk does not contain any useful information that you can apply to your work. If you are here to become a better developer, then you’re in the wrong place. I advise you not to attempt any of the techniques I will discuss at home, and certainly do not try any of these in a production environment. With that out of the way, let’s get started.

00:01:42.840 The first trick I would like to discuss is something called the frozen core. There are several hidden, secret methods in MRI that are called internally by the virtual machine, all grouped under a hidden class called RubyVM::FrozenCore. This class is not exposed as a constant in Ruby, so attempting to access it by name will result in a NameError. However, it does exist, and we can explore its internals.
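The NameError described here can be checked directly. This is a minimal sketch that only demonstrates the lookup failing; it does not reach the hidden class itself.

```ruby
# Referencing the hidden class by name fails, because no constant
# called FrozenCore is registered under RubyVM.
error = begin
  RubyVM::FrozenCore
  nil
rescue NameError => e
  e
end

puts error.class
puts error.message
```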

00:02:06.680 In the C code behind MRI, we discover that there are multiple methods hidden from user land, yet available inside the VM. For example, if you define a method in Ruby, such as 'def foo; end', the instruction sequence that the Ruby VM interprets shows that it internally calls a method for defining methods, known as 'core#define_method'. When you execute Ruby code like 'def my_method; 1 + 2; end', this actually translates to a method call against the frozen core object.
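To look at the instruction sequence behind a method definition, MRI can disassemble compiled code. The sketch below only prints the listing and makes no claim about its exact contents: on the MRI versions from the time of the talk it would be expected to show a send of 'core#define_method', while newer MRI releases may use a dedicated instruction instead.

```ruby
# Compile a tiny method definition and print the resulting bytecode.
# RubyVM::InstructionSequence is MRI-specific.
iseq = RubyVM::InstructionSequence.compile("def foo; end")
listing = iseq.disasm

puts listing
```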

00:02:35.640 When defining a method, the Ruby VM passes the class context, the method name, and an instruction sequence. The frozen core class contains a host of methods crucial for how Ruby operates. For instance, the method alias syntax invokes an internal method called 'core#set_method_alias'. Other methods within this hidden class include those responsible for aliasing global variables, defining methods, and creating literals. Even though it's not directly accessible in Ruby, the frozen core class plays a significant role in how Ruby interprets and executes code.

00:03:36.080 To explore this frozen core and see what we can do with it, we’ll need to find it using some dedicated heuristics. The C code responsible for creating this hidden class comes with comments stating that the frozen core is hidden and inherits from BasicObject. The flags set on the class make it invisible when we look at all existing objects in the object space.

00:04:03.439 Objects allocated around the same time often sit at memory addresses close to one another, allowing us to hunt for the frozen core object by starting from the object ID of the RubyVM class. This ID gives us a range of nearby IDs to probe, and we can attempt to dereference each one to see if it leads us to the frozen core. If passing in an invalid object ID raises an exception, a rescue clause lets us keep exploring.
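The probing loop described here might look roughly like the sketch below. It assumes the lookup is done with ObjectSpace._id2ref, which the transcript does not name, and the window size of 200 is arbitrary. On Ruby 2.7 and later, object IDs are no longer derived from memory addresses, so the loop still runs but is not expected to land on neighbouring allocations or on the hidden class.

```ruby
# Probe a window of object IDs around RubyVM's own ID and keep any
# classes or modules that the IDs resolve to.
base = RubyVM.object_id
modules_nearby = []

((base - 200)..(base + 200)).each do |candidate|
  begin
    object = ObjectSpace._id2ref(candidate)
    modules_nearby << object if Module === object
  rescue RangeError
    next # not a valid object ID, keep scanning
  end
end

puts "found #{modules_nearby.size} classes or modules in the window"
```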

00:04:48.200 While exploring, we can discover plenty of objects, including various internal classes. We're specifically interested in identifying an unnamed metaclass. This class never receives a name or assignment to a constant, which is indicative of the frozen core class we are hunting for. If we check our findings against known method definitions, we can indeed verify that this class has a method named 'core#define_method', confirming we’ve located the frozen core class.
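The "no name" heuristic can be illustrated with ordinary classes. In this sketch the anonymous class is a stand-in created for the example; the real frozen core is hidden from ObjectSpace and would not appear in this listing.

```ruby
# A class created without being assigned to a constant has no name.
anonymous = Class.new(BasicObject)

# Collect every class the object space exposes whose name is nil.
unnamed = ObjectSpace.each_object(Class).select { |klass| klass.name.nil? }

puts anonymous.name.inspect
puts unnamed.include?(anonymous)
```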

00:05:43.760 We might want to mess around with this class and start calling some of its methods. However, creating a new instance directly is not possible as it is designed to act as a singleton. Instead, we need to locate and redefine the 'lambda' method in the frozen core. By redefining it to return self, we can invoke the lambda syntax and grab an instance of the frozen core class.

00:07:06.540 With that instance in hand, we can now access its methods. To showcase its capabilities, consider aliasing a global variable to a different name, which Ruby normally does with the 'alias' syntax. Using the frozen core, we can instead send a method called 'core#set_variable_alias' directly to redefine what a global variable points to. This allows for manipulation of global variables in ways previously thought impossible within Ruby.
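For reference, this is what global variable aliasing looks like with the regular syntax, which is the operation the internal method performs on the MRI versions discussed. The variable names are made up for the example.

```ruby
$original_name = "first"

# `alias` makes both global names refer to the same variable.
alias $aliased_name $original_name

# Writing through the alias is visible through the original name.
$aliased_name = "second"
puts $original_name
```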

00:08:27.239 Additionally, we can exploit the hash syntax by overriding it to reverse the order of its members when using a hash literal. Since hash literals are method calls internally, we can completely redefine their behavior. For instance, when we define a hash, we can manipulate it so that its keys get output in reverse order. We can also hook into how methods are defined, logging method definitions whenever they’re created, giving us insights into how Ruby resolves methods.
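Logging method definitions does not require the frozen core on current Ruby. The sketch below uses the public Module#method_added hook instead, which is a different technique from the one in the talk; DefinitionLogger and Greeter are hypothetical names.

```ruby
# Records the name of every instance method defined on classes that extend it.
module DefinitionLogger
  def self.log
    @log ||= []
  end

  # Ruby calls this hook each time an instance method is defined.
  def method_added(name)
    DefinitionLogger.log << [self, name]
    super
  end
end

class Greeter
  extend DefinitionLogger

  def hello; "hello"; end
  def goodbye; "goodbye"; end
end

DefinitionLogger.log.each { |owner, name| puts "#{owner}##{name} defined" }
```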

00:09:41.720 Next, we can take a recent feature from Ruby 2.1, where a method definition returns the name of the defined method as a symbol. This feature provides more flexibility and power, similar to decorators in other languages. By overriding the 'core#define_method' method, we can bring this functionality to older versions of Ruby.
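On Ruby 2.1 and later the return value of a definition can be observed directly, and it is what makes decorator-style calls such as 'private def' work. Report and its methods are hypothetical names for this sketch.

```ruby
class Report
  # `def` evaluates to the method name as a Symbol (Ruby 2.1+).
  RESULT = def title; "Quarterly report"; end

  # Because of that, a visibility modifier can wrap a definition directly.
  private def secret_notes; "draft"; end
end

puts Report::RESULT.inspect
puts Report.private_instance_methods(false).include?(:secret_notes)
```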

00:11:11.360 However, a word of caution: methods within the frozen core are primarily intended for internal use by MRI, which means they might not handle incorrect calls gracefully and could lead to unexpected crashes. For instance, methods like 'core#hash_from_ary' perform checks that, if violated, can cause assertion failures that crash the Ruby interpreter.

00:12:44.720 Similarly, with 'core#set_method_alias', if we pass incorrect parameters such as non-class objects, we could trigger runtime errors or crash Ruby altogether. Now that we have explored tricks around the frozen core, let's shift gears and discuss altering superclass chains.

00:14:25.360 How many times have you been working on a Rails app and encountered a bug due to the Rails reloader? Occasionally, it mistakenly thinks a class has already been defined when you try to set a different superclass. This can lead to superclass mismatch errors. The trick here is understanding how class duplication works in Ruby. When duplicating a class with the 'dup' method, it retains its ancestry without breaking any internal dependencies.
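Both behaviours mentioned here are easy to reproduce in plain Ruby. The class names below are made up for the example.

```ruby
class Animal; end
class Vehicle; end
class Dog < Animal; end

# Reopening Dog with a different superclass is rejected with a TypeError.
mismatch = begin
  class Dog < Vehicle; end
  nil
rescue TypeError => e
  e.message
end
puts mismatch

# A duplicate is a distinct class object that keeps the same superclass.
dog_copy = Dog.dup
puts dog_copy.equal?(Dog)
puts dog_copy.superclass
```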

00:15:43.360 By diving into the C implementation of the 'dup' method, we can see it makes a new allocation of the class instance and calls its initializer to copy over its state. Should you want to change a class’ superclass, the process involves creating a new class inheriting from the desired superclass, and passing that in. Attempting to reinitialize existing classes will result in exceptions; however, by overriding internal handling, you can manipulate ancestry to suit your needs when altering existing classes.
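The supported half of this, and the exception raised when reinitializing an existing class, can be sketched as follows. The trick of overriding the internal handling is not shown; Shape, Circle and Drawable are hypothetical names.

```ruby
class Shape; end
class Circle < Shape; end
class Drawable; end

# Building a fresh class is the supported way to choose a superclass.
replacement = Class.new(Drawable)
puts replacement.superclass

# Re-running the initializer on an existing class is refused.
error = begin
  Circle.send(:initialize, Drawable)
  nil
rescue TypeError => e
  e
end
puts error.message
```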

00:18:39.760 Now, let’s discuss a technique for catching segmentation faults in Ruby. If Ruby encounters a segmentation fault due to buggy C extensions, it crashes and displays a backtrace. This output includes the loaded features and their paths, a list of all dependencies loaded into the Ruby process. If a malicious actor could manipulate the contents of this array, it could lead to further vulnerabilities.
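The loaded features list is an ordinary array that Ruby code can read, exposed as the global $LOADED_FEATURES. A small sketch that inspects it after requiring a standard library; the choice of "json" is arbitrary.

```ruby
require "json"

# $LOADED_FEATURES (also available as $") lists the files loaded by `require`.
features = $LOADED_FEATURES

puts features.class
puts features.grep(/json/).first(3)
```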

00:22:00.920 Interestingly, we can intervene in the error handling mechanism to capture the segmentation fault and instead trigger an exception within Ruby if we carefully manipulate the loaded features array. If we manage to inject a controlled object into this array, then while Ruby is dumping its debugging information, we can execute Ruby code to provide messages before the interpreter exits.

00:23:00.460 In essence, we can perform tricks to raise exceptions during segmentation faults to encapsulate them with rescue blocks, thus maintaining a stable execution context despite the underlying faults. Today, we covered three fascinating tricks involving the frozen core, manipulating superclass definitions, and handling segmentation faults in Ruby. Thank you for watching; I hope you learned something new, even if it was something you’d never try at home.

RubyConf AU 2014