Digesting MRI by Studying Alternative Ruby Implementations

by Christian Bruckmayer

In this talk titled "Digesting MRI by Studying Alternative Ruby Implementations," speaker Christian Bruckmayer explores how developers can improve their Ruby programming skills by studying the internals of Ruby's various implementations, particularly focusing on MRI, JRuby, Rubinius, and Opal. Bruckmayer reflects on his journey from being a new developer to a senior position, emphasizing continuous learning as an essential part of professional growth in software development.

Key Points Discussed:
- Role of a Senior Developer: Bruckmayer reflects on his early experiences and the evolution of his role within his first job, highlighting the importance of mentorship, leadership, and continuous improvement in programming skills.
- Understanding Ruby Internals: He shares his motivation to delve into Ruby's implementation, recognizing Ruby as a computer program that can be studied just like any other application.
- Involvement with MRI: Bruckmayer demonstrates his initial exploration with MRI, addressing an issue related to the min and max methods in Ruby and the surprising performance results he discovered through benchmarking.
- Alternative Implementations: The presentation covers his work with other Ruby implementations, such as Rubinius and Opal, showing how he tackled features from newer Ruby versions and learned about type conversions in the process.
- Focus on JRuby: Bruckmayer emphasizes a significant project involving JRuby's hash table optimization. His hands-on approach to understanding and implementing open addressing in hash tables showcases his methods for navigating complex codebases and seeking community input during development.
- Performance and Contribution: He discusses the challenges and learning experiences that arose from his attempts to merge performance improvements, highlighting both successes and setbacks.

Main Takeaways:
- Continuous learning and experimenting with code, especially through prototypes and community feedback, is critical for becoming a proficient developer.
- Contributing to different Ruby implementations can provide significant insights and enhance one's understanding of Ruby as a language.
- The importance of collaboration and community support in open-source development is reinforced throughout his experiences.

In conclusion, Bruckmayer's talk is a call to action for developers to explore Ruby's internals, contribute to its development, and recognize that mastery is a continuous journey, implying that no one fully masters Ruby but strives for improvement every day. His experiences exemplify how engaging with open-source projects can enrich one's programming capabilities and foster a supportive learning environment.

00:00:12.440 Hello, RubyConf! I hope you all had a great lunch. Welcome to my talk. I would like to start with a little story about my first job. It was actually my last job as well; it was the first job I had after university. I was one of the first engineers on the team, which consisted of only my boss and another engineer at the beginning. Over time, we started to grow this team and hired more developers. In the end, we had around 12 developers. During this time, I helped onboard these new engineers and did a lot of pair programming. I took on more responsibilities and eventually got promoted to senior developer.
00:00:30.119 It was still my first job, and I found myself at a point in my career where I was reflecting on what it means to be a senior developer. One thing that came to mind was experience. Most likely, many of you would agree that someone with 10 or 15 years of experience could be considered a senior developer. Furthermore, a senior developer is someone who provides leadership, helping to bring features from inception to production, delegating tasks among different developers. They are also good mentors who help junior developers level up and become better programmers.
00:01:10.200 I also thought that a senior developer should be a proficient programmer. Since I was a Ruby programmer, I kept pondering how I could become a better Ruby developer. It occurred to me that Ruby is just a computer program, and just as you would study any other application to understand it, you could study Ruby to enhance your skills as a Ruby programmer. This brings me to the topic of my talk today: digesting MRI by studying alternative Ruby implementations. My name is Christian Bruckmayer, and most of what I discuss today can be found on my website, bruckmayer.net. I have written a few blog articles, mostly about JRuby and the performance work I have done there. If you're interested in learning more and seeing additional benchmarks, please visit my blog.
00:02:04.080 Currently, I live in Bristol, United Kingdom, which is around two hours away from London. It's not London, and there are many other cities in England besides London. I currently work at Cookpad, and their global headquarters is also in Bristol. If you're into home cooking, check out Cookpad! Today, I have brought three different examples for you. One example is about MRI, my second example is about Rubinius and Opal, and the last example is about JRuby. The first two examples are quite small, while I will focus primarily on JRuby.
00:03:20.200 Let's get started with my first example, which is about MRI. Most people, when they talk about Ruby, are actually referring to MRI, which is the reference implementation of Ruby. It was started by Yukihiro Matsumoto in 1993 and is implemented in C. I wanted to become a better Ruby programmer, and since Ruby is a computer program, I decided to go to GitHub, read the Ruby source code, check the Ruby issue tracker, and perhaps fix a bug or implement a feature. I hoped that by doing so, I would improve my skills as a Ruby programmer.
00:04:12.790 My first step was to browse around the Ruby issue tracker for features or bugs I could tackle. I found an issue related to the `min` and `max` methods. It stated that using `array.minmax` is much slower than calling both `min` and `max`. What does this mean? If you have an array with the numbers one to nine, calling `array.minmax` will return the lowest and the highest number, while calling both `array.min` and `array.max` will yield the same result. I decided to run a benchmark comparing the two approaches.
00:04:52.990 To my surprise, I discovered that `array.min` plus `array.max` was actually 1.8 times faster than using the `minmax` method. The reason for this is that `minmax` is defined in the Enumerable module, which doesn't utilize the fast internal methods of the array. I figured this might be an excellent first issue to work on, so I started working in the Ruby codebase, created a pull request, and was excited about it. However, my pull request was closed quickly because a patch had already been attached to the issue in the issue tracker, and I had not seen it.
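A minimal sketch of how such a benchmark could look, using only Ruby's standard `Benchmark` module. The array contents and the iteration count are assumptions chosen for illustration, not the setup from the talk. Later Ruby releases may have optimized `minmax` for arrays, so the 1.8x ratio mentioned above is not guaranteed to reproduce.

```ruby
require "benchmark"

# Hypothetical input: the numbers one to nine in random order.
array = (1..9).to_a.shuffle

# Both expressions produce the same [min, max] pair.
via_minmax  = array.minmax
via_min_max = [array.min, array.max]

# Arbitrary iteration count, large enough to get a measurable time.
iterations = 200_000

minmax_seconds = Benchmark.realtime do
  iterations.times { array.minmax }
end

min_max_seconds = Benchmark.realtime do
  iterations.times { [array.min, array.max] }
end

puts format("minmax:    %.4fs", minmax_seconds)
puts format("min + max: %.4fs", min_max_seconds)
```

The absolute numbers depend on the Ruby version and machine; only the relative difference between the two lines is of interest.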
00:06:73.760 I felt disappointed because I had put in significant effort, hoping to contribute to Ruby. But then I realized that my primary goal was to understand Ruby better, not necessarily to get a patch submitted. While I didn't achieve my original goal, I still learned invaluable lessons. I learned how to compile Ruby, got some practice programming in C (which I hadn't done since university), and it also inspired me to explore other Ruby implementations.
00:07:00.060 This brings us to my second example, which is about Rubinius and Opal. Rubinius is a Ruby implementation written in Ruby itself, except for its virtual machine, which is implemented in C and C++. On the other hand, Opal is a Ruby to JavaScript transpiler that allows you to run your Ruby code in the browser. My idea was that, since MRI is the reference implementation of Ruby, I would check the Ruby release notes to see what’s new in Ruby, then examine the various implementations to see if they had implemented those features. If not, I would do it.
00:07:15.160 For example, in Ruby 2.5, a new method called `delete_prefix` was introduced for strings. This method allows you to remove the beginning of the string if it matches a specified prefix. In Rubinius, implementing this method required simple type conversion. If the string matches the prefix, it simply removes it; otherwise, it returns a copy of the string. The implementation in Opal looks very similar, but I discovered an interesting aspect, particularly when I started trying to use the method with symbols. I received a type error when trying to pass a symbol.
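The behavior described here can be sketched in plain Ruby. This is an illustrative approximation, not the actual Rubinius or Opal source: the method name `delete_prefix_sketch` is hypothetical, and the type check stands in for whatever coercion helper each implementation uses internally.

```ruby
# Hypothetical helper that mimics the described behavior of String#delete_prefix.
def delete_prefix_sketch(string, prefix)
  # Type conversion: only objects that implement to_str are accepted.
  unless prefix.respond_to?(:to_str)
    raise TypeError, "no implicit conversion of #{prefix.class} into String"
  end
  prefix = prefix.to_str

  if string.start_with?(prefix)
    # Match: return everything after the prefix.
    string[prefix.length..-1]
  else
    # No match: return a copy, not the receiver itself.
    string.dup
  end
end

puts delete_prefix_sketch("hello", "hel")
puts delete_prefix_sketch("hello", "xyz")
```

Passing a symbol such as `:hel` to this sketch raises a `TypeError`, because symbols do not implement `to_str`.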
00:09:05.590 I was caught off guard because I expected symbols to work seamlessly. Upon reviewing the implementation, I realized that it converts the prefix with `to_str`, and symbols do not implement the `to_str` method. They only implement `to_s`. This discovery led me to learn about explicit versus implicit conversions. An explicit conversion, such as `to_s`, is one we call ourselves to say that an object should be turned into a string. An implicit conversion, such as `to_str`, is one that Ruby calls for us in the background, and only objects that really behave like strings are expected to implement it.
00:09:30.840 For example, if we define a class `Prefix` that implements the `to_str` method, it can work as a prefix. Interestingly, if we look at Ruby on Rails, that’s exactly what they do with their Path class. Initially, the Path class was inherited from String, but Aaron Patterson refactored it to not inherit from String anymore, instead implementing a `to_str` method. This allows for easy concatenation with strings, which would not be possible with just the `to_s` method.
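A short sketch of the implicit conversion described above, assuming Ruby 2.5 or later so that `String#delete_prefix` is available. The `Prefix` class body is an illustrative guess at what such a class might look like; it is not taken from the talk's slides or from Rails.

```ruby
# A string-like object: implementing to_str opts in to implicit conversion.
class Prefix
  def initialize(value)
    @value = value
  end

  def to_str
    @value
  end
end

prefix = Prefix.new("hel")

# Ruby calls to_str for us, so the object works wherever a String is expected.
stripped = "hello".delete_prefix(prefix)
joined   = "say: " + prefix

# A symbol only offers the explicit to_s, so it is rejected.
symbol_error =
  begin
    "hello".delete_prefix(:hel)
    nil
  rescue TypeError => e
    e
  end

puts stripped
puts joined
puts symbol_error.class
```

Calling `:hel.to_s` first makes the conversion explicit and the call succeeds.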
00:10:57.760 Moreover, in the Ruby implementation, if the prefix does not match, the method returns a copy of the string. This behavior surprised me a little. One evening after dinner, I spoke with Ryan, who presented a talk about Artichoke, which is a Ruby implementation in Rust. I mentioned that I had found an issue in Artichoke because it did not return a copy of the string if there was no match; it returned the same string. Ryan found this interesting and asked why I hadn't submitted a pull request to fix it, which encouraged me to do so.
00:12:17.170 Just the other day, I did submit that pull request, and Ryan has already merged it. I looked deeper into it to understand why Ruby returns a copy of the string here, and found that most string methods return a copy even when they change nothing, as a means to maintain consistency. For instance, if I have a string 'hello' and I attempt to delete a character that doesn't exist within that string, it will simply return a copy of the original. This experience made me realize that while the contributions were relatively small, just three or four lines of code across a few projects, the learning was significant.
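The copy semantics can be checked directly with `equal?`, which compares object identity rather than content. This is a small sketch against MRI's behavior, assuming Ruby 2.5 or later for `delete_prefix`.

```ruby
original = "hello"

# Neither call changes anything, because "x" and "xyz" do not occur in "hello".
after_delete        = original.delete("x")
after_delete_prefix = original.delete_prefix("xyz")

# Same content ...
puts after_delete == original
puts after_delete_prefix == original

# ... but different objects: each result is a copy, not the receiver.
puts after_delete.equal?(original)
puts after_delete_prefix.equal?(original)
```

Because the result is always a fresh object, callers can mutate it without affecting the original string.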
00:13:56.170 This brings me to my last example, which is about JRuby. JRuby is the Java implementation of Ruby. It runs on the JVM and supports true parallel threads, which can make it quite fast. One of the core goals of JRuby, as stated in its documentation, is to be a complete, correct, and fast implementation of Ruby. Since one of their main objectives is performance, I sought out performance issues to tackle. I again browsed the Ruby issue tracker and the JRuby issue tracker and ultimately found an issue regarding hash tables with open addressing, a feature introduced in Ruby 2.4 that improved hash table performance by about 40%.
00:14:49.240 I thought this would be fun to work on, especially since I saw a tweet from Aaron a couple of days ago suggesting that implementing hash would be straightforward. However, upon reviewing the patch for MRI, I realized it was nearly 1,500 lines of code. Having very little experience as a C programmer, I felt overwhelmed. So, I targeted Rubinius in my approach. I planned to analyze its Ruby code to understand how they implemented it and see if I could adapt the concept to JRuby.
00:15:57.170 Unfortunately, they hadn't implemented it either. So, I decided to leverage my knowledge of compiling Ruby and sprinkle in `puts` statements, then execute various pieces of code to better understand how everything works. I also began to read the patch to learn the implemented algorithm, which was creatively illustrated using ASCII art. I attempted to transfer that knowledge into Ruby so I could later adapt it to Java, and I researched the specifics of how open addressing operates.
00:17:03.150 To recap, with a hash table, we create a hash, store a value under a key, and later retrieve the value for that key. The original approach is called separate chaining, meaning each bucket in the hash is an array that can hold multiple entries. To insert a key-value pair, you compute the bucket index from the key's hash and search that bucket. If the key exists, you update its value; if not, you append an entry to the bucket. However, to improve performance, especially cache locality, the open addressing approach eliminates the nested arrays.
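The separate chaining scheme described above can be sketched in a few lines of Ruby. The class name `ChainedHash`, the fixed capacity of 8, and the lack of resizing are simplifying assumptions for illustration; this is not the MRI or JRuby implementation.

```ruby
# Simplified separate chaining: every bucket is an array of [key, value] entries.
class ChainedHash
  def initialize(capacity = 8)
    @buckets = Array.new(capacity) { [] }
  end

  def []=(key, value)
    bucket = bucket_for(key)
    entry = bucket.find { |k, _| k.eql?(key) }
    if entry
      entry[1] = value       # key exists: update its value
    else
      bucket << [key, value] # key is new: append an entry to the bucket
    end
  end

  def [](key)
    entry = bucket_for(key).find { |k, _| k.eql?(key) }
    entry && entry[1]
  end

  private

  # The bucket index is derived from the key's hash code.
  def bucket_for(key)
    @buckets[key.hash % @buckets.size]
  end
end

table = ChainedHash.new
table[:a] = 1
table[:a] = 2
table[:b] = 3
puts table[:a]
puts table[:b]
```

Each bucket is its own small array somewhere on the heap, which is the nesting that hurts cache locality.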
00:19:16.870 With open addressing, we pass through a linear array and check for entries. We calculate the index the same way, and instead of appending, we simply insert entries directly. If we don't find the key in the expected bucket, we compute the next index based on the probing strategy. As my implementation began taking shape in JRuby, I found having to refactor the code for Java to be a challenge, particularly because of static typing. At first, I was frustrated with this aspect, but over time, I started to appreciate how it guided me during extensive refactoring stages.
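For comparison, here is a minimal open addressing sketch in Ruby. It assumes linear probing (try the next slot on a collision) and a doubling resize at 50% load, both chosen for simplicity; the probing strategy and memory layout in MRI and JRuby differ. The class name `OpenAddressingHash` is hypothetical, and deletion is left out.

```ruby
# Simplified open addressing: entries live directly in flat arrays.
class OpenAddressingHash
  def initialize(capacity = 8)
    @keys   = Array.new(capacity)
    @values = Array.new(capacity)
    @used   = Array.new(capacity, false)
    @size   = 0
  end

  def []=(key, value)
    # Keep the table at most half full so probing always finds a free slot.
    grow if (@size + 1) * 2 > @keys.size
    index = find_slot(key)
    unless @used[index]
      @used[index] = true
      @keys[index] = key
      @size += 1
    end
    @values[index] = value
  end

  def [](key)
    index = find_slot(key)
    @used[index] ? @values[index] : nil
  end

  private

  # Start at the hashed index; on a collision, probe the next slot.
  def find_slot(key)
    index = key.hash % @keys.size
    while @used[index] && !@keys[index].eql?(key)
      index = (index + 1) % @keys.size
    end
    index
  end

  # Double the capacity and reinsert every existing entry.
  def grow
    old_keys, old_values, old_used = @keys, @values, @used
    capacity = old_keys.size * 2
    @keys   = Array.new(capacity)
    @values = Array.new(capacity)
    @used   = Array.new(capacity, false)
    @size   = 0
    old_keys.each_index do |i|
      self[old_keys[i]] = old_values[i] if old_used[i]
    end
  end
end

table = OpenAddressingHash.new
table[:a] = 1
table[:a] = 2
table[:b] = 3
puts table[:a]
puts table[:b]
```

Keeping keys and values in parallel flat arrays, rather than one object per entry, is a layout choice of this sketch.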
00:20:55.724 The open source community has a way of continuing to refine ideas. For my work, I created a draft pull request and involved the community, which provided input and support during my development process. This collaborative effort became instrumental in completing the project. I submitted the first implementation of the new approach; however, it turned out not to be as fast as anticipated. We discussed this in the pull request, brainstorming ways to enhance the performance. A contributor suggested we eliminate the entry objects, and by doing so, we removed a significant number of allocations.
00:23:05.170 This adjustment greatly enhanced the performance. A few weeks later, I had the opportunity to present my approach at the Ruby conference, where Akira Matsuda discussed performance optimization, describing it as a game in which you aim for high scores. Such optimizations are not always feasible in regular day-to-day development, but they were entirely appropriate in this context, as we pursued the goal of making JRuby as fast as possible. After two months of hard work in my spare time, the pull request was merged.
00:24:15.960 After it was merged, we had to revert it temporarily, because we discovered that the new implementation did not behave as well under concurrency. This comes from an inherent difference with MRI, which uses a global interpreter lock. We are currently addressing those robustness issues and working towards merging it back into the main branch.
00:25:21.600 To wrap things up, working on prototypes is a valuable skill I've carried over into my daily job. If I encounter a feature that's too complex or uncertain, I quickly prototype outside of the project, seeking input from the community early on, asking for help or sharing ideas. Today, I presented three different examples and while I started by asking what defines a senior developer, my goal was always to enhance my proficiency in Ruby programming. Nobody truly masters Ruby, except perhaps Matz himself. The journey of programming is about continual learning, with each day bringing new insights, especially in open source where one can explore, contribute, and seek assistance from a supportive community.