00:00:02.040
Thank you.
00:00:09.960
Hi everybody, my name is Maxime, and today I'm going to be telling you about optimizing YJIT's performance from inception to production.
00:00:24.539
I mean, wait until after the talk to clap. We all love Ruby, but optimizing Ruby's performance has proven to be a difficult and daunting problem.
00:00:35.820
There have been many different projects trying to build just-in-time compilers for Ruby to improve its performance, but most of these projects today are either abandoned or didn't reach the outcomes they hoped for at the beginning in terms of adoption.
00:01:01.140
Recently, YJIT has reached an important milestone at Shopify; we've deemed it to be 'production ready.' This is not just talk; we've deployed YJIT to all of the storefront render infrastructure at Shopify, serving all of the requests at Shopify stores.
00:01:12.780
Every time you visit a Shopify store, you're using code that was run by YJIT. We're seeing significant end-to-end speedups, and it's not just us; just yesterday, Discord announced that they are also using YJIT in production and getting similar speedups.
00:01:45.720
This talk is not just technical; it's about some of the decisions that led to YJIT being production-ready. I also want to share a bit about the origin story and the background behind YJIT.
00:01:58.560
There's a famous quote by Mary Kay Ash: 'Ideas are a dime a dozen; people who implement them are priceless.' What I take from this is that many good ideas exist, but there aren't enough people out there implementing these good ideas.
00:02:29.459
Unfortunately, people tend to shorten the quote to 'Ideas are a dime a dozen,' which can lead to a cynical interpretation that all ideas are equally worthwhile or worthless, merely a matter of implementation.
00:02:50.400
However, I believe that starting from a bad concept may never deliver the desired outcomes, no matter how many resources or years are thrown at it. In this talk, I want to discuss the origins and goals of the YJIT project, the benchmarks we've curated to evaluate its performance, our data-driven approach to optimization, and some of the engineering trade-offs involved in compiler design.
00:03:35.040
The YJIT project started over two years ago. It has been built primarily at Shopify, but it is fully open source. Notably, we received significant contributions from individuals at GitHub as well.
00:03:56.459
A key aspect of this project is that we take a data-driven approach to optimization. To achieve this, we have a large and diverse set of benchmarks; we benchmark often and gather detailed metrics.
00:04:08.220
The effort behind YJIT isn't just my own; it's a much larger team. This project would not be where it is today without the contributions of many amazing programmers.
00:04:19.919
One of the original goals of the YJIT project was to run any Ruby code, aiming for 100% compatibility with the code that we run in production at Shopify.
00:04:41.220
We decided early on that we couldn't require developers to change the codebase to fit YJIT, as our production codebase is vast.
00:04:58.500
Our primary focus is web workloads, mainly Ruby on Rails, and we aimed to achieve double-digit speedups on real-world software. Additionally, we wanted to ensure that YJIT never slows down Ruby code; at worst, we should see no speedup and no slowdown.
00:05:18.000
Our Ruby codebase is large, and YJIT has enabled efficient lazy generation of machine code.
00:05:34.740
We optimize our execution through runtime value promotion, type specialization, and polymorphic inline caches.
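To make the idea of a polymorphic inline cache concrete, here is a minimal sketch in plain Ruby. It is a toy model, not YJIT's implementation: the class name `InlineCache`, the `MAX_ENTRIES` limit, and the hit/miss counters are all assumptions made for illustration. Each call site keeps a small table from receiver class to the method it resolved, so repeated calls with the same class skip the full lookup.

```ruby
# Toy polymorphic inline cache: one cache object per call site,
# keyed by the receiver's class. Illustrative only.
class InlineCache
  MAX_ENTRIES = 4 # hypothetical limit on cached classes per call site

  attr_reader :hits, :misses

  def initialize(method_name)
    @method_name = method_name
    @entries = {} # receiver class => UnboundMethod
    @hits = 0
    @misses = 0
  end

  def call(receiver, *args)
    klass = receiver.class # note: ignores singleton classes, for simplicity
    target = @entries[klass]
    if target
      @hits += 1
    else
      @misses += 1
      # Slow path: perform the full method lookup, then remember the result.
      target = klass.instance_method(@method_name)
      @entries[klass] = target if @entries.size < MAX_ENTRIES
    end
    target.bind(receiver).call(*args)
  end
end

site = InlineCache.new(:to_s)
[1, "two", 3, :four, 5.0].each { |value| site.call(value) }
puts "hits=#{site.hits} misses=#{site.misses}"
```

In a real JIT the table would be a chain of class checks in machine code rather than a hash, but the trade-off is the same: a cheap check on the common path, and a fallback to the generic lookup when a new class shows up.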
00:05:50.400
There’s more; in the past year, a lot of work has gone into YJIT for Ruby 3.2. We've ported YJIT to Rust, which was originally unplanned.
00:06:05.940
We've also implemented a new backend for ARM64, which offers native support for Apple hardware and Raspberry Pi, achieving good performance.
00:06:11.880
In 2022, we deployed YJIT across our infrastructure, and it is no longer considered experimental.
00:06:31.380
I was thrilled with this progress.
00:06:37.860
Now, let’s go back a bit to the origins of YJIT from my perspective.
00:06:48.539
I began my undergraduate degree at McGill University in Montreal, Canada in 2004, and discovered a passion for compilers.
00:07:02.520
In 2007, I joined Professor Laurie Hendren’s team at McGill's Sable Lab to pursue a master's degree.
00:07:14.639
My advisor wanted me to develop a JIT compiler for Matlab, focusing on numerical optimizations.
00:07:27.780
While working on this, I realized there was significant potential in type optimizations for dynamically typed languages.
00:08:06.560
My master's work focused on generating specialized versions of Matlab functions based on argument types and type propagation, and it culminated in a published paper.
00:08:26.639
In 2009, I transitioned to the University of Montreal for a PhD, initially planning to pursue optimization for Python.
00:08:41.820
Ultimately, we pivoted to optimizing JavaScript with hybrid type analysis, blending inter-procedural type analysis and speculative optimization.
00:09:05.519
Traditional fixed-point type analyses tend to be expensive and inefficient for dynamically typed languages, so they often don't yield valid type assumptions.
00:09:30.660
I explored a method to speculate on type correctness and to avoid expensive type analysis by disallowing speculative parts of the code.
00:09:49.860
However, I got scooped, which is a common challenge in academic research.
00:10:21.899
After some soul-searching, my advisor and I reconsidered our approach to type specialization.
00:10:41.460
We found a more efficient strategy without extensive type analysis, balancing between performance and resource overhead.
00:11:47.959
What is a JIT compiler? Many think of it as simply a static ahead-of-time compiler that runs at program startup.
00:12:04.380
However, JIT compilers can observe the running program, giving them valuable insights for optimization.
00:12:26.579
This access to live program data allows us to generate more efficient machine code than static compilers.
00:12:40.560
Through this understanding, we devised lazy basic block versioning.
00:13:05.820
This technique leverages self-modifying code to enhance execution efficiency by delaying code generation.
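The two key properties of lazy basic block versioning can be sketched in a few lines of Ruby. This is a toy model under stated assumptions, not how YJIT is built: the names `LazyBlockVersions`, `GENERATORS`, and `run_lazy` are hypothetical, lambdas stand in for machine code, and a hash lookup stands in for the self-patching jump. What it does show is that a block is only "compiled" when execution first reaches it, and that the same block can exist in several versions, one per type context.

```ruby
# Toy model of lazy basic block versioning. Illustrative only.
class LazyBlockVersions
  attr_reader :compiled

  def initialize(generators)
    @generators = generators # block name => ->(ctx) { specialized lambda }
    @versions = {}           # [name, ctx] => specialized lambda
    @compiled = []           # log of [name, ctx] pairs, in compilation order
  end

  # Return the version of block `name` for type context `ctx`,
  # generating it only the first time it is requested.
  def version(name, ctx)
    key = [name, ctx]
    @versions[key] ||= begin
      @compiled << key
      @generators.fetch(name).call(ctx)
    end
  end
end

GENERATORS = {
  # Entry block: nothing is known about x yet, so test its type at run time
  # and continue in a version of :double compiled for what was learned.
  entry: ->(_ctx) {
    ->(jit, x) {
      ctx = x.is_a?(Integer) ? { x: Integer } : { x: :unknown }
      jit.version(:double, ctx).call(jit, x)
    }
  },
  # :double is specialized on the context it is entered with.
  double: ->(ctx) {
    if ctx[:x] == Integer
      ->(_jit, x) { x + x } # stands in for an add with no type check
    else
      ->(_jit, x) { x.respond_to?(:+) ? x + x : raise(TypeError, "cannot double") }
    end
  },
  # Never reached in the usage below, therefore never compiled.
  unused: ->(_ctx) { ->(_jit, x) { x } }
}.freeze

def run_lazy(jit, x)
  jit.version(:entry, {}).call(jit, x)
end

jit = LazyBlockVersions.new(GENERATORS)
run_lazy(jit, 21)
run_lazy(jit, "ab")
jit.compiled.each { |name, ctx| puts "compiled #{name} for #{ctx}" }
```

Running it logs one compilation of `:entry` and two of `:double` (one per context), and none for `:unused`, which is the laziness the technique relies on.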
00:13:35.100
Initially, this technique faced skepticism in the compiler community, but after repeated attempts to publish, it was accepted and proved successful.
00:14:53.840
Ultimately, our results showed that the lazy basic block versioning technique was able to achieve performance beyond traditional type analysis.
00:15:19.680
Shortly after joining Shopify, I discussed building a JIT for CRuby with my manager.
00:15:32.640
The project evolved from a toy to a dedicated effort, leveraging the principles we researched.
00:15:47.100
We generated super-instructions to optimize common sequences of YARV instructions.
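As a rough illustration of what a super-instruction is, the sketch below fuses a known sequence of bytecode instructions into a single combined instruction. It is an assumption-laden toy: instructions are modeled as Ruby arrays, the operands are simplified (a symbol instead of a local-variable index), and the fused name `getlocal_putobject_opt_plus` and the helper `fuse_superinstructions` are hypothetical, not part of YARV or the prototype described in the talk.

```ruby
# Hypothetical fusion table: instruction-name sequence => fused instruction name.
FUSIONS = {
  %i[getlocal putobject opt_plus] => :getlocal_putobject_opt_plus
}.freeze

# Scan the instruction list and replace each matching sequence with one
# fused instruction that carries the operands of the originals.
def fuse_superinstructions(insns, fusions = FUSIONS)
  out = []
  i = 0
  while i < insns.length
    pattern, fused = fusions.find do |names, _fused_name|
      insns[i, names.length].map(&:first) == names
    end
    if pattern
      operands = insns[i, pattern.length].flat_map { |insn| insn.drop(1) }
      out << [fused, *operands]
      i += pattern.length
    else
      out << insns[i]
      i += 1
    end
  end
  out
end

program = [[:getlocal, :x], [:putobject, 1], [:opt_plus], [:leave]]
p fuse_superinstructions(program)
```

The benefit of fusing is that the interpreter dispatches once instead of three times for the sequence; the cost is a larger instruction set to maintain.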
00:16:04.620
Though our first prototype achieved speedups, it underperformed in Rails benchmarks.
00:16:20.400
Subsequently, we developed YJIT, which targeted x86_64 and aimed to achieve double-digit speedups.
00:16:41.100
After nine months of concentrated development, we delivered impressive performance results for realistic benchmarks.
00:17:10.239
Effective benchmarking is crucial in JIT compiler development, which has historically focused on microbenchmarks.
00:17:37.679
However, a narrow focus on a small set of benchmarks can obscure potential performance issues.
00:17:52.780
The choice of benchmarks significantly impacts performance evaluation in compilers.
00:18:09.279
Our approach for benchmarking YJIT has enabled real-time usability and has gathered valuable metrics for optimization.
00:18:39.240
This allows developers to run benchmarks easily and encourages meaningful participation in the YJIT development process.
00:19:03.480
We focused on both representative and synthetic benchmarks, leading to better insights into our optimizations.
00:19:34.260
Attention to the setup process ensures that contributors are not deterred by complexity during benchmarking.
00:19:54.720
In terms of methodology, we recommend benchmarking on stable environments rather than on laptops.
00:20:11.540
It’s also essential to stabilize benchmarking results through consistent configurations.
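One way to get more stable numbers is to run warmup iterations before timing, use a monotonic clock, and report the spread as well as the mean. The sketch below is a minimal harness along those lines; it is not the benchmarking suite used by the YJIT team. The helper names `measure` and `yjit_enabled?` and the default iteration counts are assumptions chosen for illustration.

```ruby
# True only when this Ruby has YJIT available and it is turned on.
def yjit_enabled?
  !!(defined?(RubyVM::YJIT) && RubyVM::YJIT.enabled?)
end

# Run the block `warmup` times untimed (so a JIT can compile hot code),
# then time `iterations` runs with a monotonic clock.
def measure(warmup: 5, iterations: 10)
  warmup.times { yield }
  samples = Array.new(iterations) do
    start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    yield
    Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
  end
  mean = samples.sum / samples.size
  variance = samples.sum { |t| (t - mean)**2 } / samples.size
  { mean: mean, stddev: Math.sqrt(variance), samples: samples }
end

stats = measure { (1..50_000).sum { |n| n * n } }
puts "YJIT enabled: #{yjit_enabled?}"
puts format("mean %.6fs, stddev %.6fs", stats[:mean], stats[:stddev])
```

A large standard deviation relative to the mean is a hint that the environment is noisy (frequency scaling, background processes) and that the comparison between runs may not be trustworthy.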
00:20:31.380
The evolution of YJIT has shown continuous improvement in Ruby performance.
00:20:52.940
Performance metrics indicate significant advancements, with ActiveRecord benchmarks running more than twice as fast as on Ruby 3.1.
00:21:14.300
We use various metrics to effectively evaluate the efficiency of our optimizations within the YJIT project.
00:21:43.800
Shopify's storefront renderer is crucial for all Shopify stores and requires high efficiency.
00:22:00.300
The storefront renderer processes significant traffic daily, affecting operational efficiency.
00:22:19.160
Therefore, we aim to measure YJIT performance directly against the storefront renderer.
00:22:50.300
We employ a systematic approach for testing in production systems.
00:23:06.600
In January 2023, we were seeing around a 6% speedup, and we have recently improved that to 18%.
00:23:18.500
These results translate into significant savings in server infrastructure.
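The link between a speedup and infrastructure savings can be sketched with some simple arithmetic. This is a back-of-the-envelope model, not Shopify's capacity planning: it assumes "speedup" means the ratio of baseline time to optimized time, and that the workload is CPU-bound and scales linearly with server count. The helper names are hypothetical.

```ruby
# Speedup in percent, defined here as (baseline / optimized - 1) * 100.
def speedup_percent(baseline_time, optimized_time)
  (baseline_time / optimized_time - 1.0) * 100.0
end

# Fraction of CPU capacity no longer needed for the same traffic,
# assuming a CPU-bound workload that scales linearly.
def capacity_saved_percent(speedup_pct)
  (1.0 - 1.0 / (1.0 + speedup_pct / 100.0)) * 100.0
end

[6.0, 18.0].each do |pct|
  puts format("%.0f%% speedup -> about %.1f%% less CPU time per request",
              pct, capacity_saved_percent(pct))
end
```

Under these assumptions, the saving is always somewhat smaller than the headline speedup, because a request that is 18% faster takes 1/1.18 of the original time, not 82% of it.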
00:23:40.000
Designing a JIT compiler entails balancing performance, memory usage, and code generation speed.
00:24:02.300
The challenge lies in optimizing for user experience while ensuring the compiler remains efficient.
00:24:23.640
YJIT has made impressive strides in achieving better memory allocation strategies.
00:24:38.360
The implementation of code garbage collection and lazy memory allocation optimizes resource usage.
00:25:00.000
Overall, the YJIT project has improved continuously on its way to being production-ready.
00:25:15.620
Looking forward, we aim to enhance functionality and explore advanced profiling techniques.
00:25:34.150
Your feedback is vital for our continued improvement.
00:26:00.210
If YJIT is beneficial for your applications, please let us know.
00:26:12.859
We value constructive feedback on both successful implementations and performance challenges.
00:26:42.800
Our quest for optimization continues, and we are grateful for your support on this journey.
00:27:06.220
Thank you for listening, and I welcome your insights and questions.