Keynote: Optimizing YJIT’s Performance, from Inception to Production

Maxime Chevalier-Boisvert

#yjit-yet-another-ruby-jit

#just-in-time-jit

#ruby-performance

#ruby-optimization

Maxime Chevalier-Boisvert • May 11, 2023 • Nagano, Japan

The keynote presentation by Maxime Chevalier-Boisvert at RubyKaigi 2023 centers on the optimization of YJIT (Yet Another Ruby JIT) from its early development stages to its current deployment in production environments. The talk highlights the challenges of improving Ruby's performance, particularly through just-in-time compilation, and chronicles the inception and evolution of YJIT.

Key Points Discussed:
- Background on Ruby Performance Optimization: Insight into previous failed attempts at creating JIT compilers for Ruby, setting the stage for YJIT's goals and progress.
- YJIT Milestone: YJIT has been deemed 'production ready' at Shopify, currently serving requests across all Shopify storefronts, demonstrating significant speed improvements.
- Technical Decisions: Discussion of the data-driven approach adopted in the YJIT project with substantial contributions from developers across various organizations, particularly highlighting support from GitHub.
- Goals of YJIT: Ensuring 100% compatibility with existing Ruby code without requiring changes, targeting web workloads like Ruby on Rails, and aiming for double-digit speedups or at worst no slowdown in performance.
- Methodology: Emphasis on rigorous benchmarking procedures, focusing on real-world software performance metrics instead of solely on microbenchmarks.
- Continuous Improvement: Highlights on YJIT's advancements since its inception, such as implementing ARM64 support, achieving increased speedups, improving memory management strategies, and ongoing development plans that focus on profiling enhancements and user feedback.
- Personal Journey: Maxime also shares personal anecdotes about an academic background in compiler research and a passion for compiler technology, showing how this groundwork laid the foundation for the innovations in YJIT.

Conclusion: The presentation closes with a call for community feedback on YJIT's application and insights, reinforcing the project's commitment to continuous optimization and adaptation to user needs.

RubyKaigi 2023

00:00:02.040 Thank you.

00:00:09.960 Hi everybody, my name is Maxime, and today I'm going to be telling you about optimizing YJIT's performance from inception to production.

00:00:24.539 I mean, wait until after to talk to the club. We all love Ruby, but optimizing Ruby's performance has proven to be a difficult and daunting problem.

00:00:35.820 There have been many different projects trying to build just-in-time compilers for Ruby to improve its performance, but most of these projects today are either abandoned or didn't reach the outcomes they hoped for at the beginning in terms of adoption.

00:01:01.140 Recently, YJIT has reached an important milestone at Shopify; we've deemed it to be 'production ready.' This is not just talk; we've deployed YJIT to all of the storefront render infrastructure at Shopify, serving all of the requests at Shopify stores.

00:01:12.780 Every time you visit a Shopify store, you're using code that was run by YJIT. We're seeing significant end-to-end speedups, and it's not just us; as of yesterday, Discourse announced that they are also using YJIT in production and getting similar speedups.

00:01:45.720 This talk is not just technical; it's about some of the decisions that led to YJIT being production-ready. I also want to share a bit about the origin story and the background behind YJIT.

00:01:58.560 There's a famous quote by Mary Kay Ash: 'Ideas are a dime a dozen; people who implement them are priceless.' What I take from this is that many good ideas exist, but there aren't enough people out there implementing these good ideas.

00:02:29.459 Unfortunately, people tend to shorten the quote to 'Ideas are a dime a dozen,' which can lead to a cynical interpretation that all ideas are equally worthwhile or worthless, merely a matter of implementation.

00:02:50.400 However, I believe that starting from a bad concept may never deliver the desired outcomes, no matter how many resources or years are thrown at it. In this talk, I want to discuss the origins and goals of the YJIT project, the benchmarks we've curated to evaluate its performance, our data-driven approach to optimization, and some of the engineering trade-offs involved in compiler design.

00:03:35.040 The YJIT project started over two years ago, originally built primarily at Shopify but fully open-sourced. Notably, we received significant contributions from individuals at GitHub as well.

00:03:56.459 A key aspect of this project is that we take a data-driven approach to optimization. To achieve this, we have a large and diverse set of benchmarks; we benchmark often and gather detailed metrics.

00:04:08.220 The effort behind YJIT isn't just my own; it's a much larger team. This project would not be where it is today without the contributions of many amazing programmers.

00:04:19.919 One of the original goals of the YJIT project was to run any Ruby code, aiming for 100% compatibility with the code that we run in production at Shopify.

00:04:41.220 We decided early on that we couldn't require developers to change the codebase to fit YJIT, as our production codebase is vast.

00:04:58.500 Our primary focus is web workloads, mainly Ruby on Rails, and we aimed to achieve double-digit speedups on real-world software. Additionally, we wanted to ensure that YJIT never slows down Ruby code—at worst, we should see no speedup and no slowdown.

00:05:18.000 Our Ruby codebase is large, and YJIT has enabled efficient lazy generation of machine code.

00:05:34.740 We optimize our execution through runtime value promotion, type specialization, and polymorphic inline caches.
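To make the idea of a polymorphic inline cache more concrete, here is a minimal sketch in plain Ruby. It is a toy model of a single call site that remembers which method it resolved for each receiver class it has seen; the `CallSiteCache` class, its `MAX_ENTRIES` cap, and the hit/miss counters are hypothetical names chosen for illustration and do not reflect how YJIT is implemented internally.

```ruby
# Toy model of a polymorphic inline cache (PIC) for one call site.
# Illustration only; this is not YJIT's internal design.
class CallSiteCache
  MAX_ENTRIES = 4 # hypothetical cap on how many receiver classes are cached

  attr_reader :hits, :misses

  def initialize(method_name)
    @method_name = method_name
    @entries = {} # receiver class => UnboundMethod resolved for that class
    @hits = 0
    @misses = 0
  end

  def call(receiver, *args)
    klass = receiver.class
    target = @entries[klass]
    if target
      @hits += 1 # fast path: this receiver class was seen before
    else
      @misses += 1
      target = klass.instance_method(@method_name) # slow path: full lookup
      @entries[klass] = target if @entries.size < MAX_ENTRIES
    end
    target.bind_call(receiver, *args)
  end
end

site = CallSiteCache.new(:to_s)
results = [1, 2.5, :sym, 3].map { |value| site.call(value) }
puts results.inspect
puts "hits=#{site.hits} misses=#{site.misses}"
```

The second `Integer` receiver reuses the cached lookup, while each new receiver class takes the slow path once. A real JIT would encode these class checks directly in generated machine code rather than in a hash.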

00:05:50.400 There’s more; in the past year, a lot of work has gone into YJIT for Ruby 3.2. We've ported YJIT to Rust, which was originally unplanned.

00:06:05.940 We've also implemented a new backend for ARM64, which offers native support for Apple hardware and Raspberry Pi, achieving good performance.

00:06:11.880 In 2022, we deployed YJIT across our infrastructure, and it is no longer considered experimental.

00:06:31.380 I was thrilled with this progress.

00:06:37.860 Now, let’s go back a bit to the origins of YJIT from my perspective.

00:06:48.539 I began my undergraduate degree at McGill University in Montreal, Canada in 2004, and discovered a passion for compilers.

00:07:02.520 In 2007, I joined Professor Laurie Hendren’s team at McGill's Sable Lab to pursue a master's degree.

00:07:14.639 My advisor wanted me to develop a JIT compiler for MATLAB, focusing on numerical optimizations.

00:07:27.780 While working on this, I realized there was significant potential in type optimizations for dynamically typed languages.

00:08:06.560 My master's work focused on generating specialized versions of MATLAB functions based on arguments and type propagation, and it culminated in a published paper.

00:08:26.639 In 2009, I transitioned to the University of Montreal for a PhD, initially planning to pursue optimization for Python.

00:08:41.820 Ultimately, we pivoted to optimizing JavaScript with hybrid type analysis, blending inter-procedural type analysis and speculative optimization.

00:09:05.519 Traditional fixed-point type analyses tend to be expensive and inefficient for dynamically typed languages, so they often don't yield valid type assumptions.

00:09:30.660 I explored a method to speculate on type correctness and to avoid expensive type analysis by disallowing speculative parts of the code.

00:09:49.860 However, I got scooped, which is a common challenge in academic research.

00:10:21.899 After some soul-searching, my advisor and I reconsidered our approach to type specialization.

00:10:41.460 We found a more efficient strategy without extensive type analysis, balancing between performance and resource overhead.

00:11:47.959 What is a JIT compiler? Many think of it as simply a static ahead-of-time compiler running at program initiation.

00:12:04.380 However, JIT compilers can observe the running program, giving them valuable insights for optimization.

00:12:26.579 This access to live program data allows us to generate more efficient machine code than static compilers.

00:12:40.560 Through this understanding, we devised lazy basic block versioning.

00:13:05.820 This technique leverages self-modifying code to enhance execution efficiency by delaying code generation.
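The core idea of lazy basic block versioning can be sketched in plain Ruby under some simplifying assumptions. In this toy model, a "block" is only "compiled" (here, turned into a lambda) the first time execution reaches it, and a separate version is kept for each type context. The `LazyBlockVersions` class and `run_add` helper are hypothetical names for illustration. A real implementation emits machine code, patches stubs in place, and derives the type context from checks already performed along the execution path rather than inspecting argument classes at every entry.

```ruby
# Toy model of lazy basic block versioning (not YJIT's actual implementation).
# A block is compiled only when first reached, once per distinct type context,
# so each version can omit checks that the context already guarantees.
class LazyBlockVersions
  attr_reader :compile_count

  def initialize
    @versions = {} # [block_name, type_context] => compiled lambda
    @compile_count = 0
  end

  # Return the version for this context, generating it on first use only.
  def fetch(block_name, type_context)
    @versions[[block_name, type_context]] ||= compile(block_name, type_context)
  end

  private

  def compile(block_name, type_context)
    @compile_count += 1
    if block_name == :add && type_context == [Integer, Integer]
      ->(a, b) { a + b } # specialized version: operand types are known
    else
      ->(a, b) { a.public_send(:+, b) } # generic version: dynamic dispatch
    end
  end
end

def run_add(versions, a, b)
  versions.fetch(:add, [a.class, b.class]).call(a, b)
end

versions = LazyBlockVersions.new
sums = [run_add(versions, 1, 2), run_add(versions, 3, 4), run_add(versions, 1.5, 2)]
puts sums.inspect
puts "versions compiled: #{versions.compile_count}"
```

Only two versions are generated for three calls: one for the `Integer, Integer` context, which is reused, and one for `Float, Integer`. Code paths that are never executed are never compiled at all, which is where the "lazy" part pays off.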

00:13:35.100 Initially, this technique faced skepticism in the compiler community but proved successful after repeated attempts to publish.

00:14:53.840 Ultimately, our results showed that the lazy basic block versioning technique was able to achieve performance beyond traditional type analysis.

00:15:19.680 Shortly after joining Shopify, I discussed building a JIT for CRuby with my manager.

00:15:32.640 The project evolved from a toy to a dedicated effort, leveraging the principles we researched.

00:15:47.100 We generated super-instructions to optimize common sequences of YARV instructions.

00:16:04.620 Though our first prototype achieved speedups, it underperformed in Rails benchmarks.

00:16:20.400 Subsequently, we developed YJIT, which targeted x86_64 and aimed to achieve double-digit speedups.

00:16:41.100 After nine months of concentrated development, we delivered impressive performance results for realistic benchmarks.

00:17:10.239 Effective benchmarking is crucial in JIT compiler development, which has historically focused on microbenchmarks.

00:17:37.679 However, a narrower focus on benchmarks can obscure potential performance issues.

00:17:52.780 The choice of benchmarks significantly impacts performance evaluation in compilers.

00:18:09.279 Our approach for benchmarking YJIT has enabled real-time usability and has gathered valuable metrics for optimization.

00:18:39.240 This allows developers to run benchmarks easily and encourages meaningful participation in the YJIT development process.

00:19:03.480 We focused on both representative and synthetic benchmarks, leading to better insights into our optimizations.

00:19:34.260 Attention to the setup process ensures that contributors are not deterred by complexity during benchmarking.

00:19:54.720 In terms of methodology, we recommend benchmarking on stable environments rather than on laptops.

00:20:11.540 It’s also essential to stabilize benchmarking results through consistent configurations.
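As a rough illustration of these methodology points, the sketch below warms up the code before timing it, so that a JIT has a chance to compile hot paths, and reports the median of several iterations rather than a single run. The `measure` and `speedup_percent` helpers, the iteration counts, and the sample workload are all assumptions made for this example; they are not taken from the YJIT benchmark suite. The speedup formula shown is one common definition and may differ from the one used for the figures quoted in the talk.

```ruby
# Minimal benchmark harness sketch: warm up first, then time several
# iterations and report the median, which is less sensitive to outliers.
def measure(warmup: 5, iterations: 11)
  warmup.times { yield }
  samples = Array.new(iterations) do
    start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    yield
    Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
  end
  samples.sort[samples.size / 2]
end

# Speedup as a percentage: 0.0 means no change, 100.0 means twice as fast.
def speedup_percent(baseline_seconds, candidate_seconds)
  (baseline_seconds / candidate_seconds - 1.0) * 100.0
end

# RubyVM::YJIT may be undefined on builds or implementations without YJIT.
yjit_on = defined?(RubyVM::YJIT) && RubyVM::YJIT.enabled?
median = measure { (1..50_000).sum { |i| i * i } }
puts "YJIT enabled: #{yjit_on ? 'yes' : 'no'}, median: #{(median * 1000).round(3)} ms"
```

Running the same script once with `ruby` and once with `ruby --yjit` on a quiet machine gives two medians that can be passed to `speedup_percent` to compare the interpreter against YJIT.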

00:20:31.380 The evolution of YJIT has shown continuous improvement in Ruby performance.

00:20:52.940 Performance metrics indicate significant advancements, with Active Record benchmarks running more than twice as fast as on Ruby 3.1.

00:21:14.300 We use various metrics to effectively evaluate the efficiency of our optimizations within the YJIT project.

00:21:43.800 Shopify's storefront renderer is crucial for all Shopify stores and requires high efficiency.

00:22:00.300 The storefront renderer processes significant traffic daily, affecting operational efficiency.

00:22:19.160 Therefore, we aim to measure YJIT performance directly against the storefront renderer.

00:22:50.300 We employ a systematic approach for testing in production systems.

00:23:06.600 In January 2023, we achieved around a 6% speedup, and we recently improved that to 18%.

00:23:18.500 These results translate into significant savings in server infrastructure.

00:23:40.000 Designing a JIT compiler entails balancing performance, memory usage, and code generation speed.

00:24:02.300 The challenge lies in optimizing for user experience while ensuring the compiler remains efficient.

00:24:23.640 YJIT has made impressive strides in achieving better memory allocation strategies.

00:24:38.360 The implementation of code garbage collection and lazy memory allocation optimizes resource usage.

00:25:00.000 Overall, the YJIT project demonstrates continuous improvement in being production-ready.

00:25:15.620 Looking forward, we aim to enhance functionality and explore advanced profiling techniques.

00:25:34.150 Your feedback is vital for our continued improvement.

00:26:00.210 If YJIT is beneficial for your applications, please let us know.

00:26:12.859 We value constructive feedback on both successful implementations and performance challenges.

00:26:42.800 Our quest for optimization continues, and we are grateful for your support on this journey.

00:27:06.220 Thank you for listening, and I welcome your insights and questions.
