From Interpreting C Extensions to Compiling Them

RubyKaigi 2024

00:00:03.719 Okay, hello and welcome to my talk.
00:00:11.000 Today, we are going to discuss interpreting C extensions and then moving on to compiling them.
00:00:17.039 My name is Benoit Daloze. I work at Oracle Labs in Zurich and have been involved in TruffleRuby since 2014.
00:00:22.119 I hold a PhD in dynamic languages, and I am the maintainer of ruby/spec and setup-ruby. I am also a CRuby committer.
00:00:28.400 TruffleRuby is a high-performance Ruby implementation that uses the Graal JIT compiler and targets full compatibility with CRuby 3.2, including C extensions.
00:00:40.360 As a point of reference, applications like Mastodon and Discourse can run on TruffleRuby thanks to this high level of compatibility.
00:00:46.160 You can find TruffleRuby on GitHub and so on.
00:00:52.199 Now, let's look into why Ruby has C extensions in the first place. There are two main reasons for this.
00:01:04.600 The first reason is to bind to existing C libraries. For instance, the OpenSSL, MySQL, and PG gems all bind to the corresponding native libraries.
00:01:17.159 That's a common reason for using C extensions. The usual alternative is the FFI gem. However, only a small proportion of gems use it.
00:01:29.600 Furthermore, the FFI gem struggles when libraries have numerous macros in their headers, as one would need to replicate extensive logic in Ruby code.
00:01:40.520 The second purpose of C extensions is to improve performance. Sometimes Ruby code can be too slow.
00:01:52.439 For instance, both JSON and MessagePack have C extensions to speed up serialization. However, this practice has become less encouraged over the years.
00:02:04.759 As discussed in Maxim's talk, it's seen as less necessary now, and TruffleRuby executes Ruby code fast enough that writing a C extension may not bring a significant advantage.
00:02:10.800 Moreover, C extensions usually require higher maintenance.
00:02:16.800 So far, I have talked about C extensions, but what I really mean is native extensions, as they are not limited to C; you can also use C++ or Rust and other languages.
00:02:24.959 If we look at usage, we can take the top 10,000 gems by download count and see how many of them actually have a C extension.
00:02:31.800 Only 4% of these gems have a C extension, which may indicate that C extensions are not that crucial for compatibility. However, if we account for gems that have a transitive dependency on a gem with a native extension, the percentage increases significantly.
00:02:42.800 For the top 10,000 gems, it's 46% that depend transitively on a C extension.
00:02:48.840 It's important to note that we don't count FFI in this statistic: FFI is a C extension on CRuby, but on other Ruby implementations it isn't.
00:02:54.360 We can also be more precise; for instance, the json and racc gems have a pure-Ruby fallback, allowing them to work even if the extension is not supported.
00:03:01.080 Even with this, it still amounts to 43%, meaning that a large portion of the Ruby ecosystem relies on native extensions, which makes supporting them essential for compatibility.
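The difference between the direct and the transitive percentage can be illustrated with a small sketch. The gem names, the dependency graph, and the `native_dependent?` helper below are all made up for illustration; the real numbers in the talk come from the top 10,000 gems, not from this data.

```ruby
require "set"

# Hypothetical dependency graph: gem name => direct runtime dependencies.
DEPENDENCIES = {
  "web"    => ["xml", "http"],
  "xml"    => ["parser"],
  "http"   => [],
  "parser" => [],
  "db"     => [],
  "tasks"  => [],
}.freeze

# Hypothetical set of gems that ship a native extension themselves.
NATIVE = Set["xml", "db"].freeze

# True if the gem, or anything it depends on transitively, has a native extension.
def native_dependent?(gem, seen = Set.new)
  return false unless seen.add?(gem) # already visited: guards against cycles
  return true if NATIVE.include?(gem)

  DEPENDENCIES.fetch(gem, []).any? { |dep| native_dependent?(dep, seen) }
end

direct     = DEPENDENCIES.keys.count { |g| NATIVE.include?(g) }
transitive = DEPENDENCIES.keys.count { |g| native_dependent?(g) }
puts "direct: #{direct}/#{DEPENDENCIES.size}, transitive: #{transitive}/#{DEPENDENCIES.size}"
```

In this toy graph only two gems have an extension of their own, but a third one ("web") is pulled in through its dependency on "xml", which is the same effect that moves the figure from 4% to over 40% in the real ecosystem.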
00:03:07.920 Now, for JRuby, it's a bit more complicated due to Java extensions, and I will not dive into that detail here.
00:03:20.680 If we look at compatibility, CRuby of course defines the behavior the specs check, while TruffleRuby passes about 98% of the C API specs.
00:03:28.000 This level of compatibility is commendable. However, JRuby does not implement C extensions, at least not in the current version, so it does not pass these tests.
00:03:41.239 Next, I'd like to delve into how we can implement C extensions.
00:03:52.959 For the past ten years, we have been keen on implementing these extensions as they are crucial for compatibility. We have tried various approaches, each with its own set of pros and cons.
00:04:04.360 As a bit of background, let me explain what Truffle is. Truffle is a framework designed to facilitate the creation of high-performance languages.
00:04:09.840 The main advantage of writing an interpreter with the Truffle framework is that you essentially get a JIT compiler for free. This works because the GraalVM compiler, a language-agnostic JIT compiler, can partially evaluate the interpreter of any Truffle language and generate efficient machine code from it.
00:04:28.000 For example, suppose I have a Ruby method or block. The JIT compiler, which mostly understands Java, looks through the TruffleRuby interpreter to see what an operation such as plus actually does, and compiles that to machine code.
00:04:46.840 The end result is machine code optimized for that specific Ruby method or block. Essentially, it is akin to having written a JIT compiler specifically for Ruby, but without the overhead of doing so.
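To make the idea concrete, here is a minimal sketch of the kind of interpreter this starts from: a tree of nodes that each know how to execute themselves. It is written in Ruby purely for illustration; Truffle interpreters are written in Java, and the class names `IntLiteral`, `ReadLocal`, and `Add` are hypothetical, not Truffle's API.

```ruby
# Each node knows how to execute itself; the tree is the program.
class IntLiteral
  def initialize(value)
    @value = value
  end

  def execute(_frame)
    @value
  end
end

class ReadLocal
  def initialize(name)
    @name = name
  end

  def execute(frame)
    frame.fetch(@name)
  end
end

class Add
  def initialize(left, right)
    @left = left
    @right = right
  end

  # The "plus operation": evaluate both children, then add the results.
  def execute(frame)
    @left.execute(frame) + @right.execute(frame)
  end
end

# AST for the Ruby expression `a + 2`, run with the local variable a = 40.
tree = Add.new(ReadLocal.new(:a), IntLiteral.new(2))
puts tree.execute({ a: 40 })
```

Because the tree for a given method is fixed, a partial evaluator can in principle fold away the dispatch through `execute` and keep only the work the nodes actually do, here a variable read and an addition.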
00:05:03.280 Not needing to duplicate logic between the interpreter and the JIT compiler avoids a common pain point that many virtual machines suffer from, one that leads to bugs and performance issues.
00:05:14.720 The journey began in 2014 with an initial prototype using TruffleC. TruffleC, as its name suggests, is a Truffle interpreter for the C language.
00:05:38.080 This approach was quite innovative and unusual, as C isn't conventionally viewed as an interpreted language.
00:05:45.880 You might wonder why we took this route instead of simply using Clang.
00:05:52.120 I will answer this question in more detail, but in summary, we aim for more freedom of representation. For instance, we can represent a C struct as a Ruby object.
00:06:05.400 This seems counterintuitive, as a C struct is merely a layout in memory, while a Ruby object is more complex. However, this approach enables us to overcome limitations.
00:06:17.880 Moreover, surprisingly, performance benefits can arise because the GraalVM JIT compiler can compile both C and Ruby together.
00:06:29.360 Thus, it can inline functions across both languages. When we compile code, we can optimize a Ruby method that calls a C function, which subsequently calls another Ruby method.
00:06:41.720 Inlining is crucial for performance, as shown by the early statistics, which revealed that calling C extensions with inlining led to a significant performance gain.
00:06:54.400 When C calls Ruby, it often results in slowdowns; however, inlining mitigates this problem.
00:07:04.320 Two years later, we see the inception of Sulong, a successor to TruffleC that avoids having to parse C code at run time. Instead of parsing C directly, Sulong works on LLVM bitcode, which serves as an intermediate representation.
00:07:50.760 This approach is faster than parsing C, as it allows us to compile C to bitcode ahead of time with Clang. Sulong then interprets this bitcode at run time.
00:08:02.000 Now, transitioning to native extension scenarios, we initially had challenges with representing structures directly.
00:08:15.400 In C, the Ruby headers define the VALUE type, typically as unsigned long. This isn't convenient, as objects in TruffleRuby are represented as Java objects, which cannot have direct memory addresses.
00:08:27.280 Java's garbage collector can move these objects at any time, so obtaining valid addresses isn’t practical, as any address you might obtain could be invalid.
00:08:40.199 Sulong's solution provides a method for managing representations creatively. For example, when defining a method in a C extension taking a Ruby object, we manipulate object fields directly without converting these objects to integers.
00:08:53.600 Let’s consider redirecting access to a C struct, where commonly we would use Ruby macros to reference Ruby objects directly.
00:09:07.760 Instead of reading memory addresses, our implementations can return Ruby objects directly, streamlining the interaction.
00:09:19.160 Instead of adhering to traditional memory access patterns, we facilitate interoperability, allowing for Ruby and C exchanges while maintaining efficiency.
00:09:31.640 However, when working with inter-library calls, unexpected leakage of Ruby objects into native memory can complicate matters.
00:09:44.320 For instance, when interfacing with libraries like libuv, Ruby objects could be passed around leading to further complications.
00:09:52.520 Inspired by these observations, we introduced handles, which allow us to associate native pointers with Ruby objects and prevent any unnecessary overhead.
00:10:05.400 While handles facilitate this representation change, we aim to minimize their use to situations where they are absolutely necessary.
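The handle idea can be sketched as a table that gives native code a stable integer in place of an object and maps it back later. This is only an illustration under stated assumptions: the `HandleTable` class, its method names, and the `BASE` and `STRIDE` values are hypothetical and are not TruffleRuby's actual implementation.

```ruby
# Hypothetical handle table: maps Ruby objects to stable integers that native
# code can store in memory, and maps those integers back to the objects.
class HandleTable
  BASE   = 0x1000 # arbitrary starting "address" for illustration
  STRIDE = 8      # spacing, so handles look like aligned pointers

  def initialize
    @objects = []                     # handle index => object (keeps it alive)
    @handles = {}.compare_by_identity # object => handle, keyed by identity
  end

  # Allocate a handle lazily, only when an object must escape to native memory.
  def to_native(object)
    @handles[object] ||= begin
      @objects << object
      BASE + (@objects.size - 1) * STRIDE
    end
  end

  # Recover the object from an integer previously returned by to_native.
  def from_native(handle)
    @objects.fetch((handle - BASE) / STRIDE)
  end
end

table = HandleTable.new
str = +"hello"
handle = table.to_native(str)
puts table.from_native(handle).equal?(str)
```

Since the integer never changes, it stays valid even if the garbage collector moves the object. In this sketch entries are never released, so a real implementation would also need a policy for freeing handles, which is one reason to create them only when necessary.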
00:10:20.160 This year, I made significant changes to TruffleRuby, allowing it to run extensions natively and moving away from reliance on Sulong.
00:10:34.000 This transition provides various advantages, including faster startup and warm-up times, eliminating the costly JIT compilation step.
00:10:46.760 By leveraging native compilation tools like Clang or GCC, we foster more trustworthy code execution.
00:10:56.320 With direct compilation, large extensions such as gRPC, which often use advanced C++ features and hit edge cases, now become feasible.
00:11:12.960 That said, this new approach has drawbacks; we no longer have the flexibility to perform certain optimizations or inline caches.
00:11:23.560 For instance, every variable must adhere to type specifications in C, and we cannot afford the same level of dynamic behavior previously enjoyed.
00:11:38.840 Similarly, the constraints of inline caches—previously beneficial for performance—must now be adapted to suit native specifications.
00:11:50.720 In focusing on structures, we needed to rethink how to manage specific Ruby C structs. For instance, instead of relying on RBASIC, which would have required instantiating traditional C structures, we applied macros to interface directly with desired Ruby classes.
00:12:05.680 One such example arises with IO objects, where we also construct a pointing mechanism to facilitate access to the relevant C fields.
00:12:17.920 Moreover, the challenge involves bulk structures, like rb_encoding, part of Ruby's core, which demand specialized consideration during implementation.
00:12:31.000 Additionally, the use of inline caches has resurfaced as a notable tactic traditionally used in the Ruby environment.
00:12:45.960 Despite needing a redesign, this leveraging of caches along with the existing C methodology aligns neatly with optimal performance operational principles.
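As a reminder of what an inline cache does, here is a minimal sketch of a monomorphic cache for a single call site. It is an assumption-laden illustration in plain Ruby: the `CallSite` class is hypothetical, it ignores singleton classes and method redefinition, and it is not how TruffleRuby implements its caches.

```ruby
# Hypothetical monomorphic inline cache for one call site.
# Simplification: ignores singleton classes and method redefinition.
class CallSite
  attr_reader :misses

  def initialize(method_name)
    @method_name = method_name
    @cached_class = nil
    @cached_method = nil
    @misses = 0
  end

  def call(receiver, *args)
    klass = receiver.class
    unless klass.equal?(@cached_class)
      # Slow path: full method lookup, then remember the result.
      @cached_method = klass.instance_method(@method_name)
      @cached_class = klass
      @misses += 1
    end
    # Fast path: reuse the cached method without another lookup.
    @cached_method.bind_call(receiver, *args)
  end
end

site = CallSite.new(:upcase)
results = %w[a b c].map { |s| site.call(s) } # one miss, then two hits
site.call(:sym)                              # different class: miss and refill
puts "results: #{results.inspect}, misses: #{site.misses}"
```

The cache pays for the lookup once per receiver class seen in a row, so a call site that always sees the same class runs on the fast path after the first call.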
00:12:59.640 In summary, we have seen the evolution from interpreting and JIT compiling C code to running everything natively through a customizable toolchain.
00:13:14.160 This enables extensions to be utilized effectively while preserving valuable instructions found within the standard libraries.
00:13:27.720 While the extension API still has its pitfalls, it is immensely better than the CPython API, which exposes a significant number of structures.
00:13:39.800 Many developers realize that mutable design choices can contribute to stunted progress and performance.
00:13:45.240 Therefore, ongoing research continues to explore strategies that combine the intuitiveness of C extensions with the flexibility of Ruby.
00:13:58.520 In conclusion, around 43% of the top 10,000 gems are dependent on gems with native extensions, underscoring the importance of supporting them for compatibility.
00:14:05.000 The Ruby extension API is flexible and can be implemented by alternative Ruby implementations.
00:14:12.520 TruffleRuby initially approached it by JIT compiling C extensions, which provided many advantages.
00:14:19.560 We have now transitioned to running these natively, which yields different yet significant advantages.
00:14:26.680 Overall, the challenges faced by C extensions revolve around managing memory representation effectively.
00:14:32.760 With advancements in handling functional macros, we can refine and improve efficiencies further.
00:14:39.000 If any of you would like to experiment with TruffleRuby, recent releases are available along with early access development builds.
00:14:45.520 Thank you for listening!