Ferrari Driven Development: superfast Ruby with Rubex

00:00:00.030 Konnichiwa, minna-san! My name is Sameer Deshmukh, also known as v0dro on GitHub and Twitter. I am from Pune, a city of about six million people in western India. It's also known as the ‘Oxford of the East’ due to its approximately 600 colleges. Currently, I am a master's degree student at the Tokyo Institute of Technology, studying high-performance computing. I have been living in the amazing city of Tokyo for the past eight months and I hope to learn a lot from the wonderful Japanese people during my stay.

00:00:21.539 I am also a member of the Ruby Science Foundation, often referred to as SciRuby for short. SciRuby is an organization dedicated to improving Ruby as a language for scientific computing and data analysis. I maintain several libraries for this purpose, some of which I will discuss today. This is the third time in a row that I am speaking at RubyKaigi, and I am very excited to be here. The first talk I gave was about a gem called Daru, which is a data frame library for Ruby, similar to Pandas in Python. That first talk was a pivotal moment for Daru; since then, it has received more than 185,000 downloads on RubyGems and has various plugins contributed by many developers.

00:01:12.360 I made some Japanese friends during this talk, and they were amused to learn that 'Daru' means 'sake' in Hindi, the official language of India. This connection has made my stay in Japan very memorable. The second time I spoke here, I discussed Rubex, which I will elaborate on further today. Since my first talk about Rubex, it has seen over 120 commits and has become a more stable and usable language. This time, I will speak about the improved and polished Rubex and introduce you to a new way of writing Ruby code called FDD, or Ferrari Driven Development, which allows you to write Ruby code in a much simpler manner without excessive complexity.

00:02:16.990 The reason for this need is that, as we all know, Ruby is a great language, but it can be rather slow. To speed things up, many programmers port performance-critical Ruby code to C extensions. Notable examples of Ruby gems that use C extensions are Nokogiri and fast_blank. Nokogiri is a powerful library for parsing XML and HTML, while fast_blank is a plain C extension with handwritten C code. Both of these gems significantly speed up their respective tasks by leveraging the power of C and interfacing it with the CRuby interpreter.

00:02:42.430 However, C extensions come with their share of considerable challenges. They are often very difficult to edit and write, and the learning curve is quite steep; you need to write a lot of scaffolding code just to get a simple C extension running. You also have to manually bootstrap this extension with Ruby, which can involve a lot of documentation reading and can be hard to understand. Public C APIs that I've written for C extensions can be difficult to navigate as well. Consider a gem with a C extension that another gem depends on; exposing this API publicly becomes cumbersome, unlike how you would do it in object-oriented programming. It's essential to care about smaller details when using these extensions, but Rubyists should be able to focus more on higher-level abstractions.

00:03:47.440 There are various solutions to these issues. One of the first solutions is RubyInline, which lets you write C code directly within a Ruby string, but it does not scale well since it requires all code to reside within a single file. Another solution is SWIG, which can lead to unreadable code if the abstractions grow large over time. Helix is another approach in which you write extensions using the Rust programming language, but it’s an entirely new language that requires mastering. The ideal solution for these problems would be a super-fast, developer-friendly Ruby programming language, but as we know, ideal solutions are rarely available.

00:05:07.130 In my opinion, a good solution to these issues could be the newly redefined Rubex, which I like to think of as the ‘Ferrari for Ruby’. It allows you to write Ruby C extensions while preserving the developer happiness you've come to expect as Ruby developers. In this talk, I will introduce you to major improvements made to Rubex since last year's presentation. The first advantage is an even more stable and robust language. There has been a significant amount of refactoring of the internal codebase of Rubex. In addition, there has been a slight shift from Rubex’s initial goals.

00:05:40.400 Until last year, I thought Rubex would serve primarily as a way to speed up Ruby and make Ruby extensions easier to write. However, it's now clear that there is a broader need for public C APIs that are accessible and comprehensible through documentation, without requiring developers to dive deep into the C code of a Ruby C extension just to use it in another gem. Imagine you have a Ruby library on top called Green and you want to access the C extension of another gem. Ideally, this should be straightforward, just like in object-oriented programming, but in practice you often end up reading and analyzing the source code to understand how the methods are implemented.

00:06:57.300 Now, let's take a brief look at a small Rubex snippet. This is a very minimal Rubex program; on the right-hand side, you can see the Rubex code. The only difference from typical Ruby code is that we've specified integer types for the arguments `a` and `b`. The `add` method behaves just like a standard Ruby method callable from a Ruby script, with the distinction that it resides inside a C extension. When Rubex compiles this method, it emits code that converts the arguments `a` and `b` from Ruby objects to C integers, performs the addition in C, and returns the result as a Ruby object.
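As a rough pure-Ruby illustration of the three steps just described (convert, add, return), here is a minimal sketch. It is not Rubex code and not what the Rubex compiler emits; the helper name `to_c_int` is hypothetical, and a 32-bit C `int` is assumed.

```ruby
# Hypothetical helper imitating a Ruby-object-to-C-int conversion.
# Assumption: a C int is 32 bits wide on the target platform.
def to_c_int(obj)
  c_int_range = (-(2**31))..(2**31 - 1)
  raise TypeError, "cannot convert #{obj.class} to int" unless obj.respond_to?(:to_int)

  value = obj.to_int
  raise RangeError, "#{value} does not fit in a C int" unless c_int_range.cover?(value)

  value
end

# Pure-Ruby stand-in for the typed add method.
# Note: real C addition could overflow; Ruby integers do not, so that
# part of the behaviour is not modelled here.
def add(a, b)
  to_c_int(a) + to_c_int(b)
end

puts add(2, 3)
```

The point of the sketch is only that the type declarations move the conversion work out of the method body: the caller still passes ordinary Ruby objects.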

00:08:12.220 To give you a better overview of Rubex, here's a glance at how the compiler operates. The Rubex code is processed by a compiler that is entirely written in Ruby. This compiler then transforms the Rubex code into C code, which interfaces with the Ruby VM using the C extension API, allowing it to operate within the CRuby runtime just like any other C extension.

00:08:47.470 Now, let’s assume you have an array of numbers and need to convert it into a hash. If you try performing this operation in a standard manner, it can be inefficient; therefore, you can leverage Rubex to create a class called ArrayToHash, which simplifies the process. The construction of the Rubex class looks very similar to a Ruby class but comes with additional features. The first key difference is that we specify that the argument in the convert method is an array. This tells Rubex that the input will always be a Ruby array, enabling it to emit optimized C code that takes full advantage of Ruby's internal data structures.
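For comparison, a pure-Ruby baseline of such a conversion might look like the sketch below. The talk does not spell out the exact key/value layout, so mapping each element to its index is an assumption, and `array_to_hash` is a hypothetical name.

```ruby
# Pure-Ruby baseline: map each element of an array to its index.
# Assumption: element => index is the layout being discussed.
def array_to_hash(arr)
  hash = {}
  i = 0
  n = arr.size
  # An explicit index loop, similar in shape to the loop a typed
  # extension would run over the array.
  while i < n
    hash[arr[i]] = i
    i += 1
  end
  hash
end

p array_to_hash(%w[a b c])
```

In plain Ruby, every `arr[i]`, `arr.size`, and `hash[...] =` here is a method call; declaring the argument as an array is what lets a compiler replace those calls with direct access to the underlying data.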

00:09:51.070 Rubex will implicitly optimize calls to the size method on the array and convert it into a long integer without requiring you to do the conversion manually. The results generated are in the form of a hash structure that Rubex has specified. This method uses optimized code, allowing you to directly interact with the C data structures that represent the Ruby hash without going through Ruby method calls. In this way, the process remains highly efficient.

00:10:44.990 Moreover, with Rubex, you can build a class called BlanketWrapper that closely mirrors a typical Ruby class while wrapping a C struct. A notable feature is the `attach` keyword, which tells the Rubex compiler that this class is responsible for managing the allocation and deallocation of the associated C struct. This means that Ruby's garbage collector (GC) will handle memory management for you, relieving you of the burden of writing complex allocation code.

00:12:10.610 Another great feature is a special variable called `data$`. It gives easy access to the struct that the class is associated with, so you can read and write its members using dot notation. You can accept Ruby objects in the initialization method, and Rubex handles the implicit conversion into the appropriate C data types using the established conversion functions of the C API. This leads to a significant reduction in lines of code while still providing a simple Ruby-like interface without compromising speed.

00:13:42.380 One of the significant hurdles in writing C extensions is managing the codebase. C does not offer namespaces like those in C++ or Ruby. However, Rubex lets you organize both C and Ruby functions using Ruby’s class and module structure. For example, in a simple code sample, you can define two functions, ‘bar’ and ‘bash,’ within a class called ‘foo.’ C functions are distinguished from regular Ruby methods by the `cfunc` keyword, which here indicates that ‘bar’ is a C function while ‘bash’ remains a typical Ruby method that can be accessed from other Ruby scripts.

00:14:34.900 These public APIs for C extensions are declared in separate files known as `.rubexd` files. For instance, you may have a class called 'Class1' alongside another named 'OtherClass.' You can explicitly declare the C function names in a `.rubexd` file and require this file in any other C extension that is also written in Rubex. By doing this, it becomes simple to access these functions just like any other Ruby API. The compiler treats these declarations as the public API, and you merely need to supply the compiled binary and the API files for the C library.

00:15:51.830 Another significant concern in Ruby is the Global Interpreter Lock (GIL). The GIL has already been discussed today, but it's important to highlight that we need mechanisms to release the GIL effectively, particularly in C extensions that perform extensive numerical computations. To illustrate, let me show you a simplified diagram of how the GIL usually operates and how it prevents true multi-threading in Ruby. Suppose you want to read a file consisting of 500,000 lines, each containing a value. Ideally, you would want to distribute the reading task across four threads to speed up the process.

00:16:53.050 However, the GIL prevents multiple threads from running Ruby code on the CPU simultaneously, forcing operations to execute sequentially. Rubex provides a `no_gil` block, a straightforward way of releasing the GIL without complicated workarounds. You wrap the relevant section of code in a `no_gil` block, and Rubex automatically releases the GIL while that code executes and reacquires it afterwards.
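The four-thread scenario described above can be sketched in plain Ruby as follows. This is only the baseline pattern, not the Rubex version; `parse_in_threads` is a hypothetical name, and the lines are generated in memory rather than read from a real file.

```ruby
# Split the lines into roughly equal chunks and parse each chunk in
# its own Ruby thread. On CRuby the threads take turns holding the
# GIL, so this CPU-bound parsing does not actually run in parallel.
def parse_in_threads(lines, thread_count = 4)
  return [] if lines.empty?

  slice_size = (lines.size / thread_count.to_f).ceil
  threads = lines.each_slice(slice_size).map do |chunk|
    Thread.new { chunk.map { |line| Float(line) } }
  end
  # Thread#value waits for the thread and returns the block's result.
  threads.flat_map(&:value)
end

# Assumption: each line holds exactly one numeric value.
lines = Array.new(500_000) { |i| (i * 0.5).to_s }
values = parse_in_threads(lines)
puts values.size
```

The structure of the program stays the same when the GIL is released; what changes is that the per-chunk work must then be done on C data rather than Ruby objects.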

00:18:05.420 For instance, when you define a computational method whose body is wrapped in a `no_gil` block, you can call this method from Ruby threads. Because the work happens inside the `no_gil` block, each thread can run on its own CPU core, and the results are easily returned to Ruby. I have run benchmarks demonstrating substantial improvements, revealing that operating without GIL can lead to approximately 350 times faster performance compared to scenarios constrained by GIL.

00:19:02.580 This is a relatively basic way of demonstrating performance improvements, but I captured a screenshot of htop while running my program, showing near-complete CPU utilization without the GIL compared to significantly lower utilization with plain Ruby threads. You can explore the code I used in the Rubex CSV reader repository on my GitHub account (v0dro) and play around with it yourself.

00:20:01.800 However, one limitation of releasing the GIL is that you can use only C data structures within `no_gil` blocks. Calling Ruby methods there can lead to segmentation faults, as Ruby does not permit Ruby methods to execute without the GIL held. Therefore, it is crucial to be cautious about what code runs inside `no_gil` blocks, including any functions called from them.

00:21:01.780 Error handling presents yet another challenge when working with C extensions. Previously, writing error handling code required invoking numerous C functions: `rb_raise` for raising errors, `rb_rescue` and `rb_ensure` for the rescue and ensure logic, and `rb_errinfo` for manually retrieving the current error. This leads to an incredibly complex workflow lacking the simplicity we are accustomed to in Ruby's begin-rescue-end workflows.
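For reference, this is the plain Ruby idiom that the C functions above have to reproduce by hand. The sketch is ordinary Ruby, not Rubex, and `safe_divide` is a hypothetical example method.

```ruby
# begin/rescue/ensure in plain Ruby: raise an error, rescue it, and
# run cleanup code regardless of whether an error occurred.
def safe_divide(a, b)
  log = []
  begin
    raise ArgumentError, "divisor must not be zero" if b.zero?

    result = a / b
  rescue ArgumentError => e
    # In a C extension, the error object would have to be fetched manually.
    log << "rescued: #{e.message}"
    result = nil
  ensure
    # Always runs, like the callback passed to an ensure-style C function.
    log << "cleanup ran"
  end
  [result, log]
end

p safe_divide(6, 3)
p safe_divide(1, 0)
```

Each clause here corresponds to a separate function call, with callbacks and argument passing, when written against the C API directly.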

00:22:14.500 Rubex redefines this process by introducing a user-friendly Ruby-like syntax for error handling. You can simply specify a begin-block to enclose the code, which will automatically integrate the necessary C methods for error handling seamlessly. My talk concludes here, and I encourage you to explore Rubex further.

00:23:15.800 There are several significant differences between Ruby and Rubex. For instance, you must use parentheses for function calls, and an explicit `return` statement is required at the end of a function. Currently, Rubex does not yet support blocks or closures. In contrast to C, the value of operator is not supported, though there are alternative constructs to work around this. You can find notable examples of Rubex implementations in the examples folder within the GitHub repository.

00:24:45.480 An exciting addition to Rubex is the 'AddToHash' gem, which is fully written in Rubex and serves as a C extension. Comprehensive documentation and tutorials for Rubex are available in the repository, including a detailed language specification in the reference.md file. The tutorial also provides a brief guide for porting your C extensions to Rubex.

00:26:10.220 In summary, Rubex provides a fast and productive way to create C extensions, ensuring optimal performance without compromising on usability. The only performance trade-off is the necessity to go through a compile step. Nevertheless, this rarely outweighs the benefits of maintaining developer productivity.

00:27:08.220 Every year at RubyKaigi, I like to discuss new ideas that the Ruby community has been developing over the past year, and this year is no exception. One of the exciting projects is typed memory views, which work with tools like Numo, a Ruby gem for linear algebra; they allow direct access to the C arrays inside these objects, greatly enhancing performance. Plans for integrating direct interfaces to GPUs using native CUDA kernels are also underway. We'll also work on integrating with GDB to simplify debugging during development.

00:28:18.660 Another critical project is Rubyplot. Ruby is a mature language, yet it lacks the kind of robust plotting ecosystem that other languages have. Various partial solutions exist, but they often rely on third-party tools, which aren’t accessible to everyone. Our goal is to create a native plotting solution for Ruby, written in C++, that interfaces with ImageMagick, GTK, and GR, creating a versatile plotting tool.

00:29:18.540 Rubyplot will be language-independent, serving as a C++ library that can interface with Ruby or any other language, enabling contributions from diverse language communities. Keep track of our progress on Rubyplot through our Discourse forum and GitHub, as the project is still in its early stages.

00:30:23.260 Another exciting idea is a common array library that bridges the gap between NMatrix and Numo, as we aim to create a robust framework for working with numerical data in Ruby. This involves exploring the potential integration of a project called Plures, which abstracts NumPy functionality into C libraries, and providing a Ruby front-end for it. I look forward to discussing this idea with all of you.

00:31:17.580 Before I conclude, I would like to extend my gratitude to the Ruby Association, the grant providers, and the Ruby Science Foundation for their support. Additionally, I have many SciRuby stickers to share, so if you'd like one, come say hello, and I'll gladly give you as many as you want! Thank you for listening, and I'm open to any questions.

00:32:59.360 Yes, TruffleRuby is quite interesting. The challenge I see is that, from a performance perspective, accessing C extensions is a different story altogether.

00:33:17.020 If Ruby continues to mature and become as fast as Julia, we might not need C extensions at all, but as long as C extensions and public APIs continue to be necessary, Rubex provides an excellent solution.

00:33:31.780 Thus, the API and abstraction capabilities of Rubex will remain valuable, even as Ruby itself evolves.

00:34:00.020 To clarify, the 'hash to array' functionality in Rubex relies on Ruby's core hash functions, maintaining a similar speed.

00:34:09.790 Rubex's interaction with Ruby’s hash data structure means it uses macros to manipulate the internal C representation directly.

00:34:18.160 Yes, you can use Rubex in combination with pure Ruby for debugging. It operates under the Ruby interpreter.

00:34:29.480 You need a C compiler along with the Rubex compiler; the process involves translating Rubex code into C, followed by compiling to a shared object for CRuby.

00:34:52.620 So to clarify, you cannot run pure Rubex code without the required compiler.

00:35:04.830 Thank you! I'm here if you have any further questions.

00:35:11.210 Thank you very much!