Introduction to CRuby source code

by Andy Pliszka

In 'Introduction to CRuby source code,' a talk given at MountainWest RubyConf 2014, Andy Pliszka gives an overview of the CRuby source code and what it offers Ruby developers. He argues that reading the source deepens one's understanding of Ruby and is the most direct way to learn how features such as singleton methods and garbage collection actually work.

Here are the key points discussed:

  • Motivation for Understanding CRuby: Pliszka highlights that familiarity with CRuby can significantly enhance one's programming skills in Ruby, especially for developers with several years of experience.
  • Building Ruby from Source: He walks through checking out the Ruby source code from GitHub, configuring it, and installing it on Mac and Linux systems with commands such as 'make' and 'make install', which is essentially what RVM does behind the scenes.
  • Debugging Tools: The speaker mentions the importance of having a reliable debugger to explore CRuby, recommending tools like LLDB for macOS and GDB for Linux.
  • Exploring CRuby Source Code: Pliszka guides the audience through essential files in the CRuby source code, explaining the structure and how critical classes such as arrays and strings are implemented in C.
  • Performance Optimization: He illustrates how code written in C can be significantly faster than the equivalent Ruby code, using a Fibonacci function and a quicksort as examples of the gains available from moving tight loops into C.
  • Case Study - Rails Installation: Testing a Ruby installation by attempting to install Rails is suggested as a means to confirm successful setup and C configuration.

At the conclusion of his talk, Pliszka encourages developers to experiment with CRuby and delve into its C code, both to reignite their passion for programming and to find substantial performance improvements. He underscores that exploring CRuby offers many 'aha' moments that deepen a Ruby developer's understanding of the language and of performance optimization.

00:00:25.720 Hello everyone, my name is Andy, and I work at Pivotal Labs in New York City. Today, I'm going to give you a gentle introduction to the CRuby source code.
00:00:47.160 First, I will share my motivation for looking at CRuby. Understanding the CRuby source code is beneficial for enhancing your knowledge of Ruby. If you have been coding in Ruby for a couple of years, and you’ve read meta-programming books or blogs about the object model, that may not be enough to fully grasp how Ruby objects work, especially concepts like singleton methods. To really understand these aspects, looking into the C source code of CRuby is the best way to learn. Additionally, if you're interested in garbage collection, examining the source is the essential way to learn about it.
00:01:31.640 Another great benefit of knowing C and how to interface it with Ruby is that it can dramatically speed up your code. While I am not suggesting that you rewrite all your web applications in C, it is worth profiling your application to identify tight loops that could be rewritten in C. You might find that doing so makes those sections 10 to 50 times faster. Also, if you want to write Ruby extensions in C, there are many examples available, such as Psych, which you can use as a guide.
00:02:05.159 Many gems are Ruby extensions, so knowing both Ruby and C allows you to harness the best of both worlds. You can enjoy Ruby's productivity by writing high-level functions in Ruby while implementing tight loops and algorithms in C to achieve performance that's difficult to realize in other languages like Python. Now, let’s build our Ruby from scratch. If you are using RVM, I’ll show you what RVM does behind the scenes, as it also builds Ruby from source.
00:03:16.799 The first thing you need to do is get the source code, which you can check out from GitHub. For these slides, I checked out a stable tag, which is 2.4.7. It’s a little older, but to ensure that you get repeatable results when trying this at home, please check out that tag. The next step involves configuring the source code because it is designed to run on multiple platforms.
00:03:52.000 If you’re on Mac or Linux, you will need to install OpenSSL, as RubyGems requires it. After that, run the Autoconf utility to generate the configure script, which adapts the source code to your platform. When configuring the Ruby source code, you will want to install it into a designated folder in your home directory, with optimization disabled for easier debugging. Ensure that the debug flags are set so that the binaries include debug information and the debugger can show line numbers. On Linux, the setup closely mirrors that of Mac: install the necessary libraries and run the configuration command with your chosen destination folder.
00:05:04.000 Once configured, you simply run 'make', which compiles all your C files into object files and links them into binaries. By this point, the Ruby binary is built. It’s a good idea to run unit tests after building Ruby, as CRuby comes with about 14,000 tests. Running 'make check' will execute all tests, with the expectation that most will pass. If you encounter a couple of failures at the beginning, don’t worry; keep going. However, if you face massive failures with thousands of tests failing, it’s better to delete the folder and start the process over.
00:06:56.480 Once you've confirmed that your Ruby build passed all checks, you can install it by running 'make install', which takes all the binaries, copies them to your designated Ruby folder, and sets up the gem folder structure, including some basic gems like Psych and Rake. At this point, your Ruby installation is complete. You must then ensure your system is aware of these binaries and that the gem setup is correct. Setting up the path to your Ruby bin folder and configuring 'GEM_HOME' and 'GEM_PATH' to point to your Ruby folder is essential.
00:08:18.200 Now that everything is set up and working correctly, you can verify it by running 'which ruby', which should show the binary location. Testing IRB is also a good idea; raising an exception in IRB should produce a backtrace whose paths point into your Ruby folder, confirming that IRB is running from your build. An essential part of your Ruby installation is the gem system, so running 'gem env' should confirm that all directories point to your designated Ruby folder.
00:09:08.920 A good test for any Ruby installation is to install Rails, as it depends on a large number of gems and so exercises the C configuration thoroughly. If the installation of Rails is successful, you can be confident that the Ruby build closely matches the version you would get from RVM. Once Rails is installed, you can create a new application and run it using your CRuby.
00:10:01.880 If you’re using RubyMine, setting up the new compiled Ruby version is straightforward; just specify the SDK and navigate to the location of the binary in your Ruby folder. So, we’ve completed our first level: building our own version of Ruby, configuring it, and getting Rails to work. This process shouldn’t take more than 30 minutes, and in essence, it’s what RVM does. This experience gives you a better understanding of what RVM and similar tools do.
00:10:33.920 Now let’s move on to level two. As we will be working with C, having a reliable debugger is important. There are several options depending on your operating system; for instance, on macOS, you could use LLDB, and on Linux, GDB is commonly used. You can even use Xcode for debugging your C code. For example, to debug a simple Ruby script, start LLDB with your Ruby binary and the script you want to debug, such as one that exercises the 'upcase' method.
00:11:44.960 Set a breakpoint at the beginning of the 'upcase' function in the C code. When you run the script, it should hit the breakpoint, allowing you to inspect the stack frame and view the commands executed right beforehand. If you prefer GUI debugging, Xcode can also be set up similarly to examine the functioning of particular methods.
00:12:24.960 Let’s take a closer look at the source code itself. Upon checking out the Ruby source code from GitHub, the most important files are located in the root directory. For example, you'll find the C implementation of the Ruby Array here, along with other fundamental classes such as String and Fixnum. When navigating through the code, focus on the files in the root folder and on the 'ext' sub-folder, where all the default C extensions are stored. The latter can be a valuable resource, especially if you’re interested in writing your own C extension gems.
00:13:22.720 Let's examine some critical Ruby classes, starting with the Array class. Each C file that provides an implementation of a Ruby class, like the Array class, typically includes an 'Init_' function, such as 'Init_Array' in array.c. This function sets the class up when it is loaded. Reading through the CRuby source code isn't too intimidating; in fact, the formatting is user-friendly, making it accessible even if you're new to C.
00:15:19.760 For the Array class, you can see how it's defined using 'rb_define_class', which takes the name of the class and its parent class. To include modules within a class, you can use the 'rb_include_module' function. Memory allocation for new objects is handled through the 'alloc' function. Remember, in C, object creation is typically a two-step process: allocating memory and then constructing the object, while in Ruby these steps are combined.
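The allocate-then-construct split mentioned above can be sketched in plain C. This is a minimal model, not CRuby's code: the `toy_array` struct and the `toy_array_alloc`, `toy_array_init`, and `toy_array_free` functions are hypothetical names invented for illustration, and the layout is not CRuby's actual array representation.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an object; not CRuby's array layout. */
typedef struct {
    long  len;    /* number of elements in use */
    long  capa;   /* number of elements reserved */
    long *items;  /* element storage, owned by the struct */
} toy_array;

/* Step 1: allocation. Returns an empty object, or NULL on failure. */
toy_array *toy_array_alloc(void)
{
    toy_array *ary = malloc(sizeof *ary);
    if (ary == NULL) return NULL;
    ary->len = 0;
    ary->capa = 0;
    ary->items = NULL;
    return ary;
}

/* Step 2: initialization. Gives the allocated object its storage.
 * Returns 0 on success, -1 if the storage could not be reserved. */
int toy_array_init(toy_array *ary, long capa)
{
    assert(ary != NULL && capa > 0);
    ary->items = malloc((size_t)capa * sizeof *ary->items);
    if (ary->items == NULL) return -1;
    ary->len = 0;
    ary->capa = capa;
    return 0;
}

/* Releases the storage and then the object itself. */
void toy_array_free(toy_array *ary)
{
    if (ary == NULL) return;
    free(ary->items);
    free(ary);
}
```

Calling `toy_array_alloc()` followed by `toy_array_init(ary, 4)` is roughly the pair of steps that a single `Array.new` hides from the Ruby programmer.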
00:16:32.960 If you are interested in meta-programming concepts, this source code will provide practical insights, as you can find definitions of singleton methods and other meta-programming constructs. For instance, when defining methods in C, you would use 'rb_define_method', and you can define constructors for classes in a similar manner. Let’s take a look at the String class; the definition follows a similar pattern to the Array class, featuring an 'Init_' function and class definition, revealing the cohesiveness of structure in CRuby.
00:18:21.000 As we explore further, you can see how similar patterns are repeated across different C files for fundamental Ruby classes, providing solid structure and organization. While more intricate files exist dealing with garbage collection or object creation, you can focus initially on simpler concepts.
00:19:22.720 Defining a method in CRuby is also a straightforward two-step process. First, you write the C function that implements the method; then you register it using the 'rb_define_method' helper, specifying the class reference, the method name, and the implementation. Every method implementation takes at least one argument, self, which represents the object on which the method is called. For instance, the implementation of the array length function returns a Fixnum, converting a C long value into a Ruby numeric type.
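To make the two steps concrete without depending on Ruby's headers, here is a small self-contained model of method registration and dispatch. Everything in it (`toy_class`, `toy_object`, `toy_define_method`, `toy_send`, `toy_ary_length`) is a hypothetical name made up for this sketch; CRuby's real method tables and calling conventions are considerably more involved.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_METHODS 8

typedef struct toy_object toy_object;

/* Every method implementation receives self as its first argument. */
typedef long (*toy_method_fn)(toy_object *self);

typedef struct {
    const char   *name;
    toy_method_fn fn;
} toy_method;

typedef struct {
    const char *name;
    toy_method  methods[TOY_MAX_METHODS];
    size_t      method_count;
} toy_class;

struct toy_object {
    toy_class *klass;  /* the class this object belongs to */
    long       len;    /* toy state: number of elements */
};

/* Step 1: the C function that implements the method. */
long toy_ary_length(toy_object *self)
{
    return self->len;
}

/* Step 2: register the function under a name on a class.
 * Returns 0 on success, -1 if the method table is full. */
int toy_define_method(toy_class *klass, const char *name, toy_method_fn fn)
{
    assert(klass != NULL && name != NULL && fn != NULL);
    if (klass->method_count == TOY_MAX_METHODS) return -1;
    klass->methods[klass->method_count].name = name;
    klass->methods[klass->method_count].fn = fn;
    klass->method_count++;
    return 0;
}

/* Looks the method up by name and calls it with self.
 * Returns 0 and stores the result, or -1 if no such method exists. */
int toy_send(toy_object *self, const char *name, long *result)
{
    toy_class *klass = self->klass;
    for (size_t i = 0; i < klass->method_count; i++) {
        if (strcmp(klass->methods[i].name, name) == 0) {
            *result = klass->methods[i].fn(self);
            return 0;
        }
    }
    return -1;
}
```

The model returns a plain `long`; in CRuby the length function returns a Ruby value instead, which is why a conversion from the C integer is needed there.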
00:20:33.160 When defining class constructors, it’s essential to follow certain steps such as creating initialization methods and defining allocation functions. This process also involves defining class-level functions and handling memory pointers directly when managing object state in C. It's pivotal to remember the dichotomy between C and Ruby as you venture into the CRuby source code.
00:21:31.360 In C, you work directly with memory addresses using 'malloc' and 'free' to manage your allocations manually. In contrast, Ruby handles memory on the heap with garbage collection, taking care of object memory allocation for you. Consequently, understanding which environment you are in becomes vital, as crossing boundaries from Ruby to C or vice versa requires type conversion due to the differing type systems.
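As an illustration of what managing allocations manually means on the C side, the following sketch implements a tiny growable buffer of longs. The `long_buf` type and its functions are hypothetical names for this example only; on the Ruby side an Array grows and is reclaimed for you.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical growable buffer; the caller owns it and must free it. */
typedef struct {
    long  *data;
    size_t len;   /* elements in use */
    size_t capa;  /* elements reserved */
} long_buf;

void long_buf_init(long_buf *b)
{
    b->data = NULL;
    b->len = 0;
    b->capa = 0;
}

/* Appends v, doubling the reserved space when it runs out.
 * Returns 0 on success, -1 if memory could not be obtained. */
int long_buf_push(long_buf *b, long v)
{
    assert(b != NULL);
    if (b->len == b->capa) {
        size_t new_capa = b->capa ? b->capa * 2 : 4;
        long *p = realloc(b->data, new_capa * sizeof *p);
        if (p == NULL) return -1;  /* the old block is still valid */
        b->data = p;
        b->capa = new_capa;
    }
    b->data[b->len++] = v;
    return 0;
}

/* Nothing reclaims this memory automatically, so release it explicitly. */
void long_buf_free(long_buf *b)
{
    free(b->data);
    long_buf_init(b);
}
```

Forgetting the final `long_buf_free` call leaks the buffer, and using `data` after freeing it is undefined behavior; these are exactly the classes of mistakes that Ruby's garbage collector rules out.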
00:22:44.080 For example, calling a C function from Ruby necessitates converting a Fixnum into a C long, using the appropriate macros. On return, you also need to convert the value back into a Ruby object. In our project, we’ll aim to enhance the Fixnum class with additional functionality, like implementing a new Fibonacci function in C with optimal performance.
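The conversion is needed because CRuby stores small integers directly inside the value word, marked by a tag in the lowest bit, rather than as plain C integers. The sketch below models that idea in standalone C. The names `toy_value`, `toy_long2fix`, `toy_fix2long`, and `toy_is_fixnum` are hypothetical; in real extension code you would use the macros from Ruby's headers (for example `FIX2LONG` and `LONG2FIX`) instead of writing this yourself.

```c
#include <assert.h>
#include <stdint.h>

typedef uintptr_t toy_value;   /* stand-in for a Ruby value word */

#define TOY_FIXNUM_FLAG 1      /* low bit set means "small integer" */

/* One bit is spent on the tag, so the usable range is halved. */
#define TOY_FIXNUM_MAX (INTPTR_MAX / 2)
#define TOY_FIXNUM_MIN (INTPTR_MIN / 2)

/* C integer -> tagged value: encode n as 2n + 1. */
toy_value toy_long2fix(intptr_t n)
{
    assert(n >= TOY_FIXNUM_MIN && n <= TOY_FIXNUM_MAX);
    return (toy_value)(n * 2 + TOY_FIXNUM_FLAG);
}

/* Nonzero if the value carries the small-integer tag. */
int toy_is_fixnum(toy_value v)
{
    return (v & TOY_FIXNUM_FLAG) != 0;
}

/* Tagged value -> C integer: undo the 2n + 1 encoding. */
intptr_t toy_fix2long(toy_value v)
{
    assert(toy_is_fixnum(v));
    return ((intptr_t)v - 1) / 2;
}
```

With this encoding `toy_long2fix(21)` produces the word 43, and `toy_fix2long` recovers 21 from it. The model uses multiplication and division rather than shifts so that negative numbers round-trip without relying on how a compiler shifts signed values.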
00:24:22.760 By executing a benchmark, we can illustrate the capabilities of CRuby. For example, executing the Fibonacci function directly in C results in substantially quicker performance compared to Ruby, demonstrating the potential time savings by leveraging CRuby. I’ll showcase how simple changes can lead to increased efficiency and execution speed, while employing conventions from C in the Ruby environment.
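The transcript does not reproduce the Fibonacci code used in the benchmark, so the following is only an illustrative sketch of the kind of tight loop that is a natural candidate for C. The function name `fib` and the choice of an iterative algorithm are assumptions made here, not details taken from the talk.

```c
#include <assert.h>
#include <stdint.h>

/* Iterative Fibonacci with fib(0) = 0 and fib(1) = 1.
 * fib(93) is the largest Fibonacci number that fits in 64 unsigned bits. */
uint64_t fib(unsigned n)
{
    assert(n <= 93);
    uint64_t a = 0;  /* fib(i) */
    uint64_t b = 1;  /* fib(i + 1) */
    for (unsigned i = 0; i < n; i++) {
        uint64_t next = a + b;
        a = b;
        b = next;
    }
    return a;
}
```

To expose such a function to Ruby, it would be wrapped in a method implementation that converts the incoming Ruby integer to a C integer, calls `fib`, and converts the result back.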
00:26:06.480 Now let’s examine an implementation of a quicksort function as a practical example. By creating an adapter function, we can apply existing C algorithms in our Ruby extensions. Effectively, by utilizing such optimizations, you can achieve significantly faster operations than those achieved via Ruby’s built-in methods.
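The adapter idea can be sketched as a thin function that puts data into the shape an existing C routine expects and hands back the result. This example is an assumption-laden illustration rather than the talk's code: it wraps the C standard library's `qsort` (which, despite its name, is not required to be a quicksort), and `compare_longs` and `sorted_copy` are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Comparator in the shape qsort() expects: negative, zero, or positive. */
int compare_longs(const void *a, const void *b)
{
    long x = *(const long *)a;
    long y = *(const long *)b;
    return (x > y) - (x < y);  /* avoids the overflow that x - y could cause */
}

/* Adapter: copies the input, sorts the copy with the existing C routine,
 * and returns it. The input is left untouched and the caller must free()
 * the result. Returns NULL if memory could not be obtained. */
long *sorted_copy(const long *src, size_t n)
{
    assert(src != NULL && n > 0);
    long *out = malloc(n * sizeof *out);
    if (out == NULL) return NULL;
    memcpy(out, src, n * sizeof *out);
    qsort(out, n, sizeof *out, compare_longs);
    return out;
}
```

In a real extension the adapter would additionally unpack Ruby values into the C array before sorting and pack them back into a Ruby array afterwards.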
00:27:38.800 As a result of our discussions, we see that CRuby allows for both productivity and performance enhancements. Remember that employing a custom Ruby build in production isn’t always advisable unless absolutely necessary, such as needing to troubleshoot unique C exceptions.
00:28:00.480 To conclude, I trust I’ve demonstrated how easy it is to build and install CRuby in around 30 minutes. I encourage you to experiment and explore the process, especially if you're feeling stagnant in your Ruby journey. As you dive into the CRuby code, you will encounter many 'aha' moments that can rekindle your interest and provide practical insights into performance optimization.