00:00:10.639
Thank you for coming to my talk. This is 'Building Native Extensions.' This is my first live in-person talk in eight years.
00:00:17.279
So please be gentle with me. Okay, if you don't know me, I am Mike. I go by flavorjones on the internet, and I work at Shopify.
00:00:24.080
I help lead the Ruby on Rails infrastructure group, and I maintain Nokogiri among other Ruby gems.
00:00:32.000
At RubyKaigi a month or so ago, I presented a technical talk on this topic.
00:00:37.840
It was about the 'how' of native C extensions. For that talk and for this one, I prepared working code to demonstrate what a C extension is and show a few different ways that C extensions can be packaged and installed as a Ruby gem.
00:00:42.879
The code is available for you to read and utilize—it's MIT licensed. Here's the Bitly link, and if you prefer typing, there's the GitHub project name.
00:00:53.920
My hope in presenting all of this 'how-to' material is to clearly show a few basic patterns for C extensions. There’s not a lot new here; these patterns have existed for years, but to my knowledge, this is the first time they’ve been named and presented together for comparison and contrast.
00:01:06.720
Now, what the RubyKaigi talk was missing was the 'why'. I want to explain why Nokogiri has evolved to use more complex techniques for compilation and installation over the years. So this talk is going to get through the 'how-to' section as quickly as I can, so I can discuss the 'why'.
00:01:19.759
In that sense, it's kind of a companion talk to the RubyKaigi one.
00:01:26.000
So there will be a little bit of repeated material, but not too much. The things I will talk about today include, first, the basics of C extensions.
00:01:38.560
Then I'm going to spend some time discussing why you shouldn't do this. I'll also talk about why it was a good thing that I did this.
00:01:50.880
I will discuss trust and security, and I'll try to scare you a little bit. If we have time, we will do a Q&A, but I have a lot of slides, so that might not happen, and I apologize in advance.
00:02:01.759
What is a C extension? No big deal, it's just Ruby code, but you write it in C. We create new extensions every day in Ruby simply by writing Ruby code.
00:02:08.959
I can extend the Ruby virtual machine to have a new module named 'Mike is great' simply by doing this. I've extended the virtual machine. A C extension is the same thing—except you write the code in C instead of Ruby.
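The slide itself is not captured in this transcript. A minimal sketch of what such a Ruby-level "extension" might look like, assuming the module is spelled `MikeIsGreat`:

```ruby
# Defining a module at runtime extends the running Ruby VM:
# once this executes, a constant exists that did not exist before.
module MikeIsGreat
end

puts defined?(MikeIsGreat) # prints "constant"
```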
00:02:22.879
Here’s the same extension written in C. It’s a little more complicated, but it’s manageable. You can imagine analogous C code for any Ruby statement that you might write.
00:02:34.879
MRI can't read that C code directly, though. We have to first get it into a form where it can be required like any other Ruby file.
00:02:46.640
This is where it can get challenging. We have to first compile the C code into an object file, and then go through a process known as linking to combine it with Ruby's C library. Only then is it in the correct form to be required, which is what we call a shared object.
00:03:04.800
Lucky for you, I have some working code handy to actually demonstrate all of this.
00:03:10.640
The gem I'm going to show you is called 'isolated' because it is completely self-contained. It doesn't call any third-party libraries, so this is just to provide a basic foundational sense of what goes into a C extension.
00:03:23.599
You might choose to write a C extension like this that is isolated and self-contained as a performance optimization if you have CPU-intensive work to do. Bcrypt is a great example of this; it iterates over cryptographic math, which is just faster in C than in Ruby.
00:03:34.640
A side note: with the advent of better just-in-time compilers, that might not be true for much longer, so be sure to benchmark before you make that decision.
00:03:47.200
The isolated gem contains a little bit of C code to perform that CPU-intensive work; here I've labeled it as a TODO, like any good developer. This is just like the 'Mike is great' example.
00:04:04.080
It defines a singleton method called 'do something.' Because it's a small extension, it compiles and installs really quickly, usually in under a second.
00:04:09.920
When we peek at the directory where the gem has been installed, we see a file named 'isolated.so.' That is the C extension; it's the compiled C code, and it can be required like any other Ruby file.
00:04:16.320
Just like this: we do a 'require_relative,' and that loads that shared object file.
00:04:22.720
There's a lot going on, so let's go through it slowly one more time.
00:04:28.560
The isolated code is checked out locally. The gem specification file tells us where in the directory structure we can find the extension directory.
00:04:40.720
That extension directory contains all of the C code as well as the extension configuration file, which is conventionally named 'extconf.rb.'
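As a rough illustration of how a gem specification points at its extension directory, here is a sketch. The gem name, version, author, and file paths are assumptions for illustration and are not taken from the talk's repository; the `extensions` attribute is the part that matters.

```ruby
require "rubygems"

# A gemspec sketch; name, version, and paths are illustrative assumptions.
spec = Gem::Specification.new do |s|
  s.name    = "isolated"
  s.version = "0.1.0"
  s.summary = "Example of a self-contained C extension"
  s.authors = ["Example Author"]
  s.files   = ["lib/isolated.rb", "ext/isolated/isolated.c", "ext/isolated/extconf.rb"]

  # `gem install` runs each script listed here to configure and build
  # the native code as part of installation.
  s.extensions = ["ext/isolated/extconf.rb"]
end

puts spec.extensions.inspect
```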
00:04:47.840
This 'extconf.rb' is a Ruby script that has one job: to write a precise recipe for compiling and linking the extension. In the case of the isolated gem, this 'extconf.rb' file is as simple as it gets.
00:05:00.400
Mkmf is short for 'Make Makefile.' It's a Ruby module that ships with the standard library. It defines 'create_makefile' as well as a handful of other methods for advanced configuration.
00:05:11.759
The 'create_makefile' method is actually what writes the recipe for compiling and linking. Since we haven't configured anything else, it will adopt some intelligent defaults.
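A minimal 'extconf.rb' of the kind described might look like the sketch below. The target name passed to `create_makefile` is an assumption; running the script requires Ruby's development headers to be installed.

```ruby
# ext/isolated/extconf.rb (sketch; the target name is an assumption)
require "mkmf"

# Writes a Makefile in the current directory that compiles the C files found
# alongside this script and links the results into one shared object.
create_makefile("isolated/isolated")
```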
00:05:22.400
The recipe it generates starts with a line that reads: please compile 'isolated.c' into the object file 'isolated.o' and be sure to include Ruby's header files. If we had more C files, each would be compiled separately into a corresponding object file.
00:05:49.440
The recipe will contain a second step—linking—that combines all the object files with Ruby's C libraries to create that shared object.
00:05:55.919
This step reads: please create a shared object by combining all of our object files with Ruby's library and the system's C and math libraries.
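To make the two steps concrete, the sketch below assembles approximate compile and link commands from `RbConfig`. It is not the literal content of the generated Makefile, which carries many more flags, and it assumes a GCC-style toolchain; the file name 'isolated.c' is taken from the example above.

```ruby
require "rbconfig"

cfg = RbConfig::CONFIG

# Step 1: compile one C file into an object file, pointing the compiler
# at Ruby's header directories.
compile = [
  cfg["CC"], "-fPIC",
  "-I#{cfg["rubyhdrdir"]}", "-I#{cfg["rubyarchhdrdir"]}",
  "-c", "isolated.c", "-o", "isolated.o"
].join(" ")

# Step 2: link the object file(s) into a shared object, together with
# Ruby's library and the system libraries Ruby itself was built against.
link = [
  cfg["LDSHARED"], "-o", "isolated.#{cfg["DLEXT"]}",
  "isolated.o", cfg["LIBRUBYARG"], cfg["LIBS"]
].join(" ")

puts compile
puts link
```

The printed commands differ from machine to machine, which is the point: the same high-level description expands into platform-specific and Ruby-specific details.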
00:06:07.919
When we actually run this, we get a Makefile that gets created. It is big, long, and ugly, but it is very precise. 'Create_makefile' has done a lot of work for us.
00:06:13.520
It has turned that general high-level description into something specific to my platform, Linux x86_64, and to Ruby 3.0.
00:06:25.120
So the 'extconf.rb' is generic, but the Makefile will only work on Linux x86_64 running Ruby 3.0.
00:06:37.039
Finally, at the end of the journey, if I run 'make,' I generate this 'isolated.so' file, which we can observe, and we can see that it is a shared object.
00:06:48.400
If I type fast enough, we can require it like any other file and call methods on the classes that get loaded.
00:06:53.520
Great success! End to end. If you actually run 'gem install,' you'll see it get installed; it installs in under a second.
00:07:00.319
Here’s a summary in diagram form of what just happened: you run 'gem install.' It downloads the gem file from rubygems.org, extracts the gem specification, looks at the specification to find where the 'extconf.rb' file is, and runs that script to generate a Makefile, which will compile and link into your shared object. It's just that easy.
00:07:14.880
However, a lot can go wrong because that's a complicated series of steps.
00:07:21.360
The most common problem is that people don't have a compiler toolchain installed. You need to have 'make,' a compiler, a linker, system header files, and system libraries—all must be installed in the right place and discoverable.
00:07:40.800
Setting that up is such a common problem that Nokogiri has a separate subsection in its documentation for getting that installed.
00:07:46.000
A lot of Linux distros will offer a meta package like build-essential to help you do that. On Windows, Ruby is very lucky to have Lars Kanis, and before him Luis Lavena, maintaining the RubyInstaller DevKit.
00:08:02.479
It ships a version of Ruby along with the entire integrated toolchain for Windows. It's awesome! On macOS, you've got the Xcode command line tools, which have their own installation problems, but I typically use Linux.
00:08:13.840
Secondly, you'll need to have a working Ruby development environment, which includes Ruby's headers and libraries. This can be a real hurdle for Ruby newbies.
00:08:31.200
I'm sure you've all seen this error message at some point; it's a little hard to understand. You need to install the Ruby libraries.
00:08:42.640
Third, you need a consistent environment. Once you build that shared object, your C extension can only run on that architecture with that minor version of Ruby and those system libraries. You can't copy your gems from your Mac to your Linux production machine, and you can't upgrade Ruby and still use the gems that were previously installed.
00:09:05.120
So those things can also go wrong with C extensions. However, C extensions solve a lot of real problems too.
00:09:10.480
I mentioned earlier that performance optimizations like bcrypt are a great opportunity to use C extensions, but there's another more common reason: to integrate with a third-party library.
00:09:28.560
The reason I'm giving this presentation is that I've spent a lot of time over the last few years working with third-party libraries and calling them from Ruby. I've been the primary maintainer of Nokogiri for a decade.
00:09:40.080
If you're not familiar, Nokogiri is a Ruby library for parsing and manipulating XML and HTML. The formal specifications for XML and HTML are huge; it would be a monumental task to try to re-implement all of this in Ruby from scratch.
00:09:52.640
And if you don't know this about me, I'll tell you: I'm pretty lazy. So instead, Nokogiri uses existing open-source libraries, libxml2 and libxslt, and it interacts with them through the C extension.
00:10:06.160
A lot of Ruby gems do this. Here's a list; you can probably think of one or two others.
00:10:12.240
Most of these gems are just a thin-ish layer of Ruby and C that work together to make the library's features available as idiomatic Ruby.
00:10:25.120
There are other ways to call third-party libraries in Ruby. There are a dozen ways to do everything, and this is no exception.
00:10:36.880
One popular method is Ruby's Foreign Function Interface, or FFI. This allows Ruby code to call C libraries directly without having to write any C code.
00:10:48.000
It's a really powerful tool, but choosing when to use it is a topic for another talk, so I won't dig into it today.
00:10:56.000
If you're on JRuby, you could write an analogous Java extension that works very similarly. Again, that's a topic for another talk, but it's worth noting that this is how Nokogiri works on JRuby—we have a Java extension.
00:11:05.280
Finally, let's get to actually accessing an external library. For the next few minutes, I will show you a series of gems that escalate in complexity, and they each build on the previous one.
00:11:17.120
So we'll get there pretty quickly in small steps. The code I'm presenting here is in a GitHub project, and I have the Bitly link and the long form if you care.
00:11:28.000
Strategy number one: gems like sqlite3 and rmagick follow a strategy that requires the library to already be installed on the system by the user before the gem is installed.
00:11:36.000
That’s why it’s called the system strategy—it has to be on the system already.
00:11:43.120
I've created a gem to demonstrate how to do this; all these examples will use libyaml.
00:11:49.360
If we look at the C code for our system gem, you can see we're calling the function 'yaml_get_version' and returning the result as a string.
00:11:56.560
It's a really simple integration, but it proves that it works end-to-end.
00:12:03.680
Looking at the 'extconf.rb,' it's very similar to the isolated gem except it has a new stanza where we call 'find_header' and 'find_library,' which do exactly what you think.
00:12:11.360
They go out into the system and try to find where libyaml might already be installed.
00:12:18.000
If it finds them, it ensures that the compile and link steps pull in the header and the library. Thus, the recipe it generates will contain the libyaml include directory in the compile step and the library directory in the link phase.
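A sketch of what that stanza could look like with mkmf's `find_header` and `find_library`. The target name and the error messages are assumptions; the header name 'yaml.h' is libyaml's public header, and `yaml_get_version` is the function mentioned above. Running it requires Ruby's development headers, and it aborts when libyaml is not discoverable.

```ruby
# ext/system/extconf.rb (sketch; the target name is an assumption)
require "mkmf"

# Look for libyaml's header on the compiler's search path.
abort("yaml.h not found: install libyaml's development package") unless find_header("yaml.h")

# Look for the library by trying to link a tiny program that
# references yaml_get_version from -lyaml.
abort("libyaml not found or not linkable") unless find_library("yaml", "yaml_get_version")

create_makefile("system/system")
```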
00:12:30.399
Let’s actually go ahead and try to build this.
00:12:43.120
Okay, my bad, I don’t have libyaml installed. I'm sure this has happened to all of you too.
00:12:50.000
So let’s imagine I spend five minutes Googling to figure out how to install it on my system, and I’ll try that again.
00:13:02.800
This time I run 'extconf.rb', and it generates the Makefile. I can run 'make' and it generates the shared object file. I can even see that it is calling out to libyaml on my system.
00:13:14.880
So this shows that as long as libyaml is on the system and discoverable, we can compile and link against it. Great! Great success.
00:13:23.200
If you actually go and run 'gem install' for our system gem, you’ll see it installs. It’s a little bit slower than the isolated gem because it's actually doing more work.
00:13:30.640
It's still pretty fast, about a second and a half—not too bad.
00:13:40.240
However, there are things that can break when you're using the system strategy, which are probably obvious. One thing is the user has to figure out how to install that library, as we saw earlier.
00:13:56.400
Another issue is that the library could be in a non-standard directory, an old version, or could be a version that's too new for the gem to handle.
00:14:03.920
Additionally, compile-time feature flags may not be set. As a gem maintainer, you might be okay with these trade-offs.
00:14:13.280
In that case, you’ll improve your documentation to explain how to install everything, add edge cases to your 'extconf.rb' for non-standard directories, and handle flags being flipped on and off.
00:14:27.768
This can get really challenging in the real world; if you take a look at the 'extconf.rb' for sqlite3 or rmagick, you’ll see how messy this can get.
00:14:40.520
Nokogiri relied on the system strategy for about five years, until 2013, when a nasty bug in XPath in libxml2 broke everybody's CSS queries.
00:14:52.720
Homebrew immediately upgraded the entire universe to this buggy version of XPath and libxml, which made fixing things pretty messy.
00:15:05.680
The alternative was to include a known good version of libxml inside Nokogiri, which brings us to the next strategy.
00:15:19.560
The easiest way to package up external libraries is to drop source files into the extension. I'm not going to explain that in detail today.
00:15:33.040
If you’re interested, please check out my RubyKaigi talk or the GitHub repo instead. I'm going to focus on packaging a tarball, specifically an upstream tarball.
00:15:42.240
A lot of third-party libraries will have their own build process based on tools like autoconf or cmake, and for those libraries, we just take the tarball and drop it in, doing the configure and compilation cycle at gem installation time.
00:15:56.800
The working example I have is called 'package tarball.' We can see there’s a new directory called 'ports' that contains the upstream tarball for libyaml.
00:16:05.120
We use MiniPortile to deal with the tarball; this is a gem that takes the tarball of source code, runs the configure script, generates the shared object file, and then modifies the Makefile.
00:16:15.040
It's a really fabulous piece of software. The new bit in 'extconf.rb' is this MiniPortile block.
00:16:27.600
The code is quite busy, but I will break it down into smaller pieces. We declare where our libyaml tarballs are, identifying them for downloading if they're not cached.
00:16:42.000
We can also verify the checksum of the tarball. This line configures the build system, compiles libyaml, and links it into a shared object.
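The checksum step can be illustrated with the standard library alone. This is only a sketch of the idea, not MiniPortile's implementation; the helper name `checksum_ok?` and the stand-in file contents are made up for the example.

```ruby
require "digest/sha2"
require "tempfile"

# Returns true when the file's SHA-256 digest matches the expected hex string.
def checksum_ok?(path, expected_sha256)
  Digest::SHA256.file(path).hexdigest == expected_sha256
end

# Demonstration with a stand-in file instead of a real libyaml tarball.
Tempfile.create("fake-tarball") do |f|
  contents = "pretend this is an upstream tarball"
  f.write(contents)
  f.flush

  expected = Digest::SHA256.hexdigest(contents)
  puts checksum_ok?(f.path, expected)  # digest recorded for this exact content
  puts checksum_ok?(f.path, "0" * 64)  # digest that does not belong to this file
end
```

Pinning the expected digest in the gem means a tarball that was corrupted or swapped in transit fails the build instead of being compiled.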
00:16:54.400
These two lines will configure the Makefile. One advantage of using an autoconf package is that you can use pkg-config to set all of your compiler and linker flags.
00:17:02.240
This was a huge timesaver for me. When I actually run 'extconf.rb,' you'll notice that all the libyaml stuff happens before the Makefile is created.
00:17:10.560
So the compile and link steps in the Makefile know where to find the libyaml that was just generated.
00:17:23.600
A good thing about this strategy is that it's a known good version of the library. I can eliminate many edge cases from my code.
00:17:38.399
However, the downside is that I will have to be responsible for security updates to that library. You'll notice a lot of Nokogiri updates often end up being new versions of libxml, with a note to upgrade.
00:17:50.880
This is part of the responsibility I take on, but I believe it’s a better user experience.
00:18:05.120
The installation time, however, can be problematic. This method can take up to 13 seconds to install, as compared to the system strategy's installation time of about a second and a half.
00:18:16.800
User satisfaction is significantly impacted by installation time. Nokogiri has been named one of the top five frustrating gems in the universe on multiple occasions.
00:18:25.840
It’s really not hard to understand why; if you watch this paint-drying video of Nokogiri installing, you'll see where a lot of the negativity comes from.
00:18:37.760
I won’t wait for the entire installation to finish; it takes 45 seconds, and that’s with all eight cores on my laptop.
00:18:44.320
The next step of Nokogiri’s evolution was specifically to avoid that pain—by using pre-compiled libraries.
00:18:58.880
Nokogiri 1.11, which shipped earlier this year, was the first version that offered pre-compiled libraries for most major platforms.
00:19:05.520
The advantage is that I'm doing all the compilations, so when you install a gem, it’s just a quick file copy for you.
00:19:19.760
This gets complicated, but an amazing tool that helps with this is rake-compiler-dock. Essentially, this is a Docker environment that runs rake-compiler.
00:19:35.520
I can cross-compile things—running on my Linux machine, I can compile for Windows and Macs.
00:19:45.760
Once you assume that the cross-compilation happens reliably, the remaining problems boil down to figuring out how to package it all and how to ensure it works.
00:19:58.480
I have an example gem using 'package tarball' as a base with a couple of extra files for testing.
00:20:06.240
We make a few changes to 'package tarball' to accommodate cross-compiling. The first change involves the rake extension task, which now looks a bit bigger.
00:20:21.440
We declare what versions of Ruby and which platforms we care about, and set an environment variable to let rake-compiler know we're coming.
00:20:34.080
We signal rake-compiler to turn on cross-compilation and create additional Rake tasks.
00:20:41.520
We indicate to our extconf that this is a cross-compilation. It doesn’t know unless you tell it; otherwise, it thinks it’s just doing a native build.
00:20:54.720
Finally, this block only gets run during cross-compilation, and we modify the gemspec to remove 'mini_portile' as a dependency.
00:21:03.760
There’s no need to ship the tarball to users anymore since we’re handling all pre-compilation.
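The gemspec change can be sketched in isolation. In a real Rakefile this logic would sit inside the cross-compilation hook; here it is a plain method so it can run on its own. The gem name, file paths, and the helper name `strip_build_only_parts` are assumptions for illustration.

```ruby
require "rubygems"

# Stand-in for the gemspec that gets handed to the cross-compilation hook.
spec = Gem::Specification.new do |s|
  s.name    = "precompiled"
  s.version = "0.1.0"
  s.summary = "Example gem"
  s.authors = ["Example Author"]
  s.files   = ["lib/precompiled.rb", "ports/archives/yaml-x.y.z.tar.gz"]
  s.add_dependency "mini_portile2"
end

# For a native (precompiled) gem the binary is already built, so the
# build-time helper and the source tarball no longer need to ship.
def strip_build_only_parts(spec)
  spec.dependencies.reject! { |dep| dep.name == "mini_portile2" }
  spec.files.reject! { |path| path.start_with?("ports/") }
  spec
end

strip_build_only_parts(spec)
puts spec.dependencies.map(&:name).inspect
puts spec.files.inspect
```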
00:21:15.680
Now we have some ugly Rake tasks. The first runs on the Docker host, and all it does is trampoline into the Docker container and runs the bottom Rake task.
00:21:29.680
It builds the extension and packages the gem according to the updated gemspec.
00:21:37.760
The top Rake task runs the second Rake task in the Docker container to handle cross-compilation.
00:21:49.520
The only changes we made to the 'extconf.rb' file ensure we use the cross-compiling compiler instead of the regular compiler.
00:21:58.399
We also check that we're in cross-compilation mode so that we can flip some flags later.
00:22:06.080
The options we have to set for cross-compiling include '-fPIC,' which allows us to mix static and shared libraries.
00:22:12.800
The second option is specific to libyaml, as normally, its build system runs tests.
00:22:26.720
If I'm in a Docker container building Mac binaries, I can compile them, but I can't run them, which means I have to turn off the tests.
00:22:39.760
Lastly, we change the way the extension gets required. In the directory structure for the package gem, we have four C extensions.
00:22:45.760
These correspond to each minor version of ruby we support. Remember, a C extension is specific to architecture and Ruby version.
00:22:51.839
If we're running Ruby 3.0, we need to use the precompiled .so in the 3.0 directory.
00:23:03.040
Therefore, we expand our require into a longer statement that first looks for the precompiled version matching our Ruby version.
00:23:09.280
If it can’t find it, it will fall back to what it assumes is the version that was compiled during installation.
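A sketch of that lookup order, using a hypothetical gem named 'precompiled'. The loader is passed in as an argument so the logic can be exercised without a real shared object on disk; in a real gem this would typically be a plain `begin`/`rescue LoadError` around two `require` calls.

```ruby
# Hypothetical gem name "precompiled"; paths are relative to the gem's lib dir.
def extension_candidates(ruby_version = RUBY_VERSION)
  minor = ruby_version[/\A\d+\.\d+/] # "3.0.2" -> "3.0"
  [
    "precompiled/#{minor}/precompiled", # shipped inside a native gem
    "precompiled/precompiled"           # built by extconf.rb at install time
  ]
end

# Try each candidate in order and return the path that loaded.
def load_extension(loader = method(:require))
  extension_candidates.each do |path|
    begin
      loader.call(path)
      return path
    rescue LoadError
      next
    end
  end
  raise LoadError, "no compiled extension found"
end

p extension_candidates("3.0.2")
# => ["precompiled/3.0/precompiled", "precompiled/precompiled"]
```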
00:23:20.320
That’s a pretty big set of changes, but it’s not hard to understand, and that's all it took for us to change the compilation process.
00:23:26.720
It's also important to note that we still have to ship the vanilla Ruby version of the gem for anyone not on a supported platform.
00:23:37.040
So FreeBSD users, you'll essentially fall back to the packaged tarball experience, which means you'll have an installation time of about 13 seconds.
00:23:44.599
This isn’t a perfect strategy. I mentioned earlier that compiled C extensions are specific to the version of Ruby, the machine architecture, and the system libraries.
00:24:01.200
This takes care of the first two, but the system libraries have a big problem regarding glibc versus musl libc, which is used in Alpine. These are not compatible.
00:24:19.680
We often run into issues, particularly with gems like Sass not being pre-compiled anymore due to this conflict.
00:24:29.920
The mitigation for this involves testing all the time; the GitHub repo I mentioned for the example gems comes with demo pipelines.
00:24:38.080
These will use scripts to build the gems and then attempt to install them on target platforms, verifying that all tests pass.
00:24:49.760
Unfortunately, this testing doesn’t cover musl libc today, as the GitHub Actions runners don't run on Alpine.
00:25:05.200
Most gems shouldn’t take this approach. During last year's RubyKaigi, Sutou Kouhei, who maintains rake-compiler, encouraged everyone to stop doing this.
00:25:13.920
He made several valid points.
00:25:20.400
One point he raised is that users can't use the latest Ruby; this has since been fixed, as Bundler now copes with a Ruby that is newer than the precompiled gem supports.
00:25:26.960
It will compile the vanilla Ruby version instead of attempting to install a pre-compiled version that doesn't work.
00:25:36.080
He also mentioned that upstream patches to third-party libraries are delayed; that's on me as the maintainer, and I’m willing to accept that.
00:25:47.999
Another point was that users can't control the library version they use; as the maintainer, I am willing to claim I know better.
00:25:53.120
High maintenance costs are indeed present; however, it's a one-time cost to set up testing and configuration, which is cheaper than providing user support.
00:26:09.680
Finally, optimizations aren't enabled for specific platforms, and that's a truly valid point.
00:26:21.920
Some GCCs will perform significant optimizations for specific architectures, and we won't achieve that with the pre-compiled version of Nokogiri.
00:26:35.680
But here's why it's a good thing that we precompile Nokogiri: for me, it's good because there are fewer support issues.
00:26:45.120
I've gone from experiencing one support issue a week to about one a month. That's a significant drop.
00:26:56.000
For users, it's beneficial because they are struggling less. Page hits on the installation documents are down by 30%.
00:27:06.800
Meanwhile, page hits on the rest of the Nokogiri.org site are up by 20%. Fewer angry complaints lead to a better experience.
00:27:18.320
In the context of the larger universe, this is beneficial as Nokogiri is now at web scale. Since January, pre-compiled versions of Nokogiri have been downloaded 60 million times.
00:27:32.240
If we do a rough calculation of the power saved, that's like half the power of my plane flight here.
00:27:44.000
I thought it would be a larger number, but we can proceed to an even more interesting calculation.
00:27:56.160
If you take 82 seconds, which represents installation time if you use a single core, and multiply that by 60 million, even if only 10% of those are humans waiting for the installation to finish, it amounts to 13 million dollars.
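The arithmetic can be checked directly. The hourly cost is not stated in the talk, so the sketch below works backwards from the quoted total to the rate it implies; that derived rate is an inference, not a figure from the speaker.

```ruby
seconds_per_install = 82            # single-core install time, from the talk
downloads           = 60_000_000    # precompiled downloads, from the talk
human_fraction      = 0.10          # share with a person waiting, from the talk
claimed_savings_usd = 13_000_000    # total quoted in the talk

hours_waited = seconds_per_install * downloads * human_fraction / 3600.0
implied_rate = claimed_savings_usd / hours_waited

puts format("%.0f hours of waiting avoided", hours_waited)
puts format("implied cost of about $%.0f per hour", implied_rate)
```

With these inputs the waiting time comes to roughly 137,000 hours, so the 13 million dollar figure corresponds to a cost of about 95 dollars per hour of developer time.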
00:28:06.560
That's a nice gift to Ruby companies, I guess!
00:28:17.440
Now before I wrap up, I want to speak briefly about trust. How do you know what’s in the box?
00:28:27.920
Pre-compiled gems like Nokogiri give you an opaque library file that’s simply copied onto your production system during installation. How can you know what's in it?
00:28:39.600
The short answer is: you don’t. You could decompile and analyze it, but it's not feasible for the average user.
00:28:50.560
Determining whether you trust a gem boils down to trusting its source, the author, and the entire chain of custody that delivered the gem to you.
00:29:01.440
You might ask: why trust this guy? You don’t know if I follow basic security hygiene—think about it; I could have gambling debts or other vulnerabilities.
00:29:15.520
You have to assess whether someone might try to inject malicious code into your supply chain.
00:29:24.800
Fortunately, I am a good guy, so you don't have to worry about it.
00:29:31.120
Trusting the supply chain means asking questions like: Have all maintainers enabled MFA? Was the gem signed? Did you check the signature? Have you verified checksums?
00:29:44.800
For Nokogiri, we do our best to make all that information available, but I bet most of you haven't checked any of those things.
00:29:56.960
I started a discussion asking people what would help them trust Nokogiri more, and I received three comments. One of them was from Aaron, which doesn't count.
00:30:04.240
I think you're all too trusting, and I'd encourage you to be a little more cautious.
00:30:12.080
If you're a gem author, enable MFA on your RubyGems account.
00:30:19.280
A new feature was just released that allows a gem to require MFA for all future versions, and Nokogiri has opted in.
00:30:29.840
Gem signing: my team at Shopify is working with the RubyGems team to improve gem signing and we've developed a proof of concept.
00:30:36.960
If you find me, I can talk with you about this endlessly, and I’ll introduce you to my team.
00:30:44.080
Coming soon: rake-compiler-dock will support ARM64 on Linux this winter. If you’re using Graviton on AWS, this is important.
00:30:59.080
We will refactor rake-compiler-dock to work better in CI/CD pipelines, and hopefully, we’ll deliver Ruby 3.1 support early next year.
00:31:07.200
I aim to work harder on glibc versus musl libc compatibility, as this is a significant problem.
00:31:15.200
If we could ensure all pre-compiled gems work for both libraries, that would be great for the ecosystem.
00:31:23.320
A big thank you to Lars, Luis, and Kouhei.
00:31:30.800
That's my talk; thank you all for listening to me ramble about C extensions. I appreciate it.