Building Ruby Head for your Rails App

by Maple Ong

In the video titled 'Building Ruby Head for your Rails App,' presented by Maple Ong at RailsConf 2023, the importance of maintaining the latest Ruby version for Rails applications is explored. Ruby head, representing the latest commit on Ruby's GitHub main branch, is highlighted as a useful tool for developers who wish to stay ahead of potential issues caused by version obsolescence.

Key points discussed include:

- Importance of Upgrading Ruby: Regular updates ensure access to new features, bug fixes, security patches, and performance improvements. Running outdated versions poses risks as dependencies and community support dwindle over time.

- Gusto's Experience: The speaker shares Gusto's journey upgrading from Ruby 2.7 to 3.1, detailing challenges such as database timeouts during tests, which stemmed from a deadlock in Active Record's connection pool triggered by a Monitor change in newer Ruby versions.

- Ruby Head Implementation: Running Ruby head on application tests via a CI/CD setup allows developers to address issues as they arise and to adapt gradually rather than undertaking large, disruptive upgrade projects. Employing two parallel pipelines (one for production and one for Ruby head) is recommended.

- Benefits of Gradual Upgrades: Incremental upgrading allows for reduced debugging complexity, early addressing of deprecations, and continuous maintenance which distributes workload more effectively across the development team.

- Setting Up Ruby Head with Docker: The video details how to build Ruby head locally and within Docker, alongside integrating it into CI systems. Using Docker offers consistency across different environments and simplifies management of dependencies.

- jemalloc Overview: The talk also briefly covers the jemalloc memory allocator, its benefits, and how to use it to improve performance in Ruby applications, especially memory-intensive, multi-threaded ones.

The key takeaway from the talk is that actively maintaining and upgrading to Ruby head not only keeps Ruby applications robust but also enhances developer productivity and reduces risk in the long term. By adopting these practices, developers can participate in the ongoing evolution of Ruby and contribute to its community improvements.

00:00:21.260 I work with Gusto. Gusto is a people platform that provides payroll, benefits, and HR solutions in a single platform for thousands of businesses in the United States. I've been here for six months. I received an email just today.

00:00:32.640 I'm part of the backend infrastructure team, and some of us are actually here today. Thanks for coming out, team. My team built the background job system and handles other work, such as ensuring the health of the CI pipeline and upgrading Ruby and Rails. I really enjoy this work, so if you're interested, please come and talk to me.

00:00:46.020 With that being said, I would like to assert that we should all strive to be on the latest Ruby versions, and here's why: Ruby keeps improving, thanks to the Ruby core team, and we benefit from important updates like bug fixes, security patches, performance improvements, and new Ruby features in both major and minor releases. I believe it is to our advantage to be on the latest Ruby version.

00:01:12.180 This is a list of the current stable releases of Ruby: we have Ruby 3.2.2, 3.1.4, 3.0.6, and 2.7.8. However, if you've been following along, you would know that Ruby 2.7 has reached its end of life as of March 31st of this year.

00:01:30.119 Thus, version 2.7.8 will be the last release for Ruby 2.7. Next year, Ruby 3.0 will also reach its end of life and will no longer be supported. This means that every single year, a team or individual will have to upgrade Ruby for your Rails application to ensure that you're off the deprecated version or at least on one of the supported Ruby versions.

00:01:54.299 Let me share with you how we upgraded Gusto from Ruby 2.7 to 3.1, including some of the blockers that we encountered. Gusto's main Rails monolith ran on Ruby 2.7 last year, and we knew Ruby 2.7 would reach its end of life in March this year. Therefore, we worked on a project to upgrade Ruby.

00:02:13.800 As Nan points out here, it took us three to four months to complete this upgrade, including my contribution, as I joined a bit later. We successfully upgraded two of our largest Rails applications at Gusto. Let me share with you one blocker we faced during this project.

00:02:31.140 We experienced database timeouts on various tests in our pull request to introduce Ruby 3.1. The database timeouts were caused by an issue within the Active Record connection pool. We investigated this issue and eventually found a minimum reproducible example of the problem. When performing database queries inside an enumerator, such as through a transaction, the program would hang, leading to test timeouts in CI.

00:03:01.980 To address this issue, we needed to understand why this behavior was only apparent in Ruby 3.1 and not in Ruby 2.7, which we were currently running. The first step was to identify the Ruby version that introduced this behavior, so we also ran the tests on Ruby 3.0 and various patch versions like 3.0.1 and 3.0.2. Ultimately, we determined that the deadlock did not occur in Ruby versions up to and including 3.0.1, while deadlocks did occur in 3.0.2 and above.

00:03:39.239 The next step was to track down the commit that caused the issue. We knew that the change was introduced between 3.0.1 and 3.0.2, so we reviewed the list of commits. Since we understood some parts of the enumerator, we recognized that it is implemented using fibers in Ruby. We found a particularly suspicious-looking commit in the list of backports into the 3.0.2 release.

00:04:10.680 We traced this issue back to a change in Ruby made approximately two years ago. Ultimately, we understood the reason for the deadlock. As Nan points out in his tweet, Ruby's Monitor is Fiber-based as of 3.0, and since an enumerator runs its block in its own Fiber, Active Record queries made inside an enumerator within a database transaction can deadlock.

00:04:42.540 Figuring out the issue was a significant breakthrough for us. We collaborated with Jean Boussier from Shopify and realized that an issue had already been opened for this problem on Rails. We shared our experience with the bug, which opened discussions with others. Jean and I paired on a solution, leading to an approved pull request.

00:05:40.919 Our team at Gusto was able to backport the changes into our Rails application, thanks to Nan, who diligently documented this entire issue. On December 1st, we successfully backported it into our Rails application. It required extensive investigation into the Rails and Ruby source code, alongside collaborative efforts within Gusto and with others outside Gusto. Ultimately, we were able to upgrade from 2.7 to 3.1, taking around one month from the moment we identified the root cause.

00:06:11.100 Now, consider what would happen if we were running the latest version of Ruby and encountered the deadlock issue as soon as it was introduced. This situation could have been avoided, and that’s what I'm trying to convey.

00:06:31.199 We will also talk about how we would run Ruby head and why we should all do it. So, what is Ruby head, exactly? Ruby head is the development version of Ruby; "head" refers to the latest commit on the main branch on GitHub. It's also known as Ruby trunk or Ruby edge and contains changes that are not yet available in the stable version of Ruby.

00:07:12.720 This is why we opt not to run it in production but rather execute it in our application tests and CI pipeline. This means we can run two parallel CI pipelines for our Rails application: one for the production Ruby and another for Ruby head. This approach allows us to upgrade Ruby progressively, minimizing potential impacts on our application and letting us see how tests behave.

00:07:54.600 Once a new Ruby version is released, updating the Ruby version file is all that's necessary. Now let's discuss upgrading Ruby gradually versus all at once. Gradual upgrades involve addressing build issues as they arise over time, whereas an all-at-once upgrade is a concentrated effort involving potentially multiple engineers working on a project.

00:08:51.840 While the net sum of work may appear similar, I argue that upgrading Ruby gradually is more efficient. Let's explore the benefits of gradual Ruby upgrades. The first benefit is that we spend less time debugging. Whenever a failure occurs in the Ruby head pipeline, we have a smaller surface area to investigate: a handful of recent commits rather than all the commits between two Ruby versions.

00:09:24.780 Another advantage is that it provides us with a list of non-urgent tasks. For instance, we can identify gems that will be incompatible in the latest version or take on new language features, such as keyword argument changes introduced in Ruby 3.0. It allows us to compile a list of tasks over the year, improving our code.

00:10:15.540 We also reduce the risk of encountering major upgrade issues. For instance, we didn’t foresee the issues that I previously described, but if you have visibility into the changes made to Ruby in real time, you lessen the risks of those changes becoming significant blockers in your project during the upgrade process.

00:10:56.880 As for those of you who are not at Shopify or who may feel resource-strapped, it is a better business investment. Once you set up the CI pipeline for Ruby head, you can maintain it throughout the year while being confident in your Ruby upgrades for production once the new version is released. Gradual upgrading would also mean that you won’t have to continually seek leadership buy-in for specific projects every year or when a new Ruby version comes out.

00:11:40.680 Finally, this approach places less burden on a team or an individual. Gradual Ruby upgrades can be perceived as maintenance work, allowing the effort to be distributed over time among engineers. This way, individuals can take ownership of specific issues and learn from these experiences, reducing pressure on any single team.

00:12:12.660 Now, let's discuss some disadvantages of gradual Ruby upgrades. The first one is the upfront work and resources required to set up the CI pipeline to run Ruby head. Additionally, the increase in maintenance effort must be considered, as someone will have to maintain and understand the pipeline.

00:12:28.440 Lastly, the additional pipeline in the CI system will incur extra costs. Personally, I might not remember specific technical details after attending conferences, but I do remember concepts and ideas.

00:13:05.160 If there's one key takeaway from this talk, it’s the importance of feeling comfortable running Ruby head with your Rails application tests. It doesn't have to be Ruby head; it can be the next Ruby version you're using.

00:13:48.420 Let me discuss how we can build Ruby head both locally and using a Docker image. We will create a CI schedule for the Ruby head image and run a toy Rails application test on it. A quick disclaimer: there are many ways to build Ruby head, and this is just one method. I chatted with colleagues from various companies, including Gusto, and we each follow slightly different methods.

00:14:28.560 So, let's consider how Ruby head can fit into your Rails application and CI system. To build Ruby head locally, the first step is to compile it from source, which involves generating the configuration script, compiling Ruby, and installing it.

00:15:04.800 It’s important to differentiate between compiling and installing Ruby. Compiling Ruby means converting the Ruby source code into machine-readable code, requiring configuration of build settings and generating necessary binaries. On the other hand, installing Ruby involves placing the compiled code into the appropriate directories on your local machine for use by applications.

00:15:41.580 If you check the Ruby website under the installation section for building from source, you’ll find three commands: configure, make, and sudo make install. The configure command generates a configuration script for the Ruby build, where you specify build options to enable or disable certain features.

00:16:22.200 The make command compiles the source code into binary executables and libraries, while sudo make install takes the compiled Ruby binaries and installs them into the system, requiring sudo privileges because the default path is not user-writable.
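The three steps above can be sketched as a small shell function. This is a minimal sketch, not the speaker's script: the function name and the default install prefix are hypothetical, and it assumes a git checkout of ruby/ruby, where `./autogen.sh` has to generate `configure` first. Installing under the home directory sidesteps the `sudo` requirement mentioned above.

```shell
#!/usr/bin/env bash
# Sketch: build Ruby from a git checkout of ruby/ruby.
# The function name and default prefix are hypothetical.
build_ruby_from_source() {
  local src_dir="$1"
  # A prefix under $HOME is user-writable, so no sudo is needed for `make install`.
  local prefix="${2:-$HOME/.rubies/ruby-head}"

  (
    cd "$src_dir" || exit 1
    # Generate the configure script (only needed for git checkouts).
    ./autogen.sh || exit 1
    # Choose build options and the install location.
    ./configure --prefix="$prefix" || exit 1
    # Compile the interpreter and its libraries.
    make || exit 1
    # Copy the compiled files into $prefix.
    make install
  )
}
```

Calling `build_ruby_from_source ~/src/ruby` (a hypothetical checkout path) would leave the new interpreter at `$HOME/.rubies/ruby-head/bin/ruby`.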

00:16:57.600 The Ruby website advises that using third-party tools or package managers might be a better option, so let’s follow that recommendation. Setting up Ruby locally can be accomplished through various installation managers like RVM, rbenv, or ruby-install. RVM stands for Ruby Version Manager, and it differs from rbenv.

00:17:36.960 We are going to use ruby-build to install Ruby head. The command you see will build Ruby head, version 3.3.0-dev, and install it to the desktop. We will use ruby-build directly to compile and install Ruby head in our Docker image for several reasons, primarily for simplicity and configurability. With ruby-build, we can compile Ruby with specific configurations without needing to manage multiple versions.
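A ruby-build invocation takes a definition name and an install prefix. The sketch below wraps that in a function; the function name and the default prefix are hypothetical, and `3.3.0-dev` is the development definition as of the talk, so a newer ruby-build will ship a different `-dev` name.

```shell
#!/usr/bin/env bash
# Sketch: compile and install Ruby head with ruby-build.
# General form: ruby-build <definition> <install-prefix>
install_ruby_head() {
  # "3.3.0-dev" was the development definition at the time of the talk;
  # `ruby-build --definitions` lists the names your copy knows about.
  local definition="${1:-3.3.0-dev}"
  # Hypothetical install location on the desktop.
  local prefix="${2:-$HOME/Desktop/ruby-head}"

  ruby-build "$definition" "$prefix"
}
```

After `install_ruby_head`, running `"$HOME/Desktop/ruby-head/bin/ruby" -v` should report a development version string.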

00:18:46.380 I want to share how ruby-build uses a build definition file, which acts like a DSL for ruby-build. Each method call maps to a function in the ruby-build script, allowing for customized compilation and installation.

00:19:14.700 For example, the 'install_package' method ensures the correct version of OpenSSL is installed. In some cases, we can install Ruby from the master branch on GitHub rather than downloading a release tarball.
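To make the DSL concrete, here is a hedged sketch of a one-line definition that builds from the master branch. The file name `ruby-head.def` is hypothetical, and the build-step names (`autoconf`, `standard`, `verify_openssl`) are recalled from ruby-build's bundled definitions, so compare them with the `-dev` definition shipped in your ruby-build before relying on them. The script only writes the file; nothing is compiled.

```shell
#!/usr/bin/env bash
# Sketch: write a minimal ruby-build definition for Ruby head.
# This is an illustration, not an official ruby-build definition.
cat > ruby-head.def <<'DEFINITION'
# Form: install_git <package-name> <repository-url> <git-ref> <build steps...>
# Official definitions also include an install_package line that pins an
# OpenSSL release and checksum; it is omitted here rather than guessed.
install_git "ruby-master" "https://github.com/ruby/ruby.git" "master" autoconf standard verify_openssl
DEFINITION
```

ruby-build accepts a path to a definition file in place of a definition name, so `ruby-build ./ruby-head.def "$HOME/.rubies/ruby-head"` would use this file.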

00:19:58.560 We should also bear in mind that Ruby installers do a lot of work for us, which is why they’re more convenient than manual installations. They handle both system and Ruby dependencies, and they even download and apply patches if needed. It's also worth mentioning that there are other methods to set your Ruby environment, including using a package manager like Homebrew, yet those typically use pre-compiled binaries.

00:20:49.440 Now let’s get to building Ruby head on Docker. Docker is a tool that packages and runs applications in a consistent manner. We will create a Docker image with Ruby head, allowing us to run our applications using this image. There are existing Docker images for Ruby maintained by the Docker community, supporting various Ruby versions based on OS preferences.

00:21:44.820 However, the official Docker images do not support Ubuntu-based images, which is what we use at Gusto. There was an open PR on the Docker library to introduce Ubuntu support, but the maintainer stated a preference for more lightweight operating systems. That's fine because we can build our own.

00:22:20.160 Here’s a Dockerfile for Ruby head, and I will step through it so we can understand what's happening in the image. The first step is to specify the parent image, which in our case will be an Ubuntu image.

00:23:01.320 Next, we copy the build definition file into the Docker image, which we will use to compile Ruby head. We then set up the environment for gem and bundler, as the image is fresh and lacks defaults.

00:23:31.740 Then we install the required system dependencies before compiling and installing Ruby. In this section, we begin by downloading ruby-build from GitHub and extracting the tarball to install it manually. This is where we use the build definition file to build Ruby head.

00:24:17.880 Afterward, we remove unnecessary files to keep the container clean. Next, we specify the creation of a directory for the gems if it doesn’t already exist. Now let’s look at how the Ruby head build definition file appears.

00:25:01.860 Similar to the previous one, we aim to use a nightly snapshot from ruby-lang.org. Nightly snapshots represent the current state of the development branch. This snapshot is what we'll use to install Ruby head, which we can build and run using Docker.

00:25:46.860 Let’s walk through what this looks like once we have the image built. We use Docker to build this image and start IRB inside it, where the Ruby version is confirmed to be 3.3.0-dev.
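The Dockerfile walkthrough above can be sketched roughly as follows. This is not the Dockerfile shown in the talk: the Ubuntu tag, the package list, the file names, and the choice to fetch ruby-build's master tarball are all assumptions, and it expects a definition file named `ruby-head.def` in the build context. The script only writes the Dockerfile; it does not build anything.

```shell
#!/usr/bin/env bash
# Sketch: write a Dockerfile that builds Ruby head on Ubuntu.
# Image tag, package list, and file names are assumptions.
cat > Dockerfile.ruby-head <<'DOCKERFILE'
# 1. Parent image.
FROM ubuntu:22.04

# 2. Build definition consumed by ruby-build (hypothetical file name).
COPY ruby-head.def /tmp/ruby-head.def

# 3. A fresh image has no gem or Bundler defaults, so set them explicitly.
ENV GEM_HOME=/usr/local/bundle
ENV BUNDLE_APP_CONFIG=$GEM_HOME \
    PATH=$GEM_HOME/bin:$PATH

# 4. System packages for compiling Ruby. Building from a git checkout also
#    needs autoconf and an existing ruby to bootstrap the build.
RUN apt-get update \
 && DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
      autoconf bison build-essential ca-certificates curl git ruby \
      libffi-dev libreadline-dev libssl-dev libyaml-dev zlib1g-dev \
 && rm -rf /var/lib/apt/lists/*

# 5. Download ruby-build from GitHub, install it, build Ruby head, clean up.
RUN curl -fsSL https://github.com/rbenv/ruby-build/archive/refs/heads/master.tar.gz \
      | tar -xz -C /tmp \
 && PREFIX=/usr/local /tmp/ruby-build-master/install.sh \
 && ruby-build /tmp/ruby-head.def /usr/local \
 && rm -rf /tmp/ruby-build-master /tmp/ruby-head.def

# 6. Create the gem directory if it does not exist yet.
RUN mkdir -p "$GEM_HOME"

CMD ["irb"]
DOCKERFILE
```

With that file in place, `docker build -f Dockerfile.ruby-head -t ruby-head .` builds the image, `docker run --rm -it ruby-head` opens IRB, and `docker run --rm ruby-head ruby -v` prints the version. A scheduled CI job could rebuild the image so that it tracks new commits.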

00:26:38.820 Moving forward with time constraints, I won't dive deeply into the topic of Jemalloc, although I'm happy to discuss it if there's time later. How many of you are interested in Jemalloc?

00:27:14.460 Let’s continue. Large-scale Rails applications often consume a significant amount of memory, especially when they're multi-threaded, as is the case if your Rails application uses Puma. It's also worth noting that Ruby is a dynamic language.

00:27:49.620 This means Ruby can dynamically create and manage objects during runtime. However, this flexibility can lead to high memory consumption. This is where Jemalloc comes into play.

00:28:21.480 Jemalloc is a memory allocation library that provides an alternative memory allocator. It is specialized in reducing fragmentation within memory, an issue that occurs when memory is allocated and deallocated in ways that leave behind small gaps.

00:29:04.460 Over time, these gaps can accumulate and lead to memory inefficiencies or, worse, memory allocation failures. Jemalloc uses per-thread caching, meaning that each thread keeps its own cache of memory blocks, which makes allocation more efficient.

00:29:29.100 However, it should be noted that Jemalloc is not a drop-in replacement for your existing memory allocator. It's not a one-size-fits-all solution, which is partly why Ruby does not ship with it by default. To use Jemalloc with Ruby, you can employ one of two methods.

00:30:09.360 The first method involves using a compilation flag. This means, when using the configure command to build Ruby, you specify it with '--with-jemalloc' to tell Ruby to utilize Jemalloc as the default memory allocator.
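A hedged sketch of the compile-time route, using ruby-build's `RUBY_CONFIGURE_OPTS` variable to pass the flag through to `configure`. The function names and default prefix are hypothetical, the jemalloc development headers (for example `libjemalloc-dev` on Debian or Ubuntu) are assumed to be installed, and the check assumes the library shows up under the `MAINLIBS` key of `RbConfig::CONFIG`.

```shell
#!/usr/bin/env bash
# Sketch: compile Ruby against jemalloc, then check the result.
build_ruby_with_jemalloc() {
  local definition="${1:-3.3.0-dev}"
  # Hypothetical install location.
  local prefix="${2:-$HOME/.rubies/ruby-head-jemalloc}"
  # ruby-build forwards RUBY_CONFIGURE_OPTS to ./configure.
  RUBY_CONFIGURE_OPTS="--with-jemalloc" ruby-build "$definition" "$prefix"
}

# Succeeds if the given ruby binary lists jemalloc among its linked libraries.
ruby_links_jemalloc() {
  local ruby_bin="${1:-ruby}"
  "$ruby_bin" -r rbconfig -e 'puts RbConfig::CONFIG["MAINLIBS"]' | grep -q jemalloc
}
```

For example, `ruby_links_jemalloc "$HOME/.rubies/ruby-head-jemalloc/bin/ruby" && echo linked` prints `linked` only when the check passes.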

00:30:51.840 The second method involves preloading the Jemalloc library before executing Ruby. You do this by using an environment variable called LD_PRELOAD, which allows you to override functions or symbols in shared libraries.

00:31:26.640 This means that if you specify the library to be preloaded, Ruby will employ Jemalloc for memory allocation instead of the default from the operating system. Preloading Jemalloc generally yields more benefits in memory management than compiling Ruby with it.
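The preload route needs no recompilation. The sketch below assumes Linux; the default library path is where Debian and Ubuntu on x86-64 typically install `libjemalloc.so.2`, and both the `JEMALLOC_LIB` variable and the function name are hypothetical conveniences.

```shell
#!/usr/bin/env bash
# Sketch: run a command with jemalloc preloaded via LD_PRELOAD (Linux only).
run_with_jemalloc() {
  # Assumed default path; override with the hypothetical JEMALLOC_LIB variable.
  local lib="${JEMALLOC_LIB:-/usr/lib/x86_64-linux-gnu/libjemalloc.so.2}"

  if [ ! -e "$lib" ]; then
    echo "jemalloc library not found at $lib" >&2
    return 1
  fi

  # LD_PRELOAD applies only to this one command, not to the calling shell.
  LD_PRELOAD="$lib" "$@"
}
```

A typical call would be `run_with_jemalloc bundle exec puma`. To confirm the allocator is actually loaded, `MALLOC_CONF=stats_print:true run_with_jemalloc ruby -e 'exit'` should print jemalloc statistics when the process exits.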

00:32:08.420 Lastly, it’s crucial to acknowledge that while using LD_PRELOAD can significantly improve memory efficiency, it may result in compatibility issues with certain applications. Please exercise caution when using it.

00:32:43.140 That said, you can also compile Ruby with Jemalloc to test any potential memory or performance benefits you might gain.

RailsConf 2023