Testing the Multiverse

00:00:29.679 Jason is from Portland. He works on the New Relic Ruby agent team with me, along with a couple of other fine people. If you've used New Relic, you have benefited from Jason's code, whether it's some of the documentation or a test. Jason writes a lot of things and he is highly knowledgeable about all aspects of the agent. He is also kind of an atypical Portlander in that he brews his own beer and primarily rides his bike everywhere. He even has chickens in his backyard! Yes, that's correct! He has a giant beard. It's quite rare to find someone like that in Portland.

00:01:14.439 So, please give a big round of applause for Jason Clark!

Hello everyone, I'm glad to be back here at Ruby on Ales. It's awesome to have the chance to talk to you again. Last year, I mainly discussed home brewing and the beer side of things, so this year I thought it would be fitting to talk a little more about the Ruby side of things. The topic today is 'Testing the Multiverse.'

00:01:26.960 When I refer to the Multiverse, I mean that sometimes when you build a gem, it relies on other libraries that exist. The most common example might be something like Rails, where you have something you've written that needs to work against multiple versions. Today, we're going to talk about what you need to do to consistently and easily test that.

00:01:41.360 Rather than diving straight into technical details, I want to set this up as a story, some of which might be a little fictional. There may be elements that aren't true; I'll leave it to you to decide what those are. In the Ruby ecosystem, there has been a library that has taken off in the past few years, garnering an incredible amount of momentum. Everyone's talking about it and using it, and of course, that framework is Ruby on Bales.

00:01:58.000 Ruby on Bales is the best possible framework for quickly and easily building command-line applications. This harkens back a bit to Rails, a framework you might be familiar with for web programming. However, since the Unicorn and Puma Wars of 2018, which caused many splits in the community, there has been an opening for command-line apps to take off. It was a sort of Renaissance for us.

00:02:11.200 You’ve all seen something like this: a standard Bales command. We have our class that derives from Bales command, then we provide it with a run method. This represents just about the simplest thing you could do. The conventions it brings are awesome. When running on the command line, you can see how easily you can build command line applications. But like any massive ecosystem shift, Bales has opened up many other possibilities for people to explore.
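Bales is made up, so there is no real API to show here. As a rough sketch of the shape just described, assuming a hypothetical `Bales::Command` base class (defined below only so the example runs on its own) and a hypothetical `Hello` command, it might look something like this:

```ruby
# Hypothetical stand-in for the fictional Bales framework, defined here
# only so that the example is self-contained.
module Bales
  class Command
    # Subclasses are expected to override #run.
    def run(*_args)
      raise NotImplementedError, "#{self.class} must define #run"
    end
  end
end

# About the simplest possible command: derive from the base class and
# provide a run method.
class Hello < Bales::Command
  def run(*_args)
    "Hello from Bales!"
  end
end
```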

00:02:40.120 This brings me to a metric I created called the 'PIOR metric.' Is anyone here familiar with it? No? Well, that's good because I made it up! It stands for 'Programmer Input Output Ratio.' This metric measures the amount of text you input against the amount of output that it generates. It's kind of like golf scoring: the lower the number, the better. I found this to be a much better measure of productivity than just lines of code.

00:03:03.880 So, I thought it would be a great idea to log an issue on the Bales project, suggesting we track these metrics. I wanted all the Bales commands I executed to report my productivity boost. Unfortunately, the comments at the bottom suggested that this didn’t belong in the core, although I was welcome to build it myself. And so, the Straw project was born.

00:03:17.640 Straw is a plugin that builds on top of Bales, allowing us to extract metrics from the commands we're running. It generates a lot of data that looks something like this: the first number represents output, followed by the status code, which indicates whether there were errors or not. At this moment, you might be wondering how many beers I've had prior to this presentation or where I'm headed with this information.

00:03:47.400 However, this is the meat of it: as a good Ruby developer, I want to test the code I've written. So, my tests look like this: I'm using Minitest, because why wouldn't you? I instantiate a Bales runner, run a thing, and then I have some helpers that let me examine the output and the data I've stored. Life is good; my code is well tested.

00:04:05.079 Straw 1.0 ships, and people everywhere are able to access these numbers, figuring out their productivity and truly understanding what’s going on. Life is good. But then life doesn’t stay still in open source. Bales rolled out a new release, introducing a 2.0 version, and all of a sudden the way I had written my tests—though still valid—needed some adjustments.

00:04:28.000 Not much changed in how I plugged into Bales for the second version, but I'd like to run the tests against both sets of dependencies to ensure I haven’t broken anything. Luckily, there's support in the Ruby ecosystem that makes this relatively straightforward, which brings us to the Bundler project.

00:04:48.200 Most of us are familiar with Bundler. When working with a typical Bales app, you have a gemfile in your directory that lists a number of dependencies, and you run commands to install those gems or execute commands with them. However, Bundler actually has more tricks up its sleeve.

00:05:01.760 For most of us, we've created just a gemfile in our directory, but you can be more specific about where to fetch those dependencies from. To test against multiple versions, we will create several gemfiles and put them in a test directory, naming them logically according to the Bales versions we want to test against.

00:05:24.120 I will create a gemfile for Bales version one and another one for version two. My other test files will have gemfiles that resemble the typical ones we work with daily, where we can specify whatever dependencies and combinations we need.
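To make that concrete, here is roughly what the two gemfiles might contain. The file names and version constraints are assumptions for illustration, and `bales` is of course the fictional gem from this story; only the `source` and `gem` calls are the normal Gemfile DSL.

```ruby
# test/Gemfile.bales-1  (hypothetical file name)
source "https://rubygems.org"

gem "bales", "~> 1.0" # the fictional framework, held to the 1.x series
gem "rake"
gem "minitest"

# test/Gemfile.bales-2  (hypothetical file name, a separate file)
source "https://rubygems.org"

gem "bales", "~> 2.0" # same dependencies, but the 2.x series
gem "rake"
gem "minitest"
```

These are two separate files shown in one listing; each one would live on its own in the test directory.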

00:05:37.799 This is all great, but how do I get my tests to run and utilize those gemfiles? Well, Bundler provides a method to set this up through the `BUNDLE_GEMFILE` environment variable. I am exporting it in a way that fits neatly on the command line, but you can set it inline with your command as well.

00:05:57.599 By specifying `BUNDLE_GEMFILE` and pointing it to the alternate location, when I run `bundle exec` my tests, they'll load the specific set of gems listed in that gemfile, instead of whatever is in my default gemfile or installed on my system. And behold, everything runs smoothly, and all tests pass.
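On the command line this amounts to something like `BUNDLE_GEMFILE=test/Gemfile.bales-1 bundle exec ruby ...`. The same idea can be sketched in Ruby by passing an environment hash to `Kernel#system`. The helper names and the `test/straw_test.rb` path below are assumptions for illustration, not part of any real project.

```ruby
# Builds the environment and command for one multiverse run.
# The default test file path is a hypothetical example.
def multiverse_invocation(gemfile, test_file = "test/straw_test.rb")
  env = { "BUNDLE_GEMFILE" => File.expand_path(gemfile) }
  command = ["bundle", "exec", "ruby", "-Ilib", "-Itest", test_file]
  [env, command]
end

# Runs the tests against the given Gemfile. Returns true when the
# command exits with status zero.
def run_multiverse(gemfile)
  env, command = multiverse_invocation(gemfile)
  system(env, *command)
end
```

Calling `run_multiverse("test/Gemfile.bales-1")` would then run the tests with only that gemfile's dependencies loaded.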

00:06:13.080 However, this introduces a slight difference from what used to happen with my tests. When I run them at a particular time, I can no longer tell which dependencies I’m actually running against. It used to be straightforward, but now I might be running against this version or that version, or some other combination of dependencies.

00:06:30.199 One helpful trick to assist in this situation is to add a small piece of code in your test helper. This code can provide output and context about what is happening, especially when you’ve got long streams of output. As you begin testing against numerous versions, you can identify which environment you are running against.

00:06:51.399 This adjustment can yield valuable insights into breaking tests should you encounter a regression. We can take it further and utilize other features that Bundler provides, such as acquiring the list of specs loaded and generating a cohesive report on them. By capturing this information, we will have a comprehensive understanding of the dependencies loaded by Bundler.
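A small sketch of such a test-helper snippet follows. Note one substitution: rather than asking Bundler for its list of specs, this version reads RubyGems' `Gem.loaded_specs`, which lists the gems activated in the current process. The method name `environment_report` is made up for this example.

```ruby
# Formats a short report of the Gemfile in use and the gems that are
# currently loaded. Gem.loaded_specs is used here as a stand-in for
# Bundler's own spec list.
def environment_report(gemfile = ENV["BUNDLE_GEMFILE"],
                       specs = Gem.loaded_specs.values)
  lines = ["Gemfile: #{gemfile || '(default)'}"]
  specs.sort_by(&:name).each do |spec|
    lines << "  #{spec.name} #{spec.version}"
  end
  lines.join("\n")
end

# Print the report once, at the top of the test output.
puts environment_report
```

With this at the top of the test helper, every run starts by stating which gemfile and which gem versions it is using, which makes a long stream of multiverse output much easier to read.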

00:07:15.399 This understanding is pivotal for uncovering inconsistencies and issues that might not replicate themselves immediately with different gem combinations. This setup is fantastic for testing against various versions, but configuring Bundler and setting all these gemfile variables can be cumbersome. Automating this process makes our lives easier.

00:07:40.480 This brings us to our Rake tasks. When we run `rake multiverse`, this command manages running those with all the necessary additional parameters. If we examine the implementation, it consists of a basic Rake task that takes a couple of parameters. When we call it without arguments, it will look for all gemfiles within our test directory.

00:07:59.360 It iterates through each of those gemfiles. On each iteration, we use backticks to run the exact same test invocation we performed previously, but prefixed with the appropriate environment variable to specify which gemfile to use.

00:08:15.760 Now, we can execute the entire set of tests against all gemfiles residing in the test directory. It’s also beneficial to run a particular subset, and we provide a mechanism to allow this. You can pass a specific version suffix, which tells the Rake task to focus only on gemfiles matching that suffix.
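A sketch of how such a task could be put together is below. The `Gemfile.<suffix>` naming scheme, the helper name `multiverse_gemfiles`, and the test file path are all assumptions for illustration. The task definition is guarded so the file can also be loaded outside of Rake.

```ruby
# Finds the Gemfiles to test against. With no suffix, every Gemfile.*
# in the directory is used; with a suffix, only the matching one.
# Lockfiles that Bundler writes next to each Gemfile are skipped.
def multiverse_gemfiles(dir = "test", suffix = nil)
  pattern = suffix ? "Gemfile.#{suffix}" : "Gemfile.*"
  Dir.glob(File.join(dir, pattern))
     .reject { |path| path.end_with?(".lock") }
     .sort
end

# Only define the task when this is loaded from a Rakefile.
if defined?(Rake::DSL)
  task :multiverse, [:suffix] do |_task, args|
    multiverse_gemfiles("test", args[:suffix]).each do |gemfile|
      # Same invocation as before, prefixed with the environment variable.
      puts `BUNDLE_GEMFILE=#{gemfile} bundle exec ruby -Ilib -Itest test/straw_test.rb`
    end
  end
end
```

With this in place, `rake multiverse` would run everything, and `rake "multiverse[bales-2]"` would run only against the gemfile with that suffix.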

00:08:30.480 This is excellent. We are now equipped to run tests against multiple versions, establishing a foundation that will support our growth. However, one of the wonderful aspects of open source is that other developers identify issues and bugs in your code.

00:08:46.480 This brings us to an issue that was filed against `rake multiverse`. My boss cloned the repository and, when he tried to run the tests, they immediately fell over because he didn't have the correct versions of the necessary dependencies installed. Your initial reaction may be to say, 'It works for me,' but that's not a useful response.

00:09:10.560 We want this to work with a clean installation, so our issue comes from running the tests with the specified gemfile without verifying that all necessary dependencies are installed. Remember this rule: ABC—Always Bundle Constantly. This is an essential practice when working in Ruby, but here, it’s especially relevant.

00:09:30.360 The simplest method for addressing this is to run `bundle install` before the command you are executing. In this instance, I have separated `BUNDLE_GEMFILE` from the inline command for clarity because we'll run various commands that rely on it later. This method allows for a single configuration point rather than re-declaring it every time.

00:09:49.440 Unfortunately, running `bundle install` each time I execute tests can slow things down. It’s not that Bundler is slow, but this process is undertaking significantly more work than before, which may involve network activity and downloads. It would be great if we could circumvent some of that labor occasionally, although more efficient approaches might exist.

00:10:09.680 One step we can take is to set `BUNDLE_GEMFILE`, and rather than executing the full `bundle install`, we can use the `--local` option. This way, it won’t reach out to RubyGems; instead, it will resolve dependencies based on what you have installed locally. If this succeeds, great! We can proceed to run our tests.

00:10:29.480 However, if it fails, we are probably missing some gems that we require. In that case we fall back to a full `bundle install`, so the environment is correctly set up before we run `bundle exec` and carry on with our tests. At this point, you might be wondering about another issue, this one from Matt, about a gem called Appraisal, which simplifies this whole process.
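The "try local first, then go to the network" idea can be sketched in a few lines. The method name `ensure_bundle` and the injectable `runner` argument are inventions for this example; the runner exists only so the logic can be exercised without actually invoking Bundler.

```ruby
# Makes sure the gems for a Gemfile are installed. Tries the cheap
# local resolution first and only does a full install if that fails.
# `runner` defaults to Kernel#system and can be swapped out in tests.
def ensure_bundle(gemfile, runner: method(:system))
  env = { "BUNDLE_GEMFILE" => gemfile }
  runner.call(env, "bundle", "install", "--local") ||
    runner.call(env, "bundle", "install")
end
```

When everything is already installed, only the fast `--local` command runs; the slower network install happens just when something is missing.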

00:10:47.000 Appraisal is an actual gem, unlike some other concepts I've mentioned that may be fictitious. It's equipped to handle much of the mechanics we've discussed and is tailored precisely for cases like this. With it, you define an `Appraisals` file, which only expresses the changes from your base gemfile. It doesn't repeat the rake and Minitest dependencies because those appear in the main gemfile.

00:11:07.600 By utilizing Appraisal, you can consolidate the gemfiles and run `appraisal install` to generate files much like the ones we've hand-coded. You can prefix any command with `appraisal` followed by the chosen appraisal name and the command. This allows you not just to run tests but to perform other operations within that environment.
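For the Bales story, an `Appraisals` file might look roughly like this. The appraisal names and version constraints are assumptions, and `bales` remains fictional; the `appraise` block with `gem` lines inside is Appraisal's own DSL.

```ruby
# Appraisals (sits next to the main Gemfile)
# Each block lists only what differs from the base Gemfile.

appraise "bales-1" do
  gem "bales", "~> 1.0"
end

appraise "bales-2" do
  gem "bales", "~> 2.0"
end
```

After `appraisal install` has generated the gemfiles, a command such as `appraisal bales-1 rake test` would run the tests against just that set of dependencies.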

00:11:27.840 This setup is wonderful! If this is all the complexity you desire, this gem will help you configure your multi-environment testing without writing much manual code. Nevertheless, I want to explore a couple more things before we conclude. Let's revisit a recent issue I encountered.

00:11:44.800 My boss was working with an outdated version of Bales, specifically version 0.8. It's astonishing why someone wouldn’t upgrade, but as a developer, it's crucial that my library functions correctly, even if the version is outdated. I shouldn't create problems when users try to integrate it with older dependencies. Let’s examine what this entails.

00:12:02.640 First, we need to create a gemfile for that environment and add the required dependencies. Running the tests reveals an 'undefined method run' error on the Bales runner class. This error tells us that versions before 1.0 used a method named 'execute', which was later replaced by 'run', so our code is incompatible with those older versions.

00:12:27.440 At this juncture, we must decide whether to modify our code for compatibility or skip tests for unsupported versions altogether. We’ll opt for the latter—moving forward with the assumption that users might need to upgrade. We create a helper function within our codebase to check version ranges and determine when modifications apply.

00:12:43.760 For clarification, while version checks can be done more robustly, string comparisons are often easier to read. A crucial note for gem authors: If your gem's version is solely presented as a hardcoded string in your spec file, please avoid that. Instead, encapsulate that version in a constant that others can reference.
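As one example of the more robust route, RubyGems ships a `Gem::Version` class that compares versions segment by segment, which avoids surprises such as the string `"10.0"` sorting before `"9.0"`. The sketch below assumes the framework exposes a `Bales::VERSION` constant (a hypothetical stand-in is defined here), and the supported range of 1.0 up to but not including 3.0 is likewise just an assumption.

```ruby
# Hypothetical stand-in for the version constant a gem should expose.
module Bales
  VERSION = "0.8.0"
end

# True when `version` falls in the range [min, max), compared as
# versions rather than as plain strings.
def version_between?(version, min, max)
  v = Gem::Version.new(version)
  v >= Gem::Version.new(min) && v < Gem::Version.new(max)
end

# The supported range is an assumption made for this example.
def bales_supported?
  version_between?(Bales::VERSION, "1.0", "3.0")
end
```

A test can then start with something like `skip unless bales_supported?`, or the check can decide whether a group of tests is defined at all.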

00:13:03.080 Once we have our version check, we can conveniently skip non-applicable tests and focus solely on those that work with the supported versions. Even more crucial is ensuring that, if running with an unsupported version, we aren’t damaging code. We can structure tests deliberately to assert that we don’t interfere with the core functionalities.

00:13:23.600 For instance, if Straw's internal writer is directed to a small object that will fail if it receives any write requests, we can execute basic validation to ensure everything else has passed intact. Moreover, if we're patching methods or adapting signatures—factors that indicate our code has injected itself into the existing framework—we can employ Ruby's introspection features to guarantee we are compliant.
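Both ideas can be sketched briefly. Everything below is hypothetical: `ExplodingWriter`, the stand-in `Bales::Runner`, and the `untouched?` helper are invented for illustration. Note also that the `owner` check only notices patches applied with `prepend`; a method redefined directly inside the class would still report the class as its owner.

```ruby
# A writer that fails loudly if anything tries to write through it.
# Pointing Straw at this proves that nothing gets recorded.
class ExplodingWriter
  def write(*)
    raise "Straw should not record anything on unsupported versions"
  end
end

# Hypothetical stand-in for the framework class being instrumented.
module Bales
  class Runner
    def execute(command)
      "ran #{command}"
    end
  end
end

# True if the method is still the class's own definition, that is, no
# module has been prepended in front of it.
def untouched?(klass, method_name)
  klass.instance_method(method_name).owner == klass
end
```

On an unsupported version, a test could assert `untouched?(Bales::Runner, :execute)` and run a command with the exploding writer installed, expecting no exception.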

00:13:40.440 Having explored the past, it's now time for us to gaze toward the future. The latest version of Bales, version 3, has introduced a compelling feature, and we want to make sure Straw is compatible with it. This functionality allows for unrunnable commands, which is quite significant.

00:13:56.799 The tests go green—that's exciting! But does anyone wish to guess what happens when we run those tests against an earlier Bales version? Unfortunately, the tests fail. The output typically indicates an error—if it attempted to execute a command that it couldn’t locate, it transforms into a lament.

00:14:12.160 Much like the unsupported versions, we need to gate these scenarios. There are multiple methods to accomplish this, and you can choose whichever approach fits your style best. One way could be wrapping the definition of your test in a block that checks for a supported version. If it’s not valid, just don’t define those test methods.

00:14:33.440 Another way could be to perform an early return within the test's body contingent on those same conditions. Nevertheless, what I find most effective for significant functionality gating is to create a dedicated test file to manage those features.

00:14:53.040 This reinforces the understanding that the tests are contingent upon specific versions. I’ll create output that informs me about unavailability in unsupported environments. This is crucial because if a version check goes awry, it results in unknown statuses and unexpected test executions.

00:15:15.760 Just as every great project inspires competitors, Bales has seen similar developments. Others are exploring innovative methods for building command-line applications, leading to the inception of the Kuner project. Kuner simplifies many operations in a way that may resonate more with those who prefer minimalism, as seen in how the code is structured.

00:15:31.480 The Kuner example is notably simpler than Bales; it does not require as much setup, which may appeal to some developers. Despite this, as the author of the Straw project, I want my library to work well with Kuner too. My existing tests only apply to Bales, so Kuner requires a separate strategy.

00:15:49.760 To address this, I’ll introduce a construct referred to as a 'test suite'. Instead of merely coding a single continuous test group, I will dedicate suites to categorize various dependencies, allowing similar tests to operate collectively. For instance, we might differentiate tests running for Bales versus Kuner.

00:16:06.320 The implementation largely relies on the work we’ve previously done with our Rake task. We’ll incorporate parameters into our command call, allowing us to select the test suite we want to execute. Afterward, we transition to running a separate test runner to manage the operations we wish to carry out.

00:16:22.800 The test runner takes over the role previously played by the Rake test tasks, managing dependencies and paths. It will identify test files in a given subdirectory, load those up, and subsequently run them. With this setup, we can run `rake multiverse` for Kuner, executing only the Kuner tests against the relevant gemfiles.
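The core of such a runner is small. In the sketch below, the directory layout (one subdirectory per suite), the `*_test.rb` naming convention, and both method names are assumptions made for illustration.

```ruby
# Lists the test files that make up one suite, for example everything
# under test/suites/kuner that ends in _test.rb.
def suite_files(root, suite)
  Dir.glob(File.join(root, suite, "**", "*_test.rb")).sort
end

# Loads every test file in the suite and returns how many were loaded.
# If the test helper has required minitest/autorun, the loaded tests
# run when the process exits.
def run_suite(root, suite)
  files = suite_files(root, suite)
  files.each { |file| require File.expand_path(file) }
  files.size
end
```

The Rake task would then invoke this runner in a subprocess with the right `BUNDLE_GEMFILE`, passing the suite name along.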

00:16:37.680 This functionality is robust, enabling us to maintain separation between various test suites and dependencies. However, the life of an open-source maintainer demands continuous attention, and a persistent challenge remained. Although I've presented these Rake commands as `rake multiverse`, I discovered that running them through `bundle exec` breaks things.

00:16:51.760 At first, I was puzzled as to why this happens. My gratitude goes to Jonan for his invaluable input, which illuminated the intricacies involved. Upon reflection on how our test executions occur, I discovered that when you utilize `bundle exec` to initiate commands, it sets several parameters to reflect the currently loaded gemfile.

00:17:08.760 In doing so, it changes the subshell environment. The execution of commands may not behave as expected, causing issues when resetting to another gemfile. Fortunately, Bundler offers a solution. It provides a 'clean' block wherein any subsequent processes within this scope do not inherit these environment modifications.
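In Bundler itself, this clean block has been spelled `Bundler.with_clean_env` and, from Bundler 2.1 onward, `Bundler.with_unbundled_env`. A small wrapper that works with either is sketched below; the wrapper and `run_with_gemfile` are names made up for this example.

```ruby
require "bundler"

# Runs the block with Bundler's environment changes stripped out, so a
# child process can pick up a different BUNDLE_GEMFILE.
def in_clean_env(&block)
  if Bundler.respond_to?(:with_unbundled_env)
    Bundler.with_unbundled_env(&block) # Bundler 2.1 and later
  else
    Bundler.with_clean_env(&block)     # older Bundler releases
  end
end

# Runs a command against a specific Gemfile from inside the clean block.
def run_with_gemfile(gemfile, *command)
  in_clean_env do
    system({ "BUNDLE_GEMFILE" => gemfile }, *command)
  end
end
```

The important part is that the subprocess is started inside the block; once the block ends, the parent's environment is restored.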

00:17:27.480 Within this clean block, everything runs as expected, and the desired gemfile is loaded for each test run. Performance also matters to users, so running these in parallel is appealing, but I had also received a request concerning JRuby compatibility.

00:17:43.760 By design, the JVM does not support forking, thereby eliminating that as a possible solution for executing tests in parallel. Instead, I’ll show you a strategy using threads. To accomplish this, we create a thread for every process we intend to invoke, tracking these within an array.

00:18:01.760 Within the clean block, we create a separate thread dedicated to executing the test runner as intended. Don’t forget to join at the end; we need to wait for all threads to finish processing before we continue. This approach works quite smoothly until I encountered an unexpected issue.
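The thread-per-gemfile pattern can be sketched as follows. The method name is invented for this example, and the block stands in for whatever actually launches the test runner.

```ruby
# Runs the block once per Gemfile, each in its own thread, and waits
# for all of them before returning the results in the original order.
def each_gemfile_in_thread(gemfiles)
  threads = gemfiles.map do |gemfile|
    # Pass the path into the thread so each one sees its own value.
    Thread.new(gemfile) { |path| yield path }
  end
  threads.each(&:join) # wait for every thread to finish
  threads.map(&:value)
end
```

In the real task, the block would shell out to the test runner from inside the clean block, so each thread spends most of its time waiting on a subprocess.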

00:18:17.840 I hadn’t realized that my tests were not failing correctly until it was too late. This posed a problem when processing results; tracking exit statuses proved to be imperative. To achieve this, we initialize a status variable to discern the outcomes from subprocesses, relying on the standard $?.

00:18:34.240 The last recorded status provides crucial feedback on the outcome of our processing. It’s essential we terminate with the tally we’ve computed. This adjustment showcases clearer status handling and establishes a fallback for failure management.
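Putting the status tracking together with the threads might look like the sketch below. The method name and the choice to report the highest exit status seen are assumptions; the source only says that the statuses are tallied and used for the final exit. In Ruby, `$?` is thread-local, so each thread sees the status of its own child process.

```ruby
# Runs each command in its own thread and returns the worst (highest)
# exit status seen, or 0 if everything passed.
def worst_exit_status(commands)
  threads = commands.map do |command|
    Thread.new do
      system(*command)
      status = $? # thread-local: the child started by this thread
      # Treat a missing exit status (e.g. killed by a signal) as failure.
      status&.success? ? 0 : (status&.exitstatus || 1)
    end
  end
  threads.map(&:value).max || 0
end
```

The Rake task would finish with `exit worst_exit_status(commands)`, so a failure in any environment makes the whole run fail.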

00:18:52.200 Sticking to UNIX conventions improves our overall approach. Yet the journey for open-source maintainers is never quite complete. One user ran into trouble while trying to debug their code under the multiverse runner.

00:19:10.560 Looking back at how we invoke the subprocess explains the problem. Because we run the command with backticks, its output is captured rather than shown as it is produced. If the subprocess stops to wait for input, as a debugger does, it blocks, and its output never reaches us.

00:19:30.360 To address this, it’s often practical to run tests serially within the main process. Our test runner wraps up within a unified class, capturing essential input parameters. While it defaults to the anticipated command execution flow, we leverage the test runner we've constructed to execute processes directly within the command's context.

00:19:46.720 This method guarantees that all debugging tools and expectations function smoothly within the main process. We aptly name this task 'serial,' indicating that operations execute directly and eliminating issues related to multiprocessing failures. The limitations dictate that once a specific gemfile is loaded, only one version executes; however, this is acceptable for debugging purposes.

00:20:03.840 Through this process, I hope to have inspired you to consider how you can test your work within the complexities of evolving dependencies. We’ve discussed how Bundler supports smooth workflows and aids in the existence of alternate environments. We’ve seen useful Rake tricks to enhance productivity.

00:20:21.040 Moreover, we looked at handling version differences, parallel testing, and organizing tests into suites. With this information, I hope your testing experiences become significantly more effective and less frustrating than previously. Thank you!
