How does Bundler work, anyway?

by André Arko

The video titled "How does Bundler work, anyway?" presented by André Arko at RubyConf 2015 provides an insightful exploration of how Bundler facilitates dependency management in Ruby applications. The talk begins with the assertion that many developers use Bundler without fully understanding its functionality, often just running commands like bundle install and bundle exec. Arko emphasizes the importance of grasping the problem Bundler solves by reviewing the history of dependency management in Ruby, beginning with the require method, followed by setup.rb, and introducing RubyGems, which simplified library management.

Key points discussed include:

- Historical Context: The evolution from require, which requires knowledge of file locations, to setup.rb which simplified installation to the introduction of RubyGems for efficient library download and management.

- Challenges in Dependency Management: The difficulties posed by manual library installations and lack of versioning prior to RubyGems, which led to potential conflicts with installed libraries.

- Bundler's Role: Bundler mitigates problems by providing a way to specify dependencies per project and ensuring all developers use the same gem versions. It creates a dependency graph and resolves compatible gem versions, which are documented in Gemfile.lock for consistent installations across environments.
- Command Functions: Arko explains the primary functions of Bundler: bundle install for downloading the required gems and bundle exec for running the application with those specific gems to avoid conflicts. He also discusses alternatives for making command usage more efficient.
- Continuous Improvement: The video highlights ongoing enhancements to Bundler, addressing slow resolution issues and the introduction of projects like Gem Stash to cache gems locally for faster installations.

The presentation wraps up with a Q&A session where Arko answers inquiries about dependency resolution heuristics, release management, and the merger of Bundler and RubyGems for improved functionality. The main takeaway is the significant enhancement Bundler brings to Ruby development by resolving dependency issues and ensuring consistency across teams and environments. By understanding and leveraging Bundler, developers can streamline their workflows and focus on building robust applications.

00:00:15.400 Hi everyone. This talk is called "How Does Bundler Work, Anyway?" It came from realizing, as I talked to people, that a lot of them think they understand how Bundler works. They copy and paste a line from RubyGems or a README file, and then they run `bundle install`. However, if you truly want to know what happens during these processes, you need to understand the problem that Bundler solves.

00:00:29.039 To grasp the problem that Bundler addresses, you must understand the problems that preceded it. This will be a brief history of dependency management in Ruby. Before I dive into how dependencies have historically worked, let me quickly introduce myself. My name is André Arko, and I'm known as @indirect on the internet. You may have seen my picture on a website somewhere. I work for Cloud City Development, focusing on helping teams avoid pitfalls in decision-making when it comes to Ruby.

00:01:01.760 Additionally, I co-authored the third edition of "The Ruby Way," a book that profoundly influenced me when I learned Ruby in 2001. The third edition has been updated for Ruby 2.2 and 2.3, and while it provides valuable information, it'll only remain relevant for a short while. Another project I'm involved with is called Ruby Together, a nonprofit organization that acts similarly to npm but without venture capitalists. We gather funding from companies that rely on Ruby and pay developers to maintain essential Ruby projects everyone needs.

00:01:58.320 Companies like Stripe, New Relic, and Basecamp are among those who support this initiative. Ruby Together funds work on crucial infrastructure, including RubyGems and the rubygems.org server, where everyone finds and downloads gems. I've been leading Bundler for about four years, which is why I'm here to discuss it today.

00:03:04.080 Let’s dive into some Ruby code. How difficult is it to use code written by others? Surprisingly easy! You just need to include it in your Gemfile and run `bundle install`, followed by `bundle exec`, and voila! But what actually just happened? That's where it gets complicated. Gems were installed and downloaded, but from where and how can you access them later? It’s about to get historical.

00:03:32.720 Let me take you on a little history tour regarding dependencies. You might be familiar with the `require` method, which lets Ruby load code from files. Next, we’ll explore setup.rb, which was the first way to install Ruby code in such a way that you didn’t need to specify its location. We’ll also look at RubyGems, the initial means to easily download someone else's code, and then we’ll discuss Bundler, which allows for the simultaneous functionality of multiple gems.

00:04:21.120 The `require` method has existed since at least 1997, the year the oldest Ruby code that is currently in version control was created. Although the `require` method is quite old, it can still be simplified into smaller concepts. Using code from another file in Ruby is essentially the same as copying and pasting the contents of the file where the `require` line is placed.

00:05:31.800 You could implement a basic `require` function with a single line of code. You simply read the file you want Ruby to evaluate, and there you have it – a naive implementation. However, you would quickly realize that calling `require` more than once results in the file being executed multiple times, which is typically undesirable.

00:06:09.639 Consequently, it is essential to create a system that checks whether a file has already been required. You would define a global variable, such as loaded_features, to keep track of this. Ruby provides a global variable named loaded_features that contains a list of all files previously required.

00:06:43.680 The next challenge is that requires need absolute paths. If you say `require rails`, it won't find the file. This wouldn't be an issue if you knew the location of every file on your machine, but who really wants that hassle? A straightforward solution is to treat every file name as relative to the directory where the program was started from.

00:07:31.960 The drawback occurs when you attempt to load files from various directories simultaneously. You could implement a mechanism that allows requiring from multiple directories by maintaining a global variable that's an array containing a list of these directories. Thus, you would loop through the list to find the required file.

00:08:21.520 Ruby's existing global variable, load_path, already provides this functionality. When you call `require`, Ruby iterates over the load_path to find the requested file in each directory. You've now learned how `require` works.

00:09:06.279 With real Ruby, features like loaded_features and load_path are integrated into the same method. Adding items to the load path is beneficial because it allows you to find Ruby libraries regardless of whether they're spread across multiple directories. For example, loading net/http is straightforward; you simply require net/http, and Ruby handles the rest.

00:09:38.760 This efficiency led to a paradigm shift where developers began sharing their code, prompting a new solution: setup.rb. Until around 2000, using require was sufficient; however, as lots of developers started sharing Ruby code, the tedious process of manually installing code by copying Ruby files became clear.

00:10:31.839 This led to the creation of setup.rb, written by Mano Ai. This script automated the process of copying Ruby files into a specific location, known as site_lib, which is where Ruby installs libraries by default. Setup.rb continues to exist online, but hasn’t been actively maintained since around 2005.

00:11:43.239 Setup.rb is effectively the Ruby implementation of the classic Unix commands: configure, make, and make install, applied to Ruby files. The installation process meant you find a library, download it, unpack it, cd into the directory, and execute `ruby setup.rb all` to copy the Ruby files into the `site_lib` directory.

00:12:22.079 During this time, the Ruby Application Archive (RAA) was another significant initiative, offering a collection of Ruby code already written by developers. This allowed users to find libraries more conveniently, bypassing the tedious process of hunting down files. However, significant issues arose surrounding versioning.

00:13:05.480 There lacked a method of versioning. You could only hope that the library author mentioned the version somewhere; otherwise, you had to guess which version you had. The update process was equally cumbersome. If you didn't bookmark the URL of the library, you were left searching again.

00:13:31.800 Without a method for uninstalling previously installed libraries, you might end up with files from older versions colliding with newer versions. This was especially problematic when two libraries defined the same class, leading to conflicts and overwrites. The installation process became messy, as revisions could cause disruptions to existing applications.

00:14:58.360 In 2003, RubyGems was introduced to tackle the shortcomings of setup.rb. It provided a single command-line utility, `gem`, enabling the download and installation of libraries effortlessly. You could uninstall gems with a simple command, and it centralized library management.

00:15:20.040 This innovation revolutionized how Ruby developers shared code, fostering a community where people were more willing to share their libraries. RubyGems also introduced the possibility of managing multiple versions, a critical feature since different applications could rely on various versions.

00:15:59.550 One significant drawback of the previous methods was the lack of a centralized process for ensuring that all developers were using the same versions among different projects. This inevitably led to version conflicts, notably with the introduction of Bundler.

00:16:23.680 With Bundler, you can define gem dependencies for each project and ensure that all developers working on that project are using the same versions. This resolves many headaches when needing to update libraries across multiple applications.

00:17:14.199 Bundler’s philosophy revolves around dependency graph resolution. It determines which versions of each gem will work together, solving this through a detailed process. The key to Bundler is ensuring that all needed gem versions are compatible. By using Bundler, you can list the required gems and let it find which versions will work well together.

00:17:57.560 The evolution of dependency management has not been without its complexities. Bundler must tackle challenges related to legacy code and existing workflow practices, some of which can lead to lengthy resolution times. However, recent updates have greatly improved resolution speed, ensuring that most gem files resolve within seconds.

00:18:50.840 Once Bundler identifies compatible versions, it writes the exact version information to a file called `Gemfile.lock`, which becomes your project's versioning blueprint. This enables consistent installations across various environments, including developer machines, CI servers, and production environments.

00:19:33.479 To wrap things up, Bundler operates mainly through two functions: `bundle install` and `bundle exec`. The first command resolves and downloads the required gem versions, while the second one manages ensuring only those specific versions are used when running your code.

00:20:28.720 With `bundle exec`, Ruby code has access solely to the versions specified in the lock file rather than versions installed elsewhere. By restricting access to only the required versions during execution, Bundler mitigates potential conflicts.

00:21:22.639 Many developers find the command line somewhat cumbersome, and I personally agree. A popular workaround is to alias 'b' to 'bundle exec' for quicker access. Alternatively, you can generate bin stubs for your application commands, allowing you to avoid using 'bundle exec' directly.

00:22:13.200 Both Rails and Bundler have incorporated this approach, creating binaries alongside Rails applications. Commands like `bin/rails` or `bin/rake` let you seamlessly run the corresponding versions of these tools appropriate to your application.

00:22:58.879 This effectively brings us up to the present day in Bundler's ongoing evolution.

00:23:03.880 Even after Bundler was launched, challenges persisted, such as slow resolution times. To combat this, newer versions of Bundler have been introduced that optimize the download process. While we currently have improvements in Bundler 1.10, we’re excited to announce future enhancements and encourage contributions from the developer community.

00:24:43.559 If your company is unable to spare resources for Bundler, consider supporting Ruby Together, which funds initiatives that enhance tools like Bundler. Additionally, we’re working on an exciting project named Gem Stash, which acts as a local cache for Ruby gems, significantly speeding up gem installation within your infrastructure.

00:25:44.040 As we progress, I’ll be happy to take questions as we have around seven minutes left for any discussions about Bundler.

00:26:38.080 The first question inquires about interesting heuristics used to reduce the time taken for solving the dependency graph problem. To be honest, the methods we apply aren’t particularly fascinating. We optimize by starting with the newest versions, working backward when a specific version is needed, and tweaking as necessary based on dependencies.

00:27:31.919 As for managing releases of Bundler, we typically release when there are substantial updates ready—this can take a few months. The next version, 1.11, is currently in the pipeline, and we aim to provide more updates soon.

00:28:37.720 There are ongoing discussions about certain changes made to the lock file in the 1.10 update, particularly about the 'bundled with' section. This sparked some debate among users who are detail-oriented about their Gemfile.lock, especially since it changes the file only under specific circumstances.

00:29:40.200 Concerning the ownership of Bundler, the project operates under an MIT license, which allows open contributions. Different stakeholders at various times have impacted progress, including issues related to merging pull requests and ensuring everyone is on the same page regarding updates.

00:30:35.599 Finally, the merger of Bundler and RubyGems is progressing steadily. They now share a unified dependency resolver, which streamlines operations. While individual codes within Bundler might still call RubyGems functions, this alignment strengthens the overall functionality of both tools.

00:31:44.639 Having gone through these insights, I am more than happy to connect with anyone who has further questions. Unfortunately, our time is limited, but I encourage discussions afterwards.