How Does Bundler Work, Anyway?

RubyConf AU 2017

http://www.rubyconf.org.au

We all use Bundler at some point, and most of us use it every day. But what does it do, exactly? Why do we have to use bundle exec? What's the point of checking in the Gemfile.lock? Why can't we just gem install the gems we need? Join me for a walk through the reasons that Bundler exists, and a guide to what actually happens when you use it. Finally, we'll cover some Bundler "pro tips" that can improve your workflow when developing on multiple applications at once.

00:00:05.640 Hi everyone, thanks for coming! I appreciate you making the long trek up both escalators.

00:00:10.200 This talk is about how Bundler works. This is a question that I ask myself on a regular basis, so I figured I'd write it down so that I wouldn't forget.

00:00:14.700 Unfortunately, and I'm sorry in advance, telling you the answer to this question will require a brief history of the entirety of dependency management in Ruby. No big deal though.

00:00:19.470 Before we dive in, I'm André Arko, as you've heard. I'm involved in pretty much all the internet things, and that’s my tiny internet picture, which might be more recognizable than my actual face at this point. My day job involves Ruby, Rails, and team training and architecture stuff as a consultant through Cloud City Development, mostly in San Francisco.

00:00:33.180 I co-authored the third edition of 'The Ruby Way,' which is pretty cool. I learned Ruby from the first edition of that book, so it was a great feeling to be able to say, 'Oh, I contributed to that.' I also work on Bundler, and I hope you've heard of it. My side project is Ruby Together, which you heard about yesterday from Pat.

00:00:50.400 If you didn't hear about it yesterday from Pat, we take money from companies and individuals who use Ruby and pay developers to spend time working on Bundler and RubyGems, as well as the rubygems.org web app. It’s great to have infrastructure that works—at least, I like having infrastructure that works, and I hope you do too.

00:01:14.430 If you're interested, I have lots of Ruby Together stickers, so come see me after the talk, and I will hook you up.

00:01:22.470 Sharing Ruby code written by other developers is super easy, right? You add it to your Gemfile, run 'bundle install,' and start using it. Done! That’s it.

00:01:31.510 But what just happened? That is what the rest of the talk is about. I'm guessing your eyes glazed over while 'bundle install' printed things out, and you probably concluded that something got downloaded and something got installed.

00:01:37.460 You did type 'bundle install' after all, but what got downloaded? What got installed? Where did it go? Why those things? And how do you use someone else's code just by adding the name of it to your Gemfile? I will explain all of those things, hopefully soon.

00:02:07.020 But first, a small diversion. I have to take you back in time to the beginning of dependencies in Ruby. The reason why things are the way they are today is really just a result of how things were before.

00:02:29.100 So we're going to start at the beginning with the 'require' method, which you have probably also heard of, and then I’ll talk about 'setup.rb,' a Ruby thing that you probably haven't heard of. After that, we’ll go from 'setup.rb' to RubyGems and then to Bundler.

00:02:37.959 We'll discuss why exactly dependency management for a single project is different from simply having gems installed.

00:03:00.480 The require method has been around since at least 1997, which is the oldest Ruby code that we have in version control that we can still access. Unfortunately, we don't have a copy of the CVS repo anywhere on the internet anymore, but we do still have the SVN repo that goes back to ’97.

00:03:20.960 Even though 'require' is that old, you can still break it down into slightly smaller concepts. Using code from another file is practically the same as copying and pasting that code directly into the file that you’re using. You can implement a naive require function with literally one line of code. It's really straightforward—you tell it a filename, it reads the file off the disk, takes the string that has the contents of the file, and passes it to eval to run the code.
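The one-line idea described here can be sketched roughly as follows. This is only an illustration, not Ruby's actual implementation; the name 'naive_require', the temporary file, and the '$greeting_loads' counter are made up for the demonstration.

```ruby
require "tmpdir"

# Hypothetical name: read the file off the disk and eval its contents
# at the top level, as if the code had been pasted into the caller.
def naive_require(filename)
  eval(File.read(filename), TOPLEVEL_BINDING, filename)
end

$greeting_loads = 0

Dir.mktmpdir do |dir|
  path = File.join(dir, "greeting.rb")
  File.write(path, "$greeting_loads += 1\ndef greet; 'hello'; end\n")

  naive_require(path)
  naive_require(path) # nothing stops the file from running a second time
end

puts greet           # prints "hello"
puts $greeting_loads # prints 2
```

The counter ending up at 2 is the flaw in this version: the same file was evaluated twice.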

00:03:55.300 You may have noticed there are some small problems with this implementation. Among other problems, you can tell it the same filename twice and it will run the code twice. This is less than ideal. It will reinitialize values, redefine classes, and overwrite methods, which is not great.

00:04:20.900 So let's hypothesize how you might fix this. What if, hypothetically, you kept a global array of every file you've evaluated, checked it on each call, and skipped any file you'd already processed? This is theoretically pretty good.

00:04:56.420 It turns out there's a global variable that's an array, and its name is '$LOADED_FEATURES.' It contains a list of all the files that Ruby has ever required, and in fact, Ruby uses it to track whether it should require something or not. However, there’s a new problem: this only works with absolute paths.
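A minimal sketch of that guard, assuming a made-up global '$naive_loaded_features' standing in for the real '$LOADED_FEATURES' and a hypothetical method name 'require_once'. Like Ruby's own require, it returns true when the file was loaded and false when it was skipped.

```ruby
require "tmpdir"

$naive_loaded_features = [] # hypothetical stand-in for $LOADED_FEATURES

# Evaluate the file only if this exact path has not been seen before.
def require_once(path)
  return false if $naive_loaded_features.include?(path)

  eval(File.read(path), TOPLEVEL_BINDING, path)
  $naive_loaded_features << path
  true
end

$counter_loads = 0

Dir.mktmpdir do |dir|
  path = File.join(dir, "counter.rb")
  File.write(path, "$counter_loads += 1\n")

  puts require_once(path) # prints true
  puts require_once(path) # prints false
end

puts $counter_loads # prints 1
```

Because the array is keyed on the exact string passed in, the same file reached through two different paths would still be evaluated twice, which is why this sketch is only reliable with absolute paths.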

00:05:30.490 Maybe it's less than ideal to have to pass the absolute path that includes your username to every file you want to use in your program. The easiest way to allow requires that aren’t absolute is to treat every filename as if it’s relative to the current working directory where you started the program.

00:06:07.610 However, this doesn't work well if you need to be able to require from multiple directories. So let's hypothesize about how you might fix this problem.

00:06:18.900 What if you made a global variable that contains a list of directories to search in? When you run 'require,' you can look for the file being required in every directory listed in that global variable. This approach works quite well, and there is a global variable named '$LOAD_PATH' in Ruby that does just that.

00:06:54.770 When you require something, Ruby searches the directories in that load path, in order, for the specified filename. So, needless to say, this is not the actual implementation that the Ruby interpreter uses, but this is the functionality provided by Ruby.
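The lookup can be sketched like this, assuming a made-up global '$naive_load_path' in place of the real '$LOAD_PATH' and a hypothetical helper 'find_in_load_path'. It only handles '.rb' files, which the real interpreter does not limit itself to.

```ruby
require "tmpdir"

$naive_load_path = [] # hypothetical stand-in for $LOAD_PATH

# Return the first matching file, searching directories in order,
# or nil when no directory contains it.
def find_in_load_path(name)
  $naive_load_path.each do |dir|
    candidate = File.join(dir, "#{name}.rb")
    return candidate if File.file?(candidate)
  end
  nil
end

Dir.mktmpdir do |root|
  dir_a = File.join(root, "a")
  dir_b = File.join(root, "b")
  [dir_a, dir_b].each { |dir| Dir.mkdir(dir) }
  File.write(File.join(dir_a, "shared.rb"), "")
  File.write(File.join(dir_b, "shared.rb"), "")
  File.write(File.join(dir_b, "only_b.rb"), "")

  $naive_load_path.replace([dir_a, dir_b])

  # Earlier directories win when more than one contains the file.
  puts find_in_load_path("shared") == File.join(dir_a, "shared.rb")   # prints true
  puts find_in_load_path("only_b") == File.join(dir_b, "only_b.rb")   # prints true
  puts find_in_load_path("missing").inspect                           # prints nil
end
```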

00:07:05.670 Ultimately, load paths are pretty cool because they allow you to find Ruby libraries spread across many directories. I’m going to leave the combined implementation as an exercise to the reader because I couldn’t fit all of the code onto one slide.

00:07:40.900 But now we have a 'require' function whose load path can include the Ruby standard library by default. So now, when you say, 'Oh, I want to load net/http,’ it says, 'Oh, this standard library directory is on the load path by default. I found it here; you’re all set.' This is actually pretty great.

00:07:54.960 You can use code from multiple directories. However, the next problem that arises once you've solved those issues is what do you do if you want to get code from someone else somewhere on the Internet? You could create a directory, put that code in it, and then manually add that directory to the load path every time you start a Ruby interpreter.
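For illustration, the manual approach looks roughly like this in real Ruby. The directory is a temporary one created for the demo, and 'borrowed_example' is a made-up library name.

```ruby
require "tmpdir"

Dir.mktmpdir do |vendor_dir|
  # Pretend this file was downloaded from someone else.
  File.write(File.join(vendor_dir, "borrowed_example.rb"), <<~RUBY)
    module BorrowedExample
      def self.hello
        "hello from someone else's code"
      end
    end
  RUBY

  # Without this line, the require below would raise LoadError.
  $LOAD_PATH.unshift(vendor_dir)
  require "borrowed_example"
  $LOAD_PATH.delete(vendor_dir)
end

puts BorrowedExample.hello # prints "hello from someone else's code"
```

The same thing can be done from the command line with the interpreter's '-I' flag, but either way it has to be repeated on every run.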

00:08:43.140 That gets tedious. So someone had a bright idea: why not make a default place where you can add stuff, and Ruby will always keep that place on the load path? It’s called 'site_ruby.' Almost no one uses it anymore, but it’s still there for backward compatibility.
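If you are curious where that directory lives on your own machine, a small sketch like the following should show it. It assumes your Ruby build exposes the 'sitelibdir' key in 'RbConfig::CONFIG', and the output differs from system to system.

```ruby
require "rbconfig"

# Where this Ruby build expects site-wide libraries to be installed.
site_ruby_dir = RbConfig::CONFIG["sitelibdir"]

puts site_ruby_dir
# Whether that directory is currently on the load path for this interpreter.
puts $LOAD_PATH.include?(site_ruby_dir)
```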

00:09:04.500 Once we had 'site_ruby,' someone suggested that we write a script to automate the installation of new libraries. This became 'setup.rb,' which was written around the year 2000. Surprisingly, this library, 'setup.rb,' is still on the Internet in a Subversion repository hosted by its author.

00:09:53.520 The process it facilitated was inspired by the UNIX tradition of Makefiles and running 'configure, make, make install.’ 'setup.rb' became the dominant way to install shared code because it automated the process.

00:10:33.490 You would browse the Ruby Application Archive to find a library, click a link to the homepage, download a tarball, decompress it, and then run 'ruby setup.rb all.' It would print out a bunch of stuff, and you’d have a library ready to use.

00:11:05.850 This was a huge improvement as you didn’t have to figure out where to put the files; 'require' would just work. However, there were still some issues.

00:11:28.960 There was no versioning of libraries, which meant there was no way to uninstall them. Everything went into the 'site_ruby' directory, and you had no idea which files belonged to which library. If a library got updated, how do you find out? Well, you need to go back to the Ruby Application Archive and remember the version number you installed, then repeat the download process.

00:12:15.040 The tediousness of this process led people to think, 'What if you could just give the name of a package, and it would find, download, decompress, and install it?' This problem led to the creation of RubyGems in 2003, which fixed all of these issues.

00:12:54.120 There was a single command, 'gem install,' which downloaded the library with the name provided. It was a significant improvement. RubyGems also introduced the concept of versions, allowing you to install multiple versions of a library and specify which one you wanted, making it easier to manage dependencies.

00:14:02.200 It installed each version of each gem into its own directory. Which libraries you could load using 'require' was then determined by manipulating the load path to include the right directories. RubyGems also introduced dependency resolution, but only at runtime.
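As a rough sketch of that idea, the snippet below picks one installed version and computes the directory that would go on the load path. 'Gem::Version' and 'Gem::Requirement' are real RubyGems classes; the library name, the list of installed versions, the '/hypothetical/gems' root, and the 'lib_dir_for' helper are all assumptions for illustration, not how RubyGems is actually implemented.

```ruby
require "rubygems" # loaded by default in modern Ruby; explicit for clarity

# Hypothetical data: versions "installed" for one made-up library.
INSTALLED = {
  "examplelib" => %w[1.0.0 1.2.3 2.0.1].map { |v| Gem::Version.new(v) }
}.freeze

# Choose the highest installed version that satisfies the requirement and
# return the per-version directory that would be added to the load path.
def lib_dir_for(installed, name, requirement, gems_root: "/hypothetical/gems")
  wanted = Gem::Requirement.new(requirement)
  version = installed.fetch(name).select { |v| wanted.satisfied_by?(v) }.max
  raise LoadError, "no installed #{name} matches #{requirement}" unless version

  File.join(gems_root, "#{name}-#{version}", "lib")
end

puts lib_dir_for(INSTALLED, "examplelib", "~> 1.0") # prints /hypothetical/gems/examplelib-1.2.3/lib
puts lib_dir_for(INSTALLED, "examplelib", ">= 0")   # prints /hypothetical/gems/examplelib-2.0.1/lib
```

Because the choice happens when the program runs, two programs asking for incompatible versions of the same library only discover the problem at that point.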

00:14:50.200 However, over time, this led to complications because if you had two applications requiring different versions of the same library, you might run into conflicts.

00:15:40.140 When I became a full-time Ruby on Rails developer in 2006, it was common for new developers to experience issues during setup, often needing to find the right combinations of gems on their work laptops.

00:16:19.080 Recognizing this problem, Rails tried to offer solutions, suggesting you declare what gems and versions you needed for your apps. However, you still needed to load the Rails gem to find out what versions you needed, making it all somewhat circular.

00:18:02.040 This led to a situation where dependency conflicts arose frequently across different servers, complicating development. Any conflict would lead to frustrating bugs and production issues, particularly as you switched between projects.

00:18:52.530 As the Ruby community evolved, developers recognized the need for dependency resolution at install time rather than at runtime. Bundler entered this landscape to manage an application's gems, and all of their dependencies, together as a single set.

00:19:30.420 Bundler resolves dependency graphs to ensure all installed gems will work together harmoniously. It creates a Gemfile.lock file as a record of all the exact versions of gems that were installed and confirmed to work together.
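To make "resolving a dependency graph" concrete, here is a toy brute-force sketch. It is not Bundler's algorithm, which is far more sophisticated; the gem names, versions, and the 'resolve' helper are all invented, and only 'Gem::Version' and 'Gem::Requirement' are real classes.

```ruby
require "rubygems"

# Hypothetical index: gem name => { version => { dependency => requirement } }
INDEX = {
  "webapp_framework" => {
    "2.0.0" => { "json_lib" => "~> 2.0" },
    "1.0.0" => { "json_lib" => "~> 1.0" }
  },
  "api_client" => { "3.1.0" => { "json_lib" => "~> 1.5" } },
  "json_lib" => { "2.1.0" => {}, "1.6.0" => {}, "1.0.0" => {} }
}.freeze

# Hypothetical top-level requirements, as a Gemfile would declare them.
GEMFILE = { "webapp_framework" => ">= 1.0", "api_client" => "~> 3.0" }.freeze

def satisfied?(requirement, version)
  Gem::Requirement.new(requirement).satisfied_by?(Gem::Version.new(version))
end

# Try every combination of one version per gem, newest first, and return
# the first combination in which every requirement is met.
def resolve(index, gemfile)
  names = index.keys
  candidates = names.map do |name|
    index[name].keys.sort_by { |v| Gem::Version.new(v) }.reverse
  end
  first, *rest = candidates

  first.product(*rest).each do |combo|
    picked = names.zip(combo).to_h
    top_ok = gemfile.all? { |name, req| satisfied?(req, picked.fetch(name)) }
    deps_ok = picked.all? do |name, version|
      index[name][version].all? { |dep, req| satisfied?(req, picked.fetch(dep)) }
    end
    return picked if top_ok && deps_ok
  end
  nil
end

resolve(INDEX, GEMFILE).each { |name, version| puts "#{name} (#{version})" }
# prints:
#   webapp_framework (1.0.0)
#   api_client (3.1.0)
#   json_lib (1.6.0)
```

The newest 'webapp_framework' is rejected because its 'json_lib' requirement cannot coexist with the one from 'api_client'. Writing down the winning combination is, in spirit, what the lock file records.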

00:21:00.900 Here's how 'bundle install' works: it reads the Gemfile and the lock file (if there is one), asks RubyGems to find the gems, records the required versions, and then installs them.

00:21:40.620 The command 'bundle exec' serves a similar purpose, ensuring that your program has access to all the required gems when it runs. Pro tip: use the 'bundle binstubs' command to create commands that will always run with the correct version of your gems, even if you have multiple versions installed.

00:22:40.000 I find this very helpful when working on multiple Rails applications at the same time. However, we’re still improving Bundler, adding features, making it faster, and helping the community to contribute.

00:23:04.000 That’s it! Thanks everyone for listening.