Ruby's Environment Variable API

by Jack Danger Canty

In the presentation titled 'Ruby's Environment Variable API', Jack Danger Canty explores the intricate relationship between Ruby's behavior and environment variables, particularly focusing on tools such as Bundler, RVM, rbenv, and chruby that rely heavily on these variables. He emphasizes how environment variables serve as critical components that dictate Ruby's operation and how they interact with various tools and libraries within the Ruby ecosystem. The key points highlighted in the talk include:

Understanding Environment Variables: Canty explains that environment variables act like a global hash of strings, passed from parent processes to child processes within the operating system. This concept is foundational for managing Ruby applications and debugging issues.
Debugging Tips: The talk provides practical advice for debugging Ruby applications when encountering loading errors, illustrating how misconfigurations in environment variables can lead to problems.
Load Paths and Gems: Canty discusses the Ruby load path, emphasizing how Ruby searches for required files in a structured manner, similar to how an operating system finds commands. He explains the significance of the PATH variable and the gem path in managing dependencies.
Practical Examples: Canty offers examples of setting and exporting environment variables, demonstrating their immediate effects on the Ruby environment, and how tools like Bundler automate the management of gem dependencies.
Encouragement to Contribute: He encourages audience members to get involved with Ruby-related projects, emphasizing that understanding these concepts enables developers to contribute meaningfully and improve the ecosystem.

Overall, the talk seeks to demystify the Ruby environment variable landscape, providing attendees with the insights needed to troubleshoot effectively and work efficiently with Ruby and its tools. The conclusion reiterates the importance of environment variables and proper configuration in Ruby development, aiming to empower developers to manage their projects adeptly and contribute to the broader Ruby community.

00:00:14.410 My name is Jack Danger. It's a silly name, but it is mine, and I work at a company called Square. We're in San Francisco and a few other places like New York, Atlanta, St. Louis, Waterloo in Canada, and Tokyo. It's a great place to work, and because Square sent me here, it feels obligatory to tell you a little about how we do engineering. I'll sum it up by saying that we do billions of dollars in financial work with our own manufactured hardware. We create code and firmware for the hardware, then send it out to all our people, and for all the iOS, Android, and JavaScript applications that our customers use.

00:01:11.030 On the other side of the Internet, we manage our own hardware in leased data center cabinets, handling all the networking and all the services that run on it. We do all of this with about three or four hundred engineers, and we do it in a really secure and well-designed way. It's really hard work, so please come join us. We’d love to show you how we do it and have you help us do it better.

00:01:30.620 Alright, this talk is about the Ruby Environment Variable API. I actually said in my description that I wasn't going to call it that, but then Abdi encouraged me to use it, so I am. I actually kind of like it too because we’re going to talk about the things you can do with environment variables, which I'll explain in a second if you're new to them. Environment variables tell Ruby how to behave, and this is actually the core of how all the tools we use that are not Ruby itself operate, such as Bundler, RVM, rbenv, chruby, and others. They all work through environment variables.

00:01:56.540 We're also going to discuss how to debug your machine when things go wrong. For example, if you encounter an error like 'cannot load such file,' it indicates that something in the chain of configuration I’m about to show you has been misconfigured. So, let’s talk about the ideal world. The perfect world would have one version of Ruby located at, let's say, /bin/ruby. You'd run it as root without worrying about users or permissions, with just one file that contains all of the code you need — you simply copy your favorite gems and paste their source into a single file. Then, you run it, and it never has to call 'require.' That's perfect. But, that's not the reality.

00:02:29.060 The real world has a lot of Rubies, and your application requires many files. You can't specify the full path to every required file because that would be extraordinarily tedious and error-prone. It's impossible to manage such specificity regarding versions. Instead, you can simply say, 'I want this thing, give me that thing,' and you expect it to just work. Except when it doesn’t. You can require something like 'nokogiri' and hope it loads version 12, but if you encounter issues, you might end up trying to troubleshoot what’s gone wrong.

00:03:02.470 I want to apologize for the syntax highlighting in the slides; it's hard to do properly, and sometimes I may get it wrong. Also, as a colorblind person, I can't distinguish the seven different shades of yellow you may see, even though I tried to make them all look the same. Can we make this more comprehensible? Yes, we can, and we've achieved that largely through these projects. We have a bunch of tools that wrap up the complexity of working with different versions of Ruby and their dependencies, making it really easy and intuitive to run your code without having to think much about it. Also, please pretend some logos appear on slides where they’re missing, because I didn’t want them to look bad next to the ones that do.

00:04:06.859 Despite the heroes who have built these projects and the incredible effort that has gone into making them work across many different systems with various dependencies, this is how most people solve problems with their environment. I pose a little joke here: during the conference, I asked people a simple question: 'If you can't find the right file or if Bundler doesn't work the way you want, what do you do?' The responses are almost exactly the same list, which illustrates the trial-and-error nature of debugging in our environment. It's almost a comical reinforcement of how many people flail around, like 'rake,' then check if Bundler is right, or run 'rvm use' to check their Ruby version before they finally close the terminal and start over.

00:05:04.410 In fact, if you still fail, you might consider just reinstalling RVM entirely or even rebooting your machine altogether. Essentially, this reflects the truth where we can find ourselves blindly trying things to stem the bleeding caused by our sharp tools. What we really need are more heroes like those who built the tools that I mentioned. This talk is about equipping you with the knowledge necessary to join in on projects related to Ruby gems, Bundler, RVM, and chruby. By the end of this talk, if you choose not to participate, it’ll be because you made the choice, not because you weren't capable.

00:06:40.870 To become a superhero in this environment, there’s just one step: learn the tools. If you want to be the person everyone else refers to as the wizard, just remember that when you start programming, it’s not because of some special ability. It’s simply because you got really frustrated for a long time until you found a bit of understanding. Then, when others ask you how you did it, you might say it’s mostly about Stack Overflow! So, let’s talk about a couple of tools — the main focus of my talk is, first and foremost, environment variables.

00:07:27.400 Now, environment variables often trip up even the most brilliant people. The individuals who handle them easily have typically worked with them often. Environment variables serve as a global hash of strings — a hash map or dictionary, if you will — that a program inherits from its parent program. When you’re in a shell session, you inherit everything set up earlier in the boot cycle that created your session. Any program you execute afterward inherits your environment variables.

00:08:10.330 To visualize this, think of them like a really large purse that your parent gave you, which you can then pass on to your child. It’s not the actual purse; you just make a copy of it for the next generation. In other words, you don’t pass it as is; you give a copy of the environment variables. They look like individual variables but think of them as just keys in a hash, with both the keys and their values as strings. You can set things for yourself, but anything you export gets passed on to your child processes.
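As a minimal sketch of the "copy of the purse" idea (not code from the talk), the following Ruby snippet sets a variable, starts a child Ruby process, and lets the child change its own copy. The variable name PURSE_DEMO and the helper name are hypothetical, and the sketch assumes the running Ruby can start another copy of itself via RbConfig.ruby.

```ruby
require "rbconfig"
require "open3"

# ENV behaves like a hash whose keys and values are both strings.
ENV["PURSE_DEMO"] = "coins" # hypothetical variable name, used only for this demo

# Start a child Ruby process. It receives a copy of our environment,
# prints the value it inherited, and then changes its own copy.
def child_sees_and_changes
  script = 'print ENV["PURSE_DEMO"]; ENV["PURSE_DEMO"] = "empty"'
  output, _status = Open3.capture2(RbConfig.ruby, "-e", script)
  output
end

puts child_sees_and_changes # what the child inherited
puts ENV["PURSE_DEMO"]      # the parent's value; the child's change did not travel back
```

The child's assignment only touches the child's copy, which is why the parent still sees its original value afterwards.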

00:08:46.030 If you want to inspect your environment variables, you can type 'printenv' to get a printout of all of them. However, some of those may include control characters and escape codes that can make things display in odd colors. If you want to search for something specific, you can 'grep' the output for a keyword like 'HOME' to find your home directory variable. You can also set your own environment variables; writing their names in all capital letters is a convention, not a requirement.
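For readers who prefer to stay inside Ruby, a rough equivalent of 'printenv' piped through 'grep' might look like the sketch below. The helper name env_matching is hypothetical; it formats each variable as a NAME=value line and keeps the ones that contain the keyword, which approximates what grep does with printenv's output.

```ruby
# Hypothetical helper: return the variables whose "NAME=value" line
# contains the keyword, similar to `printenv | grep KEYWORD`.
def env_matching(keyword, env = ENV.to_h)
  env.select { |name, value| "#{name}=#{value}".include?(keyword) }
end

# Print every variable, like `printenv`.
ENV.to_h.sort.each { |name, value| puts "#{name}=#{value}" }

# Print only the lines mentioning HOME. The match is case-sensitive, as with grep.
env_matching("HOME").each { |name, value| puts "#{name}=#{value}" }
```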

00:09:13.680 If I set a variable like 'oh_dear=0' while checking with 'printenv', it will not show up in the output unless I export it. There are two ways to do this; you can set the variable and export it simultaneously or do it on separate lines. For instance, 'export x=y' or 'x=y' in one line then export afterward both work fine. This interaction is quite intuitive as you play around with the commands. When typing 'printenv' after exporting 'oh_dear', you would now see it included in the output.

00:10:00.500 Now let's talk a bit about process IDs. When your computer starts, it begins with one program, known as 'process one', which has a process ID of 1. On Linux, it’s referred to as 'init', on macOS it’s 'launchd', and on Windows, the landscape is a bit different. This initial process creates all other processes, establishing a hierarchy. The first process may export some values like 'happy=yes,' making all its children happy, while retaining something for itself like 'food=pizza.'

00:10:46.230 When you start a shell, it inherits the environment variables, but specific ones you define (like 'oh_dear=0') wouldn’t be inherited unless explicitly exported. Later on, a shell might start running and would inherit the 'happy' variable but not 'oh_dear'. If you type 'export oh_dear=0' in your shell, the next command line will accurately inherit that variable, allowing you to effectively manage your variables.
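The difference between setting and exporting can be checked with a short Ruby sketch that runs two tiny shell scripts and reports what a child process ('printenv') sees in each case. It assumes a POSIX 'sh' and 'printenv' are available on the machine; the helper name child_view is hypothetical.

```ruby
require "open3"

# Run a small POSIX shell script and return what it printed.
# In the scripts below, the printing is done by `printenv`, a child of the shell.
def child_view(shell_script)
  output, _status = Open3.capture2("sh", "-c", shell_script)
  output.strip
end

# Set but not exported: the variable stays private to the shell.
unexported = child_view("oh_dear=0; printenv oh_dear")

# Set and then exported: the child process inherits it.
exported = child_view("oh_dear=0; export oh_dear; printenv oh_dear")

puts "without export: #{unexported.inspect}"
puts "with export:    #{exported.inspect}"
```

The first result is an empty string because 'printenv' never received the variable; the second contains the value.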

00:11:43.110 Let’s discuss the most important environment variable of all: PATH. The PATH is a list of directories the operating system checks when you run a command. When you want to view your current PATH, you can simply type 'printenv PATH' or 'echo $PATH.' A common mistake made is confusing how you set variables without using a dollar sign (e.g. 'x=y') versus reading them, where you do use a dollar sign (e.g. 'echo $x'). This differentiation is essential when managing environment variables.

00:12:31.390 For better readability, you can run commands through tools that transform colons into new lines, allowing you to visualize each entry in your PATH clearly. When you type a command, the operating system looks through every directory in the PATH to find the executable you're trying to run. It's a basic yet vital process on how commands are executed on your system.

00:13:22.890 Now, there are tools on Linux like 'strace' that can provide a powerful look at what is happening at the OS level by displaying all system calls a command makes. It’s fascinating as it unfurls how the operating system manages or handles commands, illustrating the simplistic way your machine identifies what you’re telling it to do. A command typically searches through the entries in the PATH to find an executable, and iterates through the paths sequentially until it finds the desired file to execute. This exploration is an essential function of any operating system.
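The lookup described above can be approximated in a few lines of Ruby. This is a simplified sketch, not how any particular shell or kernel implements it: it splits PATH on the platform's separator, skips empty entries, and returns the first executable file it finds. The helper names path_entries and find_command are hypothetical.

```ruby
# Split PATH into its directories (the separator is ":" on Unix-like systems).
# Empty entries are skipped here to keep the sketch simple.
def path_entries(path = ENV.fetch("PATH", ""))
  path.split(File::PATH_SEPARATOR).reject(&:empty?)
end

# Walk the directories in order and return the first executable file
# named `command`, or nil if no directory contains one.
def find_command(command, path = ENV.fetch("PATH", ""))
  path_entries(path).each do |dir|
    candidate = File.join(dir, command)
    return candidate if File.file?(candidate) && File.executable?(candidate)
  end
  nil
end

puts path_entries                # one PATH entry per line
puts find_command("ruby").inspect
```

Because the search stops at the first match, the order of the entries in PATH decides which executable wins when several directories contain the same command name.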

00:14:49.260 Moving back to the Ruby environment, when you try to 'require' a file, it's a similar process to how the operating system looks up commands. Each entry in the Ruby load path is iterated through when you require a file. Essentially, Ruby asks the operating system to look for a particular file, appending '.rb' for Ruby files and '.so' for shared objects until it finds one. If nothing is found, Ruby raises a LoadError with the 'cannot load such file' message.

00:15:57.310 An interesting aspect to note is that when you print out the Ruby load path, the paths it checks resemble the ones in the operating system PATH. It iterates over its load path, checking each entry for the files being requested, and appends the required file name dynamically. Essentially, Ruby works in a similar manner to the operating system; it constantly looks for files to load in a structured way.
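To make the comparison concrete, here is a simplified Ruby sketch of the search 'require' performs. It is an illustration under assumptions, not Ruby's actual implementation: the helper name find_feature is hypothetical, the native extension suffix is taken from RbConfig (it is 'so' on Linux but differs on other platforms), and the exact order in which real Ruby tries directories and extensions is an implementation detail this sketch does not claim to reproduce.

```ruby
require "rbconfig"

# Hypothetical helper: look for `name` in each load path entry,
# trying the Ruby source extension and the platform's native extension.
def find_feature(name, load_path = $LOAD_PATH)
  extensions = [".rb", ".#{RbConfig::CONFIG["DLEXT"]}"]
  load_path.each do |dir|
    extensions.each do |ext|
      candidate = File.join(dir.to_s, name + ext)
      return candidate if File.file?(candidate)
    end
  end
  nil # the real `require` raises LoadError at this point
end

puts $LOAD_PATH                         # the directories Ruby will search
puts find_feature("rbconfig").inspect   # where a standard library file lives, if found
```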

00:17:15.350 The next path we will cover is the gem path, which can mystify many people because they don’t often have to engage directly with it. If you’re using tools like RVM or other version managers, you might not need to work with the gem path explicitly, but it's essentially a colon-separated list of directories much like the Ruby load path. If you set the gem path to include a specific directory for your gems, you enable Ruby to find gems more easily when you execute Bundler commands or install new gems.

00:18:09.630 For example, if you set 'GEM_HOME' as your gem directory when you install a new gem, that gem will be placed directly in the specified path, allowing you to manage your gems logically and effectively. You can utilize 'gem install' to install the gem directly into your gem path. It’s quite intuitive: if you know the directories, you can structure your gems efficiently.
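One way to see these settings from Ruby is through RubyGems itself: Gem.dir reports the directory new gems are installed into, and Gem.path lists the directories searched for installed gems. The sketch below also starts a child Ruby with GEM_HOME pointed at a made-up directory to show that the variable is what drives Gem.dir. The directory /tmp/hypothetical_gem_home and the helper name are hypothetical, and nothing is installed or created.

```ruby
require "rbconfig"
require "open3"
require "rubygems"

# What the current process was given, and what RubyGems derived from it.
puts "GEM_HOME=#{ENV["GEM_HOME"].inspect}"
puts "GEM_PATH=#{ENV["GEM_PATH"].inspect}"
puts "install directory: #{Gem.dir}"
puts "search path:       #{Gem.path.inspect}"

# Ask a child Ruby where it would install gems when GEM_HOME is set to `home`.
def gem_dir_with_home(home)
  output, _status = Open3.capture2({ "GEM_HOME" => home }, RbConfig.ruby, "-e", "print Gem.dir")
  output
end

puts gem_dir_with_home("/tmp/hypothetical_gem_home")
```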

00:19:57.340 Next, we will talk about the load path and why it often creates issues when requiring files in Ruby. To illustrate this simply, you can run a command to print out your load path. Whenever you require something, Ruby checks the directories in its load path to find the files your application needs. It performs a very similar operation to your operating system when executing shell commands.

00:21:03.410 For paths in Ruby, when you require something, Ruby will iterate over the entries in the load path and the load path can be modified as needed. You can add or change the load path dynamically within your scripts, which is especially useful in larger applications that might rely on many gems organized across various directories.

00:22:31.790 Now, to modify the load path, there are three primary methods: an environment variable (RUBYLIB), a command-line option (dash capital I), and Ruby's built-in $LOAD_PATH array. By default, the current working directory is not included in either the operating system's PATH or Ruby’s load path for security reasons; this prevents accidentally executing potentially harmful files.
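The three mechanisms can be tried side by side with a short Ruby sketch. It uses the RUBYLIB environment variable, the -I command-line option, and the $LOAD_PATH array, all of which are standard Ruby features; the directory /tmp/hypothetical_lib and the helper name child_load_path are made up for illustration, and the directory does not need to exist.

```ruby
require "rbconfig"
require "open3"

# Start a child Ruby with the given environment and flags,
# and return its load path as an array of strings.
def child_load_path(env: {}, flags: [])
  output, _status = Open3.capture2(env, RbConfig.ruby, *flags, "-e", "puts $LOAD_PATH")
  output.lines.map(&:chomp)
end

dir = "/tmp/hypothetical_lib" # hypothetical directory, used only as a marker

# 1. Environment variable: RUBYLIB is read when Ruby starts.
via_env = child_load_path(env: { "RUBYLIB" => dir })

# 2. Command-line option: -I adds a directory for this one invocation.
via_flag = child_load_path(flags: ["-I", dir])

# 3. Inside Ruby: $LOAD_PATH is an ordinary array that can be changed at runtime.
$LOAD_PATH.unshift(dir) unless $LOAD_PATH.include?(dir)

puts via_env.include?(dir)
puts via_flag.include?(dir)
puts $LOAD_PATH.include?(dir)
```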

00:23:23.390 While working within Ruby, you can add the current working directory to the load path manually. So if your application requires files located in the current directory, you can make sure to append 'Dir.pwd' to the end of your load path ensuring Ruby will search in your current directory when it attempts to load files.
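A small, self-contained sketch of that step follows. It writes a throwaway file into a temporary directory, makes that directory the current one, appends Dir.pwd to the load path, and requires the file by name. The feature name hypothetical_local_feature, the constant it defines, and the helper name are all invented for the example.

```ruby
require "tmpdir"

# Write `source` to "<feature>.rb" in a temporary current directory,
# append that directory to the load path, and require the file by name.
def require_from_current_directory(feature, source)
  Dir.mktmpdir do |dir|
    Dir.chdir(dir) do
      File.write("#{feature}.rb", source)
      $LOAD_PATH << Dir.pwd unless $LOAD_PATH.include?(Dir.pwd)
      begin
        require feature # found because Dir.pwd is now on the load path
      ensure
        $LOAD_PATH.delete(Dir.pwd) # leave the load path as we found it
      end
    end
  end
end

require_from_current_directory("hypothetical_local_feature", "HYPOTHETICAL_LOCAL = :loaded\n")
puts HYPOTHETICAL_LOCAL
```

Appending rather than prepending means directories already on the load path are searched first, so a file in the current directory cannot shadow an installed library of the same name.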

00:24:41.640 In particular, Bundler and similar tools manage these load paths dynamically; they take care of specifying what needs to be included at runtime, making things much easier for developers. Bundler also allows you to set an alternative location for your gems which can be extremely useful for managing multiple projects on the same machine without interference.

00:25:21.640 Additionally, using Ruby's environment variables, you can fine-tune how your gems are managed and set specific paths for various runtime environments simply with a series of environment variables. There are tools like RVM that will let you specify your own ruby version which helps to ensure the right environment is used every time you run your code.

00:26:38.910 In conclusion, understanding how each of these components fits together allows for efficient Ruby development. Environment variables and paths dictate how Ruby finds and loads the necessary files and gems to keep your code running smoothly. You need to use the knowledge of paths and environments effectively when setting up your Ruby projects to ensure that everything works as you expect.

00:27:27.820 This brings us to Bundler. Bundler essentially works as a project manager that keeps track of your gem dependencies and makes sure the correct versions are being used. By reading the Gemfile and lockfile, it figures out which gems are needed, their exact versions, and finds them efficiently using the paths you’ve established. It also modifies the load path so that the required gems can be loaded seamlessly.
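The load path part of that description can be illustrated with a deliberately simplified sketch. This is not Bundler's implementation and does not use Bundler's API: it only assumes the conventional RubyGems layout, where a gem's code lives under gems/NAME-VERSION/lib inside the gem directory. The gem name, version, gem home, and helper names below are hypothetical.

```ruby
# Hypothetical stand-in for the name and version pairs a lockfile pins.
LOCKED_GEMS = { "example_gem" => "1.2.0" }.freeze

# Compute the lib directory for each pinned gem under a gem home,
# assuming the conventional gems/NAME-VERSION/lib layout.
def lib_dirs_for(locked, gem_home)
  locked.map { |name, version| File.join(gem_home, "gems", "#{name}-#{version}", "lib") }
end

# Put those directories at the front of a load path so the pinned
# versions are found before anything else.
def activate(locked, gem_home, load_path = $LOAD_PATH)
  lib_dirs_for(locked, gem_home).reverse_each do |dir|
    load_path.unshift(dir) unless load_path.include?(dir)
  end
  load_path
end

# Work on a copy so the real load path of this process is left alone.
puts activate(LOCKED_GEMS, "/tmp/hypothetical_gem_home", $LOAD_PATH.dup).first
```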

00:28:39.040 While this may sound complicated, once you establish what the primary environment variables are and how they impact your code, you can manage them with more confidence moving forward in your Ruby development. You can certainly override some of the default behaviors, tweak how Bundler works through specific variables, and adapt it for your needs.

00:29:53.880 I want to encourage those of you listening today to really look into contributing to Ruby-related projects like Bundler or RubyGems. With an understanding of environment variables and paths, you can make contributions that smooth out rough edges in these tools. The growing ecosystem around Ruby depends on your insight and engagement, so don’t hesitate to step in and improve the experience for everyone. Thank you very much for your time.

00:31:29.390 Now I welcome any questions regarding the talk, especially around how RubyGems interacts with the Gemfile.lock and what happens if either isn’t present. Although I can’t provide exhaustive details without looking into the Ruby source, I know there’s a full-featured parser that allows you to work with the contents of Gemfile.lock. I appreciate your interest and enthusiasm, thank you for engaging in this conversation.