Zeitwerk: A new code loader

The talk presents Zeitwerk, a new code loader for gems and applications.

Zeitwerk is able to preload, lazy load, unload, and eager load the code of gems and applications with a compatible file structure without the need to write require calls.

RubyKaigi 2019 https://rubykaigi.org/2019/presentations/fxn.html#apr19

00:00:00.030 All right, let's go. So this is the menu that we have after lunch.
00:00:05.520 First, we are going to see what Zeitwerk is about. Zeitwerk is a German word, and although I don't speak any German, I will try to pronounce it as best as I can.
00:00:18.840 More or less, is that okay? Zeitwerk. All right, so we are going to see what it is about and what motivated me to work on this.
00:00:33.710 After that, we are going to understand how Rails has autoloaded since the beginning and how Zeitwerk integrates with Rails 6.
00:00:48.239 I hope it will become the autoloader by default in Rails 6 applications. Let's go!
00:01:01.379 So Zeitwerk is a Ruby gem that provides these features mainly. These are the main features; the API has more, but this is what it resolves.
00:01:08.790 It is able to autoload and eager load, and it does so without needing require statements. That's the key point.
00:01:21.210 So you are able to autoload the same way you do in Rails, and you can eager load as well. However, if you want to eager load without writing requires, you need to autoload as well.
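00:01:33.390 Regular code normally references constants at the top level or at class level, for example a superclass or an included module. Since you do not order the way the files are loaded when you eager load, those constants may not be defined yet when a file is evaluated.
00:01:54.630 For eager loading to work without ordering the files, you have to autoload. For instance, in production mode Rails still autoloads by default while it eager loads, because files reference constants at the top level.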
00:02:11.580 Once you have finished eager loading, you are set, but in the meantime, you still need to autoload. Therefore, both features are related.
00:02:24.690 You are also able to reload code, which is convenient if you are writing a service that runs on a server.
00:02:38.280 It is important to understand that, although we are talking about Zeitwerk as the new default autoloader for Rails 6, it is actually an independent project designed to be usable in any Ruby project.
00:02:53.430 Indeed, it has no dependencies and no relationship with Rails, so Zeitwerk is an independent gem.
00:03:08.160 In Rails, there is integration code that delegates these features to Zeitwerk, but by itself, it's independent and can be used in any project without carrying anything extra.
00:03:14.970 All right, so first things first, let's see how to use this gem. The assumption the project has to follow is that file names have to match constant paths.
00:03:45.000 This is a very conventional way to structure projects, but Ruby in general is much more flexible than that. We put a constraint on the problem which is this: file names must match constant paths.
00:04:11.040 In the case of Rails, that's the way you normally write project structures in Rails applications. This is a given. In order to be able to use Zeitwerk in your project, you have to comply with this.
00:04:56.700 For instance, if you have a user.rb file, it should define a User constant. A file named user_profile.rb should define UserProfile, in camel case.
00:05:25.680 If you have html_parser.rb, by default it should define HtmlParser, with lowercase letters. But if you prefer uppercase letters, every Zeitwerk loader can configure its own inflector independently.
00:05:56.790 For example, you could tell your inflector that html_parser is going to be inflected as HTMLParser, treating HTML as an acronym. You can do that if you want.
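To make the naming convention concrete, here is a minimal sketch of an inflector in plain Ruby. It is not Zeitwerk's implementation: the class name `SketchInflector` is hypothetical, and its `inflect` method only imitates the shape of the override mechanism described above.

```ruby
# A toy inflector, NOT Zeitwerk's own code: it camelizes a file basename
# and lets the caller override individual basenames.
class SketchInflector
  def initialize
    @overrides = {}
  end

  # Registers hard-coded inflections, e.g. "html_parser" => "HTMLParser".
  def inflect(overrides)
    @overrides.merge!(overrides)
  end

  # Returns the constant name expected for a basename without extension.
  def camelize(basename)
    @overrides.fetch(basename) do
      basename.split("_").map(&:capitalize).join
    end
  end
end

inflector = SketchInflector.new
puts inflector.camelize("user")          # User
puts inflector.camelize("user_profile")  # UserProfile
puts inflector.camelize("html_parser")   # HtmlParser

# Opt in to the acronym for this inflector only.
inflector.inflect("html_parser" => "HTMLParser")
puts inflector.camelize("html_parser")   # HTMLParser
```

Because each inflector keeps its own overrides, two projects loaded in the same process can disagree about acronyms without affecting each other.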
00:06:24.400 Now, let's talk about namespaces. File names correspond to constant paths. Namespace is not a formal term in Ruby, but given this assumption, we know what we mean.
00:06:46.250 Namespaces correspond to directories. For instance, hotel.rb defines a Hotel class. That class can also act as a namespace, corresponding to a directory called hotel that holds the files for the constants defined beneath it.
00:07:11.720 We also support implicit namespaces in the same way that Rails has done since forever. If you have a namespace called Admin, you do not need to define admin.rb; you can simply create a directory called admin.
00:07:38.190 If there is a directory called admin and there is no admin.rb, Zeitwerk will define a dummy module for you automatically.
00:08:00.290 Rails has done this since forever. So that's the convention.
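The mapping from directories to namespaces can be sketched in a few lines of plain Ruby. The helpers `expected_constant` and `implicit_namespaces` are hypothetical names for illustration only; they do not load anything, they just compute which constant each path is expected to define and which namespaces have a directory but no file of their own.

```ruby
# Hypothetical helper: maps a path relative to a root directory to the
# constant path that the convention expects it to define.
def expected_constant(relative_path)
  segments = relative_path.delete_suffix(".rb").split("/")
  segments.map { |s| s.split("_").map(&:capitalize).join }.join("::")
end

# Hypothetical helper: namespaces that have a directory but no matching
# .rb file. These are the ones that get a module defined automatically.
def implicit_namespaces(relative_paths)
  files = relative_paths.map { |p| p.delete_suffix(".rb") }
  dirs = relative_paths.flat_map { |p|
    parts = File.dirname(p).split("/")
    parts == ["."] ? [] : (1..parts.size).map { |n| parts.first(n).join("/") }
  }.uniq
  (dirs - files).map { |d| expected_constant(d) }
end

paths = ["hotel.rb", "hotel/pricing.rb", "admin/users_controller.rb"]

paths.each { |p| puts "#{p} -> #{expected_constant(p)}" }
puts implicit_namespaces(paths).inspect  # ["Admin"]
```

Here `hotel` is an explicit namespace because hotel.rb exists, while `admin` is implicit because only the directory exists.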
00:08:16.140 How do you use this? Very easily. You just instantiate a loader, then tell the loader which are the root directories of your project.
00:08:42.500 This is the generic interface. You specify the root directories, which correspond to what we call the autoload paths in Rails. You can have many of them. Then you call setup, and boom, you are ready to do everything.
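The generic interface just described looks roughly like the following. This is a sketch under assumptions: the temporary directory and the `DemoHotel` files are made up for the demonstration, and the guard around `require "zeitwerk"` is only there so the script still runs where the gem is not installed.

```ruby
require "tmpdir"
require "fileutils"

# Zeitwerk is a third-party gem; skip the demo if it is not available.
zeitwerk_loaded = begin
  require "zeitwerk"
  true
rescue LoadError
  false
end

demo_rate = nil

if zeitwerk_loaded
  Dir.mktmpdir do |root|
    # Hypothetical project files that follow the naming convention.
    FileUtils.mkdir_p(File.join(root, "demo_hotel"))
    File.write(File.join(root, "demo_hotel.rb"), "class DemoHotel\nend\n")
    File.write(File.join(root, "demo_hotel", "pricing.rb"),
               "class DemoHotel\n  class Pricing\n    RATE = 100\n  end\nend\n")

    loader = Zeitwerk::Loader.new
    loader.push_dir(root)  # a root directory (an autoload path, in Rails terms)
    loader.setup           # from here on, the constants are autoloadable

    # No require calls: referencing the constant loads the files on demand.
    demo_rate = DemoHotel::Pricing::RATE
  end
end

puts demo_rate.inspect
```

You can call `push_dir` several times if the project has more than one root directory.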
00:09:02.140 In the case of gems, there's a shortcut because normally, a gem has a lib directory, so it has only one root path, which is called lib. You normally put this in the entry point of the gem.
00:09:38.220 Gems typically follow the convention of a version.rb file that defines a VERSION constant in uppercase. The shortcut also configures that for you.
00:09:57.420 I call this the gem shortcut for brevity, but the project doesn't technically need to be a gem; it does not need a gemspec or anything.
00:10:11.220 This is just a way to streamline things. If your project has a lib directory, Zeitwerk works as well for it.
00:10:51.160 You can eager load this way; you do the same thing. Just instantiate the loader, set it up, and then you can eager load. I believe in general, gems are going to eager load unless they are very large.
00:11:14.370 You can also eager load all of your dependencies that are managed by Zeitwerk. In Rails, you don't need to do anything regarding this; that's what the integration code does for you.
00:11:40.520 When Rails boots in production mode, it eager loads everything by default. So you benefit from every dependency that uses Zeitwerk: their code is eager loaded as well.
00:12:06.930 For reloading, you have to opt in. This is a gem created for Ruby projects in general, and I believe the majority of use cases are not going to reload.
00:12:41.000 So when you are working in a Ruby project that does not implement something like a service, you usually only need to autoload or eager load.
00:13:13.240 When you change the code, you may re-run the suite, but there's nothing to reload. I treated reloading as a special case.
00:13:47.159 In order to reload, you have to store some metadata, and we can skip storing that metadata if you are not reloading, which I believe will be the majority of cases.
00:14:19.187 For example, if you have a Rails application, and 20 of your dependencies are using Zeitwerk, you know that you are not going to be storing metadata for those gems, because they are not going to be reloaded.
00:14:58.220 In regular projects, this isn't a big deal, but if you have a really huge application, it could save memory in production.
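Opting in to reloading can be sketched as follows. The file `demo_greeting.rb` and its contents are invented for the demonstration, and the guard around `require "zeitwerk"` is only there so the script still runs where the gem is not installed.

```ruby
require "tmpdir"

# Zeitwerk is a third-party gem; skip the demo if it is not available.
zeitwerk_ready = begin
  require "zeitwerk"
  true
rescue LoadError
  false
end

text_before = text_after = nil

if zeitwerk_ready
  Dir.mktmpdir do |root|
    file = File.join(root, "demo_greeting.rb")
    File.write(file, "module DemoGreeting\n  TEXT = 'hello'\nend\n")

    loader = Zeitwerk::Loader.new
    loader.push_dir(root)
    loader.enable_reloading  # opt in; has to happen before setup
    loader.setup

    text_before = DemoGreeting::TEXT

    # Simulate an edit, then ask the loader to pick it up.
    File.write(file, "module DemoGreeting\n  TEXT = 'goodbye'\nend\n")
    loader.reload
    text_after = DemoGreeting::TEXT
  end
end

puts [text_before, text_after].inspect
```

A loader that never calls `enable_reloading` cannot reload, which is what lets it avoid keeping the extra bookkeeping around.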
00:15:32.890 So that's the basic usage of Zeitwerk. Now, why was I motivated to work on this?
00:16:10.150 Several things motivated me—the last of which was to improve Rails' autoloading. If you have written Rails applications, you'll probably know that the way constants are autoloaded has some pitfalls.
00:16:42.200 It is not wrong, but it does not match Ruby's semantics. When it works, it works beautifully, but when you have a problem, it can be difficult to debug. I wanted to improve this.
00:17:22.000 Initially, this was what made me launch the project, but then, while working on it, I realized that I could also solve another personal pain point.
00:17:51.900 I dislike writing require statements in projects. Requires feel brittle in my experience. If you have a very small project, it's not a big deal, but in larger projects, it's easy to forget requires.
00:18:32.790 Also, there is a sense of not following the DRY principle. If I write projects where the file structure is conventional and file names map to constants, every time I use the User constant, I need to require the file that defines the User class.
00:19:00.150 For every constant, this becomes repetitive. It felt burdensome, so I thought, couldn't we automate this somehow? Thus, those were my motivations.
00:19:30.040 For some people, writing requires is fine, but for me, it was a point of frustration. Take, for example, a class called Airplane that includes a module in a larger project; this can fail if you forget to require the module.
00:20:00.840 In projects of non-trivial size you have to be disciplined: when you refactor code, you must remember to add or remove requires. It relies heavily on your discipline.
00:20:49.410 Additionally, requires have a global side effect; this means that a file without the necessary require statements could still work if some other file has already required this module and it is already in memory.
00:21:35.200 Then you could encounter issues based on the load path; thus, you have dependency problems. I have dealt with this in the Rails codebase for a long time.
00:22:09.770 It results in brittle code, and I prefer to avoid this. For instance, this is how the entry file for Nanoc, a static site generator, works.
00:22:43.480 In Nanoc, instead of requiring files individually, the entry file simply eager loads everything within the project, which streamlines the process.
00:23:09.410 With this setup, if you add a file to the project, you won't have to remember to add it to the requires manually; it makes maintenance much easier.
00:23:38.930 Zeitwerk, which is used in Nanoc, makes everything reachable without explicit requires, streamlining the coding experience.
00:24:03.780 Let's now understand how Rails autoloads and what issues it entails, as well as how Zeitwerk addresses those problems.
00:24:32.740 To do that, let me quickly make a refresher on constants so we have a clear idea of how they work.
00:25:04.720 In most programming languages, constants are a straightforward concept. However, in Ruby, they are a rich topic, and this aspect is crucial for understanding.
00:25:38.090 When assigning a constant in Ruby, you are essentially creating a storage space, similar to variables. Constants provide a name for that storage space, which will hold the object.
00:26:06.570 For instance, a module or class keyword stores the class object in the corresponding constant name. This means that constants belong to modules, which have their own sets of constants.
00:26:30.430 When defining a constant, remember that they belong physically to the module. This means that the context of where you define your constants matters in Ruby.
00:27:06.540 For example, when you define a Hotel constant at the top level, it belongs to Object. If you define a Pricing constant inside Hotel, it belongs to Hotel instead.
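A short plain-Ruby check of where constants are stored, using the Hotel and Pricing names from the talk. The `Room` class is an extra, hypothetical example that shows that the class keyword is essentially a constant assignment.

```ruby
class Hotel
  class Pricing
  end
end

# Top-level constants are stored in Object.
puts Object.constants.include?(:Hotel)    # true

# Pricing is stored in Hotel, not in Object.
puts Hotel.constants.inspect              # [:Pricing]
puts Object.constants.include?(:Pricing)  # false

# Assigning an anonymous class to a constant stores it in that module
# and gives the class its name.
room = Class.new
Hotel.const_set(:Room, room)
puts room.name                            # Hotel::Room
```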
00:27:37.680 Now, let's talk about how we resolve constants. This has to do with the concept of nesting and the importance of ancestors.
00:28:10.060 When you reference a relative constant, Ruby first checks the nesting, starting with the innermost namespace, and if it doesn't find it there, it goes up the ancestors of that innermost namespace.
00:28:53.680 For example, if the innermost namespace is a module, Ruby will also search Object before giving up and raising a NameError.
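The resolution steps can be observed in plain Ruby. All the names below (`Lodging`, `Vehicle`, `Tools` and so on) are made up for illustration.

```ruby
module Lodging
  CURRENCY = "EUR"

  class Booking
    # Module.nesting lists the enclosing namespaces, innermost first.
    NESTING = Module.nesting

    def self.currency
      CURRENCY  # not in Booking itself; found in Lodging, next in the nesting
    end
  end
end

class Vehicle
  WHEELS = 4
end

class Car < Vehicle
  def self.wheels
    WHEELS  # not in the nesting [Car]; found in the ancestor Vehicle
  end
end

module Tools
  def self.string_class
    String  # Object is not an ancestor of a module, yet Ruby still checks it
  end

  def self.missing
    UndefinedThing  # found nowhere, so this raises NameError
  end
end

puts Lodging::Booking::NESTING.inspect  # [Lodging::Booking, Lodging]
puts Lodging::Booking.currency          # EUR
puts Car.wheels                         # 4
puts Tools.ancestors.inspect            # [Tools]
puts Tools.string_class                 # String
```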
00:29:21.100 When that resolution fails, Ruby invokes the const_missing callback, which receives the name of the constant that could not be found. Rails has hooked into it since the beginning.
00:29:57.010 This callback tries to find the corresponding file that should define that constant, but the technique has limitations and is not foolproof.
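A deliberately naive sketch of the const_missing technique, in plain Ruby. This is not Rails' code: the root directory, the `Invoice` file, and the lookup logic are assumptions made for the demonstration.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical root directory with one file that follows the convention.
SKETCH_ROOT = Dir.mktmpdir
at_exit { FileUtils.remove_entry(SKETCH_ROOT) }
File.write(File.join(SKETCH_ROOT, "invoice.rb"), "class Invoice\nend\n")

# Called by Ruby when a constant cannot be resolved from the top level.
# Note that it only receives the name, not the place where the reference
# was written.
def Object.const_missing(name)
  basename = name.to_s.gsub(/([a-z\d])([A-Z])/, '\1_\2').downcase
  path = File.join(SKETCH_ROOT, "#{basename}.rb")

  require path if File.exist?(path)

  # Fall back to the default behavior (NameError) if nothing defined it.
  const_defined?(name, false) ? const_get(name, false) : super
end

puts Invoice.name  # Invoice
```

Since the callback sees only a name, it has to guess which file the author meant, which is one reason an approach like this cannot be foolproof.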
00:30:30.430 Another problem is that const_missing comes with no locking mechanism of its own, so Rails has to add one to ensure thread safety. This is a big limitation of that approach.
00:30:54.370 For instance, if multiple threads are trying to resolve the same constant at once, you get race conditions that are difficult to manage efficiently.
00:31:35.200 This is why Zeitwerk is built on Module#autoload, Ruby's own autoload feature, which is designed to address these issues effectively.
00:32:10.000 At a high level, it autoloads constants defined under specific namespaces as well as within individual class and module contexts.
00:32:56.000 The goal with Zeitwerk's approach is to ensure that whenever a constant is referenced, it automatically loads the relevant file that defines it.
00:33:30.000 In this scheme, when you call setup, Zeitwerk traverses the root directories and sets an autoload for constants defined there.
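The idea behind setup can be sketched with Ruby's built-in `Module#autoload`. This is a simplification, not Zeitwerk's code: it handles a single flat, hypothetical root directory and ignores namespaces, inflection overrides, and reloading.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical root directory with two files that follow the convention.
AUTOLOAD_ROOT = Dir.mktmpdir
at_exit { FileUtils.remove_entry(AUTOLOAD_ROOT) }
File.write(File.join(AUTOLOAD_ROOT, "receipt.rb"), "class Receipt\nend\n")
File.write(File.join(AUTOLOAD_ROOT, "tax_rule.rb"), "class TaxRule\nend\n")

# Simplified "setup": one autoload per Ruby file found in the root.
Dir.glob(File.join(AUTOLOAD_ROOT, "*.rb")).sort.each do |path|
  cname = File.basename(path, ".rb").split("_").map(&:capitalize).join
  Object.autoload(cname.to_sym, path)
end

# Nothing has been loaded yet; Ruby only remembers where to look.
pending_before = Object.autoload?(:Receipt)

# The first reference triggers the load of receipt.rb.
receipt_class = Receipt

puts pending_before == File.join(AUTOLOAD_ROOT, "receipt.rb")  # true
puts Object.autoload?(:Receipt).inspect                        # nil
puts receipt_class.name                                        # Receipt
```

Because the lookup is registered on the module that owns the constant, Ruby's normal resolution algorithm finds it, and there is no need to guess from a bare name as with const_missing.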
00:34:10.500 To summarize, we have seen how constant loading works in both Rails and Zeitwerk and how Zeitwerk rectifies the existing limitations.
00:34:40.240 Under Zeitwerk, the goal is to automate file loading for constants effectively, thus reducing the manual efforts previously needed to include them.
00:35:20.360 In conclusion, Zeitwerk will be the default autoloader in Rails 6 and will greatly enhance the developer experience by removing the need for explicit requires.
00:36:00.780 Thank you for your attention, and now I will take any questions.