Re-graphing The Mental Model of The Rails Router

by Vaidehi Joshi

In the talk titled "Re-graphing The Mental Model of The Rails Router" delivered by Vaidehi Joshi at RailsConf 2018, the speaker explores the critical workings behind the Rails router, emphasizing the underlying computer science concepts that facilitate its operations. The presentation aims to demystify the abstractions present in the Rails framework, guiding developers towards a deeper understanding of routing mechanics.

Key points discussed include:

- Introduction to Routing Basics: Vaidehi explains that the Rails router acts as the middleman between incoming requests and the corresponding controller actions, akin to a post office sorting mail.

- Understanding Middleware: The importance of middleware in Rails applications is highlighted, with particular attention to the order in which components are executed during request handling.

- Journey Routing Engine: Introduced as an essential component of the Rails router, Journey utilizes graph data structures to improve routing efficiency. The speaker credits Aaron Patterson for creating this routing engine, which simplifies the routing process compared to an inefficient naive method of iterating over all defined routes.

- Graph Representation: Just as a post office efficiently uses a system to organize and narrow down addresses, the Rails router applies a graph algorithm to ascertain the correct routing. This involves breaking down the requested URL into tokens and constructing a syntax tree, followed by generating a directed graph.

- Non-deterministic Finite Automaton (NFA): Vaidehi explains that Journey employs a non-deterministic finite automaton to navigate the constructed graph based on parsed URL tokens, determining whether the given request leads to a valid controller action or ends in an error state for non-existent routes.

- Conclusion and Abstraction Encouragement: The presenter concludes by reinforcing that while Rails abstracts these complexities, understanding them can empower developers. She invites participants to explore resources for further learning, such as her project, BaseCS, dedicated to computer science education.

In summary, the session fosters appreciation for the intricacies behind Rails routing mechanics and encourages developers to venture deeper into understanding abstract systems, ultimately leading to improved problem-solving capabilities in their development practices.

00:00:11 Can everyone hear me? Awesome! All right, so I know there was a keynote earlier this morning and then there was a break. I'm assuming you had time to rest and recuperate, because we're about to do some hardcore routing this morning. So I'm gonna try it again because I need all your energy; I've got 123 slides! I definitely need your support to help me out.
00:00:26 All right, how's everybody doing? Better? All right, I'm gonna wait 20 more seconds because there was a break, and then nobody came in, and now a whole swarm of people are coming in. So I'm just gonna wait to let everybody get settled.
00:01:03 All right, looks like a good time to start. Okay, good morning! My name is Vaidehi Joshi, and I am an engineer at Tilde in Portland, Oregon, where I work on Skylight, your favorite Rails profiler. You may have noticed our booth; you should come by and say hi. Skylight is a Rails app, but it's also not my first rodeo when it comes to Rails.
00:01:18 In fact, I've been working with Rails for about four years, and I’m quite comfortable with it. I really love this framework; whenever I work with it, I feel like I'm coming home. It’s super organic and comfortable, and I know it pretty well. However, just like with other frameworks, when you get into the groove of working with something and following 'the Rails way,' you can be quite productive without thinking too much about what's actually going on.
00:01:55 It's comfortable, and you know it very well. This seems to hold true except in those instances where you can veer a little bit off the happy path. For me, veering off the happy path usually means I’m trying to figure out how something works, or I’m stuck debugging an error, or for some wild reason, I find myself in the source code of Rails, and at that point, I often feel very lost.
00:02:17 But generally, going off the happy path for me means I'm going to stumble upon some abstraction. Despite my familiarity with Rails, I’ve found that I sometimes only understand one layer of its complexity. A great example of that is the Rails router.
00:02:33 So quick show of hands, how many of you know how the Rails router works—like the nitty-gritty details? Yes? All right, good! I was looking for at least one hand up. That’s Aaron by the way; he wrote the router! So it’s okay that you didn’t raise your hands because I also didn’t know how the Rails router works. To be honest, even after spending a lot of time preparing for this talk and learning about it, I still don’t know everything. However, today we can try to investigate a little bit about how the Rails router functions under the hood, and hopefully, we'll have some fun along the way!
00:03:17 Let's begin with a recap of some routing basics just to refresh our memory. It’s early in the morning; it’s day three of RailsConf. I understand. We’ll work our way up from the beginning. The router basically allows us to recognize URLs. It lets us navigate throughout our application; it's the go-between of our application and the external world because it handles incoming requests. You can kind of think of it as the post office of your application, meaning the incoming requests are like all the mail that has to go to the correct places.
00:03:57 Correspondingly, responses are outgoing mail. Today, we’re just gonna focus on incoming mail. The router is responsible for making sure that a request ends up at the right mailbox. So when we say mailbox, what we really mean is the correct controller action, as that’s where all requests ultimately want to go. When Rails receives an incoming request, the router has to decide which controller to send the request to, and then the controller is going to take it from there. The router is kind of like the intermediary; it decides where the request should go.
00:04:47 To refresh our memory, some handy commands we can use include seeing all of the possible addresses, which in this case means controller actions. We can see all the available endpoints that a request can reach in our app by using 'rails routes.' This command displays all available places in your app where requests can end up. The definition of these routes comes from the 'routes.rb' file, which every Rails application will have.
00:06:02 Here’s an example of a simple 'routes.rb' file. The controller name and action correspond to the route in the form of a string. You can route to an action using 'get,' 'post,' and many other HTTP verbs. You can also use resources, which is not just one route but many routes bundled into a single easy-to-use syntax.
00:06:39 That’s basically our routing, and that’s probably the extent of what you may know early on when you start working with Rails. That’s fine because that’s the basics of what you need to know to get started.
00:06:50 However, it feels a bit magical—like things just end up at the right place. You write some strings, and voila! Things just go where they should. While it’s easy to accept that the Rails router performs this magic for you, it’s worth taking the time to investigate how it works. We can familiarize ourselves with the little parts of the router too and break down these abstractions.
00:07:33 The first step to investigating how the router works is identifying exactly where it fits in the lifecycle of your app. If you want to figure out how something works, the best rule of thumb is to look at the docs first. In the Rails guides, there’s a command called 'rails middleware' that outputs the entire middleware stack being used within your Rails app. When you run this command, you'll see a list of various middleware components in the order of execution within your application.
00:08:14 Pay attention to the item at the bottom of the stack; it might come in handy in about five slides. In case you’ve forgotten what middleware is, it can seem a bit confusing and intimidating. But all it really means is that it's a Rack app that takes another Rack app as an argument. This distinction is essential because there are Rack apps that are standalone, meaning they aren't initialized with other Rack apps. These standalone apps are often referred to as Rack endpoints.
00:08:51 Let's quickly refresh on what Rack is. A Rack app is any object that responds to 'call'. The most commonly used example is 'app.call,' which takes the request environment and just returns a status, headers, and body. The middleware we saw in that stack are just Rack apps too, except that each one takes in another Rack app and passes the request along to it. If you don't believe me, here's a little pro tip you can try. Next time you're in your Rails app, experiment by swapping a plain Rack endpoint in for one of your controller actions, because controller actions are actually just Rack endpoints too.
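As a rough illustration of the distinction described above (this code is not from the talk), here is a minimal sketch in plain Ruby of a Rack endpoint and a middleware that wraps it. The class name `TimingHeader` and the `x-elapsed-seconds` header are hypothetical, chosen only for this example, and no Rack gem is required to run it.

```ruby
# A Rack endpoint: any object that responds to `call(env)` and returns
# a three-element array of status, headers, and body.
hello_app = lambda do |env|
  [200, { "content-type" => "text/plain" }, ["Hello from #{env['PATH_INFO']}"]]
end

# A middleware: a Rack app that is initialized with another Rack app.
# `TimingHeader` is a hypothetical name used only for this sketch.
class TimingHeader
  def initialize(app)
    @app = app
  end

  def call(env)
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    status, headers, body = @app.call(env) # hand the request down the stack
    elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
    [status, headers.merge("x-elapsed-seconds" => elapsed.round(6).to_s), body]
  end
end

# The endpoint sits at the bottom; the middleware wraps it.
stack = TimingHeader.new(hello_app)
status, headers, body = stack.call({ "PATH_INFO" => "/recipes" })
```

The endpoint never receives another app, while the middleware cannot do anything useful without one. That is the same relationship the middleware stack has with the router at its bottom.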
00:09:24 But we won't delve too much into Rack today; let's return to our middleware stack and find out where our router comes into play. At the bottom of this middleware stack, we see something called 'Application.routes.' This part of the stack is significant because it relates closely to the router. Since this is the last piece of the stack, we need to understand what this 'routes' thing actually returns.
00:10:07 If we investigate further and look into Rails, there's a class called 'Rails::Engine' that has a definition for a 'routes' method. This 'routes' method returns an instance of all of our routes. Now we know that this is how the request actually passes through the middleware stack to reach the router. The question now is: how does the router route it?
00:10:53 Let’s revisit our 'routes.rb' file, something we’re familiar with. There might be some hidden clues there. So how does an incoming request get directed to the right route? The naive solution, and perhaps the most immediate one, would be to iterate through all the routes in the app until we find the correct one. We could technically write a loop that uses regex to check if the request matches each route.
00:11:40 However, this solution isn't the best because, as you might imagine, if you have a lot of routes, your if statements are going to get quite lengthy, and each one may involve a regular expression, which makes things unnecessarily complicated. This approach doesn't scale well; most Rails applications will have many routes, and this could lead to inefficiencies.
00:12:07 In fact, if we consider 'n' as the number of routes, this approach runs in linear time, meaning the work grows in proportion to the number of routes. If we're looking for a route that doesn't exist, we'll end up checking every single route before concluding the search.
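The naive approach described above can be sketched in a few lines of Ruby. This is an illustration only, not code from the talk or from Rails; the route table and the `naive_match` helper are made up for the example.

```ruby
# Hypothetical route table for this sketch: each entry pairs a regex
# with the "controller#action" it should reach.
ROUTES = [
  [%r{\A/articles\z},       "articles#index"],
  [%r{\A/articles/(\d+)\z}, "articles#show"],
  [%r{\A/comments\z},       "comments#index"],
  [%r{\A/recipes\z},        "recipes#index"],
  [%r{\A/recipes/(\d+)\z},  "recipes#show"]
].freeze

# Try every route in order until one matches: O(n) in the number of routes.
# Returns the matched endpoint plus how many routes were checked.
def naive_match(path)
  checked = 0
  ROUTES.each do |pattern, endpoint|
    checked += 1
    return [endpoint, checked] if pattern.match?(path)
  end
  [nil, checked] # a miss still had to test every single route
end

naive_match("/recipes/42") # => ["recipes#show", 5]
naive_match("/nope")       # => [nil, 5]
```

Both calls test all five patterns, and a path that matches nothing always pays the full cost of the list.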
00:12:48 This is not ideal. You start to realize how cumbersome this naive solution is by comparing it to the postal service. Imagine if a post office looked through a long list of addresses each time it received an envelope, checking whether the address matched any one on the list. As you can imagine, that's very inefficient. The post office is already slow handling mail, and this would only exacerbate that issue.
00:13:09 The goal is to find a better strategy to narrow down which routes we’re looking at. Instead of iterating through all routes, we want to filter them so we’re not spending time searching through what we know can’t match.
00:13:22 We aren't the first to realize this could be improved; it turns out someone already has—namely Tenderlove! I should really thank Aaron for his work on Journey, because without it, I wouldn’t be giving this talk today. I spent a lot of time learning about this engine, and it was quite enlightening.
00:13:38 So what is Journey? It’s a routing engine that Aaron wrote that used to be a standalone library but was merged into Rails starting from version 4. If you look at the Journey readme, it can seem complex and intimidating. But if we start breaking down what it's actually doing, those concepts are not as scary as they might appear.
00:14:06 So where does Journey fit into the router? Recall when we called the 'routes' method. That method returns a routes instance, which is a structure known as a route set. I can assure you, we are finally getting towards understanding the routing system. Hooray!
00:14:28 But now I have one question: what is this 'route set'? When we explore 'ActionDispatch,' we can see that a route set contains Journey routes within it. This is our first encounter with Journey's code, and at this point, we will focus on the general concepts of what the engine does.
00:14:51 Journey utilizes a computer science concept called a graph to simplify routing. Graphs are a fundamental data structure in computer science, and once you start to understand how they work, you will see them everywhere. Embracing and appreciating graphs can make them less daunting!
00:15:26 To create a graph, you need at least one node, though usually, you'll see multiple nodes that connect to one another through edges or links. For our purposes in Journey today, we’ll deal primarily with directed graphs. Let’s circle back to the post office analogy, which helps illustrate the concept.
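For readers who have not worked with graphs before, a directed graph can be written down as a plain Ruby hash. This is a generic sketch, not Journey's data structure, and the node names are invented for the example.

```ruby
# A directed graph as an adjacency list: each key is a node, and its
# array lists the nodes reachable by following one outgoing edge.
# The node names are made up for this sketch.
graph = {
  start:      [:recipes, :articles],
  recipes:    [:recipe_id],
  articles:   [:article_id],
  recipe_id:  [],
  article_id: []
}

# Edges only go one way: :start points at :recipes, but not the reverse.
graph[:start].include?(:recipes) # => true
graph[:recipes].include?(:start) # => false

# Collect every node reachable from a starting node (depth-first).
def reachable(graph, from, seen = [])
  graph.fetch(from).each do |node|
    next if seen.include?(node)
    seen << node
    reachable(graph, node, seen)
  end
  seen
end

reachable(graph, :recipes) # => [:recipe_id]
reachable(graph, :start)   # => [:recipes, :recipe_id, :articles, :article_id]
```

Once the walk has stepped from `:start` to `:recipes`, everything under `:articles` is out of reach. That narrowing is the property a router wants.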
00:16:13 An ideal post office doesn’t just randomly sort mail into boxes; there's an efficient system in place to ensure smooth operations. The same holds true for the Rails router. The concept of matching an address to reduce unnecessary searching mirrors how Journey operates.
00:16:54 Journey uses a graph data structure to determine how to direct requests by matching them against the given URL, which acts as the address of each request. When a request comes in, Journey must first establish where this request needs to go within the routes file, similar to finding an address in the post office metaphor.
00:17:42 For instance, if we receive a request like 'recipes/ID', we need to find a specific recipe. The ultimate goal is to forward the request to the controller where we can look for the recipe by ID. It makes little sense to consider routes related to 'articles' or 'comments' since they won’t be valid targets.
00:18:34 This intuitiveness is clear to us when we consult the routes file and eliminate the irrelevant options immediately. Moreover, in any routes file with a resource defined, it narrows down and dismisses several routes we needn't search through.
00:18:51 So, instead of traversing every single route as per our naive implementation, what if we applied a smart strategy like the post office? For a request to route effectively, it first needs to be correctly parsed. We cannot send a request to the right address if we cannot decipher the address itself.
00:19:25 How does the router read incoming requests? Well, we need to teach it to do so. Journey reads requests in a manner similar to us, and intriguingly, it adopts the same approach compilers use to parse code. When we read a sentence, our minds recognize the capitalization, punctuation, and spacing and divide it into coherent structures.
00:20:16 This process of breaking down the string into meaningful components is akin to what Journey needs to perform. In essence, Journey must replicate our ability to deconstruct the request address (the URL) into manageable pieces.
00:20:50 To do this, Journey employs a process called tokenization. Tokenization involves dissecting an expression into its minimal significant parts—termed tokens. Fortunately, Journey has a tool, the 'Journey Scanner,' that assists with this task.
00:21:17 Technically, it's a class built on top of Ruby's 'StringScanner' class. The Journey Scanner can take any string and follow a defined set of rules to distill tokens from the provided input, specifically request URLs. You can witness this scanner at work in the Rails console by creating a new instance and tokenizing a string.
00:21:49 To see it in action, one could call 'next_token' on the scanner instance and observe how it splits the request URL 'recipes/ID' into individual tokens. The Journey Scanner recognizes several important tokens, including slashes, string literals, and parentheses.
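The idea of tokenization can be tried outside of Rails with Ruby's standard `StringScanner`. The sketch below is a simplified stand-in, not Journey's actual scanner: the rule table, the token names, and the `tokenize` helper are assumptions made for illustration.

```ruby
require "strscan"

# Simplified token rules, loosely modeled on the kinds of tokens the talk
# mentions. This is a sketch, not Journey's actual scanner.
TOKEN_RULES = [
  [:SLASH,   %r{/}],
  [:LPAREN,  /\(/],
  [:RPAREN,  /\)/],
  [:DOT,     /\./],
  [:SYMBOL,  /:\w+/],    # dynamic segment such as :id
  [:LITERAL, /[\w\-]+/]  # static segment such as recipes
].freeze

# Walk the string left to right, emitting one [type, text] pair per token.
def tokenize(pattern)
  scanner = StringScanner.new(pattern)
  tokens = []
  until scanner.eos?
    type, regex = TOKEN_RULES.find { |_, re| scanner.match?(re) }
    raise ArgumentError, "unexpected input at #{scanner.rest.inspect}" unless type
    tokens << [type, scanner.scan(regex)]
  end
  tokens
end

tokenize("/recipes/:id")
# => [[:SLASH, "/"], [:LITERAL, "recipes"], [:SLASH, "/"], [:SYMBOL, ":id"]]
```

Each token is the smallest piece that still carries meaning: a separator, a fixed word, or a placeholder.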
00:22:11 It's crucial for Journey to distinguish these tokens, as they represent the fundamental components it seeks to parse. So, after the scanning process has identified the tokens, Journey must then apply grammatical rules to make sense of them.
00:22:32 Journey follows a similar path to humans, utilizing a parser class to derive meaning from the tokenized values. This parser creates a syntax tree, a computer science concept that visualizes the structure and relationships within language.
00:23:05 Like we might have been taught to diagram sentences, the parser generates a syntax tree for every route defined in our routes.rb file. This structure is crucial for Journey to map out the grammar of its routing language.
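To make the idea of a syntax tree concrete, here is a deliberately small Ruby sketch that folds a flat list of tokens into a nested tree. It handles only a plain sequence of tokens (no optional groups), and the `Leaf` and `Cat` structs are illustrative stand-ins rather than Journey's own node classes.

```ruby
# Tree nodes for this sketch. Journey has its own node classes; these
# Struct names are illustrative stand-ins.
Leaf = Struct.new(:type, :value) # a single token, e.g. SLASH "/"
Cat  = Struct.new(:left, :right) # "left followed by right"

# Fold a flat token list into a right-nested tree of Cat nodes.
def build_tree(tokens)
  raise ArgumentError, "no tokens given" if tokens.empty?
  type, value = tokens.first
  leaf = Leaf.new(type, value)
  return leaf if tokens.size == 1
  Cat.new(leaf, build_tree(tokens.drop(1)))
end

# Render the tree as an indented outline for inspection.
def outline(node, depth = 0)
  pad = "  " * depth
  case node
  when Cat
    ["#{pad}CAT", *outline(node.left, depth + 1), *outline(node.right, depth + 1)]
  else
    ["#{pad}#{node.type} #{node.value.inspect}"]
  end
end

tokens = [[:SLASH, "/"], [:LITERAL, "recipes"], [:SLASH, "/"], [:SYMBOL, ":id"]]
tree = build_tree(tokens)
puts outline(tree)
# CAT
#   SLASH "/"
#   CAT
#     LITERAL "recipes"
#     CAT
#       SLASH "/"
#       SYMBOL ":id"
```

The outline shows the structure a sentence diagram would: each `CAT` node says "this piece, followed by the rest of the pattern".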
00:23:54 If you're curious, you can actually see the syntax tree for a request in the Rails console, where it will be displayed as an HTML string. Now, with a bunch of syntax trees produced, we need to unite them into a coherent system that effectively routes our requests.
00:24:27 At this stage, we circle back to our previous discussion on graphs. Journey employs a graph to route incoming requests by combining these syntax trees into a unified structure called a generalized transition graph (GTG). For our context, think of it in terms of a state machine.
00:25:06 The relevant state deals with the request URL at hand and how far along we’ve progressed in parsed data. As a request reaches the router, Journey will navigate the graph node by node, pausing to match segments of the request URL.
00:25:55 As Journey reads the request URL section by section, it will continue traversing the graph. If a node in the graph corresponds to the current state of the request string, it advances to the next node. Journey keeps walking down this tree until it has fully parsed the request URL.
00:26:35 If at any point it finds a node corresponding to the entire request string, it has successfully identified which route matches the incoming request URL. We humans can look at the routes file and recognize the right action at a glance; Journey has to perform that same translation under the hood.
00:27:21 Equipped with its graphs, Journey achieves this efficiently. There's a little virtual object (think of it as a robot) traversing this graph, much like a real person working in a post office handling incoming mail. This little robot navigates through our state machine graph and systematically checks the request URL to determine the right path.
00:28:01 This concept is known as a non-deterministic finite automaton (NFA). Essentially, an NFA processes its input piece by piece and transitions through its states based on the current input it receives. Each time it reads the next part of the request URL, it chooses which path to pursue in the graph.
00:28:47 There are basically two outcomes as our little robot moves through this graph: first, it might find a node that matches the part of the request URL it is currently reading; or second, it might not match any valid state. When it reaches the end of the string, if it is sitting on a matching node in the graph, it confirms that there's a route corresponding to our input.
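The walk described above can be approximated in plain Ruby. The sketch below is a simplification under stated assumptions and not Journey's implementation: it merges a hypothetical route table into one shared graph, works on whole path segments rather than characters, and simulates the non-determinism by tracking every state the machine could be in at once. The names `ROUTE_TABLE`, `build_graph`, and `recognize` are invented for the example.

```ruby
require "set"

# Hypothetical routes for this sketch, mapping a pattern to an endpoint.
ROUTE_TABLE = {
  "/recipes"      => "recipes#index",
  "/recipes/new"  => "recipes#new",
  "/recipes/:id"  => "recipes#show",
  "/articles/:id" => "articles#show"
}.freeze

# Build one shared graph: states are integers, edges are labeled with a
# path segment, and `accepting` maps final states to endpoints.
def build_graph(routes)
  transitions = Hash.new { |hash, state| hash[state] = {} }
  accepting = {}
  next_state = 1
  routes.each do |pattern, endpoint|
    state = 0
    pattern.split("/").reject(&:empty?).each do |segment|
      unless transitions[state].key?(segment)
        transitions[state][segment] = next_state
        next_state += 1
      end
      state = transitions[state][segment]
    end
    accepting[state] = endpoint
  end
  [transitions, accepting]
end

# Walk the graph one segment at a time, keeping the set of all states
# that are still possible. An empty set means the input is rejected.
def recognize(path, transitions, accepting)
  states = Set[0]
  path.split("/").reject(&:empty?).each do |segment|
    states = states.flat_map { |state|
      transitions.fetch(state, {})
                 .select { |label, _| label == segment || label.start_with?(":") }
                 .values
    }.to_set
    return [] if states.empty?
  end
  states.filter_map { |state| accepting[state] }
end

transitions, accepting = build_graph(ROUTE_TABLE)
recognize("/recipes/42", transitions, accepting)  # => ["recipes#show"]
recognize("/recipes/new", transitions, accepting) # => ["recipes#new", "recipes#show"]
recognize("/comments", transitions, accepting)    # => []
```

Two details are worth noticing. A request for `/recipes/new` leaves the machine in two states at once, because both the literal `new` edge and the `:id` edge accept that segment; a real router then needs a tie-breaking rule, such as preferring the route defined first. And `/comments` is rejected after a single step, without ever looking at the recipe or article routes.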
00:29:35 If it manages to identify a matching route, Journey will simply hand off the request to the correct controller and action. Until now, we haven’t looked closely at any actual Journey code, so it’s time to dive into it, or in this case, a visual representation.
00:30:10 This visualizer offers a graphical depiction of how requests flow within the Journey framework. At first glance, it might appear complex, but it truly follows the concepts we've previously discussed. Here’s a depiction of an NFA if we were to search for a valid route, limited to a handful of routes for simplicity.
00:30:36 Should Journey locate an accepting state, it will carry on and dispatch the request to the relevant controller and action. But what occurs when the incoming URL is invalid or contains typos? In that situation, you would be in a rejected state.
00:31:19 When in a rejected state, the NFA cannot continue, and typically, it would raise an error indicating that the URL does not match any defined route. If you’ve worked with Rails, you've likely confronted this 'no route matches' error, which originates from Journey.
00:32:08 You can simulate this NFA machine in your own Rails application. There is a visualizer method in Rails that you can call as 'Rails.application.routes.router.visualizer,' allowing you to see the graph representation of your own routes, which can serve as a valuable debugging tool.
00:32:46 This has been quite the journey! We explored how requests reach valid controller actions or are rejected from the graph entirely. As we've traced the lifecycle of a request through the router, it's become clear that routing isn't just magic; there are systematic processes at work.
00:34:46 We learned that Journey works much like a regex engine, relying on tokenization, scanning, syntax trees, and automata. All of these concepts show up across many areas of technology, including other frameworks, programming languages, and compilers. Now that you understand the fundamentals surrounding these elements, the next time you encounter them in your work, I hope you'll recall what we've discussed today.
00:35:31 Rails is fantastic in that you don't need to comprehend every detail right away to be productive. It can be daunting to consider all this information as a beginner, but the beauty of frameworks like Rails lies in their abstractions, allowing us to skip the nitty-gritty details while still being effective.
00:36:08 But abstractions shouldn't be something to fear. If you’re curious about how something works, whether it’s navigating source code or debugging an issue, you can always dive deeper into those abstractions—often, they are understandable! You might just need to investigate to grasp what's going on.
00:36:55 If you're interested in learning about abstractions and deeper concepts in computer science, you might enjoy a project I worked on, called BaseCS. It's a writing series that uncovers a new computer science topic every week, complete with a video series and a podcast aimed at introducing you to computer science fundamentals in an enjoyable and approachable manner.
00:37:38 I also have some podcast stickers with me today. If you'd like one, feel free to come find me afterward. You can learn more at basecs.org. Thank you all so much for attending! I truly hope that if nothing else, you leave feeling empowered to understand the processes beneath the surface. People often suggest certain concepts are too intricate to grasp. I fundamentally disagree: everyone can learn everything! Thank you for embarking on this journey with me. That's the last journey, I promise!