RubyConf 2014

Roda: The Routing Tree Web Framework

by Jeremy Evans

Introduction

In the talk titled "Roda: The Routing Tree Web Framework," Jeremy Evans discusses the creation and features of Roda, a Ruby web framework designed with a focus on routing trees. The framework combines the simplicity of Sinatra with improved performance and scalability, particularly for more complex web applications.

Key Points

- Background and Motivation for Roda:
  - Jeremy started using Ruby and Rails but became disillusioned by Rails' complexity.
  - He was inspired by Sinatra’s simplicity but encountered issues with route duplication in larger applications.
  - After discovering the Cuba framework, which effectively reduced routing duplication, Jeremy envisioned a new framework that combined the benefits of both Sinatra and Cuba, leading to the development of Roda.
- Routing Tree Concept:
  - Roda is based on a routing tree, which differs from conventional route handling methods by breaking down requests into manageable segments, enhancing performance and flexibility.
  - Routing trees operate by examining each segment of the HTTP request path rather than iterating over a full list of routes, akin to a file system operation.
- Routing Methods in Roda:
  - Roda utilizes methods like r.on for prefix matching and r.is for terminal matching to construct its routing tree, enabling more precise control over request handling.
  - Different request methods (GET, POST) are managed effectively through r.get and r.post calls, maintaining clarity and minimizing redundancy in route definitions.
- Performance Advantages:
  - Roda boasts significant performance improvements, being observed to run approximately 2.5 times faster than Sinatra for basic applications and demonstrating reduced memory usage.
  - Performance benchmarks show Roda's capabilities with 10,000 routes, where it maintains efficiency compared to Rails and Sinatra, which experience noticeable slowdowns.
- Potential Downsides:
  - One trade-off with Roda’s design is a potential loss of routing introspection, which may affect debugging and certain application requirements that depend heavily on introspection features.
- Historical Context:
  - Evans provides a brief history of routing tree frameworks in Ruby, mentioning key predecessors like Rum and Cuba, which informed Roda's design.

Conclusion

Jeremy Evans's presentation highlights Roda’s unique routing tree design, its practical programming benefits, and performance advantages over other Ruby frameworks. The framework's capability to handle complex applications while maintaining simplicity presents a compelling option for developers. Roda not only enhances performance and reduces routing duplication but aims to streamline the coding experience in Ruby web development.

00:00:18.279 I'm going to go ahead and get started because it's already a couple of minutes late. It's great to be here at RubyConf. This talk is entitled 'Roda: The Routing Tree Web Framework,' and my name is Jeremy Evans. I'm the lead developer of the Roda web framework and also the lead developer of the Sequel Ruby database library.
00:00:35.000 I want to start off by explaining why I created Roda. After all, there are a lot of Ruby web frameworks, so there should be a good reason for creating another one, especially if you want to recommend that others use it. To explain why I created Roda, I need to go back about ten years to when I first started using Ruby. I first got into Ruby in late 2004, about six months after Rails was first released; I think 0.8.5 was the first Rails version I used. I found that Rails made web development much easier than the PHP framework I was using at the time. I used Rails for nearly all my web development for a few years, but I gradually got disillusioned by its complexity.
00:01:19.400 When I discovered Sinatra in late 2007, I was amazed by its simplicity. You just specified the routes you wanted to handle, and they just yielded to a block to handle the action, which made developing web applications so much simpler. So in April 2008, I started using Sinatra for all of my new web applications. Over the years, however, as the applications I was working on became more complex, I ended up with a lot of duplication in my Sinatra routes. Since my only real basis of comparison was Rails and the Sinatra code was simpler than what I would have written in Rails, I didn't consider this too much of an issue.
00:02:00.479 In July of this year, I was looking at a comparison of a bunch of lesser-known Ruby web frameworks, and I read about Cuba. Cuba has been around since 2010, but this was my first exposure to it. As I read about Cuba, I saw how it addressed some of the duplication issues I experienced in my more complex Sinatra applications. Additionally, from reviewing some benchmarks, I discovered that Cuba was significantly faster than Sinatra. After trying to convert one of my simpler Sinatra applications to Cuba, I found there were quite a few things I didn't like about Cuba that were easier in Sinatra.
00:02:18.319 This led me to the conclusion that I should create a new web framework based on Cuba that borrowed many features from Sinatra, along with some other ideas I had about extensibility and avoiding namespace pollution, to create the best web framework for the types of web applications I develop. That's the story behind Roda's creation. When I originally decided to fork Cuba, I hadn't settled on a name, so I temporarily called it 'Sinuba' since it borrowed features from both Sinatra and Cuba. While I was lying in bed one night thinking of a name, I considered what made the framework special. To me, the main difference between Roda and Sinatra is that in Sinatra, you are iterating over an array of possible routes, while in Roda, the routing process is broken down and generally takes the form of a tree.
00:03:12.360 So Roda is named after the Roda trees that appear in the Ys video game series, which help the main characters accomplish their goals. This image is from Ys Origin and prominently features a Roda tree.
00:03:29.200 The main feature that separates Roda from most other Ruby web frameworks is that it is designed around the concept of a routing tree. Since most programmers are not familiar with the routing tree concept, I will first explain what a routing tree is and how it works. Routing, in general, is the process of taking a request and determining the code that should handle it. While routing can consider any aspect of the request, in most cases, only two parts are used during routing, and these two parts are contained in the HTTP request line, which is the first data transmitted from the client to the server during an HTTP request.
00:04:00.280 The HTTP request line contains the request method followed by the request path. In this example, 'GET' is the request method, and 'albums/1/tracks' is the request path. It is possible to use other parts of the request during routing, such as information from the request headers or the request body, but typically only the request method and the request path are used. Sinatra and other similar Ruby web frameworks look at the full path of the request when deciding how to route it, iterating over their array of possible routes and checking each one to see if it matches the current request.
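The request line and the array-based routing described here can be sketched in plain Ruby. This is only an illustration, not Sinatra's code: the leading slash and HTTP version in the sample request line, the `ROUTES` table, and both helper methods are assumptions made for the example.

```ruby
# Sample request line; the leading slash and HTTP version are assumed.
REQUEST_LINE = "GET /albums/1/tracks HTTP/1.1"

# Split the request line into the two parts most routers care about.
def parse_request_line(line)
  method, path, _version = line.split(" ", 3)
  [method, path]
end

# Hypothetical flat route table: entries are checked in order.
ROUTES = [
  ["GET", %r{\A/artists\z},             ->(_m) { "artist list" }],
  ["GET", %r{\A/albums\z},              ->(_m) { "album list" }],
  ["GET", %r{\A/albums/(\d+)/tracks\z}, ->(m) { "tracks for album #{m[1]}" }]
]

# Linear scan: every route is tested against the full path until one
# matches, so the cost grows with the number of routes.
def route_linear(method, path)
  ROUTES.each do |route_method, pattern, handler|
    next unless route_method == method
    match = pattern.match(path)
    return handler.call(match) if match
  end
  nil
end

method, path = parse_request_line(REQUEST_LINE)
puts route_linear(method, path)
```

A request that matches nothing falls through the whole array and returns `nil`, which is the worst case for this style of routing.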
00:04:40.639 A routing tree handles routing differently by examining segments of the path. For example, when routing a request for 'albums/1/tracks', a routing tree will first look at the segment 'albums'. If this segment does not match, the entire 'albums' branch is skipped, and other routes under that branch are not considered. If the 'albums' segment does match, the tree will then look for a route for '1/tracks' within that branch, ignoring other branches. This is similar to how a file system works; when you ask a file system to open a file, it doesn't compare every path on the file system but checks the first segment of the path, sees if it's a directory, and looks inside if it is.
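A minimal sketch of that segment-by-segment lookup, using nested hashes much like directories in a file system. The `TREE` structure, the `:param` and `:handler` keys, and `route_tree` are hypothetical names chosen for the example; Roda itself builds its tree from nested blocks rather than a data structure like this.

```ruby
# Hypothetical routing tree: string keys are literal path segments,
# :param matches any single segment, :handler answers the request.
TREE = {
  "albums" => {
    handler: ->(_captures) { "album list" },
    param: {
      handler: ->(captures) { "album #{captures[0]}" },
      "tracks" => {
        handler: ->(captures) { "tracks for album #{captures[0]}" }
      }
    }
  },
  "artists" => {
    handler: ->(_captures) { "artist list" }
  }
}

# Walk the tree one segment at a time. When a segment has no matching
# child, the whole branch is abandoned without looking at other routes.
def route_tree(path)
  node = TREE
  captures = []
  path.split("/").reject(&:empty?).each do |segment|
    if node.key?(segment)
      node = node[segment]
    elsif node.key?(:param)
      captures << segment
      node = node[:param]
    else
      return nil
    end
  end
  handler = node[:handler]
  handler && handler.call(captures)
end

puts route_tree("/albums/1/tracks")
```

The work done per request depends on the number of segments in the path, not on the total number of routes in the tree.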
00:05:49.000 If the use of a routing tree was merely a performance issue, it wouldn't be particularly interesting. However, what makes routing trees compelling is their ability to operate on the current request at any point during routing. If any of you have used the Ragel state machine compiler, this is similar to Ragel's capability to execute arbitrary code during parsing. Now that I have briefly described what a routing tree is, let me explain how Roda implements this concept.
00:06:43.120 Let's start with a simple 'Hello, World!' example since it's the easiest case to understand. This Roda app will return 'Hello, World!' as the response body for every request. While it does not showcase the routing tree aspects of Roda, it illustrates how Roda retrieves the response body from the value returned by the block, which is similar to Sinatra.
00:07:03.759 In almost all Roda applications, you will want to use Roda's routing tree methods. The first routing tree method is called 'r.on', which creates branches in the routing tree. Here, you call 'r.on' with the string 'albums', which matches the current request path if it starts with 'albums'. Let me break down what is happening.
00:07:43.160 In the first line, we are calling the 'route' method, which starts the routing tree. All incoming requests are yielded to the block provided to 'route'. This block is passed a Rack request instance with some additional methods. By convention, the block argument is named 'r'. The additional methods added to the Rack request instance relate to routing the request.
00:08:38.560 As mentioned earlier, the request's 'on' method is used to create branches in the routing tree. Any arguments you pass to this method are called 'matchers' and are used to match the current request. In this case, a single matcher is provided as a string, which matches the first segment in the request path. If the request path starts with 'albums', this will match, and the request will be passed to the block provided to 'on'. That block returns a 'Hello, album' string, which Roda will use as the response body.
00:09:58.000 If the request path is 'artists', this will not match, so 'on' will return nil without yielding to the block, and execution will proceed after the method call. In this case, nothing follows the call to 'r.on', so the return value of the 'route' block will be nil. Since the block didn't return a string, Roda will use a 404 status code with an empty response body, following the principle of least surprise. If you don't specifically handle a request, an empty 404 response will be used.
00:10:50.960 However, this approach has issues because it returns the same response for all paths under 'albums', including non-existent albums. Generally, you want to return a 404 response for any path you do not specifically handle. If you only want to handle 'albums' and not the paths underneath it, you can use the 'r.is' method. The 'r.is' method is similar to 'r.on', but it performs a terminal match and only matches if the request path is empty after applying the matchers.
00:11:43.280 So, this code will not match a path like 'albums/should-not-exist'. The reason for the distinction here is that 'r.on' matches the request prefix, while 'r.is' matches only when the match is complete. Routing trees in Roda are constructed using a combination of 'r.on' and 'r.is' methods. 'r.on' does prefix matching of the request path, and 'r.is' performs full matching.
00:12:44.360 For example, 'r.on "albums"' creates a branch that handles all paths under 'albums'. Inside that branch, calling 'r.is "list"' will only match if the remaining request path is exactly 'list'. It may seem odd that this works, but the reason is that the request path is modified as the request is routed.
00:13:08.000 So when a request for 'albums/list' comes in, the routing tree starts with the full request path, and when 'r.on "albums"' matches, it consumes 'albums' from the front of the request path. Inside the 'r.on' block, the remaining request path is just 'list', so 'r.is "list"' matches. If you get a request for 'albums/list/all', the 'r.on' call will still match, but 'r.is "list"' will not, because consuming 'list' still leaves 'all' in the request path, and Roda will return an empty 404 response.
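The path consumption described here can be modeled with a toy request object in plain Ruby. This is a sketch of the idea only, not Roda's implementation: `ToyRequest`, `remaining_path`, and `respond` are hypothetical names, and the leading slashes in the paths are assumed.

```ruby
# Toy model of prefix and terminal matching (NOT Roda's implementation).
class ToyRequest
  attr_reader :remaining_path

  def initialize(path)
    @remaining_path = path
  end

  # Prefix match: consume "/segment" from the front, then yield.
  def on(segment, &block)
    match(segment, terminal: false, &block)
  end

  # Terminal match: like on, but nothing may remain after the segment.
  def is(segment, &block)
    match(segment, terminal: true, &block)
  end

  private

  def match(segment, terminal:)
    prefix = "/#{segment}"
    return nil unless @remaining_path.start_with?(prefix)
    rest = @remaining_path[prefix.length..-1]
    # Only match whole segments, so "/albumsx" does not match "albums".
    return nil unless rest.empty? || rest.start_with?("/")
    return nil if terminal && !rest.empty?
    @remaining_path = rest
    # Once a block has run, routing stops with that block's result.
    throw :halt, yield
  end
end

# Route one path; anything other than a string result becomes an empty 404.
def respond(path)
  r = ToyRequest.new(path)
  body = catch(:halt) do
    r.on "albums" do
      r.is "list" do
        "Hello, albums"
      end
    end
  end
  body.is_a?(String) ? [200, body] : [404, ""]
end

p respond("/albums/list")
p respond("/albums/list/all")
```

In this model, a failed match simply returns `nil` and lets execution continue, which is why an unhandled path ends up as an empty 404.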
00:14:29.360 So far I've focused primarily on routing using the request path. As I previously mentioned, routing usually takes into account the request method as well. Consider this routing tree that will handle requests for 'albums/new'. To handle the GET and POST request methods, Roda provides 'r.get' and 'r.post' routing methods. If you call these methods without any arguments, they perform a simple match against the request method; 'r.get' matches GET requests and 'r.post' matches POST requests.
00:15:32.200 Thus, a GET request for 'albums/new' will return 'Hello, albums', while a POST request will return 'Album added'. The typical way to build a routing tree in Roda involves combining these methods: you use 'r.on' to create branches based on the request path prefix, 'r.is' to do a complete path match, and 'r.get' or 'r.post' to handle different request methods for the same request path.
00:16:50.680 If you do not provide any matchers to the 'r.get' or 'r.post' methods, they do a simple check against the request method. However, if you supply any matchers to these methods, they also perform a terminal match on the request path, resulting in an API that is similar to Sinatra. For instance, a GET request for 'albums' will be matched by 'r.get "albums"', and a POST request for 'artists' will be matched by 'r.post "artists"'. A POST request for 'artists/1' will not be matched by either: it does not match 'r.get' because it is not a GET request, and it does not match 'r.post' because the request path will not be completely consumed by the matchers.
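The two behaviors of the request method matchers can be summarized as a small predicate. This is a plain-Ruby sketch of the rule as described, under the assumption of a single string matcher; `verb_matches?` and its parameters are hypothetical and not part of Roda.

```ruby
# Sketch of the rule for r.get / r.post style methods:
# - no matcher: only the request method is checked
# - with a matcher: the method must match AND the matcher must consume
#   the entire remaining path (a terminal match)
def verb_matches?(verb, request_method, remaining_path, segment = nil)
  return false unless request_method == verb
  return true if segment.nil?
  remaining_path == "/#{segment}"
end

# GET /albums is matched by a get with an "albums" matcher
p verb_matches?("GET", "GET", "/albums", "albums")
# POST /artists/1 is not matched by a post with an "artists" matcher
p verb_matches?("POST", "POST", "/artists/1", "artists")
```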
00:17:57.560 When constructing a routing tree, if you only want to respond to GET requests for 'albums/list' and not other request methods, instead of calling 'r.is "list"' and then 'r.get' inside it, you can simply use 'r.get "list"', resulting in more succinct code. Now that we've covered the four basic routing methods, let's talk about the arguments passed to these methods, known as matchers.
00:19:02.160 We have already discussed one type of matcher, which is a string matcher that matches the exact string in the first segment of the request path. Strings can contain slashes if you need to match multiple segments in the request path. So, 'albums/list' will match, but not 'albums/1'. You can also use embedded colons in your strings, which match arbitrary segments in the request path; this matches both 'albums/1' and 'albums/2'. Note that when you use an embedded colon, the text captured by it is yielded to the block. This is the primary way of extracting data from the request path in Roda.
00:20:07.600 There are also separate symbol matchers, which work similarly to embedded colons by yielding the matched segment to the block. You can also define matchers using regular expressions, where any captures are yielded to the block. Other types of matchers allow for advanced matching, but due to time constraints, I won't cover all of those here.
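The three matcher types discussed here (strings with optional embedded colons, symbols, and regular expressions) can be approximated with one helper that returns the captured segments and the unconsumed rest of the path. This is a plain-Ruby sketch under assumed semantics, not Roda's matcher code; `apply_matcher` is a hypothetical name.

```ruby
# Returns [captures, remaining_path] on a match, or nil otherwise.
def apply_matcher(matcher, path)
  pattern =
    case matcher
    when String
      # Literal segments; a ":name" part matches any single segment.
      matcher.split("/").map do |part|
        part.start_with?(":") ? "([^/]+)" : Regexp.escape(part)
      end.join("/")
    when Symbol
      # A symbol matches (and captures) any single segment.
      "([^/]+)"
    when Regexp
      matcher.source
    else
      raise ArgumentError, "unsupported matcher: #{matcher.inspect}"
    end

  # Anchor at the front of the path and require a segment boundary after.
  match = %r{\A/(?:#{pattern})(?=/|\z)}.match(path)
  match && [match.captures, match.post_match]
end

p apply_matcher("albums/:id", "/albums/1/tracks")
p apply_matcher(:id, "/7")
p apply_matcher(/(\d+)/, "/42/edit")
```

The captures array corresponds to the values that would be yielded to the routing block, and the second element is what is left for nested matchers to consume.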
00:20:59.599 One of the main advantages of routing trees is their ability to execute arbitrary code during the routing process. While this might not sound significant, it is the primary reason Roda allows for simpler and DRYer code compared to most Ruby web frameworks. For instance, if you want to ensure that a user is logged in before accessing certain routes, you can include the login check as the first line in your 'route' block. This approach is akin to a global before filter in Rails or Sinatra.
00:22:48.600 However, having the ability to execute code at any point in the routing tree provides a more elegant solution than a global before filter that checks the current path to avoid disrupting the login process. The ability to execute code across the routing tree is also useful when dealing with various request methods for the same route.
00:23:55.040 For example, if a GET request for 'albums/1' displays a form for editing that album, while a POST request for the same path processes the input, you can share code for retrieving the album in both routes. This is not revolutionary, as similar results can be accomplished using before filters in Rails or Sinatra. However, in both of those frameworks, before filters are separate from the code being executed, making it harder to reason about the application structure. Repeatedly using this pattern necessitates specifying a separate before filter for every set of GET and POST routes, creating an unwieldy codebase.
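The shared-lookup pattern can be sketched without any framework: the album is retrieved once inside the branch, and only then does the code split on the request method. The `ALBUMS` store, the sample album name, and `route_album` are hypothetical stand-ins for a database and a routing branch.

```ruby
# Hypothetical in-memory store standing in for a database.
ALBUMS = { 1 => "Kind of Blue" }

# Models the branch for one album: the lookup is written once and is
# visible right next to the code that uses it, for both request methods.
def route_album(request_method, id, params = {})
  # Shared step: runs inside the branch, before the method split.
  name = ALBUMS[id]
  return [404, ""] unless name

  case request_method
  when "GET"
    [200, "Edit form for #{name}"]
  when "POST"
    ALBUMS[id] = params.fetch("name")
    [200, "Updated #{name} to #{ALBUMS[id]}"]
  else
    [404, ""]
  end
end

p route_album("GET", 1)
```

With before filters, the lookup would live in a separate declaration keyed to the same paths; here it sits directly above the two handlers that depend on it.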
00:25:17.520 Roda outperforms many Ruby web frameworks. Ruby does not exactly have a reputation for performance, so one might say Roda is just one of the fastest turtles. Still, it significantly outpaces both Rails and Sinatra. For a simple 'Hello, World!' app with a single route, Roda performs approximately two and a half times faster than Sinatra due to its lower overhead. Such benchmarks do not capture real-world performance, but in practice I have also observed my production applications running faster on Roda than on either Rails or Sinatra. The performance difference depends on the specific actions involved, with simpler actions yielding the most impressive speed-ups.
00:26:43.400 My integration tests also sped up significantly, with identical Rack test-based integration tests running 50% faster with Roda compared to Sinatra, and doubling in speed after transitioning from Rails. In terms of memory usage, Roda consumes about 10 MB less than Sinatra simply by requiring the library. In real-world applications, I have only noted a decrease of 1-2 MB in memory usage compared to Sinatra, but on my largest application, the transition from Rails to Roda resulted in a memory usage drop from 15 MB to around 8 MB per Unicorn worker process.
00:28:16.800 Similarly, the second-largest app saw a decrease from 100 MB to 60 MB per Unicorn worker process. Although my largest app is around the size of an average Rails app with roughly 200 routes, this does not indicate whether Roda's approach will scale. I decided to investigate how well Roda would perform in an application with a large number of routes. You might be familiar with the 'c10k problem' posited by Dan Kegel in 2001, stating that web servers should efficiently manage 10,000 simultaneous clients.
00:29:56.960 I'm proposing the 'r10k problem', which states that web frameworks should handle 10,000 routes efficiently. I wrote a code generator that produces web applications with 10, 100, 1,000, and 10,000 routes for Roda, Rails, and Sinatra and conducted benchmarks. For 10 routes, the generator creates routes from 'a' to 'j' using a single segment per request path. For 100 routes, the generator creates routes from 'aa' to 'jj' with two segments per request path.
00:31:05.760 This follows the same pattern for 1,000 routes using three segments and 10,000 routes using four segments. Looking at the comparison of Roda, Rails, and Sinatra at 10, 100, and 1,000 routes, we see runtime results for 20,000 requests. Note that these results use the Rack API directly, thereby excluding any web server overhead. Roda and Rails show no significant performance drop as the number of routes increases, because Roda uses a routing tree while Rails employs finite automata for request routing. Since Sinatra iterates over an array of possible routes, its performance declines linearly in relation to the growing number of routes.
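The route layout produced by the generator can be sketched as follows. This is not the r10k generator itself: `generate_paths` is a hypothetical helper, and it assumes that each segment is a single letter from 'a' to 'j' separated by slashes.

```ruby
LETTERS = ("a".."j").to_a

# Build every path with the given number of single-letter segments:
# 1 level gives 10 paths, 2 levels give 100, and so on up to 10,000.
def generate_paths(levels)
  LETTERS.repeated_permutation(levels).map do |segments|
    "/" + segments.join("/")
  end
end

(1..4).each do |levels|
  paths = generate_paths(levels)
  puts "#{paths.length} routes, from #{paths.first} to #{paths.last}"
end
```

Because each additional level multiplies the route count by ten while adding only one segment to each path, this layout separates routers whose cost depends on path depth from those whose cost depends on the total number of routes.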
00:32:45.360 As the number of routes increases to 1,000, Sinatra's performance nearly equals that of Rails. Regardless of the number of routes, Roda remains dramatically faster than both Rails and Sinatra. The picture becomes clearer at 10,000 routes: at this scale, Sinatra performs far worse than Rails, while Roda's runtime remains below 5 seconds. It's important to note that these figures do not account for startup time, which is usually insignificant but becomes very large for Rails at the 10,000 route mark.
00:34:33.720 At that scale, Rails' startup time is significant, comparable to the time needed to serve the 20,000 requests, and a staggering amount of that time is spent loading the routes. This slowdown appears to stem from building the finite automata structures for its router. Memory usage also factors into the comparison: Roda uses less memory than Sinatra and far less than Rails, irrespective of the number of routes. The gap between Roda and Sinatra grows even more pronounced as the number of routes increases.
00:36:09.480 With 10,000 routes, Roda consumes less than half the memory of Sinatra and only about a fifth of the memory of Rails. Indeed, Roda utilizes less memory with 10,000 routes than Rails does with just 10. To keep the benchmarks accountable, the source code will be available in my r10k GitHub repository. I invite anyone to verify my findings and ensure I am conducting fair comparisons regarding Rails and Sinatra.
00:37:38.720 Following the exploration of Roda's design and performance benefits, it’s important to recognize that such advantages come at a potential cost: loss of routing introspection. Because all routing within Roda is conducted at the instance level, introspecting the routes as one might in Rails or Sinatra is not feasible. While this typically doesn't pose a problem, some applications rely on route introspection and must be adapted accordingly.
00:38:41.480 I will now touch upon the history of routing tree web frameworks in Ruby. The first routing tree web framework for Ruby was Rum, which was introduced by Christian Neukirchen in January 2009. Rum was never officially released as a gem; it served primarily as a proof of concept for routing tree structures in Ruby applications. Cuba was devised by Michel Martens in April 2010, initially as a simple wrapper around Rum, incorporating support for Haml templates.