Stuck in the Middle: Leverage the power of Rack Middleware

RailsConf 2016

by Amy Unger

Introduction

In the video "Stuck in the Middle: Leverage the power of Rack Middleware," Amy Unger explores the essential role of Rack middleware in the Rails ecosystem. She reflects on her journey as a Rails developer, emphasizing the importance of understanding middleware to enhance application functionality and troubleshoot effectively.

Key Points

- Understanding Rack and Middleware:

  - Rack acts as an interface between web servers and applications, standardizing the handling of HTTP requests and responses. Instead of adapting to various server implementations, Rack allows developers to write consistent application logic.

  - Rack middleware is a layer that can manipulate both incoming requests and outgoing responses, enabling alterations before they reach the Rails application or server.

- Examples of Middleware:

  - Common middleware functionalities in Rails include serving static files, logging requests, managing session cookies, handling flash messages, and parsing parameters.

  - Notable gems that use middleware include Warden for authentication and Honeybadger for security throttling.

- Creating Custom Middleware:

  - Unger provides a step-by-step guide on writing custom middleware, using a "Ping" example to demonstrate functionality.

  - Middleware must follow the same three rules as a Rack app: respond to `call`, accept the request environment, and return a status, headers, and body.

- Benefits of Middleware:

  - Middleware can simplify applications by offloading routing logic, managing unwanted requests, and providing logging capabilities.

  - It acts as a means for code-sharing, allowing functionalities to be wrapped into reusable gems.

- Best Practices and Considerations:

  - Proper middleware ordering is critical; the sequence affects how requests are processed and responses logged.

  - Developers should avoid embedding business logic directly in middleware, which complicates debugging and testing.

  - Middleware methods should always return a response to prevent errors that disrupt application functioning.

- Thread Safety in Middleware:

  - Unger highlights the importance of thread safety, especially when handling instance variables in middleware.

Conclusion

Through this talk, Amy Unger underscores the power of Rack middleware in Rails applications, encouraging developers to leverage it for improved application design and functionality. By sharing her lessons learned and offering practical examples, she empowers her audience to become proficient in creating and utilizing custom middleware effectively.

00:00:10.759 Good morning, everyone! Thank you for coming to this talk. My name is Amy Unger, and I'm here to discuss a part of the Rails ecosystem that I ignored for quite some time, and I came to regret that. It's easy as a new Rails developer to overlook Rack middleware. I mean, it has words like "Rack" and "middleware," which can be intimidating.

00:00:18.080 That's especially true when you're still getting comfortable with the concepts of models, views, and controllers. As I became a more experienced developer, I ran into gems that relied on middleware and even encountered small pieces of middleware in the codebases I was working in. It’s really easy to focus only on the parts of the middleware that seem relevant to the changes you need to make or the bugs you’re trying to track down. So, it’s pretty easy to assume that there’s not much more to Rack middleware. To be honest, there isn’t. It's designed to be a simple but powerful interface.

00:00:50.680 However, I never took the time to truly understand what was going on. At the beginning of my career, Rack middleware seemed far too advanced for me. Later, I found it to be boring and too obvious. Here’s a secret: I wrote some pretty bad middleware because of that. I wrote middleware that wasn't thread-safe, I didn't push back when others wrote middleware that should never have been written, and I maintained sprawling middleware that was practically unintelligible. Finally, I didn't write middleware when I should have. I simply didn’t know that it was a tool I could use.

00:01:20.520 So today, I want to discuss some things that I wish I had known and some mistakes I made so you don’t have to repeat them. First, I'll explain what Rack middleware is and then go through some examples of how we can build Rack middleware. I’ll cover why you might want to use middleware as a tool. Finally, I’ll conclude with a section titled "Who?" discussing who might have made similar mistakes.

00:01:44.840 Let’s jump right into what Rack and Rack middleware are, look at how they fit into Rails, and then take a brief look at some familiar examples of Rack middleware. So, what is this Rack thing? Let’s consider a world where Rack doesn’t exist and your application runs behind, for instance, a CGI server. In this scenario, your user makes an HTTP request through a browser, and it hits your CGI server.

00:02:31.560 The CGI server takes that request and parses it into different parts of data, shoving those into the environment, which consists of around 20 or 30 environment variables. Some of these should be somewhat familiar: path info, HTTP headers, and so forth. This information is then processed, and your application code writes to standard output, which the CGI server picks up to form an HTTP response and send it back to your user.
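
To make the CGI flow concrete, here is a minimal sketch of a CGI-style handler in Ruby. The method name `handle_cgi_request` and its parameters are hypothetical and exist only for illustration; a real CGI script would read `ENV` and write to `$stdout` directly.

```ruby
require "stringio"

# CGI-style handling: request data arrives as process environment variables,
# and the response (headers, blank line, body) is written to standard output.
# The env and out parameters are assumptions that make the sketch easy to try.
def handle_cgi_request(env = ENV, out = $stdout)
  path = env.fetch("PATH_INFO", "/")
  out.print "Content-Type: text/html\r\n\r\n"
  out.print "<p>You asked for #{path}</p>\n"
end

# Simulate the server having set the variables for one request.
out = StringIO.new
handle_cgi_request({ "REQUEST_METHOD" => "GET", "PATH_INFO" => "/index.html" }, out)
cgi_output = out.string
```

The application code here is tied to how a CGI server delivers requests and collects responses, which is the coupling described above.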

00:02:48.439 Now, what this means is that your application needs to know it's being run by a CGI server. It has to extract those environment variables to figure out, for instance, that someone is requesting index.html. This can be problematic since we often develop with one server and then deploy to production using another. Your application has to adjust to whatever server it’s running on.

00:03:18.640 Now, let’s move into a world with Rack. When we introduce Rack between your server and your application, the situation changes. Your user makes an HTTP request to your server, which could be WEBrick or any other supported server. The server knows it should communicate with Rack.

00:03:35.400 Rack then parses the information it receives from the server into a standard incoming request that is consistent across different servers. This means no matter which server is running your application, your app can write the same logic. When responding, it’s the same process. Your application returns the response in a Rack-compliant way, and Rack figures out how to communicate with the server.

00:04:01.440 To your application, Rack presents an incoming request with an environment hash. Rack took inspiration from CGI as it essentially wrapped those environment variables in a hash, calling it the environment. However, it does not set those variables in the environment; instead, it passes them into your code. The outgoing response from your app includes the status code, headers, and content body.

00:04:47.679 So, let’s examine the simplest Rack app we can create. Rack apps need to follow three rules: we need something we can call, whether it’s a class, an instance method, or the proc we have here. That entity must accept the environment, which is a hash including path info, headers, and all the relevant data. Then, it needs to return an array with the following components: the status, a hash of headers (for instance, returning HTML), and an array of the content body.
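
As a sketch of those three rules, the smallest Rack-style app can be a proc. The environment hash passed in by hand below is a made-up minimal one; a real server supplies many more keys.

```ruby
# A Rack app is anything that responds to `call`, accepts the environment
# hash, and returns [status, headers, body].
app = proc do |env|
  [200, { "content-type" => "text/html" }, ["Hello from #{env['PATH_INFO']}"]]
end

# Calling it directly with a hand-built environment (illustrative only).
status, headers, body = app.call({ "REQUEST_METHOD" => "GET", "PATH_INFO" => "/index.html" })
```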

00:05:14.440 Thus, those are the three rules that need to be followed to qualify as a Rack app. So, what is Rack middleware?

00:05:30.920 If we look at this diagram and zoom in on Rack, we see three parts to working with Rack. Rack took its logo inspiration from a server rack, and I’m going to use that concept to illustrate these three parts. The first part is the handler for your server (like WEBrick, Mongrel, CGI, Puma, etc.). There’s a handler for each.

00:06:00.400 The next part is the adapter at the bottom that communicates with your framework. Initially, the setup is simple; the request just flows through Rack and gets transformed for both the server and the framework. But what if we wish to intervene in that request before it reaches the server or the application?

00:06:19.480 That’s where middleware comes in. Middleware allows us to manipulate the request or response before it exits out the bottom of the stack or comes out the top.

00:06:28.480 To illustrate the power of what Rack middleware can accomplish, let’s look at some examples of middleware that Rails provides. First, Rails uses middleware to serve static files, to set up logging for each request, and to flush all logs at the end of the request. Middleware is also used to set cookies, handle flash messages, and parse params. If you’re familiar with the params used in controllers, that’s handled in middleware.

00:06:44.640 In addition to middleware in the Rails core codebase, other notable gems in the Ruby web app ecosystem use middleware as a strategy. Examples include throttling for security with Honeybadger and authentication with Warden. Now, let’s take a quick look at how we can build our own middleware.

00:07:14.760 First, we'll write a basic middleware. This will serve as a simple ping setup; basically, we’ll ask our application, "Are you up and running?" We’ll create a file in `lib/middleware/ping`, and then we create a class called Ping. To prevent any namespace clashes, we’ll place that class within a module.

00:07:35.240 Now we have the Ping middleware class. We’ll write an `initialize` method, which accepts the app. This app could refer to your Rails app, but it could also refer to another piece of middleware. You can imagine middleware as a set of Russian nesting dolls, each calling down to the smaller one until it hits your application. Essentially, our middleware gets initialized with the next middleware down the stack.

00:08:07.160 The next thing to remember is that any middleware needs to follow the same three rules as an app. It must respond to `call`, accept the environment, and return the status, headers, and content body. Our `call` method is going to resemble a Rails app or any Rack-compliant app, except it will also need to call down the stack.

00:08:25.040 Let’s write the simplest Rack-compliant response—our `call` method here accepts the environment and returns a Rack-compliant response. While this may be cool, it will always respond with "pong." This, of course, isn't what we want to achieve, so let’s fix that.

00:08:58.560 First, we will take a look at the request, parsing out the environment into a more manageable format. The request will have environment variables set, allowing us to see if it's a GET or a POST request, and we can also examine the path info. With this information, we complete the method: if the request path matches the route we want, we respond with a 200 and "pong." If not, we just call down the stack and pass everything down.
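
Putting those steps together, a minimal sketch of the Ping middleware might look like the following. The `Middleware` module name, the `/ping` route, and the stand-in inner app are assumptions. The talk wraps the environment in a request object; this sketch reads the raw `REQUEST_METHOD` and `PATH_INFO` keys so that it runs without any gems.

```ruby
module Middleware
  class Ping
    # `app` is the next thing down the stack: another middleware or the app.
    def initialize(app)
      @app = app
    end

    def call(env)
      if env["REQUEST_METHOD"] == "GET" && env["PATH_INFO"] == "/ping"
        # Answer here; the request never reaches the application.
        [200, { "content-type" => "text/plain" }, ["pong"]]
      else
        # Not our route: call down the stack and pass everything along.
        @app.call(env)
      end
    end
  end
end

# Hypothetical stand-in for the Rails app at the bottom of the stack.
inner_app = ->(_env) { [404, { "content-type" => "text/plain" }, ["not found"]] }
stack = Middleware::Ping.new(inner_app)

ping_response  = stack.call({ "REQUEST_METHOD" => "GET", "PATH_INFO" => "/ping" })
other_response = stack.call({ "REQUEST_METHOD" => "GET", "PATH_INFO" => "/users" })
```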

00:09:31.760 Next, let's explore a more complex middleware example: request response time logging, which is one of the most common middleware functionalities. It’s remarkable how often you don’t have access to nice tools like New Relic for monitoring and find yourself needing to implement your own logging and monitoring.

00:10:01.720 In this case, we're going to track how long it takes your app to fulfill a request to index.html or any route. We'll follow the same pattern we used for the Ping middleware to create a new file, `lib/middleware/request_time_logging`. We'll structure it with a class again and create the `initialize` method, which will carry a reference to the app down the stack.

00:10:36.400 Now we come to the core of this middleware—the `call` method. This method will take in the environment. We know that we must call down the stack for this middleware, as it won’t drop or intercept any requests. We need to record the start time of the call and then compute the elapsed time.

00:11:05.440 However, the issue is we’re currently just returning the elapsed time in seconds, which isn’t a Rack-compliant response. We want to return the status, headers, and body. We’ll save those values when we call down the stack, and once the response comes back, we can return it.

00:11:27.120 Even though our app is functioning well with users hitting various endpoints and getting responses, we still need to log the elapsed time somewhere. To achieve this, we’ll create another method called `log_response_time`. This method will take in the elapsed time and the request, allowing us to create a JSON payload with that data to log it.

00:12:05.760 Logging information like this is most commonly sent to tools like Splunk, so the code that actually ships the payload can be whatever implementation suits your setup. And just like that, our request response time logging middleware is complete.
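
A minimal sketch of the finished middleware, under a few assumptions: the `logger:` option is a hypothetical addition so the payload can be sent anywhere (it defaults to standard output instead of a service like Splunk), the JSON field names are made up, and the monotonic clock is a choice of this sketch, not something the talk specifies.

```ruby
require "json"

module Middleware
  class RequestTimeLogging
    # `logger` is hypothetical: any callable that accepts one string.
    def initialize(app, logger: ->(line) { $stdout.puts(line) })
      @app = app
      @logger = logger
    end

    def call(env)
      started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
      # Always call down the stack; this middleware never intercepts requests.
      status, headers, body = @app.call(env)
      elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
      log_response_time(elapsed, env)
      # Hand the saved response back up the stack unchanged.
      [status, headers, body]
    end

    private

    def log_response_time(elapsed, env)
      payload = {
        method: env["REQUEST_METHOD"],
        path: env["PATH_INFO"],
        elapsed_seconds: elapsed
      }
      @logger.call(JSON.generate(payload))
    end
  end
end

# Usage with a stand-in app and an in-memory log.
lines = []
app = ->(_env) { [200, { "content-type" => "text/plain" }, ["ok"]] }
timed = Middleware::RequestTimeLogging.new(app, logger: ->(line) { lines << line })
response = timed.call({ "REQUEST_METHOD" => "GET", "PATH_INFO" => "/index.html" })
```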

00:12:38.920 Now, let’s quickly review an example of middleware in a gem. This is useful practice for debugging, when you often have to find your way around a larger codebase. A great example is throttling middleware, which can drop requests or handle them efficiently, returning unauthorized responses without putting undue load on your application.

00:13:30.320 Take the GitHub repository for Rack Throttle. The first step when debugging a middleware gem is to locate the core middleware class, the one with the `initialize` and `call` methods. This class often has a different name from the gem; in this case, it's "Limiter." Once you identify it, you will see that its `initialize` method takes the app as well as options that give users flexibility in configuration.

00:14:01.760 The `call` method, of course, takes the environment and runs the relevant logic to determine whether the request is allowed. If it is, it calls down the stack; otherwise, it calls a method indicating that the rate limit has been exceeded, likely returning an unauthorized response with a helpful message. This demonstrates how little code you need to write to create effective Rack middleware.
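
The shape described above can be sketched as follows. This is not Rack Throttle's code: `SimpleLimiter`, its `:max`, `:window`, and `:clock` options, the in-memory counter, and the 429 status are all assumptions chosen to keep the example small.

```ruby
class SimpleLimiter
  def initialize(app, options = {})
    @app = app
    @max = options.fetch(:max, 60)        # requests allowed per window
    @window = options.fetch(:window, 60)  # window length in seconds
    # Hypothetical injectable clock so the behavior is easy to try out.
    @clock = options.fetch(:clock, -> { Time.now.to_i })
    @counts = Hash.new(0)                 # grows without bound; fine for a sketch
    @lock = Mutex.new
  end

  def call(env)
    allowed?(env) ? @app.call(env) : rate_limit_exceeded
  end

  private

  # Count requests per client address within the current time window.
  def allowed?(env)
    bucket = [env["REMOTE_ADDR"], @clock.call / @window]
    @lock.synchronize { (@counts[bucket] += 1) <= @max }
  end

  # Refuse without ever calling down the stack.
  def rate_limit_exceeded
    [429, { "content-type" => "text/plain" }, ["Rate limit exceeded\n"]]
  end
end

app = ->(_env) { [200, { "content-type" => "text/plain" }, ["ok"]] }
limited = SimpleLimiter.new(app, max: 2, window: 60, clock: -> { 0 })
statuses = 3.times.map { limited.call({ "REMOTE_ADDR" => "203.0.113.5" })[0] }
other_client = limited.call({ "REMOTE_ADDR" => "203.0.113.9" })[0]
```

The refused request costs one hash lookup and never touches the application, which is the point of throttling in middleware.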

00:14:41.800 So, why would you want to write Rack middleware? Middleware can simplify your application; it's great for managing requests your app shouldn’t see. For example, at my workplace, Heroku, we used to support a website called add-on.heroku, which acted like a catalog of products. Behind this application sits an admin interface for managing those add-ons, but the product data has since moved.

00:15:19.600 Now, many users still attempt to access those add-on routes, but without any related front-facing functionality. The route files became extensive, requiring hundreds of lines to redirect users seamlessly. To manage this better, we moved some routes into middleware, making the application simpler. This way, when users accessed old routes, they didn't hit the bloated route file, and developers had less to manage.
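
A minimal sketch of moving legacy redirects out of the routes file and into middleware. The class name `LegacyRedirects`, the path, and the target URL are hypothetical; the talk does not show the actual implementation.

```ruby
class LegacyRedirects
  # Hypothetical mapping from retired paths to their new locations.
  DEFAULT_REDIRECTS = {
    "/old-catalog" => "https://example.com/catalog"
  }.freeze

  def initialize(app, redirects = DEFAULT_REDIRECTS)
    @app = app
    @redirects = redirects
  end

  def call(env)
    target = @redirects[env["PATH_INFO"]]
    # Anything that is not a retired path goes to the app untouched.
    return @app.call(env) unless target

    [301, { "location" => target, "content-type" => "text/plain" }, ["Moved to #{target}\n"]]
  end
end

app = ->(_env) { [200, { "content-type" => "text/plain" }, ["app"]] }
stack = LegacyRedirects.new(app)
redirected = stack.call({ "REQUEST_METHOD" => "GET", "PATH_INFO" => "/old-catalog" })
passed_on  = stack.call({ "REQUEST_METHOD" => "GET", "PATH_INFO" => "/dashboard" })
```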

00:15:56.800 Middleware can also protect your application. Continuing on the theme, there are certain requests you don't want to reach your app—especially malicious requests. Middleware can intercept these requests; for instance, throttling examples can be implemented so that requests your application shouldn't see are blocked, thus preserving your server resources.

00:16:34.280 Moreover, middleware can also implement honey pots. Middleware is also well placed for work that needs both the request and the response. Code that runs only before the app sees just the incoming request, and code that runs only afterward sees just the result, while middleware wraps the call and can see both. This is why implementing features like the request/response timer inside middleware is advantageous.

00:17:11.640 Another key advantage of middleware is its ability to serve as a code-sharing mechanism. If you’re hoping to share certain aspects of your code as a gem, middleware provides an effective route for doing so. Users can easily integrate this functionality with minimal setup.

00:17:55.040 Lastly, I want to discuss the things that can trigger your successors to critique your code. First, the order of middleware is important, much like the Russian nesting doll analogy. You cannot fit the largest doll into the smallest one; similarly, the order of middleware in Rack matters critically.

00:18:48.520 Rails provides a handy rake task to show the configured middleware for an application and the order in which they’re executed. If you run `rake middleware` on any modern Rails app, you'll see the output indicating how your Rack middleware will run as requests come in and in reverse order as responses exit.

00:19:25.960 Let's explore how order can create issues, using early returns for static file requests as an example. Rails is set up to respond immediately for static files at the top of the middleware stack, while request IDs and logging are set up below. As a result, static file requests never show up in your logs unless you have alternative monitoring systems or change the order so that logging sits above static file serving.
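
The effect of ordering can be sketched with two toy middleware classes. `StaticFiles` and `RequestLog` are hypothetical stand-ins, not the Rails classes; they only imitate the early return and the logging described above.

```ruby
class StaticFiles
  def initialize(app, files)
    @app = app
    @files = files # hypothetical map of path => file contents
  end

  def call(env)
    content = @files[env["PATH_INFO"]]
    # Early return: nothing below this middleware sees the request.
    return [200, { "content-type" => "text/html" }, [content]] if content

    @app.call(env)
  end
end

class RequestLog
  def initialize(app, seen)
    @app = app
    @seen = seen
  end

  def call(env)
    @seen << env["PATH_INFO"]
    @app.call(env)
  end
end

app = ->(_env) { [200, { "content-type" => "text/html" }, ["dynamic"]] }
files = { "/index.html" => "<h1>Hi</h1>" }

# Static files on top, logging below.
static_first_log = []
static_first = StaticFiles.new(RequestLog.new(app, static_first_log), files)

# Logging on top, static files below.
log_first_log = []
log_first = RequestLog.new(StaticFiles.new(app, files), log_first_log)

["/index.html", "/users"].each do |path|
  env = { "REQUEST_METHOD" => "GET", "PATH_INFO" => path }
  static_first.call(env)
  log_first.call(env)
end

# static_first_log is ["/users"]; log_first_log is ["/index.html", "/users"]
```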

00:20:11.640 Consider the case where we add Warden at the very top of our middleware stack. While that might seem appealing at first, the Warden library documentation states that it must be downstream of the session middleware, because it depends on session variables being set in prior middleware. Thus, we must ensure Warden is lower in the stack to work correctly.

00:20:55.000 As noted earlier, Rack middleware is a fantastic tool to simplify your application by extracting irrelevant parts. However, nothing prevents you from improperly obscuring your entire app through middleware. If that happens, debugging becomes complicated, and it's better to have a clear structure where developers can locate logic.

00:21:34.360 One red flag to watch for when deciding whether to place logic in Rack middleware is modifying request data. If your middleware modifies or overwrites request attributes like post data or request paths, you’re likely heading down a troublesome path. Another warning sign is middleware that takes on business logic, since hunting for bugs across multiple places is hard.

00:22:23.720 You also want to avoid placing business logic in middleware because it will be more challenging to test, leading to issues later on. Be wary, too, of middleware that is aware of your data structures; if it knows too much about your models or data, it couples parts of your application together in unexpected ways.

00:22:58.240 To mitigate these issues, I suggest placing such middleware in `app/middlewares` to make it clear that you're implementing significant logic. This contributes to easier debugging since another developer can quickly search for keywords in the app. If your middleware is particularly hacky, use `app/hacks` or `lib/hacks` as a warning flag; this signals that you aren't comfortable with what you’re doing.

00:23:51.160 Lastly, always ensure that your middleware methods return a response. Remember that you’re in a stack of middleware, and adhering to the three rules of Rack is crucial. If your middleware throws uncaught exceptions, that typically results in a 500 error, which isn’t pretty. You will not receive the friendly Rails error page, as your Ruby code will have errored out.
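
One way to honor that rule is to guard the middleware's own risky work so that a failure there still ends in a Rack response. This is a sketch under assumptions: `GuardedLogging` and its `sink` callable (imagine a client for a remote log service) are hypothetical.

```ruby
class GuardedLogging
  def initialize(app, sink)
    @app = app
    @sink = sink # hypothetical callable that may raise, e.g. on a network error
  end

  def call(env)
    response = @app.call(env)
    begin
      @sink.call("#{env['REQUEST_METHOD']} #{env['PATH_INFO']} -> #{response[0]}")
    rescue StandardError => e
      # A logging failure should not turn a good response into a 500.
      warn "logging failed: #{e.message}"
    end
    # Every path through `call` returns [status, headers, body].
    response
  end
end

app = ->(_env) { [200, { "content-type" => "text/plain" }, ["ok"]] }
broken_sink = ->(_line) { raise IOError, "log service unreachable" }
guarded = GuardedLogging.new(app, broken_sink)
guarded_response = guarded.call({ "REQUEST_METHOD" => "GET", "PATH_INFO" => "/" })
```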

00:24:36.640 Finally, we should discuss thread safety in the context of Rack middleware. This concern is primarily relevant if you're setting instance variables. Such practices typically indicate a more complicated piece of middleware than what I’ve presented today. As an example, we’ll make our ping middleware thread-safe—even though it technically doesn't need to be.

00:25:29.720 To achieve this, we’ll have `call` duplicate the middleware instance and move all the logic that was in the `call` method into a separate method on the copy, conventionally named `_call`, and that’s it!
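
A minimal sketch of that pattern applied to the Ping middleware. The `@path` instance variable and the `/ping` route are assumptions added to show where per-request state ends up. `_call` is left public here because Ruby does not allow a private method to be called on another object such as the result of `dup`.

```ruby
module Middleware
  class Ping
    def initialize(app)
      @app = app
    end

    # Each request works on its own shallow copy of the middleware, so
    # instance variables set while handling it are not shared across threads.
    def call(env)
      dup._call(env)
    end

    def _call(env)
      @path = env["PATH_INFO"] # lives on the copy, not on the shared instance
      if @path == "/ping"
        [200, { "content-type" => "text/plain" }, ["pong"]]
      else
        @app.call(env)
      end
    end
  end
end

app = ->(_env) { [404, { "content-type" => "text/plain" }, ["not found"]] }
shared = Middleware::Ping.new(app)
pong = shared.call({ "REQUEST_METHOD" => "GET", "PATH_INFO" => "/ping" })
shared_has_state = shared.instance_variable_defined?(:@path)
```

After the request, the shared instance still has no `@path`, because only the duplicate was mutated.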

00:26:07.240 To summarize, we’ve reviewed what Rack and Rack middleware are, along with some reasons to utilize middleware in your application. We also discussed useful practices for ensuring you use middleware wisely. I hope this talk has energized you about possibly employing Rack middleware in the future. Thank you!