How HTTP Already Solved All Your Performance Problems 10 Years Ago

00:00:14.910 Hello everyone! Today, I'm going to be talking about HTTP. Earlier, I saw a show of hands regarding how many people write web applications in Ruby, and I wanted to see how many of you have actually read the HTTP specification. Surprisingly, more than I expected raised their hands, about a quarter or so. If you're writing web apps, you really should read it. Here it is, printed out—it’s about a hundred pages. It doesn’t take long to read, and it's not boring like most specifications. It’s an RFC and quite interesting, at least to me, as I’m a bit of a nerd. HTTP started out just as a document transfer protocol, but about ten years ago, it evolved into a full-featured application protocol with many features that significantly improved performance. So, I’ll be discussing some important features that can help make your web app faster and more scalable.

00:01:08.110 Most of my focus will be on the server-side because that’s where we have control. We don’t really have influence over how browsers manage their connections, but their HTTP client implementations have become very efficient. Even older browsers like Internet Explorer 4 supported caching and many other features. You can find my code on GitHub; I have several libraries there. I also work at a company called Absolute Performance, where we provide hardware and application monitoring for various clients, most of whom are large data centers. For instance, in one of our data centers, we handle about 150,000 HTTP requests every minute with just five application servers, a load balancer, and a database node. We achieve this by leveraging the advantages that HTTP offers, which helps reduce the number of requests we have to respond to.

00:02:34.090 To give you a bit of history: Tim Berners-Lee proposed the World Wide Web in 1990 as a set of hyperlinked documents delivered over the Hypertext Transfer Protocol. That first version was later named HTTP 0.9. Development continued informally until 1996, as vendors implemented various features as they saw fit. This led to a wide variety of implementations without any standardization. Around 1996, efforts were made to formalize the features, including the addition of request headers and several new methods, such as HEAD and POST. The result was formalized as HTTP/1.0 in RFC 1945, and concurrently, work began on the first draft of HTTP/1.1, which later became RFC 2068. This version introduced key features such as persistent connections, hierarchical proxies, caching, and virtual hosts, and it was finalized as RFC 2616 in 1999. These enhancements transformed HTTP from a transfer protocol into a robust application protocol, improving the quality of content delivery.

00:04:08.889 The HTTP specification is quite extensive, around 60,000 words, similar in length to a small book. While it may be slightly dry, it is well organized and full of valuable information, which makes it accessible to anyone who wants to understand HTTP better. Two features of HTTP/1.1 that are particularly beneficial for performance are persistent connections, also known as keep-alive connections (which in turn make request pipelining possible), and response caching. Persistent connections let you avoid the overhead of establishing a new TCP/IP connection for every request. Most web pages include multiple resources, such as CSS, JavaScript, and images. Reestablishing a connection for each of them introduces significant overhead, which persistent connections avoid. This functionality is often enabled by default in web servers like Apache and Nginx.

00:05:29.350 When browsing a web page using non-persistent HTTP, the process involves several steps: first, the DNS lookup, followed by a TCP handshake before you can send the GET request and receive the response. Once that’s done, there's a four-way handshake to close the connection. If you need to retrieve another resource, the cycle starts again, resulting in additional overhead. In contrast, with persistent connections, the number of these handshakes is reduced significantly. For instance, if multiple JavaScripts are being loaded from the same server, the overhead savings become clear, leading to faster page loading times and an enhanced user experience.
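As a rough illustration of the savings described above, here is a minimal sketch that counts network round trips for a page's resources. It assumes one round trip per TCP handshake and one per request/response pair, and it ignores DNS, TLS, and connection teardown; the method name round_trips is made up for this example.

```ruby
# Rough count of round trips needed to fetch `resources` files from one host.
# Assumption: 1 round trip per TCP handshake, 1 per request/response.
def round_trips(resources, persistent:)
  # A persistent connection needs at most one handshake for all requests.
  handshakes = persistent ? [resources, 1].min : resources
  handshakes + resources
end

puts round_trips(10, persistent: false) # 20
puts round_trips(10, persistent: true)  # 11
```

In Ruby itself, requests made inside a Net::HTTP.start block share one connection when the server supports keep-alive, so a client written that way gets the lower number.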

00:06:21.060 Furthermore, understanding the two primary goals of caching is crucial for enhancing web application performance. First, we want to avoid making unnecessary requests; if you already have a cached response that hasn’t changed, you can serve it directly, saving the HTTP overhead entirely. Second, we want to enable validation, allowing the server to quickly answer queries about whether cached content is still valid. A validation request still costs a round trip, but when the content has not changed, the server can answer without regenerating or resending the body, which improves performance and reduces server load.

00:07:39.280 Despite the advantages of caching, it presents substantial challenges, particularly cache invalidation, which is famously recognized as one of the two hard problems of computer science along with naming things. This complexity arises from the fact that caches exist throughout numerous layers, over which you often have no control. Deciding when content is stale and needs to be invalidated can become complicated. For example, if a single user updates a blog post, how do you ensure that others see that update or the additional comments? With multiple users and different caches, establishing an effective invalidation strategy can become highly complex. When discussing caching, it's essential to have the correct terminology. HTTP clients are any entities capable of connecting to HTTP servers, while servers are those that accept these connections.

00:09:37.060 Among these entities, certain roles are critical. The user agent, typically a browser or a spider, is the client where the request originates. The origin server is where the resource lives, serving as the authoritative source of the document. Proxies act as intermediaries, functioning as both clients and servers, either providing cached responses or forwarding requests to the origin server. In HTTP/1.1, numerous headers are available to manage caching behavior. These headers dictate how user agents and intermediary caches handle stored content, determining when a resource is considered fresh or stale. The two key controls are the 'Expires' header and the 'max-age' directive of the 'Cache-Control' header, which determine how long cached content stays fresh before it must be revalidated.

00:11:04.780 The 'Expires' header provides a specific date after which a cached resource is considered stale, while max-age specifies how many seconds a resource remains fresh before it needs revalidation. Proper use of these headers can significantly reduce redundant requests. For example, popular hosted APIs like Google's can mark their resources as valid for a year. This means that whenever a page needs a widely used library like jQuery, the browser can use its cached copy without making a new request, which improves load times across all the sites that share that library.
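The freshness rule described above can be sketched from the cache's point of view. This is a simplified illustration, not a complete implementation of the specification: it ignores the 'Age' header, 's-maxage', and heuristic freshness, and the method name fresh? and its parameters are assumptions made for the example.

```ruby
require "time"

# Returns true when a stored response may be served without contacting the
# origin server. `headers` is a plain Hash of response header names to
# strings; `stored_at` is the Time the cache received the response.
def fresh?(headers, stored_at:, now: Time.now)
  cache_control = headers["Cache-Control"].to_s
  if (match = cache_control.match(/max-age=(\d+)/))
    # When both are present, max-age overrides Expires.
    return now - stored_at < match[1].to_i
  end

  expires = headers["Expires"]
  return false unless expires

  now < Time.httpdate(expires)
rescue ArgumentError
  # An unparseable Expires value is treated as already expired.
  false
end

one_year = 365 * 24 * 60 * 60
puts fresh?({ "Cache-Control" => "public, max-age=#{one_year}" }, stored_at: Time.now - 120) # true
```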

00:12:57.590 In Rails applications, you can manage caching by setting the necessary headers manually. The 'Cache-Control: max-age' directive lets developers set an explicit lifetime in seconds. Furthermore, validation using headers like 'Last-Modified' and 'ETag' can enhance efficiency. The 'Last-Modified' header provides a timestamp indicating when a resource was last updated, allowing efficient revalidation queries without regenerating responses. The 'ETag' header, by contrast, lets servers assign a unique identifier to each state of a resource, giving browsers another way to confirm that the cached version is still valid.
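To make the three headers concrete, here is a minimal sketch in plain Ruby that builds them for a response body. The method name caching_headers and its parameters are hypothetical and not part of Rails, and deriving the ETag from an MD5 digest of the body is just one possible choice.

```ruby
require "digest/md5"
require "time"

# Builds caching headers for a response body (illustrative helper only).
def caching_headers(body, last_modified:, max_age:, visibility: "private")
  {
    "Cache-Control" => "#{visibility}, max-age=#{max_age}",
    # HTTP dates are always expressed in GMT.
    "Last-Modified" => last_modified.httpdate,
    # The surrounding double quotes are part of the ETag syntax.
    "ETag" => %("#{Digest::MD5.hexdigest(body)}")
  }
end

headers = caching_headers("<h1>Hello</h1>", last_modified: Time.utc(2024, 1, 1), max_age: 300)
headers.each { |name, value| puts "#{name}: #{value}" }
```

Inside a Rails controller you would normally not build these by hand; helpers such as expires_in and fresh_when set the same headers.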

00:13:51.860 When a browser requests a resource it already has cached, it can include an 'If-Modified-Since' header containing the last modified timestamp. If the resource hasn't changed, the server can respond with a quick '304 Not Modified', meaning no body needs to be sent. This saves time and resources. Similarly, with the 'ETag' header, a browser sends 'If-None-Match' containing the ETag it previously received. If the ETag matches, the server again replies with '304 Not Modified', minimizing response times and bandwidth usage.
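The server side of that exchange can be sketched as follows. The method name respond and the Rack-style [status, headers, body] return value are assumptions chosen for illustration, and the sketch checks 'If-None-Match' first and only falls back to 'If-Modified-Since' when no ETag was sent.

```ruby
require "time"

# Decides between "304 Not Modified" and a full "200 OK" response.
def respond(request_headers, body:, etag:, last_modified:)
  validators = { "ETag" => etag, "Last-Modified" => last_modified.httpdate }

  if (client_etags = request_headers["If-None-Match"])
    # The client may send several ETags, or "*" to match any version.
    candidates = client_etags.split(",").map(&:strip)
    return [304, validators, []] if candidates.include?(etag) || candidates.include?("*")
  elsif (since = request_headers["If-Modified-Since"])
    begin
      # HTTP dates have one-second resolution, so compare whole seconds.
      return [304, validators, []] if last_modified.to_i <= Time.httpdate(since).to_i
    rescue ArgumentError
      # A malformed date is ignored and a full response is sent.
    end
  end

  [200, validators, [body]]
end

status, = respond({ "If-None-Match" => %("v1") }, body: "hi", etag: %("v1"), last_modified: Time.now)
puts status # 304
```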

00:15:47.160 The Rails framework simplifies managing cache freshness with these headers. If developers handle the validation logic correctly, they can significantly reduce unnecessary server load. Additionally, the 'Vary' header plays a critical role: it lists the request headers, such as 'Accept' or 'Authorization', that a response depends on, so that a cached response is reused only for requests where those headers match. When dealing with sensitive data, controlling how caches handle those responses is vital to maintaining user privacy.
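One way to picture what 'Vary' does is as part of the key a cache uses to store a response. The sketch below is an assumption about how a cache might build such a key, not how any particular proxy does it; the method name cache_key is hypothetical, and the special value 'Vary: *' is not handled.

```ruby
# Builds the key under which a cache stores a response. `vary` is the value
# of the response's Vary header, for example "Accept, Accept-Encoding".
def cache_key(method, url, request_headers, vary)
  # Header names are case-insensitive, so normalize them first.
  normalized = request_headers.transform_keys(&:downcase)
  varied = vary.to_s.split(",").map { |name| name.strip.downcase }.reject(&:empty?).sort
  parts = varied.map { |name| "#{name}=#{normalized[name]}" }
  ([method.upcase, url] + parts).join("|")
end

puts cache_key("GET", "/posts/1", { "Accept" => "text/html" }, "Accept")
puts cache_key("GET", "/posts/1", { "Accept" => "application/json" }, "Accept")
```

Because the two requests differ in 'Accept', they produce different keys, so the HTML and JSON representations of the same URL are cached separately.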

00:16:50.390 The 'Cache-Control' header provides a variety of directives for controlling how caches handle stored responses. The 'private' directive specifies that intermediate proxies may not cache the response, while user agents are allowed to do so. The 'public' directive, by contrast, permits caching by intermediary proxies as well. The 'no-cache' directive means that caches can store the response but must revalidate it with the origin server on every request. Lastly, 'no-store' prohibits caching entirely, ensuring the content is never written to disk, which is useful for sensitive content like medical documents.
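The four directives above can be summarized in a small sketch that parses a 'Cache-Control' value and reports what it permits. The method names parse_cache_control and cache_policy are made up for this example, and the policy deliberately covers only the directives mentioned in the talk.

```ruby
# Parses a Cache-Control value into a Hash of directive => value (or true).
def parse_cache_control(value)
  value.to_s.split(",").each_with_object({}) do |part, directives|
    name, arg = part.strip.split("=", 2)
    next if name.nil? || name.empty?

    directives[name.downcase] = arg ? arg.delete('"') : true
  end
end

# Summarizes what the directives allow browsers and proxies to do.
def cache_policy(value)
  d = parse_cache_control(value)
  {
    browser_may_store: !d.key?("no-store"),
    proxy_may_store: !d.key?("no-store") && !d.key?("private"),
    must_revalidate_each_time: d.key?("no-cache")
  }
end

p cache_policy("private, max-age=60")
p cache_policy("no-store")
```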

00:18:32.040 Throughout this discussion, it is important to highlight various factors affecting caching strategies. Proxies can significantly enhance performance by accepting requests and serving cached responses, either from memory or through other strategies. Understanding the differences between a load balancer and caching proxy can aid in making informed architectural decisions for handling traffic efficiently. Load balancers distribute requests among the available application servers based on several algorithms, while caching proxies primarily store the responses and reuse them to alleviate server load.
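The difference between the two roles can be shown with a toy model in which backends are plain Ruby lambdas. Both class names are invented for this sketch, round robin is only one of several possible balancing algorithms, and the proxy uses a fixed time-to-live rather than reading real caching headers.

```ruby
# Load balancer: hands each request to the next backend in turn.
class RoundRobinBalancer
  def initialize(backends)
    @backends = backends.cycle
  end

  def call(path)
    @backends.next.call(path)
  end
end

# Caching proxy: answers from memory while an entry is younger than `ttl`
# seconds; otherwise it forwards upstream and stores the result.
class CachingProxy
  def initialize(upstream, ttl:, clock: -> { Time.now })
    @upstream = upstream
    @ttl = ttl
    @clock = clock
    @store = {}
  end

  def call(path)
    entry = @store[path]
    return entry[:body] if entry && @clock.call - entry[:at] < @ttl

    body = @upstream.call(path)
    @store[path] = { body: body, at: @clock.call }
    body
  end
end

app1 = ->(path) { "app1 rendered #{path}" }
app2 = ->(path) { "app2 rendered #{path}" }
proxy = CachingProxy.new(RoundRobinBalancer.new([app1, app2]), ttl: 60)
puts proxy.call("/posts") # rendered by app1
puts proxy.call("/posts") # same body, served from the proxy's cache
```

The balancer spreads work across servers but every request still reaches one of them; the proxy is what keeps repeated requests from reaching the application at all.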

00:19:49.600 In many scenarios, proxies like Squid and Varnish operate seamlessly with caching strategies and help maintain cache coherence. By ensuring that cache rules are respected, they strike a balance between performance and data integrity. The principles of REST, outlined by Roy Fielding during the same period HTTP was being developed, can also optimize web architecture. While REST is not part of HTTP itself, it provides guidelines that can enhance effective use of HTTP, which, when combined with best practices, leads to more efficient caching strategies. Understanding this interplay between HTTP and REST promotes the design of responsive and efficient web applications.

00:21:47.280 REST emphasizes using hyperlinks and resources effectively, allowing web applications to be stateless. By linking different representations of resources, such as HTML and JSON, applications can efficiently manage state transitions without relying on sessions. This promotes better caching strategies because every state transition inherently invalidates prior cached responses when a resource changes. Ultimately, the goal is to ensure users receive accurate and timely data while minimizing unnecessary requests, enhancing the overall user experience. Implementing these practices can ensure that web applications are more performant, responsive, and equipped to handle high volumes of traffic efficiently.

00:24:51.080 Throughout this talk, I emphasized various caching mechanisms and their role in optimizing web application performance. For anyone looking to dig deeper, I recommend checking out my GitHub for source code and references to the specifications that outline how to effectively implement caching in HTTP. In particular, certain papers encapsulate detailed information on caching strategies. If there are any questions about what I covered or specific implementations, I am here to clarify and assist.

00:28:42.490 To summarize, if you’re using Cache-Control with private, cache behavior will differ based on user agent settings. It’s important to consider implications, especially when scaling applications across multiple servers with caching proxies. Rails produces header defaults that cater to worst-case behavior, but it’s equally necessary to tweak settings for optimal performance according to individual application needs. Understanding these nuances within caching can significantly enhance the efficiency and user experience of your web applications. Thank you for your time! If there are any further questions or discussions about caching or specific implementations, I’m more than happy to engage.

00:30:56.230 Before closing, let’s address any lingering questions or thoughts. I appreciate your engagement, and I hope this presentation has equipped you with valuable insights into HTTP and caching practices that can enhance your web development efforts.