The Little Server That Could

http://www.rubyconf.org.au

Have you ever wondered what dark magic happens when you start up your Ruby server? Let’s explore the mysteries of the web universe by writing a tiny web server in Ruby! Writing a web server lets you dig deeper into the Ruby Standard Library and the Rack interface. You’ll get friendlier with I/O, signal trapping, file handles, and threading. You’ll also explore firsthand the dangers that can be lurking inside your production code, like blocking web requests and shared state with concurrency.

RubyConf AU 2017

00:00:08.920 It's so luxurious to get an introduction. As Liam mentioned, my name is Stella Cotton, and I am really excited to be here in Australia.

00:00:16.039 If you've talked to me yet, I've probably told you how much I love it here. I've seen kangaroos and koalas in the wild, which I've probably also mentioned. It's truly amazing, and I just wanted to say a huge thank you to the organizers for all of their hard work. This is basically my favorite conference, so I'm really grateful to be back.

00:00:30.560 I work as an engineer on the tools team at Heroku. Before we get started, unrelated to my job at Heroku, I would like to ask you a favor. I get really nervous when I'm up here speaking, and I should be drinking water, but with everyone staring at me, I tend to do a weird little dance.

00:00:43.640 So, I'm going to steal a trick from my friend Lily. Every time I grab my water and take a drink, I want everybody to start clapping and cheering, really losing your minds. So let's practice. Hold on, that’s awesome! And just another point of order: I'm probably going to go through the code samples pretty quickly, but I’ll tweet out a link to my slides after the talk in case you want to take a closer look at the code.

00:01:22.439 I have lured you all here today under the pretense that we're going to learn how to write a tiny web server: the little server that could. But the reality is, this talk should be called 'the little server that can't.' If you want to use this little server to do anything in production, you're going to find that it's slower, it's not secure, and in some cases, we're just going to say, 'you know what, we're not even going to implement that.'

00:01:44.719 Also, it's important to note that a lot of the things I'm going to cover are going to be really limited to Unix-like environments. They will be familiar on Mac and Linux, but they won't really apply to Windows machines. So, why do we even care? What can the server do? To talk about what it can do, I want to discuss something about abstractions.

00:02:13.640 As engineers, a really powerful skill is learning when to dig into an abstraction and when to just accept the constraints. If every time you wanted to use a third-party API you had to understand the internals of that API, you'd be wasting your time and sacrificing the value of that API. It's an abstraction away from something that you just don’t really need to know. These abstractions make your code stronger, cleaner, and more efficient. You've probably all worked beside, or been that coworker who just can't stop going down rabbit holes. But if you respect the abstraction, you can actually be quite productive.

00:02:57.920 However, real talk: I came to software late in life, coming from something totally different. I often fight this urge to think of these abstractions as magical, as something completely unknowable. Good abstractions should feel like magic, but sometimes we need this reminder that they're just a tool. Servers are specifically a tool that everybody here is likely using every single day.

00:03:14.440 When you run ‘rails s,’ you don’t need to know what's happening underneath for the most part. The server just starts. It’s powerful and super easy to use as an abstraction. So powerful, in fact, that it can feel like magic. But servers are not magic. If you dig down past that abstraction, you’ll find that it's just code. So, we're going to explore the components of the server.

00:03:48.200 If tonight you go home and open any of the open-source Ruby web servers on GitHub, you're going to see some familiar components. Our server will help us understand what's happening inside our production web servers. What else can this little server do? It will be composed of some fundamental pieces that will help you build a foundation to understand other great developments happening in the Ruby community. For instance, why was garbage collection such a big deal in Ruby 2? Why do we care about Koichi's concurrency model with guilds?

00:04:27.639 Let's start off by talking about what a server really is, specifically today: a web server.

00:04:30.480 It runs on a physical computer, which is somewhat confusingly also called a server. You're probably using one of these common Ruby servers: Unicorn, Puma, WEBrick. People can interact with these servers by visiting a web page, using commands like telnet or curl, and I'll be using 'client' as an umbrella term to talk about all these methods.

00:04:55.440 In a lot of ways, it's just like any other program on your computer. It has code, it lives in a file, and you can run it. Today, our little server is going to be kicked off with just 'ruby server.rb'. But what makes this server different from all the web development code we write inside Sinatra or Rails? First, it's going to communicate directly with the outside world, leveraging the power of the operating system itself. Secondly, this communication happens over a very specific API.

00:05:31.479 This may not sound like a big deal, but let's think about how many web pages are out there today. Any guesses? Five? No, that's too low. It's actually 4.6 billion indexed web pages, as of last November, and that number just goes up every time you check. They're being served from servers all over the world and viewed on various browsers, both desktop and mobile. The fact that all of these clients and servers speak the same standardized language is mind-blowing.

00:06:05.360 So, how is this possible? In my experience, if you ask five developers to come up with a solution to a problem, you'll receive six different solutions. This is the magic of the standards body called the W3C (World Wide Web Consortium), formed in 1994 to create standards that form the open web. They created this API that the entire web uses today to communicate over HTTP, and they established it in a document called RFC 2616.

00:06:49.400 It’s a 175-page document that outlines the HTTP/1.1 API, which we use every day unless you're using HTTP/2, which was discussed yesterday. This document essentially serves as a contract for how we structure web requests and responses, allowing anyone in the world to interact with each other. In simpler terms, it’s like a massive list of tests that you would need to write if you wanted to create a production web server.

00:07:27.320 It’s not just saying that you need to support a web request that looks like a common sample of a GET request. It also defines how all these URLs should return the same web page. The document describes the entire interface between the open web and our applications.

00:08:14.320 I first came across this RFC when I was learning about how a web server worked. I Google-searched 'how does a web server work?' and found a Stack Overflow answer that claimed the only way to learn about web servers was to read this 175-page RFC and implement it yourself.

00:08:47.120 I was frustrated, feeling like the meme of 'how to draw an owl,' because I couldn’t take that specification and turn it into a web server. The reality is that while this information is amazing, it won't really help you build your first web server; it's more beneficial if you're trying to build a production web server, because there you need to respect this contract.

00:09:17.600 However, it won’t necessarily help you understand some of the fundamentals. We will respect the very basic contract so that we can use curl and get a response, but we’ll focus on what's happening underneath.

00:09:44.080 We started off talking about what a server is, but now let’s dig a little deeper. Servers, in a cool way, are all about communication. We can discuss three different ways that they communicate.

00:10:06.040 First, a web server will communicate with the outside world. It operates as a program running on your machine and communicates in defined ways. When you start a program on your machine, Unix creates a little world for your program to run in called a process.

00:10:35.960 The idea of a process is that you can assign variables and change the state inside your program without affecting the global state in your operating system. Like any other program, you can see it running with the ps command. What's different is that this server leverages the power of the operating system to communicate externally.

00:11:05.000 The Ruby standard library conveniently provides a small web server you can run called WEBrick. You don’t have to do all the stuff we're covering today, but it also provides some cool wrappers around the common Unix system calls that enable you to build your own server.

00:11:47.840 This is the basic flow of how our web server is going to work. First, we open up a socket. A socket is a means for processes on your machine or in a system to communicate either with other processes on the same machine or with the outside world. You’ll hear people say that everything in Unix is a file; sockets are no different. They're just a specific kind of file that both servers and clients can read and write to.

00:12:29.760 If you want to see what sockets are running on your machine right now, you can use the netstat command. The operating system identifies these different files by a number called a file descriptor or file handle. Ruby gives us a higher-level abstraction, so we don’t need to track that number, but it's good to know that the operating system uses that to identify the file you’re reading and writing to.
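To make the file descriptor idea concrete, here is a minimal sketch, not code from the talk. It assumes a loopback address and port 0 (which asks the operating system for any free port), and it uses Ruby's `fileno` to reveal the number the kernel uses for a socket.

```ruby
require "socket"

# Assumption for this sketch: bind to loopback and let the OS pick a free port.
listener = TCPServer.new("127.0.0.1", 0)

# Ruby hands us an object, but the kernel identifies the socket by this integer.
descriptor = listener.fileno
puts "listening socket uses file descriptor #{descriptor}"

# The standard streams are files too, conventionally descriptors 0, 1 and 2.
puts "stdin=#{STDIN.fileno} stdout=#{STDOUT.fileno} stderr=#{STDERR.fileno}"

listener.close
```

The descriptor number itself is rarely interesting; what matters is that reads and writes on the Ruby object end up as system calls against that number.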

00:13:05.720 We’re not just opening a socket; we’re opening a very specific socket that must be capable of accepting web traffic. To create a socket for web traffic, we need to choose our addressing format. The two most common formats are Unix and Internet sockets. Some sockets represent processes communicating with each other, while others involve communication with the outside world.

00:14:04.560 Unix sockets may point to a pathname on the file system so that two programs on the same machine can talk. Internet sockets use an internet address, allowing anyone to communicate with that process on your machine. For this reason, we’re going to use an Internet socket so we can receive incoming connections.

00:14:34.080 Next, we need to specify the type of socket. Typically, we hear about two types of sockets: stream and datagram. Stream sockets are like a telephone; they communicate back and forth, and the protocol is TCP. This requires two computers to connect to chat back and forth. Yesterday, a TCP 3-way handshake was discussed, involving the SYN and ACK flags that ensure the client and server both know they're connected before exchanging information.

00:15:07.920 The server will continue sending information over the socket along with a sequence number so the client can track the order of the information sent. This ensures that even if something arrives out of order, the web page will be correctly displayed. The client keeps track of that order through the sequence numbers provided by the server.

00:15:44.160 While TCP ensures the correct order, it is slower to connect. On the flip side, datagram sockets are like a megaphone. They are unidirectional and don't require a handshake, making them very fast. However, the downside is that you cannot guarantee the order of the messages. A real-world example would be multiplayer games or streaming audio.
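For contrast with the stream socket the server will use, here is a small sketch of the datagram style using Ruby's `UDPSocket`. It is not from the talk; the loopback address, port 0 and the two-second timeout are assumptions made so the example finishes on its own.

```ruby
require "socket"

# The receiver binds to an address, but nobody "connects" to it.
receiver = UDPSocket.new
receiver.bind("127.0.0.1", 0) # port 0: let the OS choose a free port
port = receiver.addr[1]

# The sender just shouts a message at that address: no handshake first.
sender = UDPSocket.new
sender.send("ping", 0, "127.0.0.1", port)

# Wait up to two seconds so the sketch cannot hang if the datagram is lost.
ready = IO.select([receiver], nil, nil, 2)
message, _sender_info = receiver.recvfrom(64) if ready
puts "received: #{message.inspect}"

sender.close
receiver.close
```

Nothing here guarantees delivery or ordering; it works reliably on loopback, which is why the sketch can get away with a single send.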

00:16:29.680 Now, for our web server, we’ll use all that we've learned to set up our socket. We will communicate via TCP over a stream socket. Once we set up the socket, we need to give it an IP address to bind to, like generating a telephone number that someone can call into our server from the outside. We will bind the socket to that address so it knows that’s its number, and finally, we tell the socket to listen for incoming communications.
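The three steps just described (create, bind, listen) can be sketched with Ruby's low-level `Socket` class. This is an illustrative sketch rather than the code on the slides: the loopback address, port 0, the backlog of 5 and the `REUSEADDR` option are all assumptions.

```ruby
require "socket"

# 1. Create an Internet (INET) stream socket, which means TCP.
socket = Socket.new(:INET, :STREAM)

# Optional: allow quick restarts on the same address during development.
socket.setsockopt(:SOCKET, :REUSEADDR, true)

# 2. Bind it to an address: the "telephone number" clients will dial.
#    Port 0 asks the OS for any free port; a real server picks a fixed one.
socket.bind(Addrinfo.tcp("127.0.0.1", 0))

# 3. Start listening; 5 is the queue length for not-yet-accepted connections.
socket.listen(5)

bound_port = socket.local_address.ip_port
puts "listening on 127.0.0.1:#{bound_port}"

socket.close # a real server keeps this open for as long as it runs
```

Ruby's `TCPServer.new(host, port)` wraps these same steps in one call, which is why the later sketches use it.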

00:17:26.880 Now we wait. We just listen. We need to add a method where we loop forever, continuously listening for requests. When someone dials our number using curl or visits our website, accept will create a new socket that allows us to converse back and forth with the client. We can’t use the first socket because that one has one job: accepting incoming communication. We need a different socket to avoid mixing up our data.
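Here is a hedged sketch of the accept step. To keep it runnable as a single script, one in-process `TCPSocket` stands in for curl, and the server accepts once instead of looping forever; both are assumptions of the example.

```ruby
require "socket"

# TCPServer.new performs socket + bind + listen in one call.
listener = TCPServer.new("127.0.0.1", 0)
port = listener.addr[1]

# Stand-in for curl or a browser dialing our number.
client = TCPSocket.new("127.0.0.1", port)

# accept returns a brand-new socket dedicated to this one client.
connection = listener.accept
distinct = connection.fileno != listener.fileno
puts "listener fd=#{listener.fileno}, connection fd=#{connection.fileno}"

# A real server would wrap the accept call in `loop do ... end`.
[client, connection, listener].each(&:close)
```

The two different descriptor numbers show the split in responsibilities: the listening socket only takes calls, and each accepted socket carries one conversation.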

00:18:00.400 Now that we have opened the socket and are listening for incoming requests, what comes next? Since it's a web server, we typically return an HTTP response. We will add some application code so that when we run this lovely application, it responds with 'Hello, World!' formatted according to the RFC we discussed earlier. We will write the response back to our socket and close it so that someone else can use it.
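Putting the pieces together, a minimal request and response round trip might look like the sketch below. The helper name `http_response` and the chosen headers are assumptions for illustration, and an in-process client again stands in for curl so the script terminates.

```ruby
require "socket"

# Format a minimal HTTP/1.1 response: status line, headers, blank line, body.
def http_response(body)
  [
    "HTTP/1.1 200 OK",
    "Content-Type: text/plain",
    "Content-Length: #{body.bytesize}",
    "Connection: close",
    "",
    body
  ].join("\r\n")
end

listener = TCPServer.new("127.0.0.1", 0)
port = listener.addr[1]

# Stand-in client sends a bare-bones GET request.
client = TCPSocket.new("127.0.0.1", port)
client.write("GET / HTTP/1.1\r\nHost: localhost\r\n\r\n")

connection = listener.accept
# Read the request line and headers up to the blank line, then ignore them.
while (line = connection.gets) && line != "\r\n"; end

connection.write(http_response("Hello, World!"))
connection.close # closing tells the client the response is complete

reply = client.read
puts reply
client.close
listener.close
```

Reading the request before replying is deliberate: closing a socket that still has unread data can reset the connection and lose the response on some systems.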

00:18:42.720 This is the basic way our tiny server will communicate with the outside world, repeating the process every time a client makes a request, returning our favorite phrase in programming: 'Hello, World!' Now that we’re communicating with the outside world, let’s delve more into how the server interacts with the application code running underneath.

00:19:19.760 We will start by discussing parsers. When we built our tiny web server earlier, we received a request but didn’t do much else with it. We just printed 'Hello, World!' every time. But in reality, users want specific information. A web server uses a parser to comprehend what they are asking.

00:19:39.480 The parser’s job is to take the request, break it down, and utilize the guidelines from RFC 2616 to extract the header, body, and URL. It must operate quickly, accurately, and securely. Even though you’ll find production web servers written in Ruby, parsers are typically written in C. Zed Shaw created a unique parser in Ragel for the Mongrel web server, which is now found in most Ruby web servers.
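To show what "extracting the header, body, and URL" means, here is a deliberately naive parser in plain Ruby. It is a toy under stated assumptions (well-formed input, no chunked bodies, no validation) and has nothing to do with the Ragel-generated parser mentioned above; the method name `parse_request` is hypothetical.

```ruby
# Split a raw HTTP request into its request line, headers and body.
def parse_request(raw)
  head, body = raw.split("\r\n\r\n", 2)
  request_line, *header_lines = head.split("\r\n")
  http_method, path, version = request_line.split(" ", 3)

  headers = header_lines.each_with_object({}) do |line, parsed|
    name, value = line.split(":", 2)
    parsed[name.strip.downcase] = value.to_s.strip
  end

  { method: http_method, path: path, version: version,
    headers: headers, body: body.to_s }
end

raw = "GET /cats?page=2 HTTP/1.1\r\nHost: example.com\r\nAccept: */*\r\n\r\n"
request = parse_request(raw)
puts request.inspect
```

A production parser has to reject malformed or malicious input byte by byte, which is a large part of why the real ones are generated state machines rather than string splits.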

00:20:23.760 I won’t dive too deep into this, but it’s worth noting if you’re digging into Ruby web server production code since most have adopted this parser. As we build our little server that 'can't,' I'm going to avoid building a parser since I'm not proficient in C. Instead, we’ll assume that all users want is 'Hello, World!'

00:20:59.320 Next, let's talk about communication with the application. Rather than hardcoding 'Hello, World!', how can we modify our server to allow any application — like Rails or Sinatra — to plug in? We can achieve this with the magic of Rack.

00:21:32.600 In the Ruby ecosystem, there exists a common interface for all servers and applications to communicate, called Rack. Both Sinatra and Rails utilize this interface, allowing you to swap one server for another. This basic interface requires a Ruby object that responds to a method called 'call,' which takes an environment hash and returns a response status, headers, and a body.

00:22:14.960 This is a super simple, lightweight Rack app that responds to a call with those arguments. Previously, we relied on a basic string, but now our server will invoke 'app.call,' assuming that everything running underneath knows it must return something in that standardized format.
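A sketch of both sides of that contract, written without the rack gem: the app is a lambda that takes an environment hash, and a hypothetical server-side helper, `serialize_response`, turns the returned triplet into HTTP text. The two-key `env` is an assumption; a real server populates many more keys.

```ruby
# Application side: anything that responds to call(env) and returns
# [status, headers, body], where body responds to each.
app = lambda do |_env|
  [200, { "content-type" => "text/plain" }, ["Hello, World!"]]
end

# Server side: turn the triplet into the text that goes over the socket.
def serialize_response(status, headers, body)
  reasons = { 200 => "OK", 404 => "Not Found" }
  response = +"HTTP/1.1 #{status} #{reasons.fetch(status, "Unknown")}\r\n"
  headers.each { |name, value| response << "#{name}: #{value}\r\n" }
  response << "\r\n"
  body.each { |chunk| response << chunk }
  response
end

# A tiny stand-in for the environment a real server would build from the request.
env = { "REQUEST_METHOD" => "GET", "PATH_INFO" => "/" }

status, headers, body = app.call(env)
raw_response = serialize_response(status, headers, body)
puts raw_response
```

Because the server only ever calls `app.call(env)`, the lambda could be replaced by a Sinatra or Rails application without touching the server code.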

00:22:49.400 Despite switching to Sinatra or Rails, you'll still receive the necessary information. Here is a cool GIF of our server running as it communicates back and forth with the client. It's conversing with our Rack application which is really rad. You can even have another client communicating with the server — life is great!

00:23:24.080 However, let’s mix things up a little bit. We’ll return to the Rack application and modify it so that instead of just saying 'Hello, World!', we can make an external API call. For example, we will download a new cat GIF every time you visit your homepage.

00:24:07.920 To illustrate, I've added a 5-second sleep to the process. If you see on the left, you'll have to wait five seconds for the response. But if another visitor happens onto the page simultaneously, one person will wait 10 seconds for their cat GIF, which is unreasonable. Even relatively fast API calls can build up and significantly slow down response time.

00:24:39.320 You can liken this to a grocery store. If you have one cashier checking customers out, as more people line up, the longer each person must wait. Each of these requests will wait for the one ahead. So how can we make it quicker? Similar to adding another cashier, in code, we can fork our process and create a new subprocess to download our cat GIFs.

00:25:16.520 We’ll modify our server so that instead of processing everything in line, we can separate the part that receives requests from the execution. By using the 'fork' command, it will create a new process to handle the request for a cat GIF. This allows the parent process to keep accepting requests without waiting for the GIF call to complete.

00:26:02.840 Now, if you run this, the two processes aren’t blocking each other. The first client can fetch their cat GIF while the second client's request is handled at the same time. They each still wait five seconds for their GIF, but nobody waits ten seconds as before!

00:26:39.520 As a final note, don’t forget to close the parent socket after forking. If you forget, you might end up with an 'address in use' error when trying to start your server because that old process is still hanging around. You can kill those zombie processes using the kill command and checking open files to find the process IDs if you need to.
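The forking approach can be sketched as follows. This is an approximation under assumptions, not the talk's code: a half-second `sleep` stands in for the five-second cat GIF download, two in-process clients stand in for browsers, and the loop runs twice instead of forever. It requires a Unix-like system, since `fork` is unavailable on Windows.

```ruby
require "socket"

delay = 0.5 # stand-in for the slow external API call
listener = TCPServer.new("127.0.0.1", 0)
port = listener.addr[1]

# Two clients arrive at the same moment.
clients = Array.new(2) { TCPSocket.new("127.0.0.1", port) }
started = Process.clock_gettime(Process::CLOCK_MONOTONIC)

child_pids = clients.map do
  connection = listener.accept
  pid = fork do
    listener.close # only the parent accepts new connections
    sleep delay    # the slow work happens in the child
    connection.write("HTTP/1.1 200 OK\r\nContent-Length: 4\r\nConnection: close\r\n\r\nmeow")
    connection.close
    exit!(0) # leave immediately, skipping any at_exit hooks copied from the parent
  end
  connection.close # the parent closes its copy and goes straight back to accept
  pid
end

replies = clients.map(&:read)
child_pids.each { |pid| Process.wait(pid) } # reap children so none linger
elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
puts format("served %d clients in %.2fs", replies.size, elapsed)

clients.each(&:close)
listener.close
```

On an otherwise idle machine the elapsed time should land near one delay rather than two, because both children sleep at the same time. Waiting on each child PID is what keeps finished children from hanging around.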

00:27:26.080 So, back to forking. Each child process we fork is a copy of the parent. If any of these processes want to share information, they can't access each other's memory directly; they'll have to open a Unix socket for communication.
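A small sketch of both halves of that point: a variable changed in the child stays unchanged in the parent, and a Unix socket pair carries a message between them. The variable names and message format are made up for the example, and it needs a Unix-like system.

```ruby
require "socket"

counter = 0
parent_end, child_end = UNIXSocket.pair # two connected Unix sockets

pid = fork do
  parent_end.close
  counter += 1 # changes only the child's copy of the variable
  child_end.write("child counter=#{counter}")
  child_end.close
  exit!(0)
end

child_end.close
message = parent_end.read # read until the child closes its end
Process.wait(pid)
parent_end.close

puts "parent counter=#{counter}, message from child: #{message}"
```

The parent still sees 0 because each process has its own memory; the only way it learns about the child's value is through the socket.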

00:28:09.760 Some may wonder if forking doubles our memory usage since it creates a new process. Not exactly. Unix has an optimization called copy-on-write. Much of the allocated memory is static code that doesn’t change and thus can be shared between parent and child, resulting in a smaller memory footprint. If a portion of that memory needs to change, only that segment is copied.

00:29:00.320 In the past, with Ruby versions before 2.0, we couldn’t utilize this optimization, leading to Ruby's reputation as a memory hog. The issue revolved around garbage collection: the collector marked objects by writing to the objects themselves, so a collection pass touched memory that hadn't otherwise changed and forced it to be copied instead of shared between parent and child processes.

00:29:42.680 However, with Ruby 2.0, a significant change was made to how garbage collection works, which allows us to utilize copy-on-write effectively. If you want to learn more about this improvement, Pat Shaughnessy's blog provides great insights.

00:30:32.720 As we reach the end of this discussion, it's important to note that even though forking allows for parallel processes, those processes may run out of memory if there are many concurrent clients. Threads are a memory-efficient alternative since they share memory space, but there’s a catch with the Global Interpreter Lock (GIL) present in MRI Ruby.

00:31:19.920 The GIL allows only one thread to execute Ruby code at a time, so threads don't run Ruby code in parallel, although they can still overlap while waiting on I/O. The GIL protects the interpreter's own internals, but it doesn't make your code thread-safe: you still need to ensure your code and any gems are thread-safe.
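For comparison with forking, here is a thread-per-connection sketch. It is an illustration under assumptions (in-process clients, a short `sleep` as the slow call, a hypothetical `served` counter) and shows the shared-state caveat: because threads share memory, the counter is guarded with a `Mutex`.

```ruby
require "socket"

delay = 0.2 # stand-in for a slow external call
listener = TCPServer.new("127.0.0.1", 0)
port = listener.addr[1]
clients = Array.new(2) { TCPSocket.new("127.0.0.1", port) }

served = 0       # shared state: every thread sees the same variable
lock = Mutex.new # guards the shared counter

threads = clients.map do
  connection = listener.accept
  # Pass the socket in as an argument so each thread gets its own reference.
  Thread.new(connection) do |conn|
    sleep delay # waiting releases the GIL, so other threads can run
    lock.synchronize { served += 1 }
    conn.write("HTTP/1.1 200 OK\r\nContent-Length: 4\r\nConnection: close\r\n\r\nmeow")
    conn.close
  end
end

threads.each(&:join)
replies = clients.map(&:read)
puts "served #{served} clients on #{threads.size} threads"

clients.each(&:close)
listener.close
```

Unlike the forked children, these threads can all update `served` directly, which is both the memory saving and the hazard.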

00:31:57.080 Lastly, let's talk about communication with our process. We can use signals and traps to do this. Signals are a way of interacting with processes running on your machine, and you can list the supported signals with 'kill -l'. The most common signal is the interrupt signal triggered by pressing Ctrl+C, which instructs programs to shut down.

00:32:40.720 In our little server code, we can use a trap to handle this signal, executing certain code before fully shutting down. This can come in handy during long-running jobs, allowing us to save progress before the process is interrupted.
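A minimal sketch of trapping the interrupt signal. So that the script can run unattended, it sends SIGINT to itself with `Process.kill` instead of waiting for Ctrl+C; that, the flag name and the two-second deadline are assumptions of the example.

```ruby
shutting_down = false

# Install a handler for the interrupt signal; trap returns the previous handler.
previous_handler = Signal.trap("INT") { shutting_down = true }

# Stand-in for pressing Ctrl+C: send SIGINT to our own process.
Process.kill("INT", Process.pid)

# Ruby runs the handler at a safe point, so wait briefly for the flag to flip.
deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + 2
sleep 0.01 until shutting_down || Process.clock_gettime(Process::CLOCK_MONOTONIC) > deadline

puts "cleaning up before exit" if shutting_down

# Put the earlier handler back so later code sees the usual Ctrl+C behavior.
Signal.trap("INT", previous_handler || "DEFAULT")
```

Keeping the handler down to setting a flag is a common choice: the accept loop can check the flag between requests and close its sockets in an orderly way.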

00:33:18.960 In summary, today we’ve learned about what a server is, how it communicates, and we’ve written a little server that really can’t do everything. Hopefully, you’ve gained some insights into Unix tricks that will help you navigate the mysteries of production servers in the future. I’ll post a link to my slides on Twitter.

00:34:06.440 And by the way, I brought some Heroku stickers! If you want a sticker or some other quirky patches, come say hi afterwards. Thank you!