Keynote: A Beginner's Guide to Puma Internals

00:00:15.179 Hello everyone! This keynote was proposed to me three years ago, so I've been waiting three years to give this talk. Today, I want to talk about Puma.

00:00:20.460 If you don't know, I am kind of the main maintainer of the Puma web server project. I've been involved since 2017, which is about five years now. It's something I’ve committed to almost as long as my main gig, which is my Ruby on Rails performance consultancy called Speed Shop.

00:00:32.759 I help make Rails apps faster, more scalable, and assist developers in managing high traffic without pulling their hair out. As I began writing about making Rails apps more scalable, Evan, the original author of Puma, approached me and asked if I could help maintain the Puma project. I said yes.

00:00:58.860 This is my plug: If you want me to continue maintaining Puma or you want to thank me for using Puma in your projects, you can check out my courses and books on Gumroad. But anyway, back to Puma.

00:01:23.759 The Puma project is a Ruby web server designed for parallelism. We used to say 'concurrency,' but now we say 'parallelism.' That’s what’s highlighted at the top of the README. It's the Ruby web server designed for parallel processing.

00:01:36.900 We've been the default web server for Rails for a while now (since around Rails 4.0 or 5.0, when Action Cable was introduced), and since then Puma has become hugely popular. Currently, we are the most downloaded Ruby web server. We've reached nearly 250 million downloads, and it's a true honor to maintain a top 100 gem by download count.

00:02:00.960 I got into open-source software (OSS) contribution as a junior developer two years into my career. At that time, I was the only developer at a startup, meaning I had no one to review my code. So, I started contributing to Rails to get code reviews from Rafael França, who is one of the best Rails engineers in the world, and this was for free.

00:02:25.500 Open-source contribution, particularly to Puma, should be fun, and you should learn a lot from it. I believe Puma is a great project for your contributions, and it's my responsibility as the maintainer to facilitate this.

00:03:00.300 However, my philosophy as a maintainer might come off as lazy. I want other people to help do the work; I'm not the kind of maintainer looking to push out thousands of lines of code every day. My style focuses on building a community of Puma maintainers—an army of enthusiasts to help maintain the Puma project and ensure it continues into the future.

00:03:31.180 So, today I'm here to build that army. I want all of you to contribute to Puma. I know many of you have never contributed to open source before or maybe have only contributed once or twice. Some of you might have your own GitHub repo, but it's personal and you haven't contributed to a big project like Puma.

00:03:56.420 One of the biggest hurdles I hear is the intimidation factor: 'I don’t know what the project does,' or 'How can I contribute if I don’t know anything about it?' Today, I'm here to eliminate that excuse. I firmly believe Puma is a great entry point for contributing to open source.

00:04:21.900 As a neat incentive, if you make the most contributions to a major or minor release of Puma, you can name the release. There's a log message when you start Puma that says a name, and if you have the most contributions, you can set that name, which will be seen by millions for as long as that version is out.

00:04:52.920 I understand that Puma can be complicated; it has around 5000 lines of Ruby and 4000 lines of native extension code. It might not seem massive, but it is complex and uses concepts that you may not be familiar with as a web application developer.

00:05:18.240 In this presentation, I will cover many new concepts quickly at a high level. If you remember just 10% of this, you will know more than I did five years ago when Evan made me the maintainer.

00:05:51.420 This overview is not going to provide you with an in-depth understanding of how Puma works; instead, it’s meant to pique your interest and provide you with a broad overview of what Puma is.

00:06:07.860 First, I want to discuss Puma's design goals and purposes—why it's architected the way it is. Then, we'll cover processes and threads, which are the bare bones of how any web server functions. Finally, I'll walk through some specific bits of code and discuss how to contribute to Puma.

00:06:25.740 Puma was originally designed as the web server for the Rubinius project, which was an early Ruby implementation by Evan that did not have a global VM lock. This allowed Ruby to run in parallel, and so, there was a need for a web server to demonstrate the capabilities of this super-parallel Ruby implementation.

00:07:03.180 Puma was designed to be a 'batteries included' web server—meaning you should be able to start a Puma server and it should work out of the box, with no additional stuff required. Furthermore, the goal was to add less than a millisecond of overhead to every request.

00:07:50.040 Puma aims to stay simple, avoiding a codebase that balloons into tens of thousands of lines. So what exactly is a web server? A web server is an application that accepts connections through a socket and serves HTTP applications over those connections.

00:08:18.160 Sockets may be a new concept for some. They are endpoints for streaming data between clients; they’re identified by a combination of an IP address and a port number. In Puma, we have three types of sockets: TCP, Unix, and SSL sockets.

00:08:33.599 Puma primarily utilizes TCP sockets and Unix socket classes directly from the standard library. When we read input from these sockets, we close them afterwards to tell the client we are done with their request.
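To make the socket idea concrete, here is a minimal sketch using the standard library's TCPServer and TCPSocket. It is not Puma's code: the loopback address, the OS-assigned port, and the one-line "protocol" are assumptions chosen to keep the demo small. It shows a server accepting one connection, reading from it, replying, and closing the socket to signal that it is done.

```ruby
require "socket"

# Port 0 asks the OS for any free port (an assumption for this demo).
server = TCPServer.new("127.0.0.1", 0)
port = server.addr[1]

# Accept a single connection, read one line, reply, then close.
acceptor = Thread.new do
  client = server.accept
  line = client.gets
  client.write("you said: #{line}")
  client.close # closing tells the peer we are finished
end

socket = TCPSocket.new("127.0.0.1", port)
socket.write("hello\n")
reply = socket.read # returns once the server closes its end
socket.close

acceptor.join
server.close
puts reply
```

A Unix socket works the same way from the program's point of view; only the address (a file path instead of a host and port) differs.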

00:09:09.660 What are we sending over the socket? We're sending HTTP, which is an application-layer protocol in the OSI model. Puma speaks only HTTP 1.1. This simple protocol consists of a verb, a path, a protocol version, and headers with their values.
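As an illustration of those parts, the sketch below splits a hand-written HTTP/1.1 request into verb, path, version, and headers. The request text and host name are made up, and the string splitting is only for demonstration; Puma itself parses HTTP in its native extension, not like this.

```ruby
# A made-up request, exactly as it would appear as bytes on the socket.
raw_request = "GET /users?page=2 HTTP/1.1\r\n" \
              "Host: example.com\r\n" \
              "Accept: text/html\r\n" \
              "\r\n"

# A blank line separates the head from the (here empty) body.
head, _body = raw_request.split("\r\n\r\n", 2)
request_line, *header_lines = head.split("\r\n")

# The first line carries the verb, the path, and the protocol version.
verb, path, version = request_line.split(" ", 3)

# Every following line is a "Name: value" header.
headers = header_lines.to_h do |line|
  name, value = line.split(":", 2)
  [name, value.strip]
end

puts verb, path, version
puts headers.inspect
```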

00:09:25.500 Puma is also a Rack server, meaning it serves Rack-compatible applications written in Ruby. Rack applications adhere to a simple, elegant standard: the application is an object that responds to 'call' and returns an array with a status, headers, and a response body.

00:09:47.340 The simplest Rack application is merely a proc, receiving a Rack environment as an argument. The server is responsible for converting this into bytes on a socket that go back to the client. You typically define a Rack application in a 'config.ru' file, which Puma looks for by default.

00:10:14.880 The application is provided with a large environment hash containing numerous keys specified by Rack. This is the input to our function, and the output is that three-value response we discussed.
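A minimal sketch of that contract, with a proc as the application. The two environment keys shown are standard Rack keys, but a real server passes many more; the tiny hash and the '/hi' path here are assumptions for the demo.

```ruby
# The application: any object that responds to `call` with one argument.
app = proc do |env|
  body = "Hello from #{env["PATH_INFO"]}\n"
  # Status, headers, and a body that yields strings.
  [200, { "content-type" => "text/plain" }, [body]]
end

# In a config.ru file, the last line would be: run app
# Here we call it directly with a hand-built, heavily trimmed env hash.
env = { "REQUEST_METHOD" => "GET", "PATH_INFO" => "/hi" }
status, headers, body = app.call(env)

puts status
puts headers.inspect
puts body.join
```

Calling the proc by hand like this is also a handy way to test a Rack application without starting any server.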

00:10:31.680 Puma, like almost all Ruby web servers, operates as a pre-forking server. In this mode, one process (the parent) starts and initializes the application. After booting, it creates several child processes which listen on the same socket and respond to incoming connections.

00:10:47.509 This approach keeps the parent process idle, allowing it to focus on receiving signals, like when you send a Ctrl+C in your terminal. It is the parent process that distributes signals to all child processes.

00:11:07.480 In cluster mode, the parent process creates multiple child processes with the fork system call. For instance, if you specify '-w 4,' the parent will create four child processes that handle requests.
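The pre-forking idea can be sketched in a few lines. This is not Puma's cluster code: the worker count of two, the one-request-per-worker behavior, and the reply text are assumptions to keep the demo short. It relies on `fork`, so it needs a Unix-like system.

```ruby
require "socket"

# The parent opens the listening socket before forking,
# so every child inherits the same socket.
server = TCPServer.new("127.0.0.1", 0)
port = server.addr[1]

worker_pids = 2.times.map do
  fork do
    client = server.accept # each child accepts on the shared socket
    client.write("handled by worker #{Process.pid}\n")
    client.close
    exit!(0) # leave the child without running the parent's exit hooks
  end
end

# Two connections arrive; each is picked up by one of the children.
replies = 2.times.map do
  socket = TCPSocket.new("127.0.0.1", port)
  line = socket.read
  socket.close
  line
end

Process.waitall # the parent reaps its children
server.close
puts replies
```

Notice that the parent never calls `accept` itself; it only starts the workers and waits for them.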

00:11:27.360 Now, let’s establish a mental model: we have a socket linked to multiple child processes. Each child process handles an HTTP connection, transforms it into a Rack environment, evaluates it, and returns the response.

00:11:45.600 Now, let’s talk about the real power of Puma—the thread pool. This is the unique feature that sets Puma apart from other web servers. Each child process has its own thread pool, which can house any number of threads. These threads work on a shared array of tasks.

00:12:22.860 It works like this: work is added to the array, and the threads in the pool execute the tasks sequentially from that array. Even with a Global VM Lock (GVL), Puma can effectively handle a higher volume of requests.
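Here is a minimal sketch of a thread pool. It uses Ruby's built-in Queue as the shared list of work rather than Puma's own ThreadPool class, and the pool size of three and the squaring jobs are assumptions for illustration.

```ruby
queue = Queue.new   # shared list of pending work
results = Queue.new # thread-safe place to collect answers

# Three worker threads, each pulling jobs until it sees nil.
workers = 3.times.map do
  Thread.new do
    while (job = queue.pop)
      results << job.call
    end
  end
end

# Add five jobs, then one nil per worker as a stop signal.
5.times { |i| queue << -> { i * i } }
workers.size.times { queue << nil }
workers.each(&:join)

squares = []
squares << results.pop until results.empty?
squares.sort! # completion order is not guaranteed, so sort before use
puts squares.inspect
```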

00:12:51.539 The GVL allows only one thread to run Ruby code at a time. However, with multiple threads waiting for I/O operations—like database calls or external API requests—we can still achieve efficient parallel performance.
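A small experiment makes this visible. In the sketch below, `sleep` stands in for a database call or API request (an assumption; any blocking I/O behaves similarly because the thread releases the GVL while it waits). Four waits of 0.2 seconds each should finish in roughly the time of one wait rather than the sum of all four.

```ruby
# Stand-in for I/O: the thread gives up the GVL while it waits.
fake_io = -> { sleep 0.2 }

started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
threads = 4.times.map { Thread.new { fake_io.call } }
threads.each(&:join)
elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started

puts format("4 waits of 0.2s took %.2fs in total", elapsed)
```

If the lambda did CPU-bound Ruby work instead of waiting, the threads would take turns holding the GVL and the total time would not shrink this way.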

00:13:29.220 Ruby 3.0 introduced a revolutionary change: Ractors. Although individual Ractors can run in parallel, Puma still runs inside a single Ractor, so it continues to rely on threads, which remains an effective strategy.

00:13:56.760 Even with this limitation, a Puma process running an application that waits on I/O can process twice as many requests as a server with a single thread. If you replace two Unicorn processes with a single Puma process and four threads, you can expect similar throughput with significantly lower memory usage.

00:14:25.440 At this point, we still have our socket connected to the child processes and are prepared to convert HTTP into a Rack environment. Once that Rack environment is ready, it goes to the thread pool where a thread can handle the application.

00:14:56.220 Puma also employs a reactor to buffer requests, ensuring that only complete requests are sent to the thread pool. This prevents slower clients, like those uploading large files, from monopolizing resources.

00:15:30.180 Unlike some frameworks, which require third-party solutions like Nginx or Apache to do the buffering, Puma incorporates the reactor to manage this process internally.
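The buffering idea can be sketched with `IO.select` and a non-blocking read. This is a simplified stand-in and not Puma's Reactor: the socket pair, the 1024-byte read size, the artificial delay, and treating "headers complete" as "request complete" are all assumptions for the demo.

```ruby
require "socket"

client, server_side = UNIXSocket.pair
ready = Queue.new # stands in for the thread pool's work queue

reactor = Thread.new do
  buffer = +""
  # Keep collecting bytes until the blank line that ends the headers.
  until buffer.include?("\r\n\r\n")
    IO.select([server_side]) # sleep until the socket is readable
    chunk = server_side.read_nonblock(1024, exception: false)
    break if chunk.nil? # the client hung up early
    buffer << chunk unless chunk == :wait_readable
  end
  ready << buffer # only now is the request handed over
end

# A slow client that sends its request in two pieces.
client.write("GET / HTTP/1.1\r\nHo")
sleep 0.05
client.write("st: example.com\r\n\r\n")

complete_request = ready.pop
reactor.join
client.close
server_side.close
puts complete_request.inspect
```

While the slow client dawdles, only the single reactor thread is waiting on it; no thread from the pool is tied up until the whole request has arrived.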

00:16:02.460 Now, with the final model, we've established multiple child processes connected to the specified port. Each process has a reactor that takes the raw socket data, transforms it into a complete Rack environment, and passes it on to one of the many threads in the thread pool.

00:16:32.360 When examining new projects, I like using a tool called 'cloc’ to count lines of code. It can be helpful to identify large files, which often house crucial functionality within the codebase.

00:17:02.580 In Puma, key parts include the 'ext' directory for native extensions, which consists of C and Java code. These are critical components that allow us to parse HTTP and implement SSL, and it’s essential to have experienced developers review this code.

00:17:36.699 The largest Ruby classes within the Puma project are the server, cluster, worker, runner, and thread pool. They serve as the circulatory system of Puma, determining how it operates in terms of request handling.

00:18:02.040 When creating a Puma instance, we establish a Runner, and the subclass we pick determines whether we're running in cluster or single mode. In cluster mode, multiple worker processes handle incoming requests.

00:18:27.360 The server object manages threads and is designed to listen to sockets. Each thread manages connections independently in a loop until a request is received.

00:18:51.240 At this stage, the input socket is processed, and the request is handed off to the thread pool. Each thread pulls a task off the work queue, calls the application, and writes the HTTP response back to the client.

00:19:18.900 In summary, from establishing the initial connection to sending out the response, Puma works comprehensively to manage tasks efficiently using every thread within its configured environment.

00:19:45.000 Now, let’s discuss contributing to Puma. In our repository, we have a file called CONTRIBUTING.md, which serves as our guide to contribution. Right at the top, it invites newcomers to book time with me for assistance.

00:20:12.000 If you need even just 30 minutes for help, I would be more than happy to gather on Zoom to assist you in getting started with your contributions to Puma.

00:20:44.680 Additionally, if scheduling doesn't work out, please use GitHub Discussions to ask your questions. Setting up Puma locally is a straightforward process.

00:21:13.440 You'll want to clone the repository, install the Ragel dependency, and compile the extensions using Bundler. Once that's done, you can use your local Puma copy to serve applications.

00:21:52.020 It's important, though, not to claim an issue just to avoid duplicated work. Rather, work on your own code and post a draft pull request. This lets maintainers assist you once they see what you’re working on.

00:22:12.720 Contribution isn't limited to writing new features; it’s also about fixing bugs and improving documentation. Start with issues labeled 'contrib wanted' for defined tasks requiring less context.

00:22:37.620 The ‘needs repro’ label indicates that there's a bug without definitive reproduction steps. Addressing these bugs can provide clarity for other maintainers, easing the response process.

00:23:08.100 Documentation improvements can also drive significant value, helping others learn while reinforcing your understanding of the project.

00:23:29.880 Moreover, I appreciate reviews from anyone involved in the contributing process. Whether you're a maintainer or a newcomer, offering constructive feedback on pull requests can greatly enhance the development process.

00:23:54.780 Finally, fixing reproducible bugs is an excellent place to get started. They involve clear steps and often minimal code changes, making them manageable first contributions.

00:24:19.920 After a few small contributions, you can consider more significant features. Knowing your pace and being cautious about the workload is essential as features often require extensive code contributions.

00:24:44.100 This suggestion is applicable beyond just Puma. It serves as a solid methodology for anyone looking to engage with OSS more broadly. Be proactive in reading issue trackers to find solvable challenges.

00:25:07.850 Lastly, if you're intrigued to dive deeper into contributing to Puma or understanding its underlying concepts better, I recommend visiting workingwithruby.com. There you can find free resources on working with sockets and other relevant topics.

00:25:39.180 Thank you very much for your time, and I hope to see you all on GitHub!

00:26:21.000 Thank you.