Nate Berkopec
Summarized using AI

How Puma Works

by Nate Berkopec

In the presentation titled "How Puma Works" at RubyConf AU 2023, speaker Nate Berkopec, the maintainer of the Puma web server, delves into the inner workings of Puma and its architecture as a pre-forking web server designed for Ruby applications.

Key Points Discussed:

  • Introduction to Puma:

    • Puma is described as a web server and Rack application server optimized for parallelism and efficiency.
    • Since its inception by Evan Phoenix, it has gained immense popularity with over 260 million downloads.
  • Open Source Contributions:

    • Berkopec emphasizes the importance of community contributions, sharing his personal experiences with maintaining open source.
    • He highlights that a collaborative model in open source can lead to better software and less burnout among maintainers.
  • Internal Architecture:

    • The server uses a pre-forking model where a parent process spawns multiple child processes to handle requests.
    • Each process has a thread pool that handles requests concurrently. This improves throughput because Ruby's Global VM Lock (GVL), which otherwise lets only one thread run Ruby code at a time, is released while a thread waits on IO.
  • Components of Puma:

    • Introduction of critical components such as sockets for connections, Rack for handling HTTP applications, and a reactor that buffers requests, enhancing performance during heavy uploads compared to other web servers like Unicorn.
  • Contribution Guidelines:

    • Berkopec encourages new contributors by outlining processes for engaging with the project including reading contribution guidelines, opening draft pull requests, and seeking feedback on issues.
    • He mentions how contributors have previously named Puma versions after personal interests, fostering a sense of community.
  • Educational Resources:

    • Recommends using the Ruby documentation and external resources for anyone looking to improve their understanding of web servers, sockets, and threads.

Conclusion:

The session not only explains how Puma operates under the hood but also inspires individuals to engage with open-source projects. Berkopec aims to cultivate a vibrant community around Puma, making contributions rewarding and accessible, ultimately pushing for more collaborative software development in the Ruby ecosystem.

00:00:00.000 Hey, what does Superfly Birdie's version Sweet Nighter mean? Spoonie, Bard Heaven, common—they all start with 'S'.
00:00:06.060 Birdie's version ends in 'S.' I don't know; I give up. What are these things? Anyone in the crowd?
00:00:20.100 I heard they are all npm modules. Probably, they are all major versions of Puma.
00:00:29.220 You don't actually see Birdie's version anymore when you start a Rails server; it depends on what version you're on. At the time we wrote this script, Birdie's version was 5.6. But what's 6.1? Well, we'll hear that soon.
00:00:43.020 Right, Puma is a web server for Rails. Does anyone know how that works? Let's ask the audience.
00:00:49.500 Who here knows the ins and outs of Puma and web servers? Well, it looks like we have at least one expert here.
00:01:10.280 Actually, Michael, we have the expert here today. The next presenter will explore the internals of how a pre-forking web server like Puma buffers and processes requests, what Rack is, and how Puma uses it to interact with Rails.
00:01:22.380 This is the last talk of the day, and it's going to be really exciting—no extra pressure! And who better to introduce us to these concepts than Nate Berkopec himself, the maintainer of Puma, Ruby's most popular web server.
00:01:41.159 Yes, Nate Berkopec is also the author of 'The Complete Guide to Rails Performance' and 'Sidekiq in Practice,' as well as 'Rails Performance Apocrypha.' He has run Rails performance workshops for hundreds of developers around the world.
00:01:55.079 However, he says it’s challenging to pair program while working in open source and different time zones. That said, he has paired on Puma several times—coincidentally always at conferences.
00:02:07.500 Well, we are all at a conference with Nate, so someone may be lucky enough to pair on some Puma code! According to Nate, you actually don’t need to be lucky; all you have to do is ask.
00:02:21.020 If you didn’t bring your laptop today, you have another day tomorrow. Now, let’s welcome Nate Berkopec to talk about how Puma works.
00:02:44.420 Thank you very much! Today, I'm going to talk about Puma. My day job is running a software consultancy that I call Speed Shop. My job is to make Ruby on Rails applications faster and more scalable.
00:02:57.720 You can find me online—I’m @nateberkopec everywhere. There’s only one of that name, and I'm also on Mastodon, and my website for my blog and everything is speedshop.co.
00:03:04.739 One of the topics I’ll discuss today is how you can give back to open source. One way you could do that is through my work.
00:03:24.720 Here’s where it is—everything's in English right now, but I do have Japanese coming soon. If you happen to be Japanese, there will be Japanese versions of my content released soon.
00:03:30.540 I cover topics related to Sidekiq, Rails performance, and scaling Rails. But today, we're here to talk about Puma.
00:03:36.599 Puma is a web server and Rack application server built for parallelism. This was originally written by Evan Phoenix a long time ago as the web server for his Rubinius Ruby implementation.
00:03:54.720 He wanted to demonstrate that there was no Global VM Lock in Rubinius, and what better way to do so than to build a fully parallel web server? It turns out it’s still actually useful when you have a Global VM lock.
00:04:06.300 Puma is now the most popular Ruby web server, with over 260 million downloads. We get probably 50 to 70 thousand downloads for every version we release.
00:04:14.640 Hopefully, it has been good for all of you here. I have really enjoyed the experience of being the maintainer of Puma for the last six or seven years.
00:04:20.100 I did hear this morning that someone named Samuel Williams was talking about open source.
00:04:28.040 That's cool because, to me, open source is not a competition. I love that! And I'm not saying it is for Samuel; he is actually the nicest guy ever, which is why I wanted to put this slide up.
00:04:36.600 He even has four commits to Puma, which means I owe him four commits because I don't think I have any commits on any of Samuel's projects.
00:04:50.640 One thing that comes to mind when it comes to open source is this extremely famous XKCD comic that I think we're all familiar with, and it relates to a core-js article that was floating around Hacker News recently.
00:05:04.500 This was yet another case of the story you’ve heard a million times about an overworked, underappreciated open source developer.
00:05:19.200 While I’ve heard that story many times, I don’t identify with it at all. I’ve never felt this way when it comes to Puma.
00:05:34.980 I think the solution to this problem, which I will call 'hero culture,' is to recognize that as open source maintainers, we are supported by contributors.
00:05:42.180 We deliver software to everyone, not just to some elite group, but to everyone as equals.
00:05:51.300 I don't believe in this hero culture, and I don’t think the answer is to underpay and undervalue contributors.
00:05:57.600 One thing I checked was the contributor page on GitHub for core-js, and I think you'll notice that there really is just one person on that list.
00:06:06.780 In contrast, Puma has four or five top contributors who are not the original author.
00:06:12.420 I am the maintainer, and I still haven’t surpassed Evan in the number of commits to Puma, even though he hasn’t committed for about five years.
00:06:20.520 This demonstrates that our model of open source can be effective and positive.
00:06:30.300 There are some advantages to this approach. First, we have a lower bus factor. If I get hit by a bus tomorrow, there are several people who could take over my contributions to Puma.
00:06:37.020 Second, it is extremely motivating for me. I love working on software with other people, and knowing I’m not alone fuels my contributions.
00:06:45.420 Third, this collaborative model produces better software, and fourth, it leads to less burnout.
00:06:52.440 So, raise your hand if you've never contributed to open source. Okay, raise your hand if you’ve contributed a few times.
00:07:03.000 Now, raise your hand if you’ve contributed regularly and still do. Okay, great! I want to help move the first two groups into the last category.
00:07:10.740 There are reasons why people might not contribute. With roughly 300 people in this room, it’s important to note that open source is not for everyone.
00:07:22.740 Most people simply don't want to work on code outside of work or may have trouble finding a place to do it.
00:07:32.280 Perhaps it’s just not that important to you. That’s totally fine! But for the remaining 50 of you, if I can encourage just two contributions to Puma, that’s 360 contributions from this room.
00:07:42.120 If I can get one of you to become a super contributor, you could become a new maintainer, effectively cloning myself.
00:07:52.020 This is a mindset that is quite different from how most people who call themselves open source maintainers approach contribution.
00:08:06.240 It’s about creating not just contributors but an entire community that shares the maintenance workload.
00:08:17.460 However, there is a reason why most projects don’t function this way, and that is we often don’t make the experience for new contributors very good.
00:08:25.680 In its best form, open source contribution is fun, easy, and a great way to learn more about software.
00:08:38.160 When you make a contribution to the Rails project, your code will likely be reviewed by some of the best Rails engineers in the world—a fantastic opportunity!
00:08:47.700 So my goal with Puma is to make your contributions fun, easy, and a great way to learn more about software. I wish more projects shared this goal.
00:08:58.080 If you're an open source maintainer here today, this is the flip side of my talk. I want to be like Uncle Sam—I want you to contribute more to open source software.
00:09:07.440 This is my call to action: take a project that you use every day, learn more about it, and start getting into the GitHub issue tracker to make pull requests.
00:09:15.660 But here’s a bonus: if you make enough contributions to Puma, we will actually let you name the next version of Puma.
00:09:23.640 Previous contributors have named Puma versions after their favorite jazz albums, Swedish pastries, and even Malaysian folklore.
00:09:37.380 Puma is pretty complicated, not in size, but it utilizes various components of Ruby that you're probably not familiar with. Most of you may not have thought very deeply about sockets, threads, or processes.
00:09:51.960 This can be intimidating because you're accustomed to writing Rails applications. Part of my goal today is to introduce you to these topics so you can contribute.
00:10:05.880 If you remember 10% of this presentation, you will know so much more about Puma than I did when Evan asked me to start maintaining it six years ago.
00:10:19.600 At that time, I had made just one pull request to Puma and everyone thought, 'Can you maintain this?' I said, 'Sure!'
00:10:31.500 Here’s my outline for today: I'm going to discuss the design goals and purpose of Puma. This will inform how we'll go through the rest of the presentation.
00:10:45.060 I will talk specifically about processes and threads within Puma, provide an overview of the code, and then discuss how to contribute to Puma.
00:10:55.500 Puma's design goals are as follows: number one is to achieve more throughput for the same resources through parallelism, specifically through threads.
00:11:03.780 Secondly, batteries should be included: Puma should work without needing additional tools, servers, proxies, or monitors.
00:11:16.130 This is not a design goal for Unicorn; Unicorn encourages you to use a reverse proxy like Nginx in front of it.
00:11:25.740 Thirdly, we aim to add less than one millisecond of overhead to every request, and fourth, to keep it simple.
00:11:37.080 What is Puma? I said it was a web server—a web server is an application that accepts connections on a socket and serves HTTP applications over those connections.
00:11:48.760 We are a subcategory of a web server called a Rack application server. I'm going to define what Rack is.
00:11:57.900 All Rack servers are web application servers because Rack is a protocol designed to create HTTP applications.
00:12:07.800 Sockets are endpoints for streaming data to and from clients, identified by an IP address and a port.
00:12:16.020 For example, you might say you are listening on all interfaces 0.0.0.0 on port 3000; that's a socket.
00:12:26.360 We create a socket for each new connection to our server, and the socket has file descriptors.
00:12:35.160 Most of the time, you will be using a TCP socket, and we utilize underlying standard Ruby classes for various types of connections.
00:12:44.580 TCPSocket objects are fairly simple; you create one pointing at localhost on port 2000 and read from it to get data.
00:12:55.680 Fortunately, all the internals of that are largely abstracted away from you in Ruby, so we don’t dive too deeply into lower-level details in Puma.
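A minimal sketch of the socket usage described above, using Ruby's standard `socket` library. The helper name `fetch_greeting_over_tcp` is hypothetical, and the sketch listens on an ephemeral port rather than the port 2000 mentioned in the talk, so it cannot collide with a real service.

```ruby
require "socket"

# Hypothetical helper: start a one-shot server, connect to it as a client,
# and return the single line the server sends.
def fetch_greeting_over_tcp
  # Port 0 asks the OS for any free port.
  server = TCPServer.new("127.0.0.1", 0)
  port = server.addr[1]

  # Server side: accept returns a new socket for this one connection.
  server_thread = Thread.new do
    connection = server.accept
    connection.puts "hello from the server"
    connection.close
  end

  # Client side: create a TCPSocket and read from it to get data.
  client = TCPSocket.new("127.0.0.1", port)
  message = client.gets
  client.close

  server_thread.join
  server.close
  message
end

puts fetch_greeting_over_tcp
```

Note that the listening socket and the per-connection socket returned by `accept` are different objects, which matches the point that a socket is created for each new connection.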
00:13:05.640 Next, we serve HTTP on top of that. HTTP is an application-level protocol, and Puma deliberately speaks only HTTP/1.x.
00:13:15.540 We have considered HTTP/2 support a number of times but haven't found the added benefit yet. HTTP/1.1 has the great advantage of being a plain text protocol.
00:13:25.320 It's straightforward to read from the socket and understand what's being transmitted.
00:13:34.740 I discussed how we are a Rack application server—what is a Rack server? It's a type of web server that serves Rack-compatible applications written in Ruby.
00:13:44.520 The idea of being a Rack-compatible application server is that I know nothing about Hanami, but Hanami knows how to act like a Rack application.
00:13:55.380 It will then be possible to serve it with Puma seamlessly. I don’t really interact much with the Rails or Hanami teams, but we aim to support this standard.
00:14:04.580 Rack applications are super simple; they are just objects that respond to 'call' with an argument and return three elements.
00:14:14.160 One is the HTTP status, two is a hash of headers and values that turn into your response headers and values, and the third is the response body.
00:14:22.380 This is a Rack application: it's just four lines of code. You can mount these directly in a Rails application.
00:14:33.180 If you didn’t know, all Rails controller actions are just mini Rack applications.
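The slide itself is not part of the transcript, so the following is only a sketch of what such a Rack application could look like. The class name `HelloApp` and the trimmed-down environment hash are illustrative assumptions; a real server passes a much larger hash.

```ruby
# A Rack application is any object that responds to `call(env)` and returns
# three elements: status, headers, and a body that responds to `each`.
class HelloApp
  def call(env)
    [200, { "content-type" => "text/plain" }, ["Hello from #{env["PATH_INFO"]}"]]
  end
end

# A hand-built, heavily trimmed environment hash (an assumption for this demo).
env = { "REQUEST_METHOD" => "GET", "PATH_INFO" => "/puma" }

status, headers, body = HelloApp.new.call(env)
puts status
puts headers["content-type"]
puts body.join
```

In a `config.ru` file, an application like this would be handed to the server with `run HelloApp.new`.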
00:14:41.760 It's simple to define a Rack application in what's called a Rackup file, which is usually named config.ru.
00:14:52.020 We start by requiring your application with the first line, and then we use the special 'run' method from Rack to denote where our Rack application is located.
00:15:01.860 The Rack server calls the application with the environment hash, which contains extensive information about what the request indicated.
00:15:10.560 This environment hash translates HTTP into a big Ruby object and contains a lot of details regarding the HTTP request.
00:15:19.740 So, Rack servers fundamentally act as interfaces between these Rack-compatible Ruby applications and a socket.
00:15:31.020 That is Puma's basic job: take the port 3000 on localhost, extract the HTTP from it, convert it into a Ruby object, and pass that object in a defined manner to the application.
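The job described above can be sketched as a toy server that handles exactly one request. This is not Puma's code: `serve_one_request` is a hypothetical helper, the environment hash is far from complete, request bodies are ignored, and the reason phrase is hard-coded to `OK`.

```ruby
require "socket"

# Hypothetical helper: accept one connection, turn the HTTP request into a
# Rack-style env hash, call the app, and write the response back.
def serve_one_request(server, app)
  client = server.accept
  request_line = client.gets # e.g. "GET /hello HTTP/1.1"
  http_method, path, _version = request_line.split(" ")

  # Read header lines until the blank line that ends the header section.
  headers = {}
  while (line = client.gets) && line != "\r\n"
    name, value = line.chomp.split(": ", 2)
    headers[name] = value
  end

  # Translate the request into a (very incomplete) environment hash.
  env = { "REQUEST_METHOD" => http_method, "PATH_INFO" => path }
  headers.each { |name, value| env["HTTP_#{name.upcase.tr("-", "_")}"] = value }

  status, response_headers, body = app.call(env)

  # Write the Rack triplet back to the socket as an HTTP/1.1 response.
  client.write "HTTP/1.1 #{status} OK\r\n"
  response_headers.each { |name, value| client.write "#{name}: #{value}\r\n" }
  client.write "connection: close\r\n\r\n"
  body.each { |chunk| client.write chunk }
ensure
  client&.close
end

app = ->(env) { [200, { "content-type" => "text/plain" }, ["You asked for #{env["PATH_INFO"]}"]] }
server = TCPServer.new("127.0.0.1", 0)
server_thread = Thread.new { serve_one_request(server, app) }

client = TCPSocket.new("127.0.0.1", server.addr[1])
client.write "GET /hello HTTP/1.1\r\nHost: localhost\r\n\r\n"
puts client.read
client.close
server_thread.join
server.close
```

Because HTTP/1.1 is plain text, the whole translation is string handling; the part that makes this a Rack server is the `app.call(env)` line in the middle.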
00:15:41.640 At this point, we have a mental model of how Puma works. We've got a socket, a Rack environment, and a Rack application, and all of those pieces work together.
00:15:51.960 Puma is (like most Ruby web servers) a pre-forking web server. This is an older model, working the same way as Unicorn.
00:16:00.420 We start one process—the main process. This main process boots the application, optionally, and calls 'fork,' creating many child processes.
00:16:09.960 These child processes are copies of the main process. They share nothing technically; they receive copies but are not sharing memory.
00:16:19.020 They do share the socket—so while the main process opens up the socket, the child processes listen to the same socket simultaneously.
00:16:29.220 After the app boots, the child process listens on the socket and calls 'accept' to say, 'Hey, I want to pull a request off of the socket!'
00:16:38.420 A common misconception is that the parent process does this; it does not. The parent process isn’t involved with requests or responses.
00:16:50.820 In a pre-forking web server design, the parent process is simply there to provide signals. In Puma, to use this pre-forking design, we call this cluster mode.
00:17:00.840 When you run `puma -w` and provide a number, you are specifying the number of child processes to create.
00:17:10.920 For instance, when you run `puma -w 4`, we create one parent process and then add four additional worker processes.
00:17:21.480 Now we have a more complicated setup: a Puma parent process that doesn’t listen to the socket and several child processes that do.
00:17:32.700 Each child process independently takes HTTP, converts it into Rack, and calls the Rack application with the environment hash.
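A rough sketch of the pre-forking pattern described above, assuming a Unix-like system where `fork` is available. The helper `run_prefork_demo` is hypothetical and is not how Puma's cluster mode is implemented; in particular, the parent acts as the test client here only to keep the demo in one file.

```ruby
require "socket"

# Hypothetical helper: the parent opens one listening socket, forks workers,
# and each worker calls `accept` on the socket it inherited.
def run_prefork_demo(worker_count: 2, requests: 4)
  server = TCPServer.new("127.0.0.1", 0) # opened once, in the parent
  port = server.addr[1]

  worker_pids = worker_count.times.map do
    fork do
      # Child process: exit immediately when the parent signals TERM.
      Signal.trap("TERM") { exit! }
      loop do
        client = server.accept
        client.puts "handled by worker #{Process.pid}"
        client.close
      end
    end
  end

  # The parent never calls accept. It only plays the client for this demo.
  replies = requests.times.map do
    socket = TCPSocket.new("127.0.0.1", port)
    reply = socket.gets.chomp
    socket.close
    reply
  end

  [worker_pids, replies]
ensure
  # The parent's remaining job: signal the workers and reap them.
  worker_pids&.each do |pid|
    Process.kill("TERM", pid)
    Process.wait(pid)
  end
  server&.close
end

pids, replies = run_prefork_demo
puts "parent pid: #{Process.pid}, worker pids: #{pids.inspect}"
puts replies
```

Every reply names a worker PID and never the parent's, which illustrates the point that the parent is not involved in requests or responses.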
00:17:42.420 Now we move on to the internal workings of Puma: the thread pool. Inside every process is a thread pool that can have zero or more threads.
00:17:54.020 This thread pool picks up requests and calls 'app.call'; it’s this multi-threaded approach that makes Puma unique.
00:18:06.060 When you add work to the pool, the threads automatically pick it up. However, thread safety in the way I described isn't how things work internally.
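The idea of a pool whose threads pick up work automatically can be sketched with Ruby's built-in `Queue`. `TinyThreadPool` is a hypothetical, fixed-size illustration and is much simpler than Puma's own thread pool.

```ruby
# Hypothetical minimal thread pool: a Queue of jobs drained by worker threads.
class TinyThreadPool
  def initialize(size)
    @jobs = Queue.new
    @threads = Array.new(size) do
      Thread.new do
        # Queue#pop blocks until work arrives; a nil job means "shut down".
        while (job = @jobs.pop)
          job.call
        end
      end
    end
  end

  # Add work to the pool; whichever thread is idle picks it up.
  def <<(job)
    @jobs << job
    self
  end

  # Send one stop marker per thread, then wait for all of them to finish.
  def shutdown
    @threads.size.times { @jobs << nil }
    @threads.each(&:join)
  end
end

app = ->(env) { [200, {}, ["hello #{env["PATH_INFO"]}"]] }
results = Queue.new
pool = TinyThreadPool.new(3)

%w[/a /b /c /d].each do |path|
  pool << -> { results << app.call({ "PATH_INFO" => path }) }
end

pool.shutdown
puts results.size
```

The results are collected in a `Queue` because it is safe to share between threads; the order in which jobs finish is not guaranteed.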
00:18:19.020 Now, let's discuss how this results in parallel execution, which involves the Global VM lock (GVL). This lock only allows one thread to run Ruby code at any time.
00:18:33.900 It doesn’t mean only one thread runs; rather, one thread is specially given the opportunity to run Ruby code at a time.
00:18:46.380 This setup is similar to a scenario where individuals at a border checkpoint have a specialized machine for stamping passports.
00:18:59.160 For example, they can perform tasks simultaneously until they require access to the machine.
00:19:13.020 Most of the time, Ruby processes wait for IO, like database calls, during which the GVL is released.
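A small experiment can make the point about waiting concrete. In this sketch `sleep` stands in for an IO wait such as a database call, which is an assumption of the demo; the helper name `elapsed_for_concurrent_waits` is hypothetical.

```ruby
# Hypothetical helper: run `count` simulated IO waits on `count` threads and
# return the total wall-clock time in seconds.
def elapsed_for_concurrent_waits(count, seconds)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  # sleep releases the GVL, so the waits overlap instead of queueing up.
  count.times.map { Thread.new { sleep(seconds) } }.each(&:join)
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
end

elapsed = elapsed_for_concurrent_waits(4, 0.2)
puts format("4 threads waiting 0.2s each took %.2fs in total", elapsed)
```

If the waits ran one after another, the total would be about 0.8 seconds; because they overlap, it should land much closer to 0.2 seconds.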
00:19:24.840 Since Ruby 3.0, there is strictly speaking no single global lock anymore; now there’s one VM lock per Ractor.
00:19:36.660 While Puma doesn't use Ractors yet, understanding this change is crucial.
00:19:48.840 Bear in mind that all this is specific to the MRI Ruby implementation; JRuby, TruffleRuby, and Rubinius operate differently.
00:20:02.460 In a typical Puma process, about 50% of the time is spent waiting on IO operations, meaning about four threads can handle approximately twice the number of requests as would one thread.
00:20:16.020 That’s why using Puma with threads matters: it allows you to serve the same volume of requests with less memory.
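The arithmetic behind that estimate can be sketched with an idealized model. This model is an assumption and not something stated in the talk: it supposes that IO waits overlap perfectly and that the GVL is the only bottleneck, so it gives an upper bound rather than a prediction. The method name `max_thread_speedup` is hypothetical.

```ruby
# Idealized upper bound on the throughput gain from threads in one process.
def max_thread_speedup(io_fraction:, threads:)
  cpu_fraction = 1.0 - io_fraction
  # Two ceilings apply: n threads cannot do better than n times one thread,
  # and the GVL serializes Ruby code, capping the gain at 1 / cpu_fraction.
  [threads.to_f, 1.0 / cpu_fraction].min
end

puts max_thread_speedup(io_fraction: 0.5, threads: 1) # prints 1.0
puts max_thread_speedup(io_fraction: 0.5, threads: 4) # prints 2.0
```

With half of each request spent waiting on IO, the ceiling is a twofold gain no matter how many threads are added, which is consistent with the "approximately twice" figure above. Real workloads will usually come in under this bound because of lock contention and uneven request times.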
00:20:29.640 Now, let’s revisit our Puma mental model. We have the Puma parent process and the socket.
00:20:42.420 Now we also have Puma child processes that turn HTTP into Rack, along with threads within each child process.
00:20:55.320 Inside every Puma process, we have a thread pool, and each of those threads is calling the app.call method to execute the Rack application.
00:21:06.960 Moving forward, let's talk about the fancy part of Puma: the reactor. Its job is to buffer requests so only complete requests are sent to the thread pool.
00:21:18.600 The issue with our simple thread pool design is that if someone uploads a massive file while on a slow network, it will bog down the pool.
00:21:31.560 If I handed the request directly to the thread pool, that thread would wait for a long time, and I would lose significant processing capacity.
00:21:44.400 Unicorn lacks a reactor—this is why it's recommended to use a reverse proxy like Nginx or Apache in front of it.
00:21:58.680 If you've got overwhelming upload requests, those can really bring down a Unicorn server.
00:22:12.960 Puma has a reactor that helps mitigate this. It buffers incoming requests to keep the processing capacity high.
00:22:28.560 As we finalize our mental model, we have the socket, child processes, the thread pool, and the reactor, which all work together.
00:22:43.260 The reactor runs an event loop, taking raw socket data, converting it into HTTP, and then passing it to the Ruby code.
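The buffering step can be illustrated without any real sockets. The class below is a hypothetical sketch, not Puma's reactor: it only understands requests whose body length is given by a `Content-Length` header, and a real event loop would feed it with non-blocking reads from many connections.

```ruby
# Hypothetical per-connection buffer: collects raw chunks and reports when a
# full HTTP/1.1 request (headers plus Content-Length body) has arrived.
class BufferedRequest
  HEADER_END = "\r\n\r\n"

  def initialize
    @buffer = String.new(encoding: Encoding::BINARY)
  end

  # Append whatever bytes arrived on the socket this time around.
  def <<(chunk)
    @buffer << chunk.b
    self
  end

  # True once the headers and the whole declared body are buffered.
  def complete?
    header_end = @buffer.index(HEADER_END)
    return false unless header_end

    body_bytes = @buffer.bytesize - (header_end + HEADER_END.bytesize)
    body_bytes >= content_length(@buffer[0, header_end])
  end

  private

  # Requests without a Content-Length header are treated as having no body.
  def content_length(head)
    match = head.match(/^content-length:\s*(\d+)/i)
    match ? match[1].to_i : 0
  end
end

request = BufferedRequest.new
request << "POST /upload HTTP/1.1\r\nContent-Length: 10\r\n\r\n"
puts request.complete? # prints false: the headers arrived, the body has not
request << "12345"
puts request.complete? # prints false: only half of the body is here
request << "67890"
puts request.complete? # prints true: safe to hand to the thread pool
```

Only when `complete?` returns true would the request be passed on, so a slow upload occupies a cheap buffer instead of a worker thread.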
00:22:52.380 As you explore an open source project, I recommend using CLOC, a line of code counting tool.
00:23:04.140 You can analyze the code by file to get a sense of where improvement opportunities are. For example, Puma has about 8,000 lines of code, with most of that written in Ruby.
00:23:16.740 Among the total lines are native language extensions written in C and Java—this includes our HTTP parser.
00:23:29.520 It’s fun to learn about how these native extensions work, and you'll find a folder in the repository for those.
00:23:43.620 Keep in mind we have an HTTP/1.1 parser based on Zed Shaw's original work in Mongrel. You'll see that copyright statement is present in both Unicorn's and Puma's parsers.
00:23:56.460 This shows us how open source often leads to shared solutions across different projects.
00:24:06.960 The beauty of the parser is that it uses a state machine library called Ragel, which we use to define what character patterns to expect.
00:24:19.140 This helps us shape how we expect a header value or header field to be formatted, conforming to HTTP standards in the RFCs.
00:24:30.780 We need help here. I don't write C, and our second main maintainer Greg only knows a little bit, and we're all essentially faking it.
00:24:44.040 If you have experience in C or Java, we absolutely need you to help with Puma!
00:24:55.740 Now, let’s take a quick code tour—the primary Ruby classes fall into various groups.
00:25:05.880 The first group I call the server classes: these are our key management classes.
00:25:17.460 Each of them is about 500 lines of code. The 'server' class is the main class that manages everything.
00:25:30.600 'Cluster' and 'cluster worker' support cluster mode. Then we have configuration and startup classes that are initiated when Puma starts up.
00:25:43.140 The DSL class is crucial for defining how Puma’s configurations work. It lays out all of the configuration options.
00:25:55.740 This is followed by request and response classes, which are generated as requests move through Puma.
00:26:09.360 We have a client object that can issue many requests, creating a new object for every request sent.
00:26:20.520 In short, there are about 12 classes really making a difference in Puma—it's simple once you pause to think about it.
00:26:30.960 It feels complex because perhaps you haven’t thought deeply about threads, processes, and sockets, but the core functionality runs on the order of just a few hundred lines.
00:26:46.320 To maintain Puma, I commit about 15 minutes a day. I don’t spend much time on it during the week; often, I’ll dedicate weekends to it.
00:26:59.520 This pace helps me keep Puma maintainable without feeling overwhelmed. My goal is for contributors to adopt a similar philosophy.
00:27:12.780 Fifteen minutes a day on open source is quite achievable. When you pick a project on GitHub, look for its contributing.md file.
00:27:24.720 If a project does not have this file, that’s a red flag indicating they may not be interested in new contributors.
00:27:39.780 However, if they do have one, it’s worth reading and, while you’re at it, you can even hope to book a call with me if you want to contribute!
00:27:52.080 Remember, it’s always acceptable to communicate with the maintainers about the issues you want to address.
00:28:03.540 When getting into an open source project, the steps remain similar: clone the repository, read contributing.md, install any specified libraries, and compile any required extensions.
00:28:16.680 Additionally, run tests to ensure you have success, either running them locally or through another method.
00:28:29.880 Keep in mind, please don't stake claims on issues. Most times I hear nothing from these claimants.
00:28:42.840 Instead, open draft pull requests, even if they’re just a few lines—you'll demonstrate you are working on something.
00:28:57.300 Not everything depends on pushing new code; many aspects of open source thrive on assessment and quality.
00:29:10.680 Another misconception is that being a hero coder is the only way to contribute; that's not the case.
00:29:24.300 Many contributors handle aspects that are non-code related—such as documentation. No project thinks they have sufficient documentation!
00:29:37.680 Many projects need new documentation, and you can write content from your perspective, helping to clarify usage.
00:29:50.820 Code reviews are also beneficial; I appreciate when people outside the maintainer circle provide input on pull requests.
00:30:04.440 Fixing bugs is often simpler than you’d think. The scoped nature allows for easier testing, and you might find tasks only require a line or two.
00:30:19.200 In Puma, we welcome partially implemented features and backport contributions. Lastly, you could work on features, but I would suggest working on smaller items first.
00:30:31.440 So my final checklist: start with areas labeled for new contributors, reproduce bugs, write documentation, review code, and finally tackle feature development.
00:30:43.560 Also, if you’d like to contribute to Puma but need to familiarize yourself with the concepts, I recommend the workingwithruby.com books by Jesse Storimer.
00:30:58.860 They remain relevant and practical in discussing sockets and threads.
00:31:14.280 As I finish, I want to leave you with this thought: negative comments can be demotivating in the open source community.
00:31:30.720 It’s key to show gratitude to those contributing to open source—expressing thanks can truly uplift people in this space.
00:31:45.120 Thank you for your time, and I hope to see you on GitHub in the Puma issue tracker. If not, feel free to stop by my table for questions.
00:31:57.180 Thank you all very much!