Deploy, Scale and Sleep at Night with JRuby

by Joe Kutner

In the talk "Deploy, Scale and Sleep at Night with JRuby," Joe Kutner, the author of "Deploying with JRuby," explores the advantages of using JRuby for web application deployment and scaling. He starts by drawing a parallel between building with Legos and deploying applications on different architectures. Kutner explains that the traditional architecture used in MRI-based web applications often leads to scalability and reliability issues, likening it to the individual Lego cars that require more resources.

He contrasts this with JRuby's architecture, which operates on the Java Virtual Machine (JVM) and allows for better concurrency by avoiding the Global Interpreter Lock (GIL) found in MRI. Key points he discusses include:

  • Deployment Architecture: He emphasizes the difference between MRI and JRuby deployments, stating that JRuby allows developers to write pure Ruby without being forced to delve into Java code.
  • Sysadmin Considerations: Kutner highlights the importance of sysadmins in the deployment process, noting that many issues arise from coding mistakes that impact production environments.
  • Memory and Resource Usage: The JRuby architecture scales better by using threads rather than individual processes, reducing memory usage and improving resource efficiency.
  • Technologies for Deployment: He reviews three main technologies for deploying JRuby applications: Warbler, Trinidad, and Torquebox. Each has its advantages and disadvantages, catering to different deployment needs:
    • Warbler: Creates a WAR file for easy deployment to Java servers but complicates development environments.
    • Trinidad: Mimics MRI deployment processes while allowing for greater performance through JVM features.
    • Torquebox: A full-stack application server that best utilizes JVM capabilities, supporting clustering and high-availability features.
  • Deployment in the Cloud: Kutner discusses modern cloud deployment solutions for JRuby applications, including Heroku and Engine Yard, showing that multi-faceted deployment options exist today.

In conclusion, Kutner emphasizes that using JRuby can enhance both developer efficiency and sysadmin peace of mind, leading to a healthier working relationship between developers and their operations teams. He encourages attendees to consider the processes in their organizations when choosing deployment strategies and highlights the importance of running tools like JRuby Lint to ease the transition. The ultimate goal is to design a system that supports scalability while reducing the operational burden on sysadmins, fostering a better work-life balance.

00:00:08.519 So our next speaker is Joe Kutner. He lives in one of those flyover states and he's the author of "Deploying with JRuby" and a committer to the Torquebox app server.
00:00:25.039 Thank you, Joe.
00:00:34.559 Alright, so how many of you played with Legos as a kid? Right? Now how many of you play with them as adults too? Alright, I have an excuse: I have a son, and he has just reached the age where he's started to enjoy playing with Legos. So this is a very exciting time in my life.
00:00:50.239 But it's also a very expensive time because of stuff like this. This is a little Lego motor that you hook up to a battery pack. It spins, and then you can build cool stuff out of it. I bought my son one of these kits, and the first thing we built, of course, was a car. Like any young boy, he wanted to build a second car so that we could race them.
00:01:03.880 Having two cars actually turned out to be kind of fun because we experimented with different gear ratios so that my car would always win. I’m very competitive. But eventually, my son wanted to build a third car. Unfortunately, we couldn't because we didn't have any more motors. Effectively, these cars were very expensive. Each new car was a complete copy of the other car, and eventually, we ran out of parts.
00:01:19.840 But my son, in his creativity, had the idea that instead of cars, we would build a train, where a single engine acted as the locomotive, pulling a bunch of lighter weight cars behind it. This allowed us to build more cars and carry more passengers, including the Statue of Liberty, and have more fun.
00:01:39.160 As we were doing this, it occurred to me that the difference between the train and the individual cars was very similar to the difference between deployment architectures of JRuby applications and MRI-based web applications. This difference is what allows both the train and JRuby systems to scale better, perform better, and ultimately be more reliable. That reliability is what helps you sleep at night.
00:01:54.200 So yesterday, Charlie talked about JRuby. I’m not actually going to talk about JRuby today; instead, I’m going to talk about Ruby on the JVM. It's a small difference, right? But I find it’s a big one because when most folks see that letter 'J', their initial concern—maybe even terror—is that they'll have to write Java code, or even worse, XML to get their apps running on the JVM.
00:02:00.080 But it's simply not true. In fact, you can write pure Ruby web applications and run them on the JVM. All the features that JRuby provides, like the Java import and the Java namespace, are really cool tools, and if you want to integrate with Java libraries, that’s great! But they’re completely optional; you don’t have to use them.
00:02:19.120 This is so true that I wrote an entire book on the subject of deploying JRuby applications, and there isn't a single line of Java code or XML in the 200-plus pages. The reason this is possible is that the technologies we use to run and deploy our applications with JRuby do a very good job of shielding us from those uncomfortable parts of the Java ecosystem.
00:02:39.120 As Charlie likes to say, 'I write Java code so that you don't have to.' What that means for you as a developer is that using JRuby at development time doesn't have to be all that different from using MRI. That stuff I just talked about? You still get all that; you still get to write Ruby code the way you love to. There is that startup-time issue, but there are workarounds for it.
00:03:01.320 However, there is someone who will notice this shift from MRI to JRuby. It’s someone who is important in our work lives but who we often neglect. That person is your sysadmin. Now, being a sysadmin is arguably the worst job in this industry. They are on call on weekends, they get phone calls and text messages in the middle of the night, which means they don’t spend much time with their families.
00:03:21.799 Ultimately, they might start drinking, and it’s a downward spiral from there. And the worst part is, when I say we get to sleep at night—that’s the title of my talk—I’m not talking about you; I’m talking about your sysadmin. Stop being so selfish! The worst part about the sysadmin’s job is that they are cleaning up after our mess, the mistakes we made.
00:03:39.040 Very often, we're the ones who wrote the code that had the memory leak or the error condition that brought the system down. But they're the first line of defense. A good example of this is when it's time to deploy new versions of our applications into production. In most organizations, you tell your operations team, 'Hey, go run `cap deploy`,' even though you know there's going to be all this infrastructure shifting around: processes restarting and things reconfiguring.
00:03:59.560 Your sysadmin trusts you. I don’t know why, because we’ve let them down so many times, but they do trust us. When they run `cap deploy`, many times, the first thing that happens is the CPU spikes, and they start running out of memory. The website becomes unavailable for whatever reason, and then they get emails from their boss. Of course, all this is happening on a Friday afternoon when you’re at home spending time with your family.
00:04:17.920 They just want to go home! They may not have families anymore, but they still want to go home. I love sysadmins; I really do. But all these problems they have are the result of us. This slide shows the general architecture of any MRI-based web application, regardless of the server you're using.
00:04:30.839 They all do basically the same thing: handling multiple HTTP requests by putting a proxy in front of a pool of application instances. Each of these application instances has to run in a separate process. Furthermore, any background jobs—scheduled jobs, cron jobs, Resque workers—have to run in their own processes.
00:04:46.160 The first problem this causes is memory growth. To scale this system up, we have to create more processes, and each of these processes contains a complete copy of our application in memory—all its supporting gems, Rails, if you’re using it, the Ruby standard libraries. So each new process is quite heavy, rather like my son’s cars. Ultimately, we end up running out of memory before we maximize our hardware and system's potential.
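To make the memory math concrete (a minimal sketch, not from the talk; the values are hypothetical), a typical Unicorn configuration spells out exactly this process pool:

```ruby
# config/unicorn.rb -- a minimal sketch with hypothetical values. Each
# worker forks with a full copy of the app, its gems, and the stdlib in
# memory, so scaling out means multiplying that footprint.
worker_processes 4    # four full copies of the application
preload_app true      # copy-on-write helps at first, but worker heaps diverge
timeout 30            # kill workers stuck longer than 30 seconds
```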
00:05:06.640 The next problem is that we have to balance these processes. Our proxy has to be aware of which ones are handling long-running requests or which are idle so that it can distribute new requests evenly across the pool. Sometimes, those long-running requests aren’t supposed to be long-running; they're actually stuck processes or zombies.
00:05:29.360 We have to introduce more infrastructure like God and Monit to monitor these things and recycle or restart them if they hit certain memory thresholds or become stuck. It's good that we can restart them one at a time, because if we try to restart them all at once, we quickly overtax the CPU and bring the system to a crawl.
00:05:42.720 We usually have to do rolling restarts. Now, Passenger and Unicorn have made that problem immensely easier. I use Passenger when I deploy MRI applications, and I love it. But let's be clear about what it's doing: it's not solving the problems that this architecture creates; it's just making them easier to deal with. That's an important distinction.
00:06:00.960 Apart from those process management issues, there are other issues. We need to replicate session state across these processes. Each of these processes has its own database connection pool, and having an unbounded number of database connection pools can often defeat the purpose of having a cap on the number of connections.
00:06:20.320 So we introduce more infrastructure; if you're using PostgreSQL, that's probably PgPool, which sits between your application processes and the database to prevent blowing out your connection limit. If we were to analyze each of these problems and dig down to find the root cause, we'd see that it's the Global Interpreter Lock (GIL). This is a mechanism in the MRI runtime that maps multiple Ruby threads to a single kernel thread.
00:06:39.840 This means that only one thread is running at a time, effectively making the runtime single-threaded. There are many folks within the MRI community who want to see the GIL go. In fact, at RubyConf last year in New Orleans, someone asked Matz if we would ever get rid of the GIL. A very good, direct question. Matz's answer was something like, 'No, I don't want it to be that kind of platform.' That was when I finally understood why the GIL is there.
00:07:02.040 It’s not that they’re lazy and don’t want to take it out, but that Matz’s philosophy is that programming should be fun and easy. But concurrency is hard, so Matz’s solution is to eliminate concurrency. For some applications, that may be a fine compromise. Web servers, however, are not among those applications. In a web server, throughput is of the utmost importance.
00:07:20.640 The way we achieve higher levels of throughput is by parallelizing requests. To do that without incurring all the problems I just described, we need a runtime that doesn't have a GIL. That's where the JVM comes in. The JVM maps Ruby threads directly to kernel threads, allowing the operating system to schedule them on multiple cores or processors at the same time.
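A quick way to see the difference (a minimal sketch, not from the talk) is to time the same CPU-bound work across threads; under MRI's GIL the wall-clock time barely improves, while on JRuby the threads run on separate cores:

```ruby
# Time four threads doing CPU-bound work. Runs on both MRI and JRuby;
# only JRuby executes the threads in parallel.
require 'benchmark'

def work
  500_000.times { |i| Math.sqrt(i) }
end

elapsed = Benchmark.realtime do
  4.times.map { Thread.new { work } }.each(&:join)
end
puts "4 threads took #{elapsed.round(2)}s"
```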
00:07:37.440 What this allows us to do is change our architecture to something more like this, where we have a single process with a single application instance, and we handle multiple HTTP requests with threads that run in parallel against that same instance. This is more like the train that my son and I built, where the threads are like the lighter weight train cars.
00:07:51.840 This also solves all the problems I just described in the previous architecture. Memory growth is not an issue because we only have one application instance. We don’t have to balance these threads; when they get stuck, it’s not a big deal. We don’t have to replicate session state because they’re sharing memory, and there’s only one database connection pool.
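For Rails applications of that era (Rails 3.x), serving concurrent requests against a single instance meant enabling threadsafe mode; a sketch, assuming a hypothetical app named MyApp (this became the default in Rails 4):

```ruby
# config/environments/production.rb -- Rails 3.x sketch. Removes the
# global request mutex so multiple threads can run through the same
# application instance concurrently.
MyApp::Application.configure do
  config.threadsafe!
end
```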
00:08:06.600 My argument is that this architecture is more scalable and more reliable. But is it more performant? That's why I brought data. The Torquebox team has done extensive testing, benchmarking Torquebox with real applications and comparing it against Passenger and Unicorn, two MRI servers, as well as Trinidad, another JRuby server.
00:08:27.960 They have a whole slew of data showcasing how great Torquebox is and why it performs so much better. Most of that data comes down to the things that Charlie talked about yesterday. But I picked two graphs that illustrate a key point.
00:08:43.680 In these two graphs, we see that the four servers—two MRI and two JRuby—follow almost exactly two distinct curves. In the top graph, showing throughput measured in requests per second over time, all four servers have about the same level of throughput up to a certain point. That point is defined by the size of their process pool: beyond it, they can no longer serve more requests in parallel.
00:09:05.360 However, the two JRuby servers can continue to create more threads and service more requests. In the bottom graph showing CPU usage, we see something similar: the JRuby servers have consistently lower CPU usage because there's just less going on. Instead of one garbage collector per process, there's a single garbage collector for the whole system.
00:09:20.440 Given all of this, plus the things Charlie discussed yesterday and the features of JRuby, there's still been a nagging question. Charlie calls it the gap in the JRuby story. Yes, JRuby is great—it's powerful and fast—but how do I get my applications into the real world?
00:09:41.440 There are a few reasons this has been a point of confusion. The first is: what do you need to do to your application to get it ready for JRuby? It turns out that's a pretty simple problem to solve. The JRuby-Lint gem—you install it and run `jrlint` against your application—tells you exactly what you need to change.
00:09:59.200 You won't have to write any Java code, but you might have to change some of your dependencies. If you take nothing else away from my talk today, run `jrlint` on your projects. I think you'll find it's not that big of a deal. But beyond getting your code ready, there's still the question of how to handle deployment.
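Backing up to those dependency changes: they usually amount to swapping C-extension gems for JRuby-friendly ones. A sketch of the kind of Gemfile change involved (the gem names here are common examples, not necessarily what the tool will report for your app):

```ruby
# Gemfile -- platform-specific dependencies so the same app runs on both
platforms :ruby do
  gem 'pg'             # native PostgreSQL driver (C extension)
  gem 'therubyracer'   # V8-based JavaScript runtime (C extension)
end

platforms :jruby do
  gem 'activerecord-jdbcpostgresql-adapter'  # JDBC-backed replacement
  gem 'therubyrhino'                         # Rhino-based JS runtime
end
```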
00:10:12.120 There are two reasons that deployment has been confusing. The first is the change in architecture, which I've already described. The second is that deploying JRuby applications offers more options than deploying MRI applications.
00:10:24.680 In the MRI world, regardless of how you’re deploying, you’re essentially doing the same thing: taking code from a repository directory and transferring it onto a production server as loose files, then restarting each of your application instances. In the JRuby world, you have some options.
00:10:42.440 You can package your entire application into a single file that acts as a sort of unit of deployment. You can continue to use Capistrano if you want and push those loose files onto the production server, but the way you restart your applications differs.
00:11:00.080 Beyond that, the way we compose our applications in production starts to change, allowing us to integrate background jobs and scheduled jobs into our applications so they don’t require external infrastructure to run.
00:11:18.640 Furthermore, there are capabilities of JRuby that don't have good analogues in the MRI world, like clustering. Given these different options, there are essentially three different strategies you can use to deploy your JRuby applications, each facilitated by different technologies.
00:11:36.640 These technologies are Warbler, Trinidad, and Torquebox, which I'll talk about today. Each of these technologies has its own characteristics, meaning they each solve different problems.
00:11:54.640 We’ll start with Warbler, which is different from the other two I mentioned because Warbler is not part of your runtime infrastructure. It’s a build tool you install as a gem. Warbler provides you with a set of Rake tasks, the most important of which is the `war` task.
00:12:05.040 You run the `war` task against your Rails or Rack project, and it produces a WAR file. People in the Java world don't need an explanation for what that is. If you're not familiar with WAR files, just think of one as a zip file that follows certain conventions, allowing you to package everything you need to run an application into a single file.
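Warbler's behavior is driven by a small Ruby config file; a minimal sketch (generate a starting point with `warble config` and adjust):

```ruby
# config/warble.rb -- a minimal sketch of Warbler configuration
Warbler::Config.new do |config|
  config.dirs = %w(app config lib public)  # directories packaged into the WAR
  config.webxml.rails.env = 'production'   # environment the container boots
end
```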
00:12:26.480 Once you have that WAR file, you can deploy it in production by simply giving it to your sysadmin. If they’re familiar with Java deployment, they’ll know exactly what to do with it. Most likely, they’ll install either Tomcat or Jetty from their system's package manager, which provides all the init scripts they need to run.
00:12:44.280 Once the server is up and running, they drop that WAR file into a directory that Tomcat listens to. So there's a difference here: an inversion of control between the application and the container.
00:13:02.280 In the MRI world, you run your application, which starts up the container. In the Java or JRuby world, you run your container server and deploy your application to it, or more likely, the server deploys your application for you.
00:13:21.280 This has a number of advantages. For one, we can hot deploy our applications; that is, deploy new versions of WAR files without ever bringing down the server. It’s not quite zero-downtime deployment, but it’s significantly faster.
00:13:34.560 Moreover, we can also deploy multiple applications to a single server, which further reduces the overhead and additional infrastructure we need to run multiple applications. This might encourage us to break our applications down into smaller, more cohesive packages or modules.
00:13:53.320 There are really four main advantages of Warbler. The first is portability: a WAR file is a WAR file, regardless of how or where you deploy it, what operating system you’re using, or what Java server you’re using.
00:14:08.000 The second advantage is security: a WAR file can actually be signed, and if it’s been corrupted or tampered with, these Java servers won’t deploy it. You can also compile your Ruby source code into Java bytecode to obfuscate it if you’re deploying in an untrusted environment.
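That bytecode compilation is a Warbler feature you opt into; a sketch of the relevant line (an excerpt that belongs inside the config block shown earlier):

```ruby
# config/warble.rb (excerpt) -- the 'compiled' feature precompiles .rb
# files to .class files so the WAR ships bytecode rather than Ruby source
config.features = %w(compiled)
```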
00:14:28.840 The third advantage is speed. I mentioned that you can hot deploy these applications, but also the deployment process itself can be expedited because once your WAR file is on the production server, you don't have to run things like `bundle install`—all your dependencies are already there.
00:14:46.840 The fourth advantage, and I think certainly the most important for Warbler, is consistency. It provides high assurance that what you're deploying to test, staging, and production is exactly the same, which is important to a lot of people.
00:15:08.680 Now, there are two disadvantages of Warbler. The first is that it's very difficult to have a development environment that mirrors your production environment. You don't want to generate a new WAR file every time you change a line of code, so you often end up using WEBrick or something similar in development.
00:15:30.640 Then you might find differences when you deploy. The second disadvantage is that we are really toeing the line with Java. I promised there would be no Java code or XML, and there isn't, but we're using this standard Java file format and these Java servers, and if your operations team isn't Java-savvy, they might hate you for this.
00:15:50.560 If that’s the case and they’re already deploying MRI applications, Trinidad may be a better solution. Trinidad is a lightweight JRuby web server, and one of its goals is to feel friendly and familiar to those already deploying MRI applications.
00:16:10.720 You install the Trinidad gem and it provides everything needed to run your server. You run it with `rails s` or `rackup`, just like you would with other similar servers. In production, you deploy to a Trinidad server with Capistrano.
00:16:27.680 It still takes advantage of the inversion of control I mentioned earlier, where the container is deploying the application. However, that inversion becomes somewhat invisible to you, and your Capistrano deployment scripts tend to look a lot like Passenger scripts.
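As a rough sketch of what such a recipe looks like (Capistrano 2-era syntax; the restart task is an assumption that presumes an init script, such as one generated by the trinidad_init_services gem, rather than Trinidad's documented API):

```ruby
# config/deploy.rb -- a hedged sketch of a Trinidad-flavored deploy
set :application, "myapp"
set :repository,  "git@example.com:myapp.git"  # hypothetical repo
set :deploy_to,   "/var/www/myapp"

namespace :deploy do
  task :restart, :roles => :app do
    # Assumption: an init script manages the Trinidad server process
    run "sudo /etc/init.d/trinidad restart"
  end
end
```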
00:16:42.440 One of the other advantages of Trinidad is its rich set of extensions. It provides a number of plugins that you can add to your application, like a job scheduling extension that eliminates dependencies on things like Whenever or cron, which run in their own processes, by running those scheduled jobs as threads within your runtime.
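As a sketch of what a scheduled job looks like with the scheduler extension (treat the constant and method names as assumptions about trinidad_scheduler_extension's DSL rather than its documented API):

```ruby
# A hypothetical sketch: a job that runs inside the Trinidad process as a
# thread, replacing a cron entry or a Whenever schedule
TrinidadScheduler.Cron "0 0 3 * * ?" do  # Quartz cron: 3 AM daily
  def run
    NightlyReport.generate  # hypothetical application code
  end
end
```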
00:17:04.680 The disadvantage of Trinidad, however, is that by hiding some of the uncomfortable parts and making it feel like MRI deployment, we aren't truly maximizing the potential of the JVM. To do that, we need a full-stack solution like Torquebox.
00:17:22.760 Torquebox is a JRuby application server, and the folks at Torquebox have done a great job of making it feel as comfortable and familiar as possible to Rubyists. You install Torquebox as a gem, which provides you with a `torquebox` command to run the server and deploy to it.
00:17:41.440 Torquebox is explicit about the change in production as well. You can deploy to a Torquebox server with Capistrano, or you can deploy as a knob file, which is similar to a WAR file and has most of the same advantages.
00:17:59.960 But the real power of Torquebox comes from its various subsystems. These subsystems are what make it an application server; Torquebox is designed to run any kind of application, not just web applications. Some of these subsystems handle job scheduling, like the Trinidad extension; messaging, to replace Resque; long-running services, to replace the Daemons gem; and Stomplets, which allow you to push messages to a browser.
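For instance, a Resque-style worker becomes a message processor running inside the server (a sketch against the TorqueBox 2.x API; the queue wiring in torquebox.yml or torquebox.rb is omitted):

```ruby
# A sketch of a Torquebox message processor standing in for a Resque worker
class ImageResizer < TorqueBox::Messaging::MessageProcessor
  def on_message(body)
    # body arrives deserialized and runs on a container-managed thread
    resize(body[:image_id])  # hypothetical helper
  end
end
```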
00:18:18.080 The most powerful of the subsystems are the clustering subsystems. When I say cluster, I don’t mean merely a set of servers running in parallel; I mean a set of servers running in parallel, communicating and coordinating with each other.
00:18:39.440 Let’s say your application has a scheduled job that runs every night at midnight. When you deploy to multiple servers, you don’t want that job to run on every node in the cluster; otherwise, you might hit an external service and duplicate the operation, which could lead to problems.
00:18:58.680 So you want only one instance of that job to execute. To keep your cluster homogeneous, you don’t want to configure one node differently from the others, and this is where Torquebox helps. It configures jobs as high-availability singletons.
00:19:15.680 When you create a scheduled job in Torquebox and deploy it to the cluster, when the cluster starts up, one node will identify itself as the master and run that job. If that node goes down, the other nodes will be aware of that, and one will identify itself as the master to continue running the job.
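A sketch of what that looks like with the TorqueBox 2.x configuration DSL (the `singleton` flag is my reading of the docs and should be treated as an assumption):

```ruby
# config/torquebox.rb -- a high-availability singleton scheduled job
class NightlyCrunch
  def run
    # hits the external service exactly once per cluster per night
  end
end

TorqueBox.configure do
  job NightlyCrunch do
    cron '0 0 0 * * ?'  # Quartz cron: midnight every night
    singleton true      # one node runs it; another takes over on failure
  end
end
```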
00:19:34.560 This is just the tip of the iceberg of what Torquebox can do on a larger scale. From my experience using Torquebox, I was involved with a company that was building an analytics engine. We had millions of records in our database, and we needed to crunch that data to discover trends and make predictions.
00:19:57.040 So, we built a system using Torquebox and Ruby code that took this data, decomposed it across multiple nodes in the cluster, performed computations, and then returned the predicted results. We started with just a couple of nodes, but as the business grew to hundreds of millions of records, all we had to do was stand up more Torquebox instances.
00:20:15.480 That’s how we scaled, and it was a very effective approach. There are a few other servers I haven’t talked about just yet, like Mizuno and Puma, which are viable options. Torquebox Lite has been announced but is not ready yet; it’s designed to be a Torquebox server that behaves more like Trinidad.
00:20:31.760 The most common question I get about JRuby deployment is, 'How do I deploy in the cloud?' You know, three or four years ago, that was an issue. But today, there are a number of options. The JRuby team has a community-supported buildpack for Heroku that works very well and has many great features.
00:20:48.800 Engine Yard has had Trinidad as a deployment option for at least a year, and Red Hat's OpenShift platform-as-a-service has good support for Torquebox. If you're looking to deploy a WAR file to the cloud, CloudBees supports it, and you may know them for their support of the Jenkins CI server.
00:21:05.120 They effectively offer a WAR-file-in-the-cloud platform. There are also options like Google App Engine, which don't use any of the technologies I talked about today; they rolled their own deployment tools that use XML, so I don't use them.
00:21:24.440 Ultimately, deciding which of these technologies and strategies is right for you is a matter of the people, processes, and technologies within your organization. So, who are the people you work with? Are they Java people or MRI people? What processes do you use?
00:21:47.560 What level of security—or level of assurance—do you need that what you deploy to test, staging, and production is exactly the same? Because Warbler can provide features to help with that. On the other hand, if you’re doing rapid deployments directly from development to production, you will want your development environment to mirror your production environment, and that’s where Trinidad is a good fit.
00:22:06.520 Finally, what technologies do you already use? If you're using Capistrano with success, it probably makes sense to continue using it. However, if your organization already runs an enterprise-class Java web server, you can easily deploy a WAR file to it without anyone even knowing Ruby code is running inside.
00:22:20.760 If, after thinking about these three aspects, you're still unsure, you can check out my book. I have a discount code here; it's good for somewhere between 1% and 99% off, I don't know exactly. But in the book, I discuss more about the why and when of each strategy.
00:22:41.760 I've spent a lot more time on the how: how to write Puppet scripts to provision dedicated servers to run these various technologies. Oh, I almost forgot: I have a copy of the book I'll put out on a table; feel free to flip through it. Please don’t steal it.
00:22:58.000 In conclusion, no matter which of these JRuby technologies you use, I believe it will ultimately bring you and your sysadmin closer together. Maybe one day the two of you will have little Rubyists running around, and you can play Legos with them.
00:23:13.000 So thank you.