JRuby: Zero to Scale! 🔥

by Charles Oliver Nutter and Thomas E Enebo

In this presentation at RubyConf 2019, Charles Oliver Nutter and Thomas E Enebo delve into JRuby, an alternative Ruby implementation that runs on the Java Virtual Machine (JVM). Designed for compatibility with CRuby, JRuby offers numerous advantages, particularly in terms of performance and scalability.

Key Points Discussed:

  • Getting Started with JRuby:

    • Installing Java (version 8 or higher is recommended) is the first step to using JRuby.
    • JRuby installations mimic those for CRuby, making the transition straightforward.

  • JRuby vs. CRuby:

    • JRuby allows for concurrent execution of Ruby code using native threads, while CRuby is limited by a global interpreter lock (GIL) that reduces performance on multi-threading tasks.
    • A practical micro-benchmark showed substantial improvements in CPU utilization with JRuby compared to CRuby when executing concurrent operations.
  • Performance Improvements:

    • The recent JRuby release (9.2.9) reduced memory usage by about 24% relative to 9.2.7 on a single-controller Rails app and improved startup times.
    • JRuby takes advantage of JVM features such as just-in-time compilation and various garbage collectors, contributing to enhanced performance.
    • Benchmark results highlighted that JRuby achieved about four times the performance in micro-services compared to CRuby, although the performance difference narrows in larger applications like Redmine.
  • Migration and Compatibility Considerations:

    • Migrating applications to JRuby requires assessing C extensions and ensuring thread compatibility.
    • Tools like the JRuby Lint gem help analyze existing codebases for compatibility issues.
  • Scaling and Monitoring:

    • JRuby can consume more memory than CRuby due to JVM resource management, but efficient use of threads can yield better performance outcomes.
    • The talk discussed tools for performance monitoring in JRuby applications, such as VisualVM, JDK Flight Recorder, and Async Profiler, which assist in understanding and improving application behavior.

Conclusion:

Nutter and Enebo encourage developers to consider leveraging JRuby for its enhanced concurrency and performance capabilities, particularly for applications that can benefit from multi-threading. Despite some challenges in startup times and native C extension support, the advantages of JRuby, especially in resource utilization and speed, present a compelling case for adoption.

Overall, the session presents JRuby not just as an alternative to CRuby but as a powerful tool for building scalable applications.

00:00:12.470 All right, thanks for coming today. We know that Aaron Patterson is next door.
00:00:18.030 So you must really want to hear about JRuby. Hi, I'm Tom Enebo and I'm Charles Nutter.
00:00:27.600 We've been working on JRuby for probably more than 15 years. It seems like forever and we work at Red Hat.
00:00:36.600 Which is gracious enough to pay for us to work on JRuby. So if you take away anything from the talk today,
00:00:42.629 it's just that we're another Ruby implementation. We aim to be as compatible with Ruby 2.5.7 as we can.
00:00:48.510 In the next major release of JRuby, we're going to be aiming for 2.7, sometime after 2.7 comes out.
00:00:54.299 We happen to run on top of the Java Virtual Machine, and we do actually have some extra features that you can leverage from Java.
00:01:00.510 But if you never use those, you could just think of this as just being an alternative Ruby implementation.
00:01:06.660 About two weeks ago, we released 9.2.9. We've been spending a lot of time working on memory improvements.
00:01:15.570 From 9.2.7 to 9.2.9, on a single-controller Rails app, we've reduced our memory usage by about 24%.
00:01:21.450 We've also made several improvements to startup time. Pretty much any command line you execute will feel a little bit faster than before.
00:01:28.470 We're always fixing options and the Java module stuff, which Charlie will talk about later.
00:01:34.920 There are three steps to getting started with JRuby.
00:01:41.070 The first step is that you need to have Java installed.
00:01:46.110 So if you type `java -version` and it returns something that's version 8 or higher, you're golden.
00:01:54.090 Although, we recommend versions 8 or 11 because those are the long-term support versions of Java.
00:02:06.960 If it's not installed, use your package manager to install it, or you can go to AdoptOpenJDK and download it yourself.
00:02:12.810 Once you have that, you just use your favorite Ruby manager to install JRuby.
00:02:19.470 This is exactly like using CRuby.
00:02:26.070 Alternatively, if you're on Windows, you might want to go to our website and download our installer.
00:02:32.730 That's sort of the Windows way, but there is no step 3.
00:02:38.130 So once you have Java, this is no different than working with any other Ruby implementation.
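After installation, a quick way to confirm which implementation a given `ruby` command actually runs is to print the engine constants. This is a minimal sketch; the helper name `ruby_summary` is made up for illustration, while `RUBY_ENGINE`, `RUBY_VERSION`, and `JRUBY_VERSION` (defined only on JRuby) are standard constants.

```ruby
# Describe the running Ruby implementation and the language level it targets.
def ruby_summary
  # JRUBY_VERSION only exists on JRuby; fall back to RUBY_VERSION elsewhere.
  engine_version = defined?(JRUBY_VERSION) ? JRUBY_VERSION : RUBY_VERSION
  "#{RUBY_ENGINE} #{engine_version} (Ruby language level #{RUBY_VERSION})"
end

puts ruby_summary
```

On JRuby the engine is reported as `jruby`; on CRuby it is `ruby`.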
00:02:44.940 But why would you want to use JRuby, and what differences are there with CRuby?
00:02:53.880 Probably the biggest feature we have is that we can concurrently execute Ruby with native threads.
00:03:07.380 In contrast, CRuby has a global interpreter lock (GIL). Although it can do stuff across threads,
00:03:12.720 only one thread can execute Ruby code at a time, which means you don't normally see much benefit from it.
00:03:21.450 Here is just a simple micro-benchmark: we create a big array ten times and iterate through it.
00:03:28.709 This is done with one thread. For the second benchmark, for each of the ten times, we walk through the array,
00:03:34.170 create a new thread, and join it to ensure all threads finish.
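The benchmark described here can be sketched roughly as follows. The slide code is not in the transcript, so the array size, the method names, and the use of the standard `Benchmark` module are assumptions.

```ruby
require "benchmark"

# Build a big array and iterate through it once, returning the sum.
# The default size is an illustrative assumption.
def build_and_walk(size = 200_000)
  total = 0
  Array.new(size) { |i| i }.each { |n| total += n }
  total
end

# Ten passes on the current thread.
def serial_seconds(runs = 10)
  Benchmark.realtime { runs.times { build_and_walk } }
end

# Ten passes, each in its own thread; join so every thread finishes.
def threaded_seconds(runs = 10)
  Benchmark.realtime do
    runs.times.map { Thread.new { build_and_walk } }.each(&:join)
  end
end

puts format("one thread:  %.3fs", serial_seconds)
puts format("ten threads: %.3fs", threaded_seconds)
```

Running the same file under CRuby and JRuby should show whether the threaded variant benefits from multiple cores on each implementation.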
00:03:41.760 This is highly contrived, but if we look at the single-threaded run, we see that a single core is pretty much dominated,
00:03:47.310 with all other cores not doing much.
00:03:56.580 Once we switch to the multi-threaded operation, on the top right side for CRuby,
00:04:02.850 you can add up those blue blocks and see similar CPU utilization on the left side.
00:04:09.989 If you look down at the JRuby side, we're not only utilizing all the real cores,
00:04:15.930 but we're also filling up the hyper-threads.
00:04:22.620 On CRuby there's really no change in performance; it takes about the same amount of time.
00:04:28.170 The blue bars indicate that we went from 0.23 to 0.15, which illustrates that threads are indeed helping us.
00:04:35.010 Charlie will cover some more real-world application performance with threads later.
00:04:41.280 Another significant difference is that we're built on top of a virtual machine, so we both have our own teams.
00:04:49.080 Sometimes these people are in both boxes, and we have to write to our hosts.
00:04:56.460 In JRuby, we do some POSIX calls straight down to the operating system in an effort to be more compatible with CRuby.
00:05:01.680 However, for the most part, our host is the JVM itself.
00:05:10.770 MRI doesn't really have any dependencies other than the C compiler.
00:05:16.140 From our perspective, this isn't as good as being built on top of the JVM because
00:05:28.590 they have the control to do anything they want, but they have to do everything.
00:05:36.380 If we consider this from what we get by building on top of the JVM, there are entire teams of people that have been working for decades to continue to make the JVM execute code faster.
00:05:51.660 We get tons of features for free, including multiple garbage collectors.
00:05:56.670 Charlie and I talked about how awesome just-in-time (JIT) compilation is to generate native code.
00:06:02.670 You can profile that code and make it even faster.
00:06:07.980 Charlie will also be discussing tooling, and the JVM runs everywhere.
00:06:13.950 In fact, there's not just one JVM; there are multiple ones that you can choose.
00:06:23.310 HotSpot is the one that you're most likely to use, but IBM has OpenJ9.
00:06:29.310 One really nice feature of OpenJ9 is that when you execute code with a specific flag,
00:06:34.950 it will notice all the classes that you use and save that information for the next run.
00:06:40.550 This significantly improves startup time.
00:06:46.710 GraalVM, while being its own distribution, also plugs into HotSpot and replaces one of the JIT compilers.
00:06:55.500 Sometimes the performance is fantastic. However, in practice, it seems to be about the same as OpenJDK for larger applications.
00:07:00.990 As I said, JRuby is absolutely everywhere. We actually have users on VMS.
00:07:07.229 At least in the past, we've had someone on an IBM mainframe, which is pretty scary.
00:07:14.449 We also compile all of our code down to Java bytecode, which is platform agnostic, providing a great level of portability.
00:07:22.380 Here's our native Java extensions that we have.
00:07:28.710 When you run a gem install, there's no compile step because we've already compiled it when the gem was released.
00:07:34.950 We have a different extension API than CRuby, which uses C extensions. I think we all know this.
00:07:44.130 These extensions exist to provide better performance because Ruby isn't the fastest language out there.
00:07:50.040 However, people aren't going to rewrite OpenSSL in Ruby, so it's useful to be able to call out to a C library or a Java library.
00:07:56.760 Considering the C extensions themselves, it's massive because basically every C function in the codebase is potentially usable.
00:08:05.370 Typically, there's a small set that most extensions use. Back in JRuby 1.6, we had experimental support for C extensions.
00:08:13.290 This quickly turned into a support nightmare due to continuous complaints about unimplemented methods.
00:08:24.330 Because CRuby can't concurrently execute Ruby code, the C extensions were not written to be concurrent.
00:08:32.520 Essentially, we had to place a lock around any calls to C extensions, which undermined a significant reason to use JRuby.
00:08:39.870 If you have multiple threads but are always locking, you end up with performance comparable to MRI.
00:08:44.940 The extra overhead from protecting the C code negated any benefits from using C extensions.
00:08:52.860 It just didn’t offer any advantages, and it resulted in a high support cost.
00:08:59.340 Plus, we already had Java native extensions, which actually ran faster.
00:09:06.240 This is also a significant API as it's our implementation.
00:09:12.540 We have plans to change that, and it's embarrassing we haven’t done it yet.
00:09:19.320 As you saw earlier, we can do concurrent execution with minimal cost in our API.
00:09:26.670 As a side project, I've been working on OJ, a very popular JSON library that does pretty much anything you could want.
00:09:34.680 It's increasingly becoming an important transitive dependency.
00:09:41.940 The test results are showing only about fifteen errors right now.
00:09:49.950 Regarding load performance, we conducted tests on a small, medium, and large payload.
00:09:56.220 In the benchmark, one line shows MRI running OJ, while the yellow line shows us running OJ.
00:10:06.030 You can see we're achieving about one and a half to three times faster than CRuby.
00:10:14.130 The green line provided for convenience is actually the default JSON gem that we ship.
00:10:22.920 If you're already a JRuby user, OJ is going to give you a significant boost for JSON processing.
00:10:30.840 This is interesting because this is essentially the same logic from OJ ported into JRuby's extensions running on the JVM.
00:10:36.060 Having excellent profiling along with a powerful garbage collector significantly enhances our performance.
00:10:44.640 Indeed, we can run essentially the same code much faster and with concurrency.
00:10:51.690 It's clear that the best scenario is to have extensions running on the JVM.
00:10:57.900 Dump performance isn't currently as fast in comparison, but we still provide some benefits.
00:11:05.250 There is a significant optimization that needs to be added for further improvement.
00:11:10.730 One of the other big advantages of being on the JVM is the ability to call in to all other JVM libraries.
00:11:17.220 Of course, we can access Java classes, and we have a nice syntax for it.
00:11:25.140 It essentially functions like requiring a jar and accessing the classes as if they were regular Ruby classes.
00:11:31.310 This capability isn't limited to just Java; it extends to libraries in Clojure, Scala, and anything else running on the JVM.
00:11:40.380 There are tens of thousands of libraries available, so if you need something, there's probably a dozen implementations on the JVM.
00:11:47.510 I'll provide a quick example of using a Java library from IRB.
00:11:54.360 Here we have our IRB shell, and we're going to use the Swing GUI framework built into OpenJDK.
00:12:01.410 We create a new frame with a title, then a button and place it in the frame.
00:12:06.570 These are all standard Java calls; we access the classes from the JVM directly.
00:12:13.380 Set the size so it's reasonably visible, and then we get our window popping up.
00:12:19.920 Next, we can add an action listener here using standard Ruby semantics.
00:12:27.660 Notice the method is `add_action_listener`, the snake_case form of Java's `addActionListener`, so it fits into the flow of your Ruby program.
00:12:34.920 Once you've done that, you can script Java Swing from Ruby.
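The IRB session described above might look roughly like this. The window title, button label, and the helper name `show_demo_window` are assumptions made for illustration; the Swing classes and the block-to-listener conversion are standard JRuby Java integration. The guards let the file load on CRuby or on a machine without a display, where it simply does nothing.

```ruby
# Build a small Swing window from Ruby. Returns the frame, or nil when
# Swing is not usable (not on JRuby, or no display available).
def show_demo_window
  unless RUBY_ENGINE == "jruby"
    puts "Skipping: Swing is only reachable from JRuby"
    return nil
  end

  require "java"
  return nil if java.awt.GraphicsEnvironment.isHeadless

  frame = javax.swing.JFrame.new("Hello from JRuby")   # title is an assumption
  button = javax.swing.JButton.new("Press me")         # label is an assumption
  frame.add(button)
  frame.set_size(300, 200)

  # A Ruby block stands in for the Java ActionListener interface.
  button.add_action_listener { |_event| puts "Button pressed" }

  frame.visible = true
  frame
end

show_demo_window
```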
00:12:41.220 A few lines of code is all it takes. You can also use JavaFX for building GUIs across all platforms.
00:12:47.880 There's plenty of excitement over projects like Minecraft, which is primarily implemented in Java.
00:12:56.130 Here's an example of using the plugin library which Tom wrote to allow scripting Minecraft plugins directly in Ruby.
00:13:02.970 This code essentially changes the number of chickens hatched every time you throw in a chicken egg.
00:13:11.160 Normally, one or two chickens are hatched, but with this code, you could generate 120 chickens.
00:13:17.400 I think this could really mess up your world by playing around with it.
00:13:23.550 Now, getting back to something more practical, we'll discuss various aspects of performance.
00:13:31.320 We always strive to make JRuby perform well on applications.
00:13:39.390 We want to emphasize that startup time is one of our weaker areas.
00:13:47.370 All of those runtime optimizations that we implement internally, along with those done by the JVM, provide excellent peak performance.
00:13:56.580 However, it can take a while to achieve that performance.
00:14:01.920 As a result, startup time is impacted; there's a warm-up curve to get an application up and running.
00:14:08.520 We continue to improve this; we're adding new ways of starting up JRuby faster.
00:14:16.740 We're tuning our own JIT and working on reducing the JVM's JIT warm-up time.
00:14:22.230 But it is still something to keep in mind when you start using JRuby.
00:14:29.220 Now, comparing a few simple commands here will illustrate why we have this warm-up time.
00:14:35.520 We start with our Ruby code and parse it into an AST, then compile it into our internal instructions.
00:14:42.360 Parsing and compiling all occurs in JVM bytecode that the JVM needs to optimize.
00:14:48.330 That's the first step we start out running cold with.
00:14:55.530 We have our own interpreter for our internal instructions, also compiled to JVM bytecode.
00:15:01.740 The JVM takes some time to optimize it, and we eventually turn the Ruby code into JVM bytecode directly.
00:15:08.520 All these phases help optimize Ruby code but it takes a while to complete.
00:15:15.540 We have to spin it up to determine hot code and make those optimizations.
00:15:23.040 Eventually though, we do reach a steady-state performance and enjoy full performance from our optimizations.
00:15:30.840 However, it's worth noting that most of these steps are not critical for running day-to-day commands.
00:15:36.840 For commands such as listing installed gems, installing from a Gemfile, or starting a Rails console, we introduced the --dev flag for JRuby.
00:15:45.390 This flag disables some internal JIT optimizations and simplifies the JIT optimizations that the JVM performs.
00:15:52.920 As a result, it can cut a whole bunch of extra steps out, resulting in a 30 to 40 percent reduction in startup time for most commands.
00:15:58.770 This is definitely the first line of attack for improving JRuby startup time.
00:16:03.630 Next, we can compare the time it takes to list about 350 gems on a system.
00:16:12.510 Even with the --dev flag applied, we often still see about a threefold slowdown compared to CRuby.
00:16:18.690 However, for larger commands such as launching a Rails console, both implementations have notable work to do.
00:16:24.630 The more work you execute, the less startup time tends to be an issue.
00:16:32.640 A feature that's being pushed on the OpenJDK side more and more is class data sharing.
00:16:40.380 This can help us skip a few of those steps when loading code if JVM already knows that the code is valid.
00:16:51.390 Combining this with the --dev flag is currently helping us achieve optimal startup times.
00:16:57.780 We are working on various ways to simplify this process for users.
00:17:04.920 So, once you install a gem, it sets up all these configurations to speed up startup time.
00:17:10.440 Next, we will move on to actual application performance.
00:17:17.850 Once we've started an application, what kind of performance should we expect?
00:17:24.510 We categorize applications into two scales: one very small, micro-service style application with Sinatra or Roda, and another larger one using the Redmine bug tracker.
00:17:30.840 I will first discuss peak performance, which is achieved after startup and the warm-up curve has stabilized.
00:17:40.410 Developers often exercise the server for a few minutes to ensure it's 'hot', and everything is cached.
00:17:47.880 So generally, we see better peak performance, although the warm-up curve is significant.
00:17:52.920 Now, let's look at requests per second for Sinatra and Roda.
00:18:00.870 Starting from around 12,000 requests per second in CRuby, we see JRuby achieving around four times that.
00:18:07.740 It’s impressive how small microservices perform significantly better using JRuby.
00:18:14.520 Similarly, for Roda, we reach about 14,000 requests per second, demonstrating a similar performance ratio.
00:18:21.510 However, for larger applications like Redmine, we achieve only about 41 requests per second for CRuby and 50 for JRuby.
00:18:29.190 Part of this is due to how Redmine processes and renders additional string data.
00:18:36.120 However, when utilizing it more as an API and returning just JSON, we observe a good 30 percent improvement over CRuby.
00:18:45.750 We noticed a couple of small bottlenecks when benchmarking Redmine.
00:18:53.271 Redmine does not have extensive optimization, and we’re working with Rails and Redmine teams to improve performance.
00:19:00.720 Lastly, let's discuss warm-up time and how it affects achieving peak performance.
00:19:07.920 Larger applications tend to take longer to warm up.
00:19:14.760 More complex code must be compiled and optimized by the JVM.
00:19:20.970 So, we’re working towards new tuning for our JIT and the JVM's JIT to help reduce the warm-up curve.
00:19:27.750 So the warm-up with Sinatra looks at how it compares over ten seconds.
00:19:34.440 As we hit the same action for about ten seconds, the JVM will recognize it’s 'hot' after the second iteration.
00:19:41.100 At this point, JRuby starts surpassing CRuby and shows significant improvements.
00:19:47.640 In Roda, we start at a similar level, but improvements show quickly.
00:19:54.900 Smaller microservices definitely have less warm-up concern.
00:20:01.800 In contrast, for Redmine's JSON version, it takes a couple of minutes of warming up.
00:20:08.640 This situation arises due to the large volume of code from standard Ruby on Rails applications.
00:20:16.290 We’re iterating on ways to reduce this warm-up curve.
00:20:23.040 After a third or fourth run, the performance gains become noticeable as JRuby improves.
00:20:31.290 Now, let’s investigate what it takes to achieve good concurrent performance.
00:20:36.600 Running JRuby out of the box with eight threads versus eight workers on CRuby yields interesting results.
00:20:42.780 JRuby often consumes more memory, as the JVM tends to use its resources generously.
00:20:48.840 Initially, the memory usage stands out when running these applications.
00:20:54.810 In the reporting column, we can actually choke JRuby down to about a 300MB heap, as opposed to 1GB.
00:21:01.590 This difference in memory can save considerable resources on larger applications.
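The transcript does not say how the heap was capped. One common approach, assumed here, is to pass a JVM option through JRuby's `-J` prefix, for example `jruby -J-Xmx300m app.rb`. The sketch below reads back the limit the JVM is actually running with; the helper name `max_heap_mb` is made up for illustration.

```ruby
# Report the JVM's maximum heap in megabytes, or nil when not on JRuby.
def max_heap_mb
  return nil unless RUBY_ENGINE == "jruby"

  require "java"
  # Runtime#maxMemory returns the heap ceiling in bytes.
  java.lang.Runtime.getRuntime.maxMemory / (1024 * 1024)
end

heap = max_heap_mb
puts heap ? "JVM max heap: #{heap} MB" : "Not on JRuby; no JVM heap to report"
```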
00:21:09.150 However, when we experiment with four threads per worker, the performance difference isn't noticeable.
00:21:16.530 Typically, simply using workers yields better performance over varying tasks.
00:21:23.430 Most of the time, adding multiple threads to CRuby works against performance.
00:21:30.000 That means for CRuby, workers are generally the only efficient way to scale.
00:21:38.640 Now, most of you are CRuby users and your first activity will likely involve migrating your existing app to JRuby.
00:21:46.110 Even for new applications, these points are essential.
00:21:52.860 There will be a few small configuration changes, which usually pertain to database handling.
00:22:00.870 This includes assessing which C extensions exist and determining what to replace them with in JRuby.
00:22:07.530 Lastly, ensure compatibility with threads, as failing to do so can undermine the scaling benefits.
00:22:14.760 A use case we attempted was getting Discourse to run;
00:22:23.370 it's a massive application with more than 500 gems.
00:22:31.320 In essence, it partially works; everything starts up fine and we can navigate all the pages.
00:22:39.630 However, the main content pane has no data.
00:22:46.080 It seems to be an issue rendering markdown format to HTML with JavaScript.
00:22:53.880 This problem doesn't seem directly related to JRuby; it's a little odd.
00:23:00.300 When we resolve this, it will indeed be a significant milestone.
00:23:09.510 If we can run Discourse, we can run virtually anything.
00:23:16.410 The Ruby code in Discourse is functional; it's a single call to JavaScript presenting issues.
00:23:23.400 Our first step was to install the JRuby Lint gem.
00:23:31.920 You install it, navigate to your app directory, and run the analysis.
00:23:38.940 You will then receive a report for an extensive code base, such as 250,000 lines of Ruby.
00:23:46.650 It offers a plethora of tips on how to adjust your implementation.
00:23:53.520 The initial focus is on checking your gem file for compatibility with JRuby.
00:24:03.390 It will suggest replacements, such as using the ActiveRecord JDBC PostgreSQL adapter.
00:24:11.880 Most concerns relate to threading and atomic operations in Ruby.
00:24:20.460 Even if something just happens to work, evaluate whether two threads can access it concurrently.
00:24:27.150 If the access only occurs while the app is bootstrapping, it typically happens on a single thread.
00:24:37.800 Otherwise, you might need to put a lock around it.
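As an illustration of putting a lock around shared state, here is a minimal sketch of a lazily filled cache guarded by a `Mutex`. The class name `SettingsCache` and its methods are hypothetical; the point is that the check and the write happen inside one `synchronize` block, so two threads cannot both decide the value is missing.

```ruby
# A tiny cache whose check-then-set is protected by a mutex.
class SettingsCache
  attr_reader :loads

  def initialize
    @lock = Mutex.new
    @values = {}
    @loads = 0   # counts how many times the block actually ran
  end

  # Return the cached value for key, computing it with the block at most once.
  def fetch(key)
    @lock.synchronize do
      @values.fetch(key) do
        @loads += 1
        @values[key] = yield
      end
    end
  end
end

cache = SettingsCache.new
threads = 8.times.map { Thread.new { cache.fetch(:answer) { 42 } } }
results = threads.map(&:value)
puts "results=#{results.inspect} loads=#{cache.loads}"
```

Without the lock, an unguarded `@values[key] ||= yield` may run the block more than once when truly parallel threads race on the same key.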
00:24:43.200 Additional performance features may also be flagged.
00:24:51.300 While most features are supported, there are limitations such as Fork, which will never be supported.
00:25:01.680 When it comes to native C extensions, you’ll need to employ some strategies for successful migration.
00:25:09.360 You can use pure Ruby implementations of gems, which are preferred where they exist.
00:25:18.900 In some cases, you can call into a dynamically loaded library via the foreign function interface (FFI) gem.
00:25:26.790 Unquestionably, there's a Java library for practically anything that comes to mind.
00:25:35.700 Alternatively, consider developing a JRuby Java native extension, which yields excellent performance.
00:25:43.050 However, preparing such an extension might require the most effort up front.
00:25:51.930 But once that's done, we all enjoy the advantages of your hard work.
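Of the strategies above, the FFI route is the quickest to sketch. The example below binds the C library's `strlen` through the `ffi` gem; the module name `LibC` and the choice of `strlen` are illustrative assumptions, and the example skips itself if the gem is not available.

```ruby
begin
  require "ffi"
rescue LoadError
  warn "ffi gem not installed; skipping the example"
end

if defined?(FFI)
  # Bind one function from the platform's C library.
  module LibC
    extend FFI::Library
    ffi_lib FFI::Library::LIBC
    attach_function :strlen, [:string], :size_t
  end

  puts LibC.strlen("jruby")   # prints 5
end
```

The same Ruby file works on both CRuby and JRuby, which is what makes FFI a convenient replacement for a small C extension.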
00:25:58.440 So, the final step is to launch Puma, our recommended deployment.
00:26:05.310 Make sure to utilize threads; on JRuby, workers traditionally haven't demonstrated effective performance.
00:26:16.290 If there are issues, be sure to file an issue report.
00:26:22.920 Join our chat channels for additional support; we are accessible.
00:26:27.990 Typically, if your application is functioning correctly, you should benefit from enhanced concurrency and performance.
00:26:34.890 But if you're not witnessing benefits, there are various tools available for JVM usage that can assist us.
00:26:41.520 Let's explore three primary tools: VisualVM, JDK Flight Recorder, and Async Profiler.
00:26:52.560 First, VisualVM is a basic graphical console that allows you to connect to any JRuby application.
00:26:59.220 It monitors garbage collector activities and provides detailed metrics on system performance.
00:27:07.140 You can download it from the VisualVM GitHub to use in conjunction with your JRuby instance.
00:27:13.410 On the left, you see some basic metrics, like CPU monitoring and thread information.
00:27:24.870 On the right, the Visual GC plug-in displays real-time garbage collection metrics.
00:27:30.480 If you don’t observe a smooth garbage collector sawtooth, it might indicate a memory issue.
00:27:37.320 You may also see excessive CPU time usage indicating unnecessary object allocations.
00:27:44.550 Next, we look into JDK Flight Recorder, a feature built into OpenJDK.
00:27:53.850 When activated, it can receive monitoring commands accessible through the GUI client, JDK Mission Control.
00:28:01.920 This can facilitate detailed tracking of object allocations within your application.
00:28:08.520 Post-run, it presents findings regarding allocation stats and inefficient code sections.
00:28:16.200 Lastly, we have Async Profiler, which operates via command line and can generate output files.
00:28:24.120 This allows profiling in production environments where GUI access may be restricted.
00:28:32.280 You can install Async Profiler now, and we will continue making this integrated into JRuby.
00:28:39.750 This offers insightful flame graphs of your application performance.
00:28:46.920 It comes with minimal impact on your production performance.
00:28:53.460 Our closing remarks indicate that challenges will inevitably arise during the transition.
00:29:00.660 You may identify a library without a suitable replacement or encounter a perplexing bug.
00:29:07.740 Please reach out; we inhabit a friendly Matrix channel similar to Slack.
00:29:14.280 JRuby offers an extensive wiki that contains documentation and discussions for improvements.
00:29:21.660 During development, consider setting the --dev flag in an environment variable.
00:29:29.310 This matters because Ruby commands frequently launch sub-commands, which then pick up the flag as well.
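The talk does not name the environment variable. JRuby reads extra command-line options from `JRUBY_OPTS`, so that is what this sketch assumes. It launches a child Ruby process with the variable set and shows that the child sees it; the helper name `run_with_dev_flag` is made up for illustration.

```ruby
require "rbconfig"

# Run a command with JRUBY_OPTS=--dev in its environment and return its output.
def run_with_dev_flag(*command)
  IO.popen({ "JRUBY_OPTS" => "--dev" }, command, &:read)
end

# The child is the same Ruby executable that is running this script.
seen = run_with_dev_flag(RbConfig.ruby, "-e", 'print ENV["JRUBY_OPTS"]')
puts "child saw JRUBY_OPTS=#{seen}"
```

Exporting the variable once in a development shell has the same effect for every command started from that shell, including the sub-commands those commands launch.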
00:29:35.640 We often experience confusion over Ruby version management; it is vital to remain aware of your Ruby version.
00:29:41.220 If affected, you may find yourself back at MRI without realizing it.
00:29:48.660 Furthermore, if you're employing multiple Ruby implementations, you should generally avoid sharing gem paths.
00:29:55.140 The Ruby switchers typically manage this, isolating the Ruby gems from JRuby gems.
00:30:03.240 Lastly, if you begin using Java 11 or any version greater, you might notice warnings concerning JRuby.
00:30:09.840 These warnings, concerning illegal reflective access, are generally harmless but could be irritating.
00:30:17.610 We are working on ways to minimize these warnings to enhance JRuby's utility.
00:30:24.480 We've also provided guidance in our wiki regarding this issue.
00:30:31.680 Please consider giving JRuby a try if you haven't already.
00:30:39.600 We’d love to hear your stories, and if you face any issues, feel free to ask us.