RubyConf 2015

JRuby 9000 Is Out; Now What?

by Thomas E Enebo and Charles Nutter

In the video titled 'JRuby 9000 Is Out; Now What?' presented at RubyConf 2015 by Thomas Enebo and Charles Nutter, the speakers discuss the release of JRuby 9000, which now supports Ruby 2.2 and features a redesigned optimizing runtime. They highlight improvements in performance and compatibility that have been made over the years and provide insight into future developments in JRuby.

Key points discussed include:

- Overview of JRuby: JRuby is another implementation of Ruby that runs on the Java platform, benefiting from Java's native features. It does not have a Global Interpreter Lock (GIL), which allows for better concurrency.
- Performance Benchmarks: The comparison between JRuby, CRuby (MRI), and CRuby with C extensions shows that JRuby can outperform CRuby with pure Ruby implementations due to JVM optimizations.
- Library Availability: JRuby users have access to a vast range of Java libraries, significantly supplementing RubyGems.
- Runtime and Compatibility: JRuby 9000 focuses solely on supporting the latest Ruby version without the dual-runtime complexity of prior versions like JRuby 1.7.
- Performance Improvements: Block JIT compilation and optimizations for methods created with define_method significantly speed up JRuby's execution.
- Handling Exceptions: Strategies are discussed to improve performance when handling exceptions, a common source of overhead in Ruby applications.
- Internal Representation (IR): A new runtime model with features such as semantic analysis and optimization passes is outlined, moving away from the abstract syntax tree approach.
- Future Work: Opportunities for performance enhancements through method inlining and unboxing are explored, with potential improvements of 10 to 20 times performance in benchmark tests.
- Community and Collaboration: The speakers express gratitude towards the community and highlight collaborative efforts with Oracle to enhance JRuby's capabilities.
- Challenges Ahead: Addressing startup time remains a key challenge, alongside ongoing optimization and bug fixes as JRuby progresses towards its goals.

This presentation summarizes the journey and future direction of JRuby, emphasizing the notable improvements and the commitment to the Ruby community's needs.

00:00:15.320 Hello, I'm Tom, and this is Charlie. We've been working on JRuby for a long time.
00:00:20.920 Before we start, how many people here have had exposure to JRuby in some way? All right, most of you! Good.
00:00:28.279 For those who didn't raise their hand, I will go over a quick overview.
00:00:33.960 JRuby is just another Ruby implementation. We try to be as compatible as we can with CRuby.
00:00:39.160 Currently, we support three versions. Of course, JRuby is built on top of the Java platform, so we benefit from what Java offers.
00:00:44.760 We don't have to write our own garbage collectors, and HotSpot makes our code run very quickly, which we'll see in the next slide.
00:00:51.840 The most important thing to notice is that Java has native threads, and so does JRuby. There is no Global Interpreter Lock (GIL). There are some good talks later today: Jerry D'Antonio will discuss how the GIL isn't your savior and show how you can use real concurrent threads.
00:01:05.799 That's at 1:15 in GM. Petr Chalupa will talk about the concurrent-ruby library, which provides a good set of concurrency primitives and tools that work across all Ruby implementations. That talk is at 4:20 in this room, so if you're interested in concurrency at all, those are two great talks to check out.
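To make the threading point concrete, here is a minimal sketch of a CPU-bound workload split across plain Ruby threads; the fib helper and thread count are illustrative and not from the talk. On JRuby these threads map to native JVM threads and can run in parallel, while CRuby's GIL serializes them.

```ruby
# Illustrative only: a CPU-bound workload split across plain Ruby threads.
# On JRuby these map to native JVM threads and can run truly in parallel;
# on CRuby the GIL serializes them.
def fib(n)
  n < 2 ? n : fib(n - 1) + fib(n - 2)
end

threads = 4.times.map do
  Thread.new { fib(30) }   # each thread does independent CPU-bound work
end

results = threads.map(&:value)  # wait for each thread and collect its result
puts results.inspect
```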
00:01:18.119 I wanted to include this graph showing JRuby's performance. We have a benchmark of a red-black tree library. The top bar represents CRuby (MRI) running a pure Ruby red-black tree implementation, taking about 2.5 seconds to run the benchmark. The benchmark creates a number of nodes, traverses them, deletes them, and repeats this process. This shows why we often have to rely on C extensions in CRuby.
00:01:42.880 The second bar down shows Ruby with C extensions, which provides a significant performance improvement, taking only about 0.5 seconds. However, the most interesting result is at the bottom: JRuby running a well-written pure Ruby red-black tree implementation performs faster than CRuby with C extensions, thanks to the capabilities of the JVM, our stellar garbage collectors, and optimizations.
00:02:06.280 Furthermore, there are numerous Java libraries available. If you find that a Ruby gem is lacking, for instance, if you’re working with Prawn and it cannot do something you need, you can switch over to the Java world and use iText. There are about 7,000 libraries available on RubyGems, compared to approximately 47,000 libraries in Maven. There is likely a JVM library out there for whatever you need.
00:02:43.239 It’s incredibly easy to call into other languages. Java is the obvious one, and calling Java using Ruby syntax is straightforward, but you can also call any language on the Java platform, such as Clojure. Here are the two supported branches: we have the master branch for JRuby 9000, which we will be discussing, and we also have a maintenance branch for JRuby 1.7, which we will likely continue to maintain for another six months to a year, but only if people still need it.
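As a rough illustration of the Java interop being described, here is a small JRuby-only sketch. The JDK classes used (java.util.ArrayList, java.util.Collections, java.lang.System) are chosen for the example rather than taken from the slides; loading a third-party library such as iText would simply add a require for its jar.

```ruby
# JRuby only: Java classes are reachable with ordinary Ruby syntax.
require 'java'

list = java.util.ArrayList.new
list.add('from Ruby')
list.add('from Java land')
java.util.Collections.sort(list)            # call a static Java method

list.each { |item| puts item }              # Ruby-style iteration over a Java List
puts java.lang.System.current_time_millis   # snake_case maps to currentTimeMillis
```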
00:03:17.879 JRuby 1.7 was an interesting release for us because it allowed users to pick which compatibility level they wanted—either Ruby 1.8 or 1.9 mode with a flag. This ended up being a difficult decision because we had to maintain two runtimes in the same codebase, and it didn’t work out so well. For JRuby 9000, we will only support the latest version of Ruby and track the current version of CRuby. Right now, that's Ruby 2.2, which will soon become 2.3. Now that Ruby 2.3 preview 1 is out, we hope to begin implementing its features within a month or two of the final release.
00:04:13.280 Last Friday, just before we headed to the plane, we released JRuby 9.0.4.0. When we return next week, JRuby 1.7.23 will also come out. We are very conference-driven here. JRuby 9000 comes down to a few high-level bullet points. As I mentioned, we are tracking CRuby, and we have a completely new runtime, which we've been developing for years. Most of this talk will focus on this new runtime.
00:04:56.720 We are now bypassing Java for I/O, primarily using native calls instead. While we can still fall back to Java, this change results in better performance and, more importantly, gives us compatibility features that we couldn’t implement using pure Java solutions. JRuby is currently probably the most POSIX-friendly JVM language.
00:05:23.160 Furthermore, Oniguruma’s transcoding facilities have been completely ported, and we promise there are no more encoding bugs. Some attendees might be wondering why we chose 9000 as the version number. It’s entirely because of Dragon Ball; it started as a joke. Initially, we considered calling it JRuby 2, but that coincided with the release of Ruby 2, which would have been confusing. We couldn’t come up with a better number, and it just stuck.
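For readers unfamiliar with the term, transcoding here means converting a string's bytes between character encodings. A plain-Ruby example of the behavior the ported machinery has to reproduce (the string itself is just an illustration):

```ruby
# Plain Ruby transcoding, the kind of behavior the ported encoding
# machinery has to match byte-for-byte; works on CRuby and JRuby 9000.
utf8   = "caf\u00E9"                        # "café" encoded as UTF-8
latin1 = utf8.encode(Encoding::ISO_8859_1)  # transcode to ISO-8859-1

puts utf8.bytesize    # => 5 (é is two bytes in UTF-8)
puts latin1.bytesize  # => 4 (é is one byte in ISO-8859-1)
puts latin1.encoding  # => ISO-8859-1
```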
00:05:59.280 Charlie is even wearing the T-shirt with the slogan "It's over 9000!" The funny thing is that JRuby 9000 started out as just a codename, but we later realized this is the ninth major release of JRuby. We kind of follow the Java numbering scheme, as they moved from 1.4 to Java 5. So what’s next? That is, of course, the title of our talk.
00:07:06.320 We do an incredible amount of compatibility work; we probably spend more time on compatibility than on performance. However, no one wants to hear about how we fixed compatibility bugs, so let's focus on performance. Recently, we have made several changes to improve the performance of Ruby code: things we have wanted to do for years that were very challenging with the old runtime but are much easier with the new one.
00:07:51.360 Let’s go over these changes quickly. Up until JRuby 1.7, when we compiled Ruby code just in time (JIT) into JVM bytecode at runtime, we only did this at method boundaries. If a method was called 50 times or more, we would compile it into JVM bytecode, leading to good performance. However, there are many cases where code consists of independent procs or lambdas. If you create a table of procs for various function calls or use define_method, those bodies would remain unoptimized, executing in our interpreter instead.
00:09:06.919 This often resulted in performance that was slower than MRI (Matz's Ruby Interpreter). This issue had to be addressed. In JRuby 9.0.3.0, we introduced block jitting: the JIT now works on block boundaries as well as method boundaries. This results in performance much more in line with our expectations. A comparison with MRI shows that regular method definitions are now significantly faster, and there's also a noticeable performance increase for methods created with define_method.
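A sketch of the code shapes being described: bodies that live in blocks rather than ordinary def methods, which previously stayed in the interpreter no matter how hot they got. The class and method names here are made up for illustration.

```ruby
# Code bodies that used to stay in the interpreter because they are blocks
# rather than ordinary `def` methods; with block jitting, hot blocks like
# these get compiled to JVM bytecode too.

# A method whose body is a block:
class Calculator
  define_method(:square) { |n| n * n }
end

# A dispatch table of procs:
OPS = {
  add: ->(a, b) { a + b },
  mul: ->(a, b) { a * b }
}

calc = Calculator.new
100_000.times { calc.square(3) }          # hot block body
100_000.times { OPS[:add].call(1, 2) }    # hot lambda body
```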
00:10:26.200 One major topic we've been tackling is improving the performance of methods created with define_method. They are a challenge because they generally perform about half as well as regular method definitions in CRuby (MRI) due to additional overhead. In JRuby, our performance was slightly better than CRuby's but still lagged behind regular methods. The strategy for optimizing these is to treat non-capturing define_method blocks as standard methods in our compiler: if the block does not access any surrounding state, we compile it as a regular method and achieve performance on par with one.
00:11:42.760 If we detect that surrounding state is only read within the define_method block, we aim to lift those values out as constants so they can be optimized in the future. Our early results show a marked improvement for define_method, with performance approaching that of standard methods.
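Roughly, the distinction the compiler cares about looks like this; the class, method names, and values are hypothetical, but the capturing versus non-capturing split matches what is described above.

```ruby
class Widget
  LABEL = 'widget'

  # Non-capturing: the block body touches only its own arguments and
  # constants, so it can be compiled like a plain `def` method.
  define_method(:double) { |n| n * 2 }

  # Capturing: the block reads a local variable from the surrounding
  # scope, so the compiled code still needs access to that binding.
  prefix = 'my-'
  define_method(:name) { prefix + LABEL }
end

w = Widget.new
puts w.double(21)  # => 42
puts w.name        # => "my-widget"
```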
00:12:50.560 A significant area of concern is that JRuby often receives complaints about slowness in benchmarks where exceptions are raised. Both backtrace costs and exception costs are high in CRuby, but they are notably worse on the JVM. Building a backtrace requires piecing together inlined frames and various method structures, which adds significant overhead. This is especially painful because exceptions are frequently ignored or used purely for flow control, so much of that backtrace work is wasted.
00:14:18.819 We have developed a strategy to optimize this situation. If the exceptions are ignored, and there is no need to look at them or use the backtrace, we simply do not generate one. For example, if a function raises an exception and it’s immediately rescued with a simple value or nil, we can set a thread-local flag indicating we do not need a backtrace for exceptions raised below this point. When raising the exception, we refer to this flag and can skip building a backtrace, resulting in improved performance.
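A minimal example of the pattern this targets, assuming the common "rescue and return nil" idiom: the exception object and its backtrace are never inspected, so building a backtrace would be pure overhead.

```ruby
# The exception is rescued immediately and its backtrace is never looked at,
# so skipping backtrace construction does not change behavior.
def to_int_or_nil(str)
  Integer(str)
rescue ArgumentError
  nil          # exception object and backtrace are discarded
end

puts to_int_or_nil('42').inspect    # => 42
puts to_int_or_nil('oops').inspect  # => nil
```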
00:15:48.840 This optimization is especially vital in areas like CSV validation, where numerous exceptions can be thrown during conversions. With our updated approach, the performance improvement has been remarkable—up to five times faster in some cases for simple rescues and significantly improved for more complex scenarios. However, we must ensure edge cases are considered where we cannot apply this optimization.
00:17:38.520 I’d like to talk about our new runtime called IR, which stands for Internal Representation. This is probably the most boring name for a runtime you could come up with! In JRuby 1.7 and earlier, everything was executed with the abstract syntax tree. We parsed Ruby into a tree and the interpreter would navigate through that tree. The JIT compiler performed a similar process. We wanted to move away from this.
00:19:57.760 We aimed for a traditional compiler design that developers with compiler theory knowledge could understand and contribute to. We hope this is our last runtime!
00:20:09.000 We now have more phases: semantic analysis translates the syntax tree into a series of instructions, creates supplementary data structures like control flow graphs, and performs several optimization passes before we interpret those instructions. After the code has been running for a while, the instructions are compiled into JVM bytecode, which HotSpot can then optimize.
00:21:09.679 Regarding instructions in IR, we check for required arguments and bind parameters. Our semantic analysis applies various transformations that improve instruction handling. A common simplification example is the "super" keyword: in the legacy system we needed to maintain extra state at runtime, but in IR we can simplify this greatly.
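As a reminder of why super carries extra state: a bare super implicitly re-passes the current method's arguments, so the runtime has to know what those arguments were at that point. The classes below are illustrative, not from the talk.

```ruby
class Base
  def greet(name, punctuation = '!')
    "Hello, #{name}#{punctuation}"
  end
end

class Child < Base
  def greet(name, punctuation = '!')
    # Bare `super` implicitly forwards the current arguments, so the runtime
    # must track them here; `super(name, punctuation)` spells them out.
    super.upcase
  end
end

puts Child.new.greet('JRuby')  # => "HELLO, JRUBY!"
```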
00:22:52.839 We build pluggable compiler passes like dead code elimination and constant propagation to streamline the code further. When analyzing a snippet, these passes can notice unused variables and remove the instructions that produce them. This helps us eliminate redundant instructions and simplify the runtime without compromising functionality.
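A toy illustration of what passes like these can simplify; the real passes run on IR instructions rather than Ruby source, and this method is invented for the example.

```ruby
# Conceptual only: constant propagation and dead code elimination operate
# on the IR produced from code shaped like this.
def area(r)
  pi     = 3.14159   # constant propagation can replace later reads of pi
  unused = r * 2     # result is never read: a dead-code-elimination candidate
  scaled = pi * r
  scaled * r
end

puts area(2)  # => 12.56636
```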
00:24:57.800 We aim to get method inlining working next. Most optimizing runtimes achieve performance gains through inlining methods, allowing us to eliminate various overheads typically generated during method calls. This is particularly significant for JRuby, where we need to pass additional information that Java doesn't require.
00:26:54.000 Interestingly, our approach involves counting method calls and determining the object's type through the profiler. If we can guarantee a type remains unchanged, we can optimize calls accordingly. This will enhance our runtime capabilities, and we have started to prototype unboxing, anticipating robust performance improvements.
00:27:51.000 In fact, when we ran prototypes for numeric algorithms, we observed performance accelerations of 10 to 20 times with our unboxing. We are diligently working on this, and the results hold great promise.
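For context, this is the kind of numeric kernel they mean: every intermediate Float is normally a heap-allocated object, and unboxing lets a type-stable loop run on raw JVM doubles instead. The Monte Carlo estimate below is illustrative; the 10 to 20 times figures come from their prototypes, not from this snippet.

```ruby
# A type-stable floating-point loop: a prime candidate for unboxing,
# since every x, y, and product would otherwise be a boxed Float.
def estimate_pi(samples)
  inside = 0
  samples.times do
    x = rand
    y = rand
    inside += 1 if x * x + y * y <= 1.0
  end
  4.0 * inside / samples
end

puts estimate_pi(1_000_000)
```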
00:29:02.000 It’s an exciting time to be a JRuby engineer! We are dusting off our profiler and have made progress with our inlining strategies. It's important to mention that while our optimizations produce promising results, many bugs still need to be fixed before we can count on this performance in production.
00:30:04.000 In the past three months, we've been focusing heavily on JRuby 9000. We've spent more time on compatibility than on any previous release. Users transitioning from 1.7 to 9000 report largely seamless experiences, and we have achieved a good level of compatibility with Ruby 2.2. That work has given us more freedom to explore optimizations.
00:31:46.000 Refinements still present challenges, operating effectively only in simple situations. We are pragmatic in how we handle bug reports and improvements as they arise.
00:32:42.920 We need to acknowledge the collaborative efforts with the JVM team at Oracle. Their enhancements to the JVM have made substantial improvements possible for JRuby. They've communicated effectively, sharing insights that have facilitated our advancements.
00:34:17.680 Regarding upcoming projects, one exciting initiative is Project Panama, which aims to enhance native support at the JVM level. Using it will allow JRuby to make native calls much more efficiently. We have also been working on JRuby’s startup time, our most significant challenge. We plan to use ahead-of-time (AOT) compilation, which we expect to reduce startup times considerably.
00:35:56.839 The ultimate goal remains the same: fast startup without sacrificing peak performance or functionality. Users should see improvement across everyday actions, from basic startup to common Rails application tasks, with minimal extra overhead.
00:37:00.840 Finally, I would like to express sincere gratitude to the community. We've gathered some insight into the companies using JRuby, which shows it running in critical production environments. We appreciate the interest in and commitment to optimizing JRuby for production; it's essential for its future.
00:39:07.480 Thank you for attending! If anyone has any questions or is interested in discussing specific JRuby applications or bugs you're facing, we will hold office hours from 2 to 4 PM. Furthermore, we have several JRuby stickers available, so feel free to come up and grab one.