Koichi Sasada

Summarized using AI

Incremental GC for Ruby interpreter

Koichi Sasada • December 19, 2014 • San Diego, CA

In this presentation at RubyConf 2014, Koichi Sasada from Heroku discusses the implementation of incremental garbage collection (GC) for the Ruby interpreter, specifically highlighting the challenges of long application pause times associated with major GC.

Key points discussed include:

- Rubies 2.1 introduced generational garbage collection which improved throughput but did not fully address the problem of long pause times during major GC.

- Incremental GC is a well-known technique designed to significantly reduce pause times.

- In Ruby 2.2, the combination of generational GC and incremental GC aims to provide high throughput while minimizing pause durations.

- Sasada illustrates the impact of these improvements using performance data that shows incremental GC dramatically reduces major GC pause times when compared to traditional methods.

- Implementation details of incremental GC are discussed alongside technical challenges, such as the need for write-barrier protections to segregate objects.

- The classification of objects into write barrier protected and unprotected categories allows Ruby to manage memory efficiently without causing compatibility issues with C extensions.

- The presentation also addresses the evolution of Ruby’s GC technologies from simple marking techniques to more complex generational models, highlighting improvements in Ruby 2.0 and 2.1.

- Sasada advocates for the use of keyword parameters and discusses changes in Ruby 2.2 that enhance performance.

In conclusion, Sasada emphasizes that while incremental GC reduces pause times, it does not guarantee performance improvements for all applications. Users must evaluate their specific application needs against the behavior of garbage collection to optimize performance effectively.

Incremental GC for Ruby interpreter
Koichi Sasada • December 19, 2014 • San Diego, CA

In this presentation, I will talk about incremental GC algorithm for Ruby interpreter.

Ruby 2.1 has Generational GC algorithm, which improves GC performance dramatically. However, long application pause time problem on major GC is not solved.

To overcome this problem, we Matz team at Heroku, Inc. implemented a well-known incremental GC algorithm into MRI/CRuby interpreter to reduce application pause time without any compatibility issues, with our "write-barrier-unprotected" objects technique. In this talk, I will explain the algorihtnm, implementation and evaluation results about new GC.

Help us caption & translate this video!

http://amara.org/v/FqJd/

RubyConf 2014

00:00:18.240 Good morning everyone. In this presentation, I will use a small character format for the text.
00:00:23.720 If you cannot read the characters, please feel free to come up to the front of the stage.
00:00:30.759 Okay, let's start my presentation on incremental garbage collection for Ruby.
00:00:36.040 My name is Koichi Sasada from Heroku, and I live in Japan. I traveled here for about 11 or maybe 10 or 13 hours.
00:00:41.840 Firstly, I want to mention that this year, 2014, is very important for me.
00:00:50.440 It's my first wedding anniversary with my wife, celebrated at a wedding party this April.
00:00:57.680 Additionally, this year marks the 10th anniversary of my development work on Y, which stands for Yet another VM.
00:01:04.439 Ten years ago, I started this development and gave my first presentation at RubyConf 2004.
00:01:09.759 This year marks a decade since that presentation.
00:01:15.000 Recently, we have made significant improvements to the garbage collector, increasing throughput and reducing pause times.
00:01:26.119 These improvements will be included in the Ruby 2.2 release around this Christmas.
00:01:34.280 Last year, I introduced generational garbage collection in Ruby 2.1.
00:01:41.720 With this, we dramatically improved garbage collection throughput using a generational approach.
00:01:48.200 However, the problem of long pause times during major garbage collection remains.
00:01:53.680 Generational garbage collection separates young and old generations, allowing young objects to have shorter pause times.
00:02:01.759 As shown in these bars, the minor garbage collection pause times are relatively small.
00:02:08.640 However, there are instances of long pause times caused by the major garbage collection.
00:02:14.040 The blue lines represent the simple marking and sweeping garbage collection times before Ruby 2.0.
00:02:20.360 The orange lines illustrate the pause times in Ruby 2.1, showing a reduction.
00:02:26.519 Yet, even in Ruby 2.1, we encounter long pause times with major GC occasionally.
00:02:35.000 To further improve the application response times, Ruby 2.2 employs an incremental GC algorithm.
00:02:43.440 This well-known algorithm reduces pause times significantly, though it may not improve throughput.
00:02:49.200 While generational garbage collection provides significant performance improvements, it still faces long pause times.
00:02:56.239 Incremental GC, on the other hand, focuses on achieving shorter pause times.
00:03:03.080 Therefore, our goal is to combine generational garbage collection with incremental garbage collection.
00:03:11.840 By doing so, we can achieve high throughput as well as short pause times.
00:03:22.200 The results of our trials are promising, as indicated by the data presented here.
00:03:30.720 The orange line represents the traditional generational GC.
00:03:36.840 We then introduce incremental GC, represented by the gray line.
00:03:41.959 With incremental GC, we observe no long pause times, indicating a significant achievement.
00:03:50.720 In conclusion, we have successfully reduced pause times.
00:03:56.879 Now, I would like to discuss how to implement incremental GC in Ruby.
00:04:02.040 If you are purely a Ruby programmer, feel free to move to another room.
00:04:10.480 If you are interested in technical details, please stay here.
00:04:16.880 Thank you for your presence.
00:04:22.600 Let me introduce myself; I am a Ruby committer since 2007.
00:04:28.520 I have been involved in the original development since 2004.
00:04:35.000 I work at Heroku, where we employ three full-time Ruby developers.
00:04:41.039 Our mission is to improve the quality of the Ruby interpreter.
00:04:47.680 Quality means no bugs, no regressions for the next version, and reduced memory consumption.
00:04:54.560 Our boss, Matz, oversees everything.
00:05:00.639 Yesterday, there was a well-known session by Nobu.
00:05:07.840 Nobu is a great Ruby interpreter developer known for many patches and fixes.
00:05:15.840 I, however, primarily focus on improving performance.
00:05:21.760 Before delving into incremental garbage collection, I want to introduce upcoming changes in Ruby 2.2.
00:05:27.680 There are no notable changes in syntax except one.
00:05:33.600 From Ruby 2.2, we can write hashes using a new syntax that may confuse some.
00:05:41.760 It represents symbols rather than string literals, which may lead to misunderstandings.
00:05:49.920 There are also several changes in class methods, but most of these are not very significant.
00:05:55.680 Internally, there have been numerous improvements.
00:06:01.920 For instance, we have removed obsolete functions.
00:06:08.400 If you maintain an old gem or extension, please check for any necessary updates.
00:06:16.759 We have also enhanced internal definitions for data types.
00:06:23.919 An example of our improvements includes optimizing garbage collection performance.
00:06:28.480 The SLE GC is critical for some Ruby programmers and significantly enhances the performance of both generational and incremental GC.
00:06:38.799 Today, I want to showcase the performance improvements related to keyword parameters.
00:06:44.480 We’ve made changes to the virtual machine to support frozen string literals.
00:06:51.040 For example, let’s take a demonstration.
00:06:56.239 This program generates one million symbols and checks the total number of symbols.
00:07:04.359 In Ruby 2.2, all dynamically created symbols are now corrected and properly managed.
00:07:10.480 We can thus achieve very fast keyword parameter method invocations.
00:07:17.360 For example, by defining a method with keyword parameters and calling it ten million times.
00:07:23.040 Before this optimization, it took 17 seconds; now, it only takes 1 second.
00:07:30.440 This means we are now 15 times faster.
00:07:36.640 So please consider using keyword parameters.
00:07:43.359 We have no performance issues with this approach.
00:07:46.600 Now, let’s discuss garbage collection. This is an ancient technology.
00:07:56.760 Ruby has a long history with garbage collection. Initially, Mr. Yukihiro Matsumoto employed a simple conservative marking garbage collector.
00:08:06.960 This simple algorithm is easy to implement, especially for C extensions.
00:08:15.840 However, it poses challenges in enhancing the performance of the Ruby interpreter.
00:08:21.760 In Ruby 1.9.3, we introduced R-Sweep, and Ruby 2.0 improved marking to make it more compliant with light memory usage.
00:08:30.000 In Ruby 2.1, we showcased generational garbage collection.
00:08:36.720 Currently, we have generational garbage collection implemented.
00:08:43.600 To explain how to implement incremental GC, I’ll first explain how garbage collection works.
00:08:48.360 The simple marking garbage collection approach is quite easy to grasp.
00:08:55.600 It identifies root objects and marks reachable objects starting from these roots.
00:09:01.720 After marking, we can identify unreachable objects for collection.
00:09:08.960 This concept is basic and straightforward.
00:09:15.360 Generational garbage collection is based on the hypothesis that most objects die young.
00:09:21.720 Temporary objects, such as strings and arrays, generally do not live long.
00:09:29.600 Thus, we can focus on reclaiming young objects primarily.
00:09:37.680 The young generation is separated from the old generation.
00:09:45.480 If objects survive long enough, they promote to old generation.
00:09:51.920 Typically, garbage collection occurs only in the young space, termed minor GC.
00:09:59.120 If memory pressure arises, we perform major GC or full GC.
00:10:05.840 This approach optimizes total throughput.
00:10:12.960 The figure illustrates generational GC's working mechanism.
00:10:19.600 It first traverses and checks the old objects.
00:10:27.440 If it reaches old objects referencing the new ones, performance can be improved.
00:10:34.080 However, Ruby can introduce a relationship causing issues.
00:10:42.960 This situation must be detected to maintain proper relationships.
00:10:49.760 For this, we utilize a write barrier technique to manage relationships.
00:10:57.920 Unfortunately, the Ruby 2.0 implementation did not support write barriers.
00:11:06.000 Introducing them required substantial rewrites throughout the interpreter and C extensions.
00:11:12.960 Lacking infinite resources, we chose not to make those drastic changes.
00:11:20.160 Moreover, external extensions often lack source access.
00:11:29.440 This poses compatibility issues.
00:11:36.560 Therefore, generational GC could not be utilized before Ruby 2.1.
00:11:43.680 The current implementation uses write barrier unprotected objects.
00:11:51.440 The key idea is to separate objects into two types: write barrier protected and unprotected.
00:11:57.760 This distinction is made at the time of object creation.
00:12:06.240 For example, it is easy to apply write barriers to string objects.
00:12:13.040 When creating a string, the system can easily manage protected relationships.
00:12:20.360 In contrast, applying barriers to block objects is difficult.
00:12:28.240 So we classify block objects as write barrier unprotected objects.
00:12:36.400 This separation allows us to maintain a generational garbage collection without introducing significant compatibility issues.
00:12:44.080 For traversing references, if they are protected, we do not promote them.
00:12:51.760 Instead, we mark them during the minor garbage collection phase.
00:12:59.440 We can strategically treat unprotected objects as root objects.
00:13:06.160 This allows proper traversal and correct handling.
00:13:13.440 You can inquire further about the technical details if you wish.
00:13:20.960 In terms of performance, we have a time consumption profile for each GC operation.
00:13:30.640 The blue line represents a regular mark-and-sweep GC.
00:13:39.840 As such, there is a dramatic reduction in marking time and efficiency of minor GC.
00:13:48.240 Total marking time reflects significant improvement, while sweeping times remain stable.
00:13:54.080 This benchmarking illustrates how efficiency has enhanced with the implemented changes.
00:14:04.920 The results are a continuation of the previous year, not the current achievement.
00:14:12.440 With large applications, we can observe reductions in total execution time.
00:14:20.520 I have additional figures to support my papers on these improvements.
00:14:28.640 Please feel free to ask me about them afterward.
00:14:36.759 Ruby's generational garbage collection timing chart illustrates how GC performs.
00:14:46.640 In regular garbages, marking time tends to consume longer durations.
00:14:55.760 With minor GC, the efficiency is short-lived.
00:15:03.680 Hence, the major GC still results in longer pauses.
00:15:11.760 This is a key reason for implementing incremental garbage collection.
00:15:19.840 The main goal of incremental GC is to reduce major GC pause times.
00:15:27.560 The objective is not to impact the minor GC.
00:15:34.480 Today, I will discuss the details of implementing this restricted incremental garbage collection algorithm.
00:15:40.640 Incremental GC is a well-known algorithm for reducing pause times.
00:15:48.480 To summarize, we perform GC steps incrementally while integrating with Ruby execution.
00:15:55.760 This allows us to mark some parts and then run the Ruby program continuously.
00:16:03.400 We maintain a cohesive process with efficient marking phases.
00:16:12.960 As a part of this, we adhere to the TR coloring garbage collection terminology.
00:16:22.320 In this context, we define three colors for object states: white, gray, and black.
00:16:30.560 White objects represent unmarked objects, gray objects are reachable, and black objects are already marked.
00:16:39.760 The initial step involves coloring all objects as white.
00:16:46.640 Through various simulations, we mark root objects and determine relationships.
00:16:53.920 By repeating this process, we gradually collect and finalize reachable objects.
00:17:01.600 When all gray objects have been traversed, we conclude the GC marking phase.
00:17:08.720 We sweep and collect unmarked white objects.
00:17:15.680 However, similar issues arise parallel to generational GC.
00:17:23.920 We need to navigate marking phases efficiently while executing Ruby programs.
00:17:31.680 This may cause black objects to reference white objects.
00:17:39.680 We need to ensure that we properly handle these references.
00:17:46.880 Thus, we need to retain black objects during GC manipulations.
00:17:54.080 To this end, we remember the black and gray markings as we progress.
00:18:01.920 This technique enhances our approach and ensures collective management.
00:18:09.280 In terms of implementation, we need to be mindful of write barrier roles.
00:18:16.560 The classification between protected and unprotected objects plays a significant role.
00:18:22.720 This classification is essential in differentiating the management processes.
00:18:29.920 While protected objects can link to white objects, unprotected complexities require special handling.
00:18:37.920 We scan all black and white barrier unprotected objects during the incremental GC process.
00:18:47.280 This step occurs only after we have finished marking.
00:18:53.120 While performing marking and referencing, we must ensure thoroughness.
00:19:00.960 Incremental GC, though efficient, comes with challenges.
00:19:09.840 The pause times can increase relative to the count of unprotected objects.
00:19:16.840 Many unprotected write barriers mean longer pause durations.
00:19:23.760 Thus, we refer to this algorithm as restricted, not complete.
00:19:32.640 Though it yields similar or shorter pause times compared to minor GC.
00:19:40.640 If minor GC pause times suffice, this will be effective for users.
00:19:48.280 If responses remain unsatisfactory, further implementation details are necessary.
00:19:55.840 During implementation, I also renamed certain functions for clarity.
00:20:02.960 I also implement management strategies for light barrier protection and state representation.
00:20:10.960 This includes utilizing bitmaps for effective memory management.
00:20:17.760 The use of bitmaps simplifies sets and enhances traversing efficiency.
00:20:24.240 I acknowledge the contributions from Aaron on benchmarking.
00:20:32.560 This benchmarking explores the performance impacts of the incremental GC.
00:20:39.760 The figures display the results parallel to the previously discussed outcomes.
00:20:46.560 The orange line highlights the long pause times due to major GC.
00:20:53.760 In contrast, incremental GC shows significant reductions in pause duration.
00:21:00.720 This yields a tangible benefit for performance optimization.
00:21:08.000 We are able to achieve enhancement while maintaining pace.
00:21:15.760 We remain cautious not to become overconfident.
00:21:23.840 Incremental GC does not solve every challenge.
00:21:30.960 It does not guarantee application performance improvements.
00:21:38.800 Certain applications may still require careful analysis of GC behavior.
00:21:46.560 For instance, web applications often experience delays caused by major GC.
00:21:53.760 Response times will need thorough assessments against previous GC metrics.
00:22:01.440 In summary, today we explored Ruby 2.2 and its robust incremental garbage collection features.
00:22:09.600 I appreciate your attention and engagement.
00:22:16.240 Thank you very much.
Explore all talks recorded at RubyConf 2014
+77