Keyword Arguments: Past, Present, and Future

Jeremy Evans

#programming-best-practices

#ruby-development

#performance-optimization

The video titled "Keyword Arguments: Past, Present, and Future" presented by Jeremy Evans at RubyKaigi Takeout 2020 focuses on the evolution of keyword arguments in Ruby, particularly with the upcoming changes in Ruby 3. The presentation covers the historical context, the internal handling of keyword arguments, the separation of keyword from positional arguments, and potential future enhancements.

Key Points Discussed:

- History of Keyword Arguments:
  - Ruby did not support keyword arguments until Ruby 2.0 (released in 2013), although they were planned for Ruby 1.8.
  - Prior to that, hashes were used as a workaround to simulate keyword arguments, which was clunky and could lead to performance issues.
- Current Implementation:
  - Ruby 2.7 allowed non-symbol keys in keyword hashes and addressed various backward compatibility concerns. Evans details the functionality changes and the rationale behind them.
  - Examples demonstrate how optional hashes could lead to unexpected behavior when treated as keyword arguments, drawing attention to classes whose to_hash method contributed to the confusion.
- Separation of Keyword Arguments in Ruby 3:
  - Evans discusses the significant changes coming with Ruby 3, particularly the move to keyword argument separation, which is intended to eliminate ambiguities when passing arguments.
  - The new rules make keyword arguments distinct from positional arguments, breaking compatibility for some existing patterns but ultimately providing clearer behavior for method callers.
- Future Improvements:
  - Evans proposes optimizations to reduce hash allocations during method calls involving keyword arguments, including a flag that records whether the keyword hash is mutable.
  - He concludes on a positive note about performance improvements already implemented and the expected clarity of method calls in Ruby 3.

In conclusion, the presentation outlines the significant transformation of keyword arguments within the Ruby language, emphasizing a history marked by gradual improvements, ongoing challenges with backward compatibility, and a promising future with better performance and clarity in method definitions involving keyword arguments.

00:00:00.080 Hello everyone. In this presentation, I will discuss the history of keyword arguments, their current implementation, how keyword argument separation was handled, and future improvements to keyword arguments.

00:00:12.719 My name is Jeremy Evans, and one of my main focuses since becoming a Ruby committer last year has been implementing keyword argument separation. Back when I started using Ruby around the time of Ruby 1.8, Ruby didn't have keyword arguments.

00:00:20.640 The book I used to learn Ruby programming was the first edition of Programming Ruby, which was released in December 2000 and covers Ruby 1.6. Now, the book specifically mentions keyword arguments, stating that Ruby 1.6 does not have keyword arguments, although they were scheduled to be implemented in Ruby 1.8.

00:00:33.360 It looks like the schedule slipped by about ten years since Ruby 1.8 was released in 2003, and Ruby did not support keyword arguments until 2013, when Ruby 2.0 was released. Shortly after that, Programming Ruby talked about how you can use hashes to achieve the same effect as keyword arguments.

00:00:51.360 So if you have a method like this and you want to add keyword arguments to it, you just pass in a hash of values. It does mention that this can be slightly clunky, mostly because these braces could be mistaken for a block. However, it also mentions that the braces are optional if the hash is the last argument. While older versions of Ruby didn't support keyword arguments, they provided something that handled pretty much the same need.

00:01:29.440 This same syntax on the caller side can work when using keyword arguments in Ruby 2.7 and Ruby 3. Before keyword arguments were supported and hashes were used instead, you would just list the hash as a normal parameter. However, this approach made the keywords a required parameter. In general, it's more common for such hashes to be optional parameters, so the typical case for older Ruby code would be to use an empty hash as the default value for the parameter.

00:01:46.960 This does allocate a new hash every time the method is called without an options hash, which can slow things down. High-performance code would usually set up a frozen hash constant and use that constant as the default value for the hash parameter. This way, calling the method without an options hash can avoid allocations. Let's assume this was our existing method in our Ruby 1.8 codebase, and now Ruby 2.0 was released.
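The options-hash pattern described above can be sketched as follows. The method names and the limit option are hypothetical, since the slide code is not part of the transcript.

```ruby
# Pre-2.0 style: an optional options hash with a literal default.
# Each call without options evaluates `{}` and allocates a new hash.
def search_allocating(term, opts = {})
  [term, opts[:limit] || 10]
end

# Allocation-free variant: one shared frozen hash used as the default.
EMPTY_OPTS = {}.freeze

def search(term, opts = EMPTY_OPTS)
  [term, opts[:limit] || 10]
end

default_result = search("ruby")              # falls back to EMPTY_OPTS
custom_result  = search("ruby", :limit => 5) # braces omitted for a final hash
```

Freezing the shared default also guards against a method accidentally mutating a hash that every call would otherwise see.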

00:02:26.720 We want to change the options hash to keyword arguments in the method definition. We can replace the optional hash argument with separate keyword arguments. The keyword arguments must have default values because when keyword arguments were introduced in Ruby 2.0, all keyword arguments were optional. Required keyword arguments were not introduced until Ruby 2.1.

00:03:05.320 This change requires changing the callers of the method. In the Programming Ruby example, string keys are used as keyword arguments, but Ruby 2.0 only supports symbol keys. So specifying keyword arguments when calling the method requires switching to symbols as keys. It would probably be fairly common to switch to the simplified hash syntax introduced in Ruby 1.9 when making this change.

00:03:23.760 If you wanted to support arbitrary keyword arguments, you could add a double splat argument to the method definition, and the double splat syntax could also be used when calling the method if you have a hash that you want to treat as keywords. Now, in Ruby versions before 2.7, using the double splat operator is optional in most cases since hash arguments are implicitly converted to keyword arguments. Ruby 2.0 tried to make the introduction of keyword arguments as backwards-compatible as possible.
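A short sketch of the conversion to keyword arguments, again with hypothetical method and keyword names:

```ruby
# Keyword arguments with defaults (the only kind available in Ruby 2.0).
def search(term, limit: 10, order: :asc)
  [term, limit, order]
end

# A required keyword has no default; this form arrived in Ruby 2.1.
def fetch(id:)
  id
end

# A double splat parameter collects arbitrary keywords into a hash.
def configure(**opts)
  opts
end

settings = { limit: 5, order: :desc }

explicit  = search("ruby", limit: 5)    # order keeps its default, :asc
splatted  = search("ruby", **settings)  # double splat a hash into keywords
collected = configure(verbose: true)    # returns a hash with key :verbose
```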

00:03:55.920 If you were using an options hash approach, you could switch the options hash to a keyword splat, and everything would continue to work. Ruby would automatically treat the hash as keywords. You could even use explicit keywords with a hash argument, and this would also work as long as all keys used in the hash were valid keywords. Since it was already valid syntax to omit braces for a final hash argument, the same syntax was used for keyword arguments, keeping backwards compatibility intact.

00:04:39.680 In general, keyword arguments and final positional hashes were considered interchangeable in most cases. This approach worked fine and did what users wanted it to do. Unfortunately, there were a couple of cases where the approach did not yield the desired results. Mixing optional arguments and keyword arguments generally resulted in undesired behavior if the optional argument could be a hash.

00:05:05.120 This is because the passed hash argument is treated as keyword arguments instead of as the optional argument. Similarly, if the method accepted an argument splat as well as keywords, the passed hash argument is treated as keywords instead of being included in the splat array. In both cases, the way to work around the problem would be to stick an empty hash as the last argument.
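The two problem cases can be illustrated with hypothetical methods. The code below runs as commented on Ruby 3.0 and later; the Ruby 2 behavior in the comments is taken from the talk's description rather than executed here.

```ruby
# An optional positional parameter next to keywords.
def with_optional(opt = nil, **kw)
  [opt, kw]
end

# A rest (splat) parameter next to keywords.
def with_splat(*args, **kw)
  [args, kw]
end

h = { a: 1 }

# Ruby 3.0+: a hash passed positionally stays positional.
optional_result = with_optional(h)  # opt is h, kw is empty
splat_result    = with_splat(h)     # args is [h], kw is empty

# Ruby 2.6 and earlier, per the talk: both calls treated h as keywords,
# leaving opt as nil and args empty, unless a trailing empty hash was
# added as a workaround, e.g. with_optional(h, {}).

# To pass h as keywords in Ruby 3, splat it explicitly.
keyword_result = with_optional(**h)  # opt is nil, kw has key :a
```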

00:05:57.440 The empty hash would then be treated as keywords, allowing the preceding hash to be treated as the optional argument or a member of the splat array. The keyword argument issue didn't just affect literal hashes but any object that is implicitly converted to a hash. So, let's say you passed an instance of YAML::DBM. This class is included in the standard library, and most people would reasonably expect this to get passed as the optional argument.

00:06:39.040 However, that is not the case; it gets passed as keyword arguments even though it is not a hash. Why is that? While examining the source code of the YAML::DBM class, near the bottom, we see that it defines the to_hash method. This is what causes the problem.

00:07:27.440 In Ruby 2, all methods that accept keyword arguments and are passed more than the number of mandatory arguments will check if the final argument responds to to_hash. If the final argument responds to to_hash, it will call to_hash on the argument and treat the resulting hash as keywords.
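The to_hash issue can be reproduced with any class that defines that method. Settings below is a hypothetical stand-in for YAML::DBM, and the comments describe Ruby 3.0 and later; the Ruby 2 behavior is as described in the talk.

```ruby
# Hypothetical class that supports implicit conversion to a hash.
class Settings
  def to_hash
    { k: 1 }
  end
end

def foo(opt = nil, **kw)
  [opt, kw]
end

s = Settings.new

# Ruby 3.0+: s is passed through as the optional argument and kw is empty.
# Ruby 2.x, per the talk: to_hash was called and its result became keywords.
positional = foo(s)

# The conversion still happens when requested with an explicit double splat.
converted = foo(**s)  # opt is nil, kw comes from s.to_hash
```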

00:07:59.680 This has caused substantial problems for libraries that defined the to_hash method in any of their classes. One of the other strange behaviors of keyword arguments was how it handled hashes differently, depending on their contents. Let's use a slightly different example with a method taking an optional argument, an explicit keyword, and a keyword splat.

00:08:59.440 This method will return the argument value, so we can see how it handles the arguments. If we pass in no arguments, we get nil and an empty hash. If we pass an empty hash, we still get nil and an empty hash. That's because the empty hash is treated as the keyword splat.

00:09:57.040 If we pass a hash that has symbol keys, the hash is treated as keywords, with k being set to the explicit keyword argument and the remaining entries being assigned to the keyword splat. If we pass a hash that has string keys, the hash is treated as a positional argument. This is because it contains no symbol keys, and Ruby then determines that it should not be treated as keywords.

00:10:57.440 Ruby 2.7 allows non-symbol keys in keyword splats, but because Ruby 2.6 and below and Ruby 3.0 both treat this case as passing a positional argument, Ruby 2.7 does as well. Finally, if we instead pass a hash that has a symbol key and a string key, Ruby will take such a hash and split it into two hashes, one containing only symbol keys and another containing all other keys.

00:12:06.560 A hash of symbol keys will be treated as keywords, while the other hash will be treated as a positional argument. In this example, the optional positional argument will be set to a hash containing the string key, and the hash with the symbol key will be used to set the keyword argument.
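A reconstruction of this example, assuming a method shaped like the one described (the exact slide code is not in the transcript). The returned values in the comments are for Ruby 3.0 and later, where braces decide how a hash is passed; the Ruby 2.6 results are the ones described in the talk.

```ruby
# Optional positional argument, explicit keyword, and keyword splat.
def foo(opt = nil, k: nil, **kw)
  [opt, k, kw]
end

no_args      = foo                   # [nil, nil, {}]
empty_hash   = foo({})               # Ruby 3: opt is {}; Ruby 2.6: treated as keywords
symbol_hash  = foo({ k: 1, a: 2 })   # Ruby 3: opt is the whole hash; Ruby 2.6: k and kw are set
string_hash  = foo({ "s" => 1 })     # positional in both Ruby 2.6 and Ruby 3
mixed_hash   = foo({ "s" => 1, k: 2 })  # Ruby 3: positional; Ruby 2.6: split in two

# Without braces, Ruby 3 treats every pair as a keyword.
symbol_kw = foo(k: 1, a: 2)          # [nil, 1, { a: 2 }]
mixed_kw  = foo("s" => 1, k: 2)      # [nil, 2, { "s" => 1 }]
```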

00:13:01.760 So that finishes my discussion of the past. Let me now discuss the present, and by that I mean Ruby 2.7 and how it changed compared to Ruby 2.6 in regards to keyword arguments.

00:14:23.200 In November 2017, Matz announced at the RubyConf and RubyWorld conferences that Ruby 3.0 would have real keyword arguments, that is, keyword arguments that are separated from positional arguments. The discussion of how to implement keyword argument separation occurred in Feature #14183.

00:15:01.120 This is one of the largest discussions in the bug tracker with over a hundred comments in a two-year period. The original proposed behavior was for full keyword argument separation, where keyword arguments would never be treated as positional arguments, and positional arguments would never be treated as keyword arguments.

00:15:44.240 So if we have this code where method foo takes a regular argument with the default value being an empty hash and method bar takes keyword arguments, calling the method foo with a hash argument would be fine, and calling the method bar with keyword arguments would be fine. However, calling the foo method with keywords would be an error, and calling the bar method with a hash would be an error.

00:16:31.440 My main issue with this approach was that it broke backwards compatibility for the case where you are calling a method with keyword arguments when the method accepts an options hash. My libraries tend to use this pattern extensively. I didn't want to change these methods to accept keyword splats because those are substantially worse for performance due to the hash allocations.

00:17:39.520 Also, in many cases, methods that support options hashes also support non-symbol keys in the hash, and in some cases, they support non-hash objects in addition to hash objects. Using keywords would not allow me to handle either of those cases. I was well aware of the problems with keyword arguments when also used with optional arguments, which were all caused by positional argument to keyword conversion. I agree that we should fix these cases.

00:18:58.960 The first proof of concept patch related to keyword argument separation was posted by Endoh-san last March, and this implemented the full keyword argument separation. While I only had a little experience with the internals of Ruby back then, I was fairly sure I would not be able to convince Ruby developers to keep backwards compatibility for methods that did not accept keyword arguments unless I had a working proposal with updated tests.

00:19:56.440 Over the course of the next week, I learned enough about the internals to modify Endoh-san's patch to implement my proposal. I probably spent equal amounts of time reading existing code and using trial and error to figure out if my understanding was correct. GDB was both my best friend and my worst enemy. Eventually, I was able to get my proposal working, being backwards compatible for methods without keyword arguments and fixing the issues for methods with keyword arguments.
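The outcome of this compromise can be sketched with the foo and bar methods from the earlier example. The method bodies are assumptions, and the comments describe Ruby 3.0 and later.

```ruby
# foo takes an options hash; bar takes keywords.
def foo(opts = {})
  opts
end

def bar(**kw)
  kw
end

hash_to_foo     = foo({ k: 1 })  # positional hash: fine
keywords_to_bar = bar(k: 1)      # keywords: fine

# Kept for compatibility: foo accepts no keywords, so keywords passed
# to it are collected into a positional hash.
keywords_to_foo = foo(k: 1)

# Ruby 3.0+: a positional hash is no longer converted to keywords.
hash_to_bar_error = begin
  bar({ k: 1 })
  nil
rescue ArgumentError => e
  e
end
```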

00:20:59.360 When I began my work, I had a goal that all code that would break in Ruby 3.0 due to keyword argument separation changes would issue deprecation warnings in Ruby 2.7. However, there were a couple of cases where we actually changed the behavior in Ruby 2.7. One of the changes in Endoh-san's patch was to allow non-symbol keys in keyword hashes if the method accepts arbitrary keywords.

00:21:40.960 This change was made so that more methods that currently accept options hashes can switch to accepting keywords. This code takes a hash with a string key, and we are double splatting the hash when calling the method. This raises a TypeError in Ruby 2.6, but Ruby 2.7 accepts this without an error. Shortly before modifying Endoh-san's patch, I realized that to fix delegation issues with keyword arguments, I would need to make splatted empty keyword hashes not pass positional arguments when calling a method that does not accept keyword arguments.
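Both Ruby 2.7 behavior changes can be shown in a few lines. The method names are hypothetical, and the results in the comments apply to Ruby 2.7 and later.

```ruby
# A method that accepts arbitrary keywords.
def accepts_keywords(**kw)
  kw
end

# A method that accepts no keywords at all.
def accepts_positional(*args)
  args
end

string_keyed = { "a" => 1 }
# Ruby 2.7+: non-symbol keys are allowed when the method takes **kw.
# Ruby 2.6, per the talk: this raised a TypeError.
non_symbol_result = accepts_keywords(**string_keyed)

empty = {}
# Ruby 2.7+: splatting an empty hash passes nothing, so args is empty.
empty_splat_result = accepts_positional(**empty)
```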

00:22:57.440 Now, this change was made so that you could write simple delegation methods in Ruby 3. Here's an example of a method named bar that accepts arbitrary arguments, and here is a method named foo that delegates all arguments and keyword arguments it receives to bar. The problem with older versions of Ruby with foo is that it does not delegate keyword arguments correctly.

00:23:24.720 If you call bar directly with an argument, you get the expected results returned in both Ruby 2.6 and 2.7. However, if you call foo with an argument, you get two arguments passed in Ruby 2.6, one being an empty hash, even though the empty hash was not provided when calling foo.

00:24:37.200 With the changes in Ruby 2.7, you get the expected result when calling foo, exactly the same as if you had called bar directly. This is why before Ruby 2.7, delegating methods would always use regular arguments rather than keyword arguments. The example still works fine in Ruby 2.7 and will work in Ruby 3.0 because bar does not accept keyword arguments.
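The delegation example described here probably resembles the following sketch. The method bodies are assumptions; the Ruby 2.6 result in the comment is the one reported in the talk.

```ruby
# Target method: accepts arbitrary positional arguments, no keywords.
def bar(*args)
  args
end

# Delegating method: forwards positional and keyword arguments.
def foo(*args, **kw)
  bar(*args, **kw)
end

direct    = bar(1)        # [1]
delegated = foo(1)        # Ruby 2.7+: [1]; Ruby 2.6: [1, {}]
with_kw   = foo(1, a: 2)  # bar receives the keywords as a trailing hash
```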

00:25:39.040 However, it would not work if you changed bar to accept keyword arguments. If you call foo with keywords in Ruby 2.6, it's fine since the keywords are passed as a hash to foo, and the hash will be implicitly converted back to keywords when foo calls bar. Ruby 2.7 results in the same behavior but issues a deprecation warning because Ruby 3.0 will no longer implicitly convert the hash to keywords when calling bar.

00:26:38.720 In Ruby 3.0, the hash remains a positional argument when calling bar. Delegation turned out to be a much trickier issue than I originally expected. As I showed a few slides ago, this was a way to do delegation, delegating both arguments and keyword arguments, and I said that it would work correctly in Ruby 2.7.

00:27:44.160 However, there are actually two problems with this approach. The first is if you have a hash and you provide it as a positional argument to the foo method, Ruby 2.6, 2.7, and 3.0 all return the same results.

00:28:28.400 However, Ruby 2.7 warns. This has to be a bug, right? I mean, Ruby should only warn for cases where the behavior is going to change. It shouldn't warn in a case where the behavior is the same. It turns out this is not a bug. The warning here comes from this call to foo: in this call to foo, the positional hash argument is converted to keywords, and that is what is causing the warning.

00:29:00.560 Inside foo, the keyword hash is keyword splatted in the call to bar, and because bar does not accept keywords, the splatted hash is turned back into a regular hash argument. That doesn't warn because Ruby 3.0 will keep compatibility with older versions of Ruby for that type of call. With the initial proposal for full keyword argument separation, this type of code would actually emit two warnings, not just one.

00:30:03.200 It would warn once in the call to foo and again in the call to bar. The other problem with this approach that I've already covered as a positive in terms of Ruby 2.7 behavior is that you can't use this approach with older versions of Ruby because you end up passing an empty hash to the target method if you are calling the delegating method with no keywords where the last positional argument is not a hash.

00:30:51.040 So, what I thought we needed was a delegation approach that would be both backwards compatible with older versions of Ruby and would not issue a warning in cases where the behavior wouldn't change between Ruby 2.6 and 3.0. Unfortunately, I was not able to come up with a good approach; I was only able to come up with a passable hack. I originally called that hack pass_keywords, but it's now known as ruby2_keywords.

00:31:50.840 So, the basic idea with ruby2_keywords is that you can keep your existing delegation code that worked in previous Ruby versions. However, you would use ruby2_keywords to flag the method to pass keywords through the method. Because the ruby2_keywords method may not be defined in previous Ruby versions, you would need to check whether you could use ruby2_keywords.

00:32:55.520 If you call the foo method with keywords, it would be converted to a hash and stored as the last element of the splat array of the method call. However, the hash would have a special flag. When the array of args is used in a splat call to another method, if the last element is a hash that has that special flag, then the hash will be treated as keywords instead of as a positional argument. This results in the same behavior in Ruby 2.6, 2.7, and Ruby 3.0, and it does not cause any warnings.
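A minimal sketch of this approach, using a hypothetical KeywordForwarder class. It contrasts an unflagged delegating method with a flagged one when the target accepts keywords; the unflagged result in the comment is for Ruby 3.0 and later.

```ruby
class KeywordForwarder
  # Target method that accepts keywords.
  def bar(*args, **kw)
    [args, kw]
  end

  # Plain splat delegation: in Ruby 3 the keywords reach bar
  # as a positional hash.
  def plain(*args)
    bar(*args)
  end

  # Same body, but flagged so that a trailing keywords hash is
  # passed on as keywords when args is splatted.
  def flagged(*args)
    bar(*args)
  end
  # ruby2_keywords does not exist before Ruby 2.7, hence the guard.
  ruby2_keywords :flagged if respond_to?(:ruby2_keywords, true)
end

forwarder = KeywordForwarder.new
plain_result   = forwarder.plain(1, a: 2)    # Ruby 3: hash stays in args
flagged_result = forwarder.flagged(1, a: 2)  # args is [1], kw has key :a
```

The guard keeps the same file loadable on versions that lack ruby2_keywords, where the implicit hash-to-keywords conversion makes the flag unnecessary.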

00:34:03.120 One of the reasons for full separation of keywords and positional arguments is that it is always possible to add keywords later without breaking any code. This is called safe keyword extension, and it's something that we gave up by default when we chose a more compatible approach.

00:35:04.880 However, another change added in 2.7 was the ability to add safe keyword extension to a method. We can implement safe keyword extension by adding a syntax that indicates it is forbidden to pass keywords to the method. If you have a method named foo that takes an arbitrary number of arguments, you could call it with a hash and you could call it with keywords to get the same result.

00:36:12.960 Now if you add keywords to the method later, you actually break both of these calls in Ruby 2.7, resulting in an ArgumentError. In the positional hash case, you also get a warning since Ruby 2.7 will helpfully convert the positional hash to keywords and warn before raising an error because the keywords are not valid.

00:37:13.440 In Ruby 3.0, passing the positional hash will work correctly since the hash will be treated as a positional argument. Passing keywords will not work. To avoid this issue for newly defined methods where you don't need to worry about backwards compatibility, you can use the star star nil syntax when defining the method.

00:38:21.760 This syntax forbids keywords: it changes the behavior of the method so that the method acts as if it accepts explicit keywords, but no explicit keywords are defined. This will make using keywords with this method raise an ArgumentError; however, using a positional hash argument will still work correctly even in Ruby 2.7.

00:39:58.960 Ruby 2.7 does not automatically convert this hash to keyword arguments because we don't need to worry about backward compatibility with Ruby 2.6 as the star star nil syntax is not valid in Ruby 2.6.
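A small sketch of the star star nil syntax on a hypothetical method (it needs Ruby 2.7 or later to parse):

```ruby
# **nil declares that this method accepts no keywords.
def foo(*args, **nil)
  args
end

# A positional hash is still accepted and stays positional.
hash_result = foo({ a: 1 })

# Passing keywords raises ArgumentError instead of being
# collected into a trailing hash.
keyword_error = begin
  foo(a: 1)
  nil
rescue ArgumentError => e
  e
end
```

Because callers can never have passed keywords to such a method, adding real keyword parameters later cannot change the meaning of existing calls.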

00:40:50.960 One of the big issues with keyword argument separation was how to handle methods defined in C, both those implemented by core classes and those defined in C extensions. In Endoh-san's original patch, methods defined in C did not implement keyword argument separation.

00:41:54.400 Now, in order to support keyword separation for C methods, there needed to be a way to expose to such methods whether the method was called with a positional hash or keywords while keeping the API for calling the C functions the same.

00:42:55.440 To give methods defined in C the ability to check if keywords were passed to the method, I added a function called rb_keyword_given_p to Ruby's public C API. This function is callable for methods defined in C to check whether keywords were passed when calling the method.

00:43:48.240 Most methods defined in C that accept a variable number of arguments use rb_scan_args in order to parse the arguments. Previously, rb_scan_args treated a hash argument and keyword arguments the same. Here's an example of using rb_scan_args. The first two arguments are argc and argv, usually passed directly from the caller's arguments.

00:44:49.440 The third argument is a format string that indicates how elements of argv should be assigned. The remaining arguments are generally local variables to which the elements of argv should be assigned.

00:45:49.680 In this case, the first character within the format string is one, indicating the method takes one mandatory argument. The second character is one, indicating one optional argument. The third character is a star, indicating a rest argument, which will combine all remaining arguments into an array, and the last character is a colon, indicating keyword arguments.

00:47:12.560 Up until Ruby 2.6, a final hash argument would always be treated as keywords, mirroring the behavior of methods defined in Ruby. In Ruby 2.7, the colon modifier uses rb_keyword_given_p to determine whether or not to emit a warning message, again to mirror the behavior of methods defined in Ruby.

00:48:42.240 Adding rb_keyword_given_p and fixing rb_scan_args handled most issues when calling C methods with keyword arguments. However, there's another side of this coin, which is that C methods can call Ruby methods.

00:50:05.680 One way to do this is to use rb_funcallv. This function takes the method receiver, the id of the method to call, the number of arguments to call the method with, and a C array of the arguments. One issue with this API is it does not offer the ability to specify whether you are passing keyword arguments when calling the method.

00:51:47.520 To add the ability to pass keyword arguments when calling Ruby methods from C, I added rb_funcallv_kw that accepted an additional argument for whether the call passes keywords. There are actually many C functions that are used to call Ruby methods, and all of these lack the ability to specify whether keywords were passed when calling.

00:53:23.680 In all cases, the fix was to add a kw variant of the method that accepts an additional flag for whether keywords are passed. These methods are defined in Ruby 2.7 and later versions and do not exist in Ruby 2.6 and earlier versions. So C extensions that want to be backwards compatible with earlier versions need to implement these using a macro if they are not defined.

00:54:53.600 Ruby's extension documentation provides example macros for all of these. With those changes, both methods defined in C and methods defined in Ruby could handle keyword arguments correctly in Ruby 2.7. Unfortunately, it became apparent that there were numerous special cases that did not handle keyword arguments correctly.

00:56:27.920 So here's a partial list of special cases that needed to be fixed to handle keyword arguments correctly. Unfortunately, due to limited time, I can't go over the details for all of these. So that ends the discussion of the present status of keyword arguments as of Ruby 2.7.

00:57:31.760 Let's finish up with a short discussion about the future of keyword arguments in Ruby. So in early January, the compatibility code and deprecation warnings added in Ruby 2.7 related to keyword arguments were removed.

00:58:13.760 Positional hash arguments are now never treated as keyword arguments. Keyword arguments are now never split into positional hashes and keyword hashes, and calling a method with an empty keyword splat no longer passes an empty positional hash argument.

00:59:30.480 Although I am calling this the future, if you've been using the master branch since the release of 2.7, you've already experienced these changes. One idea I had after the release of Ruby 2.7 was an approach to optimize keyword arguments.

01:00:14.360 Here's an example of calling a method that takes explicit keyword arguments with a keyword splat. This allocates a hash on the caller side, and here’s a slightly modified method that accepts arbitrary keywords. This allocates two hashes, one on the caller and one on the callee side. In both examples, I think it should be possible to avoid the hash allocation on the caller side.

01:01:46.920 We can add a flag for whether the keyword argument passed during the method call is mutable. In this case, the flag would not be set, indicating that the keyword hash is not mutable. The callee could take the hash directly and access the bar member of the hash to set the bar keyword.

01:03:03.320 So this would not allocate a hash. In the case where the method accepts arbitrary keywords, the caller would pass the hash directly, and the callee would duplicate the hash. This way, this code would allocate one hash instead of two.

01:03:53.440 Here's a more advanced case with multiple hash splats when calling the method. In this case, the caller side has to combine these hashes into a single hash, and this case would set the flag, indicating that the keyword hash is mutable.

01:04:52.080 If the method accepts arbitrary keywords, because the mutable flag was set, the callee would not need to duplicate the hash; they could use the hash allocated on the caller side directly, resulting in a single hash allocation instead of two.

01:05:57.760 I was able to implement this approach and it was committed in February, so calls to methods that accept keywords will not allocate more than one hash, and calls to a method that accepts explicit keywords using a single keyword splat will not allocate any hashes.
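The optimization lives inside the VM, so Ruby code cannot observe the mutable flag itself. What can be shown is the guarantee the flag protects: when a single hash is splatted into a method that accepts arbitrary keywords, the callee works on its own copy. The method names below are hypothetical, and the sketch illustrates semantics only, not allocation counts.

```ruby
# Explicit keywords: the callee only needs to read :bar from the hash.
def explicit(bar: nil)
  bar
end

# Arbitrary keywords: the callee receives a hash it is free to mutate.
def arbitrary(**kw)
  kw[:added] = true
  kw
end

caller_hash = { bar: 1 }

read_value  = explicit(**caller_hash)   # reads :bar, returns 1
callee_hash = arbitrary(**caller_hash)  # callee mutates its own copy

# The caller's hash is untouched, which is why the callee must
# duplicate a hash that was passed without the mutable flag.
caller_unchanged = (caller_hash == { bar: 1 })
```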

01:06:42.640 This significantly improves the performance for most simple methods that use keywords. I hope you had fun learning about the past, present, and future of keyword arguments. That concludes my presentation, and I'd like to thank you all for listening.

RubyKaigi 2020 Takeout