Ruby Lambdas

Keith Bennett

@keithrbennett

#lambda

#functional-programming

#object-oriented-programming-oop

In the presentation titled "Ruby Lambdas" given by Keith Bennett at RubyConf 2022, the focus is on understanding and utilizing lambdas in Ruby programming. Bennett emphasizes the importance of lambdas for achieving cleaner, more flexible, and simplified code compared to traditional object-oriented design without lambdas.

Key Points Discussed:
- Definition and Purpose of Lambdas: A lambda is introduced as a free-floating function that is not tied to any object. The presentation highlights the benefits of mastering lambdas, which will lead to cleaner and more efficient code.
- Lambda Definitions and Calls: Bennett explains how to define a lambda, as well as various methods to invoke it, such as using the call method, abbreviation without parentheses, and case equality operator.
- Distinction Between Lambdas and Procs: He clarifies that both lambdas and procs in Ruby are callable, but their behaviors differ, particularly regarding return values and arity checking. Lambdas enforce strict parameter counts and return from themselves, while procs return from the enclosing method.
- Encapsulation and Organization: The use of lambdas as local nested functions is discussed to limit variable scope and complexity in methods, promoting better organization of code. Examples illustrate how complex methods can be simplified using nested lambdas.
- Practical Use Cases: Bennett presents several use cases for lambdas, such as formatting outputs for command line applications, acting as lightweight event handlers, and transforming data in an enumerable sequence. He discusses their role in enhancing reusability and separation of concerns in the codebase.
- Closure and Context: The concept of closures is explained, highlighting how lambdas capture their surrounding context. He illustrates the potential benefits and pitfalls of this feature with examples involving local variables.
- Advanced Concepts: The presentation touches on advanced functionalities like defining classes within lambdas and employing currying and partial application for lambda functions.

In conclusion, the presentation showcases that while lambdas may be less familiar to some Ruby developers, they offer significant advantages for code simplification, modularity, and flexibility. The overarching message is that adopting lambdas can lead to the production of better-structured software solutions, enhancing both functionality and maintainability.

00:00:00.000 Ready for takeoff.

00:00:17.279 I'm Keith Bennett. Thank you for being here. I'm going to talk about lambdas in Ruby.

00:00:22.020 I'm a Ruby, not Rails developer. I say that playfully; I have nothing against Rails, but my career experience hasn't included much of it. I do a lot more command line applications and tools for testing and networking.

00:00:26.640 So, Ruby, not Rails. But maybe in the future, Rails. I am not a functional programming expert. I've just dabbled a little bit in Elixir, Clojure, and Erlang a few years ago. I did some superficial study and really liked it, and I tried to bring some of those concepts to my Ruby programming.

00:00:30.720 We'll be talking about lambdas, obviously: What is a lambda? What does it look like? What is it good for? And how is it used? A lambda is a free-floating function. It's not part of an object; it's an instance of the Proc class, but it's not really attached to an object. Do you remember when you encountered code blocks for the first time? Do you remember how they were confusing at first? If they weren't confusing, then you're smarter than I am. For me, they were confusing. I was wondering, how does this work? But we persevered and mastered them. Was it worth the effort? Of course.

00:01:03.660 Lambdas are the next step in that progression; they're unfamiliar to many of us, but mastering them will bring a lot of benefits, such as these. Let's look at the ways to call a lambda. First, let's define one. Can you see my cursor here? Yes? Great, that's working. So here's a lambda that takes a single parameter and returns whether or not it's a multiple of two—that is, whether or not it's even—and we're assigning it to a local variable called 'is_even.' Now, unfortunately, you can't name a local variable with a question mark. That would be nice, but you can't. That's a little frustrating that you lose that when you use lambdas in local variables.

00:01:58.979 Of course, the conventional notation is to use the call method by name, but there's also an abbreviated notation without parentheses. You can also, strangely, call a lambda by using square brackets, but please don't do this. Most of us see square brackets and think there's a collection we’re trying to find something in. However, if you're in a community where everybody's doing it and everybody understands it and it's a convention, then that's fine, but usually that's not the case. The case equality operator, the triple equals, will also call a lambda, but this too is confusing, and I don't recommend it. However, in a case statement, it can be handy. If you specify a lambda in a case statement in the 'when' clause, then that lambda will be called with the case variable as a parameter.
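
The slide is not reproduced in the transcript. As a rough sketch of the calling styles described here, assuming an `is_even` lambda like the one mentioned (the `is_odd` lambda and the sample values are illustrative additions):

```ruby
# A lambda that reports whether its argument is a multiple of two.
is_even = ->(n) { n % 2 == 0 }
# Hypothetical companion lambda, used only for the case statement below.
is_odd = ->(n) { !is_even.(n) }

calls = [
  is_even.call(4), # conventional: the call method by name
  is_even.(4),     # abbreviated dot-parentheses notation
  is_even[4],      # square brackets (works, but easy to misread)
  is_even === 4    # case equality also calls the lambda
]

# In a case statement, each lambda in a `when` clause is called
# with the case value as its argument.
kind = case 7
       when is_even then "even"
       when is_odd  then "odd"
       end
```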

00:02:36.060 A lot of times, I'm going to be talking about lambdas and procs, but often it really doesn't matter whether the object in question is a lambda or anything else containing a call method. The best name that I can come up with for this broader category is 'callable': anything with a call method, including all proc instances (lambdas and non-lambdas), modules or classes with a call class method, instances of classes that have a call instance method, and any other object having a call method added to it directly.

00:03:20.159 Here's an example of a method that you could use to test whether or not something is callable: it simply calls 'respond_to?', passing it ':call'. We create a class with a call class method, a class with a call instance method, and when we call 'callable' on all these four, they all return true. So here's just an empty lambda, an empty non-lambda proc, a class with a class method 'call,' and an instance whose class has an instance method 'call.' They're all true, so they're interchangeable when you have a method that's taking a callable, and it doesn't have to be a lambda; it can be anything with a call method, because Ruby has duck typing, unlike pretty much every other language, I think.
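
A minimal sketch of that check might look like the following. The method name `callable?` and the two class names are assumptions made for illustration, not necessarily what was on the slide:

```ruby
# True for anything that responds to :call.
def callable?(object)
  object.respond_to?(:call)
end

# Hypothetical class with a call class method.
class ClassWithClassCall
  def self.call; end
end

# Hypothetical class with a call instance method.
class ClassWithInstanceCall
  def call; end
end

candidates = [
  -> {},                    # empty lambda
  proc {},                  # empty non-lambda proc
  ClassWithClassCall,       # class responding to call
  ClassWithInstanceCall.new # instance responding to call
]

results = candidates.map { |candidate| callable?(candidate) }
```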

00:03:50.760 As a result, the dot-parenthesis notation can be used for any of these callables, not just procs or lambdas. If you want to use a lambda where a code block is expected, you just precede it with an ampersand. If you want to use a method where a lambda is expected, you can do this. In Ruby versions before 1.9, this was the only way to specify a lambda; notice it's really the same as a code block. Starting in 1.9, we had the stabby lambda notation added, which we can use this way. If there are parameters to the lambda, then this is the way to pass them in: with the old notation, it's the same notation as with code blocks, and with a stabby lambda, it's the same notation as method calls.
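
The slides for these conversions are not in the transcript. One plausible reading, sketched here under the assumption that "you can do this" refers to `Object#method` (the `triple` method and all variable names are hypothetical):

```ruby
doubler = ->(n) { n * 2 }

# A lambda where a code block is expected: prefix it with an ampersand.
doubled = [1, 2, 3].map(&doubler)

# A method where a lambda is expected: Object#method returns a Method
# object, which also responds to call.
def triple(n)
  n * 3
end
tripler = method(:triple)
tripled = [1, 2, 3].map(&tripler)

# The older notation takes parameters between pipes, like a code block.
old_style = lambda { |a, b| a + b }
# The stabby notation takes them in parentheses before the body.
stabby = ->(a, b) { a + b }
```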

00:04:36.480 Self-invoking anonymous functions are a fancy name for something that calls itself and is not named. Here, we have three lines of code enclosed in a lambda. Why would we want to do that? Isn't that silly? Why not just use those three lines without the lambda? Well, once in a while, it comes in handy for hiding the local variable names from the outer scope. When I worked with CoffeeScript a few years ago, they recommended doing this for that very reason. The dot-parenthesis notation is interesting in that you really need to use the parentheses. You can't just omit them like with method calls. In Ruby, normally you don't need parentheses, but if you're using the dot-parenthesis notation, you do. Here, we create a lambda, we try to execute it, but IRB returns the object itself, the lambda object itself. Of course, if you use the conventional method name, you don't need the parentheses.
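
As an illustration of both points (the three lines inside the lambda and the `greeter` lambda are invented for this sketch):

```ruby
# Self-invoking anonymous function: defined and called in one expression.
# The locals a and b exist only inside the lambda.
product = -> {
  a = 2
  b = 3
  a * b
}.()

# Outside the lambda, a was never defined.
a_leaked = !defined?(a).nil?

greeter = -> { "hello" }

with_dot_parens = greeter.()   # parentheses are required in this notation
with_call       = greeter.call # the conventional name needs no parentheses
bare_reference  = greeter      # no call happens; this is the lambda itself
```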

00:05:55.300 One of the things that can be confusing is the proc class, which can produce instances that are lambdas and instances that are not lambdas. The naming is unfortunate because the name of the class proc is the same as the name of the non-lambda proc when we define it using this syntax. In spoken language, if somebody says 'proc,' it's really ambiguous. For that reason, I usually use the term 'non-lambda proc' when I refer to an instance of the proc class that is not a lambda. Here, we have a lambda; we ask for its class, and it's a proc. We ask, 'Are you a lambda?' This is a method on the proc class, and this method says yes. Here’s a proc—what's your class? Proc. Are you a lambda? No.
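
The IRB session described here can be approximated as follows (variable names are illustrative):

```ruby
a_lambda = -> {}
a_proc   = proc {}

# Both are instances of Proc; only lambda? tells them apart.
lambda_facts = [a_lambda.class, a_lambda.lambda?]
proc_facts   = [a_proc.class, a_proc.lambda?]
```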

00:07:34.899 Let's compare the behavior of lambdas and non-lambda procs: the return behavior is different. A lambda's return only returns from the lambda and not from the enclosing method or lambda. For example, we have a method 'foo,' we have a lambda, and we execute it right there in place. Are we going to see this? What will be returned? Well, it turns out we do see this. So this lambda returned from itself, but not from the 'foo' method. In contrast, a non-lambda proc returns from its enclosing scope. If we do the same thing, but with a proc instead of a lambda, we don't see the 'still in foo' message because it has returned from the method already.
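
A small sketch of the difference, using return values instead of printed messages so the result is easy to inspect (method names and symbols are assumptions for illustration):

```ruby
def foo_with_lambda
  # return leaves only the lambda; execution continues in the method.
  -> { return :from_lambda }.()
  :still_in_foo
end

def foo_with_proc
  # return inside a non-lambda proc returns from the enclosing method.
  proc { return :from_proc }.()
  :still_in_foo # never reached
end

lambda_result = foo_with_lambda
proc_result   = foo_with_proc
```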

00:08:26.040 Arity checking: in case you're not familiar with the term 'arity,' it's just the number of parameters passed to a method. Checking is making sure that the correct number has been passed. Lambdas have strict arity checking; blocks and non-lambda procs do not. Here's a lambda that's expecting two arguments. If we call it with one, we get an error. In contrast, if we have a proc, there's no complaint. And if we have a code block, then there's also no complaint. If we define a method here that creates two random point values and passes them to the passed block, if the block that is passed is only expecting one parameter, there's no complaint. It just uses the first value and substitutes nil for anything that's missing.
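
To make the contrast concrete, here is a sketch. It uses fixed coordinates rather than random ones so the results are predictable, and `with_point` is a hypothetical method name:

```ruby
two_arg_lambda = ->(a, b) { [a, b] }
two_arg_proc   = proc { |a, b| [a, b] }

# A lambda called with the wrong number of arguments raises ArgumentError.
lambda_outcome = begin
  two_arg_lambda.(1)
rescue ArgumentError => error
  error.class
end

# A non-lambda proc fills in nil for the missing argument.
proc_outcome = two_arg_proc.(1)

# Hypothetical method that yields two values to its block.
def with_point
  yield 3, 4
end

one_param    = with_point { |x| x }               # extra value is dropped
two_params   = with_point { |x, y| [x, y] }       # exact match
three_params = with_point { |x, y, z| [x, y, z] } # missing value becomes nil
```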

00:09:15.480 If we pass the right number, of course, it works correctly. If the block expects more parameters than are passed, it still works, but it uses nil for anything missing; if too many values are passed, the extras are ignored. I just want to say before I go on that lambdas and procs are 'selfless.' If you say 'puts self,' you won't get the proc instance that the lambda is; you'll get whatever self happens to be in the enclosing context. In IRB, the name of the enclosing object is 'main,' which is why we see that. The same thing applies with non-lambda procs. These differences in behavior related to arity checking and return behavior lead me to believe that lambdas are preferable: they're safer and more restrictive.
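
The 'selfless' point can be checked outside IRB too. In this sketch, `Widget` is a hypothetical class used only to give the lambda an enclosing object:

```ruby
class Widget
  # self inside the lambda is the enclosing object, not the Proc instance.
  def self_seen_by_lambda
    -> { self }.()
  end

  # The same holds for a non-lambda proc.
  def self_seen_by_proc
    proc { self }.()
  end
end

widget = Widget.new
lambda_self_is_widget = widget.self_seen_by_lambda.equal?(widget)
proc_self_is_widget   = widget.self_seen_by_proc.equal?(widget)
```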

00:10:16.560 Of course, if you need that looser behavior, that's fine, but I think we probably rarely do need that behavior. Unnecessary complexity is our enemy. Right? You don't want a co-worker who's going to make something overly complex, because it's going to be harder to understand, modify, etc. As software developers, we strive to maximize the ratio of functionality over complexity. We want to maximize the functionality we give our users while minimizing the complexity we'll need to deal with over time with this software.

00:10:52.639 Why do we use local variables? To limit their scope. Why? Does it improve simplicity, reliability, and readability? Yes. Regarding the number of instance methods in a class, you may have encountered situations where there is just a really large number of instance methods in a class. One metric of that complexity is the number of possible paths of interaction, and it turns out there's a formula to calculate that: it's the number of methods times that number minus one. This is a very small class, but imagine in your head that it were larger. With only five methods, we have a complexity value of 20.

00:11:52.540 In my experience, it's often the case that a method is used by only one other method. So, let's say, as an example, two of these methods are used by only one of the others. If we could find a way to move those two into that method, we could look at the difference in the complexity metric: with three methods left, it drops from 20 to 6, which is less than a third. Of course, there'd be a tiny bit of complexity in the right-side triangle, but very little. So, let's see if there's a way we could do that. It turns out that in Ruby, you can define methods within other methods.
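
The arithmetic behind those numbers, written as a lambda (the name `interaction_paths` is made up for this sketch):

```ruby
# Possible paths of interaction among n methods: n * (n - 1).
interaction_paths = ->(method_count) { method_count * (method_count - 1) }

before = interaction_paths.(5) # five instance methods
after  = interaction_paths.(3) # two of them moved inside a third
ratio  = after.to_f / before
```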

00:12:30.779 We have a class 'C' here, we have an outer method, we just output a message, and then we define another method. So we create an instance of that class, we call 'outer', and it says it is defining the method. So how would you call the inner method of a method? My best guess is that it would be 'outer.inner.' Let's try that. It doesn't work; what's going on? Well, it turns out that 'inner' is just a regular instance method like any other instance method, so we can't use methods as inner methods in Ruby. But guess what we can use instead? Lambdas.
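
A sketch of that experiment, returning symbols instead of printing messages (the return values are assumptions for illustration):

```ruby
class C
  def outer
    # This def runs when outer is called and defines an ordinary
    # instance method on C, not a method scoped to outer.
    def inner
      :inner_result
    end
    :outer_result
  end
end

c = C.new
inner_exists_before = c.respond_to?(:inner)
c.outer
inner_exists_after = c.respond_to?(:inner)
inner_result = c.inner
```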

00:13:18.180 I call it encapsulation light because it's on a very micro level. We can use lambdas as local nested functions. Here, we have a method, and we want to take two parts of the input and apply the same behavior to both of them—identical computation—so we create a lambda to do that computation and then call it twice here. Now, in case you're not familiar with multiple return values in Ruby, this is the way you do it: you create an array and return the array. Then in Ruby, you can destructure an array by giving a comma-separated list of variable names, and it'll put the appropriate array value into the appropriate local variable name. But let's say there were many lambdas and not only one.
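
The method on the slide is not shown in the transcript. A comparable sketch, with a hypothetical `normalized_name` method, shows a nested lambda applied to two inputs and an array used for multiple return values:

```ruby
def normalized_name(first, last)
  # Local nested function: invisible outside this method.
  normalize = ->(text) { text.strip.capitalize }

  # Multiple return values are returned as an array.
  [normalize.(first), normalize.(last)]
end

# Destructuring assigns each array element to its own local variable.
first, last = normalized_name("  keith ", "BENNETT")
```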

00:14:56.840 Would you notice anything interesting about the structure of that method? You have a method with a lot of little pieces of isolated behaviors. It's kind of like a class, right? A lot of times, I'm writing a method, and as it gets more complex, I say, 'Okay, this should really be a separate behavior,' but I won't put it into an instance method just yet; I'll put it into a lambda and keep going. Once in a while, it just gets so complex that I realize this really should be a class of its own, or sometimes some of them should really be instance methods. So, a method with nested lambdas can easily be converted into a class. Here’s the case we had before: we create the 'compute_part' method and we call it. This is the way we would call it in practice: I would make these class methods so there's no need to create an instance. It’s kind of useless to have an instance there.

00:16:44.740 So we make all these class methods of the module or module methods. I'm not sure what the right name for that is, and then they can be called in the same way. A very common use case for lambdas in my experience is formatters. I do a lot of command line work, and I don't use CSS for formatting; I use printf, and that’s what you’ll see here. Let's say I wanted to produce an output like this—these last three lines, if we look at them, we see that they follow the same pattern: we have a caption, a colon, and a value.

00:17:40.560 So what do we do about that? Well, let's look at the bottom of this. First, the return value is a multi-line string, and because we have the lower-level behavior isolated in a lambda, the reader doesn't have to bother filtering out that low-level implementation to understand what’s going on. They say, 'Okay, something is being formatted in some way, and here are the values that are being formatted.' If they care to see the low-level implementation, they can go look at it, but they probably won't, and it saves people time when they’re reading.
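
A formatter along these lines might look like the sketch below. The method name, captions, and column width are assumptions, since the slide is not in the transcript:

```ruby
def file_summary(name, size, owner)
  # Low-level detail lives in one place: caption, colon, value.
  format_line = ->(caption, value) { format("%-8s: %s", caption, value) }

  # High-level view: which values are reported, in what order.
  [
    format_line.("Name", name),
    format_line.("Size", size),
    format_line.("Owner", owner)
  ].join("\n")
end

summary = file_summary("notes.txt", 1024, "keith")
```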

00:18:40.560 There are two really good benefits of this: one of them is that the code is more DRY (Don't Repeat Yourself), and the other is that you're separating high-level from low-level code. In my experience, one of the things that makes code the most difficult to understand is when high- and low-level code are mixed together. It's really important and helpful to separate them if you can. Here's another example of using lambdas for formatters within a list of interchangeable formatters. We have a hash here; this is from my rexe gem, which simplifies the command line use of Ruby with inputting and outputting different formats and some other things.

00:19:56.060 You can configure this to use different formatters and parsers, meaning you would give a command line option, like '-i j', and it would convert that into a symbol. This hash would then allow the configuration to fetch the callable for that symbol and plug it into a variable, which can then be used for the remainder of the program. The same goes for parsers. And I just wanted to mention that these callables all need to have the same interface: they need to take the same parameter and return the same kind of thing. In short, they need to be interchangeable.
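
This is not the gem's actual code. As a simplified sketch of the lookup idea, with invented option letters and hash names:

```ruby
require "json"
require "yaml"

# Every formatter takes an object and returns a string.
formatters = {
  j: ->(object) { object.to_json },
  y: ->(object) { object.to_yaml },
  i: ->(object) { object.inspect }
}

# Every parser takes a string and returns an object.
parsers = {
  j: ->(string) { JSON.parse(string) }
}

# A command line option value such as "j" is converted to a symbol
# and used to fetch the matching callable.
formatter = formatters.fetch("j".to_sym)
parser    = parsers.fetch("j".to_sym)

output     = formatter.({ "a" => 1 })
round_trip = parser.(output)
```

Using `fetch` rather than `[]` means an unknown option raises `KeyError` instead of silently yielding nil.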

00:20:46.620 So in that case, we were taking an object and returning a string. In the case of parsers, we’re taking in a string and returning an object. This is where configuration will come in: it will take those formats, look up the behaviors corresponding to the options, store them in instance variables, and then use them later. Lambdas are handy for threads. When you create a thread and launch it, you pass it a code block, but using the ampersand, you can use a lambda instead. Then you have all the power of lambdas.
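
For the thread case, a minimal sketch (the `worker` lambda and the queue are illustrative):

```ruby
results = Queue.new

# The thread body is defined once as a lambda...
worker = -> { results << :done }

# ...and passed where Thread.new expects a block, using the ampersand.
Thread.new(&worker).join

finished = results.pop
```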

00:21:42.400 Lambdas are closures, so they carry with them the context of the scope in which they were defined. If there's a local variable 'n' which is 15, we can output it. Now, we could pass this to somewhere else in the program, and it would still work. Fortunately or unfortunately, you could also modify those values, and that could be a problem. But if it is a problem, you can tell Ruby, 'Oops, I want this 'n' variable to be a local lambda variable; don't use the enclosing scope.' And that works fine.
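
The closure behavior described here can be sketched as follows. The helper method `call_elsewhere` is hypothetical, and the block-local variable is declared after a semicolon in the parameter list:

```ruby
n = 15
show_n = -> { n }

# The lambda still sees n when called from a method that has no n of its own.
def call_elsewhere(callable)
  callable.()
end
seen_elsewhere = call_elsewhere(show_n)

# A closure can also modify the variable it captured.
-> { n = 99 }.()
after_change = n

# Declaring n as block-local keeps the outer n untouched.
lambda { |;n| n = 1 }.()
after_isolated = n
```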

00:22:56.340 You can call binding on a lambda, and you will get the binding that contains those local variable definitions and some other information. I don't know if it would ever be useful, but it's there if you want it. Private methods aren't really private; you can use 'send' to call them. If you want something truly private, you could put it in a lambda and assign it to a local variable, and that would be invisible to the outside world. Why would you want to do that? I'm not sure; maybe if you don't want your library users to cheat and use things you don't want them to, because you're going to change them later. Unfortunately, it also means you can't get to it with a unit test either. If you really need to unit test this behavior and it's not enough to test the behavior of the method in which the lambda lives, you're out of luck. You probably want to make that a method instead of a lambda.
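
A sketch of these three points, using a hypothetical `Vault` class:

```ruby
# 1. A lambda's binding exposes the locals it closed over.
n = 15
show_n = -> { n }
n_from_binding = show_n.binding.local_variable_get(:n)

class Vault
  def unlock
    # 3. A lambda in a local variable cannot be reached from outside.
    combination = -> { "1234" }
    "opened with #{combination.()}"
  end

  private

  def secret_helper
    :reachable_anyway
  end
end

vault = Vault.new

# 2. send bypasses private visibility.
via_send = vault.send(:secret_helper)
unlocked = vault.unlock
```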

00:24:51.300 Lambdas are great lightweight event handlers. You can define a lambda and then put it in a variable and pass it, or you can just define it in place without assigning it to a variable. Look at this notation and compare it to what it would look like if you were passing a code block. It's almost the same: the only difference is the parentheses and the arrow. Syntactically, it's really no big deal to use a lambda instead of a block, of course, provided the thing you're calling can deal with it, and it will be handled differently on that end.
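
To illustrate both ways of supplying a handler, here is a sketch with a made-up `Button` class that stores callables:

```ruby
class Button
  def initialize
    @handlers = []
  end

  # Accepts any callable as an event handler.
  def on_click(handler)
    @handlers << handler
    self
  end

  def click
    @handlers.each { |handler| handler.(:click) }
  end
end

log = []
button = Button.new

# Defined first, stored in a variable, then passed.
logger = ->(event) { log << event }
button.on_click(logger)

# Defined in place; compared with a block, only the arrow and the
# parentheses around the parameter differ.
button.on_click(->(event) { log << "inline #{event}" })

button.click
```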

00:26:04.680 For a lot of us coming from object-oriented languages other than Ruby, we're used to using classes for polymorphism. In Java, for example, you'd need to create an interface defining even the call method. It's a lot of verbosity! Compare that with this: lambdas are just so simple; they're really good in cases where you just need something simple. Predicates are functions that return either true or false, and we use them a lot in Ruby. The 'select' method, for example, takes a filter for messages and only adds to a list those messages that pass that filter.

00:27:36.240 This is what it might look like if we used a lambda as a parameter. We have a parameter called 'filter,' and we give it a default value, which is a lambda that returns true unconditionally—in other words, not filtered at all. Where we call it, we can say 'do this' if the filter returns true. That’s pretty simple and self-explanatory. If we contrast this with the more conventional use of code blocks, first of all, there's no mention of the code block in the signature. You can specify a name and then use that name, but it’s not conventionally done.
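
The two styles might be contrasted like this (method names and sample messages are assumptions for the sketch):

```ruby
# Lambda parameter: the filter appears in the signature, with a default
# that accepts everything.
def collect_messages(messages, filter = ->(_message) { true })
  messages.select { |message| filter.(message) }
end

# Block version: nothing in the signature mentions the filter, and the
# call site inside is a bare yield.
def collect_messages_with_block(messages)
  messages.select { |message| !block_given? || yield(message) }
end

messages = %w[a bb ccc]
long_enough = ->(message) { message.size > 1 }

unfiltered   = collect_messages(messages)
filtered     = collect_messages(messages, long_enough)
block_result = collect_messages_with_block(messages) { |m| m.size > 1 }
```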

00:28:23.760 Also, more importantly, look at how this is being called: does that give you a clue about what's going on in terms of the actual logic? Not really; it’s kind of obtuse in my opinion. Furthermore, if we need to pass multiple behaviors, we certainly can't use code blocks because we only get one code block per method call. So here's an example of something that takes two.

00:28:55.740 There's another point about the idea of separating high-level from low-level concerns and also separating unrelated concerns into different areas of code. This is a method I was working with to ingest network messages from various sources and then filter them, wanting to get rid of some of them, keeping only what I wanted. There were different types of messages and different behaviors to apply to them. The first time I approached this, I wrote it all in one place and thought there was so much going on—it was complex. If I'm doing two things in the same area of code, then my complexity is probably squared—it's probably four—and if I can separate them out, it might be a quarter as complex and certainly easier to read.

00:29:15.060 So, I created this buffered enumerable class, which just handled the process of receiving the messages and buffering them, then yielding them. I parameterized the behavior that knew how to fetch the messages and the behavior that notified you in whatever way you wanted that a chunk of messages had been fetched—maybe a log, or a progress indicator, or something like that. Lambdas are great for that kind of thing. Now, here's a really weird thing that I learned, and I don't know if it really has any use in the real world at all, but you can define a class in a lambda using the conventional class definition notation.
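
The sketch below is not the speaker's actual class. It is a simplified illustration, with invented parameter names, of an enumerable that takes two behaviors as lambdas, one to fetch a chunk and one to report that a chunk arrived:

```ruby
class BufferedEnumerable
  include Enumerable

  # fetcher: callable taking a chunk size and returning an array.
  # fetch_notifier: callable taking the fetched chunk; defaults to a no-op.
  def initialize(chunk_size, fetcher:, fetch_notifier: ->(_chunk) {})
    @chunk_size = chunk_size
    @fetcher = fetcher
    @fetch_notifier = fetch_notifier
  end

  def each
    return to_enum(:each) unless block_given?

    loop do
      chunk = @fetcher.(@chunk_size)
      break if chunk.empty? # an empty chunk signals the end of input
      @fetch_notifier.(chunk)
      chunk.each { |item| yield item }
    end
  end
end

source = (1..5).to_a
chunk_sizes = []

fetcher  = ->(count) { source.shift(count) }
notifier = ->(chunk) { chunk_sizes << chunk.size }

items = BufferedEnumerable.new(2, fetcher: fetcher, fetch_notifier: notifier).to_a
```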

00:30:10.760 You could define a class in a method by using 'Class.new' and 'define_method,' but if you want it to look as if it were a conventional class definition, you could do that in a lambda, as long as that lambda is not defined inside a method. So here, we're creating a lambda and assigning it to a class constant. If we have a method that refers to that lambda and then calls it, that works. Again, I'm not sure if that's useful at all, but I found it fascinating that you could do that.

00:31:05.180 Transform chains, ETL light! Usually, when we work with enumerables, we take a list of values and iterate over them with the same behavior. I'm talking about taking a list of behaviors and iterating over them with a single value, actually in sequence. So, let's say we have a tripler lambda and a square lambda, and we put them in an array called 'transformers.' Then we have a starting value of four. We call 'inject' with that starting value of four, and then apply each behavior in succession to the value, accumulating a final result.
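
In code, that chain could be sketched like this (the lambda names are guesses based on the description):

```ruby
tripler = ->(n) { n * 3 }
squarer = ->(n) { n**2 }

transformers = [tripler, squarer]

# Start with 4, then feed each result into the next transformer:
# 4 -> 12 -> 144.
result = transformers.inject(4) { |value, transformer| transformer.(value) }
```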

00:32:12.600 I'm not entirely sure when or how this would be useful, but it's really interesting that this can be done, and it might come in handy sometime. As you work with lambdas more, you may find that there's duplication you want to resolve. Here is an example: all of these lambdas are multiplying by a factor. There are two major ways to deal with this. One of them is called partial application, which is basically creating a method or lambda that pre-fills value(s) in another lambda.

00:32:29.780 Here's an example: we have a lambda that takes in a factor and returns another lambda, hard-coding that factor into the returned lambda. Moreover, this hard-coded factor is now immutable, and that can be handy as well. We can call it with a 3, for example, and it will hard-code that 3 here, returning a lambda that returns n times 3. The other way is currying: let's say you have a lambda that takes two numbers. You can pre-fill a 3 in there by calling '.curry' and passing 3 to that, obtaining that tripler.
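
Both approaches, sketched with assumed names (`multiplier_for` and `multiply` are not from the talk):

```ruby
# Partial application: a lambda that returns a lambda with the factor
# baked in.
multiplier_for = ->(factor) { ->(n) { n * factor } }
tripler = multiplier_for.(3)

# Currying: start from a two-argument lambda and pre-fill the first one.
multiply = ->(a, b) { a * b }
curried_tripler = multiply.curry.(3)

# The same thing split into two steps.
curried = multiply.curry
split_tripler = curried.(3)

results = [tripler.(5), curried_tripler.(5), split_tripler.(5)]
```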

00:32:56.920 The other thing you can do is if you want to split that out, you could just get the return value from curry and put it in a variable called 'curry'. This just might be simpler to understand; it’s not something that you might want to do, but you can. And then you just pass 3 to that curried variable to get the tripler. We're done! Thank you for listening.

00:33:09.240 I'll leave my contact information up there for as long as they keep that on. Feel free to contact me; I’d love to hear from you. Thanks for listening.

RubyConf 2022