GoRuCo 2016

Micro Talk: Enumerable's Ugly Cousin

Enumerable's Ugly Cousin by Ross Kaffenberger

Everyone loves Ruby's Enumerable module. What about Enumerator? Many of us don't know what Enumerator is or why it's useful. It's time to change that. We'll challenge conventions and (finally?) understand why Enumerator is important while unveiling the 'magic' of how it works.

00:00:15.760 Public service announcement: if you see an organizer, thank an organizer. Alright, so Ruby's Enumerable module is something we all love. We call it powerful, simple, and elegant. It's how modules should be made, which is why we fall in love with Ruby.
00:00:20.900 In contrast, here's what some people say about Enumerator: 'I don't get it. That's ugly. Why would I ever use this? This seems like a big hack.' Something about Enumerator doesn't quite feel right. It doesn't feel like the Ruby way; it's not clean, concise, or elegant. Instead, it feels awkward, and we may even be tempted to call the RuboCop on Enumerator.
00:00:38.200 Thankfully, we don't have to deal with the PHP police. They say, 'You are all wrong; it's also illegal.' This makes me feel bad for Enumerator. I want to give it some love because I believe it is an amazing tool to have in your repertoire. However, I haven't always felt this way. When I started with Ruby, I didn't get it. I didn't know much about it. I first started in Ruby back in 2008 when I joined WePlay, a youth sports social network. There, I worked with some of the most talented developers in the Ruby community, including some members of the 9-10 club that we saw earlier.
00:01:19.680 We had built a Google Calendar clone for team scheduling, and at some point, we were asked to add recurring events so that coaches could easily repeat practices and games. You can imagine our sprint planning: we were very confident. We discussed it and thought it would be straightforward—just take our existing events and make them repeatable, right? So we estimated something like four days, but you can probably guess we were wrong. It took well over a month just to introduce the basics of recurring events. It turns out this was a challenging problem for us, and it was humbling.
00:02:09.650 Keep in mind this was still early in the game for Rails, so we couldn't rely on all the plugins we have today. One thing among many we were missing was a way to model infinitely repeating events in Ruby. Think about it for a second: how would you represent an abstraction like the second Tuesday of every month forever, so you can enumerate and query dates in that sequence? Turns out, as I’ve come to learn, one way to do it is to use Enumerator, and today we're going to see why.
00:02:36.840 A quick recap from the Ruby documentation: Enumerator is a class for internal and external iteration. You can get an enumerator for any object by calling the `to_enum` method. Enumerator itself includes Enumerable, so you can call each, map, and select on it. You can also call the next method repeatedly for so-called external iteration. However, I don’t find the documentation incredibly inspiring, nor does it do a great job of demonstrating what Enumerators are capable of. So I'd like to try.
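As a rough illustration of that recap (the array and variable names here are made up for the example, not taken from the talk's slides), internal and external iteration on the same enumerator might look like this:

```ruby
# Any object that responds to #each can be wrapped with to_enum.
colors = %w[red green blue]
enum = colors.to_enum # defaults to wrapping colors.each

# Internal iteration: Enumerator includes Enumerable,
# so the enumerator drives the loop and calls our block.
enum.map(&:upcase)             # => ["RED", "GREEN", "BLUE"]
enum.select { |c| c.size > 3 } # => ["green", "blue"]

# External iteration: the caller pulls one value at a time.
enum.next # => "red"
enum.next # => "green"
enum.next # => "blue"

begin
  enum.next
rescue StopIteration
  # Raised once the underlying sequence is exhausted.
end
```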
00:03:30.160 For one, Enumerators help us implement the generator pattern in Ruby. I don’t mean Rails generators or these generators; I mean the fundamental abstraction for producing data on the fly. Generators can pause and resume execution control as they emit multiple values, making them a powerful building block across many languages for behaviors like list comprehension, lazy evaluation, and asynchronous operations. They're powerful concepts but they come with some baggage depending on the community. Generators were only recently introduced into JavaScript in ES2015, and some think they're harmful due to potential performance concerns.
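To make the "pause and resume" idea concrete, here is a minimal sketch in Ruby. The `countdown` method and its log array are hypothetical names invented for this example; the log only exists to show where the generator body stops between values.

```ruby
# Hypothetical generator: emits 3, 2, 1 and records its progress in `log`
# so the pause points are visible from the outside.
def countdown(log)
  Enumerator.new do |yielder|
    log << "started"
    yielder << 3 # with external iteration, execution pauses here
    log << "resumed after 3"
    yielder << 2
    log << "resumed after 2"
    yielder << 1
  end
end

log = []
gen = countdown(log)

log      # => [] (the body has not run yet)
gen.next # => 3
log      # => ["started"]
gen.next # => 2
log      # => ["started", "resumed after 3"]
```

Nothing in the block runs until the first value is requested, and each call to `next` runs the body only as far as the following `yielder <<`.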
00:04:15.980 The venerable Airbnb style guide says simply, 'Don't use generators.' Others recognize their importance but kind of feel bad for them. Still, they are being used for next-generation web frameworks in Node.js and the much-anticipated async/await feature. In contrast, take a look at Python where generators are held in high regard. Talk to an experienced Pythonista, and you'll likely hear praise like 'Generators are awesome.' The key benefit of generators is that they let us be lazy.
00:05:05.800 I'm actually going to use some Python code to show what I mean. Imagine we write a Fibonacci function that, given n, returns the first n values of Fibonacci. We could start with an empty list, loop n times, calculate, and append each member of the sequence to the list, then return the result. Now we can iterate through this list with a for loop. This is an eager expression: all the values are calculated before the function returns; everything is produced in memory upfront.
00:05:20.420 Let's turn this into a generator. We replace our imperative eager list manipulation with a yield statement in Python. This means the function now returns a generator object that can emit members of the sequence on the fly. It’s now lazy; the generator won't produce values until they’re needed for iteration. Guess what? We can do this in Ruby! So let's port the Python Fibonacci function to Ruby almost line by line; it’s really just changing tabs to spaces.
00:06:15.060 There's one problem, though: to make this equivalent to the Python function, we need its return value to be an enumerable generator, and right now it is not. To enumeratorize it, we're going to add a guard clause that returns an enumerator when this method is called without a block. Now drink this in for a second: this looks kind of ugly, doesn’t it? We're converting an object to an enumerator, passing in this magic method name, and using a conditional. It’s a little difficult to wrap our heads around. The big win, though, is that we have now enumeratorized this method: it returns an enumerator that can lazily generate the sequence as an Enumerable.
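The slide code is not part of this transcript, so the following is only a sketch of what such a method could look like. The name `fib` and the choice to start the sequence at 0 are assumptions.

```ruby
# Yields the first n Fibonacci numbers to the given block.
def fib(n)
  # Guard clause: with no block, hand back an Enumerator that will call
  # fib(n) again later, this time supplying a block of its own.
  return to_enum(:fib, n) unless block_given?

  a, b = 0, 1
  n.times do
    yield a
    a, b = b, a + b
  end
end

# With a block, values are handed over one at a time.
fib(5) { |x| puts x }

# Without a block, we get an Enumerator and all of Enumerable with it.
fib(10).to_a                # => [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
fib(10).select(&:even?)     # => [0, 2, 8, 34]
fib(10).find { |x| x > 10 } # => 13
```

Because `find` stops at the first match, the enumerator returned by `fib(10)` never computes the members after 13 in that last call.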
00:07:01.540 In other words, we can use methods like map, find, and select on this Fibonacci enumerator. It's kind of cool! In case you think I’m making up the term 'enumeratorized', note that I borrowed it from the MRI source code for the 'to_enum' method in enumerator.c. Yes, naming things is really hard! Enumerators can also be chained together to stream values lazily, which is useful for large or memory-intensive data sets.
00:07:43.360 Imagine our infinite range. We can't actually process the data with Enumerable methods because they’re eager by default. This map call will never finish. However, we can insert the lazy method, which creates an enumerator pipeline that can stream values. Now we can filter and reduce this infinite range to get a result. So how is this possible? Lazy augments how data is processed along the chain. The eager pipeline processes each member of the collection before moving on to the next step, whereas the lazy version passes data all the way down the chain before moving on to the next member.
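The difference in processing order can be seen with a small tracing sketch. The `trace` and `first_even_squares` helpers are hypothetical names made up for this illustration.

```ruby
# Runs the same map/select/first pipeline over `source` and returns a log
# of the order in which the blocks were called.
def trace(source)
  log = []
  source
    .map { |n| log << "map #{n}"; n * 2 }
    .select { |n| log << "select #{n}"; n > 2 }
    .first(1)
  log
end

# Eager: each step finishes the whole collection before the next step starts.
trace([1, 2, 3])
# => ["map 1", "map 2", "map 3", "select 2", "select 4", "select 6"]

# Lazy: each member travels down the whole chain, and iteration stops
# as soon as first(1) has what it needs.
trace([1, 2, 3].lazy)
# => ["map 1", "select 2", "map 2", "select 4"]

# The same idea makes an infinite range workable.
def first_even_squares(count)
  (1..Float::INFINITY)
    .lazy
    .map { |n| n * n }
    .select(&:even?)
    .first(count)
end

first_even_squares(3) # => [4, 16, 36]
```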
00:08:36.510 We can potentially terminate the iteration early, which is ideal for working with a generated data set. Enumerators help us solve problems that would otherwise be hard to solve eagerly. Think of streaming API clients, processing large files, or generating infinite sequences like recurring events. So, all these years later, after absorbing all this knowledge about generators and laziness, it has recently dawned on me that we can model event recurrence with an Enumerator.
00:09:12.580 I wrote a recurrence abstraction, and at its heart is an Enumerator which can lazily yield successive date-times in a recurrence with the potential to generate these elements indefinitely. That’s right: I Enumeratorized it! This is all packaged up in a gem called Montrose, which is for modeling recurring events in Ruby. You can check it out on GitHub or ask me about it later.
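As a rough sketch of the idea, and not Montrose's actual API or implementation, "the second Tuesday of every month forever" could be modeled with a plain Enumerator. The `second_tuesdays` helper is a hypothetical name used only here.

```ruby
require "date"

# Hypothetical helper, not part of Montrose: yields the second Tuesday of
# each month, beginning with the month that contains `from`, with no end.
def second_tuesdays(from)
  Enumerator.new do |yielder|
    month_start = Date.new(from.year, from.month, 1)
    loop do
      # Days from the 1st to the first Tuesday (wday 2), then one more week.
      offset = (2 - month_start.wday) % 7
      yielder << month_start + offset + 7
      month_start = month_start >> 1 # advance one calendar month
    end
  end
end

recurrence = second_tuesdays(Date.new(2016, 6, 1))

# Enumerate: take only as many occurrences as needed.
recurrence.first(3).map(&:to_s)
# => ["2016-06-14", "2016-07-12", "2016-08-09"]

# Query: find stops at the first match, so the infinite sequence is safe.
recurrence.find { |date| date.year == 2017 }.to_s
# => "2017-01-10"
```

Eager methods such as `map` or `select` without `lazy` would never return on this enumerator, so callers need a terminating method like `first`, `find`, or `take_while`.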
00:09:51.130 So, the next time you find yourself creating a method to return an eagerly created collection, you might want to consider whether to make it lazy. I like to think of this as a choice between Costco and Netflix. You can choose to buy in bulk and potentially waste quantities all upfront or be flexible and go on demand. Neither is right or wrong, but it may depend on the problem you're trying to solve. Costco covers more of your basic needs, but I personally find Netflix to be a lot more fun. And what could be more Ruby than that?
00:10:13.440 I think Enumerator is beautiful because, at a high level, it gives us the option of laziness. It’s not something you necessarily need all the time, but it may come in handy. So, even if Enumerator is not your thing, my takeaway for you is to look at the beauty in what may seem like ugly code because sometimes it can teach us something. And it did for me.
00:10:40.000 My name is Ross Kaffenberger, and I'm rossta on the internet. Thank you very much.