Talks

Enumerable's Ugly Cousin

Everyone loves Ruby's Enumerable module. What about Enumerator? Many of us don't know what Enumerator is or why it's useful. It's time to change that. We'll (finally?) understand why Enumerator is important and unveil the "magic" of how it works. After learning how to get started with Enumerator, we'll build up to some diverse use cases, including web crawlers and recurring events. We'll jump from crazy ideas, like emulating lazy sequences more common in functional programming languages, to sane takeaways for more common problems.

Even if you've been programming Ruby for years, you may see something new or, at least, see a familiar problem with a fresh perspective. Even if you don't adopt Enumerator into your daily work, you'll come away with a deeper understanding of its advantages and how it complements its famous relative.

Ancient City Ruby 2016

00:00:00.000 So let's get started. We have Ross Kaffenberger here, and he’s going to be talking about Enumerable’s ugly cousin.
00:00:05.580 Good morning, everyone. I'm here to talk about Enumerator today, which I consider to be the ugly cousin of Enumerable.
00:00:17.789 My name is Ross Kaffenberger; I go by rossta on the internet, and I come from New York City. Is anyone else here from New York City? Very few, okay.
00:00:29.220 I’m not a native New Yorker, but there are a few things I’ve learned about living in New York over the past few years that sort of bother me.
00:00:42.149 For example, if you’re a New Yorker, you have to pretend like everything is normal, no matter what. The city has such a high-paced environment that you're basically forced to choose one of two speeds: move fast or get out of the way.
00:00:56.430 You also find yourself questioning your self-worth at every turn, even on a philosophical level. Even the subway ticket machines ask you if you're adding any value.
00:01:10.290 It can be tough, especially as someone entering the tech field. This talk is inspired, among other things, by an internet cat and my feeling that the Enumerator concept in Ruby doesn't get enough love. I feel a bit bad for it, so this is my chance to show you what I mean.
00:01:31.140 Here are some of the things Rubyists say about Enumerable: 'Yeah, you should learn how to use it. It’s powerful, elegant, and it’s why I fell in love with Ruby.' However, when Rubyists encounter Enumerator for the first time, they often say, 'I don’t get it. It's confusing, it looks ugly, and I would never write code this way. Why would I ever use this? It seems like a big hack.'
00:01:50.189 I’m trying to change that perception a little bit. I’m not sure if I'll be successful, but I get the feeling that folks don’t believe that Enumerator obeys the Ruby way.
00:02:01.729 When we talk about the Ruby way, we think of things like expressiveness, cleanliness, happiness, and elegance. Enumerator may not fit those descriptions because we have conventions in Ruby.
00:02:22.110 Conventions are great because they make us more productive; they allow us to have a shared understanding of how to write Ruby code and help us get things done.
00:02:44.400 However, I suggest that adhering too strictly to conventions, especially when learning, can hold us back. For example, for loops are a really useful concept found in many languages, including Ruby.
00:03:08.050 But would you ever see a real Rubyist write code this way? No. Real Rubyists use each and Enumerable to iterate over things like an array.
00:03:37.440 Zed Shaw would call this indoctrination. We have to be careful when we teach the right way of doing things as if it's some kind of moral duty.
00:03:52.800 There’s a difference between code that works, which is correct, and code that is done the right way. We’ve even started automating the review of code that’s done the right way.
00:04:07.500 We are giving ourselves gray areas and policing our own code for doing things the right way. Other languages also police the way code is written.
00:04:19.780 If you think it’s a stretch to compare writing code to legislation, consider that there’s a petition to the White House to outlaw programming languages that threaten the safety of American citizens.
00:04:41.260 Some of the suggested languages to restrict are JavaScript, Java, and Ruby. So, if we’re not careful, it may someday be illegal to write Ruby altogether.
00:05:01.990 The point I’m trying to make is that conventions are social norms but they are not universal truths. Conventions are good, but it's also worth exploring unconventional ideas, dare I say ugly.
00:05:15.370 This brings me back to Enumerator. I want to discuss what an Enumerator is and why you should care.
00:05:27.300 Ruby documentation states that Enumerator is a class that allows for both internal and external iteration. But I don’t know about you, that’s not terribly inspiring.
00:05:39.490 To give you a bit more idea, you can get a reference to an Enumerator on any object, like an array, by calling a special method: to_enum.
00:05:55.930 When you call that method on an array or any other object, it gives you back an Enumerator that represents that object. Typically, the each method is used.
00:06:10.290 With an Enumerator, you can iterate internally using a block, just like you would with any other collection, but you can also iterate externally.
00:06:24.660 You can call the next method on any Enumerator to get the items in that collection out one by one. When you reach the end of calling next, it raises a StopIteration exception.
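The slide code is not part of this transcript. A minimal sketch of the two iteration styles just described, using a made-up array of letters, might look like this:

```ruby
# An Enumerator over a small array (the values are illustrative).
letters = %w[a b c]
enum = letters.to_enum # same as letters.to_enum(:each)

# Internal iteration: the Enumerator drives the block.
upcased = enum.map { |letter| letter.upcase }

# External iteration: the caller pulls items out one at a time.
first  = enum.next
second = enum.next
third  = enum.next

# Past the end of the collection, next raises StopIteration.
finished = begin
  enum.next
  false
rescue StopIteration
  true
end
```

Internal iteration does not disturb the external position, so the first call to next still returns the first element.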
00:06:48.060 I want to explore a bit beyond this basic definition, following my personal journey in learning how to program and looking at other programming languages.
00:07:08.400 When I first started programming, the first language I learned was Java. Did anyone else start with Java? Maybe a few.
00:07:24.000 Java has a concept called an Iterator, so I’d like to show you some Java code to demonstrate what an Iterator does because it relates to what an Enumerator is.
00:07:45.030 In Java, we have a verbose way of creating a list of items and calling a function called iterator that returns an Iterator object. You can then call a next method to get the items one by one.
00:08:03.030 You can think of an Iterator as an object that can suspend its iteration, passing control back to its caller. It's waiting for the caller to call the next method again to continue.
00:08:27.420 You can think of it like yield in Ruby turned inside out. If you define a method in Ruby and use the yield keyword, you pass members to a block.
00:08:38.640 With an Enumerator, calling next hands those items back to the calling context, so you don’t need a block.
00:08:59.830 So why would we ever use this? We don’t see this often in Ruby. One use case would be to replace nested loops.
00:09:06.730 Imagine we’re building a web page and creating a list of rows in a table. It's common to add a different color to alternating rows.
00:09:21.150 We can create an Enumerator for colors in Ruby, using the to_enum method, which takes arguments like the name of an Enumerable method.
00:09:39.610 The cycle method will iterate indefinitely over an array. Within a loop where I'm iterating over a list of projects and rendering them to HTML, I simply ask for the next color.
00:09:54.490 It will alternate because I have two colors in that Enumerator.
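The alternating-row idea can be sketched roughly as follows. The color names, project names, and HTML are placeholders, not the code from the slides:

```ruby
# Row colors and project names are made up for illustration.
colors   = %w[white gray].to_enum(:cycle) # same as %w[white gray].cycle
projects = ["Alpha", "Beta", "Gamma"]

rows = projects.map do |project|
  # Ask for the next color instead of tracking an index or nesting a loop.
  %(<tr class="#{colors.next}"><td>#{project}</td></tr>)
end
```

Because cycle never runs out, colors.next can be called for any number of rows without raising StopIteration.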
00:10:01.460 Now, let’s look at what I've learned from JavaScript and Python. Both languages have a concept called Generators, which are a type of Iterator.
00:10:15.060 Generators are objects with a next method, but they typically don't represent collections; they can produce data on-the-fly. They are used in various applications: list comprehensions, fetching asynchronous data, infinite sequences, and concurrency.
00:10:36.570 Generators are a relatively new addition to JavaScript and are still a topic of discussion. In the JavaScript community, there is some skepticism about their usefulness.
00:10:47.200 For example, the Airbnb JavaScript Style Guide discourages using Generators, claiming they are not yet ready.
00:11:00.260 In contrast, Python has embraced Generators for a long time. Pythonistas love them; they are expressive, easy to maintain, and very readable.
00:11:17.180 In Python, you can use the yield keyword to define a Generator. When you call the generator function, it returns a Generator object.
00:11:30.260 You can then call next on it to get the yielded items from that function. Let's imagine getting the first n numbers of the Fibonacci sequence.
00:11:44.290 In Python, the imperative way is to build up an array, using a for loop to calculate those first n numbers and append them to the array.
00:12:05.630 However, we can replace the array with a yield statement, allowing the function to act like a Generator.
00:12:18.830 This way, we can iterate over the output using a for loop or call the next method to get the terms one by one.
00:12:34.330 You can think of a Generator as a function that can suspend its execution and pass control back to the caller.
00:12:48.410 Now, can we write Generators in Ruby? Yes, we can. Let’s take the Python Fibonacci Generator and implement it in Ruby.
00:13:04.700 We can replace the loop with something more Ruby-like and yield to a caller. However, this is not quite a Generator.
00:13:18.360 We can pass a block to the Fibonacci method in Ruby and ask for the items. But I can't treat this like a collection. What I want is to represent it as an Enumerable.
00:13:33.250 We're going to coin a term here: we'll 'enumeratorize' this method. By adding a return to_enum line that passes along the method name and its arguments, the method returns an Enumerator whenever it's called without a block.
00:13:55.410 This is usually the point where Rubyists look at it and say, 'OMG, I don’t get this. It looks ugly.' I can understand that.
00:14:05.820 However, let's see what this gets us. When we implement this in Ruby, we added the ability to treat this method like a Generator.
00:14:20.850 We can now invoke the method without passing a block, and it gives us back an Enumerator that behaves like an Enumerable.
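A minimal sketch of such a method, assuming a fibonacci method that takes the number of terms (the exact code on the slides is not in this transcript):

```ruby
# Yields the first n Fibonacci numbers to the block.
def fibonacci(n)
  # Called without a block: return an Enumerator for this same method call.
  return to_enum(:fibonacci, n) unless block_given?

  a, b = 0, 1
  n.times do
    yield a
    a, b = b, a + b
  end
end

# With a block, it behaves like an ordinary iterator method.
collected = []
fibonacci(5) { |x| collected << x }

# Without a block, it behaves like a Generator and like an Enumerable.
generator  = fibonacci(3)
first_term = generator.next
evens      = fibonacci(10).select(&:even?)
```

The single to_enum line is the whole cost of making the method usable both ways.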
00:14:33.480 So you could say that Enumerator combines the functionality of a Generator and Enumerable in Ruby.
00:14:48.720 Let’s take a closer look at the to_enum method in Ruby. It’s a source of a lot of confusion and can be implemented in various ways.
00:15:04.540 We can call to_enum on almost any object, passing arguments representing the names of methods the object responds to.
00:15:19.840 Arrays respond to methods like each, map, select, group_by, and cycle, all returning Enumerators. In fact, there’s shorthand for most of these methods.
00:15:37.620 You can call these methods without passing a block and get back an Enumerator. This exists across the standard library and even allows iterating over all objects in the object space.
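For instance, with a small example array, each of the calls below returns an Enumerator because no block is given:

```ruby
numbers = [1, 2, 3]

# No block given, so each call returns an Enumerator.
each_enum   = numbers.each
map_enum    = numbers.map
select_enum = numbers.select
cycle_enum  = numbers.cycle

# Enumerators chain, for example to map with a 1-based index.
labeled = numbers.map.with_index(1) { |n, i| "#{i}: #{n}" }

# The same pattern appears elsewhere in core, such as ObjectSpace.
classes = ObjectSpace.each_object(Class)
```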
00:15:56.960 You can call to_enum on any Ruby object, but it’s important to understand that it's not magic.
00:16:08.680 To re-implement to_enum in Ruby, we can create an Enumerator with the initialize method, which has specific syntax.
00:16:23.480 This allows us to allocate memory for an Enumerator with that object, method, and arguments.
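One way to approximate to_enum in plain Ruby is sketched below. It uses the block form of Enumerator.new, which may differ from what the slides showed, and the names MyToEnum and my_to_enum are hypothetical:

```ruby
# Hypothetical stand-in for Object#to_enum, for illustration only.
module MyToEnum
  def my_to_enum(method_name = :each, *args)
    receiver = self
    Enumerator.new do |yielder|
      # Call the named method and forward everything it yields.
      receiver.public_send(method_name, *args) do |*values|
        yielder.yield(*values)
      end
    end
  end
end

list = [1, 2, 3].extend(MyToEnum)

enum       = list.my_to_enum # defaults to :each
first_item = enum.next
slices     = list.my_to_enum(:each_slice, 2).to_a
```

The Enumerator only needs to remember three things: the object, the method name, and the arguments.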
00:16:35.260 So to recap, an Enumerator wraps a collection, an object, and some Enumerable behavior.
00:16:46.520 We can go a step deeper and re-implement the behavior of Enumerator in Ruby using Fibers, which are also a Ruby concept.
00:17:10.040 Fibers were introduced in Ruby 1.9, and although you may have seen blog posts about them, not many programmers use Fibers in their daily code.
00:17:30.390 However, they can be really useful for many reasons. You can create a Fiber by giving it a block, which allows you to resume and yield when needed.
00:17:46.060 Does this sound familiar? Earlier, we saw how yield in a Generator acts like multiple returns, handing back one value each time next is called. Fibers behave similarly, allowing you to suspend and resume execution.
00:18:05.650 So you can think of a Fiber as a block that can suspend its execution and pass control back to the caller.
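A minimal sketch of suspending and resuming a Fiber, with placeholder symbols as the values:

```ruby
fiber = Fiber.new do
  Fiber.yield :first  # suspend, handing :first to the caller
  Fiber.yield :second # execution continues from here on the second resume
  :done               # the block's final value is returned by the last resume
end

a = fiber.resume
b = fiber.resume
c = fiber.resume
```

Each call to resume runs the block until the next Fiber.yield, which is the same hand-off that next performs on an Enumerator.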
00:18:22.290 Using your imagination, you can implement the next behavior of Enumerator with Fibers, retrieving data one item at a time.
00:18:38.010 To re-implement the next method on a custom Enumerator, we simply use the Fiber to regain control for the caller.
00:18:55.800 By including the Enumerable module and implementing an each method, we can loop over the next method, making it behave like an Enumerable.
00:19:09.140 If each loops forever over next, we need to ensure that we raise a StopIteration exception when the items run out. Ruby's loop rescues StopIteration, so the iteration ends gracefully.
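Putting these pieces together, a simplified Fiber-based enumerator might look like the sketch below. The class name FiberEnumerator and its interface are assumptions for illustration; Ruby's real Enumerator is implemented in C and handles more cases:

```ruby
# A simplified, hypothetical re-implementation; not Ruby's actual Enumerator.
class FiberEnumerator
  include Enumerable

  # The block receives a callable and invokes it once per item.
  def initialize(&generator)
    @generator = generator
    rewind
  end

  # Start over with a fresh Fiber.
  def rewind
    @fiber = Fiber.new do
      @generator.call(->(item) { Fiber.yield(item) })
      # The generator has finished: signal the end of iteration.
      raise StopIteration, "iteration reached an end"
    end
    self
  end

  # Resume the Fiber until it hands back the next item.
  def next
    @fiber.resume
  end

  # Kernel#loop rescues StopIteration, so the loop ends quietly.
  # Note that each restarts and uses up the shared Fiber.
  def each
    return to_enum(:each) unless block_given?

    rewind
    loop { yield self.next }
    self
  end
end

enum = FiberEnumerator.new do |emit|
  emit.call(1)
  emit.call(2)
  emit.call(3)
end

first   = enum.next
second  = enum.next
doubled = enum.map { |x| x * 2 }
```

Defining each is what unlocks map, select, and the rest of Enumerable.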
00:19:29.280 Should we enumeratorize by convention? It turns out there are many great use cases for this in day-to-day work.
00:19:48.120 Consider that many of us have written an API client for a paginated remote resource. Imagine having a posts method that asks for a certain page.
00:20:05.400 You can implement a method that knows how to ask for the first page and yield those items back while recursively calling posts with the next page if needed.
00:20:30.960 Using the enumeratorize syntax, you can treat all items like an Enumerable, fetched on demand without worrying about the pages.
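A sketch of that pagination pattern follows. PostsClient, fetch_page, and the in-memory PAGES hash are all hypothetical stand-ins for a real HTTP client and API:

```ruby
# Hypothetical client; PAGES stands in for a remote, paginated API.
class PostsClient
  PAGES = {
    1 => %w[post-1 post-2],
    2 => %w[post-3 post-4],
    3 => %w[post-5]
  }.freeze

  attr_reader :requests

  def initialize
    @requests = 0
  end

  # Pretend HTTP call: returns a page's items and the next page number.
  def fetch_page(page)
    @requests += 1
    next_page = PAGES.key?(page + 1) ? page + 1 : nil
    { items: PAGES.fetch(page), next_page: next_page }
  end

  # Yields every post, fetching further pages only when needed.
  def posts(page = 1, &block)
    return to_enum(:posts, page) unless block_given?

    response = fetch_page(page)
    response[:items].each(&block)
    posts(response[:next_page], &block) if response[:next_page]
  end
end

client      = PostsClient.new
first_three = client.posts.first(3)
pages_hit   = client.requests
```

Asking for three posts touches only the first two pages; the third page is never requested.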
00:20:47.800 Imagine also a document class; we might want to iterate over each line, each word, or each paragraph, treating these as enumerators.
00:21:06.960 This separation allows for more expressive ways to fetch slices of your object without needing to include the Enumerable module.
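A sketch of such a class, where the Document name comes from the talk but the method bodies and splitting rules are assumptions:

```ruby
# Illustrative only; the splitting rules are simplistic assumptions.
class Document
  def initialize(text)
    @text = text
  end

  def each_paragraph
    return to_enum(:each_paragraph) unless block_given?

    @text.split(/\n{2,}/).each { |paragraph| yield paragraph }
  end

  def each_line
    return to_enum(:each_line) unless block_given?

    @text.each_line { |line| yield line.chomp }
  end

  def each_word
    return to_enum(:each_word) unless block_given?

    @text.scan(/\S+/) { |word| yield word }
  end
end

doc = Document.new("Hello world\nsecond line\n\nNew paragraph")

word_count = doc.each_word.count
shouted    = doc.each_word.map(&:upcase).first(2)
paragraphs = doc.each_paragraph.to_a
```

Document itself never includes Enumerable, yet every slice of it supports the full Enumerable API.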
00:21:28.320 Consider a binary tree; traversal can occur in different orders: in-order, pre-order, post-order, breadth-first.
00:21:53.210 We don’t need to include the Enumerable module; instead, we create an enumerator method for each traversal.
00:22:12.380 This way, we expressively ask for a specific type of enumerator, making clear how we want to traverse the tree.
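A sketch covering the three depth-first orders, with a hypothetical Node class and a small hand-built tree:

```ruby
# Hypothetical binary tree node; each traversal is its own enumerator method.
class Node
  attr_reader :value, :left, :right

  def initialize(value, left = nil, right = nil)
    @value = value
    @left = left
    @right = right
  end

  def in_order(&block)
    return to_enum(:in_order) unless block_given?

    left&.in_order(&block)
    yield value
    right&.in_order(&block)
  end

  def pre_order(&block)
    return to_enum(:pre_order) unless block_given?

    yield value
    left&.pre_order(&block)
    right&.pre_order(&block)
  end

  def post_order(&block)
    return to_enum(:post_order) unless block_given?

    left&.post_order(&block)
    right&.post_order(&block)
    yield value
  end
end

#       4
#      / \
#     2   6
#    / \
#   1   3
tree = Node.new(4, Node.new(2, Node.new(1), Node.new(3)), Node.new(6))

sorted   = tree.in_order.to_a
root_1st = tree.pre_order.to_a
root_end = tree.post_order.to_a
```

The method name at the call site states the traversal order, which a single each method could not do.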
00:22:27.470 In my opinion, while Enumerator might be considered ugly, it’s very effective and gives us a lot of expressiveness.
00:22:40.890 I want to discuss functional languages like Clojure and Elixir. There are concepts we can apply from these languages to Enumerators.
00:22:55.890 One of these is infinite sequences. I’ll show you similar examples from both languages, filtering for odd numbers and selecting the first two.
00:23:12.490 We can replace the list of numbers with a function that will return items infinitely; however, we need to tell it when to stop.
00:23:31.440 This works similarly across both languages, allowing for flexible applications of infinite sequences.
00:23:46.870 In contrast, Ruby doesn’t have infinite sequences... Or does it? We can convert our Fibonacci method into an infinite sequence.
00:24:06.210 We need to remove the limiting parameter and replace the bounded iteration with an endless loop that yields values continuously.
00:24:23.020 This yields an Enumerator that will allow us to define how to return values infinitely, taking advantage of the lazy functionality.
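A minimal sketch of the infinite version, filtering for odd numbers and taking the first two as in the Clojure and Elixir examples. The method name infinite_fibonacci is an assumption:

```ruby
# Yields Fibonacci numbers forever; the caller decides when to stop.
def infinite_fibonacci
  return to_enum(:infinite_fibonacci) unless block_given?

  a, b = 0, 1
  loop do
    yield a
    a, b = b, a + b
  end
end

# lazy makes select evaluate on demand, so the chain can terminate.
first_two_odd = infinite_fibonacci.lazy.select(&:odd?).first(2)

# take stops after five items even without lazy.
first_five = infinite_fibonacci.take(5)
```

Without lazy, select would try to consume the entire infinite sequence and never return.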
00:24:39.880 Furthermore, enumerators allow augmenting how data flows through an Enumerable method chain, making them incredibly useful.
00:24:56.940 So let’s discuss a failed project where I modeled recurrence and recurring events. Back in 2008, I worked for a company called Replay.
00:25:14.230 Replay was a youth sports social network where coaches scheduled games and practices, and the need arose for recurring events.
00:25:31.210 Imagine being able to create a schedule for recurring events that could evolve over time; however, our early attempts were met with complications.
00:25:50.230 During those initial development days, enumeration was poorly understood, and we didn’t see how we could model such complex behaviors.
00:26:11.970 Eventually, I had a chance to redeem myself in a present-day project and create a library for modeling recurrence that could support infinite sequences.
00:26:29.830 By applying concepts of Enumerator, I was able to generate such recurrences on-the-fly while having them be infinite.
00:26:47.020 I wrote a gem called Montrose, which models recurrence with an Enumerator, and knows how to find the next event given a recurring description.
00:27:05.470 Montrose efficiently yields timestamps and performs calculations flexibly using the benefits of Enumerator.
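The core idea can be sketched in plain Ruby. This is not Montrose's API; recurring_dates is a hypothetical helper and the start date is arbitrary:

```ruby
require "date"

# Hypothetical helper, not Montrose: yields a date every interval_days, forever.
def recurring_dates(start, interval_days)
  return to_enum(:recurring_dates, start, interval_days) unless block_given?

  date = start
  loop do
    yield date
    date += interval_days
  end
end

weekly = recurring_dates(Date.new(2016, 4, 7), 7)

next_three = weekly.first(3)
in_may     = weekly.lazy.select { |d| d.month == 5 }.first(2)
```

The schedule is never stored as a list; each occurrence is computed only when something asks for it.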
00:27:23.330 Through this, I learned that Enumerator is a beautiful concept combining useful ideas from various languages, often overlooked when learning Ruby.
00:27:41.220 This talk wasn't just about Enumerator; it was also about my personal learning journey. I’ve had to step out of my comfort zone and explore.
00:27:59.160 At first, I was excited about Ruby, but then I encountered frustrations, leading me to complain about issues rather than celebrating the beauty of the language.
00:28:17.330 Now, I’m striving to maintain optimism as I understand why these concepts exist.
00:28:37.080 I encourage everyone to approach new ideas that challenge familiar conventions with curiosity. Instead of dismissing them, ask what you can learn.
00:28:56.400 Sometimes, ugly code can teach us valuable lessons. So, be curious and explore.
00:29:13.680 You can find the slides for this talk at rossta.net. I’m rossta on Twitter and GitHub. Thank you very much.
00:29:31.740 Thank you.