Data Structures

Summarized using AI

The Audacious Array

Ariel Caplan • November 08, 2021 • Denver, CO

The video titled "The Audacious Array" presented by Ariel Caplan at RubyConf 2021 explores the versatility and power of arrays in Ruby programming. The speaker begins by introducing arrays as more than just simple indexed lists of values, highlighting their potential as various data structures such as randomizers, 2D maps, stacks, queues, and sets. Caplan emphasizes that a deep understanding of arrays is essential for writing elegant and effective Ruby code.

Key Points Discussed:
- Basics of Arrays: An array is defined as an ordered collection of values, capable of holding any Ruby object, including strings, symbols, and even Active Record objects.
- Manipulating Arrays: Techniques for modifying arrays include adding (using push, unshift) and removing elements (pop, shift). Caplan illustrates using integers for simplicity and visual clarity.
- Stacks and Queues: The video describes how arrays can be used to implement stacks (last in, first out) and queues (first in, first out) using simple push/pop and push/shift methods.
- Randomization and Sorting: Caplan elaborates on generating randomness with methods like shuffle and sample, along with sorting arrays using either built-in methods or custom criteria with blocks.
- Enumerable Module: Arrays in Ruby include the Enumerable module, offering diverse methods for iteration, selection, and aggregation, such as each, select, map, and inject or reduce.
- 2D Arrays as Maps: Caplan discusses the use of two-dimensional arrays to represent maps, explaining how to transpose, flip, and rotate these arrays for various applications.
- Searching and Autocomplete: The video covers basic searching techniques including linear and binary search, highlighting the usefulness of efficient searching in large datasets.
- Set-like Behavior: Caplan concludes by explaining how arrays can mimic the behavior of mathematical sets, using methods like uniq, and set operations such as union and intersection.

Throughout the talk, Caplan uses a game she developed called "Catwalk" to demonstrate practical applications of these array functionalities, showcasing how different methods are utilized effectively to build game features and ensure performance.

In conclusion, the video underscores that mastering arrays is fundamental to becoming a proficient Ruby programmer, as they provide essential tools for data handling and manipulation.

The Audacious Array
Ariel Caplan • November 08, 2021 • Denver, CO

What could be simpler than the array? It's just a list of indexed values, right?

Well, not so fast. Ruby reimagines what an array can be, turning it into a randomizer, a 2D map, a stack, a queue, and a mathematical set. It also provides built-in utilities for unparalleled analysis and search of its contents.

Understanding arrays deeply is a powerful step forward in writing concise, beautiful, semantic Ruby.

RubyConf 2021

00:00:10.880 Hi everybody, thanks so much for coming. My name is Ariel Caplan, and I'm here today to open your eyes to some powerful secrets of Ruby arrays. Now, when we program, many of the problems we solve fall into the same few categories.
00:00:21.920 We often need to create constructs like randomizers, a 2D representation of data like a map, some kind of search mechanism, or data structures like stacks and queues. We might need to take some data and use it to create a report, and sometimes we even need to perform operations that are similar to those we do on a mathematical set. I want to thank my six-year-old, Eminem, for finding that graphic of a set; she actually chose that.
00:00:46.640 It turns out that to do all these things and a whole lot more, all we need is a pair of square brackets—one of Ruby's audacious arrays. I’ll tell you about that in a moment, but first I want to talk to you about Catwalk.
00:01:12.320 What is Catwalk, you ask? Well, it is a game. It's not a game you've ever heard of because I made it for this talk. It's just a regular Ruby file you run by using ruby catwalk.rb, and you're a cat walking through a complicated world, gathering various items to increase your score.
00:01:30.400 You can see the world is circular, so it kind of loops back at the end. If you hit a turn block, the world will turn by 90 degrees, so you're basically turning left. There are various items, like the bird head—two kinds of birds—that act as score multipliers. They don’t provide scoring items, but they will increase your score if you do grab a scoring item. There are also cyclones, which mix up the whole world to give you a new layout.
00:01:56.239 Moreover, there are sneakers that will make the game go faster, kind of like a running thing. But you might get bored being a cat after a while, so if you type in 'tiger', you become a tiger. You can also be a frog. There are cheat codes; you could be a whale, which makes a ton of sense running through an empty field, or switch back to a cat. When you hit 'q', you will exit the game and receive a tally of all the items you collected and, of course, your final score.
00:02:39.840 That’s the game—it's on GitHub. Don't look it up right now; you can check it out after the talk. It’s under my GitHub account, am kaplan, catwalk. You're going to see how I've used a lot of the tricks you'll see in this talk to implement the game. Speaking of my GitHub and me, hi again, I’m Ariel Caplan. I’ve been a Ruby programmer since 2014.
00:03:01.360 The slide says that I'm a backend developer at Cloudinary, but as of the time I made that slide, that's no longer true. They were my employer during the entire time that I put together this talk, and I have a lot of gratitude toward them for giving me the opportunity to put this together, which is why they're on all the slides. You can find me online at am kaplan and my personal site, am kaplan.ninja, where at some point, the slides for this talk will be posted.
00:03:30.159 A bit more about my new employer in a moment, but you're not really here to hear about me, right? You're here to learn about arrays, so let's talk about the basics. What is an array? At its most fundamental level, an array is an ordered collection of values.
00:03:48.480 Those values can be any Ruby object, meaning you can have an array that combines different types such as strings, symbols, numeric types, and even ActiveRecord objects. The sky's the limit! It's very challenging for those working with types, as you may have heard in some other talks. You can get and set elements using an index—note that indexes are zero-based, so the first element is index zero, the second is index one, and so on.
00:04:14.640 Most of the time, when creating an array directly, you would use the literal syntax with a pair of square brackets containing a number of items delimited by commas. Note that there's technically an explicit syntax where you call array.new, providing a length and the fill type. Although that technically works, it’s better to avoid it as no one will understand what you did.
00:04:37.680 You can also set elements by index. For example, using the notation square bracket equals, you can say that the element at index two should be 'val2'. When you fetch an element, many people don't realize that you can fetch it in a more strict way. If you request the element at index three and the array is not long enough, it will raise an index error, which is useful to know if you accidentally reach outside of the array and get nil.
00:05:07.840 However, most of the time in programs, you will use square bracket syntax, where you pass an index to get the item from that position. If the index doesn’t exist, it will simply return nil. You can also pass a range to get everything from index one to index two. In this case, I would have done more, but I just wanted to fit it on the slide. You can also use negative numbers for indices. Instead of counting from the beginning of the array, you can count from the end—negative one being the last element, negative two the second to last, and so on.
00:05:47.680 Additionally, you can get descriptive information about the array, such as its size and whether it is empty. Note that an array with 10 nils in it is not considered empty; an empty array has zero elements, so in essence, you're checking if the array size equals zero. That was pretty basic, and I imagine a lot of you already know that.
00:06:06.479 Now let's move up a notch and talk about adding and removing elements. In Ruby, you can add and remove elements from the front and back of an array. Let's say we have an array with five integers: one, two, three, four, and five. You'll note that I use integers frequently for simplicity.
00:06:18.560 If you want to add items, we can add to the beginning or the end. We can use the push method or the shovel operator, which is a double less-than sign. They serve the same purpose—adding an element to the end of the array. So when we push six or shovel on seven, we see the array grows in length. We can also use unshift to add to the beginning, and the new index zero becomes the new first position, effectively bumping everything in the array up by one index.
00:06:58.400 To remove items, we use the pop method to take an item off the end, which reduces the length of the array by one. We can also use shift to remove an element from the beginning, which means our initial index zero will be removed. This information is enough to start building data structures like stacks and queues. A stack is a last-in, first-out data structure, where you add items and then remove them in the opposite order of their addition.
00:07:19.919 In this case, we can show a visual example by creating a stack where we push on the symbol 'a', followed by 'b', and finally 'c'. When we pop off elements, we simply remove them in reverse order: c, then b, and lastly a. For a queue, we would most likely use push and shift. Thus, we create a new queue by pushing on a, b, and c, and then removing elements in the same sequential manner. Any queuing system that you encounter will behave similarly.
00:07:52.960 Now, you might be thinking that this is a bit arbitrary—adding to the end and taking from the beginning. Couldn't we add items to the beginning and then remove them from the end? In a certain sense, that's more intuitive, as whatever's at the end is always getting removed first. That’s a good question with a complex answer, combining yes, no, and maybe. An article titled Ruby's Missing Data Structure by Pat Shaughnessy explains this.
00:08:37.360 As you may know, Pat wrote a popular book called Ruby Under the Microscope about the internals of Ruby. In this article, he describes how an array is represented inside Ruby using a contiguous set of memory addresses. A pointer at the start fills up elements at these addresses. Now if we want to have a queue as described, we can push elements to the end and remove them from the beginning by sliding that pointer over.
00:09:05.440 However, if we attempt to add an element to the beginning, we run into a problem. There's no reserved space for that new item. Thus, we have to find a new area of memory that has available space, move everything over (which is expensive), and finally add that new element. So practically speaking, the difference between adding to the beginning and the end is an implementation detail.
00:09:30.720 I don't believe in giving performance advice without providing evidence. After all, some conventions in the Ruby community suggest avoiding unshift due to expected poor performance. However, I ran benchmarks that demonstrated the performance of adding to the beginning versus the end was essentially the same. When I first encountered this, I was surprised because the conventional wisdom in the Ruby community suggests not to use unshift due to anticipated performance issues, yet my results showed no significant difference.
00:10:09.680 A little story illustrates this. At RailsConf this year, I met someone from Shopify, and we discussed how they have a great group of people knowledgeable in Ruby. When I was preparing for my move to Shopify, I reached out to them for help, particularly Peter and Kevin, who helped me understand the underlying principles behind this behavior within Ruby arrays. This led to an important realization: a feature in the Ruby language tracker called arrays queue—a solved problem, introduced in Ruby 2 back in 2013.
00:10:55.200 What makes me sad is that many in the community still read the article titled Ruby's Missing Data Structure, which popularized the idea that unshift should not be used, published just a couple months after Ruby 2's release. If you ever want to impress some very senior developers who may not be aware of this, you can say that you use unshift, and when they mention its bad performance, you could present the benchmarks from this talk.
00:11:25.440 Now, since I work at Shopify, I find this Shopify sticker sheet pretty cool. While we go through this next part of the talk, I'm going to start adding them to my outfit. So now we're up to the next section dealing with order and randomness. In Ruby, we can create randomness using arrays. If we have an array, we can shuffle it up to get a new order, and as many times as we do it, it will use a standard computer science algorithm called Fisher-Yates shuffle to perform the task.
00:12:01.000 We can also pull a random element from the array by using array.sample, and if we want a few items, we can call sample with an argument. A word of warning: sample with an argument is not the same as calling sample multiple times; calling sample three times could yield the same result repeatedly, while sample with an argument will always return different elements.
00:12:33.520 As an illustration, let's say we have an array with five elements. If we call sample with twenty, we'll actually receive just a five-element array back. So there are pros and cons to each option—a note to keep in mind. If you want to tidy up your array after creating a mess with your shuffle, you can simply call sort to put it back in order; I believe it uses quicksort under the hood.
00:13:11.040 In case you want to sort by something other than direct comparison, you can pass a block to the sort method to sort accordingly. Let's say you have a set of ActiveRecord objects you want to sort by their IDs, timestamps, or even product prices; using sort by is your answer.
00:13:36.080 To generate combinations, we can take an array here—let’s use a three-element array—and produce all combinations of two elements. There’s also a permutations method to consider, and it’s important to understand the difference. Permutations care about the order; for example, [1,2] and [2,1] are distinct permutations, whereas combinations do not care about order, so we consider these the same. A helpful way to remember this is that a combination lock is effectively a permutation lock.
00:14:11.679 As we delve into more practical examples, let's say you run an online store and wish to display related items, such as 'customers who purchased this also bought…'. For one particular item in your store, you can sample three related items, and because of sample with an argument, you’ll always receive different items—no duplicates. Avoid using sample three times, or you might end up with three identical pillows instead of a varied selection.
00:14:47.680 If you're looking for a more scientific approach, you might want to examine all different ways of combining items to see which arrangements perform better. Here, you can create an enumerator and call permutation followed by cycle to get an infinite enumerator that will give you a new permutation each time you invoke next. If you have thousands of simultaneous requests to your server, over time, each product will get equal airtime.
00:15:15.680 Next, let’s tackle interpretation, and before that, put on your thinking cap for a dense portion of the talk. Arrays include the enumerable module. If you’re not familiar, it provides tools for iteration, introspection, selection, and extrapolation, but don’t worry, we’ll go through each one step by step. The most basic aspect is iteration, which you probably learned first in Ruby with the 'each' method, allowing you to go through every element in your array.
00:16:01.760 Other than that, we have more complex methods such as each_slice or each_cons that yield values in batches. Next, let’s talk about introspection, which means asking an array questions about itself—such as whether all or any of your elements are odd. Methods like all?, any?, some?, and none? are very useful here.
00:16:44.000 Selection focuses only on elements of interest. To achieve this, you can use the select method if you want to pull matching elements. If you only need one match, use the find method to fetch the first match. Thus, with an array of values, if four is the first applicable answer, four is the return value.
00:17:04.640 Extrapolation is where things get intense. The map method allows us to define transformations so that you can apply a transformation to each element and receive a new array as a result. In the case of multiplying by 2, each value modifies to create a new list. Other times, you want to reduce the array down to a single value, and for this, you'll use the inject method, also known as reduce, starting with an initial value that will turn into what is called the accumulator..
00:17:56.959 It permits you to sum values or identify the minimum and maximum amounts across the array, collapsing it to a singular outcome. Finding the minimum, maximum, or both happens in one pass; this isn't limited to integers but applies to any comparable item, like strings.
00:18:25.840 A solid example comes from shopping carts commonly found in online retailers. Each item in your cart could be represented by a hash, with its name and price, yielding the cart itself as an array of arrays. If you wanted to visualize which item you had the most of, you might use the max_by method to yield the item with the highest count, or max_by product price to display the priciest item, while sum gives you the total amount configured by multiplying item prices with their quantities.
00:19:04.160 Now let's explore arrays as maps using an array of arrays filled with integers. You might want to rotate, move, or flip a map, so flipping can be simply done by reversing the array itself. A defined vertical flip can also be achieved by reversing individual sub-arrays.
00:19:39.119 Transpose serves to exchange rows and columns, where the one at the top left remains firm while all others rotate around. By utilizing a combination of transpose and reverse, you can achieve a 90-degree rotation in any direction. This rotation becomes incredibly handy, as demonstrated in the context of solving jigsaw puzzles; leveraging code that utilizes transpose and reverse lets solvers see pieces fit into their expected orientations.
00:20:21.440 Now, let’s examine search methods. Autocomplete is a frequent operation, and while Ruby provides a simple solution, it may not be ideal for every case. If you have an array of cat breeds and a user input of 'pe', regex helps us construct a literal search query without vulnerability to attacks. By using the grep method, you can find all strings that match that pattern, and should you wish to filter based upon specific criteria, adjusting regex can lead to flexibility.
00:20:56.160 Find commands can assist in pinpointing items based on conditions, and utilizing bsearch serves as a binary search method for efficiency. For example, if you had a sorted array and wanted to find the first order based on a specified condition, bsearch effectively reduces unnecessary cycles. Importantly, to utilize bsearch correctly, the array must be structured accurately and sorted according to the search criteria of your chosen block.
00:21:47.679 Lastly, to achieve set-like behavior, arrays can uniquely represent values through methods such as unique and subtraction. These methods eliminate redundant entries, returning an array that closely adheres to mathematical set standards. Moreover, the union and intersection of values between arrays yield distinctive results, enabling you to analyze overlaps or shared items effectively.
00:22:21.920 This provides the foundation for an array's sharp functionality, making it incredibly useful with practical examples found in platforms like ActiveRecord, which leverage these features with ease to compare and update post content across various system elements, keeping them in sync based on unique identifiers and user-generated content.
00:23:06.959 To recap, we've navigated through arrays as indexed collections of elements, learned about randomization and sorting, methodical iteration, plus usage of two-dimensional arrays as maps. We've also explored rudimentary search capabilities and the extent of set-like functionalities responsible for efficient data implementations in Ruby.
00:23:55.440 If you care to choose your own adventure now, while remaining mindful of time, I invite those needing to leave for the keynote to do so without any hard feelings. For those who wish to stick around a little longer, I will talk about how I've applied the concepts presented today into my Catwalk game.
00:24:40.880 To generate the map, I've created an array of map items, indicating how often something appears. The instances within this array serve as probability indicators for any random location generated by sample. As a result, when characters hit a cyclone in the game, I use flatten, shuffle, and each_slice methods to introduce chaos while maintaining references to the original item placements.
00:25:00.960 Movement within the game loop allows for player interactions, where hitting turn blocks rotates the playing field by 90 degrees and changes the entire map’s structure. Additionally, the control mechanics permit quick navigation of the map through parameters to send items to their respective final destinations.
00:25:23.679 Keeping track of active items was an intriguing aspect of the game as well. As movement occurs, various temporary items need to be accounted for across long gameplay scenarios. As a convenient solution, I utilized a binary search index on timestamps to identify the most recent active item, subsequently filtering through the remaining pool of data to match items that fit active conditions accurately.
00:26:06.559 Miscellaneous elements, like multiplicative score modifiers, leverage inject to yield cumulative effects, while cheat codes utilize a queue-based logging system to recognize changes in avatar states triggered through user interactions. The game ensures engagement through dynamic interaction, significantly enhancing player experience.
00:26:41.760 All compiled code can be explored on my GitHub under am kaplan catwalk, showcasing the Ruby methods and teachings implemented throughout the game. I also want to finish this talk with a thought on fostering appreciation by practicing gratitude within your teams, challenging you to find ways to spread positivity in your workplace.
00:27:09.760 Your homework is to ponder how to incorporate these methods into your projects effectively. And with that, I’d like to initiate my own appreciation circle, acknowledging you for being an amazing audience, while inviting applause for all your contributions today.
Explore all talks recorded at RubyConf 2021
+95