00:00:12.840
I don't usually wear Godric Gryffindor's enchanted sorting hat, and I don't typically give personal introductions for my talks. The idea is that a bio is meant to get you into the room, and since all of you are already here, I don't really need one. Still, a quick hello: my name is Caleb, and this morning I had 970 followers on Twitter. I'd love to hit 1,000, which means I need just 30 more people to follow me. I think we can make this happen, so if you like what you hear today, please take out your phones and follow me, Caleb Thompson. Also, on this slide, we have Joanne Chang sharing some behind-the-scenes information about what it takes to get up on stage: mostly, it's about not preparing for your talk.
00:00:37.950
All right, that's out of the way. I want to clarify that this talk is not aimed at people who have studied computer science. If you know what Big O notation is, that's great! Sit back and hopefully still enjoy the presentation. However, I might not be teaching you anything new. This talk is for those who call the sort method in Ruby and have no idea what algorithms are doing behind the scenes to sort an array for you but would like to learn a little bit more. So, let's dive in to understand what Ruby or Postgres does when you call sort. We need to talk about algorithms. I know algorithms can sound like a scary term, but at its core, an algorithm is just a set of step-by-step instructions.
00:01:16.049
For example, I have a recipe for chocolate chip cookies, which is essentially an algorithm. I didn't need to understand chemistry or gastronomy when I followed the instructions from the recipe. I just had to find a recipe for chocolate chip cookies and make a batch, which I did—procrastinating on preparing for this talk—and I ended up eating them all! Before we get too far into the details, I also want to address Big O notation, which can be another intimidating concept, especially when you see it next to an algorithm. I typically assume it's for math people, and since I build websites, I think it might not apply to me. But what it comes down to is that Big O notation is not really that scary. It's a way to answer the question, 'How fast is an algorithm?'
00:02:00.149
We can compare how fast algorithms are because Big O gives us a means to evaluate the number of operations an algorithm will perform as a function of the size of its input. So let's take a look at a simple example: cutting squares from paper. In a previous instance of this talk, I was told that my visual aids were a bit small and hard to see, so I bought a giant pair of scissors and a big easel along with a giant pad of sticky notes, hoping you'd be able to see better. As I start cutting paper, each cut is one operation. If I cut squares off one at a time, 16 squares take me on the order of 16 cuts: the work grows in direct proportion to the number of squares, which is linear time, or O(N).
00:03:06.940
But if I needed 32 squares, it would actually take me only 31 cuts, not 32. That small difference illustrates that Big O is a bit lossy: it drops constants and small offsets, and we mainly look at growth rates, so linear growth is simply written as N. Then I switch to another algorithm for cutting pieces of paper, one that uses far fewer cuts by cutting through several pieces at once, making it more efficient. When I compare the two, this second approach runs in logarithmic time, or O(log N), which is much faster than the linear approach I showed first.
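To make the counting concrete, here's a tiny Ruby sketch of the one-square-per-cut approach (the method name `cuts_one_at_a_time` is mine, not from the talk):

```ruby
# Hypothetical model of cutting squares off one at a time.
# The last square falls out of the final cut for free, so
# N squares need N - 1 cuts. Big O drops that constant:
# N - 1 and N both grow linearly, so we just call it O(N).
def cuts_one_at_a_time(squares)
  squares - 1
end

cuts_one_at_a_time(32) # => 31 cuts for 32 squares
```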
00:04:52.260
So, I want to make sure you understand that cutting paper this way is much more efficient: with logarithmic time, the number of required cuts barely grows as the number of squares goes up, so performance improves dramatically over my initial example. Remember, when we look at performance, we want algorithms whose operation counts grow slowly; the fast-growing rates, O(N²) or worse, are the ones to watch out for when we're designing algorithms.
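One way to picture the logarithmic version, under my assumption that the faster approach stacks the paper and cuts through the whole stack at once, is to count how cuts grow as the number of squares doubles:

```ruby
# Hypothetical model of the faster approach: stack the paper and
# cut through the whole stack, so one cut doubles every piece.
# P pieces then need only about log2(P) cuts -- O(log N).
def cuts_by_stacking(squares)
  cuts = 0
  pieces = 1
  while pieces < squares
    pieces *= 2 # one cut through the stack doubles the piece count
    cuts += 1
  end
  cuts
end

cuts_by_stacking(32) # => only 5 cuts, versus 31 one at a time
```

Doubling the input adds just one more cut, which is exactly what a logarithmic growth rate means.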
00:05:17.790
Now let's discuss bubble sort. It’s a classic algorithm for sorting, and it works well for small datasets. The method is simple: for each pair of elements, compare the two. If they're in the wrong order, swap them. You continue this process until no swaps are necessary, which indicates that the set is sorted. So, I need some volunteers. Can you please come up here to represent numbers? You'll help me demonstrate how bubble sort works in real time with some props I have ready for you.
00:06:56.840
We'll use the numbers 42, 9001, 5.1, pi, and e, which I've labeled here. As our volunteers compare the numbers, follow the algorithm: 42 is less than 9001, so nothing changes at first. However, as we continue checking pairs, we make swaps as necessary until everything is in the correct order. Once we finish sorting, we can see that bubble sort operates in O(N²) time in the worst case, since it compares elements pair by pair over repeated passes. In the best case, where the array is already sorted, a single pass finds nothing to swap, so it runs in linear time, O(N). With the demonstration complete, we can move on to other sorting algorithms, like insertion sort, which is known for its efficiency on nearly sorted datasets.
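The volunteer demo translates directly into Ruby. This is a minimal sketch of bubble sort with the "stop when no swaps happened" check that gives the O(N) best case; it's illustrative only, not how Ruby's built-in sort works:

```ruby
# Bubble sort: repeatedly compare neighbors and swap any pair that
# is out of order; stop once a full pass makes no swaps.
def bubble_sort(array)
  arr = array.dup
  loop do
    swapped = false
    (arr.length - 1).times do |i|
      if arr[i] > arr[i + 1]
        arr[i], arr[i + 1] = arr[i + 1], arr[i] # swap the pair
        swapped = true
      end
    end
    break unless swapped # a clean pass means the array is sorted
  end
  arr
end

# The demo's numbers: e and pi bubble their way to the front.
bubble_sort([42, 9001, 5.1, Math::PI, Math::E])
# => [Math::E, Math::PI, 5.1, 42, 9001]
```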
00:14:32.080
Insertion sort works along similar lines to bubble sort but is generally faster in practice. It walks through the array and compares each element with the elements to its left, shifting larger values over until it reaches one that is smaller, then drops the element into that position. This is why it efficiently sorts data that is already almost in order. Quicksort takes a different approach. Rather than repeatedly comparing neighboring elements the way bubble and insertion sort do, quicksort picks a pivot, partitions the remaining elements around it, and then sorts each side. The average complexity for quicksort is O(N log N), which is much better than the O(N²) of both bubble and insertion sort. This makes it the go-to choice for many programming languages, including Ruby.
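Here are minimal sketches of both algorithms. These are illustrative Ruby implementations, not how Ruby's C-level sort is written, and the quicksort copies arrays for clarity rather than partitioning in place:

```ruby
# Insertion sort: for each element, shift larger values to the
# right until we find one that is smaller, then insert there.
def insertion_sort(array)
  arr = array.dup
  (1...arr.length).each do |i|
    value = arr[i]
    j = i - 1
    while j >= 0 && arr[j] > value
      arr[j + 1] = arr[j] # shift the larger element right
      j -= 1
    end
    arr[j + 1] = value # drop the element into its slot
  end
  arr
end

# Quicksort: pick a pivot, split the rest into smaller and larger,
# and recursively sort each side.
def quicksort(array)
  return array if array.length <= 1
  mid = array.length / 2
  pivot = array[mid] # a middle pivot sidesteps the sorted-input worst case
  rest = array[0...mid] + array[(mid + 1)..]
  smaller, larger = rest.partition { |x| x < pivot }
  quicksort(smaller) + [pivot] + quicksort(larger)
end
```

On nearly sorted input the inner `while` loop of insertion sort exits almost immediately, which is where its reputation for speed on almost-ordered data comes from.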
00:26:42.900
In quicksort, the efficiency hinges on how we choose the pivot. Selecting a random or middle element instead of the first or last helps avoid worst cases, such as an already-sorted array, where a poor pivot degrades quicksort to O(N²). Remember, we're aiming for algorithms with slower growth rates, such as O(N log N) or better, and quicksort strikes that balance well. Now, I'd like to take a moment to thank my employer, Heroku, for sending me to RailsConf. If you have any questions, please feel free to reach out to me during office hours or directly on Twitter. Remember, I'd love for everyone to follow me, and thank you all for your attention today!