00:00:16.800
I'm going to announce Rein now. Rein is going to speak. Before I do that, I have to say that today at lunch we convened and came up with a bunch of puns. I just have a really long list of them, so I'm just going to read through the list because, just because. Okay, Rein, I don't want to rain on your parade, so I'll rain in my puns. It's raining rain, hallelujah! The weather is nice; let's bring back the rain. Did you save this talk for a rainy day? Stop my reign of terror, I'm going to turn over the reins to Rain. Rain, please make it rain. All right, everybody, welcome Rein!
00:01:14.159
Well, good afternoon everyone. Can you all hear me okay? Excellent! So, before I start, I can beat all of those puns because I've been living with this name longer than he has. Before I do that, I'd like to tell a story. I was traveling in Western Europe a few years ago and got the chance to go skydiving for the first time. The reason I'm telling you this story is actually so that I can now say, 'The rain in Spain falls mainly from a plane.'
00:01:49.520
Okay, let's talk about math. I can tell from the looks on some of your faces and the attendance at my talk that some of you don't want to talk about math. That's a problem, and the problem's not actually with any of you; the problem is with math. And not specifically with math, the problem is with the way that we teach math.
00:02:00.800
I think we can all agree that we have vague memories of high school math and maybe college math, but we all have pretty clear memories of how terrible it was. How many of you remember high school geometry? How many of you enjoyed high school geometry? How many of you, before high school geometry, thought that shapes were pretty cool? How successful was high school geometry in squashing that notion for you forever? Super good! Okay, let's talk about calculus.
00:02:27.960
Isn't how things move really cool? How you can throw a ball and catch it? That's amazing until you're taught it like this. What high school calculus manages to do is replace the innate wonder of the mathematics of motion with a mountain of indecipherable symbology that makes no sense to anyone, including the mathematicians. So that's a problem. What I want to do is present to you what I think is an example of a solution, and that is basically what my talk is today.
00:03:07.040
My talk is sort of a math sandwich where there's math in the middle, and then on the outside, the bread is me talking about how awesome math is. The reason that I think math is awesome is because what math actually is—the actual doing of math—is much more like being a field botanist in Costa Rica, going out into the jungle every day, and then one day discovering this amazing Costa Rican jungle hamster. You can observe its behavior for a while, discover some really interesting things about the way it lives in its ecosystem, and then teach it some tricks. Math is way more like that than what you're taught in school. So I want to hopefully teach you how to do a few tricks with math.
00:04:36.000
I'm going to start by demonstrating some mathematical creatures for you. This is a triangle. This is a square. All right, that's the math, everyone! Thank you for having me. You can do a thing with a triangle where you take the middle out of it by drawing another triangle inside, and then you can keep doing that for a while, and a shape begins to appear. You might notice that what we're doing is starting with a whole and removing parts of it.
00:05:04.000
What if we could do the opposite? What if we start with a part and then grow it into the whole? Let's start with a square. Now, what if we duplicate that square at the points of a triangle, like this? What if we take these three squares and duplicate those at the points of a triangle again? What if we take these and do it again? What we're building here is the same structure in two different ways. It turns out that there is a deep mathematical relationship between the two ways that we're doing this.
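Purely as an illustration of the two constructions (the coordinate scheme and function names are mine, not from the talk), here is a rough sketch of both directions:

```ruby
# "Top down": start with the whole triangle and keep removing the middle.
# A triangle is an array of three [x, y] points.
def shrink(triangle, depth)
  return [triangle] if depth.zero?
  a, b, c = triangle
  mid = ->(p, q) { [(p[0] + q[0]) / 2.0, (p[1] + q[1]) / 2.0] }
  ab, bc, ca = mid.(a, b), mid.(b, c), mid.(c, a)
  # The middle triangle (ab, bc, ca) is discarded; recurse into the corners.
  [[a, ab, ca], [ab, b, bc], [ca, bc, c]].flat_map { |t| shrink(t, depth - 1) }
end

# "Bottom up": start with one small shape and duplicate the whole picture
# at the three corners of an enclosing triangle, over and over.
def grow(points, depth, offset_x, offset_y)
  return points if depth.zero?
  copies = points.flat_map do |(x, y)|
    [[x, y], [x + offset_x, y], [x + offset_x / 2.0, y + offset_y]]
  end
  grow(copies, depth - 1, offset_x * 2, offset_y * 2)
end

shrink([[0.0, 0.0], [1.0, 0.0], [0.5, 1.0]], 3).size  #=> 27 small triangles
grow([[0.0, 0.0]], 3, 1.0, 1.0).size                  #=> 27 copies of the shape
```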
00:05:34.160
One of those ways is recursive and the other way is what's called co-recursive, but I'm not going to talk to you about that today. I just like Sierpinski triangles; they're pretty. What I'm going to talk to you about instead is a different kind of math, a mathematical creature. How many of you are familiar with reduce in Ruby? How many of you feel like you know when to use reduce?
00:05:56.640
Well, how many of you feel like reduce or inject is that thing you reach for when you don't quite want each but you also don't quite want map? That's about as far as you've gotten. I want to give you a way to think about reducing things mathematically. I want to start with summing a list of integers, then we can move on to joining a list of strings, and finally, we can concatenate a list of lists. You might notice that there's something special about the plus operator when applied to those different things.
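Concretely, these are the three reductions being described (the snippets are mine):

```ruby
[1, 2, 3, 4].reduce(:+)            #=> 10              summing integers
["a", "b", "c"].reduce(:+)         #=> "abc"           joining strings
[[1, 2], [3], [4, 5]].reduce(:+)   #=> [1, 2, 3, 4, 5] concatenating lists
```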
00:06:30.400
The first thing you might notice is that it doesn't matter how you group the pluses; you can put the parentheses anywhere you want and get the same result. Sometimes you can also switch the terms around and get the same result, but not always: a plus b is not equal to b plus a when a and b are strings. Still, commutativity does hold for a surprisingly large number of things.
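A quick illustration of the distinction (my examples):

```ruby
# Associativity: grouping doesn't matter.
(1 + 2) + 3 == 1 + (2 + 3)                #=> true
("a" + "b") + "c" == "a" + ("b" + "c")    #=> true

# Commutativity is a separate property, and strings don't have it.
1 + 2 == 2 + 1                            #=> true
"a" + "b" == "b" + "a"                    #=> false
```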
00:06:56.800
So let's talk about this idea of grouping stuff in arbitrary ways and what we can get out of that. That principle, if you remember it from high school math, is called associativity. A set with a binary operation that is associative, like plus and numbers or plus and strings or plus and lists, is what's called a semi-group. You might wonder, 'Why do I care about semi-groups?' I'll get to that later, but first, I want to introduce another interesting thing about plus.
00:07:46.160
If you've ever tried to reduce an empty list, you get nil, which is not what you want. If you've ever wondered why you get nil, the reason is that in Ruby we can't tell what type of list it is, because it's empty. If we knew what type of list it was, we might be able to provide a good default value. But what would be a good default value? If we want to sum, we might want to start with zero. If we want to concatenate strings, we might start with the empty string, because it has that same property: the empty string plus any other string gives you that string back.
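In code that looks roughly like this (my examples); Ruby's reduce accepts an optional initial value, which is exactly where the identity goes:

```ruby
[].reduce(:+)        #=> nil   Ruby can't know what "empty" should mean here
[].reduce(0, :+)     #=> 0     identity for summing integers
[].reduce("", :+)    #=> ""    identity for joining strings
[].reduce([], :+)    #=> []    identity for concatenating lists
```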
00:09:05.760
With lists, it's the same. That property is called identity. It might be defined mathematically like this, but I think it's clearer to just look at an empty circle and say the empty circle doesn't change the other circle you add it to. When you take that semi-group with the associative operator and you add identity to it, you get a mathematical creature called a monoid.
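One way you might package that up, as a minimal sketch (the Monoid struct and constant names are mine, not from the talk):

```ruby
# A monoid as a value: an identity element plus an associative operation.
Monoid = Struct.new(:identity, :op) do
  def fold(values)
    values.reduce(identity) { |acc, v| op.call(acc, v) }
  end
end

SUM    = Monoid.new(0,  ->(a, b) { a + b })
JOIN   = Monoid.new("", ->(a, b) { a + b })
CONCAT = Monoid.new([], ->(a, b) { a + b })

SUM.fold([1, 2, 3])   #=> 6
JOIN.fold([])         #=> ""   (the identity, not nil)
```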
00:09:48.320
One of the advantages of knowing what a monoid is is that if you want to take a bunch of values and reduce them all together, and those values happen to form a monoid, you're done. That's all you have to do. If they don't, then maybe you could spend some time thinking about how you might form a monoid out of them. Let's talk about both; let's actually talk about two different things that we can do with monoids.
00:10:04.320
The first is that we can use monoids to do some very interesting computations. For example, we can use monoids for big data processing. Because we can group operations however we want, we might start by combining all of the values we've seen so far and then add another one, and then do that again and add another one. What this means is that monoids let you stream: they let you fold over and accumulate a stream.
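A hedged sketch of what that streaming accumulation might look like (the class and method names are mine):

```ruby
# A running fold over a stream: keep one accumulator and fold each new
# element in as it arrives. Associativity guarantees this matches folding
# the whole collection at once, however the stream happened to be chunked.
class StreamFolder
  def initialize(identity, &op)
    @acc = identity
    @op  = op
  end

  def feed(value)
    @acc = @op.call(@acc, value)
  end

  attr_reader :acc
end

sums = StreamFolder.new(0) { |a, b| a + b }
[5, 10, 15].each { |x| sums.feed(x) }
sums.acc   #=> 30
```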
00:10:22.000
So, if you're looking to do stream processing and you want a streaming or online algorithm, and your data forms a monoid, you don't have to look any further; you just use the properties of the monoid. Another example: because you can group these any way you want, I can split the values in two, then split those halves in two, and so on. Once I get down to pairs of elements, I can join each pair in parallel.
00:10:47.040
I can have four cores, four threads; each thread joins a pair of elements, and then we do that again, and again. What I actually get, by being able to fold in parallel, is that I can fold n things together in on the order of log(n) steps instead of n steps. With maximum parallelization, the whole fold takes logarithmic time. So, that's another advantage of monoids: they give you the ability to get parallelism out of your reductions and folds.
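A rough sketch of that pairwise parallel fold using plain Ruby threads (illustrative only; a real implementation would use a worker pool and larger chunks):

```ruby
# Combine adjacent pairs in parallel, halving the list on each round.
# With enough workers, n values are folded in O(log n) rounds rather than n.
def parallel_fold(identity, values, &op)
  return identity if values.empty?
  until values.size == 1
    values = values.each_slice(2)
                   .map { |pair| Thread.new { pair.reduce(&op) } }
                   .map(&:value)
  end
  values.first
end

parallel_fold(0, (1..8).to_a) { |a, b| a + b }       #=> 36
parallel_fold("", %w[m o n o i d]) { |a, b| a + b }  #=> "monoid"
```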
00:11:18.080
Monoids form the core of MapReduce. They provide a mathematical way to think about things that you can map-reduce. You try to map a value onto a monoid, and then you use the reduction operation that you get for free because it's a monoid. Let's look at specific examples of monoids so I can give you an idea of what I'm talking about.
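In miniature, that map-then-reduce shape might look like this toy word count (my example; tally needs Ruby 2.7 or newer):

```ruby
# Map each document onto a hash of counts, then reduce with the counts
# monoid: the identity is the empty hash, the operation merges by summing.
merge_counts = ->(a, b) { a.merge(b) { |_word, x, y| x + y } }

docs = ["to be or not to be", "to do is to be"]
docs.map { |doc| doc.split.tally }    # map step
    .reduce({}, &merge_counts)        # reduce step
#=> {"to"=>4, "be"=>3, "or"=>1, "not"=>1, "do"=>1, "is"=>1}
```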
00:11:48.640
We've seen the sum monoid for integers, we've seen the string concatenation monoid, and we've seen the list concatenation monoid, but there are some surprising data structures that form monoids which you might not expect to. If you have two monoids, you can compose them together and get a monoid out. One way you can use that is to take the arithmetic mean of a stream of data, or to fold over some data in parallel and get the arithmetic mean. I realize this is kind of a trivial example, but a mean is what it is: the sum over the count.
00:12:11.040
You can use a sum monoid and a count monoid, and when you're done, you just take the sum and divide it by the count. That lets you compute an arithmetic mean incrementally and in parallel. A slightly less trivial example would be a standard deviation monoid. Here's a formula for standard deviation, where n is the number of elements, so that would be a count. The problem is that the sum on the right-hand side cannot be computed in a single pass.
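As a minimal sketch of that sum-plus-count pairing (my rendering, not the speaker's slide): pair each value with a count of one, fold the pairs component-wise, and divide only at the very end.

```ruby
# Each element becomes [value, 1]; sums add to sums, counts add to counts.
# The identity for the pair is [0.0, 0].
data = [4.0, 8.0, 6.0, 2.0]
sum, count = data.map { |x| [x, 1] }
                 .reduce([0.0, 0]) { |(s, c), (x, n)| [s + x, c + n] }
sum / count   #=> 5.0
```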
00:12:32.080
Unfortunately for us, using this equation to compute a standard deviation requires two passes, so we have to look again and see if we can find a different way to compute it that does form a monoid. If you dig through some of the research or flip through some books, you might find algorithms that work in one pass. The challenge is to find one that doesn't accumulate floating-point error. One classic issue is that you take a very large number, subtract it from another very large number, and get back a small number, which destroys your floating-point precision.
00:13:02.640
This can lead to negative standard deviations, which is impossible. So, we need to think about another way. After some searching, you can find an algorithm that works in one pass and doesn't require computations that produce floating point errors. I'll leave the implementation of this algorithm as a monoid up to all of you. Now, how many of you know what a Bloom filter is?
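Since the implementation is left as an exercise, here is one hedged sketch of the kind of one-pass, mergeable approach being alluded to (the state layout and names are mine; the update rule is the one commonly attributed to Welford, with the pairwise merge due to Chan et al., not necessarily the algorithm from the speaker's slides):

```ruby
# State: [count, mean, m2], where m2 is the running sum of squared
# deviations from the mean. Merging two states never subtracts two huge,
# nearly equal sums, which is the cancellation problem described above.
def merge_stats((n1, mean1, m2a), (n2, mean2, m2b))
  return [n2, mean2, m2b] if n1.zero?
  return [n1, mean1, m2a] if n2.zero?
  n     = n1 + n2
  delta = mean2 - mean1
  mean  = mean1 + delta * n2 / n
  m2    = m2a + m2b + delta * delta * n1 * n2 / n.to_f
  [n, mean, m2]
end

def stddev(values)
  n, _mean, m2 = values.map { |x| [1, x.to_f, 0.0] }
                       .reduce([0, 0.0, 0.0]) { |a, b| merge_stats(a, b) }
  Math.sqrt(m2 / n)   # population standard deviation
end

stddev([2, 4, 4, 4, 5, 5, 7, 9])   #=> 2.0 (up to floating-point rounding)
```

Because the merge combines any two partial states, this is exactly the shape you can fold sequentially over a stream or in parallel over chunks.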
00:13:31.440
Awesome! How many of you know whether or not a Bloom filter forms a monoid? Spoiler alert: it does. A Bloom filter is a bit vector, essentially just a list of bits, along with some hash functions that take an element and set some of the bits in the vector based on the hash results. A Bloom filter is a data structure that lets you check approximate set membership.
00:13:45.520
Sometimes you get false positives, but you have very little space usage as it's literally just a bit vector, leading to very good data locality. So, you can utilize Bloom filters for caching or for various applications where slight inaccuracies are acceptable. The question is: do they form a monoid? What do we need? We need identity, which means we need an empty Bloom filter. If a Bloom filter is a list of bits, what is an empty list of bits? Zero, right? It's a bit vector where all the bits are zero. So we have our identity.
00:14:15.760
Now we need a binary operation to combine two Bloom filters into a single Bloom filter that associates and preserves identity correctly. To combine two bit vectors, you perform a bitwise OR on them. The only requirement is that the Bloom filters must have the same bit vector length and the same hashing functions. So by instantiating an empty Bloom filter with a vector of zeros and performing a bitwise OR on two vectors, you can combine Bloom filters together.
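A toy sketch of the monoid structure just described (the BloomFilter class, sizes, and hashing scheme are my own choices for illustration, not a production implementation):

```ruby
require "digest"

# A toy Bloom filter: a bit vector (stored as an Integer) plus k hash
# functions derived from SHA-256.
class BloomFilter
  attr_reader :bits, :size, :hashes

  def initialize(size: 256, hashes: 3, bits: 0)
    @size, @hashes, @bits = size, hashes, bits
  end

  # Identity: the filter with every bit set to zero.
  def self.empty(size: 256, hashes: 3)
    new(size: size, hashes: hashes, bits: 0)
  end

  def add(item)
    self.class.new(size: size, hashes: hashes, bits: bits | mask(item))
  end

  def include?(item)
    m = mask(item)
    (bits & m) == m
  end

  # The monoid operation: bitwise OR of the two bit vectors.
  # Only valid when both filters share the same size and hash functions.
  def |(other)
    raise ArgumentError unless size == other.size && hashes == other.hashes
    self.class.new(size: size, hashes: hashes, bits: bits | other.bits)
  end

  private

  def mask(item)
    (0...hashes).reduce(0) do |m, i|
      index = Digest::SHA256.hexdigest("#{i}:#{item}").to_i(16) % size
      m | (1 << index)
    end
  end
end

a = BloomFilter.empty.add("ruby")
b = BloomFilter.empty.add("monoid")
(a | b).include?("ruby")     #=> true
(a | b).include?("haskell")  #=> false (with high probability)
```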
00:14:43.760
That means you can do everything we discussed with Bloom filters: aggregate data, stream over that data, and perform operations in parallel in a MapReduce setting. To validate that this is correct, you need to demonstrate that it forms a monoid. There are some other advanced data structures used in big data, like Count-Min Sketch. Count-Min Sketch is an approximate counting algorithm that's known to form a monoid. Likewise, HyperLogLog forms a monoid. So some surprisingly complex data structures turn out to be monoids, and once you've shown that, you get all of these benefits just from that one property.
00:15:20.000
Another way to utilize monoids is not to perform computation but to reason about computation, particularly concurrent or distributed computation. So, how can we use a monoid to reason about concurrent or distributed computation? We're going to start with a mathematical definition: an alphabet, which is a non-empty set of letters or symbols. We'll define an alphabet containing the letters a, b, c, d, and e. You can join words formed from that alphabet using concatenation, and the empty word is the empty string. The words over an alphabet, together with concatenation and the empty word, form a monoid.
00:15:48.240
Now, we're going to do one other thing. This monoid is called the free monoid for certain reasons, which are not important right now. However, it shows up everywhere because the free monoid can be transformed into any other monoid while preserving the structure. Additionally, we will say that within words from this alphabet, letters b, c, and d commute, meaning the ordering of those letters does not matter; they are independent.
00:16:16.160
By forming a word from that alphabet, we can explore the various ways we can rearrange its letters while staying equivalent up to commutativity. You can switch c and d, b and d, and b and c, and in that way build all the permutations of this word that are equal up to the commutativity we defined. You can also think of these computations in terms of a dependency graph: if we draw arrows from a to b, c, and d, and from b, c, and d to e, we get the dependency graph of the valid computations.
00:17:05.160
This equivalence means you can reason about concurrent computations without reaching for tools from graph theory and topology; instead, you can use algebraic methods, because monoids form algebras. The specific kind of monoid we're discussing is the trace monoid. The trace of a word is the set of all reorderings of its letters that are equivalent under the commutativity we defined, that is, all the ways you can reorganize the word without changing its meaning. The size of the trace, compared to the total number of permutations of that word, indicates the potential parallelism.
00:17:35.560
If the letters represent computations, then a trace is the set of valid orderings of those computations under the rules we defined. The size of the trace tells you how much parallelism you can achieve, based on how the computations can be ordered. The larger the trace, the more your computations commute with each other and the greater the parallelism you may obtain. This matches what you learn in concurrent or distributed programming: building systems out of independent computations gives you more parallelism than synchronizing dependent tasks.
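A small sketch of that idea (the representation of the commutation relation and the breadth-first enumeration are my own choices): enumerate the trace of a word by repeatedly swapping adjacent commuting letters, then compare its size to the number of unconstrained orderings.

```ruby
require "set"

# Letters b, c, and d commute with each other (they're independent);
# a and e don't commute with anything.
COMMUTES = [%w[b c], %w[c d], %w[b d]].map(&:sort)

def commutes?(x, y)
  COMMUTES.include?([x, y].sort)
end

# The trace of a word: every word reachable by repeatedly swapping
# adjacent commuting letters.
def trace(word)
  seen  = Set[word]
  queue = [word]
  until queue.empty?
    w = queue.shift
    (0...w.length - 1).each do |i|
      next unless commutes?(w[i], w[i + 1])
      swapped = w.dup
      swapped[i], swapped[i + 1] = w[i + 1], w[i]
      queue << swapped if seen.add?(swapped)
    end
  end
  seen
end

trace("abcde").size                  #=> 6   (a first, then b, c, d in any order, then e)
"abcde".chars.permutation.count      #=> 120 orderings with no constraints at all
```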
00:18:00.720
I'm doing well on time, surprisingly. Here's what I want to leave you with: what doing math actually looks like, during the discovery process, is a lot like discovering a new and interesting creature, like finding your own tropical jungle gerbil or whatever, and learning about its behaviors. You may find it can perform tasks you didn't think it could, or discover that someone else found the same thing and published a paper on it 30 years ago.
00:18:20.320
This trace monoid comes from a paper published about 30 years ago, in the context of a framework called process calculus. And this is gradually where our understanding of math gets to: math is an art. It's about crafting beautiful and meaningful explanations; it's about discovery and creativity, not about applying rules learned by rote. Mathematicians, like painters or poets, are makers of patterns, and their patterns are made of structures and relationships.
00:18:43.680
Math is more than what we were taught. It's important that we recognize the art in math and see its beauty. You may realize that some of the things you already love doing were never called mathematics, but truly are. And all of the joy that comes from learning new things, and from sharing them with others, is what makes doing math worthwhile.
00:19:15.680
That's really my message today: there is joy in learning, and we can deepen that joy by sharing it. I want to give special thanks to Jim Weirich, who inspired Sandy's incredible talk as well as my own journey to bring this message to you today. Jim taught us the joy of learning new things, and that there is even more joy in sharing those discoveries with others. Thank you, Jim, and thank you all for letting me share this with you.