00:00:18.500
When people find out that I'm a programmer, one of the first questions they ask me is, 'Oh, you must be really good at math.' Show of hands here: who has heard this before when people find out you're a programmer?
00:00:25.199
Yeah, we got a lot of hands. How many of you would actually consider yourselves good at math?
00:00:31.140
Okay, a few hands—not quite as many. My name is Joël, and I am not good at math.
00:00:44.879
In fact, as a Rails developer, I mostly just use basic arithmetic—the kind of stuff you can do on your fingers. These are the kinds of things that just about everybody else uses in their jobs as well.
00:00:56.760
Recently, I was having a conversation about what is the most useful computer science class for a working Rails developer because there's a lot of theory and interesting subjects, but what's actually useful for us here?
00:01:04.140
Let me tell you upfront: it is not data structures and algorithms. I know, right? Dropping the hot opinion early in the morning. So, I thought for a while and eventually settled on discrete math, which surprised me and the people I was talking to.
00:01:15.299
But it turns out that this is actually a skill I use on a daily basis. So, what is this discrete math?
00:01:21.720
Discrete math is a bit of a grab bag; it has elements from a variety of mathematical disciplines including propositional logic, Boolean algebra, graph theory, set theory, combinatorics, probability, predicate logic, and a few other fields.
00:01:29.460
Now today, we're not going to dig into the theory of all of these fields. Instead, we're going to look at practical problems and explore ways that discrete math informs our solutions.
00:01:40.979
Our first topic is unnecessary conditionals. Conditional code, such as if-else expressions, is a core part of programming, and one of the first things that you learn.
00:01:46.200
When I was a brand new developer, I wrote a lot of code that looked like this: a permissions code.
00:01:51.299
This is a simple conditional: if admin, return true; otherwise, return false. This is perfectly serviceable code, but the conditional is entirely unnecessary because it could be simplified to just calling the admin Boolean method.
00:02:03.780
So the lesson we can learn from Boolean algebra that is relevant here is to concentrate on operators. Boolean algebra deals with Boolean values: true and false, and the operations that can happen on them.
00:02:10.739
Now that’s pretty basic—these are things that we learn in probably an introductory programming course—but there’s an entire field of math built on these simple elements.
00:02:16.260
Some common Boolean operations are identity, returning the value itself, which is what I had accidentally implemented in that conditional you saw.
00:02:26.280
Secondly, we might want to do negation. In Ruby, we do this with the exclamation mark operator.
00:02:32.580
There are also a couple ways that we can combine two Boolean values. We can use 'and' to combine them or 'or'.
00:02:44.040
Earlier, we saw that sometimes we only want to use the admin value directly, so this is just doing identity.
00:02:50.759
Here’s another example of a conditional that’s unnecessary. I've written plenty of these.
00:02:56.100
Here we're saying: if admin, return false; else return true. This is very similar to the first one, except the two branches are flipped. This is really just re-implementing the not operator: we’re doing negation here.
00:03:06.720
Let’s look at something a little fancier. At first glance, you might look at this code and say that’s not a conditional, but see at the end of the first line we have that trailing 'unless', and that 'true' on the second line—that’s an implicit else.
00:03:13.500
We have some conditional code happening here, and this is actually equivalent to Boolean 'or'. Honestly, in my opinion, this reads a little bit better as a Boolean expression—who can edit an admin post?
00:03:18.960
Well, you can edit it if you're either an owner or an admin, and now I don't need to parse a complex conditional expression in my head.
00:03:28.440
So the first lesson that we get from discrete math and from Boolean algebra in particular is to lean into operators. If you're doing a Boolean expression, don’t try to re-implement everything with if; use the tools provided by the language.
00:03:37.259
You've written some code and improved it, but what about other people's code? What if we're doing a code review?
00:03:41.640
That weird condition we saw earlier might be written by a colleague, and we're looking at it, trying to understand what’s happening, and we feel a little puzzled.
00:03:49.320
Well, the field of propositional logic gives us a really helpful tool we can use in this situation: a truth table. I love these.
00:03:58.260
The way they work is that there are two values that influence the result: whether someone is an owner or whether someone is an admin.
00:04:05.340
We place those in two columns and fill out all possible combinations of true and false for each of them: both owner and admin, just an owner, but not an admin, not an owner, but an admin, and then neither owner nor admin.
00:04:13.500
Here in the final column to the right, we’ll put the return value of the method given those input values.
00:04:20.220
Now we can see that this confusing conditional code is always going to return true, except when both the owner and the admin are false.
00:04:26.360
Part of the value of a truth table is that it allows you to easily scan and see how the method behaves without having to interpret conditional code in your head.
00:04:35.660
I don’t know about some of you; you may have a really good Ruby interpreter in your mind. Mine depends on the day—some days it’s good, others not so good.
00:04:42.600
So a truth table like this is really helpful for understanding code. There’s also another advantage of truth tables, which is that we can determine if two expressions are equal when their truth tables match.
00:04:51.720
Remember how I told you that the initial condition is equivalent to the Boolean expression 'owner or admin'? Now, you all just believe me, and I appreciate that, but let me prove it to you.
00:05:01.819
Here, we’ve got an extra column with the truth values for that expression, and we can see they match the same values as we have for the conditional implementation.
00:05:07.820
Everything is true except for the last row where it's false. This means we can now have confidence that our refactor using a Boolean expression does not change behavior; this is a pure refactor.
00:05:14.819
The implementation is changed, but the behavior has not.
00:05:20.460
If you’re doing a code review, GitHub allows you to embed tables in its Markdown syntax, so you could leave a comment and put a table in there to illustrate your point.
00:05:26.700
I love truth tables because they help me understand complex code that I struggle to comprehend otherwise, and they also help improve my communication.
00:05:32.640
They are a great way to talk about code with other people.
00:05:40.420
So we’ve seen some confusing code, but let's look at actual broken code.
00:05:46.240
We have a feature where we want to forbid signed-out users and untrusted IPs from accessing our website, and the code looks something like this.
00:05:52.320
We have a method, 'allow_access_to_site', and we say: 'not signed out and untrusted IP'. I don’t know about you, but that kind of messes with my brain a little bit.
00:05:59.539
There’s some double negatives happening here: 'signed out' and 'untrusted IP' are negative in English, and then we have them wrapped in parentheses with a negation sign in front of that.
00:06:08.040
So one of our colleagues helpfully submits a PR where they decide to clean things up a bit. Instead of wrapping it in parentheses, let’s just negate each of the individual conditions.
00:06:15.360
It’s still a double negative, but maybe this is slightly easier to understand.
00:06:20.520
What’s not up for debate, though, is our colleague broke the build.
00:06:27.600
This is no longer going to work. The expected behavior, if we build a truth table, is that when both values are true, you’re forbidden from accessing the website.
00:06:34.560
But otherwise, you have access. However, if we compare the truth table for the new implementation, you'll notice it does not match at all.
00:06:41.300
These two expressions are not equivalent.
00:06:47.720
A lesson from Boolean algebra informs us about de Morgan's laws, which teach us how to negate compound conditions.
00:06:55.140
To negate a compound condition using de Morgan's laws, we first have to negate every clause, and that's what our colleague did.
00:07:01.380
But you can’t stop there; you also have to invert the operator that's in the middle.
00:07:06.960
So if you're saying 'not A and B', then you would convert that into 'not A or not B'. Notice we’ve converted that 'and' into an 'or'.
00:07:14.940
Similarly, if we have 'not A or B', that becomes 'not A and not B'.
00:07:22.260
Here's the original code before my colleague tried to change it: if we apply de Morgan's rules, step one is we negate each individual clause—that's what our colleague did.
00:07:32.880
Then, step two, we invert the operator. It’s now an 'or' in the middle, not and.
00:07:39.600
If we check this now in our truth table, we’ll see that the truth tables now match, and we have a valid implementation that’s equivalent to what we had before.
00:07:46.920
Now we have our correct code, and we still have the two double negatives, but at least maybe it’s slightly easier to read.
00:07:52.200
Something nice about having the double negatives at the individual clause level, rather than wrapping the whole thing, is that we could just alias those two extra methods and write them positively.
00:08:01.200
Now, the code reads: Who's allowed to access the website? If you're either signed in or you’re a trusted IP, you get access to the website.
00:08:07.860
So keeping de Morgan's rules for compound conditions in mind can help us write code that is much more readable.
00:08:14.520
Additionally, it's a tool I use to help catch bugs. Anytime I'm writing code that negates a compound condition, or if I'm reviewing code and see that someone has negated one, I'm going to pay close attention.
00:08:20.760
Maybe I’ll scribble out a truth table or look at the test suite, but those are areas that need a second look.
00:08:26.760
De Morgan's rules are not intuitive; they’re not the kind of thing one would naturally do if they’re unaware of them, and they’re worth taking a closer look.
00:08:35.400
Also, keep in mind that 'unless' is a form of negation. If you see code that says 'unless' followed by a compound Boolean expression, pay attention there because that could also be a bug.
00:08:41.760
It’s useful to look at code and identify potential de Morgan violations. Build out a truth table or analyze it manually.
00:08:49.200
But you might be thinking: couldn’t our test suite catch this? Shouldn’t we have coverage so we don’t have to deal with this nonsense?
00:08:59.520
Well, yes, but are you confident that you have enough coverage? How do you know if you've written enough test cases to cover all the edges in a particular method?
00:09:09.240
Combinatorics, which is a subfield of discrete math, gives us some tips.
00:09:18.360
Suppose we have a policy object with a 'can_read' method. It states that you can read this article either if it has a public access policy or if the user is an admin.
00:09:25.740
We’ve written four tests to cover this. It allows an admin to access a public article, allows an admin to access a private article, allows non-admins to access public articles, and disallows non-admins from accessing private articles.
00:09:34.740
That sounds like we’ve covered most of the cases, right? But how do we feel more confident?
00:09:42.480
There are two inputs in this method: whether or not the user is an admin and whether or not the article is public.
00:09:48.840
Checking whether the user is an admin has two states: it’s either true or false—simple enough.
00:09:56.160
However, whether a policy is public has three states: true, false, or nil, because of that safe navigation.
00:10:03.920
There’s some uncertainty happening in this system. The field of combinatorics tells us that when we build a compound state, we need to multiply the number of states in each of the underlying pieces.
00:10:12.120
In this case, we have two states (admin) times three states (public), which gives us six states. But we only have four tests.
00:10:19.560
I think there are a couple of edge cases we are not handling.
00:10:24.360
If I make a list of all the combinations of inputs and outputs, you can see I’ve highlighted the two rows at the bottom—those are the ones we don’t have test cases for.
00:10:32.760
Oh look at that! We didn't handle the nil case. I made that mistake. I’m sure none of you have ever forgotten to handle a nil.
00:10:38.880
Dealing with combinations is not just about Booleans, either. I consider something a little different here—let’s look at a constructor method that takes a couple of default parameters: a limit and a scope.
00:10:46.860
Optional parameters add two states to your method signature. We have two of those, so compound states multiply: two states times two states equals four states.
00:10:54.900
There are four different ways of calling this method. We can call 'new' by itself, with a limit, with a scope, or with both a limit and a scope—did you remember to test all those combinations?
00:11:00.180
So, what can we do when we realize that we do not have enough test coverage? I’m a big fan of Venn diagrams from set theory to illustrate this.
00:11:08.760
The problem is that we have a set of states and a set of tests. Ideally, we want those two sets to be equivalent.
00:11:14.760
But our set of tests is smaller than our set of states. So visually, how could we make these two equal?
00:11:22.200
There are two things we can do: First, we can grow the set of tests to match the set of states. Secondly, we can shrink the set of states to match the set of tests. It doesn’t have to just be one way.
00:11:31.560
So what does that mean for our problem? Either we can add more test coverage to handle the nil case, or we can reduce our state.
00:11:38.100
Just because you notice you don’t have enough test coverage doesn’t mean you have to add more tests. Sometimes, this indicates your code is trying to do too much.
00:11:46.680
In the case of our earlier example, we should check for nil earlier, so we don't have to deal with nil in this case, reducing our workload to four states instead of six.
00:11:54.300
Now we know how many tests we need, but what about when we have to deal with really complicated tests?
00:12:01.320
This is a technique I don’t use all the time; it’s less frequent, but when I do, it's a real lifesaver.
00:12:07.680
Let’s look at a complex test. Don’t feel like you need to read all of this code; it’s not really relevant.
00:12:17.460
But let me describe the general structure of it: we have a series of nested described blocks and some 'let' declarations occurring at various levels.
00:12:24.480
We also have a few tests that aim to perform certain actions. Anyone have code that looks like this in their codebase?
00:12:31.680
Yeah, we have a few. If you're not the one who wrote this, it can be hard to understand exactly what data gets set up for that test.
00:12:37.920
Here’s a visual model I like to use when I’m trying to understand what’s happening.
00:12:44.880
First, I'm going to list all the 'let's' in the file. In this case, we have an organization, a user, an admin, a product, and an invoice.
00:12:54.420
Then, I'm going to draw lines to connect those boxes. Those lines are not about Active Record associations; instead, they're about which 'let's' reference which other 'let's'.
00:13:01.260
The invoice 'let' block references the user that block and also references the product block.
00:13:07.800
We can connect them all, and we end up creating a nice little diagram with boxes and arrows pointing to other boxes.
00:13:14.220
This diagram is called a dependency graph. I absolutely love these things. I use them in various contexts.
00:13:19.800
Here we’re using them to analyze the 'let' blocks in a test.
00:13:26.700
When we evaluate a node in a dependency graph, we must also evaluate all downstream nodes.
00:13:33.300
In this case, because the organization is downstream from admin, we need to account for that.
00:13:40.440
This makes sense, as the admin 'let' block requires the organization to perform its tasks.
00:13:48.240
Now, we’ve set up our base state - both values will get evaluated regardless of what happens in your test.
00:13:54.120
Let's look at some actual tests now. How do they add extra behavior on top of that?
00:14:01.560
The first test references the invoice 'let' block, so we account for that.
00:14:09.240
Remember to follow the dependencies—follow all those arrows.
00:14:16.560
The invoice references the user, which references the organization, and the invoice also references the product.
00:14:23.220
Shading those—would you look at that? That one little test evaluates every single one of the 'let's' in the file.
00:14:29.880
What about the other test? This one references the product.
00:14:36.000
We again have our base state with the admin and the organization, and then we shade the product.
00:14:42.840
We try to follow the arrows downstream, but there are none. The product does not depend on anything.
00:14:48.360
In this test, only the product, admin, and organization get evaluated. The invoice and user are lazy; they don’t get invoked, and the associated work does not happen.
00:14:55.620
I’m a big fan of this kind of graph because, one, it helps me figure out what's happening in a test.
00:15:02.940
It’s not obvious sometimes that 20 or 30 'let' blocks might get called because only one of them is referenced in the test.
00:15:09.000
Through a chain reaction, all of them get evaluated.
00:15:16.620
Secondly, it's important to note that you're not locked into this structure. This reflects how your 'let' blocks are defined.
00:15:23.640
But you can always redefine them differently. Maybe you notice, 'Oh, the way I've structured them causes this long cascade of dependencies getting evaluated.'
00:15:31.740
You could invert a dependency, or make one thing reference another, and reduce the number of evaluations.
00:15:40.620
If you’re performing a performance optimization, trying to limit the number of database calls, this can be a useful exercise to simplify a test making many queries.
00:15:48.000
One of the powerful skills I’ve developed as a developer is breaking down complex tasks and working through them incrementally.
00:15:55.740
Graph theory is a tool I use to achieve that. I’m excited to share this particular approach with all of you today.
00:16:02.520
If you have a large task, you might break it down into a series of sub-steps: step one, step two, step three, four, and so on.
00:16:09.180
Simple enough—but many tasks are rarely linear. You can’t always just work through them in that order.
00:16:15.960
In fact, most subtasks might form some kind of dependency graph.
00:16:22.320
Step one might depend on steps two and three being completed. Similarly, step two can depend on step four, while step three could depend on both steps four and five.
00:16:32.160
If we start from step one, which likely requires all these dependencies, we find that to shift step one, we end up just doing a single chunk of work.
00:16:40.440
The goal of breaking down a large task into smaller subtasks is supposed to enable us to work incrementally.
00:16:48.780
Instead, we inadvertently create a single giant task. Now, let’s look at an alternative way.
00:16:56.760
Instead of starting with step one, what if we start with step five? We accomplish that first.
00:17:04.620
Remember that we have to follow all dependencies; however, step five has no dependencies.
00:17:10.680
It can be done in isolation. It will pass the eye-check and can stand alone.
00:17:15.960
You can ship this to production, fix a bug, and return to this task tomorrow.
00:17:22.560
Similarly, we can perform the same for step four, which has no dependencies.
00:17:30.360
Now we can do step three because steps four and five have already been completed and shift that as well.
00:17:41.040
The same is true for step two, and at the very end, we ship step one after all other tasks have been completed.
00:17:48.600
A lesson we learn from graph theory is that dependency graphs should be evaluated from the bottom up.
00:17:56.280
We start with those terminal nodes that have no dependencies and work backward.
00:18:02.460
By working this way, you can evaluate each node independently without needing to manage the surrounding work.
00:18:10.560
This is powerful because it allows you to build the steps independently.
00:18:19.320
You can merge them onto your main branch and do not need a lengthy running branch to reintegrate all the time.
00:18:29.760
You can pass the eye-check, merge to production, pause work, do something else, then return.
00:18:35.760
For those of you wanting to take notes and look into these ideas later, the formal term for this sort of bottom-up evaluation is 'topological sort'.
00:18:42.840
You don’t need to know this for today’s talk, but if you’re interested, this is the official name.
00:18:49.920
Let’s look at a more complicated situation: we need to replace an image processing gem.
00:18:56.880
We figured out the steps: one, change the gem in our gem file; two, update the article model because the new gem has a different API.
00:19:02.940
Then we will do the same for the user model.
00:19:08.760
If we draw this out, you’ll notice some strange things.
00:19:15.600
These arrows create cycles. If we change the image processing gem, we cannot do this on its own.
00:19:23.040
It’s going to break because the article model needs the new API; however, we cannot update the article model without the new gem.
00:19:30.720
So we are in a loop where everything requires each other, and nothing can be handled incrementally.
00:19:39.300
The graph theory shows us that cycles cannot be evaluated incrementally.
00:19:44.940
So what do we do? Are we just doomed to ship this in one large project?
00:19:51.960
No! Remember: you aren’t locked into this structure. You can modify your graph.
00:19:58.620
In this case, why not introduce a fourth step? We could have an adapter.
00:20:06.840
So the first step becomes introducing an adapter object that wraps the old gem but exposes the new API.
00:20:14.400
This has no dependency, so we can ship this on its own.
00:20:20.700
Merge it to master and go to production; nothing breaks.
00:20:28.680
A week later, we can decide to update the article model since we now have the adapter.
00:20:35.700
Thus, we can ship that without additional work.
00:20:43.680
Later on, we move to the user model and, ultimately, swap out the gem in the gem file.
00:20:49.920
Then we drop the adapter, shipping that as the last step.
00:20:56.520
Each of these pieces was completed incrementally, merged into the main branch, shipped to production, perhaps with weeks in between.
00:21:02.520
You might be wondering, isn't this still a cycle? I see a square here.
00:21:09.240
But remember what matters: the arrows. If you follow the arrows, once you reach that bottom step (the adapter), there are no arrows leading out.
00:21:16.920
This, in fact, is a terminal node.
00:21:23.460
The sophisticated term for graphs without loops is 'directed acyclic graph.' Again, not super relevant unless you want to Google it later.
00:21:28.560
The pattern we used by introducing an extra step that adds a sort of wrapper or adapter is often referred to as the 'Strangler fig pattern'.
00:21:34.740
This is a must-have in your toolkit if you're working on any large-scale refactor or upgrade project.
00:21:41.880
While we don't have time to go over everything, there are a variety of ideas from discrete math that inform our work.
00:21:48.420
Let me quickly shout out a few fun topics: if you’re generating IDs randomly, and you wonder about potential collisions.
00:21:56.280
You might naïvely think the chances are low, but combinatorics can help you understand the reality; check out the birthday paradox.
00:22:05.640
Designing a database schema often requires you to eliminate invalid states and avoid redundant data.
00:22:11.880
Ideas from set theory and combinatorics can help with that.
00:22:18.540
Predicate logic helps with statements of truth that apply to all or any value sets, aiding our understanding.
00:22:27.120
We frequently use all and any in communication, so predicate logic can also improve our understanding and expression.
00:22:35.580
If you’ve ever felt foolish for making an absolute or superlative statement, only to be corrected by a counterexample, you can see how useful logic can be.
00:22:41.740
Today, we've explored ideas and tools from discrete math: Boolean operators, truth tables, de Morgan’s laws, and how we handle compound states.
00:22:50.540
Evaluating graph nodes requires assessing everything downstream. Ideally, you evaluate from the bottom up.
00:22:58.380
It’s okay to break cycles in your dependency graphs. Thank you for coming to this talk. My name is Joël Quenneville, and I am not good at math.
00:23:12.240
Yes, the question is, where did I get the idea for applying discrete math to programming?
00:23:18.120
I think some of it I was already doing in my day-to-day, though I didn’t recognize it.
00:23:25.500
It took a conversation with a colleague about what is useful in discrete math to realize, wait a minute.
00:23:31.320
I do reach for complicated Boolean methods and sometimes draw truth tables, which are indeed useful.
00:23:39.300
Recently, I have been leaning into dependency graphs and how they can help me solve problems.
00:23:47.880
I apply them to many situations; here, I showed two: handling 'let' in a test and working incrementally.
00:23:54.960
There are many other circumstances where modeling with dependency graphs is helpful. I’m a big fan of tools like Mermaid to draw graphs; it’s easy.
00:24:02.760
If you’re more old-school, you might prefer Graphviz to quickly create graphs, or even just paper and pencil.
00:24:09.240
So the comment was about using shared examples to pass in expectations and results which can create a truth-table-like structure.
00:24:17.760
I’ve been experimenting with a custom RSpec matcher that allows passing an ASCII truth table as the matcher value.
00:24:24.720
Still not certain if that will enhance developer experience, but it could be more readable than numerous blocks.
00:24:32.280
Alright, question over there?
00:24:38.640
Yes, the question is, if you use these terms from discrete math, do you come off as pretentious?
00:24:43.860
I usually avoid formal mathematical terms; I wouldn't say, 'therefore we should convert our directed acyclic graph to a topological sort.'
00:24:49.980
No one wants to hear that nonsense! I might draw a diagram and say, 'If we do it in this order, notice we can handle each piece independently.'
00:24:56.700
I’d also link to an article about topological sort if someone wanted to dig deeper.
00:25:03.780
The goal is to keep the audience in mind; if you're speaking to people unfamiliar with it, skip the jargon.
00:25:10.380
There may be a tool to generate dependency diagrams for factories or let blocks; that sounds interesting.
00:25:17.340
Now, I want to know, what’s the largest number of 'let's' you've seen for a single test when analyzing one of these graphs?
00:25:26.760
Oh, that's probably a scary high number! I’ve seen test files thousands of lines long.
00:25:31.020
It usually has multiple layers of 'let', and that's where very deep nesting comes into play.
00:25:38.520
You often can’t tell what’s in scope, and it can be quite terrifying.
00:25:44.940
Were there any topics I had to cut from this talk? And are there resources for those?
00:25:50.280
I had mentioned a few at the end, including combinatorics for Collision detection and set theory ideas for database schema design.
00:26:00.360
I would’ve liked to discuss those if I had more time, as well as some predicate logic stuff.
00:26:07.560
I find it intriguing when models from math or software not only help us code but improve communication.
00:26:14.700
As developers, we communicate with computers and people, and we need to be good at both.
00:26:21.960
As for resources, Brilliant.org has a course on discrete math, and they have a publicly available wiki section.
00:26:27.150
I’m unsure if the course itself is paid.
00:26:32.040
That's about all the time we have for today. Thank you for coming to the talk!
00:26:39.600
Feel free to come and see me afterwards.