Grokking FP For The Practicing Rubyist

Norbert Wójtowicz • September 20, 2022 • Wrocław, Poland

In the talk titled "Grokking FP For The Practicing Rubyist," Norbert Wójtowicz explores the misconceptions surrounding functional programming (FP) and highlights its relevance for Ruby developers. Throughout his presentation delivered at the wroc_love.rb 2022 event, he emphasizes that functional programming is not solely about pure functions without side effects. Instead, the core of FP includes understanding data, calculations, and actions. Wójtowicz introduces three main concepts:

  • Data: This is defined as information at rest that does not require a computer for interpretation and can yield new insights when contextualized. He uses the anecdote of Sumerian clay tablets to illustrate how data retains its meaning and relevance across time.
  • Calculations: These are functions whose output depends solely on their inputs, no matter how many times they are executed. He exemplifies this using the Enigma machine, which decrypts a given message with given settings to the same result every time. In contrast, he shows how a value like sales tax, if it can change at runtime, makes the result depend on more than the inputs, so it can no longer be predicted without running the code.
  • Actions: In contrast to calculations, actions depend on the timing and frequency of execution. Wójtowicz illustrates this concept with the example of a web service that generates random bits based on quantum principles, emphasizing that actions can lead to unpredictability in the outcomes.

The speaker emphasizes the importance of identifying these elements in their coding practices, advocating for a systematic understanding rather than merely adhering to trendy paradigms. Notably, he discusses the common pitfalls in Rails applications concerning logic handling between models, views, and controllers, leading to undesired complexities. He urges developers to practice stratified design to ensure clean, modular code and reduce coupling between components. With practical tips, Wójtowicz guides developers on how to organize their programs more effectively and think critically about their system's architecture.

In conclusion, Wójtowicz encourages embracing the concepts of data, calculations, and actions for better programming practices, facilitating a deeper intuition for clean design. He references a book, "Grokking Simplicity" by Eric Normand, as a resource for further exploration into these functional programming concepts.

wroclove.rb 2022

00:00:15.599 It's always lovely to be here. Three signs: oh, can I move the microphone a little? Is that okay?
00:00:21.240 Okay, so three signs, brothers. You've had way too much coffee, not enough sleep, and you can hear my voice. It's been two days of partying, so I apologize. I may not be the speaker you deserve, but you're going to get me, alright?
00:00:28.320 I see the presentation's working. Let's just get into it: functional programming.
00:00:34.739 I'm sure you've all heard of functional programming, and I’m sure you've been told that it's about programming with pure functions without side effects. Right? Hands up! Great. Disagree? I'm going to need more energy from you guys. I need the energy in this room to go up because it's only going to get weird, trust me.
00:00:47.000 This is not true. This is the worst possible explanation of what functional programming is, and I'm here to set the record straight. So we're going to learn three terms, we're going to learn two visualization techniques, one programming concept, and we're going to learn absolutely nothing about monads—this is not that talk because I'm here to teach you functional programming, not monad programming.
00:01:17.280 So meet Mr. Chang. He's your classic community college professor—sombrero on, feet up in the air, just chilling. When this slide is up, I want you to relax, lean back, breathe, think about what you’ve just seen, and process it. But that also means that when he is not up, I need you to put your phone down, lean forward, and read between the lines.
00:01:36.119 Because I'm not here to teach you a bunch of buzzwords. I want you to take every single slide you see and think about how it applies to your existing codebase. I promise you that if you put in the effort today, you will wake up tomorrow a better Ruby developer; this is my promise to you. Okay? Sounds good? I'm going to need more.
00:01:54.119 Thank you. Let's start with some softballs. No, I'm going to try not to ask any rhetorical or trick questions. This is not an interview. So if I ask something and it seems obvious, I want you to answer anyway—just shout it out. This is going to help you process this information. We're looking at something up on the screen. Can you take a wild guess what kind of application we’re dealing with here?
00:02:37.260 Yeah, some kind of shopping cart e-commerce. I think that would be a reasonable assumption. Can you tell me how many books are in this shopping cart? Three books. Can you tell me if any of them are on sale? One? Which one? The last one. So the first two books, that’s from yesterday’s talk.
00:03:00.239 The last book I threw in there because it's a book I read recently—maybe half a year ago—it's about women in computing in the early days of the computing field. It turns out that there are a lot of interesting anecdotes, and I think not enough people actually learn about our own history as a computer science field, so I highly recommend it.
00:03:32.159 So that's our first slide. Sorry, there was a typo. I’m sure there’s more than one. Okay, so it's going to be here too because I just copy-pasted. This is our second slide, and here I want to ask you: when was the 'Principles of Philosophy' published? Louder! Okay, by whom?
00:04:06.299 By Descartes. And how old was he when he published it? 48 years old. Yes! So think about what's just happened here. I gave you zero information, but you were able to not only figure out meaningful information that was here; you were actually able to do some graph traversals even though I didn't tell you how those things are somehow joined together.
00:04:30.720 Right? So you've just learned what data is. Data is information at rest. It means that it's meaningful without needing a computer to run it. You are now a data engineer! The other interesting property about data, aside from the fact that you don't need a computer to have any meaningful information processed from it, is that new facts can emerge.
00:05:00.660 Right, because there was nothing about how old Descartes was; this is something you figured out by joining other facts that were already there. The other interesting thing—sorry, first slide. Potter Stewart, if you don't know who that is, is a Supreme Court Justice, and he famously, when asked what pornography is, said 'I know it when I see it.'
00:05:37.920 And that's how I want you to remember what data is. Next time anyone asks you, 'What is data?', just smile and think, 'I know it when I see it.' That’s the first property of data: it’s information at rest. It doesn't need a computer to be interpreted. But the second important piece about data is that it has multiple interpretations. This is a Sumerian clay tablet, probably around 5000 years old. The rough translation is that someone owes someone a bunch of barley—like a bunch of barley stalks—and promises to pay off the debt in, like, 37 months or something like that.
00:06:51.180 So it's essentially an IOU that's 5000 years old. This was information that was written down so they remember what it is, but when we dig it up 5000 years later, we don't care about who owes what, because these people are long dead and this IOU will never be repaid. What we care about is that we can take this kind of information and learn something about agriculture, about their economy. We can have new insights about what this society looked like.
00:07:09.060 And this is something I want you to burn into your brains: when we write a piece of data into our system, we have no idea how it will be interpreted in the future. So when you're building your event sourcing system and you're putting this JSON in some database, I want you to remember that you're putting it in such a way that it can answer the questions you have right now, and also, two years down the line, your questions will likely be very different, and you have absolutely no idea what those questions will be.
00:07:35.940 That’s why it’s important to think about how you're storing the kind of data you're storing; to design it in such a way that it will be meaningful three years down the line, when maybe you're not even writing this in Ruby anymore, you're writing this in Java, Scala, or some newfangled language. But it can still read JSON because, as you noticed on those first two slides, I didn't even tell you what format that was; it could have been EDN for all you know, but you were able to interpret it anyway. So those are the interesting properties of data.
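To make "information at rest" concrete, here is a minimal Ruby sketch. The slides are not part of the transcript, so the field names (`title`, `published`, `born`) are assumptions made up for illustration; only the dates themselves are historical facts. It shows a new fact (the author's age at publication) emerging from two plain hashes, and the same data surviving a round trip through JSON.

```ruby
require "json"

# Plain data: nothing here has to run in order to be understood.
# The field names are hypothetical; the slide is not in the transcript.
BOOK = { title: "Principles of Philosophy", published: 1644, author: "René Descartes" }.freeze
AUTHOR = { name: "René Descartes", born: 1596 }.freeze

# A new fact emerges by joining two facts that were already at rest.
def age_at_publication(book, author)
  book[:published] - author[:born]
end

puts age_at_publication(BOOK, AUTHOR) # 48

# The data survives a trip through a language-neutral format unchanged.
round_tripped = JSON.parse(JSON.generate(BOOK), symbolize_names: true)
puts round_tripped == BOOK # true
```

Nothing in the hashes knows about Ruby, which is the point: a reader in another language, or in another decade, could still ask questions of it.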
00:08:16.680 What about this slide? This is some JSON. This JSON will actually not compile because there's this weird bar thing, right? JSON can't have symbols. So let's pretend it's not JSON; let's pretend it's Ruby so at least it'll compile. Do we have an idea what foo could possibly represent based on this slide?
00:08:29.500 Yeah, you can't do anything! So let me help you out. Let's say that it's actually called total, and the bar is actually calculated total. Do you now have some idea of what this represents? You have some idea, but can you tell me what the result would be? 42. That is the correct answer; that's the secret answer! So the problem with this is that this is no longer data because there's a function in there, and we need to understand what kind of runtime this function will run.
00:09:09.300 We can, of course, maybe look into the source code, assuming we have access to it. Now this source code—if you look at this, can you guess it? If I tell you what the items are, will you be able to tell me what the result would be? It seems reasonable, right? These are what we call calculations—calculations are functions such that the output is dependent only on the input and nothing else.
00:09:41.880 Now, it's not entirely true that you can know what the answer will be because we're Ruby developers, so maybe someone did something like this, right? But it is a reasonable assumption that we can analyze this code and say, 'Yes, this result will always be the same.' If I run it once or I run it 100 times, it will always give me the same result. That's the definition of a calculation: the output is only dependent on the input, and it doesn’t matter how many times you run it; you’ll always get back the same result.
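A calculation in this sense might look like the sketch below. The method name `calculate_total` and the shape of the items are assumptions standing in for the slide, which is not reproduced in the transcript.

```ruby
# A calculation: the result depends only on the argument.
# `calculate_total` and the item shape are hypothetical stand-ins for the slide.
def calculate_total(items)
  items.sum { |item| item[:price] * item[:quantity] }
end

ITEMS = [
  { name: "book A", price: 20, quantity: 1 },
  { name: "book B", price: 11, quantity: 2 }
].freeze

# Running it once or a hundred times gives the same answer.
results = Array.new(100) { calculate_total(ITEMS) }
puts results.uniq.inspect # [42]
```

Because the method reads nothing but its argument, you can predict the result by reading the code, provided nobody has monkey-patched the methods it relies on.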
00:10:20.100 In the real world, the Enigma machine is a calculation because it’s this complicated piece of machinery that you feed two things: the encrypted message and a set of settings. When you get the result back, the decrypted message, it doesn’t matter if you run it once or a million times; you’re always going to get back the same result. The Enigma machine might be so complicated that you can't figure out what the decrypted message will be without running it at least once, but it is still a calculation.
00:10:38.580 So, what’s your question so far, or just ask me. What’s your question? Oh, sorry. Oh wait, sorry. I can’t swear. You gave me one of the fizzy—that's not cool. Alright, so we’re back to the slides. We have the same slide, but this time we’ve introduced sales tax. Is this still a calculation? It would be reasonable to assume that yes, this is still a calculation because why?
00:11:02.240 The output is dependent only on the inputs. I want you to repeat this over and over in your head until it becomes second nature. Now, what if I change the argument list? Is this a calculation? Maybe? Yes? No? I love it; it’s both! It’s Schrödinger's calculation: it’s both a calculation and not a calculation. It depends on what sales tax is. If sales tax is, for example, a constant in the class, or a global, or even something in the environment that you set once at startup, then for the purposes of understanding what the system does, this is a calculation, because it won’t change, right?
00:11:54.720 But, if the sales tax is an instance variable and anyone can change it at any time, then you can no longer make predictions about what the value is without actually running it at a specific time. Agreed? A more realistic example would be if this was called 'current user geo sales tax' and then it would be a lot more obvious that yes, this is no longer a calculation because we’re assuming that users come from different geographies and so this will not always return the same result.
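The two cases can be put side by side in a short sketch. The names (`SALES_TAX`, `Checkout`) and the tax rates are assumptions chosen for illustration, not taken from the slides.

```ruby
# Case 1: the tax rate is a constant set once, so the method is still a calculation.
SALES_TAX = Rational(23, 100) # hypothetical rate

def total_with_constant_tax(items)
  subtotal = items.sum { |item| item[:price] }
  subtotal + subtotal * SALES_TAX
end

# Case 2: the tax rate is mutable state, so the result depends on *when* you call it.
class Checkout
  attr_accessor :sales_tax

  def initialize(sales_tax)
    @sales_tax = sales_tax
  end

  def total(items)
    subtotal = items.sum { |item| item[:price] }
    subtotal + subtotal * @sales_tax
  end
end

items = [{ price: 100 }]
checkout = Checkout.new(Rational(23, 100))
before = checkout.total(items)
checkout.sales_tax = Rational(8, 100) # anyone can change it at any time
after = checkout.total(items)
puts before == after # false: same input, different output
```

One way to turn the second case back into a calculation is to pass the tax rate in as an explicit argument, so that everything the result depends on is visible in the argument list.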
00:12:43.200 So we get to our next term: sorry, I messed up the slides. That’s okay. These things that are not calculations are actions. As Miho showed yesterday, there are a lot of ways we can think about the coordination between certain things, but all of these are examples of actions. The definition of an action is the following: a computation whose result depends on either when you run it or how many times you run it. That’s all it is; that’s all you ever need to understand: a calculation's output is dependent only on its input, and an action is dependent on when or how many times you run it.
00:13:30.360 So these are all examples of actions. And notice that 'ship items' has to have call-once semantics, but if you have an idempotency key like the order ID, you can make it an at-least-once operation; you do need to call it at least once because otherwise the shipment will never get sent. So those are our actions. In the real world, there’s this web service called HotBits, which I think is run by Fourmilab.
00:14:02.220 They’ve set up a Geiger-Müller tube hooked up to a computer, and every time you ask this service, 'give me random bits,' it will analyze the tube, which is dependent on radioactive beta decay, and due to the way that quantum mechanics and physics work, it’s completely unpredictable. So it’s a true random number generator, and that makes it a perfect action because the only way you can get any information is to call it, and every time you call it, you always get a perfectly random bit that cannot be predicted.
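The point about 'ship items' and an idempotency key can be sketched as follows. `ShippingService` is a hypothetical in-memory stand-in; a real system would keep the set of shipped order IDs in durable storage.

```ruby
require "set"

# A hypothetical shipping service. Sending a parcel is an action: it matters
# how many times it runs, so we guard it with an idempotency key (the order ID).
class ShippingService
  attr_reader :shipments

  def initialize
    @shipped_order_ids = Set.new
    @shipments = []
  end

  # Safe to call more than once per order; only the first call ships.
  def ship_items(order_id, items)
    return :already_shipped unless @shipped_order_ids.add?(order_id)

    @shipments << { order_id: order_id, items: items }
    :shipped
  end
end

service = ShippingService.new
puts service.ship_items("order-1", ["book"]) # shipped
puts service.ship_items("order-1", ["book"]) # already_shipped (a retry)
puts service.shipments.size                  # 1
```

With the key in place, the caller can retry freely (at-least-once delivery) while the parcel still goes out only once.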
00:14:22.080 So those are our three terms. Actually, this is functional programming. This is the essence of functional programming: the only thing that functional programmers do is that they understand these three things, and they identify these three things in their own system. Everything else is just sugar; it’s more interesting ways of figuring out how to build complex coordination systems between these things or how to take things out or put things in, but it always comes down to this.
00:14:54.180 What’s amazing about this is that you don’t need functional programming—you don’t need to learn a specific language to identify these three things in your own system. But the real question is, why would you want to? Right? This talk won’t teach you everything you need to know about programming; that will take a lifetime of study. But my point is that until you start identifying these three things in your own system, you’re going to have a hard time moving forward.
00:15:35.520 One thing you can start doing with this new information: tomorrow, when you sit down at work, identify what is data, what is a calculation, and what is an action. I’m going to give you one quick tip on what you can do with this information: this matrix right here. One of the interesting properties of these three things is that they have specific properties of composition.
00:16:09.780 If you take some data and you mix it with more data, you get data back, which means that it's still information at rest; you don’t need a computer to interpret it. But as soon as you introduce a calculation—like we introduced the sales tax into our JSON payload—it is no longer data. So it means that anything that is dependent on this piece of data is now also a calculation, and similarly, if you have an action, then anything that is dependent on the output of this action—since an action is dependent on when or how many times you call it—by definition, everything else that depends on this is also now an action.
00:16:54.600 Right? Or another way of thinking about this, if you want a more visual, visceral version, is imagine you have a swimming pool, and then someone contaminates it on one end. It doesn’t matter where they did what they did; you’re not going into that swimming pool anymore. This is the side effect, right? This is what the functional programming people are like: 'Don't do side effects.'
00:17:10.440 It's not about side effects; it's about the fact that when you do something in the bowels of your system that is dependent on when or how many times you call it, then everything else that is dependent on that thing also automatically becomes an action. It becomes hard to reason about, because your system is now one big coordination machine, and you sort of have to keep the entire thing in your head and wonder about all the different edge cases. Is that making sense?
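The composition matrix described above can be written down as a tiny function. This is only a sketch of the rule as stated in the talk; the names `KINDS` and `compose` are made up for illustration.

```ruby
# Ordered from easiest to hardest to reason about.
KINDS = %i[data calculation action].freeze

# Composing two things yields the "strongest" kind involved:
# data + data is data, anything + action is an action.
def compose(kind_a, kind_b)
  [kind_a, kind_b].max_by { |kind| KINDS.index(kind) }
end

# Print the full matrix, one row per kind.
KINDS.each do |row|
  cells = KINDS.map { |column| compose(row, column).to_s.ljust(12) }
  puts "#{row.to_s.ljust(12)}| #{cells.join}"
end
```

Read the last row and the last column: a single action anywhere in a dependency chain turns the whole chain into an action, which is the swimming pool picture in table form.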
00:17:44.640 Alright, so let’s look at an example, because I want to emphasize this. This is the most important thing, the fact that we have data, calculations, actions. I want to leave you with a little more practical advice on how you can apply this to your daily work. So one thing we can do is, let’s look at Rails. This is our first visualization; it’s a call graph.
00:18:47.400 It essentially shows dependencies between components, right? You have layers, so each of these horizontal things is a layer. You have a layer of routes and controllers, and as you might imagine there’s n layers. Usually, the first thing you do historically is you start putting more and more logic in the view until you have a couple of thousand lines of a view with a bunch of ifs, and you realize that’s not manageable.
00:19:22.260 So what do Rails developers do? Historically, they create concerns. However, concerns do not solve the problem as we know, but this dependency graph, this call graph shows why it doesn’t solve a problem. Because concerns just take a big blob of code and move it into a bunch of files—but in runtime, it's still just one big dependency. It’s split across multiple files, so you’re sort of hiding the complexity, but in runtime, that same exact complexity is still there.
00:20:07.860 So, once we’ve realized that this is not the solution, usually the first thing that historically we did was we created decorator objects. Right? Let’s move a lot of the logic of the view into a separate class. But who creates that class? The controller does. So now the controller is doing two things: creating a decorator class and creating the actual view.
00:20:43.500 Every time this call graph shows an arrow pointing up, that’s an indication that something has gone terribly wrong. And in practice, how this ended up is that once people started creating all these decorators, they realized that whenever they wanted to change something in the view logic, they usually had to change two things, both the decorator and the actual view, because they were dependent on each other.
00:21:18.300 So yes, we moved some code out into a separate layer, but it turns out that the coupling is so tight that you didn’t really abstract anything out. You still had two places that you had to constantly touch and keep in sync. So what else can we do? Well, it turns out that some things can actually be modularized.
00:21:59.940 For example, if we have some kind of date time concept and you have a library that takes this date time and makes a relative string, like 'three minutes ago,' that is a self-contained thing. It shows that if you have a library that can take a date time and convert it into this kind of humanized string, the call graph shows that this is, in fact, a simplification of your architecture because you are able to modularize.
00:22:40.680 You pull out something and the controller doesn’t need to know about it; the controller just passes down the date time, and then the view calls it and says, 'I have this date time, but I'm really interested in this human string.' So that’s an example of how some things are, in fact, a good way of modularizing Rails code. But there was a lot of code that wasn’t able to be modularized in this way, and that’s because most of the view code was talking to models.
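A self-contained helper of this kind could look like the sketch below. It is not the API of any particular library; the name `relative_time` and the exact wording of the strings are assumptions. Passing `now` in explicitly keeps the method a calculation rather than an action.

```ruby
# A self-contained calculation: both timestamps are passed in, so the
# method never reads the clock itself. The name and wording are hypothetical.
def relative_time(from_time, now)
  seconds = (now - from_time).to_i
  minutes = seconds / 60
  hours = minutes / 60

  if seconds < 60
    "just now"
  elsif minutes < 60
    "#{minutes} #{minutes == 1 ? 'minute' : 'minutes'} ago"
  elsif hours < 24
    "#{hours} #{hours == 1 ? 'hour' : 'hours'} ago"
  else
    days = hours / 24
    "#{days} #{days == 1 ? 'day' : 'days'} ago"
  end
end

now = Time.utc(2022, 9, 20, 12, 0, 0)
puts relative_time(now - 180, now) # 3 minutes ago
```

The controller only has to hand the view a timestamp; the view decides it wants the humanized string, and nothing in the helper points back up at either of them.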
00:23:27.480 Right? You had a bunch of view logic that was iterating and touching models. Again, it took us a while to learn why this was a bad idea, but if we had drawn this call graph, we would have seen right away that this is going to end in disaster because, once again, we have arrows pointing up.
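If you want to check a call graph for upward arrows mechanically, a rough sketch might look like this. The layer order below is an assumption (the slide is not in the transcript), chosen so that a view calling a model counts as pointing up, as described above.

```ruby
# Assumed layer order, top to bottom; the actual slide is not in the transcript.
LAYERS = %i[routes controller model view].freeze

# Each pair is a "caller -> callee" arrow in the call graph.
CALLS = [
  %i[routes controller],
  %i[controller model],
  %i[controller view],
  %i[view model] # view logic reaching back into models
].freeze

# An arrow points up when the callee sits in a higher layer than the caller.
def upward_arrows(calls, layers)
  calls.select do |caller_layer, callee_layer|
    layers.index(callee_layer) < layers.index(caller_layer)
  end
end

puts upward_arrows(CALLS, LAYERS).inspect # [[:view, :model]]
```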
00:24:11.760 Okay, so what’s the next thing that happens when you have a bunch of views that are talking to a bunch of models? What's the next blog post that people wrote? 'Move your view logic to ...' No, no, you're skipping a step. To the controller, in general. There was this movement to move everything from the view into the controller—JavaScript! Yeah, CoffeeScript, if anything at this time. This is like circa 2015.
00:24:43.920 So all the view logic went into the controller, and it was, in fact, simpler: the controller was already talking to models, so it did simplify the view part, because the view part no longer had these up arrows. But what happened over time is obviously the controllers grew exponentially big. And when Rails developers looked at it, they went, 'I was told that functions must be small!' So what do we do?
00:25:23.160 We make concerns. But concerns don’t solve the problem; they just hide the problem. So, after we went through this entire phase of, 'Okay, so concerns are not the answer,' there was a bunch of blog posts and conference talks about 'thin controllers, fat models.' So now the controllers are thin, but the models are complicated because instead of being simple ORMs that talk to the database and wrap it, they now have all these weird dependencies and coupling because we moved all the logic there.
00:26:09.060 So in a way, yes, this is better than what we had previously, but I mean, it’s still not perfect. And over time, the models grew. So classically, what’s the first thing Rails developers do? Concerns! So now you have a model that’s still complicated as it was but has a bunch of logic hidden in a bunch of files. So everyone’s happy, and then over time we realized that no, this doesn’t actually solve the problem.
00:26:50.220 So we have service objects, and service objects—this call graph actually solves a legitimate problem. Now, you can argue what exactly is a service object? Is it a repository? Is it an operation? There can be multiple versions of this world, and we’re constantly evolving this. This is an ongoing discussion: what are the appropriate layers? My point here is I’m not here to give you the answers on how you’re supposed to model your application.
00:27:32.760 I’m here to tell you that you need to start thinking about these layers. Because, as Andre mentioned yesterday, service objects are not the answer, but they’re a good gateway drug. My suggestion to you is: don’t wait until someone tells you what the next pattern you should be using in your code is; instead, start drawing these layers and think about which layers make sense in your system.
00:28:13.860 Make sure you don’t have things pointing up, and that the layers that you draw are legitimate separate abstractions—not just ways of splitting out code so you have low ... you know, good score, and ‘clean code.’ By the way, is ‘clean code’ still a thing? I feel old!
00:29:01.140 Everywhere, time? Ah, maybe I should speed up. Stratified design is the name behind the thing we’re talking about, which is let’s develop—let’s think in layers and make sure that the layers make sense. So let’s look at a version that's not application-wide, but let's look at a version of stratified design—function or method-scoped.
00:29:38.880 Here’s a function I took from a blog post, which is essentially: let's find the most unpopular book that was sold, right? I mean, it's not pretty Ruby, but it’s Ruby that you’ll see in the wild. There’s some iterating, and there are some if-statements, and then there’s some sorting—and it’s not rocket science. So obviously, everyone that sees this code goes, 'We need to refactor this. Let’s extract method and let’s use pattern x, y, and z.'
00:30:17.160 So you get a version that’s more like this, right? We extracted out the 'is book' because that’s domain-specific knowledge. Maybe we replace ‘each’ with ‘each_with_object’ and you know we did a couple of nice things. I think this is pretty legitimate to say that this would be considered a refactoring process. I would argue this, because this is what the author of the blog post essentially said, that this is the improved version.
00:30:58.260 Now it's true, I do not doubt this, because it is a better version. But I suggest that we can do better than this. So here’s the original version as a call graph. Notice that there are two orange arrows. I don’t know about you, but when I’m reading code, there are certain functions or operations where you get a ‘spidey sensation.’ You slow down and read deeper because you’ve been bitten by bugs that are related to state.
00:31:40.380 You realize that these functions are potentially important to understand if there are some edge cases we need to worry about, agreed? This is the improved version. We see that there’s an additional layer: 'is book' is an additional domain layer that we have now, and we change some functions around. We’re no longer using 'each'; now we're using 'each_with_object', and so forth.
00:32:20.160 But as you should see, visually, we’re not fundamentally changing anything yet. So when I was looking at this code, my first thought was: there’s a bunch of this code that’s essentially about wanting a mapping of the items I have in the system to how often they appear. There’s a name for this: it’s called frequencies. And so what I can imagine is I’ll write something called frequencies.
00:32:57.660 This takes a collection and generates a map that says, 'here are the items, and for each of those items, this is the number of times it shows up.' It’s the exact same code as we saw before; it’s just that you hide it away in an abstraction because nobody cares how it’s implemented if you give it a name. Right? And similarly, now that I have this concept of a frequencies map, I can do things like add a method called 'min_value' that returns the minimum amount from this map.
00:33:34.920 And what you notice is that the call graph changed. There's the 'is book', which is a separate domain layer, but then there’s this layer that’s not exactly a Ruby collection; it has more specific semantics. Right? And if I work on this more abstract layer, I no longer have to care about the fact that it uses ‘hash new’ and ‘hash set’ and all this other stuff. And notice how all the spidey-sense logic went into this one dark corner, so that if I verify once that it works correctly, then when I’m reading other pieces of the code, I don’t have to worry about it.
00:34:37.920 And if I came up with maybe a better API for that middle layer, I wouldn’t have to make that call to 'first', and then I could draw a hard line and say, 'I don’t have to care about this bottom part.' Is it okay if I run over time a little? Alright, so obviously I can’t teach you how to design your systems in a 40-minute talk, but I do want you to start looking at the data, calculations, and actions in your own system and start building these call graphs, and I want to offer you some simple advice.
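The layering described here could be sketched roughly as follows. The blog post's code is not in the transcript, so every name (`Frequencies`, `min_value`, `book?`, `least_popular_book`) and the item shape are assumptions made for illustration.

```ruby
# Bottom layer: a small "frequencies map" abstraction over plain Ruby collections.
# All names here are hypothetical; the blog post's code is not in the transcript.
class Frequencies
  def initialize(items)
    @counts = items.tally # Enumerable#tally needs Ruby 2.7 or newer
  end

  # Returns the [item, count] pair with the lowest count, or nil when empty.
  def min_value
    @counts.min_by { |_item, count| count }
  end
end

# Domain layer: what counts as a book.
def book?(item)
  item[:type] == :book
end

# Top layer: reads like the business question it answers.
def least_popular_book(sold_items)
  titles = sold_items.select { |item| book?(item) }.map { |item| item[:title] }
  title, _count = Frequencies.new(titles).min_value
  title
end

SOLD = [
  { type: :book, title: "Dune" },
  { type: :book, title: "Dune" },
  { type: :book, title: "Emma" },
  { type: :toy,  title: "Yo-yo" }
].freeze

puts least_popular_book(SOLD) # Emma
```

All of the counting and mutation lives inside `Frequencies`; once that one class is verified, the top-level method can be read without the spidey sense going off.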
00:35:06.600 Things you could actually take home today and apply tomorrow. The first one is: a straightforward implementation of stratified design says if you build your layers correctly, it will be immediately obvious that there is some complicated thing here that you can simplify. You, as Ruby developers, are very aware of this because these are all of those blog posts about refactoring and all the patterns for different ways of extracting information. You’re already good at this.
00:35:39.240 My suggestion to you is that if you look at it from a call graph perspective, you’ll get better at not doing it ad hoc but doing it systematically. The abstraction barrier is this idea: can you, in some way, make sure that the layers are well-separated, so you can draw a fat line and say, 'I never have to care what’s under this part; it’s completely opaque to me.'
00:36:09.840 For example, you all use the Ruby collections, but you have no idea how they’re actually implemented because you don’t care. It’s a good abstraction. Andre gave a talk yesterday, and he made this wonderful fat line between the application and the domain. Intuitively, he can create an abstraction barrier without even knowing it. And that’s how you know you’ve got good design: when you can do these kinds of things.
00:36:48.840 The minimal interface: once you’ve got these layers, you need to wonder: are they the right abstractions? Are they the right layers? Apotonick gave a talk on Friday; he accidentally came across minimal interface because he realized that the surface area of Reform had too many things going on that could potentially go wrong. He realized that there was some coupling that wasn’t explicit in the API and that users were misusing it. And he realized that he could split it up into a bunch of separate classes. Right?
00:37:30.840 He took one big layer and he split it into separate layers, then he hid those layers behind a single call to create a new object of Reform presentation. That can't do anything except give you various kinds of information about the view part or the validation part. So he applied this pattern of figuring out what are the layers of Reform and then creating better layers with a minimal interface.
00:38:53.520 And the last one—this is the most artsy-fartsy, in the sense that it’s, you know, we don’t exactly know what it is, but it’s this idea of being comfortable with your layers. If there’s too much going on in a single layer, you need to think about how to split it up. But if you go overboard and you create too many layers and there’s not enough cohesion between them, that’s also not good. So think about your team—think about yourself—don’t create layers just for the heck of it.
00:39:30.480 If someone comes along like with Trailblazer and says you need to have 20,000 layers, I’m not saying that Trailblazer does—but if Trailblazer, you know, says that you need 20,000 layers, you need to question it always. Similarly, if DHH comes along and says you don’t need any layers, you need to question it, also. You need to find the right amount of layers that you and your team are comfortable with right now.
00:40:12.600 Maybe that’ll change over time. This is this idea of working towards better intuition for good design. And this is not something that can be taught; it’s something you can learn over time. The thing is that these ideas are applicable at various levels. So we saw this example of how this will apply to how to structure a Rails app, but it also applies, for example, on the global scale of how it relates to your microservices.
00:40:50.760 A lot of the problems that I saw with how microservices are built become clear when you look at the call graph of these microservices and their coupling: they didn’t create good abstractions or good layers. They just split up a bunch of modules into separate web servers and prayed, right? The same thing can be said for the layers, where sometimes you have the wrong set of layers. The same can be said for functions, and this is the part where I feel Ruby developers excel.
00:41:34.440 They are used to extract method and all these other patterns. But those are just names for specific things, whereas this is sort of like a universal truth. We’re almost there.
00:41:47.820 This is the second visualization I wanted to show you: imagine you have event sourcing, and you have a shopping cart, and you have this business rule that says we can’t have more than three items in a shopping cart for some reason. So, you get a command that says add item. Notice there's already one item in the shopping cart, so we’re processing this new command.
00:42:32.880 We check the cart: no, it doesn’t have more than three items, so we update the cart and publish an event to the system. Now, if we wrote this in a handler in such a way that we can guarantee that these operations can’t happen out of order and they’re, for example, inside a database transaction log, we can simplify this visualization and say these three things happen. There’s no way to get in between them, right? Simple enough?
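The check, update, publish sequence described here can be sketched in Ruby. This is only an illustration under stated assumptions: `Cart`, `AddItemHandler`, and `MAX_ITEMS` are hypothetical names that do not come from the talk, and a `Mutex` stands in for whatever mechanism (such as a database transaction) guarantees that nothing can get in between the three steps.

```ruby
# Hypothetical sketch: none of these names come from the talk.
class Cart
  MAX_ITEMS = 3 # the business rule: no more than three items

  attr_reader :items

  def initialize(items = [])
    @items = items.dup.freeze
  end

  def full?
    items.size >= MAX_ITEMS
  end

  # Returns a new cart instead of mutating this one.
  def add(item)
    Cart.new(items + [item])
  end
end

class AddItemHandler
  attr_reader :cart, :events

  def initialize(cart)
    @cart = cart
    @events = []
    @lock = Mutex.new # assumption: stand-in for a database transaction
  end

  # Check, update, and publish run as one unit; nothing can interleave.
  def call(item)
    @lock.synchronize do
      if cart.full?
        :rejected
      else
        @cart = cart.add(item)
        @events << [:item_added, item]
        :accepted
      end
    end
  end
end

handler = AddItemHandler.new(Cart.new(["book"]))
handler.call("pen") # accepted: the cart had one item
```

Once a second handler on a separate stream also writes to the cart, this single lock no longer covers both timelines, which is the situation the talk goes on to analyze.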
00:43:14.520 Now, what happens if the user clicks again? Well, then a second thing happens—it’s all fine. So good! But then business comes along and says, ‘Oh, we’re publishing these events about items being added. We have this new policy that says if you add two items to the cart, you get a free gift!’ Right?
00:43:54.420 After the first publish event from the cart, we react to the event 'item added.' We check that the items in the cart are not exceeding three, so we add the gift and publish an event. Then the user clicks again, but this time when we check the cart, we realize that, oh, we’re already at three items. It’s not going to fail; we’re not going to be in a bad state. It’s just that that last item won’t actually get added. No harm, no foul.
00:44:34.920 But from what I’ve heard, this is not how you build event sourcing systems because this would probably be on a separate stream. You would have a separate handler processing this information, and how would that look? It would look more realistic. These are separate workers processing this information separately, and now this visualization doesn’t say you have a race condition in your system, but it does indicate that you need to analyze whether you have a race condition in your system because it's dependent on how these things are actually implemented.
00:45:06.780 The second visualization, and this is just a sneak peek, is this timeline concept. This timeline concept, similarly to the call graph visualization, provides a map of how to think about your system in such a way that you can identify this kind of problem. Right? Because tests will not find this; this is about analyzing your system rather than randomly testing it. So this visualization is wonderful; you can apply it to your daily work to figure out if there are things that are possibly forking and, if they’re forking, are we sure that they’re merging back correctly?
00:45:54.720 So next time someone tells you that functional programming is pure functions without side effects, I hope you’re going to set the record straight! Fun fact: say you have a function in your system that calculates something and then puts it in a cache because you don’t want to calculate it more than once, but you also have this try-catch around it. For example, if your Redis is down, or you can’t connect to the network, you’ll just recalculate it, and the caller will have no idea that something has gone wrong.
00:46:36.480 Is that a calculation or an action? If you call it zero times, or if you call it once, or if you call it ten times and you get the same result, is it a calculation or an action? Calculation! It doesn’t matter that there’s I/O; in your specific case, for your specific context, you’re saying, 'I don’t care about I/O.' Sometimes the function runs faster; sometimes it runs slower, but in this specific context, I don’t care. So in your system, this might be a calculation, but from a purity perspective, there are side effects because there’s I/O.
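A minimal Ruby sketch of the cached function described here, under stated assumptions: `FlakyCache` is a hypothetical in-memory stand-in for a Redis client (not a real Redis API), and `expensive_total` and `cached_total` are illustrative names. The point it tries to show is that the caller gets the same result whether the cache is up or down.

```ruby
# Hypothetical stand-in for a cache client such as Redis; not a real API.
class FlakyCache
  class Unavailable < StandardError; end

  def initialize(up: true)
    @up = up
    @store = {}
  end

  # Returns the cached value, computing and storing it on a miss.
  def fetch(key)
    raise Unavailable, "cache is down" unless @up
    @store.fetch(key) { @store[key] = yield }
  end
end

# The underlying calculation: depends only on its input.
def expensive_total(prices)
  prices.sum
end

# Does I/O against the cache, but falls back to recalculating on failure,
# so the caller cannot tell whether the cache was available.
def cached_total(cache, prices)
  cache.fetch(prices) { expensive_total(prices) }
rescue FlakyCache::Unavailable
  expensive_total(prices)
end

cached_total(FlakyCache.new(up: true), [100, 250])
cached_total(FlakyCache.new(up: false), [100, 250]) # same result
```

From the caller's side `cached_total` behaves like a calculation in the talk's sense, even though a purity-based definition would reject it.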
00:47:06.840 But remember, we don't care about that definition; we care about the one that makes sense for understanding how our system works. Now think about the alternative case: there's a calculation that is completely dependent only on the input, but it takes a really long time to run; it’s really CPU-intensive, so you want to control when it gets run. It’s an action, but from a mathematical perspective, it’s a pure function.
00:47:43.920 So it doesn’t matter what things are pure and what things are side effects. It only matters in which parts of your system the outputs depend solely on the inputs, and for which parts you care about when or how many times they run. Because those things need to be coordinated. Once something is an action, everything that touches it also becomes an action, and you’re trying to push all those things to the edges of the system because then a lot of the rest of your system becomes easier to understand.
00:48:30.240 So this is the most important slide that I need you to take out of this presentation: stratified design is a really useful tool for thinking about how you build your systems. These are all examples of architectures that, if you ask me right now, I could draw how they map to what we just showed, like with the MVC thing, and so on. But I don’t have slides for all these architectures. I do have one just for Andre.
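One way to picture pushing actions to the edges, sketched in Ruby. The discount rule and the names `weekend_discount` and `checkout_total` are hypothetical and exist only for illustration: the calculation takes the date as an argument, and only the thin outer method reads the clock.

```ruby
require "date"

# Calculation: the output depends only on the arguments.
# (Hypothetical rule: 10% off on weekends.)
def weekend_discount(total_cents, date)
  if date.saturday? || date.sunday?
    total_cents * 90 / 100
  else
    total_cents
  end
end

# Action: reads the current date, so it matters when it runs.
# Everything that calls this is an action too, so it sits at the edge.
def checkout_total(total_cents, clock: Date)
  weekend_discount(total_cents, clock.today)
end

weekend_discount(1000, Date.new(2022, 9, 17)) # a Saturday
weekend_discount(1000, Date.new(2022, 9, 20)) # a Tuesday
```

The calculation can be reasoned about and tested with fixed dates, while the clock read stays in one small place.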
00:49:09.240 This is your classic relational database, right—from the router to the controller to the model to the database. Because the database is side effecting, it means that the models are side effecting, which means that the controllers are side effecting, which means the whole thing is what? An action! Right? And what did Andre do yesterday? He showed how, when there’s a lot of code in the models and controllers, and everything, you can’t think about it.
00:49:48.840 You can’t reason about anything because everything is dependent on time. So he pulled things out, right? He said that the models should be simple ORM kind of stuff, right? All the domain objects are not specific to a database; there’s no dependency on the database anymore. And then he drew this nice fat line and separated the application part from the domain part, but if instead of drawing a big fat line, he drew circles, he would essentially get the onion architecture.
00:50:29.760 It’s amazing how simple this idea is but how far you can apply it. I'm done with my talk, but I really suggest that if anything I said piqued your interest, there’s this book called 'Grokking Simplicity' by Eric Normand. The title of this talk is based on this book, and Eric goes into a lot more detail and explains all this stuff that I said but in a way that actually makes sense.
00:51:53.820 Nick has a question.
00:52:00.280 Thank you for this beautiful talk. I’m wondering in the call diagrams you had, how do you decide what direction an arrow goes? Because I didn’t really understand how you can figure out if the view is asking a model or if the model is asking the view—you had this, like when something goes up, there’s a problem—but how do you know the direction?
00:52:36.840 I needed to simplify. Just because I have a slide, the view layer and the other layers are separate— they’re independent. Notice that there’s no arrow pointing between the view and the model part, so it’s important to show that they’re on separate horizontal lines because they’re completely independent of each other.
00:53:11.520 But in a very early slide, you had this, um, when you were talking about decorators—this one, yes. Okay, so how do you know whether the yellow or orange arrows go up? Like, what would the code look like so you know something is pointing the wrong direction?
00:53:56.240 So when you think about it, if the layers were reversed and the views were on top, right, the arrow—there's one arrow from the controller to the view, and then the models would have this situation where it's a little mind-bending. But if you switch it, you'll get back the same results. Mind-bending, of course—if you switch it, turns out that it’s irrelevant because if you switch layers, you’re still going to get arrows that are pointing in the wrong direction.
00:54:36.240 But is the view changing state or something? Yeah, in this specific case, I was talking about how you sometimes have these views that are iterating over models, right? Not even necessarily changing the models, right? But they have a dependency on the model, so, like, you’re doing 'users.each' and then inside you’re reaching into a bunch of data.
00:55:23.720 Okay, but that’s it—essentially, the view knows too much about the model! The view knows too much about the model—exactly. Okay, I see. Yeah, I’ve seen those kind of views. Thank you.
00:56:00.360 When you've mentioned that frequency thing, although the whole talk was, um, heavily backed up by functional programming, I got a notion that—and please correct me here if I’m right or wrong—you suggest that if we have some fishy piece of code, that we should do something like come up with the method name or a message that we should be thinking in terms of rather than having that hash something.
00:56:25.480 It resembles, to me, something advocated by OOP design: that you’re not having method calls because you have objects or classes, but you have those classes or objects, and then the layers, because you want to do a particular thing. Was that what you were trying to?
00:56:59.160 I’m having trouble parsing your question. Oh, but that's probably me, not you. Okay. Sorry! So maybe could you go back to the code with the frequencies? Could you elaborate on the?
00:57:25.600 Yeah! Okay, perfect! So, 'item sold' filter returns a new array of things, right? Now I did '.frequencies' because I know how Ruby developers like to chain their things. If I were to write this code, I would probably just write a frequencies function that took the array as its argument and returned a new thing that could be called frequencies.
00:58:00.440 That new thing, you can think of it semantically as a subset of a hash map—you can implement it as a hash map—but you don’t give it the entire API surface of a hash map. Instead, you give it an API surface that represents semantically something that has a key-value association of item to count. And so, you can imagine we could write something like frequencies.
00:58:44.520 If we implement it this way, we would have added a new frequencies method to the array, like Active Support would do, right? But we could have also done it in such a way that frequencies, or if you call it as a function, would have returned a new kind of class. That new class could be called a frequency map, and the frequency map would just have the semantics of being able to, for example, return min value, max value, and nothing else.
00:59:23.440 So it’s a question of: can you identify something? So every time you make something less abstract and more concrete, you’re reducing the amount of the surface API. So fundamentally, you’re making the class weaker; you’re giving it fewer features. It can do fewer things, but because it can do fewer things, the things it can do are more cohesive.
00:59:59.680 So instead of just having a hash map that can do everything, you now have a frequency map, and the definition of a frequency map is: it’s a mapping of item to frequency count, and that’s all it can do. Right?
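The frequency map idea from this answer might look like the following in Ruby. This is a sketch, not the code from the slides: `FrequencyMap`, `min_count`, and `max_count` are assumed names, and `Array#tally` (Ruby 2.7+) is used for the counting. A hash is the implementation, but none of its API is exposed.

```ruby
# Hypothetical sketch of a narrow "frequency map" type.
class FrequencyMap
  def initialize(items)
    # Implemented with a Hash, but the Hash itself is never handed out.
    @counts = items.tally.freeze
  end

  # Smallest count among the items (nil if there were no items).
  def min_count
    @counts.values.min
  end

  # Largest count among the items (nil if there were no items).
  def max_count
    @counts.values.max
  end
end

# A plain function taking the array as its argument,
# instead of a new method added to Array.
def frequencies(items)
  FrequencyMap.new(items)
end

sold = %w[apple pear apple plum apple pear]
freq = frequencies(sold)
freq.max_count # 3, for "apple"
freq.min_count # 1, for "plum"
```

Because `FrequencyMap` exposes only these two queries, callers cannot start depending on `each`, `merge`, or the rest of the hash API.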
01:00:21.440 Okay, does that make sense? Yep, thanks!