GoRuCo 2009

SOLID Object-Oriented Design

00:00:19.160 Take it away, Sandi. Thank you. Can you hear me? Yes? Sebastian, can you hear me?
00:00:24.830 So, I'm Sandi Metz, I'm from Duke University. I know that some people think of universities as being just like the enterprise, except with lower standards. There's a little truth in that.
00:00:33.270 But also, universities are places where we get more freedom to be out on the leading edge of software. Because of that, at Duke, we've been writing Smalltalk applications since the early 1990s.
00:00:50.629 I have over 10 years of experience with Smalltalk. If this were a weekday, someone would be in my office doing maintenance on a production application that's written in Smalltalk, which manages grants at Duke University. They would be working on it right now.
00:01:02.699 We're a little bit different than a lot of you, in that we don't have millions of users, we don't have high transaction rates, and we don't have huge piles of data. But what we do have are applications that have been around for a long time and get changed really every day.
00:01:28.079 So, the apps that I write, I don't actually use myself; you can't say that we eat our own dog food. However, I look at leftovers every day, and I have for a long, long time.
00:01:34.860 But enough about me. Let's talk about you. You’re here because I'm guessing that you've written an application. Even though I've never seen that app, there's something I can say about it: it's going to change. And when it changes, what will happen?
00:01:47.820 The experience that you have when your application changes depends on its design. For example, your application might look like this: it might be rigid. In a rigid app, everything is connected to everything else. When you make a single change over here, it changes this thing, and this thing, and this thing, and it's connected to a thing at the far end of the app. Any change you make, no matter how small, causes a cascade of related changes.
00:02:10.890 Or your app might look like this: it might be fragile. Fragility is a lot like rigidity. The difference between fragility and rigidity is that fragile apps are rigid, but you can't tell just by looking.
00:02:23.489 In this case, what happens if you move a wire? You can't predict. This application is both rigid and fragile.
00:02:34.860 Or your application might be immobile. Immobility is the quality where you'd like to reuse some code somewhere else, but you can't extricate it from where it is. So what you end up doing is copying a section of code and putting that same section of code somewhere else. You don’t actually have reuse; you have reuse through duplication.
00:02:48.290 Just in case you find this image a little wiggy, let me say that no dolls were harmed in the making of the conjoined twin mummy. But you do have to wonder about the guy who made the doll, took the photo, and put it on Flickr. Anyway, immobility.
00:03:02.130 And finally, there's viscosity. Viscosity is when you can tell how the original designer wanted you to behave, but it's easier to do the wrong thing than the right thing. In this case, you might just throw another piece of Tupperware in that cabinet, shut the door, and tiptoe away. You try not to pay attention when the door opens and stuff falls out on someone else.
00:03:14.220 So, how is it that your app is rigid, fragile, immobile, and viscous? It didn't start out that way. In the beginning, your application was perfect; it was like a beautiful flower. It was a joy to work on, and then it changed. That's all that happened; it changed.
00:03:33.510 When you make changes that introduce unexpected dependencies into your app, they will kill you. I'm here today to talk about how design can save you. Okay, I'll slow down.
00:03:51.000 This is a picture from Martin Fowler's website. Across the horizontal axis is time, and the vertical axis is cumulative features. There are two lines plotted on this graph. The red one represents good design, and the blue one represents when you write code and don't do design.
00:04:12.270 As you can see, there's a point where those two plotted lines cross. What this graph illustrates is that as time passes and you add features, there is a point in time when you're better off having done design.
00:04:29.100 Design is like TDD: early on, design takes time, and tests take time. For some short interval early in your app's life, you can get more done feature-wise if you don't spend time on those activities.
00:04:52.380 So this graph suggests that you should skip design if you plan for your application to fail. But if you think it will succeed, it’s going to continue to cost you money, and design will pay off.
00:05:06.150 Interestingly, if you get to the point where you're an apparent success and you haven't done design right, you can guarantee that you'll fail later when they ask you to change it. At that point, you'll be unable to do so if you have not done good design.
00:05:23.220 Today, I'm talking about the SOLID object-oriented design principles. This acronym was popularized by Robert Martin. The four terms I just used to describe rotting software—rigidity, fragility, immobility, and viscosity—are directly from his paper called 'Design Principles and Design Patterns,' written in the early 2000s.
00:05:40.170 He didn’t make up all this object-oriented design stuff; he made up some of it, but not all of it. A lot of what he did was rightly give names to ideas that were floating around so that we could talk to each other about them.
00:05:57.900 The principles that he made form the acronym SOLID. Here they are: the five SOLID design principles.
00:06:06.150 You've probably heard of some of them, but I doubt you've heard of all of them. I'm going to run through them and give you a brief definition of each, and then we'll come back and talk about them a bit more.
00:06:24.990 S is for single responsibility: there should never be more than one reason for a class to change. O is for open/closed: a module should be open for extension and closed for modification.
00:06:41.490 This might seem impossible at first glance: the module should be open, so that you can change what it does, yet closed, so that you never have to alter its code.
00:06:50.290 L is for Liskov substitution, which is about subclassing. In case you're confused, there's a handy formula to make it clearer. I is for interface segregation, which seems like total gibberish.
00:07:08.220 I completely agree. Lastly, D is for dependency inversion: you should depend upon abstractions, not concretions. So there they are, the five of them.
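The open/closed definition above can sound paradoxical, so here is a minimal Ruby sketch of one way to read it. All names here (`Greeter`, `PlainStyle`, `ShoutingStyle`, `render`) are hypothetical and are not from the talk; the point is only that new behavior arrives as a new class while the existing class is left alone.

```ruby
# Hypothetical example: Greeter's source never changes when a style is added.
class Greeter
  def initialize(style)
    @style = style # any object that responds to #render(name)
  end

  def greet(name)
    @style.render(name)
  end
end

class PlainStyle
  def render(name)
    "Hello, #{name}"
  end
end

# New behavior is added as a new class; Greeter is not edited.
class ShoutingStyle
  def render(name)
    "HELLO, #{name.upcase}!"
  end
end

plain = Greeter.new(PlainStyle.new).greet("Sandi")
loud  = Greeter.new(ShoutingStyle.new).greet("Sandi")
```

`Greeter` is extended by handing it a different collaborator, which is the same move the talk later makes with dependency injection.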
00:07:24.830 Now, they seem so academic, right? How can this help? So much of the stuff we talk about tells you a state that you should be in when everything is right, but they don't give you much guidance about how to get there.
00:07:39.990 You might think that they are all about different principles that seem orthogonal to each other. Open/closed is a goal, single responsibility is a coding strategy, Liskov is that confusing formula, and interface segregation doesn’t seem to even apply to us.
00:07:55.480 However, they share a common theme: they are about managing dependencies in your application. These principles are strategies you can apply to lead you to a place where the objects in your application have minimal entanglements with each other, so that you can make changes with ease.
00:08:14.100 Everything we do design-wise is really about dependencies. No matter how people talk about it, if you're entangled with another object and it changes, you have to change.
00:08:34.440 These entanglements are what makes your code rigid, fragile, immobile, and viscous. It's a death spiral if you get caught where things are tangled; it's very difficult to make changes without those changes cascading throughout your entire application.
00:08:50.060 These words—actually, these are not Robert Martin's words—are from Steve Freeman and Nat Pryce, and it's just another way of describing the exact same goal. They say that your code should be loosely coupled, highly cohesive, easily composable, and context-independent.
00:09:07.590 Now, these are just different ways of stating the SOLID design principles. Loosely coupled refers to dependency injection; you should inject dependencies. Highly cohesive means a class should be all about the same thing; it should have a single responsibility.
00:09:24.300 Easily composable means you should be able to rearrange your context-independent objects to achieve new behavior without changing the actual code.
00:09:38.370 So, all of this is just about strategies, actually. We're going to move on. It’s all about strategies to achieve independence.
00:09:51.420 So let's rearrange them so that we can talk about them from the bottom up. Unfortunately for you, we're going to throw some of them out right now, so we can skip them.
00:10:04.370 Interface segregation is something that you care about if you're using a statically typed language like C++ or Java. When you deal with another class, you have to deal with an instance of that class and its interface. There are a bunch of rules about how to make the interfaces smaller, so that when they change, you don't have to recompile the whole system.
00:10:21.870 However, because you're using a dynamic language like Ruby, you don't have this problem. The language itself abstracts your dependency on another object; you depend on the method signature you're calling.
00:10:32.850 And otherwise, the object is like a duck. Dynamic languages, by their very nature, obey this principle in the most extreme way possible, so we can just stop talking about it.
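To make the duck-typing point concrete, here is a small sketch, assuming two unrelated and entirely hypothetical classes (`FileSource`, `NetworkSource`) and a hypothetical `Importer`. The caller depends only on the one message it sends, not on a declared interface.

```ruby
# Two unrelated, hypothetical classes; neither declares an interface.
class FileSource
  def read
    "data from a file"
  end
end

class NetworkSource
  def read
    "data from the network"
  end

  # Extra method the importer never calls, so it never depends on it.
  def reconnect
    :reconnected
  end
end

class Importer
  def import(source)
    source.read # the only dependency: source responds to #read
  end
end

importer = Importer.new
results = [FileSource.new, NetworkSource.new].map { |s| importer.import(s) }
```

If `NetworkSource#reconnect` changes or disappears, `Importer` is unaffected, because it only ever relied on `read`.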
00:10:46.030 Liskov is about subclassing. Now, tell me: I would be interested in knowing who in here has written a class that they then created their own subclass of? This is outside of the ActiveRecord stuff.
00:10:53.630 So a few of you, but not very many. Less than half of the people here. Liskov says that if you create a class Foo and you create a subclass Foo-ish, any place you use the Foo, you ought to be able to substitute the Foo-ish. Simple as that.
00:11:20.090 If you don’t do that, if you subclass and then return different things from the method calls such that your callers have to check the type to figure out what to do with the object they have, you have violated Liskov.
00:11:35.750 You’ve created a dependency that leads to code smell. Don’t do it. If you subclass objects and find that they have to violate the contract that the superclass has, you’re saying it’s not really that kind of thing.
00:11:51.320 You should think about your design more and avoid that problem, or else you're going to create dependencies that ripple through your code and cause confusion.
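A minimal sketch of the Foo and Foo-ish idea, assuming a made-up `names` method whose contract is "returns an Array of Strings". `NotReallyFoo` and `shout_all` are hypothetical additions used only to show what a violation looks like from the caller's side.

```ruby
class Foo
  def names
    ["a", "b"] # contract: always returns an Array of Strings
  end
end

# Substitutable: honors the same contract as Foo.
class Fooish < Foo
  def names
    super + ["c"]
  end
end

# Violates the contract: returns a String instead of an Array.
class NotReallyFoo < Foo
  def names
    "a,b"
  end
end

# A caller written against Foo's contract, with no type checks.
def shout_all(foo)
  foo.names.map(&:upcase)
end

from_foo    = shout_all(Foo.new)
from_fooish = shout_all(Fooish.new)

# The violating subclass breaks the caller unless the caller checks types.
violated = begin
  shout_all(NotReallyFoo.new)
  false
rescue NoMethodError
  true
end
```

The only ways to keep `shout_all` working with `NotReallyFoo` are to add a type check in the caller or to fix the subclass, and the first option is the rippling dependency described above.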
00:12:01.470 So mostly today, we're going to talk about achieving open/closed by applying single responsibility and dependency injection.
00:12:14.330 Now we’re going to write some code. I know this strikes fear in your heart. I know you can’t see the code.
00:12:29.720 Unfortunately, I think we all had a vision of how big that screen was that was wrong. So, I put all my code samples up on a website this morning.
00:12:37.850 If you follow that top URL, you'll get to the page we're on now. If you look at the tag for GoRuCo on Twitter, that link is there, so you can click on it. Now promise me you won’t read ahead; stay with me.
00:12:54.200 I know what you're going to do now that you can see it. I'm going to set some ground rules about this code. I was tormented about whether or not to show you an involved code sample because we're going to look at code now for 20 minutes.
00:13:18.010 I really believe that it doesn't help you for me to talk to you. It doesn't matter; me telling you how it should look once you're done with it doesn't really help.
00:13:35.870 We're going to take some code and refactor it using these principles, trying to get to the point where it's open/closed. Here are the ground rules: I'm only going to mock classes that I own, and I'm not going to mock or stub in the object under test.
00:13:51.660 I’m happy to have discussions with anyone who thinks there's a different rule, but these are my rules, and we'll abide by them in the tests we write today.
00:14:07.930 This is the app that we're writing: there's an FTP server somewhere that holds a CSV file. We're trying to download that file, parse it, and throw it into a database on our side. The data involves patents, so I'm writing a class called PatentJob.
00:14:32.670 You'll see this data in the slides. There's a little file of test data, and there’s an FTP server that has configuration information to access it. There's also an Active Record target.
00:14:47.300 Alright, let's write the spec for PatentJob. It should do two things: download the file and store it in the database. It seems like PatentJob ought to have an API that includes a download file method.
00:15:05.979 It seems like it should have a method that says run. I know that I have more than one assert here, but I would argue they're all for the same feature.
00:15:22.680 Here's the class: the simplest possible implementation. It says run, which calls a few more methods. Let me stop first; can everybody either see the screen or see the code on their PCs?
00:15:33.180 Yeah? Tell me if you're okay; I see heads nodding; I like that. So here's the class. If I expand the download file method, you'll see it calls Net::FTP and passes those arguments.
00:15:47.960 This slide you probably can't see, but this is the entire class, so I just put it here to give you some sense of how much code was involved. This is my entire application at this point, and it works; it passes those tests.
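The slide code is not part of this transcript, so the following is only a sketch of what such a first version might look like. `PatentJob`, `run`, the download file method, and the use of `Net::FTP` come from the talk; the host, credentials, paths, and the `parse` and `update_patents` helpers are assumptions made for illustration.

```ruby
# A sketch only: host, login, paths, and helper methods are hypothetical.
class PatentJob
  def run
    local = download_file
    rows = parse(local)
    update_patents(rows)
  end

  def download_file
    require "net/ftp" # loaded lazily; a bundled gem on newer Rubies
    local = "/tmp/patents.csv"
    # Strings are hard-coded in the method, as the talk later complains.
    Net::FTP.open("localhost", "foo", "foopw") do |ftp|
      ftp.getbinaryfile("Public/prod/patents.csv", local)
    end
    local
  end

  def parse(path)
    # Naive comma split; a real job would use a CSV library.
    File.readlines(path, chomp: true).map { |line| line.split(",") }
  end

  def update_patents(rows)
    # In the talk this writes to an Active Record model; elided here.
    rows.size
  end
end
```

Everything lives in one class, so any test of `run` has to reach a real FTP server, which is the unease described next.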
00:16:04.120 Here’s what the app looks like, just what we described when we started. Now I guess we're done? Well, I don't know; maybe we're not done.
00:16:17.240 I don’t really like that code very much. It works; it does exactly what it’s supposed to do. It’s probably the simplest possible implementation; my tests run green, but I'm uneasy.
00:16:29.119 This code won’t tolerate change, and I have to download the file every time I run the test. A lot of things could happen that would cause me to go into the source code and edit it. Maybe I will never write another job.
00:16:52.899 Perhaps I should quit now; it's possible, right? Use your best judgment. Just because I can refactor this doesn't mean I should. However, it’s also possible that if you refactor, you’ll find something you didn’t know was true.
00:17:06.470 It’s like TDD that way: design is emergent when you follow the object-oriented principles, just like the features that emerge when you follow TDD.
00:17:25.490 I don't like some of the code smells here, and I'm going to change them. Here’s the first lesson: resistance is a resource. That's true in your personal life too,
00:17:38.260 with your spouse, your children, and your dog. Resistance is information you didn't have; you can push back or, better, listen to it. Try to hear what it’s really telling you and fix it.
00:17:54.670 In this case, I want to listen to the code smell, even though I can't articulate what's wrong, and I don’t know where I'm going, but I know that something’s not right. I also believe in the rules.
00:18:09.960 If testing seems hard, there's something wrong with your design. That's the bottom line. If you reach the point where you don't know how to test what you’re trying to test, your design is incorrect.
00:18:22.020 Examine your design in order to make your test simpler. In this case, tests depend upon the design of your code.
00:18:31.380 They reference the names of the classes in your code, the order of the arguments in your code, and they also reference what a class does. If your test has a lot of setup or is unduly complicated, the class you’re trying to test has too much stuff in it and is not well-factored.
00:18:47.680 If you can't write the test, the code is wrong. In my case, well, actually, here’s one more thing.
00:19:03.920 TDD will make your life a living hell if your design is bad, and if you’re new to TDD and Ruby and find that you’re having a very difficult time writing tests, it does not mean stop writing tests.
00:19:18.550 It means you should learn more about object-oriented design. It will help you.
00:19:35.020 I want this to be open for extension and closed for modification, but that’s just a goal. That’s a distant goal, and I don’t know how to reach it.
00:19:56.570 However, there is something very specific and concrete I can do right now: I can apply the single responsibility principle.
00:20:04.310 Oh, sorry, David; I get that wrong every time I practice this. Let's look at this. We've been taught that you should write a test, write the code that makes the test run green, and then refactor.
00:20:20.770 At the point of refactoring, the question they always tell you to ask is: is it DRY? Does it not repeat itself?
00:20:39.510 I'm going to suggest you ask yourself a few more questions at this point: does the class have just one responsibility? Does everything in the class change at the same rate?
00:20:55.220 And does the class depend on things that change less than it does? The correct answer is yes to all of these questions.
00:21:05.360 If you find that the answer is no, you should go change something in your code. In this case, when I look at the PatentJob spec, I find two responsibilities: it should know how to download the file, and it should know how to update the database.
00:21:19.460 This is a code smell, suggesting it has more than one responsibility. If I read it as two, or even worse—if I see it has more than one responsibility and they are not even closely related—I know I need to refactor.
00:21:33.630 Perhaps I should have been smart enough to recognize this to begin with, but the beauty is that you don’t have to be smart to start. You just have to know the rules.
00:21:47.330 You can apply them and dig yourself out of these holes. I'm going to move the downloading code out of the job and put it in a class by itself.
00:22:03.590 Now I can mock! When am I allowed to mock? I'm not allowed to mock behavior that's in myself, so I couldn't mock this before, as I was stuck downloading the file.
00:22:18.360 However, I now have a virtual object that I can define as a role and mock in my system. It will have the API of download file and I can use it in my PatentJob class.
00:22:31.920 This is the perfect place for a mock; it will make your tests faster without making your system fragile. If I had ten jobs, they could all use the mock.
00:22:46.390 When I write the actual download code, I can write a real test that runs against the real external resource. I can also use something like Micronaut, by Relevance, that lets me set apart a class of tests.
00:23:02.710 These longer-running, slow tests serve as sanity checks of the external resource, but they don’t have to run every time I execute my big batch of tests.
00:23:17.620 So, what I’m going to do is mock out this in the PatentJob class. This was the second test, the second test that originally existed in that spec. Now, I’m going to create a mock and put it in.
00:23:35.000 Now this mock, it’s just a mock; I'm mocking a role. Had I been smart, I could have done this to begin with, but I didn't recognize it then.
00:23:52.450 So now, I'm going to put a mock in, and it’s going to be a stand-in object, a stunt double that sits in that place in the diagram. I’ll use dependency injection to get that behavior back into the original object.
00:24:07.960 You notice when I get a new instance of PatentJob, I'm passing in the mock downloader, just like in real life.
00:24:22.360 I used to call the download file method, which was my own method. I could simply change that one line of code and say, give me a new instance of this other class, but then I'm depending on a class whose name I know.
00:24:37.760 Yes, the download file method is over there now. But instead, you should inject an instance of the new class, the other responsibility, back into the original calling class.
00:24:50.440 I've got an accessor; I injected it in and took a default at injection time so that this class doesn’t have to create one.
00:25:04.240 My tests will run anywhere I want to use this, and anyplace I want to use it can take the default, but I can also inject the mock from the test without going in and changing partial objects.
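The injection-with-a-default idea can be sketched as follows. This is not the code from the slides: `FtpDownloader`, `FakeDownloader`, and the body of `run` are hypothetical, and the stand-in is hand-rolled rather than built with a mocking library so that the example runs with only the standard library.

```ruby
require "tempfile"

# Hypothetical production collaborator; the real one would talk to FTP.
class FtpDownloader
  def download_file
    raise NotImplementedError, "would fetch the CSV over FTP in production"
  end
end

class PatentJob
  attr_reader :downloader

  # Callers may take the default; tests may inject a stand-in.
  def initialize(downloader = FtpDownloader.new)
    @downloader = downloader
  end

  def run
    path = downloader.download_file
    File.readlines(path, chomp: true).map { |line| line.split(",") }
  end
end

# Hand-rolled stand-in playing the downloader role: same API, no network.
class FakeDownloader
  def initialize(path)
    @path = path
  end

  def download_file
    @path
  end
end

rows = Tempfile.create("patents") do |f|
  f.write("1,Widget\n2,Gadget\n")
  f.flush
  PatentJob.new(FakeDownloader.new(f.path)).run
end
```

`PatentJob` only knows that its downloader responds to `download_file`, so the test never touches the network and the job itself is never stubbed.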
00:25:18.080 Now, you can make a successful argument that the previous code was shorter than this code. If you're only going to have one job ever in your life, maybe you don’t want to do this.
00:25:30.970 But let me ask you: do you expect your application to succeed? If it succeeds, you're going to want to change this later, and if you don’t inject the dependencies now, you’re going to be unable to change its behavior without changing the code.
00:25:46.290 Now we need to write this new class, right? We mocked it in the test already, but now I need to create this object. I'm going to do the dumbest thing possible.
00:26:02.800 I actually wrote this upload test file method here, but other than that, this code is just completely ripped out of the other class.
00:26:18.560 The spec is an exact copy of the old spec, and the code in the class is an exact copy of the old class. Resist the temptation to do more than you need to.
00:26:33.390 Refactoring in this way is just like writing code with TDD. Don't guess where you're going to end up; just follow the rules and see what happens.
00:26:49.340 We have an application where I’ve separated the downloading responsibility out of the PatentJob, and so I guess we're done!
00:27:04.630 Oh, you know the thing I meant to say: we’re going to do this four times, so we’re done with number two.
00:27:19.600 Now that we're back to green, let's ask these questions again and do this exact same refactoring: is it DRY? Yes, it's dry.
00:27:31.660 There's no duplicated code in either of those classes. Does it have one responsibility? Whoa, probably does.
00:27:44.300 Does each class contain only things that change at the same rate? This class again, it’s hard to find the words for this when you haven't done this much.
00:27:59.160 This class makes me uneasy. It's unlikely that Net::FTP will change, but it's pretty likely that the login, path, or file name will change.
00:28:15.740 This class seems to be a combination of completely static, very unlikely to change things, and pretty dynamic, pretty likely to change things, and I just hate it.
00:28:31.130 I hate having the strings directly in the method. The chance of needing to edit this class to change that value is almost a hundred percent.
00:28:47.900 So the code smell is the feeling of unease about the likelihood that everything in the class won't change at the same rate.
00:29:01.160 I'm just going to try to pull out a responsibility. We’re engaging in a pattern of refactoring that occurs over and over.
00:29:18.680 Identify the responsibilities you want to remove, create a new spec in a new class that has that code and no other code in it. Then take that code and inject it back into the original class.
00:29:32.770 Follow this pattern until you don’t have any more responsibilities to take out.
00:29:45.830 In this case, I'm pulling the configuration data out and putting it in another object. The last time I did this, we mocked that thing.
00:30:05.390 It is really tricky to know when to mock and when not to. You might think you should mock everything, but you can ruin your code by trying to mock everything.
00:30:20.650 You can make your code really fragile by doing that. In this case, I won’t mock the config in the downloader. There's no reason to do it.
00:30:36.710 Config is not an external resource; it runs really fast. It’s a value kind of object, really just a container for static data.
00:30:52.890 There's no reason not to use the actual production object here. It's much simpler this way. So, instead of mocking the thing, I use the real one, and I'm at a point now where my test runs green.
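A minimal sketch of that step, assuming a hypothetical `FtpConfig` value object and a hypothetical `remote_file` method on the downloader. The field names and values are invented for illustration; the point is that the strings leave the downloader and the real config object is used directly rather than mocked.

```ruby
# Hypothetical value object: just a container for static data.
FtpConfig = Struct.new(:host, :login, :password, :path, :filename)

class FtpDownloader
  def initialize(config)
    @config = config
  end

  # Where the file would be fetched from; no strings hard-coded here.
  def remote_file
    File.join(@config.path, @config.filename)
  end
end

# The test builds a real config; it is fast and has no external resource.
config = FtpConfig.new("localhost", "foo", "foopw", "Public/test", "patents.csv")
downloader = FtpDownloader.new(config)
```

Changing a login or a path now means changing data, not editing `FtpDownloader`.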
00:31:06.960 I’m going to simply write the config object. When I sit down to write the test, I realize I didn’t have a test for this.
00:31:19.560 It didn’t make sense to put this in my other test, it seemed too detailed for the granularity of the test at hand, but it’s something I probably should be testing.
00:31:34.470 It turns out there’s a conundrum in tests; in the test environment, I have a test path, but mostly the reason I write tests is to make sure my production environment will be okay.
00:31:49.830 It's always slightly different, so sometimes while you're in tests, you need to pretend you're in production so you can ensure that it will be okay.
00:32:02.350 I’ve injected the environment back into this class. This is probably wrong. If you expect something might be different, you should inject it.
00:32:17.560 However, it amounts to an if statement on the environment. If another environment appears, you can bet I'll have to change this code; that's a code smell.
00:32:35.100 No matter how well hidden, if you find yourself in this situation, you can suspect that your design isn’t yet correct.
00:32:50.960 But that’s okay because it passes right now—it passes the test. The bigger problem I have is if I need to create another configuration.
00:33:02.560 While developing the next class, I realize I may need additional configuration and will have to copy this class, change everything, and make another one.
00:33:14.220 There is no abstraction here, so I'm starting to think that the configuration concept is an abstraction.
00:33:26.570 Now, lots of times you don’t have to know where you’re going in order to refactor successfully. If you haven’t done this very much, it might seem intimidating.
00:33:41.400 You see nicely factored code and think you could never do that. Well, the truth is they didn’t think of it; they just wrote some crude code and refactored it.
00:33:53.150 You have that same ability to write really great code; just don't expect it to spring fully formed from your head. It happens because you know the rules.
00:34:05.740 You apply them over and over again, and then you arrive at good code. Now I have to revisit the code and address the strings that carry certain values.
00:34:20.110 What if I set up the configuration for the database based on that; I want to rip out all this code and put it where it can change without disrupting performance.
00:34:38.050 Now, I can write this code, and it might suck while I’m doing it, but I'm extracting data from it, and we're back to being green.
00:34:51.450 This refactoring is complete; I now have a usable configuration class, and we’re at a point where all my tests still work.
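The transcript does not show the finished configuration class, so this is only one possible shape for it, under stated assumptions: `JobConfig`, `FTP_SETTINGS`, and the keys are hypothetical, and the data is held in a Hash here although it could equally be loaded from a file. The environment is injected, and there is no if statement on it.

```ruby
# Hypothetical generic configuration: data per environment lives in a Hash,
# and the environment name is injected by the caller.
class JobConfig
  def initialize(settings, env)
    @values = settings.fetch(env) # raises KeyError for an unknown environment
  end

  def [](key)
    @values.fetch(key)
  end
end

FTP_SETTINGS = {
  "test"       => { path: "Public/test", filename: "patents.csv" },
  "production" => { path: "Public/prod", filename: "patents.csv" },
}.freeze

# A test can pretend to be in production simply by asking for that environment.
test_config = JobConfig.new(FTP_SETTINGS, "test")
prod_config = JobConfig.new(FTP_SETTINGS, "production")
```

Adding another environment, or another job's settings, means adding data rather than copying and editing a class.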
00:35:05.780 Now my application has its responsibilities clearly separated, and it has gained flexibility. I hope you find the next example illustrative.
00:35:20.370 So I’ve created a generic downloader. I’ve also built a new job class, with a streamlined initialize method.
00:35:34.380 There’s no downloader method because it’s been sent to the config class, which can be reused whenever needed. So, with our resources, we are successful.
00:35:51.360 The key takeaway from all of this is that test-driven development is good but not enough. BDD is great but also insufficient.
00:36:11.030 DRY is excellent but still not the whole answer. When you first came to object-oriented programming, coming from another language, those seemed intimidating.
00:36:27.100 But as you reach this point of understanding, you need to apply design principles in your day-to-day coding.
00:36:42.120 If you follow proper design principles, you’ll find your code remains flexible, easy to change, and enjoyable to work on ten years from now, as exciting as it was at the start.
00:36:55.470 You get to keep your jobs, and they keep paying you!
00:37:06.480 Thank you for hanging in there with the code! Let’s open it up for questions. I’m sorry I beat you all to death, and you really did pay good attention.
00:37:22.040 I have to say I’m quite impressed with the group.
00:37:36.520 A question related to metaprogramming: does it conflict with the Liskov Substitution Principle?
00:37:44.350 I'm not sure about that; I’d say no. If you're developing code for use by others and following these principles, you'll find that your code gets abstracted into higher levels.
00:38:00.520 Abstractions should ideally have a well-defined API, and it's better if users do not need to understand the underlying code.
00:38:12.020 The tension arises between straightforward procedural code and well-factored abstractions. However, well-factored abstractions make for an easier user experience.
00:38:27.250 So, it's fine for you to employ extensive metaprogramming skills as long as those abstractions serve a clear purpose.
00:38:35.480 When it leads to confusion or makes using your API a burden, then you've likely strayed from sound design principles.
00:38:49.860 Many applications that are famously known for having extreme Ruby pipelines have had to refactor code after inexperience caused early problems.
00:39:06.440 If you find yourself grappling with the complexities of code design, don't hesitate to seek help from the community.
00:39:20.410 You can ask in user groups or forums for advice on improving code and ensuring it’s highly composable and amendable.
00:39:27.290 We're done!