00:00:22.000
All right, so Kobe, I'm sorry, I'm a pacer. You're going to have to follow me. My name is Nick, I'm Nickolas Means on Twitter if that's your thing. This is my city, and it's the one you're in.
00:00:27.920
Like you said, I built the WellMatch platform, and I get to pair all day long remotely with some really smart people working on really hard problems to try to make healthcare a little bit better. I'm going to be talking really fast; I have a lot to cover, so I apologize.
00:00:39.440
If you'll think back with me a few months, there was a big kerfuffle in the Ruby community. DHH got up at RailsConf and told us all how bad TDD was. But if you were listening, his talk was actually about more than that. The title of his talk was "Writing Software," and he spent a considerable amount of time talking about how he'd spent most of his career not feeling much like a computer engineer, but like a software writer. This got me thinking: what if we took that way too literally?
00:00:52.879
Well, we would probably need to replace our rulebook with something that pertains to writing, and there's no better choice than Strunk and White's "The Elements of Style." It was written by Cornell professor William Strunk in 1918 and expanded by his former student, E.B. White, in 1959. Yes, that E.B. White.
00:01:04.640
There have been several editions of it over the years, including this gorgeous black leather, gold embossed 50th anniversary edition. There's an illustrated version if you like your grammar rules with pictures. There's even a book about the book—that's how important this book is!
00:01:16.880
People have written about the process of writing this book. If you're looking for a book of rules on English prose, "The Elements of Style" is your book. But what does that have to do with code? I'm glad you asked. So, keep Ruby weird. I want to introduce you to a long-lost friend of mine: shipment utilities dot rb. It's part of an e-commerce platform that I wrote; it was my very first Ruby on Rails project and it shipped in Rails 1.1.
00:01:32.480
There's something that you should think about any time you see a class with 'utilities' in the name. So, take a look at this thing—this is two-point font for reference. Like I said, some statistics: it's 427 source lines of code, which are lines that are neither comments nor whitespace. It has a Flog score, which is a measure of relative complexity, of 561.1, which is pretty considerable. It has a Flay score of 70, which is a measure of duplication—not terrible, but it could be better. Code Climate rightfully gives it an F, and if you look at this first method in the Flog response, its score is 120.7—that is 21% of the complexity in this entire class rolled up into one method right here.
00:01:55.520
So, what should we do with it? We should refactor it, right? Except that we can't. We have to use Strunk and White because some idiot decided it would be a good idea to give a talk based on refactoring from Strunk and White. The thing about this is it’s false on the cover because refactoring is not a prose concept. You don't refactor text.
00:02:12.160
So we have to dig into Strunk and White and find a basis to start working on this at all. Rule 5, section 5: revise and rewrite. Revising is a part of writing. Few writers are so expert that they can produce what they are after on the first try. Quite often, you will discover on examining the completed work that there are serious flaws in the arrangement of material calling for transpositions.
00:02:34.320
But that doesn't really tell us how to get into this code. It doesn't give us anywhere practical to start. So let's find a practical rule: Rule 5, section 3—work from suitable design. Before beginning to compose something, engage the nature and the extent of the enterprise and work from suitable design.
00:02:54.800
Well, in order to do that, we kind of need to know what this code does. So let's talk about what build shipments actually does. Let's say you've got a shopping cart. We've got a jacket in there that's in stock in our warehouse from Vendor A, and we've got a glove that's a drop ship from Vendor B. Now, drop ship just means that your vendor ships it directly to your customer, meaning you never handle it in your warehouse.
00:03:20.640
Then we have a helmet that drop ships from Vendor C, a different vendor. We could take these three items and break them up into three shipments, but that means that our customer pays three shipping fees, which they're not happy about. So instead, let's consolidate the first two items since they're from the same vendor; let's just drop ship them. But what if our customers are real cheapskates and just don't want to pay for two shipments? They want to pay for one.
00:03:40.240
Well, we can order all the things into our warehouse and ship it that way. That's what this mess does—all of the logic for all of that that I just took about a minute to explain is in this code. So, we need to get it to a suitable design so we can even work on it. Right now, it's buried in those 427 lines of code, where we cannot possibly reason about it. Therefore, we need to take it out of that environment, put it in its own class, give it an initializer. Unfortunately, we have to bring its friend with it because it doesn't know how to do its thing without the create group if necessary method.
00:04:06.080
Let's add a test because this code, like any good code written by a PHP developer, has no tests, and of course, it fails. Why does it fail? Our initial Flog score is constant Rails default logger. How many of you remember using that? Let's look at this thing. Where are we using Rails to log? Well, everywhere!
00:04:19.280
Our production code is riddled with trace debug statements, so let's get rid of them. All right, let's check our test now and see if it's running. No, still exploding because we have an uninitialized constant ShipmentBuilderItem. This brings me to my favorite line in this whole class, and it is a line so nice that I use it twice.
00:04:38.560
So let's look at what it's doing. It's two—not one, but two—nested Active Record queries inside of an array predicated on a de facto object type check. Okay, all right. So let's see what Strunk and White tell us about that: "Do not take shortcuts at the cost of clarity." Many shortcuts are self-defeating; they waste the reader's time instead of conserving it. The one truly reliable shortcut in writing is to choose words that are strong and sure for it to carry readers on the way.
00:04:59.360
So let's see what this line is actually giving us. Let's see where we're using the output in the class; there are kind of two clusters of it. There's a good reason for that because it's in two legs of a conditional, so let's take that top block and zoom in. Remember, we can actually do something with it.
00:05:18.320
The first thing I'm going to do is take this line and change it from a ternary to an if-then-else because that lets me do this. You're welcome. Now let's turn down the noise and only look at the lines that mention item. Those ship status symbol lines look a little suspicious, so I'm going to ignore them for now and move on to easier things to deal with. The store ID line again has an object type check and a ternary because I really like that construct apparently.
00:05:38.080
And we need to get rid of that. So what's it doing? It's checking to see if the item is an instance of the item model, and if it is, we want to reach through item into product and get the product store ID. Otherwise, for some reason, we want to return one. The good news for us is that we don't have to care about this.
00:05:56.880
We can take our double down here and teach it how to respond to store ID, and that lets us take this line and call it directly on our line. Now, if this was the real world, I would have to reach down into that object and teach it how to actually do that, but this is not the real world; this is a refactoring talk.
00:06:06.960
So I'm going to pretend that the other class doesn't exist, and we can move on to the next line. It's the exact same change. Now we've got another line, and you may be noticing a pattern here to the changes that I'm making. It turns out Strunk and White has some advice on that as well. Rule 2.19 tells us we should express coordinate ideas in similar form.
00:06:19.600
And pay attention because this is a rule that we will be falling back on repeatedly in this refactoring. This principle that of parallel construction requires that expressions similar in content and function be outwardly similar. The likeness of form enables the reader to recognize more readily the likeness of content and function. So, things that we express similarly are easier for us to understand; because you understand it one time, you understand it every time.
00:06:40.000
So let's take this line—it turns out we can apply the exact same change. Go down here, teach our double how to respond to that message, go back up here, put the call on order line item, and we're good to go. Now, let's look at ship status symbol—let me tell you what this is doing.
00:06:57.440
This first line calls ship status symbol for quantity on item because, say, the customer wants two of something—well, we may only have one in our warehouse. So, if they order one, great, it's an in-stock shipper. But if they order two, then we've got to drop ship them because we don't have one.
00:07:14.400
So once we figure out the appropriate ship status symbol, we have to look and see if it’s either of these specialized in-stock statuses—either closeout or something that we only ship from our warehouse stock. If it is, we want to change that to be the main in-stock symbol to simplify. Then we go down here and we hang it on the same open struct as everything else and shove it into the line item.
00:07:40.760
It turns out we can apply the exact same refactoring again. We can teach our double to respond to that message, go back up here, make our change, and that lets us get rid of those lines. All right, and so now if you look at this block of code, we're no longer referring to item anywhere in this block, so that lets us hop up here and get rid of that too. No longer need that, put it back in our class, look at our tests. Hey, we're passing!
00:08:01.840
But how can that be? Because we still have this other line that references the item class. Well, clearly, we're not testing the other leg of our conditional. We've got to go back in our tests and add a test for the other leg of the conditional, and there's the failure we expect.
00:08:18.013
Well, the good news is the only message that we're sending to item here is store ID, and we're sending it twice for some reason, but we're not going to worry about that for now, because we've taught borderline item how to respond to that. So we'll call it directly on order line item, and we can delete this code now, and that gets our tests passing.
00:08:39.360
Now we're close to a suitable design, but we're not quite where I would like to be, and let me tell you why that is. Let's say we instantiate ourselves in ship and build and we pass it a couple of order line items; it does its thing and immediately returns two shipments to us. We didn't ask for any shipments; we just instantiated a class? That seems kind of odd to me. It feels funny—not quite obvious enough.
00:09:01.920
So let's hop on our test and change our API in our test so we can let that direct our class refactoring. While we're here, we're going to grow up this test and make a check to see if we're actually getting a shipment back with the right number of shipments and the right type of shipment. But most importantly, we're changing the API of our class where we instantiate the class and then call build shipment directly.
00:09:23.680
Do the same thing with this test. Two failures, just like we would expect. So now we can go into our class and make that change. We want to take order line item and consolidate and make the matters—we want to instantiate them in our initializer. We no longer need to pass those args in because they're stored on our class now.
00:09:42.960
We don’t have to pass them around, so we get rid of them here. Get rid of the build shift that's called here—that fixes our test, and our API is where we want it to be. Now I have to ask you to forgive me for what I'm about to do here. I'm going to wave my hands, finish up our tests, and surprise—they're all passing!
00:10:02.480
Great! If you would like to see how to do this, I would highly recommend that you take the time to watch Katrina Owens' therapeutic refactoring talk. She has a very patient explanation of how to take legacy code, wrap characterization tests around it, and then refactor.
00:10:20.800
I, on the other hand, what little hair I have on my head is on fire right now because I have a lot of code to get through, and I don't have time to show you. So let's look at our class and figure out what to do next. We've arrived at a suitable design.
00:10:37.680
We're into a design that we've got tests around us. We can actually go in and change the code confidently. So we need another Strunk and White rule to tell us where to go next. This is probably the most famous of them all: Rule 2.17, omit needless words. Vigorous writing is concise.
00:10:54.400
A sentence should contain no unnecessary words; a paragraph, no unnecessary sentences. For the same reason, a drawing should have no unnecessary lines, and a machine, no unnecessary parts. So let’s hop back in the code and see if we can find anything unnecessary.
00:11:10.880
Well, it turns out we were working with something that's pretty verbose just about everywhere. I didn't even realize it. It feels pretty verbose to be creating an inline object here that we're only using in one method. We've created this line item open struct duct-type thingy, and we're using it to drive this class when we don’t want to do that.
00:11:27.680
The problem is one of these lines is still not like the others. We want to express coordinate ideas in similar form. So we need to take that line and change it to a call on order line item. Now you'll notice that something interesting has happened. Our tests are still passing.
00:11:44.960
But if you look at the left, all the attributes are being set up on this open struct, and then look on the right; all the things we're calling on order line item, they all match verbatim except for this one line that lets us reach through and access the original object. But we can refactor that; we don't worry about it.
00:12:02.400
So now that that's the case, now that we've taught our object to match the duct type of this weird open struct thingy we're using in this method for some reason, we can get rid of it. We don’t need it anymore, so now line items just equals order line items.
00:12:20.720
Simple as that! And we'll change this one call and let's see if our tests are still passing. They are! So we just got rid of seven lines of completely unnecessary code, but we've still got more unnecessary words. We've got two variables here with the same value—there's no reason for that.
00:12:36.960
So, let's take them, move them up here, change order line item to line item on our class adders, and we can get rid of this line altogether. This block down here is still referring to order line items, but it's just reading the adder; it’s no big deal; it’s just a name change—simple switch, and we're still passing.
00:12:54.720
All right, now I know you can't see this so take my word that there's nothing obviously needless left. Everything's doing a job. A lot of it's doing it far too verbosely, but it's all still doing a job. So what's next? Rule 2.13: make the paragraph the unit of composition. This is where we get into sort of the meat of the refactoring.
00:13:12.560
A subject requires division into topics, each of which should be dealt with in a paragraph. The object of treating each topic in a paragraph by itself is, of course, to aid the reading. So if our methods are paragraphs, build shipments is a really long paragraph. So let's hop into it and see what's going on here.
00:13:33.600
If the consolidate flag is false, we do this stuff; if it's true, we do this stuff. That's the entirety of the method, and those two branches of the condition—well, those are two obvious paragraphs that we have to work with. So let’s start moving them out. Let’s take the bottom conditional leg and make it its own paragraph called build consolidated shipment; we'll replace it with the method call up here.
00:13:53.440
Check our tests—uh-oh, we broke something: undefined local variable or method shipments. So, we're calling a local variable in build consolidated shipment that we defined in build shipments. The easy way to fix that is just to elevate it to the class. We'll take it up here, make it an adder, assign it in the initializer, and we're good to go.
00:14:11.720
Now let's take the top leg of the condition and make it into its own paragraph as well. So we'll do that, and you'll notice that all of a sudden, build shipments is looking a lot clearer. It's a lot easier to understand what's going on. Our tests are still passing; we've got a much clearer story.
00:14:30.240
But we've sort of just taken all the junk off the cabinet, but at the juncture, we still have to address this—we can't just shove it out of view. So let’s zoom in on this code and see if we can figure out what's going on. There are three paragraphs embedded in here—three separate topics.
00:14:47.960
First off, we built this line items by symbol hash, which essentially gives us all of the line items we've got divided up by shipment type. The second giant block actually has a helpful comment in it for a change—thanks, past me! If we’ve got us some in-stocks and some drop ships, let’s see if we can do some consolidating.
00:15:06.560
It only makes sense to consolidate if we can completely get rid of in-stocks, so if we can take all of our in-stock items and assign them all to a drop ship shipment that we’re already going to have to make, we want to consolidate. If not, we don’t want to do it, and then we’ve got this third block that just takes all the line items and assigns them to shipments.
00:15:26.320
So let’s take this first block and throw it to a new paragraph. That’s great—our tests pass; we didn’t have to do anything to change it at all; we just had to move it. But it’s still pretty gross. What does Strunk and White tell us about that? Rule 2.16: use definite, specific, concrete language. Prefer the specific to the general, the definite to the vague, the concrete to the abstract.
00:15:44.720
So let’s see if we can make this more concrete. Well, the first thing we can do is take it and make it only return what we're asking for—not an entire hash of junk we don't need. And then, once we do that, we can go up top. We can change this hash access to a method call, but that’s still not that much clear—not that much more concrete.
00:16:03.840
What if we could do that instead? Well, it turns out now that we’ve got this method down below, it’s trivially easy to do that—we just set up an in-stock items method. We can change all those calls to in-stock items. We do the same with dropship items, and we’ve added a little verbosity at the bottom, but our main method is easier to read.
00:16:22.960
It’s continuing to get better, and our tests still pass. So now, let’s take this block and see if we can figure out what in the world’s going on. This is the gatekeeper line. This line checks to see if every in-stock item that we’ve got has the consolidatable flag set on it, and if we do, then we want to take those all, our in-stock items, and change their ships as to ship.
00:16:41.600
All right, skipped a few—running short on time. Sorry about that! So this top block, as it turns out, is just a conditional to see if we can convert all the in-stock items to drop ships. All that junk is just a conditional, so let’s move it to its own paragraph.
00:17:00.640
Move it back up top: if consolidate to drop ship, let’s change all of our ship statuses to drop ship. Much more readable, much better. Tests are still passing, but we can still make this better. It’s still not obvious what’s going on.
00:17:17.840
Let's take this bottom line and let’s change it. It’s checking to see if all of our in-stock items have the consolidatable flag set. Well, let’s reach into the Ruby standard library and see if we can find a better way to express this. This is a really terrible way to express this; the Ruby standard library is the software developer's equivalent of a vocabulary.
00:17:38.560
So, the more of the Ruby standard library that you know, the more expressively you can write Ruby. And there’s a much better way to write this: if all of our stock items are consolidated, we’ll return true; if not, return false. So let’s take this line and do the same thing we can use 'any' up here instead.
00:17:56.640
So we're no longer checking to see if this count is greater than zero; we can just check to see if we have any drop ship items that have the same vendor ID as the in-stock item. All right, let’s continue refactoring.
00:18:13.760
We don’t want to do anything unless all of our stock items are drop shipping—we’ve got that condition buried in the method. Let’s take it and move it up to the very top. It seems important, and we want to say what's important to us at the very top of our method.
00:18:28.640
So, let’s elevate that up, and if all our stock items are drop shippable, we’ll carry on; if not, we’ll return false. Let’s stop in here, and we can get rid of the condition out here—we don’t need it any longer. In fact, we can take this all and make it a one-liner.
00:18:47.520
But can we do it once? Instead of iterating over all of our items, let’s see if we take the vendor IDs of our stock items, subtract out the vendor IDs of our drop ship items, and see if that set is empty. Well then, we’ve proven to ourselves that the vendor IDs of in-stock items are a subset of the vendor IDs of drop ship items, and that one line replaces this five lines of junk. We’ve got a much clearer method there.
00:19:01.360
Now in our tests, pass. All right, so there’s still more work to be done here—there’s still more that we can do. We’ve done something very unintentional here. We’ve made this method quite a bit more confident, as it turns out, so we can go up here to the top and look at this conditional.
00:19:09.760
We don’t need it anymore because our method down below is not going to do anything unless all the conditions are met. So we can actually get rid of that conditional altogether; it just goes away, and our tests run and pass.
00:19:21.440
So now let’s take this line. It takes our in-stock items and converts them all to drop ship if consolidate to drop ship is true, but it’s not obvious what’s going on just from looking at that line. Let’s give it a word of scripted name so we can figure out what’s going on: stick with that concrete language.
00:19:42.240
So if we give the line a name, we can go back up top and make that a one-liner now; we want to consolidate the drop ships if the condition is true—easy, much more understandable. This still sticks out like a sore thumb to me; it still looks terrible.
00:20:04.960
So, we’ll take it down, move it down below, make it its own method, assign items to shipments, move the method name above, and suddenly filled shipments by ship status is a two-line method that consolidates to drop ship if consolidate to drop ships is true and then assigns items to shipments.
00:20:15.320
Our tests still pass, so that’s great! We still have a couple of methods here though that stick out to me. If you do the squint test, you can see a couple of methods up top that don’t quite look like the rest, so we still have some work to do to express everything in coordinate form.
00:20:29.760
Let’s change them. We’ll zoom in on to start with; let’s look at build consolidated shipment. It’s doing the same thing as build shipments by ship status—the exact same thing—but it’s doing it very differently. What's it doing? It's assigning every item to a shipment, but it’s hard-coding the consolidated flag.
00:20:48.760
Instead of iterating over the items changing all the ship statuses and assigning them to shipments, let’s make these do the same thing. Let’s express them in coordinate ideas; let’s express the coordinate ideas in similar form.
00:21:05.120
We’re just going to iterate over all of the line items, set the ship status to consolidated, call or assign items to shipments method.
00:21:20.960
Then we’ll take this and continue to make the coordinate ideas expressed in similar form, take that, move it down to a method, and now these two are starting to look more and more similar.
00:21:36.560
But you’ll notice we’ve got a conditional down here: we only do this if consolidate drop ships is true. We don’t have a conditional up here—why not? Well, it’s up top, so we want to make build consolidated shipment look the same as build shipment by drop status.
00:21:52.960
So we need to move that conditional down there, and while we’re at it, change it to single ship. We have to consolidate all over this class, right? It’s a word that doesn’t mean a lot to us anymore, so we want to change it to something that means more to us and single shipment means a lot more to us.
00:22:09.360
What we’re doing when that flag is set is consolidating all our items into a single ship. That lets us get rid of this conditional, and now build shipments is a lot easier to understand. Except that we’ve broken a couple of things now. Why is that? Well, it turns out we’ve assumed that both of these methods are now confident, and we can call them without any side effects.
00:22:27.920
The problem is that they both call the assign items to shipments unconditionally, so what we need to do is take that and move it up top to build shipments. And then we’re passing still not quite as clear as we want it to be though.
00:22:45.920
Because as it turns out, these two methods are not building any shipments at all. They’re updating ship statuses, so we need to make that clearer. We need to express that more clearly, so we’ll move them down into their own method called optimize consolidation, and we’ll call that up above.
00:23:01.920
Suddenly build shipments is telling a much better story, but those two methods are still named incorrectly. They’re like I said just a second ago; they’re not building anything—they’re just updating ship statuses. Well, as it turns out, those two lines are pretty expressive on their own. We don’t really need a method name to tell us what they’re doing.
00:23:18.360
So let’s just inline them, and our class is getting more and more compact and easier to read, and our tests still pass. So now let’s hop over to the main class, and if you take a look at everything that we’ve refactored here, it all lines up on the left. It passes the squint test.
00:23:36.960
We’ve expressed all of our ideas in very similar form; all these methods are structured the same. We have the same type of paragraph throughout our class, except for this junk at the bottom. I would love to show you how I refactored that, but we’re sort of running out of time here.
00:23:55.920
So, I’ll just show you the end result: we took create group if necessary and made it its own inline class—a value object called shipment list that has all the tools we need to interact with it and create a list of shipments by shipment type.
00:24:14.560
So there’s our final class; there’s our refactor. Now, how did we do? Well, let’s look at some scores. The Flay score for our class, which again measures duplication, originally was 36, and it’s down to zero. The original, after I extracted it out of the Rails class, was 180.
00:24:30.560
We’ve got it down to 77.2, so it’s one-third as complex as the original class. The Flay score for build shipments, if you’ll remember, was 120.7; we’ve got it down to three. Well, as it turns out, Flay score is a pretty effective measurement of how readable code is because complexity and readability are one and the same.
00:24:52.560
So there’s where we started; there’s where we finished, and that’s sort of the crux of the talk. We’ve got the build shipments method up top that is an absolute rambling manifesto of a paragraph—there’s no way you can make any sense out of what’s going on up there.
00:25:08.160
Build shipments down below actually follows a pretty clear narrative: we want to optimize consolidation, assign items to shipments, and return shipments. That’s it—easy to understand! So why did it work? It was a pretty textbook refactoring, but I didn’t reference any of the normal techniques you would reference when you're refactoring.
00:25:36.560
Well, it turns out the rules in Strunk and White coordinate pretty well with vagaries of programming. Rule 5, section 5 provides ‘revise and rewrite’ rendering 'refactor.' Rule 5, section 3: 'work from a suitable design.' There are plenty of people that want to teach you good object-oriented design.
00:25:54.480
Gang of Four, Foo Bar, Russ Olsen—who’s going to be speaking to you in just a minute—gave us the Ruby version of the Gang of Four books. Rule 5.19: do not take shortcuts at the cost of clarity. This is the Law of Demeter in case DHH ever sees this talk.
00:26:11.520
Rule 2.13: make the paragraph the unit of composition—this is the single responsibility principle! We want all our methods to only do one thing.
00:26:25.920
I could keep going through; I could keep giving you the relationships between Strunk and White and programming algorithms, but we can actually ask the guy who wrote the book on refactoring why this works. Martin Fowler is happy to tell us: any fool can write code a computer can understand; good programmers write code that humans can understand.
00:26:42.960
That’s the thing— I mean the code that we started with ran in production for many years and had a really low defect rate; it worked just fine. And then I tried to change it, and it was terrible—it fought me every step of the way.
00:27:02.160
The easier code is to read, the easier it is to change, and you really don’t know how good your code is until you try to refactor. You don’t know how good it is until you try to change it. Good code helps you change it; bad code fights you every step of the way.
00:27:19.520
So that’s the first part that I want you to take away. The second part: we need to reach into some famous literature. This is from "On the Road" by Jack Kerouac: 'We stopped going the