00:00:17.920
So, with that out of the way, let's talk about math.
00:00:25.199
This is how I know I am among my people, because for a lot of— I don't want to say normal, but for a lot of people who are not in this room, the word 'math' evokes a lot of unpleasant memories of graphing lines, solving for x, and doing word problems. For some reason, people really hate word problems.
00:00:44.399
Now, in my experience, when people make this face and rant about how much they hate math, what they're usually talking about is algebra. I can see where they're coming from. In high school, I used to come into algebra class with three or four fantasy novels in my backpack. I would sit down at my desk, pull out my algebra textbook, prop it up on my desk, and stick it inside while I read a fantasy novel.
00:01:06.640
A few times throughout the class, the teacher would walk around and grab the book away from me. I'd reach down into my backpack for the next one. Eventually, he started making me leave my backpack in his office, and then I dropped out of high school. Take that, Mr. Owl! But about a decade later, I went back to school for a computer science degree, and I was really excited to learn more about programming. However, I was not looking forward to all of the math because, at that time, I thought algebra was math.
00:01:40.960
But as it turns out, algebra isn't all of math. Or to use math to make my point about math: algebra is a proper subset of math. Math has all this other cool stuff in it, like Fibonacci spirals, fractals, hexaflexagons, and my personal favorite, graph theory. When I finally encountered graph theory, I was so frustrated that I had wasted my time with all those fantasy novels.
00:02:11.840
What I eventually came to understand is that math is a language and algebra is its grammar. This means that the algebra textbook I was using to conceal my Piers Anthony habit was the Dick and Jane of math. Now, this is not to say that it wasn't important or valuable. If you don't know the grammar of a language, there's almost nothing you can say in it, and books like this are incredibly important when you're learning.
00:02:34.480
But once you've mastered grammar, you probably don't give it much thought. This reminds me of the title of my talk. I am a big fan of the Ruby Rogues podcast; I see at least two of you out there. So, I'm going to channel Josh Susser and define my terms. In addition to the Ruby Rogues, I am also a fan of Aaron Patterson's presentation slides. For those of you who share that affliction with me, here you go.
00:03:21.440
Fluency can be thought of as what you can say when you're not thinking about how to say it. Another way to put it is what you can say when you're woken up in the middle of the night with a flashlight in your face. This sounds traumatic, but what it's really discussing is what you can do when you're under stress. I don't know about you, but stress makes me stupid. When I'm under stress, original creative thought goes out the window. I'm really unlikely to try anything new, and instead, I fall back on skills I've already mastered.
00:03:39.040
And for the record, this is not my cat; I am not this fast with my camera. There's an organization in Portland called Language Hunters, and they use a really interesting model to talk about levels of proficiency. This is not the Dreyfus one that Katrina was talking about earlier, but it has some similarities. Level one they call 'Tarzan at a party,' which involves words and simple phrases. Level two, 'Going to the party,' means you can say complete thoughts. You know more words, but it is still hard to talk about some things.
00:04:07.520
Level three, 'Discussing the party,' now you start having some more interesting conversations, and level four, 'Charlie Rose,' is confusing because I never heard of Charlie Rose. I had to consult Wikipedia, and it told me that he is a journalist and has a talk show on PBS. Presumably, he has a very sophisticated grasp of the English language. What I like about this model is that it's not all or nothing; you can level up over time.
00:04:29.120
Remember, fluency isn't just about what you can say—it's what you can say without having to think about it. So, if you're fluent at level one, you may be able to operate at level two some of the time. You just have to work at it. So maybe earlier in the day, I got out my transit map and a notepad, and I asked somebody how to get to the party. But once I've been there for a while and I'm ready to go home for some introvert time, it's taxi home.
00:04:58.400
As for refactoring, here's a definition I swiped from refactoring.com: 'a disciplined technique for restructuring an existing body of code, altering its internal structure without changing its external behavior.' I really like this first word: 'disciplined.' It makes me feel like I should bow or have honor. You're welcome, Buffy fans. I've heard people use the word refactoring to describe an extremely wide range of software development activities.
00:05:27.680
Taking a couple of weeks and breaking some stuff is not refactoring; this is what I like to call 'doing it wrong.' Most refactoring steps take less than a minute; some of the more involved ones may take five minutes, tops, and breaking stuff is by definition not refactoring. I deliberately sped through that first part, but I don't want you to miss the last phrase: 'without changing its external behavior.'
00:05:57.440
As Katrina Owen said in her wonderful talk, 'Therapeutic Refactoring,' tests are implied. This also means I'm handwaving them and not going to talk about them very much. Tests detect and warn you of changes in behavior. Using them to verify your work makes it possible, and even fun, to experiment with different arrangements of your code and see which one tells the best story.
00:06:38.800
Now, I have no doubt that every word in this definition was carefully chosen, but there are a lot of them. So how can we make that last part stand out a little bit more? Here are the words in it that I think are most important: 'a technique for restructuring code without changing behavior.' So that's refactoring. As for what it means to refactor, we can make a verb out of the noun just by dropping a few words and we get to 'restructure code without changing behavior.' I think this tells the same story with fewer details, and that, to me, is the essence of refactoring—it's about making your code tell a clearer story with fewer distracting details.
00:07:10.800
I would also like to propose an alternate definition of refactoring: a language that describes ways to make your code suck less. Because we all know our code sucks. But more importantly, if refactoring is a language, then we can apply the Language Hunters model that talks about levels of proficiency. It's no longer an all-or-nothing thing.
00:07:40.240
If you were interested enough to come listen to this talk—and I'm glad that you did—you’re probably already fluent in refactoring, even if it's only at a basic level. You can become more fluent in refactoring. This one is easy: you can become more fluent in anything; it just takes practice, and putting in the practice to become more fluent in refactoring is worth it.
00:08:05.919
Because you'll be able to say more things when you're under stress. And remember, refactoring is the language of making your code suck less, and who doesn't want that, right? So, I'm going to tell a story about some code that I encountered in the wild, and because I was the person in the story, I'm going to share what it was like to go through this at my own level of refactoring fluency.
00:08:35.360
If you're more fluent than me, then I look forward to hearing later about how I can improve what I did. And if you don't catch everything, it's okay! Take in what you can, trust me on the rest, and there will be video. I'll publish the slides, and I will put the code up once I get some sleep.
00:09:04.000
So, this code comes from a production Rails app. I got permission from its owner to use it in public as long as I change the identifying details. While I am going to tear this code to shreds, I do have a lot of respect for the developers who wrote it. They were under a lot of pressure from the business at the time, like all of us in that situation, and they did what they could with what they had. Also, I've written much crappier code than this.
00:09:38.720
As for what the code actually does, this is almost irrelevant because refactoring is concerned with how your code is structured, not what it says, means, or does. But just to provide some context, we'll say that it comes from a cable company and it's used to schedule installations for new customers. Here’s our test subject. I'm deliberately using a tiny font here to emphasize the shape of the code.
00:10:08.720
I was showing this to a coworker of mine yesterday, who'd never seen this. He looked at it and said, 'I can't read it, and already I hate it.' I will zoom in when the details become important, but even from here we can make some very basic observations: There are about 800 lines of code in this file; about 50 of them are right here in this method.
00:10:35.600
The longest line is a whopping 177 characters, and the indentation ranges from 4 to 16 spaces—that nice little arrow pattern. The indentation comes from nesting various control structures. There are a couple of copies of this audit trail method that takes a block, a couple of begin-rescue-end stanzas, and a whole lot of if-statements.
00:11:02.880
Now, I have a theory that once code reaches a certain level of complexity, it tends to get worse. If it's hard to understand everything the code is doing, a developer who’s working under schedule pressure will make the smallest change that doesn't obviously break anything—maybe run the tests, breathe a sigh of relief, and ship it.
00:11:41.440
Here's a little-known fact: this image contains a typo. A team of scholars and expert squirrel lip readers have determined that this squirrel is actually saying 'ship!' When you follow this squirrel's advice for too long, you wind up with a problem for waste management engineers.
00:12:02.399
On that note, perhaps you remember that first observation, which was that there are 800 lines of code in this file. A lot of Rails controllers turn into junk drawers, but at 800 lines, this one isn't just a drawer—this is an entire cabinet full of rusty cheese graters, bent whisks, and broken dreams.
00:12:34.480
My first instinct is to clean all the things, but 800 lines of Ruby code is a lot. I may not have the time or the energy to deal with the whole mess at once, but here's what I can do: I can make the job smaller. For right now, I'm just going to focus on cleaning up this one method. As for the other 750 lines in this class, 'not my circus, not my monkeys.'
00:13:08.080
Now for the part that I am going to fix. I already know from experience that this session is going to involve some swearing and it's going to take me a while. So, what I'm going to do is a little prep work. I'm going to walk over to the cabinet of despair, take that junk drawer of broken dreams right out of it, and set it down in the middle of the kitchen floor. The formal name for this process is 'replace method with method object.'
00:13:43.560
This one's in the book—it’s also in Katrina's talk—but it's not something you use every day, so here's a quick overview. This is the Rails controller action. I have folded up the entire method body. The first thing I do is create a new object that's named more or less after the original method. I take the entire body of the method and move it into the new object.
00:14:03.360
Then, back in the original method, which is now empty, I create that object and invoke it. That doesn't actually work yet because the code that I just moved over has a bunch of calls to params and render and redirect_to because it was in a controller. So I have to pass in a controller to this thing.
00:14:33.440
At this point, I have a choice: I can go through the transplanted method and add 'controller.' at the beginning of everything that is raising a NoMethodError, but that sounds like work, so I'm going to take door number two and use method_missing.
00:15:00.480
It gets better! I could use delegate—I did that in one of the passes I took at this code. So now, all of those calls are going to get passed back to the controller. Now that I have this junk drawer out in the middle of the floor, I'm going to start picking things up from it one at a time and hold them up.
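The extraction described above can be sketched in plain Ruby. This is a minimal illustration, not the code from the talk: FakeController is a hypothetical stand-in for the Rails controller so the example runs without Rails, and the body of call is invented. Only the ScheduleInstallation name and the method_missing forwarding come from the talk.

```ruby
# Hypothetical stand-in for the Rails controller (assumption for this sketch).
class FakeController
  attr_reader :rendered

  def params
    { id: 42 }
  end

  def render(options)
    @rendered = options
  end

  # The controller action is now a one-liner: build the method object, invoke it.
  def schedule
    ScheduleInstallation.new(self).call
  end
end

# Method object named after the original action.
class ScheduleInstallation
  def initialize(controller)
    @controller = controller
  end

  # The transplanted body still calls params and render as if it lived
  # in the controller. (This body is invented for illustration.)
  def call
    render(json: { installation_id: params[:id] })
  end

  private

  # Forward anything this object does not define back to the controller.
  def method_missing(name, *args, &block)
    if @controller.respond_to?(name, true)
      @controller.send(name, *args, &block)
    else
      super
    end
  end

  # Keep respond_to? consistent with the forwarding above.
  def respond_to_missing?(name, include_private = false)
    @controller.respond_to?(name, true) || super
  end
end

controller = FakeController.new
controller.schedule
```

The forwarding is a blunt instrument: every unknown call goes to the controller, which is why the talk later calls it a hack. Explicit delegation of just params, render, and redirect_to would be the tidier alternative mentioned above.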
00:15:30.080
The first thing I see when I do this is an if/else/end structure that branches on request.xhr. The first time I went through this, xhr was new to me, but I looked in the docs and saw that it's an alias for XMLHttpRequest, so I guess I must be looking at different handling for Ajax requests than for HTML requests.
00:16:00.480
Okay, so I unfold the next layer of code and see a begin statement. Then I see something that checks to see if the installation is pending a credit check. If so, it renders and returns. I know what this is: this is a guard clause. I wonder if there's one in the else branch too. So I unfold that.
00:16:38.720
I do see that same pending credit check thing, but it's indented differently. Oh, I see. It's not in the first position where the other one is, and also it doesn't appear to return. What the—who writes this? So maybe this isn't a guard clause after all. After a minute or two of 'git blame' and swearing, something catches my eye way out at the end of this line.
00:17:08.000
There is a return statement after all! So I'll take out the 'and' and put the return on a new line. As for the asymmetry with the begin/end block, it turns out not to be a big deal. So now that I've identified a chunk of code on both sides that does something, I want to get it out of the way so that I don't have to keep looking at it and remembering what I just went through.
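One common way a return ends up hiding at the far end of a line in Rails code is the 'render ... and return' idiom. Assuming that is roughly what this line looked like (the talk does not show it in the transcript), here is a sketch of the before and after. The render method here is a hypothetical stand-in that records its message and returns a truthy value.

```ruby
# Hypothetical stand-in for render: records the message, returns true.
def render(message, log)
  log << message
  true
end

# Before: the return is easy to miss at the end of the line.
def guard_before(pending, log)
  render("pending credit check", log) and return if pending
  log << "scheduled"
end

# After: the return gets its own line, so the early exit is obvious.
def guard_after(pending, log)
  if pending
    render("pending credit check", log)
    return
  end
  log << "scheduled"
end
```

Note that the one-line form only returns when render itself returns something truthy, which is one more reason the two-line form is easier to reason about.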
00:17:40.480
I made a copy of that if-else and just moved each of the guard clauses up into the appropriate branch. I'm ready to fold this section up and move on, but when I do that, I see some duplication. Now, I've had the DRY principle hammered into me—'Don’t Repeat Yourself.' So I unfold it again, and suddenly I notice that there are two pairs of duplicated if statements.
00:18:12.479
The ones that I started with, and the ones that I just made. Right now, it looks like no matter what I do, I'm going to have some duplication here. I'm not sure how I'm going to deal with this yet, but something else about this thing on the top is making my brain all itchy. What I want is for this code to tell me a story.
00:18:42.000
I've begun extracting a preface, but it still doesn't sound quite right. This chunk of code does one thing: if the installation is pending a credit check, it complains to the user and dies. But the first thing that the reader encounters is 'Hi there! Oh, did you know for AJAX and HTML I do different things? Isn’t that exciting?' And every single time I look at this, that’s what I see.
00:19:13.919
So I have an idea, which is to make the condition that triggers the guard clause the first line of the guard clause. Sounds radical, I know! This is my interpretation of a refactoring technique called 'flatten nested conditionals.' This one's not in the book; I stumbled across it in a Dr. Dobb's article by Michael Feathers.
00:19:43.040
I'm telling you this so that if you need it, you can go look for it later. Here's the theory: there are two conditions here—request.xhr and installation.pending_credit_check. To make these slides a little easier to follow, I'm going to replace those with two variables: ajax and pending_credit_check.
00:20:00.560
And now we have two variables and two chunks of code. Each chunk of code will only be executed with a particular combination of those two variables. Right now, those conditions are implicit in the structure of the code, but as long as we only execute each of those chunks of code under the same conditions, we're free to use whatever control structures we want.
00:20:25.520
I'm going to start small by taking this if-else and turning it into a pair of 'if' and 'if not' statements. Now, I have two if statements, each of which has an if statement inside it. I can just 'and' those conditions together.
00:20:51.680
Now, pending_credit_check is common to both of those, so I can pull it out. This is the inverse of that 'flatten nested conditionals' step, but this time I'm taking the other variable and pulling it out to the top. Everybody with me so far? Okay, I see nodding.
00:21:17.680
Now I can take those two if statements and turn them back into one if-else statement. Because that return statement is common to both branches, I can factor it out to the bottom. Here's what I started with; here's where I wound up.
00:21:42.960
These two snippets of code do exactly the same thing, but in the one on the right, the reason for the guard clause is right up there at the top, and the response handling that's inside it is secondary. In fact, it's so secondary that I don't even want to look at it right now.
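The sequence of steps just described can be sketched as three equivalent methods. This is an illustration under assumptions: the ajax and pending_credit_check names come from the talk, while the log array and the symbols pushed onto it are hypothetical placeholders for the real render and redirect calls.

```ruby
# Starting point: the request type is the first thing the reader sees.
def original(ajax, pending_credit_check, log)
  if ajax
    if pending_credit_check
      log << :render_json_error
      return
    end
  else
    if pending_credit_check
      log << :redirect_with_flash
      return
    end
  end
  log << :schedule
end

# Middle step: if/else split into "if" and "if not", conditions and-ed together.
def anded(ajax, pending_credit_check, log)
  if ajax && pending_credit_check
    log << :render_json_error
    return
  end
  if !ajax && pending_credit_check
    log << :redirect_with_flash
    return
  end
  log << :schedule
end

# End point: pending_credit_check pulled out to the top, the two ifs merged
# back into one if/else, and the shared return factored out to the bottom.
def flattened(ajax, pending_credit_check, log)
  if pending_credit_check
    if ajax
      log << :render_json_error
    else
      log << :redirect_with_flash
    end
    return
  end
  log << :schedule
end

# All four combinations of the two variables behave the same in each version.
[true, false].product([true, false]).each do |ajax, pending|
  results = [:original, :anded, :flattened].map do |name|
    log = []
    send(name, ajax, pending, log)
    log
  end
  raise "behavior changed" unless results.uniq.size == 1
end
```

With only two boolean variables there are four cases in total, so checking every combination is a cheap way to confirm that the restructuring preserved behavior.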
00:22:04.640
Because that return statement is now out of the way, I can take that chunk of code and extract it into its own method. At this point, I'm down to a four-line guard clause, which is about the smallest I can get it, so I’m good with this part. Next up.
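A sketch of what the four-line guard clause might look like after the extraction. The Installation struct, the log array, and the method name respond_to_pending_credit_check are hypothetical; the talk does not give the real names.

```ruby
# Hypothetical model stand-in for this sketch.
Installation = Struct.new(:pending_credit_check) do
  def pending_credit_check?
    pending_credit_check
  end
end

class ScheduleInstallation
  attr_reader :log

  def initialize(installation, ajax)
    @installation = installation
    @ajax = ajax
    @log = []
  end

  def call
    # The four-line guard clause: the reason for bailing out comes first.
    if @installation.pending_credit_check?
      respond_to_pending_credit_check
      return
    end
    @log << :schedule
  end

  private

  # Extracted method: the request-type branching lives here, out of sight.
  def respond_to_pending_credit_check
    if @ajax
      @log << :render_json_error
    else
      @log << :redirect_with_flash
    end
  end
end
```

The return has to stay in the caller: a return inside the extracted method would only exit that method, which is why it needed to be factored out of the branches first.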
00:22:33.920
The first time I encountered this code, I spent three or four hours banging my head on it, and eventually, I came up with a terrible hack, which I will now summarize for your entertainment. So let's say I have some code that may or may not raise an exception.
00:22:59.320
I can wrap that code in a begin and an end, and it will behave exactly the same way. If an exception is raised, it propagates up the stack, and otherwise, execution continues out the bottom as normal. Now, the begin and end look kind of silly without a rescue clause, so I'll add one.
00:23:24.880
But rescuing an exception now changes the behavior of this code, so I have to take that exception that I just rescued and re-raise it right away. So let's say that code already has some exception handling wrapped around it, and let's say it has some weird conditional logic like branching on request type.
00:23:47.680
Now, technically, the rescue clause at the bottom still preserves behavior, which is to say anything that's raised above, whether or not it gets raised and how, will be re-raised as is below. However, I am perfectly free to mirror that same structure in the rescue clause. Avdi is not trying to strangle me yet, so I'm good.
00:24:35.520
Now, I can take the exception handling from one side and move it down to the bottom, and I can take the exception handling from the other side and move it down to the bottom as well. Now this if-else is completely empty; I can delete it.
00:25:04.720
That leaves this rescue clause completely empty; I can prune that. And hey, look—a pointless begin and end! It can go! This weird, gnarly exception handling code is still a mess; it's the same mess that it used to be, but here's the important thing: it's a mess that only does one thing and is by itself.
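Here is one plausible shape for that sequence, written as a sketch rather than a reconstruction of the real code. SchedulingError, schedule!, and the returned strings are all hypothetical; the point is only the movement of the rescue logic down to a single place where it can be named.

```ruby
class SchedulingError < StandardError; end

# Hypothetical work that may or may not raise.
def schedule!(should_fail)
  raise SchedulingError, "no slots" if should_fail
  :scheduled
end

# Before: each request type carries its own exception handling.
def before(ajax, should_fail)
  if ajax
    begin
      schedule!(should_fail)
      "json ok"
    rescue SchedulingError => e
      "json error: #{e.message}"
    end
  else
    begin
      schedule!(should_fail)
      "redirect ok"
    rescue SchedulingError => e
      "flash error: #{e.message}"
    end
  end
end

# The hack: wrap everything in a new begin/rescue that mirrors the branching
# but only re-raises, so behavior is unchanged.
def wrapped(ajax, should_fail)
  begin
    before(ajax, should_fail)
  rescue SchedulingError => e
    if ajax
      raise e
    else
      raise e
    end
  end
end

# After: the handlers have moved down into the bottom rescue, the inner
# begin/rescue/end pairs are gone, and the mess has a name.
def after(ajax, should_fail)
  if ajax
    schedule!(should_fail)
    "json ok"
  else
    schedule!(should_fail)
    "redirect ok"
  end
rescue SchedulingError => e
  handle_scheduling_error(ajax, e)
end

def handle_scheduling_error(ajax, error)
  if ajax
    "json error: #{error.message}"
  else
    "flash error: #{error.message}"
  end
end
```

Moving a handler outward is only safe when nothing else inside the wider begin block can raise the same exception class; in this sketch that holds because schedule! is the only thing that raises.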
00:25:25.760
This means I can give that mess a name and get it out of my face. As I kept working with this code, I used the same tactic again and again, which was to take that question of 'What kind of request is this?' and push it down just one level of indentation. Then I can isolate the next bit of behavior and extract that into a method.
00:26:08.800
I want to leave some surprises in case any of you would like to play with this code yourself, but I thought it would be kind of fun to do a sort of wide-angle time-lapse of the rest of the changes to this method. If you're so inclined, I encourage you to softly hum the power ballad of your choice. Here we go.
00:26:55.280
Softly.
00:27:09.760
Let’s see it again. All right, let's go backwards.
00:27:32.160
And forward again.
00:27:43.520
Now this method is still more complex than I would like it to be, but it's down to the point where I can hold all the details in my head as I read through it, which is a vast improvement.
00:28:05.600
The best part is that request.xhr no longer appears in this method. There's still some more cleanup I can do, but this is good enough for right now. So let's follow that request.xhr and see where it went.
00:28:43.679
All in all, I wound up extracting five private methods that all have the same basic structure. Let's zoom in on one of the short ones: if this is an ajax request, render some JSON. Otherwise, set a flash message and redirect.
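One of those short extracted methods might look something like the sketch below. FakeRequest and FakeController are hypothetical stand-ins so this runs without Rails, and the method name, JSON payload, flash text, and redirect path are invented for illustration.

```ruby
# Hypothetical stand-ins for the Rails request and controller.
FakeRequest = Struct.new(:xhr) do
  def xhr?
    xhr
  end
end

class FakeController
  attr_reader :request, :flash, :calls

  def initialize(xhr)
    @request = FakeRequest.new(xhr)
    @flash = {}
    @calls = []
  end

  def render(options)
    calls << [:render, options]
  end

  def redirect_to(path)
    calls << [:redirect_to, path]
  end
end

class ScheduleInstallation
  def initialize(controller)
    @controller = controller
  end

  def call
    # ... domain logic about installations would go here ...
    scheduling_succeeded
  end

  private

  # The common shape: Ajax gets JSON, HTML gets a flash message and a redirect.
  def scheduling_succeeded
    if @controller.request.xhr?
      @controller.render(json: { status: "scheduled" })
    else
      @controller.flash[:notice] = "Installation scheduled"
      @controller.redirect_to("/installations")
    end
  end
end
```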
00:29:07.440
Now, the main method that I extracted these from mostly talks about things in the domain of the business model, but this code is using words like request, render, JSON, flash, and redirect, and all of those things are in the domain of the controller.
00:29:32.120
Now seems like a good time to at least mention a little thing called the Single Responsibility Principle, or SRP to friends. We’ve been working in a class that for want of a better name, I called 'ScheduleInstallation.' But if we were going to name this class according to the responsibilities it already has, it might look something like this.
00:30:02.560
Every 'and' in this name is an SRP violation. These SRP violations were there all along in the original method. Excuse me for a moment.
00:30:31.440
I forgot to get my props out! I was doing some light reading on the plane, and this is Sandi Metz's book. This sentence reached out and grabbed my attention: 'Methods, like classes, should have a single responsibility.'
00:31:14.400
My reaction on seeing this was a split second of 'Well, duh!' followed by a massive facepalm when I realized that SRP, of course, is defined in terms of classes, and because it says 'class,' we don't think 'method.' So that was kind of a fun realization about the two SRP violations that we had in the original method and now have in this class.
00:31:44.432
The bigger contrast is between scheduling an installation and all of this request business. So I'm going to split this class first along that boundary. There are a lot of things I could name this new class, but because it seems primarily concerned with managing an HTTP response, I'm going to call it 'Responder.'
00:32:24.919
The responder's job is to bridge that gap between the domain of the web and the domain of the model. ScheduleInstallation does its thing, and when something interesting happens, it just tells the responder, 'Hey, something interesting happened,' and then it's up to the responder to decide how to represent that information in the context of the current web response.
00:32:59.760
Now, in the interest of time and keeping everybody awake, I'm just going to do this next bit using boxes and arrows. In the beginning, there was InstallationsController, and then we extracted a method object called ScheduleInstallation and gave it a horrible method_missing hack.
00:33:34.080
Now we have this Responder class, which is going to sit in between the two. Its job is to translate 'Hey, something interesting happened' into 'render some JSON.' With that message forwarding in place, I can take all of those methods that I just extracted—those private ones inside ScheduleInstallation—and move them down into public methods on the Responder class.
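In code rather than boxes and arrows, the arrangement might look like this sketch. The Responder and ScheduleInstallation names come from the talk; the fake request and controller, the method names scheduling_succeeded and scheduling_failed, and the slot_available argument are assumptions made so the example is self-contained.

```ruby
# Hypothetical stand-ins for the Rails request and controller.
FakeRequest = Struct.new(:xhr) do
  def xhr?
    xhr
  end
end

class FakeController
  attr_reader :request, :flash, :calls

  def initialize(xhr)
    @request = FakeRequest.new(xhr)
    @flash = {}
    @calls = []
  end

  def render(options)
    calls << [:render, options]
  end

  def redirect_to(path)
    calls << [:redirect_to, path]
  end
end

# Sits between the method object and the controller, translating
# "something interesting happened" into web-level responses.
class Responder
  def initialize(controller)
    @controller = controller
  end

  def scheduling_succeeded
    if @controller.request.xhr?
      @controller.render(json: { status: "scheduled" })
    else
      @controller.flash[:notice] = "Installation scheduled"
      @controller.redirect_to("/installations")
    end
  end

  def scheduling_failed(message)
    if @controller.request.xhr?
      @controller.render(json: { error: message })
    else
      @controller.flash[:error] = message
      @controller.redirect_to("/installations/new")
    end
  end
end

# Now speaks only in domain terms; it never mentions requests or rendering.
class ScheduleInstallation
  def initialize(responder)
    @responder = responder
  end

  def call(slot_available)
    if slot_available
      @responder.scheduling_succeeded
    else
      @responder.scheduling_failed("No slots available")
    end
  end
end
```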
00:34:07.200
Now, ScheduleInstallation is looking a lot smaller, but all I've done is just moved the mess off into Responder, right? Here’s the shape of the Responder class—all of its methods now have this same form: 'if request is xhr, do this, else, do that.' This Responder has an identity crisis; it's constantly asking if it should do this thing or that thing.
00:34:47.040
Fortunately, there's a refactoring for this too: it's called 'replace conditional with polymorphism.' I take one tool that's designed to solve two problems and asks you to tell it every single time which problem it's solving, and I split it into two tools, each of them solving one problem. Now I only have to make that decision once as to which one I'm using.
00:35:12.640
The implementation on this is pure brute force. Copy-paste the entire Responder class, change the names—one is AjaxResponder, one is HtmlResponder. Then, more brute force: I delete all of the HTML stuff from the AjaxResponder and vice versa.
00:35:51.759
Now, in boxes and arrows land, instead of having a Responder class acting as our proxy between the Scheduler and the Controller, we either have an HTML Responder or an Ajax Responder. There is no proper Responder class anymore; that has been turned into a role that both of these objects play, which means we have to go back into the Controller and tell it which one of these to use.
00:36:28.480
Finally, back in ScheduleInstallation, I have to decide—which one of these should I use? That if request.xhr has moved back into the Controller; right where it started. This thing is now where it belongs.
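A sketch of where this ends up, assuming the same hypothetical stand-ins and method names as elsewhere in these examples (only AjaxResponder, HtmlResponder, and ScheduleInstallation are names from the talk). The request type is checked exactly once, in the controller, and each responder does one job.

```ruby
# Hypothetical stand-in for the Rails request.
FakeRequest = Struct.new(:xhr) do
  def xhr?
    xhr
  end
end

class AjaxResponder
  def initialize(controller)
    @controller = controller
  end

  def scheduling_succeeded
    @controller.render(json: { status: "scheduled" })
  end

  def scheduling_failed(message)
    @controller.render(json: { error: message })
  end
end

class HtmlResponder
  def initialize(controller)
    @controller = controller
  end

  def scheduling_succeeded
    @controller.flash[:notice] = "Installation scheduled"
    @controller.redirect_to("/installations")
  end

  def scheduling_failed(message)
    @controller.flash[:error] = message
    @controller.redirect_to("/installations/new")
  end
end

# Talks to whichever object plays the responder role; it does not care which.
class ScheduleInstallation
  def initialize(responder)
    @responder = responder
  end

  def call(slot_available)
    if slot_available
      @responder.scheduling_succeeded
    else
      @responder.scheduling_failed("No slots available")
    end
  end
end

# Hypothetical stand-in for the Rails controller.
class FakeController
  attr_reader :request, :flash, :calls

  def initialize(xhr)
    @request = FakeRequest.new(xhr)
    @flash = {}
    @calls = []
  end

  def render(options)
    calls << [:render, options]
  end

  def redirect_to(path)
    calls << [:redirect_to, path]
  end

  # The only place that asks what kind of request this is.
  def schedule(slot_available)
    responder = request.xhr? ? AjaxResponder.new(self) : HtmlResponder.new(self)
    ScheduleInstallation.new(responder).call(slot_available)
  end
end
```

Ruby needs no shared base class here: "responder" is just a role, meaning any object that answers scheduling_succeeded and scheduling_failed.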
00:37:11.520
There are still more responsibilities lurking inside this code, but I'm going to leave them for you to discover. I found this to be a lot of fun, and I hope that you will enjoy it as well. The rest of this is just cleanup.
00:37:40.160
Here are some of the things that this code taught me. I began by talking about math because at so many stages along the way, it felt like I was constructing a little proof or applying the distributive property of multiplication.
00:38:16.480
I think that's kind of fun, but not everybody does. I also wanted to address the anxiety that a lot of people feel about math because I suspect that refactoring triggers a lot of those same associations and makes people feel like they're in high school again and just want to get it over with so they can go hang out with their friends.
00:39:02.480
Fast characterization tests rock! If you don't know what I'm talking about, I strongly encourage you to find Katrina's talk on therapeutic refactoring and go watch it. She does a much better job explaining it than I will ever be able to do. When I first encountered this code, it had tests that ran in about 30 seconds, and they had some coverage gaps.
00:39:46.240
I was up late one night, couldn't sleep, so I spent about an hour and a half going through the process that Katrina describes in her talk. When I was done, I had full coverage that ran in half a second. Going from half a minute to half a second is incredibly liberating, and I suggest that you try it sometime if you haven't already.
00:40:22.799
It's just so much more fluid. I learned to embrace duplication. I think that this is best illustrated by our good friend, request.xhr. When I started this, there was one of them. When I pulled the guard clauses out, there were two. By the time I was done with the original method, there were five.
00:41:14.720
Then I moved those five things over into public methods on Responder, and then I copy-pasted that—now there are ten. Right after that, there was one again. It was right back where it belonged in the Controller.
00:41:56.080
If I had let the second one stop me, or the third (most of the time I practice the rule of three: the third time I see something, I remove the duplication), or if I'd let the fifth one stop me, or the tenth, I would have been stuck.
00:42:31.040
But, you know, I just went with it, and it led me somewhere interesting. Sometimes, duplication tells you that you're getting somewhere.
00:43:09.360
Embrace evil hacks. The exception handling code in this example is probably the gnarliest one that I personally have ever seen. Sometimes putting an evil hack in is the right response to code like that.
00:43:46.160
Making the job smaller is sometimes the first step to making it go away entirely. And sometimes you just need to let ugly implementations stay ugly. But this isn't just tolerating evil hacks: sometimes it's fun to embrace them! If you have a mustache, twirl it! If you don’t, consider borrowing one.
00:44:27.520
Another thing I hadn't expected to learn from this code is that perspective on code matters. Refactoring isn't just moving code around. It is moving code around, but it's not just moving code around.
00:45:11.680
You're examining this thing as an artifact, turning it over, looking at it from all different angles, trying to figure out what it was designed for. Looking at your code sideways or upside down can make different characteristics of it more obvious.
00:45:49.760
There were bugs in here that I did not find until I had been staring at it for an hour or two; or to paraphrase Douglas Adams, superficial design flaws can sometimes conceal fundamental design flaws.
00:46:29.760
So what can you do to become more fluent in refactoring? Whether or not you're a Buffy fan, you could start with the canonical book on the subject, or its Ruby translation, which is slightly less expensive.
00:46:55.799
There's a catalog at refactoring.com as well. Whichever one you use, keep in mind that except for a few chapters at the front of the book, these are reference materials; they're not meant to be read in one sitting.
00:47:36.479
Browse through the listings, find something that catches your eye, and then go practice it. While I'm up here plugging books, I cannot recommend Sandi Metz's book, 'Practical Object-Oriented Design in Ruby' (or POODR), highly enough.
00:48:07.040
This is all about finding different ways you can structure your code, and this book will help you notice hidden responsibilities and give you plenty of ideas for structures you can refactor toward.
00:48:45.600
The most important thing you can do to get better at refactoring is to practice refactoring. If you use an IDE, figure out how to make it perform automated refactoring. It can be a lot of fun to see what different things would look like.
00:49:14.560
But I also recommend walking through the steps yourself a few times in the editor. This will help you develop an intuition for rearranging code so that as you're reading through it, you can see opportunities to change things.
00:49:56.160
Commit code every time your tests pass, even if you only changed one line, even if it's only been 30 seconds since the last time you committed. Then, if you find you're already confused, it's often, in my experience, easier to just do a hard reset.
00:50:21.680
This is often easier than walking through undo buffers in two or three files trying to figure out what state they were in the last time the tests were passing.
00:50:45.760
For bigger messes, do not merge the first thing you try. Use a throwaway branch; play around with a few different approaches. Show it to a friend, set it aside for a few days, then come back and start over again.
00:51:14.560
If you do find yourself revisiting a particular problem for practice, spend the time to write some fast characterization tests. Tests are feedback, and it's really fun and liberating to get that feedback as rapidly as you can.
00:51:52.519
This slide is obligatory: I work at LivingSocial, and we are hiring! There are at least a handful of us here in Texas; I'm in Portland and do a lot of remote pairing, so it doesn't always matter where you are.
00:52:18.560
And if you would like to play with this code yourself, it will be on GitHub as soon as I get around to putting it there. The repo is up there right now, so you can bookmark it if you want to; there's nothing in it yet.
00:53:21.449
Thank you!