Refactoring

Summarized using AI

Succession

Katrina Owen • May 04, 2016 • Kansas City, MO

In her talk "Succession" presented at RailsConf 2016, Katrina Owen addresses the challenges and methodologies of refactoring code. The presentation emphasizes the importance of systematic approaches to avoid common pitfalls during refactoring processes.

Key points discussed include:
- Refactoring as a Complicated Task: Owen leads with the notion that refactoring can often lead to complicated situations where making small changes results in many test failures, illustrating the fragile state of code that often accompanies it.
- Decision-Making in Refactoring: The speaker introduces a decision-making framework for refactoring, stressing the need to evaluate whether to proceed with changes and what specific aspects require alterations. The children's song "I Know an Old Lady Who Swallowed a Fly" serves as a metaphor for this discussion.
- Testing Practices: Owen emphasizes the importance of having tests before refactoring, allowing developers to guard against regressions and ensure stability in the codebase. She illustrates this with a stepwise approach to building tests around hard-coded strings.
- Incremental Improvements: The talk advocates for incremental refactoring—making small, controlled changes that gradually enhance code quality while cautiously avoiding premature commitments to larger changes. This includes separating template strings from varying data and improving naming conventions in the code.
- Avoiding unnecessary complexity: Owen discusses the balance of evolving code responsibly, highlighting the dangers of accumulating needless complexity across the codebase.
- Using Polymorphism: In the final sections, she introduces polymorphism as a way to enhance code flexibility and adaptability, particularly when accommodating new requirements such as distinct song variations. The discussion culminates in the implementation of a verse builder to manage the complexity more effectively.

The critical conclusion of the talk focuses on embracing the challenges of refactoring with gut intuition built from experience and understanding. Owen inspires developers to continually practice and refine their skills in refactoring rather than shy away from the difficulties of the process.

Overall, the presentation is a comprehensive exploration of practical refactoring techniques that prioritize clarity, simplicity, and flexibility in code management.

Succession
Katrina Owen • May 04, 2016 • Kansas City, MO

Succession by Katrina Owen

Refactoring sometimes devolves into an appalling mess. You're chasing a broken test suite, and every change just makes it worse. An even more insidious antipattern is the slow, perfectly controlled process culminating in dreadful design.

This talk presents an end-to-end refactoring that demonstrates simple strategies to avoid such misadventures.

Help us caption & translate this video!

http://amara.org/v/Jknx/

RailsConf 2016

00:00:09.740 The AV has been broken in the other room, which is why we're here. I didn't think we would be in this situation; I thought I would be dealing with the broken AV. The solution to that was to not have presenter notes and just mirror the display. At that point, mostly the slides would show. I have around 300 slides, so we're going to get through it. However, I rely heavily on my presenter notes, so I spent the night figuring out how to print them at the hotel.
00:00:41.670 Now, this is a hard problem in computer science because when you export your Keynote presentation to PDF, it gives you large slides, which I don't care about at this moment, and presenter notes that I couldn't read. So, I wrote an AppleScript to generate some Ruby code that would create a PDF I could email to the business center at the hotel. It was great. By the way, if you want to see my notes, they're going to be souvenirs after.
00:01:03.930 I realized afterwards that I could have reduced it by 50% if I had exported my PDF to PDF via the 'Print to PDF' function and done a layout adjustment. But at 3 in the morning, I wasn't thinking very clearly. Alright, so there's a thing we all do: you notice that a piece of code over here and another piece over there are basically doing the same thing except they're implemented differently. So, you decide to switch everything over to just use one of them, and all of a sudden, about 17 tests break.
00:01:59.909 You remember that there's an edge case, and you quickly patch it with a conditional. Now, 24 tests are failing. You realize there's a missing dependency, so you pull that in. Suddenly, you have 50 failing tests. It turns out there's a conflict between two different dependencies, so you swap them around so that they load in the opposite order. Now, almost 80 tests are failing.
00:02:13.800 By the way, that encoding issue always goes haywire. You are assuming that you've got UTF-8, but actually you’re getting Latin-1 — or is it the reverse? I don't know. The method you're trying to use needs access to some stuff that's not part of the public API, so you resort to using 'send', and now a whole bunch of issues blow up. It's something to do with unexpected nils, and it’s been hours since the tests were last passing.
00:02:31.210 Now you're considering monkey-patching nil class instead. You do a 'git reset --hard'. The most difficult part about refactoring isn't actually making the change; it's knowing where to begin, what to do next, and how to make that change safely. I'm going to use the children's song 'I Know an Old Lady Who Swallowed a Fly' as an excuse to talk about decision-making and refactoring.
00:02:51.670 The song is very simple, but it has an algorithmic component. This adds just enough complexity that we can talk about real-world refactoring principles without bogging down into actual code. The entire example is hard-coded into here; it's wrapped in a method and then wrapped in a class. The first thing you need to ask yourself when you're about to refactor is whether or not you should do it at all.
00:03:07.960 In this case, frankly, the answer is no. The song hasn't changed in generations. The code could be terrible, but it works, and we're never going to touch it again. You can stick it in production and walk away. So, to avoid unnecessary refactoring, I've invented a spurious requirement: we need to be able to continue generating the song the way we've always done it in its traditional form.
00:03:29.730 In addition, we want to generate the song with arbitrary creatures. This gives us a perfectly legitimate reason to change the code and provides some guidance. It tells us about the axis along which we need to support change. We don't need infinite flexibility; we just need this one type of flexibility.
00:03:48.650 At this point, there's another important question to ask yourself: do you have tests? Again, the answer is no because why on earth would you test a hard-coded string? To protect against regressions, copy the hard-coded content into a test with a very simple assertion, and unsurprisingly, this test passes. This means it is safe to start changing things.
00:04:08.510 But that raises yet another question: can you make the change you need to make without any dirty hacks? Almost invariably, the answer is no. So, we have this new feature we want to implement. The first thing we're going to do is not implement it. Instead, we're going to rearrange the code, find that flexibility we need, and then add the feature.
00:04:30.270 It's almost never obvious upfront what that flexibility is going to look like, so start the process by taking a moment to look at the code. Observe it; see what stands out. Look for things in the code that you don't like, and then pick one thing. I tend to pick the thing that I hate the most or the thing that I understand the best.
00:04:52.880 Now, most people point out all the duplication in the strings. There is duplication between verses and within verses, but it's not exact duplication. The bits that change are embedded inside the bits that stay the same, and it's all stuck in this big bag of hard-coded strings, which makes it hard to deal with.
00:05:09.000 The first change I'm going to make is to introduce a little bit of indirection. I need to take small steps; I want to change one small piece at a time. For this, I mean to create a case statement with a branch for each verse. This means we have to loop over the range of verses and join them to get the full lyrics. You can do it all in one method, but we only really care about the case statement.
00:05:27.540 So, I'm going to go one step further and isolate it into its own method. Finally, we can start touching individual strings. I want to tease apart the bits that vary from the bits that stay the same. In other words, some of this is template and some of it is data. We immediately run into the problem of naming things.
00:05:43.830 Typically, you extract it, you name it, and if you get the names wrong, it can make the code dramatically harder to understand, which makes the code dramatically harder to change. The easiest time to get names wrong is when we understand the least about the problem. In this case, even if you get the names right, it doesn't really help because we have everything mixed together.
00:06:02.900 This may seem a little bit odd, but C-style string formatting using the percent method can be really handy. It provides a clear separation between the static parts and the variations. I like that it doesn't matter where in the string the placeholder is; all the data ends up outside of the template, and we don't have to name anything.
00:06:17.740 Now, if you apply this to each of the verses, the duplication between them goes from being similar to being identical. All of the bits that vary are off to the side. The same goes for the duplication within each verse. Now, each entire verse has all the templated stuff on the left and all the data on the right.
00:06:38.000 This is a tiny change, but it is noticeable. Even when squinting at it, you can see the difference. This is the hard-coded string that we started with; we used a case statement to break up the individual verses. With the strings unchanged, we’ve now dissected the strings in place to separate the template from the data.
00:06:53.350 Now, you might argue that I just shoved code around on the slide and pretended to have refactored it. It's a little bit like pushing the food around on your plate and pretending that you've eaten — kind of pointless. Some people can look at duplication and immediately see how to extract methods and objects, but often, I just can't tell — not at first.
00:07:12.810 So, dissecting it like this gives me a bit more information without forcing me to commit to anything prematurely. In this case, separating the template from the data has made the algorithmic part a lot more obvious. Each verse grows systematically, adding repetitions, and finally, it stops abruptly.
00:07:31.310 Now, most of the verses take this phrase and repeat it some number of times, passing in a number of different creatures. It's not a lot of code, but it does seem like a complete thought, and I'd like to name it. There are three very common strategies for naming things. The first is to name it by using some fragment of the implementation, like 'swallow' or 'catch' or something equally concrete.
00:07:51.000 This is the kind of name that quickly gets out of date. You find yourself with a method named 'blue' that returns the hex code for red. The second strategy is to name it structurally — you have 'recurring line', 'incremental sentence', or 'middle phrase'. This kind of name is technically accurate but totally unhelpful.
00:08:09.060 The problem domain here isn't about phrases; it's about a little lady who inexplicably swallows a fly and then compounds the problem by swallowing larger and larger creatures. This part of the song tries to explain the reasoning behind her behavior, talking about why she'd ever swallow a bird, dog, or goat. In other words, her motivation.
00:08:30.890 So, we need to call motivation some number of times, each time with different data. Now, ignore the fact that cows and goats are vegan — the predator in one verse becomes the prey in the next. What we really want is a single list so that we're not writing everything twice. Ruby has an 'each_cons' method that conveniently yields each consecutive pair given a list of critters.
00:08:46.160 Loop over them, pass each pair to the motivation method, join it into a single string. This block of code also seems like a complete thought, and I’d like to name it, as we’re talking about a sequence or chain of events. It describes a sort of bizarre food chain. So, the algorithmic portion of the code is isolated into these two methods.
00:09:05.430 There’s a trade-off here; we’ve taken something that was blindingly simple and complicated it. We’ve paid a price, but we’ve gained benefits too. We’ve isolated a small, cohesive piece of code, named an important concept to the food chain, and the implementation now shows the bones of the algorithm.
00:09:26.750 It expresses the underlying structure of an essential piece of the song; this structure was indiscernible when everything was just hard-coded strings. Two things just happened: we extracted and named the algorithmic piece. However, one thing didn’t happen; we didn’t use it anywhere.
00:09:41.950 Technically, this is called a parallel implementation — it sounds very fancy, but it means I didn’t know what I was doing yet. I didn't want to screw things up while figuring it out. After creating a parallel implementation, you should be able to swap it in seamlessly, and it will just work. Well, sometimes it doesn't.
00:09:57.400 Here, it failed. This means there’s something we haven't understood yet about the food chain. It turns out one part of the chain is not like the rest.
00:10:07.370 What you'll usually see in a codebase when you have a rule with an exception is a conditional. These can be kind of problematic and foot-in-the-door kind of way. You have a conditional; you blink, and now you have 12. Somebody's going to get hurt. Within the context of refactoring, conditionals can be a useful tool.
00:10:27.520 First, create the conditional, name what it represents, and then use what you learn to make a decision about the next step. Now, that sounds deceptively simple, but there are a couple of pitfalls. The conditional itself should contain the smallest possible difference.
00:10:45.110 It makes sense if you think about it. If something is the same in both blocks of an if statement, then it's not conditional on anything. So, if we keep just the wriggly bit of the spider, there's nothing to put in that 'else' branch. You could drop it, and that might lead you to think you’re naming just the exception — but you're not; the conditional represents two variations on the same concept.
00:11:03.320 In the song, we’re talking about the critters' distinctive features as some sort of qualifier. It’s just that some critters aren’t all that distinctive. If you only leave the bit about the spider, then the name will almost inevitably end up being about the spider. If you name a fragment of an idea, it introduces not just indirection but misdirection.
00:11:23.480 This creates an illusion of understanding and makes it very difficult to refactor. By forcing yourself to consider the symmetry, you name the whole concept. This method needs an argument; 'prey' is not the right name here. 'Prey' is about hunting — this has nothing to do with one animal killing another for food.
00:11:40.000 This is about the critter itself. Take a moment to look at the conditional; it contains the smallest possible difference. It’s symmetrical, containing both the exception and the rule. The name doesn’t misinform; it’s about a general concept in the song. Notice that the result of the method depends solely on the argument that’s passed.
00:11:59.260 This method could live anywhere; it could be a global function for all we care. If you got this critter string and more stuff associated with it, we’re trying to decorate it with more behavior. Instead of passing the critter to the qualifier method, we should invert it and send the qualifier message to the critter.
00:12:18.350 To do that, critter has to be an object. The data can all be extracted from the verse. New up all the critters in the initializer, and predator and prey are now no longer just strings; they’re objects to which we can send messages. Notice that once we found the critter object, the conditional just went away.
00:12:34.370 That won't always be the case, so don’t be afraid to introduce something that makes you feel a little bit slimy. Refactoring isn’t about making the code pretty; it’s about understanding the problem better. Sometimes you have to get a little dirty to get clean eventually.
00:12:50.960 The motivation method was so nice just a moment ago, and now not so much. This chunk sticks out; it’s not terrible, but it has too much information at the wrong level of abstraction. These are intimate, nitty-gritty details, and the motivation method shouldn’t have to care.
00:13:05.220 If we extract it, we have to name it. The only thing I could come up with that kind of works is 'epithet'. I will not tell you how long I spent thinking about that before I came up with it. Worse, I’ve given this talk three times and mispronounced it every time. Oh, I just realized a few days ago how you actually pronounce that.
00:13:23.940 So, 'epithet', since this is all about the critter, belongs in a critter class. A lot has happened: we tried to swap in that parallel implementation and we failed. Focusing relentlessly on a small, symmetrical difference isolated a meaningful idea, and that idea was the seed of a tiny object: the critter.
00:13:41.620 This abstraction was not at all obvious at the start, but now having found it and named it, it feels inevitable. Now, our parallel implementation is finally complete. We can look at the case statement, complete the original implementation, identify the algorithmic repetition, and then swap it out with a call to this new implementation.
00:14:03.030 This time the change is seamless; the test passes. This is better: the case statement has gotten noticeably shorter than it was, but we've ended up with more code and more complexity. You'll notice there's a continuing squinty thing — that’s all one class except for the three little critter lines. It’s five lines of critter.
00:14:16.410 To be fair, this is pretty mild complexity; there’s probably nothing here that you couldn't understand even if you were just a little bit drunk. This whole time, we’ve been dealing with duplication in the strings of the song, but it wasn't obvious duplication; it was more indirect.
00:14:41.470 When we started out, there were fragments of plausible duplication throughout the case statement. Now, there are whole chunks of unmistakable duplication. Even the data has gone from being mostly a jumble to having a recognizable pattern. It’s tempting to focus on that sameness, to extract it and give it a name. Resist that temptation for as long as you can.
00:14:59.560 Identify the smallest difference, see if you can make it go away, and then ignore it. The difference in the first line is the name of some critter. We’ve got a bunch of critter objects; we just need to find the right one and get its name. This will take this tiny difference and make it the same everywhere, and we can ignore it.
00:15:12.560 This leaves us with one last difference between the different blocks of the case statement. This one is something that we haven’t named yet. Again, this is about the critter, and it’s kind of like a commentary or aside. We'll need to pass more data when we’re creating critters and then use that abstraction in the statement.
00:15:24.740 Now, this is identical, and we can ignore it. Something interesting has happened: all that sameness we’ve been ignoring can now collapse into a single block. Sameness is a trap; it’s a distraction. Differences are the parts that are interesting because they’re fragments of some larger idea.
00:15:41.570 You want to encapsulate the concept that varies, not the concept that stays the same. Once you've encapsulated the differences, then the sameness either evaporates or condenses into something obvious. So, all of this started with identifying duplication as the ugliest, best-understood problem.
00:16:00.680 We added indirection in order to be able to start dealing with individual strings and then encapsulated the algorithmic portion of the song, discovering the critter object, which is tiny and very cohesive. Finally, we focused on all the little differences between the different cases, making them go away and allowing us to collapse the entire set of identical cases.
00:16:15.270 But we’re not done yet. We started out with a gigantic string; we’ve extracted and extracted and spent a lot of time in the weeds. It would be really useful to take a step back and look at the bigger picture.
00:16:32.790 So, this is a song. It has five methods; three of them are private. All that private stuff seems like it could be its own thing. No song needs all the old code, otherwise the tests will blow up, but we can ignore it for a moment and just focus on this parallel implementation. Before calling this from the song, we have to do a couple of things. Well, one thing mainly: it has to work.
00:16:48.690 Otherwise, the tests fail, and this doesn’t actually work. I want to make the API of that class the API that I want, and I’m only doing that because it’s a hassle to fix it later. It takes longer, and I only have 40 minutes with you, so I’m going to do the straightforward changes first. Any other changes can wait until we’ve integrated it.
00:17:05.710 This way, when we run it, the tests will have our back. The reason this isn’t working yet is that the verse and chain methods call 'last_i' on critters, and the verse object doesn’t know about critters. It was pretty easy to fix: set some initialization logic and then give it a reader.
00:17:22.740 With this change, verse has what it needs, but verse has more than what it needs. It doesn't need all the critters; it just needs the critters for that verse. This has implications for the API of the class. Notice that verse and chain both take 'i' as an argument; we can get rid of the parameter and instead get 'i' from the size of the critters array.
00:17:39.810 This doesn’t fix the most glaring issue with the API of the class. This isn’t the verse’s verse; this is the string representation of the first subject. There’s a Ruby idiom for that, which is 'to_s', which is very nice and makes the public API reasonable, especially if we make the last two methods private.
00:17:57.950 To swap the parallel implementation into the song class, replace the call to the old verse method and replace it with a call to the new class. You don’t actually need to call 'to_s' here because 'join' will do that for you. Then you can delete all the old methods.
00:18:15.110 This passes the test. Now we have three tiny objects, each focused and cohesive. My critter is minuscule; it’s basically a tiny wrapper around the raw data we're initializing it with. The song class is a little bigger; most of this consists of the array of critter data in this class.
00:18:29.640 It knows only five things: it knows what the raw critter data is, how to transform that data into critter objects, and implicitly knows who eats whom in the food chain because it’s the order of the raw data in the array. It knows how to instantiate verses, and it knows how to combine verses into a complete song. It no longer knows what those verses are or how they're constructed. That is the domain of the verse class.
00:18:49.360 This class knows how the sausage is made. It knows which verse should be long and which should be short, how to combine that data into templates, and what the algorithm for the food chain actually is. It knows a lot. I am NOT saying that you should always split some things up into new methods and classes, but it can be useful to ask yourself if what you have were two different things, what would they be?
00:19:05.660 If you can come up with an answer that's not too far-fetched, it might be worth exploring. You can always inline it if you hate what you get. The first class is looking pretty decent from the outside but pretty gross on the inside. The most glaring thing here is the case statement, which is still present. It has some blatant duplication.
00:19:23.490 As before, ignore the sameness, focus on the smallest difference, and create a small, focused conditional for just this difference, making sure it’s symmetrical. Then, name it. In this context, this is what the narrator is summarizing or recapitulating everything that has happened up until this point.
00:19:44.350 There are two verses where there's nothing to summarize because there’s no backstory yet. In the very first verse, nothing has happened yet, so there’s nothing to say. In the very last verse, the little lady is dead, so there’s not much to say about that either. Having extracted the recap, the duplication can be collapsed and named.
00:20:07.370 This is the incident that the verse is about; this is the main deal. 'To_s' tells a pretty good story, with all of the details relegated to private methods, which provides a decent division of responsibilities. The blatant duplication is gone, but there are a number of smaller annoying things.
00:20:24.700 For example, there are calls to 'last' on critters, and that’s totally redundant because the critters stored in the verse are already the last critters. So you can delete those, which removes almost all of the instances where we reference 'i'. There’s one last reference left in the case statement; we can switch that out and just switch directly on the critters length.
00:20:43.580 Another little thing is the first critter in the list, which is the main character of the verse, and it would be really nice to name it. So, there is one more thing that I don’t like. Go ahead and squint at all the templated strings; notice that the templates are all strings. They’re all read; we’re interpreting stuff that varies.
00:21:04.730 Stuff that varies is abstraction, and here there’s one string that really sticks out — a hard-coded string. You could argue that if it doesn’t vary, should it even be data? Maybe it should be a part of the template. There are a couple of arguments for using an abstraction here.
00:21:22.860 The first is that we have this exact string in two places, and this is the aside for the fly, so we've already named it. The other reason is that the whole point of making these changes is that we might not even have a fly, so we’re going to use that abstraction.
00:21:39.510 The verse is looking pretty good at this point; it has boilerplate to set everything up. Then 'to_s' calls what is called by the outside. Everything we care about in terms of the API and the 'to_s' calls 'incident' and 'recap', and then 'recap' calls 'chain', and 'chain' calls 'motivation'.
00:21:55.700 It can feel like all of these itty-bitty, annoying, insignificant changes are not worth worrying about, but all these tiny irregularities and duplications and inconsistencies add up. It’s hard to measure the impact of one tiny change; the difference in terms of clarity and understanding can be staggering.
00:22:11.990 The code is pretty good. I have one complaint left, and that's the case statement at the very beginning. I added it as a way to sort of start getting at the strings; I thought it would be temporary. I hate that it’s still there.
00:22:31.430 So, the final refactoring is one of the classics from the refactoring book by Martin Fowler: it's called 'replace conditional with polymorphism'. This is a step-by-step recipe. Refactoring is not about achieving maximum design pattern density.
00:22:49.490 Refactoring is about balancing simplicity, readability, and changeability. Don’t refactor because you're embarrassed about your code — this is not about aesthetics. Don't refactor because there’s some shiny design pattern you’ve been wanting to try out; this is not academic.
00:23:06.900 Don’t refactor because you imagine a beautiful future with ifs and what-ifs that would really come in handy. Your guess about the future is as good as mine — which is to say, total crap. We’re more likely to be wrong than right.
00:23:22.220 Refactor because you need to make a change — not a hypothetical change, an actual change. The focal point of the refactoring wasn’t to make the code beautiful or satisfy some academic itch; it was to fulfill a new requirement. In addition to the age-old version of the song, the client wants to introduce a jungle-themed song and an ocean-themed song.
00:23:38.410 I do recall something about a squirrel. The purpose of the refactoring was to make this not only possible but easy. We need to evaluate the code from this perspective: we need to be able to send in any set of critters.
00:23:57.960 Right now, we've got the hard-coded critters for the traditional version of the song, and this is fine. The client likes the traditional version and wants to keep generating it; all of their code should still work. They also want flexibility, so this is a pretty straightforward change.
00:24:17.860 If we give the constructor a parameter, we can default it to the existing raw data and then loop over the argument instead of the constant. This would almost work exactly the way you want it, but the problem is that we’re making assumptions about how many critters are going to be involved.
00:24:35.600 Now, that’s also an easy fix; we can loop up to the size of the array.
00:24:51.170 However, there’s a hard-coded eight down in the first class, and probably that shouldn’t know anything about anything outside of that one verse. It shouldn't know about other verses or how many verses there are.
00:25:09.720 We have a few options here, most of which are really gross, so let’s talk about them. On the one hand, in addition to telling the verse what critters to use, we could tell it how many total critters there are. It’s no worse than hardcoding the eight, but it’s also no better; it’s more or less equivalent.
00:25:28.210 Another option could be to pass all of the critters while also passing 'i'. This pushes knowledge about the big picture even more into the verse, which is unfortunate. On the other hand, there’s a trade-off: the song knows a little bit less, which could be a good thing.
00:25:48.710 Another option would be for the song to determine whether the verse should be long or short and then pass some sort of token to the verse class. This duplicates knowledge: the verse knows about the long and short while the song also knows about them, which creates conditionals, leading us to a less-than-ideal situation.
00:26:05.750 A better solution would be a shim between the two; someone who can know about long and short templates and knows how to make verses and what they need. We could have two really short, really stupid verses: a short verse and a long verse. Neither would have to know about each other, nor should they know about the total number of verses.
00:26:23.690 As long as both verses have the same interface, then the song doesn't have to know which verse it's dealing with. So, we need to turn one verse with one conditional containing two branches into two verses with two recap methods, one for each branch of the conditional.
00:26:37.410 Since the long verse has a chain, it also needs the food chain stuff. There's a bunch of common stuff we could duplicate, but that seems like a bad idea. Instead, we could stick it in the short verse and have the long verse inherit from it, overwriting recap and defining the food chain.
00:26:54.420 In my song, we need to call this in-between thing the 'verse builder' or 'factory'. I don’t really care what it is called — I like 'verse builder'; it totally works. So, we used to have code that was incredibly straightforward, but we’ve ended up with code that is still understandable but definitely more complex.
00:27:14.130 Everything used to be all in the same place; now it's spread out across a bunch of tiny methods and classes. We've traded simplicity for flexibility. If we had done it right, we should be able to generate lyrics with arbitrary critter data. The simplest way to check is to write a small test — and when I say small, I mean as small as possible.
00:27:33.330 But it needs to cover all of our edge cases, everything that’s relevant about the song. Everything interesting should be included, but it shouldn’t be realistic because that would give us redundant data and obscure the important aspects.
00:27:49.420 So instead of lions, monkeys, and lizards, the data should be simple, contrived, and uninteresting. With this data, we should end up with these lyrics. The first and last verses should both be short, while the middle verses should be long — one of the middle verses should have that extra qualifier.
00:28:05.780 So, if we stick this whole thing in a variable, we can confirm that the code does the right thing. But it doesn't work! This was surprising when I prepared this talk. It's pure luck that my contrast of really silly, uninteresting, boring data uncovered a bug.
00:28:24.870 Now, this fails because we have the wrong indefinite article — it means 'one', so we have 'a zebra' and 'an alligator', and it’s not a big deal. If this came up in production, you could fix it in a heartbeat.
00:28:40.700 So, everything passes. Kent Beck once said, 'Make the change easy, then make the easy change first.' Refactor the code, change the structure without changing the behavior. Keep doing this until you can add your feature without hacking it in.
00:28:58.520 Often, you’ll find that the new requirement just takes a moment to implement and feels like a bit of a miracle. It’s actually Kent Beck; he didn’t say this. He said almost this way — he said 'Make the change easy... warning: this may be hard, then make the easy change.' Refactoring can feel like a dark art, and it kind of is!
00:29:16.210 There’s a lot of gut feeling involved. You’re recognizing stuff you don’t like, but that gut feeling and intuition are learned. Read about code smells, watch Sandy’s talk, look at code, do code reviews, practice refactoring, try stuff, get rid of it, and try something else.
00:29:36.830 Ignore the sameness for as long as you can. Don’t name the fragments and slivers and shards of ideas; name symmetrical differences. Name the whole concept. It should feel like you’re discovering your abstractions, not inventing them. A good abstraction feels obvious in hindsight; it's already there, buried in your code, and it’s up to you to unearth it.
00:29:53.750 Thank you.
00:30:03.110 We have four minutes supposedly; that’s for Q&A. I’m totally getting the tape. The code is up on GitHub in separate commits, one by one. You can clone it, download it, and look at all the changes. It’s actually in a slightly different order, but that’s because I kept changing things as I worked on the slides. It doesn’t really matter which order you do things in; you’ll usually end up in the same place.
00:30:19.600 I’m writing a book with Sandy Metz. I made this pretty slide, and she said, 'No, that’s what it actually looks like.' I failed to design my slides afterwards, so I have this odd red thing going on. The book is awesome; it's also about a children's song that is algorithmic — the 99 bottles of beer on the wall. It has that same algorithmic complexity that gives us enough difficult stuff to talk about good design and refactoring techniques without teaching you about investment banking or shipping containers.
00:30:39.780 If you want to practice refactoring, a great place to go is Exorcism. The story about Exorcism is that, in lots of languages, you can do exercises. The real story is that once you’ve finished an exercise, go look at everyone else’s code and start developing that gut feel about what trade-offs there are. What’s easy, hard to understand, and have that conversation with people about how that code could be more interesting.
Explore all talks recorded at RailsConf 2016
+106