00:00:13.250
Okay, so today I want to talk about refactoring legacy code. This whole talk is based on a code kata, which is like a coding exercise that you do to practice your skills, called the Gilded Rose kata.
00:00:19.500
Sandi Metz has been giving a talk this year based on this kata. She goes in a different direction than I do. There are a whole bunch of katas on a site, CMU something, which Todd Sadhana, who's down here, maintains. There's a lot of really cool exercises you can find there. Anyway, this is a cool kata, and you should actually try it out sometime. But I ask that you wait until after my talk to follow up on it.
00:00:31.349
So, we are starting a new job today. We are working at the Gilded Rose Inn, which, in addition to being an inn, sells a bunch of interesting items, and they have an inventory management system. The previous developer left, and they need us to come in and take over the system. The system runs a nightly process and updates how many days are left to sell the item and what the quality of the item is. As items age, they degrade in quality, and different kinds of items have different rules.
00:00:54.100
It's pretty straightforward. There is a spec, but we have no idea if we can trust it. Like most legacy code specs, they are often out of date, contain lies, or are completely wrong. We don't need to spend a lot of time on this right now; we will come back to it later. However, we don't know if we can trust this spec. There is a constraint: this team does not practice collective code ownership, and there's a very grumpy guy in the corner who maintains the item class and does not let anybody else touch it.
00:01:20.460
Now, this is our first day at work, so we are not going to come in and rock the boat by trying to solve this team issue. We’re going to hold off a little bit and respect this constraint for today. Our task is to add support for a new kind of item called a conjured item. A conjured item is just like a normal item, except it degrades in quality twice as fast.
00:01:56.159
That seems pretty straightforward. Let's take a look at the code. There’s the forbidden item class that we’re not allowed to touch; we will only look at it to notice that there’s nothing weird here. There are no strange side effects—just straightforward access, with getters and setters and no magic. This is important when you're dealing with legacy code because you need some firm ground to stand on while you reason about the code you're trying to clean up. That’s all I'll say about the item class for now, but it’s a very important point.
00:02:30.840
This is our database, which is fairly interesting. Notice at the bottom, there’s a conjured item already in the database. It probably is not being handled correctly, but we don't know that yet. Then there’s this update quality method. I mean, we have a task today; we need to support these new items. Now it looks like maybe this is where we want to focus our attention.
00:02:48.840
Okay, it’s been a while since lunch and a while since snack, so I hope it’s okay to look at this right now. That looks nasty, but wait, there’s more! Let’s look at the tests. The tests should be better, right? Well, there are no tests—none! Not a single one. Public service announcement: please write tests for your code!
00:03:06.060
So, what are we going to do about this? There are lots of techniques for dealing with legacy code; in fact, we could probably write a book with all the techniques in it. Fortunately, we don't have to because Michael Feathers already did. If you work with legacy code at all, you need this book—it is awesome! The second thing we need is some safety. We need tests! Now, I'm not going to talk about writing tests for legacy code; that's a different talk altogether.
00:03:40.620
However, while I was preparing for this talk, I wrote what Michael Feathers calls characterization tests. The goal is to preserve the existing behavior of the code, whether it's right or wrong. I want to ensure that I don't change that behavior while I'm doing my work. Therefore, I wrote characterization tests and used simple code to achieve 100% test coverage of everything I show you in this talk, with one exception that I will point out later. This kept the tests green the whole time. That’s important; I made sure I didn’t change the visible behavior of this code while I was refactoring.
00:04:15.240
So, as I said, we're not going to trust the spec, and we are going to fight the urge to rewrite when we don't know if we can trust the spec. We don’t know if we can safely rewrite. I mean, everyone I've seen do this kata takes the spec, looks at the code, and says, 'I'm throwing that out; I’m gonna rewrite it!' But we’re not going to do that. Rewriting is often very expensive and error-prone, and we can't always do it.
00:04:46.080
We need to have some techniques in our toolbag for when we can't rewrite. Instead, we’re going to follow the Boy Scout rule: we’re not just going to hastily make our change and perpetuate the issues present in that code. We’re going to leave the campground cleaner than we found it, by trying to improve the code in the area where we’re working. Therefore, we’ll take a lot of baby steps—very tiny refactorings—that add up to big changes. However, we won’t boil the ocean; we’re going to stay focused on our task.
00:05:22.000
We’ll only be messing with the code that we have to touch to implement this new feature. It would be tempting to go and rewrite the whole thing, but that’s not what we’re doing here. Alright, Kent Beck has this great advice: 'For each decided change, make the change easy, which may be hard, and then make the easy change.' And we are going to do that.
00:05:48.220
So, we need to stay focused on our tasks. Are we there yet? Can we see an easy way to implement support for conjured items? Not quite yet! So, what we need to do next is address a code smell known as 'feature envy.' Let's get rid of the feature envy.
00:06:06.140
Feature envy occurs when a method is too interested in the internals of another object. You see all those item dot references in there? That’s a clear sign of feature envy. There is a clear amount of interaction we’re doing to items rather than asking the items to do that work for us.
00:06:29.420
So, we would like to extract the body of this update method over to the item class. However, we have this constraint: we’re not allowed to touch item. Thus, we need to come up with a different strategy here. After all, we've only been working here for a few minutes, so we’re not really ready to address the grumpy guy in the corner just yet.
00:06:56.250
So, what we’re going to do is wrap the items in another class. By doing this, we can move some behavior to the wrapper. To do that, we can use something from the Ruby standard library called SimpleDelegator. I have implemented a class called ItemWrapper that inherits from SimpleDelegator. What SimpleDelegator does is, when you initialize it with an object, any message you send to the delegator that it doesn’t understand gets forwarded to the original object.
00:07:41.300
So, this rule or refactoring right here, where I introduced this class, did not change any behavior because all the messages we send to item get sent to the wrapper, which forwards them onto the item that we’re wrapping. But now, this gives us a place to move some behavior to. So, we can take the whole body of the update method and move it to a method on the ItemWrapper. There are some tools that can actually do this refactoring automatically, but RubyMine does not. Instead, we’re going to do it in a few steps manually.
00:08:29.020
First of all, we’re going to copy and paste the entire body of the update method and move it to a method on ItemWrapper. Since I don’t want to change those 43 lines just yet, I’ll introduce a temporary variable called item—using the same name as the parameter—and just assign self to it. This way, I’m actually addressing the object that I'm working through using that temporary variable, so I don’t have to change those lines of code yet.
00:09:04.000
Now that the update method at the bottom isn’t carrying its own weight, we can go ahead and inline that. This is called 'inline method refactoring.' You can see that item wrapper has the new item update. Now, we can go look at the item wrapper's update method, and we can inline that temporary variable, which is a bit counterintuitive. However, now almost all of those self references are completely unnecessary because when you're calling a getter on self, you don't need self dot. However, when calling a setter, you still do need it.
00:09:50.220
So, we still have to keep some of them for clarity. Are we there yet? Can we implement conjured items now? No, still not seeing it. However, the code is better; we’ve left it better than we found it, but it is still not ideal. Now, I notice some duplication—duplicate code is a breeding ground for bugs—because if you fix one instance but miss another, you end up in trouble.
00:10:26.930
Removing duplication allows you to give names to things that can help the code communicate better. The first duplication I see is the phrase, 'if quality is less than 50, self.quality plus equals one.' Let's extract that to a method called 'increase quality,' like so. Now, there’s another case that looks like we could extract it, but not quite, because of a nested if inside it. So, let’s dive in and inspect that a bit.
00:11:25.160
I'd like to pull that inner if out a level, but to do this safely, we have to ensure that it is safe to do so. This is the first refactoring we’re doing where we actually have to think a bit about the code—everything else up to this point has been mechanical. So, examining what it’s doing inside that inner if, we’re comparing against 'name' with 'cell n.' We already saw that 'name' and 'cell n' are finished getters with no magic.
00:12:06.410
With that established, there are two instances of calling 'increase quality,' both protected by an if that checks if quality is less than 50 at the bottom. Thus, we can safely pull that conditional check out a level, like so, and now we have exactly duplicated code. We can replace that with a call to 'increase quality.'
00:12:51.300
Now there's another instance of duplication, and this phrase looks like it could potentially be part of a 'decrease quality' method, but there's an extra comparison against 'name' inside there that doesn't really belong in the 'decrease quality' method. We can see that we have these two nested if statements, and there are no else clauses to worry about. These two conditions are completely independent, meaning we can reverse them.
00:14:18.200
With a quick evaluation, we can pull out some data and make a 'decrease quality' method. That’s better! When we look back at the item update, we see that we have three main sections: the top is all about updating quality; the middle section is about updating sell-in; and the bottom section adjusts quality again. I would like to group similar tasks, so I'll move that sell-in clause to either the top or the bottom. Upon closer inspection, let's move it to the top.
00:15:29.230
When we move it to the top, the method becomes a bit more logical since we’re now adjusting the sell-in date of the item before adjusting its quality based on the new selling date. This makes logical sense to me, and it reads well. However, we still need to adjust the other two if statements that concern sell-in. We will just consider that as we move forward, and our characterization tests will protect us if it goes wrong.
00:16:44.810
Now we have moved that check up to the top. When we zoom back out, we see that this method is doing two distinct tasks, so let's extract those and give them descriptive names. First, I will extract the top part and call it 'age.' Then we will extract the bottom part and call it 'update quality.' Now the update method is nice and clean, clearly indicating what it does.
00:17:14.470
Now, can we do conjured items? We’ve managed to improve the code. No, it’s still pretty nasty. I think we finally got to the point where we can tackle these conditionals. It’s time to simplify. At this stage, things might seem messy, but now that we’ve worked through the code, we’re feeling more confident in our ability to tackle it.
00:17:38.130
First off, I see that all of these conditions are negative: not equal to, negative checks, which are generally harder to reason about than positive conditions. I’d like to reverse those if we can. I will start at the bottom by looking at the backstage passes check. I want to make that an equals check rather than not equals. As a result, the logic should now read positively.
00:19:30.180
Now, we can zoom out a level and repeat for aged brie. Moving on, I like to look back at the top conditions again. This may seem slightly more complex to reverse, but thankfully Ryan taught you about De Morgan's Law. Here, I will teach you about it again through Ruby. If you want to negate a condition like a AND b, we apply De Morgan's Law, which states that it will be false if either A is false or B is false. Now let's apply that to our code.
00:20:38.960
After applying De Morgan's Law, the logic looks cluttered. However, simplifying those negative conditions can yield a positive check. Moving upward, we can notice a pattern with nested if statements, causing clutter. Thankfully, Ruby has an else-if keyword that can provide a more elegant solution for us. Now let’s take a look at the rest of these conditions.
00:21:57.520
Let’s analyze the bottom of both if clauses. If name is equal to 'sulfurous,' the method does absolutely nothing. Thus, we can eliminate the unnecessary checks from cluttering the code to a guard clause, meaning we can simply return if the condition is met. Cleaning it up for readability—now, let’s also review the age clause similarly to the first one.
00:22:54.720
Now, we have pulled both guard clauses to the top as they serve overlapping functions. Both are being called from a higher-level method, allowing us to do this safely, so we can reduce further repetitiveness. Now we can look at the update quality method again: the compound conditionals are bothersome since they’re not readable. Let’s simplify the backstage passes, too, allowing the method to be clearer.
00:24:38.280
This is the point where we can safely merge any potential redundancies. Yes, we have reached the point where we can begin making the code easier to read and understand. We have changed our usage to promote cleaner code and more easily maintained logic. However, let’s take it a step further and incorporate the polynomials for the item subclasses, so we can better express domain concepts consistently within the code.
00:26:29.040
At this point, we can begin to implement these classes as subclasses of item wrappers to push the behavior down to those subclasses. We are introducing a factory method to item wrapper. The self.wrap method currently just delegates its behavior to another method. Now, if item name is for example aged brie, then we can create an aged brie instance to represent that specific behavior, which we can also do for backstage passes and conjured items.
00:29:26.440
The Gilded Rose has now been dramatically improved from our starting point – ultimately marrying clean and accurate code while respecting the fundamentals of legacy systems. The goal is achieved through a gradual, mindful series of refactorings that focus on lasting results. In summary, we learned that slow and steady wins the race. We made tiny, safe steps without rewriting the entire system. There were some moments we had to think critically, but mostly this was about standard mechanical changes to yield better functionality. Remember that empathy with the code means a better situation for the next person who interacts with it, which, in most cases, is likely going to be you again!
00:30:32.120
Thank you very much! Awesome, thanks so much, Randy!