00:00:10.200
Okay, I'm Sandi Metz, and I am the last but one speaker, which means you’ve almost made it. You get a break after this, and then there’s another keynote. If you stay for that, it’s really bright up here. I’m going to have to look at you some, so yeah, you’ve almost survived. Even those of you who are new have almost made it through an entire RailsConf.
00:00:26.599
So I think we should all take a moment to thank our sponsors. They pay the bills, and we’re glad for that. Thank you! But even more than that, there are human people who organized this conference. If you see someone in one of those red shirts before you leave today, find them and say thank you.
00:00:43.609
Alright, let’s thank them here in person. Let me start my clock; I really do have about 40 minutes. I don’t have 500 slides of code this year, though, so it’s not going to be that bad.
00:00:56.719
I’m Sandi Metz, and I’m really happy to be here. This year’s talk is about code smells. This is a term invented by a guy named Kent Beck. We did not plan that I would follow Katrina, but it’s the perfect talk to follow hers since she just mentioned his name.
00:01:10.640
This is Martin Fowler; he’s the man who wrote the book that teaches us how to do refactoring. They collaborated on the third chapter of that book, and that’s why naming things is such a significant win. The term ‘code smells’ came from their work back in the 1990s.
00:01:38.720
I wrote a book a couple of years ago; before that, I wrote code for 35 years. I went to my desk every day and wrote code, and now I don’t do that. I teach instead, conducting classes in object-oriented design. In my class, I often ask people if they’ve heard of code smells, and just like I suspect with you all, everyone in my classes always says, 'Oh yeah, we know what code smells are.' Then I ask them to list five, and no one can.
00:02:01.160
Okay, I can hear you trying, but really most of you cannot. We all think we know this term, but it doesn’t mean I dislike your code, and I can’t tell you why. That’s not what a code smell is. There is some opinion involved, but really, it’s not that I disagree with how you wrote your code. The standard is higher; these 22 different smells all have very precise meanings.
00:02:38.720
The magic of this list is that they’ve given complex ideas names. Once you give a complex idea a name, if we can all learn what that name stands for, it means we can talk to each other in an unambiguous way without miscommunications. One thing they often say when discussing code smells is 'bad smells.' But really, the definition of a code smell is that it might indicate a problem; not that it does indicate a problem, just that it might.
00:03:05.240
You don’t have an obligation to filter out all the smells; it’s actually important that you not. It’s worth getting familiar with code smells because they have such great names. For instance, 'Feature Envy' is a cool term. I like the one that needs a code of conduct, 'Inappropriate Intimacy.' Then there’s 'Shotgun Surgery,' which is another favorite.
00:03:51.920
The problem with this list is, I confess, I've had that book for a really long time before I read it. I had a hard time with it. It’s one of those books that feels like a recipe book. I’ve heard it described as a cookbook. It would be like reading a cookbook if you did not eat. I needed the story arc, and I had trouble persisting with a bunch of recipes that just went 'shrub shrub shrub.' But it turns out it was really worthwhile when I finally did it.
00:04:35.360
There are 22 things on this list. I had a Stack Overflow question about them. There’s a guy, whose name I can't pronounce, Mantia (it'll be in the credits), who wrote a paper where he grouped them into five different categories. Through the magic of Keynote, I can illustrate this. This is the only reason we make talks – to use effects.
00:05:18.560
Let’s discuss this set. This is a list of things that simply do not need to be long. The terms ‘Long Method’ and ‘Large Class’ are probably self-explanatory. ‘Data Clump’ occurs when you have two or more pieces of data that always appear together. You pass them around over and over again together. ‘Long Parameter List’ is also obvious; it might only occur once, but if it’s long, there’s probably an object hidden somewhere in there. The wonderfully named 'Primitive Obsession' was seen in Katrina’s talk; it’s when you have an instance of a base class like a string, number, hash, or array and you pass it around.
00:06:15.960
These things are grouped in a category called 'Blowers' because they make code bigger than it needs to be in the places where you use them. The next group consists of ideas available in object-oriented programming that you can misuse: 'Switch Statements,' which are conditionals by normal conversation, 'Refused Bequest,' which is an inheritance issue where a subclass overrides a method inherited from a superclass and throws an exception saying, 'I don’t implement that thing, I refuse the bequest.' This category includes 'Temporary Fields.' It’s interesting that 'Temporary Field' is on this list, right?
00:07:05.480
Temporary fields can be handy, but sometimes they mean you should have made a method with that name. Why are you giving it a name? What is it about the code you have now that feels like it needs that name? These things are all grouped in a category called 'Tool Abusers.' I have a garage full of bikes, and I’m an amateur mechanic, which means I have amateur tools, sometimes involving a very short wrench, a long pipe, and a hammer. I can tell you it almost always turns out badly if you abuse your tools.
00:08:02.640
This next group consists of stuff that makes changes hard: 'Divergent Change' and 'Shotgun Surgery,' which we’ve discussed along with parallel inheritance hierarchies. Every time you change something in one side of a hierarchy, you have to move or add changes on both sides, making it hard to remedy. These sorts of issues keep you from wanting to change code, or they make code hard to change. Notably, if nothing ever changes, it's probably okay. Embarrassing code can remain if it’s not costing you money.
00:09:13.520
Just because smells exist doesn’t mean you need to panic; sometimes you should own them, be proud, and walk away. Don’t let people make fun of you for having bad code. This next category includes ‘Lazy Class’ and ‘Speculative Generality.’ Lazy classes do not justify their existence. I won’t delve into speculative generality just yet. In object-oriented programming, classes ought to have data and behavior.
00:09:45.960
Duplicated code is rather obvious. Now back to speculative generality. I ask you to raise your hands: who here has ever written some code for a feature you thought might come in the future? Keep your hands up for a moment. Okay, here’s a confession. Who here has, after many months of working on that code, ripped it out and thrown it away? We are bad guessers, and some of you love object-oriented design. Yet, this speculation leads to blame: it’s primarily things in that category that earn us a bad reputation.
00:10:38.240
You have to be right, so the few times that you are right must really be big wins to outweigh the vast cost of being wrong. Code is read many more times than it is written. We cost money primarily due to the time spent reading code. Introducing generality increases the level of abstraction, often adding levels of indirection that are hard for humans to understand. Every time someone looks at that code, it becomes harder to grasp. We should try to restrain ourselves from speculative coding since new requirements will tell us how we ought to write it.
00:11:40.840
Now, the last category encompasses 'Feature Envy,' 'Inappropriate Intimacy,' 'Message Chains,' and 'Middlemen.' Feature Envy occurs when I have an object and send it way more messages than I send myself. It could mean I’m more tightly coupled to that object than I realize. Inappropriate intimacy exists when I access private methods of an object – that would be harmful. A 'Message Chain' involves sending a message from an object to another object, which passes the message somewhere else.
00:12:13.640
These things are grouped together under a topic called ‘Couplers’ because their effect binds the objects together. Even if they are beautiful designs, it’s impossible to reach in and extract one for use in another context; they come bundled, one or the other. So there you have it: everything you need to know about code smells in the first 10 minutes and 34 seconds.
00:13:20.760
But we’re not done yet because now I will discuss refactoring, although not too much since you just heard a talk about it. There’s something people often miss – the researchers discovered in the 90s that I want you to take away today. Refactorings, just like code smells, have names and specific meanings. Refactorings have names, they are specific, and they come with recipes.
00:13:46.840
Here’s a recipe example from Page 149 of Martin Fowler's book; it looks like a recipe. There are numbers down the side and optional clauses that might differ in your situation. You’ll notice it refers to other recipes, also in capital letters – another whole recipe in itself, recipes within recipes. Every refactoring works the same way – we don’t just hand-wave and say ‘go refactor.’ Refactoring has a very specific definition: rearranging code without changing its behavior.
00:14:16.320
All the ways in which you can rearrange code are already documented by people who have extensively thought about this. Now you know code smells have names, and they are real things. Also, refactorings have names; they are real things and come with recipes. And here’s one last bit of news: every code smell maps to a curative refactoring recipe. Ponder that for a moment. What does that mean?
00:15:11.400
Here’s a cheat sheet provided by a guy at Industrial Logic. Notice that the top of the sheet addresses the code smell ‘Data Clump.’ There’s a brief definition of Data Clump. That ‘f81’ reference is to Page 81 in Martin Fowler’s book. The three items listed for refactoring beneath it are the recipes that cure that specific code smell.
00:15:24.880
This slide has been enlarged so you can see it clearly. I extracted it from a PDF containing cross-references of all code smells and corresponding refactorings in Martin Fowler's 'Refactoring' book and another book by Joshua Kovacs, 'Refactoring to Patterns.' It turns out this is all you need to know; the problem is solved. You don’t have to reinvent this wheel.
00:15:54.960
All you need to do is learn a few things. My experience teaching has shown me that somehow a generation has passed since these books were written. New programmers in the last 15 years may lack this knowledge. I’m a woman of a certain age. If you’re new to programming within the last 15 years, you may not be familiar with this information. Now I’m going to show you some code.
00:16:30.720
We’ll spend about 15 to 20 minutes examining code, highlighting the practical impact of recognizing code smells and applying refactorings. My class 'Sale' is a subclass of persistence, which you can equate with active record. If you’re sitting in the back and can’t see this, it won’t hurt my feelings if you get up and come forward; the font size of my code isn’t ideal.
00:17:09.280
Here we see an example where my class 'Foo' has a 'sales_total' method that takes parameters, perhaps functioning as a controller. This controller knows the 'Sale' class and certain attributes. It understands the name of the date and the cost, as well as how to pass a date range as an argument to ‘where’ so the query will work.
00:17:38.679
Now, later on, someone else comes in and does this. Someone makes ‘Bar’ which resembles ‘Sale’ but is intended to calculate weekly totals and takes parameters. You only need a starting date, so it can calculate the ending date. The method for expenses comes about later, which looks quite a bit like the other methods. Then, as expected, we run into an issue where someone passes in a bad string for a date, resulting in a ‘Kaboom’!
00:18:11.760
A bug report comes in, and someone fixes the problem, so now I have this code. Let’s pause a moment and identify all of the ideas encapsulated in this seemingly simple piece of code. There’s the idea of calculating a week. Dates might be passed incorrectly, and defaults can be set. There’s the notion that a start and end date always go together when needed. I know messages that can be sent to ‘Sale’ and the name of that attribute, so many ideas are marked up in different ways.
00:19:23.600
But if nothing ever changes, is it worth keeping, or should I just walk away? When I work in businesses that have been successful and have large codebases full of code they dislike, that means they’ve likely been around for five to seven years and carry a sense of chaos as problems compound. After establishing 'Foo' and 'Bar,' someone creates 'Larry,' 'Curly,' and 'Moe,' all of which function similarly.
00:19:51.920
Eventually, with many similar classes, you can end up with utter duplication, message change, and data clumps leading to 'Shotgun Surgery.' If you mess up, something breaks in production. Prevalent validation issues manifest in various places, creating an overwhelming mess. It can feel like you need to understand everything to fix anything, but from experience, if you try to tackle all problems simultaneously, you will never finish.
00:20:32.640
It isn’t feasible to say, ‘shut the doors, we’re going to do a big two-month refactoring; it will be fine,’ as these suggestions don’t resonate well with those who cover the bills. Therefore, we need a way to reach in, extract one problem, and fix it, and that’s where code smells come into play. We’ll identify the data clump issue and resolve it using the 'Extract Class' technique.
00:21:14.560
By isolating the data clump and uniting the behavior with the data, it becomes clear that certain behaviors do not belong in 'Bar' and 'Baz.' They rightfully belong with the newly-defined object I should create here. I won’t delve deeper into this refactoring process as the details are easily searchable if you wish to look them up.
00:21:53.239
The ‘Date Range’ class takes two arguments, understands how to set defaults, and possesses two methods: one that returns a regular range and another that returns a week range. This class acts as a representative for the data clump. In 'Foo,' I’m about to complicate the method, which is why people criticize object-oriented design. I’m going to introduce a temporary variable, which is itself another code smell. I’ll just plug it in right away.
00:22:31.920
If you examine this code, it gets worse. I introduced a new class, and the method length has increased. However, watch what happens. Now I don’t require that anymore. I can directly apply that here. Now, I no longer need that in the other method either. I can apply this method wherever needed.
00:23:06.760
These three methods demonstrate a data clump. The beauty of addressing data clumps is that if you identify them, isolate them, and create objects for them, the behavior naturally coalesces into the object. It's reminiscent of the 'Field of Dreams' principle: if you build it, they will come, allowing you to use date ranges consistently throughout your application.
00:23:37.880
Thus, all validations remain uniform across every location, and changes propagate seamlessly throughout the application. Everyone using this will prefer to leverage this standardization rather than implement their own in every instance. I must emphasize that you have tons of data clumps in your applications. If the only takeaway from this talk is to go home and convert them into objects, that’s a significant achievement.
00:24:17.679
Now, regarding message chains – I have a strong dislike for them. Returning back to my notes, I find them to be prevalent everywhere. Here we are, within my own objects: 'Foo,' 'Bar,' and 'Baz,' and we have this message chain reaching into my persistence object, sending messages to component objects and returning responses. While it’s acceptable for my class to know things about 'Sale,' it’s inappropriate for it to be aware of objects external to it.
00:25:27.680
Moreover, it can be challenging to convince others of the downsides of this design. To convey this idea, I’ve created two illustrations that portray the concept of message passing. Here’s the first illustration: I have 'Foo' sending the 'where' message to 'Sale,' which inherits from persistence. The message traverses the hierarchy, reaching a vast cloud of code that someone else wrote and provided to us for free.
00:26:12.080
Somehow within that code, there’s a list, among other components, but I don’t occupy my thoughts with what’s going on in that segment; I send the message, receive it back, and then pass it off to 'Foo.' This explains the mechanics of message passing. I’ll show you a different approach to visualize it, especially when I think about testing. I’m trying to test 'Foo,' which knows about 'Date Range' and 'Sale.' These are my immediate collaborators.
00:26:48.600
When creating unit tests for 'Foo,' I need to establish instances of those collaborators to run tests. But here, we notice 'Foo' has knowledge of something that isn’t a direct collaborator—a certain relationship exists. I’m aware of my collaborators, but if the situation does not enable me to run the database query directly, I then need to set up stubs or mocks, narrating how message chains can complicate the testing process.
00:27:38.200
With that in mind, fixing the message chain is straightforward. Identifying that duplication reveals an underlying idea that deserves a name. You’ll frequently observe a method name with a persistent prefix or suffix. That’s a telltale sign of an object wanting to escape its constraints, a point reaching for further abstraction. Consequently, I can conceal delegation by implementing a 'total' method on 'Sale.'
00:28:29.480
Once I achieve that, my code looks cleaner, and we’ve successfully hidden the delegate. Next, I want to acknowledge Justin Surl, who presented a talk a few years back regarding realistic levels of complexity in testing. I recommend checking out what he discussed concerning budgeting realities. Now, let’s consider the different tests we can implement for this code. For instance, I can utilize test cases by creating an instance of 'Foo,' sending a message, and checking the output.
00:29:21.160
While this operates well away from the database, and 'Foo' isn’t a subclass of 'Active Record,' it still goes through an entire database query. I’ve had many developers tell me their test suite runs increasingly slower as it grows, and if too much reliance exists on combined tests, the outcome becomes untenable. It’s imperative to have unit tests that execute quickly, and if your unit tests for your objects are tightly coupled to distant dependencies that introduce unnecessary delays, your code quality will deteriorate.
00:30:13.640
Moreover, I recognize it’s possible to modify and stub in the ‘Sale’ method, but I have good reason to prefer this approach. It shifts the focus to treating objects as real participants in their roles rather than instances of class definitions. This distinction permits flexible utilization anywhere in the testing structure without initiating a stubbing effort.
00:30:55.720
There remains one last task I want to accomplish before we conclude. Upon realization, I spot a duplication here. It’s beneficial to utilize ‘Pull Up Method’ strategy and migrating it into the superclass. It’s unwise to monkey-patch the 'Active Record' class by introducing total there; however, we can create a module instead. Employing a module enables me to easily add functionality and clarify distinctions to what I am transferring.
00:31:38.560
The strategy to pull this method up raises awareness of speculative generality, which I confess to being guilty of. The essential dilemma with speculative generality is that it tends to mislead and can become costly due to increased complexity. Personally, I find that my methods remain short and that injecting dependencies shouldn’t present a significant challenge while still enhancing flexibility more than it complicates things.
00:32:29.600
Now, if you are not intimately familiar with code smells, learning them can elevate your programming proficiency. The key to enhancing proficiency further lies in understanding code smells and learning the corresponding refactoring recipes. These concepts are not new; they've been around since the 1990s, and many individuals likely entered the programming domain since then.
00:33:10.640
These ideas offer practical and concrete advice on writing code. If you remain unaware of them, I encourage you to learn them. Problems that might seem insurmountable often have solutions through purely mechanical operations. The benefit is that it doesn’t strip away the excitement of programming; rather, it alleviates the boring elements to allow for focus on more intricate challenges.
00:33:54.680
Upon learning code smells, it’s like wearing glasses, transforming confusion into clarity. Following the smell leads to the creation of straightforward, understandable objects that are easier to test. So, while learning about code smells, keep in mind the reek tool. This is a static analysis tool that can run against your code and indicate what code smells are present.
00:34:36.960
Code smells aren’t alarming; in fact, they’re beneficial and ubiquitous throughout our code. But you need to lean into them. I recently got a new puppy, and if he were here, he would happily chew one of those flowers. You should also engage with code smells – they are simply information. Much of this information is neutral. For instance, if you're on a diet, you might want to avoid pizza, but if you’ve been biking for long, it’s the best kind of smell.
00:35:15.960
When reviewing my old code from three or four or five years ago, it’s common for me to express disbelief at my past choices. This is the extraordinary aspect of our jobs—it’s one of the best feelings! Imagine a job where your skills are stagnant over five years; that thought alone is discouraging. I love reflecting on my old code for the realization that I’ve improved, demonstrating I’ve learned since that time. This is such a wonderful concept.
00:35:58.480
I began this talk with a cup of coffee and will conclude similarly. I know it’s late, and you’re likely tired; I realize it’s not the best time to drink coffee right now. However, in the morning, when you awaken, it’ll be a fantastic smell. Go and learn about code smells—the ultimate goal is to raise your programming skills. Thank you.
00:37:07.279
Oh wait, I swear this book is close to being in beta. I know you have been very patient with me, but it truly is on its way. It's currently with the editor and is going to turn out great. Once again, writing can be thrilling yet torturous, but I can confidently say I’m proud of it.