Summarized using AI

A Brewer's Guide to Filtering out Complexity and Churn

Alan Ridlehoover and Fito von Zastrow • April 04, 2024 • São Paulo, Brazil

The video titled "A Brewer's Guide to Filtering Out Complexity and Churn" features Alan Ridlehoover and Fito von Zastrow, who discuss how complexity can infiltrate software codebases and the importance of controlling it. By using a coffee machine as a metaphor, they illustrate how seemingly simple changes can lead to escalating complexity within an application.

Key points discussed include:

- Introducing Complexity: The speakers explain how complexity sneaks into code through multiple small changes or commits rather than appearing abruptly.

- Understanding Complexity: They emphasize the need to assess complexity early on, citing metrics such as method length and the Assignments, Branches, and Conditionals (ABC) metric as tools to gauge code complexity.

- The Coffee Machine Example: As they develop a coffee machine application, they show how adding features, such as serving tea and various sweeteners, leads to increased complexity. By observing commits, they demonstrate the growth of conditionals in the code, which becomes difficult to manage over time.

- Evaluating Code Complexity: They introduce tools like Flog to measure method complexity and discuss qualitative indicators for identifying when design should be revised. They advocate for a cap of five lines for method length and advise when to pause and rethink the design as complexity scores increase.

- Refactoring to Reduce Complexity: After recognizing that their coffee machine design has become overly intricate, Alan and Fito present the concept of "rehydrating" the code by intentionally increasing duplication to uncover missing abstractions and simplify it. They reinforce that thorough testing is essential before making these changes.

- Using the Factory Pattern: They discuss how implementing a factory class helps consolidate responsibilities and maintain the open/closed principle, allowing the code to expand while keeping existing functionality intact.

- Churn and Complexity Metrics: The importance of analyzing churn (the frequency of changes in the code) alongside complexity to prioritize areas in need of improvement in the application is highlighted.

Main Takeaways:

- Complexity enters the codebase over time and must be managed actively.

- Regularly assess and monitor code complexity using relevant metrics.

- Be proactive in rethinking design when the pain of complexity becomes evident.

- Using strategies like polymorphism and factories prevents future complexity from escalating.

This insightful talk from Tropical.rb 2024 serves as a reminder that vigilance and intentional design choices are crucial in fostering maintainable software systems.

Complex code is expensive and risky to change. Most programmers are unaware of how their changes increase complexity over time. Eventually, complexity leads to pain and frustration. Without understanding the complexity, developers tend to blame Rails. Come learn how to keep complexity under control.

Tropical.rb 2024

00:00:01.920 Hello everyone! Here we go! Hola!
00:00:34.320 Welcome to our talk, titled 'A Brewer's Guide to Filtering Out Complexity and Churn.' We also affectionately refer to it as the coffee machine talk, and you'll soon see why. Our primary goal here is to show you how to eliminate the bitterness caused by complexity and churn in your applications.
00:00:41.640 Over the next 30 minutes, we will illustrate how complexity sneaks into your codebase, how to recognize it before it becomes painful, and how to remove it permanently.
00:00:54.559 Let's start by introducing ourselves. I'll go first. Hola, I'm Alan. I have about 13 years of experience with Ruby and am originally from Seattle, where Starbucks was born. Coffee flows through my veins! My favorite drink is a sugar-free vanilla oat milk latte.
00:01:10.960 Hola, I'm Fito. I also have 13 years of experience with Ruby and Rails. I originally hail from Asunción, Paraguay, but I’ve lived in the San Francisco Bay Area for the past 13 years. One thing I love just as much as Ruby is coffee; my favorite is a Ghirardelli dark chocolate mocha. If you're ever in the Bay Area, be sure to get one—they're delicious!
00:01:38.439 Alan and I work together at a company you might not expect us to be at, given that this is a conference for Ruby and Rails developers. We work for Cisco Meraki, which is the largest Rails shop you've probably never heard of. We've been friends for a long time and have collaborated at three different companies over the last 10 years. We've encountered a wide variety of codebases of different sizes, and we spend our weekends writing code and drinking coffee together.
00:02:21.800 Alan, you grew up around coffee, didn't you?
00:02:39.000 Yes, I did! As a kid, I was fascinated by mechanical coffee machines. You would drop in a coin, listen for the clink, make your choice, and the machine would whirr to life with hissing and clicking sounds. The delightful end of the process was the glorious aromatic black liquid splashing into the cup. Magnifico! These days, I'm more captivated by the inner workings of software—Fito and I both are. However, similar to those coffee machines, there is often hidden complexity in code.
00:03:06.400 But that complexity doesn’t manifest overnight. How many of you have worked on a greenfield or brand-new application? Wow, many hands! How did that feel? A pleasure, yes? Now how many of you have worked on a legacy application? Even more hands! How did that feel? Why is that?
00:03:26.599 We believe it relates to complexity. At some point, codebases cross a threshold beyond which you have two choices: you can either live with the growing complexity, deluding yourself that it will magically disappear, or you can pause, reflect on the design, and reorganize the code to regain momentum in development. We've witnessed organizations choose both paths, and invariably, those who choose to live with the complexity end up frustrated. Occasionally, this frustration leads people to blame Ruby and Rails—but it’s not Ruby's fault or Rails' fault; it’s the complexity itself.
00:04:16.320 What we’re going to show you is how to take the second path: removing complexity and falling back in love with Ruby and Rails. We’re going to build a coffee machine, add several features, observe how complexity infiltrates the program with each commit, and finally reorganize the code so that we can add new features without increasing complexity.
00:04:54.199 Are we ready? Okay, let’s get started! Fito, do you want to build a little coffee machine?
00:05:18.919 Yes, let's do it! So, how does complexity creep into software? The answer, of course, is one commit at a time—it sneaks its way in. I'll move through the slides pretty fast to show the shape of the code as it evolves. As the code grows longer, the font size will shrink, so don’t worry if you can’t read it all. We’re skipping tests here, but in reality, we would never do this without them. For the sake of time, we won’t be writing them.
00:05:43.240 Let’s look at the first commit in our coffee machine. At this point, the machine does just one thing—it serves coffee. First, it dispenses a cup, heats the water, prepares the grounds, dispenses the hot water into the cup, and finally disposes of the grounds. This works great, but we received feedback that not everyone likes coffee, so to increase our sales, we’re going to add tea.
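The slide code is not part of this transcript. As a rough illustration, the first commit might look something like the sketch below. The class name, the method name `vend`, and the `steps` array (which records each step in place of real hardware) are all assumptions made for this example.

```ruby
# Hypothetical sketch of the first commit: the machine only serves coffee.
class CoffeeMachine
  attr_reader :steps

  def initialize
    @steps = [] # stands in for real hardware: records each step that runs
  end

  # `vend` is an assumed name for the main method.
  def vend
    dispense_cup
    heat_water
    prepare_grounds
    dispense_water
    dispose_of_grounds
    steps
  end

  private

  def dispense_cup
    steps << :dispense_cup
  end

  def heat_water
    steps << :heat_water
  end

  def prepare_grounds
    steps << :prepare_grounds
  end

  def dispense_water
    steps << :dispense_water
  end

  def dispose_of_grounds
    steps << :dispose_of_grounds
  end
end

machine = CoffeeMachine.new
machine.vend
```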
00:06:08.240 In this commit, we added a conditional to determine whether to serve coffee or tea. However, in doing this, we introduced some duplication; the calls to 'dispense cup', 'heat water', and 'dispense water' are duplicated for both beverages. So naturally, we would dry it up. Here's the DRY version of the code.
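To make the difference concrete, here is a hedged sketch of both versions side by side. The method names, the `dispense_tea_bag` step, and the order of the tea steps are assumptions; the speakers' actual code may differ.

```ruby
# Hypothetical sketch: the duplicated conditional next to its DRY rewrite.
class CoffeeMachine
  attr_reader :steps

  def initialize
    @steps = []
  end

  # Second commit: one branch per beverage, with duplicated calls.
  def vend_duplicated(drink)
    if drink == :coffee
      dispense_cup
      heat_water
      prepare_grounds
      dispense_water
      dispose_of_grounds
    elsif drink == :tea
      dispense_cup
      heat_water
      dispense_tea_bag
      dispense_water
    end
    steps
  end

  # Third commit: shared calls appear once, differences hide in conditionals.
  def vend_dry(drink)
    dispense_cup
    heat_water
    prepare_grounds if drink == :coffee
    dispense_tea_bag if drink == :tea
    dispense_water
    dispose_of_grounds if drink == :coffee
    steps
  end

  HARDWARE_STEPS = %i[
    dispense_cup heat_water prepare_grounds dispense_tea_bag
    dispense_water dispose_of_grounds
  ].freeze

  # Stand-ins for hardware calls: each one records its own name.
  HARDWARE_STEPS.each { |step| define_method(step) { steps << step } }
  private(*HARDWARE_STEPS)
end
```

Both methods run the same steps for each drink. The DRY version is shorter, but neither recipe can be read top to bottom any more without mentally evaluating the conditionals.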
00:06:41.239 Now that we have coffee and tea in production, we're starting to receive feedback. The most frequently requested feature is for us to add sweetener. Let’s do that! Here we added sweetener just after dispensing the hot water. Of course, not everyone wants sugar, so we’ll make it an optional ingredient. We push this out and our customers like it; now they want cream or milk.
00:07:08.039 We already have a pattern for optional ingredients, so we’ll dispense the cream right after dispensing the sugar, also as an optional choice. We push this feature out, and customers enjoy it. For the next feature, it turns out some folks don’t appreciate either coffee or tea, so we’re going to offer them something else, like cocoa or hot chocolate.
00:07:36.639 Here, we followed the existing pattern and added cocoa to the main conditional. There’s no need to add milk or sugar, as the cocoa powder we use is already sweet and creamy. So, we’ll exclude those optional ingredients when a customer requests cocoa. Finally, who doesn’t like whipped cream on their cocoa? Heck, I even like it on my coffee, so let’s add it.
00:08:03.319 Whipped cream is an optional ingredient that no one wants on tea, so we will add it below the other optional ingredients and exclude it if the selected drink is tea. Here we are: we are now seven commits into this little codebase, and we already have nine conditionals in one method. At this stage, it remains relatively simple to understand and work with, especially if you are the only one engaging with it.
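For readers without the slides, the method after these seven commits might resemble the following sketch. Everything here is an assumption reconstructed from the description above: the keyword arguments, the step names, and the exact shape and number of conditionals will not match the speakers' code line for line.

```ruby
# Hypothetical sketch of the main method after seven commits.
class CoffeeMachine
  attr_reader :steps

  def initialize
    @steps = []
  end

  def vend(drink, sweetener: false, cream: false, whipped_cream: false)
    dispense_cup
    heat_water
    if drink == :coffee
      prepare_grounds
    elsif drink == :tea
      dispense_tea_bag
    elsif drink == :cocoa
      dispense_cocoa_mix
    end
    dispense_water
    # Optional ingredients, each guarded by the drinks that exclude it.
    dispense_sweetener if sweetener && drink != :cocoa
    dispense_cream if cream && drink != :cocoa
    dispense_whipped_cream if whipped_cream && drink != :tea
    dispose_of_grounds if drink == :coffee
    steps
  end

  HARDWARE_STEPS = %i[
    dispense_cup heat_water prepare_grounds dispense_tea_bag dispense_cocoa_mix
    dispense_water dispense_sweetener dispense_cream dispense_whipped_cream
    dispose_of_grounds
  ].freeze

  # Stand-ins for hardware calls: each one records its own name.
  HARDWARE_STEPS.each { |step| define_method(step) { steps << step } }
  private(*HARDWARE_STEPS)
end
```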
00:08:37.520 However, if you're part of a team, procedural code like this simply does not scale. Future developers will likely extend what we have, continually adding more conditionals with each new feature, and the complexity will skyrocket. We are moving fast, releasing features. Our little coffee machine has proven so successful that it was just purchased by a big national soup chain, and they want us to add soup to our machines. This is going to introduce a substantial amount of complexity into our code.
00:09:05.760 Let's pause here and evaluate our current state before trying to add any more features. Alan, can you take us through that?
00:09:31.880 We’ve reached an inflection point in the life of the coffee machine. But how do we know that it’s time to pause and reflect? The first hint is that we had to reduce the font size to fit the whole method on the screen. Method length is definitely an indicator of complexity.
00:09:50.680 Sandi Metz, author of 'Practical Object-Oriented Design in Ruby' (which you should read if you haven't yet), has a rule about this: methods should be no more than five lines long. In addition to method length, we also need to consider method complexity, a quantitative measurement of how difficult it is to comprehend a piece of code.
00:10:02.880 Our preferred metric for method complexity is called the Assignments, Branches, and Conditionals metric, or ABC. The higher the number, the more challenging it is to understand that code. We use a gem called Flog by Ryan Davis to measure this for us. Flog computes the ABC score for each method in the application.
00:10:32.760 But how do we determine what constitutes a good score? Way back in 2008, a gentleman named Jake Scruggs, who wrote about metrics, provided these benchmark numbers. To this day, they are still the only references we've been able to find online, and they work effectively to help us simplify our software.
00:10:53.440 Let's revisit our little coffee machine and examine its complexity over time. Pay particular attention to those green numbers. A score of 0 to 10 is considered excellent, 11 to 20 is good enough. In our first commit, we achieved a complexity score of 5.5—fantastic! The churn number you see here simply represents the total number of commits to that file, which we'll discuss later.
00:11:13.600 Adding a conditional with duplicated code raised the complexity to 14.6, moving us out of the 'awesome' zone but still staying below 20, which is still acceptable. When we removed the duplication, the complexity dropped back down to 10.9, which might seem favorable. However, this is when things begin to go awry.
00:11:42.320 Notice how we’ve entangled the two algorithms in a way that makes it harder to see how you're brewing coffee or steeping tea. This also sets a precedent for future developers: each new beverage will add more conditional logic to the same method. Flog reports lower complexity because there are fewer assignments, branches, and conditionals, but it cannot see that we’ve interwoven the algorithms. This serves as a valuable lesson: there is no single magic metric that can illuminate the best path in every situation.
00:12:18.520 Instead, we have tools like Flog that can inform our decision-making process. Pay close attention to how it feels to add new features to your application. If it feels slow or burdensome, stop. Take a moment to pause and evaluate your design.
00:12:48.880 Next, we added sweetener, bringing our complexity score up to 13.5. Then we added cream, pushing it further to 16, which is still in the good enough range. Adding cocoa raised the complexity to 21.3, just above the acceptable line. Let’s examine where the last feature takes us.
00:13:05.600 Adding whipped cream pushed the complexity score to 25.5. Now, looking at the trend line, we can see that complexity has reached a point where it’s curving upward, and we are well beyond the good enough threshold.
00:13:27.520 So, how do you know when it’s time to pause and rethink your design? First, method length: methods should stay at five lines or fewer. Second, method complexity: a Flog score of 20 or below is acceptable, while anything above 60 indicates dangerous complexity. Finally, trust your intuition: if developing new features feels sluggish, it’s probably time to take a moment to reflect on your design.
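The score bands mentioned in the talk can be written down as a small helper. This is only a sketch: `flog_rating` is a hypothetical name, the label for the 20 to 60 band is invented here, and the talk does not say how a score between 10 and 11 (such as 10.9) should be rated, so the cut-offs below are an assumption.

```ruby
# Hypothetical helper that maps a Flog score to the bands quoted in the talk.
def flog_rating(score)
  if score <= 10
    :awesome
  elsif score <= 20
    :good_enough
  elsif score <= 60
    :pause_and_rethink # label invented for this sketch
  else
    :dangerous
  end
end

# Per-commit scores quoted in the talk, in order.
scores = [5.5, 14.6, 10.9, 13.5, 16, 21.3, 25.5]
ratings = scores.map { |score| flog_rating(score) }
```

With these cut-offs, the first commit rates as awesome, the middle commits as good enough, and the last two commits (cocoa and whipped cream) fall into the band where it is time to pause.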
00:14:10.000 Now that we've reached the inflection point, Fito’s going to address the issue of reorganizing our method before we add more features. We've crossed over Jake’s acceptable threshold and mixed three algorithms, which sounds grave—can we turn this around? Yes, we can do it right now.
00:14:36.000 Here’s the method as we left it moments ago. However, we must be cautious; too much early DRYness can steer us in the wrong direction. We’ll undry this code, introducing back some duplication to identify any missing abstractions hiding in plain sight. We refer to this practice as rehydration.
00:15:10.280 Before we proceed, it's crucial to note that you cannot do any of this without tests—without tests, you're guaranteed to break your code. While we won't be writing tests here for time’s sake, ensuring good test coverage is the essential first step to reducing complexity.
00:15:33.520 We highly recommend a tool called SimpleCov to verify that we've thoroughly tested each line of code. As you can see, we have reached 100% line coverage, which is excellent. We also have achieved 100% branch coverage, confirming we're testing both cases of every conditional code structure.
00:15:57.920 Achieving 100% branch coverage is vital before ‘rehydrating’ your code, as it ensures you won’t inadvertently alter the behavior of the code during refactoring. Now, here’s where we left our code previously, and we’ve confirmed our tests adequately support us, so we’re ready to rehydrate it.
00:16:17.840 After rehydration, our code will look like this. This intentionally increases duplication, but it’s a necessary step to help pinpoint any missing abstractions. Now we can clearly delineate each recipe, and since there’s no overlap among the algorithms, we can safely extract each recipe into separate polymorphic classes.
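A rehydrated method could look roughly like the sketch below, where each recipe is spelled out in full and the only remaining top-level decision is which drink was requested. The step names, keyword arguments, and step order are assumptions made for illustration.

```ruby
# Hypothetical sketch of the rehydrated method: one complete recipe per drink.
class CoffeeMachine
  attr_reader :steps

  def initialize
    @steps = []
  end

  def vend(drink, sweetener: false, cream: false, whipped_cream: false)
    case drink
    when :coffee
      dispense_cup
      heat_water
      prepare_grounds
      dispense_water
      dispense_sweetener if sweetener
      dispense_cream if cream
      dispense_whipped_cream if whipped_cream
      dispose_of_grounds
    when :tea
      dispense_cup
      heat_water
      dispense_tea_bag
      dispense_water
      dispense_sweetener if sweetener
      dispense_cream if cream
    when :cocoa
      dispense_cup
      heat_water
      dispense_cocoa_mix
      dispense_water
      dispense_whipped_cream if whipped_cream
    end
    steps
  end

  HARDWARE_STEPS = %i[
    dispense_cup heat_water prepare_grounds dispense_tea_bag dispense_cocoa_mix
    dispense_water dispense_sweetener dispense_cream dispense_whipped_cream
    dispose_of_grounds
  ].freeze

  # Stand-ins for hardware calls: each one records its own name.
  HARDWARE_STEPS.each { |step| define_method(step) { steps << step } }
  private(*HARDWARE_STEPS)
end
```

The method is longer, but each `when` branch now reads as a candidate class.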
00:16:36.320 As you can see, we moved each recipe into its own class. Now we have one class for coffee, one for tea, and one for cocoa. This arrangement has several significant advantages: each algorithm is now separate, so if you need to modify one for instance, fixing a bug, the risk of introducing unintended side effects into another algorithm is far lower.
00:17:09.600 Moreover, the main method is now considerably simpler. You may have noticed duplication between the classes; every class contains calls to 'dispense cup', 'heat the water', and 'dispense water'. We actually want that duplication; it simplifies understanding the complete algorithms since the entire procedure for each drink is encapsulated within its respective beverage class.
00:17:34.560 Our goal isn't to eliminate duplication entirely but to ensure that each method is only implemented once. Ruby offers numerous options for achieving this, such as using modules, composition, or inheritance, where methods are placed in a base class. In this instance, we’ll likely go with inheritance, placing those methods in a base class.
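One way the inheritance option might look is sketched below. The base class name `Beverage`, the `prepare` method on each subclass, and the step names are assumptions. Note that the calls are duplicated across recipes while each step is implemented only once, in the base class.

```ruby
# Hypothetical base class: every hardware step is implemented exactly once.
class Beverage
  attr_reader :steps

  def initialize(sweetener: false, cream: false, whipped_cream: false)
    @sweetener = sweetener
    @cream = cream
    @whipped_cream = whipped_cream
    @steps = []
  end

  HARDWARE_STEPS = %i[
    dispense_cup heat_water prepare_grounds dispense_tea_bag dispense_cocoa_mix
    dispense_water dispense_sweetener dispense_cream dispense_whipped_cream
    dispose_of_grounds
  ].freeze

  # Stand-ins for hardware calls: each one records its own name.
  HARDWARE_STEPS.each { |step| define_method(step) { steps << step } }
  private(*HARDWARE_STEPS)
end

# Each subclass holds one complete recipe, readable from top to bottom.
class Coffee < Beverage
  def prepare
    dispense_cup
    heat_water
    prepare_grounds
    dispense_water
    dispense_sweetener if @sweetener
    dispense_cream if @cream
    dispense_whipped_cream if @whipped_cream
    dispose_of_grounds
    steps
  end
end

class Tea < Beverage
  def prepare
    dispense_cup
    heat_water
    dispense_tea_bag
    dispense_water
    dispense_sweetener if @sweetener
    dispense_cream if @cream
    steps
  end
end

class Cocoa < Beverage
  def prepare
    dispense_cup
    heat_water
    dispense_cocoa_mix
    dispense_water
    dispense_whipped_cream if @whipped_cream
    steps
  end
end
```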
00:18:04.720 We’re nearing completion, yet we have two remaining issues. First, the main method handles multiple responsibilities, and second, it isn’t truly open for extension because we have to modify the code to extend it. Currently, its only true responsibility should be to prepare a beverage.
00:18:33.000 However, it also determines which class to instantiate based on user input, which is the duty of a factory. Therefore, let’s go ahead and introduce a factory class. Here, we’ve extracted the class instantiation into a factory; its sole purpose is to choose which class to construct based on the selected drink.
00:19:01.280 Now the main method possesses only one responsibility: to prepare beverages. It’s now open for extension, meaning we will never need to modify it again when we add new functionality to the coffee machine. The main method is now said to be 'open for extension, but closed for modification.'
00:19:24.680 However, you may have noticed that the open/closed problem has just shifted to the factory. There’s also a more subtle issue: the build method in the factory may return nil if it receives an unknown drink, and calling prepare on nil raises an undefined method error. Let's address that first.
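A conditional-based factory of the kind described here might be sketched as follows. The names `BeverageFactory` and `vend` are assumptions, and the beverage classes are reduced to stubs that return a string so the example stays short.

```ruby
# Minimal stand-ins for the beverage classes; the real recipes are omitted.
class Coffee
  def prepare
    "coffee"
  end
end

class Tea
  def prepare
    "tea"
  end
end

class Cocoa
  def prepare
    "cocoa"
  end
end

# Hypothetical factory: its only job is to pick the class for a drink.
class BeverageFactory
  def self.build(drink)
    case drink
    when :coffee then Coffee.new
    when :tea then Tea.new
    when :cocoa then Cocoa.new
    end # no `else`, so an unknown drink yields nil
  end
end

# The main method now has a single responsibility: preparing a beverage.
class CoffeeMachine
  def vend(drink)
    BeverageFactory.build(drink).prepare
  end
end
```

In this sketch, `CoffeeMachine.new.vend(:soup)` raises `NoMethodError`, because the `case` falls through and `prepare` is called on `nil`.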
00:19:47.520 We can do this by introducing the Null Object pattern. So with the factory, by default, it returns a No Beverage class, which is simply a class with a prepare method that does nothing. It might also be beneficial to notify or log an error when an unknown drink is selected, but the goal is to prevent throwing an exception.
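A minimal sketch of the Null Object idea, assuming a factory named `BeverageFactory` and a null class named `NoBeverage` (the talk calls it a "No Beverage class"; the exact identifier is a guess). Only one real beverage is stubbed in to keep the example short.

```ruby
# Stub beverage; the real recipe is omitted.
class Coffee
  def prepare
    "coffee"
  end
end

# Null object: responds to `prepare` but deliberately does nothing.
class NoBeverage
  def prepare; end
end

class BeverageFactory
  def self.build(drink)
    case drink
    when :coffee
      Coffee.new
    else
      # Optional: report the unknown drink instead of raising.
      warn "Unknown drink: #{drink.inspect}"
      NoBeverage.new
    end
  end
end

class CoffeeMachine
  def vend(drink)
    BeverageFactory.build(drink).prepare
  end
end
```

Callers never need a `nil` check: every object the factory returns responds to `prepare`.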
00:20:12.480 Now, regarding the second problem with the factory: it's still not entirely open for extension. To add a new beverage, we will have to modify that factory class and add new conditionals. We can resolve this by replacing the conditional-based factory with one based on conventions and light meta-programming.
00:20:57.200 The advantage of this approach is that it’s largely open for extension. It is not fully open, though: there could come a time when we want a class name that doesn’t adhere to this convention, at which point we would have to modify the factory, which would breach the open/closed principle. There are other, fully open/closed approaches we would be happy to discuss offline, although they go beyond the scope of this talk.
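The transcript does not show the convention the speakers chose. One plausible sketch, assuming beverage classes live in a `Beverages` module and that a drink name like `:coffee` maps to the class `Beverages::Coffee`, uses `Module#const_get`:

```ruby
# Assumed convention: every beverage class lives in this namespace.
module Beverages
  class NoBeverage
    def prepare; end
  end

  class Coffee
    def prepare
      "coffee"
    end
  end
end

class BeverageFactory
  def self.build(drink)
    # :coffee -> "Coffee", :hot_chocolate -> "HotChocolate"
    class_name = drink.to_s.split("_").map(&:capitalize).join
    klass = begin
      # `false` restricts the lookup to Beverages itself, so a drink named
      # :string cannot resolve to the top-level String class.
      Beverages.const_get(class_name, false)
    rescue NameError
      # Unknown or malformed names fall back to the null object.
      Beverages::NoBeverage
    end
    klass.new
  end
end
```

Looking constants up inside a dedicated namespace is a design choice in this sketch: it keeps user input from instantiating arbitrary classes.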
00:21:24.720 For now, we will stick with this factory, as it suffices for our case. Let’s examine our complexity once more. It’s crucial to revisit the graph that Alan previously showed us. Here’s where we left off after adding whipped cream.
00:21:46.200 Next, we rehydrated the code, which temporarily increased the complexity compared to the DRY solution. However, the DRY solution hid the fact that some abstractions were missing. So, we are willing to live with this temporary increase in complexity.
00:22:04.760 Following that, we pulled the algorithms into their own polymorphic classes, which drastically reduced complexity again. Finally, we extracted a factory object, leading complexity to drop even lower than it has ever been. We are now back in the ‘awesome’ range, and the main method will never require changes again.
00:22:36.760 We've created several small classes. The coffee machine now maintains a low complexity of 3.9, with the factory at 6.2, while the three beverage classes are slightly more complex yet still comfortably within acceptable limits.
00:22:57.760 Ultimately, we needed to add soup to our coffee machine without increasing complexity further. Here's where we left off. Adding soup did not necessitate any changes to the existing files—it simply works as long as the soup class is loaded into memory.
00:23:27.680 The takeaway here is that integrating a new beverage now involves adding just one more class and its corresponding tests. Let’s examine the code again: one version without soup and one version with soup. Nothing else changed.
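As a sketch of what "nothing else changed" can mean in practice, the example below assumes a convention-based factory that looks classes up in a hypothetical `Beverages` module. The machine and factory are defined once; soup arrives later as a single new class.

```ruby
# Existing code (assumed shape): namespace, factory, and machine.
module Beverages
  class NoBeverage
    def prepare; end
  end
end

class BeverageFactory
  def self.build(drink)
    klass = begin
      Beverages.const_get(drink.to_s.capitalize, false)
    rescue NameError
      Beverages::NoBeverage # unknown drinks fall back to the null object
    end
    klass.new
  end
end

class CoffeeMachine
  def vend(drink)
    BeverageFactory.build(drink).prepare
  end
end

# Before soup exists, the null object quietly does nothing.
without_soup = CoffeeMachine.new.vend(:soup) # => nil

# The only new code: one class, normally in its own file with its own tests.
module Beverages
  class Soup
    def prepare
      "soup"
    end
  end
end

# Same machine, same factory, no edits to either.
with_soup = CoffeeMachine.new.vend(:soup) # => "soup"
```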
00:23:59.679 You may be thinking that this is a very small greenfield application, and that your own applications are undoubtedly much larger and more intricate. So, what lesson can you take home about where to start when you return to work?
00:24:25.040 So far, we’ve mainly discussed method complexity within a single method. However, how can we assess the complexity of an entire application and recognize where to begin simplifying things? You can employ complexity along with churn to create a map of problem areas within your code.
00:24:47.760 As previously mentioned, complexity gauges how difficult it is to understand a segment of code, while churn represents how frequently you change that code. We like to think of complexity in terms of it being the pain you will experience the next time you revisit that file.
00:25:10.320 Meanwhile, churn is how often you voluntarily inflict that pain on yourself. To effectively use churn and complexity together, you can map your complexity and churn on a graph like this, using tools such as Code Climate or the RubyCritic gem.
00:25:37.680 This type of chart was originally put forth by Michael Feathers, author of 'Working Effectively with Legacy Code,' another book we highly recommend. We rely on charts like this to pinpoint areas of our codebase that require interventions.
00:26:05.120 To interpret the chart, let’s break it down into quadrants. The lower left quadrant is the 'pain-free zone,' where files are easy to alter and understand. In a healthy application, most files would belong in this area. Conversely, the upper right quadrant is the ‘painful zone,’ where files become hard to understand, intricate to change, and susceptible to regression.
00:26:30.800 Identifying files that fall within this quadrant helps you determine areas to avoid when adding new code. Instead of adding code to those files, it’s preferable to extract code from them, creating simpler classes with low churn scores, thus enabling you to bring them down into the green pain-free zone.
00:26:48.400 Moving on, the lower right quadrant contains low-complexity files that change frequently. These might simply be data masquerading as code. If that’s the case, try to pull the ever-changing configuration data out of the codebase. You could store it in an external JSON file or a database, which reduces churn.
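As an illustration of moving frequently edited data out of code, the sketch below uses drink prices. Prices are not part of the talk's coffee machine; they, the constant name, and the file path in the comment are invented purely for this example.

```ruby
require "json"

# Before: every price change means another commit to this Ruby file.
PRICES_IN_CODE = { "coffee" => 2.5, "tea" => 2.0 }.freeze

# After: the Ruby file only knows how to read prices; the data lives elsewhere.
def load_prices(json_text)
  JSON.parse(json_text)
end

# Stands in for something like File.read("config/prices.json").
prices_json = '{"coffee": 2.5, "tea": 2.0, "cocoa": 3.0}'
prices = load_prices(prices_json)
```

Adding cocoa's price changed only the JSON data, so the Ruby file's churn stays flat.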
00:27:05.040 Finally, in the upper left quadrant, we see high-complexity files that change infrequently. These files often exhibit terrifying algorithms but are not causing pain due to limited modifications.
00:27:28.960 The advice here is to leave these files alone—they’re not inflicting pain and provide no impetus for change. If you do decide to modify them, ensure solid test coverage to minimize the risk of breaking anything.
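The four quadrants can be expressed as a small classifier. This is a sketch only: the cut-off values and quadrant labels are hypothetical, and in practice the churn and complexity numbers would come from a tool rather than being typed in by hand.

```ruby
# One row per file: churn is the number of commits, complexity is a Flog-style score.
FileStats = Struct.new(:path, :churn, :complexity)

# Hypothetical cut-offs; pick values that suit your own codebase.
CHURN_LIMIT = 10
COMPLEXITY_LIMIT = 20

def quadrant(file)
  high_churn = file.churn > CHURN_LIMIT
  high_complexity = file.complexity > COMPLEXITY_LIMIT

  if high_churn && high_complexity
    :painful              # upper right: extract code from these files
  elsif high_complexity
    :complex_but_stable   # upper left: leave alone unless you must change it
  elsif high_churn
    :simple_but_churning  # lower right: possibly data living in code
  else
    :pain_free            # lower left: where most files should be
  end
end

files = [
  FileStats.new("app/models/order.rb", 42, 75.0),
  FileStats.new("lib/tax_tables.rb", 30, 4.2),
  FileStats.new("lib/legacy_parser.rb", 2, 90.1),
  FileStats.new("app/models/cup.rb", 3, 5.5)
]
by_quadrant = files.group_by { |file| quadrant(file) }
```

The file names and numbers above are made up for illustration.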
00:27:50.880 These quadrants serve as helpful guidelines in theory, but reality exhibits more complexity. The red line on the graph indicates the pain threshold. Any files within the pink zone should be handled carefully, as they are resistant to change and highly susceptible to regression.
00:28:11.920 Files like these need the most attention; they are extremely complex, and frequent modifications increase the likelihood of serious issues. Gradually extracting hidden abstractions from this class can significantly simplify your overall application.
00:28:40.560 However, there’s no need to tackle all of the complexity at once. Instead, focus on improving your code incrementally, making small adjustments each time you touch a file. You might want to delay addressing highly complex code until you have practiced the techniques we've discussed.
00:29:07.600 To summarize the journey of our little coffee machine: we illustrated how complexity crept in, how we recognized it, and how we subsequently eliminated it. We will close with some key takeaways and a bit of homework.
00:29:26.760 First, remember that complexity will inevitably seep into your code; it can happen one commit at a time. Thus, maintain vigilance. Pay particular attention to conditionals—they might indicate objects striving to escape your grasp. Keep in mind that ‘DRY’ pertains to method implementation, not invocation.
00:30:02.680 Second, recognize the complexity before it becomes painful. Keep your methods succinct, monitor your complexity scores, and trust your intuition. If your methods exceed five lines, or your Flog scores breach 20, or if it simply feels slower, pause and reflect on your design.
00:30:29.440 Lastly, if you find yourself in a complex area with pain caused by complexity, remember you can step away from it. Utilize polymorphism and factories to facilitate the addition of new features without altering existing files. You can also rehydrate your code and reinstate duplication to help identify those essential abstractions.
00:30:54.560 Now, here’s what you can do with this information: first, ascertain your average method complexity using Flog. Second, identify which file in your application experiences the most churn; it’s probably a problematic area. Third, pinpoint which class requires the most attention; it likely has the highest churn and complexity. Confirm your suspicions.
00:31:28.960 Don't hesitate to reach out to us; feel free to tweet or give us a toot! We are more likely to see your message if you toot, as we aren’t very active on Twitter. Additionally, you're welcome to subscribe to my blog. If you’re interested in exploring that code, you can find it on GitHub with tests included. We also encourage you to check out some of our other projects, including RubyFlog, a VS Code extension that displays the Flog score for the selected text, current method, or average score for the entire class.
00:31:59.920 That’s all from us. Thank you!