Keith Pitty
Loving Legacy Code
Summarized using AI

In his talk "Loving Legacy Code" at RubyConf AU 2015, Keith Pitty emphasizes the importance of changing our attitudes towards legacy code, arguing that it should be respected and improved rather than feared. He begins by acknowledging the common disdain for legacy code, often defined as code that lacks tests or impedes change. However, he suggests expanding this definition to include code with poor internal design or cumbersome testing frameworks.

Key Points Discussed:

  • Defining Legacy Code: Pitty starts from Michael Feathers' definition of legacy code as code without tests, then broadens it to any code that impedes change.
  • Understanding Developers' Pain Points: Legacy code often accumulates technical debt, leading to challenges such as confusing readability and frustrating test suites.
  • Opportunities for Improvement: Developers can view legacy code as an opportunity for rehabilitation; by identifying customer pain points and leveraging tools like New Relic or Rollbar, they can improve both user experience and code quality.
  • Importance of Version Control: Pitty stresses the necessity of utilizing version control (e.g., Git) in managing codebases.
  • Testing and Refactoring: The introduction of automated tests is crucial for safe refactoring. He encourages the focus on unit tests and emphasizes the importance of breaking dependencies in the code.
  • Simplifying Design: Following principles from Kent Beck and Corey Haines, simplification and elimination of duplication are highlighted as key practices for improving code quality.
  • Positive Attitude Towards Legacy Code: He advocates for a collaborative approach among team members and for developers to share knowledge, thus easing the burden that comes with legacy code.
  • Management's Role: Acknowledging the resource restrictions that come with managing legacy code is essential. Managers should budget for maintenance and respect developers' efforts to improve the codebase.
  • Metaphors for Legacy Code: Pitty concludes with the metaphor of gardening, where maintaining a healthy codebase requires ongoing care and can often be rewarding if approached correctly.

Conclusions and Takeaways:

  • Legacy code shouldn’t be avoided; rather, it can provide avenues for growth and improvement. The respect for original developers, the understanding of legacy code history, and the cooperative spirit among teams can lead to a more favorable relationship with legacy systems. Overall, it is possible to find fulfillment in tending to legacy code by actively improving it and sharing knowledge within the development team.
00:00:00.120 So, I guess you might be sitting there wondering about the title of this talk. Perhaps you're thinking to yourselves: is he seriously going to talk to us about loving legacy code? Surely, there's a mistake! But, just hold your fire.
00:00:14.219 I know you may be thinking, "How on Earth did this talk get selected?" But it did. So let me ask you this: let's have a show of hands. How many of you love legacy code? Okay, thank you! I'm going to assume that, well, we are in Australia after all, so all of you who raised your hands did so ironically. But that's okay because, if I'm honest with myself, I don't really love legacy code either.
00:00:55.680 However, I do think it deserves more respect. After all, it's probably paying for your wages right now, and I believe it deserves more love. But let me be clear: I'm not here to convince you that you should unreservedly love legacy code all the time. Rather, I'd simply like to share with you some ideas that may help you love legacy code more.
00:01:09.240 So, to begin with, how should we approach this topic? Let's start with: what is legacy code? After all, how do we define it? Is it just old, crufty stuff? Or is it, as Michael Feathers suggested about ten years ago, code without tests? Now, if I think about that, I think, yes, that would qualify a code base as legacy because it impedes change.
00:01:36.540 But I think there are other ways that a code base can impede change. For example, if it has poor internal design or a cumbersome test suite. So, I prefer to define legacy code as code that impedes change in any way. Another question to ponder is: why do we react so negatively when faced with legacy code? Could it be the accumulation of technical debt over a long period of time? Definitely.
00:02:14.700 As Band-Aid solutions accumulate, they lead to poor internal design. Classes and methods can become too large, complex, and unwieldy, making it very challenging to decipher what the code actually does. Readability is so important, isn't it? Our test suites can also be more frustrating than helpful. Test suites may take hours to run, and we may have tests that pass sometimes and fail at other times for no apparent reason. As developers, we typically do not have the opportunity to add the latest shiny technology to our CV when working with legacy code.
00:02:55.680 But I'm going to suggest to you that there's another path to developer happiness. Sure, exploring new technology is fun, but imagine a poor, neglected code base that's been left to rot, and you have the opportunity to nurse it back to good health by giving it some much-needed tender loving care. Sure, it can be frustrating. It can indeed be character-building, and in some cases, that code may be beyond help. It may have become so decrepit and pungent that the kindest thing to do is to put it out of its misery. But many legacy code bases are nowhere near that bad.
00:03:37.860 We'll talk about how we can nurse them back to health shortly, but let's pause before we do that and consider the original developers. Are they still around? If so, it may pay you to ask them some questions about the history of the code base. You might be surprised at what you learn. Even if you don't have that opportunity, it's important to respect those original developers: understand why they made the choices they did, and learn from the history of the code base.
00:04:05.640 Let us move on and consider how we can improve the legacy code that we're dealing with. A logical place to start is with identifying some pain points. Now, you might think, "Well, let's start with the code." I'd suggest that rather than doing that, we should make use of some customer feedback, whether it's in the form of requests for new features or reports of problems. We can use that opportunity to improve the code at the same time.
00:04:55.260 Then again, customers won't always tell us about the problems that affect them, will they? So in this case, tools like New Relic or Rollbar will help us identify problems that affect them, and again, at the same time, we can improve the quality of the internal code as we attend to those problems. I'm sure we're all familiar with the scenario where a developer is frustratedly cursing out loud as they try to understand what a particular piece of code does. Well, that's an ideal opportunity for that developer to improve the code so that a developer in the future has a better time trying to make sense of it.
00:05:23.460 As well, we've got to be mindful that practically, sometimes developers are under pressure to get fixes out into production, and they're not necessarily going to pause to put in place those measures to improve the code. But we do have code quality tools that can help identify areas of the code that need some tender loving care. So consider that we've identified some pain points. How do we go about improving the code? What techniques can we use? Well, sadly, we can't always assume that the code base is under version control. Seriously, I have come across this in my time.
00:06:14.699 Even after the advent of Git, I've seen a Rails application that I was asked to work on that was NOT under version control. If that's the case with your code base, get it into a Git repository as soon as you can. Thinking back to Michael Feathers' definition of legacy code again, if the code base has no tests, this is a showstopper too, isn't it? Because, as we know, to refactor safely, you'll need to have automated tests.
00:06:50.760 As we're introducing those tests, we might also think of introducing feature tests and end-to-end tests. Is that the best approach? I'd say that we should concentrate more on unit tests. Of course, that generates the question: what is a unit test? There's plenty of debate about that question. I recently read an interesting book by Jay Fields called "Working Effectively with Unit Tests" in which he distinguishes between what he calls solitary unit tests and sociable unit tests. The solitary ones are truly isolated and enable us to get fast feedback; they are the most useful, so we should aim for as many of those as possible in order to facilitate developer flow.
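The distinction between solitary and sociable tests can be sketched in plain Ruby. This is only an illustration: the `Invoice`, `TaxCalculator`, and `FixedTax` classes are hypothetical and do not come from the talk or from Jay Fields' book, and a real suite would express the checks with a test framework rather than bare local variables.

```ruby
# Hypothetical domain classes, invented for illustration only.
class TaxCalculator
  RATE_PERCENT = 10

  # Amounts are integer cents to keep the arithmetic exact.
  def tax_for(amount_cents)
    amount_cents * RATE_PERCENT / 100
  end
end

class Invoice
  # The collaborator is injected so a test can substitute it.
  def initialize(subtotal_cents, tax_calculator: TaxCalculator.new)
    @subtotal_cents = subtotal_cents
    @tax_calculator = tax_calculator
  end

  def total_cents
    @subtotal_cents + @tax_calculator.tax_for(@subtotal_cents)
  end
end

# Hand-rolled stub standing in for the real collaborator.
class FixedTax
  def initialize(tax_cents)
    @tax_cents = tax_cents
  end

  def tax_for(_amount_cents)
    @tax_cents
  end
end

# Sociable check: Invoice and the real TaxCalculator are exercised together,
# so a bug in either class could make it fail.
sociable_total = Invoice.new(10_000).total_cents

# Solitary check: the collaborator is replaced by a stub, so only Invoice's
# own logic is exercised and a failure points at one class.
solitary_total = Invoice.new(10_000, tax_calculator: FixedTax.new(500)).total_cents
```

The solitary check stays fast and focused because it never touches the real collaborator. The price is that something else has to confirm the two classes actually work together.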
00:07:45.300 Another example of that is configuring Guard to automatically rerun tests that are affected by the code changes we make. As we're introducing unit tests, as Michael Feathers emphasizes, we're going to need to break some dependencies in the code. That will have the useful side effect of improving the internal quality of the design. Having introduced some tests, or maybe we inherited a legacy code base that already has a large suite of tests, we're going to need to nurture those tests.
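The Guard setup mentioned above could look roughly like the following Guardfile. The talk does not show its configuration, so this is a sketch that assumes the guard-rspec plugin and a conventional layout with code under `lib/` and `app/` and specs under `spec/`; the path mappings would need adjusting for a real project.

```ruby
# Guardfile (assumes the guard and guard-rspec gems are in the Gemfile)
guard :rspec, cmd: "bundle exec rspec" do
  # A changed spec file reruns itself.
  watch(%r{^spec/.+_spec\.rb$})

  # A changed file under lib/ or app/ reruns its matching spec.
  watch(%r{^lib/(.+)\.rb$}) { |m| "spec/lib/#{m[1]}_spec.rb" }
  watch(%r{^app/(.+)\.rb$}) { |m| "spec/#{m[1]}_spec.rb" }

  # A changed spec helper reruns the whole suite.
  watch("spec/spec_helper.rb") { "spec" }
end
```

Running `bundle exec guard` then keeps a watcher open in a terminal, which pairs well with a suite made mostly of fast, isolated tests.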
00:08:48.639 We need to constantly reassess the value of each test. We shouldn't just keep accumulating them. We should think of each test in business terms: is it providing sufficient value for the cost that it requires to keep maintaining that test? In some cases, we'll need to remove tests, and it turns out that, in my experience, feature tests are a good example of tests that we should consider removing if they're becoming too troublesome for us.
00:09:40.920 After all, they take longer to run and tend to be more unreliable, and it's interesting to note that in Jay Fields' book, he recommends that for an application, we should have no more than a dozen smoke tests. We'll also need to nurture our continuous integration (CI) builds. For example, if we've inherited a test suite that takes hours to run, we should split it up into several steps and run them in parallel.
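One way to decide how to split a slow suite across parallel CI steps is to balance the steps by recorded run time. The sketch below is an assumption about how that could be done, not something shown in the talk: `split_into_buckets` is a hypothetical helper, and the file names and timings are made up.

```ruby
# Greedy split: take the files slowest first and always add the next one
# to whichever bucket currently has the smallest total run time.
def split_into_buckets(durations, bucket_count)
  buckets = Array.new(bucket_count) { { files: [], total: 0 } }

  durations.sort_by { |_file, seconds| -seconds }.each do |file, seconds|
    lightest = buckets.min_by { |bucket| bucket[:total] }
    lightest[:files] << file
    lightest[:total] += seconds
  end

  buckets
end

# Hypothetical timings in seconds, e.g. gathered from a previous CI run.
timings = {
  "spec/features/checkout_spec.rb" => 300,
  "spec/features/signup_spec.rb" => 200,
  "spec/models/order_spec.rb" => 120,
  "spec/models/user_spec.rb" => 100,
  "spec/lib/pricing_spec.rb" => 80
}

buckets = split_into_buckets(timings, 2)
buckets.each_with_index do |bucket, index|
  puts "step #{index + 1} (#{bucket[:total]}s): #{bucket[:files].join(' ')}"
end
```

Each bucket's file list can then be passed to a separate CI step. The greedy approach is not guaranteed to be optimal, but it is simple and usually gets the steps close to even.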
00:10:43.380 If we encounter tests that fail unexpectedly or pass inconsistently, we should take them out of the normal build stream and address them within a reasonable timeframe, either fixing them so they consistently pass or removing them. All right! So we now have a picture of a situation where we have some tests in place and can refactor. As we do so, we will have to make judgments about which refactorings will bring us the most benefit.
00:11:06.780 After all, there is rarely a budget that gives us the opportunity to refactor and make improvements to our heart's content. We also need to be mindful of separating concerns and understand that Rails doesn't necessarily guarantee a good separation of concerns. For example, thinking of Rails model classes, here’s an example of a Rails class that's accumulated many methods over time. This class is from a seven-year-old Rails application.
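Taking inconsistent tests out of the normal build stream can be done with RSpec metadata filtering. The talk does not say which mechanism was used, so treat this as one possible sketch: the `:quarantine` tag and the `RUN_QUARANTINED` environment variable are names invented here.

```ruby
# spec/spec_helper.rb
RSpec.configure do |config|
  # Skip quarantined examples in the main build; a separate job opts in
  # by setting the (hypothetical) RUN_QUARANTINED environment variable.
  config.filter_run_excluding quarantine: true unless ENV["RUN_QUARANTINED"]
end

# spec/features/report_export_spec.rb (hypothetical spec file)
RSpec.describe "Report export", :quarantine do
  it "produces a CSV download" do
    # Flaky example body, left in place until it is fixed or deleted.
  end
end
```

A second CI job could then run `RUN_QUARANTINED=1 bundle exec rspec --tag quarantine`, which keeps the flaky examples visible without letting them turn the main build red.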
00:11:37.260 It has way too many methods—104 to be precise—so it's a safe bet that this is a class that should be analyzed and refactored, with a lot of its methods moved out to perhaps service classes. But it's also worth pondering what the best practices are. I mean, that situation probably arose because of the so-called best practice that was recommended several years ago of having skinny controllers and fat models.
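Moving behaviour out of an oversized model into a service class might look like the sketch below. The class from the talk is not shown in the transcript, so `Order`, `OrderCheckout`, and `RecordingGateway` are hypothetical, and the model is written as a plain Ruby object rather than an ActiveRecord class so that the example runs on its own.

```ruby
# Hypothetical model, reduced to the data and state it owns.
class Order
  attr_reader :line_items, :status

  def initialize(line_items)
    @line_items = line_items
    @status = :pending
  end

  def mark_paid
    @status = :paid
  end
end

# Behaviour that might once have been one more method on the model now
# lives in a small class with a single public entry point.
class OrderCheckout
  def initialize(order, payment_gateway:)
    @order = order
    @payment_gateway = payment_gateway
  end

  # Returns true when the charge succeeds and the order is marked paid.
  def call
    total_cents = @order.line_items.sum { |item| item[:price_cents] * item[:quantity] }
    return false unless @payment_gateway.charge(total_cents)

    @order.mark_paid
    true
  end
end

# Stand-in gateway that records the amount it was asked to charge.
class RecordingGateway
  attr_reader :charged

  def charge(amount_cents)
    @charged = amount_cents
    true
  end
end

order = Order.new([
  { price_cents: 1_500, quantity: 2 },
  { price_cents: 250, quantity: 4 }
])
gateway = RecordingGateway.new
succeeded = OrderCheckout.new(order, payment_gateway: gateway).call
```

Because the gateway is passed in, the service can be tested in isolation, and the model sheds a method and a dependency at the same time.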
00:12:28.799 But it's worth thinking about the reality that what’s considered a best practice today may not be optimal in several years. When it boils down to it, we're aiming to make our code easier to work with. To do this, we should endeavor to simplify design. This brings us to a set of design rules originally codified by Kent Beck, one of my software heroes, and more recently revisited in Corey Haines’ book, "Understanding the Four Rules of Simple Design." It's also worth noting that the third rule, which concerns the elimination of duplication, is about knowledge duplication rather than just code duplication.
00:14:07.300 Corey alludes to several blog posts by JB Rainsberger, and in one of those posts, JB distilled those four rules down to a guiding principle: if we strive to remove duplication and improve names in small cycles, we will end up with better code and usually less code to maintain, which is a win.
00:14:54.900 Another subtle point is that it’s easy to underestimate the satisfaction of transforming bad code into good code. A job well done means we’ve made the code easy to understand, test, and generally in a better state for future developers. Other opportunities arise when working with legacy code as well. There are good things you can get involved with.
00:15:37.620 For example, in a team, discuss what your team considers good internal design guidelines or tags. Perhaps you can engage with members of your organization outside of the development team in business units to examine a subsystem or feature that's proving a little troublesome and jointly arrive at a better design. These are challenging and rewarding opportunities that arise when working with legacy code, which don't necessarily appear when working on greenfield projects.
00:16:35.460 Of course, tools can be of great benefit to the fallible brains of developers. At Blake E-Learning, where I work, we're making quite good use of Code Climate, which is proving useful for us not just in measuring code quality but also test coverage. There are open-source tools to consider, like RuboCop. As I mentioned before, your team might decide on internal code guidelines, and you can configure RuboCop to provide feedback to developers as they work.
00:17:27.360 If you check out the Ruby Toolbox site in the code metrics category, you'll find more examples of helpful open-source tools. It's also worth considering going the extra mile and including tools like RuboCop in your CI pipeline to provide feedback every time you do a build.
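Including RuboCop in a CI build can be as simple as a Rake task that the CI server invokes. This Rakefile is a sketch under assumptions: the `ci` task name and the Minitest-style `test/` layout are illustrative choices, not details from the talk.

```ruby
# Rakefile (assumes the rake and rubocop gems are installed)
require "rake/testtask"
require "rubocop/rake_task"

# Runs the test files under test/.
Rake::TestTask.new(:test) do |t|
  t.libs << "test"
  t.pattern = "test/**/*_test.rb"
end

# Runs RuboCop and fails the task when offenses are found.
RuboCop::RakeTask.new(:rubocop) do |task|
  task.fail_on_error = true
end

# Hypothetical aggregate task for the CI server to call.
task ci: %i[rubocop test]
```

With this in place the CI server only needs to run `bundle exec rake ci`, and the team's agreed guidelines live in the project's `.rubocop.yml`.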
00:18:23.520 Okay, let's switch focus to managers. Obviously, there are limits on how much can be invested in maintaining and improving legacy applications, but we need a balanced approach. Adequate maintenance requires time and effort, technology does need to be updated and upgraded, and technical debt needs to be repaid. It all takes time.
00:19:38.640 So, don't plan with a blind spot; budget for maintenance, and be aware that sometimes a legacy code base may become too difficult to work with, triggering the decision to migrate away gradually. Hopefully, the new code base will have a better history.
00:20:42.060 Now, I'm guessing that many of you in the audience will be familiar with a lot of the suggestions I've made. Still, I think there's an even more important aspect: the attitude we bring to working with legacy code. Staying with a manager's perspective, we must acknowledge the time and effort required and remember that tools are not always open source and therefore require investment.
00:21:01.200 For example, your developers may have struggled to a point where they decide that Jenkins is just not cutting it anymore as a CI server. They might evaluate a new solution and think it's worth the investment, such as a tool called Buildkite. If this happens, it's important for you as managers to show respect for your developers, as they attempt to improve both the business and the code.
00:21:59.340 One way to respect them is by acknowledging that working with legacy code can indeed be wearing. Therefore, share that production support role around the team, enabling everyone to gain a good appreciation of the code base and its challenges. Now, developers, it's your turn: how can your attitude improve this situation? Sharing knowledge among the team is important, especially if some of you have accumulated a wealth of knowledge.
00:22:57.420 Make the effort to share that knowledge throughout the team because that's going to help everyone. Speaking of collaboration, I can recall back in the 1980s when I had the opportunity to use an approach called structured walkthroughs to review code. Fast forward to today, and thankfully we have GitHub pull requests, which offer a fantastic opportunity to collaborate while developing features.
00:23:30.720 So it's worth pondering: are you making the best use of GitHub pull requests? Are you using them to their full potential? I've been focusing mainly on the past and the present, but it’s also worth considering the needs of future developers when working with legacy code.
00:23:54.840 As we’ve mentioned before, if you commit changes that lead to breaking the build on CI, the temptation may be to simply rebuild and hope it goes green next time. While that may work in the short term, addressing non-deterministic failures is crucial for future developers.
00:24:22.680 Moreover, we should be mindful of the language we use when working with legacy code bases. For example, saying "What were they thinking when they wrote this?" is rarely helpful. Remember, when they wrote the code that we’re now struggling with, they did it with the best of motives.
00:25:14.340 Life throws various circumstances that affect people when they work with code, so reflecting on the constraints present since the first line of code was written in the application is important. We, as programmers, often seek perfection, but we have to remind ourselves no code base is perfect.
00:26:00.000 We should appreciate that our managers often have a better perspective on the business and how the code supports that business. It’s pertinent to use them as allies, rather than seeing them as adversaries, and respect that they're under pressure too.
00:26:43.140 Finally, let's turn to some metaphors that may assist with our attitudes toward legacy code. We've talked about nursing an application back to health, and many of us may relate to having a garden at home. I don't know how many of your gardens are in absolutely pristine condition, but I know mine isn’t. However, I do manage to keep it from becoming horribly overgrown with weeds.
00:27:42.480 If we apply this gardening metaphor to software, we can recognize the need for balance, as there might be calls for a new finish, remodeling, or simply repairing what’s broken. There’s going to be a need for compromise, and the same is true of software. We cannot pretend that we will fix every flaw all at once. Overall, attitude is very important, and hopefully, I have convinced you that working with legacy code needn’t be a prospect that you want to run away from.
00:28:40.560 It can be rewarding if you approach it with a helpful set of techniques and the right attitude. Consider those who have been involved with the code so far, those working with it now, and those who will be affected by it in the future. Respecting these individuals will enable us to develop a better relationship with the code.
00:29:09.660 Thank you for listening. I hope that by sharing some of these ideas, I have shed some light on how you can learn to love legacy code more.