Data-Driven Refactoring

by Coraline Ada Ehmke

In the presentation titled "Data-Driven Refactoring," Coraline Ehmke discusses the importance of refactoring in software development, particularly in Ruby applications. Ehmke, a lead engineer at Instructure, emphasizes that with increasing technical debt and feature friction, developers must leverage data-driven approaches for refactoring.

Key points from the talk include:
- Definition of Refactoring: Ehmke broadens the definition of refactoring beyond just methods to include entire systems, emphasizing the need for a well-defined refactoring strategy.
- Reasons to Refactor: Practical advantages for refactoring include improving performance, reducing bugs, easing the addition of new features, and enabling faster onboarding of new developers.
- The Role of Testing: Ehmke underscores the importance of using tests as guardrails during refactoring efforts. She advocates for the creation of temporary tests to validate assumptions about code functionality and guide the refactoring process effectively.
- Testing Methodologies: Techniques such as boundary testing and attribute testing are discussed. The use of approval tests (golden master testing) is introduced as a way to validate output consistency after changes are made.
- Metrics for Improvement: Ehmke urges developers to track metrics that reflect their refactoring goals, such as test suite runtime and feature-to-bug-fix ratio. She highlights tools like Code Climate and Rake notes that help measure code quality over time.
- Practical Applications: The talk shares insights from Instructure, where they developed a tool called Pandometer to aggregate various metrics for monitoring quality improvements on a commit-by-commit basis.

In conclusion, Ehmke encourages developers to create refactoring strategies aligned with their specific goals using data-driven methods. She asserts that meaningful refactoring contributes not only to code quality but also to a positive work environment for developers, making code functional, beautiful, and maintainable. This comprehensive approach is essential for evolving codebases in a strategic manner.

00:00:28.560 Our next speaker is Coraline Ehmke. I had the privilege of meeting her two years ago at Madison Ruby. She's given lots of awesome talks since then, and we're lucky to have her here with us today.
00:00:49.120 Okay, so I'm here to talk to you today about data-driven refactoring.
00:00:55.120 Your cyclomatic complexity is going through the roof; you're in danger of flunking out of Code Climate. RuboCop has issued a warrant for your arrest. There's whispered talk of declaring bankruptcy on your technical debt. You're being asked to add new features, but you know the underlying code base is as unhealthy as a sinking ship. So what do you do?
00:01:06.479 You refactor. That's basically my job. I'm Coraline Ada Ehmke, or Coraline Ada on Twitter. You can catch up with all the stuff that I'm doing at everywhere.coraline.codes. I'm a lead engineer at a company called Instructure here in Salt Lake City, but I'm originally from Chicago. I lead a team called the Developer Happiness Team; it was originally called the Refactoring Team but was renamed to emphasize our focus on making developers happy.
00:01:22.960 Part of our team’s mission is to make developers happy. We had to ask ourselves: what makes developers happy? Writing good code efficiently and effectively, and feeling good about the work that we do makes us all happy. Our mission is to create a code base that is a delight to work in, as the CEO told me, and that’s how I operate every single day. Refactoring is one of the ways we can make our developers happier. So, we looked at what refactoring is all about.
00:02:04.560 When I use the term refactoring, I mean it in a slightly broader sense than Michael Feathers: refactoring not just methods but entire systems, whole applications or ecosystems of applications, while still starting at the low level of the method. We should first ask ourselves: why do we want to refactor? Maybe the code base is high friction: a change in one place forces changes in many others, the application is becoming less performant, and the code is unclear and hard to test.
00:02:31.519 We might deal with a lot of Heisenbugs, bugs that change their behavior when you look at them. High cognitive complexity and dense nests of conditionals can come into play, along with cognitive dissonance: we look at a method and its class and they no longer tell the same story. Maybe the method has drifted away from its original intent, and you're afraid to change it because you don't know what's going to break. Names are set in stone and very hard to change later, leading us into a semantic shell game.
00:03:08.480 We can't afford to blow everything up, even though we'd love to burn it all down and green field every application we touch; it's not going to happen. But we can envision a better future, one where our code is not just functional and successful but also elegant and beautiful. Beauty is a proxy for intuitiveness, and elegance is a proxy for maintainability. I believe the primary drive of software developers is an aesthetic sense: we want to write code that is functional, beautiful, and maintainable.
00:03:39.200 However, I cannot go to my manager and say, 'This code is ugly; I want to make it beautiful.' Managers don't really support spending time and money just for beauty. So we need to provide some good practical reasons to refactor, some practical advantages we can share. If we don't set out specific goals and make practical decisions, then refactoring becomes a futile exercise, like combing a Wookiee: pointless, and with no sense of direction.
00:04:30.720 What are some legitimate reasons to refactor? Maybe we want to improve performance. Perhaps that one controller action generating a mile-wide object graph isn't the most efficient use of system resources. We can look at making some processes that are currently synchronous asynchronous to improve performance or reduce the number of database calls. All of these steps can positively impact the user experience, which managers understand is essential.
00:06:05.280 We also want to reduce the number of bugs in our code. As developers, we have a complicated relationship with bugs; they represent holes in our reasoning or logic, and we feel embarrassed when someone else finds them. But we shouldn't. Every developer in this room generates bugs; they're a natural part of coding. However, more complex code often leads to a higher proportion of bugs, meaning anything we do to simplify and streamline our code can help us reduce that bug count—a measurable advantage we can present to management.
00:07:43.680 Another goal is to reduce the cost of adding new features. As the code of our application becomes more complex and entangled, we face increased resistance when trying to add features; the time to implement gets longer and longer. We can easily measure how many features are being added each sprint and the time taken to implement them. Promising shorter implementation times is a great way to win friends on the product team.
00:08:06.880 In our industry, we're bringing in a lot of junior developers, and someone's talked today about the idea of delivering on day one. I believe that the sooner a new developer can start delivering meaningful code changes, the better. If you have a complex system with intricate dependencies, it takes longer for new hires to understand it and overcome the fear of making changes. Simplifying code and the relationships between components can positively impact new developers’ ramp-up times.
00:08:49.679 Refactoring allows us to preserve valuable information encoded in our systems. Applications that have been around for years contain edge cases and a lot of institutional knowledge that—while not documented—could be crucial for maintaining business processes. We cannot afford to lose that information. Refactoring extends the life of existing code and preserves the investment of time and money that the company has put into its systems.
00:09:56.560 Next, I’d like to discuss how we can use tests to drive a refactoring effort. Typically, when we write tests, we do so for validation—ensuring the code works as expected, documenting edge cases, and providing guidance for future developers. In contrast, refactoring primarily uses tests as guardrails to keep you from going off track during your refactoring efforts. These tests challenge and validate your assumptions about how the code functions.
00:10:26.079 It's important to note that these are not tests you want to keep in your application long-term; they may be generative or have flickering failures. Remember, these are throwaway tests, created for a specific moment in time. Michael Feathers is said to have put it this way: 'If you're refactoring without writing tests, you're just changing.' I love that quote. Without tests, you can't demonstrate that you haven't broken something or made it worse.
00:11:45.840 If you have no existing tests around a piece of functionality you want to change, you need to write those tests before doing anything else. It’s a good idea to run those tests by a more experienced team member who can help identify what the tests are covering. This way, you document your assumptions about how the code works and make informed changes.
00:12:42.640 One effective testing methodology is boundary testing. We should be doing unit and integration tests to some degree. In boundary testing, we generate extremes of the input domain and use those as data to run through our tests, focusing on key boundary values like nil, zero, one, and infinity. For instance, we can take a simple class that multiplies two numbers and apply boundary tests to gauge how well it handles extreme values.
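The transcript doesn't show the actual code, but the idea can be sketched like this. The `Multiplier` class and the specific boundary values are illustrative, not taken from the talk:

```ruby
# A trivial class we want to refactor: it multiplies two numbers.
class Multiplier
  def multiply(a, b)
    a * b
  end
end

# Boundary testing: probe the extremes of the input domain rather than
# typical values -- zero, one, negatives, and very large numbers.
m = Multiplier.new
boundaries = [0, 1, -1, 2**64, Float::MAX]

boundaries.each do |x|
  boundaries.each do |y|
    result = m.multiply(x, y)
    # For multiplication we can state simple boundary properties:
    raise "zero law violated for #{x}, #{y}" if (x == 0 || y == 0) && result != 0
    raise "identity violated for #{y}" if x == 1 && result != y
  end
end

# nil sits just outside the input domain; the boundary test documents
# that the current code raises rather than handling it.
begin
  m.multiply(nil, 2)
  raise "expected nil input to raise"
rescue NoMethodError
  # assumption confirmed: no input validation exists yet
end
```

Running these probes documents how the class behaves at the edges before any refactoring begins.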
00:13:53.920 In testing this class, we can implement a guardrail test by capturing the original algorithm within a lambda inside our test file. We would then run our example to verify that the refactored code produces the same results as the original algorithm. By running this multiple times with random inputs, we can check for consistency and validate our assumptions. If we encounter a failure, that indicates our initial assumption about how the code functioned was incorrect, which is valuable for guiding our revisions.
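A minimal sketch of that guardrail pattern, again using a hypothetical `Multiplier` class rather than the talk's actual code:

```ruby
# Guardrail test: freeze the original algorithm in a lambda inside the
# test file, then compare the refactored code against it on random input.
original = ->(a, b) { a * b }  # copied verbatim from the pre-refactoring code

class Multiplier
  # stand-in for the method after refactoring
  def multiply(a, b)
    a * b
  end
end

m = Multiplier.new
100.times do
  a = rand(-1_000..1_000)
  b = rand(-1_000..1_000)
  expected = original.call(a, b)
  actual   = m.multiply(a, b)
  # A mismatch means our assumption about the old behavior was wrong,
  # which is exactly the information that should guide the revision.
  raise "behavior changed for (#{a}, #{b})" unless actual == expected
end
```

Because the lambda is a verbatim copy of the old code, a failure here points at a behavioral difference, not a stylistic one.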
00:15:03.680 Additionally, we want to test values just outside the boundaries to ensure the code fails consistently. You can wrap your original call in a `begin/rescue` block to capture its output and then run the same test on your new method, expecting identical handling of edge cases. The tests we write during this phase act as guardrails, helping to guide our refactoring efforts while we eventually discard them once we achieve the desired outcomes.
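One way to express that "fails consistently" check in Ruby (the lambdas and inputs here are hypothetical stand-ins for the old and new implementations):

```ruby
# Capture the outcome of a call: its return value, or the error class
# it raises, so error behavior can be compared like any other output.
def outcome
  yield
rescue => e
  e.class
end

original   = ->(a, b) { a * b }  # the pre-refactoring algorithm
refactored = ->(a, b) { a * b }  # stand-in for the rewritten method

# Values just outside the boundaries: the refactored code must fail in
# exactly the same way the original did.
[[nil, 2], [:sym, 1], [Object.new, 3]].each do |a, b|
  old_result = outcome { original.call(a, b) }
  new_result = outcome { refactored.call(a, b) }
  unless old_result == new_result
    raise "edge case (#{a.inspect}, #{b.inspect}) handled differently"
  end
end
```

Recording the error class instead of letting the exception propagate lets failure modes be compared with a plain equality check.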
00:16:03.920 Another useful testing technique is attribute testing, which evaluates the state of an object after a series of actions. For instance, if we have a coin class that can return heads or tails, we want to ensure that tossing it 1,000 times yields heads roughly half the time: more than 400 times, but fewer than 600. These tests validate our assumptions about the behavior of the coin class without needing to understand its precise internal operations.
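A sketch of that coin example; the `Coin` class here is a guess at the talk's code, not a transcription of it:

```ruby
# Attribute testing: assert on aggregate behavior without inspecting
# internals.
class Coin
  def toss
    rand(2).zero? ? :heads : :tails
  end
end

coin = Coin.new
heads = 1000.times.count { coin.toss == :heads }

# A fair coin should land heads roughly half the time; the exact count
# is random, but the "fairness" attribute bounds it.
raise "suspiciously few heads (#{heads})" unless heads > 400
raise "suspiciously many heads (#{heads})" unless heads < 600
```

The test never looks inside `toss`; it only constrains the statistical attribute the class is supposed to have, which is exactly what makes it useful as a refactoring guardrail.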
00:17:38.080 One intriguing tool for guiding refactorings is the approval test (also known as golden master testing). I use a gem called 'approvals', developed by Katrina Owen, that facilitates this process. You can write specs to verify the output of a class initialization. The first time you run it, the output won't match any established standard, so you approve the current state, establishing a golden master. Moving forward, when you modify the class and run the tests again, you can quickly identify any discrepancies, highlighting changes that might have downstream consequences.
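The gem automates this workflow; the underlying idea can be hand-rolled in a few lines. The `Report` class and file name below are hypothetical, not the gem's API:

```ruby
# Golden-master (approval) testing in miniature. The first run records
# the output as the approved version; later runs compare against it.
class Report
  def render
    "header\nbody line 1\nbody line 2\n"
  end
end

APPROVED_FILE = "report.approved.txt"
output = Report.new.render

if File.exist?(APPROVED_FILE)
  approved = File.read(APPROVED_FILE)
  # Any drift from the golden master flags a change with possible
  # downstream consequences; a human decides whether to re-approve.
  raise "output differs from the golden master" unless output == approved
else
  # No golden master yet: inspect the output by hand, then approve it
  # by saving it as the new standard.
  File.write(APPROVED_FILE, output)
end
```

This works even for large, messy output (rendered pages, serialized object graphs) where writing assertions by hand would be impractical.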
00:18:23.920 As we validate our assumptions and prepare for refactoring, remember that refactoring without a plan is equivalent to combing a Wookiee. To commit to meaningful improvements, we have to have measurable criteria for evaluating our progress. One of the easiest metrics to collect is the time it takes for your test suite to run. It might be a shocking number that makes you want to cry, but committing to reduce that runtime can serve as an effective measure of progress.
00:19:35.440 Another metric to consider is the feature-to-bug-fix ratio: how much of your sprint planning is devoted to fixing bugs versus implementing new features. Shifting that ratio toward features can have a dramatic impact on code quality and developer productivity. Code metrics and static analysis tools can help you establish a baseline of code quality and monitor changes over time; without measuring your initial state, you can't prove progress on reducing complexity, duplication, or coupling.
00:20:50.000 You might also generate a code smell catalog with a gem like 'reek,' which identifies various issues within your application. This gives you an opportunity to prioritize particular smells and systematically address them across your code base. Moreover, actively engaging with your developers is vital; we conducted a survey at Instructure, soliciting feedback on parts of the code that caused the most pain and insight into areas that hindered productivity.
00:21:47.200 It’s crucial to recognize that the metrics you derive from your refactoring strategy should reflect the needs and goals of your particular team and organization. What is essential for our engineering organization may not apply universally. Metrics can vary significantly across industries, from financial services prioritizing accuracy to gaming development focusing on speed.
00:22:46.720 We have many metric tools at our disposal as Rubyists; while some are advantageous, others can be problematic based on their use. A central issue with many code metrics tools is that they provide isolated snapshots of data that only show where the code stands at a specific moment. This can help identify files that are problem areas but does not allow for effective tracking of refactoring efforts over time.
00:23:25.760 Additionally, tools that give letter grades can present issues. If a developer sees a class with an 'A,' they might feel there’s no need for further inspection. Conversely, an 'F' can be disheartening, suggesting that the class is unmanageable. Rather than relying solely on grades, obtaining raw data allows for more personalized assessment of code quality.
00:24:36.000 I have many opinions about test coverage; for example, I'm wary of projects requiring 100% coverage, as it can lead to superficial practices like monkey patching to meet arbitrary standards. Many companies push for complete coverage, but this can foster poor coding practices in place of genuine quality assurance. Thus, it’s essential to evaluate the effectiveness of coverage requirements and how to balance meaningful metrics with actual software quality.
00:25:47.920 In an effort to improve this situation, I suggest using tools like Code Climate, which can provide useful data while being mindful of their shortcomings. For example, the GPA over time metric is particularly useful for assessing whether conditions are improving or deteriorating within a codebase. However, some metrics, like complexity ratings and qualitative assessments, can cause confusion and may ultimately provide limited actionable insights.
00:26:37.280 Moreover, tools like 'Churn' can effectively identify files changing frequently, allowing you to investigate the underlying causes. You can find correlations with bug reports to zero in on problematic areas. Rake notes is another tool that can help surface todos and comments, giving insight into whether developers are able to focus effectively on writing code.
00:27:37.760 Fukuzatsu, a non-opinionated code complexity tool that calculates cyclomatic complexity, is another useful example. It generates machine-readable JSON output, which facilitates a variety of further analyses.
00:28:03.680 Another of these tools, Society, analyzes class relationships and visualizes dependencies, letting you see the couplings between classes. Again, the value of JSON output is that we can work with these metrics over time: by pooling the results in a database, we can build systems for visualizing the data and clarifying our progress.
00:29:07.600 To illustrate this, at Instructure we built a tool called Pandometer. It aggregates metrics, including data from Society and Fukuzatsu among others, to provide insight into quality improvements on a commit-by-commit basis. As you drill down into individual commits, you see trends and histories, making it easier to respond to quality issues as they arise.
00:30:50.560 It took us several weeks to create this tool, but the important takeaway is that gathering quality metrics, tracking them over time, and using that visibility to improve code quality are all possible with well-planned strategies and systems. Going forward, be deliberate about identifying the quality attributes your team needs to focus on and finding ways to measure improvement.
00:32:28.880 If you don’t agree with my recommendations, that's fine! The key is to create a refactoring strategy that aligns with your goals based on your data and influences how you intend to proceed to enhance the quality of your code. Ultimately, we can create a world where our code is functional, beautiful, successful, maintainable, and extensible. Refactoring is how we can achieve that.
00:33:09.479 Thank you.