RailsConf 2021

Rescue Mission Accomplished

Rescue Mission Accomplished

by Mercedes Bernard

In the talk titled "Rescue Mission Accomplished" presented at RailsConf 2021, Mercedes Bernard addresses the challenge of assessing and stabilizing failing legacy Rails projects. The primary focus is on the practical steps needed to evaluate whether to refine or rewrite a problematic application. The speaker emphasizes understanding the application's current state and context. Key points discussed include:

  • Assessing the Legacy Code: Start by trying to run the project locally and documenting the technology stack to understand its maintenance status.
  • Evaluation of Technical Debt: Acknowledge that incurring some technical debt during the project's lifecycle can be acceptable for meeting key business objectives but must be balanced to avoid long-term problems.
  • Understanding Stakeholders' Context: It's critical to communicate with stakeholders to grasp the business reasons behind previous decisions that may have led to the current technical challenges.
  • Audit and Analyze the Code: Conduct thorough audits of the code, including security assessments, test coverage evaluation, and data integrity checks to identify weaknesses. Utilizing tools for static analysis can help gauge code quality and technical debt.
  • Incremental Improvements vs. Rewrites: The talk illustrates that many teams opt for rewrites out of frustration, often underestimating the complexity and time needed. Bernard argues that most legacy applications can be salvaged with suitable refactoring.
  • Creating a Stabilization Roadmap: Construct a prioritized list of tasks that should be addressed to stabilize the application before adding new features. Key tasks include implementing CI/CD pipelines, upgrading languages and dependencies, and enhancing test coverage.
  • Communicate with Stakeholders: Regularly update stakeholders about the stabilization plan and educate them on the importance of these tasks, aligning the discussion with their priorities to ensure buy-in.
  • Final Recommendations: Advocate for code quality while keeping business constraints and perspectives in mind. Aim for incremental investments to maintain a stable codebase to avoid future failures.

Bernard concludes by highlighting the importance of gradual, methodological approaches in refactoring projects and communicating effectively with all stakeholders involved. Success in stabilizing legacy applications lies in thoughtful planning, understanding the historical context of decisions, and prioritizing long-term quality over quick fixes.

00:00:05.359 Hello, today we're going to talk about how to assess legacy Rails projects that are in rough shape. We will figure out if we should save them and how to do it. This talk is designed to be accessible for anyone who is faced with legacy code and wants to figure out incremental steps to refactor it into something that is a delight to work with. There may be some things you don't understand if you're really early in your career, but most of what we discuss will be useful regardless of your level. My name is Mercedes Bernard, my pronouns are she/her, and I'm the VP of Delivery at a digital consultancy in Chicago called Tandem.
00:00:29.519 If you are someone who likes to follow along with slides, I've put them up on my website. The speaker notes will include everything I'm saying, which may help you follow some of the links and tools I'll be sharing. Now, story time: I had a recent client bring me four code repositories that were all in rough shape. They were undocumented and couldn't reliably be set up in a local development environment. They weren’t containerized, and there were multiple required dependencies that weren't managed with any sort of package manager. You had to know you needed them and then install and configure them manually. The reference data and seeds were out of date and produced many errors when you tried to run them.
00:00:53.760 Additionally, there were rake tasks sprinkled all over the place for adding data, and you couldn't tell if you needed them or if they had just been put there for one-off use cases. None of these repositories had any CI or CD, and there was no documented way to deploy them. Anytime a change was made, a lot of unintended side effects were introduced. Consequently, the client lacked any trust in any change or deployment and tried to avoid them at all costs. Two of these repositories were core to their business and had many users onboarded, generating revenue consistently. The other two weren’t as vital and had started as proofs of concept. Therefore, the client wanted these four repositories stabilized so they could start adding new features and continue building off what they had. Their main concern was cost, as they were a startup with a short runway trying to raise another round of funding.
00:02:11.700 They didn't have a lot of money to invest, so they were looking for the most cost-effective way to get their system to a place where they could market it to investors and continue to grow their business. We had to decide whether we could salvage these projects or if we would have to tell them that they weren't worth saving and should be scrapped. For what we could salvage, we needed to figure out the best way to achieve that, balancing both cost and long-term stability. This talk is practical, not philosophical. We'll focus on making these decisions and completing rescue missions. All of the ideas we’ll cover are relevant to any tech stack, but the tools I'll share are specifically for Rails projects.
00:03:07.080 So, if you are given a rescue mission and are trying to figure out whether to stabilize this codebase or start over, the best thing you can do is to understand the context and current state of the system. It's impossible to find the best solution moving forward without as full an understanding of the problem as you can get. Codebases never become legacy through intentional bad decisions. There are always constraints and limitations imposed on the humans who wrote the code before you that you may not be aware of. To build your empathetic understanding of the factors that led to the current state of the project, take the time to talk to stakeholders to understand the business and financial context, and how they have or haven't changed over time.
00:03:44.340 Maybe the company needed to invest in marketing to find customers, which meant they couldn't afford a large enough team to work on the code as needed. Perhaps they had to reduce the size of the team and layoffs meant that much of the knowledge departed with those individuals. They might have only been able to hire contractors who didn’t thoroughly understand their business, or their team could have been made up of early-career developers who lacked experience in the tech stack and had no mentorship to guide them. All of their choices were the best they could make with the information they had available, or perhaps the business moved in a new direction, and everyone is still learning about the new position and industry.
00:04:44.700 The last three scenarios occurred during my client's rescue mission. A group of contractors built the projects when they were in a greenfield state, and then a couple of early-career developers inherited them and did their best. Afterward, the startup pivoted when they gained a better understanding of their market, and the entire organization was figuring out what that meant. It's also beneficial to understand the main stakeholders’ priorities when trying to figure out what decisions led to where we are today. The project management triangle is a common heuristic for understanding constraints in almost all types of projects, not just software, and you may have heard it before: good, fast, cheap—pick two.
00:05:29.340 If a team or organization consistently prioritizes the same two constraints, it can lead to problems in the gap left by the third constraint. For example, if you always choose good and fast, you might quickly run out of money. If you always choose good and cheap, you may never make it to market in time to capitalize on your idea. Should you decide to always choose fast and cheap, that could result in severe quality issues. This is why it's okay to incur some technical debt on projects occasionally. Sometimes the best decision for the business is to choose fast and cheap to meet timeline or budget constraints, and deciding how to balance project management constraints is a core part of managing and paying down technical debt.
00:06:13.740 However, for my recent rescue mission, it was clear that the organization did not balance project constraints. They always chose to prioritize cost and timeline at the expense of product quality. This led to a 'kick the can down the road' strategy until the projects reached a point where they needed rescue. Taking the time to understand the business and the decisions behind the current state of the application allows you to remember the humans who built it and is key to developing an empathetic understanding. But now we are going to focus on the understanding part of the rescue.
00:06:51.180 When you inherit code that you didn't write—especially entire applications or systems, not just features—it can be daunting, and it can be hard to know where to start. When you need to understand the code, identify its weak spots, and decide how to stabilize it, it can be overwhelming. When faced with a rescue project, these are the steps I take to understand the scope of the mission. Full disclosure, this can be time-consuming. First, try to get the project up and running locally. In a best-case scenario, the project is containerized, and this is a five to ten-minute process. However, I've never encountered a best-case scenario on a rescue project.
00:08:02.940 Hopefully, some documentation exists to get you started; if not, you'll have to rely on industry-standard practices for the language and framework you are working in. In a Rails project, you know you'll start with 'bundle install.' This isn't always enough, but if you cannot get the project running in 30 minutes without documentation, that's a valuable data point indicating that the issues in the codebase might be bigger than anyone expected due to unknown unknowns. Then, look up and document the language and framework version that the project is running on. For a Rails project, check what version of Ruby the project is using by inspecting both the Ruby version file and your Gemfile. What is the maintenance status of that version? Is it supported? Is it going to reach end of life soon? Has its end of life date already passed? The Ruby maintenance branches documentation will help you determine the current maintenance status of the various Ruby branches.
00:09:29.339 Next, check the same things for Rails. Look in your Gemfile or Gemfile.lock to see what version you're running, and check the maintenance status for that. RubyGems has a list of all the Rails versions released, Rails Guides explains the Rails maintenance policy, and the Rails blog includes info on what's in each release in case you have to upgrade. The endoflife.date website provides a simple table to track end of life and version support. If there are other important packages and dependencies, take a look at those too. For example, what Postgres version is the app running on? You can also use 'bundle outdated' to find gems that have updates available. You won't necessarily have to upgrade all the gems, but it can give you a good sense of where the project stands, especially if you pay attention to critical dependencies when deciding what needs to be upgraded.
00:10:54.540 When we inherited our rescue mission, the Rails version had reached end of life two years prior, and the Ruby version had reached end of life a few months earlier. Therefore, we knew we would have to upgrade those if we decided to stabilize these applications. If your language and framework have reached end of life, they are no longer receiving security patches. However, even if they aren't at end of life, there's still a high chance of lurking vulnerabilities if things haven't been kept up to date. You'll want to conduct a security audit to understand what you're working with and how much work you'll need to do to secure the application.
00:12:44.100 Security is a discipline on its own and may not be your area of expertise. As web developers, we can't specialize in everything, so if you need to do a security audit and security isn't your forte, here are some tools and tips I find useful for a Rails project: I love to install the gem 'Brakeman' and configure it. It's a static analysis scanner that will tell you if you have any known vulnerabilities in your code. Additionally, check your certificates. Is the web traffic over TLS? Is there an SSL certificate in place? Is data that should be encrypted, like passwords, actually encrypted? Is there authentication and authorization for resources that should be private for users? Lastly, where are sensitive credentials for the system stored? Ensure they aren't in the source code or checked into source control.
00:14:27.360 There are likely many more things to be aware of. I’m not a security specialist either, but with static analysis and some general web security know-how, you can get a sense of how vulnerable the application is and how much time you might need to dedicate to remediate security issues. I've also included a helpful Rails security audit list I found on GitHub to get you started. After getting a general lay of the land, we should take a look at the data model. When I mention data, I primarily refer to the data model in a Rails application, usually dealing with relational databases.
00:15:16.620 The first thing to consider is the general understanding of that model: can you comprehend it? Does it make sense conceptually, or does it leave you scratching your head? This is a good gut check to get a sense of how challenging and time-consuming it will be to work with as you explore other parts of the code. Once you have a feeling about how understandable the data model is, examine its referential integrity. Are the associations in your models enforced through foreign keys in the database? Are the foreign keys required unless the association is optional? I've seen many Rails projects that rely only on model associations without adding foreign keys to enforce them. If you have strong referential integrity—meaning both answers to these questions are affirmative—you can have higher trust in your data. If you ever need to migrate it, you can do so without a lot of headaches managing edge cases or errors, because the records that are referenced actually exist.
00:16:44.340 Finally, check how normalized the data is. Does the level of normalization make sense in a relational database? In most Rails applications, we want a high level of normalization, meaning we want to minimize data redundancy. Normalization is why we don't store everything in one table with data duplicated between records. Consider if we tried to flatten a has many relationship. It's also why we have has many through associations, rather than simply placing multiple foreign keys everywhere to create direct relationships. Sometimes there may be performance reasons to intentionally denormalize data, but most of the time, we prioritize data integrity. A higher level of normalization means greater trust in your data.
00:17:45.720 There are two tools I like to use to get a quick visual representation of the data model. The first is DB Diagram, a web app to which you can upload your schema file. Since DB Diagram uses your database schema, the resulting visual will tell you if you have referential integrity in the database. It visually represents your foreign key relationships between the tables. This diagram can also indicate your level of normalization. If you see many foreign key relationships duplicating, it signifies your data model is not normalized, leading to data redundancy. You should also pay attention to duplicated columns across records that could be refactored into their own table; this requires more manual investigation.
00:18:56.660 The second tool I recommend for evaluating a Rails data model is Rails ERD, a gem that generates a diagram from the Rails model associations. It offers numerous customization options and will illustrate a data model built from the associations in your application code. This is helpful for checking if the foreign key relationships in the database match the associations in the model. Often, I find more relationships in the ERD generated from model associations than in the ERD generated from the schema. This discrepancy indicates missing foreign key relationships in the database.
00:20:06.600 From the last few diagrams I've examined regarding our rescue mission project, I learned that their data model was a mess. There were numerous unnecessary relationships throughout the database and application code. Although there was technically referential integrity, I couldn't ascertain if it aligned with the logic modeled in the code. There were duplications everywhere, leading to little trust that the foreign keys in the code were properly synchronized with other relationships, given the absence of many has many through relationships. These relationships were symbolized as dotted lines in this diagram, indicating they appeared almost nonexistent.
00:21:14.300 After digging through the data layer, our next focus was the application code. Start by examining the test coverage within the application: are there any tests? Are they well-written? Do they test meaningful aspects? For example, does a controller test validate the shape of the returned response or merely confirm a 200 status code? Analyzing test coverage within an application helps gauge the risk factor associated with changes and refactors. The less test coverage there is, the higher the risk of introducing unintended regressions.
00:21:47.520 Good test coverage serves as excellent developer documentation, helping to illustrate what a feature is intended to do and how to use it within the code. Within a Rails project, you can use a tool like SimpleCov, another gem that provides coverage percentages and identifies untested paths in the code. In our rescue project, there were no tests. It was no surprise that the client was hesitant to make changes because there was no feedback loop to catch issues before deployment. When test coverage is absent, you should allocate time during stabilization to create it. In my view, this is non-negotiable; the alternative could be far too costly.
00:23:51.000 We’ve discussed static analysis during the security audit and briefly when examining test coverage. Static analysis is immensely helpful for developers unfamiliar with an entire codebase. It involves examining code without executing it, typically aiming to evaluate its quality. Many tools assist with this process, allowing you to learn which parts of the code are complex and difficult to understand, which parts change frequently, and if they are also the complicated segments. Additionally, you can identify duplication and much more. Taken as a unified picture, all of this information can convey how manageable— or unmanageable—this code could be.
00:25:39.360 Some of my favorite tools for static analysis in Rails include Rubocop, which many of you may be familiar with; it’s a gem you can install that tells you how idiomatic your code is and how well it adheres to industry standards. You can install and configure Reek, which identifies code smells and indicates areas of the code that might be difficult to maintain. You might also configure Flog, which measures the complexity of your code and identifies parts that are challenging to test. Another useful tool is Flay, which highlights duplicated or similar areas within your code that should be simplified (i.e., DRY'd). Turbulence measures complexity and churn in the codebase, pointing out good candidates for refactoring—those sections that exhibit high complexity and high churn, as they should be prioritized for reduction. Lastly, Ruby Critic provides a nice developer experience as it wraps around several of these tools, generating a single report highlighting areas of concern.
00:27:34.380 You can then compile these high-complexity, high-churn, and duplicated code issues into your list for refactoring once the project achieves some stability. Another important metric to investigate is to identify unused code. Many clients hesitate to delete deprecated code because they wish to keep a record of it, despite the availability of source control. However, maintaining unused code incurs a substantial maintenance cost. As the codebase evolves, unused code may break tests—if you have them—as database constraints or associations change. Furthermore, that code can confuse developers who might think they need to refactor it and keep it in a ready state, despite its deprecated status.
00:28:42.780 Unused code generates many false negatives during your audits and subsequent stabilization efforts. If you identify unused code, add it to your removal list; we don't want to invest any further resources into it. Finally, let's review a couple more aspects related to developer experience. Happy developers are more productive. When we have a comfortable codebase, we also tend to be happier individuals. Firstly, is there error monitoring within the system? If an issue occurs in one of your deployed environments, how would you know? This becomes a significant concern for incremental updates in production environments.
00:30:20.800 Rescue missions typically struggle with resilience, so you'll want to know as soon as something goes wrong to have ample time to roll back or fix issues. If there's no error monitoring, ensure you add this to your priorities. Speaking of resilience, how easy is it to deploy the application? How easy is it to revert to the previous state in the event of a regression? Most projects today have some form of CI/CD pipeline, although that isn't always the case. Therefore, understand how changes are deployed in production. What is the deployment strategy? What is the branching strategy? Is that all documented? Most fundamentally, can this application be deployed?
00:31:59.340 Hopefully, the answers to these questions are all positive, making your life easier. If not, then that becomes another priority in your stabilization work. Lastly, consider edge cases you should be aware of before embarking on the project. One that has hampered me in the past involved supporting an old version of Internet Explorer. We realized this too late when we upgraded frontend packages, leading to numerous breaks—particularly regarding styling. That’s a lot of information to digest, but now that you have it all, you can formulate a plan to decide on the best next steps.
00:33:30.720 The first thing you need to determine is whether the application is salvageable. Let me start by saying, nine times out of ten, it absolutely is. I've witnessed numerous teams and clients opting to rewrite entire applications instead of refactoring them. I believe this happens for two primary reasons. The first is that teams often are not given enough time to understand the current state of an application when making this decision. As you sift through various factors to assess, if you're thinking this is too much or that you never have enough time to investigate everything, you may be on a team facing unknowns and ambiguity about what's in an application, its problems, and its stable components.
00:34:01.200 It becomes challenging to envision any incremental path forward, making it feel more reasonable to write code from scratch rather than making incremental improvements to someone else's code. Writing your code feels safer. The second prominent reason I've noticed for teams favoring rewrites over refactoring is that, as an industry, we tend to be serial underestimators; we've all had those moments of thinking an item will only take a day, only to find ourselves four days later still working on the same task.
00:35:05.100 Rewriting an application from scratch almost always takes longer than expected. There are often forgotten features, tedious and tricky data migrations, new technologies or methods that are trialed and ultimately discarded before even completing the project. Furthermore, styling frequently takes longer than anticipated, and we must stop giving CSS short shrift. I recognize I'm stating all this without empirical studies or hard figures to substantiate it, so a certain level of trust is necessary. Nevertheless, based on over nine years of experience as a consultant, I've observed that I've never seen a rewrite project succeed without exceeding both timeline and budget constraints.
00:36:17.940 Among the four repositories brought to us by my client, we refactored and stabilized three of them. The fourth repository was neither generating revenue nor had any active users; it no longer fulfilled their business needs after they gained a better understanding of the market. Additionally, it was built on a hand-rolled React Native library that hadn’t been maintained for years and was in a state of deadlock due to strictly enforced package versions. As a result, we couldn’t upgrade anything without also upgrading that package, which was impossible. Consequently, without users or business justification, it didn’t make sense to invest in that app. We recommended that they refrain from rewriting it and instead take the time to research what value a new mobile app could provide, and then create that app.
00:37:14.220 After completing all of the audit steps, you'd accumulate a list of areas to address, such as automating deployments, upgrading the language version, patching security vulnerabilities, adding error monitoring, and cleaning up the data model. You now possess the foundation for your roadmap, so it's time to take that list and prioritize the items based on their significance. Your top priority should focus on those that enable the team to ship working code efficiently. It's almost impossible to make incremental improvements to an application if there is no straightforward method to deploy that code.
00:38:09.460 Such friction results in more things being bundled in each release, which increases the likelihood of regressions and continues to undermine trust in the application. If the project lacks an automated deployment workflow, that should rank as one of your highest priorities, if not the most critical item. Once you have your prioritized roadmap detailing the work required to stabilize your application, leverage it to create an estimate. Following the audit, the team should understand the necessary actions clearly, broken down into your list.
00:39:11.560 Examine that list and develop realistic estimates that include time for manual regression testing. Your estimates will likely be high; this isn't a project that can be completed in one sprint of stabilization work—hence why this is a rescue mission. Once you've clearly identified, prioritized, and estimated the work, it’s time for stakeholder education. It’s crucial not to skip this step to ensure the team has the necessary time to stabilize the app, thereby enhancing the chances of success. All stakeholders must understand the work involved.
00:40:27.559 First, explain why the issues are problematic. Non-technical stakeholders may not understand why investing resources in establishing a stable CI/CD pipeline is crucial, potentially leading them to deprioritize it because they've previously managed to get code to production without that pipeline. Therefore, you'll need to articulate how automated tests and deployments empower developers to ship code more quickly and with greater ease. If you must dedicate considerable time to reworking the data model, explain the risks associated with data migration when there isn’t data integrity, as well as the long-term consequences of denormalized data.
00:41:10.260 When discussing parts of the roadmap and their necessary purposes, consider your audience's values. If they prioritize time and budget, frame your rationale to emphasize how this work maximizes future budgets. If they are concerned about the customer experience and eager for new features, explain how increased stability and reduced regressions foster greater customer trust and enhance the overall experience.
00:41:49.320 When stabilizing a rescue project, you might find you cannot introduce new features; in fact, you shouldn't. Thus, the final part of stakeholder education involves setting clear expectations about when they can expect new features they might desire. As the project progresses towards greater stability, excitement may lead them to push for new features before you're finished, so prepare a milestone in the roadmap indicating when the project will reach a satisfactory state and the team can begin balancing stabilization work alongside new feature development.
00:43:05.940 Share that milestone with your stakeholders. When they begin requesting changes, you will have a concrete reference point agreed upon in advance. During my rescue project, my stakeholders primarily cared about budget; their goal was to minimize costs. When advocating for a refactor of their projects rather than rewrites, I explained that this approach would save them money. Initially, they thought they could get an entirely new application for the same cost. However, we navigated some candid discussions about how their tendency to sacrifice quality led them to their current situation, thus it was time to invest in well-crafted code for reliable software that could be built upon, avoiding the need for rewriting every couple of years.
00:44:10.680 That last aspect—building upon their foundation—was crucial, as aside from budget considerations, adding new features was their other top priority. They had a list of desired features to market to investors for raising their next funding round. When explaining the advantages of stabilizing their projects, I illustrated how a robust foundation would enable future developers to build features rapidly since they would not have to sift through tons of cruft—plus, the feedback loops from test coverage, CI/CD, and all process improvements would be shorter. They were somewhat skeptical at first, but the promise of quick delivery was a significant selling point.
00:45:17.200 Now that you have a plan for stabilization and obtained stakeholder buy-in, it's time to put the plan into action. Your prioritized roadmap might look different from mine, depending on what is essential to you and your team, as well as the current state of your project. Your priorities could place upgrades at the top of your list, or you might not have a CI/CD implementation at all. For us, since the project was being deployed using Heroku, we prioritized other items before addressing the Ruby and Rails version upgrades, to ensure we had test coverage in place to catch any potential issues that arose with the upgrades.
00:46:22.740 As you begin to work on the plan, it may be challenging to adhere strictly to it. Once your team dives into the code, they might encounter areas they feel should be cleaned up or glaring bugs they want to tackle immediately. However, it is essential to remain methodical and organized during the stabilization process. You invested significant effort into determining the best way to proceed, so it's vital not to become distracted by various side projects.
00:47:51.659 As we previously discussed, rescue projects often lack resilience; thus, applying easy, low-friction strategies to build resilience early in the stabilization process is crucial. Adding error monitoring is one of those low-hanging fruits—it won't necessarily make it easier to solve problems, but it will quickly provide feedback if a regression or outage occurs. If your project has no error monitoring in place, make sure to prioritize adding this early in your stabilization effort.
00:48:58.860 Similarly, when you reach the point of adding test coverage, it may feel counter-intuitive, but you should test for bugs. We’re not ready yet to implement intentional behavior changes since we lack strategies to catch regressions. It can feel discouraging to leave broken code unaddressed, but we must at least test it. You can add information to the RSpec context block indicating that it represents buggy behavior which requires fixing. Then, when it's time for testing, ensure your team agrees on a desired coverage percentage—100% coverage would be ideal for catching unintended behavior changes, particularly if you don't have familiarity with the code.
00:50:38.280 However, achieving 100% coverage isn't realistic, so select a percentage and work towards that across all files tested by you and your team. Timebox your efforts on this task and focus on critical path user flows and revenue-generating tasks. If you have features that are infrequently utilized, they are lower priority for testing. Should you run out of time for testing those, that’s acceptable—your priority is to hit the target for coverage in all critical path files.
00:51:46.680 As you address security vulnerabilities and complete your upgrades, be sure to deploy those changes frequently and in smaller increments. Smaller deployments are preferable; if you can deploy three changes a day rather than one deploy a week with 15 changes, you'll improve your confidence in identifying causes of regressions if they arise. Smaller rollbacks are often easier too, as there are fewer side effects involved—this is the very advantage of CI/CD pipelines! This process also encompasses database changes.
00:53:42.120 When you refactor and make changes to your database, you might require multiple deployments to facilitate production updates. For instance, if moving data from a deprecated table or column into a more appropriate database location, typically you'll first deploy to create the new table, running a rake task or migration to copy the data from the old location. After executing QA to ensure the application code using this new location works as expected, you would then do a second deployment to remove the old table or column and validate again. Thus, having an easy-to-use deployment workflow is absolutely vital.
00:54:53.740 As part of refactoring, make incremental changes. Now that you have established some test coverage, aim to make small, isolated modifications wherever possible. When approaching refactoring, you need to decide on your method: Do you want to refactor layer by layer—starting with the data layer, then the application layer, and finally the UI—or do you want to organize by domain within the code? Perhaps you start with user management, followed by search, and then reporting. In the case of our project, we opted for a hybrid approach.
00:55:56.760 We dealt with a highly complex, denormalized data model, as evidenced by the diagrams earlier. Therefore, we chose one domain to refactor, starting with the database. We refactored layer by layer, progressing from the data layer through the business logic to the presentation layer, completing the process through many small deployments until that domain was stable. Afterward, we would select the next domain and repeat the process. So, what lessons did we learn?
00:57:25.740 While I discussed unused code during the audit, I learned about its importance through experience. If the code serves no purpose, delete it! You have source control in case you ever wish to revisit it. Thank the code for its service and then let it go. Retaining excessive unused code can increase regressions, inflate the effort needed for test coverage, and complicate situations whenever you modify associations, method names, etc. There’s no justification for keeping unnecessary cognitive load within the codebase if no one needs to understand its functionality because it’s never engaged.
00:58:23.400 As your team processes the stabilization tasks, ensure you're keeping track of your code logistics: which branches are utilizing what version? Have upgrades been merged into feature branches? When was each database migration deployed? Did the rake tasks run? Frequent, small deployments simplify this tracking, while long-running branches create unmanageable complications. Though long-running branches are unavoidable at times, try your best to be vigilant about these factors. The worst-case scenario is not realizing that data hasn’t been migrated before deploying the next migration, which drops the table that contained needed data. This would necessitate rolling back and restoring your last database backup, and starting the process over—definitely not an ideal situation!
01:00:17.540 Though not the end of the world, it is undoubtedly stressful and rather unpleasant. Now, for our final takeaways: we should strive to avert rescue missions whenever possible. Technical debt is a reality we encounter while working with software, and like any investment, there are times when it makes sense to incur some tech debt to achieve larger goals, provided the decision is intentional and made with a complete understanding of the costs involved. It can be a sound choice, but we must avoid cutting corners and accumulating tech debt indiscriminately.
01:01:33.600 Advocating for code quality is challenging when business stakeholders exert pressure to constantly ship code; however, the consequences will often be worse. Advocate for high-quality software using a shared language that aligns with stakeholders' values—whether it be time, budget, future extensibility for new features, or customer experience. As you continue your work, make small incremental investments in your codebase—utilize clear naming conventions, leverage service objects liberally—there’s a related talk at RailsConf regarding service objects that I’ll be checking out, and you should too. Replace old tests, implement model validations, and ensure you take steps to prevent your project from needing future rescues.
01:02:37.680 Ultimately, we can audit everything, devise a plan, execute the work, and sometimes the business will determine that the investment is no longer worthwhile, or your team may exhaust the timeline due to unforeseen challenges, or perhaps your deadline gets trimmed for reasons completely out of your control. That’s perfectly fine! You are not culpable for the application reaching its present state; it did not fail solely because of you; it’s not as though all responsibility fell on your shoulders. Thank you for your time!