Microtalk: A House of Cards - The Perils of Maintaining a 7-Year-Old Codebase

by Julie Gill

In her talk titled "Microtalk: A House of Cards - The Perils of Maintaining a 7-Year-Old Codebase," Julie Gill shares her experiences working with a large, legacy codebase at StreetEasy. After graduating with a computer science degree, she found herself navigating a complex codebase that had been built in 2005 and 2006. The presentation is built around a metaphor: the codebase is a delicate house of cards that could collapse if not approached with caution.

Key Points:
- Initial Setup Challenges: Gill describes her experience setting up a complex development environment with numerous dependencies such as Homebrew, MySQL, and various Ruby gems, contrasting it with building applications from scratch.
- Learning Curve: The speaker emphasizes the steep learning curve involved, which included not only understanding the code but also grasping the business logic and real estate concepts essential to the application.
- Debugging Struggles: Gill discusses the detective work involved in debugging old code, including navigating Git repositories to trace the history of bugs and assess the intent behind old code segments.
- Old Code Quality: She highlights that just because code is old does not mean it is correct, and past decisions were not always made with foresight.
- Importance of Comments and Specs: Comments such as 'to do: make this not suck' can be helpful clues for future developers, and writing specs can prevent misuse of the code, though not all code can be spec'ed due to scale.
- Inherent Risks of Feature Building: The process of building new features is fraught with risks as changes can have unforeseen effects on unrelated parts of the codebase, creating a domino effect that can lead to bugs.
- Gradual Mastery: Gill notes that with time, she became more familiar with the codebase, learning how the pieces fit together and reducing the risk of breaking functionality.

Conclusion and Takeaway: Ultimately, Gill encourages developers to take a careful, patient approach when working with large, mature codebases, applying the same consideration they would when renovating a historic structure. The talk serves as a reminder of the intricacies involved in maintaining and evolving code that predates one’s experience.

00:00:16.400 Hey everyone, my name is Julie, and I'm here to talk to you today about large codebases. You might be wondering what this has to do with a house of cards. Am I talking about the really awesome Netflix show with Kevin Spacey? No, I’m actually talking about a house made of very carefully balanced playing cards. Let me back up a little. I graduated from Pace University about a year ago with a computer science degree. At that time, the largest application I had written had maybe ten models. I worked on freelance projects and school projects, but in everything I did, I built all of the code myself. Then I started at StreetEasy in August. Its codebase was built in 2005 and 2006 and has hundreds of models, possibly billions of rows of data, and a whole pile of code. So this talk is hopefully going to give you an entertaining look at my first experience with such a huge, old codebase and how I have come to think of this codebase as a house of cards, ready to crumble as soon as you let your guard down.

00:01:12.560 So welcome to StreetEasy. I spent my first week setting up my development environment. There were so many dependencies to set up: Homebrew, MySQL, RVM, all the gems, Chef, Knife, ImageMagick. And I was coming from apps that I had built from scratch; I had never had to set up someone else's codebase before. Then you add on a complex mix of code: models, helpers, modules, views, partials, controllers, and a whole heap of metaprogramming. I had never seen so much code in one place. Even that cool class where we built a compiler wasn't comparable. Then you add a ton of data: city data, user data, listing data, recorded sales data. I had never seen so much data before. In fact, the first script I wrote to handle listing amenities ran out of memory! I was shocked; I had never had to optimize a script just to get it to run because there was so much data. So all this adds up to a whole lot to learn. Not only was I learning about code, but I was also learning about real estate. What are recorded sales? Why is a co-op different from a condo? What are MLSs, and why does New York have so many regulations? What is all this code? What is the business logic? How do the pieces of this code interact with each other? And what does all this data mean? How does the code use it?
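The talk does not say how the amenities script was eventually fixed. As a minimal sketch of one common approach, the example below processes rows in fixed-size batches so that only one batch is held in memory at a time. The `each_listing` generator, the row shape, and the amenity names are all hypothetical stand-ins for a real listings table; in a Rails app, ActiveRecord's `find_each` serves a similar purpose.

```ruby
# Hypothetical stand-in for a huge listings table: rows are produced one at a
# time instead of being loaded into one giant array.
def each_listing(total)
  return enum_for(:each_listing, total) unless block_given?

  total.times do |id|
    yield({ id: id, amenities: id.even? ? %w[doorman gym] : %w[laundry] })
  end
end

# Tally amenities batch by batch. Only `batch_size` rows exist in memory at
# any moment, no matter how many rows the source produces.
def amenity_counts(listings, batch_size: 1_000)
  counts = Hash.new(0)
  listings.each_slice(batch_size) do |batch|
    batch.each do |listing|
      listing[:amenities].each { |name| counts[name] += 1 }
    end
  end
  counts
end

counts = amenity_counts(each_listing(10_000))
puts counts.inspect
```

The batch size is a tuning knob: larger batches mean fewer round trips to the data source, smaller batches mean a lower memory ceiling.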

00:02:06.800 So we’ve surveyed the house of cards a little bit. Let’s dive in with the first GitHub repository. But where do I start? Google doesn’t help. Google is great; it helps you solve a ton of problems, but it really can’t help you find where this bug is. It takes a whole lot of detective work. I had never had to debug, understand, or fix somebody else's code. It’s a really important skill to be able to inspect elements in the browser, follow a trail of views and partials, and trace methods in the debugger several levels deep to try and figure out what was going on when this was written. Learning to follow and understand these obscure trains of thought is a large part of my job.

00:02:25.040 So in this process, congratulations! You found a five-year-old bug. The problematic line was last touched in 2007. Thanks, `git blame`. So what do you do now? Are you even authorized to change such a thing? Was this even written like this on purpose? Figuring out the intent of that piece of code is very important before you try to fix it. You have to consider what design decisions led to this. Is it detrimental to change those, or are they just no longer applicable at all? If you use Git and version control to discover why this line was written, you'll have a lower chance of breaking the whole house of cards. Another fun question is: did it ever work? Has this been broken for five years? I don’t know; maybe it has, maybe it hasn’t. And what else is going to break when I fix this bug? Something important to learn is that just because something was written in the past doesn't make it right. You might think, 'But this code is so old, how could it be wrong?' Past authors had crunch times, other bugs, and bad days; it wasn't always written perfectly. I mean, are you writing perfect code? Probably not! You should picture your future coworkers five years from now trying to make sense of the mess you're making.

00:03:40.640 So comments like 'to do: make this not suck' that we can find in various places around the StreetEasy codebase are actually extremely helpful because as developers coming in, you wonder why something was written. Did this person do this on purpose or by accident? But if you see this, you know they were aware of the issue and wanted you to come fix it in five years. So there you go; now you have to fix it! And specs help. I mean, specs stop people from misusing your code. You’re probably not going to spec every piece of code if you’ve got one of these giant apps, but at least write some specs for something you think others are going to mess up. Now, you’re in this debugging process, and sometimes you get to the point where you’re like, where is this logic coming from? I think of this as a game of StreetEasy magic or Rails magic. So I'm trying to track down this bug, and I run into some unexpected behavior, encountering these magical methods: things that happen behind the scenes that are really useful but tricky to track down.

00:05:05.919 So you ask questions like, ‘Who is doing this? Where is this coming from? What is this?’ Can Google help again? Not sure. Google helps you avoid asking a lot of dumb questions of your co-workers, so you ideally want to do your due diligence to make sure you're not just asking how this Rails method works. You’re trying to Google the problem, trying to figure out what it is, but you can’t find it. This is a clue that it might be StreetEasy magic: magic that your coworkers wrote in the past to help them out, but now it’s something you have to find and deal with. Some examples I’m talking about are things like `url_for_model` and `link_to` variants, which can look like Rails magic, yet they might actually be StreetEasy magic. They look similar to Rails methods, but you can't find them in the Rails docs. Surprise: those methods are specific to StreetEasy, and they are incredibly useful, but trying to find them in the Rails documentation won’t be helpful. Also, there’s the Area model, where you can call `Area['manhattan']`; on the surface, it looks like a Ruby trick, but it’s actually a StreetEasy trick where there’s a method that overrides the array accessor method.
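As a minimal sketch of the two ideas in this passage, the example below defines a class-level `[]` method, so a lookup reads like an array or hash access, and then asks Ruby where that method lives instead of searching framework documentation. The `Area` class here is hypothetical and written for illustration only; the real StreetEasy model is not shown in the talk. `Method#source_location` and `Method#owner` are standard Ruby.

```ruby
# Hypothetical Area class; an in-memory registry stands in for the database.
class Area
  REGISTRY = {}

  attr_reader :slug

  def initialize(slug)
    @slug = slug
    REGISTRY[slug] = self
  end

  # A class-level [] makes Area['manhattan'] look like a built-in lookup,
  # even though it is ordinary application code.
  def self.[](slug)
    REGISTRY.fetch(slug) { raise KeyError, "unknown area: #{slug}" }
  end
end

Area.new('manhattan')
area = Area['manhattan']
puts area.slug

# Ask Ruby where the lookup is defined rather than guessing.
lookup = Area.method(:[])
file, line = lookup.source_location
puts "Area.[] is defined in #{file}:#{line}"
puts "owner: #{lookup.owner.inspect}"
```

When `source_location` points into the application rather than into an installed gem, the method is local "magic" and will not appear in the framework's documentation. For methods implemented in C, `source_location` returns `nil`.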

00:06:15.680 Now, let’s dive into the world of metaprogramming. If you haven’t seen the `send` method before, it may look daunting. You find yourself googling, 'What is this, and what is being sent to whom?' Tracking down this kind of crazy stuff can be exhausting. With all that covered, let's focus on building a feature. The first question is always, 'How long is this going to take?' And I often think, 'I’m a pretty good programmer; I know what I’m doing. This is going to take me a week.' Famous last words! You realize, once you dive in, that this doesn’t actually work how you thought, and you might need to rebuild parts just to start building your feature. Whenever you consider how long something is going to take, always factor in the possibility that you’ll need to rebuild everything before starting your new task. It’s not very likely that the authors of this code knew, five years ago, what you would need the code for and built it accordingly. You’ll probably feel the domino effect here; adding a feature might affect some distant piece of code. Sometimes a spec catches it; that’s great! If not, get ready for weird bugs to pop up, and you may wonder whether they are related. Chances are, those weird things are indeed related.
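The question "what is being sent to whom?" refers to Ruby's `Object#send`, which calls a method whose name is held in a variable. The sketch below, built around a hypothetical `Listing` class with made-up thresholds, shows why such code is hard to trace: the predicate methods are generated with `define_method`, so searching the source for `def cheap?` finds nothing.

```ruby
# Hypothetical Listing class, for illustration only.
class Listing
  attr_reader :price, :bedrooms

  def initialize(price, bedrooms)
    @price = price
    @bedrooms = bedrooms
  end

  # Each rule is [attribute, operator, limit]. define_method turns every rule
  # into a predicate such as cheap? when the class is loaded.
  RULES = { cheap: [:price, :<, 500_000], large: [:bedrooms, :>=, 3] }.freeze

  RULES.each do |name, (attribute, operator, limit)|
    define_method("#{name}?") do
      # send(attribute) reads the attribute by name;
      # .send(operator, limit) applies the comparison, e.g. price < 500_000.
      send(attribute).send(operator, limit)
    end
  end
end

listing = Listing.new(450_000, 1)
puts listing.cheap?
puts listing.large?
puts listing.send(:price)
puts listing.respond_to?(:cheap?)
```

When reading code like this, `listing.method(:cheap?).source_location` points at the `define_method` line, which is usually the fastest way to find where a generated method comes from.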

00:08:17.760 Building a feature starts to feel more like adding onto this house of cards. As you are learning the codebase and its dependencies, everything feels very fragile. You might change one thing and see totally random things break that you had no idea were even connected. It feels like this house is crumbling; when you fix one thing, another thing breaks. So you have to carefully examine the structure of this house of cards before making changes, checking the foundation and dependencies. Doing research on weaknesses and potential places that could break with your new addition or modification is essential. At some point, this process gets easier. You learn the feel of the code, how the whole codebase works, and how the pieces fit together. I would say around month three or month four, it started to make sense, and I stopped breaking everything all the time. In conclusion, my main takeaway is that when dealing with other people’s huge, old codebases, exercise the same care, caution, and patience that you would when remodeling or building onto a historic structure.