00:00:12.559
Hi, welcome! I'm Joe Mastey and we're going to talk about horrible, terrible, no-good codebases.
00:00:17.760
Can I get a quick show of hands? Who here is currently dealing with just horrifying code at work? Yeah? Awesome! So it's pretty much a support group, if nothing else.
00:00:30.150
I've worked at a lot of different companies over the course of my career, and this has really been one of the common threads across all the companies I've worked at.
00:00:41.280
I guess I'm fortunate enough to work in places that are very profitable, but the downside is that they're usually places that have just terrifyingly bad code.
00:00:48.510
So, we're going to talk about a lot of different things. I want to start off by giving you an example of one of the codebases I’m working with, because I'm going to use it as an example throughout the entire talk.
00:01:04.350
This was somewhere I was working when Rails 4.1 was being released. The codebase I was working with was on Rails 1.2. We were actually quite proud of being on Rails 1.2 because we had started at 0.7!
00:01:16.320
I managed to work our way all the way up to 1.2, which was fantastic.
00:01:21.330
And it had a lot of the characteristics that you might be thinking about for your own codebases. We had custom packages.
00:01:33.200
Somebody decided that they should write their own lib SSL changes, which meant that we were running our own Postgres system libraries. Pretty nasty stuff.
00:01:41.640
We had a 6,000 line user model.
00:01:45.320
The good news was that everything was in one place; the bad news is that it was all in one place.
00:01:54.720
Our test suite took four days to run on a laptop. That’s a little misleading, because if it takes four days to run your test suite, you don’t run your test suite on a laptop.
00:02:07.469
So we actually incurred five-figure AWS bills to run our tests across hundreds of machines.
00:02:13.400
We got it down to four or five hours before you could see whether your test passed.
00:02:19.879
And then when it did pass, about 10% of the tests flapped.
00:02:27.170
A flapping test is basically any test that passes sometimes but not all the time.
00:02:39.799
So like 50% of the time it's green, or 30% of the time it's green, unless it's like a Friday at the end of the month.
00:02:45.920
Every time you got your test run back you'd have between 150 and 300 failing tests.
00:02:51.890
But it was not really clear which tests they were. We had about a million lines of Ruby, so it’s probably more accurate to say this wasn’t a Rails app; it was a Ruby ecosystem with a Rails app hiding somewhere inside.
00:03:01.890
So, we slid right into it. So we did exactly what you would do in this situation, right? We declared the entire legacy codebase as deprecated.
00:03:10.370
We split off 30% of our team and had them build a brand new copy of the codebase.
00:03:16.430
We decided that a year from that point, we would completely tear down the old codebase and be done with it. It was going to be Rails 4, going to be microservices. It was going to be awesome.
00:03:28.730
And it went exactly the way you would expect it to go, right?
00:03:34.370
Yeah, sort of a successful rewrite. Here, let me show you.
00:03:40.489
They had actually deprecated the terrible code base before. I don’t see one hand raised, two hands, nice.
00:03:46.819
So yeah, it happened in the way you would expect: 18 months later, we had the new codebase which we had managed to move about 10% of the revenue onto.
00:03:58.040
So we couldn't kill it because now it was actually making money, and we had the old monster that continued to make all the money.
00:04:04.699
Of course, everybody really wanted to work on the new codebase. You really can't blame them, right?
00:04:11.269
The new codebase is where all the more interesting code happens.
00:04:17.870
You know, things that took us two weeks to do on the old codebase, we could do in just a handful of days on the new codebase.
00:04:27.690
But the challenge here is that we have to continue running the business. Our requirements are still changing.
00:04:34.370
We couldn't just freeze or, in our hubris, attempt to deprecate the entire thing. We had to keep updating it.
00:04:40.489
This is a big challenge, right? And I became really interested in this topic because we did this to ourselves.
00:04:51.030
At some point, someone went into a class file that had 5,990 lines of Ruby in it and decided that adding ten more lines of Ruby was the right thing to do.
00:05:02.190
You know better; you know you need to have fewer methods in one class. You know that's not the right thing to do.
00:05:09.030
And yet, we made that decision anyway.
00:05:15.480
I wanted to understand why, particularly because we were about to build a second monster.
00:05:22.500
To me, this is such a common thing that I don’t really understand.
00:05:30.750
How do people who have all of these skills, who go through refactoring courses, spend years working on this, and yet still make really bad decisions?
00:05:37.170
They keep adding barnacles upon barnacles until we have a completely untenable codebase.
00:05:44.550
I was working with a bunch of developers on this codebase and sat down to ask why we don't split out a little bit of code.
00:05:51.630
I got responses like, 'I can't even tell what codes in use anymore.'
00:05:58.760
Whenever you have 200 methods in an object, and you delete a little bit of code, you start wondering if that code is still being used.
00:06:05.039
Of course, it's practically impossible to figure out.
00:06:11.360
We had some super nasty method missing magic used all over the place; someone had just learned Ruby in the early days of the codebase.
00:06:17.740
They thought, 'I don’t have to declare anything; it's all going to be method missing dispatch that way.'
00:06:23.290
So, it was impossible to tell what was going on.
00:06:29.900
Changing the code could break completely unrelated things. I swear to God, one time I made so many changes that I broke a unit test.
00:06:37.620
I do not know how that works, and yet here we are.
00:06:46.890
Good luck merging your style pulls! Hopefully, you guys are of the belief that it's useful to have a somewhat consistent code style.
00:06:52.890
So, if you do what we did, you know, install RuboCop.
00:07:01.480
You run RuboCop against your codebase, and it's trouble.
00:07:08.030
You can auto-fix all of your style violations, and then put up a pull request that has 69,000 changes, and then we can't trust the test suite.
00:07:16.450
As I mentioned, it takes forever to run, and there are lots of failures.
00:07:24.920
Of course, any time you change a piece of code, a bunch of new tests are breaking.
00:07:30.620
But we don't know if that actually means functionality is broken.
00:07:37.700
So you start to shy away from those, like, 'I'm not going to wait days to run my tests.'
00:07:44.410
Instead, we became surgical in our precision.
00:07:51.360
We learned how to not disturb any of the code around what we were trying to change, make the tiniest possible change.
00:07:59.190
Then we’d push it up, don't hang it up, and then run like hell, as long as my name is not on the git blame, we’re okay.
00:08:08.300
A couple of times we tried, thinking we’d fix this all: add whatever new libraries, split everything up.
00:08:16.370
But it always felt like trying to bail out an entire ocean with this little plastic bucket; you keep going and keep going, and you never make any kind of progress.
00:08:28.650
And so that brings us here.
00:08:37.620
Over time, we became afraid of changing the codebase. This is actually a psychological condition.
00:08:45.880
Interestingly, it's also one of the underlying conditions of clinical depression.
00:08:53.100
We just became afraid of making changes. So now not only do we have the original problem of a codebase that continues to get worse, but we have the secondary problem of not being willing to change it.
00:09:01.960
This leaves us in a state of atrophy and despair. It's a pretty nasty situation.
00:09:09.970
The good news is that once we actually took the time to look critically at why we continued to make everything worse, it gives us the opportunity to start fixing it.
00:09:19.600
So, we're going to talk about how to tackle these two problems: First, how do you become less afraid of your codebase, and second, how do you actually dismantle it in a sane manner?
00:09:27.090
The first goal that we have is to name the thing. There’s this whole concept that knowing a thing's true name gives you power over it.
00:09:36.600
If I ask you to think about this really tough, terrible codebase you've dealt with, what you usually think of is the very worst of that thing.
00:09:44.430
I put up a whole slide with a million lines of code, 6,000 lines of this, and a test suite that takes forever.
00:09:51.930
This becomes a stand-in in your mind for everything about that codebase.
00:09:59.030
Every time you try to deal with it, the image of this big, impossible thing overwhelms you.
00:10:07.570
But that's a cognitive trick, and it's not the reality.
00:10:14.340
Every codebase I've seen has badness that is unevenly distributed.
00:10:22.270
There are usually those couple of things that you would put on the slide, which are horrible, while everything else kind of trails off.
00:10:30.750
Here's an example of how you can examine this. If you don’t do a lot of Bash, this might look kind of weird.
00:10:41.070
You go into your project’s folder and use the command find with a name of star.rb, which finds all your Ruby files.
00:10:50.850
Then pipe it to wc -l, which counts the number of lines in every Ruby file on your system.
00:10:56.700
Finally, you sort it based on the output. This is something I do frequently.
00:11:02.690
People say, 'Our service objects are bad' or 'Our tests are bad,' and I say, 'Okay, let’s look into the tests.'
00:11:09.760
What we find is the kind of output that details the number of lines sorted.
00:11:17.480
I did this on the device codebase, and the characteristics were interesting.
00:11:26.740
The bottom file had 700 lines, which isn’t great but not terrible.
00:11:34.770
However, it only took about four or five files before reaching something that is half the size.
00:11:41.400
Most everything in that codebase is much smaller.
00:11:49.200
The badness is actually distributed exponentially.
00:11:58.600
At the far left, we have a couple of files that are infinitely terrible. As we move to the right, we see a long tail of things that are just kind of okay.
00:12:06.270
That’s a really useful realization because it means we can simply cut off a couple of things and make significant progress.
00:12:13.430
There’s another way we can quantify some badness.
00:12:21.360
If you're still using Rake stats (which is built into your Rails app), you can run this command now.
00:12:30.680
It runs through all of your files and gives you useful output.
00:12:39.150
This gives you the lines of code in every section of your application.
00:12:47.410
The exact numbers don't matter as much because you know you have a billion lines of horrible code and you're not going to delete all of Ruby.
00:12:55.680
However, it does tell you where the bad code is hiding.
00:13:06.210
In this case, there wasn’t much happening in mailers, whereas there was a ton happening in controllers.
00:13:15.480
This indicates a very controller-heavy application.
00:13:22.800
We can start to quantify what’s going on.
00:13:35.620
The last two columns detail the number of methods per controller and the number of lines of code per method.
00:13:40.750
This allows us to gather real metrics that are accurate.
00:13:47.790
We can see that we know our methods are too long, but that's really only true inside our models.
00:13:54.500
It’s not equally distributed. In our case, there were many different metrics we wanted to gather.
00:14:07.660
Once we grasped the idea that we could start quantifying, we realized that the standard output from the tools we were looking at wasn’t very helpful.
00:14:14.830
So, we wrote our own tools. We created a method to count the number of methods for ActiveRecord class.
00:14:23.000
If you don’t know, you can look at all classes that exist, select for those that are children of ActiveRecord, and sort them by the number of methods.
00:14:31.560
This isn’t ideal code but it gives you a really interesting view of what’s happening in your system.
00:14:39.500
Once again, we see that the models at the very top had a lot more going on, while those at the bottom are less complex.
00:14:47.140
I had an inference in one of my codebases that my tests were slow because I was creating tons of ActiveRecord models.
00:14:57.130
I know that’s a blinding realization, but it’s not enough to simply say our tests are bad because many models are created.
00:15:03.670
In reality, you usually don’t know which ones are creating a lot of records.
00:15:12.150
So, I built a simple formatter for RSpec. Whenever you run your tests, it shows how many models are created and how many queries were run.
00:15:20.700
This offers a sense of measurement and precision.
00:15:30.450
When you add these sorts of things to your codebase, you find the overwhelming terror of fixing codebases becomes much more manageable.
00:15:38.870
There are a couple of places that are horrible, a couple of tests that are horrible.
00:15:45.000
The next thing we need to do is talk about how we actually approach fixing some of this code.
00:15:52.870
The first step is to stop the bleeding. We got here because we kept making things incrementally worse.
00:16:02.130
We cannot fix everything until we stop making the codebase worse.
00:16:10.680
It’s difficult to enforce that line by just saying, 'please don't make everything bad' because people are already in that position.
00:16:17.840
The obvious thing was to run RuboCop, remember it gave us 70,000 style violations.
00:16:23.540
But there’s another way; if you run RuboCop against your codebase, find the very largest file and set that as the threshold.
00:16:30.970
What I mean by this is, when RuboCop says you have 23 parameters on this method, you say, 'Great, this is our new limit.'
00:16:38.520
You’ll never be worse than 23 parameters on a method again.
00:16:44.570
It’s a very courageous statement. Then, you go through and apply this to all the major metrics.
00:16:50.340
The goal here is to have a green RuboCop run—it's kind of a fake green, right?
00:16:57.640
You're no worse than 6,275 lines of code.
00:17:04.720
The usefulness of that green is something you shouldn't take for granted, though.
00:17:11.920
Because anytime someone goes back into that class and adds more code, they get an actionable violation.
00:17:18.810
Every time you can start cutting apart the worst part of your codebase, you say, 'we have this method with 23 parameters, let's reduce it to 22'.
00:17:27.030
Great victory! We can also tighten up our parameters.
00:17:36.490
This technique is known as ratcheting.
00:17:43.520
It’s one of my favorite techniques for getting hold of codebases because they only move one way.
00:17:52.150
At the very least, they won't get worse. As you get better, you’ll find it’s incredibly satisfying to tighten those ratchets.
00:18:00.480
In the situation I was dealing with, some metrics that RuboCop didn’t handle at the time may be covered now.
00:18:08.430
But there’s another approach if you don’t want to use RuboCop.