00:00:11.200
Come in.
00:00:12.880
Hello.
00:00:14.000
Welcome.
00:00:14.920
Everybody, welcome to RubyConf.
00:00:22.320
It's exciting to see so many people gathered here after so long for the love of a programming language.
00:00:25.119
I personally love Ruby.
00:00:28.160
I know they say that the spot right after lunch is the worst one because people just want to close their eyes.
00:00:34.079
But I actually want to indulge that in a second, if you let me. I want to do an experiment for one minute, so actually just go and close your eyes.
00:00:44.160
And, you know, if you're home or watching this video, just look away from the screen. Okay, now I want you to picture an object that you own, that you have in your room, like your bedroom or your office.
00:00:53.199
Think of something that you haven't used in a long time. It can be a book that you haven't read in a while, or maybe a shirt that doesn't fit you anymore. And I want you to think about how it would feel to get rid of it.
00:01:09.520
Is it pain? Is it joy? Is it anxiety, or is it freedom? You can open your eyes now.
00:01:32.799
I asked this question to different friends and co-workers, and I got very different responses.
00:01:36.000
Some people really like to collect objects, and they feel good with many items around them, while other people, like me, prefer the opposite.
00:01:45.200
I like pristine spaces; I like not having unused objects around me. I think it might be a personality trait. Different people have different personalities, and I realized that in my case, this doesn't only apply to my real-world environment, but also to my code.
00:01:58.960
I love Ruby, and I especially love opening a blank new Ruby file, or maybe a file that's like 10 or 20 lines of code. Last year, I joined a corporation that has probably one of the biggest Ruby codebases in the world.
00:02:15.680
To give you some numbers, I actually checked, and in the first quarter I was there, the amount of code that was added was 500,000 lines of code. Of course, there are many employees, but for somebody like me, that's overwhelming. I open a Ruby file, and it has 2,000 lines of code, which gives me a headache.
00:02:39.519
And I don't just want to complain about it; I want to do something about it. What I've been doing about it is deleting code.
00:02:58.560
It might sound weird, but I'm passionate about deleting code, and I've never heard a talk about it. That's why I'm here to give this talk. In fact, in my first quarter, I deleted about 50,000 lines of Ruby code, and I'm sure there is more.
00:03:18.640
I'm sure if you work at a company, you can probably find some code that can be deleted as well. So, if you're passionate about it, or if I manage to make you passionate about it, then this talk is for you.
00:03:36.239
This talk, as I was saying, really comes from experience. I have made many pull requests, and when I looked back at those, I realized a pattern was forming. Every time I was creating a pull request, there were some steps that I was repeating.
00:03:54.480
To give you the alternative, I didn't just make one pull request that says 'delete 500,000 lines of code' or something, because my co-workers would be like, 'What are you even doing? You just joined; you can't do that!'
00:04:01.360
So that's not what I did. I made very small pull requests that followed a pattern, and that's what I'm going to talk about today.
00:04:05.760
If you're still in the back, you can come closer. The technical part hasn't started yet. So this is kind of like the pattern or process I found, which is easy to understand. The first step is: how do you find code that can be deleted?
00:04:35.199
How do you even get there? That's kind of like the recognition part. Then, of course, you want to make sure that you can delete it. You don't want to just type backspace and pray that production doesn't go down.
00:04:50.160
Finally, you probably have co-workers, so they also need to understand why you did it. They might need to review your code, so there’s also some effort that goes into that.
00:05:06.560
Let's get started! Here's some Ruby code, since we are at RubyConf. This is a method, a pretty short one, and it's nine lines of code. To give you some context, this is a method called 'model' that displays a model in an ERB view.
00:05:31.280
You don't have to type HTML and CSS; you can just invoke this method we wrote in your view, and it accepts some options regarding how the model should be displayed.
00:05:45.199
So, this method already existed, and one day I opened this file because I had to add an option to this method. The options are a hash, and there are different options.
00:06:03.759
So, you know, you might be curious about this code, just like I was when I opened it: what is this code doing? Because I knew I had to edit it, but I wanted to read it a little bit and understand what was going on.
00:06:24.639
I see there's an option called 'header_icon,' and I might guess what that is doing. Then I see this conditional option, and I'm like, 'What is that option exactly?' I have this curiosity; I want to understand the code because maybe I can reuse it.
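The slide itself is not part of this transcript, so the following is only a rough sketch of what a view helper with a conditional 'header_icon' option might look like. The method body, the HTML, and the class names are all assumptions made for illustration; only the method name and the option name come from the talk.

```ruby
# Hypothetical reconstruction: the real method shown on the slide is not
# available, so everything except the names `model` and `header_icon`
# is invented for illustration.
def model(title, options = {})
  html = +"<div class=\"model\">"
  if options[:header_icon]
    # The conditional option: an icon is rendered only when it is passed.
    html << "<i class=\"icon-#{options[:header_icon]}\"></i>"
  end
  html << "<h2>#{title}</h2>"
  html << "</div>"
  html
end

puts model("Settings", header_icon: "trash")
```

If no caller ever passes 'header_icon', the whole conditional branch is dead code, which is the suspicion the next steps try to confirm.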
00:06:45.280
Or maybe that code is actually code that can be deleted; I don't know.
00:07:01.680
So before I even start typing, I stop there. I'm just curious: what is this 'header_icon' option? Is it even used anywhere?
00:07:05.760
So the first tool that we can all reach for is 'Find in Project.' This looks different based on your editor; it can be something like Command+Shift+F, or you can do the same search in your terminal or in a graphical interface.
00:07:29.639
Basically, what I'm doing here is I see this header icon, and I want to see where else it is in the code: who is actually using this option? That might give me some insight.
00:07:48.639
And it turns out that this option is only used in this file, so now I have a suspicion: I'm like, 'Wait, why is there an option that nobody's calling?' You know, maybe I'm on the right track; maybe this code is not used.
00:08:01.679
However, let's not forget that Ruby has metaprogramming. Maybe somebody, some evil co-worker, called '.send' and passed an interpolated string, or did one of those other things you can do in Ruby.
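To make the metaprogramming concern concrete, here is a minimal sketch of two ways a name can be used without its literal text appearing at the call site. The helper and the class are hypothetical and are not taken from the codebase described in the talk.

```ruby
# Hypothetical helper: the option key is assembled at runtime, so a text
# search for the full option name does not match this line.
def header_option(options, part)
  options["header_#{part}".to_sym]
end

# Hypothetical class used to show the same effect with `send`.
class Card
  def header_icon
    "trash"
  end
end

options = { header_icon: "trash" }
puts header_option(options, "icon")

# The method name is built from two pieces before being dispatched.
suffix = "icon"
puts Card.new.send("header_#{suffix}")
```

A project-wide search for the option name would find the definitions here but neither of the two dynamic call sites, which is why an empty search result is a hint rather than proof.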
00:08:25.440
So I can't be sure that I can delete this yet, and I want to go a little deeper.
00:08:40.640
The next tool I reach for is Git. Now, not every Ruby programmer uses Git; there are other version control systems, but it's definitely one of the most common.
00:08:58.400
So, with Git, what I can do is use 'git blame' and the name of the file.
00:09:01.679
What 'git blame' does is, for each line of code, it tells me who wrote it, when, and why, since there's a commit message. The number you see at the beginning is called a SHA, and it identifies the commit that introduced the line.
00:09:19.919
For instance, I'm looking for this one. The reason why I'm doing this exploration is that if someone added it, maybe that commit message is going to tell me why. What was it doing?
00:09:30.240
So I can look more specifically at this commit.
00:09:44.720
There are more seats available if you want to come and join us over there.
00:09:51.040
There is another Git command to look at a specific commit, and that is 'git show.' So I type 'git show' with a SHA, and now I'm investigating this commit.
00:10:05.760
It's titled 'create an icon helper' and was created three and a half years ago. It does indeed add this option called 'header_icon' that I'm curious about.
00:10:31.760
Not only does it do that, but in a separate file called 'delete_app,' it also uses this option. This makes sense, right? Somebody added an option and also used it; why would someone add an option just because?
00:10:55.040
So, what this is telling me is that three and a half years ago, somebody added the option and was using it somewhere. But then 'Find in Project' is telling me that it's not used now.
00:11:18.320
So, what happened? Now I'm kind of like Indiana Jones trying to find the Holy Grail. What I want to do now is look at this other file, 'delete_app.rb.' What's the story of this file?
00:11:30.720
So there is another Git command for that: it's called 'git log.' You type 'git log' with a file, and you know, it normally shows the history of the file.
00:11:48.320
But I'm getting an error here that says there is no known revision or path not in the working tree. This error simply means that the file doesn't exist anymore.
00:12:04.800
So I might think 'cool,' the file doesn't exist anymore, so I'm good—nobody's calling it. But once again, I want to be more specific. Okay, it doesn't exist anymore, but can I see when it was deleted?
00:12:22.320
Like can I still look at the history of a file? It turns out you can do that with 'git log'; you just need to be more explicit with the options.
00:12:35.040
So you can do 'git log --follow', which will continue the history of a file even beyond renames. Then you add '--' followed by the file path. Now I can actually see the history of the file even though it's been deleted.
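The three Git commands used so far can be collected in a small Ruby helper that builds the argument lists. This is only a sketch: the module and method names are made up for illustration, while the subcommands and flags are the ones named in the talk.

```ruby
# Hypothetical helper module; only the git subcommands and flags
# ('blame', 'show', 'log --follow --') come from the talk.
module GitArchaeology
  module_function

  # Who last changed each line of a file, and in which commit.
  def blame(path)
    ["git", "blame", path]
  end

  # Everything a single commit changed, with its message.
  def show(sha)
    ["git", "show", sha]
  end

  # History of a path, followed across renames; the '--' separator lets
  # Git accept a path that no longer exists in the working tree.
  def history(path)
    ["git", "log", "--follow", "--", path]
  end
end

puts GitArchaeology.history("delete_app.rb").join(" ")
```

Inside a repository, each array can be handed to `system(*GitArchaeology.history(path))` to run the command; building arrays rather than strings avoids shell quoting problems with unusual file names.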
00:12:56.760
The most recent commit called 'remove_the_delete_app_card' actually makes sense. It says basically this file was deleted, and it was done two and a half years ago.
00:13:22.960
So now the story in my mind is complete: three and a half years ago, somebody added the option and used it. Two and a half years ago, somebody else removed the only usage of this option from the code but forgot to delete the option itself.
00:13:42.080
That happens; we're all busy building million-dollar features.
00:14:02.720
So that is the whole story: there was a commit where it was added, and then there was a commit where it stopped being used.
00:14:19.199
Finally, I have everything with me to write what I would call a good commit message.
00:14:35.360
So I can delete the code and type 'git commit.' The way I write the commit messages goes along this way: remove unused option 'header_icon'; it was introduced in 27b but its last invocation was removed in 86e.
00:14:53.600
The reason I call this a good commit message is that it's compact but to the point. What I mean by that is that anybody who reviews this commit has all the tools to understand what I'm doing.
00:15:12.160
Especially if you use GitHub, GitHub translates those numbers and SHAs into links so you can click on them, and it actually takes you to those commits.
00:15:31.760
For this reason, I don't need to type out 'three and a half years ago, x did this' and so on, because you can just click on those links and see for yourself.
00:15:51.680
So this commit message is short but really powerful, and I'm doing this basically to do a favor to myself. What I mean is, I'm going to submit this commit, and at my company, just like many other companies, somebody has to review the code changes.
00:16:11.680
Maybe the person who reviews this is going to be reviewing it tomorrow or three days from now. If I just type something like 'deleted unused code,' they might come back and say, 'Are you sure? What is this for?'
00:16:32.080
Three days from now, maybe I'm working on another feature, and I have to interrupt myself. Why do that if I have already done all the work today?
00:17:00.320
This is the way in which I share this kind of code with my co-workers.
00:17:06.720
Another thing I do to make sure that my changes are approved in a very easy way is to keep this commit small. So this specific commit is just deleting those lines of code.
00:17:30.720
So when somebody sees it, they're like, 'Oh, okay, I get it.' I look at the code, I get what's happening, I can click, I can ask, but it's not 300 additions, 400 deletions, and so on.
00:17:51.840
Now, normally, when I find code that can be deleted, I'm in the middle of a gigantic feature. I'm like, 'Oh, I'm doing all these things; I'm building the next big feature!'
00:18:05.680
But now I also want to delete this.
00:18:20.399
So it's easy to put it all together in one commit. It's easy to think, 'I can just do one commit, and I have a commit message that's like a book,' saying it's doing all these things at once.
00:18:41.920
But that would still not be nice to my co-workers. So, instead, what I suggest is that if you're working on a big feature, pause whatever you're doing, just put it aside, make a small commit to delete code, and then bring your work back.
00:19:02.720
In fact, this pause is not hard to do. There is a Git command for that; it's called 'git stash.'
00:19:18.960
Let me show you an example. Let's imagine I'm working on a feature, and meanwhile, I've already touched this file; I've already added another file.
00:19:24.960
And I'm like, 'Wait, I want to pause this; I want to, you know, put this aside and delete this.' So what you can type is, you can type 'git stash.'
00:19:47.760
If you add the '--include-untracked' option, which is simply '-u', it's going to stash all the files, including the untracked ones, which by default are not stashed.
00:19:59.200
So remember to use that option. You stash, then you do your very small commit; you know, you push it, and then you bring your work back with 'git stash apply.'
00:20:16.800
So maybe this is simple, maybe this is complicated, and you know I'm here to hear your feedback after the talk. But it's really just a process.
00:20:43.360
So if you are in the mindset, and you really just want to make sure that some code gets deleted and that the approval for that PR is, you know, just a thumbs up, this is something that has worked for me so far.
00:21:03.920
In the last part of this talk, I want to talk about a couple of other techniques that are a little more advanced, but they still fit into this process.
00:21:33.760
The first one is recognition. I said before, I happened to open a file that had some unused code, and I saw it.
00:21:52.160
But if you work in a very big codebase, you probably have files that you never open, and maybe those are the files that have code that can be deleted.
00:22:08.160
So it's kind of like a paradox. How do you even find it if you never open them? I developed my own strategy. I don't know if anybody does this.
00:22:27.680
The way in which I do it is with code coverage. Code coverage is a tool that's normally used for testing.
00:22:42.720
It's available in almost every programming language, and in Ruby, there is one that I really like, which is called SimpleCov. Normally, when you do code coverage in testing, you write your tests, you run them, and then this tool comes up and tells you which lines of code have been executed by your tests.
00:23:03.680
But nothing prevents you from using these tools in development or in production, which is what I do. In development, I start the tool, then locally I just play around with the app, you know, try to hit all the methods and stuff for a while.
00:23:23.200
Then I stop the tool, and it comes up telling me which lines of code I have executed.
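The start, exercise, stop cycle described here can be sketched with Ruby's built-in Coverage module, which is the standard library that SimpleCov builds on. This is not the speaker's SimpleCov setup; the temporary file and the two helper methods below are made up so that the example is self-contained.

```ruby
require "coverage"
require "tempfile"

# Hypothetical source file: one method gets called, the other never does.
source = <<~RUBY
  def used_helper
    "used".upcase
  end

  def forgotten_helper
    "forgotten".upcase
  end
RUBY

file = Tempfile.new(["helpers", ".rb"])
file.write(source)
file.close

Coverage.start            # start recording; only code loaded afterwards is measured
load file.path            # load the "application" code
used_helper               # stand-in for playing around with the app
result = Coverage.result  # stop recording; maps each path to per-line counts

# Find this file's entry without relying on the exact form of the path key.
_path, counts = result.find { |path, _| File.basename(path) == File.basename(file.path) }

# A count of 0 means the line is executable but never ran; nil means the
# line is not executable (blank lines, `end`, and so on).
never_run = counts.each_index.select { |i| counts[i] == 0 }.map { |i| i + 1 }
puts "Lines never executed: #{never_run.inspect}"
```

The reported line numbers should point inside forgotten_helper. As in the talk, that is a hint about where to start looking, not proof that the method can be deleted.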
00:23:43.120
Sometimes I do this in production as well. I let users roam around for like an hour, you know, do whatever they need to do.
00:23:59.920
If a file has 100% coverage, it means that it's been completely executed; all the lines are used for one reason or another.
00:24:25.200
But if that's not the case, that gives me a hint. Maybe customers have been going around for an hour and there is a method that's never been executed.
00:24:44.000
I have a hint; next time I have some spare time, maybe I can start there. I don't have to randomly open a file and hope that I find a good one, especially in a very big codebase.
00:25:00.960
So code coverage has helped me with that, and it just gives me hints or places where I can start. There might be static analysis tools out there that do similar things.
00:25:19.680
The good thing about code coverage tools is that probably you already have them in your codebase for testing. They’re really easy to use.
00:25:38.360
Finally, I want to talk about another technique, and here is where you need to be awake because this is the hard part. It's another Git command that I want to talk about.
00:25:57.960
So this is the exact same slide I had before where I said this 'delete_app' file was using the option that I was mentioning, but this file is gone.
00:26:09.480
It’s pretty easy to understand that it’s not invoking the code anymore; the file has been deleted completely.
00:26:24.960
Now I want to talk about another scenario, which goes like this: the file itself still exists but has different code.
00:26:41.600
Three and a half years ago, somebody was using the option in this file. Today, the file is still here, but the code is different.
00:27:05.760
So in other words, three and a half years ago, the code that I'm trying to find was there. Today, I do a 'Find in Project,' and it’s not there; it’s not in that file.
00:27:19.440
So somewhere in the middle, somebody went and changed this file and removed the only usage of that option.
00:27:31.440
The question is, where? You know, there might be dozens or hundreds of commits in the middle, and I don't want to go one by one and check.
00:27:57.440
So is there a faster way to do that? There is, and it's a command called 'git bisect.'
00:28:04.720
So to the right, these are all the commits that have ever affected this file. I know that somewhere here, there is one commit that removed the only usage of that option.
00:28:22.440
So what I do is type 'git bisect start.'
00:28:39.520
Then I type 'git bisect good' with the oldest commit, the very last one in the list (good means the option was invoked; the code was there; we know that).
00:28:56.480
Then you type 'git bisect bad' with the most recent commit (we know the option is not there).
00:29:02.720
When you type this, Git takes you exactly in the middle. That's why it's called bisect. It takes you, you know, no matter how many commits there are, it takes you in the middle and it says, 'What about this commit? Is the code here invoking that option or not?'
00:29:18.000
Now, you have to do that work. You can use 'Find in Project'; what I use is 'grep,' which counts how many occurrences of 'header_icon' are in that file.
00:29:39.840
In this case, zero means it’s not using the option, so now all I have to type is 'git bisect bad', because I know this code is not there.
00:29:54.720
What 'git bisect' does now is take me to the middle of the second half. It says, 'What about here?'
00:30:12.640
And now you do this again. So was it there? In this case, it was there, so you just type 'git bisect good'.
00:30:23.760
Now, that takes you to the half of the third quarter, and you do that; it only takes four or five steps until you get to a single commit.
00:30:42.960
In this case, it tells me which was the first bad commit.
00:31:08.560
Once you're done with 'bisect,' you type 'git bisect reset' to just go back to your normal work.
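The halving idea behind 'git bisect' can be sketched in plain Ruby. This does not call Git: each string below stands in for the contents of the file at one commit, oldest first, and all of the data and names are made up for illustration.

```ruby
OPTION = "header_icon"

# Counts the lines that mention the option, similar to `grep -c` on one file.
def usage_count(content)
  content.lines.count { |line| line.include?(OPTION) }
end

# Assumes the first snapshot uses the option ("good") and the last one
# does not ("bad"), mirroring the two commits given to git bisect.
# Returns the index of the first snapshot without the option and the
# number of snapshots that had to be inspected.
def first_snapshot_without_usage(snapshots)
  good = 0
  bad = snapshots.length - 1
  checks = 0
  while bad - good > 1
    middle = (good + bad) / 2
    checks += 1
    if usage_count(snapshots[middle]).positive?
      good = middle   # option still used here: the removal happened later
    else
      bad = middle    # option already gone: the removal is here or earlier
    end
  end
  [bad, checks]
end

with_option = "model(title, header_icon: :trash)\n"
without_option = "model(title)\n"
snapshots = Array.new(12, with_option) + Array.new(8, without_option)

index, checks = first_snapshot_without_usage(snapshots)
puts "Usage disappears at snapshot #{index}, found after #{checks} checks"
```

With 20 snapshots, only 4 of them are inspected. The number of checks grows with the logarithm of the number of commits, which is why a handful of steps is enough even for a long history.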
00:31:20.960
Now that we have this, we can use it in the same way as before; we can say, 'Remove unused option header_icon. It was introduced, but its last invocation was removed in 1927.'
00:31:41.040
Then people can click there to see that it's not that the file was deleted; it's that that specific line of code was deleted, and then the rest of the commit is similar.
00:32:03.760
So, to wrap up, these are all the Git commands I mentioned in my talk and I find them pretty useful to delete code.
00:32:20.720
The first one is you're just curious. You're looking at some code: when or why was this added? You do a 'git blame.'
00:32:41.040
Then, maybe, you find the line of code and you want to see what else happened at the same time; you can do 'git show' with that commit.
00:33:03.840
If you're looking at a specific file and want to know the history, even if the file was renamed or deleted, you can do 'git log' with those options.
00:33:28.400
If you know there was a commit that changed something but you don't know which one, you can do this in a rapid and efficient way with 'git bisect.'
00:33:51.920
With all this, you have what you need to write a good commit message.
00:34:14.720
So, first of all, if you're in the middle of something, put it away, stash it—do 'git stash.'
00:34:32.960
Then write the commit. My suggestion is to never use 'git commit' with the '-m' option, meaning inline in your terminal.
00:34:44.960
Just do 'git commit,' and open your editor. So then you have all the space that you need to write your message.
00:35:00.000
Then you can bring your work back with 'git stash apply.'
00:35:14.240
To conclude, I just want to mention that, as I said at the beginning, this is a talk that really comes from experience and passion.
00:35:43.680
It's really not about looking back. I never really care about who left the code there or why it's there; I'm perfectly aware that these things happen.
00:36:02.560
It might well have been me who forgot about it, and that's really not relevant to me doing this.
00:36:19.680
It's kind of similar to when you go to the beach, and there's a piece of trash. Some people don't even notice it, and then others see it and think, 'Well, I didn't leave it there, so I don’t have to pick it up.'
00:36:36.480
But then there are people who pick it up because it doesn't matter; we don't want to linger in the past; it's just more about building a better future.
00:36:54.880
Finally, I want to thank the committee for accepting my talk because, in my mind, this was kind of a very personal and even weird talk.
00:37:06.400
But it actually was accepted, so if you're in the audience and you have any passion or theme or anything that's related to Ruby and you have doubts and think it will never get accepted, mine was!
00:37:24.560
So you really only have to submit an abstract which is like 300 words. I encourage you all to do it because maybe next time, you're going to be standing on this stage giving a talk.
00:37:41.600
That concludes my presentation. Thanks for coming, and have a great rest of your day!