Squashing Security Bugs with Rubocop

by Omar

In this presentation at RubyConf 2021, Omar discusses how to utilize Rubocop for identifying security bugs in code. He begins with an explanation of static analysis, contrasting it with dynamic analysis, and emphasizes its significance in software development. The core focus is on how Rubocop can automate the detection of coding patterns and security vulnerabilities, improving the overall developer experience.

Key Points Discussed:
- Definition of Static Analysis: Static analysis is about examining code without executing it, providing fast and efficient error detection but may lead to false positives.
- Role of Abstract Syntax Trees (ASTs): Omar explains how ASTs help in analyzing the structure of code and how Rubocop leverages this concept.
- Automating Code Reviews: Rubocop enforces style guides and finds issues automatically, allowing developers to focus on substantive code review rather than stylistic preferences.
- Security Anti-patterns: The speaker outlines the concept of security anti-patterns—common coding practices that may lead to vulnerabilities over time, illustrated with examples like using eval method carelessly and allowing unrestricted access in Rails controllers.
- Creating Custom Cops: Omar shares his experience building a Rubocop cop to detect unscoped find calls in Ruby on Rails. He explains the process of defining anti-patterns and creating ASTs to automate this detection, making code safer by preventing such patterns from emerging in the future.
- Developer Experience Consideration: An essential aspect of integrating security tools is ensuring they do not hinder productivity. Omar argues for a balance between security and usability, emphasizing the importance of clear communication and ideally integrating security checks into existing workflows without adding undue overhead.
- Tools and Resources: Throughout the talk, Omar provides resources and tools his team developed to aid in this analysis, encouraging developers to utilize Rubocop for better security practices.

In conclusion, the key takeaways from this presentation include the importance of prioritizing developer experience in security tools, the ease of creating custom security checks with Rubocop, and the critical nature of addressing security anti-patterns in codebases. This talk serves as a practical guide for developers looking to leverage Rubocop as a tool for enhancing security in their applications.

00:00:10.080 Hello folks, thanks for joining me. My name is Omar and today I'll be talking about using Rubocop to find security bugs.

00:00:11.920 At a high level, this is what I'll be covering today. I'll start with a quick primer on static analysis and Rubocop, tie that into security, and then focus on the developer experience and why it's so important.

00:00:16.000 So first, a quick introduction. I'm currently a staff security engineer at Betterment. By the way, we are hiring, so if any of this is interesting to you, please come talk to me afterward. Prior to working at Betterment, I used to work at Etsy and OkCupid, where I did all sorts of security things. Also, just to note, I'm not a Ruby programmer. This is my first time writing Ruby, as I much more enjoy program analysis and vulnerability research. This just seemed like a great opportunity to learn Ruby at the same time.

00:00:30.640 Now, cool, static analysis. What is it? It's just a fancy computer science way of saying understanding programs without actually executing them. The opposite would be dynamic analysis, where you might, for example, use a debugger to inspect your program and get some static concrete values out of it.

00:00:37.040 There's some pros and cons to using static analysis. The pros being generally that these programs are a lot easier to write and are much more performant, but this usually comes at the expense of false positives and coverage issues.

00:00:40.159 So if you've ever actually used GitHub source code navigation tools, you'll notice that if you can click on a method name, it'll tell you where that method is called and where that method is defined. Well, static analysis is the core of this. GitHub isn't actually executing any of this Ruby code to come to these conclusions; it's purely static.

00:00:58.960 So I like to think of static analysis as kind of like the Socratic method of learning, where you ask these questions back and forth: you ask, 'Oh, who calls this method?', 'What does this method return?' Going back and forth, these are things you ask to learn more about the program you're looking at.

00:01:21.560 We do this stuff implicitly all the time when we load up source code in our text editors or review pull requests in GitHub, but the problem with this is it's very manual, very error-prone, and most importantly, it doesn't actually scale.

00:01:38.800 So for the rest of this talk, I'll be focusing on performing this type of static analysis automatically.

00:01:40.560 To do that, we're going to start by learning more about abstract syntax trees. At a high level, abstract syntax trees, or ASTs, are the format that your compiler or interpreter turns your code into, making it easier to operate with.

00:01:47.040 Basically, computers don't have any issues with both of them, but as lowly humans, it is much easier to write programs for ASTs than it is for raw source code.

00:02:05.160 So to deal with ASTs, we'll be leaning very heavily on Rubocop.

00:02:08.480 At a high level, Rubocop is used for identifying formatting issues and bugs, a lot of which Rubocop can actually fix entirely automatically. It's an indispensable tool for anyone who works on a Ruby codebase, particularly one with many people working on it at the same time. It makes it easy to enforce your style guide and keep your codebase very consistent.

00:02:26.400 So take this code, for example. In this case, that return statement is completely optional. However, because it is optional in a code base without Ruby, you might find this being used very inconsistently. Some developers might prefer it; some might not. This might make code a little bit more tricky to deal with, especially for your code reviewers, who maybe can't come to a conclusion about what should be used.

00:02:41.040 However, by introducing Rubocop, we can actually start to take a hard stance on things like this. Here you can see Rubocop actually called me out for saying, 'Oh, you don't actually need this return statement; just get rid of it.' This is really important because not only am I enforcing the style guide in a very consistent way, but it also saves time for the code reviewer.

00:02:49.840 The reviewer doesn't need to look for these issues or leave a comment saying, 'Hey, get rid of this!' Rubocop has already done that for them.

00:03:07.680 So let's just save nitpicking for the robots.

00:03:08.880 Okay, so I've talked a bit about ASTs and Rubocop. You might be wondering what they have to do with each other. Well, pretty much everything. At the heart of just about every Rubocop is this class called NodePattern, which is basically regular expressions but for ASTs. I know, that sounds kind of scary, but if you stick with me for a few minutes, I'm hoping that it'll all make sense by the end.

00:03:19.440 So take a look at this built-in Rubocop that looks for unsafe uses of the eval method. In particular, look at the def node matcher block right there in the middle. If you notice, it looks very similar to the abstract syntax tree that we generated earlier a few slides ago. This is where we tell Rubocop this is the code that we're interested in looking at.

00:03:36.960 So think of a node or def node matcher as a simple wrapper for node patterns. It makes it easy to say, 'I've got one line that defines what my code looks like,' and Rubocop knows to look there. I like to think of it like writing a program to process phone numbers in a document, where you break it up into extracting phone numbers and then processing phone numbers. The def node matcher is the one that's doing the extracting.

00:03:51.840 It's very similar to regular expressions in that you are defining what it looks like, extracting the things that you're interested in, and then once you've extracted those details, you can do whatever validation you want.

00:04:08.720 So you can say, 'Cool, I know what eval methods look like. What about the eval methods am I most interested in?' Let’s look again at this cop from earlier. First, we start out by defining what an eval call looks like using the slightly modified abstract syntax tree.

00:04:16.560 Next, when we are handed a piece of code, Rubocop will ask, 'Does this actually look like an eval?' If it doesn't, it just bails early and keeps on moving. But if it does, then we start asking: 'Does the one argument that this eval method takes, is it a static string?' If it is a static string, we exit early because if an eval method takes a static string, chances are it doesn't really have any vulnerabilities.

00:04:33.920 However, if it takes a dynamic string—possibly taking in user input—then at that point we alert the user saying, 'Hey, this eval looks potentially unsafe,' so that you and your code reviewers can tell that maybe this evaluates worthy of a second look.

00:04:51.840 The alternative to using Rubocop like this would be to write a very long program where you might find yourself asking questions like: 'What methods are being called here? Do any of these methods look like an eval call? What arguments do they take, if any?' Writing a program like this isn't impossible, but it is very tedious, very error-prone, and most importantly, very difficult to maintain.

00:05:06.000 Cool, so you all came here for a security talk. It's only fair that I actually start talking a bit about security now, so I'll be talking about anti-patterns. As software engineers, we're all very familiar with anti-patterns. These are generally considered bad practices that you shouldn't do.

00:05:12.960 But what makes security anti-patterns more egregious is that they are much more likely to lead to vulnerabilities. Therefore, over time, as you introduce more and more of these anti-patterns, you're much more likely to introduce a vulnerability. For further clarification, you can see my very scientific chart here, where as time goes on, you end up entering into foot-gun territory, as I like to call it.

00:05:36.400 Basically, the more you do it, the more likely you are to introduce a vulnerability. So that's why we call them security anti-patterns. Consider this AutoJSON class, for example. The interesting thing about this AutoJSON class is that it will pick a JSON parsing library for you automatically based on what's installed.

00:05:52.240 If you don't have the faster, more performant JSON library installed, it'll just use a default. To do that, it relies on the eval method, basically using eval to dynamically import the JSON parsing module. However, things become a little trickier when you want to add support for, say, YAML.

00:06:10.559 What would you do if you need to add support for YAML or even more? If you need to let the client specify whether they're using JSON, YAML, or any other serialization format, basically, with code like this, it looks a little bit scary but is safe technically speaking.

00:06:27.440 But what you can't do is tell what it'll look like a year from now and the security implications that come with those changes. So a more realistic example: consider this Rails controller. If you've ever worked with Rails, you've seen controllers like this a million times.

00:06:46.480 But what stands out? Well, you might notice that the document object is not scoped to the current user. So whenever you are accessing this controller, the user can actually ask for any document, not just the ones they own; any document they know the ID for.

00:07:01.280 In some contexts, you know that might be safe, but generally speaking, that's not the behavior that you want. To fix it, the way we would typically do this is by changing this line into something like this—basically leaning on ActiveRecord's relations—so that the current user has its own documents, and you can rely on ActiveRecord to tell you that this user has these documents.

00:07:13.719 You don't have to do a complete search for all documents from scratch.

00:07:30.880 So let's tie all this together. We've talked about security anti-patterns and static analysis. Now, how can we put all these things together to actually start finding bugs?

00:07:39.680 Let's focus on the document.find example from earlier. The reason why I think this is a really good candidate for security static analysis is that, structurally speaking, it's actually very simple. There are no conditionals involved; it's a single method, and it takes a single parameter. So the abstract syntax we will be looking at will be very tiny.

00:07:54.960 Next, the undesirable behavior is actually pretty abstract; it's not specific to document objects at all. You might have documents, messages, secrets; any model you have in your database, this anti-pattern typically will work just fine.

00:08:12.960 And then the best part is that remediation for these findings is actually very easy. Like you saw earlier, if you can scope it to the current user, you've got your solution right there.

00:08:24.320 The general rule I like to think of is if you can write a grep command for it or if you can find a regular expression to capture this anti-pattern, even if there are false positives, it's still a great candidate for security static analysis.

00:08:27.560 I feel like just the single sentence alone makes it very easy for me to think, 'Is this worth spending time doing static analysis on, or is this too complex that might need addressing from a different standpoint altogether?'

00:08:42.039 So let's build a cop to find this type of anti-pattern. Let's start with the actual anti-pattern: document.find params[:document_id]. It's an anti-pattern because it takes user input, params[:document_id], and finds a document based on it without doing any authorization checks. Let's turn that into an abstract syntax tree.

00:09:05.359 Now we know what that looks like, but Rubocop uses a slightly different flavor of abstract syntax tree, so let's modify it just a little bit. One thing in particular that I've taken this opportunity to do is to swap out document for a placeholder.

00:09:20.040 This way, we're not just looking for document objects. Like I mentioned, this will abstract out to any ActiveRecord model, so instead of document, we can look for any other .find methods. Then we glue in some Rubocop boilerplate code, add some extra logic, and now we've got a fully functional cop.

00:09:38.399 Let's walk through what we've got here. First, we start with a helpful message for the developer, telling them, 'Hey, model.find on user input might not be what you want to do.' Realistically, we would go into a little bit more detail; this is actually pretty vague, but for the sake of the screenshot, I kept it brief.

00:09:56.640 Now we've embedded our modified Rubocop AST, which will tell Rubocop what an unscoped find looks like. Next, I decided to add just a little bit of extra logic so you can kind of get an understanding of what some of the capabilities are. I've decided to say only alert if the parameter name ends in _id.

00:10:13.920 So if we're looking for document.find random parameter, maybe that's okay, but typically if the user is specifying an ID, then we want to pay a bit closer attention to that. With this logic at the end, I can actually say not just any param, but a param that ends in _id.

00:10:30.720 With this few lines of Rubocop code, I was able to find all unscoped finds in the codebase as is. The great part is that not only can we find this anti-pattern in the codebase right now, but we can also prevent new instances of this anti-pattern from being introduced going forward.

00:10:46.560 The next best part of this is that the Rubocop you saw is actually very simple, making it very easy to iterate on. So a next step might be to say, as a Rails programmer, you know params is not the only way to get user input—there's param.permit, and that might be another way that user input ends up in your code.

00:11:02.960 You might be able to modify that cop I just made to look at params that permit code as well.

00:11:15.840 The last part, but arguably the most important part, is the developer experience. I think this is often overlooked when discussing stack analysis, so I feel it to be arguably the most crucial point of this presentation.

00:11:27.760 Developer experience is key because, ultimately, the developers are your customers and they are responsible for the success of you and the company. So prioritize this as much as possible. Just as you want your apps and websites to be easy to use, intuitive, and requiring no extensive research, I feel the same way about developer tools.

00:11:46.560 When thinking about developer experience, it's a bit of a buzzword-y term nowadays, but developer experience is basically user experience where the user is a developer and the product is a developer tool. These are things that you want to keep in mind when you make developer tools.

00:12:02.560 Lastly, what we want to do is make sure that we keep our developers safe without sacrificing productivity in the process. If developers realize that they are operating at half capacity because they're wrestling with security tools the whole time, chances are they won't want anything to do with you or your team.

00:12:18.560 I find myself asking these questions a lot. I'm not the developer here; I'm just the maintainer of this tool. I just wrote this tool, so I want to keep in mind the perspective of the developers who work with my tools every day.

00:12:35.040 Firstly, I need to focus on whether the offenses that Rubocop raises are my developers equipped to deal with these if they are working on a pull request that needs to go out the next day? Will this Rubocop finding delay that deploy by a day, an hour, or five minutes? These are questions that you need to ask.

00:12:53.520 Next, how easy are these tools to configure? A lot of security static analysis tools and static analysis tools in general are very difficult to use. Some old-school security static analysis tools are very enterprise-focused and have entire teams dedicated to them.

00:13:05.920 Unfortunately, I can't expect that from my colleagues. I don't think it's fair to expect that from them, so one thing that I focus on a lot is making sure that anyone who wants to adopt these tools has a very easy time doing so.

00:13:26.080 Lastly, what burden are we adding when we add static analysis tools to an already existing and pretty long and hefty testing pipeline? I want to make sure that if there are false positives, it doesn't take an hour to triage every single one of them. These are considerations you want to keep in mind.

00:13:47.360 Dealing with the offenses basically means a developer should be able to deal with offenses without having to consult a subject matter expert every single time. If they find themselves needing to open a ticket with security or get on a Zoom call with you every single time, they're going to start tuning these things out. That's not fair to them, and ultimately, that's not fair to anybody involved, even the security team that made these tools.

00:14:11.760 So the harder it is to use these tools, the less likely they'll actually want to use them. I've seen a lot of security teams take kind of an RTM approach with these tools where basically they'll say, 'We did the hard part in identifying these bugs; it's on you to figure out why these bugs matter.'

00:14:18.880 I don't think that really makes sense in the long run. Personally, I've seen that actually cause a lot of problems when you have engineers who have been told that there are vulnerabilities in the code but not told what that actually means.

00:14:33.280 They'll think, 'Oh, this is just a false positive; I'll tune it out, push it to production, and I'm done for the day.' You can't really do that.

00:14:41.040 One way we've addressed that is by trying to add inline remediation advice as much as we can. Our production version of these Rubocops will actually tell you, 'Hey, this is what you're doing; this is what the code might actually need to look like.' Instead of having a massive message, we actually link out to some documentation that goes into a little bit more detail about what remediation could look like.

00:14:59.760 Another key detail is to keep things simple. As software engineers, it's very tempting to have all sorts of configuration options. You might want to have a sliding scale for how sensitive the tool is to finding vulnerabilities and what arguments it looks at.

00:15:10.000 But ultimately, as the developer of this tool, you are the subject matter expert, and it's fair for you to be opinionated. In fact, I think you should be opinionated when you deploy these tools.

00:15:27.760 As a subject matter expert, you are the one that people come to for advice. To save them time, be opinionated. Embed those opinions into the tools you make.

00:15:43.040 The empathetic approach actually turned out to be the most productive approach where we could have done the security static analysis in any number of ways. We could have used a new tool or purchased a vendor, but we found that all of our engineers are very familiar with Rubocop. It made sense for us to stick with that.

00:16:00.480 Prior to this work, we actually did use several vendors that attempted to do this, but we found it to be very difficult. These developers had to read more documentation, install vendor tools, talk to vendors, and figure out why vulnerabilities were being surfaced when they maybe didn't need to be.

00:16:23.760 By sticking with Rubocop, it required a little bit more effort from the security team's part, but it pays off a lot in the long run when you stick with the tools that the developers are already comfortable with.

00:16:43.920 Basically, if you meet the people where they're at, you'll have a lot more luck.

00:16:57.040 And lastly, if your tools all look like this, no one's going to like you. No one's going to want to talk to you. No one's going to want to look at this; they're just going to close their laptops and walk out of their office.

00:17:02.800 Just don't do this.

00:17:10.320 So next up, minimizing toil. I feel like this is also kind of peripherally touched upon when people talk about static analysis, but they don't go into enough detail.

00:17:20.479 Maintaining a high signals-to-noise ratio is crucial. As a developer, you might say, 'Oh, if I add this one extra line to my tool, I can surface two more edge case vulnerabilities.' And as a security person, I'd be like, 'Cool, those are very important things to surface.'

00:17:36.800 But you know, if those two security vulnerabilities are real but come at the expense of 50 false positives, I don't know if in good faith I can condone that type of finding. Ultimately, that's 50 false positives that an engineer has to triage. That might take a lot of time and ultimately reduce trust in your tools.

00:17:55.840 When you reduce trust in your tools and your security team, you end up weakening security overall in the long run.

00:18:07.040 There are also other ways of addressing this. Instead of saying, 'Oh, we need to find these two edge case vulnerabilities,' you might be able to address those vulnerabilities and the gap in other ways—such as developer training or code reviews—basically non-technical approaches to addressing technical problems.

00:18:24.240 We've found that developer trainings can actually be very important. Sometimes, you can't solve everything with a technical solution.

00:18:42.760 To close with a few key thoughts I think are the most important: writing cops for security vulnerabilities is a lot easier than you might think. I'm a security engineer but not a Ruby programmer. A lot of you are not security engineers, but you don't have to be. I took a single line anti-pattern and managed to turn it into maybe a 20-line Rubocop to look for that anti-pattern.

00:19:02.240 If you've ever spent time grabbing for a vulnerability, for example, you're reading the news and you find this new vulnerability, or you're doing a code review and find something like, 'Oh hey, this code doesn't smell right,' chances are you can turn that into a Rubocop yourself.

00:19:20.480 The next step is to prioritize developer experience as early as you can and as much as you can. The developer experience is key because, ultimately, the developers are your customers, and they are responsible for the success of you and the company.

00:19:35.920 So prioritize this as much as possible. As maintainer of these tools, you should be empowered to be as opinionated as possible. You are the expert here.

00:19:39.840 You should feel very comfortable making decisions on behalf of your developers, your team, and your company. This saves you time; this saves everybody time. So wherever you can, embed your opinions; please do so.

00:19:51.120 I want to thank you all for coming; this is my contact information. As I said, we are hiring, so please feel free to come talk to me or anyone else in the Betterment t-shirt. You can also email me or click that link. Don't hesitate to flag me down at any point. Cool, that's about it.