RailsConf 2022

Service Denied! Understanding How Regex DoS Attacks Work

Service Denied! Understanding How Regex DoS Attacks Work

by Kevin Menard

In the video "Service Denied! Understanding How Regex DoS Attacks Work," Kevin Menard explores the nature of regular expression denial of service (DoS) attacks and their implications for Ruby applications. Using his experience at Shopify, he provides a comprehensive overview of the performance challenges associated with regex implementations, particularly in Rails. The talk underscores the rapid increase in regex-related vulnerabilities reported in 2021, impacting various technologies and multiple programming languages. Menard begins with a general introduction to denial of service attacks, differentiating regular expression attacks as especially tricky due to their resource-intensive nature.

Key points discussed include:

  • Context of Regex DoS Attacks: Significant uptick in vulnerabilities in 2021 affecting Ruby, Rails, and many Ruby gems.
  • Understanding Denial of Service (DoS): How attacks can lead to excessive strain on application performance due to resource depletion, thereby causing delays or outages.
  • Performance Analysis Overview: An introduction to Big O notation and algorithmic complexity, emphasizing the importance of recognizing how different algorithms perform under varying conditions.
  • Regular Expression Mechanisms: Explanation of non-deterministic finite automata (NFAs) versus deterministic finite automata (DFAs) and how these concepts relate to regex performance and vulnerabilities.
  • Examples of Regex Vulnerabilities: Using a specific regex pattern to demonstrate how certain inputs trigger exponential slowdown, highlighting the importance of regex management in dynamic applications.
  • Preventative Measures: Recommendations include avoiding dynamic regex construction from user inputs, employing string truncation, upgrading Ruby and Rails, and implementing timeout limits for regex operations to prevent excessive resource consumption.

Menard emphasizes the security risks posed by regex patterns and provides actionable advice to protect Rails applications from potential attacks. By enhancing our understanding of how regex engines work and the performance implications of different regex techniques, developers can better guard against the vulnerabilities that lead to denial of service incidents.

In conclusion, this talk provides critical insights into regex DoS attacks, underscoring the importance of performance awareness in coding practices and regular updates to security measures against vulnerabilities.

00:00:00.900 Welcome to today's session.
00:00:12.540 Hi, my name is Kevin, and this is my first time speaking at RailsConf. I'm really excited to attend conferences in person again.
00:00:19.199 I think the RailsConf staff have done a fantastic job of making this happen while ensuring everyone stays safe.
00:00:26.100 A bit about myself: I work at Shopify, and I just had my one-year anniversary there. I still think it's a fun place to be.
00:00:33.540 We do really interesting work, and I'm sure you've heard we are hiring basically all the time.
00:00:38.940 Feel free to reach out if it's of interest to you. Additionally, we have a booth in the sponsor area where you can talk to someone.
00:00:44.399 In particular, I work in a group called the Ruby and Rails Infrastructure Team. This group is tasked with ensuring the longevity of the Ruby tools we use at Shopify.
00:00:51.239 Our work may encompass performance, correctness, or ease of use—we tackle everything.
00:00:56.820 We have people working on type systems, developer experience, and optimizing tools like Visual Studio Code for Ruby development.
00:01:04.320 We also focus on optimizing compilers for Ruby. There's quite a bit of focus here, and if you've used Ruby open-source code, there's a good chance you've used work that has come out of our group.
00:01:09.960 In particular, I work on a project called Truffle Ruby. Truffle Ruby is a completely new implementation of Ruby, emphasizing peak performance.
00:01:15.240 We aim to optimize Ruby as Rubists actually write it. In contrast, some alternate implementations require you to change the way you write code to optimize for the tool, similar to what asm.js did.
00:01:20.700 We look at things like meta-programming and aim to ensure that it runs just as quickly as if you called methods directly, instead of using method missing.
00:01:27.360 Our goal is to be 100% compatible with MRI. We even run native extensions and are still working towards achieving that goal.
00:01:32.820 To give you an idea of what we're targeting, here are two benchmarks that come out of a project called YGIT Bench. These are from the YGIT team and are aimed at assessing sub-components of Rails.
00:01:39.060 On the left, we have a benchmark using Active Record, and on the right, one using the default templating engine in Rails.
00:01:45.180 For Active Record, we perform about four and a half times faster than C Ruby 3.1 on this benchmark.
00:01:51.060 I chose this benchmark because I think it’s interesting to discuss Active Record. Many people believe performance is completely gated by I/O due to the database, and that is often true.
00:01:56.399 However, there is still a lot of optimization that can be achieved within the Ruby VM. We're not talking about a 10 or 15 percent incremental improvement here; these enhancements are more like 400 to 500 percent.
00:02:01.740 Now, moving on to the talk's title: 'Service Denied! Understanding How Regex DoS Attacks Work.' There's a lot to cover in this 30-minute slot, so I'll try to blitz through some details.
00:02:08.459 First, I want to establish the context, then explain what denial of service attacks are just in case you're not familiar, and then delve into regular expression denial of service (ReDoS) attacks.
00:02:14.940 To understand ReDoS attacks, we also need to cover a brief crash course in performance analysis, which includes Big O notation, and what a regular expression engine does.
00:02:21.360 I hope everything will come together in the end to form a cohesive idea.
00:02:28.319 In the context of 2021, we observed a rapid uptick in vulnerabilities related to regular expression denial of service attacks.
00:02:36.540 These attacks impacted C Ruby, Rails, and popular Ruby gems.
00:02:41.580 If there was a vulnerability update for some key technology you used in 2021, it's highly likely it was related to a regular expression denial of service attack.
00:02:48.720 I really do mean any key technology; this wasn’t limited to just Ruby.
00:02:54.739 Although Google Code issues aren’t the best proxy, if you look through the issues, you'll find they impacted languages with a regular expression engine like Ruby.
00:03:00.360 This includes numerous instances in JavaScript and even Perl if you're still using that.
00:03:06.420 My interest in this topic arose for unrelated reasons while working on regular expression performance in Truffle Ruby.
00:03:11.459 At Truffle Ruby, we noticed that while Ruby regular expressions offer a lot of functionality, many people only use a subset of that functionality.
00:03:18.060 We hypothesized that we could process those expressions faster than running them through the full regular expression engine.
00:03:24.120 Earlier in 2021, we adopted a new engine based on a Deterministic Finite Automaton (DFA), and I’ll explain what that means shortly.
00:03:30.780 In particular, I was analyzing performance on an open-source gem called Browser Sniffer.
00:03:36.180 This gem simplifies working with user agent strings so you can determine if your users are on mobile or desktop, and which browsers they are using.
00:03:43.380 I also looked into our primary web application at Shopify, and I discovered we did not use back references in our regular expressions.
00:03:48.900 Back references are considered an advanced feature. Meanwhile, I was also studying the C Ruby regular expression engine to ensure compatibility as we updated it.
00:03:54.479 We wanted to confirm whether the new Truffle Ruby engine exhibited any vulnerabilities, and in each case, it did not.
00:04:00.600 For example, when we ran a proof of concept with C Ruby, it took 22 seconds to complete, while it ran in just 22 milliseconds on Truffle Ruby.
00:04:06.480 This performance difference had nothing to do with how Truffle Ruby optimizes; it was simply due to the new type of engine, which is actually based on older concepts.
00:04:13.620 To understand what a regular expression denial of service attack is, we first need to know what a denial of service attack actually is.
00:04:20.459 This entire talk is essentially a long setup for understanding these attacks.
00:04:27.060 Denial of service attacks are among the easiest forms of computer security attacks.
00:04:33.780 In these cases, the objective isn’t to break into systems or escalate privileges; the intent is purely to cause significant disruption.
00:04:41.280 When we talk about denial of service attacks, particularly in relation to Rails, we see that we have dynamic web applications.
00:04:48.180 These applications usually respond to requests, and we presumably have some performance targets, like a response time of 100 to 150 milliseconds.
00:04:54.419 However, during a denial of service attack, those response times may elevate to tens of seconds, or the application may become so flooded with requests that it is unable to respond.
00:05:00.600 You can have denial of service caused by CPU wastage, or it can occur by overwhelming the server with an excessive volume of requests, leading to distributed denial of service (DDoS) attacks.
00:05:06.360 This context can be somewhat subjective; for example, we've all accidentally triggered an N+1 query in our applications.
00:05:12.120 If your database isn’t overly loaded or if there’s a limited amount of data, you might get quite far with that approach.
00:05:18.180 But then suddenly, a customer accesses the application with vastly more data, and performance issues arise.
00:05:24.120 In these cases, discerning whether the performance issue stems from malicious intent or just a coincidence can become quite challenging.
00:05:30.780 Concerning regular expression denial of service attacks, it is generally evident when they are malicious.
00:05:36.180 So, now that you understand what a denial of service attack is, you might be wondering what this ReDoS stuff is.
00:05:41.760 The first hit I found on Google defines a regular expression denial of service as an attack that exploits the fact that many regular expression implementations may experience extreme performance degradation.
00:05:47.880 This degradation usually relates exponentially to the input size. However, this definition can be dense, and we can look at Wikipedia's take on it.
00:05:55.680 Wikipedia describes ReDoS as an algorithmic complexity attack producing denial of service by providing a regular expression and/or an input that takes a long time to evaluate.
00:06:02.400 While that isn't terribly helpful, the next statement mentions that certain regex input pairs might exhibit super-linear worst-case complexity.
00:06:08.640 As a result, evaluation time can grow polynomially or exponentially relative to input size.
00:06:15.420 In order to fully grasp what ReDoS entails, we need to revisit performance.
00:06:22.620 When considering performance, benchmarks typically come to mind.
00:06:29.500 Benchmarks can be nice; when looking at charts, different colored bars correspond to different data, allowing quick visual comparisons.
00:06:36.399 However, the challenge with benchmarks is that they are notoriously difficult to replicate.
00:06:43.659 To even attempt replicating a benchmark, you'd need the original code that was executed and the inputs used.
00:06:50.640 Even then, benchmarks are highly tied to the system they run on; for instance, if the benchmark was heavily multithreaded on a 16-core machine, your results on a 4-core laptop may vary drastically.
00:06:57.240 Moreover, modern systems always have background processes running, making normalization a challenging task.
00:07:03.960 We often try working around this by running benchmarks repeatedly and conducting statistical analyses.
00:07:11.160 Even that can feel odd, and there’s an entire field dedicated to studying how to interpret benchmarks correctly.
00:07:17.940 The definitions of ReDoS examine performance through algorithmic complexity.
00:07:24.780 Algorithmic complexity itself can be an intimidating term; it comprises two words not commonly used together.
00:07:32.580 Yet, the essential concept is that instead of running code to measure how long it takes, we reason about it and count key operations.
00:07:39.540 For instance, if you're matching 10 keys to 10 doors, you can devise different strategies and calculate how many times you need to turn each key in order to pair them.
00:07:46.260 As you increase the number of doors, you can determine if the original approach still holds up.
00:07:52.800 Algorithmic complexity considers different contexts like optimal (best), worst, and average case scenarios.
00:07:59.700 The best case is when every key you try matches on the first turn, while in the worst case, you pick the wrong key every time—leading to quadratic time complexity.
00:08:07.620 These operations not only get counted, but they also correlate with the input size.
00:08:14.220 In regular expressions, the key input data is the size of the string you're trying to match against.
00:08:20.640 When discussing algorithmic complexity, you might encounter the term asymptotic complexity, which relates to Big O notation.
00:08:28.200 Big O notation allows us to express the efficiency of an algorithm.
00:08:35.640 Moving from top to bottom, you'll see functions increasing in time efficiency, where Big O notation maps performance.
00:08:42.240 You can also have algorithms that vary in their Big O complexity while maintaining similar functionality.
00:08:50.460 Essentially, we want to understand how the time complexity of algorithms varies with an increase in input size.
00:08:57.960 To visualize these complexities, here are some graphs that map performance scaling.
00:09:05.580 Different performance levels grow at different rates, and at the bottom, the scale changes drastically, indicating enhanced efficiency.
00:09:13.500 You will want to avoid exponential growth to ensure efficient processing.
00:09:20.940 As we examine algorithmic complexity, let's tie it back to a concrete ReDoS example.
00:09:27.840 This slide is for reference: we're looking at a regular expression consisting of n question marks followed by n A's.
00:09:34.740 This example is drawn from Russ Cox's article, 'Regular Expression Matching Can Be Simple and Fast,' which is widely regarded.
00:09:41.520 In this case, the question mark in the regex syntax matches zero or one times, followed by a series of A's, leading to a match.
00:09:47.940 As I increased the size of the string up to 36 A's, the processing time hit seven minutes.
00:09:54.480 A fun fact about regular expressions in Ruby is that you can't interrupt them—you're either waiting it out or killing the process.
00:10:01.980 For those familiar with logarithmic scales, I plotted the time to showcase graphical representation.
00:10:09.060 Having defined ReDoS, let's explore what regular expressions are.
00:10:16.260 In Ruby, they're used far more frequently than in most other languages due to its built-in literal syntax.
00:10:23.460 The Ruby core library allows for regular expressions in many applications.
00:10:28.740 In contrast, regular expressions are often unwieldy in languages like C or Java, where they’re utilized less frequently.
00:10:35.340 Regular expressions can be thought of as a domain-specific language for writing tiny programs.
00:10:42.240 When using regular expressions, you essentially obtain a little program that simply matches strings.
00:10:49.620 Input strings are matched against this program to yield either a match or not, and sometimes the outputs include captured values as well.
00:10:57.360 Another perspective is to view regular expressions as a set of commands for constructing a state machine that processes these strings.
00:11:04.020 The technical definition of regular expressions involves describing something termed regular languages.
00:11:12.120 Although I won't delve into that, if the topic interests you, I encourage further exploration.
00:11:19.200 Notably, Ruby and similar languages do not strictly adhere to regular languages; they have introduced extensions like back references.
00:11:26.520 This extra functionality contributes to the performance traps we're discussing.
00:11:33.720 Now, let's look at state machines—here's a representation from the 80s.
00:11:41.460 A state machine is an abstraction encoding the state of a system.
00:11:49.680 For example, a traffic light can only display one color at a time, represented by states in a state machine.
00:11:56.280 In the U.S., a green light transitions to yellow, yellow to red, and red back to green.
00:12:02.760 This system can run indefinitely as long as it has electrical power.
00:12:10.560 Some transitions may be conditional, and we’ll see that in our subsequent discussion on regular expression state machines.
00:12:18.480 Regular expression state machines come in two types—NFAs (Non-deterministic Finite Automata) and DFAs (Deterministic Finite Automata).
00:12:25.560 While it might seem like terminology overload, it's essential to comprehend these terms in context.
00:12:32.400 NFA stands for Non-deterministic Finite Automata, which describes state machines that allow multiple transitions for a given input character.
00:12:40.620 Conversely, a DFA allows transitions to only one state per given input, maintaining a singular transition approach to efficiency.
00:12:48.660 Using the same Russ Cox example with the pattern consisting of two question marks and two A's, we can visualize how state transitions occur in these machines.
00:12:54.540 For instance, with the NFA pattern, we could transition from start to three different states based on the first A here.
00:13:04.200 In contrast, with a DFA, we can only make one transition for each input character.
00:13:10.740 Thus, a DFA operates at linear speed since it processes characters once, unlike an NFA which may backtrack.
00:13:17.160 This addition of backtracking leads to exponential time performance issues.
00:13:22.740 Ultimately, we still observe logarithmic growth while the performance of Truffle Ruby remains linear, providing efficiency.
00:13:29.940 Despite segmented expressions being equally expressive, Ruby's reliance on NFA causes backtracking complications.
00:13:36.540 Consequently, in Truffle Ruby, if we have a pattern that does not match, the engine switches, improving efficiency.
00:13:43.260 Given that our Rails applications are dynamic, they often handle user-generated input.
00:13:50.160 Whether via form submission or API calls, allowing the wrong kind of input can trigger ReDoS attacks.
00:13:57.780 Thus, it’s critical never to allow user input to create a regular expression.
00:14:05.220 For those instances where you must validate input using regular expressions, ensure you're truncating inputs adequately.
00:14:11.040 While front-end validation can help, malicious users can bypass it, reiterating the necessity of back-end safeguarding.
00:14:18.840 This is what was implemented in some Ruby versions when patterns became problematic.
00:14:25.740 As a proactive measure, regular updates of Ruby and Rails address known vulnerabilities, particularly those posing ReDoS threats.
00:14:32.820 Due to open-source technology usage, attackers may have an upper hand as they can easily analyze your regular expressions.
00:14:39.720 Therefore, there’s ongoing work to monitor and detect which regex patterns may be exploitable.
00:14:46.800 Ruby 3.2, which isn’t released yet, introduces timeout limits for regex executions, either globally or on a case-by-case basis.
00:14:54.600 With an understanding of regular expression denial of service attacks and how they can spiral into prolonged processing, let's think like an attacker.
00:15:02.160 Identifying different ways to craft patterns that could origin a denial of service attack is critical.
00:15:08.880 Avoid nested quantifiers, and if possible, limit backtracking by controlling how many times elements can repeat.
00:15:15.960 If you're structuring groups, consider using Atomic grouping to minimize backtracking.
00:15:23.040 Be mindful of your matching pattern's potential to elicit exponential backtracking—they're sneaky and worth vigilance.
00:15:29.880 We made it through! This has been a substantive look into regex denial of service attacks.
00:15:36.180 For those familiar with algorithmic complexity, I hope this has provided additional insight and practical understanding.
00:15:43.680 Understanding performance without solely relying on benchmarking can help you address vulnerabilities more accurately.
00:15:50.760 I hope this presentation helps reduce some of the mystery surrounding regular expression engines and their vulnerabilities.
00:15:56.760 Finally, I've compiled a list of helpful resources on the topic that will be available in the slides after this talk.
00:16:02.760 For a more in-depth look at what regex engines do, check out the Russ Cox article.
00:16:09.360 My colleague Kevin Newton created a regular expression engine in Ruby, a project from our Shopify hack days.
00:16:15.060 You can also explore T-Rex, the DFA-based regular expression engine used for Truffle Ruby.
00:16:21.480 Thank you all for your time, and I look forward to any questions you may have.