00:00:00.900
Welcome to today's session.
00:00:12.540
Hi, my name is Kevin, and this is my first time speaking at RailsConf. I'm really excited to attend conferences in person again.
00:00:19.199
I think the RailsConf staff have done a fantastic job of making this happen while ensuring everyone stays safe.
00:00:26.100
A bit about myself: I work at Shopify, and I just had my one-year anniversary there. I still think it's a fun place to be.
00:00:33.540
We do really interesting work, and I'm sure you've heard we are hiring basically all the time.
00:00:38.940
Feel free to reach out if it's of interest to you. Additionally, we have a booth in the sponsor area where you can talk to someone.
00:00:44.399
In particular, I work in a group called the Ruby and Rails Infrastructure Team. This group is tasked with ensuring the longevity of the Ruby tools we use at Shopify.
00:00:51.239
Our work may encompass performance, correctness, or ease of use—we tackle everything.
00:00:56.820
We have people working on type systems, developer experience, and optimizing tools like Visual Studio Code for Ruby development.
00:01:04.320
We also focus on optimizing compilers for Ruby. There's quite a bit of focus here, and if you've used Ruby open-source code, there's a good chance you've used work that has come out of our group.
00:01:09.960
In particular, I work on a project called Truffle Ruby. Truffle Ruby is a completely new implementation of Ruby, emphasizing peak performance.
00:01:15.240
We aim to optimize Ruby as Rubists actually write it. In contrast, some alternate implementations require you to change the way you write code to optimize for the tool, similar to what asm.js did.
00:01:20.700
We look at things like meta-programming and aim to ensure that it runs just as quickly as if you called methods directly, instead of using method missing.
00:01:27.360
Our goal is to be 100% compatible with MRI. We even run native extensions and are still working towards achieving that goal.
00:01:32.820
To give you an idea of what we're targeting, here are two benchmarks that come out of a project called YGIT Bench. These are from the YGIT team and are aimed at assessing sub-components of Rails.
00:01:39.060
On the left, we have a benchmark using Active Record, and on the right, one using the default templating engine in Rails.
00:01:45.180
For Active Record, we perform about four and a half times faster than C Ruby 3.1 on this benchmark.
00:01:51.060
I chose this benchmark because I think it’s interesting to discuss Active Record. Many people believe performance is completely gated by I/O due to the database, and that is often true.
00:01:56.399
However, there is still a lot of optimization that can be achieved within the Ruby VM. We're not talking about a 10 or 15 percent incremental improvement here; these enhancements are more like 400 to 500 percent.
00:02:01.740
Now, moving on to the talk's title: 'Service Denied! Understanding How Regex DoS Attacks Work.' There's a lot to cover in this 30-minute slot, so I'll try to blitz through some details.
00:02:08.459
First, I want to establish the context, then explain what denial of service attacks are just in case you're not familiar, and then delve into regular expression denial of service (ReDoS) attacks.
00:02:14.940
To understand ReDoS attacks, we also need to cover a brief crash course in performance analysis, which includes Big O notation, and what a regular expression engine does.
00:02:21.360
I hope everything will come together in the end to form a cohesive idea.
00:02:28.319
In the context of 2021, we observed a rapid uptick in vulnerabilities related to regular expression denial of service attacks.
00:02:36.540
These attacks impacted C Ruby, Rails, and popular Ruby gems.
00:02:41.580
If there was a vulnerability update for some key technology you used in 2021, it's highly likely it was related to a regular expression denial of service attack.
00:02:48.720
I really do mean any key technology; this wasn’t limited to just Ruby.
00:02:54.739
Although Google Code issues aren’t the best proxy, if you look through the issues, you'll find they impacted languages with a regular expression engine like Ruby.
00:03:00.360
This includes numerous instances in JavaScript and even Perl if you're still using that.
00:03:06.420
My interest in this topic arose for unrelated reasons while working on regular expression performance in Truffle Ruby.
00:03:11.459
At Truffle Ruby, we noticed that while Ruby regular expressions offer a lot of functionality, many people only use a subset of that functionality.
00:03:18.060
We hypothesized that we could process those expressions faster than running them through the full regular expression engine.
00:03:24.120
Earlier in 2021, we adopted a new engine based on a Deterministic Finite Automaton (DFA), and I’ll explain what that means shortly.
00:03:30.780
In particular, I was analyzing performance on an open-source gem called Browser Sniffer.
00:03:36.180
This gem simplifies working with user agent strings so you can determine if your users are on mobile or desktop, and which browsers they are using.
00:03:43.380
I also looked into our primary web application at Shopify, and I discovered we did not use back references in our regular expressions.
00:03:48.900
Back references are considered an advanced feature. Meanwhile, I was also studying the C Ruby regular expression engine to ensure compatibility as we updated it.
00:03:54.479
We wanted to confirm whether the new Truffle Ruby engine exhibited any vulnerabilities, and in each case, it did not.
00:04:00.600
For example, when we ran a proof of concept with C Ruby, it took 22 seconds to complete, while it ran in just 22 milliseconds on Truffle Ruby.
00:04:06.480
This performance difference had nothing to do with how Truffle Ruby optimizes; it was simply due to the new type of engine, which is actually based on older concepts.
00:04:13.620
To understand what a regular expression denial of service attack is, we first need to know what a denial of service attack actually is.
00:04:20.459
This entire talk is essentially a long setup for understanding these attacks.
00:04:27.060
Denial of service attacks are among the easiest forms of computer security attacks.
00:04:33.780
In these cases, the objective isn’t to break into systems or escalate privileges; the intent is purely to cause significant disruption.
00:04:41.280
When we talk about denial of service attacks, particularly in relation to Rails, we see that we have dynamic web applications.
00:04:48.180
These applications usually respond to requests, and we presumably have some performance targets, like a response time of 100 to 150 milliseconds.
00:04:54.419
However, during a denial of service attack, those response times may elevate to tens of seconds, or the application may become so flooded with requests that it is unable to respond.
00:05:00.600
You can have denial of service caused by CPU wastage, or it can occur by overwhelming the server with an excessive volume of requests, leading to distributed denial of service (DDoS) attacks.
00:05:06.360
This context can be somewhat subjective; for example, we've all accidentally triggered an N+1 query in our applications.
00:05:12.120
If your database isn’t overly loaded or if there’s a limited amount of data, you might get quite far with that approach.
00:05:18.180
But then suddenly, a customer accesses the application with vastly more data, and performance issues arise.
00:05:24.120
In these cases, discerning whether the performance issue stems from malicious intent or just a coincidence can become quite challenging.
00:05:30.780
Concerning regular expression denial of service attacks, it is generally evident when they are malicious.
00:05:36.180
So, now that you understand what a denial of service attack is, you might be wondering what this ReDoS stuff is.
00:05:41.760
The first hit I found on Google defines a regular expression denial of service as an attack that exploits the fact that many regular expression implementations may experience extreme performance degradation.
00:05:47.880
This degradation usually relates exponentially to the input size. However, this definition can be dense, and we can look at Wikipedia's take on it.
00:05:55.680
Wikipedia describes ReDoS as an algorithmic complexity attack producing denial of service by providing a regular expression and/or an input that takes a long time to evaluate.
00:06:02.400
While that isn't terribly helpful, the next statement mentions that certain regex input pairs might exhibit super-linear worst-case complexity.
00:06:08.640
As a result, evaluation time can grow polynomially or exponentially relative to input size.
00:06:15.420
In order to fully grasp what ReDoS entails, we need to revisit performance.
00:06:22.620
When considering performance, benchmarks typically come to mind.
00:06:29.500
Benchmarks can be nice; when looking at charts, different colored bars correspond to different data, allowing quick visual comparisons.
00:06:36.399
However, the challenge with benchmarks is that they are notoriously difficult to replicate.
00:06:43.659
To even attempt replicating a benchmark, you'd need the original code that was executed and the inputs used.
00:06:50.640
Even then, benchmarks are highly tied to the system they run on; for instance, if the benchmark was heavily multithreaded on a 16-core machine, your results on a 4-core laptop may vary drastically.
00:06:57.240
Moreover, modern systems always have background processes running, making normalization a challenging task.
00:07:03.960
We often try working around this by running benchmarks repeatedly and conducting statistical analyses.
00:07:11.160
Even that can feel odd, and there’s an entire field dedicated to studying how to interpret benchmarks correctly.
00:07:17.940
The definitions of ReDoS examine performance through algorithmic complexity.
00:07:24.780
Algorithmic complexity itself can be an intimidating term; it comprises two words not commonly used together.
00:07:32.580
Yet, the essential concept is that instead of running code to measure how long it takes, we reason about it and count key operations.
00:07:39.540
For instance, if you're matching 10 keys to 10 doors, you can devise different strategies and calculate how many times you need to turn each key in order to pair them.
00:07:46.260
As you increase the number of doors, you can determine if the original approach still holds up.
00:07:52.800
Algorithmic complexity considers different contexts like optimal (best), worst, and average case scenarios.
00:07:59.700
The best case is when every key you try matches on the first turn, while in the worst case, you pick the wrong key every time—leading to quadratic time complexity.
00:08:07.620
These operations not only get counted, but they also correlate with the input size.
00:08:14.220
In regular expressions, the key input data is the size of the string you're trying to match against.
00:08:20.640
When discussing algorithmic complexity, you might encounter the term asymptotic complexity, which relates to Big O notation.
00:08:28.200
Big O notation allows us to express the efficiency of an algorithm.
00:08:35.640
Moving from top to bottom, you'll see functions increasing in time efficiency, where Big O notation maps performance.
00:08:42.240
You can also have algorithms that vary in their Big O complexity while maintaining similar functionality.
00:08:50.460
Essentially, we want to understand how the time complexity of algorithms varies with an increase in input size.
00:08:57.960
To visualize these complexities, here are some graphs that map performance scaling.
00:09:05.580
Different performance levels grow at different rates, and at the bottom, the scale changes drastically, indicating enhanced efficiency.
00:09:13.500
You will want to avoid exponential growth to ensure efficient processing.
00:09:20.940
As we examine algorithmic complexity, let's tie it back to a concrete ReDoS example.
00:09:27.840
This slide is for reference: we're looking at a regular expression consisting of n question marks followed by n A's.
00:09:34.740
This example is drawn from Russ Cox's article, 'Regular Expression Matching Can Be Simple and Fast,' which is widely regarded.
00:09:41.520
In this case, the question mark in the regex syntax matches zero or one times, followed by a series of A's, leading to a match.
00:09:47.940
As I increased the size of the string up to 36 A's, the processing time hit seven minutes.
00:09:54.480
A fun fact about regular expressions in Ruby is that you can't interrupt them—you're either waiting it out or killing the process.
00:10:01.980
For those familiar with logarithmic scales, I plotted the time to showcase graphical representation.
00:10:09.060
Having defined ReDoS, let's explore what regular expressions are.
00:10:16.260
In Ruby, they're used far more frequently than in most other languages due to its built-in literal syntax.
00:10:23.460
The Ruby core library allows for regular expressions in many applications.
00:10:28.740
In contrast, regular expressions are often unwieldy in languages like C or Java, where they’re utilized less frequently.
00:10:35.340
Regular expressions can be thought of as a domain-specific language for writing tiny programs.
00:10:42.240
When using regular expressions, you essentially obtain a little program that simply matches strings.
00:10:49.620
Input strings are matched against this program to yield either a match or not, and sometimes the outputs include captured values as well.
00:10:57.360
Another perspective is to view regular expressions as a set of commands for constructing a state machine that processes these strings.
00:11:04.020
The technical definition of regular expressions involves describing something termed regular languages.
00:11:12.120
Although I won't delve into that, if the topic interests you, I encourage further exploration.
00:11:19.200
Notably, Ruby and similar languages do not strictly adhere to regular languages; they have introduced extensions like back references.
00:11:26.520
This extra functionality contributes to the performance traps we're discussing.
00:11:33.720
Now, let's look at state machines—here's a representation from the 80s.
00:11:41.460
A state machine is an abstraction encoding the state of a system.
00:11:49.680
For example, a traffic light can only display one color at a time, represented by states in a state machine.
00:11:56.280
In the U.S., a green light transitions to yellow, yellow to red, and red back to green.
00:12:02.760
This system can run indefinitely as long as it has electrical power.
00:12:10.560
Some transitions may be conditional, and we’ll see that in our subsequent discussion on regular expression state machines.
00:12:18.480
Regular expression state machines come in two types—NFAs (Non-deterministic Finite Automata) and DFAs (Deterministic Finite Automata).
00:12:25.560
While it might seem like terminology overload, it's essential to comprehend these terms in context.
00:12:32.400
NFA stands for Non-deterministic Finite Automata, which describes state machines that allow multiple transitions for a given input character.
00:12:40.620
Conversely, a DFA allows transitions to only one state per given input, maintaining a singular transition approach to efficiency.
00:12:48.660
Using the same Russ Cox example with the pattern consisting of two question marks and two A's, we can visualize how state transitions occur in these machines.
00:12:54.540
For instance, with the NFA pattern, we could transition from start to three different states based on the first A here.
00:13:04.200
In contrast, with a DFA, we can only make one transition for each input character.
00:13:10.740
Thus, a DFA operates at linear speed since it processes characters once, unlike an NFA which may backtrack.
00:13:17.160
This addition of backtracking leads to exponential time performance issues.
00:13:22.740
Ultimately, we still observe logarithmic growth while the performance of Truffle Ruby remains linear, providing efficiency.
00:13:29.940
Despite segmented expressions being equally expressive, Ruby's reliance on NFA causes backtracking complications.
00:13:36.540
Consequently, in Truffle Ruby, if we have a pattern that does not match, the engine switches, improving efficiency.
00:13:43.260
Given that our Rails applications are dynamic, they often handle user-generated input.
00:13:50.160
Whether via form submission or API calls, allowing the wrong kind of input can trigger ReDoS attacks.
00:13:57.780
Thus, it’s critical never to allow user input to create a regular expression.
00:14:05.220
For those instances where you must validate input using regular expressions, ensure you're truncating inputs adequately.
00:14:11.040
While front-end validation can help, malicious users can bypass it, reiterating the necessity of back-end safeguarding.
00:14:18.840
This is what was implemented in some Ruby versions when patterns became problematic.
00:14:25.740
As a proactive measure, regular updates of Ruby and Rails address known vulnerabilities, particularly those posing ReDoS threats.
00:14:32.820
Due to open-source technology usage, attackers may have an upper hand as they can easily analyze your regular expressions.
00:14:39.720
Therefore, there’s ongoing work to monitor and detect which regex patterns may be exploitable.
00:14:46.800
Ruby 3.2, which isn’t released yet, introduces timeout limits for regex executions, either globally or on a case-by-case basis.
00:14:54.600
With an understanding of regular expression denial of service attacks and how they can spiral into prolonged processing, let's think like an attacker.
00:15:02.160
Identifying different ways to craft patterns that could origin a denial of service attack is critical.
00:15:08.880
Avoid nested quantifiers, and if possible, limit backtracking by controlling how many times elements can repeat.
00:15:15.960
If you're structuring groups, consider using Atomic grouping to minimize backtracking.
00:15:23.040
Be mindful of your matching pattern's potential to elicit exponential backtracking—they're sneaky and worth vigilance.
00:15:29.880
We made it through! This has been a substantive look into regex denial of service attacks.
00:15:36.180
For those familiar with algorithmic complexity, I hope this has provided additional insight and practical understanding.
00:15:43.680
Understanding performance without solely relying on benchmarking can help you address vulnerabilities more accurately.
00:15:50.760
I hope this presentation helps reduce some of the mystery surrounding regular expression engines and their vulnerabilities.
00:15:56.760
Finally, I've compiled a list of helpful resources on the topic that will be available in the slides after this talk.
00:16:02.760
For a more in-depth look at what regex engines do, check out the Russ Cox article.
00:16:09.360
My colleague Kevin Newton created a regular expression engine in Ruby, a project from our Shopify hack days.
00:16:15.060
You can also explore T-Rex, the DFA-based regular expression engine used for Truffle Ruby.
00:16:21.480
Thank you all for your time, and I look forward to any questions you may have.