00:00:00.900
Hello, everyone! I think this room must be the unofficial testing track.
00:00:03.800
Audrey gave a great talk prior to this session about TDD as a treasure map. For those of you who didn't catch it, I recommend checking it out online once the talks are available. Today, we're going to be discussing why your test suite is making too many database calls.
00:00:15.839
The key problem we're trying to solve is slow tests. Slow tests are a problem because longer feedback loops while writing code slow down our development process as we wait for tests to run. Longer feedback loops also slow down deployment. If you're dependent on CI, that could mean 30 minutes to 60 minutes of waiting for a feature to go out or, even worse, a bug fix. During that time, the site is down and your customers aren't happy.
00:00:38.160
Overall, you get lower value from your tests because they are not running as frequently, and you may not feel as confident in their results. Three things that are particularly expensive in tests are HTTP requests, anything involving a headless browser, and database queries. If you've spent some time in the Rails testing community, you’ve probably heard some familiar pieces of performance advice.
00:00:58.739
Firstly, disable real network calls. This can be done using tools like WebMock or VCR. The idea is that the best way to prevent slow tests from HTTP requests is to not make them at all. Secondly, you may often hear recommendations to organize your tests using the testing pyramid. In this architecture, you have a few expensive end-to-end tests that use costly headless browser requests, a medium number of integration tests that test subsystems, and a lot of cheap unit tests that validate individual objects.
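As a sketch, disabling real network calls in RSpec might look like this. It assumes the webmock gem is in your Gemfile's test group; VCR layers recorded cassettes on top of the same hook, so the two tools are often used together.

```ruby
# spec/spec_helper.rb -- a minimal sketch of blocking real HTTP.
require "webmock/rspec"

# Any real HTTP request now raises, so slow, flaky network calls
# can never sneak into the suite. Localhost stays allowed so
# driver-based tools (e.g. Capybara) keep working.
WebMock.disable_net_connect!(allow_localhost: true)
```

With this in place, a test that accidentally hits a third-party API fails loudly instead of silently adding seconds to your run.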
00:01:26.420
Finally, avoid persisting data when you don't need it. For example, if you have a test that verifies the full name method on a user will concatenate the first and last name, you don’t need to go to the database for that. Instead of creating a user in the database, you might just create one in memory and assert its properties that way. This is common advice, and if you're not already doing this, you probably want to consider it to start seeing performance benefits.
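In plain Ruby, the idea looks something like this. This is a hypothetical stand-in for an Active Record User model; in a real Rails app you would reach for `User.new` or FactoryBot's `build_stubbed` instead of `create`.

```ruby
# A hypothetical plain-Ruby stand-in for a User model.
class User
  attr_reader :first_name, :last_name

  def initialize(first_name:, last_name:)
    @first_name = first_name
    @last_name = last_name
  end

  # The behavior under test: pure string concatenation,
  # no database required.
  def full_name
    "#{first_name} #{last_name}"
  end
end

# Build the object in memory instead of persisting it.
user = User.new(first_name: "Jane", last_name: "Doe")
puts user.full_name # => "Jane Doe"
```

Because `full_name` never touches persistence, asserting on the in-memory object gives exactly the same coverage with zero queries.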
00:01:57.380
However, today, I want to go deeper into the problem of database queries in your tests. Specifically, we’ll look at the issue of setting up more data than you actually need, which leads to making unnecessary inserts or updates to the database.
00:02:09.840
It's worth noting that queries aren't inherently expensive, so if you have one extra query in your test suite, it's generally fine, and you may not notice it. The problem arises when you have tens of thousands of accumulated queries over the course of a whole suite. That’s when you start to see a significant difference in performance.
00:02:44.760
The problem can look something like this. Here, I have an object graph. This differs slightly from the class diagrams you might be familiar with, which show how different classes interact. An object graph illustrates each individual instance within a particular system. If there are two instances of the same class, both will appear here. This is important for the problem we are discussing today, as we want to identify when more than one instance is being written to the database.
00:03:09.780
In this example, we have a unit test for a single object in the system under test. However, we inadvertently create a couple of collaborators and even a secondary collaborator. On the edge, there’s a random extra object with another random collaborator. If all of these result in writes to the database, we end up with six writes when we only needed one. Multiply that across an entire test suite, and you start to see the impact.
00:03:35.640
You might say, 'I don't do unit tests; testing in isolation with mocks seems brittle. I only use integration tests, so I don't need to worry about this problem.' Unfortunately, I have bad news for you. You'll face the same issue but on a larger scale. Now you're testing an entire subsystem or even the whole system using end-to-end tests, leading to the same problem of unnecessary collaborators and additional records being created in the database.
00:04:03.960
So how did this happen? Nobody sets out to create unnecessary database load during their test runs. Today, we're going to delve into three locations in your code base where we tend to accidentally create more data than intended: in your test code, support code, and within the actual code being tested. Let’s start by discussing ways in which your test code can lead to excessive database calls.
00:04:34.200
A common culprit here is shared test setup. When you have test one and test two that require similar but perhaps not identical data setups, you might think you're being efficient by extracting shared setups into a series of lets or a before block. However, this often leads to creating all the data necessary for both tests every time they run, effectively creating more data than you originally intended.
00:05:05.640
Here's an example. We have a test at the top that only requires an organization and a test at the bottom that requires an organization with two users. Due to the way our lets are structured, when we run the test that only needs the organization, it will also create two users. As a result, we end up with three queries executed instead of just one.
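A sketch of that shape, with hypothetical organization and user factories:

```ruby
describe Organization do
  # Shared setup: every test in this group pays for all of it.
  let(:organization) { create(:organization) }
  let!(:users) { create_list(:user, 2, organization: organization) }

  it "starts inactive" do
    # Only needs the organization, but the let! above has
    # already inserted two users as well: 3 queries, not 1.
    expect(organization).not_to be_active
  end

  it "counts its members" do
    # This test genuinely needs all three records.
    expect(organization.users.count).to eq(2)
  end
end
```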
00:05:28.320
In an object graph, this might look like this: I’ve shaded in orange the data we actually need for our test, which is the organization, and then we inadvertently created those extra users that we didn’t want. While two additional objects isn't catastrophic, the pattern doesn't scale: as the complexity of your tests increases, the waste compounds considerably.
00:05:48.480
Consider a more complex scenario with four tests where each test requires two records. Although there’s some sharing between the tests, none of them utilize the same two records, and across all of them, there are only six unique records. We end up extracting all of them into shared setup, leading to the creation of all six records for every single test. That results in 24 insert queries where only eight were truly necessary.
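As a quick sanity check on that arithmetic, here is the count in plain Ruby; the record names are hypothetical placeholders.

```ruby
# Four tests, each needing a different pair drawn from six unique records.
tests = [%w[a b], %w[c d], %w[a e], %w[d f]]
unique_records = tests.flatten.uniq # => 6 unique records

# Shared setup: every test pays for every record.
shared_inserts = tests.length * unique_records.length # => 24

# Inline setup: each test pays only for its own pair.
inline_inserts = tests.sum { |records| records.length } # => 8

puts "shared: #{shared_inserts}, inline: #{inline_inserts}"
```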
00:06:15.840
It gets worse from there. Shared test setups can also lead us to create data that is not only superfluous but sometimes of the wrong type for our tests. This might happen when you want to create an average case that could work for most tests, but not all tests actually require that data. For example, in one test, we may need an admin user with a contact, while in another, we only need a regular user without needing any contact.
00:06:37.020
In our setup, the test requiring a regular user has to create a regular user, an admin user, and a contact, which means it has to undo that admin flag and set it back to false later—resulting in two create queries and one update, even though we only needed one create. Anytime you see an update in a test setup, it’s a significant code smell. It typically indicates something went wrong with the setup, leading to dependencies that aren't correctly shaped.
00:07:07.680
The recommended solution is to move your setup inline. While this may seem more verbose, it allows each test to only receive the data it actually needs. Now, those of you well-versed in RSpec might be thinking that using let does not create all that extra data, thus negating the need for inline setup. However, in practice, lets might become entangled, causing all the data to be initialized.
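The same two tests from the earlier shape, rewritten with inline setup (hypothetical factories again):

```ruby
describe Organization do
  it "starts inactive" do
    # Inline setup: this test creates only what it needs.
    organization = create(:organization)

    expect(organization).not_to be_active
  end

  it "counts its members" do
    # The extra records live only in the test that uses them.
    organization = create(:organization)
    create_list(:user, 2, organization: organization)

    expect(organization.users.count).to eq(2)
  end
end
```

It reads slightly longer, but each test now documents its own dependencies, and a reader never has to scroll up to a `before` block to understand what exists.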
00:07:38.820
It is indeed possible to write your lets in such a way that they remain decoupled, but achieving that requires significant discipline from your team, which is often difficult to maintain. Hence, my recommendation is to do your setup inline, preventing a whole class of errors and ensuring better scalability as your team grows.
00:08:07.920
Now that we've explored how to avoid making these mistakes in your test code, let's discuss how to identify existing issues within your code base. This is challenging with tests since each small test does not generate significant amounts of extra data; it’s the aggregate behavior that matters. Thus, it may require a gradual approach to refactoring your code base, rather than trying to fix all tests at once.
00:08:37.860
Rather than trying to fix every test, focus on finding the tests that are slow and then profile them. This targets your effort where changing the setup yields the greatest benefit. Some helpful tools include the Rails logs, which record every query to your test log while tests execute. You can run a slow test while tailing the test log and observe all the queries being logged in real time.
00:09:07.800
You might be surprised to discover that a seemingly slow test is performing far more queries than expected, or conversely, that a test you assumed was query-heavy executes fewer than presumed. The sql_tracker gem can also run within a block around specific test executions, providing valuable insight into how many queries are executed.
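If you would rather not add a gem, a minimal sketch of counting queries around a block uses Rails' built-in `sql.active_record` notifications. The payload names filtered here vary slightly by Rails version, so treat this as a starting point.

```ruby
# Count the SQL statements executed inside a block.
def count_queries
  count = 0
  counter = lambda do |_name, _start, _finish, _id, payload|
    # Skip schema reflection and transaction bookkeeping.
    count += 1 unless %w[SCHEMA TRANSACTION].include?(payload[:name])
  end

  ActiveSupport::Notifications.subscribed(counter, "sql.active_record") do
    yield
  end
  count
end

# Usage inside a slow test:
#   queries = count_queries { create(:organization) }
#   puts "executed #{queries} queries"
```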
00:09:43.920
Now let's delve into ways your support code might lead to excessive database calls. A major offender in this area is factories that perform too much work. For instance, consider an organization factory that generates a list of three members for every organization it creates. This can pose challenges, as you may think a simple creation of an organization only generates one insert when, in reality, it triggers four due to the additional users.
00:10:10.020
The compounding effect is even worse, as expensive factories tend to create additional records unexpectedly. For example, a service object test that only attempts to initialize a user might inadvertently call an organization factory, leading to the creation of three more users without your knowledge.
00:10:31.020
To mitigate this, your base factory should be minimal. By minimal, I mean absolutely essential—if any line in your factory can be removed without causing issues with Rails validations or database constraints, that line should go. The only attributes present should be those necessary to successfully create the factory without errors.
00:10:57.780
For example, in a minimal organization factory, you would only ensure the presence of the name attribute required for validation. It's also beneficial to extend your base factory using sub-factories or traits, which allows you to include pre-packaged sets of attributes without cluttering your base structure or creating unwanted records.
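A sketch of a minimal base factory with an opt-in trait; the attribute names are illustrative.

```ruby
FactoryBot.define do
  factory :organization do
    # Minimal: only what validations and constraints require.
    name { "Acme" }

    # Opt in to extra data explicitly where a test needs it.
    trait :with_members do
      after(:create) do |organization|
        create_list(:user, 3, organization: organization)
      end
    end
  end
end

# create(:organization)                => 1 insert
# create(:organization, :with_members) => 4 inserts, but only on request
```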
00:11:23.880
So how might you find existing misuses within your factories? Factory definitions are a single point of change that can have a significant impact, but identifying which factories need modification can be tricky due to the sheer number of them. One useful technique is to leverage FactoryBot’s ActiveSupport notification event subscriptions.
00:11:56.220
By logging the invocations and durations of factories, you can discover which factories are consistently slow. Additionally, the test-prof gem comes with a factory profiling tool that can identify the slowest factories both individually and in aggregate, helping you pinpoint hotspots to scrutinize.
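A sketch of that subscription, using FactoryBot's `factory_bot.run_factory` event. The formatting and the top-ten cutoff are my own choices, not part of the gem.

```ruby
# spec/support/factory_profiling.rb -- tally time spent per factory.
timings = Hash.new { |hash, key| hash[key] = { count: 0, total: 0.0 } }

ActiveSupport::Notifications.subscribe("factory_bot.run_factory") do |_name, start, finish, _id, payload|
  entry = timings[payload[:name]]
  entry[:count] += 1
  entry[:total] += (finish - start)
end

# After the suite finishes, print the slowest factories in aggregate.
at_exit do
  timings.sort_by { |_, stats| -stats[:total] }.first(10).each do |name, stats|
    puts format("%-30s %5d runs %8.2fs total", name, stats[:count], stats[:total])
  end
end
```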
00:12:17.640
If you come across a slow factory but can’t figure out why, a great technique is to load the Rails console in test mode. Set the log level to debug, then create the factory in the console to observe all the log messages generated, which will reveal the queries executed when the factory is created.
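That workflow might look like this, assuming FactoryBot is loaded in the console:

```ruby
# Start a console against the test database:
#   RAILS_ENV=test bin/rails console

# Send Active Record's queries to stdout at debug level...
ActiveRecord::Base.logger = Logger.new($stdout)
ActiveRecord::Base.logger.level = Logger::DEBUG

# ...then create the suspect factory and watch the queries scroll by.
FactoryBot.create(:organization)
```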
00:12:42.180
Often, the number of queries generated can be quite surprising. Now let's transition to discussing how your source code may also be leading to excessive database calls. Active Record callbacks can complicate things, particularly if they create additional records without your intention.
00:13:07.260
For example, imagine an organization is set up to generate a new admin user on a before_create callback. If you supply your own admin user when creating the organization, you may wind up with two admins—the one created by the callback and the one you intended to create—causing potential data integrity issues.
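A sketch of the kind of model that causes this, with a hypothetical Organization and a default-admin callback:

```ruby
class Organization < ApplicationRecord
  has_many :users

  # Persists an extra record as a side effect of saving --
  # easy to forget, and easy to double up when a test
  # supplies its own admin.
  before_create :build_default_admin

  private

  def build_default_admin
    # Saved automatically along with the organization.
    users.build(admin: true)
  end
end
```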
00:13:35.520
Callbacks that persist data unexpectedly can lead to tests that pass while subtle failures emerge later. It’s common to work around this in tests by creating a local admin and then updating records so that your test setup overrides the callback’s output, but doing so is a classic code smell. It suggests an underlying problem within your implementation.
00:14:06.360
This situation can lead to unnecessary inserts and updates, resulting in a test setup that makes three inserts and one update when you only really needed to create two inserts. Finding existing misuses creating excessive data in callbacks can be particularly challenging due to their tight coupling with the surrounding code base.
00:14:38.760
However, when pain arises, it can signify hotspots in your code that need investigation. Many developers are aware of the pitfalls associated with certain callbacks, and addressing these issues will often lead to a significant impact across your code base.
00:15:01.080
In TDD, the underlying principle is that your tests should guide you towards necessary changes. If you encounter unit tests that require an excessive amount of setup, that may be a sign your objects are too coupled with others around them. Refactoring the architecture may help yield tests that require far less data to execute successfully.
00:15:27.060
Our final concern is recognizing the places where we inadvertently create more data than we intend. Let's walk through a case study where we analyzed a codebase to improve test performance by eliminating extraneous creates.
00:15:58.560
On one project, I discovered a very slow test suite and had a hunch that some factories were generating more data than we expected. This project had numerous factories, making it unclear where the problematic instances were. Initially, I used ActiveSupport notifications alongside the factory_bot.run_factory event I mentioned earlier to build simple profiling that measured both the time taken by each factory and how often it was invoked.
00:16:29.640
I subsequently visualized this data in a graph, with the vertical axis indicating factory invocation frequency and the horizontal axis representing the duration of individual executions. The size and color of the bubbles denoted the aggregate impact each factory had on the test suite.
00:16:52.920
One factory stood out toward the lower right corner: it was called infrequently, but each invocation took approximately four seconds just for its setup, giving it a significant aggregate impact. I tested this in the console and confirmed that it took four seconds to create this factory, an excessive duration for what should be a straightforward operation.
00:17:10.320
I executed the factory in debug mode within the console, allowing me to monitor the log output generated as objects were created. The logs revealed an overwhelming amount of data being created, indicating beyond reasonable doubt that something unusual was occurring.
00:17:30.120
To understand the relationships within the schema, I referred back to the concept of the object graph. My simplified version of what was happening illustrated that I was crafting an organization with various associated entities, indicating a recurring and convoluted pattern.
00:17:51.720
This was concerning, so I turned to my active record models and created a class diagram to clarify the intended data model. The diagram illustrated an organization managing a venue, staff, and events, which are run at that venue—all of which share the same organization.
00:18:11.940
There is an implicit assumption that an event must belong to the same organization as both its venue and its staff. If this assumption is ever broken, it can lead to unanticipated results. I then investigated the factory responsible for creating these relationships.
00:18:34.680
The factory initially created an organization along with its associated elements: a venue, staff, and several associated events. While the events were linked to the organization through the venue, this connection did not extend to the staff. Consequently, the event factory would generate a new staff member, whose factory in turn created further records, cascading into a deep web of unnecessary nested associations.
00:19:00.540
The solution was simple—ensuring both the venue and the staff were associated through the same organization. After implementing this change, the object graph reflected only the relevant data our tests required, effectively eliminating any unnecessary creations.
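A sketch of the shape of that fix, assuming FactoryBot's inline association syntax and the factory names from the case study; the exact attributes are illustrative.

```ruby
FactoryBot.define do
  factory :event do
    # Build one organization and thread it through both
    # associations, so the implicit invariant (everything
    # shares the same organization) holds by construction.
    organization
    venue { association :venue, organization: organization }
    staff { association :staff, organization: organization }
  end
end
```

Before the fix, venue and staff each implicitly built their own organization, which is what set off the cascade of unrelated records.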
00:19:28.320
Once the implementation was complete, we executed the test suite to measure if our changes had a measurable impact. It is essential to establish baseline metrics before committing any performance enhancements. After running the tests, we discovered a remarkable 15% increase in speed—evidence that minimizing extraneous data creates meaningful performance gains.
00:19:58.380
We learned several debugging techniques during this process. We utilized log outputs, profiling tools, and object diagrams to gain insight into the underlying issues. The root of the problem resided within our factories, and a single-line modification yielded significant improvements, demonstrating how factories can unexpectedly lead to excess data creation.
00:20:21.480
In conclusion, we discussed various scenarios where we may unintentionally generate more data than anticipated. Furthermore, we reviewed methods to keep that data creation minimal, such as keeping your test setups local, maintaining minimal factory bases, avoiding unnecessary data creation within Active Record callbacks, and minimizing tight coupling within your code.
00:20:51.720
This advice holds true for new code or a greenfield application. However, many scenarios require us to identify existing inefficiencies. Tools like diagramming, particularly object diagrams, and studying the logs can help visualize the unintentional data being created across different code segments.
00:21:18.660
Lastly, profiling tools can help pinpoint the hotspots deserving of your focus, further enhancing your debugging. Thank you for attending this talk. My name is Joël Quenneville, and you can find me on Twitter at [insert handle]. I work at Thoughtbot, and we are hiring, so feel free to approach me with any questions regarding testing in general or discussing controversial topics.
00:21:45.960
I am happy to engage in discussions about creating excessive data within factories and would love to hear your thoughts on the topic. Thank you once again for your attention!