00:00:18.539
I think I'm going to go ahead and get started. It's so cool seeing all these people come to my talk.
00:00:24.600
Thank you so much. Uh, can you hear me clearly in the back?
00:00:29.640
Excellent. Thank you, technician. All right, my name is Justin Daniel.
00:00:36.840
I'm a senior software engineer with G2, and I have many of my colleagues and teammates sitting in the front here with me.
00:00:44.219
Today, I will be presenting a custom design pattern for building dynamic ActiveRecord queries.
00:00:50.640
Thank you so much for being here. Just a note: I did build a demonstration repository with the help of fellow engineer John Paul Hill.
00:00:57.719
The GitHub link is on the slide, along with the QR code. In this talk, I will be using PowerPoint slides to summarize the main points.
00:01:04.799
We will also be working in the code editor and console to really dig in and understand this query pattern.
00:01:10.799
So, my agenda for today is as follows.
00:01:17.580
I will discuss the big picture and why composable queries are so important.
00:01:24.540
We will compare different tools for composing queries, including raw SQL, ActiveRecord, and Arrow tables.
00:01:31.380
Then, we will go over our own domain-specific language and introduce the query builder and visitor pattern that we've created at G2.
00:01:37.640
Finally, I'll have time for questions and answers. My big goal here is to inspire you to use this builder pattern.
00:01:50.159
I want to not only provide you with an understanding of it but convince you that this is a tool you should try out.
00:01:55.979
I believe our demonstration repository really gives you everything you need to get started.
00:02:01.320
So, let's start with the big picture here.
00:02:07.079
Being a big data company is challenging. Serving all of your customers' needs can feel like pushing a big boulder up a hill.
00:02:20.640
There are three challenges I want to highlight. First, we need to surface our data in many different systems, such as live web pages, CSV downloads, and email notifications.
00:02:32.400
We also have third-party partnerships, such as with Salesforce, where we need to present our data.
00:02:40.980
Our users want data in a customizable manner; they want to include just the fields they are interested in while filtering out irrelevant signals.
00:02:54.000
We need to achieve this while maintaining fast, agile innovations.
00:03:00.300
Some acceptance criteria for our queries are as follows: hard-coded queries are out of the question.
00:03:12.140
They need to be composable and maintainable, allowing engineers to work with them effectively. Additionally, they must remain performant even as we build higher layers of abstraction on top of SQL.
00:03:30.680
I really like the metaphor of Legos here—I have been a big Lego fan since I was a child. I love the idea of having building blocks that you can shape into whatever you need.
00:03:48.360
Our queries should be modular and constructed from smaller, interchangeable parts.
00:03:54.000
They need to respond logically to conditionals to achieve the desired output and return fields that may vary based on the filters users select.
00:04:07.040
However, composable queries can become messy quite easily. I'm sure many developers in this room have experienced struggles with this.
00:04:24.960
What I'm going to show you is an old design pattern that we used, which involves SQL concatenation.
00:04:35.820
This pattern resulted in over 230 lines of code, served by many different private methods and a dozen or so helper files.
00:04:47.340
The problem here is not SQL injection—we guard against that in other parts of the code. The issue is that it's easy to lose the plot.
00:05:01.560
You can follow that this is an SQL query with its different clauses—SELECT, JOIN, WHERE, GROUP BY, etc.—but it doesn't really tell a story, making it hard to work with.
00:05:13.139
It also complicates discovery of common parts of domain logic. I made the decision to encourage our team to adopt a different design pattern for composing queries.
00:05:25.259
I'm going to give you a sneak peek of what we achieved and what you can also accomplish using the Builder pattern.
00:05:38.280
You can return to object-oriented programming principles when composing your queries.
00:05:45.000
Here's our sneak peek—you'll find this in the repository I mentioned, and we will dive deeper into it.
00:06:03.479
For now, I want you to notice two important things: first, we're back to using named classes in our query, which enhances discoverability.
00:06:08.240
Secondly, there are no conditional branches disrupting the flow of this code, allowing the query class to tell a coherent story.
00:06:14.699
As we will see, the conditional branches come downstream.
00:06:25.259
Before we get into the details, I want to convince you that this may be the right tool for your application.
00:06:32.860
Let's compare the different tools you might use for composing queries.
00:06:41.400
First, there's raw SQL concatenation, which still has its practitioners and defenders. It has advantages when used correctly—it is the most performant option.
00:06:55.440
You can't get more performant than building an abstraction layer above your database management system.
00:07:06.000
Plus, you have a reduced dependency footprint, meaning you aren't tied to your ORM.
00:07:12.000
However, there are downsides. Your readability will generally suffer. Raw SQL is less maintainable, less composable, and less reusable.
00:07:20.940
Additionally, it’s not as familiar to most Rails developers, and you receive less automated help.
00:07:27.639
Next, we have vanilla ActiveRecord, without any libraries or higher-level abstractions.
00:07:36.960
Its advantages include being composable and more readable than SQL. After all, it utilizes Ruby on Rails and is capable of most complex queries.
00:07:44.599
The downside is that it doesn't handle conditional branching well by itself, leading you to sometimes revert to raw SQL.
00:07:53.120
Moreover, you have a dependency on your ORM.
00:08:09.600
Our next tool is Arrow tables, which many Rails practitioners rely on. Arrow tables have advantages in handling conditional branches well.
00:08:17.400
They also deal with complex queries and reduce your dependency on a specific database management system.
00:08:24.540
However, the downside comes from being dependent on the Arrow library now, which introduces another dependency.
00:08:31.800
I personally find the relational algebra syntax difficult to read, and I believe many Rails developers share this sentiment.
00:08:45.779
I prefer object-oriented syntax instead. Another major reason we chose our own builder pattern over Arrow tables is that we can express multiple concerns in a single node.
00:08:52.680
Now we arrive at our custom design pattern with ActiveRecord. It's highly composable and effectively handles conditional branches.
00:09:01.500
It uses Ruby on Rails, and no complex library is needed. If you clone our repository, you'll see how the pattern operates and feel free to adapt it to your own application.
00:09:09.720
No tool is perfect, and there are a couple of downsides to consider: you will face an additional learning curve, as you'll need to be proficient with ActiveRecord and the Builder pattern.
00:09:22.560
Also, you will still have a dependency on ActiveRecord and Rails.
00:09:30.780
Let's delve deeper into the intricacies of the query builder and visitor design pattern. From here on, I will simply refer to it as the Builder pattern.
00:09:40.680
The Builder pattern is a higher abstraction that sits atop ActiveRecord. Abstraction is what I love best about being a software engineer.
00:09:48.600
I enjoy designing systems that tell a coherent story readable by humans, not just machines.
00:09:54.840
Now, here's where we will transition into our code editor, and I will show you how this pattern works.
00:10:20.300
Let me provide some context on how we designed this repository. It required some minimum schema to effectively demonstrate our point.
00:10:28.200
For context, this schema is modeled off of a tabletop role-playing game character builder. For those who aren't familiar, that's perfectly fine.
00:10:42.120
What you need to know is that in our database schema, we have a central table called characters.
00:10:52.260
Picture yourself as a dungeon master or someone else interested in characters, wanting to search among thousands of character records.
00:10:58.800
Our characters have numerous associations with campaigns, players, spells, feats, and more.
00:11:08.300
We need to be able to query from the characters table, and this is an example of one of our query classes, the index query.
00:11:15.300
There are three main concern objects in our query builder pattern: query classes, builders, and nodes.
00:11:23.640
This is where the query class serves as your public interface, returning an ActiveRecord relation.
00:11:29.900
It contains private methods that address implementation specifics while retaining its public interface from its base class.
00:11:40.920
Thank you. Our base class has one public method called `call`.
00:11:46.680
Whenever you need to run a query, you send it the `call` message. This method serves two purposes—it applies and tells the builder to build.
00:11:55.800
We have two primary parameters: the builder, which is injected as a dependency, and filters, which is a hash of user inputs.
00:12:07.680
The filters specify what the user wants; for example, they might only be interested in characters with certain spells.
00:12:16.020
The implementation of the `call` method will depend on the specific child class being called.
00:12:29.520
Let's return to our specific child class. The apply method here performs several tasks.
00:12:40.200
First, it applies joins only to those tables that will be needed based on the filters. It also adds nodes to our builder.
00:12:47.640
The initial state will simply be characters, building an ActiveRecord query concerning those characters.
00:12:57.840
Next, we need to examine the nodes and builders, so let's take a look at one of the nodes.
00:13:07.920
You may be wondering where the conditional logic resides; the answer is in the nodes.
00:13:19.739
We're going to look at one of our simpler nodes called 'spells level n.' This produces a clause for the spells table where level is included in the specified levels.
00:13:32.760
The parameter of levels determines how valid this node is, as its validity relies on the filters passed.
00:13:42.540
If the filters do not include levels, this node will be eliminated. We use inheritance to confer a common public interface.
00:13:50.760
All nodes will possess both an accept method and a valid method.
00:13:58.320
The simplest form of the accept method will merge different ActiveRecord clauses.
00:14:12.720
Now, let’s consider our builder.
00:14:19.920
Our builder is initialized with an empty array of nodes and has two public methods: an add method and a build method.
00:14:28.680
The query class will call the add method to provide all the nodes the builder might need, and once all nodes have been added, we call the build method.
00:14:36.520
The build method contains our central algorithm, which acts as a reducer to accumulate the state of the ActiveRecord query converted from the nodes.
00:14:50.640
The builder does not directly direct this; it instead delegates to the nodes using the visitor pattern. That's the reason we pass a block to reduce, which is visit.
00:14:58.800
Let's demonstrate how this works by entering our console.
00:15:06.840
We're currently inside our query class.
00:15:11.320
Let’s start with some initial state here.
00:15:18.880
The builder has its initial state set.
00:15:23.260
The builder currently points to the characters table but has no nodes.
00:15:31.900
After applying a join, our builder now has an inner join node added.
00:15:39.240
Let’s add our other nodes.
00:15:48.360
When we call builder.build, our actual query will be constructed.
00:15:54.520
It will filter out all the irrelevant nodes that we don't need.
00:16:03.880
I hope this served as a useful introduction to how this works.
00:16:10.499
Please bear with me while I navigate back to my presentation, as working with a projector is a little different.
00:16:17.060
So, let’s summarize what we've learned so far.
00:16:22.880
It's crucial to understand that our Builder pattern consists of object-oriented programming, involving three interacting objects: the queries, the builders, and the nodes.
00:16:30.360
The central algorithm utilizes reduce.
00:16:44.760
I’ll pause for a moment to take a drink.
00:16:51.400
I like this diagram; it illustrates how the different objects work together.
00:16:56.880
Going from left to right, we have our query class, which takes in filters that customize our query.
00:17:04.740
The dependency injection of the builder provides one public method: the call method.
00:17:10.560
The call method triggers two methods on the builder: add and build.
00:17:17.080
The builder keeps a state of nodes in an array and an initial state provided by the query class.
00:17:25.040
When the builder is told to build, it loops over the nodes and visits each node.
00:17:32.560
First, it asks the node if it is valid based on the state passed in. If valid, the accept method will be called.
00:17:40.160
Typically, this involves merging ActiveRecord clauses.
00:17:47.440
The node generates a clause, and ultimately, the builder returns an ActiveRecord relation to the query class.
00:17:54.680
Now, the central algorithm utilizes the reduce method.
00:18:02.800
Thanks for your patience. Let's go back to our code editor.
00:18:10.960
Let’s closely examine the reduce method.
00:18:16.120
We can find it in our builder—the base implementation contains a reduce method.
00:18:24.520
The standard Ruby library reduce can be called on an enumerable or an array, combining different elements using any provided code block.
00:18:31.640
Our implementation is customizable from the base class and accepts a state and node as parameters while visiting each node.
00:18:39.240
Let's see what our default builder does.
00:18:45.640
We should zoom in here.
00:18:50.040
Our custom reduce method first loops over all the nodes and checks if each is valid. Invalid nodes are eliminated and not built as part of our query.
00:19:26.900
We call reduce on the nodes that are valid, utilizing our initial state and block.
00:19:35.400
As we compile our logic, let’s consolidate this again in summary slides.
00:19:43.680
The main idea is that the reduce pattern accumulates nodes while also eliminating unnecessary ones.
00:19:49.860
Essentially, we are executing ActiveRecord queries, merging them repeatedly.
00:19:59.680
We could implement this naively using conditional branches, where we'd create two separate queries.
00:20:09.400
However, this approach would complicate things with numerous conditional logics, making it difficult to manage large queries.
00:20:16.360
Motivated to improve, we require a better idea, leading to reduce.
00:20:25.520
By providing the builder with nodes and allowing the nodes to self-eliminate, we manage to create cleaner queries.
00:20:35.520
I love metaphors, and the one I find fitting here is that of a sculptor.
00:20:41.760
The sculptor receives a medium and subtracts what is unnecessary, eventually revealing a beautiful statue.
00:20:54.920
The same concept applies to our reduce pattern.
00:21:01.740
To summarize, the builder constructs queries through the reduce algorithm and visits nodes.
00:21:09.120
The code block is delegated to the node's own method, allowing for organized logic.
00:21:15.300
One of the biggest advantages of this pattern lies in its ability to manage nesting within nodes. Complex queries become easier to handle.
00:21:22.580
Consider Russian nesting dolls, where a node can envelop other nodes.
00:21:29.580
There are numerous use cases for this: or and where clauses, higher domain logic, common table expressions, and subqueries.
00:21:38.260
Our pattern effectively handles nesting with grace.
00:21:46.380
Let's revisit our query class and check out one of the nodes.
00:21:53.240
We have a legendary node that embodies higher business logic, asking for legendary characters.
00:22:00.740
Legendary characters have various attributes, such as experience level and stat boosts.
00:22:06.900
Let’s inspect this further.
00:22:14.840
Our node initializes with a filter and maintains its own sub-builder.
00:22:20.960
This insight allows clauses to delegate back to builders, adding child nodes.
00:22:28.780
Now we have our and filter with two concerns from the characters table.
00:22:38.200
Let’s observe how it looks in console.
00:22:46.480
We’re inside our nested node, the legendary node.
00:22:53.680
Let’s see what happens when we add nodes to the builder.
00:23:00.480
Now we have a nesting of nodes.
00:23:08.760
We have constructed a beautiful query that integrates into our larger query.
00:23:16.200
We want characters with experience exceeding 10,000 and one of their stats must be greater than or equal to 18.
00:23:23.460
There’s a bit of a learning curve when switching between the console and the slideshow.
00:23:27.200
To summarize, with nesting of nodes, we delegate to our own builder. One of the powerful aspects of nested nodes is their ability to execute and manage complex queries.
00:23:36.880
Finally, I’ll outline how you would implement our query builder pattern.
00:23:45.200
There are four principles of best practice to follow. First, encapsulate the public interface of your query in a class.
00:23:55.940
You want the query encapsulated in a query object class with an appropriate scope.
00:24:02.640
The query should achieve a balance—it shouldn't be a monolith trying to do everything, nor should it be overly granular.
00:24:09.800
A helpful rule of thumb is to have one query class per API endpoint.
00:24:18.740
We have the index query we've discussed, and I appreciate the clarity it provides.
00:24:27.240
You want to make your queries modular by utilizing named classes that represent nodes.
00:24:38.320
This approach leads us back to the principles of object-oriented programming, ensuring discoverability and clarity.
00:24:47.520
Nodes can be general or specifically tied to a particular table.
00:24:54.820
You should also delegate functionality to your builders and nodes, moving the main logic outside of the query class.
00:25:05.600
The node classes should have a valid method that assesses whether the builder adds them to the ActiveRecord relation.
00:25:12.820
They also need an accept method that merges them into the ActiveRecord relation.
00:25:21.540
For example, this might look at a different campaign table, allowing users to filter start dates.
00:25:28.600
This node verifies if any filters are present; if they are, it will introduce the WHERE clause.
00:25:35.960
Finally, you want to create a builder that uses the visitor pattern.
00:25:43.780
The builder abstraction should remain distinct from your query class, while your query class crafts a story that reads descriptively.
00:25:51.640
Your builder class will use the visitor pattern to visit each node and extract its clause.
00:26:00.400
We have observed this with our default builder as well as the reduce algorithm employed.
00:26:09.640
To wrap this up, abstraction allows our code to narrate a cohesive story, which I hope was evident in my presentation.
00:26:19.420
We began with our legacy file constrained by SQL concatenation.
00:26:26.920
We’ve achieved a pattern where we return to using named classes, enabling us to narrate a coherent story.
00:26:34.200
Our objective as software engineers is to select the most appropriate abstractions to create beautiful, maintainable code.
00:26:40.640
Before I conclude, just a quick shoutout to my company G2, which is growing and actively hiring, so please consider checking us out.
00:26:59.180
I want to acknowledge that our Builder pattern was developed by three engineers: myself, John Paul Hill, and Evgeni Karpov.
00:27:05.080
I greatly appreciate all the feedback I received on this presentation, which helped improve it.
00:27:12.960
Finally, I want to thank G2 leadership for supporting us in developing this pattern and providing the resources to deliver it to you at Rails Conference.
00:27:19.960
At this point, I have time for questions and answers. Thank you so much!
00:27:30.240
Absolutely. The question was about the GitHub URL. Let's refer back to my slide with the QR code.
00:27:38.160
G2Crowd/query-builders, yes.
00:27:41.820
We indeed wanted to create our own pattern to avoid dependency on anything else. We looked online to see if anyone had implemented this pattern, and to our knowledge, they have not.
00:27:58.160
Therefore, we love owning this for ourselves.
00:28:05.280
The question was why I passed all the nodes to the builder instead of merely selecting the desired nodes.
00:28:14.720
The logic for adding a node needs a location; we've moved that logic downstream to the nodes themselves, particularly into the valid method.
00:28:25.680
This approach liberates our code, allowing it to tell a story without burdening the query class.
00:28:32.080
Did that answer your question?
00:28:39.600
Wonderful, yes, how do we ensure they don't create new query patterns?
00:28:45.500
My answer is to provide them with something they prefer over alternatives. If they discover a better option, I cannot nor should I hinder that.
00:28:54.680
The next question is when do you determine whether to create a new query class versus a different nested node?
00:29:03.640
Often, as you work with specs, you'll discover an explosion of branches that can be too complicated; this signals it's time to split it out.
00:29:10.000
That's the best advice I can offer, although several factors will also come into play.
00:29:17.720
Indeed, there are no performance differences; any abstraction will have some associated trade-offs.
00:29:26.320
How would this work for CTEs? While I don't have a verbal description on hand, I have built queries utilizing this pattern that employ CTEs.
00:29:34.680
Please feel free to reach out to me for more specific examples.
00:29:42.640
Regarding specs for the query class, you can write specifications for the nodes themselves.
00:29:49.340
In practice, you'll find a mixture of integration tests and unit tests will serve well.
00:29:58.100
So, what roadblocks did we encounter while learning and building this pattern? My team faced numerous iterations before honing in on this solution.
00:30:06.720
The reduced pattern we arrived at, while relatively simple, required ample trial and error to find the balance of complexity, performance, and readability.
00:30:14.480
Thank you for all the amazing questions; I'm grateful to see such interest in this topic.
00:30:23.640
Any other questions I may have missed?
00:30:30.000
Yes, certainly, you can change sub-queries.
00:30:34.680
Are you referring to an entire query class?
00:30:41.820
I can't recall a specific instance with that offhand, but I don't see why not.
00:30:47.480
I think we've reached the end. Thank you so much, everyone!