Query Optimization
Advanced aRel: When ActiveRecord Just Isn't Enough

Summarized using AI

Advanced aRel: When ActiveRecord Just Isn't Enough

Cameron Dutro • April 22, 2014 • Chicago, IL

In the talk 'Advanced aRel: When ActiveRecord Just Isn't Enough,' Cameron Dutro discusses the limitations of ActiveRecord for complex queries in larger Rails applications and introduces aRel, a powerful library for building more sophisticated database queries using a Ruby-centric approach.

Throughout the presentation, Dutro outlines the following key points:

- ActiveRecord Overview: ActiveRecord functions as a database abstraction layer in Rails, allowing developers to interact with the database without needing to write SQL directly. However, it can struggle with complex queries and multiple associations as applications scale.

- Introduction to aRel: aRel is presented as an essential tool to write objects-oriented SQL without resorting to raw SQL strings. It allows developers to build complex queries in a more readable manner.

- Code Examples: Dutro provides multiple code snippets to illustrate how aRel simplifies query construction compared to ActiveRecord's native capabilities.

- Example of a complex ActiveRecord query with joins and where clauses

- Transitioning to aRel for clearer syntax and improved error checking.

- Terminal Methods: The concept of terminal methods in ActiveRecord is highlighted, emphasizing that some methods like first or last execute immediately, which limits chaining capabilities.

- Constructing Queries with aRel: Various query components like select, where, and join are discussed in relation to aRel's capabilities, showing the syntax and its benefits over ActiveRecord.

- Join Examples: Complex join operations are presented, especially the advantages of performing outer joins using aRel, which are less straightforward in ActiveRecord.

- Query Builders: The idea of encapsulating query logic with custom query builders is introduced, allowing developers to create reusable query patterns.

- Tools: Dutro briefly discusses a project called 'scuttle', an editor for converting SQL queries into aRel syntax, making it easier for developers to leverage aRel without extensive manual coding.

In conclusion, the main takeaways from the talk emphasize the necessity of using aRel for complex database interactions within Rails applications as they scale, highlighting its elegant syntax, improved readability, and capabilities for constructing robust queries. The overarching message encourages developers to move beyond simple ActiveRecord patterns and embrace aRel for cleaner, more maintainable code when dealing with sophisticated database logic.

Advanced aRel: When ActiveRecord Just Isn't Enough
Cameron Dutro • April 22, 2014 • Chicago, IL

By Cameron Dutro

We all love Rails, and lots of us love ActiveRecord. It's intuitive and easy to use in small apps that don't have lots of models. You can select, join, and where your way into a great, working app in no time. Then your app starts to grow, you add more models, and you need to start building more and more complex queries. This talk describes how to harness the awesomeness of aRel, ActiveRecord's powerful relational algebra system, to perform arbitrarily complex queries using nothing but pure Ruby.

Cameron Dutro works on Twitter's International Engineering team, primarily on the Twitter Translation Center, a large Rails application. He's been building stuff in Ruby for the past three years, including the TwitterCLDR internationalization library, and loving every minute of it.

Help us caption & translate this video!

http://amara.org/v/FG1s/

RailsConf 2014

00:00:16.380 Okay, sweet! Welcome everybody! Thank you all for coming to this talk. This is the second talk that I've given at RailsConf. The first one I gave was on internationalization, and there were maybe 15 people in the audience. Now, there are a lot more people here. I guess this topic is more popular, I don’t know. So again, thanks to all for coming.
00:00:40.860 I’m Cameron Dutro, and I work for Twitter on the International Engineering team, which explains the topic of my previous talk two years ago. You can follow me at @CameraTrot on Twitter and check out my GitHub as well. A little bit of personal background: this is my cat. He's pretty cute, and his name is Chester. You can follow him at @CatWithTail.
00:01:03.780 Now, I want to do a little interactive exercise before diving into the technical portions of this talk. Raise your hand if you've ever written Rails 2 code. Wow, look at that! Okay, nice! Now, how many of you saw a query in Rails 2 that looked like this? (shows a complex SQL query) Yes, there are tons of strings in here, and there's not really an object-oriented way to construct queries in Rails 2.
00:01:21.869 Thank Heavens for Rails 3 and 4 because they introduced a much more chainable, object-oriented pattern for query construction. How many of you wrote Rails 3 or 4 code that looks like this? (shows another complex SQL query) That's pretty much the same thing as before. I don’t want to say you’re doing it wrong, but you're doing it wrong.
00:01:40.860 Let's go into this code and maybe clean it up a little bit. I just moved one of those strings that defined a join association on this table and converted it to a symbol. I’m joining the comments table on posts. This is a lot cleaner looking, and it's allowing ActiveRecord to introspect the associations between these two models to derive the columns to use and automatically construct that query for you.
00:02:26.569 So that's awesome! But we still have a couple of other problems with this code. We still have these question marks in the WHERE clause, and we still have that second join. The reason that join is not yet a symbol is because it's an association that doesn't directly pertain to posts. It's joining authors in relation to comments; in this system, authors are part of comments, not part of posts.
00:03:18.470 That’s so confusing! You could simplify this even further by using this cool hash rocket syntax that says, even though comments and authors are not directly related to posts, I want to get there through comments. Fortunately, ActiveRecord has given us a lot over the past couple of years between versions 2 and 3 and 4, but this is still a little bit dirty, especially if you want to do something like an outer join in this join clause or if you wanted to do something more complex than just saying one column is equal to another.
00:03:51.889 So I kind of look at this style as better, but you're still handed a great weapon and not quite using it appropriately. Let's look at parts of this query. I’m going to use the previous string example to identify what’s wrong with these queries. One of the first things that’s wrong is that you have to write the words 'JOIN' and 'AND'. This isn’t difficult to type, but it does mean you have to know MySQL or PostgreSQL syntax to get this done.
00:04:50.290 Also, there's essentially no syntax checking. You have no idea if you’ve typed authors is an actual table or whether authors.id is a column. All you know is that you have this string, and it's being handed to ActiveRecord to execute it for you. So let’s address these issues. You might think there's a better way and do a search, and you might come across a Stack Overflow article that tells you to use strings in your queries or perhaps Rails casts that suggest the same.
00:06:04.570 I'm here to tell you that no, you can avoid doing that! You can keep calm and avoid literal strings in your queries. Let's look at the previous query, but completely eralized. This is the query eralized. Notice a bunch of cool things going on here. First off, you don’t have to know SQL syntax; all you need to know is Ruby syntax.
00:06:44.330 Of course, this is a little bit of a DSL, so you might have to go about learning aRel's syntax as well, but that’s what this talk is for. Because ActiveRecord 3 and 4 are obtainable, aRel is chainable as well, allowing you to compose queries in a step-by-step fashion.
00:07:13.880 It's also easier to read since you’re reading Ruby, not SQL embedded in Ruby, which feels a bit odd. Here’s what we’re going to cover today. I’ve already gone over the initial state of the world, so I’ll cover what ActiveRecord is, what aRel is, and the indistinct line separating those two. I’ll try to separate them, talk about how to reference tables and columns, what terminal methods are, and why they can be confusing. We'll cover the meat of SQL such as SELECT, WHERE, JOIN, and a particular feature I’ll show called join association.
00:08:01.320 So what is ActiveRecord exactly? Well, you all know this, but it's worth repeating: ActiveRecord is a database abstraction, so there's no need to speak a dialect of SQL. You can execute queries, and it returns data back to you. ActiveRecord also serves as a persistence layer, so database rows are Ruby objects. You can pull them out, put them back in, and encapsulate, in many cases, your domain logic.
00:09:09.149 I know people say no fat models and no fat controllers, but a lot of domain logic is crammed into those models. They can have validations and define associations between your models. In contrast, aRel is distinct in that it’s really only used for building queries. It's described on its GitHub page as a Relational Algebra for Ruby, even though that might seem confusing at first.
00:09:55.410 Essentially, what that means is they’ve built SQL queries in an object-oriented way, generating abstract syntax trees (ASTs) for your queries and then using the visitor pattern to turn those into SQL strings to be executed by ActiveRecord. This enables chaining effectively because ASTs are composable; you just have to reference a subtree to pull in another branch of query logic.
00:10:51.289 One blogger even described aRel as vexing but undocumented. You can find pieces of aRel all over the internet, but putting all that together took me time. The root of this talk rose from a project I was working on at Twitter. I work on the Twitter Translation Center, and we needed to implement an access control system to determine whether a user was able to see a particular phrase, which turned out to be a challenge.
00:11:37.350 We would have had to write numerous strings in our SQL queries, so we wanted to avoid that and have something composable, leveraging scoping capabilities and more. Having done that project, I’m trying to share these findings with you today. As a recap, aRel knows nothing about your models and understands very little about your database. It only knows the queries you're trying to construct and does not care if those tables exist—ActiveRecord does, but aRel does not.
00:12:41.610 In summary, aRel constructs queries, and ActiveRecord does everything else. Looking at this as a hierarchy, ActiveRecord interacts as a layer above aRel, which processes and forms the query before sending it off to the database; then, ActiveRecord will package that data back into nice Ruby objects for you.
00:13:33.360 Now, something else that needs introduction is what an AST is. As DHH eloquently put it in his keynote, not everyone comes from a computer science background, and an AST is very computer science-heavy. An AST essentially represents a basic tree structure; this is actually a binary tree, where each node has left and right children unless it's a leaf.
00:14:34.590 Trees are cool because you can represent equations or expressions with them. For example, you can represent a multiplication operation as a node at the root, with left and right children representing operands. In this way, you can expand this concept into how this would look for a SQL expression.
00:15:19.680 For example, you might have a query root, with a SELECT for ID and text, and the FROM would be one child, for example, 'posts'. Now that was an overview—let's get into some coding. The first thing I want to discuss is a little gem I put together to assist with this talk, called aRel Helpers.
00:15:57.090 Go ahead and install it; it has three small helpers, and let's jump into how to reference tables and columns in aRel. Suppose you have a table called 'posts' which has many comments. If you want to reference the ID in this table, you would use 'post.aRel.table.ID', referring to the aRel table.
00:16:37.060 Now brackets for 'ID' are called an aRel attribute. You can do the same for 'text' and even for columns that don't exist—for example, if you want to derive a column, you could refer to it in a similar way. The aRel helpers gem makes this easier; if you include aRel helpers and access your tables with square brackets, it effectively delegates to aRel table returning an attribute for you.
00:17:44.630 Let’s move on to a pop quiz. What does this statement return: 'Rails is cool'? Does it return an array of strings, instantiated objects from ActiveRecord, or does it return an ActiveRecord relation? Raise your hands—who thinks it’s A? B? C? Nice! The answer is C!
00:18:45.050 One thing cool about ActiveRecord 3 and 4 is that methods like select, join, and where actually return an ActiveRecord relation; they don't execute right away, making them chainable. If I say 'Post.select(:title)', it doesn’t execute anything right away, rather it stores an intermediate AST in a variable called 'query'.
00:19:20.270 I can say 'query = query.select(:another_column)' and when I say 'query.to_sql', I’ll get the string that ActiveRecord would execute against the database for the query. The 'to_sql' method is crucial, as any ActiveRecord relation can be called to retrieve the SQL string.
00:20:47.400 So, let’s talk about the specifics here. Looking at select, you can select multiple columns using array syntax. This is still just ActiveRecord, not diving into aRel quite yet. You might think that to get SQL for COUNT, you would say 'SELECT COUNT(*) FROM posts', but doing so would throw a 'NoMethodError' because COUNT is a terminal method.
00:21:44.950 Being a terminal method, it executes immediately and returns a result instead of returning an ActiveRecord relation. To get a count and SQL string, you must dive into aRel, using your aRel table. You can call count on an aRel attribute, and of course, pass that to select to assemble the expected count.
00:22:28.949 Terminal methods in ActiveRecord execute the query immediately and do not return a relation; most common methods include count, first, and last. Any enumerable methods will trigger immediate query execution by default. This includes regular enumerators like 'each', which will iterate over the results. Interestingly, some methods like 'each_slice' can also execute immediately.
00:23:33.040 Okay, let’s dive into select one more time. We have seen how we can call count on an aRel attribute, and now we can also say 'SUM' to achieve similar outputs. ActiveRecord has added some methods to help; you can derive a column name by adding 'AS' at the end of your queries.
00:24:16.220 For example, you could have a method like 'MAX' or 'MIN'. If you try to run a function that isn’t part of either aRel or ActiveRecord, you can define any arbitrary function...to do so, you would call a named function passing the name and array of arguments, aliasing the result as needed.
00:25:01.040 You also have options to select everything by specifying 'aRel.*', and certain functionalities allow specifying FROM as part of the construction. It’s common to specify sub-queries directly in the FROM.
00:26:03.460 Here, you can see how to add a sub-query as part of your SELECT expression; simply ensure that at any point in construction, you can access AST to verify the structure. Let’s briefly touch on WHERE, which has gotten much better since Rails 2.
00:26:41.430 Now, we can pass a title through a WHERE clause directly. It constructs the SQL query like we expect—no question marks here! However, if you wish to specify additional conditions, aReL allows conditions like 'NOT EQUAL', 'LESS THAN', or logic like AND/OR chaining.
00:27:38.210 Let’s explore the JOIN functionality of ActiveRecord and how aRel can help. Using the previous data model for a blog, with Posts, Comments, and Authors, it’s easy to use join conditions to pull associations effectively. ActiveRecord inspects this model and collects the association automatically, producing appropriate join queries.
00:28:50.470 However, when it comes to outer joins, you have to dive deeper into aRel. The way we do it involves using the aRel table features to specify join types and conditions manually—this could rapidly become verbose, but it affords accuracy!
00:30:11.090 To bridge these gaps, I developed the 'Join Association' feature which allows automatic introspection similar to ActiveRecord but eliminates tedious, manual conditions. The benefit of doing this in aRel is that it allows constructing complex queries more efficiently and cleanly.
00:32:09.480 So we've discussed joining tables, particularly with join tables—say, courses and teachers, which showcases a many-to-many relationship. Here, you can reference an ad-hoc join table using aRel capabilities. The query process showcases how to dynamically build joins effectively.
00:34:46.620 Finally, I encourage you to explore a project I worked on called Scuttle, which allows you to enter SQL and converts it to aRel, helping simplify query building. Thank you all for your attention and for listening!
Explore all talks recorded at RailsConf 2014
+133