00:00:16.380
Okay, sweet! Welcome everybody! Thank you all for coming to this talk. This is the second talk that I've given at RailsConf. The first one I gave was on internationalization, and there were maybe 15 people in the audience. Now, there are a lot more people here. I guess this topic is more popular, I don’t know. So again, thanks to all for coming.
00:00:40.860
I’m Cameron Dutro, and I work for Twitter on the International Engineering team, which explains the topic of my previous talk two years ago. You can follow me at @CameraTrot on Twitter and check out my GitHub as well. A little bit of personal background: this is my cat. He's pretty cute, and his name is Chester. You can follow him at @CatWithTail.
00:01:03.780
Now, I want to do a little interactive exercise before diving into the technical portions of this talk. Raise your hand if you've ever written Rails 2 code. Wow, look at that! Okay, nice! Now, how many of you saw a query in Rails 2 that looked like this? (shows a complex SQL query) Yes, there are tons of strings in here, and there's not really an object-oriented way to construct queries in Rails 2.
00:01:21.869
Thank Heavens for Rails 3 and 4 because they introduced a much more chainable, object-oriented pattern for query construction. How many of you wrote Rails 3 or 4 code that looks like this? (shows another complex SQL query) That's pretty much the same thing as before. I don’t want to say you’re doing it wrong, but you're doing it wrong.
00:01:40.860
Let's go into this code and maybe clean it up a little bit. I just moved one of those strings that defined a join association on this table and converted it to a symbol. I’m joining the comments table on posts. This is a lot cleaner looking, and it's allowing ActiveRecord to introspect the associations between these two models to derive the columns to use and automatically construct that query for you.
00:02:26.569
So that's awesome! But we still have a couple of other problems with this code. We still have these question marks in the WHERE clause, and we still have that second join. The reason that join is not yet a symbol is because it's an association that doesn't directly pertain to posts. It's joining authors in relation to comments; in this system, authors are part of comments, not part of posts.
00:03:18.470
That’s so confusing! You could simplify this even further by using this cool hash rocket syntax that says, even though comments and authors are not directly related to posts, I want to get there through comments. Fortunately, ActiveRecord has given us a lot over the past couple of years between versions 2 and 3 and 4, but this is still a little bit dirty, especially if you want to do something like an outer join in this join clause or if you wanted to do something more complex than just saying one column is equal to another.
00:03:51.889
So I kind of look at this style as better, but you're still handed a great weapon and not quite using it appropriately. Let's look at parts of this query. I’m going to use the previous string example to identify what’s wrong with these queries. One of the first things that’s wrong is that you have to write the words 'JOIN' and 'AND'. This isn’t difficult to type, but it does mean you have to know MySQL or PostgreSQL syntax to get this done.
00:04:50.290
Also, there's essentially no syntax checking. You have no idea if you’ve typed authors is an actual table or whether authors.id is a column. All you know is that you have this string, and it's being handed to ActiveRecord to execute it for you. So let’s address these issues. You might think there's a better way and do a search, and you might come across a Stack Overflow article that tells you to use strings in your queries or perhaps Rails casts that suggest the same.
00:06:04.570
I'm here to tell you that no, you can avoid doing that! You can keep calm and avoid literal strings in your queries. Let's look at the previous query, but completely eralized. This is the query eralized. Notice a bunch of cool things going on here. First off, you don’t have to know SQL syntax; all you need to know is Ruby syntax.
00:06:44.330
Of course, this is a little bit of a DSL, so you might have to go about learning aRel's syntax as well, but that’s what this talk is for. Because ActiveRecord 3 and 4 are obtainable, aRel is chainable as well, allowing you to compose queries in a step-by-step fashion.
00:07:13.880
It's also easier to read since you’re reading Ruby, not SQL embedded in Ruby, which feels a bit odd. Here’s what we’re going to cover today. I’ve already gone over the initial state of the world, so I’ll cover what ActiveRecord is, what aRel is, and the indistinct line separating those two. I’ll try to separate them, talk about how to reference tables and columns, what terminal methods are, and why they can be confusing. We'll cover the meat of SQL such as SELECT, WHERE, JOIN, and a particular feature I’ll show called join association.
00:08:01.320
So what is ActiveRecord exactly? Well, you all know this, but it's worth repeating: ActiveRecord is a database abstraction, so there's no need to speak a dialect of SQL. You can execute queries, and it returns data back to you. ActiveRecord also serves as a persistence layer, so database rows are Ruby objects. You can pull them out, put them back in, and encapsulate, in many cases, your domain logic.
00:09:09.149
I know people say no fat models and no fat controllers, but a lot of domain logic is crammed into those models. They can have validations and define associations between your models. In contrast, aRel is distinct in that it’s really only used for building queries. It's described on its GitHub page as a Relational Algebra for Ruby, even though that might seem confusing at first.
00:09:55.410
Essentially, what that means is they’ve built SQL queries in an object-oriented way, generating abstract syntax trees (ASTs) for your queries and then using the visitor pattern to turn those into SQL strings to be executed by ActiveRecord. This enables chaining effectively because ASTs are composable; you just have to reference a subtree to pull in another branch of query logic.
00:10:51.289
One blogger even described aRel as vexing but undocumented. You can find pieces of aRel all over the internet, but putting all that together took me time. The root of this talk rose from a project I was working on at Twitter. I work on the Twitter Translation Center, and we needed to implement an access control system to determine whether a user was able to see a particular phrase, which turned out to be a challenge.
00:11:37.350
We would have had to write numerous strings in our SQL queries, so we wanted to avoid that and have something composable, leveraging scoping capabilities and more. Having done that project, I’m trying to share these findings with you today. As a recap, aRel knows nothing about your models and understands very little about your database. It only knows the queries you're trying to construct and does not care if those tables exist—ActiveRecord does, but aRel does not.
00:12:41.610
In summary, aRel constructs queries, and ActiveRecord does everything else. Looking at this as a hierarchy, ActiveRecord interacts as a layer above aRel, which processes and forms the query before sending it off to the database; then, ActiveRecord will package that data back into nice Ruby objects for you.
00:13:33.360
Now, something else that needs introduction is what an AST is. As DHH eloquently put it in his keynote, not everyone comes from a computer science background, and an AST is very computer science-heavy. An AST essentially represents a basic tree structure; this is actually a binary tree, where each node has left and right children unless it's a leaf.
00:14:34.590
Trees are cool because you can represent equations or expressions with them. For example, you can represent a multiplication operation as a node at the root, with left and right children representing operands. In this way, you can expand this concept into how this would look for a SQL expression.
00:15:19.680
For example, you might have a query root, with a SELECT for ID and text, and the FROM would be one child, for example, 'posts'. Now that was an overview—let's get into some coding. The first thing I want to discuss is a little gem I put together to assist with this talk, called aRel Helpers.
00:15:57.090
Go ahead and install it; it has three small helpers, and let's jump into how to reference tables and columns in aRel. Suppose you have a table called 'posts' which has many comments. If you want to reference the ID in this table, you would use 'post.aRel.table.ID', referring to the aRel table.
00:16:37.060
Now brackets for 'ID' are called an aRel attribute. You can do the same for 'text' and even for columns that don't exist—for example, if you want to derive a column, you could refer to it in a similar way. The aRel helpers gem makes this easier; if you include aRel helpers and access your tables with square brackets, it effectively delegates to aRel table returning an attribute for you.
00:17:44.630
Let’s move on to a pop quiz. What does this statement return: 'Rails is cool'? Does it return an array of strings, instantiated objects from ActiveRecord, or does it return an ActiveRecord relation? Raise your hands—who thinks it’s A? B? C? Nice! The answer is C!
00:18:45.050
One thing cool about ActiveRecord 3 and 4 is that methods like select, join, and where actually return an ActiveRecord relation; they don't execute right away, making them chainable. If I say 'Post.select(:title)', it doesn’t execute anything right away, rather it stores an intermediate AST in a variable called 'query'.
00:19:20.270
I can say 'query = query.select(:another_column)' and when I say 'query.to_sql', I’ll get the string that ActiveRecord would execute against the database for the query. The 'to_sql' method is crucial, as any ActiveRecord relation can be called to retrieve the SQL string.
00:20:47.400
So, let’s talk about the specifics here. Looking at select, you can select multiple columns using array syntax. This is still just ActiveRecord, not diving into aRel quite yet. You might think that to get SQL for COUNT, you would say 'SELECT COUNT(*) FROM posts', but doing so would throw a 'NoMethodError' because COUNT is a terminal method.
00:21:44.950
Being a terminal method, it executes immediately and returns a result instead of returning an ActiveRecord relation. To get a count and SQL string, you must dive into aRel, using your aRel table. You can call count on an aRel attribute, and of course, pass that to select to assemble the expected count.
00:22:28.949
Terminal methods in ActiveRecord execute the query immediately and do not return a relation; most common methods include count, first, and last. Any enumerable methods will trigger immediate query execution by default. This includes regular enumerators like 'each', which will iterate over the results. Interestingly, some methods like 'each_slice' can also execute immediately.
00:23:33.040
Okay, let’s dive into select one more time. We have seen how we can call count on an aRel attribute, and now we can also say 'SUM' to achieve similar outputs. ActiveRecord has added some methods to help; you can derive a column name by adding 'AS' at the end of your queries.
00:24:16.220
For example, you could have a method like 'MAX' or 'MIN'. If you try to run a function that isn’t part of either aRel or ActiveRecord, you can define any arbitrary function...to do so, you would call a named function passing the name and array of arguments, aliasing the result as needed.
00:25:01.040
You also have options to select everything by specifying 'aRel.*', and certain functionalities allow specifying FROM as part of the construction. It’s common to specify sub-queries directly in the FROM.
00:26:03.460
Here, you can see how to add a sub-query as part of your SELECT expression; simply ensure that at any point in construction, you can access AST to verify the structure. Let’s briefly touch on WHERE, which has gotten much better since Rails 2.
00:26:41.430
Now, we can pass a title through a WHERE clause directly. It constructs the SQL query like we expect—no question marks here! However, if you wish to specify additional conditions, aReL allows conditions like 'NOT EQUAL', 'LESS THAN', or logic like AND/OR chaining.
00:27:38.210
Let’s explore the JOIN functionality of ActiveRecord and how aRel can help. Using the previous data model for a blog, with Posts, Comments, and Authors, it’s easy to use join conditions to pull associations effectively. ActiveRecord inspects this model and collects the association automatically, producing appropriate join queries.
00:28:50.470
However, when it comes to outer joins, you have to dive deeper into aRel. The way we do it involves using the aRel table features to specify join types and conditions manually—this could rapidly become verbose, but it affords accuracy!
00:30:11.090
To bridge these gaps, I developed the 'Join Association' feature which allows automatic introspection similar to ActiveRecord but eliminates tedious, manual conditions. The benefit of doing this in aRel is that it allows constructing complex queries more efficiently and cleanly.
00:32:09.480
So we've discussed joining tables, particularly with join tables—say, courses and teachers, which showcases a many-to-many relationship. Here, you can reference an ad-hoc join table using aRel capabilities. The query process showcases how to dynamically build joins effectively.
00:34:46.620
Finally, I encourage you to explore a project I worked on called Scuttle, which allows you to enter SQL and converts it to aRel, helping simplify query building. Thank you all for your attention and for listening!