00:00:20.840
All right, I had a minor tragedy: I lost my dongle for my clicker, so I'm a little bit late, but let's get started.
00:00:24.330
Welcome! Today, we're going to dig into ActiveRecord with a couple of detours.
00:00:30.509
In the lower right, there's a link to the slides if you want to follow along. If you press 'P', you can see some presenter notes.
00:00:37.500
I also have some links and notes that I won't cover in the talk. My Twitter handle is in the lower left corner, feel free to tweet at me or about me using the hashtag #RailsConf.
00:00:50.089
We'll focus on some major issues with ActiveRecord, look at some alternatives, discuss why you might not want to use those alternatives, and then talk about a potential solution and a pattern that I think we should all be using.
00:01:04.979
Who here uses ActiveRecord? Okay, most of us! Who hasn’t used ActiveRecord? I don’t see any hands for that.
00:01:19.289
Who’s used a different Ruby ORM? A couple of hands. How about a non-Ruby ORM in a different language? A couple more hands.
00:01:27.720
Who loves ActiveRecord? A few hands. Who hates ActiveRecord? Anyone else have a love-hate relationship like I do?
00:01:41.130
All right, actually, more hands for that one.
00:01:47.550
ActiveRecord is the 800-pound gorilla, and odds are if you're going to work on Rails, if you get hired to work on Rails, you're going to be using ActiveRecord.
00:01:57.239
First, I want to make sure that everyone knows what an ORM is. Ruby deals with objects, obviously, and SQL databases deal with relations. It's actually something called relational algebra that they work with. Sounds pretty cool!
00:02:18.360
I'm not sure there's any mathematical foundation for NoSQL databases.
00:02:28.500
So, an ORM maps these two sides together; it maps between objects and relations. Note that there's an impedance mismatch between the two sides. What works well on one side might not work well on the other.
00:02:33.269
Some straightforward data structures can’t be mapped one-to-one. One canonical example of this is a tree structure, which is really easy to do in OOP, but there are several different ways to represent it in relational algebra, making it difficult to map between those two.
00:02:46.290
I gave a talk at RubyConf 2015, where I went more in-depth on what an ORM is. I sort of found the essence by building one in 400 lines.
00:03:07.140
So, Rails' ActiveRecord is based on the ActiveRecord pattern. Here’s Martin Fowler's definition: I’m not sure if he's the first one to come up with it, but he documented it in the Patterns of Enterprise Application Architecture.
00:03:19.200
Note that he lists three separate things here: wrapping the data of a database table, encapsulating database access, and adding domain logic.
00:03:30.569
You could argue that wrapping and encapsulating are pretty much the same thing, but domain logic is clearly a separate concern.
00:03:38.280
Having that in there indicates that we might be violating the Single Responsibility Principle (SRP). So here’s a UML class diagram of the ActiveRecord pattern.
00:03:46.260
Note that there are two different kinds of things going on: 'find' and 'save' deal with persistent storage, while 'name', 'agent', and 'address' deal with domain logic.
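As a rough sketch of that pattern in plain Ruby: the class below mixes persistence ('find' and 'save') with domain logic in one place. This is a toy, not Rails' implementation; the `Person` class, its `TABLE` hash standing in for a database table, and the `mailing_label` method are all assumptions made for illustration.

```ruby
# Toy illustration of the Active Record pattern; not how Rails implements it.
class Person
  TABLE = {} # hypothetical stand-in for a database table: id => row hash

  attr_reader :id
  attr_accessor :name, :address

  def initialize(name:, address:, id: nil)
    @id = id
    @name = name
    @address = address
  end

  # Persistence concern: look a row up and wrap it in an object.
  def self.find(id)
    row = TABLE[id]
    row && new(id: id, **row)
  end

  # Persistence concern: write this object's data back to the table.
  def save
    @id ||= TABLE.size + 1
    TABLE[@id] = { name: name, address: address }
    true
  end

  # Domain concern: business logic living in the same class.
  def mailing_label
    "#{name}\n#{address}"
  end
end

person = Person.new(name: "Ada", address: "12 Main St")
person.save
found = Person.find(person.id)
```

Both kinds of responsibility hang off the same object, which is the mixing of concerns being described.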
00:03:49.619
The biggest problem I have with ActiveRecord is that it encourages bad engineering habits, mostly because it violates the Single Responsibility Principle. It commingles persistence with domain logic. Separation of concerns is important, just like Rails separates MVC (Model-View-Controller) into separate concerns.
00:04:09.980
We should probably be doing the same with the model itself. As your project gets bigger, ActiveRecord's flaws become more apparent.
00:04:24.570
I find that when I get to about 12 to 20 model classes, it starts to hurt, whereas if you’re below that, it probably doesn’t really matter to you.
00:04:47.550
ActiveRecord is big; it's about 40% of Rails, and that size is another symptom of the Single Responsibility Principle possibly being violated. It tries to do too much in one place and conflates multiple concerns.
00:05:14.180
I am showing some stats here from Rails 5.2.3. A model with one field came in with over 200 instance methods and 600 class methods. For comparison, the Object class has about 86 methods, while String and Array have about 250 methods.
00:05:40.280
So, the number of instance methods is pretty high. You're not going to remember many, if not most, of those methods. Granted, some of those are dynamic, but you know if it has a field, it's going to have certain related things.
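Counts like these are easy to check yourself. A small sketch in plain Ruby that counts public instance methods on a few core classes; exact numbers vary by Ruby version, and the commented lines assume a Rails console with a hypothetical `User` model.

```ruby
# Count public instance methods on a few core classes.
counts = [Object, String, Array].to_h do |klass|
  [klass.name, klass.instance_methods.size]
end

counts.each { |name, count| puts "#{name}: #{count} instance methods" }

# In a Rails console, the same idea applies to a model (hypothetical `User`):
#   User.instance_methods.size   # instance methods
#   User.methods.size            # class methods
```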
00:06:01.360
The number of class methods is really concerning. Class methods have several issues, which I think I will cover later.
00:06:03.890
All right, another frustrating thing I found about ActiveRecord is that relationships or associations are defined in the model, like your 'has_many' and 'belongs_to', but attributes are defined in the database schema.
00:06:06.999
I think that's a terrible abuse of the DRY (Don't Repeat Yourself) principle. DRY states there should be only one place to look for any piece of information. I feel like attributes and relations are similar kinds of things where you should look in one place for both.
00:06:16.110
Putting related things in different places seems really counter to the DRY guidance, so you have to look in two places for all the details about a model. This is a case where there’s too much magic for me.
00:06:38.120
There are some workarounds like model annotations, and there is an Atom package that toggles a display of the model's attributes from the schema, but unfortunately that's currently broken for me.
00:07:05.219
The attributes API actually came out in 4.2, but it wasn't publicized until Rails 5. We get to use it now, yet hardly anyone does.
00:07:20.412
Does anyone here actually use annotations in their ActiveRecord models? Decent number. All right, that's good.
00:07:28.350
There were also a couple of gems that let you declare attributes in your models before the attributes API was available. One was called Virtus. Has anyone seen this talk?
00:07:45.150
Uncle Bob's "Architecture: The Lost Years"? A couple. Anyone lucky enough to have been there?
00:08:01.349
Well, another person was there. I'm the person that asked him a question: can you show us some code? He said you have to figure it out yourself.
00:08:19.350
So, you can't hear me in the video. It's a seminal talk. Oddly enough, it was only given once at a regional conference. I’m not sure why. Maybe because he didn’t provide those details and I called him out. I don’t know, probably not. He doesn't usually seem affected by that.
00:08:50.970
Since that talk, and probably even earlier, I've struggled to find a way to implement all those architectural suggestions.
00:09:01.480
My last project used the Interactors gem to handle that part. It's actually on the chart there, in that diagram.
00:09:10.110
It separates the Rails controller from the business logic, according to this talk. The fact that our app is a web app is sort of incidental.
00:09:19.260
So, we should have the business logic separate from that incidental delivery mechanism.
00:09:23.700
Interactors give us that, and there’s a pretty good Interactor gem that works well for that part.
00:09:35.380
But I've never found a great way to split entities from the database. In the previous slide, you can see entities on the right side, and the database behind an entity gateway.
00:09:59.470
This is the quest that I’ll be talking about today.
00:10:01.960
So, almost ten years after that talk, Uncle Bob wrote a book on the topic called Clean Architecture. It's a pretty good book, but it really doesn't help me with this problem.
00:10:10.750
It doesn't get into the details. Uncle Bob also has a blog article called 'Clean Architecture' that provides a succinct explanation.
00:10:14.290
The first stop on my quest is Sequel. I will not pronounce SQL as Sequel because then I would get really confused.
00:10:30.490
This is the biggest surprise I found when I did research for an earlier related talk. This was written by Jeremy Evans, and I don’t see him here today.
00:10:59.440
Sequel has tons of plugins and leverages a lot of database features, especially for Postgres. It supports almost any SQL database you can think of.
00:11:16.560
I really like the documentation.
00:11:19.410
Sequel has two different APIs that you can use: the dataset API and the model API.
00:11:38.880
Here’s the code used to set up Sequel for the next couple of slides. Not a lot going on here.
00:11:42.160
Pretty much just creating a table. The Sequel syntax is really nice. Here’s the dataset API; look at line four. Note that the block lets you use bare column names, which is pretty cool.
00:12:06.960
ActiveRecord does not let you do that, although there is a gem that adds that sort of feature. The problem is it doesn't stay synced up with ActiveRecord as well as I'd like, and I've run into a few other bugs and problems.
00:12:22.670
The dataset is enumerable, with each element as a hash-like object. You can see that being used in line five. I haven't come across anything that Sequel can't do, which is pretty cool.
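One way a library can allow bare column names inside a block is to evaluate the block against an object whose `method_missing` turns unknown names into column references. The sketch below is a toy showing that technique with an enumerable dataset of hashes; it is not Sequel's code, and `ToyDataset`, `ColumnContext`, `Column`, and `Condition` are hypothetical names.

```ruby
# Toy sketch: bare column names in a block via instance_exec + method_missing.
Condition = Struct.new(:column, :operator, :value) do
  def match?(row)
    row.fetch(column).public_send(operator, value)
  end
end

Column = Struct.new(:name) do
  def >(other)
    Condition.new(name, :>, other)
  end

  def <(other)
    Condition.new(name, :<, other)
  end
end

# Any unknown, argument-free name used in the block becomes a Column.
class ColumnContext
  def method_missing(name, *args)
    return super unless args.empty?

    Column.new(name)
  end

  def respond_to_missing?(_name, _include_private = false)
    true
  end
end

class ToyDataset
  include Enumerable

  def initialize(rows)
    @rows = rows
  end

  # Evaluate the block with a ColumnContext as self, then filter the rows.
  def where(&block)
    condition = ColumnContext.new.instance_exec(&block)
    ToyDataset.new(@rows.select { |row| condition.match?(row) })
  end

  # Enumerable needs only each; every element is a plain hash.
  def each(&block)
    @rows.each(&block)
  end
end

users = ToyDataset.new([
  { name: "Ann", age: 34 },
  { name: "Bob", age: 17 }
])
adults = users.where { age > 18 }
names = adults.map { |row| row[:name] }
```

Because the block runs with a different `self`, this style trades some clarity (instance variables and methods from the caller are not visible inside the block) for the shorter syntax.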
00:12:44.800
It just doesn't fit the pattern that I'm looking for.
00:12:53.820
So here’s the higher-level API: the Sequel model, that's using objects instead of just a hash-like object. You’d probably be more likely to use this layer in Rails as we like to apply object-oriented programming.
00:13:08.560
Like ActiveRecord, attributes are derived from the database schema, and, also like ActiveRecord, relationships have to be specified manually. Again, that’s something that frustrates me.
00:13:24.680
I really like Sequel. I wish ActiveRecord was more like Sequel, actually, but Sequel doesn’t solve the problem I’m trying to address.
00:13:39.610
So the next stop on my journey is ROM: the Ruby Object Mapper. This originally started out as Data Mapper. Does anyone remember the Data Mapper library? Some people, decent amount.
00:14:07.280
Originally, this was Data Mapper 2, and in 2013 they renamed it to ROM. In 2014 they moved away from object-relational mapping altogether, so it's not really technically an ORM.
00:14:20.060
It just maps the data and not the objects. Most of the work was done by Piotr Solnica. I'm not going to try to pronounce it in his language.
00:14:38.700
It's similar in spirit and partly inspired by Elixir's Ecto. Anyone use Ecto? All right, good number.
00:14:55.360
So you guys might find this a little more palatable than I do.
00:15:07.780
Piotr Solnica also formerly wrote Virtus, a really nice attribute declaration library. ROM is a bit complex to use; it has commands, relations, and mappers, and you have to buy into this completely different paradigm and mindset.
00:15:25.850
ROM's developers are responsible for the dry-rb libraries, which we'll actually talk about a little more. They're really good: small, independent, low-level, composable libraries.
00:15:41.650
Some of the leaders of this movement towards functional programming and immutability in Ruby are part of the dry-rb and the ROM team. I find them to be a bit too focused on the low-level details.
00:15:59.160
I think that's why it takes them a long time to get their product out. But once it's out, it’s really high-quality code.
00:16:14.820
A ROM relation looks pretty straightforward. We have a model class called User, which is empty, and then we have a Users class, which is a ROM relation, and we define the attributes in a schema block.
00:16:37.210
Then we have the associations in a sub block of that. I kind of like that; that's nice.
00:16:52.960
We could tell ROM to pull the schema from the database; we would replace the schema block with a schema call that sets infer to true. But then we wouldn't see all the attributes, and declaring them seems to be the preferred way.
00:17:01.750
To save an object, we start with the relation; we create a relation object and pass it to a change set. This feels really familiar.
00:17:14.440
We tell that changeset whether it's a create or an update (I don't know what happens if you get it wrong), and we pass all the attributes as a hash.
00:17:29.589
So like I said, we’re not really dealing with objects. You’d have to convert your object to a hash if you're dealing with a Ruby object; and then we have to explicitly commit the changes.
00:17:47.940
I think that might be a nice feature, not sure. I found ROM to be really complex. Here’s an overview diagram of their architecture. Honestly, I can't follow everything that’s going on there.
00:18:02.999
I want to like ROM, but I find it too complex and confusing, and I couldn't actually get things set up right to run the code that I showed you in those previous examples.
00:18:15.130
The last stop of my quest is the model layer in Hanami. Hanami is a full web framework and an alternative to Rails. I’ve liked everything I’ve seen; if I had a choice, I might choose Hanami instead of Rails for some side projects.
00:18:37.710
Hanami supports SQL through Sequel, plus memory and file adapters. It follows a domain-driven design architecture. I will talk about that a little more.
00:18:59.370
So, it has entities which are models without persistence or validations. It has a repository, which is mostly like the class methods in ActiveRecord model classes.
00:19:15.790
So, things like create, update, persist, delete, all, find,
00:19:34.690
first, and last. It also has a mapper, which is a declaration of how to map between the database and the entities.
00:19:43.250
Here’s the start of a Hanami model. We inherit from Hanami entity, which surprisingly adds only four methods; at least the last time I looked.
00:19:57.440
It adds 'id', 'id=', 'initialize', and then a class method called 'attributes', which we’re using there.
00:20:12.180
The initializer takes a hash of attributes and sets all of them from its keys. The entity's attribute types come from the dry-types library.
00:20:40.720
So, those types like Types::Int and Types::String are dry types. We could again let the model pull the schema from the database like ActiveRecord, but I don't think that's that common.
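To make the shape of such an entity concrete, here is a plain-Ruby sketch with the same four pieces: 'id', 'id=', an initializer that takes a hash, and a class-level 'attributes' declaration. It is not Hanami's API; `ToyEntity` is a hypothetical name, and Ruby's built-in `String()` and `Integer()` conversion functions stand in for dry-types.

```ruby
# Toy entity base class; an illustration only, not Hanami::Entity.
class ToyEntity
  attr_accessor :id

  # Class-level declaration: attribute name => callable that coerces a value.
  # Called with no argument, it returns the current declarations.
  def self.attributes(definitions = nil)
    if definitions
      @attributes = definitions
      attr_reader(*definitions.keys)
    end
    @attributes || {}
  end

  # Takes a hash and sets every declared attribute, coercing non-nil values.
  def initialize(values = {})
    @id = values[:id]
    self.class.attributes.each do |name, coerce|
      raw = values[name]
      instance_variable_set("@#{name}", raw.nil? ? nil : coerce.call(raw))
    end
  end
end

class Article < ToyEntity
  # Kernel#String and Kernel#Integer stand in for real type objects here.
  attributes title: method(:String), word_count: method(:Integer)
end

article = Article.new(id: 1, title: "Repositories", word_count: "1200")
```

The entity has no persistence methods at all; saving and finding would belong to a separate repository object.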
00:20:59.530
Persistence is done by the repository class. Note that things like where and order are private. We can only use them within that query on line 3. Queries are analogous to ActiveRecord scopes.
00:21:15.870
Here, we instantiate an article and the article repository, we create the article, and then we can find it by the author.
00:21:30.930
I think Hanami is my favorite Ruby ORM. If I had a choice, I’d probably use it over Rails, but it's not a very realistic option.
00:21:43.990
It requires everyone on your team to learn something new. If it's just you, that's not a big deal; but if you have a team— I think we have eight developers on my team—it's probably not going to work.
00:22:05.979
Also, you probably wouldn’t want to use it on a project that already has hundreds of models and has been around for eight or ten years.
00:22:29.350
There’s not much documentation on using it with Rails, nor with the other ORM frameworks.
00:22:49.320
Also, Rails add-ons assume you're using ActiveRecord most of the time, and they may or may not work with another ORM.
00:23:06.429
So, Hanami model implements the repository pattern, which represents a collection of domain objects.
00:23:17.750
In a lot of ways, we can treat the database as an in-memory collection and abstract that even more than we do with ActiveRecord.
00:23:36.000
We do have something similar in ActiveRecord with class methods like 'create', 'where', 'find', and 'all'. When you create a scope, that’s also sort of the repository pattern, but again, it's stuck in class methods.
00:24:05.800
Those have serious limitations—they look more like procedural code than object-oriented code.
00:24:24.190
Relying on them indicates that you’ve missed an abstraction, limits your polymorphism, and makes the code hard to test and refactor.
00:24:52.320
There’s a good article on Code Climate that talks about all the problems with class methods, and if you check the presenter notes, there’s a link to it.
00:25:14.700
Here’s the UML class diagram for the repository pattern. Note the arrows: the domain model is not dependent on anything.
00:25:31.920
There’s a clear separation of concerns. The domain model class handles the business logic, and the repository class handles persistence.
00:25:48.149
We could end up with more than one repository for a given model. Maybe you want to do sharding, soft delete things in a separate database, or read/write segregation.
00:26:09.290
Perhaps you want your tests to use a different database backing or an in-memory backing.
00:26:24.360
You might see this repository pattern with a third class, which is the mapper class that handles the coercion between database fields and object attributes.
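A minimal sketch of the three roles in plain Ruby, using an in-memory hash as the backing store. The names (`User`, `UserMapper`, `InMemoryUserRepository`) and the column names in the mapper are assumptions for illustration, not code from any particular library.

```ruby
# Domain model: business logic only, with no knowledge of storage.
class User
  attr_accessor :id, :name, :email

  def initialize(name:, email:, id: nil)
    @id = id
    @name = name
    @email = email
  end

  def display_name
    "#{name} <#{email}>"
  end
end

# Mapper: converts between storage rows and domain objects.
module UserMapper
  def self.to_row(user)
    { "user_name" => user.name, "email_address" => user.email }
  end

  def self.to_entity(id, row)
    User.new(id: id, name: row["user_name"], email: row["email_address"])
  end
end

# Repository: the only class that touches storage (a Hash in this sketch).
class InMemoryUserRepository
  def initialize(mapper: UserMapper)
    @mapper = mapper
    @rows = {}
    @next_id = 1
  end

  def save(user)
    user.id ||= next_id!
    @rows[user.id] = @mapper.to_row(user)
    user
  end

  def find(id)
    row = @rows[id]
    row && @mapper.to_entity(id, row)
  end

  def all
    @rows.map { |id, row| @mapper.to_entity(id, row) }
  end

  private

  def next_id!
    id = @next_id
    @next_id += 1
    id
  end
end

repository = InMemoryUserRepository.new
saved = repository.save(User.new(name: "Ann", email: "ann@example.com"))
loaded = repository.find(saved.id)
```

A second repository class with the same `save`, `find`, and `all` methods could talk to a real database, and the `User` class would not need to change.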
00:26:41.430
So, I’ve spent several years looking for a way to have my cake and eat it too. I want to keep using ActiveRecord, but I want to separate my domain model from the database persistence.
00:27:12.250
One Saturday morning I was lying in bed a little late and thinking about it again. Don’t ask! I don’t know why I need to think about those things in bed, but I came up with a solution that I thought could work.
00:27:34.950
In Rails 3, they split ActiveRecord into several modules, and I thought I could use those various modules that ActiveRecord uses and split them into the two sides.
00:27:56.930
The funny thing is, I think I misremembered that; I think it was ActionController that got modularized for breaking it into pieces.
00:28:12.920
ActiveModel did get pulled out of ActiveRecord at that time, but I don’t think they were really meant to be used separately.
00:28:37.210
So, it wasn't quite as easy to make this work as I expected. All the modules have a lot of interdependencies, and there’s no real documentation on how to use each module and what their dependencies look like.
00:29:00.289
But it turned out that the domain model is just most of ActiveModel. So, I ended up calling that ActiveModel Entity, although I originally called it ActiveRecord Entity.
00:29:16.540
The repository side is still mostly ActiveRecord, so I'm going to show the difference between using standard ActiveRecord and using ActiveRecord Repository, which is the gem I'm working on.
00:29:42.200
So, here’s a typical ActiveRecord model. This should be pretty familiar to you. We have associations like 'belongs_to', 'has_many', validations, and scopes.
00:30:01.120
And then there are some fields that we don’t know about just by looking at the code, unfortunately.
00:30:23.800
Here’s the same thing using ActiveRecord Repository: instead of subclassing, I’m including a module.
00:30:50.070
This is an interesting little pattern that I found. The module is actually dynamically generated through the call to ActiveRecord or ActiveModel's entity method.
00:31:06.980
So, we can pass parameters, and I'll talk about that some more when I discuss the implementation.
00:31:28.020
Let’s see, the module we’re mixing in is ActiveModel Entity. The term 'entity' comes from Eric Evans' domain-driven design. An entity is an object with an identity.
00:31:46.830
So, we could have two items with the same attributes but different IDs, and those would be considered different entities. If we have two items of the same type that have the same ID, they’d be considered the same thing.
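That notion of identity can be sketched in a few lines of plain Ruby. The `Item` class is hypothetical; the point is only that equality is decided by type and ID, not by the other attributes.

```ruby
# Entity equality: same class and same non-nil ID means same entity.
class Item
  attr_reader :id, :name

  def initialize(id:, name:)
    @id = id
    @name = name
  end

  def ==(other)
    return true if equal?(other)

    other.class == self.class && !id.nil? && other.id == id
  end
  alias eql? ==

  # Keep hash consistent with eql? so entities behave in Hash keys and Sets.
  def hash
    [self.class, id].hash
  end
end

same_id = Item.new(id: 1, name: "Widget") == Item.new(id: 1, name: "Renamed widget")
same_attributes = Item.new(id: 1, name: "Widget") == Item.new(id: 2, name: "Widget")
```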
00:32:19.880
There’s actually something called an identity map in ActiveRecord. The other major difference is that we declare attributes here: their names and types, which fixes my second biggest gripe.
00:32:38.930
We still have the 'belongs_to' and 'has_many', but we don't have the scopes. Any instance method we add would be defined here as well.
00:32:59.340
Here’s the repository for that same class. Again, we're including a module instead of subclassing. We can pass parameters; we can pass the model class we’re working with.
00:33:19.580
By default, I’m taking the term User Repository, knocking the 'Repository' off, and assuming it's 'User'. We can specify the database table name if it can’t be derived.
00:33:35.950
We can also specify a primary key, and we could specify a mapping of database column names to entity attribute names.
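Those defaults and overrides might look roughly like this in plain Ruby. Everything here is an assumption for illustration: the `RepositoryConfig` class, its keyword names, and the naive "add an s" pluralization are not taken from the gem.

```ruby
# Hypothetical holder for repository options, with derived defaults.
class RepositoryConfig
  attr_reader :entity_name, :table_name, :primary_key, :column_map

  def initialize(repository_name, entity_name: nil, table_name: nil,
                 primary_key: "id", column_map: {})
    # Default entity: knock "Repository" off the end of the class name.
    @entity_name = entity_name || repository_name.sub(/Repository\z/, "")
    # Default table: underscored entity name plus "s" (naive pluralization).
    @table_name = table_name || "#{underscore(@entity_name)}s"
    @primary_key = primary_key
    # Database column name => entity attribute name.
    @column_map = column_map
  end

  private

  def underscore(name)
    name.gsub(/([a-z\d])([A-Z])/, '\1_\2').downcase
  end
end

default_config = RepositoryConfig.new("UserRepository")
legacy_config = RepositoryConfig.new(
  "PersonRepository",
  table_name: "people",
  primary_key: "person_id",
  column_map: { "full_nm" => :name }
)
```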
00:33:54.290
The scope is on this repository side because it deals with the entire collection, not any individual object.
00:34:13.200
Here’s the typical controller with Rails and ActiveRecord. We tell the User model to save itself on line 4, and that will return false if it failed to save.
00:34:29.960
Here’s the same thing with my ActiveRecord Repository gem: only two lines have changed. Line four explicitly tests to see if the model is valid.
00:34:52.280
Then we deal with that; on line five, we tell the repository to save the model object instead of asking the model object to save itself.
00:35:01.370
There’s one caveat: if you have a uniqueness validation that can't be determined until you hit the database, you're actually going to have to catch an exception on the save.
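The flow described here (validate the model first, then ask the repository to save it, and rescue a database-level uniqueness failure) can be sketched without Rails. The `create_user` method, the `DuplicateRecord` error, and the toy `UserRepository` below are hypothetical stand-ins, not the gem's API.

```ruby
# Hypothetical error raised when the store rejects a duplicate.
class DuplicateRecord < StandardError; end

class User
  attr_accessor :id, :email

  def initialize(email:)
    @email = email
  end

  # Validation that needs no database access.
  def valid?
    !email.to_s.strip.empty?
  end
end

class UserRepository
  def initialize
    @rows = {} # id => email
  end

  # Uniqueness can only be checked against the store, so it surfaces here.
  def save(user)
    taken = @rows.any? { |id, email| email == user.email && id != user.id }
    raise DuplicateRecord, "email already taken" if taken

    user.id ||= @rows.size + 1
    @rows[user.id] = user.email
    user
  end
end

# Stand-in for a controller action: returns a status and the user.
def create_user(repository, params)
  user = User.new(email: params[:email])
  return [:unprocessable, user] unless user.valid?

  repository.save(user)
  [:created, user]
rescue DuplicateRecord
  [:conflict, user]
end

repository = UserRepository.new
first_status, = create_user(repository, { email: "ann@example.com" })
second_status, = create_user(repository, { email: "ann@example.com" })
blank_status, = create_user(repository, { email: "" })
```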
00:35:28.520
Here’s a bit of the implementation of the entity module. I talked about that pattern, the parameterized module pattern.
00:35:51.150
Unfortunately, I think this is the simplest implementation possible, but basically we create a list of modules that may vary depending on what we passed.
00:36:12.020
Then we create a module composed of those modules. The self-composed model module is not important to understand, but it’s important to note that we’re taking several modules and composing them together.
00:36:30.720
This, as I said, allows us to pass parameters.
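A stripped-down sketch of a parameterized module in plain Ruby: a method builds a list of modules based on its arguments, then returns a fresh module that includes them, and the class mixes in the result. The `Entity.build` name and its options are assumptions for illustration and are not the gem's actual interface.

```ruby
module Entity
  module Identity
    attr_accessor :id
  end

  module Timestamps
    attr_accessor :created_at, :updated_at
  end

  # Returns a new anonymous module composed from the requested pieces.
  def self.build(timestamps: false, attributes: [])
    pieces = [Identity]
    pieces << Timestamps if timestamps

    Module.new do
      pieces.each { |piece| include piece }
      attr_accessor(*attributes)
    end
  end
end

class User
  # The argument to include is a method call, so it can take parameters.
  include Entity.build(timestamps: true, attributes: %i[name email])
end

user = User.new
user.id = 1
user.name = "Ann"
```

Since `include` accepts any expression that evaluates to a module, the class gets to configure what it mixes in, which a plain `include SomeModule` cannot do.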
00:36:51.160
I previously called that ActiveRecord Entity, but we’re not using anything from ActiveRecord, so I changed that.
00:37:16.030
You can see that we are just including and extending ActiveModel modules; that’s hard to say.
00:37:38.580
So here’s the repository side again. I’m using the same parameterized module pattern, but I haven’t implemented anything on this side yet.
00:37:55.580
This side is all ActiveRecord, plus some custom code. We’re mostly ensuring that ActiveRecord still works despite the parts I’ve taken away.
00:38:13.839
It still utilizes most of ActiveRecord, but not quite all.
00:38:32.370
We’ve got some helper methods that are calling ActiveRecord. This method lets you call 'save' on the user repository and pass it a user object.
00:38:44.310
This one is a bit tricky: we have to create an ActiveRecord model object temporarily to save and then update the entity's ID when we save to indicate that the entity has been persisted.
00:39:17.000
This is an implementation of ActiveModel's 'persisted?' method, which I think is required to be an ActiveModel citizen for Rails.
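The idea of saving through a temporary record and copying the generated ID back onto the entity can be sketched like this. It covers inserts only, `FakeTable` is a hypothetical stand-in for the temporary ActiveRecord object and the database behind it, and 'persisted?' is reduced to an ID check.

```ruby
class User
  attr_accessor :id, :name

  def initialize(name:)
    @name = name
  end

  # An entity counts as persisted once it has been given an ID.
  def persisted?
    !id.nil?
  end
end

# Hypothetical stand-in for the database side.
class FakeTable
  def initialize
    @rows = []
  end

  # Returns the generated primary key, as a database insert would.
  def insert(attributes)
    @rows << attributes
    @rows.size
  end
end

class UserRepository
  def initialize(table)
    @table = table
  end

  # Copy the entity's data into a short-lived record, persist it,
  # then copy the generated ID back onto the entity.
  def save(user)
    record = { name: user.name }
    user.id = @table.insert(record)
    user
  end
end

user = User.new(name: "Ann")
before = user.persisted?
UserRepository.new(FakeTable.new).save(user)
after = user.persisted?
```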
00:39:45.220
There are quite a few challenges, more than I expected, probably due to the fact that I misremembered which things got modularized.
00:40:01.120
It didn't occur to me for a while how to separate the modules. It turned out that the entity side is all ActiveModel.
00:40:31.040
The repository side is all ActiveRecord. We're not subclassing ActiveRecord, and that turned out to be really tricky.
00:40:53.340
I spent hours trying to fix this. ActiveRecord uses that to figure some things out and includes info about the connection to the database.
00:41:10.540
I also had to tell ActiveRecord that the repository class is not an abstract class.
00:41:26.730
Currently, I’m fighting with ActiveRecord relations and getting an error that doesn’t seem to be related to the code I added.
00:41:44.520
This makes it really hard to troubleshoot.
00:41:56.180
So, I still have a lot of work to do to make this usable. Please do not use this in production.
00:42:10.750
I’m not going to use it in production. I’m not sure I’ll even get to that point, but it was fun and interesting to learn.
00:42:27.859
Maybe I can make it work. The main part is testing how the relations work, like cascading deletions, loading, or auto-loading—all the relations—and mapping them to objects.
00:42:50.289
We could automatically create migrations because we have all the data we need in the model class, all declared there.
00:43:10.230
I think the only thing I’m missing right now is indexing. Data Mapper actually had that option.
00:43:28.890
If you’re into migrations, go see Matt Duszynski's talk on migrations right after this.
00:43:47.509
A teammate and colleague of mine covers a lot of gotchas with migrations. I plan to look at those gotchas if I do get to automating migrations.
00:44:01.420
That’s in the next time slot over in room F.
00:44:32.640
I need some help from all of you. If you're interested, please go star the repo on GitHub so that I know if people are interested.
00:44:52.059
The more people are interested, the more likely I am to complete the project.
00:45:06.170
I’m easy to find on the internet or in person. I’ve got the Weedmaps t-shirt on today; I made the repo and the talk easy to find.
00:45:28.010
I have links for everything on the last slide: links that kind of link back to each other.
00:45:52.400
I’d like to thank you all for coming and watching, especially my co-workers who watched the previous talk and provided some valuable feedback.
00:46:03.270
If you liked listening to me, I do a podcast on Agile called 'This Agile Life'. We do it semi-sporadically.
00:46:15.120
I’m not always on, but we have resurrected it and are recording podcasts again.
00:46:33.740
A big thank you to my employer, Weedmaps, for sponsoring this talk.
00:46:36.880
There are about 20 of us here; most of us have t-shirts on, and we are hiring big-time.
00:46:56.420
Come see us at our booth—we'll have t-shirts; I’m not sure exactly which ones yet.
00:47:15.020
The source of the presentation is on GitHub in my presentations repo. Easy to find.
00:47:35.360
There’s the link to the ActiveRecord Repository gem; you can also find that on my GitHub page, near the top.
00:47:42.869
Feel free to stop by in the hall if you have any questions.