00:00:21.260
Hi everyone.
00:00:22.920
How about this amazing RailsConf 2019?
00:00:26.789
I hope you all have had an amazing week like I have, and I really appreciate you coming to my talk because I know you're probably tired.
00:00:34.110
My name is Colleen, and I run a Ruby on Rails consulting business.
00:00:36.629
Today, I'm here to share my adventures, or misadventures, as they were, migrating a production application from Paperclip to Active Storage using Amazon S3 storage.
00:00:40.980
Actually, I used Shrine when I did this for my client, but for the purposes of this talk, I'm going to use Paperclip.
00:00:43.170
The five people that responded to my Twitter poll said they use Paperclip more than Shrine.
00:00:47.940
But first, I'd like to start with a little story.
00:00:51.449
So how did I get here?
00:00:54.149
I was contacted by a cool new startup looking for a Rails developer to do just that—migrate their solution from Shrine to Active Storage.
00:00:56.489
I was excited to work with this company, and this was the first time I was going to get to use Active Storage.
00:01:02.250
I had actually attended the Active Storage talk, I believe it was at RailsConf last year, so I felt quite confident in my ability to migrate this application.
00:01:06.509
For those of you who are not yet on Rails 5.2, let's start with what Active Storage is.
00:01:11.940
Active Storage is an easy way to attach files to Active Record objects and store those files in cloud-based storage.
00:01:17.729
Have you ever needed to add an avatar to a user or maybe a resume to an applicant? Active Storage helps you take care of all of those file attachment needs.
00:01:22.920
Well, that's great, Colleen, but Paperclip is working fine for me. Why should I go through the trouble of switching?
00:01:26.789
That's a good question. Why should you migrate to Active Storage?
00:01:30.209
The first and possibly most important reason is that Active Storage is now the built-in solution for handling file uploads to cloud storage in Rails.
00:01:34.560
It supports Amazon S3, Google Cloud Storage, and Microsoft Azure Storage, and here's something fun: there are no additional migrations needed!
00:01:40.100
If you remember, with Paperclip, every time you added a new attachment to a model, you had to write a new migration.
00:01:45.660
Active Storage is different—it doesn't work that way.
00:01:47.520
And if I still haven't convinced you, Paperclip is deprecated, so you're out of luck!
00:01:51.449
So, I accepted the contract, and the first thing I did was I went and looked at the Active Storage docs.
00:01:56.209
In my experience, the documentation for Rails is usually excellent, and Active Storage appeared to be no different.
00:02:01.530
Step one: install Active Storage.
00:02:04.530
Step two: configure cloud storage.
00:02:08.610
Step three: add an attachment to a model.
00:02:11.310
Step four: let the magic of Rails abstract away all of the heavy lifting for you.
00:02:14.129
And it just works!
00:02:17.069
Well, has anyone tried to migrate an application to Active Storage by simply following these steps?
00:02:22.920
If you have tried, you might know that implementing Active Storage in a new application is relatively easy.
00:02:27.600
But migrating to Active Storage can be quite challenging.
00:02:30.720
Why is that?
00:02:34.560
Well, Active Storage is fundamentally different from Paperclip.
00:02:37.200
Paperclip works by storing file metadata in columns on the model's own table, in this case the users table.
00:02:41.520
For example, here we have an avatar for a user.
00:02:46.050
So if we add an avatar to our user using Paperclip, it's going to change the users table.
00:02:50.040
It adds four columns to your users table.
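As a rough illustration of those four columns, here is a minimal sketch in plain Ruby. The helper name `paperclip_columns` is made up for this example; the suffixes are the ones Paperclip conventionally uses for an attachment's metadata.

```ruby
# Paperclip keeps attachment metadata in columns on the model's own table.
# For an attachment called "avatar", the column names are derived from that name.
PAPERCLIP_SUFFIXES = %w[file_name content_type file_size updated_at].freeze

# Hypothetical helper: list the columns Paperclip expects for one attachment.
def paperclip_columns(attachment_name)
  PAPERCLIP_SUFFIXES.map { |suffix| "#{attachment_name}_#{suffix}" }
end

puts paperclip_columns("avatar")
```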
00:02:53.670
Active Storage, however, creates two new tables: the active storage attachments table and the active storage blobs table.
00:02:56.910
So if we revisit our steps, I would say that step three, adding an attachment to a model, needs to be changed.
00:03:02.400
Active Storage is not going to be able to access the data, since there is currently nothing in your Active Storage tables.
00:03:06.209
However, we can still perform step one and step two.
00:03:10.590
Step one is to install Active Storage and create the tables, and step two is to configure your cloud storage.
00:03:14.430
The way this is set up right here is we have an Amazon S3 bucket that acts as our production storage and another Amazon bucket as our dev storage.
00:03:17.630
I created this contrived example for this talk so you can see I came up with a clever bucket name.
00:03:21.600
When we set this up on our production application, this is how it was done.
00:03:26.100
It's going to depend on your setup, but I would highly recommend testing this on a dev bucket in your cloud storage provider.
00:03:29.700
After you configure it in storage.yml, you then have to configure it on a per-environment basis.
00:03:33.920
What I'm showing you here is development, which is configured to use Amazon dev, and production that uses Amazon S3.
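The slides are not part of this transcript, so the following is only a sketch of what such a setup might look like. The service names `amazon` and `amazon_dev`, the region, and the bucket names are placeholders, and credentials are left out. The YAML is embedded in a Ruby string so the example can be run on its own.

```ruby
require "yaml"

# Hypothetical contents of config/storage.yml: one S3 service per bucket.
STORAGE_YML = <<~YAML
  amazon:
    service: S3
    region: us-east-1
    bucket: my-app-production
  amazon_dev:
    service: S3
    region: us-east-1
    bucket: my-app-dev
YAML

services = YAML.safe_load(STORAGE_YML)

# Each environment file then selects one service by name, for example:
#   config.active_storage.service = :amazon_dev   # config/environments/development.rb
#   config.active_storage.service = :amazon       # config/environments/production.rb
SERVICE_FOR_ENV = { "development" => "amazon_dev", "production" => "amazon" }.freeze

SERVICE_FOR_ENV.each do |env, name|
  puts "#{env}: #{services.fetch(name).fetch('bucket')}"
end
```

Keeping a separate dev bucket means a mistake in the rake task only touches test data.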
00:03:38.130
Great, that took just one minute!
00:03:42.030
At this point, you already have Active Storage installed and your Active Storage tables exist in your database.
00:03:47.840
Now, let's talk about step three.
00:03:50.760
I've changed step three to say: move avatar data from the user table to the Active Storage tables.
00:03:56.280
How do we move data from one table to another in our database? A rake task.
00:04:00.300
So we are going to write a rake task together.
00:04:03.600
Let's talk about this rake task: we're going to be moving a substantial amount of data.
00:04:08.400
It's not one-to-one because we have one user table and two Active Storage tables.
00:04:10.800
We'll also be mapping some data.
00:04:13.080
Understanding what we are trying to do is essential.
00:04:16.470
So we are moving this data from the users table to the Active Storage attachments and blobs, and we're technically copying it.
00:04:24.600
Reaching into my database to change records on a production application can be a bit scary.
00:04:28.800
I was told I wouldn't have to write any SQL, but unfortunately, that doesn't seem to be the case here.
00:04:32.640
Before we jump into the rake task, let's discuss the Active Storage tables.
00:04:36.810
The first table I want to talk about is the Active Storage attachments table.
00:04:42.420
We'll start with a name, which is the name of your attachment—in this case, 'avatar'.
00:04:45.960
Then you have your polymorphic association columns, record type and record ID (here, User and the user's ID), followed by your blob ID.
00:04:52.500
Now, the second table is the blobs table.
00:05:00.120
If we look at the blobs table, the key is the location of your current file in Amazon S3 storage.
00:05:02.700
Then you have your file name, your content type, the byte size, and your checksum.
00:05:05.340
Now, how do these tables relate to one another?
00:05:08.520
On your left is the users table, and on your right is the Active Storage attachments table.
00:05:12.300
The User model becomes our record type, the user's ID becomes our record ID, and the name becomes 'avatar'.
00:05:15.030
You can see that the avatar file name from our users table will go to our blob as the file name.
00:05:19.740
The avatar content type will go to the content type, and file size will map to byte size.
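The mapping just described can be sketched in plain Ruby with hashes standing in for database rows. The method name `active_storage_rows` and the sample values are invented for illustration; `key` and `checksum` are passed in because how they are derived depends on your Paperclip and S3 setup.

```ruby
# Build the Active Storage rows for one Paperclip-backed user row.
# Hashes stand in for database rows; nothing is written to a database here.
def active_storage_rows(user, blob_id:, key:, checksum:)
  blob = {
    id:           blob_id,
    key:          key,                        # where the file lives in S3
    filename:     user[:avatar_file_name],    # avatar_file_name    -> filename
    content_type: user[:avatar_content_type], # avatar_content_type -> content_type
    byte_size:    user[:avatar_file_size],    # avatar_file_size    -> byte_size
    checksum:     checksum
  }
  attachment = {
    name:        "avatar", # the attachment name
    record_type: "User",   # the model class
    record_id:   user[:id],
    blob_id:     blob_id   # links the attachment to its blob
  }
  [blob, attachment]
end

# Sample data, made up for this example.
user = { id: 42, avatar_file_name: "me.png", avatar_content_type: "image/png", avatar_file_size: 2048 }
blob, attachment = active_storage_rows(user, blob_id: 1, key: "users/42/me.png", checksum: "placeholder")
p blob
p attachment
```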
00:05:27.930
Now we can start working on that rake task.
00:05:35.790
The good people at ThoughtBot put together the skeleton of a task that serves as an excellent starting point.
00:05:41.520
They actually use a migration, but I advocate using a rake task.
00:05:44.130
If we look at this, we get our blob ID, and these two statements define our insert statements.
00:05:47.040
This is all cut and paste for you. After this, we're looping through all of the models and pulling out the attachment names.
00:05:52.410
The important thing to realize here is that this code used to pull out the attachment name is specific to Paperclip.
00:05:56.310
That's how Paperclip names the files on your user table.
00:06:00.870
So this string, avatar_file_name, is what we're looking at.
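A minimal sketch of that naming trick, assuming the column names are available as an array of strings (in a Rails app they could come from something like `User.column_names`). The helper name is hypothetical and is not taken from the ThoughtBot code.

```ruby
# Find Paperclip attachment names by looking for columns ending in "_file_name".
def paperclip_attachment_names(column_names)
  column_names.map { |column| column[/\A(.+)_file_name\z/, 1] }.compact
end

# Sample column list, made up for this example.
columns = %w[id email avatar_file_name avatar_content_type avatar_file_size avatar_updated_at]
p paperclip_attachment_names(columns)
```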
00:06:03.600
If you have one or two models with attachments, you do not have to do all of this.
00:06:07.290
You can directly call out the model and the attachment name instead of looping through every model.
00:06:10.590
Now, this instance represents your user, and in our example, we have the user avatar.
00:06:14.430
The statement user.avatar.path is important because it relies on the Paperclip relationship.
00:06:17.730
It is essential to note that this process requires two deploys.
00:06:22.050
Why does this process require two deploys?
00:06:25.920
The rake task we are building needs that user avatar relationship defined by Paperclip.
00:06:29.550
It also needs the Active Storage tables because it requires a destination for the data.
00:06:33.550
Now, Active Storage needs data in those tables, so you cannot switch your code over to Active Storage without first running the rake task.
00:06:39.300
And the rake task is dependent on Paperclip.
00:06:42.120
Let’s return to our rake task.
00:06:45.780
What we have here is the blob insert statement.
00:06:48.960
The key and checksum methods are important and need to be written by you.
00:06:52.500
I did not include my specific solution because yours will be specific to your Paperclip and Amazon S3 configuration.
00:06:55.680
The key is how Active Storage will look for your files.
00:07:00.840
Before I move on, I have to mention a potential pitfall.
00:07:05.640
With Paperclip, I assumed the key would simply be user.avatar.path, but that can lead to issues.
00:07:10.380
Make sure your path does not return a leading forward slash unless that's the intention.
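One way to guard against that pitfall is sketched below. It assumes the S3 object key is the Paperclip path minus one leading slash, which may not hold for your configuration. The helper name and sample path are hypothetical.

```ruby
# Turn a Paperclip-style path into a candidate Active Storage key
# by removing a single leading slash, if there is one.
def active_storage_key(paperclip_path)
  paperclip_path.sub(%r{\A/}, "")
end

puts active_storage_key("/users/avatars/42/original/me.png")
```

Comparing a few generated keys against the real object keys in the bucket before running the full task is a cheap way to catch a mismatch.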
00:07:15.600
Now, concerning checksums: when I did this on production, we had about 80,000 images.
00:07:21.990
I ran each image through the MD5 process; some gems might provide the checksum automatically.
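Active Storage stores the checksum as a base64-encoded MD5 digest, so a checksum helper might look like the sketch below. It reads from any IO object in chunks; downloading the file from S3 is not shown, and the method name and the temporary file are only for illustration.

```ruby
require "digest/md5"
require "tempfile"

# Compute a base64-encoded MD5 digest of an IO object, reading in chunks
# so large files do not have to fit in memory.
def md5_base64_checksum(io, chunk_size: 5 * 1024 * 1024)
  digest = Digest::MD5.new
  while (chunk = io.read(chunk_size))
    digest << chunk
  end
  digest.base64digest
end

# Demonstration with a temporary file standing in for a downloaded image.
Tempfile.create("avatar") do |file|
  file.binmode
  file.write("pretend image bytes")
  file.rewind
  puts md5_base64_checksum(file)
end
```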
00:07:26.070
The last step is writing the records to your attachments table, which includes your model name and instance ID.
00:07:30.600
That's the entire rake task.
00:07:36.150
Next, after you run your rake task, determine if it worked.
00:07:40.200
The quickest way to do this is to see if the correct number of blob and attachment records were created.
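A count check along those lines could be sketched as follows. It assumes the avatar is the only attachment being migrated, so both tables should end up with one row per user that has an avatar. The method name and the sample numbers are made up.

```ruby
# Compare the number of migrated rows against the number of source rows.
# Returns a list of human-readable problems; an empty list means the counts match.
def migration_count_problems(expected:, blobs:, attachments:)
  problems = []
  problems << "expected #{expected} blobs, found #{blobs}" unless blobs == expected
  problems << "expected #{expected} attachments, found #{attachments}" unless attachments == expected
  problems
end

# In a Rails console the three numbers might come from:
#   expected:    User.where.not(avatar_file_name: nil).count
#   blobs:       ActiveStorage::Blob.count
#   attachments: ActiveStorage::Attachment.where(name: "avatar").count
p migration_count_problems(expected: 80_000, blobs: 80_000, attachments: 79_998)
```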
00:07:43.620
If you're feeling adventurous, you can check individual records in your database.
00:07:47.760
I feel like I sped through a lot of code there, so let's do a brief overview.
00:07:50.880
We have created the Active Storage tables by installing Active Storage and running the migrations.
00:07:53.160
We set up the cloud storage through storage.yml at the configuration level.
00:07:57.540
We wrote the rake task to create user avatar records in the attachments and blobs tables.
00:08:03.420
We sourced the data from the user table, and hopefully, we've confirmed that records were created.
00:08:06.240
But we don't know if they're correct unless we check our database.
00:08:11.880
Before moving on, I recommend checking out a new branch.
00:08:14.880
You don't have to do this; you can push a branch, run your rake task, and then push another for Active Storage.
00:08:18.030
But it's easier to work with a new branch to ensure everything operates smoothly.
00:08:22.260
This was my preferred method; I made mistakes with the key initially.
00:08:26.520
So, I had one branch with Paperclip and another for Active Storage.
00:08:29.940
If it doesn't work, you can clear the Active Storage records, fix the rake task, and try again.
00:08:34.560
After successfully installing Active Storage, we need to alter our code, models, and views to use this functionality.
00:08:40.200
This is why it appears easy in the documentation; you just add 'has_one_attached'.
00:08:43.560
But it only works if there is data present in the Active Storage tables.
00:08:49.049
If you're using multiple sizes of images, you'll use something called variants.
00:08:53.520
What’s cool about variants is that you can pick your image size on the fly without being constrained by predefined sizes.
00:08:56.150
If you're working with images, Active Storage does lazy transformations on the original blobs and caches the variants.
00:09:01.470
I was migrating this application for a client, I had the rake task written, and I had confirmed it was working.
00:09:08.760
But around thirty percent of our images were blurry, which was quite worrisome.
00:09:13.200
Why were thirty percent of our images blurry?
00:09:15.900
They were blurry because Active Storage uses MiniMagick for image transformation.
00:09:20.400
Unfortunately, MiniMagick does not support the advanced image processing we had previously utilized with Shrine.
00:09:27.240
This became a significant pain point for us.
00:09:30.420
Fortunately, there's a happy ending.
00:09:34.710
This experience was about eight months to a year ago.
00:09:38.160
I feel we might have been a bit early adopters of Active Storage, mainly due to that image processing issue.
00:09:43.560
However, Rails 6 should be addressing this specific issue.
00:09:48.600
Active Storage in Rails 6 has deprecated MiniMagick and is now utilizing the Image Processing gem.
00:09:53.250
Fortunately, now the resize functions that didn’t work with MiniMagick should function properly.
00:09:59.160
We have already discussed the steps we took: create the Active Storage tables, deploy with Paperclip still in place, and run the rake task.
00:10:05.040
The next step is to deploy with the Active Storage models and views.
00:10:08.180
If that works, then you have made good progress on your migration to Active Storage.
00:10:11.520
Let’s revisit all of our steps.
00:10:16.000
We installed Active Storage, configured cloud storage, and moved avatar data from the user table into the Active Storage tables.
00:10:19.950
Now Active Storage can perform its magic.