Andrea Fomera
Pushing the Boundaries with ActiveStorage

Summarized using AI

Andrea Fomera • September 26, 2024 • Toronto, Canada

In her talk at Rails World 2024, Andrea Fomera explores innovative uses of Active Storage for file management, particularly focusing on integrating external services like Wistia and Pexels.

Key Points Discussed:
- Introduction to Active Storage: Active Storage simplifies file uploads in Rails applications by managing interactions with cloud storage services like Amazon S3, Google Cloud, and Microsoft Azure. It consists of three main components: Blobs (file representations), Attachments (links to Active Record objects), and Variants (processed image modifications).
- Efficiency in File Management: The talk highlights the inefficiencies that arise when multiple identical files are uploaded repeatedly, consuming unnecessary storage and bandwidth. Fomera suggests building a media library as a solution to optimize this process, enabling users to manage their uploads more effectively.
- Creating Custom Services: Fomera demonstrates how to build custom services in Active Storage, using Wistia as an example. By inheriting from ActiveStorage::Service and implementing key methods like upload and delete, developers can create tailored file management solutions for external services. The Wistia integration shows how clients can upload and delete video files seamlessly.
- Introducing Active Storage Providers: A new concept called Active Storage Providers is introduced, allowing developers to specify multiple sources for file attachments. This framework enhances flexibility in file handling and management, accommodating various integrations.
- Integration with Pexels: The talk further covers how to integrate Pexels, a platform for free images, detailing how to handle image uploads and searches within the application's UI. This provides users with a versatile media library experience, combining direct uploads and external resource integration.
- Lessons Learned in Building a Media Library: Fomera shares insights from her experience of creating a media library, emphasizing user tracking and managing file uploads through a simple approach that can be expanded to include advanced features like file tagging and improved search functionality.
- Implementation Considerations: The presentation concludes with practical advice on managing user-uploaded files and integrating custom functionalities while keeping in mind best practices like handling user IDs and search features.

Conclusions: Fomera emphasizes that creativity in leveraging Rails functionalities can push the boundaries of file management solutions. Through integrating Active Storage with external providers, developers can enhance the user experience while retaining the simplicity of the Rails framework.

In her talk at #RailsWorld, Andrea Fomera showed how she works with #ActiveStorage using custom services for external providers.

Thank you Shopify for sponsoring the editing and post-production of these videos. Check out insights from the Engineering team at: https://shopify.engineering/

Stay tuned: all 2024 Rails World videos will be subtitled in Japanese and Brazilian Portuguese soon thanks to our sponsor Happy Scribe, a transcription service built on Rails. https://www.happyscribe.com/

Rails World 2024

00:00:10.639 Hello everyone! Today I'm very excited to talk about pushing the boundaries with Active Storage, specifically using it to manage representations of files that are hosted on third-party services like Wistia. We'll also explore how to build a comprehensive media library, enabling end users to manage their uploads and view them with ease. Before we dive into the technical details, I'd like to take a moment to introduce myself.
00:00:32.800 My name is Andrea Fomera, my pronouns are she/her, and I'm a senior software engineer at Soundstripe, where I recently joined the team. I’d like to give a big shout-out to Soundstripe for allowing me to take some time out of my day to be here today. For those interested in connecting with me further, you can find my information at [A.D]. I can also be found online pretty much everywhere that matters.
00:01:09.560 Before we get started, my slides will move very quickly. There will be a link to the code, and the slides will be available shortly after this talk. Everything I’ll be sharing today has been written specifically for this talk, with the goal of inspiring you to explore new possibilities within your own projects with Active Storage. I hope you'll walk away with fresh ideas and practical techniques that you can immediately apply.
00:01:28.640 First, let's define what Active Storage is and how it fits into a Rails application. According to the Rails guides, Active Storage facilitates uploading files to a cloud storage service like Amazon S3, Google Cloud Storage, or Microsoft Azure Storage, and attaching those files to Active Record objects. It comes with a local disk-based service for development and testing and supports mirroring files to subordinate services for backups and migrations.
00:01:49.399 Put simply, Active Storage simplifies file uploads by abstracting away the complexity of working with different cloud storage providers, while also supporting local file storage for development and testing environments. Behind the scenes, Active Storage relies on three main tables to accomplish this: Active Storage Blobs, Active Storage Attachments, and Active Storage Variant Records.
00:02:05.719 Active Storage Blobs are the key representation of the files we upload. Each blob is an Active Record record that stores essential information about a file, such as its content type and size. Attachments act as the bridge between your Active Record models and the blobs. When you attach a file to a record in your application, Active Storage creates an attachment that links the blob to a specific model instance.
00:02:30.519 Variant records track variations of processed images. For example, if you need to resize an image to create a thumbnail or otherwise change its dimensions, these variations can be stored in the table, assuming tracking is enabled. This allows for easier reference and management of different file versions.
00:02:50.560 Now, how does this work in practice? Typically, applications follow a one-blob-to-one-attachment relationship. If you need to associate the same file with multiple records, each upload from the browser sends the file again, meaning a new blob is created for every attachment. This behavior works well for most scenarios, but let's consider a scenario where your application is hosting millions of files, and users are repeatedly uploading the same files.
00:03:17.960 This redundant behavior leads to multiple blobs being created for identical files, consuming unnecessary storage space and bandwidth. So the question becomes: is there a more efficient way to handle this? How can we make lives easier for our users and our infrastructure? Today, we are going to explore how building a media library can solve this problem.
00:03:34.480 Before I dive into that, I want to discuss building custom services with Active Storage. Active Storage is primarily known for managing files on cloud storage providers like Amazon S3, Google Cloud Storage, or Microsoft Azure. The built-in services define how Active Storage interacts with these providers. But did you know you can create your own custom service to interact with different providers and handle different use cases? Today, we'll walk through the requirements for building your own custom service.
00:04:05.920 As an example, let's look at how you can integrate Wistia, a video hosting platform, directly into your Rails application. In the code on the slide (which I know can be hard to see), you can see that a custom service in Active Storage starts by inheriting from ActiveStorage::Service. There are several key methods you need to define to get your custom service up and running: upload, delete, download, download_chunk, and the public? predicate. These methods are the backbone of any Active Storage service and define how files are uploaded, retrieved, and managed in the service.
00:04:46.880 Additionally, depending on the service, there are two private methods you may need to implement: public_url and private_url. These methods control how URLs for files are generated. For example, if the file is public, you simply return a direct link. If the file is private, you would typically want to provide a link that expires. As you can see, each method currently raises a NotImplementedError, signaling that we need to fill in the details based on a specific use case.
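The skeleton described above might look roughly like the following. This is a minimal sketch, not the code from the slides: the class name `ExampleVideoService` is hypothetical, and the small stand-in for `ActiveStorage::Service` exists only so the snippet runs outside a Rails app (inside Rails the real base class is used and the stand-in is skipped).

```ruby
# Stand-in so this sketch runs outside a Rails app.
# Inside Rails, ActiveStorage::Service is already defined and this is skipped.
unless defined?(ActiveStorage::Service)
  module ActiveStorage
    class Service; end
  end
end

# Hypothetical service name. Every method is a placeholder that still
# needs a real implementation for the provider being integrated.
class ExampleVideoService < ActiveStorage::Service
  def upload(key, io, checksum: nil, **options)
    raise NotImplementedError
  end

  def download(key)
    raise NotImplementedError
  end

  def download_chunk(key, range)
    raise NotImplementedError
  end

  def delete(key)
    raise NotImplementedError
  end

  def public?
    raise NotImplementedError
  end

  private

  # Direct link for public files.
  def public_url(key, **options)
    raise NotImplementedError
  end

  # Expiring link for private files.
  def private_url(key, expires_in:, filename:, disposition:, content_type:, **options)
    raise NotImplementedError
  end
end
```

Starting from explicit `NotImplementedError` placeholders makes it obvious which parts of the service contract a given provider actually needs.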
00:05:09.160 Let's talk about implementing Wistia as a service. Video uploads will be handled by the front end using a JavaScript snippet that Wistia provides, and deleting files will happen through an API request from the backend. Since Wistia handles the uploads directly via the JavaScript API, the upload method doesn't need to perform any action, so we'll just make it a no-operation (no-op). We'll rely on their uploader, and the delete method will make a call to the Wistia API to remove the video. This operation could be deferred to a background job to ensure it doesn't slow down the user experience.
00:05:44.400 For download and download_chunk, videos are streamed directly from Wistia, so we can skip these methods as well. We're going to embed the Wistia player with some HTML and JavaScript. For public?, we can just assume videos are public. The next step is to handle the actual file upload using Wistia's uploader widget. Here's one way you could integrate the Wistia JavaScript API into your form.
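Filled in for Wistia, the service could be sketched as below. Several things here are assumptions made for illustration: the `client` object and its `delete_media` method are hypothetical (a thin wrapper around Wistia's API would sit behind them), the blob key is assumed to be the Wistia hashed ID, and the `ActiveStorage::Service` stand-in is only there so the snippet runs on its own.

```ruby
# Stand-in so this sketch runs outside a Rails app.
unless defined?(ActiveStorage::Service)
  module ActiveStorage
    class Service; end
  end
end

class WistiaService < ActiveStorage::Service
  # `client` is a hypothetical object responding to `delete_media(hashed_id)`.
  def initialize(client:)
    @client = client
  end

  # No-op: Wistia's JavaScript uploader has already sent the file.
  def upload(key, io, checksum: nil, **options)
    nil
  end

  # Assumes the blob key is the Wistia hashed ID.
  # In an app this call could be moved into a background job.
  def delete(key)
    @client.delete_media(key)
  end

  # Videos are streamed by the embedded Wistia player, so these are skipped.
  def download(key)
    raise NotImplementedError, "videos are streamed from Wistia"
  end

  def download_chunk(key, range)
    raise NotImplementedError, "videos are streamed from Wistia"
  end

  # All videos are treated as public.
  def public?
    true
  end
end
```

Injecting the client keeps the service easy to exercise with a fake in tests, without touching the network.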
00:06:18.000 As a side note, we're going to break all of the usual rules today and embed a token directly into the page. You shouldn't do this; you should use a short-lived, expiring token. But why not? Let's break all the rules! We use the Wistia upload widget to handle the file upload directly from the client side, and we rely on Wistia's upload success event. Once the file is successfully uploaded, Wistia fires that event, and we can pass the media ID and the name to the backend in the format 'wistia:hashed_id:file_name'. This string will be parsed by the backend later.
00:06:55.640 As you can see, we've just looked at an example of how to create a custom service in Active Storage and explored a small example with Wistia. This setup starts to push the boundaries of what Active Storage can do by integrating a third-party service provider and creating a seamless experience for managing video files outside of the traditional storage system Active Storage typically operates with.
00:07:30.239 Now, when the form is submitted and the backend receives this parameter, how do we actually handle it? Active Storage expects a signed ID, not this string, so I'd like to introduce a concept I'm calling Active Storage Providers. This is a concept that does not come with Active Storage, but I figured it was a clean way to wrap things up.
00:07:51.520 What is a provider? Well, a provider by definition is something that makes a resource available for use. In this context, the provider is responsible for managing how files are uploaded, stored, and retrieved from different services. For example, we can make Wistia a provider and handle how the backend processes the string submissions we saw earlier when the video is uploaded.
00:08:17.120 Let’s take a look at the developer API that developers would use to integrate providers. As you can see, there's a has_one_attached with providers and then an array of the providers. This is very similar to the Active Storage API for managing services. We've introduced a new keyword argument, providers, which accepts an array of symbols.
00:08:34.000 The logo field has two possible providers, Wistia and Media Library, which we'll talk about in a minute, while the video field is configured to use the Wistia provider. This allows developers to easily choose different file sources for each attachment. To implement this, we introduce a new concern called Active Storage Providers, built on ActiveSupport::Concern. This concern is included in ApplicationRecord, allowing all models in our app to use this functionality seamlessly.
00:09:12.840 Next, let's take a look at Active Storage Providers and how this code monkey-patches the has_one_attached method. If the providers option is present, we add provider hooks by calling a method called 'add_provider_hooks', and we store the providers in a hash keyed by attachment name so that we can look them up when needed.
00:09:36.120 Speaking of looking up providers when needed, we introduce a providers_for_class method that we’ve defined for a specific Active Storage association. This is where the providers_form method comes into play. This method fetches any providers that have been defined for a given model and attachment name to add the necessary functionality. We define the add_provider_hooks method, which modifies the setter method for the attachment, ensuring that the correct provider is used to handle the attachments.
00:10:10.080 To do this, we implement a simple if statement. If the value starts with 'wistia:', we handle it by calling 'create_wistia_blob', which creates a blob in Active Storage for the video uploaded in Wistia. Otherwise, we fall back to the default behavior and pass the value to the original method, and Active Storage handles it as usual.
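The mechanics described in this part of the talk (registering providers per attachment, then wrapping the setter) can be sketched in plain Ruby. This is a simplified model, not the speaker's concern: `has_one_attached` here is a stand-in that only defines a reader and writer, the lookup method name `providers_for` is an assumption, and `create_wistia_blob` returns a placeholder hash instead of a real blob.

```ruby
module AttachmentProviders
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    # attachment name => array of provider symbols
    def attachment_providers
      @attachment_providers ||= {}
    end

    # Hypothetical lookup name, used here for illustration.
    def providers_for(name)
      attachment_providers.fetch(name.to_sym, [])
    end

    # Stand-in for the patched has_one_attached: a plain accessor plays
    # the role of Active Storage's own reader and setter.
    def has_one_attached(name, providers: [])
      attr_accessor name
      return if providers.empty?

      attachment_providers[name.to_sym] = providers
      add_provider_hooks(name)
    end

    # Wrap the original setter so provider strings are intercepted first.
    def add_provider_hooks(name)
      original_setter = instance_method("#{name}=")

      define_method("#{name}=") do |value|
        if value.is_a?(String) && value.start_with?("wistia:")
          value = create_wistia_blob(value)
        end
        # Everything else falls through to the default behavior.
        original_setter.bind(self).call(value)
      end
    end
  end

  # Placeholder: the real method would create an Active Storage blob.
  def create_wistia_blob(value)
    { source: "wistia", raw: value }
  end
end

class Article
  include AttachmentProviders

  has_one_attached :logo, providers: %i[wistia media_library]
  has_one_attached :cover
end

article = Article.new
article.logo = "wistia:abc123:intro.mp4" # intercepted by the hook
article.cover = "some-signed-id"         # no providers, default setter
```

Capturing the original setter with `instance_method` before redefining it is what lets the hook delegate back to the default behavior for ordinary values.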
00:10:37.040 Finally, let’s look at the create_wistia_blob method. This is where we parse a string provided by Wistia and save it to the Active Storage blob for future reference. In this method, we extract the Wistia ID and file name from the string, generate a checksum for the Wistia ID, and create a new Active Storage blob. This blob contains metadata that indicates it was uploaded via Wistia, which could be helpful for retrieval or deletion purposes later on.
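The parsing step can be illustrated on its own. The sketch below only builds the attributes a blob might be created from; the attribute names, the use of the hashed ID as the blob key, and the metadata keys are assumptions rather than the speaker's code. The checksum uses a Base64-encoded MD5 digest, which is the format Active Storage uses for blob checksums.

```ruby
require "digest"

# Parse "wistia:hashed_id:file_name" into attributes for a new blob.
def wistia_blob_attributes(value)
  # Limit of 3 keeps any colons inside the file name intact.
  prefix, hashed_id, filename = value.to_s.split(":", 3)

  unless prefix == "wistia" && !hashed_id.to_s.empty? && !filename.to_s.empty?
    raise ArgumentError, "not a Wistia value: #{value.inspect}"
  end

  {
    key: hashed_id,                                   # assumption: key is the Wistia ID
    filename: filename,
    checksum: Digest::MD5.base64digest(hashed_id),    # checksum of the ID, not the file
    service_name: "wistia",                           # assumption: name of the custom service
    metadata: { "provider" => "wistia", "wistia_id" => hashed_id }
  }
end

attributes = wistia_blob_attributes("wistia:abc123:intro.mp4")
puts attributes[:filename]
```

Because the file itself never passes through the Rails app, the checksum can only describe the ID, so it should not be relied on for integrity checks of the video.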
00:11:03.840 With Active Storage Providers, we've introduced a flexible way for Rails developers to work with multiple storage providers for different attachments. This system makes it really easy to integrate third-party services like Wistia, Pexels, or even a custom Media Library into your application's file management, all while keeping the API simple and intuitive.
00:11:30.720 The ability to define multiple providers opens up new possibilities for handling files in your app in a way that fits your unique needs, pushing the boundaries of what Active Storage can do. Now, we've all been there: requirements change, the product owner introduces a new requirement, and we want to add support for Pexels as a provider.
00:11:55.360 If you're unfamiliar, Pexels is a platform offering a collection of free images, similar to Unsplash. However, a key difference between Pexels and Unsplash is that Pexels allows you to download and serve files directly from your own servers.
00:12:20.640 To support Pexels, we can extend the add_provider_hooks method to handle files coming from the Pexels API. This is how we would modify the method: we add a condition to check if the string starts with 'pexels:', just like we did for Wistia. If it does, we call the create_pexels_blob method, which is responsible for downloading and processing the files from Pexels.
00:12:55.120 Next, we define the create_pexels_blob method to download the image from Pexels, set its content type, and save it as an Active Storage blob. Here's what this method does: we take the value, which will be in the format 'pexels:URL', and extract the URL after the 'pexels:' prefix. Then we download the file from the Pexels URL using open-uri. You probably want to look into some security aspects here if you use this code. We use the Marcel gem to determine the MIME type of the file based on its content, and finally, we create a new Active Storage blob, storing the file's metadata and returning the blob.
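One of the security aspects worth looking at is that the URL comes from a form field, so a user could submit any address. A minimal sketch of validating it before anything is downloaded follows; the prefix is assumed to be spelled `pexels:`, and the allowlisted host `images.pexels.com` is an assumption about where Pexels serves image files, so it should be checked against real API responses.

```ruby
require "uri"

# Assumption: hosts that Pexels image URLs are served from.
ALLOWED_PEXELS_HOSTS = ["images.pexels.com"].freeze

# Extract and validate the URL from a "pexels:URL" form value.
# Returns a URI object that is safe to hand to the download step.
def pexels_download_url(value)
  prefix, url = value.to_s.split(":", 2)
  raise ArgumentError, "not a Pexels value: #{value.inspect}" unless prefix == "pexels" && url

  begin
    uri = URI.parse(url)
  rescue URI::InvalidURIError
    raise ArgumentError, "invalid URL: #{url.inspect}"
  end

  # Only HTTPS URLs on an allowlisted host are accepted, which rules out
  # file paths, plain HTTP, and requests to internal addresses.
  unless uri.is_a?(URI::HTTPS) && ALLOWED_PEXELS_HOSTS.include?(uri.host)
    raise ArgumentError, "refusing to download from #{url.inspect}"
  end

  uri
end

uri = pexels_download_url("pexels:https://images.pexels.com/photos/1/coffee.jpeg")
puts uri.host
```

Only after this check would the file be fetched and its MIME type detected from the content, as described in the talk.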
00:13:37.920 Let's move on to the front-end implementation of Pexels. First, let me show you the final result, because why not? Demo time! In this case, I have a form here that allows me to upload from my computer or search Pexels for free images. I'll search for 'mountains'. The images will show up. I can click one or search for 'cars', or perhaps I'll search for 'coffee'. That coffee looks good; let's pick that one. Lastly, we'll create the article, and behind the scenes, we download the file and save it to our Active Storage service.
00:14:13.200 So how do we implement this? We want users to be able to upload images to an Active Storage association and give them the option to choose from Pexels. To enable this, we define a has_one_attached field and pass a Pexels provider as follows, just using the providers keyword argument we talked about earlier. To simplify the integration, we'll create a custom form builder with a helper method that defines a single image uploader and renders a shared partial containing the uploader logic.
00:14:46.880 I won't be showing the full uploader logic today because it's kind of a mess, but you can look through the code afterwards. This helps us pass local variables for providers using the providers_form method we defined earlier. We render a shared partial that will contain the markup for the uploader. The single image uploader view includes a Stimulus controller that wraps the uploader and manages the states. Users will have the option to upload an image from their computer or search for images on Pexels.
00:15:16.520 Assuming the Pexels provider is enabled, when users click search Pexels, a text input appears, allowing them to search for images. After entering at least three characters, a Stimulus controller sends a request to a custom Pexels controller. The Pexels controller handles the API request to Pexels. This controller extracts the API key from the credentials, makes a request using a Pexels client, formats the response, and returns a list of images as JSON, including the photo's URL, description, and the photographer's information.
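The "formats the response" step could look something like this. The sketch works on an already-parsed response hash; the field names (`photos`, `src`, `alt`, `photographer`, `photographer_url`) reflect my understanding of the Pexels search response and should be treated as assumptions to verify against the Pexels API documentation. The sample data is made up.

```ruby
require "json"

# Reduce a parsed Pexels search response to the fields the UI needs.
def format_pexels_photos(response)
  response.fetch("photos", []).map do |photo|
    {
      url: photo.dig("src", "large"),
      description: photo["alt"],
      photographer: photo["photographer"],
      photographer_url: photo["photographer_url"]
    }
  end
end

# Made-up sample in the assumed response shape.
sample_response = {
  "photos" => [
    {
      "alt" => "Cup of coffee on a table",
      "photographer" => "Jane Example",
      "photographer_url" => "https://www.pexels.com/@jane-example",
      "src" => { "large" => "https://images.pexels.com/photos/1/coffee.jpeg" }
    }
  ]
}

puts JSON.generate(format_pexels_photos(sample_response))
```

Returning only a small, fixed set of fields keeps the Stimulus controller independent of the full upstream response format.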
00:15:52.839 Once the Stimulus controller receives the search results, it displays the images in a grid. When the user clicks on an image, the image URL is stored in a hidden field, formatted as 'pexels:photo_url'. Submitting the form then triggers the backend to download the photo and save the blob. That was a lot to take in, but the key here is that just by making a few changes to Active Record, we've enabled seamless integration with external providers like Wistia and Pexels.
00:16:27.320 These providers allow us to extend the functionality of file uploads and offer more flexibility to users, all within the familiar Rails interface. If you're like me, you might think that was a lot of work, but it's actually worth it. Once the providers are set up, you can reuse the functionality across your entire application.
00:16:51.320 Next, I'm going to talk about lessons learned while building a media library that allows users to search, view, and manage files they've uploaded. First up is another demo video of the media library. Here I'm searching for 'Andrew Mason.' Hello! I decided not to, and I'm searching for videos. Now, if I click the video, you can see there's an embedded video player from Wistia.
00:17:18.640 If I search for a Pexels photo, you can see it embeds the full photo into the modal. You can download it, or delete it. That photo is gone, bye-bye. Then it downloaded the file. The code I'm about to show you has not been fully tested in a production environment, so while it's functional for demonstration purposes, you may need to make adjustments for your own purposes.
00:17:48.320 Before we dive into the implementation, let's revisit the two main tables in the Active Storage schema. The Active Storage Blobs table holds a representation of each file, including metadata such as the size and content type, and the service where the file is stored, like Amazon S3. The attachments table establishes the association between your application's records and the blobs, connecting, for example, an article or user model with the uploaded file.
00:18:08.600 Looking at this schema from a high-level perspective, it makes sense to add a reference column, such as user ID, to the Active Storage attachments table. This would allow us to track which user uploaded which file, providing a way to filter files by user. An alternative approach may be to create a media library object that has many attached files and use callbacks to automatically generate attachments whenever an Active Storage association is added.
00:18:30.360 For example, you could define a method like 'attach_to_media_library'. This method would handle all the logic to add a before_save callback to look for the changed attachments and automatically attach them to the media library. This is a more generic approach that may work very well, but today I want to focus on a simple MVP approach to validate the core functionality.
00:19:01.720 For the MVP, we'll keep things simple by adding a user ID column to the Active Storage attachments table. This allows us to easily track which user uploaded each file. Here's how we can do this using a concern that's included into ActiveStorage::Attachment to automatically set the user association, leveraging CurrentAttributes in Rails.
00:19:34.720 By doing this, we achieve two major benefits: we can track who uploaded each file, and we can query blobs based on the user ID. With this setup, querying the database to find all the blobs uploaded by a specific user is just a matter of joining the Active Storage attachments table.
00:20:00.520 Here's how you might implement this: We join Active Storage blobs with Active Storage attachments to find the blobs created by the current user. We can add filters such as searching by file name to allow users to search through their uploaded files. We order the results by created_at, descending, to display the most recently uploaded files first.
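The logic of that query (join, filter by user, optional file name search, newest first) can be modelled with plain Ruby objects. This is an in-memory illustration, not the ActiveRecord query from the talk: `LibraryBlob` and `LibraryAttachment` are hypothetical stand-ins for the two tables, and in a Rails app the same steps would be expressed as a join from blobs to attachments filtered on the added `user_id` column.

```ruby
# Hypothetical stand-ins for rows in the blobs and attachments tables.
LibraryBlob = Struct.new(:id, :filename, :created_at, keyword_init: true)
LibraryAttachment = Struct.new(:blob_id, :user_id, keyword_init: true)

# Blobs attached by `user_id`, optionally filtered by a partial,
# case-insensitive file name match, most recent first.
def media_library_blobs(blobs, attachments, user_id:, filename: nil)
  # "Join": collect the blob IDs referenced by this user's attachments.
  blob_ids = attachments.select { |a| a.user_id == user_id }.map(&:blob_id).uniq
  results = blobs.select { |b| blob_ids.include?(b.id) }

  # A blank search term means no filtering.
  unless filename.nil? || filename.strip.empty?
    needle = filename.downcase
    results = results.select { |b| b.filename.downcase.include?(needle) }
  end

  results.sort_by(&:created_at).reverse
end

start = Time.at(0)
blobs = [
  LibraryBlob.new(id: 1, filename: "Coffee.jpg", created_at: start),
  LibraryBlob.new(id: 2, filename: "intro.mp4", created_at: start + 60)
]
attachments = [
  LibraryAttachment.new(blob_id: 1, user_id: 7),
  LibraryAttachment.new(blob_id: 2, user_id: 7),
  LibraryAttachment.new(blob_id: 2, user_id: 7) # same blob attached twice
]

puts media_library_blobs(blobs, attachments, user_id: 7).map(&:filename).inspect
```

Note the deduplication step: once one blob can back several attachments, a plain join returns the blob once per attachment, so the database version needs a `DISTINCT` or equivalent.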
00:20:24.240 We now have a way to display all the blobs the user has uploaded in the media library. These files can be listed, previewed, and managed directly from the user interface, providing a seamless experience for managing media. Building a media library with Active Storage can be quite straightforward by using the existing schema and just adding a custom bit of logic.
00:20:43.840 By extending Active Storage Attachment to track user uploads, we can now give users visibility into their own files, allowing them to manage their media in a user-friendly way. This approach is minimal and scalable, and it can be further expanded to include advanced features such as file tagging, searching by metadata, or integrating with third-party storage providers.
00:21:06.840 Here’s another demo video. In this case, I'm either uploading from my computer or choosing an existing file we've uploaded in the media library. Here's Andrew Mason, and then a car here. We’ll pick Andrew's face.
00:21:29.360 When you select a file from the media library, the backend receives the signed ID from the original blob and creates a new attachment. There’s no need to go through the Active Storage providers' code. We'll just use a provider reference for the media attachment on the frontend code, so we can display the option.
00:21:51.960 As I bring this train into the station, I’d like to talk about a few interesting things I encountered while using Active Storage at scale. First, let’s dive deeper into the search functionality I mentioned earlier: the search by file name method. This is a common need when working with files and attachments in Active Storage, especially when you have a large number of assets.
00:22:13.760 To accomplish this, we need a way to extend the Active Storage Blob class and add the functionality for searching by file name. Let’s take a look at the code, and I’ll walk you through it step by step. Here, I’m defining a search_by_file_name class method inside the Active Storage Scopes module. This method runs a SQL query to filter the Active Storage blob records by file name.
00:22:37.680 If the file name is blank, we return all records. We’re using 'LIKE' in the SQL query to match any parts of the file name. This allows for partial matches and makes the search more flexible. Don't forget to include the Scopes concern in Active Storage Blob.
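One detail worth handling when building the `LIKE` pattern is that `%` and `_` in user input are themselves wildcards. The helper below is a plain-Ruby sketch of building the pattern; the method name is hypothetical, and in a Rails app ActiveRecord's `sanitize_sql_like` does the same escaping, with the resulting pattern passed as a bound parameter rather than interpolated into the SQL.

```ruby
# Build a LIKE pattern for a partial file name match.
# Returns nil for a blank term, meaning "do not filter, return all records".
def file_name_like_pattern(file_name)
  return nil if file_name.nil? || file_name.strip.empty?

  # Escape the escape character itself plus the LIKE wildcards.
  escaped = file_name.gsub(/[\\%_]/) { |char| "\\#{char}" }
  "%#{escaped}%"
end

puts file_name_like_pattern("logo")
puts file_name_like_pattern("50%_off")
```

Without the escaping, a search for `50%_off` would also match names like `50-percent-off`, which is rarely what the user meant.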
00:22:59.760 In a previous role, I encountered a concept of well-known blobs. These were blobs the system would use for demo content, and we had to ensure they weren't deletable by end users. Additionally, we needed to prevent the deletion of files on the file storage when it was hooked up to production.
00:23:23.840 This posed an interesting problem, and one tip I learned was to conditionally prepend a concern that overrides the delete method on the blob. Here are some other things to keep in mind that might save you some debugging time: if you add a user ID to Active Storage attachments, it can be error-prone if you don't have the current user set in your background jobs.
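The prepend technique can be shown with a plain Ruby class. Everything named here is hypothetical: `DemoBlob` stands in for the blob model, the list of well-known keys is made up, and the `PROTECT_WELL_KNOWN` constant stands in for whatever real condition (such as an environment check) decides whether the override is installed.

```ruby
# Stand-in for the blob model.
class DemoBlob
  attr_reader :key, :deleted

  def initialize(key)
    @key = key
    @deleted = false
  end

  def delete
    @deleted = true
  end
end

# Override that refuses to delete well-known (demo content) blobs.
module ProtectWellKnownBlobs
  WELL_KNOWN_KEYS = ["demo-hero-image"].freeze # made-up example key

  def delete
    return false if WELL_KNOWN_KEYS.include?(key)

    super # fall through to the original delete
  end
end

# Stand-in for a real condition, for example an environment check.
PROTECT_WELL_KNOWN = true

# Because the module is prepended, its delete runs before DemoBlob#delete.
DemoBlob.prepend(ProtectWellKnownBlobs) if PROTECT_WELL_KNOWN

demo = DemoBlob.new("demo-hero-image")
demo.delete
puts demo.deleted
```

Prepending (rather than including) is what puts the module's `delete` ahead of the class's own method in the lookup chain, so `super` reaches the original implementation.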
00:23:54.560 If you add providers to your backend, don't forget to enable them so that they show up on the frontend. And as you leave here, remember just a few things: think outside the boundaries of the frameworks provided by Rails. With a little creativity, you can do things like enhancing an upload component with Pexels search.
00:24:11.440 Custom services can allow you to do new and unique things with Active Storage. Thank you for your time. It's been an absolute joy to speak here today, and I really appreciate your attention. If anyone has questions or comments, feel free to come find me after the talk. You can find my slides at [A.P] speaking and a link to the repo on my GitHub, or you can find me online on Twitter, where I’ll be posting this information later. Thank you so much!