Refactoring Towards Component-based Rails Architectures

by Stephan Hagemann

In the talk titled "Refactoring Towards Component-based Rails Architectures," Stephan Hagemann presents effective strategies for restructuring large Ruby on Rails applications to improve maintainability and performance. He emphasizes the importance of transitioning from a monolithic Rails application to a component-based architecture, which allows developers to divide their code into smaller, manageable parts that can be developed, tested, and maintained independently. The key points of the presentation include:

  • Understanding Component-based Architectures: Hagemann explains that component-based architectures enhance the quality of applications by allowing for clearer dependencies and easier testing. Developers should visualize their applications by organizing functionalities into boxes connected by arrows signifying dependencies.

  • Experiences with Refactoring: Drawing from his own experiences, he shares insights on the process of refactoring existing applications into components, encouraging developers to view their applications as collections of smaller parts rather than a single, monolithic codebase.

  • Case Study: Hagemann references a sample application designed to illustrate the structure of a component-based system, demonstrating how Rails engines can serve as encapsulated components within a larger application. By organizing functionalities into engines, developers can cleanly separate concerns and minimize dependencies.

  • Complexity Management: He discusses the growing complexity in large codebases and how introducing structure can help manage this complexity. He illustrates the concept of preferential attachment in code, where certain components (like user models) tend to accumulate functionality, leading to bloated and hard-to-maintain code.

  • Refactoring Strategies: Hagemann details specific strategies for refactoring existing applications into component structures, including ‘teasing out an app component’ and ‘extracting a functional component’. He stresses the necessity of having robust tests before starting the refactoring process to ensure that moving code does not introduce new bugs.

  • Final Thoughts: He encourages developers to embrace component-based architectures as a way to improve the quality of their applications, offering support for those looking to implement these changes. By continuously refining applications into components, developers can manage complexity more effectively and promote cleaner, more maintainable code.

The overarching takeaway from Hagemann's talk is that refactoring towards a component-based architecture is not only beneficial but essential for maintaining healthy codebases as applications grow in size and complexity.

00:00:17.039 Thanks for coming! You came to the session on big Rails applications. If you did not intend to come here, I can understand; big Rails apps can be messy. I'd suggest you get out now as the flight might be bumpy. I want to talk about really large Rails apps.
00:00:39.920 My buddy Austin suggested a couple of days ago that we search GitHub for Rails apps sorted by size. This is officially the biggest Rails app on GitHub that you've never heard of. It's open source and has only eight models; it is 150 megabytes big because they check in all their uploaded assets, and that's not the kind of big I'm talking about. I want to talk about component-based Rails architectures, why you should refactor towards them, and how to do that. I tag my tweets about component-based architectures with #cbra, and I'm the only person in the world doing this. If you use it too, I will certainly reply! If I can get one idea across right off the bat, it is that this stuff is easy. If you can think about your application by drawing boxes for its parts and arrows for how they depend on each other and interact, then you know everything you need to know about component-based architectures. You just need to start building them, which I think will improve the quality of your application and of your code.
00:01:20.720 Since I called this talk 'Refactoring Towards Component-Based Rails Architectures', the title assumes that you all know what that is. I will actually explain it, so you're not lost if you don't. I started talking about and writing applications pretty much exclusively in this somewhat odd way about two to three years ago. I wrote a sample app that you can find on GitHub under shageman (Hagemann with one N). It's called the 'Next Big Thing' sample app; it does nothing except show you the structure and how to hook certain things up.
00:02:13.680 Let me quickly go through what that means. This is the root of the project, and as you can see, it's not a traditional Rails application. There's no prescribed way to do this. In many apps where I've done this, the Rails application is still at the root of the project, but in this particular example I chose the version where Rails moves into a subfolder. It looks pretty much like Rails, but it doesn't have an 'app' folder. You might be asking, 'Is there nothing happening in this app?' Well, there is! At least one thing is happening: an engine is mounted.
00:02:52.800 I don't know if anyone has ever downloaded the 'Teaser' engine from RubyGems. Don't try; I didn't upload it there. Let's look at the Gemfile to find out where that gem is. The trick you need to employ when writing component-based architectures is to use gems not as a distribution mechanism but just as a packaging mechanism for parts of your application. You leave them in the same source tree and reference them from there. You can do everything you're used to with those things. If we go to the components folder, we expect to see a teaser folder, which we do, and there is an engine. This is just a Rails engine: a Ruby gem that is tweaked a bit so that it has Active Support and the rest of Rails available in it.
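As a rough illustration of using a gem purely as a packaging mechanism, a component's gemspec might look like the sketch below. The name `teaser` comes from the sample app; the version, author, file globs, and the `components/teaser` path in the closing comment are assumptions made for this example, not values taken from the talk.

```ruby
require "rubygems"

# Hypothetical gemspec for a component that lives inside the application's
# own source tree (for example at components/teaser/teaser.gemspec).
# It is never pushed to RubyGems; it only packages one part of the app.
TEASER_SPEC = Gem::Specification.new do |s|
  s.name    = "teaser"
  s.version = "0.0.1"          # assumed value
  s.summary = "Teaser page component of the application"
  s.authors = ["Your Team"]    # placeholder

  # Package everything that makes up the engine (assumed folder layout).
  s.files = Dir["{app,config,db,lib}/**/*"]

  # The component states its own dependencies explicitly.
  s.add_dependency "rails"
end

# The application's Gemfile would then point at the local folder instead of
# a released gem, along the lines of:
#
#   gem "teaser", path: "components/teaser"
```

The `path:` option is what keeps the gem inside the repository: Bundler loads the code straight from that folder, so there is nothing to publish or version separately.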
00:03:46.879 This indeed now looks like an application. If you don't know what an engine is, it's essentially Rails minus everything you need to actually get it booted. If we look in here, we can see there's a routes file, so this engine is doing something. There's a normal structure, so let's look at the controller. You can see it is namespaced under the engine name, which is nice, but otherwise it's just your ordinary controller. There are assets in here, standard stuff, and even migrations down here in this engine. Maybe one of the coolest parts is that there are specs in here. I like to use RSpec, but there could just as well be tests if you like those better.
00:04:51.840 There's a spec here for the controller that I just looked at. The cool thing about that, and I really want to emphasize this, is that I can run these tests, and the only thing they are going to load is the code from this engine, not all the other stuff in my big Rails application. Just by having a spec folder in here, I have a part that I can prove is small, or at least only as big as this one part, and that does not depend on anything else. I typically bind these components together, and there are more in this folder: you can see event counter, signup, and whatnot.
00:05:51.680 I wrote a little script here that binds all the tests together. As you can see, there are some request specs and some Jasmine specs. If you look into the other folders, you'll find more test files. Usually I put this build script in the root; it loops over all the folders, tries to find test files, and runs them. So you can still run everything as one test suite for the whole application, but if you change just one part, you can test just that part and still be sure that it works.
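The build script itself is not shown in the transcript. A minimal sketch of the idea, assuming components live under `components/<name>` and keep their tests in a `spec/` or `test/` folder, could look like this. The helper names and the exact test commands are made up for the example.

```ruby
require "shellwords"

# Returns one shell command per component that has tests.
# Assumed conventions: components live in <root>/components/<name>, with
# RSpec specs in spec/ or other tests in test/.
def component_test_commands(root)
  Dir.glob(File.join(root, "components", "*")).sort.filter_map do |dir|
    next unless File.directory?(dir)

    runner =
      if File.directory?(File.join(dir, "spec"))
        "bundle exec rspec"
      elsif File.directory?(File.join(dir, "test"))
        "bundle exec rake test"
      end
    next unless runner

    "cd #{Shellwords.escape(dir)} && #{runner}"
  end
end

# Runs every component's tests in turn (without stopping at the first
# failure) and returns true only if all of them passed.
def run_all_component_tests(root)
  results = component_test_commands(root).map do |command|
    puts "--> #{command}"
    system(command)
  end
  results.all?
end

# Typical use from the project root:
#   exit(run_all_component_tests(Dir.pwd) ? 0 : 1)
```

Because each command changes into the component's own folder first, a component's tests can only pass if the component declares everything it needs, which is the independence check described above.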
00:06:14.160 Now that you know what these applications are, you don't need to worry about all the little details of how to hook them up. You'll discover those when you try to do it. What's important is that you can now talk about your application as a sum of smaller parts. For example, the app I showed you is an empty Rails container that mounts an engine, which uses two other engines that I didn't discuss, and another gem. This was a tiny project: a tiny Rails app container with two engines. One was using two gems to talk to APIs, and the other one was just a straight-up normal application. Another site I worked on was a travel site that was so small we didn't bother to use engines; we just added a couple of gems for API connections.
00:07:25.760 This site started out with six engines and one gem, but it ended up significantly larger. However, this is slightly misleading because there were about twice as many engines originally; I just abstracted some away for clarity in the picture. As you can see, it looks chaotic, but I can draw such a chaotic picture about your application. If you were to draw the arrows a bit straighter, you would see that dependencies all go in one direction, and there are parts that other parts are based upon.
00:08:04.160 So this app is composed of parts, and it’s no longer a ball of mud. The fact that you can see this as a complicated domain is an improvement, despite this picture looking a bit weird. This illustration is from Ben Smith's talk about how he architected his big Rails app for success, which he gave at Rocky Mountain Ruby last year; I recommend you watch that talk. If you’re wondering why you would want to refactor towards this style, as I just said, it's because it helps manage complexity when applications get big.
00:08:47.840 I believe it's fundamentally important to be able to think about the parts of the application independently as much as possible. There’s a wealth of resources out there. I’ve given talks about this, as has Ben, and there are blog posts on the subject. Most importantly, there’s that repository I recommend you check out. Now back to big Rails.
00:09:51.520 How many here think they are working on a big Rails application? That's about a third. I think I've worked on a couple, but I didn’t know what to show you because I work for Pivotal Labs and we consult; the code is our client's. So, I can't show it here. However, I went on Google and searched for open-source Rails apps, and there they were! These claim to be big Rails apps, and I wondered how big they are.
00:10:56.160 If you currently have your company's code on your laptop, please execute this script and tweet the result with the #cbra hashtag. It will output how many lines of code your app has. Here's the result for this list of applications: as you can see, they don't all share the same interpretation of big. They range from about 400,000 lines of code down to apps that are still in the thousands.
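The script from the slide is not part of the transcript. A rough Ruby equivalent, assuming that "lines of code" means non-blank lines in `.rb` files, might be the following; `ruby_code_stats` is a name invented for this sketch.

```ruby
# Counts Ruby files and their non-blank lines below a directory.
# Files are read in binary mode so that odd encodings cannot raise errors.
def ruby_code_stats(root)
  files = Dir.glob(File.join(root, "**", "*.rb"))
  lines = files.sum do |path|
    File.foreach(path, mode: "rb").count { |line| !line.strip.empty? }
  end
  { files: files.size, lines: lines }
end

# Typical use from the project root:
#   stats = ruby_code_stats(Dir.pwd)
#   puts "#{stats[:lines]} lines in #{stats[:files]} Ruby files"
```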
00:11:30.080 If you look at the number of files, you'll see the same picture: thousands of files in the biggest apps and still hundreds for the smaller ones. I will accept that as big, and I hope you will too. Ever wonder what happens to the complexity within these systems when they grow? Something like this happens, and perhaps it's as bad as that, and you just can't take it anymore. When you introduce structure, you fundamentally change the situation.
00:12:18.880 I should first say that this increase in complexity comes from the number of interactions between elements inside the system, which grows somewhat like an exponential function as you add new parts. It can explode in complexity. What if you were able to split a system and use a complex web of service objects to connect the two? You would then have two smaller, isolated parts.
00:13:24.800 What happens to the complexity then? Well, it still grows, but probably at a slower rate, which is beneficial. I wish I could run up to the slide right now, but you should see that green line at the bottom; it's almost flat. Now, imagine you could split it again and again. You would always stay in this flatter part, so your complexity never really explodes. You get those complex diagrams that I was showing earlier, but your complexity doesn't explode quite as badly.
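One simple way to put numbers on this argument is to count only the potential pairwise interactions between elements. That is an assumption of this sketch and a milder model than the curve on the slide (it grows quadratically rather than exponentially), but it shows the same effect of splitting.

```ruby
# Number of potential pairwise interactions among n elements: n * (n - 1) / 2.
def pairwise_interactions(n)
  n * (n - 1) / 2
end

# Potential interactions when the elements are split into isolated parts.
# Assumption: each pair of parts talks through exactly one narrow connection.
def interactions_when_split(part_sizes)
  within  = part_sizes.sum { |size| pairwise_interactions(size) }
  between = pairwise_interactions(part_sizes.size)
  within + between
end

[10, 20, 40].each do |n|
  whole = pairwise_interactions(n)
  split = interactions_when_split([n / 2, n / 2])
  puts "#{n} elements: #{whole} as one system, #{split} as two halves"
end
```

Under this model, 40 elements have 780 potential interactions as one system but only 381 as two halves, and splitting each half again reduces the number further.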
00:14:13.919 You're still writing a huge app, and that's something I'm not going to dispute, but you might be able to manage it better. The rich get richer, and the poor get children. This is from a song from the 1920s. While it's funny, the first part alludes to a phenomenon called preferential attachment. This describes situations in which the quantity of something you already have, whatever that may be, determines how much more of it you will get.
00:15:51.840 Chris Anderson wrote about the long tail in 2006: a few popular items get most of the attention, while many small things each gain little traction. For example, Netflix lets us watch movies that few people want to see, while a handful of blockbusters gain most of the attention. The relationship between the green and yellow graph is about 80% of the tail versus 20% of the front. Preferential attachment occurs when you already have a large quantity, so you get more.
00:16:59.839 I have seen so many codebases where something similar seems to be happening. I looked at these codebases, and I would urge you to check yours out as well. Try to execute a bash script that lists all the Ruby files in your app by size, and see what that looks like. For these applications it looks something like this. Does that not look similar to the preferential attachment phenomenon I mentioned before?
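The talk uses a bash script for this. A comparable sketch in Ruby, assuming that file size is measured in lines and using the invented helper name `ruby_files_by_size`, could be:

```ruby
# Lists Ruby files below root together with their line counts, largest
# first. Ties are broken by path so the order is stable.
def ruby_files_by_size(root, limit: 10)
  Dir.glob(File.join(root, "**", "*.rb"))
     .map { |path| [path, File.foreach(path, mode: "rb").count] }
     .sort_by { |path, lines| [-lines, path] }
     .first(limit)
end

# Typical use from the project root:
#   ruby_files_by_size(Dir.pwd).each do |path, lines|
#     puts format("%6d  %s", lines, path)
#   end
```

If the first few entries are many times larger than the rest, the codebase shows the long-tail shape described here.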
00:18:16.879 There's another way of expressing 'the rich get richer': a German proverb that I often refer to. The proverb states that the devil always shits on the biggest pile. I think something like that is going on here. I encourage you to consider which files may be contributing to this issue. The question you might ask is: which file could it be?
00:19:12.160 Who thinks that might be a good guess? Don't lower your hand, as it's probably yours! I mentioned this two days ago, and then the guy next to me showed me his user file, and it was literally that long. I was shocked! Everyone else probably recognizes this problem: user models are typically very important to a system, so functionality tends to attach itself to them, creating a pile.
00:20:20.720 I thought there would be something else, so I looked at associations as another way of expressing dependencies. I wanted to discuss 'has_many' associations, as I find them particularly interesting. For example, when a 'User' has many 'Cars', 'Flowers', 'Trees', etc., this is a very readable setup. We can empathize with this structure, but what's odd is that when you look at the users table in the database, there is no mention of these associations at all. It raises the question of why we write this kind of code for user classes and not for other classes.
00:21:24.880 We tend to attach all sorts of functionality to user models since they are important to us. This leads me to the conclusion that we shouldn't overuse 'has_many' associations. You can check this on your own and see if you agree. But look at which file is on the left. That's right: usually, it's the user model that's too big.
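The talk does not show code for this point. One possible reading, sketched here with plain Ruby structs standing in for Active Record models, is to let the dependency follow the foreign key: the car-related code knows about users, while the user knows nothing about cars. All class names below are hypothetical.

```ruby
# Plain-Ruby stand-ins for two tables. As in the database, only the cars
# table carries the foreign key; the users table knows nothing about cars.
User = Struct.new(:id, :name)
Car  = Struct.new(:id, :user_id, :model)

# Hypothetical query object that lives with the car-related code.
class CarsOfUser
  def initialize(all_cars)
    @all_cars = all_cars
  end

  # The dependency points from cars to users, matching the schema.
  def call(user)
    @all_cars.select { |car| car.user_id == user.id }
  end
end

alice = User.new(1, "Alice")
cars  = [Car.new(1, 1, "Beetle"), Car.new(2, 2, "Mini"), Car.new(3, 1, "Golf")]

puts CarsOfUser.new(cars).call(alice).map(&:model).inspect
```

With this arrangement, adding flowers or trees later means adding new classes next to those concepts rather than adding more lines to the user.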
00:22:23.440 It’s important to recognize that while the user may often be the culprit, it is likely that your other important domain files also attract too much functionality. As I looked through all the user models in past projects, I discovered they often had bloated sizes. I found it interesting that these models just tend to grow in proportion to how big the application gets.
00:23:13.680 To recap, we have things that grow, and when they do, they become exponentially more complex. Those things at the top of the complexity scale depend on everything else in your application. I think we can do better! We can attempt to reduce the size of the parts and flatten out the curve to reduce the number of dependencies between those parts.
00:24:26.320 Applying SOLID principles has helped me in this regard—especially the Single Responsibility Principle. I think that if you use this as a rule of thumb, it will improve the quality of your code over time. However, when we discuss Single Responsibility, we encounter many questions.
00:25:23.680 When discussing which elements to apply the Single Responsibility Principle to, if you’ve been to any refactoring talks, you might have had the chance to refactor methods that need to be more focused. We look at methods, aiming to make them more concise, and we determine what they should be responsible for, especially concerning Active Record.
00:26:16.279 Yet, when we design classes, we often overlook the full potential of applying this principle—primarily due to the convenience of Active Record, which allows it to save itself, among other things. We need to be cautious, however, and not succumb to creating models that become too broad in scope—especially the user model.
00:27:10.880 I think we’re in a good position to reevaluate our modules. If you’re using modules as mixins, you are merely obscuring complexity rather than addressing it. I wouldn't suggest hiding complexity by placing it elsewhere; you may have done it in the past, but let’s not continue down that path. Utilize modules as namespaces to group things that have a clearer responsibility.
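The difference between the two uses of modules can be sketched as follows. The billing example and all names in it are invented for illustration and do not come from the talk.

```ruby
# Mixin style: the behaviour moves to another file, but every method still
# ends up on the user class, so its interface stays just as large.
module Billable
  def invoice_total(amounts)
    amounts.sum
  end
end

class UserWithMixin
  include Billable
end

# Namespace style: billing logic lives in its own class under its own
# module, and the user class does not gain any methods.
module Billing
  class InvoiceCalculator
    def total(amounts)
      amounts.sum
    end
  end
end

class SlimUser
end

puts UserWithMixin.new.respond_to?(:invoice_total)
puts SlimUser.new.respond_to?(:total)
puts Billing::InvoiceCalculator.new.total([10, 15])
```

The mixin version only relocates the source lines, whereas the namespaced version gives the responsibility a home of its own that can later move into a component.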
00:28:50.640 I think designing your application this way improves how we view its purpose. We still need to be conscious of the way our application interacts within a larger system. I encourage you to attend different sessions and enhance your applications at all levels, but today I want to focus on components, as they enhance the concept of a namespace.
00:29:49.760 Components offer a scalable and provably independent way to discuss parts of your application. If you're considering restructuring your application to increase its quality and maintainability, be aware of the advantages of breaking it into components. You will still retain one repository and a single test suite while allowing for separate functional components within your app.
00:30:42.560 Stick to a single deployment strategy, but you can use load balancers to distribute traffic to certain app parts, facilitating better performance. You won't face additional versioning issues, since everything is contained within one application, which simplifies deployment. I have experienced this first-hand—I was involved in a project that lasted about six months and included multiple pairs working on refactoring.
00:31:44.080 At one point, they created multiple components, but also removed others as needed. Think through all the elements of your application, even if you might not be currently feeling the pain. A component can be a manageable option that helps you think about the pieces of your application effectively.
00:32:15.680 I have spent most of my time explaining why you should refactor towards this, but I promised to share how. If you don’t have tests, I wouldn’t move code just yet. First, add tests around the parts you want to extract.
00:32:36.320 When looking for where to create a component in your application, focus on vertical slices. Look for views, controllers, and models that clearly form a unit on their own, and namespace them in a way that makes their dependencies visible. Once you've identified this vertical slice, you can move it into an engine.
00:33:26.400 Extracting into an engine in Rails is as easy as using the command 'rails plugin new component --full --mountable'. Just be prepared; your app will be unhappy initially. Make sure your tests are green, find any overlooked dependencies in your vertical, and move them into the emerging component.
00:34:04.720 Most of the time, this vertical will not be straightforward to identify, because the user model is scattered among other components. Expect to extract both the specific functionality and some shared parts, but avoid naming the shared part 'common'; otherwise, you'll just create another pile of confusion.
00:34:49.280 Dependencies will likely arise when transferring basic functionality into the gem. Thus, expect to find other dependencies. Once this gem is functional and tested, reintroduce it in your main application while ensuring that the primary app still communicates effectively with it.
00:36:56.080 Establish ports and adapters to facilitate communication between the component and the main application. This will help you avoid adding unnecessary complexity by keeping your dependencies lightweight.
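As a minimal sketch of ports and adapters between a component and the main application, assuming a signup component that needs to send a welcome message, the arrangement could look like this. The module, class, and method names are hypothetical.

```ruby
# Inside the component: it defines what it needs (the port) without
# knowing who provides it.
module EmailSignup
  class Registration
    # Port: any object that responds to #deliver_welcome(email).
    def initialize(notifier:)
      @notifier = notifier
    end

    def register(email)
      raise ArgumentError, "email required" if email.to_s.strip.empty?

      @notifier.deliver_welcome(email)
      email
    end
  end
end

# Inside the main application: an adapter that satisfies the port. A real
# adapter might call a mailer; this one just records messages in memory.
class InMemoryWelcomeAdapter
  attr_reader :sent

  def initialize
    @sent = []
  end

  def deliver_welcome(email)
    @sent << "Welcome, #{email}!"
  end
end

adapter = InMemoryWelcomeAdapter.new
EmailSignup::Registration.new(notifier: adapter).register("ada@example.com")
puts adapter.sent.inspect
```

The component depends only on the shape of the port, so the dependency arrow keeps pointing from the main application to the component and never back.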
00:37:52.400 This approach allows you to chip away at that huge complexity while putting each piece in a bounded context with a single meaning. I know this process can take time and may not work as smoothly as expected. During our first attempt, it took three weeks and several tries before we could successfully split one app into two, and we ultimately ended up with three!
00:38:43.760 If you want to attempt this and can’t find the resources I mentioned, tweet at me, talk to me, or send me an email. I'll respond to all of them, possibly even publicly to help others too. I think this is a great way to structure applications to enhance their quality. Thank you for your attention, and I hope you can all join me in this pursuit of better Rails applications!