Application Architecture: Boundaries, Object Roles, & Patterns

00:00:13.599 Thanks for having me back again at this conference. It's great to be here. I presented here last year as well on parallel programming and concurrency. Last year, there was a heckler in the audience, so if any of you are going to heckle me, I am prepared and ready. I have a script on one screen and my slides on the other. Hopefully, I can manage the coordination correctly.

00:00:21.359 Yesterday, PE gave a good talk on creating a legacy Rails application in one easy step. Like many people, I came to Ruby through Rails. I was doing these super repetitive PHP applications that were essentially phpMyAdmin, but with a nicer user interface. Over the years, I wrote my fair share of embarrassing Ruby code. I think I wrote a gem called 'act as God object'. I suffered through gigantic applications and began to ask myself why I was even using this thing in the first place.

00:00:39.559 You know, once the test suite starts to take over an hour and a half, or you have 179 gems in the project, or there are ActiveRecord API calls spread through all the classes everywhere, or you hit pretty much any of the other reasons these Rails rescue consultancies exist, you really have to evaluate your own choices. Make no mistake, this talk is not about Rails in any way. This talk is about how I learned to fix that pain entirely.

00:01:01.840 It took me a long time to arrive here and I learned a lot of things along the way. Today, I'm here to share my experience in the hopes of making everyone in this room a better software engineer. My primary goal is to encourage all of you to rethink your application's architecture. This talk will cover a lot of ground; unfortunately, I cannot show as much code as I'd like. Please read my blog and other in-depth resources for more technical discussion. Instead, this talk will focus on the high-level concepts with minimal code examples.

00:01:20.080 I don't think any of these ideas are new. Each stands on its own or on the shoulders of brilliant engineers. Hopefully, everyone is here... We're lucky enough to have Michael Feathers with us today. He coined the SOLID mnemonic so all of us could remember the important things more easily. It's also great that he's here because he can answer your questions about the material after my talk. If you ask him a question, let me know and I'll buy him a beer for technical support.

00:01:32.479 We have a slide about productivity, and I feel this is how most applications end up over time. Maybe it starts off okay in the beginning; you get going, you ship, and then eventually you end up in a place where you cannot function anymore. My goal is to get us into a better position, where we aren't burdened by all the weight of previous technical decisions. We can continue to develop and release features over the course of an application's lifetime. To achieve this, we have to spend more time upfront architecting and designing the application.

00:01:55.040 Yes, thank you. This conference is about challenging ideas. I challenge the idea that the Ruby community knows and understands how to architect and design codebases. We do not respect SOLID principles, and I think that at the level the Ruby community produces programs, we often ignore fundamental architectural boundaries. This creates more technical debt, making it harder for us to ship and limiting our capacity to meet ongoing business needs. Let us not forget that this is our single responsibility principle.

00:02:13.840 To address this problem, we need to start from the beginning to arrive at a position where the application contains proper boundaries and uses design patterns. I think every application is generated in a big bang moment in our heads. There's a moment where everything comes together, and poof, we can see how it should work and how it should be programmed. However, we need to unpack the word 'application.' What is an application, and what does everyone think when they start thinking about a new application? Is the first thought, 'Oh my god, shiny! Look at all the things I can do?' Or is it 'I can't wait to try out this new library?' Or 'What framework should I use?'

00:02:43.560 If any of these thoughts resonate with you, then I hope you can take away something from my talk. None of these things are actually important. We must understand that an application is a collection of use cases, and that's it—regardless of anything else. Now, as web developers, we suffer from what I like to call the 'onion' effect. There are so many things in front of us that prevent us from actually focusing on what we're supposed to deliver or ship, such as HTML, JavaScript, and CSS. All of these things are just anecdotes; they're implementation details, and yet we pay so much attention to them.

00:03:04.040 We forget that it's not important. We need to focus on the core of what we're supposed to be doing—that's the use case. A use case represents business logic; it takes input from the user and does something for them. It interacts with a host of other classes in the system to enact the desired change. That is the single responsibility principle at work. But what are the other classes? These are the domain entities, and they are the nouns in the system. They also encompass adjectives, verbs, and prepositions. It's the use case's job to arrange all these different parts into a coherent sentence.

00:03:36.360 Yes, I put business logic on a pedestal because it is the most important thing that we need to be paying attention to. So, I mentioned that use cases take input from the user and do something with it. Now we're faced with the question: How does that actually happen? Well, there's this thing called the delivery mechanism. It takes domain classes and exposes them to the outside world in a way that a human can understand and interact with. As programmers, we think the best delivery mechanism is for the user to clone our repository, require a class, and call a method.

00:03:57.920 However, this actually wasn't arcane enough, so the industry invented JavaScript, HTML, and CSS. In reality, the delivery mechanism is the boundary between the human interaction context and our domain logic. And there's the word 'boundary' again—it's important, and we have to go deeper to understand its power. Boundaries separate larger application concerns. Each concern is another layer, and every application has at least two layers: the business logic layer and the data layer. Notice I did not include the UI because the UI is not important; the business logic can exist independent of any sort of presentation.

00:04:12.800 These boundaries are extremely important because they enable different subsystems to vary independently of each other. This means that each side of the boundary can be replaced with another component, and the wider system is none the wiser. Aggressive boundary creation allows us to focus our efforts on more discrete units of code, increasing our ability to estimate change and meet real engineering requirements. It might even make us happier. Now that the problem domain is smaller, it becomes easier for us to apply SOLID principles, design patterns, and use object roles. Let's turn our attention back to the application layer. What things live there? Well, this is where all the magic happens.

00:05:02.079 We know we have use cases and domain entities. Now, 'domain entities' is a broad term on purpose because large systems contain many different types of classes and many different responsibilities, making it hard to categorize everything. However, there are a few common roles such as forms, validators, models, and repositories. Let's go on a quick tour of these objects. A form object collects and sanitizes input and acts as the boundary between the UI (presentation layer) and the application layer. Form objects can perform context-free validations such as 'Is a given value less than 100?' and can coerce input into the right type, such as converting a string from a console into an integer, or parsing a gigantic JSON blob into an object you can use in your system.
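As an illustration of the form-object role described here, the following is a minimal sketch and not code from the talk. The class name `SignupForm`, its fields, and the error messages are all assumptions made for the example. It coerces a string into an integer and applies only context-free checks.

```ruby
# Hypothetical form object: names and rules are illustrative, not from the talk.
class SignupForm
  class Invalid < StandardError; end

  attr_reader :name, :age

  # params is raw outside-world input, e.g. strings from a console or a JSON body.
  def initialize(params)
    @name = params.fetch("name", "").to_s.strip
    @age  = coerce_integer(params["age"])
  end

  # Context-free validations: no other objects or state are consulted.
  def errors
    list = []
    list << "name is required" if name.empty?
    list << "age must be an integer" if age.nil?
    list << "age must be less than 100" if age && age >= 100
    list
  end

  def valid?
    errors.empty?
  end

  # Strict border guard: refuse to let bad data through the boundary.
  def validate!
    raise Invalid, errors.join(", ") unless valid?
    self
  end

  private

  # Returns an Integer, or nil when the input is not a base-10 integer.
  def coerce_integer(value)
    Integer(value.to_s, 10)
  rescue ArgumentError
    nil
  end
end

form = SignupForm.new("name" => "  Ada ", "age" => "36")
form.validate! # passes; form.age is now the Integer 36
```

Everything past this boundary can then assume that `age` is an integer below 100 without checking again.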

00:06:02.520 These form objects act as border guards, protecting our system from the outside world. They should be strict and ruthless because once data passes through this boundary, it will never be checked in the wider system. Next up, we have models, which are simple classes that encapsulate data and state. There's nothing fancy about this; I think we can all understand it. Now if there are any Rails fanatics in the audience and if you thought about ActiveRecord here, I think that you should definitely stop and reconsider your approach.

00:06:43.680 It's essential to acknowledge that there's a much better way to approach these problems. If you didn't think about ActiveRecord and you thought to yourself, 'yeah, that's good. I like that,' then perhaps you already know what's coming next. Now, unfortunately, it's hard for us programmers to think of data without considering persistence. After all, what good are the objects if we can't manipulate them and get them back again? This focus can be dangerous because we usually tend to concentrate precisely on how the data itself will be persisted, rather than the boundary between data access and data persistence.

00:07:02.279 We ask ourselves questions like, 'Should I use an RDBMS? Or what about a document store, or key-value, or maybe Elasticsearch as my primary data store?' All of these questions are important, but it's crucial to ask them at the appropriate time. When we're thinking about modeling our data, we need to think in terms of abstractions, not concretions. How many times have we committed to one solution too early or realized we might need a different style of persistence for different kinds of data access? Perhaps some of our data could be persisted in RDBMS, while part of it is easier to access via a key-value system like Redis. Committing to those decisions too early can lead to mistakes.

00:07:42.920 In fact, using an ORM from the beginning can push its concerns higher up into the application layer. All of these can be deadly mistakes. We have to learn to build on abstractions and not particular concretions. It's fitting that we arrive at this point. How many of us have ever asked ourselves, 'How can I run my tests without the database?' There's a well-known answer to this and it comes down to the principle of introducing a boundary between data access and persistence. This is arguably the most important decoupling you can do, and it was a turning point for me. Once I had my tests running against memory-style implementations, there was simply no going back.

00:08:34.440 I realized that I could aggressively create more and more boundaries in my application to isolate my domain logic from whatever was communicating with the outside world. If you're familiar with Haskell, this can be understood as the distinction between impure and pure functions. I did all of this just because I wanted my tests to be fast. But what this actually empowered me to do was completely separate all the different concerns of my application, resulting in what I like to call a ports-and-adapters style application, which is how I develop all of my applications today. I mean, decoupling is not just about fast tests; it also allows us to defer important decisions about our architecture until we have enough information to make the right decision.

00:09:21.040 Uncle Bob Martin states that this is the hallmark of good architecture, and there's a design pattern that fits the bill perfectly: the repository pattern. Repositories are the interface to the underlying data layer. Here's Martin Fowler's definition: 'A repository mediates between domain and data mapping layers, acting like an in-memory object collection. Client objects construct query specifications declaratively and submit them to the repository for resolution. Objects can be added and removed from the repository as they can from a simple collection of objects, and the mapping code encapsulated by the repository will carry out the appropriate operations behind the scenes.'

00:09:59.680 Conceptually, a repository encapsulates the set of objects persisted in a data store and the operations performed over them, providing a more object-oriented view of the persistence layer. Most importantly, the repository also supports the objective of achieving a clean separation and one-way dependency between domain and data mapping layers. The most important thing to take away from that definition is the idea of one-way dependency. It means the dependency flows in only one direction. The caller relies on the repository to perform actions, but the repository will never call back into a different layer. If there were communication back and forth, we could not call that a boundary.

00:10:37.040 Now that that's solved, let's move on to the next object: validators. Validators implement more complex validations, potentially requiring state or complex business rules. I know you would probably like me to pull out an example class from my hat to illustrate what a validator is, but as you can see, I don't have a hat. It's hard for me to provide a concrete example because it is domain-specific and application-specific.

00:11:21.760 Earlier, I mentioned that forms can do context-free validations, like determining if a given value is less than 100, where no other information is needed. However, in practice, there are more complex validations we have to think about, like, 'Did the user fill in this field?' or 'Did the user complete this task?' These miscellaneous conditions might add up to specific context-based validations. This is an example of a more complex validation that can be modeled using a dedicated class. Just know this role exists and will eventually materialize in your application.
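Since validators are domain-specific, any example is necessarily invented. The sketch below assumes a hypothetical photo-publishing rule that depends on the state of the user and on how many photos they have already posted, which is the kind of context a form object does not have. The class name, the limit, and the hash keys are all illustrative.

```ruby
# Hypothetical context-based validator; the rules are made up for illustration.
class PublishPhotoValidator
  MAX_PHOTOS_PER_DAY = 3

  # user is a plain hash here; photos_today is state looked up by the caller.
  def initialize(user:, photos_today:)
    @user = user
    @photos_today = photos_today
  end

  def errors
    list = []
    list << "profile is incomplete" unless @user[:profile_complete]
    list << "daily photo limit reached" if @photos_today >= MAX_PHOTOS_PER_DAY
    list
  end

  def valid?
    errors.empty?
  end
end

validator = PublishPhotoValidator.new(user: { profile_complete: true }, photos_today: 3)
validator.errors # => ["daily photo limit reached"]
```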

00:11:54.799 Finally, use cases interact with all these objects to enact a change. They combine stateless domain entities with state and produce the required outcome. In most cases, state usually just represents who the current user is. All this information is simply passed in via the constructor, so the use case takes in a form and the state.

00:12:26.679 Now, thank you. This is not working out as well as I had hoped. The problem with Keynote is that you can't have a gigantic script, so just to backtrack a little—here is an example of a repository with a simple interface to create and read existing objects. What those classes or objects are doesn't make any difference to us. We just take our objects and throw them over the wall. Essentially, that's how most of us throw our code over to operations—'Just take care of it, right?'

00:13:08.639 But the important thing is that the logic lives behind those two methods, so naturally, we can swap it out with something like an SQL repository that interacts with tables, or we can have a Redis-type implementation that uses JSON to persist data. The advantageous part of this is that the logic is entirely encapsulated within this class, and we can use this class in different situations.
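The slide itself is not reproduced in the transcript, so the following is only a sketch of what a two-method repository might look like. It assumes an interface of `create` and `find`, and it uses a plain Hash as a stand-in for a key-value server such as Redis; no real Redis client is involved. Both class names are hypothetical.

```ruby
require "json"

# In-memory implementation, useful for fast tests.
class MemoryRepository
  def initialize
    @records = {}
    @next_id = 0
  end

  # Stores the attributes and returns them with an assigned :id.
  def create(attributes)
    id = (@next_id += 1)
    @records[id] = attributes.merge(id: id)
  end

  def find(id)
    @records.fetch(id) { raise KeyError, "no record with id #{id}" }
  end
end

# Same interface, but each record is serialized to JSON under a string key.
# The store argument is an assumption: a Hash stands in for a key-value server.
class JsonKeyValueRepository
  def initialize(store = {})
    @store = store
  end

  def create(attributes)
    id = @store.size + 1
    record = attributes.merge(id: id)
    @store["record:#{id}"] = JSON.generate(record)
    record
  end

  def find(id)
    raw = @store.fetch("record:#{id}") { raise KeyError, "no record with id #{id}" }
    JSON.parse(raw, symbolize_names: true)
  end
end

# Callers cannot tell which implementation they were given.
[MemoryRepository.new, JsonKeyValueRepository.new].each do |repo|
  saved = repo.create(caption: "Sunset")
  repo.find(saved[:id]) # the same record comes back from either implementation
end
```

The dependency runs one way only: callers invoke the repository, and neither implementation calls back into the layer above it.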

00:13:45.559 As I mentioned earlier, it's important because the wider system is none the wiser about how these things are actually happening. This is an example of a use case from one of my applications, which is a JSON API serving an iOS app for sharing photos. As you can see, the state is passed in via the constructor, meaning these objects are easily testable. You pass in the form and state, and you can assert the results and any other potential side effects. This encourages a clean separation between the system's actual behavior and how the user interacts with it.
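The use case shown on the slide is not in the transcript. As a rough sketch of the shape described (form and state passed in through the constructor), the example below invents an `UploadPhoto` use case for a photo-sharing domain. All names, including `UploadPhotoForm` and `InMemoryPhotos`, are assumptions for illustration.

```ruby
# Hypothetical form: already sanitized by the time the use case sees it.
UploadPhotoForm = Struct.new(:caption, keyword_init: true)

# Minimal in-memory repository so the example is self-contained.
class InMemoryPhotos
  attr_reader :all

  def initialize
    @all = []
  end

  def create(attributes)
    record = attributes.merge(id: @all.size + 1)
    @all << record
    record
  end
end

# Hypothetical use case: takes the form and the state (the current user),
# plus the repository it should talk to.
class UploadPhoto
  class NotSignedIn < StandardError; end

  def initialize(form, current_user, photos:)
    @form = form
    @current_user = current_user
    @photos = photos
  end

  def run!
    raise NotSignedIn, "a signed-in user is required" if @current_user.nil?
    @photos.create(owner_id: @current_user[:id], caption: @form.caption)
  end
end

photos = InMemoryPhotos.new
form = UploadPhotoForm.new(caption: "Sunset")
UploadPhoto.new(form, { id: 7 }, photos: photos).run!
photos.all.size # => 1
```

Because nothing here knows about HTTP or a database, a test can construct the use case directly and assert on the result and on the repository's contents.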

00:14:16.360 This is especially critical as systems grow because requirements and functionalities tend to get tangled. There's an intermingling between how things are presented versus how the system actually works. In my current company, there's so much entanglement that most people do not actually understand what's supposed to happen; they just know how things are supposed to look.

00:14:48.679 Now, we've gone all the way down and come back up. I mentioned user interaction, and this is the delivery mechanism. The delivery mechanism instantiates the forms, handles state management in its medium, instantiates use cases, and displays the results in whatever way makes sense for that medium. It also handles failure scenarios in a medium-appropriate manner. I keep saying 'medium-appropriate' because we can have multiple delivery mechanisms.

00:15:30.720 For example, this could be a Sinatra route handler. Naturally, we all think in HTML and CSS, or whatever, so perhaps this one will display JSON. If there's an error, it will do some sort of flash message. But it's also important to note that we can extract this code into a command-line interface or any specific medium, using standard output or standard APIs. The point is that we can have multiple delivery mechanisms and interact with them in different ways because the actual application logic is decoupled from how it's presented.
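To make the idea of several delivery mechanisms over one use case concrete, here is a sketch with an invented `Greet` use case, a command-line front end, and a JSON front end. It deliberately avoids Sinatra so that it runs with the standard library alone; the `[status, body]` return value is an assumption that only imitates what an HTTP handler might produce.

```ruby
require "json"

# Hypothetical stand-in use case: pure application logic, no I/O.
class Greet
  class Invalid < StandardError; end

  def initialize(name)
    @name = name.to_s.strip
  end

  def run!
    raise Invalid, "name is required" if @name.empty?
    { greeting: "Hello, #{@name}!" }
  end
end

# Command-line delivery mechanism: reads argv, writes to the given streams,
# and reports failure through an exit status.
class CommandLine
  def initialize(out: $stdout, err: $stderr)
    @out = out
    @err = err
  end

  def call(argv)
    result = Greet.new(argv.first).run!
    @out.puts result[:greeting]
    0
  rescue Greet::Invalid => e
    @err.puts "error: #{e.message}"
    1
  end
end

# JSON delivery mechanism: same use case, failure reported as a status code.
class JsonEndpoint
  def call(params)
    result = Greet.new(params["name"]).run!
    [200, JSON.generate(result)]
  rescue Greet::Invalid => e
    [422, JSON.generate({ error: e.message })]
  end
end

JsonEndpoint.new.call("name" => "Ada") # => [200, "{\"greeting\":\"Hello, Ada!\"}"]
```

Each front end handles input, output, and failure in its own medium, while `Greet` stays unchanged.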

00:16:17.160 We're left with two primary tiers: the user-facing presentation layer and the application layer, which contains all the business logic. Naturally, within each layer, there are more responsibilities. The presentation layer will have view objects for logic and templates, and may even have serializers for powering a JSON API. It will most likely include middleware to handle a variety of tasks and a bunch of other utilities to make it easier for developers.

00:17:06.240 Now, the application layer will contain form objects, use cases, validators, repositories, and domain entities: everything that actually makes the application work. The repository acts as a boundary between the domain objects and the data layer. We hope to replace all dependencies on the outside world with boundaries. If the application needs to talk to Twitter, there's a boundary there. What about sending email? There's a boundary there too.

00:17:32.440 Most of us have probably already been exposed to this concept in some way without necessarily even noticing it. For instance, ActionMailer has had a test delivery method from the beginning. What does this do? It mimics the interface but stores all delivered emails in an array to allow for assertions. That's an example of a boundary at work.
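The same idea can be sketched by hand. The example below is not ActionMailer's API; `TestMailDelivery`, `WelcomeNotifier`, and the `deliver` signature are hypothetical names chosen to show a mail boundary whose test implementation records messages in an array.

```ruby
# Hypothetical test implementation of a mail boundary.
# It mimics the delivery interface but only records what was sent.
class TestMailDelivery
  attr_reader :deliveries

  def initialize
    @deliveries = []
  end

  def deliver(to:, subject:, body:)
    @deliveries << { to: to, subject: subject, body: body }
    true
  end
end

# Application code depends on the boundary, not on SMTP.
class WelcomeNotifier
  def initialize(mailer)
    @mailer = mailer
  end

  def welcome(email)
    @mailer.deliver(to: email, subject: "Welcome", body: "Thanks for signing up.")
  end
end

mailer = TestMailDelivery.new
WelcomeNotifier.new(mailer).welcome("ada@example.com")
mailer.deliveries.first[:to] # => "ada@example.com"
```

A production implementation with the same `deliver` method could talk to a real mail server without `WelcomeNotifier` changing.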

00:18:14.039 Now, all these decisions come together during testing. If you want to test your data layer, you can unit test that. If you want fast tests, swap the real implementations for memory-style or fake implementations. Everything is decoupled and isolated, putting us in a powerful position to leverage the 'O' in SOLID, which means open for extension and closed for modification.

00:19:02.560 Now it's possible for us to compose more complex flows or applications by using our existing domain entities. For example, when I worked on a CRM, I had some very complicated flows that were composed of different components. One flow was sending an email; if the user inputted an email address that didn’t exist in our customer database, they had to go through the use case of creating a customer first, then sending an email. Since I had already created the 'create customer' use case, I could simply instantiate it within the 'send email' process.
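The CRM flow described here might be sketched as follows. This is a reconstruction under assumptions, not the speaker's code: `CustomerDirectory`, `CreateCustomer`, `SendEmail`, and the array used as an outbox are all invented for the example.

```ruby
# In-memory stand-in for a customer repository.
class CustomerDirectory
  def initialize
    @by_email = {}
  end

  def find_by_email(email)
    @by_email[email]
  end

  def create(email:)
    @by_email[email] = { id: @by_email.size + 1, email: email }
  end
end

# Existing use case, usable on its own.
class CreateCustomer
  def initialize(email, customers:)
    @email = email
    @customers = customers
  end

  def run!
    @customers.create(email: @email)
  end
end

# Larger flow composed from the existing use case.
class SendEmail
  def initialize(email, body, customers:, outbox:)
    @email = email
    @body = body
    @customers = customers
    @outbox = outbox
  end

  def run!
    # Unknown address: go through the "create customer" use case first.
    customer = @customers.find_by_email(@email) ||
               CreateCustomer.new(@email, customers: @customers).run!
    @outbox << { customer_id: customer[:id], body: @body }
    customer
  end
end

customers = CustomerDirectory.new
outbox = []
SendEmail.new("new@example.com", "Hello", customers: customers, outbox: outbox).run!
customers.find_by_email("new@example.com")[:id] # => 1
```

Any side effects attached to `CreateCustomer` would run in both contexts, because the flow reuses the use case instead of duplicating its logic.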

00:19:36.480 All the things that happened, like sending notifications or emails, could just be reused in different contexts. Doing that with MVC involves a different implementation style. I shamelessly took these slides from Uncle Bob Martin's talk 'Architecture the Lost Years', as he is much better at making hand-drawn diagrams. But it covers most of the important points. His terminology is a little different; for instance, he uses 'interactor' where I say 'use case'. However, the principles remain consistent.

00:20:57.040 A user performs an action, and something goes across the delivery mechanism through the boundary. I think he's a little more enterprise-focused because there are request models and response models, but the key takeaway remains that the delivery mechanism handles instantiating and passing requests through the boundary down to the interactors or use cases, which coordinate validations and all the necessary steps to make everything happen.

00:21:34.440 Then it comes back through the boundary and out to the user in some context-appropriate manner. So why aren't more people doing this? I think there are many reasons. First, there are so many legacy applications that simply have to keep functioning, and it's not always a priority to spend significant time refactoring them or updating them. Second, it is genuinely hard to implement this architecture; it requires a lot more contextual overhead and upfront design.

00:22:23.120 You have to be aware of various different concepts, and you may have gone through the pain of not applying these principles before; it just becomes hard. Third, I don't think the Ruby community encourages people to think in this manner. That’s why I was happy to be given this opportunity to speak, to talk about this, and hopefully to make more people aware of how to better separate their concerns within their applications.

00:23:00.239 I think that the Ruby community fosters an overreliance on gems and various other tools. Yes, gems and frameworks allow us to create functionality, as we saw in the previous talk, but the question is—why? Why is this encouraged, and why do people think this way? Unfortunately, I believe that this mentality could be damaging to us in the long run.

00:23:37.440 For those of us who have more experience working on larger applications, we have to communicate to those who are just starting out that there is a different way to do these things. Yes, the guides and documentation will tell you to do things a certain way, which is beneficial in an introductory context, but eventually, it doesn’t pan out in the long run.

00:24:16.760 However, all of this isn't lost. We're here today, and hopefully, we're learning something. Now, I have two resources for you. First up is 'Rails Refactoring' by our lovable organizer, Andre. I was lucky enough to read a beta copy of the book, and I must say, the content is fantastic. He provides a wealth of useful refactorings you can employ to decouple parts of your application, enabling you to better apply these architectural patterns.

00:24:46.720 Next, I will self-promote a little because I also have a lot more content about this on my blog. I think it's a 10 or 11-part series, including plenty of technical examples and guidance. In addition, I have a 15-minute introductory screencast on how to start a greenfield project from the very beginning, incrementally adding functionality without making any ridiculous commitments to specific libraries. It's an example of how you can approach a project with a different mindset.

00:25:25.600 So, that's all I have to say. I hope everyone learned something, and now if there are questions, fire away!

00:25:31.360 Questions? So, don’t you think that one of the problems we have in Ruby as a language is the lack of interfaces? Because we can't clearly separate boundaries, and the issue is that we don’t have tools to deal with that easily. Sometimes I want to say, like in Java, 'interface so-and-so,' which is nice since the compiler will tell you everything.

00:26:13.440 But we can do it in Ruby; we just have to put in a little more legwork to make it happen. For example, I have a project that uses the Abstract Type gem and a few others, where I specify what the class should look like. I put a class in front of that to verify that if the method is called and doesn't respond, it fails with defined exceptions about mismatched type signatures. Sometimes it can seem unnecessary overhead, but I don't think it's a problem; it just requires us to be more aware of the data types we pass and their signatures.
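One way to do that legwork without any gem is sketched below. This is a hand-rolled check and not the Abstract Type gem's API; the `Interface` module, the `verify!` method, and the example classes are hypothetical. It fails early when an object is missing a method or has the wrong number of required arguments.

```ruby
# Hand-rolled interface check; not the API of any particular gem.
module Interface
  class NotImplemented < StandardError; end

  # signatures maps method names to their expected arity, e.g. { find: 1 }.
  def self.verify!(object, signatures)
    signatures.each do |name, arity|
      unless object.respond_to?(name)
        raise NotImplemented, "#{object.class} does not respond to ##{name}"
      end

      actual = object.method(name).arity
      unless actual == arity
        raise NotImplemented, "#{object.class}##{name} has arity #{actual}, expected #{arity}"
      end
    end
    object
  end
end

REPOSITORY_INTERFACE = { create: 1, find: 1 }.freeze

class GoodRepository
  def create(record)
    record
  end

  def find(id)
    nil
  end
end

class BrokenRepository
  def create(record)
    record
  end
end

Interface.verify!(GoodRepository.new, REPOSITORY_INTERFACE) # returns the object
# Interface.verify!(BrokenRepository.new, REPOSITORY_INTERFACE) would raise NotImplemented
```

Running such a check where implementations are wired together catches a mismatch at startup, although unlike a compiler it says nothing about argument or return types.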

00:26:51.840 Also, this is why testing becomes even more crucial; to combat this issue, I test those classes in isolation, and then in a whole integration test suite, I swap the different objects to ensure there’s no difference in how the implementations behave. I believe it’s not required at the language level; it just takes more effort from us as programmers to ensure that all objects collaborate correctly.

00:27:29.520 Questions? Hi, good morning. Excuse me for asking this, but is Rails productive or counterproductive regarding this approach? I feel that Rails is entirely counterproductive to anyone trying to write standard architecture.

00:27:58.480 I used to think Rails was awesome, but now I look at it and think it was a colossal mistake. The issue is I want to keep this discussion open-ended. Rails got to this point because everyone said, 'Use Rails; it's good,' but the reality is you need to look through the libraries and various frameworks to figure out what bits you actually want, then compose something from there.

00:28:46.600 I mean, the hardest code to maintain is somebody else's code. If you spend more time upfront crafting your own code, it becomes much easier to change and evolve over time. However, being overly reliant on a myriad of libraries can be detrimental. So to sum it up, I think you have to make your own decision.

00:29:11.840 Yes, Rails is good for certain classes of applications, but whenever I do any Ruby work now, I do not look to Rails to solve any of my use cases.

00:29:43.680 More questions? Who's next? So, for example, I like your principles, but creating a lot of classes with single responsibilities means we have to instantiate many of these classes. I've heard opinions that in Java it's very expensive to create instances; how about in Ruby? Is it expensive?

00:30:14.639 Well, I don’t know; that would depend on the garbage collector and various implementation details of the language. I’m just not equipped to answer that question. Perhaps a new book, 'Ruby Under the Microscope,' provides detailed insights into such aspects.

00:30:39.280 Sorry, I don’t have the knowledge to provide an answer to your question.

00:30:46.080 It’s fine; thanks! Okay, would you recommend this approach for every kind of project? I know early startups won’t have the luxury to develop this way.

00:31:16.960 I recommend this approach for projects where team members are concerned with creating long-term maintainable solutions, where you can get going quickly and still do things effectively. You can move fast, but one caveat is that to reach this understanding, people often have to fail first to recognize the essential abstractions.

00:31:55.920 But after some practice, teams may be able to start here. For very small applications like a blog, you might not need this approach. I'm not sure when the inflection point occurs, but there definitely is a separation.

00:32:26.520 Why do you think this is connected strictly to Rails? I think this isn't about Rails; it's about the Rails community. I wish to separate those two elements. There’s a way of doing things by default in Rails, and there’s a gem called Rails that's a great piece of code.

00:32:53.680 But the way Rails is often utilized may push productivity higher through gem use and simple functionality without developers realizing the robust underlying Ruby infrastructure that should be employed, such as understanding how Rack middleware works. People can become too reliant on everything happening magically, which isn’t ideal.

00:33:44.959 So I don’t think it's simply a problem with Rails; it’s indicative of the Rails ecosystem as a whole where the two are tightly coupled.

00:34:13.960 I got a question. You may be surprised, but some percentage of this audience is sort of on your side; I think you're preaching to the choir. My question is, assuming we already build use cases in applications, do you feel that, with Ruby being an object-oriented language that combines state and behavior, we should consider separating data from behavior?

00:34:35.840 Well, not yet. By doing applications with this style, I realized there are distinctly different infrastructure concerns. For example, part of an application may utilize a different language or need to scale independently. I haven’t reached that decision point just yet.

00:35:04.800 Last question? Hi! I found your discussion about the repository pattern interesting. Do you recommend a particular gem for that, or do you implement it yourself?

00:35:19.760 Well, naturally, I don’t reimplement it every time. I would recommend checking out my own project called Chassis, which is a collection of small, lightweight modules, including an implementation of the repository. In all my projects, I use that repository pattern.

00:35:54.760 I wouldn’t consider that work finished because I'm still finding opportunities for improvement. The repository pattern itself is fairly easy to create, yet the actual implementation is where most of the hard work lies. I can’t really reuse my implementation here.

00:36:26.469 Thank you very much; great talk!