Persistence Pays Off: A New Look At rom-rb

RubyConf AU 2017

http://www.rubyconf.org.au

00:00:12.719 Hi! First of all, I wanted to say that I'm incredibly excited to have this opportunity. It’s my first time in Australia, and here I am talking to you about Ruby. That’s fantastic! So, thanks for inviting me. It's a great honor to be here.

00:00:24.560 Since probably many of you don’t know me, let me give a short introduction. My name is Piotr, and I’m a software developer from Poland. I live in Częstochowa and I work a lot in open source, mostly on Ruby libraries.

00:00:37.520 You can find my profile on GitHub, where I go by the username 'solnic'. You can also follow me on Twitter and check out my blog, though I don’t write often there. Occasionally, I share insights about things I've learned or interesting libraries I’m working on—so feel free to check it out.

00:00:51.520 I work for Icelab, where we build web applications using Dry-RB and ROM-RB. Today, I will be talking about ROM-RB, specifically about persistence and why it matters.

00:01:14.960 Coincidentally, just two or three weeks ago, we released ROM 3.0, which is a huge release. This included the first stable version of the SQL Adapter and the first stable version of the ROM repository.

00:01:22.159 Today, we’re going to take a look at how it works. The plan for today is to focus on four big topics. The first one is why ROM-RB was created. I'll explain the motivation behind the project.

00:01:35.040 Then, I would like to discuss how it works by examining its main components, without going too deep into the details. After that, I’ll talk about how ROM-RB differs from other Ruby gems, as it is quite a unique gem.

00:01:49.200 Lastly, the final part will be a brief discussion about future challenges and how you can help. So, let’s get started with why ROM-RB was created.

00:02:01.520 This is a challenging question, and the simple answer is: to solve a problem. Obviously, ROM-RB is not just a toy project; it's a real project aimed at addressing a specific issue.

00:02:10.000 That issue is separating persistence from the domain layer. As a community, we have been exploring this problem since around 2009 or 2010. I recall the first discussions happening around that time.

00:02:24.319 We began discussing this issue primarily because of the increasing complexity observed in application development. One of the factors that contributed to this complexity was the use of Active Record.

00:02:37.000 Not just as a library, but the pattern itself, which sacrifices the separation of concerns. Consequently, we started looking into alternative methods for interacting with databases in Ruby.

00:02:49.200 One of the solutions proposed was to separate persistence from the domain layer, which aids in managing complexity. To understand why this is a problem worth solving, we need to examine the Active Record pattern.

00:03:01.520 Active Record operates under a significant assumption: it claims that your database schema is your domain model. However, this is misleading.

00:03:14.440 In reality, the two rarely line up: tables are designed around how data is stored, while a domain model reflects what the application actually does.

00:03:20.560 The pattern simply ignores the discrepancies between the two. When these discrepancies eventually surface, they can be quite challenging to address.

00:03:35.040 Many libraries in Ruby implement the Active Record pattern, with the most popular one being Active Record for Rails. This model is very object-oriented and appears appealing at first glance.

00:03:42.760 We often think in terms of objects: we create a project, modify a title, and save it. At that moment, it seems like there is no database involved. However, the underlying database is indeed present, and we intend to utilize it.

00:03:58.200 What often happens in several projects is that developers begin incorporating numerous custom queries. For example, in Discourse—a Rails-based discussion forum—many custom queries can become quite advanced.

00:04:10.400 It may not be easy to read all of them, but it is clearly complex code. Ultimately, developers create their own ad-hoc solutions to simplify portions of their systems.

00:04:19.160 Discourse, for instance, has implemented its own SQL Builder that allows them to define custom queries and receive mapped results into straightforward structs.

00:04:26.800 You might think this is surprising solely because Discourse is a large project; however, the reality is that any size project can encounter these complexities. I have worked on smaller projects, with around 8,000 to 10,000 lines of code, where the complexity of the domain dictated an equally complex database interaction.

00:04:40.560 This ultimately leads to the necessity of using your database effectively. It’s crucial to focus not on how we use the database, but on why efficient use of our database is important.

00:04:51.880 Simply put, we need to account for application-specific data. In dynamic, object-oriented languages such as Ruby, people often prioritize object design and message passing, viewing raw data as something undesirable.

00:05:06.000 But in reality, application data is central to our systems. All systems must process data: they take input, transform it, and store the output in a database. Therefore, managing and defining application data effortlessly becomes crucial.

00:05:24.160 Interestingly, even in applications using ORM technologies like Active Record, application data still exists, albeit less visibly. You need to dig into the code to decipher what it is doing. Active Record objects may start off looking straightforward, but as the code progresses, they become intertwined with various data formats and structures.

00:05:40.600 Since we aim to define our own application data, it should be easily retrievable from the database. When developers start crafting their own SQL Builders and maneuvering around their ORM, it signals that the ORM is no longer beneficial.

00:05:51.120 The same principle applies to data modifications. It's not uncommon for applications to receive input that doesn't align with the database structure. This necessitates transformation to ensure compatibility for persistence.

00:06:07.360 What we store in our database often diverges significantly from what we represent in our application. The way we design tables and specify relationships among them is often optimized for database performance, rather than application logic.

00:06:18.600 Additionally, maintaining data integrity is crucial for the health of applications. Fixing broken data in production can be both risky and stressful. The database itself provides powerful tools to ensure data integrity, and rather than relying solely on custom validation, we should utilize these mechanisms.

00:06:33.360 Unfortunately, the Active Record pattern neither provides nor encourages such mechanisms. It simplifies persistence without addressing the underlying concerns about data integrity or separating domain logic from database structure.

00:06:52.560 Thus, ROM-RB was specifically developed to tackle these challenges and provide a genuine alternative to conventional Ruby ORMs.

00:07:03.640 So, how does it work? First of all, ROM is not just a typical monolithic library; it is more accurately seen as a toolkit. It consists of core abstractions, along with higher-level abstractions built on top of these.

00:07:18.320 I will illustrate the top-level abstractions available today. There are primarily three main abstractions: repositories, relations, and change sets. Repositories are used to fetch application-specific data, utilizing relations that interact with the database.

00:07:31.760 Relations are provided by adapters, which means that every adapter can support distinct database-specific features.

00:07:42.680 This allows for optimization according to the specifics of each database. Moreover, change sets manage edits to our data.

00:07:52.760 Let’s start with repositories. Repositories encapsulate access to application data. If we were to build a simple blog in 15 minutes, we’d have posts.

00:08:05.000 In ROM, rather than defining models that are connected to the database, we create repositories for specific application concepts. For example, we would create a PostsRepository that connects to a Posts relation.

00:08:18.200 If we wanted to fetch posts by their slugs, we would define a method called `get` that accepts a slug string and uses the post relation to execute the query and retrieve a single result.

00:08:32.840 It’s important to note here that your application accesses data through repositories rather than performing ad-hoc queries directly. This reduces coupling between the application and the persistence layer.

00:08:44.400 In ROM, repositories are provided by a separate gem; they are not part of ROM's core, but using them is encouraged. This convention lets you keep database-specific DSLs out of your application.

00:08:55.040 Consequently, your application can utilize your own repository interface, simplifying the overall structure.
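The repository idea described above can be sketched in plain Ruby. This is not rom-rb's actual API: the in-memory `PostsRelation` class and its `by_slug` and `one!` methods are hypothetical stand-ins, used only to show how the application talks to a repository while the query itself stays inside the relation.

```ruby
# Plain-Ruby sketch of the repository idea. This is NOT rom-rb's API:
# PostsRelation, by_slug and one! are hypothetical, in-memory stand-ins.

# Stand-in for a relation: it wraps the data source and owns the queries.
class PostsRelation
  def initialize(rows)
    @rows = rows
  end

  # Return a new relation restricted to rows with the given slug.
  def by_slug(slug)
    self.class.new(@rows.select { |row| row[:slug] == slug })
  end

  # Return the only row, or raise if there is not exactly one.
  def one!
    raise ArgumentError, "expected exactly 1 row, got #{@rows.size}" unless @rows.size == 1

    @rows.first
  end
end

# The repository is the only interface the application uses.
class PostsRepository
  def initialize(posts)
    @posts = posts
  end

  # Fetch a single post by its slug.
  def get(slug)
    @posts.by_slug(slug).one!
  end
end

posts = PostsRelation.new([
  { id: 1, slug: "hello-world", title: "Hello World" },
  { id: 2, slug: "rom-3-released", title: "ROM 3.0 Released" }
])

repository = PostsRepository.new(posts)
puts repository.get("rom-3-released")[:title]
```

Because the application only ever calls `get`, the relation behind it could be swapped for a different backend without touching the calling code.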

00:09:08.720 Repositories utilize relations, which encapsulate database queries. This means that relations can be defined as classes that are tailored to a specific database backend.

00:09:23.320 Here’s a relation that’s configured for SQL. A relation has schemas that define attributes with their respective types and all database-specific information.

00:09:33.240 In SQL, this corresponds to database tables and columns, and we can infer these attributes, which eliminates the need to manually type them out.

00:09:47.560 We can also define relationships. For instance, we can specify that a post relation belongs to users and has many tags. Notice that the class is named `Posts`; it’s plural, reflecting that relations represent collections of data.

00:10:01.760 Typically, you will define your own methods for relations. For instance, we create an `index` method. It is an instance method, and it is chainable, much like Active Record scopes.

00:10:15.440 These methods can also be composed, allowing you to create more complex queries and nested data structures.
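Chainable relation methods can be illustrated with a small plain-Ruby sketch. It does not use rom-rb; the `published` method and the sorting behaviour given to `index` are assumptions made for the example. The point is only that every query method returns a new relation, which is what makes chaining possible.

```ruby
# Plain-Ruby sketch, not rom-rb's API. The `published` method and what
# `index` does here (sort by title) are assumptions for illustration.
class Posts
  include Enumerable

  def initialize(rows)
    @rows = rows.dup.freeze
  end

  def each(&block)
    @rows.each(&block)
  end

  # Each query method returns a new Posts instance, so calls can be chained.
  def published
    self.class.new(@rows.select { |row| row[:published] })
  end

  def index
    self.class.new(@rows.sort_by { |row| row[:title] })
  end
end

posts = Posts.new([
  { title: "Zebra", published: true },
  { title: "Apple", published: false },
  { title: "Mango", published: true }
])

titles = posts.published.index.map { |row| row[:title] }
p titles
```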

00:10:25.080 Let’s explore a feature that provides an overview of posts. We can create a query to fetch exactly what we need, which includes selecting specific columns, such as ID, title, and attributes concatenated together.

00:10:38.760 Instead of loading the entire User entity and risking an N+1 query problem, we can join with the Users table and filter down to just the required columns.

00:10:55.279 As a result, implementing this overview feature leads to a more efficient query that uses a repository method to retrieve the necessary data with minimal overhead.

00:11:07.680 If we decide to enhance the overview with sorting capabilities, we can easily accomplish this using a simple method call. For instance, we can add a `sort` keyword and invoke the `order` method.

00:11:20.320 This allows us to sort posts based on the specified attributes, regardless of whether those attributes physically exist in the database.
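As a rough illustration of the overview query, here is an in-memory sketch in plain Ruby. It is not rom-rb code: `OverviewQuery` is a hypothetical name, and the assumption that the concatenated value is an `author` built from `first_name` and `last_name` is mine. It shows the three steps discussed above: join posts with users, keep only the needed columns, and sort by an attribute that is not a real column.

```ruby
# Plain-Ruby sketch, not rom-rb's API. OverviewQuery is a hypothetical name,
# and building :author from first_name + last_name is an assumption.
module OverviewQuery
  SORTABLE = %i[id title author].freeze

  # Join each post with its user and keep only what the overview needs.
  def self.call(posts, users, sort: :title)
    raise ArgumentError, "cannot sort by #{sort}" unless SORTABLE.include?(sort)

    users_by_id = users.each_with_object({}) { |user, index| index[user[:id]] = user }

    rows = posts.map do |post|
      user = users_by_id.fetch(post[:user_id])
      {
        id: post[:id],
        title: post[:title],
        # Virtual attribute: not a column, built from two user columns.
        author: "#{user[:first_name]} #{user[:last_name]}"
      }
    end

    rows.sort_by { |row| row[sort] }
  end
end

posts = [
  { id: 1, title: "Relations", user_id: 2, body: "..." },
  { id: 2, title: "Changesets", user_id: 1, body: "..." }
]
users = [
  { id: 1, first_name: "Jane", last_name: "Doe" },
  { id: 2, first_name: "Adam", last_name: "Smith" }
]

OverviewQuery.call(posts, users, sort: :author).each { |row| puts row.inspect }
```

In a real SQL-backed relation the join, projection and ordering would be done by the database in a single query; the sketch only mirrors the shape of the result.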

00:11:32.520 Let’s take this further by allowing users to filter by attributes of individual relations, such as tags. To implement this, we can extend our overview method to accept a filter hash that represents nested relations and their attributes.

00:11:46.280 If our input isn't a hash, we maintain original behavior, but if it is a hash, we can reduce it accordingly, applying additional restrictions as specified.

00:12:01.680 It's essential to note that we are not passing symbols and values; instead, we are directly asking the relation for specific attributes, with built-in validation ensuring they exist.

00:12:15.120 We might also want to filter by an attribute like 'author' that doesn't exist directly in the database. Since we’ve defined it as a virtual attribute, operations like this still work.

00:12:29.920 By running manual queries, we can compose elegant SQL that performs filtering based on dynamic input, making it capable of handling complex queries with less effort.
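The filter-hash approach can be sketched without a database. The following plain-Ruby example is an assumption-laden illustration, not rom-rb's API: `FilterableOverview`, its `SCHEMA` constant and the `attribute` and `where` methods are hypothetical. It shows a nested hash being reduced into successive restrictions, with attribute names validated against a schema instead of being passed through blindly.

```ruby
# Plain-Ruby sketch, not rom-rb's API. FilterableOverview, SCHEMA,
# attribute and where are hypothetical names used for illustration.
class FilterableOverview
  # Known attributes per relation; anything else is rejected.
  SCHEMA = { posts: %i[title author], tags: %i[name] }.freeze

  def initialize(rows)
    @rows = rows
  end

  def to_a
    @rows
  end

  # Look up an attribute, failing loudly if the relation does not define it.
  def attribute(relation, name)
    known = SCHEMA.fetch(relation) { raise KeyError, "unknown relation: #{relation}" }
    raise KeyError, "unknown attribute: #{relation}.#{name}" unless known.include?(name)

    [relation, name]
  end

  # Return a new overview restricted to rows where the attribute equals value.
  def where(relation, name, value)
    key = attribute(relation, name)
    self.class.new(@rows.select { |row| row.dig(*key) == value })
  end

  # criteria looks like { tags: { name: "ruby" }, posts: { author: "Jane Doe" } }.
  # Anything that is not a Hash leaves the overview unchanged.
  def filter(criteria)
    return self unless criteria.is_a?(Hash)

    criteria.reduce(self) do |overview, (relation, conditions)|
      conditions.reduce(overview) do |restricted, (name, value)|
        restricted.where(relation, name, value)
      end
    end
  end
end

rows = [
  { posts: { title: "Relations", author: "Jane Doe" }, tags: { name: "ruby" } },
  { posts: { title: "Changesets", author: "Adam Smith" }, tags: { name: "ruby" } },
  { posts: { title: "Borrowing", author: "Jane Doe" }, tags: { name: "rust" } }
]

overview = FilterableOverview.new(rows)
result = overview.filter({ tags: { name: "ruby" }, posts: { author: "Jane Doe" } })
puts result.to_a.map { |row| row[:posts][:title] }.inspect
```

Adding a new relation or attribute only means extending the schema; the `filter` method itself does not change.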

00:12:44.000 Therefore, when we review our approach, we can see how flexible and powerful it can be. You can expand this functionality by adding new relationships or attributes without disturbing previously established filters.

00:13:00.720 Ultimately, this flexibility allows you to focus more on solving application-specific problems rather than getting bogged down by the intricacies of the database.

00:13:14.160 Working with plain objects matters too. With ROM, if the built-in structs don't meet our needs, we can always have our repositories return custom objects.

00:13:26.960 For instance, we can define our classes and retrieve more complex structures that fulfill our unique requirements, rather than relying solely on built-in structs.

00:13:40.320 Now, let’s discuss how we can modify data within our framework. In ROM, we use change sets, a dedicated component aimed at easing the creation and updating of records.

00:13:56.480 A basic example is asking a repository for a change set filled with the data for a new record.

00:14:09.920 You can then commit this change set, and it returns whatever the database gives back.

00:14:20.000 Two key benefits of using change sets are data association and transformations. Libraries like Active Record are excellent when it comes to associations, and ROM utilizes change sets for this.

00:14:35.960 Let’s say we open a transaction that creates a user and then a post, associating the post with that user. When the transaction completes, we get back a post with the user ID assigned.
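To make the association step concrete, here is a minimal in-memory sketch in plain Ruby. It does not reproduce rom-rb's changeset or transaction API; `MemoryStore`, `Changeset`, `associate` and the `:user_id` foreign key are hypothetical names. It shows a user and a post being created inside one transaction, with the post picking up the user's ID.

```ruby
# Plain-Ruby sketch, not rom-rb's API. MemoryStore, Changeset and
# associate are hypothetical stand-ins for illustration only.
class MemoryStore
  def initialize
    @tables = { users: [], posts: [] }
  end

  def rows(table)
    @tables.fetch(table)
  end

  # Store a row and return it with a generated id, as a database would.
  def insert(table, tuple)
    row = { id: rows(table).size + 1 }.merge(tuple)
    rows(table) << row
    row
  end

  # Run the block; restore the previous rows if it raises.
  def transaction
    snapshot = @tables.transform_values(&:dup)
    yield
  rescue StandardError
    @tables = snapshot
    raise
  end
end

class Changeset
  def initialize(store, table, data)
    @store = store
    @table = table
    @data = data
  end

  # Return a new changeset whose data carries the parent's id as a foreign key.
  def associate(parent_row, foreign_key)
    self.class.new(@store, @table, @data.merge(foreign_key => parent_row.fetch(:id)))
  end

  def commit
    @store.insert(@table, @data)
  end
end

store = MemoryStore.new

post = store.transaction do
  user = Changeset.new(store, :users, { name: "Jane" }).commit
  Changeset.new(store, :posts, { title: "Hello" }).associate(user, :user_id).commit
end

puts post.inspect
```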

00:14:47.440 Another substantial reason for employing change sets is for data transformation—this is a core concept within ROM.

00:15:03.360 Consider the situation where data arrives in a hash structure: for example, we may receive an author key, not split into first and last names.

00:15:18.960 Utilizing change sets, we can create custom classes, where we define transformations; for example, splitting that single value into two distinct ones.

00:15:32.720 Essentially, this method provides support for applying complex changes by using ROM’s inbuilt functionality, streamlining significant portions of our code.
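The transformation step can be sketched as follows. This is plain Ruby rather than rom-rb's changeset API, and `InMemoryTable`, `NewUserChangeset` and the `email` field are hypothetical. It shows incoming data with a single `author` value being reshaped into `first_name` and `last_name` before it is persisted.

```ruby
# Plain-Ruby sketch, not rom-rb's API. InMemoryTable, NewUserChangeset
# and the :email field are hypothetical names for illustration.
class InMemoryTable
  attr_reader :rows

  def initialize
    @rows = []
  end

  # Store a row and return it with a generated id, as a database would.
  def insert(tuple)
    row = { id: @rows.size + 1 }.merge(tuple)
    @rows << row
    row
  end
end

class NewUserChangeset
  def initialize(table, data)
    @table = table
    @data = data
  end

  # The data as it will be persisted: :author is split into two columns.
  def to_h
    first_name, last_name = @data.fetch(:author).split(" ", 2)
    rest = @data.reject { |key, _value| key == :author }
    rest.merge(first_name: first_name, last_name: last_name)
  end

  # Persist the transformed data and return what the table hands back.
  def commit
    @table.insert(to_h)
  end
end

table = InMemoryTable.new
changeset = NewUserChangeset.new(table, { author: "Jane Doe", email: "jane@example.com" })

puts changeset.to_h.inspect
puts changeset.commit.inspect
```

Keeping the transformation in `to_h` means the input can be inspected in its persistable form before anything is written.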

00:15:45.040 The flexibility of change sets also means they can manage arrays of hashes or any object that can convert into a hash. This allows for batch creation of multiple users and other records.

00:15:59.840 Under the hood, we enhance performance further with multi-insert capabilities, regardless of the database in use, optimizing how we tackle multiple records.

00:16:14.239 This was a quick introduction to the primary features of ROM. There are numerous functionalities that extend beyond what I've covered, and I'd like to point out how ROM distinguishes itself from typical Ruby libraries.

00:16:29.040 ROM is unusual in that it adopts functional programming ideas extensively; the core of the gem is quite functional.

00:16:42.480 An exceptional illustration of this is how relations, which are crucial for data retrieval, operate in a functional manner. You can call a relation directly to fetch data, resulting in what we refer to as a 'loaded relation'.

00:16:57.200 This design element is intentional; it enables composition of relations, where you can combine them as functions, significantly simplifying complex queries.

00:17:11.280 For instance, if you request an aggregate, the underlying process composes relations into nested data structures, which streamlines database interactions.
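The idea of relations as callable, composable objects can be sketched in plain Ruby. This does not mirror rom-rb's internals: the `Loader` class, its `>>` operator and the sample data are assumptions made for the example. It shows two callables being composed so that the output of the first feeds the second, producing a nested structure.

```ruby
# Plain-Ruby sketch, not rom-rb's internals. Loader and its >> operator
# are hypothetical; they only illustrate function-style composition.
class Loader
  def initialize(&fn)
    @fn = fn
  end

  # Calling the object runs it and returns the loaded data.
  def call(input = nil)
    @fn.call(input)
  end

  # Compose left to right: the output of self feeds the next loader.
  def >>(other)
    Loader.new { |input| other.call(call(input)) }
  end
end

USERS = [{ id: 1, name: "Jane" }, { id: 2, name: "Joe" }].freeze
POSTS = [{ user_id: 1, title: "Hello" }, { user_id: 1, title: "ROM" }].freeze

users = Loader.new { |_input| USERS }

# Takes loaded users and nests each user's posts under a :posts key.
with_posts = Loader.new do |loaded|
  loaded.map do |user|
    user.merge(posts: POSTS.select { |post| post[:user_id] == user[:id] })
  end
end

aggregate = users >> with_posts
aggregate.call.each { |user| puts user.inspect }
```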

00:17:24.080 Simultaneously, ROM maintains its object-oriented nature; it employs objects extensively throughout its code.

00:17:37.440 For example, repositories are objects, as are the relations and their schemas, which incorporate attributes. Each of these layers contributes to a comprehensive, object-oriented structure.

00:17:52.000 In essence, blending functional programming with an object-oriented approach has yielded significant successes within this project.

00:18:07.200 This design paradigm has profoundly influenced others, leading to the creation of complementary gems such as Dry-RB.

00:18:21.200 This intertwining of abstractions facilitates the development of gems that are inherently composable, leading to a lighter codebase.

00:18:34.640 In fact, if you look at the ROM core library, it comprises 3,322 lines of code, while the SQL adapter consists of 212 lines.

00:18:48.560 Moreover, my favorite, the repository gem, with its numerous advanced features, clocks in at only 180 lines. When we tally all these components, it totals around 6,504 lines, which is impressively concise for the functionality provided.

00:19:04.480 I genuinely believe that gems designed with composability in mind enhance the Ruby ecosystem. We already have significant examples, such as Hanami and Trailblazer, illustrating the benefits of gem compositional architecture.

00:19:19.680 However, achieving such systems isn't effortless; you can't simply declare that your project will be composable overnight.

00:19:36.720 It requires following certain principles, starting with avoiding monkey patching. A gem that relies on monkey patches cannot be considered reusable.

00:19:50.760 Reducing global state is another crucial aspect, as many Ruby libraries depend heavily on mutable global state, complicating their reusability.

00:20:05.840 Additionally, achieving clear separation of concerns is paramount, requiring a thoughtful understanding of the abstractions involved.

00:20:20.240 It also involves favoring objects over classes and modules. Libraries are generally easier to reuse if they focus on providing objects rather than relying on classes or modules.

00:20:35.040 Ultimately, composition should be favored over inheritance, as classes tend to come bundled with global state, resulting in complexity.
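A short plain-Ruby sketch can contrast the two styles named in these principles. None of it comes from ROM or any other library; `GlobalDb`, `Connection` and `UserRepo` are hypothetical names. It shows why class-level mutable state is hard to reuse, and how passing frozen objects around avoids the problem.

```ruby
# Plain-Ruby sketch; GlobalDb, Connection and UserRepo are hypothetical.

# Global, mutable state: every caller shares one setting and can change it.
class GlobalDb
  class << self
    attr_accessor :url
  end
end

# Object-based alternative: each connection is an independent, frozen value.
class Connection
  attr_reader :url

  def initialize(url)
    @url = url.dup.freeze
    freeze
  end
end

# The repository is composed with a connection instead of reaching for a global.
class UserRepo
  def initialize(connection)
    @connection = connection
  end

  def source
    @connection.url
  end
end

# With the global, the second assignment silently affects every user of GlobalDb.
GlobalDb.url = "postgres://localhost/app"
GlobalDb.url = "sqlite://test.db"
puts GlobalDb.url

# With objects, two differently configured repositories coexist safely.
app_repo = UserRepo.new(Connection.new("postgres://localhost/app"))
test_repo = UserRepo.new(Connection.new("sqlite://test.db"))
puts app_repo.source
puts test_repo.source
```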

00:20:47.760 Real-world examples abound, demonstrating the advantages of gem composition. For example, ROM uses Sequel as the backend for its SQL adapter.

00:21:01.600 It employs dry-types as the foundation for relation schemas, which simplifies the creation and management of attributes.

00:21:16.480 The dry-initializer is used across multiple ROM gems to simplify complex constructors, ensuring robustness through type validation.

00:21:31.360 Sequel’s strength lies in its lack of global state; when you connect to a database, it hands back a connection object without any global dependencies.

00:21:47.760 In fact, Sequel 5.0 will make all datasets immutable by default, further reinforcing the design principles that we’ve built into ROM.

00:22:03.360 In ROM, all data structures are already frozen, improving stability and reducing potential errors when interacting with the database.

00:22:16.840 Additionally, Sequel provides elegant abstractions for representing various SQL expressions.

00:22:30.240 The same applies to Hanami, which also avoids global state and uses objects consistently within its framework.

00:22:44.560 Since every aspect of ROM is treated as a first-class citizen, it provides ample opportunity for extensibility without the concerns of intrinsic state.

00:22:59.200 Looking ahead, as we work on extending database support, completing the adapters continues to be our greatest challenge.

00:23:13.320 We currently have several adapters running in production, including rom-http, rom-dynamo, and rom-couchdb, yet we still have many more in prototype stages.

00:23:26.480 While ROM is already quite efficient, there is still significant room for speed enhancements in various areas.

00:23:39.840 For instance, utilizing the transproc gem for in-memory data manipulations will inherently speed up the process.

00:23:52.960 Also, we intend to optimize the performance of fetching single objects, enhance support for prepared statements, and commit to making numerous micro-optimizations.

00:24:05.680 Documentation is also critical. With the ongoing effort to treat all public APIs as first-class components, ensuring their full documentation represents a never-ending journey.

00:24:18.960 We are always eager for contributors to assist in the documentation process. However, one of the simplest ways you can help is by using ROM.

00:24:32.080 Even if you don’t have a pressing requirement, give it a try and provide us with your feedback, which will be instrumental in enhancing ROM.

00:24:46.320 And that’s all I have for today. Thank you very much!