Ruby Preserves

by Craig Buchek

In this video titled 'Ruby Preserves', Craig Buchek discusses his development of a lightweight Object-Relational Mapper (ORM) for Ruby, which he refers to as a micro ORM. The presentation addresses the complexities commonly associated with traditional ORMs and seeks to explore the utility of a simpler ORM that doesn't abstract SQL operations away from developers. Key points discussed include:

Definition and Purpose of ORM: Buchek explains that an ORM serves to bridge the differences between SQL databases and object-oriented programming in Ruby, addressing the 'impedance mismatch' issue.
Inspiration for Ruby Preserves: He describes his dissatisfaction with existing ORMs and his journey in creating his own, emphasizing the learning process inherent in developing new software.
Simplicity versus Functionality: Buchek poses a critical question about how simple an ORM can be while remaining useful, suggesting the adoption of SQL directly rather than abstracting it away.
Comparison to Active Record: He contrasts his approach with Active Record, highlighting the complexities and code bloat associated with it, advocating for a more streamlined method of ORM development.
Data Mapper Pattern: The presentation introduces the Data Mapper pattern, which offers a clearer separation between domain logic and data access. Buchek argues that this pattern can lead to better scalability and a more manageable codebase.
Implementation Details: He outlines specific features of Ruby Preserves, such as direct SQL execution, relationships implementation ('has many' and 'belongs to'), and performance considerations like eager loading to address the 'N + 1 query problem'.
Future of Ruby Preserves: Buchek shares his vision for the future development of Ruby Preserves, including automatic mapping of database structures and exploring other paradigms for ORM design.
Final Thoughts and Acknowledgments: He concludes by reflecting on the feedback received and the ongoing challenges he faces in ORM development, inviting further interaction from the audience.

Overall, the presentation emphasizes the potential for simplicity in ORM design, the importance of understanding SQL fundamentals, and the need for continuous improvement and learning in software development.

00:00:14.130 Hi, welcome! My talk today is on a normal ORM called Ruby Preserves. My name is Craig Buchek. I'm an independent web developer, and I've been using Rails and Ruby since about 2006. I'm also a big fan of Agile and host a podcast called 'This Agile Life'.

00:00:21.010 Last year, I started writing a Ruby ORM, and it's surprisingly small. You could call it a micro ORM—it's missing some features that most ORMs have, but it includes some features that other micro ORMs don’t.

00:00:33.640 First, let's clarify what an ORM is. An ORM, or Object-Relational Mapper, bridges the gap between SQL databases, which deal with relations, and Ruby, which deals with objects. These two systems work differently, and an ORM helps bring them together.

00:00:45.400 However, there are caveats. One issue people often mention is called 'impedance mismatch' because things don’t always work the same way. For example, a tree structure in an object-oriented language is relatively simple to implement with pointers or links, but this isn’t so straightforward in SQL.
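The tree example can be made concrete with a short sketch. The `Node` struct, the `ROWS` data, and the `build_tree` helper are all hypothetical names chosen for illustration; the rows imitate a table that stores each node with a `parent_id` column.

```ruby
# In Ruby, a tree is just objects holding references to other objects.
Node = Struct.new(:id, :name, :children) do
  def initialize(id, name, children = [])
    super
  end

  # Walking the tree is a plain recursive method call.
  def descendant_names
    children.flat_map { |child| [child.name, *child.descendant_names] }
  end
end

# In a relational table the same tree is flat: each row only knows its parent.
# (Hypothetical data standing in for the result of a SELECT.)
ROWS = [
  { id: 1, parent_id: nil, name: "root" },
  { id: 2, parent_id: 1,   name: "left" },
  { id: 3, parent_id: 1,   name: "right" },
  { id: 4, parent_id: 2,   name: "leaf" }
].freeze

# Bridging the two shapes means rebuilding the object links from the keys.
def build_tree(rows)
  nodes = rows.to_h { |row| [row[:id], Node.new(row[:id], row[:name])] }
  root = nil
  rows.each do |row|
    node = nodes.fetch(row[:id])
    if row[:parent_id].nil?
      root = node
    else
      nodes.fetch(row[:parent_id]).children << node
    end
  end
  root
end

puts build_tree(ROWS).descendant_names.inspect # ["left", "leaf", "right"]
```

The Ruby side follows references directly, while the table side needs an extra reconstruction step; that extra step is one small instance of the mismatch.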

00:01:09.310 Why would I tackle such a daunting task as writing an ORM? For starters, I'm not satisfied with any of the existing Ruby ORMs. I wanted to explore and learn more about writing my own. My colleague, Amos King, often encourages people to write their own ORM, and I foolishly decided to listen to him. Perhaps I could learn enough to create my ideal ORM someday.

00:01:48.880 They say that if you want to create something good, you should write one, throw it away, and then write a new one. In a way, that's what this project is—I'm making mistakes to learn from them, so hopefully, the second try will have a better architecture.

00:02:07.180 A critical question is: how simple can we make an ORM while still making it useful? Every ORM I've used has a DSL (Domain-Specific Language) to help write SQL. The problem is, you often end up having to write some SQL manually anyway. This is known as a leaky abstraction, meaning the abstraction doesn’t always work.

00:02:21.700 So, what if we made leaky abstraction the norm and just used SQL? I began designing the ORM based on some strong opinions. The thing that frustrates me most about Active Record is having to look in two places for information—relationships are defined in the model class, while attributes are defined in the database schema.

00:02:50.480 While Active Record is well-tested and easy to start with, I am primarily in the 'hate' camp regarding its complexity. Many of us are not well-versed enough in SQL to know when we should avoid it. The team that created SQL is very knowledgeable, and I don't think I'm smarter than them in this regard.

00:03:31.130 SQL, particularly PostgreSQL, can do just about anything including most of what a NoSQL database does. A great article by Sarah Mei discusses why you should avoid using MongoDB. The gist is that you can end up in a corner you won't see until about a year later.

00:03:49.160 I spoke with someone at Stripe who uses MongoDB as their primary storage and they encountered this problem. They now continuously copy data from MongoDB into a PostgreSQL database for ad-hoc queries. This presents a dilemma: how many of you have ever switched database vendors in a single project? Only a few hands go up. I think it’s not an easy change.

00:04:39.340 Why prepare for something that's probably not going to happen? Furthermore, modern developer workstations can run full database systems, making it practical to develop with what’s in production. If you’re using PostgreSQL in production, you should try to use PostgreSQL in development too.

00:05:03.830 Active Record is a fixture in Ruby applications; it’s well known and widely used. Domain logic refers to the things in our application and how they interact, and in Rails that logic is built on the Active Record pattern. When referring to 'Active Record' with a space, I mean the ORM that comes with Rails, whereas 'Active Record' without a space denotes the pattern itself.

00:05:30.060 The footprint of Active Record is significant; it comprises about 210,000 lines of code out of the total lines in Rails. A lot of that code is complexity that many don’t need, which is why I created Ruby Preserves—an ORM with around 350 lines of code, making use of the basics.

00:06:21.560 Active Record can help with small, CRUD-based applications. However, as applications grow more complex, they often require alternative patterns. This leads us to the Data Mapper pattern, which separates domain logic from the data access layer. While there was a Ruby ORM called Data Mapper, it didn't effectively implement the Data Mapper pattern, resembling Active Record more closely.

00:07:25.550 In short, the Data Mapper pattern allows for better scalability and separation of concerns than the Active Record pattern. Domain models can focus solely on business logic while a separate layer handles data access, so the two concerns do not blend.
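A minimal sketch of that separation follows. It is not code from Ruby Preserves: `Account` and `AccountMapper` are hypothetical names, and a plain Hash stands in for a database table so the example runs without any gems.

```ruby
# Domain object: knows business rules, knows nothing about storage.
class Account
  attr_reader :id, :balance

  def initialize(id:, balance:)
    @id = id
    @balance = balance
  end

  def withdraw(amount)
    raise ArgumentError, "insufficient funds" if amount > balance

    @balance -= amount
    self
  end
end

# Mapper: moves Accounts in and out of storage and holds no business rules.
class AccountMapper
  def initialize(store)
    @store = store # assumption: a Hash of { id => row } standing in for a table
  end

  def find(id)
    row = @store.fetch(id)
    Account.new(id: row[:id], balance: row[:balance])
  end

  def save(account)
    @store[account.id] = { id: account.id, balance: account.balance }
    account
  end
end

store = { 1 => { id: 1, balance: 100 } }
mapper = AccountMapper.new(store)
mapper.save(mapper.find(1).withdraw(30))
puts store[1][:balance] # 70
```

In the Active Record pattern, `find`, `save`, and `withdraw` would all live on the same class; here the domain class can be tested without any storage at all.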

00:08:11.270 I started Ruby Preserves by writing the README file first, a method I refer to as README-driven development. Before diving into coding, I put all my motivations and thoughts in that document, outlining my vision for the ORM.

00:08:58.320 The high-level API I envisioned has changed somewhat since then, but it's effectively similar to the original concept. The API begins with defining the domain model, allowing us to work with plain old Ruby objects without requiring a database right away.

00:09:38.920 This approach allows you to write substantial parts of the application without persistence. It’s simpler than Active Record as it defines all field names in one place and avoids the additional complexity that comes with conventional Active Record structures.
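As a rough illustration of a domain model that needs no database, the sketch below uses a plain `Struct`. The `User` name and its fields are assumptions made for the example, not the Ruby Preserves API.

```ruby
# All field names are declared in one place, in Ruby, with no schema lookup.
User = Struct.new(:id, :name, :email, keyword_init: true) do
  # Domain behavior lives on the object and needs no persistence.
  def display_name
    name.to_s.empty? ? email : name
  end
end

user = User.new(name: "", email: "user@example.com")
puts user.display_name    # user@example.com
puts User.members.inspect # [:id, :name, :email]
```

Because nothing here touches storage, objects like this can be created and exercised in tests long before a table exists.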

00:10:32.680 One of the main components of Ruby Preserves is that it allows you to map the domain model to the database without excessive boilerplate. You define primary keys in every repository and can leverage built-in methods to fetch objects associated with those keys seamlessly.

00:11:20.460 While I haven't implemented persistence yet, the goal is to maintain simplicity. You can execute arbitrary SQL directly within Ruby and handle results efficiently, mimicking behaviors of Active Record but in a more concise format.
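One way a repository with a declared primary key and raw SQL might look is sketched below. Every name here (`Person`, `PersonRepository`, `FakeConnection`, the `people` table) is hypothetical. The connection is a stand-in that returns canned rows, with a method loosely shaped like the pg gem's `exec_params`, so the example runs without a database.

```ruby
Person = Struct.new(:id, :name, keyword_init: true)

# Stand-in for a database connection (an assumption for this sketch).
# It records the SQL it receives and returns canned rows as hashes
# with string keys.
class FakeConnection
  attr_reader :log

  def initialize(rows)
    @rows = rows
    @log = []
  end

  def exec_params(sql, params = [])
    @log << [sql, params]
    @rows
  end
end

class PersonRepository
  PRIMARY_KEY = "id"

  def initialize(connection)
    @connection = connection
  end

  # Fetch one object by primary key.
  def fetch(id)
    select("SELECT id, name FROM people WHERE #{PRIMARY_KEY} = $1", [id]).first
  end

  # Run arbitrary SQL and map each resulting row to a domain object.
  def select(sql, params = [])
    @connection.exec_params(sql, params).map do |row|
      Person.new(id: row["id"], name: row["name"])
    end
  end
end

connection = FakeConnection.new([{ "id" => 1, "name" => "Ada" }])
repository = PersonRepository.new(connection)
puts repository.fetch(1).name # Ada
puts connection.log.first.first
```

The SQL stays visible as SQL; the repository only adds the mapping from rows to objects.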

00:12:10.350 Relationships are usually not implemented in micro ORMs, but Ruby Preserves does implement 'has many' and 'belongs to' associations. The implementation took less than two hours each, but I had spent months contemplating the approaches.

00:14:03.180 The 'has many' relationship is represented through associated tables and foreign keys. When the associated objects need to be retrieved, the repository does so through simple queries. While this is a very straightforward implementation, there is still room for refinement.
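A 'has many' association built on a foreign key can be sketched as follows. This is an illustration under assumed names (`Author`, `Post`, `attach_posts`, an `author_id` column), not the Ruby Preserves implementation; the row hashes stand in for the result of a query such as `SELECT id, author_id, title FROM posts`.

```ruby
Author = Struct.new(:id, :name, :posts, keyword_init: true)
Post = Struct.new(:id, :author_id, :title, keyword_init: true)

# Map child rows to objects, group them by foreign key, and hand each
# parent its own group (or an empty list when it has no children).
def attach_posts(authors, post_rows)
  posts_by_author = post_rows
    .map { |row| Post.new(id: row["id"], author_id: row["author_id"], title: row["title"]) }
    .group_by(&:author_id)
  authors.each { |author| author.posts = posts_by_author.fetch(author.id, []) }
end

authors = [Author.new(id: 1, name: "Ada"), Author.new(id: 2, name: "Grace")]
post_rows = [
  { "id" => 10, "author_id" => 1, "title" => "First" },
  { "id" => 11, "author_id" => 1, "title" => "Second" }
]

attach_posts(authors, post_rows)
puts authors[0].posts.map(&:title).inspect # ["First", "Second"]
puts authors[1].posts.inspect              # []
```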

00:15:06.420 An important part of Ruby Preserves is keeping an eye on performance. The dreaded 'N + 1 query problem' is handled through eager loading to improve efficiency. If you find yourself working with Active Record, you should also consider using Bullet to monitor these N + 1 queries, allowing proactive optimizations.
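The difference between lazy and eager loading shows up in the number of queries issued. The sketch below uses a hypothetical `CountingDb` class that counts calls instead of talking to a real database. The author list is assumed to be loaded already, so the counts cover only the post queries: N for the lazy version, 1 for the eager version.

```ruby
# Stand-in database that counts how many queries it is asked to run.
class CountingDb
  attr_reader :query_count

  def initialize(posts)
    @posts = posts
    @query_count = 0
  end

  # Stands in for: SELECT * FROM posts WHERE author_id = $1
  def posts_for(author_id)
    @query_count += 1
    @posts.select { |post| post[:author_id] == author_id }
  end

  # Stands in for: SELECT * FROM posts WHERE author_id IN (...)
  def posts_for_all(author_ids)
    @query_count += 1
    @posts.select { |post| author_ids.include?(post[:author_id]) }
  end
end

POSTS = [
  { author_id: 1, title: "Intro" },
  { author_id: 1, title: "Follow-up" },
  { author_id: 3, title: "Notes" }
].freeze
AUTHOR_IDS = [1, 2, 3].freeze

# Lazy: one query per author.
def lazy_titles(db, author_ids)
  author_ids.to_h { |id| [id, db.posts_for(id).map { |post| post[:title] }] }
end

# Eager: one query for every author, then grouping in Ruby.
def eager_titles(db, author_ids)
  grouped = db.posts_for_all(author_ids).group_by { |post| post[:author_id] }
  author_ids.to_h { |id| [id, grouped.fetch(id, []).map { |post| post[:title] }] }
end

lazy_db = CountingDb.new(POSTS)
eager_db = CountingDb.new(POSTS)
lazy_titles(lazy_db, AUTHOR_IDS)
eager_titles(eager_db, AUTHOR_IDS)
puts lazy_db.query_count  # 3
puts eager_db.query_count # 1
```

Both versions return the same data; only the number of round trips differs, which is why the problem tends to stay hidden until the list of parents grows.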

00:16:16.450 I have made mistakes along the way with Ruby Preserves. For instance, I initially decided to generate SQL for relationships, which contradicted the core principle of using raw SQL. This resulted in inefficient queries, as well as unnecessary complexity. It took some trial and error, but ultimately I found a solution for these relationships that seems to work well.

00:17:28.610 Certainly, joins are one area I'm working to better understand. An issue arises when two tables have columns with the same name. SQL doesn't handle this automatically; it requires specifying each column in queries, necessitating manual tracking of what columns are called in SQL versus how they are represented in Ruby.
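One common way to handle the name collision is to alias every column in the SQL and then split the row by prefix in Ruby. The sketch below assumes hypothetical `users` and `teams` tables that both have `id` and `name` columns; `JOIN_SQL` and `split_row` are illustrative names, and the row hash stands in for what a driver would return for that query.

```ruby
# Each column gets an explicit alias so the two "id" and "name" columns
# stay distinguishable in the result set.
JOIN_SQL = <<~SQL
  SELECT users.id   AS user_id,
         users.name AS user_name,
         teams.id   AS team_id,
         teams.name AS team_name
  FROM users
  JOIN teams ON teams.id = users.team_id
SQL

# Split one joined row into per-table attribute hashes, stripping the
# alias prefix so the keys match each object's own field names.
def split_row(row, prefixes)
  prefixes.to_h do |prefix|
    attrs = row
      .select { |column, _value| column.start_with?("#{prefix}_") }
      .transform_keys { |column| column.delete_prefix("#{prefix}_") }
    [prefix, attrs]
  end
end

row = { "user_id" => 7, "user_name" => "Ada", "team_id" => 2, "team_name" => "Admins" }
parts = split_row(row, %w[user team])
puts parts["user"].inspect
puts parts["team"].inspect
```

The aliases are exactly the manual bookkeeping described above: the SQL names and the Ruby names have to be kept in step by hand.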

00:19:30.620 Most web applications don't need to display thousands of records at once, which often implies adding some kind of limit clause in your queries. While Ruby Preserves aims to keep things straightforward, there’s always room for further features and optimizations, chiefly around prepared statements that cache query plans to enhance performance.

00:20:55.310 Future ideas for Ruby Preserves include automatic mapping based on database structure and exploring similarities with Data Access Objects. While it's nowhere near ready for production, I still don’t have a standard ORM that I'm fully satisfied with.

00:22:36.400 My current thinking about ORMs is nuanced. Lotus is a relatively new entry with a Data Mapper implementation, but I have concerns about its maturity and how it handles scopes, as it may lead to similar issues seen in Active Record.

00:23:46.050 Perpetuity is another tiny gem that aims to simplify ORM usage but is still lacking support for relationships. Then there’s ROM, which uses a different paradigm that can prove challenging to wrap one’s head around but shows promise for functional programming enthusiasts.

00:25:27.470 SQL is fundamental to ORM design, though its handling can cause friction if not structured properly in Ruby, as the community often focuses on other paradigms. My final choice involves using Active Record with attribute declarations for model validation. These offer added assurances about attribute types.

00:27:28.350 In closing, I wish to thank everyone who provided feedback on Ruby Preserves, especially James Edward Gray II and Amos King. The slideshow was made using remark.js, while the UML diagrams utilized a well-known but now unmaintained program. I appreciate any feedback from you all. You can reach out to me on Twitter or GitHub or via email. The project is available on GitHub, along with the slides, which can also be accessed online.

00:30:15.190 Thank you!