ActiveRecord

Building an ORM with AReL: Walking up the (AS)Tree.

by Vipul A M

In this talk titled "Building an ORM with AReL: Walking up the (AS)Tree," Vipul A M walks through building a tiny Object-Relational Mapping (ORM) library called 'Storm' on top of ARel, a SQL AST manager for Ruby. The presentation aims to show how ActiveRecord generates complex SQL queries and to explore the foundational pieces of an ORM. Key points include:

  • Introduction to ORM: Vipul explains what an ORM is and its role in bridging the gap between databases and programming languages, particularly Ruby. He mentions ActiveRecord as an example of an ORM that simplifies database interactions.
  • Using ARel: ARel is introduced as the underlying framework that will assist with query generation in the ORM. ARel employs an abstract syntax tree (AST) to manage complex SQL queries involving operations such as joins and unions.
  • Basic ORM Structure: The talk includes a walkthrough on setting up the basic components of the ORM, including defining models, setting up database connections (using Postgres), and implementing essential functionalities like attribute access and type casting.
  • Key Features Implementation: The process includes defining tables with ARel, setting up a schema cache, managing column definitions, and establishing a primary key sequence for record insertion.
  • CRUD Operations: Vipul elaborates on implementing fundamental CRUD (Create, Read, Update, Delete) operations within the ORM, detailing how to use ARel's capabilities to interact with the database efficiently.
  • Demo and Testing: The presentation highlights a practical demonstration of building the ORM step-by-step, and talks about setting up tests that encompass various scenarios to ensure the ORM performs as expected.
  • Conclusion: By the end of the presentation, attendees gain a clearer understanding of both ARel and ActiveRecord, alongside the knowledge needed to build a simple ORM framework. Vipul emphasizes that while the ORM created is basic, it provides an excellent foundation for further development, potentially integrating advanced features in the future such as validations and migrations.

Overall, this talk serves as a thorough introduction to the mechanics behind ORMs and the role ARel plays in simplifying SQL generation, alongside practical coding examples relevant to Ruby developers.

00:00:14.000 Hello everyone! Today, I'll be speaking about building an ORM with ARel. My name is Vipul, and I was supposed to present alongside Prathamesh, but he chose to go to Ruby instead. It seems he's enjoying himself with lots of food, especially Japanese cuisine. I work at a software consultancy based out of Miami, and I work remotely. My work primarily involves Ruby on Rails, building web applications.
00:00:33.920 As I mentioned, I run Ruby India, a platform in India designed to help Ruby enthusiasts come together. We host content, including blogs, and I also do a newsletter called the Ruby India Newsletter, which you can check out at rubyindia.org. We recently started a podcast series at podcast.rubyindia.org, where we feature many community members. During the past month, I had the pleasure of interviewing prominent figures like Koichi Sasada and Fabio Akita. Feel free to check it out.
00:01:09.520 Now, about Ruby conferences in India: several are coming up, with the main one being RubyConf India. How many of you have heard of Goa? If you love Goa, you should definitely attend RubyConf India; you will stay at a beautiful resort. Additionally, there's Garden City RubyConf in Bangalore and Deccan RubyConf, which was held two months ago in my hometown, Pune. If you're planning to visit India, I encourage you to attend one of these events.
00:01:55.439 Before we dive in, let’s take a moment to celebrate the fact that Rails has been around for ten years now—let’s give a round of applause! We had a fantastic event celebrating the launch of Rails, during which a notable issue was raised in the Rails issue tracker. If you get a chance, do show your love for issue number 16731.
00:02:25.760 As someone from the Rails community might mention, when in San Francisco, it's essential to wear your own company’s T-shirt. I was wearing a Braintree T-shirt I got in Madison, and I was approached by numerous people asking, 'Where's your office?' or 'Do you know Jack?' It was quite embarrassing! So, make sure to wear your own T-shirts.
00:02:50.000 Returning to the presentation, I’ll be doing a code walkthrough. We will see a lot of hands-on code, so I hope you can keep up. There's a lot to cover, so I'll be moving a bit quickly. Before we start, we need to discuss how to create caches, but that has already been covered.
00:03:07.040 Now, why would we want to build our own ORM? The goal is to experiment and create a tiny ORM, which I’m calling 'Storm.' We will explore the various aspects involved in an ORM and gain a better understanding of how ActiveRecord operates. This way, we can identify potential bugs or hurdles related to ActiveRecord.
00:03:37.519 So, what exactly is an ORM? In formal terms, it's a technique for converting data between incompatible type systems, which in effect gives you a virtual object database. It creates a bridge between your database and your programming language, in our case Ruby. Within Ruby, there are two primary patterns for implementing ORMs: Data Mapper and Active Record. We'll focus on Active Record today. ActiveRecord wraps a row in a database table, allowing access to that row's data through instance methods while also enabling you to define your own domain logic.
00:04:23.840 We'll look at various components such as database connection, which we'll implement using Postgres, query generation where ARel will assist in generating queries, attribute access, and type casting mechanisms. Essentially, we want to represent a post in Ruby with attributes like ID, name, content, author, among others.
00:05:12.720 Our ORM will convert these fields into their respective Ruby types, like integers or strings, based on the values returned from the database. You can think of it like this: in ActiveRecord, creating a post would involve calling 'Post.new' with a hash of values, followed by operations like 'post.save' or 'Post.find'.
00:06:01.360 The role of ARel in this process is crucial: it handles the actual query generation. ARel, named after relational algebra, is primarily a SQL AST manager. It composes queries as an abstract syntax tree (AST), which lets it generate complex queries involving unions, joins, and other operations, and adapt the output to different databases without us having to worry about the specifics.
00:06:37.520 Every component of ARel follows a node-based approach, where each node in the AST has the information needed to construct SQL queries, such as 'where' or 'limit.' Node visitors traverse the AST and convert these nodes into strings for execution, while managers assist visitors in navigating the tree accurately, depending on the operation being performed—like insert, update, or delete.
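The node-and-visitor idea can be sketched without ARel at all. The following is a toy illustration in plain Ruby, not ARel's own classes: the names `ColumnNode`, `EqualityNode`, `LimitNode`, `SelectNode`, and `ToSqlVisitor` are hypothetical, and the quoting rules are an assumption modelled on Postgres.

```ruby
# Toy AST nodes. Each node only holds data; it knows nothing about SQL text.
ColumnNode   = Struct.new(:table, :name)
EqualityNode = Struct.new(:left, :right)
LimitNode    = Struct.new(:count)
SelectNode   = Struct.new(:table, :projections, :wheres, :limit)

# The visitor walks the tree and turns each node into a SQL fragment.
class ToSqlVisitor
  # Dispatch on the node's class name, e.g. SelectNode -> visit_SelectNode.
  def accept(node)
    send("visit_#{node.class.name.split('::').last}", node)
  end

  private

  def visit_SelectNode(node)
    columns = node.projections.map { |p| accept(p) }.join(", ")
    sql = "SELECT #{columns} FROM #{quote(node.table)}"
    unless node.wheres.empty?
      sql += " WHERE #{node.wheres.map { |w| accept(w) }.join(' AND ')}"
    end
    sql += " #{accept(node.limit)}" if node.limit
    sql
  end

  def visit_ColumnNode(node)
    "#{quote(node.table)}.#{quote(node.name)}"
  end

  def visit_EqualityNode(node)
    "#{accept(node.left)} = #{literal(node.right)}"
  end

  def visit_LimitNode(node)
    "LIMIT #{Integer(node.count)}"
  end

  # Double-quote identifiers (Postgres style).
  def quote(identifier)
    %("#{identifier}")
  end

  # Numbers pass through; everything else becomes an escaped string literal.
  def literal(value)
    value.is_a?(Numeric) ? value.to_s : "'#{value.to_s.gsub("'", "''")}'"
  end
end

id   = ColumnNode.new("posts", "id")
name = ColumnNode.new("posts", "name")
tree = SelectNode.new("posts", [id, name], [EqualityNode.new(id, 1)], LimitNode.new(1))
sql  = ToSqlVisitor.new.accept(tree)
# sql is: SELECT "posts"."id", "posts"."name" FROM "posts" WHERE "posts"."id" = 1 LIMIT 1
```

Swapping in a different visitor class is what would let the same tree render differently per database, which is the role the talk assigns to ARel's visitors.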
00:07:50.720 The first task for us is implementing a basic engine to provide ARel with information about our tabular structure, such as the column structure, types, and how to connect to the database. In ActiveRecord, this is handled using database adapters for SQLite, MySQL, or Postgres. Our engine’s role is to facilitate this process, ensuring ARel knows the data types and what kind of processing should occur.
00:08:52.680 Let’s move forward and take a look at the engine skeleton. This part isn't live coding and can become challenging at times. We're starting with a basic engine that currently lacks functionality but will grow with time. In our module, we define this engine that will gain more methods as we progress. We also have our model defined, showcasing fundamental CRUD operations.
00:10:06.399 One key note here is that we define Arel tables, using the name of the table we want to access from the database. We consistently append an 's' for pluralization. ActiveRecord employs ActiveSupport for such inflections, ensuring proper naming conventions.
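The append-an-'s' convention described here can be sketched as follows. The `Storm::Model` class and its `table_name` method are assumptions for illustration; the talk does not show the exact code.

```ruby
module Storm
  class Model
    # Naive pluralization: lowercase the class name and append "s".
    # Irregular plurals (Person -> people) are not handled here; that is
    # the job ActiveSupport's inflector does for ActiveRecord.
    def self.table_name
      "#{name.split('::').last.downcase}s"
    end
  end
end

class Post < Storm::Model; end

Post.table_name # "posts"
```

The resulting string is what would be handed to ARel as the table name.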
00:10:56.079 Next, we need to set up the database connection. We've defined our Postgres adapter, making use of the PG gem. Here, we have hard-coded our database name as 'storm_development.' In ActiveRecord, this is handled via a database configuration file. Visitors play a crucial role; they ensure the correct SQL syntax is applied when defining queries based on the database type.
00:12:10.560 Moving forward, we've created a class to manage column definitions. This class will hold attributes for columns, including SQL types, which might range from VARCHAR to FLOAT or DECIMAL. Our type casting method will ensure values passed between the database and the user are managed properly.
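A column definition with a type casting method might look roughly like the sketch below. The class shape and the regular expressions are assumptions rather than the talk's actual code, and DECIMAL handling is deliberately left out to keep the example dependency-free.

```ruby
class Column
  attr_reader :name, :sql_type

  def initialize(name, sql_type)
    @name = name
    @sql_type = sql_type
  end

  # Convert a raw value (the database driver typically returns strings)
  # into a Ruby value, based on the column's SQL type.
  def type_cast(value)
    return nil if value.nil?

    case sql_type
    when /\A(big|small)?int/i then Integer(value)
    when /float|double|real/i then Float(value)
    else value.to_s # varchar, text, and anything unrecognized
    end
  end
end

id_column = Column.new("id", "integer")
id_column.type_cast("42") # 42
```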
00:12:53.440 As we set up these basic structures, we’ll define a schema cache to avoid repeatedly querying the database for column information. This schema cache will be a simple hash, linking column information to the corresponding table.
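As a minimal sketch of such a hash-backed schema cache: the `SchemaCache` class below and its loader block are hypothetical, and the block stands in for whatever catalog query the adapter would run against Postgres.

```ruby
class SchemaCache
  # The block receives a table name and returns that table's columns.
  def initialize(&loader)
    @loader = loader
    @columns = {} # table name => column list
  end

  # Ask the loader only on the first request for a given table.
  def columns(table_name)
    @columns[table_name] ||= @loader.call(table_name)
  end

  def clear!
    @columns.clear
  end
end

lookups = 0
cache = SchemaCache.new do |_table|
  lookups += 1
  %w[id name content author] # stand-in for a real catalog query
end

cache.columns("posts")
cache.columns("posts")
lookups # 1, because the second call is served from the hash
```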
00:14:02.720 So far, we've established the foundational work for our Postgres adapter, defining how columns and types interact. Now we will provide basic tests to cover various scenarios of our ORM.
00:14:29.000 To start testing, we are defining our post model and inheriting from our ORM. We are setting the stage for basic tests, including where clauses and initializing the ARel tables. The ultimate objective is to ensure that when we do 'Post.new' and pass in certain values, we can read those values from the created instance.
00:15:08.640 We’ll also explore saving new records and deleting existing records in the upcoming slides. Is everyone with me so far? Great! Now, let's build on our basic structure as we define attributes.
00:15:24.800 Attributes are key components for accessing values within an instance. For example, after calling 'Post.new' and providing values for name, author, or subject, you should be able to access those values through attribute accessors like 'post.name.' We will ensure our attributes work seamlessly with both user-provided values and those sourced directly from the database.
00:16:19.200 To facilitate this interaction, we will support data types, specifically for strings and integers, allowing us to manage type casting effectively. We are establishing a type map to store definitions for these data types, enabling us to register and look up types as needed.
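A type map with register and lookup operations could be sketched like this. The class names (`ValueType`, `StringType`, `IntegerType`, `TypeMap`) are illustrative assumptions, not the names used in Storm or ActiveRecord.

```ruby
# Default type: passes values through untouched.
class ValueType
  def cast(value)
    value
  end
end

class StringType < ValueType
  def cast(value)
    value.nil? ? nil : value.to_s
  end
end

class IntegerType < ValueType
  def cast(value)
    value.nil? ? nil : Integer(value)
  end
end

class TypeMap
  def initialize
    @mapping = {} # pattern (String or Regexp) => type object
    @default = ValueType.new
  end

  def register(pattern, type)
    @mapping[pattern] = type
  end

  # Later registrations win; unknown SQL types fall back to the default.
  def lookup(sql_type)
    _, type = @mapping.to_a.reverse.find { |pattern, _| pattern === sql_type }
    type || @default
  end
end

type_map = TypeMap.new
type_map.register(/\A(big|small)?int/i, IntegerType.new)
type_map.register(/char|text/i, StringType.new)

type_map.lookup("integer").cast("42") # 42
```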
00:16:54.320 Now that we have our types in place, we can define attributes. Each attribute will consist of a name, a value before type casting, and its corresponding type. The aim is to create a convenience method to assign values and ease access to data.
00:17:15.760 Next, we will compile these attributes into what we call an attribute set. When executing 'Post.new' with a hash, this hash will convert into individual attributes, grouped together as an attribute set.
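The attribute and attribute set described above might be sketched as follows. This is an assumption-laden illustration: types are plain lambdas here to keep the block self-contained, and the `Attribute` and `AttributeSet` classes only mirror the shape described in the talk.

```ruby
# One attribute: a name, the raw value, and the type used to cast it.
class Attribute
  attr_reader :name, :value_before_type_cast

  def initialize(name, value_before_type_cast, type)
    @name = name
    @value_before_type_cast = value_before_type_cast
    @type = type
  end

  # Cast lazily and remember the result.
  def value
    return @value if defined?(@value)

    @value = @type.call(value_before_type_cast)
  end
end

# Groups one Attribute per known column; omitted columns hold nil.
class AttributeSet
  def initialize(types, values = {})
    @types = types
    @attributes = {}
    types.each_key { |name| self[name] = values[name] }
  end

  def [](name)
    @attributes.fetch(name).value
  end

  def []=(name, raw_value)
    @attributes[name] = Attribute.new(name, raw_value, @types.fetch(name))
  end

  def to_h
    @attributes.transform_values(&:value)
  end
end

to_integer = ->(v) { v.nil? ? nil : Integer(v) }
to_string  = ->(v) { v.nil? ? nil : v.to_s }
types = { id: to_integer, name: to_string, author: to_string }

set = AttributeSet.new(types, { id: "7", name: "Arel" })
set[:id]     # 7
set[:author] # nil
```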
00:17:41.760 Upon creating a new post, we will use an attributes variable that, as in ActiveRecord, provides access to the attributes hash. When calling 'Post.new' with a hash of values, we will ensure the values entered by the user are converted and stored correctly.
00:18:23.120 Moving forward, we will define read methods for accessing values such as ID, name, or author. Additionally, we'll create write methods to assign values while ensuring they properly convert and type cast the data as required.
00:19:22.640 After establishing reader and writer methods, we’ll enhance our model by implementing how new objects are initialized. During instantiation with a hash of attributes, the model will create appropriate reader and writer methods for the defined columns.
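One way to generate reader and writer methods per column is `define_method`. The sketch below assumes a hypothetical `define_columns` helper that stands in for the schema lookup; in Storm the column list would come from the database instead.

```ruby
class Model
  CASTS = {
    integer: ->(v) { v.nil? ? nil : Integer(v) },
    string:  ->(v) { v.nil? ? nil : v.to_s }
  }.freeze

  class << self
    attr_reader :columns

    # Hypothetical helper: takes column name => type and defines
    # a reader and a casting writer for each column.
    def define_columns(columns)
      @columns = columns
      columns.each do |name, type|
        define_method(name) { @attributes[name] }
        define_method("#{name}=") do |value|
          @attributes[name] = CASTS.fetch(type).call(value)
        end
      end
    end
  end

  attr_reader :attributes

  # Every column goes through its writer, so omitted fields become nil.
  def initialize(values = {})
    @attributes = {}
    self.class.columns.each_key { |name| public_send("#{name}=", values[name]) }
  end
end

class Post < Model
  define_columns({ id: :integer, name: :string, author: :string })
end

post = Post.new({ id: "3", name: "Walking the tree" })
post.id     # 3
post.author # nil
```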
00:20:01.040 This will allow us to pass values for a post through the 'Post.new' method while ensuring all attributes are initialized appropriately. The goal is to store the user-provided values and keep nil for any omitted fields.
00:20:47.680 Having covered the basics of initializing our model attributes, we can now move on to implement CRUD operations—creating, reading, updating, and deleting records.
00:21:17.600 For the create operation, we will follow ActiveRecord’s pattern: taking a hash of values and creating a new record in the database. To do this, we'll utilize the insert manager in ARel, providing it with the table information and values we want to insert.
00:22:39.840 For every successful insertion, we can return the last record inserted. We will implement this using a primary key sequence, which acts like an auto-incrementing identifier, allowing us to track the last successful insertion and fetch the record using the 'find' operation.
00:23:18.400 In a similar manner, we will implement update operations. We'll utilize the update manager to process updates based on the record’s ID, ensuring that changes propagate to the proper database entries. Destroy operations will utilize the delete manager for records that need to be removed.
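To make the target of these managers concrete, here is a hand-rolled sketch of the kind of parameterized statements involved. It does not use ARel; the `ToySql` module is hypothetical, and the `$1` placeholders and `RETURNING` clause assume Postgres. The exact SQL that ARel's insert, update, and delete managers emit may differ.

```ruby
module ToySql
  module_function

  # Double-quote an identifier, escaping embedded quotes.
  def quote(identifier)
    %("#{identifier.to_s.gsub('"', '""')}")
  end

  # Returns [sql, binds] for inserting one row and getting its id back.
  def insert(table, values)
    columns = values.keys.map { |c| quote(c) }.join(", ")
    placeholders = (1..values.size).map { |i| "$#{i}" }.join(", ")
    sql = "INSERT INTO #{quote(table)} (#{columns}) " \
          "VALUES (#{placeholders}) RETURNING #{quote(:id)}"
    [sql, values.values]
  end

  # Returns [sql, binds] for updating the row with the given id.
  def update(table, id, values)
    assignments = values.keys.each_with_index.map { |c, i| "#{quote(c)} = $#{i + 1}" }
    sql = "UPDATE #{quote(table)} SET #{assignments.join(', ')} " \
          "WHERE #{quote(:id)} = $#{values.size + 1}"
    [sql, values.values + [id]]
  end

  # Returns [sql, binds] for deleting the row with the given id.
  def delete(table, id)
    ["DELETE FROM #{quote(table)} WHERE #{quote(:id)} = $1", [id]]
  end
end

sql, binds = ToySql.insert("posts", { name: "Storm", author: "Vipul" })
# sql is: INSERT INTO "posts" ("name", "author") VALUES ($1, $2) RETURNING "id"
```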
00:24:41.120 Lastly, our save method will facilitate the process of saving records. It will determine if a record already exists based on the presence of an ID. If an ID is available, we’ll update that record; if not, we'll create a new one.
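The create-or-update decision in `save` can be sketched against an in-memory stand-in for the database. `FakeTable` and `Record` are hypothetical names; the counter inside `FakeTable` plays the role of the primary key sequence mentioned earlier.

```ruby
# In-memory stand-in for a table and its primary key sequence.
class FakeTable
  def initialize
    @rows = {}
    @sequence = 0
  end

  # Assign the next id from the sequence and store the row.
  def insert(values)
    id = (@sequence += 1)
    @rows[id] = values.merge(id: id)
    id
  end

  def update(id, values)
    @rows.fetch(id).merge!(values)
  end

  def find(id)
    @rows[id]
  end

  def count
    @rows.size
  end
end

class Record
  attr_reader :attributes

  def initialize(table, attributes = {})
    @table = table
    @attributes = { id: nil }.merge(attributes)
  end

  # A record with an id is assumed to exist in the table already.
  def persisted?
    !attributes[:id].nil?
  end

  def save
    values = attributes.reject { |name, _| name == :id }
    if persisted?
      @table.update(attributes[:id], values)
    else
      attributes[:id] = @table.insert(values)
    end
    true
  end
end

table = FakeTable.new
post = Record.new(table, { name: "Draft" })
post.save                        # inserts, id becomes 1
post.attributes[:name] = "Final"
post.save                        # updates row 1 instead of inserting again
table.count # 1
```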
00:25:45.520 To wrap things up, we implemented basic querying methods including finding, counting, and deleting. ARel significantly simplifies the process through its own methods, allowing for efficient database operations without having to construct SQL queries manually every time.
00:26:40.079 We now have a working tiny ORM capable of performing basic CRUD operations using just 560 lines of code. Throughout our journey, we've explored essential components like database connections, column definitions, schema caches, attribute types, and importantly, query generation. Although this demonstration focuses on foundational concepts, ActiveRecord supports a broader range of features like validations, callbacks, and migrations, which we can consider for future development. Thank you for your patience and attention!
00:30:17.360 Thank you!