Piotr Solnica

DataMapper 2

This video was recorded at http://wrocloverb.com.

I would like to describe all of the pieces that we're working on: new relational algebra engine, new model definition and introspection layers, new validation library and other things that will become part of DM2 (better migrations, UoW library, optimizer layer). The talk would be in the context of a better way of handling business logic in Rails apps.

wroc_love.rb 2012

00:00:12.960 All right, so here we have Piotr Solnica.
00:00:23.039 Thank you! I hope you can hear me fine. The topic of my talk is "Beyond the ORM".
00:00:28.560 This is an introduction to DataMapper 2. Why beyond the ORM? Because we are building something more than just an ORM.
00:00:34.160 The lead developer of the DataMapper project calls it an ORM toolkit. You can take the pieces and build something custom.
00:00:41.120 Now, quickly about me: I'm Piotr Solnica. I work at a small company in Bend, Oregon, called Benders.
00:00:48.879 We have awesome guys like Fdream and DenKop in the team. If you're looking for good Ruby developers, feel free to contact us.
00:00:55.760 I blog at solnic.eu. On GitHub, I'm solnic, and on Twitter, I'm _solnic_, with these fancy underscores.
00:01:02.239 Before I start talking about DataMapper, I actually want to give you some theoretical background so that you can understand our motivation behind working hard on the DataMapper implementation in Ruby.
00:01:20.880 Let's start with ActiveRecord. Martin Fowler describes it as an object that wraps a row in a database table or view, encapsulating the database access and adding domain logic on the data.
00:01:28.240 There are two important aspects in this description: first, it encapsulates the database access, and second, it implements any domain logic that your application needs.
00:01:35.360 This means that when you're using ActiveRecord, you're supposed to have the data and domain model in one place.
00:01:42.719 Every attempt to separate persistence from domain logic in ActiveRecord is essentially against its pattern.
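To make Fowler's description concrete, here is a minimal plain-Ruby sketch of the Active Record pattern. It is not the Rails library: the class name, the in-memory ROWS hash standing in for a table, and the sample values are all assumptions made for illustration.

```ruby
# Illustrative only: an in-memory hash stands in for a database table.
class UserRecord
  ROWS = {} # id => { name:, age: }; hypothetical stand-in for a "users" table

  attr_reader :id
  attr_accessor :name, :age

  def initialize(name:, age:, id: nil)
    @id = id
    @name = name
    @age = age
  end

  # Persistence lives on the object itself...
  def save
    @id ||= ROWS.size + 1
    ROWS[@id] = { name: name, age: age }
    self
  end

  def self.find(id)
    row = ROWS.fetch(id)
    new(id: id, **row)
  end

  # ...right next to the domain logic.
  def adult?
    age >= 18
  end
end

user = UserRecord.new(name: "Jane", age: 21).save
found = UserRecord.find(user.id)
puts found.adult?
```

The point of the sketch is that `save`, `find`, and `adult?` all sit in one class, which is exactly the coupling the pattern prescribes.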
00:01:51.920 In Ruby, we probably don't ask when ActiveRecord is the right choice, because every Ruby ORM implements the ActiveRecord pattern.
00:01:57.360 Interestingly, DataMapper 1 actually implements the ActiveRecord pattern. So, when you want to use a database as a Ruby programmer, you just use ActiveRecord.
00:02:09.599 Martin Fowler suggests that ActiveRecord is a good choice for domain logic that isn't too complex, like creates, reads, updates, and deletes.
00:02:16.800 He states that basic operations are what's considered simple domain logic. However, the more complex your application becomes, the harder it is to manage everything within ActiveRecord.
00:02:35.120 Thus, this leads us to DataMapper. We've had a lot of discussions in the community about making ActiveRecord models thin.
00:02:42.400 I wrote a blog post advocating for using different patterns than ActiveRecord, suggesting that we should separate persistence from domain logic. However, doing this with ActiveRecord is awkward.
00:03:01.599 At this conference, Fdream suggested that the solutions we have on the market generally lead to DataMapper because we want to achieve clear separation of concerns.
00:03:09.360 The DataMapper pattern, as described by Martin Fowler, moves data between objects and a database while keeping them independent of each other.
00:03:23.440 The important aspect here is separation: the domain model knows nothing about persistence, and vice versa.
00:03:30.080 This way, mapping data from the database happens in a clean space.
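The separation described above can be sketched in plain Ruby as follows. This is an illustration of the Data Mapper pattern in general, not DataMapper 2 code; `Person`, `PersonMapper`, and the Hash used as a store are hypothetical names chosen for the example.

```ruby
# Plain domain object: it has no knowledge of storage.
class Person
  attr_reader :id, :name, :age

  def initialize(id:, name:, age:)
    @id = id
    @name = name
    @age = age
  end

  def adult?
    age >= 18
  end
end

# Hypothetical mapper: the only place that knows both the object and the store.
class PersonMapper
  def initialize(store)
    @store = store # any Hash-like object keyed by id
  end

  def insert(person)
    @store[person.id] = { "name" => person.name, "age" => person.age }
    person
  end

  def find(id)
    row = @store.fetch(id)
    Person.new(id: id, name: row["name"], age: row["age"])
  end
end

store = {}
mapper = PersonMapper.new(store)
mapper.insert(Person.new(id: 1, name: "Jane", age: 21))
loaded = mapper.find(1)
puts loaded.adult?
```

Because `Person` never touches the store, the row layout and the object model can change independently; only the mapper has to follow.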
00:03:35.120 Now, when should you use DataMapper? This is a common question.
00:03:42.000 According to Martin Fowler, the primary occasion for using DataMapper is when you want the database schema and the object model to evolve independently.
00:03:53.120 However, this isn't necessarily a choice you make; it happens as complexity grows, particularly when using relational databases.
00:04:02.160 The more complex your application becomes, the bigger the impedance mismatch can be. This is what DataMapper aims to solve.
00:04:10.560 DataMapper 1 essentially was an ActiveRecord implementation. While working on it, we realized that we had reached a dead end.
00:04:24.000 The coupling between database schema and your domain model was challenging, making it difficult to implement the things we wanted.
00:04:36.639 Consequently, we decided to abandon ActiveRecord completely and focus on a proper DataMapper implementation.
00:04:48.000 DataMapper 2 is built on a general purpose relational algebra engine and will support different data stores via adapters.
00:05:04.160 This concept allows you to implement an adapter for any type of data store.
00:05:15.040 Another interesting thing about DataMapper 2 is that it isn't a single project but rather a collection of libraries that you can use independently.
00:05:30.320 We're currently halfway through the process and I can't provide a set deadline for when everything will be ready.
00:05:44.240 So, what are the new libraries? The first one is Veritas, which is the new relational algebra engine.
00:05:54.639 We also have Virtus, which will serve as the model definition and introspection layer, and Aequitas, which is the new validation library.
00:06:08.000 Aequitas is essentially a refactored and improved version of validations from DataMapper 1. Please note that the Latin names are code names, and we will rename all projects when they are ready.
00:06:22.720 The new core library will be smaller than the existing one, including the mapper, a new query API, unit of work, and interfaces for third-party plugins.
00:06:34.320 Veritas was started about two years ago by Dan Kubb, and it aims to provide relational algebra capabilities in a more abstract way than traditional SQL generators.
00:06:55.040 It already supports all relational algebra operations and includes other common operations like sorting.
00:07:07.680 Furthermore, Veritas allows operations to be run in memory, meaning if a particular database doesn't support joins, you can perform those joins in memory.
00:07:22.799 Additionally, Veritas can be extended with adapters for any kind of data source.
00:07:30.720 This allows you to work with multiple data stores simultaneously while having transparent access to any database.
00:07:45.920 Moreover, Veritas is designed to support database-specific optimizations. If there's a more efficient way of doing something, you can hook into Veritas to implement your own optimizations.
00:08:00.360 This approach allows us to leverage powerful features native to various databases.
00:08:17.760 Let me show you how you can currently utilize Veritas. For example, I’ll use the GitHub API.
00:08:35.920 In my example setup, I'm fetching a list of DataMapper organization members and all comments from the dm-core project.
00:08:53.120 I parse the comments from JSON into an array of hashes. To build relations with Veritas, I simply define headers.
00:09:04.800 These headers act like schemas for Veritas relations. In this case, I define two headers: one for members and another for comments.
00:09:18.640 Then, according to the headers, I map the data to arrays, ensuring that each array contains attribute values in the same order as the headers.
00:09:35.200 With that done, I can build relations using Veritas. You simply call Veritas::Relation::Base.new and pass in the relation's name, header, and data.
00:09:46.720 Veritas relations implement the Enumerable API, allowing you to iterate over members or comments and access all contained data.
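The shape of that data can be sketched without the gem. The following is plain Ruby, not the Veritas API: `SketchRelation`, the two-attribute header, and the sample rows are assumptions for illustration only.

```ruby
# Hypothetical stand-in for a relation: a header plus tuples in header order.
class SketchRelation
  include Enumerable

  attr_reader :name, :header

  def initialize(name, header, rows)
    @name = name
    @header = header
    @rows = rows
  end

  # Yield each tuple as a hash keyed by attribute name.
  def each
    return enum_for(:each) unless block_given?
    @rows.each { |row| yield header.zip(row).to_h }
  end
end

# Parsed JSON would look roughly like this array of hashes (sample values are made up).
parsed_members = [
  { "id" => 1, "login" => "dkubb" },
  { "id" => 2, "login" => "solnic" }
]

members_header = [:id, :login]

# Map each hash to an array whose values follow the header order.
member_rows = parsed_members.map { |m| members_header.map { |attr| m[attr.to_s] } }

members = SketchRelation.new(:members, members_header, member_rows)
members.each { |tuple| puts tuple[:login] }
```

Including Enumerable means `map`, `select`, `count`, and friends come for free once `each` is defined.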
00:10:00.480 Here's an example of a restriction operation that mimics the SQL query "SELECT * FROM members WHERE login = 'dkubb' OR login = 'solnic'".
00:10:17.440 This operation will work in memory, utilizing the raw data from GitHub.
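An in-memory restriction of this kind amounts to filtering tuples with a predicate. The sketch below uses plain arrays of hashes rather than Veritas relations; the `restrict` helper and the sample logins are hypothetical.

```ruby
# Sample data; the third login is made up.
members = [
  { id: 1, login: "dkubb" },
  { id: 2, login: "solnic" },
  { id: 3, login: "someone-else" }
]

# Restriction: keep only the tuples for which the predicate block holds.
def restrict(relation)
  relation.select { |tuple| yield tuple }
end

# Same idea as a WHERE clause with two equality checks joined by OR.
selected = restrict(members) { |t| t[:login] == "dkubb" || t[:login] == "solnic" }
puts selected.map { |t| t[:login] }.inspect
```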
00:10:28.560 Let's also see how to perform a join operation between two relations: members and comments.
00:10:40.240 We can join them on login and committer attributes, and it will occur in memory.
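An in-memory join on those two attributes could look like the hash join below. This is a plain-Ruby sketch, not how Veritas performs joins internally; the `join` helper and the sample rows are assumptions. Strictly speaking, a natural join matches on identically named attributes, so a real relational engine would first rename `committer` to `login`; the sketch takes both key names as arguments instead.

```ruby
# Sample data; bodies and the unmatched committer are made up.
members = [
  { login: "dkubb" },
  { login: "solnic" }
]
comments = [
  { committer: "solnic", body: "first" },
  { committer: "dkubb", body: "second" },
  { committer: "unknown-user", body: "third" }
]

# Hash join: index one side by its key, then probe the index with the other side.
def join(left, right, left_key, right_key)
  index = left.group_by { |tuple| tuple[left_key] }
  right.flat_map do |r|
    index.fetch(r[right_key], []).map { |l| l.merge(r) }
  end
end

joined = join(members, comments, :login, :committer)
joined.each { |t| puts "#{t[:login]}: #{t[:body]}" }
```

Tuples without a match on the other side, such as the comment by `unknown-user`, are dropped, as in an inner join.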
00:10:57.440 What's truly impressive about Veritas is its ability to perform joins between various data stores.
00:11:09.119 This opens up many use cases, such as when you have sharding set up, allowing you to access shards transparently.
00:11:21.760 Additionally, I will demonstrate an example using PostgreSQL alongside our GitHub API data.
00:11:36.480 Veritas already has an adapter for DataObjects, which provides drivers for popular relational databases.
00:11:52.800 Here, I create a simple table in PostgreSQL called users with columns for ID, username, and country. After inserting data, I can connect to this database using the DataObjects adapter.
00:12:08.799 As with the GitHub setup, I can define the header for the relation, which knows how to pull data from the PostgreSQL database.
00:12:24.639 This process is similar to what we did with GitHub, but here we employ a gateway to retrieve the data.
00:12:38.200 Because Veritas implements Enumerable, it will issue the SQL statement to fetch data when I start iterating over the relation.
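The lazy behaviour described here can be sketched with a fake connection that records the SQL it receives. None of this is the Veritas gateway or the DataObjects API; `FakeConnection`, `LazyGateway`, and the sample row are hypothetical.

```ruby
# Hypothetical connection that records the SQL it is asked to run.
class FakeConnection
  attr_reader :log

  def initialize(rows)
    @rows = rows
    @log = []
  end

  def select(sql)
    @log << sql
    @rows
  end
end

# Hypothetical gateway: nothing is sent to the database until iteration starts.
class LazyGateway
  include Enumerable

  def initialize(connection, table, columns)
    @connection = connection
    @table = table
    @columns = columns
  end

  def each(&block)
    return enum_for(:each) unless block
    sql = "SELECT #{@columns.join(', ')} FROM #{@table}"
    @connection.select(sql).each(&block)
  end
end

connection = FakeConnection.new([{ id: 1, username: "solnic", country: "PL" }])
users = LazyGateway.new(connection, "users", %w[id username country])

puts connection.log.empty? # no query has been issued yet
users.each { |row| puts row[:username] }
puts connection.log.first
```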
00:12:45.680 Now I can work with both GitHub relations and PostgreSQL simultaneously, allowing various operations such as joins.
00:13:01.440 For example, I can take data from PostgreSQL and join it with the members relation from GitHub, and it will work seamlessly.
00:13:15.440 Additionally, all relations in Veritas are composable, allowing you to build complex queries.
00:13:32.160 For instance, we could apply a restriction on the users relation (from PostgreSQL), then join it with the members relation from GitHub.
00:13:46.320 Veritas will determine that the restriction can be performed on the PostgreSQL side, executing the query, and then joining that result set with the data from GitHub.
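The two-step plan just described, restricting at the source and then joining in memory, can be sketched as follows. This only illustrates the idea; it says nothing about how the Veritas optimizer is implemented. `SqlSource`, its `fetch` method, and the sample rows are assumptions, and the SQL string is built purely for display.

```ruby
# Hypothetical SQL-backed source that can apply an equality filter itself.
class SqlSource
  attr_reader :executed

  def initialize(rows)
    @rows = rows
    @executed = []
  end

  def fetch(where = {})
    # The string is recorded for illustration only; real code would use bind parameters.
    clause = where.map { |k, v| "#{k} = '#{v}'" }.join(" AND ")
    sql = "SELECT * FROM users"
    sql += " WHERE #{clause}" unless clause.empty?
    @executed << sql
    @rows.select { |row| where.all? { |k, v| row[k] == v } }
  end
end

# Sample rows; the country codes are made up.
users_source = SqlSource.new([
  { username: "solnic", country: "PL" },
  { username: "dkubb", country: "CA" }
])
members = [{ login: "solnic" }, { login: "dkubb" }] # in-memory data, e.g. from an HTTP API

# Step 1: the restriction runs at the source, so fewer rows are transferred.
restricted = users_source.fetch(country: "PL")

# Step 2: the join with the in-memory relation happens in Ruby.
logins = members.map { |m| m[:login] }
joined = restricted.select { |u| logins.include?(u[:username]) }

puts users_source.executed.inspect
puts joined.inspect
```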
00:14:04.080 This flexibility is one of the most exciting features of Veritas.
00:14:19.680 You can already install Veritas as it is available as a gem. The GitHub readme provides many examples.
00:14:31.520 In terms of stability, while it's still under 1.0 and we may change some public APIs, the code quality is solid.
00:14:46.920 Now, about Virtus: it began as an extraction of the property API from DataMapper 1.
00:15:03.200 Initially, I thought it was already great and would just extract it. However, we ended up rewriting much of it.
00:15:17.440 Virtus allows you to define attributes on plain Ruby objects and includes a coercion library that is separate from attributes.
00:15:35.040 It supports embedded values and collections, which enhances its usability.
00:15:49.520 For example, we can define a User model that includes attributes and also set up a constructor that accepts a hash of attributes.
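The idea of declared attributes plus a hash-accepting constructor can be sketched in plain Ruby. This is not Virtus's implementation or its exact API; the `SketchAttributes` module, its `attribute` macro, and the tiny coercion table are hypothetical and only cover String and Integer.

```ruby
# Hypothetical mini-DSL, loosely modelled on the idea of declared attributes.
module SketchAttributes
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    # Introspection: attribute name => declared type.
    def attributes
      @attributes ||= {}
    end

    # Declare an attribute with a type used for simple coercion.
    def attribute(name, type)
      attributes[name] = type
      attr_reader name
      define_method("#{name}=") do |value|
        instance_variable_set("@#{name}", SketchAttributes.coerce(value, type))
      end
    end
  end

  def self.coerce(value, type)
    return value if value.is_a?(type)
    if type == Integer then Integer(value)
    elsif type == String then value.to_s
    else value
    end
  end

  # Constructor that accepts a hash of attributes.
  def initialize(attrs = {})
    attrs.each { |name, value| public_send("#{name}=", value) }
  end
end

class Account
  include SketchAttributes

  attribute :name, String
  attribute :age, Integer
end

account = Account.new(name: "Jane", age: "21")
puts account.age.inspect
```

Keeping the declared types in `Account.attributes` is what makes the model introspectable by other layers, such as a mapper.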
00:16:00.960 Virtus is designed to handle embedded collections beautifully. If you were building a library system, for instance, you could define a Book class as well as a Library class.
00:16:16.560 The Library class could have an attribute for books, which would be a set of Book instances.
00:16:30.480 This makes for an intuitive API, as you can easily initialize a library by passing an array of hashes representing the books.
00:16:45.680 During this process, the hashes are coerced into actual Book instances using the standard coercion mechanism.
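Coercing an array of hashes into a set of embedded objects can be sketched like this. It is written by hand in plain Ruby rather than with Virtus, so the constructors and the sample titles are assumptions for illustration.

```ruby
require "set"

class Book
  attr_reader :title

  def initialize(attrs)
    @title = attrs.fetch(:title)
  end
end

class Library
  attr_reader :books

  # Accepts an array of hashes (or Book instances) and coerces it into a Set of Books.
  def initialize(attrs)
    @books = attrs.fetch(:books, []).map { |b| b.is_a?(Book) ? b : Book.new(b) }.to_set
  end
end

library = Library.new(books: [{ title: "Book One" }, { title: "Book Two" }])
puts library.books.map(&:title).inspect
```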
00:17:05.235 As I mentioned earlier, coercions are separated from attributes. I'm considering extracting coercions into a separate gem in the future.
00:17:19.120 Currently, coercion is part of the Virtus gem but can be used independently if needed.
00:17:35.840 There are various classes for coercions, and they're all detailed in the documentation.
00:17:50.320 Now, I planned to describe the mapper and unit of work, but due to time constraints, I will only mention them briefly.
00:18:08.840 The mapper intends to generate mappers automatically by reflecting on the database schema or model definitions.
00:18:22.560 This flexibility is excellent for prototyping, making it easier to get started quickly while allowing for custom mappers.
00:18:38.080 Another significant component is the session unit of work, which changes how data is saved and managed within the application.
00:19:02.120 Unlike ActiveRecord, where you directly use objects to make changes, in DataMapper, you will use a session.
00:19:14.640 This session will be responsible for managing all the changes made to your business objects and the order in which they're committed.
00:19:29.760 This presents a complex but necessary challenge, and we aim to solve it effectively.
00:19:43.920 We also plan to utilize database constraints and reflection mechanisms in DataMapper to further enhance functionality.
00:20:00.000 The Veritas adapters currently focus on relational databases, specifically PostgreSQL, due to its standards compliance.
00:20:14.520 For instance, we already have a working example of the mapper in my GitHub account, which accomplishes what we set out to do.
00:20:31.680 We have a plain user class with a constructor, and the mapper can define attribute mappings to the database.
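A mapper with explicit attribute mappings might look roughly like the sketch below. The real DataMapper 2 mapper API was still in flux, so `SketchMapper`, its `load` and `dump` methods, and the column names are hypothetical.

```ruby
# Plain class with a constructor and no persistence code.
class PlainUser
  attr_reader :id, :name

  def initialize(id, name)
    @id = id
    @name = name
  end
end

# Hypothetical mapper holding an explicit attribute-to-column mapping.
class SketchMapper
  def initialize(model, mapping)
    @model = model
    @mapping = mapping # { attribute_name => column_name }
  end

  # Build a domain object from a database row.
  def load(row)
    @model.new(*@mapping.values.map { |column| row.fetch(column) })
  end

  # Turn a domain object back into a row.
  def dump(object)
    @mapping.each_with_object({}) do |(attribute, column), row|
      row[column] = object.public_send(attribute)
    end
  end
end

mapper = SketchMapper.new(PlainUser, { id: "user_id", name: "username" })
user = mapper.load({ "user_id" => 1, "username" => "solnic" })
puts mapper.dump(user).inspect
```

Since the column names live only in the mapping, renaming a column means changing one hash entry and leaving `PlainUser` untouched.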
00:20:46.720 The query API will also be part of this mapper, setting it apart from using domain classes for database interactions.
00:20:59.919 So, sessions will completely change how you make changes to data. You will explicitly use a session object to perform operations like inserts, deletes, and updates.
00:21:18.720 Once you call the commit function, the session will calculate dependencies and determine the order of execution.
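The commit step can be sketched as a topological sort over pending operations. This is a simplified illustration using Ruby's standard TSort module, not the planned DataMapper 2 session: `SketchSession`, the `depends_on:` option, and the use of symbols in place of business objects are assumptions, and only inserts are modelled.

```ruby
require "tsort"

# Hypothetical session: collects pending operations and orders them on commit.
class SketchSession
  include TSort

  attr_reader :executed

  def initialize
    @pending = {} # object => operation
    @depends_on = Hash.new { |hash, key| hash[key] = [] }
    @executed = []
  end

  def insert(object, depends_on: [])
    @pending[object] = :insert
    @depends_on[object].concat(depends_on)
    self
  end

  # Run operations so that every object comes after the objects it depends on.
  def commit
    tsort.each do |object|
      operation = @pending[object]
      @executed << [operation, object] if operation
    end
    @pending.clear
    @executed
  end

  # TSort hooks: the nodes are pending objects, the children are their dependencies.
  def tsort_each_node(&block)
    @pending.keys.each(&block)
  end

  def tsort_each_child(node, &block)
    @depends_on[node].each(&block)
  end
end

session = SketchSession.new
session.insert(:comment, depends_on: [:post])
session.insert(:post, depends_on: [:author])
session.insert(:author)
puts session.commit.inspect
```

Even though the comment was registered first, it is written last, because its parent rows must exist before it can reference them.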
00:21:41.840 Additionally, rollback mechanisms will be in place, representing a shift from traditional ActiveRecord handling.
00:21:58.560 If you're interested in the development of DataMapper 2, I encourage you to follow the Veritas project and related libraries.
00:22:14.800 You can also check our roadmap for future updates and developments. Now, I believe we have time for questions.
00:22:32.560 We have four minutes remaining for questions.
00:22:37.760 Transactions will be handled on the session side.
00:22:50.880 If your database supports transactions, we will utilize that functionality.
00:23:00.640 We want to ensure that the system is smart enough to manage operations properly.
00:23:07.840 The relational algebra terminology will not limit support for other backends.
00:23:16.800 Our implementation of relational algebra is designed to be flexible and applicable across various data structures.
00:23:44.000 From a marketing standpoint, do you think I should cut my hair?
00:23:50.640 If you have any further inquiries or want to delve deeper into this subject, we encourage continuation of discussions.
00:24:06.720 I appreciate everyone's attention. Thank you very much!
00:24:24.880 Let's stay connected and keep the conversation going.
00:24:30.720 Thank you!
00:24:36.960 Have a great day!
00:24:59.760 Goodbye!