00:01:20.960
We are going to get started. This is a bit louder, but it worked, so whatever works!
00:01:29.060
I need you to be super quiet because I have an announcement to make. Hi, my name is Andrew, and I've been doing a great job capturing this conference in written form.
00:01:38.660
I am absolutely awesome! Please give me a round of applause!
00:02:08.960
Nice! All right, we have two talks coming up, then there’s a break, and then we have one more talk coming up. This will be followed by the closing notes from Ramona and Pilar, and then, of course, the after-party starts at 8:00 tonight.
00:02:19.550
So, our next speaker is Brad. I’ve heard that he is quite the karaoke singer, a barbecue enthusiast, and a coder. Who would have expected that? He has about 15 years of experience with SQL under his belt, so if you are about to embark on designing your next database challenge, he is probably the best to help you with that.
00:03:01.569
Hi! How are you doing? Guten Tag! Did I say that right? I’m very happy to be here in Vienna. This is my second time; I was here back in 2003. I could not be more thrilled. I flew in from Los Angeles last night just to speak to you all, and this has been a lovely time.
00:03:12.860
Thank you to our organizers and everyone who set up this conference and invited me up here. My name is Brad Urani; I am a staff engineer at Procore in Santa Barbara, California, and this talk is about database patterns, specifically Rails anti-patterns or relational anti-patterns.
00:03:18.290
I change the title depending on who my audience is, but it’s the same talk: Rails anti-patterns—how not to design your database. So, warning: tough decisions ahead. I’m going to give you some patterns you can use in your database, but realize that I cannot give you hard and fast rules about what to use where. I hope to provide you with tools that you can add to your toolkit.
00:03:38.180
Ultimately, making decisions about design patterns is challenging and very specific to your use case. At the end of this, remember there are still choices to be made. I work for Procore, which provides software for the construction industry. We create software for people building skyscrapers and giant buildings like this.
00:04:04.730
This is an example of one of our iPad apps; it’s a blueprint management app for construction plans. Here’s a gentleman out in the field using it. The construction industry has a similar problem to software in that most workers are men, but we are implementing programs to increase gender diversity in construction.
00:04:24.770
We have modeled these programs after some Rails initiatives like Rails Girls, and we’re proud of them. We also support the Ruby community. We sponsored RailsConf and were diamond sponsors for the last two years. Thank you!
00:04:36.800
We use Postgres SQL at a very large scale—60 million rows scanned per second on our primary database instance. That’s pretty big! I gave this talk at PostgresConf in New Jersey City, and even the Postgres people were somewhat impressed. We handle about fifty thousand transactions per second, which is probably an even more impressive number—these include both reads and writes.
00:05:11.449
This involves an implicit transaction with a 14 terabyte database. It means we have to handle SQL extremely well, which also necessitates that we design our database very well. Mistakes in database design can be very costly when you're running at this scale.
00:05:36.159
For example, this is from Procore; it’s a construction budget. When you load a page like this, it can run 20 to 30 SQL queries just to render. That requires carefully tuning those SQL queries to ensure optimal performance while maintaining developer happiness.
00:05:54.560
We have a large team of Rails developers—dozens of them—and it is one of the largest Ruby on Rails apps in the world with over a million lines of Ruby, close to two million lines currently. This means not only is there a SQL problem but also a Ruby on Rails problem, which necessitates our database patterns to effectively work with our tooling.
00:06:12.440
In our case, our tooling is Ruby on Rails and Active Record. We must ensure that the patterns we choose work well with SQL queries that are both efficient and easy for our developers to use.
00:06:29.890
Let's talk about anti-patterns. Today, I will address two major anti-patterns: naive trees and polymorphic relationships. But first, a little disclaimer—there are no actual anti-patterns. An anti-pattern is only classified as such if you use it in a context where it is not the best choice.
00:07:04.159
Every pattern has a scenario where it may be the most suitable option. Even my so-called anti-patterns can be the best choice if they fit your use case well. Trees, for instance, can be tricky to work with in databases. They represent structures like a file system or an organizational chart.
00:07:40.030
If you're representing a hierarchy, such as a file system, a comment thread, or an org chart, you may use a tree structure. However, naive trees, the simplest way to implement a tree in databases, can also be the trickiest to query effectively.
00:08:05.310
In a naive tree, you typically have a primary key ID and a foreign key for the managerial hierarchy. This structure is space-efficient but makes queries to get descendants exceedingly complicated, as this requires multiple joins to retrieve deeper levels of the tree structure.
00:08:40.840
Thus, getting a person's descendants might involve multiple complex joins, with unclear depth of recursion at the time of writing your query. As a result, this approach can become inefficient very quickly.
00:09:08.790
Alternatively, starting at the top and using separate queries to get descending generations can quickly become an N+1 query issue, which can bog down performance and exhaust database connections.
00:09:35.110
Another option could be recursion, but writing recursive SQL can be challenging as it lacks readability and is not well-supported in ORM frameworks like Active Record. This request can become even more difficult for developers unaccustomed to recursive SQL.
00:10:12.610
Even if you manage to implement recursive SQL, additional queries are often necessary to display it correctly on a webpage, adding to inefficiencies. Also, handling deletions in a tree structure can be particularly complex; you have to adjust references throughout the hierarchy, which can lead to significant overhead.
00:11:00.520
In summary, naive trees have severe limitations when dealing with complex queries, rendering them suboptimal for deep trees or when frequent manipulation is required. It’s critical to find an alternative that meets your performance needs.
00:12:01.239
However, if what you're working with is relatively simple, a naive tree structure may prove adequate. It boasts referential integrity, is space-efficient, and may yield usable results with caching depending on your use case.
00:12:41.310
A closure tree is a more robust alternative to naive trees. In a closure tree, you create a table referencing every descendant for each parent, allowing for quick retrieval of all descendants with a single join.
00:13:07.509
This approach reduces the complexity of queries, requiring fewer joins and allowing easier on-the-fly data manipulation, significantly enhancing developer experience and ensuring good performance.
00:13:38.990
There is a popular Ruby gem called Closure Tree that implements this pattern in your application, helping automate the creation of necessary database queries and structures, which simplifies database interactions and maintenance.
00:14:24.220
Another alternative to represent trees is using a materialized path. In this approach, you maintain a path column that contains the entire hierarchy for each node as a string, making descendant queries very straightforward with a single lookup.
00:15:04.020
Though this approach sacrifices some referential integrity because it cannot enforce foreign key constraints, it typically boasts better performance compared to closure trees, especially for certain types of queries.
00:15:55.460
The materials path may also pose limitations in terms of depth; it is subject to character length limitations which could restrict more complex hierarchies. However, for speed and flexibility, it can stand as a solid option.
00:16:38.800
When discussing anti-pattern number two, polymorphic relationships, many developers may be familiar with this pattern from Rails. It involves having a foreign key that can refer to multiple tables based on an extra column that specifies the table type.
00:17:28.820
While it can be flexible and useful in certain contexts, it is often misused for cases where enforcing data integrity is paramount, such as financial or medical data management, where foreign key constraints become essential.
00:18:10.610
A well-structured design that utilizes exclusive arcs or separate parent tables often proves to be the better choice when dealing with critical data, ensuring referential integrity compared to the limitations of polymorphic relationships.
00:18:53.070
Using a parent table and a commentable middle table can assist in structuring complex many-to-one relationships by allowing for shared attributes across diverse data types while retaining the ability to enforce referential integrity.
00:19:49.840
In conclusion, if you face a wide variety of parent types, polymorphic relationships can provide a useful solution. However, maintaining referential integrity may offer superior peace of mind in scenarios with fewer parent entities.
00:20:29.390
Ultimately, whether you choose to go for performance or simplicity, referential integrity, or rapid development—it's crucial to evaluate your application's needs and the trade-offs you’re making with each decision.
00:21:16.530
Congratulations! You've made it through the talk! I have a few book recommendations for those interested in deeper insights into SQL hierarchy and design patterns. One is 'Trees and Hierarchies in SQL for Smarties,' which covers numerous patterns and anti-patterns.
00:22:01.810
I appreciate your attention today! I’m Brad Urani; please feel free to reach out to me on Twitter or LinkedIn. I work for Procore, located in beautiful Southern California. We are hiring for various roles, including Ruby developers, Android/iOS engineers, security experts, and machine learning specialists. Come join us!
00:23:26.940
So quick applause for Andrew too for such a quick job of taking down the talk. There’s a case of a missing laptop; it’s so exciting!
00:23:40.670
Please come back for the last talk of the day at half-past four, followed by the closing notes. The next speaker is Coraline; she is a well-known writer and speaker.