00:00:13.080
Today, I would like to talk to you about mythical creatures. There's a saying, 'Here be dragons,' which apparently you can find on very old maps. People used to place them on maps when they didn’t know what was actually outside the boundaries. However, this isn't exactly true for the maps that are really old, as the original phrase was 'Here be lions,' not 'Here be dragons.' Dragons are mythical creatures, while lions are often thought to be mythical as well. This particular phrase, 'heavy dragons,' is specifically annotated on an ancient globe called the Hunt-Lenox Globe, which is written in Latin.
00:01:30.000
Now, I would like to relate this to programming, more specifically to the somewhat scary topic of layoffs. Layoffs are very much a reality right now, and many of you are likely looking for work. When you’re on an interview, it’s crucial to ask intelligent questions about the project.
00:01:50.000
Ask about the version of Rails being used; it shows you have a genuine interest in whether the software is up to date, maintainable, and secure. You can also inquire about architectural decisions, which are quite common. Additionally, you can ask front-end related questions to demonstrate that you possess knowledge beyond just backend development.
00:02:05.000
Questions about the process can reveal your infrastructure knowledge and show that you can collaborate well with others. Similarly, test-related questions are always appropriate since there are often discussions around testing practices.
00:02:18.000
However, once you start discussing databases, many people seem lost. When asked about the version of Postgres being used, you often hear, 'I’d need to check' because general shop talk revolves around data manipulation and storage.
00:02:35.000
Most conversations revolve around tasks related to work, which is precisely what databases do. If you look at a simplified diagram of Rails, most of it revolves around Ruby and its backend functionalities. The frontend usually caters more to front-end developers while the middle part is where the significant processing occurs.
00:02:50.000
If you’re not convinced about the critical role that the database plays, consider looking at the concept of the test pyramid. Google it, and you'll find that ideally, most of your code should be tested through unit tests, isolating them from database dependencies. After that, you add an integration layer of tests to see if all components work together, with a scant sprinkling of expensive end-to-end tests overseeing your business logic.
00:03:09.000
In the Ruby on Rails ecosystem, however, we encounter an anti-pattern known as the ‘ice cream cone,’ which flips this ideal pyramid upside down with mostly manual testing on top. My experience of over fifty Rails projects reveals that the actual testing pyramid generally does not resemble the ideal. We have unit tests, but it's challenging to isolate them; a larger portion consists of integration tests.
00:03:28.000
The reason for this lean towards integration tests is simple: Rails applications are often highly dependent on databases. Most of them are essentially points of interaction with databases, and I’m not saying that every single application is like this, but a vast majority are.
00:03:46.000
Yet, we treat these databases as mythical creatures that we’re not supposed to be concerned about. We hide them behind ActiveRecord and multiple abstraction layers. Here be dragons!
00:03:56.000
How do we handle these mythical creatures? First, we need to acknowledge that databases are indeed very strange. They are not object-oriented at all; most utilize declarative approaches, which differ from functional programming. Even those familiar with Elixir might find them odd. Generally, databases operate on a single-user model, leading to potential security risks because typically we only use one user account to connect to the database.
00:04:30.000
However, databases do possess remarkable strengths. They are typically much faster than anything we can accomplish in Ruby. Starting with SQLite—an example I generated during my journey—it’s usually ideal for small projects.
00:04:44.000
When thinking of SQLite, you might envision a simple website you created in your free time that you want to publish without the hassle of setting up an entire database server.
00:05:00.000
This is indeed a valid use case. Interestingly, the SQLite website itself operates using SQLite, supporting hundreds of thousands of requests daily. Each of those requests can generate nearly 200 queries, resulting in an astonishing total of 20 million queries each day!
00:05:11.000
It's impressive, considering that the SQLite infrastructure relies on seven-year-old hardware and a shared virtual machine with twenty-three other applications. This showcases just how powerful SQLite can be. It operates effectively under demanding conditions.
00:05:25.000
SQLite excels in reading data but has limitations in writing; only one thread can perform write operations at any given time. It’s essential to note that it's public domain software, which can be used freely without legal constraints.
00:05:39.000
Additionally, its backup process is incredibly straightforward—simply copy and paste files. These features are certainly appealing, especially when considering high traffic on a minimal budget for hosting.
00:05:53.000
One significant advantage of SQLite is its ability to prepare databases for quick backups or testing. It allows for cloning multiple versions of the database easily, enabling parallel testing. However, there are a few challenges.
00:06:05.000
For instance, when using SQLite in applications hosted on threaded servers, it can lock the entire interpreter, which significantly disrupts performance. Heavy write operations may lead to database locking errors, so it’s wise to implement retry logic, especially for frequently accessed endpoints.
00:06:19.000
Moreover, there is a lack of built-in support for search functionality, which means we often need to implement cumbersome workarounds. Rails doesn’t inherently support some features necessary for SQLite to operate with full effectiveness.
00:06:30.000
So, considering SQLite's limitations, let’s move on to the elephant in the room: Postgres, another tool designed for omnipresent database management. Almost everyone seems to use Postgres—raise your hand if you do? Yes, just about every one of you.
00:07:05.000
Postgres is often deemed the safest choice for a database because you can virtually do anything with it. It can even replace components of your entire stack, including Rails, if necessary. A notable post by Stefan Schmidt discusses how Postgres can substitute for nearly every part of your stack.
00:07:25.000
Essentially, it can function as a document database, a caching layer, a GraphQL server, a job queue, and could even replace Kafka if desired. While this might not be practical for every project, it’s advantageous for smaller companies looking for less maintenance.
00:07:45.000
Let’s explore one of Postgres's features. Starting with unlocked tables—these are tables without write-ahead logs. The write-ahead log is a crucial component of a database that captures all data changes to ensure data persistence.
00:08:05.000
If you don’t have this log, your tables will remain in-memory and will not persist through a crash. Consequently, this feature effectively behaves as a cache. You can use it in Rails, even employing helper methods in migrations.
00:08:20.000
In Rails 6, you can create these unlocked tables easily. Although you might think, 'I already have caching credits for this,' consider which database allows you to leverage this SQL-compatible caching feature.
00:08:35.000
If you set up a few lines in your test environment, you can realize substantial performance improvements in potentially horrible test suites, yielding a 25% boost in one of the most resource-heavy projects I worked on.
00:08:52.000
Next, let’s discuss window functions. They are quite powerful features in SQL. Remember that window functions allow you to perform calculations across sets of table rows, refining results without affecting the aggregate level of calculations.
00:09:08.000
Some people might find window functions confounding, believing they integrate complex syntax, but they are exceptionally powerful. For example, calculating employee salaries versus average salaries within a department demonstrates their utility.
00:09:23.000
You can extract data that indicates growing sales using these functions, making it easy to apply window functions in Rails while maintaining clarity.
00:09:42.000
Window functions allow you to aggregate data rolled up over certain timeframes or categories while retaining individual row details available for further processing.
00:09:57.000
The second example incorporates pattern matching in SQL, which can be particularly useful when it enables you to number every ID while restarting the count whenever you encounter a new category ID. This capability allows for efficient querying for the cheapest items in each category.
00:10:18.000
The last example focuses on cumulative sums, which is a frequently searched SQL query on Stack Overflow. By extracting values based on prior months, you can create cumulative revenues that adapt as more data gets collected.
00:10:36.000
These kinds of operations prove to be significantly swifter in SQL than executing similar operations in Ruby, especially with larger datasets.
00:10:57.000
Since SQL can understand the data it processes inherently, it can innovate strategies that often outpace Ruby implementations. SQL explains the execution plan that outlines its performance.
00:11:12.000
In learning to read execution plans, you can gain insight into your database's underlying processing and optimize your queries accordingly.
00:11:30.000
Being aware of query costs helps indicate potential performance bottlenecks. Using tools like PG Hero, which examines indices and provides suggestions for optimization, can significantly enhance database performance.
00:11:46.000
Indices serve to optimize database queries, but remember that utilizing too many can counterintuitively lead to performance degradation. Finding the most efficient indexing approach for your queries is vital.
00:12:07.000
Furthermore, testing different indexing strategies based on multiple factors can yield unexpected benefits for unique applications, particularly when tight on resources.
00:12:20.000
Distributed databases like CockroachDB leverage these principles effectively, eliminating much of the complexity that comes with sharding in Postgres, while ensuring data integrity across multiple machines.
00:12:35.000
The incorporation of geographical indices, or knowledge of where user data is located, can drastically improve performance and reduce costs for servers based on user proximity.
00:12:56.000
If considering alternatives, technologies like ClickHouse provide efficient handling of large datasets and are excellent for complex queries commonly found in business intelligence areas.
00:13:12.000
When utilizing databases that can pre-aggregate data, there’s potential not just for speed improvements, but also for large resource savings in endpoint activity when querying multiple sources.
00:13:28.000
In conclusion, various databases each have their advantages and are suited for specific use cases in application design. While NoSQL databases offer flexibility and performance, combining database functionalities seamlessly can yield excellent results.
00:13:42.000
Thank you for your time. Are there any questions?
00:14:00.000
I really appreciate the insight you provided on the unlocked tables, but I would love to take a picture of that slide. I was also wondering about the constraints when these can be used in tests.
00:14:11.000
I can find the slide on unlocked tables for you. It was rather early on, before we discussed window functions.
00:14:25.000
Here it is! The part that you'll want to note is that this content belongs in config environments test.rb to make it successful.
00:14:38.000
Remember, however, that if you try to restart the DB, those unlocked tables won’t persist unless you migrate again. This is best suited for continuous integration pipelines.
00:14:57.000
Right, the write-ahead log is essential for maintaining data integrity. If you lack this, there's no backing up which leads to data loss.
00:15:12.000
Right, and in the context of using `database_cleaner`, tables will truncate and thus effectively cater to your needs.
00:15:24.000
Would you please explain what happens after a Docker restart? Well, in that case, the existing tables would be detached unless you migrate them.
00:15:38.000
Thank you for clarifying that; it was exactly the kind of insight I was looking for. Any other questions?
00:15:53.000
Thanks for the great talk! A quick query: Have you worked with any gems such as Scenic?
00:16:10.000
I haven’t dealt with Scenic, but it certainly leverages Postgres's capabilities for managing views efficiently.
00:16:25.000
Yes, it allows for effective use of resources across your application when considering various views.
00:16:39.000
I think it might be beneficial to shift towards a model that’s more resilient against database changes.
00:16:53.000
True; while integrations of various libraries can bring more features, you can still use Rails capabilities without drastic measures.
00:17:04.000
Absolutely, and with migration, certain features can be easily integrated while not losing core functionalities.
00:17:12.000
Thank you again for all the information you shared during the talk.