Talks

How (and why) to run SQLite in production

wroc_love.rb 2024

00:00:09.719 In a multirack conference, I would typically say how awesome it is to see so many people excited to learn about running SQLite in production. However, I recognize that you're all stuck with me.
00:00:15.719 But I'm hoping that by the end of this talk, you will actually be genuinely excited about the potential of running SQLite in production. To start, let me reintroduce myself. My name is Stephen, and you can find me on Twitter at fractalmind. As Lucas said, I am an American living in Germany.
00:00:30.560 I've been in Germany for five years, and four weeks ago, my wife and I got married. It turned into a bit of an international adventure. Thank you! We both live in Germany, but neither of us are German citizens. German bureaucracy made it functionally impossible for us to get a marriage license in Germany, so we ended up taking a train north to Denmark to get our marriage license.
00:00:49.800 Unfortunately, none of our friends or family could come to Denmark, so we planned a destination wedding in Jamaica to have some friends and family with us. To get to Jamaica, we had to fly through New York, so we had our photo shoot there.
00:01:08.040 Finally, we made it back to Berlin, and we are now legally married. We had our ceremony, and it only took four countries to do it. We live in Berlin with our two dogs, Ruby and Tuck, who I think bear a striking resemblance to the two dog emojis, but that might just be luck.
00:01:21.320 At my day job, I'm the head of engineering at test.io, which is a crowd-driven quality assurance testing company. In my free time, I'm an open source developer, contributing to and maintaining a number of gems in the Ruby and Rails ecosystem. Along the way, I write a fair bit on my blog.
00:01:45.039 Today, I want to discuss why I start all of my new Rails projects, whether those are projects at work or demo apps for blog posts or side projects. Specifically, I want to talk about why I use SQLite as my database engine and suggest that it might make sense for some of you, some of the time, to choose SQLite as well. In fact, I want to propose that it might make sense to use SQLite as the driver for all of your application's persistent data needs, not just to back your models.
00:02:21.720 Whenever I make this suggestion, the first question I inevitably get is: 'Why would I do that? Isn't SQLite for toy applications?' It’s a fair question, and while it certainly isn’t just for toy applications, I’m curious to hear from you all. What are some of the reasons that you have hesitated or thought it seems risky to run SQLite in production?
00:02:38.680 You can just shout out some thoughts. Like, why have you not used it?" Some say, 'Just a single node?' Fair point. Others mention 'Cloud hosting' or 'I already have PostgreSQL,' which is very true. I want to address some of these concerns over the next few minutes.
00:03:14.640 I want to start by addressing one of the most common reasons why people think SQLite can't possibly be production-grade software, which is the fact that it only supports linear writes. This is a true statement: you can only have one write operation at a time with SQLite. But does that really mean it can’t be used in production? I think the answer is 'no.' Let’s dig into why.
00:03:43.679 The most important insight is to step back and consider how many linearly ordered write operations a system like SQLite can perform in the same amount of time that PostgreSQL would take to perform one write operation. It helps us see the difference in performance that we are looking at. Luckily, I don’t need to do this work; Ben Johnson, the creator of Lightstream, did some benchmarking for a conference talk he gave at GopherCon in 2021, looking at running PostgreSQL in three different contexts.
00:04:26.000 What you can see is that even when you’re running PostgreSQL on the same machine as your application, just the overhead of interprocess communication means that you could run 10 SQLite writes in the same amount of time that you could run one PostgreSQL write. Now, it's not very common for Rails applications in production to self-host and self-manage their own PostgreSQL on the same machine as their application. Typically, you're going to have PostgreSQL running on its own machine.
00:05:14.680 If that PostgreSQL instance is in the same availability zone as your application, that is already twice as slow to three times as slow as if it were running locally. If the PostgreSQL instance is in a different region than your application, it's about three times slower than it would be if both were running on the same machine.
00:05:56.840 In fact, in today's cloud-hosted environments, the most marketed solution is to use a managed database service. Odds are that in that instance, your PostgreSQL server will be in a different region than your application, and if your application is distributed across multiple regions, definitionally, almost all of them will be in different regions. Even if those two regions are neighbors, such as US East 1 and US East 2, you can still see that you could run 1,000 SQLite write operations in the same amount of time that your PostgreSQL instance is running just one.
00:06:51.760 So yes, while SQLite only supports linear writes, you can pack a lot of linearly ordered writes in the same amount of time that you make your call to a managed database. Now, I want to speak more about why I think you should consider SQLite, rather than spending all of this time reinforcing the reasons you might not want to use it.
00:07:11.560 Many of you may have seen the growing articles about surprising cloud bills. People waking up one day to check their dashboard and going, 'Whoops! I don't have $20,000.' SQLite is one of the cheapest databases in the world, being both free and incredibly stable. Running it is quite inexpensive. More importantly, its cost is incredibly stable and predictable. That's a significant bonus. For those managed database services, we are seeing many of them drop their free tiers.
00:07:54.560 It is becoming harder and harder to start a side project where you do not have any revenue while also getting a high-performance cloud-based managed database service for free. One of the last remaining free tiers in the managed database space is Turo, which manages Cloud SQLite. They have highlighted that the nature of SQLite and its predictably low-cost attributes is what enables them to maintain a generous free tier.
00:08:34.600 More and more people are starting to see this. It’s not just within the Rails community; people are beginning to recognize that SQLite really does make sense as part of an application stack for running production web applications. As more and more developers experiment with it, they discover that it is capable of handling production-grade loads.
00:09:09.840 During his Rails World keynote, DHH reiterated that Rails is conceptually focused on being The One-Person Framework, aimed at enabling the smallest possible team—just you as a full-stack developer with an idea—to build feature-rich, valuable applications. I believe SQLite is uniquely aligned with that vision. I think there are three reasons why Rails as The One-Person Framework and SQLite—as the one-person database engine—make good bedfellows.
00:09:51.360 The first is simplicity. Your database is literally just a file on disk. Your database engine is a single executable that runs inside of your application's process. This simplicity provides a unique level of control because you can embed every aspect of your database: the engine, its configuration, into your repository. This control unlocks unique developer experiences, whether that's branch-specific databases or controlling the compilation flags to tweak the SQLite executable. You can do that through Bundler.
00:10:47.160 You can have an experience similar to Docker without actually running Docker. You might have multiple applications, and even without Docker, you can keep everything you need in your Git repository. You call push main, and your CI builds it for production. Everything lives inside of the repository.
00:11:38.760 Furthermore, SQLite is fast. In fact, it can be a thousand times faster than cloud-driven PostgreSQL. The kind of freedom it gives you is hard to comprehend until you try it. There's a sort of famous, or infamous, part of the SQLite documentation that suggests you should probably not prematurely optimize away N+1 queries. They even have a section about it, and while it may sound heretical, it actually makes sense in certain contexts. This flexibility makes SQLite quite unique.
00:12:15.760 So why do I choose SQLite? I choose SQLite because it enables me to build feature-rich, valuable, and fast applications quickly, with manageable maintenance. It allows me to run these applications in production both resiliently and with high performance. At this point, I'm curious to know—does anyone here have an application running in production using SQLite for any purpose?
00:12:39.970 Yes? Well, if it gets me a hand, then it counts! This is to be expected. What I want to do now is shift to the practical side of things. Let's talk about how you can actually make this work, how you can get an application in production that runs performantly and resiliently.
00:13:09.720 What steps do you need to take? What does it look like? That way, we can all leave this session feeling comfortable and confident that you could start a new application using SQLite, and it would work well. So, here's the truth: running SQLite on Rails out-of-the-box, with the command 'rails new --database=sqlite,' isn't viable right now. Unfortunately, it won’t be as performant or resilient.
00:13:47.280 It’s my personal goal for Rails 8 to make this statement true, allowing you to run 'rails new' and have a production-ready out-of-the-box experience. However, that just isn’t the case today, and I'm not going to hide that from you. So, let’s discuss the necessary steps, why they’re necessary, and what it looks like.
00:14:05.830 Here are the steps: first, you need to add the Enhanced Adapter gem. This is a gem that I have written that will provide you with performance enhancements. Next, add the Lightstream gem, which is a wrapper for the Lightstream utility that provides point-in-time backups, contributing to resilience. Then, run the installer for Lightstream. That's three CLI commands. If you want to see more of those, I've got the GitHub repos and some blog posts talking more about them.
00:14:59.040 Thank you all for your time; I hope you had a great time. Now you know how and why to run SQLite in production. Though to be fair, I have a hundred more slides, and that is the takeaway. So if I lose your attention for the rest of the time, that’s okay—I understand it’s the morning, and I’m doing my best. I want to take a moment to dig into what these gems are doing, specifically focusing on how the Enhanced Adapter gem ensures our applications will perform well.
00:15:32.040 I want to delve deep into performance because there are several problems, and we are solving them automatically for you within the gem. It performs magic behind the scenes, and it’s important for everyone to understand what is happening, what the problems are, what the solutions are, and how we can encapsulate them in a gem so you can just run 'bundle add'.
00:16:05.320 To explore this, I built a Rails application, called Laura M News, which is a basic Hacker News clone. Users can make posts and comments, and all of the content is stored in SQLite. Initially, I intended to do more live coding; however, that didn’t pan out. Instead, I took screenshots and we'll look at the code in slides. The repository will be available shortly.
00:16:53.760 To start, I’ll run it on my local machine in production mode and use a load testing CLI utility to simulate load. We'll use that to investigate the pain points in my application and determine where we need improvements.
00:17:36.079 We’ll start the benchmarking with a controller simulating standard CRUD actions: creating a post, creating a comment, going to the Post index view, and visiting the Post show view. First, we'll hit the post create endpoint since that's where SQLite's bottleneck lies.
00:18:00.000 We will start off sending a concurrency level of one, meaning we will send one request at a time for five seconds. When we do that, things look pretty good; all 302s and decent stats indicate that we are on the right track. Perhaps I was incorrect, and it is indeed viable, but of course, it isn't.
00:18:45.920 When we bump that concurrency up to four and send waves of four requests to our server, we suddenly see some errors—500 status codes. This is the first problem you'll encounter if you just set up your Rails app with the default SQLite settings. You’ll notice a lot of reports in your error monitoring service.
00:19:09.960 So, we go back and check the logs to determine the cause of these 500 errors. We find the dreaded SQLite busy exception, which is a common issue for those of us running SQLite in production. It indicates that the database is locked.
00:19:25.280 As mentioned earlier, SQLite only supports linear writes, meaning it utilizes a locking mechanism to ensure that only one write operation happens at a time. If one connection acquires that lock and starts its write operation, any other connection that attempts to write simultaneously is told, 'I’m busy; come back later.' However, it doesn’t process every operation in that way; SQLite maintains a queuing mechanism that incorporates timeouts.
00:20:12.560 Let’s dig into how to fix this issue. We’ll work backward from the solution. Notably, as we double our load, say from eight to sixteen concurrent requests, we notice our error percentage jumps over 50%. This clearly indicates a failure in production-grade software.
00:20:54.840 The answer lies in using immediate transactions. SQLite’s support for multiple behaviors in transactions is inherently flawed. When you don’t explicitly state something, SQLite defaults to deferred mode, meaning it will only acquire the write lock when it sees an actual write operation. This behavior makes sense for scenarios where one connection may often interact with the database, but it becomes problematic for Rails applications where write operations are almost ensured.
00:21:45.520 The root of the issue is that active record, by default, uses multiple connections to the database and assumes that all write operations will follow a read operation in the transaction. With deferred mode, if a transaction begins with a read operation, it doesn't acquire the write lock until it actually performs a write operation, which can result in errors.
00:22:49.440 To mitigate this, we need to ensure that we acquire the write lock automatically when beginning the transaction. By doing so, we can allow SQLite’s retry mechanism to function effectively, as the system tries to manage connections efficiently.
00:23:31.560 Now, how do we implement this within our Rails application? Since version 1.6.9 of the sqlite3 gem, you can set a default transaction mode for your connection. Rails will pass the top-level database.yml keys directly to the database initializer gem. Just set the default transaction mode to immediate, and your Rails transactions will now behave appropriately.
00:24:08.760 When running our load tests again, we should see the results improve. Indeed, at a concurrency level of sixteen, we start to reduce the number of errors, indicating we're making progress. However, we still encounter some errors.
00:24:56.560 If we delve into latency metrics, we begin to understand the challenges that persist. Analyzing data from our load testing reveals that P95 and P99 latency values are at alarming rates beyond acceptable levels. This indicates further issues we must address.
00:25:38.000 In our benchmarking, we set the maximum allowable duration for a transaction to acquire the lock, which defaults to five seconds. When queries exceed this timeout, they throw busy exceptions, resulting in a problematic state. Clearly, we need to adjust our configuration to address this timeout.
00:26:32.760 Our previous timeout setting has proven inadequate. SQLite's API allows us to configure a busy timeout, indicating the maximum amount of time a connection can retry acquiring the lock before failing. If this time is exceeded, we will see errors.
00:27:00.960 The issue is that many of our requests aren't acquiring locks promptly enough, leading to negative performance impacts. To optimize, we will implement a fair retry interval that allows us to streamline how queued operations function in practice.
00:27:47.360 Instead of relying on SQLite's busy handler for back-off timing, a more efficient way is simply to set all queries to wait a certain period, say one millisecond, before retrying. This way, no query will unfairly penalize older queries waiting in the queue.
00:28:32.760 We can replace the current busy handler function in SQLite with one written in Ruby that utilizes the Sleep method. Since Ruby waits while executing the Sleep command, it allows the global VM lock to be released, thereby optimizing request handling and allowing greater concurrency.
00:29:18.840 If we integrate this change into our application, we should find that our latency metrics improve dramatically. Subsequent load tests reveal a flattening in latency trends—a result we want to see. It shows that the system is capable of handling concurrent requests more effectively.
00:30:00.760 We aim for aggressive performance with minimal latency, ensuring that no requests exceed the acceptable duration. With these changes, we should now have a foundational level of efficiency in our database.
00:30:41.520 Problems still persist at the edge case latency levels. To address this, we need to adjust our busy timeout algorithm further. SQLite's current algorithm penalizes older queued operations excessively, leading to suboptimal performance as newer queries can acquire locks preferentially.
00:31:03.520 To flatten out our P999 latency graph, we should treat each query fairly by implementing a uniform delay before attempting to reacquire the lock. By refocusing efforts on fair retry conditions, we can create a more resilient system. Thus, we will commit to enhancing the retry logic further for optimal performance.
00:31:49.520 By removing unnecessary complexity from the retry mechanism, we improve overall performance and can treat every write operation uniformly. Our aim is to simplify the process so it becomes straightforward to implement within the SQLite framework.
00:32:29.760 Fortunately, this updated logic is already integrated into the main branch of SQLite. While it may not be in a point release yet, this works towards a future where our Rails applications can be fully optimized.
00:33:06.760 When contemplating how to run SQLite on a Rails application effectively, four key principles emerge. First, use immediate transactions. Second, ensure the timeout mechanism for queuing queries does not block the GVL. Third, maintain fairness with retry intervals. Finally, employ write-ahead logging journal mode to further enhance performance.
00:33:49.120 The last point regarding the journal mode is critical, as it allows multiple concurrent reads while still maintaining linear writes.
00:34:22.760 Each of these principles can significantly enhance resilience and performance when scaling SQLite in Rails applications. It's imperative to adopt these strategies to build a capable and reliable setup.
00:35:00.760 For those interested in resilience, the future of Rails is moving towards a model where the database forms the foundation for all IO-bound components. With this in mind, utilizing SQLite facilitates more seamless experiences when developing concurrent systems.
00:35:58.920 SQLite proves to be a useful foundation, especially when combined with solid gems for caching, queuing, and other solutions that seamlessly fit into the ecosystem.
00:36:36.760 Moreover, control over SQLite's compilation is becoming increasingly accessible through Bundler. By customizing the SQL configuration with C flags, developers can enhance application performance and tailor their ecosystem.
00:37:14.760 Another vital topic is branch-specific databases. Given that a SQLite database is simply a file, it provides developers with an excellent opportunity for isolated environments. By integrating this into Active Record's database tasks, developers can handle migrations easily and spin up isolated databases per branch.
00:38:04.760 In closing, establishing resilience in your database setup is essential. I strongly encourage you to integrate light stream into your application, as it automatically streams database operations to an S3 compatible bucket.
00:39:00.760 Every Rails application faces moments where database mishaps arise. It’s vital to maintain backups and keep resiliency a part of your workflow.
00:39:51.760 I cannot stress enough the importance of establishing this pattern within every application you manage. As developers, we know our databases are ultimately crucial.
00:40:42.760 I hope that, after today, you can see the merit in using SQLite more effectively than ever. I understand that our conversation has focused primarily on performance, but please feel free to engage with me further on resilience.
00:41:10.760 I appreciate the opportunity to discuss and present these concepts. If you have been intrigued by what you heard today, I encourage you to explore these ideas further.
00:42:29.800 The floor is open for your thoughts and inquiries regarding SQLite in production.
00:43:55.760 One of you asked about the performance focus versus security expectations. It’s a legitimate concern to heighten awareness for any embedded database approach.
00:45:09.760 Simply put, while some trade-offs exist, it is essential to weigh the benefits in performance against the potential security consequences of embedding SQLite directly.
00:46:36.760 To conclude, I have provided my perspective on security. However, every implementation must be gauged against the required security benevolence specific to the application.
00:48:25.760 No, nothing in SQLite locks you to the setup forever. You can migrate to PostgreSQL as the application evolves.
00:49:54.760 The flexibility of Rails and Active Record allows room for future migrations and transitions with existing setups.
00:51:00.760 I want to thank you all once more for your time today. Please feel free to connect with me later during the conference for further discussions.