Balkan Ruby 2024

Summarized using AI

How (and why) to run SQLite in production

Stephen Margheim • June 07, 2024 • Sofia, Bulgaria

In the video titled "How (and why) to run SQLite in production," Stephen Margheim explores the viability and benefits of using SQLite as a production database for Rails applications. He begins by establishing his background as a contributor to Ruby on Rails and highlights his position as an engineering manager at Test IO. His primary intention is to advocate for the use of SQLite, countering common misconceptions that it is merely suitable for toy applications or local testing.

Key Points Discussed:

- Common Misconceptions: Stephen addresses frequent concerns about SQLite in production, such as scalability, backup challenges, and limited support on platforms like Heroku. He points out that despite these concerns, SQLite can handle a significant number of write operations in certain contexts and may even outperform PostgreSQL in multi-region cloud setups.

- Costs and Predictability: The speaker emphasizes the financial advantages of using SQLite, including its free nature and predictable cost structure, which simplifies budgeting for cloud-based applications.

- Simplicity and Control: SQLite's architecture, allowing it to function as a single file on disk, provides developers with enhanced control and a straightforward debugging process. The absence of network latency further contributes to its performance.

- Production Readiness: Although directly deploying Rails applications with SQLite isn't feasible without adjustments, Stephen assures the audience that achieving a production-ready application is manageable with the right tools and setup. He introduces several gems to enhance SQLite’s performance and resilience, suggesting configuration changes for optimal use.

- Backups and Resilience: He recommends using Lightstream for backing up SQLite databases effectively, detailing its installation and configuration process. Additionally, he discusses the importance of backup verification.

- Enhancements and New Features: Stephen touches on various features provided by the enhanced adapter gem, including deferred foreign keys and virtual columns, and how these can improve the developer experience and application performance.

- Deployment Options: The speaker concludes by mentioning viable hosting options for SQLite applications, cautioning against platforms like Heroku that may not be suited for SQLite due to their storage policies.

In conclusion, Stephen Margheim presents a compelling case for SQLite in production environments, urging developers to consider its benefits in terms of cost-efficiency, simplicity, and performance potential. He encourages a shift in mindset towards SQLite, especially for scenarios not requiring massive scale, reiterating that it can support meaningful business use cases.

How (and why) to run SQLite in production
Stephen Margheim • June 07, 2024 • Sofia, Bulgaria

Stephen is an American expat living in Berlin with his wife and two dogs. He is a contributor to Ruby on Rails, sqlite3-ruby gem, and the maintainer of a handful of gems aimed at making Ruby and Rails the absolute best platforms in the world to run SQLite projects. For his day job, he is an engineering manager at Test IO.

Join him on a journey to explore the use-cases and benefits of running SQLite in a production environment.

Balkan Ruby 2024

00:00:31.920 As Fety said, my name is Stephen. You can find me on Twitter at @fractalmind. I am an American expat who moved to Berlin, Germany five years ago. About five weeks ago, I got married.
00:00:44.640 Thank you. It turned into a bit of an international adventure. We live in Germany, but getting a marriage license in Germany turned out to be functionally impossible for two non-citizens. So, we ended up taking the train up north to Denmark, and now I have a Danish marriage license. We didn't have any friends or family with us in the Danish Town Hall, so we also planned a destination wedding in Jamaica and took photos in New York. It was a fun couple of months.
00:01:26.360 As she said, for my day job, I am the head of engineering at Test IO. In my free time, I am an open source contributor and maintainer. Along the way, the work that I do involves writing on my blog. Today, I want to talk about why I start all of my new Rails applications with SQLite and have for the last couple of years. I want to specifically focus on why I choose SQLite as my database engine.
00:01:51.240 I want to suggest that for many of your projects, it would actually make sense for you to try to use SQLite as well. Of course, anytime I suggest this to a group of developers, the first question I get is, 'Why? Why should I use SQLite? Isn't that just for toy apps? Isn't that just for running local tests quickly?' I know for a fact that it is not just for toy apps, but I recognize that that's a fair question.
00:02:30.680 One of the things that I like to do as I talk to different groups of developers about this is take the opportunity to actually hear some of the reasons that you aren't using SQLite in production. Let's hear them out. Some common responses include: it doesn't scale, it has 'lite' in the name, how do you do backups, and you can't run it on Heroku. These are all really good points, and I'm going to cover a lot of them today. If I don't cover yours, let's talk afterwards.
00:03:08.519 Indeed, one of the most common responses I get is that I’m not using SQLite because it doesn’t scale, and it doesn't scale because it only supports linear writes. So, you can only have one write operation occurring in the entire database at a time. Logically, this cannot possibly scale. I want to address this point at the outset.
00:03:32.920 The specific thing that I want us to think about is how many linear write operations SQLite can perform in the same amount of time that we can perform one PostgreSQL write operation. Luckily for me, I didn't even need to do the benchmarking. Ben Johnson, the creator of Lightstream, which we will come back to, did this exact benchmarking for a conference talk that he gave at GopherCon in 2021.
00:04:00.599 He compared running a single SQLite write operation to running a single PostgreSQL write operation in three different configurations. What you see is that even when you're running PostgreSQL on the same machine as your application, you can run ten SQLite write operations in the same amount of time that you run one PostgreSQL write operation. The overhead of serializing and deserializing across that process is noticeable.
00:04:27.040 However, it's not common for most of us to build our applications today and self-host and self-manage PostgreSQL ourselves. The most common situation is that you're going to be running a cloud-hosted managed PostgreSQL service, and in those contexts, it’s quite likely that your application is going to be in a different cloud region than your PostgreSQL. If you have a multi-region deployment, definitionally some of those application instances are not in the same region. Even if you're in a neighboring region, the benchmark I'm observing looks at the latency from US East 1 to 2, and you'll see that you can run over 600 SQLite writes in the same amount of time that you have to make one PostgreSQL query.
00:05:14.199 So, absolutely SQLite does scale to a point. I'm not here to say that it's magic and you could build the next GitHub, but many of us don't need to build the next GitHub. So, why do I think it's worth taking the time to suggest using SQLite? I imagine that many of you, like me, have seen the growing number of news stories of people waking up to astronomical cloud bills.
00:05:30.600 While SQLite is free, which is amazing, it's also incredibly predictable what it's going to cost because it's just the cost of storage. You can easily see the size of your database and your usage patterns, which allows you to estimate costs. Vertical scaling is straightforward, which avoids the problem of relying on complex cloud providers. In general, when it comes to cost, these cloud-managed database providers are dropping their free tiers left and right. It’s getting harder and harder to bootstrap an application without having to pay for the database.
00:06:07.759 In fact, one of the few remaining generous free tiers in cloud-managed databases is Turo, which is cloud-managed SQLite. They’ve written extensively on why the unique nature of SQLite allows them to offer such a generous free tier. More and more people across the web development ecosystem are exploring the possibilities of using SQLite as a foundational part of the web application stack. As more people explore this, we are seeing more examples of individuals who can achieve meaningful scale for real business use cases with just a single VPS and SQLite behind their applications.
00:06:56.840 Whether they're writing JavaScript applications, Go applications, Ruby applications, or Python applications, in the Rails community though specifically, I think SQLite aligns really beautifully with the vision of Rails as the one-person framework. This was something that DHH reiterated in his talk just before lunch. This union makes the Rails ecosystem unique to truly take advantage of the full power and flexibility of SQLite.
00:07:31.599 For me, the primary reasons that I choose SQLite are that, firstly, it is simple. It is literally a single file on disk, and the database engine is just a single executable that runs inside your process. This is a mental model that I can fully understand and fully debug. To go back to like DHH's slide about boring tools, being able to know how something is going to fail provides safety, security, and speed in debugging. Simple tools allow for a lot of control.
00:08:01.840 This simplicity provides a unique degree of control when your database is simply a file on disk. It becomes possible to create a backup, SCP that backup to your local machine, and dig into production data anomalies in a fully featured but still fully sandboxed environment. That’s quite nice. You also have a single executable, and the tooling in Ruby with Bundler allows you to fully control all of the compilation flags of SQLite directly. You can have special compilation instances of SQLite for your application without any additional tooling.
00:09:30.760 The general simplicity of SQLite opens up some unique developer experience opportunities that we'll discuss more at the end. Of course, it goes without saying, but it is worth saying: it is fast. It's hard to appreciate until you play with it. Working with data that is right next to your application—same machine, same process—provides an experience free from network latency. Many of us have become accustomed to dealing with network request latency, and the absence of it can be a bit addictive. I do speak from experience.
00:10:32.040 So, that is why I think you should consider using SQLite. But I am curious: how many of us in the room have run or are running some kind of web application using SQLite in production? Show of hands. Yes, a couple. But not many.
00:10:57.160 I want to make sure that more of us feel comfortable and confident to actually run SQLite in production. So, I want to turn from the more theoretical to the pragmatic and let's talk about how we can build, deploy, and maintain a SQLite on Rails application.
00:11:03.240 Unfortunately, I have some bad news: it is not viable to take a 'rails new' application, put it up on a server, and walk away. You are going to experience pain; it is not production-ready. Personally, it’s one of my goals to make that true for Rails 8—that the 'rails new' out of the box experience is production-ready.
00:11:42.560 But as of today, in Rails 7.1, even the main branch of Rails that is not true. We’re going to have to do some tweaking and fine-tuning, but I hope to show you over the course of the next few minutes that it’s not too complicated; it’s not scary. What it gives you in the end is something that is indeed production-grade software with a great developer experience.
00:12:20.639 So, what does it take? Luckily, it's not much. Here’s a slide you want to take a picture of. I've got QR codes; these are the repositories and blog posts talking more about these tools.
00:12:29.840 This is the sum total of the requirements to have a production-ready SQLite on Rails application.
00:12:34.560 Okay, that is, one of these days I want to do a version of this talk and I will just stop there. But I recognize that probably a large number of you aren't yet fully convinced and are not really comfortable to say, 'Yeah, okay, I’ll just go home and spin this up.' So, let’s take a little bit more time and dig into some of the details.
00:12:51.680 What are these gems doing? Why are they doing them? What else is worth talking about? I want to start with performance. How do you make your application performant? How do we take advantage of the speed of SQLite? There are four must-haves and one nice-to-have. I won’t have the opportunity to go in depth on these.
00:13:21.880 But I presented this in detail about two weeks ago, and for anyone who might have been at that presentation, you would have enjoyed about an hour-long deep dive into these details. For everyone else, I extracted that into a blog post with all of the slides. I was honored and humbled to hear Javier say that it was excellent content. So, if it’s good enough for Javier, it’s good enough for you.
00:13:50.639 At a high level, what you need to ensure is that you’re using write-ahead logging mode. As of Rails 7.1, that is the default, so you don't have to do anything for that. The next three are what the enhanced adapter provides: immediate transactions, a timeout mechanism that isn't blocking Ruby's Global VM lock, and a fair retry interval.
00:14:10.920 The last point is a nice to have. This is an experimental feature I’ve been working on to isolate the connection pool, ensuring that write operations don’t saturate your connection pool and block read operations.
00:14:33.680 Unfortunately, I don’t have the time to dive deep into them. There's a lot of depth to explore, and I explain all of the problems and all of the solutions in my blog post. If you want to talk more about it, come and talk to me later. Luckily, it’s simple to get all of these things; the enhanced adapter gem automatically injects the configuration into your Rails application.
00:15:02.800 It’s literally as simple as 'bundle add' and you're done. If you want to experiment with isolated connection pools, that is an opt-in configuration. Just throw this in your application.rb, and you’ll get that. This will give you the full extent of performance as I know how to achieve it with those two lines of code.
00:15:21.960 Now let’s discuss resilience. It’s not just enough to have an application that runs quickly; you need to make sure that your data is backed up. I say this as someone who has accidentally deleted their production SQLite database before.
00:15:47.799 For about two weeks, I kept thinking, 'I need to add backups. I need to add backups.' Then, I accidentally deleted it. It was an unfortunate day, so this is something you should really do on day one. That's why I've tried to invest time in making it as simple and easy as possible. There are several tools, but I genuinely think that the best tool for the job is Lightstream.
00:16:05.680 Ben Johnson, who created Lightstream, is one of the experts on SQLite, and if you’re interested in this topic, he's really worth following and reading. It is a single Go executable that streams your changes to your database to an S3-compatible storage service. You can then use the CLI to restore a backup onto any machine.
00:16:19.960 Since it’s just a single executable, I wrapped it up in a Ruby gem, similar to how the SQLite3 Ruby gem works. So, you can use 'bundle add lightstream,' and it will compile it for your platform. In order to use it, there’s an installation command. You only need one command to install the gem, but still two commands to set it up.
00:16:37.920 As always, I have QR codes for more information. The installer will create two files: the first is the configuration file for the Lightstream utility itself. It uses YAML configuration, and what the installer will do is introspect your database configuration, find all of the SQLite databases in your production environment, and write a Lightstream configuration that backs them up.
00:16:55.560 If you've got a database for Active Record, a database for Sidekiq, and a database for Action Cable, it will back up all of them. It uses environment variables to keep your secrets secure. To wire everything together, it will create an initializer file, allowing you to configure those details in Ruby.
00:17:14.319 The default example in the initializer would typically pull values from credentials, but you can structure it however you want. With those two files, you have everything needed to get Lightstream up and running, but you do have to turn it on.
00:17:33.319 The gem provides you with a Rake task that maps directly to the CLI commands. This is helpful because it binds the Ruby configuration with the YAML configuration. It’s all about making sure everything runs smoothly. This process runs as a separate instance, so by default, the gem expects you'll add this command to your Procfile. If you're not using a Procfile, just throw this command wherever you manage your processes.
00:18:02.360 Replicating your database is a nice start, but we also need to discuss how to restore a database. That’s another simple Rake task where you pass in the database path, which matches something in your configuration. You can run this command on the same machine where you're replicating or even from another machine, including your local laptop.
00:18:40.960 It provides an easy mechanism to pull production data down without having to do an SCP. I was thinking, is this sufficient? Would I want something more? Yes, I would want something more. It’s not just enough to restore; I want to verify that my backups are working.
00:19:01.000 I want to know that I have up-to-date backups, and I can restore them as needed. So, I added a new Rake task. This is not one of the Lightstream commands; this is specific to the gem. You call this task with the database path, and it will pull down the most recent backup and verify it.
00:19:29.600 It will output details such as the size in bytes of the original database and the restored copy. It’ll show the delta and the number of tables. I believe there are additional things to add, and I am currently working on a configuration option so you can define custom logic to verify a backup, too.
00:20:00.680 This is a significant step towards tying this process into your routine. You might even create a recurring Sidekiq job that runs this verification each day or week and sends you reports. This way, you can sleep well at night knowing your backups are not just present but also restorable.
00:20:30.960 However, I don’t think that’s entirely sufficient. I want to quickly and clearly see in my Rails application whether things are running smoothly. I want a little web dashboard.
00:20:47.400 I want to see Lightstream's process status, the PID, and information about all the databases it's backing up. To achieve this, I created a simple web page that aggregates all of the information Lightstream provides and maps it onto a dashboard within Rails.
00:21:08.679 You have the freedom to secure this route however you want, but it’s quite nice to check the status of everything Lightstream knows about at any specific point in time.
00:21:35.760 Resilience is taken care of; now let’s talk about enhancements. There are several ways to enhance the experience of using SQLite with Rails.
00:21:50.000 The enhanced adapter gem, as the name suggests, includes many features. Beyond performance improvements, it also brings many SQL features into the Active Record adapter. Some of these features are already in Rails 7.1 and others are in the main branch.
00:22:13.680 One of the features is deferred foreign keys, which is available for PostgreSQL and MySQL adapters. The SQLite syntax is supported, and the Active Record adapter supports it as well.
00:22:39.320 Another feature I love is virtual columns. SQLite supports both stored and unstored virtual columns. While this feature is in the Rails main branch, it isn't in a point release just yet. For those of us familiar with it from PostgreSQL, we agree it’s a valuable addition, and I can't wait until it's officially available.
00:23:06.720 Moreover, you have the ability to specify custom returning values. This feature drives auto-hydrating virtual columns, so when you run 'create' commands that include virtual columns, the Active Record object will return those values.
00:23:36.760 Additionally, you'll be able to specify any of SQLite's configuration pragmas directly in your YAML file. You can also list any extensions you want to load, using an 'extensions' array provided by the enhanced adapter gem.
00:24:06.160 In addition to these enhancements from the gem, the Rails ecosystem is moving toward handling persistent data needs within the database. This applies to tools like Sidekiq and possibly future developments as well.
00:24:38.960 In applications using PostgreSQL with Active Record, it might make sense to include Sidekiq and back those jobs with a SQLite database. You can mix and match based on your needs; it’s not an all-or-nothing situation.
00:25:06.760 This creates valuable use cases for SQLite. Personally, I decided to go all-in because I'm looking to minimize costs while mastering a small number of tools, but that's not a requirement. The goal is to have a cohesive ecosystem where all components work together.
00:25:46.440 You can control the compilation settings of SQLite, which aligns with its recommendations. A note worth reiterating is that the SQLite team prioritizes backwards compatibility over hot new features. They want to ensure you can still run a database created decades ago.
00:26:31.680 Though this might be annoying at first, it’s reassuring knowing that you can upgrade without breaking existing functionality. Their commitment to thorough testing further enhances reliability, which is a key part of verifying the software's longevity.
00:27:03.200 If you want to read more about the SQLite3 gem and its compilation settings, I've also discussed this in my blog post.
00:27:22.679 One enhancement I find particularly useful is the concept of branch-specific databases. Many of us have worked on long-running feature branches, only to find ourselves in messy situations when colleagues need reviews.
00:27:44.279 It’s much easier to work with entirely isolated databases for each Git branch. I was amazed at how simple it was to set up. By piping the database YAML through ERB, we can dynamically name our database according to the current branch.
00:28:09.200 The Active Record provides programmatic access to tasks, enabling you to ensure that the database is migrated and seeded properly for use. This has been incredibly helpful for teams working on shared applications.
00:28:35.920 It only takes a couple of lines to implement, and I’m sure many more opportunities will arise as we explore this space further, finding creative ways to incorporate SQLite into our applications.
00:28:55.200 Finally, let’s touch on deployment. It's worth clarifying that, as we heard earlier, you can’t run a SQLite application on Heroku due to their ephemeral storage.
00:29:20.960 If you store your database there, you're going to experience significant pain within 10 to 24 hours. Alternatives include newer services like Render or Fly.io. Personally, I use Hatchbox, which I find incredibly powerful and easy to use.
00:30:00.760 I have a detailed blog post explaining how to deploy a SQLite on Rails application on Hatchbox if you want to check that out. Also, as mentioned yesterday, there’s the new Kamal tool that Joseph discussed.
00:30:31.079 I personally haven't used it because I've found my SQLite on Rails applications work seamlessly with cross-platform reproducible builds using Bundler and Ruby gems. I don't require Docker for additional dependencies.
00:30:50.319 For those already in the Docker world, many resources discuss how to deploy SQLite Rails applications using Docker. But that's really all I have to say on deployments.
00:31:06.560 We’ve covered performance, resilience, enhancements, and deployment. As far as I know, those are the four key pillars of running a production application. If I've missed anything, please let me know. I’d love to expand on this topic and make this talk even more informative next time.
00:31:50.440 I hope that those who managed to stick around have better insight into why DHH might emphasize using SQLite for certain use cases. If you don't need truly global scale deployments, and if you only need, say, 500 requests per second, then SQLite on Rails can serve you well. It’s likely to be cheaper, easier to build, maintain, and deploy.
00:32:39.360 While it's not currently viable to take a Rails application and just deploy it into production without some adjustments, it can be simple. It can amount to just a few CLI commands and some additional tweaking to create a fully functional application.
00:33:08.400 You could run such an application on a $15 a month VPS and drive real business value from it, as R explained. If that's not where you are at this point, that’s totally fine as well—everyone's approach varies.
00:33:28.600 Thank you very much once again for your attention!
Explore all talks recorded at Balkan Ruby 2024
+5