RubyConf Taiwan 2023

How (and why) to run SQLite in production

#rubyconftw 2023

How (and why) to run SQLite in production

You've heard the whispers, sensed the hype, but you're still not sure what all the fuss is about with SQLite these days. Join me as we explore the use-cases and benefits of running SQLite in a production environment. Along the way, we will learn why SQLite makes sense as your next production database and how to ensure that your setup is optimized for end-user performance and developer happiness. We will setup and deploy a full Rails application—with caching, background jobs, websockets, and full-text search all backed by SQLite—in these 40 minutes. Come along for the ride!

RubyConf Taiwan 2023

00:00:02.120 Okay, hopefully too many of you aren't disappointed that you're not learning about JRuby, and instead talking about SQLite. But if you are disappointed, I'm hoping that by the end of this talk, you will be excited.
00:00:08.040 Today, I want to talk to you about how and why you might want to run SQLite in production. A little about me: my name is Stephen. I'm FractalMind on Twitter, and no, I will not be calling it X. I am an American and I moved to Berlin, Germany about five years ago. In two and a half weeks or so, I get to marry my fiancée, who is Russian and moved to Berlin about ten years ago. She's actually here today; this is her first time watching me give a conference talk, so I don't feel any pressure.
00:00:25.960 At my day job, I'm the head of engineering at a company called Test.io. We are a crowd-driven quality assurance testing provider. In my free time, I am an open-source developer actively contributing and maintaining a number of different gems, many of them in the Rails and SQLite ecosystem. I also write on my GitHub Pages blog, and you'll find a lot of content there; I'm going to be discussing various topics I've been writing about over these last few months.
00:00:50.600 Now, this is how I start all of my new Rails applications, and I want to explain why you might want to start your new Rails applications this way as well. Specifically, I want to talk about why you might want to use SQLite as your database engine.
00:01:05.280 When I suggest using SQLite on the internet or in conversation, the first question I always get is, 'Why? Why use SQLite in production?' I'm curious—has anyone here run or is currently running an application in production that uses SQLite?
00:01:20.599 Okay, I see a few hands. Has anyone spent more than a few minutes with the Rails default database before switching it to Postgres? Did you use SQLite in your Rails application on your local machine but made sure to change it to Postgres before you pushed it to production?
00:01:32.800 One thing I always do when I get this question is turn it right back around: why not use SQLite in production? Why change the database engine? I'm actually curious—does anyone have some reasons? Have you encountered pain points or heard things that made you hesitant to use SQLite in this context?
00:02:04.479 Yeah, it's a common issue that people talk about—like, 'Hey, I need my data to be persistent. If I just have a file on one server, what if something happens to that server? Do I lose all my data?' That's definitely one thing we will talk about today. Any other reasons you've heard for not using SQLite?
00:02:42.120 Well, there are a lot of myths that are grounded in some reality, and I want to talk about that today. One of the big ones is that SQLite can't really scale. Sure, you can run it on your machine, but if you put an actual application in production with real users, are you going to be able to handle more than ten users? What are we going to do about the fact that SQLite doesn't allow concurrent writes?
00:03:11.640 Fundamentally, what I want to do today is show you that, as of now, a lot of these myths are not true. They might have been to some degree true in the past, but they aren't anymore, and it's time to start reassessing our views.
00:03:27.760 In fact, DHH, in his keynote at Rails World, reemphasized the point that Rails, at its core, is the one-person framework that bridges complexity to empower the smallest possible team. You, on your laptop by yourself, can build rich and valuable web applications, and I believe that SQLite is the perfect companion to that vision.
00:03:57.720 Firstly, because it is simple—the database is a literal file on disk, and the engine is a single executable. This means that you can fully embed the entirety of the database into your application. That simplicity gives you unique control because when everything is embedded into your application, it becomes possible to configure everything directly in your code.
00:04:19.120 You can set up all of the fine-grain details of your database and how it needs to work in your repository code, checked into Git. You don't have to ensure that you're SSHing into your Postgres server before you go to production to get the settings just right. It's also fast: when your data lives on the same machine as your application, you remove network latency, and you immediately go from measuring queries in milliseconds to microseconds.
00:04:54.640 That's really fast, and it opens up a lot of unique possibilities. When your data retrieval happens in microseconds, there's really no point in worrying that much about optimizing queries when working with SQLite. You can make as many queries as you need. This works particularly well when using an ORM like Active Record—it can become troublesome to manage your queries and database interactions when you have to ensure you're making as few efficient queries as possible.
00:05:34.839 The handshake and network latency make it costly to run 200 queries inside one web request, but that just isn't a problem with SQLite. That's fundamentally why I like to use SQLite in production.
00:06:06.639 What I want to focus on today is how to actually build, deploy, and run a full-fledged web application on SQLite.
00:06:12.840 Let's start where every project begins—with a minimum viable product. We're going to use modern Rails tools and get something going. Originally, I had planned on doing a bunch of typing to demonstrate that I'm a real coding ninja, but I realized that was a complete waste of time. None of the application details have anything to do with SQLite, and they're not particularly useful to this talk.
00:06:40.799 Instead, I want to show you what I've created, and we'll use this codebase as the foundation for the rest of the talk. How's the font size—good? Should I increase it? Okay, better.
00:07:06.679 Here is an incredibly simple Hacker News clone. We have users with screen names; we don't need to worry about authentication at this point, but they can sign in with sessions. Users can create posts that have a title and description. We also have comment functionality—users can create comments just with a body. We're not doing anything with voting or ranking just yet; the simplicity here allows us to see clearly how it all works.
00:07:41.480 Users can sign up and once authenticated, they can view the details of their profiles, sign out, create new posts, edit their old posts, and create new comments. Anyone can look at posts, and we're going to be integrating some benchmarking to quickly assess the performance characteristics of this application.
00:08:12.920 This is a custom controller I developed to get some dynamic, real-world usage in the application without sending a whole bunch of requests. Our benchmark will focus on read-heavy operations, primarily hitting the post show, post index, and comment show actions. This will create a balanced workload.
00:08:53.920 Before all of this, we'll update the user's last activity timestamp. So, we'll run three queries here to find the user and update this information, create a post, create a comment, destroy a post, and destroy a comment. This is pretty standard for what this application is doing, and we can hit this endpoint and benchmark the performance of our application.
00:10:09.360 So, let's start running this and see what it's like. We have a landing page showing 100 of the most recent posts. You can view the details of posts and see what comments have been created.
00:10:44.720 You should explore the application. What I've shown you so far is the foundation of what we're working with. Now we have an application that's working and running on our local machine, but obviously, this talk is about running things in production.
00:11:16.640 How do we get it into production? There are several different hosting providers that make sense, but a critical detail is that Heroku is not one of them. Any hosting provider that only provides an ephemeral file system cannot work as a hosting provider for a SQLite Rails application—you have to be able to store that file on disk.
00:11:39.840 There are a number of providers that do offer this capability. Render and Hatchbox are among the more popular ones in the Rails ecosystem. I personally use and love Hatchbox; I've been using it for years. Although it is a bit lesser-known than Fly or Render, it essentially operates as a service where you give it access to your Digital Ocean, Linode, or Hetzner account along with a repository and say, 'Deploy this app.'
00:12:06.160 It is like a web interface for running straightforward deployment scripts. Originally, I planned on typing everything out live, but I realized I didn't want to just sit and wait for ten minutes as it sets everything up on the server. I've taken screenshots of literally every step.
00:12:38.440 So, what does it look like? First, we need to create a cluster. Hatchbox has this concept of clusters because, typically, you would want multiple instances of your web app running behind a load balancer, along with a primary database server and replica server. However, for simplicity's sake, we'll only create one server.
00:13:18.639 We need to connect our Digital Ocean account to Hatchbox and choose a region for the server. Digital Ocean offers several options; I typically choose Frankfurt for its central location. Once that's done, we can start creating servers for this cluster.
00:13:53.839 Hatchbox will remind us that the server has to serve two core responsibilities: running the web application and handling automatic migrations. Let's configure a server to do both. Digital Ocean has a variety of options in the Frankfurt location, and the largest one is quite beefy, but for this MVP, let's choose the smallest, simplest, cheapest option at $4 a month.
00:14:25.440 Once we tell Hatchbox to go, it starts provisioning our server. This takes some time; in real life, it takes about ten minutes. You can view the logs in the Hatchbox web interface and see everything it's doing. It's not hiding anything from you.
00:14:43.679 When everything is done, your server will be ready. Now, we need to deploy our application. We go to the apps tab, click on 'New App,' and select the cluster we want to deploy to. We simply provide our GitHub information and the repository name along with the branch.
00:15:06.760 Once that's done, we hit 'Deploy the App.' If we were doing this live, it would take about three minutes for the first deploy to finish, but it wraps up quickly, and now we have an app running in production. I'd love for anyone with a phone or a computer to go ahead and check out the demo.
00:15:43.920 The first thought we may have is, 'Sure, you have an application, and it is technically running in production, but is it usable?' I want you to explore the application, and I'll do the same to see what it feels like.
00:16:06.680 This is not local; this is the production application running on conference Wi-Fi. Let's reload the index. We've already got some users here. We can view a post that loads quickly. Let's check out this user's details—11 posts, 97 comments.
00:16:47.920 We can sign in as this user; it all works smoothly. Is the application perfect? No, it might not be cutting-edge, but I spent a total of four hours to get here, and the cost-benefit analysis is clearly in my favor.
00:17:12.920 Now, let’s run some benchmarks. I'm using a simple command-line utility called 'Hey,' which is a load-testing tool. I ran three different scenarios with increasing concurrency for ten seconds each and took the average at the bottom.
00:17:35.520 To be fair, I ran these benchmarks in Berlin, next to Frankfurt, understanding that might be a little fishy. So I tested it yesterday from Taipei, running a balanced workload and achieved 14 requests per second.
00:18:12.520 I'm not claiming that this is an amazing number, but it's absolutely good enough. I asked on Twitter for anecdotal information from people running production Rails applications, and the feedback varies, but is this setup able to handle three busy applications? No. Can it handle three less busy ones? Yes.
00:18:47.720 It took me four hours from start to finish to run Rails and get to the production side, and this is an app that could be potentially making money. This is my first major point: running SQLite on Rails applications today is viable. Just a year ago, it might not have been feasible.
00:19:24.720 Today, a vanilla Rails application running on the cheapest Digital Ocean droplet with a simple Hatchbox deployment process is fully usable. You can go from nothing to a production application in hours, not months, and you retain a full understanding of your application's operational details.
00:19:54.520 I want to underscore that this is the reality today for Rails applications. Now, let's talk about what has changed recently and discuss how you can take your applications to the next level.
00:20:11.440 So why are things different today than they were even just three months ago? Two major changes in the Rails ecosystem: first, with Rails version 7.1.0, new defaults have been established for your SQLite database.
00:20:36.720 These are the defaults, and if you want to learn more about what each of them does, feel free to read the blog post about it. The key change is that we are running our database with Write-Ahead Logging (WAL). This is how Postgres is doing things now.
00:21:03.080 The original way that SQLite was developed used rollback journaling, which is slower. Write-Ahead Logging is faster and smarter, but it is not the default option because it is all about backward compatibility, so you have to opt-in.
00:21:28.800 The other significant change is that the default compilation for the SQLite executable has changed. When you run 'bundle install' for SQLite three months ago, you would get an executable that did not have any of these enhancements.
00:21:53.920 Now, we are telling the executable that it will run in WAL mode and forget about rollback journaling. These two changes completely transform the SQLite experience. There was a research paper from the University of Texas that you can read for more details about this.
00:22:20.679 The critical point is that opting to use Write-Ahead Logging combined with synchronized flush to disk at level one provides a 12x improvement in performance. This greatly changes how you can handle concurrent loads, which is a major problem for web applications.
00:22:55.440 By default, if you simply download SQLite from their website, it is not designed for web application contexts. We’ve had to think deeply about how to make SQLite function properly in that environment.
00:23:22.039 Those are the main reasons why the experience is different now. But I don't want to stop there; we shouldn't have to stop. If you have a research and development project at work, many of my experiences with SQLite in production come from running these small research ideas.
00:23:46.920 We look for potential opportunities and we try to create traction without investing too much time. Hence, I won’t recommend using SQLite in production if you cannot evolve your application over time, so let’s explore how you can upgrade your application.
00:24:18.960 The first major step is to add the Enhanced Adapter gem. This gem brings in several key features, enabling you to leverage Active Record's capabilities with SQLite.
00:24:37.840 The first two features are ones that are already present in Active Record, but the current Rails version does not allow you to use them because they haven’t been integrated yet. With the Enhanced Adapter, you can now use generated columns.
00:25:04.720 The next two enhancements relate to the database file itself. We want to embrace the idea that we can embed all operational details of our application directly into our codebase.
00:25:31.680 You have options to fine-tune SQLite's runtime settings through the database.yml file, allowing you to load extensions as well.
00:25:49.440 The major point here is that the Enhanced Adapter gem provides improved concurrency support. This is something that will eventually be upstreamed into Rails.
00:26:08.640 So where does this leave us? We could run our production app on a small Digital Ocean droplet, and with the Enhanced Adapter, we can enhance its capabilities. If relying solely on SQLite, we will face issues when it comes to concurrent writes.
00:26:42.640 But if we implement the Enhanced Adapter and run SQLite properly in a web application context, we can actually make it work. Would it be acceptable to say that SQLite can handle web requests now? Yes.
00:27:12.520 By tuning SQLite to support your application context, you can ensure it performs adequately without errors. There’s a bit of work involved up front, but it’s worth it for building a reliable application.
00:27:29.640 Next, if you run into issues, the Enhanced Adapter allows for the implementation of a custom busy timeout handler. The default behavior of retrying in C can be problematic because it holds up all Ruby code execution.
00:27:49.920 This new approach enables us to better coordinate our worker threads in Ruby to track the execution of database operations.
00:28:07.920 Additionally, you can fine-tune the compilation of the SQLite executable itself. The default compilation options are great, but if you want to squeeze every ounce of performance from SQLite, you’ll have to compile it with specific flags.
00:28:32.360 Let’s shift gears a bit. How many times have you been working on your feature, only to switch over to code review and manage to run into merge conflicts because you had different migrations than another developer?
00:28:55.040 Isn’t it nice to have isolated databases for each branch? This capability is entirely feasible with SQLite. With just two lines in your configuration, you can use your branch name as a database name and ensure all setups run when you switch branches.
00:29:16.880 You just need to make sure that the proper migrations run, and once you start Rails, it populates everything correctly. Switching between branches becomes seamless.
00:29:38.079 Now, let’s address the need for backups. Persistence is crucial when dealing with production data. This can feel risky because I know I have deleted production SQLite files in the past.
00:30:02.120 Fortunately, there are various tools that you can use for SQLite, and one of my favorites is a tool called Lightstream. It allows you to stream all write operations from your SQLite database into an S3 compatible bucket.
00:30:24.920 In the Ruby community, we have created a gem that includes this functionality. You can easily integrate it into your application by running a simple Rails generator command.
00:30:50.520 I know we are moving quickly here at the end, but there is so much to discuss about the enhancements available for SQLite. Lastly, I want to talk about the Lightstack project.
00:31:18.280 For those who haven't heard of it, Lightstack allows you to back every functionality of your Rails application with SQLite. From the database server to cache servers and job processing, you can use SQLite for all these components.
00:31:39.080 You can run everything on a minimal $4 a month Digital Ocean server. Installation is straightforward, and it integrates automatically with your Rails app.
00:32:04.520 I hope at the end of this session, you feel a bit warmer to the idea that, especially within the Rails community, we are at the cutting edge of leveraging SQLite for web applications.
00:32:28.920 It absolutely makes sense for certain apps and contexts to embrace SQLite in production. Thank you for your time.
00:32:45.240 Now we have a little time for questions. Please raise your hand if you have any inquiries.
00:32:59.360 Thank you again to our speaker, Stephen Margheim, and please give him a warm round of applause.