00:00:05.240
Hi! Oh yep, that is working. Hi, it's really lovely to be here. So, I'm Karen Jex, a Senior Solutions Architect at Crunchy Data.
00:00:10.519
This is my first ever Ruby event, so I'm really, really thrilled to be here. Thank you!
00:00:13.960
I'm looking forward to sharing my tips with you on how to keep your database happy. First, I have to get the confession out of the way: I'm not a developer. I'm sorry! But as you can see from my career illustration so far, I know a lot about databases. I have worked with databases exclusively for the past 25 years.
00:00:28.480
This is my first ever job title that doesn't include 'database' in the name, but I still work exclusively with databases. I've learned a lot from the developers I've worked with, and I've tried really hard to share with them what I know about databases. That's what I'm here for today.
00:01:03.440
I'm here to talk about databases, particularly Postgres. One of the things that developers appreciate about Postgres is that it just works. This is great because, as a developer, you don't want to spend a lot of time managing your database; you have better things to do.
00:01:17.280
Just out of curiosity, how many people here actually use Postgres or have applications that run on Postgres? Fantastic! I feel it’s worth me being here. So, I promised you my top five tips—things that you can implement without too much effort to ensure that you've got a robust, scalable, performant database environment.
00:01:42.840
The focus is very much on Postgres, but most of the tips are also relevant to other databases. Given that Postgres is relevant to the audience, let’s dive into the tips!
00:02:02.640
Tip number one: check a few key configuration parameters and ensure they are set correctly for your environment and workload. Number two: take regular backups of your database and test your recovery process.
00:02:36.200
Number three: put a high availability architecture in place. Number four: make sure the right users and applications can connect to and interact with your database. Finally, number five: ensure you know what’s going on in your database so you can react quickly if something goes wrong.
00:02:57.400
So, there you have it—my top five tips. Thank you very much! Now, I think we have roughly three minutes for each tip, so let’s go into more detail.
00:03:03.440
Number one: checking the settings for a few key configuration parameters. Out of the box, a Postgres database has a very small footprint, allowing you to install it virtually anywhere without worrying too much about resources.
00:03:39.280
However, the default configuration may not be suitable for your production environment. In fact, there are 361 parameters in Postgres 16, but you’ll be pleased to know that you don’t need to look at all of them. Just adjusting a few can make a big difference.
00:03:57.640
For example, you should allocate the right amount of memory to your database and manage checkpoints appropriately—not too many, not too few, but just right. Let’s examine a few of those parameters. Shared buffers dictate the amount of memory allocated at startup for caching data. The default value is 128 MB but should be much higher if you have enough memory.
00:04:54.160
If your system has more than 1 GB of RAM, start somewhere between 25% and 40% of your total memory.
00:05:01.400
Work_mem refers to the maximum amount of memory that can be used by a query operation before it spills to disk and creates a temporary file. The default is 4 MB. A larger value is useful for queries that perform complex sort operations, but be cautious; a complex query could run multiple operations.
00:05:52.760
It’s often better to set this for specific sessions that require more memory rather than across the board. The log_temp_files parameter can help you see when temp files are created. If you notice a lot of temporary files, it may suggest that you need to increase work_mem.
00:06:31.600
Maintenance work_mem is similar but determines the memory used by maintenance operations such as vacuum and database index creation. The default is 64 MB, but a higher value can significantly improve performance.
00:06:50.760
Now, tuning your write-ahead log (WAL) and checkpoints can also impact performance. The WAL logs all changes made to your data before being written permanently to disk. Checkpoints flush dirty data pages to disk; they're expensive operations, so frequency should be managed.
00:07:26.000
The wal_buffers parameter determines how much memory is allocated for WAL before syncing to disk. The default is -1, calculated automatically to about 3% of shared buffers. If you have many concurrent connections, consider increasing it.
00:07:49.760
Checkpoint timeout triggers a new checkpoint if one hasn’t occurred within a set duration, typically defaulting to 5 minutes, which many find too low.
00:08:02.760
Adjust it between 10 and 30 minutes for better control. Max_wal_size will also trigger checkpoints if the specified amount of WAL is generated. The default is 1 GB, equivalent to a certain number of WAL files. You want to ensure your setup has suitable space for WAL files.
00:09:09.560
Tip number two: make sure you take regular backups of your database and test your recovery process. I’m going to assume that the data in your database is important, and if something goes wrong, you need to recover it.
00:09:48.200
So, you’ll need somewhere to store those backups, preferably in a separate location or on a different server. You need a backup tool; pg_basebackup is integrated with Postgres. Consider using pgBackRest or Barman for comprehensive options. Don’t forget logical backup methods, such as pg_dump, which allows you to recreate objects in your database.
00:10:59.840
Also, schedule regular backups. An optimal setup means most people cannot afford downtime for their database. You want point-in-time recovery as well.
00:11:21.880
For that, configure WAL archiving by saving copies of your WAL files. Use tools to facilitate your backups and test recovery regularly to prove its effectiveness.
00:11:52.320
Tip number three: implement high availability. While you can recreate your database from a backup, your users prefer immediate access. Most production environments need high availability. This setup isn’t as quick or easy as the other tips, but isn't as complex as you might think.
00:12:17.920
If you lack the time or infrastructure for setup, consider managed services. Using Postgres streaming replication to create replica databases that stay current is a common architecture.
00:12:56.520
Patron is a monitoring framework and uses a DCS like etcd to maintain the cluster's current status. If issues arise, it can promote a standby to primary, quickly minimizing downtime without waiting for user intervention.
00:14:04.720
You may use tools like HAProxy for automatic virtual IP management, helping eliminate the need for application changes. Although detailed setup for these frameworks would take longer than three minutes, links to documentation are provided for further reading.
00:15:11.440
Tip number four: ensure the right users and applications can connect to and interact with your database. You'll want to manage connections carefully to control resource use.
00:15:40.280
Create application users with restricted privileges—it's crucial that your applications don’t connect using the database superuser account. Control connection settings such as listen addresses, specifying which hostnames or IP addresses Postgres will accept.
00:16:17.400
The default is local connections only, so configure for wider access if needed. Also, manage max_connections—default is typically 100—but be careful of setting this too high, as Postgres performs optimally with fewer concurrent connections.
00:16:50.320
Control idle transactions and client authentication through the pg_hba.conf file. It specifies which users are allowed to connect, their allowed IP ranges, and authentication methods. Make necessary custom entries for your environment.
00:17:40.440
Finally, tip number five: remain aware of what's happening in your database to quickly react if issues arise. You may want an advanced monitoring system, but even basic Postgres logs can provide valuable insights.
00:18:36.520
Ensure logging_collector is active to capture log messages. Be aware of your log directory's location and contents. Consider configuring log parameters, such as log_line_prefix, to ensure logs are informative.
00:19:20.200
Utilize the pg_stat_statements extension for insights into query performance. Make sure to configure and enable it for information on frequently executed queries.
00:20:13.880
In conclusion, Postgres truly does just work. You shouldn't have to exert much effort to maintain a reliable database. Focus on key configuration parameters, ensure regular backups with recovery testing, set up high availability or consider managed services, manage connections appropriately, and maintain awareness of database activity.
00:21:45.080
Once you have all this in place, your database should be self-sufficient, allowing you to focus on developing your application. Thank you very much!
00:22:02.799
the