00:00:17.119
Thank you all for coming. It's going to be an awesome talk, quite possibly the best talk you'll ever hear in your life. But, I actually have a built-in escape hatch here, so if things start going wrong, I might just bail.
00:00:28.160
A couple of weeks ago, I was having a bit of a panic attack about this presentation. As a programmer, my natural habitat is some dark place in front of a screen, not standing under bright lights talking to a room full of people. So what am I doing up here?
00:00:39.600
During this panic attack, I went for a walk to clear my mind. I was listening to my favorite podcast called Hardcore History. I highly recommend it. This particular episode was about the Punic Wars.
00:01:06.080
In case you haven't heard about the Punic Wars, let me give you a brief rundown. A long time ago, there was this place called Rome, and they decided they should take over the world, which they largely achieved. However, there was this one country that was a thorn in their side for a long time: Carthage.
00:01:30.640
They fought back and forth in a stalemate, and then one day, Hannibal had this brilliant idea. He decided he was going to lead a sneak attack on Rome by marching his army through the Alps. That's pretty cool, right? But the most awesome part of the story, at least for me, is that he had war elephants.
00:01:56.399
Some people have doubted this story over the years. In 1959, a British engineer borrowed a circus elephant named Jumbo and marched it across the Alps as part of a bar bet. I guess the moral here is not to underestimate elephants; they don't like it, and they have long memories.
00:02:40.879
Now, this talk is really about biggish data. So, what is biggish data? It's not about big data or Hadoop clusters; that's way over my head. It's also not about fancy architectures. Instead, I want to discuss something that I find more practical and interesting: how to keep your app working as your production dataset grows into biggish data territory.
00:03:57.200
This issue can arise even if you don't have a fire hose of data coming in. For example, if you run a popular e-commerce site or a site with many users, you can accumulate a ton of data over the years. As this data accumulates, you may find that your site's performance gradually declines in ways you don't quite understand.
00:04:12.560
This talk is based on my experience at Honeybadger. In case you haven't heard of us, we provide exception, performance, and uptime monitoring. Right now, we have about a terabyte of errors in our database, and we take in roughly two gigabytes of new error data per day. All of this goes into a plain vanilla PostgreSQL database sitting behind a pretty basic Rails application.
00:04:38.080
The good news is that PostgreSQL can handle this pretty easily. The bad news is that if your app isn't engineered to handle this level of data, you're kind of screwed. A 100 megabyte database behaves fundamentally differently than a one terabyte database.
00:04:54.400
Many of the normal conventions we use in Rails apps just stop working when you have this much data.
00:05:00.720
For example, I learned that pretty much every pagination system breaks somewhere around page 2,000. And even deleting data doesn't save you: if you try to delete half of a huge table to make the problem go away, the delete itself can take an enormous amount of time.
00:05:08.159
I will explain why this happens, how you can work around it, and how to optimize both your queries and your database. To truly understand this, we need to go back in time to the summer of 1978, a summer filled with disco and the VT-100 terminal.
00:05:30.080
The VT-100 was the first computer terminal you could just go out and buy and set on your desk to prove you were the alpha nerd of the office. It was also around this time that Oracle 1.0 was being developed: one of the first databases to use SQL, written in PDP-11 assembly language.
00:06:00.800
Now, in our modern times, marketers tend to throw around buzzwords like cloud computing and platform as a service, but back then the buzzword was 'real-time computing,' which meant generating your reports in five minutes instead of five days.
00:06:14.240
While that was impressive at the time, it’s not the kind of thing I’d want to build a web app on top of. But over the past 36 years, something funny happened: Moore's Law transformed this idea into a web stack.
00:06:30.360
Most people know vaguely what Moore's Law is; in loose terms, it means that computers get more awesome as time goes forward: they can do more stuff, faster, and chew through more data. I'd like to present, for the first time in public, Starr's corollary to Moore's Law: when your database grows faster than Moore's Law can keep up with, you literally travel back in time.
00:07:10.320
I can already see your minds being blown; it sounds like bubble wrap popping. To survive in this new time period, you need to understand the methods and motivations of the people from that era. We need to start thinking like our ancestors, who were very interested in hardware.
00:07:25.120
If you happen to be experiencing database scaling issues where your app is slowing down due to the growing size of your database, you can probably solve it right now by purchasing a real computer. A real computer can provide the two critical resources databases need for high performance: a lot of RAM and fast disk I/O.
00:07:44.480
Virtual servers tend to lack both of these capabilities, so investing in a physical machine can be a game-changer. While you're at it, make sure to use multiple disks. It’s not just about having a RAID array; you want to separate your operating system, database data, and logs onto different hard drives to enable better I/O scheduling.
00:08:05.600
If buying a real computer solves your problem, great; feel free to head out now, and I won't be offended. But if that didn't work, or if you just want to make sure everything is as tight as it can be, the next thing to look at is your queries.
00:08:30.760
Most of us develop against small datasets, so inefficient queries often go unnoticed. Books have been written on query optimization, and I can't cover everything in 30 minutes, but I’ll point out one vital tool: EXPLAIN.
00:08:43.840
PostgreSQL offers an incredible feature called EXPLAIN: prefix a query with it and PostgreSQL shows you the plan it will use to execute that query. The output can look overwhelming, but for the purposes of this talk, we only need to focus on one number: rows.
00:09:02.240
The more rows a query touches, the more data your system has to chew through to produce the answer you want, so the key to query optimization is limiting the number of rows touched. For example, if you run a naive COUNT over a big table, PostgreSQL will literally count every row, one by one.
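Here is a minimal sketch of what that looks like; the table name and the numbers in the plan are invented purely for illustration:

    -- Prefixing a query with EXPLAIN shows the plan without running the query.
    EXPLAIN SELECT count(*) FROM errors;

    --                              QUERY PLAN
    -- ------------------------------------------------------------------
    --  Aggregate  (cost=250000.00..250000.01 rows=1 width=8)
    --    ->  Seq Scan on errors  (cost=0.00..225000.00 rows=10000000 width=0)
    --
    -- The rows=10000000 on the Seq Scan is the number to watch: the count
    -- has to touch every row in the table.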
00:09:51.199
Pagination systems tend to break around page 1,000 or 2,000 because OFFSET and LIMIT behave a lot like that COUNT: to return a deep page, PostgreSQL has to walk past every row in all of the pages before it.
00:10:04.080
The solution is to use range queries instead. A range query uses an index to jump straight to the rows you want, so it touches far fewer rows than OFFSET and LIMIT and stays fast no matter how deep you paginate.
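A rough sketch of the difference, with a made-up table and page size:

    -- OFFSET/LIMIT pagination: to produce page 2,000, PostgreSQL still has to
    -- walk through and throw away the 49,975 rows that come before it.
    SELECT * FROM errors ORDER BY id LIMIT 25 OFFSET 49975;

    -- Range ("keyset") pagination: remember the last id you showed and let the
    -- index jump straight to the next 25 rows.
    SELECT * FROM errors WHERE id > 49975 ORDER BY id LIMIT 25;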
00:10:45.120
Sorting is another tricky element: it can be very fast when the sort order matches an index, but if PostgreSQL can't use an index for your ORDER BY, it has to fetch all the matching rows and sort them itself, which gets painful at this scale. Understanding how your sorts line up with your indexes is crucial for performance.
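For instance, here is a hypothetical index and a query whose sort it can satisfy; the names are illustrative, not from the talk:

    -- If the ORDER BY matches an index, PostgreSQL can read rows in index order
    -- and stop as soon as it has satisfied the LIMIT; otherwise it must fetch
    -- every matching row and sort the whole set first.
    CREATE INDEX errors_project_created_idx
        ON errors (project_id, created_at DESC);

    SELECT * FROM errors
     WHERE project_id = 42
     ORDER BY created_at DESC
     LIMIT 25;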
00:11:00.160
As a recap for query optimization, you should develop against a real dataset and utilize the EXPLAIN tool often. Always strive to minimize the number of rows your queries process because touching more rows generally leads to slower performance.
00:11:45.119
That brings us to the second half of the talk, where we'll pick up the pace and run through a grab bag of infrastructure issues that come with biggish data, along with workflows for handling large datasets. The plan: each topic gets a cute picture, plus a link if you want to do further reading.
00:12:06.079
Our ancestors were deeply involved in the creation of disk operating systems. It seems like they were developing new systems every couple of seconds, yet today we find ourselves with only a few main choices. It’s interesting how times change.
00:12:34.079
The first recommendation for Linux users is to increase your read-ahead cache. Many people aren't aware of this, but making this change can lead to a doubling of read throughput. Linux pre-fetches the next set of blocks into RAM when it sees consecutive requests.
00:12:53.679
The default setting for the read-ahead cache is typically around 256k, but increasing it to 2 or 4 megabytes can substantially improve performance. Additionally, use a modern file system; ext3 is not recommended, and if you're using ext4 or XFS, you may want to tweak journaling settings.
00:13:21.679
You'll want to adjust PostgreSQL settings based on the RAM available in your machine. An easy way to do this is with a tool called pgTune, which analyzes your setup and configures PostgreSQL with sensible values.
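Just to illustrate the kind of memory-related settings involved (the values below are placeholders, not recommendations; the right numbers depend entirely on your machine, and pgtune will pick them for you):

    -- ALTER SYSTEM (PostgreSQL 9.4+) writes these to postgresql.auto.conf;
    -- on older versions you would edit postgresql.conf directly. A reload or
    -- restart is needed before they take effect.
    ALTER SYSTEM SET shared_buffers = '8GB';
    ALTER SYSTEM SET effective_cache_size = '24GB';
    ALTER SYSTEM SET work_mem = '64MB';
    ALTER SYSTEM SET maintenance_work_mem = '1GB';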
00:13:46.400
Another critical piece is PostgreSQL's VACUUM command. Updates and deletes leave dead row versions behind, and VACUUM is what goes through and cleans up that garbage so the space can be reused. The catch is that this operation can be resource-intensive.
00:14:01.120
If your server is heavily loaded and VACUUM is adding to that load, you might be tempted to run it less often. That tends to backfire, as we learned the hard way: the usual fix for VACUUM problems is to run it more frequently, not less, so each pass has less work to do.
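One way to do that is to lower autovacuum's per-table thresholds so it kicks in after a small fraction of the table has changed instead of the default twenty percent; the table name and values here are only an example:

    -- Vacuum and analyze the hypothetical errors table after ~1% churn rather
    -- than waiting for the defaults (20% and 10% of the table, respectively).
    ALTER TABLE errors SET (
        autovacuum_vacuum_scale_factor  = 0.01,
        autovacuum_analyze_scale_factor = 0.01
    );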
00:14:24.759
Next, I'm going to touch on velocity. If you have a huge influx of data or many read requests, managing database connections becomes vital. Each connection in PostgreSQL is a separate process that consumes its own RAM, making it crucial to limit your connections.
00:14:43.920
Implementing connection pooling is a great way to address this challenge. In Ruby, there are libraries to handle this, but you can also use a tool like PgBouncer, which serves as a proxy between your Ruby application and PostgreSQL.
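Whichever pooler you use, it helps to keep an eye on how close you are to PostgreSQL's connection ceiling; a quick way to check from SQL (this just reads a built-in statistics view):

    -- Current connection count versus the configured ceiling.
    SELECT count(*)                           AS connections_in_use,
           current_setting('max_connections') AS max_connections
    FROM pg_stat_activity;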
00:15:01.759
Another common issue is locking. When you update a row inside a transaction, that row stays locked until the transaction finishes, and anything else that wants to write to it has to wait. Avoiding designs where the same record gets updated over and over in quick succession goes a long way toward mitigating lock contention.
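A tiny sketch of the problem, with a made-up table and id:

    -- Session A:
    BEGIN;
    UPDATE projects SET error_count = error_count + 1 WHERE id = 42;
    -- ...the row stays locked until COMMIT or ROLLBACK...

    -- Session B runs the same statement and simply blocks here,
    -- waiting for session A to finish:
    UPDATE projects SET error_count = error_count + 1 WHERE id = 42;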
00:15:14.240
For intensive queries, consider creating read-only replicas using PostgreSQL's streaming replication. You can direct heavy queries to the replica, ensuring that the primary database remains responsive for your users.
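Once a replica is streaming, PostgreSQL's built-in views and functions let you confirm which box you're talking to and whether replication is healthy; a small sketch (exact columns vary a bit by version):

    -- On the replica: true if this server is replaying WAL from a primary.
    SELECT pg_is_in_recovery();

    -- On the primary: one row per connected standby with its replication state.
    SELECT client_addr, state, sync_state FROM pg_stat_replication;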
00:15:36.240
Partitioning is an excellent practice for managing large datasets. PostgreSQL allows you to set up partitioning schemes where data is organized into separate tables based on different criteria, which aids in significantly speeding up data culling and archiving.
00:15:57.799
This way, deleting old data becomes a matter of dropping a table instead of running lengthy delete queries that can freeze your system, especially when dealing with two terabytes of data.
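As a sketch, here is what that looks like with modern declarative partitioning (older PostgreSQL versions accomplished the same thing with table inheritance and triggers); the table layout is invented for the example:

    -- Parent table partitioned by month.
    CREATE TABLE errors (
        id         bigserial,
        project_id integer,
        created_at timestamptz NOT NULL,
        payload    jsonb
    ) PARTITION BY RANGE (created_at);

    CREATE TABLE errors_2024_01 PARTITION OF errors
        FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');

    -- Expiring a month of old data is a near-instant DROP, not a giant DELETE:
    DROP TABLE errors_2024_01;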
00:16:15.520
Lastly, backups are a significant concern for large datasets. The traditional cron job that dumps the whole database and ships it to S3 doesn't scale to a two-terabyte database. Instead, you can lean on PostgreSQL's write-ahead log, the same machinery behind streaming replication, to take incremental backups.
00:16:36.759
Tools such as WAL-E make it simple to upload these incremental backups to S3, providing a straightforward route for point-in-time restores.
00:16:58.559
As I've worked on my slides, I realized there's a lot of information here. This can be overwhelming, but remember that the issues with biggish data tend to present themselves one at a time. You don't have to know everything at once.
00:17:20.480
You can tackle each challenge as it arises, and I have faith that you all can handle this because you're incredible. None of us expected we could turn transistors into cat memes, but we did it, and that’s the type of people we are.
00:17:43.920
If you’re interested in discussing this further, please come up to me after the talk. I have delicious, spicy Mexican candy as an incentive. If you'd like to learn more, all the referenced links from today's talk will be available online.
00:18:06.320
Also, if you're looking for better visibility into production errors, check out Honeybadger. We love Rails developers. And that brings me to the end of the show.
00:18:22.960
Thank you!