The Future of Rails 6: Scalable by Default

by Eileen M. Uchitelle

In her talk "The Future of Rails 6: Scalable by Default" at the Paris.rb Conf 2018, Eileen M. Uchitelle discusses the enhancements and future direction of the Ruby on Rails framework, focusing on scalability. She emphasizes the long-standing misconception that Rails cannot scale, despite its successful use by major companies. The session covers two main aspects of Rails scalability: the efficiency of the test suite and the ability to manage increased traffic and data effectively.

Key Points Discussed:
- Eileen introduces her background and involvement with Rails at GitHub, highlighting her work on upgrades and features for future releases.
- The need for Rails to overcome its reputation for poor scalability is underscored through a humorous comparison to Fisher-Price toys, pointing out that both are well-designed and dependable.
- Scalability is defined as an application's capability to grow without sacrificing performance over time. Eileen aims to make this process enjoyable for developers.
- A crucial aspect of scalability is ensuring that long-running test suites do not hinder development. The introduction of parallel testing in Rails 6 is highlighted, allowing developers to run tests concurrently, significantly speeding up testing processes.
- The talk outlines the specifics of implementing parallel testing using forked processes and threads, making it easier for developers to run their test suites more efficiently.
- Eileen also addresses the management of multiple databases within Rails, which has traditionally been complex and poorly documented. The need for improved multi-database support is discussed, emphasizing clearer workflows for migrations and database connections.
- The importance of contributing improvements to the Rails community is communicated, advocating for community-driven development of solutions to common scalability issues.
- Finally, Eileen concludes with a push for Rails to become inherently scalable by default, encouraging developers to share their experiences and solutions back to the community.

Conclusions and Takeaways:
- Rails can and does scale, albeit with some challenges that will be addressed in upcoming versions.
- Enhancements in testing speed through parallelization and better multi-database handling signify a commitment to improving developer experience.
- The evolution of Rails relies heavily on community contributions to tackle scaling issues collectively and sustainably, ensuring that Rails meets modern web development demands.

00:00:11.460 I'm really glad to be here. This is my first time in Paris, and I've had a lot of fun exploring your city. You're all very lucky to live here, except for the traffic situation. But hey, we can't all be perfect.

00:00:25.529 I'm Eileen Uchitelle, and you can find me online at the handle EileenCodes. I live in New York, but not New York City—I'm slightly north. So, I guess I'm not as used to the traffic, considering I live in the suburbs but not those suburbs way out in the mountains.

00:00:40.800 I work at GitHub as a senior systems engineer on the App Systems team. We do a lot of systems work, specifically related to Rails and Ruby, and how they interact with the GitHub application. I have worked on upgrades for Rails, helping to work on the transition toward version 5.0, and I can assure you that one day, we will reach version 6.

00:01:11.369 For anyone who’s new to the Rails community, the Rails core team is responsible for releasing new versions of Rails, deciding what features will be included, and ensuring we maintain a welcoming community. We continue improving Rails to align with what it is designed to be.

00:01:40.530 If you've been involved in the Rails community for any length of time or browsed Reddit, you've probably come across the phrase 'Rails doesn't scale.' I’ve seen it in Hacker News comments and even read it from VC professionals. Recently, I came across an article about Twitter that claimed they couldn’t effectively combat harassment because Rails is 'just Fisher-Price software.' It's amusing they called it that because although it was intended as an insult, they overlooked that both Rails and Fisher-Price are robust, well-designed, and unforgettable.

00:02:21.650 The interesting part about Rails getting a bad reputation for scalability is that it has taken many companies to heights they'd never imagined. Had Twitter built their application using a skeleton framework, they would have gone bankrupt before they even had a proof of concept.

00:02:41.490 The reality is that Rails does scale; it just doesn’t do so easily. Historically, Rails hasn’t excelled at making scaling smooth. While we pride ourselves on developer happiness, we’ve fallen short in the area of scaling. In this talk, we will explore how Rails has been hard to scale easily and what we plan to do about it.

00:02:57.150 I want scaling Rails to be as enjoyable as running your Rails application. However, scaling means different things depending on who you're talking to and the specific goal of your application. For the sake of this discussion, scalability means that, over time, an application can grow in code, data, and traffic without a decline in performance.

00:03:22.590 Rails can’t shoulder the responsibility of all parts of scaling a web application, but there are two key areas where I want to focus. The first is that as you scale your application, the test suite should not obstruct development. As applications grow, new code should not create additional friction. Developers ought to be able to prototype and build features quickly without the time it takes their test suite to run becoming a hindrance.

00:03:46.860 While we’ve made improvements to test performance in the past, we haven’t had a solution for concurrent testing. The other vital area that Rails was failing to address is that applications should handle increased data and traffic without issues. A database shouldn't crash simply because user traffic increases or because large amounts of data are being processed. It should also be effortless to distribute your load across multiple databases as your traffic increases.

00:04:19.440 Rails has had support for multiple databases, but this support has often been undocumented, hard to comprehend, and difficult to implement. We may not be able to address every aspect of scaling a web application with Rails, but we can definitely enhance developer satisfaction in these two core areas.

00:04:50.640 My aim is to tackle these two issues to make Rails truly scalable by default. By scalable by default, I mean it should be straightforward for engineers to scale their web applications without needing to rely on external dependencies or spending excessive hours developing a custom setup.

00:05:09.270 Scaling your web application doesn’t have to be complicated. Rails should ideally take on a portion of that burden for us. Some of you might be thinking, 'Wait, Rails 5.2 just came out; why are you even discussing Rails 6?' Well, major versions of Rails allow us to rethink what’s possible. It will be a while before Rails 6 is officially released, and I want to share my plans with you.

00:05:52.350 Over the past few months, I have been focused on refactoring Rails and adding features that make scaling easier. We will look at enhancements that speed up your test suite and tools that improve multi-database support, ensuring that scaling remains simple and intuitive.

00:06:23.370 The first feature I would like to introduce for Rails 6 is parallel testing. This feature will enable you to run your tests concurrently using either forked processes or threads. Parallelizing test suites is crucial for scaling your web application because as you increase your codebase, more tests need to be added—and your test suite will slow down over time.

00:06:56.030 You can mock multiple calls and ensure that you’re not sending external requests or dealing with large data sets in your tests, but this won't help speed up the total time it takes for 10,000 tests to run in a straight line.

00:07:01.840 Parallelizing large test suites can drastically reduce the wait time developers experience while waiting for builds to finish. Other companies like GitHub, Basecamp, and Shopify have been writing their own implementations of parallel testing for years. The issue with this is that we aren’t sharing our ideas with one another or with the community. Rails is popular for simplifying complexity in applications and transforming it into reusable, generic components. By incorporating parallel testing in Rails, we can lessen application complexity, set standards for parallelization, and facilitate scaling your test suite.

00:07:47.220 Earlier this year, I collaborated with Aaron Patterson to write the parallel testing feature in Rails, based on GitHub's existing parallel testing infrastructure. When I got back home, I was inspired by the code. Compared with GitHub's implementation, the Rails version has been rewritten to have a much smaller footprint, be much more generic, and be significantly easier to use.

00:08:17.200 Rails' parallel testing feature also supports multiple databases right out of the box. This was essential for us at GitHub, given that we operate with multiple databases across all our environments. The concurrent test runner in Rails 6 can parallelize your test suite using either forked processes or threads.

00:08:58.230 Let’s take a moment to look at the implementation for the forking parallelizer. Forked process parallelization is built into Rails using Distributed Ruby (also known as DRb), a distributed object system that is part of the standard library and adds no extra dependencies. Parallelization is invoked in your test helper with the 'parallelize' method, which takes the number of workers you wish to fork.

00:09:49.020 This parallelization lives in Active Support and is backed by MiniTest’s parallel testing feature. When you start your test suite using the forked parallelizer, the testing parallelization class is initialized with the number of workers you requested. Because parallelization in Rails is backed by MiniTest, once this class is initialized, all other methods within the class are invoked by MiniTest, not Rails.

00:10:18.300 From this point, MiniTest will call the 'start' method in the parallelization class. The 'start' method builds the worker pool by forking one process for each worker, based on the queue size you requested. Inside each fork, it calls the 'after_fork' method, allowing Active Record to hook into the parallelization and create databases for each process.

00:10:42.450 Next, each forked process uses DRb to create a DRb object that serves as the work queue. This DRb object holds a reference to the DRb server. Each process then loops over the queue, popping jobs so that MiniTest can execute the test methods assigned to that process.

00:11:01.230 Eventually, MiniTest will return results, and this queue variable will then call 'record', which sets the reporter and the results so that, at the conclusion of the test suite, each process can report whether the tests it executed passed or failed.

00:11:32.340 At the end of the 'start' method, we invoke a 'run_cleanup' hook, which is utilized by Active Record to clean up the temporary databases we created in the 'after_fork' hook. Lastly, a MiniTest call to 'shutdown' will clear the queue and wait for the processes to terminate.
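
The flow just described can be sketched in plain Ruby. This is a simplified illustration under stated assumptions, not the Rails source: the WorkQueue class, the run_in_forks method, and the string "outcomes" are hypothetical stand-ins, the Active Record 'after_fork' and 'run_cleanup' hooks are left out, and it assumes a Unix-like system where fork and UNIX sockets are available.

```ruby
require "drb"
require "drb/unix"

# Hypothetical stand-in for the work queue that the parent process serves over DRb.
class WorkQueue
  include DRb::DRbUndumped # hand out remote references instead of copies

  def initialize
    @jobs = Queue.new
    @results = Queue.new
  end

  # Called locally by the parent to enqueue work.
  def <<(job)
    @jobs << job
    self
  end

  # Blocks until a job is available; a nil job tells a worker to stop.
  def pop
    @jobs.pop
  end

  # Called remotely by workers to report the outcome of a job.
  def record(job, outcome)
    @results << [job, outcome]
    nil
  end

  # Collect everything the workers reported.
  def drain_results
    collected = []
    collected << @results.pop until @results.empty?
    collected
  end
end

def run_in_forks(jobs, workers: 2)
  queue = WorkQueue.new
  url = DRb.start_service("drbunix:", queue).uri

  pids = workers.times.map do |worker|
    fork do
      DRb.stop_service                     # drop the server state inherited from the parent
      remote = DRbObject.new_with_uri(url) # remote reference to the parent's queue
      while (job = remote.pop)
        remote.record(job, "ran on worker #{worker}")
      end
      exit!(0) # leave the child without running the parent's at_exit handlers
    end
  end

  jobs.each { |job| queue << job }
  workers.times { queue << nil }           # one stop signal per worker
  pids.each { |pid| Process.waitpid(pid) } # wait for the workers to terminate
  DRb.stop_service
  queue.drain_results
end

results = run_in_forks(%w[test_a test_b test_c test_d], workers: 2)
puts results.map { |job, outcome| "#{job}: #{outcome}" }
```

Which worker picks up which job is not deterministic, so the order of the printed lines can vary between runs.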

00:11:57.520 This essentially wraps up the internal implementation of forked processes in Rails 6. The implementation is quite compact at around a hundred lines. However, if you didn’t follow any of this, worry not; you don’t need to grasp these details to use parallel testing in your Rails application.

00:12:40.090 For newly generated Rails 6 applications, parallelization will be automatically integrated into your test helper. For upgraded applications, you merely need to add a call to 'parallelize' within your test helper. The 'parallelize' method takes a workers argument that accepts an integer.

00:12:57.530 By default, the number of workers is set to two, meaning Rails will automatically fork your tests into two processes and create a database suffix corresponding to the worker number for each process to run tests against. The databases are generated after the processes are forked in the 'after_fork' hook, and they are automatically cleaned up when the processes terminate.

00:13:22.170 Active Record handles all this for you, so you don’t need to worry about setting up your databases; Active Record knows how to manage this thanks to the hooks we've implemented.

00:13:37.170 The other type of parallel testing being offered for Rails 6 is threaded testing. There are a few scenarios where you might want to use threads over processes. The first is for Windows users, as UNIX sockets supported by DRB are not available on Windows. You'll also need the threaded parallelizer if you're using JRuby because JRuby relies on threads.

00:14:11.160 Lastly, threaded parallelization is advantageous if your tests are I/O-bound, as they can perform faster with threads because they won’t need to wait for processes to close before the next operation starts. The primary caveat, however, is that if you're utilizing a threaded parallelizer, your code needs to be thread-safe.

00:14:48.420 The 'parallelize' method takes an optional second argument, called 'with', which defaults to processes. To use the threaded parallelizer in your application, simply set 'with' to threads; it's as easy as that. When you generate the application using JRuby, it will automatically be set to threads, so you won’t have to adjust this manually.

00:15:11.250 However, the parallelization setup and teardown hooks are not supported by the threaded parallelizer, as it uses only one database. Another important point is that tests within the same suite cannot mix different parallelization methods or worker counts; you cannot divide your tests into half threads and half processes without creating two separate suites and corresponding CI builds.

00:15:34.860 In Rails 6, threaded parallelization relies on MiniTest's threaded executor. A key difference between the forked process parallelizer and the threaded parallelizer is that we did not write any code for the threaded parallelizer; MiniTest’s built-in parallelizer utilizes threads. We simply integrated a conditional that initializes that executor instead of the Rails parallelizer.

00:16:07.560 Initially, Aaron and I were hesitant to implement a threaded parallelizer because threads can be quite challenging to manage correctly. But I strongly felt that providing thread support out of the box was essential for JRuby and Windows users. A humorous example of the difficulty with threads emerged while Aaron and I were working on this feature.

00:16:27.530 While we were pairing, we began to notice odd failures in test runs—some tests passed while 90% of them failed. Running a single test consistently produced a passing outcome, indicating a potential concurrency issue. To dig deeper, we modified the test log to display the connection ID for each database call. Each thread should connect with a unique connection ID.

00:17:07.300 However, instead, we observed that all threads were connecting to the database with the same connection ID. Aaron remarked that it seemed odd, as if the databases were using a single shared ID. This pointed to a connection isolation problem, which soon became clear to me. I realized that the code I had written a year prior in Rails was coming back to haunt me.

00:17:43.820 How many of you in this room saw my talk on system tests at RailsConf last year? I didn't see my own talk either! When I implemented system tests, I aimed for a solution that eliminated the need for database cleaner. The issue at hand was that when running system tests, Rails opens a connection to the database on one thread, while Capybara initiates a Puma server that opens a second thread with its own database connection.

00:18:26.310 This setup caused the Rails thread and the Puma thread to become isolated from one another, leading to inconsistent data during test runs because the transaction on the Rails thread was unaware of the Puma thread's existence. We resolved this by forcing each thread to utilize the same connection ID. It became apparent to me that the very same connection-sharing implementation was the reason it broke threaded parallelization.

00:19:14.580 For effective threaded concurrency, each individual thread needs to maintain its own separate connection to the database. Luckily, since I had authored the offending code, it was simple to locate and fix the issue by enabling a per-thread database connection for the threaded parallelizer.
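
The difference between a shared connection and a per-thread connection can be illustrated with a small sketch. Everything here is hypothetical: FakePool is not an Active Record class, and the integer it returns merely stands in for a connection ID like the one we printed in the test log.

```ruby
# Hypothetical pool that hands out integer "connection IDs".
class FakePool
  def initialize(shared:)
    @shared = shared
    @lock = Mutex.new
    @next_id = 0
    @ids = {}
  end

  # In shared mode every caller gets the same ID; otherwise each thread gets its own.
  def connection_id
    @lock.synchronize do
      key = @shared ? :shared : Thread.current
      @ids[key] ||= (@next_id += 1)
    end
  end
end

# Ask for a connection from several threads and collect the IDs they saw.
def connection_ids(pool, threads: 3)
  Array.new(threads) { Thread.new { pool.connection_id } }.map(&:value)
end

p connection_ids(FakePool.new(shared: true))        # => [1, 1, 1]
p connection_ids(FakePool.new(shared: false)).sort  # => [1, 2, 3]
```

The shared mode mirrors what system tests need, where the test thread and the server thread must see the same transaction. The per-thread mode mirrors what threaded parallelization needs, where tests must not see each other's data.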

00:19:47.520 It’s also crucial to recognize that system testing isn't currently feasible with threaded parallelization. We're actively trying to resolve this, but for now, if you wish to use it with your system tests, you can still conduct them with processes. Running your parallelized tests will require no extra effort—just run 'rails test' as you typically would and Rails will manage the rest.

00:20:15.290 If you’re interested in the PR for the parallel testing feature, you can find it on GitHub. I'm very enthusiastic about this feature, and I hope it assists you in scaling your application's test suite, enabling you to concentrate on deploying code instead of waiting for CI to complete.

00:20:54.520 Now that we’ve made your test suite faster, let’s examine another area where Rails has struggled with scalability, specifically in handling traffic spikes.

00:21:10.250 I’m not referring to the kind of traffic spikes that come from a DDoS attack, which typically require external vendors or specialists. Instead, I mean large amounts of organic traffic, such as when an influencer uses your application and you realize your single primary database cannot manage it.

00:21:51.690 This is a situation I’ve faced before, and it’s not pleasant. Imagine waking up, sans coffee, only to find Nagios alerting you that your application is on the verge of crashing due to validation errors in your primary database!

00:22:22.150 You need to split tables off your main MySQL cluster. You need separate read-write and read-only databases, as well as replication and increased capacity. Your single primary database simply will not suffice anymore.

00:23:02.499 Rails has indeed supported multiple databases for some time, but doing so has often been arduous. It took me three hours to implement multiple database support in a demo app on Rails 5, and the complexity comes from the absence of standardization.

00:23:41.879 The underlying infrastructure functions, but we have yet to provide clear documentation or organizing tools to simplify the process. Hence, the second major focus of our improvements in Rails deals with enhancing support for multiple databases.

00:24:08.240 These improvements may not be the flashiest new features like parallel testing, but they play a significant role in scaling web applications. Rails has maintained support for multiple databases for quite a while; things became much easier with the updates made in Rails 5 compared to earlier versions.

00:24:41.660 However, when I set about creating that demo application, I quickly became aware of how many gaps exist in terms of usability. For instance, consider an application with two databases: one as the primary database consisting of flowers and people tables, and the other database for animals, featuring cats and dogs tables.

00:25:20.830 By default, Active Record is aware of the primary database and connection, as it’s tied to ActiveRecord::Base. We must instruct our application on how to identify and connect with the animals database whenever we're working on the dog or cat models.

00:26:04.020 However, we already have an issue at hand: best practices for utilizing multiple databases aren't documented within Rails. There’s no guidance around configuring your database YAML file for model connections, migrations, or rake tasks related to the second database.

00:26:48.610 We will have to figure out how to inform our application about that additional connection by ourselves. You can connect the application to the animals database by adding a second entry in your database YAML file. Leveraging a three-tier configuration lets us categorize our databases according to the respective environment, leading to a clearer database YAML file that simplifies understanding how everything works together.

00:27:34.080 The primary database serves as the default for our main connection, representing the production database housing people's and flowers' tables. Then we include our animal database entry, which will contain all information regarding animals.
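
As a rough sketch of what such a three-tier file looks like, the snippet below embeds the YAML in Ruby and parses it. The connection names "primary" and "animals" follow the example above; the adapter and database names are hypothetical placeholders.

```ruby
require "yaml"

# Parse a three-tier configuration: environment -> connection name -> settings.
# Adapter and database names are made up for illustration.
def three_tier_config
  YAML.safe_load(<<~YAML)
    development:
      primary:
        adapter: mysql2
        database: my_app_development
      animals:
        adapter: mysql2
        database: my_app_animals_development
    production:
      primary:
        adapter: mysql2
        database: my_app_production
      animals:
        adapter: mysql2
        database: my_app_animals_production
  YAML
end

# List the connection names configured for one environment.
def connection_names(config, env)
  config.fetch(env).keys
end

three_tier_config.fetch("production").each do |name, settings|
  puts "#{name}: #{settings.fetch('database')}"
end
```

Grouping by environment first keeps every database that belongs to, say, production in one place, which is the readability benefit described above.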

00:28:12.080 Next, we need to direct Active Record on where to find migrations for the dogs and cats tables, as it doesn't know by default. In the past, I've worked with migrations that involve multiple databases and found two different ways to approach it. What we've implemented at GitHub is to modify our migrations to designate the connection at the start of the migration, but it's quite messy and assumes a deep understanding of internals.

00:28:51.700 Therefore, in my demo application, I've chosen a cleaner route; I placed all migrations pertaining to the animals database in their own directory, making them significantly easier to locate and manage. Grouping migrations by connection avoids the hassle of modifying each migration file to adjust the database connection.

00:29:33.220 Yet, utilizing this method to organize migrations requires that we specify the migration paths each time we call 'migrate.' While this approach works well for custom rake tasks, it becomes unfeasible with the existing rake tasks because the migrations paths method is defined on the migrator class, presenting a significant challenge.

00:30:14.320 This indicates that we must independently configure the connection and recalibrate it each time we invoke a migration task relying on the migrations path, which is time-consuming. Hence, we’ve unveiled another limitation within Rails' handling of multiple databases: inadequate support for executing migrations across different connections.

00:30:54.030 You can create database tasks to trigger migrations, but the Rails internals that depend on migrations would have no understanding of any connection aside from the default ActiveRecord::Base connection. Our next step is to establish a connection for the dog and cat models to the animals database.

00:31:41.560 To achieve this, we create a new base class; you can name it anything, but for this demonstration, I termed it AnimalBase. For it to function appropriately, we need to set the 'abstract_class' attribute to true, which instructs Active Record to bypass the implied Single Table Inheritance rules tied to the parent class among its derived classes.

00:32:24.440 In other words, we want the Dog model to look for its own dogs table rather than a table named after the AnimalBase parent class. Next, we establish the connection to the animals database in AnimalBase. At this point, we only need to change our child classes so they inherit from AnimalBase instead of ApplicationRecord, which tells Rails which connection to use when you call 'Dog.new' or 'Cat.find'.

00:33:10.990 While it may seem relatively straightforward, attempting to execute any tasks afterwards exposes a critical issue: rake tasks won’t work for multiple databases. These tasks only account for ActiveRecord::Base, which is limited to the primary database.

00:33:54.650 Consequently, create, drop, and migrate tasks will exclusively impact the primary database. Thus, we are compelled to develop all the rake tasks autonomously—writing create, drop, migrate, schema dump, and schema load tasks, which is extremely tedious and time-consuming.

00:34:41.880 Having undertaken this laborious task, I ran the create task for the primary database to find an error indicating that the production database wasn't configured. I double-checked my rake task configuration, stashed all my modifications aside from the three-tier configuration, to ensure I hadn’t made an error.

00:35:24.290 After some debugging, I discovered that default connections simply malfunction due to the three-tier configuration in Rails 5. Essentially, they are broken. When you call 'rake db:create', it first initializes Active Record in the Railtie, which calls 'establish_connection' without any arguments.

00:36:09.090 This is the point where Rails attempts to secure the default connection to the database within a standard application. Normally, it picks the database associated with the environment you're in. For instance, if you're operating in development, it will automatically choose the development configuration.

00:36:46.940 However, when employing a three-tier configuration, Rails is unaware of which entries are the defaults due to multiple options available in each environment—ranging from primary databases to animal databases. Thus, we have now pinpointed four key areas where Active Record complicates scalability for your database.
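
A minimal sketch of why the lookup breaks, assuming a simplified lookup that expects exactly one settings hash per environment. The database_for method and the database names are hypothetical; this is not the Active Record code path itself.

```ruby
# Hypothetical lookup that mirrors the two-tier assumption:
# the entry for an environment is itself the settings hash.
def database_for(configurations, env)
  configurations.fetch(env)["database"]
end

two_tier = {
  "development" => { "adapter" => "mysql2", "database" => "my_app_development" }
}

three_tier = {
  "development" => {
    "primary" => { "adapter" => "mysql2", "database" => "my_app_development" },
    "animals" => { "adapter" => "mysql2", "database" => "my_app_animals_development" }
  }
}

p database_for(two_tier, "development")   # => "my_app_development"
p database_for(three_tier, "development") # => nil
```

With the extra level of nesting, the environment entry holds named connections rather than settings, so a lookup written for two tiers finds no database at all.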

00:37:28.280 There's a lack of documentation, migration tasks malfunction, essential rake tasks are nonexistent, and finally, the default connection process doesn’t function properly. At this stage, I understood that managing multiple databases within Rails applications is genuinely challenging.

00:38:10.120 I know what I’m doing; I have prior experience with multiple database setups across different applications, and I still found it tough to establish quickly or effectively. Rails isn’t addressing the basic requirements needed to make database scaling easy or even feasible.

00:38:53.080 The scope of my project expanded from an initial goal of upstreaming GitHub's multiple database capabilities into Rails. What began as an endeavor to refine a few issues transformed into a major undertaking requiring several complex solutions. Each of these problems posed significant time challenges, and there was no straightforward resolution.

00:39:38.520 In situations like this, it can be incredibly easy to feel overwhelmed and frustrated. Trust me; there were numerous moments when I questioned the wisdom of even trying to upstream GitHub’s multiple database functionality into Rails. However, I remembered that I had applied to speak at RailsConf, and instead they invited me to deliver a keynote!

00:40:35.220 During this time, I had to step back and identify each individual problem presented. I resolved to tackle fixing the migrations for multiple databases first. It might seem odd to start here, but it was critical because these migrations are central to how changes are made and will improve the workflows as we venture forward.

00:41:22.290 Initially, if you wanted to organize your migrations into various directories, it was necessary to define the migration paths before executing a migrate command. This approach worked fine with manual invocation, but any reliance on Rails internals rendered it nearly impossible.

00:42:10.670 So I decided that the most effective way to solve this problem would be to save the migration paths in the connection instance rather than at the class level. This allows Rails to query for the migration directory instead of having to rely on the application to specify where migrations are located each time.

00:42:50.120 The solution came from refactoring how migration paths work. You can now define your migrations paths in the database YAML configuration for each database. If you don't set this, Rails defaults to db/migrate; when it is set, Active Record can ask the connection where its migrations are located.
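
The fallback behaviour can be sketched in a few lines. The migrations_paths_for helper is hypothetical, the 'migrations_paths' key is the name I understand the database YAML to use, and the db/animals_migrate directory is a made-up example.

```ruby
# Return the migration directories for one database's settings hash.
# Falls back to db/migrate when no migrations_paths entry is present.
def migrations_paths_for(settings)
  Array(settings.fetch("migrations_paths", "db/migrate"))
end

primary = { "database" => "my_app_development" }
animals = {
  "database" => "my_app_animals_development",
  "migrations_paths" => "db/animals_migrate"
}

p migrations_paths_for(primary) # => ["db/migrate"]
p migrations_paths_for(animals) # => ["db/animals_migrate"]
```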

00:43:30.940 This simple change required the pulling apart of considerable private APIs in the migrator class to enable broader flexibility in managing migrations. Once it was concluded, it laid the groundwork for subsequent development for optimizing multiple databases.

00:44:10.510 Transitions were made to ensure that Rails wouldn’t constrain developers when needing to infer where migrations were stored. But now it was time to approach the next issue—creating tasks for the new database.

00:45:14.470 One significant pain point associated with multiple databases in Rails 5 is the necessity of crafting your own rake tasks for the additional database. Writing these tasks can be exceedingly time-consuming, and it's something Rails should efficiently handle for us.

00:45:56.720 My objectives with this iteration focused on ensuring that database tasks are user-friendly and predictable. The original tasks were fundamentally limited; for example, create and drop functions would execute only against the primary database, and there were no namespace tasks to ease the handling.

00:46:37.450 My goal was for the create, drop, and migrate tasks to run automatically across both the primary and animals databases, whereas the existing tasks were designed to run only against the primary database.

00:47:18.480 The challenge emerges from navigating code within Rails that assumes a two-tier setup. When a single database corresponds with an environment, Rails operates seamlessly by selecting a single configuration hash.

00:47:57.730 In contrast, a three-tier setup complicates matters; when requesting a configuration hash for an environment, Rails can't deduce which configuration is the required one without knowledge of existing namespaces. To resolve this predicament, Aaron and I converted these hashes to objects internally.

00:48:44.620 We created a new class, DatabaseConfig, that captures complete information about each environment, specification name, and config hash. Consequently, once we possess this object, tossing queries like asking for environments tied to configurations becomes trivial.

00:49:30.910 We could essentially dive into the configurations associated with each environment easily by utilizing the newly established object-based methodology. This newfound flexibility allows for improved manipulations and queries relating to our databases.
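
To make the idea concrete, here is a sketch that turns nested configuration hashes into flat objects and then queries them by environment. It is an illustration only: the Struct below is a stand-in for the real class, and build_configs, configs_for, and the database names are hypothetical.

```ruby
# Stand-in for the configuration object: environment, specification name, settings hash.
DatabaseConfig = Struct.new(:env_name, :spec_name, :config)

# Flatten two-tier or three-tier hashes into a list of DatabaseConfig objects.
def build_configs(configurations)
  configurations.flat_map do |env_name, entry|
    if entry.values.all? { |value| value.is_a?(Hash) }
      # Three-tier: each key under the environment names a connection.
      entry.map { |spec_name, settings| DatabaseConfig.new(env_name, spec_name, settings) }
    else
      # Two-tier: the entry is the settings hash for a single primary database.
      [DatabaseConfig.new(env_name, "primary", entry)]
    end
  end
end

# Select every configuration that belongs to one environment.
def configs_for(db_configs, env_name)
  db_configs.select { |db_config| db_config.env_name == env_name }
end

all_configs = build_configs(
  "development" => {
    "primary" => { "database" => "my_app_development" },
    "animals" => { "database" => "my_app_animals_development" }
  },
  "test" => { "database" => "my_app_test" }
)

# A task such as create can now loop over every database in the environment.
configs_for(all_configs, "development").each do |db_config|
  puts "create #{db_config.config.fetch('database')} (#{db_config.spec_name})"
end
```

Once every entry carries its own environment and name, a task no longer has to guess which hash is "the" configuration; it simply iterates over all of them.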

00:50:09.480 The system for migrating tasks underwent further upgrades aimed at auto targets for objects so that Rails can effectively operate on both existing and new databases corresponding with a specified environment.

00:50:101.790 Through these structural changes, it is possible to collect configurations targeting the designated environment, combine these with the tasks provided, and handle crucial database commands effectively. The migrated databases adopted a framework capable of integrating these updates seamlessly.

00:50:202.450 This adaptability and capability in managing multiple databases underscore a significant evolution in how Rails approaches those tasks, streamlining interactions for the developer.

00:50:355.280 We kept reiterating the enhancement process, ensuring that it framed itself around usability. These new connections exponentially improved experiences while working with multiple databases within Rails.

00:51:36.360 Despite this progress, considerable work remains to be done to address truly optimal scalability and performance in multi-database interactions.

00:52:09.020 One last point worth examining is the challenge of default connections for three-tier configurations. I devised a temporary fix that works; however, it is inherently flawed, and a proper fix requires restructuring Active Record's connection management to rely on the new configuration objects.

00:52:58.200 This upcoming work will make it easier to select the desired connection and will make the whole process much clearer.

00:53:34.200 How much remains depends on the size of this undertaking, which is still a work in progress. I shared a good deal of this at RailsConf in April, and I'm still working through the complexities.

00:54:17.860 The connection management work is large and still evolving; it began with relatively modest goals but has grown into a significant area of focus.

00:54:46.160 However, I'll share that I have been consistently documenting my progress and plan to talk about the compelling outcomes that will emerge from this improved architecture.

00:55:30.720 I hope that through this transformative process, we’ll discover enhancements that will not only prove beneficial to Rails but also contribute positively to applications built on its foundation.

00:56:12.290 In conclusion, a few months ago, a question was raised: are you worried that Rails may become bloated with all of this scaling code? This concern is valid, as most applications do not require GitHub-scale infrastructure.

00:56:50.640 Nevertheless, when reflecting on the methods GitHub has implemented over the years to facilitate scaling, I can't help but wonder how Rails would look today had we upstreamed these features years ago.

00:57:30.240 How much better might parallel testing and multiple database support be? Would everyone still assert that Rails doesn't scale, or would the conversation instead center on how to get the most out of Rails at scale?

00:58:07.220 Through our own experience at GitHub while upgrading our application from Rails 3.2 to 5.2, I gained insight into how the pieces evolved from GitHub's beginnings to the present, including tools, database infrastructure, and CI deployments.

00:58:44.320 The decisions made over the past years, both commendable and regrettable, have significantly shaped GitHub's scaling journey. I'm certain that GitHub isn't the only company whose choices have influenced Rails' gradual evolution.

00:59:22.980 In working with other companies on custom multiple database setups and parallel testing, I saw how each one ran into scaling obstacles it considered unique, which led to highly specialized solutions.

00:60:01.080 I've identified a trend where engineers naturally take ownership of their scaling problems, viewing them as one-of-a-kind dilemmas when, in reality, they likely have a lot in common with problems others have already faced.

00:60:38.220 It's imperative to address bugs and performance issues as they arise, yet this frantic approach can lead to hastily constructed band-aid fixes instead of well-designed scaling solutions built upstream.

00:61:21.460 Under pressure, it’s easy to succumb to the notion of building solutions tailored solely to our applications that we can quickly patch later. However, experience indicates that this 'later fix' rarely materializes.

00:62:04.120 This leads to code becoming convoluted over time, ultimately resembling a game of Jenga. The slightest misstep could result in total disarray. In this context, doubts about whether Rails scales may begin to creep in.

00:62:46.940 The truth is, it isn’t Rails that struggles with scalability; rather, it’s the applications we’ve built on top of it that present challenges.

00:63:31.120 When we only craft solutions that cater to our specific applications, our code becomes fragmented and inflexible. These bespoke tools stagnate as they miss opportunities for community enhancements.

00:64:14.060 In doing so, we often find ourselves recreating similar tools repeatedly. A better approach would be to adopt a perspective that encourages us to assess problems upstream initially.

00:64:57.600 The first step is acknowledging that our applications are not special; they will eventually require scaling just like any other application.

00:65:39.720 I've observed many companies reinvent parallel testing, build multiple database support from scratch, and rush out stopgap solutions that then diverge from one another, rather than building reusable solutions upstream.

00:66:24.080 I’m not suggesting that upstreaming is a straightforward endeavor, but when teams start recognizing familiar patterns across varying setups, it’s clear that the need for solutions isn’t unique to just their app.

00:67:09.120 Rails is an open-source project specifically designed to cater to collective needs. By contributing your tools to the community, you’ll be building robust frameworks while simultaneously alleviating personal workload.

00:67:53.460 When we share our solutions, others benefit from the improvements, which in turn makes the framework better for everyone across different applications.

00:68:38.340 This collaborative approach leads to better tools for everyone; by documenting and upstreaming your experiences, you strengthen the foundation for Rails' future scalability.

00:69:22.940 We, at GitHub, have been refining our applications for scalability for years, and we've concluded that it’s imperative to share these insights within the community to enhance Rails’ capability for scaling.

00:70:08.160 Examining how we’ve leveraged Rails over the past decade outlines not just its limitations but also its strengths and potential areas for growth. Through continued contributions and collective input, we can facilitate a path towards comprehensive scaling solutions.

00:70:53.680 Furthermore, I envision that in the coming years, as we push for more robust tools, the potential introduction of multi-tenancy, improved database connection management, and optimized workflows will transform frameworks like Rails into something that caters to increasing demands.

00:71:35.240 Rails is not dying; it's transforming into a more mature system. To ensure it remains the preferred option for constructing web applications, we, the Rails development team, must adapt the framework to the needs of evolving web applications, focusing not solely on prototypes but on sustainability.

00:72:15.640 The road to maturity involves developing scaling solutions entirely upstream so that users don't have to repeatedly reinvent systems or search for alternatives.

00:72:56.440 Some may still hold the belief that Rails is merely children’s software. Yet, if we look back to the earlier comparisons of Rails and Fisher-Price toys, both are characterized by robust designs, impeccable construction, and memorable impressions.

00:73:36.040 We will persist in enhancing Rails, ensuring it becomes increasingly scalable, resilient, and unforgettable. From 'rails new' to Rails at scale, we commit to evolving Rails until the narrative changes to 'Rails does scale'.

00:74:14.700 For this to materialize, I urge you to evaluate your own applications: what have you done to make Rails scale for you? What can you contribute back to the community? Perhaps you can help improve parallel testing, or perhaps you have tools that could be generalized and pushed upstream.

00:75:00.560 Rails is, and will continue to be, shaped by its community. Help us define the future of Rails. Let's work towards making Rails scalable by default.


Paris.rb Conf 2018