The Recipe for the World's Largest Rails Monolith

00:00:30.189 It is an absolute pleasure to introduce Akira Matsuda. He is one of the founding organizers of Asakusa.rb and of RubyKaigi, which is the largest Ruby conference in the world, boasting 800 attendees last year. If you ever get the chance, you should definitely attend. Asakusa.rb hosts meetups near Wano in Tokyo every week. They are quite dedicated to their meetups in Japan.

00:01:03.280 If you have ever used pagination in a Rails app, it was probably an implementation he wrote. He also created the action_args and active_decorator gems among many others, and he's a core committer on Ruby and Rails. Akira speaks English very well; he learned it while on an exchange program in Zimbabwe. Although he has an accent that you may not recognize, I encourage you to get to know him. Once again, this is Akira Matsuda with us today.

00:01:48.970 Cheers, everyone! I'm from Japan and I am here to talk at this conference today. Japan, of course, is the country where Ruby was born. We have Ruby in Japan, great sushi, and, of course, fantastic sake.

00:02:06.830 Actually, I brought a big bottle of high-quality sake to share with you. I don't know how we can show it to you, but let me just pass this around. So anyway, my name is Akira Matsuda; Matsuda is my family name. You might be familiar with the spelling 'Mazda,' as in the car company. My GitHub account is 'amatsuda' and my Twitter handle is 'a_matsuda', with an underscore. On GitHub, I'm recognized for my contributions to the community, such as the plugin for Elf, ActiveDecorator, and many other gems.

00:02:50.840 Some of you might remember that I had the chance to speak here three years ago to introduce some of my gems at Ruby on Ales. This is my second time speaking at this conference, so thank you very much to the organizers for inviting me this year. I'm very happy to return. On GitHub, I'm listed as a committer for Ruby and Rails, as well as a template engine called Haml. Recently, I received a commit bit for the CarrierWave uploader. I'm not sure why, but I'm glad for the opportunity. I live in Tokyo, Japan, and I also organize a local Ruby group called Asakusa.rb.

00:04:07.540 By the way, I realize that I have 985 followers on GitHub. Thank you to all of you; that is amazing! It's nearly 1,000! If you're not already following me, you can visit my page and click the follow button in the top right corner. If 15 of you do that right now, I can check how many I have.

00:06:58.189 As for my day job, I'm freelancing in Tokyo, working for several companies. My talk today will focus on one of my clients named Cookpad. To begin with, I want to show you the result of the 'rake stats' command on the Cookpad application, which illustrates how massive the application is.

00:07:17.180 This Rails app has 1,700 models and approximately 400,000 total lines of code, which is quite large. The application depends on 273 external gem libraries. The website attracts 50 million unique users per month, capturing about one-third of the entire population of Japan, and handles a staggering 15,000 requests per second at peak times.

00:07:44.360 This record was set just last month. To handle these requests, we use 300 Rails application servers. Our database schema has over 1,000 tables, and our production Rails application connects to 30 database servers. Additionally, we have 20,000 RSpec examples. The application is actively maintained and developed by around 50 developers. We make about 2,000 commits per month and deploy the application more than 10 times a day.

00:09:18.210 So, what is Cookpad? It is a cooking recipe-sharing platform, somewhat like a social networking site, where users can post their recipes, search through menus, and purchase ingredients. There are currently over 2 million recipes available, although you might not have come across the site since all content is primarily in Japanese. We are working on creating an English version from scratch.

00:09:46.790 With 50 million unique users each month, pretty much every Japanese cook uses Cookpad. Our philosophy is to ensure that the application is as fast as possible, with a target response time of less than 200 milliseconds. We are consistently achieving this goal. One common belief on the internet is that such large applications do not scale well, but we have found the opposite to be true; we simply let Rails scale. The nature of our website, being cooking-related, means it experiences spikes in traffic around 5:00 and 6:00 p.m. when mothers start to prepare dinner, as well as around lunchtime.

00:12:30.670 As a result, we run 300 servers to manage this demand during peak times, but we don’t need them active all the time. Hence, our approach is to scale dynamically. We refer to our scaling solution as Cookpad on Scale. We developed proprietary software similar to what Amazon uses for scaling but with a clever implementation of a locking system.

00:13:06.670 First, we make the Linux image disposable and immutable. The system spins up servers when traffic increases, and automatically shuts them down when traffic decreases. We are running this solution in production, and the graph illustrates the correlation between the number of active servers and the incoming request volume.
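
The scale-up and scale-down decision described here can be sketched in a few lines. This is only an illustration, not Cookpad's actual software: the capacity of 50 requests per second per server is merely inferred from the figures quoted earlier (15,000 requests per second across 300 servers), and `desired_servers`, `scaling_action`, and the floor of 20 servers are hypothetical.

```ruby
# Hypothetical sizing rule: keep just enough servers for the current request rate.
REQUESTS_PER_SERVER = 15_000 / 300 # 50, inferred from the peak figures in the talk
MIN_SERVERS = 20                   # assumed floor, not from the talk
MAX_SERVERS = 300                  # maximum fleet size mentioned in the talk

# Number of servers we would like to have running for a given request rate.
def desired_servers(requests_per_second)
  needed = (requests_per_second.to_f / REQUESTS_PER_SERVER).ceil
  needed.clamp(MIN_SERVERS, MAX_SERVERS)
end

# Decide whether to launch or terminate servers, and how many.
def scaling_action(current_servers, requests_per_second)
  delta = desired_servers(requests_per_second) - current_servers
  if delta > 0
    [:launch, delta]
  elsif delta < 0
    [:terminate, -delta]
  else
    [:hold, 0]
  end
end

p scaling_action(100, 15_000) # dinner-time peak
p scaling_action(300, 2_500)  # quiet period
```

A real system would also smooth the request rate over time so that a brief spike does not start and stop servers repeatedly.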

00:14:19.200 This strategy ensures that users never experience a heavy load even during peak dinner times, and it keeps our server costs manageable. Although we manage up to 300 servers at maximum, deploying our application across these servers is often cited as challenging due to the time it requires.

00:15:08.210 However, we don't use traditional deployment approaches. We steer clear of Capistrano because it can be slow due to its reliance on SSH. Instead, we created our own deployment tool called Mamiya. This new deployer utilizes a gossip protocol, and allows us to deploy to 200 to 300 servers in under a minute, whereas Capistrano typically takes around 20 minutes.
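
To give a feel for why gossip-style dissemination reaches hundreds of servers quickly, here is a small simulation. It is a generic push-gossip model, not Mamiya's implementation; the fanout of 3, the fixed random seed, and the name `gossip_rounds` are assumptions made for the illustration.

```ruby
# Simulate push-style gossip: in every round, each node that already knows
# about the release tells `fanout` randomly chosen peers.
# Returns the number of rounds until every node has heard the news.
def gossip_rounds(node_count, fanout: 3, rng: Random.new(42))
  informed = [true] + [false] * (node_count - 1) # node 0 starts with the news
  rounds = 0
  until informed.all?
    # Snapshot the senders first so nodes informed in this round wait a round.
    senders = informed.each_index.select { |i| informed[i] }
    senders.each do
      fanout.times { informed[rng.rand(node_count)] = true }
    end
    rounds += 1
  end
  rounds
end

rounds = gossip_rounds(300)
puts "300 nodes informed after #{rounds} gossip rounds"
```

Because the number of informed nodes can grow by a constant factor each round, the round count grows roughly with the logarithm of the fleet size, whereas a single machine pushing to every server in turn does work proportional to the number of servers.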

00:15:43.960 For more details, the creator of Mamiya has discussed it at conferences such as RubyKaigi and RubyConf. He's an exceptional talent: he became a Ruby committer at just 14 years old and started working at the company when he was 15.

00:16:37.060 Moving on to our next topic: the database. As I mentioned, our database structure consists of over 1,000 tables, and traditionally, ActiveRecord struggles to manage connections to multiple databases. We need to facilitate 30 database connections efficiently.

00:17:30.640 ActiveRecord does have a method named 'establish_connection' that lets a model connect to any database. However, simply establishing a separate connection for each table would overwhelm our databases, especially considering we have 300 servers. Moreover, we need to implement master-slave switching so that reads and writes can go to different servers.

00:18:30.800 Therefore, we built our own gem called SwitchPoint for master-slave connections using ActiveRecord, which is well-structured and performs efficiently. This gem allows us to manage our database connections without causing performance bottlenecks.
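
The idea of switching between a writable master and a read-only slave connection can be sketched as follows. This is a plain-Ruby illustration, not the gem's API: the class `ConnectionSwitch`, its method names, and the string stand-ins for connections are all hypothetical.

```ruby
# Hypothetical sketch of read/write connection switching.
class ConnectionSwitch
  def initialize(writable:, readonly:)
    @connections = { writable: writable, readonly: readonly }
    @mode = :readonly # default to the slave so reads stay off the master
  end

  # The connection for the current mode.
  def connection
    @connections.fetch(@mode)
  end

  # Run the block against the master, then restore the previous mode.
  def with_writable(&block)
    with_mode(:writable, &block)
  end

  # Run the block against the slave, then restore the previous mode.
  def with_readonly(&block)
    with_mode(:readonly, &block)
  end

  private

  def with_mode(mode)
    previous = @mode
    @mode = mode
    yield connection
  ensure
    @mode = previous # restore even if the block raises
  end
end

switch = ConnectionSwitch.new(writable: "master-db", readonly: "slave-db")
puts switch.connection                        # reads use the slave by default
switch.with_writable { |conn| puts conn }     # writes go to the master
```

This sketch keeps the mode in an instance variable, so it is not thread-safe; a production version would need per-thread state.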

00:19:00.920 Now let's discuss testing. We currently manage 20,000 RSpec examples, and we heavily utilize Capybara for integration tests. Running all of these tests takes about five hours on my local machine, but we aim to complete the tests in just ten minutes.

00:20:32.150 To accomplish this, we realized that reducing application size was crucial. Therefore, we created our own version of parallel test execution, initially developed as a remote spec tool. We took this solution, open-sourced it, and now use it effectively on both CI servers and our local machines.

00:21:37.780 The strategy involves distributed remote execution that optimizes test execution order and maintains high fault tolerance. We utilize services such as EC2 Spot Instances with substantial memory and CPU resources, enabling us to finish running all tests in about ten minutes.
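
As a rough guide, five hours is 300 minutes, so finishing in ten minutes needs at least 30 workers with evenly balanced load. One common way to balance load is to start the slowest files first and always give the next file to the least-loaded worker. The sketch below assumes that this is what optimizing the execution order means; the function `schedule` and the timings are hypothetical.

```ruby
# Greedy "slowest first" scheduling: sort spec files by their last known
# duration and hand each one to the worker with the least total work so far.
def schedule(durations, worker_count)
  workers = Array.new(worker_count) { { files: [], total: 0.0 } }
  durations.sort_by { |_file, seconds| -seconds }.each do |file, seconds|
    worker = workers.min_by { |w| w[:total] }
    worker[:files] << file
    worker[:total] += seconds
  end
  workers
end

# Hypothetical timings in seconds from a previous run.
timings = { "a_spec.rb" => 90, "b_spec.rb" => 60, "c_spec.rb" => 45,
            "d_spec.rb" => 30, "e_spec.rb" => 15 }
plan = schedule(timings, 2)
plan.each_with_index do |w, i|
  puts "worker #{i}: #{w[:files].join(', ')} (#{w[:total]}s)"
end
```

With these timings both workers end up with 120 seconds of work, so neither sits idle while the other finishes a long file.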

00:22:06.880 This approach is actually more cost-effective than existing cloud solutions since we only use the servers as needed. Another challenge we faced was with the DatabaseCleaner gem, which struggled to delete records from our 1,000 tables across all 20,000 tests.

00:23:14.230 Realizing that we do not need to interact with all of those models for each individual test, we developed our own version of the database cleaner. This solution allows us to monkey patch ActiveRecord, tracking the names of only the tables in use for each test, making it significantly faster.
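
The table-tracking idea can be illustrated without ActiveRecord. In this sketch an explicit wrapper records which tables received inserts, whereas the approach described in the talk hooks into ActiveRecord itself; `FakeDatabase` and `TrackingCleaner` are hypothetical names and the database is a plain in-memory hash.

```ruby
require "set"

# Hypothetical in-memory stand-in for a database with many tables.
class FakeDatabase
  attr_reader :tables, :delete_calls

  def initialize(table_names)
    @tables = table_names.to_h { |name| [name, []] }
    @delete_calls = 0
  end

  def insert(table, row)
    @tables.fetch(table) << row
  end

  def delete_all(table)
    @delete_calls += 1
    @tables.fetch(table).clear
  end
end

# Remembers which tables were written to and cleans only those.
class TrackingCleaner
  def initialize(db)
    @db = db
    @dirty = Set.new
  end

  def insert(table, row)
    @dirty << table
    @db.insert(table, row)
  end

  def clean
    @dirty.each { |table| @db.delete_all(table) }
    @dirty.clear
  end
end

db = FakeDatabase.new((1..1000).map { |i| "table_#{i}" })
cleaner = TrackingCleaner.new(db)
cleaner.insert("table_1", { name: "curry" })
cleaner.insert("table_2", { name: "rice" })
cleaner.clean
puts db.delete_calls # 2 deletes instead of 1,000
```

The saving comes from issuing one delete per touched table rather than one per table in the schema after every test.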

00:24:02.420 We worked hard to ensure that our cleaner is much faster than DatabaseCleaner, often more than 100 times faster. Moving on to migrations, we avoided using ActiveRecord migrations due to the vast number of databases we handle.

00:24:47.890 Instead, we developed our own migration tool, which is similar to what is found in Chef or Puppet. As the next topic, let's discuss our prototype framework. With around 50 developers actively working on the same application, managing conflicts can be quite daunting.

00:25:52.870 To address this, we created a prototyping framework on top of Rails called Chanko to facilitate rapid application development. This framework allows us to create units that contain entire Model-View-Controller (MVC) structures, which can be activated or deactivated with a simple flag.

00:27:02.690 If an error occurs within a unit, it does not propagate to the parent application; the error is simply ignored, allowing the parent application to function as usual. Each unit is organized within a single directory for simplicity.
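
The two properties described here, a flag that turns a unit on or off and errors that never reach the parent application, can be sketched in plain Ruby. This is not the framework's actual API: the `Unit` class, its `invoke` method, and the warning message are hypothetical.

```ruby
# Hypothetical sketch of a feature "unit" with an on/off flag and error isolation.
class Unit
  def initialize(name, active:, &body)
    @name = name
    @active = active
    @body = body
  end

  # Runs the unit when it is active. When it is inactive, or when the unit
  # raises, the parent application's default behaviour is used instead.
  def invoke(default:)
    return default.call unless @active

    begin
      @body.call
    rescue StandardError => e
      warn "unit #{@name} failed: #{e.message}" # report, but do not propagate
      default.call
    end
  end
end

default = -> { "classic recipe page" }
enabled  = Unit.new(:new_page, active: true)  { "experimental recipe page" }
disabled = Unit.new(:new_page, active: false) { "experimental recipe page" }
broken   = Unit.new(:buggy,    active: true)  { raise "boom" }

puts enabled.invoke(default: default)
puts disabled.invoke(default: default)
puts broken.invoke(default: default)
```

Falling back to the default keeps an experimental feature from taking down a page that many other developers depend on.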

00:28:09.290 Now, how do we avoid becoming legacy? Our application has been around for 7 to 8 years, since Rails 1.1. We continuously upgrade to the latest version of Rails. During an upgrade, we verify that responses remain consistent between versions by sending the same user requests to both the new and the old version and comparing the results.

00:29:39.150 We use a shadow proxy server for this purpose, allowing us to capture user requests and send them to both versions, analyzing the logs to see if everything functions correctly.
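
A minimal sketch of this shadowing idea follows. It assumes the two versions can be modeled as callables that take a request path and return a body; the `ShadowProxy` class and the sample apps are hypothetical and not the proxy used in production.

```ruby
# Hypothetical sketch: send each request to both versions, serve the current
# version's response, and record any difference for later analysis.
class ShadowProxy
  attr_reader :mismatches

  def initialize(primary:, shadow:)
    @primary = primary
    @shadow = shadow
    @mismatches = []
  end

  def call(request)
    primary_response = @primary.call(request)
    begin
      shadow_response = @shadow.call(request)
      unless shadow_response == primary_response
        @mismatches << { request: request, primary: primary_response, shadow: shadow_response }
      end
    rescue StandardError => e
      # A crash in the new version is logged, never shown to the user.
      @mismatches << { request: request, primary: primary_response, error: e.message }
    end
    primary_response # users only ever see the current version's response
  end
end

old_app = ->(path) { "<h1>#{path}</h1>" }
new_app = ->(path) { path == "/search" ? "<h2>#{path}</h2>" : "<h1>#{path}</h1>" }

proxy = ShadowProxy.new(primary: old_app, shadow: new_app)
proxy.call("/recipes")
proxy.call("/search")
puts proxy.mismatches.size # 1: only "/search" differs between the versions
```

Shadowing like this is only safe for requests without side effects; write requests sent to both versions would need to be filtered out or pointed at separate data.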

00:30:35.480 We also perform this response checking with our RSpec tests, saving the response bodies and comparing them to confirm consistency. Although we have not yet published this method, it is simple and open-sourced.

00:31:20.640 We are committed to openly sharing our solutions and contributing back to the Ruby and Rails community. You'll find there are three Ruby communities within our company and we actively engage with them.

00:32:02.450 Recently, I upgraded our application from Rails 3.2 to 4.1, and had to patch various frameworks and gems, including Ruby itself. This illustrates the importance of maintaining relationships with the open-source community.

00:33:17.820 In conclusion, while there is much discussion around microservices, our large application remains a monolith, and we continue to develop it healthily. I believe Rails is an excellent framework that scales effortlessly, even for large applications, using a few unconventional strategies.

00:34:56.220 Microservices may be suitable for certain scenarios, but they are not universally the right approach. It is crucial to first identify and understand the underlying problems that need resolving before transitioning to microservices.

00:35:32.180 Thank you for your attention!