00:00:05.299
Thank you for coming to my session. The theme of this session is scaling Rails API to handle write-heavy traffic.
00:00:12.660
First, I will introduce myself.
00:00:19.199
I am Takumasa Ochi, a software engineer and product manager at DeNA.
00:00:25.800
I am responsible for developing back-end systems for mobile game apps. I love both developing and playing games.
00:00:31.920
I also work with Ruby and Ruby on Rails. This is our company, DeNA, whose mission is to delight people beyond their wildest dreams.
00:00:41.340
We offer a wide variety of services, including games, sports, live streaming, healthcare, automotive, e-commerce, and more.
00:00:48.920
We started using multiple databases for e-commerce in the 2000s.
00:00:57.780
Today, I will discuss games and multiple databases. This is the agenda for today's session.
00:01:05.960
First, I will present an explanation of mobile game API backends and the characteristics of mobile game apps. We will also have a quick overview of scalable database configurations. Then, we will move on to API designs for write scalability, exploring application API response design and sharding.
00:01:26.880
After that, we will discuss some pitfalls around multiple databases and testing with multiple databases. Finally, I will share considerations about the past, present, and future of multiple databases at our game servers.
00:01:58.500
Let's start with mobile game API backends.
00:02:10.979
Generally speaking, mobile game API backends can be split into two categories: real-time servers and non-real-time servers. In this session, we will focus on non-real-time servers. The objectives of these servers are to provide secure data storage, manage transactions, and enable multiplayer or multi-device communications.
00:02:29.280
For example, non-real-time servers handle inventories, friends, daily bonuses, and other transactions. This figure illustrates an example of API calls: a client consumes energy and helps a friend, while the server conducts a resource exchange transaction and returns a successful response.
00:02:41.340
This slide explains the system overview of our mobile game API backends. We started developing our game API server in the mid-2010s. The system runs on Ruby on Rails 5.2 and primarily operates on Infrastructure as a Service (IaaS).
00:02:55.980
The main API is a monolithic application, and there are other components such as proxy servers, management tools, and HTML servers for web views. We utilize the API mode to minimize controller modules and overhead.
00:03:06.360
Additionally, we employ schema-first development using protocol buffers, and we utilize MySQL, Memcached, and Elasticsearch. Now, let's examine the characteristics of mobile game apps.
00:03:15.180
How do you try a new game app? You tap the install button, wait a minute, and start playing. It's super easy for players; however, it is also super easy for the servers to face write-heavy traffic. This slide outlines the reasons for the write-heavy traffic.
00:03:41.040
First, as mentioned earlier, easy installation leads to easy write access. Downloading is very simple, and mobile app distribution platforms ensure installations from all around the globe. There is no need to create an account, and the first in-game action equates to the first write operation. Secondly, various in-game actions trigger accesses, especially writes, such as receiving daily bonuses, engaging in battles, and purchasing items. Generally speaking, the order of magnitude for mobile apps ranges from 10,000 requests per second to 1,000,000 requests per second. Therefore, scaling the API to handle write-heavy traffic is essential.
00:05:06.900
Now, let's take a quick overview of scalable database configurations. This slide provides an overview of our database configurations. From the application side, we use vertical splitting, horizontal sharding, and read-write splitting. On the infrastructure side, we conduct load balancing on readers and automatic writer failover. Let's look at each of these components one by one.
00:05:53.419
The first technique is vertical splitting, where we divide the databases into two parts. The first part is common, containing one writer node optimized for read-heavy traffic; we prioritize consistency for scalability in this section. For example, game setting data and pointers to shards are stored here. The other part is sharded, which is the main focus of today's session. There are multiple writer nodes optimized for write-heavy traffic, and we prioritize scalability over consistency in this section. Player-generated data is primarily stored in this database.
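As a rough illustration of this vertical split (not our production code, which predates the built-in support), the layout could be expressed with Rails 6.1-style abstract base classes; all class, database, and shard names below are hypothetical:

    # Route models either to the common database or to one of the shards.
    class CommonRecord < ActiveRecord::Base
      self.abstract_class = true
      connects_to database: { writing: :common, reading: :common_replica }
    end

    class ShardRecord < ActiveRecord::Base
      self.abstract_class = true
      connects_to shards: {
        shard_01: { writing: :shard_01, reading: :shard_01_replica },
        shard_02: { writing: :shard_02, reading: :shard_02_replica }
      }
    end

    class GameSetting < CommonRecord; end   # read-heavy game setting data
    class PlayerShard < CommonRecord; end   # pointer from player to shard
    class Item        < ShardRecord;  end   # write-heavy player-generated data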
00:07:34.560
In horizontal sharding, we consider data locality concerning each player and partition it based on player ID rather than resource types. This architecture is effective in ensuring consistency for each player and is efficient for player-based data access. We maintain a mapping table called player_shards to assign each shard. We determine a shard based on a player's first action using weighted sampling. This dynamic shard allocation is flexible for adding new shards and balancing the load among them.
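A minimal sketch of this dynamic allocation, assuming a PlayerShard model over the player_shards table; the shard names and weights are illustrative:

    # Pick a shard for a new player by weighted random sampling and remember
    # the choice in the player_shards mapping table.
    SHARD_WEIGHTS = { shard_01: 1, shard_02: 1, shard_03: 2 }.freeze  # illustrative weights

    def sample_shard
      point = rand * SHARD_WEIGHTS.values.sum
      SHARD_WEIGHTS.each do |name, weight|
        return name if point < weight
        point -= weight
      end
      SHARD_WEIGHTS.keys.last
    end

    def assign_shard!(player_id)
      PlayerShard.find_or_create_by!(player_id: player_id) do |mapping|
        mapping.shard_name = sample_shard.to_s
      end
    end

    def shard_for(player_id)
      PlayerShard.find_by!(player_id: player_id).shard_name.to_sym
    end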
00:08:41.460
There is a limitation on maximum inserts per second with this architecture; for instance, thousands or tens of thousands of queries per second are the limits of MySQL. If we hit this limit, we can utilize a range mapping table or distributed mapping tables. We also employ manual read-write splitting. This method is complex but effective: API servers access writer nodes for data used in transactions and for previously written data, while they access reader nodes for non-transactional data and other players' data.
00:09:17.460
From the infrastructure side, we implement various multiple database systems. The key concept here is to do that outside of Rails, which increases flexibility and achieves separation of concerns. In our case, we use a DNS-based service discovery system with a custom DNS resolver. Load balancing is conducted on the weighted sampling of IP addresses, and automatic writer failover is executed by changing the IP address to a new writer node.
00:09:56.030
Thus, every aspect of multiple databases is essential for scalability. Let's delve deeper into some critical features.
00:10:15.160
Now we are going to explore the API design for write scalability.
00:10:12.740
Let's start with traffic offload by replication. Replication is an effective technique to offload read traffic. Here, we replicate DDL and DML from writers to readers, meaning that almost the same data is available on reader nodes. By offloading read traffic to these readers, we can reduce the load on writers, making the system more scalable.
00:10:41.360
However, there is a replication delay: until the replicas catch up, the latest data is available only on the writer node. So, how do we balance scalability with consistency?
00:11:13.260
The solution is read-your-writes consistency, ensuring consistency on your writes. This slide shows the read-your-writes consistency in Rails 6. The solution is time-based connection switching; when there is a recent write, the request handler connects to the writer. This connection switching is the default in Rails 6, and a 2-second replication delay is tolerated. Using this architecture, we gain high scalability for read-heavy applications while ensuring strong consistency.
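For reference, this switching is enabled in Rails 6 with the database selector middleware; the 2-second delay below is the framework default:

    # config/environments/production.rb
    # Requests made within `delay` seconds of a write are routed to the writer.
    config.active_record.database_selector = { delay: 2.seconds }
    config.active_record.database_resolver =
      ActiveRecord::Middleware::DatabaseSelector::Resolver
    config.active_record.database_resolver_context =
      ActiveRecord::Middleware::DatabaseSelector::Resolver::Session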
00:11:56.880
On the other hand, there are two main concerns with this architecture. First, it is not ideal for write-heavy applications since the proportion of writer traffic is high, making offloading less effective. Furthermore, we must maintain a delicate balance between replication delay and write-read ratio. When the replication delay increases, the write-read ratio may also increase, leading to scalability issues. Therefore, we must further improve this for write-heavy traffic.
00:12:57.420
This slide explains read-your-writes consistency for write-heavy applications. The strategy here is ownership-based: players own their data, so a player's own critical data is read from the writer, while other, less critical data is read from readers. The advantage of this architecture is its effectiveness for write-heavy traffic: much longer replication delays, such as tens or hundreds of seconds, are tolerated. Such replication delays can occur due to cloud infrastructure or request surges.
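A minimal sketch of this ownership-based switching, assuming Rails 6's connected_to roles; the helper name and decision rule are illustrative, not our exact production code:

    # Read a player's own (critical) data from the writer; read other players'
    # data from a replica, tolerating a long replication delay.
    def with_role_for(owner_player_id, current_player_id, &block)
      role = owner_player_id == current_player_id ? :writing : :reading
      ActiveRecord::Base.connected_to(role: role, &block)
    end

    # Example: listing another player's items can be served by a reader.
    with_role_for(other_player_id, current_player.id) do
      Item.where(player_id: other_player_id).to_a
    end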
00:13:59.800
This architecture effectively offloads inter-player API traffic, as fewer connections to writers are required. However, the drawbacks include complicated business logic and limited offloading of a player's own (intra-player) API traffic. In essence, this architecture significantly enhances scalability, but further improvement is necessary, particularly for intra-player APIs.
00:14:52.440
Let's now examine traffic offload by resource-focused API design. Let's conduct a case study. The scenario involves buying hamburgers and adding them to an inventory system, where there is an infinite supply of hamburgers. In this scenario, player A buys hamburgers, decreasing their wallet balance while increasing the items in their inventory.
00:15:33.320
What APIs are required for this scenario? There are three necessary APIs: create burger purchase, index items in the knapsack, and show wallet. On your right-hand side, you can see the Rails routes for these resources. The API call sequence is illustrated on the left. Initially, a client calls the index items in the knapsack API and the show wallet API. The player then decides what to purchase, calling the create burger purchase and index items in the knapsack APIs again.
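The slide shows the actual routes; below is a hedged reconstruction of what they might look like:

    # config/routes.rb (illustrative reconstruction)
    Rails.application.routes.draw do
      resources :burger_purchases, only: [:create]   # create burger purchase
      resources :knapsack_items,   only: [:index]    # index items in the knapsack
      resource  :wallet,           only: [:show]     # show wallet
    end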
00:16:23.280
This illustrates that each of these API calls accesses the writer: either because of Rails 6's recent-write switching or because, under the ownership-based read-your-writes strategy, a player's own resources must be read from the writer. So, how do we reduce these numerous writer accesses in the API?
00:17:18.000
The question arises: why do we fetch data from a writer? This is necessary because the items in the wallet and inventory are being modified. Those modified items are tracked within the controller as ActiveRecord instances.
00:17:54.840
Thus, the solution is to include all the affected resources in the API response. In this scenario, there are two types of affected resources: the primary target resource, which is the burger purchase, and the affected resources, which include items in the inventory and the wallet.
00:18:30.720
In the JSON sample on the right, all resources are accounted for in the response. As illustrated, the client initiates the index items in the knapsack and show wallet call in the first action. After deciding what to buy, the client calls only create burger purchase, since all required resources are included in this response. Consequently, there are no read API requests aside from the initial call, and there are no direct accesses to the writer database save for the transaction itself.
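A minimal controller sketch of this response shape; our real responses are defined schema-first with protocol buffers, and the model helpers withdraw! and add! are hypothetical:

    class BurgerPurchasesController < ApplicationController
      def create
        purchase = nil
        ActiveRecord::Base.transaction do
          purchase = current_player.burger_purchases.create!(quantity: params[:quantity])
          current_player.wallet.withdraw!(purchase.total_price)   # hypothetical helper
          current_player.knapsack_items.add!(purchase.burgers)    # hypothetical helper
        end

        # Return the target resource plus every affected resource, so the client
        # needs no follow-up read APIs and the writer serves only the transaction.
        render json: {
          burger_purchase: purchase,
          items:  current_player.knapsack_items,
          wallet: current_player.wallet
        }
      end
    end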
00:19:20.040
This approach benefits both the databases and the clients. While there may be duplicated resources resulting in increased complexity, focusing on the writer database for transactions remains key, as this is the only aspect unmanageable by the readers.
00:20:01.440
This slide summarizes the combined result of replication and resource-focused API design. On one hand, read-your-writes consistency based on data ownership proves effective for write-heavy traffic and inter-player APIs. On the other hand, including affected resources in write API responses effectively minimizes unnecessary read API requests and database accesses.
00:20:54.040
As a combined result, we achieved the following outcomes: first, a complex yet adaptive architecture through manual read-write switching; second, maximized traffic offload through affected resources in write API responses; third, improved robustness against replication delay, tolerating significantly longer delays. Thus, we accomplished both consistency and write scalability while also improving the robustness of our application.
00:21:52.680
We have now successfully redirected writer databases to focus on transactions. However, what happens if the number of transactions exceeds the capacity of the writer databases? The answer lies in sharding.
00:22:07.200
Let's explore scalable data locality in the context of sharding. Generally, multi-tenant sharding involves splitting data by ownership, ensuring minimal interaction between shards.
00:22:46.380
This architecture is scalable; simply adding another shard is sufficient to achieve desired performance. Moreover, sharding represents an efficient architecture that guarantees optimal data locality.
00:23:27.280
However, when discussing single tenant multiplayer games, interactions between shards become crucial. Players in one extensive game world must interact with each other, and sharding does not inherently guarantee optimal data locality.
00:23:47.760
Moreover, sharding can lead to consistency issues or failures over the network, which necessitates careful management. The objective of data locality is to enhance performance and consistency by placing data optimally. The basic strategy is to minimize cross-shard operations while keeping access to shared resources efficient.
00:24:37.020
Let's conduct some case studies to illustrate this architecture.
00:25:10.500
The first scenario presents a situation where player A and player B train themselves, while player C calls on one of them for assistance. In this case, obtaining the latest training state is unnecessary; the requests comprise 95% training requests and only 5% help requests. The solution is a pull architecture, as illustrated in the accompanying figure.
00:25:51.660
Here, we locate data according to ownership indexed by player ID, meaning that 95% of write requests only involve one writer while 5% of read requests necessitate access from one writer and one reader. This architecture performs well for most write requests.
00:26:39.480
The next scenario is about sending gifts, listing gifts, and receiving them. In this context, player A and player B send gifts to player C, who lists them chronologically. The requests consist of 30% gift sending requests and 70% gift listing or receiving requests. The proposed solution lies in push architecture, where data is located at the receiver's end, indexed by the receivable player ID.
00:27:23.760
Consequently, 30% of write requests involve two writers, whereas 70% involve only one writer. Although there is an added cost for sending gifts, this architecture proves effective for the majority of read-write requests, eliminating the need for costly merge sort operations due to already-indexed data.
00:28:13.959
The third scenario involves players becoming friends and listing them. Here, player A, player B, and player C list friends, with player C linking with player B. The request comprises 95% listing requests and 5% occasional linking requests. The solution here is a double-write architecture, enabling data to be correctly indexed by both player ID and friend player ID.
00:28:55.180
Thus, 5% of write requests involve at most two writers, while 95% of read requests involve only one reader. This proves efficient for most read requests. Now, let's move on to a more complex scenario.
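A hedged sketch of the double-write pattern, reusing the hypothetical ShardRecord base class and shard_for helper from the earlier sketches:

    # Store one friendship row on each player's shard, indexed by the local
    # player's ID, so listing friends never needs to cross shards.
    def create_friendship!(player_id, friend_player_id)
      [[player_id, friend_player_id], [friend_player_id, player_id]].each do |owner, friend|
        ShardRecord.connected_to(shard: shard_for(owner), role: :writing) do
          Friendship.create!(player_id: owner, friend_player_id: friend)
        end
      end
    end

    # Listing only touches the reader of the requesting player's own shard.
    def list_friends(player_id)
      ShardRecord.connected_to(shard: shard_for(player_id), role: :reading) do
        Friendship.where(player_id: player_id).to_a
      end
    end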
00:29:45.460
In this scenario, players A, B, and C join a guild to combat a raid boss. They share a limited number of consumable weapons, and both atomicity and performance are critical when updating player health states and remaining weapon counts. The solution here is partial reshuffling.
00:30:36.580
From a database perspective, the players belong to different shards, yet they are part of the same guild, which requires both atomicity and performance. The first step is selecting a shard for the guild using weighted sampling. With the guild's shard decided, we relocate the resources critical to the raid boss battle into that shard.
00:31:26.020
This arrangement ensures both atomicity and performance, since only one shard is involved per transaction, while non-critical resources remain attached to their original shards. This methodology allows guild-based transactions and APIs to be handled efficiently with reliable ACID transactions.
00:32:14.700
We have effectively enhanced data locality while achieving scalability. This data locality approach benefits not only conventional databases but also some distributed databases, considering their underlying architecture.
00:32:54.580
However, significant consistency issues due to sharding still need to be addressed. Keeping consistency over multiple databases has presented considerable challenges, particularly in the mid-2010s when our game server began operating.
00:33:50.960
Some claimed that two-phase commit, such as MySQL's XA transactions, could ensure atomicity across multiple databases.
00:34:32.420
This slide provides an overview of the two-phase commit for comparative study. There are two participant types: transaction managers (here the API server) and resource managers (the MySQL databases). The procedure initiates with the transaction manager requesting resource managers to update records. Upon completion, resource managers inform the transaction manager whether they’re ready to commit.
00:35:27.180
If all resource managers are ready to commit, the transaction manager instructs them to proceed; if not, all are triggered to roll back. At least that is straightforward for nominal cases.
00:36:19.360
However, prior to MySQL 5.7, XA transactions were incompatible with replication. Moreover, two-phase commits such as XA transactions introduce scalability issues. To prioritize scalability, we must relax consistency while still maintaining practically acceptable levels.
00:37:23.560
Next, we will conduct a case study focusing on consistency. Consider a scenario where player A consumes an item within their knapsack and player B receives the consumed amount as a gift. While this may seem simple, the implementation of relaxed consistency within Rails 6 is, in fact, quite intricate.
00:38:19.640
Rule one is to reduce the risk of cross-shard transaction anomalies. Before proposing this rule, we have to verify the preconditions for relaxed cross-shard transactions. First, all resource managers should share the same underlying architecture, such as MySQL, and failures during the short window of consecutive commits should be rare.
00:39:04.560
Let's analyze the figures on your right side; two MySQL databases are displayed. The top part illustrates a conventional two-phase commit, while the lower part demonstrates the proposed relaxed transactions. In the latter case, the transaction manager directly instructs the resource managers to update their records; successful updates indicate that the commits will likely succeed, so the transaction manager proceeds straight to the commit commands.
00:39:57.920
Next, we'll explore the Rails code that facilitates relaxed transactions. First, two nested transaction blocks commit consecutively at the end of their scopes. At the beginning of the outer transaction block, we use Player.lock.find(player_id) to lock the resource and prevent race conditions, guaranteeing a higher success rate.
00:40:59.760
The main business logic is omitted in this example. Toward the end of the blocks, the save calls are placed consecutively so that the subsequent commit requests are issued back to back. This code structure is integral to the relaxed transactions, as it minimizes the window of uncertainty between the transaction manager's and the resource managers' operations.
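A minimal sketch of the shape described above, using hypothetical per-shard base classes (SenderShard, ReceiverShard) in place of our actual shard-switching gems:

    # Two nested transaction blocks; the commits happen back to back at the
    # ends of the blocks, keeping the "only one shard committed" window small.
    SenderShard::Player.transaction do
      sender = SenderShard::Player.lock.find(sender_id)        # lock first to reduce races
      consumption = sender.item_consumptions.build(amount: amount)

      ReceiverShard::Gift.transaction do
        gift = ReceiverShard::Gift.new(receiver_id: receiver_id, amount: amount)

        # Keep the save! calls adjacent so that almost nothing can fail between
        # the inner COMMIT (receiver's shard) and the outer COMMIT (sender's shard).
        gift.save!
        consumption.save!
      end   # COMMIT on the receiver's shard
    end     # COMMIT on the sender's shard immediately afterwards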
00:41:51.800
However, unlikely anomalies occasionally arise, including failures of the Rails process midway through the consecutive commits, network failures between the consecutive commits, or failures of shard nodes before the final commit. Here, we aimed for a success rate comparable to XA in nominal cases; the omission of the prepare phase is the main difference from XA commits.
00:42:37.680
However, recovery is harder because we lack the persistent prepare logs that resource managers keep for XA. Still, we attain a reasonable success rate for nominal cases.
00:43:11.720
Rule two is to ensure that inconsistencies are observable and acceptable. Before establishing this rule, we should understand the three transaction states: none of the shards has committed (the failure state), some but not all shards have committed (the anomaly state), and all shards have committed (the success state). Given an anomalous state, a conventional two-phase commit would drive all participating shards to commit.
00:43:53.760
To achieve this, we first serialize the commits, controlling their order to reduce the chance of reaching an unacceptable anomalous state. Secondly, inconsistencies should be observable to the transaction manager; from this viewpoint, the sender's side becomes the trigger for retried recovery.
00:44:27.920
This requires tolerating differences in the perceived outcome while ensuring that the next action does not interfere with subsequent recovery. This practice keeps the approach non-blocking.
00:45:43.800
Finally, rule three emphasizes idempotency and eventual consistency. The proposed solution relies on retries, the simplest yet most reliable method for recovering from failures, including network failures outside the server systems. To make retries safe, we must employ an idempotency key that identifies the transaction involved; it may be referred to as a transaction ID, idempotency key, or request ID.
00:46:30.780
We structure the business logic by recording the transaction history in tables, which makes the intended outcome easy to identify on retry. In the Rails code, all resources are persisted together with the associated transaction history, and the scenario shows how anomalies are handled by adjusting the existing gifts.
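A minimal sketch of rule three, assuming a hypothetical gift_transaction_histories table keyed by the idempotency key:

    # The idempotency key (transaction_id) makes retries safe: the history row
    # records the intended outcome, and a retry only creates what is missing.
    def send_gift!(transaction_id, sender_id, receiver_id, amount)
      history = GiftTransactionHistory.find_or_create_by!(transaction_id: transaction_id) do |h|
        h.sender_id   = sender_id
        h.receiver_id = receiver_id
        h.amount      = amount
      end

      Gift.find_or_create_by!(transaction_id: history.transaction_id) do |gift|
        gift.receiver_id = history.receiver_id
        gift.amount      = history.amount
      end
    end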
00:47:24.750
This demonstrates how introducing idempotency and eventual consistency enables us to manage database anomalies and failures in external servers. As a result, we not only secure eventual consistency but also enhance scalability. I believe this approach is practically sound. If stricter guarantees are required, rule four advises using an at-least-once message queue to guarantee completion.
00:48:15.240
The prerequisites for this rule are that the business logic should already comply with rules one, two, and three. This architecture is divided into two phases: enqueue and dequeue. In the enqueue phase, data is recorded into a single shard database where ACID is guaranteed. A message is committed along with data within the transaction.
00:49:15.180
This ensures at-least-once message delivery, and messages are processed only after the transaction completes: we rely on the Rails and database transaction timeouts, and enqueue jobs with perform_later plus a delay longer than those timeouts.
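A hedged sketch of the enqueue phase; the names (GiftMessage, DeliverGiftJob) and the five-minute delay are illustrative, and the delay is assumed to exceed the Rails and MySQL transaction timeouts:

    # The message row commits atomically with the data on one shard; the job is
    # enqueued with a delay so the worker only ever sees committed state.
    ApplicationRecord.transaction do
      gift = Gift.create!(receiver_id: receiver_id, amount: amount)
      GiftMessage.create!(gift_id: gift.id)                      # at-least-once record
      DeliverGiftJob.set(wait: 5.minutes).perform_later(gift.id)
    end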
00:49:53.400
Meanwhile, during the dequeue phase, we retrieve messages from the queue and check whether the corresponding data exists in the database. If it does not exist, the enqueueing transaction must have been rolled back, so the message is discarded; if the data exists, we drive the transaction to completion.
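And a matching sketch of the dequeue phase, with deliver! and delivered? as hypothetical idempotent helpers:

    class DeliverGiftJob < ApplicationJob
      def perform(gift_id)
        gift = Gift.find_by(id: gift_id)
        return if gift.nil?                    # enqueue transaction rolled back: drop the message

        gift.deliver! unless gift.delivered?   # idempotent completion of the remaining work
      end
    end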
00:50:41.060
This methodology guarantees at-least-once job execution, though caution must also be exercised to mitigate the risk of Ruby process crashes. With this configuration, we achieve a theoretically guaranteed completion of cross-shard transactions.
00:51:33.650
In summary, we have achieved practically acceptable consistency without sacrificing scalability. These concerns are inherent to sharded databases, and ideally they would be resolved at the database layer.
00:52:26.740
Yet, there are additional aspects to consider regarding write scalability, particularly the pitfalls concerning multiple database configurations.
00:53:10.140
The primary goal of sharding is to achieve horizontal write scalability for databases. Specifically, given n as the number of shards, we aim to keep time and space complexity near O(1) or O(log n). Although this sounds trivial, it is achievable through sharding.
00:54:02.460
Nonetheless, we encounter various pitfalls, such as the number of connections per database. The connection pool within Rails caches and reuses established database connections for performance, presenting limitations since the cached pool cannot be shared across processes.
00:54:57.320
Additionally, the GVL (Global VM Lock) in CRuby permits only one thread to execute Ruby code at a time per process. The combined result of these two limitations is a number of processes on the order of the CPU cores or more, and each API server connects to a multitude of databases.
00:55:44.460
For example, with 1,000 servers, 32 processes per server (one per CPU core), and a connection pool of 5 per process, that multiplies out to 1,000 × 32 × 5 = 160,000 potential connections to a single database, far beyond the maximum connection limits of the databases.
00:56:34.860
This scenario leads to deteriorating database performance stemming from high connection volumes. Under typical conditions with cloud computing providers, connection limits lie in the hundreds or low thousands, necessitating strategic targeting of connection volumes.
00:57:34.040
The strategy for reducing connection volume starts from the observation that, on average, only about one connection per database is actively used at a time, so the rest can be released. The solutions are to multiplex connections through proxies or to periodically flush stale database connections.
00:58:24.560
An example of the first solution is PgBouncer for PostgreSQL, which reduces connection overhead; however, proxies can disable certain database features. An example of the second solution is clearing out idle connections, for instance after each request.
00:59:14.560
Active Record has functions for flushing idle connections that serve this purpose. Reconnecting adds some overhead, but that is not a significant issue for MySQL, so this resolves the connection volume concern effectively.
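A hedged sketch of that idea as a Rack middleware; whether to flush per request or on a timer is a tuning decision, and Rails' own executor already checks connections back into the pool, so this is only illustrative:

    # Return this request's connections to their pools, then close any connection
    # that has been idle longer than the pool's idle_timeout, across all databases.
    class FlushDbConnections
      def initialize(app)
        @app = app
      end

      def call(env)
        @app.call(env)
      ensure
        ActiveRecord::Base.clear_active_connections!
        ActiveRecord::Base.flush_idle_connections!
      end
    end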
01:00:02.540
The next pitfall concerns the connection reaper thread, which periodically flushes dead and idle database connections. It was enabled by default in Rails 5,
01:00:53.780
but a reaper thread was spawned for each connection pool, so with a large number of databases the number of threads surged. As a result, there were numerous reaper threads per process and performance degradation ensued.
01:01:47.920
To summarize the pitfalls, we learned that even well-constructed architecture can be challenged by substantial numbers of databases, with the potential for more risks emerging.
01:02:35.380
To avoid pitfalls, thorough testing with multiple databases is essential. If it isn’t tested, it should be assumed to be broken.
01:03:30.440
Recommendations for testing include using real remote databases in the test environment, employing the actual multi-database configuration with multiple shards, avoiding mocks, and using a proper toolset.
01:04:30.840
We also recommend using real transactions during tests, simulating partial transactional failures to ensure robust error handling algorithms.
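A hedged RSpec sketch of such a test against real shard databases; the helpers (create_player_on, send_gift!) echo the earlier hypothetical sketches, and stubbing one save! call is only a quick way to force a mid-transaction failure, not a mock of the databases themselves:

    RSpec.describe "cross-shard gift sending" do
      # Transactional test wrapping is disabled for this group (configuration
      # omitted) so that real commits hit the real shard databases.
      let(:sender)   { create_player_on(:shard_01) }   # hypothetical helper
      let(:receiver) { create_player_on(:shard_02) }
      let(:tx_id)    { SecureRandom.uuid }

      it "recovers from a failure between the two commits" do
        # Force the sender-side write to fail after the receiver-side commit.
        allow_any_instance_of(ItemConsumption).to receive(:save!)
          .and_raise(ActiveRecord::StatementInvalid, "simulated failure")

        expect { send_gift!(tx_id, sender.id, receiver.id, 3) }
          .to raise_error(ActiveRecord::StatementInvalid)

        # Retrying with the same idempotency key must converge to the success state.
        allow_any_instance_of(ItemConsumption).to receive(:save!).and_call_original
        send_gift!(tx_id, sender.id, receiver.id, 3)

        expect(Gift.where(transaction_id: tx_id).count).to eq(1)
        expect(ItemConsumption.where(transaction_id: tx_id).count).to eq(1)
      end
    end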
01:05:31.140
Additionally, anomalous situations should be injected in sandbox environments to assess system performance under failure conditions.
01:06:32.180
Lastly, conducting load tests before service launches is crucial to understand system performance under variable loads.
01:07:27.190
Finally, let's look back at the past, present, and future of multiple databases in our game servers.
01:08:30.440
When we started the project, the Rails of that time had no built-in support for multiple databases. As a result, we extended Rails with various gems and multiple monkey patches. We prioritized performance even at the cost of complexity, and many features were implemented successfully for our games.
01:09:39.920
Currently, we work with two different Rails lineages: Rails 5.2 extended with gems and monkey patches, and Rails 6.1, which provides native support for multiple databases.
01:10:21.400
Common issues are being resolved in upstream Rails, and new features continue to be implemented, although discrepancies between the two codebases remain. Some of our complex in-house solutions can be retired as the upstream features mature.
01:11:16.480
Moving forward, we plan to simplify our multiple database strategy, aligning it with upstream developments. This will allow us to continuously evolve while contributing our technical expertise back to Rails, converting technical debt into assets.
01:12:23.580
Now, let’s summarize today’s session. We have explored various techniques to scale Rails API to handle write-heavy traffic. First, we utilized all aspects of multiple databases, including vertical splitting, horizontal sharding, replication, and load balancing.
01:12:55.200
Next, we allowed writer databases to focus on transactions, while replication efforts handled read traffic more efficiently through resource-focused API response designs that minimized unnecessary access to the writers.
01:13:48.740
Afterward, we optimized data locality, reducing the performance overhead between shards. We assessed different architectural models, such as pull, push, double writes, and partial reshuffling, to achieve optimal performance.
01:14:50.560
Finally, we discussed practical considerations for achieving acceptable consistency, navigating risks, and pitfalls that still exist in the ecosystem. Testing with multiple databases is essential to validate our implementations and ensure smooth functioning.
01:15:17.440
Contributing upstream also helps transform technical debt into assets that benefit the broader community.
01:15:55.920
Thank you for listening. DeNA loves Rails!