Software Architecture

Multi-region Data Governance in Rails Application

by Miron Marczuk

In this presentation titled "Multi-region Data Governance in Rails Application" at the wroc_love.rb 2023 conference, Miron Marczuk discusses the intricacies of moving from a single-region multi-tenant application to a multi-region multi-tenant application using Ruby on Rails, especially in the context of a SaaS platform serving the film and event industry.

Key Points Discussed:
- Introduction and Context: Miron expresses his excitement about presenting and sets the stage by highlighting that the discussion centers on scaling applications from a local focus to a global presence, necessitating a change in data governance.
- Growth and Challenges: As a hypothetical entrepreneur's platform grows from the UK to multiple regions, clients demand that their data be stored in their respective regions. This requirement is a direct result of the platform's global expansion and forces a fundamental change in its architecture.
- Team Experience: Miron relates to his team's experiences at Apply4, a SaaS provider, emphasizing that their clients across various countries faced similar geographic data storage requirements, leading to the need for a multi-region shift.
- Architecture Options: He outlines different architecture possibilities: keeping a single application with a partitioned data layer, entirely separating applications by region, or adopting a hybrid approach. Miron’s team chose to completely separate the applications based on analysis of tenant movement and shared data.
- Implementation Steps: Miron details the stepwise approach his team employed, beginning with user redirection to region-specific URLs while leaving the data persistence layer untouched. They then separated the data and finally moved the applications to their respective regions, emphasizing the importance of a seamless user experience throughout this transition.
- Data Management: The separation of data includes creating distinctions in database records, file storage, and necessary adjustments in the Rails application to accommodate the new architecture. Miron introduces a 'bucket' approach for appropriately categorizing and migrating data based on regions.
- Go-live Preparations: He highlights the critical aspects of preparing for the go-live event, underscoring the importance of testing, having clear contingency plans, and ensuring all systems function correctly before the transition.
- Questions and Conclusion: In closing, Miron acknowledges the significance of teamwork in such transitions, inviting feedback from the audience as this was his first presentation. He ends with a reminder of the importance of careful planning and celebrating milestones throughout the process.

Main Takeaways:
- Transitioning to multi-region data governance is a complex but manageable challenge that requires careful planning and execution.
- Proper user experience during transitions is crucial, and effective data management strategies help maintain integrity.
- Building strong procedures and contingency plans while celebrating team efforts leads to successful architecture changes.

00:00:02.360 Thank you very much for having me. I'm really excited to be here.
00:00:07.859 Actually, I'm from Poland, but this is my first time here in Wrocław.
00:00:13.740 It's a beautiful city, and as it was mentioned, it's also my first public tech conference talk.
00:00:20.820 There's a lot to be excited about, and I'm a bit stressed, but I hope to give you a good example.
00:00:27.900 I hope that by next year, you will find yourselves in my position.
00:00:34.140 Today, we'll be talking about multi-region data governance in Rails.
00:00:39.540 That's the title of my talk that I submitted. I admit that I liked it, not with bad intentions, but I think the topic is quite broad.
00:00:48.120 We'll get into what multi-region actually means in a moment.
00:00:53.879 However, because this is such a broad topic, I've decided to focus on a very specific aspect of multi-region data governance.
00:01:04.860 I will discuss moving from having a single-region multi-tenant application to a multi-region multi-tenant application.
00:01:10.979 I know there was a good talk about multi-tenancy last year; this year, we're going a step further.
00:01:19.820 Let's get started, and today's story will be about success.
00:01:25.200 It's a story that all the people who build applications dream about.
00:01:30.600 You, as entrepreneurs or developers, create applications to earn money and create business value for your clients.
00:01:37.740 The reason why we're even discussing multi-region governance is due to a success story—an idea that picks up and creates the problems we will solve later.
00:01:51.840 Let's imagine an entrepreneur who has an idea. For this example, let's assume that this person thought about a SaaS application to host merchants and allow them to build shops so their clients can buy products through their system.
00:02:03.420 The entrepreneur recognizes the market, sees the opportunity, and that idea grows in their mind.
00:02:09.360 To go from idea to application, they must grow that seed and build the thing they imagined.
00:02:18.659 What's the best way to do it in a web framework environment? Using Ruby on Rails.
00:02:24.900 OK, so we have our neighbor who sells some goods over the internet. Let's get them on the platform.
00:02:32.580 The neighbor agrees, creates a shop, and people start to visit that shop, buying things through our application.
00:02:43.680 Users start to use the shop on our platform, and word spreads; more and more clients join.
00:02:50.640 I think the platform is picking up. We have clients. Then, the entrepreneur starts to see more potential for growth.
00:03:00.300 Let’s move into other markets and grow globally. Initially, the platform was only based in the United Kingdom, but now there's a market opportunity in the USA.
00:03:09.840 They start bringing clients in from another country in another region. The shops begin to flourish, and there are clients there as well.
00:03:20.280 But here’s the issue: this development is organic. The entrepreneur is still the one who had the idea and built the application, and it was built in the UK.
00:03:30.300 However, all the data is still where it was initially stored, and now that it's grown globally, a problem arises.
00:03:39.600 There are now big clients saying, 'Okay, I want to join your platform. However, I have a requirement: you need to store my data in the region where I'm from.'
00:03:48.120 This creates a situation in which the initial setup of having data stored in the UK is no longer valid.
00:03:56.760 Clients want their data where they reside. This situation stems from your success because you have grown globally.
00:04:03.420 Now the question is: how do you move from having that single-region multi-tenant application to a multi-region multi-tenant application?
00:04:08.819 This is the challenge that my team at Apply4 faced. We do not build shops; rather, we have a permitting SaaS application for the film and event industry.
00:04:20.519 My name is Miron, and I work at Apply4 through Secret Source. The reason I mentioned two companies is that sometimes it's difficult to find a good job.
00:04:27.780 I was privileged to find two really good jobs—one with a fantastic product in a growing market and the other with an office in a beautiful place.
00:04:35.640 In Gran Canaria, so it's not bad! But don’t worry, I'm not leaving the beautiful dark November of Poland.
00:04:40.800 Now, back to business: we at Apply4 build a permitting SaaS platform for the film and event industries.
00:04:48.780 Our clients are authorities in the UK, Canada, the United States, and New Zealand.
00:04:55.140 These authorities enable people to hold their events or film shoots in their cities, for example, blocking streets for safe events.
00:05:03.180 Our end users are film producers or event organizers. The example of shops I started with matches our SaaS application; however, instead of shops, we work with authorities.
00:05:10.620 The problem I mentioned is exactly what we faced because our clients decided that they could no longer accept the existing architecture.
00:05:18.180 Their data needed to be stored in the United States, prompting us into the multi-region journey.
00:05:25.140 The goal of this change is to conceptually think about moving data from one region to another.
00:05:31.680 This means that the data, which was stored previously in one region in the UK, suddenly needs to end up in a different infrastructure on the other side of the ocean.
00:05:38.700 Let's break down the details. We have the UK, where we initially stored the data of our clients and where our initial clients are from.
00:05:44.340 Then we introduce the new infrastructure in the US and separate the data, placing it there.
00:05:51.599 This setup is multi-tenant, as multiple clients operate within one application.
00:05:57.780 Clients have their own users, and then we add the additional layer of multi-region. Instead of having one multi-tenant application, we have two.
00:06:06.300 That's why I refer to this as a multi-region multi-tenant architecture.
00:06:12.180 Reflecting on the multi-tenant talk yesterday, I believe we may reach even greater potential in the future.
00:06:18.900 Now, what exactly is data? I mentioned that we need to move data from one place to another.
00:06:22.380 In our case, we're talking about database records and files uploaded by the users.
00:06:29.820 To provide some context, I'll briefly outline the stack we use so you have a better picture of the infrastructure.
00:06:36.420 We have our Ruby on Rails application in a container deployed to AWS EC2.
00:06:43.560 We're using AWS, but it doesn't really matter what you choose.
00:06:48.540 Our data persistence layer consists of a database, a file system, and Redis for data storage.
00:06:56.020 All these elements are part of our infrastructure, along with DNS and load balancers that connect our application to the outside world.
00:07:02.460 Now, when you think about a multi-region multi-tenant setup, what options do you have?
00:07:09.180 What kind of architectures can you implement to satisfy user requirements?
00:07:14.820 One option is to maintain a single application while separating the data persistence layer.
00:07:20.760 You end up with a situation where the data is clearly separated, but the application takes on the challenge of accessing the correct data.
00:07:28.500 This creates complexity, as the application needs to account for which region's data it should access.
00:07:36.300 Do you really want to implement another layer of complexity into your system?
00:07:43.140 The alternative option is to completely separate the applications. This means transitioning from one application and one data persistence layer to two distinct ones.
00:07:53.220 Then there’s a hybrid option where you separate the applications and data, but have shared components between the two regions.
00:08:01.380 Now, I will tell you why we chose one specific option. When facing this challenge, it's essential to consider a few crucial questions.
00:08:10.140 Firstly, can tenants move across regions?
00:08:14.580 Secondly, how much data is shared between the regions?
00:08:20.220 Lastly, can users operate in different tenants across regions?
00:08:25.680 The answers we found were that tenants don’t move across regions.
00:08:31.680 It's quite difficult to change regions, as it's separated by oceans.
00:08:39.300 Secondly, not much data is shared, though some does need to be synchronized between the two.
00:08:43.800 Lastly, users can operate in different regions, but it's not the norm to apply for an event across an ocean.
00:08:50.640 Given these considerations, we decided to simply separate the applications entirely.
00:09:00.840 We transitioned from a single application that hosted data in both regions to two distinct ones.
00:09:06.600 Now, let's think about the tasks we need to accomplish to achieve this.
00:09:14.160 We need to separate our infrastructure, separate our data, and adjust our Rails application to support two different regions.
00:09:20.760 We also need to guide users to the correct region, as they were previously using one application. Now they will have two.
00:09:27.240 This may seem daunting, and when I thought about this challenge, I envisioned a leap of faith.
00:09:35.100 If you just deploy everything at once and jump in, it feels risky, as the safety net for mistakes is minimal.
00:09:45.300 Instead, I suggest taking it step by step, gradually implementing changes until you're ready to go live.
00:09:52.860 I have stages we went through to separate the data.
00:09:58.500 Firstly, we redirected users to the correct region without altering the data.
00:10:07.680 We created a situation where we already had two distinct applications, each with different URLs.
00:10:15.180 Importantly, we didn't touch the data persistence layer at all in this initial stage.
00:10:22.380 Once this was established, the second step was to separate the data and finally move applications to their respective regions.
00:10:30.240 After the first step, we still have applications located in the original UK region.
00:10:37.440 Now, how do you redirect users to the correct region?
00:10:42.780 Conceptually, we're splitting our application. Previously we had one application, but now we will have two.
00:10:50.040 That's a big change for users. We want to ensure it's seamless.
00:10:57.900 For the infrastructure, we keep the data persistence layer as it is while creating additional containers and load balancers.
00:11:06.180 It’s important that we make the user experience as seamless as possible.
00:11:12.960 We want to seamlessly redirect users from app.apply4.com to region-specific URLs.
00:11:20.760 When a request is sent to the region-agnostic URL, we want to perform a redirect to the region-specific URL.
00:11:30.100 For this, we can utilize DNS and load balancers. We have a DNS feature that employs geolocation.
00:11:38.300 DNS resolves the hostname based on the origin IP of the request, directing users based on their geographical location.
00:11:45.900 Second, a load balancer can redirect based on path patterns, enabling us to make decisions based on the requests.
00:11:54.540 In our case, URLs include country names, which helps thousands of users reach the right region.
00:12:02.820 Using DNS, if the origin of the request is from North America, it directs to load balancer 2; otherwise, it goes to load balancer 1.
00:12:10.320 While requests are not evaluated individually, this method points users to the appropriate load balancer.
00:12:19.260 Once in the load balancer, users will then reach the URL corresponding to that region.
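The geolocation decision described above can be sketched in a few lines of Ruby. This is only an illustration: the continent symbols, the load balancer names, and the `load_balancer_for` helper are hypothetical, and in a real setup this logic lives in the DNS provider's geolocation routing rather than in application code.

```ruby
# Hypothetical model of a geolocation DNS policy: the resolver looks at where
# the request originates and answers with one of two load balancers.
LOAD_BALANCERS = {
  north_america: "load-balancer-2", # US infrastructure (name is an assumption)
  default:       "load-balancer-1"  # UK infrastructure (name is an assumption)
}.freeze

# origin_continent is a symbol such as :north_america or :europe.
# Anything that is not North America falls through to the default entry.
def load_balancer_for(origin_continent)
  LOAD_BALANCERS.fetch(origin_continent, LOAD_BALANCERS[:default])
end

load_balancer_for(:north_america) # => "load-balancer-2"
load_balancer_for(:europe)        # => "load-balancer-1"
```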
00:12:27.420 There will be cases when a user has the wrong country in the URL. In those cases, we need to have the load balancer override the default route.
00:12:34.860 We aimed for seamless transitions, and we found that browsers cache HTTP 301 responses very aggressively, which caused problems.
00:12:41.880 Therefore, we decided to use a 302 redirect; it's not semantically perfect, but it worked.
00:12:50.100 After this, nothing complicated happens when we reach app1; the request simply resolves through a standard DNS record.
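A minimal Ruby sketch of the path-based override, assuming the country appears as the first path segment. The hostnames, the country-to-region mapping, and the `region_redirect` helper are all hypothetical; the talk only names app.apply4.com, and the real rule is configured on the load balancer rather than written in Ruby.

```ruby
# Hypothetical region-specific hosts (placeholders, not the real URLs).
REGION_HOSTS = {
  uk: "uk.app.example.test",
  us: "us.app.example.test"
}.freeze

# Hypothetical mapping from the country segment in the URL to a region.
COUNTRY_REGIONS = {
  "united-kingdom" => :uk,
  "united-states"  => :us
}.freeze

# Returns [status, location] when the path belongs to another region,
# or nil when the request is already on the right load balancer.
def region_redirect(current_region, path)
  country = path.split("/").reject(&:empty?).first
  target  = COUNTRY_REGIONS[country]
  return nil if target.nil? || target == current_region

  # 302 rather than 301, because browsers cache permanent redirects aggressively.
  [302, "https://#{REGION_HOSTS.fetch(target)}#{path}"]
end

region_redirect(:uk, "/united-states/permits/1")
# => [302, "https://us.app.example.test/united-states/permits/1"]
region_redirect(:us, "/united-states/permits/1") # => nil
```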
00:12:58.380 We navigate to app1 through load balancer 1 and similarly to app2 through load balancer 2.
00:13:05.880 A critical point is to implement alternate links when dealing with two applications so they can reference each other.
00:13:12.180 This means both apps know of each other’s existence. The first stage goes well, and users are redirected properly.
00:13:20.760 We’re halfway through this transition, and users now face no disruptions.
00:13:27.960 Now, let’s address separating the data, which is the challenging part.
00:13:34.680 We will move from a shared data persistence layer to completely separate applications.
00:13:43.620 We are separating all infrastructures and creating a distinct copy of databases, file system storage, and Redis.
00:13:50.760 Each region becomes entirely independent; neither will be aware of the other.
00:13:57.540 Let's tackle the task of splitting data. This visualization shows a database with tables containing records.
00:14:04.680 Each of these records belongs to particular regions, and we need to sort them accordingly.
00:14:16.020 Firstly, categorize your tables. Some tables must be shared, while others contain region-specific data.
00:14:25.740 Determine which tables need to be shared and identify how often data in them changes.
00:14:32.460 Next, when you pick tables that have data records from both regions, you'll want to separate them.
00:14:38.460 This separation process is referred to as bucketing.
00:14:44.760 Here’s a fun name that stuck from our development process. Think of it as assigning records to buckets.
00:14:52.920 You assign each record to the correct bucket until their final destination is decided.
00:14:59.520 To implement this, we added a 'bucket' column to our tables.
00:15:06.780 It’s a simple integer column representing which bucket a record should be placed into.
00:15:12.900 Below is a snippet of Ruby code that defines an enum for bucket values, including unassigned records.
00:15:22.020 It’s included in our base class, and while not typical, it suits our use case.
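The slide itself is not part of this transcript, so the following is only a sketch of what such a definition could look like. In a Rails model this would typically be an ActiveRecord enum on the integer `bucket` column; the plain-Ruby module below mimics the idea without depending on Rails. The bucket names and values, the `Bucketable` module, and the `Permit` class are assumptions made for illustration.

```ruby
# In Rails this would usually be declared on the base model, for example:
#   enum :bucket, { unassigned: 0, uk: 1, us: 2 }
# The module below reproduces the same idea in plain Ruby.
module Bucketable
  BUCKETS = { unassigned: 0, uk: 1, us: 2 }.freeze # values are assumptions

  # Symbolic name of the stored integer, e.g. 2 -> :us.
  def bucket
    BUCKETS.key(bucket_value)
  end

  # Stores the integer for a symbolic name; raises KeyError for unknown names.
  def bucket=(name)
    self.bucket_value = BUCKETS.fetch(name)
  end

  def unassigned?
    bucket == :unassigned
  end
end

# Stand-in for a database row; bucket_value plays the role of the column.
class Permit
  include Bucketable
  attr_accessor :bucket_value

  def initialize
    @bucket_value = Bucketable::BUCKETS[:unassigned]
  end
end

permit = Permit.new
permit.unassigned?  # => true
permit.bucket = :us
permit.bucket_value # => 2
```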
00:15:28.740 We’re now half an hour into the presentation, and this is the first time I’m showing Ruby code.
00:15:35.520 So, here’s how we apply that column in our tables and assign buckets to those records.
00:15:47.520 We iterate through each authority and its associated data records, updating their bucket values.
00:15:54.840 Essentially, we categorize records according to the authority they relate to.
00:16:02.160 The top-down approach to bucket assignment has proven to be much simpler and more efficient.
00:16:09.030 However, consider that it doesn’t always yield perfectly clean solutions.
00:16:16.680 As we worked through this, we built a hierarchy of authorities and their corresponding values.
00:16:23.520 From this structure, each authority leads to associated records.
00:16:28.740 As you navigate up from records through to the authority, you see connections that dictate their region.
00:16:36.060 At each level, you can determine how to assign records correctly.
00:16:42.120 Now, let’s detail the steps we took to assign buckets.
00:16:49.710 You’ll want to iterate through each authority, accessing associations and updating records with their corresponding bucket values.
00:16:56.160 This simple yet effective process yields clearly defined buckets for every record.
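The top-down assignment can be sketched as follows. The `Authority`, `PermitApplication`, and `Document` structs are hypothetical stand-ins for ActiveRecord models and their associations, and the bucket values are assumptions; a real Rails implementation would more likely issue bulk updates per association instead of touching records one by one.

```ruby
# Hypothetical stand-ins for models and their associations.
Authority         = Struct.new(:name, :region, :applications)
PermitApplication = Struct.new(:title, :documents, :bucket)
Document          = Struct.new(:filename, :bucket)

REGION_BUCKETS = { uk: 1, us: 2 }.freeze # 0 is reserved for "unassigned"

# Top-down assignment: start at each authority and walk down its associations,
# stamping every record with the bucket of the authority's region.
def assign_buckets(authorities)
  authorities.each do |authority|
    bucket = REGION_BUCKETS.fetch(authority.region)
    authority.applications.each do |application|
      application.bucket = bucket
      application.documents.each { |document| document.bucket = bucket }
    end
  end
end

document    = Document.new("site-plan.pdf", 0)
application = PermitApplication.new("Street closure", [document], 0)
authority   = Authority.new("Example City", :us, [application])

assign_buckets([authority])
application.bucket # => 2
document.bucket    # => 2
```

Any record still left at 0 after this pass is one that no authority reaches, which is where the less clean cases mentioned above show up.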
00:17:06.100 Next, having assigned records to corresponding regions, we now separate our database.
00:17:11.820 Rather than deleting unnecessary data outright, we opted to create fresh tables to minimize risk.
00:17:18.420 You create a new, empty table and transfer only the necessary records, swapping names afterward.
00:17:24.600 However, be cautious about constraints and manage incrementing values properly.
00:17:31.800 Set a buffer to prevent overlapping IDs during this transition between regions.
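A rough sketch of the copy-and-swap steps and the ID buffer, expressed as Ruby helpers that build SQL strings. The talk does not name the database, so PostgreSQL syntax is an assumption here, as are the table, sequence, and helper names and the way the buffer is calculated.

```ruby
# Builds the SQL for a copy-and-swap of one table, assuming PostgreSQL.
# The table name is passed in; nothing here reflects the team's real schema.
def copy_and_swap_sql(table, bucket)
  fresh  = "#{table}_new"
  backup = "#{table}_old"
  [
    # Empty table with the same columns, defaults and indexes.
    # LIKE does not copy foreign keys; those need separate handling.
    "CREATE TABLE #{fresh} (LIKE #{table} INCLUDING ALL)",
    # Copy only the rows that belong to this region's bucket.
    "INSERT INTO #{fresh} SELECT * FROM #{table} WHERE bucket = #{Integer(bucket)}",
    # Swap names so the application keeps using the original table name.
    "ALTER TABLE #{table} RENAME TO #{backup}",
    "ALTER TABLE #{fresh} RENAME TO #{table}"
  ]
end

# One possible reading of the "buffer": the first ID a region hands out after
# the split is the current maximum plus a safety margin.
def next_id_with_buffer(max_id, buffer)
  max_id + buffer + 1
end

def restart_sequence_sql(sequence, max_id, buffer)
  "ALTER SEQUENCE #{sequence} RESTART WITH #{next_id_with_buffer(max_id, buffer)}"
end

copy_and_swap_sql("permits", 2).each { |statement| puts statement }
puts restart_sequence_sql("permits_id_seq", 1000, 5000)
# last line printed: ALTER SEQUENCE permits_id_seq RESTART WITH 6001
```

Keeping the original table around under a backup name is what makes this approach lower risk than deleting rows in place: nothing is destroyed until the new table has been verified.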
00:17:41.040 Now that our databases are separated, let's look at files.
00:17:49.500 We need to move files to their respective directories based on the associations that determine their regions.
00:17:56.700 Change the storage paths to specify regions, allowing for clearer organization.
00:18:03.900 We can recreate the previous database separation approach for file management.
00:18:09.900 For testing purposes, consider creating empty copies of files, especially when dealing with large sets.
00:18:16.560 This method facilitates seamless copying and testing of files.
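One way to create such empty copies is to mirror the directory tree and touch a zero-byte file for each real one. This is a sketch using only the Ruby standard library; the `create_empty_copies` helper and the directory names in the usage example are hypothetical.

```ruby
require "fileutils"
require "tmpdir"

# Mirrors the directory tree under source_root into target_root, creating a
# zero-byte file for every real file. This lets a file migration be rehearsed
# without copying the actual content.
def create_empty_copies(source_root, target_root)
  Dir.glob("**/*", base: source_root).each do |relative|
    source = File.join(source_root, relative)
    target = File.join(target_root, relative)
    if File.directory?(source)
      FileUtils.mkdir_p(target)
    else
      FileUtils.mkdir_p(File.dirname(target))
      FileUtils.touch(target)
    end
  end
end

# Usage inside a throwaway directory:
Dir.mktmpdir do |dir|
  source = File.join(dir, "uploads")
  FileUtils.mkdir_p(File.join(source, "permits", "1"))
  File.write(File.join(source, "permits", "1", "plan.pdf"), "large file contents")

  create_empty_copies(source, File.join(dir, "uploads_test"))
  File.size(File.join(dir, "uploads_test", "permits", "1", "plan.pdf")) # => 0
end
```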
00:18:22.920 You should also create new directories for files based on records and then migrate old files accordingly.
00:18:34.500 For both new and old files, check for the presence of new directories to ensure smooth access.
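A small sketch of region-aware path resolution with a fallback to the old location, so that files remain reachable while the migration is in progress. The directory layout and the `region_path` and `resolve_file` helpers are assumptions for illustration, not the layout used in the talk.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical layout: <root>/<region>/<record path> for migrated files,
# <root>/<record path> for files that have not been moved yet.
def region_path(root, region, relative_path)
  File.join(root, region.to_s, relative_path)
end

# Prefer the new region-specific location; fall back to the legacy one.
# Returns nil when the file exists in neither place.
def resolve_file(root, region, relative_path)
  new_path = region_path(root, region, relative_path)
  return new_path if File.exist?(new_path)

  legacy_path = File.join(root, relative_path)
  File.exist?(legacy_path) ? legacy_path : nil
end

# Usage inside a throwaway directory:
Dir.mktmpdir do |root|
  legacy = File.join(root, "permits", "1", "plan.pdf")
  FileUtils.mkdir_p(File.dirname(legacy))
  File.write(legacy, "pdf")

  resolve_file(root, :us, "permits/1/plan.pdf") == legacy # => true (not migrated yet)
end
```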
00:18:43.500 Using AWS DataSync can help move large numbers of files effectively.
00:18:50.340 Once the data is separated, we are prepared to go live.
00:18:57.840 Before going live, keep your clients informed about the changes.
00:19:05.940 Schedule the go-live for a period of low user activity, such as early Saturday morning.
00:19:12.000 Define clear steps you will follow, including the commands to run and the checks to perform.
00:19:20.760 It's essential to assign responsibilities to your team during the go-live event.
00:19:26.640 Decide on a contingency plan in case something goes wrong.
00:19:32.040 This preparation will help mitigate risks during the transition.
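The procedure can also be written down in a form that is easy to rehearse. The sketch below models a runbook as ordered steps, each with an owner, a command, and a check, plus a contingency hook. The `Step` struct, the `run_go_live` helper, and the step names and owners are hypothetical.

```ruby
# Hypothetical runbook entry: a named step with an owner, a command to run
# and a check that must pass before moving on.
Step = Struct.new(:name, :owner, :run, :check)

# Runs steps in order. Stops at the first failed check, invokes the
# contingency plan and returns the name of the failed step (nil on success).
def run_go_live(steps, contingency:)
  steps.each do |step|
    step.run.call
    next if step.check.call

    contingency.call(step)
    return step.name
  end
  nil
end

log = []
steps = [
  Step.new("enable maintenance page", "backend", -> { log << :maintenance }, -> { true }),
  Step.new("switch DNS",              "ops",     -> { log << :dns },         -> { false })
]

run_go_live(steps, contingency: ->(step) { log << :rollback })
# => "switch DNS"
log # => [:maintenance, :dns, :rollback]
```

Running the same list against a staging environment several times is a cheap way to find missing steps before the real event.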
00:19:40.260 Finally, enjoy the process! You have made significant strides, so celebrate every milestone.
00:19:48.780 Rehearse your plans multiple times in staging environments.
00:19:55.800 The more familiar you become with the process, the more confident you will be on the go-live day.
00:20:02.220 Also, remember to use DNS records, which connect your application to external users.
00:20:10.020 After deploying the infrastructure, everything should function as it previously did.
00:20:16.680 Testing all connections will ensure a smoother go-live.
00:20:23.340 Once everything is verified and operational, you can take control of DNS and point users there.
00:20:29.940 The last piece of advice is to schedule some time off around the go-live event.
00:20:36.780 This enables a more relaxed atmosphere for you as you pull the trigger.
00:20:43.860 Trust me, it’s worth it to enjoy the moment and celebrate after such hard work.
00:20:51.780 In conclusion, great teamwork is critical, and I want to give kudos to my team.
00:20:58.200 They were instrumental in getting us to this stage.
00:21:03.840 Thank you for your time, and you can find me on X (formerly Twitter). I would appreciate any feedback as this is my first presentation.
00:21:12.660 Enjoy the rest of the conference, and it was a pleasure to be here.
00:21:22.859 Thank you for your presentation. I have a question for you.
00:21:29.880 If I understand you correctly, you promise your customers that their data will only be stored in specific regions.
00:21:35.220 But data is not only stored in databases, right? We use monitoring tools, logging, etc.
00:21:40.319 So, for example, if we have Datadog monitoring, we can store that data in different regions.
00:21:45.720 What about when you want to aggregate data from multiple regions for business reporting?
00:21:51.240 Wouldn’t that break your promise to customers, and how do you handle it?
00:21:57.480 Regarding logging and business reports, it’s important to clarify our approach.
00:22:04.440 For logging data, we also implement region-specific logging. This means any information stays within its respective region.
00:22:10.380 As for business analytics, there is a challenge with aggregating across regions. We acknowledge that this is a limitation.
00:22:18.960 I'd suggest talking further about this in the future as we're still exploring this.
00:22:26.460 Your final question addressed data extraction and logs, considering existing customer data.
00:22:33.180 Yes, while transitioning, there will still be existing data logs.
00:22:39.720 We are careful not to let UK data reach US servers; it stays in the UK throughout this transition.
00:22:48.720 The process ensures that UK data is ready to migrate only after the US is established.
00:22:55.620 This way, we maintain control over data privacy as we move forward.
00:23:02.280 Thank you for your question.
00:23:10.380 Are there any other questions?
00:23:18.780 You mentioned this table renaming process, and I’m curious why you approached it this way.
00:23:24.540 Why not simply delete the unnecessary data and avoid the complications of renaming?
00:23:30.780 Interestingly, we found that creating fresh tables minimized risk during data management.
00:23:36.780 Not only is the copy-and-swap method generally faster, but it also reduces potential mistakes.
00:23:43.320 This experience guided our decision-making process.
00:23:49.920 Thank you for your understanding, and sorry for needing to cut it short. Thanks again!