wroc_love.rb 2023

Multi-region Data Governance in Rails Application

00:00:02.360 Thank you very much for having me. I'm really excited to be here.
00:00:07.859 Actually, I'm from Poland, but this is my first time here in Wrocław.
00:00:13.740 It's a beautiful city, and as it was mentioned, it's also my first public tech conference talk.
00:00:20.820 There's a lot to be excited about, and I'm a bit stressed, but I hope to give you a good example.
00:00:27.900 I hope that by next year, you will find yourselves in my position.
00:00:34.140 Today, we'll be talking about multi-region data governance in Rails.
00:00:39.540 That's the title of the talk I submitted. I admit the title promises a bit too much, not with bad intentions, but because the topic is quite broad.
00:00:48.120 We'll get into what multi-region actually means in a moment.
00:00:53.879 However, because this is such a broad topic, I've decided to focus on a very specific aspect of multi-region data governance.
00:01:04.860 I will discuss moving from having a single region multi-tenant application to a multi-region multi-tenant application.
00:01:10.979 I know there was a good talk about multi-tenancy last year; this year, we're going a step further.
00:01:19.820 Let's get started, and today's story will be about success.
00:01:25.200 It's a story that all the people who build applications dream about.
00:01:30.600 You, as entrepreneurs or developers, create applications to earn money and create business value for your clients.
00:01:37.740 The reason why we're even discussing multi-region governance is due to a success story—an idea that picks up and creates the problems we will solve later.
00:01:51.840 Let's imagine an entrepreneur who has an idea. For this example, let's assume that this person thought about a SaaS application to host merchants and allow them to build shops so their clients can buy products through their system.
00:02:03.420 The entrepreneur recognizes the market, sees the opportunity, and that idea grows in their mind.
00:02:09.360 To go from idea to application, they must grow that seed and build the thing they imagined.
00:02:18.659 What's the best way to do it in a web framework environment? Using Ruby on Rails.
00:02:24.900 OK, so we have our neighbor who sells some goods over the internet. Let's get them on the platform.
00:02:32.580 The neighbor agrees, creates a shop, and people start to visit that shop, buying things through our application.
00:02:43.680 Users start to use the shop on our platform, and word spreads; more and more clients join.
00:02:50.640 I think the platform is picking up. We have clients. Then, the entrepreneur starts to see more potential for growth.
00:03:00.300 Let’s move into other markets and grow globally. Initially, the platform was only based in the United Kingdom, but now there's a market opportunity in the USA.
00:03:09.840 They start bringing clients in from another country in another region. The shops begin to flourish, and there are clients there as well.
00:03:20.280 But here’s the issue: this development is organic. The entrepreneur is still the one who had the idea and built the application, and it was built in the UK.
00:03:30.300 However, all the data is still where it was initially stored, and now that it's grown globally, a problem arises.
00:03:39.600 There are now big clients saying, 'Okay, I want to join your platform. However, I have a requirement: you need to store my data in the region where I'm from.'
00:03:48.120 This creates a situation in which the initial setup of having data stored in the UK is no longer valid.
00:03:56.760 Clients want their data where they reside. This situation stems from your success because you have grown globally.
00:04:03.420 Now the question is: how do you move from a single-region multi-tenant application to a multi-region multi-tenant application?
00:04:08.819 This is the challenge that my team at Apply4 faced. We do not build shops; rather, we have a permitting SaaS application for the film and event industry.
00:04:20.519 My name is Miron, and I work at Apply4 through Secret Source. The reason I mentioned two companies is that sometimes it's difficult to find a good job.
00:04:27.780 I was privileged to find two really good jobs: one with a fantastic product in a growing market and the other with an office in a beautiful place,
00:04:35.640 Gran Canaria, so it's not bad! But don't worry, I'm not leaving the beautiful dark November of Poland.
00:04:40.800 Now, back to business: we at Apply4 build a permitting SaaS platform for the film and event industries.
00:04:48.780 Our clients are authorities in the UK, Canada, the United States, and New Zealand.
00:04:55.140 These authorities enable people to hold their events or film shoots in their cities, for example, blocking streets for safe events.
00:05:03.180 Our end users are film producers or event organizers. The example of shops I started with matches our SaaS application; however, instead of shops, we work with authorities.
00:05:10.620 The problem I mentioned is exactly what we faced, because our clients decided that they could no longer accept the existing architecture.
00:05:18.180 Their data needed to be stored in the United States, prompting us into the multi-region journey.
00:05:25.140 Conceptually, the goal of this change is to move data from one region to another.
00:05:31.680 This means that the data, which was stored previously in one region in the UK, suddenly needs to end up in a different infrastructure on the other side of the ocean.
00:05:38.700 Let's break down the details. We have the UK, where we initially stored the data of our clients and where our initial clients are from.
00:05:44.340 Then we introduce the new infrastructure in the US and separate the data, placing it there.
00:05:51.599 This setup is multi-tenant, as multiple clients operate within one application.
00:05:57.780 Clients have their own users, and then we add the additional layer of multi-region. Instead of having one multi-tenant application, we have two.
00:06:06.300 That's why I refer to this as a multi-region multi-tenant architecture.
00:06:12.180 Reflecting on the multi-tenant talk yesterday, I believe we may reach even greater potential in the future.
00:06:18.900 Now, what exactly is data? I mentioned that we need to move data from one place to another.
00:06:22.380 In our case, we're talking about database records and files uploaded by the users.
00:06:29.820 To provide some context, I'll briefly outline the stack we use so you have a better picture of the infrastructure.
00:06:36.420 We have our Ruby on Rails application in a container deployed to AWS EC2.
00:06:43.560 We're using AWS, but it doesn't really matter what you choose.
00:06:48.540 Our data persistence layer consists of a database, a file system, and Redis for data storage.
00:06:56.020 All these elements are part of our infrastructure, along with DNS and load balancers that connect our application to the outside world.
00:07:02.460 Now, when you think about a multi-region multi-tenant setup, what options do you have?
00:07:09.180 What kind of architectures can you implement to satisfy user requirements?
00:07:14.820 One option is to maintain a single application while separating the data persistence layer.
00:07:20.760 You end up with a situation where the data is clearly separated, but the application takes on the challenge of accessing the correct data.
00:07:28.500 This creates complexity, as the application needs to account for which region's data it should access.
00:07:36.300 Do you really want to implement another layer of complexity into your system?
00:07:43.140 The alternative option is to completely separate the applications. This means transitioning from one application and one data persistence layer to two distinct ones.
00:07:53.220 Then there’s a hybrid option where you separate the applications and data, but have shared components between the two regions.
00:08:01.380 Now, I will tell you why we chose one specific option. When facing this challenge, it's essential to consider a few crucial questions.
00:08:10.140 Firstly, can tenants move across regions?
00:08:14.580 Secondly, how much data is shared between the regions?
00:08:20.220 Lastly, can users operate in different tenants across regions?
00:08:25.680 The answers we found were that tenants don’t move across regions.
00:08:31.680 It's quite difficult for a tenant to change regions, as the regions are separated by oceans.
00:08:39.300 Secondly, not much data is shared, though some does need to be synchronized between the two.
00:08:43.800 Lastly, users can operate in different regions, but it's not the norm to apply for an event across an ocean.
00:08:50.640 Given these considerations, we decided to simply separate the applications entirely.
00:09:00.840 We transitioned from a single application that hosted the data of both regions to two distinct ones.
00:09:06.600 Now, let's think about the tasks we need to accomplish to achieve this.
00:09:14.160 We need to separate our infrastructure, separate our data, and adjust our Rails application to support two different regions.
00:09:20.760 We also need to guide users to the correct region, as they were previously using one application. Now they will have two.
00:09:27.240 This may seem daunting, and when I thought about this challenge, I envisioned a leap of faith.
00:09:35.100 If you just deploy everything at once and jump in, it feels risky, as the safety net for mistakes is minimal.
00:09:45.300 Instead, I suggest taking it step by step, gradually implementing changes until you're ready to go live.
00:09:52.860 I have stages we went through to separate the data.
00:09:58.500 Firstly, we redirected users to the correct region without altering the data.
00:10:07.680 We created a situation where we already had two distinct applications, each with different URLs.
00:10:15.180 Importantly, we didn't touch the data persistence layer at all in this initial stage.
00:10:22.380 Once this was established, the second step was to separate the data and finally move applications to their respective regions.
00:10:30.240 After the first step, we still have applications located in the original UK region.
00:10:37.440 Now, how do you redirect users to the correct region?
00:10:42.780 Conceptually, we're splitting our application. Previously we had one application, but now we will have two.
00:10:50.040 That's a big change for users. We want to ensure it's seamless.
00:10:57.900 For the infrastructure, we maintain the persistent data while creating additional containers and load balancers.
00:11:06.180 It’s important that we make the user experience as seamless as possible.
00:11:12.960 We want to seamlessly redirect users from app.apply4.com to region-specific URLs.
00:11:20.760 When a request is sent to the region-agnostic URL, we want to perform a redirect to the region-specific URL.
00:11:30.100 For this, we can utilize DNS and load balancers. We have a DNS feature that employs geolocation.
00:11:38.300 DNS resolves the IP based on the origin IP, directing users based on their geographical location.
00:11:45.900 Second, a load balancer can redirect based on path patterns, enabling us to make decisions based on the requests.
00:11:54.540 In our case, URLs include country names, which helps thousands of users reach the right region.
00:12:02.820 Using DNS, if the origin of the request is from North America, it directs to load balancer 2; otherwise, it goes to load balancer 1.
00:12:10.320 While requests are not evaluated individually, this method points users to the appropriate load balancer.
00:12:19.260 Once in the load balancer, users will then reach the URL corresponding to that region.
00:12:27.420 There will be cases when a user has the wrong country in the URL. In those cases, we need to have the load balancer override the default route.
00:12:34.860 We aimed for seamless transitions, and we found that browsers cache HTTP 301 responses aggressively, which caused problems.
00:12:41.880 Therefore, we decided to use a 302 redirect; it's not perfect for semantics, but it worked.
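The routing rules described above can be modelled in a few lines of plain Ruby. This is only a sketch of the decision logic: in the talk the rules live in DNS and the load balancers, not in application code. The region hostnames, the country-to-region table, and the assumption that the country name is the first path segment are all hypothetical.

```ruby
# Hypothetical region hosts; the talk names only the region-agnostic
# app.apply4.com and does not give the region-specific URLs.
REGION_HOSTS = {
  uk: "uk.app.example.test",
  us: "us.app.example.test"
}.freeze

# Hypothetical mapping from the country name in the path to a region.
COUNTRY_REGIONS = { "uk" => :uk, "usa" => :us }.freeze

# Step 1 (DNS geolocation): North American origins get load balancer 2
# (the US region); everything else gets load balancer 1 (the UK region).
def default_region(continent)
  continent == :north_america ? :us : :uk
end

# Step 2 (load balancer rule): a known country in the path overrides the
# geolocation default. Returns [status, location].
def redirect_for(path, continent)
  country = path.split("/").reject(&:empty?).first
  region = COUNTRY_REGIONS.fetch(country, default_region(continent))
  # 302 rather than 301, because browsers cache 301 responses aggressively.
  [302, "https://#{REGION_HOSTS.fetch(region)}#{path}"]
end
```

With these assumptions, a request for `/usa/permits` arriving from Europe is still sent to the US host, while a path without a known country falls back to the region chosen by geolocation.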
00:12:50.100 Following this process, nothing complicated happens when we reach app1; a standard DNS record is simply reached.
00:12:58.380 We navigate to app1 through load balancer 1 and similarly to app2 through load balancer 2.
00:13:05.880 A critical point is to implement alternate links when dealing with two applications so they can reference each other.
00:13:12.180 This means both apps know of each other’s existence. The first stage goes well, and users are redirected properly.
00:13:20.760 We’re halfway through this transition, and users now face no disruptions.
00:13:27.960 Now, let’s address separating the data, which is the challenging part.
00:13:34.680 We will move from a shared data persistence layer to completely separate applications.
00:13:43.620 We are separating all the infrastructure and creating a distinct copy of the databases, file system storage, and Redis.
00:13:50.760 Each region becomes entirely independent; neither will be aware of the other.
00:13:57.540 Let's tackle the task of splitting data. This visualization shows a database with tables containing records.
00:14:04.680 Each of these records belongs to particular regions, and we need to sort them accordingly.
00:14:16.020 Firstly, categorize your tables. Some tables must be shared, while others contain region-specific data.
00:14:25.740 Determine which tables need to be shared and identify how often data in them changes.
00:14:32.460 Next, when you pick tables that have data records from both regions, you'll want to separate them.
00:14:38.460 This separation process is referred to as bucketing.
00:14:44.760 It's a fun name that stuck from our development process. Think of it as assigning records to buckets.
00:14:52.920 You assign each record to the correct bucket until their final destination is decided.
00:14:59.520 To implement this, we added a 'bucket' column to our tables.
00:15:06.780 It’s a simple integer column representing which bucket a record should be placed into.
00:15:12.900 On the slide is a snippet of Ruby code that defines an enum for bucket values, including a value for unassigned records.
00:15:22.020 It’s included in our base class, and while not typical, it suits our use case.
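The slide itself is not reproduced in this transcript. As a rough illustration of the idea, here is a minimal plain-Ruby sketch of a bucket mapping mixed into a record class. The bucket names and integer values are hypothetical, and in a Rails model the same mapping would normally be declared with ActiveRecord's `enum` macro rather than written by hand.

```ruby
# Hypothetical bucket values; the talk says only that the column is an
# integer and that unassigned records get their own value.
module Bucketed
  BUCKETS = { unassigned: 0, uk: 1, us: 2 }.freeze

  # Symbolic name for the integer stored in the `bucket` column.
  def bucket_name
    BUCKETS.key(bucket)
  end

  # Accepts a symbolic name and stores the matching integer.
  # Raises KeyError for a name that is not a known bucket.
  def bucket_name=(name)
    self.bucket = BUCKETS.fetch(name)
  end

  def unassigned?
    bucket == BUCKETS[:unassigned]
  end
end

# Stand-in for a model row; a Struct keeps the sketch free of Rails.
Permit = Struct.new(:id, :bucket) do
  include Bucketed
end
```

A record starts in the `unassigned` bucket and is moved with `permit.bucket_name = :us`, which stores the integer `2` under the values assumed here.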
00:15:28.740 We’re now half an hour into the presentation, and this is the first time I’m showing Ruby code.
00:15:35.520 So, here’s how we apply that column in our tables and assign buckets to those records.
00:15:47.520 We iterate through each authority and its associated data records, updating their bucket values.
00:15:54.840 Essentially, we categorize records according to the authority they relate to.
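The top-down pass can be sketched as follows, using plain Structs in place of the real models. The class names, the `children` association, and the bucket values are assumptions made for illustration; the actual code from the talk is not shown in the transcript.

```ruby
# Hypothetical bucket values, one per region.
REGION_BUCKETS = { uk: 1, us: 2 }.freeze
UNASSIGNED = 0

# Stand-ins for the models: an authority owns records, and a record may
# own further nested records.
Record    = Struct.new(:name, :bucket, :children)
Authority = Struct.new(:name, :region, :records)

# Set the bucket on a record and on everything nested under it.
def assign_bucket(record, bucket)
  record.bucket = bucket
  record.children.each { |child| assign_bucket(child, bucket) }
end

# Top-down pass: every record inherits the bucket of its authority.
def assign_buckets(authorities)
  authorities.each do |authority|
    bucket = REGION_BUCKETS.fetch(authority.region)
    authority.records.each { |record| assign_bucket(record, bucket) }
  end
end
```

Records that cannot be reached from any authority keep the unassigned value, so they are easy to find and review afterwards.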
00:16:02.160 The top-down approach to bucket assignment has proven to be much simpler and more efficient.
00:16:09.030 However, consider that it doesn’t always yield perfectly clean solutions.
00:16:16.680 As we worked through this, we built a hierarchy of authorities and their corresponding values.
00:16:23.520 From this structure, each authority leads to associated records.
00:16:28.740 As you navigate up from records through to the authority, you see connections that dictate their region.
00:16:36.060 At each level, you can determine how to assign records correctly.
00:16:42.120 Now, let’s detail the steps we took to assign buckets.
00:16:49.710 You’ll want to iterate through each authority, accessing associations and updating records with their corresponding bucket values.
00:16:56.160 This simple yet effective process yields clearly defined buckets for every record.
00:17:06.100 Next, having assigned records to corresponding regions, we now separate our database.
00:17:11.820 Rather than deleting unnecessary data outright, we opted to create fresh tables to minimize risk.
00:17:18.420 You create a new, empty table and transfer only the necessary records, swapping names afterward.
00:17:24.600 However, be cautious about constraints and manage incrementing values properly.
00:17:31.800 Set a buffer to prevent overlapping IDs during this transition between regions.
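The copy-and-swap steps can be written down as a list of SQL statements. The sketch below only builds the statements and does not run them. It assumes PostgreSQL, an integer `bucket` column, and a sequence named `<table>_id_seq`; the talk does not confirm any of these, and the `_fresh` and `_old` suffixes are made up.

```ruby
# Builds the statements for one table. The database (PostgreSQL), the
# sequence name and the table suffixes are assumptions, not the talk's.
def copy_and_swap_sql(table:, bucket:, max_id:, id_buffer:)
  fresh = "#{table}_fresh"
  old   = "#{table}_old"
  [
    # 1. New, empty table with the same columns, defaults and indexes.
    "CREATE TABLE #{fresh} (LIKE #{table} INCLUDING ALL)",
    # 2. Copy only the rows that belong to this region's bucket.
    "INSERT INTO #{fresh} SELECT * FROM #{table} WHERE bucket = #{Integer(bucket)}",
    # 3. Swap names; the old table is kept as a fallback, not dropped.
    "ALTER TABLE #{table} RENAME TO #{old}",
    "ALTER TABLE #{fresh} RENAME TO #{table}",
    # 4. Leave a gap above the highest existing id before new ids are issued.
    "ALTER SEQUENCE #{table}_id_seq RESTART WITH #{max_id + id_buffer}"
  ]
end
```

In PostgreSQL, `LIKE ... INCLUDING ALL` does not copy foreign keys, and foreign keys that point at the table keep following the renamed old table, so both need to be recreated separately. This is one reason to be cautious about constraints.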
00:17:41.040 Now that our databases are separated, let's look at files.
00:17:49.500 We need to move files to their respective directories based on the associations that determine their regions.
00:17:56.700 Change the storage paths to specify regions, allowing for clearer organization.
00:18:03.900 We can recreate the previous database separation approach for file management.
00:18:09.900 For testing purposes, consider creating empty copies of files, especially when dealing with large sets.
00:18:16.560 This method facilitates seamless copying and testing of files.
00:18:22.920 You should also create new directories for files based on records and then migrate old files accordingly.
00:18:34.500 For both new and old files, check for the presence of new directories to ensure smooth access.
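The lookup with a fallback can be sketched as follows. It assumes a layout of `<storage_root>/<region>/<relative_path>` for migrated files and `<storage_root>/<relative_path>` for files still in the old location; both layouts and the method names are hypothetical.

```ruby
# Path of a file in the new, region-specific layout (assumed layout).
def regional_path(storage_root, region, relative_path)
  File.join(storage_root, region.to_s, relative_path)
end

# Prefer the region-specific location; fall back to the legacy location
# for files that have not been migrated yet.
def resolve_path(storage_root, region, relative_path)
  candidate = regional_path(storage_root, region, relative_path)
  return candidate if File.exist?(candidate)

  File.join(storage_root, relative_path)
end
```

Because reads fall back to the legacy path, files can be migrated gradually while the application keeps serving both old and new uploads.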
00:18:43.500 AWS DataSync, an AWS data transfer service, can help move large numbers of files.
00:18:50.340 Once the data is separated, we are prepared to go live.
00:18:57.840 Before going live, keep your clients informed about the changes.
00:19:05.940 Schedule the go-live for a period of low user activity, such as early on a Saturday morning.
00:19:12.000 Define the clear steps you will follow, including the commands to run and the checks to perform.
00:19:20.760 It's essential to assign responsibilities to your team during the go-live event.
00:19:26.640 Decide on a contingency plan in case something goes wrong.
00:19:32.040 This preparation will help mitigate risks during the transition.
00:19:40.260 Finally, enjoy the process! You have made significant strides, so celebrate every milestone.
00:19:48.780 Rehearse your plans multiple times in staging environments.
00:19:55.800 The more familiar you become with the process, the more confident you will be on the go-live day.
00:20:02.220 Also, remember to use DNS records, which connect your application to external users.
00:20:10.020 After deploying the infrastructure, everything should function as it previously did.
00:20:16.680 Testing all connections will ensure a smoother go-live.
00:20:23.340 Once everything is verified and operational, you can take control of DNS and point users there.
00:20:29.940 The last piece of advice is to schedule some time off around the go-live event.
00:20:36.780 This enables a more relaxed atmosphere for you as you pull the trigger.
00:20:43.860 Trust me, it’s worth it to enjoy the moment and celebrate after such hard work.
00:20:51.780 In conclusion, great teamwork is critical, and I want to give kudos to my team.
00:20:58.200 They were instrumental in getting us to this stage.
00:21:03.840 Thank you for your time, and you can find me on X (formerly Twitter). I would appreciate any feedback as this is my first presentation.
00:21:12.660 Enjoy the rest of the conference, and it was a pleasure to be here.
00:21:22.859 Thank you for your presentation. I have a question for you.
00:21:29.880 If I understand you correctly, you promise your customers that their data will only be stored in specific regions.
00:21:35.220 But data is not only stored in databases, right? We use monitoring tools, logging, etc.
00:21:40.319 So, for example, if we have Datadog monitoring, that data may be stored in a different region.
00:21:45.720 What about when you want to aggregate data from multiple regions for business reporting?
00:21:51.240 Wouldn’t that break your promise to customers, and how do you handle it?
00:21:57.480 Regarding logging and business reports, it’s important to clarify our approach.
00:22:04.440 For logging data, we also implement region-specific logging. This means any information stays within its respective region.
00:22:10.380 As for business analytics, there is a challenge with aggregating across regions. We acknowledge that this is a limitation.
00:22:18.960 I'd suggest talking further about this in the future as we're still exploring this.
00:22:26.460 Your final question addressed data extraction and logs, considering existing customer data.
00:22:33.180 Yes, while transitioning, there will still be existing data logs.
00:22:39.720 We are careful not to let UK data reach US servers; it stays in the UK during this transition.
00:22:48.720 The process ensures that UK data is ready to migrate only after the US is established.
00:22:55.620 This way, we maintain control over data privacy as we move forward.
00:23:02.280 Thank you for your question.
00:23:10.380 Are there any other questions?
00:23:18.780 You mentioned this table renaming process, and I’m curious why you approached it this way.
00:23:24.540 Why not simply remove the unnecessary data and avoid the complications of renaming?
00:23:30.780 Interestingly, we found that creating fresh tables minimized risk during data management.
00:23:36.780 Not only is the copy-and-swap method generally faster, but it also reduces potential mistakes.
00:23:43.320 This experience guided our decision-making process.
00:23:49.920 Thank you for your understanding, and sorry for needing to cut it short. Thanks again!