RailsConf 2018

Operating Rails in Kubernetes

by Kir Shatrov

Summary of 'Operating Rails in Kubernetes' by Kir Shatrov

In this RailsConf 2018 presentation, Kir Shatrov discusses moving Rails applications to Kubernetes, focusing on the techniques and insights gained while migrating hundreds of apps at Shopify. The talk provides a broad overview of Kubernetes and how it reshapes deployment practices for Rails applications.

Key Points Discussed:

  • Introduction to Kubernetes:

    • Kubernetes revolutionizes container scheduling, making resource allocation more dynamic and efficient compared to static server setups managed by scripts like Capistrano or Chef.
  • Describing Architecture:

    • Instead of manually controlling servers, Kubernetes utilizes YAML files to describe desired states and resources, promoting a more declarative approach to configuration management.
  • Rails App Special Considerations:

    • Rails applications often involve multiple processes and long-running jobs, necessitating mindful strategies for graceful termination and scaling.
  • 12-Factor App Methodology:

    • Adopting principles like disposability, idempotency, and concurrency can optimize Rails app deployment in Kubernetes. For example, properly handling long-running background jobs minimizes disruptions during shutdowns or restarts.
  • Asset Management:

    • Efficient resource usage is crucial; assets should be pre-compiled and included in Docker images rather than generated on each server, reducing overhead.
  • Database Migrations:

    • Traditional coupling of code deployment and migrations can create fragile deployments. Shatrov suggests asynchronous migrations to decouple these processes, mitigating rollback complexities when migrations fail.
  • Secret Management:

    • With the introduction of Rails 5.2, encrypted credentials are suggested as a secure method for managing sensitive information, replacing the need for numerous environment variables.
  • Migration Support Tools:

    • To ease the developer experience, a bot was created to generate the necessary YAML configurations for Kubernetes deployments, and a gem called kubernetes-deploy was developed to provide visibility into deployment progress, akin to Capistrano's interface.
  • Kubernetes as a Solution:

    • Kubernetes abstracts away hardware specifics, allowing developers to focus on application-level concerns and streamline the process of deploying and managing Rails applications, thus standardizing operations significantly.

Conclusions:

  • Kir emphasizes the benefits of moving Rails applications to Kubernetes, arguing that the transition improves the efficiency, scalability, and resilience of deployments.
  • The adoption of cloud-native practices allows teams to concentrate on application development without being bogged down by infrastructure management.

Shatrov's insights illustrate a significant shift in operational practices, showcasing a modernized approach to Rails app management in a containerized world.

00:00:10 Yes, time to start. Hi all, my name is Kir.
00:00:16 Today, I will talk about running Rails in Kubernetes. I work at a company called Shopify.
00:00:24 For over the last year or so, we've migrated hundreds of Rails apps within the company to Kubernetes.
00:00:31 This includes our main app, which is also known as one of the largest Rails apps in the community.
00:00:39 We learned quite a bit about running Rails efficiently in Kubernetes, and I decided to make this talk to share some of the things we've learned.
00:00:45 Today, we'll start with a quick introduction to Kubernetes for those who haven't been exposed to it yet.
00:01:00 Then, we'll talk about what makes Rails special when deployed on orchestrated platforms like Kubernetes.
00:01:13 Lastly, I'll share some insights that helped us migrate all these apps.
00:01:25 First of all, please raise your hand if you have ever played with Kubernetes or any container orchestration.
00:01:36 Oh, it's quite a lot of you.
00:01:42 In 2018, almost everyone agrees that containers are fantastic.
00:01:49 They provide a universal interface for any app to run in basically any environment you want.
00:01:58 However, the problem of running and scheduling containers still exists.
00:02:05 You need to run these containers somewhere.
00:02:11 Just a note, I'm not going to talk about containerizing Rails because there will be a great talk at 3:30.
00:02:23 If you're interested in containerizing Rails itself, please attend this talk by Daniel.
00:02:36 Here, I'll focus on running it in production using orchestrated containers.
00:02:43 You have a container with your app, and you'll run it somewhere in a static environment.
00:02:55 In a static world where servers are configured with tools like Chef, for example, you would put the heavier containers on the larger servers.
00:03:07 Those heavier containers need more memory and CPU, while smaller servers handle the lighter containers.
00:03:17 All this orchestration is still quite manual and configured by humans.
00:03:29 This manual approach can sometimes waste resources, leaving some CPUs and memory unused.
00:03:41 The desired state is to have every CPU utilized and all resources efficiently scheduled.
00:03:48 This minimizes resource consumption and ultimately saves energy.
00:03:55 Kubernetes effectively solves this by smartly scheduling resources on servers in a dynamic way.
00:04:08 It bin packs containers to run them in the best way possible.
00:04:14 In a single sentence, it’s intelligent container scheduling for better resource utilization.
00:04:21 This is crucial because you no longer have a static list of servers.
00:04:35 Everything is scheduled dynamically; if one server crashes or loses power, the same unit of work is rescheduled elsewhere.
00:04:49 This allows for optimal resource usage, which is increasingly important as deployments grow.
00:05:05 Next, I want to share some key concepts that Kubernetes introduces.
00:05:11 One basic concept is the 'pod', which is essentially a running container—one instance of something.
00:05:41 If we run one process of Sidekiq, it would be just one pod. However, one instance alone is not sufficient to run an entire app or service.
00:06:10 This brings us to the next concept: a deployment, which is a set of pods. Each app in Kubernetes might have a couple of deployments.
00:06:22 For example, a Rails app could have two deployments—one for web workers and another for job workers.
00:06:54 The number of instances in any deployment is quite dynamic. It can be adjusted, allowing you to scale it up or down.
00:07:10 You can even set up auto-scaling.
00:07:17 If you’ve ever worked with Heroku, you're likely familiar with the concept of dynos and the ability to adjust the dyno count.
00:07:37 Similarly, Kubernetes allows you to scale deployments effectively.
00:07:57 This sounds great, but how do you actually describe these resources?
00:08:05 If you used Chef or Capistrano, you probably had a Ruby DSL that could be both expressive and overwhelming. Kubernetes leverages YAML files to describe its resources.
00:08:42 You might have a config file of maybe 20 to 30 lines for a Rails app, which you would apply to a Kubernetes cluster and keep in the repository.
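To make that concrete, here is a minimal sketch of what such a manifest could look like for the web tier of a Rails app. The app name, labels, image, and port are placeholders for illustration, not values from the talk.

```yaml
# Hypothetical Deployment for the web tier of a Rails app; names and image are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-web
spec:
  replicas: 3                     # scaled up or down by hand, or driven by an autoscaler
  selector:
    matchLabels:
      app: myapp
      tier: web
  template:
    metadata:
      labels:
        app: myapp
        tier: web
    spec:
      containers:
        - name: web
          image: registry.example.com/myapp:latest   # built elsewhere and pushed to a registry
          command: ["bundle", "exec", "puma", "-C", "config/puma.rb"]
          ports:
            - containerPort: 3000
```

A job-worker deployment would look much the same, with a different command (for example a Sidekiq process) and no exposed port.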
00:09:07 This approach has many advantages and marks a mindset shift. We had to transition from controlling servers during deployments to describing configurations.
00:09:26 When you manage servers, you might run commands remotely and compare outputs. However, with Kubernetes, you simply push a YAML file and tell Kubernetes to apply it.
00:10:02 This desired state is rolled out within moments or a minute if a large configuration is applied.
00:10:20 This is a significant change, as we had to move from managing servers to describing our infrastructure.
00:10:33 Controlling servers would mean running commands remotely and comparing outputs, while describing configuration is more about pushing settings to be applied.
00:10:55 This concept leads to abstraction from physical machines, which is great for self-healing.
00:11:11 If one server goes down, the same work is rescheduled elsewhere. This isn't the case with manual server management.
00:11:29 At Shopify, we have nearly 100 Capistrano configurations, and servers eventually became unmanageable due to scale, leading to outages when processes died.
00:11:46 This self-healing benefit does not hold true when controlling servers.
00:11:55 Discussing tools, examples of server controls are Capistrano and Chef, while Kubernetes lets you describe the desired state for rolling out processes.
00:12:11 Kubernetes takes a container and runs it for any number of instances specified. While it's easy to run a basic container, Rails is more than a simple process.
00:12:41 Many Rails apps are monoliths that bundle multiple responsibilities, which complicates running them as a single simple container.
00:12:57 If you’ve used Heroku, you might be familiar with the Twelve-Factor App methodology, which promotes using declarative formats and minimizing differences between production and development.
00:13:13 Apps that follow these Twelve Factors are usually easy to scale without significant architectural changes.
00:13:31 I’d like to go through a couple of these factors, which can sometimes be overlooked but are critical for running apps smoothly in Kubernetes.
00:14:01 One important aspect is disposability and termination, meaning what happens when you want to restart or shut down a process.
00:14:25 For web requests, this is straightforward; you can stop accepting new requests, wait for existing ones to finish, and shut down afterward.
00:14:45 However, background jobs require ensuring all current jobs terminate before safely shutting down.
00:15:10 Long-running jobs pose a further challenge; they may abort in the middle of execution and restart, which brings us to idempotency.
00:15:33 The code that processes work should not lead to unintended side effects, even when executed more than once.
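As a sketch of what idempotency can mean in practice, consider a hypothetical background job; the class, model, and column names below are made up for illustration, and a fully robust version would also guard against being interrupted between the side effect and the bookkeeping write (for example with an idempotency key on the receiving side).

```ruby
# Hypothetical Sidekiq job; all names here are illustrative.
class SendWelcomeEmailJob
  include Sidekiq::Worker

  def perform(user_id)
    user = User.find(user_id)

    # If a previous run already sent the email before the worker was
    # terminated, re-running the job becomes a no-op instead of a duplicate send.
    return if user.welcome_email_sent_at.present?

    UserMailer.welcome(user).deliver_now
    user.update!(welcome_email_sent_at: Time.current)
  end
end
```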
00:15:56 Concurrency allows your app to scale effectively. For instance, if you have both web workers and job workers, these should not share bottleneck resources.
00:16:14 If they rely on a single shared resource, scaling won’t be effective.
00:16:26 Now let’s discuss assets management during deployment.
00:16:37 When using Capistrano, you might run asset precompilation on every server, wasting resources. Instead, pre-compile assets once and distribute them.
00:17:02 An efficient approach is to embed assets within the container with the app, ensuring all dependencies are included when the app starts.
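A rough sketch of what that could look like in a Dockerfile is below; the base image, paths, and the throwaway SECRET_KEY_BASE are assumptions for illustration, not details from the talk.

```dockerfile
# Illustrative only: precompile assets once at image build time
# instead of on every server at deploy time.
FROM ruby:2.5

WORKDIR /app
COPY Gemfile Gemfile.lock ./
RUN bundle install --jobs 4
COPY . .

# A dummy SECRET_KEY_BASE lets the Rails boot that asset precompilation
# requires run without real production secrets at build time.
RUN SECRET_KEY_BASE=dummy RAILS_ENV=production bundle exec rails assets:precompile
```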
00:17:23 Another potential issue is database migrations. In the Rails community, migrations are traditionally run as part of the deployment, which can make the process fragile.
00:17:44 Consider what happens if a code change fails after migrations are applied: rollback becomes complicated.
00:18:07 To mitigate this, we avoid running migrations as part of deployments and encourage developers to write compatible code between versions.
00:18:31 We implement asynchronous migrations, meaning they are applied separately from deploying the code.
00:18:51 This structure enables us to announce migrations on Slack, notifying developers of the migration status.
00:19:15 Another key aspect regards application secrets. Modern apps commonly interact with third-party APIs that require access tokens.
00:19:31 Rails applications must securely manage these sensitive tokens, and one approach is to use environment variables.
00:19:44 However, having hundreds of environment variables can become cumbersome.
00:20:02 Rails 5.2 introduced a credentials feature that allows storing encrypted credentials within the repository.
00:20:28 This method simplifies sharing keys; you simply need the rails master key to access your secrets.
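For reference, the Rails 5.2 workflow looks roughly like this: credentials are edited with `bin/rails credentials:edit`, stored encrypted in config/credentials.yml.enc, and decrypted at runtime with config/master.key or the RAILS_MASTER_KEY environment variable. The keys shown below are examples, not anything specific from the talk.

```ruby
# Reading encrypted credentials in Rails 5.2+; the :aws keys are just examples.
Rails.application.credentials.dig(:aws, :access_key_id)
Rails.application.credentials.secret_key_base
```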
00:20:39 In summary, adhering to the Twelve-Factor principles facilitates running Rails apps in orchestrated environments.
00:21:10 Mindfulness in worker termination and avoiding deploying migrations together with code promotes a safer rollout process.
00:21:29 Asynchronous migrations enhance the reliability of this process, along with the more secure sharing of credentials with Rails 5.2.
00:21:42 At Shopify, we've managed hundreds of apps across various environments.
00:21:49 Some were on Heroku, others on AWS, and some were managed on physical hardware with Chef.
00:22:05 Our goal was to allow developers to focus on building apps without being burdened by underlying infrastructure.
00:22:21 We decided to invest in Kubernetes to provide the platform needed for efficient scaling.
00:22:34 Describing apps for Kubernetes only requires YAML format, usually requiring no more than 20-30 lines of code.
00:22:47 However, not every developer needed to learn this YAML format. Instead, we created a bot that generates a PR on GitHub.
00:23:12 This PR is based on what they are using in production. For instance, if they use Sidekiq, it generates a YAML configuration.
00:23:24 The first item in the PR description includes a checklist to verify that the configuration is logical for the app.
00:23:40 Once everything looks good, they can merge it, and their app is ready to run.
00:23:53 The next step involves applying the configuration with the Kubernetes CLI.
00:24:15 When you run 'kubectl apply' on a file, it returns immediately after letting Kubernetes know about the desired state.
00:24:30 Then, Kubernetes takes time to provision containers, find servers with available CPU, and schedule the work.
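For illustration, the command itself is simply the following; the file path is a placeholder.

```sh
# Submits the desired state and returns almost immediately; the actual
# rollout continues in the background on the cluster.
kubectl apply -f config/deploy/production/web-deployment.yml
```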
00:24:49 This process isn't very visible. If you’re used to Capistrano, you might miss the progress updates.
00:25:06 To address this, we created a gem called 'kubernetes-deploy' which provides visibility into changes applied to the cluster.
00:25:19 This open-source project is widely adopted and allows monitoring the deployment progress.
00:25:36 Kubernetes helps schedule work efficiently, saving resources and eliminating worries about which server a container runs on.
00:25:57 Though Kubernetes isn't magic, it's technology that makes scheduling straightforward.
00:26:25 You do need to know certain things about Rails to run apps on orchestrated platforms smoothly.
00:26:46 Before, setting up a Rails deployment with tools like Capistrano could take hours. Now, orchestrated containers simplify deployments dramatically.
00:27:05 Having standardization means getting started with any Rails app is easier, and it’s possible to quickly understand deployment structures.
00:27:29 Kubernetes abstracts complexities away, allowing applications to run smoothly.
00:27:51 If you are considering getting started with Kubernetes, keep in mind it’s a good solution if you want to stop worrying about physical machines.
00:28:08 For more insights or to discuss the topics I mentioned, feel free to follow me on Twitter.
00:28:23 Thank you for coming to the talk! Now, let's address some questions.
00:28:55 The easiest way to organize asynchronous migrations involves having developers submit separate pull requests for code changes and migrations.
00:29:41 This ensures that if something must be reverted, it can be done safely without affecting migrations.
00:30:05 At our company, we run a recurring job that checks for pending migrations and applies them regularly.
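A minimal sketch of such a recurring job is below; the job class, scheduling mechanism, and Slack notifier are hypothetical, and the talk does not describe the exact implementation.

```ruby
# Hypothetical recurring job (triggered by cron or a scheduler) that applies
# pending migrations independently of code deploys.
class ApplyPendingMigrationsJob
  include Sidekiq::Worker

  def perform
    # db:migrate is a no-op when nothing is pending, so running it on a
    # schedule is safe.
    success = system("bin/rails db:migrate")

    # Announcing the result (e.g. on Slack) matches the talk's description;
    # the notifier used here is a made-up placeholder.
    SlackNotifier.post("db:migrate finished: #{success ? 'ok' : 'failed'}")
  end
end
```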
00:30:15 Regarding stateful resources, we aren't running things like MySQL in Kubernetes just yet but are looking into better solutions for that.
00:30:37 For stateless resources, the process is improving.
00:30:52 We do leverage Kubernetes secrets to store credentials securely, and the setup with that has been smooth.
00:31:09 As for managing configurations across different environments, we don't follow traditional staging practices but utilize feature flags.
00:31:25 Thank you all for your attention!