Real World Docker for the Rubyist

RailsConf 2016

by Jason Clark

The video "Real World Docker for the Rubyist" presented by Jason Clark at RailsConf 2016 dives into the practical application of Docker in a Ruby environment, particularly how the company New Relic has implemented Docker in their development processes. Jason outlines the advantages of using Docker, including packaging convenience, isolation, and efficient resource sharing. The narrative follows two developers, Jane and Jill, who explore Docker's features and functionalities as Jane develops a new service.

Key points discussed include:

- Understanding Docker: Docker is presented as a toolkit that allows developers to package applications with their dependencies into images, which can be run as isolated containers.
- The relationship between images and containers: Jason makes an analogy comparing images to classes in Ruby (defining properties/actions) and containers to objects (running instances of those classes).
- Getting started with Docker: Developers can set up Docker environments on non-Linux machines using Docker Toolbox. A Dockerfile is critical for creating and deploying images.
- Docker Registries: Registries move images between environments. Docker Hub is the default registry; alternatives such as Quay and Dogestry are also covered.
- Deployment process: New Relic utilizes a tool called Centurion, which simplifies interaction with multiple Docker hosts, enabling smooth deployments and configuration management.
- Troubleshooting and performance: The narrative emphasizes best practices for identifying and resolving issues related to service management, logging, and memory constraints.
- Best practices and future directions: The session concludes by discussing best practices in creating reusable Docker images and the importance of security and environment-driven configurations. Future tools like Kubernetes and Docker Swarm for orchestration and service management are briefly highlighted.

The talk offers practical insight into common pitfalls and strategies for using Docker effectively in Ruby environments. Audience members are encouraged to share their own Docker experiences and discuss improvements.

00:00:10.320 Thank you, everyone, for holding on until the very end. This is the last session before the keynote on Friday. My name is Jason Clark, and I’m here to talk to you about real-world Docker for the Rubyist. The genesis of this talk comes from the fact that the company I work for, New Relic, deploys a lot of our services using Docker.

00:00:25.519 I hear a lot of hype about Docker, with many people saying you should use it. However, there are wildly diverging opinions about how to use this tool. Docker turns out to be a toolkit that gives you a lot of options to pick from. So, I wanted to give you a presentation that suggests an approach to Docker.

00:00:43.760 This is tried and true stuff that we've been doing at New Relic for the last couple of years. We got into Docker pretty early, so we've experienced many of the bleeding edges and have found a lot of approaches that have made our lives easier. This talk will take the shape of a story about two developers: Jane, who is a new New Relic employee with a great idea for a product, and Jill, who has been at New Relic a bit longer and can help answer questions.

00:01:08.560 Jane has an idea for a service that will do metrics for how many lines of code you have—a highly useful service. We encourage experimentation, and we want to let people experiment with this service. Jill, with her experience, can guide Jane and answer her questions.

00:01:20.360 As we are a public company, this is our Safe Harbor slide. I'm not promising anything about features, so please don’t sue us or assume anything based on my storytelling about services that we might develop. This talk will help frame how we use Docker and offer you a picture of ways you might apply it.

00:01:58.280 One of the first questions that Jane has upon joining is, 'Why does New Relic use Docker at all?' What is the purpose and what are the features that drove us to this technology? One of the big components is the packaging that Docker provides. Docker offers images, which are essentially large binary packages of files—a file system snapshot of a piece of code and its dependencies that you can distribute and run.

00:02:17.920 At this point, Jane is like, 'Okay, I’ve heard about these images that you can build off of.' For instance, there's an official Ruby image maintained by Docker. You can use that image and insert your Ruby code into an image built from it. Jane pauses and says, 'This can be a bit confusing. I've heard about images and containers. What's the relationship here?'

00:02:51.840 The relationship is that an image is the base of what you're going to run. It’s the deployable unit, whereas a container is the running instance of that image. If you draw an analogy to Ruby, the image would be like a class—defining the properties and actions available—while the container would be like an instantiated object. So we create Docker images, and then those images can be deployed as running containers that run our code in staging and production environments.
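
The class and object analogy can be written out directly in Ruby. This is only an illustrative sketch: the Image and Container classes below are hypothetical and do not talk to Docker at all. They just show one image producing several independent running instances.

```ruby
# An image is like a class: a static definition of what can be run.
class Image
  attr_reader :name, :tag

  def initialize(name, tag = "latest")
    @name = name
    @tag = tag
  end

  # Starting an image yields a new container, much like Class#new yields an object.
  def run
    Container.new(self)
  end
end

# A container is like an object: one running instance with its own state.
class Container
  attr_reader :image

  def initialize(image)
    @image = image
    @running = true
  end

  def running?
    @running
  end

  def stop
    @running = false
  end
end

ruby_image = Image.new("ruby")
first = ruby_image.run
second = ruby_image.run
first.stop

puts first.running?   # false
puts second.running?  # true
```

Stopping one container leaves the other untouched, and both still point back at the same image.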

00:03:26.959 There are many ways to package your applications. You could shuttle files, create a tar file, or take other approaches. However, it’s not enough to tell us why we would want to use Docker. Another major advantage Docker provides is isolation. For most of us, our applications aren't set up in a way that one host is completely maxed out by a single app. We may wish to share resources and run multiple applications across different machines to increase resilience.

00:04:05.799 Traditionally, you might have had a server with different directories for separate applications. You deploy and run those applications on that server. The problems become apparent when you see these applications sitting next to each other, sharing the same neighborhood. They can interfere with each other's files and processes and share the same memory, leading to conflicts. Docker contains these applications, keeping their environments separate.

00:04:42.199 Containers use the same kernel, unlike a virtual machine with a separate operating system, but Docker provides a way to isolate them. Each running container appears as though it is the only entity in existence. It only sees its file system and has constraints on how much CPU and memory it uses, minimizing the possibility of interference.

00:05:16.560 Naturally, Jane, the new developer, asks how to get started. Docker is a Linux-based technology that must run on a system with a Linux kernel. Many of us here don’t run Linux directly; we’re using Macs or Windows. Fortunately, Docker Toolbox is available as a sanctioned way to set up a development environment and install the necessary tools on a non-Linux system.

00:05:44.280 Once we have that, we can get down to crafting our images for deployment. So Jane sits down with Jill; they are pairing. Jill tells her to write a Dockerfile in the root of her application. Jane recognizes this from her reading about Docker: the file declares which image to start from when building her application. But that’s all Jill tells her to put in it.

00:06:09.120 Jane wonders whether there should be more instructions in the Dockerfile. She’s seen Dockerfiles that set working directories, copy files, and run shell commands. Jane is confused about why the Dockerfile she is writing at New Relic is so much shorter. Jill reassures her that it's a valid question, but suggests they focus on getting her service deployed to staging before digging into the details of the simplified Dockerfile.

00:07:00.560 After writing the Dockerfile, Jane goes to the terminal and enters the command to build the Docker image, indicating a tag to use for the constructed image. She instructs Docker to work in the current directory. Once done, there’s a lot of output at the command line as Docker processes the base image and builds the package for her app.

00:07:35.080 If there are any errors in the Dockerfile, such as permission problems, this is the point where they will emerge. After a successful build, she can ask Docker for a list of known images, which shows her service image named 'loc' with the default tag 'latest', since she didn't specify a tag of her own. This runnable copy of her application is now ready to be used.
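
The default tag behavior described above can be illustrated with a small Ruby helper. This is a simplified sketch, not Docker's actual reference parser: the function name is hypothetical, and digest references (the `@sha256:...` form) are deliberately not handled.

```ruby
# Split an image reference into [name, tag], assuming "latest" when no tag
# is given. Only the segment after the last "/" is checked for a tag, so a
# registry port such as "registry.example.com:5000/loc" is not mistaken for one.
def split_image_reference(reference)
  last_segment = reference.split("/").last
  if last_segment.include?(":")
    name, _separator, tag = reference.rpartition(":")
    [name, tag]
  else
    [reference, "latest"]
  end
end

p split_image_reference("loc")
p split_image_reference("loc:1.2")
p split_image_reference("registry.example.com:5000/loc")
```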

00:08:15.960 Everything seems fine for Jane on her local machine, but to move this image to a staging environment, she needs to ensure it can reach somewhere else. To fill this gap, Docker Registries exist. Docker Hub is the default registry, where all Docker tools will push and pull images unless specified otherwise.

00:08:58.760 At New Relic, we ran into problems when they deprecated certain versions of Docker faster than we could migrate our systems. Consequently, we sought alternatives. One such alternative is called Quay; it offers a web UI for pushing and pulling images, with private services available. Another tool we've tried is Docket, which allows you to store images in your own S3 buckets, effectively removing third-party dependency, which is crucial for critical deployments.

00:09:45.760 Now, Jane has a vision of how her service will look, and she wants to get it started in the staging environment. New Relic has developed a tool called Centurion. Typically, to create a container from a Docker image and start the application, you would use 'docker run', which invokes the image's default command.

00:10:14.920 This blocking command lets you see output from the running container as it executes. However, companies at scale often have many hosts, making it tedious to interact with each one individually; that’s where Centurion comes in. Centurion is a Ruby gem that lets you work against multiple Docker hosts, enabling easy pushing, pulling, deployment of images, and rolling restarts.

00:10:46.399 Another strength of Centurion is its configuration files. These files can be checked into source control, providing central visibility into what’s deployed in your Docker environment rather than relying on individuals manually managing containers. Centurion's configuration files are built on Rake, so they are plain Ruby and can compute their settings dynamically.

00:11:24.240 To deploy services, you define a task for your staging environment, specifying the Docker image to be pulled onto various hosts. You can also tag different versions, and when deploying across multiple hosts, Centurion handles the rolling restarts to ensure smooth updates.

00:12:03.399 Using Centurion is straightforward; you install the gem, which gives you an executable command. You specify the environment, project, and the command, like 'deploy', to start all services. Jane, somewhat nervous, asks if everything looks good before executing the command.

00:12:53.000 She initiates the command, which provides a lot of output as it connects to different hosts. It pulls all images to the necessary boxes, stops any running containers, and then restarts them based on the configuration. There are also options to check the service status before proceeding to the next host, which enables rolling deployments.
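
The sequence described above (pull everywhere, then restart one host at a time with a status check in between) can be sketched in plain Ruby. This is not Centurion's implementation or API. The function and its `pull`, `restart`, and `healthy` callables are hypothetical stand-ins, injected so the sketch runs without Docker.

```ruby
# Roll a new image across hosts one at a time. Each step is passed in as a
# callable; all names here are assumptions made for illustration.
def rolling_deploy(hosts, image, pull:, restart:, healthy:)
  # Pull on every host first so restarts are not delayed by downloads.
  hosts.each { |host| pull.call(host, image) }

  hosts.each do |host|
    restart.call(host, image)
    # Stop the rollout if this host does not come back healthy.
    raise "#{host} failed its health check" unless healthy.call(host)
  end
  hosts
end

log = []
rolling_deploy(
  ["staging-1", "staging-2"],
  "loc:latest",
  pull:    ->(host, image) { log << "pull #{image} on #{host}" },
  restart: ->(host, image) { log << "restart #{image} on #{host}" },
  healthy: ->(_host) { true }
)
puts log
```

Because the health check runs before moving on, a bad deploy stops at the first host instead of taking down every instance.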

00:13:26.920 Having completed these steps, her services are in staging. Jane tests her code, and everything seems to work perfectly. However, the next day she discovers the service is gone! What's happened? Now, it's time for Jill to ask Jane some questions to troubleshoot the issue.

00:14:01.199 Jill starts with the logging configuration. Jane checks her Rails app and realizes she had copied a line from somewhere, leading to logging going to standard output instead of files written inside the Docker container. This is actually helpful because New Relic's infrastructure collects standard output from Docker containers and forwards it to platforms such as Elasticsearch.

00:14:44.440 This setup allows Jane to perform analytics across her logs. Sending logs out of Docker containers is recommended, both to keep large log files from accumulating inside the container and to get better visibility. They review the logs, but sadly there's nothing useful, so it’s time to inspect the containers directly.
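
As a rough illustration of logging to standard output, here is a minimal sketch using Ruby's standard Logger rather than Rails itself. The `build_logger` helper and the message format are assumptions for this example; a Rails app would set its own logger in its environment configuration.

```ruby
require "logger"

# Build a logger that writes to the given IO. It defaults to $stdout so the
# container runtime, not the application, decides where log lines end up.
def build_logger(io = $stdout)
  logger = Logger.new(io)
  # Keep each entry on one line so log collectors can treat lines as events.
  logger.formatter = proc do |severity, _time, _progname, message|
    "#{severity}: #{message}\n"
  end
  logger
end

build_logger.info("lines-of-code service started")
```

Passing the IO in as an argument also makes the logger easy to point at an in-memory buffer in tests.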

00:15:16.160 Docker provides commands for this purpose. By using a '-H' flag, they can connect to the specific host that contains the container, issuing a command to list the active Docker containers running on that host. They find their container with an ID that looks promising but notice that it seems to be missing active processes.

00:15:52.600 In the process of troubleshooting, Jill recalls similar issues seen on another project—applications disappearing with no trace—suspected to be memory-related. They see that the lines of code service is using a hefty 300 MB of RAM, which is excessive for a Rails app and triggers Docker’s memory limits.

00:16:18.360 It's noted that by default, Docker sets a memory limit around 256 MB, which means anything exceeding that will have processes killed. Fortunately, Centurion configs allow for setting memory limits, so they configured it to allow 2 GB.

00:17:00.800 With two gigs configured, things stabilize, but Jane wants better performance with more Unicorn workers. Jill suggests using the environment variables to configure this. Docker allows command flags to set environment variables passed into the container, which is a fundamental aspect of structuring Docker systems for flexibility.
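
The flag in question is `docker run -e NAME=value`. As a small sketch, a Ruby helper can turn a hash of settings into those flags. The helper name and the UNICORN_WORKERS variable are hypothetical; the command is only built here, not executed.

```ruby
# Build the argv for `docker run`, passing each setting as an -e flag.
def docker_run_command(image, env_vars = {})
  env_flags = env_vars.flat_map { |name, value| ["-e", "#{name}=#{value}"] }
  ["docker", "run", *env_flags, image]
end

puts docker_run_command("loc", "UNICORN_WORKERS" => 8).join(" ")
```

Returning an array rather than a string means the command could later be handed to `system(*command)` without shell quoting issues.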

00:17:42.800 Inside the container, the application reads the environment variable to set the number of Unicorn workers, so scaling requires no change to the Dockerfile. Centurion supports this by letting its configuration declare the environment variables for each deployment, so values are not hardcoded in the application.
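
On the application side, reading such a variable might look like the sketch below. UNICORN_WORKERS and the default of 2 are assumptions for illustration; in a Unicorn config file the result would typically be passed to `worker_processes`.

```ruby
# Read the worker count from the environment, falling back to a default.
# UNICORN_WORKERS is a hypothetical variable name chosen for this sketch.
def unicorn_worker_count(env = ENV, default = 2)
  count = Integer(env.fetch("UNICORN_WORKERS", default.to_s), 10)
  raise ArgumentError, "UNICORN_WORKERS must be positive" unless count.positive?
  count
end

puts unicorn_worker_count({ "UNICORN_WORKERS" => "8" })
puts unicorn_worker_count({})
```

Using `Integer` instead of `to_i` makes a typo such as "eight" fail loudly at boot rather than silently starting zero workers.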

00:18:26.479 In Rails apps, database configurations can also be parameterized, ensuring explicit connections to databases in production and staging. The 12 Factor app principles promote these configurations to avoid hardcoding sensitive information. Keeping sensitive items outside of source control increases security and provides better runtime protection.
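
One way to parameterize database settings is a single URL-style variable. The sketch below parses it with Ruby's standard URI library. The DATABASE_URL name, the helper, and the example hostname are assumptions, not details from the talk.

```ruby
require "uri"

# Build database settings from one environment variable so that no
# credentials need to live in source control.
def database_config(env = ENV)
  url = URI.parse(env.fetch("DATABASE_URL"))
  {
    adapter:  url.scheme,
    host:     url.host,
    port:     url.port,
    username: url.user,
    password: url.password,
    database: url.path.sub(%r{\A/}, "")  # drop the leading slash
  }
end

config = database_config(
  "DATABASE_URL" => "postgres://loc:secret@db.staging.example.com:5432/loc_staging"
)
puts config[:host]
puts config[:database]
```

Because `fetch` is used without a default, a missing variable raises immediately instead of letting the app connect to the wrong database.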

00:19:06.480 Jane feels confident about her work with Docker but is curious about how the simple Dockerfile she wrote fits into New Relic's work. New Relic has put significant effort into building shared Docker images, extending existing tools to streamline the development process with standard configurations.

00:19:51.520 The image called Base Builder captures standard practices, integrating components expected for Ruby development. They primarily use CentOS, due to team preferences, and the Base Builder installs Ruby versions using rbenv.

00:20:35.239 With Ruby installed, they can incorporate commonly used tools such as supervisord and Nginx into the base image alongside Bundler, so shared components stay consistent across Docker images.

00:21:06.600 However, bundling everything during the image building phase isn't straightforward, as dependencies vary by application. To mitigate this, Docker provides ONBUILD instructions, which are recorded in the base image and executed only when another Dockerfile builds from that image.

00:21:52.120 Using this method, they can streamline custom app integrations and provide one-liner scripts for configuring popular web services without requiring extensive modifications to existing images.

00:22:34.440 As Jane continues developing the lines of code service, she encounters issues writing files in the root directory. Getting permission denied errors, she seeks help from Jill, who explains that the application runs as 'nobody', a low-privilege user, for additional security.

00:23:10.040 By using this user, applications are better protected against exploits that may grant higher privileges. Jane resolves the issue by writing to an allowed directory, but her focus also shifts to writing tests with Docker in realistic environments.

00:24:00.720 A straightforward way to do this is to run alternate commands against the built image, such as 'docker run' with the command that runs the tests. However, this method requires rebuilding the image every time the application code changes, which is inefficient.

00:24:55.520 Fortunately, Docker supports options like mounting volumes, which allows Jane to run tests against the image while using her live code without rebuilding. This method ensures efficient testing at New Relic.
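
A volume-mounted test run could be assembled as in the sketch below. The `--rm`, `-v`, and `-w` flags are standard `docker run` options, but the helper name, the `/app` container path, and the `rake test` command are assumptions about how the image is laid out.

```ruby
# Build the argv for running the test suite inside a container while
# mounting the working copy over the application directory.
def test_command(image, source_dir, app_dir = "/app")
  [
    "docker", "run", "--rm",           # remove the container when tests finish
    "-v", "#{source_dir}:#{app_dir}",  # host path : container path
    "-w", app_dir,                     # run from the mounted directory
    image,
    "bundle", "exec", "rake", "test"
  ]
end

puts test_command("loc", Dir.pwd).join(" ")
# To actually run it: system(*test_command("loc", Dir.pwd))
```

Since the source directory is mounted at run time, edits to the code are visible to the next test run without another image build.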

00:25:36.599 As Jane develops further, she considers using Sidekiq for background processing, which requires Redis. Thankfully, an image that installs Redis is already available and configured through its environment, so she can provision it herself.

00:26:25.840 Using images parameterized through environment variables enables Jane to deploy services on her own, taking full advantage of the flexibility Docker offers. There's much discourse surrounding Docker, and tools like Centurion arose to meet needs at New Relic and beyond.

00:27:00.360 Docker Swarm aids in managing clusters of hosts, streamlining deployments for growing teams. Another tool that we are exploring is Mesos, which, in conjunction with Marathon, offers more dynamic scheduling for containers, adding resilience through automatic recovery for failed instances.

00:27:59.080 Kubernetes, a popular orchestration tool from Google, also enriches this ecosystem. Numerous advancements are underway in this space to enhance workflows and improve the developer experience.

00:28:39.040 In summary, we’ve followed Jane and Jill’s story demonstrating how to leverage Centurion for deploying and controlling multiple Docker hosts. We have discussed the benefits of using environment-driven configurations and strategies for building shared images to facilitate best practices across the organization. Furthermore, we have touched on security measures, testing approaches, and insights into potential future developments.

00:29:26.719 I hope you find successful paths using Docker in your own companies and are able to avoid common pitfalls. Thank you.

00:30:05.080 The question was where the Dockerfile lives, and typically, for us, it's located at the root of your Rails app. This convention has proven to be the simplest.

00:30:22.320 The next question asked was about the differences between Vagrant and Docker for workflows. Docker starts up containers quickly, while Vagrant relies on VM startups which can be slower.

00:30:45.720 As of our last count, we had a couple hundred services utilizing Docker internally. While we are not transitioning all applications, all new products in the last few years have deployed into our Docker cluster.

00:31:37.660 Concerning the deployment workflow, we build a new image to run tests before deploying. Our CI system, Jenkins, handles much of this; however, you'll often see direct command-line usage, which is handy for local testing.

00:32:20.000 When it comes to database migrations or asset compilation, asset compilation typically happens during the image build process. As for database migrations, these presently run outside of Docker. We use existing environments, running migration commands in Docker images that interact with external databases.

00:33:21.760 Migration processes need careful control to avoid breaking running instances, so we manage this conservatively. I’m running out of time, but I’d be happy to talk more to anyone interested afterward.