The Many Ways to Deploy Continuously

by Paul Biggar

In the talk titled "The Many Ways to Deploy Continuously," Paul Biggar, co-founder of CircleCI, explores the varied strategies for implementing continuous deployment (CD) in software development. Continuous deployment is not a one-size-fits-all solution; different companies adopt specific methods that cater to their unique requirements and workflows.

Key points discussed include:

Diverse Approaches: Companies deploy code continuously in different manners, reflecting their individual development speeds, team sizes, and product architectures. For instance, some may avoid deployments on Friday afternoons, while others opt for early morning schedules to minimize disruptions.
Deployment Complexities: Continuous deployment introduces complexities such as race conditions during code transitions, where old and new code may run simultaneously. For example, during migrations or updates, specific strategies like using symlinks to upload new code into a separate directory before switching can help minimize issues.
Database Migration Challenges: The speaker cites IMVU's pioneering efforts in continuous deployment, illustrating how they avoided downtime by implementing versioning systems for user tables instead of modifying tables directly. This approach helped eliminate locking issues during database migrations.
Facebook’s Deployment Practices: Facebook, while not continuously deploying like some companies, deploys updates daily and has sophisticated systems to manage data changes and their effects on users. Feature flags are used to disable new features until they are ready for wider release, allowing incremental testing with real users.
Testing and Monitoring: With continuous deployment, comprehensive testing and monitoring are critical. GitHub requires that all code be tested in the production environment before it is merged into the master branch. This keeps master deployable and mitigates risk.
Metrics and Rollbacks: Companies like IMVU utilize business performance metrics, such as conversion rates, to monitor user interactions and engagement. They can quickly roll back changes if critical metrics fall below expectations, thus maintaining service reliability.

Through this talk, Biggar emphasizes that while continuous deployment can present significant challenges, it also provides opportunities for increased efficiency and adaptability in software delivery. Companies must consider their specific environment and practices to effectively implement continuous deployment strategies. CircleCI continues to develop tools that aid in these processes, and they are actively hiring individuals interested in further exploration of CI/CD practices.

Overall, the presentation highlights the flexibility and risks of adopting continuous deployment and encourages audiences to share insights and approaches to enhance their deployment strategies.

00:00:20.760 All right, folks. I'm going to give a talk about the many ways to deploy continuously.

00:00:24.080 As was pointed out, Allan was supposed to give this talk, but rather selfishly he went off to Italy on his honeymoon. Fortunately, I know the material pretty well because Allan and I are co-founders of a company called CircleCI. CircleCI provides hosted continuous integration and deployment for Ruby apps, Node.js, and similar types of applications.

00:00:34.399 We have a couple of thousand developers as our paying customers, and about 42% of them are doing continuous deployment. This gives us the opportunity to observe a lot of different approaches to continuous deployment. We also spend a significant amount of time looking at what the future of our product will be by examining what larger companies like Facebook and GitHub are doing.

00:00:44.840 This talk is an overview of what we've learned through our research and by talking with our customers. Most talks you attend tend to provide answers, but this is not one of those talks. I’m not here to tell you the right way to implement continuous deployment, as I've learned that there is no one-size-fits-all solution.

00:00:57.320 Everyone we talk to does continuous deployment slightly differently. Some companies only deploy continuously during business hours, from Monday to Friday, but avoid deploying on Friday afternoons. Others might only deploy at 2 a.m. because that’s the only time they can afford to clear their cache.

00:01:04.960 If you're planning to adopt continuous deployment at your company, there’s likely a specific way you think it should be done. However, as you observe other companies, you'll see that they each implement it in their own unique manner.

00:01:11.760 What I'm aiming to do in this talk is not provide you with answers but examine the various factors that come into play with continuous deployment. These factors include the speed at which you're developing new features, the complexity of your code, the design and architecture of your software, and whether you follow a service-oriented approach or use a monolithic app structure.

00:01:24.760 Business priorities, the number of engineers on your team, and your overall state of mind will also impact how continuous deployment is implemented.

00:01:34.919 Most of the talk will focus on deployment in general, as continuous deployment is only a subset of all deployments. Many of the challenges you face in deployment won’t feel like problems if you only deploy monthly; you can afford a minute of downtime and still maintain reliable service. However, as the frequency of deployments increases—say, deploying 10 to 500 times a day—these issues become much more relevant.

00:01:42.120 There are numerous complexities involved with deployment. In the early days, people often deployed PHP applications by using FTP to a shared server.

00:01:55.160 The fundamental aspect of deployment is that your code lives on one machine, and you want to transfer it to a server so the code can run. A significant challenge arises during the transition period when files are being overwritten, leading to requests that may rely on both new and old code simultaneously.

00:02:10.840 This race condition occurs during every deployment; requests can be made while the old code is still running, resulting in unexpected behaviors if some files are replaced before others.

00:02:19.240 In the PHP days, advanced users implemented symlink strategies. They would upload new code into a separate directory and only switch the symlink to the new directory once all files were uploaded, minimizing the risks of race conditions.
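
The symlink switch described above can be sketched roughly as follows. This is a minimal illustration, not code from the talk: the `releases/` and `current` layout and the `activate_release` helper are hypothetical, and it assumes a POSIX filesystem where renaming one symlink over another is atomic.

```python
import os
import tempfile


def activate_release(base_dir: str, release_name: str) -> str:
    """Point base_dir/current at base_dir/releases/<release_name>.

    Hypothetical layout: every release is fully uploaded into its own
    directory before this function is called.
    """
    release_dir = os.path.join(base_dir, "releases", release_name)
    if not os.path.isdir(release_dir):
        raise FileNotFoundError(release_dir)

    current_link = os.path.join(base_dir, "current")
    temp_link = os.path.join(base_dir, f"current.tmp.{os.getpid()}")

    # Build the new symlink under a temporary name first.
    if os.path.lexists(temp_link):
        os.remove(temp_link)
    os.symlink(release_dir, temp_link)

    # Rename it over the live link. The rename replaces the link itself,
    # so any lookup through "current" resolves to the whole old release
    # or the whole new one, never a half-uploaded directory.
    os.replace(temp_link, current_link)
    return os.path.realpath(current_link)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as base:
        for name in ("v1", "v2"):
            os.makedirs(os.path.join(base, "releases", name))
            with open(os.path.join(base, "releases", name, "app.txt"), "w") as f:
                f.write(name)
            activate_release(base, name)
            with open(os.path.join(base, "current", "app.txt")) as f:
                print(f.read())
```

A request that is already in flight when the link flips can still load some files from each release unless it resolves the real path once at the start, which is one reason the race is reduced rather than eliminated.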

00:02:25.560 However, race conditions can arise in various aspects beyond just code changes; they can happen with database schemas, API versions, and other service dependencies.

00:02:36.320 Let’s explore a more modern approach to deployment, particularly platforms like Heroku, which face issues similar to those of the early PHP days.

00:02:46.079 In Heroku, when new code is pushed, user requests can either hit old versions or newly deployed code because the transition between the two still presents challenges.

00:02:56.880 Moreover, when changes to database schemas occur, we must ensure that the new code can handle both the old and the new schema until we've completed the migration.

00:03:05.760 A practical solution to these challenges is to deploy an intermediate version of your app that understands both schemas and can work with either during the transition period.

00:03:17.040 If you do this successfully, you can run your migration while the app keeps serving requests, and you avoid the breakage that comes from switching code and schema in a single step.
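
One way to picture such an intermediate version is a read path that inspects which columns exist and copes with either layout. The sketch below is an assumption-laden illustration using SQLite: the `users` table, the move from a single `name` column to `first_name` and `last_name`, and the `display_name` helper are all hypothetical and not taken from the talk.

```python
import sqlite3


def user_columns(conn: sqlite3.Connection) -> set[str]:
    """Return the column names currently present on the users table."""
    return {row[1] for row in conn.execute("PRAGMA table_info(users)")}


def display_name(conn: sqlite3.Connection, user_id: int) -> str:
    """Read a user's name under the old, the new, or the in-between schema."""
    known = ("name", "first_name", "last_name")
    wanted = [c for c in known if c in user_columns(conn)]
    row = conn.execute(
        f"SELECT {', '.join(wanted)} FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    if row is None:
        raise KeyError(user_id)
    data = dict(zip(wanted, row))

    # Prefer the new columns once they exist and have been backfilled.
    first, last = data.get("first_name"), data.get("last_name")
    if first is not None and last is not None:
        return f"{first} {last}"
    # Otherwise fall back to the old single column.
    return data["name"]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("INSERT INTO users VALUES (1, 'Ada Lovelace')")
    print(display_name(conn, 1))  # old schema

    conn.execute("ALTER TABLE users ADD COLUMN first_name TEXT")
    conn.execute("ALTER TABLE users ADD COLUMN last_name TEXT")
    print(display_name(conn, 1))  # mid-migration, not yet backfilled

    conn.execute("UPDATE users SET first_name = 'Ada', last_name = 'Lovelace'")
    print(display_name(conn, 1))  # backfilled
```

Once every row is backfilled and no old code is running, a later deploy can drop the fallback branch and the old column.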

00:03:30.160 Moving on, two tricky topics in deployments are data migrations and table locking. When changing your schema, database tables can become locked, leading to performance issues and downtime that can be quite detrimental to business operations.

00:03:44.300 For instance, IMVU, a pioneer in continuous deployment, faced significant downtime whenever they changed their database schema. To address this, they implemented a versioning system for their user tables, creating new tables whenever schema changes were needed, thus avoiding table locks.

00:03:56.680 They never modified existing tables directly, ensuring they could continuously serve users while migrating data without locking issues.

00:04:09.500 Another option for schema changes is to use a database that does not lock tables for them, like MongoDB, which does not enforce a fixed schema and so lets old and new document shapes coexist.

00:04:22.040 Now, let's shift our focus to how Facebook handles its deployment processes. While they don't practice continuous deployment in the same way as some companies, they do deploy once a day and have sophisticated mechanisms in place to manage their large volumes of data.

00:04:34.080 Facebook has effectively solved similar problems to those faced by IMVU, allowing them to migrate data gradually without significant downtime.

00:04:48.200 Deploying updates can often lead to race conditions where both old and new data coexist, demanding a codebase that can handle the variability in data structures.

00:05:02.160 With the advent of single-page JavaScript applications, we must also deal with the additional problem of asset races. When you deploy new front-end assets, they may not line up with the API that is available at that moment if the two deployments aren't synchronized.

00:05:12.840 This issue compounds itself, as the assets may rely on features that the backend APIs don't yet provide, causing further complications in user experiences.

00:05:24.480 To combat this, the best practice is to roll out the code that is depended on (for example, the backend API) before the code that depends on it (for example, the front-end assets), thus ensuring that the dependent features are functional at the time users encounter them.
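
The ordering rule can be expressed as a topological sort over a dependency map. The following is a small sketch using Python's standard `graphlib` module (Python 3.9+); the component names and the `deploy_order` helper are hypothetical examples, not something named in the talk.

```python
from graphlib import TopologicalSorter


def deploy_order(depends_on: dict[str, set[str]]) -> list[str]:
    """Return components ordered so each one deploys after its dependencies.

    depends_on maps a component to the set of components it relies on.
    graphlib raises CycleError if the dependencies are circular.
    """
    return list(TopologicalSorter(depends_on).static_order())


if __name__ == "__main__":
    order = deploy_order(
        {
            "web-assets": {"api"},          # the front end calls the API
            "api": {"database-migration"},  # the API needs the new schema
        }
    )
    print(order)  # ['database-migration', 'api', 'web-assets']
```

Rolling back would follow the reverse order, removing the dependents before the things they depend on.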

00:05:43.760 As we discuss continuous deployment, it's important to recognize that the obstacles encountered with regular deployments are magnified when you deploy continuously, because every deployment is another chance to hit them.

00:05:57.640 Monitoring and testing become key components in managing these complex challenges. Testing can be difficult, as it's hard to anticipate every combination of versions that may be running concurrently once deployed.

00:06:07.760 One approach is to conduct tests privately on staging servers before going live, but GitHub has taken it a step further by ensuring that all code merged into master must have been tested in the production environment first.

00:06:20.000 This practice ensures that their master branch is always ready for deployment and tested against a small subset of real users, adding a layer of safety.

00:06:32.960 Similarly, Facebook deploys all code to customers but uses feature flags to disable new features until they’re ready for wider release, allowing for gradual rollout and testing.

00:06:47.840 Feature flags can be finely tuned to allow specific user groups access, enabling them to monitor responses closely before activating for a broader audience.
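
A feature flag with group targeting and a percentage rollout can be sketched as below. This is a generic illustration and not Facebook's system: the `FLAGS` table, the `new_checkout` flag, and the `is_enabled` and `bucket` helpers are all hypothetical.

```python
import hashlib

# Hypothetical flag configuration: which groups always see the feature,
# and what percentage of everyone else does.
FLAGS = {
    "new_checkout": {"groups": {"staff"}, "percent": 10},
}


def bucket(flag: str, user_id: int) -> int:
    """Map a (flag, user) pair to a stable bucket from 0 to 99."""
    digest = hashlib.sha256(f"{flag}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100


def is_enabled(flag: str, user_id: int, user_groups, flags=FLAGS) -> bool:
    """Decide whether this user should see the flagged feature."""
    config = flags.get(flag)
    if config is None:
        return False  # unknown flags are off by default
    if config["groups"] & set(user_groups):
        return True  # targeted groups always get the feature
    # Everyone else is admitted by stable bucket, so a given user keeps
    # the same answer from one request to the next.
    return bucket(flag, user_id) < config["percent"]


if __name__ == "__main__":
    print(is_enabled("new_checkout", 42, ["staff"]))
    print(is_enabled("unknown_flag", 42, ["staff"]))
```

Because the code ships dark, raising `percent` is a configuration change rather than a deployment, and setting it back to 0 turns the feature off without reverting any code.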

00:07:00.480 In summary, the complementary strategies of testing and monitoring work together to address the challenges that arise from continuous deployment. This holistic approach can help you manage risks and improve reliability in production.

00:07:15.360 It’s vital to look beyond traditional monitoring techniques to embrace broader metrics that help gauge user engagement and business performance.

00:07:20.880 This can include factors like conversion rates and user interactions, allowing you to automatically roll back changes should metrics fall below expected thresholds.

00:07:28.000 For instance, IMVU monitored business metrics closely, using their insights to identify when confusion arose due to poor UX decisions, such as using a white buy button on a white background that users couldn’t see.

00:07:41.000 They quickly integrated business performance as a metric for their deployment success, allowing them to revert changes swiftly when needed.
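
A rollback trigger driven by a business metric might look roughly like this. It is a sketch under stated assumptions: the `Window` type, the 20% relative-drop threshold, and the minimum-traffic guard are hypothetical choices for illustration, not figures from IMVU or from the talk.

```python
from dataclasses import dataclass


@dataclass
class Window:
    """Traffic observed over some period, before or after a deploy."""

    visitors: int
    purchases: int

    @property
    def conversion_rate(self) -> float:
        return self.purchases / self.visitors if self.visitors else 0.0


def should_roll_back(
    baseline: Window,
    current: Window,
    max_relative_drop: float = 0.2,
    min_visitors: int = 500,
) -> bool:
    """Return True when conversion fell too far below the baseline."""
    if current.visitors < min_visitors:
        return False  # too little traffic since the deploy to judge
    if baseline.conversion_rate == 0:
        return False  # nothing meaningful to compare against
    drop = (
        baseline.conversion_rate - current.conversion_rate
    ) / baseline.conversion_rate
    return drop > max_relative_drop


if __name__ == "__main__":
    before = Window(visitors=10_000, purchases=500)  # 5% conversion
    after = Window(visitors=1_000, purchases=20)     # 2% conversion
    print(should_roll_back(before, after))  # True
```

A check like this would catch the invisible buy button case, where every technical health metric looks fine but purchases stop.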

00:07:53.600 To conclude, I hope this discussion has provided insights into various deployment strategies and the unique challenges they present. Continuous deployment doesn’t have to be daunting; rather, it opens new paths for efficiency and adaptation.

00:08:04.720 If this topic excites you, CircleCI is hiring! You can find us online at jobs.circleci.com.

00:08:07.920 I would love to hear from anyone with additional insights on continuous deployment strategies or any efficient transport mechanisms.

00:08:12.200 Thank you very much for listening. I appreciate your time and attention, and I hope you enjoyed exploring the many ways to deploy continuously.

00:08:21.300 Excellent.

MountainWest RubyConf 2013