Automation in Deployment on Hybrid Hosting and Private Cloud Environments

by Fletcher Nicol, Matthew Cooker, Dr. Nic Williams, James Casey, Andy Camp, and Eric Lindvall

Summary of "Automation in Deployment on Hybrid Hosting and Private Cloud Environments"

In this panel discussion at RailsConf 2013, the speakers, including Fletcher Nicol and several industry experts, delve into the complexities of deploying scalable applications in hybrid hosting and private cloud environments. The session extends into the interplay between automation and deployment, highlighting both the challenges faced and strategies adopted by modern developers.

Key Points Discussed:

  • Introduction of Panelists: Each panelist introduces themselves, outlining their backgrounds at companies such as Pivotal Labs, Opscode, and others, which establishes their relevance to the topic.
  • Industry Landscape: The discussion begins with an acknowledgment of the diverse deployment options available, emphasizing the balance between virtualized and bare metal servers and the challenges they pose for developers.
  • APIs for Deployment: The need for efficient APIs around hardware and software interactions is highlighted, noting how this capability can streamline deployment processes. The consensus among panelists is that developers desire simplified systems for application resource allocation.
  • Case Studies and Experiences:
    • Blue Box: As a hosting provider, Blue Box shares insights from its many customers about their diverse approaches to deployment, particularly hybrid models.
    • King of the Web: Discusses leveraging APIs for scaling operations during high traffic events, detailing a successful implementation using New Relic and Chef.
  • Continuous Integration and Deployment (CI/CD): Emphasis on the importance of CI/CD frameworks for automating deployment, with mentions of tools like Jenkins and Selenium that facilitate smooth transitions through various testing stages.
  • Infrastructure as Code: The panel discusses adopting a coding approach to infrastructure management, which enhances consistency and repeatability in deploying applications.
  • Testing and Monitoring: The necessity of proactive monitoring and smoke tests to validate deployments while retaining human oversight in critical decisions is underlined.

Important Takeaways:

  • The shift towards automation in deployment is essential but must be approached with careful planning and a blend of human oversight to mitigate risks.
  • Effective API management and proper configuration management tools are vital for achieving agile and reliable deployment processes.
  • Continuous efforts in adapting to the changing environment of cloud services and infrastructure upgrades are necessary to maintain efficiency.
  • Building environments that mirror production environments from the outset can prevent issues during deployment phases.
  • The importance of visibility and predictability in resource provisioning remains a critical concern among developers in hybrid and multi-cloud scenarios.
00:00:12.259 Thank you, so hey.
00:00:17.820 Um, we're going to be talking about deployment and automation, and honestly, like, lots of buzzwords in this title.
00:00:25.619 I am all about the buzzwords. All the buzzwords, yeah, we did that.
00:00:31.679 So I'm Fletcher Nicol, a software engineer at Blue Box out of Seattle. While my name's on here for this talk, I'm hoping to actually not do a whole lot of talking.
00:00:39.600 There's an amazing panel down here that can talk much more at length about these things than I can.
00:00:51.180 I thought what I'd do is just kind of go down the row and have each panelist introduce themselves because they can do that far better than I can.
00:01:02.520 Great, thanks! I'm Matthew Coker, a developer at Pivotal Labs. I've been working on Cloud Foundry.
00:01:09.600 Now I work at Pivotal Labs, we're a division of Pivotal. It's very confusing frankly, but just call it Pivotal and we're fine.
00:01:15.119 We are now carrying forward the development of Cloud Foundry, which is an open-source platform as a service.
00:01:21.360 I'm leading the engineering effort there, so we're putting a huge amount of engineering resources behind it to make it the deployment platform that anyone uses for deploying apps.
00:01:27.360 My name is Doc Nic. I used to be the VP of engineering on technology at Engine Yard, now I'm doing consultancy around Cloud Foundry.
00:01:38.579 So I am in charge of harassing Matt pretty much; it's an awesome job.
00:01:44.700 I have a lot of fun, especially when we're actually harassing each other, because I lose every time.
00:01:50.460 I'm really looking forward to winning a couple of points today in front of the audience, so we can keep track of it.
00:02:00.720 Anyway, my name is James Casey. I'm a development lead at Opscode, where I work on Chef, mostly on the backend servers.
00:02:07.079 Before that, I worked at BankSimple here in Portland, deploying an online bank with Ruby, JRuby, Scala, and Java into EC2.
00:02:13.319 My name is Andy Camp, and I'm a development manager at King of the Web.
00:02:18.840 It's an awesome name! King of the Web is a startup built around online video contests, where we give a cash prize to whoever gets the most votes.
00:02:31.520 It's centered around independent web video creators. Before that, I was at Hark.com, which is a site centered on audio.
00:02:38.160 I started at Hark with hardware on rented rack space, and I've deployed on raw EC2, Engine Yard, Blue Box, and personal sites on DreamHost, so I've had experience with a variety of different deployment strategies.
00:02:59.060 My name is Eric Lindvall. I'm a co-founder of Papertrail. We help you see what's going on in your server and application logs and help you do something about it.
00:03:05.340 I don't have anything else to say.
00:03:10.500 So I think what we're trying to do is get a mix of people involved with businesses that are deploying their applications as well as some more platform and tool makers.
00:03:29.420 We want to see where things are good and where things are not so good. The best-case scenario is that everybody's business gets bigger and better.
00:03:46.200 But then, you know, come all these complications. Let's talk about city starts and automating failing businesses.
00:03:56.540 Failing businesses they want to admit to. Well, let's talk about that. I'm just curious to see what everybody's into these days regarding solving this problem at least for themselves.
00:04:20.699 Speaking for myself from Blue Box, as a hosting provider, we get to see lots of different customers and ways that they do deployment.
00:04:39.000 We're also a hybrid model of virtualized resources and bare metal servers, which gives us the additional challenge of helping customers provision their applications on resources.
00:04:58.440 Sometimes those resources can be provisioned over API and then you have these actual physical servers.
00:05:18.120 That can be quite the challenge as you grow your business, especially when dealing with more of these physical servers.
00:05:34.800 What I would like to do is put APIs around that and just treat that hardware like software.
00:05:41.940 We've been looking into how we can make our lives better internally on our platform as well as extending that out to customers.
00:05:59.280 The idea is that if you know the requirements of your application, you just want to check boxes or make API calls.
00:06:06.000 It would be really nice if all that was much more opaque.
00:06:11.580 But the reality is that it's a very hard thing to do when you're trying to do one or the other.
00:06:20.699 Cool! I think I'd sort of echo the sentiment that what developers really want is APIs around things they're going to have to interact with.
00:06:30.840 Over the last few years, Heroku has shown, at least to many people in the Ruby community, that a PaaS can actually solve the needs of how they're going to deploy their applications, for the vast majority of them.
00:06:37.259 Unfortunately, we've also learned that US East is not the best place to host your application.
00:06:51.840 Everybody else is understandably there, it's where the cloud is.
00:07:04.139 I recognize that Heroku is working on being in multiple regions, and I love their commitment to get out of Virginia.
00:07:11.580 We're going to Europe!
00:07:27.660 I think what that shows is that the developers want an interface where they can specify their needs: here's my code, here's what resources it needs, it needs a database, a message bus, etc.
00:07:41.580 They want to specify those needs and have the system run it automatically.
00:08:04.680 Thus, we at Pivotal saw Cloud Foundry as the platform that would allow us to cater to customers who were unhappy with being in Amazon or other closed solutions.
00:08:19.740 So, where Fletcher's working on those lower-level APIs that offer capabilities like a load balancer or an instance, we're focusing on much higher-level APIs.
00:08:31.200 For instance, here's an application, here's the code, run it 100 times on different servers and distribute the load equally.
00:08:52.420 It's essentially the same issues being approached from different levels.
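The higher-level interface described here (declare the code and the resources it needs, ask for a number of instances, and let the platform spread the load) can be illustrated with a minimal sketch. This is not Cloud Foundry's API: `AppSpec`, `plan_placement`, and the host names are hypothetical, and round-robin is just one simple way to distribute instances evenly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AppSpec:
    """What the developer declares: the code plus the resources it needs."""
    name: str
    code_path: str
    instances: int = 1
    services: tuple = ()  # e.g. ("database", "message-bus")


def plan_placement(spec, servers):
    """Spread the requested instances across servers round-robin."""
    if not servers:
        raise ValueError("no servers available")
    placement = {server: 0 for server in servers}
    for i in range(spec.instances):
        placement[servers[i % len(servers)]] += 1
    return placement


# Hypothetical app: 100 instances that need a database and a message bus.
spec = AppSpec(name="shop", code_path="./shop", instances=100,
               services=("database", "message-bus"))
placement = plan_placement(spec, ["host-a", "host-b", "host-c"])
print(placement)  # {'host-a': 34, 'host-b': 33, 'host-c': 33}
```

The developer only states what is needed; where each instance lands is the platform's decision.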
00:09:11.160 I work full-time with a large enterprise company involved in the music business. There's a big investment at the moment in redoing our approach to deploying.
00:09:30.420 Cloud Foundry comprises a crucial part of that.
00:09:45.030 Even though some larger groups might not want to lean on the talents of their app developers for this, we realize that automation has to be provided as part of the solution.
00:10:00.300 Nonetheless, we also incorporate Jenkins on top of that to ensure that CI can trigger deployment into a Cloud Foundry environment for testing.
00:10:22.080 By running Selenium under that same mechanism, we are capable of moving apps smoothly from development, QA, and integration testing to staging and production.
00:10:40.800 If you've worked in a smaller agile environment, you might be lucky not to worry about changing production until customers complain, but larger organizations often have strict rules against this.
00:11:06.000 Instead, they strive never to change production and keep failing during staging.
00:11:28.500 But we have all the automation developed around that.
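The stage-by-stage promotion described here can be sketched as a gate per environment. This is an illustrative model only, not the panelists' Jenkins setup: the `promote` function, the `checks` mapping, and the build dictionary are hypothetical, and in practice each check would be a CI job such as a Selenium run.

```python
STAGES = ["development", "qa", "integration", "staging", "production"]


def promote(build, checks):
    """Walk a build through STAGES in order, stopping at the first failure.

    `checks` maps a stage name to a callable that returns True on success.
    Stages without a check are treated as passing. Returns the stages passed.
    """
    passed = []
    for stage in STAGES:
        check = checks.get(stage, lambda build: True)
        if not check(build):
            break
        passed.append(stage)
    return passed


# A build whose browser tests fail in integration never reaches staging.
checks = {"integration": lambda build: build["browser_tests_pass"]}
print(promote({"id": 42, "browser_tests_pass": False}, checks))
# ['development', 'qa']
```

Because the stages are ordered, a failure is caught in the earliest environment that can detect it, which is the point of keeping failures out of production.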
00:11:43.860 Another level of automation we must address relates to upgrading Cloud Foundry itself. Having ways to deploy your apps is great, but we also must adapt the underlying infrastructure.
00:12:06.180 The changes don't come as swiftly for the infrastructure beneath the app as they do for the app itself.
00:12:22.740 But eventually, changes like upgrading off MySQL 5.0 or moving towards better standards become critical.
00:12:40.500 BOSH, which the Cloud Foundry group developed, is certainly something worthwhile.
00:12:57.300 I think the first recommendation is to use configuration management tools, whatever they may be.
00:13:12.660 Ultimately, we are discussing agility in business, reacting quickly to changes, including small adjustments like an app upgrade.
00:13:28.420 Nonetheless, specific aspects of configuration management and how we conceive of them play into the concept of 'Infrastructure as code'.
00:13:49.320 As developers and engineers, approaching infrastructure from a coding perspective allows us to standardize changes and apply them consistently.
00:14:10.900 Imagine if the framework existed where you could improve upon your CI/CD workflow by having networking as code.
00:14:30.680 This major shift empowers developers to handle infrastructure and makes it possible to experiment with it similarly to how we interact with our code.
00:14:46.020 Ultimately, this puts infrastructure in the hands of developers rather than relying solely on the specialized roles that sometimes emerge.
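One way to picture "infrastructure as code" is a desired state kept in version control plus a converge step that is safe to run repeatedly. The sketch below is a toy model under that assumption, not Chef's implementation; `converge`, `apply`, and the package versions are hypothetical.

```python
def converge(current, desired):
    """Return the actions needed to move `current` state to `desired` state.

    Both arguments map a resource name (say, a package) to a version string.
    """
    actions = []
    for name, version in sorted(desired.items()):
        if current.get(name) != version:
            actions.append(("install", name, version))
    for name in sorted(set(current) - set(desired)):
        actions.append(("remove", name, current[name]))
    return actions


def apply(current, actions):
    """Apply actions to a copy of `current` and return the new state."""
    state = dict(current)
    for verb, name, version in actions:
        if verb == "install":
            state[name] = version
        else:
            state.pop(name, None)
    return state


desired = {"nginx": "1.4", "ruby": "2.0"}   # lives in version control
server = {"nginx": "1.2", "telnet": "0.17"}  # what the machine has today
server = apply(server, converge(server, desired))
assert converge(server, desired) == []  # a second run finds nothing to do
```

The second run producing no actions is what makes changes repeatable: the same description can be applied to one server or fifty with the same result.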
00:15:01.080 From a customer standpoint, at King of the Web, we faced the challenge of being a contest that gives away cash prizes.
00:15:17.220 From the beginning, we wanted to maintain public vote counts so we could know the winner immediately after the contest ends.
00:15:32.520 That being said, we discovered that content didn’t always win; personalities prevailed.
00:15:48.720 These personalities, often YouTubers, wield immense social influence, generating 10 to 50 times their baseline traffic in about 15 minutes.
00:16:06.960 Additionally, we would detect attempts to manipulate the voting system and create multiple accounts, but we accounted for this as well.
00:16:24.300 We planned for scaling from the outset, but with the new challenge of handling high traffic, we had to quickly bring up 20 servers.
00:16:49.740 This process actually went remarkably smoothly for us, as we utilized New Relic's API to detect traffic, Blue Box's API to create servers, and Chef to configure those servers once they were live.
00:17:07.440 We were able to image our existing servers and scale those services quickly.
00:17:27.220 This made it effective, especially when implementing a custom Rails app that utilized rake tasks for automation.
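The flow described here (a monitoring API reports traffic, a hosting API creates servers, configuration management sets them up) can be sketched as below. The real New Relic, Blue Box, and Chef calls are deliberately not shown: `create_server` and `configure_server` are injected stand-ins, and the traffic and capacity figures are invented for illustration.

```python
import math


def servers_needed(requests_per_min, capacity_per_server, minimum=2):
    """Smallest server count that covers the load, never below `minimum`."""
    return max(minimum, math.ceil(requests_per_min / capacity_per_server))


def scale_up(servers, requests_per_min, capacity_per_server,
             create_server, configure_server):
    """Create and configure servers until the fleet covers the load.

    `create_server` and `configure_server` are injected callables standing in
    for the hosting API and the configuration management run.
    """
    target = servers_needed(requests_per_min, capacity_per_server)
    fleet = list(servers)
    while len(fleet) < target:
        server = create_server()
        configure_server(server)
        fleet.append(server)
    return fleet


# Hypothetical stand-ins so the sketch runs without any network access.
created = []
configured = []


def fake_create():
    name = f"web-{len(created) + 3}"
    created.append(name)
    return name


fleet = scale_up(["web-1", "web-2"], requests_per_min=22_000,
                 capacity_per_server=1_000,
                 create_server=fake_create,
                 configure_server=configured.append)
print(len(fleet), len(created))  # 22 20
```

Injecting the two callables keeps the scaling decision testable on its own, separate from whichever provider actually supplies the machines.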
00:17:41.460 As we tread through today's cloud landscape, we've also had to manage the challenge of configuration changes and of testing those changes as we scale.
00:17:56.960 Having static environments to copy enabled us to effectively manage changes in our NGINX config, Resque workers, etc.
00:18:05.760 Identifying these potential changes in a staging environment as opposed to production has posed a challenge.
00:18:20.940 However, I believe that making progress in tools like the Chef client will help automate these processes in the future.
00:18:35.640 An excellent point was raised about the need for APIs to deploy physical servers.
00:18:48.420 I think the key aspects that people appreciate are the visibility of resource availability and the repeatability of the provisioning process.
00:19:07.200 That predictability adds substantial value, greatly desired in deploying physical servers today.
00:19:20.099 That's a solid point. Having a consistent turnaround time improves the ability to identify bottlenecks with more assertiveness.
00:19:34.680 Even if things aren’t perfectly deterministic, it's about ensuring delivery time is consistent.
00:19:52.380 An API without those deterministic aspects isn't necessarily going to alleviate the underlying issues developers hope to solve.
00:20:11.599 Has anyone here extensively used the Amazon API? It's not just about having experience; it’s about automating it.
00:20:30.240 Unfortunately, it lacks determinism, which makes automating against it error-prone because of the multiple endpoints involved.
00:20:46.860 Imagine requesting a server and wondering if it's done yet; it takes forever to receive strong callbacks.
00:21:02.579 You could hit an endpoint expecting it to be operational, but what if it says that server never existed?
00:21:24.000 That abstraction can really create hurdles when automating deployments.
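A common way to cope with an API that may briefly claim a freshly requested server "never existed" is to poll with a deadline and treat "not found" as "not visible yet". The sketch below assumes a hypothetical `get_status` callable and `NotFound` exception; it does not use any real cloud SDK.

```python
import time


class NotFound(Exception):
    """Raised by the hypothetical API when it does not know the server yet."""


def wait_until_ready(get_status, server_id, timeout=300.0, interval=5.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll until the server reports "running" or the deadline passes."""
    deadline = clock() + timeout
    while True:
        try:
            if get_status(server_id) == "running":
                return True
        except NotFound:
            pass  # treat "never existed" as "not visible yet" and keep polling
        if clock() >= deadline:
            return False
        sleep(interval)


# Simulated API: unknown at first, then pending, then running.
responses = iter([NotFound(), "pending", "running"])


def fake_status(server_id):
    response = next(responses)
    if isinstance(response, Exception):
        raise response
    return response


print(wait_until_ready(fake_status, "srv-1", sleep=lambda seconds: None))  # True
```

The deadline matters as much as the retry: without it, a server that truly was never created would leave the automation polling forever.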
00:21:41.640 In Portland, we have a data center—let's go down and visit our local Amazon personnel.
00:21:50.520 I guess with the business application developer hats on, what do we do when things inevitably go wrong?
00:22:04.380 Even if everything is automated, that doesn't guarantee a perfect outcome.
00:22:19.740 Automation exists to limit human error but still requires oversight.
00:22:29.940 I'm curious if anyone has ideas on tackling that strange balance between deploying flawless and efficient software.
00:22:49.920 Exactly, for us, it's about adopting strategies for scaling that may include scaling up by at least six servers.
00:23:04.980 By managing redundancy, we ensure there's always more capacity than needed.
00:23:17.880 It's imperative to trust automation while still allowing room for a human to intervene.
00:23:32.160 This might mean ensuring smoke tests are regularly being run to validate the deployment.
00:23:46.440 Meeting pre-determined thresholds, along with proactive monitoring, is essential.
00:24:00.300 Achieving a balance between automation for efficiency and retaining human touch for critical decisions is the way to go.
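The balance described here, where automation runs the checks and a person makes the critical call, can be sketched as a post-deploy review. The `review_deploy` function, the check names, and the 1% error-rate threshold are all hypothetical choices made for illustration.

```python
def review_deploy(smoke_results, error_rate, max_error_rate=0.01):
    """Check a deploy and decide whether a human needs to look at it.

    `smoke_results` maps a smoke test name to True (passed) or False.
    Returns a (status, reasons) pair; status is "ok" or "page_human".
    """
    reasons = [f"smoke test failed: {name}"
               for name, passed in sorted(smoke_results.items()) if not passed]
    if error_rate > max_error_rate:
        reasons.append(
            f"error rate {error_rate:.1%} above {max_error_rate:.1%}")
    return ("page_human" if reasons else "ok"), reasons


status, reasons = review_deploy({"home": True, "login": False}, 0.03)
print(status)
for reason in reasons:
    print(" -", reason)
```

The automation only gathers evidence and raises the alarm; what to do next, such as rolling back or pushing a fix, stays with the person who is paged.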
00:24:21.420 But don't underestimate the importance of configuration management tools.
00:25:00.940 Without documentation, it becomes virtually impossible to understand that what you're doing is valuable.