MountainWest RubyConf 2009
Practical Puppet: Systems Building Systems

by Andrew Shafer

In the presentation titled "Practical Puppet: Systems Building Systems," Andrew Shafer discusses the application of Puppet for system automation and infrastructure management, particularly in relation to deploying Ruby on Rails applications in cloud environments like Amazon EC2. The session aims to illustrate how Puppet facilitates efficient server configuration and maintenance by automating the installation and management of software dependencies.

Key points discussed include:
- Introduction to the Demo: Shafer starts with a live demonstration of creating an Amazon Machine Image (AMI) that leverages Puppet to automate Rails installation via user data scripts.
- Historical Context of Computing: The talk reflects on the evolution of computing, from early computers operated by mathematicians to today's complex systems increasingly influenced by cloud technologies.
- Puppet as a Configuration Tool: Shafer explains that Puppet provides an external domain-specific language (DSL), implemented in Ruby, for managing configurations, emphasizing idempotent operations that help ensure consistency across deployments.
- Cloud Computing and Rails: He outlines the challenges of deploying Ruby on Rails applications, including the need for various components like Ruby, gems, Apache, and MySQL, and discusses how Puppet simplifies orchestrating them.
- Importance of Testing: Shafer advocates for continuous integration in infrastructure with testing frameworks like RSpec to validate Puppet code, emphasizing the necessity for reliable deployment practices as applications scale.
- Evolution of Infrastructure Management: The talk highlights how modern infrastructures can be spun up quickly and efficiently on cloud platforms, arguing that this rapid provisioning opens opportunities for innovation without significant upfront costs.
- Q&A Session: The presentation concludes with a Q&A segment addressing practical challenges related to package management and infrastructure updates, demonstrating real-world considerations in using Puppet.

Overall, Shafer's talk emphasizes the transformative effects of automation on system management and the collaborative roles of developers and sysadmins in maintaining robust infrastructures. The primary takeaway is that with automated systems and efficient practices, organizations can enhance their capability to innovate and respond to market dynamics swiftly.

00:00:12.519 What's up, MountainWest? Try to wake up after lunch. Who here is using EC2? Who has an EC2 account? This is a way you can participate in this little demo.
00:00:21.119 Basically, I made an AMI, and if you send it a little bit of user data, which I'm going to send right here, but you can also get it from my Twitter account; it's 'littleidea' on Twitter. The thing about it is you have to add a line break, because Twitter gets rid of the carriage returns. You have to add one or else it won't run as a script. When you send this user data, it's going to do some stuff, and we're going to walk through it. It also looks like this: if you want to start an AMI, if you search for 'littleidea' or 'Mountain West Ruby Conference,' you should find it there. There should only be one match for that, and you just send that little bit of user data, and it will build Rails for us.
00:01:08.960 That's what we're going to do, so I'm going to start this from the command line so we'll get logs, and I'll start another one here. Oops! Say again. Hold on, I'm not getting the right window. There we go.
00:01:20.520 Somehow I cut off the bottom of my Firefox when I changed the settings, so let's try this.
00:01:24.760 How can I even close it now? Alright, so let's actually just give it a go. This is the user data we're going to launch. It should come up when the public DNS entry gets up. I'll put that in Twitter too, and then we'll see this one already building.
00:01:32.080 We have some slides to get through in the next twenty-eight minutes, so let's try that. Puppet: Systems Building Systems. Yay!
00:01:35.800 So, what I'm going to talk about today is testing all the time. I'd say something else, but I'm in Salt Lake, so I'm going to try to show respect. Not really; it's going to be different. We'll see what I'm really going to talk about, which is some code. I'm going to talk about tribes. If you've seen me talk before, I love talking about tribes, but we'll get there. I'm going to talk about the dawn of time, clouds, evolution, and opportunity, not necessarily in that order.
00:02:00.080 I'm Andrew Shafer. I have two little sons, I like math, and I work for Reductive Labs. I mostly work on Puppet and Puppet-related stuff. So, let's start at the beginning. In the beginning, there was nothing. Before that, there were machines that talked to cards. How do you think Ruby would run on that?
00:02:30.200 Back in the day, there was a period of time when the only people who really ran computers were people who wanted to do computation. So, you have this brief period where the people that run the computers were the only ones that understood them. They were the only ones that programmed them, and there was no real computer science. Essentially, it was mathematicians and physicists building computers, and that lasted for about two minutes before they were separated.
00:03:00.840 Then, the internet became dominated by porn. What are you going to do? So, let's look at the code. This is the code that's running in the background as we speak. It takes about fifteen minutes to build the image; if we did some other things, you could get it down to way less—about four or five minutes.
00:03:09.440 That's what we just put in as the user data. This is Puppet code, which isn't Ruby code, although Puppet itself is written in Ruby. We love Ruby! But as an external DSL, I mean, come on. It's Ruby! You can write beautiful API DSLs, it's all good. What the hell's going on here? This can't be happening! But I’m pleased to say, I saw it all over the place when this award was announced for submissions. We submitted to the Fukuoka Ruby Award, and we won an excellence prize, which is basically second place, for the Puppet project.
00:03:38.800 One of the main judges was Matz. I was pretty proud of that. The first place was a Korean group building a Ruby project to model the climate. I don’t feel bad losing to that project at all. Now we’re back to this thing, right? What is that going to do? Well, let's walk through the Puppet code that actually does all this: the modules that are building this. I pre-loaded them, so when it gets the user data, we’ll talk about clouds a little bit more later.
00:04:10.680 It's going to take the user data, and basically, just run through those modules and build out the site. So, that's the code we're looking at now. This is the definition of that code for the Rails site, and it's going to do a bunch of stuff. Who here works on Rails stuff? Raise your hand. Like, pretty much everyone! Who's configured Rails? How many moving parts does Rails need to run?
00:04:51.600 More than two? Yeah. There are all these things about joy and optimizing for joy. On one hand, you have Ruby, which I think is optimized for developer joy but certainly not for operational joy and keeping machines up. You have all these choices, and you have all these configurations. It can become a real pain. You basically have to go through and set up certain things. You need Ruby, right? You need gems to install Rails easily.
00:05:13.120 You need Apache because we’re going to use Passenger. That's what the cool kids are using, I hear. I don’t know. Then, we’re going to use MySQL, and we’re going to install Rails and do some stuff. So we’re going to walk through all this code real quick, and then we’re going to get back to the story.
00:05:36.440 If you guys saw my Twitter, I put the public URL for the one we just started. It’ll take a little while, then Apache will come up. You’ll see the 'It works!' page, and then after a little while, it’ll build Rails, and we’ll have the base install of a Rails page.
00:06:07.760 So, we’re going to include Ruby here. When you include something, you’re going to look in the module path and find the class. This is basically telling Puppet to install these packages, and in this case, we didn’t provide a provider. So, you can choose different package providers, and we’ll see that in a minute.
00:06:40.920 These are going to use the base install, and this image happens to be Debian. So, that’s essentially going to end up being 'apt-get install.' If we go look at the logs, they’re being generated right now. It’s going to 'apt-get install' these Ruby packages like you’d expect.
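The manifest on screen is not reproduced in the transcript. As a rough sketch of the idea just described (a class that is found on the module path and installs packages with the platform's default provider), something like the following could work. The class name, file location, and package names are assumptions for a Debian image, not the speaker's actual code.

```puppet
# Hypothetical module file: modules/ruby/manifests/init.pp
# Package names are assumptions for a Debian image, not taken from the talk.
class ruby {
  # No provider is specified, so Puppet uses the platform default
  # (apt on Debian).
  package { ['ruby', 'ruby-dev']:
    ensure => installed,
  }
}

# "include" looks the class up and adds it to the catalog.
include ruby
```

Because the package resources only declare a desired state (`installed`), applying the class a second time should make no changes.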
00:07:08.320 Then we’re going to go back to the gems. This is kind of an interesting topic to me. One of the debates— this project has a lot of attention from sysadmins who have to manage a lot of computers, not necessarily Ruby programmers. What are gems?
00:07:32.600 Right? So you have native packages and source packages on the platforms. But then you get to gems. When you install Passenger from gems, is that a source package or binary package? We’ll get back to that, but I think this is funny. RubyForge is a PHP app.
00:08:00.360 And the other thing is, people have evolved, so there’s all this stuff happening with clouds and REST. It makes it very difficult to automate things. If you look at the convention, there’s no rhyme or reason. There’s some weird key. There’s no way to just say, 'Give me the latest RubyGems,' or even give me RubyGems by version number. You can’t just match this version. You have to have this magic key to get it.
00:08:30.840 So when you start thinking about automating, these little problems make it difficult to create conventions around them, which can create impedance mismatches that add up over time and increase complexity and expense.
00:09:00.600 What this is going to do is something you've probably all typed before. If you install RubyGems from RubyForge, you're going to download it, and there's a 'creates' parameter, something I'm going to explain in just a minute. You’ll see this sort of process: get the thing, untar it, run setup, and then I like to have a little link so I can just run 'gem.' So, that 'creates' sets up a condition that makes the command what's known as idempotent.
00:09:30.760 Who knows what idempotent is? Maybe...Idempotent is when an operation will reach a certain state no matter how many times you perform that operation. If you’re talking about a purely mathematical example of an idempotent operation, if you multiply any number by zero, it doesn’t matter how many times you do it; you’ll always end up with zero.
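In symbols, an operation f is idempotent when f(f(x)) = f(x). A shell command is not idempotent on its own, so Puppet's `exec` type offers guards such as `creates`. The following is a minimal sketch of one guarded step; the resource title, URL, and file path are placeholders, not values from the talk.

```puppet
# Sketch only: the URL and paths below are placeholders.
exec { 'fetch-rubygems':
  command => 'wget -O /tmp/rubygems.tgz http://example.com/rubygems.tgz',
  path    => ['/usr/bin', '/bin'],
  # "creates" is the guard: if this file already exists, Puppet skips
  # the command, so repeated runs converge on the same state.
  creates => '/tmp/rubygems.tgz',
}
```

One caveat of this guard is that it only checks that the file exists, so a partial download left behind by a failed run would also cause the command to be skipped.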
00:10:03.880 Now we’re back here and we’ve set up these 'creates' parameters. Another thing that Puppet does, and this is controversial, is build a graph of resources. So you have to explicitly declare those relationships to ensure ordering.
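To illustrate how such relationships might be declared, here is a small sketch using the `require` metaparameter, which adds an edge to the resource graph. The titles, the unpack directory, and the `gem1.8` install location are assumptions made for illustration, not the speaker's manifest.

```puppet
# Sketch only: titles and paths are hypothetical.
package { 'ruby':
  ensure => installed,
}

exec { 'setup-rubygems':
  command => 'ruby setup.rb',
  cwd     => '/tmp/rubygems',       # hypothetical unpack directory
  path    => ['/usr/bin', '/bin'],
  creates => '/usr/bin/gem1.8',     # assumed install location
  require => Package['ruby'],       # graph edge: the package comes first
}

# Convenience link so that plain "gem" works.
file { '/usr/bin/gem':
  ensure  => link,
  target  => '/usr/bin/gem1.8',
  require => Exec['setup-rubygems'],
}
```

With these edges in place, Puppet applies the package, then the setup command, then the link, regardless of the order in which the resources are written.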
00:10:32.720 After we have gems, we're going to install Apache. The first thing we’re going to do is install the Apache packages from the native package manager. Then we’re going to ensure that the Apache service is running.
00:11:00.560 Now in this case, since we're building it and it's Debian, it doesn't really matter because when you install Apache, it's going to start it, but if you were to run this over time, how most people set their Puppet up is they run it on some synchronization cycle. Then it's going to restart processes that aren’t running.
00:11:36.040 It ensures everything is running, and you can also set things to notify that process, which we’ll see in a minute. Here, I’m going to get rid of the default site. If you built a server, especially on Debian, it starts with the default site. This is something I think is much more manageable in most cases than a lot of other distributions.
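The package-then-service pattern described here could be sketched roughly as follows. The class name and the Debian package and service name `apache2` are assumptions; the actual module from the demo is not shown in the transcript.

```puppet
# Hypothetical module file: modules/apache/manifests/init.pp
class apache {
  package { 'apache2':
    ensure => installed,
  }

  # On every run Puppet checks that the service is up and starts it
  # if it is not, which is what makes periodic runs self-healing.
  service { 'apache2':
    ensure  => running,
    enable  => true,
    require => Package['apache2'],
  }
}

include apache
```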
00:12:00.920 Now we’re going to look at the Apache site. This is another defined resource, and it uses a case statement. This code is pretty easy to understand, and it’s almost all data. We pass in a value; if the site is supposed to be present, then it’s going to enable the site, and this is set up to be idempotent, so it’s not going to run it if it’s already enabled.
00:12:40.600 Then it’s going to notify that service. So, what that means is if a site gets enabled, you have to reload Apache. That’s what these notifies are going to set up, and then I wanted to get rid of it so I got rid of it. We keep going and we’re going to install Passenger.
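A defined resource along these lines might look like the sketch below. It assumes the Debian Apache 2.2-era layout, in which an enabled site is a symlink in `/etc/apache2/sites-enabled` named after the site, and it assumes that `Service['apache2']` is declared elsewhere in the catalog. The name `apache::site` and the example site `myapp` are hypothetical.

```puppet
# Hypothetical module file: modules/apache/manifests/site.pp
# Assumes Service['apache2'] is declared elsewhere in the catalog.
define apache::site ($ensure = 'present') {
  case $ensure {
    'present': {
      exec { "a2ensite ${name}":
        path    => ['/usr/sbin', '/usr/bin', '/bin'],
        # Guard: skip if the site is already enabled.
        creates => "/etc/apache2/sites-enabled/${name}",
        # Tell Apache to pick up the change.
        notify  => Service['apache2'],
      }
    }
    'absent': {
      exec { "a2dissite ${name}":
        path   => ['/usr/sbin', '/usr/bin', '/bin'],
        # Guard: only run while the site is still enabled.
        onlyif => "test -L /etc/apache2/sites-enabled/${name}",
        notify => Service['apache2'],
      }
    }
    default: {
      fail("Unknown ensure value '${ensure}' for apache::site")
    }
  }
}

# Example usage with a hypothetical site name.
apache::site { 'myapp':
  ensure => present,
}
```

The `notify` relationship only triggers the service when the `exec` actually runs, so an already-enabled site causes no Apache restart.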
00:13:10.320 Passenger is another good one because you’ll get to this line. This is going to set up some files for it, and we’re going to look at installing Passenger here.
00:13:36.040 This is a defined resource, and these are pretty standard. It’s going to set up a file that’s going to be in mods-available, and it’s going to load the configuration file. This down here is to be able to set settings. Going back to the tribes idea, right? You have the developers and the sysadmins. If you’re going to maintain a site with sysadmins, no one’s going to rebuild Passenger every time, right?
00:14:42.760 It doesn’t make sense. You’re probably going to have a native repository; you build it once, and you distribute the repository. For one thing, it takes time, and for another thing, you install a gem and it says it can’t build until it has 'make,' and then you have to run something else.
00:15:00.640 But you can’t just run it; you have to hit return twice. Has anyone seen this? So we have to tell it to hit return, and then it’s going to build the module we loaded before. We’re going to keep going now and install MySQL; we put the packages in and blah blah blah.
00:15:20.960 Now we're going to create our databases. You’ve all created databases for your Rails apps, so this is just setting up something to create and drop databases. This will ensure that the Ruby gem for MySQL is installed.
00:15:50.440 Here we have the first use of the gem provider. 'Gem' is saying, 'I don’t want to install with the native package manager; I want to install with gem because I know it’s a gem.' Now we’re down to setting up Rails. Pretty straightforward! I’m just going to pass in a version; this is a default value.
00:16:22.560 I’m going to install it from the gem ('gem install rails'), boom! Then I’m going to set up all the stuff that we do. You run Rails in some directory; I’m using '/var/rails' and the name, but people put it in all sorts of places. Seems like a good place to me.
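Choosing the gem provider explicitly, and pinning a version through `ensure`, could be sketched as below. The version number shown is a hypothetical value, not the default used in the demo, and the gem names are assumed to be `mysql` and `rails`.

```puppet
# Sketch only: the version is a hypothetical value.
package { 'mysql':
  ensure   => installed,
  provider => gem,        # install with "gem" instead of apt
}

package { 'rails':
  ensure   => '2.3.2',    # a specific version instead of "installed"
  provider => gem,
}
```

Without the `provider` attribute, Puppet would look for native packages with these names instead of gems.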
00:16:52.360 Then you have to set up these dependencies; again, it’s going to build it all out. You guys have done this, right? database.yml, and you have to do sites-available, which we’re going to see; that’s the Apache Passenger-type stuff.
00:17:23.360 Then we already saw this defined earlier; that’s going to enable and restart when it gets to that point. This is the template it’s going to pass in; right now, it’s very secure. Make sure you note that if you’re a sysadmin.
00:17:49.840 This is the virtual host setup. If you’ve installed Passenger, I’m sure you’ve seen this. If you’re doing Rails, you’ve seen that too. I don’t know what time it is, but I think some of those URLs should have Rails up by now, right?
00:18:19.680 So this is in the cloud! What Puppet gives you the ability to do—and there are other projects that try to do this as well—is to make your infrastructure code. You can start to take advantage of all the lessons learned by developers over the years.
00:19:00.240 There are places where they don’t really version scripts either. Now that you have this semantic code that can be updated and tracked in Git, you're leveraging a lot of flexibility.
00:19:37.520 When you start thinking about bringing up infrastructure and servers, it used to be a purchase order and wait weeks or even months for hardware. Now anyone can bring up servers—off the street—20 servers in minutes on EC2, right?
00:20:13.560 How do you do this? Well, the answer is I don’t really know, because the code is declarative, and the tests you’re trying to write usually are too. When you’re writing code and tests, it becomes more of a challenge.
00:20:49.760 With all of this, I’d love to hear anyone’s ideas on how to approach this. The best thing I’ve seen in practice is that you describe what your machine should be using something like RSpec. You run continuous integration against your Puppet code and validate it with RSpec.
00:21:18.480 It’s more about description testing—it’s not unit testing or test-driven. When we start talking about infrastructure, what does test-driven mean anyway, right?
00:21:34.640 Anyone who’s tried to scale serious infrastructure—whether it’s Rails or Java—understands that you go through phase shifts. As you add more users, the infrastructure has to adapt—you need to examine data models and usage patterns.
00:22:00.440 You may have a wall of confusion between developers and sysadmins, and the time saved here could be a lot. Evolution is always happening, right? Puppet has a little bit of everything, and it’s been great for allowing us to adapt.
00:22:51.520 I like the fish best! So this is the thing that's really interesting: anyone can now bring up servers for just ten cents an hour! You can do many experiments on infrastructure and applications; the feedback loop is quicker than ever.
00:23:14.720 You won’t have to think of upfront costs anymore; now you can spin up whatever you want for a minimal cost, and turn it off when you’re done.
00:23:46.920 The opportunities are immense; the biggest companies were often born out of economic downturns. With some resourcefulness, you can climb high, innovate, and generate value—there are limitless possibilities! So, that concludes my Puppet presentation.
00:24:35.920 I’m Andrew, and does anyone have any questions?
00:25:46.640 Based on experience, I think your points on testing are spot on, but execution matters. The system you propose relies on practical application, and often the realities of cross-platform or multi-package environments are challenging. But with more clarity around this, teams can manage their environments more effectively.
00:26:30.240 If a new package is released and a version changes, what’s the best practice for updating? The system you have allows for a centralized Puppet master that will synchronize with clients at set intervals. If you want to change your infrastructure, update your central repository and the changes will propagate to all instances.
00:27:52.160 But this does have some issues—be careful when deploying code to avoid crashing servers. It’s easier to rebuild than to attempt to backtrack complicated issues with packages that may not behave as expected.
00:28:06.240 Any other questions? I think we’re good. MountainWest, let’s go!