00:00:14.639
All right, so this presentation is about using Amazon's web services from Ruby. A common question is whether these services scale well, and scaling is really the reason to use Amazon Web Services in the first place: they exist to help you scale your applications. What I want to look at is how that works when you're driving them from Ruby.
00:00:32.719
I was reminded of an episode of 'I Love Lucy' that serves as a perfect example of needing to scale. I'll play a short clip from the Chocolate Factory episode to illustrate this point.
00:01:09.799
[Clip Playing] All right, girls, listen carefully. This is the wrapping department. Now, the candy will pass by on this conveyor belt and continue into the next room where the girls will pack it. Your job is to take each piece of candy, wrap it in one of these papers, and then put it back on the belt. Do you understand? Yes, ma'am. Let her roll, let her roll! Wait here, somebody's asleep at the switch. What are you doing up here? I thought you were downstairs boxing chocolates. Oh, they kicked me out of there fast. Why? I kept pinching them to see what kind they were. This is the fourth department I've been in! Oh, I didn't do so well either.
00:02:21.239
All right, girls, now this is your last chance. If one piece of candy gets past you and into the packing room unwrapped, you're fired! Yes, ma'am. Well, this is easier! Yeah, we can handle this. Okay, listen, I think we're fighting a losing game. Here she comes! Fine! You're doing splendidly; speed it up a little!
So, clearly, that raises the question of scalability, and in this case, it did not scale. However, often we don't have the opportunity to just eat all the chocolate, right? So we've got to figure out how to handle this—and that is where Amazon Web Services can help.
00:04:01.680
Thank you for that clip; it's a useful scenario to keep in mind. Now, I want to talk about specific Amazon Web Services. The ones I want to cover are SQS (Simple Queue Service), EC2 (Elastic Compute Cloud), S3 (Simple Storage Service), and SimpleDB. I'm going to focus on SQS and EC2, whose client APIs are quite similar, and I'll demonstrate Ruby code for them using the RightAws (right_aws) gem from RightScale.
00:04:52.120
There are many Ruby libraries for talking to Amazon Web Services, particularly for S3, SQS, and EC2. What I particularly like about the RightScale gem is that it supports multiple services (SQS, EC2, S3, and SimpleDB) from a single gem. It's also well maintained and keeps up with Amazon's frequent API changes, and it ships with a robust HTTP layer that automatically retries requests on errors, which cuts down on the error handling you have to write yourself.
00:05:39.520
Another library I want to mention is KO, an EC2 pool manager. It is essentially a port of the Lifeguard Java library, and it manages EC2 instances dynamically, deciding when to start or stop instances based on load. To get started with the RightScale gem, we simply require it and use it to talk to Amazon's services.
00:06:24.760
Since Amazon's web services communicate via XML, you can tell the library to use libxml for parsing, which significantly improves performance. Here we're just setting the parameters required to connect to SQS: Amazon gives you an access key ID and a secret access key, and those are the essential credentials for talking to their services.
00:07:05.520
Additionally, you can optionally specify the server, port, and protocol, which is particularly useful for local development where direct communication with Amazon might not be desirable. For instance, during this presentation, connecting directly to Amazon might present connectivity challenges.
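Putting that together, the setup looks roughly like this. I'm using the second-generation SQS class from right_aws; the credentials and the local endpoint below are placeholders, and the libxml setting is the mechanism as I recall it, so check the gem's docs if it has moved.

```ruby
require 'rubygems'
require 'right_aws'

# Use libxml instead of REXML for parsing Amazon's XML responses (much faster).
RightAws::RightAWSParser.xml_lib = 'libxml'

# Placeholder credentials from your AWS account.
ACCESS_KEY_ID     = 'YOUR_ACCESS_KEY_ID'
SECRET_ACCESS_KEY = 'YOUR_SECRET_ACCESS_KEY'

# Straightforward connection to Amazon's SQS endpoint.
sqs = RightAws::SqsGen2.new(ACCESS_KEY_ID, SECRET_ACCESS_KEY)

# Or point the client at a local stand-in for development and demos
# (hypothetical host and port).
local_sqs = RightAws::SqsGen2.new(ACCESS_KEY_ID, SECRET_ACCESS_KEY,
                                  :server   => 'localhost',
                                  :port     => 8080,
                                  :protocol => 'http')
```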
00:07:52.080
To create a queue, or to get a reference to an existing one, you call sqs.queue with the name of the queue. Once you have that reference, adding a message is as simple as calling push. Even after the most recent SQS update raised the maximum message size to 8 KB, the service is not meant for pushing large payloads. Typically you send some kind of identifier that a worker can use to look up the real data in a database or in a file stored on S3.
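For example (the queue name and message here are made up for illustration, continuing with the sqs connection from above):

```ruby
# Create the queue if it doesn't exist yet, or get a handle to it if it does.
queue = sqs.queue('bacon-unpackaged')

# Push a small message: typically just an identifier the workers can use
# to look up the real data in a database or on S3.
queue.push('slice-42')
```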
00:09:38.560
To determine how many messages are in your queue, you call q.size. This will provide an approximate number of messages, which may not be completely accurate due to the way Amazon clusters their servers. As you add messages to the queue and retrieve them again, the messages may not always return from the same server.
00:10:40.120
Retrieving a message is straightforward; you simply call q.receive. If no messages are available, it returns nil. Even if you place a message into the queue, you might not retrieve it immediately due to clustering behaviors. To access the message content, you call body on the message object, and to delete a message from the queue, you call delete.
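Putting size, receive, body, and delete together, a simple polling check looks something like this (again continuing with the queue handle from above):

```ruby
puts "Approximately #{queue.size} messages waiting"

# receive returns nil when no message is currently visible.
if message = queue.receive
  puts "Working on #{message.body}"
  # ... do the actual work here ...
  message.delete   # remove the message from the queue once we're done
else
  puts 'Nothing to do right now'
end
```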
00:11:01.920
When you receive a message, a visibility timeout applies, and you can specify how long it should be. Until that timeout expires, the message will not be returned to other receive calls. If the process that received the message does not delete it before the timeout runs out, the message becomes visible again and can be delivered to someone else.
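As a hedged illustration (I believe right_aws lets you pass the visibility timeout, in seconds, to receive, but check the gem for the exact signature):

```ruby
# Hide this message from other consumers for 60 seconds while we work on it.
message = queue.receive(60)

if message
  puts "Packaging #{message.body}"
  # ... finish the work well within the 60 seconds ...
  message.delete   # delete before the timeout expires, or it becomes visible again
end
```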
00:11:53.048
Now, let's talk about EC2. Using the RightScale gem with EC2 is very similar to SQS: you supply your access key ID and secret access key, and optionally a server, port, and protocol. All of the RightScale classes connect to Amazon this way, whether you're using S3, EC2, SQS, or SimpleDB.
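The connection itself is one line (placeholder credentials again, and the commented-out form shows the same local-endpoint trick as with SQS):

```ruby
require 'rubygems'
require 'right_aws'

ec2 = RightAws::Ec2.new('YOUR_ACCESS_KEY_ID', 'YOUR_SECRET_ACCESS_KEY')

# For local testing you can redirect it, just like SQS:
# ec2 = RightAws::Ec2.new(key, secret,
#                         :server => 'localhost', :port => 8080, :protocol => 'http')
```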
00:12:43.920
EC2 instances are essentially virtual servers that allow you to boot as many copies of a machine image as needed. The average boot time for an instance is about two minutes, although it can often be faster. Within EC2, different sized images are available to provide various hardware capabilities, such as increased memory or processing power.
00:14:10.840
To see your currently running instances, you call ec2.describe_instances, which returns information about each instance, such as when it launched and what its addresses are. To start new instances, you call run_instances with the name of your Amazon Machine Image (AMI) and the minimum and maximum number of instances you would like.
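Roughly like this; the AMI ID is a placeholder, and the hash keys are as I remember them from right_aws, so inspect the returned data to confirm them:

```ruby
# List what's currently running.
ec2.describe_instances.each do |instance|
  puts "#{instance[:aws_instance_id]}: #{instance[:aws_state]} " +
       "(launched #{instance[:aws_launch_time]}, dns #{instance[:dns_name]})"
end

# Ask for between 1 and 3 copies of our machine image.
started = ec2.run_instances('ami-12345678', 1, 3)
puts "Amazon actually started #{started.size} instance(s)"
```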
00:14:59.000
It's important to note that you may not get exactly the number of instances you asked for, so it's wise to check how many actually spun up. The details returned when you run or describe instances include an instance ID, and terminating is just a matter of passing an array of those IDs back to Amazon.
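Continuing from the run_instances call above, shutting everything down again looks like this:

```ruby
# Collect the IDs of the instances we started...
ids = started.map { |instance| instance[:aws_instance_id] }

# ...and terminate them all in a single call.
ec2.terminate_instances(ids)
```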
00:15:50.960
KO, as an EC2 pool manager, dynamically decides whether to start or stop instances based on the workload, which is extremely useful in a scaling application. I looked at the available options before building my own solution and found the Java library Lifeguard, which served this purpose well, so I ported that functionality over to Ruby.
00:16:59.760
The pool manager monitors your queues and decides if it should start up or shut down instances based on need. KO can handle multiple pools of EC2 instances, which is beneficial for managing resources efficiently. For example, if your application servers are heavily loaded, KO can spin up additional instances while keeping the web servers running normally.
00:18:47.080
You can specify the minimum and maximum number of instances you want running, as well as ramp-up and ramp-down intervals, so resource management stays responsive to demand. Since Amazon bills each instance for a full hour, you can also set a minimum run time so you don't pay for an hour of an instance that only ran for a couple of minutes.
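I won't reproduce KO's actual configuration format here, but conceptually the knobs look something like this hypothetical sketch:

```ruby
# Hypothetical illustration of the settings a pool manager works with;
# this is not KO's real configuration file.
pool_settings = {
  :queue_name         => 'bacon-unpackaged',
  :min_instances      => 1,     # never scale below this
  :max_instances      => 10,    # never scale above this
  :ramp_up_interval   => 60,    # seconds between "do we need more?" checks
  :ramp_down_interval => 300,   # wait longer before deciding to shut instances down
  :min_run_time       => 3300   # about 55 minutes; Amazon bills the full hour anyway
}
```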
00:20:49.680
There are certainly caveats to using Amazon services; it’s not a utopian environment. It's crucial to account for potential issues that arise while using Amazon's service, such as the occasional instance not coming online. Although I don't have specific statistics, I have experienced issues with SQS where messages don’t arrive immediately.
00:22:29.040
I aimed to create a sample application that illustrates this process in action, rather than just discussing the concept. Returning to the 'I Love Lucy' chocolate factory scenario, I proposed that we put a Ruby twist on it by making chunky bacon instead of chocolates. This way, we will explore the workflow of bacon production.
00:24:31.960
We start by slicing bacon: each slice is recorded in the database and placed on the chunky unpackaged queue, which KO monitors. Each cartoon fox represents a running EC2 instance watching that queue. The pool manager assesses the workload and decides whether more foxes (EC2 instances) should be started.
00:26:20.720
Once the instance starts, it pulls messages from the queue and packages them. After packaging a slice of bacon, it moves to a packaged queue monitored by another process, which updates the database to indicate that the bacon has been packaged. We'll look at the Ruby code used in our implementation.
00:27:05.520
The implementation uses a model that creates a database entry for each slice, gets a reference to the unpackaged queue, and pushes the slice ID onto it. As the foxes process slices, they report whether they are busy or idle, and once a slice has been packaged they delete its message from the queue so the work isn't repeated.
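A simplified sketch of the two halves, with an invented Slice model and queue names standing in for the real ones, and the busy/idle reporting glossed over:

```ruby
# Producer side: record a new slice, then queue its ID for packaging.
slice = Slice.create(:status => 'unpackaged')   # Slice is a stand-in model
unpackaged = sqs.queue('bacon-unpackaged')
unpackaged.push(slice.id.to_s)

# Worker (fox) side: pull IDs off the unpackaged queue, "package" them,
# and pass them along to the packaged queue.
packaged = sqs.queue('bacon-packaged')
while message = unpackaged.receive
  slice_id = message.body
  # ... do the packaging work for slice_id here ...
  packaged.push(slice_id)
  message.delete   # only delete once the work has actually been done
end
```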
00:29:41.760
Coming back to the earlier concern about message duplication: SQS can deliver a message more than once, so your code needs to handle that, for example by making the processing idempotent.
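One way to guard against duplicates, sketched with the same hypothetical Slice model as above:

```ruby
if message = unpackaged.receive
  slice = Slice.find(message.body)
  if slice && slice.status == 'unpackaged'
    # ... package it ...
    slice.update(:status => 'packaged')
  end
  # Deleting is safe either way: if this was a duplicate delivery,
  # the work was already done the first time around.
  message.delete
end
```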
00:31:37.760
Here, we arrive at our chunky bacon application, closely mirroring the chocolate factory scenario. Each cartoon fox represents an actively running EC2 instance. For this demonstration, I emulated EC2 and SQS locally to avoid internet dependency. As we run the application, notice how it dynamically adjusts to the workload.
00:33:47.720
Through live coding, I'll demonstrate the workflow, with bacon being sliced quicker than before to observe how the EC2 instances scale accordingly. The instances respond to the load, and any excess demand leads to the activation of more instances.
00:36:36.480
Now that you’ve seen the chunky bacon application in action, feel free to ask any questions. I'll be happy to address your queries regarding the discussed concepts, processes, or any implementations I’ve demonstrated.
00:38:43.720
One interesting question is what the real-world applications of this are. As an example, I use these services for automated continuous integration: when developers commit code, messages go onto a queue that triggers the necessary builds. That use case benefits directly from Amazon's ability to add and remove capacity dynamically.
00:39:41.440
While building out the infrastructure myself is always an option, having Amazon manage these processes can greatly reduce overhead and operational costs. This demonstrates the value of utilizing Amazon Web Services in a dynamic load environment.