00:00:21.680
Okay, can y'all hear me? Alright, cool. My name is Ian Warshak.
00:00:27.400
I am a developer, and I work at RightScale. RightScale provides a cloud platform management tool that you can use.
00:00:33.960
I have some code that I'll be showing you, and here's the GitHub account where you can check it out.
00:00:39.040
I won't be showing a lot of the code, so feel free to look at it on your own. I'm going to be talking about Rails in the Cloud.
00:00:45.559
I'm going to focus mostly on Amazon, specifically Amazon's web services and cloud-based services. I'll walk you through a simple app that I wrote for this purpose.
00:00:52.559
It doesn't do much; it's quite arbitrary and a bit silly, but hopefully, it will give you ideas on how you can leverage Rails tools in your own applications.
00:00:59.199
The first thing I wanted to do is differentiate terms when we talk about the cloud. It's a big, nebulous term that means different things to different people. I separate this into two buckets: cloud infrastructure and cloud services. Cloud infrastructure refers to servers, like Amazon EC2, which is the big one. There's also Rackspace, Flexiscale, and GoGrid. Cloud services refer to solutions like Amazon S3 or Amazon Simple Queue Service. These aren't infrastructure but rather services.
00:01:18.479
I wrote this app called Pictor, a play on Flickr. I thought of it as a sort of 'Flickr killer' because it can scale. I realize that sounds pretentious, but that was my goal.
00:01:30.200
It's actually a simple photo application. You upload a photo, and the application transforms it in a couple of ways and displays the images on the homepage. It's not impressive, but some interesting aspects include that the photos are converted and processed asynchronously.
00:01:42.519
As I mentioned, I’m going to talk about the technologies behind it. I hope it provides ideas for using these technologies in your own work. I chose this photo application idea because it naturally lends itself to cloud computing: it uses a lot of disk storage, and a lot of bandwidth for users downloading pictures.
00:02:00.280
The offline processing part is crucial because processing photos with tools like ImageMagick can take time. You don't want users waiting for responses while the photo converts or resizes.
00:02:17.440
I foresee this as being an instant hit, and we are going to need scaling right out of the gate.
00:02:30.200
If I were building this for real, I would want to utilize cloud infrastructure and services for several reasons. I don't want to build or configure hardware. Making arrangements for 10 servers to prepare for future needs doesn't appeal to me.
00:02:47.560
I want to offload as much of this work as possible. For a small team or single developer, managing disks or storage solutions takes too much time. Instead, I want to spend as little time as possible on that and focus more on development.
00:03:06.920
So, how did I build this app? It’s a Rails application with roughly 300 lines of code, although I don't have tests for it. It runs on Amazon EC2 servers. I have two pools of servers: one pool for the web front end, managed by load balancers, and another pool for processing servers that handle the work.
00:03:37.360
All the photos are stored on Amazon S3, meaning I don’t have to worry about where to store these files on my servers, avoiding concerns about using SANs. I'm also using a Content Delivery Network, specifically Amazon's CloudFront, which I'll explain more about later.
00:03:56.560
I use the RightScale AWS gem for virtually everything. RightScale has an extensive library for Amazon's web services; it's what we use for our own infrastructure, and we've released it as open source.
00:04:06.920
This library includes a Ruby interface for Amazon EC2, S3, and other web services. Why did I opt for Amazon? Primarily because Amazon is the market leader in cloud services. There's a lot of solid competition, but currently, Amazon is the big dog in the field.
00:04:44.760
The integration between services is seamless; if I have an EC2 server, transferring data between it and Amazon's other services is free. Generally, they charge for bandwidth, but bandwidth between their services costs nothing.
00:05:01.920
There are APIs for all of this, along with tons of libraries and documentation. While I mentioned using the RightScale AWS gem, there are many other libraries available, which makes it more accessible if you're considering using these technologies.
00:05:17.280
In case you're not familiar with EC2, it's essentially servers on demand. With an API call, you can launch a server in Amazon's data centers, and you pay by the hour, which is convenient because I don’t want to pay upfront.
00:05:48.639
One of the main caveats of EC2 is that it's not persistent. If you shut down your server, any configuration you did will be gone. This means you need to automate your configuration, which can be a pain initially but is ultimately beneficial.
00:06:26.640
Maintaining many hand-configured server images becomes unsustainable, especially as you scale. RightScale provides tools for this, but there are also options like Chef, Puppet, and other configuration management tools.
00:06:55.440
Amazon also has Elastic Block Store, which provides persistent disk storage for EC2 servers. However, I won't delve too deeply into that today. I'm sure many of you know about Amazon S3, a popular online storage service where you pay for the data you store.
00:07:25.760
CloudFront, on the other hand, is a content delivery network, which I'm using to serve images and content to end users efficiently.
00:07:41.360
CloudFront has distribution servers worldwide. By associating an S3 bucket with CloudFront, I get a unique domain name, and images requested through that domain are served quickly.
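A minimal sketch of how an application could turn an S3 object key into a URL on the CloudFront domain. The domain name `d1234example.cloudfront.net`, the key layout, and the helper name `cdn_url` are assumptions made for illustration; the talk does not show the real values.

```ruby
require "erb"

# Hypothetical distribution domain; a real one is assigned by CloudFront.
CLOUDFRONT_DOMAIN = "d1234example.cloudfront.net"

# Build the public URL for an object in the bucket behind the distribution.
# Each path segment is percent-encoded; the slashes between them are kept.
def cdn_url(key, domain: CLOUDFRONT_DOMAIN)
  path = key.split("/").map { |segment| ERB::Util.url_encode(segment) }.join("/")
  "http://#{domain}/#{path}"
end

puts cdn_url("photos/42/mono.jpg")
```

Views would then link to `cdn_url(key)` instead of a path on the web servers, so image requests never reach the application.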
00:08:00.000
For Amazon Simple Queue Service, it's a basic queuing service where I send messages to a queue and pull messages off of it. When a user uploads a picture, it creates jobs on the queue. The processing servers pull these jobs, process images, and perform necessary tasks.
00:08:50.960
What's neat about most message queues is that a job isn't lost when processing fails. Receiving a message doesn't delete it; the message is only hidden for a while, and if the worker never deletes it, it returns to the queue and can be retried.
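The retry behavior can be illustrated with a small in-memory stand-in. This is not the SQS API: the class `SketchQueue`, its method names, and the 30-second timeout are assumptions for the sketch, and time is passed in explicitly so the example is deterministic.

```ruby
# In-memory stand-in for a queue with a visibility timeout (not the SQS API).
class SketchQueue
  Message = Struct.new(:id, :body, :invisible_until)

  def initialize(visibility_timeout:)
    @visibility_timeout = visibility_timeout
    @messages = []
    @next_id = 0
  end

  def send_message(body)
    @next_id += 1
    @messages << Message.new(@next_id, body, 0)
    @next_id
  end

  # Return the first visible message and hide it for the timeout period.
  # The message stays in the queue until delete is called.
  def receive(now)
    message = @messages.find { |m| m.invisible_until <= now }
    return nil unless message

    message.invisible_until = now + @visibility_timeout
    message
  end

  def delete(id)
    @messages.reject! { |m| m.id == id }
  end

  def size
    @messages.size
  end
end

queue = SketchQueue.new(visibility_timeout: 30)
queue.send_message("convert photo-1 monochrome")

first   = queue.receive(0)   # a worker takes the job at t=0
hidden  = queue.receive(10)  # at t=10 the job is hidden from other workers
retried = queue.receive(31)  # the first worker never deleted it, so it is back
queue.delete(retried.id)     # a worker that succeeds deletes the message

puts hidden.inspect
puts queue.size
```

The key design point is that receiving and deleting are separate steps, so a crashed worker simply lets the timeout expire and the job is offered again.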
00:09:19.520
Amazon SimpleDB is a non-relational database they offer. It can be quite challenging to navigate if you're coming from a relational database background, primarily because it lacks tables and joins.
00:09:43.560
In SimpleDB, what you would call a table is called a domain. You can't perform joins across two domains, which forces you to denormalize your data and keep related data together.
00:10:07.640
The lack of a schema means you can add attributes to a domain as you go, which is quite powerful. However, all data is stored as strings, so you need to handle things like dates and integers carefully, because values are compared as text.
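One common way to handle this is to encode values so that their string order matches their natural order. The sketch below assumes non-negative integers and a fixed width of 10 digits; the helper names are hypothetical and are not taken from the talk or from any library.

```ruby
require "time"

# Zero-pad non-negative integers so that string order matches numeric order.
def encode_int(value, width: 10)
  raise ArgumentError, "negative values need an offset" if value.negative?

  value.to_s.rjust(width, "0")
end

# Parse with an explicit base so leading zeros are not read as octal.
def decode_int(text)
  Integer(text, 10)
end

# ISO 8601 timestamps in UTC sort chronologically as plain text.
def encode_time(time)
  time.getutc.iso8601
end

def decode_time(text)
  Time.iso8601(text)
end

puts ["9", "10"].sort.inspect                      # unpadded strings sort as text
puts [encode_int(9), encode_int(10)].sort.inspect  # padded strings keep numeric order
puts encode_time(Time.utc(2010, 3, 5, 12, 0, 0))
```

With this encoding, a range comparison on the stored strings gives the same answer as the comparison on the original numbers or dates.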
00:10:33.920
SimpleDB automatically indexes all your data, so you don't need to manage indexes yourself. However, keep in mind that querying is not as fast as with a local database, because every query is a call over the internet.
00:11:20.960
As your data grows, query speed stays fairly constant: performance is about the same whether you have 10,000 records or 10 million.
00:12:08.880
To summarize, you can worry less about scaling in some ways, but you still need to consider aspects of configuration management and automation. Having automated configurations allows systems to boot and configure correctly, minimizing potential headaches.
00:12:40.960
With SimpleDB, database performance is consistent, which is something that MySQL can struggle with when you have a large number of records. It excels at handling large amounts of data without degrading performance.
00:12:59.919
Amazon S3 and CloudFront handle all my static file serving, which lessens the load on my servers significantly.
00:13:28.560
Here's a diagram of the architecture of the application. SQS, SimpleDB, and S3 are all services that I utilize. My web servers and processing servers operate in two separate pools.
00:14:00.960
Users upload pictures to the application. When they upload a picture, they're directed to upload directly to S3. This means my servers handle less work, which is beneficial.
00:14:36.280
Each upload form contains a hidden redirect URL to facilitate this process. Upon successful upload, users are redirected back to my servers for further processing.
00:15:02.560
The redirect URL is utilized to initiate job creation and to start processing the uploaded image once it is received by Amazon.
00:15:30.160
As for job creation, once the server receives requests and the image name, I create jobs to handle the processing tasks, which are then sent to the queue.
00:15:57.360
In addition to creating the jobs, I also generate a SimpleDB record indicating the processing state of the image.
00:16:30.960
This record keeps track of the image name and job statuses, allowing me to ascertain when processing is complete.
00:17:05.040
There's a monitor job that checks if both jobs are done and updates the database accordingly with their statuses.
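A minimal sketch of the status record and the monitor check, using a plain Ruby hash as a stand-in for the SimpleDB domain. The attribute names (`monochrome`, `paint`, `status`) and their values are assumptions; the talk only says that the record tracks the image name and the job statuses.

```ruby
# In-memory stand-in for a SimpleDB domain: item name => attributes.
# All values are strings, as they would be in SimpleDB.
PHOTOS = {}

JOB_NAMES = %w[monochrome paint].freeze

def create_photo_record(image_name)
  PHOTOS[image_name] = {
    "monochrome" => "pending",
    "paint"      => "pending",
    "status"     => "processing"
  }
end

def mark_job_done(image_name, job_name)
  PHOTOS.fetch(image_name)[job_name] = "done"
end

# The monitor step: flip the overall status once both jobs are done.
def monitor(image_name)
  record = PHOTOS.fetch(image_name)
  record["status"] = "complete" if JOB_NAMES.all? { |job| record[job] == "done" }
  record["status"]
end

create_photo_record("cat.jpg")
mark_job_done("cat.jpg", "monochrome")
puts monitor("cat.jpg")
mark_job_done("cat.jpg", "paint")
puts monitor("cat.jpg")
```

Keeping the per-job flags and the overall status on one item fits the denormalized style SimpleDB pushes you toward: the homepage only has to read a single record per photo.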
00:17:45.000
Now, after processing the pictures, I update the SimpleDB record, providing a status update based on the completion of the processing tasks.
00:18:20.000
The processor daemon runs in a loop on a server, continually pulling job requests from the queue and executing them.
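The loop can be sketched as follows. Ruby's built-in `Queue` stands in for the remote queue, the handlers only return a string instead of converting an image, and `max_idle_polls` exists only so the sketch terminates; a real daemon would keep polling. All names here are hypothetical.

```ruby
# Stand-in for the remote queue; a real daemon would poll it over the network.
JOBS = Queue.new

# Hypothetical handlers, one per job kind.
HANDLERS = {
  "monochrome" => ->(image) { "#{image}: monochrome done" },
  "paint"      => ->(image) { "#{image}: paint done" }
}.freeze

# Pull one job and run it. Returns nil when the queue is empty.
def process_next(queue, log)
  kind, image = queue.pop(true) # non-blocking pop raises ThreadError when empty
  log << HANDLERS.fetch(kind).call(image)
rescue ThreadError
  nil
end

# The daemon loop: keep pulling jobs, back off briefly when there are none.
def run_daemon(queue, log, max_idle_polls: 1, sleep_seconds: 0)
  idle = 0
  while idle < max_idle_polls
    if process_next(queue, log)
      idle = 0
    else
      idle += 1
      sleep(sleep_seconds)
    end
  end
  log
end

JOBS << ["monochrome", "cat.jpg"]
JOBS << ["paint", "cat.jpg"]
puts run_daemon(JOBS, [])
```

Because each worker only pulls from the queue, adding processing capacity is a matter of starting more copies of the same loop.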
00:19:12.960
Each uploaded picture results in two processed images: one monochrome and one with a paint effect.
00:19:45.120
Let me demonstrate the application with a live demo now. If you have any questions, feel free to ask.
00:20:20.800
Currently, one daemon is running per processing server, while in a more advanced setup, you would typically want several daemons running on each server.
00:20:44.800
Designing the system so that each job is handled separately keeps processing delays to a minimum.
00:21:06.240
As the uploads progress, I'm able to see two web servers and monitor how jobs are distributed among the servers.
00:21:32.720
If I refresh, the app will switch between the two processing servers based on which server completed the jobs.
00:22:00.000
You can see a live demo of the application now, allowing you to upload a picture and monitor the processing.
00:22:32.640
The upload form is tied directly to Amazon S3. There are hidden values that authorize the upload and ensure that all processes are handled securely.
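A sketch of how such hidden values could be generated, assuming the legacy S3 browser-upload scheme in which a Base64-encoded policy document is signed with HMAC-SHA1 (current AWS documentation describes Signature Version 4 instead). The talk does not show the actual fields, so the bucket name, key prefix, ACL, and helper name are all placeholders.

```ruby
require "json"
require "openssl"
require "time"

# Build the hidden form fields for a browser upload straight to S3, using the
# legacy HMAC-SHA1 POST policy scheme. All argument values are placeholders.
def s3_upload_fields(access_key:, secret_key:, bucket:, key_prefix:, redirect_url:, expires_at:)
  policy = {
    "expiration" => expires_at.getutc.iso8601,
    "conditions" => [
      { "bucket" => bucket },
      ["starts-with", "$key", key_prefix],
      { "acl" => "public-read" },
      { "success_action_redirect" => redirect_url }
    ]
  }
  # pack("m0") is Base64 without line breaks.
  encoded_policy = [JSON.generate(policy)].pack("m0")
  digest = OpenSSL::HMAC.digest(OpenSSL::Digest.new("sha1"), secret_key, encoded_policy)
  {
    "key" => key_prefix + '${filename}', # S3 substitutes the uploaded file name
    "AWSAccessKeyId" => access_key,
    "acl" => "public-read",
    "success_action_redirect" => redirect_url,
    "policy" => encoded_policy,
    "signature" => [digest].pack("m0")
  }
end

fields = s3_upload_fields(
  access_key: "AKIDEXAMPLE",
  secret_key: "not-a-real-secret",
  bucket: "pictor-uploads",
  key_prefix: "uploads/",
  redirect_url: "http://example.com/photos/uploaded",
  expires_at: Time.utc(2030, 1, 1)
)
puts fields.keys.inspect
```

The secret key never appears in the form; only the policy and its signature do, which lets S3 verify that the server approved this particular upload.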
00:23:03.280
When the file has been uploaded to S3 successfully, the user is redirected back to my server, which begins the next steps in processing.
00:23:35.280
Streamlining the upload process allows for a smoother user experience, enhancing how users interact with the application.
00:24:01.600
Once my servers receive the request with the image identifier in the URL, I create the job instances required for the image conversion process.
00:24:37.440
Each conversion job is serialized into a format suitable for the queue, making them ready for processing.
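A sketch of what serializing a job for the queue might look like. The talk does not say which format is used, so JSON is an assumption here, as are the `ConvertJob` struct, its fields, and the `jobs_for` helper.

```ruby
require "json"

# Hypothetical job shape; the real class and message format are not shown.
ConvertJob = Struct.new(:image_name, :effect) do
  # Turn the job into a string that can be sent as a queue message.
  def to_message
    JSON.generate({ "image_name" => image_name, "effect" => effect })
  end

  # Rebuild the job on the processing server from the message body.
  def self.from_message(text)
    data = JSON.parse(text)
    new(data.fetch("image_name"), data.fetch("effect"))
  end
end

# One job per effect for every uploaded image.
def jobs_for(image_name)
  %w[monochrome paint].map { |effect| ConvertJob.new(image_name, effect) }
end

jobs_for("cat.jpg").each { |job| puts job.to_message }
```

A plain-text format keeps the message readable by any worker, regardless of which server created it.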
00:25:16.920
My convert job class has a specific method that handles the actual image conversion, utilizing ImageMagick to perform the processing tasks.
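A sketch of how a conversion step could call ImageMagick, assuming the `convert` command-line tool is installed (newer ImageMagick releases ship it as `magick`). The option choices, including the paint radius of 4, and the helper names are assumptions; the talk does not show the actual method.

```ruby
# Map each effect to ImageMagick options. The paint radius is arbitrary.
EFFECT_OPTIONS = {
  "monochrome" => ["-monochrome"],
  "paint"      => ["-paint", "4"]
}.freeze

# Build the argument list for ImageMagick's convert tool.
def convert_command(effect, input_path, output_path)
  options = EFFECT_OPTIONS.fetch(effect) { raise ArgumentError, "unknown effect: #{effect}" }
  ["convert", input_path, *options, output_path]
end

# Run the conversion. Passing separate arguments to system bypasses the
# shell, so file names are never interpreted as shell syntax.
def run_conversion(effect, input_path, output_path)
  system(*convert_command(effect, input_path, output_path)) or
    raise "convert failed for #{input_path}"
end

puts convert_command("paint", "in.jpg", "out.jpg").join(" ")
```

Separating command construction from execution makes the job easy to check without ImageMagick installed.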
00:25:52.760
I create jobs based on user uploads, and these jobs are handled by the processing servers, ensuring they complete tasks efficiently.
00:26:26.000
The system maintains a record for every image undergoing conversion processes, tracking their statuses across transformations.
00:27:01.040
Once both processing tasks are completed, SimpleDB records will reflect the finished statuses. This transparency helps manage workflows effortlessly.
00:27:34.920
The system continuously checks if the images have been processed and updates the corresponding database entries. If both conversions are complete, the application shows the images.
00:28:10.880
Here's how I update records, signaling to the system that the respective image has been transformed and is ready to be displayed.
00:28:47.679
My processor daemon runs iteratively, managing job loads systematically. Every upload leads to converted images, maintaining efficiency within the architecture.
00:29:23.760
Let me provide a brief walkthrough of uploading images and monitoring the conversions in real time.
00:30:02.479
By refreshing the page, you will see the processing states of the uploaded images, further indicating how they are dynamically managed across the server.
00:30:37.599
Would anybody like to view the uploaded images with applied effects as part of the demonstration?
00:31:12.000
As a lighthearted side note, I recently had some memorable experiences on a mission trip to Guatemala, where I assisted in medical work.
00:31:46.200
As the demo progresses, you can see the completion of jobs across my processing servers, showcasing the readiness of converted images with applied effects.
00:32:22.000
Currently, you can observe two separate conversions happening, with the corresponding job results reflecting the processing outputs on the screen.
00:33:03.760
If you have any questions about the project or the workings of the setup, please feel free to ask. I'm happy to provide more insights.
00:33:38.400
Security protocols during image uploads involve HTTPS connections, ensuring that all data exchanged remains secure and authorized.
00:34:10.000
Post-upload, users are redirected back to the appropriate page within the application where they can view the images or job summaries.
00:34:43.000
If there are more inquiries about the processes shown, or deeper dives into the code repository, I'm happy to assist further, as many resources are available.
00:35:20.800
The underlying principles and security mechanisms of the API ensure that user interactions are consistently managed, keeping services both reliable and efficient.
00:36:00.800
Please keep the questions flowing, as it’s valuable for understanding how we can apply cloud solutions effectively in various contexts.
00:36:21.719
All right, thank you everyone for your attention and engagement throughout this presentation. Let’s discuss any final thoughts or questions you may have.
00:36:38.880
Thank you for being here today.