Ruby on Rails

Your own 'Images as a Service'

by Andy Croll

Summary of 'Images as a Service'

In his talk at RubyConf 2015, Andy Croll discusses the critical topic of 'Images as a Service,' emphasizing the importance of optimizing image serving to enhance user experience on the web. He outlines the challenges presented by the increasing size of web pages, which now average over 2MB, with images accounting for much of that weight.

Key Points Discussed:

  • Growth of Web Pages and Image Sizes:

    • Average web page sizes have increased significantly, now being 3.5 times larger than five years ago, with images making up a large part of this growth due to their bandwidth consumption.
    • The need for speed is underscored, given that most connections do not achieve expected speeds, especially on mobile devices.
  • User Experience Impact:

    • Slow loading times critically affect user interaction, as highlighted by examples from Amazon and HouseTrip, where faster page loads led to improved user engagement.
  • Practical Solutions for Serving Images:

    • Encourages the use of responsive imagery practices, which tailor image delivery based on the device's screen size.
    • Introduces the srcset ('source set') attribute in HTML to suggest appropriate image resolutions to browsers.
    • Describes implementing progressive enhancement and how browsers can selectively load images of varying sizes and formats like WebP.
  • Use of Sinatra and CDNs:

    • Proposes a simple architecture using Sinatra and Dragonfly for image processing, demonstrating how developers can serve appropriately sized images without significant overhead.
    • Highlights the importance of using CDNs to serve images efficiently, ensuring faster load times by caching and delivering images closer to the end-user.
  • Performance Testing:

    • Croll shares insights from performance tests using Apache Bench, illustrating that optimized images significantly improve load times and user experience.
    • Discusses the balance between image size and processing capabilities, revealing that smaller images yield substantial bandwidth savings.

Conclusions and Takeaways:

  • Prioritizing Speed:
    • It is vital to minimize the asset sizes and focus solely on what is necessary for proper rendering. Speed directly influences user experience, and adopting a microservices architecture can help streamline this process.
    • Croll encourages the exploration of available tools and services, suggesting that developers engage with modern practices and technologies to enhance image serving strategies.
  • Caching and Efficient Asset Serving:
    • He advocates for using CDNs and caching strategies to mitigate the load on Rails applications, allowing them to focus on core functionalities rather than asset delivery.

In summary, Andy's talk emphasizes that while web content grows in complexity and size, developers must adopt thoughtful strategies for serving images to retain user satisfaction and engagement.

00:00:15.059 All right, all right. Today, I'm going to talk to you about 'Images as a Service.' My name is Andy Croll, and I'm from the UK. I work at a company called HouseTrip, which is a bit like Airbnb, except we have to raise money. It's focused on family villa holidays. I also run a very small consulting firm where we use some Ruby gems, and I organize a little conference called Brighton Ruby on the south coast of the UK. I’m running an event on the 8th of July, so if you’re interested, make sure to enhance your trip.
00:01:06.820 Now, onto the talk. Today, we'll discuss images as a service. This topic comes up often at work because websites typically have images on them, as you probably know. However, the thing we really need to consider is speed. There's a great article that poses an interesting question: how many floppy disks would it take to download a modern article from The Atlantic? If you were to download it and divide it into floppy disks and pass it around with friends—what a concept!
00:01:21.850 The author of this article, Pete Davis, discusses a dinosaur article that’s fascinating because, as my toddlers tell me, dinosaurs are awesome. So, how large would an article that includes four pretty large images and has a 6,000-word text be? Any guesses? Surprisingly, it would be only around 400 kilobytes, and most of that is images. The actual text weighs in only about 37 kilobytes, which fits comfortably onto a floppy disk with space to spare. If you have to download this over the internet on The Atlantic’s website, you would see how big it really is. This is the extreme end of modern publications that often contain large amounts of video ads.
00:01:58.299 I found my experiences online to be quite sobering. During a visit to a website with an ad blocker, I saw only about 3 megabytes of data. The web is evolving but not always for the better—this is a productive half-hour spent online! Does anyone here recognize anything from the past? Perhaps floppy disks? My personal favorite was the game "Day of the Tentacle"—an absolute classic! This article by Pete Davis was published on Medium.
00:02:17.780 At this point, I’d like to apologize that despite my Keynote skills, this presentation doesn't have the whimsical animations you might expect. Let’s take a step back in time to 2010. During the Rails 2.3 migration, Ruby 1.9 was on the rise. People were walking around with their iPhone 4s, feeling quite advanced. In those five years, Marvel and Disney built the entire Cinematic Universe, bigger and more complex than ever, with giant ships crashing into even bigger things.
00:02:36.880 As web developers and designers, we’ve been keeping up with this trend of growth. According to Archive.org, which tracks snapshots of the web, web pages have exploded in size. JavaScript has grown tremendously in the past five years, and the same goes for web fonts. However, the staggering increase in the size of images dwarfs all that progress.
00:03:09.560 On average, web pages are now 3.5 times larger than they were five years ago, with the average page size exceeding 2 megabytes. Thankfully, a part of this growth can be attributed to faster internet speeds. In the UK, our broadband regulator reported increases from 5 megabits to just over 20 megabits in these same five years. You might think that with everything bigger and better, all is well.
00:03:36.159 But, as we all know, high speeds don't necessarily solve every problem. Only a third of connections in the UK are considered high speed. Even during peak times, only about 10% of people see the speeds they’re paying for, which can be frustrating. If you’ve ever wondered why Netflix doesn't work well at home on a Friday night, that’s the reason! Smartphone connections have also grown, moving from 20% to 60% of the market, which means while our pages are getting larger, the experience of surfing the internet is becoming increasingly problematic.
00:04:03.790 Well done, everyone! We’ve truly accomplished something amazing. There are countless examples showing that speed is a vital metric for how customers interact with services. Amazon, for example, is able to financially quantify how page load times impact customers’ shopping carts. Personally, I’ve seen similar trends at HouseTrip; when we make pages faster, users tend to engage more with the booking process. This discussion about speed, however, seems misplaced for a Ruby conference devoted to the technical side of things.
00:04:55.200 This leads us to a story involving a guy named Chris in a meeting at YouTube. He was listening to his senior engineer rant about performance issues and decided to take on a challenge: to reduce a 1.3-megabyte YouTube video page to under 100 kilobytes. He labeled this project 'Feather'. Initially, Chris managed to get it down to 250 kilobytes, but he knew he had more work to do.
00:05:30.020 After three painstaking days, he still hadn’t met his goal of 100 kilobytes. However, he did manage to integrate a new HTML5 video player, replacing the old Flash version. Surprisingly, he pushed the page down to 98 kilobytes with only 14 requests. So, he added basic monitoring, because, you know, this is Google, and launched it to a fraction of their traffic to gather some data.
00:06:03.790 The numbers rolled in, and they were shocking: the average time to load a video page had noticeably increased! Despite the significant reduction in page size, average load times had jumped, and the jump came from certain geographical areas. They discovered that in parts of Southeast Asia, South America, Africa, and even Siberia, the average load times were over two minutes! This made it clear: before Feather, many users in those places couldn't use YouTube at all, simply because the page took too long to load.
00:06:47.310 So while Feather's average load time in those regions was two minutes, the lighter page finally made it feasible for many users there to watch a video. Those new, slower visitors were what pulled the overall average up. All these anecdotes are enlightening, but what practical solutions can we offer?
00:07:04.710 There are various actions we can take. Firstly, testing is crucial: validating changes on real devices can be helpful. Take, for instance, Facebook’s example, where they encourage developers to recognize the privilege of their connections. On Tuesdays, Facebook displays a bar prompting developers to experience the website as if they were on a 2G connection. It's a clever idea! Developers often work on fast connections, far removed from the reality of users with poor internet quality.
00:07:31.960 Let's dive into responsive imagery. As a former front-end developer, I'm no stranger to CSS. So, here’s the new syntax, which you might find familiar. It resembles the usual image tag but includes a srcset ('source set') attribute, which suggests to the browser that a high-resolution version of the image is available. This is just a recommendation, so the browser is not compelled to load that version.
00:08:13.680 A key aspect of HTML is that if a browser, such as older versions of Internet Explorer, does not understand the new syntax, it simply ignores it and uses the plain src. You might have heard your local CSS enthusiast mention 'progressive enhancement'; this is a perfect example of it. As we get a bit more advanced, srcset can also list several versions of an image along with their widths, so the browser can choose one based on the viewport size.
00:08:40.840 For example, if the image is displayed 600 pixels wide, the browser fetches the 600-pixel version of the JPEG, and if it’s displayed 1200 pixels wide, it fetches the 1200-pixel version. The sizes attribute tells the browser how wide the image will be at a given viewport width, using conditions similar to a CSS media query. If the viewport meets a condition, the matching width determines which version of the image to load.
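A minimal Ruby sketch of the markup described above, assuming a hypothetical image URL that accepts a `w` query parameter for the width. The helper name `responsive_img` and the URL scheme are illustrative, not taken from the talk; only the `srcset` and `sizes` attributes themselves are standard HTML.

```ruby
require "cgi"

# Hypothetical helper: builds an <img> tag whose srcset lists one candidate
# per width. The "?w=" URL pattern is an assumption made for illustration.
def responsive_img(base_url, widths:, sizes:, alt:)
  sorted = widths.sort
  # Each candidate is "URL WIDTHw"; the w descriptor states the file's pixel width.
  srcset = sorted.map { |w| "#{base_url}?w=#{w} #{w}w" }.join(", ")
  # src is the fallback for browsers that ignore srcset and sizes.
  fallback = "#{base_url}?w=#{sorted.first}"
  attrs = { src: fallback, srcset: srcset, sizes: sizes, alt: alt }
  pairs = attrs.map { |name, value| %(#{name}="#{CGI.escapeHTML(value)}") }
  "<img #{pairs.join(' ')}>"
end

puts responsive_img(
  "https://images.example.com/villa.jpg",
  widths: [600, 1200],
  sizes: "(min-width: 800px) 50vw, 100vw",
  alt: "Villa with a pool"
)
```

The smallest width is used as the `src` fallback here so that older browsers get the cheapest download; a different default may suit other designs.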
00:09:08.580 Here’s where it gets a bit more complex. There's a way to provide alternative sources within responsive web design. If there’s a drastic change in layout, you can provide portrait images at specific sizes inside a picture tag. This feature is well-supported across most modern browsers, except Safari, which is still catching up.
00:09:44.310 Additionally, you can serve multiple formats. For instance, Chrome supports a newer image format called WebP, which compresses images better. You can specify a MIME type and provide alternative formats within a picture tag. This demonstrates non-destructive enhancement: if a user’s browser does not support a specific format, it simply falls back to the image in another format.
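As a rough illustration of the format fallback just described, the Ruby sketch below generates a picture element with a WebP source and a JPEG fallback. The helper name `picture_tag` and the file paths are hypothetical; the `type` attribute on `source` is the standard way to state a MIME type.

```ruby
require "cgi"

# Hypothetical helper: wraps a fallback <img> in a <picture> element with one
# <source> per alternative format. A browser uses the first <source> whose
# type it supports and otherwise falls back to the <img>.
def picture_tag(fallback_src, alt:, sources:)
  source_tags = sources.map do |mime_type, url|
    %(  <source type="#{CGI.escapeHTML(mime_type)}" srcset="#{CGI.escapeHTML(url)}">)
  end
  img = %(  <img src="#{CGI.escapeHTML(fallback_src)}" alt="#{CGI.escapeHTML(alt)}">)
  ["<picture>", *source_tags, img, "</picture>"].join("\n")
end

puts picture_tag(
  "/images/villa.jpg",
  alt: "Villa with a pool",
  sources: { "image/webp" => "/images/villa.webp" }
)
```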
00:10:13.470 Just to recap, if you want to ensure you're using appropriately-sized images, this is a snippet of code from HouseTrip; it's an example of Paperclip, which specifies an attachment in a model. I apologize if I've trimmed some details, but this snippet should track well with what you might already be using.
00:10:58.130 The goal is to serve images at the correct sizes without going through the lengthy process that comes with tools like Paperclip. Ideally, you'd prepare a few sizes in advance depending on your design's future needs, which I learned while working at HouseTrip, a holiday rental service where the visual presentation is vital. Each image we upload means generating all required sizes.
00:11:42.150 So, at HouseTrip, we had original large images stored on S3 for easy access. We kept resizing straightforward and usually served everything from a CDN. I wanted to explore if this could be a small enough service to deploy on Heroku, considering minimal code and avoiding extensive overhauls. I opted for Sinatra and Dragonfly, which serves as a user-friendly wrapper for the complex ImageMagick command line.
00:12:23.991 To configure Dragonfly, you include a simple setup within your app. Once established, the API generates easy-to-use requests. Within those requests, the geometry strings in ImageMagick format allow adjustments over dimensions—fixed width, fixed height or a set height and width.
00:12:49.750 In my Sinatra app, I built a utility class that checks the geometry string's validity. Then, it collects remaining parameters and resources to fetch appropriate images over HTTP, resizing them on-the-fly before serving back to Rack, which is how responses are managed.
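The validity check mentioned here might look something like the following sketch. It is not the code from the talk: the class name `Geometry`, the size limit, and the decision to accept only the three shapes named above (fixed width, fixed height, or both) are assumptions. ImageMagick's full geometry syntax has more modifiers than this handles.

```ruby
# Hypothetical value object covering only a subset of geometry strings:
# "300x" (fixed width), "x200" (fixed height), "300x200" (both).
class Geometry
  PATTERN = /\A(\d+)?x(\d+)?\z/
  MAX_DIMENSION = 4000 # assumed upper bound, to refuse absurd resize requests

  attr_reader :width, :height

  # Returns a Geometry, or nil when the string is malformed or out of range.
  def self.parse(string)
    match = PATTERN.match(string.to_s)
    return nil unless match

    geometry = new(match[1]&.to_i, match[2]&.to_i)
    geometry.valid? ? geometry : nil
  end

  def initialize(width, height)
    @width = width
    @height = height
  end

  # At least one dimension must be present, and each must be within bounds.
  def valid?
    dimensions = [width, height].compact
    dimensions.any? && dimensions.all? { |d| d.between?(1, MAX_DIMENSION) }
  end

  def to_s
    "#{width}x#{height}"
  end
end

p Geometry.parse("600x")&.to_s
p Geometry.parse("600x; rm -rf /")
```

Rejecting anything that does not match the pattern matters because the string ends up as an argument to an image-processing command, so it should never be passed through unchecked.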
00:13:18.760 Let's talk architecture. Essentially, we direct the heavy lifting away from our application toward services externally available. For instance, we were using S3 for images and Heroku for web serving; the CDN serves as the middleman. I built a simple architecture where devices make calls via CDN, which interfaces with the Sinatra app to transform images before serving them back to the requesting devices.
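The request flow in this architecture can be sketched as a plain Rack-compatible object, which is the interface Sinatra itself sits on. Everything specific below is an assumption for illustration: the `/<geometry>/<path>` URL scheme, the class name `ImageService`, the one-year cache lifetime, and the injected `fetch` and `resize` lambdas, which stand in for the S3 download and the Dragonfly/ImageMagick resize so the example runs without any gems.

```ruby
# A Rack-compatible app is any object that responds to call(env) and returns
# [status, headers, body]. Fetching and resizing are injected stand-ins.
class ImageService
  GEOMETRY = /\A(\d+x\d*|x\d+)\z/

  def initialize(fetch:, resize:)
    @fetch = fetch   # ->(path) { original image bytes, or nil if missing }
    @resize = resize # ->(bytes, geometry) { resized image bytes }
  end

  def call(env)
    # Assumed URL scheme: /<geometry>/<path to original>, e.g. /600x/villas/1.jpg
    _, geometry, path = env["PATH_INFO"].to_s.split("/", 3)
    return respond(400, "bad geometry") unless geometry.to_s.match?(GEOMETRY)

    original = path && @fetch.call(path)
    return respond(404, "not found") unless original

    headers = {
      "content-type" => "image/jpeg", # assumes JPEG originals
      # A long max-age lets the CDN answer repeat requests without hitting this app.
      "cache-control" => "public, max-age=31536000"
    }
    [200, headers, [@resize.call(original, geometry)]]
  end

  private

  def respond(status, message)
    [status, { "content-type" => "text/plain" }, [message]]
  end
end

app = ImageService.new(
  fetch: ->(path) { path == "villas/1.jpg" ? "original-bytes" : nil },
  resize: ->(bytes, geometry) { "#{bytes} resized to #{geometry}" }
)
status, headers, body = app.call("PATH_INFO" => "/600x/villas/1.jpg")
puts status, headers["cache-control"], body.first
```

Because each distinct URL always yields the same bytes, the CDN can cache every size independently and the app only does the expensive resize on a cache miss.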
00:13:50.780 Upon launching this service on GitHub, it received a few stars—still counting! So, let’s assess the performance impact. You can grab any random image and try this out practically. For example, in my tests, I used a typical photographic image, roughly 1.1 megabytes at 2000 pixels wide, as a baseline for our systems.
00:14:27.710 I utilized Apache Bench, a straightforward command-line utility designed for performance testing. I ran tests to send multiple requests while observing the variations in response times. Based on this benchmarking, I found that even when images were reduced to 500 pixels wide or 1000 pixels wide, overall image processing times remained fairly consistent.
00:14:57.990 As I expanded tests with more requests or adjusted the concurrency, results remained stable. However, as I looked at different sizes of Heroku's performance dynos, I concluded there's a notable distinction as resources are shared among different users—as expected.
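For readers who want to repeat this kind of measurement from Ruby rather than with Apache Bench, the sketch below is a loose stand-in: it runs a block a fixed number of times across several threads and summarises the timings. The method name `measure` and its parameters are hypothetical, and the `sleep` in the usage example is a placeholder for a real HTTP request to the image service.

```ruby
require "benchmark"

# Runs `requests` calls of the block across `concurrency` threads and
# summarises wall-clock time per call, loosely like the figures ab prints.
def measure(requests:, concurrency:, &block)
  raise ArgumentError, "requests must be positive" unless requests.positive?

  jobs = Queue.new
  requests.times { jobs << true }
  jobs.close
  timings = Queue.new

  workers = Array.new(concurrency) do
    Thread.new do
      # pop returns nil once the closed queue is empty, which ends the loop
      timings << Benchmark.realtime(&block) while jobs.pop
    end
  end
  workers.each(&:join)

  seconds = Array.new(timings.size) { timings.pop }.sort
  {
    count: seconds.size,
    mean_ms: (seconds.sum / seconds.size * 1000).round(2),
    median_ms: (seconds[seconds.size / 2] * 1000).round(2),
    max_ms: (seconds.last * 1000).round(2)
  }
end

# Placeholder workload; swap in a real request when testing a live service.
p measure(requests: 20, concurrency: 4) { sleep 0.001 }
```

Reporting the median and maximum alongside the mean helps expose the run-to-run variation that shared hosting tends to produce.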
00:15:27.220 Despite small variations observed during testing, these response times proved feasible. Although not phenomenal, they weren’t dragging the process down when undertaking image transformations. Moreover, optimizing for proper image sizing resulted in significant performance benefits.
00:15:52.230 Instead of allowing browsers to downsize original images, it’s smarter to serve smaller versions to save bandwidth. For instance, providing smaller images cuts down on file size by ten percent. This efficiency is promising.
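To see why serving a narrower image saves so much, it helps to count pixels. The sketch below uses the 2000-pixel original and the 1000- and 500-pixel variants from the tests above, and assumes the aspect ratio is preserved. It computes pixel counts only; actual file sizes depend on JPEG compression and image content, so they will not track these ratios exactly.

```ruby
# Pixel count scales with the square of the width when aspect ratio is kept.
def pixel_ratio(original_width, target_width)
  (target_width.to_f / original_width)**2
end

[1000, 500].each do |width|
  percent = (pixel_ratio(2000, width) * 100).round(2)
  puts "#{width}px wide: #{percent}% of the original pixels"
end
# 1000px wide: 25.0% of the original pixels
# 500px wide: 6.25% of the original pixels
```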
00:16:23.290 Additionally, JPEG compression helps get images down to manageable sizes, making this transition even smoother. A knowledgeable peer, Dave Newton, found improvements beyond basic ImageMagick operations that delivered up to 85% size savings.
00:16:47.470 After running comparisons and accounting for discrepancies in request timings over hotel Wi-Fi, I found that more rigorous compression techniques take slightly longer but yield decisive results. The statistical variation in performance among image processing jobs persisted, contributing to unpredictability regarding processing times.
00:17:11.080 That said, the advantage of making images significantly smaller prevails, especially when handling a CDN. Consequently, it not only boosts performance seamlessly across deployments but ensures lower file sizes are accessible for users.
00:17:37.760 Unfortunately, in the real world, large images are a genuine challenge. In one instance, passing a random 6-megabyte image to our server overwhelmed everything, despite multiple resources. It was a 3000 pixels wide photo that simply couldn’t perform without significant upgrades.
00:18:05.580 This complexity often leads me to explore solutions like Refile, a gem from Jonas Nicklas, the creator of CarrierWave. It applies the lessons he learned over five years to improve the development process, and it integrates deeply with Active Record models.
00:18:39.110 The other alternative includes Etosha, a friend's creation similar to what I devised, which offers an engaging API ideal for managing image uploads along with plenty of features.
00:19:11.570 If you're starting a greenfield project that needs to manage images, consider incorporating one of these tools; they save a significant amount of future project pain.
00:19:41.390 In conclusion, prioritize speed! Minimize your asset sizes and focus just on what the browser needs to render correctly. Emphasizing quick responses will elevate the user experience dramatically.
00:20:07.600 This approach extends beyond images; it encapsulates broader microservices architecture principles where functions run on the internet while relieving your application from complicated burdens of load management. Speed defines user experience.
00:20:38.830 Thank you all for your time. I hope you find ways to optimize your services! Any questions? Just remember to ask, and I'll repeat your queries for everyone.
00:21:00.870 Am I using this code live? No, I believe we’ve recognized the need for a more solid solution that scales better than the initial 100 lines of code. It’s an interesting exploration as we contemplate transitioning to a more considerate service structure.
00:21:14.330 In terms of caching and CDNs, my recommendation is to leverage these systems effectively. It’s essential not to serve asset files directly from Rails applications—this method can lead to inefficiencies.
00:21:38.470 Utilizing cloud services like Cloudflare or Amazon's CloudFront can significantly improve performance while allowing for fine-tuned cache expiration settings.
00:22:00.440 Overall, the best practice is to cache as much as possible, ensuring a proper asset-serving strategy so your application can focus on functionality, not on the delivery of images.
00:22:25.490 Thank you!