Realtime web applications with streaming REST

by Brad Gessler

In this session from RailsConf 2012, speaker Brad Gessler explores the development of real-time web applications with Rails. He emphasizes the importance of not just storing data persistently but also pushing it out to users' browsers so that they always work with the latest information. Gessler traces his journey through various technologies, beginning with REST and leading to the integration of real-time streaming. Key topics include:

  • Choice of Technology: Gessler explains why he chose not to adopt existing real-time frameworks such as Socket.IO and Meteor. He highlights Socket.IO's complexity and lack of configurability, and ultimately praises the simplicity of building his own streaming solution.
  • Integration with Rails: The session covers how to leverage Rails' capabilities, outlining the integration of streaming resources over a JSON API. Gessler describes the development of 'Firehose', a custom framework for efficiently pushing real-time updates to connected clients.
  • Handling Real-time Data: Gessler explains the limitations of long polling, especially when dealing with large datasets and state changes, using Twitter's streaming as a case study. He emphasizes the difficulty of polling for state changes in such scenarios and how his solution circumvents these challenges.
  • Architecture and Security: The talk details their architecture involving RabbitMQ and Thin servers for real-time data management. Security considerations are also addressed, highlighting how existing Rails authorization logic can be adapted to protect sensitive data streams.
  • Team Collaboration: Gessler discusses the growing complexity of applications and the importance of breaking them into smaller, manageable services. This approach allows teams to focus on specific functionalities without overwhelming the Rails framework.
  • Conclusion: The final takeaway emphasizes how the right choice in technology and design can lead to efficient real-time applications without significantly altering existing infrastructure. Gessler encourages developers to learn from his experiences and consider contributing to the real-time application ecosystem with their own implementations.
00:00:25.599 Alright, so the results of this graph look good. This tells me that I made my slides right. I was kind of worried that everybody would come in here with a bunch of Socket.IO experience, and I would look like a total idiot. So this is good; this is a good start.
00:00:41.320 Today, I'm going to talk about the technology we use to build what you see right here at my company, Poll Everywhere. Our focus is on creating real-time audience situations to get instant feedback from everyone in the audience. I think what's more important today than just going out and using Socket.IO off the shelf is to understand the context of how you pick your real-time application framework. There are quite a few of them out there. As you saw in that chart, we have Socket.IO, and there are many new frameworks emerging like Meteor.
00:01:10.880 When we started our company in 2007, we built on Rails 1.1, before REST came around. We only got into REST after Rails 1.2 was released, and the introduction of REST was pretty amazing. My mind was blown by the nice resources and clean design patterns it provided. We could build controllers that would expose parts of our application, like our polls, in clean ways that made it easy to test and secure with a solid authorization pattern. This evolution also left us with a robust API.
00:01:56.680 Interestingly, when REST first came out, the only real consumer out there was Active Resource. The JavaScript frameworks at that time weren’t ideal for implementing MVC patterns client-side. There was a monolithic JavaScript framework called SproutCore that aimed to take over everything. However, we quickly abandoned the idea of using it for our application because it felt too cumbersome and would have forced us to give up tools we loved, like HAML and SASS, which significantly enhanced our productivity and the look and feel of our application.
00:02:49.440 Fortunately, Backbone.js came to the rescue, followed by Ember.js. Finally, we had sensible client-side JavaScript frameworks to consume our RESTful API. This allowed us to create responsive web applications that we all know, like Gmail and Google Maps. These frameworks work harmoniously with the RESTful backend we developed. For example, this is some Backbone code used to set up a user model on the client side, pulling user data from the server.
00:03:13.080 In today's world, building web applications results in a responsive client-side experience. However, it still tends to be limited to specific URL interactions, such as pulling in some JSON when editing a user. Oftentimes, you wouldn't even make an explicit request; you would just interpolate the JSON directly into the page.
00:04:06.599 This is actually part of the Backbone documentation today, demonstrating how to make your application feel real-time. This approach is precisely what we adopted, retaining much of our long polling in the application to refresh user data every 10 seconds. While this method is straightforward, issues arise when scaling up, with increasing latencies in processing data.
00:04:41.880 To deal with these problems, we’ve optimized our stack. We have NGINX running caches for a couple of seconds, which cache all the JSON in our long polling. Although it can create some spaghetti code, it's fast and effective for short time frames. As we receive messages related to polls, we update various counter caches to ensure the results remain current.
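A minimal sketch of the "cache for a couple of seconds" idea, written as an in-process Ruby cache rather than a web server cache. The class name, the 2-second TTL, and the injectable clock are assumptions for illustration, not the setup used in the talk.

```ruby
# Hypothetical short-lived cache; names and TTL are illustrative assumptions.
class ShortLivedCache
  Entry = Struct.new(:value, :expires_at)

  # The clock is injectable so expiry can be exercised without sleeping.
  def initialize(ttl_seconds:, clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @ttl = ttl_seconds
    @clock = clock
    @entries = {}
  end

  # Returns the cached value for key while it is fresh; otherwise runs the
  # block, stores its result for ttl_seconds, and returns it.
  def fetch(key)
    now = @clock.call
    entry = @entries[key]
    return entry.value if entry && entry.expires_at > now

    value = yield
    @entries[key] = Entry.new(value, now + @ttl)
    value
  end
end
```

With a TTL of two seconds, many clients polling the same URL within that window share one rendered JSON body, at the cost of results that can be up to two seconds stale.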
00:05:18.560 Another challenge with real-time applications is managing errors. We receive an email from Airbrake every time an issue arises, and these problems tend to cascade and escalate quickly. With long polling, the challenges become particularly apparent when managing large data sets. A great example would be streaming Twitter feeds without missing valuable data about state changes, such as tweets being deleted.
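The state-change problem can be illustrated with a small sketch. It assumes a hypothetical in-memory feed that both records every change as an event and exposes its current state; a polling client only ever sees snapshots, so anything created and deleted between two polls is invisible to it. All names here are illustrative.

```ruby
# Hypothetical feed: logs every change as an event and exposes a snapshot.
class Feed
  attr_reader :events

  def initialize
    @items = {}
    @events = []
  end

  def create(id, text)
    @items[id] = text
    @events << [:created, id]
  end

  def delete(id)
    @items.delete(id)
    @events << [:deleted, id]
  end

  # What a polling client sees: only the ids that exist right now.
  def snapshot
    @items.keys
  end
end

# Diffs two snapshots, which is all the information a poller has.
def changes_between(before, after)
  { added: after - before, removed: before - after }
end
```

If item 2 is created and deleted between two polls, the snapshot diff never mentions it, while the event log contains both the creation and the deletion. The diff also has to compare the full set of ids on every poll, which grows more expensive as the data set grows.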
00:06:25.600 In assessing our technology stack today, we considered how to integrate streaming without compromising our existing Rails application. At the same time, we aimed to respect the existing authorization framework. Since our team has grown, we've ended up with a large Rails application that complicates matters when integrating real-time capabilities.
00:07:53.239 As a solution, we broke our application down into autonomous pieces. The current structure features a Rails application running our JSON API while other small satellite applications handle specific tasks like rendering and processing user interactions. These satellite applications utilize JavaScript and frameworks we enjoy, allowing us to build sleek applications similar to conventional desktop applications, but using web technologies.
00:08:57.480 Implementing these satellite applications is straightforward—deployments are a matter of syncing three files (HTML, JavaScript, and CSS) onto an NGINX server. Each application will then efficiently communicate with our JSON API.
00:09:16.080 On the streaming front, we have kept it separate from our Rails app for several reasons. Primarily, the streaming processes operate in an evented environment, making it easier to manage within a dedicated service.
00:10:40.960 When evaluating our streaming options, we explored various candidates. Initially, we looked at Socket.IO, which is widely popular for its full-duplex capabilities. We ran into leaky abstractions in its implementation that made it harder to meet our low-latency needs, which led us to explore alternatives.
00:11:58.400 Meanwhile, Meteor caught our attention for its real-time capabilities. However, we decided it didn't fit our team structure, as it seemed too tightly coupled, reminiscent of the monolithic architecture we wanted to move away from. After considering several frameworks, none aligned perfectly with our needs, so we decided to create a custom solution.
00:13:29.480 We focused on solving the problem we identified with web applications — pushing data to Backbone and Ember.js models that reside on the client. We built a lightweight framework, which we referred to as Firehose, to manage data streaming effectively.
00:14:30.080 To implement Firehose, you'll first need to install RabbitMQ, which offers a robust messaging layer. We have a simple API that uses HTTP and WebSockets as transport channels. Publishing and subscribing are both structured as plain HTTP URLs, which keeps setup and operation straightforward.
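A minimal sketch of what publishing and subscribing over plain HTTP URLs could look like, assuming a server that treats PUT as "publish" and GET as "subscribe" on the same channel path. The host, port, channel path, and helper names are assumptions for illustration, not Firehose's documented API.

```ruby
require "net/http"
require "uri"

# Builds (does not send) the request that would publish a message.
# The default base_url is a hypothetical local streaming server.
def publish_request(channel, body, base_url: "http://localhost:7474")
  request = Net::HTTP::Put.new(URI.join(base_url, channel))
  request.body = body
  request
end

# Builds (does not send) the request that would subscribe to a channel.
def subscribe_request(channel, base_url: "http://localhost:7474")
  Net::HTTP::Get.new(URI.join(base_url, channel))
end

# Sends a request; only call this when a server is actually listening.
def deliver(request)
  uri = request.uri
  Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }
end
```

Under these assumptions, a Rails model callback could call `deliver(publish_request("/polls/1.json", poll_json))` after a save, and any client holding a GET open on `/polls/1.json` would receive the new body.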
00:15:57.679 The core technology relies on event-driven functions. For example, when a client connects, it subscribes to its relevant user streams while the server maintains states effectively. This reduces much of the complexity associated with connection management as RabbitMQ retains messages even if clients experience intermittent connectivity.
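The retention behaviour described above can be sketched with an in-memory stand-in for the broker. This is not RabbitMQ's API; the `Channel` and `Subscriber` classes and the sequence-number scheme are assumptions chosen to show how a client that drops offline can catch up when it reconnects.

```ruby
# In-memory stand-in for the message broker (hypothetical, not RabbitMQ).
class Channel
  def initialize
    @messages = [] # retained payloads; index + 1 is the sequence number
    @mutex = Mutex.new
  end

  # Stores a message and returns its sequence number.
  def publish(payload)
    @mutex.synchronize do
      @messages << payload
      @messages.size
    end
  end

  # Returns [sequence, payload] pairs published after last_seen.
  def messages_since(last_seen)
    @mutex.synchronize do
      @messages.each_with_index.drop(last_seen).map { |payload, i| [i + 1, payload] }
    end
  end
end

# Hypothetical client that remembers the last sequence it processed.
class Subscriber
  attr_reader :received

  def initialize(channel)
    @channel = channel
    @last_seen = 0
    @received = []
  end

  # Called on every connect or reconnect: catches up on anything missed.
  def sync
    @channel.messages_since(@last_seen).each do |sequence, payload|
      @received << payload
      @last_seen = sequence
    end
  end
end
```

Because the client only reports the last sequence it saw, a dropped connection needs no special handling: the next `sync` returns exactly the messages published in the meantime. This sketch retains messages forever; a real broker would expire them.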
00:17:05.160 A key goal for this server architecture was to be minimally invasive while keeping the code efficient. That way, resources are pushed out effectively without requiring major refactoring of existing code or frameworks. We also planned for simple asset distribution to speed up deployment.
00:18:50.640 As we progressed with our streaming infrastructure, security became a crucial focus. We considered the use of middleware to verify access rights before establishing a stream connection and paired this with our existing Rails authorization logic.
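A minimal sketch of such a middleware, written against the plain Rack-style `call(env)` interface so it needs no gems. The class name and the `authorizer` callable are hypothetical; the callable marks the place where existing Rails authorization rules would be consulted.

```ruby
# Rack-style middleware that checks access before a stream is opened.
# `authorizer` is a hypothetical hook: any callable taking (token, path)
# and returning true or false.
class StreamAuthorization
  def initialize(app, authorizer:)
    @app = app
    @authorizer = authorizer
  end

  def call(env)
    token = env["HTTP_AUTHORIZATION"]
    path = env["PATH_INFO"]

    if @authorizer.call(token, path)
      @app.call(env) # allowed: hand off to the streaming app
    else
      [403, { "content-type" => "text/plain" }, ["Forbidden"]]
    end
  end
end
```

Rejecting the request before it reaches the streaming app means an unauthorized client never holds a long-lived connection open, and the streaming service itself does not need to know anything about users or permissions.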
00:20:25.760 Although this setup performs well, we’re aware of the evolving landscape. We know we can enhance it further, integrating other backends like ZeroMQ or considering different caching technologies, depending on the workload that emerges.
00:21:43.680 For now, everything is neat, clean, and manageable. Although the full implementation isn't complete, we expect to refine this real-time layer further by next week. The plan is to go live after addressing the long polling and connection issues that concerned us earlier.
00:24:25.840 If you're interested in real-time streaming applications, you can explore Firehose on our GitHub page. We also have multiple developer positions available if you want to join our team. And finally, feel free to follow me on Twitter for further updates on Firehose.