Complex Made Simple: Sleep Better with TorqueBox

by Lance Ball

Introduction

The video titled "Complex Made Simple: Sleep Better with TorqueBox" is a presentation by Lance Ball at RailsConf 2012. It addresses the challenges of managing background tasks, scheduled jobs, WebSockets, long-running services, and caching in a complex Rails application. The solution presented is TorqueBox, a Ruby application server designed to absorb these complexities and simplify deployment and development for Ruby developers.

Key Points Discussed

  • Growing Complexity in Rails Apps:

    • Rails applications often start simply but become complex as they scale.
    • Common complexities arise from needing background jobs (e.g., sending emails), scheduled tasks (e.g., monthly newsletters), and long-running processes (e.g., monitoring data feeds).

  • Deployment Complexity:

    • Developers often face a steep learning curve in managing different background workers, cron jobs, and deployment environments.
    • There’s a spectrum of deployment solutions ranging from rolling your own server setup to using a platform as a service like Heroku. TorqueBox positions itself as a middle-ground solution.
  • Introduction to TorqueBox:

    • Built on JRuby and JBoss AS7, TorqueBox centralizes server features into one manageable framework.
    • It facilitates background job management, caching, and scheduling without requiring extensive configuration.
  • Case Study - ASAP Application:

    • The talk includes a real-world case study involving ASAP, a nonprofit organization.
    • They initially deployed their Rails application on Heroku but sought a cost-effective, scalable solution by moving to a VPS with TorqueBox.
    • The conversion process included installing JRuby, configuring the application to accommodate TorqueBox’s features, and leveraging built-in capabilities like background jobs and caching.
  • Benefits of TorqueBox:

    • Simplified background job management using the 'always_background' method, which marks a method so that every call to it runs in the background.
    • Built-in caching that eliminates concerns over separate cache services like Memcached.
    • Handled scheduled jobs with minimal setup, ensuring background processes work seamlessly with the Rails application.
  • Deployment and Production Environment:

    • Setting up TorqueBox involves straightforward package installations and configuration for production environments.
    • It supports scaling through mod_cluster for load balancing and demonstrates high availability features through clustering.

Conclusions and Takeaways

The presentation concludes by emphasizing that TorqueBox significantly reduces the complexities associated with Rails applications. By integrating various functionalities seamlessly, it allows developers to focus more on application development rather than infrastructure management. For organizations working with complex Rails applications, leveraging TorqueBox can lead to simplified processes and better overall management, ultimately translating into improved productivity and efficiency.

00:00:25.480 Hi everybody! Thanks for coming to my talk.
00:00:30.720 Today, I'm going to talk a little bit about TorqueBox and how it can potentially make your life a little bit simpler.
00:00:41.399 Before I do that, let me tell you a bit about who I am. I'm a software developer for Red Hat, a contributor to TorqueBox, and I'm part of the project:odd team.
00:00:54.199 The project:odd team is a polyglot team at JBoss aimed at bringing languages other than Java into the JBoss ecosystem. I also contribute to various open source projects.
00:01:04.640 But enough about me. Let's look at a scenario that I'm sure everyone in this room has experienced.
00:01:10.720 Imagine you have a Ruby application that you need to deploy into production. Since this is RailsConf, we'll call it a Rails app.
00:01:29.560 At its simplest, a Rails application consists of a web server that accepts HTTP requests and proxies them off to your application. This typically involves something like Apache and Passenger.
00:01:42.640 This is a very simple picture. And it may be that when you deploy your application, it looks like this.
00:01:54.719 But if you're successful, and we all want to be successful, your application will grow and become a bit more complex.
00:02:00.360 What kind of complexity can be introduced into your application? For instance, consider a delayed job. Most applications require some background task, like sending a verification email when a user signs up.
00:02:14.080 You don't want the user to wait while you format that email, open a connection to the SMTP server, and send it off. All of that should happen seamlessly in the background while control returns to the user immediately.
00:02:31.879 There we introduce a bit of complexity—not much in the application itself, but potentially in the deployment environment, because now you need to account for a worker process.
00:02:37.239 Another source of complexity in your deployment might be scheduled jobs. These are tasks that need to occur regularly, like sending a monthly newsletter.
00:02:50.120 Additionally, you might have long-running processes, like a monitoring task. For example, if your app needs to process tweets about Justin Bieber, you might have a long-running process on your server that connects to the Twitter Firehose to fetch all those tweets and store them in your database.
00:03:02.879 This adds more complexity, as you need to consider how to get this long-running process deployed and managed.
00:03:26.239 When contemplating deployment, there's a continuum of options. On one end, there's rolling your own solution where you get a dedicated VPS and manage everything yourself. On the other end is a Platform as a Service (PaaS) like Heroku.
00:03:40.720 TorqueBox sits in the middle of this continuum. So what is TorqueBox? It's a Ruby application server built on JRuby and JBoss AS7. If you're not from the Java world, you might not know what an application server is.
00:03:57.680 An application server is essentially a long-running process that hosts your application and provides various facilities and functionality that your application can utilize.
00:04:12.400 With TorqueBox, all the disparate components we discussed come together as a single unified entity. Built on top of JBoss, TorqueBox exposes numerous Java-based APIs for messaging, scheduling, and services, and it layers a thin Ruby API on top of those to make it accessible for Ruby applications.
00:04:47.400 Let’s explore the deployment of a TorqueBox application, or any Ruby application. If you're going the DIY route, it involves installing an operating system, several packages, configuring everything, and managing it all yourself.
00:05:01.560 This process can be extensive. You might install Apache for your web server, think about a load balancer, potentially use Unicorn or a cluster of Mongrels to host your app, and configure a cron job for background tasks.
00:05:25.319 If your application includes sending emails, you'll need to consider SMTP settings as well. Don't forget about caching and, of course, the database. All these factors contribute to the complexity of managing a Rails application.
00:05:53.520 You can simplify your life by outsourcing some of these responsibilities. Many of us turn to services like SendGrid for email, Capistrano for deployment, and New Relic for monitoring.
00:06:07.080 On the other end of the spectrum, platforms like Heroku allow you to outsource almost everything except your application, significantly reducing the burden of managing infrastructure.
00:06:34.440 While Heroku is fantastic for ease of deployment, TorqueBox does not yet run on Heroku. We're working on it, but for now, TorqueBox can make your deployment process more straightforward.
00:06:59.039 TorqueBox replaces components like cron and Unicorn, and uses mod_cluster for load balancing. mod_cluster becomes aware of new TorqueBox instances as they come online and requires no additional configuration.
00:07:10.479 Out of the box, TorqueBox provides caching and retains other deployment and monitoring processes that are still relevant.
00:07:28.599 To illustrate how TorqueBox can simplify things, let's look at a real-world example involving the Appalachian Sustainable Agriculture Project (ASAP), a nonprofit organization in the mountains of Western North Carolina.
00:07:50.280 Their goal is to connect local food producers with consumers. They developed a Rails app a few years back to facilitate this, and it's quite standard for a mid-sized Rails application.
00:08:01.280 ASAP's app includes caching, background tasks, cron jobs, and various database queries. They currently host it on Heroku, which has served them well.
00:08:18.000 They have one web dyno and one worker dyno that costs about $50 a month. However, they feel they are nearing the limits of what that service can provide.
00:08:35.000 They are considering migrating to a VPS provider such as Linode, whose Linode 1024 plan costs about $40 a month, and letting TorqueBox absorb some of the complexity of self-management.
00:08:53.680 This would potentially allow them to grow without significantly increasing their costs. First, let’s look at the development environment.
00:09:12.320 In development, I like to use RVM. Because TorqueBox is based on JRuby, you first install JRuby with a single command: 'rvm install jruby'. After that, you merely need to run 'gem install torquebox-server', and it's ready to go.
00:09:42.560 However, be cautious about doing this right away since it is a sizable installation. Once installed, you can convert your application to use TorqueBox.
00:09:59.440 The TorqueBox command line tool allows you to apply a Rails template to an existing application. You can execute 'torquebox rails my_app', which will apply the TorqueBox Rails template to your app.
00:10:23.920 If a Rails app doesn't exist, it will create a new one called 'my_app' and apply the template accordingly. The template does several things, including adding necessary TorqueBox gems to your Gemfile.
00:10:44.839 It also adds the ActiveRecord JDBC adapter, essential for Java-based connectivity. The implementation is hidden from you as a Rails developer, as it's all handled by the template.
00:11:07.140 The template further adds TorqueBox support to your Rakefile and creates directories for long-running services, scheduled jobs, and caches required for your application.
00:11:26.399 Once everything is set up, your application should run smoothly under TorqueBox. But is that enough? We want to leverage all the features TorqueBox has to offer.
00:11:43.720 To maximize the benefits of TorqueBox, we will port the application to it, starting with background jobs. Currently, the ASAP application uses Delayed Job for background tasks.
00:12:10.560 We'll utilize the built-in functionality in TorqueBox called 'backgroundable' to handle those delayed jobs in the background. Here's the current setup in one of the controllers.
00:12:30.160 Within this method, a new Active Record object is created, called ExcelExport, which has a method named 'generate_report' that takes a long time to run.
00:12:43.800 To avoid making the user wait while that method runs, a new ExcelExport job is created. This scheme is familiar to anyone who has used Delayed Job—it's straightforward.
00:13:11.560 You assign an ID to the new job and put it on the queue. The job itself is quite simple, consisting of a struct with a 'perform' method. This method fetches the corresponding Active Record object and runs the long-running 'generate_report' method.
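The Delayed Job arrangement described above can be sketched in plain Ruby. This is a minimal illustration, not ASAP's actual code: the in-memory STORE hash stands in for the database, JOB_QUEUE stands in for Delayed Job's queue table, and the name ExcelExportJob is an assumption.

```ruby
# Stand-in for the ActiveRecord model from the talk; a hash plays the database.
class ExcelExport
  STORE = {} # hypothetical in-memory "table", keyed by id

  attr_reader :id, :report

  def initialize(id)
    @id = id
    STORE[id] = self
  end

  def self.find(id)
    STORE.fetch(id)
  end

  # The slow method the controller must not wait for.
  def generate_report
    @report = "report for export #{id}"
  end
end

# The job is a struct holding only the record id, with a perform method
# that looks the record up again and runs the slow method.
ExcelExportJob = Struct.new(:excel_export_id) do
  def perform
    ExcelExport.find(excel_export_id).generate_report
  end
end

# Controller side: create the record and enqueue a job carrying its id.
# With Delayed Job this would be Delayed::Job.enqueue(ExcelExportJob.new(id)).
JOB_QUEUE = []
export = ExcelExport.new(1)
JOB_QUEUE << ExcelExportJob.new(export.id)

# Worker side: a separate worker process would pop the job and run it later.
JOB_QUEUE.shift.perform
```

The point of the sketch is the amount of ceremony: a job class, an enqueue call in the controller, and a worker process that has to be deployed and kept running.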
00:13:39.440 Now, while this process works, moving to TorqueBox can simplify things. With TorqueBox, you can achieve this in one line of code.
00:14:10.920 For example, inside the ExcelExport class we call 'always_background' and name the 'generate_report' method, which indicates that this method should always run in the background.
00:14:38.920 Now, when you call 'generate_report' on an instance of the ExcelExport object, control returns immediately. TorqueBox handles the background processing automatically.
00:15:01.680 This significantly simplifies the controller method, reducing the unnecessary Delayed Job complexity, which is no longer needed.
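To make the idea concrete, here is a plain-Ruby imitation of an 'always_background' macro. It is only a sketch of the concept: it uses an in-process thread, whereas TorqueBox routes the call through its messaging layer, and the Backgroundable module defined here is a local stand-in rather than the TorqueBox module. In this simplified version the macro must be called after the method is defined.

```ruby
# Local stand-in module; NOT the TorqueBox implementation.
module Backgroundable
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    # Wrap each named instance method so that calling it starts the work
    # on another thread and returns to the caller right away.
    def always_background(*method_names)
      method_names.each do |name|
        original = instance_method(name)
        define_method(name) do |*args, &block|
          Thread.new { original.bind(self).call(*args, &block) }
        end
      end
    end
  end
end

class ExcelExport
  include Backgroundable

  attr_reader :status

  def initialize
    @status = :pending
  end

  # Simulates the slow report generation from the talk.
  def generate_report
    sleep 0.2
    @status = :done
  end

  # The one line that moves the method into the background.
  always_background :generate_report
end
```

With this in place the controller simply calls generate_report on the record; the call returns a thread handle immediately while the report is produced elsewhere.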
00:15:24.640 With TorqueBox, the system queues the task and runs it in the background for you. Rather than managing a job struct and worker processes explicitly, you let TorqueBox put a message on a queue and process it independently.
00:15:46.559 This makes background processing immensely simpler, achieved with minimal effort. Let's shift our focus to caching next.
00:16:05.559 TorqueBox has built-in caching. Upon applying the 'template.rb' template, it automatically sets up your web sessions to utilize the TorqueBox cache.
00:16:22.959 All fragment caching and other caching mechanisms in Rails are also configured to take advantage of the TorqueBox cache. TorqueBox's caching features are based on Infinispan.
00:16:39.920 This offers a replicated, distributed, highly available key-value store with automated clustering features. If you have a group of TorqueBox instances in a cluster, your sessions are automatically clustered.
00:16:56.440 You no longer need to remember to fire up the memcached daemon manually, streamlining operations. You can use this built-in cache ad hoc as needed.
00:17:06.959 Next, let's discuss how the ASAP application currently uses cron jobs, taking advantage of the free Cron add-on on Heroku.
00:17:18.240 Cron is implemented as a Rake task that runs once a day, updating the GIS data, which is essential for planning trips to local farms in the Appalachian Mountains.
00:17:47.960 While this setup works, it's hardly ideal, necessitating some level of intervention to keep the data up to date.
00:18:05.720 Moving forward, ASAP might leverage TorqueBox's built-in scheduled jobs as a primary feature of their application.
00:18:23.760 In TorqueBox, scheduled jobs are first-class components that reside in the 'app/jobs' directory of your Rails root.
00:18:38.120 To configure a scheduled job, you define a Ruby class; its only requirement is a 'run' method. You then use the TorqueBox configuration file, 'torquebox.yml', to schedule the job.
00:18:53.240 Once the application is deployed, your scheduled job is automatically included in the environment setup. This automation reduces the workload on your team.
00:19:08.480 As the scheduled job executes, it has seamless access to all Active Record objects and can perform tasks in the same environment without additional overhead.
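A scheduled job of this kind might look like the sketch below. The class name GisDataUpdater, the job key gis_updater, the cron expression, and the placeholder data are all assumptions for illustration. The YAML is embedded as a string so the example runs on its own; in a real application it would sit in 'torquebox.yml' and the class in 'app/jobs'.

```ruby
require 'yaml'

# Hypothetical job class: any Ruby class with a run method.
class GisDataUpdater
  # Stand-in for the real data store the job would refresh.
  CACHE = {}

  def run
    CACHE[:gis_data] = fetch_gis_data
  end

  private

  # Placeholder for the real (slow) GIS lookup.
  def fetch_gis_data
    { farms: 3, refreshed: true }
  end
end

# What the scheduling entry in torquebox.yml might look like.
# The cron fields are: seconds, minutes, hours, day of month, month,
# day of week, so this expression means 03:00:00 every day.
TORQUEBOX_YML = <<~YAML
  jobs:
    gis_updater:
      job: GisDataUpdater
      cron: '0 0 3 * * ?'
      description: Refresh GIS data once a day
YAML

CONFIG = YAML.safe_load(TORQUEBOX_YML)

# Simulate what the scheduler does when the cron expression fires:
# look up the configured class, instantiate it, and call run.
job_class = Object.const_get(CONFIG['jobs']['gis_updater']['job'])
job_class.new.run
```

Nothing outside the application needs to be configured: the schedule travels with the app, and the job runs with the same classes loaded as the web requests.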
00:19:25.880 However, simply running the job once a day to update the cache might not be the most efficient approach. Instead, real-time data updating would be more beneficial.
00:19:46.280 TorqueBox services can handle this requirement by keeping data fresh in the background. These services are also Ruby classes with an 'initialize' method that receives options.
00:20:07.000 In the 'initialize' method, you can set up a message queue provided by TorqueBox, enabling real-time data refresh while avoiding heavy database queries.
00:20:29.920 The service's startup typically returns immediately, so you set up a thread that constantly waits for incoming messages and updates the cache accordingly.
00:20:52.640 When there's an update to the Active Record object, a message gets published on the queue, and the service listens for that message to refresh the cache in real-time.
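The service pattern described here can be sketched with Ruby's built-in Queue and a plain hash. This is an illustration under assumptions: the class name CacheRefreshService, the start and stop method names, the option keys, and the :shutdown sentinel are choices made for this sketch, and a real TorqueBox service would use a TorqueBox queue and the TorqueBox cache instead.

```ruby
# Hypothetical long-running service: listens on a queue and keeps a cache fresh.
class CacheRefreshService
  # Options carry the collaborators; here a Ruby Queue and a Hash.
  def initialize(opts = {})
    @queue = opts.fetch(:queue)
    @cache = opts.fetch(:cache)
  end

  # Returns immediately; the listening loop runs on its own thread.
  def start
    @thread = Thread.new do
      # Block until a message arrives; stop when the sentinel is received.
      while (message = @queue.pop) != :shutdown
        @cache[message[:id]] = message[:data]
      end
    end
  end

  # Ask the listener to finish and wait for it.
  def stop
    @queue << :shutdown
    @thread.join
  end
end

# Publisher side: when a record changes, a message goes onto the queue.
UPDATES = Queue.new
CACHE = {}

service = CacheRefreshService.new(queue: UPDATES, cache: CACHE)
service.start
UPDATES << { id: 7, data: 'fresh farm listing' }
service.stop
```

Because the cache is updated only when a change message arrives, readers never have to wait on the expensive query, and the data is no staler than the last published update.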
00:21:19.080 So, how does all this deployment look in production? Setting up a TorqueBox server is simple and straightforward.
00:21:38.560 Since my employer is Red Hat, I set this up on a Linode 1024 VPS running Fedora. The setup consists of installing a few packages, including Java, Apache, and PostgreSQL.
00:21:57.520 Once the packages are in place, the TorqueBox installation can easily be done via a shell script or a zip download, which includes both TorqueBox and JRuby.
00:22:16.320 An init.d script is included to facilitate starting TorqueBox when the server boots, with a few simple lines of configuration.
00:22:34.239 Moreover, TorqueBox employs mod_cluster to manage load balancing. As new TorqueBox instances come online, mod_cluster detects them automatically.
00:22:50.600 Configuring Apache simply involves loading the appropriate module and setting up a virtual host to route requests to TorqueBox.
00:23:12.200 Deployment operations can use Capistrano, integrating with TorqueBox recipes, making it easy to start, stop, and manage servers.
00:23:35.760 As the application grows and evolves, ASAP is considering the move to TorqueBox as they plan future features, like WebSockets.
00:23:59.800 These features are built into TorqueBox, enabling high availability and easy management of tasks such as sending out bulk emails.
00:24:14.480 I would also like to mention that Red Hat has a cloud offering called OpenShift, which TorqueBox can run on, including its free version.
00:24:33.600 If you're interested, check out OpenShift where the TorqueBox team provides resources and information to help you get started.
00:24:50.560 As for the roadmap, TorqueBox is currently at version 2.0.1, which was released recently, and Red Hat is expected to keep investing in it.
00:25:20.400 We're also working on interoperability with other languages, such as Clojure, through a project called Immutant.
00:25:43.640 There are resources available if you're interested in TorqueBox: its website, Twitter account, and a friendly IRC channel where no question is too silly to ask.
00:26:09.920 I know I talked quickly, but I appreciate your attention. Are there any questions?
00:27:00.000 Yes, does it handle background jobs in a separate process?
00:27:07.920 Yes, background jobs run in a separate pool.
00:27:15.120 We expect to support JRuby 7 very shortly, while background jobs primarily run in one JVM instance.
00:27:45.440 Is there anything else you would like to know about the startup times?
00:28:00.320 It's not as quick as WEBrick, but once the JVM is fired up, TorqueBox performs efficiently.
00:28:30.160 Expect startup times of around four to seven seconds, depending upon your application's configuration.
00:28:52.560 We can deploy numerous applications on a single TorqueBox instance if you have the hardware to support it.
00:29:10.000 Regarding memory management, background jobs and web requests share the same JVM.
00:29:45.960 You want to ensure that your background jobs don't consume all available memory, as this could impede web requests.
00:30:05.560 TorqueBox currently doesn't provide memory isolation between the two, but it does monitor memory consumption automatically.
00:30:30.680 For monitoring, New Relic performs exceptionally well and provides insights into background tasks.
00:30:53.040 Additionally, there’s a Sinatra app named Backstage, which monitors your application metrics in real time.
00:31:05.320 Finally, if you're interested in testing and continuous integration, TorqueBox has a testing framework designed to operate within the TorqueBox environment.
00:31:20.560 You can run your specs inside the TorqueBox context to maximize testing capabilities.
00:31:36.320 Thank you so much for your engaged listening and questions!