Recurring Background Jobs with Sidekiq-scheduler

by Andreas Fast and Gianfranco Zas

The video titled "Recurring Background Jobs with Sidekiq-scheduler" features Andreas Fast and Gianfranco Zas discussing the Sidekiq-scheduler extension for Sidekiq, a popular background job processing tool in Ruby on Rails applications. The talk aims to fill the gap in Sidekiq's functionality by enabling the scheduling of recurring background jobs, which it does not natively support. Throughout the session, they highlight the following key points:

  • Introduction to Sidekiq-scheduler: The presenters, representing Moove-it, describe the company's growth and technical needs, setting the stage for a deeper look at sidekiq-scheduler.

  • Use Cases: They share their experience from 2012, when they needed to fetch financial data daily and send reports. They explain the limitations of cron jobs and the inefficiencies they found with cron entries, particularly in a JRuby environment.

  • Integration Challenges: They encountered issues managing multiple cron entries and the fact that deploying an app in a clustered environment often triggered duplicate tasks. This led to the search for a better solution, resulting in the development of the Sidekiq-scheduler.

  • Cyclic Scheduler Architecture: The main functionality of the Sidekiq-scheduler is discussed, explaining how it adheres to the Sidekiq lifecycle, including the setup of Rufus-scheduler as part of the configuration to manage recurring jobs effectively.

  • Handling Multiple Instances: The talk addresses the complexities of running multiple Sidekiq instances and how the scheduler can be utilized to avoid running duplicated jobs across different nodes.

  • Future of Sidekiq-scheduler: Fast and Zas outline their vision to improve the scheduler by minimizing conflicts in the global Sidekiq configuration and ensuring better multi-instance support without overstepping into functionalities outside of scheduling.

  • Open Source Contributions and Extensions: As a final bonus, they provide insights on creating custom Sidekiq extensions, detailing how developers can contribute to the Sidekiq ecosystem and leverage it for their applications.

In conclusion, the video presents a comprehensive view of how the Sidekiq-scheduler enhances the capabilities of Sidekiq by enabling recurring job scheduling, which is essential for efficiently managing background tasks in Ruby on Rails applications. The discussion reinforces the importance of collaboration and community contributions within the open-source environment.

00:00:12.440 Awesome! So, this is a sponsored talk by Moove-it, the company we work for. We want to talk about Sidekiq Scheduler, which is a Sidekiq extension for recurring job processing.
00:00:21.869 My name is Andreas, and this is Gianfranco. We work for Moove-it, based in Uruguay in South America, and we’ve been growing. We have team members in Argentina and Colombia, and we also have an office in Austin. If you're in Austin, we’re always hiring, so feel free to check us out.
00:00:38.790 So, let's talk about Sidekiq Scheduler, which is where it all starts. I'll leave you with Gianfranco now for a while; he'll explain a little more about what it is.
00:00:46.950 Okay, let's begin with a couple of use cases. Back in 2012, we were working on a project that needed to fetch, every day, information related to credit level scores based on financial data and apply interest to Amplitude cards. Additionally, we needed to send weekly activity reports to various email addresses. The first approach we considered was to set up cron entries to run these processes as scheduled tasks. We thought it was an original approach, but it came with certain drawbacks.
00:01:19.350 So, what are the issues with cron jobs? To answer that, it all comes down to composability. Cron jobs require starting the Ruby interpreter for each run, and since we were using JRuby at the time, each run took some time to start up and consumed more memory. If we have a cron entry scheduled to run at, say, 3 p.m., it has to start a new process and consume some memory. There's no straightforward way to programmatically enable or disable cron entries, and the schedule definition must live outside the application's scope.
00:01:42.720 Moreover, it's not a portable solution, since it relies on the operating system. You can't stop the scheduler without affecting other cron jobs. Cron also works at minute granularity rather than second granularity, and when the application is deployed to a cluster, every node triggers the task, so it runs several times instead of only once. Despite these drawbacks, we did use cron jobs initially, but we also looked for alternative solutions.
00:02:01.750 In 2012, we discovered a gem called Sidekiq Scheduler, which was designed to schedule Sidekiq jobs for a specific future time. While this wasn't exactly what we were looking for, we explored whether cron-style scheduling could be integrated with Sidekiq Scheduler. While searching for alternatives, we came across Rufus Scheduler, which allows you to use cron-like syntax to schedule tasks easily.
00:02:30.470 We then came up with the idea of integrating Rufus Scheduler into Sidekiq Scheduler. After implementing this in 2012, we started actively using Sidekiq Scheduler in a project we worked on in 2013. Morten, who was maintaining the gem, eventually handed over ownership to us. As of 2016, we added support for ready jobs, allowing Sidekiq to act in sync with various external triggers.
00:03:04.490 Now, using Sidekiq Scheduler is just like using Sidekiq. First, you declare a worker class for the recurring job. It includes the Sidekiq worker module and must define the `perform` method. The schedule configuration can be placed inside the Sidekiq configuration file under the schedule key.
00:03:31.610 For example, in the configuration, you could set up a 'Hello World' job that runs every minute, scheduled to trigger when the current second is 0. When that specific time arrives, Sidekiq Scheduler will push the 'Hello World' job into Sidekiq. After that, you install the gem (if you're using Bundler, add it to your Gemfile as usual) and run Sidekiq as normal.
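The worker and schedule described above might look roughly like the following sketch. The class name `HelloWorld`, the printed message, and the constant `SCHEDULE` are illustrative assumptions, as is the stand-in `Sidekiq::Worker` module, which is defined only when the sidekiq gem is not installed so that the snippet runs on its own. The schedule is written as a Ruby hash that mirrors what would sit under the schedule key in the Sidekiq configuration file; the six-field cron string assumes the seconds-first form that Rufus Scheduler accepts.

```ruby
begin
  require "sidekiq"
rescue LoadError
  # Stand-in (assumption) so this sketch runs without the gem installed.
  module Sidekiq
    module Worker; end
  end
end

# A recurring worker is an ordinary Sidekiq worker with a #perform method.
class HelloWorld
  include Sidekiq::Worker

  def perform
    puts "Hello world"
  end
end

# Ruby-hash mirror of the schedule section of the configuration file.
# "0 * * * * *" reads: second 0 of every minute.
SCHEDULE = {
  "hello_world" => {
    "cron"  => "0 * * * * *",
    "class" => "HelloWorld"
  }
}.freeze
```

With a real installation, the same entry would live in the Sidekiq configuration file rather than in a Ruby constant, and Sidekiq would be started with the worker file required, as the talk describes next.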
00:04:01.950 In the example we showed, we told Sidekiq to require our Ruby file containing the worker class definition. Every minute, Sidekiq Scheduler will enqueue the job, and Sidekiq will perform the job accordingly. Various scheduling types are supported, primarily based on Rufus, which also allows for different job types to be queued.
00:04:31.960 The main purpose of Sidekiq Scheduler is to push jobs into Sidekiq, enabling normal processing. When Sidekiq Scheduler is required, it hooks directly into the startup and shutdown phases of the Sidekiq lifecycle, which we will explain shortly.
00:05:03.690 During the startup phase, we fetch the configuration from the Sidekiq initializer and start the Rufus Scheduler. Rufus runs a thread that iterates over all the scheduled jobs and invokes the Ruby block registered for each job when it is due.
00:05:21.860 We then set up a scheduled job for every one of the configuration jobs. Each handler in Rufus ensures that the job instance was not previously pushed and finally pushes the job into Sidekiq.
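The startup flow just described (one handler per configured job, each handler checking that the job instance was not already pushed before pushing it) can be sketched in plain Ruby. This is a simplified model, not the gem's implementation: `MiniScheduler`, `trigger`, and the in-memory `queue` are hypothetical names, and `trigger` stands in for the Rufus Scheduler thread firing a handler.

```ruby
require "set"

# Simplified, in-memory model of the startup flow (hypothetical names).
class MiniScheduler
  attr_reader :queue

  def initialize(schedule)
    @schedule = schedule
    @handlers = {}
    @pushed   = Set.new # job instances already pushed: [name, epoch seconds]
    @queue    = []      # stand-in for the Sidekiq queue
  end

  # Startup phase: register one handler per configured job.
  def start
    @schedule.each do |name, config|
      @handlers[name] = lambda do |fire_time|
        # Set#add? returns nil if this job instance was already recorded.
        next unless @pushed.add?([name, fire_time.to_i])

        @queue << { "class" => config["class"], "args" => config.fetch("args", []) }
      end
    end
  end

  # Stand-in for the scheduler thread invoking a job's handler.
  def trigger(name, fire_time)
    @handlers.fetch(name).call(fire_time)
  end
end
```

Triggering the same job twice for the same fire time leaves a single entry in `queue`, while a later fire time adds a second one.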
00:05:39.420 While implementing this approach, we encountered the challenge of managing multiple instances running at the same time. Initially, cron jobs were only intended to run with a single Sidekiq instance. Once multiple nodes were up, each scheduled job instance still had to be triggered by only one of them. To solve this, we later extended multi-node support to cron and the other job types.
00:06:33.740 Now, it is indeed possible to run multiple Sidekiq Scheduler instances with cron and other types of jobs. So if you have a setup with multiple Sidekiq instances, the tasks will not interfere with each other. We have also put some mechanisms in place to avoid conflicts.
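One common way to keep several instances from pushing the same job instance is an atomic "claim" in a store that all nodes share. The sketch below illustrates that idea only; it is not taken from the talk or from the gem's source. `SharedRegistry` and `simulate_nodes` are hypothetical names, and the mutex-protected hash stands in for a shared store such as the Redis instance Sidekiq already depends on.

```ruby
# In-memory stand-in for a store shared by all nodes (assumption).
class SharedRegistry
  def initialize
    @seen = {}
    @lock = Mutex.new
  end

  # Returns true only for the first caller that claims a job + fire time.
  def claim(job_name, fire_time)
    key = "#{job_name}:#{fire_time.to_i}"
    @lock.synchronize do
      next false if @seen.key?(key)

      @seen[key] = true
    end
  end
end

# Each thread plays the role of one node whose scheduler fired the same job.
# Returns the names of the nodes that actually enqueued it.
def simulate_nodes(count, registry, job_name, fire_time)
  winners = Queue.new
  threads = Array.new(count) do |i|
    Thread.new do
      winners << "node-#{i}" if registry.claim(job_name, fire_time)
    end
  end
  threads.each(&:join)
  Array.new(winners.size) { winners.pop }
end
```

However many nodes fire for the same minute, only one claim succeeds, so the job is enqueued once.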
00:06:50.060 Alright, let’s discuss the future of Sidekiq Scheduler. One of the main things we want to do is to stop polluting the global Sidekiq configuration. Right now, we load our configuration into Sidekiq's own configuration. If Sidekiq ever uses the same settings for other purposes, we run the risk of a collision.
00:07:10.180 Our goal is to prevent such issues by ensuring that every scheduling type can work across multiple instances seamlessly. We also want to maintain the focus of Sidekiq Scheduler purely on the scheduling aspect without invading other functionalities.
00:07:44.150 Continuing from there, we plan to implement functionalities that improve our secure scaling abilities. Importantly, we aim to optimize our codebase and project structure for better organization.
00:08:06.430 As a practical example, we wanted to introduce a feature that automatically closes stale issues in open-source repositories. If an issue appears inactive after a specified period, like a month or two, we will want the system to automatically close it.
00:08:37.820 In our tests, we created a small code segment to mark duplicate issues. The created task runs every minute, checks the marked issues, and closes those that meet the criteria. Now, let’s demonstrate how it works.
00:09:01.630 You can check it on the homepage of Sidekiq Scheduler and you’ll see our cron job there for issue cleanup. It runs every minute. As you can see, it will find those issues and close them automatically if they are marked as stale.
00:09:34.440 Eventually, if no one shows any activity on these issues for a few months, the job will run again and will check, closing the stale ones accordingly.
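The core of such a cleanup job could look like the sketch below, which would sit inside a worker's `perform` method. Everything here is an assumption for illustration: the `Issue` struct, the `"stale"` label, the 60-day threshold, and the helper `close_stale!`. A real job would read and close issues through the hosting service's API instead of mutating in-memory structs.

```ruby
# Hypothetical in-memory representation of an issue.
Issue = Struct.new(:number, :labels, :updated_at, :state)

STALE_AFTER = 60 * 24 * 60 * 60 # 60 days in seconds (assumed threshold)

# Closes open issues that are marked stale and have been inactive for at
# least STALE_AFTER seconds. Returns the issues that were closed.
def close_stale!(issues, now: Time.now)
  stale = issues.select do |issue|
    issue.state == "open" &&
      issue.labels.include?("stale") &&
      now - issue.updated_at >= STALE_AFTER
  end
  stale.each { |issue| issue.state = "closed" }
end
```

Scheduling this every minute, as in the demo, keeps the reaction time short; for a real repository a much longer interval would likely be enough.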
00:10:09.860 As a bonus for our talk, we wanted to showcase what it takes to write a Sidekiq extension. In the Sidekiq lifecycle, there are specific events you can hook into. We utilize startup and shutdown events to manage configuration settings, like starting the Rufus Scheduler.
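The hook mechanism can be modelled in a few lines of plain Ruby. This sketch is loosely modelled on the way Sidekiq lets server code register blocks for events such as startup and shutdown, but it does not use Sidekiq itself: `Lifecycle`, `FakeScheduler`, and `install_scheduler` are hypothetical names introduced for illustration.

```ruby
# Minimal event registry standing in for the host's lifecycle (hypothetical).
class Lifecycle
  EVENTS = %i[startup shutdown].freeze

  def initialize
    @hooks = Hash.new { |hash, event| hash[event] = [] }
  end

  # Register a block to run when the given event fires.
  def on(event, &block)
    raise ArgumentError, "unknown event: #{event}" unless EVENTS.include?(event)

    @hooks[event] << block
  end

  # Run every block registered for the event, in registration order.
  def fire(event)
    @hooks[event].each(&:call)
  end
end

# Stand-in for the scheduler an extension would manage.
class FakeScheduler
  attr_reader :running

  def initialize
    @running = false
  end

  def start
    @running = true
  end

  def stop
    @running = false
  end
end

# What an extension does when it is loaded: hook into both events.
def install_scheduler(lifecycle, scheduler)
  lifecycle.on(:startup)  { scheduler.start }
  lifecycle.on(:shutdown) { scheduler.stop }
end
```

The extension never starts the scheduler directly. It only registers blocks, and the host decides when they run, which is what keeps the scheduler aligned with the Sidekiq process lifecycle.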
00:10:30.700 One important point to note is that when you run the web extension for Sidekiq Scheduler, you can create a customizable dashboard showing currently queued jobs. It also allows you to disable those jobs without needing to log into the server directly.
00:10:59.490 To achieve this, you just need to create a module extension with a method that defines the hooks for it. You also need to register the extension in Sidekiq to enable adding the dashboard tab easily. You can define actions to render the necessary data in that tab.
00:11:27.390 Additional features include the option to add support for multiple languages to Sidekiq Scheduler. This is useful for projects whose users speak different languages.
00:11:36.100 Lastly, let’s discuss contributions and collaboration. Sidekiq Scheduler is an open-source project that has existed for quite some time and has had around 43 contributors over the years. Currently, we are dedicating approximately six hours a week to fixing issues and improving the scheduler.
00:12:06.320 We appreciate all feedback and contributions from the community. Additionally, we maintain other open-source libraries. For instance, Rusen is a simple exception notification gem for logging and sending errors in Ruby applications.
00:12:44.830 We also have Ruie, a lightweight rules evaluator for conditional expressions, and Angus, a REST-like API framework that generates documentation while you write. Lastly, we created Fake It, an Android data generator for creating realistic test data.
00:13:15.760 Thank you very much for your attention, and we hope you find Sidekiq Scheduler and these other projects useful!