Summarized using AI

Hacking Sidekiq for Fun and Profit

Darcy Laycock • February 19, 2014 • Earth

In this talk, presented at RubyConf AU 2014, Darcy Laycock discusses the potential of Sidekiq, a threaded background job processing tool for Ruby applications. The discussion is divided into two main parts: an introduction to Sidekiq and an exploration of its advanced capabilities.

Key Points Discussed:
- Background Processing: Background processing is essential in modern applications for handling complex logic and operations without blocking user interactions. Sidekiq is a powerful tool for managing these processes efficiently.
- History of Background Job Tools: Laycock provides a brief overview of historical background job processing tools in Ruby, including BackgrounDRb, Delayed Job, Resque, and the introduction of Sidekiq.
- Advantages of Sidekiq: Laycock highlights Sidekiq's features, such as its threading capabilities, built-in exception handling, job retries, unique job management through middleware, and the use of Redis for job storage and processing.
- Middleware and Extensibility: Sidekiq's middleware system allows developers to customize functionality, making it adaptable to specific application needs. Custom middleware can handle job uniqueness, manage retries, and improve overall job processing efficiency.
- Practical Aspects: The talk covers how Sidekiq operates, including job queuing, serialization into JSON, and the coordination of job execution through the Sidekiq manager, emphasizing fault tolerance and operational efficiency.
- Advanced Usage Patterns: Laycock presents advanced use cases, such as defining job pipelines, managing concurrency, and ensuring job idempotency to avoid unintended side effects.
- Future Developments: The speaker shares insights into upcoming features in Sidekiq 3, including a dead job queue for managing failed jobs and nested job batches for complex workflows.

Conclusions and Takeaways:
- Developers are encouraged to explore Sidekiq's capabilities to enhance background job processing. It is essential to understand its structure and functionality to effectively extend its features and maintain reliable and efficient background processing systems.
- Emphasis is placed on experimenting with Sidekiq, integrating it with Redis, and maintaining clean code practices while leveraging its powerful background processing capabilities.
- Debugging through Redis insights can offer a clearer picture for resolving job processing issues, and integrating analytics can further improve Sidekiq's functionality.

In conclusion, Sidekiq can significantly streamline background job management processes in Ruby applications and is a valuable tool for developers seeking efficiency and reliability.

RubyConf AU 2014: http://www.rubyconf.org.au

It's almost inevitable in any Ruby Project - you hit that stage where your logic starts getting more complex, you start doing more stuff that needs to happen but doesn't have to happen in the foreground - or you just want things to be faster.
You move your logic out into workers and do the work in the background.
This talk is going to be all about Sidekiq - a threaded background job implementation written in Ruby - in two parts: how you can use it and how you can bend it to your will.
Part 1: Intro to Sidekiq
The boring: a brief introduction to Sidekiq, how it works - what its advantages are. The stuff you need to know about it, why it's useful to consider - even if you're using CRuby / MRI.
Part 2: Hacking Sidekiq
The cooler part - once you know what Sidekiq is, I'm going to show how you can use Sidekiq in your product, how you can extend it and bend it to your will. I'll go into how it implements itself in Ruby land and how it interacts with Redis.
I'll show how you can use the existing middleware (and write your own) to add behaviour to your code, patterns we've found useful for implementing and testing workers as well as the even more interesting side - using Lua support in Redis to implement stuff in Sidekiq.
I want to encourage developers to look at extending their tool set to work better with not just Ruby - to become comfortable with how their tools work internally (e.g. you should really learn how to love Redis) and what you really need to be careful of (e.g. bugs that manifest when the site is under less load than usual - a real world example of going too far).
Finally, I'll end with an important question: Why not just use a proper message queue?

00:00:07.360 Hello, everyone! Thank you all for coming. Wow, there’s actually quite a lot of you here. My name is Darcy, and I’m excited to talk to you today. I really like monkey hats; they're quite awesome! I have a lot of slides for my talk, and I apologize upfront since it's not the best idea to be up against lunch. If I talk too fast, please yell out, and I’ll try to slow down. As a disclaimer, when I show code slides—and there will be a lot—don't even bother trying to read everything, as I will move on quickly.
00:00:36.280 Today, I’m going to talk about hacking Sidekiq for fun and profit. Essentially, we’ll explore how to make Sidekiq do things that it doesn't do out of the box, and it's applicable to anyone working with Sidekiq, whether for personal projects or for employers, like working with Amazon for background jobs.
00:01:06.320 Before going into detail, let’s cover what Sidekiq is and the history of background jobs. Background processing isn’t a new concept—it has existed since before Rails and dates back to early computer science. If you come from an enterprise background, it’s a given. As is the Ruby way, we've kind of reinvented it over the years. If you're not used to thinking about it, background processing primarily deals with complex logic that has many side effects. For instance, long-running operations or interactions with external APIs that you don’t want to block your request-response cycle are ideal candidates for background processing. We want to run this code in the background, which doesn't affect the immediate user experience but is necessary.
00:01:56.000 Ruby has a history of approaches to background processing, and I will skip through some of this but mention four main options we’ve had historically. The first was a project called BackgrounDRb, a client-server system integrated with Rails that is now quite old. If you see any references to it on RubyForge, that’s a sign you've been using Rails for a while! The next leap was Delayed Job, built on the realization that running a daemon with custom persistence was not a good idea. Instead, it used the database, which every app already has, to store job data.
00:03:18.200 The big step in popularity was Resque, which emerged from the early days of GitHub. Resque changed the game by using Redis, allowing applications to adopt Redis as a complementary data store alongside their relational database. It also used built-in Redis operations to implement a fast and easy-to-run system, and it came with a wealth of tooling to ease job management. Finally, today’s topic, Sidekiq, was developed by Mike Perham, who works for a startup called The Clymb. Sidekiq introduced threading, which has historically had a bad reputation in Ruby due to the Global VM Lock in MRI.
00:04:55.000 However, Sidekiq became the first well-adopted background processing tool in Ruby that could effectively run multiple jobs concurrently in a single process using multiple threads. It does this efficiently while maintaining code readability and using an actor-based concurrency model, which simplifies handling concurrency issues and avoids the painful parts of threading. Importantly, even with the Global VM Lock, threading remains incredibly useful because of networking-heavy logic, like writing to databases or interacting with APIs.
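To illustrate why threads still help under the Global VM Lock, here is a minimal sketch in plain Ruby. It assumes `sleep` is a fair stand-in for a blocking network call; `fake_network_call` and `run_concurrently` are hypothetical names invented for this example and are not part of Sidekiq.

```ruby
# Sketch only: sleep stands in for a blocking network call (an assumption).
def fake_network_call(seconds)
  sleep(seconds) # MRI releases the Global VM Lock while a thread sleeps or waits on I/O
  :done
end

# Runs job_count fake calls on separate threads and reports the wall-clock time.
def run_concurrently(job_count, seconds)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  threads = Array.new(job_count) { Thread.new { fake_network_call(seconds) } }
  results = threads.map(&:value) # value joins each thread and returns its result
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  [results, elapsed]
end

results, elapsed = run_concurrently(5, 0.1)
puts "#{results.size} jobs finished in #{elapsed.round(2)}s"
```

Five calls of 0.1 seconds each finish in roughly the time of one call rather than five, because the waiting overlaps. CPU-bound work would not see the same benefit on MRI.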
00:05:56.000 Like Resque, Sidekiq uses Redis for job and metadata storage, and the two are compatible with each other. You can push jobs into Redis using either Sidekiq or Resque and pull them back out as needed. For example, if your application already uses Resque workers, introducing Sidekiq is easy. Because a job is just data in Redis, you can also push jobs from other languages such as Go or JavaScript.
00:07:29.560 A major benefit of Sidekiq is that it is feature-rich out of the box. Unlike Delayed Job or Resque, which often require adding additional gems to manage exceptions or monitor job status, Sidekiq includes built-in exception reporting, making it easier to handle failures. It efficiently tracks errors and automatically reports them if configured, offering developers insight into job execution times and resource usage.
00:08:09.160 When jobs fail, Sidekiq automatically retries them, employing exponential backoff to avoid overwhelming services when they fail, which prevents common issues that can arise when managing these processes manually. An important feature is the ability to schedule jobs, allowing developers to specify when a job should run. Rather than relying on cron jobs, you can simply create scheduled jobs with Sidekiq, which helps reframe how you think about managing background logic.
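As a rough illustration of exponential backoff with jitter, the sketch below computes a retry delay that grows steeply with the retry count. The constants resemble the default formula Sidekiq has used, but treat them as an assumption and check the version you run; `base_retry_delay` and `retry_delay` are hypothetical helper names.

```ruby
# Deterministic part of the delay: grows with the fourth power of the retry count.
# The constants are illustrative assumptions, not taken from the talk.
def base_retry_delay(retry_count)
  (retry_count**4) + 15
end

# Random jitter spreads retries out so failed jobs do not all hit a
# recovering service at the same moment.
def retry_delay(retry_count, rng: Random.new)
  base_retry_delay(retry_count) + (rng.rand(30) * (retry_count + 1))
end

(0..5).each { |n| puts "retry #{n}: at least #{base_retry_delay(n)}s" }
```

The first few retries happen within seconds, while later ones are spaced minutes or hours apart, which gives a failing downstream service time to recover.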
00:09:28.840 Sidekiq supports multiple queues, which give you logical grouping for different types of jobs. Job handling is designed to be extensible using middleware, similar to Rack, where you can wrap a job and influence the behavior of your code before execution starts. For those interested in expanding Sidekiq's capabilities, you can also check out Sidekiq Pro, which builds on the open-source model and provides valuable features such as batching and reliable workers.
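The Rack-like middleware idea can be sketched in plain Ruby. This is a simplified, hypothetical chain and not Sidekiq's actual implementation: `MiniChain` and `TimingMiddleware` are invented names, and the `call(worker, msg, queue)` shape with a `yield` is assumed to resemble what server middleware receives.

```ruby
# Hypothetical, simplified middleware chain; not Sidekiq's real code.
class MiniChain
  def initialize
    @entries = []
  end

  def add(middleware)
    @entries << middleware
    self
  end

  # Calls each middleware in order; the innermost yield runs the job itself.
  def invoke(worker, msg, queue, &job)
    remaining = @entries.dup
    traverse = lambda do
      if remaining.empty?
        job.call
      else
        remaining.shift.call(worker, msg, queue, &traverse)
      end
    end
    traverse.call
  end
end

# Example middleware that records when a job starts and finishes.
class TimingMiddleware
  attr_reader :log

  def initialize
    @log = []
  end

  def call(_worker, msg, queue)
    @log << "start #{msg['class']} on #{queue}"
    result = yield # hand control to the next middleware or the job
    @log << "finish #{msg['class']}"
    result
  end
end

timing = TimingMiddleware.new
chain = MiniChain.new.add(timing)
result = chain.invoke(nil, { "class" => "HardWorker" }, "default") { :job_ran }
puts timing.log
```

A middleware that never calls `yield` stops the job from running, which is the hook that uniqueness checks and similar extensions rely on.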
00:10:57.520 Next, let’s cover the practical aspects of how Sidekiq operates. When you queue a job, your code calls the Sidekiq client class, which builds a hash containing the job class and its arguments. This process helps ensure that we stick to standard practices while communicating with Redis. Sidekiq serializes the job data into a JSON object, so every job carries metadata such as its job ID, retry count, and other configuration.
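The payload described above can be approximated with the standard library alone. The key names below follow commonly seen Sidekiq payload fields, but the exact set is an assumption; `build_payload` and `HardWorker` are hypothetical names used only for illustration.

```ruby
require "json"
require "securerandom"

# Builds a job payload in the general shape described above:
# class name, arguments, and some metadata. Key names are an assumption.
def build_payload(worker_class, args, queue: "default", retry_jobs: true)
  {
    "class" => worker_class,
    "args" => args,
    "queue" => queue,
    "retry" => retry_jobs,
    "jid" => SecureRandom.hex(12), # random job ID
    "enqueued_at" => Time.now.to_f
  }
end

payload = build_payload("HardWorker", [42, "resize"])
json = JSON.generate(payload)       # what would be pushed into Redis
round_tripped = JSON.parse(json)    # what a worker process would read back
puts json
```

Because the payload goes through JSON, arguments should be simple types such as strings, numbers, arrays, and hashes; a Ruby symbol, for example, comes back as a string.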
00:12:04.320 Job data is stored in Redis, which features various data structures like sorted sets that facilitate scheduling jobs based on timestamps. The Sidekiq manager coordinates checking for and invoking jobs, where it interacts with various components and ensures fault tolerance and efficiency.
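To show how a sorted set keyed by timestamp supports scheduling, here is a small in-memory stand-in. No Redis is involved; `FakeSortedSet` is a hypothetical class whose methods only mimic the idea behind `ZADD` and a `ZRANGEBYSCORE` followed by `ZREM`.

```ruby
# In-memory stand-in for a Redis sorted set (an assumption for illustration).
class FakeSortedSet
  def initialize
    @members = {} # member => score
  end

  # Like ZADD: store a member under a numeric score (here, a run-at timestamp).
  def add(score, member)
    @members[member] = score
  end

  # Like ZRANGEBYSCORE -inf max followed by ZREM:
  # remove and return every member whose score is due, earliest first.
  def pop_due(max_score)
    due = @members.select { |_member, score| score <= max_score }
                  .sort_by { |_member, score| score }
                  .map(&:first)
    due.each { |member| @members.delete(member) }
    due
  end

  def size
    @members.size
  end
end

schedule = FakeSortedSet.new
now = 1_000.0 # fixed "current time" so the example is deterministic
schedule.add(now + 60, "send_reminder")
schedule.add(now - 5, "sync_inventory")
schedule.add(now - 30, "expire_sessions")

due_now = schedule.pop_due(now)
puts due_now.inspect
```

With real Redis, the read and remove steps would need to be atomic so that two processes cannot both pick up the same scheduled job.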
00:13:29.760 A method called `fetch` retrieves jobs from queues based on specific strategies and configurations that you may want to implement, including strict priority handling or different sampling approaches, influencing how your queue operates. In the implementation, not only will you retrieve jobs, but you'll also have access to essential job details, while the process can be enriched through middleware.
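The difference between strict priority and weighted sampling can be sketched as a choice of polling order. `queue_order` is a hypothetical helper, and the shuffle-then-deduplicate approach is an assumption about how a weighted fetch could work rather than a description of Sidekiq's internals.

```ruby
# Hypothetical helper: decides the order in which queues are checked for one fetch.
def queue_order(weights, strict:, rng: Random.new)
  if strict
    # Strict priority: always check queues in the configured order.
    weights.keys
  else
    # Weighted sampling: a queue with weight 3 appears three times in the
    # expanded list, so after shuffling it tends to come first.
    expanded = weights.flat_map { |name, weight| [name] * weight }
    expanded.shuffle(random: rng).uniq
  end
end

weights = { "critical" => 3, "default" => 2, "low" => 1 }
strict_order = queue_order(weights, strict: true)
sampled_order = queue_order(weights, strict: false, rng: Random.new(42))
puts strict_order.inspect
puts sampled_order.inspect
```

Strict ordering can starve low-priority queues while higher ones stay busy; weighted sampling trades some priority for the guarantee that every queue is eventually checked first.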
00:14:52.520 One useful modification is ensuring jobs are unique, which can be achieved with minimal changes to Sidekiq’s core functionalities. Instead of enqueuing multiple jobs that do the same thing concurrently, you can implement middleware that checks for existing jobs and prevents duplicates. This ensures resource efficiency and cost-effectiveness.
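A uniqueness check of this kind might look like the following sketch. A `Set` stands in for Redis, which is an assumption made so the example runs on its own; `UniqueJobs` and `ReindexWorker` are hypothetical names, and the `call` signature is assumed to resemble client middleware.

```ruby
require "digest"
require "json"
require "set"

# Sketch of a client-side uniqueness check. The Set replaces Redis here;
# against real Redis the check-and-set would need to be atomic.
class UniqueJobs
  def initialize(store = Set.new)
    @store = store
  end

  # Yields (lets the job be enqueued) only if no identical job is recorded.
  def call(_worker_class, msg, _queue)
    key = Digest::SHA256.hexdigest(JSON.generate([msg["class"], msg["args"]]))
    return false unless @store.add?(key) # add? returns nil when already present
    yield
  end
end

enqueued = []
unique = UniqueJobs.new
job = { "class" => "ReindexWorker", "args" => [7] }
3.times { unique.call("ReindexWorker", job, "default") { enqueued << job } }
puts "enqueued #{enqueued.size} of 3 attempts"
```

A complete version would also remove the key once the job finishes, or give it an expiry, so the same job can be enqueued again later.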
00:16:20.400 Further enhancements can include a feature that allows clients to store and manage job uniqueness, with the ability to set specific locks, debounce jobs, or manage retries effectively. You can design your system to allow for high concurrency while maintaining reliability.
00:17:44.960 Advanced patterns can also help you manage how many instances of a job run at once. This means you can prioritize and schedule tasks while controlling how many workers operate on a given job type. If you're having issues with certain job types, such as legacy code causing data corruption, being able to pause job processing or dynamically allocate resources becomes critical.
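One way to picture a per-job concurrency cap is a small counting limiter. This is a single-process sketch with a hypothetical `ConcurrencyLimiter` class; across several worker processes the counter would have to live somewhere shared, such as Redis, which is not shown here.

```ruby
# Hypothetical limiter: caps how many jobs of one type may run at once.
class ConcurrencyLimiter
  def initialize(limit)
    @limit = limit
    @running = 0
    @mutex = Mutex.new
  end

  # Runs the block if a slot is free; otherwise returns :deferred so the
  # caller can put the job back on the queue.
  def with_slot
    acquired = @mutex.synchronize do
      next false if @running >= @limit
      @running += 1
      true
    end
    return :deferred unless acquired

    begin
      yield
    ensure
      @mutex.synchronize { @running -= 1 } # always free the slot
    end
  end
end

limiter = ConcurrencyLimiter.new(1)
nested = limiter.with_slot { limiter.with_slot { :inner_ran } } # slot already taken
after = limiter.with_slot { :ran }                               # slot is free again
paused = ConcurrencyLimiter.new(0).with_slot { :ran }            # limit 0 acts as a pause
puts [nested, after, paused].inspect
```

Setting the limit to zero defers every job of that type, which is one simple way to model pausing a problematic job without stopping the whole process.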
00:18:56.560 You can also define job pipelines where certain queues are prioritized over others, providing better management of resources and a smoother workflow. It's essential not to restart processes when jobs are queued, and customizing the approach to managing queues allows for this level of flexibility.
00:20:16.560 Lastly, with all these optimizations in mind, jobs must be idempotent. Ensure that jobs can be run multiple times without worrying about duplication or unexpected side effects. Using transactions can significantly enhance reliability, ensuring that if something goes wrong, previous results do not corrupt new job executions.
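Idempotency can be illustrated with a guard keyed on the job ID. `CreditAccount` is a hypothetical class, and the in-memory `Set` stands in for a database table or Redis key; in a real system the marker and the balance change would be written in the same transaction.

```ruby
require "set"

# Hypothetical account; the Set of processed job IDs replaces persistent storage.
class CreditAccount
  attr_reader :balance

  def initialize
    @balance = 0
    @processed = Set.new
  end

  # Applies the credit once per job ID, so a retry or duplicate delivery is a no-op.
  def apply_credit(job_id, amount)
    return :skipped unless @processed.add?(job_id)
    @balance += amount
    :applied
  end
end

account = CreditAccount.new
first = account.apply_credit("jid-123", 50)
retried = account.apply_credit("jid-123", 50) # same job delivered again
puts "balance after retry: #{account.balance}"
```

Running the same job twice leaves the balance unchanged after the first run, which is the property that makes automatic retries safe.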
00:21:39.760 In conclusion, leveraging Sidekiq efficiently requires an understanding of how it's structured. You can exploit its well-designed features to extend functionality and utilize Redis as a powerful backend. By knowing how to experiment and integrate with your application effectively, you can create a robust job processing system. Always be cautious as you extend and adapt Sidekiq to your projects, maintaining clean, understandable code.
00:23:02.240 I reached out to Mike to share any upcoming news, and he's informed me about two significant updates coming soon: Sidekiq 3, which will feature a dead job queue for managing failed tasks, and nested job batches, which will enhance usability for complex data imports and multi-stage workflows.
00:23:29.120 Thank you for your attention. If you have any questions, I would be happy to answer them!
00:34:00.000 Speaker Q&A
00:34:10.000 Audience Question: Can you talk about how Sidekiq handles failure scenarios particularly with spot instances?
00:34:40.000 Darcy: Sidekiq does not inherently track whether an instance has gone down due to a timeout or another issue. You can implement application code to deal with retries effectively.
00:34:57.000 Audience Question: Would having a backend web app for analytics help improve Sidekiq interactions?
00:35:21.000 Darcy: Yes, a mountable Sinatra app could integrate with Sidekiq and provide valuable metrics and insights into job processing.
00:35:40.000 Darcy concludes: Debugging in Redis provides a clear view of what’s happening behind the scenes, which is great for resolving issues.
00:36:00.000 Thank you!