00:00:07.799
Living without exceptions is a challenge every time I start a new app. I always put my mental hat on and imagine my project.
00:00:15.120
It feels like getting into a nice new car. You start coding, getting into the groove, and following the motto of "move fast and have breakfast" because we all know that breakfast is the most important meal of the day.
00:00:27.800
However, as time goes on, your application starts devolving into something that's still recognizable, but you see patches and confusion—like a feeling of 'what the hell just happened here?'.
00:00:40.360
The thing with exceptions is that they are like cute bunnies that eventually cover your application until they suffocate you. You might find yourself staring at screens full of errors. Today, I'm going to talk about how to live without these exceptions.
00:01:11.680
My name is Radoslav Stankov, and I come from Bulgaria, a country known for its Windows error screens. I write a newsletter called 'Tips', and I am currently the CTO and co-founder of a startup called Angry Building. I hope to rename it to Happy Building once we accomplish our mission. I was previously the CTO of Product Hunt, and I'll be sharing tips from my experience in those two companies.
00:01:36.799
I like to include a lot of code in my presentations, and I've uploaded my code here because I often notice people taking photos and that I present too quickly. So, all the slides are available, and you can see the code and everything.
00:02:04.159
When I started at Product Hunt, there were only four developers. We realized that while nobody wants errors in their applications, it's crucial to have a structured process to manage them. We established an internal process called 'Happy Fridays'. As the company grew to eight developers, we developed another process called the 'Strike Team'.
00:02:30.200
The Strike Team consisted of junior developers responsible for fixing exceptions. Over time, as the company expanded to over 20 engineers, we introduced a designated 'Bug Duty' sprint, during which one or two engineers focused on fixing bugs and handling exceptions. At my current company, I'm the sole developer, so I only conduct Happy Fridays. I miss Strike Teams since it’s just a team of one.
00:03:07.440
Happy Fridays turned out to be an effective hack to manage and prioritize tasks. You work four days as normal, then on Fridays, you can focus on one of five tasks: fixing bugs, addressing exceptions, bumping dependencies, paying off technical debt, or catching up on projects. The key idea is to plan each week of your sprint as four days of regular work, so that Friday stays free for this kind of maintenance.
00:03:43.239
What I often did was pick exceptions to address. If you search GitHub, you'll see numerous pull requests dedicated to fixing exceptions left and right. The first takeaway today is that to establish a clear system for managing exceptions, you must have an explicit process outlining who fixes what, and when.
00:04:14.280
If no one owns the exceptions or if everyone feels responsible, they will just pile up. Another important aspect of working with exceptions is understanding that the Ruby exception system is well-designed, despite being somewhat dated. A good resource is a book that teaches us useful tricks about handling Ruby exceptions.
00:04:54.400
When you encounter an error in your code, you can wrap it in a single catch-all rescue and move on. However, you should avoid this approach, because it swallows every error, including real bugs, and your application may start behaving erratically over time without you understanding the root of the issue.
00:05:17.479
Instead, I recommend handling specific errors explicitly. This way, if you encounter a known issue, like a network service being unreliable, it's fine to catch that particular error explicitly and return nil.
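A minimal sketch of that approach, as an illustration rather than the code from the slides. The helper name `nil_on_network_error` and the list of error classes are assumptions; which errors count as "expected" depends on the client you use.

```ruby
require "logger"
require "socket"
require "timeout"

LOGGER = Logger.new($stderr)

# Errors we treat as "the network service is flaky" (an assumed list).
NETWORK_ERRORS = [Timeout::Error, SocketError, Errno::ECONNREFUSED].freeze

# Runs the block and returns its value. If one of the listed network
# errors is raised, logs it and returns nil. Any other error, such as
# a NoMethodError caused by a bug, still propagates to the caller.
def nil_on_network_error
  yield
rescue *NETWORK_ERRORS => e
  LOGGER.warn("network call failed: #{e.class}: #{e.message}")
  nil
end
```

A caller would wrap only the network call itself, for example `nil_on_network_error { client.fetch_avatar(user) }`, where `client` is whatever HTTP wrapper the application already has.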
00:05:53.479
The key here is to be deliberate about handling errors. Make sure to log any relevant information, even if your customers don't see it, to help you understand what went wrong. Additionally, monitoring your application for exceptions is vital.
00:06:20.440
Having a robust monitoring system ensures you are aware of errors, and not just assuming that everything is fine because your systems show no exceptions. I personally like using Sentry for this purpose. It's a tool we’ve used effectively in high-scale applications over the last decade.
00:06:44.400
As a tip, I create multiple Sentry projects for different tools. For example, in my Rails applications, I maintain at least two projects—one for production web and another for Sidekiq. This distinction is useful because errors in the web layer can differ greatly from those in the Sidekiq background jobs. The former typically receive immediate attention, whereas you can afford to delay the latter.
00:07:24.279
It's also beneficial to organize your exceptions into specific buckets based on usage. For instance, when we had a public API at Product Hunt, its exceptions were monitored separately because they had different causes and resolutions compared to the main web application or Sidekiq.
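One way to wire up separate projects is to pick the DSN by process role. This is a sketch under assumptions: the environment variable names and the `sentry_dsn_for` helper are hypothetical, not something shown in the talk.

```ruby
# Hypothetical environment variable names, one per Sentry project.
DSN_ENV_BY_ROLE = {
  web: "SENTRY_DSN_WEB",
  sidekiq: "SENTRY_DSN_SIDEKIQ",
  api: "SENTRY_DSN_API"
}.freeze

# Returns the DSN for the given role. Raises ArgumentError for an
# unknown role and KeyError when the variable is not set, so a
# misconfigured process fails at boot instead of reporting nowhere.
def sentry_dsn_for(role, env = ENV)
  key = DSN_ENV_BY_ROLE.fetch(role) do
    raise ArgumentError, "unknown role: #{role.inspect}"
  end
  env.fetch(key)
end
```

In a Rails initializer the result would typically be assigned inside `Sentry.init { |config| config.dsn = ... }`, with the role decided by how the process was started.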
00:07:59.240
You want to avoid being overwhelmed by a flood of exceptions where you won't know where to start fixing. Setting up a Slack channel to notify your team every time you deploy can help track new exceptions entering the systems as well.
00:08:36.640
When you discuss exceptions in this environment, people can react to new errors appropriately and prioritize what needs addressing. I've found new exceptions tend to be easier to solve, especially right after a deployment. Therefore, having a system that correlates code deployments with exceptions is invaluable.
00:09:10.960
As a great quote from Brian Kernighan and Rob Pike states, 'Exceptions are for exceptional situations.' They're not for regular occurrences or control flow. When the same types of exceptions occur frequently, we should determine whether this is noise or something to be addressed.
00:09:52.400
Managing the noise is important; you don’t want to flood your exception tracker with messages that don’t warrant immediate attention. Always ensure clarity by knowing whether certain exceptions, like 'Sidekiq shutting down', are significant enough to track.
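The filtering idea can be sketched in plain Ruby. `FakeShutdown` is a stand-in defined here so the example runs without Sidekiq, and `report?` is a hypothetical helper. With sentry-ruby the same intent is usually expressed through the `config.excluded_exceptions` list.

```ruby
# Stand-in for a "worker is shutting down" error class; defined here
# only so that the sketch is self-contained.
class FakeShutdown < StandardError; end

# Error classes we have consciously decided not to track.
IGNORED_ERRORS = [FakeShutdown].freeze

# True when the error should reach the exception tracker.
# Subclasses of an ignored class are ignored as well.
def report?(error, ignored: IGNORED_ERRORS)
  ignored.none? { |klass| error.is_a?(klass) }
end
```

Keeping the ignore list explicit and in one place makes each exclusion a visible decision rather than an accident.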
00:10:33.440
It's about keeping the most relevant exceptions highlighted while filtering out those that don't require action. In development, you want to avoid introducing too much noise in your applications, as this can obscure critical issues. Now, let's shift gears and discuss some practical coding examples.
00:12:01.680
In my work, I often encounter various exceptions that I need to deal with. For example, when I first came across a specific error message and had to figure out how to fix it, I realized that sometimes the answers are easily available online. Nowadays, I often consult AI models like ChatGPT for quick solutions, which can save a significant amount of time.
00:13:20.919
Additionally, I have a practice of organizing my exception handling code into neat namespaces. In my projects, I create modules that encompass all my error handlers, making it easier to manage them efficiently. Keeping everything organized and neatly grouped helps to mitigate chaos in error handling.
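As an illustration of that kind of grouping, here is a small sketch. The module name `ErrorReporting`, its error classes, and its methods are all hypothetical, and the in-memory `captured` list stands in for a call to a real tracker.

```ruby
module ErrorReporting
  # Base class so callers can rescue everything raised on purpose.
  class Error < StandardError; end
  class ExternalServiceError < Error; end

  @captured = []

  class << self
    attr_reader :captured

    # Stand-in for sending the error to a tracker such as Sentry.
    def capture(error, context = {})
      @captured << { error: error, context: context }
      nil
    end

    # Runs the block. Reports and swallows ExternalServiceError only;
    # every other error propagates unchanged.
    def guard_external(context = {})
      yield
    rescue ExternalServiceError => e
      capture(e, context)
      nil
    end
  end
end
```

With everything behind one namespace, the rest of the application never talks to the tracker directly, which makes it easier to change how errors are reported later.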
00:14:05.760
We might also encounter common errors like a method returning nil. To manage this, we need to determine the root causes, such as issues related to account subscriptions or misspecified data. When errors happen, we should remain calm and examine why the errors occurred, documenting any potential patterns that might arise.
00:14:52.100
One significant lesson is to never hide exceptions. If a network service cannot be accessed, and an error occurs, logging that is essential. The key takeaway is to analyze the reasons behind every exception efficiently.
00:15:36.320
For instance, in scenarios where users attempt to change subscriptions but don't have any, I inform them instead of letting an error bubble up without context. This allows users to see responses without seeing raw errors while still giving the development team the opportunity to troubleshoot proactively.
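A sketch of the subscription case, with hypothetical `Account`, `Subscription`, and `Result` structs standing in for real models. The missing subscription is treated as a normal outcome with a message for the user, not as an exception.

```ruby
Result = Struct.new(:ok, :message)
Account = Struct.new(:subscription)
Subscription = Struct.new(:plan)

# Changes the plan when a subscription exists. When it does not,
# returns a failed Result with a readable message instead of letting
# a NoMethodError on nil bubble up to the user.
def change_plan(account, new_plan)
  subscription = account.subscription
  if subscription.nil?
    return Result.new(false, "You don't have an active subscription yet.")
  end

  subscription.plan = new_plan
  Result.new(true, "Plan changed to #{new_plan}.")
end
```

The controller can then render `result.message` either way, and the failed case can still be logged for the team.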
00:16:22.320
Managing exceptions gracefully can turn your application into a robust system that minimizes user frustration. It’s about building that safety net—ensuring users receive understandable errors while your team is notified in the background for resolution.
00:17:12.720
Another point to consider is deployments: it's essential to capture how users interact with your application while an update is rolling out. At times you might find errors you hadn't anticipated, and you need to analyze them systematically.
00:18:03.760
In dealing with races in processing, stay aware of how your jobs might interact with one another in quick succession. Multiple jobs being triggered for the same task can lead to conflicting states, generating setbacks. This is where implementing strategies to avoid race conditions becomes invaluable.
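One common strategy is to take a lock keyed by the task, so that a second job for the same task skips its work. The sketch below keeps locks in memory, which only illustrates the idea. A real application would need a lock shared across processes, such as a database or Redis lock. `InMemoryLocks` is a hypothetical class.

```ruby
require "set"

class InMemoryLocks
  def initialize
    @held = Set.new
    @mutex = Mutex.new
  end

  # Runs the block only if nobody else holds the key and returns the
  # block's value. Returns :skipped when the key is already held.
  def with_lock(key)
    acquired = @mutex.synchronize { @held.add?(key) }
    return :skipped unless acquired

    begin
      yield
    ensure
      @mutex.synchronize { @held.delete(key) }
    end
  end
end
```

Skipping is only one policy. Depending on the job, waiting for the lock or re-enqueueing for later may be the better choice.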
00:19:01.920
Let's also examine how job failures and retries work with Sidekiq or Active Job. Job recursion can lead to repeated failures if not handled correctly; having methods that effectively pull or retry jobs ensures that processes remain efficient.
00:20:00.160
I recommend having a uniform approach to handle network-related errors, ensuring that your jobs gracefully retry their attempts while providing helpful feedback to your systems. Should jobs continue to fail, keep a record so you can analyze patterns in resolution methods.
00:21:04.760
In managing network exceptions, for example, I use a method intended for rapid retries that trigger upon failure. This keeps background processing simple while making the error reporting more robust.
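A retry helper of that kind might look like the sketch below. It is an assumption about the shape of such a method, not the one from the talk. The name `with_retries`, the list of retryable errors, and the backoff values are hypothetical.

```ruby
require "socket"
require "timeout"

# Errors worth retrying (an assumed list; adjust to your client).
RETRYABLE = [Timeout::Error, SocketError, Errno::ECONNRESET].freeze

# Calls the block up to `attempts` times, passing the attempt number.
# Waits base_delay, then 2x, 4x, ... between attempts. Re-raises the
# last error once attempts run out. Non-retryable errors propagate
# immediately. `sleeper` is injectable so tests do not really sleep.
def with_retries(attempts: 3, base_delay: 0.5, sleeper: ->(s) { sleep(s) })
  tries = 0
  begin
    tries += 1
    yield tries
  rescue *RETRYABLE
    raise if tries >= attempts

    sleeper.call(base_delay * (2**(tries - 1)))
    retry
  end
end
```

Inside a job this would wrap only the network call. Once the helper gives up, the error reaches the job framework, which can apply its own retry policy and report the failure.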
00:22:00.400
As we tune our systems to handle these nuances, remember that reliable metrics and monitoring are what keep exception noise from eroding the team's productivity and focus. Tailoring responses to the dynamic environment of development helps reduce confusion and maintain consistency.
00:22:46.080
It’s imperative to be cautious about the common errors that can arise when working with asynchronous processing. For example, ensuring tasks are performed only after transactions are finalized prevents data consistency issues with job execution.
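The ordering problem can be shown with a tiny simulation. `FakeTransaction` is a hypothetical stand-in for a database transaction, written so that the sketch runs without Rails. In a Rails application the usual tool for this is an Active Record `after_commit` callback.

```ruby
class FakeTransaction
  def initialize
    @after_commit = []
  end

  # Registers work to run once the transaction has committed.
  def after_commit(&block)
    @after_commit << block
  end

  # Runs the block as the transaction body. Callbacks fire only if
  # the body finishes without raising; on an error they are dropped,
  # which mirrors a rollback.
  def run
    result = yield self
    @after_commit.each(&:call)
    result
  ensure
    @after_commit.clear
  end
end
```

If the job were enqueued inside the body instead, a worker could pick it up before the commit and fail to find the record it was asked to process.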
00:23:33.680
Finally, it’s necessary to build upon a strong foundation of error handling measures in every aspect of your work. Keep refining the practices that help track, troubleshoot, and ultimately resolve the underlying issues—ensuring your applications run smoothly.
00:24:23.520
In summary, to live without unhandled exceptions, it’s crucial to explicitly define processes, utilize effective monitoring tools, review exceptions meaningfully, and involve your team actively in discussions. By investing time in building systematic approaches, you can avoid being overwhelmed by handling exceptions.
00:25:11.840
Thank you for your attention, and I'm eager to engage further with all of you. Let's keep the discussion productive!