00:00:07.120
Hello everybody.
00:00:10.080
Thank you, Radoslav, for the introduction. You can find me on the internet at various places.
00:00:13.510
I come from a small country in Europe called Bulgaria. It looks bigger in this picture than it actually is.
00:00:21.200
Currently, I’m the Head of Engineering at a startup called Product Hunt. I usually like to include a lot of code in my slides.
00:00:26.500
All my slides are already available at this address on Speaker Deck. I mention this because I've noticed that during my talks, many people spend time taking photos of the slides.
00:00:44.300
So, if you find this talk interesting, you can check it out later. One of my core beliefs about technology is that context is king. You cannot do anything if you don't understand where something comes from.
00:01:07.250
To understand this better, let’s talk about production. Our production is a traditional Ruby on Rails application that has transitioned to a single-page application using React and GraphQL. Right now, we are beta testing a brand new application called YourStacks, which is built with a very similar stack.
00:01:25.969
Currently, our engineering team consists of seven people. Our production architecture has three tiers: a Node.js app responsible for server-side rendering, a Rails API server, and another group of containers dealing with background jobs, which most of you probably know as Sidekiq.
00:02:02.780
When you start a new application, like the one we're developing now, YourStacks, it can feel like driving a fancy new car where everything is fast and enjoyable.
00:02:15.050
Initially, you're very happy developing and everything looks great. However, as you try to fix some issues, exceptions start appearing. You may begin to view these exceptions as cute little bunnies, considering them harmless, but over time, it becomes clear that something is seriously wrong.
00:02:37.000
The problem arises when you allow the situation to linger for too long. A few years ago, we introduced an initiative called Happy Friday: because the rapid pace of product iterations left bugs behind, engineers got a day when they could focus on fixing them.
00:03:17.590
Happy Fridays allowed us to tackle technical debt and other issues, which involved taking two hours every Friday to fix exceptions. This process provided us a lot of flexibility, and even now, when we look at exceptions, we see that most of them are resolved very quickly.
00:04:19.750
Now, on a Friday, I can open the exception tracker and hardly see anything because most of the exceptions have been addressed. This leads me to my first tip: maintain a report around exceptions. This report should be specific to your organization and relevant to your projects.
00:05:23.460
It’s essential to have a structured approach to treat exceptions like regular work. The rest of this talk will provide actionable tips to help you handle exceptions more effectively.
00:06:10.970
There’s a great resource, a book by Avdi Grimm called Exceptional Ruby. It’s the best book I’ve read on exception handling and provides valuable insights into how the exception system works in Ruby.
00:06:41.990
In typical code, when using exceptions, you might use 'rescue' without realizing it could potentially silence important errors. The goal of an exception tracker is to provide you with clear information about your application’s errors.
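A minimal sketch of how a broad 'rescue' can hide a real bug. The method names and the deliberate typo are hypothetical and only serve the illustration.

```ruby
# Hypothetical helper with a deliberate typo that simulates a real bug.
def display_name(user)
  user.fetch(:name).upcaes # typo: should be `upcase`
end

# The inline `rescue` modifier catches every StandardError, so the
# NoMethodError caused by the typo is swallowed and never reaches a tracker.
def silenced_name(user)
  display_name(user) rescue "unknown"
end

# Rescuing only the error we expect (a missing key) lets the real bug surface.
def explicit_name(user)
  display_name(user)
rescue KeyError
  "unknown"
end
```

With this sketch, silenced_name returns "unknown" even for a valid user, while explicit_name raises NoMethodError and points straight at the typo.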
00:07:04.120
When dealing with exceptions, keep the error messages informative. If you’re rescuing a specific error, make sure to add a comment explaining why that exception can occur, especially if it’s not obvious, as with file errors or network failures.
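As an illustration, here is a sketch of a rescue that names the specific error, documents why it is expected, and re-raises the unexpected case with a more informative message. SettingsError, parse_settings, and load_settings are made-up names for this example.

```ruby
require "json"

# Hypothetical error class for this sketch; not part of any library.
class SettingsError < StandardError; end

def parse_settings(raw, source:)
  JSON.parse(raw)
rescue JSON::ParserError => e
  # Not expected: re-raise with the source so the tracker shows which file is broken.
  raise SettingsError, "could not parse settings from #{source}: #{e.message}"
end

def load_settings(path)
  parse_settings(File.read(path), source: path)
rescue Errno::ENOENT
  # Expected: a fresh install has no settings file yet, so fall back to defaults.
  {}
end
```

The comment next to each rescue records the reasoning, so nobody has to guess later whether the rescue is still needed.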
00:08:10.360
Avoid adding specific user names in the notes, as this information can become outdated or hard to trace. Instead, document the context and reasoning in your comments.
00:08:45.600
To improve your system's resilience to exceptions, be explicit about the exceptions you are handling. Monitoring is also crucial. If you lack monitoring, it’s difficult to understand what’s going on inside your application.
00:09:02.020
Using monitoring tools such as Sentry can help you track exceptions effectively. Separate projects for different server types can also simplify exception handling, reducing noise from blending exceptions from various sources.
00:09:37.910
My systematic approach helps reduce this noise by filtering out non-actionable errors, making it easier to focus on significant exceptions that genuinely need attention.
00:10:58.120
For instance, we learned that a common exception, 'invalid byte sequence in UTF-8', could be resolved by changing the encoding of input, which we track systematically.
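One way this kind of fix can look in plain Ruby, as a sketch: the exact change made in production is not shown in the talk, and normalize_utf8 is a hypothetical helper name. It relies only on the core String methods force_encoding and scrub.

```ruby
# Tag incoming bytes as UTF-8 and replace any invalid sequence with U+FFFD,
# so later string operations do not raise "invalid byte sequence in UTF-8".
def normalize_utf8(input)
  # dup avoids mutating (or failing on) the caller's possibly frozen string.
  input.dup.force_encoding(Encoding::UTF_8).scrub("\uFFFD")
end

# A lone 0xFF byte is never valid UTF-8; `.b` marks the literal as raw bytes.
raw = "caf\xFF".b
clean = normalize_utf8(raw)
puts clean.valid_encoding?
```

Normalizing at the boundary where input enters the system means the rest of the code can assume valid strings.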
00:11:43.020
My third tip is to reduce the noise by filtering exceptions to only show issues you can act on. This way, when you confront a legitimate issue like 'undefined method status for nil', you're more likely to tackle it promptly.
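A rough sketch of the filtering idea in plain Ruby. The ignore list, the method names, and the array standing in for a tracker are all assumptions for illustration; a real exception tracker would normally provide its own ignore configuration.

```ruby
# Hypothetical ignore list: errors the team has decided it cannot act on.
NON_ACTIONABLE = [
  "ActionController::RoutingError", # bots probing for missing URLs
  "ActiveRecord::RecordNotFound"    # stale links that already render a 404
].freeze

# Compare by class name and walk the ancestor chain, so subclasses of an
# ignored error are ignored too and the listed classes need not be loaded.
def actionable?(error, ignored: NON_ACTIONABLE)
  (error.class.ancestors.map(&:name) & ignored).empty?
end

# `tracker` is a stand-in (any object responding to <<), not a real client.
def report(error, tracker:, ignored: NON_ACTIONABLE)
  tracker << error if actionable?(error, ignored: ignored)
end
```

The point of the design is that the list is explicit and reviewed: an error only goes on it after someone has decided there is nothing to fix.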
00:12:49.000
If you hide a problem instead of addressing it, it can lead to greater issues down the line. So, ensure you fix the root cause rather than masking exceptions.
00:14:24.860
By investigating the underlying problems, you can improve the integrity and performance of your system, learning valuable lessons along the way.
00:15:14.870
Implementing guards against common issues, like ensuring an account has a subscription, helps prevent exceptions stemming from logical flaws in your code, ultimately enhancing your application’s reliability.
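A small sketch of such a guard. Account, Subscription, and plan_label are illustrative stand-ins, not the application's real models.

```ruby
# Stand-in models for the sketch; a Rails app would use ActiveRecord classes.
Subscription = Struct.new(:plan)
Account = Struct.new(:name, :subscription)

def plan_label(account)
  # Guard: an account without a subscription is a valid state, so handle it
  # here instead of letting `nil.plan` raise NoMethodError further down.
  return "free" if account.subscription.nil?

  account.subscription.plan
end
```

The guard makes the missing-subscription case an explicit decision in the code rather than an exception discovered in production.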
00:16:05.450
Adding strategic logging to your exception tracker provides a clear record of issues encountered, enabling quicker troubleshooting. It’s important to keep monitoring tools updated with new context as you refine your application.
00:17:02.270
Networking exceptions can be particularly tricky, especially with all the different libraries and dependencies in your application, each raising its own error classes. One way to keep track of them is to create a module that groups the network-related exceptions in a single place, which simplifies error handling and streamlines debugging.
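A sketch of what such a module could look like, using only standard-library error classes. The module name, the chosen list, and fetch_status are assumptions; the list used in production is not shown in the talk.

```ruby
require "net/http"
require "openssl"
require "socket"
require "timeout"

# Hypothetical grouping of network-related errors from the standard library.
module NetworkErrors
  ALL = [
    Timeout::Error,
    Errno::ECONNREFUSED,
    Errno::ECONNRESET,
    Errno::EHOSTUNREACH,
    SocketError,
    EOFError,
    Net::OpenTimeout,
    Net::ReadTimeout,
    OpenSSL::SSL::SSLError
  ].freeze

  # `rescue` matches with ===, so defining it here lets callers write
  # `rescue NetworkErrors` instead of splatting the list everywhere.
  def self.===(error)
    ALL.any? { |klass| klass === error }
  end
end

# Example caller: any network failure becomes a small error hash.
def fetch_status
  yield
rescue NetworkErrors => e
  { error: e.class.name }
end
```

When a new HTTP client or dependency introduces its own error class, it only needs to be added to the list once.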
00:19:04.200
When you start using these tools, especially in applications that make a lot of API calls, your exceptions tend to become more manageable over time. Active Job's 'retry_on' in Rails 6 can also retry a job automatically when a network-related exception is raised, so you don't have to handle each failure by hand.
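In a Rails job, retrying is declared on the job class, along the lines of retry_on Timeout::Error, wait: 5.seconds, attempts: 3. To keep the example runnable without Rails, the sketch below is a hand-rolled plain-Ruby stand-in for the same idea; with_retries is a hypothetical helper, not a Rails API.

```ruby
require "timeout"

# Hand-rolled stand-in for illustration only. It retries the block when one of
# the listed errors is raised, up to `attempts` total tries.
def with_retries(on:, attempts: 3, wait: 0)
  tries = 0
  begin
    tries += 1
    yield tries
  rescue *on
    # Out of attempts: re-raise so the exception tracker still sees it.
    raise if tries >= attempts

    sleep(wait)
    retry
  end
end

# Usage: the first two tries time out, the third succeeds.
result = with_retries(on: [Timeout::Error], attempts: 3) do |try|
  raise Timeout::Error if try < 3

  "ok"
end
puts result
```

Only the errors passed in on: are retried, so a genuine bug such as a NoMethodError still fails immediately instead of being retried in silence.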
00:21:25.500
Lastly, it’s essential to foster a culture around managing exceptions. Ensure your teams share knowledge on common exception types and standardized handling mechanisms. This approach can greatly enhance your applications' stability.
00:25:04.870
Thank you all for listening to my talk! I hope my tips for managing exceptions in production have been helpful.