Schrödinger's Error: Living In the grey area of Exceptions

In "Schrödinger's Error: Living In the grey area of Exceptions", delivered at RubyConf 2021, Sweta Sanghavi focuses on the challenges developers face when managing exceptions in complex systems. Exceptions are inevitable; what matters is how effectively developers respond to them.

Key Points Discussed:

- Understanding Exceptions: Developers operate in systems where understanding every potential failure point is impractical. Exceptions serve as feedback mechanisms to reassess code assumptions.

- Initial Challenges: At BackerKit, exception handling was initially disorganized. The team relied on a Slack integration, and notifications were missed because ownership was undefined and the relative priority of exceptions was unclear.

- Process Experiments: The team experimented with a more structured approach, starting a weekly "Badger Duty" in which team members took ownership of triaging exceptions. This led to better exposure to exceptions and closer alignment on priorities.

- Goals for Improvement: Key goals identified by the team included filtering meaningful signals from the noise of exceptions, fostering collective ownership, and proactively addressing user-impacting exceptions.

- Daily Triage Duty: Daily triage involved setting a clear rotation for exception triaging, defining tasks to categorize bugs, and promptly determining the urgency of issues, creating a continuous feedback loop.

- Learnings and Iterations: The team learned that continually iterating on the process helped maintain clarity and focus on actionable items within the backlog. For example, using dashboards for visualizing errors simplified the triaging process while providing data for future actions.

- Case Studies: The presentation covered specific examples, such as handling a "Faraday timeout error" and a "missing correct access error", emphasizing the importance of understanding error contexts, prioritizing issues based on frequency and user impact, and determining effective responses.

- Concluding Thoughts: Sweta highlighted that creating a systematic approach to exception management not only clarified responsibilities but also encouraged team collaboration and knowledge sharing, ultimately driving the team towards a more disciplined, resilient, and responsive approach.
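
The prioritization idea from the case studies (weighing frequency against user impact) can be made concrete with a small sketch. Everything below is hypothetical and for illustration only: the `ExceptionReport` struct, the `triage` and `prioritize` helpers, the threshold values, and the sample numbers are assumptions, not BackerKit's actual tooling or figures from the talk.

```ruby
# Hypothetical record of one grouped exception, as a triager might read it off a dashboard.
ExceptionReport = Struct.new(:name, :occurrences, :affected_users, keyword_init: true)

# Illustrative thresholds; these values are assumptions, not numbers from the talk.
URGENT_USERS = 50
NOISY_OCCURRENCES = 100

# Bucket a report by user impact first, then by raw frequency.
def triage(report)
  if report.affected_users >= URGENT_USERS
    :fix_now
  elsif report.affected_users.positive?
    :backlog
  elsif report.occurrences >= NOISY_OCCURRENCES
    :investigate_noise
  else
    :ignore
  end
end

# Order reports so the most user-impacting come first, breaking ties by frequency.
def prioritize(reports)
  reports.sort_by { |r| [-r.affected_users, -r.occurrences] }
end

# Made-up sample data; the names are plain labels and do not require any gem.
reports = [
  ExceptionReport.new(name: "Faraday::TimeoutError", occurrences: 400, affected_users: 0),
  ExceptionReport.new(name: "ArgumentError", occurrences: 12, affected_users: 75),
  ExceptionReport.new(name: "TypeError", occurrences: 3, affected_users: 2)
]

prioritize(reports).each do |report|
  puts "#{report.name}: #{triage(report)}"
end
```

In this sketch a frequent exception that affects no users is flagged as noise to investigate rather than as urgent work, which is one possible way to separate signal from noise; a real team would tune the categories and thresholds to its own system.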

The talk concluded with an invitation for attendees to reach out for further discussion on exception management strategies and to connect on shared experiences, emphasizing the importance of community collaboration in improving processes.

Sweta Sanghavi • November 08, 2021 • Denver, CO • Talk

ArgumentErrors, TimeOuts, TypeErrors… even scanning a monitoring dashboard can be overwhelming. Any complex system is likely swimming in exceptions. Some are high-value signals. Some are red herrings. Resilient applications that live in the entropy of the web require developers to be experts at responding to exceptions. But which ones, and how?

In this talk, we’ll discuss what makes exception management difficult, tools to triage and respond to exceptions, and processes for more collective and effective exception management. We'll also explore some related opinions from you, my dear colleagues.

RubyConf 2021