Summary
The talk "The Overnight Failure" by Sebastian Sogamoso at wroc_love.rb 2017 focuses on embracing failure in the software development industry and on the importance of discussing and learning from mistakes. Sogamoso, who works for Cookpad, shares a personal story about a major incident at a previous job, which he refers to as Black Saturday, and the challenges and lessons that came out of it.

Key Points Discussed:

- **Introduction and Context**: Sogamoso introduces himself and his background, sharing his love for Poland and inviting attendees to an upcoming Ruby conference in Colombia.
- **Cultural Observations**: He humorously shares his observations about Poland, including its unique drunk tests and the confusion around building floor designations.
- **Story of "The Overnight Failure"**:
  - The presentation dives into a critical failure in a carpooling app that Sogamoso was developing.
  - On a routine billing day, a system error created duplicate payment jobs, so users were accidentally charged multiple times. Outrage followed when their cards were declined because of the excessive charges.
- **Crisis Management**: Sogamoso recounts waking up to a flood of complaints and the frantic efforts to contain the financial damage, which involved stopping the charge processing and reversing the erroneous charges.
- **Root Causes Identified**:
  - The errors were attributed to flaws in both the job retrieval system and the payment processing logic, neither of which adequately checked for duplicates.
- **Learning from Failure**: Despite the damage caused by the incident, Sogamoso emphasizes the value of openly discussing failures within the tech community, both to ease feelings of imposter syndrome and to foster a culture where mistakes can be addressed constructively.
- **Post-Mortem Analysis**: He outlines the steps taken after the incident, including better testing practices and monitoring systems to avoid similar failures in the future.

Conclusions and Takeaways:

- **Cultural Shift**: Create an environment where discussing failures is normalized rather than stigmatized, so that teams can learn collaboratively.
- **Importance of Testing**: Robust testing and monitoring processes are necessary to catch issues before they escalate.
- **Community Engagement**: Attendees are encouraged to share their own failures as part of a collective learning experience using the hashtag #IBrokeThings, and reminded that individuals should not equate failures with their identity.
- **Call to Reflection**: Sogamoso urges developers to reflect on their own worst-case work scenarios and to learn to manage the stress and repercussions when failures occur.