Deterministic Solutions to Intermittent Failures

In "Deterministic Solutions to Intermittent Failures", presented at RubyConf 2017, Tim Mertens addresses the pervasive problem of intermittent test failures in software development. He argues for tackling these failures systematically rather than dismissing them as mere "flaky tests", and he introduces a workflow for debugging any type of test failure, including consistent, order-related, and non-deterministic failures.

Key Points Discussed:
- Introduction to Intermittent Failures: Mertens draws on personal experience with flaky tests to illustrate how they slow down and disrupt development. He stresses that such failures are often wrongly written off as unresolvable, which can lead to a culture of ignoring them.
- Terminology Shift: Mertens advocates for retiring the term "flaky tests" in favor of "non-deterministic failures", emphasizing the need to identify the root cause of failures, treat tests as software, and ensure they function correctly.
- Continuous Integration (CI) Challenges: He outlines the evolution of testing at Avant, detailing how the growth of the codebase and test suite led to increased failures and a culture of overlooking these issues in pursuit of rapid feature development.
- Improving Debugging Workflows: Mertens provides a structured approach for tackling test failures:
  - Isolate Failing Tests: Start by running the failing test alone to see whether it fails reproducibly in isolation.
  - Address Common Pitfalls: Check for stale branches, date dependencies, and tests with missing preconditions, all of which can cause failures.
  - Utilize RSpec Features: Rerun the suite with the seed from the failing run (RSpec's --seed option) to reproduce the same randomized execution order and replicate the error.
  - Bisect Testing: Use automated bisecting (RSpec's --bisect option) to find the smallest set of tests that reproduces a failure caused by interactions between tests.
  - Defensive Testing Practices: Ensure tests clean up after themselves and avoid mutating global state, since this kind of test pollution is a common source of intermittent failures.
  - Handling Environmental Differences: Identify conditions that differ between CI and local environments and make them consistent across setups.
- Actionable Insights: Mertens concludes with strategies for systematically narrowing down and diagnosing test failures, advising developers to approach problems methodically.
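
The seed and bisect steps above can be illustrated with a small sketch in plain Ruby. This is not RSpec's API or its bisect algorithm; the names CONFIG, TESTS, run_in_order, seeded_order, and find_polluter are all hypothetical, and the bisection assumes exactly one earlier test pollutes the victim. It only shows why a fixed seed makes an order-dependent failure reproducible and how halving the preceding tests can locate the polluter.

```ruby
# Toy model of an order-dependent failure. All names are hypothetical;
# this is an illustration, not RSpec's implementation.

CONFIG = { currency: "USD" } # shared global state that tests can pollute

# Each "test" is a lambda returning true (pass) or false (fail).
TESTS = {
  "formats price"   => -> { CONFIG[:currency] == "USD" },      # assumes the default
  "supports euros"  => -> { CONFIG[:currency] = "EUR"; true }, # mutates and never restores
  "adds numbers"    => -> { 1 + 1 == 2 },
  "upcases strings" => -> { "a".upcase == "A" }
}.freeze

# Run the named tests in the given order from a clean state.
# Returns the names of the tests that failed.
def run_in_order(names)
  CONFIG[:currency] = "USD" # simulate starting a fresh process
  names.reject { |name| TESTS.fetch(name).call }
end

# The same seed always produces the same order, which is what makes
# a randomized failure reproducible.
def seeded_order(seed)
  TESTS.keys.shuffle(random: Random.new(seed))
end

# Halve the tests that ran before the victim until one polluter remains.
# Simplifying assumption: a single earlier test causes the failure.
def find_polluter(order, victim)
  candidates = order.take_while { |name| name != victim }
  while candidates.size > 1
    half = candidates.first(candidates.size / 2)
    still_fails = run_in_order(half + [victim]).include?(victim)
    candidates = still_fails ? half : candidates - half
  end
  candidates.first
end

victim = "formats price"

# Search for a seed whose ordering makes the victim fail.
failing_seed = (1..100).find { |s| run_in_order(seeded_order(s)).include?(victim) }

puts "passes alone: #{run_in_order([victim]).empty?}"
puts "failing seed: #{failing_seed}"
puts "order:        #{seeded_order(failing_seed).inspect}"
puts "polluter:     #{find_polluter(seeded_order(failing_seed), victim)}"
```

In a real RSpec suite the equivalent steps would be rerunning with the reported seed and then letting the bisect feature shrink the run; the durable fix is to make the polluting test restore the state it changed.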

Conclusions and Takeaways:

Mertens encourages developers to prioritize fixing intermittent failures to maintain a reliable codebase. His main messages are that tests should be treated as software in their own right, that the root causes of failures should be identified and resolved rather than excused by the myth of "flaky tests", and that a systematic debugging approach pays off.

Tim Mertens • November 28, 2017 • New Orleans, LA

Monday mornings are bad when the coffee runs out, but they’re even worse when all of your impossible-to-reproduce, “flaky” tests decide to detonate at the same time.

Join us to learn a systematic, proven workflow for expertly debugging any test failure including consistent, order-related, and non-deterministic failures. We’ll discuss the importance of fixing intermittent failures before they reach critical mass and why you should eradicate the term “flaky tests” from your vocabulary.

Don’t wait for an emergency to prioritize fixing the ticking time bomb of intermittent test failures.

RubyConf 2017