00:00:23.670
Hi, I'm Matt. Hi, I'm Robbie. We want to talk to you today about sustainable BDD practices.
00:00:30.340
This talk was originally titled, 'It's not your test framework, it's you.' However, we thought that made us sound like jerks, implying that we were blaming you for something. And while we are, in fact, blaming you, we didn't want to come off as jerks right from the start.
00:00:41.739
We've noticed a distinct hype cycle with BDD. Initially, there was an aggressive adoption curve. Many people were enthusiastic and excited, and numerous tools emerged. However, we began to see a decline as people started to criticize it, questioning the necessity of automated acceptance testing.
00:01:06.189
This leads us to a journey of reflection, where we want to discuss some points raised by those who have stopped practicing BDD. The real question is, what is going wrong, and why are we feeling this pain?
00:01:16.869
We see several common issues: brittle tests, poorly written acceptance criteria, unreadable tests, flickering tests, and slow tests. However, before diving into these problems, we need to ensure we are all on the same page about what we're discussing.
00:01:36.700
When I first started doing BDD, I had no idea what I was doing. I was merely reading blog posts, learning about tools, and trying things out without fully grasping the deeper intentions behind BDD. This brings to mind the concept of 'cargo cult.' If you're unfamiliar with that term, I encourage you to check it out on Wikipedia—it's quite an interesting read. Essentially, we refer to 'cargo culting' when someone adopts or mimics a methodology, process, or technology without truly understanding it.
00:02:06.399
You have introduced it into your development environment without asking the right questions or doing it the right way. Before we delve deeper, let’s take a step back to reflect on how we got into this mess in the first place.
00:02:24.970
The journey begins with unit testing, which grew out of the Smalltalk community. The primary model was xUnit (originally SUnit in Smalltalk), which evolved into JUnit, NUnit, and many other testing frameworks. These practices were structured to support various layers of testing, including real unit tests, integration tests, and some very basic automated tests.
00:02:50.170
It was Dan North who coined the term 'Behavior Driven Development' (BDD). He proposed, 'What if we flipped this on its head?' At that time, we were employing an inside-out approach to test-driven development, yet we had practices—such as Specification by Example—that were developing outside that orbit.
00:03:18.740
Our tooling wasn't aligning with this evolving process. Let's walk through an example. Imagine I’m a product manager and I say, 'We're going to build this web app, and we need the sign-in feature. Just make it happen.' Instead, we should respond with, 'Please, give me an example.' This might require some back and forth with your product manager, fostering a clearer understanding of the feature.
00:03:54.410
The example could be: 'A user opens the app, inputs their username, password, and email address, and they’re signed in immediately.' But we should aim to expand this scenario further. What happens if the input is invalid? What responses do we want to provide? This includes scenarios such as blank fields, mismatching password confirmations, and many more.
00:04:24.550
When discussing password security, we could ask questions like, 'How strong should the password be?' or 'Are we dealing with sensitive information?' These inquiries touch on the balance between security and convenience. Unless you initiate these discussions, assumptions regarding the product owner’s priorities will remain unchallenged.
00:04:47.739
Similarly, we need to consider how we handle invalid email addresses. Does the system bounce back undeliverable emails? Each of these conversations is crucial for understanding and refining the actual needs of the product.
00:05:21.270
What if the username is unavailable? It can be a reserved word or someone might forget that they have already signed up. In this case, we should discuss how a user might recover their account if they forget their login details.
00:05:48.270
After this extensive discussion, a seemingly simple feature request, integrating Devise for authentication, might expand into four different features and 17 detailed scenarios. Agile practitioners like Dan North uncovered these hidden complexities, enabling software developers to better understand what product owners want.
00:06:29.590
Dan North developed 'JBehave,' the first BDD tool that facilitated discussions about business value using terms like 'features' and 'scenarios,' allowing developers and product owners to share a common language. This helped bridge the gap between intended functionalities and actual implementations.
00:06:56.400
In a similar vein, Cucumber followed, introducing Gherkin, a language for specifying how the features of an application should work. Gherkin helps anyone, especially product owners, understand the state transitions of an application without requiring advanced technical knowledge.
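Tying this back to the sign-up conversation from earlier, a minimal Gherkin sketch might look like the following (the feature wording and steps here are illustrative, not taken from the talk):

```gherkin
Feature: Sign up
  So that I can start using the app
  As a visitor
  I want to create an account

  Scenario: Successful sign-up
    Given I am on the sign-up page
    When I submit a valid username, password, and email address
    Then I should be signed in immediately

  Scenario: Mismatched password confirmation
    Given I am on the sign-up page
    When I submit a password that does not match its confirmation
    Then I should see an error explaining the mismatch
```

Each scenario is one concrete example from the conversation with the product owner, written in language both sides can read.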
00:07:43.720
Gherkin also encourages everyone involved to grasp the complexity of the application. If you're trying to hide that complexity from the product owner and your users, it will only come back to haunt you. A shared language like Gherkin aids in productive conversations around simplifying and refining application logic.
00:08:14.079
With a unified language established, integrating project management tools and code editors becomes seamless. Developers can directly translate business discussions into actionable tests, thus aligning their work with the overall product vision. The ultimate goal of BDD and TDD is to produce living documentation that serves as both a guide and a guarantee of functionality at any given moment.
00:08:50.360
As we moved from Cucumber to other frameworks like Spinach, we began to realize that we were often not asking the right questions about our applications. Instead of blaming the tools when test suites became frustrating, we need to dig deeper and address the underlying issues.
00:09:37.660
Addressing the root of the pain we felt while doing BDD was imperative. For instance, one common pain point is brittleness in tests. When changes are made at the product level, it often leads to multiple tests breaking, indicating a brittle suite that fails to reflect the application's intent.
00:09:54.400
Brittle tests often focus on how things work rather than what they are intended to do. Let's consider an example where we're building a feature for Twitter at the request of a product owner: 'I want the tweet feature.' You might think about scenarios where a tweet is valid, too long, or a duplicate.
00:10:19.250
However, simply testing that the UI performs correctly by walking through every interaction can lead to problems. These tests are verbose, susceptible to changes in the UI, and instead of focusing on business objectives, they recount the steps taken rather than what the application achieves.
00:11:08.052
You might find that logging in is duplicated across various scenarios, scattering your assumptions and knowledge across your codebase. In doing so, you may develop a test suite that is tightly coupled to the UI, creating a headache when changes arise.
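The difference between recounting every interaction and stating intent shows up directly in the scenarios themselves. A hedged sketch (step wording is hypothetical, not from the talk):

```gherkin
# Imperative style: recounts every click, tightly coupled to the UI
Scenario: Posting a tweet (brittle)
  Given I go to "/session/new"
  And I fill in "Username" with "matt"
  And I fill in "Password" with "secret"
  And I press "Sign in"
  When I fill in "Tweet" with "Hello world"
  And I press "Tweet"
  Then I should see "Hello world"

# Declarative style: states business intent, leaving the "how" to step definitions
Scenario: Posting a tweet
  Given I am signed in
  When I post the tweet "Hello world"
  Then my tweet should appear in my timeline
```

When the sign-in form changes, the imperative version breaks in every scenario that repeats those steps; the declarative version breaks in exactly one step definition.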
00:11:46.220
If you realize that the issue isn't with the framework but with how you're writing your tests, it's critical to adjust. Consider the cautionary tale of 'big rats leave big patches': the way we use tools like Capybara can introduce unnecessary complexity.
00:12:05.940
Imagine building a Twitter feature, which includes clicking a tweet button. As an example, if your tests are dependent on Capybara, writing raw steps in your specs can create brittle tests, particularly if different developers implement them differently.
00:12:44.840
If one developer writes tests that target UI elements directly, while another uses abstract layers, when it's time to modify something significant, you’ll end up patching numerous tests one by one. Instead, it’s more effective to abstract the repeated logic into modules.
00:12:59.450
By creating a module that handles the tweet functionality and makes use of helper methods, you can encapsulate that knowledge in one place, ensuring consistency in the tests while also reducing brittle dependencies on the UI.
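A sketch of that pattern: one module owns the knowledge of how signing in and tweeting work, so every test calls the same helpers instead of repeating raw UI steps. The module names, field labels, and paths below are hypothetical, not from the talk.

```ruby
# Helper modules encapsulating UI knowledge in one place.
# They call Capybara-style methods (visit, fill_in, click_button)
# provided by whatever test context mixes them in.
module TweetApp
  module SessionHelpers
    # Only this method knows which form fields the sign-in page uses.
    def sign_in(username:, password:)
      visit "/session/new"
      fill_in "Username", with: username
      fill_in "Password", with: password
      click_button "Sign in"
    end
  end

  module TweetHelpers
    # Only this method knows how a tweet is posted through the UI.
    def post_tweet(text)
      fill_in "Tweet", with: text
      click_button "Tweet"
    end
  end
end

# Mixed into the test context once, e.g.
#   RSpec.configure { |c| c.include TweetApp::SessionHelpers, type: :feature }
# or, in Cucumber's env.rb:
#   World(TweetApp::SessionHelpers, TweetApp::TweetHelpers)
```

When the sign-in form changes, `sign_in` is the single place to update, rather than dozens of scattered scenarios.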
00:13:40.320
At this stage, you're beginning to foster the development of a domain-specific language (DSL) focused on your application's core functionality, rather than on its interface. Both Cucumber and RSpec provide the flexibility to use this pattern, allowing developers to create abstractions that funnel into cleaner and more understandable tests.
00:14:40.970
Inadequate collaboration with product owners while defining acceptance criteria is another aspect where BDD can falter. Often, teams may have cargo-culted BDD, missing the collaborative intent it was founded on.
00:15:15.580
It can be tricky to convince product owners to adapt their perspective on testing. If you start imposing a new test framework that requires them to change their story-writing behavior, they may feel overwhelmed and annoyed.
00:15:41.729
It's crucial to foster an environment of collaboration where the focus is on building helper methods to make it easier for the product owner to express their vision clearly instead of shifting the burden onto them.
00:16:03.780
A common theme we've observed with Cucumber and BDD is failing to deliver on the promise of executable documentation. Many tests become unreadable due to poor design or because they're not readily accessible.
00:16:44.009
If tests are hard to read, you're not likely to return to them, even when seeking clarity on product requirements. Tidy documentation is essential regardless of whether it's directly used, and tools like Cucumber can produce output in HTML formats for easy sharing.
00:17:09.490
For those of you using RSpec, you may have noticed the recent emergence of tools like Relish, which help present specs as browsable documentation. Alternatively, the BBC offers Wally, an open-source variant.
00:17:52.290
Another reason tests can become meaningless as documentation is if they’re simply unnecessary to read. If you’re practicing BDD with a product owner, writing stories together, you’ll still benefit even if you’re not directly leveraging a BDD tool.
00:18:35.240
However, using tools that support Gherkin generally maximizes effectiveness. The expectation is that living documentation continuously evolves; documentation that no one needs to consult, though, is not a pressing problem.
00:19:01.390
Up until now, we've reviewed painful challenges that surface while navigating BDD, including illegible documentation and unclear test logic. The real, persistent issue is the performance of tests: slow suites drag everything down.
00:19:53.960
For instance, have you encountered test suites that take an excessive amount of time to execute? The inefficiencies could stem from a poorly configured suite burdened with unnecessary tests.
00:20:37.190
In a project I worked on, when we inherited the legacy codebase, we eventually got the tests running, only to discover that they took 33 hours to complete with over a thousand tests failing. This cautionary tale illustrates that excessive test execution time diminishes the value of those tests.
00:21:49.860
It’s critical to identify tests that inherently produce long feedback loops, hindering your confidence in the framework. Slow tests can lead to a disconnect, whereby you lose touch with your continuous integration environment and ultimately your development process.
00:22:42.060
One common pitfall is inserting sleeps into tests. This amounts to surrendering to the complexities of timing and can lead to unreliable test results.
00:23:06.220
Aligning with the practices within Capybara helps to mitigate these issues. By implementing effective waiting conditions rather than arbitrary pauses, you’re fostering better test reliability.
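The idea behind Capybara's retrying matchers can be sketched as a small polling helper: check a condition repeatedly until it holds or a deadline passes, instead of sleeping a fixed amount. The helper name and defaults below are assumptions; real Capybara matchers such as `have_content` already do this internally.

```ruby
# Poll a block until it returns truthy, failing after a timeout.
# Sleeps only in small increments, so the test proceeds as soon
# as the condition is met rather than always waiting a full 5 seconds.
def wait_until(timeout: 2, interval: 0.05)
  deadline = Time.now + timeout
  loop do
    return true if yield
    raise "condition not met within #{timeout}s" if Time.now >= deadline
    sleep interval
  end
end

# Instead of:  click_button "Tweet"; sleep 5; assert page.has_css?(".tweet")
# poll only as long as needed:
#   click_button "Tweet"
#   wait_until { page.has_css?(".tweet") }
```

With a real Capybara suite you would simply lean on its built-in waiting behaviour rather than hand-rolling this, but the sketch shows why an explicit condition beats an arbitrary pause.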
00:23:43.730
Should you still experience flickering tests, consider quarantining them with designated tags. This allows your CI to run critical tests while separating those that are less stable, giving you insight on areas needing attention.
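One way to wire up such a quarantine, sketched as configuration (the `:flaky`/`@flaky` tag name and environment variable are assumptions, not from the talk):

```ruby
# spec/spec_helper.rb: exclude quarantined examples from the default run,
# but allow a separate, non-blocking CI job to opt back in.
RSpec.configure do |config|
  config.filter_run_excluding :flaky unless ENV["RUN_FLAKY"]
end

# The Cucumber equivalent tags the scenario with @flaky and filters at
# the command line, e.g. `cucumber --tags ~@flaky` (older tag syntax)
# or `cucumber --tags 'not @flaky'` (newer tag expressions).
```

The critical build stays green and fast, while the quarantined job surfaces which tests need rewriting or deleting.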
00:24:20.340
Designate a role on your team, aptly termed 'build nanny', to keep track of these flaky tests. This person should investigate and propose re-writes or simply suggest deleting tests that no longer serve a purpose.
00:25:12.610
Adam Milligan, a senior engineer at Pivotal, eloquently reminds us that we shouldn’t fear deleting tests that no longer deliver value. We sometimes treat tests as sacred entities—permanent without regard for their relevance or utility.
00:25:47.890
As we remove inefficient tests, our test suite's reliability and speed should increase. Ultimately, we’re striving towards a maintainable suite that provides confidence rather than a burden.
00:26:26.960
At this point, reevaluate your suite. Ask yourself why authentication is being tested repetitively. In many cases, you might not need such extensive checks.
00:26:50.220
Consider integrating journey testing with functional testing—focusing your acceptance tests on high-level user interactions while keeping the core underlying code much more accessible to testing.
00:27:36.540
As your implementation improves, utilize tags effectively. Distinguish between tests that need to be run locally versus those that can remain solely within the CI environment. Strive for a balance where some tests run for confidence while others may be tagged to indicate lesser relevance.
00:28:26.590
Finally, don’t mistake BDD for acceptance tests that must always drive through the UI. Many features can be validated without exhausting the interface. Using domain helpers to build tests outside of the UI provides quicker insights with greater reliability.
00:28:55.260
As you refine this process, your tests should gradually become faster and easier to execute. When your suite is efficiently running in under a minute, you’re in a great position to keep pushing your work forward.
00:29:39.200
To conclude, remember these key takeaways: Identify where the pain points arise, treat your tests as equally important as your production code, and approach BDD with the commitment it deserves. Do it wholeheartedly, and you’ll likely find yourself much happier with the results.