00:00:09.280
It's a little bit later in the day. I hope the coffee is kicking in for everyone; it's starting to work for me.
00:00:15.280
Today, I'm going to talk about what I call real-world Ruby testing.
00:00:20.880
This includes a lot of things I’ve learned from writing and maintaining a freakishly large test suite.
00:00:26.880
But first, a little bit about myself. My name is Rein Henrichs, and I have a Twitter and a blog.
00:00:33.360
If you want to follow along, the slides are up on Heroku, and I work at Puppet Labs.
00:00:39.040
When you run this on our project, the output reports almost nine thousand tests.
00:00:47.760
However, what it’s not actually showing are all of the tests that didn’t run because I don't have a Solaris box or an AIX box.
00:00:53.360
If you run all of the tests in the Puppet Labs suite, it's probably well over ten thousand. That seems like a lot of tests.
00:01:06.159
We've had a lot of experience wrangling those tests together and trying to make them work well with each other, running quickly enough that they don't impose too much of a burden during development.
00:01:20.560
So today, I’m going to talk to you about what I've learned about using tests to drive development and manage change.
00:01:27.200
There are a lot of testing frameworks, and I first want to give you a brief overview of the most popular ones.
00:01:32.320
This overview may help you decide which one might be right for your next project.
00:01:43.600
We'll start with Test Unit, which is the classic Ruby xUnit-style testing framework. It's actually been around for quite a while, since before version 1.0.
00:01:56.000
It's included in the Ruby 1.8 standard library, so everyone has access to it, and you're probably familiar with how the tests look.
00:02:02.159
Besides Test Unit, we have RSpec, which is the de facto standard behavior-driven development (BDD) framework.
00:02:14.720
I would say it’s the majority testing framework that people are using these days; most people are probably familiar with it.
00:02:27.360
Actually, before I go any further, let me ask you a few questions.
00:02:33.120
How many of you have never written any tests before? No hands? That's fantastic!
00:02:40.319
That is actually surprising; I don’t have as much to talk about as I thought I would.
00:02:51.280
So, how many of you use tests to drive your development? Test-driven development (TDD) or BDD?
00:03:02.840
A little bit, maybe eighty to ninety percent? Great job, everyone! This is unusual.
00:03:09.760
This is a very singular phenomenon in the Ruby community, I think.
00:03:15.519
So, as you’re probably familiar with Test Unit and RSpec, you might also know about Shoulda.
00:03:21.200
This is a BDD framework built on top of Test Unit that adds context and should syntax.
00:03:27.120
It has a lot of helpful matchers and assertions, many of which help implement Rails testing idioms.
00:03:33.360
These matchers can also be used in RSpec, making it compatible, which is nice.
00:03:40.560
It was written by Tammer Saleh and the Thoughtbot team, and its tests look quite intuitive.
00:03:46.640
Now, let’s look at some of the less popular frameworks that I think are still interesting to explore.
00:03:52.799
By experimenting with a new framework, you may learn something new about testing.
00:03:58.000
MiniTest, for instance, is not that different; it was created by Ryan Davis and is the framework included with Ruby version 1.9.
00:04:08.159
So, when writing tests in Ruby 1.9, if you're not using another framework, you're likely using MiniTest.
00:04:13.599
MiniTest is a replacement for Test Unit, but it also has a BDD syntax and its own mock object library.
00:04:19.600
Its syntax looks like this, and if you want to follow along with the code, you can check the slides on Heroku.
00:04:26.680
This is the spec style format.
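The slide itself is not reproduced in this transcript. As a rough sketch of what spec-style MiniTest looks like, assuming a current Minitest release (where expectations are wrapped in `_()`; older releases called `must_equal` directly on the object) and an invented subject, a plain Array used as a stack:

```ruby
require "minitest/autorun"

# Spec-style Minitest: `describe` groups examples, `it` defines one example.
# The subject (a plain Array used as a stack) is an illustrative assumption,
# not the example from the talk's slides.
describe "Array used as a stack" do
  before do
    @stack = []
  end

  it "starts out empty" do
    _(@stack).must_be_empty
  end

  it "pops the most recently pushed item" do
    @stack.push(1)
    @stack.push(2)
    _(@stack.pop).must_equal 2
  end
end
```

Running the file with `ruby` executes both examples, because `minitest/autorun` runs every defined test when the process exits.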
00:04:33.840
We also have a lighter-weight BDD framework called Bacon.
00:04:40.320
Bacon is designed to be API compatible with RSpec, adding interesting features.
00:04:46.240
For example, it allows focused examples, letting the spec suite run only examples you designate.
00:04:51.759
You can run the entire suite or just the focused examples, which helps trim down runtime when needed.
00:05:04.400
It also supports custom metadata for implementing filters, and its syntax is pretty straightforward.
00:05:10.160
Bacon is very lightweight, with less than 350 lines of code.
00:05:16.160
It runs fast, often an order of magnitude faster than RSpec, and I have some benchmarks to share later.
00:05:22.720
Then we have BareTest, which is a newer framework created by Stefan Rusterholz.
00:05:29.680
BareTest is a BDD-style framework with many formatters, including TAP-style formatting.
00:05:36.080
For instance, Vim and Emacs can parse its output.
00:05:41.280
BareTest also has XML- and TAP-compatible output formats and the concept of test suites with dependencies.
00:05:48.319
At Puppet, we've integrated dependency declarations for context management on top of RSpec.
00:05:54.960
BareTest has assertion helpers for things like float inequalities and unordered collections.
00:06:01.680
It even has an interactive mode, similar to a debugger inside your test suite, which is pretty cool.
00:06:09.919
The development branch of BareTest is still evolving and may offer a new perspective on TDD.
00:06:16.400
Next, let’s look at benchmarks. Ryan can tell you more about how they're calculated.
00:06:22.320
I should point out that Shoulda is slow due to a bug, but its real performance is better.
00:06:28.800
If we look up at the top, MiniTest and Bacon are way faster—around 200 times faster than most of the other frameworks.
00:06:39.600
This speed may be more important than you might think when running tests.
00:06:44.960
Given the possible choice paralysis involved with choosing a testing framework, how should we decide?
00:06:52.880
Choosing a framework for our projects boils down to two key factors: familiarity and expressiveness.
00:06:58.680
Familiarity means how well you and your team know the framework and how quickly you can get answers.
00:07:03.680
Expressiveness refers to how informative your tests are about the state of the system under test.
00:07:10.400
More expressive frameworks provide better feedback.
00:07:16.720
Something surprising is that the difference in test quality between a good tester and a bad tester is far more significant than the difference between frameworks.
00:07:24.480
Use a framework that you like and understand best, and one that has an active community.
00:07:30.720
After discussing testing frameworks, I want to shift focus to why and how we test.
00:07:36.160
Many of you who practice TDD and BDD may already understand the first part, but I have something interesting to share.
00:07:45.680
Ron Jeffries said that we test because it gives us clean code that works.
00:07:51.680
Kent Beck, in 'Extreme Programming Explained', states that change is inevitable and creates the need for feedback.
00:08:01.600
Values like trust and communication are critical and apply to various levels of development.
00:08:09.280
At its core, my talk is about why we test and how it relates to trust and communication.
00:08:15.680
We test for two main reasons: we want confidence and better communication about our code.
00:08:25.440
Confidence now allows us to drive development, while confidence in the future helps us manage change.
00:08:31.760
Communication is essential for dealing with change; without feedback, you can't alter a course once it's set.
00:08:38.320
Another commonly expressed benefit of BDD is that tests serve as living documentation.
00:08:46.160
Another reason we write tests is to allow teams to relax and develop trust.
00:08:52.320
Customers look forward to releases of well-tested software because it often works as intended.
00:08:58.560
My assertion is that most of you likely already understand the benefits of testing.
00:09:05.760
Now, why do we test? Let's discuss our goals in writing tests.
00:09:12.640
The purpose of tests is twofold: to drive development and to help us manage change.
00:09:20.880
These two aspects express most of the value we get from testing.
00:09:27.440
Now that we know why we test, let's talk about how we write tests that achieve these goals.
00:09:35.920
I've struggled with a concise definition of what a good test is, but I believe I've found it.
00:09:41.680
A good test is one that provides fast, focused feedback.
00:09:49.120
All of those words are important because a slow test suite doesn't get run enough.
00:09:55.920
If your tests take more than a tenth of a second each on average, then they're certainly too slow.
00:10:03.040
When you have 10,000 specs to run, it adds up, leading to long feedback cycles.
00:10:09.040
Our test suite runs almost 9,000 tests in about a minute, which is crucial for getting feedback to drive development.
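To make the numbers above concrete, here is a small back-of-the-envelope calculation. The helper names are made up for this sketch; the inputs are the figures quoted in the talk.

```ruby
# Back-of-the-envelope arithmetic for the figures quoted above.
# Both helper names are hypothetical, chosen only for this sketch.

# Total minutes for a suite, given a test count and average seconds per test.
def suite_minutes(test_count, seconds_per_test)
  test_count * seconds_per_test / 60.0
end

# Average milliseconds per test, given a test count and total seconds.
def average_ms_per_test(test_count, total_seconds)
  total_seconds * 1000.0 / test_count
end

puts format("10,000 specs at 0.1 s each: %.1f minutes", suite_minutes(10_000, 0.1))
puts format("9,000 tests in 60 s: %.1f ms per test", average_ms_per_test(9_000, 60))
```

That works out to roughly 16.7 minutes for the slow suite, against an average of about 6.7 ms per test for a suite that finishes in a minute.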
00:10:18.400
It's also important that tests only test one thing at a time.
00:10:24.320
When a test fails, you should know why it failed, where the failure occurred, and how it happened.
00:10:32.000
If you can’t pinpoint that information from your test, it’s not a good test.
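As an illustration of focused tests, here is a minimal sketch using Minitest. The `parse_endpoint` function is hypothetical and exists only for this example. Each test checks one behavior, so a failing test name already says what broke.

```ruby
require "minitest/autorun"

# Hypothetical function under test: parse "host:port" into [host, port],
# defaulting to port 80 when no port is given.
def parse_endpoint(text)
  host, port = text.split(":", 2)
  [host, Integer(port || 80)]
end

class ParseEndpointTest < Minitest::Test
  # One behavior per test: a failure points at exactly one thing.
  def test_returns_the_host
    assert_equal "example.com", parse_endpoint("example.com:8080").first
  end

  def test_returns_the_port_as_an_integer
    assert_equal 8080, parse_endpoint("example.com:8080").last
  end

  def test_defaults_to_port_80_when_none_is_given
    assert_equal 80, parse_endpoint("example.com").last
  end
end
```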
00:10:38.720
Finally, feedback is about minimizing the distance between your understanding of your code and your tests.
00:10:44.640
Having a testing framework in place allows you to manage change and be aware of the changes you make.
00:10:51.440
So, I hope you take away that the hallmark of a good test is that it provides fast, focused feedback.
00:10:58.560
Let’s discuss some common problems in tests, how they manifest, and how we can solve them.
00:11:05.680
One of the most common problems, especially in large test suites, is state leak.
00:11:12.000
This occurs when different tests share data and possibly mutate it in ways that lead to issues.
00:11:19.760
At Puppet, we’ve struggled with this, and significant portions of our test suite can fail because of it.
00:11:26.400
Tests might fail depending on the order in which they are run, which is a major issue.
00:11:33.440
This undermines confidence in the quality of our tests and the feedback they provide.
00:11:39.120
It's crucial to ensure tests are encapsulated and that any state created does not persist across tests.
00:11:47.440
If shared state is necessary, it’s important to return to a known state at the beginning of each test.
00:11:54.720
Frameworks like Rails can help by resetting the database, but sometimes that isn't enough.
00:12:00.560
The first way to minimize state leak is to minimize shared state.
00:12:08.480
If there’s no shared state, there will be no state leak.
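When some shared state cannot be avoided, resetting it before every test keeps the order of execution from mattering. A minimal sketch, assuming Minitest and a hypothetical `Registry` module that stands in for any global, mutable state:

```ruby
require "minitest/autorun"

# Hypothetical global registry standing in for any shared, mutable state.
module Registry
  @items = []

  class << self
    attr_reader :items

    def register(item)
      @items << item
    end

    def reset!
      @items = []
    end
  end
end

class RegistryTest < Minitest::Test
  # Return to a known state before every test so order cannot matter.
  def setup
    Registry.reset!
  end

  def test_register_adds_an_item
    Registry.register(:a)
    assert_equal [:a], Registry.items
  end

  def test_registry_starts_empty
    # Without the reset in setup, this would fail whenever
    # test_register_adds_an_item happened to run first.
    assert_empty Registry.items
  end
end
```

Minitest runs tests in random order by default, which tends to surface this kind of leak early instead of hiding it.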
00:12:12.320
This leads us to the next problem: long setup or teardown blocks.
00:12:20.320
Long setup blocks indicate that the code being tested is too complicated.
00:12:26.160
It may indicate that your classes are too large, or the interactions among them are too many.
00:12:33.440
Ideally, setup and teardown should be concise and related directly to each test.
00:12:39.520
If your context is 'when a user is logged in', then the setup should only log in the user.
00:12:46.160
This helps keep a mental correspondence between what the tests say they are doing and their actual behavior.
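Here is a sketch of a setup block that does only what its context says. It uses Minitest's spec style, and the `Session` class is invented for the example.

```ruby
require "minitest/autorun"

# Hypothetical session object, invented for this sketch.
class Session
  attr_reader :user

  def log_in(user)
    @user = user
  end

  def logged_in?
    !@user.nil?
  end
end

describe Session do
  before do
    @session = Session.new
  end

  describe "when a user is logged in" do
    # The setup does exactly what the context name says, and nothing more.
    before do
      @session.log_in("alice")
    end

    it "reports being logged in" do
      _(@session.logged_in?).must_equal true
    end

    it "remembers the user" do
      _(@session.user).must_equal "alice"
    end
  end

  describe "when nobody is logged in" do
    it "reports not being logged in" do
      _(@session.logged_in?).must_equal false
    end
  end
end
```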
00:12:52.400
The next issue is setup duplication.
00:12:57.520
If you see the same setup block repeated, it can indicate unnecessary complexity in the code.
00:13:06.720
It's a sign of potential design problems in the code under test.
00:13:13.920
All of these 'test smells' suggest you might have a design problem.
00:13:19.440
If the code is well designed, these test smells would largely disappear.
00:13:28.640
Long-running tests are another significant issue.
00:13:36.000
If your tests take more than a minute or two to run, people won't run them enough.
00:13:41.360
You risk losing confidence in the code provided by your teammates.
00:13:47.840
We previously had an RSpec suite that took around fifteen minutes to run.
00:13:54.560
That's unmanageable for a development environment.
00:14:00.560
It disrupts the rhythm of coding and slows down overall productivity.
00:14:06.720
When a test takes a long time to run, developers are less likely to run it frequently.
00:14:14.080
Long feedback cycles lead to decreased confidence in the code.
00:14:20.560
Fragile tests can be another major issue.
00:14:27.760
This happens when a change in one part of the code causes a failure in a completely unrelated test.
00:14:35.440
This is often due to improperly scoped tests or global state that isn’t managed properly.
00:14:42.560
Fragility is ultimately a failure of feedback from your tests.
00:14:49.440
To avoid fragile tests, be mindful of your use of mocking, stubbing, and faking.
00:14:57.360
The more invasive your use of these tools, the more fragile your tests may become.
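One less invasive option is to pass collaborators in and substitute a small hand-rolled fake, rather than stubbing methods deep inside the object. A minimal sketch, assuming Minitest; `Greeter` and `FakeClock` are hypothetical names made up for this example:

```ruby
require "minitest/autorun"

# Hypothetical class under test. The clock is passed in, so a test can
# substitute a trivial fake without reaching into any internals.
class Greeter
  def initialize(clock)
    @clock = clock
  end

  def greeting
    @clock.hour < 12 ? "Good morning" : "Good afternoon"
  end
end

# A hand-rolled fake: small, explicit, and tied only to the one method used.
FakeClock = Struct.new(:hour)

class GreeterTest < Minitest::Test
  def test_greets_with_morning_before_noon
    assert_equal "Good morning", Greeter.new(FakeClock.new(9)).greeting
  end

  def test_greets_with_afternoon_from_noon_onward
    assert_equal "Good afternoon", Greeter.new(FakeClock.new(12)).greeting
  end
end
```

In production code the same class could be given `Time.now`, which also responds to `hour`. The test depends only on that one method, so unrelated changes are less likely to break it.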
00:15:04.000
Kent Beck's advice to do the simplest thing that could possibly work is especially relevant here.
00:15:11.760
Simpler tests generally indicate simpler code, which is easier to maintain and debug.
00:15:20.080
Being able to write simple tests that remain stable is a positive indicator that your code is well designed.
00:15:27.760
Fragile tests can seriously undermine developers’ trust in their testing framework.
00:15:35.760
It's essential to manage technical debt and not wait to address issues.
00:15:42.080
Now, I don't have time for Q&A, but let me quickly summarize testing practices.
00:15:49.680
Isolation between tests is essential; they should not affect one another.
00:15:57.440
Regression tests provide feedback that builds trust in your code.
00:16:05.680
When you fix a bug, write a test first and think about why you didn’t write it sooner.
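A sketch of what a regression test for a fixed bug might look like, assuming Minitest. The `average` function and its bug are hypothetical: the unfixed version divided by zero for an empty list and returned NaN.

```ruby
require "minitest/autorun"

# Hypothetical function with a fixed bug. Before the fix, an empty list
# evaluated 0.0 / 0 and returned NaN.
def average(numbers)
  return 0.0 if numbers.empty? # the fix: handle the empty case explicitly
  numbers.sum.to_f / numbers.size
end

class AverageRegressionTest < Minitest::Test
  # Written first, while the bug still existed, so it failed before the fix.
  def test_average_of_an_empty_list_is_zero
    assert_equal 0.0, average([])
  end

  # The existing behavior stays covered too.
  def test_average_of_several_numbers
    assert_in_delta 2.0, average([1, 2, 3]), 1e-9
  end
end
```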
00:16:13.680
The first test you write should be one you're confident about, teaching you about your system.
00:16:20.800
Finally, consider the risks of mocking and stubbing.
00:16:31.760
Thank you all very much for your attention.