Real World Ruby Testing

by Rein Henrichs

In the video "Real World Ruby Testing" presented by Rein Henrichs at GoGaRuCo 2010, the speaker addresses the crucial aspects of Ruby testing, emphasizing its importance in ensuring code reliability and maintaining manageable changes over time. The talk is designed for new testers, those struggling with fragile test suites, and anyone eager to improve their testing strategies.

Key points discussed include:
- Overview of Testing Frameworks: Rein introduces various Ruby testing frameworks, including Test Unit, RSpec, Shoulda, MiniTest, Bacon, and BareTest, discussing their purposes and differences. He notes RSpec as leading in behavior-driven development (BDD).
- Selecting the Right Framework: The decision on which framework to use hinges on familiarity and expressiveness, meaning how well the team knows the framework and how clearly the tests communicate the system state.
- Purpose of Testing: Testing is essential for building confidence in code, driving development, and managing changes effectively. The benefits of testing extend to improved trust within teams and living documentation of the codebase.
- Writing Effective Tests: Rein outlines what constitutes a good test, including fast feedback, single-purpose testing, and the necessity for tests to be independent. He highlights common problems such as state leakage, excessive setup, and fragile tests, which can undermine confidence in the test suite.
- Common Issues and Solutions: Problems like state leak, setup duplication, long-running tests, and fragile tests are analyzed. Strategies to mitigate these issues include minimizing shared state, streamlining setup processes, and ensuring tests provide clear and actionable feedback.
- Best Practices: Rein emphasizes regression testing, starting with a first test you are confident about, and being cautious with techniques like mocking and stubbing.

In summary, the hallmark of effective Ruby testing lies in creating a well-structured test suite that supports rapid development and encourages clear communication among team members. Testing is not merely a checkbox activity but a vital aspect of software development that promotes quality and trust in the code produced. The concluding message stresses the value of maintaining high standards in testing to foster both confidence in the code and better collaborative practices in development.

00:00:09.280 It's a little bit later in the day. I hope the coffee is kicking in for everyone; it's starting to work for me.

00:00:15.280 Today, I'm going to talk about what I call real-world Ruby testing.

00:00:20.880 This includes a lot of things I’ve learned from writing and maintaining a freakishly large test suite.

00:00:26.880 But first, a little bit about myself. My name is Rein Henrichs, and I have a Twitter and a blog.

00:00:33.360 If you want to follow along, the slides are up on Heroku. I work at Puppet Labs.

00:00:39.040 When you run the test suite on our project, the output reports almost nine thousand tests.

00:00:47.760 However, what it’s not actually showing are all of the tests that didn’t run because I don't have a Solaris box or an AIX box.

00:00:53.360 If you run all of the tests in the Puppet Labs suite, it's probably well over ten thousand. That seems like a lot of tests.

00:01:06.159 We've had a lot of experience wrangling those tests together and trying to make them work well with each other, running quickly enough that they don't impose too much of a burden during development.

00:01:20.560 So today, I’m going to talk to you about what I've learned about using tests to drive development and manage change.

00:01:27.200 There are a lot of testing frameworks, and I first want to give you a brief overview of the most popular ones.

00:01:32.320 This overview may help you decide which one might be right for your next project.

00:01:43.600 We'll start with Test Unit, which is the classic Ruby xUnit-style testing framework. It's been around for quite a while, actually, since before version 1.0.

00:01:56.000 It's included in the Ruby 1.8 standard library, so everyone has access to it, and you're probably familiar with how the tests look.
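For readers who have not seen the xUnit shape, a minimal sketch follows. The `Stack` class is a hypothetical example and is not taken from the talk's slides; on current Ruby versions Test Unit ships as the bundled `test-unit` gem rather than as part of the standard library.

```ruby
require "test/unit"

# Hypothetical class under test, used only for illustration.
class Stack
  def initialize
    @items = []
  end

  def push(item)
    @items.push(item)
    self
  end

  def pop
    @items.pop
  end

  def empty?
    @items.empty?
  end
end

# xUnit style: one test case class, a setup method, and test_* methods.
class StackTest < Test::Unit::TestCase
  def setup
    @stack = Stack.new
  end

  def test_new_stack_is_empty
    assert(@stack.empty?)
  end

  def test_pop_returns_last_pushed_item
    @stack.push(1).push(2)
    assert_equal(2, @stack.pop)
  end
end
```

Running the file with `ruby` executes both tests, because requiring `test/unit` registers an at-exit runner.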

00:02:02.159 Beyond Test Unit, we have RSpec, which is the de facto standard behavior-driven development (BDD) framework.

00:02:14.720 I would say it’s the majority testing framework that people are using these days; most people are probably familiar with it.

00:02:27.360 Actually, before I go any further, let me ask you a few questions.

00:02:33.120 How many of you have never written any tests before? No hands? That's fantastic!

00:02:40.319 That is actually surprising; I don’t have as much to talk about as I thought I would.

00:02:51.280 So, how many of you use tests to drive your development? Test-driven development (TDD) or BDD?

00:03:02.840 A little bit, maybe eighty to ninety percent? Great job, everyone! This is unusual.

00:03:09.760 This is a very singular phenomenon in the Ruby community, I think.

00:03:15.519 So, as you’re probably familiar with Test Unit and RSpec, you might also know about Shoulda.

00:03:21.200 This is a BDD framework built on top of Test Unit that adds context and should syntax.

00:03:27.120 It has a lot of helpful matchers and assertions, many of which help implement Rails testing idioms.

00:03:33.360 These matchers can also be used in RSpec, making it compatible, which is nice.

00:03:40.560 It was written by Tammer Saleh and the thoughtbot team, and its tests look quite intuitive.

00:03:46.640 Now, let’s look at some of the less popular frameworks that I think are still interesting to explore.

00:03:52.799 By experimenting with a new framework, you may learn something new about testing.

00:03:58.000 MiniTest, for instance, is not that different; it was created by Ryan Davis and is the framework included with Ruby version 1.9.

00:04:08.159 So, when writing tests in Ruby 1.9, if you're not using another framework, you're likely using MiniTest.

00:04:13.599 MiniTest is a replacement for Test Unit, but it also has a BDD syntax and its own mock object library.

00:04:19.600 Its syntax looks like this, and if you want to follow along with the code, you can check the slides on Heroku.

00:04:26.680 This is the spec style format.
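The slide itself is not part of this transcript, so the following is only a sketch of what the MiniTest spec style looks like. The `Temperature` class is a hypothetical example, and the `_(...)` expectation wrapper is the form used by current Minitest releases.

```ruby
require "minitest/autorun"

# Hypothetical class under test, used only for illustration.
class Temperature
  attr_reader :celsius

  def initialize(celsius)
    @celsius = celsius
  end

  def fahrenheit
    celsius * 9.0 / 5 + 32
  end

  def freezing?
    celsius <= 0
  end
end

describe Temperature do
  describe "at zero degrees Celsius" do
    before do
      @temperature = Temperature.new(0)
    end

    it "converts to 32 degrees Fahrenheit" do
      _(@temperature.fahrenheit).must_equal 32.0
    end

    it "is freezing" do
      _(@temperature.freezing?).must_equal true
    end
  end
end
```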

00:04:33.840 We also have a lighter-weight BDD framework called Bacon.

00:04:40.320 Bacon is designed to be API compatible with RSpec, adding interesting features.

00:04:46.240 For example, it allows focused examples, letting the spec suite run only examples you designate.

00:04:51.759 You can run the entire suite or just the focused examples, which helps trim down runtime when needed.

00:05:04.400 It also supports custom metadata for implementing filters, and its syntax is pretty straightforward.

00:05:10.160 Bacon is very lightweight, with less than 350 lines of code.

00:05:16.160 It runs fast, often an order of magnitude faster than RSpec, and I have some benchmarks to share later.

00:05:22.720 Then we have BareTest, which is a newer framework created by Stefan Rusterholz.

00:05:29.680 BareTest is a BDD-style framework with many formatters, including TAP-style formatting.

00:05:36.080 For instance, Vim and Emacs can parse its output.

00:05:41.280 BareTest also has XML and TAP compatible output formats and the concept of test suites with dependencies.

00:05:48.319 At Puppet, we've integrated dependency declarations for context management on top of RSpec.

00:05:54.960 BareTest has assertion helpers for things like float inequalities and unordered collections.

00:06:01.680 It even has an interactive mode, similar to a debugger inside your test suite, which is pretty cool.

00:06:09.919 The development branch of BareTest is still evolving and may offer a new perspective on TDD.

00:06:16.400 Next, let’s look at benchmarks. Ryan can tell you more about how they're calculated.

00:06:22.320 I should point out that Shoulda looks slow here because of a bug; its real performance is better.

00:06:28.800 If we look up at the top, MiniTest and Bacon are way faster—around 200 times faster than most of the other frameworks.

00:06:39.600 This speed may be more important than you might think when running tests.

00:06:44.960 Given the possible choice paralysis involved with choosing a testing framework, how should we decide?

00:06:52.880 Choosing a framework for our projects boils down to two key factors: familiarity and expressiveness.

00:06:58.680 Familiarity means how well you and your team know the framework and how quickly you can get answers.

00:07:03.680 Expressiveness refers to how informative your tests are about the state of the system under test.

00:07:10.400 More expressive frameworks provide better feedback.

00:07:16.720 Something surprising is that the difference in test quality between a good tester and a bad tester is far more significant than the difference between frameworks.

00:07:24.480 Use a framework that you like and understand best, and one that has an active community.

00:07:30.720 After discussing testing frameworks, I want to shift focus to why and how we test.

00:07:36.160 Many of you who practice TDD and BDD may already understand the first part, but I have something interesting to share.

00:07:45.680 Ron Jeffries said that we test because it gives us clean code that works.

00:07:51.680 Kent Beck, in 'Extreme Programming Explained', states that change is inevitable and creates the need for feedback.

00:08:01.600 Values like trust and communication are critical and apply to various levels of development.

00:08:09.280 At its core, my talk is about why we test and how it relates to trust and communication.

00:08:15.680 We test for two main reasons: we want confidence and better communication about our code.

00:08:25.440 Confidence now allows us to drive development, while confidence in the future helps us manage change.

00:08:31.760 Communication is essential for dealing with change; without feedback, you can't alter a course once it's set.

00:08:38.320 Another commonly expressed benefit of BDD is that tests serve as living documentation.

00:08:46.160 Another reason we write tests is to allow teams to relax and develop trust.

00:08:52.320 Customers look forward to releases of well-tested software because it often works as intended.

00:08:58.560 My assertion is that most of you likely already understand the benefits of testing.

00:09:05.760 Now, why do we test? Let's discuss our goals in writing tests.

00:09:12.640 The purpose of tests is twofold: to drive development and to help us manage change.

00:09:20.880 These two aspects express most of the value we get from testing.

00:09:27.440 Now that we know why we test, let's talk about how we write tests that achieve these goals.

00:09:35.920 I've struggled with a concise definition of what a good test is, but I believe I've found it.

00:09:41.680 A good test is one that provides fast, focused feedback.

00:09:49.120 All of those words are important because a slow test suite doesn't get run enough.

00:09:55.920 If your tests take more than a tenth of a second each on average, then they're certainly too slow.

00:10:03.040 When you have 10,000 specs to run, it adds up, leading to long feedback cycles.

00:10:09.040 Our test suite runs almost 9,000 tests in about a minute, which is crucial for getting feedback to drive development.
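The arithmetic behind these figures can be checked with a short sketch. The helper names are hypothetical; the inputs are the numbers quoted in the talk.

```ruby
# Back-of-the-envelope feedback-cycle arithmetic using the figures from the talk.

# Total suite runtime in seconds for a number of tests and an average time per test.
def suite_runtime_seconds(test_count, seconds_per_test)
  test_count * seconds_per_test
end

# Average time per test in milliseconds, given the total runtime in seconds.
def average_ms_per_test(test_count, total_seconds)
  total_seconds * 1000.0 / test_count
end

# 10,000 specs at a tenth of a second each.
slow_suite = suite_runtime_seconds(10_000, 0.1)
puts format("10,000 specs at 0.1 s each: %.1f minutes", slow_suite / 60.0)

# Roughly 9,000 tests in about a minute.
fast_average = average_ms_per_test(9_000, 60)
puts format("9,000 tests in 60 s: %.1f ms per test", fast_average)
```

At a tenth of a second per test, ten thousand specs take roughly 16.7 minutes, while nine thousand tests in a minute works out to about 6.7 milliseconds per test.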

00:10:18.400 It's also important that tests only test one thing at a time.

00:10:24.320 When a test fails, you should know why it failed, where the failure occurred, and how it happened.

00:10:32.000 If you can’t pinpoint that information from your test, it’s not a good test.
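One way to get that kind of focused feedback is to give each behavior its own named test. This is a sketch using Minitest with a hypothetical `Cart` class, not code from the talk.

```ruby
require "minitest/autorun"

# Hypothetical class under test, used only for illustration.
class Cart
  def initialize
    @prices = []
  end

  def add(price)
    @prices << price
    self
  end

  def item_count
    @prices.size
  end

  def total
    @prices.sum
  end
end

class CartTest < Minitest::Test
  def setup
    @cart = Cart.new.add(5).add(7)
  end

  # Each test checks one behavior, so a failure names exactly what broke.
  def test_item_count_reflects_added_items
    assert_equal 2, @cart.item_count
  end

  def test_total_is_sum_of_item_prices
    assert_equal 12, @cart.total
  end
end
```

If `total` regresses, only the second test fails, and its name already says which behavior is affected.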

00:10:38.720 Finally, feedback is about minimizing the distance between your understanding of your code and your tests.

00:10:44.640 Having a testing framework in place allows you to manage change and be aware of the changes you make.

00:10:51.440 So, I hope you take away that the hallmark of a good test is that it provides fast, focused feedback.

00:10:58.560 Let’s discuss some common problems in tests, how they manifest, and how we can solve them.

00:11:05.680 One of the most common problems, especially in large test suites, is state leak.

00:11:12.000 This occurs when different tests share data and possibly mutate it in ways that lead to issues.

00:11:19.760 At Puppet, we’ve struggled with this, and significant portions of our test suite can fail because of it.

00:11:26.400 Tests might fail depending on the order in which they are run, which is a major issue.

00:11:33.440 This undermines confidence in the quality of our tests and the feedback they provide.

00:11:39.120 It's crucial to ensure tests are encapsulated and that any state created does not persist across tests.

00:11:47.440 If shared state is necessary, it’s important to return to a known state at the beginning of each test.

00:11:54.720 Frameworks like Rails can help by resetting the database, but sometimes that isn't enough.

00:12:00.560 The first way to minimize state leak is to minimize shared state.

00:12:08.480 If there’s no shared state, there will be no state leak.
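When some shared state cannot be avoided, resetting it before each test keeps run order from mattering. The sketch below uses Minitest and a hypothetical `PluginRegistry` with class-level state; neither the class nor its `reset!` method comes from the Puppet code base.

```ruby
require "minitest/autorun"

# Hypothetical registry that keeps process-wide state, used only for illustration.
class PluginRegistry
  @plugins = []

  class << self
    attr_reader :plugins

    def register(name)
      @plugins << name
    end

    def reset!
      @plugins = []
    end
  end
end

class PluginRegistryTest < Minitest::Test
  # Return to a known state before every test so that run order cannot matter.
  # Without this reset, whichever test runs second would see the other's plugin.
  def setup
    PluginRegistry.reset!
  end

  def test_registry_starts_empty
    assert_empty PluginRegistry.plugins
  end

  def test_register_adds_a_plugin
    PluginRegistry.register(:syslog)
    assert_equal [:syslog], PluginRegistry.plugins
  end
end
```

Minitest runs tests in random order by default, which tends to surface this kind of leak early. Keeping the state on an instance created in `setup`, instead of on the class, would remove the need for a reset altogether.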

00:12:12.320 This leads us to the next problem: long setup or teardown blocks.

00:12:20.320 Long setup blocks indicate that the code being tested is too complicated.

00:12:26.160 It may indicate that your classes are too large, or the interactions among them are too many.

00:12:33.440 Ideally, setup and teardown should be concise and related directly to each test.

00:12:39.520 If your context is 'when a user is logged in', then the setup should only log in the user.

00:12:46.160 This helps keep a mental correspondence between what the tests say they are doing and their actual behavior.
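A sketch of the logged-in example in Minitest's spec style, where each context's setup does only what the context name promises. The `Session` class is a hypothetical stand-in for whatever handles authentication in a real application.

```ruby
require "minitest/autorun"

# Hypothetical session object, used only for illustration.
class Session
  attr_reader :current_user

  def log_in(user)
    @current_user = user
  end

  def logged_in?
    !@current_user.nil?
  end
end

describe Session do
  before do
    @session = Session.new
  end

  describe "when a user is logged in" do
    # The setup does exactly what the context name says, and nothing more.
    before do
      @session.log_in("alice")
    end

    it "reports being logged in" do
      _(@session.logged_in?).must_equal true
    end

    it "remembers the current user" do
      _(@session.current_user).must_equal "alice"
    end
  end

  describe "when nobody is logged in" do
    it "reports not being logged in" do
      _(@session.logged_in?).must_equal false
    end
  end
end
```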

00:12:52.400 The next issue is setup duplication.

00:12:57.520 If you see the same setup block repeated, it can indicate unnecessary complexity in the code.

00:13:06.720 It's a sign of potential design problems in the code under test.

00:13:13.920 All of these 'test smells' suggest you might have a design problem.

00:13:19.440 If the code is well designed, these test smells would largely disappear.

00:13:28.640 Long-running tests are another significant issue.

00:13:36.000 If your tests take more than a minute or two to run, people won't run them enough.

00:13:41.360 You risk losing confidence in the code provided by your teammates.

00:13:47.840 We previously had an RSpec suite that took around fifteen minutes to run.

00:13:54.560 That's unmanageable for a development environment.

00:14:00.560 It disrupts the rhythm of coding and slows down overall productivity.

00:14:06.720 When a test takes a long time to run, developers are less likely to run it frequently.

00:14:14.080 Long feedback cycles lead to decreased confidence in the code.

00:14:20.560 Fragile tests can be another major issue.

00:14:27.760 This happens when a change in one part of the code causes a failure in a completely unrelated test.

00:14:35.440 This is often due to improperly scoped tests or global state that isn’t managed properly.

00:14:42.560 Fragility is ultimately a failure of feedback from your tests.

00:14:49.440 To avoid fragile tests, be mindful of your use of mocking, stubbing, and faking.

00:14:57.360 The more invasive your use of these tools, the more fragile your tests may become.
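One less invasive option is to pass a small hand-written fake into the object under test and assert only on the observable outcome. This is a sketch, assuming a design where the collaborator can be injected; `Notifier` and `FakeMailer` are hypothetical names.

```ruby
require "minitest/autorun"

# Hypothetical fake collaborator: records deliveries instead of sending mail.
class FakeMailer
  attr_reader :deliveries

  def initialize
    @deliveries = []
  end

  def deliver(to, body)
    @deliveries << [to, body]
  end
end

# Hypothetical class under test; the mailer is injected rather than hard-coded.
class Notifier
  def initialize(mailer)
    @mailer = mailer
  end

  def welcome(email)
    @mailer.deliver(email, "Welcome!")
  end
end

class NotifierTest < Minitest::Test
  # The test asserts on the observable outcome (one message delivered),
  # not on how Notifier arranges its internals.
  def test_welcome_delivers_one_message_to_the_user
    mailer = FakeMailer.new
    Notifier.new(mailer).welcome("alice@example.com")

    assert_equal [["alice@example.com", "Welcome!"]], mailer.deliveries
  end
end
```

Because nothing is stubbed on `Notifier` itself, refactoring its internals does not break the test as long as the message still gets delivered.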

00:15:04.000 Kent Beck's advice to do the simplest thing that could possibly work is especially relevant here.

00:15:11.760 Simpler tests generally indicate simpler code, which is easier to maintain and debug.

00:15:20.080 Being able to write simple tests that remain stable is a positive indicator that your code is well designed.

00:15:27.760 Fragile tests can seriously undermine developers’ trust in their testing framework.

00:15:35.760 It's essential to manage technical debt and not wait to address issues.

00:15:42.080 Now, I don't have time for Q&A, but let me quickly summarize testing practices.

00:15:49.680 Isolation between tests is essential; they should not affect one another.

00:15:57.440 Regression tests provide feedback that improves trust in your code.

00:16:05.680 When you fix a bug, write a test that reproduces it first, and think about why you didn’t write it sooner.
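A regression test of this kind might look like the sketch below. The `slugify` helper and the whitespace bug it once had are invented for illustration.

```ruby
require "minitest/autorun"

# Hypothetical helper. The invented bug: repeated spaces used to produce
# repeated dashes. The fix is the /\s+/ pattern, which collapses each run
# of whitespace into a single dash.
def slugify(title)
  title.strip.downcase.gsub(/\s+/, "-")
end

class SlugifyRegressionTest < Minitest::Test
  # Written before the fix to reproduce the reported bug; it failed until
  # the fix landed and now guards against the bug coming back.
  def test_collapses_repeated_spaces_into_one_dash
    assert_equal "real-world-ruby", slugify("Real  World   Ruby")
  end

  def test_ignores_leading_and_trailing_whitespace
    assert_equal "testing", slugify("  Testing ")
  end
end
```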

00:16:13.680 The first test you write should be one you're confident about, teaching you about your system.

00:16:20.800 Finally, consider the risks of mocking and stubbing.

00:16:31.760 Thank you all very much for your attention.

GoGaRuCo 2010

Rails is Obsolete (But So's Everything Else)

Avi Bryant

Eschew Obfuscation and Omit Needless Words: Writing Clear Acceptance Tests

Elisabeth Hendrickson

Intelligent Ruby: Getting Started with Machine Learning

Ilya Grigorik

The Revolution will not be Tweeted

Rich Kilmer

The Shell Hater's Handbook

Ryan Tomayko

Workflow

Ryan Davis

Lightning Talks

Evan Phoenix, Ron Evans, Pete Forde, David Stephenson, Noah Gibbs, Aman Gupta, John Woodell, Seth Ladd, Pat Nakajima, Thomas Shafer, Nathan Esquenazi, Jim Puls, Shane Becker, Mislav Marohnić, Alex Chaffee, Yehuda Katz, Blake Mizerany

Arel: The Ruby Relational Algebra

Bryan Helmkamp

Hidden Gems of Ruby 1.9

Aaron Patterson

Being Your Best Asset and Not Your Worst Enemy

Evan Phoenix

Real World Ruby Testing

Rein Henrichs

Ruby APIs for NoSQL

Sarah Mei

Data-Driven Government and the Ruby Developer

Eric Mill

Extending Rails 3

Yehuda Katz

Bryan's ActiveModel Extravaganza

Bryan Liles

Keynote: (Parenthetically Speaking)

Jim Weirich

Test-First Teaching

Sarah Allen, Alex Chaffee

Polyglot: When Ruby isn't enough or even sane

Blake Mizerany