00:00:11.900
Hi everyone, my name is John, and I’m here to talk to you about mutation testing.
00:00:17.520
I'm the CTO at a small tech company in Palo Alto called Cognito.
00:00:22.609
Sorry, there’s a little flickering here; we’re not sure what’s going on.
00:00:27.980
All right, so before I get into it, I want to give you a quick outline of the talk.
00:00:34.350
I’ll give you an introduction to what mutation testing is.
00:00:39.480
Then, I'll show you how it can help you improve test coverage.
00:00:47.370
I'm also going to show you how it can teach you more about Ruby and the code that you rely on.
00:00:55.010
Moreover, I’ll describe how it works like an x-ray for legacy code.
00:01:00.030
It can be a great tool for detecting dead code and it’s probably the most thorough measure of test coverage.
00:01:07.650
Additionally, I'll explain how it can help simplify your code.
00:01:12.979
I’ll wrap it up by discussing the practicality of mutation testing today and how you can incorporate it at your job.
00:01:20.420
Before we dive into mutation coverage, we need to be on the same page regarding line coverage, or test coverage in general.
00:01:25.950
Usually, when we talk about test coverage, we refer to line coverage.
00:01:32.159
Line coverage roughly means the number of lines of code run by your tests over the total lines of code in the project.
00:01:38.640
There are different variations, like branch coverage, but that’s sort of the gist of it.
00:01:44.909
Mutation testing asks a different question.
00:01:50.130
It asks: How much of your code can I change without failing your tests?
00:01:57.270
If you think about it, this makes sense.
00:02:02.969
If I can remove a line of code or meaningfully modify a line of code in your project without breaking your tests, then something is probably wrong.
00:02:08.640
There are parts of the code that are either missing tests or are not well-tested.
00:02:15.980
Before we dive into how to automate mutation testing, I want to give you a good intuition of what mutation testing is by doing it by hand.
00:02:22.820
I've got some sample code here, so take a second to read it over.
00:02:27.920
I have this class called Gluttons, and at the top, I initialize it with a Twitter client.
00:02:38.720
Then, I do a search on the Twitter API using that client.
00:02:45.080
I get the first two results, grab the author from it, and return that.
00:02:50.600
That is basically what the test down here specifies; it has a fake client and some fake tweets.
00:02:58.220
On the left here, I have the same code but in Sublime Text, and on the right, I've got a script.
00:03:05.540
This script is going to run whenever I modify the file.
00:03:12.110
The script outputs a diff of the current code against the original, as well as the result of running the test.
00:03:21.100
First, I’m going to try to modify the hashtag, but that does not fail the test.
00:03:26.750
I can also remove the search string entirely, and that doesn’t fail it.
00:03:32.980
Additionally, I can call it with zero arguments, and that also does not fail the test.
00:03:40.070
If I change "first(2)" to "first(1)", that does fail, which is good.
00:03:45.860
However, if I change it to "first(3)", that does not fail the test.
00:03:53.450
So, reviewing those again, I can basically change the input to the search method however I want.
00:03:57.099
I can remove the hashtag, the entire search string, or call it with a different number of arguments, and it doesn’t matter.
00:04:06.469
If I change "first(2)" to "first(1)", that does fail given the specifics set in our fake tweets and our fake client.
00:04:14.900
However, if I change it to "first(3)", then that does not fail the test.
00:04:20.030
This is manual mutation testing.
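The manual walkthrough above can be sketched in plain Ruby. This is a hypothetical reconstruction, not the code from the slides: the class names, the '#ruby' query, and the fake client are all assumptions, and the "test" is reduced to a single equality check.

```ruby
# Hypothetical reconstruction of the walkthrough; every name here is an assumption.
Tweet = Struct.new(:author)

# Fake client that returns canned tweets and ignores whatever it is asked for.
class FakeClient
  def initialize(tweets)
    @tweets = tweets
  end

  def search(*_query)
    @tweets
  end
end

class RecentAuthors
  def initialize(client)
    @client = client
  end

  def call
    @client.search('#ruby').first(2).map(&:author)
  end
end

# Hand-written mutants: each changes one detail of RecentAuthors#call.
class NoQueryMutant < RecentAuthors
  def call
    @client.search.first(2).map(&:author)
  end
end

class FirstThreeMutant < RecentAuthors
  def call
    @client.search('#ruby').first(3).map(&:author)
  end
end

class FirstOneMutant < RecentAuthors
  def call
    @client.search('#ruby').first(1).map(&:author)
  end
end

client   = FakeClient.new([Tweet.new('alice'), Tweet.new('bob')])
expected = %w[alice bob]

# The "test": does each variant still return the two expected authors?
results = [RecentAuthors, NoQueryMutant, FirstThreeMutant, FirstOneMutant].to_h do |klass|
  [klass, klass.new(client).call == expected]
end

results.each { |klass, passed| puts "#{klass}: test #{passed ? 'passes' : 'fails'}" }
```

With only two fake tweets and a client that ignores its arguments, the only variant the check rejects is the one that takes a single result.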
00:04:26.360
You can imagine that doing this on a daily basis at your job would be pretty tedious.
00:04:34.759
This was just one method, but if we’re adding to the code, trying to do this for every part of what we’re adding would be a lot of wasted time.
00:04:41.210
It’s also going to be pretty hard to outsmart yourself.
00:04:47.120
If you did the best job you could at writing this code and the tests for it, it would be hard to come up with ideas you hadn't thought of before.
00:04:52.520
Now, I’m going to show you how to do mutation testing with an automated tool. The main tool for this is called Mutant.
00:05:06.289
It’s been around for years. I learned about it about two years ago in a targeted mutation testing effort.
00:05:12.830
Since then, I have become a large contributor to the project.
00:05:19.150
A friend and I also just started a fork of this project recently called NewTests that is very similar.
00:05:25.490
I’ll probably refer to them interchangeably throughout the presentation, but you can use either one.
00:05:30.889
All right, in this example here, I’m invoking the mutant command-line application and passing the required arguments.
00:05:36.849
I’m using the RSpec integration and telling it to mutate the class we just saw.
00:05:43.909
There’s going to be a lot of noise in this output, so don’t worry; we’ll go over the results in detail later.
00:05:49.840
Each diff here represents a mutation that I’m going to analyze while running my tests.
00:05:57.560
Some of the things we identified in our manual mutation testing run include the ability to remove an entire argument.
00:06:05.419
We can also pass a different type of variable to the search, including 'nil,' which is interesting.
00:06:11.349
Furthermore, we can change 'first(2)' to 'last(2)', which is also significant.
00:06:16.889
If this is the method that finds the most recent tweets, that’s a problematic change.
00:06:22.030
If we care about finding the most recent tweets, we probably want to ensure we don’t return the oldest ones.
00:06:29.169
We can also remove the 'first' call entirely, which could exhaust our API token and rate limit.
00:06:34.300
Our mutation testing tool shows us how to improve the test.
00:06:39.389
It recommends that we give it three fake tweets instead of two and explicitly specify the search we expect it to perform.
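A minimal sketch of that stricter test setup, in plain Ruby rather than RSpec. The names, the '#ruby' query, and the `killed?` helper are assumptions made for illustration: the fake client now insists on the expected search, and there are three fake tweets instead of two.

```ruby
# Hypothetical names throughout; the talk's real test is not shown.
Tweet = Struct.new(:author)

# Fake client that requires exactly one argument equal to the expected query.
class StrictFakeClient
  def initialize(expected_query, tweets)
    @expected_query = expected_query
    @tweets = tweets
  end

  def search(query)
    raise ArgumentError, "unexpected query: #{query.inspect}" unless query == @expected_query

    @tweets
  end
end

tweets   = %w[alice bob carol].map { |name| Tweet.new(name) }
client   = StrictFakeClient.new('#ruby', tweets)
expected = %w[alice bob]

original    = ->(c) { c.search('#ruby').first(2).map(&:author) }
first_three = ->(c) { c.search('#ruby').first(3).map(&:author) }
last_two    = ->(c) { c.search('#ruby').last(2).map(&:author) }
nil_query   = ->(c) { c.search(nil).first(2).map(&:author) }
no_query    = ->(c) { c.search.first(2).map(&:author) }

# A variant is "killed" when the check fails or the fake client raises.
def killed?(implementation, client, expected)
  implementation.call(client) != expected
rescue ArgumentError
  true
end

puts killed?(original, client, expected)
puts [first_three, last_two, nil_query, no_query].all? { |m| killed?(m, client, expected) }
```

The third tweet is what separates 'first(2)' from both 'first(3)' and 'last(2)'; the argument check is what catches the altered or missing query.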
00:06:45.729
When we use automated mutation testing, it’s quick and doesn’t require much effort.
00:06:51.700
It's also likely to be more clever than we would be on our own.
00:06:58.540
Mutant has been accruing different mutations for years that target specific use cases and point out relevant changes.
00:07:05.820
Here’s another example you might encounter.
00:07:10.990
Imagine you’re working on an internal API; here’s some sample code.
00:07:16.720
We see the users controller and the show action; we’re validating the ID parameter.
00:07:23.770
We ensure that it's an integer and pass it to the user finder.
00:07:29.169
We either render JSON or render an error, which is what the test below specifies.
00:07:32.980
If we run this through our mutation testing tool, it shows us that we can replace the 'to_i' method with the 'Integer' method.
00:07:39.580
That's interesting, because the 'to_i' method works on any string and on 'nil'.
00:07:46.360
If I don’t have integers or digits in my string, it still gives me zero.
00:07:52.420
Calling 'Integer' raises an error on 'nil' or on a string it can’t parse as a number.
00:07:59.260
'fetch', instead of '[]', requires key presence, raising an error if the key is absent.
00:08:06.760
It is showing us how to enforce a stricter implementation, emphasizing the presence of the key.
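The difference between the loose and strict lookups can be checked with a short sketch. The plain hash with string keys stands in for the controller's parameters and is an assumption, as are the lambda names and the `outcome` helper.

```ruby
# Assumption: a plain Hash with string keys stands in for the request parameters.
loose  = ->(params) { params['id'].to_i }
strict = ->(params) { Integer(params.fetch('id')) }

# Returns the block's value, or the class of the error it raised.
def outcome
  yield
rescue StandardError => e
  e.class
end

inputs = [{ 'id' => '42' }, { 'id' => 'abc' }, { 'id' => '42abc' }, {}]

inputs.each do |params|
  puts "#{params.inspect}: loose=#{outcome { loose.call(params) }.inspect} " \
       "strict=#{outcome { strict.call(params) }.inspect}"
end
```

The loose version quietly turns a missing key or a non-numeric string into 0, while the strict version raises 'KeyError' for the missing key and 'ArgumentError' for a string that is not a number.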
00:08:15.580
Before, if someone incorrectly used the API, they might have passed in something incorrect or omitted the ID key.
00:08:22.240
We would end up querying the database for an ID of zero, which is not ideal.
00:08:29.310
This is a more precise implementation that forces us to think ahead.
00:08:36.740
Here’s another small example with a created-after action.
00:08:42.950
We’re passing in a parameter called 'after', parsing that input, and handing it to a class method on the user called 'recent'.
00:08:50.869
Running that through a mutation testing tool shows us that we can replace 'parse' with 'iso8601'.
00:08:58.459
This method is poorly named; it's a more strict parsing method.
00:09:05.770
It specifies the format: four digits for the year, two for the month, and two for the day.
00:09:12.500
This is quite different from the parsing rules for 'Date.parse'.
00:09:19.060
The latter tries to parse the input very flexibly.
00:09:24.739
It might find the name of the month in the input, which isn't always what we want.
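A quick sketch of the two parsers side by side, using only Ruby's standard 'date' library. The sample inputs and the `try_parse` helper are assumptions chosen for illustration.

```ruby
require 'date'

# Returns the parsed Date, or nil when the parser rejects the input.
def try_parse(parser, input)
  Date.public_send(parser, input)
rescue ArgumentError # Date::Error is a subclass of ArgumentError
  nil
end

['2017-11-15', 'November 15, 2017'].each do |input|
  puts "#{input.inspect}: parse=#{try_parse(:parse, input).inspect} " \
       "iso8601=#{try_parse(:iso8601, input).inspect}"
end
```

Both methods accept the year-month-day form, but only 'Date.parse' accepts the free-form string with a month name, so switching to 'iso8601' narrows what the endpoint tolerates.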
00:09:29.820
Let’s talk about regular expressions.
00:09:36.870
I'm particularly excited about this part of the presentation because it’s a feature that no other tool in the Ruby ecosystem provides.
00:09:43.240
Mutant can analyze regular expressions and show you if you're not covering branches within them.
00:09:50.790
Here’s some sample code where we're iterating over a list of usernames.
00:09:57.220
We’re selecting the ones that match a specified regular expression.
00:10:03.340
It will point out areas where we can be more explicit in our testing.
00:10:12.160
For example, if we do not provide test input for multi-line strings, it will let us know.
00:10:19.410
We can adjust this to a more strict format, ensuring we are accounting for the cases we want.
00:10:25.850
We also need to ensure we are testing the various conditions inside our regular expressions.
00:10:32.860
It will help enhance our regex conditions so that our tests are more comprehensive.
00:10:39.620
In Ruby 2.4, we have a new 'match?' predicate method.
00:10:48.210
It's three times faster, only returns true or false, and its behavior avoids messing with global variables.
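As a rough illustration of the multi-line case and of 'match?', here is a sketch with an assumed username pattern; the regular expression from the slides is not shown in the transcript, so the patterns, inputs, and helper name below are hypothetical.

```ruby
# Assumed patterns: line anchors (^ and $) versus whole-string anchors (\A and \z).
LINE_ANCHORED   = /^[a-z0-9_]+$/
STRING_ANCHORED = /\A[a-z0-9_]+\z/

usernames = ['alice', 'bob_42', "mallory\n<script>"]

# With ^ and $, one valid line is enough, so the multi-line string slips through.
loose  = usernames.select { |name| LINE_ANCHORED.match?(name) }
strict = usernames.select { |name| STRING_ANCHORED.match?(name) }

# match? only answers true or false; unlike =~ it leaves $~ untouched.
def sets_last_match?(use_predicate)
  if use_predicate
    STRING_ANCHORED.match?('alice')
  else
    STRING_ANCHORED =~ 'alice'
  end
  !$~.nil?
end

puts loose.length
puts strict.length
puts sets_last_match?(true)
puts sets_last_match?(false)
```

Without a multi-line input in the tests, the two patterns are indistinguishable, which is the kind of gap the regular expression mutations surface.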
00:10:58.130
Incorporating these improvements results in more strict input handling and clearer error messages.
00:11:05.350
Now, let’s talk about HTTP clients.
00:11:12.940
In this example, we have a method called 'stars_form' using the popular HTTParty client.
00:11:19.840
It's designed to retrieve data from a repository on GitHub's API and convert the result into a hash.
00:11:26.960
Then, we extract the 'stargazers_count' key.
00:11:32.480
When we run the mutation testing tool, it indicates we can remove the 'to_h' call and everything still works.
00:11:39.270
This might seem confusing at first, but the HTTParty client looks at the response header's content type.
00:11:45.820
If the response is JSON, it behaves like a hash, allowing us to drop that method.
00:11:52.390
The beauty is that NewTests doesn’t need specific support for HTTParty.
00:11:58.630
It understands how to evaluate different parts of your method and suggests changes.
00:12:05.060
It allows you to become aware of Ruby’s capabilities and avoid redundancy.
00:12:11.400
Now, let’s discuss legacy code.
00:12:19.900
This is the same code example we had before with the created-after endpoint.
00:12:26.110
Imagine instead of implementing this method yourself, you’re tasked with updating it.
00:12:34.360
Let’s say the original author wrote it two years ago and left little documentation.
00:12:39.740
When you run your mutation testing on that code before modifications, you’re going to see this same mutation survive.
00:12:46.600
This raises interesting questions about the author's intent.
00:12:53.020
Did they intend for users to only utilize the strict format, or should it allow any format?
00:12:59.170
How is it actually being used today? If other services use different formats, we want to guarantee compatibility.
00:13:05.860
Running the mutation testing tool gives us a checklist of potential issues.
00:13:12.460
This acts like a list of hotspots to investigate before modifying the code.
00:13:19.800
It points out areas of the code where a regression would not fail any tests.
00:13:26.220
This is an extremely beneficial aspect of mutation testing.
00:13:32.050
Consider this method: when we invoke it, line coverage tools will report high coverage.
00:13:39.740
However, mutation testing indicates deeper issues.
00:13:46.380
It helps us check for off-by-one errors and ensures we have comprehensive tests for boundaries.
00:13:53.300
It encourages rigorous testing, enhancing our methods to avoid missteps.
00:14:00.520
Here’s a nine-line method dealing with user permissions.
00:14:06.470
There are multiple roles possible: guest, muted, normal, admin, etc.
00:14:12.740
The tool challenges us to ensure we test all user conditions thoroughly.
00:14:20.640
It forces a reflection on how many conditions must be tested, which may amount to over 31.
00:14:29.310
This complexity is often hidden, and mutation testing brings it to the surface.
00:14:36.290
Here’s another small example: I’m taking an OSD user collection.
00:14:43.160
I'm grabbing emails while filtering out those without emails or who've unsubscribed.
00:14:50.940
Initially, I have a valid user and one without email.
00:14:57.090
I assert valid emails are the only things present in the output.
00:15:03.180
Then I do the same thing with a valid user and an unsubscribed user.
00:15:10.520
Here, my mutation testing tool is showing I could skip the iterations by manipulating the input.
00:15:18.920
This allows the bad user to end up on top, resulting in failed tests.
00:15:28.060
Correcting these tests means putting the unwanted user at the beginning.
00:15:35.650
The mutation testing approach also assists in detecting dead code.
00:15:42.160
Consider this scenario: perhaps we discover a column named 'name'.
00:15:48.510
Running the mutation testing tool shows we can replace it entirely.
00:15:54.640
It’s showing us that the method is already fully covered by the parent class.
00:16:02.330
The new user trail allows me to consider whether I’m introducing redundant methods.
00:16:09.040
In another instance, we have a method called 'authorized.'
00:16:14.590
It takes an optional user argument, defaulting to the current user.
00:16:21.080
When reviewing the mutations, it points out redundancy when we always pass in a user.
00:16:28.170
If it used to look different, it would apply that past mutation, leading to an error.
00:16:35.130
In this sense, we can inline the current user and simplify the argument.
00:16:41.580
The mutation testing tools consistently point out potential simplifications.
00:16:47.050
This occurs even with code navigation, ensuring I refer to the correct methods.
00:16:54.990
We pass in an ID parameter before calling the post finder, rendering a valid HTTP status.
00:17:01.800
The mutation testing tool reveals the non-necessity of that status entirely.
00:17:08.540
The default response associated with successful execution suffices.
00:17:15.030
Just as useful as it is for detecting dead code, mutation testing can simplify code.
00:17:22.039
For example, in a user IDs parameter, we can supply the users directly without splatting.
00:17:29.059
This exposes us to Ruby's interface intuitively, all at no extra cost.
00:17:36.420
Another example arises in a method designed to welcome users.
00:17:42.200
The mutation testing solution suggests navigating to a clearer syntax with the user method.
00:17:48.380
It demonstrates a streamlined approach that resolves to clearer errors.
00:17:54.590
Now, consider parsing a path string on a UNIX system by replacing the leading tilde.
00:18:01.460
We can verify that we’re not terribly dependent on overly generalized substitutions.
00:18:08.540
The mutation testing tool points out that 'sub' is sufficient here and more straightforward than the global 'gsub'.
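A small sketch of why the two are interchangeable in this case. It assumes the pattern is anchored to the start of the string, which the transcript implies ("the leading tilde") but does not show; the method names and paths are hypothetical.

```ruby
# Assumption: the pattern is anchored with \A, so it can match at most once.
def expand_with_gsub(path, home)
  path.gsub(/\A~/, home)
end

def expand_with_sub(path, home)
  path.sub(/\A~/, home)
end

home  = '/home/john'
paths = ['~/notes.txt', '~/drafts/~old', '/etc/hosts']

paths.each do |path|
  puts "#{path} -> #{expand_with_sub(path, home)}"
end
```

Because an anchored pattern can only match once, the global replacement does nothing extra, and 'sub' states the intent more directly.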
00:18:14.490
As another example, suppose we’re looking for images that haven't been seen for two years.
00:18:20.380
The mutation testing tool displays how we can replace 'map' with 'each.'
00:18:27.480
This tells us we aren’t using the new array that 'map' builds, which simplifies our logic.
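A minimal sketch of the difference, with a hypothetical 'Image' struct and an assumed side effect (marking stale images as archived); the code from the slides is not in the transcript.

```ruby
# Hypothetical model: the talk's real image class is not shown.
Image = Struct.new(:name, :archived)

stale = [Image.new('a.png', false), Image.new('b.png', false)]

# map collects the block's return values into a brand-new array.
mapped = stale.map { |image| image.archived = true }

# each runs the same side effect and simply returns the receiver.
eached = stale.each { |image| image.archived = true }

puts mapped.inspect
puts eached.equal?(stale)
```

If nothing reads the array that 'map' returns, the tests cannot tell the two apart, and 'each' says more honestly that the block is there for its side effect.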
00:18:34.990
Finally, we encounter an example using the Logger from Ruby’s standard library.
00:18:42.560
Here, we’re setting a formatter that transforms log event details into a formatted string.
00:18:50.960
The mutation testing tool recommends swapping out 'proc' for 'lambda'.
00:18:57.820
Usually, I forget the differences, but for our purposes, it highlights valuable distinctions.
00:19:04.810
Procs and lambdas are similar but differ in how strictly they check the number of arguments.
00:19:12.220
You can avoid common pitfalls by sticking to lambdas, which are the safer choice.
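The arity difference is easy to see in a sketch. The formatter below is an assumed example, not the one from the slides; 'Logger' calls its formatter with four arguments (severity, time, program name, message), so a lambda has to declare all four.

```ruby
require 'logger'
require 'stringio'

io = StringIO.new
logger = Logger.new(io)

# A lambda formatter must accept exactly the four arguments Logger passes.
logger.formatter = lambda do |severity, _time, _progname, message|
  "#{severity}: #{message}\n"
end
logger.info('started')

# Same block body, different argument checking.
loose  = proc   { |severity, message| "#{severity}: #{message}" }
strict = lambda { |severity, message| "#{severity}: #{message}" }

loose_result = loose.call('INFO') # missing argument silently becomes nil

strict_error =
  begin
    strict.call('INFO')
    nil
  rescue ArgumentError => e
    e.class
  end

puts io.string
puts loose_result.inspect
puts strict_error.inspect
```

The proc hides a call with the wrong number of arguments, whereas the lambda fails loudly at the call site.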
00:19:18.670
Mutation testing yields significant insights, but one wonders about its practicality.
00:19:25.570
Some may find it daunting, but you have more manageable options.
00:19:33.320
You can pass in flags telling the tool what to focus on, like just the parts that changed.
00:19:40.080
This can be a tremendous time-saver, homing in on critical areas.
00:19:46.610
Overall, mutation testing signifies robust user growth over the years.
00:19:53.180
If you’re not implementing mutation testing daily, it can benefit you immensely.
00:20:00.340
It encourages learning nuances in Ruby’s special cases by exposing unique mutations relevant to your tasks.
00:20:08.620
You’ll learn in real time from mistakes, sparking more growth.
00:20:15.550
Ultimately, it improves your abilities in writing better tests and enhances your understanding.
00:20:23.350
You'll model your understanding of the code better and deliver products with fewer bugs.
00:20:30.720
The insights provided ultimately guide you to refine your testing strategies.
00:20:37.690
Employ this mutation testing process beforehand, particularly with unfamiliar code.
00:20:44.190
This gives you a checklist of hotspots that indicate where the code might break.
00:20:51.700
Simplifying your code forms an integral part of this tool.
00:20:58.160
If you're eager about growing as a professional and seeking effective tools, embracing mutation testing will certainly propel you.
00:21:05.490
If your coworkers aren’t thrilled, you can still leverage NewTests before pushing.
00:21:12.240
You’re likely to learn more about Ruby as well as enhance your tests.
00:21:18.980
If you're a team lead, consider implementing it for continuous integration.
00:21:25.620
You don’t need to aim for 100% mutation coverage to derive benefits.
00:21:31.920
Recognizing code changes that won’t break existing tests is invaluable.
00:21:38.370
It prepares authors and reviewers alike.
00:21:45.680
If you liked the insights shared today and appreciate great code, contact us.
00:21:53.220
I hope you’re all excited about using mutation testing!