Mutation Testing

Summarized using AI

Maybe!

Markus Schirp • March 15, 2014 • Wrocław, Poland

In the video titled "Maybe!", Markus Schirp presents an enlightening talk on mutation testing, a strategy that examines the effectiveness of unit tests in software development, particularly within the Ruby community.

The main theme revolves around the question of whether perfect tests can exist, and Schirp argues that mutation testing is essential in enhancing the quality of tests. The following key points outline the core content of the discussion:

  • Introduction to Mutation Testing: Schirp begins by explaining mutation testing as a method for testing the tests themselves. He emphasizes the importance of validating that tests effectively catch errors that mutations (intentional code changes) introduce.

  • Coverage Metrics Limitations: Traditional coverage metrics such as line coverage, branch coverage, and statement coverage can be misleading. Schirp shares personal experiences where total coverage did not guarantee bug-free code, noting that a single executed line may not cover all edge cases.

  • Mutation Testing Explained: Mutation testing modifies code to see if existing tests can detect changes. When a test fails after a code mutation, it is said to "kill" the mutant; if it passes, the mutant is considered "alive". Schirp uses examples to illustrate how mutation testing can reveal unseen weaknesses in test cases.

  • Practical Applications: He further discusses the integration of mutation testing in real-world projects, highlighting the Axiom library and the time efficiency gained by focusing tests on public interfaces instead of running the entire suite for every mutation.

  • Challenges and Limitations: The talk also covers the challenges faced in mutation testing, including equivalent mutants (mutations that do not alter the code's behavior) and the infinite runtime problem (where mutated code, for example a changed loop condition, never terminates).

  • Integration in Development: Schirp encourages implementing mutation testing as part of the development workflow. He highlights its potential to enhance code quality and developer satisfaction by ensuring better test coverage.

  • Concluding Thoughts: The talk concludes with the assertion that while mutation testing is not a silver bullet, it is a valuable tool that promotes better testing practices. Schirp aims to inspire attendees to adopt mutation testing in their development cycles.

Overall, the disconnect between perceived test coverage and actual error detection is at the heart of mutation testing's importance. Developers are urged to recognize that good tests identify errors effectively, leading to more stable and robust applications.


This video was recorded on http://wrocloverb.com. You should follow us at https://twitter.com/wrocloverb. See you next year!

Slides: http://slid.es/markusschirp/mutation-testing-fight-2

Markus Schirp with CAN WE WRITE PERFECT TESTS? - MAYBE!

Or: Why mutation testing is a game changer for unit tests. The pros of a solid unit test suite are well understood and accepted in the Ruby community.
The problem: How to define solid? Traditional metrics like line-coverage, branch-coverage and even statement-coverage can be misleading. Having a statement executed once does not mean all edge cases are specified and bug-free!
Automated tools can be used to identify uncovered edge cases that will introduce bugs into your program. Mutation testing brings fuzzing to the implementation level. Unlike input fuzzing it modifies the implementation to check if the test suite can detect a huge set of automatically introduced behaviour changes.
This talk will elaborate on the history of testing and the metrics that are used to define coverage, and on how such metrics can misguide, and have misguided, development direction. It will also show examples of projects that heavily adopted mutation testing, especially the long-term effects and how the (mutation)-metrics driven approach improved developer happiness and code stability, showing what kind of actual bugs were caught and how code became naturally streamlined. Lastly the current and future limits of existing mutation testing tools will be presented. The idea is to create the "MUST HAVE"-Feeling in the audience. Finally, you can prove that you have good tests! Do not miss it!

wroclove.rb 2014

00:00:13.019 Ah, really warmer, plus thank you.
00:00:19.080 Um, I would like to introduce a funny topic. It's called mutation testing.
00:00:24.779 I'm really proud that there are many people in the room. Sorry, um, okay, who saw this talk at Eurocamp?
00:00:33.180 Ah, some hands. Okay, basically, it's the same talk, but the background is now not black anymore.
00:00:38.520 So that's what I changed, and I'm really proud it looks better now.
00:00:44.520 So I'm going to talk about a testing strategy called mutation testing.
00:00:51.120 It's hard to explain, so I'll try not to rely too much on slides because actually I'm not really good at creating good slides.
00:00:56.460 I hope you will interrupt me in case I go too fast or use terms that need explanation.
00:01:03.059 That's the setup of talks I really like.
00:01:08.460 If there were no questions, I would pick random people in the audience and say, 'Okay, what's the AST I'm talking about? Tell me.'
00:01:14.220 So, let's start. Mutation testing is about testing your tests.
00:01:19.380 Why should tests be tested? We all say, 'Okay, we write code, we do TDD, and once we have the tests green, we can say: good job, let's move on, deploy it.'
00:01:26.820 Yes, it's not that easy.
00:01:33.240 Okay, I started testing years ago. I hated it, seriously, I hated it.
00:01:39.540 When I started testing, I was always freaking out.
00:01:45.479 Oh no, I just hacked for five minutes. I only found holes.
00:01:51.000 Typically yours. To get something working, now an HTML site is presented on the screen, I'm happy, I need to ship this.
00:01:56.340 Then someone told me I needed to test it, not me.
00:02:02.520 I tested it; I ran it multiple times in my browser, so it must work.
00:02:08.880 That was before I finally got into TDD.
00:02:14.760 Then I stopped, started to do too much testing.
00:02:20.940 I basically tested simple arithmetic primitives.
00:02:27.060 One plus one should equal two because I didn't want to stop.
00:02:32.760 I finally started testing, but I did too much. I wasn't identifying edge cases.
00:02:39.540 I was just enumerating everything I could enumerate because I thought this is testing.
00:02:45.959 So it didn't work out. Over time, I discovered testing tools.
00:02:51.120 Oh, what a cool thing! Test metrics, some automated way to verify if I tested enough.
00:02:57.660 Cool, let's achieve it. Let's try to use test metrics.
00:03:03.540 The first thing was pretty lame. I said, okay, there was a ratio between lines of code written and lines of code testing.
00:03:09.720 Yeah, if it's close to some specific number or whatever, it's good. I'm done with testing.
00:03:15.060 Because it's enough, okay? It's a joke.
00:03:22.260 Then I saw the line coverage thing, and the line coverage thing proved pretty well.
00:03:28.500 It showed me there was a nice report. Some lines were backgrounded. Yeah, green—that's probably because it's light on.
00:03:34.140 When I see 100% coverage, I'm done; I can ship my code.
00:03:40.680 Now, that's not actually true, because here's the ternary operator.
00:03:46.980 This is a 100% covering test case, so I only have to put in true, and only one of these branches gets tested.
00:03:53.940 So it might still blow up and it happened to me a lot.
00:04:00.480 I was the king of pushing typos because the right side was basically okay.
00:04:05.940 Now it's a little bit simple, but it could be any mistyped method call and would blow up in production.
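The slide with the ternary example is not part of the transcript. A minimal Ruby sketch of the problem described here might look like the following; the method name `status` and the misspelled call are hypothetical, chosen only to illustrate a typo hiding in the untested branch.

```ruby
# Hypothetical method: both branches of the ternary sit on a single line.
def status(input)
  # `upcasee` is a deliberately misspelled String#upcase. It is only
  # evaluated when `input` is falsy.
  input ? "on" : "off".upcasee
end

# This single check executes the only line in the method body, so a
# line coverage tool would report the method as fully covered.
raise "unexpected result" unless status(true) == "on"

# The other branch was never exercised and fails at runtime.
begin
  status(false)
rescue NoMethodError => e
  puts "failure in the untested branch: #{e.class}"
end
```

One passing test is enough to mark the line as covered, yet half of the behaviour on that line has never run.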
00:04:12.659 Because your users are actually fuzzing your application all the time.
00:04:17.880 If you don't have any users, Google will parse your application, so it doesn't give you anything.
00:04:24.240 Then I did some formal research and I found the branch coverage—a really cool thing.
00:04:31.620 Because we don't have any tooling in Ruby, but there is a well-tested code outside.
00:04:36.780 We all use SQLite; it's branch covered, and it's basically one of the best covered libraries we'll ever have in open source.
00:04:42.000 But actually, branch coverage itself or statement coverage, in my opinion, is the same.
00:04:48.780 We can argue later, but it also has some nasty properties.
00:04:57.000 If you have two statements next to each other, only the last side effect generated needs to be tested.
00:05:04.440 And this side effect does not need any confirmation; it's actually intended.
00:05:11.060 So it's not a solution.
00:05:16.680 Then I joined the ROM team, aka DataMapper 2, aka 'We won't finish it' or whatever.
00:05:23.699 Yeah, and there was a guy, his name is Dan
00:05:30.600 Kubb, and he said, 'Markus, your tests suck! Please use, uh, try to read the code I wrote.'
00:05:36.300 And that's mutation coverage.
00:05:41.639 I said, 'Oh my God! What's that?' and I ran this test.
00:05:48.720 I was totally freaking out—it was so nice!
00:05:55.080 Because actually, mutation coverage turned out to capture all the metric values I just showed.
00:06:02.160 If you apply it right, you can easily beat all line coverage problems.
00:06:08.220 You can beat the statement coverage problems.
00:06:15.120 It worked, but it has some shortcomings, especially the mutation tests we used at that time.
00:06:21.900 So over time, I realized that, okay, I cannot simply blame a bad mutation tester because I didn't write it.
00:06:28.020 I had to fend for myself, and doing that resulted in the mutant project.
00:06:34.560 Let's dive a little bit more into that.
00:06:41.640 Okay, I don't want to rely on slides too much.
00:06:48.720 So let's go into some formal definitions.
00:06:55.680 Mutation testing is not as easy to explain as line coverage or statement coverage.
00:07:02.520 We need some agreement on names.
00:07:09.960 In the next 10 or 15 minutes, I will talk about alive mutants and killed mutants.
00:07:17.880 A mutant itself is just a variation of your code.
00:07:24.240 If you change a literal one to a literal two, the literal two is a mutation.
00:07:31.260 That's quite easy to understand.
00:07:38.640 The problem is mutation testing changes your code in an automated manner.
00:07:46.080 It then runs the tests and evaluates their result.
00:07:52.800 If a test fails after the change, we basically count the mutation as killed.
00:07:59.520 If any change slips through the test after it has been automatically introduced, then it is called alive.
00:08:06.180 Because we all know movies, mutants should be dead.
00:08:12.960 So I never saw a good mutant.
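The killed/alive vocabulary can be illustrated with a small Ruby sketch. Everything in it is an assumption made for illustration (the lambdas, the one-assertion "suite", the hand-written mutants); a real mutation tester generates the mutants automatically.

```ruby
# Code under test: doubles its argument.
original = ->(x) { x * 2 }

# Two hand-written mutants standing in for automatically generated ones.
mutants = {
  "literal 2 -> 3"  => ->(x) { x * 3 }, # changes the result for input 2
  "operator * -> +" => ->(x) { x + 2 }  # same result for input 2
}

# The whole "test suite": a single assertion.
def suite_passes?(double)
  double.call(2) == 4
end

raise "suite must be green on the original" unless suite_passes?(original)

# A mutant is killed when the suite goes red, alive when it stays green.
results = mutants.transform_values do |mutant|
  suite_passes?(mutant) ? :alive : :killed
end

results.each { |name, status| puts "#{name}: #{status}" }
```

The surviving mutant shows that the single assertion cannot tell multiplication from addition; a second input such as 3 would kill it.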
00:08:20.220 Okay, good. I also have several specific mutations which should not be that built.
00:08:25.800 We can go into that later.
00:08:33.600 When we want to mutate code, it's not about doing substitutions on the string representation of code.
00:08:41.160 Because you will break the syntax, and you're not really sure what you're doing.
00:08:48.360 So mutation testing must be outside.
00:08:55.920 I saw tools for the Go language and many other languages that just grep through the code and exchange a minus with a plus.
00:09:02.520 That is not a durable strategy for a mutation tester.
00:09:09.420 The semantics of today's programming languages, especially Ruby, are incredibly complex.
00:09:16.920 So you need a transformable representation, and the transformable representation is the AST.
00:09:24.300 Who knows what the AST is?
00:09:31.140 Yeah, really nice! So just set the advanced shop to transform.
00:09:36.840 Um, it was not full coverage here.
00:09:42.600 So I will just go into the AST without the slide because I don't know how to do slides in the right way.
00:09:49.740 The AST is the abstract syntax tree.
00:09:56.220 If you ever saw one at university, you probably saw it; it's one plus one at the top.
00:10:02.700 The left is one, and the right is also one—it's an abstract representation.
00:10:09.000 So if you parse code, you probably end up with an AST.
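To see what such a tree looks like for `1 + 1`, one option is Ripper, a parser that ships with Ruby's standard library. It is used here only as a convenient way to print a tree; the talk does not say which parser the speaker's tool builds on, so this is not a description of mutant's internals.

```ruby
require "ripper"
require "pp"

# Parse the expression into a nested-array syntax tree.
tree = Ripper.sexp("1 + 1")

# The printed structure contains a :binary node holding the operator :+
# together with the two integer operands as its children.
pp tree
```

A mutation tester working on a tree like this can swap the operator node or an operand node and still be sure the result is syntactically valid code.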
00:10:16.200 If you parse markdown, you enter this HTML.
00:10:23.940 Okay, there is another class of mutators you could use, but I did not go this route.
00:10:31.860 Because it would not have been portable in Ruby; we only have shared code.
00:10:39.240 We don't even have a shared AST, but the AST is somehow shareable across implementations.
00:10:46.560 So I chose to use an AST-based mutator.
00:10:53.520 Um, this is another mutation example. This slide probably should have come earlier.
00:10:59.640 This is the same method I mentioned where we had the line coverage issue.
00:11:06.780 That's interesting because the mutation tester changed that input to true, a constant.
00:11:12.840 The test we had before, the one which gave us 100% line coverage, would still pass.
00:11:19.260 So the mutation tester here identified a missing test.
00:11:25.440 So basically, yeah, cool, the missing statement was identified.
00:11:32.280 If you want to kill the mutations, you basically have to make sure that the other branch gets executed.
00:11:39.600 In this case, just add a new test or drop the code.
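The mutation described here, replacing the condition with the constant `true`, can be reproduced by hand. The sketch below uses hypothetical lambdas and a hypothetical list-of-checks "suite" in place of a real test framework.

```ruby
original = ->(input)  { input ? "on" : "off" }
mutant   = ->(_input) { true ? "on" : "off" } # condition replaced by a constant

# The suite that produced 100% line coverage: one check, true branch only.
weak_suite = [->(f) { f.call(true) == "on" }]

# Adding a check for the other branch is what kills the mutant.
strong_suite = weak_suite + [->(f) { f.call(false) == "off" }]

# A mutant is alive when every check still passes against it.
def alive?(suite, implementation)
  suite.all? { |check| check.call(implementation) }
end

puts "weak suite, mutant alive?   #{alive?(weak_suite, mutant)}"
puts "strong suite, mutant alive? #{alive?(strong_suite, mutant)}"
```

Both suites stay green on the original; only the stronger one notices that the condition no longer matters.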
00:11:46.740 Okay, there are many ways to mutate your code.
00:11:52.440 I cannot even enumerate all of them.
00:11:58.260 I tried once, but it's a deep rabbit hole.
00:12:05.520 Maybe I don't understand the domain well enough to come up with a definitive set of mutations.
00:12:11.760 I'm still discovering.
00:12:18.240 Yes! Those mutations typically only change the code in an automated way.
00:12:24.240 It tries to make the tests red. In case it cannot make the tests red, probably a test is missing or there is superfluous code.
00:12:30.840 That's the idea behind it.
00:12:37.560 So code coverage is a little bit of an unspecific term.
00:12:44.760 Okay, we can do many types of mutations.
00:12:51.060 We can add or change literals, we can delete statements, which will prove a statement that has a measured side effect.
00:12:57.600 That solves the problem with pure statement coverage.
00:13:04.260 We can invert conditionals, replace binary operators, or delete arguments.
00:13:11.220 There are many other strategies. I cannot enumerate them all.
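A few of the operators named above (literal changes, statement deletion, condition inversion) can be sketched over a tiny hand-rolled tree. The node format `[:type, *children]` and the function `mutations` are assumptions made for this example; they are not the node format or API of any real parser or of mutant.

```ruby
# Returns every tree that differs from `node` by exactly one mutation.
def mutations(node)
  return [] unless node.is_a?(Array)

  type, *children = node

  # Mutations that apply to this node itself.
  own =
    case type
    when :int
      value = children.first
      [[:int, value + 1], [:int, 0]]          # change the literal
    when :if
      condition, *branches = children
      [[:if, [:not, condition], *branches]]   # invert the condition
    when :begin
      children.each_index.map do |i|          # delete one statement
        [:begin, *children.take(i), *children.drop(i + 1)]
      end
    else
      []
    end

  # Mutations inside one child, with all other children left untouched.
  nested = children.each_with_index.flat_map do |child, i|
    mutations(child).map do |mutated|
      [type, *children.take(i), mutated, *children.drop(i + 1)]
    end
  end

  # Drop results identical to the input (for example 0 -> 0).
  (own + nested).reject { |mutant| mutant == node }
end

tree = [:begin,
        [:send, nil, :log, [:int, 1]],
        [:if, [:lvar, :ok], [:int, 1], [:int, 2]]]

mutations(tree).each { |mutant| p mutant }
```

Each printed tree is one mutant. Even this three-operator toy produces several mutants for a two-statement body, which hints at why the number of mutations grows quickly on real code.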
00:13:17.160 So let’s just move on. There is some real-world use of this stuff.
00:13:23.880 We tried to write ROM, the Ruby object mapper, and we tried to ensure it has 100% mutation coverage.
00:13:30.780 This is some measurement for the component we have; it's called Axiom.
00:13:38.340 It's the relational algebra behind it and I just ran it today.
00:13:45.960 It did not finish because mutant has currently some problems with this library.
00:13:51.899 Yeah, it's super slow.
00:13:59.640 It's a technique you don't want to run at your commit stage.
00:14:06.360 I wanted to cover that stuff later, but it's actually a really good point.
00:14:13.860 For each mutation, in the worst case, your whole test suite gets run.
00:14:19.920 I say your whole test suite, so if it touches the database to set up stuff,
00:14:25.140 because you are on ActiveRecord, you basically have a super long runtime.
00:14:33.060 So if you write real unit tests, mutation testing can be really fast.
00:14:40.080 If you write unit tests in a way the mutation test can identify that there is a unit test for that subject.
00:14:46.380 It can only run those specific tests, and then it becomes quite fast.
00:14:53.520 For example, when we introduced mutation testing to Axiom, it had a runtime of a day.
00:14:59.640 Once we scoped the test execution to public interface tests, it went down to 30 minutes, which is manageable.
00:15:06.240 Okay, I just simply lost track, so I will just discover what comes next.
00:15:12.900 Perfect, okay.
00:15:19.320 Typically, mutation testers need to report the mutations in a way programmers can consume.
00:15:26.339 I chose to implement it as a diff because we all know how to consume a diff.
00:15:33.780 If you run a mutation tester and see 100,000 unkilled mutations, sometimes a single test can kill 50 of them.
00:15:40.680 The mutation test itself is quite dumb. It just fuzzes your code. It's not fuzzing an input, it's just fuzzing code.
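Reporting a surviving mutation as a diff can be sketched in a few lines. The helper `line_diff` is hypothetical and deliberately naive: it compares the two sources line by line and assumes the mutation neither adds nor removes lines, which a real diff algorithm would not require.

```ruby
# Naive line-by-line diff between the original and the mutated source.
# Assumption: both sources have the same number of lines.
def line_diff(original, mutated)
  original.lines.zip(mutated.lines).flat_map do |before, after|
    before == after ? [" #{before}"] : ["-#{before}", "+#{after}"]
  end.join
end

original = <<~RUBY
  def status(input)
    input ? "on" : "off"
  end
RUBY

mutated = <<~RUBY
  def status(input)
    true ? "on" : "off"
  end
RUBY

report = line_diff(original, mutated)
puts report
```

The report marks the changed line with `-` and `+` and leaves the surrounding lines as context, which is the format programmers already read every day in code review.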
00:15:47.940 Okay, to run a mutation tester, you need to manage the CLI.
00:15:54.120 I don’t have any examples about the CLI here because I wanted to talk about the theory of mutation testing.
00:16:00.420 This talk is not really about mutant itself; it’s about what mutation testing is.
00:16:06.600 I want to interest you the audience in trying it out.
00:16:12.840 There are many mutation testers, so I don’t want to start with mutant.
00:16:19.440 No, because all mutation testers are quite complex.
00:16:26.520 I can only scratch the surface in this format.
00:16:32.760 I really hope for questions.
00:16:39.240 Other questions? No questions? Okay.
00:16:45.420 Okay, that's your slide.
00:16:51.000 Write real unit tests; that's actually a nice property of mutation testing.
00:16:57.720 If mutation testing is too slow, you don't have unit tests. Let's know there's nothing to argue about.
00:17:03.120 Okay, test selection; I probably should have just gone forward to the test selection slide.
00:17:09.420 The isolation one is really interesting.
00:17:15.060 If you just randomly change your code and execute it, what can possibly go wrong in a dynamic language?
00:17:21.540 If you have a test that dynamically generates a class, and if your code is subject to generate classes and put them somewhere in the VM.
00:17:27.960 It might leak, and all later tests could be totally screwed up, so we could have artificially dead mutants.
00:17:34.560 There needs to be sandboxing.
00:17:41.760 Especially the sandboxing stuff was the reason I wrote my own mutation tester,
00:17:48.240 because Heckle did not have it, and I once had 100% mutation coverage with one test.
00:17:55.200 Basically, that one test had mutations that invalidated future tests.
00:18:01.920 So currently, mutant forks before injecting the mutant and measuring the effect, and that works quite nicely.
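One way to get this kind of sandboxing on Unix-like systems is to run each mutation in a forked child process, so that anything the mutated code defines or patches disappears when the child exits. The sketch below illustrates that idea only; the helper `isolated` and the constant `LeakedByMutant` are hypothetical, and it is not a description of how any particular tool implements isolation.

```ruby
# Runs the block in a forked child process and reports whether it
# returned a truthy value. Requires Process.fork (not available on Windows).
def isolated
  pid = fork do
    ok = yield
    exit!(ok ? 0 : 1) # skip at_exit handlers inherited from the parent
  end
  _, status = Process.wait2(pid)
  status.success?
end

tests_passed = isolated do
  # Side effect of the mutated code: a class leaks into the global namespace.
  Object.const_set(:LeakedByMutant, Class.new)
  false # pretend the tests went red for this mutant
end

puts tests_passed ? "mutant alive" : "mutant killed"
puts "leak visible in parent? #{Object.const_defined?(:LeakedByMutant)}"
```

The parent process only learns the exit status, so a class generated under mutation cannot invalidate the tests that run afterwards.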
00:18:08.280 There are many other strategies, especially for JRuby.
00:18:14.880 We could probably just build a second runtime to isolate effects.
00:18:21.600 I'm not really sure.
00:18:27.600 Because there is no silver bullet, you all know that there are really shortcomings of all mutation testers.
00:18:34.320 I have to mention here that there is the problem of equivalent code.
00:18:40.920 If a mutation tester mutates your code in a way that has exactly the same semantics, nobody can rescue you.
00:18:47.280 We cannot manually blacklist the mutation to say, okay, I as a human can decide that one should never be in the report again.
00:18:54.720 What's really nice in Ruby is that we have such a dense Enumerable API.
00:19:00.960 Most of the equivalent mutants typically occur here.
00:19:07.140 This would not happen with mutant; you would just use 1.upto(10), and no mutation would produce an equivalent mutant here.
00:19:14.460 That's a really nice property.
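The slide with the equivalent-mutant example is not in the transcript. A classic case, assumed here for illustration, is a hand-written counting loop whose comparison operator can be swapped without changing behaviour.

```ruby
# Original: explicit loop with a comparison a mutation tester can change.
def count_original
  i = 0
  i += 1 while i < 10
  i
end

# Mutant: `<` replaced by `!=`. Because `i` starts at 0 and grows by
# exactly 1, both conditions become false at the same moment.
def count_mutant
  i = 0
  i += 1 while i != 10
  i
end

# Iterator-based version: there is no comparison operator left to mutate.
def count_with_upto
  last = 0
  1.upto(10) { |n| last = n }
  last
end

puts count_original
puts count_mutant
puts count_with_upto
```

No test can distinguish the first two methods, so the mutant can never be killed. Writing the loop with an iterator removes the operator that gave rise to the equivalent mutant in the first place.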
00:19:19.800 Okay, then we have the infinite runtime problem.
00:19:27.180 I did not solve that currently and I'm not aware of any mutation testers that solve the problem.
00:19:34.620 Because if we have a conditional and that conditional controls a loop, that loop will never terminate.
00:19:42.000 The mutation tester will never terminate.
00:19:49.260 It's the halting problem; nobody can tell you whether you should account it as killed or not.
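The speaker says this problem is unsolved in his tool. A common pragmatic workaround, shown here as an assumption and not as something the talk proposes, is to put a time limit on each mutant and report timeouts as their own category. This does not decide the halting problem; it only bounds the waiting time, and whether a timeout counts as killed remains a policy choice. The helper `run_mutant` is hypothetical.

```ruby
require "timeout"

# Runs the tests for one mutant under a time limit.
# The block returns true when the tests pass against the mutant.
def run_mutant(limit_seconds)
  Timeout.timeout(limit_seconds) { yield ? :alive : :killed }
rescue Timeout::Error
  :timed_out
end

# A mutated loop condition that can never become false.
status = run_mutant(0.2) do
  i = 0
  i += 1 while i != -1
  true
end

puts status
```

A generous limit relative to the normal test runtime keeps slow but terminating mutants from being misreported.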
00:19:55.680 Currently, it doesn't happen much because of the enumerable API.
00:20:02.040 Because we control loop execution more in Ruby, this is not a big problem currently.
00:20:08.220 Another example: if someone codes a bug and writes a test for 9 or runs a lookup table for 100 integers.
00:20:14.940 Mutation testing will not tell you that the implementation is incorrect.
00:20:21.960 Mutation testing can only ensure the coverage between intentions of the tester and the coder.
00:20:28.560 If someone wants to cheat, they can still cheat.
00:20:35.220 So it's not a silver bullet here.
00:20:41.280 There was a hunting cheat sheet; it's totally imperfect.
00:20:48.960 But basically, it boils down to whenever killing a mutation, think about not adding a test, but writing simpler code.
00:20:55.440 If there is a way to avoid a literal, do it. If there is a way to reduce the insects, just do it.
00:21:03.060 It results in fewer mutations, and you will be happy afterward.
00:21:10.260 When to use mutation testing? I can’t say so.
00:21:17.100 I have commercial projects on Rails which are 100% mutation covered.
00:21:24.840 It takes a while; in that project, it proved some value.
00:21:32.160 I have projects where I tried to apply mutation testing, and it failed.
00:21:38.160 Mostly because we had a legacy test base which had too much remote code execution or inter-process communication.
00:21:45.840 The setup and teardown times ruined the experience.
00:21:53.040 That's up to the community to decide when mutation testing is appropriate.
00:22:00.540 For all mathematical, algebraic, and transformation domains which are self-contained, I can only recommend it.
00:22:07.680 If you want to try mutation testing on your project, mutation test the classes where money is involved.
00:22:14.640 It works; it's a fitting domain. It allows testing critical code.
00:22:21.180 Because your revenue calculation and promotion code stuff should have, in my opinion, potential.
00:22:28.080 It's all about doing the best to our knowledge.
00:22:35.460 So hopefully, you have more knowledge now and can run home to your product owner.
00:22:42.720 And say, 'Okay, I can't stand code without mutation testing anymore.'
00:22:49.860 I would love to see some questions.
00:22:58.920 Yes, this statement stuff, yeah.
00:23:05.340 So I'll wait for him to get to the slide.
00:23:11.520 My question is about this slide. We have two statements: side effect A and side effect B.
00:23:17.220 How would mutation testing cover the fact that maybe side effect A should happen?
00:23:24.240 It would delete side effect A and run the test statement deletion, but then you'd have to ensure that the test actually tests the side effect.
00:23:30.840 Yes, exactly. You will notice a live mutant.
00:23:36.840 You will see in the diff just a removed line.
00:23:42.480 You, as a programmer, should identify, 'Hey, why can't the mutation tester remove that line without my knowledge?'
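The exchange about side effect A and side effect B can be made concrete. The classes and checks below are hypothetical; the subclass stands in for the mutant a statement-deletion operator would generate.

```ruby
class Signup
  attr_reader :audit_log, :emails

  def initialize
    @audit_log = []
    @emails = []
  end

  def call(user)
    @audit_log << "signup #{user}" # side effect A
    @emails << "welcome #{user}"   # side effect B
  end
end

# Hand-written stand-in for the mutant: the line for side effect A is deleted.
class SignupWithoutA < Signup
  def call(user)
    @emails << "welcome #{user}"
  end
end

# Executes both statements, but only verifies side effect B.
checks_only_b = lambda do |klass|
  signup = klass.new
  signup.call("ann")
  signup.emails == ["welcome ann"]
end

# Verifies both side effects.
checks_a_and_b = lambda do |klass|
  signup = klass.new
  signup.call("ann")
  signup.emails == ["welcome ann"] && signup.audit_log == ["signup ann"]
end

puts "only B checked, mutant alive?  #{checks_only_b.call(SignupWithoutA)}"
puts "A and B checked, mutant alive? #{checks_a_and_b.call(SignupWithoutA)}"
```

The first check gives full statement coverage and still lets the deletion through; the surviving mutant is the signal that side effect A is unspecified.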
00:23:48.720 Let's go on.
00:23:55.920 There are some authors who write that it's not a good idea to aim for 100% line coverage.
00:24:02.520 Because it's too expensive. Could you tell us what we should aim for with mutation coverage?
00:24:09.540 I personally aim for 100%, but that's because of the loopback effect.
00:24:15.720 It's my tool; I try to fulfill the tool's intention.
00:24:22.320 But I would aim for 100% coverage of core classes.
00:24:29.040 If there's a tricky class, or even a group of classes within a domain that should be 100% covered.
00:24:36.960 It's a great way to ensure testing on the most important business path.
00:24:43.800 I got a question; how do these mutations work when you're mutating the code right here?
00:24:54.960 You're basically showing basic mutations where you mutate the code.
00:25:02.280 But there are also practices where you mutate the tests; QuickCheck is the best example.
00:25:09.960 Yes, do you have any experience with this?
00:25:16.920 I tried to use QuickCheck, and I failed with the setup, so I have zero experience with QuickCheck.
00:25:24.840 But I know it's closely related to mutation testing.
00:25:31.920 I think when we would have a more introspectable invariant definition—
00:25:39.240 I'm not talking about RSpec here; RSpec does not have that.
00:25:45.960 We have to parse the code again and assume certain things.
00:25:53.520 But if we have introspectable predicates on various states of objects under test, we could do something like QuickCheck.
00:25:59.760 So maybe as a follow-up, do you think Ruby is a language where you can do invariant checking sensibly?
00:26:06.120 In my opinion, yes. Ruby is the language where, to my knowledge, mutation testing performs best.
00:26:13.200 It excels under mutation testing because it effectively loads dynamic code.
00:26:19.680 Injecting mutants is nothing more than what a require does in Ruby.
00:26:26.760 So it has a good optimized performance.
00:26:33.720 You gave an example of a project where mutation testing failed.
00:26:41.520 Can you give an example of where mutation testing was successful?
00:26:48.900 Yes, I need to include that stuff in the slide.
00:26:55.560 The person who wrote the Axiom library, mutation-covered Axiom, wrote a fuzzer.
00:27:03.780 They wrote an SQL generator and plugged the fuzzer to generate random mutations.
00:27:10.620 It was able to generate SQL, serialize it, and apply it to SQLite, and he found a bug in SQLite.
00:27:17.880 Yes, it was fixed upstream.
00:27:26.760 I have to include the references into the slides.
00:27:33.840 More questions back there? Yeah.
00:27:38.520 Hey, do you have any statistics regarding how many mutants were killed by given tests, and also how many tests killed a given mutant?
00:27:45.180 I only have the first direction you mentioned.
00:27:52.320 I know how many mutants I should have included, but today it did not finish.
00:27:59.760 So, I couldn't update my slides.
00:28:06.000 So I know what mutants are dead on a given subject in the whole project.
00:28:12.720 I don't know the inverse; that's basically a shortcoming of me doing the RSpec integration.
00:28:19.140 We could do that, but in my opinion, we should do something like QuickCheck.
00:28:25.920 There, the measurements would become even better.
00:28:32.420 Hi, my question is because you've already said that it's kind of calculated renders mutant tests.
00:28:40.020 Why do you think they fit into the development process?
00:28:46.560 I run them locally.
00:28:53.040 I know I should probably write a Guard plugin that notices I changed a class and says, 'Let's mutation cover it.'
00:29:00.240 It will take a minute before I check in, so I have an excuse to get another coffee.
00:29:06.840 But I recommend running it on stage two.
00:29:12.480 If you have a normal CI setup, a multi-stage setup, there is a standard unit test.
00:29:19.200 I run them before my headless test.
00:29:25.620 That works best for me because identifying a problem on a headless test is much more costly.
00:29:31.680 It is much simpler to identify a problem that gets reported in a diff.
00:29:37.200 So I arrange the chain of tests in the way that reflects how problematic it would be to track down a problem.
00:29:44.520 Thanks.
00:29:51.600 Yeah, no problem.
00:29:58.080 So I have two questions. The first one is when you started doing mutation testing, what was the most common reason why your tests failed?
00:30:05.160 What were the things that were being changed most often that might lead to the tests failing?
00:30:11.280 It took me a while to learn to infer the connection between the behavior change introduced and the test.
00:30:18.600 And why the test does not kill it.
00:30:24.000 So I ended up basically doing the same mutation just firing it up.
00:30:30.540 I tried to make a dead mutation by hand with my normal TDD cycles.
00:30:36.600 I said, 'Okay, I will keep this mutation if I can't prove it wrong, or I will just find a way to refactor my tests.'
00:30:43.560 If I make them red, I would then change the code back.
00:30:50.520 At that time, I identified Git is really nice.
00:30:57.120 So my second question is more about the multi-stage testing setup that you mentioned.
00:31:03.840 Is it part of your content where you can switch out the different implementations so the tests run faster?
00:31:10.560 Do you think integrating mutation testing into a workflow encourages you to separate
00:31:17.880 the concerns more and to get to a point where you have a more multi-stage testing environment?
00:31:24.840 Absolutely! I identified good code is easy to mutation cover.
00:31:32.520 I sneaked in mutation testing to various projects already.
00:31:39.240 I started with the core classes, then expanded.
00:31:46.380 However, I realized, okay, I want to have this class mutation cover too.
00:31:52.920 It generated a significant number of mutations I could not test. It was better to refactor the class.
00:31:59.160 Then I mutation covered it by utilizing the existing tests.
00:32:05.520 No? Let's simply move on.
00:32:10.680 Any more questions?
00:32:16.560 No? Yeah, there was—yeah, I saw her hand.
00:32:22.680 When you do one mutation and you launch the whole suite, how do you identify the mutants?
00:32:30.480 They should fail, but you probably have a lot of tests that pass.
00:32:37.380 If you use the shotgun approach, all tests are allowed to kill the mutation.
00:32:46.560 Any test that fails kills the mutation.
00:32:53.220 So there is no need as it's a really unintelligent strategy.
00:32:58.200 For now, mutant's default is to use a selective selection.
00:33:05.460 If you have a mutation on Foo#bar, only tests that touch Foo#bar can kill the mutants.
00:33:12.840 Only they get executed, which speeds up the mutation test.
00:33:17.520 You selectively don't run the whole suite.
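The difference between the shotgun approach and selective test execution can be sketched as follows. Matching tests to a subject by description prefix is a simplified stand-in assumed for this example; the transcript does not spell out how the speaker's tool maps tests to subjects, and the names `Example`, `EXAMPLES`, and `select_examples` are hypothetical.

```ruby
# A test with a description and a (dummy) check.
Example = Struct.new(:description, :check)

EXAMPLES = [
  Example.new("Foo#bar doubles its argument", -> { true }),
  Example.new("Foo#baz negates its argument", -> { true }),
  Example.new("Other#thing is unrelated",     -> { true })
].freeze

# Picks the tests that are allowed to kill a mutation of `subject`.
def select_examples(subject, strategy)
  case strategy
  when :shotgun
    EXAMPLES # every test in the suite runs for every mutation
  when :selective
    EXAMPLES.select { |example| example.description.start_with?(subject) }
  else
    raise ArgumentError, "unknown strategy: #{strategy}"
  end
end

%i[shotgun selective].each do |strategy|
  count = select_examples("Foo#bar", strategy).size
  puts "#{strategy}: #{count} test(s) per mutation of Foo#bar"
end
```

With many mutations per method, running one focused test per mutation instead of the whole suite is where most of the runtime saving comes from.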
00:33:24.720 There are configurations you cannot set in mutant because I tend to write open-source code.
00:33:30.840 Only for my pleasure, so I'm a little bit easier on that.
00:33:38.160 But now I typically only bring in the features I personally like.
00:33:45.540 If there is a pull request that adds a feature, I will accept it.
00:33:53.280 But I don’t know—there's demand for more selective strategies.
00:34:00.720 But I couldn't make it currently.
00:34:05.460 No, it's okay.
00:34:09.840 My next question is about mocking.
00:34:13.920 In one of the slides, you will use mocking as an example.
00:34:19.200 Mock, should_receive, that kind of mocking.
00:34:24.840 Does mutation testing invalidate the need for mocking?
00:34:30.840 No, no.
00:34:33.840 Because you need mocking to write a true unit test.
00:34:38.760 In cases where you test classes that do IO, you need mocks.
00:34:46.440 But I prefer to choose to set up units from my domain.
00:34:52.800 I don’t think mutation testing hinders or promotes mocking.
00:34:58.560 But mutant conducts lots of augmentations.
00:35:03.720 The last time I counted, I had 91 unique mutations.
00:35:10.920 But I should find a better strategy to enumerate them.
00:35:15.480 Well, was it one slide back where you have if input does something?
00:35:21.600 No, maybe I'm thinking of something else.
00:35:28.560 But maybe there was one test or one slide that said something about 'should receive.'
00:35:34.320 That was just an example to measure side effects.
00:35:40.320 You can use any strategy.
00:35:46.680 Mutant can change method calls or it’s just literal stuff.
00:35:52.020 No, it has shown lots of augmentation.
00:35:58.620 So anything that makes the test fail will count as a killed mutant.
00:36:05.220 Does having mutants invalidate the need for some mocking?
00:36:11.280 I don't think so.
00:36:17.520 No, I don't think so.
00:36:22.560 All right, thanks, Markus.
00:36:32.520 Thank you.