
Deep into Ruby Code Coverage

Code coverage is an easy way to measure whether we have enough tests, but many of us have not started using it.
This talk delves into the benefits of meaningful code coverage and how to avoid some of its pitfalls with a new tool called DeepCover.

RubyKaigi 2018 https://rubykaigi.org/2018/presentations/malafortune

00:00:03.110 Hello, thank you all for being here.
00:00:08.580 My name is Marc-André Lafortune, but you can call me Marc; it's much easier. I'm honored to be an MRI committer and I really love open source, so I've contributed to multiple projects and I'm a maintainer on a couple of them.
00:00:17.779 Today, I want to talk to you about Ruby coverage. First, let me explain my accent; I'm from the French part of Canada, Montreal. This is what my neighborhood looks like. And no, I am not talking today about snow coverage.
00:00:31.529 I want to focus on code coverage. I'm going to discuss what code coverage is, why you should care, how it works, and we’re going to delve into different implementations to see how it is actually calculated.
00:00:44.430 So, first, what is code coverage? Imagine a large codebase. As professionals or aspiring professionals, you want to have tests in place. This way, if you break something, your tests will alert you to it. For instance, if a feature that used to work is no longer functioning, the tests will notify you of the problem.
00:01:12.210 Code coverage asks the question: Are your tests sufficient? Do they actually run the code you want to test? Essentially, it's a test for your tests. When I say 'covered,' I mean that if you run all your tests—regardless of whether they take minutes or hours—at some point, a particular section of code should be executed. If it's not running, it means someone could change it, potentially introducing bugs, and no one would know — except, perhaps, your users, who will complain.
00:01:37.200 Code coverage identifies uncovered code. Code is usually uncovered for one of three reasons. Most often there is a missing test, which you definitely want to add so that if anything breaks, you’ll know in advance and won't have users calling or emailing with complaints. Sometimes the code is unused, which suggests that it might need to be removed. Finally, some code exists for edge cases that should rarely occur; while you may not have tests for those, it's beneficial to have some checks in place.
00:02:44.330 Imagine you've written this method `foo`. You are really proud of it because it does something amazing; however, if you have no tests, you might wonder how many of your methods are covered. In this case, none of them are covered. Method coverage gives you a number that indicates which methods are being executed. For example, it could show zero coverage for a method.
00:03:21.530 Let's say we write a test for `foo`. Now, we say we have one hundred percent method coverage. All our methods are covered because they’re executed. However, in this situation, we are calling `foo`, but the variable `something` is set to `false`, which is the default. Hence, we are not executing another part of `foo`. Thus, there's a possibility that `bar` could either not exist or have hidden bugs that we wouldn't know about due to our limited test suite. We want something that provides coverage to indicate parts of code that aren't actually tested. This is known as line coverage.
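The slide itself is not part of this transcript, so the following is only a minimal sketch of the situation described, assuming `foo` takes an optional `something` argument that defaults to `false` and calls `bar` only when it is true. The return values are made up for illustration.

```ruby
# Assumed shape of the method from the talk; `bar` is deliberately
# left undefined to show that nothing notices it is missing.
def foo(something = false)
  if something
    bar        # only reached when something is truthy
  else
    :default   # illustrative return value, not from the talk
  end
end

# The only "test": it calls foo with the default argument.
result = foo
puts result.inspect # => :default

# Method coverage is 100% (foo was called), yet the call to bar was
# never executed, so the missing method goes unnoticed.
```

Calling `foo(true)` would raise a `NameError`, which is exactly the kind of hidden bug that method coverage alone cannot reveal.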
00:04:22.280 Line coverage tells you which lines of code were run; for example, line one and line two were executed while line three wasn't. This means you’ll receive something like seventy-five percent coverage, or whatever the precise number is. However, this metric can be misleading. If you wrote your code more compactly, for example with several expressions on the same line, every single line might run while some of the logic within those lines is never executed. Thus, we could have one hundred percent method coverage and one hundred percent line coverage without certainty that everything is working as it should.
00:05:06.699 As you probably know, Ruby comes with a built-in library called Coverage that tracks line coverage. However, it might overlook situations where your tests are inadequate or incomplete. This is where node coverage comes into play.
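As a rough illustration of that blind spot, the sketch below uses the standard `coverage` library on a throwaway file. The file contents and names are assumptions made for this example (the compact `foo` is a guess at what the slide showed), and the `lines: true` option requires Ruby 2.5 or later.

```ruby
require "coverage"
require "tempfile"

# Hypothetical code under test: the call to bar sits on the same
# line as the condition, and bar does not even exist.
source = <<~RUBY
  def foo(something = false)
    something ? bar : :default
  end
  foo
RUBY

file = Tempfile.new(["line_demo", ".rb"])
file.write(source)
file.close

Coverage.start(lines: true)   # only files loaded after this are measured
load file.path
result = Coverage.result

# Look the file up by basename in case the reported path differs.
_path, report = result.find { |path, _| File.basename(path) == File.basename(file.path) }
lines = report[:lines]        # one entry per line; nil means "not executable"

executable = lines.compact
puts "executed #{executable.count(&:positive?)} of #{executable.size} executable lines"
```

Every executable line is reported as run, even though `bar` was never called.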
00:05:52.000 Node coverage examines every part of the abstract syntax tree. Each bit of this tree is referred to as a node. For instance, a method definition is a node, the body of the method is another node, and that body has child nodes of its own. If one of those child nodes never gets executed, you will know, because that node is reported as uncovered. Node coverage therefore tells you when you don’t have enough tests for specific segments of your code.
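To make "node" a little more concrete, here is a small sketch that parses one line with Ripper, the parser interface in Ruby's standard library, and lists the node types it finds. This only illustrates what a syntax tree looks like; it is not a claim about which parser DeepCover uses, and the helper `node_types` is a name invented for this example.

```ruby
require "ripper"

# A single line of source that still contains several nodes.
source = "something ? bar : :default"
tree = Ripper.sexp(source)

# Hypothetical helper: walk the nested arrays Ripper returns and
# collect the symbol that names each node.
def node_types(node, found = [])
  return found unless node.is_a?(Array)
  found << node.first if node.first.is_a?(Symbol)
  node.each { |child| node_types(child, found) }
  found
end

types = node_types(tree)
puts types.inspect
```

Line coverage sees one line here; node coverage would track the conditional and each of its operands separately.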
00:07:03.120 But if your tests are written differently, passing arguments that make every node run, you might confidently hit one hundred percent node coverage while still having flaws. If certain values are never passed in your tests, a code path can be wrong without any test failing. This leads us to branch coverage, which checks whether you have tested every branch of your code. For instance, an if condition has two possible outcomes: it is either true or false. To achieve one hundred percent across method, line, node, and branch coverage, you need tests for both outcomes.
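A minimal sketch of branch coverage, again using the standard `coverage` library with its `branches: true` option (Ruby 2.5 or later). The measured file and its contents are assumptions made for this example.

```ruby
require "coverage"
require "tempfile"

# Hypothetical code under test: one conditional, called only with
# the default (false) argument.
source = <<~RUBY
  def foo(something = false)
    something ? :bar_result : :default
  end
  foo
RUBY

file = Tempfile.new(["branch_demo", ".rb"])
file.write(source)
file.close

Coverage.start(branches: true)
load file.path
result = Coverage.result

_path, report = result.find { |path, _| File.basename(path) == File.basename(file.path) }

# Each key describes a conditional; each value maps its branches
# to the number of times they ran.
report[:branches].each do |condition, targets|
  type, _id, line = condition
  targets.each do |target, count|
    puts "line #{line}: #{type} -> #{target.first} ran #{count} time(s)"
  end
end

branch_counts = report[:branches].values.flat_map(&:values)
```

A count of zero for one side of the conditional is the signal that a test for that outcome is missing.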
00:08:37.169 That's the beauty of code coverage tools. You run them on your whole test suite, and they identify all the places where code isn't fully covered. They prompt you to consider: Is this code necessary or should it be tested? Ultimately, this leads to hundreds or thousands of potential issues to examine.
00:09:10.730 At this point, you should be feeling excited about code coverage. It seems like a simple solution that helps you find bugs and missing tests effectively.
00:09:44.080 Ruby 2.5 brought changes that offered a more robust solution. Prior versions of Ruby only included line coverage, but with Ruby 2.5, branch coverage was introduced! It's important to note that there is still no node coverage natively. However, that's where DeepCover comes in: it is a gem I developed alongside a friend from Montreal. DeepCover can identify every single piece of your code that runs or doesn’t run, thus providing node and branch coverage.
00:10:24.880 Now, let me show you a live demo. I’ll run DeepCover and input a small piece of code. DeepCover will execute it, and based on the results, you’ll see which sections were run. For instance, it may tell you that the call to `bar` was not executed because the condition was false, while the else branch got executed.
00:11:01.360 Think about scenarios where method calls are made, or where you have multiple branch conditions. Even something that looks trivial can prove problematic if it is never executed. For example, ActiveSupport has great tests but does not achieve one hundred percent coverage. If a function is not tested in its edge cases, the test suite will still pass overall. If someone were to change the method and users called it in ways the tests never exercised, it could lead to erroneous behavior.
00:12:39.960 We haven’t focused much on promoting DeepCover yet since it's still being fine-tuned. However, it’s ready for production. Some projects have started using it—initially, they aimed for one hundred percent coverage only to find their reported figures drop to ninety-six or even lower. Dan Allen, with his compact coding style, discovered bugs in his code that weren’t caught before simply by using DeepCover.
00:14:05.710 DeepCover can help identify coverage gaps that might otherwise go unnoticed. For example, in ActiveSupport, some classes may include unused variables or functions that haven't been checked adequately. Through DeepCover, I managed to find missing tests in certain branches that had logical flaws.
00:14:46.150 Many users are unaware of how often code coverage is inaccurately reported. For example, a missing edge-case test can mean that a simple piece of code is defined but never executed, because nothing in the test suite invokes it. Whenever an edge case is skipped like this, DeepCover reveals those segments so that you can make your tests accurately reflect the code's behavior.
00:15:36.730 So while MRI and DeepCover both provide unique advantages for measuring code coverage, we recognize that DeepCover can be much slower because of its depth and detail. However, you usually won't notice this on CI servers because coverage metrics take time to compute.
00:16:38.630 In later versions of Ruby, these updates and enhancements will continue to surface. Therefore, it’s essential to stay informed and utilize improvements to ensure your projects maintain adequate coverage reporting.
00:17:40.120 As I continue to develop DeepCover, I’ve been deepening my understanding of its workings. It’s fascinating to track coverage through line execution in Ruby, parsing the code and counting various conditions along the way. The objective is to know precisely how many times each line or node has been executed.
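The idea of counting executions can be sketched by hand. The code below wraps each part of a method in a counter; the `HITS` table, the `track` helper, and the labels are all hypothetical names for this illustration, and this is not a description of how DeepCover or MRI implement their counters.

```ruby
# Hypothetical tracker: label => number of times that part ran.
HITS = Hash.new(0)

# Record a hit for the label, then run the wrapped code.
def track(label)
  HITS[label] += 1
  yield
end

# A hand-instrumented version of a small method.
def foo(something = false)
  track(:body) do
    if something
      track(:then_branch) { :bar_result }
    else
      track(:else_branch) { :default }
    end
  end
end

3.times { foo }

puts "body:        #{HITS[:body]}"        # => 3
puts "else branch: #{HITS[:else_branch]}" # => 3
puts "then branch: #{HITS[:then_branch]}" # => 0
```

A coverage tool does the equivalent automatically, for every line or node, so that zero counts can be reported after the test suite finishes.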
00:19:19.000 Let’s dive briefly into the underlying code structure. This is a section of C code from MRI, where various instructions are compiled. I cannot go through all the lines, but this illustrates how calls are made to track coverage as each node is executed. By inserting those trace points, we can discern which lines of code have been executed and which haven't.
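The C code is not reproduced in this transcript. As a loose Ruby-level analogy only, the public `TracePoint` API can count line events in a similar spirit; this sketch is not MRI's internal coverage mechanism, and the names `line_hits` and `tracer` are chosen for this example.

```ruby
# Count how often each line of this file fires a :line event.
line_hits = Hash.new(0)

tracer = TracePoint.new(:line) do |tp|
  # Ignore events coming from other files (for example, core library code).
  line_hits[tp.lineno] += 1 if tp.path == __FILE__
end

def foo(something = false)
  something ? :bar_result : :default
end

# Tracing is active only while the block runs.
tracer.enable { 2.times { foo } }

line_hits.sort.each do |lineno, count|
  puts "line #{lineno}: #{count} hit(s)"
end
```

Lines that never appear in `line_hits` were not executed while tracing was enabled, which is the same information a line-coverage report presents.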
00:20:37.580 When working with DeepCover, we can take advantage of Ruby's inherent structures and utilize a unique approach that allows us to interject coverage metrics seamlessly without interrupting the original flow of the Ruby operations.
00:21:22.110 Every operation must be monitored without sacrificing performance, and DeepCover tracks meta-information on methods that help expose where coverage might be lacking. As developers, we want to ensure we’re aware of potential pitfalls in our Ruby code.
00:22:38.490 Every Ruby project supports varying levels of complexity, which can sometimes be quite challenging. Thus we assess the deeper structure of our libraries and consider how each component interacts. By incorporating exploration of the coverage at such a detailed level, we can create a better environment for developers.
00:23:49.740 As with all genuinely effective tools, the key to success is ongoing improvement. Additionally, we are continually aiming to streamline the user experience with better reporting and integration capabilities to ensure DeepCover works harmoniously alongside applications you already know.
00:24:54.510 Thank you all for your attention today. I hope you found this useful and informative. Now, I’m happy to take any questions you may have about DeepCover, code coverage, or anything related to Ruby development.
00:26:04.890 Please do not hesitate to come up to the microphone. I am eager to hear your thoughts, opinions, and queries.
00:27:03.450 Thank you once again for the great presentation. One thing I wanted to inquire about is whether you’ve talked to platforms like Coveralls and CodeCov to see their opinions on your tool?
00:27:47.950 That’s an excellent question! We haven't had discussions with Coveralls yet, but we have spoken to the maintainer of SimpleCov about potential integrations. I would love to engage with the teams behind these platforms, so if anyone from them is in the audience, please reach out!
00:29:19.340 Thank you again for your questions. Let’s continue this dialogue; I would love to learn from everyone’s experiences and insights.