Ruby

Actionable Code Coverage

No more external tools / unclear percentage scores / slow PR feedback.

Code coverage at your fingertips while developing, by having tests fail when newly added code is missing coverage, with minimal overhead and great visibility.

how code coverage works in Ruby (regular vs branch vs oneshot)
how single_cov keeps things fast and simple
onboarding for small and large codebases (automated onboarding + divide & conquer)
how to hack forked code coverage with forking-test-runner
wishlist for coverage.so

RubyKaigi 2019 https://rubykaigi.org/2019/presentations/grosser.html#apr19

00:00:00.750 Welcome to Actionable Code Coverage. This is a talk about code coverage and its limitations.
00:00:06.029 It's not only a talk; it's also a repository with runnable examples.
00:00:12.889 These examples show you step by step how this was done. It also features Markdown slides that you can modify or present elsewhere.
00:00:18.600 This presentation is meant as a deep dive for new developers to understand how we develop, how we test, and how we use code coverage.
00:00:25.529 My name is Michael Grosser. I work at Zendesk, and we're hiring in San Francisco, Dublin, Copenhagen, and Sydney.
00:00:32.759 We offer a great work/life balance and visa assistance, so come join us!
00:00:39.480 You can find me on all these social platforms. My job here is as an infrastructure engineer.
00:00:46.550 I build a lot of these tools and help onboard large projects. This usually means I have to be careful not to disrupt existing workflows.
00:00:58.079 Onboarding has to be done piece by piece, which is also what this talk addresses. It discusses how to improve a large project step by step.
00:01:09.450 The plan is to go over what code coverage is in general to get everyone on the same page, then explain how to make your code coverage more actionable.
00:01:16.979 We'll discuss the problems with current approaches, solutions available, how to migrate large projects, and how to tackle coverage with forks.
00:01:24.810 Finally, we'll address my wishlist for Ruby maintainers regarding features that are still needed.
00:01:30.380 Ruby ships with a built-in coverage library written in C, so you don't need any gem to use it.
00:01:36.299 It's generally quite usable. If no library existed at all, you could create a system in about ten minutes that provides roughly 80% of what's needed.
00:01:42.869 You have to enable coverage before loading any code; you can't just load your files and then try to enable coverage.
00:01:49.829 You need to set it up by requiring the coverage tool and then all the other dependencies.
00:01:55.560 It’s also important to avoid enabling it for production and not to activate it for tests where coverage is not a concern.
00:02:02.369 The simplest form of coverage is line coverage, which is what you get by default when you turn it on.
00:02:08.099 When you enable coverage, you require your code, run your code, and then ask for the result.
00:02:13.890 Asking for the result stops coverage recording, unless you use Coverage.peek_result, which lets you continue recording coverage.
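The sequence described here (enable, load, run, ask for the result) can be sketched with the standard library alone. This is a minimal illustration, not code from the talk: the file name counter_demo.rb and the helper names are made up, and the code under test is written to a temporary file only so the example is self-contained.

```ruby
require "coverage"
require "tmpdir"

# Picks the hit counters for one file out of a coverage report.
def counts_for(report, file_name)
  report.find { |file, _| file.end_with?(file_name) }.last
end

# Hypothetical demo: returns the counters seen by peek_result and by result.
def peek_then_result
  Dir.mktmpdir do |dir|
    path = File.join(dir, "counter_demo.rb")
    File.write(path, "def bump\n  :bumped\nend\n")

    Coverage.start # 1. enable coverage before loading any code
    load path      # 2. load the code under test
    bump           # 3. run it

    peeked = counts_for(Coverage.peek_result, "counter_demo.rb") # recording continues
    bump
    final = counts_for(Coverage.result, "counter_demo.rb")       # recording stops

    [peeked, final]
  end
end

peeked, final = peek_then_result
puts "peek_result: #{peeked.inspect}"
puts "result:      #{final.inspect}"
```

The second call to bump only shows up in the final result, because peek_result returns a snapshot while recording carries on.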
00:02:22.290 The result of line coverage is a hash containing all the files you have loaded and their respective coverage.
00:02:29.850 All the numbers are essentially hit counters, indicating how often each line was executed. Lines that cannot be executed, such as end statements, else clauses, and comments, show up as nil.
00:02:46.050 For example, for a method the result might be 1, 1, 0, nil. A 1 on the def line means the method definition was executed; it doesn't mean someone called the method.
00:02:52.350 If a condition sits on a single line, that line shows as executed even when one part of it was never reached.
00:02:59.190 Despite what the coverage shows, it's possible to have sections of code that appear covered but are not really being tested, which is misleading.
00:03:05.850 This is a common pitfall where line coverage can be deceptive: it looks like everything is fine, but parts of the code remain untested.
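Both effects can be reproduced in a few lines. The sketch below is an illustration under assumptions, not the speaker's example: the helper line_coverage and the methods unused and sign are hypothetical names, and the source is loaded from a temporary file so that the Coverage module can track it.

```ruby
require "coverage"
require "tmpdir"

# Loads `source` from a temporary file with coverage enabled and
# returns the per-line hit counters for that file.
def line_coverage(source)
  Dir.mktmpdir do |dir|
    path = File.join(dir, "line_demo.rb")
    File.write(path, source)
    Coverage.start
    load path
    Coverage.result.find { |file, _| file.end_with?("line_demo.rb") }.last
  end
end

# A method that is defined but never called: the def line counts as
# executed, the body has zero hits, and `end` is not executable (nil).
never_called = line_coverage(<<~RUBY)
  def unused(a, b)
    a + b
  end
RUBY
p never_called

# A one-line conditional: the line has a hit even though the
# "negative" branch never ran.
one_liner = line_coverage(<<~RUBY)
  def sign(n)
    n < 0 ? "negative" : "non-negative"
  end
  sign(1)
RUBY
p one_liner
```

In the second report the conditional line has a hit count of 1, so line coverage alone gives no hint that half of it is untested.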
00:03:13.680 To address this, you could refactor your code to avoid putting multiple statements on one line, or create a RuboCop rule that does this automatically.
00:03:20.519 However, a better option available from Ruby 2.5 is branch coverage, which is more informative.
00:03:26.940 It provides a more nuanced result, reporting on each branch rather than just a single line.
00:03:31.629 Branch coverage tells you which branches were executed, helping you identify actual test gaps.
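As a rough sketch of what branch coverage reports (Ruby 2.5 or later), the same kind of one-line conditional can be loaded with branches enabled. The helper branch_coverage and the method sign are hypothetical names chosen for this illustration.

```ruby
require "coverage"
require "tmpdir"

# Loads `source` from a temporary file with line and branch coverage
# enabled and returns the report for that file.
def branch_coverage(source)
  Dir.mktmpdir do |dir|
    path = File.join(dir, "branch_demo.rb")
    File.write(path, source)
    Coverage.start(lines: true, branches: true)
    load path
    Coverage.result.find { |file, _| file.end_with?("branch_demo.rb") }.last
  end
end

report = branch_coverage(<<~RUBY)
  def sign(n)
    n < 0 ? "negative" : "non-negative"
  end
  sign(1)
RUBY

p report[:lines]

# Each condition maps to its branches; keys carry the type, an id,
# and the start/end position in the file.
report[:branches].each do |condition, branches|
  puts "#{condition.first} starting on line #{condition[2]}"
  branches.each { |branch, hits| puts "  #{branch.first}: #{hits} hit(s)" }
end
```

The line counters still look fine, but the branch report lists one branch of the conditional with zero hits, which is the actual test gap.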
00:03:37.150 Always be aware that it can be slower than line coverage, and should be avoided in production monitoring.
00:03:43.900 While it's effective, it doesn't handle some constructs, such as || expressions, which leads to incorrect coverage assessments.
00:03:50.590 Another useful addition from Ruby 2.6 is oneshot coverage, which reduces the performance penalty of measuring coverage.
00:03:57.550 This means that after the first invocation, the coverage hooks that track all this get removed, resulting in no performance impact on subsequent runs.
00:04:04.930 This is very advantageous for production-level monitoring, where the precise hit frequency is less critical.
00:04:10.990 The one-shot coverage technique is also worth mentioning, which may look unconventional but serves a unique purpose.
00:04:20.519 This concept involves tracking which lines were covered, but it has limitations, especially concerning automation.
00:04:26.139 You can't determine if a line is not covered or if it's simply unreachable, which complicates calculating overall coverage percentages.
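The limitation is easy to see in the raw output. The following sketch (Ruby 2.6 or later) uses hypothetical names, oneshot_lines, used, and unused, and loads the code from a temporary file for the sake of a self-contained example.

```ruby
require "coverage"
require "tmpdir"

# Loads `source` with oneshot line coverage and returns the line numbers
# that were executed at least once.
def oneshot_lines(source)
  Dir.mktmpdir do |dir|
    path = File.join(dir, "oneshot_demo.rb")
    File.write(path, source)
    Coverage.start(oneshot_lines: true)
    load path
    report = Coverage.result.find { |file, _| file.end_with?("oneshot_demo.rb") }.last
    report[:oneshot_lines]
  end
end

executed = oneshot_lines(<<~RUBY)
  def used
    :used
  end
  def unused
    :unused
  end
  used
RUBY

p executed.sort
```

The report only lists lines that ran. Line 5 (the body of unused) and lines 3 and 6 (the end keywords) are all simply absent, so the output alone cannot tell an untested line from one that can never be executed.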
00:04:31.629 Consequently, there are some significant features to be desired in terms of coverage performance.
00:04:43.900 For example, in my benchmarks, I found that simple Ruby code could see an overhead of around 50% with line coverage and 100% with branch coverage.
00:04:50.590 This contrasts starkly with RubyKaigi 2017, where the benchmarks were even lower.
00:04:57.550 In real-world applications, excessive overhead isn't typically observed; just enable and see how it works.
00:05:05.000 Thus, quick reminders about coverage types: we have line coverage, which is basic, and branch coverage, which is more comprehensive.
00:05:14.760 You'll want oneshot lines for speed, but they are difficult to automate.
00:05:20.840 Next, let's explore actionable code coverage and the mindset around it.
00:05:25.970 The mindset here is to recognize that code coverage is not a metric in itself.
00:05:34.050 It's easy to manipulate coverage to appear satisfactory without ensuring the associated tests are meaningful.
00:05:41.000 Achieving 100% coverage doesn't necessarily mean the tests cover all use cases, and this often leads to a false sense of security.
00:05:49.160 You might end up with high coverage numbers, but your tests could be ineffective.
00:05:57.160 The goal instead is to treat code coverage as a helpful guide – it should prompt you to ask 'Shouldn't there be a test for that condition?'.
00:06:05.400 It's about maintaining standards in code reviews as well, where someone else should be able to point out deficiencies in your tests.
00:06:12.910 I have used this approach extensively and it reveals issues like unreachable code, especially when a test is misconfigured.
00:06:20.600 The issue with current coverage tools is that they tend to require lengthy waits for PR checks.
00:06:27.950 You submit your code, only to wait another five minutes to find that a line isn’t covered.
00:06:35.320 This can lead to frustration and an unproductive cycle, and often, achieving 100% coverage is an impossibility due to edge cases.
00:06:43.180 Many developers settle for 80% or 90% coverage, but assert that they are hitting 100%, when in fact they may not be.
00:06:51.740 There are scenarios where you might determine that a line is not covered because it's not reachable, which is acceptable.
00:06:59.080 An example would be if the line contains non-critical code, which may not warrant coverage.
00:07:06.060 Additionally, setting up code coverage typically requires intricate configurations that can be burdensome.
00:07:14.700 It involves managing webhooks, third-party providers, and ensuring that every contributor can see results.
00:07:22.560 The solution to this complexity is quick, local feedback during development.
00:07:30.780 This means getting immediate notifications about which files are not covered.
00:07:38.500 If feedback is received directly within your Git environment, team members can adapt accordingly.
00:07:46.120 If you introduce any gaps in your code, they should be explicitly communicated to make PR reviews more effective.
00:07:53.600 This way, you also avoid the broken windows syndrome, creating a culture of responsible coding.
00:08:00.850 Achieving accurate branch coverage is essential, as conventional tools often only ensure line coverage.
00:08:08.400 The local setup should focus on providing straightforward coverage rules that don't impede productivity.
00:08:15.660 Today, I am going to demonstrate a tool called single_cov.
00:08:20.879 It runs your tests, flags coverage gaps, and allows you to accept them or fix the issues.
00:08:29.460 It works seamlessly with popular frameworks like Minitest and RSpec.
00:08:36.650 By using single_cov, you can catch coverage issues locally without resorting to complex workflows.
00:08:44.100 It points out specific lines that require attention, indicating exactly which lines are at fault.
00:08:51.580 Moreover, its efficiency results from focusing solely on individual files.
00:08:57.540 By avoiding the creation of extensive HTML reports after each test run, it enables you to focus on what's important.
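The per-file idea can be illustrated with the standard library alone. The sketch below is not single_cov's implementation or API; assert_file_covered, uncovered_lines, and the demo file are hypothetical names. It only shows the principle: measure one file, list its uncovered lines, and fail the run when there are more than were explicitly accepted.

```ruby
require "coverage"
require "tmpdir"

# Returns the 1-based numbers of executable lines that were never hit.
def uncovered_lines(counts)
  counts.each_index.select { |i| counts[i] == 0 }.map { |i| i + 1 }
end

# Hypothetical helper: loads one file under coverage, runs the block as
# its "tests", and raises if more lines are uncovered than allowed.
def assert_file_covered(path, allowed_uncovered: 0)
  Coverage.start
  load path
  yield
  name = File.basename(path)
  counts = Coverage.result.find { |file, _| file.end_with?(name) }.last
  missing = uncovered_lines(counts)
  return if missing.size <= allowed_uncovered

  raise "#{name} has uncovered lines: #{missing.join(', ')}"
end

Dir.mktmpdir do |dir|
  path = File.join(dir, "calculator_demo.rb")
  File.write(path, "def double(n)\n  n * 2\nend\ndef triple(n)\n  n * 3\nend\n")

  begin
    # The "test" only exercises double, so triple's body stays uncovered.
    assert_file_covered(path) { double(2) }
  rescue RuntimeError => e
    puts e.message
  end
end
```

Passing allowed_uncovered: 1 would accept the gap explicitly, which mirrors the "accept them or fix the issues" choice described above.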
00:09:07.330 Using it in combination with forking-test-runner keeps your local setup and CI in sync.
00:09:13.580 Opting in is quite simple; you merely install the gem and require it in your test helper.
00:09:21.170 This process allows you to keep significant portions of your codebase covered.
00:09:26.640 If you have a critical file, you can add its coverage in a straightforward manner.
00:09:33.150 You can document specifics of coverage gaps, allowing others to understand the context.
00:09:39.620 Additionally, you have control over how this coverage is managed, which can be helpful in code reviews.
00:09:45.530 One other useful feature is that you can add instructions indicating which files require coverage.
00:09:52.440 You can ensure your tests cover all necessary areas without complicating the setup.
00:09:59.370 Ultimately, it’s about clarity, so future contributors understand coverage statuses.
00:10:06.400 When introducing new classes, you can clarify if and why they might not be covered.
00:10:13.590 These best practices encourage an adaptable and maintainable codebase.
00:10:23.460 Now let's perform a brief demo of single_cov using a project for context.
00:10:30.970 The demo revolves around a project I maintain and its respective test structure.
00:10:38.920 It is important to approach this in such a manner that both the project and tests are aligned.
00:10:47.320 In doing so, I am maintaining the project's coverage integrity while demonstrating the ease of setup.
00:10:53.320 Through this process, expectations of coverage are actively documented throughout the code.
00:10:59.230 This not only bridges gaps but enhances collaboration among teams.
00:11:06.950 Moving on to forking test runners, they simplify the handling of global state changes.
00:11:14.890 We often have issues arising from tests trying to reset global state after running.
00:11:22.240 Because these tests are isolated, running a test will not affect unrelated functionalities.
00:11:30.190 You can pinpoint issues accurately, significantly speeding up debugging processes.
00:11:38.800 The problem with forking coverage is that it resets the coverage count upon each fork.
00:11:47.540 You want to retain the parent’s coverage data when testing in forks to maintain accuracy.
00:11:55.310 An elegant solution that's implemented with forking test runners merges parent results with child results after running the tests.
00:12:02.140 This allows for a clearer picture of total coverage without the hassle of resetting counts.
00:12:10.130 Ultimately, the idea is to sync parent and child coverage for complete clarity.
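One way to picture the merge described here is a function that adds up two line-coverage reports file by file. This is a sketch under assumptions, not forking-test-runner's actual code: merge_lines and merge_coverage are hypothetical helpers, and the reports use the plain line format (one counter per line, nil for lines that cannot be executed).

```ruby
# Adds two per-line hit counter arrays. A line stays nil only if both
# sides consider it non-executable.
def merge_lines(parent, child)
  length = [parent.size, child.size].max
  Array.new(length) do |i|
    a = parent[i]
    b = child[i]
    a.nil? && b.nil? ? nil : a.to_i + b.to_i
  end
end

# Merges two coverage reports of the form { "path" => [counters] }.
def merge_coverage(parent, child)
  parent.merge(child) { |_file, p_lines, c_lines| merge_lines(p_lines, c_lines) }
end

# Made-up reports: the parent loaded the files, the child ran the tests.
parent_report = { "app/user.rb" => [1, 0, nil], "app/boot.rb" => [1] }
child_report  = { "app/user.rb" => [0, 3, nil], "app/order.rb" => [1, 1] }

merge_coverage(parent_report, child_report).each do |file, counts|
  puts "#{file}: #{counts.inspect}"
end
```

Files seen by only one side are kept as they are, and counters for shared files are summed, so hits recorded before the fork are not lost.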
00:12:18.750 Moving forward, we must highlight the tools we still need to improve coverage.
00:12:24.740 As mentioned earlier, coverage is hard to automate due to the ineffectiveness of current options.
00:12:30.900 It would benefit us greatly if we had clear tooling for tracking true faults in tests.
00:12:37.789 A solution involving one-shot branches would help maintain coverage standards without performance drawbacks.
00:12:44.290 The goal here is to produce insights that demonstrate whether certain methods or code segments are hit or not.
00:12:51.400 This would grant us opportunities to reduce unnecessary code while keeping our application lean.
00:12:57.330 Branch coverage improvement would allow us to confidently identify code paths that are exercised.
00:13:04.200 Coverage for forked code must become smoother to avoid forgotten lines after forks.
00:13:11.260 Additionally, support for default pulling across multiple branches would streamline overall performance.
00:13:18.780 Lastly, systematizing the testing of boolean operator logic would ease the writing of automated tests.
00:13:27.600 Having clear insights into all possible paths through a program is essential.
00:13:34.050 While path coverage is labor-intensive, it’s crucial for scenarios involving complex business logic.
00:13:42.140 Recapping: guidance on automating coverage, ensuring immediate accurate feedback is vital.
00:13:48.969 The efforts put into improving oneshot branch coverage must balance the need for speed and accuracy.
00:13:56.850 Here’s where collaboration can elevate the coding practices amongst your teams.
00:14:04.440 The community must work together to ensure effective solutions are providing developers with accurate results.
00:14:10.100 Once you adopt these practices, your code quality can visibly improve.
00:14:18.690 We need to experiment with these new techniques to reinforce testing in real-world scenarios.
00:14:25.670 The hope is that as we keep integrating better coverage practices, techniques will evolve and grow.
00:14:34.850 Finally, while we may face remaining challenges, the overall objective is to elevate developer standards.
00:14:41.730 Make coverage an asset — a thoughtful companion to your development process.
00:14:48.600 Now, does anyone have any questions about what we've discussed so far?
00:14:54.830 Yes, I introduced single_cov earlier. There are a few other gems that accomplish similar functions.
00:15:04.180 Two months back, I came across the cover gem, which also runs tests and informs you of coverage omissions.
00:15:11.540 It appeared to cover much of the same functionality as single_cov but lacks some advanced edge case handling.
00:15:17.780 While I appreciate that variation, I am happy with single_cov as it simplifies direct coverage insights.
00:15:25.320 I am aware that other languages have similar tools, but I'd prefer user-friendly solutions.
00:15:34.660 This is where local setups of single_cov appeal to me over larger external options.
00:15:40.680 So, what distinct boons come from using single_cov instead of the others?
00:15:48.129 The benefits of single_cov arise from avoiding the cumbersome reporting burdens of traditional tools.
00:15:54.839 Mainstream tools often generate extensive reports. They tend to slow down with large codebases.
00:16:02.699 Let's face it: we want prompt feedback without sending everything through a slow CI.
00:16:08.559 With single_cov, you can set it up locally for instant clarity without being bogged down by webhooks.
00:16:14.879 The visibility into branch coverage further helps maintain solid coding practices.
00:16:20.530 By tracking branch coverage, you add nuanced insights into tests beyond simple line assessments.
00:16:29.240 Ultimately, this approach leads to better practices and ensures more thorough testing routines.
00:16:36.060 Are there any more questions on setting adaptively?
00:16:43.073 Yes, it is feasible to tune coverage in production based on your specific needs.
00:16:50.919 The overhead tends to be minimal, yet I suggest testing conditions to be sure.
00:16:57.289 You can initiate coverage routines that allow real-time tracking for any performance hits.
00:17:05.690 The assessment allows coverage to be adjusted based on your app's behavior metrics.
00:17:12.820 Regular evaluations will help validate what settings yield the most efficient performance.
00:17:20.100 Compare your coverage intelligently across servers for all-round better efficiency.
00:17:27.650 I encourage you to leverage our coverage toolkit! It’s designed to cater to your goals.
00:17:35.730 Thank you for your engagement and thorough discussions, everyone!