Actionable Code Coverage

Ruby

Michael Grosser

#developer-experience-dx

#continuous-integration-ci

The video titled "Actionable Code Coverage" featuring Michael Grosser at RubyKaigi 2019 delves into the intricacies of code coverage in Ruby and highlights actionable insights for developers. The presentation aims to clarify how to implement effective code coverage practices while avoiding common pitfalls associated with conventional approaches.

Key Points Discussed:
- Introduction to Code Coverage: The session outlines the basics of code coverage, its significance, and its inherent limitations in providing a complete picture of software quality.
- Coverage Techniques: Distinguishing between line coverage, branch coverage, and one-shot coverage, the talk emphasizes that line coverage is inadequate for assessing test effectiveness alone. Newer techniques like branch coverage provide deeper insights by analyzing executed branches rather than just lines.
- Challenges of Existing Tools: Grosser addresses issues with current coverage tools, including the long feedback loops they create during pull requests (PRs) and the misleading confidence a high coverage percentage can instill in developers.
- Importance of Actionable Insights: The overarching theme is to treat code coverage not just as a metric but as a tool that drives meaningful discussions about testing practices. Developers are encouraged to use coverage results to question their test comprehensiveness and detect unreachable code.
- Introducing single_cov: The presenter introduces a tool called single_cov that enables developers to run tests and get immediate feedback on coverage gaps locally. This tool reduces dependency on complex external setups, enhancing development efficiency.
- Use of Forking Test Runner: Grosser also explains the concept of forking test runners, which helps manage global states in tests efficiently without losing coverage data, thus facilitating precise diagnostics during debugging.
- Desired Features and Future Improvements: The wishlist for the Ruby community includes better automated coverage tracking tools, improved handling of coverage in fork contexts, and enhancements in tracking logical constructs like boolean operators.

In conclusion, the video underscores the necessity of shifting toward a mindset where code coverage is seen as a collaborative asset rather than merely a metric. By implementing tools like single_cov and emphasizing thorough test examination, developers can significantly enhance their code quality and maintainability. The talk encourages continuous improvement and experimentation within teams to cultivate better coding standards, thereby making coverage a valuable part of the development process.

00:00:00.750 Welcome to Actionable Code Coverage. This is a talk about code coverage and its limitations.

00:00:06.029 It's not only a talk; it's also a repository with runnable examples.

00:00:12.889 These examples show you step by step how this was done. It also features Markdown slides that you can modify or present elsewhere.

00:00:18.600 This presentation is meant as a deep dive for new developers to understand how we develop, how we test, and how we use code coverage.

00:00:25.529 My name is Michael Grosser. I work at Zendesk, and we're hiring in San Francisco, Dublin, Copenhagen, and Sydney.

00:00:32.759 We offer a great work/life balance and visa assistance, so come join us!

00:00:39.480 You can find me on all these social platforms. My job here is as an infrastructure engineer.

00:00:46.550 I build a lot of these tools and help onboard large projects. This usually means I have to be careful not to disrupt existing workflows.

00:00:58.079 Onboarding has to be done piece by piece, which is also what this talk addresses. It discusses how to improve a large project step by step.

00:01:09.450 The plan is to go over what code coverage is in general to get everyone on the same page, then explain how to make your code coverage more actionable.

00:01:16.979 We'll discuss the problems with current approaches, solutions available, how to migrate large projects, and how to tackle coverage with forks.

00:01:24.810 Finally, we'll address my wishlist for Ruby maintainers regarding features that are still needed.

00:01:30.380 Code coverage in Ruby comes from a built-in library written in C, so you don't need any gem to use it.

00:01:36.299 It's generally quite usable. If no library existed at all, you could create a system in about ten minutes that provides roughly 80% of what's needed.

00:01:42.869 You have to enable coverage before loading any code; you can't just load your files and then try to enable coverage.

00:01:49.829 You need to set it up by requiring the coverage tool and then all the other dependencies.
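The ordering requirement can be illustrated with a minimal sketch that uses only Ruby's standard `coverage` library. The file names `early.rb` and `late.rb` are hypothetical and exist only for this illustration; the point is that a file loaded before `Coverage.start` never shows up in the report.

```ruby
require "coverage"
require "tmpdir"

# Returns the basenames of the files Coverage reports when "early.rb" is
# loaded before Coverage.start and "late.rb" after it.
# Both file names are hypothetical and exist only for this sketch.
def files_seen_by_coverage
  Dir.mktmpdir do |dir|
    early = File.join(dir, "early.rb")
    late = File.join(dir, "late.rb")
    File.write(early, "[1, 2].sum\n")
    File.write(late, "[3, 4].sum\n")

    load early       # loaded too soon: coverage is not recording yet
    Coverage.start   # from here on, newly loaded files are tracked
    load late

    Coverage.result.keys.map { |file| File.basename(file) }
  end
end

p files_seen_by_coverage # the early file is missing from the report
```

In a real project the same idea usually means that the test helper requires `coverage` and starts it before it requires the application code.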

00:01:55.560 It’s also important to avoid enabling it for production and not to activate it for tests where coverage is not a concern.

00:02:02.369 The simplest form of coverage is line coverage, which is what you get by default when you turn it on.

00:02:08.099 When you enable coverage, you require your code, run your code, and then ask for the result.

00:02:13.890 Asking for the result stops coverage recording unless you use `Coverage.peek_result`, which lets you read the counts and continue recording.
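As a rough illustration of the difference, the sketch below takes a snapshot with `Coverage.peek_result`, keeps running code, and then reads the final counts with `Coverage.result`. The `bump` method and the `counter.rb` file are made up for the example.

```ruby
require "coverage"
require "tmpdir"

# Returns [hits_seen_by_peek, hits_in_final_result] for the body line of a
# small hypothetical method that is called once before peeking and twice after.
def peek_versus_result
  Dir.mktmpdir do |dir|
    path = File.join(dir, "counter.rb")
    File.write(path, "def bump(n)\n  n + 1\nend\n")

    Coverage.start
    load path
    # Picks the hit counter of line 2 (the method body) out of a report.
    body = lambda do |report|
      report.find { |file, _| File.basename(file) == "counter.rb" }.last[1]
    end

    bump(0)
    peeked = body.call(Coverage.peek_result) # snapshot, recording continues
    bump(1)
    bump(2)
    final = body.call(Coverage.result)       # final counts, recording stops

    [peeked, final]
  end
end

p peek_versus_result # => [1, 3]
```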

00:02:22.290 The result of line coverage is a hash containing all the files you have loaded and their respective coverage.

00:02:29.850 All the numbers are hit counters, indicating how often each line was executed. Lines that can never be executed, such as `end` statements, `else` clauses, and comments, show up as nil.

00:02:46.050 For example, for a method, the result might be 1, 1, 0, nil. A 1 on the `def` line means the method definition was executed, but it doesn't mean someone called this method.

00:02:52.350 If the condition is such that the method is never reached, it will show as executed but marked as unreachable.
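To make the shape of a line coverage result concrete, here is a small sketch built on the standard `coverage` library. The sample methods are invented for the illustration; the `line_coverage` helper is not part of any library.

```ruby
require "coverage"
require "tmpdir"

# Measures line coverage for a snippet of Ruby source. Coverage only sees
# files loaded after Coverage.start, so the snippet is written to a
# temporary file and loaded while recording is active.
def line_coverage(source)
  Dir.mktmpdir do |dir|
    path = File.join(dir, "sample.rb")
    File.write(path, source)
    Coverage.start
    load path
    report = Coverage.result # { "/path/to/file.rb" => [hits per line] }
    report.find { |file, _| File.basename(file) == "sample.rb" }.last
  end
end

SAMPLE = <<~RUBY
  def describe_number(n)
    if n > 0
      n.to_s
    else
      n.abs.to_s
    end
  end
  def never_called
    42.to_s
  end
  describe_number(1)
RUBY

p line_coverage(SAMPLE)
# => [1, 1, 1, nil, 0, nil, nil, 1, 0, nil, 1]
```

The `def never_called` line counts as executed (1) because the definition ran, while its body stays at 0; `else` and `end` are nil because they are never executable.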

00:02:59.190 Despite what the coverage shows, it's possible to have sections of code that appear covered but are not really being tested, which is misleading.

00:03:05.850 This is a common pitfall where line coverage can be deceptive, as it looks like everything is fine, but parts of the code remain untested.

00:03:13.680 To address this, you could refactor your code to avoid putting multiple statements on one line, or create a RuboCop rule that does this automatically.

00:03:20.519 However, a better option available from Ruby 2.5 is branch coverage, which is more informative.

00:03:26.940 It provides a more nuanced result, reporting on each branch rather than just a single line.

00:03:31.629 Branch coverage tells you which branches were executed, helping you identify actual test gaps.
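The gap between the two modes can be sketched with a one-line ternary: line coverage reports the line as hit, while branch coverage shows that one side never ran. The helpers and the `sign` method below are hypothetical names for this illustration, and `branches: true` assumes Ruby 2.5 or newer.

```ruby
require "coverage"
require "tmpdir"

# Loads `source` with line and branch coverage enabled and returns the report
# for that file: { lines: [...], branches: { ... } }.
def line_and_branch_coverage(source)
  Dir.mktmpdir do |dir|
    path = File.join(dir, "sample.rb")
    File.write(path, source)
    Coverage.start(lines: true, branches: true)
    load path
    Coverage.result.find { |file, _| File.basename(file) == "sample.rb" }.last
  end
end

# Branch targets that never ran, as [type, first_line] pairs.
def untaken_branches(report)
  report[:branches].flat_map do |_branch, targets|
    untaken = targets.select { |_target, hits| hits.zero? }.keys
    untaken.map { |target| [target[0], target[2]] }
  end
end

ONE_LINER = <<~RUBY
  def sign(n)
    n > 0 ? "positive" : "not positive"
  end
  sign(1)
RUBY

report = line_and_branch_coverage(ONE_LINER)
p report[:lines]           # => [1, 1, nil, 1], no 0 anywhere
p untaken_branches(report) # => [[:else, 2]]
```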

00:03:37.150 Always be aware that it can be slower than line coverage, and should be avoided in production monitoring.

00:03:43.900 While it's effective, it doesn't cover some constructs, such as `||` (or) expressions, which leads to incorrect coverage assessments.

00:03:50.590 Another useful addition from Ruby 2.6 is one-shot lines coverage, which reduces the performance penalty of collecting coverage.
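A hedged way to see this limitation is to compare the same logic written with `||` and with an explicit `if`. The method names below are invented; what Ruby reports for the `||` version depends on the Ruby version, so the sketch prints it rather than assuming a value, while the `if` version reliably exposes the untaken side.

```ruby
require "coverage"
require "tmpdir"

# Returns the branch coverage hash Ruby reports for `source`.
def branch_coverage(source)
  Dir.mktmpdir do |dir|
    path = File.join(dir, "sample.rb")
    File.write(path, source)
    Coverage.start(branches: true)
    load path
    report = Coverage.result
    report.find { |file, _| File.basename(file) == "sample.rb" }.last[:branches]
  end
end

# Number of branch targets that were never taken.
def untaken_count(branches)
  branches.values.sum { |targets| targets.values.count(&:zero?) }
end

WITH_OR = <<~RUBY
  def pick_with_or(primary, fallback)
    primary || fallback
  end
  pick_with_or(:a, :b)
RUBY

WITH_IF = <<~RUBY
  def pick_with_if(primary, fallback)
    if primary
      primary
    else
      fallback
    end
  end
  pick_with_if(:a, :b)
RUBY

p branch_coverage(WITH_OR)                # what Ruby tracks for the `||` version
p untaken_count(branch_coverage(WITH_IF)) # => 1, the fallback path never ran
```

If the first line prints an empty hash, the fallback path of the `||` version went untested without any tool noticing.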

00:03:57.550 This means that after the first invocation, the coverage hooks that track all this get removed, resulting in no performance impact on subsequent runs.

00:04:04.930 This is very advantageous for production-level monitoring, where the precise hit frequency is less critical.

00:04:10.990 The one-shot coverage technique is also worth mentioning, which may look unconventional but serves a unique purpose.

00:04:20.519 This concept involves tracking which lines were covered, but it has limitations, especially concerning automation.

00:04:26.139 You can't determine if a line is not covered or if it's simply unreachable, which complicates calculating overall coverage percentages.
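The ambiguity is easier to see in the output. The sketch below, which assumes Ruby 2.6 or newer and an invented sample method, records one-shot lines and returns only the line numbers that ran at least once.

```ruby
require "coverage"
require "tmpdir"

# Returns the sorted line numbers that were executed at least once.
def oneshot_lines(source)
  Dir.mktmpdir do |dir|
    path = File.join(dir, "sample.rb")
    File.write(path, source)
    Coverage.start(oneshot_lines: true)
    load path
    report = Coverage.result
    entry = report.find { |file, _| File.basename(file) == "sample.rb" }.last
    entry[:oneshot_lines].sort
  end
end

ONESHOT_SAMPLE = <<~RUBY
  def describe_number(n)
    if n > 0
      n.to_s
    else
      n.abs.to_s
    end
  end
  describe_number(1)
RUBY

p oneshot_lines(ONESHOT_SAMPLE) # => [1, 2, 3, 8]
```

Line 4 (`else`, never executable) and line 5 (executable but untested) are both simply absent, so the report alone cannot tell them apart.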

00:04:31.629 Consequently, there are some significant features to be desired in terms of coverage performance.

00:04:43.900 For example, in my benchmarks, I found that simple Ruby code incurred an overhead of around 50% for line coverage and 100% for branch coverage.

00:04:50.590 This contrasts starkly with RubyKaigi 2017, where the benchmarks were even lower.

00:04:57.550 In real-world applications, excessive overhead isn't typically observed; just enable and see how it works.

00:05:05.000 Thus, quick reminders about coverage types: we have line coverage, which is basic, and branch coverage, which is more comprehensive.

00:05:14.760 You'll want one-shot lines for speed, but they are difficult to automate.

00:05:20.840 Next, let's explore actionable code coverage and the mindset around it.

00:05:25.970 The mindset here is to recognize that code coverage is not a metric in itself.

00:05:34.050 It's easy to manipulate coverage to appear satisfactory without ensuring the associated tests are meaningful.

00:05:41.000 Achieving 100% coverage doesn't necessarily mean the tests cover all use cases, and this often leads to a false sense of security.

00:05:49.160 You might end up with high coverage numbers, but your tests could be ineffective.

00:05:57.160 The goal instead is to treat code coverage as a helpful guide – it should prompt you to ask 'Shouldn't there be a test for that condition?'.

00:06:05.400 It's about maintaining standards in code reviews as well, where someone else should be able to point out deficiencies in your tests.

00:06:12.910 I have used this approach extensively and it reveals issues like unreachable code, especially when a test is misconfigured.

00:06:20.600 The issue with current coverage tools is that they tend to require lengthy waits for PR checks.

00:06:27.950 You submit your code, only to wait another five minutes to find that a line isn’t covered.

00:06:35.320 This can lead to frustration and an unproductive cycle, and often, achieving 100% coverage is an impossibility due to edge cases.

00:06:43.180 Many developers settle for 80% or 90% coverage, but assert that they are hitting 100%, when in fact they may not be.

00:06:51.740 There are scenarios where you might determine that a line is not covered because it's not reachable, which is acceptable.

00:06:59.080 An example would be if the line contains non-critical code, which may not warrant coverage.

00:07:06.060 Additionally, setting up code coverage typically requires intricate configurations that can be burdensome.

00:07:14.700 It involves managing webhooks, third-party providers, and ensuring that every contributor can see results.

00:07:22.560 The solution to this complexity is to get quick feedback locally, during development.

00:07:30.780 This means getting immediate notifications about which files are not covered.

00:07:38.500 If feedback is received directly within your Git environment, team members can adapt accordingly.

00:07:46.120 If you introduce any gaps in your code, they should be explicitly communicated to make PR reviews more effective.

00:07:53.600 This way, you also avoid the broken windows syndrome, creating a culture of responsible coding.

00:08:00.850 Achieving accurate branch coverage is essential, as conventional tools often only ensure line coverage.

00:08:08.400 The local setup should focus on providing straightforward coverage rules that don't impede productivity.

00:08:15.660 Today, I am going to demonstrate a tool called single_cov.

00:08:20.879 It runs your tests, flags coverage gaps, and allows you to accept them or fix the issues.

00:08:29.460 It works seamlessly with popular frameworks like Minitest and RSpec.

00:08:36.650 By using single_cov, you can catch coverage issues locally without resorting to complex workflows.

00:08:44.100 It points out specific lines that require attention, indicating exactly which lines are at fault.
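The kind of per-file check described here can be approximated in a few lines. The sketch below is not single_cov's implementation or API; `uncovered_lines`, `coverage_complaint`, and the `accepted_uncovered` parameter are hypothetical names that only illustrate the idea of failing on new gaps while allowing documented ones.

```ruby
# Returns the 1-based numbers of executable lines that were never hit.
def uncovered_lines(line_hits)
  line_hits.each_index.select { |i| line_hits[i] == 0 }.map { |i| i + 1 }
end

# Compares the uncovered lines of one file with the number of gaps that were
# explicitly accepted, and returns a message, or nil when everything matches.
def coverage_complaint(file, line_hits, accepted_uncovered: 0)
  missing = uncovered_lines(line_hits)
  if missing.size > accepted_uncovered
    locations = missing.map { |line| "#{file}:#{line}" }.join(", ")
    "#{file} has #{missing.size} uncovered lines " \
      "(#{accepted_uncovered} accepted): #{locations}"
  elsif missing.size < accepted_uncovered
    "#{file} has fewer uncovered lines than accepted, " \
      "lower the limit to #{missing.size}"
  end
end

hits = [1, 1, 1, nil, 0, nil, nil, 1] # line coverage for one file
puts coverage_complaint("lib/number.rb", hits)
# prints: lib/number.rb has 1 uncovered lines (0 accepted): lib/number.rb:5
p coverage_complaint("lib/number.rb", hits, accepted_uncovered: 1) # => nil
```

Reporting `file:line` pairs keeps the output directly usable in an editor or terminal, and complaining when the accepted number is too high keeps stale exceptions from piling up.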

00:08:51.580 Moreover, its efficiency results from focusing solely on individual files.

00:08:57.540 By avoiding the creation of extensive HTML reports after each test run, it enables you to focus on what's important.

00:09:07.330 Using it in combination with forking the test runner synchronizes your local setups perfectly.

00:09:13.580 Opting in is quite simple; you merely install the gem and require it in your test helper.

00:09:21.170 This process allows you to keep significant portions of your codebase covered.

00:09:26.640 If you have a critical file, you can add its coverage in a straightforward manner.

00:09:33.150 You can document specifics of coverage gaps, allowing others to understand the context.

00:09:39.620 Additionally, you have control over how this coverage is managed, which can be helpful in code reviews.

00:09:45.530 One other useful feature is that you can add instructions indicating which files require coverage.

00:09:52.440 You can ensure your tests cover all necessary areas without complicating the setup.

00:09:59.370 Ultimately, it’s about clarity, so future contributors understand coverage statuses.

00:10:06.400 When introducing new classes, you can clarify if and why they might not be covered.

00:10:13.590 These best practices encourage an adaptable and maintainable codebase.

00:10:23.460 Now let's perform a brief demo of single_cov using a project for context.

00:10:30.970 The demo revolves around a project I maintain and its respective test structure.

00:10:38.920 It is important to approach this in such a manner that both the project and tests are aligned.

00:10:47.320 In doing so, I am maintaining the project's coverage integrity while demonstrating the ease of setup.

00:10:53.320 Through this process, expectations of coverage are actively documented throughout the code.

00:10:59.230 This not only bridges gaps but enhances collaboration among teams.

00:11:06.950 Moving on to forking test runners, they simplify the handling of global state changes.

00:11:14.890 We often have issues arising from tests trying to reset global state after running.

00:11:22.240 Because these tests are isolated, running a test will not affect unrelated functionalities.

00:11:30.190 You can pinpoint issues accurately, significantly speeding up debugging processes.

00:11:38.800 The problem with forking coverage is that it resets the coverage count upon each fork.

00:11:47.540 You want to retain the parent’s coverage data when testing in forks to maintain accuracy.

00:11:55.310 An elegant solution that's implemented with forking test runners merges parent results with child results after running the tests.

00:12:02.140 This allows for a clearer picture of total coverage without the hassle of resetting counts.

00:12:10.130 Ultimately, the idea is to sync parent and child coverage for complete clarity.
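The merging step can be sketched as a pure function over two line coverage reports. This is an assumption about the general approach, not the code of any particular forking test runner; `merge_line_hits` and `merge_coverage` are hypothetical helpers, and the file names are made up.

```ruby
# Adds two hit-count arrays for the same file. A line stays nil (not
# executable) only when both sides report it as nil.
def merge_line_hits(parent, child)
  length = [parent.size, child.size].max
  Array.new(length) do |i|
    a = parent[i]
    b = child[i]
    a.nil? && b.nil? ? nil : a.to_i + b.to_i
  end
end

# Merges two { file => line_hits } reports, summing counts for shared files
# and keeping files that only one side has seen.
def merge_coverage(parent, child)
  parent.merge(child) { |_file, a, b| merge_line_hits(a, b) }
end

parent = { "lib/app.rb" => [1, 0, nil], "lib/boot.rb" => [1] }
child  = { "lib/app.rb" => [0, 2, nil], "lib/job.rb" => [1, 1] }

merged = merge_coverage(parent, child)
p merged["lib/app.rb"] # => [1, 2, nil]
p merged.keys          # => ["lib/app.rb", "lib/boot.rb", "lib/job.rb"]
```

In a forking setup the child would typically serialize its report back to the parent, for example over a pipe, before a merge like this runs.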

00:12:18.750 Moving forward, we must highlight the tools we still need to improve coverage.

00:12:24.740 As mentioned earlier, coverage is hard to automate due to the ineffectiveness of current options.

00:12:30.900 It would benefit us greatly if we had clear tooling for tracking true faults in tests.

00:12:37.789 A solution involving one-shot branches would help maintain coverage standards without performance drawbacks.

00:12:44.290 The goal here is to produce insights that demonstrate whether certain methods or code segments are hit or not.

00:12:51.400 This would grant us opportunities to reduce unnecessary code while keeping our application lean.

00:12:57.330 Branch coverage improvement would allow us to confidently identify code paths that are exercised.

00:13:04.200 Coverage for forked code must become smoother to avoid forgotten lines after forks.

00:13:11.260 Additionally, support for default pulling across multiple branches would streamline overall performance.

00:13:18.780 Lastly, systematizing the testing of boolean operator logic would ease the writing of automated tests.

00:13:27.600 Having clear insights into all possible paths through a program is essential.

00:13:34.050 While path coverage is labor-intensive, it’s crucial for scenarios involving complex business logic.

00:13:42.140 To recap: automate coverage, and make sure the feedback is immediate and accurate.

00:13:48.969 The efforts put into improving one-shot branch coverage must balance the need for speed and accuracy.

00:13:56.850 Here’s where collaboration can elevate the coding practices amongst your teams.

00:14:04.440 The community must work together to ensure effective solutions are providing developers with accurate results.

00:14:10.100 Once you adopt these practices, your code quality can visibly improve.

00:14:18.690 We need to experiment with these new techniques to reinforce testing in real-world scenarios.

00:14:25.670 The hope is that as we keep integrating better coverage practices, techniques will evolve and grow.

00:14:34.850 Finally, while we may face remaining challenges, the overall objective is to elevate developer standards.

00:14:41.730 Make coverage an asset — a thoughtful companion to your development process.

00:14:48.600 Now, does anyone have any questions about what we've discussed so far?

00:14:54.830 Yes, I introduced single_cov earlier. There are a few other gems that accomplish similar functions.

00:15:04.180 Two months back, I came across the cover gem, which also runs tests and informs you of coverage omissions.

00:15:11.540 It appeared to cover many of the same functions as single_cov but lacks some advanced edge case handling.

00:15:17.780 While I appreciate that variation, I am happy with single_cov as it simplifies direct coverage insights.

00:15:25.320 I am aware that other languages have similar tools, but I'd prefer user-friendly solutions.

00:15:34.660 This is where local setups of single_cov appeal to me over larger external options.

00:15:40.680 So, what distinct boons come from using single_cov instead of the others?

00:15:48.129 The benefits of single_cov arise from avoiding the cumbersome reporting burdens of traditional tools.

00:15:54.839 Mainstream tools often generate extensive reports. They tend to slow down with large codebases.

00:16:02.699 Let's face it: we want prompt feedback without sending everything through a slow CI.

00:16:08.559 With single_cov, you can set it up locally for instant clarity without being bogged down by webhooks.

00:16:14.879 The visibility into branch coverage further helps maintain solid coding practices.

00:16:20.530 By tracking branch coverage, you add nuanced insights into tests beyond simple line assessments.

00:16:29.240 Ultimately, this approach leads to better practices and ensures more thorough testing routines.

00:16:36.060 Are there any more questions on setting adaptively?

00:16:43.073 Yes, it is feasible to tune coverage in production based on your specific needs.

00:16:50.919 The overhead tends to be minimal, yet I suggest testing conditions to be sure.

00:16:57.289 You can initiate coverage routines that allow real-time tracking for any performance hits.

00:17:05.690 The assessment allows coverage to be adjusted based on your app's behavior metrics.

00:17:12.820 Regular evaluations will help validate what settings yield the most efficient performance.

00:17:20.100 Compare your coverage intelligently across servers for all-round better efficiency.

00:17:27.650 I encourage you to leverage our coverage toolkit! It’s designed to cater to your goals.

00:17:35.730 Thank you for your engagement and thorough discussions, everyone!
