Writing a Test Framework From Scratch

by Ryan Davis

In the video titled 'Writing a Test Framework From Scratch', Ryan Davis delivers a code-heavy tutorial on creating a test framework, emphasizing the significance of assertions as the foundational unit of testing. Key points discussed include:

  • Introduction to Test Frameworks: Ryan, known as the author of MiniTest, introduces the topic of building testing libraries from scratch, focusing on a hands-on approach rather than a theoretical walkthrough.
  • Assertions as Building Blocks: The talk begins with the fundamental concept of assertions, explaining that the simplest assertion checks if an expression is truthy, which forms the bedrock of any testing framework.
  • Handling Assertions: Ryan discusses the decision to raise exceptions for failed assertions, advocating for immediate feedback to the user during test failures. He then refines error reporting to ensure useful messaging that indicates where the actual failure occurred.
  • Creating Additional Assertions: He demonstrates how various assertions, such as equality checks and floating-point comparisons, can be built upon the initial assertion framework, thus expanding its capabilities.
  • Independent Test Cases: The importance of organizing tests so they remain independent from one another is highlighted. Ryan illustrates using methods to ensure that tests are structured without interference from shared states.
  • Organizational Techniques: By employing classes, Davis shows how to group tests logically while maintaining efficient execution. He emphasizes the need for organized naming conventions to enhance clarity within the framework.
  • Execution and Reporting: The framework allows for running tests and reporting results, including a method for visual feedback indicating successful test runs. This provides developers with a clearer understanding of the overall test suite performance.
  • Refinements in Code Structure: Ryan highlights the need for clean code, separating exception handling from test runs to improve maintainability and readability.
  • Maintaining Independence Across Tests: He concludes with a strategy for ensuring each test runs independently, recommending random order execution to prevent dependencies affecting results.

In conclusion, by the end of the video, viewers are equipped with an understanding of how to incrementally build a simplified test framework. Ryan also mentions his upcoming book that further explores MiniTest, urging viewers to engage with him for more insights.

00:00:23.560 In my mind, nobody represents or epitomizes a Ruby hacker more than Ryan Davis. Also, I think no library represents the spirit of Ruby more than MiniTest. I'm a huge fan of both Ryan and MiniTest, and so I'm excited to introduce Ryan Davis, who will be talking about building your own testing library.
00:00:39.920 So, I'm going to be talking about creating test frameworks from scratch. As mentioned earlier, my name is Ryan Davis. I'm known elsewhere as Zen Spider or on Twitter as The Zen Spider because someone else has the username. I'm a founding member of Seattle RB, the first and oldest Ruby Brigade in the world. I'm now an independent consultant in Seattle and available for hire. Additionally, I am the author of MiniTest, which happens to be the most popular test framework in Ruby. I only mention this because I learned just last week that we're actually beating RSpec and have been since July.
00:01:12.119 Setting expectations is something I always like to do, so let me set them up front. This is going to be a very code-heavy talk. It will delve into the details about the 'whats', the 'hows', and the 'whys' of writing a test framework. With 326 slides, this is a little more than nine slides a minute, which is 50% more than I've done before. I'm going to be speaking at this pace, doing my best to convey the information while you can follow along with the published slides.
00:01:41.640 First, let me share a famous quote that hasn't been conclusively attributed to anyone: 'Tell me and I forget, teach me and I may remember, involve me and I learn.' While it is often believed to be said by Benjamin Franklin, the actual source may be a mystery. Regardless, I find that the quote points out an important problem in code walkthrough talks. Not all code walkthroughs are inherently bad; some are essential to work. However, I believe that merely going through the implementation of MiniTest without context would lead to little learning and might soon be forgotten.
00:02:19.080 One of the main issues with code walkthroughs is that they can be boring, top-down approaches focusing solely on the 'whats' rather than the 'whys'. It’s easy to tune out and not genuinely learn anything from them, which is why I won't conduct a code walkthrough of MiniTest today. The essence of this talk is about starting from scratch – working up from nothing allows you to join me in building a test framework.
00:02:47.520 I will aim to describe this process in a way that you can easily follow along on your laptops if you’d like to join in. However, it’s essential to note that this won't be an exact replica of MiniTest by the end of the talk; rather, it will be a subset. I will utilize the 80/20 rule to demonstrate most of what MiniTest can do in a minimal amount of code, referring to this as 'MicroTest' from here on out. I encourage you to deviate from the pathway I lay out and experiment; doing things differently may help you understand my choices better. Additionally, this talk is an adaptation of a chapter from a book I'm writing about MiniTest, and I will share more information on that at the end of the presentation.
00:03:34.120 To begin, the atomic units of any test framework are the assertions. Let’s start with a simple assertion, which is merely a straightforward method that takes the result of an expression and fails if that result isn't truthy. That’s it! A test framework can’t get any simpler than this. Thank you for your attention, and please consider buying my book when it comes out. Are there any questions? No? Well, although condensing a 326-slide talk into five minutes would be impressive, I would prefer to add a few more bells and whistles before we dive deeper.
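
As a rough sketch of that starting point, the entire framework at this stage can be a single Ruby method; the exact name and failure message in the talk may differ:

```ruby
# A bare-bones assertion: do nothing if the value is truthy, blow up otherwise.
def assert(test)
  raise "Failed assertion" unless test
  true
end

assert 1 + 1 == 2   # truthy: passes silently
assert 1 + 1 == 3   # falsy: raises RuntimeError
```
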
00:04:21.840 Before I continue, let’s discuss why I chose to raise an exception when the assertion fails instead of simply pushing an instance of something onto a collection of failures. This choice, while arbitrary, was based on some clear trade-offs. It’s not crucial what method you choose for handling failed assertions; what matters is that you eventually report that failure. I prefer raising exceptions because they interrupt the execution of tests effectively, allowing us to see where the failure occurred. For example, if we evaluate '1 equals 1', the result is true; this would silently pass through the assertion. Conversely, if we evaluate '1 equals 2', this would raise an exception. Even though there will be additional mechanisms to handle exceptions and gracefully continue with the test suite, currently, the execution will halt at the first failed test.
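
For contrast, here is a hypothetical sketch of the alternative design mentioned here, collecting failures instead of raising; the trade-off is that execution continues, but nothing interrupts you at the point of failure:

```ruby
# Hypothetical alternative: record failures in a list and keep going,
# then report everything at the end instead of interrupting execution.
FAILURES = []

def assert(test)
  FAILURES << caller.first unless test
  test
end

assert 1 == 1        # nothing recorded
assert 1 == 2        # execution continues; the call site is recorded
puts FAILURES        # a later report step would print these
```
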
00:05:24.319 One of the problems with raising exceptions is that the failure is reported from the line inside the assertion where the exception was raised, not from the line where the assertion itself was called. My goal is for the report to point only at the failing line in 'test.rb'. To improve this, I will change how the exception is raised, using a dedicated exception class and an explicit backtrace. We want messaging that is actually useful to the developer addressing the failure, pointing directly at where the test failed.
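
A sketch of that change might look like the following, with a dedicated exception class and an explicit backtrace passed to raise; the class name is illustrative:

```ruby
# Raise a dedicated failure class and hand raise an explicit backtrace
# (here simply `caller`), so the report starts at the assertion's call
# site in test.rb rather than inside the assert implementation.
class Failure < RuntimeError; end

def assert(test, msg = "Failed assertion")
  raise Failure, msg, caller unless test
  true
end
```
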
00:06:37.920 Next, we will add a second assertion. Now that we have our basic assertion, we can build more complex assertions on top of it. For instance, I would say that at least 90% of my own tests are covered by a simple equality check, so we will check for equality. Luckily, this is straightforward: we simply pass the result of 'a equals b' to 'assert', and assert does the rest. In use, asserting that '2 plus 2 equals 4' passes true to assert, while asserting that '2 plus 2 equals 5' passes false, which fails the assertion.
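
As a sketch, the equality assertion can simply delegate to the basic one (a minimal assert is included here so the snippet runs standalone):

```ruby
def assert(test)
  raise "Failed assertion" unless test
  true
end

# Equality built on top of the basic assertion: pass the result of
# a == b straight through to assert.
def assert_equal(a, b)
  assert a == b
end

assert_equal 2 + 2, 4   # passes true to assert
assert_equal 2 + 2, 5   # passes false, so the assertion fails
```
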
00:07:19.120 This gives us what we need to write most core tests effortlessly. However, the current implementation still produces an error message that points to 'assert equal', not to where the assertion was invoked. The reason is that while I used 'caller' for the backtrace, it includes the entire call stack, frames from other assertions included, which still makes for inconvenient output.
00:08:02.679 Thus, I'll refine this by filtering the backtrace, rejecting any frame that starts with our implementation file. By assigning the filtered backtrace to a local variable, we drop every frame that originates in our implementation, which tidies up the error reporting significantly. Next, I want better error messaging for 'assert equal'. We can add an optional message argument to our assertion that tells the user about the failure's context, making the feedback far more valuable.
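
A sketch of both refinements, filtering our implementation file out of the backtrace and threading an optional message through; the filtering approach and message wording are my own:

```ruby
class Failure < RuntimeError; end

def assert(test, msg = "Failed assertion")
  return true if test
  # Drop every frame that comes from this implementation file, so the
  # first reported line is the one in the user's test file.
  backtrace = caller.reject { |line| line.start_with?(__FILE__) }
  raise Failure, msg, backtrace
end

def assert_equal(a, b)
  # Supply a message describing the mismatch instead of the default.
  assert a == b, "Expected #{a.inspect} to equal #{b.inspect}"
end
```
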
00:09:10.399 Let’s create one more assertion. A common mistake people encounter when testing is checking equality with floating-point numbers. The rule is straightforward: never test floats for equality. While there are exceptions, if you adhere to this rule, you'll generally avoid errors. As such, I will create a separate assertion that will check whether two floating-point numbers are close enough, which we can define as being within one thousandth of each other. Implementing this will show how it resembles 'assert equal' but uses an adjusted formula to determine closeness. This approach allows us to expand our assert method to be sufficiently generalized, enabling us to write various other assertions effectively.
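
A sketch of such a float assertion, with a minimal assert included so it runs standalone; the default delta and message are my own:

```ruby
def assert(test, msg = "Failed assertion")
  raise msg unless test
  true
end

# Two floats count as "close enough" when they differ by no more than a
# small delta, one thousandth by default.
def assert_in_delta(a, b, delta = 0.001)
  assert((a - b).abs <= delta, "Expected #{a} to be within #{delta} of #{b}")
end

assert_in_delta 0.1 + 0.2, 0.3   # passes, even though 0.1 + 0.2 != 0.3
```
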
00:10:56.320 Once we can write various assertions successfully, we soon run into the next issue: having many tests. Tests should never interfere with each other; they must be organized and kept separate to maintain clarity and trustworthiness. There are many reasons for structuring tests independently, such as organization, refactoring, reusability, safety, prioritization, and parallelization. Hence, we need a way to break tests apart while keeping them distinct. We could define a 'test' method that takes a description and a block of assertions, which is relatively simple to implement. However, that approach leads to leaky tests, where one test can influence another through shared state or variables.
00:12:07.440 To address this, we'll make tests completely independent by writing each test as a method. Running test cases through distinct methods isolates state: in Ruby, a method, unlike a block, does not capture its enclosing scope, so local variables defined in one test method cannot overwrite state in another. This lets us structure our tests effectively without additional overhead. Furthermore, it comes for free with the language, and keeping the code understandable to everyone is an important consideration.
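
A sketch of that isolation, using the assert from earlier; the method names are illustrative:

```ruby
def assert(test)
  raise "Failed assertion" unless test
  true
end

# Each test is a method of its own, so `result` below is a fresh local
# variable in each test and cannot leak between them.
def test_addition
  result = 2 + 2
  assert result == 4
end

def test_subtraction
  result = 4 - 2
  assert result == 2
end

test_addition
test_subtraction
```
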
00:13:17.200 Once we have these methods for our tests, running them is as easy as calling any method. While there are more complicated ways to kick off tests, the simplest is to just call each test method directly. Methods do provide better organization, but every test method needs a unique name, and that becomes laborious, which is why we can take advantage of classes. Wrapping the test methods into classes helps with structure and organization while still letting us create instances for testing.
00:14:52.400 However, we still need a straightforward way to run these tests. Wrapping our test methods into classes helps us organize tests logically, but how do we actually run them? To invoke tests that sit inside a class, we need instances of that class. By adding an instance method to invoke each test, we can begin to lay out how the classes will manage their own tests. We can introduce a 'run' method that takes a test name and invokes the corresponding method. The next step will be a method that runs all the test cases within a class.
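
A sketch of that shape, with a class grouping the test methods and a run method that invokes one named test on an instance; the names are illustrative:

```ruby
class MathTest
  def assert(test)
    raise "Failed assertion" unless test
    true
  end

  def test_addition
    assert 2 + 2 == 4
  end

  def test_subtraction
    assert 4 - 2 == 2
  end

  # Run a single named test on this instance.
  def run(name)
    send name
  end
end

MathTest.new.run :test_addition
MathTest.new.run :test_subtraction
```
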
00:16:18.640 To create structure, we will refactor our code so that each new test class knows how to run all its tests by utilizing public instance methods that end with '_test'. This allows us to run each defined test in a controlled way while having organized results. At this stage, it would be beneficial to encapsulate redundant logic by pushing our functionality into a parent 'Test' class. This allows us to consolidate common methods, making it simple for all our test classes to inherit relevant structure and functionality for efficient code reuse.
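
A sketch of that refactoring might look like the following; note that I match method names with the common 'test_' prefix here, whereas the transcript describes a '_test' suffix, so treat the pattern as illustrative:

```ruby
# A shared parent class: assertions and test running live here, and
# each subclass only defines its own test methods.
class Test
  def assert(test)
    raise "Failed assertion" unless test
    true
  end

  def run(name)
    send name
  end

  # Discover every test method via reflection and run each one on a
  # fresh instance.
  def self.run_all_tests
    public_instance_methods.grep(/^test_/).each do |name|
      new.run name
    end
  end
end

class MathTest < Test
  def test_addition
    assert 2 + 2 == 4
  end
end

MathTest.run_all_tests
```
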
00:17:28.560 To fully automate our test runs, we can use Ruby's 'inherited' class hook, which lets us record our test classes dynamically as they are defined. Every time a new subclass of Test is created, this hook notes it for execution later. That simplifies running all the tests: each registered class knows how to execute its own tests whenever the single top-level run method is called. Putting these pieces together gives us a streamlined way to run the whole suite.
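
Here is a sketch of that hook in use; the registry constant and entry-point name are my own:

```ruby
class Test
  TESTS = []

  # Ruby calls this hook whenever a subclass of Test is defined, which
  # lets us record every test class automatically.
  def self.inherited(subclass)
    TESTS << subclass
    super
  end

  # One entry point runs every registered class in turn.
  def self.run_everything
    TESTS.each(&:run_all_tests)
  end

  def self.run_all_tests
    public_instance_methods.grep(/^test_/).each { |name| new.send name }
  end

  def assert(test)
    raise "Failed assertion" unless test
    true
  end
end

class MathTest < Test
  def test_addition
    assert 2 + 2 == 4
  end
end

Test.run_everything
```
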
00:18:49.680 At this point, our MicroTest framework can execute tests without difficulty; however, it has a limitation: while it successfully runs tests, the output provides no feedback. We currently have no way of confirming that any tests ran at all. This prompts us to add some reporting of run results. At a minimum, I find it gratifying to see at least something indicating that a test has run successfully, so let's print a dot for each test run. A minimal level of reporting like this gives visibility into the testing process without adding much complexity.
00:20:45.760 While outputting test results, we must also address failures. Right now, if a test fails, execution stops because of the unhandled exception. That accurately points out a failure, but it limits visibility: you only ever see the first failure until you fix it. Instead of halting on the first failure, let's rescue the exception and report on every single test. With that improvement, we see results for all tests, whether they pass or fail, without overwhelming the user with backtrace details, while still surfacing the failure information that aids debugging.
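
A sketch combining both ideas, the dot per test from above and rescuing failures so the suite keeps running; the output format is my own:

```ruby
class Test
  def assert(test)
    raise "Failed assertion" unless test
    true
  end

  def self.run_all_tests
    failures = []
    public_instance_methods.grep(/^test_/).each do |name|
      begin
        new.send name
        print "."                       # visual feedback per passing test
      rescue RuntimeError => e
        print "F"                       # keep running instead of halting
        failures << "#{name}: #{e.message}"
      end
    end
    puts
    failures.each { |f| puts f }        # summarize failures at the end
  end
end

class MathTest < Test
  def test_addition
    assert 2 + 2 == 4
  end

  def test_broken
    assert 2 + 2 == 5
  end
end

MathTest.run_all_tests
```
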
00:22:33.720 As we advance through the implementation, I feel compelled to address code maintainability. Currently, our test-running logic is intertwined with the exception handling, which makes for a muddled experience. I propose separating the exception handling from the test run itself to keep the code cleaner. For clarity, I will first identify all the steps involved in handling exceptions while keeping test reporting distinct, which leaves the code better organized and easier to manage.
00:23:53.440 Following this, we can design a reporter class dedicated entirely to managing the output of test results. This refinement separates the responsibility for running tests from the responsibility for handling output, yielding a cleaner abstraction. We ultimately adopt a report method that presents results in a human-readable format, so the situation can be understood at a glance. Along the way, we choose names that more accurately describe the functionality, improving the clarity of the architecture.
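
As a sketch of that separation, a dedicated reporter object might own all of the output; the class and method names here are my own, not necessarily the talk's:

```ruby
# A reporter that owns all output: the runner only tells it what
# happened, and the reporter decides how to print it.
class Reporter
  def initialize
    @failures = []
  end

  def pass
    print "."
  end

  def failure(message)
    print "F"
    @failures << message
  end

  # Human-readable summary, printed once the whole run is finished.
  def report
    puts
    @failures.each { |f| puts f }
    puts "#{@failures.size} failure(s)"
  end
end

reporter = Reporter.new
reporter.pass
reporter.failure "test_broken: Failed assertion"
reporter.report
```
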
00:25:30.680 Now that we have homed in on the reporting structure, we'll separate the individual failure messages from the overview of successful tests. At this point, I'll also rename some of our methods to enhance clarity and streamline understanding; the changes amount to replacing ambiguous names with more descriptive alternatives. In this way, we clarify the design further while keeping the codebase readable. Honestly, every refinement improves how well our code communicates with developers and users alike.
00:26:56.560 Next, we want to strengthen the framework by addressing dependencies across tests. Ideally, each test passes independently of whatever ran before it. To ensure this principle is honored, we should run tests in random order, which is easy to accomplish within our framework. By abstracting the test-run strategy into its own method, we keep the design aligned with test independence, and the code that generates results can focus solely on accuracy.
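
A sketch of that change, assuming the run_all_tests method from the earlier sketches; the only difference is a shuffle before iterating:

```ruby
class Test
  def assert(test)
    raise "Failed assertion" unless test
    true
  end

  def self.run_all_tests
    # Shuffle the discovered test methods so any hidden ordering
    # dependency between tests eventually surfaces as a failure.
    public_instance_methods.grep(/^test_/).shuffle.each do |name|
      new.send name
      print "."
    end
    puts
  end
end
```
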
00:28:39.800 At this point, we have a functional test framework of about 70 lines of clearly written code, covering a significant portion of what MiniTest does. It is well structured, free from duplication, and exceptionally readable, making the framework easy both to understand and to maintain. Even without extensive comments, the code is clear and communicates its functionality at a glance. A nice side effect is that the framework also runs fast, thanks to its lean structure.
00:29:58.840 To summarize, we began this journey with the smallest unit – the assertion – and built our framework incrementally by chunking logic into manageable methods and classes that communicated coherently with each other. Ultimately, we integrated error handling and reporting effectively while keeping a focus on maintaining strong independence across tests. As a final note, I am working on getting a book published that delves deeper into MiniTest, and I hope to provide samples for you soon. Please consider following my updates on Twitter, and thank you for your attention! I would prefer to write tests with methods.