RubyConf 2021

To mock, or not to mock?

by Emily Giurleo

In her talk titled "To mock, or not to mock?" at RubyConf 2021, Emily Giurleo explores the intricacies of mocking in software testing. She begins by referencing Shakespeare's famous soliloquy to frame her discussion on the use of mocks in code tests, emphasizing the balance between their benefits and potential pitfalls. Emily aims to provide a framework to help developers determine when to use mocks effectively. The key points discussed throughout her presentation include:

  • Understanding Mocks: Emily explains two high-level approaches to testing: the classical approach, which focuses on creating real objects and checking their state, versus the mockist approach, which involves creating fake objects and verifying their behavior through expected interactions.

  • Advantages and Disadvantages: The classical approach is simpler and more straightforward, while the mockist approach also verifies how a result was achieved. However, over-reliance on mocking can couple tests tightly to the code's implementation, making them fragile during refactoring. Mocking can also result in self-referential tests that lack meaningful checks.

  • Three Key Scenarios for Mocking: Emily identifies three specific scenarios where using mocks can be particularly beneficial:

    • Avoiding Expensive Resources: Mocking can prevent costly operations such as network calls, which slow down tests or incur financial penalties due to API usage. She illustrates this through a mock of a network request client to test a quote-generating method without making actual API calls.
    • Creating Deterministic Tests for Non-Deterministic Code: Mocks help in testing aspects of code that exhibit unpredictable behavior, such as exceptions during network requests or random number generation that could lead to varying test outcomes.
    • Testing Objects with Hidden States: Certain objects, like caches, have states that should not be exposed or observed directly. Emily discusses a caching mechanism in her gem, Fakespeare, and how mocking the cache methods allows for verifying interactions without exposing its internal state.

  • Framework for Decision-Making: Emily concludes with a framework comprising three critical questions when considering mocks:

    1. Does the test use expensive resources (time, memory, or financial costs)?
    2. Are we testing a non-deterministic code case?
    3. Are we dealing with an object whose state cannot or should not be exposed?

If any of these questions is answered affirmatively, it is a sign that using mocks would enhance the testing process. The session wraps up with a reminder to 'mock responsibly' and an invitation to engage with the community she co-founded, WNB.rb.

Overall, Emily provides a well-rounded perspective on the nuanced use of mocks in software testing, emphasizing that while they can offer considerable advantages, they come with significant risks if misapplied.

00:00:10.559 So, um.
00:00:12.799 Around the year 1600, William Shakespeare wrote Hamlet's 'To be or not to be' soliloquy, which would go on to become one of the most widely recognized texts in the English language.
00:00:15.440 Now, while Shakespeare was contemplating questions of life or death, I've been thinking about a different question: to mock or not to mock.
00:00:21.840 And more specifically, when we're writing tests for our code, how do we decide when to use a mock?
00:00:23.359 My name is Emily Giurleo, and I'm a software engineer at Numero, where we make financial software for political campaigns.
00:00:26.400 I'm also the co-founder of WNB.rb, which is a virtual community for women and non-binary Rubyists.
00:00:29.199 Going back to the question I just posed, how do we know when to use a mock in our tests? This is the question I plan to answer over the course of this talk. In doing so, I want to achieve the following goals: first, I want us to understand what mocks are and why it's so difficult to figure out when to use them. I also want to identify three scenarios where mocks can be the most helpful and valuable in our tests. In doing so, I want to articulate a framework that we can use going forward to help us decide when to use mocks.
00:01:02.640 So when I say the word 'mock', what do I mean?
00:01:04.799 Well, when we're writing tests, there are two high-level approaches that we can use to decide how to create the objects that we're going to use in our tests.
00:01:06.320 The first is called the classical approach, and this is when we create real objects and then verify their internal state. The second is called the mockist approach, which is when we actually create fake objects, and instead of verifying their state, we verify their behavior. I'm going to show you some examples which illustrate the differences between these two approaches.
00:01:10.400 Let's imagine that we run a store in the 16th century. Like any good 16th-century store owner, we want to create an application to manage our inventory. Our application might have two main objects: the first is a warehouse, which keeps track of how much inventory that warehouse holds, and the second is an order, which represents an order from a customer. Now, the order object might have an instance method called fill, which takes a warehouse and then removes inventory from that warehouse depending on what items are in the order.
00:01:32.400 I'm going to show you how I might test these two objects. Before doing that, I want to mention that I use RSpec and RSpec Mocks in my day-to-day life as a software engineer, and all of the examples in this talk will use those gems as well. But please don't get up and leave if you're a Minitest user. I believe that the concepts I discuss will be relevant to any testing framework or even other programming languages.
00:01:35.519 So back to our inventory manager. If we were going to test the fill method using a classical approach, we might do something like this: we could create a real warehouse object and add some inventory to it—say, ruffle collars, quills, and ink—which are, of course, 16th-century staples.
00:01:39.520 Then we can pass that warehouse object to the order fill method and verify the internal state of the warehouse at the end of the test. We can check that it has 10 fewer ruffle collars at the end than it did at the beginning.
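The slides are not part of this transcript, so the following is only a minimal sketch of what a classical test of fill could look like. The Warehouse and Order classes, their method names, and the quantities are assumptions made for illustration, and plain Ruby checks stand in for RSpec so the sketch runs without any gems.

```ruby
# Hypothetical Warehouse: tracks how many of each item it holds.
class Warehouse
  def initialize
    @inventory = Hash.new(0)
  end

  def add(item, quantity)
    @inventory[item] += quantity
  end

  def remove(item, quantity)
    raise ArgumentError, "not enough #{item}" if @inventory[item] < quantity

    @inventory[item] -= quantity
  end

  def count(item)
    @inventory[item]
  end
end

# Hypothetical Order: filling it removes its items from a warehouse.
class Order
  def initialize(item, quantity)
    @item = item
    @quantity = quantity
  end

  def fill(warehouse)
    warehouse.remove(@item, @quantity)
  end
end

# Classical style: build a real warehouse, run the method, inspect the state.
warehouse = Warehouse.new
warehouse.add(:ruffle_collars, 50)
warehouse.add(:quills, 20)
warehouse.add(:ink, 5)

Order.new(:ruffle_collars, 10).fill(warehouse)

raise "expected 10 fewer ruffle collars" unless warehouse.count(:ruffle_collars) == 40
```

Nothing here knows how fill does its work; the check only looks at the warehouse's final state.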
00:01:53.680 Now, what if we tested it using a mockist approach? Instead of creating a real warehouse object, we can create a fake object using the instance_double method. We can define the expected behavior of our fake object, saying which methods we expect to be called on it and with which arguments, and we can define what we want those methods to return when they're called.
00:02:05.760 Finally, after we pass that fake warehouse to the fill method, we can verify that the fake object had the right methods called on it, and that our fill method behaved as we intended. Looking at these two approaches, we might ask ourselves: which is better?
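As a rough, dependency-free illustration of the mockist style, the sketch below swaps the real warehouse for a hand-rolled fake that records the calls it receives. In the talk this is done with rspec-mocks (instance_double, and expect(...).to receive(...).with(...)); the FakeWarehouse class and the Order class here are assumptions made so the example runs in plain Ruby.

```ruby
# Hypothetical Order, as in the talk's inventory example.
class Order
  def initialize(item, quantity)
    @item = item
    @quantity = quantity
  end

  def fill(warehouse)
    warehouse.remove(@item, @quantity)
  end
end

# Hand-rolled stand-in for instance_double(Warehouse): it holds no inventory,
# it only records which methods were called and with which arguments.
class FakeWarehouse
  attr_reader :calls

  def initialize
    @calls = []
  end

  def remove(item, quantity)
    @calls << [:remove, item, quantity]
    nil
  end
end

# Mockist style: verify behavior (the interaction), not state.
fake_warehouse = FakeWarehouse.new
Order.new(:ruffle_collars, 10).fill(fake_warehouse)

expected_calls = [[:remove, :ruffle_collars, 10]]
raise "fill did not call remove as expected" unless fake_warehouse.calls == expected_calls
```

The check passes only if fill calls remove with exactly these arguments, which is the coupling to implementation that the talk goes on to warn about.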
00:02:10.560 I'm going to tell you that this is a trick question. I think there are benefits and drawbacks to both approaches. The classical approach is arguably simpler; it has fewer lines of code and is easier to reason about. However, a cool thing about the mockist approach is that it not only verifies the result of our method but also verifies that our code achieved that result in the way we expected, which seems really powerful.
00:02:28.640 However, this verification comes at a price. Imagine we wrote these tests and then six months later we decided to go and refactor our application’s fill method. An interesting thing is that this test is probably going to break the moment we try to refactor the method. That’s because our test is deeply coupled to the exact implementation of our method.
00:03:00.680 So, this is a huge downside of mocking: we couple our tests to our method, which will break the tests the moment we try to refactor. Another potential downside to mocking is that, and this happens to even the most experienced developers, we can get into a hole where we mock so much that we remove all meaning from our tests.
00:03:34.880 Let’s imagine that we also create a fake order object, and then we mock the fill method, the very method we're trying to test. Then, in our test, we call the fake method on the fake object and verify that we called that fake method on the fake object. We've essentially just written a test that tests itself. It has no meaning and is entirely self-referential, and mocking can potentially get us into a situation like this.
00:03:50.240 So, to summarize, at its very worst, mocking can lead us to write tests that are coupled to the exact implementation of our code, and it can cause us to write tests that are entirely self-referential. That's right, Shakespeare: mocking is risky business, and that's why it's often so hard for us to figure out whether we should be using mocks in our tests.
00:04:17.440 However, not all hope is lost. I think mocking does come with a few benefits: there are a few scenarios where it can add real value to our tests. I've identified three of those scenarios that I want to talk to you about today. First, mocking can help us avoid the use of expensive resources in our tests. Second, it can create deterministic tests for non-deterministic code. Finally, it can help us test objects whose state can't or shouldn't be exposed.
00:04:56.929 As we go through these three scenarios, we're going to work together to identify a framework that can help us answer the question: to mock or not to mock.
00:05:00.000 To illustrate these concepts, I'm going to talk to you about a gem that I wrote for this talk called Fakespeare. The Fakespeare gem generates silly Shakespeare quotes with random Ruby words interspersed. Its generate quote method works as follows: first, it makes a network call to a Shakespearean quote generator API and gets back a Shakespearean quote.
00:05:11.680 Then it identifies the nouns in that quote—and I'll admit it does this to varying degrees of success—and then it replaces those nouns with randomly generated Ruby words. This method is extremely silly and has no practical use case, but it demonstrates all three of the concepts I want to discuss in this talk, making it the perfect way to show you what I mean when I say to mock or not to mock.
00:05:26.400 So let’s get started. The first scenario that I wanted to share with you is one that we encounter in our day-to-day lives as software developers: the case where mocking can help us prevent the use of expensive resources in our tests.
00:05:29.760 Now, what do I mean by expensive? Expense can be measured in several ways, but some common ones are time (we want our tests to run quickly), computer memory (we don't want our tests to become bulky and crash our computers), and even money (we don't want our tests to be expensive to run).
00:05:32.560 One example of something that could be expensive is a network call, and we just saw that I use one of these in the generate quote method. The first thing it does is make a network call to a Shakespearean API to get back a quote. What I mean by a network call is that it uses the internet to perform a query against an API that lives on a different server—not the same server as where the gem is running.
00:05:49.440 So I don't want to run this network call during my tests, and that's for a few reasons. First, network calls are slow, and we want our tests to be fast. Second, depending on what API provider we use (I actually wrote the Shakespearean API, but let's pretend that I didn't), our API provider might charge us money for every request, or if we make too many requests, they might rate limit us and then charge us to allow us to continue using the API.
00:06:14.480 Shakespeare doesn’t have time for this, and neither do I, so I think this is a great scenario where we can use a mock. Before we talk about how we might mock this, let's take a look at the code that I wrote to perform the network request.
00:06:31.280 So first, I create a URI object. This is just an object that specifies the address of the server where the API lives. Then, I use the Net::HTTP library, specifically the get_response method, to perform the network request. When I was thinking about how to mock this code, I considered whether I should mock this method, the one that's actually making the network request. Does that seem like a good idea?
00:06:57.680 I'm going to say no. I got some murmurs from the crowd, and I'll tell you why I don’t think that’s a good idea. First, if we mock the Net::HTTP library, our test would have to know the exact implementation of this third-party method. It would have to know what arguments it takes and what kind of objects it returns, which seems like a lot of work for us to set up our tests. Also, doing this would couple the test to the exact response shape that we get back from the API. If the API provider (aka me) ever wants to change that response shape, then we'd have to change all of our tests as well.
00:07:18.960 To avoid these problems, I think we can do something clever: we can extract this external dependency (the network request) from the rest of our gem's code. We can do this using something called the client pattern. The client pattern means that we create a custom class, called a client, around this external dependency, and that class essentially shields the network code from the rest of our gem.
00:07:58.720 Now, instead of our gem (and by extrapolation our tests) having to know the exact implementation of our network request, all it needs to know is that there's a method on this class that it can call to say, 'Hey, give me a quote,' and the class will return a quote.
00:08:14.240 This also makes it easy to test, which I'll talk about in a second. First, I’m going to show you the Shakespeare client that I wrote to encapsulate this network request.
00:08:19.680 You'll notice that a lot of the code on the slide is the same as the previous slide. The only difference is that it’s in its own class, which has some awesome benefits. The first is that the gem no longer has to interact with Net::HTTP; it no longer needs to know exactly how this network request is implemented. It also no longer needs to know the exact response shape from the API, so if I ever change that response shape, I can keep all that complication within this class and leave the rest of the gem the same.
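The slide itself is not in the transcript, so here is a minimal sketch of what such a client might look like. The URL is a placeholder, and the JSON response shape (a "quote" key) and the parse_quote helper are assumptions; only URI and Net::HTTP.get_response come from the talk.

```ruby
require "json"
require "net/http"
require "uri"

# Sketch of the client pattern: every detail of the network request and of the
# API's response shape lives in this one class.
class ShakespeareClient
  # Placeholder address; the real API's URL is not given in the talk.
  API_URL = "https://shakespeare-api.example.com/quote"

  # The only method the rest of the gem needs to know about.
  def generate_quote
    response = Net::HTTP.get_response(URI(API_URL))
    parse_quote(response.body)
  end

  private

  # Assumed response shape: {"quote": "..."}. If the API changes its response,
  # only this method has to change.
  def parse_quote(body)
    JSON.parse(body).fetch("quote")
  end
end
```

Callers simply ask for ShakespeareClient.new.generate_quote and get a string back, without ever touching Net::HTTP or the JSON payload.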
00:08:47.440 If we were going to mock this, it’s actually quite straightforward now. We can create a mocked instance of the Shakespeare client. We can even force the Shakespeare client class to return that mocked instance when we call the new method. Finally, we can define the behavior we expect on the Shakespeare client that we've mocked; we can specify that we expect it to have a certain method called on it.
00:09:18.080 Then, when that method gets called, we can specify exactly what quote we want it to return. In this section, we talked about how mocking can help us prevent the use of expensive resources in our tests, and we can use patterns such as the client pattern to help us mock in the most effective way possible.
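To make the steps above concrete without requiring any gems, the sketch below imitates them in plain Ruby. With rspec-mocks the same idea is usually written with instance_double(ShakespeareClient) and allow(ShakespeareClient).to receive(:new).and_return(client). The Fakespeare module shape, the FakeClient class, and the with_stubbed_new helper are all assumptions made for illustration.

```ruby
# Stand-in for the real client; calling it would hit the network.
class ShakespeareClient
  def generate_quote
    raise "real network call attempted in a test"
  end
end

# Hypothetical entry point of the gem: it builds its own client.
module Fakespeare
  def self.generate_quote
    ShakespeareClient.new.generate_quote
  end
end

# Hand-rolled mock client: returns a canned quote and counts its calls.
class FakeClient
  attr_reader :calls

  def initialize(quote)
    @quote = quote
    @calls = 0
  end

  def generate_quote
    @calls += 1
    @quote
  end
end

# Temporarily force klass.new to return the given instance, then restore it.
def with_stubbed_new(klass, instance)
  klass.define_singleton_method(:new) { |*| instance }
  yield
ensure
  klass.singleton_class.send(:remove_method, :new)
end

fake_client = FakeClient.new("To be, or not to be")
quote = with_stubbed_new(ShakespeareClient, fake_client) { Fakespeare.generate_quote }

raise "unexpected quote" unless quote == "To be, or not to be"
raise "client should be called exactly once" unless fake_client.calls == 1
```

No network request is made, and the test controls exactly which quote comes back.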
00:09:39.680 I think this leads us to add the first question to our 'to mock or not to mock' framework, which is: does our test use expensive resources? If so, that's a good indication that we might want to use a mock in our tests.
00:09:54.480 The second scenario I wanted to discuss is when mocking can create deterministic tests for non-deterministic code. An example of non-deterministic behavior is an error case that happens when you least expect it.
00:10:17.200 Remember, we said that the first thing the generate quote method does is perform an API request. But what if that API request fails? What if, instead of a quote, we get back a network error? This is an error case that is hard to anticipate; we don't know when the internet might go down or when the Shakespeare API might be experiencing a problem. That being said, we still want our gem to handle it well, which means we want to test it.
00:10:42.880 The other interesting thing is that this case is hard to test because we can't just tell our test, 'Hey, shut off the internet for a second'; that's pretty hard to do. So we can lean again on our client pattern. In this case, I have the client raise a custom exception class that I wrote called ShakespeareError whenever there's a problem performing the network request. This way, the rest of the gem can rescue that error and perform some exception handling whenever there's an issue with the API request.
00:11:06.560 If we were going to mock this scenario, it's pretty similar to the one we discussed before. We can force the mocked client to raise an exception whenever it receives the generate quote method. This way, we've taken a non-deterministic error case and made it deterministic for our tests.
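A small sketch of this error case follows. ShakespeareError is named in the talk; the fallback message, the way the client is passed in as an argument, and the FailingClient class are assumptions chosen to keep the example short. In rspec-mocks the equivalent setup would be along the lines of allow(client).to receive(:generate_quote).and_raise(ShakespeareError).

```ruby
# Custom exception raised by the client when the network request fails.
class ShakespeareError < StandardError; end

# Hypothetical gem entry point; here the client is passed in for brevity.
module Fakespeare
  # Assumed fallback; the talk only says the gem does "some exception handling".
  FALLBACK = "Alas, no quote today."

  def self.generate_quote(client)
    client.generate_quote
  rescue ShakespeareError
    FALLBACK
  end
end

# Fake client that always fails, so the error path runs on every test run.
class FailingClient
  def generate_quote
    raise ShakespeareError, "API unavailable"
  end
end

result = Fakespeare.generate_quote(FailingClient.new)
raise "error was not handled" unless result == Fakespeare::FALLBACK
```

The failure no longer depends on the internet actually going down; the fake client raises on demand.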
00:11:32.960 Another example of non-deterministic code is side effects. When I say side effects, I mean inputs that aren't explicitly passed as arguments to a method, which could involve a call to Time.now within the method or some random generation that happens.
00:12:01.440 An example of this is in the Fakespeare gem. The last step of the generate quote method involves taking those nouns (which may or may not be nouns; sometimes they're verbs in the Shakespeare quote) and replacing them with randomly generated Ruby words. This case is really hard to test without mocking because the words are randomly generated, and the outcome won't be the same every time.
00:12:22.240 Therefore, by mocking that random generation, we can take this non-deterministic aspect of our code and make it deterministic in our tests. I even followed my own advice and extracted that random Ruby word generation into its own module, which I called RubyWords, making it easy to mock.
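Here is one way the idea could look in plain Ruby. RubyWords is the module name from the talk, but its random_word method, the word list, the replace_nouns signature, and the FixedWords stand-in are assumptions; with rspec-mocks one would more likely stub the module's method directly using allow(...).to receive(...).and_return(...).

```ruby
# Assumed shape of the extracted module: one method that returns a random word.
module RubyWords
  WORDS = %w[gem block proc symbol mixin].freeze

  def self.random_word
    WORDS.sample
  end
end

# Hypothetical replacement step; the word source can be swapped out in tests.
module Fakespeare
  def self.replace_nouns(quote, nouns, words: RubyWords)
    nouns.reduce(quote) { |text, noun| text.sub(noun, words.random_word) }
  end
end

# Deterministic stand-in for RubyWords: hands out the given words in order.
class FixedWords
  def initialize(*words)
    @words = words.cycle
  end

  def random_word
    @words.next
  end
end

result = Fakespeare.replace_nouns(
  "All the world's a stage",
  %w[world stage],
  words: FixedWords.new("gem", "block")
)
raise "unexpected result: #{result}" unless result == "All the gem's a block"
```

With the fake word source in place, the output is the same on every run, so the test can assert on an exact string.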
00:12:50.560 In this section, we discussed how mocking can transform a non-deterministic case into deterministic tests, leading us to the second question in our 'to mock or not to mock' framework: are we testing a non-deterministic case in our code?
00:13:06.240 The final scenario where mocks can really benefit us is in testing objects whose state can't or shouldn't be exposed. The best example I've encountered for this is a cache.
00:13:18.080 Now, there are many types of caches, but for this talk, I’ll say that a cache is a software layer that temporarily stores data for the purpose of returning it more quickly in the future. The cache that we interact with most is probably our browser cache.
00:13:58.240 So, returning to our friend Willie Shakes, he loves to visit his favorite website, which is RuffleCollarsRUs.org. When he visits the website, his browser, in a gross oversimplification, makes a request to the site’s server and retrieves some assets to display the site. It then stores those assets in its local browser cache to ensure a quicker load the next time he visits the site.
00:14:09.680 Now, this isn't the only place where we can use a cache. In fact, I think there's a great use case for caching in the Fakespeare gem. Remember, the gem makes a network request to the Shakespeare API, but if that request fails, we get an error instead of a quote. What happens next?
00:14:34.400 Instead of raising that exception and calling it a day, we can use caching to improve the experience. Whenever the Fakespeare gem receives a quote from the Shakespeare API, it stores that quote in the cache. In the future, if it ever faces an issue getting another quote from the API, it can make a call to the cache and retrieve the previously saved quote, enhancing the user experience.
00:14:54.720 The cache is incredibly challenging to test, and that's for a few interesting reasons. First, when the Fakespeare gem makes a call to the Shakespeare API, receives a quote, and stores it in the cache, we want to ensure that it does this reliably. We might instinctively check the cache at the end of the test and ask, 'Is the expected quote in there?'
00:15:22.560 However, I don't think this is the best approach. The reason is that the state of the cache isn't meant to be examined by the rest of the gem. Under the hood, the cache is just a data layer holding an array of strings: the quotes it has stored during its usage.
00:15:50.480 That's why the cache has no method to 'get all the quotes you have in there': such a method would couple the rest of the gem to the exact implementation of the cache, and if I wanted to change that implementation, I would need to rewrite a substantial part of the gem. So while it might be tempting to peek into the cache during our test, I believe mocking lets us avoid coupling our gem to the exact implementation of this component.
00:16:15.680 To mock this, we can define the method on the cache that we expect to be called, which is the set method. It's completely fine for the cache to otherwise behave as normal, so we can allow the real method to run exactly as it usually would.
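One way to picture this is a spy that records calls to set and then lets the real cache do its work. With rspec-mocks the usual spelling is expect(cache).to receive(:set).and_call_original. Everything in the sketch below (QuoteCache, SpyCache, StubClient, and the keyword arguments on generate_quote) is an assumption made so the example runs in plain Ruby.

```ruby
# Hypothetical cache: an array of quotes with no way to list its contents.
class QuoteCache
  def initialize
    @quotes = []
  end

  def set(quote)
    @quotes << quote
    nil
  end

  def get
    @quotes.sample
  end
end

# Spy: records each call to set, then falls through to the real behavior.
class SpyCache < QuoteCache
  attr_reader :set_calls

  def initialize
    super
    @set_calls = []
  end

  def set(quote)
    @set_calls << quote
    super
  end
end

# Hypothetical gem entry point that stores every quote it receives.
module Fakespeare
  def self.generate_quote(client:, cache:)
    quote = client.generate_quote
    cache.set(quote)
    quote
  end
end

# Simple stand-in for the Shakespeare client.
StubClient = Struct.new(:quote) do
  def generate_quote
    quote
  end
end

cache = SpyCache.new
Fakespeare.generate_quote(client: StubClient.new("Brevity is the soul of wit"), cache: cache)

raise "set was not called with the quote" unless cache.set_calls == ["Brevity is the soul of wit"]
```

The test verifies the interaction (set was called with the quote) without ever reading the cache's internal array.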
00:16:37.680 Another scenario we might want to test with the cache is when the Shakespeare API fails, prompting the gem to successfully get a quote from the cache and complete the method. This scenario is hard to test for another reason: the user experiences the same outcome regardless of whether the quote comes from the cache or the API.
00:17:01.920 If the Shakespeare API fails and the gem retrieves a quote from the cache, or if the API succeeds and the gem gets the quote directly, my user experience remains identical—there's no indication of where the quote originated.
00:17:22.160 So it's challenging to use the classical method of testing here: verifying state would yield the same result in both cases. Instead, we can leverage a mock to help us verify our gem’s behavior, ensuring that it uses the cache when necessary.
00:17:46.240 In the case when the Shakespeare API fails, we can define the expected behavior of the cache: We anticipate that the get method will be called. In our test, we can ensure that this method was indeed invoked.
00:18:11.840 Conversely, we may also want to ensure that the gem doesn't use the cache when it doesn't need to. We can use RSpec's not_to method to verify that, when the Shakespeare API succeeds, the cache does not receive any unnecessary method calls.
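Both cache checks can be sketched with a fake cache that records which methods were called. In rspec-mocks these would typically be expect(cache).to receive(:get) for the failure case and expect(cache).not_to receive(:get) for the success case. The class names, the keyword arguments, and the quotes below are assumptions for illustration.

```ruby
class ShakespeareError < StandardError; end

# Hypothetical gem entry point: store quotes on success, read the cache on failure.
module Fakespeare
  def self.generate_quote(client:, cache:)
    quote = client.generate_quote
    cache.set(quote)
    quote
  rescue ShakespeareError
    cache.get
  end
end

# Fake cache that records the name of every method called on it.
class FakeCache
  attr_reader :calls

  def initialize(stored_quote)
    @stored_quote = stored_quote
    @calls = []
  end

  def set(_quote)
    @calls << :set
    nil
  end

  def get
    @calls << :get
    @stored_quote
  end
end

class WorkingClient
  def generate_quote
    "Brevity is the soul of wit"
  end
end

class FailingClient
  def generate_quote
    raise ShakespeareError, "API unavailable"
  end
end

# API fails: the gem should fall back to the cache.
failing_cache = FakeCache.new("To thine own self be true")
Fakespeare.generate_quote(client: FailingClient.new, cache: failing_cache)
raise "cache should be read when the API fails" unless failing_cache.calls == [:get]

# API succeeds: the gem should never read from the cache.
working_cache = FakeCache.new("To thine own self be true")
Fakespeare.generate_quote(client: WorkingClient.new, cache: working_cache)
raise "cache should not be read when the API succeeds" if working_cache.calls.include?(:get)
```

From the caller's point of view both runs just return a quote; only the recorded calls reveal which path the gem took.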
00:18:30.960 In this section, we discussed how mocks can assist in testing objects whose state can't or shouldn't be exposed—such as a cache. In doing so, I believe we've completed our framework for deciding whether or not to mock by posing the question: Are we testing an object whose state can't or shouldn't be exposed?
00:19:01.600 Our framework is now complete. When deciding whether or not to mock, we can ask ourselves three questions: First, does our test use expensive resources, such as time, memory, or money? Second, are we testing code that’s non-deterministic? Finally, are we testing an object whose state can’t or shouldn’t be exposed?
00:19:21.680 If the answer to any of these questions is yes, I think that a mock can add real value to our tests, so we should mock away. However, if we answer no to all of these questions, then mocking might not be such a good idea. Yet we shouldn’t despair in this case.
00:19:55.680 If we answer no to these questions but still feel tempted to use a mock, it likely signals that the code we're testing is challenging to verify for some reason. This could indicate that we haven’t sufficiently extracted side effects, or perhaps we’ve written a class that's trying to do too much. Whatever the case, our inclination to mock should prompt us to reconsider the structure of our code.
00:20:27.520 In the words of Shakespeare: 'The fault may not be in our tests but in our code.' Once again, I am Emily Giurleo. If you'd like to discuss this further, please follow me on Twitter and visit my website, emilygiurleo.dev.
00:20:48.240 If you’d like to check out the Fakespeare gem, it exists! It's not very usable, and I'm not sure what you'd use it for, but it was a lot of fun to write, so check it out on my GitHub.
00:21:11.679 I also referenced some fantastic articles while putting this talk together, and I’ll upload these slides on my website along with those article links.
00:21:26.240 Lastly, I want to mention something that's been very near and dear to my heart over the past few months: WNB.rb. It is a community for women and non-binary Rubyists, featuring a monthly meetup and a Slack workspace that now has over 250 members, which is fantastic.
00:21:56.240 Here are some links where you can check us out. Thank you so much for coming to see my talk! Please mock responsibly and enjoy the rest of your RubyConf.