Surgically Refactoring Ruby with Suture

Summarized using AI

Justin Searls • November 10, 2016 • Cincinnati, OH

Surgically Refactoring Ruby with Suture is a talk presented by Justin Searls at RubyConf 2016. The discussion addresses the challenges of refactoring legacy Ruby code, emphasizing the need for systematic approaches to manage this process effectively. Searls introduces Suture, a Ruby gem designed to facilitate safer refactoring by treating code changes like surgical procedures. He articulates the following key points throughout his presentation:

  • Refactoring Challenges: Refactoring legacy code is often perceived as risky and complex, leading to mental discomfort among developers. Searls highlights that the more a piece of code is intertwined with different variables and conditions, the higher the difficulty in modifying it without introducing bugs.

  • Business Perspectives: He points out the difficulties in selling the concept of refactoring to business entities, as it is seen as low priority compared to feature development and bug fixes. This perception often results in a lack of resources allocated for necessary refactoring efforts.

  • Refactoring Tools: Searls reviews existing methods such as characterization testing and A/B testing, which serve to manage risk during code changes. He explains that characterization testing can help encapsulate legacy code behavior, allowing developers to refactor with confidence.

  • Introduction of Suture: The primary focus of his talk is the introduction of the Suture gem, which supports developers in every stage of refactoring, including planning, development, staging, and production. Suture allows for recording and replaying interactions with legacy code to ensure that refactoring does not disrupt existing functionality.

    • Key Features of Suture:
      • Records interactions through a seam in the code.
      • Validates new implementations against recorded data.
      • Provides fallbacks in production environments if issues arise with new code.
      • Allows for smooth deletion once refactoring is complete.

  • Practical Examples: Throughout the talk, Searls provides examples, including a flawed calculator and tally service implementations, to demonstrate how Suture can be applied. He explains how to record calls, validate against expected behaviors, and refactor code while maintaining safety for users.

  • Concluding Thoughts: Searls emphasizes the importance of making changes safe for users and suggests that keeping users unaffected during the refactoring process should remain a primary goal. He encourages attendees to consider how they can contribute to the Ruby community and improve their refactoring practices.

In summary, Suture aims to revolutionize the way developers approach legacy code refactoring by introducing a comprehensive tool that ensures methods remain functional while changes are made, ultimately maintaining the integrity of existing systems.

RubyConf 2016 - Surgically Refactoring Ruby with Suture by Justin Searls

The next feature means changing legacy code. The estimate just says "PAIN" in red ink. Your hands tremble as you claim the card from the wall.

For all of Ruby's breakthroughs, refactoring is as painful as 2004, when Feathers' book released. It's time we Make Refactors Great Again.

That's why I wrote Suture, a gem to help at every step:

  • In development, new code is verified against recorded calls
  • In staging, old & new code is ensured side-by-side
  • In production, unexpected errors fall back to the old code

With renewed confidence and without fear, you grab the card. You've got this.

00:00:15.119 Hi everyone! So this is not a normal thing that happens at RubyConf, but my name is Sam. I'm the track director for the testing track, and I'm actually really pleased Justin is here today because I owe him a huge debt of gratitude.
00:00:22.060 I got very, very ill immediately before RailsConf this year, and I couldn't come to make my talk. Exactly two days before the conference, I texted Justin, who at the time was many time zones away. I said, 'Justin, listen, there isn’t a backup speaker who can give a testing talk in time.'
00:00:34.510 Justin replied, 'All right, I'll do this. That seems fine.' He also asked, 'Why did you reject my talk in the first place, Sam?' So I just want to say a huge thanks to Justin for being here. He's definitely having a little bit of a rough week, as I think many of us are.
00:00:50.080 Could we all just give a huge round of applause to Justin and welcome him to the stage? I was touched! Come on, that was... it was touching. It was not an ironic, pithy statement; those are coming! All right.
00:01:11.200 Let's roll. This talk is called Surgical Refactoring. My real name is Sam Phippen. If you don’t get that joke, you can call me Searls. That’s what my face looks like on the internet. If you have any feedback from this talk, you can reach me at [email protected].
00:01:17.259 That's right, I come from a company called Test Double. The way we work at our company is we actually are consultants that work on existing engineering teams to get a lot of stuff done so that we can create some slack in the system, allowing teams to pay down technical debt and make things better. Our goal in life is to make the world a better place and make software a little less broken for everybody.
00:01:36.150 If your team could use assistance, you can say hello at Test Double, and we'll set up a call to talk. This is exciting! There's a national Ruby conference in Ohio, and I live in Ohio, so welcome to Ohio, everybody! If you live in Ohio, well, welcome to you too.
00:02:00.220 A lot of people who know me know that I live in Columbus, so they expect that I know all about Ohio. Well, I’m actually from Michigan. That’s right, I traded the beautiful, awesome landscapes and lakes that fill up my childhood memories for the rich cultural heritage of Ohio. This is the Circleville Pumpkin Show, and I'm actually not joking about Ohio's culture.
00:02:30.190 I think where I found Ohio culture best is in its comfort food. There's a lot of fun stuff. You go to the Circleville Pumpkin Show, and yes, there are these hilarious, silly pumpkins everywhere, but they also have deep-fried Buckeyes, which are chocolate peanut butter candies, and it’s fantastic.
00:02:55.580 If that’s not your fancy, they also have chocolate-coated chilled pumpkin cheesecake on a stick. Very creative, very intense culture here! But it's not just desserts; no, no, no, no cuisine is safe. One of my favorite dishes in Columbus is Ohio nachos. They’re kettle chips covered in queso and sprinkled with breakfast sausage.
00:03:15.700 It's very hashtag health, but it's not because we're into food; it’s because we’ve got great ingredients. As far as I know, all we do is grow corn and animals that eat the corn. So it doesn’t really matter about the ingredients because we're just going to deep-fry it anyway! This is a grilled cheese sandwich that contains fried cream cheese jalapeno poppers on the inside.
00:03:38.770 And then once they build the whole thing, they dump the sandwich in pancake batter and deep-fry that, serving it as a Monte Cristo. So that’s one of my favorites. That’s an example of American exceptionalism right there!
00:04:00.580 Another thing about Ohio culture that I've learned since moving here is that Ohioans are really competitive, especially when it comes to food. You know, if there’s food and a stopwatch, someone’s going to find a way to turn it into a contest. It’s something I have learned to accommodate in my life.
00:04:29.760 So Monday this week, I was just feeling like I wanted a good sandwich, so I thought I’d go to the neighborhood deli. I've got a lot of awesome restaurants; they’re all in strip malls, and they all have really generic names, but they're really good. So I decided to go to my neighborhood deli, which is of course called Neighbors Deli.
00:04:41.880 I walked in and I just noticed for the first time that they actually have a competitive challenge sandwich that you can buy. You get your money back if you eat the whole thing in 20 minutes. I was like, 'Well, that’s interesting,’ but honestly, I’m better than that; competitive eating is wasteful. I used to have weight issues, so that’s not something I’m going to do. I'm just going to get a normal little hot pastrami sandwich.
00:05:07.560 But as I looked through the list, I was like, 'Well, you know, they’ve got it. I could sample all the different types of meats.' Then I noticed it, the model with… and I know a few things about picking apart monoliths, so I thought maybe I could do this. And then I thought about all of you today, and I didn’t want to let you down by skipping this challenge.
00:05:40.670 So, this is what got served to me. Zooming out a little bit, like most monoliths with which we are familiar, it is falling over on itself and was not exactly built to spec. There was like a pound and a half of corned beef at the bottom, all for less than a grilled cheese in San Francisco.
00:06:01.120 So what did I have to lose? Of course, I looked at that thing and I was like, ‘Nope, I’m out.’ But maybe I picked up some Ohio culture because the waitress said, 'All right, 20 minutes!' and I was like, 'Yeah, I got this.' So I dug into my sandwich.
00:06:17.360 As developers, we're familiar with doing stupid things under extreme time pressure all the time. I thought, '20 minutes? How do I...' so I made a huge mess. It was disgusting, it didn’t even taste that good, and I was feeling sick. I knew if I pushed any harder, I would still fail and just be sicker, so I decided to quit.
00:06:40.120 Then there was this elderly lady who had come in and she was just watching me. She said, 'You can do it!' in that sweet, grandmotherly tone. I said, 'I really can’t, and I shouldn’t. Why are you rooting me on? This is unhealthy!' And she said, 'We believe in you!' She was really serious at that point. I thought, 'Well, I give up.'
00:07:03.729 I don’t know if that is also part of Ohio culture—just the sort of unjustified faith in others. Sam mentioned it was a rough week for a lot of us, and the reason that I participate in these conferences is that I do believe in all of you, and I believe that you can do great things. So I just wanted to say that before we got into stuff.
00:07:29.080 Anyway, this talk is about failing to conquer monoliths, so I propose nothing. First, let’s back up and talk about some context.
00:07:40.569 I love Ruby. Ruby is obviously a super successful, awesome language. If you think about what made Ruby successful in the early stages, everyone was really happy. People who were building gems did so just for fun, just for the accolades and attention gained from being associated with Ruby.
00:07:56.810 The thing about languages is that early days of success are determined by your ability to make it easy to create new things. Gems need to exist; you need to attract people to the ecosystem, and it needs to be easy to learn and pick up. Ruby was great at that!
00:08:14.250 But later success is fundamentally different. We're 20 years in now, right? After more than ten years of Rails, people are more critical. It's an incumbent; it's not the new shiny thing anymore. People use it at work, and they need these systems to be long-term maintainable. It’s a very different mindset, isn't it?
00:08:32.810 Later success for languages, like Java, is based on your ability to maintain old code. I really don't feel like Ruby has ever excelled at that. So my challenge in writing this talk was to ask myself if there is anything we can do as a community to make it easier to maintain old Ruby code.
00:08:52.730 To pull that thread, I thought let's refactor some legacy code. If you're here to talk about refactoring, you probably know that refactoring is hard. I think refactoring legacy code is particularly hard.
00:09:11.709 What makes it really hard is that it's easy to accidentally break unrelated functionality because there are so many variables, branches, and complexities all tangled up. As a result, most of us view legacy code refactoring as a mentally unsafe thing to do and not fun.
00:09:33.100 Additionally, refactors are hard to sell to people. The way to visualize this is with a two-axis graph: business priority on the vertical axis and cost and risk of implementation on the horizontal axis. In the top right you could place new feature development; it's very important but also expensive. In the top left, you have bug fixes, which are also important but relatively less expensive.
00:09:59.200 In the bottom left, I’d probably put testing. It's certainly important to us but also not so expensive that the business doesn't allow us to do it. But what goes in the bottom right? If I had to put something there, I’d put refactoring. Refactoring is very expensive and has nebulous business priority. We don't often have to sell our businesses on letting us build new features.
00:10:33.890 That’s probably why they pay us a salary in the first place. It's not hard to sell them on bug fixes; testing used to be hard to sell, but it's become normalized culturally in software. So typically, we're afforded time to do it, but it's still really hard to sell people on refactoring and on habitually paying down the technical debt in our projects.
00:11:05.070 When you think about why it’s hard, it’s because we can't predict how long a refactor will take us. We also define refactoring as not changing observable behavior. So from the business's perspective, if you spend a month refactoring something, they can't tell whether you were fleecing them or just playing video games. It takes a lot of trust.
00:11:23.150 Additionally, areas requiring the most refactoring tend to be tangled up, so when we work there, it's not safe for others to be working too. We have to stop everything to merge it in; otherwise, we'll have all sorts of merge conflicts, making it very disruptive to do a lot of refactoring.
00:11:48.400 And you notice that complexity correlates with importance. The more complexity there is in any bit of code—the more branches, conditions, and everything else—it didn’t get there by accident. It was important for the business to cover all those cases, but those things needing the most refactoring are also the things we’re most afraid to change.
00:12:05.920 So if you think about it, because it's low priority, it's hard to sell. What could we do to make refactoring a better sell to the business? That’s the first thing we should think about: how could we raise the priority? In their minds, you know, refactoring feels like road construction.
00:12:34.810 We’re telling them they’re going to get less of what they need more slowly, while the money continues to fly out the door at the same velocity it normally does. We have a few strategies for dealing with this, none of them good.
00:12:54.680 First, we can try to scare them. We might say, 'Hey, if we don’t refactor, then someday, we’ll need to rewrite everything.' That’s too nebulous, or perhaps we’ll just claim maintenance costs will be higher. However, that doesn’t help because it’s hard to quantify.
00:13:14.120 Next, people try to absorb the cost of refactoring into their development efforts. For instance, in this pie chart, we might spend some time planning, some time developing, and some time testing. The team could agree that for every card, we'll grow the pie and add habitual refactoring to every story.
00:13:42.020 That would be fantastic, except it requires extreme discipline, which probably means it won’t scale and doesn’t work on every team. Additionally, if the team is under any kind of time pressure, which most teams are, it’s usually the first practice that goes out the window. So I don’t think that’s going to be successful.
00:14:07.440 The most common thing I see as a consultant is the strategy of taking hostages. The business might say, 'Hey, I’ve got features one, two, three, and four, and I want them in that order.' We then respond, 'Oh no, you’re not going to get feature two until we pay down this technical debt. You're not going to get feature three until we pay down that technical debt.'
00:14:35.689 I dislike this approach because it is adversarial. It blames the business for having rushed us in the first place. Moreover, did you know that software developers are highly paid and expensive to businesses? So it erodes their trust in us.
00:14:58.649 If we tell them that this thing we built six months ago was shoddy junk and we need to go fix it, we risk them eventually seeking out new developers. Hence, refactoring is hard to sell. This is not a talk about figuring out how to solve that problem because I haven't yet.
00:15:24.440 I think there’s a lot we can do, but a lot of it is cultural. So let’s just give up on that for now and talk about the other axis: cost and risk. Why is it so costly and risky? Well, from a developer's perspective, there's a lot of pressure to do refactoring right.
00:15:50.029 You have to keep track of a lot of things in your head; it’s really scary! The darkest, dankest basement of the codebase creates pressure because getting any allowance to spend time on this stuff can be difficult, which we already discussed.
00:16:11.480 Furthermore, the tooling isn’t that great. Most open-source tooling and libraries are written by people who don’t want to think about the legacy mess they have. Their focus is on creating new stuff, which is where most of our attention goes. We don’t think of it as 80% of our job, even though it probably is.
00:16:33.270 So the tools aren’t that great, making refactoring feel even scarier. If I'm on any kind of mission, it’s to identify all the scariest aspects of dealing with the complexity of software development and somehow make them less frightening so that I can be productive.
00:16:46.110 If you're on board with that message, I think you should buy my book! However, I want to clarify that I'm way too afraid to write a book. That book does not exist. Let's talk about what we can do to make refactoring a little less costly.
00:17:05.940 The first tool we already have is the book refactoring patterns. They define operations like extract method, pull up, push down, or split view patterns. These are safe operations we can perform in our code.
00:17:17.410 These operations become even safer when we have good tools built on language introspection and static analysis. My favorite aspect of using the Eclipse IDE with Java is that I have this right-click menu where I can perform various refactorings, and I'm basically guaranteed safety because all the references can be matched.
00:17:37.920 I can't do that in Ruby but even if I could, these sorts of operations aren't expressive enough to radically redesign the code.
00:17:46.770 Characterization testing, a technique for making refactors safer, was pioneered by Michael Feathers in his seminal book 'Working Effectively with Legacy Code,' published in 2004. The basic takeaway is to treat your legacy code like a black box.
00:18:08.840 Put a little test harness around it. Write a test just for that black box of code. Pass in arguments and listen to the results, locking them with assertions. Do this repeatedly to cover all anticipated cases and crystallize the current behavior of the code.
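To make the harness idea concrete, here is a minimal sketch of a characterization check in plain Ruby. The `legacy_shipping_cost` method and the `CHARACTERIZED` table are hypothetical stand-ins invented for illustration; the expected values were obtained by running the code, not from any specification.

```ruby
# Hypothetical legacy code that we treat as a black box.
def legacy_shipping_cost(weight_kg, express)
  cost = weight_kg * 2
  cost += 10 if express
  cost = 5 if cost < 5
  cost
end

# Characterization cases: inputs paired with whatever the code returns today.
CHARACTERIZED = {
  [0, false] => 5,
  [1, false] => 5,
  [4, false] => 8,
  [1, true]  => 12,
}

# Lock the current behavior in place with assertions.
CHARACTERIZED.each do |args, expected|
  actual = legacy_shipping_cost(*args)
  raise "behavior changed for #{args.inspect}: got #{actual}" unless actual == expected
end
```

Each entry pins down what the code does now, whether or not that behavior is what anyone intended.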
00:18:28.400 Once you have that, the black box becomes transparent. You can be as aggressive as you wish; you could delete everything if you want to. Refactor into new units that you do understand and can change with confidence, then backfill those with unit tests that actually express how the system should work.
00:18:51.150 However, this requires significant testing and work, which can be a considerable upfront commitment. After all this effort, the next step is to blow away those characterization tests, since they only pin down what the code currently does and don’t express how the system is supposed to work.
00:19:05.540 If you retain them, they may become an albatross holding you back and increasing the carrying cost of the code. Yet if your team has a lot of legacy code, you probably don't have a lot of code coverage.
00:19:20.890 It’s challenging to let go of characterization tests because you just saw your code coverage statistic rise, and now Justin's telling you to delete them; your coverage will drop again, which feels depressing.
00:19:43.170 Another trap I’ve noticed with teams trying to use characterization testing is that they often only half-finish this process. You end up with a collection of characterization tests that you come to rely on but don’t actually follow through to fix anything, adding to the nine-hour build process of semi-quasi integrated tests.
00:20:09.870 So, while characterization testing can be very helpful, it’s not a perfect solution. Another approach is akin to A/B testing, or perhaps we’ve adapted it from A/B testing.
00:20:30.080 You write new implementation alongside the old code and put a router in front, allowing, say, 20% of traffic to go to the new code and 80% to go to the old code, limiting exposure to any problems the new code may introduce. GitHub has written a gem called Scientist, which is akin to this experimental activity.
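The router described above could be sketched roughly as follows. `PercentRouter` is a hypothetical class written for this illustration; it is not the API of Scientist or of any other gem. The random source is injected so that the split can be tested deterministically.

```ruby
# Hypothetical router: sends roughly `percent_new` percent of calls to the new code.
class PercentRouter
  def initialize(old:, new:, percent_new:, random: Random.new)
    @old = old
    @new = new
    @percent_new = percent_new
    @random = random
  end

  def call(*args)
    # rand(100) yields an integer from 0 to 99.
    target = @random.rand(100) < @percent_new ? @new : @old
    target.call(*args)
  end
end

old_add = ->(a, b) { a + b }
new_add = ->(a, b) { [a, b].sum }

router = PercentRouter.new(old: old_add, new: new_add, percent_new: 20)
router.call(2, 8) # 10, whichever path handled the call
```

A sketch like this only limits exposure; it does nothing to tell you whether the new path is correct, which is why the monitoring mentioned next matters.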
00:21:08.710 The approach is great, but you do need sophisticated monitoring and analysis to understand how users are interacting with the new code path. Furthermore, your business domain needs to be one where it's safe for some users to have a subpar experience.
00:21:26.790 For instance, GitHub experiences occasional outages, which is acceptable. But in cases involving financial transactions or healthcare, that wouldn't be appropriate. So, if you find yourself in such a business environment, this approach could be valuable.
00:21:48.970 If we look at a spectrum, with characterization testing on the left and A/B experiments on the right, you can see a divide. 'Working Effectively with Legacy Code' is great for development, if a little painful, and for testing, but it has almost no advice for staging or production.
00:22:12.240 Something like Scientist or the A/B testing approach doesn't cover how to develop the new piece or test it locally. However, it might be very useful in a staging environment where you can experiment and see how things are working. It might feel overwhelming in a production environment, but it answers those questions much better.
00:22:36.490 When I submitted the abstract for this talk, I wondered: what if one tool could offer a solid development story, testing story, staging story, and production story—carrying me through the entire lifecycle of a refactor? Because I’m not scared of just any one of those stages; I’m scared of all of them when it comes to a major refactor.
00:23:06.080 Thus, I did a lot of research, thought about it, and, after nine months passed, I thought, 'Oh crap, I need to give a talk on this.' I could give a standard Justin Searls talk with 700 snarky slides, but I had this cool idea: I could write a Ruby gem that helped people, so I did.
00:23:28.310 Instead of just writing snarky slides, I practiced something you might be familiar with: TDD, or Talk Driven Development. It involves submitting abstracts that commit you to massive amounts of work. Thus, at the other end of this, we have a gem called Suture, available on GitHub under Test Double.
00:23:49.290 The metaphor here is that refactors can be treated like surgeries. Like refactors, surgeries try to resolve intractable problems and make us feel better. They require careful upfront planning and use different tools in different contexts, much as we want different tooling in development, testing, staging, and production.
00:24:10.530 Surgeries follow clear processes, not arbitrarily, but because there's so much variation in circumstances; following a clear process helps reveal what makes a particular situation unique. They also plan for long-term observation. While the patient is under the knife, people keep a close eye on everything.
00:24:34.020 Follow-up check-ups then take a step back, and after that come years of lower-resolution measurements confirming that everything is okay and the operation was successful. Of course, just like surgeries, refactoring can get messy!
00:24:50.660 Suture functions through nine features we’ll discuss, each designed to help you through the refactoring workflow. The first step is to plan the refactor and then cut what we call a seam in the code, which is a call site to the legacy code.
00:25:10.740 We then record all the interactions that pass through that seam—the arguments that were passed in and the results that are returned. We validate those recordings against the old code, ensuring we can replay them back, confirming the recordings are valid.
00:25:30.300 Only then can we refactor as aggressively as we like into a new implementation, which we’ll verify by replaying the same recordings against it. So, locally, we’re pretty confident. Once we reach staging, we run the code's critical path through both the old and new implementations side by side.
00:25:49.960 If the two sides behave differently, we raise an error explaining what just happened. We can use the same seam in production, configured so that if anything unexpectedly goes wrong in the new path, we fall back to the old path, preventing interruptions for users caused by our mistaken or buggy refactors.
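The staging and production behaviors can be illustrated with two small helpers. This is a hand-rolled sketch, not Suture's implementation: `call_both`, `with_fallback`, and `MismatchError` are hypothetical names, and the sketch assumes pure callables (with mutating code, the arguments would need to be copied before each call).

```ruby
class MismatchError < StandardError; end

# Staging-style: run both paths, raise if the results differ, return the new result.
def call_both(old:, new:, args:)
  old_result = old.call(*args)
  new_result = new.call(*args)
  unless old_result == new_result
    raise MismatchError,
          "old returned #{old_result.inspect}, new returned #{new_result.inspect} for #{args.inspect}"
  end
  new_result
end

# Production-style: prefer the new path, fall back to the old one on an unexpected error.
def with_fallback(old:, new:, args:)
  new.call(*args)
rescue StandardError
  old.call(*args)
end

old_add = ->(a, b) { a + b }
fragile_add = ->(a, b) { raise ArgumentError, "negative operand" if b < 0; a + b }

call_both(old: old_add, new: fragile_add, args: [2, 8])       # => 10
with_fallback(old: old_add, new: fragile_add, args: [2, -8])  # => -6, served by the old path
```

The trade-off in the fallback mode is that errors in the new path are hidden from users, so they need to be logged somewhere or they will never get fixed.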
00:26:09.740 Finally, when we’re confident everything’s correctly executed, Suture is designed to be deleted. We will extract it and direct everything to the new code path, subsequently marking the refactor complete.
00:26:30.990 In this discussion, we’ll explore two example bug fixes. First, there’s a simple calculator service; forgive the contrived nature, but all of these examples are for clarity. This calculator service is supposed to add numbers, but it doesn’t add negative numbers correctly.
00:27:00.060 It serves as an example of a pure function—pure functions are always easier to deal with. You pass in arguments, receive a return value. Here, we instantiate a new calculator and call add with a left and right operand.
00:27:20.210 If you look at the implementation of this add method, it's defined to simply add one to the left operand—the bug lies here! We’re always adding and not properly accounting for subtraction. You might be thinking this code is really ugly. Well, guess what? Your legacy code can be ugly too, so deal with it!
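The slide itself is not reproduced in this transcript, so the following is a guess at the shape of the bug: it assumes `add` increments the left operand once per unit of the right operand, which silently ignores negative right operands.

```ruby
class Calculator
  # Adds by incrementing `left` once per unit of `right`.
  # Integer#times runs zero iterations for a negative receiver,
  # so a negative `right` is silently ignored.
  def add(left, right)
    right.times { left += 1 }
    left
  end
end

calc = Calculator.new
calc.add(2, 8)   # => 10
calc.add(2, -8)  # => 2, although -6 is the correct sum
```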
00:27:37.380 So our seam is obviously at the call to add. Later, we’ll introduce Suture there. The second bug involves a tally service, which has state—meaning it has side effects. It doesn't correctly handle odd numbers. This we’ll call the mutation case.
00:28:11.330 If you look at this implementation, it initializes a new calculator, loops through a number of parameters, and for each, it calls tally on that number. Lastly, it assigns the result to the total.
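A sketch of what such a call site might look like follows. The body of `tally` is invented for illustration (the talk only says that odd numbers are mishandled), and `nums` stands in for the request parameters.

```ruby
class Calculator
  attr_reader :total

  def initialize
    @total = 0
  end

  # Hypothetical bug: integer division rounds odd numbers down to the
  # nearest even number before adding them to the running total.
  def tally(n)
    @total += (n / 2) * 2
  end
end

# The call site: build a calculator, tally each number, read the total.
nums = [4, 3]
calc = Calculator.new
nums.each { |n| calc.tally(n) }
result = calc.total # => 6, although 7 is the correct sum
```

Note that the value we care about is not the return value of a single call but the state left behind on `calc`.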
00:28:27.260 This seems more complex, and we need to figure out how to cut this. As we examine the pure function again, a pattern will emerge: Pure functions are always easier to work with. TL;DR: More pure functions can save you headaches because they are less likely to bite back.
00:28:42.400 So to refactor, we replace the existing add call with `Suture.create`, naming the seam something like 'add'. We pass the arguments in as an array and point the old code path at the existing method. The old path can be any callable, like a Proc or a custom class; it doesn't matter.
00:29:02.540 Initially, this setup is a no-op, and I run the code to ensure it continues to function as intended. It should call through by default. For the mutation case, we’ll cut that call as well.
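The following sketch shows the idea of a no-op seam without depending on the gem. `MiniSeam` is a hypothetical stand-in written for this illustration. Based on the description in the talk, the gem's own call looks roughly like `Suture.create(:add, old: calc.method(:add), args: [left, right])`, but the exact option names should be checked against the Suture README.

```ruby
# Hypothetical stand-in for a seam: with only an old path configured,
# it simply calls through, so behavior is unchanged.
module MiniSeam
  def self.create(name, old:, args:)
    old.call(*args)
  end
end

class Calculator
  def add(left, right)
    right.times { left += 1 }
    left
  end
end

calc = Calculator.new
# Before: result = calc.add(2, 8)
result = MiniSeam.create(:add, old: calc.method(:add), args: [2, 8]) # => 10
```

Because `old` only has to respond to `call`, a bound method, a lambda, or any custom object works equally well.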
00:29:23.600 This is a bit more complex. We’ll replace the call to tally with `Suture.create`, passing in calc and n as arguments. You might wonder why it’s organized this way, since calc isn’t an argument to tally.
00:29:41.210 So let’s talk about designing seams when we don’t have a pure function and side effects are involved. A pure function behaves like a predictable black box: if we call add with 2 and 8, we’ll always get 10. Call it again, and the result stays consistent.
00:30:11.230 However, mutation and side effects are trickier. If we call tally with 4, it returns 4, but calling tally again with 4 gives us 8. So, while this isn’t an argument per se from a language standpoint, it’s effectively an influence from the state held within the calculator.
00:30:33.400 What we can do is logically treat the calculator itself, with its total ivar at zero, as the first parameter. This lets us force repeatable inputs and outputs.
00:30:52.320 We can't simply delegate to tally, so we have to structure a custom proc that will accept the calculator's state along with the number. We're going to call tally with that number and return the total.
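Here is a minimal sketch of that seam design. The `Calculator` below is a simplified, bug-free stand-in, and the optional `total` constructor argument is an assumption added so the starting state can be set explicitly.

```ruby
class Calculator
  attr_reader :total

  def initialize(total = 0)
    @total = total
  end

  def tally(n)
    @total += n
  end
end

# Without the state as an input, the same call gives different answers.
calc = Calculator.new
calc.tally(4) # => 4
calc.tally(4) # => 8

# Seam callable: the calculator's state and the number go in,
# and the resulting total comes out.
tally_seam = lambda do |calculator, n|
  calculator.tally(n)
  calculator.total
end

tally_seam.call(Calculator.new(0), 4) # => 4
tally_seam.call(Calculator.new(0), 4) # => 4, repeatable for the same inputs
```

Treating the receiver's state as an explicit argument is what makes recording and replaying meaningful for code with side effects.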
00:31:08.530 Now, it’s meaningful and useful to establish recordings for these calls. With the pure function, you merely need to set the `record_calls` option to true, and it'll capture every call made to that seam.
00:31:26.200 Almost every option can be set using environment variables so if you run Suture in a deployed environment, you won’t need to alter the source code. You could create a controller in a Rails console, set some parameters, and cover both the happy paths and sad cases.
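The environment-variable approach can be sketched in a few lines. The variable name `MINI_SEAM_RECORD_CALLS` is hypothetical and chosen for illustration; the gem's documentation lists the real names.

```ruby
# Reads a recording toggle from the environment so that deployed code
# can be switched into recording mode without a source change.
# The environment is injectable to keep the helper easy to test.
def record_calls?(env = ENV)
  env.fetch("MINI_SEAM_RECORD_CALLS", "false") == "true"
end

record_calls?({ "MINI_SEAM_RECORD_CALLS" => "true" }) # => true
record_calls?({})                                     # => false
```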
00:31:46.870 You could also click on options in the browser to invoke code or even run this in production and pull snapshots down later if you’d like. That seems safe; I think it would work. But I haven’t done it.
00:32:08.320 For the mutation case, you simply add `record_calls` as well; there's no extra complexity here. You just need to pass in some numbers, ensuring you cover all cases. So, where do the recordings get stored? Suture actually instantiates a SQLite database on demand.
00:32:29.340 You can set this path wherever you like, and Suture will dump all recordings there, creating the database if it is not already present. It uses `Marshal.dump`; Marshal is built into Ruby and converts Ruby objects into byte strings, which are easy to persist and rehydrate.
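The persistence step can be illustrated with `Marshal` alone. This sketch swaps the SQLite database for a flat file to stay dependency-free, so it shows only the serialize-and-rehydrate idea, not how Suture stores its rows. `StoredCall`, `save_recordings`, and `load_recordings` are hypothetical names.

```ruby
require "tempfile"

StoredCall = Struct.new(:name, :args, :result)

# Serialize the whole list of recordings to a byte string and write it out.
def save_recordings(path, recordings)
  File.binwrite(path, Marshal.dump(recordings))
end

# Rehydrate recordings; a missing or empty file means nothing was recorded yet.
def load_recordings(path)
  return [] unless File.exist?(path) && File.size(path) > 0
  Marshal.load(File.binread(path))
end

file = Tempfile.new("recordings")
recordings = load_recordings(file.path)
recordings << StoredCall.new(:add, [2, 8], 10)
save_recordings(file.path, recordings)

restored = load_recordings(file.path)
restored.first.result # => 10
file.close!
```

One caveat worth knowing: `Marshal.load` can instantiate arbitrary objects, so it should only be used on data your own system wrote.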
00:32:53.050 At this point, you may be wondering if this works with Active Record objects. I had been writing Suture for about ten days when I thought I should check whether it interacts well with real components of legacy code, like Rails.
00:33:11.560 So I looked at the Gilded Rose kata, a cool exercise that Jim Weirich shared in a repository. You can read it, but you’ll notice the same flow: this particular code takes in an item and calls update_quality, and I configure the lambda to return the mutated item.
00:33:37.170 Next, I create a few items that are significant for the purpose of our refactor activity. They're listed, and update_quality is where the critical path is. When I click, stuff happens, and then I check my SQLite database to see that recordings were successfully created.
00:33:55.020 I rehydrate them, ensure they’re good, and click again to capture even more. So yes, that all works on Rails. I've tested it, and if you're interested, you can find a Rails example in the example directory of the repo.
00:34:13.050 Next, we need to validate that these recordings can be played back in a test environment. In the case of the pure function, we simply write a test. Here, I create a calculator and use the second API in Suture, `Suture.verify`, to verify the recordings.
00:34:36.580 We name it 'add' so that it looks up the right recordings in its database; the name has to match the one used at the production call site. The subject is whatever callable we want to test. Once that’s done, Suture will verify every recorded argument set against the recorded result, whether it was a return value or an exception raised.
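The replay step might be sketched like this. It is an illustration of the idea, not the gem's code: `RecordedCall` and this top-level `verify` helper are hypothetical, and the recordings are built inline rather than loaded from a database.

```ruby
RecordedCall = Struct.new(:args, :result, :error_class)

# Replays every recording against `subject` and returns the mismatches.
# A recording expects either a return value or a raised error class.
def verify(recordings, subject:)
  recordings.each_with_object([]) do |rec, failures|
    actual =
      begin
        [:returned, subject.call(*rec.args)]
      rescue StandardError => e
        [:raised, e.class]
      end
    expected = rec.error_class ? [:raised, rec.error_class] : [:returned, rec.result]
    failures << { args: rec.args, expected: expected, actual: actual } unless actual == expected
  end
end

recordings = [
  RecordedCall.new([8, 2], 4, nil),
  RecordedCall.new([1, 0], nil, ZeroDivisionError),
]

old_divide = ->(a, b) { a / b }
new_divide = ->(a, b) { b.zero? ? 0 : a / b }

verify(recordings, subject: old_divide)      # => []
verify(recordings, subject: new_divide).size # => 1, the zero case no longer raises
```

Treating a raised exception as a recorded outcome matters, because legacy callers may depend on the error as much as on the return value.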
00:34:58.170 You can imagine that we effectively just obtained a collection of disposable characterization tests. We only need to record them. Furthermore, unlike traditional characterization tests, we don’t feel compelled to hold onto them forever since they don’t feel like tests; they're just rows in a database.
00:35:19.960 In the mutation case, it’s a similar process. We call verify with tally and create a lambda that mimics the production behavior. It should precisely match up with the expected functionality.
00:35:42.330 And one aspect I find fun yet challenging is that I’m not a big fan of code coverage as I’ve seen it abused on many teams, often used as a whipping metric. However, in this case, code coverage becomes a useful guide for our recording activities.
00:36:04.920 For instance, for the Gilded Rose kata in Jim Weirich's example repo, he wrote a characterization test, all done with RSpec. Even though RSpec is terse, he had to write about 240 lines of custom testing setup. I assume that after completing this, he wasn’t in the mood to rewrite those tests into isolated ones.
00:36:29.300 But that’s not ideal. In contrast, with Suture, I can achieve all that testing and cover the same behavior using a single test: `Suture.verify` checks the return value of an item post-processing, all captured in a straightforward manner.
00:36:46.480 It captures the item before the call as well as the result after calling it, ensuring every row captures the delta. I also added an option, `fail_fast: true`, for when I expect all recordings to pass, so if any failure occurs, it doesn't proceed wastefully.
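The before-and-after capture and the fail-fast behavior described here can be sketched with plain Ruby. Everything below is illustrative: `KataItem`, `Row`, `snapshot`, `record`, and `replay` are hypothetical names, the legacy lambda is a toy rather than the kata's real `update_quality`, and the `fail_fast:` keyword only mirrors the option mentioned in the talk.

```ruby
# Hypothetical item and row types for the sketch.
KataItem = Struct.new(:name, :sell_in, :quality)
Row      = Struct.new(:args_before, :result)

# Deep-copy via Marshal so later mutation cannot alter the snapshot.
def snapshot(obj)
  Marshal.load(Marshal.dump(obj))
end

# Record the arguments as they looked before the call and the result after it.
def record(rows, callable, *args)
  before = snapshot(args)
  result = callable.call(*args)
  rows << Row.new(before, snapshot(result))
  result
end

# Replay each row on a fresh copy of the "before" state. With fail_fast,
# stop at the first mismatch instead of replaying everything.
def replay(rows, callable, fail_fast: false)
  failures = []
  rows.each do |row|
    actual = callable.call(*snapshot(row.args_before))
    next if actual == row.result
    failures << row
    break if fail_fast
  end
  failures
end

# Toy legacy code that mutates its argument and returns it, so the
# recording captures the change.
legacy = lambda do |item|
  item.quality -= 1 if item.quality > 0
  item.sell_in -= 1
  item
end

rows = []
record(rows, legacy, KataItem.new("Elixir", 5, 7))
record(rows, legacy, KataItem.new("Elixir", 0, 0))
puts replay(rows, legacy, fail_fast: true).empty?
```

Snapshotting before the call matters for mutating code: without the copy, the "before" state stored in the row would silently become the "after" state.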
00:37:08.680 Before I call it done, I run a simple coverage report to see everything is covered. If something isn’t covered, I won’t need to write another test; I could just visit the website, interact with it, and cover that functionality quickly, running the coverage check again afterward.
00:37:22.390 At this point, we can finally refactor. You came here for a refactoring talk, so let's explore this secret: I must confess, I’m not the best at refactoring. I only know that when dealing with scary code, I tend to hold my breath.
00:37:43.470 This revelation you now know about me is a secret! Sandy and Katrina from our community authored a book, '99 Bottles,' that emphasizes refactoring, and if you want to delve deeper into effective strategies, I'd strongly recommend checking it out.
00:38:02.600 In the pure function case, this refactor demonstrates an understanding of the issue: it doesn’t work for negative values. So, we’re going to create a whole new method so we can call both independently without changing the existing one.
00:38:24.560 We will implement something clever, returning 'left' if 'right' is less than zero. Essentially, I just re-implemented the bug, marking it with a comment to fix it, because I want to retain the current behavior exactly, bugs and all. This may seem counterintuitive, but remember that refactoring is about preparation for future fixes.
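A minimal sketch of what such a bug-compatible method might look like follows. The body of the legacy `add` is an assumption made for illustration (a count-up loop that never runs for a negative `right`); only the guard clause in `new_add` comes directly from the description above.

```ruby
class Calculator
  # Assumed shape of the legacy method: it counts up one step at a time,
  # so a negative `right` never enters the loop and `left` comes back unchanged.
  def add(left, right)
    right.times { left += 1 }
    left
  end

  # New path, deliberately bug-compatible with the legacy one.
  def new_add(left, right)
    return left if right < 0 # FIXME: legacy bug preserved on purpose; fix it separately
    left + right
  end
end

calc = Calculator.new
puts calc.add(5, -2) == calc.new_add(5, -2)
```

Keeping both methods side by side means either can be called independently while the recordings are replayed against the new one.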
00:38:49.230 We won't implement the fix right away, as it might be arrogant to rush in without considering that a higher-order caller might depend on its buggy functionality in ways we don't anticipate. I'm all about taking my time before actually implementing the fix.
00:39:12.670 Now, we need to backfill with real unit tests, including a simple test to add two numbers. Next, we use a skip-pending test for the odd one, ensuring we don’t forget to fix the bug later.
00:39:33.270 For the mutation case, we’ll mimic the behavior while ignoring odd values. I will ignore the odd cases, creating a method where we instantiate the same ivar, add, and return, to still reflect the existing behavior while preparing to implement the fix.
00:39:55.670 We’ll include a guard clause at the top to check if the value is odd, and we’ll add a comment to denote that. I believe it’s vital to consider the phrasing of this infamous quote: 'Make the change easy,' followed by 'This may be hard,' and then 'Make the easy change.'
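The mutation-case refactor might look roughly like the sketch below. The legacy `tally` body is an assumption for illustration, and the class is named `TallyCalculator` only to keep the example separate; the guard clause, the lazily initialized instance variable, and the bare return follow the description above.

```ruby
class TallyCalculator
  attr_reader :total

  # Assumed legacy behavior: even numbers are added to a running total,
  # odd numbers are silently ignored, and nothing useful is returned.
  def tally(n)
    @total ||= 0
    n.downto(0) do |i|
      @total += i * 2 if i * 2 == n
    end
    nil
  end

  # New path that preserves the observable behavior for the inputs compared below.
  def new_tally(n)
    return if n.odd? # FIXME: legacy behavior skips odd values; preserved for now
    @total ||= 0
    @total += n
    nil
  end
end

old_calc = TallyCalculator.new
new_calc = TallyCalculator.new
[4, 3, 0, 10, 7].each do |n|
  old_calc.tally(n)
  new_calc.new_tally(n)
end
puts old_calc.total == new_calc.total
```

This sketch only claims equivalence for non-negative inputs where at least one even value has been tallied. Differences outside that range are the kind of thing a replay against real recordings is meant to surface.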
00:40:12.880 The mindset we adopt when approaching this process matters! So, I'm implementing unit tests to check the tally, ensuring calculations add to two while skipping odd values. Now, having refactored the code, we need to verify if the new paths play back against the original recordings.
00:40:36.460 In the case of the pure function, we will create a calculator and call `Suture.verify`, passing in the subject of our calculations to check that the new 'add' method matches the recordings.
00:40:58.950 The mistakes I made initially by calling the old method were corrected to ensure proper interactions happen. Yet the mutation case, which is more complicated, requires calling tally again with its new implementation.
00:41:21.700 During this process, an error comes up, and that is worth dwelling on. Jim Weirich once told me that excellent error messages are one of the most important aspects of any library, as they guide users through resolving issues.
00:41:41.630 So when a failure occurs, we’ll have detailed context available through our recordings, listing every failure in the run, including expected versus actual return values. Throughout this process, capture as much detail as possible.
00:42:00.710 In addition, clients can run just failed tests by setting debug flags, allowing them to examine individual errors to discern whether the issue lies within the argument comparison or results mismatch.
00:42:28.830 Suture ships with default comparators; by employing equality checks, it handles comparisons. If there are issues, we simplify things by passing custom comparators where needed.
00:42:50.990 Should we need to compare calculators, we write logic to check if totals match, bypassing unnecessary comparisons of trivial attributes. Utilizing classes in Ruby can facilitate flexibility, providing an inherent structure.
00:43:10.860 When you implement your comparison as a class, it’s straightforward to return true if the default comparison succeeds, or fallback on your custom logic if needed. Suture allows us to customize these behaviors smoothly.
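The class-based comparator pattern described here can be sketched in plain Ruby. None of these classes are Suture's own: `DefaultComparator`, `Ledger`, and `LedgerComparator` are hypothetical names, and the default shown (equality, then a comparison of marshalled bytes) is an assumption chosen for illustration.

```ruby
# Hypothetical default: plain equality, falling back to comparing the
# marshalled bytes for objects that do not define a useful ==.
class DefaultComparator
  def call(recorded, actual)
    recorded == actual || Marshal.dump(recorded) == Marshal.dump(actual)
  end
end

# Stand-in for the talk's calculator; it defines no custom ==.
class Ledger
  attr_reader :total, :last_updated

  def initialize(total, last_updated)
    @total = total
    @last_updated = last_updated
  end
end

# Custom comparator: accept whatever the default accepts; otherwise
# compare only the attribute that matters (the total).
class LedgerComparator < DefaultComparator
  def call(recorded, actual)
    return true if super
    recorded.is_a?(Ledger) && actual.is_a?(Ledger) && recorded.total == actual.total
  end
end

comparator = LedgerComparator.new
puts comparator.call(Ledger.new(10, 1), Ledger.new(10, 2))
```

Subclassing keeps the custom logic small: `super` handles every value the default already understands, and the override only adds the one relaxed rule.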
00:43:30.130 By default, the recordings are verified in a random order on each run, which helps expose order dependencies; you can return to a particular ordering if you need to reproduce a run. You can also limit the number of errors displayed so the output stays readable when many failures occur.
00:43:52.060 Alongside this, Suture displays a progress bar, a tool meant to help you track how close you are to completion during the refactoring.
00:44:11.610 We quickly identify lingering interactions from past instances, nip them in the bud, and ensure that our process paces consistently throughout this iterative navigation. When we run into errors, we need to return to elements without hesitation and keep the tasks moving smoothly.
00:44:28.450 This entails observing feedback when attempting functional code on our daily changes, willing to adapt with information from real-world test results, confronting errors as they arise, and rectifying them with agility.
00:44:48.000 Once the use cases pass successfully, we document the changes for clarity and transparency as that will aid future iterations while ensuring that no crucial pieces are left unexamined in transition.
00:45:00.990 With this thorough process, we will have enhanced our existing methods, preserving performance consistency, learning to meet expectations rather than merely achieving objectives. Moving forward, we find this approach accurate and leverageable.
00:45:18.000 Finally, effective developers must remember that the core objective remains making changes safe for the users. Our ultimate goal is that no new code causes detriment to their experience. When issues do arise, we utilize the fallback strategy to revert to the old methods.
00:45:40.000 In the pure function case, we simply set the flag `fallback_on_error: true`. Now, if the new code path raises an error, we rescue by invoking the old one, allowing users to remain unaffected by our faulty changes.
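The fallback idea itself is easy to sketch without the gem. The helper below is hypothetical (`call_with_fallback` is not a Suture method) and only illustrates the control flow: try the new path, and call the old path only if the new one raises.

```ruby
# Try the new path; if it raises, log the error and serve the caller
# from the old path instead. `log` is any object that responds to <<.
def call_with_fallback(old_path, new_path, args, log: [])
  new_path.call(*args)
rescue StandardError => e
  log << "new path failed with #{e.class}: #{e.message}; falling back"
  old_path.call(*args)
end

old_add = ->(left, right) { left + right }
new_add = lambda do |left, right|
  raise ArgumentError, "negative input" if right < 0 # simulated defect in the new code
  left + right
end

errors = []
puts call_with_fallback(old_add, new_add, [2, 3], log: errors)  # served by the new path
puts call_with_fallback(old_add, new_add, [2, -3], log: errors) # served by the old path
puts errors.size
```

One caveat for mutating code: if the new path changes its arguments before raising, the old path would receive the already-mutated values, so a real implementation has to account for that.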
00:45:58.420 For the mutation case, since I’ve invested time in making things function correctly, altering the flag leads us to success with negligible changes. It’s worth noting that only calling the old path as necessary proves more efficient.
00:46:19.540 All errors get logged. Suture provides a robust logging system: you can configure it, merge it with your Rails app's logging, and keep an eye on potential failure points to prevent users from facing errors.
00:46:41.950 When the transitions finish, we delete the recordings. Much like stitches, we remove Suture once complete! For pure functions, we delete testing boards experimentally without hesitation to reinforce our transformations.
00:46:59.120 We sweep away the old approaches, aligning with new processes that feel natural. The mutation case follows suit, reinforcing the importance of adaptability in achieving streamlined developments.
00:47:26.190 So, we feel great as we finish this extensive process. As I wrap up, keep in mind achieving success in such a project entails intense collaboration. However, I want to emphasize that we did not resolve the bugs.
00:47:43.210 That’s a vital point I added to the talk after realizing, several iterations in, that we never actually fixed the bugs we identified at the start. Thus, we get to the end of implementing Suture, a holistic approach to improved refactoring.
00:48:05.890 Suture is ready to use, and it's available on GitHub. We are cautious, releasing it at version 1.0, respecting significant changes to avoid surprise. Recently, I’ve been engaging with Michael Feathers about advancing safe production refactors.
00:48:29.200 A prominent part of that discussion revolves around deleting dead code. He just released a gem this week called Scythe, which helps you identify whether certain code is still in use and when it was last called.
00:48:54.120 All these efforts contribute to making refactoring less daunting, allowing Ruby to remain maintainable. We want to keep utilizing Ruby in our work for years to come, and even if you choose not to use these tools, consider how you can contribute to the community.
00:49:11.580 I have a feeling everyone in this room has experience with legacy rescue. Moreover, Sam, Betsy, Knoll, and I are arranging to congregate for lunch, so feel free to follow us! We’d be happy to chat about testing or any topic you find interesting, including Ohio food.
00:49:29.890 Lastly, I'd like to express my gratitude to Test Double and celebrate its fifth anniversary. Our company would not exist without Ruby and the community. I appreciate the support from both current and former clients in the room.
00:49:51.110 It's remarkable how far we've come. If told five years ago that we’d become one of the most recognized Ruby agencies globally, I believe I would have remained in a constant panicked state.
00:50:10.700 So I’m Searls. Find me on Twitter if you wish; I would love to connect. Share your thoughts about this talk, and feel free to check out the online version of my talk shared from Japan.
00:50:28.420 Test Double is on a mission to fix the software world’s broken aspects. Send us an email if you'd like to join us. We’re always interviewing potential new agents, eager to tackle the complex challenges teams face with legacy Ruby.
00:50:48.000 If you know of any teams needing our help, please don’t hesitate to message me or grab me after this; I have stickers and business cards I’d love to share. And with that, I conclude my slides. Thank you!