Refactoring

Summarized using AI

How To Code Like A Writer

Nickolas Means • March 05, 2015 • Earth

In the video titled "How To Code Like A Writer," Nickolas Means discusses an innovative approach to software development by treating coding as a writing exercise rather than a purely technical task. This presentation, delivered at Ruby on Ales 2015, emphasizes the importance of clarity and structure in code, analogous to good writing practices outlined in Strunk and White's "The Elements of Style."

Key Points:

- Coding as Writing: Means references a previous talk where it was suggested that developers spend more time writing than engineering, leading to the concept of considering code similar to prose. This opens up the idea of applying writing rules to coding practices.

- Strunk and White's Influence: The canonical text by Strunk and White serves as a guide for improving code readability and structure. For instance, revising is necessary and often reveals flaws in initial arrangements, paralleling code that may require significant refactoring after it reaches production.

- Refactoring Example: He presents a case study using a class file named shipping_utilities.rb, which was poorly structured and exhibited high complexity scores. The goal is to refactor this problematic code using principles from Strunk and White, rather than traditional technical refactoring methods.

- Specific Refactoring Steps:

- Isolate the problematic methods from the utilities class.

- Clear unnecessary lines and improve variable names for better readability.

- Teach methods to respond to specific messages, enhancing the code's clarity and structure.

- Break down lengthy methods into smaller, more manageable ones for concise code writing.

- Utilize parallel construction in code expressions to enhance readability and comprehension.

- Outcome of Refactoring: The final refactoring process results in significantly reduced complexity scores, making the code cleaner and more maintainable. For example, the class's Flog score improved from 180.4 to 77.2, and the Flog score for build_shipments dropped from 120.7 to 3, illustrating the effectiveness of applying writing principles to programming.

Conclusions and Takeaways:

- Coding should prioritize clarity and organization, akin to constructing an engaging narrative in writing.

- Every piece of code should serve a clear purpose, eliminating unnecessary complexity and ensuring that the narrative is easy to follow.

- Following the philosophy that "Good programmers write code that humans can understand," encourages developers to craft code with both the machine and human perspective in mind.

- Ultimately, rethinking coding practices in the light of writing principles can lead to a more disciplined and effective approach to software development.


By Nickolas Means
As developers, we spend more time writing code than thinking about the nuances of computer science. What would happen if we approached code like a writing exercise instead of a technical pursuit? What if we applied patterns from elegant prose instead of Gang of Four? Let's try it! We'll take some smelly Ruby and refactor it using only advice from Strunk and White's "The Elements of Style", the canonical text on writing beautiful, understandable English. You'll come away with a new approach to your craft and a new appreciation of the similarities between great writing and great code.


Ruby on Ales 2015

00:00:29.720 So, our next speaker is Nick Means, a PHP developer from the olden days. He still loves PHP but he's been doing Rails since 1.1. Somebody once tried to make me contract on a Rails 1.2 app, and I bowed out, but I hear it was fun back in the day.
00:00:35.879 Nick is here to talk to us about something that's not related to Rails or PHP. A little background on Nick: his hobbies include playing with his two children, programming, and doing more managing than coding these days because he is a team lead.
00:00:48.239 I feel like team leads aren’t supposed to admit that they aren't coding full-time. Management often pretends that they just code full-time. It's amazing! Nick does pair programming 100% of the time and has a lot of experience in pairing. If you have any questions about pairing or pairing processes, Nick would be a great resource. So, can we get a round of applause for Nick? Thank you, sir.
00:01:14.799 Hi everybody! As Jonah mentioned, my name is Nick. Unlike Terren Lee, I am from the town of Austin and I actually live there most of the time. I work at a company called WellMatch where we build tools for healthcare pricing transparency. As Jonah said, I pair 100% of the time remotely, so I spend my entire day on a Skype call working in Tmux and Vim, and I love it.
00:01:49.920 Before I get started, I should warn you that I have 303 slides, which gives me a little less than 7 seconds per slide, so we're going to move pretty quickly. Please hang with me!
00:02:09.560 Think back to RailsConf 2014. David Heinemeier Hansson got up on stage and created quite a stir in the Ruby community by saying that TDD is dead. If you listened closely to his talk, you would hear he had a more subtle point as well. He spent a lot of time discussing how, as software developers, we don't spend much time in the realm of computer science.
00:02:23.640 In fact, we spend most of our time writing. He posed a question: if we are not engineers, then what are we? We are software writers! This made me think: what would happen if we took that notion too literally?
00:02:35.720 We would have to replace our rule book! We couldn't just use the normal best practices we are accustomed to; we would need to choose something from the writing realm. When looking for a book of rules on writing, nothing is better than Strunk and White.
00:02:49.120 If you've taken a college writing course, you have likely owned a copy of this book at some point. If you had a good writing professor, you probably still own it. This book was written by Cornell professor William Strunk back in 1918 as a handout for his students, and it was later expanded by E.B. White, one of his former students, in 1959.
00:03:01.599 Yes, it is that E.B. White. This book has been published countless times, including this gorgeous leather-bound, gold-embossed 50th Anniversary Edition right here in front of me. It's so important in the writing community that there’s even a book about the book! If you're looking for a guide on how to write good English prose, you can do little better than Strunk and White.
00:03:30.519 But what does this have to do with code? Well, at Ruby on Ales, I want you to meet my old friend shipping_utilities.rb. There's something peculiar I want to point out in the name of this class.
00:03:38.560 Anytime you see the word 'Utilities' in the name of a class, you should think one thing: it's probably a sign of trouble. Now, looking at the stats: this class has 427 source lines of code, a Flog score (which measures complexity) of 56.1, and a Flay score (which measures duplication) of 70. Code Climate rightly gives it a grade of F.
00:03:53.280 This is code that I wrote fresh out of being a PHP developer, so it has a fair bit of PHP-like syntax even though it's in Ruby. Everyone has seen this type of code before. It’s from a Rails 1 app that I wrote as part of an e-commerce company that sold motorcycle apparel. I'd like to point out that 21% of the complexity of this class comes from a single method, build_shipments. What does that method look like? There it is, in all its glory. That's what we are going to tackle today, which is why it's going to take us a while and why I need to move quickly.
00:04:30.520 So what does it do? Well, when you're an e-commerce vendor, there are a few ways that you can source products to send to your customers. Primarily, you have items in your warehouse, like this jacket from vendor Foo, and items that are drop-shipped from vendors. Drop shipping means a vendor ships it to your customer on your behalf. It never goes through your warehouse, and the customer doesn't know it's coming directly from the vendor, but it saves you from handling it.
00:05:32.600 For instance, we have a glove that's drop-shipped from vendor Foo, as well as a helmet from vendor Bar. There are several different ways we can break this up into shipments. We could put everything in its own box and send it along to the customer in three shipments or, since the first two items are both from vendor Foo, we could have them drop-ship the jacket for us along with the gloves. This saves us from having to handle it in our warehouse and saves us from paying for a shipment. Alternatively, if the customer wants to save a bit on shipping costs, we could order everything into our warehouse and put it all in one box to ship it to them.
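The three shipping options described above can be made concrete with a small sketch. This is only an illustration, not code from the talk: the `LineItem` struct, its fields, and the three helper methods are all assumed names, and option 2 assumes the warehouse jacket can also be drop-shipped by its vendor.

```ruby
module ShippingOptionsSketch
  # Hypothetical line item: the fields and values are illustrative assumptions.
  LineItem = Struct.new(:name, :vendor, :source)

  ITEMS = [
    LineItem.new("jacket", "Foo", :warehouse),
    LineItem.new("gloves", "Foo", :drop_ship),
    LineItem.new("helmet", "Bar", :drop_ship)
  ].freeze

  # Option 1: every item travels in its own box.
  def self.separate(items)
    items.map { |item| [item.name] }
  end

  # Option 2: each vendor drop-ships everything it supplies,
  # assuming the warehouse item is also drop-shippable.
  def self.by_vendor(items)
    items.group_by(&:vendor).values.map { |group| group.map(&:name) }
  end

  # Option 3: everything is ordered into the warehouse and packed in one box.
  def self.consolidated(items)
    [items.map(&:name)]
  end
end

items = ShippingOptionsSketch::ITEMS
p ShippingOptionsSketch.separate(items)
p ShippingOptionsSketch.by_vendor(items)
p ShippingOptionsSketch.consolidated(items)
```

Three, two, and one shipments respectively: the same order, grouped three different ways.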
00:06:02.120 That's what this gobbledygook does. Now, how do we make it better? How do we start? Well, in a normal world, we would pick up our handy dandy copy of Martin Fowler's refactoring book and go to town. But we can't do that here; there are different rules today!
00:06:31.400 In my abstract, I stated that we would take some smelly Ruby (which I have clearly provided) and refactor it using only advice from Strunk and White's Elements of Style. It was only after I submitted this abstract that I realized there's a bit of a lie in this; we can't refactor code. Refactoring is not really a thing in the world of writing, so we must turn to Strunk and White to even find a basis for improvement.
00:07:01.000 So, what should we do? Strunk and White rule 5.5 states: "Revising is part of writing. Few writers are so expert that they can produce what they're after on the first try. Quite often, when examining the completed work, you'll discover there are serious flaws in the arrangement of material calling for transpositions." This is a perfect example of what happens when the first working version of code gets out into production.
00:07:29.760 As a PHP developer, I didn’t understand what automated software tests were when I wrote this code, so it is also untested. We need to fix that. What is the practical first step we should take? Well, Strunk and White rule 5.3 tells us we need to work from a suitable design before we begin composing something. We must gauge the nature and extent of the enterprise and work from a suitable design.
00:07:54.120 In writing, this means you want to structure your text so that the reader can approach it, read it, and understand it. We want to do the same in software. We want to take this code and package it up in a way that someone encountering it can read it and make sense of it. First things first, this code is in desperate need of some isolation. We need to get it out of that utilities class.
00:08:30.000 So, we'll move it out of that file and build a class around it. We'll write a simple initializer that just calls build_shipments. We may include the create_group_if_necessary and insert_collaborator methods if needed for build_shipments to do its work.
00:09:01.600 Next, we'll write ourselves a test to see if we can even instantiate the thing, and of course, we can't! We have an uninitialized constant: RAILS_DEFAULT_LOGGER. Anyone who has been doing Rails for a while remembers when that was the way to log something in Rails.
00:09:28.800 It turns out we have a lot of these; the code is filled with trace debug statements. Some are even at the info level, which means this garbage is leaking into our production logs! So, we'll start by deleting those, which actually cuts our line count down significantly. But our tests still don't work. We have another uninitialized constant: Item.
00:09:50.520 This brings me to my absolute favorite line in this whole mess, which might be the worst line of Ruby I've ever written. It's a line that I love so much I used it twice! So, what is this line doing? It's two nested Active Record queries inside an iterator, predicated on an object type check. If order_line_item has an item_id, we want to trot off to the database and pull out an item. Otherwise, for reasons unknown, we assume it's a used item and go get a used item out of the database instead.
00:10:17.040 Obviously, this was a clever hack that past me thought was necessary to get this code working. Strunk and White tells us about clever hacks in rule 5.19: "Do not take shortcuts at the cost of clarity. Many shortcuts are self-defeating; they waste the reader's time instead of conserving it." The one truly reliable shortcut in writing is to choose words that are strong and clear to carry readers on their way.
00:10:53.680 So, we have to get rid of those lines. To do that, we need to understand how we're using the data they fetch for us. We have two blocks in the code that use the item variable. They happen to be in two legs of a conditional, making it easy for us to tackle them. We'll take the lines in that first block and pull them out.
00:11:31.200 I'll zoom in a little bit so you can see them better. I'll take the awful line and convert it to an if statement, mainly so I can make the text a little bigger for readability. Then, we'll narrow it down to just the parts we're dealing with, specifically where we mention item. Now, the ship_status_symbol lines seem a bit complicated, so we’ll skip them for now and come back to them.
00:12:04.400 Let's start with the store_id line. We're performing an object type check. My past self apparently loved checking object types because it appears multiple times in this code. We check if item is an instance of the Item class; if it is, we reach through item to product, grabbing the product's store_id. Otherwise, we return one. Actually, I do know why we do this: it's because used items always have a store_id of one.
00:12:40.840 But, if another developer—other than me—had to follow behind and maintain this code, they would have no clue what I was trying to communicate here. It makes no sense! So, how do we get rid of this? Let's look at our test. We've got an order_line_item double down here.
00:13:10.880 I should mention that I pulled this class completely out of the Rails app, so we have none of our collaborating objects at all. If you want to find out how entangled your code is, this is a great way to uncover it. We want to teach our order_line_item double how to respond to this store_id message. That’s simple enough! However, since it's a double, we need to teach the real object how to do it as well.
00:13:43.480 I'm going to skip that for now because I have a ton of slides. Once we've done that, we go back up top and send that message to order_line_item. The same pattern works a few more times; we can apply it to this line as well, and we can do the same for item.drop_shippable.
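The talk skips how the real object learns to answer store_id, so the following is only one plausible sketch. The class layout, the `Product` struct, and the delegation through `@item` are assumptions; the only detail taken from the talk is that used items always have a store_id of one.

```ruby
module StoreIdSketch
  # Hypothetical stand-in for the product model.
  Product = Struct.new(:store_id)

  class Item
    def initialize(product)
      @product = product
    end

    # A new item belongs to whichever store sells its product.
    def store_id
      @product.store_id
    end
  end

  class UsedItem
    # Used items always have a store_id of one, as the talk explains.
    def store_id
      1
    end
  end

  class OrderLineItem
    def initialize(item)
      @item = item
    end

    # The line item answers the question itself, so calling code never
    # needs to check whether it holds an Item or a UsedItem.
    def store_id
      @item.store_id
    end
  end
end

new_line  = StoreIdSketch::OrderLineItem.new(StoreIdSketch::Item.new(StoreIdSketch::Product.new(7)))
used_line = StoreIdSketch::OrderLineItem.new(StoreIdSketch::UsedItem.new)
puts new_line.store_id
puts used_line.store_id
```

With this shape, the object type check disappears from the calling code, and the knowledge that used items live in store one sits next to the class it describes.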
00:14:17.880 You're probably noticing a pattern here: I'm making the same change to multiple lines. There's a reason for that; Strunk and White rule 2.19 states: "Express coordinate ideas in similar form." This principle of parallel construction requires outwardly similar expressions when their content and function are similar.
00:14:56.360 This similarity in form helps the reader readily recognize the likeness in content and function. If we can make things look the same in our code, it becomes easier for someone approaching it to read and understand it, as they only need to grasp the pattern once and can apply it across multiple lines.
00:15:35.670 That's what we're doing here: walking through and applying that pattern. However, there is a mismatch. If you look at the symbol on the left, it’s not an interrogative, but the method on the right is an interrogative (or a predicate, depending on where you're from). So, we'll take our order_line_item double and teach it how to respond to plain drop_shippable without the question mark.
00:16:08.280 Then, we go back up top and change it to order_line_item drop_shippable. Now we get to tackle these ship_status_symbol lines. The first line asks the item what the shipping status is of a given quantity. Let's say we have one in our warehouse, and the customer wants to buy two; in that case, we'll need to drop-ship it because we don't have enough to fulfill the order.
00:16:43.640 However, if they only want one, we can ship it out of our warehouse. So, we ask for that ship_status_symbol and then let’s talk about a couple of special cases. If it's either 'close out' and 'in stock', or 'from stock only in stock', those are edge cases that have nothing to do with what we're trying to accomplish here. We want to convert it to 'in stock'.
00:17:17.280 That looks really complicated. It can be intimidating to consider how to refactor that, but we can apply the same pattern again! We can teach our order_line_item double how to respond to the ship_status_symbol message. Let's go up top and send that message to order_line_item instead.
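As a rough sketch of the ship_status_symbol step, assuming hypothetical symbol names (`:closeout_in_stock`, `:from_stock_only_in_stock`) and a simplified stock rule that are not taken from the original code, the line item could ask its item for a status and collapse the edge cases itself:

```ruby
module ShipStatusSketch
  # Hypothetical item: the symbol names and the stock rule are assumptions.
  class Item
    def initialize(quantity_on_hand, closeout: false)
      @quantity_on_hand = quantity_on_hand
      @closeout = closeout
    end

    # Not enough stock for the requested quantity means a drop-ship.
    def ship_status_symbol(quantity)
      return :drop_ship if quantity > @quantity_on_hand

      @closeout ? :closeout_in_stock : :in_stock
    end
  end

  class OrderLineItem
    # Edge-case statuses that the shipment logic treats as plain :in_stock.
    IN_STOCK_VARIANTS = [:closeout_in_stock, :from_stock_only_in_stock].freeze

    def initialize(item, quantity)
      @item = item
      @quantity = quantity
    end

    # Callers only ever see the statuses they care about.
    def ship_status_symbol
      status = @item.ship_status_symbol(@quantity)
      IN_STOCK_VARIANTS.include?(status) ? :in_stock : status
    end
  end
end

one_on_hand = ShipStatusSketch::Item.new(1)
p ShipStatusSketch::OrderLineItem.new(one_on_hand, 2).ship_status_symbol
p ShipStatusSketch::OrderLineItem.new(one_on_hand, 1).ship_status_symbol
```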
00:17:57.700 Now, we don't even refer to item at all; we've refactored that out of this block of code! We can go back and remove the item variable entirely from where we're setting it. It doesn't need to be there anymore. That terrible line just vanished because we taught another object in our system, the one responsible for answering these questions, how to do that.
00:18:23.480 So, we've cleaned that up, and it allows our test to pass! But wait, we're not done; we still need to check the other leg of our conditional. So, let's quickly add a test for that and get the failure we expect. Then, we can go back and refactor that section down at the bottom.
00:19:01.200 The great thing is we’ve already taught order_line_item to respond to all of these messages. So, we’ll replace item with order_line_item there too, and both of our tests are passing! We’re getting close to a suitable design, but we're not quite there yet. There's still something odd about how you would use this class. If you instantiate ShipmentBuilder, what should you expect it to return?
00:19:34.559 In idiomatic Ruby, ShipmentBuilder.new should hand back a ShipmentBuilder object. However, in this case, because we're calling build_shipments in the initializer, we’re handing back an array of shipments. This is unexpected behavior that’s hard to track down. So, we need to change the API of our class a little bit.
00:20:16.920 Before we get deeper into refactoring, we want to be able to call ShipmentBuilder.new and then call build_shipments. We'll test-drive that change. You'll also notice that this test isn't really doing anything; it's just checking whether we can instantiate it. Let’s ask some meaningful questions to validate that we're actually achieving what we want to do with our class.
00:20:43.399 We'll do this with the other leg of the conditional. Then, we will get the two failures we expect: wrong number of arguments (zero for two). This error arises because we’re now calling build_shipments with no arguments, yet we expect it to have the context from the class to answer the questions we're asking of it.
00:21:06.680 To provide that context, we'll take order_line_items and consolidate and move them up the class so they become attrs. We’ll assign those attrs in our initializer, allowing us to go down to build_shipments and eliminate the arguments because the class attributes can now do the work needed. We no longer need to call build_shipments in our initializer, so we'll delete that line.
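The reshaped API might look roughly like the sketch below. The body of build_shipments is a placeholder, since the real grouping logic is not shown at this point; only the shape (arguments stored as attributes, build_shipments called separately with no arguments) follows the description above.

```ruby
module BuilderApiSketch
  class ShipmentBuilder
    attr_reader :order_line_items, :consolidate

    # The initializer only stores context; it no longer builds anything.
    def initialize(order_line_items, consolidate)
      @order_line_items = order_line_items
      @consolidate = consolidate
    end

    # Placeholder body (an assumption): one box when consolidating,
    # otherwise one box per line item.
    def build_shipments
      if consolidate
        [order_line_items]
      else
        order_line_items.map { |line_item| [line_item] }
      end
    end
  end
end

builder = BuilderApiSketch::ShipmentBuilder.new([:jacket, :gloves], true)
p builder.class
p builder.build_shipments
```

Calling `new` now hands back a builder, and the caller asks for shipments explicitly.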
00:21:46.000 Now, I need to ask you to forgive me for what I'm about to do. I'm going to wave my hands and magically suggest that we have all the tests we need to support us in refactoring this class. I wish I had time to walk you through that entire process. All of our tests are passing! If you want to see how to do this for real, I highly recommend Katrina Owen's Therapeutic Refactoring talk. She does an excellent job of explaining how to wrap characterization tests around legacy code.
00:22:23.560 Now, what do we want to do? Let’s consult Strunk and White for our next step. This is probably the most famous of their rules: rule 2.17. "Omit needless words. Vigorous writing is concise. A sentence should contain no unnecessary words; a paragraph, no unnecessary sentences." The logic is the same as a drawing having no unnecessary lines, and a machine having no unnecessary parts.
00:23:12.920 Clearly, we have some unnecessary words here because this is as verbose as it could be. The first block that stands out to me is this block where we’re essentially creating a decorated line item object to give it some extra attributes needed for this class to do its job. However, since we've refactored, and because we have order_line_item answering many of these questions, we can get rid of that block altogether.
00:23:54.000 Next, we have the consolidated line that we need to deal with, so we’ll also ask about order_line_item here. This doesn't affect our tests; they continue to pass. Now, you'll notice something interesting: the symbols on the left that our OpenStruct responds to are now the same as the messages we're sending to order_line_item.
00:24:31.680 Since all our messages on the left (represented by the OpenStruct) are the same as the ones on the right (representing order_line_item), we've taught order_line_item to handle the duck type that this class depends on. This means we can replace the code we previously had there and instead refer directly to the line_item variable.
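To illustrate the duck type, here is a minimal sketch in which a decorated OpenStruct and a plain line item are interchangeable because both answer the same three messages. The `OrderLineItem` class and the `describe` helper are hypothetical; only the message names come from the talk.

```ruby
require "ostruct"

module DuckTypeSketch
  # Hypothetical line item that answers the same messages as the old decorator.
  class OrderLineItem
    attr_reader :store_id, :drop_shippable, :ship_status_symbol

    def initialize(store_id:, drop_shippable:, ship_status_symbol:)
      @store_id = store_id
      @drop_shippable = drop_shippable
      @ship_status_symbol = ship_status_symbol
    end
  end

  # Only cares that its argument responds to these three messages.
  def self.describe(line_item)
    "store #{line_item.store_id}, #{line_item.ship_status_symbol}, " \
      "drop-shippable: #{line_item.drop_shippable}"
  end
end

attributes = { store_id: 2, drop_shippable: true, ship_status_symbol: :in_stock }
decorated  = OpenStruct.new(attributes)
real       = DuckTypeSketch::OrderLineItem.new(**attributes)

puts DuckTypeSketch.describe(decorated)
puts DuckTypeSketch.describe(real)
```

Because both objects produce the same description, the decorating OpenStruct adds nothing and can be deleted.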
00:25:11.840 I've got this one spot up here where we have two variables that are equal to each other; that’s just a quick rename away. We’ll change the attr for order_line_items to 'line_items' because it’s more concise, and we’re looking to eliminate verbosity.
00:25:49.440 Now, we’ve cleaned things up to ensure every piece of code serves a purpose. However, we still need to address our class by looking to Strunk and White again and considering rule 2.13: 'Make the paragraph the unit of composition. A subject requires division into topics, each of which should be dealt with in a paragraph.' Here, our methods are paragraphs, but build_shipments is a very long paragraph.
00:26:25.760 Fortunately, it’s relatively straightforward to break this down into smaller sections. There are two branches of a conditional that make up the entire body of the method. We’ll take the bottom portion, the smaller bit, and extract it into its own method calling it build_consolidated_shipment. This gives this block of code a nice label that explains what it’s doing.
00:27:01.600 Now that we’ve broken our code down, we need to redirect focus to the original build_shipments method. When extracting a block, we need to identify the local variables that must be re-scoped so that the extracted method works appropriately. We'll elevate the necessary variables to class attributes so our tests continue to pass.
00:27:33.760 Next, we'll repeat the same process for the top leg of the conditional, moving it to its own method as well. Both branches of build_shipments are beginning to tell a much clearer narrative. While we've cleaned up the original mess fairly well, we might have mistakenly crammed everything into a metaphorical junk drawer.
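The result of extracting both legs of the conditional might look something like this sketch. build_consolidated_shipment is the name used in the talk; build_separate_shipments, the `LineItem` struct, and both method bodies are assumptions standing in for the real logic.

```ruby
module ExtractMethodSketch
  # Hypothetical line item used only for this illustration.
  LineItem = Struct.new(:name, :vendor)

  class ShipmentBuilder
    def initialize(line_items, consolidate)
      @line_items = line_items
      @consolidate = consolidate
    end

    # Each leg of the conditional is now a labeled "paragraph".
    def build_shipments
      if consolidate
        build_consolidated_shipment
      else
        build_separate_shipments
      end
    end

    private

    attr_reader :line_items, :consolidate

    # Placeholder: everything goes in one box.
    def build_consolidated_shipment
      [line_items]
    end

    # Placeholder: one shipment per vendor.
    def build_separate_shipments
      line_items.group_by(&:vendor).values
    end
  end
end

jacket = ExtractMethodSketch::LineItem.new("jacket", "Foo")
helmet = ExtractMethodSketch::LineItem.new("helmet", "Bar")
p ExtractMethodSketch::ShipmentBuilder.new([jacket, helmet], true).build_shipments.length
p ExtractMethodSketch::ShipmentBuilder.new([jacket, helmet], false).build_shipments.length
```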
00:28:05.760 Now, I want to point out something important before we dive deeper into refactoring this block. My past self was evidently proud of this solution, which was considered rather clever. However, it seems my wisdom has evolved, as magic now fuels the consolidation.
00:28:36.880 So, we have three blocks of functionality to break out into separate methods: the first block takes all our line items and builds a hash by ship_status_symbol. The second block consolidates in-stock items to drop-ships only when we can get rid of all the in-stock items. If we need to ship anything from our warehouse, the shipping cost requires packing a box.
00:29:11.520 However, if we can get our vendors to do all the legwork for us, we can consolidate all of these in-stock items to drop-ships. The third block calls our collaborator method to group line items into appropriate shipments. Let's distill those out into methods.
00:29:49.880 The first method we create will be line_items_by_symbol, abbreviated as line_items_by_sym. It's a clean extraction that doesn't break anything, but it isn't specific enough: it returns a hash of every item type, which is not the specific thing we need.
00:30:26.720 Most times, we want to check for a single item type, so per Strunk and White rule 2.16, we should prefer definite, specific, and concrete language. Instead of asking for line_items_by_ship_status with an argument, we could simply request in_stock_items. This call reads more clearly, specifying exactly the type of items we desire.
00:31:05.520 The same principle applies for drop-ship items. Now we have cleaner calling code up top, and our tests still pass! But let’s look at this block at the bottom. We need to make a few adjustments to tidy up the last remnants of our messy implementation.
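A small sketch of the "definite, specific, concrete" naming idea: the general lookup stays private, and callers use names that say exactly what they want. in_stock_items and line_items_by_ship_status are named in the talk; drop_ship_items, the `LineItem` struct, and the method bodies are assumptions.

```ruby
module SpecificNamesSketch
  # Hypothetical line item used only for this illustration.
  LineItem = Struct.new(:name, :ship_status_symbol)

  class ShipmentBuilder
    def initialize(line_items)
      @line_items = line_items
    end

    # Specific, concrete names for the two questions callers actually ask.
    def in_stock_items
      line_items_by_ship_status(:in_stock)
    end

    def drop_ship_items
      line_items_by_ship_status(:drop_ship)
    end

    private

    # The general-purpose lookup is an implementation detail.
    def line_items_by_ship_status(status)
      @line_items.select { |line_item| line_item.ship_status_symbol == status }
    end
  end
end

builder = SpecificNamesSketch::ShipmentBuilder.new([
  SpecificNamesSketch::LineItem.new("jacket", :in_stock),
  SpecificNamesSketch::LineItem.new("gloves", :drop_ship),
  SpecificNamesSketch::LineItem.new("helmet", :drop_ship)
])
p builder.in_stock_items.map(&:name)
p builder.drop_ship_items.map(&:name)
```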
00:31:39.760 Upon unpacking this, we first verify whether we have any items in stock and are not consolidating. We’ll count all our in-stock items and check that the count is less than one to make that our gatekeeper for changing statuses to drop-ship.
00:32:18.960 Let’s clean that up a little. With our handy dandy in_stock_items method, we can check for in_stock items and change them all to drop-ship. While staring at this line, we need to assess our matching drop shipment to ensure we can pair it with our in-stock item.
00:32:56.800 It makes sense to further encapsulate this logic into a gatekeeper method that prevents us from attempting any operation unless all in-stock items are drop-shippable. As we set out to consolidate drop-ships, we can simplify that logic into a one-liner.
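One way the gatekeeper could be sketched, under simplifying assumptions: consolidate_to_drop_ships and in_stock_items are named in the talk, while the predicate name all_in_stock_items_drop_shippable?, the `LineItem` struct, and the rule that a simple drop_shippable flag stands in for "has a matching drop shipment" are all hypothetical.

```ruby
module GatekeeperSketch
  # Hypothetical line item; drop_shippable stands in for the real matching rule.
  LineItem = Struct.new(:name, :ship_status_symbol, :drop_shippable)

  class ShipmentBuilder
    def initialize(line_items)
      @line_items = line_items
    end

    # Converts in-stock items to drop-ships only when every one of them can go.
    def consolidate_to_drop_ships
      return unless all_in_stock_items_drop_shippable?

      in_stock_items.each { |line_item| line_item.ship_status_symbol = :drop_ship }
    end

    def in_stock_items
      @line_items.select { |line_item| line_item.ship_status_symbol == :in_stock }
    end

    private

    # The gatekeeper: a single question with a descriptive name.
    def all_in_stock_items_drop_shippable?
      in_stock_items.all?(&:drop_shippable)
    end
  end
end

items = [
  GatekeeperSketch::LineItem.new("jacket", :in_stock, true),
  GatekeeperSketch::LineItem.new("gloves", :drop_ship, true)
]
GatekeeperSketch::ShipmentBuilder.new(items).consolidate_to_drop_ships
p items.map(&:ship_status_symbol)
```

If even one in-stock item cannot be drop-shipped, the guard returns early and nothing changes, since a box has to leave the warehouse anyway.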
00:33:31.720 This now provides a thorough clean-up of the method; trying to ensure our code openly communicates intentions while ensuring the tests still pass. What we are left with is the consolidate_to_drop_ships method feeling more confident than ever.
00:34:06.720 In the end, we can eradicate conditions from other branches as we consolidate this final functionality into a cohesive method. Let’s eliminate any remaining ambiguity to reveal our intent clearly. The flow of our logic still honors what our tests validate.
00:34:44.960 We’ve structured our class so everything has been methodical and at the same level of abstraction—making all pieces easy to understand and logical. We’ve cleaned our class nicely, and I’d love to share how we ditched the last vestiges of clutter.
00:35:23.920 Ultimately, our external utility class, which allows us to associate line items with the right shipment, also allows us to return shipments cleanly. The neat trick is the way it injects the array, allowing for straightforward manipulation without extra requests for verification later.
00:36:07.680 I wanted to share how we cleaned it up with a reminder of where we began: two stark contrasts in how our code now articulates meaning over mere function. After fine-tuning every part of this class, the Flay score, which measures duplication, now sits at an impressive zero.
00:36:48.320 On the other side, we see how we improved complexity, transforming an initial Flog score of 180.4 (leaving those two extraneous methods behind) to a lean 77.2. This leads us to the pièce de résistance: the Flog score for build_shipments has dropped from a hefty 120.7 to a neat 3, showing how significantly we can restructure functionality.
00:37:24.000 This was the heart of our exercise: showing how to approach challenges with clean practices, making our code understandable for ourselves and future maintainers. In revealing understandable code, we take a good codified narrative in our build_shipments method. It reflects down to the granular level, streamlining our narrative arc.
00:38:07.200 Everything we've discussed aligns with the true essence of our coding practices. Each point picked from Strunk and White outlines the parallels we encounter daily in software development, from revising and rewriting to development design and clarity of concepts.
00:38:30.600 As I wrap this up with the final message, I want to emphasize: we need to extract a sense of intuition while coding. Coding is for both humans and machines alike, as I conclude with a quote that resonates well—'Good programmers write code that humans can understand.' Let's commit to our craft, honing the clarity of narratives in everything we produce.
00:39:12.720 Thank you!