00:00:00.000
A few years ago, I was hired to work on a successful product. This product had a number of characteristics in common with other successful products. For one, the team was finding it increasingly difficult to make changes. Over the years, the product had accumulated a lot of features and, in support of all these features, there was a lot of code and no tests. The company wanted to develop a culture of testing for all the usual reasons: shipping faster and not being terrified while deploying, and I was brought onto the team due to a 90-minute rant about testing and refactoring.
00:00:19.380
There I was, in a language that I didn't know, in a codebase that I had no possible way of understanding, and I was going to help them get this thing under test. I quickly discovered that I did not have the skills necessary to do that; I couldn't get any part of the application under test. Every single piece of the application seemed to depend on every other piece, and I couldn't get the whole application loading in a test harness for reasons I didn't understand. Every once in a while, I'd come across a tiny opportunity for refactoring. I'd extract a thirty-something-line class, and it would be completely understandable and completely test-covered, only to realize that the old class that used to be thirty-five hundred lines long was still thirty-five hundred lines long.
00:01:01.920
I didn't know how to tackle the problem head-on, and I couldn't figure out how to develop the skills necessary to do so. So I'm going to borrow a story from a different industry. Tony Plog is an internationally renowned trumpet virtuoso. I was told about a masterclass that he ran a few years ago for professional soloists. The first part of the class was very typical; the students played breathtakingly complex pieces, and then Tony would offer suggestions for improvement.
00:01:50.310
Once everyone had played, Tony asked them to play a very simple warm-up exercise, something that might be given to any beginning trumpet student. Each student played the exercise in turn, and compared to the dramatic pieces they had just played, this sounded childish. After everyone had taken their turn, Tony picked up his trumpet and played the exercise. When he played it, it didn't sound childish; it sounded exquisite. Every note was deep, rich, and beautiful, and he had taken this handful of notes and turned it into something elegant and graceful.
00:02:42.100
The contrast between Tony's performance and that of the students was astonishing. There was a profound difference between the true master and the skilled practitioner, and that difference occurred at a fundamental level. It had nothing to do with intricate and sophisticated complexity. Tony suggested that the advanced student should spend a lot more time focusing on practicing simple pieces intensely. I would like to bring Tony's lesson back into our industry.
00:03:02.830
I'll do that using a completely spurious exercise that I made up about a year ago. It wasn't meant to teach anything; it was just a warm-up. Meet Bob. He's 15 years old, not very bright, and has an attitude problem. No matter what you say to him, he will respond in one of four ways, which have all been encoded into a test suite. If you've been programming for, oh, I don't know, say about 10 days or so, you should have absolutely no problem getting the test suite to pass. However, because this is natural language, there are ambiguities. For example, just because something contains a question mark, it doesn't mean that it's actually a question.
00:04:11.250
You can shout without using an exclamation mark, and the presence of an exclamation mark doesn’t make something shouting. Questions can also be shouted. Numbers are particularly interesting because they don’t have upper and lower case variants. As the complexity of the test suite increased, so did the complexity of the solutions, and I found myself adding more and more edge cases, trying to force the solutions to become simpler and more correct.
00:05:10.610
In the past year, about 2,000 people have written code to make this test suite pass. Some of these people are complete beginners, and others have been programming for several decades, with everything in between. Many interesting and completely valid solutions have emerged. I'm going to show you 16 typical iterations through the exercise, which I have seen people do over and over again. This is a fairly classic starting point for someone who has never programmed before, and let me just say this: it passes all the tests.
00:05:30.250
Laboriously. Sandi Metz talks about something she calls the squint test. It works like this: you lean back, squint your eyes, and look for changes in shape and color, which tend to represent changes in levels of abstraction. That's not the issue here, though; here the issue is white space. Look at the vertical white space, observe the pattern created by the indentation, and note the inline horizontal white space. Cleaning up all the white space gives the code a clear rhythm, but it still looks long and is overwhelmed by the conditionals.
00:05:50.150
On the other hand, new patterns begin to emerge. The first lesson that this pointless exercise taught me is that white space is meaningful. Inconsistent indentation tends to suggest that I have a bug, or at the very least, a syntax problem. I use white space to make like things look alike and to make different things stand out, and I rely on it much more than I was aware. This is the code that we ended up with in the previous iteration; it has a cadence, a rhythm. Squint at it and notice the repeating patterns. The return keyword is repeated unnecessarily, and the comparison to input occurs over and over.
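A minimal sketch of what such a cleaned-up first iteration might look like. The specific inputs and the four response strings are assumptions made for illustration; the talk does not list them here.

```ruby
# Hypothetical first iteration: every known input is compared literally.
# The inputs and the four responses are assumed for illustration.
class Bob
  def hey(input)
    if input == "Tom-ay-to, tom-aaaah-to."
      return "Whatever."
    elsif input == "WATCH OUT!"
      return "Whoa, chill out!"
    elsif input == "Does this cryogenic chamber make me look fat?"
      return "Sure."
    elsif input == ""
      return "Fine. Be that way!"
    end
  end
end
```

Every branch repeats both `return` and `input ==`, which is the pattern the squint test picks up.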
00:06:39.560
We can delete the returns and replace the if statement with a case statement, and the program suddenly feels lighter. When you minimize boilerplate, the shape of the code becomes more pronounced, highlighting the bones of your story and often revealing flaws and weaknesses. Notice how obvious the duplication has become now that the cruft has been removed. Each of the four responses that Bob has is repeated over and over again. It turns out that there's a very simple solution to this.
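The same idea with a case statement might look roughly like this. Again, the literal inputs and responses are assumed for illustration only.

```ruby
# Hypothetical second iteration: a case statement over literal inputs.
# With the boilerplate gone, the repeated responses stand out.
class Bob
  def hey(input)
    case input
    when "Tom-ay-to, tom-aaaah-to." then "Whatever."
    when "Ending with ? means a question." then "Whatever."
    when "WATCH OUT!" then "Whoa, chill out!"
    when "1, 2, 3 GO!" then "Whoa, chill out!"
    when "Does this cryogenic chamber make me look fat?" then "Sure."
    when "4?" then "Sure."
    when "" then "Fine. Be that way!"
    end
  end
end
```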
00:07:06.950
One of the first lessons that we teach new programmers is 'don't repeat yourself'. If you don't know what else to do, removing duplication is a reasonable first step. It's easy; it's insidiously easy. We often reduce superficial bits of duplication that turn out to not be the same idea at all. Bryan Helmkamp, who makes Code Climate, calls this being 'too DRY'. If you remove duplication to the point of chafing, it's often really hard to move forward. If you can't figure out where to go next, try reintroducing the duplication to see if something else occurs to you.
00:08:68.880
Alright, so we have code that passes all the tests. It's consistent, has no unnecessary boilerplate and no duplication, but it does have very long lines. We can rearrange that so it fits on one screen. However, there is a problem: this code is very brittle. It doesn't solve the general case of saying nothing, asking questions, shouting, or anything that is none of the above; it only handles a specific but completely arbitrary list of statements.
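A sketch of the deduplicated version and of its brittleness, under the same assumed inputs and responses as before. Each response appears once, but only inputs that were listed in advance get an answer.

```ruby
# Hypothetical deduplicated iteration: literals grouped per response.
class Bob
  def hey(input)
    case input
    when "Tom-ay-to, tom-aaaah-to.", "Ending with ? means a question."
      "Whatever."
    when "WATCH OUT!", "1, 2, 3 GO!"
      "Whoa, chill out!"
    when "Does this cryogenic chamber make me look fat?", "4?"
      "Sure."
    when ""
      "Fine. Be that way!"
    end
  end
end

Bob.new.hey("WATCH OUT!")  # a listed input gets a response
Bob.new.hey("LOOK OUT!")   # an unlisted shout matches no branch and yields nil
```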
00:09:36.160
This would be fine if we were developing a talking toy with a very small set of specific responses. However, for ambiguous language like conversation, adding a single new test case will cause the test suite to fail. What we need to do is detect the rules and then provide responses based on those rules, not based on specific pieces of input. Very often, that will look something like this: we have three rules and a default response.
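One way the rule-based version could be sketched: three rules plus a default. The regular expressions and response strings are assumptions for illustration, not the speaker's exact code.

```ruby
# Hypothetical rule-based iteration: three rules and a default.
class Bob
  def hey(input)
    if input == ""
      "Fine. Be that way!"                        # silence
    elsif input =~ /[A-Z]/ && input !~ /[a-z]/
      "Whoa, chill out!"                          # has capitals, no lowercase
    elsif input =~ /\?\z/
      "Sure."                                     # ends with a question mark
    else
      "Whatever."
    end
  end
end
```

Because the shouting rule is checked before the question rule, a shouted question is treated as a shout in this sketch.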
00:10:45.860
It's not always easy to find the underlying abstraction, and one of the best analogies I have for this is deer hunting. I have a sister who has spent the last 20 years in Alaska. Her family puts food on the table by going out into nature and hunting. Every fall, they hunt approximately a freezer full of deer. My sister and her husband have widely different strategies when it comes to butchering a deer. He brings a chainsaw while she brings a small paring knife. A chainsaw is effective, but it's not ideal because you end up with blood and guts in your steaks.
00:11:40.760
A small blade is perfect, provided you know where the seams in the deer are; you have to work the edge through in the right places to loosen it up. The deer will then fall apart easily. Getting the abstractions wrong in your programs is a lot like taking a chainsaw to a deer. It gets the job done, but it's messy. When you get them right, the code can feel obvious. It's as if there's no other way this could have been done.
00:12:24.590
We've removed duplication and ended up with a solution that looks like Ruby; it's not too long and it solves the general problem that the readme describes. However, I don't like how irregular this solution feels. For example, the regexes are not overly complicated, but they are also not obvious. Instead of defining all caps in absolute terms, by finding a pattern that matches it, we can define it in relative terms: an all caps string remains unchanged when you uppercase it.
00:13:21.390
Unfortunately, this doesn't work for things like phone numbers. A phone number will not change when you uppercase it, but it's not all caps either. It won't change when you lowercase it either; it simply has no case. Determining whether something ends with a question mark by hand, however, is doing too much. There's a string method that does exactly what you need. Lastly, comparing a string to an empty string is valid, but if you're going to be all Ruby about it, you just want to ask the string itself whether it is empty.
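A short sketch of the relative definition and of the phone number problem. The method names are hypothetical, and the extra downcase check is one possible refinement that I am assuming, not something quoted from the talk. `String#upcase`, `String#downcase`, `String#end_with?` and `String#empty?` are standard Ruby.

```ruby
# Naive relative definition: the text is unchanged by upcasing.
def unchanged_by_upcase?(text)
  text == text.upcase
end

# Assumed refinement: also require that downcasing changes the text,
# which rules out strings that contain no letters at all.
def all_caps?(text)
  text == text.upcase && text != text.downcase
end

phone = "555-0100"
unchanged_by_upcase?(phone)   # true, although nothing is being shouted
all_caps?(phone)              # false
all_caps?("WATCH OUT!")       # true

"Are you ok?".end_with?("?")  # asks the string instead of hand-rolling a check
"".empty?                     # asks the string instead of comparing with ""
```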
00:14:00.710
The ability to express an idea without needing to think about how to express it is known as fluency, and it helps you say more with less. Fluency is vital, but it starts small, by paying attention to the tiniest building blocks of your language. Finally, we have a solution that smells like Ruby; it reads like English, but there's a problem with the 'and' operator. It's meant for control flow, but we've used it where the double ampersand, which is a boolean operator, belongs.
00:14:47.050
A boolean expression asks a question, while control flow is much more imperative. These are not the same things. Our English phrase is a boolean expression and it should use a double ampersand. There's a distinction between simple and easy. Rich Hickey describes it by noting that 'easy' means familiar and close at hand, while 'simple' is the opposite of complex, where complex means having many interleaved parts. A single string tied in a knot is complex, whereas one hundred strings hanging straight down are simple.
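The difference between 'and' and the double ampersand shows up most clearly in precedence. A small sketch with hypothetical method names: `&&` binds more tightly than assignment, while `and` binds more loosely.

```ruby
# `&&` binds tighter than `=`, so the whole boolean expression is assigned.
def assigned_with_ampersands(a, b)
  result = a && b
  result
end

# `and` binds looser than `=`, so this parses as `(result = a) and b`:
# only the left operand is assigned.
def assigned_with_and(a, b)
  result = a and b
  result
end

assigned_with_ampersands(true, false)  # false
assigned_with_and(true, false)         # true, which is rarely what was meant
```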
00:15:57.740
This solution is short: a straightforward, idiomatic Ruby implementation. However, I'm not convinced we should be using early returns. An early return says that if this condition holds, then just exit the method altogether; don't continue executing the code. But that's not what we have; we don't have a scenario of 'do not pass go.' Instead, we have several roughly equivalent options. It can be difficult to find a balance between being expressive and being succinct. Too many words bury the meaning, while too few can lose or distort it.
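To make the contrast concrete, here is a sketch of the same logic written both ways. Method names and response strings are assumptions for illustration.

```ruby
# Early returns: each line reads as "stop here if this holds".
def hey_with_early_returns(input)
  return "Fine. Be that way!" if input.empty?
  return "Whoa, chill out!" if input == input.upcase && input != input.downcase
  return "Sure." if input.end_with?("?")
  "Whatever."
end

# One conditional: the four outcomes read as roughly equivalent options.
def hey_with_branches(input)
  if input.empty?
    "Fine. Be that way!"
  elsif input == input.upcase && input != input.downcase
    "Whoa, chill out!"
  elsif input.end_with?("?")
    "Sure."
  else
    "Whatever."
  end
end
```

Both methods return the same answers; the difference is only in what the shape of the code suggests to the reader.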
00:16:44.580
What if you've never seen the readme? How obvious would this comparison of upper and lower case be? Wouldn't it help if these concepts were named for easier understanding? People often reach for something like 'all caps?', 'question?', and 'empty?', which hides the fiddly bits and makes the concepts explicit but feels lopsided. The methods are at different abstraction levels; all caps and empty are about strings, while a question is about conversations.
00:17:50.820
Ideally, all three methods should be about the concept of conversation, not strings. From a conversation perspective, all caps really represents a yell or a shout, while empty might be silence. This symmetry gives all three methods a coherent narrative: they tell a story about conversations rather than about an underlying implementation that happens to be strings. Kent Beck suggests in Smalltalk Best Practice Patterns that implementation details, regardless of how small they are, should be hidden behind a method with an intention-revealing name.
00:19:13.320
This is tricky because we need to define what intention revealing means, which is context-specific. Let’s talk about Bob’s API. We have the 'Hey Bob' method, the 'Bob, are you silent?' method, the 'Bob, are you a question?' method, and the 'Bob, are you a shout?' method. This is a terrible API that can be improved by adding a private declaration. A public API narrates a story from the point of view of someone, and it’s vital that the perspective is consistent and that the story told makes sense to the reader.
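A sketch of how the private declaration narrows Bob's public API to a single method. The predicate names follow the talk's vocabulary (silent, question, shout); the bodies and response strings are assumptions.

```ruby
class Bob
  # The only message the outside world should send to Bob.
  def hey(input)
    @input = input
    if silent?
      "Fine. Be that way!"
    elsif shout?
      "Whoa, chill out!"
    elsif question?
      "Sure."
    else
      "Whatever."
    end
  end

  private

  # Helpers describe the conversation, not Bob, so they stay private.
  def silent?
    @input.empty?
  end

  def shout?
    @input == @input.upcase && @input != @input.downcase
  end

  def question?
    @input.end_with?("?")
  end
end
```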
00:20:10.720
But the narrative is still pretty awful, for many reasons. I'm uncomfortable with referencing the instance variable directly everywhere, as that means I'm locked into it, with no indirection. There should be a seam that allows changes without altering the code that uses it. Philosophically, I want to move away from distinguishing between data and messages. If everything is an object, surely there's a message I can send to it to make it do what I want. If that's not the case, it's my responsibility to create it.
00:21:47.610
Imagine we have an instance of Bob. Alice shouts, 'Are you crazy?' and Bob listens. At almost the same time, Charlie says, 'You're late', which Bob records into his instance variable. When Bob needs to respond to Alice, he checks the instance variable, which now holds Charlie's statement. Essentially, we've introduced a potential race condition. Instead, we need to pass the input directly to the helper methods.
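A minimal sketch of the stateless alternative, in which each helper receives the input as an argument and Bob stores nothing between calls. Names and response strings are assumed for illustration.

```ruby
class Bob
  def hey(input)
    if silent?(input)
      "Fine. Be that way!"
    elsif shout?(input)
      "Whoa, chill out!"
    elsif question?(input)
      "Sure."
    else
      "Whatever."
    end
  end

  private

  # Each helper works only on its argument; no instance variable is written,
  # so one caller's input cannot overwrite another's.
  def silent?(input)
    input.empty?
  end

  def shout?(input)
    input == input.upcase && input != input.downcase
  end

  def question?(input)
    input.end_with?("?")
  end
end
```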
00:22:57.750
You might argue that shielding Bob from race conditions is excessive; after all, Bob's computations are trivial. It seems impossible that such a race condition could ever occur, and even if it did, who cares? Nonetheless, the point stands: don't save ephemeral data in an instance variable. Some people eliminate the race condition with a fancy case statement technique, in which each branch is given a proc that evaluates the input.
00:23:37.920
Predicate methods must return true or false, or at the very least something truthy or falsey. In Ruby, every object is either truthy or falsey, and the problem is that a proc object, regardless of what it would return when called, is always truthy. If any of these private methods is ever used in an unexpected context, everything will break down. We must pass the argument to the method rather than to the proc. Idioms matter, so stick to making your intentions clear. As Dr. Seuss once said, 'Say what you mean and mean what you say'.
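A sketch of why a proc-returning predicate is fragile. The method names are hypothetical. Inside a case statement, `Proc#===` calls the proc with the value being matched, so the trick works there; anywhere else the proc object itself is simply truthy.

```ruby
# Hypothetical predicate that returns a proc instead of a boolean.
def silent?
  ->(input) { input.empty? }
end

# Works: `when` uses Proc#===, which calls the proc with `input`.
def reply_with_case(input)
  case input
  when silent? then "Fine. Be that way!"
  else "Whatever."
  end
end

# Broken: the proc is never called, and a proc object is always truthy,
# so every input is treated as silence.
def reply_with_if(input)
  if silent?
    "Fine. Be that way!"
  else
    "Whatever."
  end
end
```

Passing the input to the predicate itself, so that it returns a plain boolean, avoids the trap.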
00:24:24.960
This repetition bothers me; I have to pass the same argument everywhere. Not only do the private methods all share the same parameter, but their bodies operate solely on that parameter rather than on anything within Bob. This is referred to as feature envy, and the usual advice is to move such methods onto the object they operate on. Bob used to have too many responsibilities, but now he's highly cohesive. A cohesive object has very few reasons to change and is far easier to understand.
00:25:18.650
There is a problem with String, though. It used to be solely about strings, but now it carries methods that have nothing to do with strings in general. That raises the question of where this behavior really belongs: with the input. So now we have an input object, which is interesting. Looking at cohesion, however, it turns out that input's cohesion is severely lacking, because it inherits far too much from the core String class.
00:26:03.550
Inheriting from core types in Ruby also causes problems. Picture a DazzlingString that extends String's behavior slightly, allowing strings to sparkle; a noble pursuit. Imagine you have two dazzling strings, one a unicorn and one a rainbow; they sparkle beautifully. However, combining them yields a plain String, which no longer sparkles.
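The sparkle problem can be reproduced in a few lines. `DazzlingString` and `sparkle` are illustrative names; the behavior of `String#+` returning a plain `String` is standard Ruby.

```ruby
# Hypothetical String subclass that adds one method.
class DazzlingString < String
  def sparkle
    "*#{self}*"
  end
end

unicorn = DazzlingString.new("unicorn")
rainbow = DazzlingString.new("rainbow")

unicorn.sparkle                 # => "*unicorn*"

combined = unicorn + rainbow    # String#+ builds a plain String
combined.class                  # => String
combined.respond_to?(:sparkle)  # => false
```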
00:26:53.930
Therefore, there's no reason to inherit from String. So let's create a standalone object. We need to add an initialize method, but fundamentally this makes the object's API consist only of the methods we want rather than the hundreds more that come with a core type. I once worked through a refactoring workbook and discovered that a three-line method was too long; not because of the line count, but because it did too many different things.
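A sketch of the standalone object described above. The class name `Input`, its method bodies, and the response strings are assumptions for illustration; the point is that the object exposes only the three conversational predicates.

```ruby
# Hypothetical standalone object: wraps the text instead of inheriting from String.
class Input
  def initialize(text)
    @text = text
  end

  def silent?
    @text.empty?
  end

  def shout?
    @text == @text.upcase && @text != @text.downcase
  end

  def question?
    @text.end_with?("?")
  end
end

class Bob
  def hey(text)
    input = Input.new(text)
    if input.silent?
      "Fine. Be that way!"
    elsif input.shout?
      "Whoa, chill out!"
    elsif input.question?
      "Sure."
    else
      "Whatever."
    end
  end
end
```

Unlike a String subclass, this object does not respond to `upcase`, `+`, or any of the other core string methods.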
00:27:59.950
Ask yourself: if the method or object could become two things, what would they be? If you can suggest a plausible split, your object or method likely has too many responsibilities. Now I'm starting to like the code, though one thing still bothers me: the name of the input variable. It's the wrong name, at the wrong level of abstraction; it doesn't describe the thing that is actually in play.
00:29:01.830
Ultimately, it's about describing your domain accurately. If you think of what Bob receives as verbal expressions or messages, the names in the code can reflect the interactions he actually has to reason about. Finding the correct name therefore becomes ever more critical, as it shapes the final solution. This solution is far from the only respectable one for Bob, yet it brings out some very practical preferences.
00:29:50.660
Ultimately, this journey through a simple warm-up exercise that was intended to teach nothing serves as a springboard for deeper understanding. It is about asking challenging questions, questioning your own comprehension, assuming there's room for improvement, and pinpointing what irritates you about your code. Rather than trying to predict the code's future, I'm keen to work out how the current code could become simpler or more expressive. Thank you.