00:00:16.410
Thank you to the organizers here at RailsConf. This is my first time speaking at RailsConf, and frankly, it's kind of intimidating to be up here and see so many people out there.
00:00:21.760
My name is Mark Menard, and today I'll be talking about small code. I've prepared a lot of content: about 79 slides and 137 transitions—it's quite a bit to get through.
00:00:28.240
Let's get started. I want to let this quote sink in: all of us have that file filled with code that you just don't want to open.
00:00:35.860
As you heard earlier, it might be your user class. That class has comments saying, 'Woe to ye who edit here.' The issue with this kind of code is that it tends to live forever. It encapsulates business logic that often gets duplicated elsewhere because no one wants to go in and look at that complicated code. It’s also very hard to understand.
00:00:52.180
I'm going to talk about ways to avoid this situation, focusing on code at both the class level and the method level. Writing small code at both levels is fundamental to creating systems composed of small, understandable parts.
00:01:15.610
Let's start with a few basic concepts to ensure we're all on the same page. Many people struggle with what they think of as smaller, well-designed code. It's not about the total line count.
00:01:36.600
Well-designed code typically has more lines than poorly designed code. The overhead of declaring methods and classes increases your line count.
00:01:49.360
It’s also not about the method count. Well-factored code will indeed have more smaller methods. And it isn't about the class count either. Well-designed code will almost definitely have more classes than what I refer to as 'undesigned code.'
00:02:09.729
I have seen some cases of over-abstraction, but that's quite rare unless someone goes pattern-crazy. Small code is not about reducing the number of classes in your system; it's about having well-designed classes.
00:02:23.270
What do I mean by small? It refers to small methods and small classes. Small methods are the foundation of writing small code. The ability to decompose large methods into smaller methods is crucial.
00:02:38.270
To write small code, we must be able to decompose large classes into smaller classes, extract responsibilities, and base them on higher-level abstractions. It's important to keep our classes small because small classes lead to reusability and composability.
00:03:01.040
So, why should we strive for small code? Why is it important? We cannot predict the future; our software requirements are going to change. Software must be flexible enough to adapt to those changes.
00:03:15.980
Any software system that hopes to have a long and successful life will change significantly. Small code is simply easier to work with than large, complex code.
00:03:36.180
If you believe that your software requirements will never change, then you can ignore everything I say here, but I doubt that’s the case.
00:03:50.570
We should write small code because it helps us raise the level of abstraction in our code. This is one of the most important aspects of creating readable and understandable code.
00:04:07.340
Good design drives toward expressing the ubiquitous language of our problem domain within our code. The combination of small methods and small classes helps us elevate that level of abstraction, allowing us to express higher-level domain concepts.
00:04:26.330
We should also write small code to effectively utilize composition. Small classes and small methods work together well. As we compose instances of small objects, our systems will become message-based.
00:04:44.910
To build systems that are message-based, we have to use delegation and small composable parts. Small code creates small composable parts, which allows our software to be flexible over time.
00:05:01.790
This flexibility helps us accommodate future requirements without needing a forklift replacement.
00:05:08.600
The goal is to create small units of understandable code that are amenable to change. Our primary tools for this are 'extract method' and 'extract class.' Longer methods tend to be harder to understand than shorter methods.
00:05:25.670
Most of the time, we can shorten a method simply by applying the extract method refactoring technique. I use this approach all the time when coding.
00:05:37.899
After establishing a coherent set of methods around a certain concept, we can look to extract them into a separate class and move the methods there.
00:06:04.259
Let's explore the example of a command line option parser that handles boolean options. We want to run a Ruby program with a '-V' flag and handle boolean options.
00:06:20.470
In the Ruby program, I’ll define the options I’m looking for using a simple DSL. Then I want to consume it like this, checking if the options include a particular option before taking action.
00:06:34.610
Here’s how that all comes together: the DSL at the top, followed by how we consume the options object. Pretty straightforward.
00:06:51.640
Here's my specification: it should return true if the option is defined and present on the command line, and false otherwise. I run my specs and encounter two failures. Yes, I am using TDD.
00:07:02.540
Here’s the implementation, which fits nicely on one slide. I store the defined options in an array and keep the arguments for later reference. There's a 'has' method to check if the option is defined in the array.
00:07:16.090
Then I have my 'option' method, which implements my simple DSL. It’s nice and readable, fitting on one slide and probably easy to comprehend.
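The slide itself is not part of this transcript. What follows is a minimal sketch of what such a boolean-only parser might look like. The class name `CommandLineOptions`, the `has?` spelling, and the single-dash flag format are assumptions made for illustration, not the speaker's actual code.

```ruby
# Hypothetical reconstruction; names and flag format are assumptions.
class CommandLineOptions
  attr_reader :argv, :options

  def initialize(argv = [], &block)
    @options = []   # flags declared through the DSL
    @argv = argv    # raw arguments, kept for later lookups
    instance_eval(&block) if block
  end

  # DSL entry point: declare a boolean flag such as :v
  def option(flag)
    options << flag
  end

  # True only when the flag was declared AND appears on the command line.
  def has?(flag)
    options.include?(flag) && argv.include?("-#{flag}")
  end
end

parsed = CommandLineOptions.new(["-v"]) do
  option :v
  option :q
end

puts "verbose mode" if parsed.has?(:v)
```

Running the block through `instance_eval` is what lets `option :v` read like a declaration: inside the block, `self` is the parser instance.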
00:07:30.550
After running my tests, I achieve zero failures. They pass, so I'm done.
00:07:35.870
Until the future comes along, that is. My colleague comes to me and says, 'Hey, I really like that library, but can we also handle string options?'
00:07:48.740
This sounds straightforward, so I think about it and come up with a small extension to the DSL to pass a second argument to signify the option type; in this case, string. I also default to allowing boolean by not changing the existing code unnecessarily.
00:08:04.030
A string option is different from a boolean; it requires content. Therefore, I need to incorporate validation for string options—the absence of content indicates that it's not valid.
00:08:21.290
Now, I also have to normalize how I retrieve values from both string and boolean options. This change alters the API, but sometimes a small change is necessary to accommodate future growth.
00:08:38.000
This is a good time to break the API, especially since I have only one person using the library for now.
00:08:50.360
Putting it all together again, I can now pass the options on the command line, define them with the DSL, and here's how I use my validation and value methods to check if it’s valid and to extract the values.
00:09:07.740
Now here's the class that implements it, again fitting on one slide. It’s probably not as readable as before and might be a little harder to comprehend.
00:09:22.300
We're headed down what I refer to as the 'undesigned path.' It's not too large at 31 lines, but it does have issues. I’ve got a method that’s definitely large and teetering on becoming unwieldy.
00:09:32.310
It has to handle both boolean and string options, which adds quite a bit of conditional complexity. Soon, we’ll find it won’t be very amenable to change.
00:09:49.030
Let’s examine the components and how they work. The initialize method creates a hash to store the options because we need to store the type—not just the knowledge that an option exists.
00:10:05.740
Now we have a valid method that checks which options are strings. We’re checking both the type and whether they have content. The string options need validation, while the boolean options do not.
00:10:20.660
In the value method, there's a lot going on. Let’s pretend this method is a black box for now; we’ll revisit it because it's by far the worst code in this example. However, all my tests are still passing.
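To make the "undesigned path" concrete, here is a rough sketch of how such a class might look once string options are bolted on. It is not the slide from the talk. It assumes a string option is passed as `-sCONTENT` (flag and content in one argument) and that an absent string option counts as valid; both are assumptions.

```ruby
# Hypothetical sketch of the "undesigned" version; details are assumptions.
class CommandLineOptions
  attr_reader :argv, :options

  def initialize(argv = [], &block)
    @options = {}   # flag => type; a hash because the type must be stored too
    @argv = argv
    instance_eval(&block) if block
  end

  # The type defaults to :boolean so existing declarations keep working.
  def option(flag, type = :boolean)
    options[flag] = type
  end

  # Only string options can be invalid: present but with no content.
  def valid?
    options.each do |flag, type|
      next unless type == :string
      raw = argv.find { |arg| arg.start_with?("-#{flag}") }
      return false if raw && raw.length <= 2   # magic constant: "-s" is 2 chars
    end
    true
  end

  # Lookup, type switching, and content extraction all live in one method.
  def value(flag)
    raw = argv.find { |arg| arg.start_with?("-#{flag}") }
    if options[flag] == :string
      raw.nil? ? nil : raw[2..-1]              # magic constant again
    else
      !raw.nil?
    end
  end
end

parsed = CommandLineOptions.new(["-v", "-sfoo"]) do
  option :v
  option :s, :string
end

parsed.valid?     # => true
parsed.value(:v)  # => true
parsed.value(:s)  # => "foo"
```

Note how the lookup in `argv` appears twice and the number 2 appears twice. Those are the kinds of problems the talk goes on to address.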
00:10:38.750
Let's dive into methods. We’ve got some big ones that need to be cleaned up. I call it the first rule of methods: do one thing, do it well, and do only one thing.
00:10:55.740
This principle harks back to the UNIX philosophy of tools that you can string together. But how do we know if a method is doing only one thing?
00:11:14.000
Here, the levels of abstraction in your code come into play. Over time, you need to develop a feel for maintaining one level of abstraction per method.
00:11:30.330
If all of your statements are at the same level and coherent around a purpose, I consider that to be doing one thing. It doesn’t necessarily mean that a method can’t span multiple lines.
00:11:46.920
Often, I see methods with a comment that succinctly describes what they do, only to find that the method name isn't as descriptive as it should be. This highlights the importance of using descriptive method names.
00:12:02.740
Using fewer arguments is also critical. My personal goal is to have zero arguments on methods. One is okay; two or three usually indicate I might have missed an abstraction.
00:12:18.640
Separate querying from changing state: a method should either answer a question about your object or change its state, not both. Mixing the two can confuse those who consume your library.
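One common reading of this advice is command-query separation. The class below is a made-up illustration, not something from the talk: `ArgumentQueue` and its method names are hypothetical.

```ruby
# Hypothetical example of keeping queries and commands apart.
class ArgumentQueue
  def initialize(args)
    @args = args.dup   # copy so the caller's array is never mutated
  end

  # Query: answers a question, changes nothing.
  def peek
    @args.first
  end

  # Query: answers a question, changes nothing.
  def empty?
    @args.empty?
  end

  # Command: changes state and deliberately returns nothing useful.
  def advance
    @args.shift
    nil
  end
end

queue = ArgumentQueue.new(["-v", "-sfoo"])
queue.peek      # => "-v"
queue.peek      # => "-v" (asking twice is safe)
queue.advance
queue.peek      # => "-sfoo"
```

A caller can ask `peek` as often as needed without worrying that asking changes the answer.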
00:12:36.560
And as always, don't repeat yourself. As Sandi discussed earlier, it takes judgment to decide when to eliminate repetition.
00:12:50.500
Leaving repetition in your code will come back to haunt you. Let’s examine our methods. Both 'valid' and 'value' currently inspect the 'ARGV' array to find options from the command line.
00:13:07.350
This is a perfect candidate for an extract method refactoring. We also have magic constants scattered about, indicating missed abstractions.
00:13:23.960
Both methods aren’t purely doing one thing. 'Valid' digs into the 'ARGV' array while 'value' figures out different types and how to return their values.
00:13:40.050
Now we’re going to eliminate some of the repetition by conducting an extract method refactoring. This involves moving part of a method into a new method with a descriptive name, thereby maintaining consistency in abstraction.
00:14:01.840
In our command line options class, both 'valid' and 'value' dig through the 'ARGV' collection to find their values. We'll extract that logic into a method that retrieves the raw value.
00:14:12.640
The methods left behind will focus on the desired result without the complexity of the 'how,' which will be detailed in the extracted method.
00:14:28.060
I will proceed with two more extractions, specifically the value method for string options and the content method. The naming of the extracted methods is essential—they clearly define their purpose.
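As a rough illustration of where these extractions could lead, the sketch below pulls the `argv` lookup, the string value, and the content slicing into named private methods. The method names `raw_value_for`, `string_option_value`, and `content` are guesses based on the description, and the `-sCONTENT` argument format is an assumption.

```ruby
# Hypothetical result of three extract method steps; names are assumptions.
class CommandLineOptions
  attr_reader :argv, :options

  def initialize(argv = [], &block)
    @options = {}
    @argv = argv
    instance_eval(&block) if block
  end

  def option(flag, type = :boolean)
    options[flag] = type
  end

  def valid?
    options.all? do |flag, type|
      raw = raw_value_for(flag)
      type != :string || raw.nil? || !content(raw).empty?
    end
  end

  def value(flag)
    raw = raw_value_for(flag)
    options[flag] == :string ? string_option_value(raw) : !raw.nil?
  end

  private

  # Extracted: the one place that knows how to dig through argv.
  def raw_value_for(flag)
    argv.find { |arg| arg.start_with?("-#{flag}") }
  end

  # Extracted: what a string option evaluates to (nil when absent).
  def string_option_value(raw)
    raw && content(raw)
  end

  # Extracted: strips the leading "-x"; the 2 is still a magic constant.
  def content(raw)
    raw[2..-1]
  end
end
```

The public methods now read closer to "what" than "how", but both still test for `:string`, which is the remaining smell.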
00:14:48.400
However, I’m still unsatisfied with the code; it’s more explanatory but too complex and not as small as it could be. The methods are large due to missed abstractions.
00:15:01.800
Next, I’ll reference the option type symbol to see if it’s a string, which is a big warning sign that we're on the wrong path. There are also those magic constants used to retrieve content from within that string.
00:15:20.460
If I were confident that there wouldn't be any future requirements for this class, I might leave it alone. But then my colleague comes back and asks, 'Can we also handle integer options?'
00:15:36.580
To deal with this, I could continue down the undesigned path and complicate the 'valid' and 'value' methods by switching based on the option types. However, this is our chance to enhance our code to be more adaptable.
00:15:50.900
Let me demonstrate the impact of poor design. This prototype of undesigned code is not small, in my view—definitely not.
00:16:07.060
The class has expanded due to changes in specifications, and the 'valid' and 'value' methods evolve in tandem. This is a clear sign that we’ve missed an abstraction, causing those methods to grow complicated.
00:16:23.060
While all my tests pass, I feel dissatisfied. We have large methods and complex conditional logic; it's time to refactor to facilitate change easily.
00:16:46.400
I’d like to highlight a pattern emerging from non-OO design. Not reinventing the type system is crucial—if you have ducks, let them quack.
00:17:01.600
In this example, our option types are boolean, string, and integer. There are likely ducks in your code longing to be set free.
00:17:20.000
Just to confirm that we're dealing with missed abstractions, or ducks, here: the testing of option types is hidden inside the 'valid' and 'value' methods, in the form of a case statement.
00:17:35.680
When your code involves case statements like these, it's a sign you’ve missed an abstraction. The moment I encountered string types, I should have embraced the OO path.
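For readers without the slides, the smell being described might look roughly like the sketch below: two methods that switch on the same type symbol and must change together whenever a type is added. The integer validation rule (digits only) and the argument format are assumptions.

```ruby
# Hypothetical sketch of the case-statement smell; rules are assumptions.
class CommandLineOptions
  attr_reader :argv, :options

  def initialize(argv = [], &block)
    @options = {}
    @argv = argv
    instance_eval(&block) if block
  end

  def option(flag, type = :boolean)
    options[flag] = type
  end

  # First switch on type...
  def valid?
    options.all? do |flag, type|
      raw = raw_value_for(flag)
      case type
      when :boolean then true
      when :string  then raw.nil? || !content(raw).empty?
      when :integer then raw.nil? || content(raw).match?(/\A\d+\z/)
      end
    end
  end

  # ...and a second switch on the very same type symbol.
  def value(flag)
    raw = raw_value_for(flag)
    case options[flag]
    when :boolean then !raw.nil?
    when :string  then raw && content(raw)
    when :integer then raw && content(raw).to_i
    end
  end

  private

  def raw_value_for(flag)
    argv.find { |arg| arg.start_with?("-#{flag}") }
  end

  def content(raw)
    raw[2..-1]
  end
end
```

Each `when` branch is behavior that belongs to a type. That is the hint that boolean, string, and integer options want to be objects of their own.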
00:17:52.020
Sometimes it’s hard to recognize when to shift gears while writing code. It's time for a fresh perspective.
00:18:06.840
Let’s rethink what constitutes a good class, focusing on the principles that allow us to write small classes. First and foremost, ensure each class has a single responsibility.
00:18:21.680
Furthermore, all properties of a class should be cohesive to the abstraction that the class is representing. If properties are only referenced in one or two methods, that's likely an indicator they don't belong there.
00:18:40.460
Choosing an apt name helps maintain focus on a single responsibility. Sometimes I refer to this as talking to the rubber duck; explaining your problem—even to an inanimate object—can lead to clarity.
00:18:58.010
The main tools we’ll use to create new classes from existing code are extract class and move method refactorings.
00:19:07.050
Characteristics of a well-designed class include a single responsibility, cohesion around a defined set of properties, and a small public interface that exposes a limited number of methods.
00:19:18.330
If possible, the primary logic should be expressed in a composed method, but this topic deserves its own discussion.
00:19:37.040
Now, let’s explore the code we should have aimed towards once string option types emerged. Imagine we have a clean slate to write command line options with the knowledge we have now.
00:19:55.260
We need to account for boolean, string, and integer options, and let’s ensure our tests remain intact to prevent any breakages.
00:20:14.910
Here’s my initial attempt at writing the class. It’s 28 lines long and cohesive around its properties, and I’ll leave it mostly as it is.
00:20:30.050
Most of its methods manage a collection of option objects, and overseeing that collection is the class's sole responsibility.
00:20:46.200
I did introduce a collaborator that manufactures the option objects, which I could extract to another class.
00:20:59.900
But for now, I’ll leave it here. In general, I refactor when I feel the pain during changes, indicating a need for refinement.
00:21:16.020
The command line options class should have a small public interface, specifically two methods: valid and value, with no hard-coded external dependencies.
00:21:33.150
Also, this class doesn't contain any conditional statements, and that's intentional. In Sandi Metz's 2009 talk on SOLID principles, she stated that conditionals in OO languages are often a sign of poor design.
00:21:51.820
I don't think Sandi means conditionals should never be used. Rather, they can obscure abstractions in your code.
00:22:06.120
The initialize and option methods carry over almost unchanged; the hash now stores option objects instead of just the type.
00:22:24.430
The valid method simply checks all options to verify their validity, while the value method looks up values in the option hash.
00:22:41.410
Now we need to implement our options, which requires instantiating objects that represent the boolean, string, and integer options.
00:23:01.800
This creates a dependency. When introducing dependencies, we should aim for those that can accommodate future changes.
00:23:19.320
Instead of depending on concrete implementations, we need to align with abstractions. This applies even to duck types, where we rely on the concept rather than specific implementations.
00:23:35.220
The option is the abstraction here. The simple abstraction consists of having valid and value methods and a consistent initialization process across types.
00:23:49.890
I could go down the case statement road again and check the option type, instantiating the correct type based on the symbol.
00:24:00.020
However, I won't do that, as it would tie my command line class to the concrete types—something we want to avoid.
00:24:12.780
Creating a dependency on abstractions instead of concretions means using Ruby’s dynamic capabilities to instantiate those objects.
00:24:27.310
Using naming conventions, we can automatically instantiate the appropriate option class based on types like string, boolean, etc.
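A minimal sketch of convention-based instantiation follows, assuming the convention is "type symbol, capitalized, plus `Option`" (so `:boolean` resolves to `BooleanOption`). The helper name `build_option`, the constructor signature, and the stand-in `BooleanOption` are assumptions included only to keep the example self-contained.

```ruby
# Stand-in option type so the sketch runs on its own (hypothetical).
class BooleanOption
  def initialize(flag, raw_value)
    @flag = flag
    @raw_value = raw_value
  end

  def valid?
    true
  end

  def value
    !@raw_value.nil?
  end
end

class CommandLineOptions
  attr_reader :argv, :options

  def initialize(argv = [], &block)
    @options = {}   # flag => option object
    @argv = argv
    instance_eval(&block) if block
  end

  def option(flag, type = :boolean)
    options[flag] = build_option(flag, type)
  end

  def valid?
    options.values.all?(&:valid?)
  end

  def value(flag)
    options[flag].value
  end

  private

  # :boolean -> "BooleanOption" -> the class of that name.
  # No concrete option class is mentioned anywhere in this class.
  def build_option(flag, type)
    Object.const_get("#{type.to_s.capitalize}Option").new(flag, raw_value_for(flag))
  end

  def raw_value_for(flag)
    argv.find { |arg| arg.start_with?("-#{flag}") }
  end
end

parsed = CommandLineOptions.new(["-v"]) { option :v }
parsed.value(:v)  # => true
```

With this shape, declaring `option :s, :string` needs only a `StringOption` class to exist. An unknown type surfaces as a `NameError` from `const_get`, which a fuller version might want to translate into a friendlier message.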
00:24:44.000
This adjustment shifts the command line option class from depending on concrete implementations to relying on abstractions.
00:25:01.400
This approach embodies dependency inversion from the SOLID principles. Some have suggested mapping symbols to concrete classes, but that complicates future modifications.
00:25:15.640
In my case, I’m comfortable using the dynamic capabilities of Ruby.
00:25:28.920
At this point, our command line options class is designed to accommodate new option types without requiring changes to its core structure.
00:25:42.860
To this end, we’ve constructed a clean hierarchy while ensuring our classes remain open to extension.
00:25:56.060
Next, we need to shift the logic for various option types to the appropriate option classes. I decided to create a base class from which the concrete option types would inherit.
00:26:09.080
This inheritance structure allows us to maintain uniformity in initialization without redundant code. Each subtype maintains cohesion around specific attributes.
00:26:25.660
For the boolean option, the requirements are straightforward: a boolean is always valid, and its value is true if the flag is present and false otherwise.
00:26:41.020
Now, we need to implement string and integer options, extracting validation and value extraction logic from the original command line options class.
00:26:57.060
On one side, we have the original command line options; on the other are the new string and integer option classes. We've successfully divided the logic appropriately.
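The option classes on that slide might look something like the sketch below. It assumes each option is constructed with its flag and the raw argument (or `nil` when absent), that content follows the two-character flag, and that an integer is valid when it consists of digits only. All of these details are assumptions.

```ruby
# Hypothetical option hierarchy; constructor and rules are assumptions.
class Option
  attr_reader :flag, :raw_value

  # Shared initialization: every option type is built the same way.
  def initialize(flag, raw_value)
    @flag = flag
    @raw_value = raw_value
  end
end

class BooleanOption < Option
  def valid?
    true                      # a boolean is always valid
  end

  def value
    !raw_value.nil?           # true when the flag is present
  end
end

class StringOption < Option
  def valid?
    raw_value.nil? || !content.empty?
  end

  def value
    content if raw_value
  end

  private

  def content
    raw_value[2..-1]          # duplicated in IntegerOption for now
  end
end

class IntegerOption < Option
  def valid?
    raw_value.nil? || content.match?(/\A\d+\z/)
  end

  def value
    content.to_i if raw_value
  end

  private

  def content
    raw_value[2..-1]
  end
end

StringOption.new(:s, "-sfoo").value   # => "foo"
IntegerOption.new(:n, "-n42").value   # => 42
```

Each class answers only `valid?` and `value`, so the parser can treat them all as the same duck.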
00:27:13.220
By applying a mixture of extract class and move method refactorings, we’ve effectively streamlined the command line options class, leaving minimal code remaining.
00:27:30.020
Now, we can replace the complex valid and value methods with simpler versions that adhere to the principles we've discussed.
00:27:45.060
To validate the various option classes, I shifted the corresponding sections from the command line option spec to their new respective areas, updating them as necessary.
00:27:59.260
As I went through the process of extracting those classes and moving their code, we isolated specific abstractions.
00:28:14.790
We need to differentiate the 'what' from the 'how.' Our goal is to transition from code that looks excessively complex to more streamlined representations.
00:28:30.000
The original command line options 'valid' method contained all of the 'how'; now it focuses on simply stating what needs to be done.
00:28:46.000
As a result, the 'how' has been relegated to the respective collaborators, specifically the string option, boolean option, and integer option classes.
00:29:02.020
Once we’ve completed the refactoring, the command line options class boasts a very small interface that effectively meets its use case.
00:29:14.950
None of the private implementation details require attention from outside the class. This delineation makes it clear which methods are for internal use.
00:29:29.080
Ultimately, the private implementation exists only to fulfill our public interface. As I worked through the specs, I made adjustments until everything passed.
00:29:46.490
Yet again, this is a reminder that nothing is ever truly finished. I now hear, 'Any chance we could add the ability to pass an array of values for an option?'
00:30:02.710
To implement this requirement, I only need to create a new array option class. I’ll draft a spec to make it fail, then create the array option class, and that’s done.
00:30:17.490
In this approach, the array option class inherits from a new superclass for options that have content. As I worked through this, it became evident that strings, integers, and arrays all have content.
00:30:31.660
This realization led to the extraction of that superclass. For each option type, I just implemented the value method and we’re set.
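A sketch of how that extraction might look, with the shared content handling pulled into a superclass and each concrete type supplying only `value`. The name `OptionWithContent`, the constructor signature, and the comma-separated array format are assumptions.

```ruby
# Hypothetical hierarchy after extracting the shared "has content" behavior.
class Option
  attr_reader :flag, :raw_value

  def initialize(flag, raw_value)
    @flag = flag
    @raw_value = raw_value
  end
end

# Options that carry content after the flag, e.g. "-sfoo".
class OptionWithContent < Option
  # Valid when absent, or when present with non-empty content.
  def valid?
    raw_value.nil? || !content.empty?
  end

  private

  def content
    raw_value[2..-1]
  end
end

class StringOption < OptionWithContent
  def value
    content if raw_value
  end
end

# The new requirement: only a value method is needed.
class ArrayOption < OptionWithContent
  def value
    raw_value ? content.split(",") : []
  end
end

ArrayOption.new(:a, "-ax,y,z").value  # => ["x", "y", "z"]
```

Nothing in the parser has to change for this to work, provided it locates option classes by convention rather than by name.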
00:30:45.830
Now, we have a command line options class that is closed for modifications but open for extensions. I can introduce float types or other new option types without additional changes.
00:30:58.620
Our small, comprehensible option classes have single responsibilities and are easy to compose together. We can simply create new option types to meet requirements.
00:31:06.880
My name is Mark Menard, president of Enable Labs. We handle full lifecycle business productivity and SaaS app development, taking ideas from napkin sketches to production.
00:31:35.030
I'm here at the conference, so let’s gather and discuss code. I'm available for questions.