Code Smells: Your Refactoring Cheat Codes

Talks

John Pignata

#test-driven-development

#ruby

#object-oriented-programming

#backend-development

Code Smells: Your Refactoring Cheat Codes

by John Pignata

In the video titled 'Code Smells: Your Refactoring Cheat Codes' presented by John Pignata at MountainWest RubyConf 2013, the speaker emphasizes the importance of recognizing and responding to code smells as essential heuristics for refactoring.

Key Points Discussed:
- Definition of Code Smells: Code smells are indicators of resistance within software design, pointing to potential areas that require refactoring. They were first articulated in the late 90s and help developers understand when they need to revise their code.
- Refactoring Process: The talk features a live refactoring session focused on a backend API for Android push notifications through a component known as the push daemon, which is responsible for receiving messages and delivering them via HTTP.
- Code Examination: Pignata goes through the initial code structure, noting issues such as having a long, unwieldy method that encapsulates various tasks, leading to complexity and difficulty in maintenance.
- Incremental Refactoring Steps: The speaker demonstrates the refactoring process step-by-step, such as extracting responsibilities into smaller methods, creating distinct objects for clarity, and leveraging patterns like the null object pattern to streamline command processing.
- Improving Communication Between Components: The need for improved communication between the push daemon and the UDP server is highlighted to avoid unnecessary complexity and dependency. This includes abstracting command interactions to prevent tight coupling.
- Final Enhancements: Throughout the demonstration, Pignata refines the system, promoting maintainability and better structural design, while ensuring the code aligns with best practices.

Conclusions and Takeaways:
- Listening to code smells can lead to a more sustainable software design by allowing developers to identify areas in need of improvement.
- A systematic approach to refactoring, focusing on individual responsibilities and clear interfaces, enhances both code quality and future maintainability.
- Adopting such strategies is crucial for striking a balance between immediate delivery and long-term software health, thereby enabling growth and adaptability in codebases.

00:00:20.600 Good morning everyone, can you all hear me? Great! So, they've lured you here with Matt's keynote, and now you're stuck in a refactoring talk. I apologize for that. My name is John Pignata, and I'm an engineer in New York with a company called GroupMe.

00:00:26.160 I want to start with a quick definition. Code smells are heuristics for refactoring. The concept of code smells was first introduced by Kent Beck on Ward Cunningham's Wiki in the late 90s, and later more formally published in the refactoring book in a chapter co-authored by Kent Beck.

00:00:45.039 When we're building software, our design communicates with us through resistance. We may say things like the code is difficult to understand, test, change, or reuse. This expression of resistance is really valuable; it's nudging us along to refactor. Code smells are hints from our software about how it wants us to reduce this resistance.

00:01:05.119 In this way, when we listen and respond to the hints that our software is giving us about how it wants to be structured, our design emerges from our system's parts. The rest of this talk will involve a lot of code. We will be refactoring a backend API for a mobile application used by Google Android devices.

00:01:30.920 The component we are focusing on is the push daemon. The feature we are looking at is delivering push notifications to users’ phones. For example, when someone favorites your message on Twitter, you get a push notification; that’s what we are doing here.

00:01:42.079 The component that delivers the push notifications to Android phones is the push daemon, which we have extracted from our application. The push daemon is a bit unorthodox. It listens on a UDP port, receives UDP datagrams, extracts messages, and then delivers those messages through HTTP to the Google API, which then handles the delivery to the user's device.

00:02:18.599 To give you an idea of the code as it was discovered, here it is. There’s a lot of it, and we will go through it line by line, so don’t worry about reading it. Hopefully, it’s somewhat legible.

00:02:24.879 Let’s talk about what this system does in detail. There are two internally visible behaviors in our push daemon: one command called ping, which is a simple health check, and a command called send, which is the central command of our system that delivers messages to Google’s API.

00:02:37.560 To see how this system works, we are actually going to run a script that sends UDP messages and observe its behavior externally. If we run the script, we see that it binds to a UDP port; in this case, it’s 6889. The script will block and wait for incoming datagrams. We could use a utility called netcat, which is a small Unix utility for sending messages across a network.

00:03:01.599 Netcat has a -u flag for UDP, so we will send messages to our local host on port 6889. We’ll type in ping and hit enter; it’s supposed to respond with pong, which it does. So, our first test is passing. Next, we’ll synthesize a send command to test its complexity.

00:03:25.720 The send command takes parameters: the first is a registration ID, which represents a physical Android phone in the field. An Android device registers with Google’s API, which provides a token that the application uses to address the phone. The second parameter is just some text we want to deliver to the phone. Since we’re testing this as a black box, the only way to ensure it works is to use an actual registration ID for an actual Google phone. Once we do that and hit enter, we check the physical phone and receive the message, confirming that our system works as intended.

00:04:18.639 Because we will be running this test many times during our refactoring process, we want instant feedback. We’ll use rspec to write our outermost possible coverage for this test. In this setup, we won’t be using a proper object definition or methods, just a raw script that runs.

00:04:55.480 We will use the Ruby load method, conduct the tests in a background thread, and assert the behavior. We could fork a process, but using a thread allows us to be in the same process as the component we're testing, which facilitates mocking, such as the Google API.

00:05:10.840 Here’s the first test, similar to what we did with netcat, but using Ruby. We will say socket.send ping, and then we will assert that the response should be pong.

00:05:25.720 This is the same as what we set up before, but now we want to automate the send test similarly. We will send a prefab message, so we have our token along with some text. However, we need to ensure that it functions as a self-contained unit, so we will mock the Google API to handle the parameters we send, verifying they are correct.

00:06:05.400 Now we can run the test, and once it passes, we could use an autograder running in the background to help during our refactoring process. Now, we can actually start refactoring. Going back to the original code, we note that all we currently have is one long method.

00:06:28.840 Although it is technically not a method, our script body acts like a main method. Unfortunately, this is quite long, at around 30 lines, performing various tasks. At the top, we instantiate collaborators like an HTTP client and a UDP socket—three components that appear unrelated.

00:07:08.520 Next, we create a thread pool here to manage background tasks. By using a queue, we can manage work coordination with the worker threads. Essentially, we are simultaneously making a service request to the Google API.

00:07:38.079 The next step is binding the script to our socket. We have our socket hardcoded to port 6889, and beneath that, we pull the socket to check for incoming datagrams, dispatching commands based on the content, either ping or send.

00:08:03.919 In the send method, we undertake strenuous string manipulations to extract parameters, creating a JSON body that Google expects. To organize this mess, we can break it down into smaller pieces, starting by replacing this body with a method object for a more structured approach.

00:08:24.760 Given we have just one method, we will create a method called push daemon. This will allow us to promote local variables into instance variables of the object we are structuring. The rest of the script body will be encapsulated in a method called start. References to locals will be updated to use our new instance variables.

00:09:13.360 While we still have a long method scenario, it at least now possesses a name, making it clearer. We can extract three distinct statements in the start method into private methods for better clarity, including spawning workers, binding the socket, and processing requests.

00:09:47.280 We still have extensive extraction to perform. To determine the responsibilities of our push daemon, we can assess potential changes. If our object has many reasons to change—observe the divergent change code smell—this may indicate too many responsibilities.

00:10:24.960 We can start by extracting the worker object. Since it uses a queue and a client, it can encase the job it’s performing. We will set up a worker to encapsulate these two members, and we can then return to the push daemon.

00:11:03.640 Once we replace the queue and client with our worker, we apply further scrutiny to avoid losing track of object state. In the process request, we're currently pushing directly to the queue. We need to create a shovel method in the worker to allow submission of background work.

00:11:43.679 Next, we can transfer the method spawn from push daemon to the worker. It will facilitate spawning workers, and while we parameterize count, we will leave the rest unchanged for now. We also need to delve into the UDP socket, which is crucial in our code.

00:12:08.160 The UDP socket is responsible for binding to a port, receiving data through it, and sending data. As we're dealing with the internal mechanics of the socket, it’s apparent our objects need a higher-level conversation interface. We'll create a UDP server.

00:12:43.360 The UDP server will expose methods like bind, receive, and send. This way, we raise the abstraction level for our push daemon, which will only need to tell the UDP server to bind to a certain port.

00:13:20.640 However, as we dive into the interactions between the push daemon and the UDP server, we discover inappropriate intimacy—our system is too tightly coupled. The push daemon shouldn’t need to pull or constantly check the UDP server for new data.

00:13:52.480 Instead, we need to rearrange the objects where they notify each other to process new data intuitively. Going forward, we'll update the push daemon to register its interest in new requests from the UDP server.

00:14:32.400 We’ll simplify the initial binding to just listen for new requests. The process request method must be updated as well; it should assume that data will be passed to it when invoked.

00:15:18.680 Meanwhile, we will change the name from process request to call, taking inspiration from typical conventions, making it more intuitive for future developers.

00:15:56.640 Next, we will update the UDP server based on the new arrangements, ensuring its listen method now encapsulates the looping and data retrieval responsibilities.

00:16:38.480 In turn, the push daemon can now bypass unnecessary checks and simply interact with its intended API. We remain left with a sizable case statement; thus, we should derive a better structure to prevent entangled behaviors.

00:17:21.840 To collapse the case statement, we'll replace its branches with respective objects that perform the desired actions—a process known as leveraging polymorphism, allowing us to use behaviors instead of conditionals.

00:17:59.720 We’ve launched the distinct jobs: ping and send, designating certain behaviors to them. Each job takes the necessary parameters, allowing them to encapsulate their specific methods.

00:19:07.320 Encapsulating the Google API details within the send job encapsulates the data and functionality adequately, while ping and send now operate uniformly.

00:19:56.240 Transitioning tasks between the worker and jobs ensures both functionalities remain distinct. As we uncover a factory mechanism, we realize it can operate independently of conditionals.

00:20:40.960 We finalize this work by eliminating the need for a case statement, instead implementing a hash to map commands to job classes—streamlining our push daemon and minimizing complexity.

00:21:31.760 However, as we continue refining the jobs, we discover clumped data attributes that signal an opportunity for further abstraction within our system.

00:22:09.760 With the introduction of a Client class capturing request details, we transition to a more holistic data model while simultaneously eliminating confusion in our code with clear attributes.

00:22:54.159 As we hand over significant responsibilities from the jobs back to the Client, we solidify its status as a key player in our architecture, granting it necessary behaviors.

00:23:39.680 Simultaneously, we continue to clean up our interface, removing the need for a superfluous server parameter, further encapsulating responsibilities and enhancing clarity.

00:24:24.240 We also need to refine our send job, ensuring that its attributes carry distinct meanings and addressing the underlying complexity with refined abstractions.

00:25:07.680 After resolving ambiguity in namings and aligning responsibilities, we identify what behavior we need to ensure proper tokenization within our request component.

00:25:48.119 Now, we pass along a request object simplifying the complexity while retaining clear functionality. No longer reliant on low-level string manipulations, we elevate our handling of incoming data.

00:26:32.639 Allowing hash behaviors and the command to drive job activation eliminates clutter, ensuring separation of structural elements from operational functionalities.

00:27:12.839 As we critically assess our code, we remain cognizant of nil checks scattered throughout, and we implement a null object pattern to encapsulate command responses.

00:28:05.920 By substituting the null job for overwriting instances that yield unintended behavior, we enhance our worker’s responsiveness while curbing redundancy.

00:28:55.239 This holistic approach to invoking job behavior fosters coherence and alleviates unnecessary checks, tracking our system towards a more streamlined model.

00:29:40.160 In closing, while there remains work to reconcile hard-coded configuration details, we’ve made substantial progress in addressing code smells and enhancing the structure of our push daemon. By listening and responding to these code smells, we’ve taken significant steps towards a more sustainable design, enabling better maintainability and growth.

00:31:00.960 Thank you all for your attention; I truly appreciate it.