00:00:20.600
Good morning everyone, can you all hear me? Great! So, they've lured you here with Matt's keynote, and now you're stuck in a refactoring talk. I apologize for that. My name is John Pignata, and I'm an engineer in New York with a company called GroupMe.
00:00:26.160
I want to start with a quick definition. Code smells are heuristics for refactoring. The concept of code smells was first introduced by Kent Beck on Ward Cunningham's wiki in the late 90s, and later more formally published in the book Refactoring, in a chapter co-authored by Kent Beck.
00:00:45.039
When we're building software, our design communicates with us through resistance. We may say things like the code is difficult to understand, test, change, or reuse. This expression of resistance is really valuable; it's nudging us along to refactor. Code smells are hints from our software about how it wants us to reduce this resistance.
00:01:05.119
In this way, when we listen and respond to the hints that our software is giving us about how it wants to be structured, our design emerges from our system's parts. The rest of this talk will involve a lot of code. We will be refactoring a backend API for a mobile application used by Google Android devices.
00:01:30.920
The component we are focusing on is the push daemon. The feature we are looking at is delivering push notifications to users’ phones. For example, when someone favorites your message on Twitter, you get a push notification; that’s what we are doing here.
00:01:42.079
The component that delivers the push notifications to Android phones is the push daemon, which we have extracted from our application. The push daemon is a bit unorthodox. It listens on a UDP port, receives UDP datagrams, extracts messages, and then delivers those messages through HTTP to the Google API, which then handles the delivery to the user's device.
00:02:18.599
To give you an idea of the code as it was discovered, here it is. There’s a lot of it, and we will go through it line by line, so don’t worry about reading it. Hopefully, it’s somewhat legible.
00:02:24.879
Let's talk about what this system does in detail. There are two externally visible behaviors in our push daemon: one command called ping, which is a simple health check, and a command called send, which is the central command of our system and delivers messages to Google's API.
00:02:37.560
To see how this system works, we are going to run the script and observe its behavior externally by sending it UDP messages. If we run the script, we see that it binds to a UDP port; in this case, it's 6889. The script will block and wait for incoming datagrams. We can use a utility called netcat, which is a small Unix utility for sending messages across a network.
00:03:01.599
Netcat has a -u flag for UDP, so we will send messages to localhost on port 6889. We'll type in ping and hit enter; it's supposed to respond with pong, which it does. So, our first test is passing. Next, we'll try the send command, which is more complex.
00:03:25.720
The send command takes parameters: the first is a registration ID, which represents a physical Android phone in the field. An Android device registers with Google’s API, which provides a token that the application uses to address the phone. The second parameter is just some text we want to deliver to the phone. Since we’re testing this as a black box, the only way to ensure it works is to use an actual registration ID for an actual Google phone. Once we do that and hit enter, we check the physical phone and receive the message, confirming that our system works as intended.
00:04:18.639
Because we will be running this test many times during our refactoring process, we want instant feedback. We'll use RSpec to write the outermost possible test coverage for this behavior. In this setup, we won't have a proper object definition or methods to call, just a raw script that runs.
00:04:55.480
We will use the Ruby load method, conduct the tests in a background thread, and assert the behavior. We could fork a process, but using a thread allows us to be in the same process as the component we're testing, which facilitates mocking, such as the Google API.
00:05:10.840
Here’s the first test, similar to what we did with netcat, but using Ruby. We will say socket.send ping, and then we will assert that the response should be pong.
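A minimal sketch of that ping check in plain Ruby. The spec from the talk is not reproduced in this transcript, so this uses a stand-in responder thread instead of the real daemon; the `ping_over_udp` helper, the ephemeral port, and the two-second timeout are all assumptions made for illustration.

```ruby
require "socket"
require "timeout"

# Stand-in for the daemon: answers "ping" with "pong" on an ephemeral port.
# (Hypothetical; the real script binds to 6889 and does much more.)
server = UDPSocket.new
server.bind("127.0.0.1", 0)            # port 0 lets the OS pick a free port
port = server.addr[1]

responder = Thread.new do
  data, sender = server.recvfrom(4096)
  # sender is [address_family, port, hostname, ip_address]
  server.send("pong", 0, sender[3], sender[1]) if data.strip == "ping"
end

# Hypothetical helper: send one datagram and wait for a single reply.
def ping_over_udp(host, port)
  socket = UDPSocket.new
  socket.send("ping", 0, host, port)
  Timeout.timeout(2) { socket.recvfrom(4096).first }
ensure
  socket&.close
end

reply = ping_over_udp("127.0.0.1", port)
responder.join
server.close
puts reply
```

In an RSpec example the final line would become an expectation on `reply` rather than a `puts`.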
00:05:25.720
This is the same as what we set up before, but now we want to automate the send test similarly. We will send a prefab message, so we have our token along with some text. However, we need to ensure that it functions as a self-contained unit, so we will mock the Google API to handle the parameters we send, verifying they are correct.
00:06:05.400
Now we can run the test, and once it passes, we could keep an automated test runner going in the background to help during our refactoring process. Now, we can actually start refactoring. Going back to the original code, we note that all we currently have is one long method.
00:06:28.840
Although it is technically not a method, our script body acts like a main method. Unfortunately, it is quite long, at around 30 lines, and it performs various tasks. At the top, we instantiate three collaborators, including an HTTP client and a UDP socket, that appear unrelated to each other.
00:07:08.520
Next, we create a thread pool here to manage background tasks. By using a queue, we can coordinate work with the worker threads. Essentially, the workers make service requests to the Google API concurrently.
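The queue-plus-threads arrangement described here can be sketched roughly as follows. The pool size, the `:stop` sentinel, and the `deliver` lambda standing in for the HTTP call to Google are assumptions; the original script's details are not shown in the transcript.

```ruby
# Queue is available without a require on current Rubies.
queue   = Queue.new
results = Queue.new

# Hypothetical stand-in for the HTTP request to Google's API.
deliver = ->(message) { "delivered: #{message}" }

# Spawn a small pool; each worker blocks on queue.pop until work arrives.
workers = 3.times.map do
  Thread.new do
    while (message = queue.pop) != :stop
      results << deliver.call(message)
    end
  end
end

%w[hello world].each { |text| queue << text }
workers.size.times { queue << :stop }   # one sentinel per worker
workers.each(&:join)

delivered = []
delivered << results.pop until results.empty?
puts delivered.sort
```

The main thread only enqueues work; the blocking `pop` is what coordinates it with the workers.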
00:07:38.079
The next step is binding our socket to a port. We have the port hardcoded to 6889, and beneath that, we poll the socket for incoming datagrams, dispatching commands based on their content, either ping or send.
00:08:03.919
In the send branch, we undertake strenuous string manipulations to extract parameters, creating the JSON body that Google expects. To organize this mess, we can break it down into smaller pieces, starting by replacing the script body with a method object for a more structured approach.
00:08:24.760
Given we have just one method, we will create a class called PushDaemon. This will allow us to promote local variables into instance variables of the object we are structuring. The rest of the script body will be encapsulated in a method called start. References to locals will be updated to use our new instance variables.
00:09:13.360
While we still have a long method scenario, it at least now possesses a name, making it clearer. We can extract three distinct statements in the start method into private methods for better clarity, including spawning workers, binding the socket, and processing requests.
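As a rough sketch of where this step leaves us, the class below wraps the script body in a `start` method and extracts the three named steps. The method bodies, the keyword arguments, the pool size, and the `deliver` placeholder are assumptions; only the class name, `start`, and the three responsibilities come from the talk.

```ruby
require "socket"

# Hypothetical shape of the script after wrapping it in a class and
# extracting three private methods; bodies are simplified placeholders.
class PushDaemon
  def initialize(port: 6889, worker_count: 2)
    @port         = port
    @worker_count = worker_count   # assumed pool size
    @queue        = Queue.new      # formerly a local variable
    @socket       = UDPSocket.new  # formerly a local variable
  end

  # The former script body, now reading as three named steps.
  def start
    spawn_workers
    bind_socket
    process_requests
  end

  private

  def spawn_workers
    @worker_count.times do
      Thread.new do
        loop { deliver(@queue.pop) }
      end
    end
  end

  def bind_socket
    @socket.bind("0.0.0.0", @port)
  end

  def process_requests
    loop do
      data, sender = @socket.recvfrom(4096)
      case data.split.first
      when "ping" then @socket.send("pong", 0, sender[3], sender[1])
      when "send" then @queue << data
      end
    end
  end

  # Placeholder for the HTTP request to Google's API.
  def deliver(_data); end
end

# PushDaemon.new.start   # blocks forever, so it is left commented out here
```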
00:09:47.280
We still have extensive extraction to perform. To determine the responsibilities of our push daemon, we can assess potential changes. If our object has many reasons to change—observe the divergent change code smell—this may indicate too many responsibilities.
00:10:24.960
We can start by extracting the worker object. Since it uses a queue and a client, it can encase the job it’s performing. We will set up a worker to encapsulate these two members, and we can then return to the push daemon.
00:11:03.640
Once we replace the queue and client with our worker, we check every remaining reference to them so that we don't lose track of object state. In process request, we're currently pushing directly to the queue. We need to create a shovel (<<) method on the worker to allow submission of background work.
00:11:43.679
Next, we can transfer the method spawn from push daemon to the worker. It will facilitate spawning workers, and while we parameterize count, we will leave the rest unchanged for now. We also need to delve into the UDP socket, which is crucial in our code.
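A minimal sketch of such a worker, assuming the client is any object that responds to `deliver`. The `shutdown` method and the `RecordingClient` test double are additions for the demonstration and are not part of the talk.

```ruby
# Hypothetical Worker: owns the queue and the client, and exposes
# a shovel method so callers never touch the queue directly.
class Worker
  def initialize(client)
    @client = client
    @queue  = Queue.new
  end

  # Submit background work; returns self so calls can be chained.
  def <<(work)
    @queue << work
    self
  end

  # Start `count` threads that drain the queue.
  def spawn(count)
    count.times.map do
      Thread.new do
        # pop returns nil once the queue is closed and empty
        while (work = @queue.pop)
          @client.deliver(work)
        end
      end
    end
  end

  # Not from the talk: lets the demo below finish cleanly.
  def shutdown
    @queue.close
  end
end

# Test double standing in for the HTTP client.
class RecordingClient
  attr_reader :delivered

  def initialize
    @delivered = Queue.new
  end

  def deliver(work)
    @delivered << work
  end
end

client  = RecordingClient.new
worker  = Worker.new(client)
threads = worker.spawn(2)
worker << "first" << "second"
worker.shutdown
threads.each(&:join)

delivered = []
delivered << client.delivered.pop until client.delivered.empty?
puts delivered.sort
```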
00:12:08.160
The UDP socket is responsible for binding to a port, receiving data through it, and sending data. As we're dealing with the internal mechanics of the socket, it’s apparent our objects need a higher-level conversation interface. We'll create a UDP server.
00:12:43.360
The UDP server will expose methods like bind, receive, and send. This way, we raise the abstraction level for our push daemon, which will only need to tell the UDP server to bind to a certain port.
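One way such a wrapper could look, as a sketch only. The method signatures, the `port` and `close` helpers, and the loopback default are assumptions; the talk names only bind, receive, and send.

```ruby
require "socket"

# Hypothetical wrapper that hides UDPSocket details behind three verbs.
class UDPServer
  def initialize
    @socket = UDPSocket.new
  end

  def bind(port, host = "127.0.0.1")
    @socket.bind(host, port)
  end

  # Helper for the demo: the port actually bound (useful with port 0).
  def port
    @socket.addr[1]
  end

  # Returns [message, sender], where sender is UDPSocket's address array.
  def receive(max_bytes = 4096)
    @socket.recvfrom(max_bytes)
  end

  # Note: this shadows Object#send on instances; __send__ still works.
  def send(message, sender)
    @socket.send(message, 0, sender[3], sender[1])
  end

  def close
    @socket.close
  end
end

server = UDPServer.new
server.bind(0)                          # ephemeral port for the demo

client = UDPSocket.new
client.send("ping", 0, "127.0.0.1", server.port)

message, sender = server.receive
server.send("pong", sender) if message == "ping"

reply = client.recvfrom(4096).first
client.close
server.close
puts reply
```

The caller now speaks in terms of binding, receiving, and sending, and never sees flags or address arrays being picked apart.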
00:13:20.640
However, as we dive into the interactions between the push daemon and the UDP server, we discover inappropriate intimacy: the two objects are too tightly coupled. The push daemon shouldn't need to poll the UDP server constantly for new data.
00:13:52.480
Instead, we need to rearrange the objects so that the UDP server notifies the push daemon when new data arrives. Going forward, we'll update the push daemon to register its interest in new requests from the UDP server.
00:14:32.400
We’ll simplify the initial binding to just listen for new requests. The process request method must be updated as well; it should assume that data will be passed to it when invoked.
00:15:18.680
Meanwhile, we will change the name from process request to call, taking inspiration from typical conventions, making it more intuitive for future developers.
00:15:56.640
Next, we will update the UDP server based on the new arrangements, ensuring its listen method now encapsulates the looping and data retrieval responsibilities.
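The inversion described over the last few steps might look roughly like this: the server owns the loop and calls back into whatever handler was registered. The constructor arguments, the `received` queue, and the exact `listen` signature are assumptions made so the sketch can run on its own.

```ruby
require "socket"

# Hypothetical server: owns the receive loop and notifies a handler.
class UDPServer
  def initialize(port, host = "127.0.0.1")
    @socket = UDPSocket.new
    @socket.bind(host, port)
  end

  def port
    @socket.addr[1]
  end

  # The handler is any object that responds to #call(message, sender).
  def listen(handler)
    loop do
      message, sender = @socket.recvfrom(4096)
      handler.call(message, sender)
    end
  end
end

# Hypothetical daemon: registers itself and waits to be called.
class PushDaemon
  attr_reader :received

  def initialize(server)
    @server   = server
    @received = Queue.new   # only here so the demo can observe calls
  end

  def start
    @server.listen(self)
  end

  # Formerly process request; data is now handed in by the server.
  def call(message, _sender)
    @received << message
  end
end

server = UDPServer.new(0)
daemon = PushDaemon.new(server)
runner = Thread.new { daemon.start }

client = UDPSocket.new
client.send("ping", 0, "127.0.0.1", server.port)

first_message = daemon.received.pop     # blocks until the callback fires
runner.kill
client.close
puts first_message
```

Because the handler only needs to respond to `call`, a lambda or a method object would work here as well as the daemon itself.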
00:16:38.480
In turn, the push daemon can now bypass unnecessary checks and simply interact with its intended API. We remain left with a sizable case statement; thus, we should derive a better structure to prevent entangled behaviors.
00:17:21.840
To collapse the case statement, we'll replace its branches with objects that perform the desired actions. This refactoring is known as Replace Conditional with Polymorphism, and it lets us dispatch on behavior instead of on conditionals.
00:17:59.720
We've extracted two distinct jobs, ping and send, and moved the corresponding behavior into them. Each job takes the parameters it needs, allowing it to encapsulate its own logic.
00:19:07.320
Moving the Google API details into the send job keeps that data and functionality together, while ping and send now expose a uniform interface.
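A sketch of two such job objects sharing one interface. The class names `PingJob` and `SendJob`, the constructor arguments, and the payload shape are assumptions; the body Google actually expects is not shown in this transcript.

```ruby
# Hypothetical job objects; both expose the same #run interface.
class PingJob
  def initialize(responder)
    @responder = responder          # anything that responds to #call
  end

  def run
    @responder.call("pong")
  end
end

class SendJob
  def initialize(client, registration_id, text)
    @client          = client       # anything that responds to #deliver
    @registration_id = registration_id
    @text            = text
  end

  def run
    @client.deliver(payload)
  end

  private

  # Assumed payload shape, for illustration only.
  def payload
    { registration_id: @registration_id, text: @text }
  end
end

# Test doubles that record what the jobs did.
replies     = []
deliveries  = []
fake_client = Object.new
fake_client.define_singleton_method(:deliver) { |body| deliveries << body }

jobs = [
  PingJob.new(->(message) { replies << message }),
  SendJob.new(fake_client, "token-123", "hello")
]

# The caller no longer needs to know which kind of job it holds.
jobs.each(&:run)
puts replies.inspect
puts deliveries.inspect
```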
00:19:56.240
Transitioning tasks between the worker and jobs ensures both functionalities remain distinct. As we uncover a factory mechanism, we realize it can operate independently of conditionals.
00:20:40.960
We finalize this work by eliminating the need for a case statement, instead implementing a hash to map commands to job classes—streamlining our push daemon and minimizing complexity.
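The hash lookup can be sketched as below. The `JOBS` constant, the `job_for` helper, and the simplified job classes are hypothetical names used for illustration.

```ruby
# Simplified, hypothetical jobs that return a value so the demo is visible.
class PingJob
  def initialize(_arguments); end

  def run
    "pong"
  end
end

class SendJob
  def initialize(arguments)
    @arguments = arguments
  end

  def run
    "queued for #{@arguments.first}"
  end
end

# The mapping replaces the case statement.
JOBS = {
  "ping" => PingJob,
  "send" => SendJob
}.freeze

# Returns a job instance, or nil when the command is unknown.
def job_for(line)
  command, *arguments = line.split
  job_class = JOBS[command.to_s.downcase]
  job_class && job_class.new(arguments)
end

puts job_for("ping").run
puts job_for("send token-123 hello").run
puts job_for("bogus").inspect
```

Adding a command now means adding a class and one hash entry, with no conditional to edit.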
00:21:31.760
However, as we continue refining the jobs, we discover a data clump: a group of attributes that always travel together, which signals an opportunity for a further abstraction within our system.
00:22:09.760
With the introduction of a Client class capturing request details, we transition to a more holistic data model while simultaneously eliminating confusion in our code with clear attributes.
00:22:54.159
As we hand over significant responsibilities from the jobs back to the Client, we solidify its status as a key player in our architecture, granting it necessary behaviors.
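A possible shape for that Client, on the assumption that the clumped data is the sender's address and port together with the socket used to answer it. The transcript does not list the attributes, so the `respond` method, the constructor arguments, and the `FakeSocket` double are all illustrative.

```ruby
# Hypothetical Client: gathers the values that were being passed
# around together and gains the behavior that used them.
class Client
  attr_reader :address, :port

  def initialize(socket, address, port)
    @socket  = socket
    @address = address
    @port    = port
  end

  # Callers say what to send; Client knows where and how.
  def respond(message)
    @socket.send(message, 0, @address, @port)
  end
end

# Test double that records sends instead of touching the network.
class FakeSocket
  attr_reader :sent

  def initialize
    @sent = []
  end

  def send(message, _flags, host, port)
    @sent << [message, host, port]
  end
end

socket = FakeSocket.new
client = Client.new(socket, "127.0.0.1", 50_000)
client.respond("pong")
puts socket.sent.inspect
```

With the socket held by the Client, a job that needs to reply can be handed just the client rather than a server plus an address plus a port.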
00:23:39.680
Simultaneously, we continue to clean up our interface, removing the need for a superfluous server parameter, further encapsulating responsibilities and enhancing clarity.
00:24:24.240
We also need to refine our send job, ensuring that its attributes carry distinct meanings and addressing the underlying complexity with refined abstractions.
00:25:07.680
After resolving ambiguity in namings and aligning responsibilities, we identify what behavior we need to ensure proper tokenization within our request component.
00:25:48.119
Now, we pass along a request object simplifying the complexity while retaining clear functionality. No longer reliant on low-level string manipulations, we elevate our handling of incoming data.
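A sketch of a request object that hides the string handling. The wire format assumed here, a command followed by a registration ID and free text separated by whitespace, is a guess for illustration; the transcript does not show the real datagram layout or the real method names.

```ruby
# Hypothetical Request: parses a raw datagram once and exposes
# named accessors instead of scattered string manipulation.
class Request
  attr_reader :command

  def initialize(raw)
    command, rest = raw.to_s.strip.split(/\s+/, 2)
    @command = command.to_s.downcase
    @rest    = rest.to_s
  end

  # Assumed to be the first token after the command.
  def registration_id
    @rest.split(/\s+/, 2)[0]
  end

  # Everything after the registration ID, or "" when absent.
  def text
    @rest.split(/\s+/, 2)[1].to_s
  end
end

request = Request.new("SEND token-123 hello there\n")
puts request.command
puts request.registration_id
puts request.text
```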
00:26:32.639
Allowing hash behaviors and the command to drive job activation eliminates clutter, ensuring separation of structural elements from operational functionalities.
00:27:12.839
As we critically assess our code, we remain cognizant of nil checks scattered throughout, and we implement a null object pattern to encapsulate command responses.
00:28:05.920
By substituting a null job for the nil that an unrecognized command would otherwise produce, we let the worker run every job the same way and remove the redundant checks.
00:28:55.239
This holistic approach to invoking job behavior fosters coherence and alleviates unnecessary checks, tracking our system towards a more streamlined model.
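The Null Object idea can be sketched as follows. `NullJob`, `JOBS`, and `run_command` are hypothetical names, and the job classes are simplified so the example stands alone.

```ruby
# Simplified, hypothetical job.
class PingJob
  def run
    "pong"
  end
end

# Null object: same interface as a real job, deliberately does nothing.
class NullJob
  def run
    nil
  end
end

JOBS = { "ping" => PingJob }.freeze

# fetch with a default means the caller always gets something runnable,
# so no nil check is needed before calling #run.
def run_command(command)
  JOBS.fetch(command, NullJob).new.run
end

puts run_command("ping").inspect
puts run_command("bogus").inspect
```

Unknown commands are still ignored, but the decision lives in one object instead of in conditionals spread across callers.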
00:29:40.160
In closing, while there remains work to reconcile hard-coded configuration details, we’ve made substantial progress in addressing code smells and enhancing the structure of our push daemon. By listening and responding to these code smells, we’ve taken significant steps towards a more sustainable design, enabling better maintainability and growth.
00:31:00.960
Thank you all for your attention; I truly appreciate it.