Wynn Netherland

Summarized using AI

Refactoring With Science

Wynn Netherland • February 20, 2014 • Earth

In the talk titled 'Refactoring With Science', Wynn Netherland of GitHub discusses the complexities of changing code confidently in software development. He recognizes that while modifying code can be straightforward, ensuring the reliability of those changes—especially in production environments—presents significant challenges. To address these challenges, Wynn emphasizes the necessity of employing robust testing strategies and leveraging data to inform decisions. Key points from his discussion include:

  • The Challenge of Confidence in Code Changes: Even with extensive test suites, developers often face anxiety when deploying changes to production.
  • Utilizing Data at GitHub: GitHub employs various tools to gather metrics and analyze code performance. These tools, such as Hubot and the 'Graph Store', enable teams to visualize data and gain insights into API status codes and other critical metrics.
  • Tracking Metrics: Wynn categorizes metrics into counters, gauges, and timings, each serving a different purpose in monitoring application performance.
  • Deployment Process: He describes GitHub's deployment strategy, which includes creating pull requests, verifying deployments, and using CI systems like Janky on Jenkins for smooth integration.
  • Managing Legacy Code: Wynn discusses the importance of confidence in removing unused methods and introduces a tool called Backscatter, which tracks method calls to help developers make informed decisions on code removal.
  • Refactoring with Science: The theme of using data-driven experiments to inform code refactoring is highlighted, showcasing how GitHub adapts its systems as it scales from 20,000 to over 7 million users.
  • Community Reflection: He concludes with a reflection on the value of community engagement, symbolized by contributions from individuals like Jim Weirich, stressing the importance of connections over the code itself.

Overall, the talk provides a comprehensive overview of GitHub's innovative approaches to deploying code and maintaining high levels of confidence through measurement and experimentation.

Refactoring With Science
Wynn Netherland • February 20, 2014 • Earth

Changing code is easy. Changing code with confidence isn't. Even the most robust, mature test suites have blind spots that make large-scale changes difficult. At GitHub we use Science to instrument, compare results, and measure performance of parallel code path experiments to see how new code runs against a current production baseline. This talk will show you how to Science, too.

Help us caption & translate this video!

http://amara.org/v/FG3t/

Big Ruby 2014

00:00:14.920 Hello everyone! My name is Wynn Netherland, and I work at GitHub on the API team. This year, I believe there are three of us attending from the API team as we descend upon Grapevine for the conference.
00:00:20.480 You can usually find me on Twitter under the handle @pengwynn, and my personal website is wynn.fm, which is also a gospel and R&B radio station located in North Carolina.
00:00:27.920 Today, I want to talk to you about changing code. While this might seem like a technical discussion, it's largely about the philosophy behind how we work at GitHub and the process we follow when we change code.
00:00:39.439 Now, changing code is pretty easy. You just do it, right? However, it's changing code with confidence that can be the challenge. We all know the potential pitfalls that come with this process, especially the anxiety that comes when pushing changes to production.
00:01:10.720 So, how do we gain confidence in making changes? Tests are likely the first thing that comes to mind. We write numerous tests as we develop features, we create tests to find bugs, and over time, we build a test suite that grows along with our code.
00:01:30.560 If you're feeling less than comfortable with the test suite for your production applications, that's completely okay. There are parts of the code we test closely enough to understand the impact of our changes, but when those changes reach production, we sometimes still feel less confident.
00:02:01.920 At GitHub, we use data to inform our decisions, and this approach has transformed how I work. Previously, I used tools to gather metrics on things like the physical utilization of resources. However, at GitHub, I have experienced a whole new world of possibilities when we track everything and leverage that data to inform our decisions.
00:02:34.720 We employ Hubot for chat operations, essentially centralizing much of our processes. In integration with Hubot, we have a command called 'Graph Me', which allows me to pull up a visualization of nearly every metric we track within our application.
00:03:09.840 For instance, we track API status codes continuously to see how they meet our expectations with various changes being made. This provides valuable insights that can help us maintain confidence in our codebase.
00:03:44.000 Here is another example related to serializing queries—when we're taking Active Record models and preparing them for API responses. Access to rich datasets enables informed decision-making, allowing teams to move forward with greater confidence.
00:04:30.080 At GitHub, we prefer not to break the API; nobody wants to do that. Over time, though, we have moved from a strict policy of never breaking the API to a more pragmatic approach: we analyze who is actually using a particular method, reach out to them, and then make decisions based on that engagement.
00:05:03.680 We have a tool we refer to as the 'Graph Store', which is powered by Graphite. It contains a plethora of graphs, and if a specific graph you need doesn’t exist, you can save the visualization you create for future use.
00:05:41.520 Within the Graph Store, we have numerous visualizations and dashboards. For instance, when you deploy code, Hubot tells you to check the Graph Store for browser response times, allowing you to verify whether your changes improve or degrade performance for the segments you care about.
00:06:11.200 We track countless metrics, and we categorize them into three basic types: counters, gauges, and timings. A counter is commonly used for tracking occurrences of significant events within our code. If something noteworthy happens, we increment the counter and monitor it over time.
00:06:54.240 A gauge could track readings over time, like CPU utilization or the number of followers on a social media account. Timings record how long methods take to execute, allowing us to analyze the duration of specific processes and optimize them as necessary.
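GitHub's internal metrics client isn't shown in the talk, but the three metric types map directly onto a StatsD-style interface. Here is a minimal sketch using the open-source statsd-ruby gem; the metric names and values are illustrative, not GitHub's actual instrumentation.

```ruby
require "statsd" # statsd-ruby gem

statsd = Statsd.new("localhost", 8125)

# Counter: increment each time a noteworthy event happens
statsd.increment("api.responses.status.422")

# Gauge: record a point-in-time reading, such as a queue depth
queue_depth = 42 # illustrative value; in practice read from the queue
statsd.gauge("background_jobs.queue_depth", queue_depth)

# Timing: measure how long a block of work takes, reported in milliseconds
statsd.time("api.serialization.repository") do
  sleep 0.05 # stand-in for serializing an Active Record model for the API
end
```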
00:07:32.400 Possessing this data empowers us to deploy with more confidence. I have to admit, I tend to lean toward sharing all the practical insights I've gained from my experience at GitHub, but I sometimes forget that not everyone shares the same background I do.
00:08:03.920 For context, I began programming back in the late 80s on a TRS-80, writing in BASIC. Back then, deploying code typically involved writing it down and turning in my homework. By the time I reached high school, we utilized floppy disks, and later, in college, we upgraded to zip disks. My first real job in the late 90s was as a webmaster for a newspaper.
00:09:43.680 The deployment process involved simple FTP file transfers, which seems archaic compared to some of today's techniques. Eventually, I was introduced to Ruby, Rails, and the wonders of Capistrano, which felt like a giant leap forward in the deployment process.
00:10:10.080 Now, at GitHub, our deployment process primarily follows GitHub Flow, incorporating Hubot and other tools to streamline the process. It all begins with the creation of a pull request, and I am curious, how many of you are utilizing pull requests as a central part of your development workflow?
00:10:54.080 To create a pull request, you simply branch off, add some commits, push it to GitHub, and then open the pull request for discussion and review. This process is generally iterative, with feedback and adjustments throughout.
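The talk describes this through the web flow, but the same step can be scripted. As an aside, here is a minimal sketch using Octokit, GitHub's Ruby API client; the repository, branch names, and token environment variable are illustrative.

```ruby
require "octokit" # octokit.rb, the Ruby client for the GitHub API

client = Octokit::Client.new(access_token: ENV["GITHUB_TOKEN"])

# Open a pull request from a pushed feature branch against master
client.create_pull_request(
  "my-org/my-app",              # repository (illustrative)
  "master",                     # base branch
  "serializer-refactor",        # head branch containing your commits
  "Refactor API serializers",   # title
  "Extracts serialization into its own class so we can experiment on it."
)
```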
00:11:20.400 In GitHub's case, we typically deploy code first, and after verifying that the deployment is successful, we merge the pull request into the main branch. It might sound counterintuitive, but merging happens only after the deploy has been verified, which ensures that the version we put into production is sound.
00:12:09.680 In practice, once we receive confirmation that everything is in order, we can initiate deployment. Asking Hubot where we can deploy leads to comprehensive feedback on the state of our environments, including production and various lab stages.
00:12:50.720 We can also deploy branches directly and have them available as live environments, without the long waits that used to precede a deployment. As we work on features, instead of waiting for a shared environment to free up, we can deploy using the branch name, which makes things easier for everyone involved.
00:13:25.600 After choosing where to deploy, Hubot checks that the master branch is merged into your branch. If it hasn't, it merges it for you before starting the deployment. The goal is to ensure that the deployed code is the latest and that it has passed all checks.
00:14:02.400 Waiting for CI results takes some time; however, GitHub has integrated Janky on top of Jenkins to enhance our experience. It’s an intuitive way to monitor CI status without the headaches often associated with using Jenkins directly.
00:14:50.720 Once we get a green light from CI, we unlock the production environment, and the deployment begins. The process may take a few minutes to complete, but once it does, we can get our hands on the freshly deployed code.
00:15:14.960 At this stage, we observe the deployment through various tools that provide metrics on the code we've pushed to production. For example, staff tools show metrics regarding site-wide response times, enabling developers to see how new code affects the application's performance.
00:15:59.680 In conjunction with monitoring tools like New Relic, we also have Haystack, an internal tool for catching unhandled exceptions. We can visualize exceptions over a timeline, allowing us to determine if a deploy caused any new issues.
00:17:06.080 If we encounter a significant problem post-deployment, we need to roll back quickly. Without the complication of branches or environments, we can revert to the last known good state by just issuing a quick deploy command.
00:17:53.680 If everything looks good after observing the code, we merge the feature branch into master. Hubot notifies us of the change, unlocks production for further changes, and kicks off additional builds if necessary.
00:18:43.680 It is crucial that we keep track of our deployments to observe their success consistently. Gathering ample data ensures that our deployment processes are smooth, adapting as needed based on the outcome of previous deployments.
00:19:58.560 Let’s discuss the concept of removing legacy or unused code. When looking through a class to decide whether an apparently obsolete method can be removed, we need confidence that nothing still calls it. To bolster this process at GitHub, we've implemented a tool called Backscatter.
00:21:02.080 This non-open-source tool enables us to track method calls and gather metrics on their usage. Using Backscatter allows us to enhance our confidence in removing methods that are no longer in use.
00:21:47.680 In practice, we can control how much data we capture, targeting specific methods without overwhelming our systems with too much information. Conducting these experiments provides us with significant insights into the efficacy of our code and helps us refactor intelligently.
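Backscatter is internal to GitHub and its interface isn't shown in the talk, so the sketch below is a hypothetical illustration of the underlying technique: wrap a method you suspect is dead with Module#prepend and count real production calls before deleting it. All names here are made up.

```ruby
require "statsd"

STATSD = Statsd.new("localhost", 8125)

# Hypothetical call tracking in the spirit of Backscatter (not its real API).
module CallTracker
  def self.for(method_name)
    Module.new do
      define_method(method_name) do |*args, &block|
        # Record that the suspected-dead method was actually called
        STATSD.increment("backscatter.#{self.class.name}.#{method_name}")
        super(*args, &block)
      end
    end
  end
end

class LegacyWidget
  # Instrument the method before removing it, then watch the counter for a while
  prepend CallTracker.for(:old_permissions_check)

  def old_permissions_check(user)
    true # stand-in for the legacy logic under suspicion
  end
end

LegacyWidget.new.old_permissions_check(:someone) # increments the counter
```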
00:22:43.920 This brings us to 'Science', the theme of this talk: experimenting with confidence. At GitHub, we often refactor core components of our system to accommodate growth as we scale from 20,000 users to over 7 million.
00:23:39.840 As we grow, we learn that what worked for a smaller user base may not suffice for a larger one. Operating within the confines of frameworks can often lead us to less clarity on the behavior of our code bases, heightening the need for confidence when making changes.
00:24:37.760 To this end, we leverage science to make informed decisions about our code. We run new code paths as experiments alongside the existing production path and compare their results and performance through our monitoring tools before committing to the change.
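The 'Science' library referenced here is the internal ancestor of the Scientist gem GitHub later open-sourced, so the interface in use at the time may have differed. The sketch below uses Scientist's documented interface to show the shape of a parallel code path experiment, with illustrative class and method names.

```ruby
require "scientist"

class RepositoryPermissions
  include Scientist

  def allows?(user)
    science "repository-permissions" do |experiment|
      # Control: the current production path; its result is what gets returned
      experiment.use { legacy_allows?(user) }
      # Candidate: the refactored path, run alongside and compared, never returned
      experiment.try { refactored_allows?(user) }
    end
  end

  private

  def legacy_allows?(user)
    user != :banned # illustrative stand-in for the old implementation
  end

  def refactored_allows?(user)
    user != :banned # illustrative stand-in for the new implementation
  end
end

RepositoryPermissions.new.allows?(:someone) # => true, from the control path
```

The control's value is always returned to callers; results from both paths are compared, and a publisher (not shown) would push mismatch counts and timings to the kinds of graphs discussed earlier.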
00:25:48.480 Lastly, I'd like to reflect on the impact of community members like Jim Weirich. His dedication and humility resonate as a profound reminder that our connections with others outweigh the code we produce.
00:27:03.360 Thank you for your attention! I hope this discussion has provided insight into how we at GitHub approach code changes and deployment.