Pushing to master - adopting trunk based development

by Dylan Blakemore

In his talk at RubyConf 2022, Dylan Blakemore presents the concept of Trunk-Based Development (TBD), a methodology that, while intimidating for those used to traditional branching strategies like Git flow or GitHub flow, ultimately leads to improved code quality and faster delivery. Blakemore, a software engineering manager at Zappy, illustrates the need for change in development practices after observing declining delivery speed and low engineering morale in his team. He outlines key metrics from the State of DevOps Report, known as DORA metrics, which are crucial for measuring high-performing software teams: deployment frequency, change lead time, change failure rate, and mean recovery time.

Blakemore explains TBD as a strategy that emphasizes smaller, more frequent commits to the master branch instead of lengthy feature branches, which can create complications and bugs from prolonged divergence. He identifies several benefits of TBD:
- Faster delivery of value through smaller commits.
- Enhanced team collaboration and collective ownership through pairing.
- Simplified merge processes and reduced chances of conflicts.
- Improved quality assurance by promoting testing practices in the production environment.

To address the resistance to adopting TBD, Blakemore highlights three common excuses from teams: a perception that TBD is not applicable, an aversion to change, and the belief that it is too difficult. He counters these with examples from his own experience at Zappy, where the adoption of TBD practices has proved beneficial even for less experienced developers. Blakemore emphasizes the importance of 'roots' that support the practice of TBD, which includes good design, test-driven development (TDD), a robust CI/CD pipeline, and feature flags. Using RFCs (Requests for Comments) to document designs early in the process can mitigate issues during implementation.

The talk concludes with the notion that adopting TBD is about instilling a development culture focused on best practices, flexibility, and trust, rather than seeing TBD as a goal itself. The ultimate takeaway is that trunk-based development is an indicator of good software engineering and team health, and organizations must nurture their development practices to thrive.

00:00:00 Ready for takeoff.
00:00:16 Hi everyone! I hope everyone's had a good start to RubyConf 2022.
00:00:23 I'm Dylan, and I’m incredibly excited to be here. This is my first RubyConf and my first conference talk.
00:00:29 I'm excited and a little bit nervous, but I hope you all enjoy the talk and can take something away from it.
00:00:34 I’m a software engineering manager from Cape Town, South Africa, where I actually studied mechanical engineering.
00:00:41 About five years ago, I dropped out of a PhD program and somehow landed a junior developer job at a company called Zappy.
00:00:49 Zappy is the world leader in automated end-to-end market research with offices in Cape Town, Boston, and London, plus some satellite locations around the world.
00:00:56 Our stack is mainly Rails for the backend, with React on the front end, and we have a bit of Python and Elixir thrown in for good measure.
00:01:02 We have millions of lines of Ruby across a number of different apps, and, of course, we also have all the tech debt and spaghetti that comes along with that.
00:01:10 About two years ago, we started to notice some problems within our engineering department.
00:01:19 Delivery speed was decreasing, engineering happiness was at an all-time low, and our innovation was stagnating.
00:01:24 It really felt like we had forgotten how to be disruptive, and we were just plodding along with the mundane routine.
00:01:30 Customer requests and bugs were happening more and more often because we never knew when we would break something.
00:01:38 Clearly, we needed a change.
00:01:45 The problems we were facing weren't unique to us; many companies at a certain scale face similar issues.
00:01:52 High degrees of coupling, poor software practices, and frustrating processes are common.
00:01:58 So, surely there's something out there that can teach us how to move forward.
00:02:05 Enter the State of DevOps Report and the DORA metrics.
00:02:11 For those who might not know, the State of DevOps Report is an annual report on software engineering, striving to understand what makes a high-performing team.
00:02:28 Researchers take a rigorous scientific approach to gathering and analyzing data, which is well described in the fantastic book "Accelerate."
00:02:41 The outcome of years of research culminated in the DORA metrics, which indicate the performance level of software teams.
00:03:00 Deployment frequency measures how often an application can be deployed to production. Change lead time is the time taken between writing code and deploying it. Change failure rate measures how often deployment results in failure, and mean recovery time is how long it takes to fix any failures.
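As a rough illustration of how the four metrics just described could be computed, here is a minimal Ruby sketch. The `Deployment` struct, its fields, and the `DoraMetrics` module are hypothetical names invented for this example. They are not taken from the talk or from the State of DevOps Report, and the exact definitions used by the report may differ.

```ruby
# Hypothetical record of one production deployment (all fields are assumptions).
Deployment = Struct.new(:committed_at, :deployed_at, :failed, :recovered_at, keyword_init: true)

module DoraMetrics
  module_function

  # Deployment frequency: deployments per day over the observed window.
  def deployment_frequency(deployments, days:)
    deployments.size.to_f / days
  end

  # Change lead time: average hours between committing code and deploying it.
  def change_lead_time_hours(deployments)
    total_seconds = deployments.sum { |d| d.deployed_at - d.committed_at }
    total_seconds / deployments.size / 3600.0
  end

  # Change failure rate: fraction of deployments that resulted in a failure.
  def change_failure_rate(deployments)
    deployments.count(&:failed).to_f / deployments.size
  end

  # Mean recovery time: average hours from a failed deployment to its fix.
  def mean_recovery_time_hours(deployments)
    failures = deployments.select(&:failed)
    return 0.0 if failures.empty?

    failures.sum { |d| d.recovered_at - d.deployed_at } / failures.size / 3600.0
  end
end

# Made-up sample data: two deployments in one day, one of which failed.
start = Time.utc(2022, 11, 1, 9, 0, 0)
deployments = [
  Deployment.new(committed_at: start, deployed_at: start + 2 * 3600, failed: false),
  Deployment.new(committed_at: start, deployed_at: start + 4 * 3600, failed: true,
                 recovered_at: start + 5 * 3600)
]

puts DoraMetrics.deployment_frequency(deployments, days: 1)
puts DoraMetrics.change_lead_time_hours(deployments)
puts DoraMetrics.change_failure_rate(deployments)
puts DoraMetrics.mean_recovery_time_hours(deployments)
```

In line with the point that follows, numbers like these are best read as indicators of how a team is doing, not as targets to optimize directly.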
00:03:25 A new metric, reliability, has been added, but there’s still some debate about its validity, so I won't touch on that too much.
00:03:33 It's essential to note that these are metrics, not goals. Once a metric becomes a goal, it can be gamed.
00:03:44 Instead, we should focus on the best practices and behaviors that have been proven to improve these metrics.
00:03:57 Both as a tech lead about two years ago and now as an engineering manager, there's one particular behavior that I’m very passionate about: trunk-based development (TBD).
00:04:09 It's an alternative branching strategy that encourages more regular but smaller commits to the master branch, directly leading to improvements in your deployment frequency metrics.
00:04:20 It has been proven that organizations practicing TBD are generally higher-performing. So, why isn't everyone doing it?
00:04:31 That's what I'm here to address.
00:04:38 Now, TBD purists may have very strict definitions of trunk-based development.
00:04:45 If you’re doing it right, then you should be committing and pushing straight to master and pairing whenever you're writing code.
00:05:05 However, I like to remember a favorite Twitter post by GeePaw Hill: "I don’t like or value definition wrangling. I don’t care whether you call how I work TDD or not."
00:05:16 He suggests that probing the boundaries of ideas is useful, but axiomatizing natural language can often be boring.
00:05:24 This is relevant to TBD, as intention and result are most important. If we achieve benefits without sticking strictly to definitions, it doesn't matter.
00:05:35 Therefore, we must clarify our intentions and the benefits of TBD.
00:05:41 The intent behind TBD is to keep each commit smaller and to commit more regularly to the trunk.
00:05:48 It avoids feature and integration branches entirely by understanding that the only true integration branch is master.
00:05:58 Compare that to the GitHub flow or Git flow branching strategies, which typically involve moderately long-lived feature or integration branches.
00:06:09 While team members are committing to those branches, they are generally deployed to non-production environments and eventually merged into master.
00:06:22 The longer the life of these branches, the more they diverge from the main code.
00:06:32 As teams branch off for features and then attempt to merge back in, the discrepancies can lead to confusion.
00:06:45 This results in high conflict rates if multiple teams work on the same code base, increased probabilities of bugs due to inadequate testing, and large pull requests (PRs) which make quality reviews difficult.
00:07:12 To fix these issues, the approach is simple: do not use long-running branches. Treat the trunk as your integration and testing branch and keep your commits small.
00:07:27 This drastically decreases chances of merge conflicts and branch divergence since your branches don’t linger for long.
00:07:41 Because you are committing frequently and in smaller chunks, your PRs look quite different from other branching strategies.
00:07:54 You are not committing entire features in one go; the unit of value is much smaller.
00:08:03 If you finish writing a class and its spec, you can ship that code, regardless of whether it has been QAed, as long as there is a good accompanying spec.
00:08:20 There are two rules of thumb I utilize to determine if my commits or PRs are too complex.
00:08:39 The first is that the number of files changed should always be two: a functional change and a test.
00:08:46 The second rule pertains to the description of the commit message; if it contains the word 'and,' it likely means you're doing too much.
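The two rules of thumb above can be written down as a small advisory check. This is only a sketch: the method name `commit_size_warnings` and its arguments are hypothetical, and it is meant as a personal sanity check, not an enforced gate.

```ruby
# Hypothetical helper: returns advisory warnings for a proposed commit,
# based on the two rules of thumb (two files changed, no "and" in the message).
def commit_size_warnings(changed_files:, message:)
  warnings = []
  if changed_files.size > 2
    warnings << "#{changed_files.size} files changed; aim for one functional change and its test"
  end
  if message.match?(/\band\b/i)
    warnings << "message contains 'and'; the commit may be doing too much"
  end
  warnings
end

small = commit_size_warnings(
  changed_files: ["app/models/survey.rb", "spec/models/survey_spec.rb"],
  message: "Add Survey#expired?"
)
large = commit_size_warnings(
  changed_files: ["app/models/survey.rb", "app/mailers/survey_mailer.rb", "spec/models/survey_spec.rb"],
  message: "Add survey expiry and update mailer"
)

puts small.inspect
puts large.inspect
```

The file paths and commit messages here are made up for the example. The word-boundary match means a message such as "Add handler" is not flagged.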
00:09:00 Now that we understand what TBD is and what it looks like in practice, let’s discuss the benefits.
00:09:20 Firstly, you deliver value much faster, though in smaller increments. Each good commit should deliver some value.
00:09:34 You also receive feedback faster, as smaller and more focused commits allow for discussion on functionality instead of stylistic choices.
00:09:45 Moreover, this practice, along with an emphasis on pair programming, fosters a better sense of collective ownership within the team.
00:09:59 TBD provides a more accurate commit history and facilitates simpler merges.
00:10:10 Merge conflicts can be indicators of outdated work. If a merge conflict occurs, you should start the QA process anew because what you're shipping is not what was actually tested.
00:10:26 With TBD, QA also becomes simpler and more reliable, and the resulting quality is higher.
00:10:38 We’ve all joked about testing in production at some point. When I say it, I'm serious; there’s no value in testing in a non-production environment since it's merely a poor imitation of reality.
00:10:57 Lastly, TBD makes you feel like an elite hacker from a 90s movie because there’s a thrill in committing 10 to 15 times a day.
00:11:04 All of these points, apart from perhaps the last, are facts: proven findings from the State of DevOps Report that show organizations practicing TBD typically outperform those that don’t.
00:11:23 So, the question remains: why isn’t everyone adopting this?
00:11:37 During my push for adoption at Zappy, I’ve encountered many excuses.
00:11:45 They generally fall into three categories. The first common excuse is that TBD is not applicable to their work.
00:11:58 Some may claim to be working on a large feature that needs to be coordinated with updates to multiple applications.
00:12:10 However, TBD is applicable in every scenario except for open source projects.
00:12:17 In open source, the code owners are not necessarily the same as the code contributors, so some branching strategies are required.
00:12:30 The second claim is a general aversion to change. Many say, "This has always worked for us; why should we change now?"
00:12:40 This shouldn't be the mindset of a software engineer. Our strength is in our ability to learn and adapt.
00:12:49 Being stuck in one's ways can lead to becoming obsolete.
00:12:54 The last category of excuse is that TBD is too difficult.
00:13:04 Interestingly, the latest State of DevOps Report adjusted its target demographics.
00:13:24 Compared to previous reports, which focused on senior engineers, the percentage of respondents with more than 16 years of experience dropped to only 13% this year.
00:13:41 As a result, feedback from less experienced developers indicated negative results with TBD, including perceived decreases in overall performance.
00:13:54 However, senior developers reported exactly the opposite results.
00:14:08 This suggests that perceived difficulty might be a valid reason for not adopting TBD, but is TBD inherently difficult for inexperienced developers?
00:14:29 I don't believe so. Earlier this year, a junior engineer joined my team, and Zappy was his first software job.
00:14:44 He may not be committing five times a day yet, but he has positively expressed support for shorter-lived branches and for testing in production.
00:14:57 Moreover, I have less than five years of experience in software engineering, and I find TBD to be one of my favorite practices.
00:15:06 Years of experience cannot be the only factor in the success of TBD.
00:15:18 Perhaps we're approaching it incorrectly. Shia LaBeouf might tell you that you must just do trunk-based development, but I argue that a trunk cannot grow without roots.
00:15:30 These roots are necessary to stabilize and nourish the trunk itself.
00:15:45 The roots correspond to best practices in software engineering.
00:15:57 I believe there are four primary roots: design, test-driven development, a great CI/CD pipeline, and feature flags.
00:16:14 The first root is design. During my first three years at Zappy, there was minimal focus on designing code and documenting that design.
00:16:30 The first time your teammates hear about your solution is often only when you present them with a PR of thousands of lines of code.
00:16:44 Feedback at that stage comes too late, causing issues that could have been avoided had you engaged in a design phase.
00:17:06 To combat this, my team has recently begun formalizing our design process with RFCs, which stands for Request for Comments.
00:17:21 These documents allow teams to create collaborative design specifications and receive early feedback.
00:17:36 We’ve found that we can save significant time by addressing issues during the design phase.
00:17:50 Thus, the best RFCs often arise from discussions about flaws.
00:18:02 When determining when to write an RFC, I consider three indicators: if I'm unsure about a decision, if I have multiple approaches in mind, and if the problem is complex enough that I can't explain it in a couple of paragraphs.
00:18:15 Design should always be a priority. Without it, TBD cannot happen.
00:18:29 Having a clear outline of the steps for a user story is essential, and good design facilitates this.
00:18:45 Next, we have test-driven development (TDD). Confidence in the code you're deploying is essential, and if you're deploying multiple times a day, manual QA isn't feasible.
00:19:03 You need a substantial suite of specs to ensure that you do not ship any breaking changes.
00:19:22 Real TDD involves writing the specs first, making them fail, coding to make them pass, and then refactoring.
00:19:36 As GeePaw Hill suggests, it’s about focusing on the benefits rather than strict definitions.
00:19:51 The most valuable aspect of TDD is that it should guide design.
00:19:59 Good design typically results in code that's easy to test and refactor, which is a hallmark of the SOLID principles.
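The write-the-spec-first cycle described above might look like the following minimal sketch. It uses Minitest, which ships with Ruby, so that the example is self-contained; the talk speaks of "specs", so the team may well use a different framework such as RSpec. The `Survey` class and its `expired?` method are hypothetical and exist only for illustration.

```ruby
require "date"
require "minitest/autorun"

# Step 1 (red): the tests are written first. Run before Survey exists,
# they fail, which confirms they actually exercise something.
class SurveyTest < Minitest::Test
  def test_expired_once_closing_date_has_passed
    survey = Survey.new(closes_on: Date.new(2022, 11, 1))
    assert survey.expired?(today: Date.new(2022, 11, 2))
  end

  def test_not_expired_on_the_closing_date
    survey = Survey.new(closes_on: Date.new(2022, 11, 1))
    refute survey.expired?(today: Date.new(2022, 11, 1))
  end
end

# Step 2 (green): the simplest implementation that makes the tests pass.
class Survey
  def initialize(closes_on:)
    @closes_on = closes_on
  end

  # Step 3 (refactor): with the tests in place, this method can be reshaped
  # safely. Passing `today` in keeps the method easy to test.
  def expired?(today: Date.today)
    today > @closes_on
  end
end
```

Needing to inject `today` is a small example of a test guiding design: the test made the hidden dependency on the current date visible.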
00:20:14 In case studies, teams using TDD reported an increase in quality, with fewer defects and higher performance.
00:20:29 Pair programming is an excellent way to instill TDD practices. More seasoned engineers can pair with juniors to show them the ropes.
00:20:44 During these sessions, I often write the specs first and then give control to the junior, allowing them to see firsthand how TDD enhances coding.
00:20:58 The next root is having a great CI/CD pipeline. At Zappy, our SRE team ensures that we have good CI integrated.
00:21:11 Every push to any branch initiates a test suite, which helps us catch issues early.
00:21:29 Because of the simplicity of deployment, we’ve made it easy to release code to production.
00:21:44 Automated CI is easy to implement, even for personal projects. But continuous deployment can be a bit trickier.
00:21:58 There are numerous available tools like GitHub Actions for deploying applications to AWS, but as complexity increases, so do the challenges.
00:22:13 A solid upfront investment is required to implement continuous deployment effectively.
00:22:25 Finally, we need to talk about feature flags (or toggles), which allow certain functionalities to be enabled or disabled.
00:22:39 Feature flags can be simple, like an if statement, or more advanced services like LaunchDarkly that allow for targeted deployments.
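At the simple end of that range, a feature flag really can be an if statement driven by an environment variable. The sketch below assumes a hypothetical `FeatureFlags` module and a made-up `FEATURE_` naming convention; it does not represent any particular library or the speaker's own setup.

```ruby
# Hypothetical minimal feature-flag helper backed by environment variables.
module FeatureFlags
  module_function

  # A flag counts as enabled only when FEATURE_<NAME> is exactly "true".
  # `env` is injectable so the behavior can be tested without touching ENV.
  def enabled?(name, env: ENV)
    env.fetch("FEATURE_#{name.to_s.upcase}", "false") == "true"
  end
end

# The flag itself is just an if statement around the new behavior.
def report_title(env: ENV)
  if FeatureFlags.enabled?(:new_report_title, env: env)
    "Insights Report (beta)"
  else
    "Report"
  end
end

puts report_title(env: { "FEATURE_NEW_REPORT_TITLE" => "true" })
puts report_title(env: {})
```

Because the unfinished code path stays dark until the variable is set, it can be merged to the trunk and deployed before the feature is ready to release.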
00:22:54 Using feature flags allows us to experiment in production—my favorite aspect of this practice.
00:23:09 It improves QA, allows us to present features to a select group of users, and simplifies the release process.
00:23:24 We can compare new features against existing ones, and the process grows more efficient.
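Showing a feature to a select group of users could be sketched as follows. This is not how LaunchDarkly or any other service works internally; `TargetedFlag`, the allow-list, and the percentage bucketing are assumptions chosen for illustration only.

```ruby
require "zlib"

# Hypothetical targeted flag: on for an explicit allow-list of users, plus a
# stable percentage of everyone else.
class TargetedFlag
  def initialize(name, allowed_user_ids: [], percentage: 0)
    @name = name
    @allowed_user_ids = allowed_user_ids
    @percentage = percentage
  end

  def enabled_for?(user_id)
    return true if @allowed_user_ids.include?(user_id)

    bucket(user_id) < @percentage
  end

  private

  # Maps a user to a bucket in 0..99. CRC32 is deterministic, so the same
  # user always lands in the same bucket for a given flag name.
  def bucket(user_id)
    Zlib.crc32("#{@name}:#{user_id}") % 100
  end
end

beta = TargetedFlag.new(:new_dashboard, allowed_user_ids: [42], percentage: 0)
puts beta.enabled_for?(42)
puts beta.enabled_for?(7)
```

Raising `percentage` step by step widens the audience without a new deployment, which is one way to compare a new feature against the existing one in production.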
00:23:37 An example from our platform highlights how feature flags can save us time; we once needed to revert a change due to a lack of communication.
00:23:50 The turnaround for this simple fix used to take two months to get approved, but with feature flags, all it took was to implement a toggle.
00:24:05 The important part here is that we weren't even involved in the communication process, which streamlined everything.
00:24:20 However, promoting feature flags can be challenging as there is a financial cost, and habits must change.
00:24:36 Nonetheless, at Zappy, we manage 88 active feature flags with nearly 70 engineers, which I consider a success.
00:24:50 Data shows that adopting these practices is beneficial in the long run, but it tends to create initial challenges.
00:25:01 Velocity may initially decrease, but an upfront investment in technology and training is essential to success.
00:25:15 Understanding what the roots are and their value isn't enough for change; we need people to advocate for this transition.
00:25:30 Adoption doesn't occur overnight. It's essential to lead by example by documenting designs and sharing them.
00:25:45 Using basic feature flags, even as simple as ENV variables, can provide a foundation for a better development culture.
00:26:00 Beyond best practices, nurturing a supportive culture is crucial.
00:26:15 Organizations with flexible work arrangements and collaborative cultures tend to outperform those burdened with bureaucracy.
00:26:30 Stable teams with developers who stay longer are more likely to succeed in high-performing organizations.
00:26:48 Additionally, trust is an essential nutrient for growth and success.
00:27:03 Going back to the original question, why is it so difficult to adopt trunk-based development?
00:27:15 The answer is that TBD itself is not a goal; it's more of a metric.
00:27:30 TBD is an indicator of good software engineering practices and team health.
00:27:49 Thus, the focus should be on the roots and the soil, because without them, the tree won't grow.
00:28:01 So go forth and crush it! I hope you learned something today. Thank you.
00:28:07 I guess we have two minutes if anyone has any questions.
00:28:12 The first question is: How long does your CI take to run?
00:28:20 We have different apps; some take two minutes, while others can take up to 20 minutes.
00:28:27 Another question: how do you handle dependent PRs?
00:28:34 If multiple files depend on each other, it indicates high coupling, which suggests a review of the design.
00:28:42 However, everything can go in separately until it’s time to merge dependent PRs, which should obviously be the last step.
00:29:01 The next question is how we deal with unused code.
00:29:14 We have tools to track unused code. However, it’s often on developers to clean up after themselves.
00:29:22 As for automating rules for big commits, we haven’t implemented any.
00:29:32 It's a judgment call, and excessive automation might lead to conflicts.
00:29:36 We prefer to make our own judgment calls on a case-by-case basis.
00:29:41 I think that’s it, right? Thank you!