00:00:00.000
ready for takeoff
00:00:16.920
hi everyone I hope everyone's had a good start to RubyConf 2022
00:00:23.039
I'm Dylan I'm incredibly excited to be here this is my first RubyConf and my first
00:00:29.640
conference talk uh so I'm excited a little bit nervous but I hope you all
00:00:34.860
enjoy the talk and can take something away from it I'm a software engineering manager I'm
00:00:41.520
from Cape Town South Africa where I studied mechanical engineering actually but about five years ago I dropped out of a
00:00:49.020
PhD and somehow landed a junior dev job at a company called Zappi
00:00:56.760
Zappi's the world leader in automated end-to-end market research with offices in
00:01:02.760
Cape Town Boston and London and some satellite people all over the world our stack is mainly Rails back ends with
00:01:10.860
React front ends obviously a little bit of Python and Elixir thrown in for luck we've got millions of lines of Ruby
00:01:19.320
across a number of different apps and we've also got all the tech debt and
00:01:24.600
spaghetti that comes along with that so about two years ago we started to
00:01:30.960
notice some problems with our engineering department delivery speed was decreasing engineering happiness was at an all-time
00:01:38.520
low and our innovation was just stagnating so it really felt like we'd forgotten
00:01:45.479
how to disrupt and we were plodding through the muck of BAU and customer requests and bugs that were happening
00:01:52.860
more and more often because we never knew when we would break something
00:01:58.740
so it obviously needed to change now the problems we were facing weren't unique to us you know high degrees of
00:02:05.520
coupling poor software practices frustrating process these are all things that most companies at a certain scale
00:02:11.760
have had to deal with so surely there's something out there that can teach us how to move forward
00:02:21.720
let's go he's gonna do that so enter the State of DevOps report and
00:02:28.800
the DORA metrics for those who might not know the State of DevOps report is an annual
00:02:35.340
important report on software engineering and it strives to understand what makes a high performing team
00:02:41.340
the researchers take a rigorous scientific approach to gathering and understanding that data and all of that
00:02:47.640
is described really well in the fantastic book Accelerate the outcome of years of research was
00:02:53.340
these DORA metrics which together are indicators of high performing software teams
00:03:00.420
deployment frequency is a measure of how often an application can be deployed to
00:03:05.879
production change lead time is the amount of time between writing some code and getting that code into production
00:03:12.959
change failure rate is the rate at which failures happen
00:03:19.379
and then mean recovery time is the amount of time it takes to fix any failures
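
A rough sketch of how the four metrics just described could be computed from a list of deployment records. The `Deployment` struct, its field names and the `DoraMetrics` class are assumptions made for illustration; they are not taken from the State of DevOps report or from any real tool.

```ruby
# Hypothetical record of one production deployment (field names are assumptions).
Deployment = Struct.new(:committed_at, :deployed_at, :failed, :recovered_at, keyword_init: true)

class DoraMetrics
  def initialize(deployments)
    @deployments = deployments
  end

  # Deployment frequency: deployments per day over the observed window.
  def deployment_frequency(window_days:)
    @deployments.size / window_days.to_f
  end

  # Change lead time: mean hours between writing the code and it reaching production.
  def change_lead_time_hours
    mean(@deployments.map { |d| (d.deployed_at - d.committed_at) / 3600.0 })
  end

  # Change failure rate: share of deployments that caused a failure.
  def change_failure_rate
    return 0.0 if @deployments.empty?

    @deployments.count(&:failed) / @deployments.size.to_f
  end

  # Mean recovery time: mean hours from a failed deployment to its fix.
  def mean_recovery_time_hours
    fixed = @deployments.select { |d| d.failed && d.recovered_at }
    mean(fixed.map { |d| (d.recovered_at - d.deployed_at) / 3600.0 })
  end

  private

  def mean(values)
    return 0.0 if values.empty?

    values.sum / values.size
  end
end

# Usage with two made-up deployments from a single day.
sample = [
  Deployment.new(committed_at: Time.utc(2022, 11, 1, 9), deployed_at: Time.utc(2022, 11, 1, 11),
                 failed: false, recovered_at: nil),
  Deployment.new(committed_at: Time.utc(2022, 11, 1, 13), deployed_at: Time.utc(2022, 11, 1, 17),
                 failed: true, recovered_at: Time.utc(2022, 11, 1, 18))
]
metrics = DoraMetrics.new(sample)
puts metrics.deployment_frequency(window_days: 1)
puts metrics.change_failure_rate
```

The numbers are only useful as indicators over time; the sketch deliberately has no targets or thresholds in it.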
00:03:25.860
there has been a new metric added reliability but there's still a bit of debate around its validity so I'm not
00:03:33.360
going to touch on that too much but it's important to notice that these are metrics and not goals anytime a metric
00:03:39.060
becomes a goal it can be gamed so we don't want to look at how we can
00:03:44.159
game these and improve them just by doing anything instead you focus on the best practices and the behaviors that
00:03:50.760
have been proven to improve the metrics as a tech lead about two years ago and
00:03:57.000
an EM now there's one particular behavior that I'm very very passionate about and that's obviously trunk-based
00:04:03.540
development it's an alternative branching strategy which encourages more regular but
00:04:09.900
smaller commits to master and it directly leads to significant improvements in your deployment
00:04:15.120
frequency metric it's been proven that organizations that
00:04:20.400
practice TBD are more performant but if that's the case then why isn't everyone doing it why aren't we all
00:04:26.580
doing TBD that's what I'm here to try and answer so
00:04:31.820
let's explore TBD and why it's difficult to adopt at scale
00:04:38.639
um now TBD purists have very specific definitions of trunk-based development if you're doing it right then you should
00:04:45.780
be committing and pushing straight to master and you should be pairing whenever you're writing code
00:04:51.960
the thing is when I hear this I like to remember one of my favorite Twitter posts by GeePaw Hill which is I
00:04:59.699
don't like or value definition wrangling and I don't care whether you call how I work TDD or not probing the boundaries
00:05:05.580
of ideas is great but axiomatizing natural language is boring and often a
00:05:10.800
form of browbeating so he's obviously speaking about test-driven development here but I think the
00:05:16.620
same can be said about trunk-based development what's important is the intent and the
00:05:24.240
result if we aren't sticking to the letter of the law but we're still gaining the same benefits then it really doesn't matter
00:05:29.940
whether someone on the internet fights you about the definition
00:05:35.039
but that does mean that we have to figure out the intent and the benefits of TBD
00:05:41.820
so the intent behind TBD is to keep each of our commits smaller and commit more regularly to the trunk it avoids feature
00:05:48.960
and integration branches entirely by understanding that the only true integration branch is master
00:05:55.800
so compare the GitHub flow or Git flow branching strategies they have their differences but a
00:06:03.000
common pattern to both of them is that you have moderately long-lived feature or integration branches of some sort and
00:06:09.720
one or more people are committing to those they're generally deployed to a non-production environment where they
00:06:15.600
are QA'd or fiddled around with and then at some stage when the feature is done it's merged into master and deployed to
00:06:22.259
production now the problem with that approach is that the longer-lived your branch is the
00:06:27.539
more it diverges from the truth you may Branch off from a feature Branch to do
00:06:32.580
some kind of patch after QA and then merge back into the feature branch and then you merge that back into trunk and
00:06:38.400
by the time you do merge it into master it looks nothing at all like it did when you first branched off
00:06:45.300
so upon some inspection these strategies are filled with problems the probability of merge conflicts is high if you have
00:06:52.199
multiple teams working in the same code base the probability of bugs is high because what you test is not what trunk
00:07:00.360
actually looks like and you also probably tested it in a non-production environment so the data
00:07:05.639
and usage is different the probability of big PRs is pretty high
00:07:11.759
you know we all know the meme a thousand lines diff one comment LGTM thumbs up 20
00:07:17.759
lines diff 20 comments it's funny because it's true but what does that say about the quality of the
00:07:24.240
reviews on these big PRs and what does it say about the team's understanding of the
00:07:29.639
code that's going into their repository personally if it takes me more than five
00:07:35.280
minutes to understand every single line of a pull request I'm not going to review it I'm not going to approve it
00:07:41.220
I'm going to go to the developer and say hey can you break this down for me and really help me understand your code
00:07:50.340
so how do we fix all this actually quite simple you just don't use
00:07:56.280
long running branches treat the trunk as your integration and your testing branch
00:08:01.979
and keep your commits small and precise that way the chance of merge conflicts
00:08:07.319
and divergence is massively decreased because your branches never really live for much longer than a day or two
00:08:13.500
now because you're committing so frequently and in smaller chunks your PRs and commits in general look quite
00:08:18.900
different from other branching strategies um you don't really commit full-fledged
00:08:24.419
features in one go instead the unit of value is a lot smaller if you finished writing a class and its spec then you
00:08:32.520
ship that it doesn't really matter whether it's been through QA or tested because if a class method function or
00:08:39.300
any piece of code has an Associated spec and it's a good spec there's no reason for it not to exist in master
00:08:46.680
now there's two rules of thumb I like to use for determining whether my commits or pull requests are too complex the
00:08:52.800
first is number of files changed it should always be two a functional change and a test
00:09:00.120
um and then second rule of thumb I use is about the description or your commit message if that includes the word and
00:09:07.440
you're probably doing a bit too much so now that we know what TBD is and a
00:09:13.380
bit of what it looks like in practice let's talk about the benefits so first is that you deliver value much
00:09:20.880
faster but in smaller increments every good commit should deliver some value even if it's just making future
00:09:27.420
development a tiny bit easier you get better feedback faster because your
00:09:32.820
commits are smaller and uh more focused so people actually discuss the functionality instead of whether you
00:09:39.959
know you're using single or double quotes or have a space after your brace um
00:09:45.420
so now this along with more emphasis on pairing creates a better sense of collective ownership within the team
00:09:53.160
uh TBD provides you with a more accurate commit history as well as simpler
00:09:59.160
merges now a merge conflict is an inherent indication that your work is
00:10:04.200
out of date if you need to resolve a merge conflict at time of merge then
00:10:10.560
you really should go through the entire QA process again because what you're shipping is not what was tested
00:10:17.820
and then speaking of QA it becomes simpler more reliable and has a higher quality we've all joked about testing in
00:10:24.120
prod at some point but when I say it now it's not a joke I don't see value in
00:10:30.720
testing in a non-production environment because it's always going to be a poor imitation of reality
00:10:39.060
and last but not least TBD makes you feel like an elite hacker straight out of a 90s film because
00:10:45.180
there's just something thrilling about committing 10 to 15 times a day and shipping them all
00:10:52.620
so now all of these points except possibly the last are not my opinion
00:10:57.959
they're facts they are proven the State of DevOps report has proven that organizations
00:11:03.300
that practice TBD show higher performance than those that don't so again the question is why isn't everyone
00:11:08.940
doing this in trying to push for adoption at Zappi
00:11:14.100
I've desperately tried to answer this question and what I've heard are a lot of excuses
00:11:20.339
they all seem to fall into one of three categories first common one is that it's not applicable to my work I'm doing a big
00:11:27.899
refactor that has to go in as one thing well my work crosses multiple apps that all have to be updated together or
00:11:33.540
you know I won't know it actually works until everything's running together there's only one case when TBD is not
00:11:39.899
applicable and that is open source because in that case the code owners aren't necessarily the same as the code
00:11:45.660
contributors so you do need some branching strategies every other time it's applicable
00:11:54.060
the second camp is mainly an aversion to change just in general you know this has
00:11:59.279
always worked for us so why change now this should never be a software engineer's mindset our strength is the
00:12:08.760
ability to learn new things and adapt quickly to change being stuck in your ways as a software
00:12:14.339
engineer is a fantastic way to become irrelevant
00:12:19.920
so we can cross that off too um
00:12:25.320
now the last category of excuse is that it's difficult
00:12:30.480
and I think this is an interesting one so let's talk about it this year's State of DevOps report has
00:12:37.680
made some changes to their target demographics so previously they focused a bit more on senior engineers they
00:12:44.160
had 40 percent of respondents having more than 16 years of experience this year they wanted to get feedback
00:12:52.440
from more juniors so that number dropped down to 13 percent and the results showed that the less
00:12:58.740
experienced developers showed negative results with TBD across the board with a
00:13:03.959
decrease in perceived overall performance an increase in
00:13:10.560
unplanned work error proneness and change failure rate that's opposed to senior
00:13:16.860
developers who showed the exact opposite results so now this makes it seem like
00:13:22.139
difficulty is a very very valid reason for not adopting TBD but the question is is TBD itself
00:13:28.680
difficult or a poor tool for inexperienced developers I don't believe so we had a junior
00:13:35.100
engineer join my team earlier this year Zappi was his first software job
00:13:40.740
now he's not committing five times a day yet but he's been vocally positive about
00:13:46.320
shorter-lived branches and testing in production I myself have less than five years of
00:13:52.200
experience in the software engineering world and TBD is one of the loves of my life
00:13:57.540
closely behind my dog and closely ahead of my girlfriend
00:14:03.540
um so years of experience can't be a real determiner for the success of TBD
00:14:09.480
and maybe that means that we're looking at it slightly wrong Shia LaBeouf will try to tell you
00:14:15.120
otherwise but you can't just do trunk-based development a trunk can't grow without roots you know roots to
00:14:21.120
stabilize the tree roots to feed it and roots to keep it healthy and these roots are software engineering best
00:14:26.940
practices I believe there's four of them which contribute primarily to the growth of
00:14:32.220
this tree the first root is design when I first joined Zappi and for about
00:14:37.500
three years or so while I was there there was never really a focus on designing your code
00:14:44.399
um and you know and documenting that design and getting feedback on it looking back it's it's a bit shocking to
00:14:50.459
me now um and it meant that the first time you know your teammates ever hear about your solution your proposal
00:14:57.660
is when I'd spent weeks building the thing and now I have a PR with 2000
00:15:03.240
lines of code and then you know someone comes along and they say hey there's actually a way better way of doing this
00:15:09.240
but at that point you've sunk so much time into it that it's unfeasible to
00:15:14.399
start from scratch you've got deadlines you've got PMs breathing down your neck so the code goes in with the promise to
00:15:21.899
clean it up later and I imagine almost everyone has had a similar experience and I also Imagine
00:15:28.019
almost everyone here has not cleaned it up every time it happens
00:15:33.300
so my team has been doing some very ad hoc design for about a year or so but
00:15:39.240
it's only within the last six months at Zappi that we've tried to formalize the design phase of software development by
00:15:45.420
introducing RFCs so RFC stands for request for comments and it's just a design doc with any level of detail but
00:15:53.339
the reason for calling it an RFC is to emphasize the collaborative nature of design
00:15:59.100
you know just like comments on pull requests we want feedback on the design so that we can iterate quickly before
00:16:05.160
we even start coding and we found that we can save weeks of work by catching flaws after only a
00:16:12.360
couple of days you know the best RFCs come from being wrong
00:16:17.399
now I mentioned briefly that the amount of detail in an RFC can and should be flexible
00:16:23.699
so it becomes a little tricky to figure out when to actually write one you know as much as we want design to happen we
00:16:29.579
don't want to write an essay for a one-line bug fix uh so I try and use three indicators to figure out whether I
00:16:35.519
should write a design doc the first is if I'm at all unsure about some decision that needs to be made the second if I've
00:16:42.540
got a number of different approaches in mind but I can't really figure out which is potentially the best one
00:16:48.899
and the last is if a solution or a problem is too complex to easily explain
00:16:54.779
in a couple of paragraphs then I'll write up a nice design doc and get some feedback
00:17:00.180
now design should be a priority regardless of trunk-based development but TBD cannot happen without design
00:17:06.439
because you want to commit small and fast you kind of need to have an outline of what the Stepping Stones look like
00:17:12.419
for a story and that's only possible with design this chart shows the number of
00:17:18.540
deployments versus the number of RFCs for most of the teams at Zappi over the past year or so normalized by team
00:17:25.500
size I will let you guess which data point is my team
00:17:31.080
the data is a little bit rough you know some teams were excluded because they only got formed halfway through the year some
00:17:36.179
teams may not have uploaded all of their design documents to our database but there does seem to be a correlation of
00:17:43.080
sorts between the amount of time spent designing and the number of deploys per developer
00:17:49.020
now I'm not saying there's causation here just because you write a design doc doesn't mean you're going to ship more but it does imply at
00:17:57.960
least that trunk-based development gets enabled by design
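
As a sketch of the arithmetic behind a chart like the one described, the following normalizes deployments and RFCs by team size and computes a Pearson correlation between the two. The `TeamStats` struct, the team names and every number are invented for illustration; they are not Zappi's data, and a correlation here says nothing about causation.

```ruby
# Hypothetical per-team totals for a year (all values invented).
TeamStats = Struct.new(:name, :engineers, :deployments, :rfcs, keyword_init: true) do
  # Normalize by team size so large and small teams are comparable.
  def deployments_per_engineer
    deployments / engineers.to_f
  end

  def rfcs_per_engineer
    rfcs / engineers.to_f
  end
end

# Pearson correlation coefficient between two equally sized lists of numbers.
# Returns NaN when either list has no spread.
def pearson(xs, ys)
  raise ArgumentError, "lists must be the same size, with at least 2 values" if xs.size != ys.size || xs.size < 2

  mean_x = xs.sum / xs.size.to_f
  mean_y = ys.sum / ys.size.to_f
  covariance = xs.zip(ys).sum { |x, y| (x - mean_x) * (y - mean_y) }
  spread_x = Math.sqrt(xs.sum { |x| (x - mean_x)**2 })
  spread_y = Math.sqrt(ys.sum { |y| (y - mean_y)**2 })
  covariance / (spread_x * spread_y)
end

teams = [
  TeamStats.new(name: "team-a", engineers: 4, deployments: 120, rfcs: 3),
  TeamStats.new(name: "team-b", engineers: 6, deployments: 420, rfcs: 14),
  TeamStats.new(name: "team-c", engineers: 5, deployments: 200, rfcs: 6)
]

r = pearson(teams.map(&:rfcs_per_engineer), teams.map(&:deployments_per_engineer))
puts "correlation between RFCs and deployments per engineer: #{r.round(2)}"
```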
00:18:03.660
next up we've got test driven development you do need to be confident that your code going in is not going to
00:18:10.140
break anything when you deploy it and if you're deploying 10 times a day then manual QA is not feasible
00:18:19.740
so you need a robust suite of specs to be confident that you won't
00:18:25.140
ship breaking changes and I'm talking about real test driven development not just unit testing you
00:18:31.980
know the strict definition is write a spec first make it fail write your code
00:18:37.020
make it pass refactor rinse and repeat I will call on GeePaw Hill again and say
00:18:43.380
that we should focus on the benefits rather than the semantics you know it's okay if once in a while you write
00:18:49.559
your code first then your spec it's not the end of the world but I believe the most valuable part of
00:18:54.840
TDD is that the spec should guide the design and there's a very strong correlation
00:19:00.600
between code that's easy to test and code that's easy to refactor and both of those signal good tactical design and
00:19:07.260
SOLID principles in a case study involving some big development teams teams reported a 15
00:19:13.620
to 35 percent decrease in initial velocity when using TDD
00:19:21.299
and I think that can be explained by the learning curve you know 40 to 50 percent of engineers found that
00:19:27.179
adoption of TDD was quite difficult potentially due to a lack of upfront design
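
For anyone new to the cycle, here is a minimal sketch of the spec-first loop described above, using Minitest's spec syntax. The `CommitMessage` class and its `focused?` method are hypothetical; they borrow the earlier rule of thumb about the word "and" in a commit message purely as a small behavior to test, not as something the talk suggests automating.

```ruby
require "minitest/autorun"

# Step 1 (red): the spec is written first. It fails until CommitMessage exists.
describe "CommitMessage" do
  it "is focused when it describes a single change" do
    _(CommitMessage.new("Remove mean label from chart").focused?).must_equal true
  end

  it "is not focused when it joins two changes with 'and'" do
    _(CommitMessage.new("Remove label and rename chart").focused?).must_equal false
  end

  it "ignores 'and' inside other words" do
    _(CommitMessage.new("Handle sandbox errors").focused?).must_equal true
  end
end

# Step 2 (green): the smallest implementation that makes the spec pass.
class CommitMessage
  def initialize(text)
    @text = text
  end

  # A message counts as focused when it does not contain "and" as a whole word.
  def focused?
    !@text.match?(/\band\b/i)
  end
end

# Step 3 (refactor): with the spec green, the internals can change freely
# while the spec keeps guarding the behavior.
```

The specs run when the file exits, so the class can be defined after them; in a real project the spec and the class would live in separate files.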
00:19:32.640
but the engineers found that in the long term their velocity increased uh
00:19:38.220
partly because the stability also increased in fact density of defects dropped by between 50 and 90 percent
00:19:45.179
when TDD practices were followed and then engineers found that their
00:19:50.580
code quality was just better and the designs were simpler so how do we get people to do TDD I
00:19:58.860
think the best way is pairing you know have your more experienced Engineers pair with the Juniors and show them how
00:20:04.799
TDD works something I like to try to do when pairing with a junior is to lead
00:20:10.020
first drive first and just write the specs so I'll just sit write the specs go through what they're supposed to do
00:20:16.080
with them and then I'll hand over the driving the steering wheel to them and they can write the code and that way
00:20:22.260
they learn the value of TDD while also doing the fun part of the coding but what are tests without some way to run them
00:20:28.919
the answer is the next root a great CI/CD pipeline Zappi's got an incredible SRE team so I've never really worked in
00:20:35.940
a world without good CI and CD you know on any push to any
00:20:42.360
branch our test suite runs Jenkins builds we test for visual regressions on the front end you know deploying to prod
00:20:50.160
is a simple one-liner in a shell we can even deploy from Slack but I'm
00:20:55.919
pretty sure I'm the only one who's ever used that so great continuous integration pipeline
00:21:01.679
allows us to be more certain that things won't break and alerts us at every stage
00:21:06.900
of the process and the best part is CI pipelines are incredibly easy to set up
00:21:12.720
these days you know even for personal projects you can build and run tests with a very
00:21:19.260
simple GitHub Actions workflow and we're at the point now where there's really no excuse to not have at least
00:21:25.380
some form of automated CI continuous deployment on the other hand
00:21:31.280
is a little bit trickier there are some easy available options out there you
00:21:36.480
know GitHub Actions again can deploy applications straight to AWS but as soon as your web app and
00:21:44.640
your infrastructure become complex enough you have edge cases and strange things starting to appear
00:21:51.179
but you know adoption of continuous deployment does require a real upfront
00:21:56.640
investment but it's one of the most critical components not just for trunk-based
00:22:01.740
development but for efficient software engineering just in general then finally we've got feature Flags or
00:22:09.299
feature toggles as they're also known uh these allow you to switch pieces of functionality on and off for certain
00:22:14.820
users the most simple feature flag is just an if statement which maybe enables something locally but you can also use
00:22:21.960
services like LaunchDarkly which is what we use and they provide more advanced targeting functionality
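
To make the range from a plain if statement to targeted flags concrete, here is a small in-process sketch. The `FeatureFlags` class, its method names and the flag names are assumptions for illustration only; this is not LaunchDarkly's API, and a hosted service would normally replace the in-memory hash.

```ruby
require "zlib"

# A hypothetical in-memory flag store supporting a global switch,
# per-user targeting and a percentage rollout.
class FeatureFlags
  def initialize
    @flags = {}
  end

  # enabled:    switch the feature on for everyone
  # users:      ids that always see the feature (for example beta users)
  # percentage: 0-100, share of remaining users who see the feature
  def define(name, enabled: false, users: [], percentage: 0)
    @flags[name] = { enabled: enabled, users: users, percentage: percentage }
  end

  def enabled?(name, user_id: nil)
    flag = @flags[name]
    return false unless flag # unknown flags stay off
    return true if flag[:enabled]
    return false if user_id.nil?
    return true if flag[:users].include?(user_id)

    bucket(name, user_id) < flag[:percentage]
  end

  private

  # Stable bucket in 0..99, so the same user always gets the same answer.
  def bucket(name, user_id)
    Zlib.crc32("#{name}:#{user_id}") % 100
  end
end

flags = FeatureFlags.new
flags.define(:new_chart_header, users: [42], percentage: 10)

if flags.enabled?(:new_chart_header, user_id: 42)
  puts "render the new header"
else
  puts "render the old header"
end
```

Hashing the flag name together with the user id keeps rollouts independent between flags, so the same 10 percent of users are not always the first to see every new feature.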
00:22:27.860
feature flags are what allow us to test in prod my favorite thing and it makes
00:22:33.120
QA faster and more reliable and avoids straggling PRs being blocked
00:22:39.000
because QA hasn't had time to get to the code but they're for more than just QA
00:22:45.600
feature flags can be used to safely expose early stage features to alpha or beta users
00:22:51.299
they can be used for A/B feature testing and with the really advanced tools you
00:22:56.340
can even have controlled incremental rollouts to your customer base but my favorite application
00:23:02.460
of so I was supposed to put up a flag there my favorite application of feature Flags is quite a selfish one and that's
00:23:09.419
that they allow the devs to ignore the bureaucratic crap that surrounds releasing a new feature
00:23:16.559
this here is a chart on our platform it shows the behavior change score in
00:23:22.140
a survey for four different ads pretty easy stuff the behavior change question
00:23:27.240
asks How likely a respondent is to I don't know purchase the product in the ad or something along those lines and in
00:23:33.659
this case it was asked on the scale from 0 to 10 and again in this case we want to
00:23:39.720
display the result of that score as a mean so we just take the mean across all the respondents and get the number and that
00:23:45.960
calculation type is displayed at the top next to the behavior change label because we can switch between you know
00:23:51.900
top box mean whatever you want we thought it's a good idea to have it there but based on some feedback we got
00:23:57.919
customers can quite easily see that it's a mean so they just said it was noise it
00:24:04.200
made it messy we decided to remove that uh that little mean
00:24:10.260
so our new chart looked like this being a very small change it was done
00:24:15.419
within like a few hours and it was shipped to production and then things just got messy we had sales and customer
00:24:24.419
facing reps shouting at us that we hadn't notified them properly so now their sales pitches and their demos
00:24:30.720
were all out of date and a product manager asked us to quickly
00:24:36.480
revert so that we could appease the business but we still wanted the change to go in
00:24:43.140
to get approval for this change it took two months to get all the documentation together and communicate
00:24:49.679
to everyone that needed to know this that little change there from that to that took two months to get approval
00:24:57.000
um it was a one-liner I think two years ago that PR would have been lying around for those two months
00:25:03.299
in PR hell but now we have feature flags so we just added a flag to the code shipped it and
00:25:09.720
gave control of the toggle to the PM presumably at some point he got all the communication out there
00:25:16.919
and turned it on but we didn't have to be involved
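
A tiny sketch of what shipping that one-liner behind a flag could look like. The `chart_heading` method, the `hide_calculation_label` flag name and the label format are all hypothetical; none of them come from the real platform.

```ruby
# Hypothetical rendering helper: the calculation type is appended to the
# metric name unless the flag that hides it has been switched on.
def chart_heading(metric, calculation, flags)
  return metric if flags.fetch(:hide_calculation_label, false)

  "#{metric} (#{calculation})"
end

# Shipped dark: the code is in production but nothing changes for customers.
flags = { hide_calculation_label: false }
puts chart_heading("behavior change", "mean", flags)

# Later, whoever owns the toggle flips it once the communication has gone out.
flags[:hide_calculation_label] = true
puts chart_heading("behavior change", "mean", flags)
```

The code change and the release decision are decoupled: the commit can merge on day one while the switch waits for the business.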
00:25:22.380
so getting people to use feature Flags heavily like everything else that enables TBD is not an easy task there's
00:25:28.200
a financial cost when integrating with a flagging service and devs need to be convinced to change their behavior
00:25:35.159
it does take time but eventually you get there you know at Zappi we have 69
00:25:40.380
engineers at the moment and 88 active feature Flags so I think that is quite a
00:25:46.620
success so all the data shows that adoption of
00:25:51.720
these best practices while very beneficial is not always easy during the
00:25:57.240
early stages velocity and developer happiness tend to decrease you know engineering mindsets and
00:26:03.539
organizational process they all need to change and an upfront investment in Technologies and training is vital to
00:26:09.179
the success but and even knowing what the roots are and
00:26:15.360
understanding their value is not quite enough to bring about changes and Achieve adoption and that's where you
00:26:22.380
guys come in adoption doesn't happen without people with the drive to push for change or without
00:26:28.799
people willing to embrace it and it's not just the EMs and the tech leads of the world who have that power anyone
00:26:34.380
with the right mindset and the data to back themselves should be able to convince others
00:26:39.659
now it doesn't happen overnight you can't force it so the best bet is to lead by example start writing design
00:26:45.600
docs for any ticket that you pick up share them with your team when pairing write your specs first use Simple
00:26:51.900
feature flags even if they're as basic as if Rails.env.development?
00:26:58.100
but there's even more to an effective team than just software best practices
00:27:04.440
we need to nurture our people and our culture it's been repeatedly shown that
00:27:10.080
teams and organizations with a generative culture and flexible work arrangements outperform those with
00:27:16.080
bureaucratic or pathological cultures stable teams with devs who stick around are more likely to be the norm in high
00:27:23.100
performing organizations but above all of this we need trust which is the most
00:27:29.220
vital nutrient needed to grow the tree so back to the original question why is
00:27:34.679
it difficult to adopt trunk-based development I think the answer is because trunk-based development itself is not a
00:27:41.159
goal it's a metric TBD is an indicator of good software engineering practices and team health so
00:27:48.179
your focus should be on the roots and the soil because without those the tree won't grow
00:27:54.840
so go forth and crush it and I hope you learned something thank you
00:28:07.400
I guess we have two minutes if anyone has any questions then shoot
00:28:12.720
so question is how long does our CI take to run we've got a number of different apps some of them take
00:28:20.640
two minutes um a couple of them take longer up to I think up to 20 minutes is our longest
00:28:26.700
one something along those lines yep yep so the question is how do you handle
00:28:31.860
like dependent uh PRS I think
00:28:37.080
on one hand uh if a lot of different files are dependent on each other that's an
00:28:43.020
indicator of high coupling and maybe you can have a look at your design
00:28:48.480
um but the other thing I like to do or I like to think about is that everything can go
00:28:55.320
in separately until you have a kind of tie-it-together PR and that's when you bring in you know the dependencies that
00:29:01.440
one obviously comes last but yeah I do like to think of the tie-it-together PR coming at the end
00:29:08.820
yeah so the question is around how we deal with unused code um
00:29:14.400
we do have tools for it I can't remember what the tool is called but you can like track you know um you can monitor which
00:29:20.820
code is actually touched with Rails there's some gem for it but a lot of the time it's just on
00:29:28.320
the developer to kind of remember and clean up after themselves
00:29:34.559
um the question is around automating um uh rules for for I guess big commits the
00:29:41.640
answer is no we haven't automated things uh just because it's like you know it's a
00:29:48.600
judgment call at the end of the day putting too rigorous rules in place I think leads to
00:29:54.120
more drama than it does anything else so I'd rather just make a judgment all right
00:30:02.220
I think that's it right thank you