Hello, my name is Jade. As Mike said, I'm a senior software engineer at Theta Lake. I've previously worked on eight different Rails apps as part of four different Rails systems, with totally different architectures and totally different products, but what they've all had in common is scaling challenges.

So, first up: have you ever had a big spike in user traffic come into your system? Has your Rails app ever started slowing down or shedding requests to the point where it looks like it's actually down? Or maybe, instead of user traffic, you've had a different kind of problem, where whatever data source you ingest grows to the point where it's going to take too long to ingest it all into your system at the regular interval you need. If any of those ring true, this talk will be useful for you.
The last time I spoke at RubyConf, in Denver, one of the questions was about load testing for user traffic, and I recommended a tool some former co-workers had written for shadowing, or replaying, real requests against a system. The following year I moved to Theta Lake; we're a fintech, more specifically regulatory compliance, and since 2022 a few engineers have been doing performance testing on a replica of our production system. Earlier this year I was asked to join in on this.

To give an overview, this is basically a cross-team project, with people from the ingestion team, the operations team, and my team, which works on the customer-facing Rails app, and it's led by our CTO, Rich. This talk is about a methodology for replicating a production system and then seeing how it performs when we push more data into it than we currently see.
You might ask yourself why. Well, the point of the performance and scalability testing we're carrying out is to demonstrate the maximum performance and throughput of our system when balanced against operating at a reasonable cost. You could always horizontally scale out further; we're just looking to balance operational cost and throughput.

There's the old quote about premature optimization, which I'm sure many of you know, and I'd agree that if you don't need to do this sort of thing, then don't. But what we're trying to do here is, given what we know, what we've been told, and where our ingestion is in terms of scale at the moment, to see a little way ahead into the future in order to find bottlenecks in our system. Then we can both demonstrate how scalable it is at the moment and also find limiting factors that we can then look to mitigate.
My hope is that if you're on a team that is starting to see scaling challenges, you'll be able to go back to your team and apply this. That could be a big hope, but the idea is that you want to do this before either your server costs spiral out of control or your entire system is about to start acting like it's down, or actually go down. So today I'm going to present our methodology for replicating our system for the purpose of load testing.
Right, so a bit of background. For this sort of thing, a lot of the tooling and resources you'll hear about are for looking at the performance of big systems, designed by and for absolutely massive companies operating at truly massive scale: Netflix, Shopify. A lot of the work, including the literal book on the subject, has been done by Brendan Gregg, who is at Netflix.

The issue is that in smaller companies, where you don't have that large team, you may have different constraints. As well as people's time being limited, you might not want to operate a fake version of your system all the time, because it would simply be too expensive; why would you do that? You may also have some, or a lot of, PII (personally identifiable information) that you have to remove before it goes into an external logging tool. And, specific to us, you cannot use production data in any form, even anonymized.
00:04:30.560
when you're operating at some kind of
00:04:32.400
scale just your rails app alone like
00:04:34.960
your single monolith isn't often going
00:04:36.800
to be your entire system um so you could
00:04:40.600
locally optimize performance in it you
00:04:42.560
could see someone's PR and say oh I
00:04:44.160
think that could be a bit faster there
00:04:46.120
but you could do the classics you could
00:04:47.960
just switch out Malo for jalo put an
00:04:50.560
index somewhere in your database where
00:04:52.039
you need it um but this might not even
00:04:54.800
be where the bottlenecks are so that is
00:04:57.199
that could in that case be wasted work
00:04:59.720
so in short we want to replicate our
00:05:02.080
entire system on real infrastructure
00:05:04.800
with real logging and then load test
00:05:07.039
against that I personally think this is
00:05:09.320
really cool and I've not actually seen
00:05:10.880
it before so I'm going to walk you
00:05:13.000
through how we do
00:05:15.720
it so um I just wanted to clarify a bit
First I just wanted to clarify a bit of terminology; this is taken very directly from Brendan Gregg's book, which I showed a few slides ago. A few important points: throughput is the rate of work performed. Workload is the input to the system, or the load you're applying to it. Response time is the time for an operation to complete, comprising both wait time and actual service time. Utilization has two definitions: for resources servicing requests, like servers, it's how busy the resource was; for resources that provide storage, it's the capacity consumed, for example memory utilization. Then, probably quite an important one for this talk: a bottleneck is a resource that limits the performance of the system, a limiting factor, and you're aiming to identify and remove systemic bottlenecks.
OK, so this is a high-level diagram of our system and what it does, and I've highlighted some fairly typical areas of a system, in my experience: data ingestion, your pipeline, and then Rails and Sidekiq, all the usual, maybe an API, maybe not. Here our data ingestion is written in Go and leads into a pipeline for content analysis, which I'm not going to go into in great detail. My team's part of the system is the Rails and Sidekiq side, the usual.

Then this is an architecture diagram. We have various integrations, like Zoom and Slack; they feed into a system called the integrator, which feeds into the ingestor, through to the pipeline, through to Portal, which is the Rails app, and then the API feeds into quite a few of those.

I've seen similar architectures, or heard about them, and a fairly common thing I've seen is for there to be differences in how you do data ingestion. I've seen a few places where the strategy is to write, or in some cases actually rewrite, the data ingestion service in a language other than Ruby, maybe Clojure, maybe Go. The other month I actually heard about a web hosting platform who, like us, had decided to write their data ingestion service in Go. It gets data into a database and eventually into a form the Rails monolith reads, but interestingly that team are actually going to move back to Ruby because of team changes, since that makes it easier for that team to maintain their data ingestion service.
So: I've covered when it will help to load test your system; let's get into the details. Firstly, we want to replicate the existing system. Assuming you want to repeat this process and perform several load tests, rather than running constantly, and also not pay to have that infrastructure up all of the time, you're going to need a way to build up and tear down that infrastructure.

We use Terraform. Many of you will be familiar with it, but in case anyone isn't, Terraform is infrastructure as code. If you've ever dug around in an AWS console looking for some kind of configuration setting, you'll understand why that's a useful tool. In my experience, some teams on Heroku or AWS may already be using it, and you might have plans to move onto it. The benefits for performance testing are, like I said, that you want to be able to build up that infrastructure in the same way each time and then tear it down between tests, so you're not leaving it idle and paying for that capacity. You want roughly the same setup as production, and you want to bring it up and tear it down in a repeatable way, and that is what Terraform is very useful for.
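To give a flavour of that build-up and tear-down loop, here's a minimal sketch of wrapping the Terraform CLI in a couple of Rake tasks so the load test environment can be created and destroyed on demand. The directory, variable file, and task names are made up for illustration, not our actual setup.

```ruby
# lib/tasks/perf_env.rake
# Hypothetical Rake wrapper around the Terraform CLI for a load test environment.
namespace :perf_env do
  TERRAFORM_DIR = "infra/perf" # assumed location of the Terraform config

  desc "Build up the performance testing environment"
  task :up do
    Dir.chdir(TERRAFORM_DIR) do
      sh "terraform init -input=false"
      # -var-file points at sizing close to production; the file name is illustrative
      sh "terraform apply -auto-approve -var-file=perf.tfvars"
    end
  end

  desc "Tear down the performance testing environment after a test"
  task :down do
    Dir.chdir(TERRAFORM_DIR) do
      sh "terraform destroy -auto-approve -var-file=perf.tfvars"
    end
  end
end
```

Then it's `rake perf_env:up` before a test and `rake perf_env:down` afterwards, so nothing sits idle between runs.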
The next thing is to think about how work arrives in your system. The systems I've worked on have had two typical ways, pretty standard: either user traffic or some kind of data ingestion service.

Then the third thing you want to think about, for the purposes of load testing, is how you can artificially push work into this performance testing system. For user traffic, I'm aware of two companies that have looked at doing request replays, or shadow requests: you capture real production requests, remove the PII so it doesn't go into your logging, and then replay those against your system. The companies I've heard of doing this have actually open-sourced their tools: one is Carwow, with Umbra, and the other is loveholidays, with a tool called Ripley, which, coincidentally, I recently saw a talk about.
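To make the shadow-request idea concrete, here's a minimal, hypothetical sketch of the pattern, rather than Umbra's or Ripley's actual API: read previously captured requests, scrub obvious PII, and replay them against the replica. The log format, scrubbing rule, and target host are all assumptions.

```ruby
# Hypothetical request-replay sketch; not any specific open-source tool's API.
require "net/http"
require "json"
require "uri"

TARGET   = URI("https://perf-test.example.internal") # replica under test (assumed)
EMAIL_RE = /[\w.+-]+@[\w-]+\.[\w.]+/

def scrub(value)
  value.gsub(EMAIL_RE, "redacted@example.com")
end

# Each line is assumed to be JSON like {"method":"GET","path":"/...","body":null}
File.foreach("captured_requests.jsonl") do |line|
  req  = JSON.parse(line)
  path = scrub(req.fetch("path"))

  Net::HTTP.start(TARGET.host, TARGET.port, use_ssl: true) do |http|
    case req.fetch("method")
    when "GET"
      http.get(path)
    when "POST"
      http.post(path, scrub(req["body"].to_s), "Content-Type" => "application/json")
    end
  end
end
```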
In our case, because our customers are in regulated industries, it is not appropriate at all to use their real data. So instead, what we use is a Go tool that my colleague David, who's on the ingestion team, wrote to generate fake data that approximates the present-day workload, and from that we do 10x, 100x, 1,000x, and so on. The sorts of things coming into the system are emails, chats, Zoom calls, and so on, so we push fake versions of those through the ingestion service.
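The real generator is a Go tool, but the idea looks roughly like this Ruby sketch: synthesize records that approximate today's shape of traffic, then push them in at a multiple of the present-day rate. The field names, rates, and multiplier are illustrative assumptions.

```ruby
# Illustrative fake-record generator (our real tool is written in Go).
require "securerandom"
require "json"
require "time"

BASELINE_PER_HOUR = { email: 63_400, chat: 19_000 } # assumed present-day rates
MULTIPLIER = 10                                     # then 100x, 1000x, ...

def fake_record(kind)
  {
    kind: kind,
    id: SecureRandom.uuid,
    sender: "user-#{rand(10_000)}@example.test",    # synthetic, never real PII
    body: "synthetic #{kind} body #{SecureRandom.hex(8)}",
    sent_at: Time.now.utc.iso8601
  }
end

BASELINE_PER_HOUR.each do |kind, per_hour|
  (per_hour * MULTIPLIER).times do
    payload = JSON.generate(fake_record(kind))
    # In the real setup this payload would be pushed into the ingestion
    # service; printing it here is just a stand-in.
    puts payload
  end
end
```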
The foundational work on our tool for generating and pushing fake data into our system actually happened before I got involved, so I asked David about his approach to reflecting real incoming traffic. According to him, the approach he took was to push traffic at volumes corresponding to what we were already seeing and work out what we would need to process it in terms of resources: servers and so on. Then we could look at the 24-hour average rate across a work day, compare it against the peak-load demand on resources, and work out from that how many resources we needed.

So we used volumes of anticipated customer data, allowed some extra for growth, and used that to get a kind of average rate; once we hit that rate, we would have at most 24 hours of latency in processing incoming data, with all of it normalizing out. With the rate of processing per machine, or per N machines, we could look at the most heavily used production data centres, look at the effective rate of the typical busiest hour, and see how that differed from the 24-hour rate. Again, that's about deciding how many resources we need to process extra incoming workload within a satisfactory time. This diagram is just showing 24 hours of ingestion for a production server; the average rate here was about 83,000 records processed per hour. The idea is simply to get a record processed through the system in a reasonable amount of time.
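As a back-of-the-envelope version of that sizing exercise, here's a tiny sketch; the peak-to-average ratio and per-record service time are made-up numbers, not our real figures.

```ruby
# Rough capacity arithmetic for sizing workers against an ingestion rate.
# All numbers below are illustrative, not real production figures.
average_rate_per_hour = 83_000   # records/hour, 24-hour average
peak_to_average_ratio = 2.0      # assumed busiest-hour multiplier
seconds_per_record    = 2.5      # assumed service time per record

peak_rate_per_second = (average_rate_per_hour * peak_to_average_ratio) / 3600.0
workers_needed       = (peak_rate_per_second * seconds_per_record).ceil

puts "Peak rate: #{peak_rate_per_second.round(1)} records/sec"
puts "Concurrent workers needed to keep up at peak: #{workers_needed}"
# => roughly 46.1 records/sec, so ~116 concurrent workers in this made-up example
```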
So finally: measuring results. Before I studied computer science, I read biology, which covered the scientific method: you generate a hypothesis, design an experiment, test it, and then analyze your results to see whether you were correct. Pretty standard stuff. But we also had a really cool lecture from one of our immunology lecturers, Dan Davis, about how his research group was sort of flipping that by gathering tons of data and then sharing the data itself with the wider scientific community, so that both his research group and other teams could analyze it, draw conclusions, and publish from that.

We're partway there, in that we're gathering tons of data in our standard logging. Like I said, we have the same logging in the performance environment that we have in the production system, so from that we can look at CPU utilization, RAM, and individual log lines from each component in the system. I'm going to skip over looking for the absence of errors, because that tends to come early in a load test, where you might have needed to switch something external off, like sending emails, and that will be quite specific to your system.
This is an example load test. We wanted to look at the part of the system that routes records through to workflow; the idea is to assign a record to an individual for review. We were putting 240k records through and looking for no errors, the total time to process all 240k records, and how long each individual record took to be processed.

This is what it looked like over time; this is the individual time per record processed, and it was working out at about two and a half seconds per record to go through that workflow. We also have logging throughout that code path, so we can break down where the time is being spent. From most to least time spent: assigning those records to a workflow, then a lot of time preparing, then less time, 31%, to set up a new record, then 8% to enter the actual workflow process, and everything from there was 4% of the time or less, so we weren't super concerned about that.
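That kind of breakdown comes from timing each step in the code path. A minimal way to do it in a Rails app is ActiveSupport::Notifications; the event names and step methods below are hypothetical stand-ins for our actual logging.

```ruby
# Hypothetical per-step timing with ActiveSupport::Notifications.
require "active_support"
require "active_support/notifications"
require "logger"

LOGGER = Logger.new($stdout)

# Subscriber: turn every instrumented step into a structured log line.
ActiveSupport::Notifications.subscribe(/\.workflow_routing\z/) do |name, start, finish, _id, payload|
  duration_ms = ((finish - start) * 1000).round(1)
  LOGGER.info("step=#{name} record_id=#{payload[:record_id]} duration_ms=#{duration_ms}")
end

# Stubbed steps so the sketch runs standalone; the real ones do the work.
def prepare(_record); sleep 0.01; end
def assign_to_reviewer(_record); sleep 0.02; end

def route_record(record)
  ActiveSupport::Notifications.instrument("prepare.workflow_routing", record_id: record.id) { prepare(record) }
  ActiveSupport::Notifications.instrument("assign.workflow_routing", record_id: record.id) { assign_to_reviewer(record) }
end

Record = Struct.new(:id)
route_record(Record.new(42))
```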
OK, so that's the gist of one performance test. This is now a full-system performance test that I'm going to walk you through. You do not need to read all of this; the key points are that this was a full-system performance test carried out across five days, ingesting 190,000 fake chat records per hour and 634k fake email records per hour, and intentionally exercising the identity-matching part of the codebase.

On the Rails side, a very important area is how we recognize new participants from Zoom meetings, webinars, and chats, and make sure we're not constantly saying "Jade with this email address is one person, and Jade with that email address is another person". But you have to have a couple of things to match on; there are things that will identify someone as clearly the same person, like a combination of name and number, or name and employee record, for example.

So, to artificially stress test this part of the system: my team already had something to generate an arbitrary number of meeting participants, written by our lead, and I wrote a Rake task to generate pairs of participants that would be recognized as the same person, say 100,000, so that our Rails app would then work out that those are actually 50,000 individuals, just with very slightly different details.
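Here's a minimal sketch of what that kind of Rake task could look like; the model, field names, and matching rule are assumptions for illustration rather than our actual task.

```ruby
# lib/tasks/perf_data.rake
# Hypothetical generator of participant pairs that identity matching
# should collapse into a single person (same name + phone, different email).
namespace :perf_data do
  desc "Generate N pairs of participants that should match as one person"
  task :participant_pairs, [:pairs] => :environment do |_t, args|
    pairs = (args[:pairs] || 50_000).to_i

    pairs.times do |i|
      name  = "Perf Person #{i}"
      phone = format("+44 7700 %06d", i)   # same number across the pair

      # Two records, slightly different email addresses; assumed model/fields.
      Participant.create!(name: name, phone: phone, email: "perf#{i}@example.test")
      Participant.create!(name: name, phone: phone, email: "perf#{i}+alt@example.test")
    end
  end
end
```

Running it with 50,000 pairs gives the 100,000 participants that the matching code should resolve down to 50,000 individuals.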
Then, when we push fake Zoom meetings into the system in the overall system test, it hits this code path for identity matching, and we can see how it handles that increased throughput.
OK, so this was really cool. Two days into the test (I mentioned it was a five-day test), just from force of habit from doing daytime site reliability support rotations a couple of jobs ago, I checked the Sidekiq queues at about 11:00 UK time, and they were really, really backing up. I just want to emphasize: this is not a production system, this is a performance testing system, so we're fine. The workflow Sidekiq jobs were taking excessive amounts of time to complete; to be more specific, the mean time in seconds to complete was suddenly exactly the same as the p99.9. The other issue, really the same issue, was that we were looking at a latency of six and a half hours on one of our Sidekiq queues, which is nothing like our normal latency.
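Sidekiq exposes queue size and latency directly through its API, which is what makes this kind of spot check quick; a sketch, with a hypothetical queue name:

```ruby
# Quick spot check of Sidekiq queue backlog and latency.
require "sidekiq/api"

queue = Sidekiq::Queue.new("workflow")   # hypothetical queue name

puts "jobs waiting:   #{queue.size}"
puts "latency (secs): #{queue.latency.round}"  # age of the oldest job in the queue

# Anything far above your normal baseline (ours is nowhere near 6.5 hours)
# is a sign the workers can't keep up with what's being pushed in.
```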
It obviously felt a bit high. At some point in the night, the database connection count had doubled, from about 500 to 1,000, and therefore each of those workflow items was taking a lot more time to process. The actual time per record was still around two and a half seconds, but they were waiting far, far longer in the queue than usual.

Then at midday UK time, about an hour later, jobs actually started failing, and there was a massive spike in the classic ActiveRecord database connection error, which I'm sure many of you will have seen before. Reasonably obviously, if you've seen this yourself, this is not good; in a production system it could be an incident, depending. What was happening here was that the database server was getting totally exhausted as a resource, which led to jobs trying to open a database connection, getting rejected, failing, going around again, and backing the queue up further and further. That 500-database-connections number actually used to be Heroku's limit on Postgres database connections, and at the end I'll share a reference from them on the rationale behind that.
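One simple early-warning check for this kind of exhaustion, run from a Rails console, is comparing how many connections your processes could open against what Postgres will actually accept; the process counts below are illustrative assumptions.

```ruby
# Rough check of how many Postgres connections the app tier could demand.
# Process counts here are illustrative assumptions.
web_processes     = 10
sidekiq_processes = 20
pool_size         = ActiveRecord::Base.connection_pool.size  # from database.yml `pool:`

potential_connections = (web_processes + sidekiq_processes) * pool_size

# What the database server will actually allow.
max_connections = ActiveRecord::Base.connection
                    .select_value("SHOW max_connections").to_i

puts "app tier could open: #{potential_connections}"
puts "postgres allows:     #{max_connections}"
puts "headroom:            #{max_connections - potential_connections}"

# Live pool usage per process is also visible:
# ActiveRecord::Base.connection_pool.stat
```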
There are ways you can handle this; in this test we just scaled up the database, and that worked perfectly fine. What I thought was really cool about this was that we got a preview of what could be a real production incident before it ever actually happened in production, so we got to mitigate it and prevent it from ever happening for real.
So, sharing data: why bother? I've talked about the how, the why, and the sorts of things you might find when load testing; I want to go back to what I said about sharing data, and not just results. Why would you bother? You want to bring your team along with you (and this was really just an excuse to get this one on the slides again). The idea is that you're limited on time; ideally, if you can share the results from your logs, then everyone can pick up optimization work when they have time available. So what would be ideal is to share the data, in this case by sharing your logging results from each load test.
From what you learn in a load test, you can then do smaller-scale controlled experiments, and here I would basically just point you to Nate Berkopec's work, especially the DRM method: Database, Ruby, Memory. I'd say these two books are the go-to resources for Rails performance optimization. This actually doesn't require a performance testing environment like I've described; instead, you start by benchmarking locally and proving the worth and improvement of your performance PR, and then you do the same in production if you've got approval for the PR.
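The local benchmarking step usually starts with something like benchmark-ips, comparing the current implementation against the proposed change; the two lookups below are just stand-ins for whatever the PR actually touches.

```ruby
# Local micro-benchmark comparing an existing code path with a proposed change.
require "benchmark/ips"

RECORDS = Array.new(10_000) { |i| { id: i, name: "Person #{i}" } }

def current_lookup(records, name)
  records.find { |r| r[:name] == name }          # linear scan on each call
end

INDEX = RECORDS.each_with_object({}) { |r, h| h[r[:name]] = r }

def proposed_lookup(index, name)
  index[name]                                    # precomputed hash lookup
end

Benchmark.ips do |x|
  x.report("current")  { current_lookup(RECORDS, "Person 9999") }
  x.report("proposed") { proposed_lookup(INDEX, "Person 9999") }
  x.compare!
end
```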
What's kind of cool is that, once you follow that process, you can also prove how your optimization will perform in a larger load test. This is something we've done with a couple of my teammates' pieces of work, and it's been really interesting.
Then, just going back to the point about premature optimization: I'd agree with anyone who's thinking of "premature optimization is the root of all evil", but the actual full quote is that there is no doubt the grail of efficiency leads to abuse; we waste enormous amounts of time thinking about, or worrying about, the speed of non-critical parts of our programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. So we should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. That's the full quote.
So everything I've discussed in this talk is about ignoring the non-critical, and instead finding bottlenecks in the critical parts of your system so you can rectify them, preferably just before you need to.

Summing up: using the methodology I've described, you can carry out large-scale measurements and share the data and your insights from those tests with your wider engineering team, and you're using tools to anticipate problems rather than having to react to them when they happen and cause panic. The idea is to anticipate multiples of your current scale, replicate your system with Terraform, think about how the workload or traffic arrives into your system, whether that's user traffic or data ingestion, and then, from that, load test and take measurements from a performance testing environment. I've also walked you through a few example findings and, looking ahead, how you can share that with your team. So, thank you for listening; I'll be around for questions afterwards. Thanks very much.