00:00:15.280
hey now so get spaceo so a little more
00:00:20.439
about me um you know uh I have been in the Ruby Community since 2004 uh with my
00:00:27.840
character's name the entire time I've been using Sidekiq for a long time as well uh across several different
00:00:33.840
companies um and you know like many of you uh I learned Ruby by why's guide and
00:00:40.239
so like we were talking about it on Mastodon hey do the do the thing in character so I am and uh we'll see how
00:00:49.280
this goes um anyway so uh the way that I set this presentation up is it's largely
00:00:55.920
going to be the uh mistakes that I've made I've been doing this for a long time and
00:01:02.039
some of them are really boneheaded mistakes and so don't feel bad if you do that sort of stuff too um and you know
00:01:09.280
there's a lot of them I've done a lot of mistakes I've made but these are the top seven um and also going to assume that you're familiar with Sidekiq and if not
00:01:16.479
hopefully you can hang a little bit um anyway so um this is really my opinion
00:01:24.159
and not necessarily anyone I've ever worked for just to put that out there so
00:01:29.960
what's the first lesson um use a separate database don't do what I
00:01:35.159
did um so we've been using I've been using Sidekiq for a long long time and uh early early on it wasn't really well
00:01:42.680
understood what would happen if you shared the database with Rails and um eventually made it to the wiki and it's
00:01:49.479
definitely never do that as the answer um if you can avoid it what whatever so
00:01:56.960
the reason for that is that um you you know okay we didn't put the other database up because we're a startup and
00:02:03.119
it costs money and so why bother and um
00:02:08.440
why would you want to maybe have a separate database well you're going to need to scale out the Rails side of things right
00:02:13.760
you're going to want to use you're going to want to leverage cluster and you're going to want to leverage um you know
00:02:19.120
having sort of isolation between the different databases and stuff like that Sidekiq kind of doesn't really need that
00:02:24.319
it can work great with one database no problem and so what do you do right well
00:02:30.280
you might try okay well if you want to use a different you know Redis supports a different database index so like um you
00:02:35.840
could use the database index one put Sidekiq there and you're technically only running the one database but that
00:02:41.599
doesn't actually work because they still compete for resources the clustering is configured more or less the same and
00:02:47.319
also like a lot of different distributions of you know Redis or Valkey don't
00:02:52.640
support um numbered databases so here's some ideas on how to fix this or at
00:02:59.480
least try to so the Brute Force solution uh what would that be right it's the simplest shut everything off copy
00:03:07.680
paste no that doesn't work that I mean you could do it if you can go offline
00:03:13.799
but that doesn't yeah that that doesn't work at scale right so what's the second one you might try well use Redis
00:03:20.400
replication right that kind of works so we with this it's like the job queuing never stops um or you know like you can
00:03:27.599
still write but you can't read and um you know you don't really have any user facing downtime but the way that
00:03:34.799
this works is it leverages sort of a a feature/misfeature of Redis wherein
00:03:40.040
you can write to a replica which is not a thing you normally would want to do and so if you set it up that way and you
00:03:45.879
write to this replica and you know everything's good great you've migrated over no problem problem replication
00:03:53.159
restart throws all your data away so if you started to migrate your writes and they're about halfway over and replication restarts network went down
00:03:59.920
or your replication buffer overfilled whatever you've just lost all those jobs permanently and you're never going to get them back so that doesn't
00:04:06.840
quite work and of course if you did get it to work the first way like you have to just immediately turn off replication
00:04:12.200
and take a snapshot and then you know hope that it didn't fail at any point in that process and okay that doesn't do it
00:04:20.000
for you either right also another thing is not every provider gives you the access to use REPLICAOF and in those
00:04:26.639
cases maybe you could use RIOT but maybe you can't and uh you know RIOT it's a thing that you can download from Redis and
00:04:32.440
it's it depends on the provider but it may or may not work for you so okay what
00:04:37.720
do you do right well we have to manually replicate we have to do something so um
00:04:43.199
what we've done or you know I've done talked with some colleagues about is that basically just Implement a manual
00:04:50.080
replication process and that's essentially what I'm going to go with um and this is also a high level overview
00:04:55.479
but you know sort of the keywords to think about if you're trying to think
00:05:01.120
about this problem too is we basically just you know RPOPLPUSH the jobs to a new key and then like use the sorted
00:05:08.120
set commands to move them in batches and deal with Sidekiq Pro batches and all of that stuff um the whole idea is really
00:05:15.199
that like there's not a step in the process that's going to result in data loss um you know that would definitely be a problem if that happened um and so
00:05:23.360
you can always revert every step yada yada yada um the only downtime again is what like you can't execute jobs but you can
00:05:30.680
queue them so that's that's good right uh you also end up with the end state of like the source database and destination
00:05:37.039
database are clean and you don't have to remove any data that's left behind anything like that and I'm just
00:05:42.199
anecdotally trying to advocate for open sourcing the migrator once we're done with it but we'll see I can't promise
00:05:49.240
anything so it's the second lesson um this one's really you know kind of uh
00:05:54.560
common knowledge at this point um Adam McCrae at Judoscale wrote a really good article about this called an opinionated
00:06:01.280
guide to planning your Sidekiq queues and it's really good it covers everything you need to know go search for that um
00:06:07.360
or you know find him I think he's here anyway so the one thing I'd say is like
00:06:14.000
you don't necessarily have to start with a gigantic cluster of Sidekiq processes uh you do eventually need to have that
00:06:19.680
as a thing you want to scale to so think about it like make sure you're prepared for it you know maybe like give yourself
00:06:25.000
the provision for that in your you know infrastructure orchestration stuff like that and of course what happens if you
00:06:30.680
don't do this right and this is the mistake that I that I made um you end up with queue blocking so like something simple
00:06:37.280
like okay you have jobs that take tens of milliseconds so you're going to queue those 10 you know 10,000 jobs like maybe
00:06:43.759
sending activity emails or something so you got those 10,000 jobs queued but they're not going to execute because
00:06:50.080
you've got like image processing or transcoding or something that take like a minute and there's like a dozen or so of those and uh that's just using all
00:06:56.759
your resources and you can't possibly run the things that are super fast and
00:07:02.039
um you can't really use Auto scaling for this because it takes minutes you know sometimes 10 minutes or so to autoscale
00:07:07.800
out and you know this sort of when you run into this and you hadn't really you
00:07:12.879
know again learned like I did um then you kind of get gun-shy
00:07:30.080
would be better if it just a long running Sidekiq job right so what I would say is like I've done jobs that
00:07:35.400
take half an hour hour two hours sometimes and don't be afraid of them just make sure they're in a queue by
00:07:42.080
themselves um so the third lesson is um to kind of a little bit contradict what
00:07:49.759
I just said about using queues named after the SLA you want um but uh to fault
00:07:55.599
isolate and so this is like think about like boundaries for like Packwerk right or I guess packs anyway um if you're
00:08:03.599
going to modularize with Packwerk then like uh you're already kind of drawing these boundaries right you're going to have different components and stuff like that
00:08:10.759
and you can think about your Sidekiq queues the same way right so you have your SLA based queues but you like prefix them
00:08:16.759
with the name of the component and you know you don't necessarily always do this but this is a thing where you want to do that because um you know these are
00:08:24.240
going to typically be different like Amazon ECS Services um that are all serving
00:08:29.960
a different set of queues or whatever and you know what you really want is for
00:08:35.760
that to um how do I put it you want that to the things to not necessarily be able
00:08:42.959
to interact with each other like you might want to have different resources allocated you know different CPU different Ram whatever you might want to
00:08:50.040
pack them more efficiently or less efficiently depending on whether or not the job can technically go a little bit
00:08:55.279
over its SLA or maybe it can't right like that's the thing you need to think about the most thing like the biggest
00:09:01.279
kind of important thing that is I think not thought about all that much is that the security context is important and
00:09:06.839
you know my background is in information security so this is what I think about at night sometimes anyway so what's the
00:09:16.399
um what's that mean right so what that really means is that like you okay you have like you know kubernetes pods or
00:09:22.040
ECS services or whatever those are going to get different Secrets assigned to them those are going to have different you know on AWS be an IAM role whatever
00:09:28.920
your cloud provider has um and you know you kind of don't necessarily need something that's just doing image
00:09:34.760
processing to be able to like read and write customer CRM data right like you know might not need that and so you can
00:09:41.920
restrict that at a level that it doesn't matter if someone could just upload an executable and run it it doesn't do
00:09:47.000
anything right like you can't access that data or do anything bad to it um
00:09:52.440
you can also use a restricted database user you know um again in the image processing example you might say okay
00:09:58.200
well the image job is going to uh you know it can only write the images table and read a few others and
00:10:03.600
that's it um and so you know sort of like I said really limits the impact of remote code execution vulnerability and
00:10:10.000
the example I'm thinking of is Mastodon with ImageMagick um so if that were fault
00:10:15.440
isolated in the way that it should be then um it would be only able to have
00:10:21.279
affected images right because you have the image processing is happening in a completely separate set of containers
00:10:27.800
that are isolated from the others by virtue of the you know container ecosystem it probably you could still
00:10:33.800
Escape there's a lot of container escapes out there but it's more difficult right um You also can deal
00:10:40.360
with the fact that some jobs might need to more aggressively or less aggressively scale out and you know you need to have a knob to turn for that
00:10:46.519
theoretically and you know that said faster iteration is most important um when you're starting out so really this
00:10:53.360
is more think about this have this in the back of your mind when you're when you're working on it and kind of don't create extra tech debt for yourself
00:10:59.839
uh so you can go back in a future refactor and add this so the fourth lesson is to ensure observability
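The component-prefixed, SLA-based queue names from the third lesson could look roughly like the sketch below. The component names, the SLA labels, and the `queue_name` and `queues_for` helpers are all hypothetical; only the idea of prefixing an SLA-based queue with its component comes from the talk.

```ruby
# Hypothetical helper: builds queue names as "<component>_<sla>".
# The component and SLA lists are made-up examples.
COMPONENTS = %w[delivery images billing].freeze
SLAS = %w[10_milliseconds 5_seconds 30_minutes].freeze

def queue_name(component, sla)
  raise ArgumentError, "unknown component: #{component}" unless COMPONENTS.include?(component)
  raise ArgumentError, "unknown SLA: #{sla}" unless SLAS.include?(sla)

  "#{component}_#{sla}"
end

# Each service (for example one ECS service per component) would only
# serve the queues that carry its own prefix.
def queues_for(component)
  SLAS.map { |sla| queue_name(component, sla) }
end

puts queue_name("delivery", "5_seconds")
puts queues_for("images").inspect
```

Keeping the lists closed means a typo in a queue name fails loudly instead of silently creating a queue that no service is listening on.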
00:11:07.160
um and I will make a a little bit of a I don't
00:11:12.959
know uh admission I don't run the Sidekiq dashboard um and you know originally it
00:11:19.440
was because I didn't have the time to set it up to make it access restricted and then I did and uh I'd already set
00:11:27.639
the the metrics collection so I didn't really need it and then now I you know the company now got acquired by a big
00:11:34.079
company and uh we need to put SSO in front of it it's more complicated and still haven't done it and you know
00:11:39.920
Sidekiq Enterprise gives you a hook for a lot of this but yeah it's a lot of
00:11:45.079
work it's not a lot of work it's just a little bit of work that doesn't necessarily bring all that much value at
00:11:51.360
the moment um but regardless if you're going to run the Sidekiq dashboard don't make it internet accessible put it behind a
00:11:57.440
Bastion host or restrict it to a certain IP address range or whatever don't uh don't put it on the internet that that's
00:12:03.800
that's not a good idea um so in addition to the different uh you know libraries I
00:12:10.200
have on the screen you can also roll your own with Sidekiq::Stats that's what you know I did because um at the time
00:12:16.760
these didn't exist um and you know really regardless of whatever system you're going to use to collect metrics
00:12:23.920
um collect them more than once per minute the reason for that is really pretty simple it's that you don't want
00:12:30.160
to just use one data point to make a decision on oh I need to scale out or something bad's happening I need to page somebody right you want a few and you
00:12:37.240
want a few over some time period and that time period being 10 minutes might be the difference between having
00:12:43.720
customer impact or not right and so if you could make that you know time to response be a few minutes like one or
00:12:49.480
two or three because you're you know measuring every 15 20 seconds or something that's way better than if it
00:12:55.600
takes 10 minutes because if you think about like oh well the action is that scale out well uh uh scaling out if you have to
00:13:03.120
launch new instances can take 10 minutes and now again you're back to having customer impact if the jobs have to
00:13:08.240
execute quick enough right so you know you really excuse me you really want to avoid um you know really want to keep
00:13:15.600
track of that one thing though is that uh it's going to require a little bit of special consideration because usually
00:13:20.639
high resolution metrics cost more and sometimes require you to use a different API to publish them or whatever but you
00:13:26.199
know that's something that you can plan for accordingly um um you can also think
00:13:31.360
about like okay APM application performance metrics right and uh what I would say is that like that's not going
00:13:37.880
to be a way out that's that's the that's the mistake in this slide um is is you
00:13:42.920
can't really rely on APM to tell you that your jobs are performing poorly because it's going to tell you the code path that's performing poorly which
00:13:49.279
might not translate to the job actually being bad right like you might have a method that is not necessarily the most
00:13:54.959
optimal it can be but hey guess what it's still executing way faster than you need
00:14:00.279
it to and you don't need to care so it doesn't frequently correlate and that's kind of unfortunate because usually
00:14:06.240
you're already using some tool to do that um I think the sponsors or one of them is an APM provider so you're using
00:14:13.279
their tool theoretically and um you know if you are awesome and uh just probably
00:14:20.240
want to also collect your own separate metrics for job performance not just APM
00:14:25.320
um but you know what do you have when you have metrics what do you do right well like I said you probably are going
00:14:31.160
to end up paging someone in the worst case and then in the best case you have data to draw from to make decisions
00:14:36.720
about how to scale the different Sidekiq you know Runners workers whatever you
00:14:42.040
have you know like in the case of like Amazon ECS right you have different ECS services and so you want to give some
00:14:47.600
more CPU and some less CPU and you know it's you know how do you know which ones
00:14:53.399
which you know how do you know what is the best choice is that you don't you have to have the data and by collecting the job performance metrics you're able
00:14:58.800
to figure that out um and then you know also it's like you know how many
00:15:04.560
different uh instances of the service are you running how many different you know what's the Sidekiq concurrency set to
00:15:09.920
all of these are things that like you can collect this data and then easily go compare later hey this is good this is
00:15:16.320
bad you know I'm going to make a tweak and you know make it more efficient and okay it still performs just as well I'm
00:15:21.800
GNA continue to tweak that until I'm you know at my cost optimization goal or whatever right um that said for Auto
00:15:29.279
scaling uh don't use CPU or Ram utilization as a proxy for performance like that you need to scale out or scale
00:15:35.959
back in doesn't really work very well for that uh in my experience and that's the mistake that I made here um and you
00:15:43.680
know it's it's uh unfortunate because it's you know really unhelpful to have
00:15:48.720
uh those be provided by default and you can't use them um I mean I guess if you did nothing I guess you could use them
00:15:55.560
and that would be okay but like it really um I would suggest just hooking up you wiring up Auto scaling with some
00:16:01.959
of your metrics you're collecting so the other thing is um you know um your scale
00:16:08.000
in and out logic might need to be more complicated uh and really the the key thing is to eagerly scale out but
00:16:13.720
conservatively scale in so the fifth lesson is to avoid the sharp edges
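The scaling advice from the fourth lesson (several samples instead of one, scale out eagerly, scale in conservatively) can be sketched as a small decision function. This is plain Ruby under stated assumptions: the `scaling_decision` name, the sample counts, and the thresholds are invented for illustration, and the latency samples would come from whatever metrics you collect.

```ruby
# Hypothetical scaling decision based on several queue-latency samples
# (in seconds), for example one sample every 15 seconds.
# Thresholds and sample counts are illustrative assumptions.
def scaling_decision(latency_samples, sla_seconds:, out_samples: 3, in_samples: 20)
  recent = latency_samples.last(out_samples)
  # Scale out eagerly: a few consecutive samples over the SLA is enough.
  if recent.size == out_samples && recent.all? { |s| s > sla_seconds }
    return :scale_out
  end

  calm = latency_samples.last(in_samples)
  # Scale in conservatively: require a long quiet window well under the SLA.
  if calm.size == in_samples && calm.all? { |s| s < sla_seconds * 0.25 }
    return :scale_in
  end

  :hold
end

puts scaling_decision([1, 2, 40, 45, 50], sla_seconds: 30)
puts scaling_decision(Array.new(20, 2), sla_seconds: 30)
puts scaling_decision([2, 2, 2], sla_seconds: 30)
```

The asymmetry between `out_samples` and `in_samples` is the point: with 15 second samples, scaling out reacts in under a minute while scaling in waits for five quiet minutes.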
00:16:20.000
there's a lot of them and I thought about making this like a whole bunch of different ones but like then we're at
00:16:25.759
like lesson 50 or something and it's not worth it um the kind of the first thing I have on here is make the job
00:16:31.120
idempotent um and what that really means is that sometimes you have a situation you know say it happens 0.01% of the
00:16:37.720
time at a certain scale that's daily or hourly or you know minutely for some
00:16:43.800
folks um and so what might happen is that the job can do all of its work and
00:16:49.839
commit the transaction to your your RDBMS and everything is good and then it can't
00:16:55.279
mark it complete in Redis that packet gets dropped Sidekiq Pro will happily deal with that problem but
00:17:02.440
um if you you know not every situation would that still there's still the possibility that it wouldn't necessarily
00:17:08.520
be able to check the job back in is complete right and so um what happens when that job gets recovered well it's
00:17:15.720
going to redo all the same work again meaning if you didn't have any check for idempotency that you're going to redo all the work you're going to duplicate
00:17:22.480
rows theoretically uh it might be at this point too late to do the thing you needed to do and now you are creating
00:17:29.480
like a exception that lives forever until you go clear out the dead set or something so that's deeply unpleasant so
00:17:37.000
try to make sure that you're checking this when the job starts and then you know or at least some version of idempotency
00:17:42.320
whether it's the checking that the status is needing you still need to do the thing or whatever um
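A minimal sketch of the status check described here, in plain Ruby so it runs without Sidekiq or a database. `Invoice`, `STORE`, and `SendInvoiceJob` are hypothetical stand-ins; in a real job the status check and the status update would go through your database, ideally in the same transaction as the work.

```ruby
# Hypothetical job that checks a status before doing its work, so that
# running it twice for the same record is harmless.
# Invoice and the in-memory STORE stand in for a real database table.
Invoice = Struct.new(:id, :status, :emails_sent)

STORE = { 1 => Invoice.new(1, "pending", 0) }

class SendInvoiceJob
  def perform(invoice_id)
    invoice = STORE.fetch(invoice_id)
    # Idempotency check: if a previous run already finished, do nothing.
    return :skipped if invoice.status == "sent"

    invoice.emails_sent += 1   # the side effect we must not repeat
    invoice.status = "sent"    # recorded together with the side effect
    :done
  end
end

job = SendInvoiceJob.new
puts job.perform(1)  # first execution does the work
puts job.perform(1)  # a recovered duplicate execution is a no-op
puts STORE[1].emails_sent
```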
00:17:47.960
another thing is like you know file system in a containerized environment is weird um because containers are going to have quotas the file system is going to
00:17:54.480
have a quota they're not necessarily going to be set to what you expect them to be and they might need to be tuned or
00:18:00.480
might not need to be tuned whatever um you also might have it where each
00:18:05.720
container has a volume for itself or it might be there's a shared shared volume and you know it's for the whole instance
00:18:12.400
and you know that's for like cache or something and you know you kind of those are all things to think about but uh if
00:18:18.640
you're going to use the file system and you don't really know what the configuration is then you might end up
00:18:24.080
in a bad situation for instance temporary directories those are usually by default RAM disks nowadays and uh so
00:18:30.760
you're downloading say a multi-gigabyte file which you think is going to a disk that you don't really care about the size of but it's actually going to RAM
00:18:36.480
and you crash the container that's not great you don't want to do that um ask me how I know
00:18:42.600
that um so you know think about that um also another thing that happens for me a
00:18:50.039
lot of my jobs are network hungry um not CPU or RAM bound really and so what do
00:18:55.880
you do about that right like the kind of thing you might initially think is I'm
00:19:01.480
going to go look at the product pricing page for my cloud provider and go hey how much does it you know how what what
00:19:07.120
is the the performance level for the network here right it's going to say up to a number well don't do that instead
00:19:13.679
go to the documentation and find the minimum performance usually it'll be like there's a uh a per flow rate and a
00:19:19.720
peak like a you aggregate flow rate or whatever pick whichever one's smaller and then you know use that to map to the
00:19:26.919
the CPU and RAM that are uh each instance type has and then use that to kind of use CPU and RAM as a proxy
00:19:34.600
for the amount of network performance you need right so like if you know that like I need to be on this instance serving you know five gigabits a second
00:19:41.760
it would be really bad to have jobs that could get scheduled on an instance that has a minimum performance that's
00:19:47.600
guaranteed to you that is a megabit right or 10 megabits or 50 megabits like that would be you would you would clog
00:19:54.080
up that instance's pipe really quickly even if it might be able to boost to 10 gigabits per second or something and you
00:20:01.520
know it's uh unfortunate you can't really you know use that metric as a
00:20:07.440
thing to you know you have to you have to convert it to being CPU and RAM you can't just use network uh performance um
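The conversion from a network requirement to a CPU reservation can be worked through with a little arithmetic. This is a sketch under assumptions: the `cpu_units_per_task` helper and every number in it are invented, and the 1024 CPU units per vCPU convention is the one Amazon ECS uses. Take the guaranteed bandwidth from your provider's documentation, not the "up to" figure.

```ruby
# Hypothetical sizing helper: turn a network requirement into a CPU
# reservation, because the scheduler only places tasks by CPU and RAM.
# All numbers are invented examples.
def cpu_units_per_task(instance_cpu_units:, guaranteed_mbps:, task_mbps:)
  # How many tasks can share the guaranteed bandwidth of one instance.
  max_tasks = (guaranteed_mbps.to_f / task_mbps).floor
  raise ArgumentError, "instance cannot guarantee enough bandwidth" if max_tasks < 1

  # Reserve enough CPU per task that no more than max_tasks fit.
  (instance_cpu_units.to_f / max_tasks).ceil
end

# A 4 vCPU instance (4096 units) with 750 Mbps guaranteed, and tasks
# that each need 250 Mbps: at most 3 tasks should land on it.
puts cpu_units_per_task(instance_cpu_units: 4096, guaranteed_mbps: 750, task_mbps: 250)
```

Rounding the reservation up errs on the side of placing fewer tasks per instance, which wastes a little CPU but keeps the network pipe from clogging.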
00:20:14.480
also like you might have a different you know networking strategy you want to use besides Docker or whatever um you can do
00:20:21.760
whatever right there's a lot of different choices out there they're all different tradeoffs you might need to do some tuning for your cloud provider
00:20:28.080
there's all kinds of stuff like that there's also some more sharp edges um and really these are how the they
00:20:34.679
affect customer impact or how they have how they create customer impact and
00:20:40.480
so I'm uh sure that no one here has ever pushed a job out that uh or pushed a
00:20:45.840
code change out that is a job that then suddenly is failing or not performing correctly and then oops I got to roll
00:20:51.000
this back immediately um no one no one's ever done that so um I have um and so so
00:20:58.919
what I found is that it's really valuable to have it be possible to have uh a key in Redis that you can just
00:21:05.400
set and then jobs with that class name don't run anymore they just immediately fail and you just you can use a a you
00:21:12.159
know a super class basically that implements this or whatever and you know Sidekiq also lets you pause queues but you
00:21:18.840
know being real you probably have more than one job in a queue um and so that's really not helpful if you just pause the
00:21:26.120
queue because you're just still creating that customer impact um again you just can kill that job and then while you
00:21:32.559
make the job fail it'll get retried and then you can fix the code deploy it pull the key out and then everything's good
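A minimal sketch of that per-class kill switch. It is plain Ruby so it runs on its own: `FLAGS` is an in-memory Hash standing in for Redis, and `KillSwitchJob`, `JobDisabled`, the `run` wrapper, and the `disabled:<ClassName>` key format are all hypothetical names. A real version would check a Redis key from the job superclass instead.

```ruby
# Hypothetical kill switch: a base class that refuses to run a job when
# a key named after the job class is set. FLAGS stands in for Redis.
FLAGS = {}

class JobDisabled < StandardError; end

class KillSwitchJob
  def run(*args)
    # Fail fast so the job is retried later instead of doing bad work.
    raise JobDisabled, "#{self.class.name} is disabled" if FLAGS["disabled:#{self.class.name}"]

    perform(*args)
  end
end

class ResizeImageJob < KillSwitchJob
  def perform(image_id)
    "resized #{image_id}"
  end
end

FLAGS["disabled:ResizeImageJob"] = true   # bad deploy: flip the switch
begin
  ResizeImageJob.new.run(42)
rescue JobDisabled => e
  puts e.message
end

FLAGS.delete("disabled:ResizeImageJob")   # fix deployed: pull the key out
puts ResizeImageJob.new.run(42)
```

Raising instead of silently returning matters here: the failure sends the job through the normal retry path, so nothing is lost while the fix is being deployed.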
00:21:38.520
and you've limited the customer impact of like one thing um you might also have jobs that get re-executed frequently
00:21:45.039
because like clockwork or cron or whatever is going to start those jobs maybe if they fail think about whether
00:21:50.919
or not they need to be retried in the first place um you know you also should probably avoid having jobs that uh get
00:21:57.240
retried while the job is scheduled again that's already running concurrently
00:22:02.279
that's a situation that's bad um again experience um and you know kind of other
00:22:08.880
things is like you know we have jobs that fail terminally don't retry those sidekiq_retry_in implements that if you
00:22:14.360
have a new enough version of Sidekiq uh in your project and if you don't sorry um but upgrade if you can um
00:22:23.480
and you know sometimes you have jobs that are like you know if it gets retried after 10 minutes it's worthless to have done it to to begin with don't
00:22:29.600
retry those like just customize the retry policy so that it doesn't try to run it 3 days from now if it only can be
00:22:35.600
valid 10 you know 10 minutes from now or whatever so the sixth lesson Sidekiq is not the only system out there
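The retry-policy point from the end of the fifth lesson (do not retry a job three days from now if its result is only useful for 10 minutes) can be sketched as a decision function. This is plain Ruby and does not call Sidekiq; the `retry_delay` name, the linear backoff, and the 600 second window are assumptions for illustration.

```ruby
# Hypothetical retry decision for a job whose result is only useful for
# a short window. Returns a delay in seconds, or :discard when a retry
# would land after the job has stopped being useful.
def retry_delay(enqueued_at:, now:, attempt:, useful_for: 600)
  delay = 15 * (attempt + 1)            # illustrative linear backoff
  deadline = enqueued_at + useful_for
  (now + delay) > deadline ? :discard : delay
end

start = Time.at(0)
puts retry_delay(enqueued_at: start, now: start + 60, attempt: 0)
puts retry_delay(enqueued_at: start, now: start + 590, attempt: 2)
```

The first call is still well inside the window and returns a delay; the second would retry after the deadline, so it gives up instead.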
00:22:42.400
it's complementary with a lot of systems and you're not going to be able to say like pick one you know they're all they
00:22:49.039
all do different things they all are really good and use whatever the best tool for the job is and so Sidekiq is
00:22:55.840
like the best whenever it's like you want to do work in Ruby later from within Ruby and that's what you want to
00:23:00.880
do if you want to schedule a job from a Java
00:23:08.480
application and then run it in a Ruby application later probably not the tool for you um but maybe Faktory would be
00:23:16.320
right like Faktory is a lot like Sidekiq architecturally it has some neat features like progress tracking might be
00:23:23.120
a thing that in hindsight I would have used but I definitely have considered it I I don't know if I would use it or not but I definitely consider it um but if
00:23:31.919
you're going to use these other systems don't use Active Job abstractions um because you're using systems for the
00:23:37.559
special features and you don't want to make those special features disappear because what's the point in doing that right so what are those other systems
00:23:44.159
what do they do for you there's you know sort of the high level ones are Kafka SQS and you know AMQP or whatever so
00:23:51.799
Kafka is a write into a log um consumers are going to track where they are in the log there are transactions you can have exactly
00:23:58.200
once semantics and um really Kafka is bad at the things Sidekiq is great at um
00:24:05.720
like you can't really reliably execute individual jobs in Kafka because it costs a lot of resources to acknowledge
00:24:12.039
individual messages so instead you want to like acknowledge a batch of them and
00:24:17.200
what happens if some fail right that's not it's it's not it's not particularly great at that where Sidekiq is is
00:24:23.520
particularly great at that um also I'll plug Karafka it's a really great processing framework
00:24:29.000
um highly recommend using it if you're using Kafka if you're using SQS SNS which you might be using in addition to
00:24:34.720
Kafka like you know one thing to keep in mind is that you can really only have one consumer per queue uh so if you need
00:24:40.960
to have two different like things that don't do the same thing you have to have two queues and use SNS to fan out to two
00:24:46.240
different queues um typically you know I've been using SQS to do like fan
00:24:51.760
messages out over to Kafka or queue Sidekiq jobs in some rare situations directly uh
00:24:58.000
but usually is to just process it entirely on AWS using Lambda um and you know that's what most people do but you
00:25:05.520
know again all of these are all these are things that um you're going to you know probably have three or
00:25:13.039
four different Tools in place right um AMQP MQTT that's a Telemetry thing um
00:25:19.080
most of the time an AMQP message or MQTT is not going to actually be like a job that needs to be executed it's going to
00:25:24.919
be an observation or a set of them and uh you know there's a whole bunch of different semantics that are possible
00:25:30.720
there's a whole bunch of different um it really depends on like software involved and it's it's not really the same thing
00:25:38.240
as any of the other things um so like for instance you might have your your AMQP queue or whatever you might have that
00:25:44.080
that queue process your messages and uh use something like uh Apache Flink to
00:25:50.840
process the messages rather and like submit the observations like here's what action items right send those to Kafka
00:25:57.760
and then use Karafka to consume the Kafka topic to then schedule Sidekiq queues or or Sidekiq jobs that's that's a thing
00:26:03.799
you might do um in that situation and so it's not um you know again you might run
00:26:10.399
every one of these or even more and it's all totally reasonable so what's the seventh lesson the last one is to pay
00:26:17.440
Mike um Sidekiq Pro
00:26:23.520
yeah Sidekiq Pro is worth it like the it's it's really there's a whole bunch of
00:26:30.399
supporting the development of it and the fact that you know Mike can be the person leading this track and all that
00:26:35.640
right like besides that like it's Sidekiq Pro has a bunch of cool features it also solves some licensing issues if
00:26:41.480
you work in a company that doesn't like the LGPL um I think it's easy to comply with but that's my opinion um but not the
00:26:48.760
opinion necessarily of corporate legal so um yeah it makes it really easy to
00:26:53.919
deal with that because now you're paying for it and they don't care um super_fetch is really good reliable push is really
00:26:59.200
good both of those would be on their own individually worth it batches are useful
00:27:04.760
again worth it um Enterprise maybe this applies to you maybe it doesn't but it
00:27:10.360
checks all the compliance boxes um that nobody really wants to deal with um but
00:27:16.360
you know it's one of those things that uh you might have to or you might just roll your own but like it's a lot easier
00:27:21.399
to just do Enterprise um and you know it's it's also has a bunch of other features like
00:27:27.279
the SSO thing I mentioned earlier um and really that's that's the main takeaway from this is that uh it's really worth
00:27:35.480
it to get Sidekiq Pro at the very least um particularly if you're at scale that's what we run um we don't need the
00:27:41.919
stuff in Enterprise because we've already had to deal with like before Enterprise existed we dealt with the compliance stuff so anyway that's the
00:27:49.519
end um I don't know what if I'm over or not because the clock stopped working um
00:27:56.000
but this is all my socials um and and uh I will probably upload the the thing to
00:28:02.279
my website at some point but I didn't think about it and so it won't be until I'm back home so next week sometime
00:28:09.039
whatever um I don't know if we have time for questions if you have questions shoot them if you don't have questions
00:28:15.360
or whatever I have stickers more than just the two that are on the SC the screens over there not more than the
00:28:22.000
ones that are on the screen if you're interested I've got them in big size small size one thing I ask is if you put
00:28:28.320
me you put my character somewhere interesting send me a picture I'm in the Oakland
00:28:34.480
Coliseum so uh anyway that's anybody have questions I didn't look
00:28:40.240
up yeah so the um I guess this really just an
00:28:46.039
observation that if you use if you're trying to fault isolate then when you have a job that's like 30 minutes or 30
00:28:52.080
seconds rather and you have like five jobs and you don't want them to be able to compete for resources and you isolate
00:28:57.640
them right you end up with what you said like 30 seconds a b c d whatever it's really hard to reason about that and I
00:29:03.120
agree and so the thing I left out is that in practice we sort of just have
00:29:09.000
our queues not named the time but rather the function so like we would have like
00:29:16.279
listen tracking for instance and we everything listen tracking related is going to take a certain amount of time
00:29:22.799
and that goes in one place and um that said though at some point that becomes
00:29:29.279
really hard to reason about and it would be way better if I could say like um you know something like uh delivery
00:29:37.440
underscore 10 milliseconds and then delivery underscore 5 seconds because
00:29:42.600
there's kind of two things that are the same thing but different and some takes longer and so that would be better but
00:29:49.320
there's a certain inflection point where it makes a lot more sense to just do what I did and name them sort of not the
00:29:56.240
way that Adam suggested and uh I would say that like at a certain scale I guess
00:30:01.720
it'd be really annoying to have to deal with um oh how long does this queue take which queue should I put this job into I don't
00:30:08.320
know I put it in this one and now you're creating the problem you tried to avoid to begin with because you have like 50
00:30:13.720
developers and it's just super hard so anyway that that's that's my thoughts on
00:30:19.039
that but like I totally um yeah I I didn't I didn't it's a lie of omission is what that is um yeah totally get it
00:30:26.919
though um thank you thank you that concludes our morning program please enjoy your