00:00:12.679
coffee drinkers in the house any coffee drinkers yes I'm not addicted to coffee
00:00:18.660
but if I do not have it my day is ruined well I always start my day
00:00:25.080
though by going downstairs and getting the coffee maker going what that entails
00:00:30.599
is I will you know grab the filter I'll throw that in the maker I'll grind up some beans I'll throw those beans in the
00:00:37.200
filter and then I will press that button and walk away I want to do something else like feed
00:00:44.280
the baby maybe have some food myself right there are other things that I want to do while I'm making that coffee I
00:00:50.340
then come back maybe 10 15 minutes later grab my coffee drink it and now the day
00:00:55.920
isn't so bad you might be wondering what what the heck does this have to do with background jobs and hopefully by the end
00:01:02.640
of this talk you will understand exactly what I mean I'm Jake I'm a software engineer at
00:01:08.520
Weedmaps if you are not familiar with Weedmaps we are the leading technology and software infrastructure provider to
00:01:15.720
the Cannabis industry we are also a sponsor here so definitely come and check out our booth we got some great
00:01:22.560
swag to give away uh we are also hiring so if you're interested in working with us go ahead
00:01:28.979
and check out the careers page you can see some job postings there
00:01:34.740
but today's talk outline there's really five main things that I wanted to kind of talk through today uh to get an
00:01:42.360
understanding of background jobs what they are before I do that though I just want to preface that this is really
00:01:47.460
meant for people getting into background jobs if you are super familiar with background jobs and you're like I'm
00:01:53.399
running so many background jobs I need help come talk to me after the talk I'll help you with that later this is
00:01:59.280
definitely more catered towards understanding what they are why they are as well as kind of some of the cool
00:02:04.860
things that Active Job actually gives to us but we're going to run through this in five steps
00:02:10.619
first we're going to simulate a problem and try to kind of give us a single
00:02:16.440
starting place to build off of we're then going to kind of prospect a little bit dig a little bit more into
00:02:22.200
that problem and maybe ask a couple uh leading questions from there we're going to investigate
00:02:28.140
and say okay well are there Solutions out in the world for this particular problem
00:02:35.520
we're then going to try to fill some of our knowledge gaps with Rails and say hey this solution might exist out in the
00:02:41.760
world but does Rails itself have an implementation of this solution and last but not least we are going to expand on
00:02:48.720
that knowledge and try to learn as much as we can about anything that might be in Rails
00:02:54.260
coincidentally this spells SPIKE this
00:02:59.280
has nothing to do with you know the agile term spike I
00:03:05.099
wanted it to be related to that I do think of spikes similar to
00:03:10.920
this fun fact spike is not actually an acronym for anything so I'm gonna
00:03:16.440
throw this out there as a potential option for Spike so what is the problem
00:03:22.560
right where's the starting point here well let's start with the Rails request
00:03:28.739
I'm going to have some diagrams here on the side with little animations to try to provide some context of what the heck
00:03:35.159
I'm talking about um so you know you've got a router that determines a controller and an action to
00:03:42.360
render okay and then that goes to the controller which is then going to request data from the model the model is
00:03:49.080
going to talk to the database database is going to return stuff to the controller so on and so forth
00:03:54.739
controller sends stuff back to the visitor of the site but the most important thing that I want
00:04:00.360
to highlight here is that we need each one of these steps to complete before
00:04:05.459
moving on to the next right we can't go to the controller if the router doesn't tell us to go to that controller right
00:04:11.760
we have to wait for each of these things to complete
00:04:17.940
this is also by no means what a full request is in Rails if you want to learn
00:04:23.460
more about that Skylight did a RailsConf talk on this and they also have some accompanying blog posts so check
00:04:30.120
those out if you want to learn more about Rails requests specifically oh look at that we got a new product
00:04:36.000
request surprise surprise um they would like to add a change that requires some more logic in this
00:04:42.479
controller we're going to simulate that with a service class for those who aren't familiar with Services think of a
00:04:49.199
regular Ruby object that will take in some input do something Fancy with it and maybe output something else
00:04:56.120
examples of this might include sending an email or updating a counter somewhere
00:05:02.880
or maybe doing some kind of account setup like maybe you've got to hook up to Stripe and send some stuff over there
00:05:09.060
really it can be anything right it's whatever is specific to your business needs
00:05:15.240
but the important thing here is we have added another thing in our request chain
00:05:21.180
right we are now reliant on whatever this service is doing and our visitor is
00:05:26.759
now relying on that too oops service is taking too long uh maybe
00:05:34.320
that's HTTP you're trying to send out that email your email server is down it's not working you're SOL
00:05:42.120
maybe the database is getting too much traffic it's late in the day
00:05:49.080
and uh things are slowing down your service is slowing down
00:05:54.479
maybe the service just does a lot of things and it takes a few seconds anyway just to complete all of those tasks
00:06:01.680
the visitor though is still there they are waiting and maybe they are leaving
00:06:06.900
in anger because you are taking so long to solve this product request
00:06:13.979
so what might this look like in rails right you had some little diagrams there on the side with a you know ball going
00:06:20.400
up and down but like what does that actually look like for this example I've got a PagesController
00:06:27.300
that has an index action imagine this is like a home page and on the home page we're going to call SomeService
00:06:33.060
that's going to print a message to the screen this SomeService is our
00:06:38.520
plain Ruby object it has a perform method it's going to sleep for five seconds just to simulate a little bit of
00:06:44.400
work and then it's going to puts that message to the screen
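A minimal plain-Ruby sketch of the blocking call described above. The `delay:` keyword and the `pages_index` method are illustrative additions (the talk's service sleeps a fixed five seconds inside a Rails controller action); they only make the wait easy to measure outside of Rails.

```ruby
# Plain Ruby object standing in for the service from the talk.
class SomeService
  # Defaults to 5 seconds as in the talk; the keyword is an illustrative addition.
  def perform(message, delay: 5)
    sleep(delay) # simulate a little bit of work
    puts message
  end
end

# Hypothetical stand-in for the controller's index action: it cannot
# respond to the visitor until the service call has returned.
def pages_index(delay: 5)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  SomeService.new.perform("Hello RailsConf", delay: delay)
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
end

elapsed = pages_index(delay: 0.2)
puts format("request took %.2fs", elapsed)
```

Whatever the service's delay is, the "request" takes at least that long, which is the wait the visitor sits through.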
00:06:51.000
our terminal output would look something like this right I tried to highlight the printed message in green it actually
00:06:57.360
came out in white I don't have any fancy thing that puts that in green so there's that message
00:07:04.979
but some of you may have also noticed it took over five seconds for us to
00:07:10.860
complete this request and all we did was print something to the screen did the user need to wait for that that's the
00:07:18.419
question so let's dig a little bit further let's bring back that coffee maker
00:07:24.960
example right here's a kind of a similar chart to the requests right but in coffee maker
00:07:31.199
terminology so you've got an uncaffeinated person right they're desperate
00:07:37.080
they need to place that filter in the machine they push the button to grind the beans get things going they're
00:07:42.120
waiting for the beans they put it in the filter and now they push that start button right but now
00:07:49.440
they're just sitting there they're waiting for the coffee maker they're just like eager I need the caffeine and
00:07:55.740
so they sit there and they wait for that coffee maker to finish Brewing
00:08:01.199
and once it does they grab their coffee they drink the coffee and they move on
00:08:08.580
again similar to our hanging request but like me I I don't want to sit there and
00:08:14.160
wait for my coffee I've got other things that I want to do right same thing with this rails request it has other things
00:08:20.220
that it wants to do too I want to render The View I want to move on to other requests
00:08:26.639
so do we have to wait while the coffee's Brewing no
00:08:32.099
so do we have to wait on this service call if yes okay well maybe we do have to
00:08:39.419
wait on it like we have to wait for the filter we have to wait for the beans to grind right we can't start the
00:08:45.300
coffee machine until we at least have some of the pieces together um so if we do have to rely on it can we
00:08:50.459
make it faster or are there ways to re-architect some things to make it so we don't have to do
00:08:56.519
it right now and something else can gather the filter and the beans later on but we just tell it hey by the way you
00:09:02.100
got to grab the filter and the beans but if we don't have to wait on this
00:09:08.279
like in our puts example before the user doesn't care about us logging some information right so
00:09:16.260
can we essentially in in coding terms can we like push the button and like
00:09:22.019
come back to it later so now let's investigate that that idea
00:09:29.160
a little bit more right surely somebody has thought of this right I can't be the only one that's like can we just do
00:09:36.360
these kinds of things later and yes this has been solved this is background jobs and async processing
00:09:42.360
welcome to my talk that in a nutshell is what
00:09:48.360
background job processing is so what what does this look like at a really high level
00:09:54.120
um so you're going to have a service right that gets added to a list we're going to call that list a queue
00:10:00.779
um and it's going to come with any Associated arguments so in this case the queue is just some random class just
00:10:07.800
pretend it's a queue and we're going to add something to that queue and we're going to add the class some service right just like we did in the uh
00:10:14.580
controller and we're going to give it the hello rails comp message
00:10:20.880
so that's going to add it to the queue that lists somewhere and that list might look something like this
00:10:26.279
where hey we've got a couple things already in that list you know the some service with I'm the first job some
00:10:33.240
service with that makes me the second and now we just added that third one to the list to process some service hello
00:10:40.380
railsconf after that we're going to you know have
00:10:46.380
there needs to be something to to kind of run and pull these things off of this list off of the queue and so the job is
00:10:53.519
going to be equal to we're going to call that a job we're going to grab the first thing off of that list we're going to
00:10:59.279
grab the class which is going to be the first argument right there that's SomeService which is a string and then our
00:11:05.100
arguments are the second right that hello RailsConf so what we need to do then is we need
00:11:11.940
to take our string class and we need to turn it into a Ruby class you can do that with the .constantize method we need
00:11:18.959
to create a new version of this class we need to instantiate it so we'll call new so now we got SomeService.new right and
00:11:25.980
now we call the perform method on it just like we did in the Rails controller and then we can splat in those args for
00:11:32.880
those who aren't familiar Ruby gives you kind of a splat operator that works very similarly to the JavaScript spread
00:11:39.300
operator if you're familiar with it so definitely check that out if you're not familiar
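The push-then-pull flow above can be sketched in plain Ruby. This is an illustration, not how any particular backend stores jobs: `QUEUE` and `work_one` are hypothetical names, and `Object.const_get` stands in for Active Support's `constantize` so the sketch runs without Rails.

```ruby
class SomeService
  def perform(message)
    puts message
    message
  end
end

# The "queue" is just a list of [class name as a string, arguments] pairs.
QUEUE = [
  ["SomeService", ["I'm the first job"]],
  ["SomeService", ["That makes me the second"]]
]

# Enqueue: record which class to run and what to pass it.
QUEUE.push(["SomeService", ["Hello RailsConf"]])

# Worker side: take the oldest entry, turn the string back into a class,
# instantiate it, and splat the stored arguments into perform.
def work_one(queue)
  class_name, args = queue.shift
  return nil if class_name.nil?

  klass = Object.const_get(class_name) # in Rails: class_name.constantize
  klass.new.perform(*args)
end

work_one(QUEUE) until QUEUE.empty?
```

Because only a class name and plain arguments are stored, the list could just as well live in a database or in Redis instead of in memory.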
00:11:48.060
but what does this actually help with like cool you did some research people push things to this queue and like pull
00:11:55.079
it off this queue but like why why does that even matter
00:12:00.480
well one thing is it prevents too many things from running at the same time so let's think really quick back to the
00:12:06.420
controller example if all of a sudden I just like threw that in the background immediately to process
00:12:11.579
if we had a hundred thousand people that came onto our page at the same time and now we just triggered a whole hundred
00:12:17.760
thousand things to run in the background those are theoretically happening at the same time we're going to take down our own system
00:12:23.640
that doesn't seem very smart so one thing that adding stuff to a list or to
00:12:28.800
a queue does is it allows us to have a little bit more control in terms of how many things are running at a given time
00:12:34.860
I can say hey I want you to pull one item off the queue at a time maybe I have two different things that are pulling
00:12:41.040
off of the same queue and it's like hey you can have the first one and then you can have the second one don't worry I'll go take the third one
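A small sketch of that control, using Ruby's built-in thread-safe `Queue` and two worker threads. The `drain` helper and its bookkeeping are hypothetical, there only to show that the number of jobs running at once never exceeds the number of workers pulling from the queue.

```ruby
# Hypothetical helper: process `job_count` jobs with `worker_count` threads
# and report how many ran and the most that were ever running at once.
def drain(job_count:, worker_count:)
  jobs = Queue.new
  job_count.times { |i| jobs << ["SomeService", ["job #{i}"]] }
  jobs.close # nothing more will be added; pop returns nil once it is empty

  lock = Mutex.new
  running = 0
  peak = 0
  done = 0

  workers = Array.new(worker_count) do
    Thread.new do
      while jobs.pop
        lock.synchronize do
          running += 1
          peak = [peak, running].max
        end
        sleep 0.001 # stand-in for actually performing the job
        lock.synchronize do
          running -= 1
          done += 1
        end
      end
    end
  end
  workers.each(&:join)

  { done: done, peak: peak }
end

result = drain(job_count: 50, worker_count: 2)
puts "processed #{result[:done]} jobs, never more than #{result[:peak]} at once"
```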
00:12:48.360
but since everything is in a list or in a queue somewhere we can then also see anything that's
00:12:55.200
currently running right we can also see jobs that are waiting to run right or perhaps things
00:13:04.079
that have completed successfully but more importantly maybe even things that have failed and since we still have
00:13:10.200
references to the job class and the arguments that that job needed to run we
00:13:15.720
can theoretically retry that job whenever we need to maybe we introduced a bug in that worker
00:13:23.040
or in that job and we can make some tweaks push up that change and rerun it
00:13:28.980
later the main thing that this queue gives us is
00:13:35.519
stability and visibility
00:13:41.940
all right so we we had our our problem right which is the blocking requests we
00:13:47.760
kind of dug a little bit further to be like okay we we don't probably need to block the request for this case it seems
00:13:54.600
like from a software engineering perspective someone has come up with this idea of queues right so let's fill
00:14:02.100
our knowledge of what Rails might have for this so for those who may not be familiar
00:14:07.800
Rails has something called Active Job for managing and processing
00:14:12.839
queues and it really only requires three basic
00:14:19.079
steps to get up and running one we create our job class
00:14:27.240
two we tell the job class to essentially add itself to the queue
00:14:32.820
and then step three is Rails just takes care of the rest with an asterisk because it's not really
00:14:39.480
a step because Rails just handles it for you but let's go through each of those okay
00:14:45.180
so creating our job what does this look like let's get a little classy so we're going to create a new class
00:14:52.579
usually you're creating an ApplicationJob class that inherits from
00:14:58.680
ActiveJob::Base okay if you're familiar with Rails models it follows a similar pattern to
00:15:06.000
where you might have an ApplicationRecord that inherits from ActiveRecord::Base it's a similar thing right where I
00:15:13.560
could inherit from ActiveJob::Base directly I don't have to do this pattern but the pattern is already established
00:15:20.699
kind of from models and so they've followed that with Active Job
00:15:27.240
the second thing that I need to do is okay now I've got my ApplicationJob
00:15:32.279
class and I created my SomeJob class that inherits from that I need to create a perform method on that job
00:15:40.740
this perform method is what eventually gets called with my arguments when that job does run
00:15:47.399
and you can see from this example I'm just taking that message and passing it directly to the service
00:15:58.199
but sometimes you might want to tell it what queue to run on so this is also kind of something interesting you might have
00:16:03.959
more than one queue right or list and you can name them whatever you want you might have an
00:16:10.980
email queue or you could have a notifications queue you can have an
00:16:16.320
errors queue but by default Rails gives you a default queue in this particular
00:16:22.620
case I told Active Job I want to queue this job as my_queue_name
00:16:27.720
um so let's talk about actually enqueuing
00:16:33.120
that job so we have our class it's ready to go it's got that perform method it knows what queue or what list it's going
00:16:39.540
to run on now we need to tell it hey go add yourself to that list so that you
00:16:44.579
can actually run later
00:16:50.579
so really all we need to do is take our SomeJob it's going to have a class
00:16:56.579
method available on it called perform_later and then you give it whatever arguments you want to give to the
00:17:03.480
underlying instance of that class so these arguments may be serialized by
00:17:10.260
Rails we're going to kind of get a little bit more into that here in a second but under the hood again what's
00:17:17.280
happening is when this job does run so that queue is going to look like this the left is
00:17:24.000
going to have that SomeJob as that string class and on the right is going
00:17:29.280
to be an array of those arguments just like it was before it says hello RailsConf but when that job actually runs it's
00:17:35.940
going to do the exact same thing that I showed in that example it's going to constantize SomeJob it's going to
00:17:40.980
create a new instance and then it's going to call perform passing in that argument and it's going to look something like this
00:17:48.539
Rails will also automatically deserialize or re-transform the stuff back again we'll kind of get to that
00:17:54.720
here in a second but a quick note on enqueuing this is kind of where active job is like
00:18:01.320
really cool in how it operates not only can you add something to a queue but
00:18:07.860
you could tell it to wait to perform that job maybe I don't want to run that job right now maybe I want to run it in
00:18:14.520
24 hours like this is a you know a feedback email I don't want to give someone like hey give me feedback on my
00:18:20.580
product when you just signed up no I want them to give me feedback maybe 48 hours later so I'm going to tell this
00:18:26.820
job hey wait 48 hours and then send that feedback email or maybe I actually want this job to run
00:18:33.360
at midnight or at 3 a.m. because that's when the traffic is lowest so I
00:18:38.460
can say wait until you know 3 A.M and it automatically will just make sure that
00:18:44.580
job does not run until 3am I can also dynamically change that queue
00:18:49.919
say I have a default queue but I also have an urgent queue this person needs this email right now so I'm going to go
00:18:57.480
ahead and override that queue to say hey send this to the urgent queue because I
00:19:02.700
want that to run immediately you can also set the priority in the queue it all kind of
00:19:09.600
depends on how you have your queues or your lists set up but you can also more or less put something at the
00:19:15.059
beginning of the queue versus automatically putting it at the end of the queue
00:19:24.000
but let's talk about that hand wavy kind of Rails just takes care of the rest thing
00:19:29.940
you know like how does Rails actually process this job right I added it to that queue it's sitting there but how
00:19:36.780
does it actually run magic right Rails magic runs it
00:19:43.260
it depends BYOB this does not mean beer or
00:19:49.740
whatever the B is back end bring your own back end so by default Rails keeps these jobs
00:19:56.460
in memory so it'll go ahead and add it to your RAM and process it off of that RAM so you
00:20:03.419
technically don't need anything else to get jobs running in the background using
00:20:09.000
Active Job but this has one major flaw
00:20:14.220
it breaks down when the server restarts or crashes anyone use Heroku
00:20:21.000
yeah say goodbye to your jobs right every 24 hours those dynos will restart
00:20:27.620
any job that you had in that queue when that dyno restarted or was being processed when that dyno restarted is
00:20:33.900
gone forever so really what Active Job provides us is
00:20:41.340
a set of building blocks for enqueuing and managing
00:20:46.400
async data right with a set of defaults of course right you don't have to bring
00:20:51.900
anything else though you probably should it kind of gives us
00:20:56.940
the groundwork to base everything off of the foundation if you will
00:21:03.480
Okay cool so we we've discovered our problem right it's the blocking request
00:21:09.120
we know that we can potentially unblock it we can put it on this queue there's this thing
00:21:14.880
Active Job in Rails that we can use now let's talk a little bit more about Active Job
00:21:21.840
so first what are some of our available back ends that we have right like we don't want to lose our jobs when
00:21:29.220
the server restarts what can we possibly do well Rails actually has out-of-the-box
00:21:34.799
solutions for a number of different back ends and it's going to depend on the needs of your application and maybe what
00:21:40.679
you're familiar with I'm going to go ahead and put those up here Beanstalkd I have no idea what that is but
00:21:48.659
if you do you can use Backburner but if you're using like a database you can use something like Delayed Job or Que
00:21:55.980
or Queue Classic if you want to keep things in memory you can keep using the async
00:22:01.380
version of Active Job if you use Redis you can use Resque or Sidekiq which
00:22:06.960
is pretty popular you know and if you use RabbitMQ for sending messages back and forth there's
00:22:13.020
also out of the box support with Sneakers one thing I do want to highlight though
00:22:18.360
is this list is by no means complete the Rails team has decided that they're
00:22:23.760
no longer taking on management of new adapters because all these adapters are built into Rails itself and rather
00:22:30.240
than having Rails manage these adapters it should be up to the libraries to manage those instead so if somebody
00:22:37.440
might have gone to the talk yesterday that involved Kafka Kafka is not on this list so if you want to use Kafka as a
00:22:44.460
back end it's not officially supported through Rails but the library is going to probably have an Active Job adapter that
00:22:52.260
will work with Kafka so I just want to make sure that if you're going exploring the docs know
00:22:57.600
that there probably are more things available than what are explicitly listed
00:23:04.559
you might then be wondering yourself why not use the library directly oh we use Sidekiq we actually do use Sidekiq at
00:23:10.500
Weedmaps like why wouldn't I just use Sidekiq on its own why do I need Active
00:23:15.900
Job why add in that extra layer of complexity what does this give me
00:23:21.539
right well the first thing is global IDs
00:23:27.720
for those who aren't familiar with global IDs it is kind of like this unique URL that points to one of your
00:23:34.919
models a URL might look something like this right it starts with gid instead of http and then my-app is the name of my
00:23:42.539
app User is the user model and 420 is the ID of that user
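To show the three parts of such a URL, here is a plain-Ruby sketch that picks one apart with the standard `uri` library. `parse_gid` is a hypothetical helper and is not part of the globalid gem, and the app name `my-app` is an assumption based on the description above.

```ruby
require "uri"

# Hypothetical helper, for illustration only: split a gid:// URL into the
# app name, the model name, and the record ID.
def parse_gid(gid)
  uri = URI.parse(gid)
  model_name, id = uri.path.delete_prefix("/").split("/", 2)
  { app: uri.host, model: model_name, id: id }
end

p parse_gid("gid://my-app/User/420")
```

With those three pieces a job can find the right record again when it runs, which is why only this short string needs to be stored with the job rather than the whole object.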
00:23:48.360
so this allows us to do something like SomeJob.perform_later and we can just give that Active Record object directly
00:23:55.559
to that class we don't have to worry about sending down all these types of things whereas
00:24:02.340
before we might have had to say okay I need to give the user class and the ID of the user so that the job can look up
00:24:08.940
that user itself no the job doesn't need to worry about that we can use Global IDs don't worry everybody else has got
00:24:15.360
it you got something else that you want to
00:24:20.760
serialize that doesn't have a global ID no problem Rails has serializers for
00:24:26.039
that I stole this example from the Rails guides so if you do want to see
00:24:31.380
it it is there but in this example it's money right some people might be using a money
00:24:38.159
gem but say when I'm turning this money object into something that the job can
00:24:45.840
understand I want to say okay money I want you to grab that amount and
00:24:52.080
set it equal to amount in this hash and then I want you to set the currency equal to currency
00:24:57.539
and then when you actually take that hash and turn it back into something for the job I want you to
00:25:04.740
create a new money with amount and currency as the arguments
00:25:10.220
and then there's a third method that we have to define called serialize? that essentially says do we want to serialize
00:25:17.100
this class with Active Job if the argument is a money then yes otherwise not so
00:25:23.159
much this one was actually new to me and I
00:25:28.320
think this is probably the most underrated one and that is that you get automatic localization with your jobs
00:25:36.260
so let's pretend for example that our app is in English right and we are going
00:25:43.020
to run this particular job here in the middle in French say
00:25:49.080
our visitor is coming in from France they're looking at our site everything is translated into French and they clicked
00:25:55.740
a button now we're going to send them an email later well Active Job will actually keep track of the fact
00:26:01.260
that that was in French and it doesn't matter if that job runs five minutes later five hours or five days later it's
00:26:07.020
going to remember that and it's going to know hey by the way when you render that email render it in French
00:26:13.020
it's not going to call it le job right the French version of your job it
00:26:18.179
is going to just know that that job should be using the French language
00:26:23.940
but outside of that context you can just call perform later and it's going to use your default locale
00:26:32.279
callbacks right this is kind of a hairy topic if you like callbacks you don't
00:26:38.400
like callbacks that's not what I want to get into but if you do like them you can use them they work very very similarly
00:26:46.140
to Active Model callbacks in that we get before around and after hooks for
00:26:53.159
both enqueuing adding the job to the list and performing that job
00:26:58.860
so I don't know how easy it is to read that code example here but with before_enqueue I'm giving it a little block and
00:27:05.340
I'm saying hey I want you to log a message to the Rails console that says hey I'm going to run this class
00:27:11.460
but then I also have this around_perform method that captures the runtime
00:27:17.220
I want to know hey how long is it actually taking for this job to run how
00:27:22.620
long did it sit in the queue waiting to run I can get all that information and then also log that out you know I can
00:27:30.600
grab right here you can see I'm grabbing the job start time we actually do get a couple of
00:27:37.020
methods available to us by using Active Job one being this enqueued_at which will tell us a timestamp of when the job
00:27:43.679
was enqueued we also get a job_id method that we can use that is a unique ID for
00:27:48.720
that given job so just given those two things I can easily say hey job ID
00:27:55.919
took this many seconds to run and since I know when it was enqueued and when the
00:28:01.679
job started I can know how long it was delayed and can log that as well
00:28:10.799
Active Job also has built-in support for error handling which is also really
00:28:16.500
really cool so pretend again in this email example you had a user but that user account was deleted by the time
00:28:23.220
that that job ran so when Active Job tries to find that user it's going to be like oh hey
00:28:28.500
ActiveRecord::RecordNotFound and you can tell that job you know what if you can't find the record I
00:28:34.559
don't care discard it delete it move on but maybe you're working with an API and
00:28:41.400
that API has rate limits and you're like oh looks like I hit a rate limiting error I actually want to tell Active Job
00:28:47.700
hey if you hit this error retry it and it's going to keep retrying on some
00:28:53.220
kind of backoff where it's like oh it's going to retry it in a minute two minutes five minutes so on
00:28:58.320
and so forth but it will automatically handle all of that for you
00:29:03.960
if you want to have some custom logic around how you're handling a particular error that is known
00:29:10.100
you can call rescue_from so in this particular case maybe I'm also using
00:29:16.140
this API client but I got an authentication error my token is no longer good so I need to go get a new
00:29:21.240
token and store that so in this case I'm going to rescue from that authentication error I'm going to go get a new token
00:29:27.360
I'm going to store that and now I'm going to re-enqueue this job and hopefully it runs the second time or a third time
00:29:38.340
but the main reason to use Active Job rather than any other library is again
00:29:44.159
this idea that you have a unified API for interacting with jobs no matter the back end
00:29:50.399
right anyone that uses Active Job can go from one Rails app to another and it
00:29:55.980
doesn't matter if you're using Sidekiq or Delayed Job or Sneakers as your back end
00:30:01.860
you're familiar with Active Job that detail is abstracted away from you so you can just focus on writing code that
00:30:08.520
you are familiar with and leave the details for downstream services
00:30:17.100
so I'm a little over time so I'm going to give just some parting thoughts here where do you go from here
00:30:22.500
um so how is this related to making coffee right we might always need to grind our
00:30:29.220
beans and put them in the filter there are going to be synchronous things that we have to do and we cannot avoid that's
00:30:35.159
just a fact of writing software right but we can push the start button on some
00:30:40.799
of our coffee and come back to it later like there's no need to stick around there's definitely a lot of code I'm sure there's code you're thinking about
00:30:46.740
in your own code base that you're like we probably don't need to do that right there we can probably maybe do that some
00:30:53.940
other time or re-architect things to do them at some other time
00:30:59.640
and you know what if it doesn't complete we can always make a couple tweaks and try again you know if I forgot to put
00:31:05.340
some water in my coffee maker oops when I came back 15 minutes later it did not make the coffee so I can always put some
00:31:11.880
water back in hit the button come back 15 minutes later and there we go now I got my coffee same thing with the code
00:31:17.880
right let's say jobs failed oh no let's go ahead and make a couple changes push up those
00:31:24.240
changes rerun those failed jobs back in a good state
00:31:31.020
so what other things are you async processing in life right you find yourself standing next to the microwave
00:31:36.659
often just waiting right uh what about code
00:31:44.340
thank you for coming to this talk if you want to talk more on jobs I will be at our booth and
00:31:51.539
if you want to go more low-level I'm very happy to talk about that too thank you