00:00:09.110
whoo hello
00:00:23.550
God, this is a recording and I'm nervous. Hi everyone, and welcome to Aaron's
00:00:30.780
variety hour I'm not sure if this thing is actually gonna go an hour but I will do my best to make it there so that I
00:00:37.739
can live up to the name of variety hour and as I said it's a variety hour so
00:00:42.989
we're gonna talk about a variety of things I'm extremely honored this year
00:00:48.030
because this is the very first time I've been able to do the opening keynote at RailsConf. Every other year I
00:00:56.130
always give the final keynote of the conference, and this is the very first time I've been able to give the opening
00:01:02.190
keynote at RailsConf, though I do think it is kind of weird that if you go to
00:01:07.590
the website and scroll down, you'll see that I'm very last on the webpage
00:01:14.040
now I'm sure that this is just an oversight like a small oversight by the organising committee but you know I'm
00:01:21.869
sure we can get this like cleared up later but it's really great to be the very first opening presentation now
00:01:31.399
David usually does the opening keynote but since it's my turn this year I thought I should change my headshot to
00:01:37.740
something a little bit more appropriate so I took a new headshot photo and this is this is what I am going to start
00:01:44.520
using from now on when I give talks at places. The other thing is that since I
00:01:51.929
won't be able to see David's presentation ahead of my presentation and that's because and I can't stress
00:01:57.660
this enough I am the opening keynote this year since I won't be able to see his talk since
00:02:03.899
I'm opening I won't be able to make any jokes about it so I've decided that this
00:02:09.450
year what I'm gonna do is try to guess the things that he'll say and then see if I can make some jokes in advance so
00:02:16.980
the first thing I'm pretty sure he's gonna talk about: coming in 2020, I heard that Basecamp is going to be
00:02:23.730
starting a new email service called HEY, and I'm really excited about this. But I guess, from what I understand, they
00:02:30.840
plan to monetize the service by charging money for email and
00:02:36.110
I think that's really great but I don't understand why anyone would pay money for email people send me email all the time
00:02:43.670
just constantly and I I don't pay for any of it
00:02:48.770
so I'm not sure why people would pay for that but I wish him best of luck
00:02:55.540
the other thing I'm pretty sure is that most folks are doing a lot more remote
00:03:01.280
working these days and I think that's really great and I know that there's a
00:03:06.290
lot of people out there who were new to doing remote work and I'm also really
00:03:11.600
happy that David and Jason decided to release this book at such a steep discount they wrote a book about remote
00:03:18.140
working I haven't actually read the book yet but I'm going to I really plan on it I ordered a copy I'm gonna read about
00:03:25.070
this. And the reason I want to read it is because I too would really like to work from home, I mean I would really
00:03:31.520
like to work from David's home it looks very nice now as you've no doubt noticed
00:03:39.709
that RailsConf is not in Portland this year due to the outbreak of COVID-19 in the United States, and I'm
00:03:46.100
pretty sad about that I love going to Portland it's this is really too bad I
00:03:52.730
was looking forward to seeing everybody in Portland this year and it's so close to Seattle so I can just drive and I
00:03:59.120
don't have to like you know fly down there so I really like that the other
00:04:05.660
thing is that I think that I love that show Portlandia so I'm so happy if I
00:04:11.930
could visit Portland which the show is about but honestly one of the reasons I think that the show is really funny is
00:04:17.690
because Seattle where I live is so similar to Portland and the show could
00:04:23.960
really be about either place, but I guess Portland got stuck with it. So anyway
00:04:30.830
this year as I was saying we're doing the conference remotely this year and doing a pre-recorded talk is like a
00:04:37.430
really difficult thing for me to do I usually don't do pre-recorded talks it's
00:04:42.620
really important for me to get feedback from the audience. I really need feedback from the audience when I give a presentation, that way I can adjust my
00:04:49.580
presentation if I'm going too long or too short and also I get extremely
00:04:54.680
nervous when I'm giving a talk and the feedback from the crowd like helps me actually calm down when I'm on stage and
00:05:02.020
unfortunately like since I'm pre recording this I feel really nervous but
00:05:08.390
I decided I would try to deal with this by attempting to give a presentation ahead of time and record it in zoom
00:05:22.860
All right, you started already. You're muted. Sorry, can you hear me now? Can you hear
00:05:36.069
me? All right, sorry about that, let's try again. Hi everyone, welcome to Aaron's variety
00:05:43.930
hour, I'm really happy to be here. Aaron, I think you've got a color palette on your screen still. Oh really? I don't
00:05:51.759
see anything is there it's like one that's okay one sec one sec that is that okay oh yeah
00:06:01.029
thank you okay okay let's do this again okay hi everybody welcome to Aaron's
00:06:08.319
variety hour hopefully this will be an hour long I'm not I'm not sure can you
00:06:17.319
all mute if you're if you're not on like can you somebody's getting a phone call
00:06:23.860
can you mute please all right okay let me start over welcome to Aaron's variety hour I am so
00:06:29.800
happy to be here hopefully this will be an hour I've been trying to start okay I
00:06:41.740
will I will start oh sorry sorry it's okay it's okay all right everyone welcome there is
00:06:50.770
variety oh I'm sorry wrong meeting
00:06:56.080
okay hi everyone welcome to variety hour
00:07:01.850
this may be an hour but I'm not sure now
00:07:08.660
when that didn't work I decided that I would try to give the talk in front of my cats welcome to railsconf welcome to my very
00:07:15.980
first opening keynote at railsconf come here buddy welcome to railsconf come on
00:07:22.310
buddy now when that didn't work I decided that I would try to give a presentation on Animal Crossing
00:07:57.270
now since it was way too hard to give a presentation in Animal Crossing, I decided to just do the format that I'm
00:08:03.759
doing right now, which I guess is a late-night TV show host format, and I will just start adding laugh
00:08:09.520
tracks to make myself feel better. Now before I get into the stuff
00:08:15.999
that I actually want to talk about for this conference I want to tell you all about my my productivity strategy my
00:08:25.300
productivity strategy is procrastination so in order for me to focus on building
00:08:32.229
a presentation I need to put it off until I absolutely can't put it off anymore and the reason I do this is
00:08:40.510
because it gives me a hyper focus I have to focus on the task since I'm required to focus on it like I know that there is
00:08:47.079
a particular deadline I have to be up in front of an audience this helps me really focus on preparing my
00:08:53.829
presentations for people however this this strategy also comes
00:09:00.160
along with a lot of extreme guilt now as I'm procrastinating and I know I should
00:09:07.329
be working on my talk I know I should be making this presentation but I'm not so in order to relieve my guilt for
00:09:13.959
procrastination I have to fill my time with things that I deem to either be more important than the task at
00:09:21.070
hand, more important than the thing that I'm doing, or it has to be something that's more fun. Now I call this
00:09:31.079
procrastination project management, or PPM for short, and actually I take it
00:09:36.610
very very seriously in fact I meticulously plan out my procrastination
00:09:41.910
these projects are actually planned procrastination projects now I had
00:09:47.380
planned on procrastinating all the way until April, and then working on my presentation for
00:09:55.750
all of April until RailsConf came around. Now unfortunately, we got hit with
00:10:01.860
COVID-19 and RailsConf was cancelled, and when that happened, Evan texted me
00:10:08.130
and he asked me, would you be willing to record your presentation? And of course I said yes. So I asked him, well, when is the, you
00:10:16.890
know, I need a deadline for this, because that's how I work. I didn't tell
00:10:22.320
him that this is my productivity strategy but I need to have a deadline in order for me to procrastinate and he
00:10:29.430
said, oh yes, the dates will be the same as RailsConf, and I said, that is great. I knew internally, okay, the dates are the
00:10:36.030
same this means that my procrastination plan is on schedule things can move like
00:10:42.330
according to plan so unfortunately maybe
00:10:47.820
a couple weeks later I got an email from the RailsConf organizing committee
00:10:53.700
saying to me, oh well, actually, we want you to pre-record
00:10:58.950
this, but your videos are due by April 13th. And unfortunately this is three weeks earlier than I had planned, so this
00:11:07.590
really really put a lot of pressure on me and I decided that part of my
00:11:13.230
presentation, since now everything is pushed up so far, I thought, okay, well, part of my presentation is I'm going to
00:11:19.320
give a presentation of my planned procrastination projects. So before we actually dive into technical details of
00:11:27.900
my presentation I'd like to show you some of the stuff that I've been doing to basically put off doing my actual
00:11:35.310
work. So once I found out that we were gonna be doing RailsConf as an online
00:11:42.240
conference, it got me really worried. Like, I kept thinking to myself, okay, how am I
00:11:47.640
how am I gonna record the video now of course like I need to record it in 4k
00:11:53.660
and obviously an iPhone won't do so I am
00:11:59.040
gonna have to purchase a new camera that has 4k like you can shoot 4k video because it's very important that people
00:12:05.150
be able to see like you know how dry my skin is or see my face in excruciating
00:12:12.930
detail. So I decided to purchase a new camera. I got this camera, it's an OM-D camera, it's a very nice camera,
00:12:21.450
and then I read the manual for it and to be honest I haven't owned an SLR camera
00:12:26.710
in quite a while so I'm not up on the latest technology and I was incredibly surprised at how you actually have to
00:12:33.850
transfer photos from the camera onto, you know, your phone or
00:12:39.550
your computer or whatever. According to the manual, what you do is the camera
00:12:45.070
creates a Wi-Fi network, you connect to the Wi-Fi network with your phone, and then your
00:12:53.080
phone transfers the photos off of the camera via Wi-Fi so of course you have
00:13:00.310
to install some kind of like proprietary application on your phone some kind of
00:13:05.920
app to transfer the photos from the camera over to your phone. And this made
00:13:12.430
me think a couple things. First off, I thought: install a new app? In this
00:13:17.770
economy? I've got millions of apps on my phone, I can't afford to have another app on my phone. The other thing
00:13:25.960
that I thought is, can I hack this? There has to be a way for me to be able to hack my
00:13:34.030
camera I don't want to transfer all the photos off using some third-party application I'd rather have a program
00:13:41.260
that takes them directly off of the camera and imports them right into the photos application rather than using
00:13:47.800
some weird thing. So I decided that one of my procrastination projects, I
00:13:53.380
mean, since I bought the camera to film the video, I need to be able to get the photos or the
00:13:58.810
video off of the camera in order to show the video to everyone, I decided I needed
00:14:03.910
to hack my camera. And to get started, I decided to connect my computer to the camera via Wi-Fi rather than my cell
00:14:11.110
phone and when I did I noticed a couple of interesting things first the camera
00:14:16.960
always had the same IP address, so I thought that was very interesting. And when I ran a port scan
00:14:23.710
on the camera, I found that it also had an HTTP service open, and that was very
00:14:31.149
interesting now I also found that I could just run curl against the HTTP
00:14:37.449
port. It gave me an error, there was an error that showed in the terminal, but the traffic was all over
00:14:46.180
port 80, it was unencrypted, and if I could figure out what the actual endpoints were on
00:14:51.819
the camera, maybe I could write my own client for the camera. So I needed a way to figure out,
00:14:58.029
okay, what is the data going between the cell phone and the camera, what are they transferring, what is it doing? And
00:15:05.259
if I can figure that out, I can write my own client. So I wanted to
00:15:10.930
come up with a way to intercept the data that was being transferred between the camera and the phone so there are a couple things going for
00:15:18.309
me: the first thing was that we had a hard-coded IP address, so the camera
00:15:24.790
always had the same IP address, and that said to me that probably the application
00:15:29.860
that's accessing the camera is gonna have that IP address hard-coded in, so I can use this to my advantage.
00:15:36.779
the other thing is that since we have port 80 open and not 443 it's probably
00:15:42.639
going to be unencrypted traffic so the camera is going to be transmitting
00:15:47.759
unencrypted data now there is one crucial detail or one important thing is
00:15:54.670
that the point of this exercise is not actually to write a client for the
00:16:00.189
camera but to put off doing the work that I really need to do and that means
00:16:08.079
that I need to make this project worthwhile so that I didn't feel guilty
00:16:13.420
about it. And this gave me one really big constraint, and that is that I needed to either use Ruby or Rails to somehow
00:16:20.470
intercept and deal with this traffic so my idea for getting access to the data
00:16:28.629
that was being transferred between the phone and the camera was to essentially set up two computers that each had two
00:16:35.589
interfaces. So we would have two computers, each computer would have Ethernet and Wi-Fi. Now the
00:16:44.770
two computers would be connected to each other via Ethernet, and we'll connect the
00:16:51.700
cell phone to one of the computers via Wi-Fi, and that computer will act as a
00:16:58.570
proxy so this computer is going to be intercepting the traffic and recording
00:17:03.640
it then will connect the camera to the other computer via Wi-Fi and create an
00:17:11.290
SSH tunnel between the camera and the proxy server this way our proxy server
00:17:17.560
can access the network for the camera, so we can have the two same-IP
00:17:23.709
addresses with access to each other. So the secondary computer will set up an
00:17:29.680
SSH tunnel between the camera and our proxy server, and then the proxy server can access the camera through
00:17:35.890
a port on localhost so when the phone
00:17:41.170
sends data the proxy server will save a copy of the request and when the camera
00:17:46.960
responds the proxy server can save a copy as a response something like this so the phone sends a request we save a
00:17:54.250
copy of the request and then the camera sends a response we save a copy of the response and send that back to the phone
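The record-and-forward loop described here can be sketched in plain Ruby. This is a reconstruction, not the talk's actual WEBrick code: the `LoggingProxy` class name, the log format, and the use of `Net::HTTP` are my own choices.

```ruby
require "socket"
require "net/http"

# Sketch of the record-and-forward idea: pass each request through to the
# upstream (the camera, reached through the SSH tunnel) and keep a log
# entry for every request/response pair.
class LoggingProxy
  attr_reader :log

  def initialize(upstream_host, upstream_port)
    @host = upstream_host
    @port = upstream_port
    @log  = []   # recorded request/response pairs
  end

  # Forward a GET to the upstream and record the exchange.
  def get(path)
    res = Net::HTTP.get_response(@host, path, @port)
    @log << { method: "GET", path: path, status: res.code, body: res.body }
    res.body
  end

  # Forward a POST, logging the request body as well.
  def post(path, request_body)
    res = Net::HTTP.post(URI("http://#{@host}:#{@port}#{path}"), request_body)
    @log << { method: "POST", path: path, request_body: request_body,
              status: res.code, body: res.body }
    res.body
  end
end
```

The recorded `log` is the raw material for reverse-engineering the protocol afterwards.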
00:18:00.910
and if we record all of that data we can figure out what is the protocol that the phone and the camera are using to talk
00:18:07.570
to each other. Now, for completeness, this is the SSH tunnel
00:18:14.040
command that I used, and you can see here essentially what happens: we say we're
00:18:19.510
gonna forward port 80 on 192.168.0.10 to our Raspberry Pi (I'm using a
00:18:28.900
Raspberry Pi as my proxy server), and the proxy server
00:18:34.900
can access the camera on localhost via port 2210. So if
00:18:41.590
we connect to 2210 on localhost, that'll actually be forwarded over to the camera.
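The tunnel he describes might look like this as a command. This is a reconstruction from the description, not the slide's exact command; the `pi@raspberrypi.local` user and host are assumptions:

```shell
# Run on the secondary computer (the one joined to the camera's Wi-Fi).
# -R opens port 2210 on the Raspberry Pi and forwards it, through this
# machine, to port 80 on the camera at 192.168.0.10; -N means no shell.
ssh -N -R 2210:192.168.0.10:80 pi@raspberrypi.local
```

After this, the proxy on the Pi can reach the camera as `localhost:2210` even though both Wi-Fi networks use the same hard-coded IP.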
00:18:47.130
then in order to make this project worthwhile and make myself feel guilt free I wrote
00:18:54.370
the proxy server in Ruby, and essentially what it is is just a proxy server
00:19:01.270
written with WEBrick. And this is the code that handles the GET requests for the proxy server: essentially
00:19:11.200
all it does is access the SSH tunnel here. So when a request comes in, we,
00:19:16.330
the proxy server, create a new connection to localhost on port 2210,
00:19:22.210
which is actually a connection to the camera. Then we replay
00:19:28.810
the request that we got from the phone, then we get the response body that came from the camera, and then we send the
00:19:37.270
response body back to the phone, logging all this data along the
00:19:43.960
way. And then we have to handle POSTs as well; this is the code for handling POSTs. It's almost exactly the same as the code
00:19:50.560
for handling GETs, except that we need to log POST bodies as well, so it was very
00:19:55.930
similar. Now the next thing that I did was that I used the app on the phone to
00:20:01.930
collect data. Now of course, yes, this did mean that I had to install the
00:20:07.450
application on my phone. But I installed the application on my phone so
00:20:12.850
that you don't have to: you too can buy this camera and not have to install the application, I did it for you. And I am
00:20:21.310
publishing this data so you won't need to do this hopefully anyway using the
00:20:26.350
data that we were able to collect that we intercepted on the proxy server it's
00:20:31.810
pretty trivial to write our own client and this is the client that I wrote for
00:20:38.310
getting images and movies off the camera. Basically it just has two parts: the top part there just
00:20:45.430
gets a list of all the files that are stored on the camera, and
00:20:51.520
the second part downloads all those images. And of course I added a little bit of intelligence in
00:20:58.390
there so that if we've already downloaded the image, it won't try again. So we did it, our camera's hacked.
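The two-part client could be sketched like this. The endpoint paths and the one-filename-per-line list format are assumptions for illustration (the real protocol came from the intercepted traffic), and the `fetch` callable is injected so the sketch stays self-contained:

```ruby
require "fileutils"

# Sketch of the two-part client: list the files on the camera, then
# download the ones we don't already have. LIST_PATH and DIR_PATH are
# hypothetical endpoint names, not the camera's documented API.
class CameraClient
  LIST_PATH = "/get_imglist.cgi"
  DIR_PATH  = "/DCIM/100OLYMP"

  def initialize(dest_dir, fetch:)
    @dest  = dest_dir
    @fetch = fetch   # callable: path -> response body (e.g. HTTP over the tunnel)
  end

  # Part one: get the list of files stored on the camera
  # (assumed here to be one filename per line).
  def list_files
    @fetch.call("#{LIST_PATH}?DIR=#{DIR_PATH}").lines.map(&:strip).reject(&:empty?)
  end

  # Part two: download everything, skipping files we already have.
  # Returns the names that were newly downloaded.
  def download_all
    FileUtils.mkdir_p(@dest)
    list_files.select do |name|
      local = File.join(@dest, name)
      next false if File.exist?(local)   # already downloaded, don't try again
      File.binwrite(local, @fetch.call("#{DIR_PATH}/#{name}"))
      true
    end
  end
end
```

Running `download_all` a second time is a no-op, which is the "little bit of intelligence" being described.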
00:21:05.309
That is great. Of course, this is clearly not enough procrastination; this only
00:21:10.500
took a couple of days and I just after completing this project I just didn't feel enough pressure to work on the
00:21:17.010
presentation so the next thing I had to do is I wanted to make a virtual conference inside of Animal Crossing and
00:21:23.610
we saw the clip earlier today. But that clip actually cost me a lot of time, I had to grind a lot for Bells
00:21:30.030
so that I could buy all the stuff to lay out and display for everybody. But then
00:21:35.580
of course I needed a way to record the video that came off of my
00:21:41.610
Switch so that I could give the presentation in Animal Crossing. So I learned about that, and I bought a
00:21:47.250
video capture card so that I could record all the data from the Switch, and I had to figure out the right software
00:21:53.190
to use to record it, which was extremely hard as well. And now I also needed to get a tripod for the camera, but it was
00:22:01.410
very difficult to get a tripod because it was very COVID outside. But the good news is that now that I've done all this
00:22:07.679
yak shaving, I'm ready to become a Twitch gaming streamer. This has all prepared
00:22:12.750
me for my next big career, which is to launch myself into the
00:22:18.270
great world of online game streaming, so when I decide to retire from programming
00:22:24.360
I can just play Animal Crossing online all day. The next project that I did, though: I still didn't feel enough
00:22:31.440
pressure even though I decided to start a new career in streaming games I
00:22:36.809
decided to do something which I call the A-A-Ron project. After hacking my camera,
00:22:43.500
grinding for Bells, dealing with capture cards, tripods, and starting my new career as an online
00:22:48.929
streamer, since that wasn't enough, I decided to design and build a sign to let me know every time I had
00:22:56.670
messed up. And this is where I came up with the A-A-Ron sign: essentially I built a
00:23:04.200
sign that sits in my office above my desk. This is the sign. Then I
00:23:09.360
designed a PCB, this is the PCB. This PCB controls the sign, and the PCB has a
00:23:17.460
serial UART connection so I can hook this PCB straight into my computer and
00:23:22.860
have my computer control the sign now I've wired it up so that every time I execute a command in my terminal if it
00:23:28.890
fails, or if any of my tests fail, the sign will display "A-A-RON". So right here
00:23:34.380
is the serial port on the PCB that I use to control it and this sign will let me
00:23:40.650
know anytime I've messed up
00:23:58.760
Moving on, our main story tonight concerns where clauses in your Active
00:24:04.980
Record queries. When you need to offload complexity to the database, they're there for you. We're going to
00:24:12.390
talk about how where clauses work in Active Record, some features that you might not know about, and then we're
00:24:17.789
gonna talk about profiling and speeding up where clauses. Now when I say profiling, I don't mean
00:24:25.110
profiling your application, I mean profiling Rails itself. Certainly these same profiling techniques can
00:24:32.460
be used to profile your application, but today we're gonna look at how to speed up Rails itself. Now this story started a
00:24:42.600
while back we have an internal blog at work and somebody posted on the internal
00:24:48.419
blog like, hey, check out this life hack. Basically the blog
00:24:54.900
post just said, sanitizing IDs yourself is much faster than letting Active Record sanitize the IDs for you.
00:25:01.049
So if you have a large list of IDs and you need to do an IN statement, doing the
00:25:06.450
sanitize_sql is way faster than asking the where clause to do it. And I
00:25:11.520
read this and at first I just couldn't believe it and the next thing I thought
00:25:16.740
is well if it's true why would that be true these two things do approximately
00:25:21.750
the same thing so why would you want to use the harder version when you can just
00:25:27.570
use the easy version so in order to figure out what was going on the first
00:25:32.880
thing I did was I wrote a very simple benchmark now this benchmark just compares the two different strategies
00:25:39.059
using this great benchmark-ips gem. I really like this gem because all it does
00:25:45.000
is run the same code over and over again for five seconds, and whichever chunk of code can run more times in five seconds,
00:25:51.809
that one is the faster version. And this is really nice because I don't have to think about how many times I should run
00:25:57.570
this; it just does it for me and sets a deadline, and then however many times it executed within that deadline is
00:26:04.280
our benchmark. Now what this means is: the more times you run, the faster your
00:26:10.760
code is, so we want a higher iterations per second. Our faster code
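The deadline-and-count idea that benchmark-ips implements can be sketched with just the standard library (this is a toy, not the gem itself; the method name and window length are mine):

```ruby
# Minimal sketch of the benchmark-ips idea: run a block as many times as
# possible before a deadline, then report iterations per second. A higher
# number means faster code.
def iterations_per_second(seconds: 0.2)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + seconds
  count = 0
  while Process.clock_gettime(Process::CLOCK_MONOTONIC) < deadline
    yield          # the code under test
    count += 1
  end
  count / seconds
end

# Comparing two blocks: whichever runs more times in the window is faster.
fast_ips = iterations_per_second { 1 + 1 }
slow_ips = iterations_per_second { sleep 0.001 }
```

The real gem adds warmup runs and statistical reporting on top of this same loop.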
00:26:16.040
will have higher iterations per second now if we look at the results for this benchmark the results are incredibly
00:26:22.310
surprising: actually sanitizing the IDs yourself was about 74 percent faster
00:26:28.910
than calling the easier API, and this really perplexed me. Why
00:26:35.510
would this be the case? These two things were doing almost exactly the same thing, so how can the performance be so
00:26:43.220
dramatically different now before we dig into this issue I want to talk about
00:26:49.970
using where with Active Record. So we're gonna get a little bit back to basics here; let's take a look at
00:26:57.340
calling the where method, what it actually does, and the SQL query
00:27:03.020
that it generates. So with Active Record, the where method is simply used to
00:27:08.840
manipulate the WHERE clause that's generated in your SQL statements. We use this method anytime we want to add
00:27:14.630
constraints on the query we're making, and these calls are additive: if we call where multiple times, it
00:27:22.460
will add multiple things to your WHERE clause, so we can keep chaining wheres and saying we want to add this and
00:27:29.150
this and this and this. So from a high-level perspective, we're translating
00:27:29.150
Ruby calls into SQL; we're essentially taking the Ruby language and converting it into the SQL language, and there
00:27:41.360
must be some process for converting Ruby code into SQL and executing it.
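A toy model of that translation process (not Active Record itself) might look like this: each chained `where` returns a new relation carrying one more predicate, and `to_sql` does the Ruby-to-SQL conversion by ANDing them together.

```ruby
# Toy model of additive where chaining. Each `where` call is additive:
# it returns a new relation with one more predicate, and to_sql builds
# the WHERE clause by joining the predicates with AND.
class ToyRelation
  def initialize(table, predicates = [])
    @table = table
    @predicates = predicates
  end

  def where(fragment)
    ToyRelation.new(@table, @predicates + [fragment])
  end

  def to_sql
    sql = "SELECT * FROM #{@table}"
    sql << " WHERE " << @predicates.join(" AND ") unless @predicates.empty?
    sql
  end
end
```

The real version goes through Arel rather than string concatenation, but the additive shape is the same.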
00:27:47.300
If we look at our benchmark again, we'll see that the faster version is doing some of that translation to SQL
00:27:53.420
manually. So we can kind of tell intuitively that the speed difference must come from the way that the slower
00:28:00.830
example does SQL generation: something about the automatic SQL generation is slower than doing it
00:28:07.700
manually. Another interesting thing is that if we look at this particular
00:28:13.429
example in the console, this isn't actually the SQL that gets generated. So we see the SQL that we think
00:28:20.179
should be generated, but when we look at the console, that's not the SQL that's actually
00:28:25.220
generated. What's actually generated: you'll see the same SQL statements,
00:28:30.590
but they'll have a LIMIT 11 attached to them, and this is because Active Record
00:28:37.039
is lazy about generating the query. So in fact, the inspect method on
00:28:44.690
Active Record will add this LIMIT 11 to
00:28:50.269
the query before it's executed. So if we want the where statement to
00:28:56.629
generate the SQL that we would expect, we need to actually proactively load them. Now if we look at these three
00:29:05.119
calls, we'll notice a difference between the behaviors: for the first one, when we eagerly load, when we call to_a, no limit
00:29:12.289
is added. If we call just .all without inspecting it, nothing gets executed, and
00:29:18.289
nothing's executed because we didn't access any of the records and we didn't
00:29:23.570
do anything with it, so nothing is executed. Now the third one adds LIMIT 11, and that's because we called the
00:29:30.470
inspect method on it. And remember that the console will call inspect for us, or calling p will automatically call
00:29:37.850
inspect. Now the important thing to learn from this is we actually wrote our
00:29:42.980
benchmark incorrectly. If we look at our benchmark, we'll see that we're just calling where, and we're not actually
00:29:49.850
loading any records, so neither of these things will actually query the database.
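A toy model of that laziness (again, not Active Record itself): building the relation runs nothing, `to_a` runs the query as built, and `inspect`, which the console calls for display, runs it with LIMIT 11 added.

```ruby
# Toy sketch of Active Record's lazy query execution. The executor
# callable stands in for the database so we can observe what runs.
class LazyRelation
  def initialize(sql, executor)
    @sql = sql
    @executor = executor
  end

  # Eager load: the query runs exactly as built.
  def to_a
    @executor.call(@sql)
  end

  # Console display: the query runs with LIMIT 11 tacked on,
  # which is why the console shows a different statement.
  def inspect
    @executor.call("#{@sql} LIMIT 11")
    "#<LazyRelation #{@sql}>"
  end
end
```

Just constructing a `LazyRelation` never calls the executor, which is the benchmark bug being described: neither branch ever touched the database.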
00:29:55.629
so the only thing that we've proven is that we're able to do nothing faster
00:30:01.460
when we call sanitize_sql than if we were just calling where. So let's fix the
00:30:06.799
benchmark. Here we've fixed the benchmark to generate a thousand IDs and then try and query for those
00:30:12.889
thousand IDs, and it actually hits the database. And if we rerun this benchmark, we'll see
00:30:20.090
that, in fact, before we fixed the
00:30:25.220
benchmark, the sanitize_sql statement was 70%
00:30:31.039
faster; now that we fixed the benchmark, we can see that it's only 60% faster. So we've
00:30:38.659
sped up our program, but rather than being 70% faster it's now
00:30:43.999
60% faster. So I guess that's better, but ideally these two things should be
00:30:50.749
identical. So let's actually profile this code and try and
00:30:56.389
figure out what's going on. For these two pieces of code doing most of the same thing, being 60%
00:31:04.580
slower is just too slow; it's in the threshold of something that I would consider to be a bug. So let's try to
00:31:11.090
figure out what the bottleneck of this code is and to figure that out I like to use a tool called stackprof for
00:31:17.029
profiling. Now this is an example usage of stackprof; this is using stackprof to profile these two benchmark
00:31:25.129
pieces of code. It looks very similar to the benchmark-ips code we
00:31:31.729
saw earlier, only this time we're just getting profiling information. Now what we're doing here is we're
00:31:39.169
generating two profiles: one is with our slow case and one is with our
00:31:44.509
fast case. In this particular case I'd like to generate two profiles because I want to compare
00:31:50.779
them, and it can be helpful to compare the two. So if we run this, we'll end up with two profile files, one for the
00:31:58.220
fast case and one for the slow case. And if we run stackprof against the where
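Generating the two dumps might look something like this. It requires the stackprof gem and a Rails app, so it is an illustrative sketch only; `Post`, the ID list, and the dump file names are stand-ins for whatever the talk's benchmark used:

```ruby
require "stackprof"   # gem install stackprof

ids = (1..1000).to_a  # stand-in ID list

# Slow case: let the where clause sanitize the IDs.
StackProf.run(mode: :wall, out: "where.dump") do
  100.times { Post.where(id: ids).load }
end

# Fast case: sanitize the IDs manually first.
StackProf.run(mode: :wall, out: "sanitize.dump") do
  100.times { Post.where(Post.sanitize_sql(["id IN (?)", ids])).load }
end
```

Each dump can then be inspected with the `stackprof` command-line viewer.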
00:32:03.409
profile, our slow case, and we look at the top six stack frames, we'll see that approximately 14% of our time is being
00:32:10.580
spent querying the database. Now, querying the database, normally
00:32:16.549
this might be our target for optimization. So if we were just doing profiling on a web app, say for
00:32:22.190
example, maybe we see that we're spending 14% of our time in one particular method;
00:32:27.590
well, we would probably want to target that method. Now let's take a look at the profile with manual sanitization, our fast
00:32:35.090
case. The top stack frames for the fast case, well, they look almost identical to the frames
00:32:41.600
we saw in the slow case. Now this means that if we were to go in and speed up
00:32:46.730
that query method, we would actually be speeding up both cases, and that's not
00:32:52.250
what we want. We want to figure out why one case is faster than the other, and can we get those two bits of code
00:32:59.120
to be the same speed. So speeding up the query method in this case won't actually
00:33:04.580
help us achieve our goal. Now, as we see, these
00:33:11.419
profiles aren't really helping us so much, because we don't
00:33:17.419
know what the difference is. Looking at the two, they're essentially the same, so why is one faster than the other? We haven't been able to answer that
00:33:23.480
question. Now these profiles, when we're reading them, they're telling us how much
00:33:29.539
time we're spending in any one particular function call, and we call this self. So for that one call,
00:33:37.100
we're saying how much time do we spend in ourselves, versus ourselves plus the other functions that
00:33:43.730
we call. Well, stackprof also measures the amount of time that one
00:33:50.870
particular function takes plus all of its children as well, so we call that self plus children. So far we've
00:33:59.539
looked at what the slowest functions are, but that only
00:34:04.789
tells us how much time a particular function takes; it doesn't tell us how much time that function takes plus all
00:34:11.359
the functions it calls now where this is important is let's say let's take a look
00:34:16.760
at an example. In this particular case, if we profiled this code, the profiler would tell us: the
00:34:23.659
method named slow is the slowest and you should speed that up. And it's true, that method is the slowest, and
00:34:30.649
this slow_caller method wouldn't even show up in profiles, because we don't spend any time in that method. So the
00:34:37.760
profiler would say, hey, you need to speed up the method named slow. Now, it would be even better, though,
00:34:46.820
if we could eliminate the slow_caller method; maybe we don't need to call slow_caller so much, maybe we
00:34:53.149
can just eliminate that method altogether. But we can't see the slow_caller method in our profiler output, so what we need to do
00:35:00.330
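To make the self-versus-total distinction concrete, here is a toy sketch in plain Ruby of how a sampling profiler accounts for frames (an illustration only, not StackProf's actual implementation):

```ruby
# Toy version of what a sampling profiler computes. Each sample is a stack:
# the innermost frame gets "self" credit, and every frame on the stack gets
# "total" (self plus children) credit.
samples = [
  [:main, :slow_caller, :slow],
  [:main, :slow_caller, :slow],
  [:main, :other],
]

self_count  = Hash.new(0)
total_count = Hash.new(0)

samples.each do |stack|
  self_count[stack.last] += 1                       # innermost frame only
  stack.uniq.each { |frame| total_count[frame] += 1 }  # every frame on stack
end

self_count[:slow_caller]   # => 0, never the innermost frame
total_count[:slow_caller]  # => 2, on the stack whenever :slow runs
```

slow_caller never appears as the innermost frame, so its self time is zero, even though every sample that hits slow also counts toward slow_caller's total.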
is say: StackProf, I want you to sort by the total amount of time that is spent inside of a function. So we say, hey, sort by total:
00:35:08.390
we can run stackprof and pass in the sort-total flag, and this will sort
00:35:14.880
functions by how much time is spent in that function plus any functions it calls. So if we sort by that, we'll get a
00:35:23.310
profile for our slow case that looks something like this and this is actually
00:35:28.440
much more interesting than the previous profile we looked at and the reason is if we look through it we can see where
00:35:34.349
we're spending all of our time. Well, we're spending approximately 28% or 29% inside of this compile method
00:35:44.760
inside of the ToSql visitor. This is actually very interesting: why are we spending nearly 30% of our time compiling SQL? Now, if we
00:35:52.290
compare this profile to the manual sanitization profile, that Arel ToSql
00:35:58.230
visitor doesn't even show up at all. So here we've found our difference between the two.
00:36:03.839
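That kind of comparison can be mechanized. Here is a toy sketch (with invented numbers that match the talk, not real profiler output) of diffing two profiles' total-time percentages to find the frame that differs most:

```ruby
# Toy profile diff: frame name => total-time percentage. The numbers here are
# made up to mirror the talk; in the fast profile the compiler barely appears.
slow_profile = { "Arel ToSql compile" => 29.0, "query" => 14.0 }
fast_profile = { "query" => 14.0 }

biggest_difference =
  (slow_profile.keys | fast_profile.keys)
    .map { |frame| [frame, slow_profile.fetch(frame, 0.0) - fast_profile.fetch(frame, 0.0)] }
    .max_by { |_frame, delta| delta }

biggest_difference  # => ["Arel ToSql compile", 29.0]
```

The frames the two profiles share cancel out, and the real target pops to the top.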
so in our slow case we're spending nearly 30% of our time in this compiler, where in our fast case we don't even see
00:36:11.430
that compiler in the profile output. It's not that we spend no time in
00:36:17.040
the compiler in the fast case, it's just that we don't spend much time there. So we can tell
00:36:24.000
from comparing these two profiles that our actual target is this ToSql visitor, or this ToSql
00:36:30.150
compile method. So we need to figure out why this method is taking up so much of
00:36:35.880
our time. Maybe we can speed up that method, but I think it's better to step
00:36:42.390
back a little bit and ask: what exactly is this method doing? So let's take a look at some Active Record
00:36:48.960
design to understand what that compile method is doing. I want to look at a little bit of Active Record's design,
00:36:55.290
specifically at SQL generation. SQL generation in Active
00:37:00.990
Record involves a few steps. Our first step is generating
00:37:06.960
relation objects. Now, every time we call any type of query-builder method in Active Record,
00:37:12.690
we're generating, creating a new relation object. So each time we call, say, where or order or joins or any
00:37:21.599
of these types of query-builder methods, we're creating a new relation object, and that relation object contains
00:37:27.979
information about constraints, things that we want to add to the SQL statement. So in this case we're saying,
00:37:34.589
hey, we want to find cats that are age 10 named Gorby, and we
00:37:40.170
want you to order them by ID, and each of these relation objects stores that particular information. The nice thing
00:37:46.499
about this design is that we can reuse these relation objects so we can say
00:37:51.719
okay I want you to find all cats that are age 10 and I want you to order them by ID and then we can actually split and
00:37:59.279
say, well, in this case I want you to find cats named Gorby, and in this case I want you to
00:38:04.410
find cats named SeaTac but we're able to reuse part of the query between both of
00:38:09.719
those two. Now, this gives us a nice ability to reuse relations and to refactor our designs.
00:38:16.920
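As a rough sketch of how that chaining and reuse can work, here is a toy in plain Ruby (not Active Record's actual relation class):

```ruby
# Toy relation object (not ActiveRecord::Relation): each query-builder call
# returns a NEW immutable relation carrying the accumulated constraints.
Relation = Struct.new(:wheres, :orders) do
  def where(conditions)
    Relation.new(wheres.merge(conditions), orders)
  end

  def order(column)
    Relation.new(wheres, orders + [column])
  end
end

base   = Relation.new({}, []).where(age: 10).order(:id)
gorby  = base.where(name: "Gorby")    # branch one reuses base
seatac = base.where(name: "SeaTac")   # branch two reuses base

gorby.wheres   # => { age: 10, name: "Gorby" }
seatac.wheres  # => { age: 10, name: "SeaTac" }
base.wheres    # => { age: 10 }, the shared part is untouched
```

Because each call returns a new object instead of mutating the old one, the shared base relation can safely branch into the Gorby and SeaTac queries.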
So, for our next step, Active Record actually takes
00:38:24.059
these relation objects and converts them into an AST. What an AST is,
00:38:29.339
is simply a tree data structure that represents the SQL query that we want to execute. So if we were to imagine the
00:38:36.029
object relationships of this AST, it would look something like this. This is what
00:38:41.700
the data structure might actually look like: it's a tree of objects. Say we have a statement object; it points at a
00:38:47.190
fields object, and the fields object points at a list of fields it wants to query on. This tree structure represents
00:38:54.599
the actual string below it. Now, Active Record will go through, look at each of
00:39:00.539
those relation objects, and convert them into an AST that represents the SQL statement that we're going to execute.
00:39:07.049
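Here is a toy sketch of that tree of objects in plain Ruby (hypothetical node names, not Arel's real classes), along with a tiny emitter for the Graphviz dot format that comes up in a moment:

```ruby
# Toy AST node (NOT Arel's real node classes): a name plus child nodes.
Node = Struct.new(:name, :children)

tree = Node.new("SelectStatement", [
  Node.new("Fields", [Node.new("name", []), Node.new("age", [])]),
  Node.new("Order",  [Node.new("id", [])]),
])

# Walk the tree and collect parent -> child edges.
def edges(node, acc = [])
  node.children.each do |child|
    acc << [node.name, child.name]
    edges(child, acc)
  end
  acc
end

dot = "digraph ast {\n" +
      edges(tree).map { |from, to| %(  "#{from}" -> "#{to}";) }.join("\n") +
      "\n}"
# Running this string through `dot -Tpng` would render the tree as a picture.
```

This is the same shape of output a dot visitor produces: one edge per parent/child pair in the tree.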
Maybe the tree will look something like this. Now, we can actually view these, or
00:39:15.210
get access to this AST, by calling some extremely internal APIs. Now,
00:39:20.670
these APIs are very internal, and they may break at some point in the future, but they are very handy when
00:39:26.190
you're trying to debug stuff, especially these kinds of performance issues. If we call .arel.ast on the relation, we
00:39:34.020
can actually get access to that AST object. Calling this will give us the top of that tree, and we can traverse
00:39:40.589
through the tree and check out the actual tree that represents our SQL statement. Or we can call to_dot, and this
00:39:48.180
prints out our tree in Graphviz dot format. Now, Graphviz dot format looks
00:39:54.000
like this: this is the dot output. It's actually a bit longer, this is
00:39:59.369
the whole thing, and of course it depends on the relation that you've created and the SQL statement that you're trying
00:40:04.470
to execute. Now, this isn't very helpful by itself, but if we take this dot format and run
00:40:10.740
it through the dot command-line tool, the Graphviz command-line tool, we can generate a PNG file that shows us the
00:40:16.710
actual structure of the SQL statement. So in this case we're querying for
00:40:21.990
cats, and the output of Graphviz will look like this. So this is the actual structure of the SQL statement that
00:40:29.609
we're going to be executing. If we zoom in, we can see the column names, so we have name and age, and we can also see the
00:40:38.849
table that we're executing against, as well as the columns and ordering information that we want to
00:40:45.480
query against as well. Now, our final step before sending the query to the database
00:40:51.630
is actually generating that SQL statement, that string. And to do
00:40:57.390
that, we're just taking this AST, this tree data structure, and we're gonna walk the tree and convert the tree into a
00:41:04.680
literal string that we'll send to the database. So we have a bit of code that's called a visitor; it's using what is
00:41:11.010
called the visitor pattern, and the ToSql visitor will walk through every one of these nodes in the tree, converting each
00:41:17.280
node into a string, each node into a bit of the SQL statement, and concatenating
00:41:23.730
all of them together, and then we send that string to the database.
00:41:29.059
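A minimal sketch of that visitor pattern in plain Ruby (toy node classes, no quoting or escaping, so not Arel's real ToSql visitor):

```ruby
# Toy AST nodes (hypothetical, not Arel's classes).
module Nodes
  In     = Struct.new(:column, :values)  # column IN (values)
  Select = Struct.new(:table, :where)    # SELECT * FROM table WHERE ...
end

# The visitor dispatches on the node's class, recursing into children and
# concatenating the SQL fragments each node contributes.
class ToSqlVisitor
  def visit(node)
    case node
    when Nodes::Select
      "SELECT * FROM #{node.table} WHERE #{visit(node.where)}"
    when Nodes::In
      "#{node.column} IN (#{node.values.join(', ')})"
    end
  end
end

ast = Nodes::Select.new("cats", Nodes::In.new("id", [1, 2, 3]))
ToSqlVisitor.new.visit(ast)  # => "SELECT * FROM cats WHERE id IN (1, 2, 3)"
```

Note that the visitor has to touch every node, so a tree with a thousand ID nodes means a thousand visits; that is exactly the cost that shows up in the slow profile.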
So this particular visitor here is generating a SQL statement, where what we saw in the
00:41:36.029
previous slide was actually a Graphviz visitor that walks the tree and outputs the tree to a dot file that
00:41:43.410
we can use with Graphviz. Now this is very interesting, because when we were
00:41:48.930
looking at our profiles earlier, our target was the Arel ToSql visitor; it is exactly that
00:41:55.109
code that's walking through this tree. Now, we know from our profiles that this is the bottleneck, and that's the
00:42:02.430
thing that we need to figure out: why is it taking so long? We know that the difference in performance is something
00:42:08.009
to do with the difference in the ASTs that are being generated, so let's compare the AST for each of these
00:42:13.859
queries. So we'll generate dot files for our case that uses sanitize_sql, our
00:42:20.279
fast case, and we'll generate a dot file for our case that doesn't use sanitize_sql, and we'll compare the two. So this
00:42:27.599
here is the output for the AST, the AST for the fast case; this is
00:42:33.449
where we use sanitize_sql. Now let's compare that to the AST for our slow
00:42:39.509
case, where we're not using sanitize_sql. This is what it looks like, and we can see here that this auto-sanitized
00:42:45.509
version is actually much different from our sanitize_sql version. If we add
00:42:50.819
more IDs to the auto-sanitized version, then this array will
00:42:57.209
keep growing and growing in size, and we know that it will have more and more nodes to process as we're
00:43:03.809
walking this tree and converting it into SQL. And this
00:43:09.269
actually coincides with the performance that we're seeing. If
00:43:14.789
we compare the performance between the manual sanitization and the auto sanitization,
00:43:21.209
we can see that our auto sanitization is spending about 29
00:43:26.609
percent of our time generating a SQL statement, where our manual sanitization is only spending 0.6% of our time. So we can
00:43:33.959
see that there's a huge difference between these two, but we can also see that there's a huge difference between the ASTs. So these numbers correlate; it
00:43:42.779
makes sense: we have a large AST, it takes a
00:43:49.509
long time to generate that query. So how can we fix this? Well, we
00:43:56.799
have a few options. We could speed up the compiler, that's one option, but again,
00:44:03.279
this would actually speed up both the fast and the slow cases, so our fast case would get faster, our slow case would get
00:44:09.910
faster too, but that's not what we want. We actually want to have our slow case be the same speed as our fast case. So
00:44:17.380
maybe instead we could do less work; our faster example is doing much less work,
00:44:23.109
so maybe there's a way that we could reduce the amount of work that our slow case is doing. Maybe there's a way
00:44:32.289
that we could take this where call and have it just call sanitize_sql rather than going through the entire
00:44:40.209
compiler. Now, if we trace through these profiling frames,
00:44:46.449
we'll actually find the point where these relations are being
00:44:52.029
converted into IDs, into an AST. This is the code that converts those relations into a list of IDs in
00:45:00.609
the AST, and we can see here that essentially what we're doing is creating an In node, and this In node
00:45:05.679
points at all the IDs that we're going to be using in the AST. It would be nice if, instead of
00:45:12.819
generating so many nodes, we could just generate a sanitized SQL part right here, just call sanitize_sql, and we
00:45:19.569
don't generate all those nodes to walk. Of course, it'll require a little bit of
00:45:24.609
refactoring to get it to work but I have to say like when I encounter some code
00:45:31.269
that I need to refactor, I always ask myself the same three questions: can I
00:45:37.569
refactor this code should I refactor this code and do I want my name on the
00:45:45.670
blame if I answer yes to all three of these questions then I'll refactor this
00:45:51.130
code otherwise I need a different strategy
00:46:07.630
Eileen and I worked on optimizing this code for about three months, and it seems like it should be a fairly
00:46:13.579
straightforward optimization, but unfortunately it wasn't, and that's because there are some subtle
00:46:18.770
differences between the behaviors of the two code examples. The first issue is
00:46:24.589
that we can't use the parameters hash in controllers with manual sanitization in the same way that
00:46:30.589
we would with auto sanitization. So, for example, here we have params that may have come in from the controller, and we
00:46:36.589
pass those parameters into Active Record. Now, if we look at the queries that are
00:46:42.049
generated by these two statements, we can see that they're totally different: the first one casts it to an ID, whereas the
00:46:49.460
second one uses it as a string and generates an IN clause with the string. So this
00:46:54.470
is because the AST has some context about the values that are being put into
00:47:01.520
the SQL statement. It knows, oh, we're
00:47:07.039
dealing with IDs here, IDs are integer types, so we need to cast this value to an integer type, and it does that for us
00:47:13.819
automatically, whereas the sanitize_sql version doesn't have that context. Another example is URL slugs: say we have
00:47:22.700
some type of text that comes after the ID, coming in from the controller. Now, when this ID comes in,
00:47:30.049
what will the query look like? If we look at the SQL statements, we'll see that using auto sanitization, it does the right
00:47:36.589
thing: it queries with the value of 100 for the ID, whereas the manual sanitization will not do that; it
00:47:42.410
generates an IN clause with that string literal. Now, again, this is because the
00:47:52.160
auto sanitization has that context: it knows that we need to convert these into IDs. Now, the third issue is very large
00:48:00.410
numbers so in this case we have a very large number and we're trying to do a query against the database with it if we
00:48:06.920
look at the SQL statement that's generated here, we'll see that in fact the first one doesn't do any query; it does nothing.
00:48:12.940
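Both the slug casting and the bounds check can be sketched like this (a toy, assuming a 64-bit signed integer id column; this is not Active Record's real type code):

```ruby
# Toy version of the context auto sanitization has about an integer id column.
# Assumption: ids fit a 64-bit signed integer, as in a BIGINT column.
INT64_RANGE = (-2**63)..(2**63 - 1)

def serialize_id(value)
  i = value.to_i                    # "100-snow-day".to_i => 100 (slug cast)
  INT64_RANGE.cover?(i) ? i : nil   # out of range => nil => skip the query
end

serialize_id("100-snow-day")  # => 100
serialize_id(2**64)           # => nil, so return an empty set, no query
```

With the cast and bounds check applied, out-of-range values never reach the database at all.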
The second one, though, does an IN statement with a very large number, and
00:48:18.710
this could raise an exception or something. And the reason that the first one doesn't do a query is
00:48:24.859
because it knows, well, this number is outside of the bounds that this particular type can handle, so I won't
00:48:31.579
do any query, I'll return an empty set. So, to deal with this and try to speed
00:48:38.119
this thing up, the solution that we came up with was to introduce a new type of node, called a precompiled
00:48:44.839
node. Part of the reason the compiler is slower than we'd like is because it's designed to deal with any type of node
00:48:51.140
in the tree, but when application code is calling the where function, we know that at that
00:48:56.720
point we're being called with a list of IDs, and we can do something a little bit more intelligent. So the compiler is much
00:49:03.680
more general, it handles the general cases, whereas at that call site, when we call where, we know
00:49:09.920
that we're only dealing with a list of IDs. So what we could do is, at the point
00:49:15.170
where we create that list, rather than creating that list of IDs, we can create a precompiled node that has the SQL
00:49:22.069
statement we need to insert. So normally our tree will contain an In node that
00:49:28.069
points to a bunch of IDs, and we know that walking those IDs is our bottleneck. So what we aim to do instead is introduce a
00:49:35.119
type of precompiled node. This precompiled node will point at the old In statement,
00:49:40.640
but it will also point at generated SQL that we can use when we're compiling our SQL statement. Now, the
00:49:47.869
reason we point at both of these is so that we can maintain backwards compatibility; for example, maybe we need
00:49:53.300
to change the In statement into a Not In statement, so we need to do some kind of transformation on the tree. Well, we were
00:49:59.540
able to maintain backwards compatibility and still get higher performance.
00:50:05.000
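Here is a toy sketch of that idea (hypothetical names; this is not the actual Rails patch): the precompiled node keeps the original node around for tree transformations, but hands the compiler a ready-made SQL fragment:

```ruby
# Toy precompiled node: carries the original node (for backwards-compatible
# tree transformations) plus the SQL fragment the compiler can emit directly.
Precompiled = Struct.new(:original_node, :sql)

class Visitor
  def visit(node)
    case node
    when Precompiled then node.sql        # fast path: skip walking children
    when Array       then node.join(", ") # slow path: visit every ID
    end
  end
end

ids  = (1..1000).to_a
node = Precompiled.new(ids, "id IN (#{ids.join(', ')})")
Visitor.new.visit(node)  # emits the precompiled SQL in a single step
```

The visitor does one string lookup instead of a thousand node visits, while anything that still needs the original In node can reach it through original_node.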
during the compilation phase, we would normally visit all of these In statements, all the nodes
00:50:11.859
inside of the tree, whereas now, when we encounter a precompiled node, instead of
00:50:17.359
visiting all of those, we can just say, OK, I want to use this precompiled statement. So Eileen and
00:50:24.190
I implemented this, and I want to share some of the benchmark results with you. So if we look at the performance difference, this benchmark just
00:50:30.490
measures how fast we can execute a where statement, and when we compare our branch to master, we'll see that
00:50:38.380
our branch is actually 30% faster, so we've been able to improve performance over master by thirty percent. Now,
00:50:46.329
we need to compare this with the original benchmark against manual sanitization, to see where we are
00:50:53.440
with regard to manual sanitization. So we compare that as well, using a thousand IDs, and if we run the benchmarks, we'll
00:51:00.339
see that it's still actually slower: it's 18 percent slower than doing manual
00:51:05.770
sanitization now I think this is good news because it used to be sixty percent
00:51:11.500
slower. Of course, we saw earlier that these two functions are doing something
00:51:17.200
a little bit different, so we're not exactly comparing apples to apples, and that could be the reason
00:51:24.099
for this performance difference. But I think that we still have a lot of work to do; this is a work in
00:51:31.720
progress and I think that we can actually get those numbers closer such that there is no performance difference
00:51:37.089
at all but for the time being I think this is a great improvement now I'd like to show
00:51:44.260
you all of the source code for this, the entire diff for fixing this particular
00:51:56.890
issue, but I think that would be going from zero to drawing the owl, when actually I want to communicate to you
00:51:56.890
things that would be more useful to you in your daily lives like the
00:52:02.650
architecture of active record or benchmarking and profiling techniques
00:52:07.779
those are the things that I actually want to communicate to you in this talk versus the
00:52:13.170
this one particular performance fix now I'd be remiss if I didn't show you some
00:52:19.109
of the actual patch so let's take a look at it this is the bit of the code that handles AST conversion where we are
00:52:27.059
looking at In, we're generating In nodes, and we went from this, prior to the precompiled-node change,
00:52:32.430
to this. Now, the first thing you'll notice about this code is it's more complicated; that is true, but you
00:52:39.299
can see under the hood that we're actually just calling this sanitize_sql array method, just like our fast example
00:52:44.700
was doing. Now, actually, this isn't merged into master yet, and the main issue with this particular
00:52:51.660
patch is that we have this neat function here called serialize_to. Now, we need to
00:52:57.660
figure out a better name for this method in fact what we had to do to get this
00:53:02.969
performance boost is, if you look at the internals of Active Record, you'll see that each type has a serialize method.
00:53:09.660
Now, that serialize method unfortunately doesn't just serialize: it
00:53:22.650
checks to see if the value is within bounds, and then it does the serialization. Well, in this patch we needed to be able
00:53:29.999
to split up that functionality: we needed to be able to do those checks, say,
00:53:35.519
are these things in bounds, and then do the serialization separately. Unfortunately, serialize is
00:53:43.109
a public method, so we can't change the behavior of that method. So, for the time being, we introduced a method
00:53:49.079
called serialize_to. We need to figure out a better name for that, but once we do, we'll merge this in, or we'll just
00:53:55.529
make this private or something, we'll figure something out, but for now we have this fun method name. All right, so let's
00:54:04.589
wrap this up. The first thing I want to say is that not all code is equal. We saw
00:54:09.989
those two bits of code at the very beginning, and we saw that they did basically the same thing, but in fact
00:54:17.009
they're not equal at all they handle different cases and this manual
00:54:23.609
sanitization route may not be applicable in the same cases as these regular active record calls so it's kind of
00:54:30.010
dangerous to say, okay, these two things are the same; they are probably not exactly the same. However,
00:54:36.900
similar behaviors should have similar performance, but these weren't even
00:54:42.580
close: the performance was very different. If they do approximately the same thing, the
00:54:48.790
performance should be approximately the same, but in this case they weren't even close at all; it was very different. So I
00:54:55.300
would consider this poor performance to be a bug. I think that poor performance is a bug, and if you encounter something like
00:55:02.410
this with Rails, you should absolutely report it as an issue, rather than saying it's, you know, just a performance protip.
00:55:08.530
It's cool to have performance pro tips like that, but really we should be working to make the framework faster. I
00:55:14.770
mean if you really think about it like poor performance in this economy we
00:55:20.470
can't afford that. And you're right, we can't afford that; we need to be fixing these things. So, to recap: today we looked at really
00:55:28.600
great ways to procrastinate, and I verified they're really great. We looked at tools
00:55:33.700
for performance, Active Record internals and design, and I hope that you all enjoyed my very first opening keynote at
00:55:40.630
railsconf thank you so much and I really hope that we can actually see each other
00:55:46.060
in person next year thank you again