00:00:09.110
whoo hello
00:00:23.550
God, this is a recording and I'm nervous. Hi everyone, and welcome to Aaron's
00:00:30.780
variety hour I'm not sure if this thing is actually gonna go an hour but I will do my best to make it there so that I
00:00:37.739
can live up to the name of variety hour and as I said it's a variety hour so
00:00:42.989
we're gonna talk about a variety of things I'm extremely honored this year
00:00:48.030
because this is the very first time I've been able to do the opening keynote at RailsConf. Every other year I
00:00:56.130
always give the final keynote of the conference, and this is the very first time I've been able to give the opening
00:01:02.190
keynote at RailsConf, though I do think it is kind of weird that if you go to
00:01:07.590
the website and scroll down, you'll see that I'm very last on the webpage
00:01:14.040
now I'm sure that this is just an oversight like a small oversight by the organising committee but you know I'm
00:01:21.869
sure we can get this like cleared up later but it's really great to be the very first opening presentation now
00:01:31.399
David usually does the opening keynote but since it's my turn this year I thought I should change my headshot to
00:01:37.740
something a little bit more appropriate so I took a new headshot photo and this is this is what I am going to start
00:01:44.520
using from now on when I give talks at places. The other thing is that since I
00:01:51.929
won't be able to see David's presentation ahead of my presentation and that's because and I can't stress
00:01:57.660
this enough I am the opening keynote this year since I won't be able to see his talk since
00:02:03.899
I'm opening I won't be able to make any jokes about it so I've decided that this
00:02:09.450
year what I'm gonna do is try to guess the things that he'll say and then see if I can make some jokes in advance so
00:02:16.980
the first thing I'm pretty sure he's gonna talk about: coming in 2020, I heard that Basecamp is going to be
00:02:23.730
starting a new email service called HEY, and I'm really excited about this. But I guess, from what I understand, they
00:02:30.840
plan to monetize the service by charging money for email and
00:02:36.110
I think that's really great but I don't understand why anyone would pay money for email people send me email all the time
00:02:43.670
just constantly and I I don't pay for any of it
00:02:48.770
so I'm not sure why people would pay for that but I wish him best of luck
00:02:55.540
the other thing I'm pretty sure is that most folks are doing a lot more remote
00:03:01.280
working these days and I think that's really great and I know that there's a
00:03:06.290
lot of people out there who were new to doing remote work and I'm also really
00:03:11.600
happy that David and Jason decided to release this book at such a steep discount they wrote a book about remote
00:03:18.140
working I haven't actually read the book yet but I'm going to I really plan on it I ordered a copy I'm gonna read about
00:03:25.070
this. And the reason I want to read it is because I too would really like to work from home, I mean I would really
00:03:31.520
like to work from David's home it looks very nice now as you've no doubt noticed
00:03:39.709
that RailsConf is not in Portland this year due to the outbreak of COVID-19 in the United States, and I'm
00:03:46.100
pretty sad about that I love going to Portland it's this is really too bad I
00:03:52.730
was looking forward to seeing everybody in Portland this year and it's so close to Seattle so I can just drive and I
00:03:59.120
don't have to like you know fly down there so I really like that the other
00:04:05.660
thing is that I think that I love that show Portlandia so I'm so happy if I
00:04:11.930
could visit Portland which the show is about but honestly one of the reasons I think that the show is really funny is
00:04:17.690
because Seattle where I live is so similar to Portland and the show could
00:04:23.960
really be about either place, but I guess Portland got stuck with it. So anyway
00:04:30.830
this year as I was saying we're doing the conference remotely this year and doing a pre-recorded talk is like a
00:04:37.430
really difficult thing for me to do I usually don't do pre-recorded talks it's
00:04:42.620
really important for me to get feedback from the audience. I really need feedback from the audience when I give a presentation, that way I can adjust my
00:04:49.580
presentation if I'm going too long or too short and also I get extremely
00:04:54.680
nervous when I'm giving a talk and the feedback from the crowd like helps me actually calm down when I'm on stage and
00:05:02.020
unfortunately like since I'm pre recording this I feel really nervous but
00:05:08.390
I decided I would try to deal with this by attempting to give a presentation ahead of time and record it in zoom
00:05:22.860
All right, you started already. You're muted. Sorry, can you hear me now? Can you hear
00:05:36.069
me? All right, sorry about that, let's try again. Hi everyone, welcome to Aaron's variety
00:05:43.930
hour, I'm really happy to be here. Aaron, I think you've got a color palette on your screen still. Oh really? I don't
00:05:51.759
see anything is there it's like one that's okay one sec one sec that is that okay oh yeah
00:06:01.029
thank you okay okay let's do this again okay hi everybody welcome to Aaron's
00:06:08.319
variety hour hopefully this will be an hour long I'm not I'm not sure can you
00:06:17.319
all mute if you're if you're not on like can you somebody's getting a phone call
00:06:23.860
can you mute please all right okay let me start over welcome to Aaron's variety hour I am so
00:06:29.800
happy to be here hopefully this will be an hour I've been trying to start okay I
00:06:41.740
will I will start oh sorry sorry it's okay it's okay all right everyone welcome there is
00:06:50.770
variety oh I'm sorry wrong meeting
00:06:56.080
okay hi everyone welcome to variety hour
00:07:01.850
this may be an hour but I'm not sure now
00:07:08.660
when that didn't work I decided that I would try to give the talk in front of my cats welcome to railsconf welcome to my very
00:07:15.980
first opening keynote at railsconf come here buddy welcome to railsconf come on
00:07:22.310
buddy now when that didn't work I decided that I would try to give a presentation on Animal Crossing
00:07:57.270
now since it was way too hard to give a presentation in Animal Crossing, I decided to just do the format that I'm
00:08:03.759
doing right now, which I guess is a late-night TV show host format, and I will just start adding laugh
00:08:09.520
tracks to make myself feel better. Now before I get into the stuff
00:08:15.999
that I actually want to talk about for this conference I want to tell you all about my my productivity strategy my
00:08:25.300
productivity strategy is procrastination so in order for me to focus on building
00:08:32.229
a presentation I need to put it off until I absolutely can't put it off anymore and the reason I do this is
00:08:40.510
because it gives me a hyper focus I have to focus on the task since I'm required to focus on it like I know that there is
00:08:47.079
a particular deadline I have to be up in front of an audience this helps me really focus on preparing my
00:08:53.829
presentations for people however this this strategy also comes
00:09:00.160
along with a lot of extreme guilt now as I'm procrastinating and I know I should
00:09:07.329
be working on my talk I know I should be making this presentation but I'm not so in order to relieve my guilt for
00:09:13.959
procrastination I have to fill my time with things that I deem to either be more important than the task at
00:09:21.070
hand, more important than the thing that I'm doing, or it has to be something that's more fun. Now I call this
00:09:31.079
procrastination project management, or PPM for short, and actually I take it
00:09:36.610
very very seriously in fact I meticulously plan out my procrastination
00:09:41.910
these projects are actually planned procrastination projects now I had
00:09:47.380
planned on procrastinating all the way until April, and then working on my presentation for
00:09:55.750
all of April until RailsConf came around. Now unfortunately, we got hit with
00:10:01.860
COVID-19 and RailsConf was cancelled, and when that happened, Evan texted me
00:10:08.130
and he asked me, would you be willing to record your presentation? And of course I said yes. So I asked him, well, when is the, you
00:10:16.890
know, I need a deadline for this, because that's how I work. I didn't tell
00:10:22.320
him that this is my productivity strategy but I need to have a deadline in order for me to procrastinate and he
00:10:29.430
said, oh yes, the dates will be the same as RailsConf, and I said, that is great. I knew internally, okay, the dates are the
00:10:36.030
same this means that my procrastination plan is on schedule things can move like
00:10:42.330
according to plan so unfortunately maybe
00:10:47.820
a couple weeks later I got an email from the RailsConf organizing committee
00:10:53.700
saying to me, oh well, actually, we want you to pre-record
00:10:58.950
this, but your videos are due by April 13th. And unfortunately this is three weeks earlier than I had planned, so this
00:11:07.590
really really put a lot of pressure on me and I decided that part of my
00:11:13.230
presentation, since now everything is pushed up so far, I thought, okay, well, part of my presentation is I'm going to
00:11:19.320
give a presentation of my planned procrastination projects. So before we actually dive into technical details of
00:11:27.900
my presentation I'd like to show you some of the stuff that I've been doing to basically put off doing my actual
00:11:35.310
work. So once I found out that we were gonna be doing RailsConf as an online
00:11:42.240
conference, it got me really worried. Like, I kept thinking to myself, okay, how am I
00:11:47.640
how am I gonna record the video now of course like I need to record it in 4k
00:11:53.660
and obviously an iPhone won't do so I am
00:11:59.040
gonna have to purchase a new camera that has 4k like you can shoot 4k video because it's very important that people
00:12:05.150
be able to see like you know how dry my skin is or see my face in excruciating
00:12:12.930
detail. So I decided to purchase a new camera. I got this camera, it's an OM-D camera, it's a very nice camera,
00:12:21.450
and then I read the manual for it and to be honest I haven't owned an SLR camera
00:12:26.710
in quite a while so I'm not up on the latest technology and I was incredibly surprised at how you actually have to
00:12:33.850
transfer photos from the camera onto, you know, your phone or
00:12:39.550
your computer or whatever. According to the manual, what you do is the camera
00:12:45.070
creates a Wi-Fi network, you connect to the Wi-Fi network with your phone, and then your
00:12:53.080
phone transfers the photos off of the camera via Wi-Fi so of course you have
00:13:00.310
to install some kind of like proprietary application on your phone some kind of
00:13:05.920
app to transfer the photos from the camera over to your phone. And this made
00:13:12.430
me think a couple things. First off, I thought: install a new app? In this
00:13:17.770
economy? I've got millions of apps on my phone, I can't afford to have another app on my phone. The other thing
00:13:25.960
that I thought is, can I hack this? There has to be a way for me to be able to hack my
00:13:34.030
camera I don't want to transfer all the photos off using some third-party application I'd rather have a program
00:13:41.260
that takes them directly off of the camera and imports them right into the photos application rather than using
00:13:47.800
some weird thing. So I decided that one of my procrastination projects, I
00:13:53.380
mean, since I bought the camera to film the video, I need to be able to get the photos or the
00:13:58.810
video off of the camera in order to show the video to everyone, I decided I needed
00:14:03.910
to hack my camera. And to get started, I decided to connect my computer to the camera via Wi-Fi rather than my cell
00:14:11.110
phone and when I did I noticed a couple of interesting things first the camera
00:14:16.960
always had the same IP address, so I thought that was very interesting. And when I ran a port scan
00:14:23.710
on the camera, I found that it also had an HTTP service open, and that was very
00:14:31.149
interesting now I also found that I could just run curl against the HTTP
00:14:37.449
port. It gave me an error, there was an error that showed in the terminal, but the traffic was all over
00:14:46.180
port 80, it was unencrypted, and if I could figure out what the actual endpoints were on
00:14:51.819
the camera, maybe I could write my own client for the camera. So I needed a way to figure out,
00:14:58.029
okay, what is the data going between the cell phone and the camera, what are they transferring, what is it doing? And
00:15:05.259
if I can figure that out, I can write my own client. So I wanted to
00:15:10.930
come up with a way to intercept the data that was being transferred between the camera and the phone so there are a couple things going for
00:15:18.309
me: the first thing was that we had a hard-coded IP address, so the camera
00:15:24.790
always had the same IP address, and that said to me that probably the application
00:15:29.860
that's accessing the camera is gonna have that IP address hard-coded in, so I can use this to my advantage.
00:15:36.779
the other thing is that since we have port 80 open and not 443 it's probably
00:15:42.639
going to be unencrypted traffic so the camera is going to be transmitting
00:15:47.759
unencrypted data now there is one crucial detail or one important thing is
00:15:54.670
that the point of this exercise is not actually to write a client for the
00:16:00.189
camera but to put off doing the work that I really need to do and that means
00:16:08.079
that I need to make this project worthwhile so that I didn't feel guilty
00:16:13.420
about it. And this gave me one really big constraint, and that is that I needed to either use Ruby or Rails to somehow
00:16:20.470
intercept and deal with this traffic so my idea for getting access to the data
00:16:28.629
that was being transferred between the phone and the camera was to essentially set up two computers that each had two
00:16:35.589
interfaces. So we would have two computers, each computer would have Ethernet and Wi-Fi. Now the
00:16:44.770
two computers would be connected to each other via Ethernet, and we'll connect the
00:16:51.700
cell phone to one of the computers via Wi-Fi, and that computer will act as a
00:16:58.570
proxy so this computer is going to be intercepting the traffic and recording
00:17:03.640
it then will connect the camera to the other computer via Wi-Fi and create an
00:17:11.290
SSH tunnel between the camera and the proxy server this way our proxy server
00:17:17.560
can access the network for the camera, so we can have the two same-IP
00:17:23.709
addresses with access to each other. So the secondary computer will set up an
00:17:29.680
SSH tunnel between the camera and our proxy server, and then the proxy server can access the camera through
00:17:35.890
a port on localhost so when the phone
00:17:41.170
sends data the proxy server will save a copy of the request and when the camera
00:17:46.960
responds the proxy server can save a copy as a response something like this so the phone sends a request we save a
00:17:54.250
copy of the request and then the camera sends a response we save a copy of the response and send that back to the phone
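The record-and-forward loop described here can be sketched in plain Ruby. This is a reconstruction, not the talk's actual WEBrick code: the `LoggingProxy` class name, the log format, and the use of `Net::HTTP` are my own choices.

```ruby
require "socket"
require "net/http"

# Sketch of the record-and-forward idea: pass each request through to the
# upstream (the camera, reached through the SSH tunnel) and keep a log
# entry for every request/response pair.
class LoggingProxy
  attr_reader :log

  def initialize(upstream_host, upstream_port)
    @host = upstream_host
    @port = upstream_port
    @log  = []   # recorded request/response pairs
  end

  # Forward a GET to the upstream and record the exchange.
  def get(path)
    res = Net::HTTP.get_response(@host, path, @port)
    @log << { method: "GET", path: path, status: res.code, body: res.body }
    res.body
  end

  # Forward a POST, logging the request body as well.
  def post(path, request_body)
    res = Net::HTTP.post(URI("http://#{@host}:#{@port}#{path}"), request_body)
    @log << { method: "POST", path: path, request_body: request_body,
              status: res.code, body: res.body }
    res.body
  end
end
```

The recorded `log` is the raw material for reverse-engineering the protocol afterwards.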
00:18:00.910
and if we record all of that data we can figure out what is the protocol that the phone and the camera are using to talk
00:18:07.570
to each other. Now, for completeness, this is the SSH tunnel
00:18:14.040
command that I used, and you can see here essentially what happens: we say we're
00:18:19.510
gonna forward port 80 on 192.168.0.10 to our Raspberry Pi (I'm using a
00:18:28.900
Raspberry Pi as my proxy server), and the proxy server
00:18:34.900
can access the camera on localhost via port 2210. So if
00:18:41.590
we connect to 2210 on localhost, that'll actually be forwarded over to the camera.
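The tunnel he describes might look like this as a command. This is a reconstruction from the description, not the slide's exact command; the `pi@raspberrypi.local` user and host are assumptions:

```shell
# Run on the secondary computer (the one joined to the camera's Wi-Fi).
# -R opens port 2210 on the Raspberry Pi and forwards it, through this
# machine, to port 80 on the camera at 192.168.0.10; -N means no shell.
ssh -N -R 2210:192.168.0.10:80 pi@raspberrypi.local
```

After this, the proxy on the Pi can reach the camera as `localhost:2210` even though both Wi-Fi networks use the same hard-coded IP.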
00:18:47.130
then in order to make this project worthwhile and make myself feel guilt free I wrote
00:18:54.370
the proxy server in Ruby, and essentially what it is is just a proxy server
00:19:01.270
written with WEBrick. And this is the code that handles the GET requests for the proxy server: essentially
00:19:11.200
all it does is access the SSH tunnel here. So when a request comes in, we,
00:19:16.330
the proxy server, create a new connection to localhost on port 2210,
00:19:22.210
which is actually a connection to the camera. Then we replay
00:19:28.810
the request that we got from the phone, then we get the response body that came from the camera, and then we send the
00:19:37.270
response body back to the phone, logging all this data along the
00:19:43.960
way. And then we have to handle POSTs as well; this is the code for handling POSTs. It's almost exactly the same as the code
00:19:50.560
for handling GETs, except that we need to log POST bodies as well, so it was very
00:19:55.930
similar. Now the next thing that I did was that I used the app on the phone to
00:20:01.930
collect data. Now of course, yes, this did mean that I had to install the
00:20:07.450
application on my phone. But I installed the application on my phone so
00:20:12.850
that you don't have to: you too can buy this camera and not have to install the application, I did it for you. And I am
00:20:21.310
publishing this data so you won't need to do this hopefully anyway using the
00:20:26.350
data that we were able to collect that we intercepted on the proxy server it's
00:20:31.810
pretty trivial to write our own client and this is the client that I wrote for
00:20:38.310
getting images and movies off the camera. Basically it just has two parts: the top part there just
00:20:45.430
gets a list of all the files that are stored on the camera, and
00:20:51.520
the second part downloads all those images. And of course I added a little bit of intelligence in
00:20:58.390
there so that if we've already downloaded the image, it won't try again. So we did it, our camera's hacked.
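The two-part client could be sketched like this. The endpoint paths and the one-filename-per-line list format are assumptions for illustration (the real protocol came from the intercepted traffic), and the `fetch` callable is injected so the sketch stays self-contained:

```ruby
require "fileutils"

# Sketch of the two-part client: list the files on the camera, then
# download the ones we don't already have. LIST_PATH and DIR_PATH are
# hypothetical endpoint names, not the camera's documented API.
class CameraClient
  LIST_PATH = "/get_imglist.cgi"
  DIR_PATH  = "/DCIM/100OLYMP"

  def initialize(dest_dir, fetch:)
    @dest  = dest_dir
    @fetch = fetch   # callable: path -> response body (e.g. HTTP over the tunnel)
  end

  # Part one: get the list of files stored on the camera
  # (assumed here to be one filename per line).
  def list_files
    @fetch.call("#{LIST_PATH}?DIR=#{DIR_PATH}").lines.map(&:strip).reject(&:empty?)
  end

  # Part two: download everything, skipping files we already have.
  # Returns the names that were newly downloaded.
  def download_all
    FileUtils.mkdir_p(@dest)
    list_files.select do |name|
      local = File.join(@dest, name)
      next false if File.exist?(local)   # already downloaded, don't try again
      File.binwrite(local, @fetch.call("#{DIR_PATH}/#{name}"))
      true
    end
  end
end
```

Running `download_all` a second time is a no-op, which is the "little bit of intelligence" being described.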
00:21:05.309
That is great. Of course, this is clearly not enough procrastination; this only
00:21:10.500
took a couple of days and I just after completing this project I just didn't feel enough pressure to work on the
00:21:17.010
presentation so the next thing I had to do is I wanted to make a virtual conference inside of Animal Crossing and
00:21:23.610
we saw the clip earlier today. But that clip actually cost me a lot of time, I had to grind a lot for Bells
00:21:30.030
so that I could buy all the stuff to lay out and display for everybody. But then
00:21:35.580
of course I needed a way to record the video that came off of my
00:21:41.610
Switch so that I could give the presentation in Animal Crossing. So I learned about that, and I bought a
00:21:47.250
video capture card so that I could record all the data from the Switch, and I had to figure out the right software
00:21:53.190
to use to record it, which was extremely hard as well. And now I also needed to get a tripod for the camera, but it was
00:22:01.410
very difficult to get a tripod because it was very COVID outside. But the good news is that now that I've done all this
00:22:07.679
yak shaving, I'm ready to become a Twitch gaming streamer. This has all prepared
00:22:12.750
me for my next big career, which is to launch myself into the
00:22:18.270
great world of online game streaming, so when I decide to retire from programming
00:22:24.360
I can just play Animal Crossing online all day. The next project that I did, though: I still didn't feel enough
00:22:31.440
pressure even though I decided to start a new career in streaming games I
00:22:36.809
decided to do something which I call the A-A-Ron project. After hacking my camera,
00:22:43.500
grinding for Bells, dealing with capture cards, tripods, and starting my new career as an online
00:22:48.929
streamer, since that wasn't enough, I decided to design and build a sign to let me know every time I had
00:22:56.670
messed up. And this is where I came up with the A-A-Ron sign: essentially I built a
00:23:04.200
sign that sits in my office above my desk. This is the sign. Then I
00:23:09.360
designed a PCB, this is the PCB. This PCB controls the sign, and the PCB has a
00:23:17.460
serial UART connection so I can hook this PCB straight into my computer and
00:23:22.860
have my computer control the sign now I've wired it up so that every time I execute a command in my terminal if it
00:23:28.890
fails, or if any of my tests fail, the sign will display "A-A-RON". So right here
00:23:34.380
is the serial port on the PCB that I use to control it and this sign will let me
00:23:40.650
know anytime I've messed up
00:23:58.760
Moving on, our main story tonight concerns where clauses in your Active
00:24:04.980
Record queries. When you need to offload complexity to the database, they're there for you. We're going to
00:24:12.390
talk about how where clauses work in Active Record, some features that you might not know about, and then we're
00:24:17.789
gonna talk about profiling and speeding up where clauses. Now when I say profiling, I don't mean
00:24:25.110
profiling your application, I mean profiling Rails itself. Certainly these same profiling techniques can
00:24:32.460
be used to profile your application, but today we're gonna look at how to speed up Rails itself. Now this story started a
00:24:42.600
while back we have an internal blog at work and somebody posted on the internal
00:24:48.419
blog like, hey, check out this life hack. Basically the blog
00:24:54.900
post just said, sanitizing IDs yourself is much faster than letting Active Record sanitize the IDs for you.
00:25:01.049
So if you have a large list of IDs and you need to do an IN statement, doing the
00:25:06.450
sanitize_sql is way faster than asking the where clause to do it. And I
00:25:11.520
read this and at first I just couldn't believe it and the next thing I thought
00:25:16.740
is well if it's true why would that be true these two things do approximately
00:25:21.750
the same thing so why would you want to use the harder version when you can just
00:25:27.570
use the easy version so in order to figure out what was going on the first
00:25:32.880
thing I did was I wrote a very simple benchmark now this benchmark just compares the two different strategies
00:25:39.059
using this great benchmark-ips gem. I really like this gem because all it does
00:25:45.000
is run the same code over and over again for five seconds, and whichever chunk of code can run more times in five seconds,
00:25:51.809
that one is the faster version. And this is really nice because I don't have to think about how many times I should run
00:25:57.570
this; it just does it for me and sets a deadline, and then however many times it executed within that deadline is
00:26:04.280
our benchmark. Now what this means is: the more times you run, the faster your
00:26:10.760
code is, so we want a higher iterations per second. Our faster code
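The deadline-and-count idea that benchmark-ips implements can be sketched with just the standard library (this is a toy, not the gem itself; the method name and window length are mine):

```ruby
# Minimal sketch of the benchmark-ips idea: run a block as many times as
# possible before a deadline, then report iterations per second. A higher
# number means faster code.
def iterations_per_second(seconds: 0.2)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + seconds
  count = 0
  while Process.clock_gettime(Process::CLOCK_MONOTONIC) < deadline
    yield          # the code under test
    count += 1
  end
  count / seconds
end

# Comparing two blocks: whichever runs more times in the window is faster.
fast_ips = iterations_per_second { 1 + 1 }
slow_ips = iterations_per_second { sleep 0.001 }
```

The real gem adds warmup runs and statistical reporting on top of this same loop.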
00:26:16.040
will have higher iterations per second now if we look at the results for this benchmark the results are incredibly
00:26:22.310
surprising: actually sanitizing the IDs yourself was about 74 percent faster
00:26:28.910
than calling the easier API, and this really perplexed me. Why
00:26:35.510
would this be the case? These two things were doing almost exactly the same thing, so how can the performance be so
00:26:43.220
dramatically different now before we dig into this issue I want to talk about
00:26:49.970
using where with Active Record. So we're gonna get a little bit back to basics here; let's take a look at
00:26:57.340
calling the where method, what it actually does, and the SQL query
00:27:03.020
that it generates. So with Active Record, the where method is simply used to
00:27:08.840
manipulate the WHERE clause that's generated in your SQL statements. We use this method anytime we want to add
00:27:14.630
constraints on the query we're making, and these calls are additive: if we call where multiple times, it
00:27:22.460
will add multiple things to your WHERE clause, so we can keep chaining wheres and saying we want to add this and
00:27:29.150
this and this and this. So from a high-level perspective, we're translating
00:27:29.150
Ruby calls into SQL; we're essentially taking the Ruby language and converting it into the SQL language, and there
00:27:41.360
must be some process for converting Ruby code into SQL and executing it.
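A toy model of that translation process (not Active Record itself) might look like this: each chained `where` returns a new relation carrying one more predicate, and `to_sql` does the Ruby-to-SQL conversion by ANDing them together.

```ruby
# Toy model of additive where chaining. Each `where` call is additive:
# it returns a new relation with one more predicate, and to_sql builds
# the WHERE clause by joining the predicates with AND.
class ToyRelation
  def initialize(table, predicates = [])
    @table = table
    @predicates = predicates
  end

  def where(fragment)
    ToyRelation.new(@table, @predicates + [fragment])
  end

  def to_sql
    sql = "SELECT * FROM #{@table}"
    sql << " WHERE " << @predicates.join(" AND ") unless @predicates.empty?
    sql
  end
end
```

The real version goes through Arel rather than string concatenation, but the additive shape is the same.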
00:27:47.300
If we look at our benchmark again, we'll see that the faster version is doing some of that translation to SQL
00:27:53.420
manually. So we can kind of tell intuitively that the speed difference must come from the way that the slower
00:28:00.830
example does SQL generation: something about the automatic SQL generation is slower than doing it
00:28:07.700
manually. Another interesting thing is that if we look at this particular
00:28:13.429
example in the console, this isn't actually the SQL that gets generated. So we see the SQL that we think
00:28:20.179
should be generated, but when we look at the console, that's not the SQL that's actually
00:28:25.220
generated. What's actually generated: you'll see the same SQL statements,
00:28:30.590
but they'll have a LIMIT 11 attached to them, and this is because Active Record
00:28:37.039
is lazy about generating the query. So in fact, the inspect method on
00:28:44.690
Active Record will add this LIMIT 11 to
00:28:50.269
the query before it's executed. So if we want the where statement to
00:28:56.629
generate the SQL that we would expect, we need to actually proactively load them. Now if we look at these three
00:29:05.119
calls, we'll notice a difference between the behaviors: for the first one, when we eagerly load, when we call to_a, no limit
00:29:12.289
is added. If we call just .all without inspecting it, nothing gets executed, and
00:29:18.289
nothing's executed because we didn't access any of the records and we didn't
00:29:23.570
do anything with it, so nothing is executed. Now the third one adds LIMIT 11, and that's because we called the
00:29:30.470
inspect method on it. And remember that the console will call inspect for us, or calling p will automatically call
00:29:37.850
inspect. Now the important thing to learn from this is we actually wrote our
00:29:42.980
benchmark incorrectly. If we look at our benchmark, we'll see that we're just calling where, and we're not actually
00:29:49.850
loading any records, so neither of these things will actually query the database.
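A toy model of that laziness (again, not Active Record itself): building the relation runs nothing, `to_a` runs the query as built, and `inspect`, which the console calls for display, runs it with LIMIT 11 added.

```ruby
# Toy sketch of Active Record's lazy query execution. The executor
# callable stands in for the database so we can observe what runs.
class LazyRelation
  def initialize(sql, executor)
    @sql = sql
    @executor = executor
  end

  # Eager load: the query runs exactly as built.
  def to_a
    @executor.call(@sql)
  end

  # Console display: the query runs with LIMIT 11 tacked on,
  # which is why the console shows a different statement.
  def inspect
    @executor.call("#{@sql} LIMIT 11")
    "#<LazyRelation #{@sql}>"
  end
end
```

Just constructing a `LazyRelation` never calls the executor, which is the benchmark bug being described: neither branch ever touched the database.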
00:29:55.629
so the only thing that we've proven is that we're able to do nothing faster
00:30:01.460
when we call sanitize_sql than if we were just calling where. So let's fix the
00:30:06.799
benchmark. Here we've fixed the benchmark to generate a thousand IDs and then try and query for those
00:30:12.889
thousand IDs, and it actually hits the database. And if we rerun this benchmark, we'll see
00:30:20.090
that, in fact, before we fixed the
00:30:25.220
benchmark, the sanitize_sql statement was 70%
00:30:31.039
faster; now that we fixed the benchmark, we can see that it's only 60% faster. So we've
00:30:38.659
sped up our program, but rather than being 70% faster it's now
00:30:43.999
60% faster. So I guess that's better, but ideally these two things should be
00:30:50.749
identical. So let's actually profile this code and try and
00:30:56.389
figure out what's going on. For these two pieces of code doing most of the same thing, being 60%
00:31:04.580
slower is just too slow; it's in the threshold of something that I would consider to be a bug. So let's try to
00:31:11.090
figure out what the bottleneck of this code is and to figure that out I like to use a tool called stackprof for
00:31:17.029
profiling. Now this is an example usage of stackprof; this is using stackprof to profile these two benchmark
00:31:25.129
pieces of code. It looks very similar to the benchmark-ips code we
00:31:31.729
saw earlier, only this time we're just getting profiling information. Now what we're doing here is we're
00:31:39.169
generating two profiles: one is with our slow case and one is with our
00:31:44.509
fast case. In this particular case I'd like to generate two profiles because I want to compare
00:31:50.779
them, and it can be helpful to compare the two. So if we run this, we'll end up with two profile files, one for the
00:31:58.220
fast case and one for the slow case. And if we run stackprof against the where
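Generating the two dumps might look something like this. It requires the stackprof gem and a Rails app, so it is an illustrative sketch only; `Post`, the ID list, and the dump file names are stand-ins for whatever the talk's benchmark used:

```ruby
require "stackprof"   # gem install stackprof

ids = (1..1000).to_a  # stand-in ID list

# Slow case: let the where clause sanitize the IDs.
StackProf.run(mode: :wall, out: "where.dump") do
  100.times { Post.where(id: ids).load }
end

# Fast case: sanitize the IDs manually first.
StackProf.run(mode: :wall, out: "sanitize.dump") do
  100.times { Post.where(Post.sanitize_sql(["id IN (?)", ids])).load }
end
```

Each dump can then be inspected with the `stackprof` command-line viewer.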
00:32:03.409
profile, our slow case, and we look at the top six stack frames, we'll see that approximately 14% of our time is being
00:32:10.580
spent querying the database. Now, querying the database, normally
00:32:16.549
this might be our target for optimization. So if we were just doing profiling on a web app, say for
00:32:22.190
example, maybe we see that we're spending 14% of our time in one particular method;
00:32:27.590
well, we would probably want to target that method. Now let's take a look at the profile with manual sanitization, our fast
00:32:35.090
case. The top stack frames for the fast case, well, they look almost identical to the frames
00:32:41.600
we saw in the slow case. Now this means that if we were to go in and speed up
00:32:46.730
that query method, we would actually be speeding up both cases, and that's not
00:32:52.250
what we want. We want to figure out why one case is faster than the other, and can we get those two bits of code
00:32:59.120
to be the same speed. So speeding up the query method in this case won't actually
00:33:04.580
help us achieve our goal. Now, as we see, these
00:33:11.419
profiles aren't really helping us so much, because we don't
00:33:17.419
know what the difference is. Looking at the two, they're essentially the same, so why is one faster than the other? We haven't been able to answer that
00:33:23.480
question. Now these profiles, when we're reading them, they're telling us how much
00:33:29.539
time we're spending in any one particular function call, and we call this self. So for that one call,
00:33:37.100
we're saying how much time do we spend in ourselves, versus ourselves plus the other functions that
00:33:43.730
we call. Well, stackprof also measures the amount of time that one
00:33:50.870
particular function takes plus all of its children as well, so we call that self plus children. So far we've
00:33:59.539
looked at what the slowest functions are, but that only
00:34:04.789
tells us how much time a particular function takes; it doesn't tell us how much time that function takes plus all
00:34:11.359
the functions it calls now where this is important is let's say let's take a look
00:34:16.760
at an example. In this particular case, if we profiled this code, the profiler would tell us: the
00:34:23.659
method named slow is the slowest and you should speed that up. And it's true, that method is the slowest, and
00:34:30.649
this slow_caller method wouldn't even show up in profiles, because we don't spend any time in that method. So the
00:34:37.760
profiler would say, hey, you need to speed up the method named slow. Now, it would be even better, though,
00:34:46.820
if we could eliminate the slow_caller method; maybe we don't need to call slow_caller so much, maybe we
00:34:53.149
can just eliminate that method altogether. But we can't see the slow_caller method in our profiler output, so what we need to do
00:35:00.330
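To make the self-versus-total distinction concrete, here is a toy sketch in plain Ruby of how a sampling profiler accounts for frames (an illustration only, not StackProf's actual implementation):

```ruby
# Toy version of what a sampling profiler computes. Each sample is a stack:
# the innermost frame gets "self" credit, and every frame on the stack gets
# "total" (self plus children) credit.
samples = [
  [:main, :slow_caller, :slow],
  [:main, :slow_caller, :slow],
  [:main, :other],
]

self_count  = Hash.new(0)
total_count = Hash.new(0)

samples.each do |stack|
  self_count[stack.last] += 1                       # innermost frame only
  stack.uniq.each { |frame| total_count[frame] += 1 }  # every frame on stack
end

self_count[:slow_caller]   # => 0, never the innermost frame
total_count[:slow_caller]  # => 2, on the stack whenever :slow runs
```

slow_caller never appears as the innermost frame, so its self time is zero, even though every sample that hits slow also counts toward slow_caller's total.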
is say: StackProf, I want you to sort by the total amount of time that is spent inside of a function. So we say, hey, sort by total:
00:35:08.390
we can run stackprof and pass in the sort-total flag, and this will sort
00:35:14.880
functions by how much time is spent in that function plus any functions it calls. So if we sort by that, we'll get a
00:35:23.310
profile for our slow case that looks something like this and this is actually
00:35:28.440
much more interesting than the previous profile we looked at and the reason is if we look through it we can see where
00:35:34.349
we're spending all of our time. Well, we're spending approximately 28% or 29% inside of this compile method
00:35:44.760
inside of the ToSql visitor. This is actually very interesting: why are we spending nearly 30% of our time compiling SQL? Now, if we
00:35:52.290
compare this profile to the manual sanitization profile, that Arel ToSql
00:35:58.230
visitor doesn't even show up at all. So here we've found our difference between the two.
00:36:03.839
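That kind of comparison can be mechanized. Here is a toy sketch (with invented numbers that match the talk, not real profiler output) of diffing two profiles' total-time percentages to find the frame that differs most:

```ruby
# Toy profile diff: frame name => total-time percentage. The numbers here are
# made up to mirror the talk; in the fast profile the compiler barely appears.
slow_profile = { "Arel ToSql compile" => 29.0, "query" => 14.0 }
fast_profile = { "query" => 14.0 }

biggest_difference =
  (slow_profile.keys | fast_profile.keys)
    .map { |frame| [frame, slow_profile.fetch(frame, 0.0) - fast_profile.fetch(frame, 0.0)] }
    .max_by { |_frame, delta| delta }

biggest_difference  # => ["Arel ToSql compile", 29.0]
```

The frames the two profiles share cancel out, and the real target pops to the top.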
so in our slow case we're spending nearly 30% of our time in this compiler, where in our fast case we don't even see
00:36:11.430
that compiler in the profile output. It's not that we spend no time in
00:36:17.040
the compiler in the fast case, it's just that we don't spend much time there. So we can tell
00:36:24.000
from comparing these two profiles that our actual target is this ToSql visitor, or this ToSql
00:36:30.150
compile method. So we need to figure out why this method is taking up so much of
00:36:35.880
our time. Maybe we can speed up that method, but I think it's better to step
00:36:42.390
back a little bit and ask: what exactly is this method doing? So let's take a look at some Active Record
00:36:48.960
design to understand what that compile method is doing. I want to look at a little bit of Active Record's design,
00:36:55.290
specifically at SQL generation. SQL generation in Active
00:37:00.990
Record involves a few steps. Our first step is generating
00:37:06.960
relation objects. Now, every time we call any type of query-builder method in Active Record,
00:37:12.690
we're generating, creating a new relation object. So each time we call, say, where or order or joins or any
00:37:21.599
of these types of query-builder methods, we're creating a new relation object, and that relation object contains
00:37:27.979
information about constraints, things that we want to add to the SQL statement. So in this case we're saying,
00:37:34.589
hey, we want to find cats that are age 10 named Gorby, and we
00:37:40.170
want you to order them by ID, and each of these relation objects stores that particular information. The nice thing
00:37:46.499
about this design is that we can reuse these relation objects so we can say
00:37:51.719
okay I want you to find all cats that are age 10 and I want you to order them by ID and then we can actually split and
00:37:59.279
say, well, in this case I want you to find cats named Gorby, and in this case I want you to
00:38:04.410
find cats named SeaTac but we're able to reuse part of the query between both of
00:38:09.719
those two. Now, this gives us a nice ability to reuse relations and to refactor our designs.
00:38:16.920
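As a rough sketch of how that chaining and reuse can work, here is a toy in plain Ruby (not Active Record's actual relation class):

```ruby
# Toy relation object (not ActiveRecord::Relation): each query-builder call
# returns a NEW immutable relation carrying the accumulated constraints.
Relation = Struct.new(:wheres, :orders) do
  def where(conditions)
    Relation.new(wheres.merge(conditions), orders)
  end

  def order(column)
    Relation.new(wheres, orders + [column])
  end
end

base   = Relation.new({}, []).where(age: 10).order(:id)
gorby  = base.where(name: "Gorby")    # branch one reuses base
seatac = base.where(name: "SeaTac")   # branch two reuses base

gorby.wheres   # => { age: 10, name: "Gorby" }
seatac.wheres  # => { age: 10, name: "SeaTac" }
base.wheres    # => { age: 10 }, the shared part is untouched
```

Because each call returns a new object instead of mutating the old one, the shared base relation can safely branch into the Gorby and SeaTac queries.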
So, for our next step, Active Record actually takes
00:38:24.059
these relation objects and converts them into an AST. What an AST is,
00:38:29.339
is simply a tree data structure that represents the SQL query that we want to execute. So if we were to imagine the
00:38:36.029
object relationships of this AST, it would look something like this. This is what
00:38:41.700
the data structure might actually look like: it's a tree of objects. Say we have a statement object; it points at a
00:38:47.190
fields object, and the fields object points at a list of fields it wants to query on. This tree structure represents
00:38:54.599
the actual string below it. Now, Active Record will go through, look at each of
00:39:00.539
those relation objects, and convert them into an AST that represents the SQL statement that we're going to execute.
00:39:07.049
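Here is a toy sketch of that tree of objects in plain Ruby (hypothetical node names, not Arel's real classes), along with a tiny emitter for the Graphviz dot format that comes up in a moment:

```ruby
# Toy AST node (NOT Arel's real node classes): a name plus child nodes.
Node = Struct.new(:name, :children)

tree = Node.new("SelectStatement", [
  Node.new("Fields", [Node.new("name", []), Node.new("age", [])]),
  Node.new("Order",  [Node.new("id", [])]),
])

# Walk the tree and collect parent -> child edges.
def edges(node, acc = [])
  node.children.each do |child|
    acc << [node.name, child.name]
    edges(child, acc)
  end
  acc
end

dot = "digraph ast {\n" +
      edges(tree).map { |from, to| %(  "#{from}" -> "#{to}";) }.join("\n") +
      "\n}"
# Running this string through `dot -Tpng` would render the tree as a picture.
```

This is the same shape of output a dot visitor produces: one edge per parent/child pair in the tree.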
Maybe the tree will look something like this. Now, we can actually view these, or
00:39:15.210
get access to this AST, by calling some extremely internal APIs. Now,
00:39:20.670
these APIs are very internal, and they may break at some point in the future, but they are very handy when
00:39:26.190
you're trying to debug stuff, especially these kinds of performance issues. If we call .arel.ast on the relation, we
00:39:34.020
can actually get access to that AST object. Calling this will give us the top of that tree, and we can traverse
00:39:40.589
through the tree and check out the actual tree that represents our SQL statement. Or we can call to_dot, and this
00:39:48.180
prints out our tree in Graphviz dot format. Now, Graphviz dot format looks
00:39:54.000
like this: this is the dot output. It's actually a bit longer, this is
00:39:59.369
the whole thing, and of course it depends on the relation that you've created and the SQL statement that you're trying
00:40:04.470
to execute. Now, this isn't very helpful by itself, but if we take this dot format and run
00:40:10.740
it through the dot command-line tool, the Graphviz command-line tool, we can generate a PNG file that shows us the
00:40:16.710
actual structure of the SQL statement. So in this case we're querying for
00:40:21.990
cats, and the output of Graphviz will look like this. So this is the actual structure of the SQL statement that
00:40:29.609
we're going to be executing. If we zoom in, we can see the column names, so we have name and age, and we can also see the
00:40:38.849
table that we're executing against, as well as the columns and ordering information that we want to
00:40:45.480
query against as well. Now, our final step before sending the query to the database
00:40:51.630
is actually generating that SQL statement, that string. And to do
00:40:57.390
that, we're just taking this AST, this tree data structure, and we're gonna walk the tree and convert the tree into a
00:41:04.680
literal string that we'll send to the database. So we have a bit of code that's called a visitor; it's using what is
00:41:11.010
called the visitor pattern, and the ToSql visitor will walk through every one of these nodes in the tree, converting each
00:41:17.280
node into a string, each node into a bit of the SQL statement, and concatenating
00:41:23.730
all of them together, and then we send that string to the database.
00:41:29.059
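A minimal sketch of that visitor pattern in plain Ruby (toy node classes, no quoting or escaping, so not Arel's real ToSql visitor):

```ruby
# Toy AST nodes (hypothetical, not Arel's classes).
module Nodes
  In     = Struct.new(:column, :values)  # column IN (values)
  Select = Struct.new(:table, :where)    # SELECT * FROM table WHERE ...
end

# The visitor dispatches on the node's class, recursing into children and
# concatenating the SQL fragments each node contributes.
class ToSqlVisitor
  def visit(node)
    case node
    when Nodes::Select
      "SELECT * FROM #{node.table} WHERE #{visit(node.where)}"
    when Nodes::In
      "#{node.column} IN (#{node.values.join(', ')})"
    end
  end
end

ast = Nodes::Select.new("cats", Nodes::In.new("id", [1, 2, 3]))
ToSqlVisitor.new.visit(ast)  # => "SELECT * FROM cats WHERE id IN (1, 2, 3)"
```

Note that the visitor has to touch every node, so a tree with a thousand ID nodes means a thousand visits; that is exactly the cost that shows up in the slow profile.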
So this particular visitor here is generating a SQL statement, where what we saw in the
00:41:36.029
previous slide was actually a Graphviz visitor that walks the tree and outputs the tree to a dot file that
00:41:43.410
we can use with Graphviz. Now this is very interesting, because when we were
00:41:48.930
looking at our profiles earlier, our target was the Arel ToSql visitor; it is exactly that
00:41:55.109
code that's walking through this tree. Now, we know from our profiles that this is the bottleneck, and that's the
00:42:02.430
thing that we need to figure out: why is it taking so long? We know that the difference in performance is something
00:42:08.009
to do with the difference in the ASTs that are being generated, so let's compare the AST for each of these
00:42:13.859
queries. So we'll generate dot files for our case that uses sanitize_sql, our
00:42:20.279
fast case, and we'll generate a dot file for our case that doesn't use sanitize_sql, and we'll compare the two. So this
00:42:27.599
here is the output for the AST, the AST for the fast case; this is
00:42:33.449
where we use sanitize_sql. Now let's compare that to the AST for our slow
00:42:39.509
case, where we're not using sanitize_sql. This is what it looks like, and we can see here that this auto-sanitized
00:42:45.509
version is actually much different from our sanitize_sql version. If we add
00:42:50.819
more IDs to the auto-sanitized version, then this array will
00:42:57.209
keep growing and growing in size, and we know that it will have more and more nodes to process as we're
00:43:03.809
walking this tree and converting it into SQL. And this
00:43:09.269
actually coincides with the performance that we're seeing. If
00:43:14.789
we compare the performance between the manual sanitization and the auto sanitization,
00:43:21.209
we can see that our auto sanitization is spending about 29
00:43:26.609
percent of our time generating a SQL statement, where our manual sanitization is only spending 0.6% of our time. So we can
00:43:33.959
see that there's a huge difference between these two, but we can also see that there's a huge difference between the ASTs. So these numbers correlate; it
00:43:42.779
makes sense: we have a large AST, it takes a
00:43:49.509
long time to generate that query. So how can we fix this? Well, we
00:43:56.799
have a few options. We could speed up the compiler, that's one option, but again,
00:44:03.279
this would actually speed up both the fast and the slow cases, so our fast case would get faster, our slow case would get
00:44:09.910
faster too, but that's not what we want. We actually want to have our slow case be the same speed as our fast case. So
00:44:17.380
maybe instead we could do less work; our faster example is doing much less work,
00:44:23.109
so maybe there's a way that we could reduce the amount of work that our slow case is doing. Maybe there's a way
00:44:32.289
that we could take this where call and have it just call sanitize_sql rather than going through the entire
00:44:40.209
compiler. Now, if we trace through these profiling frames,
00:44:46.449
we'll actually find the point where these relations are being
00:44:52.029
converted into IDs, into an AST. This is the code that converts those relations into a list of IDs in
00:45:00.609
the AST, and we can see here that essentially what we're doing is creating an In node, and this In node
00:45:05.679
points at all the IDs that we're going to be using in the AST. It would be nice if, instead of
00:45:12.819
generating so many nodes, we could just generate a sanitized SQL part right here, just call sanitize_sql, and we
00:45:19.569
don't generate all those nodes to walk. Of course, it'll require a little bit of
00:45:24.609
refactoring to get it to work but I have to say like when I encounter some code
00:45:31.269
that I need to refactor, I always ask myself the same three questions: can I
00:45:37.569
refactor this code should I refactor this code and do I want my name on the
00:45:45.670
blame if I answer yes to all three of these questions then I'll refactor this
00:45:51.130
code otherwise I need a different strategy
00:46:07.630
Eileen and I worked on optimizing this code for about three months, and it seems like it should be a fairly
00:46:13.579
straightforward optimization, but unfortunately it wasn't, and that's because there are some subtle
00:46:18.770
differences between the behaviors of the two code examples. The first issue is
00:46:24.589
that we can't use the parameters hash in controllers with manual sanitization in the same way that
00:46:30.589
we would with auto sanitization. So, for example, here we have params that may have come in from the controller, and we
00:46:36.589
pass those parameters into Active Record. Now, if we look at the queries that are
00:46:42.049
generated by these two statements, we can see that they're totally different: the first one casts it to an ID, whereas the
00:46:49.460
second one uses it as a string and generates an IN clause with the string. So this
00:46:54.470
is because the AST has some context about the values that are being put into
00:47:01.520
the SQL statement. It knows, oh, we're
00:47:07.039
dealing with IDs here, IDs are integer types, so we need to cast this value to an integer type, and it does that for us
00:47:13.819
automatically, whereas the sanitize_sql version doesn't have that context. Another example is URL slugs: say we have
00:47:22.700
some type of text that comes after the ID, coming in from the controller. Now, when this ID comes in,
00:47:30.049
what will the query look like? If we look at the SQL statements, we'll see that using auto sanitization, it does the right
00:47:36.589
thing: it queries with the value of 100 for the ID, whereas the manual sanitization will not do that; it
00:47:42.410
generates an IN clause with that string literal. Now, again, this is because the
00:47:52.160
auto sanitization has that context: it knows that we need to convert these into IDs. Now, the third issue is very large
00:48:00.410
numbers so in this case we have a very large number and we're trying to do a query against the database with it if we
00:48:06.920
look at the SQL statement that's generated here, we'll see that in fact the first one doesn't do any query; it does nothing.
00:48:12.940
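Both the slug casting and the bounds check can be sketched like this (a toy, assuming a 64-bit signed integer id column; this is not Active Record's real type code):

```ruby
# Toy version of the context auto sanitization has about an integer id column.
# Assumption: ids fit a 64-bit signed integer, as in a BIGINT column.
INT64_RANGE = (-2**63)..(2**63 - 1)

def serialize_id(value)
  i = value.to_i                    # "100-snow-day".to_i => 100 (slug cast)
  INT64_RANGE.cover?(i) ? i : nil   # out of range => nil => skip the query
end

serialize_id("100-snow-day")  # => 100
serialize_id(2**64)           # => nil, so return an empty set, no query
```

With the cast and bounds check applied, out-of-range values never reach the database at all.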
The second one, though, does an IN statement with a very large number, and
00:48:18.710
this could raise an exception or something. And the reason that the first one doesn't do a query is
00:48:24.859
because it knows, well, this number is outside of the bounds that this particular type can handle, so I won't
00:48:31.579
do any query, I'll return an empty set. So, to deal with this and try to speed
00:48:38.119
this thing up, the solution that we came up with was to introduce a new type of node, called a precompiled
00:48:44.839
node. Part of the reason the compiler is slower than we'd like is because it's designed to deal with any type of node
00:48:51.140
in the tree, but when application code is calling the where function, we know that at that
00:48:56.720
point we're being called with a list of IDs, and we can do something a little bit more intelligent. So the compiler is much
00:49:03.680
more general, it handles the general cases, whereas at that call site, when we call where, we know
00:49:09.920
that we're only dealing with a list of IDs. So what we could do is, at the point
00:49:15.170
where we create that list, rather than creating that list of IDs, we can create a precompiled node that has the SQL
00:49:22.069
statement we need to insert. So normally our tree will contain an In node that
00:49:28.069
points to a bunch of IDs, and we know that walking those IDs is our bottleneck. So what we aim to do instead is introduce a
00:49:35.119
type of precompiled node. This precompiled node will point at the old In statement,
00:49:40.640
but it will also point at generated SQL that we can use when we're compiling our SQL statement. Now, the
00:49:47.869
reason we point at both of these is so that we can maintain backwards compatibility; for example, maybe we need
00:49:53.300
to change the In statement into a Not In statement, so we need to do some kind of transformation on the tree. Well, we were
00:49:59.540
able to maintain backwards compatibility and still get higher performance.
00:50:05.000
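Here is a toy sketch of that idea (hypothetical names; this is not the actual Rails patch): the precompiled node keeps the original node around for tree transformations, but hands the compiler a ready-made SQL fragment:

```ruby
# Toy precompiled node: carries the original node (for backwards-compatible
# tree transformations) plus the SQL fragment the compiler can emit directly.
Precompiled = Struct.new(:original_node, :sql)

class Visitor
  def visit(node)
    case node
    when Precompiled then node.sql        # fast path: skip walking children
    when Array       then node.join(", ") # slow path: visit every ID
    end
  end
end

ids  = (1..1000).to_a
node = Precompiled.new(ids, "id IN (#{ids.join(', ')})")
Visitor.new.visit(node)  # emits the precompiled SQL in a single step
```

The visitor does one string lookup instead of a thousand node visits, while anything that still needs the original In node can reach it through original_node.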
during the compilation phase, we would normally visit all of these In statements, all the nodes
00:50:11.859
inside of the tree, whereas now, when we encounter a precompiled node, instead of
00:50:17.359
visiting all of those, we can just say, OK, I want to use this precompiled statement. So Eileen and
00:50:24.190
I implemented this, and I want to share some of the benchmark results with you. So if we look at the performance difference, this benchmark just
00:50:30.490
measures how fast we can execute a where statement, and when we compare our branch to master, we'll see that
00:50:38.380
our branch is actually 30% faster, so we've been able to improve performance over master by thirty percent. Now,
00:50:46.329
we need to compare this with the original benchmark against manual sanitization, to see where we are
00:50:53.440
with regard to manual sanitization. So we compare that as well, using a thousand IDs, and if we run the benchmarks, we'll
00:51:00.339
see that it's still actually slower: it's 18 percent slower than doing manual
00:51:05.770
sanitization now I think this is good news because it used to be sixty percent
00:51:11.500
slower. Of course, we saw earlier that these two functions are doing something
00:51:17.200
a little bit different, so we're not exactly comparing apples to apples, and that could be the reason
00:51:24.099
for this performance difference. But I think that we still have a lot of work to do; this is a work in
00:51:31.720
progress and I think that we can actually get those numbers closer such that there is no performance difference
00:51:37.089
at all but for the time being I think this is a great improvement now I'd like to show
00:51:44.260
you all of the source code for this, the entire diff for fixing this particular
00:51:56.890
issue, but I think that would be going from zero to drawing the owl, when actually I want to communicate to you
00:51:56.890
things that would be more useful to you in your daily lives like the
00:52:02.650
architecture of active record or benchmarking and profiling techniques
00:52:07.779
those are the things that I actually want to communicate to you in this talk versus the
00:52:13.170
this one particular performance fix now I'd be remiss if I didn't show you some
00:52:19.109
of the actual patch so let's take a look at it this is the bit of the code that handles AST conversion where we are
00:52:27.059
looking at In, we're generating In nodes, and we went from this, prior to the precompiled-node change,
00:52:32.430
to this. Now, the first thing you'll notice about this code is it's more complicated; that is true, but you
00:52:39.299
can see under the hood that we're actually just calling this sanitize_sql array method, just like our fast example
00:52:44.700
was doing. Now, actually, this isn't merged into master yet, and the main issue with this particular
00:52:51.660
patch is that we have this neat function here called serialize_to. Now, we need to
00:52:57.660
figure out a better name for this method in fact what we had to do to get this
00:53:02.969
performance boost is, if you look at the internals of Active Record, you'll see that each type has a serialize method.
00:53:09.660
Now, that serialize method unfortunately doesn't just serialize: it
00:53:22.650
checks to see if the value is within bounds, and then it does the serialization. Well, in this patch we needed to be able
00:53:29.999
to split up that functionality: we needed to be able to do those checks, say,
00:53:35.519
are these things in bounds, and then do the serialization separately. Unfortunately, serialize is
00:53:43.109
a public method, so we can't change the behavior of that method. So, for the time being, we introduced a method
00:53:49.079
called serialize_to. We need to figure out a better name for that, but once we do, we'll merge this in, or we'll just
00:53:55.529
make this private or something, we'll figure something out, but for now we have this fun method name. All right, so let's
00:54:04.589
wrap this up. The first thing I want to say is that not all code is equal. We saw
00:54:09.989
those two bits of code at the very beginning, and we saw that they did basically the same thing, but in fact
00:54:17.009
they're not equal at all they handle different cases and this manual
00:54:23.609
sanitization route may not be applicable in the same cases as these regular active record calls so it's kind of
00:54:30.010
dangerous to say, okay, these two things are the same; they are probably not exactly the same. However,
00:54:36.900
similar behaviors should have similar performance, but these weren't even
00:54:42.580
close: the performance was very different. If they do approximately the same thing, the
00:54:48.790
performance should be approximately the same, but in this case they weren't even close at all; it was very different. So I
00:54:55.300
would consider this poor performance to be a bug. I think that poor performance is a bug, and if you encounter something like
00:55:02.410
this with Rails, you should absolutely report it as an issue, rather than saying it's, you know, just a performance protip.
00:55:08.530
It's cool to have performance pro tips like that, but really we should be working to make the framework faster. I
00:55:14.770
mean if you really think about it like poor performance in this economy we
00:55:20.470
can't afford that. And you're right, we can't afford that; we need to be fixing these things. So, to recap: today we looked at really
00:55:28.600
great ways to procrastinate, and I verified they're really great. We looked at tools
00:55:33.700
for performance, Active Record internals and design, and I hope that you all enjoyed my very first opening keynote at
00:55:40.630
railsconf thank you so much and I really hope that we can actually see each other
00:55:46.060
in person next year thank you again