RailsConf 2011

Double Dream Hands: So Intense!

RailsConf 2011: Aaron Patterson, "Double Dream Hands: So Intense!"

00:00:02.440 but uh before I begin I want to tell everybody I have a lot of love for the Ruby community and I want to say thanks
00:00:08.639 to a few people first um first off I want to say thanks to John he's been doing a lot of good work on uh Active
00:00:16.520 Record uh relation code so you should all tweet at him and say thank you um
00:00:22.720 the next person I want to say thank you to is um well not really thank you I just want to show him some love here is
00:00:32.040 it doesn't look like love uh this is this is a picture of this is a picture of Michael Feathers and he told me that
00:00:38.680 he put my picture in one of his presentations so uh I told him all right I'm going to put a picture of you in one
00:00:44.719 of my presentations so um I am going to give him a
00:00:52.960 kiss so thank you Michael Feathers uh the next thing I want to say is um I
00:00:59.239 also love José Valim a lot I tried to kiss him the other night and he was not
00:01:05.040 too excited about it um so this talk is called Double
00:01:12.439 Dream Hands: So Intense what does it mean I I don't know
00:01:17.759 uh hopefully you'll find out anyway so now I'm beginning now you can start the
00:01:23.000 clock at 45 uh all right uh oh my God happy RailsConf
00:01:35.960 welcome um in case you got lost you don't know where you are my name is Aaron Patterson
00:01:43.200 uh you can find me on the internet as tenderlove um I work for a giant
00:01:48.680 corporation called AT&T uh I know most of you here are probably working for
00:01:55.159 startups um I am an open source developer at AT&T so what I do there is
00:02:00.640 I uh work on the tools that we use in order to improve the developers and hopefully improve our bottom line um I
00:02:08.759 actually enjoy working for a giant Corporation um and I want to tell you
00:02:14.000 why like there's there's some things about it that make me extremely excited and the thing that I'm going to explain
00:02:19.040 to you today is um how I file expense
00:02:24.560 reports so it's actually a really awesome process this is amazing technology the way that it works is I
00:02:31.000 have to VPN in so I VPN into our our thing I work from home uh full-time so I
00:02:37.080 VPN into our Network and then I have to like fill out a form and click on all the things that I um that I spent money
00:02:45.080 on and then I have to fax my receipts like it gives me it gives me a cover
00:02:51.239 page and then I have to fax that in with a bunch of receipts and then um on the other end of the fax machine it like
00:02:57.120 scans it in as a PDF right so um so once it's scanned in as a PDF
00:03:04.000 like then I can see that and verify my receipts and I'm like good to go right I submit the expense report but it's kind
00:03:09.400 of a pain because I work from home so I don't really have a fax machine so I keep spending like two bucks or whatever
00:03:16.319 on the little like upload a PDF and it'll fax it off for you
00:03:24.319 so since I since I work at a giant company I went into the little we have a phone tool and I went in and found like
00:03:30.680 the accounts payable person I emailed her and I said Hey how do I like I work from home I don't have a fax machine how
00:03:36.840 do I get receipts to you like I don't you know what do I do and she says okay just you know put everything together in
00:03:43.560 a PDF and then email it off to this email address and you'll be good to go so I'm like okay great next time I had
00:03:49.959 to file an expense report I sent an email to this email address about 3 minutes later I get a response saying
00:03:56.360 fax sent
00:04:03.319 so so I emailed an email address to send a
00:04:09.640 fax I emailed a PDF to a mail to fax thing so that it gets faxed to a place
00:04:15.200 to get scanned back in as a PDF this technology it's
00:04:22.080 amazing so anyway I will continue um I that is why I enjoy working at a giant
00:04:29.280 company um I get very nervous up on stage and
00:04:35.120 José told me one thing that you need to do while you're up on stage is just ask yourself what would Freddie Mercury do so
00:04:41.000 I always include this in my presentations to get myself you know less nervous and then even worse even
00:04:47.759 worse is that I am I'm delivering a keynote and I went and looked at I went and looked at all the other keynote
00:04:53.880 speakers and we have people like Dick Gabriel and Guy Steele DHH these people like change the world they change the
00:05:00.800 world of computer science they change the world of business I mean more CTOs we
00:05:06.479 have CTOs that are keynoting Dr Nick he's a freaking doctor right Corey Haines
00:05:11.960 Corey Haines is so awesome that his title is his
00:05:17.759 name like that is amazing it's amazing I'm so
00:05:25.759 jealous we have an adventure adviser I don't know what that is I I am merely a senior software
00:05:33.440 engineer how can I be up here it doesn't make sense so I'm worrying about this
00:05:40.080 worrying about this and all of a sudden I get an email or an IM from Chad he
00:05:45.800 says hey do you want to hear a story about um sex in the tech industry and I
00:05:51.960 was like uh okay sure he's like well you know that
00:05:57.639 email that O'Reilly sent out out about um rails this is this is the email here
00:06:03.919 uh well there's click tracking on all of the pictures of the keynote
00:06:11.479 speakers and your head was clicked like by far
00:06:17.160 and away it was clicked more than anybody else so I just want to show you guys
00:06:24.039 like what you clicked on like what is wrong with you
00:06:30.319 you why did you click on
00:06:36.560 me so anyway thinking about this like what you know what is it that can separate me from the other keynote
00:06:42.039 speakers is um I know about ruby and I know about rails I'm the only person on
00:06:47.680 the Ruby core team and on the rails core team and the other thing that lets me stand out from the other keynote
00:06:54.080 speakers is that I look ridiculous
00:07:00.879 so with that in mind we have three items on our agenda I'm going to talk about
00:07:06.280 new features in Rails 3.1 uh I'm going to have a little bit of real talk and then
00:07:11.960 we're going to look at some uh development pro tips uh and I have to warn you that this
00:07:17.840 talk will be very technical I'm going to have this will be a very technical talk so prepare yourself uh this will also be
00:07:25.680 a very critical talk I'm going to be critical of our framework and and I want to take a minute here to say you know
00:07:32.080 there's a difference between being critical and like being a hater so I'm going to say some bad things but I'm
00:07:37.840 going to talk about how we're looking to improve them and how we'll make our awesome framework even more awesome so
00:07:45.159 on with the new features um we're going to look at the stuff that's been added to Rails 3.1 over
00:07:52.000 the past year or so and before we get into that I want to talk a little bit about my uh strategy for modifying APIs
00:08:00.400 my particular strategy and just the way I like to modify APIs is to uh either
00:08:06.639 not change them so hopefully implement exactly the same interface that you're using today but give you speed
00:08:13.720 improvements and just better features with them or extend them so that you can make them better but always or use
00:08:19.240 better features but always make sure that they're backwards compatible so that's my my strategy and uh we're going
00:08:25.319 to look at some of the features that incorporate that but we're also going to contrast them against some of the features that were added in Rails 3.1 that don't
00:08:32.120 follow that particular strategy um the first thing I want to look at though is
00:08:38.560 uh we got a bunch of new generators in rails and since I can't see anybody here
00:08:44.640 I'm not going to ask for a show of hands I'll just assume all of you work for startup companies well probably 90% um
00:08:51.680 the new generator that we included is rails generate vc so when you're running
00:08:57.839 out of money just run this thing boom money apparently it only works in San
00:09:08.399 Francisco the other thing that um DHH has committed this the other day it's called halts? you give it a
00:09:15.079 function and it tells you whether or not that function finishes it's pretty
00:09:21.839 amazing I have to say this is going to change the face of computer
00:09:27.279 science uh but really we're going to look at we're going to look at prepared statements um I added to Rails 3.1
00:09:34.720 prepared statement caching support um and what that is is typically we send a
00:09:40.560 SQL statement to our database that looks like this but instead now we're going to start sending statements that look like
00:09:46.360 this so we'll put in placeholders for the values that we actually want to want to use oh my God I only have 37 minutes
00:09:54.120 so normally we have a rails application we generate a SQL statement send it off to the database we get records back and
00:10:01.079 we just keep doing that over and over and over again in our application in the new world what we're going to be doing
00:10:06.360 oh that takes four steps four steps on the database side uh the database has to parse our SQL statement then it has to
00:10:13.720 come up with a query plan then it has to execute that query plan and finally it has to return results to to
00:10:19.959 us right so in the new world what we're going to do is we're going to send a statement off to the database and say
00:10:25.519 hey I'm going to execute this in the future I don't know what the values are going to be in the future but I'm I'm
00:10:31.079 going to execute a statement like this the database says okay great passes you back a token and what the database does
00:10:37.959 is it comes up it parses the SQL statement and it comes up with a query plan and then caches that right then
00:10:44.279 when we want to make subsequent queries to the database we pass along that token along with the actual value that we want
00:10:50.279 to put in to the SQL statement and then the database Returns the records to us so these four steps become two so we
00:10:59.440 only have to we only have to prepare this statement once and we keep that token and keep reusing it over and over again and in subsequent times to the
00:11:06.000 database we uh the database only has to execute the query plan and return
00:11:12.120 results to us so it's typically faster and I want to share with you the
00:11:17.600 impact on the three different database adapters that we have um SQLite
00:11:23.720 PostgreSQL and MySQL with SQLite the performance looked
00:11:29.279 a little bit like this it's linear performance I just like to show linear performance graphs to make sure that I
00:11:34.399 didn't [ __ ] anything up so um really what we want to look at
00:11:40.279 is uh the number of queries per second we were able to perform so on average before I implemented prepared statement
00:11:47.320 caching we could execute about 8,600 queries per second after implementing
00:11:52.519 statement caching we're doing almost 13,000 queries per second and this is on a simple statement so if we say like the
00:11:58.959 most simple statement you could possibly say select star from wherever where ID equals something and what I wanted to do
00:12:04.560 is increase the complexity of the SQL statement and benchmark that so I could look at how the performance changed as
00:12:13.079 our queries became more complex and oh right so we have a Delta here of 4,300
00:12:18.480 queries per second to the good now if we try a more complex SQL statement we're still getting linear performance but you
00:12:24.320 can see that a larger a much larger Delta between the two we went from 339
00:12:31.399 queries per second up to a little over 4,000 so very good improvements as far
00:12:37.279 as SQLite is concerned PostgreSQL still linear performance thank God it didn't
00:12:42.519 screw anything up uh we improved from 4,600 up to 5,500 queries per second so
00:12:50.040 I'm happy very happy with those results and as we increase the complexity for uh statements on
00:12:56.480 Postgres it still remains we get even better performance so this makes me
00:13:02.000 happy um MySQL the picture is not so
00:13:07.760 awesome so we remain linear but if we look at the number of queries per second
00:13:13.360 that we're actually performing it degraded and you know theoretically when
00:13:19.360 you talk about the amount of work that we had to do like we're I explained to you we had four steps we were able to
00:13:24.720 reduce those to two back over on the database you'd think all databases will go fast faster but it's actually not
00:13:30.880 true with MySQL and I'll talk about why in a minute so it went
00:13:35.920 down um but as we increased the SQL complexity we actually improved
00:13:42.920 results okay so why is this why did MySQL get slower
00:13:52.120 and the reason is because when you send that prepared statement up to MySQL it does no query planning at all it doesn't
00:13:58.240 do any query planning in advance so it doesn't actually perform the planning until you actually execute the
00:14:05.120 statement and for some reason the MySQL API when you do a normal simple query it
00:14:12.079 takes a different code path than when you do a prepared statement right there are two completely different code paths
00:14:18.639 and when you do a prepared statement it actually requires two network round trips so on MySQL the only time you're
00:14:23.959 actually saving is SQL parse time right so you have to judge what whether the
00:14:29.839 amount of time that it takes to parse that SQL statement will overcome the amount of time for the two network round
00:14:35.759 trips so doing the performance profiling on MySQL is a little bit more
00:14:43.680 difficult another interesting thing is that when we use the prepared statement
00:14:48.959 API we get parsed dates back so the benchmarks that I'm showing you
00:14:55.040 aren't taking into account any of these any of these return values we actually get casted values back from MySQL so
00:15:01.800 when you ask for an integer it actually gives you back a C integer and when you ask for a date it gives you back a struct that contains all the date parts
00:15:08.880 but this is only for the prepared statement API with the normal querying
00:15:14.000 API you just get strings back and we have to pay parse-time costs for that so
00:15:19.360 unfortunately it's difficult it's very difficult to say whether prepared statements are slower or not for MySQL
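
The prepare-once, execute-many flow described above can be sketched in plain Ruby. This is only a toy model: FakeDatabase and StatementCache are hypothetical names invented for illustration, no real database is involved, and it is not how Active Record implements the feature. It just shows that the parse and plan step happens once per distinct SQL string while the token is reused for every later call.

```ruby
# Toy model of prepared statement caching. FakeDatabase and StatementCache
# are hypothetical names for illustration, not Active Record classes.
class FakeDatabase
  attr_reader :parse_count

  def initialize
    @plans = {}
    @parse_count = 0
  end

  # "Parse and plan" the statement once and hand back a token.
  def prepare(sql)
    @parse_count += 1
    token = "stmt#{@plans.size + 1}"
    @plans[token] = sql
    token
  end

  # Run an already prepared statement with concrete values.
  # No parsing or planning happens here.
  def execute(token, binds)
    { sql: @plans.fetch(token), binds: binds }
  end
end

class StatementCache
  def initialize(db)
    @db = db
    @tokens = {}
  end

  # Prepare on first use of a SQL string, reuse the token afterwards.
  def query(sql, binds)
    token = @tokens[sql] ||= @db.prepare(sql)
    @db.execute(token, binds)
  end
end

db = FakeDatabase.new
cache = StatementCache.new(db)
sql = "SELECT * FROM users WHERE id = ?"

results = [1, 2, 3].map { |id| cache.query(sql, [id]) }
puts "executed #{results.size} queries, parsed #{db.parse_count} time(s)"
```

Whether this saves time on a real server depends on the adapter, as the MySQL numbers in the talk show.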
00:15:29.279 so how do we use this in rails what are the API changes required in order to use prepared statements in rails before we
00:15:36.399 would say oh you know User.find(1) we want to find our user with a particular ID and the API changes that I made to
00:15:43.800 enable prepared statement caching on Rails the front end API looks like
00:15:51.160 this exactly the same so you get speed improvements for
00:15:57.160 free yay um the next thing I want to look at is
00:16:02.360 serialized attributes um serialized attributes it
00:16:07.600 looks like this I don't know if any of you have used this before but basically what it does is you are able to save uh
00:16:13.759 arbitrary Ruby data structures into the database so you can say like I want to save this hash of stuff like maybe
00:16:20.199 somebody's preferences or something you can save that into the database and it saves it as
00:16:25.720 YAML so you say like oh assign this hash save it off saves to the database as a
00:16:31.959 YAML and I was looking at this working on an application I was like why YAML
00:16:37.759 like why do we why do we save this as YAML and not as like JSON or you know
00:16:43.639 something else so I started looking in the Rails source and found that the reason we were saving as YAML was
00:16:50.560 because Active Record was very tightly coupled to YAML we were specifically saying okay
00:16:57.399 we're going to go dump this out as YAML in the code so what I decided to do is just refactor that such that YAML is not
00:17:05.760 coupled to Active Record and now we can actually configure the storage
00:17:10.959 strategies with whatever storage strategy we want and it looks like this we're able to supply a second parameter
00:17:17.000 to serialize with the strategy that we want to use to emit to the database right so we
00:17:26.919 have to implement a coder and that coder encodes our data to the database and what does that API look
00:17:33.160 like you just have to implement a load and a dump provide that to serialize on
00:17:39.160 load that's when you get data back from the database when you're dumping that's when you're dumping data to the database
00:17:44.880 so it's very easy we can implement Base64 storage JSON
00:17:50.120 storage Marshal storage XML storage right everyone wants to store XML in
00:17:55.520 their database yeah love that who else is on on Oracle yeah I see all those
00:18:02.400 hands with these Bright Lights
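
A coder of the kind described above only needs a load and a dump method. The sketch below is a hypothetical JSON coder written for illustration (JsonCoder is not a class shipped with Rails), and the model wiring appears only in a comment because Active Record is not loaded here.

```ruby
require "json"

# Hypothetical coder for illustration. Any object that responds to
# load and dump can play this role.
class JsonCoder
  # Called when writing: Ruby object -> string stored in the column.
  def dump(obj)
    JSON.generate(obj)
  end

  # Called when reading: stored string -> Ruby object.
  # An empty or NULL column is treated as an empty hash (an assumption
  # made for this sketch).
  def load(str)
    return {} if str.nil? || str.empty?
    JSON.parse(str)
  end
end

# In a model, following the talk's description of passing the coder as
# the second parameter, this would look roughly like:
#
#   class User < ActiveRecord::Base
#     serialize :preferences, JsonCoder.new
#   end

coder = JsonCoder.new
stored = coder.dump("favorite_color" => "green")
restored = coder.load(stored)
puts stored
puts restored["favorite_color"]
```

Swapping the body of dump and load is all it takes to change the storage format, which is the point of decoupling the column from YAML.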
00:18:08.080 oh I want to announce official NoSQL support in Rails everybody is excited
00:18:13.600 about
00:18:21.799 that I forgot to mention that some of the things I say on stage might not be true
00:18:31.200 but I guess I I guess I should probably show it off here we'll we'll create 26 users with preference hashes right we're
00:18:37.960 going to store unstructured data in our database create a bunch of users with preference hashes and then what we're
00:18:43.480 going to do is we're going to pull this data back out and modify it so we have our 26
00:18:48.559 users um what did I do next I forgot we find a user and we get preferences back
00:18:54.039 and we can modify update these preferences modify them and then save it back to the database but what we're
00:18:59.480 going to do in rails is we're going to find by somebody's preferences yes we'll see who has their
00:19:07.799 favorite color set to Green I'm the only one that has my favorite color set to Green in the database so hopefully we'll only get one record
00:19:14.640 back and yes unstructured data stored in the database and we can query on
00:19:21.080 it
00:19:26.360 yes unfortunately there are a couple fairly important
00:19:36.320 caveats the important caveats are it only works on Postgres
00:19:44.480 and the other important caveat is you might hurt yourself using
00:19:55.000 this so if you want the code to actually do this here it is right here quickly write it down no uh I will I will
00:20:02.559 provide the slides on my Twitter or O'Reilly will provide them too and you can get this code but don't use this code you might notice that there is an eval
00:20:09.000 like right in the middle there you probably don't want to do
00:20:14.159 that but it can be done um so another interesting thing about this
00:20:20.080 serialization strategy API is that um we can think of other things besides like
00:20:28.080 Marshal and uh XML and YAML and all those things we can think about other things as
00:20:33.600 serialization strategies one of those things that we can think about is
00:20:40.039 encryption right so we can Implement a coder that saves something encrypted to
00:20:45.840 the database we can use bcrypt or we can use whatever but we're allowed to
00:20:51.280 configure it with whatever coder that we want to use okay so this this particular
00:20:56.880 coder will encrypt our our passwords to the database and we can use it and say
00:21:01.919 oh some crazy column where we store our password like I suck at spelling and I changed
00:21:07.080 our password column to password with one S but I'm too lazy to change it cuz you
00:21:13.000 know it takes forever at a giant company and I have to fax
00:21:21.159 stuff but now I just email it which faxes it for me huge productivity
00:21:28.679 Improvement really but we can configure things and save stuff to the database encrypted like this so it's not just
00:21:36.520 for uh uh one two-way bidirectional data
00:21:42.760 saving uh and while I was thinking about this API one thing that was very important to me was you know looking at
00:21:48.760 the APIs of different things that dump data in the standard library we have
00:21:54.240 YAML JSON and Marshal and their API for dumping and loading data is very
00:22:00.320 consistent so I wanted to maintain this consistency because my eventual goal would be that you could say you know
00:22:06.240 serialize :foo and then just pass in the YAML constant I think that would be very awesome unfortunately there are some
00:22:14.000 problems inside of Active Record that keep us from doing that which is why you may have noticed the weird like return
00:22:19.520 unless something or other there and there um but it would be very cool if we could just say all right straight up
00:22:25.279 dump JSON in there but the important thing to me was that uh the API was the
00:22:30.480 API that I defined was consistent with the stuff that we see in the standard library I think this is important for
00:22:35.559 many people's APIs so that you don't have to think about how to use this it already makes sense you know how to use
00:22:41.640 YAML and JSON and Marshal from uh from the standard library so you should know how to use this API already I don't have
00:22:48.400 to teach you anything another important decision for me about this particular API was that
00:22:55.440 you get to choose your serialization strategy so if you know you're on
00:23:01.080 Postgres and you can store stuff in hstore you get to choose using hstore and it just
00:23:07.799 works important lesson I learned from this was that good abstractions yield good features so I didn't specifically
00:23:15.720 go into Active Record and say hey I want to make sure that you can store stuff as JSON or bcrypt or whatever I went in
00:23:23.000 there and said why are we so tightly coupled to YAML why is the API so hardcoded to use
00:23:31.080 YAML we should be able to configure this with whatever cuz someday maybe we'll want to change off of YAML right and
00:23:38.480 after after refactoring the internals of Active Record such that I could do that this feature fell out so it wasn't
00:23:46.360 specifically that I wanted to give the world this feature it just happened to be that way after doing good
00:23:52.159 abstractions in my code another important lesson for me was that consistency yields freedom and
00:23:59.520 flexibility since we're consistent with this API you're free to choose you can choose any of these things that are in
00:24:06.880 standard Library eventually with these small caveats but this seemed very important to me now the next thing I
00:24:14.200 want to do is I want to contrast that with um I don't know if this was a controversial feature but it kind of
00:24:20.919 annoyed me has_secure_password um has_secure_password what it
00:24:28.159 does is it says all right we declare has_secure_password in our model and now our model has a password field and
00:24:36.520 it gets automatically encrypted and put into the database so we can say you know find a
00:24:43.279 user set their password this is my password on everything so don't use it
00:24:49.840 please save that off unfortunately so its advantages are it introduces this
00:24:55.240 new method and it saves stuff to a column called password_digest and it uses bcrypt bcrypt is secure it's good
00:25:03.039 use it please um but what I don't like about it I feel its advantages are also
00:25:09.440 its disadvantages when you see that you don't know that it stores it in that particular column there's there's magic
00:25:15.320 behind this method it's making assumptions for you that you don't know anything about that kind of bothers me
00:25:22.000 the other thing that bothers me about it is I have no way to reuse it it only works in this one particular particular
00:25:28.799 use case all of my applications at work can't use this because we don't store
00:25:34.840 our passwords in the same column that that does we don't necessarily use bcrypt we might be using like SHA-256
00:25:42.320 or something crazy like that it doesn't work for me at all so when I see this I
00:25:48.760 just think to myself that when I'm designing APIs I need to go green I need to design something that I
00:25:55.760 can reuse when I think about APIs and I think about implementing stuff for other features other people I think to myself
00:26:02.520 how can I reuse this how can I reuse this in my own applications how can I reuse this in my own code and the
00:26:08.760 problem for me with that feature is I can't and the other kind of strange thing to
00:26:16.000 me is I showed you earlier in our serialization strategies we were able to put together something that saved our
00:26:22.120 code as bcrypt or whatever we wanted to so you might assume that the source for
00:26:27.640 has_secure_password just uses serialization strategies guess what it
00:26:33.919 does not so if you have time and would like to contribute to Rails core I
00:26:39.279 suggest that you refactor that method to use serialization strategies I would appreciate
00:26:45.279 that so the next thing I want to talk about is streaming responses um
00:26:50.960 streaming responses are a new thing in Rails 3.1 and what we're going to be doing there is uh or the idea behind streaming
00:26:56.880 responses is that somebody can make a request to your web server and you immediately get
00:27:02.320 data right uh the data is it's sent
00:27:07.799 before it's completely done currently in Rails 3.0 we process all of our ERB buffer it all up in memory and then spit
00:27:14.559 it out to you right what we want to do in Rails 3.1 is we want to make it so that you can it spits out data as soon
00:27:22.000 as it's ready so we process some of the ERB spit it out process more keep spitting it out so so that we can be
00:27:28.440 spending time processing on the server while our client is downloading data and maybe going off and fetching assets in
00:27:34.960 parallel okay and to understand how specifically how this is implemented we
00:27:40.120 need to look at the Rack API this is a simple very simple Rack application you have to implement a method called call
00:27:46.720 gets an environment and you have to return a triple and this is very important please note in later slides I
00:27:53.760 will talk about this you have to return a triple from call and the third thing in the triple is the body and it has to
00:27:59.840 be something that you can iterate over okay so what we do in Rails 3.0.x is we
00:28:06.720 say all right append to this body render our content render our layout keep appending to this body and then return
00:28:12.080 the triple from our generator which is our application and then that bubbles all the way up our Rack middleware stack
00:28:18.760 and then gets spit out to the socket what we want to do in Rails 3.1 is
00:28:24.039 we need to do delayed evaluation we need to actually wait until the socket is until we're back up in the socket and we
00:28:30.519 need to spit stuff out before we start processing so we want to delay that evaluation before we want to delay that
00:28:36.559 evaluation and return up the stack how we do that is we take advantage of body.each so we know that Rack calls each on
00:28:43.440 the body so what we want to do is we want to actually start generating data inside there so we we postpone our
00:28:48.960 calculations until each is called on the body so in Rails 3.1 this is absolutely not what the source looks like I have
00:28:55.159 simplified it completely but it's something similar to this where we create a new object and it does
00:29:01.960 processing inside of each and we just pass that up the stack and our evaluation is delayed until we actually
00:29:08.919 write out to the socket but I want to talk a little bit about middleware now there's a problem
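
The Rack contract and the delayed-evaluation trick described above can be sketched without any gems. StreamingBody and its three chunks are hypothetical, and as the talk itself says of its slide, this is not what the Rails source looks like. The application returns the triple immediately, and the rendering work only happens when the caller iterates over the body.

```ruby
# A body that defers its work until someone iterates over it.
# StreamingBody and its chunks are hypothetical, for illustration only.
class StreamingBody
  def initialize(log)
    @log = log
  end

  # Rack calls each on the body; the rendering happens here, chunk by chunk.
  def each
    %w[<html> <body>hello</body> </html>].each do |chunk|
      @log << "rendered #{chunk}"
      yield chunk
    end
  end
end

log = []

# A minimal Rack-style application: call takes an environment and
# returns a triple of status, headers, and an iterable body.
app = lambda do |env|
  [200, { "Content-Type" => "text/html" }, StreamingBody.new(log)]
end

status, headers, body = app.call({})
log_before = log.dup # still empty: nothing has been rendered yet

# This loop stands in for the web server writing to the socket.
output = +""
body.each { |chunk| output << chunk }

puts "status #{status}, rendered before iteration: #{log_before.size}"
puts output
```

Everything between the application and the socket now sees a body that has not been evaluated yet, which is what causes the middleware trouble discussed next.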
00:29:15.279 there was a huge problem for me implementing this with actually José did the hard
00:29:20.799 work I just dealt with the middleware um the problem is middleware is a chain so
00:29:27.360 it's a linked list we have a bunch of middleware one points to the next points to the next and then finally we get to an application right and say we have a
00:29:34.720 request timer Connection Manager application our middleware chain is much larger than that but I want to keep it
00:29:40.679 simple and when we do um when a request is processed the web server calls the
00:29:47.519 tip of the linked list and that just keeps calling down the list until finally you get to a generator which is
00:29:53.760 our application and then we return back up from that and that gets to the web server and then we start spitting stuff
00:30:01.080 out so let's think about the way that the
00:30:06.600 connection manager works specifically it opens a database
00:30:11.799 connection then it delegates off to the next middleware in the chain and it saves off the return value from that
00:30:18.880 middleware that it delegated to then it closes off the database connection and then it Returns the
00:30:25.799 values that it saved off when delegating
00:30:32.480 now I don't know if you've spotted the problem with this but we have delayed
00:30:37.720 evaluation until we get back up to the IO so when we delegate by the time that the delegate has returned we haven't
00:30:44.799 actually done any processing we haven't actually done anything so then we close off the
00:30:51.080 database connection and return those return values up to the up to the web server the web server calls each and
00:30:56.279 starts processing the ERB and we have no database connection then you get these exceptions
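
The failure described above can be reproduced with a few stand-in classes. FakeConnection, LazyBody, and NaiveConnectionManager are hypothetical names for this sketch, not Rails classes. The middleware opens the connection, delegates, closes the connection, and returns, so by the time the "server" iterates over the lazy body the connection is already gone.

```ruby
# Stand-in for a database connection (hypothetical).
class FakeConnection
  attr_reader :open

  def initialize
    @open = false
  end

  def open!
    @open = true
  end

  def close!
    @open = false
  end

  def query
    raise "no database connection" unless @open
    "row"
  end
end

# A body that only touches the database when it is iterated.
class LazyBody
  def initialize(conn)
    @conn = conn
  end

  def each
    yield @conn.query
  end
end

# Open, delegate, close, return: fine for eager bodies, broken for lazy ones.
class NaiveConnectionManager
  def initialize(app, conn)
    @app = app
    @conn = conn
  end

  def call(env)
    @conn.open!
    response = @app.call(env) # the body has not been evaluated yet
    @conn.close!
    response
  end
end

conn = FakeConnection.new
app = ->(env) { [200, {}, LazyBody.new(conn)] }
stack = NaiveConnectionManager.new(app, conn)

_status, _headers, body = stack.call({})

# The "web server" iterates after the middleware has already returned.
error = begin
  body.each { |chunk| chunk }
  nil
rescue RuntimeError => e
  e.message
end

puts error
```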
00:31:03.080 and you're like what what happened so how do we fix this how do we
00:31:09.320 deal with this the solution and the way to deal with that is we need to take advantage of the fact that Rack will call close on the
00:31:15.440 body okay when it's done iterating over the body it'll actually call close on it
00:31:20.519 so what we do is we have to proxy each body that gets returned so we implement this proxy
00:31:27.679 class that points off at a delegate a delegate body iterates over the delegate and then finally when close
00:31:34.240 is called then we close off our database connection so our Connection Manager
00:31:40.559 then becomes all right when call gets called we open our database connection we delegate off to the application we
00:31:45.880 Implement our or we instantiate our body proxy and then we return we don't actually close the database connection
00:31:51.960 at this point oh I I know you can't read this
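
The body proxy fix can be sketched the same way. The BodyProxy below is a simplified, hypothetical version written for this example, and FakeConnection and LazyBody are the same kind of stand-ins as before. The middleware no longer closes the connection itself; it wraps the body, and the cleanup runs when close is called after iteration.

```ruby
class FakeConnection
  attr_reader :open

  def initialize
    @open = false
  end

  def open!
    @open = true
  end

  def close!
    @open = false
  end

  def query
    raise "no database connection" unless @open
    "row"
  end
end

class LazyBody
  def initialize(conn)
    @conn = conn
  end

  def each
    yield @conn.query
  end
end

# Wraps a body, forwards each to it, and runs a callback on close.
class BodyProxy
  def initialize(body, &on_close)
    @body = body
    @on_close = on_close
  end

  def each(&block)
    @body.each(&block)
  end

  def close
    @body.close if @body.respond_to?(:close)
  ensure
    @on_close.call
  end
end

class ConnectionManager
  def initialize(app, conn)
    @app = app
    @conn = conn
  end

  def call(env)
    @conn.open!
    status, headers, body = @app.call(env)
    # Do not close here; defer until the server is done with the body.
    [status, headers, BodyProxy.new(body) { @conn.close! }]
  end
end

conn = FakeConnection.new
app = ->(env) { [200, {}, LazyBody.new(conn)] }
stack = ConnectionManager.new(app, conn)

_status, _headers, body = stack.call({})

# The "web server": iterate over the body, then call close on it.
chunks = []
body.each { |chunk| chunks << chunk }
open_while_streaming = conn.open
body.close

puts chunks.inspect
puts "open while streaming: #{open_while_streaming}, open after close: #{conn.open}"
```

The cost the talk complains about is visible here: every middleware that needs cleanup after the response has to wrap the body in a proxy like this.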
00:31:58.840 uh this is our middleware stack when you do a new application this
00:32:04.840 is actually a little bit old and we'll we'll just um go through this
00:32:10.519 slowly oh yeah look at that stack so awesome so I had to go through
00:32:17.279 all of these objects look at their implementations and find things that would break because of this delayed
00:32:22.360 evaluation and for every one of those middleware objects that would break for delayed evaluation we have to implement
00:32:29.159 a body proxy no don't clap for this this is
00:32:35.559 terrible this is horrible body body
00:32:42.000 proxy I had to go through about I don't know 25 middlewares probably more and
00:32:47.240 find all of these where we would break and it really kind of annoyed me
00:32:53.600 because if you think about it we have we have a bunch of filters and then we have a generator at the end right but our
00:32:58.880 generator isn't actually a generator it's just returning a triple it's returning a generator we've turned the
00:33:04.919 body the body is a generator now so we have a generator that returns a generator this seems broken to
00:33:11.919 me and I think the way that we need to fix this is we need to embrace diversity
00:33:18.320 we can't jam all these different types of content processors into one thing
00:33:23.480 called middleware it doesn't fit we need to look at the differences between the way these particular things
00:33:30.360 process content and split them up and if we look at it there's really three basic things that we do we have something that
00:33:37.440 generates content which is typically our application and we have filters which actually modify the content for example
00:33:43.360 if we have a gzip filter we actually gzip the content before sending it out and then we have things that deal with
00:33:48.960 life cycle handles for example the database connection things we were looking at the database connection the
00:33:54.399 connection manager doesn't care what happened between the request and response it just wants to open the
00:34:00.440 database connection when it starts and when it's done it just wants to close the database connection it does not care
00:34:05.919 what happened in between all right it's time for some
00:34:12.800 real talk oh this is the other part where I get to contradict Dr Nic Rails is getting
00:34:20.240 slower nobody's clapping come on so Rails is getting slower and Rails
00:34:27.760 is getting slower because it's doing more work okay uh I went and benchmarked Rails 2.3
00:34:34.000 versus Rails 3.0 our x-axis is the number of requests that I performed and
00:34:39.480 the y-axis is the amount of time that it took and for these benchmarks I cut out the web server part so I actually just
00:34:44.679 wrote Ruby code and benchmarked our Rack stack I wanted to see how many requests per second that I could push through the
00:34:51.280 system and our growth is about linear here when we do this but when we look at our average requests per second we'll
00:34:57.240 see that Rails 2.3 is significantly faster than Rails 3.0 and unfortunately we're doing a bit
00:35:04.520 more work like I described with all these body proxies in our middleware for Rails 3.1 I didn't include that on here
00:35:11.440 because when I benchmarked it it was only like it was basically exactly the same very close to the same and I just
00:35:17.359 thought it would clutter the graph so we I just want to compare 2.3 and
00:35:22.599 3.0 so it's not really fair to compare these
00:35:28.520 two and the reason it's not fair is because our middleware increased we're
00:35:35.119 doing more work in our middleware in Rails 3 than we were doing in Rails 2.3
00:35:40.720 okay but the thing that bothers me is when you go and profile that and look at all the middleware in our
00:35:47.200 stack none of it actually seems to be taking up very much time right we're doing we're doing more
00:35:54.319 work but none of it is really time consuming so where is this time
00:35:59.440 going it's going to our garbage collector we're putting a lot of GC
00:36:05.520 pressure on Ruby and I guess this is the part where I have to say maybe you should use
00:36:11.200 Rubinius or JRuby or something like that or REE um but the reason we're putting
00:36:17.560 pressure on our GC is because of stack depth every release of Rails is increasing its stack and the way that
00:36:24.040 Ruby's garbage collector works is it has to scan the stack for objects to collect
00:36:29.720 so obviously as our stack gets larger the garbage collector has to do more
00:36:34.880 work so maybe it would behoove us to decrease
00:36:41.720 that our Rails 2.3 stack was 51 deep and this is measuring from inside of your
00:36:47.520 controller if you just say puts caller you'll see that it's 51 deep in
00:36:53.560 Rails 3.0 it's 60 deep in Rails 3.1 it's going to be 67
00:36:58.920 deep so if you look at the caller from your controller say puts caller it looks like this I know it's way too long to
00:37:05.920 read which is why we have a stack trace
00:37:11.000 scrubber so how do we take care of this how do we fix this this is something that needs to be fixed I think and
00:37:18.359 ironically the way to fix this is the same way we need to deal with those body proxies is we need to embrace
00:37:24.480 diversity we need to change we need to change the Rack API we need something
00:37:29.920 that's not dependent on return values you noticed earlier I made a point saying when you call call it has to
00:37:36.760 return something we need something more evented something that supports streaming easier and the API that I am
00:37:44.800 proposing is something that looks like this we split up generators and we actually write a response out to
00:37:50.680 something and then we close that response when we're done the next thing we need to do is we
00:37:57.599 need to implement filters and these filters are specifically for filtering data they can buffer stuff up or they
00:38:04.480 can gzip things they can do whatever they want to but their only responsibility in life is dealing with response
00:38:11.359 data we need to also support life cycle hooks so that we can support things like
00:38:18.560 the connection pool where we bring up a connection and then tear it down at the end now I put together a test
00:38:25.760 implementation of this with no web server front ends like it won't work with Unicorn and I ported most of our
00:38:31.839 Rack stack to this to see what the performance differences would be I brought the stack depth down to about
00:38:40.920 36 I did my profiling here to make sure that our request and response was still a linear growth so I didn't screw
00:38:47.119 anything up and our request and responses we were able I was able to
00:38:52.680 push through more um requests per second on Rails
00:38:58.000 uh than 2.3 could and that's doing the same amount of work that we do in 3.0 so
00:39:03.920 if we decrease the stack size we'll be able to do the same amount of work in less time than Rails 2.3 can so I think
00:39:12.119 that this is a very important thing we need to
00:39:19.560 do these this graph is an estimate but I believe it's conservative okay so something very
00:39:27.839 important to me in the coming year is that we must change we need to
00:39:34.280 change this and the things that I want to be focusing on in Rails for the next year are
00:39:39.920 speed memory consumption stack depth but most importantly I want to make sure
00:39:45.599 that it's backwards compatible thank ah thank
00:39:52.720 you so finally the last thing I want to do is I I was having a really hard time
00:40:00.040 figuring out all these problems doing all these benchmarks figuring out where all of our time was spent and I
00:40:07.920 consulted many people to figure out what the problems were and I learned
00:40:13.599 developer pro tips from these people and I would like to invite them up on stage with me
00:40:19.079 here and I want them to show you all these pro tips so this is all right this
00:40:26.480 is what you do to become a professional Ruby developer and to find bugs like these and to fix them all
00:40:32.720 right they all said the same thing to me when I went and consulted them about these particular problems and so I just
00:40:40.359 want to share with you what they told me to do ah there you are right there
00:40:48.880 perfect oh actually what before we begin you
00:40:55.480 guys what the hell four people from Ruby signed up to do
00:41:01.200 the 5K only I showed up or Rails core excuse me what come on guys what the
00:41:08.040 hell all right so anyway developer pro tips um let's do
00:41:15.640 this oh I'm sorry Wayne um will you change
00:41:22.079 please uh Wayne has to get into his dancing gear for us
00:41:40.160 actually this is specifically what Wayne said to me when I asked him about these problems over IM and it was just that
00:41:47.319 loud too all right so are you are you all
00:41:53.720 ready for this start with your heads down everybody Now look up
00:42:00.319 slowly bring your arms up and bring them down they're going to go up again both
00:42:06.800 arms higher this time turn them down bring it thumbs to yourself and
00:42:14.800 down Point your right hand from low to high burst your right hand now fist to
00:42:21.079 your sides 10 9 8 7 now your fingers 5 4
00:42:27.319 three two one burst both
00:42:34.359 hands up punch across crank crank stay
00:42:39.839 down shoulder chin shoulder and tap
00:42:45.280 double dream hands and thumbs to yourself punch again now Point your
00:42:51.800 right hand from low to high punch crank underhanded two Pats
00:42:57.680 shoulder double dream hands up just like before
00:43:03.760 crank stay down shoulder chin shoulder shoulder
00:43:09.720 shoulder double dream hands again thumbs to yourself punch punch from high to
00:43:17.880 low punch and crank underhanded Pat twice
00:43:23.920 double dream hands again up Rock out now crank
00:43:29.240 it stay down shoulder chin shoulder hand double
00:43:36.400 dream hands Now jazz hand ah TI hand rain
00:43:42.440 hands Point your hand over there Step clap jazz hand burst again left jazz
00:43:52.599 hand clap clap pat your knees step claps reach to the
00:43:59.240 audience two step claps hands to your knees left jazz hand left right burst like
00:44:09.359 before and point high to low crank your
00:44:16.760 underhanded double dream hands and
00:44:23.319 Butterfly double dream hands freestyle
00:44:28.960 make a tight group
00:44:40.559 punch you did you came thanks guys thank you thank you
00:44:46.480 thank you thanks thank you thank you guys want a group pose uh actually should we have
00:44:53.480 a pose yes I have two freaking minutes get back up here on stage what are you doing so my real difference between all
00:44:59.920 the other keynote speakers is I can get some of the most important Ruby people in the world to come up on stage and look like
00:45:10.119 idiots everybody get together come on in come here come here come here we're
00:45:15.680 posing for pictures now thank you everybody ah wait what I'm a
00:45:23.920 poser all right thank you I have seconds
00:45:30.559 left thank you thank you guys thank you all thank I feel amazing
00:45:39.240 now the embarrassment or the the stage fright is completely gone
00:45:47.680 so I want to leave you I want to leave you with some final words um I want to
00:45:53.040 leave a message I want to start with a message for the rest of the Rails core team and for uh the
00:46:02.640 audience I'm kind of out of breath now 5K plus dancing and none of my Rails core team
00:46:09.359 teammates showed up so the Rails core team that you know
00:46:15.640 today may not be here tomorrow
00:46:23.599 right DHH isn't necessarily going to be around developing on Rails for the rest of his life but I firmly believe that
00:46:30.559 the Rails framework will be around and I think it's important for us to make sure
00:46:37.440 that we refactor the internals of it so that we can get new people in and new people developing on core Rails itself
00:46:45.720 we need to get more people into the Rails core team more people helping out with our beloved
00:46:54.359 framework the last things I want to say are not all features are tangible speed
00:47:00.920 memory IO good abstraction yields reusable
00:47:07.040 code reusable code yields new features so your
00:47:12.640 homework from now for the rest of your life go green write reusable
00:47:20.640 code refactor Rails and go forth and code thank you
00:47:26.960 you