Gaurav Kumar Singh

Static typing with RBS in Ruby

In this talk, we'll explore the static typing ecosystem in Ruby. Ruby has two main type checkers, Sorbet and RBS. Sorbet was created by Stripe, and RBS is supported by Ruby. Sorbet is an annotation-based type checking system, while RBS is a definition-file-based type system. We'll add type annotations for a popular gem using Sorbet and RBS and then compare the differences between the two systems. A lot of interoperability has been announced between Sorbet and RBS, and we'll explore whether it's practically possible to convert a Sorbet-annotated project to RBS.

RubyConf 2022

00:00:00.000 ready for takeoff
00:00:17.160 good afternoon everyone my name is
00:00:19.080 Gaurav Singh and I'm an engineer at
00:00:22.020 Pattern so today we are going to talk
00:00:23.640 about static typing in Ruby using RBS
00:00:28.260 so
00:00:30.480 yeah so first of all what is this typing
00:00:33.300 business like what do we mean by types
00:00:35.219 right so broadly there are two
00:00:37.380 categories of programming languages one
00:00:39.600 is statically typed languages
00:00:41.700 and another is dynamically
00:00:43.379 typed languages so let's take the first
00:00:46.020 example here of Java code so here the
00:00:50.039 str variable has type
00:00:52.620 String right which contains the value
00:00:55.020 hello but if we try to assign it an
00:00:58.140 integer value we get a syntax error while
00:01:01.260 in the dynamically typed languages type
00:01:03.600 is associated with the value that
00:01:05.700 variable is holding so you can see over
00:01:08.340 here in the second example for the Ruby
00:01:10.560 code so the type of the name variable
00:01:13.680 is String when it's holding the value
00:01:15.299 who am I and its type is
00:01:18.720 Integer when
00:01:20.220 it's holding the value 5 so type in
00:01:23.700 dynamically typed languages is
00:01:24.840 associated with the value
00:01:27.420 okay
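A minimal Ruby sketch of the point above, that the type belongs to the value rather than to the variable. The variable name and values follow the slide as described in the talk; the rest is illustrative.

```ruby
# The type travels with the value, not with the variable.
name = "who am I"
puts name.class   # String

name = 5          # rebinding the same variable to an Integer is legal Ruby
puts name.class   # Integer
```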
00:01:29.520 so what is the problem uh with the
00:01:31.860 dynamic typing so let's consider the
00:01:34.500 first example here Java code so we have
00:01:37.320 a simple add function which is
00:01:39.720 accepting two arguments both of
00:01:42.780 them are integer types and we are just
00:01:44.759 returning their sum now if I'm trying to
00:01:47.880 pass those arguments with the string
00:01:50.579 type I'm going to get a syntax error
00:01:53.040 right now let's consider the second
00:01:55.079 example of the Ruby code again I am
00:01:57.240 passing two arguments and I'm going to
00:01:59.520 sum them but if I'm passing a string
00:02:02.579 argument it's not going to give me any
00:02:04.680 error when I'm writing the code but it's
00:02:06.420 going to give me the error at run
00:02:08.280 time so these are the kind of issues we
00:02:11.099 face with dynamically typed languages
00:02:13.340 people would say we can add
00:02:15.660 extensive test code but we all know
00:02:18.200 sometimes humans do make mistakes so to
00:02:21.599 overcome these problems static typing
00:02:23.520 has been introduced
00:02:29.280 but
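A small sketch of the Ruby side of this comparison, assuming an add method like the one on the slide. Nothing complains while the code is written; the mismatch only surfaces when the call runs. Note that two strings do not fail at all, they are silently concatenated, which can be just as surprising.

```ruby
def add(a, b)
  a + b
end

puts add(2, 3)      # 5
puts add("2", "3")  # 23 (string concatenation, no error at all)

begin
  add(2, "3")       # only fails once this line actually executes
rescue TypeError => e
  puts "TypeError: #{e.message}"
end
```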
00:02:36.360 yeah so advantages of type checking
00:02:39.840 so first thing it's proven in
00:02:42.480 multiple languages like JavaScript and
00:02:44.879 Python there are very popular static
00:02:47.099 type checking libraries TypeScript is
00:02:49.860 a very popular version of
00:02:51.239 JavaScript and mypy is a very popular
00:02:53.519 package in Python it eliminates a lot of
00:02:57.540 errors like no method for NilClass at
00:03:00.120 runtime better integration with IDEs
00:03:02.580 Varun had just shown how to
00:03:04.319 integrate in the IDEs it provides
00:03:07.019 documentation for the code so when
00:03:09.660 we annotate our code with static
00:03:13.019 typing annotations we get implicit
00:03:15.420 documentation for our code at
00:03:18.480 that point of time
00:03:21.300 so what type
00:03:24.420 checking systems are there in Ruby so
00:03:27.120 this is not something new in the Ruby
00:03:29.459 ecosystem back in 2009 there
00:03:33.659 was something called Diamondback Ruby or
00:03:35.760 DRuby then Tufts University came up
00:03:39.180 with RDL and then Stripe launched
00:03:42.299 Sorbet and then in 2019 the Ruby core team
00:03:45.360 launched RBS
00:03:48.299 now what is RBS exactly so
00:03:51.540 RBS stands for Ruby Signature it is
00:03:55.080 basically
00:03:57.840 you can describe your Ruby code in RBS
00:04:00.840 basically you're trying to say what my
00:04:02.940 Ruby code is going to look like you can
00:04:04.860 consider this something like the type
00:04:07.140 definition files of
00:04:08.900 JavaScript, the .d.ts extension files which we
00:04:11.459 generally use so it basically provides
00:04:14.099 the structure RBS does not check
00:04:18.479 the types of the Ruby code it only
00:04:20.760 provides the programming interface to
00:04:23.040 define the structure of your Ruby code
00:04:25.560 so what options does RBS come with
00:04:29.400 so there are a lot of options which RBS
00:04:32.040 provides which are basically for parsing
00:04:34.740 the code seeing what kind of AST it
00:04:36.600 generates how to annotate list probing
00:04:39.419 the ancestors of a Ruby class in the
00:04:42.300 scope of this talk we are mostly
00:04:44.040 concerned about the kind of code which
00:04:46.800 we are going to generate for our Ruby
00:04:48.840 code so we'll be focusing on the
00:04:51.720 Prototype option
00:04:54.120 so rbs prototype comes with three
00:04:57.300 options rb, rbi and runtime so rb uses
00:05:03.060 syntactical parsing of the Ruby code
00:05:05.340 rbi so Sorbet is a popular library and
00:05:10.020 for Sorbet there are RBI files which
00:05:12.540 are generated by Sorbet and to
00:05:14.820 convert those Sorbet RBI files into RBS
00:05:17.820 we use rbi and then the third option is
00:05:21.500 runtime so runtime basically
00:05:25.199 loads your class into memory and then
00:05:27.060 tries to see what functions or methods
00:05:29.759 and attributes are defined on
00:05:31.500 that class or object and then generates
00:05:34.620 the RBS code for that object
00:05:38.039 so uh let's take a very simple example
00:05:41.340 here so we have a class user it has two
00:05:45.360 attributes name and age and
00:05:48.960 we have just a constructor over there
00:05:51.660 okay so we are going to generate the RBS
00:05:55.800 file using the rb option so the output
00:05:58.560 of this command will be something like
00:06:01.500 the sig you are seeing
00:06:04.320 over here so attr_reader name is
00:06:07.380 untyped
00:06:08.580 and age is untyped and if you see the
00:06:11.759 sig for the function so the name
00:06:15.000 of the function and then all the
00:06:17.660 arguments which this function accepts
00:06:19.860 and its return type
00:06:22.259 so over here you will see all these
00:06:25.620 attributes are untyped untyped basically
00:06:27.960 means it can accept any type of value
00:06:30.419 but the basic idea of having static
00:06:34.800 typing is not to have untyped values we
00:06:37.500 need to define some type of value
00:06:40.319 which we want to accept for our
00:06:41.940 attributes right so let's change it to
00:06:44.100 the kind of values we want
00:06:47.460 for our attributes so for example for
00:06:49.919 name we want String and for age we want
00:06:53.639 Integer so here we have
00:06:56.699 declared name as a String age as
00:06:59.400 Integer and our initialize accepts two
00:07:02.580 parameters one is String another is
00:07:04.740 Integer and returns nothing
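The User example described here might look roughly like the sketch below. The file paths are hypothetical, and the RBS text in the comments is an approximation of what the rb prototype generator prints and of the hand-edited version, not a copy of the slides.

```ruby
# lib/user.rb (hypothetical path)
class User
  attr_reader :name, :age

  def initialize(name, age)
    @name = name
    @age = age
  end
end

# `rbs prototype rb lib/user.rb` prints a skeleton roughly like:
#
#   class User
#     attr_reader name: untyped
#     attr_reader age: untyped
#     def initialize: (untyped name, untyped age) -> void
#   end
#
# Hand-edited version, for example saved as sig/user.rbs:
#
#   class User
#     attr_reader name: String
#     attr_reader age: Integer
#     def initialize: (String name, Integer age) -> void
#   end

user = User.new("Alice", 30)
puts "#{user.name} is #{user.age}"
```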
00:07:07.199 okay so let's take the example over here
00:07:09.780 so what I have done is that I have tried
00:07:11.639 to instantiate this User class with
00:07:15.240 two arguments and both of them
00:07:17.340 are strings right and if you see on
00:07:20.759 the previous slide we have one String
00:07:23.520 and one Integer right so this is
00:07:26.400 going to give me an error so the error
00:07:29.639 says the String type is not accepted
00:07:32.880 because the argument type is
00:07:35.160 Integer so these kinds of errors we
00:07:39.000 can catch very early when we
00:07:40.680 annotate our code with RBS
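For contrast, a sketch of what plain Ruby does with the same call when no checker is involved. It assumes the signature from the talk, with a String name and an Integer age. Ruby itself accepts the call, which is why the mismatch has to be reported by a separate type checker reading the signature.

```ruby
class User
  attr_reader :name, :age

  def initialize(name, age)
    @name = name
    @age = age
  end
end

# Signature assumed from the talk:
#   def initialize: (String name, Integer age) -> void
user = User.new("Alice", "thirty")  # runs without complaint in plain Ruby
puts user.age.class                 # String, not the Integer the signature promises
```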
00:07:43.800 okay so uh let's see what happens with
00:07:47.280 the dynamic attributes so here we have
00:07:51.419 code in which we dynamically define
00:07:53.340 getters and setters for attributes name
00:07:55.979 age and gender
00:07:59.039 so if I'm going to use the rb option it only
00:08:02.160 does the syntactic parsing of the Ruby
00:08:04.319 code so it's not able to identify what
00:08:06.840 methods are there in the class so
00:08:09.240 it only says okay this attribute is
00:08:11.220 there which is of the array
00:08:13.139 type of name age and gender
00:08:16.620 right so to overcome this problem we
00:08:20.460 can use the runtime option which we had
00:08:22.259 discussed earlier so what runtime will
00:08:24.300 do is that we'll give
00:08:27.479 it the path of the code where that class
00:08:30.900 is defined and then which class we
00:08:34.080 want to generate RBS code for
00:08:36.240 okay so once we run
00:08:40.440 the command we'll see that it has loaded
00:08:42.779 the class into memory and then it
00:08:44.760 has seen age as a method then the setter so
00:08:47.760 you can set the value for the age as a
00:08:49.980 method the method gender and setting the
00:08:52.800 value for the gender
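A sketch of why the two prototype modes see different things. The class name Profile and the ATTRIBUTES constant are hypothetical stand-ins for the slide's code. A syntactic pass finds only the constant holding the array, while loading the class and asking it for its methods (the idea behind the runtime option) finds the generated accessors.

```ruby
class Profile
  ATTRIBUTES = %i[name age gender].freeze

  # Getters and setters are created while the class body runs,
  # so they never appear as `def` statements in the source.
  ATTRIBUTES.each do |attr|
    define_method(attr) { instance_variable_get("@#{attr}") }
    define_method("#{attr}=") { |value| instance_variable_set("@#{attr}", value) }
  end
end

# Reflection on the loaded class reveals the accessors.
p Profile.instance_methods(false).sort
# [:age, :age=, :gender, :gender=, :name, :name=]
```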
00:08:55.260 okay so now let's see in a bit more
00:08:59.339 detail for different kinds of
00:09:02.100 functions what kind of RBS sigs we can
00:09:04.260 write so let's take the first example
00:09:06.360 over here so in this example we
00:09:10.260 have a method called greet and this is
00:09:13.440 going to accept an optional argument
00:09:16.860 for the class method okay so for
00:09:19.800 optional arguments anything which can be
00:09:22.019 nullable we are going to use a question
00:09:24.120 mark to define that this value can be
00:09:25.920 nil so if you see line number 11
00:09:28.339 the function greet is accepting an
00:09:32.820 argument
00:09:34.140 with a String type which can be nil so
00:09:36.600 we can call the greet function
00:09:39.060 without any argument also but if you are
00:09:41.279 going to pass an argument then it should be a
00:09:43.019 String only there is no other option
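A sketch of the greet example, with a hypothetical Greeter class standing in for the slide's code. RBS uses the question mark in two places, and the signature in the comment shows both: a leading ? marks the parameter as optional (it may be omitted), and a trailing ? on the type makes it nilable (nil may be passed). Which combination the slide used is an assumption here.

```ruby
class Greeter
  # RBS, assuming the argument may be omitted or nil:
  #   def self.greet: (?String? name) -> String
  def self.greet(name = nil)
    name ? "Hello, #{name}!" : "Hello!"
  end
end

puts Greeter.greet          # Hello!
puts Greeter.greet("Ruby")  # Hello, Ruby!
```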
00:09:46.140 let's take another example where
00:09:48.779 our function can have a variable number
00:09:50.940 of arguments so here we have an Example
00:09:54.720 class with a function named name ends
00:09:56.880 with and it accepts a name argument that
00:10:00.540 is fixed and then we are going to have a
00:10:02.160 variable number of arguments using the
00:10:04.500 splat operator so if you see
00:10:07.860 line number 13 over here so
00:10:10.500 the first argument is fixed
00:10:12.600 whenever you are going to call
00:10:14.220 this function you have to provide the
00:10:16.800 name but using the splat operator
00:10:21.000 it's going to accept a variable
00:10:23.100 number of arguments over here and
00:10:25.800 in this case the return type is boolean
00:10:28.980 because the function is checking whether
00:10:30.779 the name ends with a particular value or
00:10:32.580 not
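A sketch of a fixed argument followed by a splat. The method name name_ends_with? and the parameter name suffixes are assumptions, since the transcript only gives the spoken name. The RBS signature in the comment marks the rest arguments with a leading asterisk.

```ruby
class Example
  # RBS: def name_ends_with?: (String name, *String suffixes) -> bool
  def name_ends_with?(name, *suffixes)
    # String#end_with? accepts any number of candidate suffixes.
    name.end_with?(*suffixes)
  end
end

example = Example.new
puts example.name_ends_with?("gaurav", "av", "xyz")  # true
puts example.name_ends_with?("gaurav", "xyz")        # false
```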
00:10:37.019 okay so let's take another example so
00:10:40.920 we have a class called Example and it
00:10:44.040 has a function multiply and it is going
00:10:46.500 to accept two arguments first is
00:10:48.899 arg and another is multiplier
00:10:51.000 right if you see the shell session over
00:10:54.060 here so we can call this multiply
00:10:57.000 function with a string as well as integer
00:10:59.880 or float values also right so if I'm
00:11:02.459 going to call it with a string then
00:11:05.820 because the default value of
00:11:07.980 the multiplier is 5 it's going to print
00:11:10.140 that string five times
00:11:13.200 so in this case I provided the argument as
00:11:15.360 Ruby and then it is printing Ruby five
00:11:17.640 times and if I'm going to give the argument
00:11:20.880 as an integer at line number 13 if
00:11:23.399 you'll see it's going to return the
00:11:25.500 value 20 so this function can return
00:11:28.279 a string as well as an integer both kinds of
00:11:31.740 values so to handle these kinds of cases
00:11:35.120 RBS has introduced union types so if
00:11:39.420 you see the signature for this
00:11:40.980 particular function at line number
00:11:43.079 19
00:11:44.399 the first argument arg can be either a
00:11:48.899 String or an Integer and
00:11:51.899 then we have an optional argument of type
00:11:54.120 Integer which is the multiplier and the
00:11:56.940 return type if you see it again
00:11:58.980 can be either a String or an Integer
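A sketch of the multiply example with its union-typed signature in a comment. The default of 5 and the result of 20 come from the talk; the integer input 4 is an assumption chosen to produce that result.

```ruby
class Example
  # RBS: def multiply: (String | Integer arg, ?Integer multiplier) -> (String | Integer)
  def multiply(arg, multiplier = 5)
    # String#* repeats the string, Integer#* multiplies the numbers.
    arg * multiplier
  end
end

example = Example.new
puts example.multiply("Ruby")  # RubyRubyRubyRubyRuby
puts example.multiply(4)       # 20
```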
00:12:04.680 okay so another example is
00:12:08.940 with keyword arguments so you can
00:12:11.700 see over here in the output if I'm going
00:12:15.180 to give a string and
00:12:17.579 if I'm not going to
00:12:19.500 provide the second argument it's only
00:12:21.420 going to give me the first character of
00:12:23.519 that string and if I'm going to give the
00:12:26.040 range for that string suppose range
00:12:29.700 4 then the first four characters of that
00:12:32.040 particular string are going to be returned
00:12:33.480 as an array okay
00:12:36.000 so for these kinds of functions if
00:12:39.420 you see the signature
00:12:41.519 the first argument is String then we
00:12:44.820 have our range as a named parameter over
00:12:46.980 here and you see the question mark which
00:12:49.440 denotes that this is an optional
00:12:51.060 parameter and the return type over here
00:12:53.279 is you can have either a String
00:12:56.220 or an array of String as the return type
00:12:57.839 for this function
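A sketch of an optional keyword argument. The method name first_chars is hypothetical; the keyword name range and the behaviour follow the description in the talk. In the RBS comment, the question mark before the keyword name marks it as optional.

```ruby
class Example
  # RBS: def first_chars: (String str, ?range: Integer) -> (String | Array[String])
  def first_chars(str, range: nil)
    # Without a range, return the first character; with one, an array of characters.
    range ? str.chars.first(range) : str[0]
  end
end

example = Example.new
p example.first_chars("RubyConf")            # "R"
p example.first_chars("RubyConf", range: 4)  # ["R", "u", "b", "y"]
```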
00:13:01.260 okay so now duck typing so what
00:13:04.980 happens when we have objects or
00:13:07.800 classes where duck typing is
00:13:09.120 introduced so let's look at this class
00:13:11.459 over here so we have this class Car
00:13:13.320 which has dynamic methods wheels
00:13:16.380 engine and roof and then we are
00:13:19.380 creating getters and setters
00:13:21.660 for these methods over here right now
00:13:25.800 if I'm going to generate the RBS code
00:13:27.720 for this using the syntactical parsing
00:13:29.760 method using rb it's only
00:13:33.540 going to generate the attributes it's
00:13:35.880 not going to give me all the methods
00:13:37.440 which are there in the Car class
00:13:39.420 right and MyCar is inheriting from
00:13:43.500 Car
00:13:44.700 so if you see over here if I'm going to
00:13:47.880 call
00:13:50.100 wheels on my MyCar object
00:13:53.100 it's saying that MyCar does not have
00:13:55.320 the method wheels right
00:13:56.940 but if you see the code over here
00:14:00.300 MyCar is inheriting from Car so it
00:14:02.339 should have that method wheels the
00:14:04.320 reason it's not able to find
00:14:06.959 this method is because the RBS
00:14:10.560 code does not have that method in the
00:14:13.440 RBS signatures so
00:14:16.980 um
00:14:18.120 so there can be scenarios when we cannot
00:14:20.399 have RBS code which is
00:14:23.220 defined properly for a module or for a
00:14:26.339 particular class which we are going to
00:14:27.720 use so to handle these cases the concept
00:14:32.160 of interfaces has been introduced so you
00:14:35.220 can say I have an interface which
00:14:37.440 behaves like a car and it has
00:14:39.959 all these methods wheels engine and roof
00:14:41.760 and when you include that interface in
00:14:44.040 your RBS code then you can call these
00:14:48.120 methods wheels engine and roof on
00:14:50.699 your object so it's not going to give us
00:14:52.800 an error after that
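A sketch of the Car and MyCar example with an RBS interface shown in the comments. The interface name _CarLike and the untyped return types are placeholders, not the names from the slides. In Ruby the inherited methods work as expected; the interface only exists so that the signatures describe methods the syntactic prototype could not see.

```ruby
class Car
  # Accessors are generated dynamically, so a syntactic RBS prototype misses them.
  %i[wheels engine roof].each do |part|
    define_method(part) { instance_variable_get("@#{part}") }
    define_method("#{part}=") { |value| instance_variable_set("@#{part}", value) }
  end
end

class MyCar < Car
end

# RBS sketch (names and types are placeholders):
#
#   interface _CarLike
#     def wheels: () -> untyped
#     def engine: () -> untyped
#     def roof: () -> untyped
#   end
#
#   class MyCar < Car
#     include _CarLike
#   end

car = MyCar.new
car.wheels = 4
puts car.wheels  # 4
```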
00:14:58.019 okay so now comes Steep so all
00:15:02.399 the type checking errors which
00:15:04.920 we had seen earlier were generated
00:15:06.839 using Steep so Steep is the tool
00:15:10.920 which actually
00:15:12.779 does the
00:15:14.519 type checking so what happens is that
00:15:16.500 RBS only provides us the syntax it does
00:15:19.019 not provide a mechanism to check
00:15:21.120 whether the corresponding Ruby
00:15:23.519 code is consistent with the RBS code or not we
00:15:25.560 can only generate the prototype using
00:15:27.120 RBS
00:15:28.139 Steep takes as input the Ruby code and
00:15:31.440 the RBS
00:15:32.660 signatures and then says whether
00:15:34.920 the code is correct or not so
00:15:38.459 the way we can set up RBS similarly we
00:15:41.760 can set up Steep also we just have to run
00:15:44.760 steep init and it will create a
00:15:48.240 Steepfile in our project
00:15:49.920 so
00:15:51.660 basically we have to configure Steep
00:15:54.000 then we have to define where our
00:15:56.600 signatures are going to reside so here
00:15:58.920 in this case they are going to reside in
00:16:00.779 the sig folder and what folder it
00:16:04.139 has to check so if you see it's
00:16:06.839 going to check in the lib folder the
00:16:08.459 directory name where it's going to check
00:16:09.899 things and suppose we have some files
00:16:13.980 for which we haven't generated
00:16:15.779 RBS code then we can
00:16:17.579 ignore them and if
00:16:20.760 we are using some
00:16:22.199 standard library or some gem from
00:16:24.060 outside then we have to tell Steep to
00:16:26.519 load RBS for these libraries also
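The configuration described above could be sketched as a Steepfile like the one below. The target name, the ignored path and the library names are placeholders, not the values from the talk's project. A Steepfile is Ruby, but it is evaluated by Steep itself, so this fragment is not meant to be run directly.

```ruby
# Steepfile (a sketch; paths and library names are placeholders)
target :lib do
  signature "sig"            # where the .rbs signature files live
  check "lib"                # Ruby code to type check
  ignore "lib/legacy/*.rb"   # files that have no signatures yet
  library "pathname", "set"  # load RBS for these standard libraries
end
```

With a file like this in place, running steep check from the project root type checks the listed directories against the signatures.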
00:16:32.040 Steep comes with a stats option also
00:16:36.660 so here what I have done is that I have
00:16:39.779 checked out discordrb did some
00:16:42.959 annotation on top of that and then tried
00:16:45.480 to see what the stats are for all
00:16:48.660 those annotations over here right so you
00:16:51.720 can see there are some files which
00:16:53.339 are a hundred percent covered for
00:16:55.079 some files their coverage is in
00:16:57.420 the 50s or 40s depending on
00:17:00.660 how much detail I have provided
00:17:03.600 for those files right so Steep comes
00:17:06.360 with these options also and if you have
00:17:08.459 to check whether
00:17:10.100 everything in our code sorry in
00:17:13.620 our annotation has gone fine or not then
00:17:15.660 we can just call steep check and it
00:17:18.419 will parse all the code and say okay
00:17:20.040 things are good or bad
00:17:22.140 depending on it so this I have run on
00:17:24.900 discordrb and I ignored
00:17:28.380 lots of files because annotating all of
00:17:31.620 them was not practically possible at
00:17:33.000 that point of time but you can see
00:17:36.360 as the end result there are not many
00:17:38.340 errors
00:17:40.320 okay so now comes Sorbet
00:17:43.440 so the primary difference between
00:17:46.200 Steep and Sorbet is that for
00:17:50.460 RBS
00:17:52.260 we have to put the sigs into a
00:17:55.799 different file while
00:17:58.620 Sorbet is an annotation-based system so if
00:18:01.740 you see for the class People
00:18:04.500 for every function we have provided the
00:18:06.900 annotation like sig void for the
00:18:09.240 constructor and for the add function we
00:18:12.900 have a sig
00:18:14.280 in which it's going to accept the params
00:18:17.580 as a person and then it's going to
00:18:19.380 return an array of persons
00:18:49.020 on the right hand side you will see so
00:18:51.900 one difference in sorbit is that all the
00:18:56.299 function calls the in the Sig we have to
00:19:00.179 Define them as hashes so when we are
00:19:02.640 saying params then we have to provide
00:19:04.200 person person is our argument name and
00:19:07.320 then type of that argument and then it
00:19:10.620 returns array of persons
00:19:14.100 second is this remove function so remove
00:19:16.799 function will accept percent object and
00:19:19.620 return depending on if that person is
00:19:22.080 there in the list or not it's going to
00:19:24.000 return person object which it it has
00:19:26.220 deleted from the list so you can see
00:19:29.100 over here uh the question mark there in
00:19:31.260 the RBS code which tells like we can
00:19:34.620 we might return nil also and on the
00:19:37.980 survey code you will see that there is
00:19:40.740 concept of nullable class so
00:19:43.620 uh the idea over here is that if it's
00:19:46.080 going to return something then it will
00:19:47.760 return uh the person object but if it
00:19:50.820 does not then it can be nil also
00:19:53.760 uh same the concept the same concept as
00:19:57.120 in the search function also so it's
00:19:59.400 going to accept a string over here and
00:20:02.460 might or might not return a person
00:20:04.380 object person object
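The side-by-side comparison could be sketched as below, with both styles written as comments above each method so the file runs without the sorbet-runtime gem. The Person class and the method bodies are assumptions; only the shapes of the signatures follow the talk. In a real Sorbet project the sig lines would be live code in a class that extends T::Sig, while the RBS lines would live in a separate .rbs file.

```ruby
class Person
  attr_reader :name

  def initialize(name)
    @name = name
  end
end

class People
  # Sorbet: sig { void }
  # RBS:    def initialize: () -> void
  def initialize
    @persons = []
  end

  # Sorbet: sig { params(person: Person).returns(T::Array[Person]) }
  # RBS:    def add: (Person person) -> Array[Person]
  def add(person)
    @persons << person
  end

  # Sorbet: sig { params(person: Person).returns(T.nilable(Person)) }
  # RBS:    def remove: (Person person) -> Person?
  def remove(person)
    @persons.delete(person)  # returns the removed object, or nil if absent
  end

  # Sorbet: sig { params(name: String).returns(T.nilable(Person)) }
  # RBS:    def search: (String name) -> Person?
  def search(name)
    @persons.find { |person| person.name == name }
  end
end
```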
00:20:07.440 right now we can use this in our CI/CD
00:20:11.640 pipeline and Sorbet comes with a
00:20:14.100 RuboCop gem also rubocop-sorbet so we
00:20:17.460 might not necessarily have to use
00:20:20.160 it in our Jenkins or in our GitHub
00:20:22.440 CI/CD while for RBS we have to
00:20:25.860 add an additional check in our CI/CD for
00:20:28.740 doing the type checking
00:20:32.179 there are a few things which are still
00:20:35.280 not supported in RBS it's in
00:20:38.880 active development right now so these
00:20:41.400 things are expected
00:20:43.140 one thing which I feel is that there are
00:20:45.360 multiple options to generate the output
00:20:47.400 in RBS like rbi rb and
00:20:51.900 runtime so there should be a tool using
00:20:54.419 which we can compare the output of
00:20:56.700 syntactical parsing and the runtime code
00:20:58.740 generation and see which one has the
00:21:00.840 better resemblance to our code
00:21:04.620 then there are some useful gems we can
00:21:07.140 use uh for like uh generating the code
00:21:10.860 for uh
00:21:13.080 for
00:21:14.340 our annotations Sord, Parlour, sorbet-rails,
00:21:17.940 rbs_rails and pronto-sorbet
00:21:22.020 yep that's it thank you