00:00:00.000
ready for takeoff
00:00:16.920
hello everybody and welcome to my talk
00:00:20.039
today analyzing an analyzer a dive into
00:00:23.039
how RuboCop works RuboCop is quite
00:00:26.640
complex and I don't think we can do an
00:00:28.980
exhaustive study on it so this will be
00:00:30.960
more of a regular dive not a deep dive
00:00:33.840
uh my name is Kyle d'Oliveira I'm based out
00:00:37.680
of Vancouver Canada
00:00:39.600
I've been working with Ruby and uh rails
00:00:43.800
for well over a decade now I'm in love
00:00:46.200
with the language I love the community I
00:00:48.719
love spaces like this where we can
00:00:50.820
interact with each other and get a sense
00:00:52.860
of what the Ruby Community feels like
00:00:56.039
I'm really drawn to tools that can
00:00:58.800
benefit the entire community and RuboCop
00:01:01.440
is no exception to that and my hope is
00:01:04.860
that after listening to this talk
00:01:07.020
you'll understand some of the basics
00:01:08.880
about how RuboCop can analyze and
00:01:12.540
correct code and maybe some of you will
00:01:15.000
feel inspired to either contribute
00:01:17.280
custom rules for yourself your
00:01:19.200
organization or even to the open source
00:01:21.540
repository or even start playing around
00:01:23.820
with new tools that utilize similar
00:01:26.340
ideas or concepts
00:01:28.920
I have been working at Aha! for the past
00:01:31.619
two years and it is one of the best
00:01:33.780
workplaces that I've been a part of we
00:01:36.479
are a human-centric company that's
00:01:38.220
helping other companies build the
00:01:39.659
products that matter for them with our
00:01:41.700
suite of products and we have an amazing
00:01:44.400
team that's uh fully
00:01:46.860
distributed by design all over the world
00:01:48.439
we have one of the best company cultures
00:01:51.119
that I've seen and it's powered by the
00:01:53.939
Responsive Method which helps give us a
00:01:56.640
framework of shared values that we all
00:01:59.280
agree upon and embody and it really
00:02:01.619
helps Empower us so that we can move
00:02:03.540
quickly and stay aligned so if you'd
00:02:06.240
like to be part of that culture of
00:02:07.799
course uh we're hiring
00:02:10.879
linters are a static code analysis tool
00:02:15.420
that can be used to flag
00:02:17.819
programming errors suspicious constructs
00:02:20.160
stylistic errors but it can be used to
00:02:23.280
do more too it can be used to alert
00:02:25.200
around security it could be used as a
00:02:28.140
tool for training other engineers
00:02:30.239
and RuboCop is one of the most popular
00:02:32.400
linters for Ruby and I'm just a little
00:02:35.640
bit curious just from a quick show of
00:02:37.500
hands has anyone here not
00:02:39.540
worked with RuboCop before
00:02:43.319
so I don't think I see a single hand and
00:02:45.360
that's about what I expected
00:02:48.120
um rails is one of those gems that is
00:02:50.760
very closely tied to Ruby
00:02:53.280
so much so that there's often the
00:02:54.840
Assumption if you say you're working
00:02:55.860
with Ruby they just assume you're
00:02:57.840
working with rails
00:02:59.700
and to put things in a little bit of
00:03:01.200
perspective rails has been downloaded
00:03:03.720
about 387 million times from RubyGems
00:03:06.840
and is about the 40th most popular
00:03:09.000
gem
00:03:10.379
RuboCop has been downloaded about 270
00:03:13.319
million times and is about the 76th while
00:03:15.360
not as popular as Rails RuboCop is up
00:03:18.239
there enough that when you talk about Ruby
00:03:20.159
you're also thinking about RuboCop
00:03:23.220
and the original creator of RuboCop
00:03:25.260
gave another talk about this in 2018 at
00:03:28.440
RubyKaigi so after this talk if you're
00:03:31.379
really curious about learning more about
00:03:32.879
RuboCop this is a great resource
00:03:35.879
um you can learn more about RuboCop from
00:03:38.159
him
00:03:39.900
and I wanted to begin today's talk with
00:03:42.120
a little personal story
00:03:43.739
I first got into using RuboCop several
00:03:46.860
years ago it was at a point where we
00:03:48.959
were trying to figure out how to have an
00:03:50.400
agreed-upon style for all the code that
00:03:53.159
we were writing but we would end up with
00:03:55.440
pull requests that were just full of
00:03:56.879
these nitpick comments that wouldn't
00:03:59.220
really be talking about the content of
00:04:01.319
the pull request but instead talking
00:04:03.239
about how the code looked
00:04:05.340
and sometimes these would be fine but
00:04:08.040
other times we would get into big and
00:04:09.659
often pointless arguments over what
00:04:11.760
seemed like the most trivial kind of
00:04:14.040
things
00:04:15.239
we would get into big debates of whether
00:04:17.280
or not we should use single quotes or
00:04:18.959
double quotes
00:04:20.160
uh what is the maximum line width uh
00:04:22.740
whether or not we need to have an extra
00:04:24.120
line at the end of a guard clause
00:04:26.699
and we thought that if RuboCop could
00:04:28.979
handle our linting and styling for us we
00:04:31.139
could focus all of the comments of pull
00:04:33.000
requests on the actual content of the
00:04:35.280
pull request and this helped a lot
00:04:38.460
except that it was also incredibly
00:04:40.139
frustrating to work with we would write
00:04:42.180
code we'd push it up to CI and CI would
00:04:44.460
reject it because RuboCop would have
00:04:46.560
flagged violations and we would go back and
00:04:48.600
we'd fix them and do it again
00:04:51.120
um I had a project in which I needed to
00:04:54.000
namespace a huge number of constant
00:04:55.800
references
00:04:57.000
so all of the lines became longer and
00:04:59.340
the maximum line length rule became the
00:05:01.199
bane of my existence I'd have to hunt
00:05:03.479
down from CI every line that was over
00:05:05.820
the length figure out how do I break
00:05:07.560
this up into pieces
00:05:09.300
and I hated it
00:05:10.979
and many engineers had a similar
00:05:12.840
experience and the message of not liking
00:05:15.060
how RuboCop was rolled out started
00:05:17.580
getting mixed up into the message of we
00:05:19.740
hate RuboCop
00:05:21.600
but then we started really leaning into
00:05:23.280
RuboCop and learning about how we could
00:05:25.620
use autocorrect and a lot of the
00:05:27.419
frustration of RuboCop started to
00:05:29.580
disappear we could have RuboCop fix our
00:05:32.340
code for us if it detected something
00:05:35.039
that was a violation we could have it
00:05:38.039
handle it automatically and most of the
00:05:40.139
time those changes were perfect
00:05:42.780
and so we kept leaning into this as well
00:05:44.699
we started writing our own custom cops
00:05:46.860
to help us migrate uh bad patterns to
00:05:49.259
good ones we wrote some to keep
00:05:52.320
deprecations under control as we were
00:05:54.419
upgrading rails all the while we were
00:05:56.759
using error messages from RuboCop as a
00:05:59.220
way to explain concepts link to
00:06:01.139
Internal Documentation and give
00:06:04.520
reasons why certain decisions were being
00:06:06.900
made
00:06:07.979
I gave a talk at RailsConf in 2020 uh
00:06:12.240
illustrating how RuboCop can be used to
00:06:14.699
communicate information about bad
00:06:16.139
patterns in code and that is another
00:06:17.699
resource that you can look up later if
00:06:20.160
you are curious about learning more
00:06:22.919
and this year RuboCop turns 10 and in
00:06:26.940
the open source world this is a pretty
00:06:28.919
big deal uh development of RuboCop
00:06:32.100
has been pretty consistent
00:06:33.479
throughout the years and with where
00:06:35.639
we're at I don't see RuboCop going away
00:06:38.000
anytime soon
00:06:40.440
but it's large enough that understanding
00:06:43.020
how RuboCop works can be really tricky
00:06:46.199
there are thousands of commits thousands
00:06:49.139
of closed issues hundreds of
00:06:51.660
contributors and releases
00:06:54.660
this is not going to be a code review of
00:06:57.660
RuboCop uh I don't think I could cover
00:07:00.000
it and do it justice and it would
00:07:01.560
probably be very overwhelming
00:07:03.479
with the history of
00:07:05.759
RuboCop it's a bit of a marathon and
00:07:07.740
we've got 30 minutes here it's a bit of
00:07:09.539
a Sprint
00:07:10.500
so instead this talk is going to be a
00:07:12.960
dive into how the basics work I'll
00:07:16.139
outline uh the basics and help
00:07:18.539
illustrate how some of the processes
00:07:20.340
function inside of RuboCop
00:07:22.919
um with the aim to stay as close to how
00:07:25.560
RuboCop works as possible but I might
00:07:27.479
make some simplifications in this talk
00:07:29.280
just for the purposes of the talk
00:07:32.460
so let's dig into this how does RuboCop
00:07:36.180
work and I thought a good way to dive
00:07:38.819
into the details of how RuboCop works is
00:07:41.280
to start by looking at the command line
00:07:42.900
interface
00:07:44.099
how does this work for a
00:07:46.680
single file and a single cop
00:07:49.620
so what happens when you run the rubocop
00:07:52.560
command
00:07:53.759
well at first it will run the executable
00:07:56.699
file which loads up the Ruby library and
00:08:00.240
processes
00:08:01.500
and then loads some configuration and
00:08:04.080
these two steps are the easy part I'll
00:08:06.539
touch them a little bit for completeness
00:08:08.099
but there shouldn't be too many
00:08:09.660
surprises here but then we'll get into
00:08:11.580
the meat of RuboCop at some point
00:08:15.479
RuboCop will need to process a file
00:08:18.660
and that is to take a file that exists
00:08:20.580
and do something to it so that it can
00:08:24.000
make decisions about that file and the
00:08:26.460
code inside of it
00:08:29.340
once it's processed this code it will
00:08:31.740
need to run it through a series of cops
00:08:34.260
that will make decisions on whether or
00:08:36.180
not there are any offenses and if there
00:08:39.539
are it can rewrite
00:08:42.599
that source code and change the file
00:08:45.660
and it will loop back once it's finished
00:08:47.940
uh changing the file and potentially
00:08:50.459
start again
00:08:51.540
and this Loop is important as multiple
00:08:53.820
different cops can adjust the same line
00:08:56.100
of code and so you may need to process
00:08:58.620
the code a couple times also be wary of
00:09:00.839
getting into Infinite loops
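The inspect, correct, and repeat loop described here can be sketched in plain Ruby. This is a simplified illustration and not RuboCop's actual runner: `inspect_until_stable`, `MAX_PASSES`, and the lambda-based "cops" are hypothetical names invented for this example.

```ruby
# Simplified sketch of a check-and-correct loop (not RuboCop's real runner).
# Each hypothetical "cop" is a lambda that takes source text and returns
# corrected source text, or the same text when it finds nothing to fix.
MAX_PASSES = 10 # guard against cops that keep changing the code forever

def inspect_until_stable(source, cops)
  MAX_PASSES.times do
    corrected = cops.reduce(source) { |text, cop| cop.call(text) }
    return corrected if corrected == source # nothing changed, so we are done

    source = corrected # something changed, so process the new code again
  end
  raise "possible infinite correction loop after #{MAX_PASSES} passes"
end

# Two toy cops that can both touch the same line of code.
double_to_single_quotes = ->(text) { text.tr('"', "'") }
strip_trailing_spaces   = ->(text) { text.gsub(/[ \t]+$/, "") }

FIXED = inspect_until_stable(
  %(puts "hi"   \n),
  [double_to_single_quotes, strip_trailing_spaces]
)
puts FIXED
```

The pass limit is one simple way to stay out of an infinite loop when two corrections keep undoing each other.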
00:09:03.600
so we'll break down this process into
00:09:06.060
these steps
00:09:07.920
and at the start of it all is the
00:09:09.720
command line interface now this is
00:09:11.880
pretty straightforward and I'm going to
00:09:13.440
go pretty quick here as this isn't a
00:09:15.420
talk about command line interfaces this
00:09:16.980
is a talk about RuboCop
00:09:18.899
effectively there's just an executable
00:09:21.839
file called rubocop it's configured to
00:09:24.120
use Ruby it loads the appropriate
00:09:26.339
libraries into the load path requires
00:09:28.440
RuboCop and then does some kind of
00:09:31.260
processing
00:09:32.399
and this is where we go from the command
00:09:35.940
line into the world of Ruby and this is
00:09:38.040
RuboCop's entry point to doing
00:09:39.779
whatever it wants to do
00:09:42.000
and the part that it wants to do
00:09:44.160
next is start loading some configuration
00:09:45.839
determine which cops are active uh what
00:09:49.140
options do we need to provide to them in
00:09:51.839
general this is all done through YAML
00:09:54.000
files I won't go through the options
00:09:56.279
here uh they are pretty well
00:09:58.320
documented online about what options are
00:10:00.240
available or not
00:10:02.220
um effectively there is a big YAML file
00:10:04.500
where you can give it some options that
00:10:06.420
are provided to all cops for instance
00:10:08.399
and you could say which Ruby version
00:10:09.779
you're looking for here are some file
00:10:11.880
patterns to include here's some patterns
00:10:14.220
to exclude you can also provide specific
00:10:17.100
configuration for specific cops you can
00:10:20.100
nest the configuration based off the
00:10:21.839
cop's name
00:10:23.399
and that configuration is then
00:10:25.500
available inside of the cop itself
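As a rough illustration of that configuration step, the sketch below loads a small YAML document shaped like a `.rubocop.yml` file and reads both the shared options and the options nested under one cop's name. The values are made up for the example and `cop_config` is a hypothetical helper, not part of RuboCop.

```ruby
require "yaml"

# A trimmed-down configuration in the style of a .rubocop.yml file.
# The specific values are illustrative only.
CONFIG_TEXT = <<~CONFIG_YAML
  AllCops:
    TargetRubyVersion: 3.1
    Exclude:
      - "vendor/**/*"

  Style/ArrayJoin:
    Enabled: true

  Layout/LineLength:
    Max: 120
CONFIG_YAML

# Once loaded, the configuration is just a nested Hash.
CONFIG = YAML.safe_load(CONFIG_TEXT)

# Hypothetical helper: fetch the options nested under a cop's name.
def cop_config(config, cop_name)
  config.fetch(cop_name, {})
end

puts CONFIG["AllCops"]["Exclude"].inspect           # options shared by all cops
puts cop_config(CONFIG, "Layout/LineLength")["Max"] # options for a single cop
```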
00:10:29.399
now that's pretty quick that's
00:10:32.040
at the end of the day it's just all
00:10:33.420
loaded with uh YAML and then it is a
00:10:37.200
hash inside the Ruby process there's not
00:10:39.000
anything too fancy going on there but this
00:10:41.760
is where we get to the more interesting
00:10:43.140
part of the whole process processing the
00:10:45.300
code
00:10:46.200
and this is a bit of a meta topic
00:10:48.300
because we need to talk about code that
00:10:50.519
understands code
00:10:54.600
and to illustrate how this really works
00:10:57.300
I'm going to be leaning into this
00:10:59.459
specific example RuboCop has a style cop
00:11:03.420
called ArrayJoin whose purpose is to
00:11:07.260
check whether or not there is a star
00:11:09.660
method that is being used to join values
00:11:12.720
of an array
00:11:14.220
so imagine we had some code and we want
00:11:16.620
to determine if any of these bad
00:11:18.779
patterns exist in the code and if it
00:11:21.120
does we want to flag the code as
00:11:23.279
violating this rule
00:11:25.800
so imagine we've got our array
00:11:28.079
uh or here we've got a an array of
00:11:30.600
strings which contain foo bar baz that
00:11:33.899
are being joined with the star method
00:11:36.180
and joined with a comma if we wanted to
00:11:38.640
ask does this code violate the rule as a
00:11:43.260
human we can say yes obviously but how
00:11:45.540
do we write code that determines this
00:11:48.180
one option
00:11:50.100
is that we could reach for our fancy
00:11:53.220
tool of regular expressions
00:11:55.320
and we can start with something that
00:11:56.880
already looks pretty complicated
00:11:58.920
uh it looks for an open bracket a bunch of
00:12:01.920
non-closing brackets closing bracket space
00:12:05.700
star space uh it's pretty complicated uh
00:12:09.240
this would match this code though so in
00:12:12.000
this specific example we could use a
00:12:13.740
regular expression but in the world of
00:12:15.240
Ruby things can change what happens if
00:12:17.160
we use single quotes
00:12:19.380
well we could update our regular
00:12:21.720
expression to do something with this
00:12:23.820
we'll figure out which quote you have a
00:12:25.800
back reference for it again we can write
00:12:27.779
things in different ways if there's no
00:12:29.220
spaces what do we do
00:12:31.800
we can continue adjusting the regular
00:12:34.200
expression
00:12:35.459
but at some point we could write this in
00:12:37.920
a completely different way we use a
00:12:39.480
percent W notation to write a series of
00:12:42.540
white space delimited uh strings uh at
00:12:46.500
this point I don't really know how we
00:12:48.360
would do a regular expression to to kind
00:12:51.360
of match this kind of code and get more
00:12:54.300
complex you don't even need parentheses
00:12:55.560
you can use almost any character to
00:12:57.600
delimit this array so this just gets
00:13:00.120
more confusing
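A small Ruby sketch of the problem: the pattern below is an illustrative guess at the kind of regular expression being described, not the one shown on the slides. All four samples are the same array joined with a comma, yet the pattern only recognizes the first spelling.

```ruby
# A naive pattern for `[...] * "..."`: open bracket, anything that is not a
# closing bracket, closing bracket, space, star, space, double-quoted string.
NAIVE_PATTERN = /\[[^\]]*\] \* "[^"]*"/

# Four ways of writing the same Ruby expression.
SAMPLES = {
  double_quotes: '["foo", "bar", "baz"] * ","',
  single_quotes: "['foo', 'bar', 'baz'] * ','",
  no_spaces:     '["foo","bar","baz"]*","',
  percent_w:     '%w(foo bar baz) * ","'
}.freeze

MATCHES = SAMPLES.transform_values { |code| NAIVE_PATTERN.match?(code) }

MATCHES.each { |name, matched| puts "#{name}: #{matched}" }
```

Each missed spelling could be patched with a more elaborate pattern, but every patch adds complexity without ever covering comments, nesting, or other delimiters.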
00:13:02.279
um
00:13:03.000
and we can continue down this path what
00:13:04.860
if this code was in a comment
00:13:07.260
what if this code was in uh a nested array
00:13:09.480
what if there's Dynamic strings in here
00:13:11.120
and using a tool like regular
00:13:13.620
Expressions just wouldn't be sufficient
00:13:15.060
to make decisions about code like this I
00:13:17.700
have to include the obligatory XKCD
00:13:19.800
comic whenever talking about regular
00:13:21.300
Expressions because often you end up
00:13:23.100
with more problems
00:13:24.300
but there needs to be a different way
00:13:27.180
and what we need to be able to do is we
00:13:29.459
need to take Ruby code and break it into
00:13:32.339
the abstract syntax tree that it
00:13:35.040
represents
00:13:36.600
and this abstract syntax tree or the AST
00:13:39.320
is a representation of the code
00:13:44.040
so for example we have a begin rescue
00:13:46.500
block like we do in the corner this is
00:13:48.600
what the corresponding abstract syntax
00:13:50.760
tree would look like now it is a lot so
00:13:53.700
I will break this down
00:13:55.980
and Ruby has a gem that can handle
00:13:58.260
converting code into an AST called
00:14:00.540
parser it's a separate gem so once it's
00:14:03.420
installed you just need to require it and you
00:14:05.459
can get a representation of what any
00:14:07.560
Ruby code looks like as an AST
00:14:10.740
so let's come back to our array example
00:14:14.100
what does the abstract syntax tree here
00:14:16.620
look like
00:14:18.180
well first this entire thing is a method
00:14:21.300
call
00:14:22.200
and this is represented by the send node
00:14:25.740
the send node has many children
00:14:29.160
the first child of it is what is this
00:14:32.399
method being called on so in this case
00:14:34.680
it's being called on an array and that
00:14:37.079
array contains three elements a string
00:14:39.240
foo a string bar and a string baz
00:14:42.899
the second child of the send node is
00:14:45.600
the name of the method that is being
00:14:47.220
called so in this case the method name
00:14:49.380
is star
00:14:51.060
and the rest of the children
00:14:53.760
of the send node are the arguments that
00:14:55.440
are provided to the method uh so in this
00:14:57.540
case there is only one argument and
00:14:59.100
that's just a string that is a comma
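To make that shape concrete without depending on any gem, the sketch below builds the same tree by hand. `Node` and the `s` helper are stand-ins written for this example; the parser gem provides its own node class and prints trees in a similar parenthesized form.

```ruby
# Stand-in for an AST node: a type plus an ordered list of children.
# (This Struct is only for illustration; it is not the parser gem's class.)
Node = Struct.new(:type, :children) do
  def to_s
    parts = children.map { |child| child.is_a?(Node) ? child.to_s : child.inspect }
    "(#{[type, *parts].join(' ')})"
  end
end

# Small helper so the tree reads like an s-expression.
def s(type, *children)
  Node.new(type, children)
end

# Hand-built tree for: ["foo", "bar", "baz"] * ","
ARRAY_JOIN_AST =
  s(:send,
    s(:array, s(:str, "foo"), s(:str, "bar"), s(:str, "baz")), # receiver
    :*,                                                          # method name
    s(:str, ","))                                                # one argument

RECEIVER    = ARRAY_JOIN_AST.children[0]
METHOD_NAME = ARRAY_JOIN_AST.children[1]
ARGUMENTS   = ARRAY_JOIN_AST.children.drop(2)

puts ARRAY_JOIN_AST
```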
00:15:02.220
now if we come back to what the code
00:15:04.620
looked like and we wanted to check what
00:15:06.480
the abstract syntax tree looked like uh
00:15:09.180
with parser we can run this code through
00:15:11.459
parser and when we format it it would
00:15:13.800
look something like this
00:15:16.500
um and what's great about this abstract
00:15:18.300
tree or abstract syntax tree
00:15:20.040
representation is if we start writing
00:15:21.839
this code in a different way we use
00:15:23.459
single quotes instead of double quotes
00:15:25.560
the abstract syntax tree doesn't change
00:15:27.779
if we write it without spaces
00:15:30.240
the abstract syntax tree doesn't change
00:15:32.760
and this is true for the percent W
00:15:34.860
notation as well so this is what RuboCop
00:15:37.800
uses under the hood to be able to make
00:15:40.800
decisions about code
00:15:43.380
they have a utility gem available called
00:15:46.320
rubocop-ast
00:15:48.240
that extends the functionality of the
00:15:50.579
parser gem to make things a little bit
00:15:52.860
more simple or easier to read
00:15:56.040
so RuboCop has this ProcessedSource class
00:15:58.860
that you could use if you wanted to
00:16:00.420
where you could provide it some code
00:16:02.519
in a string form so this could be read
00:16:04.199
directly from a file
00:16:05.940
and the version of Ruby you are
00:16:08.220
interested in and you can look at the
00:16:09.720
abstract syntax tree and start to ask
00:16:11.820
questions about it
00:16:13.560
so if we looked at this code
00:16:15.480
and we wanted to know if it violated
00:16:17.339
that array join rule we could start
00:16:19.500
asking some questions we could ask is
00:16:21.540
this a send type and we could say yeah
00:16:24.180
it is
00:16:25.560
we could look at the receiver of this
00:16:27.720
method and ask it is it an array type
00:16:30.180
it is we could check to see if the
00:16:33.180
method name is star
00:16:35.760
and we could also look at the arguments
00:16:37.500
is there one argument is that argument a
00:16:40.440
string
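Those questions can be strung together into a single check. The sketch below uses a stand-in `SketchNode` class whose helper names imitate the style of query described in the talk; it is not the rubocop-ast implementation, and `star_join?` is a hypothetical name.

```ruby
# Stand-in node with simple query helpers (an imitation for illustration,
# not the real rubocop-ast node class).
class SketchNode
  attr_reader :type, :children

  def initialize(type, *children)
    @type = type
    @children = children
  end

  def send_type?
    type == :send
  end

  def array_type?
    type == :array
  end

  def str_type?
    type == :str
  end

  # The next three helpers only make sense on :send nodes.
  def receiver
    children[0]
  end

  def method_name
    children[1]
  end

  def arguments
    children.drop(2)
  end
end

# Ask each question in turn; any "no" means the rule is not violated.
def star_join?(node)
  node.send_type? &&
    !node.receiver.nil? && node.receiver.array_type? &&
    node.method_name == :* &&
    node.arguments.size == 1 &&
    node.arguments.first.str_type?
end

# ["foo", "bar"] * ","  joins the array, so it should be flagged.
JOINED = SketchNode.new(
  :send,
  SketchNode.new(:array, SketchNode.new(:str, "foo"), SketchNode.new(:str, "bar")),
  :*,
  SketchNode.new(:str, ",")
)

# ["foo"] * 3  repeats the array, so it should not be flagged.
REPEATED = SketchNode.new(
  :send,
  SketchNode.new(:array, SketchNode.new(:str, "foo")),
  :*,
  SketchNode.new(:int, 3)
)

puts star_join?(JOINED)
puts star_join?(REPEATED)
```

Checking the argument type matters because `Array#*` with an integer repeats the array instead of joining it.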
00:16:41.759
so this gives us a way to start asking
00:16:44.940
meaningful questions of code in a very
00:16:48.600
repeatable manner
00:16:51.060
okay so now that we have a little bit of
00:16:53.519
understanding about how RuboCop is
00:16:55.139
processing the code
00:16:56.579
how does it actually start applying the
00:16:59.399
specific cops
00:17:02.579
to understand that we need to make our
00:17:04.799
example just a little bit more complex
00:17:07.559
what if our array joining code is
00:17:10.199
wrapped in a method
00:17:11.819
this seems to throw a bit of a wrench
00:17:13.799
into things we can no longer ask if this
00:17:16.439
is a send type anymore it's not this is
00:17:18.600
a method definition
00:17:20.160
we can't ask if the receiver is an array
00:17:23.459
type because there is no receiver for
00:17:25.559
this method definition
00:17:27.600
the method name here is not star
00:17:30.600
we do have one argument though so that
00:17:32.940
we do have that going for us
00:17:35.460
what we need to do is we need to break
00:17:37.500
this code down into its abstract tree
00:17:39.960
representation abstract syntax tree
00:17:41.760
representation and when we do it looks
00:17:43.860
something like this you can see the
00:17:45.720
original send node that we wanted to
00:17:47.340
focus on is there but it's wrapped in
00:17:49.620
another node this def node
00:17:52.740
so in order to navigate and find these
00:17:55.620
various nodes we need to walk through
00:17:57.840
the abstract syntax tree
00:18:00.240
so what we'll go through next is a very
00:18:02.280
simplified version of how do we walk an
00:18:04.799
abstract syntax tree
00:18:07.020
to do so we could start with a method
00:18:08.820
like walk
00:18:10.380
and the goal of this will be can we visit
00:18:12.240
every single node inside of this
00:18:14.820
abstract syntax tree
00:18:16.980
to do so we'll start with passing it the
00:18:18.960
entire abstract syntax tree
00:18:21.240
and we'll need some sort of convention
00:18:22.860
on determining what do we call the
00:18:25.080
methods for our various nodes so for
00:18:27.720
this we'll create a method for each node
00:18:29.640
type that's prefixed by on_
00:18:32.280
so here we'll need a method called
00:18:35.100
on_def because def is the top level node of
00:18:38.760
this abstract syntax tree
00:18:41.820
from here what does on_def do well in
00:18:45.480
order to understand this method we need
00:18:47.820
to know what does the def node in
00:18:50.100
the abstract syntax tree look like
00:18:53.160
it has many children its first child is
00:18:56.100
the name of the method that's being
00:18:57.840
defined this is a symbol so we don't
00:19:00.660
need to do anything further here
00:19:02.940
uh the second child of the def node is
00:19:05.280
the arguments that are passed in
00:19:07.380
so to deal with this convention we'll
00:19:09.539
create an on_args method we'll hand it
00:19:11.580
the arguments and that will continue to
00:19:13.559
recursively visit these nodes
00:19:16.320
the last child is the body of the method
00:19:19.679
it could be a few different things but
00:19:21.780
to keep it simple we'll focus on what it
00:19:23.340
is here so we'll need to write an
00:19:26.039
on_send method
00:19:28.380
so now we can start looking at those
00:19:31.200
methods what does on_args do
00:19:34.740
this has many children and each child is
00:19:38.700
an ARG so we could walk through each of
00:19:41.160
the children of this piece of the
00:19:43.260
AST and call on_arg on each of them
00:19:46.559
and on_arg itself doesn't need
00:19:49.200
to do anything further it doesn't have
00:19:51.000
anything further nested so we don't need
00:19:52.679
to do anything to navigate this entire
00:19:54.419
tree
00:19:56.820
moving back over to the on_send method
00:19:59.460
the send node itself we've gone over
00:20:01.500
before
00:20:02.640
and we know that the first child is
00:20:04.679
the receiver so we can call
00:20:06.720
on_array here
00:20:08.280
the second is the name of the method
00:20:11.160
that's being called which in this case
00:20:12.660
is star it's a symbol we don't need to
00:20:14.160
do anything further and the rest of the
00:20:16.320
children are the arguments to the method
00:20:18.780
so in this case there's only one
00:20:20.220
argument which is the string so we'll
00:20:21.840
call the on_str method
00:20:25.320
almost done here the on_array method
00:20:28.500
looks very similar to what on_args
00:20:30.780
looks like we go over all of the children
00:20:33.059
and we need to call a method based off
00:20:35.280
what type is inside of the array so in
00:20:38.520
this case we just call on_str on each
00:20:41.520
one of these uh elements of the array
00:20:45.539
and lastly the on_str method doesn't
00:20:48.360
have any further children so we don't
00:20:50.220
need to do anything
00:20:52.140
and so utilizing all of these methods we
00:20:55.200
can visit all elements of this abstract
00:20:58.020
syntax tree and based off where we are
00:21:00.960
in the tree we can start to tie in
00:21:02.640
behavior of what we want to do
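Putting those hand-written `on_` methods together gives a walker like the sketch below. It only covers the node types in this one example, and `Node`, `s`, `Walker`, and the method name `:example` are stand-ins chosen for illustration rather than anything taken from RuboCop.

```ruby
# Stand-in AST node and helper (illustrative only).
Node = Struct.new(:type, :children)

def s(type, *children)
  Node.new(type, children)
end

# Hand-built tree for a method whose body is: ["foo", "bar", "baz"] * ","
METHOD_AST =
  s(:def, :example,
    s(:args, s(:arg, :value)),
    s(:send,
      s(:array, s(:str, "foo"), s(:str, "bar"), s(:str, "baz")),
      :*,
      s(:str, ",")))

class Walker
  attr_reader :visited

  def initialize
    @visited = []
  end

  # Record the visit, then dispatch to the method named on_<node type>.
  def walk(node)
    @visited << node.type
    public_send(:"on_#{node.type}", node)
  end

  def on_def(node)
    _name, args, body = node.children # the name is a symbol, nothing to visit
    walk(args)
    walk(body)
  end

  def on_args(node)
    node.children.each { |arg| walk(arg) }
  end

  def on_arg(_node); end # no nested nodes

  def on_send(node)
    receiver, _method_name, *arguments = node.children
    walk(receiver) if receiver
    arguments.each { |argument| walk(argument) }
  end

  def on_array(node)
    node.children.each { |element| walk(element) }
  end

  def on_str(_node); end # no nested nodes
end

WALKER = Walker.new
WALKER.walk(METHOD_AST)
puts WALKER.visited.inspect
```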
00:21:05.820
and this is all fairly complicated still
00:21:08.160
but rubocop-ast that gem that I talked
00:21:11.460
about earlier has abstracted this away
00:21:13.020
into a module called RuboCop::AST::Traversal
00:21:15.419
and using this it can traverse
00:21:18.600
over any arbitrary abstract syntax tree
00:21:22.140
it'll visit each node of the tree in a
00:21:25.200
depth first search
00:21:27.780
um and if any of the cops that are
00:21:30.780
active
00:21:31.799
Define methods that are interested in
00:21:34.080
each piece of that abstract syntax tree
00:21:36.240
it will hand that node over to the cop
00:21:38.460
to handle its own logic
00:21:42.179
so what this could look like is we could
00:21:44.280
have a class that includes this module
00:21:46.260
and this would allow it to Traverse over
00:21:48.659
any arbitrary abstract syntax tree
00:21:51.659
and the object could hold a reference to
00:21:53.760
whatever cops are available and enabled
00:21:56.820
and then for each type of node it can
00:22:00.179
walk through the cops that are available
00:22:01.799
determine if they respond to that method
00:22:04.440
and if so
00:22:05.520
give them that piece of the abstract
00:22:07.260
syntax tree
00:22:08.640
I've only listed a single method here
00:22:10.679
the on_send but effectively this is what
00:22:13.200
RuboCop does it does this for every
00:22:15.659
possible type of node
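A stripped-down version of that dispatching idea is sketched below. Instead of one hand-written method per node type it uses a generic recursive walk, and it hands each node to any cop that responds to the matching `on_` method. `Dispatcher`, `StarJoinCop`, `ArrayCounterCop`, and the offense message are hypothetical and only loosely modelled on what the talk describes.

```ruby
# Stand-in AST node and helper (illustrative only).
Node = Struct.new(:type, :children)

def s(type, *children)
  Node.new(type, children)
end

# Hypothetical cop that only cares about method calls.
class StarJoinCop
  attr_reader :offenses

  def initialize
    @offenses = []
  end

  def on_send(node)
    receiver, method_name, *arguments = node.children
    return unless receiver && receiver.type == :array && method_name == :*
    return unless arguments.size == 1 && arguments.first.type == :str

    @offenses << "use join instead of * to join an array"
  end
end

# Hypothetical cop that only cares about array literals.
class ArrayCounterCop
  attr_reader :count

  def initialize
    @count = 0
  end

  def on_array(_node)
    @count += 1
  end
end

# Visits every node depth first and offers it to each interested cop.
class Dispatcher
  def initialize(cops)
    @cops = cops
  end

  def walk(node)
    callback = :"on_#{node.type}"
    @cops.each { |cop| cop.public_send(callback, node) if cop.respond_to?(callback) }
    node.children.each { |child| walk(child) if child.is_a?(Node) }
  end
end

TREE =
  s(:def, :example,
    s(:args, s(:arg, :value)),
    s(:send, s(:array, s(:str, "foo"), s(:str, "bar")), :*, s(:str, ",")))

STAR_COP  = StarJoinCop.new
ARRAY_COP = ArrayCounterCop.new
Dispatcher.new([STAR_COP, ARRAY_COP]).walk(TREE)

puts STAR_COP.offenses.inspect
puts ARRAY_COP.count
```

Neither cop knows about the surrounding def node; each one only sees the node types it defines a method for.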
00:22:18.299
and if we were to look at what the cop
00:22:20.640
for that array join cop looks like the
00:22:24.360
source code for this
00:22:26.280
um there's while there's a lot going on
00:22:28.020
here the main thing here is that the cop
00:22:29.940
just defines an on_send method and this
00:22:32.460
is passed every send node so this cop when
00:22:36.240
run through RuboCop will see every
00:22:38.159
method call that is in your source code
00:22:40.080
and this will allow the cop to ask
00:22:42.120
specific questions to that method call
00:22:44.760
is the receiver an array is the method
00:22:47.580
named star and if it matches everything
00:22:49.799
that it needs to match it can record an
00:22:52.260
appropriate offense
00:22:55.740
and this is how each cop ties into the
00:22:58.020
source code if we were to look at
00:22:59.820
another cop example just for a quick
00:23:02.280
moment we can look we can see something
00:23:04.320
similar for instance if we look at this
00:23:06.120
cop which is the min max cop
00:23:08.299
it is looking for when the Min and the
00:23:11.700
max are returned as two elements of the
00:23:13.980
same array this would need to do some
00:23:16.320
sort of logic on each array that
00:23:19.020
it encounters
00:23:20.520
the source code for this
00:23:23.400
just defines an on_array method which
00:23:26.460
can then tie into every single array
00:23:28.440
definition and it can start asking
00:23:30.360
questions about this all right does this
00:23:32.580
match the Min and the max
00:23:34.559
this allows each cop to focus on only
00:23:37.440
the pieces of the abstract syntax tree
00:23:39.539
that they're interested in and make very
00:23:41.880
focused decisions
00:23:44.700
and there are many many different types
00:23:47.580
of nodes possible in the abstract syntax
00:23:49.919
trees you could have nodes for local
00:23:53.340
variable assignments you can have nodes
00:23:56.520
for class definitions or for yield
00:23:58.620
blocks some of these nodes I'm not
00:24:01.020
entirely sure how you would even write
00:24:02.880
Ruby code to represent them
00:24:06.000
um but if you can represent them in an
00:24:08.480
abstract syntax tree RuboCop can tie
00:24:11.340
into that piece and start making
00:24:12.840
decisions about it
00:24:15.900
so finally we get to a point where we
00:24:17.520
can start asking RuboCop
00:24:19.740
how does it change the abstract syntax
00:24:22.020
tree and auto correct it
00:24:24.059
if we were to take the method that we
00:24:25.919
looked at earlier and we were to fix it
00:24:27.780
we would want to do something like this
00:24:29.100
where we remove the star replace
00:24:30.960
it with join and wrap the arguments
00:24:34.860
lucky for us the parser gem that we
00:24:37.500
talked about earlier has a class called
00:24:39.539
the TreeRewriter and as they say it
00:24:42.240
performs all of the heavy lifting for
00:24:44.340
the source rewriting process so you can
00:24:46.559
have multiple rewrites and it will
00:24:48.480
perform them all in the correct order
00:24:51.059
the general structure would look
00:24:52.500
something like this where you can get
00:24:53.940
the abstract syntax tree for any
00:24:55.799
arbitrary source code
00:24:57.419
and pass it along to the tree rewriter
00:24:59.460
class but from here what can we do
00:25:02.280
one of the things you can do is call the
00:25:05.100
replace method which as the documentation
00:25:06.780
says will replace the source code
00:25:09.780
of a specific range with new
00:25:11.940
content
00:25:13.080
we can do other things with the tree
00:25:14.520
rewriter such as adding things before
00:25:16.200
or after or removing content but we'll
00:25:18.059
focus on this for right now
00:25:19.919
and two immediate questions should pop
00:25:22.020
up which is what is the range and how do
00:25:25.140
we determine the content
00:25:27.539
looking at the range the range is this
00:25:29.520
special class that comes from the parser
00:25:31.799
gem itself and it outlines a range of
00:25:34.260
characters that represent various Ruby
00:25:36.840
expressions
00:25:38.220
and there's a method available on all of
00:25:40.200
the code that is parsed with parser
00:25:42.000
called location
00:25:43.559
and this gives a lot of really
00:25:45.059
interesting aspects of that piece of the
00:25:47.100
abstract syntax tree but one piece of
00:25:49.380
that is the expression and that provides
00:25:51.840
the range for that entire piece of the
00:25:54.600
abstract syntax tree
00:25:56.460
so if we were to focus on that send node
00:25:59.100
which we determined was the third child
00:26:01.080
of that overall AST we can get to where
00:26:04.020
this source code is located by looking
00:26:06.000
at the location dot expression and this
00:26:08.520
is how we can get the range of what
00:26:10.200
code needs to change
00:26:13.500
now when we start looking into content
00:26:15.240
how do we determine the content
00:26:17.340
it should be replaced with well we knew
00:26:19.320
that the receiver was an array type but
00:26:21.000
we can ask more questions about this
00:26:22.860
receiver for instance we can ask it
00:26:24.960
explicitly what's the source and this
00:26:27.480
will give us exactly as it was written
00:26:29.400
the source code of this array
00:26:32.640
and we can save this as a string to use
00:26:34.980
later
00:26:35.820
similarly
00:26:37.260
we can look at the arguments and look at
00:26:40.140
the source of the arguments and this
00:26:41.760
lets us get explicitly from the code
00:26:43.580
exactly what was passed in
00:26:47.880
putting this all together we can get
00:26:49.500
something like this where we set up our
00:26:51.120
code we set up our tree rewriter we
00:26:53.640
focus on that send node specifically and
00:26:56.400
we grab the location of that send node
00:26:58.500
using that location.expression we can
00:27:00.539
get the source array we can get the
00:27:03.000
source arguments and we can call this
00:27:05.340
replace method to
00:27:07.260
swap in this array with a join that
00:27:11.100
wraps the argument
00:27:12.720
and from here we can have the tree rewriter
00:27:14.279
process it and we end up with this
00:27:16.559
rewritten method exactly the way we
00:27:19.020
wanted it
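The range-based replacement can be imitated with a few lines of plain Ruby. `SketchRewriter` below is a minimal stand-in written for this example; it borrows the `replace` and `process` method names mentioned in the talk but is not the parser gem's class, and it finds the range with a string search where the talk reads it from the AST node's location.

```ruby
# Minimal stand-in for a source rewriter: collect (range, content) pairs and
# apply them from the end of the source backwards so earlier offsets stay
# valid. Overlapping ranges are not handled in this sketch.
class SketchRewriter
  def initialize(source)
    @source = source
    @replacements = []
  end

  def replace(range, content)
    @replacements << [range, content]
    self
  end

  def process
    result = @source.dup
    @replacements.sort_by { |range, _content| -range.begin }.each do |range, content|
      result[range] = content
    end
    result
  end
end

SOURCE = <<~SOURCE_RUBY
  def example(value)
    ["foo", "bar", "baz"] * ","
  end
SOURCE_RUBY

# In the talk these pieces come from the AST (the node's range and the source
# of the receiver and arguments). Here they are written out by hand.
SEND_SOURCE  = '["foo", "bar", "baz"] * ","'
ARRAY_SOURCE = '["foo", "bar", "baz"]'
ARGS_SOURCE  = '","'

send_start = SOURCE.index(SEND_SOURCE)
SEND_RANGE = send_start...(send_start + SEND_SOURCE.length)

REWRITTEN = SketchRewriter.new(SOURCE)
                          .replace(SEND_RANGE, "#{ARRAY_SOURCE}.join(#{ARGS_SOURCE})")
                          .process
puts REWRITTEN
```

Reusing the original source text of the array and the argument means quoting style and spacing inside them are preserved in the rewritten code.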
00:27:20.400
and effectively this is what RuboCop
00:27:22.200
does they have their own corrector which
00:27:25.320
is an object that wraps the tree
00:27:26.820
rewriter so you don't need to know
00:27:28.140
certain things like the location
00:27:29.400
expression you can pass in directly the
00:27:32.460
piece of the abstract syntax tree but
00:27:34.860
all of the other methods such as replace
00:27:36.360
are available
00:27:38.159
and this allows each cop to handle its
00:27:41.400
own changes whenever it detects
00:27:42.900
something that violates a certain
00:27:44.580
pattern and when it's done RuboCop will
00:27:47.580
write these auto corrected changes back
00:27:49.559
to the file and if necessary reprocess
00:27:52.020
the new file and start again
00:27:55.140
okay so even after now 28 minutes this
00:27:58.919
this can be a lot this is a crash course
00:28:01.140
through
00:28:02.460
RuboCop we've gone through some of the
00:28:04.500
different pieces of a simplistic version
00:28:06.419
of RuboCop we've gone through pieces of
00:28:09.179
the command line interface and
00:28:10.980
configuration
00:28:12.240
we've walked through how RuboCop can
00:28:14.640
take that file and convert it to an
00:28:16.799
abstract syntax tree so it can interact
00:28:18.480
with it
00:28:19.440
we talked about how it could then
00:28:21.000
Traverse that abstract syntax tree and
00:28:23.820
rewrite it and then write it back to the
00:28:25.799
file
00:28:27.299
I'm hoping that this has helped you
00:28:28.919
understand a little bit about how
00:28:30.600
RuboCop functions and I hope this has
00:28:33.299
got you thinking about how you can start
00:28:34.620
utilizing this knowledge as I mentioned
00:28:37.500
earlier there's some other great
00:28:39.419
resources online if you want to learn
00:28:41.460
more but if you are interested in
00:28:44.039
building a tool that needs to analyze
00:28:46.679
source code there's probably little
00:28:47.820
bits of pulling from the abstract syntax
00:28:49.860
tree or walking through an abstract
00:28:51.539
syntax tree that could all be really
00:28:53.400
useful
00:28:55.020
um I hope you found all of this really
00:28:56.279
interesting uh thank you for your time
00:28:58.080
uh I don't know if we have time for
00:29:00.179
questions uh so I'm just going to end it
00:29:02.039
there if you have questions for me
00:29:03.480
personally come up after the talk and
00:29:05.039
I'll be happy to answer