00:00:08.020
hello guys
00:00:10.870
yeah
00:00:12.980
I think this is the final presentation
00:00:14.870
for today for the first day of
00:00:18.380
Ruby conference and it will be quick
00:00:21.680
hopefully and this is about parallel
00:00:24.770
processing with Ruby and yes actually
00:00:29.839
yesterday night I came here from
00:00:32.509
Singapore and I forgot to take my laptop
00:00:36.110
charger so my laptop is running out of
00:00:39.589
battery right now if it dies quickly you
00:00:45.559
can go home quickly so yeah and yes I'm
00:00:53.359
Dilum Navanjana and this is my first
00:00:56.539
time in Malaysia and Kuala Lumpur even
00:00:58.940
though I've worked in Singapore and I'm
00:01:05.390
from Sri Lanka actually my motherland
00:01:08.390
and it's a small island in South Asia if
00:01:14.119
you have ever heard of it or ever visited
00:01:16.729
yes
00:01:17.420
that's my country and in Sri Lanka we
00:01:21.259
have nice features we have elephants and
00:01:24.039
we have tea plantations and we drink tea
00:01:28.970
a lot yes and I'm working in a company
00:01:36.170
called B bytes in Singapore yes we are
00:01:40.880
working with Ruby and we have some
00:01:44.780
products which are running with Ruby and
00:01:47.929
Grape so the reason we are
00:01:51.349
using Grape is actually we are providing
00:01:55.970
some APIs for the front ends so because
00:01:59.239
of that we prefer to use Grape rather than
00:02:02.119
Rails but of course we are using
00:02:04.399
ActiveRecord
00:02:05.390
Action Mailer those kinds of things with
00:02:08.720
the Grape API support actually which
00:02:11.810
helps us a lot right now to build the API
00:02:17.060
layer for our front-end developers and
00:02:19.629
as well as the final thing we are
00:02:23.030
working with blockchains
00:02:24.930
something interesting and actually after
00:02:28.719
coming to Singapore I started learning
00:02:30.219
about blockchains and we are
00:02:33.280
developing some things with Ethereum
00:02:35.200
and as well for our transactions to
00:02:40.090
secure our transactions we have a
00:02:43.299
private blockchain which has about ten
00:02:47.920
nodes so yeah we are doing some
00:02:50.650
developments with blockchains like that
00:02:52.180
and also we are hiring if any of you are
00:02:57.370
familiar with Ruby Grape I know all of
00:03:00.549
you are familiar with Ruby and if you have
00:03:03.340
any plans to move to Singapore yes we
00:03:06.219
are hiring and as well if any of you
00:03:08.439
are familiar with blockchains
00:03:09.669
yes we will definitely hire you and yes
00:03:15.030
this is the topic actually the topic is
00:03:19.629
about parallel processing with Ruby and
00:03:21.599
even though the topic is about parallel
00:03:24.430
processing with Ruby
00:03:26.699
it will contain parallel processing and
00:03:29.199
concurrency so I am going to talk about
00:03:30.909
parallel processing and concurrency how
00:03:33.129
they are handled in Ruby versions actually and
00:03:38.349
yes first of all before going into the
00:03:43.000
details this is the difference between
00:03:45.099
parallel processing and concurrency for the
00:03:48.519
people for the developers who don't know
00:03:50.609
or actually who forgot after graduating
00:03:54.280
from school so with parallel processing
00:03:58.689
we can have multiple threads as in this
00:04:02.259
thread group and two or more threads can
00:04:07.060
be processed in parallel in parallel
00:04:09.819
processing but in concurrency without
00:04:12.280
parallelism we can have multiple threads
00:04:14.859
two or more without any problem but
00:04:17.940
there will be context switching
00:04:20.079
happening which will allow only one
00:04:23.320
thread to execute at a given time so
00:04:27.240
Ruby MRI supports concurrency not
00:04:31.719
parallel processing
00:04:33.760
and so most of our applications look
00:04:42.640
like this because of concurrency not
00:04:47.950
actually concurrency but we are using
00:04:50.140
only one CPU but we have 16 cores and 15
00:04:55.210
of them are in an idle state
00:04:57.510
yeah so I know most of you are familiar
00:05:05.080
with the Ruby GIL and how it works and for
00:05:10.120
the people who don't know the Ruby
00:05:12.340
GIL means Ruby global interpreter lock
00:05:15.190
which is in Ruby MRI and which actually
00:05:21.940
handles the concurrency system in Ruby
00:05:24.460
MRI so these are the versions most
00:05:28.990
famous versions in Ruby and if you can
00:05:31.780
see Ruby MRI 1.8 and 1.9 and so on support
00:05:37.510
only concurrency but JRuby RBX those
00:05:43.120
kinds of things
00:05:43.870
support concurrency and parallelism
00:05:46.440
parallel processing so you
00:05:48.880
might think like why don't we use JRuby
00:05:52.980
for the faster performance but actually
00:05:56.110
it's not just a case of using JRuby to be
00:06:00.700
faster you have to make sure you are
00:06:03.190
safe within the code that you are
00:06:06.310
writing because you have to be very
00:06:09.010
careful with shared mutable data if you
00:06:11.710
are doing parallel processing with JRuby
00:06:13.540
or any
00:06:16.590
Ruby implementation which supports
00:06:18.910
parallel processing so this is the
00:06:22.530
simple architecture of how Ruby MRI and
00:06:27.360
JRuby those kinds of things
00:06:29.620
differ so in Ruby MRI we have green
00:06:32.830
threads actually we have threads and
00:06:36.330
before going to the Ruby interpreter we
00:06:38.770
have the GIL a global interpreter lock which
00:06:41.560
handles our concurrency layer and which
00:06:45.520
allows only one thread to pass to
00:06:48.190
the Ruby interpreter and then the OS thread
00:06:50.410
and then the kernel
00:06:51.639
but JRuby doesn't have anything like the GIL
00:06:54.639
it directly connects from green threads to the JVM
00:06:57.759
and OS threads so I found this nice
00:07:05.460
little text according to the internet
00:07:08.199
how people think about the GIL and yes this
00:07:13.720
is it so I'm going to use this bad code
00:07:21.310
actually to demonstrate how the
00:07:26.650
GIL works and how JRuby works with
00:07:30.340
multiple threads so we have an empty
00:07:33.190
array and we are going to create five
00:07:35.650
threads and each thread is going to push
00:07:39.669
a nil object a thousand times to this array
00:07:42.370
and actually technically the final
00:07:46.990
answer of array.size should be five
00:07:49.060
thousand because we have five threads
00:07:51.069
and we are going to push a thousand
00:07:53.490
objects from each thread but actually
00:07:59.259
this is the answer that we are getting
00:08:00.729
from Ruby MRI we get five thousand
00:08:05.080
that's the correct answer but JRuby and
00:08:09.419
the Ruby implementations which support
00:08:13.060
parallel processing actually they come
00:08:16.810
close to five thousand but not
00:08:19.930
exactly five thousand that means the
00:08:22.060
answer is incorrect so this is happening
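A minimal sketch of the demo as it is described here. The slide code is not part of this transcript, so this is a reconstruction from the spoken description, and the variable names are assumptions.

```ruby
# Reconstruction (assumed names): five threads each push nil a
# thousand times into one shared array, with no locking at all.
array = []

threads = 5.times.map do
  Thread.new do
    1000.times { array << nil }
  end
end

# Wait for every thread to finish before reading the size.
threads.each(&:join)

# The talk reports 5000 on Ruby MRI and a value slightly below 5000
# on implementations with truly parallel threads such as JRuby.
puts array.size
```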
00:08:27.159
actually because of the parallel
00:08:29.229
processing and we are using shared
00:08:31.000
mutable data here this array which
00:08:35.649
can be edited by multiple threads and
00:08:38.070
because of that with JRuby in parallel
00:08:43.290
multiple threads are actually pushing
00:08:47.500
that nil object to that array and
00:08:49.480
because of that that's what is
00:08:51.850
happening and I would like to take you
00:08:55.029
into the Ruby MRI code which actually
00:08:58.980
pushes
00:09:00.760
objects to a Ruby array so this is the
00:09:05.170
implementation right now and if you can
00:09:09.490
see in line 925 we are getting the
00:09:15.310
length first of all length of the array
00:09:17.440
called idx and then we are pushing the
00:09:20.860
object and then on line 930 we
00:09:24.910
actually increment the idx value by 1
00:09:29.260
that means we are setting the array
00:09:32.080
length to idx plus 1 and we are returning
00:09:35.470
the array so this is the code this is
00:09:39.220
the actual implementation in C and there
00:09:44.110
can be scenarios that I am going to show
00:09:47.350
you right now which can happen
00:09:49.900
it will not happen every time but
00:09:52.120
there's a possibility that this can
00:09:55.150
happen so this is the Ruby MRI version
00:09:59.470
and there's a thread 1 first of all when
00:10:05.440
we start up that process there will be
00:10:07.990
thread 1 and the idx value will be 0
00:10:11.440
because the array has no length yet and
00:10:14.490
it comes to line 930 and we are having a
00:10:19.660
context switch here so we are going
00:10:22.390
to move into thread 2 actually here
00:10:26.380
before executing this line 930 this
00:10:29.520
context switching is
00:10:31.900
happening in this example and in thread
00:10:37.000
2 it doesn't know anything about thread
00:10:39.940
1 that it is going to execute this line 930
00:10:44.470
because before executing line 930 that
00:10:47.440
context switching happens and the idx
00:10:50.410
value in thread 2 is still 0 and we are
00:10:56.290
executing the full method correctly in
00:11:00.850
thread 2 without any context
00:11:03.970
switching and then the context switching
00:11:07.570
happens here in thread 2 and we are
00:11:10.540
passing it to thread 1 again so
00:11:14.710
thread 1 doesn't know that thread 2
00:11:17.560
actually incremented the idx value by one
00:11:20.620
and the only thing thread 1 knows is
00:11:25.720
the idx value is zero so thread
00:11:31.180
1 also increments the value by
00:11:34.840
one and still the array length will
00:11:38.380
be one which is incorrect two threads
00:11:41.940
actually pushed two objects to the array but
00:11:44.680
the length is still one so but here
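The interleaving walked through here can be replayed deterministically in a toy model. `ToyArray` and the symbols `:from_t1` and `:from_t2` are hypothetical names for illustration; the three steps only mirror the read-idx, store, set-length sequence the talk describes, not MRI's real C code.

```ruby
# Toy model of the lost update (hypothetical names, not MRI internals).
ToyArray = Struct.new(:slots, :len)
arr = ToyArray.new([], 0)

# Thread 1: read the length and store the object, then get switched
# out just before the line that sets the new length.
idx_t1 = arr.len               # idx is 0
arr.slots[idx_t1] = :from_t1

# Thread 2: runs the whole push without interruption.
idx_t2 = arr.len               # idx is still 0
arr.slots[idx_t2] = :from_t2   # overwrites thread 1's object
arr.len = idx_t2 + 1           # length becomes 1

# Thread 1 resumes with its stale idx and sets the length again.
arr.len = idx_t1 + 1           # length is 1, not 2

puts arr.len                   # prints 1
puts arr.slots.inspect         # prints [:from_t2]
```

Two pushes were made, but only one element survives and the recorded length is one, which is the kind of result the talk reports seeing on JRuby.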
00:11:50.310
actually this is Ruby MRI's code but
00:11:55.110
we are getting the correct value 5000
00:11:59.020
for the Ruby MRI
00:12:01.960
implementation how can that happen
00:12:03.940
so that can happen because we have a timer
00:12:10.510
thread in Ruby MRI which handles these
00:12:13.240
kinds of scenarios actually that means it
00:12:16.390
will not allow context switching to
00:12:19.450
happen while it is executing this
00:12:22.960
line of code so because of that we are
00:12:25.330
safe enough inside Ruby MRI in these
00:12:28.720
kinds of situations that's why we are
00:12:31.420
getting 5000 that means the
00:12:34.510
correct answer in the Ruby MRI
00:12:36.820
implementation so yes to get rid
00:12:45.820
of that bad code we have a solution if
00:12:49.420
you still want to use that bad
00:12:53.040
implementation and if you don't want to
00:12:56.440
get rid of that yes we can use a mutex
00:13:00.960
a mutex will work like a lock inside your
00:13:06.430
code and the place that you want to lock
00:13:10.170
while executing you can actually put
00:13:14.200
inside a mutex synchronize block so this
00:13:18.750
thousand times block will only be executed by
00:13:23.320
one thread at a time so because of
00:13:26.740
that you are safe
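A minimal sketch of the mutex fix, assuming the same five-thread demo described earlier in the talk. The variable names are assumptions; `Mutex#synchronize` is the standard Ruby call for running a block while holding the lock.

```ruby
# Same demo, but the push loop runs inside a Mutex#synchronize block,
# so only one thread at a time can execute it.
array = []
mutex = Mutex.new

threads = 5.times.map do
  Thread.new do
    mutex.synchronize do
      1000.times { array << nil }
    end
  end
end

threads.each(&:join)

puts array.size # prints 5000
```

Locking around the whole loop follows the description in the talk. Locking around each individual push would also be correct, at the cost of acquiring the lock far more often.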
00:13:28.389
with either Ruby MRI or JRuby
00:13:31.329
you are getting the correct results and
00:13:34.199
yes you are good to go and if you are
00:13:38.220
actually interested enough to go through
00:13:42.579
the code of JRuby here's JRuby's
00:13:47.199
array append code you can go
00:13:50.559
through and actually most likely
00:13:52.809
it's similar and yes you can go through
00:13:57.489
and check what they are doing and yes
00:14:00.970
what's their implementation so the
00:14:05.439
question is are you safe with Ruby MRI
00:14:11.129
actually the answer is even
00:14:19.839
though you are using Ruby MRI and even
00:14:23.379
though you are using MRI with these kinds
00:14:25.420
of bad implementations the Ruby MRI
00:14:31.299
implementation inside will be thread
00:14:34.360
safe but the things that you are doing
00:14:37.299
with shared mutable data that can
00:14:41.499
actually take you to some bad data
00:14:46.089
implementations and wrong
00:14:47.949
implementations of data so because of
00:14:52.600
that you have to make sure when you are
00:14:54.879
using shared mutable data in your
00:14:57.999
systems so you have to make sure that
00:15:00.429
you are using it correctly
00:15:03.660
without allowing multiple threads to
00:15:06.160
access shared mutable data and allowing
00:15:10.499
multiple threads to actually edit shared
00:15:13.660
mutable data at once and if you want to
00:15:16.059
do something like that then use mutex
00:15:19.919
kinds of locks those kinds of things and
00:15:24.149
yes as everyone talks about this Ruby
00:15:28.569
3 actually there's a proposal with
00:15:32.799
Ruby 3 which is going to happen
00:15:35.169
actually I don't know when Ruby 3
00:15:37.749
will be released but
00:15:41.410
there's a proposal called Guild which is
00:15:45.699
going to replace the Ruby global
00:15:49.269
interpreter lock inside Ruby MRI with
00:15:52.119
this Guild implementation so the Ruby
00:15:56.019
core team developer Koichi he has done
00:16:00.339
some talks about this Guild
00:16:02.470
implementation that they are going to
00:16:04.329
actually implement but still they
00:16:07.179
haven't started
00:16:09.119
implementing it so the Guild
00:16:13.539
implementation is something like this so
00:16:17.669
there can be multiple guilds in a program
00:16:20.889
and inside a guild there can be multiple
00:16:23.109
threads and when we are executing guilds
00:16:27.549
inside a guild there will be concurrency
00:16:31.979
that means inside
00:16:37.479
guild 1 T1 and T2 will run
00:16:40.299
concurrently and T2 doesn't care
00:16:43.869
anything about actually G2 doesn't care
00:16:49.299
anything about what G1 is doing
00:16:51.809
and because of that G2 can execute
00:16:55.859
in parallel along with G1 so hopefully
00:17:01.649
this will improve the process and as
00:17:04.839
the Ruby 3 proposal is to be three times
00:17:09.069
faster than Ruby 2 so this will be a
00:17:12.909
really good implementation and yes
00:17:16.809
actually I contacted Koichi a few weeks
00:17:21.490
back and I asked about this
00:17:24.069
implementation and any progress about
00:17:27.220
this but the understanding is that
00:17:35.200
they started implementing actually they
00:17:38.139
started fine-tuning their Fiber
00:17:40.330
implementation in Ruby
00:17:42.879
MRI which will support Guild later and
00:17:46.539
they haven't started implementing Guild
00:17:49.179
yet so we still have several
00:17:53.500
years to go
00:17:55.090
and because I talked about Fiber so
00:18:00.749
Fiber versus Thread this is actually the
00:18:05.139
difference between fibers and threads
00:18:08.909
fibers can do the things that threads are
00:18:13.179
doing but they are lightweight and they
00:18:17.159
are initialized more quickly than threads and they
00:18:20.519
are actually destroyed more quickly than threads
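A small sketch of what a fiber looks like in plain Ruby, using only the core `Fiber.new`, `Fiber#resume` and `Fiber.yield` calls. It is an illustration of the concept, not code from the talk, and the variable names and symbols are assumptions.

```ruby
# A fiber runs only when it is resumed and hands control back
# explicitly with Fiber.yield, so switching is cooperative.
fiber = Fiber.new do
  puts "step 1"
  Fiber.yield :paused   # pause here and return :paused to the caller
  puts "step 2"
  :done                 # the block's last value is the final result
end

first = fiber.resume    # prints "step 1", returns :paused
second = fiber.resume   # prints "step 2", returns :done

puts first.inspect      # prints :paused
puts second.inspect     # prints :done
```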
00:18:23.440
so on top of this fiber layer there are
00:18:27.279
actually several gems that people have
00:18:33.129
developed one is Celluloid which I was
00:18:37.509
working on previously in 2014 when I
00:18:42.309
actually when I was in college I got a
00:18:45.879
chance to work with Celluloid
00:18:48.269
have any of you heard about and worked
00:18:51.070
with Celluloid yeah so the idea of
00:18:57.549
Celluloid is everything inside Celluloid
00:19:00.039
is considered an actor so any
00:19:05.820
process or else any thread that we are
00:19:08.980
starting or else any execution that we
00:19:11.980
are doing is considered an actor
00:19:14.799
inside Celluloid and those actors can
00:19:18.700
communicate with each other using a
00:19:21.730
mailbox so that means something like passing
00:19:25.659
messages between them so this is the
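The mailbox idea can be sketched with plain Ruby threads and queues. This is only an illustration of message passing between actors: `CounterActor`, `send_message` and the message symbols are hypothetical names, and none of this is Celluloid's actual API.

```ruby
# A toy actor: one thread owns the state, and other threads reach it
# only by putting messages into its mailbox (a thread-safe Queue).
class CounterActor
  def initialize
    @mailbox = Queue.new
    @count = 0
    @thread = Thread.new { run }
  end

  # Enqueue a message; reply_to is an optional Queue for the answer.
  def send_message(message, reply_to = nil)
    @mailbox << [message, reply_to]
  end

  def stop
    send_message(:stop)
    @thread.join
  end

  private

  # Messages are handled one at a time, so @count needs no lock.
  def run
    loop do
      message, reply_to = @mailbox.pop
      case message
      when :increment then @count += 1
      when :get then reply_to << @count
      when :stop then break
      end
    end
  end
end

actor = CounterActor.new
senders = 5.times.map do
  Thread.new { 100.times { actor.send_message(:increment) } }
end
senders.each(&:join)

reply = Queue.new
actor.send_message(:get, reply)
total = reply.pop
actor.stop

puts total # prints 500
```

Because only the actor's own thread touches the counter, the shared mutable data problem from earlier in the talk does not arise here, even with several sender threads.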
00:19:28.600
basic idea of
00:19:31.809
Celluloid which is inspired by Erlang
00:19:36.580
those kinds of languages and how they
00:19:39.009
implement their actor models which
00:19:41.999
actually is pretty good if you have any
00:19:45.070
time you can just go and check Celluloid
00:19:48.490
and yes we are welcoming contributors so
00:19:51.669
yes and yeah so that's it for the
00:19:59.889
presentation and any questions
00:20:08.250
any questions for Dilum on Celluloid and
00:20:11.620
the actor model come on if we don't have
00:20:18.220
three questions nobody's going home
00:20:19.690
that's it final thanks for the talk just
00:20:23.200
wondering would you be able to point out
00:20:28.149
any sort of places in common types of
00:20:31.330
applications we build where utilizing
00:20:33.100
either parallelism or concurrency
00:20:36.279
could be a useful thing to do okay do
00:20:41.620
you mean any web applications is it just
00:20:45.340
in any of the sort of apps that
00:20:46.750
people here might be building in their day
00:20:48.820
to day lives what kind of
00:20:53.169
jobs within these apps would adopting an
00:20:55.750
approach like parallelism or concurrency
00:20:57.960
be useful yeah so most probably it will
00:21:02.620
be big data Ruby doesn't actually have
00:21:05.559
much support for data processing
00:21:08.919
right now which always goes to Python so
00:21:12.669
if we can create a really good layer
00:21:16.029
with this kind of Guild implementation
00:21:18.460
and with some
00:21:22.440
really good libraries on top of that
00:21:26.100
we can use them for data processing
00:21:29.919
which right now we are not doing
00:21:32.460
yes those kinds of things so it will be a
00:21:35.919
new age for Ruby I think which we are
00:21:40.509
still using for web kinds of things so
00:21:43.929
yeah thanks yeah t-minus two I'm holding
00:21:52.269
the room hostage Alex you could be our
00:21:55.299
savior please thank you I have a
00:22:03.669
question if you can answer
00:22:05.700
recently like not really recently in
00:22:08.590
April Mike Perham released the new
00:22:11.019
version of Sidekiq previously he used
00:22:13.659
Celluloid as the internal implementation
00:22:15.850
but now
00:22:16.660
he wrote everything in his own
00:22:18.730
implementation for the part of the
00:22:21.130
concurrency do you know the reasons why
00:22:23.170
and any insights on that yeah
00:22:26.800
earlier I think two years back
00:22:30.610
Sidekiq used Celluloid as their
00:22:33.600
concurrency model and actually I think
00:22:37.510
as Sidekiq
00:22:40.960
grew bigger and bigger they wanted to
00:22:42.610
implement their own and they wanted to
00:22:45.070
get rid of Celluloid that means they
00:22:48.790
wanted to get rid of the dependency
00:22:51.820
layer that they were using and also
00:22:54.040
the actor model that we are
00:22:57.010
using inside Celluloid they wanted
00:23:00.700
to get rid of that so that's why
00:23:02.410
Sidekiq implemented their own layer of
00:23:05.590
concurrency which they are using yeah
00:23:09.750
and did that make you sad yeah thank you
00:23:14.860
very much