00:00:15.400
so good morning everyone so let's
00:00:18.039
talk about detecting and classifying object
00:00:20.119
images using Ruby about myself so my
00:00:24.240
name is Fabio Leandro you can find me
00:00:27.160
using my nickname fabiosammy I'm
00:00:31.720
from Brazil so English is not my first
00:00:35.040
language so feedback is welcome
00:00:38.800
please and also I am a software
00:00:41.239
engineer at Codeminer42
00:00:44.520
at Codeminer we are a software boutique
00:00:48.399
company so we can help you
00:00:51.719
with any kind of web application mainly
00:00:55.840
Ruby on Rails applications but feel
00:00:58.440
free to reach us and contact us if
00:01:01.320
you need some help in your team or in
00:01:03.920
your web application all the
00:01:07.400
examples that I'm using today are
00:01:10.560
available at
00:01:12.680
my GitHub page in the repo called
00:01:16.720
rubyconf
00:01:18.320
2024 the QR code goes to the
00:01:22.200
repo so I will talk about my
00:01:25.079
research overview about digital
00:01:28.520
image processing the Ruby options that we
00:01:31.159
had at the time also object detection
00:01:34.920
supervised classification and what we're
00:01:38.000
going to try to do and what else
00:01:40.680
you can also try okay so my master's
00:01:44.439
degree research is basically we
00:01:49.000
collect some soil samples in the field
00:01:51.600
on the farms or whatever the place is we
00:01:54.600
send them to the lab they send us the
00:01:57.479
chemical information about this soil
00:01:59.520
sample
00:02:00.920
and with that we also launch a drone
00:02:04.039
with multiple cameras and sensors to
00:02:06.799
capture the soil
00:02:09.319
image we collect the images apply
00:02:13.120
some formulas there and put them in a
00:02:17.200
CNN and try to get a heat map of the
00:02:20.680
chemicals that are in the
00:02:22.920
soil so basically we have
00:02:25.640
like 10,000 images we try to glue
00:02:30.280
those images which is called image stitching
00:02:34.239
to have a mosaic like a map of the field
00:02:38.560
and after that we crop the image and
00:02:41.720
create some images from that and this is
00:02:44.400
the process that I'll talk about today so
00:02:48.120
digital image processing
00:02:50.440
everything that we do in programming is
00:02:53.480
not about the language it's about the
00:02:55.640
process so for digital image processing
00:02:58.920
basically we have
00:03:00.840
a way to represent the image and
00:03:05.360
most of the time we are using arrays
00:03:07.680
to do that like a grayscale
00:03:13.640
image can be represented by a single
00:03:15.799
value for each pixel and
00:03:19.599
for an RGB image we have for each
00:03:23.799
color band a different value
00:03:26.680
and also those values can depend on
00:03:29.000
the color space and beyond so basically
00:03:32.760
the pattern to handle
00:03:36.480
digital images in a programming
00:03:40.080
language is this kind of
00:03:43.720
process and today we have like
00:03:47.400
two major projects that are focused on
00:03:50.480
digital image processing they're called
00:03:52.879
ImageMagick and OpenCV ImageMagick
00:03:56.560
is more focused on the image per se
00:04:01.239
like cropping doing some blur and this
00:04:04.599
kind of stuff and OpenCV has the
00:04:07.360
digital image processing there but also
00:04:09.879
is used for visual computing like
00:04:13.680
detecting sub-images and
00:04:17.120
this kind of stuff and since I'm from
00:04:20.280
academia also a master's degree student
00:04:23.919
we need to look at publications
00:04:29.560
and see what is going on in the
00:04:32.039
academic field if you look at Elsevier and
00:04:37.759
IEEE publications you are going to
00:04:41.639
get like 17 publications
00:04:46.800
in the last four years using ImageMagick
00:04:49.560
but with OpenCV it's more than 6,000
00:04:53.919
publications in the last five
00:04:57.000
years so we chose to use OpenCV
00:05:04.160
for our
00:05:07.520
research okay and OpenCV has a
00:05:11.919
bunch of wrappers that you can use
00:05:15.759
in C Python Java and also Ruby but
00:05:22.520
sadly the Ruby wrapper is too old
00:05:26.440
like the last commit was I don't
00:05:29.360
know eight years ago maybe seven so
00:05:34.160
we had some choices to make we needed
00:05:37.600
to decide whether to keep up with the
00:05:39.800
Ruby wrapper or I don't know go to
00:05:43.560
ImageMagick the ImageMagick gem
00:05:47.479
is up to date maybe create our
00:05:51.039
own implementation since it's just a
00:05:53.199
process or other OpenCV implementations
00:05:57.240
from
00:05:58.160
Ruby so we had to make a choice the
00:06:01.560
other three were not the
00:06:05.560
fastest way for us to reproduce
00:06:09.039
some articles
00:06:11.280
so we found
00:06:13.800
PyCall please don't leave yet I
00:06:19.280
know but yeah PyCall is to call Python
00:06:24.120
from Ruby yes the author is Kenta Murata
00:06:27.759
and the first release was in 2016
00:06:33.160
and the latest release is from May of
00:06:37.000
this
00:06:37.919
year okay so our problems are
00:06:43.039
resolved okay so using PyCall is as simple as
00:06:46.960
that you load the PyCall
00:06:50.360
import include the methods and use a
00:06:53.360
method called pyimport pyimport
00:06:57.039
will import the package from Python
00:06:59.759
so you need to set up the entire
00:07:03.160
environment the entire Python
00:07:05.160
environment and install the Python
00:07:07.199
packages that you're going
00:07:08.879
to use in my repo I put a Dockerfile
00:07:12.800
there that creates the whole
00:07:16.319
Python environment and installs
00:07:18.520
OpenCV and other tools that I will
00:07:22.160
show you
00:07:23.319
today okay this is an example of
00:07:28.720
how you can
00:07:30.160
do a hello world using
00:07:33.120
PyCall so the print is executed by
00:07:37.160
Python and you get the object since
00:07:40.400
print does not return a value it's just
00:07:44.520
stdout output the value is nil on our
00:07:50.720
side okay so we have Python now
00:07:54.919
let's try OpenCV to load
00:07:58.159
OpenCV in Python it's called
00:08:01.080
cv2 and loading images is as simple as
00:08:05.960
that you can call the imread
00:08:10.520
method from Python and as I said before it is
00:08:14.520
represented by an
00:08:16.440
array so this is an example of an RGB
00:08:20.919
image well in OpenCV it's not RGB
00:08:27.560
OpenCV loads it as BGR so it swaps the
00:08:33.760
color channels long story short it's about
00:08:36.880
history and legacy from the early
00:08:40.719
2000s okay so the first pixel
00:08:46.519
is represented by an array with those values
00:08:49.760
and the value range by default
00:08:53.920
is 0 to 255 and it's BGR so this
00:08:59.440
first position represents the
00:09:02.519
blue color then the green color and the red color
00:09:06.399
and for you to convert the BGR image
00:09:10.800
to a grayscale image the cv2
00:09:15.000
cvtColor is there it's a method from
00:09:17.640
OpenCV you can call it and all of
00:09:20.959
this is executed by Python but you can
00:09:24.079
have the value in Ruby so this is
00:09:28.720
an
00:09:31.200
IRB shell that is calling
00:09:35.839
through Ruby so the value at the end is
00:09:40.519
161 so it's just one value for the
00:09:43.360
gray pixel and there you have a
00:09:47.079
grayscale image simple as
00:09:50.320
that because we have OpenCV
00:09:53.519
working for
00:09:54.720
us okay so this is another example where
00:09:58.480
you can just resize the image draw a
00:10:01.399
rectangle in there and write the new
00:10:04.000
image using the object from
00:10:08.399
Ruby okay so now we have a new image
00:10:11.800
there that's
00:10:13.959
great and this is
00:10:18.920
the option that we have that
00:10:23.959
we are TR about it because we we can
00:10:27.720
separate all the pixels
00:10:31.680
by their value like split the image
00:10:35.320
so I have an image that represents the
00:10:38.200
blue value the green value and the red
00:10:40.240
value for a BGR image it's as simple
00:10:43.600
as that but for our research we are using
00:10:47.200
multispectral images with 10 color bands in
00:10:50.839
a single image like
00:10:53.880
infrared near-infrared and this kind
00:10:55.920
of stuff and with that
00:11:00.079
we can apply some formulas
00:11:03.519
that we call vegetation
00:11:05.720
indexes which are basically
00:11:08.000
formulas that the academic field
00:11:11.399
provides for us
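
A vegetation index is per-pixel arithmetic over two or more bands. The talk does not name the indexes it uses, so this minimal sketch assumes NDVI, (NIR - Red) / (NIR + Red), purely as an illustration. Plain nested Ruby arrays stand in for the arrays OpenCV would hand back through PyCall, and the `ndvi` helper and the sample band values are hypothetical.

```ruby
# NDVI = (NIR - Red) / (NIR + Red), computed pixel by pixel.
# Each band is a nested array: rows of pixel values.
def ndvi(nir_band, red_band)
  nir_band.zip(red_band).map do |nir_row, red_row|
    nir_row.zip(red_row).map do |nir, red|
      sum = nir + red
      # Guard against division by zero on fully dark pixels.
      sum.zero? ? 0.0 : (nir - red).to_f / sum
    end
  end
end

# Hypothetical 2x2 near-infrared and red bands.
nir = [[200, 50], [120, 0]]
red = [[50, 50], [40, 0]]

index = ndvi(nir, red)
index.each { |row| puts row.map { |v| v.round(2) }.inspect }
```

Values near 1 mean the near-infrared band dominates and values near 0 mean the two bands are balanced. Other indexes follow the same pattern with a different formula inside the inner block.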
00:11:15.079
and man just doing this kind of
00:11:19.440
stuff using Ruby
00:11:22.959
is awesome for us so basically
00:11:26.240
oh and at the end we can apply a
00:11:28.800
color map so the raw image is
00:11:32.920
represented like that like a binary image
00:11:35.720
just zeros and ones but with a color
00:11:39.920
map we can have a human view to
00:11:46.959
identify whatever we want to identify in
00:11:49.760
the image
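
A color map only changes how values are displayed; the underlying numbers stay the same. OpenCV ships ready-made color maps, but this sketch uses a hand-rolled blue-to-red gradient in plain Ruby to show the idea. The helpers `colorize` and `apply_color_map` and the sample data are hypothetical, and the output triples are in blue, green, red order to match OpenCV's convention.

```ruby
# Map a value in 0.0..1.0 to a [blue, green, red] triple:
# low values come out blue, high values come out red.
def colorize(value)
  v = value.clamp(0.0, 1.0)
  [(255 * (1.0 - v)).round, 0, (255 * v).round]
end

# Normalize a single-band image to 0.0..1.0, then colorize every pixel.
def apply_color_map(band)
  min, max = band.flatten.minmax
  range = (max - min).to_f
  band.map do |row|
    row.map do |pixel|
      # A flat image has no range; map it to the low end.
      normalized = range.zero? ? 0.0 : (pixel - min) / range
      colorize(normalized)
    end
  end
end

# Hypothetical raw image holding only zeros and ones.
raw = [[0, 1], [1, 0]]

colored = apply_color_map(raw)
colored.each { |row| puts row.inspect }
```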
00:11:54.120
for like for the for the Mets and this
00:11:58.440
kind of stuff they are the same values but
00:12:01.320
it's more for a human being to see the
00:12:04.800
image and we can see whatever we want to
00:12:08.800
see and this is the kind of stuff that we
00:12:12.440
can get applying multiple vegetation
00:12:15.760
index formulas so this is the same
00:12:18.199
image and we can have
00:12:22.880
multiple pieces of information based on a single
00:12:25.279
part of the image by applying
00:12:28.560
multiple formulas there all the
00:12:30.920
formulas that I use in these examples
00:12:33.040
are also in the
00:12:35.800
repo so that is great we can use PyCall
00:12:39.760
to load OpenCV but not
00:12:43.839
just OpenCV we can also use
00:12:48.519
some algorithms to detect
00:12:52.079
objects we usually
00:12:57.800
use supervised
00:12:59.920
AI you know to classify the image it
00:13:04.240
is not fully automatic AI but it's
00:13:06.920
more like oh I need
00:13:11.279
you to focus on this and this kind of
00:13:14.160
class so to do that we need to
00:13:17.360
generate some objects and to
00:13:20.240
generate objects we can also use OpenCV
00:13:23.720
like finding contours you can
00:13:26.720
apply a threshold that will
00:13:29.839
find some edges of the pixels
00:13:33.360
like the color space and this kind of
00:13:36.040
stuff so as I said before OpenCV
00:13:39.480
provides all of this for us and we can
00:13:43.279
have an image like this where
00:13:46.480
all the green stuff are objects
00:13:49.600
in the
00:13:50.800
image I applied this example to the
00:13:54.279
RGB image but if I use the
00:13:58.759
vegetation index image I get this
00:14:02.920
kind of objects so that's why we can
00:14:06.839
apply this kind of stuff like
00:14:09.519
that and also we
00:14:11.920
have other algorithms
00:14:16.399
for example sliding window is
00:14:18.720
another algorithm from academia that
00:14:21.600
just slides a window that crops multiple
00:14:24.519
images out of the image that you
00:14:27.720
have and selective search is also
00:14:31.079
another way to generate your
00:14:33.880
objects all of those algorithms
00:14:38.440
are available in my
00:14:42.440
repo okay
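
The sliding window idea can be sketched in a few lines of plain Ruby. This is an illustration rather than the code from the speaker's repo: the `sliding_windows` helper, its `size:` and `step:` parameters, and the 4x4 sample image are all hypothetical, and the image is a nested array instead of an OpenCV array.

```ruby
# Yield [x, y, window] for every size-by-size window that fits in the image,
# moving step pixels at a time. Returns an Enumerator when no block is given.
def sliding_windows(image, size:, step:)
  return enum_for(:sliding_windows, image, size: size, step: step) unless block_given?

  height = image.length
  width = image.first.length
  (0..height - size).step(step) do |y|
    (0..width - size).step(step) do |x|
      # Take size rows starting at y, then size columns starting at x.
      window = image[y, size].map { |row| row[x, size] }
      yield x, y, window
    end
  end
end

# Hypothetical 4x4 single-band image.
image = [
  [1, 2, 3, 4],
  [5, 6, 7, 8],
  [9, 10, 11, 12],
  [13, 14, 15, 16]
]

crops = sliding_windows(image, size: 2, step: 2).to_a
crops.each { |x, y, window| puts "x=#{x} y=#{y} #{window.inspect}" }
```

Each crop keeps its top-left position, so a label predicted for a window can later be placed back on the original image. A step smaller than the window size produces overlapping crops.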
00:14:44.639
good right we have an object I can load it
00:14:48.279
in Ruby I can extract information from
00:14:50.639
it but where is the AI well you
00:14:56.000
can use many approaches to
00:14:59.800
classify an
00:15:02.279
image and you need to see what
00:15:06.440
academia is using go check
00:15:10.680
Elsevier and
00:15:12.240
IEEE but today the hype is deep
00:15:16.240
learning CNNs transfer learning
00:15:18.160
transformers and this kind of stuff
00:15:21.000
it's basically the same
00:15:23.639
approach that LLMs use to
00:15:27.880
achieve
00:15:29.680
their results it's not exactly the same
00:15:33.279
way but it's similar also for images
00:15:37.600
you can check the challenges or
00:15:39.959
benchmarks called ImageNet
00:15:42.720
[inaudible] and Kaggle competitions
00:15:47.000
they are like the benchmarks for results
00:15:50.440
of classifying
00:15:53.040
images
00:15:55.040
okay so for this example I'm using
00:15:59.600
the Caffe deep learning framework it
00:16:04.240
is built by Berkeley
00:16:05.839
University and it's as simple as that
00:16:09.680
you load the deploy prototxt this is
00:16:13.639
a JSON-like file that has the
00:16:16.519
information about the layers and your
00:16:19.920
model the model is basically a huge
00:16:22.079
array with a lot of random values that
00:16:26.560
put the weights on the network to
00:16:31.000
get the results and then you can
00:16:34.120
reshape the first layer for this
00:16:37.519
example the data layer is the input
00:16:41.240
where you insert the image and
00:16:43.759
basically you can do whatever you want
00:16:46.120
but you need to
00:16:49.440
define how many channels you have
00:16:53.480
and the window size of each
00:16:55.720
image and the Caffe framework has a
00:16:58.519
transformer to set a pattern
00:17:03.519
for your image like what
00:17:07.039
the channels will be what the
00:17:09.480
sequence of the color channels is
00:17:13.959
what the values will be what the
00:17:16.079
size is and this kind of stuff and you can
00:17:18.839
just call it and preprocess the image
00:17:21.319
right
00:17:22.000
there and to perform the
00:17:25.360
evaluation of the image you call
00:17:29.200
forward and you're going to have a
00:17:31.400
result each model will have its
00:17:35.440
own layers to evaluate
00:17:38.880
the results for this specific case
00:17:42.919
the layer is named softmax and basically
00:17:45.799
we are going to have the
00:17:49.919
label names and
00:17:52.799
the accuracy for each label for
00:17:57.559
the example I'm just using a high
00:18:00.840
level of a chemical and a low level of
00:18:03.559
a chemical for this example it is 99% that
00:18:07.360
it is a low level of the chemical that
00:18:10.400
I'm evaluating
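
A softmax output layer turns the network's raw scores into one probability per label, which is where a figure like 99% comes from. This is a minimal standalone sketch in plain Ruby, not Caffe's implementation: the `softmax` helper, the label names, and the raw scores are hypothetical, with the scores picked so that the "low" class lands near 99%.

```ruby
# Turn raw scores into probabilities that sum to 1.
def softmax(scores)
  max = scores.max
  # Subtracting the max keeps Math.exp from overflowing on large scores.
  exps = scores.map { |score| Math.exp(score - max) }
  total = exps.sum
  exps.map { |e| e / total }
end

labels = ["low", "high"] # hypothetical class names
scores = [4.6, 0.0]      # hypothetical raw network outputs

probabilities = softmax(scores)
# Index of the most probable label.
best = probabilities.each_with_index.max.last

labels.zip(probabilities).each do |label, prob|
  puts format("%-4s %.3f", label, prob)
end
puts "predicted: #{labels[best]}"
```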
00:18:13.720
yeah so for this kind of stuff
00:18:17.919
I'm using aluminum values so it's
00:18:21.480
basically low level and high level of
00:18:25.080
aluminum
00:18:26.799
if you see the image it seems okay great
00:18:31.720
like you have a roof and maybe it's
00:18:34.720
high aluminum there and you have a
00:18:39.039
road that is low aluminum but I
00:18:42.760
grabbed the best examples of that but
00:18:47.200
as I said sometimes the
00:18:53.400
AI hallucinates those values so yeah
00:18:58.799
we can have like all those scripts using
00:19:01.720
Python that academia provides the
00:19:04.600
implementation for that and you can use
00:19:08.159
in Ruby and this is great okay I showed
00:19:12.600
you how to use a model but how can I
00:19:15.480
train or build my own model to do that
00:19:18.679
is not a simple task because you need to
00:19:21.480
evaluate that and it's more like you
00:19:25.360
need to build your own database
00:19:29.880
separate the images visualize the
00:19:32.760
results and try again try again try
00:19:35.320
again try again try again try again
00:19:37.520
it's a huge loop so for such a
00:19:42.760
task we use NVIDIA DIGITS it's a
00:19:46.240
platform built by NVIDIA and it's
00:19:52.320
not only for Caffe you can use TensorFlow
00:19:54.960
and other projects
00:19:59.200
to do
00:20:00.159
that
00:20:02.679
okay so I showed you deep
00:20:05.679
learning and some algorithms that you
00:20:08.559
can use for that but in academia we also
00:20:12.200
have a bunch of other algorithms to do
00:20:15.000
that like KNN SIFT SVM YOLO is another
00:20:19.640
kind of deep learning also a
00:20:22.679
basic one such as Euclidean distance it
00:20:27.280
is more like try it on
00:20:30.760
your problem see the results try again
00:20:34.039
change the thing try again see the
00:20:36.360
results and so on and so on
00:20:40.159
so for us Caffe we have great
00:20:43.720
results using Caffe for images but
00:20:48.200
sometimes you are going to use like I don't
00:20:51.159
know uh see to to use for audio some
00:20:56.520
something like that okay and the
00:21:01.280
results for my dissertation just to
00:21:04.159
show you so the best
00:21:09.840
results that I have until now I have for
00:21:12.679
potassium phosphorus and organic matter
00:21:15.880
like I have 64% for potassium 74 75%
00:21:21.080
for phosphorus and 74% for organic
00:21:25.080
matter like so this is the result that
00:21:28.600
we are trying to achieve creating
00:21:31.279
some kind of heat map
00:21:35.000
but this is in some way fake because the
00:21:40.480
result is like an image with 10 color
00:21:43.760
bands and we cannot represent that for
00:21:46.799
a human
00:21:47.679
view so I just applied the color map
00:21:51.080
that I showed
00:21:52.880
before also to correlate the image
00:21:56.520
and its position we are using two other
00:21:59.080
packages that are called GDAL and JY
00:22:02.919
we have both
00:22:05.400
we have similar packages similar
00:22:09.360
gems that can be used but it's not like
00:22:14.799
what we have in the Python world and also a
00:22:18.520
great package to stack these multiple color
00:22:22.720
band images we use
00:22:25.919
rasterio future steps
00:22:29.279
you can try using TensorFlow which is
00:22:32.240
another deep learning framework made
00:22:35.039
by Google with a lot of research and it
00:22:40.520
has more tools than Caffe provides for you
00:22:44.440
also we want to try to use ruby-julia
00:22:47.679
Julia is another programming language
00:22:49.880
and ruby-julia is something like PyCall
00:22:52.320
you can call the Julia language
00:22:55.640
from Ruby but it needs work maybe
00:22:59.679
in two years we're going to try that
00:23:03.799
also we are testing for other
00:23:07.200
elements so I'm a
00:23:11.080
liar the message that I can put
00:23:16.559
on the table is more like Ruby is
00:23:20.679
a great community we know how to build
00:23:23.240
software but we cannot keep
00:23:26.279
competing
00:23:29.200
with other fields like AI
00:23:34.000
today it's the baby steps of AI we
00:23:38.960
have been in baby steps for the last 15 years
00:23:42.080
but okay and it's more like academia
00:23:46.600
is building that and they are using
00:23:50.039
Python and we like Ruby you know it's
00:23:55.320
beautiful
00:23:57.200
and I don't want to compare the
00:24:00.880
languages but more like I like to be on
00:24:05.039
Ruby I like to use Ruby so PyCall provides
00:24:09.720
a lot of opportunities for us to
00:24:13.240
bring what they are building
00:24:16.120
for our world
00:24:20.640
okay
00:24:22.919
um I think that we have some sometimes
00:24:25.440
for keyway any questions yes please
00:24:29.440
no no for us is uh more agriculture but
00:24:33.240
we have some teams that are using for
00:24:35.440
environment like they need to uh to to
00:24:39.320
see where where we have some rivers and
00:24:43.360
like determine how much of trees we need
00:24:50.080
uh inside of the rivers you know uh on
00:24:54.279
the on the on the field uh but like if
00:24:57.720
you send the Drone to the field you
00:25:00.600
cannot capture the the rivers because
00:25:03.520
the trees will will close the The
00:25:11.880
View
00:25:13.799
please special drones or like an off the
00:25:16.240
shelf drone no we usually build our
00:25:19.399
own sometimes we use a Phantom
00:25:23.679
4 and we just have a mount to
00:25:27.760
put some cameras on
00:25:30.080
it but usually they build our own
00:25:35.679
drones
00:25:37.480
using I don't remember
00:25:39.880
the framework to do
00:25:44.320
that that's
00:25:46.480
it okay thank you guys feel free to reach
00:25:51.279
me