00:00:00.000
ready for takeoff
00:00:18.900
so welcome my name is Ali I'm going to
00:00:22.800
talk about stream processing with Ruby
00:00:25.500
and specifically Turbine RB
00:00:29.220
so here's a quick agenda essentially
00:00:31.080
this is what we're going to cover so you
00:00:32.399
know what you're getting yourself into
00:00:36.059
let me start off with a little
00:00:38.700
bit about myself uh who am I why you
00:00:41.100
should trust me
00:00:43.200
so this is me I'm the CTO and one of two
00:00:46.440
co-founders at Meroxa DeVaris the other
00:00:48.420
co-founder is right there
00:00:49.920
and uh previously before
00:00:53.700
starting Meroxa I was a lead engineer at
00:00:56.340
Heroku specifically on the Heroku Data
00:00:58.140
team mainly working on uh Heroku's Kafka
00:01:01.260
offering where my team managed thousands
00:01:05.400
of Kafka clusters for tens of thousands
00:01:07.680
of customers uh before that I built a
00:01:10.799
system at a targeted advertising company
00:01:12.299
that queried over 2 billion user
00:01:14.580
profiles uh in real time
00:01:16.799
and then way way way before that I built
00:01:19.380
analytics pipelines for mobile apps
00:01:21.780
um processing regularly over 100 000
00:01:24.740
events per second
00:01:26.700
and so basically I've been doing this
00:01:28.619
for for quite a while working in and
00:01:30.360
around the data space
00:01:34.680
so stream processing what is it and why
00:01:38.159
you should care
00:01:40.680
so specifically in stream processing
00:01:43.979
what I mean by stream processing
00:01:45.600
is really about taking an unbounded
00:01:47.720
sequence of events uh continuous
00:01:50.460
unbounded sequence of events and
00:01:52.079
applying some sort of computation or
00:01:53.700
transformation to it
00:01:55.140
I'm intentionally avoiding the term real
00:01:57.060
time
00:01:58.259
um that's generally implied but there's
00:02:00.479
no
00:02:01.380
generally accepted agreed upon
00:02:03.600
definition for real time but essentially
00:02:06.899
not batch whatever that means to you
00:02:10.979
so some examples of stream processing in
00:02:13.080
general are filtering where you get a number of
00:02:15.300
events and you want to drop some of them
00:02:16.879
enrichment you want to take each event
00:02:18.900
and you want to augment it with some
00:02:20.160
additional information
00:02:21.739
aggregation where you want to do some
00:02:23.879
sort of processing across a number of
00:02:25.200
them maybe count them sum them do
00:02:27.060
some sort of calculation there joins
00:02:30.360
typically is similar to a SQL join you
00:02:33.660
want to take two sets of data mash them
00:02:35.400
together by some common element and then
00:02:37.980
routing is kind of another one where you
00:02:40.680
want some events to go one place and
00:02:42.180
other events to go somewhere else
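Those operations can be sketched in plain Ruby. Here's a minimal illustration using a lazy enumerator as a stand-in for an unbounded event source — the event shape and the region field are invented for the example:

```ruby
# An endless event source: a lazy enumerator standing in for a stream.
events = Enumerator.new do |y|
  i = 0
  loop do
    y << { id: i, type: i.even? ? "click" : "heartbeat" }
    i += 1
  end
end

processed = events.lazy
  .select { |e| e[:type] == "click" }        # filtering: drop heartbeats
  .map    { |e| e.merge(region: "us-east") } # enrichment: add a field
  .first(3)                                  # take a finite sample to inspect

puts processed.inspect # => events 0, 2 and 4, each tagged with a region
```

Because the chain is lazy, only enough events are pulled to satisfy `first(3)` — the same shape of computation a stream processor applies continuously.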
00:02:46.379
um some some common use cases that
00:02:48.599
should be familiar to most people
00:02:51.060
um you know analytics is probably one of
00:02:52.680
the most common uh it's one of the most
00:02:54.599
common ones that we see at least and
00:02:56.640
essentially you're taking data from a
00:02:58.800
number of different sources it could be
00:03:00.180
your operational database maybe it's a
00:03:01.920
Postgres database backing your Rails
00:03:03.480
application you're taking some data from
00:03:05.640
you know support tickets in Zendesk
00:03:07.620
and maybe some CRM data from Salesforce
00:03:09.840
and you're pulling them all into a
00:03:12.120
single data warehouse where your data
00:03:14.280
scientists run some queries and and sort
00:03:16.319
of derive some insight out of it
00:03:19.560
um another common use case is
00:03:20.940
replication and disaster
00:03:22.379
recovery
00:03:23.459
and so here you're continuously and
00:03:25.739
hopefully immediately pulling data from
00:03:29.099
one place and putting it into some other
00:03:31.400
region or data center or or Cloud even
00:03:35.340
um across you know geographical
00:03:37.019
distances in order to have a uh another
00:03:40.440
place that you can recover from this
00:03:42.840
could also be different database types
00:03:45.299
so maybe you're doing Postgres on RDS
00:03:48.120
in AWS and you're copying over to SQL
00:03:51.180
Server in Azure on a different
00:03:53.760
side of the country
00:03:57.480
um enrichment is another very common one
00:03:59.280
for us so essentially you're taking some
00:04:01.860
data uh maybe it's a user sign up and
00:04:05.159
you want to add some additional
00:04:06.959
information to make that data more
00:04:08.280
useful to you so maybe you look up their
00:04:10.680
email with some third-party service that
00:04:13.560
gives you a little bit more information
00:04:14.459
about them maybe the company the role or
00:04:16.799
whatever it is and then you're taking
00:04:18.299
that sort of fatter enriched record and
00:04:20.220
then you're putting it somewhere else so
00:04:21.540
you can use it maybe it's back in your
00:04:22.740
operational database maybe it's in your
00:04:24.660
your data warehouse
00:04:26.400
and then uh I've listed integration
00:04:29.040
which is a super vague general catch-all
00:04:31.560
for like everything else and essentially
00:04:34.620
taking your data and putting it
00:04:36.419
somewhere else where it can be used by
00:04:37.919
someone else
00:04:38.960
this could be third parties it could be
00:04:41.699
other teams maybe you scrub the PII out
00:04:45.540
of your stream of data and you make it
00:04:47.340
available for a partner to use
00:04:49.919
um that's that's kind of a common
00:04:51.419
example too
00:04:55.500
and so what is the problem
00:04:58.560
right with stream processing right now
00:05:02.160
essentially you know everyone here I
00:05:04.259
assume loves Java it's your favorite
00:05:05.699
language uh clearly Ruby conference must
00:05:08.880
love Java
00:05:10.259
um nothing nothing wrong with Java but
00:05:12.240
essentially if you do enough stream
00:05:14.100
processing you're going to end up with
00:05:15.360
Java somewhere Kafka is written in Java
00:05:17.639
Kafka Connect is written in Java Kafka
00:05:19.199
Streams is written in Java Pulsar is
00:05:21.000
Java Spark is Java Flink is Java Java is
00:05:23.580
everywhere and that's great if you love
00:05:25.919
Java if you don't then that kind of
00:05:28.800
sucks
00:05:30.479
um so that's sort of one major obstacle
00:05:32.880
with stream processing especially for
00:05:34.199
everyone else
00:05:36.060
and then the other sort of major part of
00:05:38.400
it is stream processing introduces a ton
00:05:40.860
of new sort of patterns and paradigms
00:05:42.660
that aren't really common elsewhere so
00:05:44.820
if you're used to building web
00:05:46.199
applications with a regular request
00:05:47.639
response cycle now you have to worry
00:05:49.800
about delivery semantics uh is it at
00:05:51.960
least once is it at most once is it
00:05:54.240
exactly once with scare quotes
00:05:57.440
you know ordering guarantees what are
00:06:00.600
they is it strictly ordered is it
00:06:02.580
globally ordered is some subset of it
00:06:04.740
ordered late delivery is something you
00:06:07.320
don't typically have to deal with you
00:06:09.539
might get a message seconds later or
00:06:11.820
days later or even weeks later what do
00:06:13.740
you do with that message
00:06:15.600
and then you get duplicates that's kind
00:06:18.120
of an annoying one that's pretty common
00:06:19.380
especially when the default is at least
00:06:21.539
once in many cases
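The standard coping strategy for at-least-once delivery is to make the consumer idempotent. A small sketch — the event ids and the in-memory seen-set are invented; in production the seen-set would live somewhere durable:

```ruby
require "set"

# A consumer that processes each event id at most once, so broker
# redeliveries (at-least-once semantics) don't cause double-processing.
class IdempotentConsumer
  def initialize
    @seen = Set.new
    @handled = []
  end

  # Ignore any event whose id has already been observed.
  def handle(event)
    return if @seen.include?(event[:id])
    @seen << event[:id]
    @handled << event
  end

  attr_reader :handled
end

consumer = IdempotentConsumer.new
# The broker redelivers event 1 — the duplicate is dropped.
[{ id: 1, v: "a" }, { id: 2, v: "b" }, { id: 1, v: "a" }].each { |e| consumer.handle(e) }
puts consumer.handled.size # => 2
```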
00:06:23.960
then
00:06:25.500
you have to think about partitions and
00:06:27.000
topics so if you work with Kafka
00:06:29.100
partitions are the scaling unit and so
00:06:31.440
you really need to get it right the
00:06:32.940
first time around because changing it
00:06:34.319
later is painful
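One reason changing it later hurts: records are typically routed to a partition by hashing a key modulo the partition count, so resizing can move keys between partitions and break any per-partition ordering you relied on. A toy sketch (real Kafka clients use murmur2 rather than CRC32):

```ruby
require "zlib"

# Route a record to a partition by hashing its key.
def partition_for(key, num_partitions)
  Zlib.crc32(key) % num_partitions
end

# The same key always lands on the same partition...
puts partition_for("user-42", 6) == partition_for("user-42", 6) # => true

# ...but changing the partition count can reshuffle which
# partition a given key maps to.
p6  = partition_for("user-42", 6)
p12 = partition_for("user-42", 12)
puts [p6, p12].inspect
```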
00:06:35.880
and so these are all things that you
00:06:38.039
don't typically have to worry about and
00:06:39.840
you don't want to worry about
00:06:41.720
it's not something that you should
00:06:43.740
really care about you should just use
00:06:45.180
the tools someone else should
00:06:47.160
worry about these things
00:06:52.020
another major part of it is where do you
00:06:54.360
deploy this stuff so if you have a
00:06:56.160
stream processing application it does
00:06:57.960
something useful now what where do you
00:07:00.419
run it and how do you maintain it and
00:07:02.400
how do you make sure that it runs
00:07:03.660
consistently performs well all the stuff
00:07:07.199
um so the easy answer is
00:07:10.080
yeah it's easy all you need to do is set
00:07:12.780
up a VPC set up your subnets IPs
00:07:16.139
configure some security groups spin up
00:07:17.940
some EC2 instances deploy Kubernetes
00:07:20.520
provision Kafka create topics create
00:07:22.259
partitions wire everything up make sure
00:07:24.660
ACLs are in place
00:07:26.720
you know configure 8 000 million
00:07:29.580
different things wire everything up and
00:07:31.740
yeah it's good that's all you need to do
00:07:33.960
so just do this thing
00:07:38.039
um yeah so it's it's not it's not easy
00:07:40.979
um if you look at some of the AWS guides
00:07:42.419
for setting up vanilla Kubernetes
00:07:44.539
it's like 60 pages
00:07:47.880
um and 10 of those pages are like create
00:07:49.740
your VPC and configure everything
00:07:51.419
correctly the first time because if you
00:07:53.880
get the subnets wrong then it really is
00:07:56.039
super painful to fix it
00:07:58.380
um so that's a big part of that is you
00:08:00.240
know once we have this thing where do we
00:08:02.160
run it and how do we make sure that it
00:08:03.479
runs consistently and performs well and
00:08:05.940
does all the things that we needed to do
00:08:09.240
so that's kind of the the problem space
00:08:10.919
that we're trying to tackle
00:08:14.400
so for us at Meroxa our answer is
00:08:17.340
basically uh Turbine and the Meroxa
00:08:19.680
data platform and so Turbine is
00:08:22.080
sort of the tool chain and the data
00:08:24.000
platform is the platform as a service
00:08:25.440
that runs the tool chain
00:08:29.759
so I'm going to dig into turbine a
00:08:31.139
little bit
00:08:31.979
um
00:08:32.640
essentially turbine is is the framework
00:08:34.500
that we work with it's actually a family
00:08:36.419
of frameworks for various languages we
00:08:39.479
started with Go and JavaScript and
00:08:41.219
Python and at this conference we're
00:08:43.320
making Turbine available for Ruby as
00:08:45.779
well
00:08:46.440
and so each turbine framework is sort of
00:08:49.500
individually handcrafted for that
00:08:51.540
particular language to follow idiomatic
00:08:54.600
practices for that language and so that
00:08:56.640
it looks familiar and you know works in
00:08:59.399
the way that you expect it to as someone
00:09:01.200
who writes Ruby day in day out
00:09:04.740
the other sort of main focus for for
00:09:06.720
Turbine is we've introduced an API
00:09:10.140
that exposes a high level sort of
00:09:12.540
abstraction on top of these
00:09:14.700
common things so as long as you can
00:09:17.399
assign variables and call methods then you
00:09:20.279
should be able to create rich
00:09:22.380
stream processing applications
00:09:25.560
uh the other sort of key part for us is
00:09:27.600
you can write custom logic in that
00:09:29.760
language and so if you're using turbine
00:09:31.740
for Ruby you can write logic in Ruby in
00:09:35.580
familiar Ruby that looks like Ruby
00:09:37.320
doesn't introduce any weird dsls or
00:09:39.180
anything it also lets you import
00:09:40.920
RubyGems that you might already have or
00:09:43.080
might already exist online so you can
00:09:44.880
import those in and use them with your
00:09:46.500
your turbine app to actually help you
00:09:48.600
process these these events
00:09:55.560
so this is what it looks like so this is
00:09:57.899
a turbine app
00:09:59.880
it's obviously a very a simple example
00:10:01.740
but you can kind of expand this as you
00:10:03.720
go along but it should look very
00:10:06.240
familiar uh it's very much inspired by
00:10:08.220
the Rack API and so it should look
00:10:10.380
pretty familiar to to anyone who's been
00:10:12.180
writing Ruby for for any amount of time
00:10:15.000
um essentially we expose a number of
00:10:17.100
methods that allow you to tap into a
00:10:19.800
resource in this case it's a database
00:10:22.980
resource named demo PG
00:10:25.200
and then you pull records out of a table
00:10:28.080
called events you process them with a
00:10:31.800
process called pass-through and then you
00:10:34.200
write it to the same database in a
00:10:36.480
different collection and so what you'd
00:10:38.339
expect here is you're basically creating
00:10:40.320
a very simple pipeline that pulls data
00:10:42.240
from one place processes it with the
00:10:44.339
function pass through which is actually
00:10:45.540
written below
00:10:47.480
and then writes it out into the database
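The flow described above might be sketched like this in plain Ruby, with an in-memory stand-in for the database so it runs anywhere — the `records`/`write` names are modeled on the talk's description, not the actual Turbine RB API:

```ruby
# A stand-in for a platform resource like the "demo PG" database.
class InMemoryResource
  def initialize(collections)
    @collections = collections
  end

  def records(collection:)
    @collections.fetch(collection, [])
  end

  def write(records, collection:)
    @collections[collection] = records
  end
end

# The custom processing function from the example: a pass-through
# that returns every record unchanged.
def pass_through(records)
  records
end

db = InMemoryResource.new(
  "events" => [{ "id" => 1, "activity" => "logged_in" }]
)

# Read from one collection, process, write to another collection —
# the same read / process / write shape as the Turbine app.
out = pass_through(db.records(collection: "events"))
db.write(out, collection: "events_copy")

puts db.records(collection: "events_copy").inspect
```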
00:10:51.720
and so that's turbine itself that's
00:10:53.519
really the framework that you would
00:10:54.540
write
00:10:55.500
um these data apps in
00:10:57.899
the other major part of the tool chain
00:11:00.060
for us is actually the platform itself
00:11:02.700
and so this is the the platform as a
00:11:05.220
service in our case and so it's a fully
00:11:07.200
managed platform as a service that's
00:11:09.899
designed to host and run turbine apps
00:11:12.899
essentially we handle the operational
00:11:14.579
burden of running this thing wiring up
00:11:17.060
monitoring the sort of underlying
00:11:20.519
instances and the components of making
00:11:21.779
sure that it's healthy and it continues
00:11:23.399
to run
00:11:24.720
um a lot of the magic around
00:11:25.640
automatically figuring out how to do
00:11:28.019
things or the heavy lifting is handled
00:11:30.660
by the platform and so it'll reach out
00:11:32.579
and look at resources and figure out how
00:11:34.500
best to get data out of them
00:11:36.600
um and sort of automatically configure
00:11:38.519
these connectors and and pipeline
00:11:41.279
components to achieve that
00:11:44.820
um
00:11:45.600
and then when you actually deploy your
00:11:48.779
turbine app this custom logic that you
00:11:50.519
wrote so the pass-through function that
00:11:52.200
gets packaged up into a container and
00:11:54.120
deployed onto the platform the platform
00:11:55.980
contains a sort of serverless functions
00:11:58.500
component that's where that function
00:12:00.480
goes and it's responsible for scaling it
00:12:02.880
independently and so as you get more
00:12:05.040
events coming in it'll scale up those
00:12:07.079
functions to process more of those
00:12:08.880
events
00:12:10.680
so that's just the managed side of it
00:12:14.579
so here's a very high level architecture
00:12:17.160
type view of it so essentially pulling
00:12:20.160
in data from somewhere it figures out
00:12:21.779
how best to do that it puts it into a
00:12:24.959
durable store where it can rewind and
00:12:27.240
replay and kind of act as a shock
00:12:28.800
absorber it applies your turbine
00:12:31.620
function across all those events and
00:12:34.260
then whatever the results are go back
00:12:35.700
out through some connector or many
00:12:37.740
connectors into wherever the destination
00:12:40.560
resource is and so everything in the sort
00:12:43.740
of dotted box in the middle that's the
00:12:45.540
platform itself and it just handles it
00:12:46.920
for you
00:12:50.040
so I'm going to attempt a live demo
00:12:54.300
we'll see we'll see how that goes
00:12:58.079
all right
00:13:02.040
there we go
00:13:06.139
all right
00:13:22.320
all right
00:13:23.880
so here you can see
00:13:26.760
a turbine app that I wrote previously
00:13:30.180
essentially it implements that
00:13:32.279
enrichment use case so here we're
00:13:35.040
actually requiring the existing Clearbit
00:13:37.139
gem so that's a gem that exists open
00:13:39.660
source that I just pulled in
00:13:41.519
we're using this
00:13:43.620
database called demo PG similar to the
00:13:45.839
example I included we have two types of
00:13:48.060
APIs there's the sort of chaining-based
00:13:50.220
fluent API as well as a more
00:13:52.800
traditional procedural one so that's the
00:13:54.839
one I'm using here
00:13:56.639
so basically I'm saying take the records
00:13:58.920
out of a collection called events
00:14:01.019
process them using this enrich function
00:14:04.200
which I've written below and then write
00:14:06.000
out the results in events_copy
00:14:08.399
and so this is the enrich function
00:14:11.040
it's fairly contrived but actually does
00:14:13.500
something useful if you're not familiar
00:14:15.899
with Clearbit it's one of the services
00:14:17.519
where you give it some information about
00:14:20.040
typically a user and it has a database of
00:14:23.339
users and a ton of information about
00:14:25.019
them so in this case I'm forwarding the
00:14:28.200
email of a user and then it's returning
00:14:31.620
back some information like the company's
00:14:34.139
legal name for the employer and then the
00:14:37.320
location of that person
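A runnable sketch of that enrich step, with the lookup stubbed out — `FakeClearbit`, its response fields, and the email address are all invented stand-ins for the real gem's API:

```ruby
# Stub for the third-party lookup service.
class FakeClearbit
  DIRECTORY = {
    "person@example.com" => { company: "Meroxa Inc", location: "San Francisco" }
  }.freeze

  def self.find(email:)
    DIRECTORY[email]
  end
end

# Enrich a record: look up the email and merge in extra fields,
# passing the record through untouched when nothing is found.
def enrich(record)
  details = FakeClearbit.find(email: record["email"])
  return record unless details

  record.merge(
    "company"  => details[:company],
    "location" => details[:location]
  )
end

record = { "activity" => "logged_in", "email" => "person@example.com" }
puts enrich(record).inspect
```

The fatter, enriched record that comes back is what the pipeline would then write to the destination collection.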
00:14:40.279
and so this is the the data that I'm
00:14:43.019
actually feeding it so
00:14:44.699
turbine ships with a sort of local
00:14:46.800
development mode where you can kind of
00:14:48.360
iterate quickly and have this very fast
00:14:50.160
feedback loop
00:14:51.600
um where you can use fixture data or
00:14:53.339
sampled records to actually run it
00:14:54.720
through your pipeline and say like does
00:14:56.760
it do what I think it does or does it do
00:14:58.740
what I need it to do and you can
00:14:59.820
probably test against it and everything
00:15:01.079
and then once you're happy with that
00:15:03.120
functionality you can deploy it onto the
00:15:04.560
platform and so this is an example
00:15:06.300
record that I created
00:15:08.339
um so the actual value of the record has
00:15:11.519
an activity which is logged in and has
00:15:13.500
my email address
00:15:14.820
and so
00:15:16.380
what I hope happens uh is when I execute
00:15:19.740
it locally it should take my email
00:15:21.440
process it through this custom function
00:15:23.459
hit the clearbit API fetch some
00:15:25.260
additional details and say this is what
00:15:27.000
would have happened had you deployed
00:15:28.500
this live
00:15:30.899
and so
00:15:33.000
we have Meroxa's CLI
00:15:37.019
so essentially this is the local
00:15:38.459
execution command and it basically
00:15:40.800
threads your record through and it shows
00:15:42.899
you what would have happened
00:15:45.600
um and so here it did work so you can
00:15:48.300
see that it says it fetched this record
00:15:51.240
which I showed earlier which just had my
00:15:53.699
email address
00:15:54.720
and then it augmented and enriched that
00:15:57.300
data with the company Meroxa Inc and
00:15:59.820
location San Francisco
00:16:02.399
um and that's it so essentially it did
00:16:05.100
what I thought it did now I'm happy with
00:16:06.899
it I can deploy onto the platform and
00:16:08.339
the platform will package all these
00:16:09.480
components and deploy it into a
00:16:11.399
continuously running pipeline
00:16:14.639
So yeah thank you
00:16:23.880
all right
00:16:28.440
so
00:16:29.940
what's next for Turbine and Meroxa
00:16:34.320
essentially right now Turbine RB
00:16:36.959
um or Turbine for Ruby we basically
00:16:40.259
recently released it it's still in a
00:16:43.139
relatively early developer preview and
00:16:45.120
we're looking for for feedback we want
00:16:46.980
people to use it we want people to to
00:16:48.600
try it out and actually tell us how to
00:16:49.980
improve it we are super focused on
00:16:52.860
developer experience and so we want to
00:16:55.320
make it great for for developers and so
00:16:58.139
yeah we want people to sign up use it
00:16:59.940
and tell us tell us what they think and
00:17:01.920
tell us how we can improve it one of the
00:17:04.020
things that was relatively
00:17:05.760
recent for us is
00:17:07.760
Ruby 3.2 introduced the idea of a value
00:17:10.919
object or the Data class which
00:17:14.280
introduces sort of an immutable struct
00:17:16.740
essentially that seems like it would be
00:17:18.720
pretty good for this kind of use case
00:17:20.100
where records come into the platform as
00:17:22.919
an immutable object and you use sort of
00:17:24.600
methods defined on it to to manipulate
00:17:26.339
this so that's something I would like to
00:17:27.480
consider but again we'd love to hear
00:17:29.280
from from users and say this is what we
00:17:31.320
want or this API sucks and you should do
00:17:33.780
something else
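For reference, a quick look at the Data class (available since Ruby 3.2) in action — an immutable value object where "changing" a field returns a new copy, which is the property that makes it attractive for records flowing through a pipeline:

```ruby
# An immutable record type: instances are frozen, and updating
# a field via #with returns a new copy rather than mutating.
Record = Data.define(:key, :value)

r  = Record.new(key: "user-1", value: { "email" => "a@example.com" })
r2 = r.with(value: { "email" => "b@example.com" })

puts r.value["email"]   # original untouched => a@example.com
puts r2.value["email"]  # new copy           => b@example.com
puts r.frozen?          # => true
```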
00:17:35.580
um another sort of major component that
00:17:37.200
we're working on is native
00:17:39.419
stateful processing so stateful
00:17:41.880
processing is is kind of a big problem
00:17:43.559
space to to solve
00:17:45.600
um right now on the platform you can
00:17:48.299
Implement stateful processing but the
00:17:51.179
burden is on you to persist data
00:17:53.340
somewhere so you might have some sort of
00:17:55.679
Redis or a database or something
00:17:57.419
like that soon we hope to have that
00:17:59.880
natively built into the platform and so
00:18:02.100
you can just
00:18:03.600
magically assume that there is some
00:18:05.580
persistence available to every function
00:18:07.140
and if you write something to that it
00:18:08.700
will just be available everywhere
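What the do-it-yourself version looks like today, sketched with a plain Hash standing in for that external Redis or database — the key scheme and function shape are assumptions for illustration:

```ruby
# External state store stand-in (in production this would be
# Redis, a database, or the platform's native persistence).
class StateStore
  def initialize
    @data = Hash.new(0)
  end

  def incr(key)
    @data[key] += 1
  end

  def get(key)
    @data[key]
  end
end

# A stateful function: count events per user across invocations,
# tagging each record with the running total.
def count_events(record, store)
  count = store.incr("count:#{record[:user]}")
  record.merge(event_count: count)
end

store = StateStore.new
out = [{ user: "a" }, { user: "b" }, { user: "a" }].map { |r| count_events(r, store) }
puts out.last.inspect # => {:user=>"a", :event_count=>2}
```

The appeal of native stateful processing is that the `StateStore` piece disappears: the platform would supply that persistence to every function automatically.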
00:18:10.799
um part of the functionality
00:18:12.780
is joins and so being able to do stream
00:18:14.580
joins natively without relying on
00:18:16.020
anything external would be enabled by
00:18:18.360
the native stateful processing another
00:18:21.240
major component that we're kind of
00:18:22.679
digging into is CI CD integration and so
00:18:25.500
I think
00:18:26.700
um for us to make this functionality
00:18:29.340
turbine and writing stream processing
00:18:31.200
really available to all software
00:18:33.299
Engineers it needs to play nice with you
00:18:35.880
know traditional or common CI CD
00:18:37.919
practices so you should be able to write
00:18:39.660
a stream processing application
00:18:41.400
alongside your rails application or your
00:18:43.559
Ruby application and have them sort of
00:18:45.600
deployed in lockstep together you can
00:18:48.059
already import those objects and those
00:18:50.340
models so why not have them deployed
00:18:53.220
together right if you change something
00:18:55.080
in your app that would effectively break
00:18:57.240
your stream processing application they
00:18:59.160
should both be blocked on
00:19:01.260
successful deploys on both
00:19:03.240
so that's something that we're we're
00:19:04.980
actively digging into right now
00:19:10.080
so if you want to access the developer
00:19:12.660
preview you can take a picture of the QR
00:19:16.440
code that'll take you to a landing page
00:19:18.179
where you just show your interest you
00:19:21.059
can also win a Meta Quest 2 by filling
00:19:23.700
that out
00:19:24.799
so yeah sign up and kind of let us know
00:19:28.700
what you want to do with it and how you
00:19:31.080
you'd like to use it and we'll try to
00:19:33.539
onboard as many people as quickly as
00:19:35.160
possible
00:19:43.080
all right
00:19:45.120
um
00:19:47.280
questions
00:19:48.419
we have plenty of time for questions so
00:19:50.880
if anyone has any we can address it now
00:19:53.220
otherwise you can catch up with me
00:19:55.980
yeah so the question was what's the main
00:19:58.980
difference between our platform and
00:20:01.500
using a serverless function platform
00:20:04.100
so in the case of the serverless
00:20:06.780
functions you still have to have
00:20:08.220
infrastructure to deliver your records
00:20:09.960
to that serverless function right in the
00:20:12.720
case of the Meroxa platform you're
00:20:14.760
deploying this application that's
00:20:15.960
running continuously and so it's doing a
00:20:18.539
fair bit more than just integrating with
00:20:20.100
a serverless function so the platform
00:20:22.080
does the heavy lifting in terms of
00:20:23.340
pulling data out so I kind of glossed
00:20:25.140
over it very lightly but if you point
00:20:28.440
the platform to a postgres database it
00:20:30.720
will actually reach out and inspect the
00:20:32.100
database and look at what version it's
00:20:33.539
running what credentials you provided
00:20:35.400
whether it can set up logical
00:20:36.780
replication or not what extensions are
00:20:38.700
available and if it can it will set up a
00:20:41.220
logical replication slot with CDC so you
00:20:44.280
get very low latency High throughput
00:20:46.500
sort of change data capture into your
00:20:50.100
function and your function is being
00:20:51.539
triggered continuously against that so
00:20:54.240
yeah it's a lot more of the the sort of
00:20:56.220
complete pipeline rather than just that
00:20:58.500
function
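To make the change-data-capture idea concrete, here's a sketch of what that flow delivers: one event per row change, which your function consumes and which, replayed in order, reconstructs the table. The event shape is illustrative, not Meroxa's actual wire format:

```ruby
# Apply one change-data-capture event to an in-memory copy of a table.
def apply_change(table, change)
  case change[:op]
  when "insert", "update" then table[change[:id]] = change[:row]
  when "delete"           then table.delete(change[:id])
  end
  table
end

table = {}
changes = [
  { op: "insert", id: 1, row: { "email" => "a@example.com" } },
  { op: "update", id: 1, row: { "email" => "b@example.com" } },
  { op: "insert", id: 2, row: { "email" => "c@example.com" } },
  { op: "delete", id: 1 }
]

# Replaying every change in order reproduces the table's final state.
changes.each { |c| apply_change(table, c) }
puts table.inspect # => {2=>{"email"=>"c@example.com"}}
```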
00:21:00.960
related to that you could actually call
00:21:02.880
third-party functions like you could
00:21:04.740
deploy some logic or maybe you already
00:21:06.539
have logic on Lambda from our function
00:21:08.880
you can say every time I get an event
00:21:10.200
trigger this serverless function and
00:21:12.660
then take the result and put it into
00:21:13.799
something else
00:21:15.360
sure so the question is what did we
00:21:17.520
build the CLI with and how is it
00:21:18.960
installed
00:21:20.520
um the CLI is built using Cobra which is
00:21:23.160
a Go framework for writing CLIs it's
00:21:27.000
the same one that kubectl is
00:21:28.380
written in and you can install it on Mac
00:21:32.159
using Homebrew
00:21:34.500
um
00:21:35.400
Linux also through Homebrew
00:21:38.460
which is weird because nobody uses Homebrew
00:21:40.860
on Linux
00:21:42.179
um but it's there but we also build
00:21:44.520
binaries we use GoReleaser to actually
00:21:46.500
generate
00:21:47.700
um binaries for multiple architectures
00:21:49.799
and multiple platforms
00:21:51.720
um and so yeah if you go to it's
00:21:54.000
actually open source as well so if you
00:21:55.200
go to github.com/meroxa/cli you can see all
00:21:59.280
the code for the CLI and all the tooling
00:22:01.080
and GitHub actions and everything we use
00:22:02.880
around generating it it's definitely
00:22:05.400
worth checking out we've invested a lot
00:22:07.020
of time in in a builder pattern for
00:22:09.179
creating new commands very easily I know
00:22:11.340
it's in go but it's it's worth checking
00:22:13.559
out either way
00:22:14.640
sure so the question is what was I
00:22:16.620
running locally to enable Meroxa apps to
00:22:19.080
run and how does it compare to what is
00:22:21.720
run on the platform when I run meroxa
00:22:23.820
apps deploy
00:22:25.220
so essentially we try to mimic the same
00:22:29.039
experience so that you have this fast
00:22:31.500
feedback loop locally and so we're
00:22:34.620
moving towards this
00:22:36.840
unified back end for enabling multiple
00:22:39.659
languages so right now we support go
00:22:41.460
JavaScript and Python and Ruby and so
00:22:44.880
it's the same functional backend even
00:22:47.280
locally so when you execute the local
00:22:49.320
execution it threads your records
00:22:51.480
through your function and then feeds it
00:22:53.400
back into it when you run rocks apps
00:22:55.799
deploy it does something very different
00:22:58.620
but the end result is effectively the
00:23:00.480
same and actually
00:23:02.220
ships your package it builds a container
00:23:04.740
out of your code and then ships it to
00:23:06.960
the platform and the platform wires up
00:23:08.460
all these components
00:23:10.860
it's a lot of technical stuff I'm happy
00:23:13.020
to go into much more detail with anyone
00:23:14.700
who wants to to discuss it
00:23:18.059
uh so the question is how do multiple
00:23:19.919
developers uh working locally
00:23:22.679
collaborate on the same sort of
00:23:25.080
deployment the same app
00:23:28.580
so essentially one of the things we do
00:23:31.679
is we with a local development
00:23:33.299
environment you can actually run a
00:23:35.760
command that pulls sample data from a
00:23:38.220
development database or staging database
00:23:39.840
and lets you iterate on it locally but
00:23:43.200
then we also use the typical git
00:23:45.360
workflow so you're building your stream
00:23:47.340
your data app and you're committing it
00:23:49.500
to GitHub and so you can kind of
00:23:52.799
lean on the same workflows that you
00:23:54.600
normally have around collaborating so
00:23:56.580
you are creating PRS with your stream
00:23:59.220
processing application you know getting
00:24:00.720
feedback and comments and everything at
00:24:02.159
the same time so we aren't necessarily
00:24:05.280
diverging from that our goal is actually
00:24:06.960
to map as closely as possible to what
00:24:09.240
you normally do with software
00:24:10.320
development so you follow the same
00:24:12.360
workflows that you normally have
00:24:14.120
you'd write some code you push a PR you
00:24:17.100
run some tests you get some feedback you
00:24:18.960
iterate on that and then eventually you
00:24:21.539
deploy the thing that you know works
00:24:23.520
when you're happy with it
00:24:27.320
yeah so sure so the local development
00:24:30.659
experience doesn't actually rely on any
00:24:33.299
databases it sort of simulates what
00:24:35.100
that database would be so in the example
00:24:36.840
that I used today it simulates getting a
00:24:39.900
record from postgres
00:24:41.400
by actually sampling sampling a record
00:24:44.280
from postgres and says this is what the
00:24:45.900
record looks like and it stores it
00:24:47.520
locally in this demo.json file so it
00:24:50.940
includes a bunch of sample records and
00:24:52.500
then that's what you're iterating on
00:24:53.640
locally so you don't need postgres
00:24:55.860
um
00:24:56.580
the way the turbine framework is
00:24:58.559
designed it's actually entirely agnostic
00:25:01.200
of the real resource and so I can go in
00:25:03.539
and change demo PG to demo and the
00:25:07.080
code works in exactly the same way
00:25:08.280
because it's the platform that's doing
00:25:09.659
that translation as far as the turbine
00:25:11.580
function is concerned I get a record
00:25:13.500
that looks like this and I'm applying
00:25:15.539
some Transformations and I'm pushing out
00:25:17.039
a record in that format
00:25:19.320
the platform is the thing that's
00:25:20.760
responsible for pulling the record from
00:25:22.380
postgres and giving it to the turbine
00:25:24.600
function
00:25:25.860
all right I guess that's it for me
00:25:28.679
thank you very much