Ruby Video | Adventures with Micro Services in Rails

Adventures with Micro Services in Rails

Red Dot Ruby Conference 2014

Anand Agrawal • June 27, 2014 • Singapore • Talk

In the talk titled 'Adventures with Micro Services in Rails,' Anand Agrawal explores the use of microservices within a Rails architecture. The discussion covers what microservices are, when and why they should be used, and the potential pitfalls associated with them.

The talk ultimately provides a comprehensive overview of the practices, challenges, and strategies involved in implementing a microservices-based architecture in Rails, emphasizing that while microservices offer many advantages, careful consideration and management of associated complexities are essential for success.

I have spent some time working on a project where we've built 8 micro services and 2 applications, and planned to carve out a few more. Deployment was carried out in a farm of 25 servers in production with a single click in less than 3 minutes.

In this talk I will share our experiences with building a micro service based architecture - the good, the bad and the ugly.

What are micro services?
When/Why/How micro services?
Why NOT micro services?
Managing Continuous Integration and Continuous Delivery with micro services
A few design principles that we followed and that worked for us

00:00:20.480 um so today we're gonna talk about microservices but before that something about me

00:00:25.680 i am a consultant from thoughtworks singapore a full stack engineer and i also

00:00:31.599 co-founded ideaboards which is a retrospective tool this is my twitter

00:00:39.200 handle and github handle enough about me uh what's in it for you

00:00:46.239 so we're going to talk about what are microservices uh why microservices how microservices

00:00:53.039 and when do we use those microservices how many of you have heard about

00:00:59.359 microservices oh how many of you have actually used

00:01:04.879 microservices or wrote code cool um so let's define micro services uh

00:01:13.680 there is no formal definition per se but before jumping on to that let's talk about uh what are services

00:01:21.280 um it's just an implementation of a contract right you have some contract and the service actually

00:01:28.240 implements that particular contract and tells you that if if you call me with these these parameters i'm gonna do

00:01:34.159 something for you what happens if you attach a micro to

00:01:39.759 the service

00:01:45.200 um so micro as the name suggests it should be small we're gonna talk about what is small it

00:01:51.840 should be independent it should be self-contained it should perform on itself it should be

00:01:58.799 composable it each service should work together with all the other microservices

00:02:05.600 and it does one thing and that one it does that one thing well um that's the that's the key um

00:02:13.520 but what is the right size we talk about micro services there is a lot of conversation around what should be the

00:02:19.920 size of the service some people rate size based on

00:02:25.440 lines of code like if it's beyond 200 lines of code it's not a microservice some people

00:02:32.640 think about it in terms of team that if one person could develop that over a period of time then that is

00:02:37.760 a micro service or if it's a group of people that could work independently on a service

00:02:43.519 that is a micro service um well for us we've been working with a client and for

00:02:49.280 us what really worked out is defining those services in terms of domain

00:02:55.280 so each service does just one thing and it does that one thing right so that one

00:03:01.599 thing could take 100 lines of code or could take 1000 lines of code

00:03:10.879 tying it to the unix philosophy you write programs which are small

00:03:16.560 programs we've used sed and all these small little programs

00:03:22.159 and then use pipes to concatenate them and make them cohesive make them work

00:03:28.560 together in case of microservices yeah http is the new pipe because

00:03:35.120 microservices are usually exposed over http and each service does its own job and passes

00:03:42.319 it on to the different service

00:03:48.239 it's also kind of uh going back to the object oriented philosophy i talked about there was a

00:03:54.239 talk on solid principles so taking uh talk about single

00:03:59.439 responsibility the service this service does just one thing

00:04:04.480 uh it has low coupling it doesn't it's not chatty with too many different services

00:04:10.239 at the same time it is cohesive and it's small and does that one thing

00:04:16.720 well and as jeff bay says

00:04:23.759 a monolithic application which is 100k lines of code is nothing but 100 1k

00:04:31.040 line applications waiting to happen well he has used the 1k line of code as

00:04:36.960 a parameter for a microservice but you get the idea right i mean rather than having a big one of these

00:04:43.120 monolithic application break it down into smaller so that it's much more manageable

00:04:50.880 um so why do we use microservice what's what's the key objective um

00:04:58.800 rather than answering that question i would like to share my story on my journey of using microservice or how we

00:05:04.560 use microservice in one of the client projects um so i'll tell you a story uh

00:05:12.400 this happened couple of years back when we started an engagement with a client

00:05:17.680 i'll give you a little history on the client uh this is 90 year old business it's a social

00:05:24.400 gaming company it was a 90 year old business and it was quite funny to hear on the first day

00:05:30.720 that their customers were literally dying not dying out but literally dying because

00:05:36.800 of the age so yeah lots of legacy codes over this

00:05:42.080 period of 90 years they've accumulated like a lot of code acquired a few companies the tech stack

00:05:48.720 was mixed of vb scripts vb6 forms and oracle and all

00:05:55.919 crazy stuff because of these it was not

00:06:03.919 flexible the cost of introducing any change was like untold even adding a simple feature on a

00:06:10.319 customer was like changing three or four different systems and then uh

00:06:16.000 getting that through so it was really really painful all these uh apps were

00:06:24.240 fully functional in their own silos because of the various acquisitions and mergers um but each of the apps had like

00:06:31.520 concentrated complexity yes there's a lump of code sitting there

00:06:37.039 and nobody knows how it works

00:06:43.120 when we started talking more about it that's when we came with this idea that

00:06:49.680 oh this sounds like we need to build something that is small that is independent that is composable and that does one

00:06:56.720 thing so that if there are four different applications that want to do payment

00:07:01.759 it's just one service which should be doing payment right if they want to store customer information there should

00:07:07.039 be just one service holding those customer information so that's when

00:07:12.720 that's where our journey started and building the micro services what we've achieved

00:07:19.759 so far is we have 10 micro services

00:07:24.800 doing a bunch of things 25 vms in production 60 plus vms across other environments

00:07:32.160 like qa tests and performance environments

00:07:37.199 and we could achieve one click deployment across all these environments

00:07:44.720 and you can guess if it's one click deployment who does the deployment so it's product owners qas

00:07:51.039 anybody uh it's an interesting story that uh somebody want somebody at the client side wanted a dog to do

00:07:56.960 the deployment so press the button um so for us microservices

00:08:05.680 each of these 10 different microservices was like self-contained they had like their own db their own

00:08:11.759 contracts they were running in their own process

00:08:18.000 talking to each other through http

00:08:23.280 but how did we start we didn't start with 10 micro services on day one right so try to solve small and valuable

00:08:32.320 problems started with small piece of functionality like customer database and

00:08:38.240 try to migrate that first start with plain old services so we didn't start we

00:08:46.399 didn't start with microservices on day one that hey this let's go with just one resource

00:08:52.399 per service or let's just go with one responsibility per service we

00:08:57.839 started with plain old service and started realizing that some things could be moved out

00:09:03.920 so when services started doing too much over the period of time like you do object refactoring we

00:09:09.839 refactored services so if a service's responsibility grew we extracted it into smaller ones

00:09:18.720 um so as i was telling you uh about the domain which is uh social

00:09:24.560 gaming typically uh over the web it has three or four components so that's where we

00:09:29.600 started the plain old services there is catalog which is that game catalog

00:09:35.920 customers the orders and payments

00:09:41.519 we slowly realized that customer service is trying to talk to legacy too much so we extracted that out as a

00:09:48.720 different micro service to talk to legacy database and those kind of things

00:09:54.160 and eventually throw that away the order started growing too much so

00:10:00.640 we extracted out orders into two which is order processing uh the main order service now is

00:10:07.519 responsible for just taking orders and there is a separate service for processing the orders

00:10:13.360 and there is a separate service for resulting because it's a gaming company so after

00:10:18.560 when you play anything at the end of it you get results so there was this result service

00:10:28.320 from a high level you would still think that well it's okay that just sounds like a plain old

00:10:34.079 service right i mean i had like lot of debates like why is this uh microservice it's just it's just service

00:10:41.680 which is well designed that's it right i mean uh had hard time people asking those

00:10:47.200 questions that why why what is micro in these services

00:10:52.959 well for us it was mainly the single responsibility fact that a customer service just does

00:10:59.839 everything related to customer data and its boundaries are restricted to

00:11:05.600 just customer and it never overstepped its boundaries

00:11:11.839 um we also said each service has one resource so customer service has

00:11:18.320 just one resource uh well sometimes uh it had two resource

00:11:23.600 so especially with payments when we started dealing with payments it started having two or three resource because we had

00:11:30.000 payment method credit card and debit card which are like three resources in one service but pulling them out into

00:11:38.240 their own microservice would mean breaking that

00:11:43.680 law of composition and the services the payment service the credit card

00:11:49.040 service and the direct debit service would become too chatty and

00:11:54.320 it would defeat the purpose of having a service

00:11:59.360 so they communicate over a restful contract http and json

00:12:04.480 um so over the period of time when we were into that journey we had some

00:12:11.760 thumb rules around what really uh you should do to make these services

00:12:17.200 uh micro and keep the complexity to a minimum is one top level resource as we just

00:12:24.399 talked about well in some cases two focus on contracts so that is very

00:12:30.800 important for each service that the contract should be driven we made

00:12:35.839 it clear that contracts should be driven mainly based on the domain and not the client

00:12:42.959 so if the clients the consumer of the service needs certain additional information or

00:12:49.200 summary information or those kind of things then what i have seen is many

00:12:55.360 places we end up creating those service endpoints and those become unmanageable so we

00:13:01.760 focused a lot on contracts uh each service

00:13:08.000 had their own context and uh they're not allowed to access

00:13:13.040 data which is beyond their context anyway each of the service has their own database so they

00:13:18.639 it was anyway not possible but we made a point that even if something is accessible uh

00:13:26.000 some data is accessible and it is beyond the domain then call the

00:13:32.800 service rather than directly using the data source uh avoid too much of coupling between

00:13:40.000 the services again since they are independent we try to avoid a lot of coupling

00:13:45.519 between the services and since now we have 10 different

00:13:51.519 applications each of them logging on their own each of them running in their own process it was

00:13:57.839 really important to have a sophisticated logging and monitoring framework

00:14:03.519 so that if anything breaks we get to know immediately if any of the request fails through logs

00:14:10.560 we can immediately catch those errors or exceptions

00:14:16.000 um so having said that uh following those thumb rules we ended up creating

00:14:22.160 a few cross-cutting services which were needed across

00:14:28.560 all the other microservices so going back to the services that we had we ended up

00:14:35.839 having a communication service because there was a lot of communication to the clients around

00:14:41.760 hey your payment is due next month welcome welcome message or maybe result message

00:14:48.160 that hey you won this particular game so each of these services had their own communication

00:14:55.440 so we abstracted out the communication the cross-cutting concerns around all the services

00:15:00.560 created a communication service and all the service needs to do is just ping

00:15:06.880 that communication service and say that hey send the communication that customer has won

00:15:12.880 and it's the communication service responsibility to figure out whether to send an email communication

00:15:18.399 or or sms communication or whatever

00:15:24.959 there were a lot of scheduled jobs across these uh services like you need to send a

00:15:30.160 weekly email to customers uh payment emails and stuff so that scheduling was part of all these

00:15:37.759 services we abstracted it out into a separate service so that you would just ping the service and

00:15:43.759 schedule anything on any service and we talked about error reporting

00:15:49.920 already uh so it was extracted out as a separate service so it would log a request to one

00:15:55.440 common place and could do a search and those kind of stuff on this error reporting tool

00:16:04.639 um so having this farm of services uh what you

00:16:11.440 usually hear about is service explosion um so you have these services it's

00:16:18.560 difficult for a developer to check them out and deploy them and

00:16:24.240 things like that so how do you stay productive in spite of having these many services

00:16:31.440 um well we use ruby on rails use rails api to build the service endpoints

00:16:40.240 uh focus lot of uh a lot of our focus was on devops to make

00:16:45.519 things as simple as possible in terms of deployment in terms of setting up a dev

00:16:50.560 box and stuff um we used feature toggles

00:16:56.000 instead of feature branches because doing feature branches with a service oriented architecture is like

00:17:02.160 really really difficult and having ci and cd pipeline itself is like really

00:17:07.520 difficult um we also created a lot of client gems

00:17:13.039 for these micro services the client gems basically provide

00:17:18.799 an easy way to talk to the service so it would feel like you're calling a

00:17:24.000 service in memory because it gives you a nice object you just call a service using that object and you get

00:17:30.960 an object back and our mantra was automate automate

00:17:37.679 whatever it is whatever is the repetitive process just automate everything
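A rough sketch of what a client gem like the ones described above might look like: it hides the HTTP call and JSON parsing so the caller gets a plain object back. The module name, the `/customers/:id` path, the field names and the injectable `transport` are assumptions for illustration, not the project's actual gems (which, per the Q&A later in the talk, built their models with the hashie gem).

```ruby
require "json"
require "net/http"
require "uri"

# Illustrative client for a hypothetical customer service.
module CustomerClient
  Customer = Struct.new(:id, :name, :email, keyword_init: true)

  class NotFound < StandardError; end

  # Default transport: a real HTTP GET returning [status, body].
  HTTP_TRANSPORT = lambda do |uri|
    response = Net::HTTP.get_response(uri)
    [response.code.to_i, response.body]
  end

  class Client
    # transport is any callable taking a URI and returning [status, body];
    # swapping it out keeps tests off the network.
    def initialize(base_url, transport: HTTP_TRANSPORT)
      @base_url = base_url
      @transport = transport
    end

    # Looks like an in-memory call to the consumer; goes over HTTP underneath.
    def find(id)
      status, body = @transport.call(URI.join(@base_url, "/customers/#{id}"))
      raise NotFound, "customer #{id} not found" if status == 404
      raise "unexpected status #{status}" unless status == 200

      attrs = JSON.parse(body, symbolize_names: true)
      # Keep only the fields this client knows about, so extra fields
      # added by the service do not break consumers.
      Customer.new(**attrs.slice(:id, :name, :email))
    end
  end
end
```

A consumer would then write something like `CustomerClient::Client.new("http://customers.example").find(42).name`, where the host name is a placeholder.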

00:17:45.360 um so how do i make a small change and still stay sane i mean if we make a small

00:17:52.960 change how do we make sure that everything works properly if that change happens the answer is

00:18:00.000 simple test it and if you're thinking of something like this

00:18:09.520 then this is the answer it's funny only when it is a joke i mean

00:18:15.600 if you're building an enterprise software or any software it's important to have tests um so

00:18:23.440 started off with unit test in each of the services whether the object within the service is

00:18:29.200 doing the right thing but then that's too obvious right i mean every one of us writes unit tests

00:18:35.840 and we love rspec the contract test which is uh is my

00:18:43.440 service doing what it should uh which is basically out of container

00:18:48.480 test so we ping an endpoint of the in-memory service and see whether we're getting the right

00:18:54.840 response and there is uh and we test the

00:19:00.000 contracts these are basically black box tests which test the contract

00:19:09.120 send something and get a data get some response back don't worry about the implementation uh

00:19:15.679 the next is integration test the acceptance test is for uh the boundaries within within the

00:19:22.000 service itself in integration tests we test whether this particular service is behaving

00:19:28.160 nicely with other services so if this service is calling some other service or testing the user flow then we write

00:19:34.640 root unit test it tests the distributed effect if anything fails then

00:19:40.080 actually is the error getting reported in the errors error reporting service or if the

00:19:46.720 payment is getting uh deducted then is the customer getting communication or not um

00:19:55.520 so test async action lot of actions were async like when you're sending a communication it's all async

00:20:02.480 so we were using uh resque for it and there's a nice plug-in for resque that lets

00:20:08.080 you test async actions um so you build these micro services

00:20:15.120 uh how do we actually ship it as james lewis says we are essentially

00:20:21.679 shifting the complexity of building the software to actually the infrastructure

00:20:27.360 so instead of now having one application to deploy now we have to deploy like 100

00:20:34.000 applications of 1k line each so the code becomes simple easy to

00:20:41.200 understand but infrastructure is slightly complicated

00:20:46.400 so we provision using puppet solo at some point of time we would like to

00:20:51.919 use docker as well um so provisioning it begins at home so

00:20:58.000 even the dev box are provisioned so that if there is any change in

00:21:03.440 any of the versions of the software the same scripts are used across all the

00:21:09.120 environments um so script goes through ci like

00:21:14.960 application code the puppet script we're gonna see that in the ci

00:21:20.559 pipeline slide and immutable server as brian was talking about it in the morning so

00:21:27.440 it doesn't make sense for server to be mutable if you're using provisioning script

00:21:33.760 this is how our integration pipeline continuous integration pipeline looked like

00:21:39.840 we had the ui test each each of the

00:21:44.880 boxes at the top is unit tests so there's ui there is service there is

00:21:51.120 puppet code which flows through integration uat performance

00:21:56.720 and eventually to production all this is one click deployments across environments

00:22:06.880 so with every check-in we run unit test run integration test that we've written

00:22:12.559 run acceptance test and build a package this is really important because that same version of

00:22:19.200 the package would be deployed across all the other other environments

00:22:24.480 and the most important thing is we shipped often like weekly or less than that uh just ship

00:22:30.880 whatever you have and we shipped it like fedex
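The "build once, promote the same package" rule described above can be sketched as follows. This is an assumption-laden illustration, not the team's pipeline: the `Package` struct, the stage names and the use of a SHA-256 checksum to identify the artifact are all hypothetical.

```ruby
require "digest"

# A package is identified by the checksum of the bytes that were built,
# so every environment can verify it received the very same artifact.
Package = Struct.new(:service, :version, :checksum)

def build_package(service, version, artifact_bytes)
  Package.new(service, version, Digest::SHA256.hexdigest(artifact_bytes))
end

# stages is an ordered list of [name, check] pairs, where check is a
# callable that receives the package and returns true or false.
# Promotion stops at the first failing stage; the names of the stages
# the package passed are returned.
def promote(package, stages)
  passed = []
  stages.each do |name, check|
    break unless check.call(package)

    passed << name
  end
  passed
end

package = build_package("customers", "1.4.0", "contents of the built archive")
stages = [
  ["integration", ->(_pkg) { true }],
  ["uat",         ->(_pkg) { true }],
  ["performance", ->(_pkg) { true }],
  ["production",  ->(_pkg) { true }]
]
promote(package, stages)
```

Nothing is rebuilt between stages: the same `package` value flows through every check, which mirrors the point made in the talk that the version tested is the version deployed.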

00:22:38.640 um so talking about ci and cd followed that in the project what it

00:22:44.480 actually gave us is single click deployments uh we managed to get

00:22:49.600 cut down the server deployments from uh to actually three minutes so each change

00:22:56.240 would actually take like three minutes to deploy to server to production um we had a farm of

00:23:02.240 25 servers and everything just works uh their deployments and everything

00:23:08.880 just works like a charm um yeah and so easy that our product owner

00:23:14.880 does it um we made a point that since we are adding

00:23:20.400 lot of micro services and refactoring to microservices

00:23:25.440 the cost of adding any of these services should be as low as possible

00:23:31.039 so we managed to cut that time to less than less than a day and right from creating

00:23:37.919 a project to taking that empty project to production was less than a day

00:23:44.799 so having uh talked about microservice when do we use the microservice it's not

00:23:51.840 a silver bullet right i mean it comes with a cost

00:23:57.840 so these are some of the trade-offs the benefits and the cost associated with them

00:24:02.880 so the benefit is you get small reusable and maintainable

00:24:08.080 code which are throw away you can just rewrite them and stuff but at the same time you'll have like a

00:24:14.559 complex infrastructure because you need to deploy those independent services the individual codes

00:24:21.200 each service would grow independently you can divide the team based on

00:24:28.240 services and each service would keep growing on its own with the teams

00:24:35.440 but on the same side on the other side the learning curve is quite huge in microservice

00:24:41.520 because now you have to deal with multiple applications and developers would find it hard to know what's happening

00:24:49.120 in the other service they scale independently as they are in

00:24:56.080 their own process they have their own database you can make deployments such that if there is any

00:25:02.880 service which is not heavily loaded uh you can make a rational use of

00:25:08.960 the infrastructure rather than having monolithic app and running

00:25:15.760 fat servers at the same time there is network overhead in terms of going

00:25:22.720 through the http going through over the wire and calling those services they have independent dbs so if there is

00:25:29.360 high load on database the database could be scaled independently but at the same time they

00:25:34.960 end up having a fragmented data and the reporting and all becomes slightly

00:25:40.840 difficult well uh that's all i had uh

00:25:47.919 questions your last comment is the perfect segue into my question which was this seems like it would make

00:25:54.080 reporting a nightmare so is it slightly difficult or is it a nightmare well

00:26:02.000 well some uh some of the complex reports become like nightmare as you said um

00:26:10.320 in some of the cases what we have uh tried out is uh dump these data into a warehouse and

00:26:16.480 start developing reports out of that uh there are some plugins with uh

00:26:22.000 postgresql where it allows you to connect to multiple databases and run sql query across databases

00:26:29.279 that worked out in some cases in some cases it was just fetch the data if it's small enough

00:26:36.000 and do it in memory so it it depends on the usage um if the data is really like really huge

00:26:42.559 then first case where you dump that in a warehouse would really work fine how do you deal with

00:26:49.120 uh versioning the services and what services you're talking to from

00:26:54.960 presumably not something plot involved according to each other

00:27:00.080 um so as you saw the deployment pipeline it uh to start with we said okay let's

00:27:08.640 not do versioning at all in the services let's deploy all everything or nothing so rather than

00:27:15.279 picking up what needs to be deployed we made our deployment script intelligent enough to figure out if there is a

00:27:20.720 change it would deploy otherwise it won't and since uh in the deployment

00:27:27.919 in the ci pipeline you have tested that hey this version of the service works nicely with these other versions

00:27:34.159 of the service we would deploy that whole lump of all the services together

00:27:39.440 or and roll back if needed all the services together this is just to simplify so that we

00:27:45.360 don't end up having too many versions and too many

00:27:52.080 making code unmaintainable basically a couple of questions

00:27:59.600 on your client library gems did you uh did you settle on something like there are a few

00:28:04.720 uh json schema standards that are being talked about that sort of made it easier

00:28:10.080 to like the client library to be sort of discovering the layout of the api did you use

00:28:15.840 anything like that or is it kind of that um so we used hashie gem if you have heard of that

00:28:21.360 that worked out really well where we could actually uh create the objects create the models

00:28:27.279 and stuff really nicely using a dsl

00:28:33.039 all right and although if you don't mind that have you found a need or use any tools or building tools like

00:28:39.200 any twitter adults something that it's probably solid specifically for tracing transactions that go through

00:28:45.600 multiple services like if you have some issues that come up especially with the urgent question like do you have a hard time do

00:28:51.679 you want to use something that goes through multiple services with a you mentioned a lot of logging and boundary

00:28:56.840 infrastructure

00:29:03.360 so we passed in a unique identifier from the main caller so the ui

00:29:09.279 layer which talks to the service would typically pass in a common header or which would be

00:29:15.679 a unique identifier and if this service is calling some other service it would pass on the same

00:29:21.760 identifier and the logging actually make sure that we are using that particular identifier so when you're searching in

00:29:29.039 splunk or any other log aggregator service all you need to do is just check based on that

00:29:34.399 identifier to trace where all the requests went through thank you
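The identifier-passing scheme described in this answer can be sketched as a small Rack-style middleware plus two helpers. The header name `X-Request-Id`, the class and method names, and the use of a thread-local are assumptions for illustration; the talk only says that a common header carried a unique identifier that every service logged and passed on.

```ruby
require "securerandom"

# Rack-style middleware (any object with call(env) works as one):
# reuses the caller's identifier if present, otherwise mints a new one.
class RequestId
  # Rack exposes an incoming X-Request-Id header under this env key.
  ENV_KEY = "HTTP_X_REQUEST_ID"

  def initialize(app)
    @app = app
  end

  def call(env)
    id = env[ENV_KEY] || SecureRandom.uuid
    env[ENV_KEY] = id
    Thread.current[:request_id] = id
    status, headers, body = @app.call(env)
    [status, headers.merge("X-Request-Id" => id), body]
  ensure
    # Do not leak the identifier into the next request on this thread.
    Thread.current[:request_id] = nil
  end
end

# Headers to attach to any call this service makes to another service,
# so the same identifier travels down the chain.
def outgoing_headers
  id = Thread.current[:request_id]
  id ? { "X-Request-Id" => id } : {}
end

# Prefix every log line with the identifier so a log aggregator
# can be searched by it.
def log_line(message)
  "[request_id=#{Thread.current[:request_id] || '-'}] #{message}"
end
```

With every service logging through something like `log_line`, a single search for the identifier in the log aggregator returns the path of one request across all services.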

00:29:44.320 okay i was wondering if you how you went about testing the contract of the service

00:29:51.440 did you use any particular tools um actually rspec has it

00:29:58.559 uh rspec has a way to test the contracts uh without actually spinning up the uh

00:30:05.679 the server like we do controller test right you hit an endpoint and see if it is returning the proper response

00:30:13.520 if you do a render_views on top of it uh it will actually render the view it will actually render the json and give

00:30:19.919 you back the response so we just use plain rspec so does it actually help you test it once

00:30:29.039 the client of the server changes it breaks the client or

00:30:41.120 so so the contract tests were about testing the contracts in isolation so

00:30:46.399 you just test that service's contract and if there is any change and if there is any breakage we don't allow to

00:30:52.880 promote that particular package any further there was also integration test pipeline

00:31:00.880 which tests whether this particular service works

00:31:07.760 together with other services for that again we used rspec to test the contracts and test

00:31:16.159 its distributed effect across multiple services
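The idea of a contract test that runs without starting a server can be sketched in plain Ruby. The speaker's team used rspec controller tests for this; the version below swaps in a bare Rack-style callable and a hand-rolled checker so it runs with no gems. The endpoint, the field names and the `contract_violations` helper are hypothetical.

```ruby
require "json"

# Stand-in for a service endpoint: any Rack-style callable works here.
CUSTOMER_APP = lambda do |env|
  json = { "Content-Type" => "application/json" }
  match = %r{\A/customers/(\d+)\z}.match(env["PATH_INFO"])
  if env["REQUEST_METHOD"] == "GET" && match
    payload = { id: match[1].to_i, name: "Asha", email: "asha@example.com" }
    [200, json, [JSON.generate(payload)]]
  else
    [404, json, [JSON.generate({ error: "not found" })]]
  end
end

# The contract: the keys a consumer relies on and the type of each value.
CUSTOMER_CONTRACT = { "id" => Integer, "name" => String, "email" => String }.freeze

# Calls the app in memory and returns a list of contract violations;
# an empty list means the response honours the contract. It is a black
# box check: only the request and the response are inspected.
def contract_violations(app, path, contract)
  status, headers, body = app.call({ "REQUEST_METHOD" => "GET", "PATH_INFO" => path })
  return ["expected 200, got #{status}"] unless status == 200
  unless headers["Content-Type"].to_s.start_with?("application/json")
    return ["expected a JSON content type"]
  end

  payload = JSON.parse(body.join)
  contract.reject { |key, type| payload[key].is_a?(type) }
          .map { |key, type| "#{key} must be #{type}" }
end

contract_violations(CUSTOMER_APP, "/customers/42", CUSTOMER_CONTRACT)
```

In a pipeline, a non-empty result would be the signal to stop promoting the package, matching the rule stated in the answer above.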

00:31:23.600 okay so you're talking about did we calculate the cpu utilization

00:31:28.640 and memory overhead right

00:31:43.840 um so as i said i mean these uh since we could

00:31:50.559 break down the domain into multiple apps it was uh really interesting to you know spin

00:31:56.960 up let's say 10 uh instances of customer service

00:32:02.080 because that is heavily loaded that needs login and we don't want customers to lose out on that while

00:32:08.480 the payment service or communication service the communication service is async so just

00:32:15.840 spin up just one instance of communication service so we played around with lot of those

00:32:22.000 combinations to optimize the infrastructure usage does that that's that

00:32:29.200 answer your question okay well uh thanks

00:32:36.640 thanks anand thank you
