Summarized using AI

Adventures with Micro Services in Rails

Anand Agrawal • June 27, 2014 • Singapore • Talk

In the talk titled 'Adventures with Micro Services in Rails,' Anand Agrawal presents an insightful exploration into the use of microservices within a Rails architecture. The discussion revolves around defining what microservices are, when and why they should be utilized, and the potential pitfalls associated with them. Key points covered include:

  • Definition of Microservices: Microservices are described as small, independent services that each perform a specific function well and communicate over HTTP. They are manageable, allowing teams to develop services that adhere to the single responsibility principle while being loosely coupled.

  • The Journey to Microservice Adoption: Anand shares a case study about a 90-year-old social gaming company burdened by legacy code accumulated across acquisitions and a mixed technology stack. This legacy made changes costly to introduce, prompting a shift to a microservices architecture. He emphasizes that the team started small, beginning with plain old services and extracting microservices as responsibilities grew and boundaries became clearer.

  • Deployment Practices: With a deployment model that supported continuous integration and delivery, the team achieved a one-click deployment process across multiple servers, drastically reducing deployment time to under three minutes for 25 virtual machines in production.

  • Testing and Validation: The speaker discusses the importance of testing microservices properly, utilizing unit tests, contract tests, and integration tests to ensure reliability and functionality across the service architecture. He highlights the significance of automating the testing and deployment processes to streamline procedures.

  • Challenges/Trade-offs: Agrawal notes that while microservices can result in more maintainable and reusable code, they also come with increased complexity concerning infrastructure, potential service explosion, and a steep learning curve for developers. He emphasizes the need to manage service interactions efficiently to avoid excessive network overhead and fragmentation of data.

  • Conclusion: The presentation concludes with a note on the importance of having clear contracts, managing dependencies, and ensuring proper logging and monitoring frameworks to facilitate efficient operation across multiple services. Anand also raises key points on reporting challenges due to fragmented data and the approach his team adopted for managing multiple services effectively.

The talk ultimately provides a comprehensive overview of the practices, challenges, and strategies involved in implementing a microservices-based architecture in Rails, emphasizing that while microservices offer many advantages, careful consideration and management of associated complexities are essential for success.


I have spent some time working on a project where we've built 8 micro services and 2 applications, and planned to carve out a few more. Deployment was carried out in a farm of 25 servers in production with a single click in less than 3 minutes.

In this talk I will share our experiences with building a micro service based architecture - the good, the bad and the ugly.

What are micro services?
When/Why/How micro services?
Why NOT micro services?
Managing Continuous Integration and Continuous Delivery with micro services
A few design principles that we followed and that worked for us


Red Dot Ruby Conference 2014

00:00:20.480 um so today we're gonna talk about microservices because before that something about me
00:00:25.680 i am a consultant from thoughtworks singapore full stack engineer and i also
00:00:31.599 co-founded ideaboardz which is a retrospective tool this is my twitter
00:00:39.200 handle and github handle enough about me uh what's in it for you
00:00:46.239 so we're going to talk about what are microservices uh why microservices how microservices
00:00:53.039 and when do we use those microservices how many of you have heard about
00:00:59.359 microservices oh how many of you have actually used
00:01:04.879 microservices or wrote code cool um so let's define micro services uh
00:01:13.680 there is no formal definition per se but before jumping on to that let's talk about uh what are services
00:01:21.280 um it's just an implementation of a contract right you have some contract and the service actually
00:01:28.240 implements that particular contract and tells you that if if you call me with these these parameters i'm gonna do
00:01:34.159 something for you what happens if you attach a micro to
00:01:39.759 the service
00:01:45.200 um so micro as the name suggests it should be small we're gonna talk about what is small it
00:01:51.840 should be independent it should be self-contained it should perform on itself it should be
00:01:58.799 composable each service should work together with all the other microservices
00:02:05.600 and it does one thing and it does that one thing well um that's the that's the key um
00:02:13.520 but what is the right size we talk about micro services there is a lot of conversation around what should be the
00:02:19.920 size of the service some people rate size based on
00:02:25.440 lines of code like if it's beyond 200 lines of code it's not a microservice some people
00:02:32.640 think about it in terms of team that if one person could develop that over a period of time then that is
00:02:37.760 a micro service or if it's a group of people that could work independently on a service
00:02:43.519 that is a micro service um well for us we've been working with a client and for
00:02:49.280 us what really worked out is defining those services in terms of domain
00:02:55.280 so each service does just one thing and it does that one thing right so that one
00:03:01.599 thing could take 100 lines of code or could take 1000 lines of code
00:03:10.879 tying it to the unix philosophy you write programs which are small
00:03:16.560 programs we've used sed and all these small little programs
00:03:22.159 and then use pipes to concatenate them and make them cohesive make them work
00:03:28.560 together in case of microservices yeah http is the new pipe because
00:03:35.120 microservices are usually exposed over http and each service does its own job and passes
00:03:42.319 it on to a different service
00:03:48.239 it's also kind of uh going back to the object oriented philosophy there was a
00:03:54.239 talk on solid principles so taking uh talk about single
00:03:59.439 responsibility this service does just one thing
00:04:04.480 uh it has low coupling it doesn't it's not chatty with too many different services
00:04:10.239 at the same time it is cohesive and it's small and does that one thing
00:04:16.720 well and as jeff bay says
00:04:23.759 a monolithic application which is 100k lines of code is nothing but 100 1k
00:04:31.040 line applications waiting to happen well he has used the 1k line of code as
00:04:36.960 a parameter for a microservice but you get the idea right i mean rather than having a big one of these
00:04:43.120 monolithic application break it down into smaller so that it's much more manageable
00:04:50.880 um so why do we use microservice what's what's the key objective um
00:04:58.800 rather than answering that question i would like to share my story on my journey of using microservice or how we
00:05:04.560 use microservice in one of the client projects um so i'll tell you a story uh
00:05:12.400 this happened couple of years back when we started an engagement with a client
00:05:17.680 i'll give you a little history on the client uh this is 90 year old business it's a social
00:05:24.400 gaming company it was a 90 year old business and it was quite funny to hear on the first day
00:05:30.720 that their customers were literally dying or dying out but literally dying because
00:05:36.800 of the age so yeah lots of legacy codes over this
00:05:42.080 period of 90 years they've accumulated like a lot of code acquired a few companies the tech stack
00:05:48.720 was mixed of vb scripts vb6 forms and oracle and all
00:05:55.919 crazy stuff because of this it was not
00:06:03.919 flexible the cost of introducing any change was like unfold even adding a simple feature on a
00:06:10.319 customer was like changing three or four different systems and then uh
00:06:16.000 getting that through so it was really really painful all these uh apps were
00:06:24.240 fully functional in their own silos because of the various acquisitions and mergers um but each of the apps had like
00:06:31.520 concentrated complexity yes there's a lump of code sitting there
00:06:37.039 and nobody knows how it works
00:06:43.120 when we started talking more about it that's when we came up with this idea that
00:06:49.680 oh this sounds like we need to build something that is small that is independent that is composable and that does one
00:06:56.720 thing so that if there are four different applications that want to do payment
00:07:01.759 it's just one service which should be doing payment right if they want to store customer information there should
00:07:07.039 be just one service holding those customer information so that's when
00:07:12.720 that's where our journey started and building the micro services what we've achieved
00:07:19.759 so far is we have 10 micro services
00:07:24.800 doing a bunch of things 25 vms in production 60 plus vms across other environments
00:07:32.160 like qa tests and performance environments
00:07:37.199 and we could achieve one click deployment across all these environments
00:07:44.720 and you can guess if it's one click deployment who does the deployment so it's product owners qas
00:07:51.039 anybody uh it's an interesting story that uh somebody at the client side wanted a dog to do
00:07:56.960 the deployment so press the button um so for us microservices
00:08:05.680 each of these 10 different microservices was like self-contained they had like their own db their own
00:08:11.759 contracts they were running in their own process
00:08:18.000 talking to each other through http
00:08:23.280 but how did we start we didn't start with 10 micro services on day one right so try to solve small and valuable
00:08:32.320 problems started with small piece of functionality like customer database and
00:08:38.240 try to migrate that first start with plain old services so we didn't start we
00:08:46.399 didn't start with microservices on day one that hey this let's go with just one resource
00:08:52.399 per service or let's just go with one responsibility per service we
00:08:57.839 started with plain old services and started realizing that some things could be moved out
00:09:03.920 so when services started doing too much over the period of time like you do object refactoring we
00:09:09.839 refactored services so if a service's responsibility grew we extracted them to smaller ones
00:09:18.720 um so as i was telling you uh about the domain which is uh social
00:09:24.560 gaming typically uh over the web it has three or four components so that's where we
00:09:29.600 started the plain old services there is catalog which is the game catalog
00:09:35.920 customers the orders and payments
00:09:41.519 we slowly realized that customer service is trying to talk to legacy too much so we extracted that out as a
00:09:48.720 different micro service to talk to legacy database and those kind of things
00:09:54.160 and eventually throw that away the orders started growing too much so
00:10:00.640 we extracted out orders into two which is order processing uh the main order service now is
00:10:07.519 responsible for just taking orders and there is a separate service for processing the orders
00:10:13.360 and there is a separate service for resulting because it's a gaming company so after
00:10:18.560 when you play anything at the end of it you get results so there was this result service
00:10:28.320 from a high level you would still think that well it's okay that just sounds like a plain old
00:10:34.079 service right i mean i had like lot of debates like why is this uh microservice it's just it's just service
00:10:41.680 which is well designed that's it right i mean uh had hard time people asking those
00:10:47.200 questions that why why what is micro in these services
00:10:52.959 well for us it was mainly the single responsibility fact that a customer service just does
00:10:59.839 everything related to customer data and its boundaries are restricted to
00:11:05.600 just customer and it never overstepped its boundaries
00:11:11.839 um we also said each service has one resource so customer service has
00:11:18.320 just one resource uh well sometimes uh it had two resources
00:11:23.600 so especially with payments when we started dealing with payments it started having two or three resources because we had
00:11:30.000 payment method credit card and direct debit which are like three resources in one service but pulling them out into
00:11:38.240 their own microservice would mean breaking that
00:11:43.680 law of composition and the services the payment service the credit card
00:11:49.040 service and the direct debit service would become too chatty and
00:11:54.320 it would defeat the purpose of having a service
00:11:59.360 so they communicate over restful contracts http and json
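As a rough illustration of a domain-driven JSON contract for the customer service, here is a minimal Ruby sketch. The `Customer` fields, the `GET /customers/42` route, and the `customer_response` helper are all assumptions made for the example; the talk does not show the real contracts.

```ruby
require "json"

# Hypothetical customer resource; the field names are illustrative only.
Customer = Struct.new(:id, :name, :email) do
  # The contract is the JSON shape the service promises to return,
  # driven by the domain rather than by any single consumer.
  def to_contract
    { "id" => id, "name" => name, "email" => email }
  end
end

# What the service might send back for GET /customers/42 (assumed route):
# a status code, response headers, and a JSON body.
def customer_response(customer)
  [200, { "Content-Type" => "application/json" }, JSON.generate(customer.to_contract)]
end

status, _headers, body = customer_response(Customer.new(42, "Asha", "asha@example.com"))
puts status
puts body
```

Keeping the response shape in one place (`to_contract` here) makes it easier to notice when a consumer-specific field is about to leak into the contract.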
00:12:04.480 um so over the period of time when we were into that journey had some
00:12:11.760 thumb rules around what really uh you should do to make these services
00:12:17.200 uh micro and keep the complexity to a minimum is one top level resource as we just
00:12:24.399 talked about well in some cases two focus on contracts so that is very
00:12:30.800 important for each service that the contract should be driven we made
00:12:35.839 it clear that contracts should be driven mainly based on the domain and not the client
00:12:42.959 so if the clients the consumer of the service needs certain additional information or
00:12:49.200 summary information or those kind of things then what i have seen is many
00:12:55.360 places we end up creating those service endpoints and those become unmanageable so we
00:13:01.760 focused a lot on contracts uh each service
00:13:08.000 had their own context and uh they're not allowed to access
00:13:13.040 data which is beyond their context anyway each of the service has their own database so
00:13:18.639 it was anyway not possible but we made a point that even if something is accessible uh
00:13:26.000 some data is accessible and it is beyond the domain so call the
00:13:32.800 service rather than directly using the data source uh avoid too much of coupling between
00:13:40.000 the services again since they are independent we try to avoid a lot of coupling
00:13:45.519 between the services and since now we have 10 different
00:13:51.519 application each of them logging in their own each of them having running in their own process it was
00:13:57.839 really important to have a sophisticated logging and monitoring framework
00:14:03.519 so that if anything breaks we get to know immediately if any of the request fails through logs
00:14:10.560 we can immediately catch those errors or exceptions
00:14:16.000 um so having said that uh following those thumb rules we ended up creating a
00:14:22.160 few cross-cutting services which were needed across
00:14:28.560 all the other microservices so going back to the services that we had we ended up
00:14:35.839 having a communication service because there was a lot of communication to the clients around
00:14:41.760 hey your payment is due next month welcome welcome message or maybe result message
00:14:48.160 that hey you won this particular game so each of these services had their own communication
00:14:55.440 so we abstracted out the communication the cross-cutting concerns around all the services
00:15:00.560 created a communication service and all the service needs to do is just ping
00:15:06.880 that communication service and say that hey send the communication that customer has won
00:15:12.880 and it's the communication service responsibility to figure out whether to send an email communication
00:15:18.399 or or sms communication or whatever
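The idea that callers only report an event and the communication service picks the channel could be sketched roughly as follows. The class name, event names, channel names, and the preference lookup are all hypothetical; the talk does not describe how the real service decided between email and SMS.

```ruby
# Hypothetical sketch of the routing decision inside a communication service.
class CommunicationService
  attr_reader :outbox

  # preferences maps a customer id to a preferred channel (:email or :sms).
  def initialize(preferences)
    @preferences = preferences
    @outbox = []
  end

  # Callers only say what happened; they never choose the channel.
  def notify(customer_id, event)
    channel = @preferences.fetch(customer_id, :email) # assumed default: email
    @outbox << { channel: channel, customer_id: customer_id, event: event }
    channel
  end
end

service = CommunicationService.new({ 7 => :sms })
service.notify(7, :game_won)    # customer 7 prefers SMS
service.notify(8, :payment_due) # no preference recorded, falls back to email
puts service.outbox.inspect
```

In a real deployment the `outbox` would be replaced by calls to an email or SMS provider, most likely from a background job since the talk describes communication as asynchronous.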
00:15:24.959 there were a lot of scheduled jobs across these uh services like you need to send a
00:15:30.160 weekly email to customers uh payment emails and stuff so those scheduling was part of all these
00:15:37.759 services abstracted out in a separate service so that it would just ping the service and
00:15:43.759 schedule anything on any service and we talked about error reporting
00:15:49.920 already uh so it was extracted out as a separate service so it would log a request to one
00:15:55.440 common place and could do a search and those kind of stuff on this error reporting tool
00:16:04.639 um so having this farm of services uh what you
00:16:11.440 usually hear about uh is service explosion um so you have these services it's
00:16:18.560 difficult to for a developer to check out them and deploy them and
00:16:24.240 things like that so how do you stay productive in spite of having these many services
00:16:31.440 um well we use ruby on rails we used rails api to build the service endpoints
00:16:40.240 uh focus lot of uh a lot of our focus was on devops to make
00:16:45.519 things as simple as possible in terms of deployment in terms of setting up a dev
00:16:50.560 box and stuff um we used feature toggles
00:16:56.000 instead of feature branches because doing feature branches with a service oriented architecture is like
00:17:02.160 really really difficult and having ci and cd pipeline itself is like really
00:17:07.520 difficult um we also created a lot of client gems
00:17:13.039 for these micro services the client gems basically provide
00:17:18.799 an easy way to talk to the service so it would feel like you're calling a
00:17:24.000 service in memory because it gives you a nice object you just call a service using that object and you get
00:17:30.960 an object back and our mantra was automate automate
00:17:37.679 whatever it is whatever is the repetitive process just automate everything
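A client gem of the kind described above, one that makes a remote call feel like an in-memory call, might look roughly like this. The `CustomerClient` class, base URL, route, and fields are assumptions for illustration. The Q&A later mentions the Hashie gem for building the returned objects; this sketch uses a plain `Struct` to stay dependency-free.

```ruby
require "json"
require "net/http"
require "uri"

# Hypothetical client gem for the customer service.
class CustomerClient
  Customer = Struct.new(:id, :name, :email)

  # `fetcher` is injectable so the client can be exercised without a network.
  def initialize(base_url, fetcher: ->(uri) { Net::HTTP.get(uri) })
    @base_url = base_url
    @fetcher = fetcher
  end

  # Looks like an in-memory lookup, but performs an HTTP GET underneath
  # and wraps the JSON response in a plain object.
  def find(id)
    json = @fetcher.call(URI("#{@base_url}/customers/#{id}"))
    data = JSON.parse(json)
    Customer.new(data["id"], data["name"], data["email"])
  end
end

# Usage with a stubbed transport instead of a live service:
stub = ->(_uri) { '{"id": 42, "name": "Asha", "email": "asha@example.com"}' }
customer = CustomerClient.new("http://customers.internal", fetcher: stub).find(42)
puts customer.name
```

Injecting the transport is a design choice of this sketch, not something the talk describes; it keeps consumers' unit tests fast and independent of a running service.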
00:17:45.360 um so how do i make a small change and still stay sane i mean if we make a small
00:17:52.960 change how do we make sure that everything works properly if that change happens the answer is
00:18:00.000 simple test it and if you're thinking of something like this
00:18:09.520 then this is the answer it's funny only when it is a joke i mean
00:18:15.600 if you're building an enterprise software or any software it's important to have tests um so
00:18:23.440 started off with unit test in each of the services whether the object within the service is
00:18:29.200 doing the right thing but then that's too obvious right i mean every one of us writes unit tests
00:18:35.840 and we love rspec then the contract test which is uh is my
00:18:43.440 service doing what it should uh which is basically out of container
00:18:48.480 test so we ping a endpoint in memory service and see whether we're getting the right
00:18:54.840 response and there is uh and we test the
00:19:00.000 contracts these are basically black box tests which test the contract
00:19:09.120 send something and get a data get some response back don't worry about the implementation uh
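A black-box contract check of this kind, hitting an endpoint in-process and inspecting only the response, could be sketched as below. The team used RSpec; this example uses plain Ruby and a Rack-style callable so that it runs without any gems. The route, payload, and the `contract_violations` helper are hypothetical.

```ruby
require "json"

# Stand-in for the service: a Rack-style callable (env -> [status, headers, body]).
CUSTOMER_APP = lambda do |env|
  if env["REQUEST_METHOD"] == "GET" && env["PATH_INFO"] == "/customers/42"
    [200, { "Content-Type" => "application/json" },
     [JSON.generate({ "id" => 42, "name" => "Asha" })]]
  else
    [404, { "Content-Type" => "application/json" },
     [JSON.generate({ "error" => "not found" })]]
  end
end

# Black-box check: call the endpoint in-process and look only at the response,
# never at how the service produced it. Returns a list of contract problems.
def contract_violations(app)
  status, headers, body = app.call({ "REQUEST_METHOD" => "GET", "PATH_INFO" => "/customers/42" })
  payload = JSON.parse(body.join)
  problems = []
  problems << "expected 200, got #{status}" unless status == 200
  problems << "expected JSON content type" unless headers["Content-Type"] == "application/json"
  problems << "missing id" unless payload.key?("id")
  problems << "missing name" unless payload.key?("name")
  problems
end

puts contract_violations(CUSTOMER_APP).empty? ? "contract holds" : "contract broken"
```

Because the check only depends on the request and response shape, the implementation behind the endpoint can be rewritten freely as long as the list of violations stays empty.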
00:19:15.679 the next is integration test the acceptance test is for uh the boundaries within within the
00:19:22.000 service itself in integration tests we test whether this particular service is behaving
00:19:28.160 nicely with other services so if this service is calling some other service or testing the user flow then we write
00:19:34.640 root unit test it tests the distributed effect if anything fails then
00:19:40.080 actually is the error getting reported in the error reporting service or if the
00:19:46.720 payment is getting uh deducted then is the customer getting communication or not um
00:19:55.520 so test async action lot of actions were async like when you're sending a communication it's all async
00:20:02.480 so we were using uh resque for it and there's a nice plug-in for resque that lets
00:20:08.080 you test async actions um so you build these micro services
00:20:15.120 uh how do we actually ship it as james lewis says we are essentially
00:20:21.679 shifting the complexity of building the software to the infrastructure
00:20:27.360 so instead of now having one application to deploy now we have to deploy like 100
00:20:34.000 applications of 1k line each so the code becomes simple easy to
00:20:41.200 understand but infrastructure is slightly complicated
00:20:46.400 so we provision using puppet solo at some point of time we would like to
00:20:51.919 use docker as well um so provisioning it begins at home so
00:20:58.000 even the dev box are provisioned so that if there is any change in
00:21:03.440 any of the versions of the software the same scripts are used across all the
00:21:09.120 environments um so script goes through ci like
00:21:14.960 application code the puppet script we're gonna see that in the ci
00:21:20.559 pipeline slide and immutable server as brian was talking about it in the morning so
00:21:27.440 it doesn't make sense for server to be mutable if you're using provisioning script
00:21:33.760 this is how our integration pipeline continuous integration pipeline looked like
00:21:39.840 we had the ui test each each of the
00:21:44.880 boxes at the top is unit tests so there's ui there is service there is
00:21:51.120 a puppet code which flows through integration uat performance
00:21:56.720 and eventually to production all this is one click deployments across environments
00:22:06.880 so with every check-in we run unit test run integration test that we've written
00:22:12.559 run acceptance test and build a package this is really important because that same version of
00:22:19.200 the package would be deployed across all the other other environments
00:22:24.480 and the most important thing is we shipped often like weekly or less than that uh just ship
00:22:30.880 whatever you have and we shipped it like fedex
00:22:38.640 um so talking about ci and cd followed that in the project what it
00:22:44.480 actually gave us is single click deployments uh we managed to get
00:22:49.600 cut down the server deployments from uh to actually three minutes so each change
00:22:56.240 would actually take like three minutes to deploy to server to production um we had a farm of
00:23:02.240 25 servers and everything just works uh their deployments and everything
00:23:08.880 just works like a charm um yeah and so easy that our product owner
00:23:14.880 does it um we made a point that since we are adding
00:23:20.400 lot of micro services and refactoring to microservices
00:23:25.440 the cost of adding any of these services should be as low as possible
00:23:31.039 so we managed to cut that time to less than less than a day and right from creating
00:23:37.919 a project to taking that empty project to production was less than a day
00:23:44.799 so having uh talked about microservice when do we use the microservice it's not
00:23:51.840 a silver bullet right i mean it comes with a cost
00:23:57.840 so these are some of the trade-offs the benefits and the cost associated with them
00:24:02.880 so the benefit is you get small reusable and maintainable
00:24:08.080 code which are throw away you can just rewrite them and stuff but at the same time you'll have like a
00:24:14.559 complex infrastructure because you need to deploy those independent services the individual codes
00:24:21.200 each service would grow independently can divide the team such that based on
00:24:28.240 services and each service would keep growing on their own with the teams
00:24:35.440 but on the same side on the other side the learning curve is quite huge in microservice
00:24:41.520 because now you have to deal with multiple applications and developers would find hard to know what's happening
00:24:49.120 in the other service they scale independently as they are in
00:24:56.080 their own process they have their own database you can make deployments such that if there is any
00:25:02.880 service which is not heavily loaded uh you can make a rational use of
00:25:08.960 the infrastructure rather than having monolithic app and running
00:25:15.760 fat servers at the same time there is network overhead in terms of going
00:25:22.720 through the http going through over the wire and calling those services they have independent dbs so if there is
00:25:29.360 high load on database the database could be scaled independently but at the same time they
00:25:34.960 end up having a fragmented data and the reporting and all becomes slightly
00:25:40.840 difficult well uh that's all i had uh
00:25:47.919 questions your last comment is the perfect segue into my question which was this seems like it would make
00:25:54.080 reporting a nightmare so is it slightly difficult or is it a nightmare well
00:26:02.000 well some uh some of the complex reports become like nightmare as you said um
00:26:10.320 in some of the cases what we have uh tried out is uh dump these data into a warehouse and
00:26:16.480 start developing reports out of that uh there are some plugins with uh
00:26:22.000 postgresql where it allows you to connect to multiple databases and run sql query across databases
00:26:29.279 that worked out in some cases in some cases it was just fetch the data if it's small enough
00:26:36.000 and do it in memory so it it depends on the usage um if the data is really like really huge
00:26:42.559 then first case where you dump that in a warehouse would really work fine how do you deal with
00:26:49.120 uh versioning the services and what services you're talking to from
00:26:54.960 presumably not something plot involved according to each other
00:27:00.080 um so as you saw the deployment pipeline it uh to start with we said okay let's
00:27:08.640 not do versioning at all in the services let's deploy all everything or nothing so rather than
00:27:15.279 picking up what needs to be deployed we made our deployment script intelligent enough to figure out if there is a
00:27:20.720 change it would deploy otherwise it won't and since uh in the deployment
00:27:27.919 in the ci pipeline you have tested that hey this version of the service works nicely with these other versions
00:27:34.159 of the service we would deploy that whole lump of all the services together
00:27:39.440 or and roll back if needed all the services together this is just to simplify so that we
00:27:45.360 don't end up having too many versions and too many
00:27:52.080 making code unmaintainable basically a couple of questions
00:27:59.600 on your client library gems did you uh did you settle on something like there are a few
00:28:04.720 uh json schema standards that are being talked about that sort of made it easier
00:28:10.080 to like the client library to be sort of discovering the layout of the api did you use
00:28:15.840 anything like that or is it kind of that um so we used hashie gem if you have heard of that
00:28:21.360 that worked out really well where we could actually uh create the objects create the models
00:28:27.279 and stuff really nicely using a dsl
00:28:33.039 all right and although if you don't mind that have you found a need or use any tools or building tools like
00:28:39.200 any twitter adults something that it's probably solid specifically for tracing transactions that go through
00:28:45.600 multiple services like if you have some issues that come up especially with the urgent question like do you have a hard time do
00:28:51.679 you want to use something that goes through multiple services with a you mentioned a lot of logging and boundary
00:28:56.840 infrastructure
00:29:03.360 so we passed in a unique identifier from the main caller so the ui
00:29:09.279 layer which talks to the service would typically pass in a common header or which would be
00:29:15.679 a unique identifier and if this service is calling some other service it would pass on the same
00:29:21.760 identifier and the logging actually make sure that we are using that particular identifier so when you're searching in
00:29:29.039 splunk or any other log aggregator service all you need to do is just check based on that
00:29:34.399 identifier to trace where all the requests went through thank you
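The identifier-passing approach in this answer could be sketched as follows. The header name `X-Request-Id` is an assumption (the talk only says "a common header"), and the helper names are hypothetical.

```ruby
require "securerandom"
require "logger"
require "stringio"

# Header name is an assumption; the talk only mentions "a common header".
CORRELATION_HEADER = "X-Request-Id"

# Reuse the caller's identifier if present, otherwise mint one
# (as the UI layer would for the first call in a chain).
def correlation_id(incoming_headers)
  incoming_headers[CORRELATION_HEADER] || SecureRandom.uuid
end

# Headers to send on any downstream call so the identifier travels along.
def outgoing_headers(id)
  { CORRELATION_HEADER => id }
end

# Prefix every log line with the identifier so a log aggregator can group them.
def log_with_id(logger, id, message)
  logger.info("[#{id}] #{message}")
end

log_output = StringIO.new
logger = Logger.new(log_output)

id = correlation_id({})                  # UI layer: no incoming identifier
log_with_id(logger, id, "order received")
downstream = outgoing_headers(id)        # forwarded to the next service
log_with_id(logger, correlation_id(downstream), "payment charged")
puts log_output.string
```

Searching the aggregated logs for that one identifier then returns every line written on behalf of the original request, regardless of which service wrote it.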
00:29:44.320 okay i was wondering if you how you went about testing the contract of the service
00:29:51.440 did you use any particular tools um actually rspec has it
00:29:58.559 uh rspec has a way to test the contracts uh without actually spinning up the uh
00:30:05.679 the server like we do controller test right you hit an endpoint and see if it is returning the proper response
00:30:13.520 if you do a render_views on top of it uh it will actually render the view it will actually render the json and give
00:30:19.919 you back the response so we just use plain rspec so does it actually help you test it once
00:30:29.039 the client of the server changes it breaks the client or
00:30:41.120 so so the contract tests were about testing the contracts in isolation so
00:30:46.399 you just test that service's contract and if there is any change and if there is any breakage we don't allow to
00:30:52.880 promote that particular package any further there was also integration test pipeline
00:31:00.880 which tests whether this particular service is able to work
00:31:07.760 together with other services for that again we used rspec to test the contracts and test
00:31:16.159 its distributed effect across multiple services
00:31:23.600 okay so you're talking about did we calculate the cpu utilization
00:31:28.640 and memory overhead right
00:31:43.840 um so as i said i mean these uh since we could
00:31:50.559 break down the domain into multiple apps it was uh really interesting to you know spin
00:31:56.960 up let's say 10 uh instances of customer service
00:32:02.080 because that is heavily loaded that needs login and we don't want customers to lose out on that while
00:32:08.480 the payment service or communication service which is communication services async so just
00:32:15.840 spin up just one instance of communication service so we played around with lot of those
00:32:22.000 combinations to optimize the infrastructure usage does that that's that
00:32:29.200 answer your question okay well uh thanks
00:32:36.640 thanks anand thank you