00:00:00
Ready for takeoff.
00:00:17
All right, let's get started now.
00:00:22
I'm sure you've all heard of microservices—the hottest new architectural trend of the last decade. But what is a microservice? Well, I think we can infer from the name that they are like regular services, but smaller. From the hype around them, we can also conclude that they are, in some way, better than regular services.
00:00:40
As far as I'm aware, this is all the information about microservices available anywhere on the internet. There's not a single blog post, book, or think piece to be found. So, we are going to have to extrapolate ourselves. I think we can conclude from those two pieces of information that smaller is better for microservices. But what is the optimal size for a microservice? Perhaps each of your microservices should encapsulate a single concept, or maybe if you're breaking apart a monolith, each of your models should be a microservice.
00:01:04
Alternatively, it could be a single cross-cutting concern. I would like to put this to a vote, but I'm not going to because you'd all be wrong. Let's take a look at the data. If regular service-oriented architecture is good, and microservices are better, well, any fool can draw a line between two points—and I am such a fool. Thus, I assert that the optimal size for a microservice in Ruby is a single object.
00:01:39
If you've got an object anywhere in Ruby, it should run on a remote machine. But let me be more concrete. Imagine that you've got a class and an instance of that class, and I call a method on that instance. This method should run on a remote machine. This object should be a proxy object, no matter what it is. Folks, you may not like it, but this is what peak architecture looks like.
00:02:10
I don't know how this is better than a regular service, but it must be because it’s microservices. Of course, our custom objects should also be microservices, but this is Ruby. Everything is an object, and therefore, everything should be a microservice. This array should be a microservice, this string should be a microservice, and even these integers should run on a remote server.
00:02:32
To that terrible end, I present to you this talk: Everything a Microservice—the worst possible introduction to dRuby.
00:02:58
Thank you.
00:03:03
Now, because I have the moral compass of a Disney villain, I'm going to walk you through setting up your code to do exactly this: to make every single object run on a remote server. Let's take a look at our terrible, terrible goal—every object, including ones created by Ruby, automatically running on remote servers.
00:03:28
To achieve this dreadful end, we need to start small. We need a concept of a remote object. I want to create an object, and then when I call a method on that object, that method should run remotely.
00:03:52
Now, I could do this myself, maybe have that method issue HTTP requests, deal with responses, and set up a service to run that object. But that sounds like a lot of work, and one thing you should know about me is that I am as lazy as I am irresponsible. Therefore, I'm going to try to use something built into Ruby. Let me introduce you to dRuby.
00:04:03
Now, dRuby is an interesting little API for creating and using remote objects. It's built into the Ruby standard library, although if you want, there is a bonus external gem that you can get by doing a `gem install drb`, which just adds some features to the built-in version. But we're going to stick with the built-in version.
00:04:11
Before I talk about how not to use dRuby, let's start out with the correct way to use it. dRuby has a client-server architecture. You've got some local script, let's say running on my laptop, and then we've got a remote script running on a server in the cloud somewhere.
00:04:37
In the remote script, first, require `drb/drb` because dRuby is included in the Ruby standard library, but it's not required by default. Then we set up a class that is going to be a remote object. Next, we call `DRb.start_service`, which will start a dRuby service on whatever hostname and port you give it.
00:05:01
After that, we provide an object to handle requests for this service on that port. Here, we're just going to give it an instance of a remote server. Next, we call `DRb.thread.join`, which will block this script at this point so that it doesn't exit immediately. It will stay running and listening for incoming connections.
00:05:27
On the local side, we require `drb/drb`. Then we create a new remote object with `DRbObject.new_with_uri`, specifying the URI where we started our server. This remote object is a proxy object; it means it's not exactly an instance of remote server.
00:05:39
If I call a method on it, say `.foo`, that method will get proxied to the remote server and run there, with any result returning back to the local side. Let's see how this works. We can run `remote.rb`, which will stop, waiting for incoming connections. Then, on the local side, running `local.rb` will immediately cause the `.foo` method to run on the remote server, not the local one.
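The setup described above can be sketched in a single script. This is a minimal illustration, not the slide code: the class name `RemoteServer`, the return value, and the `localhost` URIs are assumptions, and both "sides" run in one process so the example is self-contained. A second `DRb::DRbServer` stands in for `remote.rb`.

```ruby
require 'drb/drb'

# The class whose instance handles requests on the remote side.
class RemoteServer
  def foo
    'bar'
  end
end

# The local side's own service. Starting it first makes it this process's
# primary server, so the call below travels over a socket instead of being
# recognised as local and short-circuited.
DRb.start_service('druby://localhost:0')

# Stand-in for remote.rb. Port 0 asks the OS for any free port.
remote_side = DRb::DRbServer.new('druby://localhost:0', RemoteServer.new)

# Stand-in for local.rb: build a proxy from the URI and call a method on it.
remote = DRb::DRbObject.new_with_uri(remote_side.uri)
result = remote.foo
puts result
```

In a real two-machine setup the remote script would end with `DRb.thread.join`, as described in the talk, so that it keeps listening.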
00:06:12
This is already kind of like a little microservice with very little code. We have a remote object, but that's how you're supposed to use dRuby, and it kind of sucks. I mean, we have to explicitly set up the `.foo` method on the remote side.
00:06:41
Any method you want to call has to be previously set up on the remote side before you can use it on the local side. And that's not what we want. Let's look back at our goal: we just want to write arbitrary Ruby code and have every object magically become a remote object.
00:07:09
What we want is some function that, given a class like this, creates a remote instance of it via magic. Then what we should get is a nice proxy object. We don't want to pre-configure everything on the remote side.
00:07:32
Well, if we can set up any object to be remote through dRuby, why don't we just make a remote class that creates new objects? So, here I am on the remote side, creating a class called `RemoteNewer` with a method `make_new`. All it does is take in a class, call `.new` on it, and return the resulting object.
00:08:02
We set that up as a dRuby service, and now, on my local side, I can create a proxy for this `RemoteNewer` using `DRbObject.new()` with the URI and call `make_new`, giving it a class. I should get a remote instance of that class.
00:08:24
This sort of works, but it doesn't quite do what we want. Here's what dRuby is doing: if you have a remote object and you call a method on it, that method will run on the remote side. Any results that method returns have to get back to the local side.
00:08:57
So, dRuby will serialize whatever it's returning (in this case, the string 'bar'), send it over the wire, and deserialize it on the local side. If you do that with a string, all you get is a copy of that string, a local string. That's not what we want with our `make_new` method. We want to get another proxy object instead.
00:09:34
It turns out dRuby is very good at serializing simple things, like strings and integers, but it doesn't know how to serialize everything. For example, if I have a remote method that returns something unusual—something dRuby does not know how to serialize—it will keep that instance on the remote side and instead return a proxy object. Then, if I call a method on that, it will get proxied back to the remote side.
00:09:52
And that’s what we want with our `make_new` function. How does dRuby know what’s too weird to serialize? Well, you can tell it explicitly. You can say that for a given class, you include `DRb::DRbUndumped` to prevent dRuby from trying to serialize instances of this class.
00:10:18
So, back to our `RemoteNewer` `make_new` nonsense. This `make_new` function takes a class, creates a new instance of it, and we're going to cram in this `DRb::DRbUndumped` mix-in to any object we create.
00:10:47
And this sort of works! I have some class, create a new instance in my `RemoteNewer`, and use that to make a new proxy object of whatever class I want. This works well, but the `make_new` function only works if the class takes no arguments. So let's beef it up.
00:11:10
Now it can handle classes that take multiple arguments, keyword arguments, blocks—whatever, and it just passes that on to the `.new` function of this class.
00:11:26
This works! I create my `RemoteNewer`, call `make_new`, and say we're going to make a new string with an initial value of 'string123'. What I get is actually a remote string instead of a local string.
00:11:51
And I should be able to call any method I could normally call on a string on this remote string. I can call `.to_i`, and that `.to_i` will get proxied out to the remote server and back.
00:12:11
But how can we be sure of that? Because `.to_i` is just going to return an integer. For all we know, it ran locally. Here's how: don't worry about the code too much.
00:12:34
I'm monkey patching dRuby so that anytime a call tries to proxy a method, it will print out what that method is. So, back to our `RemoteNewer` nonsense—when we create a remote string and call `.to_i`, let's see what happens.
00:12:53
The `make_new` is a proxied method, and so is `.to_i`. It did run on the remote server! This little string here is a tiny microservice, and we can already make everything we want be microservices. We can create a remote `Newer`, a remote string, a remote hash, or a remote instance of any class we want.
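One way to observe which calls get proxied, sketched under assumptions: the talk's actual monkey patch is not shown in the transcript, so this version prepends a hypothetical `ProxyCallLogger` module in front of `DRb::DRbObject#method_missing`, which is where dRuby forwards calls it does not handle locally.

```ruby
require 'drb/drb'

# Hypothetical logging hook. Because it is prepended, it runs before
# DRbObject's own method_missing and then hands over with super.
module ProxyCallLogger
  def self.calls
    @calls ||= []
  end

  def method_missing(name, *args, &block)
    ProxyCallLogger.calls << name
    puts "proxying #{name}"
    super
  end
end

DRb::DRbObject.prepend(ProxyCallLogger)

DRb.start_service('druby://localhost:0')
# Any object can be the front object; here it is a plain String.
remote_side = DRb::DRbServer.new('druby://localhost:0', String.new('123'))

remote_string = DRb::DRbObject.new_with_uri(remote_side.uri)
number = remote_string.to_i
puts number
```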
00:13:17
But that's kind of lame for a couple of reasons, and not just that all we've done is add a couple hundred milliseconds of delay to all of our method calls. Notice that we have to manually call this `make_new` function any time we want to create a new instance of anything. I don't want to do that. I want to just call `String.new` or something like that and get a remote object.
00:13:44
Well, this is Ruby—the only limits are our imagination and our good sense. We clearly have exactly one of those things. Let's just override `String.new`. Why not? Heck, why stop there? Let's override `Object.new`. Now I am overriding the `new` method on the root object class so that instead of returning a normal local object, it will return a remote object.
00:14:13
And this should work. `String.new` should return a remote string. What could go wrong? Well, it immediately crashes and throws a stack overflow error. It turns out that dRuby needs to create some objects under the hood to function, so `Object.new` triggers dRuby, which triggers `Object.new`, which triggers dRuby.
00:14:30
And so on. You get an infinite loop. The classic way to deal with infinite recursion is to carefully consider your recursive cases, base cases, and edge cases. But that sounds like a lot of work, so I'm just going to inspect the call stack and if I'm about to recurse, I won't.
00:14:51
So here I am using the `caller` method in Ruby, which gives you the entire stack trace at the current point in the code as an array of strings. We are just going to look through that and see if there's anything in there that indicates we were called from dRuby. If so, I am at risk of recursing, so instead of creating a funky remote object, we are going to create a regular local object.
00:15:12
We do this by overriding `Object.new` and stashing the original `Object.new` method that creates local objects. Then, if we are at risk of recursing, we unbind it and rebind it to the current object and then call it. This is all this effort just to create a normal object. But if we are not at risk of recursing, we can do our crazy remote `new` object. Voila! This works—I can call `String.new` and get a remote proxy object for that string.
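The stash, rebind, and call-stack check can be illustrated without touching `Object.new` globally. In this sketch the override is limited to a hypothetical `Widget` class, and `build_wrapper` is a stand-in for the dRuby machinery that needs to create objects internally.

```ruby
# UnboundMethod for the built-in Class#new, stashed before anything is overridden.
ORIGINAL_NEW = Class.instance_method(:new)

class Widget
  attr_reader :label

  def initialize(label = 'plain')
    @label = label
  end
end

# Hypothetical stand-in for a proxy object.
Wrapper = Struct.new(:target)

# Stand-in for the proxy machinery: it has to create a real Widget,
# which re-enters the overridden Widget.new below.
def build_wrapper(klass, *args)
  Wrapper.new(klass.new(*args))
end

class << Widget
  def new(*args, &block)
    if caller.any? { |frame| frame.include?('build_wrapper') }
      # Already inside the wrapper machinery: bind the original Class#new
      # to this class and build an ordinary object.
      ORIGINAL_NEW.bind(self).call(*args, &block)
    else
      build_wrapper(self, *args)
    end
  end
end

wrapped = Widget.new('hello')
puts wrapped.class
puts wrapped.target.label
```

Matching on strings in `caller` is as fragile as it looks; it only works while the frame name stays recognisable.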
00:15:50
The same goes for arrays. Calling `.to_i` will run on the remote server. Look at this array! It's a thing of beauty. I've always thought that the only thing better than a Ruby array would be a Ruby array where every method call involves a network round trip.
00:16:14
Now, I can create an array, and it works fine. I can call any random method I want, but not that one; that one causes an error. I call `.each`, and it gives me 'no _dump_data is defined for class Proc'. I swear there's a reason for this; it’s not just that dRuby is getting tired of us.
00:16:39
It's because `.each` takes a block. Now, let’s take a look at what dRuby is doing under the hood to understand why that's a problem. If you have a remote object in dRuby and you call a method on it with an argument, dRuby needs to serialize this argument.
00:17:03
To do that, it calls `Marshal.dump` with that argument. If you call a method on a remote object that takes a block, well, a block is just a fancy kind of argument, so dRuby will try to dump that as a Proc. Now, what is Marshal? Marshal is Ruby's built-in way of serializing arbitrary objects.
00:17:34
You give it a hash like this `{ :key => 1 }` and call `Marshal.dump`, and it will serialize it to some data that you could save to disk or send over the wire. Later on, you can reload it with `Marshal.load`, and that should give you a copy of the same object you put in.
00:18:01
The problem is, Marshal does not know how to serialize procs. And why would it? Procs can reference arbitrary Ruby code, including variables, classes, methods, constants, and globals. So, to serialize a proc, you would have to serialize the entire program execution context somehow.
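A short sketch of both behaviours, using only core Ruby: a hash round-trips through `Marshal`, while a proc is rejected with a `TypeError`.

```ruby
original = { key: 1 }

# Serialize to a binary String, then rebuild an equal but distinct object.
bytes = Marshal.dump(original)
copy  = Marshal.load(bytes)
puts copy == original      # same contents
puts copy.equal?(original) # but not the same object

# Procs carry references to their surrounding context, so Marshal refuses them.
proc_error = nil
begin
  Marshal.dump(proc { |x| x + 1 })
rescue TypeError => e
  proc_error = e
end
puts proc_error.class
```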
00:18:26
So, there's just no way to coherently serialize a proc. But obviously, we are going to incoherently serialize one! It turns out you can tell Marshal how to serialize things it doesn't know how to serialize.
00:18:51
You do this by defining a couple of methods on the class of whatever instance you're trying to serialize. First, we define a `_dump` method and this method should serialize whatever proc we're trying to serialize.
00:19:13
We're going to do this with a method called `serialize_block`. We’ll look at how that works in a second, but what it does is serialize the current proc into a string representing that proc, basically just as raw source code.
00:19:34
Now, if you tell Marshal how to serialize something, you also have to tell it how to deserialize it. You do this by defining a `_load` method that takes in what your `_dump` method put out.
00:19:54
Given a string like this, how would you turn that into a real live in-memory proc? You can just prepend the string with `Proc.new` and evaluate it with `eval`. Now that's how you teach Marshal how to serialize procs.
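The Marshal hooks can be sketched without patching `Proc` itself. Here a hypothetical `SourceBlock` class carries the block's source text, so no source extraction is needed; the talk's version instead derives that string from a live proc.

```ruby
# Hypothetical wrapper that remembers the source text of a block.
class SourceBlock
  attr_reader :source

  def initialize(source)
    @source = source
    # Turns "{ |x| x + 1 }" into a live Proc. eval runs arbitrary code,
    # so this is only tolerable for input you fully trust.
    @callable = eval("Proc.new #{source}")
  end

  def call(*args)
    @callable.call(*args)
  end

  # Marshal calls this when dumping; it must return a String.
  def _dump(_level)
    @source
  end

  # Marshal calls this when loading, passing the String that _dump returned.
  def self._load(source)
    new(source)
  end
end

block    = SourceBlock.new('{ |x| x + 1 }')
reloaded = Marshal.load(Marshal.dump(block))
result   = reloaded.call(3)
puts result
```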
00:20:15
But you might notice that I'm eliding a lot of dark magic underneath this `serialize_block` function. How does that work? Poorly. But this is already a talk about things that should not be, so I'll take you through it.
00:20:39
It's a two-step process. First, we need to find the source of some block. If we have this proc, we need to find the raw source code of it. We can get part of the way there with a method called `proc.binding.source_location`.
00:21:15
This will give you two pieces of information: the file name where this proc was defined and the line number. If you want, you can just read that file from disk, skim down to the appropriate line, and pull out that line.
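A sketch of the read-it-back-from-disk approach for a single-line proc. It uses `Proc#source_location` directly rather than going through the binding, and writes a throwaway file first so the example does not depend on how it is run. The constant name `ADD_ONE` and the file contents are made up for illustration.

```ruby
require 'tempfile'

# Write a tiny Ruby file so the proc has a real file behind it.
script = Tempfile.new(['blocks', '.rb'])
script.write("# a comment on line 1\nADD_ONE = proc { |x| x + 1 }\n")
script.flush
load script.path

# source_location reports where the proc was defined: file path, then line.
path, line = ADD_ONE.source_location

# Line numbers are 1-based; array indexes are 0-based.
source_line = File.readlines(path)[line - 1]
puts line
puts source_line
```

This returns the whole line, assignment included, and breaks down as soon as the block spans several lines.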
00:21:38
While this is doable, it gets messy if the proc spans multiple lines, but it is still possible. However, it's a lot of code, and as we've established, I’m very lazy. So instead, we're going to use a gem called `method_source`.
00:21:59
Given some proc, I can call `proc.source` and it will give me the source code lines for that proc as a string. But it gives us the entire line, not just the block we want. It also gives us the proc call, its method assignment, and everything else.
00:22:27
How are we going to pull out just the block? I could run this string through an incredibly complex regular expression, but if I tried to parse Ruby code with regular expressions, I'd get beaten up in the parking lot by a gang of angry computer science professors.
00:22:49
So instead, we're going to run it through an actual parser. There are several nice Ruby parsers out there. The one we're going to use is `syntax_tree`.
00:23:07
Now, don’t worry too much about this code. It's kind of intentionally too small to be useful. What we’re doing under the hood with `syntax_tree` is we start out with some raw Ruby code.
00:23:36
We pass this to `syntax_tree` and say, 'Hey, parse this for me,' and it turns it into a tree-like data structure—an abstract syntax tree representing the structure of that code.
00:23:42
Then we’ll walk that abstract syntax tree until we find a node that represents a block. If we find a node of type block, we can say: 'Great, give me everything below this point in the tree' and format it.
00:24:05
Then, we can simply print it out, turning it into a string—and look what we get: the block itself, just what we wanted with none of the cruft.
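The talk walks a `syntax_tree` AST; since that is an external gem, the sketch below swaps in a simpler technique using Ruby's bundled `Ripper` lexer: scan the tokens and keep everything from the first `{` to its matching `}`. The helper name `extract_brace_block` is hypothetical, and the approach cannot tell a block from a hash literal or handle `do ... end`, so treat it as an illustration of the idea only.

```ruby
require 'ripper'

# Returns the text of the first brace-delimited span in the source,
# tracking nesting depth so inner braces do not end the span early.
def extract_brace_block(source)
  depth = 0
  pieces = []
  Ripper.lex(source).each do |_position, type, token|
    depth += 1 if type == :on_lbrace
    pieces << token if depth > 0
    next unless type == :on_rbrace && depth > 0

    depth -= 1
    break if depth.zero?
  end
  pieces.join
end

puts extract_brace_block('add_one = proc { |x| x + 1 }')
```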
00:24:21
Now, if we use this to serialize a block and this block references any variables outside of it, or classes outside of it, or anything not in the block itself, it won't work.
00:24:42
But when have we ever let common sense stop us? So let's press on. Remember, we taught Marshal how to serialize a block using this `serialize_block` function, and it works! I can call `Marshal.dump` with a proc and it will dump something.
00:25:01
Can we reload it? Yes! `Marshal.load` works, and we get a reloaded proc. Now I can even run this proc with three and it will print four.
00:25:26
We did the impossible; we've serialized the block. They said it couldn't be done—well, no one said it couldn't be done; they said it shouldn't be done. But we did it anyway, and it works.
00:25:48
We've taught `Marshal` how to serialize a block, and `dRuby` uses it. If I call `.new`, that gets picked up and turned into a proxy object by our previous code. Now I call `.each`, and dRuby will serialize this block.
00:26:11
It won’t throw an error; this `.each` will get run on the remote server, and the block will get run on the remote server. That's where one, two, three will print out.
00:26:27
We are now one step closer to the microservice architecture that all our favorite thought leaders tell us we need. But you might have noticed a problem. I don't just mean the sketchy-looking guy up on stage wearing flannel.
00:26:56
You might have noticed that we explicitly have to call `Array.new`. I don't usually create arrays like that. I prefer using the array literal syntax or the string literal syntax or the hash literal syntax.
00:27:21
But how can we do that? We could override `Array.new` like some kind of maniac, but we've already overridden `Object.new`. This technique works if you explicitly call `Array.new`, but it does not work for array literals; nothing happens when you do that.
00:27:40
It turns out there's no good way to override the literal syntax for arrays—at least not without getting into Ruby's C code, which is a bit more sketch than I'm willing to get into in this talk.
00:27:56
So instead, what if we could take array literal syntax and transform it into `Array.new` syntax that does the same thing? Well, this could work, but then I would have to run some Ruby code that reads my code, transforms it, and spits out more Ruby code.
00:28:16
I don't want to do that; I just want to run my code and have it work. To do that, I'm going to introduce you to the ultimate tool in every Ruby villain's toolbox for messing with syntax: `__END__`.
00:28:34
Here's how it works: You’ve got some arbitrary Ruby code, and then you've got `__END__`. Ruby will read and execute down to this `__END__`, then ignore everything after. You can have things that are not valid Ruby code thereafter; it's meant for data.
00:28:59
You’re supposed to put data there, but you don’t have to; you can just put more Ruby code—it won't get evaluated unless you `eval` it.
00:29:12
At the moment, this is just a convoluted way to run a few lines of Ruby code. But it gives you an opportunity to run a preprocessor on your code. For example, I can take the code after `__END__`, run it through a regular expression that finds and replaces anything that looks like an array, and converts it to `Array.new` syntax.
00:29:33
Heck, I can do the same thing for hashes and strings, too! This will convert each of these into explicit `.new` syntax, which will get picked up by `Object.new` and turned into one of those dRuby proxy objects.
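A deliberately naive sketch of the preprocessing step. In a real script the text after `__END__` would be read with `DATA.read`; here a plain string stands in for it so the example runs anywhere. The regular expression, the `preprocess` helper, and the `Array.new.push(...)` rewrite are assumptions of this sketch, and the pattern only copes with flat array literals (it would also mangle index expressions such as `list[0]`).

```ruby
# Naive pattern: a bracket pair with no nested brackets inside.
FLAT_ARRAY = /\[([^\[\]]*)\]/

# Rewrites each flat array literal into an explicit Array.new call,
# so that construction goes through `new` instead of literal syntax.
def preprocess(source)
  source.gsub(FLAT_ARRAY) { "Array.new.push(#{Regexp.last_match(1)})" }
end

raw = "numbers = [1, 2, 3]\nnumbers.sum"
rewritten = preprocess(raw)
puts rewritten

# Evaluate the transformed code, as the script would do with its DATA section.
result = eval(rewritten)
puts result
```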
00:29:52
Now, I did say that if I were to parse Ruby code using regular expressions, I would get beaten up by a gang of angry computer science professors, but the talk is almost over and I think I can make it to the airport before they find me. The rest of you are on your own.
00:30:31
Quickly, one last step: right now, each of these objects will become remote proxy objects, but are they really microservices? I mean, they're all remote, but they're all remote to the same place.
00:30:46
Our network diagram looks pretty lame. If we want our blog post about this cutting-edge architecture to trend on Hacker News, it needs to look something more like this.
00:31:02
As any distributed systems engineer will tell you, the more network hops you have in your system, the faster and more reliable it will be.
00:31:16
So, let's rig this up so that each remote object runs on a different remote server. Now we’re on track to have a trendy hipster architecture; let's go even further and run each of our objects in an AWS Lambda function.
00:31:26
For those who are familiar, AWS Lambdas are limited to 15 minutes of runtime at most, but we don’t have to worry about that: this architecture will crash well before then.
00:31:56
AWS Lambda is a cool service by Amazon Web Services. You give them a function definition and they will run it for you whenever you ask, whether it be one time or a thousand.
00:32:21
You don’t have to worry about EC2 boxes or Heroku dynos or any of that stuff, and for the privilege, they charge you a fraction of a cent per second your Lambda is running.
00:32:44
Now, the basic Lambda function definition looks like this: we start our dRuby service and then call `DRb.thread.join`, but this will block and never complete, as it just listens for incoming connections forever.
00:33:03
AWS will charge me for the full 15 minutes before it times out. A brief aside for those who really know their Lambdas—this isn’t possible. Lambdas don’t allow you to accept incoming connections.
00:33:26
Using the magic of Tailscale makes this possible, but that would be devops nonsense—and we’re not here for that; we're here for Ruby nonsense!
00:33:46
We're trying to create a remote proxy object in a Lambda. First, we need to start a Lambda on a random hostname, then we need to sleep three seconds for it to start.
00:34:01
Next, we need to construct the URI we want to connect to that Lambda. Then we create one of our `RemoteNewer` instances inside that Lambda.
00:34:26
Now we can call `RemoteNewer.make_new` to create a new proxy object pointing to this AWS Lambda. After some hacking around with dRuby's confusing URI management, we can return the proxy object we just created.
00:34:53
Now we have what we wanted—all objects remoted out to their own separate machines, each running in AWS Lambda. This will be blindingly fast because I had to add that three-second sleep just to make it run.
00:35:11
Let’s see how this system works; this is the unspeakable atrocity I promised you. We pre-processed our code, which is a cardinal sin of software, by converting all our array, hash, and string literals into explicit `.new` calls using regular expressions.
00:35:32
Then, we overrode the `Object.new` method and, to add insult to injury, we prevented infinite recursion by mucking around with the call stack.
00:35:54
Next, in one of our few reasonable moves, we used dRuby to create remote proxy objects. Because we’re incapable of doing anything simply, we read our own source code from disk, parsed it, transformed it, and sent it over the wire, just to handle block arguments.
00:36:19
Finally, in order to be maximally hipster, we shoved the remote half of this code into serverless AWS Lambda functions. Put it all together and we've got microservices!
00:37:06
Let's see how it works. This is a remote object now, and so is this and this and this—they're all microservices! Let’s see how it works in real life with some sample arbitrary Ruby code.
00:37:26
Surely our microservice architecture will make this blisteringly fast. OK, so it takes ten seconds to run—not so much.
00:37:45
But surely it's cost-effective to run. Here’s my Lambda bill after about a month of messing around with this. Okay, so it costs more than it has any right to.
00:38:01
But surely the elegance and simplicity of this setup would be worth it. This is all the code we’ve gone over in this talk.
00:38:17
So, it’s not so simple relative to normal code; it's slower, more expensive, and hideously complex. Well, folks, I say we've succeeded because that is the textbook definition of a microservice architecture.
00:38:40
Congratulations! We should all feel terrible about ourselves. How about we stick to building monoliths and agree that the real microservices were the friends we made along the way.
00:39:05
Now, I'd like to take a few minutes for questions and answers, but because I like the sound of my own voice entirely too much, I will be providing both.
00:39:26
The first question I always get when I present on this sort of nonsense is, should I use this in production? And oh, thank you!
00:39:30
Hopefully, by this point, the answer is clear: absolutely, please do! And tell me how it goes. That will be an amazing story!
00:40:07
If you have a problem with stability, performance, and maintainability, this architecture will solve that problem for you. On a more serious note, you may be wondering what some reasonable, non-ridiculous use cases for dRuby are.
00:40:48
I’d say dRuby is pretty great for internal projects, hack projects, small projects—anytime you want to quickly get a connection between two chunks of Ruby code running on different processes or even different machines.
00:41:11
That said, dRuby has no built-in security or authentication, so if you expose a DRb port to the public internet, you are asking for a remote code execution vulnerability.
00:41:35
Please at the very least wrap it in a VPN or something. Because some people just can't help but rubberneck at a car crash, I'm sure some of you are wondering where you can see this code in more detail.
00:41:57
You can find this particular code on my GitHub. It's the same code, just with a lot more detail, more comments, and hopefully a little bit easier to read—along with gaps filled in.
00:42:23
Now, at this point, I'm sure many of you are wondering who in God's name I am, how I got on stage, and how we can beef up RubyConf security to prevent this from happening again next year.
00:42:46
My name is Kevin Kuchta. You can find me at various places online. My Twitter handle is there, my personal website is there, and I've recently been spending more time on Mastodon, which you can see in the bottom right corner.
00:43:16
If you're tired of hearing me talk to myself and have a real question, feel free to hit me up after the talk. I'm always happy to chat, or you can catch me in town anywhere around the conference.
00:43:39
Finally, I'm sure at least a couple of you are wondering who would hire me and how you can avoid ever working for them. I assure you, I get this sort of nonsense out of my system in my spare time.
00:44:11
When I log on to work, I write nice, reliable, boring code that, as far as I know, none of my coworkers want to strangle me over. That said, I work for a company called Daybreak Health.
00:44:37
We are a fully remote mental health company providing therapy to teenagers. It is easily the most rewarding job I've ever had. We’ve literally saved lives during my time here.
00:44:59
I don't believe we have an open job position at the moment, but that will change. So next time you’re looking for a job, feel free to reach out. I’d be happy to chat about it and you can check us out.
00:45:12
I think we’re pretty cool.
00:45:17
And that’s the talk! I hope you enjoyed it. I hope you enjoy the rest of the conference. Thank you!