Unraveling the Cable: How ActionCable works

by Christopher Sexton

In the presentation "Unraveling the Cable: How ActionCable works" at RailsConf 2019, Christopher Sexton delves into the mechanics of ActionCable, a powerful feature in Ruby on Rails that facilitates real-time communication through WebSockets. He aims to demystify ActionCable's components—connections, channels, consumers, and clients—by providing a comprehensive overview of how they interact to enable functionalities like chat rooms and IoT device management.

Key Points Discussed:

- Introduction to ActionCable: Explanation of ActionCable as a framework for integrating WebSockets into Rails applications, emphasizing its ability to facilitate two-way communication between clients and servers without the need for complex external systems.

- Real-time Communication: Clarification of the term 'real-time', distinguishing it from strict definitions in computer science, and emphasizing ActionCable's capacity for low-latency responses to events.

- Connection Lifecycle: Detailed exploration of how a client connects to ActionCable and subscribes to channels. Christopher walks through the lifecycle of a WebSocket connection, using code snippets to illustrate the process of initializing a connection and handling subscriptions through client-side JavaScript.

- Deployment of Redis: The importance of Redis in managing pub/sub messaging to facilitate efficient communication between server components. Christopher discusses how it allows Rails applications to handle multiple event-driven scenarios seamlessly.

- Case Study: Chat Room Application: Christopher illustrates the practical applications of ActionCable by demonstrating a chat room example, where users can send and receive messages in real time. He highlights how ActionCable effectively manages message broadcasting through channels.

- Overhead and Efficiency: Discussion on bandwidth usage, emphasizing the advantages of WebSockets over traditional HTTP requests in terms of reducing overhead caused by HTTP headers, particularly in environments with tighter restrictions on bandwidth.

Conclusion: The session concludes by reiterating that understanding the underlying mechanics of ActionCable is critical for developers wishing to implement real-time features in their applications effectively. Learning to navigate and comprehend the Rails source code can alleviate the initial intimidation posed by complex frameworks and lead to more robust application development.

Overall, Christopher encourages participants not to shy away from digging into their tools and frameworks to unlock their full potential and improve their Rails applications.

The talk provides valuable insights not only into ActionCable but also into the broader philosophy of continuous learning and exploration in software development.

00:00:20.480 All right, I think we can get started right at 11:40.

00:00:25.609 Hey, I'm Christopher. I'm on Twitter as @crsexton.

00:00:30.630 I was told it's good to give you a little bit about myself to make a more personal connection with the audience, rather than just blasting everyone with technical information.

00:00:36.660 Even though I feel like that's what we're supposed to do—just throw a lot of technical information at you.

00:00:42.750 So, I live in DC, I have a wife and two kids, and I also have a small dog who will make a couple of appearances during this talk.

00:00:49.260 On Twitter, I run a service where I tweet pictures of her in the morning and pretty much post nothing else.

00:00:54.590 I work at a company called Radius Networks, and we create products centered around proximity and location, often using mobile devices and Bluetooth-enabled things.

00:01:05.850 If that interests you and you like IoT, low-level firmware, and Ruby, come and talk to me—I would love to chat!

00:01:11.250 Additionally, for the past couple of years, I've been helping out with a conference called Ruby for Good.

00:01:18.119 We have another one coming up soon in Virginia, from July 25th to 28th. You can check out rubyforgood.org if that's your sort of thing and you like helping charities and nonprofits.

00:01:23.399 Alright, now for the actual talk.

00:01:30.569 I gave this talk at the local DC meetup, the DC Ruby Users Group. One thing I failed to do at the beginning was explain what you should get out of this session and what Action Cable even is.

00:01:42.599 I'm not sure how familiar everyone is with Action Cable or WebSockets, but we're going to step back a little bit and look at it broadly, then dig in deeply.

00:01:55.610 Hopefully, by the end, we’ll come back around to how everything fits together with a clearer understanding.

00:02:03.629 The important thing to know about this is that it's not magic. There are tricks to it, and once you learn those tricks, you can dig in and understand how it works.

00:02:16.770 I remember when I was a kid, I took apart my dad's computer. He was thrilled—it was an old 8086 with those 7400-series chips.

00:02:22.890 I remember prying them out of the socket and thinking I'd never understand what they did. I later got a computer engineering degree and learned that those chips were just NAND gates.

00:02:38.370 Once you understand a NAND gate, you realize it isn't that complicated. I thought, 'Wait a second, how would I do this in Ruby?' It’s not that hard—NAND is just 'not and,' so I simply wrote a method that returned it.

00:02:50.400 This is essentially what those chips do; there are a whole bunch of them on the side of the board.
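
The slide is not reproduced in this transcript. A minimal sketch of the kind of method described, with `nand` as an assumed name, could look like this:

```ruby
# NAND is "not and": it is false only when both inputs are true.
def nand(a, b)
  !(a && b)
end

# Print the full truth table.
[true, false].product([true, false]).each do |a, b|
  puts "#{a} NAND #{b} = #{nand(a, b)}"
end
```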

00:03:02.970 So digging in and looking under the covers to figure out how things actually work seems more intimidating than it is. If this contrived example helps anyone, it's to give you the courage to dig into the source code of Rails or any Ruby gem.

00:03:20.520 It takes a while to trudge through, but once you understand the basics, things start to click into place, and you'll realize that actual people wrote this code—we're all capable of understanding it.

00:03:44.760 Earlier, I mentioned that we skipped over the background of what Action Cable is.

00:03:49.770 At work, we use Action Cable for various applications, such as digital displays or IoT devices getting configuration and sensor data in and out of different devices.

00:04:02.970 But what we often do is create chatrooms with it.

00:04:07.980 So, when I prepared for this talk, I did what I was supposed to do and made a chatroom. Here's the quintessential example of what Action Cable is and does.

00:04:21.550 I opened two browsers connected to my local Rails server. I typed in a message, and as soon as I hit enter, the message popped up.

00:04:26.880 It works even if I switch to the other window and type a message, and it just appears. This is what we're aiming to achieve.

00:04:39.550 So how did we get this to work? I didn't want to get into anything fancy or the high-level logic. Instead, let's figure out what is going on under the covers, and before we do that, let's understand why we care about how this works.

00:05:01.770 One of the main reasons is that we can push updates from server to client. Whenever an event occurs on the back end, such as someone typing a message in a browser window, the server can push that event down to other clients.

00:05:19.840 I mentioned that I was an undergrad in computer engineering, and when I hear the term 'real-time,' it makes me uneasy because there's a lot of connotation that comes with it.

00:05:39.550 There is a thing called real-time computing that has strict constraints, and that’s not what Action Cable or Rails or Ruby are about.

00:05:52.450 Even the operating systems we run these things on may not follow strict real-time constraints. So whenever our sales team at Radius starts saying 'real-time' is instantaneous, I always caveat this.

00:06:06.610 But what Action Cable does allow us to do is achieve quick low-latency response times, which are faster than polling every three seconds.

00:06:20.500 The other factor is overhead. By overhead, I mostly mean bandwidth consumption. A WebSocket connection is much more efficient in the bandwidth it uses.

00:06:38.920 We used this with IoT devices that send lots of sensor data, which we ran over 3G modems. We realized we were spending a lot on bandwidth, and it turned out that about 80% of our overhead was due to HTTP headers.

00:06:52.600 The JSON body of our messages comprised the remaining 20%. Switching to a WebSocket saved us a significant amount of money.

00:07:04.790 Alright, let's move on to how it all works. We’re going to go from the nuts and bolts to the higher-level business components.

00:07:20.500 We’ll show a lot of interrelated snippets, but I'll always come back to one basic overview. I know when you look at a diagram like this, it can be a little anxiety-inducing.

00:07:40.640 A total sidenote—turns out that dogs yawn when they’re anxious and uneasy. This was Hopper's reaction to seeing this diagram.

00:07:44.610 Alright, let’s look at our lifecycle. I’ll run through this with some code on the screen.

00:07:53.320 This is more of a reference to help understand what we’re looking at. I like to place concrete code next to the steps so we can see what’s happening as we go through.

00:08:12.890 So the client connects to Action Cable. Cable acts as a generic abstraction for the WebSocket—the thing that handles sending information back and forth.

00:08:29.410 On the client side, there’s a very simple way to pull in this functionality and create a consumer to handle this connection.

00:08:46.510 The server confirms this connection, ensuring the right person subscribes to the proper channels.

00:09:01.180 The client subscribes to specific channels, which come over this connection, declaring the things it is interested in.

00:09:14.820 The client can then send messages up to the server over this channel, and the server can send messages back down.

00:09:29.780 As part of this, I spent a long time reading and rereading the Action Cable guide. It became clear that Action Cable handles many things, much like Rails handles standard HTTP requests.

00:09:47.350 This includes placing things into controllers, views, and routes. However, there is a lot of terminology and jargon that can be difficult to digest.

00:10:05.930 I wanted to acknowledge that it's there, and we’ll work on defining some of these terms, but they're not particularly important for understanding how the pieces fit together.

00:10:12.360 Let's return to the client connecting to Action Cable. We haven't quite grasped what Action Cable is yet, so we’ll look into that.

00:10:23.670 So, how does a connection open? Often the client is JavaScript running in a browser. It doesn’t have to be, but that’s the default for the chat app we started with.

00:10:42.550 I looked into Action Cable, dug through the code, and found that it basically calls the WebSocket function that's native in the browser.

00:10:51.440 However, there’s some magic involved, so I opened my trusty dev tools because we web developers like to do that.

00:11:05.570 I noticed there was an HTTP request made, which had headers and responses—it looked standard, but it wasn’t.

00:11:21.390 It was WebSocket, and as we tend to do on the internet, I went and read the RFC to understand it better, but ended up confused.

00:11:39.550 So let’s step back to understand how a WebSocket works before we dive deeper into this.

00:11:58.730 I want to look at a fundamental, atomic level and treat TCP connections as atomic—meaning we can't split them up.

00:12:08.470 When the client connects to a server, it establishes a TCP connection. The client then makes a request, and the server responds.

00:12:21.630 In the old days of HTTP/1.0, the connection would just close after the response was sent. But it's important to understand that writing to a socket is like writing to a file.

00:12:39.030 When we send a request, we write some bytes to the socket, just as if we were writing JSON or markdown into a text file on our computer.

00:12:54.550 Similarly, the server writes bytes back to us, and through the magic of TCP, those bytes are delivered back to the client.

00:13:06.930 So that’s how HTTP works at an abstract level. How would you do this in Ruby?

00:13:21.210 It really takes only about a four-line Ruby script to do this, plus the actual request that we’re going to send, which includes a few HTTP headers.

00:13:38.740 The hardest part was figuring out the goofy line endings we had to include for legacy reasons, making things a bit ugly.

00:13:54.310 I pointed this code to my local Rails server, watched the logs, and, once I sorted out the bugs, I saw '200 OK.' This meant the Rails server thought it worked.

00:14:07.300 I had a line to read the response, so I ran `socket.read`, and printed out the reply, which was just a plain text file with a header, an empty line, and then the HTML body.
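
The script from the slides is not part of this transcript. The sketch below shows the same idea under a few assumptions: a tiny stand-in server in a thread replaces the local Rails server so the example runs on its own, and the request headers are the bare minimum. Note the `\r\n` line endings and the empty line that ends the headers.

```ruby
require "socket"

# Stand-in for the local Rails server: accept one connection, read the
# request headers, and send back a fixed reply.
server = TCPServer.new("127.0.0.1", 0) # port 0 lets the OS pick a free port
port = server.addr[1]
server_thread = Thread.new do
  client = server.accept
  while (line = client.gets)
    break if line == "\r\n" # an empty line marks the end of the headers
  end
  body = "<h1>Hello</h1>\n"
  headers = [
    "HTTP/1.1 200 OK",
    "Content-Type: text/html",
    "Content-Length: #{body.bytesize}",
    "Connection: close"
  ]
  client.write(headers.join("\r\n") + "\r\n\r\n" + body)
  client.close
end

# The client: open a TCP socket, write the request bytes, read the reply.
request = ["GET / HTTP/1.1", "Host: localhost", "Connection: close"].join("\r\n") + "\r\n\r\n"
socket = TCPSocket.new("127.0.0.1", port)
socket.write(request)
response = socket.read # reads until the server closes the connection
socket.close
server_thread.join
server.close

puts response
```

Pointing the client half at a real server only requires changing the host and port passed to `TCPSocket.new`.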

00:14:22.670 That’s kind of the basics of how HTTP works. Now, how do we upgrade to a socket connection?

00:14:35.390 Let’s revisit our diagram. Earlier, we talked about the GET response, but now I want to focus on what's happening on the client as we upgrade.

00:14:52.810 Before, we had a response of '200 OK.' For WebSockets, we need it to return a '101 Switching Protocols' response.

00:15:04.200 Once we get that response, the same connection that carried the GET request is magically converted into a bi-directional socket.

00:15:23.127 I thought this sounded like some magic, so let's write the Ruby code to make it happen.

00:15:32.800 We would use the same Ruby script, but send a different and more complicated body to the server, which includes headers requesting to upgrade to WebSocket.

00:15:52.490 After a bit of debugging, I pointed this at my local Rails server, and got a successful upgrade to WebSocket.

00:16:08.640 I also received a confirmation of a successful disconnection, but I omitted that part. When I checked what came out of `socket.read`, the plain text I got back was exactly what I needed.
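
The upgrade request itself is not shown in this transcript. As a rough sketch based on RFC 6455, the request carries `Upgrade` and `Connection` headers plus a random `Sec-WebSocket-Key`, and the server's 101 response echoes a hash of that key in `Sec-WebSocket-Accept`. The `/cable` path and `localhost:3000` host are assumptions about a default local setup, and a real Action Cable server may also check headers such as `Origin`.

```ruby
require "digest/sha1"
require "securerandom"

# Fixed GUID defined by RFC 6455 for computing Sec-WebSocket-Accept.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"

# Build the bytes of the upgrade request. Path and host are assumptions.
def upgrade_request(key, path: "/cable", host: "localhost:3000")
  [
    "GET #{path} HTTP/1.1",
    "Host: #{host}",
    "Upgrade: websocket",
    "Connection: Upgrade",
    "Sec-WebSocket-Key: #{key}",
    "Sec-WebSocket-Version: 13"
  ].join("\r\n") + "\r\n\r\n"
end

# The value the server should send back in Sec-WebSocket-Accept:
# base64(SHA-1(key + GUID)).
def expected_accept(key)
  [Digest::SHA1.digest(key + WEBSOCKET_GUID)].pack("m0")
end

key = SecureRandom.base64(16) # 16 random bytes, base64-encoded
puts upgrade_request(key)
puts "Expect: Sec-WebSocket-Accept: #{expected_accept(key)}"
```

Writing the output of `upgrade_request` to a `TCPSocket` in place of a plain GET is what turns the earlier script into a WebSocket handshake.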

00:16:23.480 So we’ve been able to upgrade our connection to a socket. This connection remains open until the client says we're done.

00:16:39.870 Now, let’s look at what the server is doing. We’ve gone over the fundamental basics of how a client sends an HTTP request, so let’s quickly look at how the server handles that.

00:16:56.540 It uses Rack, which most things in the Ruby world use to handle HTTP. Rack is great because it provides an environment that passes in all the context.

00:17:12.440 This includes things that the web server knows and understands, such as authentication and other environment settings.

00:17:29.610 Then, the application replies through Rack with an appropriate status code (200 in this case), along with some headers and the body of the response.
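
To make the Rack contract concrete, here is a minimal sketch that needs no gems: an application is any object that responds to `call`, takes the environment hash, and returns a status, headers, and body. The environment below is hand-built with only the keys the sketch reads, so it is an illustration rather than a complete Rack environment.

```ruby
# A Rack application: takes the environment, returns [status, headers, body].
app = lambda do |env|
  text = "Hello from #{env["PATH_INFO"]}\n"
  [200, { "content-type" => "text/plain" }, [text]]
end

# A real web server supplies many more keys than these two.
env = { "REQUEST_METHOD" => "GET", "PATH_INFO" => "/rooms" }

status, headers, body = app.call(env)
puts status
puts headers.inspect
body.each { |chunk| puts chunk }
```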

00:17:45.590 Let’s look at how this process actually works with a typical Rails app, where these different layers are stacked together, functioning as middleware.

00:17:56.700 The request comes in from the web server, making a call to the next layer down, passing the environment, until eventually, it responds back up the chain.
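
The stacking can be sketched in plain Ruby. Each layer holds a reference to the next one, calls down with the environment, and sees the response on its way back up. The `Trace` class and the `"trace"` key are hypothetical names used only to make the order of calls visible.

```ruby
# A middleware wraps the next layer in the stack.
class Trace
  def initialize(app, name)
    @app = app
    @name = name
  end

  def call(env)
    env["trace"] << "#{@name} in"   # on the way down
    response = @app.call(env)
    env["trace"] << "#{@name} out"  # on the way back up
    response
  end
end

# The innermost application at the bottom of the stack.
app = lambda do |env|
  env["trace"] << "app"
  [200, { "content-type" => "text/plain" }, ["ok\n"]]
end

stack = Trace.new(Trace.new(app, "inner"), "outer")
env = { "trace" => [] }
status, _headers, _body = stack.call(env)

puts status
puts env["trace"].join(" -> ")
# prints "outer in -> inner in -> app -> inner out -> outer out"
```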

00:18:13.400 This request-response cycle is clear, but it doesn’t inherently allow for an upgrade connection.

00:18:30.000 To facilitate this, clever developers came up with a hack called Rack Hijack.

00:18:41.500 When a request comes in and the environment includes `rack.hijack`, the application can call it to take over the underlying socket, bypassing the traditional request-response flow.

00:19:00.210 Let me show you how this works. In the Action Cable codebase, there's a test that illustrates all of this.

00:19:12.780 There's a lot going on, but we need the socket pair to return an IO object, which we can then set to `rack.hijack` so we can connect the sockets.
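
The Action Cable test itself is not reproduced here. The following is a simplified sketch of the same idea, not the actual test: a socket pair stands in for the real connection, one end is handed out through `rack.hijack`, and the application writes to it directly instead of returning a body.

```ruby
require "socket"

# A connected pair of sockets: one end plays the server's side of the
# connection, the other plays the remote client.
server_io, client_io = UNIXSocket.pair

# The web server exposes the underlying socket through env["rack.hijack"].
env = { "rack.hijack" => -> { server_io } }

# An application that takes over the socket instead of returning a body.
app = lambda do |request_env|
  io = request_env["rack.hijack"].call
  io.write("hello over the hijacked socket\n")
  io.close
  [-1, {}, []] # placeholder triple; the socket now belongs to the app
end

app.call(env)
received = client_io.gets
puts received
```

Once the application holds the IO object, it can keep reading and writing for as long as it likes, which is what a long-lived WebSocket needs.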

00:19:28.960 Another important topic is cookies, since they are relevant to our discussion. Action Cable connections do not have the same context as typical HTTP requests.

00:19:44.760 However, it does have headers, and cookies are essentially HTTP headers. We can see the cookies in the browser, as well as in the response printed by our Ruby script.

00:19:59.550 This is important because we need to pass information into the connections and identify users. In our chat room, we want to ensure the right people have the right handles.

00:20:12.780 We can do this by passing a cookie in, and luckily, the HTTP headers are available while establishing the connection.
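
Because the cookie arrives as an ordinary request header, reading it during the handshake is simple. A minimal sketch, assuming a Rack-style environment where the header appears as `HTTP_COOKIE`; the cookie names and values are made up for illustration, and in a Rails app the connection class would normally use the framework's own cookie helpers instead.

```ruby
# Split the Cookie request header into a name => value hash.
def parse_cookies(env)
  env.fetch("HTTP_COOKIE", "").split(/;\s*/).each_with_object({}) do |pair, cookies|
    name, value = pair.split("=", 2)
    cookies[name] = value.to_s unless name.nil? || name.empty?
  end
end

# Hypothetical cookie names and values.
env = { "HTTP_COOKIE" => "handle=hopper; theme=dark" }
cookies = parse_cookies(env)
puts cookies["handle"]
```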

00:20:26.560 Alright, the client subscribes to a channel. This involves a subscription handshake.

00:20:36.770 This is getting more into the business logic rather than the fundamental nuts and bolts, but it’s crucial to our overview.

00:20:48.580 When we establish the connection, which we've covered in detail, the server confirms and the client replies, saying it wants to subscribe to a channel called `room_channel`.

00:21:01.450 The server acknowledges the subscription, confirming that the client is subscribed correctly.

00:21:15.440 I looked in the browser and found that you can click on the WebSocket connection, in this case called 'cable,' to see all the happenings.

00:21:31.620 You can observe the Welcome event being sent down from the server, the client requesting to subscribe to rooms, and the server confirming this.

00:21:46.720 Once subscribed, the server actively pings to ensure the connection remains alive.
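
The frames visible in dev tools are small JSON messages. The sketch below reconstructs the sequence described above, assuming the shapes used by Action Cable's JSON protocol; `RoomChannel` is an assumed class name for the room channel in the talk.

```ruby
require "json"

# The channel is identified by a JSON string nested inside the JSON message.
identifier = JSON.generate({ channel: "RoomChannel" })

messages = [
  [:server, { type: "welcome" }],
  [:client, { command: "subscribe", identifier: identifier }],
  [:server, { identifier: identifier, type: "confirm_subscription" }],
  [:server, { type: "ping", message: Time.now.to_i }]
]

# "<-" marks frames sent down by the server, "->" frames sent up by the client.
messages.each do |sender, payload|
  arrow = sender == :server ? "<-" : "->"
  puts "#{arrow} #{JSON.generate(payload)}"
end
```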

00:22:02.340 Subscribing to a channel means we need to receive updates for that channel, which necessitates a way to publish and subscribe to changes.

00:22:17.690 Rails handles this by using Redis for publish/subscribe, enabling real-time updates.

00:22:35.430 Let’s return to our overview and see how the publish-subscribe cycle works. Events can be centrally handled, and it will notify everyone that cares about those events.

00:22:48.530 To explore how this works, I turned to the Redis command line to experiment with it.

00:23:05.080 I opened two terminal windows: one for the server and one for the client, both running on my laptop.

00:23:18.560 On the server side, I typed `subscribe channel1`, and it replied that it was listening, blocking there.

00:23:32.660 In the client terminal, I published to channel1 with the message 'hi.' As soon as I hit enter, the server terminal updated to show it received the message.

00:23:44.560 I could do it repeatedly, and every time it happened quickly, which demonstrated how Redis works seamlessly.

00:23:56.870 But being at RailsConf, I wanted to know how to implement this in Ruby, so I wrote two scripts.

00:24:12.710 The first script was for the server, which would listen and block, and the second for the client, which would publish.

00:24:28.450 The client's publishing simply runs once, sends the message from the command line, and exits. I also aimed to make the output similar to the Redis CLI.

00:24:44.780 We start the server, subscribing to a channel, which hangs there waiting.

00:25:06.770 Then, I head over to the client and publish messages like 'hi.' The server immediately updates, confirming it received the message.

00:25:21.820 This messaging model illustrates how Redis packages up the pub/sub problem so that the web server doesn't have to solve it itself.
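
The two scripts from the slides talk to a running Redis server and are not reproduced in this transcript. To show the shape of the model without that dependency, here is an in-process stand-in. It is not the Redis client API; `TinyPubSub` and its methods are hypothetical names that only mirror the subscribe-then-block and publish-once behavior described above.

```ruby
# In-process stand-in for a pub/sub broker (not Redis).
class TinyPubSub
  def initialize
    @subscribers = Hash.new { |hash, channel| hash[channel] = [] }
    @lock = Mutex.new
  end

  # Returns a queue that receives every message published to the channel.
  def subscribe(channel)
    queue = Queue.new
    @lock.synchronize { @subscribers[channel] << queue }
    queue
  end

  # Delivers the message to all current subscribers; returns how many got it.
  def publish(channel, message)
    queues = @lock.synchronize { @subscribers[channel].dup }
    queues.each { |queue| queue << message }
    queues.size
  end
end

broker = TinyPubSub.new
inbox = broker.subscribe("channel1")

# The "server" blocks waiting for a message, like the subscribed terminal.
listener = Thread.new { inbox.pop }

# The "client" publishes once.
receivers = broker.publish("channel1", "hi")
received = listener.value
puts "delivered to #{receivers} subscriber(s): #{received}"
```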

00:25:37.540 It enables us to build infrastructure in Ruby that can manage notifications to the involved parties.

00:25:51.990 Alright, the client sends messages to the server, so let’s get back to our overview.

00:26:07.000 We want to understand how things are sent from the client to the server via the socket.

00:26:20.600 To do that, we'll examine the format of how WebSockets construct their payloads.

00:26:34.490 After researching this and consulting various blog posts, I broke down what I found into simpler terms.

00:26:50.470 At its core, the structure of a WebSocket message includes an opcode and length of the data.

00:27:04.020 The client can also send a mask, which isn’t incredibly important, but what's intriguing is that the overhead per message is about 2 to 10 bytes.

00:27:19.990 This is extremely lightweight compared to the HTTP headers we previously had.
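
The 2 to 10 byte figure follows from the frame layout in RFC 6455: one byte for flags and opcode, one byte for a short length, and an extra 2 or 8 bytes when the payload is larger. A rough sketch of encoding an unmasked text frame, as a server would send it, is below; frames sent by a client also set the mask bit and carry a 4-byte masking key, which this sketch leaves out. `text_frame` is a name chosen for illustration.

```ruby
# Encode a single unmasked text frame.
def text_frame(payload)
  bytes = payload.b
  first_byte = 0x80 | 0x1 # FIN bit set, opcode 0x1 = text
  header =
    if bytes.bytesize < 126
      [first_byte, bytes.bytesize].pack("CC")        # 2-byte header
    elsif bytes.bytesize < 65_536
      [first_byte, 126, bytes.bytesize].pack("CCn")  # 4-byte header
    else
      [first_byte, 127, bytes.bytesize].pack("CCQ>") # 10-byte header
    end
  header + bytes
end

frame = text_frame("hi")
puts frame.unpack("C*").inspect
puts "overhead: #{frame.bytesize - "hi".bytesize} bytes"
```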

00:27:38.540 I should note that the diagram I’m referring to is not to scale.

00:27:54.840 In the browser's dev tools, we can see these messages being exchanged, and observe the payload values in the WebSocket message construction.

00:28:09.680 You can track the timestamps, sizes, and contents of the message body.

00:28:25.910 Now let’s focus on the server sending things down. This involves the interaction between the cable server and Redis.

00:28:41.700 The pub/sub model doesn’t function on its own. It needs triggers, such as background jobs finishing or messages coming in.

00:28:59.500 To see this in Rails code, we start with Action Cable on the backend and can broadcast to a channel, sending parameters.

00:29:15.750 In my example, I sent parameters like 'message' which said 'hi' and an additional parameter for context.

00:29:30.200 This gets converted to JSON, which includes an identifier along with the actual message body.

00:29:46.080 The JSON is delivered to the client, and then the Rails JavaScript code parses it into a JavaScript object for further processing.

00:30:04.220 Overall, the JS function will handle the data, updating the client interface by reflecting the new message.
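
The envelope that travels down to the client can be sketched in plain Ruby. In a Rails app the broadcast would be a call along the lines of `ActionCable.server.broadcast`, but the stream name, the `RoomChannel` class name, and the extra `sent_by` parameter below are all assumptions made for illustration; this sketch only builds and parses the JSON.

```ruby
require "json"

# Build the JSON a subscribed client would receive: an identifier naming
# the channel, plus the broadcast data under "message".
def envelope(channel_class, data)
  JSON.generate({
    identifier: JSON.generate({ channel: channel_class }),
    message: data
  })
end

wire = envelope("RoomChannel", { message: "hi", sent_by: "hopper" })
puts wire

# The receiving side parses the envelope and hands the inner message
# to whatever handler is registered for that subscription.
received = JSON.parse(wire)
puts received["message"]["message"]
```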

00:30:20.140 Now, let's recap everything we went over; that was quite a bit!

00:30:37.600 So, simplifying our diagram a little bit, let’s review the steps.

00:30:48.490 We connect to Action Cable, the client talks to the cable server, which checks the connection.

00:31:00.340 The client subscribes to a channel, and then we progress to the pub/sub component.

00:31:10.470 Once everything is in place, the client can send messages to the server which can broadcast to the relevant channels.

00:31:20.950 This process is what carries messages across connections, allowing efficient interaction within the Rails app.

00:31:32.180 Alright, that’s all for my talk.