RailsConf 2019

The Action Cable Symphony - An Illustrated Musical Adventure

by Brandon Weaver

In 'The Action Cable Symphony - An Illustrated Musical Adventure,' Brandon Weaver presents a creative and innovative exploration of Action Cable and WebSockets through a symphonic performance involving audience participation. This session, held at RailsConf 2019, seeks to demystify how Action Cable functions by using the metaphor of conducting an orchestra, presented with engaging visuals and a musical composition.

Key points covered in the talk include:
- Introduction to Action Cable: Weaver illustrates the fundamental concept of Action Cable in the context of Rails, explaining how it can be utilized beyond simple chat applications.
- Audience Engagement with Technology: He engages audience members by connecting their phones to play music, showcasing the interactivity made possible through WebSockets.
- Music and MIDI File Processing: Weaver explains how MIDI files are converted to JSON for easier client-side processing, highlighting the complexities involved in handling musical data, such as time signatures and control changes.
- Construction of a Conductor Channel: The conductor's channel is discussed, laying out how commands are sent and how each instrument (or audience member's phone) receives its specific track to create a cohesive sound.
- Latency Challenges: A significant portion of the talk focuses on latency issues in distributed systems, which can lead to desynchronization in a symphonic performance. Weaver introduces time synchronization techniques and how they are implemented to manage latency.
- Security Considerations: The use of JSON Web Tokens (JWT) for secure communication is examined, discussing their advantages and drawbacks in real-time applications.
- Conclusion and Reflections: The talk wraps up with reflections on the joy of programming within the Ruby community, emphasizing collaboration and the importance of community support in technical endeavors.

Ultimately, the experience includes not just a technical walkthrough of Action Cable but also an engaging and whimsical performance that underlines the beauty of both programming and music, promoting a sense of community and creativity among developers.

00:00:21.830 Our story today starts with Red the lemur and his master, Scarlet. "Why hasn't the symphony started yet? The orchestra is all here! Topaz is ready to conduct, isn't she?"
00:00:34.530 "Why yes, Red, I believe she is!" "Then why aren't we playing?"
00:00:41.820 "Because we have a new conductor joining us today." "Where? I don't see them."
00:00:49.470 "They're out there." "Oh, you mean out there? There are some very interesting looking lemurs. Do you think they can play instruments too?"
00:00:59.550 "I'm sure they'll do magnificently! Oh, I think they're about ready."
00:01:06.869 Hello everyone! So it looks like we're about ready to start here. If you haven't already, let's go ahead and join in at Symphony.dev.
00:01:14.700 "Symphony" with a Y was taken, but I kind of like this better. I'm surprised I got that one, honestly, but I'm not going to complain; I like it. It's actually better. So we'll go ahead and get you all there. Give that a second.
00:01:28.890 Oh right, I should turn on the volume on here; otherwise, that's not actually going to do anything, is it? Okay, then let's go ahead and have our fun real quick. We'll load this up and then we're going to hope this works.
00:01:46.880 I have a reasonable amount of confidence because I’ve cheated. I'm not going to say how, but I have cheated. Oh no, no, no, come on! Conference Wi-Fi is fun.
00:02:10.060 We will endeavor, though, so a little bit about what this talk is while this is going on. I submitted this crazy idea for RailsConf because I had this dream of conducting a symphony orchestra.
00:02:38.709 This isn't exactly what I had in mind, but it does work, so we're going to see if we can manage to make that happen. Now it looks like we're about halfway loaded.
00:02:55.120 Come on! Which is why we have this today. I mean, worst case, I'm just going to switch over to localhost and make it from there.
00:03:06.569 I don't want to, but we will see. Come on.
00:03:16.120 Hmm, you see, this is why you're very cautious about doing things through Wi-Fi, which is why we're going to switch over to localhost. Well, I have 109 of you there already.
00:03:46.460 I'm not sure about the rest of them. Come on, you can do it! What's latency anyway? Wow, that's impressive! Someone's like two weeks behind. How do you do that?
00:04:09.460 It's probably Android. Never mind, I'm using Android, which probably explains a few other things. Okay, so we're going to go ahead and cheat and go to the local version of this.
00:04:30.090 Oh yes, you can see exactly what's going on over there. You can also see it's stuck loading, so unfortunately, I'm going to have to demo this with smaller audiences later. But to give you the experience anyway, we're going to go ahead and, let's see here, run this.
00:05:56.150 So it does work locally; I promise that much. This is why we have contingency plans, ladies and gentlemen.
00:06:06.180 Of course, because my phone locked, now I can't play it. Oh, you're no fun! Okay, well, that’s not going to work because my phone locked. Now it doesn't know about the ready state. That's a problem with WebSockets.
00:06:24.690 If your phone happens to lock, bad things happen. Oh well, we tried, we failed, we endeavor, we move on. So welcome to the Action Cable Symphony!
00:06:44.490 So who am I that's up in front of you in a Beethoven-looking wig and a tuxedo? Well, first of all, it sounded fun, so I did it.
00:07:03.420 Second of all, I used to be an artist and musician, and I ended up becoming a programmer through a series of very unfortunate accidents.
00:07:30.090 It started all with someone saying I should try web development. Web development is a lot of fun; it’s going to be great! Okay, so I get in some HTML and some CSS; it's working decently okay.
00:08:07.200 And then they said you should add some JavaScript there. Okay, I'm enjoying this. This is really quite entertaining.
00:08:26.810 And then someone said the horrific words I should have said no to: how about a back end? And that's how I became a developer, an operations person, and several other things. Currently I'm managing Ruby architecture across the company and defining standards.
00:09:02.910 Now, what exactly was that that you just witnessed? It was a symphony played with an entire audience on my computer. Okay, I tried; you'll have to forgive me for that. I'll show you all later; I promise it works.
00:09:50.640 It's using Rails and Action Cable with various kludges and hackery to make it work. More specifically, we were doing Beethoven's Sixth, the Pastoral Symphony. It's a cheery little tune I really enjoy.
00:10:07.180 The fun thing is this song's actual name is Awakening of Cheerful Feelings on Arrival in the Countryside, and I thought that was a really beautiful thing.
00:10:43.450 That's what we're going to do today: we can bring some cheerful feelings into this brand new world of WebSockets with client latency.
00:10:57.640 The more pertinent question here is how? How in the world does something like this work? As it turns out, creating a symphony on smartphones is a really hard task.
00:11:27.360 So let's start with a bit of an overview. What exactly are we going to be covering here today? We're going to start by looking at what it takes on the server to make something like this work.
00:12:07.440 How do you actually get clients to mostly behave themselves, preferably? And how do we secure this thing, which is definitely an interesting task? Then latency, my personal favorite,
00:12:31.350 as we just saw, and a finale to finish it up. So now for the first major component, we're going to look at the Rails server. We know it's using Action Cable, but how does that work to make music?
00:12:49.080 I mean, Action Cable is just for chat applications, right? We start with something called a MIDI file.
00:13:05.240 For those not familiar, a MIDI file is kind of an old-school input/output file format that allows us to play music with voices, soundfonts, and everything else.
00:13:34.580 What we do is convert this to JSON, something we can actually parse, because binary files are not very friendly to the front end. This gives us the ability to get the tracks from the MIDI and treat them each as their own separate entity.
00:14:53.090 But there's a lot more complexity there, like time signatures, control changes, voices, and other things that are very conveniently not a problem in either Beethoven's Sixth or Beethoven's Ninth Symphony.
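As a rough illustration of the track-splitting step described above, here is a minimal Ruby sketch. It assumes the MIDI file has already been decoded into a flat list of note events; the field names (`track`, `note`, `time`, `duration`) and the sample data are hypothetical and are not the schema used in the talk.

```ruby
require "json"

# Hypothetical decoded note events; a real MIDI parser would produce these.
EVENTS = [
  { track: "violin", note: "E5", time: 0.0, duration: 0.5 },
  { track: "viola",  note: "C4", time: 0.0, duration: 1.0 },
  { track: "violin", note: "F5", time: 0.5, duration: 0.5 }
].freeze

# Group events by track so each instrument becomes its own entity,
# with its notes ordered by start time.
def tracks_from(events)
  events.group_by { |event| event[:track] }.map do |name, notes|
    {
      name: name,
      notes: notes.sort_by { |n| n[:time] }.map { |n| n.slice(:note, :time, :duration) }
    }
  end
end

# Serialize to JSON so a browser client can parse it without touching binary data.
TRACKS_JSON = JSON.generate(tracks_from(EVENTS))
puts TRACKS_JSON
```

Time signatures, control changes, and voices are deliberately left out here, mirroring the simplification mentioned in the talk.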
00:15:22.760 Overall, it would look something like this: a flow of data from the conductor all the way down to the individual lemurs that happen to be playing.
00:15:57.200 Now how does that work? If we take a look at our conductor here, it brings us to our first interesting part of Action Cable, which is that there's no hard requirement on only using one cable.
00:16:31.420 We start with the conductor cable, which allows us to have an administrative interface for sending and receiving high-level commands like stop, play, go, and buffer music. Just like a symphony conductor!
00:17:12.630 Our conductor channel is in charge of the entire show. We break those MIDI files into separate tracks and create a channel for each instrument, so you can listen to only the part that makes sense for you.
00:17:59.640 In this case, we might have a channel for French horns, violas, and violins. Each note in the track is broadcast over the channel, which goes to the associated MIDI channel and eventually to the instrument on your phone.
00:18:34.850 Our clients only listen for the parts they need, so you don't get the entire symphony; you just get that one part.
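The routing idea (one stream per instrument, with each client hearing only its own part) can be sketched in plain Ruby. This is an in-memory stand-in, not Action Cable itself; in a Rails app the same roles are roughly played by `stream_from` inside a channel and `ActionCable.server.broadcast`. The `Broadcaster` class, the `stream_for` helper, and the stream naming are all hypothetical.

```ruby
# A tiny in-memory stand-in for per-instrument broadcasting (not Action Cable itself).
class Broadcaster
  def initialize
    @subscribers = Hash.new { |hash, key| hash[key] = [] }
  end

  # A client subscribes to exactly one instrument stream.
  def subscribe(stream, &handler)
    @subscribers[stream] << handler
  end

  # Deliver a message only to the clients listening on that stream.
  def broadcast(stream, message)
    @subscribers[stream].each { |handler| handler.call(message) }
  end
end

# Hypothetical naming scheme for instrument streams.
def stream_for(instrument)
  "instrument:#{instrument}"
end

broadcaster = Broadcaster.new
violin_phone = []
horn_phone = []
broadcaster.subscribe(stream_for("violin")) { |note| violin_phone << note }
broadcaster.subscribe(stream_for("french_horn")) { |note| horn_phone << note }

# Each note goes out on its own instrument's stream only.
broadcaster.broadcast(stream_for("violin"), { note: "E5", time: 0.0 })
broadcaster.broadcast(stream_for("french_horn"), { note: "F3", time: 0.0 })

puts "violin phone received #{violin_phone.size} note(s)"
puts "horn phone received #{horn_phone.size} note(s)"
```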
00:19:15.200 Then we have our last piece, the players themselves. When you're connecting, you don't know how many instruments you're going to get. It looks like right now about 46 to 100.
00:19:43.090 There could be any number: oh, maybe 10, 100, or, have mercy on my Heroku bill, please, about a thousand.
00:20:08.030 Each of these players is a distinct person, so we can keep communications directly back and forth to the clients. But let's take a look at those players.
00:20:38.169 Whenever you first connect, you're not sure who you are yet because you don't know what song you're playing; you don't know what's loading. That's the job of the conductor to tell people which instruments they should be playing.
00:21:05.150 But they can only do that once they know again what song we are actually playing. The assignment uses a super advanced Ruby algorithm to determine the ideal placement for each player.
00:21:47.830 It took weeks to perfect and honestly is my proudest piece of coding I've ever done. Very delicately calibrated!
00:22:15.179 So, anyways, once that command processes, we have instruments on phones.
00:22:32.670 Now that we know that, the tracks can start collecting a buffer of information on notes from their relevant MIDI channels before they start playing.
00:22:51.449 The conductor can send them commands like play or stop, but the nice thing is these clients can send back meta information.
00:23:12.150 For example, we probably don't want to start a song until everyone is ready. Looking back at that dashboard over there, we have parts and assignments saying that there are roughly so many connected.
00:23:39.889 About 150 connected, 105 are ready. Some latency information and what instruments are out there.
00:24:09.300 I believe roughly ten of each or something like that. With a lot of this information, you can keep track of what exactly your clients are doing.
00:24:27.360 That doesn't mean we're not still using RESTful endpoints. Some endpoints may make a lot more sense for REST than others, such as logging in.
00:24:44.190 The thing about WebSockets is they don't replace RESTful endpoints; they just augment them.
00:25:06.899 So that raises a good question: does it scale? I prepared an extra special little demonstration just for that.
00:25:38.370 So let's go ahead and do this.
00:26:44.050 Okay, well refresh it then. Come on!
00:27:06.109 Everything is now decided. That's more work today. It was supposed to be a very funny joke where it actually plays musical scales!
00:27:27.460 But yes, very funny! There are legitimate concerns about scalability and there's a lot of research being done on this.
00:27:50.120 Especially with some of the folks sitting here in this audience. In any case, I probably will not be a great source of information on this at the moment.
00:28:12.770 As this talk is mostly me telling Heroku to make my problems go away, that's glossing over a lot of information here.
00:28:43.020 Some of it is saved until later, such as security and latency.
00:29:01.360 Next up, we have our clients, or everything that's going on on your phones.
00:29:18.020 Originally I started with jQuery, managing to get a working proof of concept. The problem was I had implemented kind of a patchy, hacky version of React.
00:29:38.130 Being as I'm not nearly as clever as Dan Abramov in JavaScript, I decided it was probably a good idea to learn React.
00:30:01.290 So a lot of the front end here is written in React, which I learned in the process of writing this talk.
00:30:27.600 I wouldn't suggest that you learn a new language while preparing a conference talk, but the difference here was that there are a lot of DOM manipulations happening.
00:30:53.390 Most of the actions are based on the result of listening to WebSockets and responding to messages sent by the conductor.
00:31:15.620 How in the world is that actually playing music? It's using a magical little tool called Tone.js and synthesizers.
00:31:46.960 What happens is it listens to an instrumental track and gets a series of notes to play.
00:32:06.500 Some notes are artificially modified on phones because I found out the hard way that phone speakers can't play bass registers.
00:32:31.930 One time I was demoing and the entire bottom section just fell out, so I figured out that phones can't play that low.
00:32:53.710 We plugged headphones into one of the phones and, sure enough, it played normally.
00:33:15.630 Phone speakers cannot handle bass music, a good thing to know if you ever want to try something like this.
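The talk does not say exactly how the low notes were modified. One plausible approach, sketched here purely as an assumption, is to raise any note below a cutoff by whole octaves until it lands in a range a phone speaker can reproduce. The sketch uses MIDI note numbers, where one octave is 12 semitones; the cutoff of 48 (C3) is a made-up value, not a measured one.

```ruby
# Lowest MIDI note number we assume a phone speaker can reproduce.
# Hypothetical threshold: 48 is C3 when middle C (C4) is 60.
LOWEST_AUDIBLE = 48
SEMITONES_PER_OCTAVE = 12

# Raise a note by whole octaves until it reaches the audible range,
# so the pitch class (the note name) stays the same.
def lift_for_phone(midi_note, floor: LOWEST_AUDIBLE)
  midi_note += SEMITONES_PER_OCTAVE while midi_note < floor
  midi_note
end

puts [36, 43, 60].map { |n| lift_for_phone(n) }.inspect # prints [48, 55, 60]
```

Shifting by octaves keeps the harmony intact, at the cost of a thinner bass section.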
00:33:38.160 Tone.js allows us to keep track of all the notes and events that happen on the sequence, but it only keeps time locally.
00:34:03.200 They never thought keeping time across devices would be a good idea.
00:34:06.800 As for now, we artificially add an offset to whatever time it's supposed to start at, but we'll get into that more under latency.
00:34:43.530 Of course, this doesn't even scratch the surface of what's possible with Tone.js, which can be used for substantially more.
00:35:00.240 I'll have some resources later for people wanting to learn a little more on that front.
00:35:26.370 But that brings us to our next issue: what happens if a particularly mischievous lemur decides they want to interfere with that connection?
00:35:44.289 Security is always an issue, even with WebSockets. In the case of this administrative dashboard, we're using Devise.
00:36:04.949 It would be bad if someone else could start playing music without proper authority.
00:36:14.690 Though I'm not really using typical sessions here, we're using something else entirely that works a little better with front ends: JWTs or JSON Web Tokens.
00:36:40.720 As with everything in technology, there are pros and cons, and like every conference speaker, I'm going to highlight the pros and gloss over the cons.
00:37:07.620 JWTs are interesting because they're self-contained; there's no need to query information on the server.
00:37:34.400 The entire session is encoded in the token, which means they're stateless as the server doesn't need to keep track of the session.
00:37:46.780 If that sounds insecure, good, you have a future in information security; we should talk later.
00:38:17.370 Tokens are signed, which is why they are considered secure. But if your Rails application secret is leaked, you have bigger issues to worry about.
00:38:35.830 These three components (the header, the payload, and the signature) contain everything the user needs to run an application like this.
00:38:49.800 But what happens if that mischievous lemur decides to tamper with this and have some fun with it?
00:39:18.050 Imagine a scenario where a mischievous lemur edits their token and tells the server that they are an administrator.
00:39:40.280 That would be a problem, but remember, there's a signature. The server sees that the signature no longer matches the token's contents.
00:39:54.570 Thus, the token will be detected as invalid. Now suppose we need to revoke a token.
00:40:08.410 Because it's stateless, if you happen to revoke a token, you'd have to introduce state to stop people from logging in.
00:40:30.204 There is a denylist to manage, and if it goes down, there’s chaos! Find a good point of "good enough."
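To make the tampering scenario concrete, here is a minimal sketch of HMAC-SHA256 (HS256) token signing and verification using only Ruby's standard library. It is not the code from the talk, and a real application would more likely rely on an established JWT library. The secret, the claim names, and the helper method names are hypothetical, and expiry and revocation checks are omitted.

```ruby
require "openssl"
require "base64"
require "json"

SECRET = "not-a-real-secret" # hypothetical; a real app would use its own secret

def b64(data)
  Base64.urlsafe_encode64(data, padding: false)
end

def sign(signing_input, secret)
  b64(OpenSSL::HMAC.digest(OpenSSL::Digest.new("SHA256"), secret, signing_input))
end

# Compare two strings without stopping at the first differing byte.
def secure_equal?(left, right)
  return false unless left.bytesize == right.bytesize
  left.bytes.zip(right.bytes).reduce(0) { |acc, (x, y)| acc | (x ^ y) }.zero?
end

# Build header.payload.signature, the three parts of a token.
def encode_token(payload, secret)
  header = { alg: "HS256", typ: "JWT" }
  signing_input = [b64(JSON.generate(header)), b64(JSON.generate(payload))].join(".")
  "#{signing_input}.#{sign(signing_input, secret)}"
end

# Return the payload if the signature matches, otherwise nil.
def decode_token(token, secret)
  header, payload, signature = token.split(".")
  return nil unless header && payload && signature
  return nil unless secure_equal?(sign("#{header}.#{payload}", secret), signature)
  JSON.parse(Base64.urlsafe_decode64(payload))
end

token = encode_token({ "sub" => "red", "admin" => false }, SECRET)

# The mischievous lemur swaps in a new payload but cannot produce a matching signature.
header, _payload, signature = token.split(".")
forged = [header, b64(JSON.generate({ "sub" => "red", "admin" => true })), signature].join(".")

puts decode_token(token, SECRET).inspect  # the original claims
puts decode_token(forged, SECRET).inspect # nil, because the signature no longer matches
```

Note that nothing here can revoke a still-valid token; that is where the stateful denylist mentioned above comes in.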
00:40:48.949 Moving on, latency turns out to be a challenging problem to tackle.
00:41:10.929 Getting everything to play on time is incredibly hard. Networks are inconsistent.
00:41:39.940 Each of our phones has a distinct time, and while they seem consistent for everyday use, much more precision is required for music.
00:41:52.994 The problem is if we played the same note against each of those clocks, we'd end up with something sounding like three phones at slightly distinct intervals.
00:42:30.230 Every additional hop we add means more time lost; keeping all devices synchronized is necessary.
00:43:01.470 We require a trustworthy single source for timing, which is why I was delighted to discover something called TimeSyncJS.
00:43:39.230 It has options for both peer-to-peer and server syncing, but in our case, we're using server syncing.
00:44:03.460 It accounts for round-trip time and its deviation, and provides callbacks for any offset changes.
00:44:32.710 Each request to the server carries a timestamp, and the server responds with its own time, from which we can compute averages.
00:45:24.910 This process creates a much more consistent interface for us to work with and allows for a clock class that we can utilize.
00:45:56.250 From there, we can manage consistent offsets, but they're not exact, just close enough for cohesive performances.
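The general technique behind this kind of server syncing can be sketched as follows. This is not TimeSyncJS's actual implementation, only an NTP-style estimate under the assumption that the reply takes about half the round trip to arrive. All names (`Sample`, `estimate_offset`, `SyncedClock`) and the sample timings, in milliseconds, are hypothetical.

```ruby
# One sync exchange: the client's clock at send and receive,
# and the server's clock as reported in the reply (all in milliseconds).
Sample = Struct.new(:sent_at, :server_time, :received_at) do
  def round_trip
    received_at - sent_at
  end

  # Assume the reply spent half the round trip in flight.
  def offset
    server_time - (sent_at + received_at) / 2.0
  end
end

# Average the offsets of the samples with the shortest round trips,
# since slow exchanges give the least reliable estimates.
def estimate_offset(samples, keep: 3)
  raise ArgumentError, "no samples" if samples.empty?
  best = samples.min_by(keep, &:round_trip)
  best.sum(&:offset) / best.size
end

# Converts between the local clock and the shared server clock.
class SyncedClock
  def initialize(offset_ms)
    @offset_ms = offset_ms
  end

  def to_server(local_ms)
    local_ms + @offset_ms
  end

  def to_local(server_ms)
    server_ms - @offset_ms
  end
end

SAMPLES = [
  Sample.new(1000, 1540, 1080),
  Sample.new(2000, 2530, 2060),
  Sample.new(3000, 3650, 3400), # slow outlier, dropped by keep: 3
  Sample.new(4000, 4525, 4050)
].freeze

offset = estimate_offset(SAMPLES)
puts offset # prints 500.0

# A note scheduled for server time 10_000 should start at this local time.
puts SyncedClock.new(offset).to_local(10_000) # prints 9500.0
```

The estimate is only as good as the symmetry assumption, which is why the result is close enough for a cohesive performance rather than exact.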
00:46:31.800 There are always ways to improve this, but they often lead to requiring atomic clocks and heavy research.
00:46:52.500 The server must also maintain consistent timestamps across instances to make everything harmonious.
00:47:19.200 Most hosting services utilize NTP to ensure server clocks sync up correctly, which is a remarkably underappreciated standard.
00:47:46.490 The point of chaos arises when there is a discrepancy in timestamping from multiple servers.
00:48:08.800 This connects with the question of real-time applications: what exactly does real-time mean?
00:48:34.570 In films, we've noticed a shift from 30 frames per second to 60 frames per second, leading to an unsettlingly immersive experience.
00:48:54.640 In gaming, the quest for low-latency interactions is fierce; ideally under ten milliseconds.
00:49:22.360 Conversely, sprinters in the Olympics have reaction times between 150 and 200 milliseconds, which looks instantaneous to spectators.
00:49:48.650 When it comes to orchestras, you would think that timing would be exact. However, those little inconsistencies contribute to the lively character of music.
00:50:14.490 Thus, a conductor's visual cues are quintessential for coordination; otherwise, the performance can devolve into chaos.
00:50:39.140 The beauty of live performance is that it evokes emotion, while MIDI can sound quite mechanical.
00:51:02.610 So, in the end, Action Cable isn't about strict, exact measures. It is about being close enough to fuel joy.
00:51:29.480 To wrap up, if you want to find out more about where the lemurs go next, feel free to follow me on social networks.
00:51:48.690 Twitter being inconsistent for me, and yes, people still do use IRC!
00:52:01.610 If you manage to find all the lemur stickers hiding, there’s a special prize at the Square booth.
00:52:19.210 Yes, we do have fun things associated with this talk; look out for the lemurs!
00:52:37.400 I’ve been prattling long enough, so thank you for your time!