00:00:21.830
Our story today starts with Red the lemur and his master, Scarlets. "Why hasn't the symphony started yet? The orchestra is all here! Topaz is ready to conduct, isn't she?"
00:00:34.530
"Why yes, Red, I believe she is!" "Then why aren't we playing?"
00:00:41.820
"Because we have a new conductor joining us today." "Where? I don't see them."
00:00:49.470
"They're out there." "Oh, you mean out there? There are some very interesting looking lemurs. Do you think they can play instruments too?"
00:00:59.550
"I'm sure they'll do magnificently! Oh, I think they're about ready."
00:01:06.869
Hello everyone! So it looks like we're about ready to start here. If you haven't already, let's go ahead and join in at Symphony.dev.
00:01:14.700
"Symphony" with a Y was taken, but I kind of like this better. I'm surprised I got that one, honestly, but I'm not going to complain; I like it. It's actually better. So we'll go ahead and get you all there. Give that a second.
00:01:28.890
Oh right, I should turn on the volume on here; otherwise, that's not actually going to do anything, is it? Okay, then let's go ahead and have our fun real quick. We'll load this up and then we're going to hope this works.
00:01:46.880
I have a reasonable amount of confidence because I’ve cheated. I'm not going to say how, but I have cheated. Oh no, no, no, come on! Conference Wi-Fi is fun.
00:02:10.060
We will endeavor, though, so a little bit about what this talk is while this is going on. I submitted this crazy idea for RailsConf because I had this dream of conducting a symphony orchestra.
00:02:38.709
This isn't exactly what I had in mind, but it does work, so we're going to see if we can manage to make that happen. Now it looks like we're about halfway loaded.
00:02:55.120
Come on! Which is why we have this today. I mean, worst case, I'm just going to switch over to localhost and make it from there.
00:03:06.569
I don't want to, but we will see. Come on.
00:03:16.120
Hmm, you see, this is why you're very cautious about doing things through Wi-Fi, which is why we're going to switch over to localhost. Well, I have 109 of you there already.
00:03:46.460
I'm not sure about the rest of them. Come on, you can do it! What's latency anyway? Wow, that's impressive! Someone's like two weeks behind. How do you do that?
00:04:09.460
It's probably Android. Never mind, I'm using Android, which probably explains a few other things. Okay, so we're going to go ahead and cheat and go to the local version of this.
00:04:30.090
Oh yes, you can see exactly what's going on over there. You can also see it's stuck loading, so unfortunately, I'm going to have to demo this with smaller audiences later. But to give you the experience anyway, we're going to go ahead and, let's see here, run this.
00:05:56.150
So it does work locally; I promise that much. This is why we have contingency plans, ladies and gentlemen.
00:06:06.180
Of course, because my phone locked, now I can't play it. Oh, you're no fun! Okay, well, that's not going to work because my phone locked. Now it doesn't know about the ready state. That's a problem with WebSockets.
00:06:24.690
If your phone happens to lock, bad things happen. Oh well, we tried, we failed, we endeavor, we move on. So welcome to the Action Cable Symphony!
00:06:44.490
So who am I that's up in front of you in a Beethoven-looking wig and a tuxedo? Well, first of all, it sounded fun, so I did it.
00:07:03.420
Second of all, I used to be an artist and musician, and I ended up becoming a programmer through a series of very unfortunate accidents.
00:07:30.090
It all started with someone saying I should try web development. Web development is a lot of fun; it's going to be great! Okay, so I get into some HTML and some CSS; it's working decently okay.
00:08:07.200
And then they said you should add some JavaScript there. Okay, I'm enjoying this. This is really quite entertaining.
00:08:26.810
And then someone said the horrific words I should have said no to: how about a back end? And that's how I became a developer, an operations person, and several other things. Currently I'm managing Ruby architecture across the company and defining standards.
00:09:02.910
Now, what exactly was that that you just witnessed? It was a symphony played with an entire audience on my computer. Okay, I tried; you'll have to forgive me for that. I'll show you all later; I promise it works.
00:09:50.640
It's using Rails and Action Cable with various kludges and hackery to make it work. More specifically, we were doing Beethoven's Sixth, the Pastoral Symphony. It's a cheery little tune I really enjoy.
00:10:07.180
The fun thing is this song's actual name is "Awakening of Cheerful Feelings on Arrival in the Countryside," and I thought that was a really beautiful thing.
00:10:43.450
That's what we're going to do today: we can bring some cheerful feelings into this brand new world of WebSockets with client latency.
00:10:57.640
The more pertinent question here is how? How in the world does something like this work? As it turns out, creating a symphony on smartphones is a really hard task.
00:11:27.360
So let's start with a bit of an overview. What exactly are we going to be covering here today? We're going to start by looking at what it takes on the server to make something like this work.
00:12:07.440
How do you actually get clients to behave themselves, mostly, preferably? How do we secure this thing, which is definitely an interesting task? Then latency, my personal favorite,
00:12:31.350
as we just saw, and a finale to finish it up. So now for the first major component, we're going to look at the Rails server. We know it's using Action Cable, but how does that work to make music?
00:12:49.080
I mean, Action Cable is just for chat applications, right? We start with something called a MIDI file.
00:13:05.240
For those not familiar, a MIDI file is kind of an old-school input/output file format that allows us to play music with voices, sound fonts, and everything else.
00:13:34.580
What we do is convert this to JSON, something we can actually parse, because binary files are not very friendly to the front end. This gives us the ability to get the tracks from the MIDI and treat each of them as its own separate entity.
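As a rough illustration of that conversion step, here is a minimal Ruby sketch. It assumes the binary MIDI file has already been decoded into a flat list of note events; the field names (`track`, `note`, `time`, `duration`) and the `tracks_from` helper are hypothetical and are not taken from the talk's code.

```ruby
require "json"

# Hypothetical, already-decoded note events. A real MIDI file is binary and
# would need a parser first; these field names are assumptions.
events = [
  { track: "violin", note: "E5", time: 0.0, duration: 0.5 },
  { track: "viola",  note: "C4", time: 0.0, duration: 1.0 },
  { track: "violin", note: "G5", time: 0.5, duration: 0.5 }
]

# Group the flat event list by track so each instrument becomes its own
# entity, with its notes sorted by start time.
def tracks_from(events)
  events
    .group_by { |event| event[:track] }
    .transform_values do |notes|
      notes
        .sort_by { |note| note[:time] }
        .map { |note| note.reject { |key, _value| key == :track } }
    end
end

# Serialize to JSON, which a browser front end can parse directly.
puts JSON.generate(tracks_from(events))
```

Keeping each track under its own key is what later makes it easy to send a phone only the part it is playing.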
00:14:53.090
But there's a lot more complexity there, like time signatures, control changes, voices, and other things that are very conveniently not a problem in either Beethoven's Sixth or Ninth Symphony.
00:15:22.760
Overall, it would look something like this: a flow of data from the conductor all the way down to the individual lemurs that happen to be playing.
00:15:57.200
Now how does that work? If we take a look at our conductor here, it brings us to our first interesting part of Action Cable, which is that there's no hard requirement on only using one cable.
00:16:31.420
We start with the conductor cable, which allows us to have an administrative interface for sending and receiving high-level commands like stop, play, go, and buffer music. Just like a symphony conductor!
00:17:12.630
Our conductor channel is in charge of the entire show. We break those MIDIs into separate tracks and create a channel for each instrument, so you can listen to only the part that makes sense for you.
00:17:59.640
In this case, we might have a channel for French horns, violas, and violins. Each note in the track is broadcast over the associated MIDI channel and eventually reaches the instrument on your phone.
00:18:34.850
Our clients only listen for the parts they need, so you don't get the entire symphony; you just get that one part.
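To make the one-stream-per-instrument idea concrete, here is a small plain-Ruby simulation. It is not Action Cable itself: the `Broadcaster` class and the `instrument:...` stream names are invented for illustration. In a Rails app, the corresponding pieces would be `stream_from` inside a channel and `ActionCable.server.broadcast` on the server.

```ruby
# A tiny in-memory stand-in for per-instrument streams (hypothetical class).
class Broadcaster
  def initialize
    # Each stream name maps to the handlers subscribed to it.
    @subscribers = Hash.new { |hash, stream| hash[stream] = [] }
  end

  def subscribe(stream, &handler)
    @subscribers[stream] << handler
  end

  # Deliver a message only to the handlers subscribed to this stream.
  def broadcast(stream, message)
    @subscribers.fetch(stream, []).each { |handler| handler.call(message) }
  end
end

broadcaster = Broadcaster.new
violin_phone = []
horn_phone = []

# Each phone subscribes only to the stream for its own instrument.
broadcaster.subscribe("instrument:violin") { |note| violin_phone << note }
broadcaster.subscribe("instrument:french_horn") { |note| horn_phone << note }

broadcaster.broadcast("instrument:violin", { note: "E5", time: 0.0 })
broadcaster.broadcast("instrument:french_horn", { note: "F3", time: 0.0 })
broadcaster.broadcast("instrument:violin", { note: "G5", time: 0.5 })

puts "violin phone received #{violin_phone.size} notes"
puts "horn phone received #{horn_phone.size} notes"
```

The horn phone never sees the violin notes, which is the property the talk relies on to keep each client's traffic small.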
00:19:15.200
Then we have our last piece, the players themselves. When you see you're connecting, you don't know how many instruments you're going to get. It looks like right now about 46 to 100.
00:19:43.090
There could be any number: oh, maybe 10, 100, or, have mercy on my Heroku bill, please, about a thousand.
00:20:08.030
Each of these players is a distinct person, so we can keep communications directly back and forth to the clients. But let's take a look at those players.
00:20:38.169
Whenever you first connect, you're not sure who you are yet because you don't know what song you're playing; you don't know what's loading. That's the job of the conductor to tell people which instruments they should be playing.
00:21:05.150
But they can only do that once they know again what song we are actually playing. The assignment uses a super advanced Ruby algorithm to determine the ideal placement for each player.
00:21:47.830
It took weeks to perfect and honestly is my proudest piece of coding I've ever done. Very delicately calibrated!
00:22:15.179
So, anyways, once that command processes, we have instruments on phones.
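The transcript does not show the assignment code, so the following is only one plausible sketch: a round-robin spread of players across instruments, which would produce the roughly even split per instrument mentioned later on the dashboard. The method name and the player identifiers are hypothetical.

```ruby
# Assign each connected player an instrument in round-robin order.
# This is an assumed strategy, not the algorithm from the talk.
def assign_instruments(player_ids, instruments)
  raise ArgumentError, "no instruments to assign" if instruments.empty?

  player_ids.each_with_index.map do |player_id, index|
    [player_id, instruments[index % instruments.size]]
  end.to_h
end

players = %w[player-1 player-2 player-3 player-4 player-5]
parts = %w[violin viola french_horn]

assign_instruments(players, parts).each do |player_id, instrument|
  puts "#{player_id} plays #{instrument}"
end
```

Round-robin keeps the sections balanced no matter how many phones connect, at the cost of ignoring which part most needs extra volume.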
00:22:32.670
Now that we know that, the tracks can start collecting a buffer of information on notes from their relevant MIDI channels before they start playing.
00:22:51.449
The conductor can send them commands like play or stop, but the nice thing is these clients can send back meta information.
00:23:12.150
For example, we probably don't want to start a song until everyone is ready. Looking back at that dashboard over there, we have parts and assignments saying that there are roughly so many connected.
00:23:39.889
About 150 are connected and 105 are ready, along with some latency information and which instruments are out there.
00:24:09.300
I believe there are roughly ten of each or something like that. With a lot of this information, you can keep track of what exactly your clients are doing.
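The connected and ready counts described above can be sketched as a small roster object. This is a hedged illustration only: the `Roster` class and its method names are invented here, and the real dashboard may track this state quite differently.

```ruby
require "set"

# Tracks which players are connected and which have reported ready.
# Hypothetical class, not taken from the talk's code.
class Roster
  def initialize
    @connected = Set.new
    @ready = Set.new
  end

  def connect(player_id)
    @connected << player_id
  end

  def disconnect(player_id)
    @connected.delete(player_id)
    @ready.delete(player_id)
  end

  # Only a connected player can become ready.
  def mark_ready(player_id)
    @ready << player_id if @connected.include?(player_id)
  end

  def summary
    { connected: @connected.size, ready: @ready.size }
  end

  # The conductor would wait for this before sending "play".
  def everyone_ready?
    !@connected.empty? && @connected == @ready
  end
end

roster = Roster.new
%w[a b c].each { |id| roster.connect(id) }
roster.mark_ready("a")
roster.mark_ready("b")
puts roster.summary
puts roster.everyone_ready?
```

Removing a player from the ready set on disconnect matters in practice, since a locked phone drops its connection and should no longer count toward the total.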
00:24:27.360
That doesn't mean we're not still using RESTful endpoints. Some endpoints may make a lot more sense for REST than others, such as logging in.
00:24:44.190
The thing about WebSockets is they don't replace RESTful endpoints; they just augment them.
00:25:06.899
So that raises a good question: does it scale? I prepared an extra special little demonstration just for that.
00:25:38.370
So let's go ahead and do this.
00:26:44.050
Okay, well refresh it then. Come on!
00:27:06.109
Everything has now decided not to work today. It's supposed to be a very funny joke: it actually plays musical scales!
00:27:27.460
But yes, very funny! There are legitimate concerns about scalability and there's a lot of research being done on this.
00:27:50.120
Especially with some of the folks sitting here in this audience. In any case, I probably will not be a great source of information on this at the moment.
00:28:12.770
As this talk is mostly me telling Heroku to make my problems go away, that's glossing over a lot of information here.
00:28:43.020
Some of it is saved until later, such as security and latency.
00:29:01.360
Next up, we have our clients, or everything that's going on on your phones.
00:29:18.020
Originally I started with jQuery and managed to get a working proof of concept. The problem was that I had implemented kind of a patchy, hacky version of React.
00:29:38.130
Seeing as I'm not nearly as clever as Dan Abramov in JavaScript, I decided it was probably a good idea to learn React.
00:30:01.290
So a lot of the front end here is written in React, which I learned in the process of writing this talk.
00:30:27.600
I wouldn't suggest that you learn a new language while preparing a conference talk, but the difference here was that there are a lot of DOM manipulations happening.
00:30:53.390
Most of the actions are based on the result of listening to WebSockets and responding to messages sent by the conductor.
00:31:15.620
How in the world is that actually playing music? It's using a magical little tool called Tone.js and synthesizers.
00:31:46.960
What happens is it listens to an instrumental track and gets a series of notes to play.
00:32:06.500
Some notes are artificially modified on phones because I found out the hard way that phone speakers can't play bass registers.
00:32:31.930
One time I was demoing and the entire bottom section just fell out, so I figured out that phones can't play that low.
00:32:53.710
We plugged headphones into one of the phones and, sure enough, it played normally.
00:33:15.630
Phone speakers cannot handle bass music, a good thing to know if you ever want to try something like this.
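The talk does not say exactly how the low notes are modified. One simple approach, assumed here, is to shift any note below a cutoff up by whole octaves until it is in range. The sketch uses MIDI note numbers, where one octave is 12 semitones; the cutoff of 48 is a guess for illustration, not a measured limit of phone speakers.

```ruby
SEMITONES_PER_OCTAVE = 12

# Hypothetical cutoff: MIDI note 48 is C3 in the common convention where
# note 60 is middle C (C4). The real limit depends on the device.
LOWEST_PLAYABLE_NOTE = 48

# Raise a note by whole octaves until it reaches the playable range, so the
# pitch class is preserved even though the register changes.
def raise_to_playable(midi_note, floor: LOWEST_PLAYABLE_NOTE)
  midi_note += SEMITONES_PER_OCTAVE while midi_note < floor
  midi_note
end

bass_line = [36, 40, 43, 60]
puts bass_line.map { |note| raise_to_playable(note) }.inspect
# prints [48, 52, 55, 60]
```

Shifting by octaves keeps the harmony intact, which is why the part still sounds right even though it no longer sits at the bottom of the orchestra.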
00:33:38.160
Tone.js allows us to keep track of all the notes and events that happen on the sequence, but it only keeps time locally.
00:34:03.200
They never thought keeping time across devices would be a good idea.
00:34:06.800
As for now, we artificially add an offset to whatever time it's supposed to start at, but we'll get into that more under latency.
00:34:43.530
Of course, this only scratches the surface of what's possible with Tone.js, which can be used for substantially more.
00:35:00.240
I'll have some resources later for people wanting to learn a little more on that front.
00:35:26.370
But that brings us to our next issue: what happens if a particularly mischievous lemur decides they want to interfere with that connection?
00:35:44.289
Security is always an issue, even with WebSockets. In the case of this administrative dashboard, we're using Devise.
00:36:04.949
It would be bad if someone else could start playing music without proper authority.
00:36:14.690
Though I'm not really using typical sessions here, we're using something else entirely that works a little better with front ends: JWTs or JSON Web Tokens.
00:36:40.720
As with everything in technology, there are pros and cons, and, like every conference speaker, I'm going to highlight the pros and gloss over the cons.
00:37:07.620
JWTs are interesting because they're self-contained; there's no need to query information on the server.
00:37:34.400
The entire session is encoded in the token, which means they're stateless as the server doesn't need to keep track of the session.
00:37:46.780
If that sounds insecure, good, you have a future in information security; we should talk later.
00:38:17.370
Tokens are signed, which is why they are considered secure. But if your Rails application secret is leaked, you have bigger issues to worry about.
00:38:35.830
These three components (the header, the payload, and the signature) contain everything the user needs to run an application like this.
00:38:49.800
But what happens if that mischievous lemur decides to tamper with this and have some fun with it?
00:39:18.050
Imagine a scenario where a mischievous lemur amends their token and tells the server that they are an administrator.
00:39:40.280
That would be a problem, but remember, there's a signature. When the server checks the token, the signature no longer matches.
00:39:54.570
Thus, the token will be detected as invalid, and we will need to revoke it.
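To show why tampering is detected, here is a minimal sketch of HMAC-SHA256 (HS256) token signing using only Ruby's standard library. The helper names, the secret, and the claims are made up for illustration, and a real application would normally use a maintained JWT library and also check claims such as expiry.

```ruby
require "json"
require "openssl"

# Base64url without padding, as JWTs use. pack("m0") is plain Base64;
# tr and delete convert it to the URL-safe, unpadded form.
def base64url(bytes)
  [bytes].pack("m0").tr("+/", "-_").delete("=")
end

def sign(signing_input, secret)
  base64url(OpenSSL::HMAC.digest("SHA256", secret, signing_input))
end

def issue_token(payload, secret)
  header = base64url(JSON.generate({ alg: "HS256", typ: "JWT" }))
  body = base64url(JSON.generate(payload))
  signing_input = "#{header}.#{body}"
  "#{signing_input}.#{sign(signing_input, secret)}"
end

# Compare without exiting early on the first differing byte.
def secure_equal?(left, right)
  return false unless left.bytesize == right.bytesize

  left.bytes.zip(right.bytes).reduce(0) { |diff, (a, b)| diff | (a ^ b) }.zero?
end

def valid_token?(token, secret)
  parts = token.split(".")
  return false unless parts.size == 3

  header, body, signature = parts
  secure_equal?(sign("#{header}.#{body}", secret), signature)
end

secret = "hypothetical-app-secret"
token = issue_token({ sub: "lemur-42", admin: false }, secret)

# The mischievous lemur swaps in a payload claiming admin rights, but
# cannot recompute the signature without the secret.
header, _body, signature = token.split(".")
forged_body = base64url(JSON.generate({ sub: "lemur-42", admin: true }))
forged = [header, forged_body, signature].join(".")

puts valid_token?(token, secret)   # prints true
puts valid_token?(forged, secret)  # prints false
```

The server never looks anything up to reach this verdict, which is the stateless property described earlier; it only recomputes the signature from the header and payload it was handed.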
00:40:08.410
Because it's stateless, if you happen to revoke a token, you'd have to introduce state to stop people from logging in.
00:40:30.204
There is a denylist to manage, and if it goes down, there’s chaos! Find a good point of "good enough."
00:40:48.949
Moving on, latency turns out to be a challenging problem to tackle.
00:41:10.929
Getting everything to play on time is incredibly hard. Networks are inconsistent.
00:41:39.940
Each of our phones has a distinct time, and while they seem consistent for everyday use, much more precision is required for music.
00:41:52.994
The problem is that if we played the same note against each of those clocks, we'd end up with something that sounds like three phones playing at slightly different times.
00:42:30.230
Every additional hop we add means more time lost; keeping all devices synchronized is necessary.
00:43:01.470
We require a trustworthy single source for timing, which is why I was delighted to discover something called TimeSyncJS.
00:43:39.230
It has options for both peer-to-peer and server syncing, but in our case, we're using server syncing.
00:44:03.460
It accounts for round-trip time and deviation, and provides callbacks for any offset changes.
00:44:32.710
Each request to the server carries a timestamp, and the server responds with its own time, from which we can compute averages.
00:45:24.910
This process creates a much more consistent interface for us to work with and allows for a clock class that we can utilize.
00:45:56.250
From there, we can manage consistent offsets, but they're not exact, just close enough for cohesive performances.
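The arithmetic behind this kind of server syncing can be sketched in a few lines of Ruby. This is a general NTP-style estimate, not the sync library's actual implementation: the `Sample` struct, the rule of dropping samples whose round trip is more than twice the median, and the numbers are all assumptions for illustration.

```ruby
# One sync exchange, all times in milliseconds. sent_at and received_at are
# read from the client clock; server_time is what the server replied with.
Sample = Struct.new(:sent_at, :server_time, :received_at) do
  def round_trip
    received_at - sent_at
  end

  # Assume the server stamped its time halfway through the round trip.
  # A positive offset means the server clock is ahead of the client clock.
  def offset
    server_time - (sent_at + received_at) / 2.0
  end
end

def median(values)
  sorted = values.sort
  mid = sorted.size / 2
  sorted.size.odd? ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0
end

# Average the offsets, ignoring samples with unusually slow round trips
# (assumed rule: more than twice the median), since those are least reliable.
def estimate_offset(samples)
  raise ArgumentError, "need at least one sample" if samples.empty?

  typical = median(samples.map(&:round_trip))
  trusted = samples.select { |sample| sample.round_trip <= typical * 2 }
  trusted.sum(&:offset) / trusted.size
end

# Convert a start time on the server clock to this client's local clock.
def local_start_time(server_start_time, offset)
  server_start_time - offset
end

samples = [
  Sample.new(1000, 6010, 1020), # 20 ms round trip
  Sample.new(2000, 7016, 2030), # 30 ms round trip
  Sample.new(3000, 8050, 3400)  # 400 ms round trip, dropped as an outlier
]

offset = estimate_offset(samples)
puts offset
puts local_start_time(10_000, offset)
```

Every client that schedules the first note at its own `local_start_time` for the same server time should begin at roughly the same real moment, with the remaining error bounded by how asymmetric each network path is.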
00:46:31.800
There are always ways to improve this, but they often lead to requiring atomic clocks and heavy research.
00:46:52.500
The server must also maintain consistent timestamps across instances to make everything harmonious.
00:47:19.200
Most hosting services utilize NTP to ensure server clocks sync up correctly, which is a remarkably underappreciated standard.
00:47:46.490
The point of chaos arises when there is a discrepancy in timestamping from multiple servers.
00:48:08.800
This connects with the question of real-time applications: what exactly does real-time mean?
00:48:34.570
In films, we've noticed a shift from 30 frames per second to 60 frames per second, leading to an unsettlingly immersive experience.
00:48:54.640
In gaming, the quest for low-latency interactions is fierce; ideally under ten milliseconds.
00:49:22.360
Conversely, sprinters in the Olympics have reaction times between 150 and 200 milliseconds, which looks instantaneous to spectators.
00:49:48.650
When it comes to orchestras, you would think that timing would be exact. However, those little inconsistencies contribute to the lively character of music.
00:50:14.490
Thus, a conductor's visual cues are essential for coordination; otherwise, the performance can devolve into chaos.
00:50:39.140
The beauty of live performance is that it evokes emotion, while MIDI can sound quite mechanical.
00:51:02.610
So, in the end, Action Cable isn't about strict, exact measurements. It is about being close enough to fuel joy.
00:51:29.480
To wrap up, if you want to find out more about where the lemurs go next, feel free to follow me on social networks.
00:51:48.690
Twitter is the inconsistent one for me, and yes, people still do use IRC!
00:52:01.610
If you manage to find all the lemur stickers hiding, there’s a special prize at the Square booth.
00:52:19.210
Yes, we do have fun things associated with this talk; look out for the lemurs!
00:52:37.400
I’ve been prattling long enough, so thank you for your time!