
Icecast Stream Parsing

Dimiter Petrov • December 04, 2019 • Zurich, Switzerland • Talk

In this presentation, Dimiter Petrov discusses the development of a Ruby parser for the Icecast streaming protocol, aimed at automating the process of airplay matching for Swiss artists on the mx3.ch platform. The need for this system arose due to the inadequacies of the previous third-party solution, which relied on cumbersome file-sharing and cron jobs. The focus of the talk includes the key challenges encountered in building a production-ready system and various technical strategies implemented to overcome these challenges. The main topics covered include:

  • Background Context: The mx3 platform connects Swiss musicians, radio stations, and fans. Previously, airplay data was managed by a third-party system that was becoming obsolete.
  • Implementation of Icecast Parser: An exploration of the Icecast protocol followed, which is used by many radio stations for streaming. Petrov decided to write a Ruby parser from scratch, addressing the limitations of existing Ruby libraries that were deemed not production-ready.
  • Technical Challenges: Among the challenges discussed were handling network errors, threading issues, and managing ActiveRecord connection pools within Ruby on Rails. Petrov also presented how he implemented fuzzy matching using PostgreSQL to ensure accurate song title recognition.
  • Robustness and Reliability: The importance of stability in a production system is emphasized, particularly in the context of recovering from network errors using retry logic and minimizing impact on database connections with a producer-consumer model.
  • Fuzzy Matching Implementation: Petrov demonstrated how he achieved fuzzy matching by utilizing the Levenshtein distance algorithm in PostgreSQL to match artist names and track titles from the metadata against the database.

Throughout the talk, Petrov illustrates each step with practical examples: decoding the Icecast protocol, walking through code snippets, and explaining how he structured the system for performance. The presentation concludes with a reflection on the effectiveness of the new system, which automates airplay notifications for Swiss artists, and on the benefits of building a tailored solution for a specific use case, with the potential for open-source sharing in the future.


Swiss artists registered on mx3.ch get notified when their songs are played on the radio. How does one get airplay information from arbitrary radio streams? How is it matched against a database of songs?

In this presentation, we are going to see the challenges encountered while building a production-ready airplay matching system, including:

* writing a Ruby parser for the Icecast protocol from scratch
* working around peculiarities in Net::HTTP
* threading, queues and ActiveRecord connection pools
* gracefully handling network errors
* fuzzy matching with PostgreSQL

https://www.meetup.com/rubyonrails-ch/events/266309675

Railshöck December 2019

00:00:00.060 good evening, welcome to my talk about parsing Icecast streams. before I get into that, I want to give you some context about why this work was done.
00:00:14.599 this is mx3, which is a website for Swiss music and Swiss musicians. it's the meeting point between artists, radios, fans, record labels and so on. one reason radio stations use it is that they can tap into it to discover new talent, local talent, national talent. in fact the public radio stations were co-founders of the platform back in 2006, and they're still partners. you can see here that this specific track has been played on the radio. employees of the radio stations can have an account, come to the platform and manually enter that something has been played by them, but of course this is a tedious process, so most of it is automated. and this automation was done
00:01:20.490 for a very long time by a third-party partner. they ran a system closer to the broadcasting software: as far as I know, something involving a Windows file share where the broadcasting system wrote text files with the current metadata, then cron jobs running every second firing Perl scripts that started writing to the database, then another Perl script doing the matching of the data, a whole thing. needless to say, at some point they did not want to maintain it anymore and wanted to get rid of it at the end of last year, which kind of triggered my research.
00:02:12.290 and it was also limiting before, because it was kind of out of our control. this system would get the artist and track data from mx3 through the API, then do the matching on its own system, then push the on-air events back to mx3's API.
00:02:35.750 but yeah, this can be simplified. so I tried to look for replacements. I think there's a commercial API that provides real-time data for these kinds of things, but it was not viable for our use case. and of course the first thought was: why don't we put something on our own servers, like the system that was there before? that's not possible; it's too hard to provision, to deploy, to do anything. then I realized that every radio also has an internet stream, right? you can go to a website, click a button, and you hear it. and they're mostly using one of two solutions: SHOUTcast is the commercial one, launched sometime in '98 or '99, and Icecast is an open-source implementation. most radio stations are using Icecast, especially, as you might imagine, the smaller ones, because they don't have to pay for commercial licensing. and yeah, Icecast is kind of like the hidden API for getting metadata about what's currently playing on the radio. I looked at libraries:
00:04:03.319 there's one for Ruby; I checked it out, but for me it's not production-ready. there are some problems with the encoding of the metadata, it uses threads in a weird way, and overall it seemed like it was written by someone who's not familiar with Ruby. and if I have to fix it myself anyway, I thought I might as well research other solutions or write it myself. I looked at JavaScript libraries; there are a few that helped me a bit to understand the protocol, but in the end reusing a JavaScript library was also a no-go because of deployment: mx3 is a Ruby on Rails application, so everything was set up in a different way, and introducing this complexity was not warranted. so I did the next best thing, which is to rewrite it from scratch. this is the git history of
00:05:06.760 my prototype, and this prototype went into production, as all prototypes eventually do even though they shouldn't. you can see that at some point I decided: this is good enough, maybe we should have tests and handle edge cases and so on. so this was a separate repository at first; then I embedded it into the Rails application, and it has been running in production mostly unmodified for about six to eight months, so it was a pretty good prototype. I'll show you the code; some of it is not production-ready, but it's proof-of-concept ready, and yeah, ship early and look for errors, right? so, one aspect, before we get into it,
00:05:59.650 was how to get test data. of course you can run the thing against live radio streams, but if you have automated tests, as this library has, you want something stable, something that doesn't depend on the internet, so you can put it on CI and so on. so what does this do? it just sends an HTTP request to a radio stream, takes the first however many kilobytes we want, and puts them in a file. you can even add that to your repository; 20 KB is not that big a file, right? and this allows you to get feedback quickly. yes, there are not that many tests, but they are higher-level tests, and they gave me enough confidence to refactor and handle edge cases and so on.
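A fixture-capture script along those lines might look like this. It is a sketch, not the talk's code: the method names are mine, and the byte-limiting logic is pulled into its own function so it can be tested without a network.

```ruby
require "net/http"
require "uri"

# Keep only the first `limit` bytes from any source of string chunks.
# Separated out so it can be exercised offline.
def take_bytes(chunks, limit)
  buffer = +""
  chunks.each do |chunk|
    buffer << chunk
    break if buffer.bytesize >= limit
  end
  buffer.byteslice(0, limit)
end

# Capture a fixture from a live stream (the URL is a placeholder).
def capture_fixture(url, path, limit: 20 * 1024)
  uri = URI(url)
  Net::HTTP.start(uri.host, uri.port) do |http|
    request = Net::HTTP::Get.new(uri)
    request["Icy-MetaData"] = "1" # ask the server to interleave metadata
    http.request(request) do |response|
      chunks = Enumerator.new { |y| response.read_body { |c| y << c } }
      File.binwrite(path, take_bytes(chunks, limit))
    end
  end
end
```

The captured file can then live in the repository and feed the test suite without any dependency on a live stream.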
00:07:00.530 and the good thing with writing this as a separate library, in a separate repository outside the application, was that it was completely decoupled from Rails. so once I put it back into Rails, for one, the test suite continued to be really fast, but it could also be open sourced. I have considered open sourcing this and I could, but I don't really want to maintain it; I'm using it for my own purposes, and if someone finds it valuable, why not, but the point is to have a solid system for my use case. and because
00:07:39.229 people, I don't see this in many code bases, I just wanted to give an example of how to avoid loading Rails in tests. usually people have something like a Rails test helper, but you can also have a test helper that is suitable for plain unit tests: you load your test framework, whichever it is, and maybe add some way of requiring files relative to the project root in an easy way. then a test file just requires the test helper and the library file under test, and that's it. it's really fast: for example, I have running the current test hooked up to a key in my editor, and I get a response in some hundred milliseconds, which is cool.
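A minimal Rails-free test helper in that spirit could look like this; the file layout and names are illustrative, not taken from the talk's slides.

```ruby
# test/test_helper.rb -- a helper for plain unit tests that never loads Rails.
require "minitest/autorun"

# The project root, one directory above test/.
PROJECT_ROOT = File.expand_path("..", __dir__)

# Require a file relative to the project root, so tests can be run
# from any working directory.
def require_project(path)
  require File.join(PROJECT_ROOT, path)
end

# A test file then only needs:
#
#   require_relative "test_helper"
#   require_project "lib/icecast/parser"
#
# and runs in milliseconds, because Rails is never booted.
```

The speed comes entirely from what is *not* loaded: no framework boot, just the test framework and the one file under test.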
00:08:38.219 right, let's get into the protocol itself. so, Icecast is a client-server kind of protocol: you make one request to the server, and by default you just get audio data, but you can send an HTTP header, this magical Icy-MetaData header set to 1 (it's called the icy protocol, I think, for Icecast or something), and then in the response you get some headers as well: for example the bitrate, some other metadata that I'm actually not using, and then the important one, which is the metadata interval; I'll get to it shortly. so these are the headers, and then in the response body you get binary data. the binary data you get is a block of audio data, then a byte which says how much metadata is about to follow, then you get the metadata, which looks something like this: you always have StreamTitle, equals, quotes, and then the name of whatever is passing through, and afterwards you get new audio data, and so on and so forth. one thing to note is that the metadata is not necessarily there: this length byte can just contain the value 0, meaning it's not every interval that you get metadata. usually you get some on the change of a song, so every three minutes or so.
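Putting that framing into code, a sketch of reading one audio-plus-metadata block might look like this (the method name is mine; it assumes the metadata interval has already been read from the icy-metaint response header):

```ruby
require "stringio"

# Read one icy "frame": `metaint` bytes of audio, then a length byte,
# then length * 16 bytes of metadata (or nothing if the byte is zero).
# Returns the raw metadata string, or nil when this frame carries none.
def read_icy_metadata(io, metaint)
  io.read(metaint)                  # audio bytes, discarded here
  length = io.read(1).unpack1("C")  # metadata length, in 16-byte blocks
  return nil if length.zero?
  io.read(length * 16)              # raw metadata, NUL-padded
end
```

With a captured fixture this can be exercised offline, e.g. `read_icy_metadata(StringIO.new(fixture), metaint)`.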
00:10:26.759 right, so I looked at how to read this from Ruby. I'm not one for including extra gems and libraries and stuff like that; I don't like projects where you have six different networking libraries, plus two that add slightly improved interfaces on top of them but are used to do the same thing. so I wanted to keep it simple and looked at what was in Ruby's standard library. the API is not great, but it works. in the basic example, though, it seems you are doing a request where you get a finite response, and with an internet radio stream you get an always-running response, unless the network is down or something. you can handle streaming responses with this read_body method, and you get chunks. what I did not like about this is that you don't really have control over how much you read at a time, whereas the TCP socket interface in Ruby provides the IO API, which allows you to read and write exact amounts. and yes, there's a bit more bookkeeping for some things, but for these precise parts of the protocol, where you want to read this number of bytes, then read this byte, and so on, it made things a bit easier. and again, as I said, I used this in production for six to eight months. so, the first thing to do is to send some headers.
00:12:26.230 this Connection class, of course, is just a small abstraction over the socket that adds the newlines. I'm sure you're familiar with HTTP: you request a certain path, then the host, then you send this header, the Icy-MetaData header set to 1, then, I guess, two empty lines; well, this is one empty line, because write_line adds the line ending, and your request is done. maybe I'm getting some detail wrong, but anyhow. then you wait for a response, and you get the HTTP status in response: the line which says HTTP version whatever, 200 OK. this is from the very first commit, so here I did not even bother checking whether I'm getting a bad HTTP status, which version, this kind of stuff; I just assumed that if this public radio has an internet stream, they're probably doing it right, since I wanted to get the whole program done first. then you read some headers, split them on colons so you can put them in a hash, and read the one you're interested in, which is the icy-metaint one.
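The response-reading part described above can be sketched like this; it is fed a canned response here instead of a real socket (any IO that responds to gets works), and the function name is mine.

```ruby
require "stringio"

# Read the status line and the response headers from an Icecast server.
# Headers are "Name: value" lines, terminated by an empty line.
def read_icy_headers(io)
  status = io.gets.chomp             # e.g. "HTTP/1.0 200 OK" or "ICY 200 OK"
  headers = {}
  while (line = io.gets.chomp) != ""
    name, value = line.split(":", 2) # split on the first colon only
    headers[name.strip.downcase] = value.strip
  end
  [status, headers]
end

response = StringIO.new(
  "HTTP/1.0 200 OK\r\n" \
  "icy-name: Example Radio\r\n" \
  "icy-metaint: 16000\r\n" \
  "\r\n"
)
status, headers = read_icy_headers(response)
metaint = Integer(headers.fetch("icy-metaint"))
```

Downcasing the header names keeps the lookup robust, since servers differ in the capitalization they emit.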
00:14:05.110 and then you're ready to parse the stream itself. again, the first attempt was really hacky: just a regular expression against whatever you're receiving, to fetch the stream title and see if you recognize some strings in the raw output. this thing with unpack1 I found a bit interesting. there's a way to unpack binary strings in Ruby; there are two methods, unpack and unpack1. unpack returns an array, and unpack1 returns just the first element of the array. as a parameter you give a directive for the format, and the format can be a number of things, floats and integers and so on; for this protocol we only need the 8-bit unsigned integer and a string. the directive can be followed by a count, and if you put a star, it repeats until the end of the input. so for example here I decode one character from this string, and here I take everything; it removes trailing null terminators and spaces. so yeah, I gave it a try.
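To make those unpack directives concrete (these examples are mine, not the slide's):

```ruby
# "C" decodes one 8-bit unsigned integer; unpack returns an array,
# unpack1 just the first decoded value.
"\x03".unpack("C")   # an array containing 3
length = "\x03".unpack1("C")

# "A*" takes the rest of the input as a string, stripping trailing
# NULs and ASCII spaces (icy metadata blocks are NUL-padded).
title = "StreamTitle='Song';\x00\x00\x00".unpack1("A*")
```

The pair of directives `C` and `A*` is all the icy framing needs: one for the length byte, one for the padded metadata block.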
00:15:36.459 and it works. this is not very interesting data, but it works. and here, oops, we have some trailing garbage, and that is because we read too much data. so how to read just enough data? we can actually use the protocol as intended, which is to read the length byte for the metadata. it's a number, and there's a fixed block size in the protocol, I think it's 16, so whatever length you read, you multiply by 16 and you get the size of your metadata. as I said, you don't get metadata at each iteration, so sometimes you have to just continue reading audio data; by the way, you see here we completely discard the audio data, right? so, same thing as before, you read your metadata. and this is one of the fixes I brought, well, one of the things I had
00:16:41.889 to fix compared to the original library, which was encoding. most streams I encountered are actually Latin-1, you know, ISO 8859-1 encoded; some of them are UTF-8 encoded, and there's no reliable way around it: you could try to guess the encoding, but it doesn't really work. I have a fixed set of radio streams, so I can just parse each one for a bit, try different encodings, and then put it in a configuration somewhere, saying: this radio stream has UTF-8 encoding and this one is Latin-1. so here we can transcode as we wish.
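A sketch of that per-stream configuration and the transcoding step; the stream names are invented, and the default fallback is my own choice.

```ruby
# Encoding per stream, determined empirically: parse each stream for a
# while and record which encoding yields readable titles.
STREAM_ENCODINGS = {
  "radio_one" => Encoding::ISO_8859_1,
  "radio_two" => Encoding::UTF_8,
}.freeze

# The wire gives us raw bytes: tag them with the configured encoding,
# then transcode everything to UTF-8 before storing it.
def normalize_title(raw_bytes, stream)
  encoding = STREAM_ENCODINGS.fetch(stream, Encoding::ISO_8859_1)
  raw_bytes.dup.force_encoding(encoding).encode(Encoding::UTF_8)
end
```

The important distinction is between force_encoding, which only relabels the bytes, and encode, which actually converts them; the first must happen before the second.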
00:17:35.820 then I moved some of the metadata parsing to a separate method, and because we actually want to continue reading data, we just yield the title we got to whatever is invoking this method, and we continue on. the point is that on the other side you should also try to be as quick as possible, write this somewhere else, and not block the main process.
00:18:08.840 yeah, so that was a bit better, and this is what actually shipped in production. and I thought about some of the choices I made. one was that it didn't really support HTTPS, redirects, stuff like that; you could wire OpenSSL and so on in there, but then in the end you're basically reimplementing Net::HTTP, and it's not worth it. also, I was just not comfortable with doing so much HTTP by hand in this small library. so I went back to actually reusing Net::HTTP, and I had to find a way to deal with reading bytes in a controlled manner. I also get some other things for free; I'll mention that in a bit.
00:19:13.920 so the beginning is the same: we get the response header back, and then I introduced this abstraction, the ChunkedIO. it's a modified version of something I saw in a library called down, a Ruby gem, and I massively simplified it because I didn't want the whole power of the full IO interface; I basically just wanted this read method. you can see here that, as an intermediate step, I'm transforming the response body into an enumerator, and then I can get chunks on demand. so, diving a bit deeper into the
00:20:04.870 code: what read does. this is the length we want to read, so we start with an empty buffer, and this is the thing that can get us the chunks. we read from the source as many times as necessary until we get the right length, and you have to realize that this readpartial reads at most this number of bytes; it could read fewer. I'll show you the implementation right away: if the buffer is empty, we get a chunk from the HTTP response, and if we did read more than the limit number of bytes, we store the rest in the buffer, in an instance variable. so this is actually a stateful thing, and whenever readpartial gets called again, the buffer won't be empty, so we don't retrieve a chunk yet; we first push through the rest of what was in the buffer. the original library I took it from did a bit more, I guess to make it more efficient; I don't care if I have an extra buffer copy or two, so this was enough. an interesting thing I discovered was this exception, StopIteration, which is what gets raised when an iteration from a Ruby enumerator stops, and of course I want to translate it into something more tangible for the library. since we are getting network data, EOFError: I mean, I know there are no files here, but Ruby and Unix legacy and all of those things, right?
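A condensed version of that ChunkedIO idea, simplified relative to both the talk's code and the down gem it was adapted from:

```ruby
# Wraps an enumerator of string chunks and exposes exact-length reads,
# buffering whatever a chunk delivers beyond what was asked for.
class ChunkedIO
  def initialize(chunks)
    @chunks = chunks # an Enumerator yielding strings
    @buffer = +""    # leftover bytes from the previous chunk
  end

  # Read at most `limit` bytes (like IO#readpartial).
  def readpartial(limit)
    @buffer = @chunks.next if @buffer.empty? # StopIteration at end
    result = @buffer.byteslice(0, limit)
    @buffer = @buffer.byteslice(result.bytesize..) || +""
    result
  rescue StopIteration
    raise EOFError, "stream ended" # translate enumerator internals
  end

  # Read exactly `length` bytes by looping over readpartial.
  def read(length)
    data = +""
    data << readpartial(length - data.bytesize) while data.bytesize < length
    data
  end
end
```

Because the leftover buffer is instance state, alternating calls to read with arbitrary lengths still consume the chunk stream exactly once, in order.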
00:22:14.530 okay, so this is the happy path. what about other stuff? there are redirects, which I think I have encountered, or server errors: maybe the broadcasting system is down or something, sometimes you get a 500, and maybe you also made a mistake yourself. so what I want to do is handle those cases, and some other things like timeouts, interruptions, socket errors, whatever, in an integrated way. and the idea is that some things should crash the process: for example, if I made a mistake in the code, I want to see this in my bug tracker. but for the rest of the errors I just want to retry; sure, I can log the errors and so on, but I want it to recover. I introduced an abstraction called Attempt, and this code is a bit awkward because I had to make it fit on a slide, but this retries method
00:23:29.580 defines which exceptions I want to rescue, how many retries I want to allow, and what delay I have between retries, which takes a lambda because I usually want to implement exponential backoff between retries, and then extra things to do after each retry, extra logging and so on. this code is within the connect method of this Icecast parsing client, and it takes care of the retry logic; and then there's one method which is completely oblivious to what's happening outside of it, which is the connect without retries. and this thing here, the validation, may be over-engineering, but what I wanted is this: if the block crashes nine times, then works, then works for another three hours, and then crashes again, I did not want to crash the surrounding process. so I said, you know what, once it succeeds, it's good again, and we reset the counter of retries. and here's the implementation: we increment the number of attempts, we run whatever code we want to run, and, I've removed some things here, but we rescue the error and hopefully write a nice message with the stack trace and stuff like that. I also log a longer message here, saying this is retry number six, we're retrying in 10 minutes, and so on. and finally we retry, so we run this block again.
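The shape of that Attempt abstraction, as I understand it from the talk; this is a sketch, with names and defaults of my own, and the sleep function injectable so the behavior can be verified without waiting.

```ruby
# Retry a block on the listed exceptions, with a pluggable delay (for
# exponential backoff) and a counter that resets after every success,
# so an occasional crash never exhausts the retry budget.
class Attempt
  def initialize(exceptions:, max_retries:, delay: ->(n) { 2**n }, sleeper: ->(s) { sleep(s) })
    @exceptions = exceptions
    @max_retries = max_retries
    @delay = delay     # seconds to wait, given the current attempt number
    @sleeper = sleeper # injectable so tests don't really sleep
    @attempts = 0
  end

  def run
    result = yield
    @attempts = 0 # once it succeeds, it's good again: reset the counter
    result
  rescue *@exceptions
    @attempts += 1
    raise if @attempts > @max_retries # out of retries: let it crash
    @sleeper.call(@delay.call(@attempts))
    retry
  end
end
```

Programming errors (NameError and friends) are deliberately not in the rescued list, so they still crash the process and surface in the bug tracker.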
00:25:39.299 so this adds a bit of stability to the thing. we don't have only one stream; we have like five, ten, at this point probably fifteen radios. and this is in the context of a Rails application, so even though the process doing this reading is a continuous background process, supervised and everything, it still has to write to the database. and if you want to write stuff for fifteen different radio streams, which can write at more or less random times, you have to be careful with how you connect to the database, because you could exhaust your connection pool. so I adopted the classical concurrency pattern of multiple producers and one consumer, which communicate through a queue; this is from the thread
00:26:48.410 library in Ruby. then we have a consumer: it waits for something to get into the queue, so it blocks here; once we get an element, we store the information in the database and then we fire a background job to handle it accordingly. on the producer side, we instantiate our client and do the parsing of the metadata, and this block is called whenever there is a new title; whenever there is a new title, we put it into the queue.
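The producer-consumer wiring can be sketched as follows. This is a toy version: the real consumer loops forever and writes through ActiveRecord and Sidekiq, while here it stops after two items so the example terminates, and the stream names are invented.

```ruby
# One queue decouples many stream-reader threads from a single writer
# thread, so only one thread ever needs a database connection.
queue = Thread::Queue.new

# Producers: one thread per radio stream; in the real system this push
# happens in the block yielded by the Icecast client on each new title.
producers = ["stream-a", "stream-b"].map do |stream|
  Thread.new { queue << [stream, "Artist - Title"] }
end

stored = []
consumer = Thread.new do
  2.times do
    stream, title = queue.pop  # blocks until a producer pushes something
    stored << [stream, title]  # real code: insert a row, enqueue a job
  end
end

producers.each(&:join)
consumer.join
```

Thread::Queue is thread-safe out of the box, so no explicit locking is needed around the push and pop.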
00:27:36.820 there are a few subtleties. one is, I'm using Sidekiq for this, and Sidekiq guarantees that the job runs at least once, but it could run more than once, so you have to take care not to process things twice. sometimes you just get bad metadata from some radios: for one day, maybe their currently-playing system is down, so they send you audio but not the description. you get jingles as well; what I call jingles is the telephone number or the Facebook page of the radio, stuff like that, advertising basically. and you also get partial matches. what I mean by partial matches is this: we have tracks in the mx3 database, but even if an artist is in the database, they may not have uploaded all their tracks, and we want to notify this artist if the ones they did upload are being played on the radio. and so, yeah,
00:28:50.110 how do you do that? I approached it with Postgres. Postgres has a module called fuzzystrmatch, and this one has a function calculating the Levenshtein distance, which is the edit distance between two strings, the number of differing characters. so let's start with the band: let's assume that a track called Fame is being played. this is what we receive; we parse the metadata and normalize the whole thing, making it as easy as possible to match against what we have in the database. what I do is get the closest match: I limit to one result, we order by distance with the closest distance first, and hopefully we get a band ID along with the distance from the string we want to match. then we do something similar with the tracks of the same band. by the way, this levenshtein function has a faster alternative, I think it's levenshtein_less_equal, which is made for small distances. what it does is, it takes a threshold: if you say, for example, the threshold is 4, and the distance is lower than 4, it computes the exact distance; otherwise it just gives you 4. I did not use it, because if the distance is greater than 3, in this case, I actually don't want it recorded as a partial match; for me it's too far away. I have observed this in practice: I cannot pin down the exact number, but there's a threshold beyond which you just get noise. if two artists have the same number of characters in their names, of course there will be some finite distance; it makes no sense to match too eagerly.
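For reference, the Postgres side might look roughly like the SQL in the comment below (table and column names are invented), and the Ruby function underneath illustrates what the Levenshtein distance actually computes:

```ruby
# SELECT id, levenshtein(lower(name), lower(:query)) AS distance
# FROM bands
# ORDER BY distance ASC
# LIMIT 1;
#
# levenshtein() comes from the fuzzystrmatch extension; the same
# computation in plain Ruby, for illustration:
def levenshtein(a, b)
  distances = (0..b.length).to_a # row for the empty prefix of a
  a.each_char.with_index(1) do |char_a, i|
    diagonal = distances[0]
    distances[0] = i
    b.each_char.with_index(1) do |char_b, j|
      above = distances[j]
      distances[j] = [
        above + 1,                             # delete from a
        distances[j - 1] + 1,                  # insert into a
        diagonal + (char_a == char_b ? 0 : 1), # substitute
      ].min
      diagonal = above
    end
  end
  distances[b.length]
end
```

A distance of 0 is an exact match; small positive distances are candidates for the partial-match review described next.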
00:31:08.559 so in the end, I translated this into ActiveRecord code, with sanitization and so on. I get the band and its track, and if both the band and the track are found, I return the track along with the combined score, the sum of the two distances. if this distance is zero, then it's an exact match; if it's not an exact match, then you have a partial match, and this can be presented to an admin who can then accept or reject it. this can be fine-tuned, but basically, this is it. thank you for listening.