The Musical Ruby

by Jan Krutisch

In 'The Musical Ruby,' Jan Krutisch presents a unique approach to music creation using the Ruby programming language during his talk at EuRuKo 2019. He begins by introducing Sonic Pi, a platform that allows users to code music effortlessly, demonstrating its capabilities with a pre-recorded video to avoid nerves while coding live. The presentation swiftly transitions into the fundamentals of creating sound with pure Ruby, complemented by SoX (Sound Exchange), a powerful tool for managing audio.

Jan elaborates on the basic principles of digital sound synthesis, explaining concepts such as:

Sound Waves: Sound is described as vibrating air molecules; loudspeakers turn electrical signals into that air movement.
Digital to Analog Conversion (DAC): He discusses how digital data needs to be converted to electrical signals to produce sound, highlighting the challenges involved in this process.
Sampling Frequency and Nyquist Theorem: The importance of sampling frequency for accurately representing sound is explained, particularly in relation to the 44.1 kHz standard of CDs.
Music Theory: Jan covers western musical notation, octaves, and MIDI tuning, clarifying how frequencies correspond to musical notes, particularly noting that 440 Hz corresponds to the pitch 'A' above middle C.
Sound Manipulation: He introduces filters (low-pass, high-pass, bandpass) and envelopes (ADSR) to shape sound over time, giving the audience an insight into sound design.
Drum Sound Creation: Using basic waveforms, Jan demonstrates how to replicate kick drums, snare drums, and hi-hats using Ruby code.
Sequencing and Mixing: A brief overview of arranging musical notes in patterns similar to software like Ableton Live is provided, laying the groundwork for creating complex tracks through programming.

Jan concludes his presentation by highlighting the mixing process, the use of effects to enhance sound, and the overall journey of creating music with Ruby. The audience is encouraged to explore the provided website for more details, and Jan acknowledges a minor mistake in the URL. Ultimately, the talk merges coding with music, showing the audience the potential of programming to create dynamic auditory experiences.

00:00:06.080 Next up is Jan, who lives in Hamburg. He's helped organize quite a few conferences and has attended many as well. Just a couple of months ago, he even taught me how to beatbox, giving me a brief introduction to how it works. It was awesome! Today, he's going to enlighten us some more on musical topics, so please join me in giving Jan a very warm welcome.

00:00:40.320 Thank you! Thank you for having me here. It's a massive honor to finally speak at EuRuKo, a conference that I absolutely adore, so this is great. Before I start, just a quick advertisement: I'm a co-founder of Depfu, a service that does automated dependency updates by sending you pull requests on GitHub. Some people really like us! We even brought t-shirts and stickers, so yay! I'll be in the speaker lounge after this, following the pitches, so hit me up there. Also, Florian, my co-founder, is somewhere in the audience.

00:01:22.799 Before I really start, a quick warning: please don't try to understand the code examples. They're there mostly to show complexities and to give you an idea of what’s going on. Everything is online, and there's a website called rubysynth.fun, although I think there's a typo there; it was nice to find that out while giving this presentation! You'll find a lot more information than what I can cover in 30 minutes on that website.

00:02:04.479 So, let’s start with some music. I’d like to do something different today. This is Sonic Pi. How many of you have played with Sonic Pi? (pause and look around) Okay, cool! I could have live coded this, but I know myself. In the first five minutes of a talk, I get so nervous that my fingers don't really work. I thought I’d better make a video of this instead. For some reason, I was really awkward on the video too, so this might take some time. I'm sorry about that!

00:02:35.760 You can write some code in Sonic Pi, and it will produce music. It's really cool! Sonic Pi has been around for a couple of years; it's written by Sam Aaron, and it's a wonderful system to learn coding and make music. I really recommend it. If you haven't looked at it yet, go to sonicpi.net and play with it! Finally, let’s try to create something that resembles music.

00:03:15.120 Let’s add another drum element to it. I really should have faked this better and pretended to type properly all the way through. Look how amazing I am! I know... Well, now let’s finally add some effects. Please hold on... Come on, you can do it! (pause) Alright! That was Sonic Pi, but we want to dig deeper because I want to show you how to make music using pure Ruby. Sonic Pi relies on a lot of advanced technologies under the hood that aren't Ruby, but I’m going to use pure Ruby along with a tool called SoX.

00:05:07.360 SoX, short for Sound Exchange, was conceived by Lance Norskog and is now maintained by Chris Bagwell. It's been around since 1991. Just out of curiosity, how many of you were born after 1991? (pause and look around) Cool! So, SoX is like a Swiss Army knife for audio file conversion—you can convert anything into anything, and it handles raw data very well.

00:05:33.680 If you take this Ruby code at the top, it's the simplest thing I could come up with to generate sound in Ruby. It produces a square wave and outputs it in binary format as a little-endian floating point 32-bit value. We can then convert this into something that your sound card can understand using the 'play' command, which is part of SoX. We could also save it as a WAV file, which is a sound file format that stores raw sound data, so you can play it in a browser.
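The slide code is not reproduced in this transcript, so the following is only a minimal sketch of the idea described above: compute one second of a square wave and write it to standard output as little-endian 32-bit floats. The 44.1 kHz sample rate, the 0.5 amplitude, the file name square.rb, and the exact SoX flags in the comment are assumptions, not taken from the talk.

```ruby
SAMPLE_RATE = 44_100 # samples per second (assumed)
FREQUENCY   = 440.0  # Hz
DURATION    = 1.0    # seconds
AMPLITUDE   = 0.5    # kept below 1.0 to be gentle on the ears

# Returns an array of floats that alternates between +amplitude and -amplitude.
def square_wave(frequency, duration, sample_rate: SAMPLE_RATE, amplitude: AMPLITUDE)
  count = (duration * sample_rate).round
  Array.new(count) do |i|
    phase = (i * frequency / sample_rate) % 1.0 # position inside the current cycle
    phase < 0.5 ? amplitude : -amplitude
  end
end

samples = square_wave(FREQUENCY, DURATION)

# "e*" packs every value as a little-endian single-precision (32-bit) float.
raw = samples.pack("e*")

# Only emit binary data when stdout is piped or redirected, for example:
#   ruby square.rb | play -t raw -r 44100 -e floating-point -b 32 -c 1 -
unless $stdout.tty?
  $stdout.binmode
  $stdout.write(raw)
end
```

The raw stream carries no header, which is why the sample rate, encoding, bit depth, and channel count have to be repeated on the SoX side.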

00:05:57.759 So, let's go ahead and do that. Be careful; it’s not very nice sounding, but hey, it’s sound! It’s something, right? Sound is essentially vibrating air molecules. When something starts moving, it sets air molecules in motion, creating a wave that our ears detect. Please don’t ask me to explain how the ear works; that’s complicated, and it’s a wonderful mechanism that allows us to hear these sound waves.

00:06:18.560 Now, we want to somehow turn electrical current into air movement because our computer operates with electrical current. To do that, we need a loudspeaker. A loudspeaker works using an electromagnet, the voice coil, that moves against a static magnet. The coil is connected to a diaphragm or membrane, which pushes air out rhythmically according to the electric current, creating moving air molecules.

00:06:38.240 Computers are digital, so we need to convert digital data to electrical current using a Digital to Analog Converter (DAC). Please don't ask me to explain those either, as they can be quite complex. In general, converting a digital signal into an analog signal comes with certain challenges. If you look at a scientific graph comparing digital and analog waves, you will see that one has steps while the other is smooth. Digital data is always discrete both in time and value, which can become an issue because electrical current typically is not.

00:07:26.000 There are a couple of things you need to know, starting with sampling frequency. This refers to how often you sample a signal in the digital domain. There's a rule that the sampling frequency needs to be at least double the highest analog frequency you want to express. This is called the Nyquist Sampling Theorem. The reason why CDs use 44.1 kilohertz as a sampling rate is that it can represent frequencies up to about 22 kilohertz, which covers the roughly 20 kilohertz upper limit of human hearing, at least for most people.
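A small sketch can make the sampling rule concrete. It computes the highest representable frequency for a given sample rate and then shows what goes wrong above it: a cosine at 43.1 kHz, sampled at 44.1 kHz, yields the same sample values as a cosine at 1 kHz. The helper names are made up for this illustration.

```ruby
SAMPLE_RATE = 44_100

# Highest frequency a given sample rate can represent.
def nyquist_limit(sample_rate)
  sample_rate / 2.0
end

# The first `count` samples of a cosine at the given frequency.
def sampled_cosine(frequency, count, sample_rate: SAMPLE_RATE)
  Array.new(count) { |n| Math.cos(2 * Math::PI * frequency * n / sample_rate) }
end

below = sampled_cosine(1_000, 64)                # well under the limit
above = sampled_cosine(SAMPLE_RATE - 1_000, 64)  # 43.1 kHz, above the limit

# Largest difference between the two sample sequences (floating-point noise only).
max_difference = below.zip(above).map { |a, b| (a - b).abs }.max

puts nyquist_limit(SAMPLE_RATE)
puts max_difference < 1e-9
```

Once sampled, the two tones cannot be told apart, which is why anything above half the sample rate has to be kept out of the signal.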

00:08:04.400 Let’s get back to the Ruby synth thing. This code again is really simple: it generates a square wave at 440 hertz. If you’re a musician, you'll understand that 440 hertz is the pitch for the A above middle C. If you remember anything from music school, that should have some meaning to you. I had to relearn all this since I forgot most of it!

00:09:06.560 It’s important to note that I’ll be talking about western musical notation specifically, as other cultures use completely different systems. In western notation, this resembles multiple octaves on a musical keyboard. An octave means you're doubling the frequency. For example, moving from a low C to the next C means you double the frequency while crossing 12 keys, which are called half tones.

00:09:43.200 These half tones are C, C#, D, D#, and so on. It's a unique way of looking at the black keys. While the note A is always defined as 440 hertz, defining the frequencies of other notes can get quite complicated. MIDI tuning offers a method for organizing notes with simplified numbers. A MIDI note number ranges from 0 to 127, where 0 is a very low C, 60 is middle C, and 69 is concert pitch A.
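The relationship between MIDI note numbers and frequencies can be sketched in a few lines. This assumes twelve-tone equal temperament, where each half tone multiplies the frequency by the twelfth root of two, anchored at note 69 = 440 Hz; the function name is hypothetical.

```ruby
A4_MIDI = 69    # MIDI note number of concert pitch A
A4_HZ   = 440.0 # its frequency in hertz

# Frequency of a MIDI note, assuming equal temperament.
def midi_to_hz(note)
  A4_HZ * 2.0**((note - A4_MIDI) / 12.0)
end

puts midi_to_hz(69)          # concert pitch A
puts midi_to_hz(81)          # one octave (12 half tones) higher
puts midi_to_hz(60).round(2) # middle C
```

Going up 12 note numbers doubles the frequency, which matches the definition of an octave given earlier.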

00:10:50.800 Now, a square wave at 440 hertz can become pretty dull pretty quickly. Check out this frequency spectrum—it’s cool because it allows us to visualize the harmonic content of the sound we're playing. To shape sound, we can use subtractive synthesis, which is similar to sculpting a big block of marble into a sculpture. You chisel away everything that doesn’t look like your desired form; similarly, we can start with high harmonic content and filter out parts of that spectrum.

00:11:48.920 We’re going to use a state-variable filter. The interesting part about this filter is that you can construct it either as code or as electronic components, which is quite handy. Here's the code for a state-variable filter. A cool aspect of this filter is that it can generate multiple outputs at the same time. The first output is a low-pass filter, which cuts out the high frequencies.
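The filter code from the slide is not part of this transcript. As a stand-in, here is a sketch of one well-known digital form of the state-variable filter, often attributed to Hal Chamberlin; whether the talk used this exact form is an assumption. The class name and the parameter names are made up for the example.

```ruby
class StateVariableFilter
  # cutoff_hz: corner frequency; damping: 1/Q, lower values resonate more.
  # This simple form is only well behaved for cutoffs far below the sample rate.
  def initialize(cutoff_hz:, damping: 1.0, sample_rate: 44_100)
    @f = 2.0 * Math.sin(Math::PI * cutoff_hz / sample_rate)
    @damping = damping
    @low = 0.0
    @band = 0.0
  end

  # Processes one input sample and returns [low_pass, high_pass, band_pass].
  def process(input)
    @low += @f * @band
    high = input - @low - @damping * @band
    @band += @f * high
    [@low, high, @band]
  end
end

filter = StateVariableFilter.new(cutoff_hz: 1_000.0)
low, high, band = nil
2_000.times { low, high, band = filter.process(1.0) } # feed a constant (0 Hz) signal
puts low.round(3)
```

A constant input is the lowest possible frequency, so it should end up almost entirely in the low-pass output while the high-pass and band-pass outputs settle near zero.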

00:12:22.960 This is an exciting step, as our square wave now sounds somewhat different. It’s still not great, but it’s improving. You could also use a high-pass filter that sounds a bit harsh, and a bandpass filter that only allows specific frequencies while rejecting others. While working with this method, we need to consider variance over time—this is important to create musical sounds.

00:13:30.720 By using envelopes, which shape sound parameters over time, we can create a much nicer sound. An attack-decay-sustain-release (ADSR) envelope allows a lot of control through relatively few parameters. Time goes on and you shape a specific parameter with this envelope. Modulating the volume of our filtered sound and adjusting other parameters over time brings us closer to a musically pleasing outcome.
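One possible way to express an ADSR envelope is as a function from time to a level between 0 and 1. The sketch below uses linear segments and hypothetical names; attack, decay, and release are durations in seconds (floats), sustain is a level, and gate is how long the note is held.

```ruby
Adsr = Struct.new(:attack, :decay, :sustain, :release, keyword_init: true) do
  # Level while the note is still held, t seconds after it started.
  def held_level(t)
    if t < attack
      t / attack                                     # ramp up to full level
    elsif t < attack + decay
      1.0 - (1.0 - sustain) * (t - attack) / decay   # fall to the sustain level
    else
      sustain
    end
  end

  # Level at time t for a note that is released after `gate` seconds.
  def level(t, gate)
    return 0.0 if t < 0
    return held_level(t) if t < gate

    fade = 1.0 - (t - gate) / release                # 1.0 at release, 0.0 at the end
    fade.positive? ? held_level(gate) * fade : 0.0
  end
end

envelope = Adsr.new(attack: 0.01, decay: 0.1, sustain: 0.6, release: 0.2)
[0.005, 0.3, 0.6, 1.0].each do |t|
  puts "#{t}s -> #{envelope.level(t, 0.5).round(3)}"
end
```

Multiplying each audio sample by the envelope level at that moment shapes the volume; feeding the same level into a filter cutoff shapes the brightness instead.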

00:14:40.880 In the world of synthesizers, we can create specific sounds, such as the classic kick drum, snare drum, and hi-hat. Kick drums are the largest drums in a set; they produce a sharp impact followed by decay of vibrations. We can emulate this by using a sine wave oscillator, pitch it up, then pitch it down before allowing it to fade out, leading to a basic kick drum sound.
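A rough sketch of such a kick drum: a sine oscillator whose pitch starts high and drops quickly, multiplied by a decaying volume. This simplified version starts directly at the high pitch and sweeps down exponentially; the frequencies and decay times are guesses for illustration, not values from the talk.

```ruby
SAMPLE_RATE = 44_100

# Returns an array of samples in -1.0..1.0 for one kick drum hit.
def kick(duration: 0.3, start_hz: 180.0, end_hz: 50.0, pitch_decay: 0.03, amp_decay: 0.08)
  phase = 0.0
  Array.new((duration * SAMPLE_RATE).round) do |i|
    t = i.to_f / SAMPLE_RATE
    frequency = end_hz + (start_hz - end_hz) * Math.exp(-t / pitch_decay) # falling pitch
    amplitude = Math.exp(-t / amp_decay)                                  # fading volume
    sample = Math.sin(phase) * amplitude
    phase += 2.0 * Math::PI * frequency / SAMPLE_RATE # accumulate phase so the sweep stays smooth
    sample
  end
end

drum = kick
puts drum.length
```

Accumulating the phase sample by sample, rather than computing sin(2πft) directly, avoids glitches while the frequency is changing.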

00:15:34.000 For the snare drum, we can use similar techniques. Since snare drums are smaller, they produce higher tones when we pitch our sine wave up. Snare drums also have snares over one drum head, creating a very noisy sound. So, to make our own snare sound, we can generate white noise and filter it down; with the right envelopes, it becomes audible like a snare drum.
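The noisy part of the snare can be sketched as white noise that is filtered and then faded out. To keep the example short it uses a simple one-pole low-pass filter rather than a state-variable filter, leaves out the pitched sine component, and fixes the random seed so the result is repeatable; all numbers are illustrative assumptions.

```ruby
SAMPLE_RATE = 44_100

# Returns the samples of a short burst of filtered, decaying white noise.
def snare_noise(duration: 0.2, cutoff_hz: 6_000.0, decay: 0.05, seed: 1)
  rng = Random.new(seed)
  # One-pole low-pass coefficient, between 0 and 1.
  coefficient = 1.0 - Math.exp(-2.0 * Math::PI * cutoff_hz / SAMPLE_RATE)
  filtered = 0.0
  Array.new((duration * SAMPLE_RATE).round) do |i|
    noise = rng.rand * 2.0 - 1.0                  # white noise in -1.0...1.0
    filtered += coefficient * (noise - filtered)  # smooth away the highest frequencies
    filtered * Math.exp(-(i.to_f / SAMPLE_RATE) / decay) # exponential volume envelope
  end
end

snare = snare_noise
puts snare.length
```

Shortening the decay makes the hit tighter, and lowering the cutoff makes it duller.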

00:16:16.640 The hi-hat is another integral part of any drum set. It’s made of two cymbals that generate short, percussive sounds. By using white noise, applying a bandpass filter for higher frequencies, and two different amplitude envelopes—one for closed and one for open—we can create a hi-hat sound. Once we have all our drum sounds, we can mix them together!

00:17:43.679 Now that we’ve built our drum sounds, we’ll need to sequence them to make a musical piece. To do this, we need to understand some terms: beats, bars, notes, and a few more basics. I’ll explain this within the context of a 4/4 measure, which is what most pop songs, as well as techno tracks, use.

00:18:14.720 A 4/4 measure breaks down to four beats in a bar. Musicians often count from one, which can irritate programmers like me if you’re trying to count your measures. A bar represents one measurement of music—it can contain notes of varying lengths such as whole notes, half notes, quarter notes, eighth notes, and sixteenth notes.

00:19:26.320 Machines use something called a step sequencer to create patterns. You'll recognize the Roland 808 drum machine, which has 16 buttons at the bottom. You can program a bar of music using these buttons with relative ease. Another aspect to understand is tempo, which is measured in beats per minute (BPM) or quarter notes per minute. If we set the BPM to 120, we can easily calculate the length of beats and bars.
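The tempo arithmetic is short enough to write down. This sketch assumes a 4/4 measure and a 16-step sequencer, so four steps per beat; the function names are hypothetical.

```ruby
# Length of one beat (quarter note) in seconds.
def beat_seconds(bpm)
  60.0 / bpm
end

# Length of one bar in seconds.
def bar_seconds(bpm, beats_per_bar: 4)
  beat_seconds(bpm) * beats_per_bar
end

# Length of one sequencer step (sixteenth note by default) in seconds.
def step_seconds(bpm, steps_per_beat: 4)
  beat_seconds(bpm) / steps_per_beat
end

puts beat_seconds(120) # 0.5
puts bar_seconds(120)  # 2.0
puts step_seconds(120) # 0.125
```

At 120 BPM each of the 16 steps in a bar lasts an eighth of a second, which at 44.1 kHz is 5512.5 samples, so a real sequencer has to round step positions to whole samples.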

00:20:34.560 Next, we want to organize our musical notes into patterns, which are collections of notes that can be reused. We can consider them as methods in a programming language. Note patterns are simply collections of notes defined by lengths and pitches, and you can arrange them into songs. This approach is very similar to the arrangement you’d use in software like Ableton Live.

00:21:44.480 Let’s build a simple DSL (domain-specific language) for this. For drum patterns, I’m using strings to show the on-and-off notes. This is what a typical techno beat might look like with snares on the twos and fours, a kick drum on each beat, and a hi-hat on the eighth notes in between.
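The DSL from the slides is not shown in this transcript, so the following is a guess at what a string-based drum pattern could look like: 16 characters per bar, "x" for a hit and "-" for silence. The pattern strings follow the beat described above; the hash layout and helper names are assumptions.

```ruby
PATTERNS = {
  kick:  "x---x---x---x---", # every beat
  snare: "----x-------x---", # beats two and four
  hihat: "--x---x---x---x-"  # the eighth notes in between
}.freeze

# Zero-based step numbers on which a pattern triggers its sound.
def active_steps(pattern)
  (0...pattern.length).select { |index| pattern[index] == "x" }
end

# Trigger times in seconds, assuming four steps per beat.
def trigger_times(pattern, bpm:)
  step_length = 60.0 / bpm / 4
  active_steps(pattern).map { |step| step * step_length }
end

PATTERNS.each do |name, pattern|
  puts "#{name}: #{active_steps(pattern).inspect}"
end
```

Strings have the advantage that the pattern reads like the row of buttons on a step sequencer.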

00:22:55.520 For loop patterns, we want to input notes at certain intervals, which is a little more intricate since there are more parameters to consider. We can define notes accordingly, including their length, and organize them into a song structure, similar to what you’d do in a music software environment. For example, defining when patterns start and how many times they repeat allows us to effectively create songs.

00:24:20.640 However, something is still missing. This is where mixing comes into play. In studios, mixing desks allow you to plug in various instruments and mix them together. In software, we can replicate this structure. Each channel can control a specific instrument, like a bass drum or hi-hat, and through these channels, we can adjust various parameters before sending the output to the master channel.
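At its core, a software mixer can be sketched as a weighted sum: every channel contributes its samples scaled by its volume, and the master output is clamped to the valid range. The hash keys and the hard clamp are assumptions made for this illustration; a real mixer would also apply per-channel effects before summing.

```ruby
# channels: array of hashes with :samples (array of floats) and :volume (float).
# Returns the master output, as long as the longest channel.
def mix(channels)
  length = channels.map { |channel| channel[:samples].length }.max || 0
  Array.new(length) do |i|
    sum = channels.sum { |channel| channel[:volume] * (channel[:samples][i] || 0.0) }
    sum.clamp(-1.0, 1.0) # crude protection against clipping
  end
end

master = mix([
  { samples: [1.0, 0.5], volume: 0.5 },
  { samples: [0.5],      volume: 1.0 }
])
puts master.inspect
```

Shorter channels are treated as silent once they run out of samples, so sounds of different lengths can share one master channel.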

00:25:52.000 In terms of control, we have volume, equalizer settings, and compressor parameters. An equalizer essentially allows you to adjust the volume of different parts of the frequency spectrum to prevent overlapping frequencies from sounding muddy together. A compressor helps balance and increase the overall volume dynamically.

00:27:05.919 Insert effects can be creative too; for instance, a wave shaper distorts input signals based on a specific formula, producing different sonic characteristics. Lastly, we have chorus effects, creating a more lush sound by mixing input signals with slight delays. All of these adjustments enhance the overall sound as we approach the mixing stage.
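A wave shaper can be as small as one function applied to every sample. The talk does not say which formula it uses; this sketch picks the hyperbolic tangent, a common choice for soft distortion, and the drive parameter is an assumption.

```ruby
# Soft-clipping wave shaper. Higher drive values distort more.
# Dividing by tanh(drive) keeps an input of 1.0 at an output of 1.0.
def waveshape(sample, drive: 4.0)
  Math.tanh(drive * sample) / Math.tanh(drive)
end

[0.0, 0.25, 0.5, 1.0].each do |input|
  puts "#{input} -> #{waveshape(input).round(3)}"
end
```

Quiet samples are boosted while loud ones are squashed toward the limits, which bends the waveform and adds harmonics that were not in the input.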

00:28:20.480 Send channels are a unique aspect; they let you route several channels into one shared effect, like adding the same reverb to a whole drum kit instead of just individual sounds. As we wrap up our discussion on mixing, let’s listen to a more complex sound mix that includes all the elements and effects we've talked about! (transition to music demo)

00:29:49.280 (Transition music) Now let’s wrap it up! Thank you so much for listening to my journey using Ruby to create music! Remember to check out the website for more information. Thank you once again for this opportunity!

00:30:34.560 (Music in the background as Jan concludes remarks) As I said, I'm sorry about the website typo, with the dash incorrectly placed in the URL. I’m working on correcting that! Thank you all so much for this experience; I loved sitting back and listening to the waves of sound we created together.