How music works, using Ruby

by Thijs Cadier

The video titled "How music works, using Ruby" presented by Thijs Cadier at RailsConf 2022 explores the intricacies of music production and the evolution of audio technologies through the lens of programming in Ruby. Cadier begins with an explanation of music as a phenomenon perceived by the brain, shaped by sound waves that oscillate in the air. He highlights that music is not only sound, but also an emotional experience that remains partially unexplained by neuroscience.

Key Points Discussed:
- History of Music Recording Technologies: Cadier traces the development of music recording from mechanical devices and wax cylinders to magnetic tape and digital audio. He explains the transition from live to recorded music with the advent of early recording devices, showing how technology transformed music's accessibility.
- Understanding Digital Audio: The presentation covers how sound can be sampled and manipulated digitally, emphasizing the importance of waveform representation. Cadier introduces Ruby code that manipulates audio samples, demonstrating basics of audio processing such as amplification and noise generation.
- Practical Examples with Ruby: He demonstrates how to generate different types of sound waves (sine, square, and noise) through Ruby code, illustrating fundamental sound-synthesis concepts.
- Mixing and Audio Effects: Cadier discusses how multiple waveforms can be combined into one complex signal, and touches on audio compression techniques and their significance in achieving a consistent sound level. Clipping and distortion are mentioned as common elements in music production, particularly in rock styles.
- Technical Programming Demonstration: Throughout, Cadier provides Ruby code examples for creating and manipulating sound, making technical concepts accessible through programming analogies.
- Conclusion with a Musical Performance: Cadier closes by playing a cover of a well-known song, underlining the practical application of the concepts discussed throughout the presentation.

The key takeaway from Thijs Cadier's presentation is the interconnection between music production and programming, emphasizing how coding can be an integral part of understanding and creating music. The insightful exploration of digital audio along with hands-on coding examples illustrates the potential for developers to engage with music technology creatively.

00:00:13.519 I work at a company called AppSignal, and we have a fully remote work environment. Every year, we hold an internal conference because it is important to find ways to stay in touch with each other and understand who everybody is. This presentation is something I've been working on for four years. I have done a series of internal talks about music, and I have wrapped it all up into one comprehensive presentation.
00:00:30.900 Today, I'm going to provide examples of various music-related concepts. I've noticed that procrastination is a big issue among developers, and I can relate: I've been getting into music production for a couple of years now. Along the way, I've found that writing code is a great way to understand the world, because it forces you to build a complete mental model, which is essential for automation.
00:01:05.880 So for this presentation, I went through a process that helps me deeply understand music. It's similar to learning by doing, which involves modeling concepts without fully grasping them at first. Today, we'll do a quick history of music recording technologies to give us a sense of our starting point. We'll then delve into digital audio and explore ways to manipulate it and generate samples.
00:01:54.960 So, what exactly is music? It's a rather strange phenomenon since it only exists in our brains. Neuroscientists don’t fully understand how this process works, but we know that it involves sound waves. These waves can be visualized similarly to how waves behave in water, which makes it a bit more intuitive for us. The speakers in this room create movement in the air that sends vibrations to our ears, ultimately resulting in what we perceive as music.
00:03:05.000 These vibrations stimulate tiny hair cells in our ears, which feed signals to our brains. Our brains find meaning in these complex waveforms, and we perceive them as music, which evokes emotional responses that we still cannot fully explain. Perhaps one day we will understand this mystery better.
00:03:39.180 To understand how this basic process works, we need to consider an important aspect: the frequency of the sound waves. This refers to how many times the waveform oscillates per second. Here’s a simple waveform; I will play a sine wave that oscillates 440 times per second, which is the standard pitch for the note A.
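That 440 Hz sine wave can be sketched in a few lines of plain Ruby; the constant names here are illustrative, not taken from the talk's own code.

```ruby
# One second of a 440 Hz sine wave (the note A), as an array of
# floating-point samples between -1.0 and 1.0.
SAMPLE_RATE = 44_100  # samples per second, the CD-audio standard
FREQUENCY   = 440.0   # oscillations per second (pitch A)

sine_samples = (0...SAMPLE_RATE).map do |i|
  Math.sin(2 * Math::PI * FREQUENCY * i / SAMPLE_RATE)
end
```

Changing FREQUENCY moves the pitch up or down, which is exactly how the different notes in the demo are produced.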
00:04:00.300 By playing different frequencies, we can create what we perceive as musical notes that sound like this.
00:04:06.979 So, this is just a sine wave generated at various frequencies. Next, I will demonstrate a more complex waveform by combining simple waves. This complexity gives birth to richer sounds, such as those produced by a piano.
00:04:58.740 Now let’s discuss tempo. Tempo is influenced by the spacing between waveforms. When we combine different waveforms and introduce rhythm, we create something that resembles music even if it's quite basic. Back in the day, people would perform music live since there was no recorded music, but this all changed at the beginning of the 20th century.
00:06:06.360 The first known music recording device used wax cylinders. You could shout into it, and a needle would vibrate, etching the sound into the wax. It wasn’t very useful, but it laid the groundwork for recording technology. With the invention of phonographs, audio began to be captured more reliably. However, these machines were still mechanical, so they didn't produce much volume.
00:06:52.980 Then came magnetic tape recorders, which encoded sound using iron particles on tape. This technology advanced rapidly during World War II, leading to significant enhancements in audio recording. After the war, technologies that had originally been developed for radar were leveraged for recording music.
00:07:40.800 With these advancements, it became possible to amplify sound signals, enabling recordings that were clearer and louder. We began to see multiple tracks in mixing sessions, and transistor-based technologies transformed the industry.
00:08:27.900 Then the 1980s introduced digital audio, which took a while to master. Digital audio represents sound as a series of discrete steps that approximate a smooth curve, whereas an analog signal is continuous. Sampling captures the waveform thousands of times a second to recreate the sound as accurately as possible.
00:08:57.680 Digital audio technology has replaced many older methods, with software now incorporating the functionalities of million-dollar studios. For this presentation, I will demonstrate the recreation of some of these audio processes using Ruby code.
00:09:48.780 I'm going to use a Ruby gem called WaveFile, which can read and write digital audio files. We start by opening an audio file, reading its sample data into an array, and manipulating it as needed.
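The talk uses the WaveFile gem for this step; as a dependency-free sketch of the same writing step, here is a minimal 16-bit mono PCM WAV writer built only on the Ruby standard library (the `write_wav` helper and file name are illustrative, not the talk's code).

```ruby
# Dependency-free stand-in for the WaveFile gem's writer: save an
# array of float samples (-1.0..1.0) as a minimal 16-bit mono PCM WAV.
def write_wav(path, samples, sample_rate = 44_100)
  pcm  = samples.map { |s| (s.clamp(-1.0, 1.0) * 32_767).round }
  data = pcm.pack("s<*")                 # 16-bit signed little-endian
  byte_rate = sample_rate * 2            # mono, 2 bytes per sample
  header  = "RIFF" + [36 + data.bytesize].pack("V") + "WAVE"
  header += "fmt " + [16, 1, 1, sample_rate, byte_rate, 2, 16].pack("VvvVVvv")
  header += "data" + [data.bytesize].pack("V")
  File.binwrite(path, header + data)
end

# One second of silence, written out:
write_wav("silence.wav", Array.new(44_100, 0.0))
```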
00:10:15.300 Next, we’ll use a library called ChunkyPNG to create visual representations of the audio data because visualizing is crucial in understanding audio.
00:10:54.800 For example, we can visualize the beginning of a hi-hat sound. If we take the sine wave we previously discussed, it will have a more discernible waveform. In the code, we are converting a long array of numbers into a visual format by drawing points above and below a central line based on their values.
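The talk renders these images with ChunkyPNG; the same above-and-below-a-center-line idea can be sketched without any dependency as an ASCII rendering (`ascii_waveform` is an illustrative name, not the talk's code).

```ruby
# Plot samples (-1.0..1.0) as characters in a text grid: each sample's
# value becomes an offset above or below a central line.
def ascii_waveform(samples, height = 11)
  center = height / 2
  grid = Array.new(height) { " " * samples.length }
  samples.each_with_index do |s, x|
    y = center - (s * center).round  # positive samples plot above the line
    grid[y][x] = "*"
  end
  grid.join("\n")
end

wave = (0...40).map { |i| Math.sin(2 * Math::PI * i / 40) }
puts ascii_waveform(wave)
```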
00:12:07.440 Additionally, we will discuss how to compress audio data visually, which allows us to see the audio signal compressed over a longer time frame compared to looking at individual samples.
00:12:48.800 Now, let's take the same drum loop we analyzed before and see how we can manipulate its volume programmatically. By examining the sample values, we can gradually increase the levels until we hear a desired amplification.
00:14:00.780 This is achieved by multiplying the sample values, but too much amplification can lead to clipping, a form of distortion that occurs when the signal exceeds the maximum level that can be accurately represented.
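The amplify-then-clip behavior can be sketched in a few lines, assuming samples are floats in -1.0..1.0 (this is an illustration, not the talk's actual code).

```ruby
# Amplification multiplies every sample by a gain; when the result
# exceeds the representable range, it must be clamped, which is clipping.
def amplify(samples, gain)
  samples.map { |s| (s * gain).clamp(-1.0, 1.0) }
end

quiet = [0.1, -0.2, 0.4, -0.6]
amplify(quiet, 2.0)  # the last sample would be -1.2, so it clips to -1.0
```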
00:15:04.320 Clipping can turn a musical note into a harsh distortion, often characterizing genres like rock music, where artists like Jimi Hendrix intentionally push signals beyond healthy limits for creative purposes.
00:16:25.340 We can also generate sound using synthesizers. Synthesizers contain sound sources and filters. For example, creating noise involves generating random samples, which can produce white noise, akin to static from an unused radio frequency.
00:17:12.360 To illustrate this, we can write a simple Ruby script that loops through a range of values to generate random audio samples, producing noise effectively with minimal coding.
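Generating white noise really is this minimal; a sketch of the idea in plain Ruby:

```ruby
# White noise: every sample is an independent random value, spreading
# energy evenly across all audible frequencies.
noise = Array.new(44_100) { rand * 2.0 - 1.0 }  # uniform in -1.0...1.0
```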
00:17:56.640 We can also create square waves, which have a distinctive sound heard in many electronic music tracks. By alternating between high and low values based on an oscillator model, we can generate a square wave that sounds like this.
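The alternation between high and low values can be sketched like this (an illustrative oscillator, not the talk's exact code):

```ruby
# A square wave alternates between +1 and -1 once per half period.
def square_wave(frequency, sample_rate, length)
  period = sample_rate / frequency.to_f
  (0...length).map { |i| (i % period) < period / 2 ? 1.0 : -1.0 }
end

square = square_wave(440, 44_100, 44_100)
```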
00:18:34.680 Finally, we have the sine wave, which is defined mathematically and produces a smoother output: each sample is computed with the sine function, tracing a single pure oscillation.
00:19:21.180 The concept of Fourier analysis teaches us that all sounds can be expressed as combinations of sine waves, enabling us to create different musical tones by blending multiple sine waves.
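Fourier's insight can be demonstrated in miniature by summing sine waves at a fundamental frequency and its harmonics; the `additive` helper and the chosen partials below are illustrative (summing odd harmonics at 1/n amplitude converges toward a square wave).

```ruby
# Additive synthesis: sum several sine waves, each with its own
# frequency and amplitude, into one richer waveform.
def additive(partials, sample_rate, length)
  (0...length).map do |i|
    partials.sum do |freq, amp|
      amp * Math.sin(2 * Math::PI * freq * i / sample_rate)
    end
  end
end

# Odd harmonics of 110 Hz at 1/n amplitude: a square-ish timbre.
partials = [[110, 1.0], [330, 1.0 / 3], [550, 1.0 / 5]]
blend = additive(partials, 44_100, 44_100)
```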
00:19:56.180 Next, I’ll briefly touch on mixing sounds. In a mixing desk, various audio signals are combined into one that can be outputted together, creating a fuller sound. The mixing process involves summing multiple audio tracks and managing levels meticulously.
00:20:25.200 During mixing, taking care to avoid clipping by adjusting volume levels is essential for achieving a polished sound. Each channel can be manipulated to create a well-balanced output.
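Summing tracks while managing levels can be sketched as follows; `mix` and the per-channel gains are illustrative names, not the talk's code.

```ruby
# Mixing sums the corresponding samples of each track; scaling each
# channel by a gain keeps the combined signal from clipping.
def mix(tracks, gains)
  length = tracks.map(&:length).max
  (0...length).map do |i|
    tracks.each_with_index.sum { |track, t| (track[i] || 0.0) * gains[t] }
  end
end

drums = [0.5, -0.5, 0.5, -0.5]
bass  = [0.8, 0.8, -0.8, -0.8]
mixed = mix([drums, bass], [0.5, 0.5])
```

Halving each channel's gain before summing guarantees that two full-scale tracks cannot exceed the representable range.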
00:21:08.640 Compression is another crucial technique. Compression reduces the dynamic range of sound, making quiet sounds louder and loud sounds quieter. This allows for a more consistent listening experience, often essential in music production.
00:21:51.360 When applying compression, we typically define a threshold that signals whether a sample needs adjusting. By selectively manipulating the louder portions of a waveform, we can improve the overall mix for better output.
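One common simple design is a threshold-plus-ratio compressor; the talk does not show this exact code, so treat it as an illustrative sketch.

```ruby
# A bare-bones compressor: any sample whose magnitude exceeds the
# threshold has its overshoot reduced by the ratio, shrinking the
# dynamic range of the signal.
def compress(samples, threshold: 0.5, ratio: 4.0)
  samples.map do |s|
    overshoot = s.abs - threshold
    next s if overshoot <= 0
    (threshold + overshoot / ratio) * (s.negative? ? -1 : 1)
  end
end

compress([0.2, 0.9, -1.0])  # only the loud samples are reduced
```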
00:22:21.960 Now let’s listen to a simple audio sample and see how compression can transform it by balancing dynamics without losing the essence of the sound.
00:22:58.920 In conclusion, I have demonstrated various audio concepts ranging from sound generation to mixing. Before we wrap up, I want to showcase a cover I created using the techniques I discussed today, which demonstrates how easily we can produce music using simple samples.