Audio Processing

lo-fi hip hop ruby - beats to relax/study to

by Zachary Schroeder

The video features a presentation titled "lo-fi hip hop ruby - beats to relax/study to" by Zachary Schroeder at RubyConf 2019. The main theme is building a lo-fi hip hop beat-making tool in Ruby, aimed at producing music for relaxing and studying. The speaker shares his personal journey of developing Ruby Lo-Fi, a software application designed to help users create chillhop music tracks.

Key points discussed include:

- Introduction to Lo-fi Hip Hop: Zach begins by discussing the relaxing qualities of lo-fi hip hop music and its popularity, particularly in YouTube playlists for studying.
- Using Ruby for Audio Manipulation: The core idea is to leverage Ruby along with several gems (Ruby 2D, Ruby Audio, and Ruby Beats) to design a music composition tool. He explains the rationale behind his project and the technical challenges he faced, such as creating UI elements from scratch due to limitations in the graphic library used.
- Demonstration of Ruby Lo-Fi: Zach provides a live demonstration of the application. He showcases features such as sample editing, adding beats (e.g., 808 kick, hi-hats, snare), and utilizing sliders for beats per minute (BPM) adjustment. He also mentions the limitations of the tool, like its single-threaded nature leading to slower operations.
- Learning to Create Projects: Beyond the application itself, Zach emphasizes the importance of personal projects for enjoyment and learning. He advocates for starting small to increase the chances of project completion, suggesting that listeners should consider fun, experimental projects without the weight of expectations.
- Technical Considerations: Throughout the talk, Zach introduces essential audio concepts such as frequency, amplitude, and sample rate, demystifying how digital audio works. He emphasizes using arrays of numbers to process audio and highlights optimizations that could be made in future versions of the app.
- Conclusion and Call to Action: Zach encourages audience members to try out Ruby Lo-Fi and collaborate on further development. He reiterates the value of community engagement and seeking feedback during the project development process. Additionally, he touches on his employer, General Dynamics, indicating they are hiring and inviting the audience to inquire if interested.

The main takeaway from the presentation is the message to code for happiness and creativity, encouraging attendees to explore their projects joyfully and to share their creations with others.

00:00:12 All right, the timer has started. Hello everyone, my name is Zach, and I'm from Pittsburgh, PA. I've been coming to RubyConf every year since 2014, so it's been a while. It’s still one of my favorite things to do throughout the year. Thank you all for coming to listen to me talk about this little project that I made; I really appreciate it. We have a pretty full room, so let's get started. Is my voice nervous in here, or is it just me?
00:00:36 So, you might have been listening before the music faded down. The music was a chill sort of instrumental with a beat under it, which some may call lo-fi hip hop beats. That’s kind of what my talk is about. I imagine everyone is familiar with the YouTube playlists at this point, right? Yes? No? Okay, we're relaxing and studying to them. My whole premise was: can I take my favorite language, Ruby, and make a tool to help me create little tracks like that? I'm going to do a tiny bit of typing, which means I'm definitely going to mess it up.
00:01:52 Okay, now we have our project directory, and we have some samples in there. Rather than going through the slides right away (I have about a million of them, and I hope we don't get through them all), I’d like to show you what it does and start off that way. We’re going to say `ruby lo-fi` and the classic `app.rb`. The window is pretty small, but I hope you can see the text; it’s rather tiny. It’s kind of weird making a little app that you expect people to look at on a big screen, but anyway, we're going to hit 'new.' This is a file browser. It's a UI element that we are all familiar with, but actually, I had to make it myself because the graphics library I’m using, Ruby 2D, doesn’t have more complex objects like file browsers or buttons.
00:02:19 So, whenever you see me interacting with like a checkbox or a slider, that’s something I had to make for this project, which is something I'll probably mention later. Let’s jump into tracks and this interesting track here. One problem is this is a single-threaded app. It doesn’t do anything concurrently, so whenever you’re loading something, you get the nice beach ball. That would be a good future improvement to load things asynchronously. So now we have this kind of confusing jumble of UI elements that I made for this project. Let's start at the top with the sample editor; as you might be familiar, that is the audio file represented in waveform.
00:03:21 Something like that—you're probably familiar. The next heading down is sample effects; as you might imagine, those apply effects to a sample. The third option down says 'make a beat,' and you see tons and tons of circles. The rows of circles represent 16 segments of one measure, so as you can imagine, every fourth circle represents a beat: one, two, three, four. Finally, track settings. This is probably the most confusing part, but we'll get to that. If you’ve used Audacity or another music program before, you’re probably familiar with selecting part of a larger track. We're going to go ahead and do that now. Actually, let’s wind it back a little bit farther. I just dragged a little selection over a part of a larger track, and if you look at the bottom, you see a build button.
00:05:06 So that’s a little loop. Now let’s get to the complicated part: measures in the sample. If I've selected a sample here, I can now say how many measures are in that sample. Right now, it’s set to one; I can set it up to four. You’ll notice that the BPM changes when I move the slider. What does that mean? Well, the BPM, or beats per minute, is fixed to the length of the sample you’ve selected. So, the more beats you have to fit into that sample length, obviously the BPM is going to change. 45 is kind of slow; let’s set it to 2. So now we have 91 beats per minute, and there are 2 measures in a sample. Pop quiz time: if there are 2 measures in a sample and there are 4 measures in a loop... I just forgot what I was going to say. I wanted to be clever, but I'll come back to that.
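The relationship between selection length, measure count, and BPM described here can be sketched in a few lines. This is an illustration only, not code from Ruby Lo-Fi: the `bpm_for` helper is hypothetical, 4/4 time is assumed, and the 5.275-second selection length is chosen so the numbers line up with the ones read out in the demo.

```ruby
# Hypothetical helper, not taken from the Ruby Lo-Fi source.
# Assumption: 4/4 time, so each measure holds 4 beats.
BEATS_PER_MEASURE = 4

# BPM is fixed by how many beats must fit into the selected audio.
def bpm_for(sample_seconds, measures)
  beats = measures * BEATS_PER_MEASURE
  beats * 60.0 / sample_seconds
end

# Assumed selection length, picked to match the demo's readout.
selection_seconds = 5.275

puts bpm_for(selection_seconds, 1).round # => 45
puts bpm_for(selection_seconds, 2).round # => 91
```

Doubling the number of measures in the same selection doubles the beats that must fit in the same time, so the BPM doubles as well.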
00:06:08 Measures in the loop means how many measures are in a loop, and loops in track means how many loops are in a track. It’s pretty straightforward, but messing with these sliders and standing up here trying to explain it is kind of throwing me off. So instead, let’s get to the beat part. I just clicked down on 'none,' and we have all these samples here. You might recognize some of these names: 808... Let's go ahead and add an 808 kick. I just want to throw down a couple of kicks here. Let's pick another one: ‘HC,’ which is very creatively named; it stands for hi-hat closed. We’ll go ahead and select some circles and then let’s add one more thing: an 808 snare.
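A grid of 16 circles per row maps naturally onto a 16-step pattern. The sketch below is not how Ruby Lo-Fi or the Beats gem do it; it only illustrates the idea, with hypothetical names (`render_row`, `mix`), made-up three- and two-sample "drum hits", and an unrealistically small `samples_per_step` so the result is easy to inspect.

```ruby
STEPS_PER_MEASURE = 16

# Place a one-shot sample at every active step ("x") of a 16-step pattern.
def render_row(pattern, one_shot, samples_per_step)
  unless pattern.length == STEPS_PER_MEASURE
    raise ArgumentError, "pattern needs #{STEPS_PER_MEASURE} steps"
  end

  measure = Array.new(STEPS_PER_MEASURE * samples_per_step, 0.0)
  pattern.each_char.with_index do |step, i|
    next unless step == "x"

    start = i * samples_per_step
    one_shot.each_with_index do |value, offset|
      position = start + offset
      break if position >= measure.length # drop any tail past the measure

      measure[position] += value
    end
  end
  measure
end

# Mix rows by summing the samples at each position.
def mix(*rows)
  rows.transpose.map(&:sum)
end

kick = [0.8, 0.4, 0.2] # stand-in data, not a real 808 sample
hat  = [0.3, 0.1]      # stand-in data, not a real hi-hat sample

kick_row = render_row("x...x...x...x...", kick, 4) # kick on every beat
hat_row  = render_row("..x...x...x...x.", hat, 4)  # hat between beats
measure  = mix(kick_row, hat_row)

puts measure.length
```

In the kick pattern every fourth step is active, which matches "every fourth circle represents a beat" from the demo.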
00:09:21 You'll see a stitch directory pop up on my desktop. My thinking was to provide a way to stitch multiple smaller beats together. I was hoping that after I talked for a while, I’d calm down a little bit, but it’s really not happening. I need a flotation tank or something. So that’s kind of the application, and now it’s time to get into some other things I’d like to talk about as they pertain to this app and to other things in general. Now I have to turn on display mirroring.
00:10:37 Well, am I doing something weird, like arrangement? There we go. All right! I must go ahead and play that. Okay, so I just bumbled through a little talk about that app that I made called Ruby Lo-Fi. My name is Zach, as I mentioned. I made Ruby Lo-Fi; it’s a project. I have a very specific definition of a project, and we’ll get to that in a second. So this slide deck is going to be about that, but it’s also going to be about you and your project. I have a question: do you have a project using my very specific definition? Do you have something that’s not work, that’s just for fun, for the joy of it?
00:12:04 As Matz would say, to use Ruby is like a fun little choose-your-own-adventure game. There are no requirements; there’s nothing you have to do. It can be a fun little thing for your experience in any style you want. You can write tests or not; you can be your own boss. But importantly, just relax and have fun with it! When starting a project, we often wonder how to start and where to get inspiration from.
00:12:31 Well, there are two scenarios: either you have an idea, or you don’t. Let’s say you have an idea. Is it a good idea? What is good? Let’s consider another definition. A definition of good might be thinking small. I’m a serial starter of projects, and I almost never finish them. The smaller the project, the easier it will be to finish. I showed you Ruby Lo-Fi; it’s not finished, but it does what I set out to do. So I think that was the right size. Maybe you should think, "this should take me a couple of weeks or months to finish," rather than a couple of years. If your idea is to take on Facebook head-on, the odds of finishing that are approaching zero.
00:13:40 But it’s your choice; I mean, it’s for fun! In the case that you don’t have an idea, you could ask people: do you have a problem I can solve or an idea that I can bring to life? This is surprisingly successful; people have a lot of ideas. But what if that yields nothing? Then what? I came up with a fun little game; it's not my invention, but I applied it to this particular purpose: turning random words into gems and kind of giving up control to the universe.
00:14:34 What I’m saying is that you don’t need to have an idea; there are so many gems out there made by people who had an idea or had a utility they wanted to make, and that can be your inspiration for something you can do. That’s all I wanted to say with that before I got embarrassed. All right, so once you have an idea, it's time to get started. I would like to talk about Ruby Lo-Fi real quick: I didn’t make it; what I did was coordinate it.
00:15:11 Basically, three gems: Ruby 2D for graphics, Ruby Audio for handling WAV files, and Ruby Beats for sequencing. So, really, my job was pretty easy, and I more or less accomplished what I set out to do. But let’s look at the Internet for a second here. This is Ruby 2D; it’s a pretty simple graphics library that works extremely well for creating 2D applications. It gives you all the things you need: rendering shapes, rendering text, writing a loop to handle mouse and keyboard interaction.
00:16:22 If you have any ideas for creating a desktop application in Ruby, I would highly recommend Ruby 2D. Shout-out to warhammerkid: I don’t know him, but he forked Ruby Audio and turned it into a gem. It’s somewhat old: the gem is from eight or nine years ago, and the commits are from 14 years ago. It worked right out of the box for loading WAV files and manipulating them. Then there's the Beats drum machine; this is really cool. This guy named Joel Strait makes a ton of stuff; he’s a highly creative person. Beats allows you to define a YAML file and a set of WAV file samples and sequence them into a beat. Very cool! I can talk about how I used this later. I highly recommend checking this out along with some of his other apps.
00:17:50 I get somewhat envious of people who have this amount of creative output. All right, let's get back to the slides. What I mean to say is to stand on the shoulders of giants. Right? Okay, so when you’re starting your project, you need to do some research, as I did, looking up gems. But not just gems; there are concepts and things involved with your idea.
00:18:02 For example, in terms of Ruby Lo-Fi, I knew I needed a normalization algorithm. Now, what does that mean? It’s a process by which you make a sound file as loud as possible without clipping it. Clipping means going outside the bounds of non-distorted audio. I didn’t know what to do; I sat down and thought about it for a bit. How do I stretch waveforms without distorting them? I came up with a great idea, and in fact, I can show you right now.
00:18:48 A very helpful person jumped in like a ninja and dropped this normalization algorithm. I messed with that a little bit, and it seemed to work. What I’m trying to say is: outside of Ruby, don’t be afraid to dig deep and look up subject matter related to your idea. That being said, let's see—we have about 20 minutes left, so I better make this fast. I'd like to talk about two things that I dealt with a lot in Ruby Lo-Fi, just to give you an example of some of the stuff that went into my thinking. Let's talk about digital audio for a second: it’s not very mysterious; it’s not a magic box, it's just numbers.
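The algorithm shown on the slide is not reproduced in the transcript. As a point of reference, a common approach is peak normalization: find the loudest sample and scale everything so that peak lands exactly on the limit. The sketch below assumes samples are floats in the range -1.0 to 1.0; the `normalize` name and the `target_peak` parameter are illustrative, not taken from Ruby Lo-Fi.

```ruby
# Peak normalization sketch. Assumption: samples are floats in -1.0..1.0.
# This is one common technique, not necessarily the one used in the talk.
def normalize(samples, target_peak = 1.0)
  peak = samples.map(&:abs).max
  return samples.dup if peak.nil? || peak.zero? # empty or silent input

  gain = target_peak / peak
  samples.map { |sample| sample * gain }
end

quiet = [0.0, 0.22, -0.25, 0.1]
loud  = normalize(quiet)

p loud # => [0.0, 0.88, -1.0, 0.4]
```

Because every sample is multiplied by the same gain, the shape of the waveform is preserved; only its height changes, so nothing clips.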
00:20:21 Here is an example of what I’m talking about: these are 13 float numbers; they range from negative one to positive one. I've roughly drawn a curve that matches these values. As you can see, it starts at zero on the left; imagine a midpoint line at zero. 0.44 is kind of our upper peak, and negative 0.5 is our lower peak. That’s really all audio is. In your computer, it's just giant arrays of numbers. Three concepts real quick: frequency, amplitude, and sample rate.
00:22:09 What is frequency? It's cycles per second, measured in Hertz. What’s a cycle? Let’s jump back to that slide. This is one cycle; it’s a peak and a trough. So, one Hertz is one of these per second, which is too low for people to hear. A equals 440: who knows what that is? Right, A above middle C; it vibrates at 440 Hertz, so that is audible to the human ear. Great! So now we know what frequency is. What about amplitude? It’s power. The most powerful peak we have is 0.44, while the bottom amplitude is negative 0.5, as I mentioned earlier. Too much power equals clipping; we can only handle so high a number in whatever frame of reference we’re talking about, which here is negative 1 to 1.
00:23:43 So, if you have like 1.2, you’re going to clip, and you don’t want to do that. So frequency versus amplitude is count versus strength: i.e., the number of Hertz versus the amplitude of those signals. Finally, sample rate is an important concept for computer audio, which means samples per second. Pop quiz! This one I actually know the answer to: what is standard CD quality sample rate? Who knows? You got it! It is 44,100 samples per second. So let’s look back at this thing that I drew. This is 13 samples; I only need another 44,087 to make one second of computer audio at CD quality.
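Both ideas in this passage, clipping at the edges of the -1 to 1 range and counting samples at CD quality, are easy to check with a few lines of Ruby. The helper names below (`samples_needed`, `hard_clip`) are hypothetical and only serve the illustration.

```ruby
CD_SAMPLE_RATE = 44_100 # samples per second at CD quality

# How many samples are needed for a given duration.
def samples_needed(seconds, sample_rate = CD_SAMPLE_RATE)
  (seconds * sample_rate).round
end

# Hard clipping: anything outside -1.0..1.0 is flattened to the limit,
# which is what distorts the sound.
def hard_clip(samples)
  samples.map { |sample| sample.clamp(-1.0, 1.0) }
end

puts samples_needed(1) - 13          # => 44087
p hard_clip([0.44, 1.2, -0.5, -1.7]) # => [0.44, 1.0, -0.5, -1.0]
```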
00:24:41 So you can imagine when manipulating arrays, like in Ruby Lo-Fi, if I have a few seconds of audio at 44,100, I'm dealing with hundreds of thousands of numbers to process. That’s what makes it slow. Of course, people have solved this problem already, but for a first chop, I'm basically just dealing with the limits of processing thousands or millions of array locations. So, sample rate doesn’t equal frequency; sample rate doesn’t equal amplitude. You can have a 440 Hertz tone and 44,100 samples per second; all very clear, right?
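To see how frequency, amplitude, and sample rate stay independent, here is a minimal sketch that generates a 440 Hertz tone at 44,100 samples per second. The `sine_wave` function and its default amplitude of 0.5 are assumptions made for this example.

```ruby
# Generate a sine tone as an array of floats.
# frequency_hz: cycles per second; sample_rate: array entries per second;
# amplitude: peak height, kept below 1.0 to stay clear of clipping.
def sine_wave(frequency_hz, seconds, sample_rate = 44_100, amplitude = 0.5)
  count = (seconds * sample_rate).round
  Array.new(count) do |n|
    amplitude * Math.sin(2 * Math::PI * frequency_hz * n / sample_rate)
  end
end

tone = sine_wave(440, 1)
puts tone.length               # => 44100
puts sine_wave(440, 3).length  # => 132300
```

Changing `frequency_hz` alters the pitch, changing `amplitude` alters the loudness, and neither changes how many numbers the array holds; only duration and sample rate do that.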
00:26:00 Here's an example of what I’m talking about. I think this is big enough to see. Can anyone guess what this code block does? It fades a sound from zero to whatever its maximum is. As you can see, we're iterating through every sample inside an array of samples. For every sample, we’re multiplying it by the index divided by the number of samples. Very simple! But this is basically how computer audio works; it gets more complicated, but it's basically loops over arrays.
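The code block from the slide is not included in the transcript. A reconstruction from the spoken description (multiply each sample by its index divided by the number of samples) might look like the following; the `fade_in` name is an assumption.

```ruby
# Linear fade-in, reconstructed from the description in the talk.
# Each sample is scaled by index / total, so the gain ramps up from 0.
def fade_in(samples)
  total = samples.length.to_f # to_f avoids integer division
  samples.each_with_index.map { |sample, index| sample * (index / total) }
end

p fade_in([1.0, 1.0, 1.0, 1.0]) # => [0.0, 0.25, 0.5, 0.75]
```

With this exact formula the gain never quite reaches 1.0 on the last sample; dividing by `total - 1` instead would end the fade at full volume.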
00:26:41 So, now you’re an audio expert. Let’s talk about graphics real quick. What do you need to make graphics? You need three things: probably more than that, but at least three. You need to be able to render basic elements like lines, rectangles, and text. You need a coordinate system: x and y for 2D, and x, y, and z for 3D. Does that make sense? You have a rectangle that’s x and y; if you have a rectangle that shoots into space, it’s z.
00:27:22 You need an update loop to handle input and redraw your graphics. You could always draw it just once, but then you can’t do anything with it; it’s just a picture. All right, so with those three things, we can make a graphical interface. For example, making a button; I had to do this for Ruby Lo-Fi. A rectangle for the clickable area, a rectangle for the border, text for the label of the button, and a click handler. This is a little smaller, but I think you can still see.
00:28:14 Let’s say we get an x and a y from an event, and we have a button that says 'click.' What we need to know is whether x and y are inside the button. It’s pretty straightforward: Is x greater than the button’s x? Is it less than the button’s x plus the button's width? Does that make sense? This is basically what an app does when dealing with user input: detecting if I clicked inside something or not, and then everything else is just... now.
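The hit test described here can be written without any graphics library at all. The `Button` struct below is a hypothetical stand-in, not the class used in Ruby Lo-Fi, and it assumes the usual screen convention where x and y mark the top-left corner. The same comparison is applied on both axes.

```ruby
# Hypothetical button model: position, size, and a label.
Button = Struct.new(:x, :y, :width, :height, :label) do
  # True when the point (px, py) falls inside the button's rectangle.
  def contains?(px, py)
    px >= x && px <= x + width &&
      py >= y && py <= y + height
  end
end

button = Button.new(10, 10, 80, 30, "click")

puts button.contains?(50, 20) # => true
puts button.contains?(5, 20)  # => false
```

In a real app the click handler would receive the event coordinates, run this check against each element, and call whatever action belongs to the element that was hit.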
00:29:04 Let’s go back to projects and making progress. This is something that I really struggle with. So, let’s say you’re making something, but then it starts to get difficult. The easy stuff is done; the obvious stuff is done. Maybe you start to flail around a little bit. Maybe your progress looks like this: I have a lot of trouble staying on task. For example, Ruby Lo-Fi is about audio, so why did I spend way more time building UI elements and visuals rather than actually dealing with the audio itself, adding more audio effects, and making it function faster?
00:30:38 I lost focus; I got really deep in the weeds. I don’t have any answers to this problem, but I do have a suggestion: when you find yourself straying from what you feel is the correct path, stop working, show someone what you have, and ask for feedback. If I had done this, someone might have said: 'It takes up to 10 seconds to load one audio file in your app; it’s basically useless.' I would have said, 'Yes, that’s true; I should probably focus on speed before I make another slider,' you know, that kind of thing.
00:31:31 Overall, though, as much as I rambled in this talk, I hope you take away this message: code for happiness and code for joy. I kind of front-loaded the demo, so we don't have to talk about it. Although we can go back to it, I covered making a beat and stitching beats somewhat clearly. Maybe a couple of things to remember about Ruby Lo-Fi: it only handles WAV files, and you always have to hit the build button before playing or saving; otherwise, it’s too slow to use.
00:32:49 There are some keyboard shortcuts, but that’s not really important at this time. Here’s the big thing: I’m a little disappointed in myself because I wanted to have the app a little more complete for people to use. I think it’s usable, but it’s borderline. I wanted to issue a challenge to everyone here: does everyone know what SoX is? It’s a very useful command-line utility that’s sort of the standard for manipulating audio, especially in Linux, but also on Mac and Windows.
00:33:47 I love this webpage, by the way; it looks very retro. You can google 'SoX': S-O-X, Sound eXchange. If you want to use Ruby Lo-Fi, you need to have SoX. It’s a dependency that I couldn’t figure out how to get rid of because it actually turns the audio data into sound, and I did not have the ability to do that on my own. There is a download link here; who remembers SourceForge? That’s the thing! It auto-selects your operating system, so it’s essentially a one-click install.
00:34:40 If you want to try Ruby Lo-Fi, you do need it, though, unfortunately. I showed you cloning the repo: Robo Bluebird is my GitHub, and Ruby Lo-Fi. There’s one other weird step: you have to unzip the sounds directory, which is kind of a pain, but maybe in the future, you won’t have to. I experimented briefly with serializing all of those samples into just a big data chunk and then deserializing it. I couldn’t get it to work. Then you run it: run the app.rb, and you can make a beat.
00:35:28 I think the interface is pretty straightforward, but it has enough little professional qualities that it might be difficult to use. I would like to say, if anyone has the wherewithal to download this and play around with it, I would really appreciate it. If you make something with it, you can send it to me or tweet it. I don’t know, can you tweet audio? I think you can! But I did pick up specifically for this purpose three knock-offs; they’re still the real deal, the Amazon special, as I like to say. Here they are; I’m not lying!
00:36:23 A little nano, but mruby is still too big to fit on a nano. However, there’s another project called mruby/c. If anyone saw the mruby/c talk yesterday, someone took that and made it even smaller. Actually, you can get a little Ruby program to run on a nano. If anybody wants to try Ruby Lo-Fi but you don’t remember all these kind of stupid steps, I would love to help you out. Obviously, the app needs work. If this seems like something interesting that you would like to work on as well, I would appreciate the help. I can give you the repo name again if you need it.
00:37:12 But I’ve never really collaborated on a hobby project before, so if anyone looks at it and thinks, 'it’s a good idea but not there yet, let me help,' I would appreciate it a lot. That’s almost all I have. I want to thank you all for listening. Before we go to the question round, I do want to say that my company sent me here; I really appreciate that. I have the obligatory off-theme slide: I work for General Dynamics in Pittsburgh.
00:37:50 We do a lot of interesting things with large datasets, turning them into useful visualizations to help people make decisions and things. We’re hiring at all experience levels, from beginner up through senior principal. I think we have some positions open for DevOps and data scientists. I’ve only worked there since April, but it’s been good so far. I am in this picture on the Hot Metal Bridge in Pittsburgh; I don’t remember where though. So anyway, I have business cards. If anyone has an interest in living and working in the Steel City, please let me know.
00:38:37 That actually is the end of my presentation. Sure! Yeah! And the cool thing about it is when you deconstruct some of the stuff that you hear like on the YouTube stream, it’s really quite simple. Of course, it’s better produced than this would be, but the idea is the same: taking four or eight bars of music, putting a beat under it, and going from there.
00:39:40 There’s a better... Let’s try that. Yeah, and go ahead and add... what's that? A fat kick underneath it. Let’s add a little snare.