Ruby Video | Cats, The Musical! Algorithmic Song Meow-ification

@bethanyhaubert

Beth Haubert • November 13, 2018 • Los Angeles, CA

In the talk titled 'Cats, The Musical! Algorithmic Song Meow-ification' presented by Beth Haubert at RubyConf 2018, the speaker explores a creative intersection of music, coding, and humor through a project called 'Meowifier,' which converts any song into cat meows. The session opens with an introduction to the speaker's background in software engineering and a playful warning about the lighthearted nature of the talk, which includes poor singing and cat GIFs.

Key Points Discussed:
- Introduction to 'Meowifier': The application allows users to upload a song and receive its melody transformed into cat meows, highlighting a quirky and entertaining use of technology.
- The Challenge of Melody Extraction: Haubert explains why extracting a single melody from a polyphonic song is hard, and recounts how meowing along to lyric-free tunes such as the 'Game of Thrones' theme sparked the idea for 'Meowifier.'
- Technical Implementation: The speaker discusses leveraging existing APIs, particularly Sonic API, to extract melody data. She outlines the critical aspects of handling MIDI data, which includes ensuring correct pitch mapping for playback.
- Duration Matching: Haubert elaborates on the importance of matching the length of each meow to its corresponding note, explaining how she used FFmpeg to trim or loop audio files to fit the required lengths.
- Creating a Collection of Cat Meows: The development of a custom library of meows across different pitches is detailed, emphasizing the creative process involved in sourcing audio clips and creating a diverse set of meows for different notes.
- Future Expansions: Looking forward, Haubert expresses interest in developing a reverse melody analyzer to further enhance the functionality of 'Meowifier' by allowing it to recognize songs and convert them into meows automatically.

Overall, Beth Haubert's talk is an entertaining blend of technical explanation and humorous anecdotes that showcases how programming can be applied whimsically. She concludes by encouraging further exploration of her project and emphasizes the joy of creating silly, yet technically challenging, applications like 'Meowifier.'

RubyConf 2018 - Cats, The Musical! Algorithmic Song Meow-ification by Beth Haubert

How are you supposed to sing along with your favorite TV theme song every week if it doesn’t have lyrics? At my house, we “meow” along (loudly). We also code, so I built ‘Meowifier’ to convert any song into a cat’s meows. Join me in this exploration of melody analysis APIs and gratuitous cat gifs.

00:00:15.949 My name is Beth Haubert, and this talk is called 'Cats, The Musical! Algorithmic Song Meow-ification.' I am very excited to be here today, as this is my first time speaking at RubyConf. Thank you for being here.

00:00:26.689 I want to give you a little warning before we dive in: within the next 20 to 30 minutes, you are likely to encounter some poor singing, an abundance of cat GIFs, and an excessive amount of silliness. So, be prepared! Let’s get started.

00:00:47.129 I don't remember what my bio says on the website, but it may be a little out of date due to some life changes. I used to be a software engineer at Flywheel in Omaha, Nebraska, and now I am at Thoughtbot in San Francisco, which I am very excited about! If you happen to work at Thoughtbot and we haven't met yet, please come say hi afterward.

00:01:15.210 I've been told by a few people that one of my strengths is thinking outside the box, or as I like to call it, being really good at coming up with silly ideas. Some of my ideas have included 'Feces Book,' a social media website for your poop, and 'Kombucha,' a super boozy version of the fermented tea drink that's supposed to be healthy.

00:01:35.640 Another idea I'm proud of is 'Luka,' a U.S.-based anti-social network for your cats. However, I actually failed to release this project in a smaller market in Omaha, which I still affectionately call 'Omaha.' Now, to change the subject to something that is still very relevant: who here is familiar with 'Game of Thrones'? Okay, cool, most of you. It has been one of the most popular television shows.

00:02:22.780 To give you a bit of background on me, this is a picture of my husband and me. We got married in a bar, to the disappointment of both of our parents. These are our cats: Xiao, GUI, and Clementine. Yes, this is a professional photograph of my cat. This is also a picture of me and my husband on Halloween a few years back when we really got into the 'Game of Thrones' theme.

00:02:45.810 Now, you might be wondering where I’m going with this, and I will explain shortly. To set the scene, I’m going to play you the 'Game of Thrones' theme song right now.

00:03:01.890 If we could get the volume up a bit?

00:03:41.230 Since you're all familiar with the 'Game of Thrones' theme, we're going to do something together. We are going to meow the 'Game of Thrones' theme song! I’ll get us started and then you will all join in. Are you ready? Meow, meow, meow, meow, meow, meow, meow... (and so on).

00:04:01.560 Great job, everyone! We just meowified the 'Game of Thrones' theme song. I think we probably set some kind of world record for the most people meowing in a room. So, 'Meowifier,' the application I created, is just one idea in a long line of silly ideas I've had over the years, but it also seemed like a really interesting technology problem to solve.

00:04:50.410 How exactly does it work? Simply put, you upload a song's audio file, and 'Meowifier' outputs a new audio file with that song's melody sung by cats. I wish I could say there’s such a thing as a cat choir, but there isn’t. I searched online, and nothing came up. So it was all up to me to figure this out.

00:05:09.550 So let’s discuss the first big challenge I faced: how to obtain the notes of only the melody from a song’s audio file. For a human, especially someone with any musical training, picking up the melody of a song is relatively easy.

00:05:31.710 The melody, for those of you who may not be musicians, is the principal part of the song. Every song you hear on TV or the radio is polyphonic, meaning there is more than one note happening at the same time. For example, if you listen to 'Bohemian Rhapsody' by Queen, the melody is the part that Freddie Mercury sings, like the line 'Mama, just killed a man.' That's what I'm looking for, just the melody.

00:05:57.930 You may be wondering how I went about extracting the melody. Compared to a human brain, computers are pretty unintelligent; we have to tell them exactly what to do! Writing an algorithm for a computer to extract the melody is incredibly complicated.

00:06:05.010 I did not write my own algorithm; I’m not that ambitious. Instead, I did what most programmers do: I Googled it to find something that could work. After some digging, I came across a tool called Sonic API, which offers professional-grade audio technology and high-quality algorithms.

00:06:27.440 This API is free up to a certain point, so I decided to give it a shot. In my code, I created a simple song parser class with a parse method. All I needed to do was pass the proper parameters through an HTTP call to the API, including the song file, and it was supposed to extract the melody for me.

00:06:47.200 I won’t expect you to read the tiny text in the example on the screen; it is just a sample of what the API sends back. I sent it the first few notes of the 'Game of Thrones' theme song, and it returned several pieces of data per note.
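As a rough illustration of the parser described above, here is a minimal Ruby sketch. It is not the speaker's code: the class shape, the injected `fetcher` callable, the top-level `"notes"` key, and the four field names (`midi_pitch`, `onset_time`, `duration`, `volume`) are all assumptions made for the example, and the real API response may be structured differently.

```ruby
require "json"

# One extracted note. The four field names are assumptions for illustration.
Note = Struct.new(:midi_pitch, :onset_time, :duration, :volume, keyword_init: true)

class SongParser
  # `fetcher` is any callable that takes a file path and returns the
  # API's JSON response body as a String (for example, a lambda wrapping
  # an HTTP POST). Injecting it keeps this sketch free of network calls.
  def initialize(fetcher)
    @fetcher = fetcher
  end

  # Returns an Array of Note structs, one per note in the response.
  def parse(song_path)
    body = @fetcher.call(song_path)
    JSON.parse(body).fetch("notes", []).map do |n|
      Note.new(
        midi_pitch: n.fetch("midi_pitch").to_f,
        onset_time: n.fetch("onset_time").to_f,
        duration:   n.fetch("duration").to_f,
        volume:     n.fetch("volume").to_f
      )
    end
  end
end

# Usage with a canned response standing in for the real API call.
fake_response = '{"notes":[{"midi_pitch":62.1,"onset_time":0.0,"duration":0.48,"volume":0.8}]}'
parser = SongParser.new(->(_path) { fake_response })
first_note = parser.parse("theme.mp3").first
```

Keeping the HTTP call behind a callable means the parsing logic can be exercised without an API key or a network connection.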

00:07:17.020 So, for each note extracted, I have four pieces of data. The MIDI pitch maps to the pitch of a note. Since only whole MIDI pitch numbers map to standard notes on a keyboard, I had to round each pitch to the nearest whole number before I could map it.

00:07:40.400 MIDI, which stands for Musical Instrument Digital Interface, is a technical standard that allows all electronic musical instruments from different manufacturers to communicate with one another. For example, a MIDI pitch of 36 corresponds to a C, specifically two octaves down from middle C.
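The rounding and mapping step can be sketched in a few lines of Ruby. This is an illustrative example rather than the code from the talk; the method name `note_name` is hypothetical, and it assumes the common convention in which middle C is MIDI 60 and is written C4, so that MIDI 36 comes out as C2.

```ruby
NOTE_NAMES = %w[C C# D D# E F F# G G# A A# B].freeze

# Round a fractional MIDI pitch to the nearest whole number and name it.
# Assumes middle C is MIDI 60 and is labeled C4.
def note_name(midi_pitch)
  midi = midi_pitch.round
  octave = (midi / 12) - 1
  "#{NOTE_NAMES[midi % 12]}#{octave}"
end

note_name(36)    # => "C2" (two octaves below middle C)
note_name(60.4)  # => "C4" (rounded down to 60)
note_name(65.6)  # => "F#4" (rounded up to 66)
```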

00:08:11.180 Let’s take a break and recap where we are. So, I need to ensure that my notes have the correct pitch mapped in MIDI, which allows me to construct my melodies correctly.

00:08:34.590 The next step is matching the length of the meows to the length of the notes in the melody. Melodies contain notes that vary in length—when you think back to 'Bohemian Rhapsody,' for instance, Freddie doesn't just hold each note for half a second; some are shorter and some are longer.

00:09:01.750 It’s important for me to have control over the exact length of my meows, but I can’t physically create every conceivable length from scratch. This means I need to find a tool that can cut or extend a meow to fit the proper timing. After some searching, I found a great tool called FFmpeg.

00:09:24.590 FFmpeg has been around for two decades, and while the plethora of options can be overwhelming, I finally got it to work! Unfortunately, the method I wrote is quite long and messy because I haven’t refactored it in the two years since I wrote it.

00:09:46.870 What happens is I pass in the parsed song, which has the collection of notes extracted from the API, and we need to adjust our meow sound files accordingly. I have around 88 short audio files in my application, each with a different pitch.

00:10:03.140 The logic handles creating meows with the correct duration. If the extracted note is shorter than the meow file I have, I trim the end to get the correct length. If it’s longer, I duplicate the meow file until it reaches the desired length.

00:10:24.580 To illustrate, if the duration of the note is 0.48 seconds and the meow is one second long, I trim the meow down to 0.48 seconds. For longer notes, I work out how many loops of the meow file are needed to cover the duration of the extracted note, and I combine them into one audio file.
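The trim-or-loop decision can be sketched as a small Ruby function. This is a guess at the shape of the logic, not the method shown in the talk: `fit_plan` and `trim_command` are hypothetical names, the file names are placeholders, and the sketch assumes the looped result is cut back to the exact note length. The only FFmpeg options used are `-i` (input), `-t` (limit output duration), and `-y` (overwrite output); the command is built but not run.

```ruby
# Decide how to fit a meow clip to a note's duration (values in seconds).
def fit_plan(note_duration, meow_duration)
  if note_duration <= meow_duration
    # The clip is long enough: cut off the end.
    { action: :trim, loops: 1, duration: note_duration }
  else
    # The clip is too short: repeat it enough times to cover the note.
    loops = (note_duration.to_f / meow_duration).ceil
    { action: :loop, loops: loops, duration: note_duration }
  end
end

# Build (but do not execute) an FFmpeg command that trims a clip.
def trim_command(src, dest, duration)
  ["ffmpeg", "-y", "-i", src, "-t", duration.to_s, dest]
end

fit_plan(0.48, 1.0)  # => { action: :trim, loops: 1, duration: 0.48 }
fit_plan(2.5, 1.0)   # => { action: :loop, loops: 3, duration: 2.5 }
trim_command("meow_c4.wav", "note_001.wav", 0.48)
```

Returning the command as an array keeps it safe to hand to `system(*cmd)` later without any shell quoting.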

00:10:43.950 The next challenge was to create a multi-octave library of meows, since very few cats meow in the bass or tenor range. Having found no existing cat choir, I had to build my own custom meow library using various tactics.

00:11:04.960 Initially, I attempted to record notes on my piano and sing along, but I always ended up a bit flat. Afterward, I tried a tuner app on my phone, and I managed to collect a few notes, but it wasn’t enough for a full library.

00:11:30.300 I finally found some octave recordings of an auto-tuned man meowing on a free sound website. While this collection didn’t span the full 88 notes of a keyboard, it provided a solid foundation for my work. However, I soon needed to create a workaround for notes that fell outside of that range.

00:11:56.980 Once I created my meow library, each note in the collection required a pitch and octave designation. Some of these notes had sharp designators like 'f-sharp 6.' To ensure compatibility with my constants, I wrote a method that translated these names so I wouldn’t have to rename every single file in my library.
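A name-translation method of this kind might look like the following Ruby sketch. Both the input format (labels such as "f-sharp 6") and the output format (such as "F#6") are assumptions for illustration, as is the method name `normalize_note_label`; the talk does not show what the actual constants looked like.

```ruby
# Convert a file-style note label such as "f-sharp 6" into "F#6".
# The label format and the target format are assumed, not taken from the talk.
def normalize_note_label(label)
  match = label.strip.downcase.match(/\A([a-g])(?:-(sharp|flat))?\s*(\d+)\z/)
  raise ArgumentError, "unrecognized note label: #{label.inspect}" unless match

  # match[2] is nil when the label has no accidental.
  accidental = { "sharp" => "#", "flat" => "b", nil => "" }.fetch(match[2])
  "#{match[1].upcase}#{accidental}#{match[3]}"
end

normalize_note_label("f-sharp 6")  # => "F#6"
normalize_note_label("c4")         # => "C4"
```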

00:12:15.420 Despite my efforts, I wasn't fully satisfied with the library I developed. I wanted the best quality meows, so I eventually decided to auto-tune 88 more meows and create my custom library spanning almost the entire keyboard.

00:12:34.060 The moment you’ve all been waiting for is approaching. Remember the theme from 'Game of Thrones'? I’ll be using my new tools on this for a live demonstration.

00:12:54.330 However, I did encounter a few problems. It turns out that the melody analyzer I initially used wasn't of world-class quality, so I sought something more reliable.

00:13:01.400 I discovered a tool called Melodia, which is based on a Ph.D. thesis about melody extraction. Unfortunately, while it returns high-quality results, I found that it wasn't necessarily worth my time to integrate it.

00:13:19.320 Ultimately, I decided to return to Sonic API despite its imperfections. As this is a side project, I'm not constrained by deadlines or payment, so I intend to continue refining it.

00:13:45.420 Looking ahead, I’m considering branching out to create a reverse melody analyzer similar to Shazam. This would help identify which song a user is playing on my website. After determining the song, I could scrape the internet for a MIDI version of that track and convert it into a meow version.

00:14:02.380 I haven't built that yet, but it is next on my agenda. So, the next time you see this talk, perhaps I will have completed it.

00:14:24.680 In the meantime, I hope you’ve enjoyed my discussion and the cat GIFs I've shared. You can find my slides on Speaker Deck, and I’ll be tweeting that link out soon. My Twitter handle is @aHaubart.

00:14:42.220 Thank you all so much for participating and for yelling with me today.
