00:00:14.650
So, my first RubyConf was in San Francisco in 2009, and the closing talk of that conference was a fantastic presentation by Aaron Patterson and Ryan Davis. If you haven't seen this talk, you really should find it on Confreaks; it's a great way to spend half an hour or 45 minutes. In this talk, Ryan and Aaron cover all the crazy ideas that they came up with while riding the bus home from the Ruby class they were teaching in downtown Seattle, which we affectionately rebranded as the 'Fail Bus.' They shared a whole bunch of perfectly reasonable code for perfectly unreasonable problems. Back in 2009, I was working at a startup, and my code was running on something that looked one way, but now it runs in a completely different environment. As I was trying to figure out what talk to propose for this conference, I thought about doing another version of their talk.
00:01:19.000
My name is Aja Hammerly. The code for this talk is on GitHub. You can find it in the 'Stupid Ideas' repository. I tweet under the handle @AjaMizer. My phone is over there, so feel free to tweet during the talk; it won’t disrupt me at all. I have a blog at ajahammerly.com, although I used to update it very rarely. Now that I am paid to blog at that address, I’m posting more often. I work for Google Cloud Platform as a Developer Advocate, and yes, we let you run Ruby on our cloud! I will be hosting office hours tomorrow during the afternoon break on level 2. If you have questions about running your code on Google Cloud or just want to know more about the platform, come talk to me.
00:02:07.280
Before we dive into the content, I want to clarify that unless otherwise stated, all code in this talk is copyrighted by Google and licensed under Apache V2, thanks to our lawyers. Now that all that’s out of the way, let's get started with the first stupid idea, which involves load testing.
00:02:18.340
The code in this section predates my time at Google; it’s pretty old, which means that it’s my copyright, and it’s licensed under MIT. I want to be very clear: this is really stupid load testing. I started my career in QA at a small company, and my boss would come and ask, 'Hey, can you load test this new feature?' I would respond, 'Sure, what parameters are you looking for? What’s the ideal load you want this to have?' But they would tell me, 'Just load test it! Do the load testing thing!' So my response was to take my relatively limited skill set at that point and come up with something incredibly ridiculous.
00:03:30.990
I had a Rails log of the scenario I wanted to run, and I was testing from my local server. Using a gem called Mechanize, I was able to write a script that would automate the load testing process. Mechanize allowed me to inject user IDs or association IDs into that script, so I wasn't rerunning the exact same scenario from my log repeatedly. The big challenge in load testing is to make a lot of computers do the same thing at the same time, all pointed at the same server. If you can do that, you have load testing. Although the analysis piece is important, we’ll ignore that for now because this is about stupid ideas for too many computers.
00:04:53.900
What does the code look like for this, you may ask? To be clear, it’s not good. In fact, this is some of the oldest Ruby code I have written that I haven’t deleted yet. When I proposed the talk, I realized I had this code sitting around and retrieved it to include in my slides. Lo and behold, the first line I saw was indicative of just how unrefined this code is. The code involves using Mechanize to loop through the log file and match against a terrible regex that pulls out the action, controllers, and parameters from the Rails log.
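A minimal sketch of the log-replay idea, not the code from the talk. It assumes Rails 3+ style "Started ..." request lines and pulls out the HTTP verb and path rather than the controller, action, and parameters. The names `STARTED_LINE`, `requests_from_log`, `with_user_id`, and `replay` are hypothetical. Only `Mechanize.new` and `Mechanize#get` come from the Mechanize gem, and the gem is loaded lazily so the parsing helpers work without it.

```ruby
require "uri"

# Assumed log format (Rails 3+), for example:
#   Started GET "/users/42/orders" for 127.0.0.1 at 2015-11-15 10:00:00 -0800
STARTED_LINE = /\AStarted (?<verb>[A-Z]+) "(?<path>[^"]+)" for /

# Return [verb, path] pairs for every request line in the log text,
# skipping lines that are not request lines.
def requests_from_log(log_text)
  log_text.each_line.filter_map do |line|
    m = STARTED_LINE.match(line)
    [m[:verb], m[:path]] if m
  end
end

# Replace the first purely numeric path segment with another ID so each
# agent does not replay exactly the same records (illustrative helper).
def with_user_id(path, user_id)
  path.sub(%r{/\d+(?=/|\z|\?)}, "/#{user_id}")
end

# Replay the GET requests from the log against base_url.
# Requires the mechanize gem; only GETs are replayed in this sketch.
def replay(log_text, base_url, user_id)
  require "mechanize"
  agent = Mechanize.new
  requests_from_log(log_text).each do |verb, path|
    next unless verb == "GET"
    agent.get(URI.join(base_url, with_user_id(path, user_id)).to_s)
  end
end
```

Calling something like `replay(File.read("scenario.log"), "http://localhost:3000", 7)` would walk the logged scenario once for user 7; the file name and URL are placeholders.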
00:05:59.430
The key functions of this process allow me to programmatically hit the website, but that still doesn't constitute load. To create load, you take your script and deploy it onto a bunch of computers all at once for a relatively long period of time. To actually do this, I used a bash script that starts 30 copies of this script on a single computer. The first time I ran this, I used commodity hardware we procured on eBay, a mix of pizza-box servers placed on a rolling cart in a conference room. They generated so much noise that I was put in a conference room to avoid disturbing others. I even managed to make the aluminum shells of those machines warm enough to toast cheese sandwiches on them! Unfortunately, when we ran the technique for a third time, we used the cloud, which came with its own disadvantages: no toasted cheese sandwiches!
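The launcher in the talk was a bash script. A rough Ruby equivalent of "start 30 copies on one machine" might look like the sketch below. `run_agents` and the script name `load_script.rb` are hypothetical; `Process.spawn` and `Process.wait2` are standard Ruby.

```ruby
require "rbconfig"

# Start `count` copies of `command` at once, wait for all of them to
# finish, and return their exit statuses in launch order.
def run_agents(count, command)
  pids = Array.new(count) { Process.spawn(*command) }
  pids.map { |pid| Process.wait2(pid).last.exitstatus }
end

# Example (load_script.rb is a placeholder for the replay script):
#   statuses = run_agents(30, [RbConfig.ruby, "load_script.rb"])
#   warn "some agents failed" unless statuses.all?(&:zero?)
```

All copies are spawned before any is waited on, so they run concurrently rather than one after another.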
00:07:00.120
I usually call this section 'deployment,' but what I really did was just low-scale hackery. I set up a VM on a cloud provider, installed all the dependencies, and ran Ruby on there. Then I had to do this a bunch of times, and to start everything, I had to SSH into each of the servers. Inevitably, it took a little while to get the entire load test suite running at full capacity. That actually became a feature because most real load testing frameworks ramp up the load gradually, which meant my manual framework was working just fine. However, this method offered no statistics or analysis, as we had no idea if any of the agents were timing out or what our maximum capacity was, except what we could glean from the server logs.
00:08:59.000
There are off-the-shelf tools to do this, but back when I did this, I wasn’t aware of tools like ApacheBench. So, lesson learned: please don’t do this unless it’s just for fun! Moving on to stupid idea number two, I adore my boss, and one day while we were having coffee, he proposed a fantastic interview question: can we do sentiment analysis of Twitter using emojis? I started off by consulting with him and asked if I could steal this idea for my talk, to which he graciously agreed. So, a big thank you to my boss and his creativity.
00:10:15.000
Sentiment analysis refers to analyzing a large body of text to determine whether the overall sentiment is positive, negative, or neutral. It’s trickier than it sounds, especially given language’s nuances. For instance, if I were to say, 'Sure, I’d love to!' with a smile, it’s positive, but if your teenage son says it while rolling his eyes, not so much. However, I am not working with words here; I'm working with emojis, so I can categorize emojis into positive or negative sentiments. In this scenario, emojis like the heart, thumbs up, and smiley face can be assigned positive sentiments, while the pile of poop and devil horns emojis represent negative sentiments. I thought having the poop emoji on my slides would be fun for this talk.
00:12:44.000
After categorizing the emojis, I needed a data source for my sentiment analysis, and I decided on tweets. I utilized the TweetStream gem, which allows for live streaming tweets from Twitter. This is my mapping process where I assign point values to different emojis for their sentiment value. I then constructed the code that analyzes a single tweet to determine its sentiment value using these values we established during our categorization.
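The mapping and per-tweet scoring described here can be sketched as follows. The point values (+1 and -1) and the names `EMOJI_SCORES` and `sentiment` are assumptions for illustration, and the exact emoji set used in the talk may differ. Fetching live tweets is left out, so the function simply takes tweet text as a string.

```ruby
# Assumed point values: +1 for positive emoji, -1 for negative ones.
EMOJI_SCORES = {
  "\u{2764}"  => 1,   # heart
  "\u{1F44D}" => 1,   # thumbs up
  "\u{1F600}" => 1,   # smiley face
  "\u{1F4A9}" => -1,  # pile of poop
  "\u{1F608}" => -1   # smiling face with horns
}.freeze

# Sum the scores of every known emoji in the text. Characters that are
# not in the table (letters, skin-tone modifiers, variation selectors)
# count as zero.
def sentiment(text)
  text.each_char.sum { |ch| EMOJI_SCORES.fetch(ch, 0) }
end
```

Scoring character by character keeps the sketch simple: a thumbs up followed by a skin-tone modifier still counts once, because the modifier is a separate character with no score.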
00:13:54.880
However, analyzing one tweet is not enough; I needed to analyze a multitude of tweets, particularly the ones coming out about RubyConf. For this, I needed multiple computers, and that's where Rinda comes into play. Rinda is an implementation of the Linda distributed coordination language. It uses a shared tuple space, letting multiple processes — potentially on multiple computers — communicate. This is how I was able to deploy my analysis script across several workers that read and processed tweets concurrently. I had different worker types: a fetcher that writes tweets into a tuple space, an analyzer that reads from it and calculates sentiment, and a reducer that aggregates all the sentiment results.
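The three worker roles can be sketched against a tuple space like this. For brevity the space lives in the same process, whereas in the talk the workers were separate processes. The tuple shapes `[:tweet, text]` and `[:sentiment, score]`, the function names, and the stand-in scoring rule (thumbs up minus poop) are assumptions. `Rinda::TupleSpace#write` and `#take` are the real Rinda calls, and `nil` in a template matches any value.

```ruby
require "rinda/tuplespace"

# Fetcher: write each tweet into the space as [:tweet, text].
def fetch(space, tweets)
  tweets.each { |text| space.write([:tweet, text]) }
end

# Analyzer: take one tweet out of the space (blocking until one exists),
# score it, and write the result back as [:sentiment, score].
def analyze_one(space)
  _tag, text = space.take([:tweet, nil])
  score = text.count("\u{1F44D}") - text.count("\u{1F4A9}") # stand-in scoring
  space.write([:sentiment, score])
end

# Reducer: take `count` sentiment tuples and add up their scores.
def reduce(space, count)
  count.times.sum { space.take([:sentiment, nil]).last }
end

space = Rinda::TupleSpace.new
fetch(space, ["\u{1F44D}\u{1F44D}", "\u{1F4A9}", "no emoji"])
3.times { analyze_one(space) }
puts reduce(space, 3)
```

Because `take` removes the tuple it returns, several analyzers can pull from the same space without processing the same tweet twice.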
00:15:52.000
When it came to the code, I created a simple server with just a few lines. This part was pretty straightforward, yet I complicated things by passing in the server URI as a command-line argument. The fetcher brings in tweets, which it pushes into the tuple space with the tweet symbol and content. Then, the analyzer examines the tweets to calculate their sentiment and writes the results back into the tuple space. Finally, the reducer compiles the total sentiment for the entire space. If you're familiar with MapReduce patterns, this should look familiar. I tested this setup on Google Cloud with Docker containers, running several workers to handle the load.
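A server of "just a few lines" plus the worker-side connection might look like the sketch below, assuming the usual DRb plus Rinda setup. The function names are hypothetical, and the URI would normally come from the command line as described above. `DRb.start_service`, `DRb.uri`, `DRbObject.new_with_uri`, and `Rinda::TupleSpaceProxy` are the standard library pieces.

```ruby
require "drb/drb"
require "rinda/tuplespace"

# Server side: publish one TupleSpace over DRb and return the URI it is
# actually bound to (useful when the requested port is 0).
def start_tuple_server(uri)
  DRb.start_service(uri, Rinda::TupleSpace.new)
  DRb.uri
end

# Server side: start the service and block until it is shut down.
def serve_forever(uri)
  puts "tuple space listening at #{start_tuple_server(uri)}"
  DRb.thread.join
end

# Worker side: connect to the shared space. Workers need their own DRb
# service running so that the proxy's `take` can receive the tuple.
def connect(uri)
  DRb.start_service unless DRb.primary_server
  Rinda::TupleSpaceProxy.new(DRbObject.new_with_uri(uri))
end

# Typical use, with the URI passed as the first command-line argument:
#   serve_forever(ARGV.fetch(0))                      # in the server
#   space = connect(ARGV.fetch(0))                    # in each worker
#   space.write([:tweet, "some tweet text"])
```

Wrapping the remote object in `Rinda::TupleSpaceProxy` is the documented way to call `take` across processes without losing a tuple if the connection drops mid-transfer.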
00:17:47.700
I utilized the latest Ruby container from Docker Hub, setting it up with around eight containers across five VMs. The architecture consists of one fetcher, one server, one reducer, and multiple analyzers. Having spent a lot of effort learning about Kubernetes, I decided to use that to manage everything, which worked well. The real value of using distributed systems and cloud services was evident in this task as deploying these various processes at scale would have otherwise been incredibly cumbersome. While I was running the demo, I was also logging output to show the total sentiment value and demonstrate the analysis visually. It became very important to visualize results when showcasing how much power distributed systems can provide.
00:19:40.400
The live demo I conducted during this talk involved gathering sentiments from the audience via tweets containing emojis at a specific hashtag. After a brief window to allow tweets to come in, I displayed the resulting sentiment score based on the collected data. The results were rather entertaining, indicating the kind of fun and unexpected outcomes that come from utilizing computing power in silly ways. As I wrap up this keynote, I want to reflect on the overall theme that while the ideas presented in this talk may not be practical or useful in the traditional sense, they exemplify the delight of exploring 'stupid' concepts with technology.
00:25:06.360
The true takeaway from this talk is to not be afraid of embracing absurdity and to enjoy coding. I would like to specifically thank the friends and colleagues who assisted me in preparing for this talk: Ryan, Scott, Eric, and Brian. Their input made this presentation possible, and I am grateful for their support. I also have stickers and other Google Cloud swag for anyone interested. Please feel free to reach out to me via email or social media with any questions or feedback. Remember, it’s important to not take coding too seriously; instead, embrace the joy in our work!