Keynote: Stupid Ideas for Many Computers

Aja Hammerly • November 21, 2015 • San Antonio, TX

This keynote presentation titled "Stupid Ideas for Many Computers" by Aja Hammerly at RubyConf 2015 explores absurd and amusing applications of Ruby programming in cloud computing environments. The talk is inspired by a previous conference discussion of unwieldy yet functional coding solutions, and Hammerly emphasizes the joy in experimenting with nonsensical concepts in technology.

Key Points Discussed:
- Introduction to Stupid Ideas: Hammerly shares her inspiration from the 2009 RubyConf talk by Aaron Patterson and Ryan Davis, reflecting on how their whimsical ideas influenced her own.
- First Stupid Idea: Load Testing: Hammerly describes a load testing script built with the Mechanize gem. It replayed requests from a Rails log to simulate user load on a server, first from eBay-sourced hardware and later from cloud VMs, and she stresses how ridiculous and limited the approach was.
- She humorously notes the physical manifestations of creating loads, like toasting cheese sandwiches using overheated servers.
- Second Stupid Idea: Emoji Sentiment Analysis: The presentation transitions to a more technical, yet still playful, concept of performing sentiment analysis through emojis from tweets. Hammerly categorizes emojis into positive and negative sentiments and explains her coding process to analyze tweets using the TweetStream gem.
- The implementation employs Rinda to manage distributed computing tasks for sentiment analysis, allowing multiple computers to collaborate efficiently and creatively.
- Live Demonstration: During her talk, Hammerly conducts a live demo by collecting audience tweets to analyze collective sentiment via emojis. This engaging example illustrates the potential of cloud computing and distributed systems, even when applied to humorous undertakings.

Conclusions and Takeaways:
Hammerly concludes that while these ideas might seem impractical or nonsensical, they celebrate the creativity in coding and encourage developers to enjoy the absurdity of programming. The overarching message is a reminder to approach coding with joy and to not shy away from experimenting with unconventional ideas. Hammerly thanks her colleagues for their support and invites the audience to engage with her on social media for further discussion.

The presentation balances practical insight and entertaining concepts, aiming to inspire developers to embrace fun in their coding journey.

There are plenty of useful things you can do with Ruby and a bunch of servers. This talk isn't about useful things. This talk will show off asinine, amusing, and useless things you can do with Ruby and access to cloud computing.

Sentiment analysis based on emoji? Why not? Hacky performance testing frameworks? Definitely! Multiplayer infinite battleship? Maybe? The world's most inefficient logic puzzle solver? Awesome!

If you are interested in having some fun and laughing at reasonable code for unreasonable problems this talk is for you.

RubyConf 2015

00:00:14.650 So, my first RubyConf was in San Francisco in 2009, and the closing talk of that conference was a fantastic presentation by Aaron Patterson and Ryan Davis. If you haven't seen this talk, you really should find it on Confreaks; it's a great way to spend half an hour or 45 minutes. In this talk, Ryan and Aaron cover all the crazy ideas that they came up with while riding the bus home from the Ruby class they were teaching in downtown Seattle, which we affectionately rebranded as the 'Fail Bus.' They shared a whole bunch of perfectly reasonable code for perfectly unreasonable problems. Back in 2009, I was working at a startup, and my code was running on something that looked one way, but now it runs in a completely different environment. As I was trying to figure out what talk to propose for this conference, I thought about doing another version of their talk.
00:01:19.000 My name is Aja Hammerly. The code for this talk is on GitHub. You can find it in the 'Stupid Ideas' repository. I tweet under the handle @AjaMizer. My phone is over there, so feel free to tweet during the talk; it won’t disrupt me at all. I have a blog at ajahammerly.com. I used to update it very rarely, but now that I am paid to blog there, I’m posting more often. I work for Google Cloud Platform as a Developer Advocate, and yes, we let you run Ruby on our cloud! I will be hosting office hours tomorrow during the afternoon break on level 2. If you have questions about running your code on Google Cloud or just want to know more about the platform, come talk to me.
00:02:07.280 Before we dive into the content, I want to clarify that unless otherwise stated, all code in this talk is copyrighted by Google and licensed under Apache V2, thanks to our lawyers. Now that all that’s out of the way, let's get started with the first stupid idea, which involves load testing.
00:02:18.340 The code in this section predates my time at Google; it’s pretty old, which means that it’s my copyright, and it’s licensed under MIT. I want to be very clear: this is really stupid load testing. I started my career in QA at a small company, and my boss would come and ask, 'Hey, can you load test this new feature?' I would respond, 'Sure, what parameters are you looking for? What’s the ideal load you want this to have?' But they would tell me, 'Just load test it! Do the load testing thing!' So my response was to take my relatively limited skill set at that point and come up with something incredibly ridiculous.
00:03:30.990 I had a Rails log of the scenario I wanted to run, and I was testing from my local server. Using a gem called Mechanize, I was able to write a script that would automate the load testing process. Mechanize allowed me to inject user IDs or association IDs into that script, so I wasn't rerunning the exact same scenario from my log repeatedly. The big challenge in load testing is to make a lot of computers do the same thing at the same time, all pointed at the same server. If you can do that, you have load testing. Although the analysis piece is important, we’ll ignore that for now because this is about stupid ideas for too many computers.
00:04:53.900 What does the code look like for this, you may ask? To be clear, it’s not good. In fact, this is some of the oldest Ruby code I have written that I haven’t deleted yet. When I proposed the talk, I realized I had this code sitting around and retrieved it to include in my slides. Lo and behold, the first line I saw was indicative of just how unrefined this code is. The code involves using Mechanize to loop through the log file and match against a terrible regex that pulls out the action, controllers, and parameters from the Rails log.
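The log-scraping step described above can be sketched roughly as follows. This is not the code from the talk: it assumes the older Rails log format ("Processing UsersController#show ... [GET]" followed by a "Parameters:" line), and the names `Request`, `parse_log`, and `with_id` are hypothetical. It stops short of issuing HTTP requests; in the talk that part was handled by Mechanize.

```ruby
# Assumed (older) Rails log format:
#   Processing UsersController#show (for 127.0.0.1 at 2009-11-20 10:00:00) [GET]
#     Parameters: {"id"=>"42", "action"=>"show", "controller"=>"users"}
PROCESSING = /Processing (\w+)Controller#(\w+) .*\[(\w+)\]/
PARAM_PAIR = /"(\w+)"=>"([^"]*)"/

# Hypothetical container for one logged request.
Request = Struct.new(:controller, :action, :verb, :params)

# Walk the log lines and collect one Request per "Processing" line,
# attaching the parameters from the "Parameters:" line that follows it.
def parse_log(lines)
  requests = []
  lines.each do |line|
    if (match = PROCESSING.match(line))
      requests << Request.new(match[1].downcase, match[2], match[3], {})
    elsif line.include?('Parameters:') && requests.any?
      requests.last.params.merge!(line.scan(PARAM_PAIR).to_h)
    end
  end
  requests
end

# Return a copy of the request with a different id injected, so each
# simulated user does not replay exactly the same scenario.
def with_id(request, id)
  Request.new(request.controller, request.action, request.verb,
              request.params.merge('id' => id.to_s))
end
```

Each parsed request could then be turned into a URL and fetched once per simulated user, swapping in a fresh id each time.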
00:05:59.430 The key functions of this process allow me to programmatically hit the website, but that still doesn't constitute load. To create load, you take your script and deploy it onto a bunch of computers all at once for a relatively long period of time. To actually do this, I used a bash script that starts 30 copies of this script on a single computer. The first time I ran this, I used commodity hardware we procured on eBay, a mix of pizza box servers placed on a rolling cart. They generated so much noise that I was moved into a conference room to avoid disturbing others. I even managed to make the aluminum shells of those machines warm enough to toast cheese sandwiches on them! Unfortunately, the third time we ran this technique, we used the cloud, which came with its own disadvantage: no toasted cheese sandwiches!
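The talk used a bash script for this step. To keep the examples in one language, here is a rough Ruby equivalent of "start many copies on one machine". The helper names `ruby_command` and `run_copies` and the script name `load_test.rb` are hypothetical.

```ruby
require 'rbconfig'

# Build the argv for running a Ruby script with the current interpreter.
def ruby_command(script)
  [RbConfig.ruby, script]
end

# Start `copies` instances of the command at once, then wait for all of
# them and return their exit statuses.
def run_copies(command, copies: 30)
  pids = Array.new(copies) { Process.spawn(*command) }
  pids.map { |pid| Process.wait2(pid).last }
end

# Hypothetical usage, assuming the load script is saved as load_test.rb:
#   statuses = run_copies(ruby_command('load_test.rb'), copies: 30)
#   puts "#{statuses.count(&:success?)} of #{statuses.size} agents exited cleanly"
```

Repeating this on several machines pointed at the same server gives the "many computers doing the same thing at the same time" effect, with the same caveat as in the talk: nothing here collects timings or failures.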
00:07:00.120 Now, about deployment. I usually call this section 'deployment', but what I really did was just low-scale hackery. I set up a VM on a cloud provider, installed all the dependencies, and ran Ruby on there. Then I had to do this a bunch of times, and to start everything, I had to SSH into each of the servers. Inevitably, it took a little while to get the entire load test suite running at full capacity. That actually became a feature, because most real load testing frameworks ramp up the load gradually, which meant my manual framework was working just fine. However, this method offered no statistics or analysis: we had no idea whether any of the agents were timing out or what our maximum capacity was, except what we could glean from the server logs.
00:08:59.000 There are off-the-shelf tools to do this, but back when I did this, I wasn’t aware of tools like Apache Bench. So, lesson learned: please don’t do this unless it’s just for fun! Moving on to stupid idea number two. I adore my boss, and one day while we were having coffee, he proposed a fantastic interview question: can we do sentiment analysis of Twitter using emojis? I asked him whether I could steal this idea for my talk, and he graciously agreed. So, a big thank you to my boss and his creativity.
00:10:15.000 Sentiment analysis refers to analyzing a large body of text to determine whether the overall sentiment is positive, negative, or neutral. It’s trickier than it sounds, especially given language’s nuances. For instance, if I were to say, 'Sure, I’d love to!' with a smile, it’s positive, but if your teenage son says it while rolling his eyes, not so much. However, I am not working with words here; I'm working with emojis, so I can categorize emojis into positive or negative sentiments. In this scenario, emojis like the heart, thumbs up, and smiley face can be assigned positive sentiments, while the pile of poop and devil horns emojis represent negative sentiments. I thought having the poop emoji on my slides would be fun for this talk.
00:12:44.000 After categorizing the emojis, I needed a data source for my sentiment analysis, and I decided on tweets. I utilized the TweetStream gem, which allows for live streaming tweets from Twitter. This is my mapping, where I assign each emoji a point value for its sentiment. I then wrote the code that analyzes a single tweet and determines its sentiment value using the point values we established during categorization.
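A minimal sketch of the mapping and the single-tweet scoring might look like this. The point values (+1 and -1), the exact code points, and the names `SENTIMENT` and `tweet_sentiment` are assumptions for illustration; the talk does not spell them out.

```ruby
# Assumed point values: +1 for positive emoji, -1 for negative ones.
SENTIMENT = {
  "\u{2764}"  => 1,  # heart
  "\u{1F44D}" => 1,  # thumbs up
  "\u{1F600}" => 1,  # smiley face
  "\u{1F4A9}" => -1, # pile of poo
  "\u{1F47F}" => -1  # face with horns
}.freeze

# Sum the point values of every character in the tweet; characters that
# are not in the mapping (letters, spaces, unknown emoji) count as zero.
def tweet_sentiment(text)
  text.each_char.sum { |char| SENTIMENT.fetch(char, 0) }
end
```

Scoring character by character keeps the code short, at the cost of ignoring multi-code-point emoji such as skin tone variants.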
00:13:54.880 However, analyzing one tweet is not enough; I needed to analyze a multitude of tweets, particularly the ones coming out about RubyConf. For this, I needed multiple computers, and that's where Rinda comes into play. Rinda is an implementation of the Linda distributed coordination language. It uses a shared tuple space, letting multiple processes — potentially on multiple computers — communicate. This is how I was able to deploy my analysis script across several workers that read and processed tweets concurrently. I had different worker types: a fetcher that writes tweets into a tuple space, an analyzer that reads from it and calculates sentiment, and a reducer that aggregates all the sentiment results.
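The three worker roles can be sketched against an in-process `Rinda::TupleSpace`. This is an illustration under assumptions, not the talk's code: the tuple shapes `[:tweet, text]` and `[:sentiment, score]`, the fixed `count` arguments, and the `scorer` callable are all hypothetical. Rinda ships with Ruby, though in recent versions it is a bundled gem rather than part of the standard library.

```ruby
require 'rinda/tuplespace'

# Fetcher: publish each tweet as a [:tweet, text] tuple.
def fetch(space, tweets)
  tweets.each { |text| space.write([:tweet, text]) }
end

# Analyzer: take `count` tweets out of the space, score each one with the
# supplied callable, and write the result back as [:sentiment, score].
# `take` blocks until a matching tuple is available; nil is a wildcard.
def analyze(space, count, scorer)
  count.times do
    _tag, text = space.take([:tweet, nil])
    space.write([:sentiment, scorer.call(text)])
  end
end

# Reducer: take `count` sentiment tuples and add up their scores.
def reduce(space, count)
  Array.new(count) { space.take([:sentiment, nil]).last }.sum
end
```

Because `take` removes a tuple atomically, several analyzers can pull from the same space without processing a tweet twice, which is what makes adding more workers straightforward.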
00:15:52.000 When it came to the code, I created a simple server with just a few lines. This part was pretty straightforward, yet I complicated things by passing in the server URI as a command-line argument. The fetcher brings in tweets, which it pushes into the tuple space with the tweet symbol and content. Then, the analyzer examines the tweets to calculate their sentiment and writes the results back into the tuple space. Finally, the reducer compiles the total sentiment for the entire space. If you're familiar with MapReduce patterns, this should look familiar. I tested this setup on Google Cloud with Docker containers, running several workers to handle the load.
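A server of "just a few lines" can be sketched with DRb, which is what Rinda uses to share a tuple space between processes. The function names and the default URI below are assumptions; the talk only says the server URI was passed as a command-line argument.

```ruby
require 'drb/drb'
require 'rinda/tuplespace'

# Hypothetical default: port 0 asks the OS for any free port.
DEFAULT_URI = 'druby://localhost:0'

# Server side: expose a fresh tuple space over DRb and return the URI
# that workers should connect to.
def start_tuple_server(uri = DEFAULT_URI)
  DRb.start_service(uri, Rinda::TupleSpace.new)
  DRb.uri
end

# Worker side: wrap the remote tuple space in a proxy so that write and
# take behave as they do on a local space.
def connect(uri)
  Rinda::TupleSpaceProxy.new(DRbObject.new_with_uri(uri))
end

# Hypothetical server script:
#   puts start_tuple_server(ARGV.fetch(0, DEFAULT_URI))
#   DRb.thread.join
# Hypothetical worker script (a worker also needs its own DRb service
# running so the proxy can receive the tuples it takes):
#   DRb.start_service
#   space = connect(ARGV.fetch(0))
```

A fetcher, analyzer, or reducer would then call `connect` with the server URI and use the returned object exactly as it would a local tuple space.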
00:17:47.700 I utilized the latest Ruby container from Docker Hub, setting it up with around eight containers across five VMs. The architecture consists of one fetcher, one server, one reducer, and multiple analyzers. Having spent a lot of effort learning about Kubernetes, I decided to use that to manage everything, which worked well. The real value of using distributed systems and cloud services was evident in this task as deploying these various processes at scale would have otherwise been incredibly cumbersome. While I was running the demo, I was also logging output to show the total sentiment value and demonstrate the analysis visually. It became very important to visualize results when showcasing how much power distributed systems can provide.
00:19:40.400 The live demo I conducted during this talk involved gathering sentiments from the audience via tweets containing emojis at a specific hashtag. After a brief window to allow tweets to come in, I displayed the resulting sentiment score based on the collected data. The results were rather entertaining, indicating the kind of fun and unexpected outcomes that come from utilizing computing power in silly ways. As I wrap up this keynote, I want to reflect on the overall theme that while the ideas presented in this talk may not be practical or useful in the traditional sense, they exemplify the delight of exploring 'stupid' concepts with technology.
00:25:06.360 The true takeaway from this talk is to not be afraid of embracing absurdity and to enjoy coding. I would like to specifically thank the friends and colleagues who assisted me in preparing for this talk: Ryan, Scott, Eric, and Brian. Their input made this presentation possible, and I am grateful for their support. I also have stickers and other Google Cloud swag for anyone interested. Please feel free to reach out to me via email or social media with any questions or feedback. Remember, it’s important to not take coding too seriously; instead, embrace the joy in our work!