RubyConf 2017

That time I used Ruby to crack my Reddit password

by Haseeb Qureshi

The video titled "That time I used Ruby to crack my Reddit password" features Haseeb Qureshi discussing a personal story related to his struggles with productivity and password management. At RubyConf 2017, Qureshi narrates a unique anecdote about a time he locked himself out of his Reddit account intentionally to avoid distracting websites, only to find himself in a conundrum two years later when he sought to regain access.

Key points covered in the talk include:

- Introduction and Personal Struggle: Qureshi opens with his past addiction to procrastination through internet browsing and a humorous reference to The Odyssey, providing context to his lack of self-control.

- Password Management Strategy: To combat his time-wasting habits, he decided to change his Reddit password to a random, gibberish string and set his account recovery email to a throwaway account that he also discarded. This drastic measure forced him to focus on more productive activities.

- The Dilemma: After two years, while working at Airbnb, he attempted to log back into Reddit only to discover he had forgotten the generated password. He realized that he had set the password email to be sent on a future date, making recovery impossible without some ingenuity.

- The Solution: Qureshi devised a method to exploit the search functionality of the service he used (LetterMeLater) to gradually deduce his password character by character. He developed a systematic approach likened to a game of "Wheel of Fortune," leveraging substring queries to uncover the password without needing to access it directly.

- Technical Implementation: Using Ruby, he wrote a script named "lettermenow.rb" to automate the querying process, employing the Faraday gem to handle HTTP requests and extract information about his password incrementally.

- Outcome: Ultimately, he successfully retrieved his password using this method, completing this amusing yet enlightening journey into programming and problem-solving.

In conclusion, Qureshi's presentation not only entertains but serves to illustrate the potential of coding as a powerful tool for solving practical problems. He reinforces the importance of programming in everyday scenarios, emphasizing how it can lead to unexpected resolutions when faced with unique challenges.

The story encapsulates a blend of humor and technical prowess, leaving the audience with an appreciation for the creative ways programming can intersect with daily life.

00:00:08 I think it's probably about time to get started. So how's everybody doing? Yeah, good morning! All right, awesome!
00:00:21 I am here to talk to you guys about the time I used Ruby to crack my Reddit password, kind of. My name is Haseeb Qureshi. I'm a software engineer at Earn.com, and I am going to tell you guys a story.
00:00:36 So I used to be addicted to useless websites. I still am, but I used to be too. I'm sure you guys know what this is like. Here’s just an artistic depiction of what a useless site might look like. We all have our own poisons.
00:00:55 Also, I should note that I have no self-control to speak of, so that's pretty bad. This is actually kind of a nightmare. If you remember your English class from high school or something when you read The Odyssey, there's a famous story about when Odysseus tried to maintain some self-control by tying himself to his own mast.
00:01:20 My problems are not quite this severe, but they're kind of similar in their own way. So, keying off the story of The Odyssey, I decided to use that ancient psychological technique that Odysseus used, which is locking himself out of his online accounts.
00:01:37 What he did was a little bit different, but basically, I feel like it’s kind of the same. Let me tell you exactly what I did to try to spend less time wasted on useless websites.
00:01:49 On most of these websites, you have some kind of password system, and obviously, you can change your password. So, what I did was type in some random gibberish. I just grabbed my keyboard and started smacking away, coming up with something sufficiently weird and entered that in as my new password.
00:02:03 Of course, I don't remember what this string is; it’s just a bunch of random characters. I held on to it, though, and I also changed my account recovery email. So now, I have a new password that I have no idea what it is.
00:02:20 I've also changed my account recovery email to a throwaway email that I just created. I also threw away the password to that, so there's no way to get my account back, save for this password. This password is the key to my kingdom.
00:02:36 So what I did was plan to prevent myself from having access to this password until some later date in the future. For now, I want to go into crunch mode. I want to study, practice, and do whatever I need to do that these time-wasting websites are keeping me away from.
00:02:52 Unfortunately, normally the way you do this is through a mechanism known as 'friends.' But I figured there’s probably some way to automate this; you don’t need friends.
00:03:09 So I went and used Google to try to figure this out, and I came across this wonderful website called LetterMeLater. This kind of sounded nice; it allows you to send emails at a future date and time you choose. No friends required!
00:03:24 So, I decided to use it. It's a little bit 1995 looking, but that’s okay—maybe they're just really focused on what they do. So I went ahead and composed a new email to myself.
00:03:35 I created an account, filled in a subject line—I called it 'password' because, well, it’s my password—and set a date in the future for when I would email it to myself. I put my password in there and set it to 'hide mode.'
00:03:58 'Hide mode' means I can't actually click on it when I log into LetterMeLater. So, when it's hidden, I have no way to see it until it gets sent to me, right?
00:04:14 So my plan was foolproof. I know that I'm just an awful human being, and the only way that I won't access my password is if there's literally no way I can get to it.
00:04:28 For a while, I used this system to keep myself from wasting time on highly addictive useless websites. Now, my talk is actually not about productivity techniques; this is a programming conference, after all.
00:04:46 So why am I telling you all this? The story is a little more involved. I used this for a while, and it was pretty effective, but later on, it ended up coming back to bite me.
00:05:01 Cut to two years later. I was working at Airbnb, and I had a job. I was gainfully employed—surprise, surprise! I didn’t really believe it either, but that's fine.
00:05:20 So, you know what that means? I'm working at Airbnb, on their huge Rails app. A lot of tests, and of course, that means a lot of waiting. And waiting means it's time to start wasting company time.
00:05:34 So, what can I do? Obviously, I can't do work while the test suite is running; that would be silly. Instead, I wanted to get back into some of my old time-wasting activities.
00:05:46 I wanted to log back into this website, so I went back to LetterMeLater. I remembered that I locked my password away. After a couple of years, I thought it would be easy.
00:06:04 When I logged back in, I realized that it was still grayed out. That's kind of weird because usually, I’d schedule these emails something like a month out at a time.
00:06:19 Then I realized, oh, I scheduled it for 2018. The last time I put this there, I guess I didn't remember, but I had gotten so annoyed at myself that I actually set dates super far in the future.
00:06:42 I thought to myself, screw you! You need to get a grip on yourself! And I was like, I can't wait that long! This test suite isn’t that slow.
00:06:58 At that point, I thought maybe there’s some way around this. So I logged into LetterMeLater, and I couldn’t see my password.
00:07:11 As I was clicking around on this, I realized I was about ready to give up. But of course, you've got to try a few things first. I realized there’s a search bar here. What can I search for?
00:07:34 So maybe I can search for my name and see if it pops up. It’s not there. Now what if I search for a single letter?
00:07:54 That popped up. Okay, so maybe you might be looking for subject lines, right? So you can ascertain that pretty quickly by typing 'password.' I can see, okay, yeah, it’s definitely indexing the subject line; that makes sense.
00:08:13 But remember, the body of my email was just the actual password. So if I search for a letter that's not in 'password,' let’s look for the letter 'e.' It’s not in there. What about '1'? What about '2'? '2' is in there, okay.
00:08:34 So, what’s going on here? If I have a way to do substring queries into my password, I have an 'oracle,' and my 'oracle' will answer one kind of query.
00:08:51 It will tell me if the body plus the subject—the subject was 'password,' the body was the actual password itself—includes any string that I asked it.
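The oracle described here can be modelled in a few lines of Ruby. This is only a sketch of the behaviour as the talk describes it: `HIDDEN_BODY` is a made-up stand-in for the unknown password, and `oracle_hit?` is a hypothetical name, not something from the site or the speaker's script.

```ruby
# Hypothetical model of the search oracle: a query "hits" if it is a
# substring of either the subject line or the hidden email body.
SUBJECT = 'password'
HIDDEN_BODY = 'x7k2q' # made-up stand-in for the real, unknown password

def oracle_hit?(query)
  SUBJECT.include?(query) || HIDDEN_BODY.include?(query)
end

puts oracle_hit?('e') # false: 'e' is in neither string
puts oracle_hit?('2') # true: only the body contains '2'
puts oracle_hit?('p') # true, but only because the subject contains 'p'
```

The last call shows why a hit on a letter from 'password' says nothing about the body by itself.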
00:09:12 So when I realized this, I ran home. I was off working on this. I didn’t just run home from work, but I ran home, busted out a piece of paper and a pen, and thought, okay, let me see how I can retrieve my password.
00:09:43 Here’s the algorithm: let's think about it like Wheel of Fortune style. I have this one thing: the subject at the top and the body down here.
00:09:58 And I don’t know any of the characters in the body, but I do know the characters in the subject, right? Imagine I have like a word bank, and the word bank consists of all the letters except the ones that are in 'password'.
00:10:15 If I do a substring query for 'p' and I find that it returns true, I don’t actually know if it’s true because it was in the body or because it was in the subject.
00:10:32 The subject would automatically give me a hit; however, if I try all the letters that are not in 'password,' then I know for certain I’ve hit a letter that is only in the body and not in the subject.
00:10:49 So if I keep trying letters that are not in the string 'password,' eventually I will make a hit.
00:11:01 Once I make a hit, I know I’m in. I have one of the characters somewhere in my password. On average, this will take a/2 guesses, where 'a' is the size of the alphabet.
00:11:21 Imagine it's somewhere in the middle of the alphabet. I'll try letters until I get one that makes a hit. Then what I do is try to append another letter to do a longer search.
00:11:43 I know that you can append that letter to create a substring, and I just keep iterating through every single letter, including the letters in 'password,' because now any letter can be appended.
00:12:03 I keep going down one by one until I find the next character, and on average, you'll find the character somewhere around the middle of the alphabet.
00:12:16 Then I just keep repeating that, and every time it’s going to take a/2 guesses, where 'a' is the size of the alphabet.
00:12:30 Eventually, I will find the next character.
00:12:39 What I have at that point is not the entire string, but a suffix of the string. After that, I can just repeat the process going backwards.
00:12:54 Instead of appending to the end of the string, I prepend to the beginning of the string. I just keep going until I fall off going the other direction.
00:13:10 I know, okay, cool! That should be my entire string. So, if I do this, and if I consider that this illustration is not exactly correct, because at the ends, I have to take an extra guess.
00:13:25 By the end of it, I just have to exhaust the entire search space until I determine that no other letter fulfills this string.
00:13:42 The ends add some extra guesses on top of that. Let's say 'a' is the alphabet length. So if you assume the alphabet is a mix of lowercase letters and digits and the password length is 22, then doing this will take about 432 queries.
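The arithmetic behind that estimate can be checked quickly. The sketch below assumes an alphabet of 26 lowercase letters plus 10 digits and an average of half the alphabet per character. How the speaker accounted for the two ends is not spelled out in the transcript, so the leftover is only shown, not derived.

```ruby
# Rough check of the query estimate (assumptions noted in comments).
alphabet_size = (('a'..'z').to_a + ('0'..'9').to_a).size # 36 characters
password_length = 22
guesses_per_char = alphabet_size / 2                      # 18 on average

interior_guesses = guesses_per_char * password_length     # 18 * 22
quoted_total = 432                                        # figure quoted in the talk

puts interior_guesses                # 396
puts quoted_total - interior_guesses # 36, i.e. one extra pass over the alphabet
```

One reading consistent with the quoted figure is that the two ends together cost about one additional alphabet's worth of guesses.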
00:14:01 That's actually doable! It's like a reasonable number of things that you can just do in a serial manner through API calls.
00:14:15 Alright, let’s do this. I’m going to create a 'letter_me_now.rb'.
00:14:26 First things first, I need to figure out how I’m going to do this querying. I need some kind of access to the oracle, so I’m going to build that first.
00:14:46 The way this is set up, I have my LetterMeLater account up. I can see that if I change this query to 'B,' then it becomes 'B.' There's no API, so I’m going to scrape this directly.
00:15:04 I can do that! So, I’m going to create an API class. In this API class, I’ll have a URL and remove the part so I can put the query string in programmatically.
00:15:19 I’m going to use the 'Faraday' gem, which is a nice gem for making HTTP requests. I’m going to have a 'def self.get' method that takes in a query.
00:15:37 What I’m going to do is say 'Faraday.get' the URL with the query string as the second argument.
00:15:55 Let’s check this quickly. I’ll say 'API.get' and search for the string 'password.' Okay, that didn’t work; it gave me a redirect.
00:16:12 It’s saying I must be signed in to see this page, so obviously I don’t have any of my cookies. For this to work, I need to make sure I pass the cookies in the headers.
00:16:34 Let’s go ahead and inspect, look for any cookie, and okay, let’s refresh here. Alright, we’ve got this cookie. I'll make sure to sign out when this talk is done.
00:16:55 Now, if I’m not mistaken, that should do the trick. If I do 'API.get' hello, boom! This looks like the actual webpage.
00:17:17 Okay, it’s giving me the 200 status, which is good. Now I want to know if this query returned true or false.
00:17:35 The easiest way to figure that out is to check for some unique string in the HTML that will uniquely identify that, yes, in fact, this returned true.
00:17:57 There are certain things that show up when it's 'scheduled', and I don't think it shows up anywhere else. For simplicity, I’ll use 'password'.
00:18:20 Let’s make sure this returns; if the body of this HTTP request includes the string 'password,' then I'm good to go.
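The scraping steps above (query string, cookie header, marker check) might look roughly like the sketch below. It uses Ruby's standard-library Net::HTTP instead of the Faraday gem from the talk so that it has no gem dependency. The URL, the `q` parameter name, the cookie value, and the class and method names are all placeholders, since the transcript does not give the real ones.

```ruby
require 'net/http'
require 'uri'

# Sketch only: endpoint, parameter name, and cookie are hypothetical.
class LetterMeLaterAPI
  SEARCH_URL = 'https://example.com/search' # placeholder, not the real URL
  COOKIE = 'session=PLACEHOLDER'            # copied from the browser in the talk
  HIT_MARKER = 'password'                   # text that only appears on a hit

  # Build the search URL with the query placed in the query string.
  def self.search_uri(query)
    uri = URI(SEARCH_URL)
    uri.query = URI.encode_www_form(q: query)
    uri
  end

  # Build a GET request that carries the session cookie.
  def self.build_request(query)
    request = Net::HTTP::Get.new(search_uri(query))
    request['Cookie'] = COOKIE
    request
  end

  # A page counts as a hit if the marker string appears in the HTML.
  def self.hit?(html)
    html.include?(HIT_MARKER)
  end

  # Perform the request and report whether the query matched.
  def self.found?(query)
    uri = search_uri(query)
    response = Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
      http.request(build_request(query))
    end
    hit?(response.body)
  end
end
```

Keeping `hit?` separate from the network call makes the marker check easy to try against saved HTML before sending any real requests.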
00:18:40 So, I’ll put that in. Now, while I’m testing, I do not want to use the real Oracle because that’s going to be really slow.
00:18:55 I’m going to create a stubbed API that I can use while I’m testing so I can swap that out later with the real API.
00:19:16 In this stubbed API, I will have a fake password that is just some random characters. Great! Then we’ll have 'def self.include' with the same interface.
00:19:34 This will just be 'fake_password.include' the query.
00:19:42 Doing it this way should allow me to use a stubbed API so I don’t make unnecessary HTTP requests while I’m testing.
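A stub along these lines could stand in for the real API during testing. The fake password value and the `probe` helper are made up for illustration. The only part taken from the talk is the idea of a class method that checks the query against a fake password.

```ruby
# In-memory stand-in for the scraped API, so tests make no HTTP requests.
class StubbedAPI
  FAKE_PASSWORD = 'qx7k2m9z' # made-up test value

  def self.include?(query)
    FAKE_PASSWORD.include?(query)
  end
end

# Any object that responds to include?(query) can be passed in, which is
# what lets the stub be swapped for the real API later.
def probe(api, query)
  api.include?(query)
end

puts probe(StubbedAPI, 'k2') # true
puts probe(StubbedAPI, 'zz') # false
```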
00:19:58 So now, I need to build that algorithm to figure out how to crack the password.
00:20:14 Let’s call it 'password_cracker.' The password cracker is going to be stateful; it will be taking that API and injecting that dependency.
00:20:29 We’ll set the password to an empty string and count the number of iterations. Let’s create a 'crack' method that goes through each of the steps of that algorithm.
00:20:50 To find the first letter, I need to know the alphabet I’m working with, but I also need the subject line.
00:21:05 So what I’m going to do is just have two arrays: one will be the subject characters, and the other will be all the characters I’m considering.
00:21:25 To find the first letter, I want to iterate through the characters that are not in the subject line, and still in the alphabet.
00:21:43 That should be easy enough. Each letter is essentially checked against the Oracle. If it’s true, then I find my first character. If it doesn't return anything, I want to raise an error; 'Could not find first letter!' and that’s the plan for the first letter.
00:22:11 Now, one thing I want to do is count how many iterations I’m doing throughout this whole algorithm.
00:22:24 Let’s say I’ll go through the process of checking until I find my letter. Each iteration will represent steps.
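The first-letter search with an iteration counter could be sketched as follows. The oracle is passed in as a lambda, the fake body is invented for the demo, and the function name and return shape are assumptions. The speaker's actual class is not reproduced in the transcript.

```ruby
SUBJECT_CHARS = 'password'.chars.uniq
ALPHABET = ('a'..'z').to_a + ('0'..'9').to_a

# Try only characters that are NOT in the subject line, so a hit can only
# come from the hidden body. Returns the character and the guess count.
def find_first_letter(oracle)
  iterations = 0
  (ALPHABET - SUBJECT_CHARS).each do |char|
    iterations += 1
    return [char, iterations] if oracle.call(char)
  end
  raise 'Could not find first letter!'
end

fake_body = 'sp3ak' # made-up body for the demo
oracle = ->(query) { 'password'.include?(query) || fake_body.include?(query) }

letter, iterations = find_first_letter(oracle)
puts "#{letter} after #{iterations} guesses" # k after 9 guesses
```

Note that the letter found is simply the first match in alphabet order, not necessarily the first character of the password.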
00:22:40 Now let’s build forward. I’m not worried about the subject line anymore; I just iterate through the alphabet. For each character, I want to see if the query works.
00:23:06 The current password plus one more character, if it’s true, I know what my next character is, and I’ll recursively build this forward.
00:23:26 If all is well, I’ll build backward by prepending the character to the password. If this works, I’ll see just how the verification code says 'yes' or 'no.'
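Building forward and then backward from a known character might look like this. It is a sketch under assumptions: it uses loops, not the recursion mentioned in the talk; the function names are hypothetical; and the fake body and seed character are chosen for the demo.

```ruby
ALPHABET = ('a'..'z').to_a + ('0'..'9').to_a

# Append characters while some extension of the known fragment still hits.
def extend_forward(oracle, known)
  loop do
    next_char = ALPHABET.find { |char| oracle.call(known + char) }
    return known if next_char.nil? # nothing fits: we reached the end
    known += next_char
  end
end

# Same idea in the other direction: prepend until nothing fits.
def extend_backward(oracle, known)
  loop do
    prev_char = ALPHABET.find { |char| oracle.call(char + known) }
    return known if prev_char.nil? # nothing fits: we reached the start
    known = prev_char + known
  end
end

fake_body = 'sp3ak7' # made-up body for the demo
oracle = ->(query) { 'password'.include?(query) || fake_body.include?(query) }

suffix = extend_forward(oracle, '3') # '3' is a seed that is not in the subject
puts suffix                          # 3ak7
puts extend_backward(oracle, suffix) # sp3ak7
```

This assumes the growing fragment occurs only once in the body. If the same fragment appeared in two places, the first matching extension might follow the wrong occurrence.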
00:23:44 Now that we have everything in place, let’s try this out with the real API. Fingers crossed that the internet here cooperates.
00:24:03 Now that I can see things are rolling, the quest to retrieve the password has begun. I can’t remember what my password might be; let’s see!
00:24:23 Okay, it’s still moving, looking to get back to where we were. I am a bit anxious, but I can see it working through my head.
00:24:45 I see the iterations moving there, and this is where the API calls dramatically step up!
00:24:57 This is going as fast as it can. All I can do now is hope for any results.
00:25:11 And that’s it! 403 iterations! We did it!
00:25:31 So my initial math suggested 432 expected queries, and our actual number was close to that: 403, to be exact.
00:25:46 The power of programming is something incredible. This process was not just a joke; it was a learning experience.
00:26:00 Coding also opens doors; for me, this was a defining moment of sorts. It's one of those ‘we solved it’ feelings that drives home the power of programming.
00:26:20 I want to thank you for listening and share this story to illustrate the potential of coding and how we can solve practical problems.
00:26:40 I’m Haseeb Qureshi; you can find me on Twitter at @hosseeb. Visit my blog at haseebq.com. Thank you so much for listening!