00:00:08
I think it's probably about time to get started. So how's everybody doing? Yeah, good morning! All right, awesome!
00:00:21
I am here to talk to you guys about the time I used Ruby to crack my Reddit password, kind of. My name is Haseeb Qureshi. I'm a software engineer at Earn.com, and I am going to tell you guys a story.
00:00:36
So I used to be addicted to useless websites. I still am, but I used to be too. I'm sure you guys know what this is like. Here’s just an artistic depiction of what a useless site might look like. We all have our own poisons.
00:00:55
Also, I should note that I have no self-control to speak of, so that's pretty bad. This is actually kind of a nightmare. If you remember your English class from high school or something when you read The Odyssey, there's a famous story about when Odysseus tried to maintain some self-control by tying himself to his own mast.
00:01:20
My problems are not quite this severe, but they're kind of similar in their own way. So, keying off the story of The Odyssey, I decided to use that ancient psychological technique that Odysseus used, which is locking himself out of his online accounts.
00:01:37
What he did was a little bit different, but basically, I feel like it’s kind of the same. Let me tell you exactly what I did to try to spend less time wasted on useless websites.
00:01:49
On most of these websites, you have some kind of password system, and obviously, you can change your password. So, what I did was type in some random gibberish. I just grabbed my keyboard and started smacking away, coming up with something sufficiently weird and entered that in as my new password.
00:02:03
Of course, I don't remember what this string is; it’s just a bunch of random characters. I held on to it, though, and I also changed my account recovery email. So now, I have a new password that I have no idea what it is.
00:02:20
I've also changed my account recovery email to a throwaway email that I just created. I also threw away the password to that, so there's no way to get my account back, save for this password. This password is the key to my kingdom.
00:02:36
So what I did was plan to prevent myself from having access to this password until some later date in the future. For now, I want to go into crunch mode. I want to study, practice, and do whatever I need to do that these time-wasting websites are keeping me away from.
00:02:52
Unfortunately, normally the way you do this is through a mechanism known as 'friends.' But I figured there’s probably some way to automate this; you don’t need friends.
00:03:09
So I went and used Google to try to figure this out, and I came across this wonderful website called LetterMeLater. This kind of sounded nice; it allows you to send emails at a future date and time you choose. No friends required!
00:03:24
So, I decided to use it. It's a little bit 1995 looking, but that’s okay—maybe they're just really focused on what they do. So I went ahead and composed a new email to myself.
00:03:35
I created an account, filled in a subject line—I called it 'password' because, well, it’s my password—and set a date in the future for when I would email it to myself. I put my password in there and set it to 'hide mode.'
00:03:58
'Hide mode' means I can't actually click on the email and read it when I log into LetterMeLater. So, when it's hidden, I have no way to see it until it gets sent to me, right?
00:04:14
So my plan was foolproof. I know that I'm just an awful human being, and the only way that I won't access my password is if there's literally no way I can get to it.
00:04:28
For a while, I used this system to keep myself from wasting time on highly addictive useless websites. Now, my talk is actually not about productivity techniques; this is a programming conference, after all.
00:04:46
So why am I telling you all this? The story is a little more involved. I used this for a while, and it was pretty effective, but later on, it ended up coming back to bite me.
00:05:01
Cut to two years later. I was working at Airbnb, and I had a job. I was gainfully employed—surprise, surprise! I didn’t really believe it either, but that's fine.
00:05:20
So, you know what that means? I'm working at Airbnb on their huge Rails app. That means a lot of tests, and of course, that means a lot of waiting. And waiting means it's time to start wasting company time.
00:05:34
So, what can I do? Obviously, I can't do work while the test suite is running; that would be silly. Instead, I wanted to get back into some of my old time-wasting activities.
00:05:46
I wanted to log back into this website, so I went back to LetterMeLater. I remembered that I locked my password away. After a couple of years, I thought it would be easy.
00:06:04
When I logged back in, I realized that it was still grayed out. That's kind of weird, because usually I’d schedule these emails to arrive about a month later.
00:06:19
Then I realized, oh, I scheduled it for 2018. The last time I put this there, I guess I didn't remember, but I had gotten so annoyed at myself that I actually set dates super far in the future.
00:06:42
I thought to myself, screw you! You need to get a grip on yourself! And I was like, I can't wait that long! This test suite isn’t that slow.
00:06:58
At that point, I thought maybe there’s some way around this. So I logged into LetterMeLater, and I couldn’t see my password.
00:07:11
As I was clicking around on this, I realized I was about ready to give up. But of course, you've got to try a few things first. I realized there’s a search bar here. What can I search for?
00:07:34
So maybe I can search for my name and see if it pops up. It’s not there. Now what if I search for a single letter?
00:07:54
That popped up. Okay, so maybe it's searching subject lines, right? I can check that pretty quickly by typing 'password.' I can see, okay, yeah, it’s definitely indexing the subject line; that makes sense.
00:08:13
But remember, the body of my email was just the actual password. So if I search for a letter that's not in 'password,' let’s look for the letter 'e.' It’s not in there. What about '1'? What about '2'? '2' is in there, okay.
00:08:34
So, what’s going on here? If I have a way to do substring queries into my password, I have an 'Oracle,' and my 'Oracle' will answer this query for me.
00:08:51
It will tell me if the body plus the subject—the subject was 'password,' the body was the actual password itself—includes any string that I asked it.
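A minimal Ruby sketch of that oracle as a yes/no predicate. The names `SUBJECT`, `BODY`, and `oracle` are illustrative, the body value is a made-up stand-in rather than the real password, and the sketch assumes the search matches either field on its own.

```ruby
# Hypothetical stand-ins: the real body was the unknown password.
SUBJECT = "password"
BODY = "2xq9z"

# The oracle answers one yes/no question: does the query appear
# in the subject or in the body of the email?
def oracle(query)
  SUBJECT.include?(query) || BODY.include?(query)
end

puts oracle("e")  # false: "e" is in neither field
puts oracle("2")  # true: only the body contains "2"
puts oracle("p")  # true, but the hit comes from the subject
```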
00:09:12
So when I realized this, I ran home. I was off working on this. I didn’t just run home from work, but I ran home, busted out a piece of paper and a pen, and thought, okay, let me see how I can retrieve my password.
00:09:43
Here’s the algorithm: let's think about it like Wheel of Fortune style. I have this one thing: the subject at the top and the body down here.
00:09:58
And I don’t know any of the characters in the body, but I do know the characters in the subject, right? Imagine I have like a word bank, and the word bank consists of all the letters except the ones that are in 'password'.
00:10:15
If I do a substring query for 'p' and I find that it returns true, I don’t actually know if it’s true because it was in the body or because it was in the subject.
00:10:32
The subject would automatically give me a hit; however, if I try all the letters that are not in 'password,' then I know for certain I’ve hit a letter that is only in the body and not in the subject.
00:10:49
So if I keep trying letters that are not in the string 'password,' eventually I will make a hit.
00:11:01
Once I make a hit, I know I’m in. I have one of the characters somewhere in my password. On average, this will take about a/2 guesses, where 'a' is the size of the alphabet.
00:11:21
Imagine it's somewhere in the middle of the alphabet. I'll try letters until I get one that makes a hit. Then what I do is try to append another letter to do a longer search.
00:11:43
I know that you can append that letter to create a substring, and I just keep iterating through every single letter, including the letters in 'password,' because now any letter can be appended.
00:12:03
I keep going down one by one until I find the next character, and on average, you'll find the character somewhere around the middle of the alphabet.
00:12:16
Then I just keep repeating that, and every time it’s going to take about a/2 guesses, where 'a' is the size of the alphabet.
00:12:30
Eventually, I will find the next character.
00:12:39
What I have at that point is not the entire string, but a suffix of the string, because my first hit could have been anywhere in the password. After that, I can just repeat the process going backwards.
00:12:54
Instead of appending to the end of the string, I prepend to the beginning of the string. I just keep going until I fall off going the other direction.
00:13:10
Then I know, okay, cool! That should be my entire string. Now, this illustration is not exactly correct, because at the ends I have to take extra guesses.
00:13:25
At each end, I have to exhaust the entire search space before I can determine that no other letter extends the string.
00:13:42
That costs extra guesses at the two ends. Let's say 'a' is the alphabet length. If you assume the alphabet is a mix of lowercase letters and digits, so 'a' is 36, and the password length is 22, then doing this will take about 432 queries.
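As a rough check of that estimate in Ruby: the sketch below assumes each character costs about a/2 queries on average and that the two ends together add roughly one more full pass over the alphabet. That second assumption is an editorial guess, chosen because it reproduces the 432 figure; the talk does not spell out the exact accounting, and `estimated_queries` is a hypothetical helper name.

```ruby
# Rough query estimate under the assumptions stated above.
def estimated_queries(alphabet_size, password_length)
  per_character = alphabet_size / 2  # average guesses to find one character
  ends = alphabet_size               # assumed extra cost for both ends combined
  per_character * password_length + ends
end

alphabet = ("a".."z").to_a + ("0".."9").to_a  # lowercase letters plus digits
puts estimated_queries(alphabet.size, 22)     # 432
```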
00:14:01
That's actually doable! It's like a reasonable number of things that you can just do in a serial manner through API calls.
00:14:15
Alright, let’s do this. I’m going to create a 'letter_me_now.rb'.
00:14:26
First things first, I need to figure out how I’m going to do this querying. I need some kind of access to the Oracle, so I’m going to build that first.
00:14:46
To see how this is set up, let me pull my LetterMeLater account up. I can see that if I change the query in the URL to 'b,' then the search becomes 'b.' There's no API, so I’m going to scrape this directly.
00:15:04
I can do that! So, I’m going to create an API class. In this API class, I’ll have a URL with the query part removed, so I can put the query string in programmatically.
00:15:19
I’m going to use the 'Faraday' gem, which is a nice gem for making HTTP requests. I’m going to have a 'def self.get' method that takes in a query.
00:15:37
What I’m going to do is say 'Faraday.get' the URL with the query string as the second argument.
00:15:55
Let’s check this quickly. I’ll say 'API.get' and search for the string 'password.' Okay, that didn’t work; it gave me a redirect.
00:16:12
It’s saying I must be signed in to see this page, so obviously I don’t have any of my cookies. For this to work, I need to make sure I pass the cookies in the headers.
00:16:34
Let’s go ahead and inspect, look for any cookie, and okay, let’s refresh here. Alright, we’ve got this cookie. I'll make sure to sign out when this talk is done.
00:16:55
Now, if I’m not mistaken, that should do the trick. If I do 'API.get' hello, boom! This looks like the actual webpage.
00:17:17
Okay, it’s giving me the 200 status, which is good. Now I want to know if this query returned true or false.
00:17:35
The easiest way to figure that out is to check for some unique string in the HTML that will uniquely identify that, yes, in fact, this returned true.
00:17:57
There are certain things that show up when it's 'scheduled', and I don't think it shows up anywhere else. For simplicity, I’ll use 'password'.
00:18:20
Let’s make sure this returns; if the body of this HTTP request includes the string 'password,' then I'm good to go.
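The scraping steps so far could be sketched as below. This is not the code from the talk: it swaps Faraday for Ruby's standard-library `Net::HTTP` so the sketch has no gem dependency (with Faraday, the request would be roughly `Faraday.get(url, params, headers)`). The search URL, the parameter name `q`, the `LML_COOKIE` environment variable, and the helper names `search_uri` and `hit?` are all hypothetical placeholders.

```ruby
require "net/http"
require "uri"

class API
  # Hypothetical values: the real search URL, parameter name, and
  # cookie are not given in the talk.
  SEARCH_URL = "https://example.com/search"
  QUERY_PARAM = "q"
  COOKIE = ENV.fetch("LML_COOKIE", "")

  # Build the search URL with the query string added programmatically.
  def self.search_uri(query)
    uri = URI(SEARCH_URL)
    uri.query = URI.encode_www_form({ QUERY_PARAM => query })
    uri
  end

  # Fetch the results page, sending the session cookie in the headers
  # so the site does not redirect to the sign-in page.
  def self.get(query)
    uri = search_uri(query)
    request = Net::HTTP::Get.new(uri)
    request["Cookie"] = COOKIE
    Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == "https") do |http|
      http.request(request)
    end
  end

  # A page counts as a hit if it contains the marker string.
  def self.hit?(html)
    html.include?("password")
  end

  # The oracle: true if searching for `query` returned the email.
  def self.include?(query)
    hit?(get(query).body)
  end
end
```

Keeping `search_uri` and `hit?` separate from the network call means the URL building and the marker check can be exercised without touching the site.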
00:18:40
So, I’ll put that in. Now, while I’m testing, I do not want to use the real Oracle because that’s going to be really slow.
00:18:55
I’m going to create a stubbed API that I can use while I’m testing so I can swap that out later with the real API.
00:19:16
In this stubbed API, I will have a fake password that is just some random characters. Great! Then we’ll have 'def self.include?' with the same interface.
00:19:34
This will just be 'fake_password.include?' called with the query.
00:19:42
Doing it this way should allow me to use a stubbed API so I don’t make unnecessary HTTP requests while I’m testing.
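A minimal sketch of such a stub, assuming the oracle's interface is a single class method `include?(query)`. The class name `StubbedAPI` and the fake password value are made up for illustration. Note that this stub ignores the subject line, so it only mimics matches against the body.

```ruby
class StubbedAPI
  # Made-up value for testing only, not a real password.
  FAKE_PASSWORD = "7kq2zx9m"

  # Same interface as the real oracle, but answered locally
  # with no HTTP request.
  def self.include?(query)
    FAKE_PASSWORD.include?(query)
  end
end

puts StubbedAPI.include?("q2z")  # true
puts StubbedAPI.include?("abc")  # false
```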
00:19:58
So now, I need to build that algorithm to figure out how to crack the password.
00:20:14
Let’s call it 'password_cracker.' The password cracker is going to be stateful, and it takes the API as an injected dependency.
00:20:29
We’ll set the password to an empty string and count the number of iterations. Let’s create a 'crack' method that goes through each of the steps of that algorithm.
00:20:50
To find the first letter, I need to know the alphabet I’m working with, but I also need the subject line.
00:21:05
So what I’m going to do is just have two arrays: one will be the subject characters, and the other will be all the characters I’m considering.
00:21:25
To find the first letter, I want to iterate through the characters that are not in the subject line, and still in the alphabet.
00:21:43
That should be easy enough. Each letter is essentially checked against the Oracle. If it’s true, then I find my first character. If it doesn't return anything, I want to raise an error; 'Could not find first letter!' and that’s the plan for the first letter.
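The first-letter step might look like this in Ruby. It is a sketch under assumptions: the alphabet is lowercase letters plus digits, `find_first_letter` is an illustrative name, and `oracle` is any object that responds to `include?(query)`. A plain string stands in for the oracle in the usage line.

```ruby
SUBJECT_CHARS = "password".chars.uniq
ALPHABET = ("a".."z").to_a + ("0".."9").to_a

# Try only characters that cannot match the subject line, so any
# hit must come from the body.
def find_first_letter(oracle)
  (ALPHABET - SUBJECT_CHARS).each do |char|
    return char if oracle.include?(char)
  end
  raise "Could not find first letter!"
end

# A String responds to include?, so it can act as a stand-in oracle.
puts find_first_letter("7kq2zx9m")  # k
```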
00:22:11
Now, one thing I want to do is count how many iterations I’m doing throughout this whole algorithm.
00:22:24
Every time I check a candidate against the Oracle, I’ll bump the counter, so each iteration represents one query.
00:22:40
Now let’s build forward. I’m not worried about the subject line anymore; I just iterate through the alphabet. For each character, I want to see if the query works.
00:23:06
I query the current password plus one more character. If that returns true, I know what my next character is, and I’ll recursively build this forward.
00:23:26
If all is well, I’ll then build backward by prepending characters to the password. If this works, I’ll see just how the verification code says 'yes' or 'no.'
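The forward and backward passes could be sketched as follows. This is an illustration rather than the code written on stage: it uses loops instead of the recursion mentioned in the talk, it assumes a lowercase-plus-digits alphabet, and it takes an already-found seed character as a hypothetical second constructor argument. The injected `api` only needs to respond to `include?(query)`.

```ruby
class PasswordCracker
  ALPHABET = ("a".."z").to_a + ("0".."9").to_a

  attr_reader :password, :iterations

  # `seed` is a character already known to be in the password.
  def initialize(api, seed)
    @api = api
    @password = seed
    @iterations = 0
  end

  def crack
    build_forward
    build_backward
    @password
  end

  private

  # Count every oracle query so the total can be compared with the estimate.
  def hit?(candidate)
    @iterations += 1
    @api.include?(candidate)
  end

  # Append characters until no character extends the string.
  def build_forward
    while (char = ALPHABET.find { |c| hit?(@password + c) })
      @password += char
    end
  end

  # Prepend characters until no character extends the string.
  def build_backward
    while (char = ALPHABET.find { |c| hit?(c + @password) })
      @password = char + @password
    end
  end
end

# A plain String stands in for the oracle here.
cracker = PasswordCracker.new("7kq2zx9m", "k")
puts cracker.crack  # 7kq2zx9m
```

Each pass ends only after a full sweep of the alphabet finds no match, which is where the extra queries at the two ends come from.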
00:23:44
Now that we have everything in place, let’s try this out with the real API. Fingers crossed that the internet here cooperates.
00:24:03
Now that I can see things are rolling, the quest to retrieve the password has begun. I can’t remember what my password might be; let’s see!
00:24:23
Okay, it’s still moving, looking to get back to where we were. I am a bit anxious, but I can see it working through my head.
00:24:45
I see the iterations moving there, and this is where the API calls dramatically step up!
00:24:57
This is going as fast as it can. All I can do now is hope for any results.
00:25:11
And that’s it! 403 iterations! We did it!
00:25:31
So my initial math suggested about 432 expected queries, and our actual number was close to that: 403, to be exact.
00:25:46
The power of programming is something incredible. This process was not just a joke; it was a learning experience.
00:26:00
Coding also opens doors; for me, this was a defining moment of sorts. It's one of those ‘we solved it’ feelings that drives home the power of programming.
00:26:20
I want to thank you for listening and share this story to illustrate the potential of coding and how we can solve practical problems.
00:26:40
I’m Haseeb Qureshi; you can find me on Twitter at @hosseeb. Visit my blog at haseebq.com. Thank you so much for listening!