RubyKaigi Takeout 2021

Beware the Dead End!!

Nothing stops a program from executing quite as fast as a syntax error. After years of “unexpected end” in my dev life, I decided to “do” something about it. In this talk we'll cover the lexing, parsing, and indentation-informed syntax tree search that power the dead_end Ruby library.

RubyKaigi Takeout 2021: https://rubykaigi.org/2021-takeout/presentations/schneems.html

00:00:00.160 Hello everyone, my name is Richard Schneeman, and I want to talk to you about the scariest thing a Ruby programmer can face.
00:00:01.839 Wait a second, do you hear that? Everyone, run!
00:00:09.120 Oh no, oh no! They're right behind me. There are more of them!
00:00:15.200 Okay, all right, so actually there's something even scarier than being chased by dinosaurs: syntax errors. Just look at this unexpected syntax error. It's horrifying! Where's the problem? No matter where the problem is, the error always shows up on the last line. Seriously though, where is the problem? What if we had some way to turn the dreaded dead end into something a little bit more obvious?
00:00:36.800 All right, you see the error now? Doesn't line five look a little suspicious, like maybe it's missing a 'do'? Okay, question time! Which do you prefer? Do you like this one? A single syntax error on the bottom where you have to ask, 'Where's the problem?' Or do you prefer this one? Yeah, all right, so I did a very scientific study on the time cost of syntax error exceptions. I found that the average developer loses a hundred hours of productivity to these errors. When I ask developers what they wish for most in life, seventy-eight point three percent of them told me they want an AI algorithm to tell them where their syntax errors are.
00:01:06.159 Well, today you can have that, all for the low, low cost of free, with 'gem install dead_end'. The dead_end library is released on RubyGems, and today we'll be talking about how it works. So, what all can dead_end do? When you miss a keyword like 'if', 'do', or 'def', dead_end finds the problem. When you miss an 'end' keyword, dead_end finds the problem. When you miss a curly bracket, dead_end finds the problem. When you miss a square bracket, dead_end finds the problem. Have a problem with a missing pipe character? dead_end finds the problem. Have a problem with a missing Marvel Universe comic book character? Unfortunately, dead_end can't do everything, but we tried.
00:01:45.920 All right, so today we're going to dig into dead_end. We'll look at why syntax errors are especially difficult in Ruby, how to parse and lex code with Ripper, how AI pathfinding algorithms work, and we'll put all those together to build dead_end at a high level. Then we will crack open the cover and see how it works at a low level. This is a warning: we're going to get technical. There's going to be a lot of code.
00:02:52.879 But first, who am I? Why should you listen to me? I go by Schneems on the internet. If you forget how to pronounce my name, you can go to my blog at schneems.com, where I have an audio recording of me pronouncing my name on my About page.
00:03:04.400 I created CodeTriage.com, which is a platform for learning how to contribute to open source. On CodeTriage, I've got about 60,000 developers signed up to get better and start their open source journey. So you can go to codetriage.com today. In addition, I am writing a book on how to contribute to open source as well. For details, you can sign up on my blog; I have a mailing list. You can also sign up on CodeTriage. I will be emailing out updates from both platforms.
00:03:30.720 When I am not working on open source or teaching people how to contribute, I like to get paid. I work at Heroku, and right now I'm focused on Salesforce Functions. It provides an easy way to work with the data inside Salesforce using whatever languages you love. Currently, we're beta testing with Java and JavaScript users, and we will roll out Ruby support later. So if you use Ruby and Salesforce, I would love to hear from you; my DMs on Twitter are open.
00:04:01.600 People also tell me that I am an exceptional programmer. My programs generate a lot of exceptions. I love exceptions though, especially when they help me debug. Take 'NameError' for instance; this is beautiful. Did I mean to use 'ch_star'? Yes! This error brings me joy. The 'did_you_mean' gem now ships with Ruby itself, which is brilliant. But syntax errors in Ruby… they are just awful. They creep me out! Why is that? Why are they like that?
00:04:28.639 Well, when Ruby gets source code, it needs to lex and parse it from the top to the bottom, and there are rules about what can and cannot be put in there. For instance, you can't start a Ruby file with an 'end'; that's invalid right away. But what happens if you put in too many keywords or too few? If you have too many keywords and not enough 'ends', then Ruby needs to scan the whole file to see if the missing bits are at the bottom. Here's a list of keywords that can trigger a required 'end'. If you have an 'end' statement that doesn't match one of those keywords, it will trigger a syntax error.
00:05:58.400 So, what's wrong with this source code? It's missing a 'do', but Ruby doesn't know that; Ruby only knows that it's got an extra 'end'. Why doesn't Ruby know about that missing 'do'? Because calling 'ch_dur' with a block is not required in Ruby; any method can take a block even if it's not used.
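To make that concrete, here is a minimal sketch. The 'Dog' class and the 'valid_syntax?' helper are hypothetical stand-ins, not the code from the slide. It relies on 'Ripper.sexp' returning nil when the source cannot be parsed. Dropping the final 'end' makes the broken source parse again, which suggests that all Ruby sees is one 'end' too many rather than a missing 'do'.

```ruby
require "ripper"

# Hypothetical helper: Ripper.sexp returns nil when the source has a parse error.
def valid_syntax?(source)
  !Ripper.sexp(source).nil?
end

# Assumed example source: `3.times` is missing its `do`.
missing_do = <<~RUBY
  class Dog
    def bark
      3.times
        puts "woof"
      end
    end
  end
RUBY

puts valid_syntax?(missing_do) # prints false

# Without a `do`, `3.times` is a complete statement on its own, so the inner
# `end` closes `def` and the next one closes `class`. Only the last `end`
# is left over, and removing it yields valid Ruby.
without_last_end = missing_do.lines[0...-1].join
puts valid_syntax?(without_last_end) # prints true
```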
00:06:11.280 Here's a method that doesn't take a block when called: the 'bark' method. You can add a block anyway, and it's still totally valid Ruby. Here we are calling it with a block, but it doesn't do anything; it's just fluff. Still, Ruby says, 'Fine, sure, I'll take it'. Now here's another fun point: Ruby's internal parsing is so permissive that while this code can never execute, it's syntactically valid. Note the trailing dot here, and there isn't any keyword that ever matches that 'end'.
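As a minimal sketch of that point (the body of 'bark' here is an assumption for illustration, not the code from the slide), a method that never yields still accepts a block without complaint:

```ruby
# Hypothetical method: it never yields and never references a block.
def bark
  "woof"
end

# Passing a block anyway is valid Ruby; the block is silently ignored.
result = bark do
  raise "this block is never called"
end

puts result # prints woof
```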
00:07:13.320 Let's take a look at how Ruby transforms source code into a program. The source code is fed into Ruby and goes through a process called lexing and parsing. You start with source code like this string up here: just 'hello world'. The programming language passes your source code to a lexer. Ruby ships with a parser and a lexer, which is pretty sweet. It's called Ripper. It sounds like it should be the name of a band or something pretty extreme like that. Ripper takes the source code and converts it into tokens with significance.
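Here is a small sketch of what that lexing step can look like. The exact source string on the slide is not in the transcript, so 'puts "hello world"' is an assumption. 'Ripper.lex' is part of Ruby's standard library and returns one entry per token.

```ruby
require "ripper"

# Assumed input; the slide's exact source is not captured in the transcript.
source = 'puts "hello world"'

# Each entry looks like [[line, column], token_type, token_text, lexer_state].
tokens = Ripper.lex(source)

tokens.each do |(line, column), type, text, _state|
  puts format("%d:%-3d %-20s %p", line, column, type, text)
end

# The token types alone show how the string is split up.
token_types = tokens.map { |_position, type, _text, _state| type }
p token_types
```

Each token carries its position and a type such as ':on_ident' or ':on_tstring_content', which is the "significance" that plain text lacks.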