00:00:11.759
Hello everyone, my name is Richard Schneeman, and I want to talk to you about one of the scariest things a Ruby programmer can face.
00:00:25.359
Wait a second, do you hear that? Everyone run! Oh no, oh no! There are more of them! Okay, wow, they followed me from RubyKaigi!
00:00:31.359
I can't believe it! Two conferences in a row! Who would have believed it? Wow, what a coincidence!
00:00:41.680
So, if you've seen that talk, know that there is a lot of new content in this one, and you will not be disappointed.
00:00:47.520
And while dinosaurs are scary, there’s something that’s even scarier: syntax errors. Okay, don’t laugh. Syntax errors are scary.
00:00:59.600
Just look at this unexpected syntax error. Okay, it’s horrifying! Where’s the problem? Here’s some different code. Wanna guess where Ruby thinks the problem is? Yeah, last line? Nope, wrong! This is frustrating! I am frustrated!
00:01:16.880
What if we had something better? What if we could take this and turn it into something that tells us exactly where the issue lies? It looks like it's missing a 'do', okay?
00:01:22.960
Have you ever seen a cooking show? They show you the final product. I want to show off Dead End in action.
00:01:28.400
Here’s a demo of how it finds syntax errors. Here’s a real document with a real syntax error: the last line, of course. And then this symbol up here means that the entire document cannot parse.
00:01:39.759
So, we're going to transform it until it can.
00:01:44.880
At each step in the search, when a valid block of source code is found, Dead End safely removes it. Then Dead End re-evaluates the whole document to see if it's parsable yet. Parsing failed, so we need to keep looking. Still failing. Still failing. Expanding. Still failing. Are you the syntax error? Are you my syntax error? Still failing.
00:02:02.960
Searching for that syntax error... still failing. Now, still can't parse the document. Maybe it's up here? Maybe... no. Yeah, it doesn’t look right. I don’t think it’s in there.
00:02:25.760
Okay, the document is going to be checked, and... oh, okay. Wow! Okay, the parser reports that our minimal document is now parsable. That’s great news! Once this happens, Dead End has found a way to transform our document; it can stop searching and instead focus on the invalid code blocks.
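The check repeated throughout the demo boils down to one question: does the document parse once some lines are set aside? Here is a minimal sketch of that check, assuming Ruby's bundled Ripper library as the parser. The helper `parses_without?` and the sample source are hypothetical names for illustration, not Dead End's API.

```ruby
require "ripper"

# Hypothetical helper: true when the source parses after hiding the
# lines at the given zero-based indexes.
def parses_without?(source, hidden_indexes)
  kept = source.lines.reject.with_index { |_line, i| hidden_indexes.include?(i) }
  !Ripper.sexp(kept.join).nil? # Ripper.sexp returns nil on a syntax error
end

dog_source = <<~RUBY
  class Dog
    def bark
      puts "woof"

    def sit
      puts "sit"
    end
  end
RUBY

puts parses_without?(dog_source, [])  # whole document, missing an `end`
puts parses_without?(dog_source, [1]) # with `def bark` hidden
```

Hiding the `def bark` line lets the rest of the document parse, which points at that line as the one missing its `end`.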
00:02:42.400
This right here is the actual output of Dead End on that file. The issue lies on line number 36—there’s a missing end statement. You can have it today for the low cost of free—just install the gem.
00:02:54.160
By the time you see this slide, we're going to have half a million downloads, probably at least. I am talking with Ruby core to get Dead End integrated directly into Ruby. We are targeting integration for 2022, so not this year, but the next, for the release of Ruby 3.2.
00:03:06.000
This also means that there’s plenty of time for you to give it a shot, give it a spin, and provide me with some feedback. Alright, you saw one example of Dead End finding a syntax error. What else can it do?
00:03:31.040
When you miss a keyword like 'if', 'do', or 'def', Dead End finds the problem. What about a missing 'end' keyword? Can it do that? Yes, it finds the problem.
00:03:41.840
What about a missing curly bracket? Dead End finds the problem. What about a square bracket? Dead End finds that problem too. What about a missing pipe character? Well, Dead End finds that one as well. But what if you have a problem with a missing family member in a Korean thriller drama? No, sorry, Dead End can't assist with that.
00:04:07.680
That was too dark, too deep. So, let’s rewind and try this again. I have a problem with a missing Marvel universe character! Crocodile Loki, come on! Still funny, still good, and still top. It's Crocodile Loki! This year is lasting forever. Oh, that was such a good show.
00:04:19.120
Today, we're going to dig into Dead End. You can actually follow the algorithm yourself by running it with this record CLI flag. You'll see each step along with the annotated source code. That’s actually how I generated those original slides, coming from Dead End. If you don’t want to generate slides, just write a program to do it for you.
00:04:39.199
So today we're going to talk about syntax errors, lexing, and parsing, and I'll touch on some of Dead End's internals. But first, who's this guy they let up on stage? I mean, they're allowing anyone on stage these days.
00:05:04.320
I go by Schneems on the internet. If you forget how to pronounce my name, you can go to my blog, schneems.com, and click the little play button. I also created Code Triage, which is a platform for learning how to contribute to open source.
00:05:17.840
To date, I've helped over 60,000 developers; the number is around 62,000 now. You can sign up for Code Triage if you want to start contributing to open source.
00:05:29.039
Speaking of open-source contributions, I'm glad you brought that up. I'm actually working on a book called 'How to Open Source'. You can go to howtoopensource.dev to buy the book as a pre-release. It's not quite ready yet, but it focuses on contributing to projects, especially for developers who are unsure how to start or who started and got stuck. I've been running Code Triage for years, along with conducting research and interviewing developers.
00:05:57.600
So, this book is kind of like the synthesis of all of that work. The book is at howtoopensource.dev. You can also sign up for codetriage.com, and I’ll email members whenever the book is ready.
00:06:15.360
When I’m not working on open-source, I also like to get paid. Currently, that's happening through Heroku and Salesforce, where I'm working on Salesforce Functions. It’s an easy way to work with data inside of Salesforce using the language you love. If you love Java or JavaScript, you can use it right now, and we will roll out Ruby support later.
00:06:29.120
People also tell me that I am an exceptional programmer, mostly because my programs generate a lot of exceptions. It’s okay; I know what I am. Syntax errors, though, are the main exception we’re going to focus on today.
00:06:40.639
Let's start from the beginning: What is a syntax error, and why are they so hard to understand? Well, this code works wonderfully, comparing 'a' and 'b' all day inside a while loop. When Ruby parses the code, it converts it into an abstract syntax tree, and you can see it now.
00:07:05.199
The tree is beautiful, and Ruby's parser looks upon it with great happiness and purpose. But then a stranger comes upon the land, and the stranger has a secret power. Behold the octothorp! With one key, the stranger transforms the code.
00:07:23.520
A critical line had been commented out, and without that line, the tree is no longer whole. Huge sections of the code are no longer reachable; the code no longer parses. Our parser is sad, and honestly, I’m a little sad too. With that, the stranger leaves, and behind them stands a syntax error.
00:07:59.599
In short, a syntax error occurs when Ruby's parser cannot build a valid parse tree. But why is it difficult to understand? Well, when the parser tries to parse code and finds an error, it reports the place where parsing broke down, which isn't always where the developer made a mistake.
00:08:27.280
For instance, here a developer forgot a bracket. As the parser builds the tree, it hits an error because it wasn't expecting a comma. Ruby's parser knows several rules about method definitions and what they should look like. When it finds something unexpected, it raises an error.
00:09:06.320
The problem is that the error isn’t caused by the comma. It’s caused by a missing bracket; that’s a major difference. The location of the parse error isn’t always where the developer made the mistake.
00:09:22.880
Here's an example: a developer forgot a space after the 'def'. Ruby believes the error is on the last line because, when parsing the module definition, it starts looking for a matching 'end'. When it sees this unexpected character combination, it raises a syntax error. For a human, the result isn't helpful: it doesn't indicate precisely what to delete or change to resolve the issue. So parsing errors are different from human errors.
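To see the gap between where the mistake was made and where the parser complains, here is a small sketch using Ruby's bundled Ripper library. The class `ErrorLineCollector` and the sample module are hypothetical, written for illustration. The sketch assumes that overriding Ripper's `on_parse_error` event and reading `lineno` is an acceptable way to observe where the parser gave up.

```ruby
require "ripper"

# Hypothetical helper class: records the line the parser was on each
# time it reported a parse error.
class ErrorLineCollector < Ripper
  def error_lines
    @error_lines ||= []
  end

  def on_parse_error(_message)
    error_lines << lineno
  end
end

greeter_source = <<~RUBY
  module Greeter
    defhello
      puts "hello"
    end
  end
RUBY

collector = ErrorLineCollector.new(greeter_source)
collector.parse

puts "The mistake is on line 2; the parser complained on line(s): " \
     "#{collector.error_lines.inspect}"
```

Because `defhello` reads as an ordinary method call, line 2 is valid on its own, and the parser only notices a problem further down.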
00:10:41.120
Dead End's goal is to turn the parser's problems into something a human can recognize as an issue. How does Dead End work? Well, it uses a library called Ripper. No, it’s not a band name; Ripper is Ruby’s parser that ships with Ruby.
00:11:06.480
You can just require it, and there are no external dependencies for this gem. Ripper can evaluate code and indicate whether there’s a syntax error. We saw the code before. Ripper confirms a syntax error is present, and if we fix it, Ripper will confirm that too.
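As a rough illustration of asking Ripper whether code has a syntax error, the sketch below relies on `Ripper.sexp` returning `nil` when the source cannot be parsed. The helper name `valid_ruby?` and the sample loop are assumptions made for this example.

```ruby
require "ripper"

# Hypothetical helper: Ripper.sexp returns a nested array describing the
# parse tree, or nil when the source has a syntax error.
def valid_ruby?(source)
  !Ripper.sexp(source).nil?
end

working = <<~RUBY
  a = 1
  b = 5
  while a < b
    a += 1
  end
RUBY

# Comment out the `while` line, leaving its `end` orphaned.
broken = working.sub("while a < b", "# while a < b")

puts valid_ruby?(working)
puts valid_ruby?(broken)
```

Restoring the commented-out line makes the check pass again, which is the "fix it and Ripper confirms" half of the story.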
00:11:37.200
But that's difficult to do! It can be hard to guess what the developer intended. What we often do with code we don't want to run is comment it out. We could comment out this whole method; however, that on its own doesn't tell us anything useful.
00:11:58.000
So instead, Dead End uses indentation and lexical parsing to deconstruct the source code from the outside in. Commenting out lines removes nodes from the tree. Once we reach a point where all the orphaned syntax nodes are gone, we have reached a valid state. The output then shows the actual issue.
00:12:26.080
It reveals that the 'else' and 'end' are present, but as a developer, it's obvious that there should also be an 'if' present. However, naively commenting out and expanding based on indentation can lead to wrong conclusions.
00:12:52.960
Let us look deeper into some gotchas. We know the developer has a syntax error, but we can't make uniform assumptions about everything else. If we look at a syntax error caused by a missing 'do', it can lead the parser to identify an extra 'end' below it.
00:14:02.560
What if we rely only on indentation? Well, we'd begin deleting lines and could remove the wrong 'end', which still leads to a valid parse. Why? Because the 'do' then matches the other 'end'. This is an example of the complexity this approach can introduce.
00:14:52.160
We can fix this by using the lex output that Ripper provides. The lex output tells us what's in the code, such as our 'do' and 'end'. This information is essential: by looking at which keywords appear on the lines we want to remove, we can identify which blocks are safe to remove.
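A minimal sketch of reading keywords from the lex output, using `Ripper.lex` from Ruby's standard library. The helper `keywords_by_line` and the sample spec are hypothetical and only show the kind of information the lexer exposes. How Dead End consumes it internally is not shown here.

```ruby
require "ripper"

# Hypothetical helper: maps each line number to the keywords Ripper's
# lexer found on that line. Each lexed token is an array of
# [[line, column], type, text, state].
def keywords_by_line(source)
  Ripper.lex(source)
    .select { |token| token[1] == :on_kw }
    .group_by { |token| token[0][0] }
    .transform_values { |tokens| tokens.map { |token| token[2] } }
end

spec_source = <<~RUBY
  describe "a dog" do
    it "barks"
      expect(dog.bark).to eq("woof")
    end
  end
RUBY

keywords_by_line(spec_source).each do |line_number, keywords|
  puts "line #{line_number}: #{keywords.join(", ")}"
end
```

Line 2 opens nothing because its `do` is missing, yet an `end` sits two lines below it at the same indentation. That mismatch is the kind of signal that indentation alone cannot provide.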
00:15:54.000
Even with correct indentation, removing the wrong line may yield a false positive: our syntax error might be on lines that are missing context, like a 'def hello'. If we expand by indentation alone or comment out lines carelessly, we could easily miss the block that matters and fail to address the true issue.
00:16:44.960
Some syntax errors are ambiguous: the context through which we examine the code doesn't give us enough to settle on a single fix. This means the search has to stay balanced, because not every error has one precise location.
00:18:36.720
For example, Ruby's parser cannot differentiate between a missing 'if' and a missing 'end' based solely on the errors it encounters. It's tricky and complicates the issue significantly! Knowing that this ambiguity exists means we can account for it after searching; our algorithm still has access to all the relevant content, which leaves room for flexibility.
00:19:45.440
What's more, a document can contain multiple syntax errors, with errors piled on top of one another. Multiple issues can coincide. We don't want to evaluate the code linearly without pairing things up properly, and we shouldn't show all of the code indiscriminately either.
00:21:00.960
So, how do we handle these multiple errors? We can modify our search to make multiple passes, working from both ends towards the middle. This approach helps us locate the first and second errors without missing the last one, and it gives us both flexibility and results.
00:22:06.560
Now that we’ve addressed some technical points, let’s move to artificial intelligence—who knows AI? Raise your hand or type in the chat!
00:22:27.160
Artificial intelligence often refers to algorithms in code. One common example relates to pathfinding. Dead End uses a search algorithm to find our problematic code. It is a variation of uniform cost search, which is sometimes called Dijkstra's algorithm.
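For readers who haven't met uniform cost search, here is a generic sketch on a tiny weighted graph. It is not Dead End's implementation. The graph, the node names, and the method `uniform_cost_search` are invented for illustration, and a sorted array stands in for a real priority queue.

```ruby
# Generic uniform cost search: always expand the cheapest node on the
# frontier first. Returns the lowest total cost from start to goal, or
# nil when the goal cannot be reached.
def uniform_cost_search(graph, start, goal)
  frontier = [[0, start]] # pairs of [cost so far, node]
  best = { start => 0 }   # cheapest known cost per node

  until frontier.empty?
    frontier.sort_by!(&:first)
    cost, node = frontier.shift
    return cost if node == goal
    next if cost > best[node] # stale entry, a cheaper path was found

    graph.fetch(node, {}).each do |neighbor, weight|
      new_cost = cost + weight
      next unless new_cost < best.fetch(neighbor, Float::INFINITY)

      best[neighbor] = new_cost
      frontier << [new_cost, neighbor]
    end
  end

  nil
end

# Hypothetical graph: each key maps to its neighbors and edge costs.
graph = {
  a: { b: 1, c: 4 },
  b: { c: 2, d: 5 },
  c: { d: 1 },
  d: {}
}

puts uniform_cost_search(graph, :a, :d) # cheapest path is a -> b -> c -> d
```

The same "cheapest candidate first" idea carries over when the candidates are blocks of code and the cost comes from something like indentation rather than distance.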
00:23:06.640
Those who want to learn more can follow links to informative pages about search algorithms. In particular, seek out visual representations of search algorithms; they make the ideas much easier to understand.
00:23:28.880
Now that we know a little bit about AI, let's look at some internals of these algorithms. If Dead End were a cake, we'd start with messy code that has syntax errors. We'll clean it up, tidy it, and present it to our search algorithm, which is an exhilarating yet complex structure.
00:24:07.280
When syntax errors are found, we'll add context where needed to send detailed feedback back to users. Every app has its own syntax errors; it’s common! So, like every other good library, Dead End utilizes monkey patching.
00:25:19.120
It hooks into the require method, and when a syntax error is raised, the error gets passed to Dead End. The source code causing the error is read from disk and passed into the search object. However, we do not simply use that raw document.
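A hedged sketch of what hooking into `require` can look like. The module `SyntaxErrorHook` and its `failed_paths` list are hypothetical, and prepending onto `Object` is just one way to wrap the method. Dead End's actual hook may be written differently.

```ruby
require "tempfile"

# Hypothetical hook, for illustration only: wraps `require` so a
# SyntaxError can be observed before it propagates.
module SyntaxErrorHook
  def self.failed_paths
    @failed_paths ||= []
  end

  def require(path)
    super
  rescue SyntaxError
    # A real tool would read the file from disk and start its search here.
    SyntaxErrorHook.failed_paths << path
    raise
  end
end

Object.prepend(SyntaxErrorHook)

# Write a file with a missing `end`, then require it.
broken_file = Tempfile.new(["broken", ".rb"])
broken_file.write("def broken\n")
broken_file.close

begin
  require broken_file.path
rescue SyntaxError
  puts "Syntax error intercepted in #{SyntaxErrorHook.failed_paths.last}"
end
```

Re-raising keeps the original behavior intact: the program still fails with a SyntaxError, the hook merely gets a chance to look at the file first.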
00:25:57.359
The initial step cleans it up via a cleaned document class that handles the various gotchas discussed previously. It removes comments and whitespace, and it joins lines that belong together: chained method calls, lines ending in a trailing backslash, and heredocs.
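One piece of that cleanup, removing comments, can be sketched with `Ripper.lex`. The helper `blank_comment_lines` is hypothetical and covers only comment-only lines. Joining chained calls, backslash continuations, and heredocs is left out.

```ruby
require "ripper"

# Hypothetical cleanup step: blank out lines that hold only a comment,
# keeping line numbers stable for later error reporting.
def blank_comment_lines(source)
  ignorable = %i[on_sp on_comment]
  types_by_line = Ripper.lex(source)
    .group_by { |token| token[0][0] }
    .transform_values { |tokens| tokens.map { |token| token[1] } }

  source.lines.each_with_index.map do |line, index|
    types = types_by_line.fetch(index + 1, [])
    comment_only = types.include?(:on_comment) && (types - ignorable).empty?
    comment_only ? "\n" : line
  end.join
end

commented_source = <<~RUBY
  # Greets the user
  def hello
    # TODO: add a name
    puts "hi" # trailing comments are left alone
  end
RUBY

puts blank_comment_lines(commented_source)
```

Using the lexer instead of a regular expression avoids mistaking a `#` inside a string or heredoc for a comment.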
00:26:32.320
With all this, it converts the cleaned source into objects of a type known as 'code line'. Once we have the lines representing the document ready, we pass them to our search class.
00:27:03.520
Next comes the code search class. It is driven by a while loop that explores a frontier, which holds all the code blocks generated from the source code. The frontier checks whether there are any problems left to address!
00:27:46.960
If we have candidate code block lines, we check them against the parser to see whether issues remain in the initial document. The next decision pivots on that result. We expand and adjust blocks based on indentation, confirming only valid Ruby code.
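To make the loop concrete, here is a heavily simplified sketch of a frontier-style search. It is an assumption-laden toy, not Dead End's algorithm: it treats all lines at the deepest remaining indentation as one block, keeps the blocks that fail to parse, and stops once the document parses without them. The names `valid_ruby?` and `find_invalid_blocks` are hypothetical.

```ruby
require "ripper"

# Ripper.sexp returns nil when the source has a syntax error.
def valid_ruby?(source)
  !Ripper.sexp(source).nil?
end

# Hypothetical, simplified search. Returns arrays of zero-based line
# indexes for the blocks that had to be set aside before the rest of
# the document would parse.
def find_invalid_blocks(source)
  lines = source.lines
  indent = ->(i) { lines[i][/\A */].length }
  unvisited = lines.each_index.reject { |i| lines[i].strip.empty? }
  frontier = [] # blocks that failed to parse on their own

  until unvisited.empty?
    # Take every unvisited line at the deepest indentation as one block.
    deepest = unvisited.map(&indent).max
    block = unvisited.select { |i| indent.(i) == deepest }
    unvisited -= block
    frontier << block unless valid_ruby?(lines.values_at(*block).join)

    # Stop as soon as the document parses without the frontier's blocks.
    hidden = frontier.flatten
    rest = lines.each_index.reject { |i| hidden.include?(i) }
    return frontier if valid_ruby?(lines.values_at(*rest).join)
  end

  frontier
end

search_source = <<~RUBY
  class Dog
    def bark
      puts "woof"

    def sit
      puts "sit"
    end
  end
RUBY

find_invalid_blocks(search_source).each do |block|
  block.each { |i| puts "#{i + 1}: #{search_source.lines[i]}" }
end
```

The toy narrows the problem to the two `def` lines and the single `end` that sits between them and the class. A real implementation would go on to refine that block and add surrounding context for the reader.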
00:28:49.680
The search can return multiple syntax errors, which are then formatted into detailed output. In this final stage, Dead End brings together everything from cleaning to searching to formatting!
00:29:09.440
Thank you! I will be here all week, folks.
00:29:16.000
Beyond Dead End, there's another amazing gem called Error Highlight, which shows you exactly which method call raised a NoMethodError.
00:29:21.520
It was created by Yusuke, known as Mame on GitHub, who gave an excellent talk at a conference I used to run called Keep Ruby Weird. I recommend checking it out!
00:29:48.799
With Error Highlight and Dead End, I also want to touch on the importance of community values when handling errors.
00:30:07.360
I've been writing some Rust code over the past couple of months, and I've observed that their community takes an aggressive stance towards error messages.
00:30:17.600
They not only state that there's a problem but also strive to accurately suggest how to fix it. When I opened an issue, the community tackled it quickly and merged a fix within a month.
00:30:35.040
Though not perfect, this community treats user experience issues as critical bugs. If we invested more energy into our error handling, we could elevate user experience.
00:30:46.640
You can add Dead End to your project and try it out today! Feel free to give feedback on what works or doesn’t work. Hopefully, we can have that finalized before Ruby 3.2 ships.
00:31:06.880
You can also pre-order my book on how to open-source at howtoopensource.dev or sign up to triage issues on Code Triage.
00:31:31.760
Today, we talked about lexing, parsing, syntax errors, AI, and pathfinding! But remember, technical details aren't the most important part.
00:31:42.000
The important part is that everyone sitting in this room is the future of developer tooling. Programming is inherently difficult, but our tools can help us.
00:32:19.200
One of the best ways to judge a system is to see how it fails. Care, grace, and beauty applied to our failure modes create experiences that delight us, teach us, and elevate our code.
00:32:37.200
A syntax error doesn't have to mean the end; it can be the beginning of a beautiful programming story. My name is Richard Schneeman.
00:32:48.880
You may have heard I'm writing a book. You may also have heard I run Code Triage.
00:32:59.039
I want you to go forth and be an exceptional programmer! Bye!
00:33:06.890
You’re still here? It’s over! Go home!
00:33:20.720
Alright, bye!