00:00:11.120
Let's get started. In my efforts to explain as much as possible,
00:00:13.920
I have included too many slides in this presentation.
00:00:16.800
It will be quite a miracle if I finish on time.
00:00:18.600
We will cover a lot of information. My goal for this talk is to give you a broad overview of what this project is,
00:00:22.140
dive into some technical details, and ensure that you are not left behind if you are less familiar with some of the concepts I'm going to discuss.
00:00:28.920
This talk is about Syntax Tree.
00:00:30.420
It's a project I started about four years ago.
00:00:32.399
In essence, it became a project funded by the Ruby Association to create a standard library formatter.
00:00:34.500
There are many formatters available now, but when I started this work, there wasn't a suitable one.
00:00:37.200
Since the inception of the project, it has become much more than just a formatter. It has evolved into a whole host of functionalities that I will discuss today.
00:00:48.059
My name is Kevin Newton, and I work on the Ruby and Rails infrastructure team at Shopify.
00:00:49.980
You may notice that there are quite a few of us here, so please come and talk to us.
00:00:51.719
Come find us if you want to learn about some of the problems we're solving at Shopify, particularly in improving Ruby and enhancing the ecosystem more generally.
00:00:55.620
You can follow me online at @kddnewton on Twitter.
00:00:57.420
So, what is Syntax Tree? To explain what Syntax Tree is, we need to discuss a couple of concepts.
00:01:01.000
First, what is a parser?
00:01:02.100
What does a parser do? How does it function?
00:01:03.960
We also need to understand what a syntax tree is. Naming, as you may know, is one of the hardest things in computer science, and I unfortunately named my project Syntax Tree.
00:01:11.960
So you might hear me use 'syntax tree' to refer to the structure and 'Syntax Tree' to refer to the project. Now, let's also discuss the visitor pattern.
00:01:18.600
I want to explain how this pattern applies to syntax trees.
00:01:19.739
Let's start by discussing what a parser is. You may have seen the phrase, 'Matz is nice, so we are nice.' A parser's job is to take plain text content and transform it into a data structure that a program can work with.
00:01:22.440
The first step is lexical analysis, which takes a plain text segment and breaks it into chunks.
00:01:27.240
For example, we would classify 'Matz' as a noun, 'is' as a verb, 'nice' as an adjective, and 'so' as a conjunction.
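As a rough illustration of that lexical analysis step, here is a minimal sketch that splits the example sentence into words and tags each one. The `WORD_CLASSES` table, the `Token` struct, and the `tokenize` method are hypothetical names made up for this example; they are not part of Syntax Tree.

```ruby
# Hypothetical lookup table mapping each word in the example to its class.
WORD_CLASSES = {
  "Matz" => :noun,
  "we" => :pronoun,
  "is" => :verb,
  "are" => :verb,
  "nice" => :adjective,
  "so" => :conjunction
}.freeze

# A token wraps a chunk of the source text together with its classification.
Token = Struct.new(:type, :value)

# Break the sentence into words and tag each one with its word class.
def tokenize(sentence)
  sentence.scan(/[A-Za-z]+/).map do |word|
    Token.new(WORD_CLASSES.fetch(word, :unknown), word)
  end
end

tokens = tokenize("Matz is nice, so we are nice.")
tokens.each { |token| puts "#{token.type}: #{token.value}" }
```

A real lexer works on characters rather than a lookup table, but the output has the same shape: a flat list of classified chunks.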
00:01:29.700
The second step is syntactic analysis, where we take segments of the sentence and classify their relationships. We give each concept a name, which is how we define the grammar.
00:01:34.680
We create a definition of how the words can fit together to form larger concepts, such as a verb phrase. When we combine these segments, we end up with a subject phrase, which is a noun followed by a verb phrase.
00:01:42.720
By adding a conjunction, we form a tree structure in our minds. Building up this complete structure from plain text is what a parser does.
00:01:51.720
What does this look like in terms of a syntax tree? Let's assume we're in Ruby and create objects that represent the nodes.
00:01:55.920
This is a straightforward process: we can create a separate class for every single node. We can store data and use the nodes to represent the tree structure. We can also rearrange them slightly. The nodes on the left are tokens, wrapping a value from the source code, while the nodes on the right are other branches in the middle of the tree.
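A minimal sketch of what those node objects could look like, assuming one class per node type. All of the class names and field names here are hypothetical, chosen to mirror the example sentence rather than Syntax Tree's real nodes.

```ruby
# Token nodes: each wraps a single value taken from the source text.
Noun = Struct.new(:value)
Verb = Struct.new(:value)
Adjective = Struct.new(:value)
Conjunction = Struct.new(:value)

# Branch nodes: each holds other nodes in named fields.
VerbPhrase = Struct.new(:verb, :adjective)
SubjectPhrase = Struct.new(:noun, :verb_phrase)
Sentence = Struct.new(:left, :conjunction, :right)

# Helper that builds one "subject + verb phrase" branch of the tree.
def subject_phrase(noun, verb, adjective)
  SubjectPhrase.new(
    Noun.new(noun),
    VerbPhrase.new(Verb.new(verb), Adjective.new(adjective))
  )
end

tree = Sentence.new(
  subject_phrase("Matz", "is", "nice"),
  Conjunction.new("so"),
  subject_phrase("we", "are", "nice")
)

puts tree.left.verb_phrase.adjective.value
```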
00:02:06.300
Next, let’s expand one of these nodes.
00:02:07.200
We can walk this tree and perform interesting operations. We can add pattern matching features and comparison methods, as well as a copy method for immutable copies.
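To make those features concrete, here is a sketch of a single hypothetical node with pattern matching hooks, a comparison method, and a copy method. The method names follow Ruby's own conventions (`deconstruct`, `deconstruct_keys`, `==`); the `Adjective` class and the exact shape of `copy` are assumptions for illustration. It needs Ruby 2.7 or newer for `case`/`in`.

```ruby
class Adjective
  attr_reader :value

  def initialize(value:)
    @value = value
    freeze # nodes are immutable once built
  end

  # Hook used by array-style patterns such as `in [value]`.
  def deconstruct
    [value]
  end

  # Hook used by hash-style patterns such as `in { value: "nice" }`.
  def deconstruct_keys(_keys)
    { value: value }
  end

  # Structural comparison: same node type and same value.
  def ==(other)
    other.is_a?(Adjective) && value == other.value
  end

  # Return a new node, replacing only the fields that were passed in.
  def copy(value: self.value)
    Adjective.new(value: value)
  end
end

node = Adjective.new(value: "nice")
kinder = node.copy(value: "kind")

case kinder
in Adjective[value: "kind"]
  puts "matched the copied node"
end
```

Because the original node is frozen, `copy` is the only way to get a changed version, which keeps existing references to the tree safe.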
00:02:10.200
The 'accept' method is the key focus, along with the 'child_nodes' method. The accept method allows us to interact with the node, which is what makes the visitor pattern work.
00:02:16.200
The double dispatch visitor pattern allows for dynamic dispatch: when we call the 'accept' method on a node, it calls back into the visitor, which allows us to have different visitor methods for different node types.
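Here is a minimal sketch of that double dispatch, using two hypothetical node classes and a hypothetical `Describer` visitor. The first dispatch is on the node (`accept`), the second is on the visitor (`visit_noun` or `visit_verb`).

```ruby
class Noun
  attr_reader :value

  def initialize(value)
    @value = value
  end

  # The node knows its own type, so it picks the visitor method to call.
  def accept(visitor)
    visitor.visit_noun(self)
  end
end

class Verb
  attr_reader :value

  def initialize(value)
    @value = value
  end

  def accept(visitor)
    visitor.visit_verb(self)
  end
end

class Describer
  # Entry point: hand control to the node, which calls back into us.
  def visit(node)
    node.accept(self)
  end

  def visit_noun(node)
    "noun(#{node.value})"
  end

  def visit_verb(node)
    "verb(#{node.value})"
  end
end

describer = Describer.new
puts describer.visit(Noun.new("Matz"))
puts describer.visit(Verb.new("is"))
```

The visitor never has to check the node's class itself; adding a new operation means writing a new visitor, not touching the nodes.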
00:02:22.740
As I mentioned earlier, the 'child_nodes' method allows us to define the relationships between nodes, ultimately letting us iterate through them.
00:02:29.460
In summary, we can use our visitor pattern to implement specific visitors that only focus on a subset of nodes in a tree.
00:02:34.740
Syntax Tree also allows us to format trees based on various algorithms. One such algorithm is pretty print, which organizes code by applying a consistent structure.
00:02:39.900
Formatting involves creating groups of nodes to determine where line breaks should occur. For example, if we reach the end of a line, we break the outermost group first.
00:02:44.100
By building a syntax tree for Ruby, we can implement all sorts of analysis and tooling, such as linting, formatting, and semantic analysis.
00:02:51.300
Syntax Tree is not just a formatting tool; instead, it forms an object layer that represents the result of parsing Ruby code.
00:02:58.620
It offers tools to interact with and manipulate this object layer to perform diverse tasks, like refactoring and code improvement.
00:03:07.620
In total, there are five main functionalities of Syntax Tree: building a syntax tree, formatting it, creating a command-line interface (CLI) to facilitate interaction, developing a language server, and translating syntax trees into other formats.
00:03:16.740
Now, let's first take a look at building the syntax tree.
00:03:22.740
Building a syntax tree involves defining all the nodes. In the 'node.rb' file within the Syntax Tree repository, you will see definitions for all the nodes in Ruby.
00:03:30.840
We provide named fields for every single value and sub-node in the tree, source location data, comments attached to the tree, accept and child nodes methods, and immutable behavior by default.
00:03:43.140
Ripper, the standard library parser, gives us the events we use to build the nodes for our syntax tree.
00:03:46.920
Ripper itself has over 190 events, helping us parse Ruby code effectively.
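To get a feel for what Ripper provides, the sketch below uses only the standard library. `Ripper.lex` returns the scanner (token) events, `Ripper.sexp` returns a nested array built from the parser events, and the two event lists are exposed as constants. The exact event count depends on the Ruby version, so the last line prints whatever your Ruby reports.

```ruby
require "ripper"

source = "1 + 2"

# Scanner events: one entry per token, reduced here to [event, text].
tokens = Ripper.lex(source).map { |_position, event, token| [event, token] }
p tokens

# Parser events: a nested array that starts with :program.
p Ripper.sexp(source)

# Total number of events a Ripper subclass can respond to.
puts Ripper::PARSER_EVENTS.length + Ripper::SCANNER_EVENTS.length
```

Syntax Tree's job is to turn that stream of events into the named node objects described above, so callers do not have to work with raw arrays.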
00:03:55.140
Furthermore, we utilize techniques to keep track of nodes' positional context to ensure proper mapping and referencing.
00:03:58.740
We aim to present an interface that abstracts away the complexities of using Ripper, allowing users to focus on Syntax Tree's functionality.
00:04:04.500
Ripper does not currently provide support for everything; some features, like handling comments, are something we must manage ourselves.
00:04:12.960
Walking the tree allows us to implement visitor functionality, where adding visit methods for each node type enhances interaction.
00:04:20.700
We can leverage these methods to perform certain operations, such as counting sentences or omitting unnecessary nodes.
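As a sketch of that idea, the visitor below walks a small tree through `child_nodes` and only overrides the one visit method it cares about. The node classes, the base `Visitor`, and the `AdjectiveCounter` are hypothetical and only mirror the general shape of the pattern; counting adjectives is just a stand-in operation.

```ruby
# Leaf node base class: wraps a value and has no children.
class Word
  attr_reader :value

  def initialize(value)
    @value = value
  end

  def child_nodes
    []
  end
end

class Noun < Word
  def accept(visitor)
    visitor.visit_noun(self)
  end
end

class Adjective < Word
  def accept(visitor)
    visitor.visit_adjective(self)
  end
end

# Branch node: its children are whatever nodes it was built with.
class Phrase
  attr_reader :child_nodes

  def initialize(*child_nodes)
    @child_nodes = child_nodes
  end

  def accept(visitor)
    visitor.visit_phrase(self)
  end
end

# Base visitor: by default every visit method just walks the children.
class Visitor
  def visit(node)
    node.accept(self)
  end

  def visit_child_nodes(node)
    node.child_nodes.each { |child| visit(child) }
  end

  alias visit_noun visit_child_nodes
  alias visit_adjective visit_child_nodes
  alias visit_phrase visit_child_nodes
end

# Only cares about adjectives; everything else uses the default walk.
class AdjectiveCounter < Visitor
  attr_reader :count

  def initialize
    @count = 0
  end

  def visit_adjective(_node)
    @count += 1
  end
end

tree = Phrase.new(
  Phrase.new(Noun.new("Matz"), Adjective.new("nice")),
  Phrase.new(Noun.new("we"), Adjective.new("nice"))
)

counter = AdjectiveCounter.new
counter.visit(tree)
puts counter.count
```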
00:04:29.280
In addition, we can serialize the entire abstract syntax tree (AST) to JSON, allowing for easy data access and manipulation.
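A minimal sketch of serializing a tree to JSON, assuming hypothetical `Word` and `Phrase` nodes that each know how to turn themselves into a plain hash. The `as_json` method name is made up for this example; only `JSON.generate` and `JSON.parse` come from the standard library.

```ruby
require "json"

# Leaf node: a type tag plus the value taken from the source.
Word = Struct.new(:type, :value) do
  def as_json
    { type: type, value: value }
  end
end

# Branch node: a type tag plus a list of child nodes.
Phrase = Struct.new(:type, :children) do
  def as_json
    { type: type, children: children.map(&:as_json) }
  end
end

tree = Phrase.new("subject_phrase", [
  Word.new("noun", "Matz"),
  Phrase.new("verb_phrase", [
    Word.new("verb", "is"),
    Word.new("adjective", "nice")
  ])
])

json = JSON.generate(tree.as_json)
puts json

# Any tool that reads JSON can now inspect the tree.
parsed = JSON.parse(json)
puts parsed["children"][1]["children"][1]["value"]
```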
00:04:39.840
Formatting relies on a pretty-printing algorithm, which builds on Philip Wadler's paper 'A Prettier Printer' from the late '90s.
00:04:51.180
The basics of pretty printing include managing text nodes and breakables to control how Ruby code is presented in an unobtrusive manner.
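The sketch below shows text nodes, breakables, and a group using Ruby's standard library `prettyprint`, which is an assumption made for the sake of a self-contained example and not necessarily the printer Syntax Tree itself ships with. The `format_list` helper is hypothetical. A breakable prints as a space when its group fits on the line, and as a newline plus indentation when it does not.

```ruby
require "prettyprint"

# Format a list of strings as "[a, b, c]", breaking lines if it is too wide.
def format_list(items, width)
  PrettyPrint.format(String.new, width) do |q|
    # One group: either every breakable inside it breaks, or none do.
    q.group(2, "[", "]") do
      items.each_with_index do |item, index|
        if index > 0
          q.text(",")
          q.breakable # a space if the group fits, otherwise a newline
        end
        q.text(item)
      end
    end
  end
end

puts format_list(%w[alpha beta gamma], 80)
puts format_list(%w[alpha beta gamma], 10)
```

With a width of 80 the list stays on one line; with a width of 10 the group is broken and each element after the first lands on its own line, indented by two spaces.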
00:05:00.300
Then, we developed a command-line interface (CLI) that allows users to work with the AST more easily.
00:05:06.960
The CLI provides useful functionalities, like generating a tree representation of the source code, formatting it, and searching for specific patterns.
00:05:14.640
A significant feature is the 'match' command, which prints a Ruby pattern-matching expression that corresponds to the nodes of the code you give it, greatly aiding in writing patterns.
00:05:23.880
Using the CLI, we can quickly check how expressions will be interpreted based on the coding style.
00:05:28.500
By implementing the Language Server Protocol, we can integrate with any editor that speaks it, without needing to delve into editor-specific implementation details.
00:05:38.760
I also wanted to highlight the Ruby LSP project at Shopify, which uses Syntax Tree to aid various functionalities like document highlighting, folding ranges, and semantic highlighting.
00:05:46.440
This provides our users with enriched experiences and powerful tools while working with Ruby code.
00:05:55.620
Finally, we can translate Syntax Tree's nodes into the tree formats used by other Ruby parsers, maintaining compatibility with tools built on those parsers.
00:06:02.880
The core takeaway is that Syntax Tree nodes can work more or less interchangeably with the trees of different parsers, so tooling built on those parsers keeps working.
00:06:09.300
By leveraging all these functionalities, we're developing a truly versatile and powerful toolkit for Ruby developers.
00:06:15.840
The goal is for you to be able to interact with Syntax Tree effortlessly, regardless of changes in the underlying parser.
00:06:20.940
Thank you very much for your time. I appreciate your attention.
00:06:25.640
Does anyone have any questions that I can answer?
00:06:30.920
One question here is about providing a syntax rewriter.
00:06:34.920
The answer is yes; you can already do that today. My goal is to build more user-friendly and advanced features into the project.
00:06:40.920
This could involve running a script using Rails CLI to automatically update deprecated code.
00:06:44.460
A follow-up question relates to the Prettier Ruby project, which started as a Prettier plug-in for Ruby formatting.
00:06:51.900
Prettier Ruby has since been rewritten to use Syntax Tree, making it a lightweight wrapper around the tools we now have available.
00:06:59.100
This project has greatly reduced its complexity, now comprising only a few hundred lines of code.
00:07:02.640
In terms of the dependencies for Syntax Tree, Ripper is used primarily, but we are seeking to create a new Ruby parser for better efficiency.
00:07:10.740
The overall aim is to ensure users can rely on Syntax Tree for consistent performance and features.
00:07:16.520
Thank you again for your time, and feel free to reach out if you have further questions!
00:07:20.700
I hope you enjoyed the presentation, and I look forward to engaging with you further.