RubyKaigi Takeout 2020

Prettier Ruby

Prettier was created in 2017 and has since seen a meteoric rise within the JavaScript community. It differentiated itself from other code formatters and linters by supporting minimal configuration, eliminating the need for long discussions and arguments by enforcing an opinionated style on its users. That enforcement ended up resonating well, as it allowed developers to get back to work on the more important aspects of their job.

Since then, it has expanded to support other languages and markup, including Ruby. The Ruby plugin is now in use in dozens of applications around the world, and better formatting is being worked on daily. This talk will give you a high-level overview of Prettier and how to wield it in your project. It will also dive into the nitty-gritty, showing how the plugin was made and how you can help contribute to its growth. You’ll come away with a better understanding of Ruby syntax and knowledge of a new tool that can help your team.

00:00:01.280 Hello, my name is Kevin Deisz, and today I'm here to talk to you about Prettier for Ruby. I work at Shopify, based out of Boston, Massachusetts.
00:00:09.360 The question I'm here to answer today is: What is Prettier? Prettier is a JavaScript package that provides a language-agnostic framework for building formatters.
00:00:14.880 So, what does that mean? It specifically provides a set of language-specific parsers, one for each language that it supports. It takes information from each language's parser and builds a Prettier-specific intermediate representation, which means it transitions from language-specific to language-agnostic.
00:00:29.359 Prettier also provides a printer to convert that intermediate representation into formatted code. Additionally, it includes a suite of editor tools and plugins that enable various improved workflows, such as formatting on save, formatting on commit, and standardizing the format of your codebase across all developers to minimize lengthy discussions about specific linting rules.
00:00:50.079 Out of the box, Prettier started as a JavaScript printer, supporting various JavaScript implementations and variants. Later on, it expanded to include support for HTML, CSS, and Markdown. These formats come with Prettier by default, but there is also a wide ecosystem of plugin support for languages such as Java, PHP, PostgreSQL, Ruby, SVG, XML, and Swift. This is just a subset, as some of these plugins are community-driven and hosted on individuals' GitHub accounts, while the official ones are part of the Prettier organization on GitHub.
00:01:18.320 Today, I'm here to discuss the Ruby plugin that I've been working on for about two and a half years. This plugin is both a Ruby gem published on rubygems.org and a Node module available via npm. It's averaging around 20,000 downloads per week and is constantly under development whenever I find time on weekends.
00:01:31.119 Let's talk about how this plugin works. Essentially, it formats your source code by converting Ruby code into a concrete syntax tree. The difference between abstract and concrete syntax is that the concrete tree retains specific information about where each node was placed in the source and about the exact source text used.
00:01:45.760 Once the tree is created, we attach all of the comments to it. These comments are the source code comments, which we correctly position in the tree. We then walk through that tree and convert it to a different kind of tree: the Prettier intermediate representation. After that, we hand it back to Prettier and instruct it to print this representation. Prettier's printer has its own rules for generating the output.
00:02:06.400 To demonstrate this process, I will show you a poorly formatted Ruby code snippet. When this code is processed through the Prettier package, it produces a much more understandable and readable result. The first step, as I mentioned, is converting the Ruby source into a concrete syntax tree. This involves spawning a Ruby process that acts as our parser. While Prettier is a JavaScript package, we're running Ruby separately.
00:02:29.280 We utilize Ripper, a standard library package that gives us the syntax tree. We append some additional information that Ripper doesn't track itself, such as location information and extra commas. It is also crucial to maintain the list of comments in their original source locations.
00:02:47.280 Here is the code that spawns the process. You'll note that it disables gems for speed. We pass the text we're trying to parse over standard input to our child Ruby process. Ultimately, at the end of this function, we parse the standard output from the child process and return it as JSON.
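The plugin spawns its parser from JavaScript. As a rough Ruby-only sketch of the same idea, the snippet below starts a child Ruby process with gems disabled, sends the source over standard input, and parses the JSON that comes back on standard output. The names `CHILD_SCRIPT` and `parse_in_child` are hypothetical, and the child here simply dumps `Ripper.sexp` rather than the plugin's richer tree.

```ruby
require "json"
require "open3"
require "rbconfig"

# Script run by the child process (an assumption for this sketch):
# read Ruby source from stdin, print Ripper's S-expression as JSON.
CHILD_SCRIPT = <<~RUBY
  require "json"
  require "ripper"
  puts JSON.generate(Ripper.sexp($stdin.read))
RUBY

# Spawn a child Ruby with gems disabled, feed it the source on stdin,
# and parse whatever it writes to stdout as JSON.
def parse_in_child(source)
  stdout, status = Open3.capture2(
    RbConfig.ruby, "--disable-gems", "-e", CHILD_SCRIPT,
    stdin_data: source
  )
  raise "parser process failed" unless status.success?

  JSON.parse(stdout)
end

tree = parse_in_child("1 + 2")
```

Symbols in the S-expression become plain strings once they pass through JSON, so the parent process sees `"program"` and `"binary"` rather than `:program` and `:binary`.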
00:03:01.760 The contract we provide the Ruby process stipulates that given Ruby code as input, we expect a syntax tree in JSON format as output. It's essential to explain how Ripper works. Ripper is part of the standard library and can be required at any time. Once you define a class that inherits from Ripper, you can create methods corresponding to each node in the syntax tree.
00:03:26.480 Whenever the parser encounters a specific type of node, it calls the corresponding method. Different nodes will have varying numbers of arguments based on their types. For instance, a binary node contains left and right nodes representing the operation being performed. For demonstration, we can create our parser and call parse on it to see the output on standard output.
00:03:48.640 As an example, let’s say we parse the input '1 + 2'. The output illustrates how the necessary information is constructed; however, it doesn't include any source location details. Therefore, we need to provide richer information in our syntax tree before returning it to Prettier.
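A minimal sketch of the Ripper subclass described here, assuming a hash-based node shape chosen purely for illustration (the class name `DemoParser` and the hash keys are not taken from the plugin). Each `on_*` method is an event handler that Ripper calls as it encounters the matching token or node.

```ruby
require "ripper"

class DemoParser < Ripper
  # Scanner event: called for every integer literal token.
  def on_int(value)
    { type: :int, value: value }
  end

  # Parser event: a binary node receives its left operand,
  # the operator (as a Symbol), and its right operand.
  def on_binary(left, operator, right)
    { type: :binary, left: left, operator: operator, right: right }
  end

  # Statement lists are built up one statement at a time.
  def on_stmts_new
    []
  end

  def on_stmts_add(statements, statement)
    statements << statement
  end

  # Root of the tree; its return value is what #parse returns.
  def on_program(statements)
    { type: :program, body: statements }
  end
end

tree = DemoParser.new("1 + 2").parse
p tree
```

Note that nothing in this tree records where a node came from in the source, which is the gap the next step fills.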
00:04:00.000 To achieve this, we loop through every scanner event (literals, spaces, and tokens) as well as every parser event, and define an explicit base method for each one. When we want to track additional information or modify the tree, we override these base methods and attach the extra data to the nodes they return.
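One way to picture the extra location data: Ripper exposes `lineno` and `column` while an event handler runs, so a handler can copy them onto the node it returns. This is a sketch under assumed node shapes (`LocatedParser` and the hash keys are hypothetical), not the plugin's actual code.

```ruby
require "ripper"

class LocatedParser < Ripper
  # Scanner event: record where the token starts in the source.
  def on_int(value)
    { type: :int, value: value, line: lineno, column: column }
  end

  # Parser event: a binary node starts where its left operand starts.
  def on_binary(left, operator, right)
    {
      type: :binary, left: left, operator: operator, right: right,
      line: left[:line], column: left[:column]
    }
  end

  def on_stmts_new
    []
  end

  def on_stmts_add(statements, statement)
    statements << statement
  end

  def on_program(statements)
    statements
  end
end

nodes = LocatedParser.new("1 + 22").parse
p nodes.first[:right] # the literal 22, with its line and column
```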
00:04:24.799 For instance, many nodes in the Ruby syntax tree behave like a linked list; they contain a base node, and everything points to the next nodes, forming a wide tree. This allows for efficient management and processing of nodes. Additionally, some constructs in Ruby, like multi-assign nodes, may have a comma at the end of the left side. This comma can change the nature of the assignment, especially with splats and different placements of left and right numbers in multi-assign expressions.
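A small plain-Ruby illustration of why that trailing comma has to be preserved: with the comma, the left side becomes a multiple assignment that destructures the value; without it, the whole value is assigned. The variable names are arbitrary.

```ruby
values = [1, 2]

# Ordinary assignment: whole_array is the entire array.
whole_array = values

# Trailing comma on the left side: this is a multiple assignment,
# so only the first element is assigned.
first_element, = values

p whole_array
p first_element
```

A formatter that dropped the comma would silently change what the second assignment does.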
00:04:53.600 Furthermore, Ripper doesn't report comments back into the main tree. Therefore, it’s necessary to track comments ourselves and maintain their source position so we can attach them to the appropriate nodes in the end. Once the parsing is complete, we convert the comments to a JSON format and return them as part of our response.
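A minimal sketch of tracking comments separately, assuming a simple list of hashes as the storage format (`CommentCollector` and the JSON shape are hypothetical). Ripper reports each comment through the `on_comment` scanner event, and the position is read from `lineno` and `column` at that moment.

```ruby
require "json"
require "ripper"

class CommentCollector < Ripper
  attr_reader :comments

  def initialize(*)
    super
    @comments = []
  end

  # Scanner event: the comment token includes its trailing newline,
  # so strip it before storing the text with its source position.
  def on_comment(value)
    @comments << { value: value.chomp, line: lineno, column: column }
  end
end

source = "foo # first\n# second\nbar\n"
collector = CommentCollector.new(source)
collector.parse

# Serialize the collected comments so they can travel with the tree.
payload = JSON.generate({ comments: collector.comments })
puts payload
```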
00:05:14.720 Next, we address attaching comments to the concrete syntax tree. For each comment, we walk through the tree to find the closest node to the comment's source location. We then guide Prettier on which node to attach this comment to, based on the surrounding nodes.
00:05:35.440 For example, consider a comment associated with the binary operation '1 + 2'. In this case, the binary node is the enclosing node, the preceding node is '1', and the comment follows this node. We provide essential information to Prettier so that it knows which nodes to attach the comments to during the parsing process.
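The search for the enclosing, preceding, and following nodes can be sketched as follows. This is an illustration only: the `Node` struct, the character-offset representation, and `comment_context` are all assumptions made for the example, and Prettier performs its own version of this search.

```ruby
# Hypothetical node shape: a type, start/finish character offsets
# (finish is exclusive), and a list of child nodes.
Node = Struct.new(:type, :start, :finish, :children)

# Walk down to the innermost node containing the comment, then report
# the nearest children before and after the comment's position.
def comment_context(node, position)
  preceding = nil
  following = nil

  node.children.each do |child|
    if child.start <= position && position < child.finish
      return comment_context(child, position) # comment is inside child
    elsif child.finish <= position
      preceding = child                        # last child before comment
    elsif following.nil?
      following = child                        # first child after comment
    end
  end

  { enclosing: node, preceding: preceding, following: following }
end

source = "1 + # comment\n  2"
right_start = source.index("2")
left = Node.new(:int, 0, 1, [])
right = Node.new(:int, right_start, right_start + 1, [])
binary = Node.new(:binary, 0, source.length, [left, right])

context = comment_context(binary, source.index("#"))
p context[:enclosing].type
```

Here the binary node encloses the comment, the literal `1` precedes it, and the literal `2` follows it.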
00:06:01.600 Once we have the syntax tree and the comments properly attached, the next step is converting the current tree into a Prettier-specific syntax tree. This process involves creating Prettier nodes that reflect the structure of our Ruby code, allowing Prettier to understand its context.
00:06:22.720 The most important node in this tree is the group node. A group tells Prettier to treat its contents as a single unit that it should try to print on one line. If the contents would run past the end of the line, Prettier breaks the group, turning the line breaks inside it into actual newlines, which allows for flexibility in the resulting formatted code.
00:06:47.920 Line suffixes are also important; regardless of any breaks made within the group, the comment is always positioned at the end of the line. This ensures proper placement of comments throughout the formatted code.
00:07:03.840 Let’s illustrate this process with a specific example from the Prettier plugin. This function handles operator assignments, such as '+=' and similar constructs. These nodes contain three children: the variable being assigned, the operator being performed, and the value being assigned.
00:07:21.040 The handling function requires specific document builders from Prettier, specifically the concat, group, indent, and line builders. We first extract the value from the node, then place the assigned variable, the operator, and special line behavior in the document.
00:07:40.720 If the code fits on a single line, it gets processed as a space; if it requires breaking into a new line, Prettier automatically indents everything accordingly. We group everything together, denoting it as an atom that can be dynamically split based on line length.
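The real builders and printer live in Prettier's JavaScript code. To make the behavior concrete, here is a deliberately tiny Ruby imitation: `concat`, `group`, `indent`, `LINE`, `print_doc`, and `print_opassign` are hypothetical stand-ins, and the fitting check is simplified to measure only the group itself.

```ruby
# Hypothetical miniature stand-ins for Prettier's document builders.
def concat(*parts)
  { type: :concat, parts: parts }
end

def group(contents)
  { type: :group, contents: contents }
end

def indent(contents)
  { type: :indent, contents: contents }
end

LINE = { type: :line }.freeze

# Width of a document when every line is printed as a single space.
def flat_width(doc)
  return doc.length if doc.is_a?(String)

  case doc[:type]
  when :line then 1
  when :concat then doc[:parts].sum { |part| flat_width(part) }
  else flat_width(doc[:contents])
  end
end

# Simplified printer: a group is printed flat only if it fits in the
# width that remains on the current output line.
def print_doc(doc, width, out = +"", indentation = 0, flat = false)
  if doc.is_a?(String)
    out << doc
    return out
  end

  case doc[:type]
  when :concat
    doc[:parts].each { |part| print_doc(part, width, out, indentation, flat) }
  when :indent
    print_doc(doc[:contents], width, out, indentation + 2, flat)
  when :group
    column = out.length - (out.rindex("\n") || -1) - 1
    fits = flat || column + flat_width(doc) <= width
    print_doc(doc[:contents], width, out, indentation, fits)
  when :line
    out << (flat ? " " : "\n" + " " * indentation)
  end
  out
end

# Operator assignment: variable, operator, then the value on the same
# line if it fits, or indented on the next line if it does not.
def print_opassign(variable, operator, value)
  group(concat(variable, " ", operator, indent(concat(LINE, value))))
end

doc = print_opassign("total", "+=", "compute_value")
wide = print_doc(doc, 80)
narrow = print_doc(doc, 10)
puts wide
puts narrow
```

With a width of 80 the line becomes a space and everything stays on one line; with a width of 10 the group breaks and the value moves to an indented second line.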
00:08:01.680 After processing, the formatted result in command line shows the Prettier nodes generated, effectively displaying the original Ruby code. The essential takeaway here is that the left side of the format is language-agnostic; Prettier has no inherent understanding of Ruby specifics.
00:08:20.000 The final step involves handing the generated tree to Prettier, which takes care of the printing process. This aspect showcases the advantage of working within the Prettier ecosystem, leveraging all the previous work done on formatting various languages.
00:08:31.040 When processing a Ruby snippet, such as one that doubles the value of a list of items, if the maximum line length exceeds the prescribed limit, Prettier breaks the outermost node accordingly. If it still exceeds, it can further break down other groups until the output adheres to the defined lines.
00:08:54.120 I aim to ensure that the Ruby formatting adheres to established guidelines, particularly with respect to RuboCop rules. Maintaining idempotency is also crucial: running the formatter over its own output should produce that same output again.
00:09:10.000 I strive to avoid fundamentally changing your program’s meaning, although Ruby’s flexibility does introduce some challenges. Consequently, there's a range of design decisions in terms of syntax and formatting conventions to consistently align with community standards.
00:09:28.720 For example, keywords such as break, next, yield, and return do not utilize parentheses. Meanwhile, super may adopt parentheses if overriding parent arguments. Variations in nested ternaries, number formatting, and the choice between braces and do-end syntax also feature in the formatting designs.
00:09:48.320 Our formatting approach is mindful of inline rescues, converting them to multi-line rescues for clarity. Moreover, we ensure that simple strings receive consistent quotation marks, while leaving escape sequences untouched.
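To show why the rescue rewrite is safe, the two forms below behave the same way: a modifier `rescue` catches `StandardError`, which is what the explicit multi-line form spells out. The exact layout the plugin emits may differ; this is only a sketch of the equivalence.

```ruby
# Inline (modifier) rescue: falls back to 0 when Integer() raises.
inline = Integer("not a number") rescue 0

# Equivalent multi-line rescue with the exception class made explicit.
multiline =
  begin
    Integer("not a number")
  rescue StandardError
    0
  end

p inline
p multiline
```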
00:10:07.760 In the larger context, the overarching philosophy guiding Prettier is to minimize configuration options. The premise here is that, with more options available, unnecessary discussions tend to arise, often leading to inefficiencies and debates in code formatting.
00:10:29.600 Historically, as Prettier expanded, it incorporated options to enhance adoption rates but has since shifted away from that approach, focusing on taking in community feedback for any additional necessary features.
00:10:49.600 With the Ruby plugin, I have implemented six specific options to address various community dialects that have emerged. If more requests arise, I will certainly consider them, but for now, these options cover most of the established patterns.
00:11:10.560 To me, the most crucial part of this presentation is the embed system in Prettier. It lets one formatter hand off to another, for example formatting the JavaScript inside HTML script tags with the JavaScript formatter.
00:11:31.200 With the ability to handle embedded Ruby code and format it correctly, the Ruby plugin benefits significantly from this larger ecosystem, showcasing consistency throughout. Similarly, when Ruby is embedded within other languages, the formatting capabilities persist.
00:11:53.920 For instance, in a scenario where you have a heredoc containing Markdown, the Ruby plugin can access the Markdown formatter to produce well-formatted output. Conversely, if Ruby is embedded within a Markdown file, Prettier can format the Ruby code to the same high standards.
00:12:17.440 Moreover, as I look ahead, I plan to improve execution speed. One approach will be using a native Node plugin that bypasses the Ruby spawn process, allowing for direct object handling within the Node environment.
00:12:37.760 Additionally, I’d like to innovate further by utilizing Prettier's JavaScript node processing via a C library callable from Ruby. This ongoing experiment seeks to bolster performance and enhance functionality as developments unfold.
00:12:57.680 In terms of new Ruby syntax, particularly the pattern matching introduced in Ruby 2.7, my goal is to ensure support for it and related functionality as it evolves.
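For readers who have not yet used it, this is the kind of Ruby 2.7 `case`/`in` syntax a formatter needs to handle. The `describe` method and its data are invented for the example.

```ruby
# Hypothetical example of pattern matching (Ruby 2.7 and later).
def describe(release)
  case release
  in { name: String => name, version: [major, *] }
    # Hash pattern: binds the name and the first version component.
    "#{name} v#{major}"
  in []
    # Array pattern: matches only an empty array.
    "empty"
  else
    "unknown"
  end
end

puts describe({ name: "prettier", version: [1, 0, 0] })
puts describe([])
puts describe(42)
```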
00:13:09.920 In closing, if you found this presentation useful, I encourage you to get involved in improving code formatting. Joining us on GitHub will allow you to contribute to new issues and pull requests as we progress.
00:13:22.640 I invite you to try out Prettier, as it offers integration with all major editors, ensuring compatibility with your current workflow. Thank you very much for your time and attention.