All comments must be haiku! Custom linting with RuboCop

by Scott Moore and Kari Silva

In this workshop presented at RubyConf 2021 by Scott Moore and Kari Silva, attendees learned how to customize RuboCop, a popular linter for Ruby, to enforce unique coding standards in their projects, particularly focusing on the creative challenge of requiring comments to be written as haikus. The session covered a variety of topics essential for understanding and implementing custom linting beyond RuboCop's default settings.

Key Points Discussed:
- Introduction to RuboCop: Attendees were introduced to RuboCop, its purpose in maintaining code quality, and its reliance on an underlying library called Parser to analyze Ruby code through an abstraction known as the Abstract Syntax Tree (AST).
- Custom Linters: The workshop emphasized the value of creating custom linters tailored to specific coding needs, highlighting how participants could implement rules to enforce best practices unique to their codebases.
- Abstract Syntax Trees: A brief explanation was provided regarding how RuboCop utilizes Parser to transform source code into tokens and subsequently into an AST, allowing for detailed programmatic analysis.
- Building a Custom Linter: Participants were guided through developing a simple linter that flagged comments that did not adhere to the haiku format, encompassing essential coding practices such as utilizing callbacks to trigger checks during the linting process.
- Node Pattern Matching: The latter part of the workshop focused on advanced features, including the 'node pattern matcher,' which simplifies the process of finding and enforcing rules in the AST. Participants explored the implementation of this pattern to selectively apply haiku validations to specific method definitions.
- Case Studies and Examples: Several hands-on coding examples reinforced the concepts being discussed, including inspecting processed sources, dumping comments, and adding offenses that enforced specific rules such as the number of lines and syllable counts required in haikus.

Conclusion and Takeaways:
- The workshop effectively illustrated how to extend RuboCop's functionality, making it a powerful tool tailored to developers' specific needs. Participants left with a solid understanding of both the theoretical and practical aspects of building custom linters, equipping them with the skills to improve their coding standards and practices. The use of haikus served as a fun and engaging way to practice this vital skill.

00:00:10.559 Welcome to the workshop, everyone! Today, we are going to talk about building custom linting for RuboCop and extending what RuboCop is capable of right out of the box.

00:00:24.960 If you don’t have the repository, go ahead and clone it now. We will talk a bit about what we’ll cover, and then we’ll jump into code fairly shortly.

00:00:37.520 One thing that the recording team wanted me to ask is if you have questions, please ask them at one of the microphones in the stands. We’re holding one of them, but we’ll put it back, so just come up to the mic and ask your questions.

00:00:56.079 Let me introduce ourselves. My name is Scott Moore, and I’m a software developer. I’ve been doing this for about nine or ten years. I’ve done a little bit of everything and have been a beginner many times. I’ve been working with Ruby for the last couple of years.

00:01:25.280 With me is Kari.

00:01:43.600 Hi! I’m Kari Silva. I am a back-end developer, and I work with Scott. Previously, I was a high school teacher in physics and biology. I made a career change and jumped into tech about three or four years ago.

00:01:53.479 I’ll be walking around for support if you have any questions. I’m available for you.

00:02:03.119 We both work at a company called SonderMind, located in Denver. SonderMind builds software to support mental health care and therapy, both virtually and in person. We're hiring, so feel free to grab us after the talk if you want to chat about working here! Like everyone else, we are looking for new hires.

00:02:18.879 Now, I want to quickly talk through what we’ll cover today. This is not going to be a deep dive into parser implementation or getting into the hairy details of how those tools work. This session is largely focused on high-level concepts regarding how we use RuboCop.

00:02:36.400 We’re going to touch on what tools like the parser provide for us, but we won’t go too far into it, mainly due to time constraints.

00:02:49.040 What we will work on is a kind of toy problem to get you thinking about what’s possible with custom linters. I hope it will show you the basics of how to set one up. While working on this, I encourage you to think about problems you have in your codebase that this might help improve.

00:03:01.440 Before we go too much further, could everyone raise your hands if you're familiar with RuboCop, work with it day to day, or use it in your code base?

00:03:13.200 Great, most folks are familiar with it. Next, could we have a show of hands for Ruby experience? Raise your hand if you have at least a year of Ruby experience, and keep it up if you've done five years, ten years - anyone?

00:03:37.840 Okay, it looks like most have some experience, which is awesome!

00:04:05.920 Let’s quickly cover the basics of RuboCop. I’m assuming everyone at least knows of RuboCop; it’s a linter that helps with formatting and auto-correcting stylistic issues in our code, steering it toward a common ground with regards to style guide and best practices, which are configurable.

00:04:45.440 RuboCop provides a framework for doing things in a standardized way, with the added benefit of automation.

00:05:02.960 This raises the question: why do we need to write something custom if we have this tool that follows a community style guide?

00:05:17.840 The need for custom linters arises from unique requirements in your codebase. For example, at SonderMind, we’re currently working on managing our numerous environment variables.

00:05:45.759 We implemented a linter that prevents anyone from introducing new environment variables without including documentation for them. We check against a file where the documentation lives and flag any pull request that does not add documentation when it introduces a new variable.

00:06:10.000 This is a specific problem that custom linters can solve effectively and consistently.
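The check described above might look roughly like the sketch below. It is not SonderMind's actual linter: it scans source text with a regular expression instead of walking the AST, and the method name, the variable names, and the list of documented names are all hypothetical.

```ruby
# Matches ENV['NAME'], ENV["NAME"], and ENV.fetch('NAME', ...).
# Assumption: variable names are upper-case letters, digits, and underscores.
ENV_REFERENCE = /ENV(?:\.fetch\(\s*|\[\s*)['"]([A-Z0-9_]+)['"]/

# Returns the environment variable names used in `source` that are
# missing from `documented_names`.
def undocumented_env_vars(source, documented_names)
  referenced = source.scan(ENV_REFERENCE).flatten.uniq
  referenced - documented_names
end

sample = <<~RUBY
  timeout = ENV.fetch('REQUEST_TIMEOUT', 5)
  api_key = ENV['PAYMENTS_API_KEY']
RUBY

# Only REQUEST_TIMEOUT is documented in this made-up example.
p undocumented_env_vars(sample, ['REQUEST_TIMEOUT'])
```

A real cop would report each missing name as an offense on the node that reads it, so the warning points at the exact line in the pull request.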

00:06:39.760 So, what is RuboCop actually doing when it lints our code? RuboCop relies on a library called Parser, which works similarly to the process of compiling code.

00:07:05.919 Parser reads the source code text you've written, splits it into tokens, and then converts those tokens into an abstract syntax tree.
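To make the tokenizing step concrete, here is a minimal sketch. It uses Ripper from Ruby's standard library rather than the Parser gem, so the token names differ from what Parser produces, and `tokenize` is a hypothetical helper.

```ruby
require 'ripper'

# Turn source text into a flat list of tokens, each with its position.
def tokenize(source)
  Ripper.lex(source).map do |(line, column), type, text, _state|
    { line: line, column: column, type: type, text: text }
  end
end

tokenize('total = 1 + 2').each do |token|
  next if token[:type] == :on_sp # skip whitespace tokens for readability

  puts "#{token[:type]} #{token[:text].inspect} at #{token[:line]}:#{token[:column]}"
end
```

Each token carries its type, its text, and where it sits in the file, which is the raw material the tree-building step works from.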

00:07:45.440 RuboCop provides additional tooling to make it easier to work with the output of that library.

00:08:06.080 An abstract syntax tree (AST) represents the structure of the source code. It reflects the relationships between the various parts of the code, making it possible to inspect and modify the code programmatically.
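As a rough illustration of that tree structure, the sketch below again uses the standard library's Ripper as a stand-in for Parser. Ripper's node names are different from Parser's, and `node_types` is a hypothetical helper that lists node types in the order they are visited.

```ruby
require 'ripper'
require 'pp'

# Collect the type of every node in a Ripper S-expression, depth first.
# A node is an array whose first element is a Symbol naming its type.
def node_types(node, found = [])
  return found unless node.is_a?(Array)

  found << node.first if node.first.is_a?(Symbol)
  node.each { |child| node_types(child, found) }
  found
end

tree = Ripper.sexp('total = 1 + 2')
pp tree               # the nested structure itself
p node_types(tree)    # just the node types, outermost first
```

The assignment node contains the addition node, which in turn contains the two integer nodes. That nesting is what lets a linter ask structural questions about code instead of matching text.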

00:08:42.560 Here’s a quick visualization of what the output of a utility called `ruby-parse`, which ships with the Parser gem, looks like for a small snippet of Ruby code. It shows how each component relates to the others, including implicit elements that aren’t obvious at first glance.

00:09:10.000 So, that’s the core of what RuboCop works with, demonstrating the importance of using these tools effectively when working with code.

00:09:26.240 Now, we’re ready to start coding! Let’s switch over to what we want to look at.

00:09:46.080 We need to clone our repo and execute a bundle install. Afterwards, we’ll review the instructions markdown file, which is a step-by-step guide to get us through this repo.

00:10:19.120 A couple of things to keep in mind as we’re writing code: each part of the process of building this is split into steps, represented as directories. When you see references to switching steps, we can use a script called switch step to navigate between them.

00:10:49.720 If something isn’t working for you, and you want to push ahead, you can simply switch to the next step. Each code segment builds upon the previous one, so feel free to skip ahead as needed.

00:11:14.079 Quickly, we’ll take a look at the folder structure and how everything is laid out. What we’re aiming for is to spend about half an hour on the first three steps.

00:11:50.000 The last two steps will cover something called the Node Pattern Matcher in RuboCop, which is fairly common when writing linters. I’ll aim to spend about 45 minutes on those last two steps, then we should have around five minutes for general questions.

00:12:16.480 If something doesn’t make sense, please feel free to step up to the microphone. I’m happy to hang out for another 15-20 minutes to help answer your questions.

00:12:42.720 The main focus here is we have a Ruby app that prints strings. This is essentially a poetry slam application.

00:13:02.240 Now, if you have cloned everything properly, you should be on step one. We'll take a look at the data structure that we get back from RuboCop during linting.

00:13:25.600 Initially, we should be able to run RuboCop without any issues after performing a bundle install. Could everyone try that now?

00:13:59.200 Using the debug flag will also provide additional context about what's happening there.

00:14:26.560 So far, things are going well! We inspected some files, and nothing was detected, which is expected because step one doesn’t do anything with an empty class.

00:15:03.840 Next, we’re going to add a method that captures when we start linting a file, using a callback called `on_new_investigation`. When this is defined, RuboCop calls it each time it begins investigating a file.

00:15:39.840 In our step one directory, let’s add this method and then place a `binding.pry` in there, allowing us to inspect what RuboCop has provided.

00:16:05.440 The main part we're going to examine is called `processed_source`, which will help us explore the abstract syntax tree of some files.

00:16:34.960 Let’s ensure everyone can view the processed source and check what we’re receiving there.

00:16:59.200 The `processed_source` gives us an abstract syntax tree that reflects each method and its calls. It also exposes the comments, which live outside the AST itself.

00:17:43.360 We can also see the raw source and the various tokens, along with their ranges in the source file, which describe how Parser interprets them.

00:18:18.320 This is the lower-level context regarding the parser tool. It allows us to act on it more effectively, driving home the fact that custom linters are powerful partners.

00:18:55.280 The `processed_source` will provide us with something called `ast_with_comments`. Our problem sounds odd because normally you wouldn’t consider counting syllables in comments.

00:19:22.880 However, you can imagine leveraging this functionality in more useful, meaningful ways.

00:19:58.000 When examining the `ast_with_comments`, we see that it associates comments with the methods they precede, providing us with key-value pairs.

00:20:36.720 In this step, we’re simply going to iterate through the `ast_with_comments` and print out the comments.
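The key-value shape of that mapping can be imitated without RuboCop. The sketch below is only an approximation: it scans lines of text instead of using the AST, so it would be fooled by things like `#` inside a heredoc, and `comments_by_method` is a hypothetical helper keyed by method name rather than by node.

```ruby
# Returns { method_name => [comment lines directly above the def] }.
def comments_by_method(source)
  lines = source.lines.map(&:strip)
  result = {}

  lines.each_with_index do |line, index|
    match = line.match(/\Adef\s+([a-z_]\w*[?!=]?)/)
    next unless match

    # Walk upward from the def while the lines are still comments.
    preceding = lines[0...index].reverse.take_while { |l| l.start_with?('#') }
    result[match[1]] = preceding.reverse
  end

  result
end

SAMPLE_APP = <<~RUBY
  # An old silent pond
  # A frog jumps into the pond
  # Splash! Silence again
  def pond
    puts 'splash'
  end

  def bare
  end
RUBY

comments_by_method(SAMPLE_APP).each do |name, comments|
  puts "#{name}: #{comments.inspect}"
end
```

Iterating over the pairs and printing the comments is all this step of the workshop asks for. A method with no comment above it simply maps to an empty list.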

00:21:07.200 If someone runs into any trouble while implementing this, Kari can assist!

00:21:43.440 You might find that there’s an issue with linting two files; make sure you’re running RuboCop against `app.rb` to focus solely on that file.

00:22:15.520 Now, let’s go through the `ast_with_comments` and see what we’re observing there.

00:22:42.320 We have an abstract syntax tree showing tokens that include the ability to view comments associated with methods.

00:23:19.520 As we scroll through, we see that the `processed_source` provides meaningful information about our source files.

00:23:53.280 Next, after understanding this, we’ll implement an inspection using the comment and iterate through.

00:24:24.480 Let’s take about five minutes to get those comments printed when we conduct a RuboCop run.

00:24:54.040 Now, someone mentions updating the import classes to handle haiku comments, but there seems to be an issue executing that callback.

00:25:27.360 Kari, do you mind assisting with that? Are they not looking at the right files?

00:25:58.000 Yes! Sorry for the confusion; when inspecting the processed source, please make sure your path is correct and points to the right file.

00:26:19.040 Now, let’s focus on demonstrating the comments when working through the `ast_with_comments`!

00:26:52.320 I encourage everyone to try adjusting some comments. You can dive back in, so let’s take a moment to observe this further.

00:27:30.560 Now, let’s pivot to step two, where we will focus on more granular inspections.

00:28:04.320 The idea is to apply a callback on method definitions. This will happen for each method defined within our code.

00:28:45.760 When we define an `on_def` method that accepts a node, it will be triggered for every method definition in our source.
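The callback idea can be sketched with a toy dispatcher. This is not how RuboCop is implemented: the tree comes from the standard library's Ripper rather than Parser, and `MethodNameCollector` and `investigate` are hypothetical names. Only the `on_def` method name mirrors the hook discussed here.

```ruby
require 'ripper'

# Toy stand-in for a cop: it only records the method names it is told about.
class MethodNameCollector
  attr_reader :names

  def initialize
    @names = []
  end

  # Called once per method definition node.
  def on_def(node)
    _type, (_ident, name, _position) = node
    @names << name
  end
end

# Hypothetical dispatcher: walks the tree and fires `on_def` for every :def node.
def investigate(tree, cop)
  return unless tree.is_a?(Array)

  cop.on_def(tree) if tree.first == :def
  tree.each { |child| investigate(child, cop) }
end

collector = MethodNameCollector.new
investigate(Ripper.sexp("def first_poem; end\ndef second_poem; end\n"), collector)
p collector.names
```

The cop never searches for definitions itself. It declares interest by defining the callback, and the framework does the tree walking.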

00:29:30.160 We also still have access to the `processed_source`, allowing us to retrieve comments related to each method.

00:30:02.080 With that in place, we will return an offense if that node does not have an associated comment.

00:30:59.680 As we continue, we will apply some logic to ensure the node has comments and check the count against haiku requirements.

00:31:38.480 If the count does not match, we will add an offense with an appropriate message.

00:32:13.040 Remember, a haiku consists of three lines. Thus, we want to ensure that our specific methods comply!
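A minimal sketch of the three-line rule follows. In a real cop the message would typically be reported through RuboCop's `add_offense`; here a plain Struct stands in for it, and `Offense`, `HAIKU_LINES`, and `haiku_line_offense` are hypothetical names.

```ruby
# Stand-in for a reported offense.
Offense = Struct.new(:method_name, :message)

HAIKU_LINES = 3

# Returns nil when the method has exactly three comment lines above it,
# otherwise an Offense describing what was found.
def haiku_line_offense(method_name, comment_lines)
  return nil if comment_lines.size == HAIKU_LINES

  Offense.new(
    method_name,
    "Expected #{HAIKU_LINES} comment lines above `#{method_name}`, found #{comment_lines.size}"
  )
end

p haiku_line_offense('pond', ['# one', '# two', '# three'])
p haiku_line_offense('bare', [])
```

A method with no comments at all falls out of the same check, since zero lines is also not three.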

00:32:53.440 Let’s now refocus and go through methods, assuring they meet the syllable requirements.

00:33:29.920 Feel free to test this with a few users and pull up access to see the results.

00:34:00.560 As we distribute our final check, we plan to navigate through ensuring compliance across all counts.

00:34:43.760 This brings us to step three, where we will enforce the 5-7-5 syllable pattern in the comments on our methods.

00:35:25.440 We will utilize an external library to count syllables in a provided string to help this process.

00:35:59.360 We'll ensure our implementation returns without adding an offense only if the syllable counts are correct.
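The workshop relies on an external library for syllable counting, and this transcript does not name it. As a stand-in, the sketch below uses a naive vowel-group heuristic that miscounts many English words, so treat it only as an illustration of the 5-7-5 check. All method names are hypothetical.

```ruby
# Naive heuristic: count groups of consecutive vowels, after dropping a
# silent trailing "e" (but not in words ending in "le", as in "little").
def syllables_in_word(word)
  letters = word.downcase.gsub(/[^a-z]/, '')
  return 0 if letters.empty?

  letters = letters.sub(/e\z/, '') if letters.length > 2 && !letters.end_with?('le')
  [letters.scan(/[aeiouy]+/).size, 1].max
end

def syllables_in_line(line)
  line.split.sum { |word| syllables_in_word(word) }
end

# True when the three lines have 5, 7, and 5 syllables respectively.
def haiku?(lines)
  lines.map { |line| syllables_in_line(line) } == [5, 7, 5]
end

poem = [
  'Comments in my code',
  'RuboCop will count them all',
  'Green checks fall like rain'
]

p poem.map { |line| syllables_in_line(line) }
p haiku?(poem)
```

Comparing against `[5, 7, 5]` also covers the line-count rule, because a comment block with the wrong number of lines can never equal a three-element list.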

00:36:38.880 Moving forward, we want to print out exact comments and highlight any missteps.

00:37:14.560 Let’s confer on our specific functions designations while outlining our progress.

00:37:58.440 Now, let’s progress into step four and delve into the details surrounding our node pattern matchers.

00:38:36.960 This is where the fun begins! Let's make sure the syntax of these patterns takes shape with the appropriate details.

00:39:18.240 Remember, we will focus on ensuring the poetic lines match our defined structure, allowing us to improve our understanding of those parameters.

00:40:03.040 We see potential overlaps where these patterns can become useful modifiers in our documentation compliance.

00:40:45.440 As we delve deeper into the syntax trees, we’ll isolate elements based on their identifiers as needed.
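As a loose analogy for isolating nodes by shape and identifier, the sketch below uses Ruby's built-in pattern matching (`case`/`in`, Ruby 2.7 or later) over S-expressions from the standard library's Ripper. This is not RuboCop's node pattern syntax. The `_poem` suffix rule and the helper names are hypothetical.

```ruby
require 'ripper'

# Returns the method name when `node` is a method definition whose name
# ends in "_poem", otherwise nil.
def poem_definition(node)
  case node
  in [:def, [:@ident, /_poem\z/ => name, _], *]
    name
  else
    nil
  end
end

# Yields every array node in the tree, depth first.
def each_node(tree, &block)
  return unless tree.is_a?(Array)

  yield tree
  tree.each { |child| each_node(child, &block) }
end

def poem_method_names(source)
  names = []
  each_node(Ripper.sexp(source)) do |node|
    name = poem_definition(node)
    names << name if name
  end
  names
end

p poem_method_names("def autumn_poem; end\ndef helper; end\n")
```

The pattern states the node type, the position of the identifier, and a constraint on its value in one expression. That is the convenience a pattern matcher offers over hand-written conditionals on each child.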

00:41:20.960 This exploration will also allow us to bypass unnecessary elements in our coding process!

00:42:04.560 Let's take another three to four minutes and explore usage patterns amongst the nodes.

00:42:40.960 All right, how is everyone feeling? Do you wish to dive further into node patterns?

00:43:20.320 Remember to keep these requirements in mind as we link similar methods together while adhering to XML standards.

00:44:02.240 As we move onward, let's ensure our calls to specific classes align with hierarchical templates.

00:44:39.840 Let’s further review our notes and polished strategies as we wrap up today!

00:45:12.080 If you have any questions or wish to explore our capabilities further, feel free to connect!

00:45:50.720 Remember to take a moment and appreciate this journey into the realm of custom linting and the power of collaboration.

00:46:31.760 Thank you for engaging with us today. Each question that surfaced will lead to new insights as we venture ahead in improving our tools.

00:47:25.680 Let’s take this moment to share perspectives and consider how we can further utilize these tools in our workflows.

00:48:19.040 As we wrap up, I hope to see active discussions surrounding these practices in the community forums as we share these best practices.

00:49:00.640 Let's highlight successful models and effectively share results with our teams to promote growth amongst our peers.

00:49:51.920 The overall goal is to embrace this journey today and take it forward into our daily coding practices.

00:50:41.680 Thank you very much for your participation. If you have further inquiries, please don't hesitate to ask.