Static Type Checking in Rails with Sorbet

by Hung Harry Doan

In the talk titled "Static Type Checking in Rails with Sorbet," Hung Harry Doan, a Staff Software Engineer at the Chan Zuckerberg Initiative, explores how to implement static type checking in Ruby on Rails applications using the tool Sorbet. The presentation covers a variety of aspects related to Sorbet and its integration into a codebase that comprises over 2,500 files, emphasizing the following key points:

  • Intro to Static Type Checking: The discussion begins with an introduction to static type checking and its significance in catching errors early, improving code maintainability, and enhancing developer productivity.
  • Sorbet Overview: Sorbet is highlighted as a fast and powerful static type checker specifically designed for Ruby, which supports gradual typing and can be integrated with text editors for real-time type checking.
  • Real-World Application: The speaker shares insights into the experiences of adopting Sorbet across their large codebase, demonstrating how it was able to uncover subtle bugs, such as misspelled method calls that would typically go unnoticed until runtime.
  • Challenges with Rails: Doan discusses the challenges posed by metaprogramming in Rails when it comes to type checking, as Rails often generates methods dynamically, complicating type inference.
  • Introduction of Sorbet-Rails Gem: To address the unique challenges posed by Rails, the creation of the Sorbet-Rails gem is introduced, which helps bridge the gap between Sorbet and Rails applications. This gem facilitates the generation of method signatures, thereby enhancing the ability to type check dynamic Rails methods.
  • Implementation Strategies: Key strategies for implementing Sorbet in a Rails environment are shared, including generating RBI (Ruby Interface) files, leveraging Rails reflections for dynamic methods, and adopting type checking gradually.
  • Adoption Metrics: The speaker discusses metrics for measuring the adoption of type checking within their team, including method call coverage and the number of files type-checked, along with strategies to maintain engagement and support among team members.
  • Conclusion and Insight: The talk concludes with a recommendation to the audience to explore Sorbet for their own projects, emphasizing the potential for improved code quality and developer efficiency. The presenter invites collaboration and contribution to the Sorbet community, highlighting the positive impact it has had on their workflow.

In summary, Hung Harry Doan provides a compelling overview of the benefits and challenges of incorporating static type checking into Ruby on Rails applications using Sorbet, offering practical advice and insights for teams considering similar implementations.

00:00:08.599 Hello everyone, thanks for tuning into RailsConf today. I would like to talk to you about static type checking in Rails using a powerful tool called Sorbet.
00:00:14.510 We are going to cover three main things today. First, we will discuss what static type checking is and the benefits it offers. Next, I will walk you through how we use Sorbet to type check Rails, which is going to be a fun challenge. Finally, I will provide you with some tips and tricks for how you can drive the adoption of Sorbet in your team.
00:00:26.179 My name is Hung Harry Doan, and I am a staff engineer at the Chan Zuckerberg Initiative. "Harry" is a name I picked for myself because I really enjoy Harry Potter. I enjoy it so much that I will be using wizard examples throughout this talk.
00:00:43.550 The Chan Zuckerberg Initiative, or CZI, is a new kind of philanthropy that applies technology to solve challenging social problems. We have three main initiatives: one in science and disease research, two in education, and three in the areas of justice and opportunity. I work on the education initiative, where we build an e-learning program, a personalized education platform for K-12 schools in the U.S. It empowers students to learn at their own pace in a way that works best for them.
00:01:01.039 Our team consists of about 35 engineers, and we've been developing this platform for six years. It is built on Ruby on Rails, with more than 2,000 Ruby files and over 150,000 lines of code. As the team's size has expanded, we have faced challenges that I believe are typical for any engineering team.
00:01:14.509 These include maintaining legacy code, making large code changes safely and reliably, and onboarding new engineers. As the code base grows and becomes more complex, these challenges become more pronounced. Because of that, we have been exploring different tools that can improve our team's success.
00:01:31.240 So, why Sorbet? Let's take a look at the code example below. I defined a "levitate" function that takes "cast" as an argument, followed by three calls to the function. Only one of them is valid, but Ruby does not raise an error until the code is executed. Wouldn't it be great if there were a way to detect bugs automatically? It would help us find errors early, reduce manual testing, and allow us to focus on building features.
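The slide itself is not reproduced in this transcript. A minimal sketch of the kind of code described might look like the following; the method body and the three calls are assumptions made for illustration, not the speaker's original example.

```ruby
# Hypothetical reconstruction: the method body and the calls are assumptions.
def levitate(cast)
  "Levitating #{cast.upcase}"
end

calls = {
  valid:            -> { levitate("feather") },
  missing_argument: -> { levitate() },   # raises ArgumentError when executed
  wrong_type:       -> { levitate(42) }  # raises NoMethodError (Integer has no #upcase)
}

# Ruby accepts all three calls at load time; the errors only appear on execution.
results = calls.transform_values do |call|
  begin
    call.call
  rescue ArgumentError, NoMethodError => error
    error.class
  end
end

results.each { |label, outcome| puts "#{label}: #{outcome}" }
```

Both invalid calls load without complaint and fail only when they run, which is the gap a static checker is meant to close.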
00:01:49.870 Static type checkers do exactly that. Using static type checking in a dynamic language is not a new idea. Tools like Flow and TypeScript for JavaScript and mypy for Python have been successfully used for quite some time.
00:02:10.300 One of the most exciting developments for me last year was the release of Sorbet, a static type checker for Ruby. Sorbet is fast and powerful, processing hundreds of thousands of lines of code per second, and can be integrated with editors for type checking as you write code.
00:02:23.230 Sorbet is designed to support gradual typing, allowing you to add type checks to parts of your codebase and benefit from typing immediately while integrating more files as you go. It also comes with a runtime type-checking component. This feature sets it apart from tools like Flow and TypeScript by helping keep type definitions correct and up-to-date with code without much overhead.
00:02:43.170 Sorbet has been battle-tested in production with hundreds of engineers over the last two years and can eliminate common errors like typos, name errors, and argument errors. The type information also makes code much easier to understand, particularly for new engineers.
00:03:03.960 Sorbet relies on method signatures for type checking. A signature defines the types of the parameters and the return value of a method. Sorbet uses this information to enforce that the method is used correctly. Since we integrated Sorbet, we have seen tremendous impacts on our team and results were immediate.
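As an illustration of a signature, here is a small sketch using Sorbet's `sig` syntax. The `Wizard` class and its method are hypothetical. The `rescue LoadError` fallback is only there so the sketch runs where the `sorbet-runtime` gem is not installed; in that case `sig` does nothing and no types are checked.

```ruby
# typed: true
begin
  require "sorbet-runtime"
rescue LoadError
  # Assumption for this sketch only: without the gem, `sig` becomes a no-op.
  module T
    module Sig
      def sig(*_args); end
    end
  end
end

class Wizard
  extend T::Sig

  # The signature declares one String parameter and a String return value.
  sig { params(object_name: String).returns(String) }
  def levitate(object_name)
    "#{object_name} is floating"
  end
end

puts Wizard.new.levitate("feather")
```

With a signature like this in place, a call such as `Wizard.new.levitate(42)` can be reported by the static checker before the code runs.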
00:03:23.230 Below is an example of a bug that Sorbet found in our code. Someone tried to call "academic_year" on an object, but this method did not exist. The actual method was "current_academic_year." This bug was subtle and hard to spot by just reading the code.
00:03:37.150 Interestingly, Sorbet also helped us discover more issues in our code that we previously didn't know about, such as dead code accessing classes and functions that had already been removed. Additionally, there were bugs caused by incomplete renaming of methods. Bugs like these lowered the quality of the code and slowed down engineers working in those areas.
00:04:00.890 However, when we started with Sorbet, things were not easy. Sorbet could understand Ruby syntax and it provided support for core Ruby APIs like Array, String, and Hash, but it offered very little support for Rails out of the box.
00:04:26.800 Rails code is not easy to type check, and let me explain why. First, it is a large framework with many functionalities—that is one challenge—but moreover, Rails relies heavily on metaprogramming to provide core functionalities.
00:04:34.570 What is metaprogramming? It is a way to program dynamically using code instead of writing everything by hand. For example, I have an Oracle class here, but I am not writing the "answer" method that it should have; it is created by calling "make_magic." If "make_magic" is not called, "answer" is not created.
00:04:49.150 The last line produces an error. You can already see how this can be a challenge for a static type checker, as it sometimes doesn't even know that a method exists, let alone what the method does.
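The Oracle slide is not included in the transcript. A minimal sketch of the idea, assuming `make_magic` is a class-level helper built on `define_method`, could look like this; the `Magic` and `Muggle` names and the return value are hypothetical.

```ruby
module Magic
  # Defines an `answer` instance method on whichever class calls this.
  def make_magic
    define_method(:answer) { 42 }
  end
end

class Oracle
  extend Magic
  make_magic # `answer` exists only because this line runs
end

class Muggle
  extend Magic
  # make_magic is never called here, so `answer` is never defined.
end

puts Oracle.new.answer

begin
  Muggle.new.answer
rescue NoMethodError => error
  puts "Muggle: #{error.class}"
end
```

Nothing in the source text of `Oracle` contains `def answer`, so a tool that only reads the code has no direct way to know the method exists.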
00:05:02.020 Let's look at an example from Rails. Here I have a "wizard," a typical Rails model. It has an object-relational mapping definition and a database backing it up. With this much code, Rails can create a fully fleshed-out model with various functionalities.
00:05:23.140 You can query the model for records in the database, access the attributes of the database table, and also access associations to receive data from other tables. All these methods are generated dynamically, but how does Sorbet understand the code behind these approaches?
00:05:46.430 To help, we provide some custom method signatures, allowing Sorbet to see that Harry is a wizard. However, it doesn't know much about the attributes: "name" and "house" are untyped. Similarly, associations aren't typed either, which makes it challenging to effectively type-check Rails code.
00:06:01.560 If only Sorbet knew the types of these Rails methods! To bridge this gap, I created a gem called Sorbet-Rails to enhance compatibility between Sorbet and Rails. It acts as a one-stop shop that makes Sorbet work seamlessly with Rails.
00:06:20.570 It has a static component that generates method signatures for dynamic methods created by Rails and additional runtime features that help with type checking. Let me take a moment to talk about RBI files, as this is an important concept.
00:06:40.120 RBI files are like C++ header files; they only contain method definitions and signatures without implementations. They provide additional information that Sorbet doesn't get from parsing the code. We generate RBI files for dynamic methods so that Sorbet can better understand their intentions.
00:06:54.310 We weren't the first team encountering challenges with metaprogramming. An alternative approach involves using metaprogramming plugins. These plugins run alongside Sorbet and notify it when dynamic methods are created.
00:07:09.410 This is best explained with an example. Given the Oracle class I defined earlier, you can create a plugin that detects calls to "make_magic" and generates a signature for the "answer" method. Sorbet would then recognize that the method exists in the Oracle class.
00:07:26.110 Metaprogramming plugins have the advantage of being evergreen, meaning they run with every type-checking call. This ensures they are synchronized with the latest versions of code. However, they can slow down type-checking considerably, hindering integration with editors where type checking occurs with every keystroke.
00:07:50.740 Additionally, writing the plugins can be challenging because they only have static information about the code being analyzed, unlike the dynamic nature of our applications.
00:08:06.290 A different approach is to generate RBI files ahead of time. These can be less accurate if the code changes and you haven't re-run the generation scripts. However, because the method signatures are already available, type checking is faster, even in the editor.
00:08:22.360 Moreover, generating signatures becomes significantly easier when you have access to a fully functioning Rails runtime environment. Due to the advantages of this approach, it has now become the recommended way to manage metaprogramming.
00:08:38.590 Let's see how this approach works with the earlier model example. Sorbet-Rails can generate signatures for the "name" and "house" attributes. "house" is stored as an integer column, but in the model it is defined as an enum, which shows why generating signatures can be challenging.
00:09:01.310 However, with these signatures, Sorbet begins to understand the code better. Now it knows that "name" and "house" return strings. Similarly, we can generate signatures for association types. Sorbet-Rails can generate signatures for a lot more methods in a model.
00:09:27.560 Since Rails relies on metaprogramming to create methods, we can mimic this process to generate signatures. It's an appropriate way to write signatures, especially for more complex models. However, signature generation requires different strategies for each type of method.
00:10:07.110 First, for database attributes and association methods, we rely on Rails reflections to generate their signatures. In short, reflection is a mechanism that allows a program to inspect and change its own structure. Rails comes with various reflection classes, such as association reflections. These are helpful because they provide detailed information about how dynamic methods behave.
00:10:25.650 Our task is simple: we can use the information they provide to generate corresponding method signatures. However, when it comes to generating types and signatures for enums, we may need to look up the associated Rails documentation or source code.
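To make the idea concrete, here is a rough sketch of turning column metadata into RBI text. It is not the Sorbet-Rails implementation: the `Column` struct stands in for what a running Rails app reports (for example through `Wizard.columns_hash` and `Wizard.defined_enums`), and `attribute_rbi`, `SORBET_TYPES`, and the `patronus` column are hypothetical names.

```ruby
# Stand-in for the column metadata Rails exposes at runtime.
Column = Struct.new(:name, :type, :null)

# Assumed mapping from column types to Sorbet type names (deliberately incomplete).
SORBET_TYPES = {
  string: "String",
  integer: "Integer",
  boolean: "T::Boolean"
}.freeze

def attribute_rbi(model_name, columns, enum_names: [])
  methods = columns.flat_map do |column|
    # Rails enum attributes read back as strings even when the column is an integer.
    type = enum_names.include?(column.name) ? "String" : SORBET_TYPES.fetch(column.type, "T.untyped")
    type = "T.nilable(#{type})" if column.null && type != "T.untyped"
    ["  sig { returns(#{type}) }", "  def #{column.name}; end"]
  end
  ["# typed: strong", "class #{model_name}", *methods, "end"].join("\n")
end

columns = [
  Column.new("name", :string, false),
  Column.new("house", :integer, false),  # declared as an enum in the model
  Column.new("patronus", :string, true)  # hypothetical nullable column
]
puts attribute_rbi("Wizard", columns, enum_names: ["house"])
```

The output contains only signatures and empty method bodies, which is the shape of an RBI file: declarations for Sorbet to read, with no implementation.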
00:10:43.300 Complications also arise due to third-party gems and private concerns added to models. To tackle this complexity, Sorbet-Rails provides a configurable generator where each plugin can generate its own signatures. The generator then aggregates these to produce a final RBI file.
00:11:02.170 You can add or remove plugins as you wish, and even the code generation logic I described earlier is implemented as plugins. Gem plugins allow the community to share, enhance, and contribute code for public gems. Currently, we have plugins for several gems such as Shrine and Elasticsearch.
00:11:22.530 I hope that as the community develops, we will see more and more gem plugins being shared. Finally, you can also write custom plugins for your private libraries.
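The plugin arrangement described above can be sketched in plain Ruby as follows. This does not reproduce the gem's actual plugin API; `AttributePlugin`, `EnumPlugin`, `RbiGenerator`, and the hash describing the model are all hypothetical.

```ruby
# Each plugin returns the RBI lines it is responsible for.
class AttributePlugin
  def generate(model)
    model.fetch(:attributes).flat_map do |name, type|
      ["  sig { returns(#{type}) }", "  def #{name}; end"]
    end
  end
end

class EnumPlugin
  def generate(model)
    # Rails enums add a predicate method per value, such as `gryffindor?`.
    model.fetch(:enums, {}).values.flatten.flat_map do |value|
      ["  sig { returns(T::Boolean) }", "  def #{value}?; end"]
    end
  end
end

# The generator aggregates the output of every registered plugin into one file.
class RbiGenerator
  def initialize
    @plugins = []
  end

  def register(plugin)
    @plugins << plugin
    self
  end

  def unregister(plugin_class)
    @plugins.reject! { |plugin| plugin.is_a?(plugin_class) }
    self
  end

  def generate(model)
    body = @plugins.flat_map { |plugin| plugin.generate(model) }
    ["# typed: strong", "class #{model.fetch(:name)}", *body, "end"].join("\n")
  end
end

wizard = {
  name: "Wizard",
  attributes: { "name" => "String" },
  enums: { "house" => %w[gryffindor slytherin] }
}
generator = RbiGenerator.new.register(AttributePlugin.new).register(EnumPlugin.new)
puts generator.generate(wizard)
```

Keeping each concern in its own plugin means a team can drop the plugins it does not need and add one for a private library without touching the rest.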
00:11:35.020 Sorbet-Rails supports signature generation for other Rails objects as well, including mailers and jobs, which present an interesting challenge. They have class-level methods that are generated based on custom user-defined instance methods. For instance, in a mailer class, when we define an instance method called "notify_subscriber," Rails automatically adds a corresponding class-level method.
00:11:49.920 Sorbet-Rails generates a signature for this method as well. The generated signature is refined using Sorbet itself: Sorbet-Rails reflects on the instance method's signature and produces a better signature for the class method.
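A simplified sketch of this pattern is shown below. It imitates the class-level forwarding with `method_missing` and derives a class-method declaration from the instance method's parameter list. `SketchMailer`, `SubscriberMailer`, and `class_method_rbi` are hypothetical, real mailers return a delivery object rather than a string, and Ruby's `Method#parameters` is used here in place of reflecting on a Sorbet signature.

```ruby
class SketchMailer
  # Forward unknown class-level calls to a fresh instance when a matching
  # public instance method exists.
  def self.method_missing(name, *args)
    return super unless public_instance_methods(false).include?(name)
    new.public_send(name, *args)
  end

  def self.respond_to_missing?(name, include_private = false)
    public_instance_methods(false).include?(name) || super
  end
end

class SubscriberMailer < SketchMailer
  def notify_subscriber(email)
    "mail to #{email}"
  end
end

# Build a class-level declaration by reflecting on the instance method.
def class_method_rbi(mailer_class, method_name)
  parameters = mailer_class.instance_method(method_name).parameters
  argument_list = parameters.map { |_kind, name| name.to_s }.join(", ")
  "def self.#{method_name}(#{argument_list}); end"
end

puts SubscriberMailer.notify_subscriber("harry@hogwarts.example")
puts class_method_rbi(SubscriberMailer, :notify_subscriber)
```

The class never defines `self.notify_subscriber` explicitly, so a generated declaration is what lets a static checker accept the call.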
00:12:07.440 On top of the RBI generation logic, Sorbet-Rails includes a few runtime features to make typing easier. I will only discuss typed parameters because they show how your code can become safer and easier to understand.
00:12:22.950 Normally, parameters in controller actions are treated as strings, even if the actual data type is an integer or a boolean. The value received by the controller is a string. However, we can define the structure of the parameters, with corresponding types for their components, using typed parameters.
00:12:41.160 With typed parameters, we can coerce the normal parameters into this structure, allowing the rest of the code to use the typed values. The structure also serves as documentation for the controller actions, making it easier to know which parameters need to be sent from the client side.
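The coercion described above can be sketched in plain Ruby. This is not the Sorbet-Rails helper; `WizardParams`, `WIZARD_PARAMS_SCHEMA`, `coerce_value`, and `coerce_params` are hypothetical names used only to show string parameters being turned into typed values.

```ruby
# The struct documents which parameters the action expects.
WizardParams = Struct.new(:id, :name, :graduated, keyword_init: true)

WIZARD_PARAMS_SCHEMA = { id: :integer, name: :string, graduated: :boolean }.freeze

def coerce_value(raw, type)
  case type
  when :integer then Integer(raw, 10) # raises ArgumentError on input like "seven"
  when :string then raw.to_s
  when :boolean
    case raw
    when "true", "1" then true
    when "false", "0" then false
    else raise ArgumentError, "not a boolean: #{raw.inspect}"
    end
  else
    raise ArgumentError, "unsupported type: #{type.inspect}"
  end
end

def coerce_params(raw_params, schema, struct_class)
  values = schema.to_h do |key, type|
    raw = raw_params.fetch(key.to_s) { raise ArgumentError, "missing parameter: #{key}" }
    [key, coerce_value(raw, type)]
  end
  struct_class.new(**values)
end

# Controller parameters arrive as strings.
raw = { "id" => "7", "name" => "Harry", "graduated" => "false" }
typed = coerce_params(raw, WIZARD_PARAMS_SCHEMA, WizardParams)
puts typed.id + 1
puts typed.graduated.inspect
```

After coercion, the rest of the action works with an Integer and a boolean instead of re-parsing strings at each use.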
00:13:00.970 We have tackled many technical challenges using Sorbet in our Rails app, but this is only the beginning. Adopting type checking means asking your team to alter their workflow to accommodate new typing conventions.
00:13:11.530 In this section, I will share lessons we've learned from driving adoption in our team and how you can apply them successfully. First, let's discuss metrics to measure adoption progress. Sorbet provides two metrics to assess adoption: file-level and call-site-level metrics.
00:13:36.370 Sorbet features five different type-checking levels, ranging from "typed: ignore" to "typed: strong." The primary levels to focus on are "typed: true," where type checks are done without needing to write method signatures, and "typed: strict," where every method needs a signature and you must provide types for instance variables.
00:14:01.200 This latter level is ideal for adoption as it encourages teams to engage with typing in-depth. Sorbet can tell you the number of files at each level, and it also counts the number of method calls that get type checked. Any method call on an untyped object is counted as untyped.
00:14:27.700 These metrics provide valuable insights into where to direct your adoption efforts. Typically, you would want to increase the number of files being type checked and then focus on the number of call sites within those files. You can also create your own metrics to define what successful adoption means to your team.
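For a rough picture of the file-level metric, the sketch below counts files by the `# typed:` comment they carry. Sorbet reports these numbers itself, so this is only an illustration; `sigil_level`, `files_per_level`, and the sample sources are hypothetical, and the matching is simplified. Files without such a comment are counted as `false`, which is Sorbet's default level.

```ruby
SIGIL_PATTERN = /^\s*#\s*typed:\s*(ignore|false|true|strict|strong)\b/

# Returns the level named in the file's `# typed:` comment.
def sigil_level(source)
  source[SIGIL_PATTERN, 1] || "false"
end

# Counts how many files sit at each level.
def files_per_level(sources)
  sources.each_with_object(Hash.new(0)) do |(_path, source), counts|
    counts[sigil_level(source)] += 1
  end
end

# Hypothetical file contents keyed by path.
sources = {
  "app/models/wizard.rb" => "# typed: strict\nclass Wizard; end\n",
  "app/models/spell.rb" => "# typed: true\nclass Spell; end\n",
  "app/models/owl.rb" => "class Owl; end\n"
}
files_per_level(sources).each { |level, count| puts "typed: #{level} -> #{count}" }
```

Tracking these counts over time shows whether files are moving up from `typed: false` toward `typed: true` and `typed: strict`.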
00:14:48.160 For example, we tracked participation in type-checking initiatives, which is important because it indicates that everyone on the team is actively engaging with the typing process. Currently, over 90% of our team is writing type-checked code, and that includes models and controllers.
00:15:06.960 However, in terms of call-site-level metrics, we are at about 66%. The blue line shows steady growth in the number of call sites being type checked, while the red line represents the number of untyped call sites, which has plateaued.
00:15:26.370 There was a slight bump in the red line when we redesigned our model and controller structures, significantly increasing the number of surfaces that Sorbet could type-check. After that adjustment, the red line held steady, which is a promising indicator.
00:15:44.080 It shows that everyone is engaged in writing type-checked code and that we are constantly working to improve our metrics. Some sub-teams have even set ambitious goals for their code.
00:16:05.780 There are two principles I believe helped us immensely in driving successful adoption. First, we adopted Sorbet gradually. Sorbet can be useful even if you only type-check a portion of your code base; in fact, it's an ongoing process since Ruby is a dynamic language.
00:16:27.841 You can think about where type checking is most beneficial, and it is fine to use escape hatches like "T.untyped" to bypass checks when necessary. We found it easier to add types to new code, aligning it with the team's interest in developing new features.
00:16:40.909 The second principle is that we do not obstruct team members from accomplishing their tasks. We want everyone to see Sorbet as a tool that facilitates their work, not as a hurdle to overcome.
00:16:55.639 There are several ways to achieve this. First, learn about the challenges in type-checking code early on. These issues vary based on coding patterns and libraries your team uses.
00:17:10.589 Make sure to provide workarounds that allow engineers to bypass type checkers when necessary, and let them make mistakes as they get used to the tool. If they do mistakenly change a string to a symbol in a signature, ideally that shouldn't affect the production runtime.
00:17:29.169 With these principles in mind, let's walk through the step-by-step process for driving adoption in your team. First, you need to set up the gems. This step is simple: add the gems to your Gemfile, run "bundle install," and follow the initiation steps documented in the gems.
00:17:50.920 One important note here is to disable Sorbet's runtime checks in production. Runtime checks help enforce input and output type correctness, but if a violation occurs in production code, that can be problematic.
00:18:12.480 Once you have set everything up, take a look at the adoption metrics Sorbet provides. You might be surprised; I certainly was. Sorbet managed to analyze 80% of our files and 40% of the call sites within them. This striking ability comes from Sorbet's understanding of all core Ruby logic and the enhancements from Sorbet-Rails.
00:18:36.921 Next, you'll want to establish a solid foundation for your adoption metrics. This foundation will reveal where you should focus your efforts. It's also a great time to explore how to integrate Sorbet with your team's development tools, like Git, CI, or your code editor.
00:19:00.761 Our team has adopted practices to check Sorbet in our CI, collecting logs as we go. Additionally, I must emphasize that editor integration is a game-changer, revolutionizing how you write code.
00:19:20.401 A major part of your effort here should be about understanding type checking and determining how to best implement it throughout your codebase. Know the types of code your team frequently writes to help guide and unblock them.
00:19:40.820 If you're moving too fast, take your time to build a resilient foundation, as it's essential for the success of the next phases. Remember, adoption should be a gradual process.
00:20:03.929 Once you have a solid foundation, you can start improving your adoption metrics and get your team actively using the tools. Focus initially on getting code files to the "typed: true" level, since you won't have to write any method signatures yet.
00:20:26.090 Start by concentrating on your models, controllers, and classes that mutate data while typing the files. I've found this reveals many existing bugs. I shared those bugs with the team to get everyone enthusiastic about Sorbet.
00:20:45.120 We also held workshops and recruited early adopters for type checking. People really enjoyed learning about Sorbet, as seen in the photo from our recent workshop.
00:21:04.470 After this, it's time to guide your team into type-checking workflows. A crucial tool I found helpful is rubocop-sorbet, which can enforce that new files are tagged with a specific typed level. Make sure to enable Sorbet's runtime checks in development.
00:21:23.300 Also, enable git integration to run checks automatically when updates are pushed to remote branches. Lastly, keep in mind to provide escape hatches so people can bypass checks when needed.
00:21:43.300 I discovered it immensely valuable to celebrate adoption progress and recognize early adopters frequently. Just doing this alone can generate excitement among the team regarding the effort they are investing.
00:21:58.979 In the final phase, type checking becomes the norm. By now most people are accustomed to type checking, so you can tighten the adoption guidelines and, ideally, require new files to have type signatures.
00:22:14.480 rubocop-sorbet can be beneficial once again here. For run-time errors, direct them to the teams that own the code so that each team takes ownership of resolving the mistakes discovered.
00:22:31.540 Take a look at the screenshot provided; run-time errors are reported without disrupting the production app, which is excellent. We also began focusing on adding types to older legacy code as the practice became normalized.
00:22:54.680 I found a very effective approach was getting junior engineers involved in type-checking portions of the code within their sub-team. This practice helps them grasp the code they will be working on while simultaneously aiding the team in increasing type coverage.
00:23:12.580 I want to acknowledge, however, that the tools are still not perfect. Sorbet and Sorbet-Rails continue to develop, and we sometimes encounter limitations, such as syntax not being fully supported yet by Sorbet.
00:23:33.800 For example, some shapes or block binding may not be recognized sufficiently. Nevertheless, the teams are actively working on these tools, and I am excited about the official support for VSCode that is on the horizon.
00:23:54.440 Moreover, Sorbet undergoes regular updates, so my primary recommendation is to update Sorbet and Sorbet-Rails frequently to access the latest features.
00:24:12.580 CZI is not the only organization striving to use Sorbet in Rails. I have seen a few notable companies leveraging Sorbet, ranging from large corporations like Shopify to small startups with just a few engineers.
00:24:30.780 All of them are gaining significant benefits from integrating Sorbet into their workflow. My hope is that my experiences will inspire you to try using Sorbet.
00:24:43.259 Simply follow the initial steps to set up your environment and see if Sorbet can find any bugs in your code base. You are also welcome to join the Slack community, provide feedback, and contribute to the gem.
00:25:06.149 Before I conclude, I want to extend thanks to CZI engineers and leadership. Their continuous support has allowed us to achieve successful adoption.
00:25:36.620 Thanks to everyone on the Sorbet team for creating such a magnificent tool—it has added tremendous value and transformed my Ruby coding practices.
00:25:55.820 Last but not least, I would like to express my gratitude to our various contributors. Your hard work through commits to the repo has not gone unnoticed, and I am grateful for your contributions.
00:26:15.200 Thank you for attending the talk. If you have any questions, feel free to reach out to me through GitHub or Slack. I will be more than happy to answer your questions.