Adopting Sorbet at Scale

by Ufuk Kayserilioglu

In this talk presented at RubyConf 2019, titled 'Adopting Sorbet at Scale', Ufuk Kayserilioglu, a Senior Production Engineer at Shopify, discusses the integration of Sorbet, a static type checker for Ruby, into Shopify’s extensive Ruby on Rails codebase. With over 1 million merchants relying on Shopify's platform and a core monolith composed of approximately 21,000 Ruby files, the need for improved code safety and reliability was critical. Key aspects of the presentation include:

  • Introduction to Ruby and Shopify’s Scale:

    • Shopify operates at a significant scale, handling 80,000 requests per second, with about $12 billion in sales made on the platform each year.
    • The core application is subjected to 800 pull requests daily, necessitating robust testing and reliability measures.
  • Explaining Sorbet:

    • Sorbet is highlighted as a fast type checker allowing for gradual adoption in the codebase.
    • It offers a configurable type system and identifies type mismatches clearly, enhancing code quality and maintenance.
  • Adoption Journey:

    • Shopify was a pioneer in adopting Sorbet, starting small in 2018 and mandating its use in CI by May 2019.
    • Challenges included building a runtime component, generating RBI files for external gems, and addressing meta-programming complexities inherent in Ruby on Rails.
  • Outcomes and Results:

    • Significant progress has been made, with around 4,500 methods annotated with type signatures.
    • Developers reported improved feedback loops, fewer required tests, and enhanced code quality due to the adoption of static typing principles.
  • Conclusions and Recommendations:

    • Kayserilioglu suggests a gradual approach to type annotations, using tools like Tapioca for RBI generation and RuboCop for enforcing type checks.
    • He emphasizes that type safety can coexist with Ruby's dynamic nature without requiring extensive project rewrites.

Overall, the talk illustrates how incorporating Sorbet into Shopify’s operations has significantly improved code confidence and safety while maintaining flexibility and developer productivity. The session concludes with Kayserilioglu encouraging developers to embrace gradual typing and focus on leveraging Sorbet’s safety features without overwhelming their workflow.

00:00:12.070 Welcome to this session! I want to kick things off by showing you a picture—a poster from the show Entourage. If you watch the show, you know that these characters have all kinds of problems.
00:00:21.280 But there's a specific problem with this picture: there's something missing. I don't know if you can figure it out, but what's missing is that none of these people are actually wearing seatbelts.
00:00:30.980 It's very important because seatbelts save lives. They cut the risk of serious injury by 50% and reduce the risk of death by 45%.
00:00:40.939 Despite this, 4% of Canadian drivers, 13% of U.S. drivers, and 54% of drivers in Turkey, where I'm from, drive without a seatbelt. And these are just the drivers, not the passengers!
00:00:55.810 Why is that? There are a few reasons. First of all, people find seatbelts uncomfortable. Specifically, people over 65 find them restricting. Additionally, many people think they will never need them because they believe they're less likely to have an accident in the first place.
00:01:15.350 A lot of people actually say that their cars are safe because we test them with all kinds of crash tests. So they should be safe already—why do I need seatbelts?
00:01:30.480 But we all know that crash testing costs too much in terms of time and money. You have to demolish a whole car each time you conduct a crash test, which is wasteful. You cannot possibly test every situation or configuration.
00:01:40.939 Interestingly, the higher the stakes, the better the safety features in cars. The need for seatbelts, in cases where the stakes are higher, actually increases—despite the cars getting generally safer due to advancements in technology.
00:01:57.670 This includes roll bars and upgraded seatbelts. Now, at this point, you're probably wondering if you're in the right room. What is this guy talking about?
00:02:05.959 So, let me introduce myself. Hello, my name is Ufuk Kayserilioglu. If you want to know how my last name is pronounced exactly, come find me after the talk.
00:02:14.059 I'm a Senior Production Engineer on the Ruby and Rails Infrastructure team at Shopify. Now, let’s talk a little bit about Shopify.
00:02:26.290 Shopify operates at an immense scale. First of all, it has the largest Ruby codebase in the world. Specifically, it's a Ruby on Rails codebase. Let me share some numbers about the commerce that's happening on Shopify.
00:02:40.980 Shopify now supports over 1 million merchants around the world in 175 countries and can handle approximately 8,000 orders per minute at peak times. We generate about 12 billion dollars of sales on the platform per year.
00:02:57.750 Currently, the cumulative number of sales is around 135 billion dollars up to this point. On the engineering side, we employ about 1,500 people, most of whom work on our large monolith, which we call Shopify Core.
00:03:10.569 That's the core of our business, and Shopify Core has around 21,000 Ruby files. We also run about 150,000 tests with every code push, which takes about 20 minutes across 200 parallel workers.
00:03:24.980 We perform about 40 deploys to production daily. There's a lot at stake here, so we really need seatbelts.
00:03:35.419 Today, I'll share the story of how we adopted Sorbet to achieve these safety benefits.
00:03:41.160 For those of you who attended the previous session, you may already be familiar with Sorbet. It is a type checker for Ruby built by Stripe. It is very fast, meaning it can parse and analyze all the code in our Shopify Core monolith in seconds.
00:03:53.240 Sorbet features a highly expressive type system, operates as a static checker (it doesn't run your code), and allows you to gradually opt into the type system.
00:04:06.850 Let me give you a very simple code example to illustrate how Sorbet works. This code has a problem—can you spot it? The issue is a mismatch between the expected hash key type and what is being supplied.
00:04:19.290 The method `foo` expects a hash with string keys, but we're supplying symbol keys. It's relatively easy to spot this issue when the call site and the method definition are close together.
00:04:34.760 However, if the call site was in a separate file, would you be able to identify the issue? Probably not.
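The slide itself is not reproduced in this transcript. A minimal sketch of the kind of mismatch described, assuming a hypothetical key name `"size"` and a method body that calls `Hash#fetch`, could look like this:

```ruby
# Hypothetical reconstruction: the method looks up a *string* key.
def foo(opts)
  opts.fetch("size")
end

# A caller that passes string keys works as intended.
puts foo({ "size" => 3 })

# A caller that passes symbol keys only fails once the line actually runs.
begin
  foo({ size: 3 })
rescue KeyError => e
  puts "KeyError: #{e.message}"
end
```

Nothing in plain Ruby flags the second call until it executes, which is why the distance between call site and definition matters.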
00:04:48.720 Now let's try making our expectations explicit with Sorbet. We need to add a signature, affectionately called a 'sig', on top of the method to declare our input and output expectations. In this case, the parameter our method expects is `opts`: a hash with string keys and integer values.
00:05:07.490 The generic version of a hash understood by Sorbet is `T::Hash`, where we can declare the type of the keys and the type of the values. Similar constructs exist for arrays, sets, and enumerables.
00:05:17.990 If we add the signature and run this code, Ruby will complain that it doesn't know where the method `sig` is coming from, which is why we need the `extend T::Sig` piece.
00:05:30.000 So, while this makes our three-line function a little more verbose, it provides significant benefits. When we run this file through the Sorbet type checker, we receive a detailed error output, indicating that hash fetch has an argument specified as a symbol but is given a string.
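A sketch of what the annotated version might look like follows. The module name `Example` and the key `"size"` are illustrative assumptions, and the `rescue LoadError` branch is only a stand-in so the file still loads when the `sorbet-runtime` gem is not installed; it is not part of Sorbet.

```ruby
# typed: true
begin
  require "sorbet-runtime"
rescue LoadError
  # Stand-in (NOT Sorbet): makes `sig` a no-op when the gem is missing.
  module T
    module Sig
      def sig(*); end
    end
  end
end

module Example
  # Without this line Ruby does not know where `sig` comes from.
  extend T::Sig

  # Input: a hash with String keys and Integer values. Output: an Integer.
  sig { params(opts: T::Hash[String, Integer]).returns(Integer) }
  def self.foo(opts)
    opts.fetch("size")
  end
end

puts Example.foo({ "size" => 3 })
# Example.foo({ size: 3 })  # the kind of call the static checker is meant to reject
```

The signature costs one extra line per method, but it lets the checker compare every call site against the declared types without running the code.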
00:05:43.700 Sorbet can tell us exactly where the problem is happening, even in different environments. Suppose you have some code that relies on an external gem; let's say there's a view gem with a class called View that has a static method called render.
00:05:59.140 In this case, Sorbet wouldn't know if the static method render exists unless we inform it. To do that, we declare the interface of the View class in a Ruby interface file (an RBI file).
00:06:07.940 An RBI file functions similarly to a header file but employs Ruby syntax. By providing the RBI file, we're informing Sorbet that there's a class named View with a singleton method render that has a single parameter: file.
00:06:19.300 After adding that RBI file, Sorbet can inform us that our call to View.render is invalid because we forgot to supply the file.
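As a rough illustration, an RBI file for that gem can be as small as the stub below. The file name `view.rbi` and the argument value are assumptions; only the class name, the singleton method `render`, and its `file` parameter come from the talk.

```ruby
# typed: true
# view.rbi (hypothetical file name): declares the interface, not the behavior.
class View
  # The body stays empty; only the name and parameter list matter to the checker.
  def self.render(file); end
end

# Application code checked against that interface:
View.render("index.html")  # matches the declared single parameter
# View.render              # would be reported: the `file` argument is missing
```

Because an RBI file is ordinary Ruby syntax, the same stub also loads in a plain Ruby process, which is what makes the header-file comparison apt.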
00:06:30.780 Now that we've seen what Sorbet can do, let's talk a little bit about our journey with Sorbet at Shopify. Shopify was the first company outside of Stripe to gain access to Sorbet.
00:06:44.060 We received access to the codebase around July of 2018 and introduced it at a limited scale in our monolith, Shopify Core, in October 2018. Initially, this was only working across a few files in our codebase.
00:07:02.840 We had to put in a lot of hard work to ensure that Sorbet could run over all the files in our codebase, which we achieved by March 2019. After a couple of weeks of testing it in parallel, we made it mandatory in CI for all code in May 2019.
00:07:17.830 Shortly after that, in June, Stripe open-sourced Sorbet. As you can see, the whole process took about a year, with a small team leading the charge.
00:07:34.600 It took that long because we paid the early-adopter price. We were one of the first companies to access this codebase and had to solve many problems, sometimes in parallel to what Stripe was doing.
00:07:50.920 Let me go through some of the problems we encountered. Initially, there was no runtime component to Sorbet; there was no gem that described the sigs and performed the runtime checks.
00:08:03.480 Thus, we built our runtime component in-house; the initial version was encapsulated in a gem named Waffle Cone, which is now deprecated. We shared our work with the Sorbet team, which integrated some of our code and ideas into their official version of the runtime.
00:08:17.290 Another problem we faced was that Sorbet needs to understand all constants in your codebase. If Sorbet can't find a constant, it assumes it's an error, which became an issue since we depend on hundreds of external gems.
00:08:32.410 It wasn't possible to manually start typing every gem that we depend on, so we built an RBI generator that takes a gem and generates an RBI file describing the constants exported from that gem.
00:08:48.390 The initial version was built on a documentation generator (YARD) to get the signatures, but we realized that because YARD analyzes the gem's source code statically, we wouldn't see the dynamically generated methods.
00:09:01.240 We decided the structure of constants was more important or helpful than the signatures, so we dialed back and focused on a runtime reflection-based tool, which we open-sourced as Tapioca.
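To make the reflection idea concrete, here is a small sketch that loads a namespace and prints an RBI-style skeleton of its constants. It is not Tapioca's implementation: `ViewGem` is a made-up stand-in for a gem, superclasses are omitted, and methods are ignored entirely.

```ruby
# Hypothetical stand-in for the top-level namespace of an external gem.
module ViewGem
  VERSION = "1.0"
  module Helpers; end
  class View
    class Template; end
  end
end

# Walk the constants that exist at runtime and emit nested declarations.
def constant_rbi(namespace, indent = 0)
  pad = "  " * indent
  namespace.constants(false).sort.flat_map do |name|
    value = namespace.const_get(name)
    case value
    when Class   # checked first, because every Class is also a Module
      ["#{pad}class #{name}", *constant_rbi(value, indent + 1), "#{pad}end"]
    when Module
      ["#{pad}module #{name}", *constant_rbi(value, indent + 1), "#{pad}end"]
    else
      # Plain values are declared by their runtime class only.
      ["#{pad}#{name} = T.let(T.unsafe(nil), #{value.class})"]
    end
  end
end

puts ["module ViewGem", *constant_rbi(ViewGem, 1), "end"].join("\n")
```

Because the walk happens after the code is loaded, constants created dynamically would show up too, which a source-level tool would miss.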
00:09:16.830 However, some Ruby constructs are unsupported by Sorbet, and I'll mention those later in my talk. To address this, we built RuboCop rules to ensure our code base didn't use those constructs.
00:09:32.800 Initially, these rules were bundled in the Waffle Cone gem, but we later extracted them into RuboCop Sorbet and open-sourced that as a public gem.
00:09:48.410 Additionally, meta-programming posed a challenge because it generates methods at runtime, while Sorbet is a static type checker that doesn't know what's happening at runtime.
00:10:05.440 Since we are a Rails shop, there's a lot of meta-programming coming from Rails, including DSLs, which adds to the complexity. We initially built a DSL plugin for Sorbet in C++ that was based on simple class name matching.
00:10:20.240 The idea was that you could inform Sorbet of class name matches and run a Ruby script that generates the RBI file based on the matched class name.
00:10:30.700 While we documented and contributed this to the Sorbet project, it turned out to be slow and error-prone; it solved our initial problem but isn't used extensively today aside from specific situations.
00:10:46.490 Another problem was Rails support. Rails heavily relies on meta-programming, particularly through Active Record models and their relationships, which generates methods in the background.
00:10:59.30 Since Sorbet needs to understand what those methods are and doesn't understand Rails annotations, we built an RBI generator for our models. This is a Rake task that loads all active record models.
00:11:13.020 It generates RBI files that describe them as they would appear at runtime. Our initial code was derived from Sorbet Rails, but we plan to switch to using Sorbet Rails directly and build on top of it soon.
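The same idea applies to DSL-generated methods. In the sketch below, `MiniRecord`, `attribute`, and `has_many` are hypothetical stand-ins for Active Record, not Rails code and not the Rake task from the talk; the point is only that loading the model and asking it for its methods reveals what the DSL created.

```ruby
# Hypothetical mini-DSL: each macro defines methods at class-load time.
class MiniRecord
  def self.attribute(name)
    define_method(name) { @attrs[name] }
    define_method("#{name}=") { |value| @attrs[name] = value }
  end

  def self.has_many(name)
    define_method(name) { [] }
  end

  def initialize
    @attrs = {}
  end
end

class Shop < MiniRecord
  attribute :name
  has_many :orders
end

# Describe the model as it looks at runtime, in RBI form.
def model_rbi(model)
  body = model.instance_methods(false).sort.map do |meth|
    params = model.instance_method(meth).parameters.map { |_kind, pname| pname }
    "  def #{meth}(#{params.join(', ')}); end"
  end
  ["class #{model.name}", *body, "end"].join("\n")
end

puts model_rbi(Shop)
```

None of `name`, `name=`, or `orders` appears as a `def` in `Shop`'s source, yet all three end up in the generated stub.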
00:11:25.240 Now, let me discuss some of our results, starting with the strictness levels of our Ruby files. The strictness levels can vary, and you may have noticed my examples included strictness indicators at the top.
00:11:39.180 The strictness can start from typed: ignore or typed: false. Starting with typed: false means that Sorbet parses your files and ensures the structure is proper, but no type checking occurs.
00:11:53.520 The majority of our code base is still typed: false, but about 40% of it is typed: true, and about 1.5% is strict with typed: strict.
00:12:06.420 At each higher strictness level, more checks are performed, such as verifying the types of methods being called.
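For readers who want to produce a similar breakdown for their own code, a rough sketch follows. Sorbet recognizes five levels (`ignore`, `false`, `true`, `strict`, `strong`) and treats a file without a sigil as `typed: false`. The scanning rule here (only leading comment lines are inspected) is a simplifying assumption, and the file contents are invented samples.

```ruby
# The five strictness levels, from least to most strict.
LEVELS = %w[ignore false true strict strong].freeze
SIGIL  = /\A#\s*typed:\s*(#{LEVELS.join('|')})\s*\z/

# Return the strictness level declared in a file's leading comments.
def strictness_of(source)
  source.each_line do |line|
    text = line.strip
    next if text.empty?
    break unless text.start_with?("#")   # stop at the first line of code
    match = SIGIL.match(text)
    return match[1] if match
  end
  "false"                                # no sigil found
end

# Invented sample files, keyed by a hypothetical path.
sources = {
  "legacy.rb" => "class Legacy; end\n",
  "model.rb"  => "# typed: true\nclass Model; end\n",
  "money.rb"  => "# frozen_string_literal: true\n# typed: strict\nclass Money; end\n",
}

counts = sources.values.each_with_object(Hash.new(0)) do |src, acc|
  acc[strictness_of(src)] += 1
end
p counts
```

In practice the metrics that Sorbet emits are a more reliable source for these numbers than a hand-rolled scan.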
00:12:18.590 When we look at the percentage of checked calls, we observe that nearly 50% of method calls in our codebase are being checked with some type annotations.
00:12:30.530 This is interesting because we didn't have to add that many types to achieve this coverage. Many of the calls executed in a program typically refer to standard library or core methods that already have type annotations.
00:12:46.150 The more annotations you integrate into your code base, the better your coverage will be. To demonstrate this, I have a graph illustrating developer invocations of type checking on developer machines.
00:13:01.240 This data is not from CI; we track these invocations using a tool named dev, which all our developers use. Sorbet is integrated into that tool, so every time someone types dev type check, we can capture metrics.
00:13:19.610 On this graph, each color represents a specific user, and the height of the bars indicates the number of times type-checking was conducted each day.
00:13:34.860 The red line indicates when we implemented full Sorbet coverage in our core. You can see a lot of activity in May and June, primarily from my team.
00:13:49.910 However, true usage across developers started in July and August. The graph shows a steady increase in usage; it's simply trending up and to the right.
00:14:06.280 Next, we examined the evolution of method signatures in our code base. As previously mentioned, we started small in 2018, but we're quickly ramping up our efforts, and the rate of increase is itself increasing.
00:14:21.660 Currently, we have around 4,500 methods annotated with signatures. We also conducted user interviews with three teams at Shopify that have extensively used Sorbet in their respective parts of the codebase.
00:14:39.890 In these interviews, we sought to understand what they liked and disliked about their experiences using Sorbet. On the positive side, users mentioned they receive quicker feedback compared to running tests or CI.
00:14:57.170 They expressed satisfaction that Sorbet allowed them to write fewer tests, as many tests were designed to enforce type constraints on parameters.
00:15:09.050 Users also noted that static and runtime type checkers catch errors that might not be detected by tests alone—an interesting revelation.
00:15:25.900 This is because, by the time code reaches CI, developers tend to have already fixed their errors since they perform type-checking on their local machines, enabling immediate responses.
00:15:42.420 They can see what's going wrong and have time to correct it before pushing code to CI. The feedback loop created by Sorbet enhances code design and quality.
00:16:00.200 Additionally, users appreciated that thinking about types while coding improves code quality and encourages better overall design. They also mentioned that it creates evergreen documentation and facilitates onboarding new team members.
00:16:19.390 Of course, there were some dislikes as well. Users consistently mentioned that the syntax is verbose and not DRY.
00:16:32.060 The reason for this is that Sorbet needs to operate on top of Ruby; thus, types aren't integrated directly into the language. You must define your methods and also provide a signature to communicate expectations.
00:16:48.570 They noted it was hard to add types to existing code, although it was easier to write new code with types in mind. The challenges with existing code stem from understanding the method’s context adequately.
00:17:01.440 Moreover, users mentioned that Rails and meta-programming support are not yet complete, and they noted pitfalls along the way.
00:17:12.610 For instance, if you utilize a lot of DSLs, you will need to generate RBI files for them because Sorbet does not run your code, meaning it won't see what's happening at runtime.
00:17:31.120 Additionally, it’s important to note that there are some missing standard library signatures bundled with Sorbet. While the Sorbet team works hard to maintain these signatures, full coverage is not achievable.
00:17:46.190 As such, you might encounter errors for perfectly valid Ruby code due to Sorbet's lack of knowledge about those methods, but contributing those missing signatures is very straightforward.
00:18:00.250 Another pitfall, though less frequent, is that Sorbet doesn't perform constant lookups by inheritance. This means that if you have a class Foo with a constant Bar and a subclass Bars, accessing Bar through the subclass (as Bars::Bar) will result in an error.
00:18:15.000 So keep this in mind. It might feel strange when you first encounter it, but the solution is quick and simple: refer to the constant through the class that defines it.
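A small sketch of this pitfall, using the class and constant names from the talk with an arbitrary value:

```ruby
class Foo
  Bar = :defined_on_foo
end

class Bars < Foo
end

# At runtime Ruby walks the ancestor chain, so both lookups succeed.
via_subclass = Bars::Bar  # the form the talk says Sorbet reports as an error
via_owner    = Foo::Bar   # naming the defining class is the form to use instead

puts via_subclass.equal?(via_owner)
```

The code is valid Ruby either way; the difference is only in what the static checker can resolve.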
00:18:30.000 A more significant consideration, especially for Rubyists, is that dynamic superclasses and mixins are not supported by Sorbet. Methods that return the class or module to inherit from or include pose challenges.
00:18:46.560 Since Sorbet does not run through your code, it cannot enforce types for dynamic superclasses or mixins, which means those areas will simply raise errors.
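To illustrate what "dynamic" means here, the sketch below contrasts runtime-selected ancestors with statically named ones. All names (`Auditing`, `mixin_for`, `base_error`, and the classes) are hypothetical.

```ruby
module Auditing
  def audited?
    true
  end
end

# Hypothetical helpers that choose an ancestor at runtime.
def mixin_for(feature)
  feature == :audit ? Auditing : Module.new
end

def base_error
  StandardError
end

# Dynamic forms: the ancestor is only known once the helper has run,
# so a checker that never executes code cannot see it.
class Order
  include mixin_for(:audit)
end

class LegacyError < base_error
end

# Static equivalents: the ancestor is a constant written in the source.
class Invoice
  include Auditing
end

class ModernError < StandardError
end

puts Order.new.audited?
puts Invoice.new.audited?
```

Where the dynamic form can be replaced by a literal constant, as in `Invoice` and `ModernError`, the checker has everything it needs.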
00:19:01.820 Finally, let's talk about the runtime type checking overhead. The numbers we received from Stripe indicate that checking types at runtime results in about a 7% performance overhead.
00:19:15.790 While that isn't an issue for us in development and testing environments—where we run full runtime type checking—we likely want to avoid it in production.
00:19:28.920 So whether or not you want runtime type checks is something to consider carefully.
00:19:43.370 Now, let's discuss how to get started adopting Sorbet in your own codebase. First, I recommend visiting the Sorbet playground to experiment.
00:19:59.010 This tool, built on WebAssembly, runs inside your browser; you can type code and receive immediate feedback on what works and what doesn't. It's also an excellent way to share Sorbet snippets through unique URLs.
00:20:17.860 Step two: add Sorbet to your project. This is simple—add the Sorbet gem and the Sorbet runtime gem to your Gemfile, then run `bundle exec srb init`.
00:20:29.900 This command does several things: it loads each Ruby file, determines the appropriate strictness level, identifies methods at runtime, and generates a number of RBIs.
00:20:40.150 However, our codebase is too large for this approach, which takes a long time to run, so we prefer to be intentional about setting strictness levels.
00:20:55.040 In our case, we add Sorbet and the runtime and use Tapioca for RBI file generation along with fixing code iteratively.
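As a configuration sketch of the setup step, the Gemfile fragment below adds the two gems mentioned in the talk. The `group: :development` placement is a common convention rather than something stated in the talk, and the fragment is meant to be merged into an existing Gemfile, not run on its own.

```ruby
# Gemfile (fragment)
gem "sorbet", group: :development  # the static checker, needed only where you type check
gem "sorbet-runtime"               # provides `sig` and runtime checks in every environment

# Then, from the project root:
#   bundle install
#   bundle exec srb init   # the one-time setup described above
#   bundle exec srb tc     # run the type checker
```

For a large codebase, the initialization step can be skipped in favor of the more intentional approach described above.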
00:21:07.320 An important step is to start typing new code rather than overwhelming yourself trying to add types to existing code all at once.
00:21:20.300 It’s easier to type when writing new code, although you're encouraged to add types to existing code when it's feasible.
00:21:34.750 Using RuboCop Sorbet is another recommendation. We added specific RuboCop rules that create template signatures for your methods, which makes adding types more accessible.
00:21:44.630 Additionally, lean on gradual typing. This is crucial because you can introduce type annotations incrementally. It’s fully opt-in, and you can dictate your pace for improving coverage.
00:22:01.390 While you're doing this, be considerate about not disrupting your colleagues' workflows. Keeping a positive experience with the tool can strongly influence adoption.
00:22:16.230 Also, don't over-type. It's acceptable to use simpler signatures; you need not be overly stringent with types because Sorbet is an added safety feature.
00:22:32.310 Remember, your colleagues shouldn't require a PhD in type theory to make changes. A clear understanding of signatures is critical for encouraging usage within teams.
00:22:51.980 Lastly, track your progress. Sorbet generates metrics; just pass the `--metrics-file` flag to receive a JSON file that can be parsed easily.
00:23:08.610 This allows you to set up nightly tasks and dashboards to track your progress, which can help measure advancements over time.
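A nightly job along those lines might look like the sketch below. The JSON shape (a top-level `"metrics"` array of name/value pairs), the metric names, and the numbers are all assumptions made for illustration; inspect the file your Sorbet version actually writes before relying on any of them.

```ruby
require "json"

# Invented sample payload; real metric names and values will differ.
SAMPLE = <<~JSON
  {"metrics": [
    {"name": "example.files.sigil.false",  "value": 6},
    {"name": "example.files.sigil.true",   "value": 3},
    {"name": "example.files.sigil.strict", "value": 1},
    {"name": "example.sends.total",        "value": 200}
  ]}
JSON

# Turn per-sigil file counts into percentages of all counted files.
def sigil_breakdown(json_text)
  metrics = JSON.parse(json_text).fetch("metrics")
  counts = metrics.each_with_object({}) do |metric, acc|
    name = metric.fetch("name")
    next unless name.include?("files.sigil.")   # skip unrelated metrics
    acc[name.split(".").last] = metric.fetch("value")
  end
  total = counts.values.sum
  counts.transform_values { |value| (100.0 * value / total).round(1) }
end

p sigil_breakdown(SAMPLE)
```

Storing one such breakdown per night is enough to draw the kind of trend lines shown earlier in the talk.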
00:23:24.590 Thank you for your attention! Please don't forget to fasten your seatbelts. If you want to reach out to me, feel free to connect on Twitter or GitHub, or join us at the Shopify booth for any questions.
00:23:36.130 I am more than happy to answer them here or at the booth.
00:24:00.470 The elephant in the room is that Ruby is not inherently a typed language.
00:24:05.500 If we were to rewrite Shopify in another language, wouldn’t we use a statically typed one? The simple answer is no.
00:24:12.480 We don't plan to rewrite Shopify soon, but we want the safety guarantees of static types as much as possible. Sorbet provides a great tool for areas requiring safety.
00:24:25.470 In less critical areas or tangential modules, you may not need to add types. This way, you can enjoy the best of both worlds without requiring extensive resources.
00:24:35.640 It's also worth highlighting that the team responsible for the final push to enable Sorbet across all files consisted of three people, including myself.
00:24:46.160 Thank you all for coming!