Introducing Sorbet Into Your Ruby Codebase

wroc_love.rb 2024

00:00:09.480 Good afternoon! Thank you for sticking with me. This is being recorded, and I want to note that it is super packed here. It's great, the best conference!
00:00:21.400 So, who am I? I think I'm a doer. I have a nice side job writing Ruby at GitHub. It's nice. I started writing React there as a fullstack developer and then transitioned into the issues platform team.
00:00:27.720 If the issues platform is not working very well, you can hit me up on Twitter, and I will try to make it better. So, as I said, I was a fullstack developer. Over the years, I have worked with various programming languages such as Python, Clojure, TypeScript, Kotlin, Java, and many more. But now, I've been working with Ruby for two years.
00:00:49.280 I don’t really have a strong opinion about which is better: static or dynamic typing; it depends on the context. Yes, I am quite happy with Ruby now, but maybe if you ask me in a year, I might say ‘never again’—but probably not. I’m having a good time!
00:01:10.880 Now, a quick check. Please raise your hands if you know what Sorbet is. Okay, that's a decent amount—over half, that's great! And who here is currently using Sorbet? It seems like a little bit less, but still more than zero, probably more than five. So, that’s great!
00:01:37.520 So, what is Sorbet? To put it briefly, Sorbet is a gradual type checker for Ruby. You can add typing to your Ruby codebase, but you don't have to go all in; you can just add a bit of typing here and there.
00:01:58.439 Why would you use Sorbet? I’ll share my opinion, based on why I began using it at work. I’m an engineer on a team with a fairly old codebase, about ten years old. It is entangled and complicated, and we wanted to refactor it. We believed that adopting Sorbet would make refactoring easier, which in turn would let us add features and improve performance.
00:02:30.519 Initially, I wasn’t sure about using Sorbet because I had never used it before. I thought it might be hard to get accustomed to its syntax. However, I wanted to make code refactoring easier so we could add features and enhance performance.
00:02:53.120 So, let's delve into the syntax. To use Sorbet, you add a `# typed:` magic comment at the top of the file, above or below your frozen string literal comment; it's always there.
00:03:06.680 Then you choose a strictness level: strict, true, false, or strong (though strong is no longer recommended). In this case, strict means every method has to have a signature. We will get into signatures in a moment. After that, you require `sorbet-runtime`, because Sorbet is both a static type checker and a runtime checker, and the runtime checks can be enabled or disabled.
00:03:44.520 Next, we always have to extend a module, `T::Sig`, and then we can move on to the signature part. A signature is just standard Ruby code, which is good and bad at the same time. For example, we have a method, `bar`, that takes one parameter, `x`, which must be an integer, and it returns a string. It's straightforward.
00:04:07.159 Additionally, Sorbet has special syntax for opting a value out of type checking, which gives us the escape hatches that gradual typing relies on. As I said, gradual typing means the `bar` method can still be called with a string when it actually requires an integer. If we execute this code, it throws a runtime exception because the argument is not an integer. However, if we disable Sorbet's runtime checking, it will just work.
00:04:50.160 To sum it up, the key syntax for Sorbet is gradual typing with escape hatches, allowing you flexibility when incorporating types.
00:05:11.600 Alternatively, instead of adding types inline, we have RBI (Ruby Interface) files. This is normal Ruby, but we only define method signatures without implementations. For instance, we have a small method that’s typed in an RBI file, which defines the signature.
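An RBI file for an untyped class might look like the fragment below. The class `Greeter`, its method, and the file path are hypothetical. The file is only read by the static checker and is never loaded by Ruby, which is why the method body is empty and `sig` is used without extending `T::Sig`.

```ruby
# typed: strict
# Hypothetical location: sorbet/rbi/shims/greeter.rbi

class Greeter
  # Signature only; the real implementation lives in the untyped source.
  sig { params(name: String).returns(String) }
  def greet(name); end
end
```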
00:05:37.520 This way, you can use this for gems that are not typed. Now, let's talk about how we are using Sorbet in practice.
00:06:05.440 As I mentioned, my goal was to get all the files we own to at least `typed: true`. My first challenge was that Sorbet does not assume a module will end up mixed into something that has `Kernel`. For example, inside a module you need to include `Kernel` before you can call `raise`; otherwise, you get an error from Sorbet's static checker.
00:06:57.120 In Ruby, we often use modules to define different behaviors. For instance, we can have a module that specifies an interface. If that module declares a to-string method, our user class can include it, implement the method, and behave accordingly.
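A small sketch of the `Kernel` workaround. The module and class names are hypothetical. The code is plain Ruby and runs without any gem; the `include Kernel` line only matters to the static checker.

```ruby
# typed: true

# Hypothetical module for illustration.
module Validation
  # Without this include, Sorbet's static checker reports that `raise`
  # does not exist on Validation, because it does not assume the module
  # will be mixed into an Object.
  include Kernel

  def ensure_present(value)
    raise ArgumentError, "value is required" if value.nil?

    value
  end
end

class Form
  include Validation
end

puts Form.new.ensure_present("ok")
```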
00:07:16.400 However, we can encounter more complex situations involving dependencies. For example, we might have a user class that consists of several files, yet the way we include modules makes our codebase difficult to manage.
00:07:51.360 Sorbet provides an experimental feature called `requires_ancestor`. It lets a module declare that whatever includes it must have a given ancestor, which simplifies dependencies between modules and module inclusion.
00:08:15.720 Though this is experimental, enabling this feature can significantly improve the usability of Sorbet when handling complex codebases.
00:08:45.319 Next, we encounter the challenge where modules cannot change type. For instance, if an authorizer module depends upon a user, it cannot dynamically bind to a specific user type unless explicitly defined. This can become quite cumbersome.
00:09:05.440 This pattern appears often in code and can be a bit annoying. To handle this, we sometimes end up using casting instead of binding, which doesn't always yield favorable results.
00:09:27.720 Alternatively, you could also create an RBI file and redefine it. This method has its pros and cons, but it can be a good solution to improve type safety without altering much of your existing code.
00:09:49.549 We also face challenges with type inference. For instance, if a loop complicates the method implementation, Sorbet might struggle to infer the correct type, which can require us to write additional type hints to clarify our intentions.
00:10:03.840 While Sorbet offers generics, the type inference and generics syntax can become challenging and make code harder to read at times. The typing itself could end up being longer than the actual implementation.
00:10:29.760 When first adopting Sorbet, we sometimes overlook the more exotic features it provides, which leads to more complications later.
00:10:53.600 Tapioca is a companion tool developed by Shopify. It provides a replacement for some of the RBI generation features typically found in Sorbet but now as a CLI application.
00:11:19.600 It can download pre-generated RBI files for specific gems, making it useful for handling Active Record models and other meta-programming tasks.
00:11:42.400 However, implementing these features requires effort, and I found that integrating RBI generation can be slow. It is also challenging to get started with these generation features.
00:12:01.920 At GitHub, we’ve implemented this to a certain degree. We decided to use `typed: true` as our lower bound, which meant we didn’t have to write explicit signatures but still had to do some work to keep the codebase stable.
00:12:39.040 The downside is that while we have fewer signatures, we end up with many more runtime checks, and those provide only limited assurance, so they do not increase our confidence in the system.
00:13:10.480 In hindsight, we probably should have started with `typed: strict` rather than relying on `typed: true`, which doesn’t enforce the same guarantees and adds overhead that doesn’t contribute to code safety.
00:13:39.840 We resorted to `T.unsafe` more often than we would have liked. While it can speed up development, it provides no guarantees, which makes our codebase more fragile.
00:14:03.680 Type signatures can also be disabled at runtime, depending on how you configure Sorbet on startup. Inline type assertions are different: even when their checks are disabled, they remain in your code as method calls that cannot easily be optimized away, which slows things down.
00:14:34.480 When we conducted micro-benchmarks, we noticed that type assertions, even those that appear innocuous, add significant overhead to our applications, which can greatly affect performance.
00:15:02.320 However, the biggest benefit is improved type assurance. Tooling built on the Language Server Protocol (LSP), such as Solargraph, also allows quicker navigation within the codebase.
00:15:35.360 Generating RBI files efficiently is also empowering, enhancing our workflow even though it may be slow to initially integrate.
00:15:59.440 Despite these advantages, setting up the generation features can be a hurdle and might put developers off initially, but once it is up and running, it adds immense value.
00:16:26.560 For new developers, the learning curve can be steep, and many might not find immediate value in these type systems, especially if they complicate straightforward tasks.
00:16:52.760 Overall, while Sorbet can be beneficial for large codebases seeking safer refactorings and enhancements, it may not suit everyone, especially those starting fresh.
00:17:24.760 Okay, there's one question.
00:17:34.680 I have a question because in our codebase, we have pretty big projects and we use ROM types a lot. We validate types only for input parameters, such as in controller actions and worker parameters.
00:17:51.600 We check the CSV file from clients and validate it during processing. So, what is the benefit of introducing this static or runtime typing across all classes instead of just validating inputs?
00:18:19.120 The benefit lies in filtering out potential errors early. In a statically typed language, the types prevent you from calling methods that a value does not support,
00:18:52.000 which makes your code more robust and reduces runtime errors. It’s about validating at a deeper level, beyond just input.
00:19:05.680 Thank you for the insightful talk. I was wondering how runtime checks are implemented. Is it perhaps using the TracePoint API?
00:19:43.520 I have no clue, unfortunately. There are quite a few great people working with this, and I am trying to dive into it. But I don't have the exact answer right now.
00:20:05.680 Yes, thanks for the great talk again. Do I understand your sentiment correctly that you prefer more automation in typings? You wish for types to be incorporated seamlessly within the code instead of obstructing implementations?
00:20:23.560 Yes, that's right!