Yuki Nishijima

Static Typo Checker in Ruby

RubyKaigi 2017
http://rubykaigi.org/2017/presentations/yuki24.html

Since 2.3.0, Ruby comes with a dynamic typo checker called the did_you_mean gem, which helps find a bug caused by a typo. However, there's one argument against its design: it runs a naming check at runtime.

So what makes it difficult to implement a static typo checker? What are the technical challenges to build it? Is Type really necessary? In this talk, we'll discuss techniques for how to write a static typo checker by looking at examples that find an undefined method without running Ruby code. Join us to learn about the future of Ruby's typo checker.

00:00:00.149 The next speaker is Yuki Nishijima. The title of the talk is 'Static Typo Checker in Ruby'.
00:00:21.980 If you are an English speaker, you can call me Eugene. You can follow me on GitHub and Twitter; my handle is @yuki24. If you have any questions, feel free to send me a message. I work on Ruby core, and I also created a gem called 'did_you_mean', which is now part of Ruby.
00:00:42.750 I usually contribute to the Ruby community by working on the gem engine, and I also propose new features to make Ruby more accurate and robust. I work for a startup in New York called RIT, and I live in New York.
00:01:07.640 I came here to Hiroshima by way of Shanghai. I had a long layover there, so I decided to explore the city.
00:01:24.390 Shanghai is a beautiful city, and when I was there, I visited the Bund, which is lovely. I also saw Shanghai Tower, one of the tallest buildings in the world, although I couldn't go to the top.
00:01:38.150 Another interesting thing I encountered at the airport was a karaoke box, which is a small enclosed space for singing. If you ever find yourself bored at the airport, you can sing there. I wanted to try it, but unfortunately, it was out of service at that time.
00:01:56.759 Soon, I will talk about the static typo checker, but before I dive into that, I want to cover a few preliminary topics. First, let's discuss what the 'did_you_mean' gem is.
00:02:16.790 Many of you might be familiar with the gem from Japan, but I want to explain what it does for those who aren't. Let's consider an example where your code calls a method on a plain string. If you mistype the method name, the call fails.
00:02:38.740 This is because methods can have aliases, and in pure Ruby, you don't actually have to call that method directly. However, if you require the 'did_you_mean' gem, it will look up the correct method name as soon as an error occurs.
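As a rough illustration of the runtime behavior described here, the snippet below calls a misspelled String method and inspects the resulting error. The string and the typo are made up for the example. `corrections` is the accessor that did_you_mean adds to name errors when it is loaded, so the code checks for it before using it.

```ruby
# A misspelled method on a String: the real name is String#start_with?.
error = begin
  "Yuki".starts_with?("Y")
  nil
rescue NoMethodError => e
  e
end

# The name lookup happens only now, after the call has already failed.
suggestions = error.respond_to?(:corrections) ? error.corrections : []

puts "undefined method: #{error.name}"
puts "did you mean?     #{suggestions.join(', ')}" unless suggestions.empty?
```

Nothing is suggested until the faulty line actually runs, which is the design trade-off the rest of the talk addresses.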
00:03:04.550 Let me also talk about recent changes in the 'did_you_mean' gem. The next version of the gem, which will be 1.2, will be included in Ruby 2.5. However, it will not be compatible with older versions of Ruby, particularly with Ruby 2.3 and below.
00:03:22.630 For instance, version 1.1.2 of 'did_you_mean' is compatible only with Ruby 2.3 and lower versions, and if you are using Ruby 2.2 or earlier, you will have to use version 0.99.
00:03:41.780 The reason I'm doing this is that I want to leverage the latest features of the Ruby core so that the project can be more accurate and robust.
00:04:00.739 New features in Ruby are quite exciting. For example, if you are using a struct object, you can now call a method using its hash-like syntax, passing a key for the method you want to call.
00:04:12.769 Previously, if an undefined method was called, it would silently do nothing, but with the latest version, it will actually raise an exception, providing more informative feedback.
00:04:31.240 This has also led to the resolution of some confusing cases. For example, in earlier versions, if you tried to assign an undefined variable, it might suggest a completely unrelated name as a solution.
00:04:51.800 In fact, these suggestions could sometimes confuse users rather than help them. Therefore, in the recent version, suggestions are now suppressed if the receiver is 'nil'.
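A minimal sketch of the nil-receiver case, assuming a hypothetical `Profile` class with a mistyped instance variable. It relies only on `NameError#receiver`, which reports the object the failed call was sent to. The guard at the end illustrates the idea and is not the gem's actual code.

```ruby
# Hypothetical typo: @fulll_name was never assigned, so it evaluates to nil.
class Profile
  def initialize(full_name)
    @full_name = full_name
  end

  def shout
    @fulll_name.upcase
  end
end

error = begin
  Profile.new("Yuki Nishijima").shout
  nil
rescue NoMethodError => e
  e
end

# Suggesting NilClass methods here would point away from the real mistake,
# so a checker can simply skip suggestions when the receiver is nil.
skip_suggestions = error.receiver.nil?
puts "receiver is nil, skip suggestions: #{skip_suggestions}"
```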
00:05:10.590 Another adjustment was made regarding access to private methods. Earlier versions sometimes suggested private methods in error messages even when they could not be called from that context, which didn't make sense. With Ruby 2.4, a new method called 'NoMethodError#private_call?' was introduced, allowing better detection of these cases.
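The distinction can be sketched with `NoMethodError#private_call?`. The `Account` class and its method names are hypothetical. The idea is that private methods are only worth suggesting when the failed call was written without an explicit receiver.

```ruby
class Account
  def audit
    # Receiver-less call with a typo (the parentheses make it a method call).
    secret_tokn()
  end

  private

  def secret_token
    "t0ken"
  end
end

# Runs the block and returns the NoMethodError it raises, if any.
def capture
  yield
  nil
rescue NoMethodError => e
  e
end

outside = capture { Account.new.secret_token } # private method, explicit receiver
inside  = capture { Account.new.audit }        # typo in a receiver-less call

puts "outside: private_call? = #{outside.private_call?}"
puts "inside:  private_call? = #{inside.private_call?}"
```

In the first case a private name is not a useful suggestion, because the caller could not invoke it anyway. In the second case it is exactly what the author meant to type.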
00:05:43.880 The newer version of 'did_you_mean' continues to evolve, particularly in its experimental features. For instance, it used to offer various suggestions for column names in databases, but this functionality has been removed as it never fully aligned with the gem's purpose.
00:06:05.230 However, I still hope to find a way to activate some of these experimental features for those who want to experiment with them.
00:06:26.190 For example, if you define a full name incorrectly in a project, 'did_you_mean' can suggest the correct instance variable name based on the error caused by that typo. This enhances the capability significantly.
00:06:51.420 The same applies for hash objects. If you call 'fetch' on a hash with a non-existent key, the current version can now suggest the correct key name even if a typo was made in the original.
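A small sketch of how key suggestions can be derived, assuming a made-up `settings` hash. `KeyError#key` and `KeyError#receiver` are available from Ruby 2.5, and `DidYouMean::SpellChecker` is the gem's dictionary-based checker. The wiring between them here is illustrative only.

```ruby
require "did_you_mean"

settings = { name: "Yuki", email: "yuki@example.com" }

error = begin
  settings.fetch(:emal) # hypothetical typo for :email
  nil
rescue KeyError => e
  e
end

# The error itself carries the missing key and the hash it was looked up in,
# so no monkey patching of Hash#fetch is needed to build a dictionary.
checker     = DidYouMean::SpellChecker.new(dictionary: error.receiver.keys)
suggestions = checker.correct(error.key)

puts "key not found: #{error.key.inspect}, did you mean #{suggestions.inspect}?"
```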
00:07:11.820 With the new implementation, this feature is made possible without requiring complex patching.
00:07:36.230 If you define a user class and make a mistake in the initializer, older versions of Ruby might not alert you to the typo. However, the latest version will provide accurate warnings if it finds a method name close to 'initialize'.
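One way such a warning could be produced is a `method_added` hook that compares each newly defined method name against 'initialize'. The `InitializerTypoWatcher` module below is a hypothetical sketch of that idea, not the gem's implementation.

```ruby
require "did_you_mean"

# Hypothetical helper for illustration.
module InitializerTypoWatcher
  CHECKER  = DidYouMean::SpellChecker.new(dictionary: ["initialize"])
  WARNINGS = []

  # Ruby calls this hook on the class each time an instance method is defined.
  def method_added(name)
    super
    return if CHECKER.correct(name.to_s).empty?

    WARNINGS << "#{self}##{name} looks like a typo. Did you mean? initialize"
  end
end

class User
  extend InitializerTypoWatcher

  def initialze(name) # typo: this never runs on User.new
    @name = name
  end

  def name
    @name
  end
end

puts InitializerTypoWatcher::WARNINGS
```

The check runs when the class body is loaded, so the typo is reported before any `User` object is created.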
00:08:00.790 Furthermore, the latest version enables using 'did_you_mean' as a public interface, allowing developers to provide methods as dictionaries for typing corrections.
00:08:23.600 For instance, if you mistakenly write 'sin' instead of 'send', it will correctly guide you to 'send'.
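The public interface referred to here is `DidYouMean::SpellChecker`, which takes any list of candidate names as its dictionary. The words below follow the example in the gem's README rather than the one mentioned in the talk.

```ruby
require "did_you_mean"

# The dictionary can be any list of candidate names.
spell_checker = DidYouMean::SpellChecker.new(dictionary: ["email", "fail", "eval"])

p spell_checker.correct("meail") # => ["email"]
p spell_checker.correct("zzzzz") # => []
```

Because the dictionary is supplied by the caller, the same checker works for method names, command names, or configuration keys.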
00:08:38.390 Recently, a gem called 'Fastlane' incorporated the 'did_you_mean' features to correct action names automatically, extending the capabilities of dynamic typo checking.
00:08:58.890 As developers begin to utilize this functionality effectively, it will greatly enhance code quality and reliability.
00:09:17.840 Switching back to the main topic of the static typo checker, we need to discuss some of the existing problems that 'did_you_mean' has.
00:09:34.270 The primary issue is obviously the runtime lookup it uses to detect issues. Say you add a method call to a file and start a Ruby process. If it's a Rails app, the typo is only found once a request comes in and the application tries to process it.
00:09:51.570 This represents a lot of overhead because every time you deal with an error, the system has to run through the entire Ruby process before finding the incorrect method.
00:10:14.300 When loading your application, MRI behaves as a C program. It attempts to load the dependencies and require the necessary files before executing your Ruby code.
00:10:43.420 This process can be lengthy and it's essential for Ruby to load all the gems and the dependent libraries before the actual code execution.
00:11:04.690 Once it loads your code, as you might know, it uses metaprogramming techniques. For example, if you are using 'has_many' associations in Rails, it dynamically defines methods when accessing database tables.
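To see why this complicates static analysis, consider a hypothetical stand-in for an association macro. It is not ActiveRecord's code, only a sketch of the same `define_method` technique.

```ruby
# Hypothetical stand-in for an association macro.
module Associations
  def has_many(name)
    # The reader method is created while the class body is being evaluated.
    define_method(name) { [] }
  end
end

class User
  extend Associations
  has_many :posts
end

# The method exists at runtime, yet no `def posts` appears anywhere in the source.
puts User.new.respond_to?(:posts)
puts User.new.posts.inspect
```

A tool that only collects `def` statements would report `user.posts` as undefined, which is one source of false positives for a static checker.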
00:11:25.040 These internal definitions can change over time, especially if there are multiple methods involved in a single operation.
00:11:44.190 When the application is running behind a web server, each incoming request gets processed through a cascade of method calls all the way down to the controller.
00:12:11.240 This lengthy process means there are multiple opportunities for errors to occur, and identifying where they arise can be challenging.
00:12:35.550 As a result, static analysis is an attractive solution for reducing overhead by detecting issues before they become runtime errors.
00:12:49.990 Now let’s take a look at a quick demo of how this can work with the 'did_you_mean' gem.
00:13:02.360 This is a test file for 'did_you_mean', and I intentionally left some typos in the file. I will run the analyzer to showcase its functionality.
00:13:38.600 As we can see, the static checker is able to identify typos and suggest corrections based on the existing method definitions.
00:14:05.270 The goal here is to make sure that the static typo checker can work fine without relying on executing Ruby scripts.
00:14:25.780 When running the static checker, it should check names without executing any of the Ruby scripts under analysis, which makes it very efficient.
00:14:50.350 This holds true when checking if the methods exist or not, ensuring everything works smoothly.
00:15:02.000 For this static analysis, the important part is using a parser. Ruby has several tools available to parse code.
00:15:26.260 The most well-known one is the 'Parser' gem, which allows you to analyze Ruby syntax trees.
00:15:44.780 Another useful tool is 'RubyVM::InstructionSequence', which allows for analyzing instructions at a lower level.
00:16:05.430 In our case, we have developed a prototype static typo checker using 'RubyVM::InstructionSequence', rather than relying on the 'Parser' gem.
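A minimal sketch of reading call sites from compiled instructions, assuming MRI. It scans the text produced by `disasm`, whose format is an implementation detail that can differ between Ruby versions, so this is an illustration and not the prototype shown in the talk.

```ruby
# Source under analysis. It is compiled but never run.
source = <<~RUBY
  name = "Yuki"
  name.upcsae
RUBY

iseq = RubyVM::InstructionSequence.compile(source)

# Call sites show up in the disassembly as `mid:<method name>`.
called = iseq.disasm.scan(/mid:([A-Za-z_]\w*[?!=]?)/).flatten.uniq

puts called.inspect
```

The misspelled `upcsae` is visible in the instruction stream even though the code was never executed, and no extra gem is required.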
00:16:19.560 The aim is to avoid unnecessary dependencies while keeping the analysis straightforward and efficient.
00:16:45.200 This static analyzer can identify undefined method calls, which is a great tool to ensure code accuracy.
00:17:04.100 For a clearer understanding, when analyzing classes and instances, this code can detect if methods that are called do not exist.
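The overall idea can be sketched end to end. Note that this example swaps in Ripper from the standard library instead of the InstructionSequence-based prototype, because its S-expression output is easier to walk. The `collect_names` helper and the sample source are hypothetical, and only `def` nodes and `receiver.method` calls are handled.

```ruby
require "ripper"
require "did_you_mean"

SOURCE = <<~RUBY
  class User
    def full_name
      "Yuki Nishijima"
    end
  end

  user = User.new
  user.full_nmae
RUBY

# Recursively collect names from `def` nodes and `receiver.method` call nodes.
def collect_names(node, defined, called)
  return unless node.is_a?(Array)

  case node[0]
  when :def
    defined << node[1][1]
  when :call
    called << node[3][1] if node[3].is_a?(Array)
  end

  node.each { |child| collect_names(child, defined, called) }
end

defined = []
called  = []
collect_names(Ripper.sexp(SOURCE), defined, called)

# Report only calls that are undefined in the file AND close to a defined name.
checker = DidYouMean::SpellChecker.new(dictionary: defined)
typos = (called - defined).each_with_object({}) do |name, found|
  suggestions = checker.correct(name)
  found[name] = suggestions unless suggestions.empty?
end

p typos
```

Restricting reports to names that have a near match keeps built-in methods such as `new` from being flagged, at the cost of missing typos whose intended target is defined outside the analyzed source.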
00:17:35.240 As Ruby is a dynamic language, it can lead to many complex behaviors that are hard to anticipate, hence the need for thorough analysis.
00:17:55.020 The process becomes intricate because Ruby does not solely rely on static definitions; you need to parse through enough of the project to ensure you get the correct method calls.
00:18:20.680 Despite these complexities, static analysis remains powerful and useful as it helps detect potential issues early.
00:18:35.300 This is essential to keep your Ruby projects robust and efficient at scale.
00:18:53.700 By using the right tools and methodologies, we can hopefully reduce the complexity in finding errors before they reach a production environment.
00:19:14.320 In closing, every slice of complexity we eliminate from Ruby makes it approachable and keeps it competitive amongst other programming languages.
00:19:41.350 It's vital for us as developers to aid our toolsets in allowing more effective and efficient coding practices moving forward.
00:20:14.560 If you have any further inquiries or would like to discuss enhancements to Ruby, do not hesitate to ask. Thank you for your time.
00:21:12.180 A participant asked how the 'did_you_mean' gem is currently structured. The speaker explained that the gem relies on runtime checks to track down method ambiguities, which can lead to slower performance.
00:21:36.200 Further, the speaker mentioned they are emphasizing optimization efforts to enhance the performance of this gem in future versions.
00:22:00.190 In response to another question, they discussed ways to integrate static analysis into the Ruby development process.
00:22:25.020 The importance of minimizing false positives was noted, particularly as Ruby applications expand and introduce complex functionality.
00:22:49.340 The discussion also touched on the possibility of leveraging documentation to aid in method detection.
00:23:12.420 As suggestions for undefined methods improve over time, it is crucial to create environments where clarity thrives.
00:23:35.850 Finally, the session concluded with the speaker expressing gratitude and inviting a continued dialogue among Ruby developers.