Keynote: Parsing RBS

RubyKaigi 2023

00:00:09.679 Usually, we start the presentations by introducing the speakers. You will all be familiar with Steve, a software engineer at Brock. I'm from Oita, a beautiful place that I love, and I think it’s great to be here.
00:00:21.180 I know this is the standard in the community, but I don't think it adds much interaction to this presentation.
00:00:28.740 The theme of my presentation at RubyKaigi Matsumoto is that I want to answer a question you must have.
00:00:37.800 I’m excited to be here and to discuss what RBS is about.
00:00:43.200 Social aspects are also significant. I visited both places about ten years ago and took some interesting photos.
00:01:01.620 In Oita, the bus station has a special timetable, and we need a reservation to take the bus from there.
00:01:16.560 In fact, social aspects are quite prominent, and the last one I want to mention is the community park.
00:01:24.299 This will be a technical presentation; however, before diving into the details, I would like to share some recent updates regarding RBS. The latest versions are RBS 3.1 and Steep 1.4, both of which were released last month.
00:01:51.360 These updates build on new syntax. Two new syntaxes were introduced in RBS 3.0. The first is the class/module alias syntax, which defines aliases for long-named classes and modules.
00:02:10.800 This corresponds to the way you assign a class or module to a constant and then use that constant as a class name or module name in your Ruby code.
00:02:30.480 The other syntax is for importing classes from long namespaces, which is similar to import statements in Java or using directives in C#. However, this is specific to RBS and has no direct counterpart in Ruby.
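As a rough illustration of the two syntaxes just described, the sketch below shows the Ruby idiom of assigning a class to a shorter constant, with the RBS 3.0 counterparts written as comments because they belong in `.rbs` files. The names `MyApp::Parsing::TokenFactory`, `Factory`, and `TF` are hypothetical and chosen only for this example.

```ruby
# Hypothetical long-named class, used only for illustration.
module MyApp
  module Parsing
    class TokenFactory
      def build(type)
        [type, "value"]
      end
    end
  end
end

# Ruby: assign the class to a shorter constant and use it as a class name.
Factory = MyApp::Parsing::TokenFactory
token = Factory.new.build(:tIDENT)

# RBS 3.0 counterparts (these lines would go in a .rbs file):
#
#   Class alias, mirroring the constant assignment above:
#     class Factory = MyApp::Parsing::TokenFactory
#
#   `use` directives, written at the top of the file; these have no
#   runtime equivalent in Ruby:
#     use MyApp::Parsing::TokenFactory
#     use MyApp::Parsing::TokenFactory as TF
```

The alias is visible to every file, while a `use` directive only shortens names within the RBS file that declares it.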
00:02:51.840 With RBS 3.1 and Steep 1.4, some features have been improved. The language server now supports signature help, and completion in RBS files has been enhanced.
00:03:05.879 When you type a parenthesis after a method call, signature help pops up automatically, displaying the documentation and types of the method parameters. You can input your method arguments in this interface.
00:03:29.940 There's also an improvement in type name completion. For example, when you type "Chan," a list will pop up, and it resolves to the parsec token factory.
00:03:48.540 Additionally, the system now inserts a shorter name based on the current module nesting context.
00:04:13.680 However, a question arises: Why do we have two different type names for the same input? The parameter type is a fully qualified type name, while the return type uses a shorter name. Let's delve into this detail.
00:04:39.240 When you are typing the parameter type in RBS, the syntax is broken because the return type is still missing from the method definition, so the current RBS parser fails.
00:05:05.720 The module nesting context is lost, so completion falls back to a backup mode and inserts an approximate type name instead.
00:05:24.860 Conversely, when typing the return type, it is syntactically correct. The parsing of the RBS file succeeds, and the module nesting context is computed, allowing the resolution of the shortest relative type names.
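To make the role of the module nesting context concrete, here is a minimal sketch of how a shortest relative type name could be computed once the nesting is known. It is not the actual RBS or Steep implementation; `resolve`, `shortest_name`, and the list of known names are assumptions made for this example, and the lookup is a simplified version of lexical constant resolution.

```ruby
# Resolve a relative name lexically: look up its first component in the
# innermost namespace first, then in each enclosing one, then at top level.
def resolve(relative, nesting, known)
  head, *rest = relative
  nesting.length.downto(0) do |depth|
    base = (nesting.take(depth) + [head]).join("::")
    next unless known.include?(base)
    full = ([base] + rest).join("::")
    return known.include?(full) ? full : nil
  end
  nil
end

# Try the shortest suffix of the full name first and keep the first one
# that resolves back to the intended type in the given nesting context.
def shortest_name(full_name, nesting, known)
  parts = full_name.split("::")
  1.upto(parts.length) do |len|
    relative = parts.last(len)
    return relative.join("::") if resolve(relative, nesting, known) == full_name
  end
  "::" + full_name # fall back to the absolute name
end

# Hypothetical set of declared type names.
KNOWN_NAMES = ["Parsec", "Parsec::TokenFactory", "App", "App::TokenFactory"].freeze

inside_parsec = shortest_name("Parsec::TokenFactory", ["Parsec"], KNOWN_NAMES)
inside_app    = shortest_name("Parsec::TokenFactory", ["App"], KNOWN_NAMES)
```

Inside `Parsec` the short name `TokenFactory` is enough, while inside `App` the same short name would resolve to a different type, so the longer `Parsec::TokenFactory` is needed. Without the nesting context, neither decision can be made, which is why the fallback mode inserts a longer name.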
00:05:51.600 This inconsistency arises from parsing errors. Moreover, we need a parser that can continue working, even with syntax errors, to provide advanced IDE features like completion, signature help, and jump to definition.
00:06:08.280 Let’s take a look at some examples in the video. In this RBS file, we define a class Talk inside Conference. The outline view updates in real time to reflect these changes.
00:06:35.479 When we add attributes, like video, it immediately recognizes the attribute definition. You can continue defining attributes and see the outline update automatically.
00:06:56.160 Once we close the class definition with `end`, there are no syntax errors remaining.
00:07:06.600 Next, I will present how we can create a top-down parser generator.
00:07:11.759 This top-down parser generator will emphasize error recovery. The target language is RBS, not Ruby, making it easier to focus on error recovery strategies.
00:07:31.560 I developed this parser generator to create top-down parsers with error recovery features. The grammar is defined in a Ruby DSL, but currently, it doesn’t generate any code.
00:07:57.960 Instead, it has an interpreter that receives the grammar definitions and parses the input.
00:08:23.460 The grammar definition is specified using BNF notation, where we have terminal tokens like class keywords and method calls, as well as different non-terminals for module names and class members.
00:08:50.760 In BNF notation, we will see how to repeat phrases or make them optional. This shows that the grammar supports zero or more class members.
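The following is a minimal sketch of what a grammar DSL with an interpreter could look like. It is not the DSL shown in the talk: the names `Grammar`, `rule`, `t`, `nt`, `seq`, `alt`, `repeat`, and `opt`, the token names, and the small RBS-like grammar are all assumptions for illustration. Method types are reduced to a single type name to keep the example short.

```ruby
# Hypothetical mini grammar DSL. Rules are stored as data and interpreted
# directly, so no parser code is generated.
class Grammar
  def initialize(&block)
    @rules = {}
    instance_eval(&block)
  end

  def rule(name, body)
    @rules[name] = body
  end

  def t(type); [:t, type]; end            # terminal token
  def nt(name); [:nt, name]; end          # non-terminal reference
  def seq(*items); [:seq, items]; end     # a b c
  def alt(*items); [:alt, items]; end     # a | b | c
  def repeat(item); [:repeat, item]; end  # zero or more
  def opt(item); [:opt, item]; end        # zero or one

  # Returns the token position after a successful match, or nil on failure.
  def match(expr, tokens, pos)
    kind, arg = expr
    case kind
    when :t then tokens[pos] == arg ? pos + 1 : nil
    when :nt then match(@rules.fetch(arg), tokens, pos)
    when :seq then arg.reduce(pos) { |at, item| at && match(item, tokens, at) }
    when :alt
      arg.each do |item|
        nxt = match(item, tokens, pos)
        return nxt if nxt
      end
      nil
    when :opt then match(arg, tokens, pos) || pos
    when :repeat
      while (nxt = match(arg, tokens, pos)) && nxt > pos
        pos = nxt
      end
      pos
    end
  end

  def accepts?(start, tokens)
    match(@rules.fetch(start), tokens, 0) == tokens.length
  end
end

RBS_SUBSET = Grammar.new do
  # class_decl   ::= kCLASS tUIDENT class_member* kEND
  rule :class_decl, seq(t(:kCLASS), t(:tUIDENT), repeat(nt(:class_member)), t(:kEND))
  # class_member ::= class_decl | method_def | attr_reader
  rule :class_member, alt(nt(:class_decl), nt(:method_def), nt(:attr_reader))
  # method_def   ::= kDEF tLIDENT kCOLON tUIDENT
  rule :method_def, seq(t(:kDEF), t(:tLIDENT), t(:kCOLON), t(:tUIDENT))
  # attr_reader  ::= kATTR_READER tLIDENT kCOLON tUIDENT
  rule :attr_reader, seq(t(:kATTR_READER), t(:tLIDENT), t(:kCOLON), t(:tUIDENT))
end

# Tokens for: class Talk def title: String end
ok = RBS_SUBSET.accepts?(:class_decl, %i[kCLASS tUIDENT kDEF tLIDENT kCOLON tUIDENT kEND])
```

This interpreter only answers whether the input matches. It has no error recovery, so a single missing colon makes the whole input fail, which is the problem the rest of the talk addresses.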
00:09:09.380 Each class member is one of a class declaration, a method definition, or an attribute reader.
00:09:30.660 The output of parsing is a concrete syntax tree rather than an abstract structure. Essentially, we create classes for each type of grammar definition, managing their relationships effectively.
00:09:48.300 Let’s look at what the top-down parser implementation looks like. For each non-terminal symbol, we have a dedicated parsing method, including one for the class member non-terminal.
00:10:11.760 The parsing methods correspond to each rule body, validating the input against expected tokens and handling any discrepancies.
00:10:26.220 For instance, if the first token in the input is a class declaration token, it will proceed accordingly; otherwise, it will check for method definitions or attribute declarations.
00:10:48.300 The produced parsing result is a tree that includes various subtrees for different parsing elements, ensuring that class and method definitions are thoroughly represented.
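A hand-written version of the parsing methods described above might look like the sketch below, with one method per non-terminal and a dispatch on the first token. The `Token` and `Tree` structs, the `lex` helper, the token names, and the simplified grammar are assumptions for this example, not the talk's actual code. This strict version raises on the first unexpected token.

```ruby
Token = Struct.new(:type, :text)
Tree  = Struct.new(:kind, :children) # concrete syntax tree node: keeps every token

KEYWORDS = {
  "class" => :kCLASS, "def" => :kDEF, "end" => :kEND, "attr_reader" => :kATTR_READER
}.freeze

# Tiny illustrative lexer: identifiers, keywords, and colons only.
def lex(source)
  source.scan(/[A-Za-z_]\w*|:/).map do |text|
    type =
      if KEYWORDS.key?(text) then KEYWORDS[text]
      elsif text == ":" then :kCOLON
      elsif text.match?(/\A[A-Z]/) then :tUIDENT
      else :tLIDENT
      end
    Token.new(type, text)
  end
end

class StrictParser
  def initialize(tokens)
    @tokens = tokens
    @pos = 0
  end

  # class_decl ::= kCLASS tUIDENT class_member* kEND
  def parse_class_decl
    children = [expect(:kCLASS), expect(:tUIDENT)]
    children << parse_class_member until [:kEND, nil].include?(peek)
    children << expect(:kEND)
    Tree.new(:class_decl, children)
  end

  # class_member ::= class_decl | method_def | attr_reader
  def parse_class_member
    case peek
    when :kCLASS then parse_class_decl
    when :kDEF then parse_member(:method_def, :kDEF)
    when :kATTR_READER then parse_member(:attr_reader, :kATTR_READER)
    else raise "unexpected token #{peek.inspect}"
    end
  end

  # method_def / attr_reader ::= keyword tLIDENT kCOLON tUIDENT
  def parse_member(kind, keyword)
    Tree.new(kind, [expect(keyword), expect(:tLIDENT), expect(:kCOLON), expect(:tUIDENT)])
  end

  private

  def peek
    @tokens[@pos]&.type
  end

  def expect(type)
    raise "expected #{type}, got #{peek.inspect}" unless peek == type
    token = @tokens[@pos]
    @pos += 1
    token
  end
end

tree = StrictParser.new(lex("class Talk attr_reader title: String def video: String end")).parse_class_decl
```

The resulting tree has the `class` and `end` tokens as children alongside the member subtrees, which is what makes it a concrete rather than abstract syntax tree.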
00:11:04.320 Let’s analyze another aspect of error recovery when it comes to flawed RBS files. If a method definition lacks a proper ending syntax, the parser may simply halt.
00:11:20.699 For example, an RBS file missing colons will not yield usable results, even while it retains some structural elements.
00:11:33.699 To address this, we want reliable error recovery. It’s not just about raising errors; we must also try to yield a useful structure even in the presence of issues.
00:11:53.100 The first step in this approach will be to introduce a missing tree structure that indicates which token is expected.
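One way to sketch the missing-tree idea is to change the token check so that it returns a placeholder node instead of raising. The `Missing` struct, the `MissingAwareParser` class, and the token names below are hypothetical and only illustrate the approach.

```ruby
Token   = Struct.new(:type, :text)
Tree    = Struct.new(:kind, :children)
Missing = Struct.new(:expected) # placeholder recording which token was expected

class MissingAwareParser
  def initialize(tokens)
    @tokens = tokens
    @pos = 0
  end

  # method_def ::= kDEF tLIDENT kCOLON tUIDENT
  def parse_method_def
    Tree.new(:method_def, [expect(:kDEF), expect(:tLIDENT), expect(:kCOLON), expect(:tUIDENT)])
  end

  private

  # On a mismatch, do not consume input; return a Missing node and carry on.
  def expect(type)
    current = @tokens[@pos]
    return Missing.new(type) unless current && current.type == type
    @pos += 1
    current
  end
end

# Tokens for the broken input: def title String   (the colon is missing)
tokens = [Token.new(:kDEF, "def"), Token.new(:tLIDENT, "title"), Token.new(:tUIDENT, "String")]
method_def = MissingAwareParser.new(tokens).parse_method_def
```

The parser still produces a `method_def` tree with the name and the type in place, and the third child records that a colon was expected there. Tools such as an outline view can use the tree, and a diagnostic can be reported from the `Missing` node.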
00:12:02.820 In our current context, you do not get the full method definition tree unless the parser encounters every expected token.
00:12:33.479 Let’s examine a case where it may conflict, such as when the parser sees a token that doesn’t match the expected syntax.
00:12:43.620 In these circumstances, it is essential to establish whether we can bypass certain tokens instead of letting them obstruct overall parsing.
00:12:57.540 Hence, if it’s sensible to delete tokens causing issues, we proceed accordingly and continue parsing.
00:13:10.200 Identifying potential tokens in the input enhances our parsing logic. We can skip tokens that do not conform to the expected format.
00:13:23.820 Let’s continue with the implementation of the parser so that it skips tokens that don’t align with the rules.
00:13:42.059 After applying this, we can expect a workable outline, which gives later processing a starting point.
00:13:54.600 Beyond that, method definitions are correctly identified even if they lack expected components, and those gaps are marked in the tree.
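Token skipping can be sketched by extending the class member loop: a token that cannot start a member and is not the closing `end` is wrapped in a `skipped` node and parsing continues. This is one possible policy under assumed names (`SkippingParser`, `MEMBER_STARTS`, `Missing`), not necessarily the strategy used in the talk's implementation.

```ruby
Token   = Struct.new(:type, :text)
Tree    = Struct.new(:kind, :children)
Missing = Struct.new(:expected)

class SkippingParser
  # Tokens that can begin a class member in this simplified grammar.
  MEMBER_STARTS = [:kCLASS, :kDEF].freeze

  def initialize(tokens)
    @tokens = tokens
    @pos = 0
  end

  # class_decl ::= kCLASS tUIDENT class_member* kEND
  def parse_class_decl
    children = [expect(:kCLASS), expect(:tUIDENT)]
    until [:kEND, nil].include?(peek)
      children <<
        if MEMBER_STARTS.include?(peek)
          parse_class_member
        else
          Tree.new(:skipped, [advance]) # delete the offending token, keep it in the tree
        end
    end
    children << expect(:kEND)
    Tree.new(:class_decl, children)
  end

  def parse_class_member
    peek == :kCLASS ? parse_class_decl : parse_method_def
  end

  # method_def ::= kDEF tLIDENT kCOLON tUIDENT
  def parse_method_def
    Tree.new(:method_def, [expect(:kDEF), expect(:tLIDENT), expect(:kCOLON), expect(:tUIDENT)])
  end

  private

  def peek
    @tokens[@pos]&.type
  end

  def advance
    token = @tokens[@pos]
    @pos += 1
    token
  end

  # Missing tokens become placeholder nodes instead of errors.
  def expect(type)
    peek == type ? advance : Missing.new(type)
  end
end

# Tokens for: class Talk : String def video: String end   (stray ": String")
types = %i[kCLASS tUIDENT kCOLON tUIDENT kDEF tLIDENT kCOLON tUIDENT kEND]
tree = SkippingParser.new(types.map { |type| Token.new(type, type.to_s) }).parse_class_decl
```

Every loop iteration consumes at least one token, so the parser always terminates, and the stray tokens remain in the tree as `skipped` nodes so that no input is lost.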
00:14:18.900 This error recovery strategy mimics common approaches found in top-down parsing implementations, ensuring we meet code efficiency.
00:14:30.420 In short, I’ll illustrate another aspect of RBS parsing. Looking at our current implementation, it picks up new class declarations without skipping a beat.
00:14:50.700 On screen, shapes depict how parsing adapts to fresh details. You’ll observe that the framework captures changes, aptly reflecting any structural adjustments.
00:15:04.240 The basic function retains major method definitions while highlighting their corresponding attributes.
00:15:16.540 However, let’s address how error recovery tends to differ across various implementation layers.
00:15:38.540 As we venture deeper into scenarios where classes encounter method definitions, parsing focuses on elements present vs. those omitted.
00:15:58.920 Potential modifications illustrate core changes to the class structure without impacting previous definitions.
00:16:08.579 When defining a structure with multiple varieties at play, it’s critical to highlight the connection to the broader coding landscape.
00:16:20.220 By backtracking to previous steps, we can resolve conflicts without additional parsing issues.
00:16:30.780 We need a structured framework to facilitate this logic as we pursue deeper structural integrity.
00:16:40.020 Overall, despite facing structural complexities, we design resilient parsing functions borne of sound grammatical backgrounds.
00:16:52.440 In summary, you need strategic navigation through multifaceted grammars to maintain integrity.
00:17:02.170 As we draw toward the conclusion, I want to emphasize the merits of evolving the mechanisms for RBS.
00:17:12.160 Our focus centers on achieving excellent syntax completion and bolstering efficiency while making sense of complex inputs.
00:17:22.220 As per user experience, enhancing usability remains paramount while scaffolding frameworks towards practical applications.
00:17:32.840 Your feedback remains invaluable; we must continue evolving methodologies. I want to thank you for all your support.
00:17:43.680 If there are any questions or comments, please feel free to reach out. I’m excited about the journey forward, keen to tackle challenges we face as a community.
00:18:07.020 Having worked on these projects for around 15 years, I remain committed to driving forward the narratives surrounding RBS, types, and safety in programming.
00:18:22.720 Exciting developments await us; let's make sure we stride towards them together!
00:18:37.780 I'm eager to circle back with the abstracts I've mentioned and curious about future endeavors together.
00:18:49.900 That’s a wrap on my part. Thank you once again for this opportunity and your ongoing engagement!
00:19:02.780 Let’s make the best out of this, and ensure we keep striving for progress!
00:19:25.760 Feel welcome to reach out anytime to discuss more about RBS or projects you want to explore together.
00:19:40.220 For now, I’ll conclude my presentation. Best wishes to everyone as we invest in new chapters concerning our future!