Caleb Clausen
Summarized using AI

Ruby Macros

by Caleb Clausen

The video titled Ruby Macros features a presentation by Caleb Clausen at the MountainWest RubyConf 2010. Clausen introduces Ruby macros as a system he developed to enable advanced metaprogramming techniques in Ruby, facilitating manipulation of syntax trees at parse time.

Key Points Discussed:

  • Ruby Macros Overview: Clausen contrasts Ruby macros with C preprocessor macros and likens them to Lisp macros. His system aims to emulate Lisp's macro capabilities, and he estimates it gets about 90% of the way there; the remaining gap comes from Ruby not being a code-is-data language like Lisp.

  • Macro Definition: He explains how macros are defined with a macro keyword, in a definition that looks like a method definition but runs at parse time. The macro body typically contains a 'form', which quotes code, and macro parameters are inserted into the form with the form escape operator (^).

  • Example of a Simple Macro: Clausen provides a trivial example of a macro that adds two numbers, showing how it operates on S-expressions rather than regular Ruby objects.

  • Syntax Trees: He delves into syntax trees, emphasizing their structure and how they reflect the code in an object-oriented manner. Clausen discusses how his RedParse trees are easier to navigate and understand than the trees produced by traditional Ruby parsers.

  • Practical Macros: The talk highlights practical applications of macros, such as an assertion macro that enhances assertion capabilities in tests by giving detailed feedback when conditions fail. Clausen illustrates this with the assertion of two variables, showcasing the clarity this macro provides.

  • Advanced Macro Features: Clausen addresses the potential for scoped macros, macro hygiene, and the ability to yield blocks to macros. He also shares concepts on loop unrolling and inlining methods, demonstrating how these could improve performance and maintain clarity in Ruby code.

  • Conclusion and Future Work: Clausen emphasizes the versatility and power of Ruby macros while acknowledging the current limitations such as performance issues. He expresses hope for future enhancements and welcomes any questions from the audience regarding the concepts discussed.

Takeaways:

  • Ruby macros can significantly enhance metaprogramming within Ruby, providing features that approach the power of Lisp macros.
  • The ability to manipulate syntax trees and develop more expressive and efficient code is a primary advantage of utilizing macros in Ruby programming.
00:00:14.920 I want to talk to you today about Ruby macros, which is a system that I created for enabling deep metaprogramming with Ruby. The idea is to manipulate your syntax trees at parse time and change them in nearly any way you want. If you grew up as a C programmer like me, you may think of macros as something similar to C preprocessor macros. However, this is not the same thing; while they share a general idea, the C preprocessor is very limited in its capabilities. It is a simple textual substitution scheme that cannot manipulate your arguments as objects in a complete way.
00:00:49.440 What I'm really referring to are Lisp macros. Lisp has a very powerful macro system, which I am trying to emulate. It is hard to match the complete integration of macros that Lisp has, because of the nature of Lisp itself: code is data and data is code, so there is a natural flow between the two. In any language that is not Lisp, matching this fully is practically impossible, but I try to get as close as possible; I believe I have gotten about 90% of the way there.
00:01:25.200 Now, here's an example of a rather useless macro that simply adds two numbers together. This demonstrates how I had to invent several new syntactical constructions to achieve the desired effect. This example will display all three of my new syntactical constructions. First, we have the macro keyword; this keyword introduces a macro, and macro definitions look syntactically just like method definitions. Essentially, a macro is a method that runs at parse time instead of at runtime.
00:01:55.800 A macro definition appears similar to a method definition; however, instead of the keyword 'def,' you use 'macro.' Inside the macro, you typically have a form, which is written as a pair of parentheses with a colon in front of it. A form allows you to quote your code. Every macro definition should contain a form. Macro parameters used inside the form must be escaped using a new syntactical construction called the 'form escape,' written with the caret symbol (^). This operator allows you to break out of the form context.
00:02:51.200 This particular macro isn’t very useful; it merely inlines the addition operator. For example, a macro invocation like this would be transformed at parse time into the corresponding addition expression. A macro works by returning a symbolic structure rather than operating on regular Ruby objects. It does not manipulate its arguments as objects directly; it processes them as S-expressions (symbolic expressions), which represent the parsed code as a tree.
00:03:40.640 A macro ultimately returns a structure that gets inlined directly into the code at the point where the macro is invoked. For instance, the macro invocation in this example is replaced by the S-expression the macro returns. Macros can also take arguments, which, again, are represented as S-expressions rather than regular Ruby objects. In a typical method call, Ruby evaluates the arguments and passes their values to the method. A macro runs at parse time, when arguments like 'a' and 'b' do not have values yet, so it receives the parse tree of each argument instead.
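The expansion step described above can be approximated in plain Ruby. This is only a sketch under stated assumptions: S-expressions are modelled as nested arrays, and the `MACROS` table and `expand` helper are hypothetical names invented for illustration, not part of the Ruby macros system. The commented macro definition is a reconstruction from the description in the talk and does not run in standard Ruby.

```ruby
# Reconstructed from the talk's description (needs the macro system to run):
#
#   macro simple(a, b)
#     :(^a + ^b)
#   end
#
# Plain-Ruby stand-in: trees are nested arrays such as [:call, name, *args].
# MACROS maps a macro name to a lambda that receives argument TREES
# (not evaluated values) and returns a replacement tree.
MACROS = {
  simple: ->(a, b) { [:call, :+, a, b] }
}.freeze

# Walk a tree bottom-up and replace each macro invocation with its expansion.
def expand(tree)
  return tree unless tree.is_a?(Array)

  expanded = tree.map { |child| expand(child) }
  head, name, *args = expanded
  if head == :call && MACROS.key?(name)
    expand(MACROS[name].call(*args))
  else
    expanded
  end
end

# The arguments are still unevaluated variable references when the macro runs.
p expand([:call, :simple, [:var, :a], [:var, :b]])
# prints [:call, :+, [:var, :a], [:var, :b]]
```

The invocation node disappears from the tree and the addition node takes its place, which is the sense in which the macro is inlined at the call site.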
00:04:29.680 Now, let’s discuss forms in more detail, as they play a significant role even if they seem boring in comparison to macros. Here's an example of a form, which I placed between a string and a proc to demonstrate that a form occupies a middle ground between a string and a proc. All three types are methods of quoting code. A form behaves like a proc because its contents are parsed, and any syntax errors discovered will be flagged at parse time.
00:05:12.960 However, it's like a string in that the contents haven't been turned into actual instructions yet. They remain data, represented as a tree structure. Below is a representation of the tree corresponding to the form shown earlier. You might find this representation reminiscent of YAML. It is a format I took a considerable amount of time to develop, with some help from Roger Pack, who is sitting in the back. This way of displaying trees is quite clean compared to my previous, much more chaotic representation.
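The middle ground between a string and a proc can be illustrated with Ruby's standard Ripper library. This is a rough analogy, not the form syntax from the talk: Ripper's nested-array output differs from the trees shown in the presentation, and `parse_form` is a hypothetical helper name.

```ruby
require 'ripper'

# Hypothetical helper: parse source into a tree without executing it.
# Ripper.sexp returns nil when the source contains a syntax error.
def parse_form(source)
  Ripper.sexp(source)
end

source = "puts 'hello'"

as_string = source                  # plain text: never parsed unless eval'd
as_tree   = parse_form(source)      # parsed into data, but not executable yet
as_proc   = proc { puts 'hello' }   # parsed and compiled, runs when called

# Like a proc, the tree version is checked for syntax errors up front;
# the broken string below is accepted silently as a String.
broken      = "puts('hello'"
broken_tree = parse_form(broken)

puts as_tree.first.inspect   # the root node type
puts broken_tree.inspect
```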
00:06:00.800 The top level of this tree structure contains a 'call node,' with additional parameters associated with it. Each parameter is represented as a string node containing the data, 'hello.' Here's an illustration of the form escape, which ultimately results in the same expression as before, demonstrating the process of escaping out of your form back into standard Ruby mode temporarily.
00:06:37.680 The form escape operator lets you evaluate portions of the macro as regular Ruby and places the results back into the form at that point. Now, let's examine a more realistic macro; I believe this was the first macro I wrote. It's related to the assertions from Test::Unit. This implementation of assert as a macro offers more functionality than standard assertions. For example, it receives the condition being asserted as an S-expression. It then checks whether that condition uses one of the known comparison operators. If it does, the macro decomposes it into the left and right sides of the comparison and the operator used, which allows it to construct a comprehensive error message.
00:07:40.640 Now let's consider a couple of variables. If I write an assertion like 'a != b' and the condition is true, nothing happens when the assertion runs. However, when an assertion fails, it raises an exception with an error message. As in Test::Unit, this error message indicates what was on the left and right sides of the comparison. For instance, with 'a' on the left and 'b' on the right, a failure might show you that a was one but b was two. The macro can also display a symbolic representation of both sides of the assertion, something that method-based assertions cannot do.
00:08:56.720 Additionally, notice how this macro achieved that effect using just a simple assert. There’s no need for multiple assertion types like assert_equal or assert_greater_than. You can do all of this with a single assert macro, while still using normal syntax. It is also important to note that the assertion macro only takes effect if the debug variable is set. When debug is not set, the macro returns nil, and the assertion vanishes from the parse tree entirely, leaving no trace of a method call and costing nothing at runtime.
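A rough plain-Ruby approximation of the assert macro described above follows. It is a sketch under assumptions, not the talk's implementation: the condition is passed as a small array of operator plus source strings, code generation is done with strings and `eval` rather than syntax trees, and `expand_assert` and `COMPARISONS` are hypothetical names.

```ruby
COMPARISONS = %i[== != < > <= >=].freeze

# Takes a condition as [operator, left_source, right_source] and returns
# Ruby source for the check, or nil when debugging is off (no code at all).
def expand_assert(cond, debug: true)
  return nil unless debug

  op, left, right = cond
  source = "#{left} #{op} #{right}"
  unless COMPARISONS.include?(op)
    # Unknown operator: fall back to a generic message.
    return "raise 'assertion failed: #{source}' unless #{source}"
  end

  # Known comparison: report the source text and both runtime values.
  "raise \"assertion failed: #{source} " \
    "(left was \#{(#{left}).inspect}, right was \#{(#{right}).inspect})\" " \
    "unless #{source}"
end

a = 1
b = 2
eval(expand_assert([:!=, "a", "b"]), binding)   # condition holds, nothing happens

begin
  eval(expand_assert([:==, "a", "b"]), binding)
rescue RuntimeError => e
  puts e.message   # assertion failed: a == b (left was 1, right was 2)
end

p expand_assert([:==, "a", "b"], debug: false)   # nil: nothing to insert
```

A single entry point covers every comparison operator, because the operator is read from the condition itself rather than encoded in the assertion's name.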
00:10:32.840 Assertions serve numerous purposes beyond unit tests, making it beneficial to have assertions in your codebase. Some developers take assertions to an extreme, advocating for design by contract. This practice employs assertions extensively in business logic. While I don't recommend going that far, being able to include assertions within regular code is advantageous. It ensures that you have expected values checked throughout your main code, and being able to disable them in production mode prevents performance loss from those checks.
00:11:43.440 Now let’s delve into syntax trees and their formats. Essentially, they reflect a tree structure; for instance, consider a simple expression and its corresponding tree. At the top of this tree, we have a 'plus' node whose left side is the operand 'a' and whose right side is another node for the nested operation. This setup resembles XML trees or any other kind of structured data you might come across. For those who know ParseTree or other Ruby parsing libraries, my parse trees differ in that they follow object-oriented principles.
00:12:56.400 Unlike Ruby parser trees, my RedParse trees are object-oriented: nodes are actual objects. Each node is an instance of a node subclass and holds other nodes, which provides several useful characteristics. For example, subnodes have names instead of being designated by generic indices. Rather than referring to node[0], you can call node.rescue or node.params directly, making navigation and comprehension much more intuitive. Moreover, RedParse trees closely mirror the original form of the source code, while Ruby parser trees may have been altered during the parsing process.
00:14:12.760 For instance, Ruby parser handles rescue clauses through a convoluted nesting of nodes, whereas in RedParse, the rescue clause is a component attached directly to each type of node that can have a rescue. Another distinction is that Ruby parser omits parentheses altogether in the resulting tree, while my implementation retains them as a distinct node that you can reference. Additionally, operators are represented more explicitly in RedParse, where they have their own node types instead of being rewritten as method calls as in Ruby parser.
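The difference between indexed and named subnodes can be sketched as follows. The node classes here are hypothetical stand-ins built with Struct for illustration; they are not RedParse's actual classes, and the array layout is not the exact output of any particular parser.

```ruby
# Positional style: the reader has to remember what each index means.
array_call = [:call, nil, :puts, [[:str, "hello"]]]

# Named style: hypothetical node classes with descriptive accessors.
CallNode   = Struct.new(:receiver, :name, :params)
StringNode = Struct.new(:data)

call = CallNode.new(nil, :puts, [StringNode.new("hello")])

# Both lines reach the same string, but only the second says what it is.
puts array_call[3][0][1]
puts call.params.first.data
```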
00:15:46.240 I intended to provide a detailed explanation of node types at this point, but I’ve found that showing examples of parse trees is far more illustrative. The RedParse command lets you parse a piece of code interactively and observe the resulting structure. For example, a method call produces a call node along with its parameters, while flow control structures like 'if' produce nodes with a condition and a consequence.
00:16:57.760 Observe how a method definition becomes a method node containing a body filled with operations. If anyone has a specific syntax example they wish to see, feel free to ask. Here's a simpler method call, followed by variations with conditional clauses attached. With each variation, we can see how the parse tree changes in response to the input.
00:18:30.920 During these demonstrations, you may notice some delays in processing; the speed of RedParse is a current limitation. This is an aspect I hope to resolve shortly. While I’ve emphasized how beneficial Ruby macros can be, I acknowledge some pressing issues, such as the slow parsing process, which affects the startup of any application that uses Ruby macros. Any Ruby file that uses macros, whether or not it defines any itself, has to be loaded into the interpreter through a specialized require mechanism, which goes through the slower parser.
00:20:07.600 It would be advantageous to implement scoped macros, allowing macro definitions to exist within a particular class context, thus restricting their visibility to just that class. At present, all macros must reside in a global scope, visible throughout the entire program. Furthermore, Lisp introduces the concept of macro hygiene, which ensures that local variables defined within a macro expansion do not collide with local variables at the location where the macro is called. However, this feature remains unimplemented, so all of my existing macros are prone to potential naming conflicts.
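The naming conflict that hygiene is meant to prevent can be shown with a small source-level sketch. This illustrates the general concept only, using strings and `eval` in place of real macro expansion; `expand_swap`, `expand_swap_hygienic`, `fresh_name`, and `swap_colliding_names` are hypothetical helpers, and the fresh-name approach is one common remedy, not something the talk says is implemented.

```ruby
# Unhygienic expansion: always uses a temporary called tmp.
def expand_swap(x, y)
  "tmp = #{x}; #{x} = #{y}; #{y} = tmp"
end

# One common remedy: generate a temporary name the caller is unlikely to use.
$fresh_counter = 0
def fresh_name(prefix)
  "__#{prefix}_#{$fresh_counter += 1}"
end

def expand_swap_hygienic(x, y)
  t = fresh_name("tmp")
  "#{t} = #{x}; #{x} = #{y}; #{y} = #{t}"
end

# The caller happens to have its own variable named tmp.
def swap_colliding_names(expander)
  tmp, other = 1, 2
  eval(send(expander, "tmp", "other"), binding)
  [tmp, other]
end

p swap_colliding_names(:expand_swap)            # [2, 2]: the swap is broken
p swap_colliding_names(:expand_swap_hygienic)   # [2, 1]: the swap works
```

In the first case the expansion reads `tmp = tmp; tmp = other; other = tmp`, so the caller's variable and the macro's temporary are the same variable and the original value is lost.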
00:21:26.159 Additionally, there are advanced macro features worth exploring. Macros can accept blocks as inputs and access them through the yield keyword. However, yield inside a macro does not behave the way it conventionally does. This specific macro allows you to modify local variables within the block while keeping those changes hidden from the surrounding code. Macros can also have receivers, which are accessed through a pseudo-keyword, though I plan to change this to use 'self' instead, which would be clearer.
00:22:38.840 I would also like to implement a macro for combining functional operations on enumerations, such as pipelines of methods like select and map. This would transform these chains of calls into a single loop, improving performance. As for some of my other macros, one particularly interesting macro I just developed serves as a loop unroller. If your loop does not exceed a few iterations, it replaces the loop with multiple copies of the body.
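Both transformations mentioned above can be illustrated by hand in plain Ruby. These are sketches of what such macros might produce, under the assumption that the output is equivalent to the original code; `fused_even_squares` and `unroll` are hypothetical helpers, and `unroll` generates source text rather than rewriting a syntax tree.

```ruby
numbers = (1..10).to_a

# Pipeline style: select builds an intermediate array, then map walks it again.
piped = numbers.select { |n| n.even? }.map { |n| n * n }

# What a fusing macro might expand the pipeline into: one pass, no
# intermediate array.
def fused_even_squares(numbers)
  result = []
  numbers.each do |n|
    result << n * n if n.even?
  end
  result
end

p piped == fused_even_squares(numbers)   # true

# Loop unrolling: emit one copy of the body per iteration instead of a loop.
def unroll(var, count, body)
  (0...count).map { |i| "#{var} = #{i}; #{body}" }.join("\n")
end

values = [10, 20, 30]
total = 0
eval(unroll("i", 3, "total += values[i]"), binding)
p total   # 60
```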
00:25:14.200 This macro simplifies complex Ruby loops into shorthand while also accommodating different loop types, though I plan to continue enhancing it. Regarding another macro, I've been working on an inliner that adjusts method definitions. By marking a method with the inline keyword at its beginning, that method behaves as an inlined version of itself, essentially operating as a macro but maintaining the surrounding environment semantics. I believe this is an innovation yet to be seen in Ruby.
00:26:50.960 Now, let me show you how these concepts play out in practice. To declare an inline method, you simply add the inline keyword in front of your method definition, which otherwise has parameters and logic like any traditional method. Once it is defined, an underlying macro named 'foo' is created automatically in the background; calling it behaves identically to a standard method call. Furthermore, I can illustrate how the macro does its work: it uses macro expansion to generate the inlined version of that code.
00:30:00.000 In conclusion, I hope I've been able to clarify the power and versatility of Ruby macros in promoting advanced metaprogramming techniques while addressing various optimizations needed for efficient functioning. If there are any questions regarding these concepts or specific examples, please feel free to ask.