Keynote: Developing a Language

Evan Phoenix • February 05, 2011 • San Pedro, CA

In "Developing a Language", a keynote at LA RubyConf 2011, Evan Phoenix explores the conceptual and practical aspects of creating programming languages, starting from the philosophical relationship between language and abstract ideas. The session is interactive, inviting audience engagement and emphasizing learning through implementation. Here are the key points discussed throughout the talk:

  • Introduction & Ground Rules: Evan begins by thanking the audience and organizers and sets the stage for an interactive session, encouraging questions throughout the presentation rather than saving them for the end.

  • The Purpose of Creating Programming Languages: He poses philosophical questions about language and thought, discussing how ideas and language are intertwined. A good programming language can facilitate better expression of ideas.

  • Learning Through Creation: Evan emphasizes that the best way to learn about programming languages is by actually creating one. He introduces "Prattle", a Smalltalk-inspired language that he and the audience will begin to develop collaboratively.

  • Key Concepts from Smalltalk: Evan mentions four fundamental concepts from Smalltalk (self, true, false, nil) and explains their similarities with Ruby. He introduces parsing as a key step in language development.

  • Parsing Logic: The audience is walked through the process of writing parsers for various data types and constructs within the new language, including numbers, logical values, and strings.

  • Building the REPL: He discusses the importance of implementing a Read-Eval-Print Loop (REPL) for immediate feedback when testing the language code.

  • Method Calls and Sending: He expands the language's functionality by adding method sending features and explains how to manage arguments in method calls, reflecting Smalltalk syntax.

  • Incremental Development: The session emphasizes the necessity of validating each additional feature through testing, underscoring the importance of structure and order in building a language.

  • Final Thoughts and Encouragement: Evan wraps up by motivating the audience to explore programming language development, reassuring them that the challenges are manageable and that learning resources are accessible.

LA RubyConf 2011

00:00:08.880 Hello everyone!
00:00:29.840 Thank you all for coming out! I hope everyone has a great day today. Alright, let's hear from you quickly.
00:00:36.399 Let's give a round of applause for the organizers, Kobe, JR, and everyone involved!
00:00:47.280 This is going to be an interesting talk. We'll see how it goes!
00:00:54.239 A few quick ground rules: if you get lost and want to ask a question, please raise your hand and I'll answer it right there.
00:01:00.559 This is not the kind of talk where we can wait until the end to discuss questions because we’ll have no context by then.
00:01:05.680 Is everyone cool with that? Great!
00:01:12.240 This talk is titled 'Developing a Language'. Here is my Twitter handle and my GitHub account.
00:01:19.280 I created this slide just a couple of minutes ago.
00:01:25.520 Now, this is talk number 11. You know what that means!
00:01:30.640 Oh wait, these are out of order. Oh well, that's okay.
00:01:35.680 I want to give a shout out for LARB. So you all know about LARB, right?
00:01:40.880 Who doesn't know about Tuesday night hack? Alright!
00:01:47.040 As I was saying, this is the 11th talk and the last talk of the day, so everyone's thinking about dinner.
00:01:54.159 But we won’t get to dinner just yet. First, I’m going to give you some topics to discuss with your friends over dinner.
00:02:00.479 For that discussion, we are actually going to write a new programming language together.
00:02:06.880 Let’s see how this goes!
00:02:12.640 The big question is, why do we create programming languages? This leads to a philosophical question.
00:02:19.120 Which came first in society or human evolution: language or abstract ideas?
00:02:25.360 Is the ability to express an idea the genesis of having that abstract idea?
00:02:30.560 Or do we need to have the abstract idea first in order to realize we have to express it?
00:02:35.920 It’s tough to say. In many ways, having an abstract idea is tied to having the language to express it.
00:02:41.840 If you have an abstract idea, it's hard to do anything with it without the ability to communicate it.
00:02:48.000 A wise person once said that a programmer, like a poet, works only slightly removed from thought.
00:02:54.959 So programming is essentially about abstract ideas and the language used to express those ideas.
00:03:01.200 If ideas are manifestations of language, does a better language lead to better ideas?
00:03:06.640 This is the notion behind the pragmatic programmer idea: learning a new language makes you a better programmer.
00:03:11.680 Seeing how others express similar ideas can help you uncover ideas you didn’t know about.
00:03:17.760 However, newer ideas in programming languages are not always better than older ones.
00:03:23.519 You could spend your life learning about the ideas that others have already articulated.
00:03:28.720 Looking back at older languages is also valuable.
00:03:34.879 The best way to learn a language, however, is to implement one, and that's what we're going to do today.
00:03:41.840 Let’s take a look back in time at one of Ruby’s ancestors, which is Smalltalk.
00:03:48.640 This talk will not be a full-fledged Smalltalk presentation. Instead, the purpose is to whet your appetite and show you how you can express ideas. If you want to work on a language, you'll find that diving in is quite easy.
00:05:00.720 The original idea for this talk was to develop a piece of software called the Rubinius Language Kit.
00:05:08.800 The intent was for you to easily implement things using this kit.
00:05:15.440 But as happens with many good programmers, I decided that before starting on this, I needed to write a bunch of companion software. Consequently, I ended up building a simple parsing library.
00:05:30.640 Then I realized that it was already yesterday, and I hadn’t even gotten to the actual toolkit.
00:05:36.080 What I will present today is a language, and after this talk, we will figure out how to extract pieces from this and build a toolkit.
00:05:42.320 Thus, 'Prattle' is the name of our Smalltalk-inspired language we will create today. This serves as the genesis of our language toolkit.
00:06:00.880 If you have a laptop, you might want to get it out now.
00:06:06.960 There is a lot of code in these slides, and I had to condense it down for clarity. The code for all of this is already online. You can clone it if you wish.
00:06:13.199 I will give you a moment to do that while Mitch asks questions.
00:06:24.479 So hopefully, you can clone this now, and the Wi-Fi is satisfactory.
00:06:30.479 As I mentioned earlier, this whole project has turned into a sort of road to nowhere.
00:06:36.800 With that said, I present to you 'Prattle'!
00:06:42.160 Smalltalk is very much like Ruby, or rather, Ruby is very much like Smalltalk.
00:06:47.520 We'll start with the four most basic concepts in Smalltalk: self, true, false, and nil. They are identical to those in Ruby.
00:06:53.280 Now, as a quick note: if you're watching this online in the future, you can still find the code and follow along.
00:07:00.320 So, move your windows to this part of the screen.
00:07:06.479 You'll be looking at one of the first few commits, which focuses on adding the ability to parse 'self'.
00:07:12.000 This part is essential because we're talking about writing both a parser and the ability to run the parsed code.
00:07:19.200 The initial commit looks somewhat daunting, but the key portions are what we want to focus on.
00:07:24.320 So, how are we going to parse the 'self' statement? We will create a rule called 'self' that matches a string literal for 'self'.
00:07:32.240 When matched, we will return a new self object.
00:07:39.440 Keep in mind, we are inside this self class, so the naming convention applies. This is quite straightforward.
00:07:45.440 Now you can say that you've successfully parsed a Smalltalk program. Congratulations!
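The rule described above can be sketched in a few lines of Ruby. This is only an illustration, not the code from the talk: it uses the standard library's StringScanner instead of the talk's own parsing library, and the names SelfNode and parse_self are hypothetical.

```ruby
require "strscan"

# Hypothetical node class standing in for the "self object" the rule returns.
class SelfNode
end

# Rule "self": match the literal text "self" and, on success, return a new node.
# Returns nil when the input does not match.
def parse_self(scanner)
  SelfNode.new if scanner.scan(/self\b/)
end

node = parse_self(StringScanner.new("self"))
puts node.class # prints SelfNode
```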
00:07:52.320 Next, we will look at 'true', 'false', and 'nil'. It’s pretty straightforward.
00:07:57.760 Let’s move on to something a little more complex: numbers.
00:08:03.440 We want to create a rule for numbers, and we can do this from a regular expression.
00:08:10.080 This process makes it easy to recognize and parse numbers as we progress.
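A regular-expression-based number rule might look roughly like the following. It is a sketch under assumptions: only unsigned integers are handled, and NumberNode and parse_number are made-up names rather than anything from the talk's repository.

```ruby
require "strscan"

# Hypothetical node that carries the parsed integer value.
NumberNode = Struct.new(:value)

# Rule "number": one or more digits, converted to an Integer on success.
def parse_number(scanner)
  text = scanner.scan(/\d+/)
  NumberNode.new(text.to_i) if text
end

p parse_number(StringScanner.new("42"))
```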
00:08:16.079 Now, we have five rules established, but we need to bind them together.
00:08:24.399 We need a root node that specifies what constitutes a valid Smalltalk program.
00:08:30.240 A program can be self, true, false, nil, or a number; any one of those is valid.
00:08:36.000 We can specify valid inputs that can be parsed.
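One way to picture the root rule is as an ordered list of alternatives that are tried in turn. The sketch below assumes that shape; the RULES table, the Node struct, and parse_root are illustrative names, and the requirement that the whole input be consumed is an assumption of this example.

```ruby
require "strscan"

# Hypothetical generic node: which rule matched, and the matched text.
Node = Struct.new(:type, :text)

# The five rules so far, as [name, pattern] pairs tried in order.
RULES = [
  [:self,   /self\b/],
  [:true,   /true\b/],
  [:false,  /false\b/],
  [:nil,    /nil\b/],
  [:number, /\d+/],
]

# Root rule: the input is valid if exactly one alternative matches all of it.
def parse_root(source)
  scanner = StringScanner.new(source.strip)
  RULES.each do |type, pattern|
    scanner.pos = 0
    text = scanner.scan(pattern)
    return Node.new(type, text) if text && scanner.eos?
  end
  nil
end

p parse_root("42")
p parse_root("banana")
```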
00:08:41.279 Writing a language is all about instant gratification. You need a REPL (Read-Eval-Print Loop) to input text and see what happens.
00:08:48.080 When you achieve this, you get a fantastic feeling of accomplishment!
00:08:54.720 In this case, we fired up the REPL that comes with the project.
00:09:02.000 We put in a number like '42', and it responds back with a parsed number node with the value of 42.
00:09:09.760 The REPL also confirms our string inputs like 'true' and 'nil' by returning the respective nodes.
00:09:15.760 Now we’re able to successfully parse and react to inputs!
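A parse-only REPL can be very small. The following is a minimal sketch rather than the REPL shipped with the project: the parse helper and its array-shaped nodes are hypothetical, and the loop takes its input and output streams as parameters so it can be driven from a StringIO as well as from a terminal.

```ruby
require "stringio"

# Hypothetical parser: returns a small array describing the node, or nil.
def parse(line)
  text = line.strip
  case text
  when /\A\d+\z/ then [:number, text.to_i]
  when /\A(self|true|false|nil)\z/ then [text.to_sym]
  end
end

# Read a line, parse it, print the resulting node, and repeat until
# the input ends or the user types "exit".
def repl(input, output)
  loop do
    output.print "> "
    line = input.gets
    break if line.nil? || line.strip == "exit"
    node = parse(line)
    output.puts(node ? "=> #{node.inspect}" : "parse error")
  end
end

# Scripted session for demonstration; use repl($stdin, $stdout) interactively.
session = StringIO.new
repl(StringIO.new("42\ntrue\nbogus\n"), session)
puts session.string
```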
00:09:21.840 The next biggest challenge is determining actions based on the values we parsed.
00:09:29.600 When we hit a true or a number or a nil, we need to define what the code will do.
00:09:36.640 For the upcoming examples, we will use the ability to inject methods directly into the bytecode.
00:09:43.680 You do not have to concatenate Ruby strings; instead, use a programmatic API to construct your code.
00:09:50.560 With 'self', the process is quick. We add a simple rule that says to push the value of self to the stack.
00:09:56.640 We implement this for all values we defined, ensuring that they are pushed correctly to the stack.
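The idea of each node emitting a "push this value" instruction through a programmatic API can be illustrated with a toy model. Nothing below is the real bytecode API used in the talk: ToyGenerator, emit, the instruction names, and the run interpreter are all invented for this sketch.

```ruby
# Toy generator: records instructions in an array instead of real bytecode.
class ToyGenerator
  attr_reader :instructions

  def initialize
    @instructions = []
  end

  def emit(op, *args)
    @instructions << [op, *args]
  end
end

# Each node knows how to describe itself to the generator.
class SelfNode
  def bytecode(g)
    g.emit(:push_self)
  end
end

class NumberNode
  def initialize(value)
    @value = value
  end

  def bytecode(g)
    g.emit(:push_int, @value)
  end
end

# Toy interpreter: executes the recorded instructions against a stack
# and returns whatever is left on top.
def run(instructions, receiver)
  stack = []
  instructions.each do |op, arg|
    case op
    when :push_self then stack.push(receiver)
    when :push_int  then stack.push(arg)
    end
  end
  stack.last
end

g = ToyGenerator.new
NumberNode.new(42).bytecode(g)
p g.instructions        # prints [[:push_int, 42]]
p run(g.instructions, :main) # prints 42
```

Building instructions through method calls like this avoids gluing Ruby source strings together, which is the point made above.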
00:10:03.360 Now we will move on to a more complex rule: strings.
00:10:10.160 Parsing strings is slightly harder because we must account for escaped characters, such as a quotation mark inside the string.
00:10:17.200 Rather than diving into code, let’s look at the grammar output, which will be clearer.
00:10:23.360 In this case, each character in the body is either an escape sequence or any character that is not a quote.
00:10:30.240 A string consists of a quote, a body of text, and another quote.
00:10:37.280 Note that Smalltalk only supports single quotes for strings; double quotes are for comments.
00:10:44.319 In the future, you could implement support for double quotes.
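The string grammar described above (a quote, a body of escaped or non-quote characters, and a closing quote) can be approximated with a single regular expression. This sketch assumes backslash-style escapes, which the transcript does not spell out; StringNode and parse_string are hypothetical names.

```ruby
require "strscan"

# Hypothetical node holding the unescaped string contents.
StringNode = Struct.new(:value)

# char   = backslash followed by any character
#        | any character that is neither a single quote nor a backslash
# string = "'" char* "'"
STRING = /'((?:\\.|[^'\\])*)'/

def parse_string(scanner)
  return nil unless scanner.scan(STRING)
  # Drop the backslash from each escape sequence, keeping the escaped character.
  StringNode.new(scanner[1].gsub(/\\(.)/) { $1 })
end

p parse_string(StringScanner.new("'hello'")).value  # prints "hello"
p parse_string(StringScanner.new("'it\\'s'")).value # prints "it's"
```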
00:10:50.960 Diagnosing issues is key, so let’s check our REPL as we run various strings.
00:10:57.680 If we enter '42' again, it should parse and return its value, confirming it’s working.
00:11:04.320 So far, we’ve been extracting values; however, it appears somewhat dull.
00:11:11.040 Let’s implement method sending so that the REPL is more interesting.
00:11:17.600 In this case, we will send the message 'class' to the number three.
00:11:24.320 In Ruby, this would look like calling '3.class'.
00:11:30.560 The rule to handle this involves parsing the number followed by a space and the method name.
00:11:37.200 The parsed output would also correspond to the grammar.
00:11:43.200 Now we can send the method 'class' to three. Fantastic!
00:11:50.320 But what if our object is now a 'true' value?
00:11:56.320 We need to ensure our grammar is flexible enough to handle both types.
00:12:02.240 So let’s create an 'atom' rule that encompasses both types for method sending.
00:12:08.800 The atom rule should encapsulate various entity types such as numbers, true, false, and nil.
00:12:14.480 This technique helps in constructing the unary send syntax.
00:12:20.480 Once that’s established, it’s straightforward to implement.
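A unary send built on top of an atom rule could be sketched as follows. The Atom and UnarySend structs and both parse methods are illustrative names; the sketch assumes a selector is a lowercase identifier separated from the receiver by spaces.

```ruby
require "strscan"

# Hypothetical nodes: an atom keeps its source text, a send keeps
# its receiver node and the selector (method name).
Atom = Struct.new(:text)
UnarySend = Struct.new(:receiver, :selector)

# atom = number | self | true | false | nil
def parse_atom(s)
  text = s.scan(/\d+|(?:self|true|false|nil)\b/)
  Atom.new(text) if text
end

# unary_send = atom spaces identifier
def parse_unary_send(s)
  start = s.pos
  receiver = parse_atom(s)
  if receiver && s.scan(/ +/) && (name = s.scan(/[a-z]\w*/))
    return UnarySend.new(receiver, name)
  end
  s.pos = start # rewind so another rule can try from the same position
  nil
end

p parse_unary_send(StringScanner.new("3 class"))
p parse_unary_send(StringScanner.new("true class"))
```

Because the receiver goes through the atom rule, the same send rule accepts a number, true, false, nil, or self without being duplicated per type.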
00:12:27.600 We will now introduce scenarios where the receiver can also be a unary send.
00:12:33.520 Thus, any valid expression can be a receiver.
00:12:40.320 This property validates the rule we established before.
00:12:46.320 So now we have enhanced our programming capabilities.
00:12:52.240 In this case, we are adapting to send the method to the unary send itself.
00:12:59.040 The processing grammar is set up to handle this modification gracefully.
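Letting the receiver itself be a unary send makes the rule left-recursive, and a hand-written parser usually handles that with a loop that keeps wrapping the node parsed so far. The sketch below takes that approach, which may differ from how the talk's grammar handles it. All names are hypothetical, self is left out to keep evaluation simple, and evaluation maps sends directly onto Ruby's public_send purely for illustration.

```ruby
require "strscan"

Atom = Struct.new(:text)
UnarySend = Struct.new(:receiver, :selector)

# expression = atom (spaces identifier)*
# Each additional selector wraps the previous node as its receiver.
def parse_expression(source)
  s = StringScanner.new(source)
  text = s.scan(/\d+|(?:true|false|nil)\b/)
  return nil unless text
  node = Atom.new(text)
  while s.scan(/ +([a-z]\w*)/)
    node = UnarySend.new(node, s[1])
  end
  s.eos? ? node : nil
end

LITERALS = { "true" => true, "false" => false, "nil" => nil }

# Evaluate by turning atoms into Ruby values and sends into Ruby method calls.
def evaluate(node)
  case node
  when Atom
    node.text.match?(/\A\d+\z/) ? node.text.to_i : LITERALS.fetch(node.text)
  when UnarySend
    evaluate(node.receiver).public_send(node.selector)
  end
end

p evaluate(parse_expression("3 class class")) # prints Class
```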
00:13:06.240 Now we want to run the REPL to verify if it executes correctly.
00:13:12.240 Coupling the grammar and the bytecode should yield the correct output.
00:13:18.080 Let's execute to confirm our understanding of method calls.
00:13:24.080 Now we can call methods up to our expectations.
00:13:31.040 The next discussion will cover how to pass arguments to methods in Smalltalk style.
00:13:38.080 Passing arguments is seamless; we will use the keyword format.
00:13:44.080 The format is simple: send the method name followed by a colon and then an atom as an argument.
00:13:50.960 This execution leads to methods being sent with arguments.
00:13:56.960 However, to send multiple arguments, you must lay them out properly.
00:14:03.840 This approach emphasizes the structuring of method calls.
00:14:10.480 Smalltalk enforces this structure by making the keyword in front of each argument part of the method name.
00:14:16.480 If two arguments are passed, both of their keywords become part of the method name.
00:14:22.240 To manage this, we must determine how to break down the method calls efficiently.
00:14:28.240 With valid grammar rules in place, we can manage passed arguments.
00:14:34.080 We essentially separate pairs into the correct order.
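Splitting keyword/argument pairs into a single selector plus an ordered argument list might look like this. It is a sketch under assumptions: arguments are limited to atoms, KeywordSend and parse_keyword_send are made-up names, and the example message between:and: is standard Smalltalk but is not taken from the talk.

```ruby
require "strscan"

# Hypothetical node: receiver text, the combined selector, and argument texts.
KeywordSend = Struct.new(:receiver, :selector, :arguments)

ATOM = /\d+|(?:self|true|false|nil)\b/

# keyword_send = atom (spaces keyword ":" spaces atom)+
def parse_keyword_send(source)
  s = StringScanner.new(source)
  receiver = s.scan(ATOM)
  return nil unless receiver

  pairs = []
  while s.scan(/ +([a-z]\w*:) +/)
    keyword = s[1]
    argument = s.scan(ATOM)
    return nil unless argument
    pairs << [keyword, argument]
  end
  return nil if pairs.empty? || !s.eos?

  # The keywords, in order, form the method name; the values form the arguments.
  KeywordSend.new(receiver, pairs.map(&:first).join, pairs.map(&:last))
end

p parse_keyword_send("3 between: 1 and: 5")
```

Here the two keywords join into the selector between:and: and the values 1 and 5 become its arguments, in the order they were written.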
00:14:41.680 As previously mentioned, we'll couple all this together using the REPL.
00:14:48.480 What we’ll see is that this becomes a very useful tool.
00:14:55.520 Let's test it out and ensure we've covered all methods.
00:15:02.480 Now, we have to consider how to manage variations properly.
00:15:08.480 While doing so, we must maintain accuracy.
00:15:14.800 So let's arrange each call to facilitate error-free executions.
00:15:21.920 This leads us to show how to handle call syntax for testing.
00:15:28.560 Everything appears well-ordered since we've put the fine details together.
00:15:35.160 Now, let’s switch gears and explore how unary sends can be received.
00:15:42.000 Let us check how this reflects when we run it in the REPL again.
00:15:48.440 As expected, it preserved the sequence for evaluation.
00:15:55.920 With that, we are prepared to handle object-oriented method sending in Smalltalk format.
00:16:03.120 Next, we’ll set up receiving class-based calls.
00:16:10.240 It's crucial to handle the syntax efficiently.
00:16:17.920 When applying these methods correctly, we can gain satisfactory outcomes.
00:16:24.000 Let’s execute this setup in the REPL for optimal results.
00:16:31.040 This illustrates how methods can be easily instantiated and utilized.
00:16:38.240 Now, we’re eager to see the transformations take place.
00:16:45.920 From multiple method calls to integration of additional language features.
00:16:52.320 In essence, we are bringing in flexibility for object interactions.
00:16:59.200 As you start implementing more robust features, validate outcomes in real-time.
00:17:06.080 We’re promoting interaction with extensive capabilities through the REPL.
00:17:13.040 Now, let’s ensure all the components tie together for solid execution.
00:17:20.080 Correlating these aspects points towards comprehensive evaluations.
00:17:27.280 If your expectations yield satisfying results, you’re on the right track.
00:17:34.160 We’re assuming a colorful variety of outputs should surface as we proceed.
00:17:41.440 These programmatic expressions will be a reference point.
00:17:48.080 As you expand into multiple send commands, the language will flourish.
00:17:55.200 The big takeaway: building a language depends on organized methods.
00:18:02.400 Each addition builds toward a satisfying programming experience.
00:18:09.120 With every improvement, test claims mature into detailed implementations.
00:18:16.000 Once checks and balances are established, a functioning system takes form.
00:18:23.000 Let’s explore further functionality based on incremental development.
00:18:30.560 We'll continue to develop interfaces and enhance coding modules.
00:18:37.600 With that, we’re moving towards complexities that show programming depth.
00:18:44.880 Bridges between Smalltalk methods and Ruby conventions will enrich methods.
00:18:52.000 Now, initiating blocks falls into play, and size expands.
00:18:58.640 Blocks defined here will also add layers of versatility.
00:19:06.080 Coding in blocks pairs smoothly with the existing Ruby code interface.
00:19:13.680 Having multiple expressions supports further exploration without conflict.
00:19:20.720 Let’s investigate these blocks practically — data-defined responses should emerge.
00:19:26.880 This way, testing dialogues become part of efficient learning paths.
00:19:34.080 With blocks under consideration, techniques continue to streamline execution.
00:19:41.200 Let’s harness interactions to solidify our coding outcomes.
00:19:48.400 To finalize our aim, let’s summarize how the last steps connect.
00:19:55.680 Reflecting these associations as each point turns into core functions.
00:20:02.080 Now, we have arrived at the final threshold.
00:20:09.280 We will draw our larger conclusions on how development iteratively enhances languages.
00:20:16.080 In summation, consider building languages as an opportunity.
00:20:23.040 Experimentation thrives in this space, so take the plunge!
00:20:29.920 Learn through the challenges of language design. They are not as daunting as they seem.
00:20:36.000 Feel confident — resources are available for you.
00:20:43.200 With these words, I encourage you to go out and explore!
00:20:50.240 Thank you all very much!
00:20:56.960 If you have questions, feel free to ask as we wrap up.
00:21:03.360 I appreciate everyone’s attention throughout this talk.
00:21:10.720 Thank you again, and see you next time!