At the Mountains of Madness: A Primer on Writing

by Goose Mongeau

In the video titled "At the Mountains of Madness: A Primer on Writing", Goose Mongeau, also known as Matthew Robert Mongeau, presents a detailed guide on writing programming languages, focusing on his own language called Dagan. This talk, which took place at Keep Ruby Weird 2015, emphasizes the importance of understanding language design and the process involved in creating a programming language.

Key points discussed include:

- Importance of Design: Mongeau stresses that the design of the language should precede programming. Choosing the right progenitor language is crucial, and he suggests using Ruby due to its user-friendly nature and tools that facilitate language creation.

- Lexing and Parsing: He explains the fundamentals of writing a programming language by starting with lexing, which involves breaking down code into components, and then parsing, which is more complex as it requires composing these elements into coherent instructions.

- Ambiguity in Language Processing: A significant challenge when creating a programming language is dealing with ambiguities that may confuse the machine. Mongeau underscores the necessity of clear grammar to avoid issues during parsing.

- Practical Tools and Examples: The talk introduces tools like Ragel for lexing and Racc for parsing, showing how they simplify language construction and keep the door open for a later rewrite in another language. He also illustrates how Dagan's design is meant to reflect the programmer's style and intent clearly.

- Personal Journey and Empowerment: Mongeau shares his personal experience of rapidly prototyping a language and the satisfaction it brought him. He highlights how developing Dagan equipped him with skills to better understand complex systems like Ruby itself.

In conclusion, Mongeau encourages aspiring language creators to embrace the challenge, stating that writing your programming language can lead to profound insights into existing languages and serve as a creative outlet for expressing programming style. This video serves as both a motivational piece and a practical guide for programmers interested in exploring the depths of language creation.

00:00:09.469 Hello, fellow scientist. You may not have heard of me before, but my name is Matthew Robert Mongeau. I am here to talk about the language of the machine. So many layers have been abstracted from us in this context. Those zeroes and ones seem very innocuous, but in between lies the very brink of insanity. For this reason, a colleague and I set out to write a new language, which we called Dagan. Dagan represents the frustrations we have experienced with current programming languages. In the wild, we've seen code designed in ways that Ruby permits, compelling us to venture into the abyss and create something new. I would like to share with you the trials and tribulations we faced while creating Dagan, as well as the lessons we learned on our journey.
00:01:01.020 If you set out to create your own programming language, you must prioritize design before programming. Failing to do so will lead you to create an unimaginable abomination. When you begin, you must choose a progenitor, a language from which to start. You may be tempted to choose C, but be warned: C is a feeding frenzy unto itself. The challenges faced in C will test the very limits of your sanity and imagination. One critical area you may encounter is memory management, which can bring upon never-ending nightmares. Alongside this, navigating pointers—those hulking monstrosities lurking in the shadows—can be a daunting prospect. Ultimately, if you take this path, your language may perish, and the death of languages is far too common.
00:03:07.030 Dear scientist, my suggestion to you is to use Ruby. Ruby offers a wealth of tools that simplify the language creation process. This way, you can work toward your goals, validating your assumptions before your language fades into obscurity. Remember to take small steps; each step you take toward creating your language will help it emerge into the world. I leave it to you, dear reader, to decide how you will approach the creation of your language.
00:03:38.739 Sincerely, Matthew Mongeau. P.S. — I'm not actually done yet. Okay, so that was just my introduction. Alright, so this talk has an interesting premise: I am trying to convince a room full of people to take the plunge and write programming languages. The first question I need to address is, why should you write a programming language? This is a significant endeavor, but the main reason is that by doing so, you'll gain a much deeper understanding of the programming languages you already use. In many ways, programming is a form of art.
00:04:07.948 Writing your own programming language is the best way to express yourself in terms of how you program and what you create. To begin, I want to gauge my audience's experience: how many of you have tried writing a programming language before? How many of you have heard of lexing? Quite a few! I assume if you've heard of lexing, you're also familiar with parsing. When writing your programming language, you should definitely start with lexing, as it's a relatively straightforward process. If you haven't heard of lexing, it's analogous to processing natural language: you take a sentence composed of words, break it into its individual components, and label what each one means.
00:05:38.079 In programming, this means defining the parts of your language and assigning identifiers to them—this is a number, this is a keyword, and this is an operator. While this part is not too difficult, the more challenging aspect is parsing. This is where you take what you understand about the language, the different pieces, and put them together to form sentences, which is essentially how you will construct your language. The difficulty arises due to ambiguity. Ambiguity can wreak havoc; you might write your programming language in a way that seems clear to you, but the computer fails to understand it. Thus, teaching the machine to comprehend your language can be a challenging task.
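The lexing step described here can be sketched in a few lines of plain Ruby. This is a minimal illustration using the standard library's `StringScanner`, not code from the talk and not Dagan's actual lexer; the token names and patterns in `TOKEN_RULES` are hypothetical and chosen only to show numbers, keywords, and operators being labeled.

```ruby
require "strscan"

# Hypothetical token rules for an illustrative toy language.
# Order matters: keywords are tried before general identifiers.
TOKEN_RULES = [
  [:number,     /\d+/],
  [:keyword,    /(?:if|else|while)\b/],
  [:identifier, /[a-z_]\w*/],
  [:operator,   %r{[-+*/=]}]
].freeze

# Break source text into [type, text] pairs, skipping whitespace.
def lex(source)
  scanner = StringScanner.new(source)
  tokens = []
  until scanner.eos?
    next if scanner.skip(/\s+/)

    # Find the first rule whose pattern matches at the current position.
    rule = TOKEN_RULES.find { |_type, pattern| scanner.match?(pattern) }
    raise ArgumentError, "unexpected character #{scanner.peek(1).inspect}" unless rule

    tokens << [rule.first, scanner.scan(rule.last)]
  end
  tokens
end
```

Calling `lex("while x = 42 + y")` labels `while` as a keyword, `x` and `y` as identifiers, `42` as a number, and `=` and `+` as operators. Deciding how those labeled pieces combine is left to the parser.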
00:07:05.250 If you persevere through this challenging step, you will be on your way to evaluating your code and ultimately creating a programming language. When my colleague Caleb and I embarked on this journey to write a programming language, we essentially completed a working prototype in a single afternoon. From there, the building experience is immensely rewarding. I believe there's nothing else I've programmed in Ruby that has felt as fulfilling as this.
00:08:52.920 To help you on this journey, I've compiled several tools. If you're trying to lex, I suggest using either Rex or Ragel. I favor Ragel because you define your lexer once and can compile it to multiple targets: Ruby, C, Java, Go, Objective-C, C++, and many others. This flexibility matters because your language is unlikely to stay in Ruby if you want it to grow; Ruby's slower performance can hold it back. Writing in Ruby does, however, let you quickly validate your assumptions about your language.
00:09:53.420 Another useful tool, for the parsing step, is Racc, which allows you to define your grammar. I will show examples of these tools shortly and how they streamline the process of creating your language. Racc is modeled after Yacc and Bison, parser generators that produce C. Therefore, if you later decide to rewrite your language in C, the transition could be relatively straightforward.
00:11:38.500 Let me illustrate how Dagan, the language I developed, uses these tools. This is an example of what Ragel looks like. In Ragel, you define how each component of your language is recognized. For example, you can specify that a plus operator must have spaces around it, which adds clarity and avoids ambiguity. Dagan was designed with the principle that the language itself reflects your style: there should only be one way to accomplish anything within it, leaving no room for confusion.
00:12:55.840 As we build our lexer, it will take code that resembles this sample Dagan program and break it into its individual components to define its structure. For instance, we define 'greeter' as a noun with a colon following it, effectively labeling each part of the language. This process is straightforward, but it does require you to understand how to parse the grammar correctly, as ambiguities may arise. Shift-reduce conflicts occur when the parser cannot tell from the grammar whether to keep reading tokens (shift) or to combine what it has already read (reduce), for example when the order of operations is left unspecified.
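To make the order-of-operations ambiguity concrete: `1 + 2 * 3` can be read as `(1 + 2) * 3` or `1 + (2 * 3)` unless the grammar says which operator binds tighter. The sketch below is a hand-written precedence-climbing parser in plain Ruby. It is a stand-in for illustration rather than Racc or the parser from the talk, and the `PRECEDENCE` table and `parse_expression` name are hypothetical.

```ruby
# Hypothetical precedence table: a higher number binds tighter.
PRECEDENCE = { "+" => 1, "-" => 1, "*" => 2, "/" => 2 }.freeze

# Parse a flat token list such as [1, "+", 2, "*", 3] into a nested
# [operator, left, right] tree. Note: consumes (mutates) the token array.
def parse_expression(tokens, min_precedence = 1)
  left = tokens.shift
  raise ArgumentError, "expected a number, got #{left.inspect}" unless left.is_a?(Integer)

  # Keep absorbing operators that bind at least as tightly as required.
  while (op = tokens.first) && PRECEDENCE.key?(op) && PRECEDENCE[op] >= min_precedence
    tokens.shift
    # Demanding strictly higher precedence on the right-hand side
    # makes operators of equal precedence left-associative.
    right = parse_expression(tokens, PRECEDENCE[op] + 1)
    left = [op, left, right]
  end
  left
end
```

With this table, `parse_expression([1, "+", 2, "*", 3])` groups the multiplication first, and `parse_expression([8, "-", 3, "-", 2])` groups from the left. Parser generators let you state the same decisions declaratively instead of encoding them in a loop.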
00:15:07.790 To give you a tangible example, I want to show you a simple interpreter I created, which I affectionately refer to as the BF interpreter. The language it runs, known as Brainfuck, has just eight commands. Implementing them demonstrates that once you understand the fundamentals, you are capable of creating a programming language that is technically Turing-complete. This means your language can perform any calculation that a Turing machine can.
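For readers who have not seen the eight commands, here is a minimal Brainfuck interpreter sketched in Ruby. It is not the interpreter shown in the talk. The `run_brainfuck` name, the 8-bit wrapping cells, and the choice to store 0 when input runs out are assumptions; real implementations differ on those details.

```ruby
# Run a Brainfuck program and return its output as a String.
# Assumptions: cells wrap at 256, and "," stores 0 once input is exhausted.
def run_brainfuck(program, input = "")
  # Pre-compute matching bracket positions so loops can jump directly.
  jumps = {}
  stack = []
  program.each_char.with_index do |char, index|
    if char == "["
      stack.push(index)
    elsif char == "]"
      open_index = stack.pop
      raise ArgumentError, "unmatched ] at #{index}" if open_index.nil?
      jumps[open_index] = index
      jumps[index] = open_index
    end
  end
  raise ArgumentError, "unmatched [" unless stack.empty?

  tape = Hash.new(0)        # sparse tape; every cell starts at 0
  pointer = 0               # current cell
  pc = 0                    # position in the program
  input_bytes = input.bytes
  output = String.new       # binary string, safe for any byte value

  while pc < program.length
    case program[pc]
    when ">" then pointer += 1
    when "<" then pointer -= 1
    when "+" then tape[pointer] = (tape[pointer] + 1) % 256
    when "-" then tape[pointer] = (tape[pointer] - 1) % 256
    when "." then output << tape[pointer].chr
    when "," then tape[pointer] = input_bytes.shift || 0
    when "[" then pc = jumps[pc] if tape[pointer].zero?
    when "]" then pc = jumps[pc] unless tape[pointer].zero?
    end                     # any other character is treated as a comment
    pc += 1
  end
  output
end
```

For example, `run_brainfuck("++++++++[>++++++++<-]>+.")` builds 65 in the second cell and prints the corresponding character, `A`. The whole interpreter is one `case` statement with eight branches.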
00:17:05.610 As a result of my experiences building this language, I now have the skill set to contribute to Ruby itself. Anyone who's explored the Ruby source code knows it can appear daunting, comprising thousands of lines that can feel overwhelming. However, having previously developed my own language grants me greater insights into that complexity. Additionally, if you're a Rails developer, don't dismiss the relevance of language design; you can create innovative solutions by employing these tools effectively.
00:18:04.959 Thank you!