Towards a Higher-Level Language

by Erik Michaels-Ober

The video titled "Towards a Higher-Level Language" features a talk by Erik Michaels-Ober at RubyConf AU 2015, focusing on the evolution of programming languages and the concept of abstraction.

Main Topic:
The central theme discusses the trajectory of programming languages towards higher levels of abstraction, emphasizing the significance of abstraction in programming.

Key Points Discussed:
- Historical Context of Programming Languages:
  - The talk begins with the history of computing, explaining how programming started at a very low level with machine language and assembly language.
  - Programming languages have progressively operated at higher levels of abstraction to simplify coding tasks.
- Understanding Abstraction:
  - The speaker explains abstraction through examples like transistors and their symbolic representation.
  - Abstraction allows programmers to use concepts without needing to understand the underlying hardware intricacies.
- Types of Abstractions in Programming:
  - Several key abstractions are mentioned:
    - Bits and Bytes: Allowing counting and representing characters instead of dealing with individual bits.
    - Variables: An important abstraction that enhances readability and ease of use compared to raw memory addresses.
    - Strings vs Character Arrays: Highlighting the benefits of strings as a higher-level abstraction.
    - Functions and Blocks: Functions abstract behavior, while blocks enable higher levels of abstraction for defining domain-specific languages in Ruby.
    - Garbage Collection vs Manual Memory Management: Discussing errors and complexity associated with manual memory management versus automatic garbage collection.
- Evolution of High-Level Languages:
  - The presentation covers a historical timeline from early languages like Lisp and COBOL to contemporary languages such as Python and Ruby.
  - It notes that while many newer languages (like Go and Rust) have emerged recently, none have surpassed Ruby in terms of abstraction.
- Modern Developments and Suggestions for Ruby:
  - As Ruby is over 20 years old, suggestions are made for improvements: removing symbols, enhancing decimal handling, and unifying syntax.
  - The need for better concurrency primitives to simplify concurrent programming in Ruby is emphasized.

Conclusions and Takeaways:

- The talk concludes with the assertion that Ruby still ranks highly among programming languages in abstraction, yet there are opportunities for modernizing its features to enhance usability and performance. The focus is on making programming even simpler and more intuitive as technologies advance, underscoring the continuing relevance of abstraction in programming languages.

00:00:00.199 So, I'd like to take you back to the dawn of the age of computer storage. I mean this quite literally. When programmers think about bits, we typically represent them symbolically as a series of zeros and ones. Tom Stewart had a slide similar to this in his talk yesterday, but bits are not actually zeros and ones; they are, in fact, electronic components—real physical things that you can touch. In the beginning, a mere 67 years ago, there was only one of them, and it looked like this. This is a replica of the world's first transistor. If you've done any work or played with electronics, this may be a more familiar picture of a transistor. It has three terminals, which is characteristic of a transistor.
00:00:37.559 This is the symbolic representation of a transistor: one terminal is called the collector (or the drain), one is called the base (or the source), and the last is called the emitter (or the gate). I like that each one has two different names to make it really confusing. So, the way this works is, if you take a power source, let's say a battery, and you connect the negative terminal to the base and the positive terminal to the collector, nothing happens; that's basically the equivalent of a zero. But as soon as you touch the collector to the emitter, electrons will start flowing from the negative terminal of the battery through to the base of the emitter and from the collector to the emitter with a higher amperage.
00:01:03.640 Now, quick show of hands: how many people understood how a transistor worked before I just explained it? Okay, hands down. How many people still don't understand what it is after I've explained it? A few of you, thank you for your honesty. So, the good news is it doesn't really matter. You don't need to understand how a transistor works to use or program a computer. The thing that makes this possible is something called abstraction. What is abstraction? Well, I'd like to share a quote by John Locke from "An Essay Concerning Human Understanding", published in 1690, where he describes three things the mind is capable of. First, it can combine several simple ideas into a complex idea. Second, it can take two ideas and, without necessarily combining them, put them next to each other to compare and contrast; the product of this exercise is an understanding of the relationship between the two ideas. Third is abstraction: taking a concrete idea and generalizing it.
00:02:09.080 Abstraction is what lets all of us use and program computers without necessarily understanding how the fundamental pieces work at each layer below the level we're programming at. When software developers think about a bit, we don't think about something that looks like a transistor. Maybe we think about something that looks like this: you don't need to understand how a transistor works to understand that it functions as a switch that can be off or on. We can abstract that to a symbolic representation of a zero for off and a one for on. Giving hardware bits a symbolic notation of zero and one allows us to start using bits as counters. Unfortunately, a single bit can only count up to one, which is not that useful if we want to count higher.
00:03:03.920 We need to give our bits some structure. This brings us to our next abstraction: bytes. Bytes free us from having to think about individual bits, individual ones and zeros; instead, we can think about numbers between 0 and 255. Once we've done that, it's a short step to map those numbers to letters, punctuation, and other characters. Abstraction not only allows us to work with things without understanding them, but it allows us to reason about problems at a higher level. Instead of worrying about the individual bits, we can think about things in terms that are understandable to us, like letters, and we can look at these letters and get some sort of intuitive human feeling about what they mean, as opposed to the bits which are only meaningful to the computer.
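Editor's sketch, not from the talk: the step from bits to bytes to characters can be illustrated in a few lines of Ruby. The helper names `bits_to_byte` and `byte_to_char` are made up for this example, and ASCII is assumed as the character mapping.

```ruby
# Eight on/off switches, written as the usual zeros and ones.
BITS = "01001000"

# Grouping eight bits into a byte lets us read them as one number (0..255).
def bits_to_byte(bits)
  bits.to_i(2)
end

# Mapping that number to a character (ASCII assumed) gives us a letter.
def byte_to_char(byte)
  byte.chr
end

byte = bits_to_byte(BITS)
puts byte                 # 72
puts byte_to_char(byte)   # H
```

Each layer hides the one below it: the caller of `byte_to_char` never has to look at individual bits.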
00:04:03.400 We used to program computers like this, telling the computer which individual switches it should turn on or off with a punch card. In some third-world countries, such as Florida, voting is still done this way. This is some x86 assembly code that I actually borrowed from Tom's talk yesterday. It’s definitely a higher level than writing zeros and ones, but it is still tied to a specific computer architecture. You're programming specific registers and specific addresses instead of referring generally to memory. Why not give that memory address a name that makes sense to a person? So, variables are an abstraction over reference. Languages with variables tend to be more abstract than languages without variables, and I think this is an improvement. Almost all programming languages have variables; it's a nice abstraction, a useful abstraction—better than this.
00:05:10.000 Strings are an abstraction over character arrays. Not all programming languages have strings; some only have character arrays, and the array sizes can only be constants. This means we run the risk of a buffer overflow since we have to try to guess in advance how many characters we need. There's also that annoying null terminator that goes at the end, and there's a whole category of bugs that can be caused by either forgetting to add a null terminator to your character array or by trying to add things to the character array after the null terminator. Strings, again, are an abstraction that let you operate at a higher level; I think they're quite nice.
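Editor's sketch, not from the talk: a rough illustration of what the string abstraction buys you in Ruby. The method name `build_greeting` is hypothetical. There is no fixed buffer size to guess and no terminator to manage.

```ruby
# Builds a string piece by piece; Ruby resizes the storage as needed,
# so there is no buffer length to declare up front.
def build_greeting(names)
  greeting = String.new("Hello")
  names.each { |name| greeting << ", " << name }
  greeting
end

message = build_greeting(%w[Alice Bob])
puts message              # Hello, Alice, Bob
puts message.length       # 17
puts message.bytes.last   # 98, the byte for "b"; no trailing zero byte
```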
00:06:12.120 Does anyone know what the C program will output? A hint: that number I stored in the variable i is max int, or INT_MAX, so when we add one to it, any guesses? Yeah, you'll get negative 2,147,483,648. Ruby could even hide away that distinction and just have something called 'Num' or 'Integer' and take care of whether it's a Fixnum or a Bignum under the hood, which is not really important to you. You don't need to know the difference, and as a result, you can sort of have infinitely long numbers in Ruby; they're only constrained by the amount of memory that you have, not by some arbitrary type constraint.
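Editor's sketch, not from the talk: the C program from the slide is not in the transcript, so the wraparound is imitated here with plain arithmetic. `wrap_int32` is a hypothetical helper, and a 32-bit `int` is assumed. Strictly speaking, signed overflow is undefined behavior in C; wrapping to the minimum value is just the commonly observed result. Since Ruby 2.4, Fixnum and Bignum have in fact been unified into a single Integer class.

```ruby
INT_MAX = 2**31 - 1   # 2_147_483_647, the largest 32-bit signed integer

# Imitates 32-bit two's-complement wraparound using ordinary arithmetic.
def wrap_int32(n)
  ((n + 2**31) % 2**32) - 2**31
end

puts wrap_int32(INT_MAX + 1)   # -2147483648, what a wrapping 32-bit int shows
puts INT_MAX + 1               # 2147483648, Ruby simply keeps counting
puts 2**100                    # 1267650600228229401496703205376
```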
00:07:01.800 Functions are abstractions as well. They are abstractions over behavior, and languages that allow you to define functions, I would argue, are more abstract than languages without functions. It's again not such a controversial statement; I think functions are quite popular in programming languages. So that's good, but blocks are not. Not every language has blocks or closures, and this is a high-level feature that's quite nice. Again, I would argue that the thing that blocks allow you to do is define behavior on the fly. If you look at most Ruby domain-specific languages, the way they work is heavily tied into using blocks. For example, something like RSpec uses blocks all the time to define your tests. So the idea of defining a domain-specific language in another language, Ruby leverages this feature so you can operate at a higher level, in the domain of your test, for example, or Sinatra allows you to operate in the HTTP domain using blocks.
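Editor's sketch, not from the talk: a toy routing DSL built on blocks, loosely in the spirit of the Sinatra example mentioned above. `TinyRouter`, its `get` method, and `call` are illustrative names only and are not Sinatra's API.

```ruby
# A hypothetical mini DSL: routes are declared inside a block.
class TinyRouter
  def initialize(&definition)
    @routes = {}
    instance_eval(&definition)   # run the block with self set to this router
  end

  # Registers a handler block for a path.
  def get(path, &handler)
    @routes[path] = handler
  end

  # Looks up the handler for a path and runs it.
  def call(path)
    handler = @routes[path]
    handler ? handler.call : "404 Not Found"
  end
end

APP = TinyRouter.new do
  get("/hello") { "Hello, world" }
  get("/year")  { "It is #{Time.now.year}" }
end

puts APP.call("/hello")     # Hello, world
puts APP.call("/missing")   # 404 Not Found
```

The behavior of each route is defined on the fly in a block, which is what lets the code read in terms of the HTTP domain rather than in terms of hashes and callbacks.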
00:08:49.440 I would argue that languages that require you to manage memory manually are less abstract. Languages that perform garbage collection are more abstract. This is terrible, basically. If you try to call that free twice instead of once somewhere later on, you'll get an error, and if you try to access a pointer that's already been freed, you get an error. There's a whole class of errors that can basically be solved by getting rid of manual memory management and utilizing garbage collection instead. Grace Hopper, a programming language designer who worked on COBOL, believed that programs should be written in a language that was close to English. I want to take you through what I consider to be the history of programming languages so you can see this trajectory because I think there are some trends and lessons that can be learned from this.
00:10:10.240 This chart is basically just one very dynamic slide, but don't worry too much about the y-axis; it's not labeled. Just think of it as the values being relative. If one programming language is above another, that means it's more abstract. We won’t say how much more, because these things are hard to quantify, but I think you can make a pretty clear case just in the ranking. At the most basic level, the first computers used machine language. Soon after that, we developed assembly language, which is specific to a particular hardware architecture and talks directly to registers and addresses. Then, a few years later, I believe in 1958, we got Lisp, which was much higher level than assembly language. Again, don't worry too much about the magnitude, but I think it’s fair to say that Lisp was at least one or two orders of magnitude more abstract than assembly language, which itself was more abstract than machine language.
00:12:10.000 And so I would call this progress. The year after Lisp was released, we got a language called COBOL, which was the language Grace Hopper was working on. In 1960, we got ALGOL, which is similar to COBOL. Ruby actually has certain syntax roots that can be traced back to ALGOL and COBOL. ALGOL and COBOL are interesting because they were the first two languages that tried to define a language standard and basically said that a language should conform to a particular standard. This is a form of abstraction because by locking down a standard, it means that you can then use that language across different hardware architectures, which at the time were also quite varied.
00:13:18.000 If you could standardize on the language and abstract across the hardware, this was a very nice feature. Standardization was a big step forward for COBOL and ALGOL. Then, a few years after that, in the mid-60s, Simula was released. Simula is recognized as the world's first object-oriented programming language, and I think you could trace some of Ruby's roots back to Simula. It never really became very popular, but it had many of the object-oriented features that inspired future languages. Then came C in 1970. C is often considered a low-level language; it’s basically one level above assembly language. It is standardized and can be considered abstract across different hardware architectures.
00:14:10.360 However, I would argue this was a bit of a step backward in terms of abstraction. That said, compared to all the other languages (Simula, COBOL, ALGOL, and Lisp), C became quite popular and continues to be popular today. It's arguably the most popular language still. I think the reason why C became so popular was that it gave you a certain amount of control that you really needed back then. These abstractions have a cost, and the cost is that if you're writing code optimized for a human instead of for a computer, it won't be as efficient. C allows you to write code optimized for a computer, which gives you performance, especially during an age when hardware was slow and constrained.
00:16:00.760 C became really popular in this environment and continues to be so. At that time, there wasn't much code; almost all code was low-level. People were writing the first operating systems and servers—web applications didn't even exist yet. Having C, which provided a little more abstraction than assembly language while maintaining control, was the sweet spot for the 1970s. However, a few years later, Smalltalk emerged, which took some of the object-oriented concepts from Simula and built upon them. While it shares its ancestry with Ruby, it never became as popular as C. But it is a significant language that continued the trajectory of more abstract languages over time.
00:17:58.760 Then, in 1983, the year I was born, C++ appeared. It brought some of the object-oriented principles from Smalltalk and Simula to a C-like language, adding new features while remaining compatible with C. Objective-C also emerged around the same time, attempting to improve C while integrating object-oriented features. In 1991, Python was released and marked a significant leap towards more abstract programming languages. Python had more object-oriented features than any of its predecessors and didn't carry the baggage of C, C++, or Objective-C. It was a wholly new concept and functioned as a scripting language, eliminating the need to compile code.
00:19:36.000 1995 was a groundbreaking year for programming languages. It marked the release of Ruby, JavaScript, and Java. I see Ruby as being up there with Python, even placing it a bit higher due to my bias. JavaScript is also a high-level language, while Java runs on a virtual machine, allowing for greater portability across systems. Java's innovation is its virtual machine, which lets the same code run without being compiled for specific hardware. This became possible in the mid-90s as hardware advancements allowed virtual machines to run efficiently.
00:21:35.760 In the early 2000s, Scala came out, effectively aiming to be a better version of Java. It improved upon Java's design flaws and successfully achieved higher-level functionality than Java. Clojure is another language known for its abstract concepts and practical design, and it utilized the Java virtual machine, enhancing its appeal. Moving into modern history, Go, released around 2009, is a lower-level language than most on this list. Nonetheless, it still represents an improvement over C++, as it provides garbage collection built in, simplifying memory management.
00:22:58.760 Rust would rank beneath Go because it lacks a garbage collector, yet it is still higher level than C, though lower level than Ruby and Python. This overall trajectory of languages suggests a consistent upward progression towards higher-level programming. However, it's worth noting that no one seems to be trying to create something that surpasses Ruby in abstraction. Efforts have been made to improve C, C++, and Objective-C, yet Ruby maintains its position as the highest-level programming language, in my opinion. While it may not be the absolute highest, especially if you consider reasons for use and existing constraints, it stands out as a very high-level language.
00:25:00.640 However, it's now over 20 years old: development began in 1993, and the first public release came in 1995. If Ruby were being reinvented today with the same goals of high-level convenience and reduced manual memory management, there are certain things I would change. For example, I would eliminate symbols. Symbols in Ruby are similar to strings but remain persistent. While they reduce memory allocation during comparison, they serve as low-level performance optimizations that are less relevant today, given our modern hardware and memory capacities.
00:26:57.360 Additionally, in Ruby 2.1, a feature was introduced allowing strings to behave similarly to symbols through freezing, providing object identity. As of Ruby 2.2, symbols gained garbage collection features that essentially align them more closely with strings. Given the advancements in hardware since the mid-90s when Ruby was developed, we should no longer be overly concerned about minor memory considerations; we've ample resources available at this point.
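Editor's sketch, not from the talk: the object-identity point about symbols and frozen strings can be checked directly. The method names are hypothetical, and the example uses `String#-@`, which was added in Ruby 2.5, after this talk, so it assumes a reasonably recent Ruby.

```ruby
# Two mentions of the same symbol are always the same object.
def same_symbol?
  :status.equal?(:status)
end

# Two separately built strings with the same text are different objects.
def same_plain_string?
  String.new("status").equal?(String.new("status"))
end

# String#-@ returns a frozen, deduplicated copy, so equal text
# ends up sharing one object, much like a symbol.
def same_deduplicated_string?
  (-String.new("status")).equal?(-String.new("status"))
end

puts same_symbol?                # true
puts same_plain_string?          # false
puts same_deduplicated_string?   # true
```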
00:28:20.720 Next on my chopping block for Ruby would be floats. I wouldn't eliminate them entirely, but I believe we need a better decimal class. Floats are a source of numerous bugs and issues, especially when it comes to precision in calculations. For instance, a seemingly simple calculation like 0.1 plus 0.2 does not yield 0.3 as expected; instead, it results in 0.30000000000000004. This inaccuracy leads to real-world issues—such as storing money in floats, which can result in lost amounts due to rounding errors. A more precise decimal class, perhaps with an attribute for the currency, would mitigate these problems.
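Editor's sketch, not from the talk: the float behavior described above, next to two workarounds that exist in Ruby today. The method names are made up for the example. `Rational` is part of core Ruby; the BigDecimal library is another option. Neither carries a currency attribute like the class the speaker proposes.

```ruby
# Binary floats cannot represent 0.1 or 0.2 exactly.
def float_sum_matches?
  0.1 + 0.2 == 0.3
end

# Rationals store exact fractions, so the same sum compares equal.
def rational_sum_matches?
  Rational("0.1") + Rational("0.2") == Rational("0.3")
end

# A common convention for money: keep integer cents, not fractional dollars.
def total_cents(prices_in_cents)
  prices_in_cents.sum
end

puts 0.1 + 0.2               # 0.30000000000000004
puts float_sum_matches?      # false
puts rational_sum_matches?   # true
puts total_cents([10, 20])   # 30
```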
00:29:39.600 Another aspect I'd address is the presence of nil. In Ruby, calling a method on an unexpected nil is the equivalent of a null pointer exception; modern programming languages like Swift use optionals instead, which let programmers handle potentially missing values more explicitly. This could improve Ruby's usability and robustness. Typed arrays would also benefit Ruby, allowing the specification of array types to restrict additions to the same type. This could minimize errors while providing performance optimizations.
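Editor's sketch, not from the talk: how the nil problem tends to show up in Ruby, and two ways to soften it. `USERS` and the method names are hypothetical. The safe navigation operator `&.` arrived in Ruby 2.3, after this talk was given, and it is not the same thing as Swift-style optionals.

```ruby
# Hypothetical lookup table; a missing key yields nil.
USERS = { 1 => "erik" }.freeze

# Raises NoMethodError when the user is missing, Ruby's take on a
# null pointer exception.
def username_length_unsafe(id)
  USERS[id].length
end

# Safe navigation returns nil instead of raising.
def username_length_safe(id)
  USERS[id]&.length
end

# Hash#fetch forces the caller to decide what a missing value means.
def username_or_default(id)
  USERS.fetch(id, "guest")
end

puts username_length_safe(1)         # 4
puts username_length_safe(2).inspect # nil
puts username_or_default(2)          # guest
```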
00:30:55.760 Moreover, method overloading could be a useful feature in Ruby. Being able to dynamically dispatch different implementations of methods based on argument types can make the language more flexible and elegant. Additionally, I suggest that methods define themselves as method objects that return themselves. This would open many possibilities for method compositions and improve clarity in method calls without convoluted syntax.
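Editor's sketch, not from the talk: Ruby has no method overloading, so dispatch on argument type is usually written by hand inside a single method. The method `describe` is a made-up example of that workaround, not the feature the speaker is asking for.

```ruby
# One method body that branches on the argument's class, standing in
# for what would be three overloads in a language that supports them.
def describe(value)
  case value
  when Integer then "integer #{value}"
  when String  then "string of length #{value.length}"
  when Array   then "array of #{value.size} items"
  else raise ArgumentError, "unsupported type: #{value.class}"
  end
end

puts describe(42)          # integer 42
puts describe("ruby")      # string of length 4
puts describe([1, 2, 3])   # array of 3 items
```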
00:32:18.420 Additionally, I believe we should unify Ruby’s function syntax. Currently, the variations in defining functions create unnecessary complexity. We could borrow from JavaScript and merge some variations to create a cleaner syntax. Lastly, I advocate for better concurrency primitives to simplify concurrent programming in Ruby. Right now, threading is cumbersome, leading to potential performance issues when initiating a significant number of threads. There are better abstractions in other languages that allow you to handle concurrency more effortlessly.
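Editor's sketch, not from the talk: on the function syntax point, these are the main ways Ruby currently lets you express "double a number". All names are made up for the example. The variants are not fully interchangeable, which is part of the complexity being described.

```ruby
# 1. A method defined with def.
def double_method(x)
  x * 2
end

# 2-4. Three ways to build a callable object.
DOUBLE_PROC   = Proc.new { |x| x * 2 }
DOUBLE_LAMBDA = lambda { |x| x * 2 }
DOUBLE_STABBY = ->(x) { x * 2 }

# 5. A block handed to a method and invoked with yield.
def double_with_block(x)
  yield(x)
end

puts double_method(21)                     # 42
puts DOUBLE_PROC.call(21)                  # 42
puts DOUBLE_LAMBDA.call(21)                # 42
puts DOUBLE_STABBY.(21)                    # 42
puts double_with_block(21) { |x| x * 2 }   # 42

# Procs quietly ignore extra arguments; lambdas check arity.
puts DOUBLE_PROC.call(1, 2)                # 2
begin
  DOUBLE_LAMBDA.call(1, 2)
rescue ArgumentError => e
  puts "lambda rejected the call: #{e.class}"
end
```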
00:33:58.360 This is important, as concurrency can be daunting for developers. It would be more beneficial if we could simply instruct Ruby to parallelize tasks without needing extensive knowledge of the underlying systems handling those threads. In conclusion, these are my thoughts on enhancing Ruby and its ability to function as a more advanced, high-level language. Thank you.