Ruby Archaeology

Nick Schwaderer • September 09, 2021 • online

In the talk "Ruby Archaeology" by Nick Schwaderer at RubyKaigi Takeout 2021, the speaker explores the historical aspects of Ruby programming, celebrating its legacy and old code. Schwaderer discusses the significance of revisiting Ruby's past to understand its evolution and to acknowledge work that can still run today. The key points of the talk include:

  • Definition of Ruby Archaeology: Schwaderer introduces the term to describe the practice of exploring and running historical Ruby code.
  • Importance of Ruby's history: He emphasizes the value in recognizing Ruby's past contributions and styles, highlighting how they shape current programming practices.
  • Setting up an environment for legacy code: The speaker describes the complexities of configuring a modern machine to run Ruby 1.8 code, advocating for tools like Vagrant to emulate older environments.
  • Examples of noteworthy Ruby gems from the past:
    • Nokogiri: The speaker showcases this HTML parsing library, demonstrating how to use an old version that was first released in 2008.
    • Hpricot: As a predecessor to Nokogiri, Hpricot is introduced through a nostalgic demo, showcasing its unique features and the historical context of Ruby gems.
    • Builder: This library for generating XML markup is highlighted for its elegant syntax and functionality, reflecting Ruby's versatility.
  • Historical coding conventions: He discusses notable coding styles from the past, including the use of curly braces for blocks and the :: syntax for method invocation.
  • Conclusion and reflection: Schwaderer stresses that not all old code becomes obsolete; much Ruby code and many gems from the past remain runnable and relevant today. He encourages the exploration of historical Ruby projects and offers a Vagrant box containing the discussed environment for attendees to try out.

The talk concludes with an invitation for discussion and a reminder that learning from Ruby's history can positively influence modern practices.

In 2009 _why tweeted: "programming is rather thankless. you see your works become replaced by superior works in a year. unable to run at all in a few more."

I take this as a call to action to run old code. In this talk we dig, together, through historical Ruby. We will have fun excavating interesting gems from the past.

Further, I will answer the following questions:

- What code greater than 12 years old still runs in Ruby 3.0?
- What idioms have changed?
- And for the brave: how can you set up an environment to run Ruby 1.8 code from ~2008 on a modern machine?

RubyKaigi Takeout 2021: https://rubykaigi.org/2021-takeout/presentations/schwad4hd14.html

RubyKaigi 2021 Takeout

00:00:04.160 Hello there, my name is Nick, and I am an engineer at Shopify. I'm here today to talk to you about Ruby archaeology. Before I get started, I would just like to say thank you so much to the organizers at RubyKaigi, the folks who sponsor it, and the people who put in so much effort every year to put on this wonderful conference. I've been a fan for many years of the wonderful talks and content that comes out of this conference.
00:00:10.880 I'm just honored to be a part of it and hope I can step up and give you at least a fraction of the quality that I've gotten to enjoy from you over the years. So I guess the first question you're asking is, what is Ruby archaeology? It's obviously a term I've coined for this talk, but what does it really mean? Let me step back a second and talk about my interest in Ruby history.
00:00:38.079 I wasn’t around at the beginning; I wouldn't have been reading Dave Thomas in 1998 or speaking Japanese before then or, you know, even hacking around pre-2.3. I love reading about it and looking through the history. I've even run a newsletter for a while called Past Rubies, where people get emails detailing various significant Ruby-related events on this day in history, including talks, blog posts, releases, and much more. It’s really fun to read and experience that history because I think it's important to revisit this content.
00:01:06.400 Ruby is very much the same in terms of the core way that we use it over time. I've written adapters, so whenever I find a resource that was written back in the day, these objects can be passed a date object, giving me a bunch of links that I could open and read to explore the history. I've automated this process with Ruby scripting. I previously created a gem called PortalGun, which would assist you if you had, say, a corrupted Gemfile.lock or were struggling with dependencies. You could simply pass your Gemfile and a date object into the executable and receive a Gemfile that points to a specific point in time.
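The idea of pinning a Gemfile to a date can be sketched roughly as follows. This is not PortalGun's actual code: the release table, the gem name `examplegem`, and the method names are all hypothetical, and a real tool would have to look release dates up from a gem index.

```ruby
require 'date'

# Hypothetical release history: gem name => list of [version, release date].
# These entries are illustrative only, not real RubyGems data.
RELEASES = {
  'examplegem' => [
    ['1.0.0',     Date.new(2010, 5, 1)],
    ['1.1.0',     Date.new(2011, 2, 10)],
    ['2.0.0.rc1', Date.new(2011, 3, 1)],
    ['2.0.0',     Date.new(2011, 6, 20)]
  ]
}

# Return the newest stable (non-prerelease) version released on or before `date`,
# or nil when nothing had been released yet.
def version_as_of(name, date, releases = RELEASES)
  candidates = releases.fetch(name).select do |version, released_on|
    released_on <= date && !Gem::Version.new(version).prerelease?
  end
  newest = candidates.max_by { |version, _released_on| Gem::Version.new(version) }
  newest && newest.first
end

# Build a Gemfile line pinned to that version (nil if no version qualifies).
def pinned_gemfile_line(name, date)
  version = version_as_of(name, date)
  version ? "gem '#{name}', '#{version}'" : nil
end
```

Calling `pinned_gemfile_line('examplegem', Date.new(2011, 3, 15))` skips the release candidate and pins the last stable release before that date.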
00:01:39.680 For instance, if you knew that in March 2011 your repository and gemfile worked, you could pass in that date and it would give you all of your dependencies with the latest stable versions from that point in history. In particular, around North America in the pre-2010s, _why the lucky stiff's_ work was quite influential on me, with much of the historical Ruby code and content he had on Ruby talk and his blogs, as well as the gems he had written. One of the last tweets he wrote before he disappeared in 2009 was, 'Programming is rather thankless. You see your works become replaced by superior works in a year, unable to run at all in a few more.'
00:02:04.960 For this talk, I'm saying, let's run some old code. Let's do this! The first thing I'd like to discuss is setting up an environment for running Ruby code from 2008 or 2009 on a 2021 machine. I lost several days to this; it's actually quite a tricky task, and not something that’s particularly easy to do. You can't just use rbenv or rvm because, funnily enough, you’re not meant to.
00:02:15.200 Ruby versions 1.8 and 1.9 reached end of life over seven years ago, and if we’re using Ruby 1.8, spoiler alert: we’re going to use that today, it hasn't been the latest and greatest for 14 years. While there are good reasons to upgrade and keep your Ruby versions current, I believe it’s important to look to our past. We have a very mature language with a feature-rich, robust ecosystem, and to continue to progress with the language, we must keep an eye on those older code patterns, styles, and debates.
00:02:39.680 This is because we might just write Ruby the way we've always known and not really think about alternative ways of doing things, limiting ourselves to a narrow track of writing. Ruby is actually a very sharp knife that can be used however you want. So let’s get started! Here’s how I got it done.
00:03:00.880 The first thing I decided is that I couldn't run it directly on my machine, but I could set up a Vagrant box appropriate to the time. We’re lucky in being able to access older versions of Ubuntu; this is what we’re using today. If you've never used Vagrant before or have no background, that's fine; just pre-install it, and with three simple commands, you’ll be up and running in a terminal in an operating system from a long time ago.
00:03:27.920 However, you're only part of the way there! At this point, you can't just code in Ruby; you have to worry about your dependencies. The sources.list file in Ubuntu, which indicates where to find dependencies, points to a modern repository that won’t have any of the dependencies required for your 2008 or 2009 environment. Thankfully, there is a way to point to older packages, since the folks behind Ubuntu's old-releases archive (old-releases.ubuntu.com) maintain the old dependencies.
00:04:01.120 By using an old sources list, I was able to update my sources.list to point everything to old releases, and voila! Whenever I run apt-get, I will receive old updates. Of course, this isn't a perfect system, as 'old' can be ambiguous in terms of what time in history you are referring to, but it works for our purposes today. When debugging, just remember to think as if you’re in the past. If you have a question, don’t hesitate to type it with the date—in Google or elsewhere—to find relevant information.
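The sources.list change described here amounts to a host substitution. A minimal sketch, assuming the box's file uses the usual `archive.ubuntu.com` and `security.ubuntu.com` hosts and that the target is Ubuntu's `old-releases.ubuntu.com` archive; the method name and the `hardy` release in the usage note are illustrative, not taken from the talk:

```ruby
# Host of Ubuntu's archive for end-of-life releases (an assumption about what
# "old releases" refers to in the talk).
OLD_RELEASES_HOST = 'old-releases.ubuntu.com'

# Rewrite every archive/security mirror URL in a sources.list body so that
# apt-get fetches packages from the old-releases archive instead.
def point_to_old_releases(sources_list)
  sources_list.gsub(
    %r{https?://[a-z0-9.\-]*(?:archive|security)\.ubuntu\.com},
    "http://#{OLD_RELEASES_HOST}"
  )
end
```

For example, a line such as `deb http://us.archive.ubuntu.com/ubuntu/ hardy main` becomes `deb http://old-releases.ubuntu.com/ubuntu/ hardy main`. On the box itself you would read `/etc/apt/sources.list`, write the rewritten text back, and then run `apt-get update`.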
00:04:29.120 With this setup, I started up and directly installed Ruby, getting version 1.8.7 and it worked, which was pretty cool. RubyGems did install, but `gem install` wasn’t working as usual because the protocol for connecting to RubyGems today isn’t compatible with 14-year-old code. Git was also installed, but there were a few problems: it doesn't want to communicate with GitHub, which probably doesn't surprise anyone here. So we were somewhat hamstrung by our inability to use `gem install`, `git pull`, or `git clone`.
00:05:04.000 However, I had an idea. We can download `.gem` files from rubygems.org. I have a system that can download files from the internet and install them directly from the `.gem` file with the RubyGems we have. I could use `wget` to pull down an exact version of an old gem, and by using `gem install --local`, we would be back in business! You may have to install other dependencies to use some basic gems, but that’s expected as you start with a bare box.
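The download-then-install workaround can be written down as a small helper that prints the two shell commands. This is a sketch: the method names are made up, and it assumes the `https://rubygems.org/downloads/NAME-VERSION.gem` URL layout that rubygems.org uses for `.gem` files.

```ruby
# File name RubyGems uses for a packaged gem, e.g. "examplegem-0.1.0.gem".
def gem_file_name(name, version)
  "#{name}-#{version}.gem"
end

# Direct download URL for that file on rubygems.org (assumed URL layout).
def gem_download_url(name, version)
  "https://rubygems.org/downloads/#{gem_file_name(name, version)}"
end

# The two shell commands from the talk: fetch the file, then install it
# without contacting any remote gem source.
def offline_install_commands(name, version)
  [
    "wget #{gem_download_url(name, version)}",
    "gem install --local #{gem_file_name(name, version)}"
  ]
end
```

Running `puts offline_install_commands('examplegem', '0.1.0')` prints the pair of commands to paste into the old box; substitute the gem name and the exact historical version you want.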
00:05:50.880 Okay, you are set up, and ready to code some old Ruby! Let’s explore some old Ruby code together using our new system. Nokogiri—you're likely quite familiar with this gem, as it’s the industry standard for HTML parsing. A tremendous number of gems rely on it, with hundreds of millions of downloads.
00:06:01.560 It’s a great place to start, as it has a robust API that is still strong today. Plus, it was first released in 2008, which makes it perfect for our exploration. We can actually use a version of Nokogiri that dates back to that time, something we are all familiar with, allowing us to play around with the code and expect it to perform well on our old machine. Without further ado, let’s get into the demo.
00:06:22.080 Here we are in our coding environment, using Ruby 1.8.7 as if we are back in 2008 with the appropriate versions of our gems installed locally. In our terminal, the first thing we’ll do is require Nokogiri. Oh, that’s right! In 2008, you had to require RubyGems before you could require Nokogiri.
00:06:29.280 So let me go into an `irb` where I've already set up my `.irbrc` to include the require for RubyGems, allowing us to start with no obstacles. A lot of people would have done this back then. If you use OpenURI, there’s a little curiosity for you: you’re probably used to opening websites, but in those days, `open` was a kernel-level method, not called directly on URI. It was part of the core.
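As a rough illustration of the two habits mentioned here, the sketch below loads RubyGems explicitly and then picks whichever open-uri entry point the running Ruby offers. The helper names are hypothetical; only `require 'rubygems'`, `require 'open-uri'`, `Kernel#open`, and `URI.open` are real.

```ruby
# On Ruby 1.8 this line had to come before requiring any gem (many people
# put it in ~/.irbrc). On modern Ruby it is harmless: RubyGems is preloaded.
require 'rubygems'
require 'open-uri'

# Report which open-uri entry point suits the running Ruby. Old Rubies only
# had the private Kernel#open; newer ones expose a public URI.open.
def open_uri_entry_point
  URI.respond_to?(:open) ? 'URI.open' : 'Kernel#open'
end

# Fetch a page body using whichever entry point is available.
# (Needs network access; not exercised here.)
def fetch_page(url)
  if URI.respond_to?(:open)
    URI.open(url).read
  else
    open(url).read
  end
end
```

On Ruby 3.0 and later, passing a URL to plain `open` no longer goes through open-uri, which is why the modern branch calls `URI.open`.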
00:06:47.920 You simply pass Nokogiri an HTML file that I’ve prepared for you, and it greets you nicely! You can interact with it as you like. As you see with the `<p>` tags, you can call search methods or even call `css` directly on it, and it behaves much as you'd expect it to today.
00:07:09.920 Now, let's have a little more fun and look at some Nokogiri core code from that time in history. Here’s our first look at repository code. If you’re a seasoned Rubyist with years of experience, this might look entirely standard to you, but if you’ve only been coding in Ruby for seven years or less, you might notice some differences in the dialect.
00:07:21.440 That’s right; you might have been forced to use `do … end` for your multiline blocks, but here we have curly braces, which is something significant in Ruby lore and convention from the past. Avdi Grimm promotes using curly braces for functional blocks, meaning those whose return value you want, while `do … end` is preferred for side effects. This distinction was discussed over a decade ago, and some might even reference Jim Weirich, who talked about how to use blocks in this manner.
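A small sketch of that convention, with made-up method names: braces where the block's return value is used, `do … end` where the block runs for its side effect.

```ruby
# Braces, even across several lines: the value the block returns is what
# we care about, because map collects it.
def doubled_prices(prices)
  prices.map { |price|
    price * 2
  }
end

# do...end: the block is run purely for its side effect (appending to a log).
def log_prices(prices)
  log = []
  prices.each do |price|
    log << "saw #{price}"
  end
  log
end
```

Both forms are ordinary Ruby blocks; the choice signals intent to the reader rather than changing what the block can do.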
00:07:51.440 Thus, here we see a curled friend, allowing you to return values well. It's part of Ruby's history. Another interesting note is invoking a method using the `::` syntax; this is still fully supported in Ruby today, though your linter might object. Sometimes, when writing your code, it’s effective to use a `::` instead of a `.` which provides a unique method invocation.
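For readers who have not seen it, here is a minimal sketch of `::` used for method calls. The wrapper method names are illustrative.

```ruby
# The familiar spelling: call a method with a dot.
def shout_with_dot(text)
  text.upcase
end

# The older spelling: the same call written with a double colon.
def shout_with_colons(text)
  text::upcase
end

# Module functions were often called this way too.
def root_with_colons(number)
  Math::sqrt(number)
end
```

One caveat: after `::`, a capitalized name without parentheses is looked up as a constant, so `Foo::Bar` is a constant reference while `Foo::Bar()` is a method call.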
00:08:29.440 Also notable: the use of attr_accessor is not all contained in one line. You can call it whenever you want and as many times as you'd like. Now, let’s transition to another gem I’d like to showcase: you might not recognize this unless you were hacking Ruby 13 years ago, but it’s called Hpricot, which was a major predecessor to Nokogiri.
00:08:58.400 The two have a fascinating connection: Nokogiri was intended to be a competitor to Hpricot, and when it emerged, it was a drop-in replacement for Hpricot. Nokogiri was faster and had fewer bugs, but provided the same functionality that Hpricot did. The Hpricot library stopped being maintained after _why the lucky stiff_ relinquished control in 2009, which makes this a perfect opportunity to explore some older code that is no longer maintained.
00:09:19.040 Let’s dive into a quick demo with Hpricot. We’ll call the library and get straight to business parsing an HTML file that I’ve written. One of the things I love about the API of Hpricot is how seamlessly you call it at the top-level. You can just pass your HTML into it directly, enabling a swift and elegant output.
00:09:50.080 As we parse it, we can assign it to our `doc` variable and push it into an empty array. Next, we will iterate through the rows and pull out their inner text, filtering based on the types of meats we’re interested in, resulting in some prices and locations. It’s a nostalgic process, reminiscent of good old Ruby programming.
00:10:20.000 Now let’s take a final look at our Hpricot code. A fun aspect I noticed is the method definition conventions. You usually don't see methods beginning with a capital letter, but here the method name matches the Hpricot module itself. You can call `Hpricot` immediately after the `require`, passing arguments DSL-style, and it does just what it sounds like.
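The capitalized-method convention can be sketched with a made-up module (`Parsley` is hypothetical, not Hpricot's code). Ruby itself does the same thing with `Integer("42")` and `Array(nil)`.

```ruby
module Parsley
  # Stand-in for a parsed document; it just remembers the source text.
  Document = Struct.new(:source)

  def self.parse(source)
    Document.new(source)
  end
end

# A top-level method with the same capitalized name as the module. Because it
# is called with parentheses and an argument, Ruby treats it as a method call
# rather than a constant lookup.
def Parsley(source)
  Parsley.parse(source)
end
```

With that in place, `Parsley("<p>hi</p>")` reads like a conversion function and returns a `Parsley::Document`.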
00:10:43.200 The method `def Hpricot` might feel different, but it reflects an era where such conventions weren’t uncommon. Hpricot uses a blank slate class, which provides a base class with almost no predefined methods (only essentials like `method_missing` and `instance_eval`), which is useful if you are working with dynamic classes.
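On Ruby 1.9 and later, `BasicObject` fills the role a hand-rolled blank slate class played in the 1.8 era. The sketch below uses it in place of the original class; `Recorder` and `__calls__` are hypothetical names.

```ruby
# Inherits almost nothing: no #class, #to_s, #inspect and so on, so nearly
# every call falls through to method_missing.
class Recorder < BasicObject
  def initialize
    @calls = []
  end

  # Record the name of every unknown method and return self so calls chain.
  def method_missing(name, *_args)
    @calls << name
    self
  end

  # Expose what was recorded.
  def __calls__
    @calls
  end
end
```

Because even `class` and `to_s` are undefined on a blank slate, `Recorder.new.name.class.to_s` records all three names instead of dispatching to the usual `Object` methods.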
00:11:06.120 Next, we’ll examine the configuration options; there's quite a bit happening here. The invocation of config here illustrates the self-parameter or self-style coding that’s beginning to lose steam in contemporary Ruby practice but holds a certain charm that I believe should still be explored.
00:11:29.440 This holds true in our `self.make` method, where writing `def Hpricot.make` achieves the same outcome, suggesting there's still room for creativity within Ruby’s conventions. Don’t forget about the treasure trove of great gems from this period. We've seen how the curly braces behave, especially when returning values, reinforcing historical practices in a fair manner.
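The two spellings of a module-level method can be compared side by side. `Toolbox` and its methods are illustrative names only.

```ruby
module Toolbox
  # The common modern spelling: define on self.
  def self.make(kind)
    "made #{kind}"
  end

  # The older spelling: name the module explicitly. Inside the module body
  # self is Toolbox, so this defines the same kind of method.
  def Toolbox.build(kind)
    "built #{kind}"
  end
end
```

Both are singleton methods on the module and are called the same way, `Toolbox.make(:hammer)` and `Toolbox.build(:hammer)`; the explicit form simply has to be edited if the module is ever renamed.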
00:12:00.680 Now, let’s move on to another cherished gem from the era—Builder. This one is all about generating XML markup in a Rubyish way, something incredible for its time and remains functional. Jim Weirich, the maintainer of Builder, created it to elegantly build XML documents, as showcased in this code snippet, which uses hash rockets effectively.
00:12:27.680 Rubyists back then often utilized hash rockets exclusively, meaning that in our example for initializing this markup, we can use methods like `.person`, `.name`, and `.phone` through method missing to output valid XML, showcasing simplicity and elegance in Ruby’s output generation.
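To make the `method_missing` trick concrete, here is a toy markup generator in the spirit of Builder. It is a sketch, not Builder's implementation: `TinyMarkup` is a hypothetical class, it does no escaping or indentation, and attributes are only supported alongside text content. The `target!` name mirrors the method Builder uses to return its output.

```ruby
class TinyMarkup < BasicObject
  def initialize
    @buffer = []
  end

  # Any unknown method name becomes a tag. Text content, hash-rocket
  # attributes and a nested block are all optional.
  def method_missing(tag, text = nil, attributes = {}, &block)
    attribute_text = attributes.map { |key, value| %( #{key}="#{value}") }.join
    @buffer << "<#{tag}#{attribute_text}>"
    @buffer << text.to_s if text
    block.call(self) if block
    @buffer << "</#{tag}>"
    self
  end

  # Return everything generated so far as one string.
  def target!
    @buffer.join
  end
end
```

Usage looks much like the era's examples, hash rockets included:

```ruby
xml = TinyMarkup.new
xml.person { |b| b.name('Jim'); b.phone('555-1234', :type => 'home') }
xml.target!
```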
00:12:56.960 Let’s delve deeper and see if you can guess what I’m trying to achieve here: yes, we have the capability to pass the Builder instance itself around or as arguments in a very sophisticated manner. Through this, you can generate XML structures seamlessly.
00:13:22.560 To wrap up with Builder, we notice the continuous use of methods in a creative way, demonstrating how versatile Ruby code can be, particularly looking through a historical lens. Next, let’s engage with the internals behind Builder once more in something interesting—a development seen in Hpricot with its relationship to the prominent blank slate design pattern, made famous by Jim Weirich.
00:13:51.360 This provides us with powerful opportunities in our Ruby code, with the same opportunities seen in Hpricot that closely followed this practice. Another element of intrigue could be using the `and` keyword within this context. While it is often discouraged today, it holds a particular charm, reminding us to embrace the historical multitudes found in our language as we evolve.
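Part of the reason `and` is often discouraged is its very low precedence, which differs from `&&`. A minimal sketch with hypothetical method names:

```ruby
# `and` binds more loosely than assignment, so this parses as
# (result = a) and b: result receives a, and b is evaluated afterwards.
def assign_with_and(a, b)
  result = a and b
  result
end

# `&&` binds more tightly than assignment, so this parses as
# result = (a && b).
def assign_with_double_ampersand(a, b)
  result = a && b
  result
end
```

With `true` and `false` as arguments, the first method returns `true` and the second returns `false`. The keyword reads well for control flow, as in "do this and then that", which is how older code tended to use it.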
00:14:18.760 In our last bit of exploration today, we see another library where we have the curly brackets returning values. This harks back to the principles laid down by Jim Weirich years ago, and it’s become a central part of our exploration today.
00:14:45.760 So, that basically concludes our dive into Ruby archaeology for today! I've truly enjoyed whittling away at this history along with you, and it's a treasure trove of gems that we can still find and learn from. I hope you have as well! I want to highlight a few engaging projects just waiting to be rediscovered, like a delightful small micro-framework called Camping.
00:15:17.680 Camping has its charm, and it’d be interesting to get that old code running on hosted boxes to serve up some websites. Then there’s Unholy, a fascinating project that converted Ruby to Python bytecode and recompiled it back. In the early 2000s, many Rubyists, unable to find work, often converted Ruby to languages in demand.
00:15:45.680 I'd love to explore Unholy further in the future, as I am devoted to digging into Ruby’s rich past. To that end, I’d like to provide a gift to you: I’ve prepared a Vagrant box that you can use.
00:16:07.920 If you run the command `vagrant up` from `schwad/ruby_archaeologist`, you’ll get this exact same environment, built by hand today. With those two commands, you will have the capability to `vagrant ssh` into your box and start executing all your Ruby code, including the three gems demonstrated today.
00:16:39.920 This will always be free, and I may even update it with new thrilling features as time continues. You also may recall music from 2008 that is entirely runnable in Ruby 3.0—added fun for today.
00:17:01.960 Realize not all old code dies merely after some time; it can remain runnable and relevant many years later. This tidbit showcases how Ruby can embrace its past while continuing to move forward.
00:17:23.440 Thank you very much for your time! Here is some information about my work, and I’d love to hear from you about your experiences with historical Ruby. Please contact me about things I might have missed or find interesting, and again, thank you so much for your attention. Enjoy the conference!