RubyConf 2021

Black Swan Events in Open Source: That time we broke the internet

by Julia Ferraioli and Amanda Casari

The video "Black Swan Events in Open Source: That time..." presented by Julia Ferraioli and Amanda Casari at RubyConf 2021 explores the concept of black swan events—unpredictable occurrences that can have significant impacts—within the realm of open source. The talk emphasizes the importance of understanding open source as a complex socio-technical system, which integrates social and technical elements that greatly influence how changes and disruptions occur in this ecosystem. The speakers identify three critical historical events as case studies of black swan events in open source history, demonstrating how they have reshaped the landscape:

  • The Morris Worm (1988): This early internet worm caused extensive disruptions by exploiting vulnerabilities, marking the first felony conviction related to computer fraud. Its impact underscored the necessity for better security and trust protocols within the internet infrastructure.

  • Heartbleed (2014): A vulnerability in the OpenSSL library that exposed sensitive user data, Heartbleed highlighted the risks associated with relying on a few maintainers for critical infrastructure. The community's response brought increased awareness and funding towards open source sustainability, indicating the essential role of maintainers in the ecosystem.

  • Left Pad Incident (2016): A developer's removal of a common npm package led to widespread failures across many dependent applications. This incident showcased the fragility of package management systems and sparked discussions about maintainer rights and dependency complexity.

In conclusion, the presentation brings to light the need for a comprehensive understanding of open source dynamics, advocating for increased support and recognition of the maintainers who form the backbone of this ecosystem. The speakers also introduce the "Open Source Stories" project, aimed at capturing and preserving the history and experiences of individuals within the open source community, thus addressing the need to document various narratives and perspectives. Overall, the talk encourages an ongoing conversation about the socio-technical aspects of open source and the implications of significant disruptions in the industry.

00:00:10.559 Hi, RubyConf! Welcome to Black Swan Events in Open Source: That time we broke the internet.
00:00:15.040 We're coming to you today from the past, as we recorded this talk last month. We're going to delve into complex socio-technical systems and how they interact with open source.
00:00:35.360 I'm Julia Ferraioli. I'm a white woman with wavy brown hair, wearing a gray and white flannel shirt. I've been working in and around open source for an unspecified number of years, focused on policies, sustainability, and everything else that needs to be done.
00:00:40.559 I'm a bit of an open source archaeologist because I dig around in data, stories, and source code with a fine-toothed comb and enthusiasm.
00:00:54.400 Hi, I'm Amanda Casari. I'm a white woman with short light hair, wearing a gray vest and striped shirt. Most recently, my work has focused on understanding the nuances of open source—where it works, where it needs support, and where all the things are tucked away in the back room that we didn't know were holding up the internet.
00:01:22.080 We want you to know that slides and speaker notes are available at the short link, which is bit.ly/black-swan-oss. Now, before we jump into story time, Julia and I would like to level set with you on a few foundational concepts to ensure we're all on the same page.
00:01:41.439 First, we will have a very brief introduction to complexity theory, which is the conceptual framework that defines complex systems. Complexity theory states that while complex systems are unpredictable, they are constrained by order-generating rules. A complex system has components that interact in multiple ways, following local rules with no unifying rule to define all interactions, but where the emergent system is greater than the sum of its parts.
00:02:11.440 After that, we want to look at a special kind of complex system—socio-technical systems. In a socio-technical system, we have two new constraints to consider: number one, the social and technical aspects of an organization are interrelated and cannot be isolated in analysis; number two, we always have to account for interactions between society's complex infrastructure and human behavior.
00:02:35.200 So, what does this have to do with open source? We find that anyone we've talked with who has worked in open source for more than a few weeks has a story about how open source has changed. Looking back further, much further—as we do in open source and in archaeology—we can distinctly see how open source has evolved over time.
00:02:59.680 I’m going to do open source a grave injustice and highlight only a few events in its rich history. Initially, software was treated as if it were in the public domain, traded back and forth between scientists and researchers. We hit a point in 1974 when a court ruling established that you could copyright code. The software community grew increasingly uneasy with this trend.
00:03:25.760 In 1989, GPL version 1 was released—this was a pivotal point in open source history. We also saw other milestones along the way, such as the term 'open source' being coined by Christine Peterson and the Open Source Definition, the OSD, being written under the auspices of the Open Source Initiative in 1998.
00:03:46.640 The landscape changed once again with the advent of Git and GitHub in 2005 and 2008 respectively, making open source software and contributing to open source more accessible to more people. You no longer had to search mailing lists to find repositories or email patches around.
00:04:07.360 What we see in open source history is a system that has undergone extensive evolution, and that evolution happens through the disruption of norms. So given open source's evolution, how can we describe it? It's a system: a distributed system, a cooperative system, a political one, a socio-technical system, and an organic system.
00:04:32.639 If we look at the evolution and at all these individual layers of open source, we start to see common patterns. We see components interacting in multiple ways, following local rules with no single rule to define all interactions, and the emergent system of open source being greater than the sum of all its parts. Thus, we can confidently say that open source is a complex system.
00:05:00.800 Furthermore, if we want to understand open source and how it changes, we have to view it as a socio-technical system. The human interactions and social components cannot be separated from the technical systems themselves, and this consideration is most important when disruptions occur, especially those that fundamentally change the ecosystem.
00:05:19.840 Sometimes, a disruption to an ecosystem may be what we can call a black swan event. What makes these events so special? There are three critical components of a black swan event that we have to consider: one, they involve a disruption; two, they cause systemic change; and three, they seem inevitable in hindsight.
00:05:47.360 We would like to use this framework to identify three specific events in the history of open source that highlight the complex socio-technical nature of our evolving ecosystem. To do this, we first have to go back to 1988.
00:06:02.000 In 1988, there were about 60,000 computers networked together to form the public internet. While they were increasingly few in number, a majority of these computers were government or academic systems. The Morris worm, or just 'the worm' as it’s commonly known, was created inadvertently by Robert Tappan Morris when he was a student at Cornell. Morris released the worm via a system at MIT to prevent it from being traced back to him, as part of a research project examining internet systems and how their open protocols could be exploited.
00:06:28.000 This worked much more quickly than Morris expected by exploiting multiple operating system vulnerabilities and, most importantly, the trust that existed at that time between system admins and users. The Morris worm infected about 10% of the global internet in a matter of days. Among its notable firsts, it was the first computer worm to be distributed via the internet rather than traded by floppy disk or other removable media, and Morris became the first person convicted of a felony in the U.S. under the 1986 Computer Fraud and Abuse Act.
00:07:00.800 But why does this event matter to open source, and how is it a black swan event? Using hindsight, we can see how the status quo has changed and the influences on how we operate today. After the Morris worm, we no longer assume everyone is operating in each other's best interest. This event occurred in November 1988, just months before the GPL and Linux were released for the first time.
00:07:50.679 We cannot ignore the human interests, community expectations, and social trusts that existed at the time of these releases, all of which were fundamentally impacted by the Morris worm. The reaction and recovery from the Morris worm demonstrated the robustness of the internet, which was designed as a partitionable and distributed system prioritizing open knowledge sharing.
00:08:33.600 Morris quickly worked to move fixes to the worm onto multiple mailing lists. Unfortunately, some of those actions were delayed due to the effectiveness of the worm, but system administrators worked together effectively to isolate the damage and prevent impacted systems from spreading problems further. Because of the Morris worm, the internet and the software it runs on have changed.
00:09:12.400 We saw a shift in security regulations and international laws. There was a tension, especially in industry and government, to move toward proprietary systems and contracts. The exploit the Morris worm used to access systems without logins meant that now we all have to remember all our passwords.
00:09:49.280 Let's jump forward to 2014 and talk about Heartbleed, which is one of my favorite examples when talking about open source as a complex socio-technical system. Heartbleed was first exploited in 2012 but only publicly disclosed two years later in 2014.
00:10:09.680 What was Heartbleed? It was a vulnerability in the most popular open source cryptography library, called OpenSSL. Essentially, it made user-entered data vulnerable to being captured by malicious third parties. With that data, along with exposed cookies and passwords, bad actors could impersonate users, compromising even more information and security.
00:10:49.920 Heartbleed affected a variety of systems, including everything from phone systems to payment processors, to gaming services, and most websites. Given that Apache and Nginx servers were both susceptible to Heartbleed, it affected at least 66% of the internet, including government databases that held personally identifiable information.
00:11:33.200 Heartbleed was undoubtedly a black swan event in the history of open source. It drew attention to the fact that while open source is distributed, maintainership does not have to be, and in many cases isn't. Very few people realized beforehand that OpenSSL was primarily maintained by one or two volunteers.
00:12:13.120 This led to a significant recognition that a maintainer's time can be extraordinarily limited, as we saw in OpenSSL's case where the maintainer didn't have enough time to mitigate Heartbleed. Heartbleed highlighted that while open source is about software or hardware, it cannot function without supporting the people behind it.
00:12:43.760 As a result of Heartbleed, donations poured in to allow core maintainers to work full-time on OpenSSL. It also increased interest in open source sustainability and supply chain analysis, and prompted more tooling to support both.
00:13:05.240 Companies that relied upon open source began to see sponsored development and open source contributions as an investment in their own stability, leading to the founding and funding of initiatives like the Core Infrastructure Initiative.
00:13:25.920 Heartbleed is a classic example of a black swan event because of the complex socio-technical factors that led to it. I’ll pass it off to Amanda now.
00:13:39.040 Thanks, Julia. Finally, we'd like to highlight the impact of an event in 2016, which continues to be felt by those working on open source package management, the platforms they depend on, and the people whose work they share.
00:13:45.360 The quick story of Left-Pad began with a trademark disagreement over an npm package name. It's important to note that the package name in question was not Left-Pad; it was actually a completely different package.
00:13:54.720 In the Node package management system, many maintainers manage multiple packages. When this maintainer lost the trademark dispute because npm sided with the trademark holder, the developer decided to delete all of their npm packages, including Left-Pad.
00:14:07.760 This decision was within the developer's rights; they maintain their own work. However, Left-Pad was a transitive dependency for many other packages, leading to widespread cascading failures across the internet.
00:14:22.080 When Left-Pad was pulled from npm, many websites relying on Node packages suddenly stopped working. In response, npm decided to restore the unpublished package without the developer's consent to keep services running and maintain customer access.
00:14:36.560 So how is this related to open source, and why is it classified as a black swan event? With hindsight, we can see clearly that unpublishing code and releases is perfectly legal; you can unpublish code under open source terms. However, there are nuances in the rights and responsibilities creators have regarding their contributions to a shared ecosystem.
00:15:17.920 This case led to wider discussions about maintainer rights and open source contractualism, as well as scrutiny around privately run package managers for open source software. The increased awareness has highlighted the complexities of dependencies and what they mean for maintainers and users.
00:15:35.760 Those are our three events, but how did we gather all this information? There's not a centralized repository covering technical and social aspects. As I mentioned earlier, I'm a bit of an archaeologist; I conducted some manual research, cross-referencing various sources.
00:16:10.720 The Internet Archive is a treasure trove of information, and various news articles contain snippets of details about these events. While CVEs are helpful, they do not necessarily tell the whole story, especially about social aspects.
00:17:01.040 We also leaned on our own experiences in open source, but we still miss the human experience apart from our own. This brings us to a new project called Open Source Stories, which you can explore today at opensourcestories.org.
00:17:23.200 This project is focused on collecting oral histories and stories of open source, as well as the people behind them. We hope to capture a variety of experiences from everyone involved in this vast open source ecosystem, documenting pivotal points in open source history.
00:17:51.520 We wish this repository of narratives existed before, so we decided to create it. It is a labor of love, and we want to share our journey along with the website opensourcestories.org, which features a component related to StoryCorps.
00:18:05.760 StoryCorps is a nonprofit organization in the US that captures lived experiences and narrative histories through shared conversations. They archive these experiences in the US Library of Congress, turning them into documented history, which is our goal for Open Source Stories.
00:18:40.600 Currently, we have some first stories published at opensourcestories.org/stories. In the future, we hope to allow self-directed storytelling for those who wish to share their stories without a facilitator present. We also want to onboard new facilitators who can guide conversations and broaden our outreach.
00:19:03.680 If you are interested in contributing to Open Source Stories, you can tell your story or volunteer to help. We have information on how to contribute on our GitHub page, as well as an opportunity to help edit transcripts and copy from recorded conversations.
00:19:40.360 Being able to transcribe recorded conversations is an important task. We aim for these discussions to be accessible to all, creating transcripts that work for people with diverse needs.
00:20:00.180 Another way to get involved is to become a storytelling facilitator, a role currently filled by just Julia and me. We would love for more people to join us and facilitate these meaningful conversations with others.
00:20:19.560 Finally, if you're interested in the references and slides we discussed, they are available at bit.ly/black-swan-oss. We've also included links regarding StoryCorps, Open Source Stories, and the simplified open source timeline that Julia created.
00:20:54.880 With that, we will wrap up and say thank you, and we'll now return to our live selves for some Q&A, if there's time.
00:21:00.160 Hi, future Amanda and Julia. Thank you all!