Dependency Management

Summarized using AI

The Social Coding Contract

Justin Searls • December 02, 2014 • Earth

In the video "The Social Coding Contract" presented by Justin Searls at RubyConf 2014, the focus is on the complexities and challenges of open-source software development, particularly regarding dependencies and maintainership. Searls discusses how open source has revolutionized coding, allowing developers to utilize others' work but also presenting significant problems. The key points include:

  • The Nature of Open Source: Open source facilitates collaboration between developers and organizations, enabling startups to create applications by leveraging existing libraries and tools. However, this often leads to dependency nightmares due to a lack of understanding of the underlying code.
  • Dependency Management Challenges: As tools for code sharing and dependency management (like RubyGems and npm) have evolved, so have the complexities associated with them. Searls emphasizes issues like transitive dependencies and version conflicts, which can result in applications breaking unexpectedly.
  • The Role of Maintainers: Maintainers are often portrayed as heroic figures in the open-source community, but they face burnout due to the high expectations placed on them by users who think of libraries merely as tools, unaware of the small communities maintaining them.
  • Market Dynamics and Trust: The trust placed in open-source tools grows as they become more popular, often leading to a lack of scrutiny. Security vulnerabilities can emerge because few people take the time to vet the code, leading to a tragedy of the commons scenario.
  • Communication and Community: Searls discusses the importance of fostering better communication within open source, advocating for higher-fidelity interactions to combat the anonymity and toxicity that can sometimes dominate online discussions.
  • Looking Forward: He proposes a need for improved tools to help new contributors engage and for organizations to take responsibility for their dependencies. Searls concludes that to navigate the future of open source effectively, individuals and companies must recognize their role in ensuring code quality and enhancing the community.

In summary, Searls points out that while open source enhances innovation and productivity, it also requires a communal effort to maintain and secure the code we rely upon for our projects.

Social coding revolutionized how we share useful code with others. RubyGems, Bundler, and GitHub made publishing and consuming code so convenient that our dependencies have become smaller and more numerous. Nowadays, most projects quickly resemble a Jenga tower, with layer upon layer of poorly understood single points of failure.

RubyConf 2014

00:00:18.480 All right, good morning! Let's roll. My name's Justin. Fun fact: I get paid by the tweet, so if you could follow me on Twitter and say hello, I'd love that. If you want to drop me a longer message, you can reach me at [email protected].
00:00:25.519 Open source is good, right? Companies are working with competitors and other companies on common tools, and then turning around and sharing that for free. Startups now can stand on the shoulders of giants and build great new things by simply adding a little code on top. Companies that couldn't exist otherwise are now thriving.
00:00:38.239 Never before in the history of the universe has an individual, not state-sponsored or company-sponsored, been able to do a little work on their own and literally change how the world works. But is open source truly good? I mean, companies love consuming open source, but if you ever want to share an upstream patch, let alone open source a library, they suddenly become very stingy and skeptical about this open source thing.
00:00:55.920 Many startups fall into the trap of hoovering up all this free stuff without understanding how it works, building maintainability nightmares. When they become successful, they can’t add new features anymore, and most of the maintainers I know are pretty burnt out. They don’t like the idea that they are doing something for fun in their free time, while companies are running their operations on that stuff and then expecting customer support on nights and weekends.
00:01:32.400 Today, my goal is just to bring to light a handful of issues affecting open source. My only objective here is to encourage you to do the same thing. Maybe if we can start to build broader awareness of some of the systemic issues in open source, we can have ideas on how to fix them. Perhaps someone will come along and start creating new, creative solutions, and we can begin to live and realize the promise of true openness, whatever that may mean.
00:02:13.200 Today, we’ll just look at a handful of topics, such as dependencies, pulling back the curtain a little bit to show what it's like to be a maintainer, issues of trust, adoption, security, and some deep thoughts about how we interact with each other as humans, as well as where I think the future is heading.
00:02:34.720 I'd like to start off with a definition of the term 'ideology.' Most of us think of the word as a political subscription or affiliation, but I like this definition better: 'They do not know it, but they are doing it.' Ideology is the negative space that's driving our actions without us even realizing it. This quote is from Karl Marx.
00:02:46.239 Open source fans might be considered a bunch of hippies, so I figured I’d start with a Marx quote. Capital is an interesting book because it sits at the intersection of philosophy and economics. I think it’s an interesting subject to start with today because so does open source. We share all this code altruistically as if to earn karma from people we don’t know, yet there are all these companies making bucket loads of money off open source.
00:03:13.440 Every company, even those that don’t contribute to open source, needs it to survive. Considering capital in traditional economies, I want to chart the march of progress in economics. In the beginning, everything was tough; everyone was just trying to survive day-to-day. But then, as groups of people began to form, specialization emerged. You could go to one market for your vegetables and another for your meats.
00:03:26.000 Through efficiencies, this opened the door to the development of human culture. Recreation and art emerged. Industrialization further optimized this because now we could go to one place and get all kinds of goods. The internet totally inverted that. Now, from my bed on my iPad, I can order things from anywhere in the world and have them shipped to my door. This is progress, but where does it lead us?
00:04:03.680 This year, there were rumors that Amazon is using big data to predict what you’re going to buy before you even click one click. They’re actually shipping it to distribution centers near you in advance so that they can send it to you the same day or the next day. Some people are starting to ask questions like whether this march of progress is somehow taking away something fundamental to the human experience. It’s an unintended consequence.
00:04:27.440 Another example of unintended consequences in the march of economics is in the documentary Food, Inc. That documentary was a big hit, with its aggressive tagline stating, 'You’ll never look at dinner the same way.' However, I think it’d be more honest if it said for at least a month, because we can’t change these systemic issues. When we chart progress over time, we see a natural accumulation of awfulness mounting up until it reaches a point where we all panic about it.
00:05:01.199 When we panic, we think we can fix this overnight, but we can’t just turn on a dime. Nothing will stop this train; it’s going to keep getting worse before it gets better. Maybe we can rein it in to current panic levels later. Speaking of awfulness, let’s chart the march of progress for all of the tools that we use to pull in new dependencies from the open source world. In the beginning, there were just files out there on the internet.
00:05:40.640 If I wanted to build a system and pull in some open source, I had to go find it, download it, and literally check it into my version control. It then became logically part of my application. If it broke, I had to fix it. That served as a guard against pulling in too much because I didn’t understand it, and it only added to what I had to maintain. Makefiles and common build system tools emerged as great ways to depend on tools logically.
00:06:56.560 Now, I could build an application and depend on something like libxml, which is built on each of the systems that needed to be able to compile it as a separate entity. I could upgrade it separately, viewing it apart from my application code. Java and its jar files were another great innovation because, instead of having to configure all these build systems appropriately, a single compiled bytecode could be distributed and then run anywhere.
00:07:38.960 With Java, I could go to a website, download a jar, put it on my class path, and it would just work. This was so convenient that it opened the door for those sites to say, 'Oh, and by the way, we depend on this third thing.' That’s what we call transitive dependencies. What that allowed was for all these libraries we depend on to become smaller and more focused.
00:08:24.000 When considering transitive dependencies, Apache Commons emerged as almost an alternative language stack within a language ecosystem. It was an extremely novel development. Ruby has done great work making dependency management even easier with RubyGems and Bundler. When I’m writing my Gemfile, I only specify the things that I explicitly depend on, and those transitive dependencies are discovered for me while version resolution is handled automatically.
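The discovery step described above can be sketched roughly as a graph walk. This is only an illustration, not Bundler's actual algorithm (which also has to resolve version constraints); the package names and the `DEPENDENCIES` table are invented for the example.

```python
from collections import deque

# Hypothetical registry: each package maps to the packages it depends on.
# Names are invented for illustration; versions are ignored entirely.
DEPENDENCIES = {
    "web_framework": ["cli_toolkit", "template_engine"],
    "cli_toolkit": [],
    "template_engine": ["string_utils"],
    "string_utils": [],
    "test_runner": ["string_utils"],
}


def transitive_dependencies(direct):
    """Return every package reachable from `direct`, excluding `direct` itself."""
    seen = set(direct)
    queue = deque(direct)
    transitive = set()
    while queue:
        package = queue.popleft()
        for dep in DEPENDENCIES.get(package, []):
            if dep not in seen:
                seen.add(dep)
                transitive.add(dep)
                queue.append(dep)
    return transitive


# Only the direct dependencies are declared; the rest is discovered.
direct = ["web_framework", "test_runner"]
extra = transitive_dependencies(direct)
print(f"{len(direct)} direct, {len(extra)} transitive: {sorted(extra)}")
```

The point of the sketch is that the declared list stays short while the installed set grows with every level of the graph.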
00:09:22.080 I can create arbitrarily deep dependency graphs, thinking only about what I directly depend on. npm has taken this one step further because the Node.js runtime allows you to load the same library multiple times in a single process. This means I declare my dependencies, and then it just naively pulls in all the dependencies of my dependencies, and so on, resulting in these gigantic trees.
00:10:05.760 This often leads to a very common issue for a Node.js library, where someone reports, 'Hey, I can’t install this because the path is literally longer than the Windows maximum file path limit of 256 characters.' I wish that were a joke! This march of progress is optimizing for convenience, getting somewhere quickly. It’s short-term progress, but it comes at the low price of long-term fragility. The comedian Louis C.K. talked about this recently; it’s true: everything that makes you happy is going to end at some point. Nothing good ends well.
00:11:55.040 It’s like if you buy a puppy. You’re bringing it home to your family, saying, 'Hey look everyone, we’re all going to cry soon. Look at what I brought home! I brought home us crying in a few years! Count down to sorrow with a puppy.' People love their puppies, but what about the costs? So, speaking of comedians like Louis C.K., a guy named Gary told me to build a small but non-trivial Rails app.
00:12:52.160 An empty app will have 50 gems, but yours will end up with 75 to 100. Now, go away for six months, come back, update all your dependencies, and your app no longer works. I know this from experience in the Ruby community. It’s easy to start a Jekyll blog, it’s easy to install Sass, it’s easy to generate a Rails app; it’s always easy right now.
00:13:47.760 The reason, I think, is that when somebody asks us what our application is, we think of the code that we write as being our application. Upon inspection, we’d all agree that our app is really the full stack of everything we ship into production. It’s never been easier to ship something to production, but the things that we ship have never been more complex. I say, 'Oh, it’s a Rails app' because it conveys a lot of information all at once, but I never think to say, 'Oh, and Rails depends on Thor, this very specific version specifier.' I didn’t even notice that until I made that slide.
00:14:43.360 Even with 272 gems, they can no longer be installed in the same project due to version resolution conflicts. Bundler hides some of this from us; it could even promote some of this information. It could tell us, 'Hey, we just installed 10 direct dependencies that conferred 43 transitive ones,' or 'The version specifiers on all those gems preclude the installation of 300 other gems of our ecosystem of 88,000.'
00:15:36.720 It could inform us, 'By the way, if you were to run bundle update right now, it would be unable to update five gems to their latest version.' I’d love to know that! Of course, if you’re a Node.js fan, you might say, 'Haha, version resolution doesn’t affect npm,' and you’re right. But there are other issues.
00:16:21.760 Let’s pretend that the orange triangle is a dependency. I might directly depend on version C of it, one of my dependencies might depend on version B, and one of its others might depend on version A—all at the same time. Now, this is all well and good until I begin to think about it. What if I call my triangle and it gives me back a domain model object with version C’s library understanding? And I foolishly, admittedly, pass that into my other dependency, who passes it into version B of the same dependency.
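One way the hand-off described above could go wrong is sketched below. Python does not load multiple versions of a library the way Node.js does, so two classes stand in for two releases of the same hypothetical "triangle" library; the renamed field is an assumption made up to show a single possible failure, not a claim about any real package.

```python
from dataclasses import dataclass


# Stand-ins for two releases of the same hypothetical library loaded
# side by side. Version C renamed a field that version B relies on.
@dataclass
class TriangleVB:
    side_length: float


@dataclass
class TriangleVC:
    edge: float  # renamed from side_length in the newer release


def area_using_vb(triangle):
    """Code written against version B: expects a `side_length` attribute."""
    return (3 ** 0.5 / 4) * triangle.side_length ** 2


# The application gets a domain object from version C...
model = TriangleVC(edge=2.0)

# ...and hands it to a dependency that was written against version B.
try:
    print(area_using_vb(model))
except AttributeError as error:
    print(f"cross-version hand-off failed: {error}")
```

Had version C kept the field name, the same call would have worked silently, which is why the outcome is hard to predict from the outside.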
00:17:13.440 Did I think about that when I was writing the library? Probably not, so is it going to blow up or is it going to work? The answer is, nobody knows. And that lack of knowledge isn’t great for understanding the software that we’re building. Another issue is that, as a library author, I can specify the exact version that I want, but I can’t control the specification underneath. If my orange triangle depends on a loose specifier like red square for example at version star, it means that at install time, the user is going to get the latest and greatest version.
00:18:40.960 Things could potentially break. In fact, in our project Lineman JS, one day we started getting complaints that new installations were failing. We hadn’t changed anything for a month. What happened was one of our direct dependencies broke because it had too loose a version specifier on something else, resulting in a breaking API change. To fix it, we realized that we had to fork that dependency and specify explicitly that we wanted version D—the last compatible version—and push that to npm.
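The failure mode above can be illustrated with a toy resolver. It is a minimal sketch that models only two specifier forms, a wildcard and an exact version; it is not npm's real range syntax or resolution logic, and the release history is hypothetical.

```python
def parse(version):
    """Turn '1.2.3' into (1, 2, 3) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def resolve(specifier, published):
    """Pick the newest published version allowed by `specifier`.

    Only two specifier forms are modeled: '*' (anything) and an exact version.
    """
    if specifier == "*":
        candidates = published
    else:
        candidates = [v for v in published if v == specifier]
    if not candidates:
        raise LookupError(f"no published version satisfies {specifier!r}")
    return max(candidates, key=parse)


# Hypothetical release history for the transitive dependency.
last_month = ["1.0.0", "1.1.0"]
today = last_month + ["2.0.0"]  # a release with a breaking API change

print(resolve("*", last_month))  # what new installs got a month ago
print(resolve("*", today))       # what new installs get now
print(resolve("1.1.0", today))   # an exact pin is unaffected
```

Nothing in the depending package changed between the first two calls; only the set of published versions did.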
00:19:20.880 Now we're saddled with the maintenance and ownership of this npm module that we don’t understand, and if there's a security issue or whatever it is, it’s just baggage. Here’s a video that the team took of me that weekend. I feel like that a lot. We all understand that the code we write for our applications is the code we need, and the code we depend on directly is there for convenience. If we think about it, we can all agree that the stuff our dependencies depend on brings some complexity.
00:20:23.840 But when we think any deeper than that, it just starts to feel like risk. How often do you ask yourself about your transitive dependencies? Not often. Anything deeper feels mysterious. That’s why I think many people have been lampooning the phrase ‘full stack developer’ this year—because nobody knows everything they’re shipping anymore.
00:21:00.240 This makes me long for the good old days sometimes—like Makefiles. Sure, they were painful, but maybe that was a healthy pain. You know, 30 years later, a lot of these old C projects still build correctly. Now, who's confident that when I run npm install on my project 30 years from now, it's still going to work?
00:21:48.640 Let’s talk a little bit about all these people ruining everything—like me, open source maintainers—and what it’s like to be one. The most important thing to know is that open source maintainers are not rock stars; they’re just humans. In fact, they’re kind of just extra early adopters. The way I envision it is that a maintainer is on Google looking for the thing they need, then realizes, 'Oh, it doesn’t exist yet.' So they turn around and go make this thing, working in the open and sharing it.
00:22:37.840 An early adopter, slightly less so, might just Google the same thing the next day, find it, and say, 'Oh sweet, I found this cool new thing! I’m going to share it on Hacker News and talk about it because it’s still hip and underground!' This puts the maintainer up on a pedestal, promoting them. They might get stars in their eyes and be excited that people are using their thing, but because they’re on that pedestal, other people on Hacker News don’t mind pointing out that it doesn’t do X, Y, or Z.
00:23:26.080 Now the maintainer realizes their ego is all wrapped up in the adoption of their tool. But early adopters are usually just as competent as maintainers at building things, so maybe they’ll send in pull requests that fix those issues, and now the maintainer is really happy again. It’s a super up-and-down scenario. The initial release of any new library is usually just to scratch the itch of a person who has a need, so there are going to be rough edges.
00:24:20.560 Early adopters are great because they often submit pull requests or issues to help round things out. What emerges is usually something that’s ready for mass consumption. Maybe you’d call that a version 1.0 candidate or, if you’re like me and afraid of the implications of what 1.0 means in semantic versioning, it’s a version 0.840.
00:25:19.840 At this point, I’d love for the conversation to happen between maintainers and early adopters—sharing ownership of the library together. Let’s own this thing so that the early adopter experience broadens the contribution base of this project. That typically doesn’t happen. I’d even settle for maintainers saying, 'Hey, let’s make you a committer,' so you get notifications of issues and stuff.
00:26:06.480 An early adopter might agree to help sometimes, but those conversations typically don’t happen, and at this point, the maintainer may as well be saying, 'Hey, let’s never communicate again. Sounds good? Bye forever!' They scurry off to the next thing because early adopters are always after the new shiny. So why don’t maintainers just share control?
00:26:52.560 I think the reason is that they misjudge how much happiness this library is going to bring them. Up until now, they scratched their own itch and built something fun. Afterward, they started receiving acknowledgment and got excited. Then they got to version one, and they can only imagine the sky's the limit, thinking, 'This thing’s going to make me super happy! I’m going to be like DHH and people will hold conferences after me!'
00:27:46.080 Fear not; late adopters will disabuse them of this joy! From the maintainer’s perspective, if it scratched their itch initially, version one is probably mature enough to fulfill their needs. They might go a week without any commits or activity, maybe even a month. Because they’re also early adopters, a new shiny thing will likely come along and distract them.
00:28:36.720 Late adopters see this and respond excitedly, saying, 'Oh, look! No recent commits! This must be stable with 800 stars! This sounds like a safe bet. Open source? That’s free! That’s good!' Very often, I’ll receive emails years later from people saying, 'Oh, we’re so glad we found this! Thanks!'
00:29:23.200 Maintainers’ needs start small, but as they layer on features, the problem compounds. If we conceive of each library as a separate thing and layer in what the user needs, there will always be things I wish the library could do that it doesn’t.
00:30:19.600 This gap creates a negotiation, and I then have to ask questions like, 'Should I submit a pull request for this? Do I work around it? Do I open an issue and ask them to implement a feature for me?' A lot of those late adopters will come to the project two days later, looking stern, demanding, 'Wait a second! This thing doesn’t fit my enterprise needs!' How could they ignore such an obvious and important use case?
00:31:14.240 This leads to entitled GitHub issues where people miss important features. I usually reply politely but say, 'I didn’t know it had to do crazy nonsense; could you please explain?' Instead of a reasoned conversation, all their coworkers jump in and start plus-one-ing it. The implication is that I’m bad and should feel bad for not anticipating their needs.
00:32:05.480 Odds are it’s a weekend, and I’m at the park with my wife, having a beautiful day. Suddenly, I’m glued to my phone, feeling nervous and inadequate. I run home quickly and start working intensely on their request. I come back and say, 'Hey guys! I just spent all weekend building this thing; could someone please verify it works so I can close out the issue?' And then, what happens? I never hear from them again.
00:32:59.840 Now, I feel sad because I’ve just done all this work for free for strangers who may not appreciate it, and I’m saddled with it forever. I will have to work around that particular edge case in my code going forward, making my beloved tool less maintainable. This is how projects tend to bring me less happiness over time, to the point where I often hate most of my projects after they’re three years old or more.
00:33:44.560 Late adopters tend to make requests for niche features more than early adopters. This leads many in our community to assume that late adopters make better customers than users, which is why you see donation buttons, discussions about dual licensing (GPL plus commercial large feature add-on packs), and potentially paid support contracts or consulting hours. I think all of this misses the point.
00:34:44.080 There’s a motivation impedance mismatch here. People are building open source out of intrinsic motivation and drive, and money isn’t going to solve that problem. If you want an open source library to do something, and the author doesn’t want to do it, simply throwing money at it won’t increase their motivation.
00:35:38.480 I would much rather we culturally adopt a norm where maintainers feel free to say no. I feel like a jerk when I say, 'No, I can’t do this feature for free for you,' but I think it would be healthier for all of us if we weren’t so entitled when we opened an issue. We should be more complimentary and ask for advice on how they might approach the same problem instead of assuming the library needs more code.
00:36:36.080 Trolls are a totally different category of people who make us unhappy, obviously. They spit out hate, sometimes threats—whatever it is, we know the effect it has on others; it makes them want to give up and walk away. Remember, maintainers are not rock stars. However, due to factors like asymmetry, a lot of maintainers have 15,000 followers on Twitter. We assume they have it all together.
00:37:21.920 We might even understand that if they have that many followers, there are probably a few trolls in the room, but because communication is asymmetric, we often don’t see if those trolls are the overwhelming majority of interactions they have online. They wield an outsized impact on the psyche of many open source developers.
00:38:17.120 So when a maintainer quits, we might be surprised: 'Whoa! That’s weird! What happened? I didn’t see any of that.' Recently, Seth Vargo left the Chef project. He was a great contributor who did a lot of amazing work, but just a handful of trolls in the community were able to sap his joy from that project, which further saddens us.
00:39:09.480 At this point, this is where I often see cries for help, saying, 'Hey, I’m burnt out; can someone please help me maintain this thing that has all of these features that no one wants anymore? Hello, anybody?' And at this point, nobody's there. The early adopters who might have helped are long gone.
00:39:59.440 This is exactly how so many projects stagnate and die. Even well-maintained projects, by maintainers who do everything right, can succumb to this. No maintainer lasts forever; a big, visible instance this year was TJ Holowaychuk. He left Node.js for Go and, in the process, handed off many of his npm modules to others. He sold Express.js to StrongLoop even though he wasn’t the primary maintainer on it anymore, and he didn’t notify any maintainers.
00:40:46.640 It’s easy to screw this up. Of course, he was beloved in our community, and he left. He had the right to close down all of his stuff and walk away. Since he was highly visible, we all had forks of his projects and were able to piece it back together, but that’s not true for most maintainers.
00:41:01.440 What if there was an application that recognized it's very human and natural for people to leave and stop contributing at a certain point? I think it would be neat to broaden the base of people contributing. What if I had a service where I could authorize RubyGems, npm, and GitHub? It would aggregate all of my projects, and I could explicitly say, 'I need help on this one, and here’s the type of help I need.'
00:41:57.280 This way, when someone logs in, there’s a dashboard displaying all the projects they use that have requested help, with a call to action asking, 'Yes, I’ll offer to help with that!' This strikes me as way more successful than sending off an email into the void asking a maintainer, 'I like your project! Can I help you?'
00:44:06.080 The reason I did this talk today, the reason I thought to put it together, was because our friend and beloved colleague, Jim Weirich, passed away. His passing made me reflect a lot, which was very hard for me. One thought was that he has this tremendous legacy, and I want to see if I can help.
00:44:23.920 In particular, I have a favorite gem of his, rspec-given, and I wanted to see if I could help adopt it. That process was complex and difficult. I realized that when people go to a lawyer for estate planning, very few think of what will happen to their GitHub repositories when they die. Even if they did, the law hasn’t caught up to technology and never will.
00:44:58.640 What if an application could serve as a dead man's switch? It could recognize that if you fall off the pedal, the train stops. For example, I could list beneficiaries—like Todd and Brandon—and after I go absent for a certain time, maybe it sends me little tickler emails every month. If I go away for 60 days, it would know to add Todd and Brandon as owners of all the projects I’ve authorized it to.
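The decision logic for such a switch might look something like the sketch below. The 60-day threshold and the beneficiary names come from the talk; the 30-day reminder interval is an assumed reading of "every month", and the function names and return values are hypothetical. A real service would also need to call the RubyGems, npm, and GitHub APIs, which is left out here.

```python
from datetime import date, timedelta

REMINDER_INTERVAL = timedelta(days=30)  # assumed: the monthly "tickler" email
HANDOFF_THRESHOLD = timedelta(days=60)  # absence after which ownership is shared


def next_action(last_seen, today):
    """Decide what the hypothetical service should do for one maintainer."""
    absent_for = today - last_seen
    if absent_for >= HANDOFF_THRESHOLD:
        return "add_beneficiaries_as_owners"
    if absent_for >= REMINDER_INTERVAL:
        return "send_reminder_email"
    return "do_nothing"


def run_switch(last_seen, today, beneficiaries, projects):
    """Return the ownership grants to make, as (project, new_owner) pairs."""
    if next_action(last_seen, today) != "add_beneficiaries_as_owners":
        return []
    return [(project, person) for project in projects for person in beneficiaries]


today = date(2014, 12, 2)
print(next_action(date(2014, 11, 25), today))  # recently active
print(next_action(date(2014, 10, 20), today))  # quiet for over a month
print(run_switch(date(2014, 9, 1), today, ["Todd", "Brandon"], ["my-gem"]))
```

Returning the grants as data rather than applying them directly keeps the policy testable and leaves room for a final confirmation step.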
00:45:43.760 By the way, I like to think of a name for a project before I build it. I probably won’t build this, but I’d like to call it 'SomebodyPleaseMakeThis.io.' Really, somebody please make this! I would greatly appreciate it.
00:46:28.000 Now, regarding dependencies that are binary or runtime that we package with our applications, what about all of these cloud services we use? This slide depicts services that are going to shut down someday. Can any centralized service, probably written with volunteer labor on less money, truly be sustainable? If the corporate ones are all going to shut down, what about the purportedly open ones?
00:46:54.560 Can they be truly open? A maximally open system is ripe for abuse, but if you make it too curated, you risk excluding people and limiting the contributions. Almost all open source infrastructure is entirely centralized out of convenience, right? The march of progress just wants to make it easier and easier to adopt stuff, and it’s a hard question to answer. How do we decentralize this?
00:47:58.720 A lot of people in the room contribute to RubyGems; they do great work, and I think they’re underappreciated. But what if RubyGems were to disappear? How many businesses would that affect? Do the businesses relying on RubyGems recognize that as a liability? If npm were to fail or lose a month of backups, how many things would no longer be able to install and work?
00:48:51.280 I ask this because I want to know what a decentralized dependency service might look like. We have BitTorrent available, right? We have cryptography, which seems like the sort of technology we might leverage without fearing a single point of failure. Speaking of single points of failure, one of my favorite things is to wait for the next time GitHub goes down because people freak out on Twitter.
00:49:49.440 They make snarky jokes like, 'Well, you know it’s distributed, so why do you care?' because they can work locally, and in fact, I can SSH to my buddy, and we can still collaborate. Good thing that’s all we use GitHub for, right? It’s not like we use GitHub to pull down our dependencies, or we use GitHub to test our code.
00:50:46.720 Most of our continuous delivery services, which ship our code, also depend on GitHub being up. Even if you finish your work, your next issue is in GitHub issues, and you can’t find the next thing to do because now it’s also commandeering all of our project management.
00:51:22.560 How can we connect numerous services to our code, which is admittedly the source of record that everyone wants to integrate with, while avoiding that single point of failure? Fortunately, Git is very fast and portable, suggesting that it could serve as a distributed transport layer that allows these hooks in without requiring a single company to be up for everything we do.
00:51:57.520 Let’s talk about trust. Open source requires adoption; obviously, it’s optimized for it. To adopt something, people have to trust that it’s good. There’s explicit trust, which is the dependencies we directly depend on, but there’s also implicit trust—the stuff they depend on. We just sort of trust that the people we trust are trusting the right people, etc. It’s a big web!
00:52:49.760 As a maintainer, how do I get people to trust me and use my thing? The answer, of course, is marketing. Let’s look at marketing over time in open source. Consider Linus Torvalds’ 1991 announcement of Linux on the Minix mailing list. It’s a few paragraphs long, but there was no catchy name.
00:53:29.120 He made a self-deprecating remark in the first line, then went totally off-message. He talked about working on his niche particular hard drive—this is bad marketing. If Linux were announced today, it probably wouldn’t even make the front page of Hacker News, despite running most of the servers in the world. I think that’s a reason for pause.
00:54:33.360 A decade later, the Ant project for Java tried to enter many big corporate environments that didn’t trust open source. Look at all they had to do: they had a fancy logo, a website, a mission statement, and the Apache foundation affiliation to confer trust. But as the number of dependencies we were pulling in increased over time, we had less time to vet them.
00:55:22.160 These days, the standard GitHub markdown README is expected. You can find a catchy introduction with easy steps to get started, plus some mostly green badges at the top telling you that things are working. It’s gotten even further, as there are many corporate-backed sponsored projects. In one recent example, a diverse group of engineers builds a rocket using gradients and authoritative taglines.
00:56:11.360 The marketing drives everything in open source! We’re under natural selective pressure as maintainers, meaning we look up to all these rock stars in the community. When Aaron Patterson releases a new gem, I trust Aaron. I know he’s good at programming! Clearly, his gems will work.
00:56:57.760 I can check how many stars or forks a project has, how much activity there is, how many open issues there are, and see how often that project has been downloaded. However, even though 90% of those downloads are Travis CI running checks, my lizard brain tells me it’s still important.
00:57:19.120 Semantic versioning emerged as an important technique after years without a formalized method for versioning software. We need a way to know at a glance whether something is ready for use or safe to update. Who has time to vet transitive dependencies? Nobody. The more people you explicitly trust, the more people you end up trusting without realizing it.
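As a rough illustration of that at-a-glance check, here is a minimal Ruby sketch using the version classes that ship with RubyGems. The constraint and version numbers are made up for the example, and `within_constraint?` is a hypothetical helper, not part of any library. It only shows what a version number promises; it says nothing about whether the code behind it has been vetted.

```ruby
# RubyGems ships with Ruby, so these classes need no extra install.
require "rubygems"

# Hypothetical helper: does a release fall inside a declared constraint?
def within_constraint?(constraint, version)
  Gem::Requirement.new(constraint).satisfied_by?(Gem::Version.new(version))
end

# "~> 1.2" is the pessimistic constraint: at least 1.2, but below 2.0.
constraint = "~> 1.2"

# Made-up release numbers for an imaginary dependency.
%w[1.1.9 1.2.0 1.9.3 2.0.0].each do |version|
  verdict = within_constraint?(constraint, version) ? "allowed" : "rejected"
  puts "#{version}: #{verdict} by #{constraint}"
end

# Prerelease versions are flagged separately from the constraint check.
puts Gem::Version.new("2.0.0.pre").prerelease?
```

The constraint applies to each dependency's own version number, so every transitive dependency carries its own, separate promise.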
00:58:09.360 We should all recognize that every project is simultaneously marketing to you! I write open source and want people to use it because I love it. I’m passionate about it! I want you to engage with my brand and use my stuff. But I hope that you keep a discerning head on. Realize that every new thing you pull in adds complexity to what you’re doing, and more stuff can fail.
00:58:59.680 And speaking of failures, let’s talk about security. I’ve been a long-time believer that you can do worse than security through obscurity. You could have a ton of code out in the open with nobody working to secure it. Of course, the Free Software Foundation will remind us that open source code is accessible to everyone, right? The 'Cathedral and the Bazaar' argument posits that with enough eyes, all bugs are shallow, but that’s only true if people actually read the code.
00:59:41.520 So, who reads the source code? First, there are those who claim to read the source code, and only a tiny minority of them actually do. Only a fraction of those who fork a project do anything with the fork. The committers themselves typically make up only a handful of that crowd.
01:00:14.720 In contrast, the people hunting for exploits are the one group that reads the code constantly, and they are reading it in order to exploit it. Bash is a program that runs everywhere, including on internet-connected devices such as your smart fridge. It’s unfortunate that Bash had a major security exploit, because a lot of those devices aren’t upgradable or patchable.
01:01:09.760 Somebody looked at that code and was shocked to see global variables everywhere. Further, there were all these void methods that took no arguments, meaning that all such a method could possibly do is (a) nothing or (b) muck with global state. Let’s read a line of open source code as a group.
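To make that point concrete, here is a small Ruby sketch. It is not Bash’s code; the names `$parse_depth`, `enter_scope`, and `next_depth` are invented for illustration. It contrasts a method that takes nothing and returns nothing meaningful with one whose inputs and outputs are visible at the call site.

```ruby
# Hypothetical global, standing in for the globals described above.
$parse_depth = 0

# No arguments and no meaningful return value: the only way this method
# can matter is through a side effect, here a change to a global.
def enter_scope
  $parse_depth += 1
  nil
end

# By contrast, this method takes its input and returns its result, so a
# reader can follow it without knowing the rest of the program.
def next_depth(depth)
  depth + 1
end

enter_scope
puts $parse_depth   # the global changed behind the caller's back
puts next_depth(0)  # same arithmetic, nothing hidden
```

Reading the first style means holding the whole program’s global state in your head, which is part of why auditing that kind of code is so draining.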
01:01:50.720 This is the beginning of a for loop. I look at this and, as a Ruby developer, I don’t spend much time with for loops. I see that it’s called string index; I wonder why it’s doing assignment in the second clause instead of a boolean operation. They’re incrementing in the second clause instead of the third. Surely they must have a reason for doing it that way! I assume that whoever wrote it is as smart as or smarter than me, so they must have had reasons.
01:02:59.440 This reflects how much cognitive depletion occurs while reading their method. I only got halfway through it. The Free Software Foundation, which maintains Bash, holds that the solution isn’t proprietary software; it lies in putting energy and resources into auditing and improving free programs.
01:03:51.760 So, who here wants to audit the quality of code that literally everybody depends on? None of you, right? But why is that? As a library becomes more popular, the importance of auditing its safety and security increases, yet each individual’s motivation to audit it decreases. If I use a gem that only three people use, I’ll probably read the source code to make sure it works. However, if I’m using Bash, I’m unlikely to read it because I’ll assume that someone else has.
01:04:41.920 The downside is that everyone else assumes the same thing, leading to a scenario where no one actually reads it. That’s what we call a tragedy of the commons. It’s no one’s problem until it becomes everybody’s problem, and by then, it’s too late.
01:05:31.600 I hope that companies will step up and solve this by investing, because they have the capital and can collaborate. The TODO Group is an example of companies coming together to form norms about open source governance. One opportunity is to share notes when auditing the security of projects, working together to patch exploits. I’m not sure if that’s one of their goals, but I think that’s likely where the solutions lie.
01:06:28.160 If you work at a smaller company, it’s your responsibility to recognize that open source isn’t a free lunch. It’s not altruistic; it’s a source of potential liability and should be vetted. I want to share some thoughts about our interactions with others and where I think the future is heading. Unfortunately, all these stick figures I painstakingly drew were a lie.
01:07:17.040 They imply that open source occurs between people face to face, but that’s not true. Our communication flows through email, GitHub issues, and Twitter. Almost all of it is just asynchronous text. We’re all merely avatars, usernames, and some emoji on a screen. This means when you get riled up about open source, no one can hear you scream.
01:08:03.040 I scream all the time, and none of you ever come to say hi. That’s a big part of the problem. Evie tweeted recently that it’s messed up that a lot of modern discourse is optimized for whoever has the fewest feelings and most free time. Who designed that system? We did. Apparently.
01:08:53.760 So when there’s uncertainty, ambiguity, or disagreements in a team that lead to simmering disdain, is email the right communication channel? Probably not. It makes more sense to escalate to a higher fidelity communication. If there’s ambiguity, maybe real-time text could help clear things up.
01:09:47.600 If you’re having a voice conversation, you can empathize with the other person. If you make someone cry, you’ll see it on a video chat. If all else fails, you can always meet in person. There’s something about human biology; when two people get in a room, they tend to walk away with some kind of compromise. That’s not true on Twitter.
01:10:43.440 This strategy can also be a great troll repellent. Trolls gain power through anonymity and a perceived lack of consequences for their actions. But if you increase the fidelity of the communication and discourse, their anonymity erodes. If someone’s trolling you, you could say, 'Alright, cool, let’s have a hangout.' Suddenly, they tend to disappear.
01:11:30.080 So what I’d love to do is make opting into these higher forms of communication easier. For instance, if I’m in a GitHub thread, I’d love a button that would open a new chat on Gitter, or build a live example right in place.
01:12:17.680 Higher-fidelity communication is better! I also want to think about where the future is heading. I showed a graph suggesting that it’s going to get worse. So, on that graph, I said, 'Okay, we’re here, and it’s going to get worse before it gets better. What does this look like?' I was stewing on this while GoGaRuCo was happening.
01:13:23.680 I asked myself, ‘If we extract ourselves from our current culture of dependency, where would that lead us?’ I didn’t come up with many great answers, but then I saw a tweet. Yehuda was giving a talk about the Rust programming language, saying it enables high-level programmers to write systems level code.
01:14:19.760 Initially, I thought, 'Whoa, we have this huge pyramid of stuff we don’t understand!' That’s scary! However, as I explored the concept, I actually realized I’m okay with this. There might be benefits to it, and I want to discuss those.
01:15:02.880 If you think about innovation over time, low-level system innovations happened prior to high-level ones—you need the lower levels first before you can work on high-level systems. Most of the open source innovation we’ve experienced in the last 20 years has been centered around high-level programming languages.
01:15:59.760 Recently, we’ve been reinvesting in low-level systems, with languages like Go and Rust. The open source dependency culture is steeped in many assumptions about high-level code, so the question becomes: if we were to simply transplant that approach to low-level systems engineering, how would it hold up?
01:16:50.720 Think of my systems programming friends. They tend to be very conservative and cautious, which is partly due to being isolated from innovation and, just a bit, being curmudgeons! To view that in a better light, there’s a portion of caution born from real-life failures, which often have grave consequences. That’s likely where their extreme caution comes from.
01:17:59.280 So, if you consider a high-level system, what’s the worst that can happen? Healthcare.gov starts failing, and people can’t get their insurance! That’s bad! That’s like 60 nightly news segments in a row of sheer drama for our nation.
01:18:46.720 Now let’s put this same technology in charge and see how it fares. Low-level systems frequently require much more rigor in fault tolerance because people can literally die if you don’t get it right. My systems friends have a different perspective on adopting dependencies than we do. They tend to view them as outsourcing their understanding of how to solve something.
01:19:29.040 This means if we have responsibility for an application, like that orange blob over there, and some specific dependency, we can view that dependency as solving part of our application’s problems while knowing what it’s doing and why.
01:20:14.880 The key is that we often don’t know how it does it, and that gap is what I call 'understanding debt.' Understanding debt is typically paid off by iterating on each dependency over time: we learn about it through usage and by changing how we use it. However, if iterative releases aren’t feasible, you shouldn’t outsource that understanding.
01:21:03.440 Compare the two: high-level systems are often highly centralized and might only run one instance. Conversely, low-level systems tend to be built, shipped, and put out into the wild. High-level systems typically have a full staff maintaining them for their entire lifetime, while low-level systems often have only a support chain.
01:21:56.720 High-level systems usually have more than enough overhead in terms of CPU and memory to absorb the impact of inefficient dependencies, while low-level systems have very real and hard limits. High-level systems incorporate automated tools for deploying patches, while low-level systems may have intermittent connectivity or no ability to be patched at all.
01:22:49.680 Clearly, we’ve made it easy to iterate on high-level systems while it remains very hard to iterate on low-level systems. These concerns necessitate a deeper upfront understanding from systems engineers. If you were to graph out the depth of understanding required for successful high-level web apps compared to low-level system controls, the disparity is profound.
01:23:47.760 For a high-level application, you only need to grasp basics like how browsers function, HTTP, some JavaScript, and the request/response life cycle. In contrast, a vast amount of upfront knowledge is necessary for a low-level system, like a plane’s control gear, where everything can go wrong and needs consideration on the first day.
01:24:41.040 This traditional paradigm—where we write a handful of little tiny domain objects on top of a mountain of open source stuff we don’t understand—might be an acceptable level of depth for a high-level app. However, if we simply transfer that same approach to low-level systems, we enter a danger zone with things we don’t understand.
01:25:29.520 To summarize, systems engineers have two approaches: they can either build richer, deeper domain objects themselves or put their dependencies through a qualification process, rigorously proving that the tools function as needed. That gives us systems we can truly trust.
01:26:23.120 This reflection prompted me to realize that modern tooling stems from high-level web development. If we look again at this chart, it represents our current perspective on how dependencies operate. Should we apply that framework to low-level systems, adhering to rigorous standards and ensuring the integrity of our work, we might achieve collaborative efforts and maintain successful projects.
01:27:30.560 At this point, I hope that systems innovations bring the same conveniences we see in open source to low-level systems. They might instill a desperately needed cautiousness and highlight the importance of understanding the things we ship. We covered a lot today in four areas, but I want to conclude by stating that I believe the open source world can improve significantly, and much of it is actionable and fixable.
01:28:19.840 Companies must acknowledge their responsibility to contribute back and audit the security of the tools they depend upon. Startups need to be cautioned that there’s no such thing as building a real app in a weekend and being completely done. It may be better to invest time in understanding the tools and libraries they’re building upon.
01:28:55.040 If they must rush the first version out the door in 30 or 60 days, they should understand that, once they have validated their success, they’ll probably want to discard version one entirely when it comes time to build version two. I like to call this the slow code movement.
01:30:05.040 I think individual maintainers can learn a human lesson from this. They should act early, pulling in new owners and contributors as quickly as possible while excitement still surrounds the project. We could solve this with better tools by making it effortless for people who want to engage within open source.
01:30:56.480 Let’s be explicit in this invitation so that newcomers feel welcomed instead of waiting for them to muster the courage to reach out through an email or issue. Once again, my name is Justin! If you have questions, just say hello! I come from Test Double, an agency of exceptional software developers, so if your company needs help, let me know—I'd love our talented staff to collaborate with your team.
01:32:09.920 As with everyone else, we’re hiring too, so feel free to drop a line at [email protected]. I’d love to chat with you during the break today. If you’re too shy for that and just want a free sticker, I’ve got some, too! Most importantly, I want to sincerely thank all of you for your time.