State of the RubyGems

Samuel Giddins • December 19, 2023 • San Diego, CA

In the 'State of the RubyGems' talk at RubyConf 2023, Samuel Giddins presents a comprehensive overview of the RubyGems ecosystem, detailing its history, current state, and future roadmap. The talk starts with Giddins' background, highlighting his extensive involvement with RubyGems and Bundler. He recounts the foundation of RubyGems in 2004 and its evolution alongside Ruby and Rails, emphasizing the necessity of automated dependency tracking which led to the creation of Bundler in 2009. Key points discussed include:

  • Historical Context: The transition from a volunteer-based system to a structured organization that better supports the Ruby community. The merger of Ruby Central and Ruby Together into one nonprofit was a significant development aimed at optimizing operational effectiveness.
  • Infrastructure and Growth: RubyGems.org has seen consistent growth, managing over 190,000 gems and 150 billion downloads, while serving approximately 20,000 requests per second.
  • Achievements: The RubyGems team has successfully maintained operational uptime with zero full outages since 2015, and executed hundreds of updates that improved functionality and security.
  • Current Challenges: Giddins emphasizes issues such as handling dependency confusion, account takeovers, and security vulnerabilities, as well as the burden of support requests due to increased security measures.
  • Recent Improvements: New features introduced in the past year include enhanced MFA support and a more efficient dependency resolution mechanism in Bundler, alongside infrastructure optimizations that have improved service stability.
  • Future Goals: Giddins outlines aspirations for further development, including enhanced security projects, improved user experience for developers, and ongoing collaborations with other tech organizations. The critical need for community engagement and funding is underscored, with calls to action for developers to contribute code, provide feedback, and participate in Ruby Central membership.

The session concludes with a reminder of the importance of community support for sustaining RubyGems and fostering innovation in the Ruby ecosystem, encouraging attendees to engage with the RubyGems team for feedback and collaboration.

Part history, part state of the union, and part roadmap for community feedback, this talk will cover how Ruby Central came to have an open source team, what we have been doing for the last 8.5 years, highlights from our work in 2023, and a deep dive into the ideas that we would like to get onto our road map. If you want to know more about Ruby Central, RubyGems, or project planning in long-running open source projects, this is the talk for you.

RubyConf 2023

00:00:18 Hello everyone, welcome to the afternoon session. Our next speaker will be Samuel Giddins, and he'll be talking to us about the state of RubyGems.
00:00:30 Samuel is a longtime contributor to Bundler and RubyGems, first joining the team to replace the Bundler dependency resolver in 2013. Today, he wears many hats working for Ruby Central to help maintain and develop not just Bundler but also RubyGems and the infrastructure and Rails app of RubyGems.org, as well as finding and fixing security issues in any library he touches. Please give him a warm welcome.
00:01:09 Good afternoon! Oh wonderful, my laptop's telling me that I'm giving a talk. Seriously, that's what popped up on my screen right now. Um, so hello and welcome to the 2023 State of the RubyGems.
00:01:24 This is going to be really fun! I always love doing these. By a show of hands, who here uses RubyGems? I've never gotten 100% before, so that's an achievement for me right there.
00:01:40 A little bit about me... There we go, uh, so I am @segiddins everywhere on the internet, also known as Samuel Giddins.
00:01:48 As mentioned, I'm a RubyGems, Bundler, and RubyGems.org maintainer. I'm currently wearing the security hat a lot of the time. It can be a bit of an uncomfortable hat to wear, and I have been contributing bugs to all these things for at least 10 years.
00:02:10 So, some history: RubyGems has a really long history, and we need to quickly summarize it. RubyGems was originally created by Chad Fowler, David Black, and Rich Kilmer during RubyConf in 2004. You heard that right: 2004! It grew to become the most popular way to create and share Ruby libraries.
00:02:24 Ruby Central provided dedicated servers for RubyGems, but maintaining those servers remained an entirely volunteer operation. Ruby, Rails, and GitHub all grew together in popularity during the late 2000s. As Rails 3 was being broken apart into many gems, with each piece made optional and separately usable, installing and updating gems by hand became more and more difficult. It became clear that an automated system for tracking, installing, and updating dependencies was necessary.
00:02:54 Enter Bundler: Bundler 1.0 first shipped in 2009, and by 2013 it was clear that volunteers weren't sufficient to keep up with the constantly growing RubyGems.org service and the number of Bundler users. In 2015, Andre Arko founded Ruby Together, a not-for-profit dedicated to supporting developers to maintain and develop the RubyGems ecosystem.
00:03:28 Over the years, Ruby Central and Ruby Together collaborated closely on the RubyGems.org service. By 2022, the boards of Ruby Central and Ruby Together agreed that both nonprofits' missions would be better served by a single organization rather than two organizations with overlapping goals, and they merged into one entity.
00:03:59 Today, Ruby Central supports the worldwide Ruby community by organizing this annual RubyConf and also the annual RailsConf. It has started a program to support regional and local conference organizers and coordinates funding for both servers and developers for RubyGems, Bundler, and associated projects.
00:04:18 So RubyGems.org might not have the enormous scale of, say, GitHub or npm, but it has grown about 20% per year for the last 15 to 20 years, and compound growth is a pretty powerful thing. It's absolutely a significant service with a huge number of users worldwide. I think I counted some hands earlier, and some people here use it.
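To make the compound-growth remark concrete, here is a small Ruby sketch. It takes the roughly 20% annual growth quoted in the talk at face value, and the method name growth_multiple is only illustrative.

```ruby
# Cumulative growth multiple after `years` of steady annual growth.
# The 20% rate is the approximate figure quoted in the talk.
def growth_multiple(annual_rate, years)
  (1 + annual_rate)**years
end

[15, 20].each do |years|
  multiple = growth_multiple(0.20, years)
  puts format("%d years at 20%% per year: about %.1fx", years, multiple)
end
```

Under that assumption, traffic grows to roughly 15 times its starting level after 15 years and roughly 38 times after 20.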
00:04:50 Just some fun statistics to give you a sense of what RubyGems.org does: we have over 180,000 users with accounts, over 190,000 different gems, and over a million and a half different gem versions.
00:05:02 I believe that's something like 150 billion total gem downloads all time! We average about 20,000 requests per second, which equates to about two billion requests every weekday, maxing out at about 225,000 requests per second during one particularly painful weekend that I remember all too well.
00:05:31 During that weekend, we served something like 7.5 terabytes per hour, 185 terabytes per day, which, if you do the math, is something like 54 petabytes a year— that’s a lot of gems! Just keeping those systems running over time requires a lot of work.
00:06:02 So who makes sure that installing gems is possible at any time of day or night? The Ruby Central Open Source team, which includes a 24/7 on-call team, an SRE team which operates our infrastructure, the RubyGems.org team which maintains the Rails app, the RubyGems team maintaining the gem command, the Bundler team maintaining the Bundler command, the Ruby Toolbox team, the Ruby API team, and the Gemstash team.
00:07:00 Many individuals are members of more than one of those teams, and we wouldn't be able to do this without them. Everyone is working below market rates, making time on their nights and weekends to support these projects because they love to help the Ruby community.
00:07:32 We don't have time in this talk slot to individually thank every developer and volunteer who has made RubyGems and Bundler into what they are today. It’s a really long list; you can see the full list on GitHub.
00:07:52 So we'll just thank the most active current contributors: coordination by Andre Arko; development work and 24/7 on-call support by myself, Josef Šimánek, Colby Swandale, and Arun Agrawal; development work by Martin Emde, David Rodriguez, and Ellen Dash; technical writing by Gift UU; and editorial support by Irene Kano.
00:08:17 Now, in addition to me, Colby and Martin are here at the conference as well and would be happy to talk with you about RubyGems, Bundler, and how we can continue to improve the experience of developing software in Ruby.
00:08:51 If you saw someone up front, give a wave; that was Martin. So, what have we done since the initial founding of Ruby Together in 2015?
00:09:05 We have funded and grown the team of committed maintainers, ensuring that RubyGems.org continues to function. In the past 8.5 years, I’d say we have accomplished quite a lot. Number one is we've kept the lights on. Our biggest accomplishment is keeping RubyGems.org operational. An early and significant inspiration for the initial founding of Ruby Together was a complete outage of RubyGems.org in 2013 that lasted several days and was quite painful.
00:09:40 Does anyone here remember that outage? Was it painful? Yes? Okay, cool. So in the eight and a half years since Ruby Together began to fund maintenance on RubyGems, we have had zero full outages so far. We would like to keep it that way.
00:09:59 In that time, we have served something like 143 billion gem downloads to millions of computers across the world. Keeping the lights on is hard.
00:10:14 We've made a bunch of releases in the Bundler and RubyGems client side projects in that same timeframe. We have been extremely busy, and dare I say productive. Over the same eight and a half years, we have released 201 versions of Bundler and 139 versions of RubyGems.
00:10:51 We have also spent significant time collaborating with Ruby core, improving the way that RubyGems integrates into Ruby and modifying Bundler so it could ship with Ruby itself. We've also seen a bunch of Ruby core being split off into default gems.
00:11:20 So, we've done lots of good stuff. We've kept Bundler and RubyGems working on dozens of new operating systems, processor architectures, and new Ruby implementations.
00:11:39 We've combined the Bundler and RubyGems repositories and code bases, increasing the amount of shared code and reducing the need to fix bugs multiple times. We've designed and implemented multiple dependency resolvers, collaborating with other package managers, languages, and ecosystems. We've also designed, implemented, and maintained two completely new formats for gem metadata and version information, optimizing over time as the needs of the Ruby community have grown and changed.
00:12:30 So, not to complain, but keeping the entire Ruby packaging ecosystem running smoothly and safely is hard. Since I'm the person here on stage, I'll just call out a few of the incidents that I have caused and helped respond to.
00:13:12 Earlier this year, we deprecated the dependency API. When we first did that, it caused something like a 25x increase in traffic to RubyGems.org from 10K RPS to about 225K RPS. We migrated from Unicorn to Puma, and yours truly renamed the Kubernetes deployment, meaning we lost our autoscaling rule. We tried to serve all of the RubyGems.org traffic from a single pod—and that did not work very well.
00:13:59 In early May, there was something of a pager storm that I got bombarded with. I have some graphs here to help show that we went from serving a bunch of traffic from the Rails app to a lot less traffic from the Rails app by rewriting some of our endpoints to be served from S3 instead.
00:14:54 We also deal with a bunch of recurring security issues. The most common, each requiring investigation several times per month, are reports of dependency confusion. That's when you have an internal gem, and someone uploads a gem with the same name to RubyGems.org.
00:15:20 Account takeover is when someone gains access to another person's account. Then there's what I've sort of lifted from the payments industry and termed 'good maintainer gone bad': someone publishing gems that they should no longer have access to, or yanking gems that they should no longer have access to.
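For the dependency confusion case described above, one common mitigation on the consumer side is to pin internal gems to a private source in the Gemfile, so that recent versions of Bundler do not look for them on RubyGems.org. This is a minimal sketch rather than anything shown in the talk; the private host gems.internal.example and the gem name acme-billing are hypothetical.

```ruby
# Gemfile (sketch; the private host and the internal gem name are hypothetical)
source "https://rubygems.org"

gem "rails"

# Gems declared inside this block are fetched from the private source,
# not from the default source above.
source "https://gems.internal.example" do
  gem "acme-billing"
end
```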
00:15:59 In addition to the challenges particular to running a packaging ecosystem, we also have to do all the work that any modern web service at scale requires, dealing with things like DoS threats, the continuous number of CVEs that get reported, and patching infrastructure that we run on.
00:16:25 RubyGems.org doesn't have a dedicated support team, so it's yours truly who's responding to many of the support emails in my copious extra time. Each of these requests requires something like 5 to 10 minutes to address, which adds up very quickly.
00:17:06 Things like someone losing access to their account, or someone asking, 'Hey, can I have this gem name? That project looks abandoned,' or maybe that project just looks empty. People complain that there aren't any good names left; you know, there were 180,000 good names and I'm just too late to come up with another good one.
00:17:51 What we've also seen is that as we've pushed RubyGems.org to become more secure, it's led to more customer service burden over time. We've started requiring MFA for accounts on RubyGems.org, and that's led to a bunch of manual MFA requests that staff has to respond to.
00:18:31 We've also seen a big growth in the number of ownership disputes, such as multiple people claiming ownership of a gem name or companies claiming trademark or IP infringement from a gem that someone in the community has uploaded.
00:19:03 We have to answer all those emails. We also get a bunch of submissions through our HackerOne page, each requiring a member of the team to investigate and determine if there's a real vulnerability that needs to be addressed. Oftentimes, there isn't.
00:19:36 I don't know if anyone here runs a bug bounty program, but there are a lot of reports that you end up closing, and it takes time to do that. Our infrastructure is precarious; we'd love to have fallback providers for things like compute or Edge CDN, but unfortunately, we've yet to find companies that are willing to donate those services or the funds required for us to have redundant providers.
00:20:22 On a happier note, in the last year, we've completed some great work and have exciting features in progress right now. This year, we improved MFA support and added support for passkeys instead of one-time passcodes.
00:20:54 This is step one on the way to using passkeys instead of passwords, and we're very excited about passkeys as a technology that is both more secure and more convenient than managing passwords.
00:21:23 We've given RubyGems.org better uptime and stability by removing the dependency API. The dependency API was originally replaced in 2016, when we shipped Bundler 1.12. For those counting at home, that's something like seven years ago.
00:21:48 In 2022 and 2023, continued traffic to the dependency API was still in the high hundreds of requests per second. Each one of those requests was served by the Rails app, requiring a complex SQL query, which caused intermittent incidents and continually failing requests.
00:22:18 So we spent about six months deprecating that API and working with the community to ramp down use until we could remove it.
00:22:41 We made Bundler's dependency resolution error messages so, so, so much better than those from the resolver that I first wrote in 2014 by implementing a PubGrub-based dependency resolver, which will be coming to RubyGems soon.
00:23:14 We added gem exec to RubyGems and shipped the Bundler Compose beta for quick-command and editor-tool use cases. gem exec lets you directly run a command from any gem, whether or not that gem is installed, and the Bundler Compose plugin lets you run, for example, your chosen language server, formatting tool, editor integration, or any other code that isn't in your project's core set of dependencies.
00:24:20 We merged hundreds of pull requests including optimizations, bug fixes, and new features across 19 versions of Bundler and RubyGems. Key among those were significant performance improvements for large application bundles. We've added support for Gemfiles to reuse Ruby version files and implemented fully allow-list-based safe loading for Marshal files, completely removing a repeated source of security issues.
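The allow-list idea behind safe Marshal loading can be illustrated with plain Ruby. The sketch below is not RubyGems' implementation: ALLOWED_CLASSES, DisallowedClassError, and load_with_allow_list are hypothetical names, and the check runs in the proc that Marshal.load calls after each object has already been built, so it is weaker than an approach that validates the stream before instantiating anything. It only shows the principle of rejecting every class that is not explicitly permitted.

```ruby
# Illustrative only: this is NOT how RubyGems implements safe loading.
# All names below are hypothetical.
ALLOWED_CLASSES = [
  NilClass, TrueClass, FalseClass,
  Integer, Float, String, Symbol, Array, Hash
].freeze

class DisallowedClassError < StandardError; end

def load_with_allow_list(data)
  check = lambda do |obj|
    unless ALLOWED_CLASSES.any? { |klass| obj.instance_of?(klass) }
      raise DisallowedClassError, "unexpected #{obj.class} in Marshal data"
    end
    obj # return the object so deserialization continues with it
  end
  Marshal.load(data, check)
end

# Plain data made of permitted classes loads normally.
safe = Marshal.dump({ "name" => "rake", "version" => [13, 0, 6] })
p load_with_allow_list(safe)

# Anything else is rejected, even though the class exists locally.
Widget = Struct.new(:cmd)
begin
  load_with_allow_list(Marshal.dump(Widget.new("x")))
rescue DisallowedClassError => e
  puts "rejected: #{e.message}"
end
```

The design choice worth noting is the direction of the check: the list names what is permitted, so a newly discovered dangerous class is rejected by default instead of needing to be added to a block list.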
00:25:45 I like to joke that Marshal is kind of the CVE factory of Ruby, so I'm pretty proud that we've sort of sequestered that and made it a bit safer.
00:26:22 Additionally, we caught up on years of deferred maintenance for RubyGems.org and applied lots of infrastructure upgrades that had been put off, vastly improving the situation for maintainers so that we can keep the service running with fewer hours going forward.
00:27:12 We continued migrating a bunch of stuff to Terraform, making it easier to onboard new maintainers and decreasing the amount of institutional knowledge required to operate the service. We've made it possible to deploy RubyGems.org to multiple different staging environments so we could test large changes in isolation.
00:27:54 We added a set of web-based admin tools, enabling us to respond to support requests without having to SSH into production and use the Rails console—complete with full auditing so we can see the changes people are making.
00:28:31 We've also added backend support for storing the contents of individual gem files and OIDC-based authentication, and you can expect to see user-facing features based on those improvements coming soon. So yeah, it’s been a productive year!
00:29:16 So you may be wondering, 'Sam, how much does this cost?'
00:29:22 I'm going to pass around a hat after this talk, and if you can drop in your AWS credits, we'd greatly appreciate it! There are basically three types of ongoing costs: organizational overhead, services and infrastructure, and paying for maintenance and development work.
00:29:58 Organizational overhead covers the costs of creating and running a non-profit corporation in the U.S.—getting a mailing address, having a board of directors, owning bank accounts, paying accountants, buying insurance, and having a lawyer on staff— all that fun stuff.
00:30:31 We also pay for any useful or needed services that we aren't able to get donated. Overall, the open-source portion of Ruby Central's overhead typically costs between $5,000 and $10,000 each month, including time tracking, invoicing, accounting, and fundraising tools – all those lovely budget line items that business people here know all too well.
00:31:04 The vast majority of our infrastructure comes from two providers. AWS is our primary server host, where we primarily use EKS to host the Rails app served by application load balancers, with our files stored in S3. You know, the standard setup.
00:31:37 In total, our AWS costs generally run around $20,000 per month. Our other significant infrastructure partner is Fastly, which acts as our cache, CDN, and DDoS protection layer. Fastly handles something like 20,000 requests per second and serves 185 terabytes per month of requests.
00:32:09 Charged at retail prices, our Fastly bill would run more than half a million dollars per month. So thank you to Fastly for that!
00:32:30 Lastly, we pay developers directly to work on RubyGems projects, including RubyGems itself, Bundler, the RubyGems.org web app, and the infrastructure both in and out of Terraform. We also fund a 24/7 on-call rotation to ensure someone is available to troubleshoot any issues at any time.
00:32:56 In total, all the time spent on maintenance, development, and being on call 24/7 typically costs between $45,000 and $55,000 per month.
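Adding up the monthly figures quoted so far gives a sense of the overall budget. This is a rough sketch using only the ranges stated in the talk; it leaves out Fastly, whose service is donated, and the names MONTHLY_COSTS and total_range are illustrative.

```ruby
# Monthly cost ranges in USD, as quoted in the talk.
MONTHLY_COSTS = {
  organizational_overhead: 5_000..10_000,
  aws_infrastructure:      20_000..20_000, # "around $20,000"
  developer_and_on_call:   45_000..55_000
}.freeze

# Sum the low ends and the high ends of every range.
def total_range(costs)
  [costs.values.sum(&:min), costs.values.sum(&:max)]
end

low, high = total_range(MONTHLY_COSTS)
puts "Monthly total: $#{low} to $#{high}"
puts "Yearly total:  $#{low * 12} to $#{high * 12}"
```

On these numbers the total comes to roughly $70,000 to $85,000 per month, or about $840,000 to $1,020,000 per year, before any donated credits are applied.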
00:33:40 Where does this money come from? It’s a great question! Funding for RubyGems comes from three main places: donated services, Ruby Central members and partners, and funding organizations.
00:34:11 We receive significant donated services from several sources, and I'm going to list them out here to thank them: DNSimple, Datadog, Honeybadger, AWS, and of course, Fastly.
00:34:29 AWS has been very open to providing credits on request, and in any given year, we can typically cover 80 to 90% of our AWS bill via those credits. Fastly donates 100% of our service, which is extremely generous, and we are very grateful to them.
00:34:57 The other services mentioned also provide fully comped accounts with smaller dollar values. So thank you to all of them!
00:35:23 While we have always been able to run RubyGems.org primarily on donated hosting and other infrastructure services, we want to reach a place where we're not dependent on donated hosting in order to keep the service alive.
00:36:04 To pay for organizational costs, services that aren't donated, and developer hours, we rely on memberships with Ruby Central, as well as partnerships with companies, non-profits, and governmental agencies. We've gotten very good at writing grant proposals over the past couple of years; it is a very particular skill.
00:36:45 Ruby Central memberships are managed through our website at rubycentral.org, and we have options for single developers and small companies, all the way up through large corporations. For partnerships, we are currently partnering with Shopify and the German Sovereign Tech Fund to underwrite the majority of our developer time.
00:37:21 Those agreements will cover our current basic level of part-time maintenance for another year or so, but we won't be able to keep it up indefinitely without more help. In general, we try to fund as much as possible given our budgetary constraints.
00:37:56 So, the biggest improvement in the last year is our new 24/7 'chase the sun' on-call rotation, where every developer is only on-call during their local business hours, so I'm not being paged while I'm sleeping, which I appreciate. I’ve been woken up by pages many times before but never for RubyGems.org!
00:38:35 This is our on-call team. In addition to on-call coverage, we fund infrastructural maintenance work—upgrading tools, servers, software, responding to alerts, and general Site Reliability Engineering work.
00:39:25 In parallel with site reliability, we're working to expand the tools we provide to developers to research and interact with gems. In-progress projects include easily browsing the contents of gems and diffs between gem versions online, as well as instantly accessible hosted documentation for every gem.
00:40:12 Finally, we are actively working with packaging teams from other languages as well as security teams from companies like Google, GitHub, and GitLab to improve the security of RubyGems.org and the entire Ruby language ecosystem.
00:40:45 This has been my focus for most of this year. Upcoming security projects include trusted publishing, similar to what PyPI has, which allows GitHub Actions to push gems in a high-security way without storing any long-lived user API keys that can be exposed, stolen, or misused.
00:41:26 Beyond the RubyGems.org site, we also fund some of the developers who work on the RubyGems and Bundler libraries, constantly working to make managing your gems faster, easier, and fingers crossed, less buggy.
00:42:06 Upcoming projects in the CLI tool side of things include the previously mentioned Bundler Compose, as well as storing and validating gem checksums as part of every install to help keep all users of Bundler secure.
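The core of checksum validation is small: hash the downloaded bytes and compare the result with a digest recorded earlier. The sketch below assumes SHA-256 and uses a hypothetical helper name, checksum_matches?; it does not show how Bundler actually stores or verifies checksums.

```ruby
require "digest"

# Hypothetical helper: true when the SHA-256 digest of the downloaded
# bytes matches the digest that was recorded earlier.
def checksum_matches?(gem_bytes, expected_sha256_hex)
  actual = Digest::SHA256.hexdigest(gem_bytes)
  actual == expected_sha256_hex.downcase
end

# Stand-in bytes; a real check would read the .gem file from disk.
package  = "pretend these are the bytes of example-1.0.0.gem"
recorded = Digest::SHA256.hexdigest(package)

puts checksum_matches?(package, recorded)              # unchanged package
puts checksum_matches?(package + "tampered", recorded) # modified package
```

Because the recorded digest travels with the project rather than with the download, a package that was altered after the digest was recorded fails the comparison at install time.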
00:42:50 Given how long this list is, it may surprise you to learn that all of this is happening without even one single full-time developer. There truly is a lot going on, and our developers are extremely dedicated and quite good at their jobs.
00:43:31 But all this work is moving at a sort of constant slow-motion pace because every person involved can only get paid for a few hours per week of work.
00:44:03 We would like to do more. If you or your company has an interest in supporting and improving RubyGems and related open-source projects, we would like to hear from you.
00:44:42 So what would we do if we had more support? It might shock you, but we have a pretty extensive roadmap, including a significant amount of cutting-edge best practices around security and code provenance, as well as goals to hugely improve the experience of being a Ruby developer and using gems to build software on a daily basis.
00:45:20 Our first goal is to secure enough funding to hire a dedicated full-time role focused on improving security, working on projects like trusted publishing, checksum verification, full passkey support, The Update Framework (TUF) for signed gem indexes and metadata,
00:46:02 increased SLSA compliance, Sigstore support, and all the other buzzwords around software supply chain security that are all the rage these days.
00:46:49 Once we have more energy and time focused on improving security issues, the next thing we want to do is significantly improve the experience of working with gems as a developer. That high-level goal includes items like more gem information, including downloads over time, comparisons with other similar gems, browsable gem content, hosted and searchable documentation, popularity information over time, and in-browser gem playgrounds to test out a gem's API quickly before having to download and install it into your app.
00:47:34 So now that you’ve heard about our work and our goals, here are all the ways you can support that work and help us improve the security and experience of being a Ruby developer and using RubyGems to share code.
00:48:22 Number one is you can contribute code. We've got a room full of developers here. If you find a problem or just want to practice your Ruby coding chops while making the world a better place, you can either work for a startup or you can go to GitHub.com/RubyGems and open a pull request. We would love to have your help with Bundler, RubyGems, or the RubyGems.org Rails app.
00:49:11 You can also read RFCs and give us feedback. We have a repo on GitHub at rubygems/rfcs where we write proposals for significant changes to how Bundler and RubyGems work. We would love to hear more feedback from members of the Ruby community so we can adjust our plans to cover your needs and use cases.
00:49:50 You can also tell your teammates about this talk or link to the video of it, and make sure your programming friends know about the existence of Ruby Central and the conferences it runs. You’ve all been posting about this excellent conference online, right? And the work that our open-source team is doing to ensure that everyone who wants gems can always get them.
00:50:45 We need all the help we can get to sustain the RubyGems ecosystem for years to come. Join us as a member at whatever level you or your company is able to, and that helps us keep your tools working.
00:51:37 For companies or non-profits, we have a partnership program for sponsorships or mutual agreements. So here at the conference, you can talk to Ruby Central's Executive Director, Adarsh Pandit, and say hi if you see him, or you can send us an email at [email protected].
00:52:06 Finally, I'd like to say give us feedback. You know, everyone who's working in this space is a Ruby developer too and we're doing it for all of you.
00:52:49 We're trying to make these tools as good as we can and our services as useful and reliable as we can. The best way to make sure that what we're doing is working for you is for you to talk to us, so come talk to me, Martin, Colby, or reach out to the rest of the RubyGems and Bundler team on Slack.
00:53:26 Tell us what we’re doing well, what we're not doing well; preferably what we’re doing well. We’re humans, we like positive reinforcement. But yeah, we’d love to hear from you! We’re not trying to do this work in a vacuum.
00:53:53 Finally, everyone at Ruby Central wants to thank you for coming to this talk, for coming to RubyConf, and for being a Rubyist in the first place. None of the tools we use every day would exist if not for this amazing Ruby community coming together to build them.
00:54:39 So give yourselves a round of applause as the assembled Ruby community. Thank you very much! I'm super excited to hopefully be able to do this again next year and share another year's worth of exciting developments.