`bundle install` Y U SO SLOW: Server Edition

Summarized using AI


Terence Lee • March 07, 2013 • Earth

In this talk from Ruby on Ales 2013, Terence Lee focuses on the optimization challenges in Bundler, the gem dependency manager for Ruby applications, particularly the performance of the `bundle install` command. The backdrop is the downtime RubyGems.org experienced on October 17, 2012: the Dependency API, introduced for Bundler 1.1+ to speed up fetching gem dependencies, significantly improved the user experience but put so much load on the existing infrastructure that it had to be disabled, prompting the Bundler team to build a standalone service for it. Lee elaborates on the following key points:

  • Introduction to the Speaker: Terence Lee introduces himself and shares his role in enhancing the developer experience when deploying Ruby applications. He expresses excitement about his journey in the Ruby community, including his involvement with Bundler and educational initiatives like Rails Girls.

  • The Background of Bundler: Lee discusses the historical context of dependency management in Ruby, emphasizing the problems developers faced with versions and gem installations prior to Bundler. He refers to personal experiences where dependency management could take an entire day due to mismanaged gem configurations.

  • Improvements in Bundler: The transition from slow dependency resolution times (up to 18 seconds) to installations completing in roughly 3 seconds after the introduction of the new API is highlighted. This enhancement allowed Bundler to fetch only the necessary metadata for a given list of gems, drastically reducing loading times and CPU usage.

  • The API Development: Lee walks through the challenges faced with RubyGems.org, including its prior infrastructure limitations and how this led to the API's development. The new system allows for efficient metadata retrieval while ensuring RubyGems remains operational under peak loads.

  • Architecture and Performance Monitoring: He discusses the architecture of the new API, leveraging PostgreSQL and implementing strategies like in-memory caching for real-time data access. PagerDuty is used for alerting, and metrics are monitored through Librato to ensure system performance.

  • Community Involvement: The speaker emphasizes the project's community foundation, encouraging contributions and collaboration. He discusses the transparent nature of the project where the code is available for input and improvements.

In conclusion, the presentation underscores the importance of optimizing tools and services in the Ruby ecosystem and how collaborative efforts can lead to significant improvements in user experience for developers using Bundler. Lee encourages the community's engagement in enhancing these tools further.


If you've ever done anything in Ruby, you've probably used rubygems.org to search for or install your favorite gem. On October 17, 2012, rubygems.org went down. A Dependency API had been built for Bundler 1.1+ to speed up bundle install. Unfortunately, it was a bit too popular and the service caused too much load on the current infrastructure. In order to get rubygems.org back up, the API had to be disabled. You can watch the post-mortem for details. http://youtu.be/z73uiWKdJhw

Help us caption & translate this video!

http://amara.org/v/FGbC/

Ruby on Ales 2013

00:00:20.400 Awesome, thank you! I'm super excited to be here. Last year I gave my first talk ever at this conference, so I'm definitely looking forward to this experience.
00:00:41.520 As for my role, I work at Heroku and handle your experience when you push Ruby apps up there. If something is broken, it's definitely my fault. If you hate it, it's really my fault too, so if you have any issues, please come talk to me.
00:00:53.600 Here's my contact info: you can find me on Twitter as @hone02, and my GitHub is just 'hone.' You can also email me at [email protected]—it's with one 'r' and no 'a's. People often misspell my name, and if that happens, I won't get your email.
00:01:06.159 I come from Austin, Texas—not the Bay Area—and I haven't been home in three and a half months. My stuff lives there, though, so if you're ever in town, I'd love to take you out for tacos. Just reach out and we can grab some tacos—my treat!
00:01:23.840 One of the awesome things about Austin is the great barbecue. There's this amazing local barbecue chain called 'Goode Co.' that now has locations all over Texas, and in Austin they have these fantastic hand-washing machines. You stick your hands in, and after 30 seconds they come out clean, which is great after eating a bunch of greasy barbecue. The best part is that you get stickers after using them!
00:01:41.360 When I go home this Saturday, I'm excited to collect stickers for my taco tour. It will be a great collection! If you guys know Constantine, you might be familiar with my favorite sticker that he has on his laptop.
00:01:59.680 In the community, I work on Bundler and the Bundler API project, which I'm going to talk about today, and I've received help from Steve. I've also done a bunch of work with Rails Girls around the world. If you haven't participated in Rails Girls or any educational organization that teaches beginners, I highly recommend it. It's a great experience to see the world from their perspective and realize how much knowledge you forget and how difficult it is to be a beginner.
00:02:22.000 I like to do this every so often to remind myself of the challenges beginners face. I was just in Australia last month, and Corey Haines gave the opening keynote. Corey is known both for being a gamer and for holding an interesting title.
00:02:35.200 One of the things he discussed was a self-mentoring process where he would take on the title of someone he aspired to be, like Aaron Patterson, whom he greatly admires for his open-source contributions. So, twice a week, he commits to doing open-source work, which inspired me, but for different reasons. Aaron also has this fun thing called the 'Friday Hug,' where he takes a picture of himself hugging and tweets it every Friday, along with pictures of his beloved cat Gorby, who has a Twitter account, @gorbypuff. If you're not following it, I highly recommend you do so!
00:03:04.159 Inspired by Aaron, I started doing Friday Hugs as I traveled around the world giving talks. I've been to Finland, Singapore, Amsterdam, Seattle, and more. Today is Friday, and I'm at a Ruby conference, so it makes sense to do a Friday Hug! If everyone could stand up and pose for a picture, that would be great!
00:03:36.960 Thank you, everyone! Now, let’s move on to the talk.
00:03:44.000 Today, I'm going to talk a little bit about Bundler, the client, and provide some context for those who weren't here last year. I will explain why we actually created the API, discuss the old RubyGems.org infrastructure, and go over the incident when it went down. I'll also introduce the Bundler API service that the Bundler team built and maintains. Depending on how much time we have and people's willingness to skip lunch, I'll have a bonus round at the end, so we'll see if we want to discuss some extra things.
00:05:50.000 So, as I was asked in the last presentation, 'Who is Jessica?' This is Jessica Lynn Suttles, who you can find on Twitter as @janelle_suttles. If you’re not following her, you should definitely do so! She was recently sponsored, and I think today is a good day to congratulate her on joining the Bundler team.
00:06:01.520 In the two years I've been working on Bundler, we haven’t had anyone else join the team, so this is the first person who will be helping us with things. I'm super excited to have another person on the team helping Andre and me, as well as Yehuda. Congratulations!
00:06:37.840 Now, what is Bundler? For those who aren't familiar, at a high level, it’s a gem dependency manager for your Ruby application.
00:06:49.919 Those who were here during the Rails 2 days, before Bundler, will surely remember the headaches with dependency management. I remember starting on an application at my first Ruby job, at OtherInbox in Austin, and it could take a whole day just to get the app running because we had to figure out exactly which gems and versions to install, even when they were listed in the app's configuration.
00:07:04.000 Some gems weren't even listed, leading to confusion among co-workers about which exact gems to install. I remember using Cucumber back then, and the versions were not backward compatible. If you were off by a patch release, it could break the entire integration test suite.
00:07:34.240 When I saw Yehuda and Carl present on Bundler, I thought it was the best thing since sliced bread! The interface most of you are familiar with is simply the command 'bundle install' or 'bundle,' which defaults to the install command. You have a Gemfile where you list your sources, using RubyGems or any private repos you have, and then you list the gems by name along with their version requirements.
00:08:03.280 The sample Gemfile we'll be using in this presentation is for a simple Sinatra project. When I gave this presentation last year, the latest version at the time was Bundler 1.0.22. Running bundle install back then, you would get this 'Fetching source index' message and a very long wait.
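For context, a Gemfile along these lines is all it takes (the exact contents used in the talk are an assumption):

```ruby
# A minimal Gemfile of the kind discussed in the talk (exact gems/versions assumed).
source "https://rubygems.org"

gem "sinatra"
```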
00:08:46.120 For that simple Gemfile that just included Sinatra, it took roughly 18 seconds to resolve dependencies. A lot of people joked that they would just go get a sandwich or do something else while they waited for the install command to finish. To understand why it was so slow, we need to look at how Bundler resolves dependencies.
00:09:06.000 Bundler has a class called Index that does a lot of work. Most importantly, it contains all the specification information necessary to perform the resolution, and it maintains an instance variable called specs, which lists all the gems organized by name and their associated versions.
00:09:34.000 To fill up the index, we need to fetch all the specs. Inside the index class, we have a method called remote_specs, which goes through all the sources listed in the Gemfile and calls a function to fetch all remote specs for them.
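As a rough sketch of that old flow (illustrative only, not Bundler's actual code; `fetch_all_specs` is a hypothetical method on a source):

```ruby
# Rough sketch of the old approach: build an index of every spec
# from every source before resolving anything.
class Index
  def initialize
    # gem name => list of specs (one per version/platform)
    @specs = Hash.new { |hash, name| hash[name] = [] }
  end

  def use(spec)
    @specs[spec.name] << spec
  end
end

def remote_specs(sources)
  index = Index.new
  sources.each do |source|
    # fetching the full source index pulls metadata for every gem ever published
    source.fetch_all_specs.each { |spec| index.use(spec) }
  end
  index
end
```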
00:10:03.120 If you were to look at the main index, you’d see that it’s only about 1.2 megabytes in size, so it really shouldn’t take that long to fetch. I tested it a month ago, and even over a slow connection, it only took a moment to download.
00:10:42.400 Inside Bundler, there is a remote specification class that acts like a gem specification, tracking the name, version, and platform. However, one important detail is that there’s no dependency information included in this, which hampers the resolution process.
00:11:02.640 Without dependency information, Bundler cannot resolve dependencies, meaning it has no idea what it needs to install when you fetch a gem like Sinatra. Therefore, it has to fetch the gemspec for each of those dependencies during the resolution process.
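Conceptually, each remote specification only learns its dependencies by fetching the full gemspec, something like this sketch (illustrative, not Bundler's code; `fetch_gemspec` is a hypothetical helper):

```ruby
# Illustrative sketch: dependencies are unknown until the gemspec is fetched,
# which means a network round trip per gem during resolution.
class RemoteSpecification
  attr_reader :name, :version, :platform

  def initialize(name, version, platform, source)
    @name, @version, @platform, @source = name, version, platform, source
  end

  def dependencies
    @dependencies ||= @source.fetch_gemspec(name, version).dependencies
  end
end
```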
00:11:19.200 Going back to the remote specs method, Bundler creates remote specification objects for every gem version and platform tuple that exists on RubyGems. This approach consumes a lot of memory. In the early days of Bundler 1.0, if you had a low-memory instance, like a Linode with only 256 MB of RAM, running a bundle install could exhaust the memory.
00:11:40.560 We ran into performance issues on the Bundler team and realized we needed to optimize this memory consumption, which became one of the first things I worked on after joining post 1.0.
00:12:00.800 With newer versions of Bundler (1.1 and later; 1.2.3 at the time of this talk), 'Fetching source index' was replaced with 'Fetching gem metadata.' The new process allowed installations like our simple Gemfile to complete in roughly three seconds rather than eighteen! The key to this improvement was the new API endpoint provided by RubyGems.org.
00:12:36.560 This endpoint takes a comma-separated list of gem names and returns all necessary metadata, including names, versions, platforms, and their dependencies. This means we can use a recursive method to hit the endpoint for each gem in the list, allowing us to construct a much smaller dependency graph.
00:12:59.840 This means we can determine dependencies without including irrelevant gems like Rails that are not part of the first-level dependencies for the given gem. We can build a more optimized structure, allowing Bundler to resolve dependencies quickly and efficiently.
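As an illustration of the client side, this is roughly how the dependency endpoint could be queried at the time (the response format is from memory and should be treated as approximate):

```ruby
# Query the dependency API for a comma-separated list of gem names.
# The response is a Marshal dump of an array of hashes containing name,
# number, platform, and dependencies (format approximate).
require "net/http"
require "uri"

gems = %w[sinatra rack tilt].join(",")
uri  = URI("https://rubygems.org/api/v1/dependencies?gems=#{gems}")

deps = Marshal.load(Net::HTTP.get(uri))

deps.each do |dep|
  puts "#{dep[:name]} #{dep[:number]} (#{dep[:platform]}) -> #{dep[:dependencies].inspect}"
end
```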
00:13:46.000 Now, let’s transition over to the server side. The old setup for RubyGems.org was a single quad-core machine that ran everything it needed to. This included the Rails application at rubygems.org, a PostgreSQL database for user account information, gem metadata, and caching counts.
00:15:00.559 On October 17th, RubyGems.org went down. In our post-mortem discussions, we noticed that the bundler API accounted for 73% of the overall traffic to the machine. It also used over 300% of the available CPU most of the time, largely due to marshalling.
00:15:32.640 The API returns marshalled Ruby objects so that Bundler doesn't need to load additional libraries to parse the response, but marshalling the data for every single request is CPU-intensive. So, the Bundler API was disabled to restore RubyGems.org's availability to developers.
00:16:11.920 Recognizing how much this impacted the community, I was fortunate that Heroku, where I work, let me prototype a new Bundler API after this happened, one that could take some of the load off the existing RubyGems infrastructure.
00:16:59.840 We designed the architecture to be simple; it would pass a comma-separated list of gems to an endpoint and receive a response with the necessary metadata. This way, the traffic load wouldn’t overwhelm the existing RubyGems servers.
00:17:41.840 The endpoint needed to remain up to date with real-time data. We initially considered using webhooks to be notified whenever a new gem was released to update our records, but that's a feature that came later.
00:18:10.960 The service consists of two parts: the database and the API. The database was bootstrapped from PostgreSQL data taken from RubyGems.org; we obtained a dump of gem versions and metadata, which allowed us to bootstrap the entire process. Inside that database, we linked dependencies to specific versions.
00:19:08.800 One crucial component we kept in mind was scoping dependencies to avoid circular references. When deploying an application, you typically don’t need the development dependencies. Scoping this information to runtime was key during queries.
00:19:58.560 When setting up our queries for dependencies, we made sure clients receive an octet stream back from the endpoint: a Marshal dump of all the matching dependencies, produced as efficiently as we could.
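A minimal sketch of such an endpoint, assuming a Sinatra app, the pg gem, and a simplified schema (this is not the actual bundler-api code; the table and column names are assumptions):

```ruby
# Minimal sketch of the dependency endpoint (assumed stack and schema:
# Sinatra + pg, tables `versions` and `dependencies`; not the real bundler-api).
require "sinatra"
require "pg"

DB = PG.connect(ENV.fetch("DATABASE_URL"))

get "/api/v1/dependencies" do
  names = params[:gems].to_s.split(",")

  payload = names.flat_map do |name|
    DB.exec_params(
      "SELECT id, number, platform FROM versions WHERE name = $1", [name]
    ).map do |version|
      deps = DB.exec_params(
        "SELECT dep_name, requirements FROM dependencies " \
        "WHERE version_id = $1 AND scope = 'runtime'", [version["id"]]
      )
      { name: name,
        number: version["number"],
        platform: version["platform"],
        dependencies: deps.map { |d| [d["dep_name"], d["requirements"]] } }
    end
  end

  content_type "application/octet-stream"
  Marshal.dump(payload)
end
```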
00:20:32.640 Given all that, we managed to put together a system that accurately reflects the gem ecosystem with all its dependencies. This way, when a gem is pushed to the RubyGems registry, our API would respond accordingly with the updated metadata.
00:21:18.000 Initially, we implemented a simple polling mechanism to grab updates from RubyGems.org regularly. This approach kept our local cache populated with new and yanked gems whenever necessary.
00:22:09.440 At the time, rates of gem releases were around one new gem every five minutes, making it important to keep our API up-to-date without delay.
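A hypothetical version of that polling loop might look like this (the activity endpoint and the `upsert_version` helper are assumptions, not the actual implementation):

```ruby
# Hypothetical polling loop to keep the local database in sync.
require "net/http"
require "uri"
require "json"

loop do
  body = Net::HTTP.get(URI("https://rubygems.org/api/v1/activity/just_updated.json"))
  JSON.parse(body).each do |gem_info|
    upsert_version(gem_info)  # hypothetical helper that writes to our database
  end
  sleep 300  # roughly the pace of new gem releases at the time
end
```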
00:22:45.280 To optimize performance and resource usage, we started keeping an in-memory hash to quickly access metadata without hitting the database every time. After the first run, the system became really fast, allowing for rapid access to gem information.
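Something like this simple memoized hash captures the idea (cache invalidation is omitted; `fetch_dependencies_from_db` is a hypothetical helper):

```ruby
# Sketch of the in-memory cache idea: after the first lookup for a given
# list of gems, later requests skip the database entirely.
CACHE = {}
CACHE_MUTEX = Mutex.new

def dependencies_for(names)
  key = names.sort.join(",")
  CACHE_MUTEX.synchronize do
    CACHE[key] ||= fetch_dependencies_from_db(names)  # hypothetical DB helper
  end
end
```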
00:23:13.440 As a result, it only takes one to two minutes, which is a significant improvement compared to what we had before. We also moved to a webhook system, allowing for better communication between RubyGems and our API.
00:24:02.080 Today, when a gem is pushed, it sends a request to our API, notifying our system that a new gem has been added or updated. This system includes security measures to prevent unauthorized access to the endpoint, ensuring only legitimate gems are processed.
00:24:50.080 Additionally, we implemented timeouts to prevent stalled communication if our API is ever down or experiences load. This way, the push process remains efficient, allowing developers to release gems without concern.
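A sketch of what the receiving side of such a webhook might look like (the payload fields and the shared-secret check are assumptions, not the actual RubyGems webhook contract):

```ruby
# Hypothetical webhook receiver; payload shape and auth scheme are assumptions.
require "sinatra"
require "json"

post "/webhook/gem_pushed" do
  # reject requests that don't carry our shared secret
  halt 401 unless request.env["HTTP_AUTHORIZATION"] == ENV.fetch("WEBHOOK_SECRET")

  payload = JSON.parse(request.body.read)
  upsert_version(name: payload["name"],        # hypothetical helper that updates
                 number: payload["version"],   # our local gem metadata
                 platform: payload["platform"])
  status 200
end
```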
00:25:36.480 The application runs on Heroku, using six dynos, each spinning up three workers, with separate databases for reads and writes to optimize performance. We figured this setup made sense early on, limiting destructive writes to keep things running smoothly.
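For illustration, a unicorn configuration along these lines would give three workers per dyno (the actual application server and settings used are assumptions):

```ruby
# config/unicorn.rb - hypothetical worker setup matching "three workers per dyno".
worker_processes Integer(ENV.fetch("WEB_CONCURRENCY", 3))
timeout 30
preload_app true
```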
00:26:32.560 On the operations side, PagerDuty provides us with an account, and we ensure that Andre, Larry, and I are always on call. If an issue arises, we get alerts via text or email so we can address problems as soon as they happen.
00:27:23.920 We also use Papertrail to manage and store logs, enabling us to analyze bugs and other issues. Additionally, we leverage Librato for performance monitoring; Larry played a significant role in implementing instrumentation throughout the application to track relevant metrics.
00:28:20.080 All our metrics are publicly available via a specific URL related to our services, as we strive for transparency around our operations and performance metrics.
00:29:05.160 To help us monitor performance and availability, we set up error alerts, allowing us to quickly react to spikes in errors or service issues that may arise during operations.
00:29:40.000 Setting up our monitoring with Librato requires using the metrics gem, making it easy to track timing and other metrics in a seamless manner. Developers can easily integrate this into their production systems.
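As a generic illustration of reporting a timing from Ruby with the librato-metrics gem (this is not the instrumentation Larry wrote; the metric name and the timed call are made up):

```ruby
# Generic librato-metrics illustration (not the actual instrumentation).
require "librato/metrics"

Librato::Metrics.authenticate ENV["LIBRATO_USER"], ENV["LIBRATO_TOKEN"]

started = Time.now
result  = dependencies_for(%w[sinatra rack])  # hypothetical call being timed
elapsed_ms = ((Time.now - started) * 1000).round

Librato::Metrics.submit deps_lookup_ms: elapsed_ms
```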
00:30:26.000 Our commitment to building this API as a community project ensures that everything is transparent—no secrets. Our code is out there in the open, ready for collaboration.
00:30:45.760 We encourage anyone interested to reach out, as Andre and I are always happy to help you get started or answer questions. There are numerous possibilities for improvements, and we look forward to what our community can offer.
00:31:34.160 We aim to enhance visibility and monitoring and welcome any suggestions. Contributions are welcome in coding solutions, too. We hope to introduce unique tracking for each Bundler install to gather metrics on gem usage within the community.
00:32:29.280 We hope to improve integration with RubyGems data to make it easier to access specific gem information without needing to download extensive datasets every time. Our team is always open to suggestions and insights from the community.
00:33:29.000 I'd like to acknowledge the amazing contributions of Andre for his work on Bundler and his support in managing the Bundler API service alongside me. Larry was crucial in helping us monitor our application effectively, and Daniel Farina from Heroku assisted with the initial setup.
00:34:10.000 Before I finish, I’d like to ask if anyone has any questions regarding the project, the API, or anything else.
00:34:43.240 Thank you, everyone.