Living on Rails Edge

Recorded in June 2018 during https://2018.rubyparis.org in Paris. More talks at https://goo.gl/8egyWi

Paris.rb Conf 2018

00:00:11.670 Welcome back to the Paris.rb conference. I have the pleasure and honor of introducing Rafael França, one of the main contributors to Rails.
00:00:22.810 So, I'm here today to talk to you about what it means to be living on the Rails edge and why someone would want to do that.
00:00:36.520 This is a story about how we at Shopify are celebrating our use of Rails and how Shopify can celebrate the development of Rails.
00:00:46.180 My name is Rafael França, and you can find me on Twitter and GitHub with this handle. I’ve been a member of the Rails core team since 2012 and I work at Shopify as a production engineer. Shopify is a commerce platform, and we are also hiring, so if you want to hear more about this, you can find some people at the Shopify booth. If you have never seen me in person, I assure you that I am very active on GitHub, and I lead many projects related to Rails.
00:01:19.120 Now, back to my story about Shopify. Shopify started around the same time as Rails; it started in 2004, the year of the first public release of Rails. Our CEO, Tobi, was a member of the Rails core team and was working with Rails even before its first release. Shopify’s codebase was also never rewritten, which means we can see the entire history of Shopify and Rails in the same timeline. Here, we have a timeline showing the important version releases for Rails over the years and how Shopify was using it.
00:01:57.130 You can see that we were very close to the Rails release schedule. Shopify was also on the Rails 2.0 series when that was released. This situation remained until Rails 3. With Rails 3, the situation changed somewhat; we only managed to upgrade Shopify to the Rails 3 series almost one year after its release. The same problem occurred with later releases: after Rails 4, for example, our application was only upgraded nearly a year later.
00:02:25.000 Now, Shopify is on Rails 5 and even began working with Rails 5.2 prior to its official release. Shopify is also a huge application, possibly one of the largest Rails applications in the world, with almost 300,000 lines of code and a highly reliable test suite.
00:02:40.390 Moreover, we have around 1,100 lines of models with more than 2,000 classes, and the same distribution applies to our controllers. We maintain a very healthy code-to-test ratio, about 1.3 to 1. Shopify has always closely followed the latest versions of Rails. We have been running Rails 5.2 in production since March, while Rails 5.2 was only officially released in April.
00:03:09.550 We were able to use Rails 5.2 even before its official release, and since then, we've been testing against the upcoming sixth version, which has not even been officially released yet.
00:03:34.690 So, why do we choose to live on the edge? To explain this, I need to share one of our challenges with background jobs. We transitioned from using the Delayed Job framework to Resque as our background job processor back in 2010. We built this framework to effectively migrate background jobs that were previously running on the database. Early on, we had to move to Resque, which, as we found, provided a superior solution given the learnings we had from the high-scale application we were running at that time.
00:05:04.750 In 2012, Sidekiq was released as a simple and efficient message processor for Rails. At the time, we were using Resque and found Sidekiq to be an interesting option, but we had no resources to migrate immediately. If we look closely, the Rails community quickly rallied around Sidekiq as the de facto standard for background jobs, while the maintainers of Resque lost interest.
00:05:21.610 The last major update to Resque came almost three years after its previous version, which is enough to make anyone nervous. It made us realize that the time we had invested could go stale. Moreover, maintaining alternatives would be incredibly hard for us considering the architecture we had built.
00:05:36.860 In 2014, Rails released Active Job, a common interface for background jobs. Active Job had a similar syntax to what we had at Shopify at the time but was different enough that it did not fully align with our implementation. Active Job supported multiple back-ends without requiring changes to application code.
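To illustrate the idea of swapping back-ends without touching job code, here is a minimal plain-Ruby sketch of the adapter pattern. It is not Active Job's implementation; `GreetingJob`, `InlineAdapter`, `MemoryAdapter`, and `perform_later` as a free function are hypothetical names chosen for this example. In a real Rails application the back-end is selected with the `config.active_job.queue_adapter` setting.

```ruby
# Hypothetical job class; in Rails it would inherit from ActiveJob::Base.
class GreetingJob
  def perform(name)
    "Hello, #{name}!"
  end
end

# Adapter that runs the job immediately in the current process.
class InlineAdapter
  def enqueue(job_class, *args)
    job_class.new.perform(*args)
  end
end

# Adapter that only records the job, standing in for a real queue back-end.
class MemoryAdapter
  attr_reader :jobs

  def initialize
    @jobs = []
  end

  def enqueue(job_class, *args)
    @jobs << { "job" => job_class.name, "args" => args }
    nil
  end
end

# Application code calls this method and never a specific back-end.
def perform_later(adapter, job_class, *args)
  adapter.enqueue(job_class, *args)
end

inline_result = perform_later(InlineAdapter.new, GreetingJob, "Paris")

memory = MemoryAdapter.new
perform_later(memory, GreetingJob, "Paris")
```

The job class is identical in both calls; only the adapter object changes, which is the property the talk attributes to Active Job.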
00:06:01.590 Many of the patterns we were working with in Rails became foundational as we embraced the changes brought about by Active Job, which ultimately helped teach a new generation of developers how to build powerful applications.
00:06:21.750 Eventually, our application at Shopify integrated Active Job and Sidekiq because that’s what developers were familiar with at the time. Eight years after we created our custom background job framework, it became outdated. New patterns emerged, and we ended up with a snowflake solution that few people understood how to manage.
00:06:53.260 This situation is like when you fracture a bone, and it heals in the wrong way. You can continue to use your arm, but it hurts, and the only way to fix it is through surgery. The next story is about how we send email in the background at Shopify, which was implemented in 2011.
00:08:02.800 When you send an email, you instantiate your message by calling the action in your mailer, then call the deliver method to send the mail to the user. I had the idea to monkey-patch Rails to maintain the same behavior, but instead of sending the mail immediately, it would schedule a background job to send the email later.
00:08:32.599 Therefore, we monkey-patched Rails to return a background email proxy object instead of returning an actual mail message delivery object. This proxy would schedule a job to send the email instead. This was a simplification, but we had a class that would receive the mail class, and the method you wanted to send would be stored in that instance.
00:09:03.890 We serialized the email message, encoded it in base64, and pushed that to our background queue. The background worker would grab the parameters, decode the message, create the proxy again, and call the delivery method. Eventually, the delivery method would trigger the message delivery method, ensuring everything was sent correctly in the background.
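The serialize-then-encode step described here can be sketched with Ruby's standard `Marshal` and `Base64` modules. This is an illustration under assumptions, not Shopify's code: `WelcomeEmail`, `encode_for_queue`, and `decode_from_queue` are hypothetical names, and a `Struct` stands in for the real mail message object.

```ruby
require "base64"

# Hypothetical stand-in for a mail message; not the real Mail::Message class.
WelcomeEmail = Struct.new(:to, :subject, :body)

# Serialize the whole object with Marshal, then Base64-encode the bytes so
# they can be stored safely in a text-based queue payload.
def encode_for_queue(message)
  Base64.strict_encode64(Marshal.dump(message))
end

# Reverse the two steps inside the background worker.
def decode_from_queue(payload)
  Marshal.load(Base64.strict_decode64(payload))
end

payload = encode_for_queue(WelcomeEmail.new("user@example.com", "Welcome", "Hello!"))
restored = decode_from_queue(payload)
```

The round trip only works while the worker can load a class definition compatible with the one that produced the payload.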
00:09:40.770 However, when the object structure changes, issues arise. For example, suppose I update a class that sends an email but then try to load a message that was serialized with the old version of the class. I encountered a rather confusing situation in production when I attempted to use a new version of the job system and found that the old version could not deserialize the message.
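One way this kind of failure can show up is sketched below, assuming the payload was written with `Marshal` and the class it refers to no longer exists when the worker reads it. `LegacyDelivery` is a hypothetical class, and removing the constant simulates a deploy that renamed or deleted it; the actual incident at Shopify may have involved a different structural change.

```ruby
# Hypothetical class whose instances were marshaled into the queue.
class LegacyDelivery
  def initialize(mailer, action)
    @mailer = mailer
    @action = action
  end
end

# Payload written by the old version of the application.
payload = Marshal.dump(LegacyDelivery.new("UserMailer", "welcome"))

# Simulate a deploy in which the class was renamed or removed.
Object.send(:remove_const, :LegacyDelivery)

# The worker can no longer rebuild the object from the stored bytes.
error = begin
  Marshal.load(payload)
  nil
rescue ArgumentError => e
  e
end
```

Jobs enqueued before the deploy stay in the queue and fail every time they are retried, which is how users stop receiving emails.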
00:10:50.390 In this snippet of code, as I attempt to deserialize the message, I face an error where the application cannot process it anymore. This leads to a situation where users stop receiving emails, which has a detrimental impact on the service. Along with Active Job, Rails introduced a built-in way to send emails in the background.
00:11:49.110 Fortunately, I realized that I could use that feature instead of monkey-patching our initial setup. This new implementation uses message delivery objects that include all relevant details about the mailer, action names, and arguments, effectively scheduling jobs into the background queue without the previous serialization issues.
00:12:31.650 The new process does not use Marshal for serialization. Instead, we send the job parameters as plain Ruby objects, significantly reducing risk when object structures change. This method also provides performance benefits, since we only generate the message in the background when it needs to be sent.
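In Rails this behavior is reached through `deliver_later` on the message delivery object. The sketch below is a plain-Ruby approximation of the idea rather than Action Mailer's implementation: the queue payload holds only the mailer name, the action name, and simple arguments, and the message is built when the job runs. `UserMailer`, `enqueue_mail`, `perform_next`, and the use of JSON as the wire format are assumptions made for this example.

```ruby
require "json"

# Hypothetical mailer used only for this sketch.
class UserMailer
  def self.welcome(email)
    "To: #{email}\nSubject: Welcome"
  end
end

# Enqueue: store only the mailer name, the action name, and simple arguments.
def enqueue_mail(queue, mailer, action, *args)
  queue << JSON.generate("mailer" => mailer.name, "action" => action.to_s, "args" => args)
end

# Worker: look up the mailer by name and build the message at delivery time.
def perform_next(queue)
  job = JSON.parse(queue.shift)
  Object.const_get(job["mailer"]).public_send(job["action"], *job["args"])
end

queue = []
enqueue_mail(queue, UserMailer, :welcome, "user@example.com")
message = perform_next(queue)
```

Because the payload contains only strings and arrays, changing the internal structure of the message object between deploys does not invalidate jobs that are already queued.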
00:13:41.100 However, keeping everything integrated and functional does require some rigorous maintenance and attention. We aim to follow best practices, keeping dependencies minimal and ensuring backward compatibility so that actions taken in the framework do not break existing applications.
00:14:44.970 Living on the edge introduces challenges, but we believe it yields far greater rewards. The primary reason for adopting this strategy is the development of shared infrastructure that benefits both our application and the wider Rails community.
00:15:27.770 By building shared frameworks, we help each other avoid snowflakes in our solutions. Using widely-accepted standards fosters collaboration and simplifies onboarding for new members of the team.
00:16:27.620 Another advantage is being able to access newer features earlier. By leveraging the latest Rails releases promptly, we can run the latest code right from the start. This also helps evade regressions, minimizing the frustrations involved with upgrading dependencies.
00:17:20.670 Avoiding major upgrade hurdles means we can continually adapt, instead of stopping our entire team to migrate to the latest version all at once. This iterative approach means we can address any issues on an ongoing basis. Moreover, when issues arise, we can look at a smaller set of changes instead of sifting through thousands of commits.
00:18:37.090 Now let’s dive into how Shopify navigated this process of constant upgrades. At Shopify, we maintain continuous development cycles, merging changes constantly to keep the various parts of our ecosystem updated.
00:19:19.720 Due to the large size of our codebase and the numerous developers contributing to it, keeping our branches up-to-date with the latest Rails standards was essential to prevent compatibility conflicts. Daily operations were improved with these continuous integrations.
00:20:10.450 An essential strategy we implemented is known as 'dual booting.' This allowed us to run the same codebase with multiple versions of Rails for testing and development.
00:20:56.220 We utilized environment files where we specified which Rails version to run, making it much easier to test our application against both the latest stable release and the upcoming version under development.
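A common way to set up dual booting is a switch at the top of the Gemfile. The sketch below assumes the switch is an environment variable; `RAILS_NEXT`, the helper name `rails_gem_args`, the branch name, and the version constraint are all hypothetical and not taken from the talk.

```ruby
# Hypothetical helper for the top of a Gemfile. RAILS_NEXT is an assumed
# variable name, and the version constraint is only an example.
def rails_gem_args(env = ENV)
  if env["RAILS_NEXT"]
    # Track the framework's development branch straight from GitHub.
    [{ github: "rails/rails", branch: "main" }]
  else
    # Stay on the current stable series.
    ["~> 5.2.0"]
  end
end

# In a real Gemfile the result would be splatted into the gem call:
#   gem "rails", *rails_gem_args

stable_args = rails_gem_args({})
next_args = rails_gem_args({ "RAILS_NEXT" => "1" })
```

With a setup like this, the same checkout can run its test suite twice on CI, once with the variable unset and once with it set, typically with a separate lockfile for each mode.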
00:21:49.320 Keeping multiple versions operational in parallel ensured that users experienced minimal disruption during the upgrade process. As part of this strategy, we also maintained backward-compatible code practices, allowing gradual integration.
00:23:00.620 Alongside these practices, we introduced features to track failing tests effectively, marking them as expected failures during upgrade cycles. This insight helped us manage expectations among our developers, keeping morale high and mitigating the fear surrounding upgrades.
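The expected-failure bookkeeping could look roughly like the following sketch. The talk does not show Shopify's tooling, so the test names, the `EXPECTED_FAILURES` list, and the `classify` helper are hypothetical.

```ruby
require "set"

# Hypothetical list of tests known to fail on the next Rails version.
EXPECTED_FAILURES = Set["OrderTest#test_refund", "CartTest#test_totals"].freeze

# Classify a single test result against the list of known failures.
def classify(test_name, passed)
  expected = EXPECTED_FAILURES.include?(test_name)
  if passed
    # A listed test that now passes can be removed from the list.
    expected ? :unexpected_pass : :pass
  else
    # Only failures outside the list should break the build.
    expected ? :expected_failure : :failure
  end
end
```

Treating `:unexpected_pass` as a signal to shrink the list keeps the number of tolerated failures moving toward zero as the upgrade progresses.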
00:24:11.640 Delegating responsibilities in our testing strategy also proved to be crucial. By breaking our application into smaller components, the owners of those components became responsible for addressing test failures within their scope.
00:25:12.590 This approach led to higher ownership and accountability among technical teams, which allowed us to resolve issues with greater speed and adapt to changes more fluidly.
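Routing failures to component owners can be approximated by a simple path-prefix lookup. This is a sketch under assumptions: the directory layout, the team names, and the helpers `owner_for` and `failures_by_owner` are invented for illustration and do not describe Shopify's actual component structure.

```ruby
# Hypothetical mapping from component directories to owning teams.
OWNERS = {
  "components/checkout/" => "checkout-team",
  "components/billing/"  => "billing-team"
}.freeze

# Find the owner whose component directory contains the given test file.
def owner_for(path)
  prefix = OWNERS.keys.find { |candidate| path.start_with?(candidate) }
  prefix ? OWNERS[prefix] : "unowned"
end

# Group failing test files so each team sees only its own failures.
def failures_by_owner(failing_paths)
  failing_paths.group_by { |path| owner_for(path) }
end

report = failures_by_owner([
  "components/checkout/test/cart_test.rb",
  "components/billing/test/invoice_test.rb",
  "test/misc_test.rb"
])
```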
00:26:04.670 After ensuring tests were passing and establishing compatibility layers, we were ready to deploy upgrades to production. By focusing on gradual rollouts, we could monitor system performance and user experience.
00:26:57.170 In summary, we applied a systematic approach to build and maintain our large codebase while continuously upgrading if necessary, keeping in alignment with community standards and maintaining functionality across all components.
00:27:56.370 Post-deployment, we cleaned up our codebases by removing legacy code and features built around deprecated patterns. Our initial approach aimed at engagement and inclusion from various teams ensured broad participation in cleaning up and modernizing our infrastructure.
00:29:51.120 The final steps involved preparing for future upgrades by promoting ongoing adjustments and removals of conditionals. This emphasis on proactive maintenance allows us to take on technological challenges early and effectively.
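The conditionals mentioned here are typically version checks that select between an old and a new code path while both Rails versions are supported. The sketch below uses `Gem::Version` from RubyGems for the comparison; in a Rails application the current version is available as `Rails.gem_version`. The threshold, the helper names, and the two symbolic code paths are hypothetical.

```ruby
require "rubygems"

# First version that should take the new code path (assumed threshold).
NEXT_SERIES = Gem::Version.new("6.0.0.alpha")

# True when the given Rails version belongs to the upcoming series.
def next_rails?(rails_version)
  Gem::Version.new(rails_version) >= NEXT_SERIES
end

# Hypothetical compatibility switch; each symbol stands in for real code.
def code_path_for(rails_version)
  next_rails?(rails_version) ? :new_code_path : :legacy_code_path
end
```

Once production runs only the newer version, the check always returns the same answer, so the legacy branch and the conditional itself can be deleted.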
00:30:31.250 As a team, we began transitioning from Rails 5 to Rails 6, emphasizing that the journey to stay current is ongoing and must be shared across many hands.
00:32:02.520 One of the important things to remember is to mitigate risks while enjoying the many benefits of living on the edge. Avoid desperate monkey patches, keep dependencies minimal, and constantly evolve processes with a clear communication channel.
00:33:37.630 Finally, my encouragement to all of you is to not give up on being a pioneer in your field, even though the path may be daunting. The progress made along the way will foster a culture of improvement and innovation, yielding great results for everyone!
00:34:19.930 Thank you.
00:34:34.880 As mentioned earlier, my team is hiring. If you want to work on pushing Shopify and Rails forward, let’s talk!
00:34:48.320 So please stand up if you have questions.
00:34:54.710 Hi, thank you for your talk. You mentioned your stats showing you have over 2,000 models in Shopify; are they all Active Records or is there something else? You also mentioned having 500 controllers!
00:35:14.520 Indeed, most of those classes are Active Records, although there is a separate folder for non-Active Record classes.
00:35:31.580 So yes, around 2,000 of them are Active Record models.
00:36:08.620 Hi, thank you for your talk! I wonder what your test coverage is like to ensure confidence in moving to the next version of Rails.
00:36:28.730 Currently, we hover around 80% coverage overall. However, we sometimes encounter issues where certain features aren’t covered, and we address this by continuously adding tests for any production problems we identify.
00:37:10.480 We maintain a focus on integration tests to ensure the various components of our application produce the expected outputs and behavior.
00:37:41.890 What is the average run time for your tests?
00:38:08.460 When executed locally, it can take close to two hours, but we prefer to run them on CI, targeting around 10 to 50 minutes for a complete run.
00:38:44.370 You mentioned it’s easier to live on the edge when core contributors are involved; do you think it’s feasible for others?
00:39:34.300 Yes! I previously thought similarly before immersing myself in Rails contributions through various opportunities and projects. Keeping up-to-date can definitely seem daunting but is manageable for developers willing to engage.
00:40:01.200 Do you think monkey patches serve you well, or have they caused issues over time?
00:40:36.250 They can often create complications. We aim to keep the number of monkey patches minimal, but they do arise from time to time when expedient fixes are needed.
00:41:08.320 Any final questions?
00:41:16.440 Thank you, Rafael!