Software Engineering

Stress Testing as a Culture

by João Moura

In "Stress Testing as a Culture," given at RubyConf 2014, João Moura explains why stress testing matters throughout an application's development lifecycle, particularly for performance during high-traffic events like the World Cup. Moura emphasizes that stress testing and performance should be a priority at every level of an organization, not just among developers, because they affect the user experience and the overall success of a product.

Key points discussed include:
- Financial Impact of Performance: Companies like Amazon and Google quantify potential financial losses due to slow page load times, highlighting the need for fast performance to retain users.
- Increasing User Expectations: Current generations have a diminished patience for slow applications, with 47% expecting load times under two seconds.
- Cultural Integration of Stress Testing: Stress testing should become a part of the organizational culture, integrated from the design and marketing phases.
- Establishing Performance Goals: Organizations should set specific performance goals and optimize features that detract from those objectives, with an understanding of load ratios between frontend and backend processes.
- Types of Testing: Understanding the distinctions between stress tests, load tests, and other performance tests is critical for proper application preparation.
- Tools for Stress Testing: The choice of stress testing tools should align with specific application needs, with options like Apache JMeter and Loader.io discussed.
- Real-World Application: Moura shares personal experience as the CTO of a soccer social network in Brazil, illustrating the challenges faced when traffic surged during the World Cup despite preparations aiming to reduce load time drastically.
- Calls to Action: Moura encourages teams to embrace stress testing as a vital part of their development approach and to establish a robust culture around performance management.

In conclusion, João Moura stresses the importance of proactive measures in handling performance, setting realistic goals, selecting the right testing strategies and tools, and fostering a company-wide culture that prepares for and values stress testing. This comprehensive approach ensures applications can function effectively under high demand, ultimately impacting user retention and satisfaction during critical usage periods.

00:00:17.600 Hey people, good morning! You can do better than this. Good morning!
00:00:25.439 That's what I'm talking about. First of all, it's my pleasure being here today and I'm talking about stress testing.
00:00:35.160 I will walk around, so please don't mind me. 1.6 billion dollars: that is what Amazon would lose if its pages slowed down by one second.
00:00:46.620 Amazon is not the only company concerned about this. Many other large companies would lose money because of performance and page speed issues. Google conducted research indicating that if their pages became 500 milliseconds slower, their traffic would drop by twenty percent. That's a significant number. If we take a closer look at the relationship between page load speed and user expectations, we see that after just four seconds, page abandonment rates increase by 25 percent.
00:01:21.420 This is a huge number; it can be the difference between the success and failure of a product. It defines how many users you can support and how many lives your product can change. The thing is, users are becoming increasingly impatient. It's not just a generational thing; it's rooted in neuroscience. Research shows that 47% of people expect a page to load and be fully functional in two seconds. That's quite a challenge! What's worse is that 75% of these users stated they wouldn't return if a page took more than four seconds to load. You should take this very seriously, because our short-term memory is not very efficient: it can barely hold onto information for more than 10 seconds, and you can easily forget something after just a few.
00:02:33.019 You need to seriously concern yourself with performance. Performance does not only matter if you are a developer; it is crucial for everyone involved in the process—be it team leaders, product owners, or any stakeholders. At some point in the life cycle of your product, you will need to address questions like whether your application meets non-functional requirements and how it behaves under extreme circumstances like high traffic. Performance affects everyone involved in web development.
00:03:42.919 I am the founder and CEO of Gioco. Gioco is a gamification platform, and I am here to talk to you about stress testing as a culture. The first question we need to ask ourselves is: how can we improve our application's chances of success when it has to scale? Stress testing, performance testing, and load testing are all about being prepared to handle extreme circumstances.
00:04:19.370 To discuss these aspects, I will go through four key points that will enable you to make stress testing a part of your application's culture. However, before I dive into those points, I want to share a personal story about my journey here.
00:04:47.310 I am from Brazil. For those who don't know, this is Brazil's flag. I came all the way from São Paulo, which is the largest city in the Americas. It's a maze of buildings and traffic. People often ask me about our trees, lakes, and forests, which we do have, but not everywhere. It's not the typical image of Brazil that people imagine. Instead, we enjoy life and have big parties. One of our favorite pastimes is soccer, and you may know us for our famous players and teams. It was through soccer that I received an invitation to join a startup as their CTO. The startup was focused on soccer in Brazil.
00:05:54.710 Soccer is huge in my country, particularly with the World Cup approaching. We had a great opportunity as we secured a marketing deal with Neymar, a player from Barcelona, allowing us to use his image for the next ten years. This was a significant deal and we were preparing to make a marketing push as we approached the World Cup, but I began to worry about our application’s performance and whether it could support the expected traffic. I had just joined the company and had to consider whether our application could handle such pressure.
00:06:47.030 Performance is often a nightmare for developers; sometimes you plan for it, and other times it becomes your unexpected reality, and that's truly scary. I started to Google around looking for stress testing methods that could assure me that my company was prepared for the World Cup. This is how I learned about stress testing, and I decided we needed to make it part of our preparations. We ran a benchmark test of 15 requests in one minute, which is very low by industry standards.
00:07:46.330 Unfortunately, only three of those requests were successful, with an average load time of almost ten seconds. We were definitely not ready.
00:08:00.000 This experience shocked me, especially with our significant partnerships. If we could barely handle 15 requests in one minute, how could we ensure we could scale with the anticipated traffic? It's easy to blame developers, but performance is an organization-wide issue, involving marketing, design, and business rules.
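The benchmark described above boils down to two numbers: the share of requests that succeeded and the mean load time. The following is a minimal sketch of that arithmetic, not the tooling used in the talk. The function name `summarize` and the individual timings are invented for illustration; the talk reports only 15 requests, 3 successes, and an average of almost ten seconds, and it does not say whether that average covers failed requests (this sketch assumes it does).

```python
from statistics import mean


def summarize(outcomes):
    """Summarize a benchmark run.

    `outcomes` is a list of (succeeded, seconds) pairs, one per request.
    Returns (success_rate, mean_seconds), both computed over all requests.
    """
    if not outcomes:
        raise ValueError("no requests were recorded")
    successes = sum(1 for succeeded, _ in outcomes if succeeded)
    success_rate = successes / len(outcomes)
    mean_seconds = mean(seconds for _, seconds in outcomes)
    return success_rate, mean_seconds


# Hypothetical run shaped like the one in the talk: 15 requests, 3 successes.
# The per-request timings are made up; only the totals come from the talk.
outcomes = [(True, 9.0)] * 3 + [(False, 10.0)] * 12
success_rate, mean_seconds = summarize(outcomes)
print(f"{success_rate:.0%} succeeded, mean {mean_seconds:.1f}s")
```

Tracking both numbers matters: a fast mean is meaningless if most requests fail, and a high success rate is little comfort if each request takes ten seconds.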
00:08:35.750 This leads me into the first key point regarding stress testing: how to approach it. Once you realize how critical performance and stress testing are, you need to treat them as a priority. You have to treat performance the way you treat design or marketing, integrating it into your decision-making process. Ask yourself how your CMS, your hosting solution, or any given feature affects performance. It's essential to set performance goals and establish a performance budget that you must adhere to.
00:09:56.000 Establishing clear goals is paramount. You must identify existing features that are under-performing and consider optimizing these or even removing features that detract from achieving your performance budget. A rule of thumb is that 80% of the time users spend loading your application should be spent on the frontend, with only 20% on the backend. This ratio can guide you in setting initial performance goals for stress testing.
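One way to make a performance budget concrete is to split a total load-time target using the 80/20 rule of thumb and compare measurements against each share. This is a rough sketch under stated assumptions: the helper names `split_budget` and `over_budget` are hypothetical, the two-second total is borrowed from the user-expectation figure earlier in the talk, and the measured values are invented.

```python
def split_budget(total_ms, frontend_percent=80):
    """Split a total load-time budget into frontend and backend shares."""
    if not 0 <= frontend_percent <= 100:
        raise ValueError("frontend_percent must be between 0 and 100")
    frontend_ms = total_ms * frontend_percent / 100
    return {"frontend": frontend_ms, "backend": total_ms - frontend_ms}


def over_budget(measured_ms, budget_ms):
    """Return how far each measured area exceeds its budget, in milliseconds."""
    return {
        area: measured_ms[area] - limit
        for area, limit in budget_ms.items()
        if measured_ms.get(area, 0) > limit
    }


budget = split_budget(2000)  # two-second target, 80/20 split
measured = {"frontend": 1500, "backend": 900}  # hypothetical measurements
print(over_budget(measured, budget))
```

Areas that show up in the result are the candidates for optimization or removal; an empty result means the application is within budget.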
00:10:58.000 Another essential consideration is the mobile-first approach. However, I propose a 'performance-first' approach. It's crucial to account for mobile not just in terms of screen size but also regarding network speed, as this directly impacts performance. With the explosion of mobile usage, it's vital to understand how your application performs in these environments. There are projections that everyone online will have a smartphone within a couple of years. Look to how successful companies, like the BBC, focused their stress testing on achieving a 10-second load time even on the slowest connections.
00:12:39.000 When we began our stress testing, my goal was to reduce our load time from ten seconds to 500 milliseconds. This was an ambitious target, especially given our timeline, but I insisted on commitment to this goal. As you set out to stress test your application, first you must consider how you will approach it, which leads me to my second key point: there are different types of testing.
00:13:52.000 There are several types of performance tests, including stress tests and load tests. Which one you need depends on the unique requirements of your application. For our purposes, we decided on conducting stress tests, which help gauge how our application responds under extreme conditions, while load tests assess performance under peak load situations.
00:14:32.000 There are two primary methods to conduct these tests: measuring clients per second or clients per load. Clients per second measures how many clients can interact with your application in a second, while clients per load measures how many simultaneous requests your application can handle. We chose the clients per second approach because it fit our specific requirements, with a goal of handling 300 requests per minute, or five per second.
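The difference between the two methods can be sketched as two small load drivers: one starts requests on a fixed schedule regardless of how quickly responses come back, and the other keeps a fixed number of clients busy at once. This is an illustrative sketch, not how any particular tool implements it. All function names are hypothetical, and `fake_request` stands in for a real HTTP call so the example runs without a network.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def arrival_offsets(rate_per_second, duration_seconds):
    """Evenly spaced start times, in seconds from the start of the test."""
    total = round(rate_per_second * duration_seconds)
    return [i / rate_per_second for i in range(total)]


def run_rate_driven(send, rate_per_second, duration_seconds):
    """Start requests on a fixed schedule, whether or not earlier ones finished."""
    offsets = arrival_offsets(rate_per_second, duration_seconds)
    started = time.monotonic()
    with ThreadPoolExecutor(max_workers=max(1, len(offsets))) as pool:
        futures = []
        for offset in offsets:
            delay = started + offset - time.monotonic()
            if delay > 0:
                time.sleep(delay)
            futures.append(pool.submit(send))
        return [future.result() for future in futures]


def run_concurrency_driven(send, clients, requests_per_client):
    """Keep a fixed number of clients busy, each sending requests back to back."""
    def client():
        return [send() for _ in range(requests_per_client)]

    with ThreadPoolExecutor(max_workers=clients) as pool:
        futures = [pool.submit(client) for _ in range(clients)]
        return [result for future in futures for result in future.result()]


def fake_request():
    """Stand-in for a real HTTP call; returns True to signal success."""
    time.sleep(0.01)
    return True


# 300 requests per minute is 5 per second; run that rate for one second here.
print(len(run_rate_driven(fake_request, 300 / 60, 1)))
print(len(run_concurrency_driven(fake_request, clients=3, requests_per_client=4)))
```

In the rate-driven model a slow server does not slow the arrival of new requests, so work piles up the way it would during a real traffic spike. In the concurrency-driven model a slow server automatically throttles the test, because each client waits for its response before sending again.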
00:15:44.000 The third key point is which tools to use. There is no straightforward answer here, because the right tool depends on your specific needs. Several tools can help with stress testing: Apache JMeter is a Java-based tool, and Gatling is another powerful tool worth mentioning. We ultimately decided to use Loader.io, developed by SendGrid. It is easy to set up and provides excellent reporting features. They offer free plans sufficient to test most applications, allowing for up to 10,000 clients per test.
00:16:34.000 So far, we've discussed prioritizing performance and establishing goals, choosing the type of test, and selecting the right tools. Now, the fourth and most important point is about culture. It's crucial that stress testing and a focus on performance become part of your culture. Culture is something that can spread throughout a company; it's like DNA.
00:17:14.000 As developers, we have the power to introduce new practices within our organizations, just like we did with TDD or other methodologies. We need to integrate stress testing into our deployment processes, for example as a step in a tool such as Jenkins. This is what can create a robust culture focused on performance and stress management within your development team.
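A deployment pipeline can enforce this with a small gate that fails the build when stress test results miss the performance goal. The sketch below assumes the results are already available as a mean response time and an error rate; the function names, the 1% error limit, and the sample numbers are hypothetical, while the 500 ms budget is the goal mentioned earlier in the talk. A CI server such as Jenkins treats a non-zero exit code from a shell step as a failure.

```python
def performance_gate(mean_ms, error_rate, budget_ms, max_error_rate=0.01):
    """Return a list of failure messages; an empty list means the gate passes."""
    failures = []
    if mean_ms > budget_ms:
        failures.append(
            f"mean response time {mean_ms:.0f} ms exceeds budget of {budget_ms:.0f} ms"
        )
    if error_rate > max_error_rate:
        failures.append(
            f"error rate {error_rate:.1%} exceeds limit of {max_error_rate:.1%}"
        )
    return failures


def main(mean_ms, error_rate, budget_ms):
    """Print any failures and return a process exit code (0 passes, 1 fails)."""
    failures = performance_gate(mean_ms, error_rate, budget_ms)
    for message in failures:
        print(f"FAIL: {message}")
    return 1 if failures else 0


# Hypothetical numbers; a real job would read them from the stress test report.
exit_code = main(mean_ms=450, error_rate=0.0, budget_ms=500)
print("exit code:", exit_code)  # a CI step would pass this to sys.exit()
```

Running the gate on every deployment turns the performance goal into something the whole team sees break, in the same way a failing unit test does.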
00:18:10.000 But how did this approach work out during the World Cup? We had significant issues. While we had made impressive progress and reduced our response time to a mean of 627 milliseconds, we faced severe challenges the moment the World Cup commenced. Our application went down just as our users flocked to use it.
00:19:34.000 Despite the preparations, we were not swift enough to react to the influx. This serves as a cautionary tale: you must address performance quickly, before you are put in a position where you're scrambling to fix issues. Even with ample investment and the best technology, performance hurdles can hinder your growth. You need to start addressing these issues preemptively.
00:20:50.000 To quickly summarize my key messages: set performance goals and stick to them; determine which types of tests your application needs; choose appropriate tools without expecting clear-cut answers; and most importantly, embrace a culture of performance and stress testing from the outset, rather than waiting until the last minute.
00:22:35.000 I would like to thank all the amazing people and companies that helped me get here today. I am deeply involved with open-source projects and have been contributing to several, particularly Active Model Serializers and Rails API. We welcome anyone who wants to help us improve them. Also, I co-founded Gioco, a software as a service for gamification, which I hope will be a valuable resource for developers looking for engagement solutions.
00:24:32.000 If you're interested in stickers or learning more about our recent Heroku integration, let me know. I appreciate your time and attention today. Thank you!