00:00:11.719
Hello everybody, I'm Paola Moretto, co-founder of a company called Nouvola. You can find me on Twitter at polymerase. A little bit about me: I'm a developer turned entrepreneur and have been in the high-tech industry for a long time. I love solving hard technical problems. I originally come from Italy, but I've been in the US for 20 years. If you don't find me writing code, I'm usually outdoors hiking.
00:00:46.079
This presentation is about performance. We heard it loud and clear here at RailsConf: faster is better. We all know what performance is, but it's important to understand the real impact of low performance. When I talk about performance, I mean speed and responsiveness—the speed and response times that your application delivers to users.
00:01:03.780
There's a famous quote from Larry Page that states, 'Speed is product feature number one.' Therefore, you need to focus not only on your functional requirements but also on the non-functional ones. Speed is paramount for any web application today. There's plenty of research and data that backs this up, showing the impact of low performance. It affects visibility, SEO ranking, conversion rates, brand perception, brand loyalty, brand advocacy, and can drive up costs and resource usage. Low performance typically leads to over-provisioning, which is not usually the right answer.
00:01:43.800
In today's web application environment, speed is critical. If you have a DevOps model, where development, QA, and ops are integrated, the need for speed becomes even more crucial. In a cloud environment with programmable and elastic infrastructure, where you're adopting continuous delivery and agile methodologies, it is vital to ensure that every build is not only functional but also performs at the right speed.
00:03:04.900
So, how do we tackle the issue of performance? The first thing you need is data. This is a quote I borrowed with pride from a talk yesterday: 'In God We Trust, all others bring data.' Without data, you deploy and hope for the best, relying on users to act as your QA department. I've heard of companies, like certain e-commerce applications, saying that they know when they are experiencing a slowdown because users complain on Facebook. Normally, that's not the best approach. You need substantial amounts of data.
00:03:30.370
There are two types of data to consider. On the right-hand side, you have your live traffic data when you deploy your applications. This usually falls under the umbrella of monitoring, which covers many kinds of data and techniques. On the left-hand side, you have your testing environment, which typically includes pre-production or staging environments. Here, you create synthetic traffic by simulating user activity, and that is how you conduct performance testing.
00:04:13.510
Let's start with monitoring. You have several types of monitoring: stack monitoring, infrastructure monitoring, and user behavior monitoring, which tells you what the most common user scenarios are. You may also encounter streaming analytics or high-frequency metrics, where solutions extract data from the platform at high frequency. There are many existing solutions, and while we are not affiliated with any of them, we want to convey the spectrum of monitoring and data instrumentation solutions available. They complement each other: there's no one-size-fits-all, and the right mix depends on your application's needs.
00:05:12.750
The primary issue you face today is that despite having dashboards filled with data, correlating it to understand what's truly happening can be challenging. Even so, monitoring is your first line of defense; instrument your system before asking questions. However, monitoring alone isn't enough. Live monitoring can be noisy, and troubleshooting a specific scenario is hard when other users are performing different actions and the system might behave unexpectedly.
00:06:18.919
Another problem with monitoring is that it happens after the fact; it doesn’t help you predict or prevent potential issues that your application may encounter. An analogy would be that monitoring is like calling Triple-A after an accident—it’s beneficial, but you’d prefer to avoid the accident in the first place. Therefore, performance testing complements monitoring very well.
00:07:27.150
With performance testing, you can create synthetic traffic to simulate user scenarios in a controlled environment, typically in pre-production. When troubleshooting, you can reproduce specific scenarios on demand. Because you control the variables, namely user traffic and user behavior, you can peel back the layers of an issue, which makes troubleshooting more straightforward.
00:08:57.920
Performance testing provides end-to-end user metrics, measuring the actual experience of your users. It’s crucial to focus not solely on server metrics or application and database metrics, but to understand the true end-to-end performance. Companies have often seen a significant disparity between user-facing metrics and server metrics, highlighting the need to measure the end-user experience.
00:10:24.990
Another important aspect involves measuring and optimizing before issues arise. Before deploying, you should have tested realistic scenarios and collected metrics that measure the KPIs for end-user experience. Different types of metrics, such as response time, transaction completion time, throughput, and error rates, come into play here. This ensures that you can identify and resolve any issues before they affect the user.
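As a rough illustration of the metrics just listed, here is a minimal Python sketch that reduces raw load-test samples to response-time percentiles, throughput, and error rate. The `Sample` record and the `summarize` helper are hypothetical names made up for this example; they do not come from the talk or from any particular tool.

```python
from dataclasses import dataclass
from statistics import quantiles


@dataclass
class Sample:
    """One simulated request (hypothetical record shape for this sketch)."""
    latency_ms: float  # end-to-end response time seen by the simulated user
    ok: bool           # False if the request failed or returned an error


def summarize(samples, duration_s):
    """Reduce raw samples to percentiles, throughput, and error rate."""
    latencies = sorted(s.latency_ms for s in samples)
    # 99 cut points; index 49 is the median, index 94 the 95th percentile.
    cuts = quantiles(latencies, n=100, method="inclusive")
    errors = sum(1 for s in samples if not s.ok)
    return {
        "p50_ms": cuts[49],
        "p95_ms": cuts[94],
        "throughput_rps": len(samples) / duration_s,
        "error_rate": errors / len(samples),
    }
```

Percentiles are used rather than an average because a small share of very slow requests can hide behind a healthy-looking mean.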
00:11:00.600
Software is continually changing, and determining whether specific changes affect user interactions is critical. The performance of an application can degrade not only because of your own code changes but also due to external factors: the vast cloud infrastructure and ongoing modifications from external parties can introduce performance bottlenecks. Therefore, continuous testing is necessary to ensure everything is running optimally.
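One way to picture continuous testing is a check that runs on every build and compares its metrics with a stored baseline. The sketch below is an assumption about how such a check could look, not a description of any specific product; `find_regressions` and the 10% tolerance are invented for illustration, and it assumes every metric is one where higher is worse (latency, error rate).

```python
def find_regressions(baseline, current, tolerance=0.10):
    """Return metrics that grew by more than `tolerance` relative to baseline.

    Assumes higher values are worse for every metric passed in.
    """
    regressions = {}
    for name, base in baseline.items():
        now = current.get(name)
        if now is None or base <= 0:
            continue  # nothing to compare against
        change = (now - base) / base
        if change > tolerance:
            regressions[name] = round(change, 3)
    return regressions
```

A build could be flagged whenever the returned dictionary is non-empty, whether the slowdown came from your own code or from something outside it.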
00:12:54.520
For example, a cloud provider changed its routing system without publicizing it. The change had a significant impact on applications but went unnoticed until users started reporting issues. Catching issues like this relies heavily on continuously measuring application performance against ever-changing external factors.
00:14:06.970
A huge challenge in performance troubleshooting is that engineers often spend a lot of time reproducing the initial problem and isolating the issue. This process can be incredibly time-consuming and labor-intensive, but it is vital to identify what is affecting performance. It turns out that with the right data and testing setups, identifying the source of the problem can be much easier.
00:15:17.370
At this stage, you want to minimize uncertainty and provide actionable data. You want predictive analytics capabilities integrated with monitoring and performance testing, enriched with data instrumentation to help localize where the performance issues lie. This predictive analysis accelerates the troubleshooting process, making it easier to resolve issues efficiently.
00:16:22.120
Our goal is to have clear insights into performance issues before they impact actual users, leveraging metrics collected during performance testing to anticipate problems even before they happen. By analyzing data, you can uncover where the bottlenecks reside, allowing developers to act swiftly and effectively.
00:17:47.760
We tap into various strategies, including data mining and machine learning techniques, to help localize performance issues by recognizing patterns within application metrics. The aim is to build a systematic approach to understanding which elements of an application contribute most significantly to performance problems.
00:19:06.690
For instance, when performance tests show significant delays when applying a linear ramp of traffic on a live application, the next step is data analysis via instrumentation in real time. Here, the focus shifts to identifying which aspects of the application are behaving irregularly under specific conditions or user loads. By analyzing historical data alongside live testing results, we can highlight which parts of the application see a divergence, indicating performance bottlenecks.
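To make the ramp-and-compare idea concrete, here is a small Python sketch under stated assumptions. It builds a linear ramp of user counts and then reports, for each component, the first load level at which observed latency exceeds a multiple of its historical baseline. The talk mentions data mining and machine learning for this step; the fixed-ratio threshold below is a deliberately simpler stand-in, and the function names, component names, and the 1.5 ratio are all hypothetical.

```python
def linear_ramp(start_users, end_users, steps):
    """Evenly spaced user counts from start_users to end_users, inclusive."""
    if steps < 2:
        return [end_users]
    step = (end_users - start_users) / (steps - 1)
    return [round(start_users + i * step) for i in range(steps)]


def divergence_points(baseline, observed, ratio=1.5):
    """Map each component to the first load level where it diverges.

    `baseline` and `observed` map component -> {user_count: latency_ms}.
    A component diverges when observed latency exceeds ratio * baseline.
    """
    first = {}
    for component, by_load in observed.items():
        for users in sorted(by_load):
            base = baseline.get(component, {}).get(users)
            if base and by_load[users] > ratio * base:
                first[component] = users
                break
    return first
```

For example, if a hypothetical `db` component tracks its baseline at 10 users but runs well above it at 30, `divergence_points` returns `{"db": 30}`, which points the investigation at that component and that load level.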
00:20:52.240
This approach is both systematic and integrative. We demonstrate how applying these techniques can give immediate visibility into application performance. Understanding these points allows for proactive corrections and remediation to improve overall performance health, resulting in a better user experience.
00:22:50.569
In summary, speed is product feature number one. Ensuring that your application performs well is crucial to a successful deployment. Monitoring serves as the first line of defense, but when it is coupled with performance testing, we create a more robust system capable of predicting and correcting potential issues before they significantly affect users.
00:28:02.320
Thank you for your time today. If you have any questions or feedback, feel free to reach out to me on Twitter at polymer 803.