Aloha RubyConf 2012
Ensuring High Performance for Your Ruby App

by Kowsik Guruswamy

In this presentation titled "Ensuring High Performance for Your Ruby App," Kowsik Guruswamy discusses the importance of application performance management (APM) in the Ruby development lifecycle. He emphasizes that performance issues can lead to significant frustrations for developers, highlighting the need for affordable and easy-to-use performance management tools that fit seamlessly into existing workflows.

Key Points Discussed:
- Introduction of Kowsik: Kowsik is the CTO of Blitz and has been a Rubyist since 2006. He shares his experience of transitioning from C++ to Ruby while working on performance-sensitive applications.
- Understanding Fuzzing: Fuzzing is explained as a method of testing applications by generating numerous test cases to identify vulnerabilities early in the development cycle. Kowsik mentions developing automated unit tests aimed at triggering exceptions and errors.
- Big O Notation: He introduces Big O notation as a key concept for understanding execution speed and memory consumption, noting its relevance in identifying performance issues related to data structures and algorithms.
- Rails Performance: Kowsik addresses common misconceptions about Rails performance, asserting that inefficient usage often leads to slowdowns rather than the framework itself being inherently slow.
- Performance Optimization Example: He shares an example where improper handling of Ruby's require functionality resulted in an O(n²) problem, illustrating the impact of dependency management on performance.
- Distributed Load Testing: Kowsik introduces Blitz as a tool for measuring application performance under load and discusses using Redis to enhance performance and reduce request times significantly.
- Garbage Collection: The impact of garbage collection on app performance is described, addressing how hiccups during collection can disrupt user experience.
- Concurrency vs. Hit Rate: He distinguishes between hit rate (requests handled per second) and concurrency (simultaneous users), explaining their implications on application performance.
- Final Thoughts: Kowsik concludes with advice on exploring and optimizing data handling processes and reiterates that poor performance might stem from the algorithms used rather than the framework itself.

Concluding Takeaways:
- Understanding performance in Ruby applications is a journey through data structures and algorithms.
- It is essential to optimize both the usage of frameworks like Rails and the underlying data management techniques to achieve the best performance results.

Overall, Kowsik's presentation provides valuable insights into the critical aspects of Ruby application performance management, offering practical advice for developers aiming to ensure their applications run efficiently.

00:00:15 Hey guys, this is probably one of the last presentations. You know, with jet lag hitting us, I'm bound to fall asleep, so just wake me up with questions. I love the questions, so we don't have to wait until the end. If you see me nodding off, just kind of give me a nudge and stand up to ask me questions so we can keep this going. Now, a really quick intro about myself.
00:00:50 I was the CTO of Mu Dynamics, a company that I started some years ago, and we got acquired in April. Now I am the CTO of Blitz, which carries on the work we began. I've been a Rubyist since around 2006. A lot of people get introduced to Ruby through Ruby on Rails, but for me, it was exactly the opposite.
00:01:08 We actually built a commercial, non-web application in Ruby that consists of about 250,000 lines of code. I will go through a bit about what we did and why performance was so important for us. Does anyone know what fuzzing is? When I started with Ruby, we were essentially building a commercial fuzzer on steroids. Before I explain fuzzing, I'll share that our first version was coded in C++, and it was hard to maintain.
00:01:38 So, we started moving some stuff over to Ruby, trying to put some Ruby 'lipstick' on the C++ pig—and we struggled with that. By the third version, we had almost completely switched everything to Ruby and had to do a lot of work with C extensions to optimize performance.
00:02:07 For those of you who aren't familiar with fuzzing, I'll give an example. There was a GitHub Rails vulnerability involving mass assignment, which could essentially compromise users' SSH keys. That's the kind of problem we focused on in our work; we engaged in something called hacker-driven development.
00:02:37 If you have a RESTful API, maybe you're using Rack or something else, we could methodically generate tons of automated unit tests designed to crash the Ruby app, generate as many exceptions as possible, and tamper with inputs. We created thousands of test cases to ensure vulnerabilities were identified during the development process rather than just relying on an annual audit.
00:03:00 As a result of generating these tens of thousands of test cases, these tests often ran for three to four hours due to their exhaustive nature. We loved Ruby and aimed to make everything run faster because we couldn't wait around for 36 hours for those tests to finish.
00:03:31 We broke everything we could, from printers to SCADA systems, regardless of their protocol, and we found problems in everything. The speed of execution was super important for us.
00:04:02 Anyone still think I'm yawning and slowing down? Cool. We love Ruby because, after moving from C and C++, it felt very clean and beautiful. Writing C inside Ruby felt similar to writing Ruby.
00:04:19 This is also important when it comes to C extensions. Ruby 1.9 introduced more complexity due to the virtual machine infrastructure, while Ruby 1.8 was almost pure.
00:04:49 Since this is a talk about performance—a huge subject itself—let's talk briefly about programming languages. People say Node.js is fast, Rails is slow, or Python is okay. It’s not helpful to compare languages this way.
00:05:19 Every interpreter has trade-offs, such as method caching or regex caching. You need to know a bit about optimization, especially since many great presentations today discussed performance, including Aaron's keynote on Rails 4 and threading and concurrency, which I'll touch upon later.
00:05:51 From my experience with Ruby, I think it fundamentally comes down to Big O notation. Does anyone here know what Big O is? Great! You've likely seen this in computer science classes.
00:06:10 Basically, Big O gives you an idea of the speed of execution or how much memory something will consume in an abstract way. When you face a performance problem, it's often a result of poor data structures or inappropriate algorithms.
00:06:35 It’s also affected by latency. If your data centers are in Germany and your customers are in the U.S., you can't push packets fast enough across the globe due to the speed of light.
00:07:01 Many people claim Rails is slow; however, a deeper investigation often reveals that it's really about how you're utilizing the framework.
00:07:30 In a recent talk, Ben discussed the difference between Range#include? and Range#cover? on a date range. With include?, Ruby walks through every single element and compares each one, which is O(n). This means that for a range spanning 300 years, it will check every second of every day.
00:08:00 In contrast, cover? checks only the outer bounds, which is O(1) no matter the size of the date range. This highlights how complexity can vary significantly based on the method used.
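A minimal Ruby sketch of the difference between include? and cover? described above. The Tick class is hypothetical, invented here only to count how often the range steps forward with succ; it stands in for a date-like value and is not from the talk.

```ruby
# Hypothetical value type that counts how many times a range steps through it.
class Tick
  include Comparable
  attr_reader :n

  @succ_calls = 0
  class << self
    attr_accessor :succ_calls
  end

  def initialize(n)
    @n = n
  end

  # Range iteration calls succ to move from one element to the next.
  def succ
    Tick.succ_calls += 1
    Tick.new(n + 1)
  end

  def <=>(other)
    other.is_a?(Tick) ? n <=> other.n : nil
  end
end

range  = Tick.new(0)..Tick.new(10_000)
target = Tick.new(9_999)

# include? falls back to iterating the range for non-numeric elements.
Tick.succ_calls = 0
found_by_include = range.include?(target)
include_steps = Tick.succ_calls

# cover? only compares the target against the two endpoints.
Tick.succ_calls = 0
found_by_cover = range.cover?(target)
cover_steps = Tick.succ_calls

puts "include? found=#{found_by_include} steps=#{include_steps}"
puts "cover?   found=#{found_by_cover} steps=#{cover_steps}"
```

Both calls report that the target is in the range, but only include? has to step through the elements to get there. Numeric ranges are a special case: Ruby answers include? on them by comparing endpoints, so the gap shows up with other element types.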
00:08:25 To visualize complexity, the yellow line represents O(1), meaning the access time is consistent regardless of dataset size. In contrast, logarithmic growth is shown as a gradual incline, while linear growth and quadratic growth demonstrate a sharper degradation in performance as dataset size expands.
00:08:54 To illustrate this, I'll use some examples from a product I'm working on that deal with concurrency, threading, and hit rate. Let's start with a simple example.
00:09:22 Some time ago, I installed a gem from Twitter to test its performance. I was surprised to find that after entering the command, it took 1.34 seconds to display the usage of the program, which seemed wrong.
00:09:52 After troubleshooting with Ruby's bt, I discovered that the program was requiring 433 other files—still not a massive number. This leads us to a common issue with Ruby's require functionality.
00:10:13 When you perform a require in Ruby, there exists a loop that checks against all previously loaded files. This creates an O(n²) problem: each new require checks against every previously required file.
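The quadratic growth can be sketched with a toy model. This is an assumption-laden simulation, not Ruby's actual require implementation: simulate_requires is a hypothetical helper that scans a plain array of already-loaded names on every call and counts the comparisons.

```ruby
# Toy model: every "require" scans the whole list of loaded features.
# Returns the total number of string comparisons performed.
def simulate_requires(file_count)
  loaded = []
  comparisons = 0

  file_count.times do |i|
    feature = "lib/file_#{i}.rb"
    already_loaded = loaded.any? do |existing|
      comparisons += 1
      existing == feature
    end
    loaded << feature unless already_loaded
  end

  comparisons
end

files = 433                                # the file count mentioned in the talk
total = simulate_requires(files)
closed_form = files * (files - 1) / 2      # 0 + 1 + ... + (n - 1)

puts "#{files} requires -> #{total} comparisons (n(n-1)/2 = #{closed_form})"
```

Doubling the number of files roughly quadruples the comparisons in this model, which is the O(n²) shape the talk describes.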
00:10:49 To illustrate this in practice, when creating a new Rails app and checking the loaded feature size, I found it often had over 700 dependencies. Performance is critical, particularly regarding the startup time of Rails applications.
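A quick way to look at the loaded-feature count yourself is the global $LOADED_FEATURES array. The sketch below uses two standard-library requires purely as an example; in a Rails app you would check the size from a console after boot.

```ruby
# $LOADED_FEATURES lists every file that require has loaded so far.
before = $LOADED_FEATURES.size

require 'json'
require 'set'

after = $LOADED_FEATURES.size

puts "loaded features before: #{before}, after: #{after}"
```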
00:11:24 There are efficiency concerns with JRuby during cold startups as it loads and compiles many files, greatly impacting performance. This issue underscores the importance of reducing unnecessary dependencies.
00:11:49 I'm a huge fan of Blitz, which is a distributed load testing application. For context, we provide services that help measure how well your app performs under load.
00:12:10 You can set up a Ruby app on Heroku or EC2 and send a specific number of virtual users from different locations, allowing you to evaluate response times and hit rates. We overlay New Relic metrics to show performance under heavy load.
00:12:41 A simple graph we generated demonstrates the relationship between the number of concurrent users and response time. This approach is powerful for continually measuring performance.
00:13:06 Even though Blitz is a performance testing product, we learned valuable lessons about performance while developing it. Initially, we used CouchDB as part of our stack, but we quickly hit a major bottleneck: each request to the front end, along with all of its static assets, went back to CouchDB.
00:13:47 This led to CouchDB receiving thousands of requests, which we identified as an O(n) problem where each page averaged 'n' static assets, sending 'n' total requests back to CouchDB.
00:14:19 We later revamped our architecture to introduce Redis for queuing, achieving significant performance improvements. Initially, we mixed CouchDB with various traffic generation engines across AWS regions.
00:14:55 However, we faced another challenge termed a self-distributed denial-of-service. With many workers, each front-end web request would be translated into 20 or more HTTP requests back to CouchDB, revealing a major inefficiency.
00:15:42 This led us to switch to Redis for all queuing and subsequently improved our UI response time from around 750 milliseconds to less than 20 milliseconds, which was a significant gain.
00:16:18 Moving to Redis allowed for much greater performance since it works in memory and greatly reduced overhead—this resulted in tremendous enhancements in our application response times.
00:16:45 As a further exploration into performance, let's talk about garbage collection (GC) hiccups. For instance, using a sample app, I disabled GC and manually initiated it every so often to measure performance impacts.
00:17:10 I observed how the system handled increasing concurrent users and began tracking object counts as they increased. Eventually, the system experienced degraded performance as the garbage collection process consumed resources.
00:17:50 GC hiccups are real. Because the collector runs on the same thread as the main application, each collection interrupts request handling, and those interruptions massively affect user experience during peak loads.
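A rough sketch of that kind of experiment, assuming a simple allocation loop in place of the sample app: automatic collection is switched off while garbage piles up, then a collection is triggered by hand and timed. allocate_garbage is a hypothetical helper, and the measured pause will vary by machine and Ruby version.

```ruby
require 'benchmark'

# Creates short-lived strings that become garbage as soon as the method returns.
def allocate_garbage(count)
  count.times.map { |i| "object #{i}" }
  nil
end

GC.disable                                  # no automatic collections from here on
collections_before = GC.count

allocate_garbage(200_000)

GC.enable
pause_seconds = Benchmark.realtime { GC.start }   # collect now and time the pause
collections_after = GC.count

puts "collections: #{collections_before} -> #{collections_after}"
puts format("manual GC pause: %.2f ms", pause_seconds * 1000)
```

While GC.start runs, nothing else on that thread makes progress, which is the hiccup a user would see as a slow response.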
00:18:21 Next, I'll clarify hit rates in comparison to concurrency. Hit rate refers to the number of requests your site can handle per second, while concurrency is about simultaneous users accessing your application.
00:18:57 Consider a simple Sinatra application that handles one request at a time. Here, concurrency is limited to one, regardless of how many people are trying to access the application simultaneously.
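One way to relate the two numbers is Little's Law, which is brought in here as an aid and is not named in the talk: average concurrency equals hit rate multiplied by average response time. The helper names and the 250 ms figure below are illustrative assumptions.

```ruby
# Highest hit rate (requests/second) a server can sustain when it can work on
# `concurrency` requests at once and each takes `response_time_seconds`.
def max_hit_rate(concurrency, response_time_seconds)
  concurrency / response_time_seconds.to_f
end

# Concurrency needed to sustain `hit_rate` requests/second at that response time.
def required_concurrency(hit_rate, response_time_seconds)
  hit_rate * response_time_seconds
end

single_worker_rate = max_hit_rate(1, 0.25)          # one request at a time
needed_for_100_rps = required_concurrency(100, 0.25)

puts "1 worker at 250 ms    -> #{single_worker_rate} requests/second"
puts "100 req/s at 250 ms   -> #{needed_for_100_rps} requests in flight"
```

With a single worker and a 250 ms response time, the app tops out at four requests per second no matter how many users arrive; everyone beyond that waits in line or times out.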
00:19:35 If you monitor your app’s performance during high concurrency, you will likely find timeouts and errors as it attempts to accommodate more users than it has the capability to handle.
00:20:03 We can further examine the patterns of synchronous requests and see how they behave when creating load, demonstrating the need for efficient handling methods based on concurrency.
00:20:30 Let’s test an asynchronous route built on EventMachine. This version lets a request return without blocking the connection while it waits, which improves how the app handles concurrency.
00:21:10 Using a simulated route with asynchronous handling, we can see that even though the average response time remains the same, the concurrent handling capacity greatly increases.
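The effect can be approximated with the standard library alone. The sketch below swaps in plain Ruby threads for EventMachine, so it is not the setup from the talk; fake_io_request is a hypothetical stand-in for a route that spends its time waiting on I/O.

```ruby
require 'benchmark'

# Stand-in for a request that mostly waits on I/O.
def fake_io_request(wait_seconds)
  sleep(wait_seconds)
  :ok
end

wait_seconds  = 0.05
request_count = 5

# One at a time: each request waits for the previous one to finish.
serial_results = nil
serial_elapsed = Benchmark.realtime do
  serial_results = Array.new(request_count) { fake_io_request(wait_seconds) }
end

# Overlapped: the waits happen at the same time.
overlapped_results = nil
overlapped_elapsed = Benchmark.realtime do
  threads = Array.new(request_count) { Thread.new { fake_io_request(wait_seconds) } }
  overlapped_results = threads.map(&:value)
end

puts format("serial:     %.3f s", serial_elapsed)
puts format("overlapped: %.3f s", overlapped_elapsed)
```

Each individual request still takes about the same time in both runs; what changes is how many can be in flight at once, which matches the observation that average response time stays put while capacity rises.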
00:22:02 A connection pool is a critical consideration; if you have a fixed number of connections to your database, any increase in demand that surpasses that count will quickly result in blocked requests and timeouts.
00:22:36 When there’s a surge of concurrent users and a limited pool, requests will start to pile up as they cannot be served, leading to performance degradation.
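A toy model of a fixed-size pool makes the pile-up concrete. TinyPool is a hypothetical class written for this illustration and is not any real library's API; a real pool would make callers wait with a timeout instead of returning nil straight away.

```ruby
require 'thread'

# Hypothetical fixed-size connection pool backed by a thread-safe queue.
class TinyPool
  def initialize(size)
    @available = Queue.new
    size.times { |i| @available << "conn-#{i}" }
  end

  # Returns a connection, or nil when every connection is already in use.
  def checkout
    @available.pop(true)   # non-blocking pop raises ThreadError when empty
  rescue ThreadError
    nil
  end

  def checkin(connection)
    @available << connection
  end
end

pool = TinyPool.new(2)

# Five requests arrive before any of them hands its connection back.
attempts = Array.new(5) { pool.checkout }
served   = attempts.compact
starved  = attempts.count(&:nil?)

puts "served: #{served.size}, left waiting: #{starved}"

# Once a connection is returned, the next request can proceed.
pool.checkin(served.first)
late_request = pool.checkout
```

With two connections and five simultaneous requests, three requests have nothing to run on until someone checks a connection back in.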
00:23:06 You may see good average response times during low traffic, but as you scale, your app's true limitations regarding concurrency and connection pooling will become apparent.
00:23:32 In Blitz, we often leverage Redis for registration, serving as a write-through cache, and for job scheduling to ensure the smooth operation of our workload.
00:24:00 Redis is more than just an in-memory store; it's a well-architected distributed dictionary providing efficient data operations. However, you need to manage it wisely to avoid performance pitfalls.
00:24:25 When tracking the complexities of operations performed on data structures in Redis, consider the implications of poor algorithms or operations like set differences, which can lead to O(n) or worse performance.
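As a point of reference, the Redis command documentation lists SISMEMBER (membership test) as O(1) and SDIFF (set difference) as O(N) in the total number of elements. The sketch below uses Ruby's own Set as a stand-in to show the two kinds of operation side by side; it does not talk to Redis, and the data is made up.

```ruby
require 'set'

online_users = Set.new(1..100_000)
banned_users = Set.new([5, 42, 99_999])

# Membership test: a single hash lookup, independent of set size.
is_online = online_users.include?(42)

# Set difference: has to visit the elements, so cost grows with set size.
allowed_users = online_users - banned_users

puts "user 42 online? #{is_online}"
puts "allowed users: #{allowed_users.size}"
```

If a difference like this runs on every request, its cost scales with the data; a per-request membership check does not.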
00:25:00 Performance issues may not always fall on Rails itself; they could stem from inefficiencies in the algorithms used in your application or the way data is arranged in your database.
00:25:30 In one instance, a customer was running geospatial queries without the correct indexing on their database; this resulted in significant performance issues, highlighting the importance of efficient indexing.
00:26:00 Thus, understanding performance is essential—it can be an engaging journey full of learning based purely on data structures and algorithms, no matter the environment you're working in.
00:26:36 Remember, it’s not always the framework that limits performance. Instead, it's vital to explore and optimize your data handling processes to achieve the best results.
00:27:10 That concludes my talk! I'm happy to answer any questions you have, but it's also time for all of us to enjoy the beach. Thank you!