Modeling Concurrency in Ruby and Beyond

by Ilya Grigorik

In 'Modeling Concurrency in Ruby and Beyond', Ilya Grigorik presents an in-depth exploration of concurrency, emphasizing its complexities and the various models available for programming in Ruby and other languages. The session begins with Grigorik's personal journey and background in concurrency, highlighting the difference between academic knowledge and industry practices. He elaborates on the significance of understanding both threads and events in designing concurrency models, taking into account the hardware architecture and its impact on performance.

Key points discussed include:

- Understanding Hardware Architecture: The need to consider aspects like CPU caches and RAM access times is crucial for optimizing concurrency. Grigorik explains how innovations in hardware parallelism affect software designs.

- The Role of Measurement: Statistically analyzing performance across various hardware and software implementations is essential to obtain accurate metrics for concurrent applications.

- Alternative Concurrency Models: Grigorik challenges the reliance on threads as the primary concurrency model, introducing alternatives like the Actor model and Communicating Sequential Processes (CSP). He emphasizes that these models can simplify debugging and enhance performance by avoiding shared state.

- Implementing Concurrency in Ruby: The speaker discusses practical implementations in Ruby, proposing improved concurrency via event-driven architectures as seen in languages like Go. He shares experiences with developing libraries that streamline concurrent programming in Ruby.

- Cross-Language Learning: Grigorik encourages the study of concurrency models across different programming languages to inspire innovation and apply new techniques in Ruby.

In conclusion, the speaker calls for a broader perspective on concurrency that encompasses a variety of models beyond traditional threads. The session underlines the importance of experimenting with advanced concurrency features to enable cleaner, more maintainable code in Ruby. He invites the audience to actively engage with the growing body of knowledge on concurrency and to explore the collaborative potential within the developer community. Overall, this presentation serves as a valuable resource for software developers aiming to deepen their understanding of concurrency in Ruby and other languages.

00:00:12.719 All right, good morning everyone.

00:00:19.439 So, in the next 30 minutes, we will explore modeling concurrency. We've just spent half an hour on one specific aspect of concurrency, so there's no way I can cover the entire topic in this session.

00:00:39.440 Instead, I want to share a story of my own exploration of concurrency. This is a field I've been interested in for a long time, tracing back to my university days. Some of the courses I enjoyed the most were focused on concurrency. I learned about various concepts such as Hoare locks, deadlock detection algorithms, and other related topics. However, when I entered the industry, I quickly realized that the practices were very different from what I had learned. I picked up several books, and a quick search on Amazon reveals that there are likely millions of pages written about concurrency. Clearly it's a topic that many people care about, yet most of those sources cover much the same ground.

00:01:21.600 When people think of concurrency, the first thing that often comes to mind is threads. However, it's not only about threads; it's also about events. I would argue that while threads are one possibility, we actually need to consider both threads and events. To understand the necessity for both, we need to step back and examine the underlying hardware.

00:01:48.159 Let’s start by looking at a simple architecture of a CPU. Within a core, we have various caches and input/output devices attached. On most architectures, fetching an instruction from RAM takes roughly 100 nanoseconds. That sounds quick, but relative to the speed of the CPU it is slow.

00:02:06.880 As we delve deeper, we find the significance of caches. Accessing data from the Level 2 (L2) cache takes about 7 nanoseconds, which is an order of magnitude better than going to main RAM. The L1 cache is even faster, around half a nanosecond. If we look at a typical CPU today, a single clock cycle is about half a nanosecond, enabling us to fetch subsequent instructions swiftly from L1 cache. However, when we need to access RAM, at roughly 100 nanoseconds per access we typically waste around 200 CPU cycles.
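The cost of a memory access in clock cycles follows directly from the latencies quoted here. The sketch below is only an illustration of that arithmetic: the method name `stall_cycles` and the table layout are assumptions made for this example, and the numbers are the rough figures from the talk, not measurements.

```ruby
# Number of clock cycles that elapse while one memory access completes.
# The default cycle time of 0.5 ns corresponds to a roughly 2 GHz core.
def stall_cycles(latency_ns, cycle_ns: 0.5)
  (latency_ns / cycle_ns).round
end

# Approximate latencies quoted in the talk, in nanoseconds.
latencies_ns = {
  "L1 cache" => 0.5,
  "L2 cache" => 7.0,
  "RAM"      => 100.0
}

latencies_ns.each do |level, ns|
  puts format("%-8s %6.1f ns  ~%d cycles", level, ns, stall_cycles(ns))
end
```

With these inputs an L1 hit costs about one cycle, an L2 hit about 14, and a trip to RAM about 200, which is why the hardware works so hard to keep data in cache.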

00:02:35.840 To mitigate this access time issue, the hardware industry has developed numerous techniques over the last several decades, including pre-fetching, branch prediction, pipelining, and speculative execution. These innovations are often abstracted away from software developers.

00:03:05.480 An interesting example provided by Joshua Bloch, a prominent developer, highlights the unpredictability of performance in concurrent programming. For instance, when comparing two different code paths, one might expect to determine which executes faster. However, the actual execution time can vary due to the internal optimizations of the JVM and underlying hardware. In some cases, both paths may even execute simultaneously without significant performance gains.

00:03:39.600 The take-home message here is the importance of measurement. To get realistic performance metrics today, one must conduct statistical tests, running code multiple times to gather data and analyze it properly. This analysis is essential because performance can depend on various factors, including hardware specifications and JVM versions.
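As a rough illustration of measuring instead of guessing, the sketch below times a block many times and reports the mean and sample standard deviation. It uses Ruby's standard `Benchmark.realtime`; the helper names `sample_timings` and `summarize`, the run count of 30, and the workload are assumptions chosen for this example.

```ruby
require "benchmark"

# Run the given block `runs` times and return each wall-clock duration in seconds.
def sample_timings(runs)
  Array.new(runs) { Benchmark.realtime { yield } }
end

# Summarize samples with the mean and the sample standard deviation.
# Needs at least two samples.
def summarize(samples)
  mean = samples.sum / samples.size
  variance = samples.sum { |s| (s - mean)**2 } / (samples.size - 1)
  { mean: mean, stddev: Math.sqrt(variance), min: samples.min, max: samples.max }
end

timings = sample_timings(30) { (1..50_000).reduce(:+) }
stats = summarize(timings)
puts format("mean %.6fs  stddev %.6fs  min %.6fs  max %.6fs",
            stats[:mean], stats[:stddev], stats[:min], stats[:max])
```

A large standard deviation relative to the mean is a hint that a single run would have been misleading, and that the comparison should be repeated on the hardware and runtime version that actually matter.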

00:04:06.319 Returning to our main topic, we should emphasize that hardware parallelism is inherently tied to our software designs; we rely on various parallel processes, threads, and events. Understanding how these elements interconnect can help optimize our concurrency models. In essence, while processes, threads, and events may seem different, they are actually variations on a theme, and choosing how to utilize them effectively can greatly influence performance.

00:04:40.320 Recently, I delved into a book by Bruce Tate titled 'Seven Languages in Seven Weeks.' I highly recommend it if you haven't read it. One chapter focuses on Ruby, and in it the author asks Matz, Ruby's creator, an interesting question: if he could change anything in the language's design, what would it be? His surprising answer was to remove threads in favor of introducing actors or other more advanced concurrency features.

00:05:17.920 This raises a crucial question: what exactly constitutes an advanced concurrency feature? In many discussions, we primarily focus on threads, but often overlook alternatives like the actor model that provide different advantages. This oversight means that, while we have effective libraries and tools at our disposal, some key elements of advanced concurrency models are still missing in mainstream languages like Ruby. This indicates a gap we could explore further.

00:06:02.080 In summary, both the actor model and Communicating Sequential Processes (CSP) offer compelling alternatives worth exploring. There is a wide range of concurrency models available outside traditional threads, and each has unique advantages and constraints. These continue to evolve, and understanding these concepts can enable us to leverage their capabilities effectively.

00:06:52.560 In developing effective tools or concurrency models, it's essential to recognize what new capabilities they bring and what limitations they impose. A good concurrency model allows for new ways to express our goals and requires us to think differently about our designs. Additionally, these models can prevent certain errors by embedding constraints directly into the programming language, guiding us towards making the right choices and eliminating entire classes of bugs.

00:07:47.040 The actor model, for example, dates back to the early '70s but has only recently made its way into mainstream programming conversations. Therefore, there is a significant benefit to understanding the historical context and theoretical frameworks that underpin these models. The actor model enables distinct processes to communicate through messaging, creating a message-centric system.

00:08:32.640 This approach clearly separates processes while allowing message exchange without shared state, eliminating races, locks, or shared variable concerns altogether. Similarly, in CSP, developed in the late '70s, processes run concurrently and coordinate by passing messages over channels rather than by guarding shared resources, which simplifies debugging and reduces complexity.
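The actor idea described above can be sketched with nothing more than a thread and a queue. This is a minimal illustration, not the speaker's implementation: the `CounterActor` class and its `tell`, `ask`, and `stop` methods are hypothetical names invented for this example. The actor's state is touched only by its own thread, so callers never take a lock.

```ruby
# A minimal actor: private state, a mailbox, and one thread that
# processes messages strictly one at a time.
class CounterActor
  def initialize
    @mailbox = Queue.new
    @count = 0 # read and written only by the actor's own thread
    @thread = Thread.new { run }
  end

  # Asynchronous send: enqueue a message and return immediately.
  def tell(message)
    @mailbox << message
    self
  end

  # Request/reply: the answer travels back over a private reply queue.
  def ask(message)
    reply = Queue.new
    @mailbox << [message, reply]
    reply.pop
  end

  # Ask the actor to finish its queued work and wait for it to exit.
  def stop
    @mailbox << :stop
    @thread.join
  end

  private

  def run
    loop do
      # Plain symbols leave `reply` nil; [message, reply] pairs are unpacked.
      message, reply = @mailbox.pop
      case message
      when :increment then @count += 1
      when :value     then reply << @count
      when :stop      then break
      end
    end
  end
end

counter = CounterActor.new
senders = Array.new(4) do
  Thread.new { 250.times { counter.tell(:increment) } }
end
senders.each(&:join)
puts counter.ask(:value) # prints 1000
counter.stop
```

Four threads send 250 increments each, and because every update is applied by the actor's single thread in mailbox order, the final count is always 1000 without any explicit synchronization in the calling code.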

00:09:55.680 To illustrate these concepts further, I began implementing these principles in Ruby, working towards a concurrency model reminiscent of Go. Until now, practical concurrency in Ruby has often required a lot of boilerplate code. However, I believe we can adopt principles from Go to make concurrent code simpler and more intuitive.

00:11:52.560 I've created sample libraries that initiate minimalistic event-driven architectures, showcasing how concurrency could be cleaner and more effective. The following project demonstrates producer-consumer patterns within Ruby, allowing clear communication between different parts of the application without needing traditional locks.
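A producer-consumer exchange of this kind can be approximated with Ruby's built-in `SizedQueue`, which behaves like a bounded channel. The sketch below is an assumption-laden stand-in, not the library mentioned in the talk: the `done` sentinel, the capacity of 2, and the squaring workload are all invented for illustration.

```ruby
# A bounded channel: SizedQueue blocks the producer when it is full and
# the consumer when it is empty, so neither side needs an explicit lock.
channel = SizedQueue.new(2)
done = :done # hypothetical sentinel marking the end of the stream

producer = Thread.new do
  1.upto(5) { |n| channel << n }
  channel << done
end

consumer = Thread.new do
  received = []
  while (item = channel.pop) != done
    received << item * item
  end
  received
end

producer.join
squares = consumer.value # Thread#value joins and returns the block's result
puts squares.inspect # prints [1, 4, 9, 16, 25]
```

The two threads share only the channel. The small capacity also gives back-pressure for free: a fast producer is paused until the consumer catches up.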

00:12:51.840 Perhaps the most relevant comparison exists when building a multi-threaded web server in Ruby. Here, we define request types for handling incoming tasks, which can be queued for processing in parallel threads or workers. Moreover, using a channel, multiple workers can process multiple requests efficiently while ensuring thread safety.
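The worker arrangement described here might look roughly like the following. It is a sketch under stated assumptions: the `Request` struct, the `start_workers` helper, the canned response string, and the use of `nil` as a stop signal are all hypothetical, and no real sockets are involved.

```ruby
# Hypothetical request type; a real server would carry a socket or env hash.
Request = Struct.new(:id, :path)

# Start `size` workers that take requests from `requests` and push
# [id, body] pairs onto `responses`. A nil request tells a worker to exit.
def start_workers(size, requests, responses)
  Array.new(size) do
    Thread.new do
      while (request = requests.pop)
        responses << [request.id, "200 OK #{request.path}"]
      end
    end
  end
end

requests  = Queue.new
responses = Queue.new
workers   = start_workers(3, requests, responses)

6.times { |i| requests << Request.new(i, "/items/#{i}") }
workers.size.times { requests << nil } # one stop signal per worker
workers.each(&:join)

results = Array.new(responses.size) { responses.pop }
results.sort.each { |id, body| puts "#{id}: #{body}" }
```

Each request is handled by exactly one worker because `Queue#pop` is thread-safe, so the handler code itself contains no locks. On MRI the global interpreter lock means the workers interleave rather than execute Ruby code in parallel, while JRuby runs them on native threads.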

00:14:24.080 In summary, adopting more advanced concurrency features could lead to cleaner, more maintainable code. Although using Ruby's current threading model is complicated due to its architectural limitations, experimenting with platforms like JRuby can unveil interesting opportunities. In fact, JRuby may enable more effective threading and allow for the implementation of actor models.

00:15:19.360 As we explore further, we must consider new concurrency models and languages that merge efficient paradigms with Ruby's syntax. I recommend delving into various systems such as Go or looking into concurrency libraries and frameworks already established in JVM environments. I also believe alternative implementations such as MacRuby can take a different approach, but remember that the core Ruby implementation may lag in adopting non-traditional concurrency ideas.

00:16:58.560 Our exploration of these topics should not stop with Ruby; many powerful concepts lie within other programming languages, and embracing them can broaden our understanding of concurrency fundamentals and inspire innovation. While it’s vital to understand Ruby's limitations, the advancement of concurrency features in other languages such as Erlang, Clojure, or Scala should not be underestimated.

00:18:12.480 Ultimately, as we seek to advance our understanding of concurrency in Ruby and beyond, it is crucial to remain inquisitive. Conduct research, test assumptions, and remain vigilant to new ideas and paradigms that contribute to progress in our field. Follow community discussions, read recent literature, and become comfortable exploring various programming languages.

00:19:15.840 In conclusion, as we expand our perception of concurrency, it’s not merely about threads versus events; we need all these tools at our disposal. Investing time in understanding actor models, CSP, and other concurrency models is essential. We shouldn’t just limit ourselves to the more traditional models we have been taught but should embrace these novel ideas that can revolutionize how we approach concurrency.

00:20:31.680 I also encourage you to check out the additional materials I’ve provided relating to concurrency models. Knowledge-sharing within the community is key to our collective advancement and understanding. Adopting new approaches towards concurrency could pave the way for exciting possibilities within Ruby's ecosystem.

00:21:52.480 Now let’s open the floor for questions regarding concurrency models, particularly concerning actor and CSP models. How do they differ, and what considerations should we keep in mind when integrating these models into our coding practices?

00:22:58.880 One point worth mentioning: while concurrency naturally promotes communication protocols, there will be instances where shared resources cannot be completely avoided. Approaching these issues with messaging systems in mind can clarify communication and enhance robustness over traditional methods.

00:23:38.720 While transitioning to these advanced concurrency models may seem daunting at first, embracing the actor model or CSP opens new pathways to reduce complexity significantly. It’s equally important to acknowledge the power of constructive messaging and effectively build our software around these principles.

00:25:05.600 Choosing the right concurrency model will often depend on project requirements and constraints, so it is worth exploring the trade-offs that different models present. To that end, books and other literature covering recent advances in concurrency can help keep us up to date on effective methods.

00:26:15.200 As you continue down this journey, I encourage you to explore online communities and interactive platforms. Take part in the main discussions on design patterns beyond Ruby. Building a repertoire of knowledge across several languages and concurrency models can strengthen both your skill set and the effectiveness of the software you write.

00:27:42.560 We will find ripe opportunities to bridge the gaps in concurrency paradigms, leading to cleaner, more effective code. Certainly, the quest for richer concurrency models will better equip us for the present and future challenges within the programming landscape.

00:29:10.560 As we conclude, please reflect on how you can apply the knowledge discussed today while exploring various practical applications. Stay informed and adaptable to new trends and best practices. I'm available for follow-up discussions or to talk further about any specific models.

00:30:43.520 Thank you all for your engagement and insightful questions. Should you have any further inquiries or comments, don't hesitate to reach out. Remember, the exploration of concurrency's intricate landscape is essential, and it’s only through collaborative engagement that we can truly progress. Keep questioning and keep learning.

MountainWest RubyConf 2011