Easy threading with JRuby, is it?

00:00:07 Let's give a round of applause for Janis and support him for his talk.

00:00:15 Welcome! So, this didn't happen by chance. Baltic Ruby was the final push to organize this meeting. Thank you for that, and welcome to my talk: "Is Threading with JRuby Easier?"

00:00:28 First, about me: I'm Janis Baiza from Latvia. I'm a developer at EasyBI and have been working with Ruby since 2010. At the beginning, I used it mainly for side projects, but for the last seven years, it has been my main programming language. One of my development passions is performance tuning, and even my Bachelor's and Master's theses were performance-related.

00:00:50 EasyBI is a data visualization tool written in Ruby on Rails. However, in the background, EasyBI uses the Pentaho Mondrian OLAP engine, which is written in Java. Therefore, we use JRuby, as it provides many useful features in a Java-related environment. Besides software, another passion of mine is green energy, and I also enjoy data analysis. A couple of years ago, I installed solar panels on my house.

00:01:18 To gain better insight into how this investment performs and when it will pay back, I conducted additional analysis using our analytics tool, EasyBI. If you're interested, I would be happy to discuss this further during the hallway track. Today, I will talk about threading in general and will look at some practical use cases. We will dive into concurrency and parallelism in Ruby.

00:02:10 If we examine the execution flow of any single-threaded code, we can categorize operations into two types: CPU- and memory-related operations, and those that involve network activity, such as database queries. The first type is generally faster because the CPU and memory are tightly coupled by a high-bandwidth connection. In contrast, as we've heard today, network operations tend to be slower and take more time to complete.

00:03:03 In a single-threaded flow, we often need to wait for these slower operations to complete. To improve efficiency and execution speed, we can divide these operations into multiple threads.

00:03:15 Let's consider the same operations divided into three threads. In most Ruby environments, such as MRI with its global lock, Ruby code executes on only one CPU core at a time.

00:03:27 Execution of this code might look like this: one of the threads is selected to execute next. If that thread encounters a slow operation, it has to wait until the operation completes. Meanwhile, the other threads continue executing until they also hit a slow operation, where they too have to wait for completion.

00:04:03 Once the first thread receives the necessary information, it can continue operations, followed by the other threads as they receive the information. Eventually, execution is marked as complete.

00:04:18 Now, let's compare this with the execution flow in JRuby, where we run on the JVM and have multi-core support. In this case, any thread's operation can be executed by any of the CPU cores. However, we still need to wait for those network operations to complete. Once they are finished, the threads can resume their work.

00:04:52 In this simple example, we can see that the total execution time is shorter than in the previous case. But that is in theory.

00:05:04 Let's examine a simple imaginary use case based on something we do in our product regarding background data processing.

00:05:17 Suppose I have a significant amount of data to analyze, and I want to download this data to my disk to avoid hitting API limits. I need to retrieve over 200 pages of data, and each request takes approximately 2 to 5 seconds, leading to a total download time of around 11 minutes.

00:05:39 I thought this was quite slow, so I decided to use parallel execution by splitting the operation into four threads. Each thread requests different pages of data with varying durations.
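
The four-thread download described here could be sketched roughly as follows. This is only an illustration: `fetch_page` is a hypothetical stand-in that sleeps instead of calling a real API, and the page count is reduced so the example finishes quickly.

```ruby
# Hypothetical stand-in for a real HTTP request; sleep simulates network latency.
def fetch_page(page)
  sleep(0.01)
  { page: page, body: "data for page #{page}" }
end

pages = (1..20).to_a
thread_count = 4

# Give each thread its own slice of page numbers.
slices = pages.each_slice((pages.size / thread_count.to_f).ceil).to_a

threads = slices.map do |slice|
  Thread.new { slice.map { |page| fetch_page(page) } }
end

# Thread#value waits for the thread to finish and returns the block's result.
results = threads.flat_map(&:value)
puts "Downloaded #{results.size} pages"
```

Because each thread spends most of its time waiting on the (simulated) network, the waits overlap even on an interpreter that runs Ruby code on one core at a time.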

00:06:07 Ultimately, the total time for the parallel execution was just over 3 minutes, making it nearly four times faster. This was quite a significant improvement.

00:06:30 However, it's essential to remember that the slowest part of this process was the network communication. After downloading the data, the files contained simple strings of JSON-like structures. Our next task is to convert these into arrays of hashes.

00:06:49 For the JSON parsing, I decided to increase the workload by multiplying the amount of data by four, resulting in almost 1,000 files to process.

00:07:07 Taking a straightforward approach to JSON parsing, I used flat_map and JSON.parse. This operation took nearly 8 seconds to load almost 95,000 records.
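
A minimal sketch of that single-threaded parsing step, assuming each file holds one JSON array. The file names and record fields below are invented for the example.

```ruby
require "json"
require "tmpdir"

records = Dir.mktmpdir do |dir|
  # Write a few small sample files (invented data) to parse.
  3.times do |i|
    sample = [
      { "id" => i * 2, "status" => "open" },
      { "id" => i * 2 + 1, "status" => "closed" }
    ]
    File.write(File.join(dir, "page_#{i}.json"), JSON.generate(sample))
  end

  files = Dir.glob(File.join(dir, "*.json")).sort

  # Each file parses to an array; flat_map concatenates them into one array of hashes.
  files.flat_map { |path| JSON.parse(File.read(path)) }
end

puts records.size
```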

00:07:37 To visualize the results, I'll create a table to compare our next steps with the previous ones. It can be observed that the threaded execution provided faster results.

00:07:55 When I split the data into four groups and created a new thread for each group, the processing time was reduced. However, the speedup fell short of what we expected after the download results.
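
Splitting the parsing into four groups with one thread per group might look like this. It is a sketch with in-memory strings standing in for the file contents; the sample data is made up.

```ruby
require "json"

# In-memory stand-ins for the contents of the downloaded files (invented data).
documents = Array.new(8) { |i| JSON.generate([{ "id" => i, "status" => "open" }]) }

# Four groups, one thread per group.
group_size = (documents.size / 4.0).ceil
threads = documents.each_slice(group_size).map do |group|
  Thread.new { group.flat_map { |doc| JSON.parse(doc) } }
end

# Collect each thread's result in group order.
parsed = threads.flat_map(&:value)
puts parsed.size
```

Each thread builds and returns its own array, so nothing is shared between threads while they run.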

00:08:12 Having an array of hashes is great, but let's add some processing logic. I want to count how often different fields appear in these records.

00:08:38 This operation walks through our prepared hashes, primarily involving memory and CPU-related tasks. It took approximately 0.1 seconds to complete.
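
One way to express such a field count, shown on a tiny made-up record set:

```ruby
# Invented sample records; real records would come from the parsed files.
records = [
  { "id" => 1, "status" => "open", "assignee" => "anna" },
  { "id" => 2, "status" => "closed" },
  { "id" => 3, "assignee" => "peter" }
]

# Hash.new(0) gives every unseen field a starting count of zero.
field_counts = records.each_with_object(Hash.new(0)) do |record, counts|
  record.each_key { |field| counts[field] += 1 }
end

p field_counts
```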

00:08:53 Let’s compare this to the results when we multiply the load by 100, which then took around 11 seconds.

00:09:10 We should keep track of our results and see how they perform with threads. In our tests, the threaded execution times for this CPU-bound work were nearly the same as the single-threaded ones.

00:09:40 In other words, threads gave us little parallelism for these CPU-bound operations. Now let's switch gears and run the same JSON parsing on JRuby.

00:09:56 In single-threaded execution, JRuby was indeed slower than the C implementation (MRI), but once we ran the work in parallel threads, the numbers improved significantly.

00:10:10 The trend continued with the transformation operations. We saw nearly four times better results in these areas, a significant boost with JRuby.

00:10:42 You might be asking if we achieved what we wanted. However, it’s essential to note that some data counts exhibited unexpected differences.

00:11:10 The issue revolves around the count discrepancies. Occasionally, I received an error during assignment, where certain threads began initializing a value simultaneously, causing conflicts.

00:11:44 This is especially problematic since each thread might read the same value before updating it. If multiple threads access and modify shared variables, we can end up with inaccurate results.
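
The lost update can be illustrated without real threads by writing out one unlucky interleaving by hand. The variable names are only for illustration.

```ruby
counts = { "status" => 0 }

# Two threads each want to add 1, but both read before either writes.
seen_by_a = counts["status"]      # thread A reads 0
seen_by_b = counts["status"]      # thread B reads 0
counts["status"] = seen_by_a + 1  # thread A writes 1
counts["status"] = seen_by_b + 1  # thread B also writes 1, overwriting A's update

puts counts["status"] # prints 1, although two increments were attempted
```

An expression such as `counts[key] += 1` is exactly this read followed by a write, which is why it is not safe on a hash shared between threads.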

00:12:10 In JRuby, where we have true parallelism, this can become quite pronounced. It's possible to encounter similar issues in MRI, but they typically occur in larger datasets.

00:12:29 So is there a way to achieve parallel execution while ensuring thread safety? Yes, we can use a Mutex.

00:13:10 We initialize a Mutex to ensure that only one thread can execute a segment of code at a time. This way, if a thread is executing a block of code, others will wait their turn.
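
A small sketch of guarding a shared counter with a mutex. The thread and iteration counts are arbitrary.

```ruby
counts = Hash.new(0)
lock = Mutex.new # one mutex, shared by every thread

threads = 4.times.map do
  Thread.new do
    1_000.times do
      # Only one thread at a time may run the read-modify-write below.
      lock.synchronize { counts["status"] += 1 }
    end
  end
end
threads.each(&:join)

puts counts["status"]
```

Keeping the synchronized block as small as possible limits how long other threads have to wait for the lock.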

00:13:45 However, this can lead to extensive context switching and may degrade performance under certain conditions. Additionally, too much locking can lead to further complications.

00:14:15 In practice, we often turn to the concurrent-ruby gem, which provides modern concurrency tools inspired by several other programming languages.

00:14:40 This gem offers robust thread safety guarantees across all major Ruby interpreters.

00:15:10 Using concurrent collections such as Concurrent::Array or Concurrent::Hash can significantly ease thread interactions.

00:15:45 However, it's important to note that Concurrent::Map is not 100% compatible with typical hash operations.

00:16:08 We can also leverage other features of the gem for more specialized use cases.

00:16:35 In our development environment, we utilize two classes from this gem: AtomicFixnum, which stores integer values safely, and AtomicReference, which can hold different object types.

00:17:05 Now that we have the gem integrated, let’s rewrite our threaded code applying these concurrent options. We will replace standard hashes with concurrent maps.

00:17:47 We will also ensure our increment operations are thread-safe through AtomicFixnum, effectively managing collisions during concurrent updates.

00:18:25 Upon execution, we found that the counts matched the expected totals across all of the nearly 95,000 records.

00:18:48 The counts that had previously conflicted were also correct now, confirming our results. As expected, JRuby outperformed MRI in threaded execution.

00:19:08 Performance showed improvements with true parallelism in JRuby, which executes across multiple cores efficiently.

00:19:40 However, note that concurrency does not equate to parallelism. Running multiple threads can still improve performance without true parallelism, especially when the threads spend their time waiting on database or network calls.

00:20:10 Be mindful of how you interact with variables within threads, particularly global and instance variables.

00:20:32 We should make an effort to reduce reliance on such variables, but when necessary, utilize gems like concurrent Ruby to help manage thread operations.
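
One way to reduce reliance on shared variables is to let each thread build its own result and merge the results afterwards. This is a sketch with invented data, not necessarily the approach used in the talk.

```ruby
# Invented sample data: every record has "id", every second one has "status".
records = Array.new(100) do |i|
  i.even? ? { "id" => i, "status" => "open" } : { "id" => i }
end

threads = records.each_slice(25).map do |group|
  Thread.new do
    # Each thread counts into its own local hash; nothing is shared while it runs.
    group.each_with_object(Hash.new(0)) do |record, local|
      record.each_key { |field| local[field] += 1 }
    end
  end
end

# Merge the per-thread results on the main thread once all threads are done.
totals = threads.map(&:value).reduce(Hash.new(0)) do |sum, local|
  sum.merge(local) { |_field, a, b| a + b }
end

p totals
```

Since the merge happens on a single thread, no locking is needed.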

00:21:10 For maintenance purposes, make sure thread naming and context logging are in place. This simplifies debugging and helps identify thread-specific issues.
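
Thread naming combined with a log formatter could look roughly like this. The worker names and the log format are assumptions made for the example.

```ruby
require "logger"
require "stringio"

# Log into a string so the example is self-contained; a real app would log to a file or STDOUT.
output = StringIO.new
logger = Logger.new(output)

# Prefix every message with the name of the thread that logged it.
logger.formatter = proc do |severity, _time, _progname, message|
  "[#{Thread.current.name}] #{severity}: #{message}\n"
end

threads = 2.times.map do |i|
  Thread.new do
    Thread.current.name = "import-worker-#{i}" # hypothetical naming scheme
    logger.info("started")
  end
end
threads.each(&:join)

puts output.string
```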

00:21:53 Initialization of global or instance variables must be approached with caution to prevent ghost connections when threading.

00:22:34 One common pitfall is that multiple threads could attempt to create global or connection variables simultaneously, leading to unforeseen results.

00:23:17 Let me exemplify this: if two threads initialize a connection concurrently, it could result in lost connections and could lead to considerable issues in production.

00:23:51 It would be wise to implement mutexes or synchronizations around critical sections of code that handle global resources.

00:24:17 In summary, I recommend initializing connection objects upfront and using a mutex when creation has to be delayed.

00:24:54 In particular, ensure all threads share the same mutex instance; otherwise, the synchronization won't be effective.

00:25:30 To optimize connection access, it is prudent to check whether the connection already exists before entering the synchronized block.

00:26:10 This helps in mitigating excessive locks and context switching while ensuring code integrity.
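
Put together, the shared mutex and the check before locking could be sketched like this. `ConnectionHolder` and `build_connection` are hypothetical names; a plain object stands in for a real database connection.

```ruby
class ConnectionHolder
  # One mutex shared by all threads; a mutex created per call would protect nothing.
  LOCK = Mutex.new

  class << self
    attr_reader :created_count

    def connection
      # Fast path: skip the lock once the connection exists.
      return @connection if @connection

      LOCK.synchronize do
        # Check again inside the lock: another thread may have created it while we waited.
        @connection ||= build_connection
      end
    end

    private

    # Hypothetical stand-in for opening a real database connection.
    def build_connection
      @created_count = (@created_count || 0) + 1
      sleep(0.01) # simulate slow connection setup
      Object.new
    end
  end
end

connections = 8.times.map { Thread.new { ConnectionHolder.connection } }.map(&:value)
puts connections.uniq.size
```

All eight threads end up with the same object, and the slow setup runs only once, because every caller that misses the fast path queues on the same lock and re-checks before creating anything.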

00:26:40 In wrapping up, the key takeaways include understanding that concurrency is not the same as parallelism. While multi-threading often improves performance, it is essential to consider the types of operations being performed.

00:27:20 For CPU-intensive tasks, using multiple cores truly helps. Be cautious when sharing variables between threads, and rely on tools like concurrent-ruby to make threaded code safer.

00:27:57 Furthermore, add clarity with thread names and enhance your logging practices to streamline debugging and troubleshooting processes.

00:28:27 Lastly, establishing thread safety for database connections, particularly in busy applications, is paramount. We should be ready to guard connection creation with mutexes.

00:29:10 Taking these precautions could prevent serious issues from arising in production, such as exhausting database connections.

00:30:00 As we experiment with JRuby's capabilities, we should appreciate the advantages of parallel execution while being mindful of the hurdles that come with shared variables.

00:30:40 Through diligent practice, reducing errors and achieving smooth operations is within reach.

00:31:01 I appreciate your time today. Happy threading!