Ruby Supercomputing: Using The GPU For Massive Performance Speedup

In this presentation at MountainWest RubyConf 2011, Preston Lee focuses on leveraging Graphics Processing Units (GPUs) to speed up Ruby applications. The talk opens by challenging traditional views on concurrency, arguing that avoiding multi-threading leaves much of a modern machine's computing capacity unused. Key points include:

Understanding Concurrency: The presentation explores how developers can benefit from utilizing concurrency through multiple threads, moving beyond single-threaded approaches to improve performance on compute-heavy algorithms.
Examples and Comparisons: A practical example involving a tree ring simulator demonstrates the differences between single-threaded and multi-threaded approaches in Ruby 1.9 and JRuby. The results show that while multi-threading can enhance performance, the global interpreter lock in Ruby 1.9 keeps threads from executing Ruby code in parallel, which limits its effectiveness compared to JRuby.
GPU Architecture: Lee explains why GPUs, which can run thousands of concurrent threads, can deliver significant speed increases over standard CPU processing for specific tasks.
OpenCL and GPU Programming: The use of OpenCL is introduced as a way to access GPU capabilities across various platforms. The presentation includes several specific code snippets that illustrate how to implement computing tasks on the GPU, using Ruby libraries like Barracuda to facilitate this process.
Performance Challenges: Lee discusses the challenges related to data transfer between the CPU and GPU, noting that this overhead can affect performance, especially in Ruby implementations.
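
The single-threaded versus multi-threaded comparison described above can be sketched in plain Ruby. This is a minimal illustration under stated assumptions, not code from the talk: the `sum_of_squares` workload is a hypothetical stand-in for the tree ring simulator, and the names `run_single`, `run_threaded`, and `timed` are invented for this example.

```ruby
# Hypothetical CPU-bound workload (a stand-in, NOT the talk's tree ring
# simulator): sums the squares of every integer in a range.
def sum_of_squares(range)
  range.reduce(0) { |acc, n| acc + n * n }
end

# Single-threaded baseline: process the whole range in one pass.
def run_single(limit)
  sum_of_squares(1..limit)
end

# Multi-threaded version: split 1..limit into contiguous slices, compute
# each slice in its own thread, then combine the partial sums.
def run_threaded(limit, thread_count)
  slice = (limit.to_f / thread_count).ceil
  threads = thread_count.times.map do |i|
    first = i * slice + 1
    last = [first + slice - 1, limit].min
    # An empty range (first > last) simply contributes 0.
    Thread.new { sum_of_squares(first..last) }
  end
  # Thread#value joins the thread and returns the block's result.
  threads.map(&:value).sum
end

# Small timing helper (illustrative only) using the monotonic clock.
def timed(label)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  result = yield
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  puts format("%-10s %.3fs", label, elapsed)
  result
end

limit = 1_000_000
timed("single")    { run_single(limit) }
timed("4 threads") { run_threaded(limit, 4) }
```

Both versions return the same total. How the timings compare depends on the interpreter: on an implementation with a global interpreter lock the threaded run is unlikely to be faster for pure Ruby computation, while on JRuby the slices can run on separate cores.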

In conclusion, the presentation encourages Ruby developers to embrace GPU programming to unlock large performance gains in suitable computational tasks. Its main lessons are the importance of understanding concurrent programming and the potential of GPUs in high-performance computing scenarios.

Preston Lee • February 17, 2015 • Earth

By Preston Lee
Applications typically measure processing throughput as a function of CPU-bound algorithm performance. Most modern production systems contain 2-16 processor cores, but highly concurrent, compute-heavy algorithms may still reach hard limits. In recent years, vendors such as Nvidia and Apple have formalized specifications and APIs that allow any developer to run potentially thousands of concurrent threads using a piece of hardware already present in the machine: the GPU.

MountainWest RubyConf 2011