Sam Rawlins
The State of C Extensions: Alive and Well, so Learn to Deal

by Sam Rawlins

The video titled "The State of C Extensions: Alive and Well, so Learn to Deal" by Sam Rawlins discusses the relevance and state of C extensions within the Ruby programming community. C extensions, which allow Ruby to interface with C libraries, have become an integral part of Ruby's ecosystem despite some reluctance from the Ruby community due to challenges related to portability and documentation.

Key points discussed in the video include:

- Historical Context: MRI is the original Ruby implementation, inherently tied to C, with essential data types and libraries built in C.
- Current Landscape: C extensions are still prevalent, with significant examples found in popular gems, particularly the traditional SQL gems; newer NoSQL gems typically communicate over network sockets instead.
- C Extensions Motivation: The primary reasons for utilizing C extensions are performance optimization and the ability to leverage existing C libraries. Examples include the Zlib library for compressing data and the performance contrast between the Chunky PNG and Oily PNG libraries.
- Performance Examples: The video cites concrete examples showcasing speed differences between pure Ruby and C extension implementations, emphasizing Oily PNG’s dramatic performance improvements.
- Challenges with C Extensions: Significant concerns exist regarding C extensions, such as lack of portability, difficulties in debugging, and inadequate documentation practices. This can make them cumbersome for Rubyists who may not be familiar with C.
- Documentation Importance: Rawlins highlights the need for better documentation standards in C extensions, offering examples where poor documentation leads to user confusion. He encourages clarity and accessibility through enhanced documentation practices.
- Best Practices and Future Considerations: Although RubyGems offers some structure for packaging C extensions, a widely accepted standard remains absent. The video also touches on the Foreign Function Interface (FFI) as an alternative but does not delve into it in depth.

In conclusion, Rawlins asserts that C extensions are vital to Ruby's growth and performance capabilities. Encouraging developers to embrace C extensions, he emphasizes the necessity for clear documentation and thoughtful integration within the Ruby community, ensuring they remain accessible and beneficial for all users.

The session closes with an invitation for questions and a reminder of the importance of effective communication regarding C extension usage in Ruby development.

00:00:12.759 This is my talk on the state of C extensions: Alive and well, so you're going to have to deal with it.
00:00:17.760 Who am I? I'm Sam Rawlins. There's my Twitter, my email, and my GitHub. My claims to fame in the Ruby community are none, really.
00:00:22.039 I sometimes pay quick attention to C extensions.
00:00:27.160 Okay, so we all know that MRI is the first implementation of Ruby, and it's written in C. Many of Ruby's core classes are also written in C, including Fixnum, Bignum, String, Hash, etc.
00:00:39.040 You might be saying, 'Okay, Fixnum is older than my grandpa; why do I care about these old C extensions?' The truth is, even new code in Ruby 2 is written in C. The Array class is largely implemented in C, as are newer additions such as Array#keep_if, Dir.home, and the Fiber class. Ruby must be able to interact with the vast array of C libraries to be taken seriously.
00:01:00.320 To facilitate this interaction, Ruby provides a C extension API, essentially described in Ruby's README.EXT file. There's no formal specification of the extension API, but various documentation exists, such as Matz's Doxygen documentation. The Pickaxe book also serves as a resource.
00:01:19.200 Some of your favorite gems are actually C extensions, like Nokogiri, JSON, FastThread, etc. Most of the traditional SQL gems use them, while newer NoSQL gems typically communicate via network sockets.
00:01:40.079 I took a quick look at rubygems.org. As of a couple of days ago, there are about 21,000 gems, and only 45 of them are extensions. I'm okay with that because not everything should be a C extension. However, among the top 100 downloaded gems, I found that 18 are C extensions, which I think is a pretty high percentage.
00:02:00.920 The top 10 most downloaded C extensions include JSON, which I find interesting. There are two primary motivations for creating C extensions: speed and serving as a bridge to existing C libraries. Most C extensions are motivated by these two reasons.
00:02:22.360 For example, zlib, a long-standing and well-vetted C library, is used for compressing and decompressing data, and it is very efficient. The Zlib C extension in Ruby wraps this library for both speed and access to existing code.
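As a minimal sketch of what that bridge looks like from the Ruby side, the snippet below round-trips a string through Ruby's standard zlib binding. The helper name `compress_and_restore` and the sample text are made up for illustration; only `Zlib::Deflate.deflate` and `Zlib::Inflate.inflate` come from the library.

```ruby
require "zlib"

# Hypothetical helper for this sketch: compresses text with zlib and
# immediately decompresses it again, returning both results.
def compress_and_restore(text)
  compressed = Zlib::Deflate.deflate(text)       # calls into the C zlib library
  restored   = Zlib::Inflate.inflate(compressed) # and back again
  [compressed, restored]
end

# Highly repetitive sample data compresses well.
original = "C extensions are alive and well. " * 200
compressed, restored = compress_and_restore(original)

puts "original:   #{original.bytesize} bytes"
puts "compressed: #{compressed.bytesize} bytes"
puts "round trip ok: #{restored == original}"
```

The Ruby code never touches the compression algorithm itself; it only passes strings across the boundary, which is the "bridge" motivation in practice.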
00:02:45.640 I am not a C extension advocate; I am a Rubyist. I try to program in Ruby more than 50% of the time, even though I work in various languages. I can support C extensions because they expose Ruby to a greater computing ecosystem.
00:03:06.720 Being pragmatic means understanding the benefits of C extensions. A great example of this is the Chunky PNG library, which is a pure Ruby library for encoding and decoding PNG files.
00:03:36.480 The same author created a C extension called Oily PNG that enhances the performance of Chunky PNG by optimizing certain methods, allowing users to easily migrate to a faster library without changing their code.
00:04:02.720 Benchmarks show that Chunky PNG can take as long as 7.2 seconds for certain operations, while Oily PNG does it in less than 1 second, making it over 80 times faster.
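The "faster without changing your code" idea can be sketched in plain Ruby. This is not Oily PNG's actual source: `PureCanvas` and `FastCanvasOps` are hypothetical names, and the "fast" methods are written in Ruby here so the example stays runnable, whereas a real accelerator gem would define them in compiled C.

```ruby
# Hypothetical pure-Ruby library class, standing in for something like Chunky PNG.
class PureCanvas
  attr_reader :pixels

  def initialize(pixels)
    @pixels = pixels
  end

  # Slow path: an explicit Ruby loop over every pixel value.
  def brightness_sum
    total = 0
    pixels.each { |p| total += p }
    total
  end

  def backend
    :pure_ruby
  end
end

# Hypothetical accelerator module. In a real gem these methods would be
# implemented in C; they are Ruby here only to keep the sketch self-contained.
module FastCanvasOps
  def brightness_sum
    pixels.sum
  end

  def backend
    :accelerated
  end
end

# Prepending swaps in the replacement methods; calling code does not change.
PureCanvas.prepend(FastCanvasOps)

canvas = PureCanvas.new([10, 20, 30])
puts canvas.brightness_sum
puts canvas.backend
```

Because the public method names stay the same, callers keep using the original class and pick up the replacement implementations automatically.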
00:04:34.080 Another example is the JSON gem, which comes in two versions: the C extension variant and the pure Ruby variant. If you need speed, you can require the C extension and fall back to pure Ruby if it’s not available.
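The require-and-fall-back pattern described here might look roughly like the sketch below. The names `fast_slug_ext`, `FastSlug`, and `SLUG_BACKEND` are hypothetical, and the extension is assumed not to be installed, so the pure-Ruby branch is the one that runs.

```ruby
# "fast_slug_ext" is a hypothetical compiled extension used only for
# illustration; it is assumed to be absent, so the rescue branch runs.
begin
  require "fast_slug_ext"
  SLUG_BACKEND = :c_extension
rescue LoadError
  # Pure-Ruby fallback exposing the same public interface.
  module FastSlug
    def self.slugify(text)
      text.downcase.gsub(/[^a-z0-9]+/, "-").gsub(/\A-+|-+\z/, "")
    end
  end
  SLUG_BACKEND = :pure_ruby
end

puts SLUG_BACKEND
puts FastSlug.slugify("Alive and Well, so Learn to Deal")
```

Callers only ever see `FastSlug.slugify`; which implementation sits behind it is decided once, at load time.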
00:05:16.320 C extensions face criticism, mainly for being less portable compared to pure Ruby code, as Ruby code can run across different platforms and interpreters. It’s also difficult to patch or debug C extensions, and self-documentation is often lacking.
00:05:53.200 For example, if you want to monkey patch a C extension, you typically have to write C code, which can be complex and cumbersome.
00:06:29.440 Debugging C extensions presents another challenge, as Rubyists shouldn't be required to know C to debug C extension code. Such difficulties can lead to frustration when attempting to manage library functionality.
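One small way to see this gap from inside Ruby, assuming MRI: a method written in Ruby can report where it was defined, while a method implemented in C has no Ruby source location to offer. The `Greeter` class is a made-up example.

```ruby
# A made-up class with a method written in Ruby.
class Greeter
  def hello
    "hello"
  end
end

# Ruby-defined methods report a [file, line] pair.
ruby_location = Greeter.instance_method(:hello).source_location

# String#upcase is implemented in C in MRI, so there is nothing to report.
c_location = String.instance_method(:upcase).source_location

puts "Greeter#hello: #{ruby_location.inspect}"
puts "String#upcase: #{c_location.inspect}"
```

Tools that rely on `source_location` to jump to a method's definition therefore stop at the C boundary, which is part of why debugging an extension tends to mean opening its C source.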
00:06:57.280 Additionally, self-documentation is problematic. Users should not have to read undocumented C code to understand Ruby extensions. Good documentation practices should be adopted within C extensions, and methods should include clear explanations.
00:07:28.560 While many C extensions lack clear documentation, libraries like ActiveSupport provide excellent examples of why good documentation is critical.
00:08:01.360 An old example, Range#to_s, serves as a cautionary tale. Its documentation leaves much to be desired, raising questions about its functionality from any user unfamiliar with C.
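For readers wondering what Range#to_s actually does, a quick experiment in Ruby answers most of the questions the documentation leaves open. The helper name `describe_range` is invented for this sketch.

```ruby
# Invented helper: returns the to_s and inspect forms of a range side by side.
def describe_range(range)
  { to_s: range.to_s, inspect: range.inspect }
end

numbers   = describe_range(1..3)
letters   = describe_range("a".."c")
exclusive = describe_range(1...3)

puts numbers[:to_s]     # 1..3
puts letters[:to_s]     # a..c
puts letters[:inspect]  # "a".."c"
puts exclusive[:to_s]   # 1...3
```

In short, to_s calls to_s on both endpoints and joins them with the range operator, while inspect calls inspect on them, which is why the string endpoints keep their quotes in the second form.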
00:08:37.440 Documenting each method in C extension code with clear comments, preferably in YARD style, can bridge the gap and improve accessibility for Ruby users.
00:09:09.040 There’s a need for a consensus on the best practices for packaging C extensions. The RubyGems API provides guidance on managing compilation, but there isn’t a universally agreed-upon method.
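One common layout, shown here only as a sketch and not as an agreed standard, is to list an extconf.rb in the gemspec so RubyGems compiles the extension at install time. The gem name `fast_slug`, the author, and all file paths below are hypothetical.

```ruby
require "rubygems"

# Every name and path in this specification is hypothetical.
SPEC = Gem::Specification.new do |s|
  s.name    = "fast_slug"
  s.version = "0.1.0"
  s.summary = "Example gem that ships a C extension"
  s.authors = ["Example Author"]
  s.files   = [
    "lib/fast_slug.rb",
    "ext/fast_slug/fast_slug.c",
    "ext/fast_slug/extconf.rb"
  ]
  # Listing an extconf.rb asks RubyGems to build the extension on install.
  s.extensions = ["ext/fast_slug/extconf.rb"]
end

# ext/fast_slug/extconf.rb would typically contain little more than:
#   require "mkmf"
#   create_makefile("fast_slug/fast_slug")

puts SPEC.extensions.inspect
```

The mkmf lines are left as comments because running them generates a Makefile and needs the Ruby development headers; the gemspec half runs on its own.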
00:09:50.280 As for FFI (Foreign Function Interface), while it wasn't covered in detail today, it is still a viable option to consider when performance comparisons are made against C extensions.
00:10:18.760 In conclusion, C extensions have a valid place in the Ruby ecosystem. When used properly, they enhance performance and allow integration with existing libraries, proving how they are alive and well in the Ruby community.
00:10:58.680 Ultimately, developers should embrace C extensions while maintaining a dedication to clear documentation and user accessibility.
00:11:07.760 Thank you for your time. Do you have any questions? I'd be happy to take them.
00:11:20.240 Is there a good convention for packaging C extensions? There’s some guidance in the RubyGems API, but no widely accepted standard.
00:11:30.240 For any specific queries about packaging or compiling C extensions, I suggest seeking advice in the RubyGems IRC Channel.
00:11:37.240 Thanks again for attending! I hope you found this discussion helpful.
00:11:46.040 It's been a pleasure sharing these insights with you.