Ups and Downs of Ruby Internationalization

In 'Ups and Downs of Ruby Internationalization', presented at RubyKaigi 2016, Martin J. Dürst covers improvements to Ruby's String methods, in particular case conversion for internationalization. The talk explains how Ruby 2.4 extends these methods to full Unicode, removing the limitation of earlier versions, which handled only ASCII characters.

Key points discussed in the video include:
- Introduction to the Speaker: Martin J. Dürst, a professor with extensive contributions to internationalization and Unicode, introduces himself and his engagement with Ruby.
- Unicode Updates: Ruby has been updated to Unicode version 9.0 and continues to track the standard, while Ruby 2.4 brings the more significant changes.
- Case Conversion Methods: Methods such as upcase, downcase, and related methods were inadequate in earlier versions because they converted only ASCII letters and left non-ASCII characters unchanged. Ruby 2.4 extends them to full Unicode.
- Challenges in Case Mapping: Dürst elaborates on the complexities of case mapping, noting historical variations in letter case usage across different languages and the importance of backward compatibility.
- Compatibility with User IDs and DNS: The changed methods can introduce issues in contexts such as DNS and user identifiers, which are still constrained to ASCII and where the new behavior could cause mismatches.
- Special Cases: The presentation covers special cases such as the German sharp S (ß), which is difficult to handle when uppercasing and lowercasing, and stresses that Unicode case conversion needs careful implementation because of such inconsistencies.
- Implementation Constraints: A thorough explanation of managing string transformations and the technical aspects of applying these methods under varying encodings is provided.
- Importance of Testing: Dürst emphasizes the critical nature of testing, with over 20 million tests conducted to ensure consistency and robustness in the new features.
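The case conversion behavior summarized in the list above can be illustrated with a short sketch. It assumes Ruby 2.4 or later and UTF-8 strings; the variable names are only for illustration and do not come from the talk.

```ruby
# Assumes Ruby 2.4+ and UTF-8 source/strings.

word = "résumé"

# Default since Ruby 2.4: full Unicode case mapping.
full_upper = word.upcase            # => "RÉSUMÉ"

# The :ascii option restricts conversion to a-z / A-Z,
# which matches the behavior of earlier Ruby versions.
ascii_upper = word.upcase(:ascii)   # => "RéSUMé"

# German sharp s: the default uppercase mapping is "SS",
# so the result is one character longer than the input.
street_upper = "Straße".upcase      # => "STRASSE"

# Case folding (downcase only) maps ß to "ss".
street_fold = "Straße".downcase(:fold)

# Turkic languages pair dotted and dotless i differently.
turkic_lower = "I".downcase(:turkic)  # => "ı"

puts full_upper, ascii_upper, street_upper, street_fold, turkic_lower
```

Note that a round trip is not guaranteed: "Straße".upcase.downcase gives "strasse", not the original string, which is one reason the sharp S is treated as a special case.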

The talk concludes by stressing the importance of internationalization and the need to adapt existing applications so that they can use the new Unicode features without compromising data integrity. Dürst invites feedback and collaboration to improve Ruby's internationalization support, and points to continued testing and community involvement as essential to getting it right.

Martin J. Dürst • September 08, 2016 • Kyoto, Japan • Talk

http://rubykaigi.org/2016/presentations/duerst.html

Currently many of Ruby's String methods, such as upcase and downcase, are limited to ASCII and ignore the rest of the world. This is finally going to change in Ruby 2.4, where this functionality will be extended to cover full Unicode. You will get to know what will change, how your programs may be affected, and how these changes are implemented behind the scenes. We will also look at the overall state of internationalization functionality in Ruby, and potential future directions.
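One way a program may be affected: code that lowercases identifiers and relied on non-ASCII characters being left alone will produce different results from Ruby 2.4 on. The sketch below assumes Ruby 2.4 or later; `normalize_user_id` is a hypothetical helper, not something from the talk or from Ruby itself, and whether the ASCII-only behavior is the right choice depends on how existing data was stored.

```ruby
# Hypothetical helper: lowercase only ASCII letters, as Ruby did
# before 2.4, so that previously stored identifiers still match.
def normalize_user_id(id)
  id.downcase(:ascii)
end

legacy_style  = normalize_user_id("Dürst_ÄB")  # => "dürst_Äb"
unicode_style = "Dürst_ÄB".downcase            # => "dürst_äb"

# The two results differ, so mixing them in lookups would cause mismatches.
puts legacy_style == unicode_style             # prints "false"
```

Passing `:ascii` explicitly keeps the old behavior at call sites that need it, while the rest of the program can use the Unicode-aware default.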

Martin J. Dürst @duerst
Martin is a Professor of Computer Science at Aoyama Gakuin University in Japan. He has been one of the main drivers of Internationalization (I18N) and the use of Unicode on the Web and the Internet. He published the first proposals for DNS I18N and NFC character normalization, and is the main author of the W3C Character Model and the IRI specification (RFC 3987). Since 2007, he and his students have contributed to the implementation of Ruby, mostly in the area of I18N.

RubyKaigi 2016