Building Generic Software

Ruby

Chris Salzberg

#internationalization-i18n

In his talk "Building Generic Software" at RubyConf 2018, Chris Salzberg explores the concept of generic software, emphasizing its importance in building maintainable and reusable code. He begins by introducing his background as a contributor to the Ruby ecosystem and shares insights from notable figures in the field, particularly Jeremy Evans, who highlights the value of writing flexible, generic software rather than a single API tied to one specific case.

Key Points Discussed:
- What is Generic Software? Generic software refers to software that addresses a broader range of problems rather than focusing narrowly on specific use cases. The idea is to create smaller, reusable components that can be combined in various ways, limiting complexity.
- Lego Philosophy in Programming: Drawing on Gary Bernhardt's analogy, Salzberg advocates for writing code like assembling Lego: breaking problems down into smaller, manageable pieces.
- Challenges of Complexity: Though striving for reusability can often lead to overly complex architectures, Chris stresses the need to balance maintainability and complexity in software design.
- Exploration of Translated Attributes: To illustrate his discussion, Salzberg provides a real-world example of managing translations in Ruby applications, analyzing common strategies for implementing translated attributes with various methodologies like dynamic methods, translation tables, and JSON storage.
- Reusability vs. Complexity: He references a Hacker News comment critiquing the overengineering that can result from excessive reusability, emphasizing the need to avoid unnecessary complexity.
- Design Patterns for Flexibility: Salzberg discusses design choices that encourage flexibility, such as the separation of concerns and utilizing backend protocols that allow for various data-storage strategies while still ensuring simple interfaces to work with.
- Conclusion on Building Flexible Frameworks: Through examples, he argues that successful frameworks thrive on pluggability and diverse functionality, underscoring the importance of supporting multiple approaches while maintaining simplicity.

From this talk, attendees are encouraged to embrace the principles of generic software development, aiming for modular, flexible designs that both mitigate complexity and foster reusability in their projects. Chris Salzberg invites engineers to engage in this topic collaboratively to enhance the Ruby ecosystem and develop well-designed, reusable software.

00:00:15.379 Thank you for coming. The title of this talk is 'Building Generic Software.' My name is Chris Salzberg, and I have a lot to discuss today.

00:00:27.660 I will cover the easier parts quickly and delve into the more challenging aspects slowly. A few quick points about myself: my handle is 'shioyama,' which you may have seen around.

00:00:41.370 I live in Tokyo, Japan, as you can see in the background of this slide. This is Tokyo. Don’t be fooled; I’m not Japanese. That’s just my icebreaker joke. I’m actually Canadian, originally from Montreal. I work at a company called 'Degica,' based in Tokyo, and I contribute to the open-source world.

00:00:58.199 I’m the author of several Ruby gems, the most well-known of which is called 'Mobility.' I will talk about that a bit later. In my free time, I write about topics like the module builder pattern, which you may have heard of.

00:01:10.439 The focus of today's talk is building generic software. To kick things off, I will introduce the concept of generic software: What is it, and why should you care? Based on this idea, we will look for a specific problem and use it to build some generic software, specifically a framework.

00:01:27.420 After that, we will see what lessons we can learn from this exercise. Now, what is generic software? I did not invent this term, as far as I know. It was coined by Jeremy Evans, who gave a talk here yesterday. He’s the author of a well-known ORM called 'Sequel,' similar to Active Record, and another well-known gem called 'Roda,' which he actually discussed here in 2014.

00:01:46.110 I’ve been following Jeremy’s work for some time. A while ago, I sought inspiration in his open-source projects and found a quote buried in his 2012 slides from a talk on the development of Sequel. In this quote, he states that one of the best ways to write flexible software is to write generic software. Instead of designing a single API that completely handles a specific case, you write multiple APIs that manage smaller, more generic parts of that use case. Then, handling the entire case involves gluing those parts together. When you approach code this way, designing APIs that solve generic problems ensures that you can more easily reuse those APIs later to solve other issues. This quote resonated with me and aligned with the work I was doing on developing generic components.

00:02:18.010 That's why I'm here today to give this talk. As I delved into this idea, I realized that you can take the general concept of building blocks in several different directions, depending on how you apply it. This tweet from Gary Bernhardt from September encapsulates that. You may know him as a Rubyist; he stated that programming should be like assembling Lego. Decompose everything into tiny pieces with maximally general interfaces.

00:02:36.129 This idea aligns closely with Jeremy Evans's quote. While you can take my word for it, I'd suggest you listen to Jeremy—he's done great work! However, moving ahead a decade, simple tests have become increasingly complicated, and at times, they break for reasons no one understands. This brings us to a strange schism. Depending on how you apply this Lego block philosophy, you either end up with maintainable software or a convoluted mess that's challenging to maintain.

00:03:05.620 Here’s another perspective: a comment on a Hacker News thread responding to a blog post that discussed how our software has become larger and more complex while doing the same tasks. The commenter attributes part of the blame to an unhealthy obsession with code reusability and sticking to specific paradigms. The result? There’s so much code comprising abstractions piled upon abstractions that no one seems to just want to write some straightforward code anymore; they’d prefer to include entire frameworks instead of writing a few helper functions.

00:03:29.680 I sympathize with this sentiment. If you look at the Twitter thread Gary Bernhardt participated in, it echoes sentiments suggesting that you should just write the code yourself rather than forcing these libraries upon yourself if you don’t genuinely need them. While that's reasonable, I think it misses a crucial point. Imagine that this tree represents a giant dependency tree, like Ruby gems. Most of the time, we’re at the leaves of this tree, building applications that aren't required by anyone else. We rely heavily on lots of software that branches down to the roots, ultimately to the Ruby interpreter. So, suggesting we rip out these frameworks is akin to chopping off the branches from our leaves.

00:03:59.919 While that is a valid approach, I believe we should also consider how we can improve this tree. Can we enhance these branches? Can we optimize and strengthen this structure? That’s what I want to discuss today. To do this, I’ll utilize some concepts that date back over 30 years, back to the 1980s, before even the Internet. I want to highlight a significant paper from 1988 written by Ralph Johnson and Brian Foote, who you may know if you've encountered the term ‘Big Ball of Mud.’ The title is 'Designing Reusable Classes.'

00:04:29.320 This entire idea of reusability is fundamental to object-oriented programming, concerning key concepts like inheritance and polymorphism, which are designed to promote code reuse. Additionally, the concept of a framework, which I will discuss later, is introduced in this paper. Today, I intend to take some of these timeless ideas and apply them alongside the notion of generic software to guide our journey towards creating maintainable software.

00:04:50.610 Generally, when you want to build generic software—software that is reusable and applicable in multiple contexts—you have to start with something specific, at least in my experience. You don't begin from the bottom of the tree; you start from the leaves and work downwards. Therefore, I will begin with a specific example: translation. This topic is particularly relevant to me, as my entry point into the Ruby community was through a translation platform.

00:05:09.430 So, let’s explore the concept of translated attributes. It’s actually a straightforward idea. I’m using I18n.locale here; the details aren’t crucial, but think of it as a global language setting, say English. You might have an attribute in a class, say a 'Talk' class. You create an instance of 'Talk' and set your title.

00:05:27.280 For instance, if you set the title to 'Building Generic Software,' you would expect to retrieve that title later. However, if we switch the locale to something like Japanese and fetch the title, we get nil because, at this moment, there's no translation for that attribute in Japanese. To resolve this, we would need to add that translation, and subsequently, upon switching back to English, the correct English translation would be retrieved.

00:05:42.570 And that’s essentially how it works. In defining these translations, there are various Ruby gems constantly emerging that manage translated attributes. They each have their conventions, but generally, they feature a common interface: you call some class method, often referred to as 'translates,' and pass in your attribute names. This dynamically creates translated attributes for you to use.
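To make the behavior described above concrete, here is a minimal sketch of a 'translates' class method. It is not taken from any particular gem: 'CurrentLocale' is a hypothetical stand-in for I18n.locale (so the example has no gem dependencies), 'Translates' is a hypothetical mixin, and the translations live in a plain in-memory hash rather than a database.

```ruby
# Hypothetical stand-in for I18n.locale, so the sketch needs no gems.
module CurrentLocale
  class << self
    attr_accessor :value
  end
  self.value = :en
end

# Hypothetical mixin that adds a `translates` class method.
module Translates
  def translates(*attributes)
    attributes.each do |attribute|
      # Reader: return the value stored for the current locale (or nil).
      define_method(attribute) do
        translations_for(attribute)[CurrentLocale.value]
      end

      # Writer: store the value under the current locale.
      define_method("#{attribute}=") do |value|
        translations_for(attribute)[CurrentLocale.value] = value
      end
    end

    # In-memory storage: one hash of locale => value per attribute.
    define_method(:translations_for) do |attribute|
      @translations ||= Hash.new { |hash, key| hash[key] = {} }
      @translations[attribute]
    end
    private :translations_for
  end
end

class Talk
  extend Translates
  translates :title
end

talk = Talk.new
talk.title = "Building Generic Software"

CurrentLocale.value = :ja
talk.title                        # nil: nothing stored for :ja yet
talk.title = "(title in Japanese)" # placeholder text for illustration

CurrentLocale.value = :en
talk.title                        # the English value is still there
```

The point of the sketch is only the interface: one class-level call, and readers and writers whose result depends on the current locale.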

00:06:00.740 The complexity arises, however, when we consider how to store these translated attributes. Here is where we start to explore the idea of generic software moving forward. So, what do these storage patterns look like? The simplest approach is the idea of translatable columns, which is pretty straightforward. Imagine a 'Comment' model with a corresponding 'comments' table.

00:06:24.470 If you call the ‘content’ method, by default your ORM, whether it's Sequel or Active Record, won’t understand what you’re referencing because there’s no column named 'content' in your table. Instead, you would create a separate column for every language you want to support. For example, you’d have 'content_en' for English, 'content_fr' for French, and so forth. You'd then dynamically define a method that maps from the 'content' call to the respective translated column based on the current language.
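A rough sketch of the translatable-columns idea, under some assumptions: plain attr_accessor methods stand in for the ORM's column accessors, 'CurrentLocale' is a hypothetical substitute for I18n.locale, and 'TranslatableColumns' is an illustrative name rather than a real gem's module.

```ruby
# Hypothetical stand-in for I18n.locale.
module CurrentLocale
  class << self
    attr_accessor :value
  end
  self.value = :en
end

# Hypothetical mixin: maps `content` to `content_<locale>`.
module TranslatableColumns
  def translates(*attributes)
    attributes.each do |attribute|
      # Reader delegates to the column for the current locale.
      define_method(attribute) do
        public_send("#{attribute}_#{CurrentLocale.value}")
      end

      # Writer delegates to the matching column setter.
      define_method("#{attribute}=") do |value|
        public_send("#{attribute}_#{CurrentLocale.value}=", value)
      end
    end
  end
end

class Comment
  extend TranslatableColumns

  # Stand-ins for the columns an ORM would expose on the comments table.
  attr_accessor :content_en, :content_fr

  translates :content
end

comment = Comment.new
comment.content = "Hello"      # written to content_en

CurrentLocale.value = :fr
comment.content = "Bonjour"    # written to content_fr

CurrentLocale.value = :en
```

Switching to a locale with no column (for example :de) raises NoMethodError here, which mirrors the need for a migration whenever a language is added.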

00:06:43.750 This approach is relatively transparent and works well, but it has its downsides such as requiring migrations every time a new language is added. Another more scalable approach involves using translation tables. Instead of storing translations in the model table, you create a separate translation table.

00:07:03.480 In this model, each translation is stored on this separate table, with the language represented as a new column rather than a suffix on a model column name. You can set up a foreign key reference back to your original comment table, and then for each translation, you have the relevant column name to store those translations.

00:07:22.720 You would create an association, and when you want to fetch a translation, you'd access this association and retrieve the relevant translation based on the current language. There are other storage patterns as well. For instance, you could use JSON columns in Postgres or in the more recent versions of MySQL, which also support JSON.

00:07:40.440 By doing this, you can place all the translations into a single column where the keys are the language codes and the values represent the translations. These storage patterns form a foundational layer, and different gems implement various access patterns on top of them.
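The translation-table and JSON-column patterns just described can be sketched side by side. This is an illustration with assumed names: an array of structs plays the role of the translation table, a JSON string plays the role of the JSON column, and the helper methods 'content_from_table' and 'content_from_json' are hypothetical.

```ruby
require "json"

# Translation table: one row per (record, locale), with a foreign key
# back to the comments table and the locale stored as a column value.
CommentTranslation = Struct.new(:comment_id, :locale, :content)

TRANSLATION_ROWS = [
  CommentTranslation.new(1, "en", "Hello"),
  CommentTranslation.new(1, "fr", "Bonjour"),
].freeze

# Look up the row for this comment and locale, if any.
def content_from_table(comment_id, locale)
  row = TRANSLATION_ROWS.find do |r|
    r.comment_id == comment_id && r.locale == locale.to_s
  end
  row && row.content
end

# JSON column: a single column holding every locale for one attribute,
# with language codes as keys and translations as values.
JSON_COLUMN = JSON.generate("en" => "Hello", "fr" => "Bonjour")

# Parse the column value and pick the entry for the locale.
def content_from_json(column_value, locale)
  JSON.parse(column_value)[locale.to_s]
end

content_from_table(1, :fr)
content_from_json(JSON_COLUMN, :fr)
```

Both readers answer the same question ("what is the content in this locale?") while the data is laid out very differently, which is what makes storage a natural seam for swapping implementations.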

00:08:02.660 The most common access pattern is fallbacks, especially if you've worked with I18n. This approach enables you to set a primary language and a fallback language, so if you try to fetch a translation in a regional dialect like Canadian English (en-CA) without a specific entry, you could fall back to the more general English translation.

00:08:20.950 Implementing fallbacks involves looping through the locales you’re trying to access, trying one language at a time, and returning the first found translation. It's quite a simple logic. The important part about design in this context is being able to support various patterns while not coupling them together too heavily.
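The fallback loop described above is short enough to sketch directly. The 'FALLBACKS' map and the method name are assumptions made for illustration; real gems typically take the fallback chain from I18n's configuration instead.

```ruby
# Hypothetical fallback map: for each locale, the locales to try in order.
FALLBACKS = {
  :"en-CA" => [:"en-CA", :en],
  :en      => [:en],
}.freeze

# Try each candidate locale in turn and return the first translation found.
def translation_with_fallbacks(translations, locale)
  FALLBACKS.fetch(locale, [locale]).each do |candidate|
    value = translations[candidate]
    return value unless value.nil?
  end
  nil
end

translations = { en: "Hello" }
translation_with_fallbacks(translations, :"en-CA") # falls back to :en
```

Note that the method only needs something that answers `[]` with a locale; it knows nothing about where the translations are stored.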

00:08:42.540 You could also implement other features like dirty tracking, which allows you to track changes in Active Record models. Unfortunately, when managing translated attributes, your ORM may not automatically support this because it doesn’t recognize these custom attributes without additional magic.

00:08:58.990 So, if I set a title to 'Building Generic Software' and change it to 'Building Specific Software,' how do you track those changes? There are additional challenges when you want to perform lookups based on translations in a different language. These aspects need special handling during implementation.

00:09:23.200 Now, let’s delve into the most intriguing part: how different gems can combine various functionalities using control flow. Typically, I illustrate this with the ‘translates’ method. This method defines the accessor for translated attributes.

00:09:49.440 The process begins by splatting the attributes passed into it and iterating through each attribute. You would call a method to define an accessor for each attribute. This would generally involve defining methods for getting and setting the translation values, but the implementation can vary depending on which gem you’re utilizing.

00:10:14.610 The internal details may change, but in essence, this defines how you store and retrieve translations. Now, as a concrete example, if we have a 'Talk' class and the attribute we're interested in is 'title,' the process will result in methods that read from storage and write to storage.

00:10:32.760 The key point here is identifying what these methods do. This explanation is adapted from a gem called 'Traco,' which is designed to handle translatable columns. Here’s how it typically works. First, we work out the fallbacks and attempt to access the value for each one. Since this gem uses the translatable-columns storage pattern, the reader builds the name of the translated column for each locale and reads from that column.

00:10:50.720 So, as you can see, this process includes the handling of various situations seamlessly. If the translation exists, great! If not, it can continue through the list of fallbacks until it finds a suitable resource.
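As a sketch of the kind of reader described above (written for this transcript, not copied from any gem), the following combines the fallback loop and the column lookup in a single method. 'CurrentLocale', the 'FALLBACKS' constant, and the column names are assumptions.

```ruby
# Hypothetical stand-in for I18n.locale.
module CurrentLocale
  class << self
    attr_accessor :value
  end
  self.value = :en
end

class Talk
  # Hypothetical fallback chain: Japanese falls back to English.
  FALLBACKS = { ja: [:ja, :en], en: [:en] }.freeze

  # Stand-ins for ORM column accessors.
  attr_accessor :title_en, :title_ja

  def self.translates(*attributes)
    attributes.each do |attribute|
      define_method(attribute) do
        # Fallback logic and column-name logic live in the same method,
        # so neither can be replaced without touching the other.
        locales = FALLBACKS.fetch(CurrentLocale.value, [CurrentLocale.value])
        locales.each do |locale|
          value = public_send("#{attribute}_#{locale}")
          return value unless value.nil?
        end
        nil
      end

      define_method("#{attribute}=") do |value|
        public_send("#{attribute}_#{CurrentLocale.value}=", value)
      end
    end
  end

  translates :title
end

talk = Talk.new
talk.title = "Building Generic Software"

CurrentLocale.value = :ja
talk.title # no title_ja yet, so the reader falls back to title_en

CurrentLocale.value = :en
```

This works, but the storage pattern is hard-coded into the accessor, which is exactly the coupling the rest of the talk sets out to remove.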

00:11:07.000 Now, we discussed a problem earlier and how we can address it. The issue in question involves a gem called Globalize, which you might know—it’s the most recognized gem for managing translations in Rails applications. The repository has been active, and one of the authors, Thomas, posted a thought-provoking issue about two and a half years ago.

00:11:28.600 At that time, the Rails community began to embrace Postgres, which introduced features like JSON storage and JSONB. Thomas considered whether it might be feasible to allow a method to switch from a storage table-based approach to JSON storage or JSONB storage, indicating an interest in making the setup more flexible.

00:11:46.460 This idea provoked my interest as I began to ponder what steps would be necessary to make a system pluggable. Thus, the critical question for us is: how do we achieve flexibility in such a way that different approaches can coexist?

00:12:01.740 Going back to the theoretical model we’ve built, we need a way to allow for extensibility. The high-level application code calls a class method, our translates method, and that call determines how the translated attributes are accessed. Initially, everything about this architecture is built inside that one method.

00:12:17.760 While embedding specific storage references, we aim to peel the functionalities apart so that we can replace them with alternate implementations as needed.

00:12:34.600 Instead of directly accessing an inner method, we elevate this responsibility to the application layer. This notion of 'inversion of control' is crucial here, where we want the high-level code to dictate how functionalities tie together.

00:12:48.700 In designing reusable classes, as per the paper I mentioned earlier from 1988, this idea allows frameworks to extend their base implementations while promoting a flexible and collaborative design approach.

00:13:06.950 The concept is catchy, often referred to as the 'Hollywood principle'—don’t call us; we’ll call you. You design your plugins or modules, and when the main program needs behavior, it calls back to the plugins, allowing for divergence in functionality without coupling all the pieces tightly.

00:13:23.510 We can now proceed to build a second version of the API. This new approach can accommodate multiple types of storage while ensuring the access methods remain flexible. The 'translates' method can take additional parameters, such as a backend keyword argument representing the storage strategy.

00:13:43.300 As we implement these changes, it’s essential to ensure that whatever backend class you pass also adheres to specific expected behaviors or protocols. This offers a superior level of flexibility, as now you can implement a backend that accommodates existing standards and allows others to follow suit.
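One way this second version could look is sketched below. The protocol assumed here (a backend class instantiated with the model and attribute, responding to read(locale) and write(locale, value)) and the names 'HashBackend' and 'ColumnBackend' are illustrative choices, not a description of any gem's actual API.

```ruby
# Hypothetical stand-in for I18n.locale.
module CurrentLocale
  class << self
    attr_accessor :value
  end
  self.value = :en
end

# Backend 1: keeps translations in an in-memory hash.
class HashBackend
  def initialize(_model, _attribute)
    @translations = {}
  end

  def read(locale)
    @translations[locale]
  end

  def write(locale, value)
    @translations[locale] = value
  end
end

# Backend 2: reads and writes per-locale columns on the model.
class ColumnBackend
  def initialize(model, attribute)
    @model = model
    @attribute = attribute
  end

  def read(locale)
    @model.public_send("#{@attribute}_#{locale}")
  end

  def write(locale, value)
    @model.public_send("#{@attribute}_#{locale}=", value)
  end
end

module Translates
  # The application chooses the storage strategy via the backend keyword.
  def translates(*attributes, backend:)
    attributes.each do |attribute|
      # One backend instance per model instance and attribute,
      # memoized in an instance variable.
      define_method("#{attribute}_backend") do
        @backends ||= {}
        @backends[attribute] ||= backend.new(self, attribute)
      end

      define_method(attribute) do
        public_send("#{attribute}_backend").read(CurrentLocale.value)
      end

      define_method("#{attribute}=") do |value|
        public_send("#{attribute}_backend").write(CurrentLocale.value, value)
      end
    end
  end
end

class Talk
  extend Translates
  translates :title, backend: HashBackend
end

class Comment
  extend Translates
  attr_accessor :content_en, :content_fr # stand-ins for ORM columns
  translates :content, backend: ColumnBackend
end
```

The accessors defined by 'translates' are identical for both models; only the object they delegate to differs, so a new storage strategy means a new backend class rather than a change to the framework.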

00:14:02.260 Once this method is defined, you would manage the attributes accordingly, going through each input while keeping track of instance variables. These backend classes should avoid unnecessary complexities, merely defining how the data reads and writes.

00:14:20.580 After defining the backend method, you can create a way to dynamically access your storage strategy without hard coding the references. This flexibility is crucial for achieving the goals of generic software.

00:14:35.030 As I mentioned earlier, if we hypothetically replace the original backend functions with a translation table backend, we must remember that the core mechanism of accessing translations remains unchanged.

00:14:55.130 The implementation must properly process the references without requiring any hardcoded definitions in the model, which enables seamless integration without necessitating reworks or dependencies.

00:15:10.370 Along the way, we determined that a 'setup model' function can be a valuable mechanism. Invoking this would provide an entry point for configuration about the nature of the backend without disturbing the core setup. Therefore, both the backend and the access patterns do not interfere with the direct usage of either.
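A hedged sketch of what such a hook might look like: here the framework calls a class-level 'setup_model' on the backend once, when 'translates' runs, and the backend uses it to add whatever the model needs. The hook's signature, the 'TableBackend' name, and the in-memory 'translations' list (standing in for an ORM association) are all assumptions for illustration.

```ruby
# Hypothetical stand-in for I18n.locale.
module CurrentLocale
  class << self
    attr_accessor :value
  end
  self.value = :en
end

# Translation-table style backend with a class-level setup hook.
class TableBackend
  # Called once per model class by the framework ("we'll call you").
  def self.setup_model(model_class, _attributes)
    model_class.class_eval do
      # Stand-in for an ORM association: an in-memory list of row hashes.
      def translations
        @translations ||= []
      end
    end
  end

  def initialize(model, attribute)
    @model = model
    @attribute = attribute
  end

  def read(locale)
    row = find_row(locale)
    row && row[@attribute]
  end

  def write(locale, value)
    row = find_row(locale)
    unless row
      row = { locale: locale }
      @model.translations << row
    end
    row[@attribute] = value
  end

  private

  # Find the translation row for this locale, if one exists.
  def find_row(locale)
    @model.translations.find { |r| r[:locale] == locale }
  end
end

module Translates
  def translates(*attributes, backend:)
    # The framework invokes the hook; the model never mentions the table.
    backend.setup_model(self, attributes) if backend.respond_to?(:setup_model)

    attributes.each do |attribute|
      define_method(attribute) do
        backend.new(self, attribute).read(CurrentLocale.value)
      end

      define_method("#{attribute}=") do |value|
        backend.new(self, attribute).write(CurrentLocale.value, value)
      end
    end
  end
end

class Comment
  extend Translates
  translates :content, backend: TableBackend
end
```

Because the hook is optional (guarded by respond_to?), simple backends can skip it, while backends that need associations or extra methods get a single, well-defined place to add them.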

00:15:30.470 This research into creating a more flexible approach yields a model wherein distinctions among backend functionality and features are maintained. Extensive interactions enable each part to uphold its independent yet integrated function.

00:15:45.830 Ultimately, when fully realized, you find the backend protocols allowing for different storage methods to work harmoniously, creating an ecosystem where you can cater to various data-access needs.

00:16:07.070 The beauty lies in how different components come together without forcing external dependencies to become interconnected.

00:16:23.140 At this point, I want to reference some notable gems that adopt similar philosophies, notably 'Roda' and 'Shrine.' They both have strong modular designs where one can engage with various plugins or storage points while harnessing the Ruby ecosystem’s flexibility.

00:16:39.130 In sharing these insights, I hope to illuminate not only my own journey but also the wider potential of establishing norms around developing generic software in the Ruby community.

00:16:56.240 What sets successful frameworks apart is their pluggability. Understanding how design patterns contribute to the overall structure aids the development of a combination of reusable and extensible systems.

00:17:12.370 Looking back at the earlier discussion, we contemplated the potential pitfalls of generic software development and specified pathways to mitigation by detailed analysis of our frameworks.

00:17:28.000 This dual-axis reflection—understanding reusability and recognizing complexity—is essential in navigating contemporary architecture. We must strive for balance in these pursuits to prevent our solutions from becoming overly convoluted.

00:17:45.320 As we progress in this discussion, I envision a world where we build exacting tools that deliver generative functions initially and evolve into multifaceted solutions as enhancements and ideas come forth.

00:18:02.650 I urge you to delve into this topic with me if this resonates with you; let's collaborate to further our ecosystem by cultivating well-designed, reusable software.

00:18:28.830 Thank you all for your attention.

RubyConf 2018