Michael Fairley

Summarized using AI

Immutable Ruby

Michael Fairley • February 20, 2013 • Earth

The video presented by Michael Fairley at RubyConf AU 2013 delves into the topic of immutability in Ruby programming. Fairley highlights the prevalent issues caused by mutable state in software development and advocates for adopting immutability to make code more maintainable and testable.

Key Points Discussed:

- Definition of Mutability and Immutability:

- Mutability refers to changing the state of data, whereas immutability means that the state remains unchanged.

- Problems with Mutability:

- Fairley demonstrates several examples from Ruby on Rails applications where mutability leads to bugs and confusion, such as errors in a Person class's full name method and email verification status handling.

- Techniques for Embracing Immutability:

- Introduces value objects, which lack identity and can simplify code. Value objects behave consistently, allowing for easier updates and clear communication within the code.

- Discusses the importance of the freeze method in Ruby, which prevents changes to an object's instance variables.

- Event Sourcing Concept:

- Fairley explains event sourcing, which chronicles all changes to application state as distinct events, allowing for richer historical data queries and easier debugging processes.

- Using Persistent Data Structures:

- He recommends libraries like Hamster for creating immutable data structures that share structure with older versions, minimizing memory usage and complexity.

- Testing with Immutable Structures:

- Fairley emphasizes that testing value objects is less cumbersome and faster, as they do not depend on complex frameworks like Rails.

- Challenges and Trade-offs:

- Though immutability brings benefits, it can lead to slower performance due to increased object allocation and copying. Certain freedoms in mutable state are also lost, which might pose challenges in some frameworks and libraries.

Conclusions and Takeaways:

- Adopting immutability can greatly simplify code maintenance and testing, and it improves safety in multi-threaded environments. However, developers should be aware of the trade-offs and apply immutability judiciously, considering scenarios where mutable state may still be necessary. Fairley encourages learning functional programming concepts and languages to deepen understanding of immutability practices in Ruby.

Michael Fairley invites further discussions on the impact of immutable programming and suggests exploring resources from notable figures in programming to refine these principles.

RubyConf AU 2013: http://www.rubyconf.org.au

Most Ruby code makes heavy use of mutable state, which often contributes to long term maintenance problems. Mutability can lead to code that's difficult to understand, libraries and applications that aren't thread-safe, and tests that are slow and brittle. Immutability, on the other hand, can make code easy to reason about, safe to use in multi-threaded environments, and simple to test. Doesn't that sound nice?
This talk answers the question "why immutability?", covers the building blocks of immutable code, such as value objects and persistent data structures, as well as higher order concepts that make heavy use of immutability, such as event sourcing and pure functions, and finally discusses the tradeoffs involved in going immutable.

00:00:08.040 Hello everyone, as Tim mentioned, I am Michael Fairley. If you think I say something stupid, feel free to tweet at me with angry tweets. I work at a company called Braintree, which is based out of the US. We handle online credit card payments. Over the past four years, we've been serving thousands of businesses, including some very impressive ones that are displayed on the screen. Four months ago, we also began working with merchants based in Australia. If you're interested in discussing payments, please come find me later.
00:00:39.160 Today, I'm going to talk about immutable Ruby. Mutability refers to changing the state of data in your code, while immutability means you do not change that state. Of course, your code has to change something eventually, but applying immutability in small parts of a system can help simplify your code and make it easier to test. I will also discuss the trade-offs associated with immutability.
00:01:05.200 I will begin by discussing some problems that arise from mutability. I will show you four examples of Ruby on Rails applications and some odd scenarios that can emerge, including bugs I've encountered in production. In the first example, we have a simple Person class with a first name and a last name, along with a method called full_name that concatenates the two. When we create a person named 'John Doe', his full name is correctly set as 'John Doe'. However, if we change his first name to 'Jim', his full name remains 'John Doe', creating confusion.
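One way the stale full name in the first example can arise is memoization. The sketch below assumes that full_name caches its result in an instance variable, which the transcript does not confirm; ImmutablePerson and with_first_name are hypothetical names used only for illustration.

```ruby
# Mutable version: full_name is memoized, so it goes stale after a rename.
# (The memoization is an assumed cause; the talk's actual code is not shown.)
class Person
  attr_accessor :first_name, :last_name

  def initialize(first_name, last_name)
    @first_name = first_name
    @last_name = last_name
  end

  def full_name
    @full_name ||= "#{first_name} #{last_name}"
  end
end

john = Person.new("John", "Doe")
john.full_name            # => "John Doe"
john.first_name = "Jim"
john.full_name            # => "John Doe" (stale cached value)

# Immutable variant: no setters, so a precomputed value cannot go stale.
class ImmutablePerson
  attr_reader :first_name, :last_name, :full_name

  def initialize(first_name, last_name)
    @first_name = first_name
    @last_name = last_name
    @full_name = "#{first_name} #{last_name}"
    freeze
  end

  # A "change" builds a new object instead of editing this one.
  def with_first_name(new_first_name)
    ImmutablePerson.new(new_first_name, last_name)
  end
end

jim = ImmutablePerson.new("John", "Doe").with_first_name("Jim")
jim.full_name             # => "Jim Doe"
```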
00:01:39.160 In the second example, a user can have multiple email addresses, which can be either verified or unverified. When a user signs up, we send them a verification email that they must click to verify their address. If they change their email address later, it might still show as verified. In the third example, there’s a person class with validation that requires the name to be at least four characters long. If we create a person named 'Bob' and he has a validation error due to a short name, and then change it to 'Robert', the errors on the Active Record object may still indicate that his name is too short, leading to confusion.
00:03:10.640 This next situation is more convoluted and quite confusing. We have a hash and an array, and we use the array as a key in the hash. At first the hash behaves as expected. Once we mutate the array, however, we can no longer find the value, whether we look it up with a literal or with the array's old contents. And when we assign a value using the mutated array, the key appears twice in the hash, which complicates things further.
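The hash and array scenario can be reproduced in plain Ruby. This is a minimal sketch with made-up variable names; the cause is that the hash files each entry under the key's hash code at insertion time and does not notice later mutation.

```ruby
key = [1, 2]
lookup = { key => "original" }
lookup[[1, 2]]          # => "original"

key << 3                # mutate the array while it is in use as a key

lookup[[1, 2]]          # => nil: the stored key no longer equals [1, 2]
lookup[key]             # usually nil as well: the entry is still filed
                        # under the hash code of the old contents

lookup.rehash           # re-index every entry by its current contents
lookup[[1, 2, 3]]       # => "original"

# Defensive habit: use a frozen copy as the key so it cannot change later.
safe_key = [1, 2].freeze
safe_lookup = { safe_key => "original" }
safe_lookup[[1, 2]]     # => "original"
```

Hash#rehash repairs the index after the fact, but freezing keys avoids the problem in the first place.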
00:03:52.360 Now, let's discuss techniques, tools, and stylistic choices you can employ to take advantage of immutability and avoid the issues presented. Firstly, a value or value object is an object that contains attributes but has no identity. These are prevalent in programming; for example, two numbers are considered equal if they hold the same value, regardless of whether they are separate instances in the Ruby virtual machine. This same logic applies to dates and times, where their equality is defined by the internal data, not by any identity.
00:04:39.080 In Ruby, Time can give you the UTC version of a time object. The variant that behaves like a value is getutc, which returns a new instance rather than altering the original object; Time#utc itself converts the receiver in place. Strings in Ruby are also mutable, which can create confusion because text is naturally a value. For example, if we have a string that contains the text of the King James Bible and we modify that string, we have not changed the King James Bible; we simply have a different text.
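A short sketch of how this looks in core Ruby: Time#getutc returns a new object, Time#utc converts the receiver in place, and strings are mutable unless frozen. The date and variable names are illustrative only.

```ruby
local = Time.new(2013, 2, 20, 12, 0, 0, "+11:00")

as_utc = local.getutc       # new Time object; `local` is untouched
local.utc_offset            # => 39600
as_utc.utc_offset           # => 0
as_utc == local             # => true  (same instant)
as_utc.equal?(local)        # => false (different objects)

local.utc                   # converts the receiver itself
local.utc?                  # => true

# Strings are mutable: every reference sees an in-place change.
title = String.new("King James")
other_reference = title
title << " Bible"
other_reference             # => "King James Bible"

# A frozen string behaves like a value: operations return new strings.
frozen_title = "King James".freeze
frozen_title.upcase         # => "KING JAMES"
frozen_title                # => "King James"
```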
00:05:10.039 You can compose values out of other values to create domain objects, such as money. For instance, 10 US dollars is different from 10 Australian dollars even though the numeric amounts are the same. There’s a remarkable gem called 'values' for working with value objects in Ruby, which I will show you examples of, demonstrating how it can help clean up your code. Value objects work similarly to structs; calling Value.new allows you to pass attribute names and get a new class in return.
00:06:07.639 When you construct an instance of this class, it retains the expected data. Unlike structs, attempts to mutate this data will result in an error, since value objects have no mutating methods. You can, however, add methods for formatting, such as to_s, or a multiplication method for scaling. For example, if you have a user class with both a billing address and a shipping address, using a checkbox to signify using the billing address as the shipping address leads to cumbersome code.
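To make the idea concrete without depending on any gem, here is a hand-rolled value object in plain Ruby. It is a sketch of the concept, not the API of the 'values' gem; the Money class, its cents and currency attributes, and the formatting are all assumptions chosen for illustration.

```ruby
# A value object: equality comes from its data, and it cannot be mutated.
class Money
  attr_reader :cents, :currency

  def initialize(cents, currency)
    @cents = cents
    @currency = currency
    freeze                      # no instance variable can change after this
  end

  def ==(other)
    other.is_a?(Money) && cents == other.cents && currency == other.currency
  end
  alias eql? ==

  # Equal values must hash equally so they work as hash keys.
  def hash
    [cents, currency].hash
  end

  # Scaling returns a new value rather than changing this one.
  def *(factor)
    Money.new(cents * factor, currency)
  end

  def to_s
    format("%d.%02d %s", cents / 100, cents % 100, currency)
  end
end

ten_usd = Money.new(1000, "USD")
ten_aud = Money.new(1000, "AUD")

ten_usd == Money.new(1000, "USD")   # => true  (same data, different objects)
ten_usd == ten_aud                  # => false (same amount, different currency)
(ten_usd * 3).to_s                  # => "30.00 USD"
ten_usd.to_s                        # => "10.00 USD" (unchanged by the scaling)
```

Because there are no setters and the object is frozen, a call such as ten_usd.cents = 1 fails with NoMethodError.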
00:06:36.919 Active Record has a method called composed_of for working with value objects embedded inside Active Record objects, which simplifies assigning a shipping address from a billing address without needing to copy it over. Value objects are incredibly useful for facilitating clear communication throughout your system and allow you to modify your code more easily.
00:07:11.199 Testing value objects is straightforward as well. You simply need to create a value object, call the relevant method, and assert against the result. Importantly, you can run these tests without requiring complex dependencies like Rails or factory libraries, leading to fast execution times. For instance, let's say we want to calculate shipping costs based on a user’s location. Instead of adding a calculate_shipping_price method directly to the user, I propose utilizing a shipping service that operates on values.
00:08:20.000 While this may initially seem convoluted, it simplifies testing, as such methods do not depend on Rails or a database. They also help you remain resilient to changes in requirements. For example, if we later introduce support for users with multiple addresses, or start shipping goods to businesses, having a structure based solely on values allows for easier modifications. When working with APIs, you often send values back and forth, reinforcing the practice of working with values as opposed to objects.
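A service that operates only on values might look like the following sketch. The Address struct, the ShippingService module, and the flat rates are hypothetical; the point is that nothing here needs Rails, a database, or a User object.

```ruby
# A frozen struct used as a simple address value (hypothetical fields).
Address = Struct.new(:country, :postcode) do
  def initialize(*)
    super
    freeze
  end
end

# Pure function over values: same input, same output, no side effects.
module ShippingService
  DOMESTIC_COUNTRY = "AU"       # assumed home country for the example
  DOMESTIC_CENTS = 500          # made-up flat rates
  INTERNATIONAL_CENTS = 2000

  def self.price_cents(address)
    address.country == DOMESTIC_COUNTRY ? DOMESTIC_CENTS : INTERNATIONAL_CENTS
  end
end

ShippingService.price_cents(Address.new("AU", "3000"))    # => 500
ShippingService.price_cents(Address.new("US", "60601"))   # => 2000
```

A test for this needs only an Address and an assertion on the returned number, and the same service works unchanged whether the address belongs to a user, a business, or an API payload.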
00:09:44.760 Immutability helps clarify assumptions in your code. It's easy to write code that assumes an object or an array will remain unchanged. However, returning six months later to modify that code can lead to confusion if you unintentionally alter something that was meant to remain static. To make these assumptions explicit, Ruby provides a freeze method on every object, which prevents changes to that object's instance variables. Freezing is shallow, though: a frozen array can no longer have elements added, removed, or replaced, but the objects it contains can still be mutated unless they are frozen too.
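The shallow behavior of freeze is easy to see in a few lines. This sketch uses only core Ruby; the variable names are illustrative.

```ruby
names = ["alice".dup, "bob".dup].freeze

# Structural changes to the frozen array are rejected.
error_class = begin
  names << "carol"
rescue => error
  error.class               # FrozenError on Ruby 2.5+, a RuntimeError subclass
end

# The elements are separate objects and were never frozen.
names.first << "!"
names                       # => ["alice!", "bob"]

# A deeper freeze has to visit the elements as well.
deep = ["alice".dup, "bob".dup].each(&:freeze).freeze
deep.all?(&:frozen?)        # => true
```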
00:10:54.000 At Braintree, we often deal with database records that we want to keep immutable. For example, we record transaction amounts that cannot be altered afterward. To enforce immutability, we have an immutable module that can be mixed into our Active Record classes, preventing updates when attempted. We can also enforce such constraints at the database level using PostgreSQL or MySQL to restrict updates or deletions on specific tables.
00:12:07.120 Event sourcing is another related concept that captures all changes to your application state as a sequence of events. This can either drive the change or serve as logs of what happened. For example, consider a bank account: you can track deposits and withdrawals as distinct events, and from these, derive the current balance. The derived state here is the balance, while the details of each transaction make up the observed state. Importantly, this approach allows for interesting inquiries.
00:12:48.640 You can query historical data, like determining a user's balance from a week ago, simply by reviewing the logged events up to that point. If a user returns a book, instead of removing the initial charge, we can insert a new event that negates the previous action, thereby preserving a complete history. Furthermore, events can be replayed; if a bug arises in production, developers can replay the event log to analyze the exact state of the application at the time of the error.
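The bank account example can be sketched in a few lines of plain Ruby. This is an illustration under assumed names (Event, Account, record, balance), not code from the talk: the event log is append-only, the balance is derived from it, and a refund is a new compensating event rather than a deletion.

```ruby
Event = Struct.new(:type, :amount_cents, :at)

class Account
  attr_reader :events

  def initialize(events = [])
    @events = events.dup.freeze
    freeze
  end

  # Recording returns a new Account; the existing log is never edited.
  def record(type, amount_cents, at)
    Account.new(events + [Event.new(type, amount_cents, at).freeze])
  end

  # Derived state: replay the events, optionally only up to a point in time.
  def balance(as_of = nil)
    relevant = as_of ? events.select { |event| event.at <= as_of } : events
    relevant.sum do |event|
      event.type == :deposit ? event.amount_cents : -event.amount_cents
    end
  end
end

day = 24 * 60 * 60
start = Time.utc(2013, 2, 1)

account = Account.new
  .record(:deposit, 10_000, start)
  .record(:withdrawal, 2_500, start + 2 * day)   # book purchase
  .record(:deposit, 2_500, start + 9 * day)      # refund as a compensating event

account.balance                     # => 10000
account.balance(start + 7 * day)    # => 7500 (balance a week in, before the refund)
account.events.size                 # => 3   (full history preserved)
```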
00:14:38.880 Git is an everyday example of an event-sourced system. In Git, each commit represents an event, and the state at any given time is the working directory. You can retrieve past states by checking out a specific SHA, effectively traveling through time. Git allows for event replays with functionalities like rebase, which re-applies a series of commits on a different branch. I once worked on a family history project where we used event sourcing for managing a family tree stored primarily in PostgreSQL. Given that querying graphs in PostgreSQL isn't efficient, we recorded the event log there while keeping the state of the tree referenced in a NoSQL graph database.
00:16:54.120 This dual approach provided us with data durability; even if the graph database failed, we retained a reliable source in PostgreSQL. In cases of vandalism or errors in the family tree, we could easily identify and revert changes by examining the event log. Following a significant learning experience with the graph system, we later chose to replay the logged events for rebuilding data back into PostgreSQL.
00:18:24.800 Beyond these advantages, event sourcing enables features like undo functionality similar to that found in applications like Microsoft Word. Each action you take can generate an event, so when you hit 'Command Z', the system restores the most recent state via its revert command. If you’re working on rich client applications, any feature that requires undo functionality might benefit from employing an event-sourced structure.
00:19:36.760 Regarding the implementation of event sourcing, you can also maintain derived state in memory. In environments where you're unsure a database will retain data, but you have a permanent data source, you could work with memory constructs like Redis. If you utilize mutable state, you break established software rules that assume data remains fixed. Notably, if you never alter data, the hassles of cache invalidation are significantly reduced.
00:20:42.480 This aspect connects with database normalization, which aims to isolate data to prevent duplication. However, with immutability, you are free from the obligation to normalize since updates will no longer apply to multiple locations. In terms of concurrency, a majority of threading safety issues arise from shared mutable state. To counteract this, you could implement private mutable states with libraries like Celluloid that utilize value objects to communicate.
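The idea of private mutable state that communicates through values can be sketched with Ruby's built-in Thread and Queue. This is not Celluloid's API; CounterActor and its message format are hypothetical, and the sketch only shows the shape of the technique: one thread owns the state, and everyone else sends it frozen messages.

```ruby
class CounterActor
  def initialize
    @inbox = Queue.new
    @total = 0                          # touched only by the worker thread
    @worker = Thread.new { run }
  end

  # Callers never touch @total; they send an immutable message instead.
  def add(amount)
    @inbox << [:add, amount].freeze
    self
  end

  # Ask for the total and wait for the reply on a private queue.
  def total
    reply = Queue.new
    @inbox << [:total, reply].freeze
    reply.pop
  end

  def stop
    @inbox << [:stop].freeze
    @worker.join
  end

  private

  # Messages are handled one at a time, so no locking is needed.
  def run
    loop do
      type, payload = @inbox.pop
      case type
      when :add then @total += payload
      when :total then payload << @total
      when :stop then break
      end
    end
  end
end

counter = CounterActor.new
threads = 4.times.map { Thread.new { 1000.times { counter.add(1) } } }
threads.each(&:join)
counter.total                           # => 4000
counter.stop
```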
00:22:25.110 Of course, there are trade-offs associated with adopting immutability—one being speed. Immutability typically leads to increased object allocation and copying, which can slow performance. Thus, we might need to use mutable state in scenarios where optimization is necessary. The fastest performance often lies outside the realm of aesthetically pleasing code.
00:23:34.560 You also lose certain freedoms and might face limitations based on library or framework assumptions involving mutable state. Ruby poses a challenge for immutability since it allows constant reassignment, lacking strict final variables found in other languages. The freeze method is Ruby’s core mechanism for achieving immutability. Additionally, deletion qualifies as a form of mutation. While tweets on Twitter may seem immutable, they can indeed be deleted, impacting cache and validation procedures.
00:24:50.840 Although we are a bit ahead of schedule, if you're interested in exploring further, I encourage you to learn one of three functional languages that emphasize immutability. These languages can make changing state challenging, but the concepts you gain from them will enhance your Ruby development. Rich Hickey, the creator of Clojure, speaks passionately about the dangers of mutable state, and I've included links to several of his insightful talks.
00:26:31.399 Some essential texts on value objects include 'Domain Driven Design', which discusses in depth how value objects interact with mutable objects, as well as the C2 Wiki debates among programming pioneers like Martin Fowler and Kent Beck about value object design. Additionally, Gary Bernhardt’s concept of the functional core with imperative shell can provide a valuable perspective as it emphasizes composing reliable, immutable methods and classes while integrating them with imperative code.
00:27:43.040 In Ruby, we encounter mutable data structures like hashes, arrays, and sets. However, at times, you may desire immutable structures—this is where persistent data structures come into play. When modifying a persistent structure, instead of changing the version you hold, you receive a new representation of that state. Libraries like Hamster provide a simple way to achieve this functionality.
00:29:01.320 With Hamster, for instance, if we create a set of symbols A, B, and C, we can add D to it and receive a new set containing all four elements while the original remains unchanged. This persistence applies to hashes as well. Updating or adding a value results in a new hash with the change, leaving the original intact. You can modify vectors similarly, ensuring full integrity of past versions.
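The same "update returns a new collection" behavior can be approximated with frozen core collections. The sketch below deliberately uses plain Ruby rather than Hamster's API, so it copies data instead of sharing structure; it only illustrates the interface described above.

```ruby
require "set"

letters = Set[:a, :b, :c].freeze
more_letters = letters | [:d]       # new set; `letters` is unchanged
letters.size                        # => 3
more_letters.include?(:d)           # => true

prices = { apple: 100 }.freeze
updated = prices.merge(pear: 250)   # new hash with the extra entry
prices.key?(:pear)                  # => false
updated[:pear]                      # => 250

numbers = [1, 2, 3].freeze
replaced = numbers.dup.tap { |copy| copy[1] = 20 }
numbers                             # => [1, 2, 3]
replaced                            # => [1, 20, 3]
```

Each update here copies the whole collection, which is the cost that persistent data structures are designed to avoid.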
00:30:18.080 In conclusion, persistent data structures can be efficiently implemented, sharing most of their structure with older versions to minimize copying overhead. As a demonstration, consider a vector of numbers where the underlying data structure remains intact unless modified, ensuring seamless transitions and optimized memory usage.
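Structural sharing is easiest to see with a singly linked list, which is a much simpler structure than the vectors a library like Hamster provides. In this sketch, Node and cons are hypothetical names; adding an element allocates one node and reuses the entire existing list.

```ruby
class Node
  attr_reader :value, :rest

  def initialize(value, rest = nil)
    @value = value
    @rest = rest
    freeze
  end

  # Adding to the front allocates one node and points it at the old list.
  def cons(new_value)
    Node.new(new_value, self)
  end

  def to_a
    items = []
    node = self
    while node
      items << node.value
      node = node.rest
    end
    items
  end
end

base = Node.new(2, Node.new(3))
extended = base.cons(1)

extended.to_a                 # => [1, 2, 3]
base.to_a                     # => [2, 3] (old version still intact)
extended.rest.equal?(base)    # => true   (the tail is shared, not copied)
```

Sharing is safe only because the nodes are frozen: no holder of the old version can change what the new version sees.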
00:31:16.799 Thank you!
00:31:40.079 I am open to questions. One person asked if I could repeat their question and clarify how we transitioned the family tree data back into PostgreSQL.
00:32:10.960 To clarify, once we developed features allowing users to modify family tree records, we created event objects that executed their actions to modify the graph database, while also maintaining their states in PostgreSQL. When we decided to migrate back to PostgreSQL, we updated the code so that executing these event objects would affect a specific PostgreSQL table instead.
00:32:52.120 He also inquired about the database we used, which was Neo4j. Most of our challenges emerged from our handling of it, and yes, I do have a preference for Scala as a functional programming language, but it is easy to slip into object-oriented paradigms.
00:33:17.960 Towards the beginning of the talk, someone pointed out what I meant when I stated that values have no identity, illustrating that they indeed have an intrinsic identity expressed through their data. I appreciate that distinction—value objects are designed to determine their identity based on their content rather than external references.
00:33:54.879 In relation to integration with Active Record, a question was raised about the techniques available to facilitate interaction between these concepts. Specifically, how event sourcing employs value objects as its records, and how such systems maintain their data.
00:34:18.600 I noted that value objects hold intrinsic value within their structure, and despite the lack of methods to manipulate them, it is still feasible to maintain encapsulated state through referenced data. Finally, I was asked about experiences with mutable state within applications and measures taken to minimize its impact.
00:35:20.679 Here at Braintree, we are actively working on mutating parts of our objects, segregating immutable and mutable fields. This approach reinforces the importance of preserving core data integrity while allowing for necessary changes.
00:36:23.180 To conclude, difficulties with deep value object graphs arose, specifically in making changes to references. Since every modification requires traversing the hierarchy and copying states, maintaining a deeply nested structure could prove cumbersome.
00:37:07.760 Yet, in applications relying heavily on mutable state, considerations and designs must convey ease of modification without relinquishing value integrity.
00:37:30.800 Thank you all for your time! I'm delighted to engage in further discussion.