Changing the Unchangeable: The Hows and Whys of Immutable Data Structures

by Brad Urani

The video titled 'Changing the Unchangeable: The Hows and Whys of Immutable Data Structures' by Brad Urani delves into the concept of immutable data structures, primarily in the context of functional programming and Ruby. The speaker explains how immutable structures can enhance code integrity and prevent bugs by isolating state changes. Key highlights of the talk include:

Introducing Immutable Data Structures: Urani emphasizes the importance of immutable data structures in functional programming and their advantages in Ruby, despite Ruby’s object-oriented nature.
Methods of Immutability in Ruby: The use of Ruby's .freeze method to create immutable structures and how it aids in bug prevention.
Persistent vs. Immutable Structures: The distinction between immutable and persistent data structures, emphasizing that while immutable structures don't change, persistent structures keep previous versions upon modification.
Using Hamster Gem: The introduction of the Hamster gem that provides persistent immutable data structures, showing how modifications create new versions without altering the original list.
Linked Lists: The speaker illustrates how linked lists can help achieve immutability while maintaining efficiency by sharing parts of the data structure.
Real-World Applications: Discussions about persistent data structures are connected to everyday programming scenarios, such as Git's efficient version control and the management of state in frontend frameworks like React.
Game Development Example: Urani presents a simple game project called Boxers, illustrating how persistent structures can optimize game logic by recording history and managing the state.
Concluding Thoughts: The talk concludes by acknowledging the broader implications of utilizing persistent data structures, including their role in enhancing debugging, software performance, and managing state transitions efficiently. Urani encourages further exploration of techniques and libraries that leverage these programming concepts.

Overall, Urani's presentation illustrates the power of immutable and persistent data structures, offering both theoretical and practical insights for software developers interested in functional programming paradigms in Ruby.

00:00:15.160 Welcome! How's everyone's morning? Good? Cool! My name is Brad Urani, and this is a talk about immutable data structures.

00:00:21.380 Immutable data structures are an ideal concept from functional programming. I got interested in this topic after taking a course on Coursera called 'The Principles of Functional Programming in Scala', taught by Martin Odersky, the inventor of Scala.

00:00:32.059 It is a fantastic course, and I highly recommend it. It got me thinking about how we can implement some functional programming techniques in Ruby.

00:00:44.239 When I mention functional programming, I'm referring to techniques that isolate state changes. This involves combining several techniques and applying a certain discipline. It's about what you do and what you don’t do. Functional programming has a lot of built-in capabilities, especially with immutable data structures.

00:01:02.420 In Ruby, we can leverage some third-party gems for higher-order functions, such as passing lambdas around and partial functional application. However, we often miss features like lazy evaluation and pattern matching, which are not natively available in Ruby.

00:01:14.270 If we combine all these aspects, we can approximate functional programming. However, the reality is that pure functional programming is not the best fit for Ruby, which is fundamentally an object-oriented language.

00:01:32.509 I have attempted pure functional programming in Ruby, and while it is possible, I do not recommend it for production. Nevertheless, I find immutable data structures quite interesting because they have various intriguing use cases.

00:01:49.850 Additionally, they offer some cool 'aha!' moments. Immutable data structures are built into Ruby via the .freeze method. For instance, if we take an array and call .freeze on it, trying to change it will raise an error.

00:02:03.229 The purpose of this mechanism is to prevent specific types of bugs. When passing an array to a function, if its integrity is critical for success, using .freeze can help avoid potential issues.

00:02:21.200 In the Ruby on Rails source code, you will often see .freeze used with strings as a performance enhancement. However, .freeze does not provide the same performance boost for arrays.

00:02:35.440 This use of .freeze is primarily about avoiding bugs, which, while useful, is not the core topic of my talk. It’s important to note that while .freeze creates immutable data structures, these structures are not persistent.

00:02:57.610 A persistent data structure possesses specific qualities that I will explain today. To achieve persistent data structures, we can utilize a third-party gem called Hamster.

00:03:10.269 Hamster is an impressive gem that offers a collection of persistent immutable data structures. Let’s create a list containing the items 1, 2, and 3.

00:03:22.810 Using Hamster, I can create a hamster list with the same items. The method I will use is similar to traditional Ruby arrays, but the way they operate differs significantly.

00:03:36.430 Let me demonstrate this comparison. If I create an array in Ruby with the values 1, 2, 3, I can use the .unshift method to add an element to the front.

00:03:50.760 After using .unshift, if I print out the array, I will notice that it reflects the updated values. The reason for this is that both references in memory point to the same array, leading to changes in all references.

00:04:14.829 On the other hand, Hamster lists operate differently. If I create a hamster list and then modify it, the original list remains unchanged.

00:04:27.010 This is because when you edit a Hamster list, any references to the original list stay the same. This is the essence of immutability.

00:04:46.150 While persistent data structures are immutable, not all immutable structures are persistent. The special qualities that make data structures persistent involve careful implementation.

00:05:06.150 Let’s consider how we would create the Hamster list class ourselves. One approach could be cloning the entire list in memory whenever an update occurs, achieving immutability.

00:05:37.969 However, this method would be inefficient. In fact, PHP implements a copy-on-write mechanism where arrays can swap references without mutation, preserving the original unmodified.

00:05:54.620 An interesting aspect of PHP is that it can track references while cloning under the hood, but this isn’t the most efficient approach.

00:06:06.180 To create a performant persistent array structure, we can leverage linked lists. You may have learned about algorithms, stacks, queues, and trees, and wondered about the practical applications of linked lists.

00:06:31.020 In the case of a persistent list, when using cons to push an element onto the front, the original list remains unchanged.

00:06:51.630 Thus, we achieve immutability while sharing parts of the underlying data structure, meaning we don't need to clone anything else.

00:07:09.500 According to Wikipedia, a persistent data structure is one that preserves previous versions of itself upon modification. Such structures exemplify immutability, as their operations yield new updated versions without modifying the original.

00:07:22.220 However, there are limitations to these persistent lists. For example, they can only be singly linked, meaning they don't allow for backward traversal.

00:07:36.430 Additionally, with a singly-linked structure, random access is impossible. You can’t directly access the third element since it depends on the relative position you reference.

00:07:57.880 Let's explore an example with a linked list representing a rock band from 1980, which consisted of David, Eddie, Alex, and Mike. If we take this list and perform operations like swapping out David for Sammy, this can result in multiple references pointing to a shared structure.

00:08:35.110 This capability allows us to create a branching structure, resembling a tree. As I continue the example, I can add new elements or shift members while maintaining the integrity of the original list.

00:09:01.459 It is important to note that the arrows in this structure only point in one direction, reflecting the nature of singly-linked lists. We can also extend this structure to accommodate additional layers.

00:09:24.300 Moving on, consider the concept of implementing a persistent graph. A graph can connect nodes in multiple ways, which complicates assigning direction without modifying the underlying structure.

00:09:58.180 We might achieve a persistent directed acyclic graph (DAG) that allows for additional connections while maintaining immutability in the overall structure. This makes me think of Git as a familiar application of persistent data structures.

00:10:37.070 Git makes it easy to traverse projects due to its efficient use of persistent data structures, allowing for straightforward history tracking and version control.

00:11:05.670 Next, let's discuss a fun project called Boxers, a simple game that can elegantly illustrate the concepts we’ve been exploring.

00:11:23.030 The main objective of Boxers is for the player to maneuver a block through a grid, promoting various aspects of persistent data structures.

00:11:49.690 I created a solver for this game, which illustrates how we can systematically determine the quickest route to the goal while recording our history through the moves we make.

00:12:11.020 This takes us back to how we structure our game logic, allowing us to reuse historical states while ensuring efficiency.

00:12:38.380 The power of utilizing lazy computation in Ruby enhances the performance when searching for solutions, as we build and maintain histories in our data structures.

00:13:05.250 Research shows that using lazy evaluations helps us optimize our processes, leading to faster solutions within the persistent data structures.

00:13:31.880 In my implementation, I can delay the calculation of results until absolutely necessary, which allows for reduced computational waste.

00:14:02.050 Hamster also provides a structured way to work with common data types, including vectors and lists. This allows developers to implement functional programming principles more effectively.

00:14:47.560 I will share an interesting insight into how the persistent vector in Hamster operates, particularly regarding performance efficiency during updates.

00:15:31.030 The novel implementation of a persistent vector involves a combination of shared memory nodes and computed cache elements.

00:16:07.520 This clever structuring saves time and memory resources, thanks to the efficiency of only recreating necessary nodes.

00:16:29.220 Applications like this are not mere curiosities; they hold significant potential in real-world software development, offering strategies for managing state and history.

00:17:21.260 As I conclude, let’s recognize the broader implications in industries, particularly in the burgeoning realm of frontend frameworks like React, coupled with libraries like Redux.

00:18:10.720 As we build sophisticated applications, utilizing persistent structures enables us to manage changes seamlessly, track historical states, and leads to better performance.

00:19:01.020 Rich Hickey's contributions through the Clojure programming language illustrate persistent data structures in practice, enabling smooth transitions within application logic.

00:19:41.180 Hickey's development of a persistent database called Datomic stands as a testament to the practical applications of these principles, enabling seamless historical tracking of data.

00:20:27.870 You can explore these ideas further in Hickey's talk 'The Value of Values', which delves into the importance of history and immutable data structures in software development.

00:21:20.270 If we consider how conventional programming has evolved, we see that the shift towards immutable structures is driven by the decreasing costs of memory and storage.

00:22:05.120 By recording history rather than relying solely on mutable states, we can provide powerful tools for debugging and assessing software performance.

00:23:02.150 Finally, languages like Elm take this paradigm further by ensuring all programs inherently support time traveling concepts, thanks to their functional architecture.

00:23:59.100 You can see Elm’s time travel feature in action as I demonstrate a small game of Connect Four, showcasing how the architecture supports efficient state management.

00:24:53.020 With all this in mind, let’s appreciate how persistent data structures can pave the way for a more efficient, manageable, and robust development process.

00:25:53.030 My code, including the Boxers solver, the Hamster matrix gem, and my Ruby version of Connect Four, is available online. I encourage you to explore these resources further.

00:26:35.090 Lastly, I want to thank you all for your time, and I am happy to connect with any of you on Twitter or LinkedIn. Don't hesitate to reach out!