Keynote: Disposable Components

Chad Fowler

@chad

#software-architecture

by Chad Fowler

In his keynote speech titled "Disposable Components" at the Garden City Ruby 2014 event, Chad Fowler explores the challenges faced in software development and the need for a shift in systems architecture to create lasting, resilient software. He begins by acknowledging the prevalence of software project failures, citing the Standish Chaos Report which reveals that most projects are deemed unsuccessful or met with significant challenges. Fowler emphasizes that the legacy of software often leads to code that is hard to change, resulting in the frequent need for complete rewrites. He advocates for a paradigm shift where software is viewed as living organisms made up of small, disposable components, akin to cells in a body.

Key points discussed include:

Current State of Software Development: Fowler criticizes the tightly coupled systems commonly created, which hinder adaptability and ephemerality, leading to high failure rates.
Legacy Software: A dual view of legacy software is presented, combining its negative connotation with the potential for it to benefit future generations if designed well.
Homeostasis in Software: Drawing parallels from biology, he introduces the concept of homeostasis, where components can be replaced while maintaining overall system functionality.
Microservices and Componentization: Fowler discusses the benefits of microservices architecture, emphasizing that maintaining small, independent components leads to easier maintenance and upgrades.
Deployment Practices: He shares insights on the importance of continuous deployment and rapid release cycles while minimizing the fear associated with making changes in production environments.
Monitoring and Resiliency: Fowler highlights the need for comprehensive monitoring and promotes a culture of readiness within development teams to respond to failures effectively.

Throughout the talk, he draws on his work at 6 Wunderkinder and earlier experience, illustrating key concepts with anecdotes about enduring systems and successful methodologies. He closes with a call to adopt principles that favor small, manageable components, which make software architectures resilient, and argues that a culture of experimentation and continuous improvement is essential for the future of software development.

00:00:23.980 Yes, hello! Thank you. Hello! I am Chad.

00:00:31.990 As he said, I need no introduction, so I won't introduce myself any further. I may be the biggest non-Indian fan of India.

00:00:40.770 So this year, I’m back in Bangalore! Wow!

00:00:51.160 My Bengali is bad, and my Hindi is worse, but I'm trying my best.

00:00:57.850 My German is okay, but I mix it with Hindi sometimes. I’ll switch back now, sorry if you don’t understand Hindi. I said nothing of value, and it was all wrong, but I was trying to say my Hindi is bad because I’m learning German.

00:01:10.900 Currently, I’m working at 6 Wunderkinder on a product called Wunderlist. It is a productivity application that runs on every client you can think of. We have native clients, a backend, and millions of active users.

00:01:30.460 I’m telling you this not so that you’ll go download it, although you can do that too. I want to share the challenges I face and how I’m starting to think about systems architecture and design. That’s what I’m going to talk about today.

00:01:55.720 I’ll show you some real things we are implementing and some ideas that may sound like fantasy but hopefully will help you think about system architecture and how to build things that can last a long time.

00:02:25.660 First, I want to mention a graph from the Standish Chaos Report. I’ve left out the years and raw data because they don’t matter for this point. Each bar in the graph represents software projects.

00:02:49.150 The green bars represent successful projects, the silver or white ones are challenged projects, and the red ones are failed. Challenged means significantly over time or budget, which, to me, means failed.

00:03:02.439 So, basically we're terrible. We, all of us here, are terrible. We call ourselves engineers, but it's a disgrace. We very rarely actually launch things that work. It’s kind of sad.

00:03:30.819 Once software is launched, anecdotally, as you might experience in your own work lives, business software gets killed after about five years. So, you barely ever launch it successfully, and within about five years, you find yourself needing a big rewrite, throwing everything away and replacing it.

00:03:49.150 There is always that project to get rid of the old Java code you wrote five years ago, and in five years, you’ll be replacing your old Ruby code that didn’t work with something else.

00:04:05.139 You probably all know the term legacy software, right? I’m sure you think of it negatively—as that ugly code that doesn’t work and is brittle. You can’t change it, and you’re all afraid of it. But there’s also a positive connotation of the word legacy; it’s about leaving behind something that future generations can benefit from.

00:04:30.370 But if we rarely launch successful projects and those we do tend to die within five years, none of us are actually creating a legacy in our work. We are just creating things that get thrown away—kind of sad.

00:04:49.150 We create this legacy software that’s hard to change, which is why it ends up getting thrown away. If the software worked and could be changed to meet business needs, you wouldn’t need to perform a big rewrite.

00:05:05.139 We create large, tightly coupled systems. I don’t just mean one application, but many applications that are all tightly coupled. You have this thing talking to the database of another system, so if you change the columns to update the view of a webpage, you ruin your billing system.

00:05:26.000 This makes it hard to change. And this way of working is our default setting. If we were robots churning out code and had a preferences panel, the defaults would have us create terrible software that gets thrown away in five years.

00:05:42.000 It’s just how we work as human beings. When we write code, our instincts lead us to create systems that are tightly coupled, hard to change, ultimately thrown away, and unable to scale.

00:06:00.880 We try to implement tests, and adopt test-driven development (TDD), but we end up with test suites that take 45 minutes to run. I’m sure many teams have faced this situation. You start focusing on speeding up the test suite instead of making meaningful progress.

00:06:29.700 You might think, 'If it only fails 90% of the time, that’s okay, right?' Right now it’s taking 45 minutes, and we want to reduce that to 10 minutes. The test suite becomes a liability instead of a benefit because everything is so tightly coupled.

00:06:53.360 You're terrified to deploy! I recall the last big Java project I worked on. Once a week, we deployed with 15 people working all night, copying class files and restarting servers. Today’s systems are better, but it’s still terrifying.

00:07:14.320 You deploy code; you change it in production and are unsure what might break because testing these large integrated components is very challenging. Upgrading technology stacks is intimidating.

00:07:33.980 How many of you have been using Rails for more than three years? Anyone still have Rails 2 apps in production? That's a lot of people. Wow, that’s terrifying! I’ve recently encountered situations with Rails 2 apps in production.

00:07:52.200 Security patches were rolling out, and we applied our own versions out of fear. We’d rather hack the code than upgrade because we didn't know what would happen.

00:08:11.240 You re-implement everything yourself, wasting time and burning out on obsolete software when you should be utilizing the new patches.

00:08:29.890 This is a challenge I see Ruby has inflicted on us. I've been using Ruby for 13 years now, and we create these mountains of abstractions, burying logic in them.

00:08:42.640 In Java, it was static classes and design pattern soup, but in Ruby, it’s modules and mix-ins. All these complex ways of hiding reality from us.

00:09:00.480 When you look at the code, it becomes opaque. This complexity creates a software-specific problem. Cars built long ago, which are older than any software you run, still drive just fine.

00:09:18.720 How do they function? Our bodies, despite being abused, still work. We can survive long flights. How does that happen? It’s homeostasis.

00:09:34.500 I won’t define homeostasis beyond saying it’s essentially maintaining balance through various components that help regulate the system.

00:09:51.470 For instance, if the liver overperforms or malfunctions, another component kicks in and corrects it. Our bodies thrive because we have internal agents managing various risks.

00:10:11.100 This balancing act, known as homeostasis, is crucial. An inability to do this can lead to severe health problems.

00:10:30.010 Good news? We're all dying constantly. Our bodies have about 50 trillion cells, and they die at a rate of around 3 million per second.

00:10:50.770 Physically, you aren’t the same person you were a few years ago. Yet, you're still the same system.

00:11:06.230 You can think of software similarly; if components can be replaced, like cells, the overall system continues to survive.

00:11:24.140 Focus on constant small changes to ensure longer-term survival. This talk is about the solution: mimic the characteristics of living organisms.

00:11:37.090 One key takeaway I’ll emphasize is that small things are good. Small projects, small commitments, small classes, small teams—these are beneficial.

00:11:56.390 If we see software as an organism, what is a "cell" in that context? A cell is a tiny component. That’s a subjective term, but it’s helpful for thinking. If you make your software from tiny components, each one can die, yet the system remains.

00:12:18.300 You don’t need your code to live forever. The function of the system can take precedence over durability.

00:12:37.100 Ten years ago, we created RubyGems at RubyConf 2003 in Austin. I haven't touched it in years, yet it continues to exist, much to others' chagrin.

00:12:53.580 I’m not sure if my original code survives, but that's not important. What’s significant is that the system still operates. I asked on Twitter: 'What are some of the oldest surviving software systems still in use?'

00:13:13.160 Responses often related to UNIX systems. The enduring old systems I've seen tend to consist of components or tiny programs.

00:13:33.680 For example, 'grep' is a tiny program that performs one function. Many old systems conform to this metaphor.

00:13:53.950 In my previous work with GE, we had a system called the 'Bull,' an aging mainframe. Despite various attempts to replace it, users preferred it.

00:14:12.420 The system's longevity stemmed from its clear interfaces and tiny components, which sustained operations despite unsuccessful replacement efforts.

00:14:30.160 Now, how do I approach the task of building systems to survive long-term? One inspiration is Fred George from ThoughtWorks, who shares experiences with microservice architectures.

00:14:50.210 He highlights the importance of tiny components that perform singular tasks and can be replaced when necessary.

00:15:06.800 At 6 Wunderkinder, where I work, we adopted similar principles. Our rules aim to minimize coupling, ensuring fear-free deployments.

00:15:22.680 We strive to reduce cruft, that nasty leftover code in our systems. Our focus is on making code changes trivial and allowing ourselves the freedom to accelerate development.

00:15:40.160 I think no developer desires to work slowly. When it happens, it is usually because the system constrains our progress, and that usually stems from messy architecture.

00:15:57.600 One less controversial rule states that comments are a design smell. Anyone disagree? Comments often signal you should investigate further.

00:16:11.420 Inline comments are particularly suspect, often indicating you may need to create separate methods for clarity.
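To illustrate the rule about inline comments, here is a minimal Ruby sketch. The invoice example, the method names, and the discount rule are all hypothetical and were not shown in the talk. The first method needs comments to explain each step; the second version turns each comment into a small, named method.

```ruby
# Hypothetical example: names, amounts, and the discount rule are invented.

# Before: inline comments explain what each chunk does.
def invoice_total_commented(items)
  # add up the line items
  total = items.sum { |item| item[:price_cents] * item[:quantity] }
  # take 10% off orders of 100.00 or more
  total -= total / 10 if total >= 10_000
  total
end

# After: each comment has become a method whose name says the same thing.
def line_items_total(items)
  items.sum { |item| item[:price_cents] * item[:quantity] }
end

def bulk_discount(amount_cents)
  amount_cents >= 10_000 ? amount_cents / 10 : 0
end

def invoice_total(items)
  amount = line_items_total(items)
  amount - bulk_discount(amount)
end

items = [
  { price_cents: 6_000, quantity: 2 },
  { price_cents: 500, quantity: 1 }
]
puts invoice_total(items)
```

Both versions return the same amount; the second one simply needs no comments to be read.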

00:16:28.380 Another idea, albeit more controversial, is that tests can be a design smell. If a test suite is slow and brittle, it signals a flawed system.

00:16:40.360 I’ve found when reviewing slow, complex test suites, they often reveal a poor state of the overall system, leading developers to write excessive tests in fear.

00:16:56.400 A simplified system wouldn’t require numerous test files that take a long time to run. Focus instead on developing small, trivial systems.

00:17:10.600 In my approach, you can write code in any language, as long as the component is small enough for developers to understand easily.

00:17:30.060 Ultimately, every component should be small, stand-alone, easily maintained, and in its own repository.

00:17:46.800 The idea is that if you can look at a component and understand it immediately, it reduces risk by lowering complexity.

00:18:02.700 Our systems are heterogeneous by default. Different languages enhance system design. By utilizing varied programming languages, we decrease tight coupling.

00:18:18.780 For instance, I have worked with Objective-C, Ruby, Scala, and more—all without tightly coupling those components, as different languages act as natural barriers.

00:18:36.600 Furthermore, server nodes should be disposable. In my previous jobs, we were overly proud of server uptime.

00:18:53.480 Long uptimes breed fear: you feel hesitant to change or upgrade. Consequently, we embrace disposability over longevity.

00:19:07.260 We deploy new versions of services by creating new servers and swapping them in behind the load balancer, which removes the uncertainty about what is on each server.

00:19:24.040 Decisions become simpler because we don't have to worry about the specifics of any one server; it's straightforward to recreate.
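The replace-instead-of-upgrade idea can be sketched in a few lines of Ruby. This is an in-memory model only: `Server` and `LoadBalancer` are hypothetical stand-ins, not a real cloud or load balancer API, and not the tooling used at 6 Wunderkinder.

```ruby
# In-memory sketch only: Server and LoadBalancer are hypothetical stand-ins.
Server = Struct.new(:id, :version)

class LoadBalancer
  attr_reader :servers

  def initialize(servers)
    @servers = servers
  end

  # Roll out a new version by building fresh servers and dropping the old
  # ones. Nothing is upgraded in place.
  def replace_all(new_version)
    fresh = @servers.each_index.map do |i|
      Server.new("#{new_version}-#{i}", new_version)
    end
    retired = @servers
    @servers = fresh # traffic now goes only to the fresh servers
    retired          # the caller can terminate these
  end
end

lb = LoadBalancer.new([Server.new("v1-0", "v1"), Server.new("v1-1", "v1")])
retired = lb.replace_all("v2")
puts lb.servers.map(&:id).inspect
puts retired.map(&:id).inspect
```

Because old servers are only ever discarded, nobody needs to know what has accumulated on them over time.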

00:19:39.500 Additionally, provisioning new services should be simple. We’ve transitioned from complex config management to straightforward shell scripts.

00:19:56.700 Instead of overengineering with tools like Chef, we utilize uncomplicated scripts that execute efficiently.

00:20:14.060 We focus on continuous deployment, upholding the idea that if something feels hard, you should do it all the time, so that daunting tasks become routine operations.

00:20:31.860 For instance, where deploying weekly took effort, we’d make it a rule that every change goes live promptly.

00:20:52.880 Deploy frequently; it minimizes fear, allowing teams to confidently manage changes. The average uptime of our servers is around 17 hours.

00:21:08.860 With distributed systems, failures are inevitable, so design for resiliency. I recommend studying Joe Armstrong's philosophy about failure and recovery in Erlang.
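As a loose Ruby analogy for that failure-and-recovery philosophy, the sketch below lets a job crash and has a tiny supervisor restart it. This is not Erlang's supervisor model; `supervise` and its `max_restarts` parameter are hypothetical names chosen for illustration.

```ruby
# Hypothetical sketch: the job contains no defensive error handling, and a
# small supervisor restarts it a limited number of times when it crashes.
def supervise(max_restarts:)
  restarts = 0
  begin
    yield
  rescue StandardError
    restarts += 1
    raise if restarts > max_restarts # give up and let the failure surface
    retry                            # run the job again from the start
  end
end

attempts = 0
result = supervise(max_restarts: 3) do
  attempts += 1
  raise "connection lost" if attempts < 3 # fails twice, then succeeds
  "done after #{attempts} attempts"
end
puts result
```

The recovery policy lives in one place, outside the job, which keeps the job itself small.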

00:21:25.060 Testing shouldn’t overshadow monitoring; effectively track when things go wrong, measure faults, and aim to resolve them quickly.

00:21:44.430 When monitoring comprehensively—beyond just tracking memory and disk space—you can gain insights about your business metrics.

00:22:03.160 For example, if user signups decrease, it might indicate a larger underlying issue.

00:22:19.930 Understanding business impact often matters more than merely knowing server status.
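A minimal Ruby sketch of monitoring a business metric rather than a server metric follows. The function name, the sample numbers, and the 50% threshold are assumptions for illustration, not values from the talk.

```ruby
# Hypothetical sketch: alert when signups in the current hour fall well
# below the recent hourly average.
def signup_alert?(recent_hourly_signups, current_hour_signups, drop_ratio: 0.5)
  return false if recent_hourly_signups.empty?

  baseline = recent_hourly_signups.sum.to_f / recent_hourly_signups.size
  current_hour_signups < baseline * drop_ratio
end

history = [120, 110, 130, 125]
puts signup_alert?(history, 118) # an ordinary hour
puts signup_alert?(history, 40)  # a sharp drop
```

A check like this can fire even when every server looks healthy, which is the point of watching the business signal directly.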

00:22:36.260 Finally, to leave you with something to think about, foster a culture of urgency and readiness. You may need to overcome fears about potential disasters.

00:22:53.240 Use practices like 'canary in the coal mine' deployments to test new versions incrementally, watching the effects carefully.

00:23:10.630 Gradual change alleviates fears, allowing for continuous deployment without worry that everything might crash at once.
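The incremental rollout can be sketched in Ruby as a routing rule. This is a minimal illustration, assuming numeric user ids; `version_for` and the percentage scheme are hypothetical, not how Wunderlist routes traffic.

```ruby
# Hypothetical sketch: send a fixed percentage of users to the canary
# version, keyed on user id so a given user always sees the same version.
def version_for(user_id, canary_percent)
  user_id % 100 < canary_percent ? :canary : :stable
end

counts = (1..1000).map { |id| version_for(id, 5) }.tally
puts counts.inspect
```

Raising `canary_percent` step by step, while watching error rates and business metrics, moves traffic over gradually; setting it back to 0 is the rollback.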

00:23:30.060 As we discuss resilience, envision scenarios where testing offloads burdens, leading to a more spontaneous response to failures.

00:23:47.300 This concept can guide you in architecting software and systems resilient to change: designing systems that can evolve or dissolve without chaos.

00:24:03.500 Experimental ideas, like chaos testing, can test system resilience by purposefully inducing failure to improve future reactions.

00:24:23.050 Additionally, consider whether you can incorporate 'spot instances' where your system remains flexible, adapting responsively to real-time availability.

00:24:41.390 Approach homeostasis in software design: configure systems to shift and scale based on environmental fluctuations.
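One way to picture homeostasis in code is a control loop that compares observed state with desired state and corrects the difference. The Ruby sketch below is a toy model under that assumption; `reconcile` and the node hashes are hypothetical.

```ruby
# Hypothetical sketch: drop unhealthy nodes, then add or remove nodes until
# the healthy count matches the desired count.
def reconcile(running, desired_count)
  healthy = running.select { |node| node[:healthy] }
  if healthy.size < desired_count
    missing = desired_count - healthy.size
    healthy + Array.new(missing) { { healthy: true } }
  else
    healthy.first(desired_count)
  end
end

nodes = [{ healthy: true }, { healthy: false }, { healthy: true }]
nodes = reconcile(nodes, 3) # one node died: replace it
puts nodes.size
nodes = reconcile(nodes, 2) # demand dropped: scale down
puts nodes.size
```

Run repeatedly, a loop like this keeps the system as a whole stable while individual nodes come and go.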

00:25:01.060 Lastly, spread knowledge. If anyone’s interested in JSON schema and high-performance asynchronous validation, let's connect.

00:25:19.450 Your insights or involvement would be greatly appreciated. I think that’s my time.

00:25:47.440 Thank you very much and let’s continue this conversation during the conference.

Garden City Ruby 2014