Summarized using AI

Running Heroku on Heroku

Noah Zoschke • October 09, 2012 • Earth

In this talk at Aloha RubyConf 2012, Noah Zoschke explores 'Running Heroku on Heroku', focusing on self-hosting. Heroku is a cloud application platform that was originally associated with Ruby and now supports multiple programming languages. Zoschke emphasizes that application developers should be able to concentrate on deploying applications rather than on server management.

Key Points Discussed:
- Definition of Self-Hosting: The term refers to systems that can run and manage themselves, thereby eliminating the need for external hosting or management.
- Bootstrapping: This process is vital to self-hosting, allowing systems to initialize and function independently. It is likened to various domains, including language acquisition and biological processes.
- Historical Context of Bootstrapping: The evolution of the term from an impossible task to a self-sustaining process across different fields, including computing.
- Self-Hosting in Computing: Zoschke discusses self-hosting compilers, using LLVM as a case study to show how a compiler can build itself and how doing so acts as a consistency check.
- Heroku's Self-Hosting Practices: Heroku's commitment to self-hosting is exemplified through 'dogfooding'—using its platform to run its own services. The evolution from a monolithic architecture to a more modular setup is discussed.
- Operational Efficiency: By applying self-hosting principles, Heroku enhances its operational efficiency, ensuring that internal services are managed within the framework of its platform.
- Technical Architecture of Heroku: The speaker outlines Heroku’s architectural components, which include the HTTP router, API, and build system, showing how they interrelate to manage operations effectively.
- Continuous Improvement: Zoschke highlights a path towards increasing complexity in hosted services and the environment's evolution, encouraging ongoing exploration of self-hosting capabilities.

Key Takeaways:
- Self-hosting is instrumental for modern application platforms, offering significant efficiency and consistency benefits.
- The principles explored can serve as a model for enhancing any complex software system.
- Encouragement to consider the implications of self-hosting in personal software projects for potential advancements in efficiency and management.

The Heroku application platform makes deploying, running and scaling applications incredibly easy.

Traditionally these apps have been Ruby web applications. But as both the platform and its users mature, we are seeing the complexity of hosted apps increase, with more complex infrastructure systems running on Heroku.

Today, nobody is more interested in running infrastructure on Heroku than Heroku itself, as self-hosting offers massive benefits and is a fascinating engineering puzzle to boot.

We will first discuss the concept of self-hosting, why it is such an interesting computer science problem, and why it is a vital property of many systems. Compilers, revision control systems and application platforms all exhibit similar properties of bootstrapping, cross-compiling, and avoiding circular dependencies.

Then we will take a look at the more interesting self-hosted components of Heroku, such as the distributed application compiler. It used to be a server farm, but it is now little more than a Heroku app that can even compile itself to release new versions.

All of this will show how working towards a self-hosted platform results in comprehensive consistency assurance and gains in efficiency, noble goals for such a complex software system.

Aloha RubyConf 2012

00:00:15.160 Hey everybody, thanks for hanging in there. My name is Noah Zoschke.
00:00:20.720 This is a talk called 'Running Heroku on Heroku'. We’re here at Aloha Ruby.
00:00:26.039 I want to thank all the organizers, speakers, and everyone for making this event great.
00:00:33.000 The slides for this talk are available online if you want to reference anything later.
00:00:38.600 I assume most of you are familiar with Heroku as Rubyists.
00:00:45.120 Just a quick background: Heroku is a cloud application platform as a service. It used to be referred to as a Ruby application platform as a service, but we have opened this up to other languages like Python, Java, and Node.js.
00:01:00.600 One of our taglines is 'Run anything and see everything without servers.' A personal goal of mine, and a goal for our platform, is to eliminate the concept of a server. As application developers, we want to focus on getting our apps online rather than dealing with the traditional operations involved in setting up servers and databases.
00:01:18.560 From a more low-level perspective, as a platform engineer, I see us developing a distributed Unix system. This is essentially a grid of application servers based on Linux, specifically on an Ubuntu base. I consider Heroku to be an operating system layer that interfaces with a pool of computing, which I suppose is the cloud.
00:01:32.399 However, it's also just servers in Amazon's infrastructure. This talk today explores an interesting computer science problem: the concept of self-hosting.
00:01:40.759 Specifically, I'm interested in how this relates to bootstrapping. I'm not talking about Twitter Bootstrap as a design tool, but rather the process of creating systems that can host themselves.
00:01:55.439 If you caught Glenn's talk earlier, you might see parallels in how we can delve deep into computer science and understand these low-level primitives.
00:02:09.160 Bootstrapping is a generic term with various applications, originating in the 19th century from the phrase 'to pull oneself up by one's own bootstraps.' Initially, it referred to something impossible, like raising oneself by one's own efforts.
00:02:21.400 The term was often applied to traveling salesmen with impossible schemes, like perpetual motion machines. It's a process that's fundamentally impossible, like lifting oneself up by bootstraps.
00:02:34.120 Interestingly, this term shifted in meaning in the 20th century to refer to a self-sustaining process that can proceed without external help. This concept applies across various domains, including socioeconomics.
00:02:47.360 In political seasons, politicians often tout their bootstrap stories, claiming to have succeeded against all odds. In business, companies aim to be 'bootstrapped', relying on good ideas and hard work rather than venture capital.
00:03:00.319 This idea extends to language acquisition, where children bootstrap their ability to speak and understand language from a blank slate. In biology, all cells start as generic stem cells that undergo a bootstrapping process to specialize. In computing, the process of booting a computer is essentially bootstrapping.
00:03:14.879 Originally, some mainframes had a big red button labeled 'bootstrap,' which would initiate the startup process. Although, as users, we don’t have to worry about this anymore, booting is still a complex task.
00:03:27.200 There's also the context of compilers in computer science where bootstrapping involves using a simpler language to translate a more complex program.
00:03:39.919 Essentially, a simple language can be used to create increasingly advanced programming tools, aiding in the development of interpreters such as Ruby, which may then be compiled into more complex binaries.
00:03:53.760 Compiling a language requires a bootstrap process, leading to something known as self-hosting—the ability for a language or system to self-compile.
00:04:07.360 For example, a C compiler written in C is self-hosting. Self-hosted languages have significant advantages, such as enabling compiler development in a higher-level programming language.
00:04:22.680 One key advantage is that this approach demonstrates the expressiveness of a language by allowing it to write its own compiler.
00:04:36.960 Having a self-hosting compiler stands as a non-trivial test of the language’s capabilities, indicating confidence in its general utility.
00:04:50.760 A self-hosted system also provides a comprehensive consistency check, confirming that successive versions of a compiler can compile each other.
00:05:04.440 In my own exploration around this topic, I delved into LLVM, a well-known cross-platform compiler framework.
00:05:17.120 LLVM started as a research project back in 2000 and has since become a strong competitor to GCC. Its architecture allows for substantial flexibility through its pluggable components.
00:05:34.640 Clang, the C-family front end for LLVM, is a great example of a self-hosting compiler: it is written in C++ and can compile its own source.
00:05:46.560 To demonstrate this self-hosting process, I ran a three-stage bootstrap of LLVM, using GCC for the initial build which was successfully tested.
00:06:04.760 For the second stage, we compiled with Clang using the results from the first stage, achieving successful test results once again.
00:06:19.280 In our final step, we built a new version of Clang using itself, aiming for a self-hosting clang compiler.
00:06:34.960 After completing these steps, we compared the binaries produced, observing notable differences between the binary built by GCC and the binaries built by Clang.
00:06:50.640 Seeing minimal variation between the Clang-built stages is reassuring, as it indicates a functional self-hosting compiler.
00:07:07.440 While the second-stage binary passed tests, the third stage verified that a complete package could be rebuilt from source.
00:07:19.680 We concluded our self-hosting compiler bootstrap successfully: Clang compiled itself, which indicates the effectiveness of this approach.
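The three-stage bootstrap described above can be modeled with a small Python sketch. This is a toy under stated assumptions, not the actual LLVM build: the `Compiler` class, `compile_source`, `three_stage_bootstrap`, and the placeholder source string are all hypothetical names, and a SHA-256 digest stands in for a binary's bytes. The one modeling assumption is that a compiler's output depends only on the source it is given and on its own code generator, not on which compiler built it.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Compiler:
    """A toy compiler binary (hypothetical model, not a real toolchain)."""
    codegen: str  # how this compiler emits code; fixed by the source it was built from
    digest: str   # fingerprint standing in for this binary's bytes


def compile_source(source: str, source_codegen: str, builder: Compiler) -> Compiler:
    """Build `source` with `builder` and return the resulting compiler binary.

    The result's bytes depend on the source text and on the builder's code
    generator, but not on who built the builder (the modeling assumption).
    """
    payload = f"{source}|emitted-by:{builder.codegen}".encode()
    return Compiler(codegen=source_codegen, digest=hashlib.sha256(payload).hexdigest())


def three_stage_bootstrap(source: str, source_codegen: str, host: Compiler):
    """Stage 1 is built by the host compiler; stages 2 and 3 by the previous stage."""
    stage1 = compile_source(source, source_codegen, host)
    stage2 = compile_source(source, source_codegen, stage1)
    stage3 = compile_source(source, source_codegen, stage2)
    return stage1, stage2, stage3


CLANG_SOURCE = "clang source tree"  # placeholder for the real source checkout
host_gcc = Compiler(codegen="gcc", digest="preinstalled")  # assumed host compiler

stage1, stage2, stage3 = three_stage_bootstrap(CLANG_SOURCE, "clang", host_gcc)
print("stage1 matches stage2:", stage1.digest == stage2.digest)
print("stage2 matches stage3:", stage2.digest == stage3.digest)
```

In this model the stage-1 binary (emitted by GCC) differs from stage 2 (emitted by Clang), while stages 2 and 3 match. That fixed point is what a bootstrap comparison looks for.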
00:07:31.840 In essence, the concept of self-hosting can relate to any program creating a new version of itself, including compilers, kernels, and programming languages.
00:07:46.640 For example, Linus Torvalds manages Git using Git itself, a testament to the self-hosting concept in computing.
00:08:01.520 Turning to Heroku's own services, we strive for the same self-hosting principles. It’s straightforward for us to run our own websites on our platform, including www.heroku.com.
00:08:14.920 If we have a platform that we built for running websites, we should use it for our services. This is often termed 'dogfooding': using your own product.
00:08:31.600 The motivation behind self-hosting goes beyond pride. By running our services on Heroku, we get the same efficiency gains that the platform gives application developers.
00:08:44.840 Historically, our platform started with a single monolithic Rails application called 'Core,' which encompassed everything from the website to the API.
00:09:01.680 In the past, deploying any change could risk other parts of the platform. We eventually recognized the need for a separation of concerns.
00:09:18.840 We've subsequently applied this separation across our platform, including our add-ons, which are now also running on Heroku.
00:09:37.200 For example, our scheduling tool used to operate separately, but now it runs efficiently on our platform, exemplifying our improved architecture.
00:09:51.520 With Cloud services, we are running processes with greater complexity across several servers to handle different operations.
00:10:05.600 Although it looks much like your standard web application, it serves as a model for our internal processes.
00:10:19.440 The Heroku Postgres service takes this further as it manages its infrastructure wholly on our platform, showcasing what we can accomplish through self-hosting.
00:10:34.000 As we grew, it became essential for the platform to improve so that it could handle tasks beyond conventional web applications.
00:10:49.560 Now, let me step back to explain the underlying architecture of Heroku to put things in perspective.
00:11:06.200 Heroku has three primary inputs: the HTTP router, which directs web requests to different application servers; the API, which manages control plane requests; and the build system, which handles source code.
00:11:22.680 When a user pushes their code, we build the application, scaling up or down as necessary, which encapsulates the core of our platform.
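The three inputs described above can be sketched as a toy in-memory model. This is an illustrative assumption rather than Heroku's implementation: `ToyPlatform`, `App`, and the methods `push`, `scale`, and `route` are hypothetical names, builds are faked with a string, and routing is simple round-robin.

```python
import itertools
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Optional


@dataclass
class App:
    """Hypothetical record of one hosted application."""
    name: str
    release: Optional[str] = None                        # last built artifact
    processes: List[str] = field(default_factory=list)   # running web processes
    cursor: Optional[Iterator[str]] = None               # round-robin position


class ToyPlatform:
    """Toy model of the three inputs: build system, API, and HTTP router."""

    def __init__(self) -> None:
        self.apps: Dict[str, App] = {}

    def push(self, name: str, source: str) -> str:
        """Build system: turn pushed source into a release (faked as a string)."""
        app = self.apps.setdefault(name, App(name))
        app.release = f"build({source})"
        if not app.processes:
            self.scale(name, 1)  # start one process on first deploy
        return app.release

    def scale(self, name: str, count: int) -> None:
        """API (control plane): set how many web processes the app runs."""
        app = self.apps[name]
        app.processes = [f"{name}.web.{i}" for i in range(1, count + 1)]
        app.cursor = itertools.cycle(app.processes)

    def route(self, name: str, path: str) -> str:
        """HTTP router: hand a request to the next process, round-robin."""
        app = self.apps[name]
        if app.release is None or app.cursor is None or not app.processes:
            raise LookupError(f"no running processes for {name}")
        return f"{next(app.cursor)} handled {path} using {app.release}"


platform = ToyPlatform()
platform.push("www", "rev-1")   # build system input
platform.scale("www", 2)        # API input
for _ in range(3):
    print(platform.route("www", "/"))  # router input
```

Because the platform's own services are just apps in this model, an internal service such as the website could be pushed and scaled through the same three calls, which is the dogfooding idea from earlier in the talk.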
00:11:39.440 Now, I’ll delve into the build servers that serve as the infrastructure behind our operations.
00:11:54.880 The aspiration remains to forget servers in all aspects of our business, focusing instead on application delivery.
00:12:07.640 One of my motivations is to further integrate our compile servers, moving from a worker pattern to a runtime process model.
00:12:24.560 This transformation offers not only efficiency but also alignment of build and runtime environments.
00:12:39.360 By reusing the technology we’ve developed for secure web applications, we aim to refine our building processes.
00:12:56.320 As I execute the slug compiler, we are effectively starting with the right environment for dedicated services.
00:13:10.160 We should expect bundler to seamlessly install all dependencies, demonstrating the efficiency in code management in a self-hosting context.
00:13:27.520 The self-hosting principle further emphasizes the importance of ensuring compatibility between build and runtime environments.
00:13:43.680 This illustrates the capability of running applications directly on Heroku, paralleling the LLVM bootstrap example where generic Unix systems execute arbitrary code.
00:13:59.560 Overall, as we evaluate what other systems and processes we could host internally, we realize we're on a path of continuous improvement.
00:14:13.760 This includes ongoing exploration of more complex services, as most rectangles representing servers hold potential for self-hosting.
00:14:29.560 The key takeaway from my talk today is that self-hosting is a fundamental goal that can serve any complex system.
00:14:43.960 Overall, self-hosting empowers us with efficiency gains and a means to validate our systems comprehensively while engaging with the challenges of developing a consistent platform.
00:15:00.000 As I wrap up, I encourage you to consider the implications of self-hosting for your own software systems and the potential it presents.
00:15:15.720 I’ll end with a couple of references that have informed my understanding of these concepts, such as classic literature on compilers and principles of self-hosting.
00:15:31.200 Thank you for your time, and I’m happy to answer any questions regarding Heroku, self-hosting, and more.
00:15:47.640 If you forget about your servers, do they ever get lost?
00:15:53.680 Indeed, servers can get lost; Amazon is very good at losing them.
00:16:02.920 And to wrap up, Cedar is not read-only anymore, but be cautious with persistent data.
00:16:14.080 Thank you.