00:00:15.160
Hey everybody, thanks for hanging in there. My name is Noah Zoschke.
00:00:20.720
This is a talk called 'Running Heroku on Heroku'. We’re here at Aloha Ruby.
00:00:26.039
I want to thank all the organizers, speakers, and everyone for making this event great.
00:00:33.000
The slides for this talk are available online if you want to reference anything later.
00:00:38.600
I assume most of you are familiar with Heroku as Rubyists.
00:00:45.120
Just a quick background: Heroku is a cloud application platform as a service. It used to be referred to as a Ruby application platform as a service, but we have opened this up to other languages like Python, Java, and Node.js.
00:01:00.600
One of our taglines is 'Run anything and see everything without servers.' A personal goal of mine, and a goal for our platform, is to eliminate the concept of a server. As application developers, we want to focus on getting our apps online rather than dealing with the traditional operations involved in setting up servers and databases.
00:01:18.560
From a more low-level perspective, as a platform engineer, I see us developing a distributed Unix system. This is essentially a grid of application servers based on Linux, specifically on an Ubuntu base. I consider Heroku to be an operating system layer that interfaces with a pool of computing, which I suppose is the cloud.
00:01:32.399
However, it's also just servers in Amazon's infrastructure. This talk today explores an interesting computer science problem: the concept of self-hosting.
00:01:40.759
Specifically, I'm interested in how this relates to bootstrapping. I'm not talking about Twitter Bootstrap as a design tool, but rather the process of creating systems that can host themselves.
00:01:55.439
If you caught Glenn's talk earlier, you might see parallels in how we can delve deep into computer science and understand these low-level primitives.
00:02:09.160
Bootstrapping is a generic term with various applications, originating in the 19th century from the phrase 'to pull oneself up by one's own bootstraps.' Initially, it referred to something impossible, like raising oneself by one's own efforts.
00:02:21.400
The term was often applied to traveling salesmen with impossible schemes, such as perpetual motion machines.
00:02:34.120
Interestingly, this term shifted in meaning in the 20th century to refer to a self-sustaining process that can proceed without external help. This concept applies across various domains, including socioeconomics.
00:02:47.360
In political seasons, politicians often tout their bootstrap stories, claiming to have succeeded against all odds. In business, companies aim to be 'bootstrapped', relying on good ideas and hard work rather than venture capital.
00:03:00.319
This idea extends to language acquisition, where children bootstrap their ability to speak and understand language from a blank slate. In biology, all cells start as generic stem cells that undergo a bootstrapping process to specialize. In computing, the process of booting a computer is essentially bootstrapping.
00:03:14.879
Originally, some mainframes had a big red button labeled 'bootstrap,' which would initiate the startup process. Although, as users, we don’t have to worry about this anymore, booting is still a complex task.
00:03:27.200
There's also the context of compilers in computer science where bootstrapping involves using a simpler language to translate a more complex program.
00:03:39.919
Essentially, a simple language can be used to build increasingly advanced programming tools, up to interpreters such as Ruby's, which are themselves compiled into more complex binaries.
00:03:53.760
Compiling a language requires a bootstrap process, which leads to something known as self-hosting: the ability of a language or system to compile itself.
00:04:07.360
For example, GCC is a C compiler that was itself written in C. Self-hosted languages have significant advantages, such as allowing the compiler to be developed in a higher-level programming language.
00:04:22.680
One key advantage is that this approach demonstrates the expressiveness of a language by allowing it to write its own compiler.
00:04:36.960
Having a self-hosting compiler stands as a non-trivial test of the language’s capabilities, indicating confidence in its general utility.
00:04:50.760
A self-hosted system also serves as a comprehensive consistency check, since successive versions of the compiler must be able to compile each other.
00:05:04.440
In my own exploration around this topic, I delved into LLVM, a well-known cross-platform compiler framework.
00:05:17.120
LLVM started as a research project back in 2000 and has since become a strong competitor to GCC. Its architecture allows for substantial flexibility through its pluggable components.
00:05:34.640
Clang, a front end of LLVM, is a great example of a self-hosting environment, as it is written in C++ and uses itself to compile.
00:05:46.560
To demonstrate this self-hosting process, I ran a three-stage bootstrap of LLVM. The initial build used GCC, and its tests passed.
00:06:04.760
For the second stage, we compiled with Clang using the results from the first stage, achieving successful test results once again.
00:06:19.280
In our final step, we built a new version of Clang using itself, aiming for a self-hosting Clang compiler.
00:06:34.960
After completing these steps, we compared the binaries produced, observing notable differences between the output from GCC and Clang.
00:06:50.640
Seeing minimal variation between the Clang-built binaries is reassuring, as it indicates a functional self-hosting compiler.
00:07:07.440
While the second-stage binary passed tests, the third stage verified that a complete package could be rebuilt from source.
00:07:19.680
We completed our self-hosting compiler bootstrap successfully: Clang can compile itself, which shows that the approach works.
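The binary comparison described in these steps can be sketched roughly as follows. This is a minimal illustration, not the commands run in the talk: the helper names `sha256_of` and `compare_stages` are hypothetical, and it assumes each stage's compiler binary has been saved to its own file. It hashes each binary and reports whether adjacent stages are byte-identical.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def compare_stages(stage_binaries):
    """Compare adjacent bootstrap stages.

    stage_binaries: list of paths ordered stage 1, stage 2, stage 3, ...
    (hypothetical paths; point them at wherever each stage was saved).
    Returns a list of (earlier_stage, later_stage, identical) tuples,
    with stages numbered from 1.
    """
    digests = [sha256_of(p) for p in stage_binaries]
    return [
        (i + 1, i + 2, digests[i] == digests[i + 1])
        for i in range(len(digests) - 1)
    ]
```

In a typical three-stage bootstrap, the stage 1 binary (built by the host compiler) is expected to differ from stage 2, while stages 2 and 3 (both built by the new compiler) should match or differ only in minor details.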
00:07:31.840
In essence, the concept of self-hosting can relate to any program creating a new version of itself, including compilers, kernels, and programming languages.
00:07:46.640
For example, the source code of Git, which Linus Torvalds created, is itself managed in Git, a testament to the self-hosting concept in computing.
00:08:01.520
Looking at services further within Heroku, we strive for the same self-hosting principles. It’s straightforward for us to run our own websites on our platform, including www.heroku.com.
00:08:14.920
If we have a platform that we built for running websites, we should use it for our services. This is often termed 'dogfooding': using your own product.
00:08:31.600
The motivation behind self-hosting goes beyond pride. By running our services on Heroku, we get the same efficiency gains that application developers get from the platform.
00:08:44.840
Historically, our platform started with a single monolithic Rails application called 'Core,' which encompassed everything from the website to the API.
00:09:01.680
In the past, deploying any change could risk other parts of the platform. We eventually recognized the need for a separation of concerns.
00:09:18.840
We've subsequently applied this separation across our platform, including our add-ons, which are now also running on Heroku.
00:09:37.200
For example, our scheduling tool used to operate separately, but now it runs efficiently on our platform, exemplifying our improved architecture.
00:09:51.520
With cloud services, we are running more complex processes across several servers to handle different operations.
00:10:05.600
Although it looks much like your standard web application, it serves as a model for our internal processes.
00:10:19.440
The Heroku Postgres service takes this further as it manages its infrastructure wholly on our platform, showcasing what we can accomplish through self-hosting.
00:10:34.000
As we grew, it became essential to improve the platform so that it could handle tasks beyond conventional web applications.
00:10:49.560
Now, let me step back to explain the underlying architecture of Heroku to put things in perspective.
00:11:06.200
Heroku has three primary inputs: the HTTP router, which directs web requests to different application servers; the API, which manages control plane requests; and the build system, which handles source code.
00:11:22.680
When a user pushes their code, we build the application and then scale it up or down as necessary. That is the core of our platform.
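As a rough mental model of those three inputs, here is a toy sketch. None of these names come from Heroku: `ToyPlatform`, `push`, `scale`, and `route` are hypothetical, and both the round-robin routing and the "first push starts one process" behavior are assumptions made purely for illustration.

```python
import itertools


class ToyPlatform:
    """Toy model of a platform with a build system, an API, and a router."""

    def __init__(self):
        self.slugs = {}      # app name -> built artifact (just a string here)
        self.processes = {}  # app name -> list of running process names
        self._cursors = {}   # app name -> round-robin iterator over processes

    def push(self, app, source):
        """Build system input: turn pushed source into a runnable artifact."""
        self.slugs[app] = f"slug({source})"
        if app not in self.processes:
            self.scale(app, 1)  # assumption: the first push starts one process

    def scale(self, app, count):
        """API input: a control plane request to change the process count."""
        if app not in self.slugs:
            raise KeyError(f"{app} has not been built yet")
        self.processes[app] = [f"{app}.web.{i + 1}" for i in range(count)]
        self._cursors[app] = itertools.cycle(self.processes[app])

    def route(self, app):
        """Router input: pick the process that handles the next request."""
        if not self.processes.get(app):
            raise LookupError(f"no running processes for {app}")
        return next(self._cursors[app])
```

The point of the sketch is only that each input touches a different piece of state: a push produces an artifact, the API changes what is running, and the router reads what is running to place a request.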
00:11:39.440
Now, I’ll delve into the build servers that serve as the infrastructure behind our operations.
00:11:54.880
The aspiration remains to forget servers in all aspects of our business, focusing instead on application delivery.
00:12:07.640
One of my motivations is to further integrate our compile servers, moving from a worker pattern to a runtime process model.
00:12:24.560
This transformation offers not only efficiency but also alignment of build and runtime environments.
00:12:39.360
By reusing the technology we’ve developed for secure web applications, we aim to refine our building processes.
00:12:56.320
As I execute the slug compiler, we are effectively starting with the right environment for dedicated services.
00:13:10.160
We should expect bundler to seamlessly install all dependencies, demonstrating the efficiency in code management in a self-hosting context.
00:13:27.520
The self-hosting principle further emphasizes the importance of ensuring compatibility between build and runtime environments.
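One way to picture that compatibility check is a simple comparison of two environment descriptions. This is a sketch under stated assumptions, not how Heroku verifies its environments: the function name `environment_mismatches` and the idea of describing an environment as a flat dictionary are both hypothetical.

```python
def environment_mismatches(build_env, runtime_env):
    """Return {key: (build_value, runtime_value)} for every differing key.

    Both arguments are plain dicts describing an environment (hypothetical
    format). A key missing on one side is reported with None for that side,
    so an explicit None value and a missing key are treated as the same.
    """
    keys = set(build_env) | set(runtime_env)
    return {
        key: (build_env.get(key), runtime_env.get(key))
        for key in sorted(keys)
        if build_env.get(key) != runtime_env.get(key)
    }
```

An empty result means the two descriptions agree, which is the property that running the build on the same platform as the runtime is meant to provide by construction.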
00:13:43.680
This illustrates the capability of running applications directly on Heroku, paralleling the LLVM bootstrap example where generic Unix systems execute arbitrary code.
00:13:59.560
Overall, as we evaluate what other systems and processes we could host internally, we realize we're on a path of continuous improvement.
00:14:13.760
This includes ongoing exploration of more complex services, as most rectangles representing servers hold potential for self-hosting.
00:14:29.560
The key takeaway from my talk today is the fundamental goal of self-hosting, which can apply to any complex system.
00:14:43.960
Overall, self-hosting empowers us with efficiency gains and a means to validate our systems comprehensively while engaging with the challenges of developing a consistent platform.
00:15:00.000
As I wrap up, I encourage you to consider the implications of self-hosting for your own software systems and the potential it presents.
00:15:15.720
I’ll end with a couple of references that have informed my understanding of these concepts, such as classic literature on compilers and principles of self-hosting.
00:15:31.200
Thank you for your time, and I’m happy to answer any questions regarding Heroku, self-hosting, and more.
00:15:47.640
If you forget about your servers, do they ever get lost?
00:15:53.680
Indeed, servers can get lost; Amazon is very good at losing them.
00:16:02.920
And to wrap up, Cedar is not read-only anymore, but be cautious with persistent data.
00:16:14.080
Thank you.