00:00:30.160
Cool, so as I said, I'm talking about systems programming in Ruby. My name is Andy Delcambre; I go by adelcambre on the internet. My last name is not phonetic at all; no one ever gets it right. I work at GitHub, mainly on backend systems, focusing on the system components. I essentially work from the bottom of the Rails app to the top of the Git layer. We have an RPC system that retrieves data from the file servers to the front ends. If you want to talk about any of that stuff, I would love to chat about it; it’s fascinating to me.
00:01:03.680
So, the title of this talk is "Ruby Systems Programming," which implies that I'm going to try and teach you how to write systems code and create low-level things. However, that’s not really the point at all. The objective is more about learning how these lower-level building blocks work in the applications we write every day, even the components that we don’t think about, or write, and that are underneath the layers we work at.
00:01:15.520
With that said, I’d like to define what systems programming is and what system software is. This is a quote from Wikipedia: "System software operates and controls the computer hardware and provides a platform for running application software." I really like this quote because it breaks the concept into two pieces. First, when we talk about system software, we’re discussing direct interactions with hardware, which means writing very low-level code, probably interacting with the kernel. On the other hand, we're creating a platform for running application software.
00:01:44.560
This means none of the code that we write in system software is likely to be user-facing. It’s primarily software that other software will run on top of. The second aspect of this is that we have application software running on top of this system software, which is essentially everything else—this could be video games, iPhone apps, or anything that runs on your set-top box.
00:02:08.160
For this crowd, and for me, since this is what I know, I’m going to narrow the scope down to web software. We have a large set of building blocks that run underneath the Rails apps we write—the Ruby applications we develop. These are the components that we don’t often think about when we're focused on writing in Rails, such as actions and controllers.
00:02:37.280
We’re going to start at the bottom and work our way up the stack, beginning with the kernel—the software that runs directly above the hardware. We'll talk about system calls, how we interact with the kernel, and how we deal with anything that resembles a file, which encompasses most things in UNIX, including file descriptors and sockets for network programming.
00:02:55.440
The code or anything I discuss here should apply generally to any UNIX system. However, there isn't much overlap with Windows or other non-UNIX systems. Most specifics I mention will relate to Linux on x86 architecture. This is because the code is low-level enough that the platform and the hardware you're running on actually matter quite a bit. That said, my demo will also work on my Mac, so it’s not exclusively Linux.
00:03:38.319
At the beginning, we have the kernel, which is the component that runs directly above the hardware. This kernel controls all the operations on your system. Generally, a computer looks like this: you have the hardware—memory, hard disks, graphics drivers, sound cards, etc.—at the bottom. Above that is your code, which is everything you think of as software. Any application you write, anything you run, everything you experience is in this green box. The kernel acts as the mediator between our code and the hardware.
00:04:56.720
The kernel handles all the interactions with the hardware and manages the processes that execute in user mode. User mode runs all the code you write. It's implemented in the CPU as a differentiation between kernel mode and user mode. When you're executing in user mode, your capabilities are limited. You can perform operations with data you already possess, whether numbers or strings, but you're not able to access new data externally or perform IO operations like reading files or networking.
00:06:19.680
On the other hand, kernel mode on your computer can perform essentially any action imaginable. The kernel has the authority to read and write to any memory and control every program running. However, you don’t have the privilege to run code in kernel mode; you're confined to user mode. So how do we accomplish the actions that kernel mode can do while operating in user mode? This is where system calls come into play.
00:06:56.319
System calls are how we make those requests and modifications to the system that only kernel mode can execute. Essentially, while our program is running in user mode, it can’t perform operations directly; instead, it requests the kernel to perform those operations on its behalf. For instance, if you need to open or write to a file, you would invoke a system call. The kernel will execute the requested action and return control back to your program.
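As a rough sketch of what this looks like from Ruby: the unbuffered `IO.sysopen`, `syswrite`, and `sysread` methods map fairly closely onto the underlying `open`, `write`, and `read` system calls. The helper name `syscall_round_trip` and the temp file name are made up for illustration.

```ruby
require "tmpdir"

# Hypothetical helper: write a message to a file and read it back, using
# Ruby's unbuffered IO methods so each step is close to one system call.
def syscall_round_trip(path, message)
  # open(2): ask the kernel for a file descriptor (a small integer).
  fd = IO.sysopen(path, File::WRONLY | File::CREAT | File::TRUNC, 0o644)
  io = IO.for_fd(fd, "w")
  io.syswrite(message)                 # write(2)
  io.close                             # close(2)

  fd = IO.sysopen(path, File::RDONLY)  # open(2) again, read-only this time
  io = IO.for_fd(fd, "r")
  data = io.sysread(message.bytesize)  # read(2)
  io.close                             # close(2)
  data
end

path = File.join(Dir.tmpdir, "syscall_demo_#{Process.pid}.txt")
result = syscall_round_trip(path, "hello from user mode\n")
puts result
File.delete(path)                      # unlink(2)
```

At every commented step the program hands control to the kernel and waits for the result; Ruby's usual buffered `read` and `write` just batch these calls up.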
00:07:50.720
You can observe all the system calls that your program executes using trace commands. For example, on Linux, the command is `strace`, and on macOS, it's `dtruss`. These commands allow you to attach to any process and see the system calls it's making. When I tested one request locally against one of the GitHub apps, I noted that a single request executed around 120,000 system calls. That's a significant volume for just one interaction.
00:09:10.320
You might assume that because we can’t do anything without system calls and given that this example involved such a high number of them, the API would be vast. Surprisingly, it’s quite small—just 326 possible system calls in the Linux kernel. If a function isn’t in this syscall table, it’s not available for calls, and if it resides in kernel mode but is absent from that table, it won’t work.
00:09:51.280
Making a system call is quite a specific process. This assembly snippet illustrates the `open` syscall, which has a number associated with it (for instance, `open` is syscall number five). First, we load five into the eax register, and then we populate the additional registers to pass any arguments. If there are more than a few arguments, the method varies, but most system calls typically have a limited number of parameters.
00:11:04.280
At this point, we trigger interrupt 0x80, the syscall interrupt, which shifts execution from your program to the kernel. The interrupt handler reads the syscall number from the eax register, executes the corresponding syscall, and returns the result to your program. It’s important to note that your program is paused while this happens; it resumes only once the kernel returns.
00:11:22.720
Interestingly, within that syscall table, there are no blank spots; every entry must correspond to a syscall, even those that are deprecated or reserved for future use. Of the 326 syscalls, 65 are defined as 'not implemented' and don’t actually perform any action, which narrows our effective set to around 260 syscalls available on Linux.
00:12:54.720
Comparatively, Windows has thousands of syscalls. While the exact number is unknown, it far surpasses Linux. It’s intriguing to think that in UNIX/Linux, we manage with a smaller API boundary compared to the significantly larger one in Windows while maintaining the same overall capabilities.
00:13:44.800
Moving on, in the kernel, we frequently discuss things that appear as files. Linux adopts the principle that everything—hardware components, devices, and even sockets—functions like files. This belief leads to the expression 'everything is a file,' which isn’t strictly true but is functionally applicable in many situations.
00:14:38.560
File descriptors are how we handle these file-like interactions when using syscalls. The operational API, however, is quite small. Four primary syscalls enable you to interact with file descriptors—other syscalls exist, but these are fundamental. You can read from files, write to them, and close them, maintaining control over their existence.
00:15:36.240
File types can also vary; some may allow seeking (moving around), but streaming data over a network won’t permit you to rewind. A few interesting statistics include that I observed all the file descriptors currently open for a Unicorn process in production—easily manageable. Every process begins with three default file descriptors: standard input (0), standard output (1), and standard error (2). In this case, the standard input originates from /dev/null since this server ignores input.
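A small sketch of these two points in Ruby, assuming a UNIX-like system: the standard streams sit on descriptors 0, 1, and 2, a regular file can seek, and a pipe (a stream, like a network connection) cannot. The variable names are illustrative only.

```ruby
require "tempfile"

# The three standard streams always occupy descriptors 0, 1 and 2.
standard_fds = [STDIN.fileno, STDOUT.fileno, STDERR.fileno]

# A regular file supports seeking: jump to byte offset 2 and read from there.
seekable = Tempfile.create("fd_demo") do |file|
  file.write("abcdef")
  file.flush
  file.seek(2)   # lseek(2) under the hood
  file.read(2)
end

# A pipe is a stream, so the kernel refuses to seek on it.
reader, writer = IO.pipe
pipe_seek_error =
  begin
    reader.seek(0)
    nil
  rescue Errno::ESPIPE => e
    e.class
  end

# New descriptors are just the next free small integers.
new_fds = [reader.fileno, writer.fileno]
reader.close
writer.close

puts "standard: #{standard_fds.inspect}, pipe: #{new_fds.inspect}"
puts "read after seek: #{seekable.inspect}, seek on pipe: #{pipe_seek_error}"
```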
00:16:29.440
The default server output logs go to log files that keep track of operations and syntax highlighting, etc. For anything communicating over the network, we essentially manage socket connections, including those for web traffic and interactions with databases like MySQL or Redis. The file descriptor API discussed so far only covers reading and writing. Note that there is no single standard API for creating new file descriptors: how you create one depends on the type of file descriptor.
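To illustrate, here is a sketch where three kinds of descriptor are each created by a different call but then used through the same read and write methods. The `round_trip` helper is hypothetical, and `UNIXSocket.pair` assumes a UNIX-like system.

```ruby
require "socket"
require "tempfile"

# Hypothetical helper: write through one IO object and read the same number
# of bytes back from another, using identical calls for every descriptor type.
def round_trip(writer, reader, message)
  writer.write(message)
  writer.flush
  reader.read(message.bytesize)
end

# pipe(2) creates a connected pair of descriptors.
pipe_reader, pipe_writer = IO.pipe
pipe_result = round_trip(pipe_writer, pipe_reader, "via pipe")

# socketpair(2) creates two connected sockets.
left, right = UNIXSocket.pair
socket_result = round_trip(left, right, "via socket")

# open(2) creates a descriptor for a regular file.
file_result = Tempfile.create("fd_kinds") do |file|
  File.open(file.path, "r") do |file_reader|
    round_trip(file, file_reader, "via file")
  end
end

[pipe_reader, pipe_writer, left, right].each(&:close)
puts [pipe_result, socket_result, file_result].inspect
```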
00:18:34.040
For network communications, there’s the BSD sockets API. A brief history: UNIX began as a result of the Multics project, a collaborative endeavor in the mid-1960s. When that failed, some of the individuals involved developed UNIX, which adopted a meaningful single-user system structure. Over the years, UNIX software has evolved, and by 1983, the original TCP/IP networking standards began to take hold.
00:19:38.720
There’s a certain nostalgia to it—that nearly 30 years later, we still effectively use the same API for sockets that has barely changed since its inception. The BSD sockets API is quite successful, having been standardized well and adapted to Windows as well. The server-side handling of connections involves several core calls: creating a socket, binding it to a port, and setting it to listen for incoming connections, all of which follow a specific sequence.
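The server-side sequence can be sketched with Ruby's low-level `Socket` class, which exposes the BSD calls almost one-to-one. This is an illustration only: the loopback address, the port chosen by the kernel, and the ping/pong exchange are assumptions made so the example can run in a single process.

```ruby
require "socket"

# socket(2): create a TCP socket.
server = Socket.new(Socket::AF_INET, Socket::SOCK_STREAM, 0)
server.setsockopt(Socket::SOL_SOCKET, Socket::SO_REUSEADDR, true)

# bind(2): port 0 asks the kernel to pick any free port.
server.bind(Addrinfo.tcp("127.0.0.1", 0))

# listen(2): start queueing incoming connections (backlog of 5).
server.listen(5)
port = server.local_address.ip_port

# A client in another thread, so the example is self-contained.
client_thread = Thread.new do
  client = TCPSocket.new("127.0.0.1", port)
  client.write("ping\n")
  answer = client.gets
  client.close
  answer
end

# accept(2): blocks until a client connects, then returns a new socket
# (a new file descriptor) dedicated to that one connection.
connection, _client_address = server.accept
request = connection.gets
connection.write("pong\n")
connection.close
server.close

reply = client_thread.value
puts "server read #{request.inspect}, client read #{reply.inspect}"
```

Once `accept` returns, the connection is an ordinary file descriptor, so the read, write, and close calls from earlier apply unchanged.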
00:20:49.840
Once connections are accepted, we can deal with the socket the same way as previous APIs discussed. HTTP operates on requests and responses—after accepting a connection and reading from it, we generate and send back a response before closing the connection if necessary. A typical HTTP request format includes the verb, path, and version.
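As a sketch of the request format, the first line of a request can be split on spaces into its three parts. The `parse_request_line` helper is hypothetical and ignores headers and malformed input.

```ruby
# Hypothetical helper: split an HTTP request line into verb, path and version.
def parse_request_line(line)
  verb, path, version = line.chomp.split(" ", 3)
  { verb: verb, path: path, version: version }
end

request_line = parse_request_line("GET /index.html HTTP/1.1\r\n")
puts request_line.inspect
```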
00:22:12.480
Response messages include a status line consisting of an HTTP version and response code, plus any headers needed (which are very flexible), followed by a response body if applicable. As we dive into code, keep in mind that a lightweight implementation like this simple 23-line HTTP server example has security vulnerabilities.
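The response layout described here can be sketched as a plain string builder. The `build_response` helper and its default headers are assumptions for illustration, not the code from the talk.

```ruby
# Hypothetical helper: assemble a status line, headers, a blank line and a body.
def build_response(status_code, reason, body, headers = {})
  defaults = {
    "Content-Length" => body.bytesize.to_s,
    "Connection" => "close"
  }
  header_lines = defaults.merge(headers).map { |name, value| "#{name}: #{value}\r\n" }.join
  "HTTP/1.1 #{status_code} #{reason}\r\n#{header_lines}\r\n#{body}"
end

response = build_response(200, "OK", "hello\n", "Content-Type" => "text/plain")
puts response
```

The blank line (`\r\n` on its own) is what separates the headers from the body.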
00:23:20.880
This code contains basic structure for creating an HTTP server using Ruby, showing standard practices for handling sockets, reading files, and responding to requests. It demonstrates a simple JSON API request from a server using sockets. Following this implementation, I’ll verify its practicality by testing it in a directory housing files I'm serving with it.
00:24:45.680
I confirm that the server responds as expected. Various requests work properly, like when files exist, and return 200 OK messages as a valid response. In conclusion, this working HTTP server in Ruby shows how to employ system calls and sockets effectively.
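The server from the talk is not reproduced in this transcript, so the following is only a sketch of the same idea: accept a connection, read the request line, and answer with a file from a directory. The `serve_one` helper, the temp directory, and the `hello.txt` file are all made up, and the path check is a minimal guard rather than a complete security measure.

```ruby
require "socket"
require "tmpdir"

# Hypothetical helper: handle exactly one HTTP request against files in `root`.
def serve_one(server, root)
  client = server.accept
  verb, path, _version = client.gets.to_s.split(" ", 3)

  # Read and discard the request headers, up to the blank line.
  loop do
    line = client.gets
    break if line.nil? || line == "\r\n"
  end

  # Resolve the path and refuse anything that escapes the served directory.
  full_path = File.expand_path(File.join(root, path.to_s))
  inside_root = full_path.start_with?(root + File::SEPARATOR)

  if verb == "GET" && inside_root && File.file?(full_path)
    status = "200 OK"
    body = File.binread(full_path)
  else
    status = "404 Not Found"
    body = "Not Found\n"
  end

  client.write("HTTP/1.1 #{status}\r\n" \
               "Content-Length: #{body.bytesize}\r\n" \
               "Connection: close\r\n\r\n#{body}")
ensure
  client&.close
end

# Try it: serve a temp directory and make two requests from the same process.
status_lines = Dir.mktmpdir do |dir|
  root = File.expand_path(dir)
  File.write(File.join(root, "hello.txt"), "hi\n")

  server = TCPServer.new("127.0.0.1", 0) # port 0: the kernel picks a free port
  port = server.addr[1]
  worker = Thread.new { 2.times { serve_one(server, root) } }

  results = ["/hello.txt", "/../etc/passwd"].map do |request_path|
    socket = TCPSocket.new("127.0.0.1", port)
    socket.write("GET #{request_path} HTTP/1.1\r\nHost: localhost\r\n\r\n")
    reply = socket.read
    socket.close
    reply.lines.first.chomp
  end

  worker.join
  server.close
  results
end

puts status_lines
```

The second request shows why the path check matters: without it, a request containing `..` could read files outside the served directory.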
00:26:02.480
This server effectively demonstrates core principles as it leverages the existing Ruby API calls to implement this HTTP functionality. This design pattern is present in many Ruby applications, such as Unicorn. Thus, all the mentioned API calls we utilize transfer lessons learned all the way from the kernel to the layers above where our Rails app operates.
00:27:30.960
If you're interested in further knowledge on this topic, I recommend a couple of books: "Linux System Programming" by Robert Love, which has been updated for the current ecosystem of Linux system calls, and Richard Stevens' renowned works on network programming and UNIX principles. Those books provide a deeper insight into both systems programming and network operations, serving as solid references for both the beginner and advanced programmer.