Ruby Systems Programming

Andy Delcambre • March 07, 2013 • Earth

In the video "Ruby Systems Programming," presented by Andy Delcambre at the Ruby on Ales conference in 2013, the speaker delves into the fundamentals of systems programming, particularly as it pertains to Ruby development. The discussion emphasizes the importance of understanding low-level system components to appreciate how high-level programming works.

Key points covered in the presentation include:

- Introduction to Systems Programming: Clarifying what systems programming entails, which includes writing software that interacts with hardware and provides a platform for application software.

- User Mode vs Kernel Mode: Explaining the two execution modes on the CPU with a focus on user mode where applications run and kernel mode where direct hardware communication occurs.

- System Calls: Discussing the function of system calls as a means for user-mode programs to request services from the kernel, illustrated with examples on common operations and the number of system calls made during a web server request.

- Writing a Web Server in Ruby: Delcambre progresses to demonstrate how a basic web server can be constructed in Ruby, using socket programming to handle incoming connections and serve HTTP responses.

- TCP/IP Fundamentals: He explains the TCP connection process, including the three-way handshake and the structure of HTTP requests and responses.

- Applicability of Concepts: These concepts are presented as the foundation for higher-level tools, with reference to Unicorn, an established Ruby web server that still uses the underlying socket APIs.

In conclusion, Delcambre asserts that understanding the basic principles of systems programming is valuable for any developer, as it deepens comprehension of the infrastructure that supports modern applications. This grasp of systems programming not only enhances programming skills but also fosters appreciation for the workings behind the software we use daily.

We as rubyists tend to write software that runs on the web, without a deep understanding of what it would take to write the plumbing for that same software. I think it's useful to have a basic understanding of how some of the lower level components of a system work. I'll discuss the basics of systems programming, using Ruby. I'll talk about syscalls and kernel space vs user space. I'll cover a bit about file descriptors and what they're for. And hopefully I'll walk through a small example of a working webserver using those primitive syscalls.

Ruby on Ales 2013

00:00:20.400 All right, so I'm going to talk about systems programming. My name is Andy Delcambre, but I go by a different handle on the internet.
00:00:26.960 My last name is not phonetic.
00:00:36.719 I’ve been working on various Ruby projects since I started programming and I've been having a blast with it. We just relaunched a project right after I started.
00:00:42.960 It's been a lot of fun. Similar to Jessica before me, I have a background in computer science.
00:00:48.399 I came to this point in an interesting way. I was actually a computer science professor at a university.
00:00:53.600 I swore up and down until I was a senior in high school that I would never do what my parents did. I planned to rebel and pursue civil engineering.
00:01:04.479 But then I took a class in civil engineering, and I really liked it.
00:01:10.840 I ended up getting a design degree. I can't even find my degree; it’s somewhere in my apartment. No one's ever asked me for it.
00:01:16.560 I don't believe it has helped me much in terms of job opportunities. So, I don’t think having a degree is essential, although I'm glad I have some foundational knowledge.
00:01:22.240 Now, to provide some background on my history: if you search for "Velcro Software" on the internet, you won’t find me or my mom.
00:01:28.880 Instead, you'll find the company that my father and grandfather started together back in the 70s called Open Software.
00:01:34.560 My grandfather started taking night classes in the late 60s to learn how to automate accounting software, making software for small local governments in Louisiana.
00:01:40.400 They did this for about 20 years until the early 90s. My grandfather also did not have a computer science degree.
00:01:45.840 I again do not believe that a computer science degree is required for success in computing.
00:01:52.320 Now, I’ll get into what I will cover today.
00:01:58.000 These are the two classes I took at my university; I didn't take all of them, but I learned bits from both.
00:02:03.600 This is intended to provide a background about all the underlying code that powers the work we do every day, though this is not the stuff we code directly.
00:02:10.640 I find it really interesting, and it satisfies my curiosity to understand how these pieces work.
00:02:16.560 Most of this code runs on virtually any UNIX operating system, although the examples I will present are more specifically using Linux.
00:02:23.200 So, the title is "Ruby Systems Programming." What do I mean by systems programming?
00:02:28.720 It essentially refers to writing system software that operates on an operating system.
00:02:34.319 According to Wikipedia, this software is meant to operate and control computer hardware and provide a platform for application software.
00:02:40.080 There are two components: one involves interacting with hardware such as networking, disks, and displays.
00:02:45.519 The second is the platform component, which underlies everything else. It’s not an end in itself.
00:02:51.760 I’m going to start low, discussing how code executes on the computer and then work my way up the stack.
00:02:58.239 At the end, we'll talk about how an actual web server might be written in Ruby using the most basic primitives.
00:03:04.959 First, we’ll discuss how code is executed on a computer.
00:03:10.000 Code can be executed in two modes on the CPU: user mode and kernel mode.
00:03:15.040 Basically, all the code we write runs in user mode. Unless you are hacking on the kernel, you are not running any code in kernel mode.
00:03:21.280 These modes are implemented in the CPU and have different permissions allowing different actions.
00:03:26.800 In user mode, you can generally do two kinds of operations without calling into the kernel.
00:03:33.599 You can perform mathematical operations and you can access memory, but only the memory you have been granted access to.
00:03:40.239 You may have been given access to a stack, and you can read or write that, but you cannot perform I/O operations like networking or disk access.
00:03:45.440 When in kernel mode, the situation is quite the opposite. Here, you can perform almost any action.
00:03:51.760 The kernel can read or write to any memory in the system and can execute any instruction at any time.
00:03:58.239 It communicates with hardware, manages file systems, and interacts with display drivers. In kernel mode, you can do pretty much anything.
00:04:05.360 Virtually all of the code you write runs in user mode.
00:04:12.560 If, for example, you are logging information and need to write to disk, how do you achieve that?
00:04:17.680 You might think that you write code, switch modes to kernel mode, perform the write, and then switch back to user mode.
00:04:24.720 However, this is not how it works. All of your code runs in user mode.
00:04:29.919 In practice, you utilize system calls, which are API calls to the kernel.
00:04:36.000 Going back to our logging example, you would make a system call to write to your log file.
00:04:41.600 Your code effectively hands off execution to the kernel.
00:04:46.880 The kernel handles the file operations and returns control to your code after completion.
00:04:52.320 Then your code can continue and call other system functions as necessary.
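
The logging example above can be sketched in Ruby with the thin wrappers the `IO` class provides over system calls. This is only an illustration, not code from the talk: the file name and message are made up, and `IO.sysopen` and `IO#syswrite` are used here because they map closely onto open(2) and write(2).

```ruby
# Minimal sketch: write one log line through Ruby's low-level IO wrappers.
# The log path and message are invented for illustration.
require "tmpdir"

log_path = File.join(Dir.tmpdir, "syscall_demo.log")

# IO.sysopen wraps open(2) and returns a raw integer file descriptor.
fd = IO.sysopen(log_path, File::WRONLY | File::CREAT | File::TRUNC, 0o644)

# Wrap the descriptor in an IO object so we can call syswrite on it.
log = IO.new(fd, "w")

# IO#syswrite wraps write(2): the kernel performs the I/O, then control
# returns to our code along with the number of bytes written.
bytes_written = log.syswrite("request handled\n")

# IO#close wraps close(2) and releases the descriptor.
log.close

puts "fd=#{fd} bytes=#{bytes_written}"
```

The integer returned by `IO.sysopen` is the file descriptor: a small number the kernel uses to identify the open file on behalf of the process.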
00:04:58.880 I was curious about how many system calls we make daily.
00:05:03.919 To find out, I used a tool called strace on Linux, which traces system calls made by a specific program.
00:05:10.400 I booted up a web server locally and recorded all system calls that occurred on a request.
00:05:16.880 The output was quite extensive, reflecting how many operations were taking place.
00:05:22.240 You’ll see calls like closing the file and accepting incoming connections.
00:05:27.840 This process executes significantly faster than the video frame rate.
00:05:34.560 Across the entire request, there were 120 system calls made.
00:05:40.880 Some might argue that this shows Ruby is inefficient with system calls.
00:05:46.880 While this may be true, it's also important to understand the multitude of system calls being made.
00:05:52.560 As you're connecting to databases, interacting with caches, and reading files, countless calls are being processed.
00:05:59.520 The sheer volume of activity necessitates these numerous system calls.
00:06:05.920 Given how fundamental system calls are, one might expect the API to be enormous.
00:06:11.680 Interestingly, there are a total of 326 system calls available.
00:06:17.680 However, not all are operational; many are historical or deprecated.
00:06:23.280 In fact, about 65 of these system calls are not implemented, leaving around 260 functional ones.
00:06:29.440 These calls provide a surprisingly small yet powerful API.
00:06:35.120 Now, switching gears, let's focus on web programming and how to write a web server.
00:06:41.680 This section will not cover the entire scope but will focus on receiving connections.
00:06:47.799 A bit of history: back in the 1960s, an operating system called MULTICS was developed at MIT.
00:06:52.480 Ken Thompson and Dennis Ritchie worked on MULTICS at Bell Labs.
00:06:58.720 Although it was a research project that introduced new concepts, it faced numerous issues.
00:07:05.040 Due to its ambitious goals, it wasn't practical for real-world use.
00:07:10.320 Some people from that team at Bell Labs then developed UNIX as a simpler alternative.
00:07:16.960 Originally, UNIX was designed for single-user systems, which is reflected in its name.
00:07:23.360 By the 1970s, UNIX had evolved to support multi-user functionality.
00:07:29.760 During this time, AT&T faced restrictions that limited its ability to sell the operating system.
00:07:35.520 Universities, including UC Berkeley, obtained licenses for the operating system's source code.
00:07:42.400 In 1978, the first BSD version was released, forming the basis for further UNIX development.
00:07:49.280 Bill Joy and his team added programs like the C shell (csh) and vi during this period.
00:07:56.160 By 1983, BSD introduced a TCP/IP stack that gained wide adoption, laying the groundwork for networking.
00:08:02.720 Interestingly, the API developed back then is still in use today. BSD sockets remain a standard across operating systems.
00:08:09.480 The API is remarkably simple and persists decades later, showcasing the stability of foundational technologies.
00:08:16.000 Now, let’s review the core components of the BSD socket API.
00:08:23.200 The socket call creates a socket to interact with, while the bind call binds the socket to a specific address and port.
00:08:29.600 The listen method prepares the socket to accept incoming connections.
00:08:36.080 Each incoming connection generates a new socket for handling communication.
00:08:43.440 Once communication is done, we need to close the socket properly.
00:08:49.360 This process primarily relates to TCP sockets rather than UDP.
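
As a rough sketch of the calls just listed, the snippet below walks a TCP socket through socket, bind, listen, accept, and close using Ruby's `Socket` class. It is not the speaker's code: the helper name `socket_lifecycle_demo` is hypothetical, and the in-process client thread exists only so the example can run on its own over loopback.

```ruby
require "socket"

# Hypothetical helper: exercises each step of the BSD socket API on loopback
# and returns whatever the client sent.
def socket_lifecycle_demo
  # socket(2): create an IPv4 TCP socket.
  server = Socket.new(Socket::AF_INET, Socket::SOCK_STREAM, 0)
  # bind(2): attach it to an address; port 0 asks the kernel for any free port.
  server.bind(Socket.pack_sockaddr_in(0, "127.0.0.1"))
  # listen(2): start queueing incoming connections (backlog of 5).
  server.listen(5)
  port = server.local_address.ip_port

  # A client in another thread, so the demo is self-contained.
  client_thread = Thread.new do
    client = Socket.new(Socket::AF_INET, Socket::SOCK_STREAM, 0)
    client.connect(Socket.pack_sockaddr_in(port, "127.0.0.1"))
    client.write("ping")
    client.close
  end

  # accept(2): returns a new socket for this one connection.
  connection, _client_addr = server.accept
  message = connection.read # reads until the client closes its side
  # close(2): release both the per-connection socket and the listener.
  connection.close
  server.close
  client_thread.join
  message
end

puts socket_lifecycle_demo
```

Note that the listening socket and the accepted connection are separate sockets, each with its own file descriptor, which is why both are closed.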
00:08:56.560 Now, let’s discuss how TCP works.
00:09:03.600 The basic connection process includes a three-way handshake.
00:09:09.200 The client sends a packet to request and synchronize a connection.
00:09:15.040 The server acknowledges the request and also sends a synchronization packet back.
00:09:20.560 Finally, the client acknowledges back to complete the handshake.
00:09:27.280 Once established, data can flow freely in both directions, handled transparently by the TCP protocol.
00:09:34.080 As requests are sent, they are packaged as data packets.
00:09:41.040 Each packet must be acknowledged to ensure it was received.
00:09:46.640 If any packet is lost, the protocol triggers retransmission.
00:09:54.080 Finally, when communication is done, a four-way handshake is used to close the connection.
00:10:01.680 This involves each side acknowledging the termination, allowing them to shut down in turn.
00:10:07.680 Now, let’s look at the command line in UNIX systems to list open file descriptors.
00:10:13.440 You will see open file descriptors and their states.
00:10:20.560 For example, you can see a program listening on port 80, along with any connected clients.
00:10:26.720 Now that we've reviewed TCP, let’s get into HTTP, which should feel more familiar.
00:10:32.559 The HTTP request sent across the network consists of several components.
00:10:39.200 The first line contains the method (like GET or POST), the requested path, and the version of HTTP.
00:10:46.080 After this line follow various headers indicating request details.
00:10:53.920 A sample header could include the user agent, host, and accepted content types.
00:11:00.320 On the response side, the server sends back similar lines.
00:11:06.160 The first line contains the HTTP version followed by a status code and its description.
00:11:12.800 As an example, a successful response might return '200 OK'.
00:11:19.760 This is followed by additional headers detailing the response information.
00:11:26.160 Lastly, the response contains the body, such as HTML content.
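
The response layout described above (status line, headers, blank line, body) can be sketched as a small Ruby helper. The name `build_response` and the particular headers chosen are assumptions for illustration, not taken from the talk's slides.

```ruby
# Hypothetical helper: assemble a raw HTTP/1.1 response as text.
def build_response(status_code, reason, body)
  head = [
    "HTTP/1.1 #{status_code} #{reason}",  # status line: version, code, description
    "Content-Type: text/html",            # headers describing the response
    "Content-Length: #{body.bytesize}",
    "Connection: close"
  ].join("\r\n")

  # A blank line (CRLF CRLF) separates the headers from the body.
  "#{head}\r\n\r\n#{body}"
end

puts build_response(200, "OK", "<h1>Hello</h1>")
```

Lines are terminated with CRLF (`\r\n`) as the HTTP/1.1 message format requires, and `Content-Length` counts bytes rather than characters.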
00:11:32.640 Now, let’s take a look at a simple example of an HTTP server written in Ruby.
00:11:38.880 I will share some code that illustrates the fundamental components of a web server.
00:11:45.760 Don’t try to read everything on the slide. I’ll explain it line by line.
00:11:52.880 Even though this isn't production-ready code, it still showcases how to run an HTTP server.
00:11:58.960 First, we require the socket library from Ruby's standard library for networking.
00:12:06.080 Next, we define some constants to streamline our code.
00:12:13.040 In the main program, we create a new socket using the socket API.
00:12:19.760 The parameters specify the type of socket—IPv4 and TCP in this case.
00:12:26.560 Then we set up our server to listen to connections on a specific address and port.
00:12:33.280 After binding the socket, we call the listen method, allowing it to receive incoming client requests.
00:12:40.080 When the server is ready, it can accept client connections; connections that arrive before accept is called wait in the listen queue.
00:12:47.680 As connections are accepted, a new client socket is created for communication.
00:12:56.080 Next, we need to parse the incoming request to determine how to respond.
00:13:02.960 We read the first kilobyte of data from the socket and split it to extract the HTTP components.
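
A minimal sketch of that parsing step, assuming the raw text has already been read from the socket (for example with `readpartial(1024)`). The helper name `parse_request_line` and the sample request are hypothetical.

```ruby
# Hypothetical helper: extract method, path, and HTTP version from the
# first line of a raw request.
def parse_request_line(raw_request)
  # The request line is the first line; chomp strips the trailing CRLF.
  request_line = raw_request.lines.first.to_s.chomp
  http_method, path, version = request_line.split(" ", 3)
  { method: http_method, path: path, version: version }
end

# Sample request text, as a client might send it (values invented).
raw = "GET /index.html HTTP/1.1\r\nHost: localhost\r\n\r\n"
p parse_request_line(raw)
```

This ignores the headers entirely, which is enough for a server that only needs the requested path.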
00:13:10.960 Once we obtain the requested path, we check if the requested file exists.
00:13:18.720 If the file exists, we respond with a 200 OK status and send the file content.
00:13:25.440 We also send appropriate headers, including the content length.
00:13:32.560 If the file does not exist, we return a 404 Not Found response.
00:13:40.080 Finally, we close the connection to signal to the client that no more data will be sent.
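
Putting the steps above together, here is a hedged reconstruction of such a server in Ruby. It is not the code from the slides: the helper names `handle_connection` and `serve_one_request`, the temporary directory, and the decision to serve exactly one request and stop are all assumptions made so the sketch runs on its own.

```ruby
require "socket"
require "tmpdir"

# Hypothetical helper: answer one HTTP request on an accepted connection,
# serving files from the directory `root`.
def handle_connection(connection, root)
  request = connection.readpartial(1024) # read up to the first kilobyte
  _http_method, path, _version = request.lines.first.to_s.split(" ", 3)
  # Using only the basename keeps requests inside `root` (no "../" escapes).
  file = File.join(root, File.basename(path.to_s))

  if File.file?(file)
    status = "200 OK"
    body = File.binread(file)
  else
    status = "404 Not Found"
    body = "Not Found\n"
  end

  headers = ["HTTP/1.1 #{status}",
             "Content-Length: #{body.bytesize}",
             "Connection: close"].join("\r\n")
  connection.write("#{headers}\r\n\r\n")
  connection.write(body)
ensure
  connection.close # signals to the client that no more data is coming
end

# Hypothetical demo helper: listen on a kernel-chosen port, serve a single
# request for `path`, and return the raw response the client received.
def serve_one_request(root, path)
  server = Socket.new(Socket::AF_INET, Socket::SOCK_STREAM, 0) # IPv4, TCP
  server.bind(Socket.pack_sockaddr_in(0, "127.0.0.1"))
  server.listen(5)
  port = server.local_address.ip_port

  server_thread = Thread.new do
    connection, _addr = server.accept
    handle_connection(connection, root)
  end

  client = TCPSocket.new("127.0.0.1", port)
  client.write("GET #{path} HTTP/1.1\r\nHost: localhost\r\n\r\n")
  response = client.read
  client.close
  server_thread.join
  response
ensure
  server&.close
end

Dir.mktmpdir do |root|
  File.write(File.join(root, "index.html"), "<h1>Hello</h1>")
  puts serve_one_request(root, "/index.html")
end
```

A long-running server would instead loop on `server.accept` and bind to a fixed port; the one-shot form here is only so the example terminates.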
00:13:46.720 This process is a complete web server written in only a few lines of Ruby code.
00:13:53.440 Seeing it run demonstrates the simplicity and power of building applications from the ground up.
00:14:00.080 This is a functional web server capable of serving files over HTTP.
00:14:07.040 So despite the fun of building our own servers, one might wonder why it's important.
00:14:15.360 I believe understanding the fundamentals gives us insights into higher-level frameworks.
00:14:23.440 I found similar code patterns within a well-known Ruby web server, Unicorn.
00:14:31.200 You’ll see similar ideas implemented using the socket APIs.
00:14:39.040 They still utilize raw socket APIs for their networking operations.
00:14:47.360 This shows that regardless of the layers of abstraction, the underlying principles remain vital.
00:14:55.760 Understanding these concepts can enhance your programming skills and deepen your comprehension.
00:15:03.200 In conclusion, I hope you find this information illuminating.
00:15:09.440 It's beneficial to explore the internal workings of systems programming.
00:15:15.760 It helps us appreciate the infrastructure behind our applications.
00:15:22.560 Thank you all for your attention!