RubyKaigi Takeout 2021

Story of Rucy - How to compile a BPF binary from Ruby

BPF is a technology used in Linux for packet filtering, tracing or access auditing. BPF has its own VM and set of opcodes.

If you want to write a program that loads and uses a BPF binary, you can write it in any language, including Ruby.

However, to prepare the "BPF binary" itself, you generally need to write somewhat unusual C and pass it to the clang compiler with the bpf target.

Wouldn't it be great if we could make these BPF binaries entirely in Ruby?

Rucy is intended to allow programmers to write their whole BPF programs in Ruby. I'll discuss how to "compile" BPF binaries from Ruby in this talk.

RubyKaigi Takeout 2021: https://rubykaigi.org/2021-takeout/presentations/udzura.html

00:00:01.040 Hello, I am Uchio Kondo.
00:00:04.240 I’m going to start talking about the story of Rucy.
00:00:11.920 First, I will introduce myself. I am Uchio Kondo.
00:00:16.240 I work at GMO Pepabo and live in Kochi. I was a RubyKaigi organizer in 2019.
00:00:24.640 I like Ruby, MLB, Rust, and Duolingo. Today, I will talk about my program, Rucy.
00:00:40.879 However, before we can discuss Rucy, there are a lot of requirements that need to be understood.
00:00:52.640 So first, I'm going to talk about BPF.
00:01:02.000 BPF, in simple terms, is one of the latest technologies in Linux.
00:01:06.320 It is often confusing because it is used for various purposes. But basically, it is a technology that allows users' code to run faster and more safely in the kernel.
00:01:23.600 BPF is used in several areas, for example networking. As the name Berkeley Packet Filter implies, BPF historically started with packet filters such as the one in tcpdump.
00:01:35.360 It has been developed over time to run faster and serve other functions, such as server tracing and observability.
00:01:46.399 Now, with BPF, it’s becoming a reality to aggregate current events more precisely than previous classic solutions. If you are running FreeBSD, you are probably already a fan of DTrace.
00:02:15.360 You can also use BPF as a mechanism to filter which devices are allowed to access resources from inside a container.
00:02:31.280 This encompasses the use case for device access control.
00:02:37.760 The BPF program receives a context, such as the device major and minor numbers, that is necessary for filtering.
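The decision such a device filter makes can be modelled in plain Ruby. This is only an illustrative sketch and not Rucy's actual input syntax: `DeviceContext` and `device_verdict` are hypothetical names, and blocking character device 1:9 (conventionally /dev/urandom on Linux) is an assumption made for the example.

```ruby
# Hypothetical stand-in for the context a device-cgroup BPF program receives.
DeviceContext = Struct.new(:major, :minor)

# Return 0 to deny access and 1 to allow it, mirroring how a
# device-cgroup BPF program reports its verdict.
def device_verdict(ctx)
  # Assumption for illustration: block character device 1:9.
  return 0 if ctx.major == 1 && ctx.minor == 9

  1
end

puts device_verdict(DeviceContext.new(1, 9)) # denied device
puts device_verdict(DeviceContext.new(1, 3)) # any other device
```

The real filter runs inside the kernel; this model only shows the shape of the check on the major and minor numbers.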
00:03:39.200 Now, continuing, the next topic is whether you can use BPF from Ruby.
00:03:44.560 The answer is yes! I started developing a project called RbBCC in mid-2019.
00:03:52.639 It allows Ruby to access BPF's observability features. RbBCC is a Ruby binding of BCC. BCC itself is an SDK for developing BPF tools.
00:04:02.080 I started developing RbBCC because I realized BCC didn't support Ruby. Fortunately, BCC's Python binding is built on FFI, so I could use Ruby's FFI to access BCC in the same way.
00:04:22.639 RbBCC is also one of the outcomes of a Ruby Association development grant from 2019.
00:04:51.440 However, there are some pitfalls with using BCC. When you run a tool that uses BCC, the BPF binary is compiled on the fly, each time the tool runs.
00:05:37.440 This can lead to problems, such as runtime overhead and the need to have LLVM and the appropriate kernel headers set up for compiling.
00:06:02.160 Installing LLVM and Clang from your Linux package manager can bloat your compilation environment, especially when trying to create a container image.
00:06:21.840 Therefore, rather than using the existing methods like BCC, I recommend pre-compiling the BPF binary.
00:06:35.840 Incorporating it into the user space program allows you to install the whole setup as a single file.
00:06:48.960 Instead of continuing to use BCC, I decided to use mruby, which I believe is the best way to achieve the one-binary goal.
00:07:06.720 However, the problem is that the BPF binary can only be created in C.
00:07:32.960 Effectively, we need to pass the C code to the clang command with the -target bpf option.
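As a rough sketch of that step, the clang invocation can be assembled from Ruby. The helper name `clang_bpf_command` and the file names are placeholders for this example; `-O2`, `-target bpf`, `-c`, and `-o` are standard clang options.

```ruby
require "shellwords"

# Build the clang invocation that compiles a C source file into a BPF object.
# Returns an argv-style array so it can be passed to system without a shell.
def clang_bpf_command(source, object)
  ["clang", "-O2", "-target", "bpf", "-c", source, "-o", object]
end

cmd = clang_bpf_command("filter.bpf.c", "filter.bpf.o")
puts Shellwords.join(cmd)

# To actually run it (requires a clang build with the BPF backend):
#   system(*cmd) or abort("clang failed")
```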
00:07:52.880 Since you are a Rubyist, you may be proficient in C, but I aimed to make it possible for Ruby developers to create BPF binaries without needing to write any C code.
00:08:06.880 This brings us to the summary so far: you need to write C for the BPF binary, but I'm trying to make it possible to write every part of the BPF program in Ruby, using Rucy.
00:08:19.440 It's now time to discuss the technology behind Rucy. Rucy is a Ruby compiler targeting BPF.
00:08:43.360 In other words, Rucy will generate the BPF program binary directly from a plain Ruby script.
00:08:54.720 Here is an architectural overview of how I'm compiling Ruby into a BPF program using Rucy.
00:09:02.560 The process begins with the features written as a Ruby script, from which the BPF bytecode is generated. Rucy internally interprets this Ruby code and extracts the information needed for the BPF program.
00:09:25.440 By doing this, we obtain both the metadata and the binary representation of the BPF program, resulting in the object file format.
00:09:41.839 The BPF bytecode consists of instructions that run on its own virtual machine inside the kernel, which has its own instruction set.
00:09:59.440 The virtual machine can use 10 registers, and every BPF instruction has a fixed length of 64 bits.
00:10:11.680 The layout of an instruction can be described as: opcode, destination register, source register, offset, and immediate value.
00:10:32.160 Every instruction is a combination of these fields. I will explain the details in the next section.
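To make the layout concrete, here is a minimal sketch that packs one instruction with Ruby's `Array#pack`. It assumes the little-endian encoding (8-bit opcode, 4-bit destination register, 4-bit source register, 16-bit signed offset, 32-bit signed immediate). The helper name `encode_insn` is made up for this example and is not taken from Rucy.

```ruby
# Pack one 64-bit BPF instruction as a little-endian binary string.
# Layout: 8-bit opcode, 4-bit dst register, 4-bit src register,
# 16-bit signed offset, 32-bit signed immediate.
def encode_insn(opcode, dst: 0, src: 0, off: 0, imm: 0)
  regs = ((src & 0x0f) << 4) | (dst & 0x0f) # src in the high nibble, dst in the low
  [opcode, regs, off, imm].pack("CCs<l<")
end

# 0xb7 is the opcode for "mov dst, imm" (BPF_ALU64 | BPF_MOV | BPF_K).
insn = encode_insn(0xb7, dst: 1, imm: 42)
puts insn.bytesize       # 8 bytes, i.e. 64 bits
puts insn.unpack1("H*")  # b70100002a000000
```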
00:10:57.760 Like many language runtimes, mruby also has its own VM, which is likewise a register machine but uses a different instruction set.
00:11:08.239 Instructions in the mruby VM have variable lengths, depending on the operands they accept. The sizes of the operand values range from 8 to 80 bits.
00:11:50.880 For instance, an operation that loads a number into a register takes only its respective operand, while jump instructions require more sophisticated representations.
00:12:22.080 Furthermore, translating instructions from the mruby VM to BPF requires special care, especially in calculating offsets for methods and variable names.
00:12:57.760 Moreover, jump instructions behave differently in the mruby VM and in BPF: in BPF, a single instruction performs a conditional jump, while in the mruby VM, two are required.
00:13:40.960 In practice, an instruction that compares register values might need additional instructions to determine whether to jump based on the outcome of that comparison.
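The idea of folding a compare-then-jump pair into one conditional jump can be sketched on a toy instruction list. Everything here is an illustrative assumption: instructions are plain arrays, the names `:eq`, `:jmpnot`, and `:jne_reg` are symbolic labels rather than real encodings, and `fuse_compare_and_jump` is a hypothetical helper, not Rucy's implementation.

```ruby
# Toy instruction streams are arrays of [name, *operands].
# [:eq, a]          compares register a with register a + 1
# [:jmpnot, a, dst] jumps to dst when register a is false
def fuse_compare_and_jump(insns)
  out = []
  i = 0
  while i < insns.length
    cur = insns[i]
    nxt = insns[i + 1]
    if cur[0] == :eq && nxt && nxt[0] == :jmpnot && nxt[1] == cur[1]
      reg = cur[1]
      # "compare, then jump if false" becomes "jump if not equal".
      out << [:jne_reg, reg, reg + 1, nxt[2]]
      i += 2
    else
      out << cur
      i += 1
    end
  end
  out
end

source = [[:loadi, 1, 10], [:eq, 1], [:jmpnot, 1, :else_branch], [:return, 1]]
p fuse_compare_and_jump(source)
```

Because the fused stream has fewer instructions than the original, any numeric jump offsets would have to be recalculated afterwards; the sketch sidesteps this by using a symbolic target.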
00:14:23.360 Lastly, every BPF program must return a value, typically stored in a specific register, and it is necessary to manage this return state.
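In BPF the return value lives in register r0, and the program ends with an exit instruction. The smallest program that returns a constant can therefore be sketched as two instructions. The helper names `encode_insn` and `return_constant` are made up for this example; the opcodes 0xb7 and 0x95 are the standard little-endian BPF encodings for a 64-bit move-immediate and exit.

```ruby
# Pack one 64-bit BPF instruction (little-endian layout).
def encode_insn(opcode, dst: 0, src: 0, off: 0, imm: 0)
  [opcode, ((src & 0x0f) << 4) | (dst & 0x0f), off, imm].pack("CCs<l<")
end

MOV64_IMM = 0xb7 # BPF_ALU64 | BPF_MOV | BPF_K
EXIT      = 0x95 # BPF_JMP | BPF_EXIT

# Emit "r0 = value; exit", the bytecode for a program that returns a constant.
def return_constant(value)
  encode_insn(MOV64_IMM, dst: 0, imm: value) + encode_insn(EXIT)
end

program = return_constant(1)
puts program.bytesize      # 16 bytes: two instructions
puts program.unpack1("H*") # b7000000010000009500000000000000
```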
00:14:48.320 In conclusion, we now have the capability to produce BPF binary objects directly from Ruby scripts.
00:15:04.560 Using these object files together with mruby and libbpf, it is possible to build an entire command as a single binary.
00:15:24.640 Writing the whole tool in Ruby can reduce context switches for programmers, and I am working on enabling them to more easily share data structures between BPF and user-space programs.
00:15:38.240 Let's now take a look at Rucy's demonstrations in action.
00:15:52.800 First, I'll show the cgroup demo. This is the target Ruby script, which can be compiled with the rucy command.
00:16:10.640 The generated binary is a valid ELF object file.
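One way to sanity-check such an object from Ruby is to look at the ELF header: the file starts with the magic bytes 0x7F "ELF", and the `e_machine` field at byte offset 18 is 247 (EM_BPF) for BPF objects. The following is a minimal sketch; the helper name `bpf_elf_object?` and the file name in the usage comment are placeholders, and the header used in the demonstration is synthetic.

```ruby
ELF_MAGIC = "\x7FELF".b
EM_BPF = 247 # e_machine value assigned to Linux BPF

# Inspect the first 20 bytes of an ELF header.
def bpf_elf_object?(header)
  return false if header.nil? || header.bytesize < 20
  return false unless header.byteslice(0, 4) == ELF_MAGIC

  little_endian = header.getbyte(5) == 1 # EI_DATA: 1 = little-endian, 2 = big-endian
  machine = header.byteslice(18, 2).unpack1(little_endian ? "v" : "n")
  machine == EM_BPF
end

# Usage with a real file (the path is a placeholder):
#   File.open("filter.bpf.o", "rb") { |f| puts bpf_elf_object?(f.read(20)) }

# Synthetic header: magic, 64-bit class, little-endian, version 1,
# nine padding bytes, e_type = 1 (relocatable), e_machine = 247.
fake = ELF_MAGIC + [2, 1, 1].pack("C3") + ("\x00".b * 9) + [1, 247].pack("v2")
puts bpf_elf_object?(fake)
```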
00:16:22.740 Next, I will make a cgroup and load this object into the cgroup.
00:16:44.720 Now the process can access a random resource, but if the current process is assigned to a different cgroup, that resource will become inaccessible.
00:17:03.440 This shows how the filtering works.
00:17:20.640 The next demo is about tracing kernel functions.
00:17:39.840 In this demonstration, Rucy and mruby code will trace the kernel function called tcp_connect.
00:18:00.960 Here is the first code that checks differences in the cross-section and compiles it into an object.
00:18:24.640 This gives us the BPF binary generated by Rucy.
00:18:43.360 The next part of the demo shows how to embed this object code into an mruby program and execute it.
00:19:09.440 While the generated command is running, calls to tcp_connect, for instance, are logged.
00:19:45.600 Finally, I would like to share my research goals and what I aim to achieve in the future.
00:20:11.440 Currently, I have implemented only the minimum basic BPF functionality. For example, handling BPF map data structures is still unimplemented.
00:20:52.560 Many features in actual BPF cannot yet be used in Rucy. I would like to gradually implement these required features and create sample tools for real-world BPF operations.
00:21:47.760 This concludes my talk on the story of Rucy.
00:21:56.080 Thank you very much!