00:00:01.040
Hello, I am Uchihō Kondo.
00:00:04.240
I’m going to start talking about the story of Rucy.
00:00:11.920
First, I will introduce myself. I am Kondo Uchihō.
00:00:16.240
I work at GMO People and live in Kochi. I was a RubyKaigi organizer in 2019.
00:00:24.640
I like Ruby, MLB, Rust, and Duolingo. Today, I will talk about my program, Rucy.
00:00:40.879
However, before we can discuss Rucy, there is a lot of background that needs to be understood.
00:00:52.640
So first, I'm going to talk about BPF.
00:01:02.000
BPF, in simple terms, is one of the latest technologies in Linux.
00:01:06.320
It is often confusing because it is used for various purposes. But basically, it is a technology that allows user-supplied code to run quickly and safely inside the kernel.
00:01:23.600
BPF is used in several areas, for example networking. The name Berkeley Packet Filter reflects that BPF historically started with packet filters, as used by tools like tcpdump.
00:01:35.360
It has been developed over time to run faster and serve other functions, such as server tracing and observability.
00:01:46.399
Now, with BPF, it is becoming realistic to aggregate events more precisely than with earlier, classic solutions. If you are running FreeBSD, you are probably already a fan of DTrace.
00:02:15.360
You can also use BPF as a mechanism to filter which devices are allowed to access resources from inside a container.
00:02:31.280
This encompasses the use case for device access control.
00:02:37.760
The BPF program works on a context that carries the information necessary for filtering, such as the device major and minor numbers.
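As a rough illustration of that filtering decision, here is a plain-Ruby model. It is not Rucy or kernel code, and the method name `filter_device` is hypothetical. It assumes the usual Linux device numbers (1:9 for /dev/urandom, 1:3 for /dev/null) and the cgroup device convention that returning 0 denies access and a non-zero value allows it.

```ruby
# Hypothetical model of a cgroup device filter decision.
# major/minor stand in for the device numbers found in the BPF context.
def filter_device(major:, minor:)
  # Assumption for this sketch: we want to block /dev/urandom (1:9).
  blocked = [[1, 9]]
  # 0 denies the access, 1 allows it.
  blocked.include?([major, minor]) ? 0 : 1
end

puts filter_device(major: 1, minor: 9) # /dev/urandom
puts filter_device(major: 1, minor: 3) # /dev/null
```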
00:03:39.200
Now, continuing, the next topic is whether you can use BPF from Ruby.
00:03:44.560
The answer is yes! I started developing a project called rbcc in mid-2019.
00:03:52.639
It allows Ruby to access BPF's observability features. Rbcc is a Ruby binding of BCC. BCC itself is an SDK for developing BPF tools.
00:04:02.080
I started developing rbcc because, unfortunately, BCC didn't support Ruby. Fortunately, BCC's Python binding is built on FFI, so I could use Ruby's FFI to access BCC in the same way.
00:04:22.639
Rbcc is also one of the outcomes of a Ruby Association development grant from 2019.
00:04:51.440
However, there are some pitfalls with using BCC. When you run a tool that uses BCC, the BPF binary is compiled at run time, every time the tool starts.
00:05:37.440
This can lead to problems, such as runtime overhead and the need to have LLVM and the appropriate kernel headers set up for compiling.
00:06:02.160
Installing LLVM and Clang from your Linux package manager can bloat your compilation environment, especially when trying to create a container image.
00:06:21.840
Therefore, rather than using the existing methods like BCC, I recommend pre-compiling the BPF binary.
00:06:35.840
Incorporating it into the user space program allows you to install the whole setup as a single file.
00:06:48.960
Instead of continuing to use BCC, I decided to build my tools with mruby, which I believe is the best way to achieve the one-binary goal.
00:07:06.720
However, the problem is that the BPF binary can only be created in C.
00:07:32.960
In practice, we need to pass the C code to the clang command with the `-target bpf` option.
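For reference, a minimal sketch of that clang invocation, built from Ruby. The helper name `clang_bpf_command` and the file names are placeholders for illustration; `-O2`, `-target bpf`, `-c`, and `-o` are ordinary clang options.

```ruby
# Builds the argv for compiling a C source file into a BPF object file.
# The source and output names are hypothetical placeholders.
def clang_bpf_command(source, output)
  ["clang", "-O2", "-target", "bpf", "-c", source, "-o", output]
end

cmd = clang_bpf_command("filter.bpf.c", "filter.bpf.o")
puts cmd.join(" ")

# To actually run it (requires clang with the BPF backend installed):
#   system(*cmd) or abort("clang failed")
```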
00:07:52.880
Since you are a Rubyist, you may be proficient in C, but I aimed to make it possible for Ruby developers to create BPF binaries without needing to write any C code.
00:08:06.880
This brings us to the summary so far: you need to write C for the BPF binary, but I'm trying to make it possible to write every part of the BPF program in Ruby, using Rucy.
00:08:19.440
It's now time to discuss the technology behind Rucy. Rucy is a Ruby compiler targeting BPF.
00:08:43.360
In other words, Rucy will generate the BPF program binary directly from a plain Ruby script.
00:08:54.720
Here is an architectural overview of how I'm compiling Ruby into a BPF program using Rucy.
00:09:02.560
The process begins with features being programmed into Ruby scripts, which can then generate the BPF bytecode. Rucy internally interprets these Ruby codes and extracts the necessary information for the BPF program.
00:09:25.440
By doing this, we obtain both the metadata and the binary representation of the BPF program, resulting in the object file format.
00:09:41.839
The BPF bytecode consists of instructions that run on its own virtual machine inside the kernel, which has its own instruction set.
00:09:59.440
The virtual machine can use 10 registers, and every BPF instruction is basically a fixed 64 bits wide.
00:10:11.680
The layout of an instruction can be described as: opcode, destination register, source register, offset, and immediate value.
00:10:32.160
Every instruction is a combination of these bit fields. I will explain the details in the next section.
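To make the layout concrete, here is a small sketch that packs one instruction into its 8 bytes. It is not Rucy's implementation. The helper names `encode_insn` and `to_hex` are hypothetical, and it assumes a little-endian host, where the fields are an 8-bit opcode, a 4-bit destination register, a 4-bit source register, a 16-bit signed offset, and a 32-bit signed immediate. The opcodes 0xb7 (64-bit move of an immediate) and 0x95 (exit) come from the standard BPF instruction set.

```ruby
# Packs one BPF instruction into its 8-byte little-endian form.
def encode_insn(opcode, dst: 0, src: 0, off: 0, imm: 0)
  # The destination register sits in the low nibble, the source in the high one.
  regs = ((src & 0x0f) << 4) | (dst & 0x0f)
  # C = unsigned 8-bit, s< = signed 16-bit LE, l< = signed 32-bit LE
  [opcode, regs, off, imm].pack("CCs<l<")
end

# Renders a binary string as hex digits for inspection.
def to_hex(bytes)
  bytes.unpack1("H*")
end

mov_r0_1  = encode_insn(0xb7, dst: 0, imm: 1) # r0 = 1
exit_insn = encode_insn(0x95)                 # exit, returning r0

puts to_hex(mov_r0_1)
puts to_hex(exit_insn)
```

These two instructions together form about the smallest useful program: set the return register and exit.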
00:10:57.760
Much like regular programming environments, mruby also has its own VM. It is a register machine as well, but it uses a different instruction set.
00:11:08.239
The instructions in the mruby VM are of variable length, depending on the operands they accept. The sizes of the operand values range from 8 to 80 bits.
00:11:50.880
For instance, an operation that loads a number into a register takes only its respective operand, while jump instructions require more sophisticated representations.
00:12:22.080
Furthermore, translating instructions from the mruby VM to BPF requires special care, especially in handling offset calculations for methods and variable names.
00:12:57.760
Moreover, jump instructions behave differently in the mruby VM and in BPF: in BPF, a single instruction performs a conditional jump, while in the mruby VM, two are required.
00:13:40.960
In practice, an instruction that compares register values might need additional instructions to determine whether to jump based on the outcome of that comparison.
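The difference in jump handling can be sketched with a toy translator. This is a simplified, symbolic model and not Rucy's actual code: the instruction tuples (`:eq`, `:jmpif`, `:jeq`) and the method name `fuse_compare_and_jump` are made up for illustration, and jump targets are symbolic labels so that no offset recalculation is needed here.

```ruby
# Input (mruby-like): [:eq, a, b] stores the comparison result in register a,
# and [:jmpif, a, target] jumps when register a is truthy.
# Output (BPF-like): [:jeq, a, b, target] compares and jumps in one instruction.
def fuse_compare_and_jump(insns)
  out = []
  i = 0
  while i < insns.length
    cur = insns[i]
    nxt = insns[i + 1]
    if cur[0] == :eq && nxt && nxt[0] == :jmpif && nxt[1] == cur[1]
      # The compare and the jump on its result collapse into one instruction.
      out << [:jeq, cur[1], cur[2], nxt[2]]
      i += 2
    else
      out << cur
      i += 1
    end
  end
  out
end

program = [[:load, 1, 42], [:eq, 1, 2], [:jmpif, 1, :done], [:load, 0, 0]]
p fuse_compare_and_jump(program)
```

Because fusing changes the number of instructions, a real translator would also have to recompute numeric jump offsets afterwards.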
00:14:23.360
Lastly, every BPF program must return a value, typically stored in a specific register, and it is necessary to manage this return state.
00:14:48.320
In conclusion, we now have the capability to produce BPF binary objects directly from Ruby scripts.
00:15:04.560
Using these object files together with mruby and the BPF libraries, it is possible to build an entire command as a single binary.
00:15:24.640
Writing the whole tool in Ruby can reduce context switches for programmers, and I am working on enabling them to more easily share data structures between BPF and user-space programs.
00:15:38.240
Let's now take a look at Rucy's demonstrations in action.
00:15:52.800
First, I'll show the cgroup demo. This is a target Ruby script that can be compiled with the Rucy command.
00:16:10.640
The generated binary is a valid ELF file.
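One quick way to check such a claim from Ruby is to look at the ELF header. The sketch below is an illustration, not part of Rucy, and the method name `bpf_elf?` is hypothetical. It relies on the standard ELF layout: the magic bytes `\x7fELF`, the byte-order flag at offset 5, and the 16-bit machine field at offset 18, where 247 is the value assigned to BPF (EM_BPF).

```ruby
# Returns true when the bytes look like an ELF object built for the BPF machine.
def bpf_elf?(bytes)
  return false if bytes.bytesize < 20
  return false unless bytes.byteslice(0, 4) == "\x7fELF".b

  little_endian = bytes.getbyte(5) == 1 # EI_DATA: 1 = little, 2 = big
  e_machine = bytes.byteslice(18, 2).unpack1(little_endian ? "v" : "n")
  e_machine == 247 # EM_BPF
end

# A synthetic 20-byte header prefix: 64-bit, little-endian, ET_REL, EM_BPF.
header = "\x7fELF".b + [2, 1, 1].pack("C3") + ("\x00".b * 9) + [1, 247].pack("vv")
puts bpf_elf?(header)

# With a real object file (the name is a placeholder):
#   bpf_elf?(File.binread("filter.bpf.o"))
```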
00:16:22.740
Next, I will make a cgroup and load this object into the cgroup.
00:16:44.720
Now the process can access a random resource, but if the current process is assigned to a different group, that resource will become inaccessible.
00:17:03.440
This shows how the filtering works.
00:17:20.640
The next demo is about tracing kernel functions.
00:17:39.840
In this demonstration, Rucy and mruby code will trace the kernel function called tcp_connect.
00:18:00.960
Here is the first code that checks differences in the cross-section and compiles it into an object.
00:18:24.640
This gives us the BPF binary generated by Rucy.
00:18:43.360
The next part of the demo shows how to embed this object code into an mruby program and execute it.
00:19:09.440
With the generated command running, calls to tcp_connect, for instance, are logged as they happen.
00:19:45.600
Finally, I would like to share my research goals and what I aim to achieve in the future.
00:20:11.440
Currently, I have implemented only the minimum basic BPF functionality. For example, handling BPF map data structures is still unimplemented.
00:20:52.560
Many features in actual BPF cannot yet be used in Rucy. I would like to gradually implement these required features and create sample tools for real-world BPF operations.
00:21:47.760
This concludes my talk on the story of Rucy.
00:21:56.080
Thank you very much!