Adding Security to Microcontroller Ruby

00:00:03.919 Hello everyone, thank you for coming. I'm Ryo Kajiwara, and I go by sylph01 on the internet. Today, I will be talking about adding security to Microcontroller Ruby. As you can see, the title is self-explanatory. The slides are available through this QR code or the URL; I'll tweet this and also share it on Discord. I recommend that you open the slides and follow along because I have a lot of text. Don't worry, I'm not the kind of security guy who tricks you into loading malicious stuff by scanning this QR code.

00:00:20.320 I do a lot of things in my field. If anything on this slide catches your interest, let's talk later. I am a freelance web developer focused on digital identity and security. I've worked on writing, editing, and implementing standards in the W3C and IETF. Previously, at RubyKaigi Taiwan, I gave a talk titled 'Adventures in the Dungeons of OpenSSL,' where I implemented Hybrid Public Key Encryption (HPKE) using the OpenSSL gem and by extending its C extension.

00:00:51.680 After gaining some experience with C extensions in cryptographic libraries, I started pondering the question: why not bring cryptography to microcontrollers? Can we have TLS in PicoRuby? Our target environment today is PicoRuby plus R2P2, both developed by hasumikin. PicoRuby is a Ruby implementation that works on microcontrollers, and R2P2 is a shell system that runs on it. A well-known use case is PRK Firmware, which allows you to run Ruby on keyboards. Here's a Raspberry Pi Pico, which uses the RP2040 microcontroller, featuring a dual-core ARM Cortex-M0+ at 133 MHz with 264 kilobytes of SRAM and 2 megabytes of Flash.

00:01:29.560 This RAM size is quite limited, and it will become important later. The board also has wireless LAN capabilities with a CYW43439 chip, costing approximately 1,353 yen (around 10 USD). It also has Giteki certification (Japan's technical conformity mark), allowing us to legally use wireless connectivity in Japan. Moreover, MicroPython can already connect to the internet through Wi-Fi and perform HTTPS requests. So we wanted to achieve something similar, which led me to start the ASMR project: Adding Security to Microcontroller Ruby. More accurately, it's about adding SSL/TLS to Microcontroller Ruby, or simply adding networking to Microcontroller Ruby.

00:02:01.680 I initially framed the project in a way that didn't have any meme value, so I changed the title. I prepared a demonstration, but live demos over Wi-Fi can be unreliable, so I have a video to illustrate the concept. This is the R2P2 shell system. We start up IRB, initialize the Wi-Fi driver, and then enable station mode so that we can connect to an access point. Once that succeeds, we use the blocking connect function to link to the access point, providing the SSID and the password. I won't show my actual home SSID and password, but after a successful connection, we instantiate an HTTPS client and connect to example.com.

00:02:45.319 However, in the demo, we encountered an error initially, and the connection failed. After a couple of retries, we eventually succeeded in obtaining the content. I will explain the initial failure later, but let's return to the slides. Here's the obligatory warning when discussing security: cryptographic APIs are susceptible to misuse. If you intend to use them in production, it is crucial to consult with an expert before proceeding. Gems that have made their way into the master branches of PicoRuby and R2P2 are now production-ready. However, everything else, particularly the parts that touch networking hardware, should be treated as experimental. There are some items that need to be addressed before merging.

00:03:30.439 Additionally, I treat the Pico as a normal computer whenever feasible, because debugging embedded software is non-trivial. Embedded bugs can be particularly tricky to identify, as there are many factors you could overlook. Thanks to the Pico SDK and R2P2, I'm able to work with the Pico as if it were a conventional computer. First, let's discuss cryptography in PicoRuby. Here's a quick recap of how to perform SHA-256 in Ruby: you require the OpenSSL gem, instantiate a digest object, call update as many times as needed, and then call the hexdigest method to get the SHA-256 digest.
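As a minimal sketch of the steps just described, this is roughly what SHA-256 looks like in desktop Ruby with the OpenSSL gem. The input strings are only illustrative and are not taken from the slides.

```ruby
require 'openssl'

# Instantiate a digest object for SHA-256.
digest = OpenSSL::Digest.new('SHA256')

# Feed the input in as many pieces as needed.
digest.update('a')
digest.update('bc')

# Hex-encoded digest of everything fed so far (the bytes "abc").
hex = digest.hexdigest
puts hex
```

Feeding the input piece by piece gives the same result as hashing the whole string at once, which matters later when memory is tight.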

00:04:02.920 In Ruby, for symmetric key encryption using AES, you instantiate a cipher object for AES-128, call the encrypt method, set the key and initialization vector, and then call update with the text you wish to encrypt. To finalize the encryption, you call cipher.final. Decryption works the same way, except that you call cipher.decrypt first. So why don't we just bring OpenSSL into PicoRuby? The primary reason is that it's too large; OpenSSL covers a wide array of functionality, which makes its binary too big for embedded systems. Instead, there are cryptographic libraries specifically designed for such environments, including mbed TLS and wolfSSL. The Pico SDK uses mbed TLS, which offers build options to include only the cryptographic functions you need.
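The AES steps above can be sketched as follows in desktop Ruby. The talk does not say which mode this recap used, so CBC is an assumption here, and the message is only a placeholder.

```ruby
require 'openssl'

key = OpenSSL::Random.random_bytes(16) # 128-bit key
iv  = OpenSSL::Random.random_bytes(16) # fresh IV for this one message

# Encrypt: choose the direction first, then set key and IV.
cipher = OpenSSL::Cipher.new('aes-128-cbc')
cipher.encrypt
cipher.key = key
cipher.iv = iv
ciphertext = cipher.update('attack at dawn') + cipher.final

# Decrypt: same steps, but call decrypt instead of encrypt.
decipher = OpenSSL::Cipher.new('aes-128-cbc')
decipher.decrypt
decipher.key = key
decipher.iv = iv
plaintext = decipher.update(ciphertext) + decipher.final

puts plaintext
```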

00:04:46.919 When developing cryptography code, you need to look at the encrypted content, which is usually a binary string. It helps to use Base16 or Base64 to examine these binaries, but PicoRuby originally did not have them. Base16 and Base64 exist for human readability; machines do not need them. So, to make debugging cryptographic operations easier, I implemented Base16 and Base64 as an mrbgem, which also serves as a simple example for anyone interested in mrbgem development. Here's how AES looks in PicoRuby using mbed TLS; it resembles what you write with OpenSSL. The SHA-256 implementation looks similar as well.
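To show what these two encodings do, here is a small sketch using CRuby's built-in pack and unpack. The helper names are hypothetical and are not the API of the PicoRuby mrbgem.

```ruby
# Base16 (hex): two lowercase hex digits per byte.
def base16_encode(binary)
  binary.unpack1('H*')
end

def base16_decode(hex)
  [hex].pack('H*')
end

# Base64 without line breaks ('m0' is the strict, unwrapped form).
def base64_encode(binary)
  [binary].pack('m0')
end

def base64_decode(text)
  text.unpack1('m0')
end

raw = "\x00\xFFMan".b # arbitrary bytes, not printable as-is
puts base16_encode(raw)
puts base64_encode(raw)
```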

00:05:25.680 Before I got involved, PicoRuby already had CMAC implemented using mbed TLS. I added the most commonly used symmetric encryption algorithm, AES, in two modes: a non-AEAD mode (CBC) and an AEAD mode (GCM). I also implemented SHA-256, which is currently considered secure. I purposefully did not include SHA-1 or MD5 due to their known security vulnerabilities, so you won't find them in my work. Now let's shift our focus to implementation details, specifically how to write C extensions for mruby/c.

00:06:05.960 When writing a C library wrapper, you need to wrap the library's context value, which is a C value, in a Ruby object so that the context is kept as part of the instance. In CRuby, this is accomplished using the RData macro. In mruby/c, it's a bit different: we create an instance using the mrbc_instance_new function, which takes a pointer to the class and the size of the buffer to allocate. What we want to wrap is the context struct created by OpenSSL or mbed TLS. To define methods, you might be familiar with CRuby's rb_define_method, which takes the class, the method name, the function pointer, and the number of arguments. In mruby/c, this is slightly different: we use mrbc_define_method, which takes a pointer to the VM, the class, the method name as a string, and the function pointer.

00:06:56.400 Unlike with CRuby, we don't specify the number of arguments; instead, we retrieve each argument by position using the GET_ARG macro. With that in mind, here's what the initialization of the digest instance looks like: we fetch the algorithm ID for SHA-256 from the first argument using GET_ARG(1). Next, we instantiate the digest object with a buffer the size of the mbed TLS MD context. Then we get the pointer to that buffer and call the mbed TLS function that initializes the context within it. After that, we call additional mbed TLS functions to set up the context before returning the instance itself.

00:08:06.040 Next, let's examine how the update function works. First, we extract the wrapped context from the instance. This is the mbed TLS MD context, which we then pass into the mbed TLS functions. When defining an instance method like this, it's important to call the mrbc_incref function (as seen in the second to last line) to prevent the object from being deallocated. Without it, the next time the object is referenced, for example on another instance.update call, it will already have been deallocated, leading to a segmentation fault, which we certainly want to avoid. Please note that I implemented the APIs with update and finish, favoring a multiple-call approach over one-shot APIs, which would encrypt the entire buffer at once.

00:09:54.399 Using one-shot APIs in a memory-constrained environment like PicoRuby is not ideal, since you would need the entire original string buffer plus an output buffer of the same size, effectively doubling memory use. With multiple-call APIs, we can process a portion of the input string, send it, free the memory, and continue with the rest. I must also emphasize that using fixed or non-random values for nonces and initialization vectors is a significant security risk. For instance, the PlayStation 3's signing key was leaked due to the reuse of an ECDSA nonce.
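Both points can be sketched together in desktop Ruby: process the input chunk by chunk with update, and draw a fresh random nonce for every message. This uses the OpenSSL gem rather than the PicoRuby wrapper; the function names and the chunk size are assumptions made for illustration.

```ruby
require 'openssl'
require 'stringio'

CHUNK_SIZE = 16 # illustrative; a real device would pick this to fit its RAM

# Encrypts input to output chunk by chunk with AES-128-GCM.
# Returns the nonce and authentication tag the receiver needs.
def encrypt_stream(key, input, output)
  cipher = OpenSSL::Cipher.new('aes-128-gcm')
  cipher.encrypt
  cipher.key = key
  nonce = cipher.random_iv # fresh random nonce, also set on the cipher
  while (chunk = input.read(CHUNK_SIZE))
    output.write(cipher.update(chunk)) # only one chunk is held at a time
  end
  output.write(cipher.final)
  [nonce, cipher.auth_tag]
end

# Mirror image of encrypt_stream; final raises if the tag does not match.
def decrypt_stream(key, nonce, tag, input, output)
  decipher = OpenSSL::Cipher.new('aes-128-gcm')
  decipher.decrypt
  decipher.key = key
  decipher.iv = nonce
  decipher.auth_tag = tag
  while (chunk = input.read(CHUNK_SIZE))
    output.write(decipher.update(chunk))
  end
  output.write(decipher.final)
end

key = OpenSSL::Random.random_bytes(16)
message = 'sensor reading: 23.5 C, humidity 40%, battery OK'
encrypted = StringIO.new(String.new)
nonce, tag = encrypt_stream(key, StringIO.new(message), encrypted)

decrypted = StringIO.new(String.new)
decrypt_stream(key, nonce, tag, StringIO.new(encrypted.string), decrypted)
puts decrypted.string
```

Because the nonce is generated inside encrypt_stream, a caller cannot accidentally reuse one across messages with the same key.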

00:10:40.079 To secure cryptographic processes, we require a reliable random number generator. However, do we have a standard RNG on these devices like we do on conventional computers? Unfortunately, no. We must construct a random number generator from the resources available on the Pico. Our solution is to use the ring oscillator to extract random bits, accessible through the RNG gem that is part of PicoRuby. The ring oscillator consists of a sequence of NOT gates that toggle with the clock. The oscillation generates bits randomly, providing us with zeros and ones.

00:11:25.760 The code snippet illustrates how to retrieve random bits: it generates a 32-bit integer and obtains 8 bits. Since randomness generated by hardware may be biased, using it directly could lead to a suboptimal RNG. Instead, we employ a technique called whitening or debiasing, which transforms the raw bits and discards some of them so that ones and zeros come out with even probability. That's a brief overview of the cryptography portion; now let's dive into the networking aspect.
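One well-known debiasing scheme is von Neumann's: read the raw bits in pairs, emit 0 for the pair 01 and 1 for the pair 10, and discard 00 and 11. Whether the PicoRuby RNG gem uses exactly this scheme is an assumption; the sketch below only illustrates the idea on an array of bits.

```ruby
# Von Neumann debiasing over an array of 0/1 integers.
# If raw bits are independent with a fixed bias, the pairs 01 and 10
# are equally likely, so the output is unbiased.
def von_neumann_debias(raw_bits)
  raw_bits.each_slice(2).filter_map do |first, second|
    next if second.nil?      # odd leftover bit at the end: drop it
    next if first == second  # 00 or 11 carries no usable bit: drop it
    first                    # 01 becomes 0, 10 becomes 1
  end
end

raw = [0, 1, 1, 0, 0, 0, 1, 1, 1, 0]
p von_neumann_debias(raw)
```

The cost is throughput: every discarded pair means more raw bits have to be read from the hardware to fill one output byte.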

00:12:05.200 So what do we mean by networking in this context? The objective is to connect to Wi-Fi using 802.11, which is a Layer 2 protocol, then acquire an IP address (Layer 3). Afterward, we communicate with servers via TCP (Layer 4), possibly encrypted with TLS (Layer 5). Finally, we use HTTP as our application layer (Layers 6 and 7). What were we missing? We already had the hardware driver included in the Pico SDK, a TCP/IP stack in lwIP (lightweight IP), and the TLS library (mbed TLS), which is also part of the Pico SDK. However, we lacked HTTPS.

00:13:03.080 To bridge this gap, I primarily worked on writing interfaces to PicoRuby and implementing basic HTTP functionality in Ruby. First, we integrated the libraries into R2P2: the CYW43 architecture library in its lwIP thread-safe background variant, mbed TLS, and the CYW43 driver. The Wi-Fi driver and lwIP need periodic servicing to move data from the hardware into memory, and this can be done in two modes. The first is the lwIP poll mode, where the main application must periodically call the Wi-Fi driver to trigger callbacks and move data from the hardware. This approach, however, is tedious.

00:13:51.599 Instead, I opted for the lwIP thread-safe background mode, in which a background process services the Wi-Fi driver and the TCP/IP stack and which is safe to use from multiple cores. There's also a mode that runs on FreeRTOS, but that was excessive for a simple Wi-Fi implementation. This is what our interface with the driver looks like. I demonstrated this in the video, so you can check the slides later for more detail.

00:14:31.760 After getting the Wi-Fi driver part sorted, I worked on a wrapper for DNS, which is simpler than the TCP client. lwIP uses UDP to talk to DNS servers. The dns_gethostbyname function accepts a callback, which is executed once a DNS record is found, and many lwIP functions work the same way. For instance, look at the part that gets the IP address: it calls dns_gethostbyname and passes it a pointer to the DNS-found callback function.

00:15:06.360 When the DNS record is found, the callback receives a void pointer, through which we copy the IP address into our own IP variable to return it to the main function. You might recall that the first HTTPS call failed during the demo. I debugged this by setting up a hotspot on a Raspberry Pi and following an official tutorial on monitoring packets with Wireshark. The packet capture showed that the Pico attempted to connect to a bogus IP address before the DNS response had been received.

00:15:50.399 The issue arose because the IP address variable was not initialized, that is, not zeroed out, so the code could not tell that no answer had arrived yet. After initializing the variable properly, I confirmed that the TCP connection went to the correct IP address once the DNS record had been retrieved. Implementing the TCP client is generally straightforward: we create a protocol control block (PCB), register callbacks for various events (data received, data sent, errors, and idle connections), and use a polling function for idle connections.
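The callback pattern and the initialization fix can be modelled in plain Ruby. This is not lwIP code: FakeResolver, its methods, and the address below are hypothetical stand-ins, written only to show why the result variable needs an explicit "not resolved yet" value before the callback fires.

```ruby
# Hypothetical stand-in for an asynchronous resolver: resolve only queues
# the request, and the callback runs later when service is called.
class FakeResolver
  def initialize(records)
    @records = records
    @pending = []
  end

  def resolve(hostname, &on_found)
    @pending << [hostname, on_found]
  end

  # Models the periodic servicing that eventually delivers the answer.
  def service
    hostname, on_found = @pending.shift
    on_found&.call(@records[hostname])
  end
end

def lookup(resolver, hostname, max_polls: 10)
  address = nil # explicit "no answer yet" marker, set before the request
  done = false
  resolver.resolve(hostname) do |ip|
    address = ip
    done = true
  end
  max_polls.times do
    break if done
    resolver.service
  end
  address # nil means "not resolved", never a leftover garbage value
end

resolver = FakeResolver.new('example.com' => '192.0.2.1')
p lookup(resolver, 'example.com')
```

A caller that checks for nil before opening the TCP connection cannot end up connecting to an address that was never written.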

00:16:35.680 Here is an outline of the steps. First, we create the PCB with altcp_new and keep it in the TCP connection state. The state also holds pointers to two buffers: the mrbc_value holding the data to send and a buffer for incoming data. The incoming data ends up in the received-data mrbc_value. We also keep a pointer to the VM so that the buffers can be freed properly.

00:17:03.080 Here's how the receive callback is structured. Packets arrive in a struct called pbuf, or packet buffer. We copy the contents of these packets into our receive buffer, using a while loop to handle multiple packets. After the while loop has processed the incoming packets, we copy the data into the corresponding mrbc_value. Next is the main function responsible for sending data over TCP. When the connection finishes or encounters an error, the TCP client poll function called in the while loop returns zero.

00:17:44.360 At that point, the received-data mrbc_value is ready to be returned from the connection. Once the TCP client is done, building a basic HTTP client is relatively straightforward: an HTTP request is just a string, so it can be built in Ruby. With these components in place, the HTTP client itself is quite simple.
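To make that concrete, here is a minimal sketch of building a GET request and splitting a raw response in plain Ruby. The method names are hypothetical, not the PicoRuby client's API, and the sketch deliberately ignores chunked transfer encoding and other details a real client needs.

```ruby
# Builds a minimal HTTP/1.1 GET request. "Connection: close" asks the
# server to close the connection, which marks the end of the response.
def build_get_request(host, path = '/')
  [
    "GET #{path} HTTP/1.1",
    "Host: #{host}",
    'Connection: close',
    '',
    ''
  ].join("\r\n")
end

# Splits a raw response into status code, headers, and body.
def split_response(raw)
  head, body = raw.split("\r\n\r\n", 2)
  status_line, *header_lines = head.split("\r\n")
  status = status_line.split(' ', 3)[1].to_i
  headers = header_lines.to_h do |line|
    name, value = line.split(':', 2)
    [name.downcase, value.strip]
  end
  [status, headers, body.to_s]
end

puts build_get_request('example.com')
```

The request string is what would be handed to the TCP (or TLS) client's send function, and the bytes that come back would go through something like split_response.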

00:18:25.440 Next, we turn to TLS, which lwIP provides through its application layered TCP (altcp) stack. Integrating TLS is not overly complicated: instead of using altcp_new to create the PCB, we call altcp_tls_new with the client configuration. Then we set the host name using the mbedtls_ssl_set_hostname function. After this, the rest of the process remains the same. What happens is that if the PCB is a TLS one, lwIP calls the TLS functions to encrypt data before it is sent through TCP.

00:19:11.759 When data is received, the TLS callbacks decrypt it first and then pass the plaintext to the TCP callbacks. The cryptography is therefore handled transparently. Note that the cryptography wrappers I covered earlier are not what is used here; those are for application-level cryptographic operations. And although I said that TLS integration was trivial, it turned out to present some challenges.

00:20:01.560 One significant complication involved memory management: lwIP has its own memory management mechanism, separate from the mruby/c VM and the Pico runtime. When I started implementing TLS, the board unexpectedly locked up due to memory conflicts. lwIP consumed so much memory that it encroached on the mruby/c VM's memory. Consequently, I had to reduce PicoRuby's heap memory from 194 kilobytes to 96 kilobytes, effectively halving the already limited size.

00:20:37.920 Moreover, the ordinary malloc and free functions common in C are not available in this environment. We have to use either mrbc_alloc and mrbc_free with the correct VM, or the lwIP variants, and each buffer must be freed by the allocator that created it. Carelessly using the wrong free function in the TCP client code led to erratic behavior and cost me three hours of troubleshooting. Note that this overview does not cover the entire process; I intentionally omitted several details for brevity, including periodic servicing of the hardware driver, TCP handshakes, TLS handshakes, and handling large chunks of data.

00:21:28.800 In short, do not attempt this lightly; working with PicoRuby does present some challenges. Now, let's discuss future work. In desktop Ruby, we use TCP sockets that follow the standardized BSD socket API, which is widely recognized and used across numerous platforms. lwIP also has an equivalent; however, it requires an RTOS, which we deemed unnecessary for our purposes.

00:22:15.720 Getting there takes a lot of work. Initially, I considered porting the MicroPython network module to PicoRuby but quickly realized it was overly complex. Instead, I opted to implement minimal functionality first, laying the groundwork for a socket-like API. The next logical step is to add server capabilities. This is currently underway with the present API, but I need to design a Ruby API that lets Ruby code handle responses efficiently. This task should become significantly more manageable with a socket-like API.

00:23:06.840 However, we must remain aware that this will be restricted by PicoRuby's limited capabilities. We currently implement blocking I/O, so the question remains: can PicoRuby handle non-blocking I/O? Additionally, is it feasible for PicoRuby to support multiprocessing or multi-threading? While the Raspberry Pi Pico is multicore, other hardware we intend to target may not share that capability. So, why bother pursuing this direction?

00:23:36.120 To conclude today's talk, we can affirm that PicoRuby can use TLS. So far we have integrated Base16, Base64, SHA-256, AES, and random number generation into PicoRuby and its R2P2 environment. The networking components are still experimental, and I'm working diligently to merge them into the master branch.

00:24:19.160 Before finalizing the merge, there are a few loose ends, especially renaming C functions for consistency, since C has no namespaces. Error handling also needs refinement, as it remains somewhat erratic. Moreover, we need to make networking optional with build flags, so that builds that do not use networking can reclaim the memory. However, we must also critically evaluate whether TLS is truly necessary: symmetric cryptography is fast enough in constrained environments, but asymmetric cryptography (such as RSA and elliptic curves) is often too slow. Performance data from similar hardware indicates that it takes upwards of 6.4 seconds to verify RSA 2048 signatures, which results in timeouts for our connections.

00:25:12.960 Additionally, lacking a trusted certificate store, we obtain encryption through TLS but miss out on the authentication aspect. Based on your security requirements, it may be adequate to implement symmetric encryption between the gateway and Pico, while employing TLS from the gateway onward. Keep in mind that your Wi-Fi password will be stored within the Pico for connectivity. Thus, if the device is physically compromised, your Wi-Fi password would be at risk. But do you consider that significant in most scenarios?

00:26:01.880 If you are looking for more of a traditional computer experience, you might want to consider a Raspberry Pi Zero 2 W, which runs Linux and allows for SSH access and running GUI applications. You may not even need to use mruby; CRuby would work comfortably on it. However, the reason we choose the Pico lies in its proximity to hardware, making it simpler to embed into other devices.

00:26:29.240 In an IoT environment, these devices work together. With PicoRuby's Wi-Fi connectivity, we edge closer to realizing true IoT functionality. There are various avenues to explore within the IoT domain, including support for CoAP and for CBOR, the latter of which resembles JSON in binary form. Additionally, there are working groups in the IETF focused on constrained environments, which is essentially what IoT devices are. We can confidently say that PicoRuby in the IoT sphere is a blue ocean of opportunities awaiting your contributions.

00:27:23.600 I would like to thank hasumikin-san for the extensive work on PicoRuby and for helping me with my development. Thanks also to the RubyKaigi speakers who inspired me to give this talk today, particularly the "team protocol implementers," unasuke-san and shioimm-san. A heartfelt thank you to the organizers of RubyKaigi 2024 for hosting this event. That's the conclusion of my talk. If you have any questions, please feel free to tweet at me or reach out through other platforms. Thank you very much.