Summarized using AI

How to Hijack

Dávid Halász • March 22, 2019 • Wrocław, Poland

In this talk titled 'How to Hijack', held at the wroc_love.rb 2019 conference, Dávid Halász discusses smuggling, hijacking, and proxying with non-blocking sockets using Ruby on Rails and Rack.

Key points include:
- Introduction to Speaker and Context: Dávid, from Hungary and now residing in Brno, Czech Republic, works for Red Hat and focuses on hybrid cloud infrastructure management with ManageIQ, emphasizing its Ruby on Rails architecture.
- Browser Remote Consoles: Explained the implementation of remote desktop sessions via browsers using a VNC endpoint to access virtual machines, leveraging WebSockets for bidirectional data transfer.
- Sockets and Buffering: Introduced programming sockets, the importance of buffering in I/O operations, and the challenges of blocking I/O, which can lead to endless loops in proxy implementations.
- Non-blocking I/O: Discussed the benefits of non-blocking I/O methods and how they can be utilized to avoid complications when handling multiple sockets, including the concept of 'bouncing select'.
- Epoll and socket handling: Covered improvements in socket handling in Linux with epoll and how to manage readiness of sockets effectively, as well as the creation of Ruby wrappers for this mechanism.
- WebSocket Management and Hijacking: Explored how WebSockets facilitate persistent connections and the technique of socket hijacking to manage interactions with underlying VNC connections while maintaining server efficiency.
- TCP Smuggling: Proposed generating TCP connections that could efficiently handle data and emphasized the necessity of a browser plugin due to limitations imposed by browser sandboxes.
- Demonstration of Architecture: Dávid showcased a live demo of a VNC session running in a containerized environment, illustrating how connections to VMs can be established and managed effectively, utilizing simple server structures similar to Rack.

The conclusion highlighted that although the architecture utilizes a server, browser plugin, and client app, further refinements are necessary before it can be considered production-ready. The audience was engaged through a Q&A session, clarifying technical queries about the socket translations and operational efficiencies.

Overall, the talk provided an insightful examination of using Ruby for advanced network operations, making a case for innovative solutions in hybrid cloud environments.

wroclove.rb 2019

00:00:14.570 Hi everyone, welcome to my talk about smuggling, hijacking, and proxying with non-blocking sockets and Rack.
00:00:21.410 My name is Dávid Halász, and I pronounce my name as Halász, not as Hellish.
00:00:28.640 Polish people often confuse the pronunciation of my name. Hungarians and Poles have an agreement to keep the world guessing how we pronounce the 'sz'.
00:00:41.450 You can find me on Twitter under this handle. Feel free to tweet about any negative experiences.
00:00:46.670 I'm from Hungary and speak a little Slovak, but I live in the Czech Republic in a town called Brno. This town is famous for beer, but I'm more into MotoGP. If anyone's familiar with it, there was an infamous genetic experiment in Brno.
00:01:03.590 If you've heard of Gregor Mendel, he was the one experimenting with pea plants there. I work for Red Hat, an open-source company.
00:01:15.560 I'm involved in the ManageIQ project, which is an open-source hybrid cloud infrastructure management platform. If you have cloud or on-premise infrastructure, we can manage it regardless of the providers you are using.
00:01:31.680 What's important to you at this conference is that it's written in Ruby on Rails and contains over a million lines of code. My focus is mainly on browser remote consoles, which are essentially remote desktop sessions accessed through a browser.
00:01:50.630 To give an example, when you think of a remote desktop session, you might visualize it as a Windows remote desktop connection. Imagine you don’t know the password, but you get the idea.
00:02:08.400 In the browser, you can utilize environments like Fluxbox Desktop Manager running in a window. In ManageIQ, we implemented it so you can see a summary screen of a virtual machine, click on 'Access' and then 'VM Console', which opens a pop-up window with the VNC session.
00:02:28.680 The architecture is straightforward: you have your VM and your browser, with the VM operating on a hypervisor providing a VNC endpoint.
00:02:43.069 Even if you run VMware or VirtualBox on your local machine, you can set this VNC endpoint and initiate a VNC session. Nevertheless, browsers generally don't support VNC, so we need something in the middle.
00:03:01.909 Let’s refer to that as a proxy. As we analyze this proxy further, we uncover some intricate concepts from computer science, such as threads, events, asynchronous I/O, and blocking reads.
00:03:18.400 Don't worry; we'll touch on all of those topics. Inside the proxy, we typically have two or more endpoints: one for the WebSocket part and one for the VNC part.
00:03:27.840 You must transmit data in both directions and translate between the two endpoints. It's pretty clear for now, but what exactly is an endpoint?
00:03:41.840 Endpoints are essentially sockets, and I don't mean the sockets used for plumbing but rather programming sockets.
00:03:57.560 Here’s an example of using sockets in Ruby. You need to require the socket library unless you’re running Rails, where it’s included by default.
00:04:05.530 In this snippet, I'm opening an HTTP connection on port 80 to the conference's website, setting some headers, and reading the headers in response. However, this is non-HTTPS, so well done organizers for having a secure site.
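The snippet from the slide is not part of this transcript. A minimal sketch of the same idea might look like the following. The helper name fetch_response_headers and its parameters are assumptions made for illustration, not the code shown in the talk.

```ruby
require 'socket'

# Sends a minimal HTTP/1.1 GET request over a plain TCP socket and
# returns the status line and header lines of the response (no body).
def fetch_response_headers(host, port = 80, path = '/')
  socket = TCPSocket.new(host, port)
  socket.write("GET #{path} HTTP/1.1\r\n" \
               "Host: #{host}\r\n" \
               "Connection: close\r\n" \
               "\r\n")

  headers = []
  # The header section ends at the first empty line.
  # gets is a blocking call: it waits until a full line has arrived.
  while (line = socket.gets)
    line = line.chomp
    break if line.empty?

    headers << line
  end
  headers
ensure
  socket&.close
end
```

Calling fetch_response_headers with the name of any host that serves plain HTTP would return the status line followed by the header lines as an array of strings.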
00:04:23.220 The operations I'm performing are reminiscent of files, which are essential for networking. If you recall the old spinning hard drives, you may remember how they took a bit longer because they needed time to read the data.
00:04:41.400 The idea of buffering in both reading and writing emerged to improve efficiency. For instance, when writing data, you would write to a buffer.
00:04:54.430 Similarly, when reading data, you'd read from a prefetched buffer. This concept of buffering is still very useful for sockets.
00:05:07.370 When using sockets, writing to them actually goes to your buffer, and when the operating system's scheduler notices free time in your network card, it can send data.
00:05:22.150 However, many issues arise when the buffer is full, forcing you to wait until there's space available. This is called blocking.
00:05:35.240 For reading, I was too lazy to make slides, so I'm relying on your imagination for this. A naive implementation of proxy using blocking read and write would be disastrous.
00:05:59.010 You would face endless loops as you could block at either end without resolving the situation. Hence, the need arose for non-blocking I/O.
00:06:12.390 With non-blocking I/O, the same methods carry a '_nonblock' suffix (read_nonblock, write_nonblock), which means they won't wait if the buffer is full or empty.
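As a rough illustration of these non-blocking calls, the sketch below wraps Ruby's read_nonblock and write_nonblock. The helper names try_read and try_write are hypothetical. The exception: false option makes the calls return a symbol instead of raising when they would otherwise have to wait.

```ruby
require 'socket'

# Attempts a read without ever waiting. Returns the data that was
# available, :wait_readable if the read buffer is empty, or nil once
# the other side has closed the connection.
def try_read(io, max_bytes = 4096)
  io.read_nonblock(max_bytes, exception: false)
end

# Attempts a write without ever waiting. Returns the number of bytes
# the buffer accepted, or :wait_writable if the write buffer is full.
def try_write(io, data)
  io.write_nonblock(data, exception: false)
end
```

With a connected pair from UNIXSocket.pair, try_read on one end returns :wait_readable until something has been written to the other end.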
00:06:28.460 For instance, if you're reading from multiple sockets, you would use a loop that iterates through each socket.
00:06:43.380 That loop utilizes IO.select, which collects all the ready sockets into an array, allowing you to read from them efficiently.
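A loop of this kind could be sketched as follows. The method name read_all_ready and the chunk size are assumptions, and the slide's actual code may differ.

```ruby
require 'socket'

# Reads from several sockets as data arrives, without using threads.
# Yields (socket, data) for every chunk and returns once all sockets
# have reached end of stream.
def read_all_ready(sockets)
  open_sockets = sockets.dup
  until open_sockets.empty?
    # Blocks until at least one socket has data (or EOF) to report.
    readable, = IO.select(open_sockets)
    readable.each do |socket|
      data = socket.read_nonblock(4096, exception: false)
      case data
      when :wait_readable then next # nothing to read after all, retry later
      when nil then open_sockets.delete(socket) # peer closed the connection
      else yield socket, data
      end
    end
  end
end
```

The sockets that IO.select does not report are simply left alone until a later iteration, so one slow connection does not hold up the others.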
00:06:57.360 This method works seamlessly in one direction. However, building a proxy that handles both ends without relying on threads is where complications arise.
00:07:12.290 This is my initial attempt to create a proxy; I apologize if I cannot see the screen properly. In this setup, you have a pair of sockets, A and B, and a method to translate data between both directions.
00:07:27.400 You would enter an endless loop, iterating through all the sockets. If one part isn’t ready, you simply skip the iteration.
00:07:40.540 The challenge appears when both sockets go into a feedback loop, thus leading to an endless loop, consuming CPU resources unnecessarily.
00:07:53.750 This became problematic for potential production environments; therefore, I sought out alternatives. Initially, I considered dropping Ruby entirely but realized the conference was for Ruby.
00:08:10.290 Using threads with blocking I/O is not an efficient approach since it requires many threads to handle connections.
00:08:29.560 Alternatively, libraries such as EventMachine or Celluloid looked promising until I found our application requires PostgreSQL in async mode.
00:08:44.150 Then I discovered a new library called Async, but it wasn't available back when I began working on this topic. If I were to start again now, I would certainly use it.
00:09:03.750 The concept of auto-yielding fibers in Ruby would be beneficial since they yield automatically during I/O operations, unlike regular fibers, which require explicit yielding.
00:09:17.300 In a hypothetical example with fibers, having them run inside endless loops can convert blocking I/O into non-blocking I/O.
00:09:29.529 Unfortunately, such features aren't in Ruby until version three, which is yet to arrive.
00:09:43.159 This prompts the creation of a method I call 'bouncing select'. The primary concept is to keep the arrays of sockets dynamic.
00:09:57.799 After an IO.select call, I remove the ready sockets from the original arrays. If a transmission occurs, the socket is bounced back.
00:10:18.650 Thus, when socket A is ready for reading, it is removed from the iteration; if socket B isn't ready for writing, it merely skips waiting.
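The talk's own implementation is not reproduced in this transcript. The sketch below is one possible reading of the 'bouncing select' idea, with a hypothetical method name bouncing_proxy. It simplifies error handling and uses a plain blocking write for each chunk, which the original may well avoid.

```ruby
require 'socket'

# Copies data in both directions between sockets a and b until either
# side closes. Sockets reported as ready are taken out of the watched
# arrays and only bounced back after they were used in a transmission,
# so a ready socket with an unready peer cannot make the loop spin.
def bouncing_proxy(a, b, chunk_size = 4096)
  peer = { a => b, b => a }
  watch_read = [a, b]  # still waiting to become readable
  watch_write = [a, b] # still waiting to become writable
  ready_read = []
  ready_write = []

  loop do
    readable, writable = IO.select(watch_read, watch_write)
    watch_read -= readable
    watch_write -= writable
    ready_read |= readable
    ready_write |= writable

    ready_read.dup.each do |src|
      dst = peer[src]
      next unless ready_write.include?(dst) # other end cannot take data yet

      data = src.read_nonblock(chunk_size, exception: false)
      if data.nil? # end of stream: shut the whole pair down
        [a, b].each { |s| s.close unless s.closed? }
        return
      end

      ready_read.delete(src)
      watch_read << src # bounce the reader back
      next if data == :wait_readable

      dst.write(data) # simplification: may block briefly on a large chunk
      ready_write.delete(dst)
      watch_write << dst # bounce the writer back
    end
  end
end
```

When the connection is idle, both sockets end up in ready_write and only the read array is still watched, so IO.select sleeps until data actually arrives instead of returning immediately on every iteration.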
00:10:31.010 However, IO.select becomes inefficient when dealing with thousands of sockets because it's a system call that requires the entire list of sockets to be passed to the kernel every time.
00:10:43.750 The Linux kernel developers proposed an improvement, called 'epoll', which allows you to register sockets with one call and wait via another.
00:11:04.380 Unfortunately, this feature is exclusive to Linux, so Mac users will have to rely on kqueue or fall back to IO.select.
00:11:17.600 While playing with it, I found an intriguing feature known as EPOLLONESHOT ('epoll one shot'), which automatically disables a socket in the watched set once an event has been reported for it.
00:11:31.010 When I researched Ruby wrappers for epoll, I found that they often struggled with 'one shot', leading me to implement my own solution in C.
00:11:44.570 Let's analyze the code where I established a register function with a socket and associated operations, handling readiness.
00:11:57.600 This approach works well on Linux, but I invite contributions for a kqueue implementation, as we have some Mac developers needing a compatible solution.
00:12:08.420 Now, let's shift the focus to WebSockets, which are the backbone of our remote consoles, functioning as HTTP on steroids.
00:12:25.480 They support bidirectional transfer after an HTTP upgrade. When your browser sends an HTTP GET request with an upgrade header, it initiates the process.
00:12:39.350 The web server responds with an HTTP 101 Switching Protocols status, indicating that the protocol has been upgraded to WebSocket.
00:12:55.360 Rack is the interface most Ruby web servers use; an application responds to each request with a status, headers, and a body.
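For reference, the server side of this handshake can be sketched in a few lines. The GUID and the derivation of Sec-WebSocket-Accept come from RFC 6455. The method names websocket_accept and upgrade_response are hypothetical.

```ruby
require 'digest/sha1'

# Fixed GUID defined by RFC 6455 for the opening handshake.
WEBSOCKET_GUID = '258EAFA5-E914-47DA-95CA-C5AB0DC85B11'

# Derives the Sec-WebSocket-Accept value from the client's
# Sec-WebSocket-Key: Base64 of the SHA-1 of key + GUID.
def websocket_accept(key)
  [Digest::SHA1.digest(key + WEBSOCKET_GUID)].pack('m0')
end

# Builds the raw "101 Switching Protocols" response for a client key.
def upgrade_response(key)
  "HTTP/1.1 101 Switching Protocols\r\n" \
  "Upgrade: websocket\r\n" \
  "Connection: Upgrade\r\n" \
  "Sec-WebSocket-Accept: #{websocket_accept(key)}\r\n" \
  "\r\n"
end
```

After this response has been written, the same TCP connection stays open and both sides exchange WebSocket frames instead of HTTP messages.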
00:13:08.290 It’s functional for a request-response model, but WebSockets require persistent connections. Developers implemented 'socket hijacking' to manage such connections.
00:13:20.840 Through hijacking, we can access the socket behind the incoming request, allowing us to establish a connection with the VNC server.
00:13:37.890 The magic happens by pushing the VNC connection to the proxy in a separate thread, enabling the web server to continue handling requests.
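A full hijack in a Rack application might be sketched like this. It relies only on the rack.hijack entry of the Rack environment. The factory name hijacking_app and the handler block are assumptions made for illustration, not the ManageIQ code.

```ruby
require 'socket'

# Returns a Rack-compatible app that takes over the client's socket and
# passes it to the given handler block on a separate thread, so the web
# server's own thread is free to serve the next request.
def hijacking_app(&handler)
  lambda do |env|
    hijack = env['rack.hijack']
    if hijack
      socket = hijack.call # full hijack: the server hands over the raw socket
      Thread.new { handler.call(socket, env) }
      [200, {}, []] # servers ignore the response after a full hijack
    else
      [501, { 'content-type' => 'text/plain' }, ['Hijacking is not supported']]
    end
  end
end
```

In a config.ru file, the handler block would be the place to send the upgrade response, open the connection to the VNC endpoint, and start proxying between the two sockets.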
00:13:50.230 So, this is how our remote consoles operate, integrating the upgrade and proxy components. Now for the intriguing part: smuggling.
00:14:07.889 While implementing the upgrade and WebSocket parts, I wondered whether, instead of upgrading to WebSocket, the connection could be downgraded to plain TCP.
00:14:26.620 The idea is generating TCP connections that can handle data in any format, which is significantly beneficial in scenarios where desktop applications outperform web applications.
00:14:42.430 Offering a superior user experience in VNC remote desktop connections can be crucial, leading me to create an architecture that involves a server proxy.
00:14:59.380 The VNC endpoint is delivered from the hypervisor to the VNC client, facilitating efficient communication once the upgrade occurs.
00:15:16.560 Essentially, this establishes a pathway for using raw TCP-based protocols to transfer data.
00:15:25.560 In the context of ManageIQ, the scenario unfolds where a button labeled 'native console' sends a request to the server.
00:15:38.520 The response triggers a browser plugin that opens up a local port, enabling connection routing for the VNC client to the web server.
00:15:56.100 After that, the proxy on the web server sends the HTTP upgrade request to allow smooth tunneling between the VM and VNC client.
00:16:12.190 However, a significant concern remains: the browser plugin is constrained by browser sandboxes, limiting its ability to establish direct TCP connections.
00:16:29.560 In reality, both a browser plugin and a client app are still required for this architecture to operate as intended.
00:16:44.780 I speculate that W3C will eventually implement the ability to open TCP connections within browsers with user permission.
00:17:00.030 I’ve developed an architecture that utilizes a server, browser plugin, and client app that works together to facilitate this.
00:17:11.750 While I dislike writing JavaScript, I created a simple server that operates similarly to Rack, handling request routing using a block.
00:17:27.150 Let's discuss advantages and disadvantages of this architecture. The advantage is it mimics HTTP, allowing you to utilize headers and middleware.
00:17:53.460 The disadvantage is the requirement of a browser plugin that is not fully polished or production-ready yet.
00:18:05.490 So far, I can demonstrate how this architecture works in practice. Allow me some time to set this up.
00:18:18.580 After some technical challenges, I'll share a VNC session running as a demo remote container.
00:18:29.310 In this live demonstration, I'll connect to a VM where I’m installing Docker and pulling down containers for SSH and VNC servers.
00:18:50.490 The goal is to demonstrate how I run Puma within a Ruby environment, managing connections regardless of endpoint URL.
00:19:04.790 I set up a simple website to make the connection process visible, allowing you to see the VNC session we’re connecting to.
00:19:17.560 As I navigate this, I’ll illustrate how the VNC and SSH connections are active and operational.
00:19:31.380 If I hit the stop button, the connection closes but can easily be reestablished with the press of another button.
00:19:48.350 However, if there are any typos, I’ll of course address that promptly, and ensure that everything is functioning as expected.
00:20:06.510 I’m currently trying to connect to localhost, ensuring it's running effectively inside the container.
00:20:23.080 The VNC connection is established, running correctly with container background services.
00:20:37.450 I'll now demonstrate a successful SSH connection as well; this highlights the versatility of this configuration.
00:20:50.670 As demonstrated, this session runs efficiently within a containerized environment.
00:21:06.180 In conclusion, I hope this showcase helps you understand the architecture and design choices made.
00:21:20.460 Thank you all for your attention. I’m more than willing to answer any questions you may have.
00:21:34.970 I know I have some stickers to give away for great questions!
00:21:42.970 One participant asks about my mascot sticker. Yes, I’ll share some!
00:21:50.970 Another person asked if Red Hat developers receive hats; indeed, it's part of the company culture.
00:22:04.620 A question arises about the translation between VNC and WebSocket frames. I confirm, it’s done typically via method calls.
00:22:19.980 While this is efficient in Ruby, the smuggling doesn't require extensive processing, ensuring timely translations.
00:22:36.210 If you’re seeking very streamlined operations, I suggest exploring kernel features that enable direct socket connections.
00:22:52.950 Thank you for your insightful questions; I hope my answers clarify the points raised.