00:00:11.719
Hello, hi! My name is Dave Tapley. You can find me on Twitter at Dave Tapley. Firstly, thank you for coming to my talk. It's going to be fairly broad, and I'm going to show you how I've been engineering a solution to a problem in a board game.
00:00:24.300
If you are here because of the board games in the title, I want to clarify that there is actually only one board game I'm going to discuss. However, if you are here because 'augmented reality' is in the title, then yes, I have been working on an augmented reality app for the web using Rails.
00:00:36.329
To that point, almost nothing I am going to show you is probably the right way to solve these problems. It is definitely not the right tool for the job, but I have had a lot of fun with this little project. I've gotten to use some older and newer technologies in weird and wonderful ways.
00:01:02.670
But before we get into the engineering fun, we need to talk about board game fun. So, introducing PitchCar, which you can see on the screen. I thought the best way to show you PitchCar is to explain how it works. The game is simple: each person has one of these little cars, which is just a small wooden disk, and players take turns to place their car at the start.
00:01:24.150
Players take turns to flick their cars around the track, and the first person to complete two laps is the winner. The track is made up of little pieces that jigsaw together; it's just a series of straights and corners. There is also a smaller track that you can build, providing 630 possible combinations of track, so you have plenty of replay opportunities.
00:02:05.490
One thing to notice about the track is these little red walls. The walls are really what makes the game fun; you can slide around them and bounce off them. However, a side effect is that there are places where there are no walls.
00:02:20.900
Coming off the track is just part of the game; it happens all the time. The good news is that you're not out of the game; you just go back to where you were. However, remembering where to return can be hard, and sometimes you can knock yourself or other players off the track simultaneously.
00:02:51.769
Even if you manage to remember where you have to go, it's challenging because it's a game all about being in the right place at the right time. It's very easy to misremember: maybe you were half an inch to the left, or the other person was actually behind you. So, we're all human, and no harm done, but as an engineer, I thought we could do better. So, let's have a computer track it!
00:03:16.620
Broadly speaking, I mean capturing images from the webcam and then identifying where the cars are. Then, if someone leaves the track, we will want to show where they were so they can be placed back accurately. That's cool!
00:03:50.680
Before we get into how I engineered the solution, I want to do a brief overview to justify some of the decisions I made. There are broadly three things I needed to do. One is to capture the images from the webcam. I had never played with this before; it turns out there’s this Web Real-Time Communications API.
00:04:14.890
No surprise, it involves JavaScript. Like most people, I cautiously say that JavaScript is the language of the web, and it's okay! However, I hadn't played with any new frameworks in a while, so I thought I'd give Vue.js a try.
00:04:37.130
I won’t say too much other than I'm kind of back into JavaScript now, which I never thought I would say. Once we’ve captured the images, we need a way to process them with some computer vision. That sounds terrifying in Ruby, so I basically cheated and used this old, but super stable, 18-year-old OpenCV library with a Ruby wrapper, so we don’t have to look at any C code.
00:05:15.919
I wanted it to be real-time, as I didn’t want to have to submit the form every time someone took a shot, as that would be tedious. I also got to play with Action Cable, which was a good entry point for me.
00:05:40.610
In summary, this is how broad the talk is going to be: we're going to see this WebRTC protocol, and I'll explain how much I like Vue.js, probably more than I should. OpenCV will be on the back end, which will involve a bunch of math. However, you will be okay, and then Action Cable will be responsible for sending it all back and forth.
00:06:07.150
The plan is that the bulk of the talk will focus on capturing these images, sending them over to the server, processing them with OpenCV to determine where the cars are, enabling us to show you how to place them back when they come off the track. If we get through all of that, we might even see a demo that may or may not work.
00:06:49.930
The first step was capturing the image, which seemed like the logical place to start. I had never really seen Vue components before, so I found this Vue webcam component on GitHub, which has a simple 'Hello World' implementation. The main element that Vue is going to take over is the video tag, which implements WebRTC, and below that, there's a button element for triggering the photo capture.
00:07:34.010
The only interesting thing about that button is the '@click' directive in Vue, which is meant to handle the on-click event. In the image tag, there's a 'v-bind:src' that allows the source of the image tag to update based on whatever is in the component's photo variable.
00:08:06.339
The corresponding JavaScript snippet implements the behavior of that 'take photo' button when you click it. It looks fairly simple, but I started wondering what this 'photo' variable is. How do you represent a file or image data in JavaScript? Spoiler: it’s a string.
00:08:44.379
The 'photo' variable is a massive string, specifically a Data URI. Mozilla describes Data URIs as URLs prefixed with the data: scheme, which allow content creators to embed small files inline in documents. 'Small files' loosely means under a gigabyte, which I think is reasonable for this context.
00:09:05.150
I wanted to play around with these Data URIs, so I found a handy website where you can input a URL, hit 'generate Data URI,' and receive the output at the bottom. However, it can be misleading because if you copy the text from that input, it can actually produce a very large string.
00:09:34.400
What amazed me was that if you stick that massive string in the source attribute of an image tag and load it in a browser, it just works! So, I realized that source attributes don’t always have to link to files; they can also contain the data itself.
00:10:07.570
As a result, I was able to call that a success. Now that we have our Data URI in Ruby land on the server, the next step is to get it loaded into the OpenCV library.
00:10:36.790
If you look at the OpenCV documentation, you'll find a class which is the closest thing to an image class: IplImage (IPL stands for Intel Image Processing Library). It has a load method, but that expects a filename. This presented the challenge of figuring out how to fit our string-shaped Data URI into that file-shaped load method.
00:11:14.740
It's fairly obvious that everything after the comma in a Data URI is the actual data. Thus, I figured I could slice off everything before the comma and use just the data part, but it seemed a bit ugly to handle that manually. Thankfully, dball had written a data_uri gem, which I found extremely helpful.
00:11:57.000
This gem allowed me to handle Data URIs well. I simply opened a file in binary mode, wrote the data to it, and then passed that to the IPL image load method. I was shocked that it worked right off the bat. Typically, we would say, 'You can’t just write binary strings to files in Ruby,' but it really worked the first time.
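The decode-and-write step can be sketched without any gems. This is a minimal sketch, not the talk's actual code: `decode_data_uri` and `data_uri_to_tempfile` are hypothetical helper names, and the sketch assumes the browser sends a base64-encoded Data URI. It uses `String#unpack1("m")`, Ruby's built-in base64 decoder, in place of the gem.

```ruby
require "tempfile"

# Split "data:<mime>;base64,<payload>" into its MIME type and decoded bytes.
# (Hypothetical helper; only base64 Data URIs are handled in this sketch.)
def decode_data_uri(uri)
  header, payload = uri.split(",", 2)
  unless payload && header.start_with?("data:") && header.end_with?(";base64")
    raise ArgumentError, "expected a base64 Data URI"
  end

  mime = header.delete_prefix("data:").delete_suffix(";base64")
  [mime, payload.unpack1("m")] # "m" decodes base64 into a binary string
end

# Write the decoded bytes to a temp file in binary mode, so a loader that
# only accepts filenames can read it. The caller closes the file.
def data_uri_to_tempfile(uri)
  mime, bytes = decode_data_uri(uri)
  file = Tempfile.new(["capture", ".#{mime.split('/').last}"])
  file.binmode
  file.write(bytes)
  file.flush
  file
end
```

The returned file's `path` is what would be handed to a filename-based image loader.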
00:12:36.090
Once we had our image loaded, I needed to give a brief segue regarding matrices. I know most of you are probably familiar with matrices, but for those who aren't, it’s essentially a rectangular array of values. In essence, images are just matrices, with each cell in the matrix representing the color of a pixel.
00:13:15.290
OpenCV makes it clear by having its image class actually inherit from the matrix class. Thus, in OpenCV, an image is just a table of pixels. If you need more convincing that an image is just a table, there’s a spectacular website where you can upload an image and download it as a spreadsheet, with the background colors of the cells set to match the pixels. It's fabulous!
00:13:51.300
Now, with our image successfully loaded, let's begin the process of identifying the car pixels. You could assume that the blue pixels belong to the car, so we need a way to instruct OpenCV to tell us which pixels are blue. As web programmers, we can refer to a web color definition of blue and use the equality method that OpenCV provides, which gives us a binary matrix showing whether each pixel is blue.
00:14:32.000
The result is what I am referring to as a mask—a new matrix that, instead of containing colors, contains truth values. For every pixel, is it blue? While it would be nice to visualize the masks directly, OpenCV has limitations on visualization. However, the matrix class does offer a save image function, but again, it requires a filename.
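To make the idea of a mask concrete, here is a minimal pure-Ruby sketch that treats an image as a nested array of `[red, green, blue]` pixels. It stands in for the OpenCV matrix classes rather than using them; `equality_mask` and the tiny sample image are illustrative assumptions.

```ruby
# Web "blue" (#0000FF) as a [red, green, blue] triple.
BLUE = [0, 0, 255].freeze

# Build a mask: same shape as the image, but each cell holds true/false
# depending on whether that pixel is exactly the given color.
def equality_mask(image, color)
  image.map { |row| row.map { |pixel| pixel == color } }
end

# A 2x2 sample image: white, pure blue, almost-blue, black.
image = [
  [[255, 255, 255], [0, 0, 255]],
  [[0, 0, 250],     [0, 0, 0]]
]

mask = equality_mask(image, BLUE)
```

Only the pixel that is exactly `#0000FF` comes back true; the almost-blue pixel does not, which is the weakness of exact matching.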
00:15:26.020
I decided to take a page from our earlier strategy and attempt to work with Data URIs again, but this didn't work quite as easily as before. The principle remains the same: write to a temporary file in binary mode, read it back as a string, and then use base64 encoding. Eventually, we could recover the Data URI again.
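The reverse direction can be sketched the same way: read the saved file back as binary and base64-encode it. `file_to_data_uri` is a hypothetical helper name and the PNG MIME type is an assumption; `Array#pack("m0")` is Ruby's built-in base64 encoder without line breaks.

```ruby
# Read a saved image back as raw bytes and wrap it in a Data URI.
# (Hypothetical helper; the default MIME type is an assumption.)
def file_to_data_uri(path, mime = "image/png")
  bytes = File.binread(path)            # binary read, no encoding conversion
  "data:#{mime};base64,#{[bytes].pack('m0')}"
end
```

The resulting string can go straight into the source attribute of an image tag.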
00:15:51.140
While I assumed this would be computationally intensive, it turns out that the file system often doesn't bother writing these files directly to disk, so performance remains adequate. Once we have our mask as a Data URI, we can stick it into the source attribute of an image tag, generating a displayable image.
00:16:20.289
However, I have to emphasize that the chance of a pixel being exactly the blue color we defined, within the large RGB color space, is essentially zero. Instead, we want to consider a range of blues. Fortunately, OpenCV can help with this. A very similar method allows us to supply two colors, defining a min/max range, and OpenCV will tell us which pixels fall within that range.
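A range check is a small change from an equality check. The sketch below is pure Ruby over nested arrays of `[red, green, blue]` pixels, not the OpenCV call itself; `in_range_mask` and the bounds are illustrative assumptions.

```ruby
# A pixel is "in range" when every channel lies between the matching
# channel of the lower and upper bound colors (inclusive).
def in_range_mask(image, lower, upper)
  image.map do |row|
    row.map do |pixel|
      pixel.each_with_index.all? { |value, i| value.between?(lower[i], upper[i]) }
    end
  end
end

image = [
  [[10, 20, 200], [0, 0, 255]],
  [[200, 10, 10], [30, 60, 180]]
]

# "Some kind of blue": little red, some green allowed, plenty of blue.
mask = in_range_mask(image, [0, 0, 180], [40, 90, 255])
```

Three of the four sample pixels count as blue here, whereas an exact match against `#0000FF` would accept only one.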
00:17:11.340
This way, we can create a more dynamic mask from the pixels that are blue but also avoid just aiming at random bits of dust that happen to be blue in the image. To visualize these mask changes, I sent the data back from the Action Cable server to the browser, creating a new data URI containing the updated mask with the range function.
00:18:01.790
On the JavaScript side of this, when creating the subscription in the Vue component's created callback, I added a function that would be invoked whenever it receives a message from Action Cable. When the message containing our mask data URI arrives, I can set that to the corresponding model variable.
00:18:50.300
The beauty of this is that now the image just magically updates whenever JavaScript receives the action cable message. I was quite impressed by how simple this setup was. This approach eliminates the need to copy-paste our mask Data URI every time.
00:19:23.470
However, we still didn't have a way to dynamically update the color ranges for our mask detection, so I looked into that as I got more used to Vue.js. In the HTML snippet, you'll see a 'v-model' directive, which binds each input to a range object that holds the min and max values for our color thresholds.
00:19:49.230
Additionally, I've added an @change attribute which is similar to an on-click handler. This is part of the developing pattern we're seeing, where changes to our input send messages back over Action Cable. The methods are defined in the component where I specify that every time the HSV input is adjusted, we send an update message to Action Cable.
00:20:35.840
This message will indicate the full ranges for red, green, and blue colors. When this data comes back to the Ruby side, we can extract the ranges and utilize them for our OpenCV processing too.
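Extracting the ranges on the Ruby side might look like the sketch below. The message shape is an assumption (a JSON object with a `ranges` key holding `min` and `max` per channel), and `bounds_from_message` is a hypothetical helper; the talk does not show the actual payload.

```ruby
require "json"

CHANNELS = %w[red green blue].freeze

# Turn a message such as
#   {"ranges": {"red": {"min": 0, "max": 40}, ...}}
# into [lower, upper] color triples for a range-based mask.
def bounds_from_message(json)
  ranges = JSON.parse(json).fetch("ranges")
  lower = CHANNELS.map { |c| Integer(ranges.fetch(c).fetch("min")) }
  upper = CHANNELS.map { |c| Integer(ranges.fetch(c).fetch("max")) }
  [lower, upper]
end
```

Using `fetch` and `Integer` means a malformed message raises immediately instead of silently producing a nonsense mask.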
00:21:16.480
After this exchange, we update our mask accordingly. The final result was a visually pleasing little component that would dynamically adjust as we input different color ranges. Essentially, every input change sends a message over Action Cable with updated ranges, and with each alteration, a new mask is created and returned as a Data URI.
00:21:56.640
The outcome was pleasing as we were able to sift through very specific colors and accurately pinpoint where the car was on the track. However, merely having the mask wasn't enough. We wanted to ascertain the exact XY coordinates of the car's position.
00:22:39.720
This is where the Hough transform comes into play! This remarkable piece of math allows us to analyze a matrix of trues and falses, identifying where the circles are located. The good news is that if the car is circular, we can get its x and y position from the resulting object.
00:23:47.200
Using a few parameters from the wisdom of the internet, I found and tweaked the settings until I could reliably identify the center of the car’s circle in the image, returning the center coordinates. This worked surprisingly robustly and effectively.
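The voting idea behind the Hough transform can be sketched in pure Ruby. This is a simplified illustration, not OpenCV's implementation: it assumes the radius is already known and that the mask has been reduced to the circle's outline, and `hough_circle_center` and `circle_outline` are hypothetical helpers.

```ruby
require "set"

# Every outline pixel votes for each cell that lies `radius` away from it.
# The true center collects a vote from (almost) every outline pixel, so the
# cell with the most votes is returned as [x, y].
def hough_circle_center(edges, radius)
  height = edges.length
  width = edges.first.length
  votes = Hash.new(0)

  edges.each_with_index do |row, y|
    row.each_with_index do |on, x|
      next unless on

      candidates = Set.new # each pixel votes at most once per cell
      360.times do |deg|
        theta = deg * Math::PI / 180
        a = (x - radius * Math.cos(theta)).round
        b = (y - radius * Math.sin(theta)).round
        candidates << [a, b] if a.between?(0, width - 1) && b.between?(0, height - 1)
      end
      candidates.each { |cell| votes[cell] += 1 }
    end
  end

  votes.max_by { |_cell, count| count }&.first
end

# Sample input: a boolean mask containing only a circle outline.
def circle_outline(width, height, cx, cy, radius)
  mask = Array.new(height) { Array.new(width, false) }
  360.times do |deg|
    theta = deg * Math::PI / 180
    mask[(cy + radius * Math.sin(theta)).round][(cx + radius * Math.cos(theta)).round] = true
  end
  mask
end

center = hough_circle_center(circle_outline(20, 20, 10, 8, 5), 5)
```

A full implementation also searches over radii and tolerates noise, which is where the tunable parameters come from.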
00:24:17.820
The final challenge was determining which car fell off the track. In a simplified state of the game, it's pretty clear that the blue car is on the track and the pink car got knocked off. However, the orange car is elsewhere, potentially under a chair as players get excited.
00:25:11.850
Reflecting on this scenario, I thought we might have three possible states: a car is on the track, in the image but off the track, or outside the camera's view entirely. However, as I simplified my logic, I realized that there were only two conditions: a car is either on the track or it's not.
00:25:56.300
This simplification led to defining a mask where all track pixels are true, allowing me to delete everything in the image that isn't on the track. Consequently, if the car is still visible in the masked image, by definition, it must be on the track.
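Combining the two masks is a per-pixel logical AND. A minimal sketch over nested boolean arrays, with `apply_track_mask` as a hypothetical helper name:

```ruby
# Keep a car pixel only if the same position is also part of the track.
def apply_track_mask(car_mask, track_mask)
  car_mask.each_with_index.map do |row, y|
    row.each_with_index.map { |on, x| on && track_mask[y][x] }
  end
end

car_mask   = [[true, false], [true, true]]
track_mask = [[true, true],  [false, true]]

on_track = apply_track_mask(car_mask, track_mask)
```

Any car pixel outside the track disappears, so a car sitting on the table next to the track looks the same as a car that has left the camera's view.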
00:26:42.820
After successfully building this mask—simply requiring some geometry based on the start line of the track—I developed a tool with Vue.js and Action Cable that allows you to visualize and adjust the mask.
00:27:19.990
At this stage, I had to introduce some highly sophisticated code for describing tracks. Essentially, every track starts with a straight piece, then turns, and I’ve manually crafted this code based on my understanding.
00:27:59.010
I crafted an algorithm to determine the next piece based on its previous piece, tracking its rotation and preserving the scale, allowing me to draw the track in the proper location on the mask.
00:28:39.400
The next piece should follow a pre-determined order: starting with the first piece, I modified how the code handles drawing the corners onto the mask, using OpenCV's existing methods to facilitate the drawing.
00:29:40.280
By integrating recursive functionality, each piece of the track could be identified and positioned correctly based on the previous piece's attributes, ensuring the path is drawn accurately and systematically.
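The recursive placement can be sketched on a simple grid. Everything here is an assumption made for illustration: one grid cell per piece, 90-degree corners, image coordinates with y pointing down, and the names `Placement` and `place_track`. The talk's real code draws the pieces onto a mask rather than returning coordinates.

```ruby
Placement = Struct.new(:piece, :x, :y, :heading)

# Unit steps for each heading in degrees (y grows downward, as in an image).
STEPS = { 0 => [1, 0], 90 => [0, 1], 180 => [-1, 0], 270 => [0, -1] }.freeze
# How much each piece type rotates the direction of travel.
TURNS = { straight: 0, right: 90, left: -90 }.freeze

# Place the first piece, then recurse on the rest, starting from the position
# and heading at which the first piece ends. `scale` is the size of one piece.
def place_track(pieces, x = 0, y = 0, heading = 0, scale: 1)
  return [] if pieces.empty?

  piece, *rest = pieces
  exit_heading = (heading + TURNS.fetch(piece)) % 360
  dx, dy = STEPS.fetch(exit_heading)
  [Placement.new(piece, x, y, heading)] +
    place_track(rest, x + dx * scale, y + dy * scale, exit_heading, scale: scale)
end

# A small closed loop: two straights joined by four right-hand corners.
loop_track = %i[straight right right straight right right]
placements = place_track(loop_track)
```

Because each piece only needs the previous piece's exit position and heading, describing a new track is just a matter of listing its pieces in order.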
00:30:46.299
As these methods combined, we reached the point where our mask could drive the car detection logic, built on a simple premise: if the car is visible in the masked image, it is on the track.
00:31:24.710
The core logic checks whether it sees the car and confirms its status. If the car goes off the track and can no longer be seen, we simply leave a marker where it was last seen so players can place it back on the track accurately.
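That last-seen logic can be sketched as a small class. `CarTracker` and the shape of the hashes it returns are hypothetical; the sketch assumes each frame yields either a detected `[x, y]` position for a car or `nil` when it is not visible in the masked image.

```ruby
# Remembers where each car was last detected on the track.
class CarTracker
  def initialize
    @last_seen = {}
  end

  # position is [x, y] when the car is visible on the track, or nil when not.
  def update(car, position)
    if position
      @last_seen[car] = position
      { car: car, status: :on_track, position: position }
    else
      # Car not visible: report the last known spot so the UI can mark it.
      { car: car, status: :off_track, marker: @last_seen[car] }
    end
  end
end

tracker = CarTracker.new
tracker.update(:blue, [120, 80])
result = tracker.update(:blue, nil)
```

The marker stays at the last on-track position until the car is detected again, which is the spot players need in order to put the car back.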
00:32:03.500
Now, let’s see if this setup works in real-time! Because the track was slightly bumped during setup, we'll fine-tune the position to ensure alignment.
00:32:24.330
This is the demo where we can visually check if the augmented reality setup works. So essentially, as cars are flicked, the system should track their movement accurately.
00:32:45.210
The system should grab hold of car locations and provide live feedback on their status while interacting in real-time game scenarios, allowing a smooth gaming experience.
00:33:08.400
Every time a car goes off track, the UI should respond by painting an overlay or question mark to indicate where it was last seen. This gives players clear, accurate feedback.
00:34:09.300
To summarize, once you explore the features and toolsets of Vue.js and Action Cable, you'll find incredible versatility in creating dynamic interactive projects with reactive elements, whether it's games or any real-time application.
00:35:03.500
I think one of the biggest takeaways is the power of using Action Cable to facilitate real-time communication with web components. As frameworks like Rails move forward, a lot more JavaScript will surely define how we create the next generation of reactive applications.
00:35:43.870
The continuous evolution of these technologies means we will see better automations in tracking systems and enhanced visual feedback systems in gameplay, making experiences even smoother.
00:36:53.800
Engaging in projects like these allows us to think creatively about the power of augmented reality and its capacity to revolutionize board games. Thank you all for listening!