Ruby on Ales 2014
The Future of Computer Vision: How Two Rubyists Are Changing The World

Summarized using AI

Jonan Scheffler and Aaron Patterson • March 06, 2014 • Earth

The video titled "The Future of Computer Vision: How Two Rubyists Are Changing The World" features Jonan Scheffler and Aaron Patterson discussing advancements in computer vision through the lens of Magic: The Gathering card sorting. The speakers humorously recount their personal experiences with the card game and their shared passion for finding a way to streamline the sorting of vast collections of cards. They each devised separate solutions to tackle the problem using robotics and artificial intelligence, demonstrating their approaches and explaining their thought processes through the presentation.

Key Points Discussed:
- Introduction of Speakers: Jonan and Aaron introduce themselves, establishing their connection through their oversized collections of Magic: The Gathering cards.
- Problems Addressed: Both speakers highlight the tediousness of sorting and valuing their card collections, indicating the need for technological intervention.
- High-Level Solution Overview: The presentation outlines the overarching goals, where they aim to photograph, extract, identify, and catalog the cards automatically.
- Implementation Details:
- Jonan created a robotic solution called 'Urza' using Lego components to safely dispense cards and scan them with a camera.
- Aaron developed a system using OpenCV to extract cards from images, emphasizing the challenges of camera alignment and image cropping.
- Technical Methodologies:
- Data Collection: Both speakers discuss methods for gathering data and images of the cards to build a corpus for identification.
- Card Recognition: They detail the processes they employed for recognizing cards, including perceptual hashing and Hamming distance calculations.
- Divergent Paths: They contrast their systems, noting similarities in concept but differences in execution, strengths, and weaknesses. Jonan’s approach allows for automated scanning while Aaron’s is more manual-driven but cost-effective.
- Conclusion and Takeaways: The presentation concludes with a call to explore future possibilities in computer vision within the context of their passion for Magic: The Gathering and encourages viewers to engage with these technological advancements in a playful manner.

The humorous and personal anecdotes combined with technical details illustrate how creativity and technology can converge to solve everyday problems, particularly in niche hobbies like collectible card gaming.

By Jonan Scheffler and Aaron Patterson
It all started with a dream. A dream of a world where people have learned to live in harmony with nature, where war is a distant memory, where humankind reaches unimaginable heights of technological innovation and Magic: The Gathering players no longer need to sort their cards by hand. This presentation will describe in detail the life-changing technological leaps that led us to this collectible card game utopia, examining the scanning, recognition and sorting of small bits of cardboard and all the Ruby that allows the magic to happen. If you've ever dreamed of being able to live your planeswalking dreams without the requisite hours of collating your cardboard collection, this is the presentation for you.

00:00:16.960 Okay, so these two speakers are here.
00:00:23.439 They are present, and I don't know what to say about them.
00:00:29.199 Except that I am one of them, and Jonan is the other one.
00:00:36.880 Do I need to get turned on? Can you guys hear me? Oh, I am miked up.
00:00:42.559 I’m the other one of the two Rubyists changing the world. It's a very exciting time.
00:00:48.160 We practiced this last night, but we did not have a clicker. The clicker is going to change everything.
00:00:54.239 Yes, it's going to be amazing and so much better with a clicker. Alright, we are here today to talk to you about the future of computer vision.
00:01:05.920 Let me set this up here so we can see what's ahead, so we can cheat.
00:01:12.320 You have no idea what's about to happen, but we can see it right there. It's amazing! It's the future; we're seeing into the future with computers.
00:01:17.600 Alright, let's get started. We are going to talk about the future of computer vision and how we are changing the entire world.
00:01:22.880 First, it's about me. I'm Jonan.
00:01:28.400 I am Leapbot on Twitter. I work at New Relic up in Portland. Hi, New Relic!
00:01:34.640 By the way, if you're looking for work, we are hiring. It's a great company, and if you want help installing New Relic, I'm happy to assist.
00:01:40.640 You want to click? So, I actually now write software. When I was a kid, I wanted to be a spaceman.
00:01:47.280 But instead, I have to pass my time building Lego robots and scanning Magic cards.
00:01:52.840 It's really pretty terrible, actually. But if there are any spaceman recruiters here, hit me up after this!
00:02:10.800 And I am Aaron Patterson. You may know me from such things as emceeing this conference.
00:02:18.239 Thank you, all of you, for having this magical gathering today.
00:02:25.680 That is me and my cat. Let's do this; we only have 30 minutes and a lot of stuff to show you.
00:02:31.920 Yes, we are here today to talk to you about Magic: The Gathering.
00:02:37.360 This is a magical gathering. Who's played Magic? Has anybody played Magic: The Gathering?
00:02:44.400 A lot of nerds in the house! Okay, yes, you're going to fit right in here.
00:02:50.000 Magic: The Gathering is a collectible card game for those who don't know.
00:02:55.200 You buy little packs of 15 cards for too much money and then hoard them in your garage.
00:03:01.599 You get totally addicted to the game, and I am totally addicted.
00:03:08.159 If you follow my Twitter, I spent my lunch money on buying these stupid cards yesterday instead of eating.
00:03:15.680 I basically started playing this game, I think, 20 years ago in high school, and then I quit for a while.
00:03:22.080 Of course, when you're in high school, you don't have a million dollars to spend on these stupid things, so I quit.
00:03:28.560 Then, about 10 years ago, I started playing again when I had a real job and spent way too much money on it.
00:03:34.319 And now, somebody got me addicted again. I don't know how that happened.
00:03:40.959 I have a similar story, actually. I've been playing for about the same amount of time.
00:03:47.920 Except that I just never stopped instead of stopping, which is a way better way to do it.
00:03:53.120 So basically, I have too many cards. I opened my closet, and I have about 6,000 cards.
00:03:58.799 Probably over 6,000 cards in my closet. I just started playing again, and I have no idea what they do.
00:04:06.000 I decided to go play with a friend and then I just got destroyed, and it sucked. I was sad.
00:04:12.239 So I have no idea what all my cards do. I want to know that.
00:04:18.160 I also want to know if they are worth any money because maybe I should just sell these stupid things.
00:04:24.560 They have ruined so much of my life.
00:04:30.639 And I also have many other problems besides that, but those are some of them.
00:04:35.759 It's cute that Aaron says he has too many cards. He has about six thousand cards.
00:04:41.040 I think I have a hundred thousand-ish, give or take twenty thousand. I'm not positive.
00:04:50.880 It takes up a lot of my garage. I'd like to count those so I know for sure.
00:04:56.079 Hard work is hard! I am an old man, and hunching over my coffee table sorting cards is not my idea of a good time.
00:05:04.880 Valuing collections quickly is difficult. This is the thing that I do as a hobby.
00:05:09.919 I buy a collection on Craigslist, keep the bits I want, and sell the bits I don't want.
00:05:15.919 I make my money back and go again. If anyone's selling their Magic cards, hit me up!
00:05:22.000 And I want to stop buying cards that I already have.
00:05:28.000 I'm really tired of finding cards later that I've purchased at the store that night for my deck.
00:05:34.800 Because I couldn't sort through a hundred thousand cards fast enough, we needed to solve these problems.
00:05:43.840 And we're both super lazy, which sounds like the perfect job for a computer.
00:05:50.880 Computers are built as our slaves; they do repetitive tasks for us.
00:05:57.120 So today, we're going to talk to you about our solutions for dealing with these particular problems.
00:06:03.840 I don't remember what slide is next; it's this one. We're going to talk about robotics, artificial intelligence, AI, cats, that one's important.
00:06:11.039 Then, like cameras and stuff. Here's a roadmap of our talk today.
00:06:20.000 We're going to go over the high-level solution. Basically, we implemented the same thing, but unfortunately they are different.
00:06:38.479 We're going to go over the specific implementations, and you'll see how it forks.
00:06:45.120 See the red and the green? That's where it forked, and you can't do that on Windows—that’s us right there.
00:06:52.639 So we're going to discuss our particular solutions to this problem. And then maybe we'll talk about future work.
00:07:03.840 And maybe why we couldn't work together, mainly because I hate Jonan.
00:07:11.039 It's true; my elf deck keeps beating him.
00:07:16.479 So from a high level, basically both of our systems do this: they take a photo of the Magic card.
00:07:22.000 The system tries to extract a card from that photo. So if we have a card in the photo, we need to pull the card out.
00:07:28.000 Then we need to identify the card. The system tries to identify which card it took a picture of.
00:07:33.680 Then it saves that data because we need to keep an inventory of the cards that we have.
00:07:38.960 We want to know what we have. I want to figure out different cards that I have with their ratings.
00:07:45.919 So we save the data at the end, and we basically repeat this process over and over again.
00:07:52.639 Rather, robots do it, and we don't. Yes, that's the plan anyway.
00:08:01.199 We basically glossed over the card identification section; that's the easy part, don't worry about this.
00:08:08.160 We're going to talk about exactly how the card identification system works.
00:08:15.000 We start out with a corpus of data. This is like, we have some knowledge about the cards.
00:08:21.679 We acquired this knowledge by gathering a bunch of images online, just in general to solve this.
00:08:29.040 You could have a corpus of data—something known, right? This is the solution to the problem already.
00:08:36.479 We start with a bit of a solution. We have a bunch of images tied to particular data.
00:08:42.240 If those images match the images we take, we're close.
00:08:48.720 What our systems do first is they take the corpus and try to match the image against what we have.
00:08:57.360 It makes a guess. The system will guess what the card is.
00:09:03.679 It'll prompt us and say, 'I think this thing is a Forest,' and I'll say, 'You're a bad robot; it's an Island! You're terrible at everything!'
00:09:10.560 So it prompts us, saying, 'Is this the right card? I made a guess; can you tell me?'
00:09:17.120 Then we teach the system whether or not it was right.
00:09:22.240 I could be unnecessarily mean to my robot right now, or I could be a good trainer.
00:09:30.560 We say, 'Yes, you got it right,' or 'No, you got it wrong.' Once we do that, then we save off that data.
00:09:36.640 We tell it the right answer, and that helps our corpus to grow. It learns from this.
00:09:43.440 So we say, 'Okay, good job; you did it,' or 'No, this is the correct answer,' and then we basically repeat this process.
00:09:50.480 Every time we repeat the process, the system gets smarter. This is artificial intelligence.
00:09:56.960 Truly, artificial Urza is a dumb robot. Every time the corpus gets larger, it gets better at guessing what those scanned cards are.
00:10:03.200 It's a learning system. The data model; we're going to talk a little bit about how we have to model the data.
00:10:09.600 When it comes to Magic cards, you would think one little card, one little image should work.
00:10:16.480 But let's look at some cards. We have a card model; we store a bunch of cards in our system.
00:10:22.640 We have to store information about the cards, particularly the name and the image.
00:10:27.920 We also have to store the expansion set, and this is important because a card can be reprinted between expansions.
00:10:35.759 The value changes drastically between the expansions, and since I struggle at Magic, I want to know the rating.
00:10:41.760 So I store the rating for those so I can say, 'Hey, tell me what the best cards are in my system!'
00:10:48.639 This is our method to make Aaron a better Magic player.
00:10:54.079 So we think, 'Okay, that's great; we'll store an image.' So we have a card, and that card has one image.
00:11:00.399 That seems reasonable, right? Are we on the same page? Sounds legit.
00:11:07.440 But anybody ever seen this card? That's one card, and it has two images; it's called Big Furry Monster.
00:11:14.560 They printed it to troll programmers who try to scrape their site later.
00:11:20.960 So we think, 'Okay, great; now we have a new model.' An image belongs to a card.
00:11:27.200 A card has many images. Are we good with that? Sounds good.
00:11:34.640 David thinks so. He's a Magic player, so he would know.
00:11:40.480 But here’s an image with two cards. Good job, Wizards. Thank you so much for your help.
00:11:47.920 They made these cute little half cards, and you could turn them sideways, cut them in half, and use them as tokens.
00:11:56.160 They were adorable but terrible to model. We end up with this: an image has and belongs to many cards, and a card has and belongs to many images.
00:12:03.680 Our total data model looks like this. We're going to talk about hashes here in a second.
00:12:10.320 But basically, we have cards; cards have and belong to many images, and cards also have many hashes.
00:12:16.800 We're going to talk about exactly what the hashes are. Hashes are very exciting. That's the recognition piece of this talk.
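The data model described here (cards and images related many-to-many, plus many hashes per card) can be sketched in plain Ruby. This is only an illustration: the class names, attributes, file names, and token names are assumptions, and the speakers most likely used database-backed models rather than in-memory objects.

```ruby
# Plain-Ruby sketch of the data model: cards and images are many-to-many,
# and each card also keeps a list of perceptual-hash fingerprints.
# All names here are illustrative, not taken from the speakers' code.
class Card
  attr_reader :name, :expansion, :rating, :images, :fingerprints

  def initialize(name:, expansion:, rating: nil)
    @name = name
    @expansion = expansion
    @rating = rating
    @images = []
    @fingerprints = []
  end

  # Link both sides so the association can be walked from either end.
  def add_image(image)
    images << image unless images.include?(image)
    image.cards << self unless image.cards.include?(self)
    self
  end
end

class Image
  attr_reader :path, :cards

  def initialize(path)
    @path = path
    @cards = []
  end
end

# One card with two images (like Big Furry Monster).
bfm = Card.new(name: "Big Furry Monster", expansion: "Unglued")
left = Image.new("bfm_left.jpg")
right = Image.new("bfm_right.jpg")
bfm.add_image(left).add_image(right)

# One image shared by two cards (like the half-card tokens).
sheet = Image.new("token_sheet.jpg")
token_a = Card.new(name: "Token A", expansion: "Example Set")
token_b = Card.new(name: "Token B", expansion: "Example Set")
token_a.add_image(sheet)
token_b.add_image(sheet)
```

Keeping the link on both sides is what lets a scan of one image resolve to several cards, and one card resolve to several images.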
00:12:23.760 We want to get back into that easy part that we kind of glossed over earlier.
00:12:29.680 Recognition is actually a pretty hard problem. We use what's called a perceptual hash.
00:12:37.679 Has anyone here ever done OCR? Did you look at this problem and say, 'Oh, you just OCR the cards'? No.
00:12:43.680 Because you're smarter than that, you’ve tried to do this. It's ridiculously hard, right?
00:12:50.320 OCR is almost always wrong and very difficult to work with.
00:12:56.960 So we didn't do that; we used perceptual hashes, which is basically a hash of the image.
00:13:03.760 If the human eye thinks the two images look similar, the hashes that come out of this perceptual hash will also be similar.
00:13:09.280 That's the theory behind all of this. Think of it like producing a hash key for some object. We're just producing a hash for an image.
00:13:15.760 Let’s take a look at that. On the left, we have a reference card. This is a card from our corpus, one that is known.
00:13:21.280 On the right side, we have one that is unknown—something I was able to pull off an image.
00:13:27.520 At the bottom, you can see we have two numbers, and those are actually the hashes.
00:13:33.280 The left side is our known good hash, the right side is some hash, and we need to know how close they are together.
00:13:40.240 This is our corpus here, right? So we use the Hamming distance to calculate that.
00:13:47.440 We compare those two strings for differences; the Hamming distance is the number of transformations
00:13:53.760 you have to make to turn one into the other, right? Number by number.
00:14:01.760 The lower the Hamming distance, the closer the images are together.
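As a minimal sketch of that comparison, assuming each perceptual hash is stored as an integer (the sample values below are made up), the Hamming distance is the number of bit positions where the two hashes differ:

```ruby
# Hamming distance between two perceptual hashes stored as integers:
# XOR leaves a 1 bit wherever the hashes disagree, then we count those bits.
def hamming_distance(hash_a, hash_b)
  (hash_a ^ hash_b).to_s(2).count("1")
end

reference = 0b1011_0110
scanned   = 0b1001_0111
puts hamming_distance(reference, scanned) # prints 2
```

Identical hashes give a distance of 0, and every differing bit adds 1.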
00:14:05.760 So if we take a look at this one, we’ll see: okay, we have a number here, and another number there.
00:14:12.000 If we look at the Hamming distance for that, it’s 28, so clearly the first one was much closer.
00:14:17.760 A brief disclaimer here: it is possible for very distinct images to have similar p-hashes, and sometimes that comes up.
00:14:24.080 So our system needs to prepare for that.
00:14:30.080 Anyway, we save these hash keys off, and adding these hashes to our system increases the knowledge.
00:14:36.080 Remember when we talked about learning? Right here, we store the keys again so next time we get better.
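The learning loop described above (guess, get corrected, store the new hash) might look roughly like this. The `Corpus` class, its method names, and the sample fingerprints are hypothetical; the only idea taken from the talk is that every confirmed scan adds another known hash for that card.

```ruby
# A tiny "learning" corpus: every confirmed scan adds another fingerprint
# for that card, so later scans have more references to match against.
# Class and method names are hypothetical.
class Corpus
  def initialize
    @fingerprints = Hash.new { |hash, name| hash[name] = [] }
  end

  # Record a confirmed (card name, fingerprint) pair.
  def teach(name, fingerprint)
    @fingerprints[name] << fingerprint
  end

  # Return [name, distance] for the closest known fingerprint, or nil.
  def guess(fingerprint)
    candidates = @fingerprints.flat_map do |name, prints|
      prints.map { |known| [name, hamming(known, fingerprint)] }
    end
    candidates.min_by { |_name, distance| distance }
  end

  private

  def hamming(a, b)
    (a ^ b).to_s(2).count("1")
  end
end

corpus = Corpus.new
corpus.teach("Forest", 0b1111_0000)
corpus.teach("Island", 0b0000_1111)

scan = 0b0001_1111
name, distance = corpus.guess(scan) # closest reference is "Island"
corpus.teach(name, scan)            # a human said "yes", so remember this scan
```

After the second `teach`, the same scan matches "Island" at distance 0, which is the sense in which the system gets better as the corpus grows.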
00:14:42.000 So we are going to talk a little bit about divergent work here—where our solutions went their separate paths.
00:14:49.680 We have similar things for data and similar things for hashing, but this is where we diverge.
00:14:55.680 I built a fixed environment solution that you see here, out of Legos. This is Urza, the robot.
00:15:03.200 I'm going to talk a little bit about designing it really quickly.
00:15:09.760 I need a way to dispense cards, and it's very important to me that I dispense those cards safely.
00:15:16.800 It's a difficult thing to dispense cards without damaging them because Magic cards have a very fragile varnish.
00:15:22.320 So I obviously want to take good care of my cards; I want to scan them, and I want to know what I have.
00:15:30.080 This whole tower actually rotates around; I can sort into distinct piles by any arbitrary criteria on any Magic card.
00:15:37.039 Those are the three things that I wanted out of my system, and I chose to build it using Lego
00:15:43.120 NXT bricks. Has anybody ever used the Lego NXT brick? They're pretty cool.
00:15:50.560 There's a little GUI tool you can use to program them, but that tool is not ideal in this case.
00:15:57.200 It's good for making robots that chase colored balls, but for this case, I wanted to use this gem called Lego NXT.
00:16:07.200 It’s written by a very smart man named Christian Höltje (docwhat), and you should check out this gem on GitHub.
00:16:13.520 I wrote the piano example; the one that doesn't work totally. It's awesome; you should play.
00:16:21.040 Regardless, Lego NXT communicates with the brick over USB or Bluetooth.
00:16:27.760 It allows me to send commands like, 'Hey, run the motor,' or 'Check the light sensor and see if the light changed.'
00:16:36.240 This makes it very easy to work with. The dispenser here on the top has a light sensor.
00:16:44.960 This light sensor can detect color, but I'm not using that functionality. It just tells me whether or not the light has changed.
00:16:52.640 If it was dark and it got lighter by seven—an arbitrary value I get back from the light sensor—then I know something has changed.
00:16:59.040 I need this light sensor so that I can run the first motor until the card comes out.
00:17:05.440 The idea is to run the motor, and we know when to stop the motor because the light sensor changed.
00:17:12.320 So, as it feeds the card over, it covers the light sensor, and we know when to stop.
00:17:19.040 The alternative would be to just run the motor for 0.7 seconds, but that gets out of sync after a while.
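The "run the motor until the light sensor changes" idea can be sketched as follows. The `motor` and `sensor` objects and their `start`, `stop`, and `reading` methods are stand-ins invented for this example, not the actual API of the Lego NXT gem; only the threshold of seven comes from the talk, and the sketch treats a change in either direction as "something changed".

```ruby
# Feed one card: run the motor until the light reading changes by at least
# `threshold` from its starting value, then stop. `motor` and `sensor` are
# stand-ins for whatever the brick library provides; their methods are assumed.
def feed_card(motor, sensor, threshold: 7, max_polls: 1_000)
  baseline = sensor.reading
  motor.start
  max_polls.times do
    return true if (sensor.reading - baseline).abs >= threshold
  end
  false # the reading never changed: jam or empty hopper
ensure
  motor.stop
end

# Fakes so the sketch runs without hardware.
class FakeMotor
  attr_reader :running

  def start
    @running = true
  end

  def stop
    @running = false
  end
end

class FakeSensor
  def initialize(readings)
    @readings = readings
  end

  # Return readings in order, then keep repeating the last one.
  def reading
    @readings.size > 1 ? @readings.shift : @readings.first
  end
end

motor = FakeMotor.new
fed = feed_card(motor, FakeSensor.new([40, 41, 42, 50]))
```

Stopping on the sensor rather than on a fixed time is what keeps the feeder from drifting out of sync, and the `max_polls` limit is an added safeguard so a jam does not run the motor forever.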
00:17:26.040 Thus, we need the camera. I really wanted a variable distance lens and variable focal length.
00:17:32.960 The camera is here—it's the one with the duct tape on it.
00:17:40.240 That last part, variable focal length, turns out to be hard on Macs. There are very few webcams that actually do that.
00:17:48.160 The Microsoft LifeCam, the one Aaron bought, happens to be one, but you can't focus with my camera because it's from Logitech.
00:17:53.880 That's a Mac feature; the driver they have is a universal webcam driver, which means universally that it's not gonna work with Logitech.
00:18:02.080 So, in order to change the focal length, I bought some glasses at the dollar store, ripped the lenses out, and duct taped them to the tip of the camera.
00:18:09.600 Which you're welcome to see later. I spent five dollars buying five different pairs.
00:18:14.640 This is a questionably variable focal length; I could change it hypothetically.
00:18:23.360 This is what I get: this fixed environment advantage. I know that my card is going to drop vertically into my screen.
00:18:30.240 It will be perfectly in the middle, and I just have to set my cropping distances.
00:18:36.960 I take 489 pixels off the top, and I will get something close to this on the right, which is the reference image.
00:18:43.120 I can mount the camera at a 90-degree angle to the card. I also know every time it feeds where the card's going to land.
00:18:50.080 All I have to do is crop a number of pixels off the sides of the card, and then I’ll end up with something close to that.
00:18:56.560 For the software inside of this, aside from the Lego NXT gem, I'm using three other gems.
00:19:02.720 I'm using RMagick to handle the cropping, which is pretty easy.
00:19:10.440 I'm using phashion, a gem by Mike Perham, which is maintained by someone else now; phashion handles the fingerprints,
00:19:19.200 the hashes that we're going to show you a little bit more about soon. And then av_capture.
00:19:27.360 It turns out that all of the gems for handling webcams on Macs were not working super well.
00:19:35.240 So I wrote one. Yes, I wrote one.
00:19:40.480 Now, I want to talk briefly about my data mining.
00:19:47.200 I was mining data with a gem that I wrote called Gatherer: The Magicking.
00:19:53.920 It was a long process; it took me several months, and it still wouldn't work sometimes.
00:19:59.680 Things would move around on the page. The actual site where I'm scraping is called Gatherer.
00:20:05.760 It’s the official Wizards database of all of the cards and all the data.
00:20:12.560 That’s where we got pictures from earlier we showed you.
00:20:19.520 I finally figured out the secret to making my stuff less brittle.
00:20:25.520 I was about to finish my gem when this jerk, whose name is Mark Turner, sent me a link.
00:20:32.560 He said, 'Yo bro, you just wasted your whole life; you should stop!' It was a link to this MTG JSON site.
00:20:42.560 It has any kind of API I could want, and all the data in JSON format! So, that’s pretty nice.
00:20:48.760 We use that now instead, and there's an API to access the images as well.
00:20:55.679 These multiverse IDs uniquely identify Magic cards within the system.
00:21:02.080 Even Big Furry Monster has a multiverse ID, which happens to have two images. Screw you, Wizards!
00:21:08.240 Reorganizing cards or, I'm sorry, recognizing cards. I have three different methods I use to recognize cards.
00:21:14.240 This is the obvious one: if I have a Hamming distance that is less than or equal to six, I say we're good.
00:21:20.560 It's not that simple, really; I’ll have false positives, but I have fewer false positives doing this than I would with a higher number.
00:21:26.960 I tuned this with science and trial and error for several hours, and six is the correct number.
00:21:33.280 The second method I have for identifying cards is searching for duplicate fingerprints.
00:21:41.120 I have a method that returns to me all the Hamming distances of all the cards.
00:21:47.920 It's kind of like a hash type structure where a Hamming distance of 10 points to an array of cards.
00:21:54.240 Cards can be duplicated in there; I keep putting them in.
00:22:01.760 So if I see seven fingerprints for a card, and it has a 10, a 12, and a 14, I will see it in the highest duplicates.
00:22:07.840 I choose that card if I see it five times. The machine learns from itself and says, 'Okay, I trust that 7 duplicate is the actual card'.
00:22:14.160 I'll save the next fingerprint so the next time we will see eight.
00:22:20.560 Finally, there's the guessing prompt: the part where I push a yes with my finger, and that's the last step for Urza.
00:22:27.920 Then when I train, hopefully next time it will learn and not bother me.
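One reading of those three methods, written as a single cascade, is sketched below. The thresholds 6 (close match) and 5 (times seen) come from the talk; the `near` cutoff of 14, the shape of the corpus, and all method names are assumptions made for illustration, and the real duplicate-counting logic may differ.

```ruby
# Three-stage recognition sketch: (1) accept a very close fingerprint,
# (2) otherwise accept a card that shows up repeatedly among the near matches,
# (3) otherwise ask a human. Thresholds 6 and 5 come from the talk; the
# `near` cutoff of 14 and all names are assumptions for illustration.
def hamming(a, b)
  (a ^ b).to_s(2).count("1")
end

# corpus: { "Card Name" => [fingerprint, ...] }
def recognize(scan, corpus, close: 6, near: 14, votes_needed: 5)
  distances = corpus.flat_map do |name, prints|
    prints.map { |print| [name, hamming(print, scan)] }
  end
  return [:ask, nil] if distances.empty?

  best_name, best_distance = distances.min_by { |_, d| d }
  return [:match, best_name] if best_distance <= close

  # Count how many stored fingerprints of each card are "near" the scan.
  votes = distances.select { |_, d| d <= near }.map(&:first).tally
  top_name, count = votes.max_by { |_, c| c }
  return [:duplicates, top_name] if count && count >= votes_needed

  [:ask, best_name] # show the best guess and wait for a yes/no
end
```

Whatever the stage, a confirmed answer would then be stored as one more fingerprint for that card, so the prompt in the last stage should come up less often over time.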
00:22:35.040 So, advantages of this system. There are advantages to Urza, but I can't remember what they are.
00:22:42.080 Your code is simple, right? Yes. You know where the card is at any point.
00:22:48.640 The other advantage is that you don't need to do anything; the machine just feeds cards through.
00:22:55.920 I can let it do this while I watch TV or play Magic with Aaron, and Urza can scan my collection.
00:23:01.200 Ideally it will work just fine, but there are some pretty serious disadvantages.
00:23:07.680 That being, that every time I move it, because I want that variable camera length, I have to recalibrate the camera.
00:23:14.880 The pixel distances, etc. Just carrying the table out here may have miscalibrated the camera.
00:23:22.760 The disadvantage of this system is that it's pretty fragile. But that is solvable.
00:23:30.080 Absolutely, this is just a prototype. If I built it as a static machine, like maybe an acrylic box with a camera on top.
00:23:37.760 That wouldn't have that problem. Another disadvantage is the expense; Lego NXT is actually pretty expensive.
00:23:43.760 It's like 300 for one of these kits. You can find them cheaper on Craigslist, maybe for less than a hundred dollars.
00:23:51.520 I got mine for 75. I recently traded two Raspberry Pis for another one; thank you, Adam.
00:23:58.080 I think this is a pretty big disadvantage, but it's really nice for prototyping.
00:24:05.680 So we're going to talk about Freestyle. I will take the clicker now for you, sir.
00:24:14.320 Thank you! This is my far inferior solution to the same problem.
00:24:21.680 I like to call mine the Freestyle Environment for Freedom.
00:24:27.680 One, because it sounds cool, and two, that’s the only reason. Basically, what I have is a light box hooked up to a webcam.
00:24:40.160 The webcam is connected to my computer.
00:24:45.600 The problem is there's a cat that's always messing with my stuff, so it can misalign.
00:24:52.560 The camera points right into my light box, and if you look through it, you see the camera is pretty off.
00:24:59.760 The card is not rectangular anymore when you take a photo of it.
00:25:07.280 Another problem is that this card can be anywhere inside the image.
00:25:14.240 What’s really nice about Jonan’s system is that it's right there; we can crop it.
00:25:20.160 For mine, I have to find the card and crop it out of the image and somehow resize it.
00:25:28.240 So, I’m going to talk about how I do that. The software I use for that is OpenCV.
00:25:35.440 I wrote a thing that scrapes all of the Magic: The Gathering website.
00:25:41.920 It puts all the stuff in my database. I wasted my entire life.
00:25:46.080 I hate you Mark Turner; I didn't know about this awesome website with JSON.
00:25:54.200 So I wasted my life. Anyway, that’s what happens in my system.
00:25:59.760 Once I get that data, I use OpenCV. It's what actually extracts the card from the image.
00:26:06.640 The issue is, we have an image that looks like this: it’s totally washed out.
00:26:12.640 You can see it’s just a card inside some image, but it's not rectangular.
00:26:17.440 It can be anywhere in here because I'm doing this by hand.
00:26:22.720 If you compare this to a reference, it's going to be crazy bad, right?
00:26:28.480 Yes, it might be better than a red card, but you're still not going to know anything about it.
00:26:36.240 So what we want is for something to look like this; it needs to be rectangular.
00:26:42.080 We also need it to be cropped.
00:26:48.280 The way we do card extraction with OpenCV is in only eight easy steps.
00:26:54.560 1-800-CARDBOT, buy it! I’m going to show you exactly how easy it is.
00:27:01.680 We start out with a card like this; this is our starting image. We pre-process it and turn it grayscale with OpenCV.
00:27:08.320 Once we have a grayscale image, we can feed it into an algorithm for edge detection.
00:27:13.680 OpenCV gives us an algorithm called the Canny edge detector.
00:27:20.480 It finds all the edges in the image; you just feed it in and say, 'Hey, tell me the edges.'
00:27:29.680 We get back an image that represents all the edges in the image. All the white stuff are the edges.
00:27:36.560 Unfortunately, they are not the edges we want—none of them are. We only want the one around the card.
00:27:43.760 We need to figure out which one that is. The way we do that is we say, 'Hey, OpenCV, tell me all the contours in here.'
00:27:49.920 Contours are just shaped edges in the image. We need to say we don’t want any contours that are holes.
00:27:56.560 We only want those bounding ones, those outside ones. We want to eliminate the holes.
00:28:02.400 So we filter those out, and if you draw all the contours available, you see what it looks like.
00:28:08.080 It looks exactly the same as the edge-detection stuff.
00:28:15.680 So now we only want the largest contour, and that’s what we want in this image.
00:28:21.600 We sort all those by the contour area, and we end up with a shape that represents the contour of the card.
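To illustrate the "largest contour wins" step without OpenCV, here is a pure-Ruby sketch that treats each contour as an array of `[x, y]` points and compares enclosed areas with the shoelace formula. OpenCV does this work itself; the function names and the point-array representation are assumptions for the example.

```ruby
# Pick the largest contour by enclosed area using the shoelace formula.
# Each contour is an array of [x, y] points; OpenCV does this work for you,
# so this only illustrates the idea.
def polygon_area(points)
  sum = points.each_with_index.sum do |(x1, y1), i|
    x2, y2 = points[(i + 1) % points.size]
    x1 * y2 - x2 * y1
  end
  sum.abs / 2.0
end

def largest_contour(contours)
  contours.max_by { |points| polygon_area(points) }
end

card_outline = [[0, 0], [4, 0], [4, 4], [0, 4]]
speck        = [[0, 0], [2, 0], [0, 2]]
biggest = largest_contour([speck, card_outline])
```

This relies on the card being the biggest outlined shape in the frame, which holds for a single card in a light box.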
00:28:27.760 But we need to get a polygon out of it.
00:28:33.280 We say, 'Hey, OpenCV, give me a polygon that represents this shape,' and we try to get a convex hull for that.
00:28:41.440 If you look at it, you give it this constant and say, 'Okay, put it in clockwise order.'
00:28:48.640 It turns out that if you specify clockwise, it gives it to you counterclockwise!
00:28:54.560 Which is why I've got that flip, right there. You flip it, saying, 'No, I actually wanted clockwise order.'
00:29:01.600 And we find the polygon and the convex hull.
00:29:06.800 How much time do we have? We have 10 minutes—great! I want to go over it.
00:29:15.680 One super annoying thing about OpenCV is that the documentation kind of sucks.
00:29:21.840 I had to read the code to figure out what it does, and I'm sifting through this C++ library.
00:29:29.680 I need to get a bunch of points, the convex hull for this, and supposedly it returns them in clockwise order.
00:29:38.000 So I did that, and it turns out if you specify clockwise it gives it counterclockwise.
00:29:48.560 I hate you! This is why I’ve got that reverse there!
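Rather than trusting a library flag, the point order can be checked directly. The sketch below is an alternative to the hard-coded reverse, not what the speaker did: it uses the sign of the shoelace sum to detect direction and flips the points only when needed. It assumes image coordinates with y growing downward and points as `[x, y]` pairs; the function names are hypothetical.

```ruby
# Check which way a polygon's points run and reverse them if needed.
# With image coordinates (y grows downward), a positive shoelace sum means
# the points run clockwise as seen on screen.
def clockwise?(points)
  sum = points.each_with_index.sum do |(x1, y1), i|
    x2, y2 = points[(i + 1) % points.size]
    x1 * y2 - x2 * y1
  end
  sum > 0
end

def ensure_clockwise(points)
  clockwise?(points) ? points : points.reverse
end

# Top-left, top-right, bottom-right, bottom-left: clockwise on screen.
corners = [[0, 0], [4, 0], [4, 4], [0, 4]]
fixed = ensure_clockwise(corners.reverse)
```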
00:29:57.680 So after finding the polygon and getting the clockwise points, we can extract the card from this image.
00:30:05.680 We need to say we want a warp matrix.
00:30:10.200 We need to transform the image, so we say, 'I want 230 pixels wide and 300 pixels tall.'
00:30:16.720 Give me a matrix that can transform this.
00:30:23.280 That’s really exciting, and we have this warp matrix happening here.
00:30:29.680 We feed the warp matrix back into OpenCV and say, 'Hey, transform this image.'
00:30:35.440 And once we transform the image, it will look like this.
00:30:41.200 The image looks all messed up, but it actually warps the image so that our card is now rectangular.
00:30:48.640 It's in the upper left-hand corner at a known width and height, which is important.
00:30:55.040 We set the region of interest (ROI), which stands for Return on Investment.
00:31:01.120 We say, 'I'm interested in the upper-left-hand corner,' and it slices that out.
00:31:08.080 And now we actually get the card and throw away the rest! It's just that easy.
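The "warp matrix" is a perspective transform: a 3x3 matrix that sends the four detected card corners to the corners of a 230 by 300 rectangle. OpenCV computes and applies it for you (its C++ API has `getPerspectiveTransform` and `warpPerspective` for this). The pure-Ruby sketch below only shows where such a matrix comes from; the corner coordinates are made up, the function names are hypothetical, and it maps points rather than resampling pixels.

```ruby
# Solve a*x = b by Gauss-Jordan elimination with partial pivoting.
def solve_linear(a, b)
  n = b.size
  m = a.each_with_index.map { |row, i| row.map(&:to_f) + [b[i].to_f] }
  n.times do |col|
    pivot = (col...n).max_by { |r| m[r][col].abs }
    raise ArgumentError, "corners are degenerate" if m[pivot][col].abs < 1e-12
    m[col], m[pivot] = m[pivot], m[col]
    n.times do |r|
      next if r == col
      factor = m[r][col] / m[col][col]
      (col..n).each { |c| m[r][c] -= factor * m[col][c] }
    end
  end
  (0...n).map { |i| m[i][n] / m[i][i] }
end

# Build the 3x3 perspective matrix that maps four source corners onto four
# destination corners (both given in the same order, e.g. clockwise).
def warp_matrix(src, dst)
  rows = []
  rhs = []
  src.zip(dst).each do |(x, y), (u, v)|
    rows << [x, y, 1, 0, 0, 0, -u * x, -u * y]
    rhs << u
    rows << [0, 0, 0, x, y, 1, -v * x, -v * y]
    rhs << v
  end
  h = solve_linear(rows, rhs)
  [h[0, 3], h[3, 3], [h[6], h[7], 1.0]]
end

# Apply the matrix to one point.
def warp_point(matrix, x, y)
  u, v, w = matrix.map { |a, b, c| a * x + b * y + c }
  [u / w, v / w]
end

# Corners of a skewed card in the photo (made-up numbers), clockwise from top-left.
card_corners = [[120, 80], [410, 95], [395, 470], [105, 450]]
# Target: a 230x300 upright rectangle in the upper-left corner.
target = [[0, 0], [230, 0], [230, 300], [0, 300]]

matrix = warp_matrix(card_corners, target)
```

Because the card always lands at a known position and size after the warp, the final crop is a fixed rectangle, which is why the corner ordering matters so much.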
00:31:16.240 Eight steps; call now, folks! Yes, there is a small amount of work to do this.
00:31:23.200 The OpenCV documentation is terrible, so I thought I would buy a book.
00:31:30.560 I went to the O'Reilly website and thought, 'Oh, there's a book; let’s look at that.'
00:31:36.720 Oh, this looks great, and I look at the authors. Then I go back to the OpenCV website.
00:31:43.040 I wonder who wrote OpenCV? The authors of this book are the same ones who wrote OpenCV!
00:31:50.480 I hate you! Why didn’t you just write the documentation? Now I gotta buy this book!
00:31:56.520 So, I bought the book. It's just that easy, folks.
00:32:01.680 Now, I’m going to show you my system in action—a video of my system actually working.
00:32:07.280 Over there is the live feed from the camera. I put a card in here, and you see it scanning the card.
00:32:13.840 On the left is the image that it recognized from the video. The one next to it is what the system thinks it is.
00:32:20.080 I click on the one that I want, and as soon as it saves it to the database, you see we end up with two Stone Rains.
00:32:26.640 The system learns from that by Hamming distance.
00:32:30.720 Yes, the closest by Hamming distance. If you've ever played a Stone Rain, you're a terrible person.
00:32:36.720 I will never play Magic with you! That’s how this system works.
00:32:43.760 I can answer questions now about what I have. I can query it and know what cards I have.
00:32:49.760 I can say, 'How good are they?' and now I can become a better Magic player.
00:32:56.080 In fact, I’ll show you the best card in my collection; it's pretty good.
00:33:01.680 It's a good card, and now I’m going to show you the worst card in my collection.
00:33:07.760 This card is terrible! This card is super duper.
00:33:12.000 I want to explain just real quickly for non-Magic players. This card costs five.
00:33:19.840 It's a resource-limited game; you get one more every turn of the game.
00:33:26.560 So on the fifth turn, you can cast it.
00:33:31.840 What this does is give the creature haste; it lets it attack the first turn.
00:33:38.960 You also have to cast that creature on the same turn, otherwise you'd be able to attack anyway.
00:33:46.000 By turn ten, if you're trying to give a creature haste and attack your opponent, you're already dead!
00:33:51.520 This is a really bad card.
00:33:56.880 Of course, I had to click on the forum to read the comments; it's pretty hilarious.
00:34:02.960 This is the worst card in my collection. The advantages of my system are that it's very cheap.
00:34:09.360 My wife built this light box; that's free labor!
00:34:15.440 The only other thing is the webcam. The webcam cost me like 30 or 40 bucks.
00:34:21.440 So all that's really involved in my system is: a wife.
00:34:27.680 Thank you, Abby! If you’re watching, thank you!
00:34:35.920 A light box and a camera are all you need.
00:34:42.520 The disadvantages are numerous. You actually have to feed the card in.
00:34:47.680 You have to recognize the card, and you don’t know where it is, so the disadvantages are the complexity of code.
00:34:56.080 It's also a manual process; I have to feed that card in there unless I could talk my wife into doing that.
00:35:02.480 Good luck! He’s his own dispenser; he just throws cards in and pulls them out.
00:35:08.560 Train the cat! If Gorby could do this work, I think... no; he sleeps on the stupid box.
00:35:15.440 It's warm because I have the light there! So that's a problem with this system.
00:35:22.000 Another advantage is that I don’t have this calibration issue; it mostly just works.
00:35:29.360 Another huge disadvantage is that sometimes OpenCV doesn’t find the card in the image.
00:35:37.920 So I have to feed video to