Dynamic Generation of Images and Video with Ruby-Processing
Summarized using AI

by Jeff Casimir

In this talk, Jeff Casimir presents an engaging session on using Ruby-Processing for the dynamic generation of images and video, following up from an earlier talk at RubyConf. He begins by introducing Processing, a simplified programming model built on Java, which focuses on art creation across various media including graphics and video. Ruby Processing serves as a Ruby wrapper around this library, encouraging creative coding and experimentation.

Key Points Discussed:

  • Introduction to Processing:

    • Processing is a library that simplifies the programming process for artistic endeavors.
    • Ruby Processing allows developers to leverage Processing within Ruby.

  • Benefits and Limitations of Ruby Processing:

    • Pros:
      • Encourages creativity and is enjoyable for prototyping.
      • Strong support for data visualization projects.
    • Cons:
      • Performance limitations compared to native implementations using C.
      • Struggles with large datasets due to memory management issues.
  • Practical Example:

    • Casimir shares a personal project where he aimed to improve data visualization for academic data through dynamically generated images to avoid layout issues in applications like Excel.
    • He successfully demonstrated code that generates images based on provided parameters using Ruby Processing, emphasizing the step-by-step process from sketch creation to dynamic image output.
  • Integrating Ruby Processing with Rails:

    • He explains the implementation of a job queue system using Beanstalk to asynchronously create images based on Rails model updates, highlighting how jobs are enqueued and processed dynamically.
  • Video Processing with Ruby:

    • Casimir introduces video processing capabilities of Ruby Processing, describing a project for video tagging, which includes loading video feeds and overlaying text on the processed video frames.
  • Future Considerations and Closing Thoughts:

    • Discusses the potential challenges of scaling this implementation for production environments and suggests replacing Ruby Processing with native solutions like ffmpeg for video processing needs.
    • He wraps up by suggesting additional resources for learning Processing and Ruby Processing, encouraging attendees to explore further.

Conclusion:

Casimir's talk emphasizes the fun and creativity found within coding using Ruby Processing while acknowledging its limitations. He encourages developers to embrace experimentation and creativity in their programming projects, ultimately fostering a collaborative and inspiring community around Ruby Processing.

00:00:16.640 All right, good afternoon everybody. This talk is entitled 'Code of R2.' This is kind of a follow-up to a talk I gave at RubyConf in November. If you didn't see that one, it's not essential knowledge, but I'm going to fly through some of the basics at the front end. All the slides and source code files are on GitHub, so feel free to check that out if you're interested.
00:00:44.079 This is what we're going to do: briefly talk about what Processing is, create some sample content, automate that content, and then discuss what next steps are if you want to pursue this as a project.
00:01:01.840 So first, what is Processing? It is a library built on top of Java. It's kind of a simplified programming language particularly designed for people interested in creating art—ranging from 2D and 3D to music, video, and interactive projects. Ruby Processing takes Processing and wraps it in a nice Ruby shell. This guy Jeremy Ashkenas has put it together, essentially as a one-man show. It's an extremely small community, and part of the reason I'm doing this talk, like my RubyConf talk, is to encourage people to check this out because I think it's really neat.
00:01:35.600 These are some of the things I showed off at RubyConf that you can do with Processing. In the top left, I've got a little abstract art piece in 2D. On the right, there are 3D balls chasing the mouse pointer using OpenGL. Below, there's real-time video processing, which I call the 'Witness Protection Program' video. It pixelizes your video, and lastly, a Pong game that doesn't even have a scoreboard—so everyone wins!
00:02:07.600 Now, let's talk about Ruby Processing—the good and the not so good. First off, it really supports creativity. There is an amazing gallery on processing.org of commercial and open-source projects built with Processing. One project I found pretty awesome is a map of Manhattan, where their nine-block squares are cross-referenced with international phone records and colorized based on which countries they were calling. There is a big push in the Processing community for data visualization, so if you're a data fan, Processing is for you. It is incredibly easy to develop with—it’s honestly the most fun I've had developing in a long time. It encourages you to experiment, and you can get really cool things done quickly.
00:03:36.879 However, the downside is that it's not very performant, especially when you’re using Ruby Processing. You're running a wrapper around the JVM along with the Processing library, and there are a lot of pieces in the mix. So, you're not going to write Quake 16 in Ruby Processing. It's great for quickly developing things and prototyping, especially for low-load or asynchronous solutions. But if you need hard real-time OpenGL, I suggest prototyping with Processing and then, when you’re ready to achieve 90 frames per second, consider rewriting using C.
00:04:01.519 Extremely large data can also be a struggle, partly because of how memory is managed. It's pretty easy to overflow the Java heap. You can increase the heap size and so forth, but it works best with smaller data. Most of those complaints are levied against both Ruby and Rails themselves, but as we all know, pretty amazing things can be created with them. Therefore, I think the same can be done with Ruby Processing.
00:04:36.159 This is really the story of a project I thought would be trivial but turned out to be quite complex. I hate this icon on my computer; it makes me feel like I'm constantly force quitting Excel. The RAM usage is ridiculous, and I don't understand why there's one layout called 'normal' and one called 'something else' but 'normal' isn't the default layout. If it's normal, it should be normal! I really dislike Excel, mostly due to feature envy.
00:05:03.440 I work with a lot of academic data. I have spent 26 of my 29 years either attending school, teaching high school, or serving as a middle school vice principal, and I work with academic data all the time. This is very common: we have very long table headers and narrow data. If we have horizontal text, the cell sizes end up being mostly blank spaces, which is completely unusable—especially in a web context. You can only fit five columns before you have to scroll across the page. All I wanted to do was implement a solution for my Rails app.
00:05:26.880 This part blew my mind. I've never wanted to create something that only works in Internet Explorer, but if you use even IE6, there's a simple CSS property that will rotate text, and it works great. CSS3 transforms are awesome because they almost work, but the column is measured before the rotation, resulting in unreadable, rotated text that overlaps other table cells. If you do it on a span, it doesn't even rotate correctly; divs and cells just don't behave as expected.
00:06:14.080 I looked through the Google results for JavaScript, CSS, SVG, and Canvas. With the latter two especially there are ways to do it, but they were too complicated for me. My conclusion was that I needed to dynamically create images based on the data in my Rails app. That's what I set out to do, so let's run through about 12 iterations of a program in these 45 minutes.
00:06:31.520 Creating images with Ruby Processing, as I said, it's great for 2D and 3D. For now, we're going to focus on very simple tasks using just 2D. This is what a Ruby Processing program looks like: it's called a sketch. When you write Processing, you’re creating sketches. There are two basic methods: 'setup,' which runs once when it initializes, and 'draw,' which by default runs infinitely in a loop as fast as possible unless you set a frame rate.
00:07:20.640 This specific example runs this code just once. The setup method sets the window size, while in the draw method, I set a white background. I pick a random fill color, and the drawing style of Processing is somewhat declarative. You simply state 'background,' as opposed to 'window.background.' You don't use object-oriented syntax here. When you use 'save,' it is intelligent about the file name; if you name it .png, it outputs a PNG, and .gif outputs a GIF. Once it hits exit, it stops execution. So, this draw is actually going to get run only one time.
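A minimal sketch along the lines just described might look like the following. It is written in Ruby-Processing's "bare" style (top-level setup and draw with no class wrapper), which assumes a version of the gem that supports it. The rectangle and the file name sample.png are illustrative choices, not taken from the talk.

```ruby
# sample.rb: run with `rp5 run sample.rb`
# Assumes a Ruby-Processing version that accepts "bare" sketches.

def setup
  size 200, 200                          # window size in pixels
end

def draw
  background 255                         # white background
  fill rand(256), rand(256), rand(256)   # random RGB fill color
  rect 50, 50, 100, 100                  # illustrative shape so the fill is visible
  save 'sample.png'                      # the extension selects the output format
  exit                                   # stop after the first frame
end
```

Removing the exit call would let draw loop, overwriting the file with a new random color on every frame.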
00:08:03.680 I call this a proof of concept because I wanted to ensure that it was possible to output files from Ruby Processing. To run Ruby Processing, you install the gem. Most gems are around 50K to 200K, but this one is 18 megabytes because it wraps an entire JRuby instance. So you're running an encapsulated JRuby instance, even if you already have JRuby installed. You get an 'rp5' executable, and 'rp5' has several run modes.
00:08:27.680 You can use 'rp5 unpack samples,' which will output a group of about 16 samples that are included in the gem, which is great for getting started and seeing example code. There's also 'rp5 run,' which does what you would expect: it runs the sketch once. More importantly, there's 'rp5 watch,' which speeds up the development cycle much like autotest. It watches the file, and as soon as you make changes, it reloads the code. I'll use 'rp5 watch' today just in case I make mistakes, running it on 'sample.rb.' It will take a few seconds, and you'll see it flash up here when it's done.
00:09:16.480 I go to my file system, and this is the graphic generator that outputs a file, just to show you that I'm not making this up. I can save the file and give it a different file name. If you really wanted to, you could close it or say you're going to take out the 'x' and save it. Now, this is going to put it into a loop, and wow—colors! The concept has been proven, as far as I'm concerned; I can output a file.
00:10:14.320 This is the goal for the project: given a string like 'Week 6 Quiz,' the name of an assignment, I want to create an image that looks like this. It's vertically oriented, and it should read bottom to top (you'd have to tilt your head taco-style to read it). Its size depends on the length of the text. I don't want to automatically have a 200 by 200 image; I want it properly sized for the text's length. I'm going to use the Silkscreen font, which is from kottke.org. It's a font built for small sizes, readable down to 5 or 6 pixels, much smaller than your typical 6 point. For smaller text, your disclaimers should be in Silkscreen.
00:11:02.240 I was just going to use black text on white for this project. This is how the algorithm is going to work: I’m going to outline my setup and my draw methods. I’m going to set some default options that can be overridden. Then, I will create the window. In Processing, the window's origin (0, 0) is at the top left. I need to move the origin down to the bottom of the window and slightly to the right for text placement. If it was outputting text, the origin of this text would be the bottom left corner since the baseline determines how text works. Thus, I have to move it not only down but also to the right by the height of the letters so that my text will end up positioned well.
00:12:21.680 Once the origin is moved there, I have to rotate the plane. This is so that the text gets written in the correct direction. Then I’ll write the text and output the file—here's the demo.
00:12:39.760 That’s the finished demo. I’ve outlined my algorithm and steps, but I’m going to implement it in a slightly different order than I originally envisioned because it’s easier for you to follow. So, I start with a setup and draw methods, and I add this 'load_parameters' method. This is just Ruby, so you can access other objects—you can create your own methods and include modules. I call 'load_parameters' to load up my default parameters, set the size of the window, and then proceed to my draw method.
00:13:01.440 So far, all the draw does is set the background to the specified background color. Again, I will utilize 'rp5 watch' here, so as I make changes you will see them automatically show up. Oh, I didn’t save it. Okay, white window—very impressive! Next, I'm going to add a few more parameters; my second step—really step four—is to write the text. I’m doing this part a little out of order. First, I want to load a font, as Processing uses the VLW font format. You can take any OpenType or TrueType font and create a VLW.
00:14:05.600 It’s basically a rasterized set of bitmap images. You specify the OpenType font you want at a certain size, and Processing will output that file for you. It has its advantages and disadvantages, but anyway, this is how you load the font. I set my font color to black and set some default text. In my draw method, I will actually output the text. I set my fill color to the font color, and 'textFont' just utilizes this parameter to set the font. I will finally output the text at x position 0 and y position 10.
00:14:49.839 If I set it to 0,0, it would start at the corner and write the text above the frame of the window, so I set it to 10 so we could see it. If I hit save, great, you see the text there! Next, I need to move the origin. In Processing, the command to do that is 'translate.' I just added this line to translate the origin down to the line height and took off 2 pixels—to make the crop a little tighter. After some experimentation, I found that fonts have a little built-in padding, so it’s best to trim off a couple of pixels.
00:15:38.880 I've changed the text position from 0,10 to 0,0. If I hit save, now my default text shows up down here at the bottom left. Step six is to rotate the plane. Processing uses radians, which you probably haven't thought about since geometry class. The call looks a little confusing because it reads like -90 radians, which would be about 14 complete circles, but the radians helper converts degrees to radians for you. I wanted the plane rotated by -90 degrees (counterclockwise), so I needed to convert that to radians.
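The arithmetic behind that conversion can be checked in plain Ruby. The helper name to_radians is made up for this illustration; inside a sketch you would call Processing's own radians helper instead.

```ruby
# Degrees-to-radians conversion, the same arithmetic Processing's radians() performs.
def to_radians(degrees)
  degrees * Math::PI / 180.0
end

# How many full circles a given number of radians represents.
def full_turns(radians)
  radians.abs / (2 * Math::PI)
end

puts to_radians(-90)            # -PI/2, a quarter turn counterclockwise
puts full_turns(90).round(1)    # what a literal 90 radians would amount to
```

A quarter turn is only about 1.57 radians, so passing a raw 90 to rotate would spin the plane many times over rather than tipping it on its side.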
00:16:02.880 Now, I just call 'rotate' to rotate the plane. So, anything I draw or text I write after this rotation will now appear vertically oriented. I’ll hit save, and the text pops over there. I’m almost done! Finally, I want to output the file. I’ll give it the subtle name 'default' and set the file extension to .png. Then I'll concatenate these pieces together with a dot to output the actual file. Lastly, I add an 'exit' command because I don’t want it saving the file over and over and thrashing my hard drive.
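Pulling the steps together, the whole sketch might be assembled roughly as follows. This is a reconstruction under assumptions, not the code shown in the talk: it uses the "bare" sketch style, and the option names, the font file name Silkscreen-8.vlw, and the window dimensions are all hypothetical.

```ruby
# vertical_label.rb: run with `rp5 run vertical_label.rb`
# Assumes a VLW font file in the sketch's data folder (file name is hypothetical).

def load_parameters
  @opts = {
    'text'        => 'Week 6 Quiz',
    'font_file'   => 'Silkscreen-8.vlw',
    'line_height' => 10,     # approximate letter height in pixels
    'width'       => 10,
    'height'      => 72,
    'filename'    => 'default',
    'extension'   => 'png'
  }
end

def setup
  load_parameters
  size @opts['width'], @opts['height']
  @font = load_font(@opts['font_file'])
end

def draw
  background 255                              # white background
  fill 0                                      # black text
  text_font @font
  translate @opts['line_height'] - 2, height  # origin: bottom edge, one letter height to the right
  rotate radians(-90)                         # text now runs bottom to top
  text @opts['text'], 0, 0
  save "#{@opts['filename']}.#{@opts['extension']}"
  exit                                        # write the file once, then stop
end
```

After the rotation the text baseline runs up the screen from the translated origin, so the glyphs occupy the strip between the left edge and the origin's x position.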
00:16:50.560 Now, once I hit save, the window should close because it runs that one draw. If I look in my file system, great, the default file worked. Okay, I can apply this but, obviously, it's infeasible for me to type all the parameters manually. My first implementation was quite rough; I created CSV files and then had my Ruby Processing sketch watch a folder for a CSV file to pop up and bring into the program. It was ugly. I wanted to figure out a more robust and mature way to pass messages from Rails to Ruby Processing.
00:17:38.720 I was excited when I saw this book released a few months ago. I saw Dave Thomas speak a year ago, and someone asked him a great question: what's your favorite feature in Ruby that nobody uses? His answer was Rinda. He then proceeded to spend about ten minutes writing a distributed application using Rinda. If you're not familiar, Rinda uses a chalkboard model where you post jobs and workers look for jobs they can fulfill. However, there are only about two tutorials for using Rinda in English, dating back to 2005, and the content remains somewhat outdated.
00:18:38.160 So I started working through this and realized it was a bit more complex than I was ready to tackle. This was partly because I needed to communicate between two interpreters—my Rails app would run in MRI while my sketch would run in JRuby. The marshaling of objects back and forth was concerning. I then returned to what I knew: Beanstalk. If you haven't used Beanstalk for background processing before, let me tell you, it's simple and fast! It’s awesome. You can install Beanstalk on macOS easily, and if you don’t use Homebrew, you should check it out. I didn’t understand the hype until I tried it: it’s incredibly useful.
00:19:19.200 Beanstalk runs easily on numerous Linux distributions, and by default it listens on localhost at port 11300; you can run multiple Beanstalk instances on different ports as needed. The catch with Ruby Processing is that you don't have access to the same gems and objects that you would from your native Ruby interpreter. The beanstalk-client gem is how you interact with Beanstalk from Ruby: all you need to do is install it and require it in an IRB session. To connect to Beanstalk, you simply say 'Beanstalk::Pool.new,' giving it the address and port.
00:20:14.560 To put a job onto the queue, you call 'put' and provide a string for the job. There's nothing complicated about it. The protocol is modeled on memcached's: you just put strings in and take them out as needed. When you put a job, you get back a job number. When pulling jobs from the queue on the other side, again require the beanstalk-client gem and connect as before. You say 'bs.reserve,' which finds a job and gives it back to you. The string that was put in with that job is its body.
00:21:06.480 When you reserve a job, the job isn't removed from the queue until you delete it; it’s marked as reserved. So, there’s a timeout period for if your worker fails, which by default is 300 seconds; it will then give the job to another worker. I just print out the body, and once I’ve completed whatever work I needed to do, I say 'job.delete' to tell Beanstalk I’m done and to remove that job from the queue. If there’s no job, 'bs.reserve' blocks, making it a built-in wait mechanism and minimizing resource usage.
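The put, reserve, and delete cycle can be sketched as below. So that the example runs without a Beanstalk server, it uses a small in-memory stand-in (FakePool and FakeJob are invented for this illustration and are not part of Beanstalk). With the real gem you would instead require 'beanstalk-client' and pass in Beanstalk::Pool.new(['localhost:11300']).

```ruby
# In-memory stand-in for a Beanstalk pool (NOT part of Beanstalk).
class FakeJob
  attr_reader :body

  def initialize(body, queue)
    @body = body
    @queue = queue
  end

  def delete
    @queue.delete(self)   # the job leaves the queue only when the worker says so
  end
end

class FakePool
  def initialize
    @jobs = []
  end

  def put(body)
    @jobs << FakeJob.new(body, @jobs)
    @jobs.size            # stand-in for the job number Beanstalk returns
  end

  def reserve
    @jobs.first           # a real pool would block here until a job arrives
  end

  def size
    @jobs.size
  end
end

# Worker step: reserve a job, do the work, and delete the job afterwards.
def process_next(pool)
  job = pool.reserve
  result = yield job.body
  job.delete              # acknowledge only after the work succeeded
  result
end

pool = FakePool.new
pool.put('Week 6 Quiz')
result = process_next(pool) { |body| body.upcase }
puts result
```

Because the job is deleted only after the block returns, a worker that raises partway through leaves the job in the queue, which mirrors the reserve-then-delete behavior described above.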
00:21:38.720 This is how we'll implement the Beanstalk solution. We’ll start with the Rails app, posting a job composed of text parameters to Beanstalk. The Ruby Processing sketch will wait for jobs, and once a job appears, those text parameters are sent to Ruby Processing. It creates the image and puts that image directly into the public folder of the Rails app.
00:22:23.680 From the Rails side, I just set up a model for assignments containing students and grades. When an assignment is saved, I find Beanstalk, create a job, and push the information including the filename and text to Beanstalk. On the worker side, once it waits for a job, it retrieves the parameters, generates the image, and removes the job from Beanstalk. It’s that simple.
00:23:05.840 As I mentioned, Beanstalk functions with plain strings, so some serialization is necessary. I chose to use YAML because it's easy and I like it, but JSON or other plain text formats work too. In the assignment model, I use 'after_save' to define the logic for generating the header image. I locate the Beanstalk client and push the filename and assignment name converted to YAML onto Beanstalk.
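The serialization step on the Rails side might look something like this. The method name header_job_payload and the assignment_ID naming scheme are hypothetical; in the model, the returned string would be handed to the queue's put from the after_save callback.

```ruby
require 'yaml'

# Builds the string that would be pushed onto the queue when an assignment
# is saved. The file naming scheme is an assumption for this illustration.
def header_job_payload(assignment_id, assignment_name)
  {
    'filename' => "assignment_#{assignment_id}",
    'text'     => assignment_name
  }.to_yaml
end

puts header_job_payload(42, 'Week 6 Quiz')
```

String keys are used so the payload loads back as a plain hash on the worker side, regardless of which interpreter reads it.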
00:23:51.760 It’s probably more efficient to have the Beanstalk finding process centralized, as I really only require a singleton Beanstalk. Nonetheless, this works. Unfortunately, Ruby Processing can’t easily use gems, which is one of the most significant challenges when you start integrating Ruby Processing into your project.
00:24:32.400 You may expect to connect to awesome libraries or your Sinatra apps, but you're running on JRuby, and that means two snags. First, JRuby can't use gems with native C extensions, since it runs on the JVM. Second, it's challenging, and many times impossible, to install gems into the JRuby instance Ruby Processing uses. You can do some intricate tricks with your load path to point it at your local MRI interpreter's gems, but practically, if you have pure-Ruby gems, just unpack them into your sketch and require them right there.
00:25:11.840 So, that’s all I did. I created a vendor folder. You can unpack the Beanstalk client there. Moving on, during this process, I'm not going to run the sketch at each iteration. I will only show the steps, as I can’t execute the sketch each time. To start out, I’ll set my load path and required gems. I’ll work from the last demo, modifying the load path to find the unpacked gem, and then I will require the Beanstalk client and YAML, which is core to Ruby, so it's available in JRuby.
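Setting the load path for a vendored gem might look like the following. The directory name, including the version number, is hypothetical; it stands for wherever the gem was unpacked (for example with gem unpack) inside the sketch folder.

```ruby
# Point the load path at a gem unpacked under ./vendor.
# The folder name and version below are assumptions for illustration.
vendored_lib = File.join(File.dirname(__FILE__), 'vendor', 'beanstalk-client-1.0.2', 'lib')
$LOAD_PATH.unshift(vendored_lib) unless $LOAD_PATH.include?(vendored_lib)

require 'yaml'                 # ships with Ruby, so it is available in JRuby too

begin
  require 'beanstalk-client'   # resolved from the vendored copy if it is present
rescue LoadError
  warn 'beanstalk-client not found under vendor/; unpack the gem there first'
end
```

Putting the vendored directory at the front of the load path means the sketch finds that copy first, without needing to install anything into the bundled JRuby.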
00:25:45.040 So, everything else remains the same from our previous demo. Next, I need to modify the setup to specify a default window size. The windowing toolkit that JRuby uses—I'm assuming it's Swing—allows the window to have a different size than the drawing canvas itself. So I’m creating a 200 by 200 window, although the canvas might be larger.
00:26:35.680 Secondly, I will find Beanstalk using the same code we used before. Then, third, I want to load the job parameters. In my load_parameters method, I will use 'beanstalk.reserve.' If no job is pending, the sketch will block here and wait until a job shows up. When a job is found, it will take the job and ask for its body—the YAML string. This seems a little backward because you're un-yamling it, but it effectively outputs an actual hash. I'll then merge these options with my defaults and remove the job.
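The parsing and merging part of load_parameters can be illustrated in plain Ruby. The default values and the method names here are assumptions; in the sketch, the body string would come from the reserved job.

```ruby
require 'yaml'

# Hypothetical defaults; any key present in the job overrides these.
def default_options
  {
    'text'       => 'Default',
    'filename'   => 'default',
    'extension'  => 'png',
    'char_width' => 5
  }
end

# Turns a job body (a YAML string) back into a hash and layers it over the defaults.
def options_from_job_body(body)
  default_options.merge(YAML.load(body))
end

body = { 'text' => 'Week 6 Quiz', 'filename' => 'assignment_42' }.to_yaml
p options_from_job_body(body)
```

Keys the job leaves out, such as the extension, fall back to the defaults, so the Rails side only needs to send what changes per image.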
00:27:29.040 For a more robust solution, you probably shouldn't delete the job until after the image has been created. Now, I also want to calculate the image size. This is slightly messy but manageable. I'm introducing a character width parameter. I initially thought Silkscreen was a fixed-width font, but it turns out it is not, so I approximated that characters average around 5 pixels wide.
00:28:16.000 Then, using a formula, I determine how tall my image should be. I take the length of the text being passed in, add one, multiply by the character width, and apply a scaling factor of 1.2. Admittedly, this could be implemented far better by measuring specific letters, but it works well enough for prototyping.
00:29:07.040 Next, I need to load the parameters and resize the window during each draw. There are two models here: you can create an entirely new Ruby Processing environment for every job, or you can start it once and let it block, running draw once for each posted job. The latter is generally more reasonable, provided Ruby Processing doesn't have a memory leak. Since draw gets repeated, I'll call 'load_parameters' inside the draw method instead of in setup.
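As a worked example of that formula, here is a small helper. The name image_height and the rounding to a whole pixel are assumptions added for the illustration; the 5 pixel width and 1.2 factor come from the talk.

```ruby
# Estimated image height in pixels for a vertically rendered label:
# (length + 1) * character width * scaling factor, rounded to a whole pixel.
def image_height(text, char_width = 5, scale = 1.2)
  ((text.length + 1) * char_width * scale).round
end

puts image_height('Week 6 Quiz')   # 11 characters: (11 + 1) * 5 * 1.2 = 72
```

The extra character and the 1.2 factor both act as padding, which is why an average width is good enough even though the font is proportional.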
00:29:52.480 Lastly, I will set the target directory, creating a constant—though one could also output images to S3 or define an alias in your filesystem or environment variables. This points to my Rails app's directory, specifically within a 'generated' folder where I store my automatically generated images. I’ll tack that directory onto the front of the target.
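Building the output path could be done along these lines. The directory shown is a placeholder for the Rails app's generated-images folder, and target_path is a name invented for this illustration.

```ruby
# Joins the target directory with the file name and extension.
# The default directory is a placeholder, not a path from the talk.
def target_path(filename, extension, dir = '/path/to/rails_app/public/generated')
  File.join(dir, "#{filename}.#{extension}")
end

puts target_path('assignment_42', 'png')
```

Keeping the directory as a parameter makes it easy to swap in a value from configuration or an environment variable later.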
00:30:38.960 Now, all my steps are checked off, and I’m ready to go. This is the same file, albeit with many more pieces. I have the sketch, Beanstalk, the Rails server, and I have to actually make some requests. Here’s my Rails server: it's already running. As I said, to start Beanstalk, all I need to do is execute 'beanstalkd'. By default, there’s no particular output—it's just waiting silently for my command.
00:31:30.320 I will use 'watch' again in case I make a typo, but in production, you would typically just use 'run' instead. A notable consequence of writing Ruby Processing sketches is the stack trace. While some people complain about the Rails stack trace, Ruby Processing's stack traces can get crazy because it mixes Java exceptions causing Ruby exceptions, resulting in an extravagant number of pages. However, generally, line error detection is reasonably decent, allowing you to pinpoint where problems occur.
00:32:20.000 If I inspect our setup here, it’s booted up Ruby Processing with my 200 by 200 window, just waiting and blocking for a job to hit the queue. In my Rails app, visually represented in 'Tablers,' all it does is output that table in the view. I basically output an 'image tag' and ask the assignment for its image path, with the image path being defined within the model.
00:33:25.080 So again, this should ideally go into an environment configuration. Yet, as expected, if I refresh the page, it shows no images because the image folder is empty. The setup will trigger as soon as we edit or create an assignment, so I hit edit and change the assignment's name to 'Week 6 Quiz.' After I save it, view all, and—boom! The text image appears!
00:34:15.760 The file name is tied to the unique ID I generated. I test out creating a new assignment, and again: bam, done! Yes!
00:35:00.000 So what have we done? We established a distributed message-passing framework, applying Ruby Processing to power the worker, and rolled it all into a Rails app. Now we're successfully coordinating two independent Ruby interpreters, incorporating external Java libraries, and making everything function seamlessly.
00:35:54.880 Now, what’s next? In the brief for this talk, I mentioned video, and Ruby Processing makes this too easy. Rendering a video actually takes quite some time, so let me illustrate live video now.
00:36:38.240 I call this 'Video Tagger.' The idea is to process videos uploaded through your Rails app, whether they are screencasts or user-submitted samples. You’ll want to mark them—like stating ‘Property of my Top Secret Website.’ To achieve video work in Processing, you load several libraries from Java QuickTime, set the window size—the smaller, the faster the frame rate.
00:37:24.560 You can set it bigger, but just remember, size matters. This single line of code turns on my webcam and activates video processing. I am astonished at how easy it is! I replicate a similar setup as before, loading parameters and defining the default behavior while outputting a text that shows 'Secret Property of Mountain West RubyConf.' In my draw method, I include logic that waits for a frame to come in from the camera as long as draw runs faster than 30 FPS.
00:38:02.960 Then, the image function captures that frame of video to display it. Additionally, I’ll draw a black bar at the bottom of the window for a little more flair. I determine the coordinates for that rectangle and set it across the width of the window. I then output some text, positioned to the bottom right corner while scaling with the entire line length.
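The overlay geometry can be sketched with two small helpers plus a draw method that uses them. The helper names, the 20 pixel bar height, and the padding are assumptions. The draw method further assumes that setup has already created @video (a Processing capture object), set a font, and stored the caption string in @caption.

```ruby
# Rectangle [x, y, w, h] for a caption bar across the bottom of the frame.
def caption_bar(frame_width, frame_height, bar_height = 20)
  [0, frame_height - bar_height, frame_width, bar_height]
end

# Baseline position that right-aligns a caption of the given pixel width
# inside the bar, leaving a small padding.
def caption_position(frame_width, frame_height, caption_width, padding = 5)
  [frame_width - caption_width - padding, frame_height - padding]
end

# Intended to run inside a Ruby-Processing sketch; @video and @caption are
# assumed to be set up elsewhere.
def draw
  @video.read if @video.available        # grab a new frame when one is ready
  image @video, 0, 0                     # paint the frame
  fill 0
  rect(*caption_bar(width, height))      # black bar along the bottom
  fill 255
  x, y = caption_position(width, height, text_width(@caption))
  text @caption, x, y                    # white caption, right-aligned
end
```

Computing the x position from the measured caption width keeps the text pinned to the right edge however long the caption is.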
00:39:21.840 Let me wrap this up—the video processing features we uncovered were straightforward. However, from here, I consider this a prototyping project. The next question pertains to what happens on the server side. While developing on my laptop runs great, attempting this on an actual server introduces challenges. Video processing requires a windowing toolkit where it tries to render windows; in a headless Ubuntu host, if it cannot find X windows running, it will crash.
00:40:30.560 Thankfully, in the Unix world, someone else has likely faced this issue. Thus, they created 'Xvfb' (the X Virtual Frame Buffer). This essentially lies to your programs that X is running, taking their windowing commands and discarding them. Fortunately, this method is effective because we don’t care about the display, just the saving of the files.
00:41:56.079 To scale beyond a prototype, we have to consider what's next. Processing is good, but it could prove sluggish at scale. If this were running too slowly and you were handling a thousand users, you could spin up additional Ruby Processing workers, provided it's efficient with RAM.
00:42:40.000 Otherwise, if Ruby Processing is not viable anymore, you could reimplement your workers; essentially, you'd first spend two weeks installing ImageMagick and then enlist someone to figure it out. This is also the case with audio and video; ffmpeg can handle virtually everything you could dream of with command-line arguments.
00:43:35.760 The goal should be to make it work in Processing first, then consider implementing ffmpeg. Also, native OpenGL processing is wrapped for you, but if you demand ultimate performance, you’ll be writing in C. Panda is another excellent video-handling library.
00:44:31.920 Lastly, here are the links I recommend. One valuable resource is the book 'Learning Processing'—it’s aimed at non-programmers, so you can skim through it quickly. More importantly, Jeremy and a few other contributors have re-implemented all the examples from the book with Ruby Processing, allowing you to see how it works in Ruby.
00:45:35.840 If you're interested in the Silkscreen font, the link is here. I also teach classes; the one most relevant to you covers professional Rails practices and pair programming using Pivotal Tracker. If you're interested, chat with me about it. That's it! Do I have time for questions? If you have any, please feel free to ask!
00:46:14.880 Oh, yeah—go ahead.
00:46:18.720 There are alternative implementations of Processing—like Processing.js, which tries to implement it in JavaScript for usage in the browser, but it only covers a small subset of Processing. Other implementations exist, but that would be a sizeable undertaking by itself since Processing relies on many existing Java libraries, such as the Java QuickTime library.
00:46:51.520 For instance, while those libraries flourish on a Mac, porting them won’t be a small feat; it will require substantial work. So, are there any other questions?
00:47:50.240 If not, go ahead and create some dynamic images!
00:48:30.000 Thank you!
Explore all talks recorded at MountainWest RubyConf 2010