00:00:10.480
Okay, we'll get started.
00:00:12.320
Hello everyone, and welcome to RubyConf 2021. I'm Chris Cha, a full stack developer working in test automation and DevOps at The Container Store, which specializes in organizing solutions like custom closets and the joy of tidying up your space.
00:00:17.840
Yes, we still use containers but also cloud and orchestration technologies to define a scalable, highly available future. Join us!
00:00:38.239
The title of my talk today is "Automating Legacy Desktop Applications with JRuby and Sikuli."
00:00:44.320
Here's an example of my development environment: I have Vim to edit text, Sikuli IDE to manage screenshots, xFreeRDP to remote into the Windows VM, and multiple terminals to run Vagrant and other tasks within the Windows VM. I use Command Prompt to run individual tests with JRuby.
00:01:03.680
Our goal is to automate a very battle-tested legacy application. To start, I couldn't just run Windows natively because I'm on a Mac, and the app is Windows only. My go-to for VMs is VirtualBox, and Vagrant provides an easy way to export from VirtualBox.
00:01:12.799
I found a Windows 10 Pro image from Microsoft and made changes to help with remoting. If I could run JRuby with Sikuli, I could win. At this point, the strategy existed only in my imagination. My ambition was to learn Vim, Vagrant, Sikuli, and JRuby, plus Java, all at once, and I assumed it would all integrate in the end.
00:01:39.280
Here is the transform script. It takes every ed file in the transforms directory and applies it to the Vagrantfile. I wanted to be able to start from scratch at any time, create a new folder, and get back to the current state. This forced me to keep changes isolated and, in retrospect, helped manage complexity: the Vagrantfile would otherwise have been littered with commented-out half-experiments and distractions over syntax from previous attempts.
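As a rough sketch of that loop in Ruby (not the script shown on the slide), the idea is to apply each ed file to the target in a predictable order. The `transforms` directory layout, the `apply_transforms` name, and the injectable `runner` are assumptions made for illustration; the default runner shells out to the standard `ed -s` command.

```ruby
require "open3"

# Default runner: feed one ed script to `ed -s <target>` on stdin.
# The -s flag keeps ed from printing byte counts.
ED_RUNNER = lambda do |script_path, target|
  _out, err, status = Open3.capture3("ed", "-s", target, stdin_data: File.read(script_path))
  raise "ed failed on #{script_path}: #{err}" unless status.success?
end

# Apply every *.ed file in transforms_dir to target, in sorted filename order.
# Returns the list of script names that were applied.
# (Hypothetical helper; the runner is injectable so the loop can be tried without ed.)
def apply_transforms(transforms_dir, target, runner: ED_RUNNER)
  scripts = Dir.glob(File.join(transforms_dir, "*.ed")).sort
  scripts.each { |script| runner.call(script, target) }
  scripts.map { |script| File.basename(script) }
end
```

Sorting by filename means a numeric prefix on each ed file is enough to control ordering, which fits the goal of rebuilding the current state from an empty folder.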
00:01:55.439
From earlier research, I knew xFreeRDP would let me connect to Windows machines. I just needed to figure out how to do that with a Windows VM instead of a remote host. Fortunately, many others had solved this problem. Our first transform looks for the line 'config.vm.network' and appends content to it. It almost looks like a code block. RuboCop always reminds me to put a newline at the end of a file, which is important to do with ed files too.
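To make the effect of that first transform concrete, here is a plain-Ruby emulation of "find a line, append after it". It is only a sketch of what the ed file achieves, not the ed file itself; `append_after` is a hypothetical helper, and the forwarded-port line in the usage note is an assumed example of what one might append for RDP.

```ruby
# Insert `addition` as a new line right after the first line matching `pattern`.
# Raises if nothing matches, so a transform never silently does nothing.
def append_after(text, pattern, addition)
  lines = text.lines
  index = lines.index { |line| line.match?(pattern) }
  raise ArgumentError, "no line matches #{pattern.inspect}" if index.nil?

  # Both the matched line and the addition must end in a newline.
  lines[index] = "#{lines[index].chomp}\n"
  addition = "#{addition}\n" unless addition.end_with?("\n")
  lines.insert(index + 1, addition)
  lines.join
end
```

For example, `append_after(File.read("Vagrantfile"), /config\.vm\.network/, '  config.vm.network "forwarded_port", guest: 3389, host: 33389')` would add an RDP port forward below the first network line.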
00:02:27.599
I knew the app in question pulled in its own JDK. I just needed to bootstrap the app, and AdoptOpenJDK had the easiest way to install. So, we'll call this the "Is this even going to work?" step. If anything, at least I had a Vagrantfile for a nice Windows VM, and JRuby needed Java.
00:02:46.160
I could use Command Prompt to poke around inside the system. Here’s the batch script we used to set up Java: we download, extract, set a path, and done! This is the usual route for most binaries. I avoided using a package manager for now since I didn’t want any issues there to complicate attempts.
00:03:02.319
I really like PowerShell to extend batch scripting capability. Since Vagrant prints things to the console, we could invoke commands within the VM OS context and get useful results back. Here, we invoke Java and ask for its version. This verifies the path is set, and JRuby will know where to get its Java, and all will be well. More challenges await.
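The same kind of path check can be sketched from Ruby. This is an illustration only, not the PowerShell script from the talk: `which` is a hypothetical helper that walks `PATH` (and `PATHEXT` on Windows) to see whether an executable such as `java` can be found.

```ruby
# Return the full path of the first matching executable on PATH, or nil.
# On Windows, PATHEXT supplies suffixes such as .EXE; elsewhere it is empty.
def which(command, path: ENV.fetch("PATH", ""), extensions: ENV.fetch("PATHEXT", "").split(";"))
  extensions = [""] if extensions.empty?
  path.split(File::PATH_SEPARATOR).each do |dir|
    next if dir.empty?

    extensions.each do |ext|
      candidate = File.join(dir, "#{command}#{ext}")
      return candidate if File.file?(candidate) && File.executable?(candidate)
    end
  end
  nil
end
```

If `which("java")` returns a path, running `system("java", "-version")` afterwards should print the version details, which is the signal the verification step is looking for.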
00:03:50.399
My second transform is a sanity check to make sure PS1 files work, so that PowerShell executes as well in case we need it later. It is also a way to append to the end of the Vagrantfile instead of the middle. We use pretty much this approach to add the rest of the changes.
00:04:21.120
Here is the transform to the Vagrant file to layer JDK installation on top. Every change, big or small, is run through a pattern of a transform and a script. Here we just layer on the Java verification script where we check the Java version.
00:04:39.360
Still, we need our app to test; it's not available from the VM. We have to retrieve it from somewhere. Our source is a mapped network drive, so we need to map the network drive. We figured out Java first since that was easier, but now we dig further into Windows. We need an app to run.
00:05:04.000
It might be nice to share the Vagrant dev loop we established. We retrieve a public image, apply transforms, and then start it up with Vagrant. We reboot it and then try to remote in with xFreeRDP. Now we have a Java development environment, but it can become so much more.
00:05:34.720
We need a strategy to map a network drive correctly every time. We can do this with a scheduled task. We use the XML version, obtained by exporting a basic task, because this is the only way to create a task with all conditions disabled. For example, on a laptop that is not plugged in, the scheduled task would not run because it would detect that our machine was on battery power.
00:05:58.560
The crucial piece here is PowerShell supports here-doc strings so we can paste the XML directly. Here's the rest of the scheduled task XML and script: we create the scheduled task with PowerShell using the XML content. After Vagrant reloads, we should have a mapped drive.
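Ruby has the same here-doc convenience, so the idea can be sketched without PowerShell. The element names below (`DisallowStartIfOnBatteries`, `StopIfGoingOnBatteries`, `LogonTrigger`, `Exec`) come from the Windows Task Scheduler XML schema; the `scheduled_task_xml` and `xml_escape` helpers are hypothetical, and a real exported task carries more elements (registration info, principals) than this trimmed example.

```ruby
# Minimal XML escaping for text placed inside elements.
def xml_escape(text)
  text.gsub("&", "&amp;").gsub("<", "&lt;").gsub(">", "&gt;")
end

# Build a trimmed task definition with the battery conditions switched off.
# The squiggly heredoc (<<~) strips the leading indentation.
def scheduled_task_xml(command:, arguments: "")
  <<~XML
    <Task version="1.2" xmlns="http://schemas.microsoft.com/windows/2004/02/mit/task">
      <Triggers>
        <LogonTrigger>
          <Enabled>true</Enabled>
        </LogonTrigger>
      </Triggers>
      <Settings>
        <DisallowStartIfOnBatteries>false</DisallowStartIfOnBatteries>
        <StopIfGoingOnBatteries>false</StopIfGoingOnBatteries>
        <Enabled>true</Enabled>
      </Settings>
      <Actions>
        <Exec>
          <Command>#{xml_escape(command)}</Command>
          <Arguments>#{xml_escape(arguments)}</Arguments>
        </Exec>
      </Actions>
    </Task>
  XML
end
```

Keeping the XML inline means the task definition travels with the provisioning script instead of living in a separate file that has to be copied into the VM.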
00:06:14.080
All this supports just one batch script, but it is a useful pattern for the future. There is a bit more to it: we make sure to delete our temporary XML file. In order to map the drive, we need credentials. These can be collected at Vagrant build time with a user prompt.
00:06:44.720
Now that we can acquire the app, we have to figure out how to run the darn thing. Each app is different and unique in its own way. In our case, the app uses Java Swing with multiple windows and a separate launcher dialog. The app was distributed as a JNLP file. By opening it as a text file, we could retrieve the startup arguments and try to reproduce the launch ourselves.
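A minimal sketch of that extraction in Ruby, assuming a conventional JNLP layout with `jar` elements and an `application-desc` element holding `argument` children. The `java_command_from_jnlp` helper is hypothetical and uses plain regular expressions rather than an XML parser; a real JNLP file may also carry properties, VM arguments, and a codebase that this ignores.

```ruby
# Pull the main class, jar list, and arguments out of JNLP text and
# assemble an argv array suitable for system(*argv).
def java_command_from_jnlp(jnlp_text)
  main_class = jnlp_text[/<application-desc[^>]*\bmain-class="([^"]+)"/, 1]
  raise ArgumentError, "no main-class found" if main_class.nil?

  jars = jnlp_text.scan(/<jar[^>]*\bhref="([^"]+)"/).flatten
  args = jnlp_text.scan(%r{<argument>(.*?)</argument>}m).flatten.map(&:strip)

  # File::PATH_SEPARATOR is ";" on Windows and ":" elsewhere.
  ["java", "-cp", jars.join(File::PATH_SEPARATOR), main_class, *args]
end
```

Returning an array rather than a single string avoids quoting problems when the result is handed to `system`.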
00:07:05.920
We take the args and pass them to Java, hoping for the best with fingers crossed. Fortunately, it worked. To reliably map the network drive on startup, we always remove the mapping and then remap it. With the right choices in the exported XML scheduled task, those magic arguments can be used forever.
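The remove-then-remap sequence might look like the following when the commands are built from Ruby. This is a sketch under assumptions, not the batch script from the talk: `remap_drive_commands` is a hypothetical helper, and it relies only on the standard `net use` options `/delete`, `/y`, `/user:`, and `/persistent:`.

```ruby
# Build the two `net use` invocations: drop any stale mapping, then map again.
# Each command is an argv array so values with spaces need no extra quoting.
def remap_drive_commands(letter:, share:, user:, password:)
  drive = "#{letter.upcase}:"
  [
    ["net", "use", drive, "/delete", "/y"],
    ["net", "use", drive, share, password, "/user:#{user}", "/persistent:no"]
  ]
end
```

On the Windows side one could run them with `remap_drive_commands(...).each { |argv| system(*argv) }`, ignoring a failure of the first command when no mapping exists yet.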
00:07:20.000
We’ll never have to worry about drive mapping again, at least for this VM. Oh, I forgot to mention we are creating a startup script that will be run at startup, not actually mapping the drive at Vagrant runtime.
00:07:43.680
Here we map the drive. We also list the files to verify we have indeed mapped the drive; otherwise we can't get the app. Some minor refactoring has been done here to hint to developers at the expected format of the credentials.
00:08:03.680
All of these steps have been to support eventual automation, which means we need to run things headless. We try a brief foray into Linux with XVFB, a virtual display frame buffer. Without headless, the automation test would always need to run on a user's machine.
00:08:36.479
Here we set up the scheduled task to start up the app on VM startup like kiosk mode, with the rest of the script where Windows launches the app when we remote in. At this point, OS provisioning is complete with the app launching on startup. We can test with Sikuli.
00:09:14.640
Our first test is to take a screenshot since that will prove XVFB can see something and that Sikuli can run with our installed version of Java. JRuby is used here because executing Sikuli IDE needs a version of JRuby.jar that resides next to it. In a future slide, we’ll install JRuby from scratch and use Sikuli solely from code.
00:09:34.640
Note that we are again creating this script that will launch on startup, not actually executing the script. Remoting should launch the app and take a screenshot. Later, we have the app launch on connect, not just at startup. The JRuby used here is somehow already available.
00:09:54.320
Next are the Ruby files that take the screenshot. We use the native Java imaging library and Sikuli's capture method to save an image of the screen. We employ an instance of Sikuli's screen and region objects to get our bounding box, capture that, and write the bytes to a file.
00:10:09.760
We also fix a bug in this commit where we forgot to add a quit command to the .ed file. The Ruby script just invokes our screenshot module to send the capture message, specifying a file name.
00:10:29.919
We are finally at JRuby. The text approach allows us to define tests as text and organize a software architecture around a programming language. Here we install JRuby from the internet and specify the Java home environment variable.
00:10:55.840
The script to install JRuby continues by setting the path after we find the latest JRuby installation directory. As before, we use .ed files to transform the Vagrantfile with a script to execute and the JRuby setup and verification steps.
00:11:13.680
This commit allows java.exe to pass through Windows Firewall. We also need to fix an issue where the app starts up only the second time and subsequent times, and never the first. Here is the script to allow Java through the firewall and the transform to specify it to run.
00:11:50.320
We're starting to develop a workflow: we'll use the Sikuli IDE to manage images, the VM and JRuby to run tests, and xFreeRDP to remote in and take screenshots of UI components. Sikuli wraps OpenCV for image matching and Tesseract for text recognition, and that stack requires the Visual C++ Redistributable, so we install that.
00:12:09.440
Here's the scheduled task to run on login, which can launch the app, run tests, or anything needed. Continuing from previous work, we copy to a common startup folder. Subsequent runs will trigger from a scheduled task. We use both methods to achieve the desired kiosk behavior, where the app always launches.
00:12:29.920
The Sikuli IDE will render screenshots as images, but they are stored as text. Ruby lets us use strings as expressions, allowing us to have a hierarchy where child UI elements are beneath a parent string. The indentation is just a visual convention.
00:12:56.360
As we are still in the proof of concept phase, much of the code is procedural. We have very minimal boilerplate: we are just importing Java, specifying a path to the jars for Sikuli's import statements, and setting verbose debugging. After that, we minimize the JRuby window and get to automation.
00:13:20.320
First, we tell Sikuli about our global app path and then we sleep. Now we can start clicking and typing things from this mess. We can break out the code into modules, classes, and concerns. From there, we can start to incorporate tests.
00:13:40.160
We start and stop screen recording as a test and then run config_pos.rb. This batch script will run on login. We begin to break out environment and config code into its own module. We want this repo to potentially host screenshots of different apps to automate, but they all live under a common bundle path.
00:14:10.480
The same applies here, but with two transforms: one to install the Visual C++ Redistributable and the other to schedule a startup task. We take a breather and try something different: screen recording.
00:14:40.800
Now that we have automation in place, we can see if a video can be recorded. First, we’ll install VLC.
00:15:10.560
Here is the invocation to start and stop screen recording. The latter is done with ncat. A lot of options are passed to VLC from the IDE file and the executable. It looks like we bundled VLC in the repo to save a download.
00:15:39.040
VLC turned out to be too laggy for headless recording, and XVFB renders Swing buttons in a strange way, so we need to look for alternatives. That's a future task.
00:16:00.880
In this commit, we fix the app not launching the first time. We reuse our existing script in the startup folder. The bitsadmin changes here supposedly make the JRuby download faster, but we replace it later with a much faster alternative.
00:16:46.120
In the screenshot below, we copy a script to the startup folder, but it takes forever for bitsadmin to get its magic going, so we fetch the file directly with an HTTP GET. Otherwise, downloading the Java and JRuby files can time out.
00:17:07.280
Here, we swap out Java 11 for Java 8 and also use PowerShell's Invoke-WebRequest to fetch Java faster. The Java downgrade is due to the Ruby Maven gem not yet supporting the newer version. Sikuli still works, so it's no big deal.
00:17:29.920
We also simplify the process from polling bitsadmin to just fetching JRuby directly over HTTP. Evidently, the dash version was updated to the more conformant dash flag syntax.
00:17:51.680
Okay, now we can address gems, which means Bundler as well as our internal config tool and linter. Once this project graduates to having additional contributors, we'll need a consistent coding style and to follow existing conventions.
00:18:13.920
Here's what the Gemfile looks like. Of course, there is a script driven by an ed transform to get the gems installed, along with the transform itself.
00:18:29.520
By now, it should be very familiar. Now that we have a Windows VM and a running app that we can interact with using JRuby and Sikuli, we can start writing tests. Minitest comes with Ruby, so we'll use that.
00:18:51.920
We install diffutils, although for code diffs I used Visual Studio Code from the local machine before making commits. We add the Minitest gem to the Gemfile, make a minor nitpick to use a no-spaces install directory for VLC, and do a bit of research to tell VLC to use the new install directory.
00:19:12.560
And here’s the edit script to set up the install. Next, we will refactor into modules and tests.
00:19:42.560
With the proof of concept working, we start modularizing into components. The tests are UI-centered, much like test automation in the web world with Selenium. We will abstract interactions with Swing from user intent using the page object pattern.
00:19:57.920
Here I'm just updating the docs for dev startup and setting a RUBYOPT environment variable for Minitest. There's also a minor refactoring here to more reliably select defaults in the point of sale launcher dialog.
00:20:25.120
We split the Sikuli API setup and a helper module. The boilerplate from the proof of concept to initialize Sikuli is moved into a sikuli_env.rb file, and the helper responds to the minimize window message.
00:20:46.640
In sikuli_test_environment.rb, we import useful Sikuli libraries like debug, set the debug level, and introduce classes and methods to help with initializing the screenshot images path. This is where we store pictures of buttons, similar to how Selenium locates DOM elements through XPath.
00:21:06.000
Here we specify that the images representing point of sale UI elements, like buttons, dialogs, etc., belong in a path off of the global image root path. The *.sikuli folder becomes a subtype, a specialization of a common parent class that represents the parent folder.
00:21:40.000
We also see the components class, which represents the generic parent of UI components. Some private methods here help with the methods in the previous slide, and we see the POS component subclass near the bottom, which is a specialization of components.
00:22:07.680
In Ruby, the ease of expressing object hierarchies with the less-than operator (<) helps realize the idea of object specialization to provide specific behavior. More of the POS components class is in the next slide.
00:22:29.440
With all the setup code from before, POS components just inherits from components, and its constructor specifies the POS app. All the child needs to do is delegate to the parent.
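A small sketch of that parent and child relationship, with assumed names throughout: `IMAGE_ROOT`, the `pos.sikuli` bundle name, and the `image` lookup method are illustrative and are not taken from the slides.

```ruby
# Assumed location of all screenshot bundles.
IMAGE_ROOT = "images"

# Generic parent: knows how to turn an element name into an image path
# inside one app's bundle folder.
class Components
  attr_reader :bundle_path

  def initialize(app_bundle, root: IMAGE_ROOT)
    @bundle_path = File.join(root, app_bundle)
  end

  def image(name)
    File.join(bundle_path, "#{name}.png")
  end
end

# Specialization: the child only names its bundle and delegates to the parent.
class PosComponents < Components
  def initialize(root: IMAGE_ROOT)
    super("pos.sikuli", root: root)
  end
end
```

With this shape, `PosComponents.new.image(:login_button)` resolves to `images/pos.sikuli/login_button.png`, and supporting another app means adding one more small subclass.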
00:22:52.800
The type of POS app is known up front, so this is kind of hard-coding, but it seems okay since we're setting up a known environment. Introducing Minitest, we want to test our setup code and learn Minitest. Despite these tests, a bug slips through that gets fixed later.
00:23:08.720
But hey, we're in a Windows VM from macOS running JRuby and Sikuli! The next bit is to connect that with testing the actual application. We introduce a spec_helper.rb as well, which imports the needed libraries. This assumes JRuby is invoked with a library path.
00:23:29.760
Here's a checkpoint commit. These are fun when you're grasping for the next set of features and end up roping in a bunch of changes that you later need to summarize. It's nebulous work like this, and I'm happy that it has turned out to just be a linear sequence of added lines.
00:23:59.280
GitHub shows just addition after addition rather than switching between changes.
00:24:20.800
We’re adding to our image repository now; that is our UI elements repository with more screenshots. We're getting closer to login and the data setup needed.
00:24:45.840
Here we have configuration starting to be stored in a YAML file, with derived values determined at the top, like using the specified default if not specified from the environment.
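One way to express "use the default unless the environment says otherwise" is sketched below. The `defaults` key, the `POS_` prefix, and the `load_config` helper are assumptions for illustration and do not reflect the in-house config tool mentioned in the talk.

```ruby
require "yaml"

# Read defaults from YAML, letting an environment variable named
# <prefix><KEY> override each one. Values from the environment are strings.
def load_config(yaml_text, env: ENV, prefix: "POS_")
  defaults = (YAML.safe_load(yaml_text) || {}).fetch("defaults", {})
  defaults.to_h do |key, value|
    [key, env.fetch("#{prefix}#{key.upcase}", value)]
  end
end
```

Passing `env:` explicitly keeps the function easy to test, since a plain Hash responds to `fetch` the same way `ENV` does.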
00:25:09.680
We also see a new Sikuli component module, which itself has a component class and accepts the hash from Sikuli test environment. We are now logging in.
00:25:37.760
Since we automated the mapping between image name and image path, the key is now the same as an xpath in Selenium, and we’re navigating a ‘quote-unquote’ DOM tree. Except instead of a web page, it’s the desktop, and we assert that the image of the UI component is present somewhere on it.
00:26:06.000
Here's the bug fix: there was an off-by-one directory level. It wasn't to 'slash app' but the base bundle path itself, which seemed to create a redundancy in classes, but I might have let it be.
00:26:39.920
Now we're starting on the road to automating the app. The setup we saw in Minitest is realized in the 'before do' block. This turns out to be costly, starting up the app for each single test, but it's progress.
00:27:01.840
Here we see the change to use the app after all. Maybe the bug fix previously was to generalize the library as we are closer to the runtime state, and the test would know more about initialization than the library.
00:27:18.200
We bulk up spec_helper.rb with our in-house config YAML parser, too. We had a requirement early on not to use Cucumber, but the design of the classes mirrors an organization toward natural language.
00:27:40.080
We do not end up needing this Sikuli feature module, as feature classes can be inherited from Minitest. A minor refactoring here ensures that named components do not have to initialize themselves but defer to the parent.
00:28:03.000
Now we're just left with simply named intent calls and possibly various private helper methods. Now that we can log in, we can start automation on a basic workflow in point of sale.
00:28:28.720
Here’s the Sikuli feature module that we use with Minitest’s feature inheritance instead.
00:28:54.560
Here we are just starting out on the boilerplate for the initial test. Here is a test where we want to try clicking all the buttons and ensure they act as expected. The difference here is to log out from the main window instead of from a dialog.
00:29:16.320
Updating our spec_helper.rb to include the unused feature module.
00:29:39.000
Second interlude: there are some pull requests that never got merged. Those are from trying out the state machine model sample pattern, but I kept getting stack overflow in my state machines.
00:29:59.280
Getting a Ruby version of his rocket launch example is a definite to-do for me, but here we just start adding tests. The framework is set up now for us to add a lot of tests.
00:30:24.800
Some notes to self in the form of a README—how to exit from java.exe. We’re removing the screenshot function, as by this point the app launches reliably enough to debug other things.
00:30:45.920
More work with VLC, but there are some quirks. The XVFB shows the UI with exaggerated Swing buttons so the screen caps we took of the components are not recognized. We end up having to hack around that and headless is tabled for the future.
00:31:16.400
If anything, at this point, we're pretty much writing page objects like in Selenium. The reused constants that would be XPaths are image names here. In this example, the 'checkbox no devices' constant is used more than once, so we DRY it up and define it up top.
00:31:41.520
Here's the proof of concept code: the messy stuff we had before has been moved into components and methods. Here we use runtime exceptions as pre and post conditions, having methods fail fast. This has been something I brought into my Selenium web testing as those assertions live in the methods as contracts.
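The fail-fast contract idea can be sketched without Sikuli by passing in any object that answers `exists?` and `click`. Everything here is illustrative: `ContractError`, `click_through`, and `FakeScreen` are hypothetical names, and the fake stands in for a real screen so the example runs anywhere.

```ruby
class ContractError < RuntimeError; end

# Click `button`, failing fast if it is not visible beforehand (precondition)
# or if the `expect` image is not visible afterwards (postcondition).
def click_through(screen, button:, expect:)
  raise ContractError, "precondition failed: #{button} is not visible" unless screen.exists?(button)

  screen.click(button)
  raise ContractError, "postcondition failed: #{expect} did not appear" unless screen.exists?(expect)

  expect
end

# Stand-in screen: clicking an image may reveal another one.
class FakeScreen
  def initialize(visible:, reveals: {})
    @visible = visible.dup
    @reveals = reveals
  end

  def exists?(image)
    @visible.include?(image)
  end

  def click(image)
    revealed = @reveals[image]
    @visible << revealed if revealed
  end
end
```

Because the checks live inside the method, a spec that calls `click_through` gets a descriptive failure at the exact step that broke, without repeating the same assertions in every test.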
00:32:00.720
We find that the specs don't always have to check for them. In Cucumber, the Then steps can be shorter and reflect user capability instead of specific UI elements. Borrowing from Node.js, we use an index.rb file to define the UI components, which are themselves separate classes.
00:32:22.440
Did you all know a separate file that specifies an existing module automatically gets included for namespace purposes? That’s nifty and keeps each file smaller.
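This works because Ruby modules are open: a second `module` statement with the same name adds to the existing module instead of replacing it. The sketch below simulates two files in one script; the `PosUi` module and its class names are made up for illustration.

```ruby
# --- would live in pos_ui/login.rb ---
module PosUi
  class Login
    def name
      "login"
    end
  end
end

# --- would live in pos_ui/top_bar.rb ---
# Reopening PosUi here extends the same namespace defined above.
module PosUi
  class TopBar
    def name
      "top bar"
    end
  end
end
```

After both definitions are loaded, `PosUi::Login` and `PosUi::TopBar` sit side by side in one namespace, so an index file only needs to require each component file.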
00:32:42.800
We have a process in point of sale called 'take', and the UI elements we used are defined as constants. Each public method uses the primitives we have built up or borrowed from Sikuli, including click, type, wait for, and click find.
00:33:05.120
Private methods for tracking state, like line item count, are managed through messages like increment line item count. The page object is 'take order'. The interface reflects intent, like adding and accepting, and the consumer never directly clicks or types anything.
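A stripped-down page object in that style might look like this. It is a sketch with assumed names: the image constants, the `add_item` and `accept` methods, and the `RecordingDriver` test double are hypothetical, and a real driver would wrap Sikuli's click and type calls.

```ruby
# Page object: public methods express intent, private methods track state.
class TakeOrder
  ADD_ITEM_BUTTON = "add_item.png"
  ACCEPT_BUTTON = "accept.png"

  attr_reader :line_item_count

  def initialize(driver)
    @driver = driver
    @line_item_count = 0
  end

  def add_item(sku)
    @driver.click(ADD_ITEM_BUTTON)
    @driver.type(sku)
    increment_line_item_count
    self
  end

  def accept
    raise "cannot accept an empty order" if @line_item_count.zero?

    @driver.click(ACCEPT_BUTTON)
    self
  end

  private

  def increment_line_item_count
    @line_item_count += 1
  end
end

# Test double that records what the page object asked for.
class RecordingDriver
  attr_reader :events

  def initialize
    @events = []
  end

  def click(image)
    @events << [:click, image]
  end

  def type(text)
    @events << [:type, text]
  end
end
```

A test then reads as `TakeOrder.new(driver).add_item("12345").accept`, with no clicking or typing visible at the call site.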
00:33:31.280
Here are some more public methods from 'take order' and service to tests and expectations.
00:34:03.560
Continuing with the methods to use in tests before we look at the private methods.
00:34:21.520
We have only a couple of private methods to track line item count state, and another component for the top bar UI as well, just for logging out. We just click the logout button, which is the image of the logout button. Once recognized, it sends a click.
00:34:41.760
Here are the helper methods used in components, defined in sikuli_component.rb.
00:35:06.560
Here is a simpler component, at least to flesh out one of the dialogs. I remember this dialog has several more buttons.
00:35:29.920
Oh, and the screenshot capability gets moved into a helper method. Thanks to Ruby's ability to execute shell commands, we can encapsulate screen recording controls into this helper.rb module.
00:35:55.920
In sikuli_test_environment.rb, we also add logging and some more bug fixes. We also see the uppercase -I lib flag, so our require statements don't all need to be relative.
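The `-I` flag works by putting a directory on Ruby's load path, which is what lets a bare `require "name"` find files there. The sketch below shows the equivalent effect from inside a script; `with_lib_on_load_path` is a hypothetical helper written for this illustration.

```ruby
# Temporarily put lib_dir at the front of the load path, as `ruby -I lib_dir`
# would do for the whole process, then restore the load path afterwards.
def with_lib_on_load_path(lib_dir)
  $LOAD_PATH.unshift(lib_dir)
  yield
ensure
  $LOAD_PATH.delete(lib_dir)
end
```

Inside the block, `require "spec_helper"` resolves against `lib_dir` first, so files no longer need `require_relative` paths that depend on where each file lives.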
00:36:30.440
In future work, it would be nice to provide a Rails dashboard that could adjust on login.batch to run one or more tests.
00:36:52.120
In Minitest, there is no "before all" concept, but the internet had a solution: just run the one-time setup ahead of the 'before do' block.
00:37:02.480
We see the initial steps to start up the app are pretty expensive, so we only want to log in once. Then, every test per scenario can be done in an authenticated context.
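One common way to get log-in-once behavior is memoization: the expensive step runs on first use and every later caller receives the same result. The sketch below uses plain Ruby so it runs without Minitest; `SharedLogin` and its fake session hash are hypothetical, and in a spec the `session` call would sit above or inside the 'before do' block.

```ruby
# Holds one shared, lazily created session for the whole test run.
module SharedLogin
  @logins = 0

  class << self
    attr_reader :logins

    # The block body runs only on the first call; later calls reuse @session.
    def session
      @session ||= begin
        @logins += 1 # stands in for launching the app and logging in
        { user: "cashier", logged_in: true }
      end
    end
  end
end
```

The trade-off is that tests now share state, so each scenario should leave the app in a condition the next one can start from.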
00:37:37.920
Here I'm refactoring placeholder code with actions. In addition, I’m adding more tests.
00:37:57.600
So now we’re starting to see the benefits of the interface. The tests nearly read like step definitions.
00:38:30.840
Finally, adjustments to the spec helper to include those new components.
00:38:54.560
With that, we have successfully shown a Windows VM running our Java Swing app, integrated JRuby with Sikuli, and used Minitest to verify behavior. Not a bad run with free and open source software.
00:39:34.720
Thank you!