No Traffic, No Users, No Problem! - Usability Testing for New Apps

by Jim Jones

In the video titled 'No Traffic, No Users, No Problem! - Usability Testing for New Apps,' Jim Jones discusses effective methods for conducting usability testing on web apps before they are launched. The video emphasizes the stress involved in making design decisions for web applications and presents a practical solution using Amazon's Mechanical Turk service to gather user feedback. The following key points are covered in detail:

  • Usability Testing Needs: Jones highlights the difficulties developers face when trying to get feedback on their web app designs without formal usability testing services.
  • Mechanical Turk Introduction: An overview of Mechanical Turk, a crowdsourcing marketplace where small tasks can be assigned to a large pool of online workers, is provided. Jones explains the structure of a Human Intelligence Task (HIT) and how responses from workers (termed 'turkers') can be utilized for meaningful insights.
  • Integration with Rails: The presentation includes sample code for integrating Mechanical Turk with a Rails application. Jones introduces tools like the Turbo gem and the TURK gem, which facilitate creating HITs and collecting feedback efficiently.
  • Fast Feedback Mechanisms: The Forward gem is suggested for exposing local development environments, allowing quick feedback collection without deployment.
  • Soliciting Meaningful Feedback: Tips are shared on how to ask for valuable feedback, including making personal appeals and promoting an environment where constructive criticism is welcomed.
  • A/B Testing: Jones discusses the Vanity gem for conducting A/B testing within Rails applications, sharing personal examples of tests done on homepage images and their impact on user engagement.
  • Handling Spam Data: Strategies for mitigating spam submissions on Mechanical Turk are shared, emphasizing the importance of ensuring the quality of feedback received.
  • Creative Uses of Mechanical Turk: The video concludes by discussing innovative applications of Mechanical Turk, such as content creation and problem-solving through collaborative efforts. The presentation also highlights research findings regarding the effectiveness and timing of task submissions on Mechanical Turk, illustrating the platform's potential beyond conventional uses.

In conclusion, Jones encourages developers to leverage usability testing and feedback mechanisms to improve their web applications prior to launch, promoting a culture of experimentation and iteration in development practices. Overall, the video serves as a valuable guide for developers looking to enhance their app's usability through crowd-sourced feedback before going live.

00:00:16.400 Thanks everyone for coming out. I really appreciate it; you guys are looking great.
00:00:22.640 My name is Jim Jones. I am a Ruby on Rails engineer, and I work as a consultant in the San Francisco area.
00:00:28.880 I have 14 years of experience, and I have been working with Rails on and off for the past six years.
00:00:34.719 Like many of you, I have a lot of side projects, to the point where most of them aren't completed. I haven't forgotten about some of them.
00:00:41.040 There are times when I'm experimenting with certain images or trying to write some copy text, and I feel stuck.
00:00:47.360 I get stuck to the point where colors start looking awkward, my writing sounds strange, or it's 2 AM, and I'm trying to be funnier than I probably am. At that moment, I can't really tell what the issue is.
00:00:59.520 So, I sought out usability solutions. While there are many usability services available, there aren't very many that allow me to evaluate certain bits of the site.
00:01:04.879 Most of these services are very formalized, while I wanted a quick point of feedback without needing to formalize my requirements.
00:01:10.720 I had some prior experience with Mechanical Turk for data gathering, so I started to explore this path.
00:01:16.400 For those who don't have a background in this, Mechanical Turk is a crowdsourcing marketplace. It allows individuals to coordinate small tasks to be performed by people.
00:01:57.280 Imagine you could pose a problem to an army of 1 million people, each of whom can perform a 10- or 20-second task for you. What would you use that for? These are the micro tasks that Mechanical Turk excels at.
00:02:15.440 These tasks are generally those that computers cannot do well. Most people's experiences with Mechanical Turk involve straightforward tasks, such as tagging images and scientific surveys, usually limited to demographic data.
00:02:34.160 I want to expand that perception and get you thinking differently towards the end of this talk. Just to clarify some terminology, when I refer to a HIT, that's a Human Intelligence Task.
00:03:20.239 This is the task you post on Mechanical Turk, instructing the worker to do something on your behalf. The workers performing these tasks are called 'turkers,' and an assignment is a single worker's response to that specific HIT.
00:03:38.480 For example, if you want to classify an image and request 100 responses, you would have one HIT with 100 assignments. Through this process, you can take the most common classification for that image.
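The "most common classification" step can be sketched in a few lines of Ruby. This is only an illustration, not code from the talk: `majority_label` is a hypothetical helper, and the answers are assumed to arrive as plain strings, one per assignment.

```ruby
# Hypothetical helper: pick the most common label among the answers
# submitted for one HIT (one answer per assignment).
def majority_label(answers)
  return nil if answers.empty?

  # Normalize case and whitespace so "Cat" and "cat " count as the same label.
  counts = answers.map { |a| a.strip.downcase }.tally
  label, votes = counts.max_by { |_label, n| n }
  { label: label, votes: votes, share: votes.to_f / answers.size }
end

# Made-up answers for illustration.
answers = ["cat", "Cat", "dog", "cat ", "bird"]
result = majority_label(answers)
puts "#{result[:label]} (#{result[:votes]} of #{answers.size} votes)"
```

The `share` value gives a rough sense of how strongly the workers agreed; a low share may be a reason to request more assignments for that image.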
00:04:06.400 The estimates for the Mechanical Turk service indicate there are millions of workers with various motivations. The most common motivation is monetary compensation, as many turkers try to supplement their income.
00:04:22.880 However, some people have other motivations, such as passing the time or battling insomnia. Next time you find yourself bored, explore the service; there are many engaging tasks available.
00:04:52.880 The demographic breakdown shows that the United States dominates the service, with India a close second. Workers come from many other countries, but these two account for the majority.
00:05:29.759 In the U.S., the gender ratio is primarily female, while in India it is mostly male. The predominant age range for workers in both countries is 24 to 33 years. Surprisingly, most users have at least a bachelor's degree, indicating a high level of education among turkers.
00:05:57.919 It's essential to understand this aspect because the quality of feedback received from educated users can be quite impressive. U.S. workers on the platform tend to have middle-class incomes and are looking to supplement their earnings.
00:06:17.440 When working with Mechanical Turk, there are three ways to interface with the service: a basic web interface, an API for Ruby, and a command-line tool. Most people’s experience is likely limited to the basic web interface, which is a simple form builder with linear data collection.
00:06:43.680 The simple web interface generates HTML pages served from Amazon S3. However, a powerful method is using external HITs, which allow the turkers to interact with your website directly. Your entire form can be displayed within an iframe, giving you full control over how the form is presented.
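As a rough sketch of what an external HIT looks like at the API level: the question is a small XML document that points Mechanical Turk at a page on your own site, which is then shown to the worker in an iframe. The element names and schema URL below follow Mechanical Turk's ExternalQuestion format as far as I know it, so check the current API reference before relying on them. The helper name, example URL, and default frame height are assumptions made for illustration.

```ruby
# Schema namespace used for ExternalQuestion documents (verify against the
# current Mechanical Turk API reference).
EXTERNAL_QUESTION_XMLNS =
  "http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2006-07-14/ExternalQuestion.xsd"

# Hypothetical helper: wrap a page on your own site in the XML that asks
# Mechanical Turk to display it to the worker inside an iframe.
def external_question_xml(url, frame_height: 800)
  <<~XML
    <ExternalQuestion xmlns="#{EXTERNAL_QUESTION_XMLNS}">
      <ExternalURL>#{url.encode(xml: :text)}</ExternalURL>
      <FrameHeight>#{Integer(frame_height)}</FrameHeight>
    </ExternalQuestion>
  XML
end

# Placeholder URL; the "&" is escaped so the XML stays well-formed.
xml = external_question_xml("https://example.test/feedback?study=1&variant=b")
puts xml
```

Because the page is served by your own application, the form's layout and any client-side behavior stay entirely under your control.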
00:07:18.720 With the Turbo gem, integration with Mechanical Turk and Rails becomes seamless. The TURK gem provides various methods for easier integration. Instead of posting data back to your server, the data gets sent straight to Mechanical Turk, which helps streamline feedback collection.
00:07:56.720 The workflow generally involves a turker completing your assignment, with data posted to Mechanical Turk. You can retrieve this data through a rake task, import it into your models, and programmatically approve or reject it.
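The retrieve, import, and approve-or-reject loop can be pictured with a small stand-alone sketch. It does not call Mechanical Turk or use the gem's API. `Submission`, `Feedback`, and `process_submissions` are hypothetical names, and a plain struct with a `valid?` method stands in for a Rails model and its validations.

```ruby
# Hypothetical stand-ins: a submitted assignment as it might come back from
# Mechanical Turk, and a plain struct playing the role of a Rails model.
Submission = Struct.new(:assignment_id, :worker_id, :answers, keyword_init: true)
Feedback = Struct.new(:worker_id, :comment, keyword_init: true) do
  # Stand-in for a model validation: the comment must not be blank.
  def valid?
    !comment.to_s.strip.empty?
  end
end

# Import each submission into the model and decide whether to approve it.
# Returns the imported records plus the assignment IDs to approve and reject.
def process_submissions(submissions)
  result = { imported: [], approve: [], reject: [] }
  submissions.each do |s|
    record = Feedback.new(worker_id: s.worker_id, comment: s.answers["comment"])
    if record.valid?
      result[:imported] << record
      result[:approve] << s.assignment_id
    else
      result[:reject] << s.assignment_id
    end
  end
  result
end

# Made-up submissions for illustration.
submissions = [
  Submission.new(assignment_id: "A1", worker_id: "W1",
                 answers: { "comment" => "Make the logo bolder." }),
  Submission.new(assignment_id: "A2", worker_id: "W2",
                 answers: { "comment" => "   " })
]
outcome = process_submissions(submissions)
puts "approve: #{outcome[:approve].inspect}, reject: #{outcome[:reject].inspect}"
```

Tying approval to the model's own validations keeps the acceptance rule in one place: whatever the application would refuse to save is also what gets rejected.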
00:08:23.360 Now that we've set up the basics, let's delve into more advanced cases, focusing on soliciting feedback.
00:08:31.360 My objective is to gather meaningful feedback quickly for better development decisions. I want to skip the deployment phase entirely and get feedback as soon as possible, especially when I'm the only person working on a project.
00:09:21.839 Sometimes, I may not want to merge changes into the master branch or deploy to a staging server, so I just want to experiment on my local development environment.
00:10:01.680 The first tool I recommend is the Forward gem. This gem makes it easy to expose your local development instance to the outside world, providing a convenient domain for others to access.
00:10:46.880 There’s another option called Local Tunnel, which is free, but I prefer Forward because it provides an HTTPS interface and a static domain.
00:11:10.960 Next, we will use the TURK gem for feedback collection. After inserting the method into your application template, you can run a rake task to create a feedback study, passing in the URL provided by Forward.
00:11:56.480 Imagine you're working on an experimental branch and want feedback. You can post your HIT to gather responses without deploying. After gathering feedback, you define a workflow to retrieve responses and analyze the results.
00:12:27.200 I want to share feedback I've received from turkers on a project I’m working on called 5s5, a website allowing fans to bid for video chats with celebrities. The study form helper creates an overlay for collecting feedback from users.
00:13:21.920 Some feedback from turkers has been very insightful, such as suggesting making the logo bolder or providing more information in the overlay. This feedback is valuable as it helps identify areas for improvement.
00:14:04.640 With feedback, it's essential to focus on common threads. A few pro tips when soliciting feedback include making personal appeals rather than corporate ones, ensuring you invite negative feedback, and explaining the purpose of your project.
00:14:44.080 Asking workers to provide detailed feedback will lead to more helpful responses. By inviting them to give their opinions, you can foster a safe environment for sharing constructive criticism.
00:15:39.440 Now we will discuss A/B testing using the Vanity gem. This framework integrates well with Rails and helps test various aspects of your application. For instance, I set up an A/B test to evaluate different homepage images and measure user interactions.
00:16:32.880 In this example, we tested how different images led turkers to interact with an event page. The instructions provided were open-ended, giving participants the freedom to navigate the site however they chose.
00:17:44.800 The results were inconclusive; while one image performed better than the other, the difference wasn't statistically significant. This shows the importance of continued testing and experimentation in development.
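To make "not statistically significant" concrete, here is a minimal sketch of a two-sided two-proportion z-test, one common way to compare two conversion rates. It is not necessarily the calculation Vanity performs, and the counts are made up for illustration rather than taken from the talk.

```ruby
# Two-sided two-proportion z-test: is the difference between two
# conversion rates larger than chance alone would plausibly explain?
# Assumes both groups are non-empty and the pooled rate is not 0 or 1.
def two_proportion_z_test(conv_a, total_a, conv_b, total_b)
  p_a = conv_a.to_f / total_a
  p_b = conv_b.to_f / total_b
  pooled = (conv_a + conv_b).to_f / (total_a + total_b)
  std_err = Math.sqrt(pooled * (1 - pooled) * (1.0 / total_a + 1.0 / total_b))
  z = (p_a - p_b) / std_err
  # Two-sided p-value under the standard normal distribution.
  p_value = Math.erfc(z.abs / Math.sqrt(2))
  { z: z, p_value: p_value }
end

# Made-up counts: 18 of 100 visitors converted with image A, 12 of 100 with B.
result = two_proportion_z_test(18, 100, 12, 100)
puts format("z = %.2f, p = %.3f", result[:z], result[:p_value])
puts result[:p_value] < 0.05 ? "significant at the 5% level" : "not significant at the 5% level"
```

With samples this small, even a six-point gap between the two images does not clear the usual 5% threshold, which is why an apparent winner can still be an inconclusive result.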
00:18:24.920 Now, let's address the issue of spam data. Mechanical Turk has struggled with spam submissions, with estimates suggesting as much as 40% of all submissions may be garbage. However, improvements have been made to combat this problem.
00:18:56.800 There are a few strategies to mitigate spam, including using standard questions to validate submissions and incorporating feedback validation into your forms.
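Both ideas can be sketched as a simple check run on each submission: a control question with a known answer, plus basic validation of the free-text feedback. The field names, the control answer, and the thresholds below are all assumptions chosen for illustration, not values from the talk.

```ruby
# Hypothetical control question with a known answer, mixed in with the
# real questions so that random or careless submissions stand out.
CONTROL_ANSWER = "blue"
MIN_WORDS = 5        # assumed threshold for "detailed enough" feedback
MIN_UNIQUE_WORDS = 3 # assumed threshold to catch "good good good good good"

# Returns a list of problems; an empty list means the submission passes.
def submission_problems(answers)
  problems = []
  control = answers["control"].to_s.strip.downcase
  problems << "control question failed" unless control == CONTROL_ANSWER

  words = answers["feedback"].to_s.split.map(&:downcase)
  if words.size < MIN_WORDS
    problems << "feedback too short"
  elsif words.uniq.size < MIN_UNIQUE_WORDS
    problems << "feedback is repetitive"
  end
  problems
end

# Made-up submissions for illustration.
good = { "control" => " Blue ", "feedback" => "The logo is hard to read on mobile" }
spam = { "control" => "asdf", "feedback" => "good good good good good" }
puts submission_problems(good).inspect
puts submission_problems(spam).inspect
```

The same checks could run in the form itself, so a worker sees the problem and can revise before submitting instead of being rejected afterwards.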
00:19:18.320 By validating the feedback provided by turkers, you can enhance the quality of responses. Research suggests that this technique often leads to turkers re-evaluating their submissions.
00:19:55.040 Now, let’s explore some of the surprising research findings related to Mechanical Turk that might leave you intrigued.
00:20:30.480 For instance, increasing compensation doesn’t always lead to more accurate results. To achieve better outcomes, build redundancy into your tasks: having multiple checks on each submission helps improve accuracy.
00:21:15.560 Another finding was that many tasks are completed during specific peak times, notably between 2 AM and 11 AM in the U.S. and 11:30 AM to 8:30 PM in India.
00:22:02.560 Additionally, it’s important to note that turkers tend to pick up tasks based on their visibility in the queue. By posting a larger number of HITs, you increase the chances of getting your tasks completed.
00:22:47.440 Lastly, I want to highlight creative uses for Mechanical Turk, showcasing how it can be utilized beyond data collection. Many individuals are beginning to explore its potential in content creation and iterative processes.
00:23:15.280 For example, you could create collaborative content by having turkers build off each other’s submissions for writing projects.
00:23:55.840 I encourage you to check out my GitHub project named 'ad vote,' which simulates the Google AdWords experience without the cost. It allows you to test ad variations effectively.
00:24:45.920 A unique project I found interesting involved having turkers draw pictures of sheep, which showcased their artistic skills while capturing their creativity.
00:25:10.240 Additionally, there was an experiment to see how creatively turkers could rewrite news headlines. The results were often humorous and provided intriguing perspectives.
00:25:56.360 I will now open the floor for questions. Thank you all for your attention!