Crowd-sourced testing

Imagine 40 people from around the world suddenly scouring your site for bugs, and reporting them to a central system that you monitor, approving or rejecting each report. That was my experience recently with uTest, who provide the platform and the community for this crowd-sourced QA service. uTest claim it covers several types of testing, from functional to usability, but how does it compare to traditional usability testing and quality assurance (QA)? NB: You might consider me somewhat biased, as I come from a usability and accessibility testing background. However, this is an honest appraisal of where I see a crowd-sourced testing service fitting into website and product development, based on the principles rather than just this service in particular.

Zapper event.

Guest of Honour

I was lucky enough to be invited to a 'Zappers' event, where anyone interested can try out the system by reporting bugs on a particular site or application. The site in this case was my company's, which made me somewhere between guest of honour (free meal, having to give speeches and hand out prizes) and a sacrifice to the god of bugs. I was in the hot-seat, having to review the bugs as they came in and work out whether they were legitimate or not. For the purposes of the event, the people posting bugs were split into teams and scored one point per accepted bug. On a real project, 'zappers' would receive payment for each legitimate bug, with the amount varying by severity. The type of bugs that get reported varies a lot. The reporters are motivated by money: every accepted bug is rewarded, which encourages good reports. uTest have also created a reputation system so they can identify good testers over time. You can also set a budget for the number of accepted bugs, and people stop reporting when the limit is reached.

Zappers score board at the beginning, no scores yet.

At this Zappers event, we had reports such as:

  • The search gives a 500 error if you put several thousand characters in it.
  • The score-board for a flash game wasn’t working.
  • Broken links, broken image links.
  • Some 404 pages weren’t working.
  • There is a space before the full stop of a sentence.
  • PDF is not a proprietary format any more (an item in our glossary).


There are aspects of the output you can predict from the methodology. It is user-reported, so it will tell you what people see as errors. That isn't always the same as the source of an error, as the zapper might have missed something. In these cases the real problem is what they missed, not what they reported. For example, if there is an instruction at the beginning of a form-based process that they skip, it might be something later in the form that gets reported as a bug.

Combined with the fact that the testers are motivated financially, they are essentially professional testers, and not suitable for finding contextual usability issues. (See further reading on usability guidelines vs testing.) For example, no one raised a bug that they didn't understand one of the services we provide, which I would expect from real usability testing. Usability guidelines can be used by testers, though, and someone did submit a bug about the logo not linking to the homepage.

That being said, it is definitely better than people checking their own work. The testers are all unfamiliar with the website, and don't carry the burden of knowledge about why things work the way they do. I've found that people familiar with something don't explore in the same way as those new to it, so this is a real advantage.

Although you can direct people to certain areas, it did appear to be better for functionality testing than content testing. For a large information-based site, I'm not sure the coverage would be comparable to copy-editors working through a list of pages. There is also still a place for automated checks (e.g. broken links, and unit testing if applicable) before you go into QA.

Unless someone in the testing pool is an accessibility expert, you will get very skewed accessibility results. Without knowing the guidelines well, people tend to report a sub-set of accessibility issues, without knowing whether they impact other types of user. The type of issue I've heard non-experts raise includes "I know my dyslexic friend doesn't like black-on-white text, therefore the site should use less contrasting colour combinations". Unfortunately that clashes with people who have limited vision and want maximum contrast; you need to be well versed in the web accessibility guidelines to know all the ins and outs and provide actionable recommendations.
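An automated broken-link sweep of the kind mentioned above is cheap to run before human QA starts. As a rough illustration only, here is a minimal sketch using Python's standard library; the page content and URLs are placeholders, not the site from this article, and a real tool would also crawl pages and handle relative URLs properly:

```python
# Minimal broken-link sweep sketch (standard library only).
# Illustrative placeholder code, not a production crawler.
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects href values from anchor tags as the parser sees them."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html):
    """Return every anchor href found in an HTML fragment, in order."""
    parser = LinkCollector()
    parser.feed(html)
    return parser.links


# Checking each extracted link could then be done with urllib, e.g.:
#
# from urllib.request import urlopen
# from urllib.error import HTTPError, URLError
# for link in extract_links(page_html):
#     try:
#         urlopen(link, timeout=10)
#     except (HTTPError, URLError) as err:
#         print(f"Broken: {link} ({err})")
```

A report generated this way is exactly the sort of thing one tester at the event submitted: a flat list of pages with broken links, produced by a tool rather than by hand.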

My conclusions

  • It is a useful service for pre-launch QA. It is many pairs of eyes, motivated to find problems.
  • It wouldn’t necessarily replace a dedicated QA person or automated tests, which could provide better coverage of particular aspects of a system. However, there is nothing stopping testers using automated tools; one person did, and submitted a list of pages with broken links.
  • The quality will depend on the testers. However, even for one organisation, it should improve over time as the community of testers is trained by the bug-rejection notes.
  • It does not cover usability / accessibility testing to the level our clients would expect. Usability (or rather, user-centred design) should be part of the initial design process; if you are conducting usability testing at launch, it's too late. Accessibility should also (mostly) be taken care of at the earlier stages of a project, between the requirements and template stages.

Whether we use the service will depend on whether our clients would like it (and are willing to pay for it); however, we will be recommending it as an excellent QA method.