The higher the quality, the less effective we are at testing

First published 07/12/2011

It's been interesting to me to watch, over the last 10 or maybe 15 years, the debate over whether exploratory or scripted testing is more effective. There's no doubt that one can explore more of a product in the time it takes for someone to follow a script. But then again – how much time do exploratory testers lose bumbling around lost, aimlessly going over the same ground many times, hitting dead ends (because they have little or no domain or product knowledge to start with)? Compare that with a tester who has lived with the product requirements as they have evolved over time. They may or may not be blinkered, but they are better informed – sort of.

I'm not going to decry the value of exploration or planned tests – both have great value. But I reckon people who think exploration is better than scripted testing under all circumstances have lost sight of a thing or two. And that phrase 'lost sight of a thing or two' is significant.

I'm reading Joseph T. Hallinan's book, “Why We Make Mistakes”. Very early on, in the first chapter no less, Hallinan suggests, “we're built to quit”. It makes sense. So we are.

When humans are looking for something – smuggled explosives, tumours in x-rays, bugs in software – they are adept at spotting what they look for if, and it's a big if, those things are common. In that case they are pretty effective, spotting what they look for most of the time.

But what if what they seek is relatively rare? Humans are predisposed to give up the search prematurely. It's evolution, stupid! Looking for, and not finding, food in one place just isn't sensible after a while. You need to move on.

Hallinan quotes (among others) the cases of people who look for PA-10 rifles in luggage at airports and tumours in x-rays. In these cases, people look for things that rarely exist. In the case of radiologists, mammograms reveal tumours only 0.3 percent of the time; 99.7 percent of the time the searcher will not find what they are looking for.

In the case of guns or explosives in luggage, the occurrence is rarer still. In 2004, according to thegunsource.com, 650 million passengers travelled by air in the US, but only 598 firearms were found – about one in a million occurrences.

Occupations that seek to find things that are rare have considerable error rates. The miss rate for radiologists looking for cancers is around 30%. In one study at the world-famous Mayo Clinic, 90% of the tumours missed by radiologists were visible in previous x-rays.
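Just to put those base rates side by side, here's a quick back-of-the-envelope calculation – a rough sketch in Python, using nothing but the figures quoted above:

```python
# Rough arithmetic on the base rates quoted above (Hallinan / thegunsource.com)

tumour_prevalence = 0.003            # mammograms reveal tumours ~0.3% of the time
print(f"Scans with nothing to find: {1 - tumour_prevalence:.1%}")       # ~99.7%

passengers = 650_000_000             # US air passengers, 2004
firearms_found = 598
rate = firearms_found / passengers
print(f"Firearms per passenger: ~1 in {round(1 / rate):,}")             # ~1 in a million

miss_rate = 0.30                     # radiologists' reported miss rate for cancers
print(f"Of 1,000 real tumours, roughly {int(1000 * miss_rate)} would be missed")
```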

In 2008, I travelled from the UK to the US, to Holland and to Ireland. On the third trip, returning from Ireland with the same rucksack on my back (my sixth flight), I was called to one side by a security officer at the security check at Dublin airport. A lock-knife with a 4.5 inch blade had been found in my rucksack. Horrified, when presented with the article, I asked that it please be disposed of! It was mine, but it was in the bag by mistake – and it had been there for six months, unnoticed by me and by five airport security scans. This was the sixth flight with the offending article in the bag; five previous scans at airport terminals had failed to detect a fairly heavy metal object – pointed, and a potentially dangerous weapon. How could that happen? Go figure.

Back to software. Anyone can find bugs in crappy software. It's like walking barefoot in a room full of loaded mousetraps. But if you are testing software of high quality, it's harder to find bugs. It may be that you give up before you have given yourself time to find the really (or not so) subtle ones.

Would a script help? I don't know. It might, because in principle you have to follow it. But it might make you even more bored. All testers get bored/hungry/lazy/tired and are more or less incompetent or uninformed – you might give up before you've given yourself time to find anything significant. Our methods, such as they are, don't help much with this problem. Exploratory testing can be just as draining/boring as scripted testing.

I want people to test well. It seems to me that the need to test well increases with the criticality and quality of software, and motivation to test aligns pretty closely with both. Is exploration or scripted testing more effective on very high quality software? I'm not sure we'll ever know until someone does a proper experiment (and I don't mean testing a 2,000-line toy program – I mean real software, a website or a nuclear missile).

I do know that if you are testing high quality code – and just before release it usually is of high quality – then you have to have your eyes open and your brain switched on. Both of 'em.

Tags: #exploratorytesting #error #scriptedtesting

Paul Gerrard