Paul Gerrard

My experiences and opinions in the Test Engineering business. I am republishing/rewriting old blogs from time to time.

First published 25/11/2011

Some time ago, Tim Cuthbertson blogged “how I Replaced Cucumber With 65 Lines of Python.” I recently commented on the post and I've expanded on those comments a little here.

I share Tim's frustrations with Cucumber. I think the reuse aspect of the step definitions has some value, but that value is limited. I've heard of several sites having literally thousands of feature files and step definitions and no way to manage them systematically. A bit of a nightmare perhaps.

To address the 'specification by example isn't enough' challenge: SBE isn't enough, and I discuss why here. Although some trivial requirements can be specified purely by example, most can't, so you need a separate requirement statement to supplement the scenarios/examples and fully describe the requirement.

This doesn't sound very Agile, but I'm not necessarily talking about Agile here. I understand that some teams can live with minimalist stories where the spec is the code. I'm talking about teams that require an accurate definition of the requirement and want to drive the creation of tests from stories and scenarios. This need could apply to all project styles, not just Agile.

Gojko Adzic talks about the need for 'Key Examples' in his Specification by Example book. When I spoke to Gojko not long ago and suggested that more specification content beyond examples is usually required, he agreed. If this is true, it doesn't mean that we need bloated requirements documents. The level of detail in a requirement (as captured by a BA) can be quite compact, because the precision of a business rule doesn't need heavy explanation – the scenarios and tabulated examples (if needed) do that for us.

Successful execution of 'key examples' is a necessary but not usually sufficient acceptance criterion. Developers definitely need more tests to cover edge cases, for example. (User) acceptance requires end-to-end tests and probably combinations of examples in sequence to fully satisfy the business users (although these types of tests are likely to be manual rather than automated).

Some time ago, we wrote (a small amount of) code to generate Python unittest code directly from stories, scenarios and example tables, and it works fine. (All we need are different language templates to generate xUnit code in other programming languages.) The test code may be xUnit format – but the story/scenarios define the content of the tests. xUnit code could drive any style of test in theory. We're also experimenting with generating Robot Framework code and HTML Fitnesse tables directly. All seems feasible to me and, in principle, all that's required is a template to generate the correctly formatted output. Additional setup/teardown code and fixtures are held in pre-existing code.
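To give a flavour of how little is needed, here is a minimal sketch of template-driven generation of unittest code in Python. It is illustrative only: the story/scenario data shapes and the generate_unittest() name are assumptions made for this example, not the actual generator, and in the real thing the Given/When/Then steps would map onto pre-existing fixture code.

```python
# Minimal sketch: render a Python unittest module from a story and its scenarios.
# Data shapes and names here are illustrative assumptions, not the real generator.
from string import Template

TEST_TEMPLATE = Template('''\
import unittest


class Test${story_class}(unittest.TestCase):
$methods

if __name__ == "__main__":
    unittest.main()
''')

METHOD_TEMPLATE = Template('''\
    def test_${name}(self):
        """${title}"""
        # Each step becomes a call into pre-existing fixture/setup code.
${steps}
''')


def generate_unittest(story_name, scenarios):
    """Render one test module per story; each scenario becomes a test method."""
    methods = []
    for scenario in scenarios:
        steps = "\n".join(
            "        self.step_%s()" % step.lower().replace(" ", "_")
            for step in scenario["steps"]
        )
        methods.append(METHOD_TEMPLATE.substitute(
            name=scenario["title"].lower().replace(" ", "_"),
            title=scenario["title"],
            steps=steps,
        ))
    return TEST_TEMPLATE.substitute(
        story_class=story_name.title().replace(" ", ""),
        methods="\n".join(methods),
    )


if __name__ == "__main__":
    print(generate_unittest("withdraw cash", [
        {"title": "Sufficient funds", "steps": [
            "account in credit", "request cash", "cash dispensed"]},
    ]))
```

A different template (Robot Framework keywords, Fitnesse HTML tables, or another xUnit dialect) would give a different output format from the same story data, which is the point of the approach.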

Our SaaS product SP.QA can be used to capture requirements and the range of stories/scenarios that example them. Since the requirements and stories are managed in a database and the test code is generated, the developers (or testers) only need to manage a single class or function for each story to implement the tests.

This has the distinct advantage that BAs can treat epic stories/sagas as requirements and build their hierarchical view. Stories that identify features and scenarios that example them can be refined over time. When they are 'trusted' the test code can be generated.

We're offering a commercial product, but I think even in the open source domain, the days of plain text story writing are numbered. We think all requirements tools will move in this direction. In the future, requirements tools will capture requirements and stories/scenarios that example those requirements so that they can be used to drive requirements reviews, to be a starting point for acceptance tests and be used to generate test code for developers and system testers.

When business analysts and users get their hands on these tools, then BDD will really take off.

Tags: #BDD #Python #Cucumber #xUnit #Fitnesse #RobotTestFramework


First published 05/11/2009

We're stood in a boat, ankle deep in water. The cannibals are coming to kill and eat us. The testers are looking for the holes in the boat, saying – we can't push off yet, the river is full of hungry crocodiles.

The testers are saying – if we push off now, we'll all die.

The skipper is saying – if we don't push off soon, we'll all die.

It's the same with software.

Tags: #ALF


First published 18/09/2010

This is the first in a series of short essays in which I will set out an approach to test design, preparation and execution that involves testers earlier, increases their influence in projects, improves baseline documents and stability, reduces rework and increases the quality of system and acceptance testing. The approach needs automated support and the architecture for the next generation of test management tools will be proposed. I hope that doesn’t sound too good to be true and that you’ll bear with me.

Some scene-setting needs to be done...

In this series, I’m focusing on contexts (in system or acceptance testing) where scripted tests are a required deliverable and will provide the instructions in the form of scripts, procedures (or program code) to execute tests. In this opening essay, I’d like to explore why the usual approach to building test scripts (promoted in most textbooks and certification schemes) wastes time, undermines their effectiveness and limits the influence of testers in projects. These problems are well-known.

There are two common approaches to building scripted tests:

  1. Create (manual or automated) test scripts directly from a baseline (requirement or other specification documents). The scripts provide all the information required to execute a test in isolation.
  2. Create tabulated test cases (combinations of preconditions, data inputs, outputs, expected results) from the baseline and an associated procedure to be used to execute each test case in turn.
By and large, the first approach is very wasteful and inflexible, and the tests themselves might not be viable anyway. The second approach is much better and is used to create so-called 'data-driven' manual (and automated) test regimes – see the sketch after the two assumptions below. (Separating procedure from data in software and in tests is generally a good thing!) But both of these approaches make two critical assumptions:
  • The baseline document(s) provide all the information required to extract a set of executable instructions for the conduct of a test.
  • The baseline is stable: changing requirements and designs make for a very painful test development and maintenance experience; most test script development takes place late in the development cycle.
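To make the second approach concrete, here is a minimal sketch of a data-driven test in Python's unittest, with the tabulated cases separated from the single procedure that executes them. The table contents and the withdraw() stand-in are invented for illustration, not taken from any real baseline.

```python
# Sketch of a data-driven test: tabulated cases (inputs, expected results)
# executed by one shared procedure. Table and withdraw() are invented examples.
import unittest

# (case id, opening balance, amount requested, expected outcome)
CASES = [
    ("in_credit",     100, 40,  "dispensed"),
    ("exact_balance", 100, 100, "dispensed"),
    ("insufficient",  100, 150, "refused"),
]


def withdraw(balance, amount):
    """Stand-in for the system under test."""
    return "dispensed" if amount <= balance else "refused"


class WithdrawCashTests(unittest.TestCase):
    def test_withdrawal_table(self):
        # The procedure is written once; the data drives each test case in turn.
        for case_id, balance, amount, expected in CASES:
            with self.subTest(case=case_id):
                self.assertEqual(withdraw(balance, amount), expected)


if __name__ == "__main__":
    unittest.main()
```

Keeping the table separate from the procedure means the cases can be reviewed, extended or regenerated without touching the executing code – the same separation applies to manual scripts as well as automated ones.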
In theory, a long-term, document-intensive project with formal reviews, stages and sign-offs could deliver stable, accurate baselines providing all the information that system-level testers require. But few such projects deliver what their stakeholders want, because stakeholder needs change over time and bureaucratic projects and processes cannot respond to change fast enough (or at all). So, in practice, neither assumption is safe. The full information required to construct an executable test script is not usually available until the system is actually delivered and testers can see how things really work. The baseline is rarely stable anyway: stakeholders learn more about the problem to be solved and the solution design evolves over time, so 'stability', if ever achieved, arrives very late. The usual response is to bring the testers onto the project team at a very late stage.

What are the consequences?

  • The baselines are a ‘done deal’. Requirements are fixed and cannot be changed. They are not testable because no one has tried to use them to create tests. The most significant early deliverables of a project may not themselves have been tested.
  • Testers have little or no involvement in the requirements process. The defects that testers find in documents are ignored (“we’ve moved on – we’re not using that document anymore”).
  • There is insufficient detail in baselines to construct tests, so testers have to get the information they need from stakeholders, users and developers any which way they can. (Needless to say, there is insufficient detail to build the software at all! But developers at least get a head start on testers in this respect.) The knowledge obtained from these sources may conflict, causing even more problems for the tester.
  • The scripts fail in their stated objective: to provide sufficient information to delegate execution to an independent tester, an outsourced organisation or an automated tool. These scripts need intelligence and varying degrees of system and business domain knowledge to be usable.
  • The baselines do not match the delivered system. Typically, the system design and implementation has evolved away from the fixed requirements. The requirements have not been maintained as users and developers focus on delivery. Developers rely on meetings, conversations and email messages for their knowledge.
  • When the time comes for test execution:
    1. The testers who created the scripts have to support the people running them (eliminating the supposed cost-savings of delegation or outsourcing).
    2. The testers run the test themselves (but they don’t need the scripts, so how much effort to create these test scripts was wasted?).
    3. The scripts are inaccurate, so paper copies are marked up and corrected retrospectively to cover the backs of management.
    4. Automated tests won’t run at all without adjustment. In fixing the scripts, are some legitimate test failures eliminated and lost? No one knows.
When testers arrive on a project late they are under-informed and misinformed. They are isolated in their own projects. Their sources of knowledge are unreliable: the baseline documents are not trustworthy. Sources of knowledge may be uncooperative: “the team is too busy to talk to you – go away!”

Does this sound familiar to you?

That’s the scene set. In the next essay, I’ll set out a different vision.

Tags: #Essaysontestdesign


First published 06/12/2009

Are you ever asked as a tester, “is the system good enough to ship?” Given our normal experience, where we are never given enough time to do the testing, the system cannot be as good as it should be. When the time comes to make the release decision, how could you answer that question? James Bach introduced the idea called ‘Good Enough’ in 1997 (Bach, 1997). It is helpful in understanding the risk-based test approach as it seems to hold water as a framework for release decision-making, at least in projects where risks are being taken. So, what is “Good Enough” and how does it help with the release decision?

Many consultants advocate ‘best practices’ in books and conferences. Usually, they preach perfection and they ask leading questions like, “would you like to improve your processes?”, “do you want zero defects?” Could anyone possibly say “no” to these questions? Of course not. Many consultants promote their services using this method of preaching perfection and pushing mantras that sound good. It’s almost impossible to reject them.

Good enough is a reaction to this compulsive formalism, as it is called. It’s not reasonable to aim at zero defects in software, and your users and customers never expect perfection, so why pretend that you’re aiming at perfection? The zero-defect attitude just doesn’t help. Compromise is inevitable and you always know it’s coming. The challenge ahead is to make a release decision for an imperfect system based on imperfect information.

The definition of “Good Enough” in the context of a system to be released is:

  1. X has sufficient benefits.
  2. X has no critical problems.
  3. The benefits of X sufficiently outweigh the problems.
  4. In the present situation, and all things considered, improving X would cause more harm than good.
  5. All the above must apply.

To expand on this rather terse definition, X (whatever X is) has sufficient benefits means that enough of the system is deemed to be working for us to take it into production, use it, get value and get the benefit. It has no critical problems, i.e. there are no severe faults that make it unusable or unacceptable. At this moment in time, with all things considered, if we spend time trying to perfect X, that time is probably going to cost us more than shipping early with the known problems. This framework allows us to release an imperfect system early because the benefits may be worth it. How does testing fit into this good enough idea?

Firstly, have sufficient benefits been delivered? The tests that we execute must at least demonstrate that the features providing the benefits are delivered completely, so that we have evidence of this. Secondly, are there any critical problems? Our incident reports give us the evidence of the critical problems and many others too. There should be no critical problems for it to be good enough. Thirdly, is our testing good enough to support this decision? Have we provided sufficient evidence to say these risks are addressed and those benefits are available for release?

It is not for a tester to decide whether the system is good enough. An analogy that might help here is to view the tester as an expert witness in a court of law. The main players in this courtroom scene are:

  • The accused (the system under test).
  • The judge (project manager).
  • The jury (the stakeholders).
  • Expert witness (the tester).

In our simple analogy, we will disregard the lawyers’ role. (In principle, they act only to extract evidence from witnesses). Expert witnesses are brought into a court of law to find evidence and present that evidence in a form for laymen (the jury) to understand. When asked to present evidence, the expert is objective and detached. If asked whether the evidence points to guilt or innocence, the expert explains what inferences could be made based on the evidence, but refuses to judge innocence or guilt. In the same way, the software tester might simply state that based on evidence “these features work, these features do not work, these risks have been addressed, these risks remain”. It is for others to judge whether this makes a system acceptable.

The tester simply provides information for the stakeholders to make a decision. Adopting this position in a project seems a reasonable one to take. After all, testers do not create software or software faults; testers do not take the risks of accepting a system into production. Testers should advocate this independent point of view to their management and peers. When asked to judge whether a system is good enough, the tester might say that, on the evidence we have obtained, these benefits are available and these risks still exist. The release decision is someone else's to make.

However, you know that the big question is coming your way, so when you are asked, “is it ready?” what should you do? You must help the stakeholders make the decision, but not make it for them. The risks – those problems that we thought, some months ago, could occur and which, in your opinion, would make the system unacceptable – might still exist. Based on the stakeholders’ own criteria, the system cannot now be acceptable, unless they relax their perceptions of the risk. The judgement on outstanding risks must be as follows:

  • There is enough test evidence now to judge that certain risks have been addressed.
  • There is evidence that some features do not work (the feared risk has materialised).
  • Some risks remain (tests have not been run, or no tests are planned).

This might seem like an ideal independent position for testers to take, and you might think it unrealistic to expect anyone to behave this way. However, we believe this stance is unassailable, since the alternative, effectively, is for the tester to take over the decision-making in a project. You may still be forced to give an opinion on the readiness of a system, but we believe taking this principled position (at least at first) might raise your profile and credibility with management. They might also come to recognise your role in future projects – as an honest broker.

REFERENCES
Bach, J. (1997), 'Good Enough Quality: Beyond the Buzzword', IEEE Computer, August 1997, pp. 96–98.

Paul Gerrard, July 2001



Tags: #risk-basedtesting #good-enough


First published 06/11/2009

If we believe the computer press, the E-Business revolution is here; the whole world is getting connected; that many of the small start-ups of today will become the market leaders of tomorrow; that the whole world will benefit from E-anyWordULike. The web offers a fabulous opportunity for entrepreneurs and venture capitalists to stake a claim in the new territory – E-Business. Images of the Wild West, wagons rolling, gold digging and ferocious competition over territory give the right impression of a gold rush.

Pressure to deliver quickly, using new technology, inexperienced staff, into an untested marketplace and facing uncertain risks is overwhelming. Where does all this leave the tester? In fast-moving environments, if the tester carps about lack of requirements, software stability or integration plans they will probably be trampled to death by the stampeding project team. In high integrity environments (where the Internet has made little impact, thankfully), testers have earned the grudging respect of their peers because the risk of failure is unacceptable and testing helps to reduce or eliminate risk. In most commercial IT environments, however, testers are still second class citizens on the team. Is this perhaps because testers, too often, become anti-risk zealots? Could it be that testers don’t acclimatise to risky projects because we all preach ‘best practices’?

In all software projects, risks are taken. In one way, testing in high-integrity environments is easy. Every textbook process, method and technique must be used to achieve an explicit aim: to minimise risk. It’s a no-brainer. In fast-moving E-Business projects, risk taking is inevitable. Balancing testing against risk is essential because we never have the time to test everything. It’s tough to get it ‘right’. If we don’t talk to the risk-takers in their language we’ll never get the testing budget approved.

So, testers must become expert in risk. They must identify failure modes and translate these into consequences to the sponsors of the project. 'If xxx fails (and it is likely, if we don’t test), then the consequence to you, as sponsor is...' In this way, testers, management, sponsors can reconcile the risks being taken to the testing time and effort.

How does this help the tester? First, the decision to do more or less testing is arrived at by consensus (no longer will the tester lie awake at night thinking, 'am I doing enough testing?'). Second, the decision is made consciously by those taking the risk. Third, it makes explicit the tests that will not be done – the case for doing more testing was self-evident, but was consciously overruled by management. Fourth, it makes the risks being taken by the project visible to all.

Using risk to prioritise tests means that testers can concentrate on designing effective tests to find faults and not worry about doing ‘too little’ testing.

What happens at the end of the test phase, when time has run out and there are outstanding incidents? If every test case and incident can be traced back to a risk, the tester can say, 'at this moment, here are the risks of release'. The decision to release needn’t be an uninformed guess. It can be based on an objective assessment of the residual risk.

Adopting a risk-based approach changes the definition of ‘good’ testing. Our testing is good if it provides evidence of the benefits delivered and of the current risk of release, at an acceptable cost, in an acceptable timeframe. Our testing is good if, at any time during the test phase, we know the status of benefits, and the risk of release. No longer need we wait a year after release before we know whether our testing is perfect (or not). Who cares, one year later anyway?

In a recent E-Business project, we identified 82 product risks of concern. Fewer than 10 had anything to do with functionality. In all E-Business projects, the issues of Non-Functional problems such as usability, browser configuration, performance, reliability, security seem to dominate people’s concerns. We used to think of software product risks in one dimension (functionality) and concentrate on that. The number and variety of the risks of E-Business projects forces us to take a new approach.

It could be said that in the early 1990s, the tester community began to emerge and gain a voice in the computer industry. Using that voice, the language of risk will make testers effective in the ambitious projects coming in the next millennium.

Paul Gerrard, February 2000

Tags: #risk #e-businesstesting #language


First published 30/06/2011

Scale, extended timescales, logistics, geographic distribution of resources, requirements/architectural/commercial complexity and the demand for documented plans and evidence are the gestalt of larger systems development. “Large systems projects can be broken up into a number of more manageable smaller projects requiring less bureaucracy and paperwork” sounds good, but few have succeeded. Iterative approaches are the obvious way to go, but not many corporates have the vision, the skills or the patience to operate that way. Even so, session-based/exploratory testing is a component of almost all test approaches.

The disadvantages of documentation are plain to see. But there are three aspects that concern us.

  1. Projects, like life, never stand still. Documentation is never up to date or accurate and it's a pain to maintain – so it usually isn't.
  2. Processes can be put in place to keep the requirements and all dependent documentation in perfect synchronisation, but the delays caused by the required human interventions and translation processes undermine our best efforts.
  3. At the heart of projects are people. They can rely on processes and paper to save them and stop thinking. Or they can use their brains.

Number 3 is the killer, of course. With the best will and processes and discipline in the world, all our sources of knowledge are fallible. It is our human ability and flexibility and, dare I say it, agility that allows us to build and test some pretty big stuff that seems to work.

Societal and corporate stupor (aka culture) conspire to make us less interested in tracking down the flaws in requirements, designs, code, builds and thinking. It is our exploratory instincts that rescue us.

Tags: #ALF


First published 13/10/2010

It seems like Prezi is all the rage. As a frequent presenter, I thought I'd have a play. So I took some of the early text from the Tester's Pocketbook and created my first Prezi. Not half bad. I'm not sure it's a revolution but, sometimes, anything is better than PowerPoint.

Tags: #testaxioms #Prezi


First published 26/05/2011

Test Assurance is an evolving discipline that concerns senior testing professionals. Unfortunately, there isn't an industry-wide definition of it. Even if there were one, it would probably be far too vague.

This group aims to provide a focus for people who are active in the space to share knowledge, but also for senior folk who are looking for a potential career 'upgrade path'. By and large, test assurance pros are expert in testing, but sit above the fray. Their role is to assess, review, audit, understand and challenge testing, but not usually to conduct it. That is (as was written in one of my TA briefs)...

Test Assurance has no responsibility for delivery.

TA as an engagement might be a full-time internal role in one project or a programme of projects, engaged from the beginning and with a scope of influence across requirements through to acceptance testing, performed internally or by suppliers.

A variation of this role would be to provide oversight of a project from an external point of view. In this case, Test Assurance might report to the chair of a programme management board – often a business leader.

But an alternative engagement might be as a testing trouble-shooter where a (usually large) project has a 'testing problem'. A rapid review and report, with recommendations, presented to at least project board level is the norm.

There are wide variations on these themes.

So my question in this discussion is – what is your experience/view of Test Assurance? Let's hear your comments – perhaps we can create a TA scope or terms of reference so we can define the group's focus.

Here is the link: http://www.linkedin.com/groups/Test-Assurance-3926259

Do join us.

Tags: #testassurance #linkedin


First published 09/12/2009

A post on the Software Testing Club, Is Testing Last in Line?, seems oh so familiar – complaints (if that is what they are) heard for as long as I've been in software (and I'm in my 29th year).

I think all of the responses to the blog are reasonable – but the underlying assumption in all (most) of them is that the tester is responsible for getting:

a) involved early
b) involved heavily

Now of course, there are arguments that we have all had drummed into us since the 1970s that can be used to support both of these aims. But they tend to exclude the viewpoint of the stakeholder. (To me a stakeholder is anyone who is interested in the outcome of testing).

Why were we not automatically involved earlier, if it is so self-evident we should be?

Was no one interested in what we could tell them (given access to whatever products were being produced) at the time? Do stakeholders think we produce so little of interest?

Why don't we get the time, budget, people and resources to test as much as we could?

The same challenges apply. And the conclusions are uncomfortable.

By and large our stakeholders are not stupid. If it is self-evident to us that we should be involved earlier, more and more often, why isn't it obvious to them? Howling at the moon won't help us.

Surely we need to engage with stakeholders:

What exactly do they want from us? When? In what format? How frequently? How flexibly? How thoroughly? And so on.

Testing is 'last in line' with good reason. We don't engage, we don't articulate what we do well enough, we provide data, not information, we provide it late (self-fulfilling prophecy time here), we focus on bugs rather than business goals, we write documents when we need to deliver intelligence, we find bugs when we need to provide evidence of success, we refuse to judge and so give our stakeholders nothing when they need our support most, and so on.

Every ten years or so, I get depressed when the next “big thing” arrives and it appears that, well... it's the same old same old with a different label. New technologies offer new opportunities I guess and after reading a text book or two I get on board.

But for as long as I can remember, testers have been complaining that testing is 'last in line' and they BLAME OTHERS. Surely it's time to look at how WE behave as testers? Surely, we should look at what we are doing wrong rather than blame others?

Tags: #prioritisation #lastinline


First published 26/01/2012

Peter Farrell-Vinay posted the question “Does exploratory testing mean we've stopped caring about test coverage?” on LinkedIn here: http://www.linkedin.com/groupItem?view=&gid=690977&type=member&item=88040261&qid=75dd65c0-9736-4ac5-9338-eb38766e4c46&trk=group_most_recent_rich-0-b-ttl&goback=.gde_690977_member_88040261.gmr_690977

I've replied on that forum, but I wanted to restructure some of the various thoughts expressed there to make a different case.

Do exploratory testers care about coverage? If they don't think and care about coverage, they absolutely should.

All test design is based on models

I’ve said this before (http://testaxioms.com/?q=node/11). Testing is a process in which we create mental models of the environment, the system, human nature, and the tests themselves. Test design is the process by which we select, from the infinite number possible, the tests that we believe will be most valuable to us and our stakeholders. Our test model helps us to select tests in a systematic way. Test models are fundamental to testing – however it is performed. A test model might be a checklist or set of criteria; it could be a diagram derived from a design document or an analysis of narrative text. Many test models are never committed to paper – they can be mental models constructed specifically to guide the tester whilst they explore the system under test. From the tester’s point of view, a model helps us to recognise particular aspects of the system that could be the object of a test. The model focuses attention on areas of the system that are of interest. But models almost always over-simplify the situation.

All models are wrong, some models are useful

This maxim is attributed to the statistician George Box, but it absolutely applies in our situation. Here’s the rub with all models – an example will help. A state diagram is a model. Useful, but flawed and incomplete. It is incomplete because a real system has billions of states, not the three defined in a design document. (And the design might have a lot or a little in common with the delivered system itself, by the way.) So the model in the document is idealised, partial and incomplete – it is not reality. So, the formality of models does not equate to test accuracy or completeness in any way. All coverage is measured with respect to the model used to derive testable items (in this case it could be state transitions). Coverage of the test items derived from the model doesn’t usually (hardly ever?) indicate coverage of the system or technology. The skill of testing isn't mechanically following the model to derive testable items. The skill of testing is in the choice of a considered mix of various models. The choice of models ultimately determines the quality of the testing. The rest is clerical work and (most importantly) observation. I’ve argued elsewhere that not enough attention is paid to the selection of test models (http://gerrardconsulting.com/index.php?q=node/495).

Testing needs a test coverage model or models

I’ve said this before too (http://testaxioms.com/?q=node/14). Test models allow us to identify coverage items. A coverage item is something we want to exercise in our tests. When we have planned or executed tests that cover items identified by our model, we can quantify the coverage achieved as a proportion of all items on the model – as a percentage. Numeric test coverage targets are sometimes defined in standards and plans, and to be compliant these targets must be met. Identifiable aspects of our test model, such as paths through workflows, transitions in state models or branches in software code, can be used as the coverage items. Coverage measurement can help to make testing more 'manageable'. If we don’t have a notion of coverage, we may not be able to answer questions like, ‘what has been tested?’, ‘what has not been tested?’, ‘have we finished yet?’, ‘how many tests remain?’ This is particularly awkward for a test manager. Test models and coverage measures can be used to define quantitative or qualitative targets for test design and execution. To varying degrees, we can use such targets to plan and estimate. We can also measure progress and infer the thoroughness or completeness of the testing we have planned or executed. But we need to be very careful with any quantitative coverage measures or percentages we use.
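As a small, hypothetical illustration of the idea, the sketch below treats the transitions of a tiny state model as the coverage items and reports the proportion exercised by the tests run so far. The model and the 'exercised' set are invented, and the resulting figure says nothing about coverage of the system itself.

```python
# Sketch: coverage measured against a formal model. The coverage items are
# transitions derived from a (hypothetical) state diagram; coverage is the
# proportion of those items exercised. Both sets are invented for illustration.

# (from state, event, to state) - the items derived from the model
MODEL_TRANSITIONS = {
    ("logged_out", "login_ok",   "logged_in"),
    ("logged_out", "login_fail", "logged_out"),
    ("logged_in",  "logout",     "logged_out"),
    ("logged_in",  "timeout",    "logged_out"),
}

# Transitions actually exercised by the tests planned or executed so far
EXERCISED = {
    ("logged_out", "login_ok", "logged_in"),
    ("logged_in",  "logout",   "logged_out"),
}


def transition_coverage(model, exercised):
    """Return (coverage fraction of the model, items not yet covered)."""
    return len(model & exercised) / len(model), model - exercised


if __name__ == "__main__":
    fraction, missed = transition_coverage(MODEL_TRANSITIONS, EXERCISED)
    print(f"Transition coverage: {fraction:.0%}")  # of the model, not the system
    print("Not yet covered:", sorted(missed))
```

The percentage is only ever relative to the model; a different (or richer) model of the same system would give a different figure.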

Formal and Informal Models

Models and coverage items need not necessarily be defined by industry standards. Any model that allows coverage items to be identified can be used.

My definition is this: a Formal Model allows coverage items to be reliably identified on the model. A quantitative coverage measure can therefore be defined and used as a measurable target (if you wish).

Informal Models tend to be checklists or criteria used to brainstorm a list of coverage items or to trigger ideas for testing. These lists or criteria might be pre-defined or prepared as part of a test plan or adopted in an exploratory test session.

Informal models are different from formal models in that the derivation of the model itself is dependent on the experience, intuition and imagination of the practitioner using them so coverage using these models can never be quantified meaningfully. We can never know what ‘complete coverage’ means with respect to these models.

Needless to say, tests derived from an informal model are just as valid as tests derived from a formal model if they increase our knowledge of the behaviour or capability of our system.

Risk-based testing is an informal model approach – there is no way to limit the number of risks that can be identified. Is that bad? Of course not. It’s just that we can't define a numeric coverage target (other than ‘do some tests associated with every serious risk’). Risk identification, assessments and so on are subjective. Different people would come up with different risks, described differently, with different probabilities and consequences. Different risks would be included or omitted; some risks would be split into micro-risks, others not. It's subjective. All risks aren't the same, so a percentage coverage figure is meaningless. The formality associated with risk-based approaches relates mostly to the level of ceremony and documentation and not the actual technique of identifying and assessing risks. It’s still an informal technique.

In contrast, two testers given the same state transition diagram or state table asked to derive, say, state transitions to be covered by tests, would come up with the same list of transitions. Assuming a standard presentation for state diagrams can be agreed, you have an objective model (albeit flawed, as already suggested).

Coverage does not equal quality

A coverage measure (based on a formal model) may be calculated objectively, but there is no formula or law that says X coverage means Y quality or Z confidence. All coverage measures give only indirect, qualitative, subjective insights into the thoroughness or completeness of our testing. There is no meaningful relationship between coverage and the quality of systems.

So, to return to Peter's original question “Does exploratory testing mean we've stopped caring about test coverage?” Certainly not, if the tester is competent.

Is the value of testing less because informal test/coverage models are used rather than formal ones? No one can say – there is no data to support that assertion.

One 'test' of whether ANY tester is competent is to ask about their models and coverage. Most testing is performed by people who do not understand the concept of models because they were never made aware of them.

The formal/informal aspect of test models and coverage is not a criterion for deciding whether planned/documented or exploratory testing is best, because planned testing can use informal models and ET can use formal models.

Ad-Hoc Test Models

Some models can be ad-hoc – here and now, for a specific purpose – invented by the tester just before or even during testing. If, while testing, a tester sees an opportunity to explore a particular aspect of a system, he might use his experience to think up some interesting situations on-the-fly. Nothing may be written down at the time, but the tester is using a mental model to generate tests and speculate how the system should behave.

When a tester sees a new screen for the first time, they might look at the fields on screen (model: test all the data fields), they might focus on the validation of numeric fields (model: boundary values), they might look at the interactions between checkboxes and their impact on other fields' visibility or outcomes (model: decision table?) or look at ways the screen could fail, e.g. extreme values, unusual combinations etc. (model: failure mode or risk-based). Whatever. There are hundreds of potential models that can be imagined for every feature of a system.
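To show how lightweight such a mental model can be when written down, here is a hypothetical sketch of the boundary-value model applied to one numeric field. The valid range (1 to 100) and the validate_quantity() stand-in are assumptions made up for the example, not taken from any real screen.

```python
# Sketch: the boundary-value model for a numeric field accepting 1-100.
# The range and validate_quantity() are invented stand-ins for the real screen.
def validate_quantity(value):
    """Stand-in for the screen's validation of the field."""
    return 1 <= value <= 100


# Model: check values on and either side of each boundary.
BOUNDARY_CASES = [
    (0, False),    # just below the lower boundary
    (1, True),     # on the lower boundary
    (2, True),     # just above the lower boundary
    (99, True),    # just below the upper boundary
    (100, True),   # on the upper boundary
    (101, False),  # just above the upper boundary
]

for value, expected in BOUNDARY_CASES:
    assert validate_quantity(value) is expected, f"unexpected result for {value}"
print("All boundary checks passed")
```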

The very limited number of test models associated with textual requirements are just that – limited – to the common ones taught in certification courses. Are they the best models? Who knows? There is very little evidence to say they are. Are they formal? Yes, in so far as objective definitions of the models (often called test techniques) exist. Is formal better than informal/ad-hoc? That is a cultural or value-based decision – there's little or no evidence other than anecdotal to justify the choice.

ET exists partly to allow testers to do much more testing than that limited by the common models. ET might be the only testing used in some contexts or it might be the 'extra testing on the top' of more formally planned, documented testing. That's a choice made by the project.

Certification Promotes Testing as a Clerical Activity

This ‘clerical’ view of testing is what we have become accustomed to (partly because of certification). The handed-down or ‘received wisdom’ of off-the-shelf models is useful in that such models are accessible, easy to teach and mostly formal (in my definition). There were, when I last looked, 60+ different code coverage models possible in plain vanilla programming languages. My guess is there are dozens associated with narrative text analysis, dozens associated with usage patterns, dozens associated with integration and messaging strategies. And for every formal design model in, say, UML, there are probably 3-5 associated test models – for EACH. Certified courses give us five or six models. Most testers actually use one or two (or zero).

Are the stock techniques efficient/effective? Compared to what? They are taught mostly as a way of preparing documentation to be used as test scripts. They aren't taught as test models, having more or less effectiveness or value for money, to be selected and managed. They are taught as clerical procedures. The problem with real requirements is that you need half a dozen different models on each page – on each paragraph, even. Few people are trained or skilled enough to prepare well-designed, documented tests. When people talk about requirements coverage, it's as sophisticated as saying we have a test that someone thinks relates to something mentioned in that requirement. Hey – that's subjective again – subjective, not very effective and also very expensive.

With Freedom of Model Choice Comes Responsibility

A key aspect of exploratory testing is that you should not be constrained but should be allowed and encouraged to choose models that align with the task in hand so that they are more direct, appropriate and relevant. But the ‘freedom of model choice’ applies to all testing, not just exploratory, because at one level, all testing is exploratory (http://gerrardconsulting.com/index.php?q=node/588). In future, testers need to be granted the freedom of choice of test models but for this to work, testers must hone their modelling skills. With freedom comes responsibility. Given freedom to choose, testers need to make informed choices of model that are relevant to the goals of their testing stakeholders. It seems to me that the testers who will come through the turbulent times ahead are those who step up to that responsibility.

Sections of the text in this post are lifted from the pocketbook http://testers-pocketbook.com

Tags: #model #testmodel #exploratorytesting #ET #coverage
