First published 31/08/2012

In response to Kim Ming Leung's comment, A Different way to Apply SBE to Structured Methods, and with reference to some of the ideas in his blog post, The New V-Model, I decided to write a longer response as a blog post.

"I prefer to use the term “Measurement” and have applied it to Business goal, Business process, User story and program levels"

The given/when/then construct is appropriate where examples of a process or feature are required to illustrate and challenge the understanding of that feature or process in context. We recommend its use in requirements validation of identified features before implementation, in addition to the testing of software as per Behaviour-Driven Development. The approach can be used to validate business processes to some extent, in that pre-conditions, steps and post-conditions provide an example, and therefore a test case, of a stage in a business process.
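As a rough illustration of how pre-conditions, steps and post-conditions become a test case, here is a minimal Python sketch. The `Account` class and the withdrawal feature are hypothetical, invented purely for this example; they are not taken from any real system or tool.

```python
from dataclasses import dataclass


@dataclass
class Account:
    """Hypothetical account, used only to illustrate the construct."""
    balance: int

    def withdraw(self, amount: int) -> bool:
        # Refuse the withdrawal if funds are insufficient.
        if amount > self.balance:
            return False
        self.balance -= amount
        return True


def scenario_withdraw_within_balance() -> Account:
    # Given an account holding 100 (pre-condition)
    account = Account(balance=100)
    # When the customer withdraws 30 (step)
    accepted = account.withdraw(30)
    # Then the withdrawal is accepted and 70 remains (post-condition)
    assert accepted
    assert account.balance == 70
    return account


scenario_withdraw_within_balance()
```

The same three-part shape reads equally well as a plain-language example for a stakeholder and as an executable check for a programmer, which is why it can serve both requirements validation and testing.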

Higher level business goals are defined and measured in different ways, however.

In the early 2000s I coined the term 'Project Intelligence' to label the information that testers and a full test process gathered, analysed and disseminated. You can download the 2004 paper that sets out these ideas here: Managing Projects With Intelligence. The notion of testing is expanded to include not only software testing (at all levels) but also the measurement (and that is the correct term in this context) of achievement in general. That is, I suggest that the disciplines of testing can be applied to the highest level business goals, intermediate goals and achievements, and in fact, any output, outcome or business benefit. So this notion fits nicely with your proposal to use the term "measurement" rather than "testing".

"Different “test detailing” methods are applied at different level"

Given your description of detailing, I agree. Testers (if they are to measure at every technical and business level) need to learn how to measure in a plethora of dimensions. Business measures come in all shapes and sizes from financial metrics and ratios to all kinds of physical measures to measures of less tangible outcomes or achievements:

  • Financial measures might appear simple, but the formulae, data derivation and calculations required might not be. A certain level of financial accounting nous will be required.
  • Physical measures relate to 'almost everything else' that is measurable. This could be a headcount, products sold, manufactured and delivered in a period, time to manufacture, deliver or repair, commissioned floorspace, employees trained and certified, systems delivered, tested or implemented. There is no limit to what can be measured, and the variations of goal and achievement targets that can be set and measured are endless.
  • What about intangibles? How are they measured? There's the theory that anything can be measured (Tom Gilb's law). I interpret this as, "if I choose to assign a measure to that 'intangible concept', then that's good enough for me". In so far as stakeholders can regard such targets as meaningful, then I guess these things can be measured. But this is, in my opinion, more an art than a science. There's a lot more subjectivity than is comfortable for some. But if you are the measurer, and not the stakeholder, then perhaps it's not your problem to worry about. Maybe you don't have the domain knowledge or experience to judge what a good measure is? Who knows?

Given this extended role of testing (or measurement), the people who test today have an opportunity to expand their remit to measuring at a much higher level, and their involvement and influence in projects could be much more significant. Approaches such as benefits realisation, results-based and performance management require good measurement to work, but their advocates rarely, if ever, explain how measurements are made. There is a presumption that numbers just 'fall from the air' perhaps. The challenge of measurement falls most severely on systems testers. I propose that the hard-won lessons of system testing can be applied at all levels in business programmes and maybe the systems testers should take this role on.

"I regard the exploration and verification capabilities desirable side effects but the ultimate goal is to agree with users the Measurements (Acceptance criteria) as Specifications."

I think by this you mean the internal testing of systems at a unit, integration or system level is of little interest to stakeholders but that the definition of criteria for acceptance and the acceptance (testing) process itself is of most importance. I definitely think that stakeholders are usually not interested in what goes on inside software development projects (and most software testing falls into this category), but they are interested in the process that triggers acceptance and payments to suppliers in particular. But there is a large variation in views of what acceptance criteria are and whether they can ever be objective. I think the traditional 'software testing' view that a feature can be deemed accepted if a set of tests derived from its specification can be staged and run successfully is inadequate and not meaningful.

Most stakeholders find it hard to be specific enough for us (testers) to stage tests. I don't think we have made anywhere near enough progress towards defining acceptance criteria that truly cover all aspects of a system to allow us to test them 'objectively'. Such criteria must cover the functionality, of course, but there are so many intangible aspects to the comfort, fit, confidence and sense of certainty that a 'good' system engenders that our attempts to define and measure usability, performance and security in technical terms miss the mark. I know we can define acceptance criteria in these technical, measurable, objective ways, but they aren't satisfactory even though they are “rational” and “objective”.

But you know what? I don't think it matters.

Many studies have shown that business people, especially at the highest level, make their biggest decisions based on experience, intuition and gut feel. Yes, they ask for the data, analyses and fancy charts. Sometimes they understand the data. But when it comes to making the call, as often as not, the guy at the top will take the opinions of his peers and gauge the consensus in his community to make that final decision. Our pained deliberations over the rationality or irrationality of our measurements don't really figure at all in their estimation.

What we really need is to be regarded as a peer and have our say in that consensus.

Tags: #projectintelligence #BSM #resultsbasedmanagement #benefitsrealization

First published 28/09/2016

Last week in Dublin and Cork, I gave talks on testing the Internet of Things (or everything, depending on who you consult). I want to share a little idea or comment that I made during the Cork session. It seemed to come from nowhere – I certainly hadn't prepped the comment so it might have come from somewhere deep in my subconscious. But if you know the source and it's someone other than me, please let me know and I'll acknowledge them.

You never know with live presentations. I don't rehearse (ever) so I never really know what is going to come out. To me, presentations are really stories, so I know the gist of the story, but it's different every time I tell it. My talks are always kind of “live” so to speak, and with the adrenalin flowing, I tend to go into autopilot. Sometimes random (but good) thoughts come out in the flow. From nowhere.

Anyway. What I said was...

“In projects, we tend to let the pilot of a plane – the developer – take off with a general heading and let them go. The navigator – the tester – waits at the other end, and after a rather long wait says 'you were 65 miles off target, and you crashed into a mountain'.

Why on earth do we leave the navigator off the plane? Why do we separate testers from developers? For heaven's sake, why don't we put the navigator on the plane with the pilot in the first place?”

This is the best illustration I can think of for “test early, test often” (a mantra some of us have preached for 20+ years). I've been calling testers “navigators” for ages, but the new idea was the suggestion that putting testing 'after' development is like asking the navigator to wait at the destination.

So, if you are trying to argue for “test early, test often”, “shift-left” or “testers embedded with developers” you might use my “waiting for a plane” analogy.

Of course, you might have heard this idea somewhere else (and maybe I did too, but forgot). Do let me know – I'd like to thank them.

Tags: #TesterasNavigator #Shift-Left

First published 13/10/2020

Nicholas Snogren posted on LinkedIn a reference to an “Axioms of Testing” presentation from 2009 and asked me to comment on his “Tenets of Software Testing”. There are some similarities, though not many I think, and some parallels too, but his question prompted me to give a longer response than I guess was expected. I said...

“Hi, thanks for asking for my opinion. Your tenets look interesting – and although I don't think they map directly to what I've written, they raise points in my mind that need a little airing – my mind grows cobwebby over time, and it's good to brush off old ideas. A bit like exercising muscles that haven't been used for a while haha.”

I give my response as a comparison with my Tester's Pocketbook, and Test Axioms website and presentations. (I notice that some of these posts are around 12 years old and some links don't work anymore. Some are out of my control; others I'll have to track down and correct – let me know if you want that.)

Tenets v Axioms

Firstly, let's get our definitions right.

According to, a Tenet is “any opinion, principle, doctrine, dogma, etc., especially one held as true by members of a profession, group, or movement.” Tenets might be regarded as beliefs that don't require proof and don't provide a strong foundation.

From the same source, an Axiom is, “i) a self-evident truth that requires no proof. ii) a universally accepted principle or rule. iii) Logic, Mathematics. a proposition that is assumed without proof for the sake of studying the consequences that follow from it”

I favoured the use of Axioms as a foundation for thinking in the testing domain. Axioms, if they are defensible, would provide a stronger foundational set of ideas. When I proposed a set of Testing Axioms, there was some resistance – here's a Prezi talk that introduces the idea.

James Bach in particular challenged the idea of Axioms and suggested I was creating a school of testing (when schools of testing were getting some attention) here.

By and large, by defining the Axioms in terms that are context-neutral, challenges have tended to be easy to acknowledge, disarm and set aside. Critics, almost entirely from the context-driven school, jumped the gun so to speak – they clearly hadn't read what I had written at the time before critiquing. Only one or two people responded to James' call to arms to criticise the Axioms and challenged them.

The Axioms are fully described in The Tester's Pocketbook.

The Axioms of Testing website sets out the Axioms with some explanation and provides around 50% of the pocketbook content for free.

Axioms caught the attention (and criticism) of people because I pitched them as universal principles or laws of testing. Tenets, being less strident in their definition, might not attract attention (or criticism) in the same way.

Immediate Comments on the Tenets

The Tenets are numbered and italicised. My comments in plain text.

  1. A software product’s behavior is exhibited by interactions.
  2. There is potentially an infinite number of possible behaviors in software.

These are properties of software. I'm not sure what 1 says other than behavior is triggered by interactions and presumably observed through interactions. However, a lot of software behaves autonomously: it might respond to internal events such as the passing of time and might not exhibit any behaviour through interactions, e.g. a change of internal state. I'm not sure 1 says much.

Tenet 2 is reasonable for reasonably-sized software artefacts.

In the Tester's Pocketbook, I hardly use the term software. I prefer that we test systems. Software is usually an important part of every system. Humans do not interact with software (except by reading or writing it). Software exists in the context of program compilation, hosted on operating systems, running on devices which have peripherals and other interconnected systems which may or may not have user interfaces.

Basing Axioms on Systems means that the Axioms are open to interpretation as Axioms of testing ANY system (i.e. anything. I don't press that idea – but it's an attractive one). Another 'benefit' is that all of the Systems Thinking principles can also be brought to bear on our arguments. Outside its context, Software is not a System.

3. Some of those behaviors are potentially negative, that is, would detract from the objectives of the software company or users.

I use the term Stakeholders to refer to parties interested in the valuable, reliable behavior of systems and the outcome and value of testing those systems.

4. The potentiality for that negative behavior is risk.

OK, but it could be better worded. I would simply say 'potential modes of failure' rather than negative behaviour.

5. It’s impossible to guarantee a lack of risk as it’s impossible to experience an infinite number of behaviors.

Not really. You can guarantee a no-risk situation if no one cares or no one cares enough to voice their concerns before testing (or after testing). There is always the potential for failure because systems are complex and we are not skilled enough to create perfect systems.

6. Therefore a subset of behaviors must be sampled to represent the risk.

Rather than represent, I would say trigger the failure(s) of concern to explore the risk and better inform a risk-assessment.

7. The ability to take an accurate sample, representative of the true risk, is a testing skill.

Not sure what you mean by sample – tests or test cases, I presume? Representative is a subjective notion, surely; 'true' I don't understand; and a testing skill would need more definition than this, wouldn't it?

8. A code change to an existing product may also affect the product in an infinite number of ways.

I'd use 'ANY' change, to a 'SYSTEM'. Why 'also'? What would you say fits into a 'not only.... but also...' clause? But I'm not sure I agree with this assertion anyway. A code change changes some software artefact. The infinite effects (faulty behaviors?) derive from infinite tests (or uses in production) – which you say in 5 is impossible to achieve. I'm not sure what you're trying to say here.

9. It is possible to infer that some behaviors are more likely to be affected by that change than others.

You can infer anything you like by calling upon the great Unicorn in the sky. How will you do this? Either you use tools which are limited in capability or you might use change and defect history or you might guess based on partial knowledge and experience.

10. The risk -of that change- is higher within the set of behaviors that are more likely to be affected by that change.

Do you mean probability of failure or the consequence of failure? I assume probability. At any rate, this is redundant. You have already asserted this in 9. But it's also more complicated than this – a cosmetic defect on an app can be catastrophic and a system failure negligible at times.
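To make the probability-versus-consequence distinction concrete, here is a small Python sketch. It uses the common convention of multiplying probability by consequence to get an 'exposure' figure; that convention, the two risks and every number in it are assumptions made up for illustration, and, as discussed elsewhere in this post, any such numbers are ultimately subjective.

```python
from typing import NamedTuple


class Risk(NamedTuple):
    name: str
    probability: float  # chance of failure, 0.0 to 1.0 (made-up value)
    consequence: int    # cost of failure on a 1-10 scale (made-up value)


def exposure(risk: Risk) -> float:
    """One common convention: exposure = probability x consequence."""
    return risk.probability * risk.consequence


# Hypothetical risks, chosen so the two rankings disagree.
risks = [
    Risk("cosmetic defect on the payment page", 0.25, 8),
    Risk("back-office batch job fails overnight", 0.5, 1),
]

# Ranking by likelihood alone and ranking by exposure pick different risks.
by_probability = max(risks, key=lambda r: r.probability)
by_exposure = max(risks, key=exposure)

print("Most likely to fail:", by_probability.name)
print("Highest exposure:   ", by_exposure.name)
```

The point of the sketch is only that 'risk is higher' is ambiguous until you say which of the two quantities you mean.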

11. The ability to accurately estimate a scope of affected behavior is another testing skill.

I would call this the skills of impact analysis rather than testing. Developers are relatively poor at this, even having a far deeper technical knowledge (either they aren't able or lack the time to impact-analyse to any reliable degree). So we rely on testing to catch regressions which is less than ideal. Testers depend on their experience rather than system internals knowledge. But, since buggy systems behave in essentially unpredictable ways, we must admit our experience is limited and fallible. It's not a 'skill' that I would dwell on.

12. The scope and sampling ideas alone are meaningless without empirical evidence.

The scope and sampling ideas have meaning regardless of whether you implement them. I suppose you might say they are useless ideas if you don't gather evidence.

13. Empirical evidence is gathered through interactions with the product, observation of resultant behavior, and assessment of those observations.

The word empirical is redundant. I would use the word 'some' here. We also get evidence from operation in production, for example. (Unless you include that already?)

14. The accuracy and speed of scope estimation, behavior sampling, and gathering of evidence are key performance indicators for the tester.

If you are implying 13 are tester skills, I suppose you could make this assertion. But you haven't said what the value of evidence is yet. Is the purpose of testing only to evaluate the performance of testers? Hope not ;O)

15. Heuristics for the gathering of such evidence, the estimation of scope, and the sampling of behavior are defined in the Heuristic Test Strategy Model.

Heuristics are available in a wide range of sources including software, systems and engineering standards. Why choose such a limited source?

Inspiration for the Tenets

These tenets were inspired by James Bach’s “Risk Gap” and Doug Hubbard’s book “How to Measure Anything.” Both Bach and Hubbard discuss a very similar idea from different spaces. Hubbard suggests that by defining our uncertainty, we can communicate the value of reducing the uncertainty. Bach describes the “knowledge we need to know” as the “Risk Gap.” This Risk Gap is our uncertainty, and in defining it, we can compute the value of closing it. In testing, I realized we have three primary areas of uncertainty: 1) what is the “risk gap,” or knowledge we need to find out, 2) how can we know when we’ve acquired enough of that unknown knowledge, and 3) how can we design interactions with the program to efficiently reveal this knowledge.

There are several interesting anomalies to unpick here:

  • I recall James telling a story about Tom Gilb asserting anything could be measured. James suggested Love and Tom obliged. I don't think James was impressed.
  • 'Defining uncertainty' – how do you do that reliably? Numerically? Objectively? We can put any numbers we like against probability and consequence. Being certain, with or without evidence, is always subjective. People can say they are more certain, but based on ... what? How do we correlate data with a human emotion and use that to make engineering decisions? People can be easily deceived – by themselves, not just by others. Consider this, for example, and this.
  • Risk Gap – how is a quantity of knowledge measured? What units? With what certainty? These are aspects that James has argued against since the early 1990s.
  • Your three challenges 1), 2) and 3) are reasonable as goals. How do Bach and Hubbard argue you achieve them, if not by calling on the subjective opinions of other people?

Some More General Comments

You seem to be trying to 'make a case' for testing as a tool to address the risk of failure in systems. I (like and) use that same approach in a rounder sense in my conference talks and writings, when practicable. My observations on this are:

  1. The logic doesn't flow as it should because of flaws in the individual statements
  2. You have no pre-definition of test, testing or its purpose at the outset, so it's not clear what your destination is
  3. There's no defined goal of testing, other than to gather evidence to (my words) reassess risks and thereby reduce uncertainty (but you don't say why that's a 'good thing')
  4. Testing enables a reassessment of risk, but that reassessment may increase risk if, for example, you find a bug in something that was previously deemed reliable. (there's a bigger conversation to be had, but risk is not a BAD thing, it's the barrier(s) you need to navigate or break through to gain your REWARD).
  5. Extant, significant risks are a BARRIER to accepting or releasing systems. As such, the goal of testing is to provide evidence that, to the people who make the decision, those risks are acceptable or negligible (reducing uncertainty, sure, but never eliminating it). But the ultimate goal of testing is to show that the system WORKS. Encountering failures is a detour from that goal. The testing goal is broader than exploring risks.
  6. You don't mention stakeholders at all. Why do we test? To provide evidence to testing stakeholders – our customers – so they can make better-informed decisions.


I don't want to give the impression that I'm criticising Nicholas or am arguing against the concept of Tenets or Principles or Axioms of testing. All I have tried to do is offer reasonable criticism of the Tenets to show that a) it is extremely difficult to postulate bullet-proof Tenets, Principles or Axioms and b) it is extremely easy to criticise such efforts by:

  • Exposing flaws in the language used and the logic in an argument that C follows B follows A etc.
  • Identifying implicit assumptions of meaning, scope, dependency and authority and
  • Offering examples of context that contradict, or expose flaws in, the statements made.

I do this because I have been there many times since 2008 and occasionally have to defend the Test Axioms from such criticisms. I have to say, Critical Thinking is STILL a rare skill – I wish criticism were more often proffered as a result of it.


  1. The Tester's Pocketbook
  2. Axioms of Testing website

Tags: #ALF #TestAxioms #Tenets #CriticalThinking

First published 05/07/2018

Do you remember the 'Testing is Dead' meme that kicked off in 2011 or so? It was triggered by a presentation done by Alberto Savoia here. It caused quite a stir, some copycat presentations and a lot of counter-arguments. But I always felt most people missed the point being made. You just had to strip out the dramatics and Doors music.

The real message was that for some organisations, the old ways wouldn't work any more, and as time has passed, that prediction has come true. With the advent of Digital, mobile, IoT, analytics, machine learning and artificial intelligence, some organisations are changing the way they develop software, and as a consequence, testing changes too.

As testing shifts left, with testers working more collaboratively with the business and developers, test teams are being disbanded and/or distributed across teams. With no test team to manage, the role of the test manager is affected. Or eliminated.

Test management thrives; test managers come and go.
It is helpful to think of testing as less of a role and more of an activity that people undertake in their projects or organisations. Everyone tests, but some people specialise and make a career of it. In the same way, test management is an activity associated with testing. Whether you are the tester in a team or running all the testing in a 10,000 man-year programme, you have test management activities.

For better or for worse, many companies have decided that the role of test managers is no longer required. Responsibility for testing in a larger project or programme is distributed to smaller, Agile teams. There might be only one tester in the team. The developers in the team take more responsibility for testing and run their own unit tests. There’s no need for a test manager as such – there is no test team. But many of the activities of test management still need to be done. It might be as mundane as keeping good records of tests planned and/or executed. It could be taking the overall project view on test coverage (of developer v tester v user acceptance testing for example).

There might not be a dedicated test manager, but some critical test management activities need to be performed. Perhaps the team jointly fulfil the role of a virtual test manager!

Historically, the testing certification schemes have focused attention on the processes you need to follow—usually in structured or waterfall projects. There’s a lot of attention given to formality and documentation as a result (and the test management schemes follow the same pattern). The processes you follow, the test techniques you use, the content and structure of reporting vary wherever you work. I call these things logistics.

Logistics are important, but vary in every situation.
In my thinking about testing, as far as possible, I try to be context-neutral. (Except my stories, which are grounded in real experience).

As a consultant to projects and companies, I never knew what situation would underpin my next assignment. Every organisation, project, business domain, company culture, and technology stack is different. As a consequence, I avoided having fixed views on how things should be done, but over twenty-five years of strategy consulting, test management and testing, certain patterns and some guiding principles emerged. I have written about these before[1].

To the point.

Simon Knight at Gurock asked me to create a series of articles on Test Management, but with a difference. Essentially, the fourteen articles describe what I call “Logistics-Free Test Management”. To some people that's an oxymoron. But that is only because we have become accustomed in many places to treat test management as logistics management. Logistics aren't unique to testing.

Logistics are important, but they don't define test management.
I believe we need to think about testing as a discipline where logistics choices are made in parallel with the testing thinking. Test Management follows the same pattern. Logistics are important, but they aren't testing. Test management aims to support the choices, sources of knowledge, test thinking and decision making separately from the practicalities – the logistics – of documentation, test process, environments and technologies used.

I derived the idea of a New Model for Testing – a way of visualising the thought processes of testers – in 2014 or so. Since then, I have presented to thousands of testers and developers and I get very few objections. Honestly!

However, some people do say, with commitment, “that's not new!”. And indeed it isn't.

If the New Model reflects how you think, then it should be a comfortable fit. It is definitely not new to you!
One of the first talks I gave on the New Model is here. (Skip to 43m 50s to skip the value of testing talk and long introduction).

[Figure: The New Model for Testing]

Now, I might get a book out of the material (hard-copy and/or ebook formats), but more importantly, I'm looking to create an online and classroom course to share my thinking and guidance on test management.

Rather than offer you specific behaviours and templates to apply, I will try to describe the goals, motivations, thought processes, the sources of knowledge and the principles of application and use stories from my own experience to illustrate them. There will also be suggestions for further study and things to think about as exercises or homework.

You will need to adjust these lessons to your specific situation. It requires that you think for yourself – and that is no bad thing.
Here’s the deal in a nutshell: I’ll give you some interesting questions to ask. You need to get the answers from your own customers, suppliers and colleagues and decide what to do next.

I'll be exploring these ideas in my session at the next Assurance Leadership Forum on 25 July. See the programme here and book a place.

In the meantime, if you want to know more, leave a comment or do get in touch at my usual email address.

[1] The Axioms of Testing in the Tester’s Pocketbook, for example.

Tags: #testassurance #testmanager #NewModelTesting #ALF #TestManagement

First published 10/04/2014

I'm working with Lalitkumar, who edits the Tea Time With Testers online magazine. It has a large circulation and I've agreed to write an article series for him on 'Testing the Internet of Everything'. I'll also be presenting webinars to go with the articles, the first of which is here. It takes place on Saturday 19 April at 15.30. An unusual time – but there you go.

You can download the magazine from the home page here:

Lalit has asked for questions on the article and I'll respond to these during the webinar. For questions on a broader range of testing-related subjects, I'll write a response for the magazine, and I'll also blog these questions and answers here.

Questions that result in an interesting blog will receive a free Tester's Pocketbook - if you go through the TTWT website and contact Lalit - anything goes. I look forward to some challenging questions :O)

The first Q&A will appear shortly...

Tags: #TTWT #TeaTimewithTesters #IOE #IOT

First published 27/03/2013

Did you know? We’re staging some webinars

Last night, we announced dates for two webinars that I will present on the subject, “Story-Based Test Automation Using Free Tools”. Nothing very exciting in that, except that it’s the first time we have used a paid-for service to host our own webinar and marketed that webinar ourselves. (In the past we have always pitched our talks through other people who marketed them).

Anyway, right now (8.40 PM GMT and less than 24 hours since we started the announcements) we have 96 people booked on the webinar. Our GoToWebinar account allows us to accept no more than 100. Looks like a sell-out. Great.

Coincidentally, James Bach and Michael Bolton have revisited and restated their positions on the “testing versus checking” and “manual versus automated testing” dichotomies (if you believe they are dichotomies, that is). You can see their position here:

I don’t think these two events are related, but it seemed to me that it would be a good time to make some statements that set the scene for what I am currently working on in general and the webinar specifically.

Business stories and testing

You might know that we (Gerrard Consulting) have written and promoted a software development method that uses the concept of business stories and have created a software as a service product to support the method. The method is not a test method, but it obviously involves a lot of testing. Testing that takes place throughout the development process – during the requirements phase, development phase, test phase and ongoing post-production phases.

Business stories are somewhat more to us than ‘a trigger for a conversation’, but we’ll use the term ‘stories’ to refer to them from now on.

In the context of these phases, the testing in scope might be called by other names and/or be part of processes other than ‘test’. Requirements prototyping, validation, (Specification by Example/Behaviour-Driven Development/Acceptance Test Driven Development/ Test-Driven Development – take your pick), feature-acceptance testing, system testing, user-testing and regression testing during and after implementation and go-live.

There’s quite a lot of this testing stuff going on. Right now, the Bach-Bolton dialogue isn’t addressing all of this in a general way, so I’m keeping a watching brief on events in that space. I look forward to a useful, informative outcome.

How we use (business) stories

In this blog, I want to talk specifically about the use of stories in a structured domain-specific language (using, for example, Gherkin format) to example (and that is a KEY word) requirements. I’m not interested in the Cucumber-specific extensions to the Gherkin syntax. I’m only interested in the feature heading (As a…/I want…/So that…) and the scenario structure (given…/when…/then…) etc. and how they are used to test in a broader sense:

  • Stories provide accessible examples in business language of features in use. They might be the starting point of a requirement, but usually not a full definition of a requirement. Without debating whether requirements can ever be complete, we argue that Specification by Example is not (in general) possible or desirable. See here:
  • If requirements provide definitions of behaviour in a general way, stories can be used to create examples of features described in requirements that are specific and, if carefully chosen, can be used to clarify understanding, to prototype behaviours and validate requirements in the eyes of stakeholders, authors and recipients of requirements. We describe this process here:
  • Depending on who creates these stories and scenarios and for what purpose, these scenarios can be used to feed a BDD, ATDD or Specification by Example approach. The terminology used in these approaches varies, but a tester would recognise them as a keyword-driven approach to test automation. Are these automated scenarios checks or tests? Probably checks. But these automated checks have multiple goals beyond ‘defect-detection’.
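To make the keyword-driven point concrete, here is a minimal sketch in Python of how given/when/then lines might be mapped onto step functions. It is not how Cucumber or any particular BDD tool works internally; the scenario text, the `step` decorator, the `run_scenario` function and the bank-account example are all assumptions invented for illustration.

```python
import re

# Hypothetical scenario in given/when/then form (invented for this sketch).
SCENARIO = """\
Given an account with a balance of 100
When the customer withdraws 30
Then the balance is 70
"""

# Registry of (compiled pattern, function) pairs: the "keywords" of the approach.
STEPS = []


def step(pattern):
    """Register a function as the implementation of a step phrase."""
    def register(func):
        STEPS.append((re.compile(pattern), func))
        return func
    return register


@step(r"an account with a balance of (\d+)")
def account_with_balance(context, amount):
    context["balance"] = int(amount)


@step(r"the customer withdraws (\d+)")
def customer_withdraws(context, amount):
    context["balance"] -= int(amount)


@step(r"the balance is (\d+)")
def balance_is(context, expected):
    assert context["balance"] == int(expected), context["balance"]


def run_scenario(text):
    """Run each scenario line against the first step whose pattern matches it."""
    context = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        keyword, _, phrase = line.partition(" ")
        if keyword not in ("Given", "When", "Then", "And"):
            raise ValueError(f"unknown keyword: {keyword}")
        for pattern, func in STEPS:
            match = pattern.fullmatch(phrase)
            if match:
                func(context, *match.groups())
                break
        else:
            raise LookupError(f"no step matches: {phrase}")
    return context
```

Calling `run_scenario(SCENARIO)` executes the three steps in order and raises an `AssertionError` if the "Then" line does not hold. The business-language phrases stay readable to stakeholders while the step functions carry the automation.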

Story-based testing and automation

You see, the goals of an automated test (and let me persist in calling them tests for the time being) vary, and there are several distinct goals of story-based scenarios as test definitions.

In the context of a programmer writing code, the rote automation of scenarios as tests gives the programmer a head start in their test-driven development approach. (And crafting scenarios in the language of users segues into BDD of course). The initial tests a programmer would have needed to write already exist so they have a clearer initial goal. Whether the scenarios exist at a sufficiently detailed level for programmers to use them as unit-tests is a moot point and not relevant right now. The real value of writing tests and running them first derives from:

  1. Early clarification of the goal of a feature when defined
  2. Immediate feedback of the behaviour of a feature when run
  3. When the goal is understood and the tests pass, then the programmer can more safely refactor their code

Is this testing? Item 2 is clearly an automated test. Item 3 is the reusable regression test that might find its way into a continuous integration and test regime. These tests typically exercise objects or features through a technical API. The user interface probably won’t be exercised.
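As a rough illustration of a scenario automated through a technical API rather than the user interface, the sketch below uses Python's standard `unittest` module. The `Basket` class, its methods and the scenario itself are hypothetical; they stand in for whatever feature the story describes.

```python
import unittest


class Basket:
    """Hypothetical feature under test, reached through its API, not a UI."""

    def __init__(self):
        self._lines = []

    def add_item(self, name, unit_price, quantity=1):
        self._lines.append((name, unit_price, quantity))

    def total(self):
        return sum(price * quantity for _, price, quantity in self._lines)


class BasketTotalScenario(unittest.TestCase):
    """Scenario: the basket total reflects the quantity of each item."""

    def test_total_reflects_quantity(self):
        basket = Basket()                       # Given an empty basket
        basket.add_item("pen", 2, quantity=3)   # When three pens at 2 each are added
        self.assertEqual(basket.total(), 6)     # Then the total is 6


def run_checks():
    """Run the scenario and return the result, as a CI job might."""
    loader = unittest.defaultTestLoader
    suite = loader.loadTestsFromTestCase(BasketTotalScenario)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

Written before `Basket` exists, the test clarifies the goal; run during coding, it gives immediate feedback; kept afterwards, it becomes the regression test that makes refactoring safer.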

There is another benefit of using scenarios as the basis of automated tests. The language of the scenario (which is derived from the business’s language in a requirement) can be expected to be reused in the test code. We can expect (or indeed mandate) the programmer to reuse that language in the naming of their variables and objects in code. The goals of Ubiquitous Language in systems (defined by Eric Evans and nicely summarised by Martin Fowler) are supported.

Teams needing to demonstrate acceptance of a feature (identified and defined by a story) often rely on manual tests executed by the user or tester. The tester might choose to automate these and/or other behaviour or user-oriented tests as acceptance regression tests.

Is that it? Automated story tests are ‘just’ regression tests? Well maybe so.

The world is going 'software as a service' and the development world moves closer to continuous delivery approaches every day. The time available to do manual testing is shrinking rapidly. In extremis, to avoid bottlenecks in the deployment pipeline, there may be time only to perform cursory manual testing. Manual, functional testing of new features might take place in parallel with development, and automation of functional tests must also happen ahead of deployment because automated testing becomes part of the deployment process itself. Perhaps manual testing becomes a test-as-we-develop activity?

But there are two key considerations for this high-automation approach to work:

  1. I’ve said elsewhere that Continuous Delivery is a beast that eats requirements, and for CD to work, the quality of requirements must be much higher than we are accustomed to. We use the term trusted requirements. You could say, tested and trusted. We, and I mean testers mostly, need to validate requirements using stories so the developers receive both trusted requirements and examples of features in use. Without trusted requirements, CD will just hit a brick wall faster.
  2. Secondly, it seems to me that for the testers not to be a bottleneck, the manual checking that they do must be eliminated. Whichever tests can be automated should be. The responsibility for automation of checking must move from being a retrospective activity to possibly a developer activity. This will free the manual testers to conduct and optimise their activity in the short time they have available.

There are several spin-off benefits of basing tests on stories and scenarios. Here are two: if test automation is built early, then all checks can take advantage of it; if automation is built in parallel with the software under test, then the developers are much more likely to consider the test automation and build the hooks to allow it to operate effectively. The continuous automated testing provides the early warning system of continuous delivery regimes. These tests don't 'find bugs'; rather, they signal functional equivalence. Or not.

I wrote a series of four articles on 'Anti-Regression Approaches'. What are the skills of setting up regression test regimes? Not necessarily the same as those required to design functional tests. Primarily, you need automation skills and a knowledge of the internals of the system under test. Are these testing skills? Not really. They are more likely to be found in developers. This might be a good thing. Would it not be best to place responsibility for regression detection on those people responsible for introducing regressions? Maybe developers can do it better?
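The idea that automated regression checks signal functional equivalence (or not) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `functional_equivalence` helper and the three `shipping_cost_*` functions are hypothetical, standing in for a trusted baseline, a refactored version and a version with an accidental change.

```python
def functional_equivalence(baseline, candidate, inputs):
    """Return (input, expected, actual) for every input where behaviour differs."""
    differences = []
    for value in inputs:
        expected = baseline(value)
        actual = candidate(value)
        if expected != actual:
            differences.append((value, expected, actual))
    return differences


def shipping_cost_v1(weight_kg):
    # Hypothetical baseline: flat 5 up to 2 kg, then 2 per extra kg.
    return 5 if weight_kg <= 2 else 5 + (weight_kg - 2) * 2


def shipping_cost_v2(weight_kg):
    # Hypothetical refactoring that should preserve behaviour.
    base, per_extra_kg = 5, 2
    return base + max(0, weight_kg - 2) * per_extra_kg


def shipping_cost_v3(weight_kg):
    # Hypothetical regression: the rate per extra kg changed by accident.
    return 5 + max(0, weight_kg - 2) * 3


INPUTS = [1, 2, 3, 5]
```

An empty list from `functional_equivalence(shipping_cost_v1, shipping_cost_v2, INPUTS)` is the "equivalent" signal; a non-empty list for `shipping_cost_v3` is the early warning. Note that the check says nothing about whether the baseline was right in the first place.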

One final point. If testers are allowed (and I use that word deliberately) to test or validate requirements using stories in the way we suggest, then the quality of requirements provided to developers will improve. And so will the software they write. And the volume of testing we are currently expected to resource will reduce. So we need fewer testers. Or should I say checkers?

This is the essence of the “redistributed testing” offer that we, as testers, can make to our businesses.

The webinar is focused on our technical solution and is driven by the thinking above.

Last time I looked we had 97 registrants on the 4th April Webinar. If you are interested, the 12th April webinar takes place at 10 AM GMT – you can register for it here:

Tags: #testautomation #businessstorymethod #businessstories #BusinessStoryManager #BDD #tdd #ATDD

First published 03/12/2009

The V-model promotes the idea that the dynamic test stages (on the right hand side of the model) use the documentation identified on the left hand side as baselines for testing. The V-Model further promotes the notion of early test preparation.

Figure 4.1 The V-Model of testing.

Early test preparation finds faults in baselines and is an effective way of detecting faults early. This approach is fine in principle and the early test preparation approach is always effective. However, there are two problems with the V-Model as normally presented.

The V-Model with early test preparation.

Firstly, in our experience, there is rarely a perfect, one-to-one relationship between the documents on the left hand side and the test activities on the right. For example, functional specifications don’t usually provide enough information for a system test. System tests must often take account of some aspects of the business requirements as well as physical design issues. System testing usually draws on several sources of requirements information to be thoroughly planned.

Secondly, and more important, the V-Model has little to say about static testing at all. The V-Model treats testing as a “back-door” activity on the right hand side of the model. There is no mention of the potentially greater value and effectiveness of static tests such as reviews, inspections, static code analysis and so on. This is a major omission and the V-Model does not support the broader view of testing as a constantly prominent activity throughout the development lifecycle.

The W-Model of testing.

Paul Herzlich introduced the W-Model approach in 1993. The W-Model attempts to address shortcomings in the V-Model. Rather than focus on specific dynamic test stages, as the V-Model does, the W-Model focuses on the development products themselves. Essentially, every development activity that produces a work product is “shadowed” by a test activity. The purpose of the test activity specifically is to determine whether the objectives of a development activity have been met and the deliverable meets its requirements. In its most generic form, the W-Model presents a standard development lifecycle with every development stage mirrored by a test activity. On the left hand side, typically, each development activity (for example, “write requirements”) is accompanied by a test activity (“test the requirements”) and so on. If your organization has a different set of development stages, then the W-Model is easily adjusted to your situation. The important thing is this: the W-Model of testing focuses specifically on the product risks of concern at the point where testing can be most effective.

The W-Model and static test techniques.

If we focus on the static test techniques, you can see that there is a wide range of techniques available for evaluating the products of the left hand side. Inspections, reviews, walkthroughs, static analysis, requirements animation as well as early test case preparation can all be used.

The W-Model and dynamic test techniques.

If we consider the dynamic test techniques you can see that there is also a wide range of techniques available for evaluating executable software and systems. The traditional unit, integration, system and acceptance tests can make use of the functional test design and measurement techniques as well as the non-functional test techniques that are all available for use to address specific test objectives.

The W-Model removes the rather artificial constraint of having the same number of dynamic test stages as development stages. If there are five development stages concerned with the definition, design and construction of code in your project, it might be sensible to have only three stages of dynamic testing. Component, system and acceptance testing might fit your normal way of working. The test objectives for the whole project would be distributed across three stages, not five. There may be practical reasons for doing this and the decision is based on an evaluation of product risks and how best to address them. The W-Model does not enforce a project “symmetry” that does not (or cannot) exist in reality. The W-Model does not impose any rule that later dynamic tests must be based on documents created in specific stages (although earlier documentation products are nearly always used as baselines for dynamic testing).

More recently, the Unified Modeling Language (UML) described in Booch, Rumbaugh and Jacobson’s book [5] and the methodologies based on it, namely the Unified Software Process and the Rational Unified Process™ (described in [6-7]), have emerged in importance. In projects using these methods, requirements and designs might be documented in multiple models, so system testing might be based on several of these models (spread over several documents).

We use the W-Model in test strategy as follows. Having identified the specific risks of concern, we specify the products that need to be tested; we then select test techniques (static reviews or dynamic test stages) to be used on those products to address the risks; we then schedule test activities as close as practicable to the development activity that generated the products to be tested.
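The strategy steps above (risk, product, technique, schedule) can be sketched as a small Python data structure. This is an illustration only, not a tool we use: the `Risk` class, the `DEVELOPMENT_ORDER` list, the `plan_tests` function and the example risks are all assumptions made up for the sketch.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    description: str   # the product risk of concern
    product: str       # the work product that needs to be tested
    technique: str     # static review or dynamic test stage chosen to address it


# Hypothetical order in which development activities deliver their products.
DEVELOPMENT_ORDER = ["requirements", "design", "code", "built system"]


def plan_tests(risks):
    """Group test activities by product, in the order the products are produced.

    Each test activity is thereby scheduled next to the development
    activity that generated the product it tests.
    """
    schedule = {product: [] for product in DEVELOPMENT_ORDER}
    for risk in risks:
        schedule[risk.product].append(f"{risk.technique}: {risk.description}")
    return [(product, tests) for product, tests in schedule.items() if tests]


# Invented example risks for illustration.
RISKS = [
    Risk("ambiguous acceptance criteria", "requirements", "review"),
    Risk("slow response under load", "built system", "performance test"),
    Risk("incorrect interest calculation", "code", "component test"),
]
```

`plan_tests(RISKS)` returns the test activities ordered by when their products appear, so the requirements review comes first even though it was not listed first. Products with no identified risk (here, design) get no test activity, which reflects the point that the W-Model does not force symmetry.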

Tags: #w-model

First published 05/05/2011

We can help you meet the challenges in the selection and management of your current and prospective external/internal suppliers and partners. We can help your supplier management by:

  • Evaluating supplier strengths and weaknesses
  • Identifying any major risks associated with your supplier services
  • Developing contract schedules including specific acceptance criteria
  • Refining and improving the commercial relationship with your suppliers
  • Developing plans and techniques for the performance tracking and management of your suppliers

If you’d like to know more, please contact us directly.

Tags: #SupplierSelection #SupplierSelectionManagement

First published 21/10/2009

I'm relieved, excited and delighted to tell you that The Tester's Pocketbook has been published and is available. (It is a pocketbook, with 104 pages and c. 19k words).

The book summarises the thinking on Test Axioms and the axiom definitions are hosted (and will be maintained in future) on the Test Axioms website.

Thanks to all my reviewers and people who supported me.

Tags: #paulgerrard #testaxioms #testerspocketbook

