Paul Gerrard

My experiences in the Test Engineering business; opinions, definitions and occasional polemics. Many have been rewritten to restore their original content.

First published 15/07/2013

See below for the four presentations given by Paul at the World Conference on Next Generation Testing, held in Bangalore, India, between 8th and 12th July 2013.

Tags: #nextgentesting

Paul Gerrard

First published 02/12/2022

I’m leaving Twitter, for obvious (to me) reasons, which I'll explain. Essentially, I don't like what's happening to it.

Twitter, YouTube, personal blogs and Mastodon servers are full of accounts and opinions of the calamitous start to Elon Musk's first weeks at the helm of what he hopes will become Twitter 2.0. I've listed some links at the bottom of this post to back up the tale of woe. I don't like Musk, I don't like the sound of his politics, I don't like what he's doing to 'turn the company around'. I'm guessing he'll fail sooner rather than later.

Twitter was bought with borrowed money and now has a debt of around $13bn costing $1bn a year to service. Its revenues are plummeting as advertisers abandon the site. Half of the workforce of 7,500 has been fired (illegally), and whole departments have resigned, refusing to go 'hardcore' (including the HR payroll department, apparently). Of the 3,750 or so employees who were not fired, 75% did not respond to the 'click to be hardcore or be fired' email from Musk.

No one knows how many of the original 7,500 employees are still at the company. It could be that just a few hundred remain. It seems likely that many more remain on the payroll simply because there are no HR staff left to terminate them. And so on.

What does this mean for Twitter?

The general view expressed by experts, Twitter-watchers and ex-employees is that when (not if) Twitter has some infrastructure failures, there may not be enough (or any) people with the skills required to restore the service. Twitter has always had hackers trying to penetrate the site but, given the vulnerability of the service, they'll be trying extra hard to bring it down. Forever.

It's when they deploy larger software or infrastructure upgrades that the fun will really start.

Musk has also admitted that if Twitter can't get its revenues up, it may have to file for bankruptcy. It really is that bad.

So, I'm leaving (not leaving) Twitter

I won't close my account because, you never know, maybe it'll turn a corner or something and become both viable and an attractive, friendly place to be. But I'm not holding my breath.

Introducing Mastodon

Mastodon seems to be the main game in town for people wishing to change platforms. Compared to Twitter, it is still small – around 8 million accounts according to https://bitcoinhackers.org/@mastodonusercount – but growing fast, as most people leaving the 'bird site' need a new home and land on the federated, decentralised service named after the ancient relative of elephants.

Lots of blogs and YouTube videos explaining what Mastodon is and how to use it have appeared over the last few weeks. You must choose and register on a specific server (often called an instance), but that server is one of (today) about 3,500. At the time of writing, around 200 servers are being added per day. You can think of a Mastodon server as a bit like an email server. You toot (or post) rather than tweet through your home server, but it can connect with every other Mastodon user or server on the planet. (Unless they are blacklisted – and some are.)

I have set up a Mastodon Server

It's not as easy to find people you know, and of course, it's early days and most people aren't on Mastodon yet. But it's growing steadily as people join, experiment and learn how to use the service.

Mastodon accounts are like Twitter accounts except that, like email, you have to specify the server you are registered with. For example, my Mastodon account is @paul@tester.social – a bit like an email address. Click on my address to see my profile.

I've been using Mastodon for about a month now, and I've found and followed the Mastodon accounts of 70-75 testing-involved people I follow on the bird site. That's nearly a quarter of the 325 I follow (and I follow quite a few non-testing, jokey and celebrity accounts). So I'll risk saying that 25% or more of tech-savvy people have made the move already. If you look at who the people you know follow, you will see names you recognise. It's not so hard to get started. And enthusiastic people who follow you on Twitter will be looking for you when they join.

Why did I set up a Mastodon server?

Good question. On the one hand, I like to try new products and technologies – I have been running my own hardware at data centres since the early 2000s. At one point I had a half rack and eleven servers. Nowadays, I have three larger servers hosting about twenty virtual machines. If you want to know more about my set up, drop me a line or comment below.

I've hosted mail and web servers, Drupal and Wordpress sites, and experimented with Python-based data science services, MQTT, lots of test tools, and for a while, I even ran a Bitcoin node to see what was involved. So I thought I'd have a play with Mastodon. I used this video to guide me through the installation process. A bit daunting but with open-source software, you have to invest time to figure things out that aren't explained in documentation or videos.

tester.social is hosted on a 4-CPU, 16GB-memory, 1TB-disk virtual machine and uses Cloudflare R2 as a cloud-based object store for media uploads etc. From what I've seen of other site setups, it would easily support 10,000 or more registered users. But I'm going to monitor it closely to see how it copes, of course.
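
For anyone curious about the object-store part of the setup: Mastodon reads its media-storage settings from .env.production. A minimal sketch of the relevant settings follows, assuming the R2 bucket and access keys already exist – every value here is a placeholder, not my real configuration:

    # .env.production – object storage (all values are placeholders)
    S3_ENABLED=true
    S3_BUCKET=mastodon-media
    S3_ENDPOINT=https://<account-id>.r2.cloudflarestorage.com
    S3_ALIAS_HOST=files.example.social
    AWS_ACCESS_KEY_ID=<r2-access-key-id>
    AWS_SECRET_ACCESS_KEY=<r2-secret-access-key>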

It's an experiment, but one we will support for the next year at least. If it takes off and we get lots of users, we may have to think carefully about hiring technical and moderation staff and/or limiting registrations. But that's a long way off for now.

If you want to join tester.social, please do but...

Be aware that when you register, you will be asked to explain in a few words what you want to see on the site, and what you'll be posting about. Your account will be reviewed and you'll get an email providing a confirmation and welcome message. This is purely to dissuade and filter out phoney accounts and bots.

We will commit to the Mastodon Server Covenant and hopefully be registered with the Mastodon server community which today numbers just over 3,000. Nearly 2,000 servers have been set up since Musk took over the bird site. Mastodon is growing rapidly.

I don't know anyone on Mastodon, how do I find people I know?

If you are on Twitter, keep an eye out for people you know announcing their move to Mastodon – follow them, and see who they follow. And follow them. And see who they follow that you know and follow them and...

If you are not on Twitter, follow me. See who I follow and follow those you know. You'll get the hang of it pretty quickly.

I don't want to leave Twitter yet, but want to experiment

Firstly, you can join any Mastodon service and use a cross-poster to copy your toots to tweets and vice versa. All new posts on either service will be mirrored to the other. I found this article useful to learn what help is out there to migrate from Twitter to Mastodon: https://www.ianbrown.tech/2022/11/03/the-great-twitter-migration/.

For example, there are tools to scan your Twitter follows and followers for Mastodon handles, and you can import these lists to get you started. I used Movetodon to follow all my Twitter follows who had Mastodon accounts. Over time, I'll use it again and again to catch people who move in the future.

I registered with https://moa.party/ – it synchronises my posts on tester.social and Twitter in both directions – so I only have to post once. I post on tester.social and the tweet that appears on the bird site isn't marked as 'posted by a bot' – so that works for me.

I found Martin Fowler's Exploring Mastodon blog posts on the subject very useful. He talks about his first month of using the service. Which is where I am at the moment. Sort of.

A few FAQs answered. Sort of

What if I don't like the service I register with or prefer another service? You can always transfer your account to another Mastodon service. Your followers will be transferred, and you can export and import your follows, blocks and so on (your old posts stay behind on the old server). You have to create a new account on the target service and may have to change your account name if your current name is taken on the new service. A redirect from old to new account is part of the service.

I hate advertising – can I avoid it? Mastodon servers do not display advertisements. Of course, companies might post commercials, but if you complain, the post might be taken down and the poster advised to leave the service – or, in extremis, blocked.

What are toots and boosts? A toot is equivalent to a Tweet, and a boost is the same as a Retweet.

Are there apps for Mastodon? Yes, of course – you can see the iOS and Android apps here. There are also a number of third-party apps. Don't ask me what they do – I haven't explored.

Why is the web interface better than the apps? If you use the web interface, you can turn on what is called the Advanced Web Interface in your preferences. In this version, you can view and pin multiple columns at the same time – the toot window, home timeline, local timeline and federated timeline in parallel. You can set up what appears in each column in your preferences.

What are toot window, home timeline, local timeline, federated timeline? The toot window is where you post new messages. The three timelines are:

Home: like on Twitter, it shows all the posts of all the people you follow on all Instances
Local: it shows all the posts of the members of your Instance
Federated: it shows all the posts of the members of your Instance and also the posts of people on other Instances that are followed by people of your Instance.

If you want more help – and I'm sure you do – try this site: https://mastodon.help/ – it seems to cover all of the basics and quite a lot of the advanced stuff too.

If you join tester.social and need help or have a question...

If you need help, mention @questions in a post and it'll reach us. We'll try to answer as soon as we can.

References

https://www.wsj.com/articles/how-elon-musks-twitter-faces-mountain-of-debt-falling-revenue-and-surging-costs-11669042132

https://www.bloomberg.com/news/articles/2022-11-10/musk-tells-twitter-staff-social-network-s-bankruptcy-is-possible

https://www.cipp.org.uk/resources/news/twitter-s-payroll-department-walks-out.html

https://www.nytimes.com/2022/11/18/technology/elon-musk-twitter-workers-quit.html

https://www.theguardian.com/technology/2022/nov/17/elon-musk-twitter-closes-offices-loyalty-oath-resignations

https://social.network.europa.eu/@EU_Commission/109437986950114434 (Twitter does not meet EU laws)



Tags: #mastodon #tester.social #twexit #socialmedia #twitter

Paul Gerrard

First published 12/04/2012

 

In this approach, the technique involves taking a requirement and identifying the feature(s) it describes. For each feature, a story summary and a series of scenarios are created and these are used to feedback examples to stakeholders. In a very crude way, you could regard the walkthrough of scenarios and examples as a ‘paper-based’ unit or component test of each feature.

What do you mean ‘unit or component test’?

A story identifies a feature and includes a set of examples that represent its required behaviour. It is not a formal component specification, but it summarizes the context in which a feature will be used and a set of business test cases. Taken in isolation, the story and scenarios, when compared with the requirement from which it is derived, provide a means of validating the understanding of the requirement.

Rather than test the functionality of a feature (which is done later, usually by developers), the story and scenarios test the requirement itself. When anomalies are cleared up, what remains is a clarified requirement and a story that identifies a feature, together with a set of clarifying examples. The definition of the identified feature is clarified and validated in isolation – just as a component can be tested and trusted in isolation.

The scenarios are limited in scope to a single feature but taken together, a set of stories validates the overall consistency and completeness of a requirement with respect to the feature(s) it describes.
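
To make this concrete, here is a story and a couple of scenarios for a hypothetical 'book search' feature. The format is simplified and every name and example is invented for illustration:

    Story: Search the catalogue by title
    As a shopper, I want to search for books by title
    so that I can find the book I want to buy.

    Scenario: the title matches exactly one book
      Given the catalogue contains "The Art of Software Testing"
      When I search for "Art of Software"
      Then the results list shows that one book

    Scenario: the title matches no books
      Given the catalogue does not contain "An Unwritten Book"
      When I search for "An Unwritten Book"
      Then the results list is empty and a 'no matches' message is shown

Walking scenarios like these through with stakeholders, alongside the original requirement text, is the 'paper-based' component test of the feature described above.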

Creating Stories and Scenarios – DeFOSPAM

DeFOSPAM is the mnemonic the Business Story Method™ uses to summarise the seven steps used to create a set of stories and scenarios for a requirement that allow us to comprehensively validate our understanding of that requirement.
  • Definitions
  • Features
  • Outcomes
  • Scenarios
  • Prediction
  • Ambiguity
  • Missing
The number of features and stories created for a requirement is obviously dependent on the scope of the requirement. A 100-word requirement might describe a single system feature, and a few scenarios might be sufficient. A requirement that spans several pages of text might describe multiple features and need many stories and tens of scenarios to describe fully. We recommend you try to keep things simple by splitting complex requirements.

D – Definitions

If agreement on terminology or feature descriptions cannot be reached, perhaps this is a sign that stakeholders do not actually agree on other things? These could be the goals of the business, the methods or processes to be used by the business, the outputs of the project or the system features required. A simple terminology check may expose symptoms of serious flaws in the foundation of the project itself. How powerful is that?

Ideally, the need for definitions of terms used in requirements and stories arises as they are written and should be picked up by the author as they write. However, it is not uncommon for authors to be blind to the need for definitions because they are using language and terminology that is very familiar to them. Scanning requirements and stories to identify the terms and concepts that need definition or clarification by subject matter experts is critical.

Getting the terminology right is a very high priority. All subsequent communication and documentation may be tainted by poor or absent definitions. The stories and scenarios created to example requirements must obviously use the same terminology as the requirement so it is critical to gain agreement early on.

Identify your sources of definitions. These could be an agreed language dictionary, source texts (books, standards etc.) and a company glossary. The company glossary is likely to be incomplete or less precise than required. The purpose of the definitions activity is to identify the terms needing definition, to capture and agree terminology and to check that the language used in requirements and stories is consistent.

On first sight of a requirement text, underline the nouns and verbs and check that these refer to agreed terminology or that a definition of those terms is required.

  • What do the nouns and verbs actually mean? Highlight the source of definitions used. Note where definitions are absent or conflict.
  • Where a term is defined, ask stakeholders – is this the correct, agreed definition? Call these ‘verified terms’.
  • Propose definitions where no known definition exists. Mark them as ‘not verified by the business’. Provide a list of unverified terms to your stakeholders for them to refine and agree.
When you start the process of creating a glossary, progress will be slow. But as terms are defined and agreed, progress will accelerate rapidly. A glossary can be viewed as a simple list of definitions, but it can be much more powerful than that. It’s really important to view the glossary as a way of making requirements both more consistent and compact – and not treat glossary maintenance as just an administrative chore.
  • A definition can sometimes describe a complex business concept. Quite often in requirements documents, there is huge scope for misinterpretation of these concepts, and explanations of various facets of these concepts appear scattered throughout requirements documents. A good glossary makes for more compact requirements.
  • Glossary entries don’t have to be ‘just’ definitions of terminology. In some circumstances, business rules can be codified and defined in the glossary. A simple business rule could be the validation rule for a piece of business data, for example a product code. But it could be something much more complex, such as the rule for processing invoices and posting entries into a sales ledger.
Glossary entries that describe business rules might refer to features identified elsewhere in the requirements. The glossary (and an index of usage of glossary entries) can therefore provide a cross-reference showing where a rule and its associated system feature are used.
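
As an illustration of codifying a rule, a glossary entry for a product code could define its validation precisely enough to be executable. The sketch below is entirely hypothetical, assuming an invented format of three uppercase letters followed by four digits:

    import re

    # Hypothetical glossary entry: "Product code – three uppercase
    # letters followed by four digits, e.g. ABC1234."
    PRODUCT_CODE = re.compile(r"^[A-Z]{3}[0-9]{4}$")

    def is_valid_product_code(code):
        # True if the code matches the (assumed) agreed format.
        return bool(PRODUCT_CODE.match(code))

    assert is_valid_product_code("ABC1234")
    assert not is_valid_product_code("ab1234X")

A rule captured once in the glossary like this can then be referenced by every requirement, story and test that uses product codes.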

F – Features – One Story per Feature

Users and business analysts usually think and document requirements in terms of features. A feature is something the proposed system needs to do for its user and helps the user to meet a goal or supports a critical step towards that goal.

Features play an important part in how business users, wishing to achieve some goal, think. When visualising what they want of a system, they naturally think of features. Their thoughts traverse some kind of workflow where they use different features at each step in the workflow. ‘… I’ll use the search screen to find my book, then I’ll add it to the shopping cart and then I’ll confirm the order and pay’.

Each of the phrases, ‘search screen’, ‘shopping cart’, ‘confirm the order’ and ‘pay’ sound like different features. Each could be implemented as a page on a web site perhaps, and often features are eventually implemented as screen transactions. But features could also be processes that the system undertakes without human intervention or unseen by the user. Examples would be periodic reports, automated notifications sent via email, postings to ledgers triggered by, but not seen by, user activity. Features are often invoked by users in sequence, but features can also collaborate to achieve some higher goal.

When reading a requirement and looking for features, sometimes the features are not well defined. In this case, the best thing is to create a story summary for each and move on to the scenarios to see how the stories develop.

Things to look for:

  • Named features – the users and analysts might have already decided what features they wish to see in the system. Examples could be ‘Order entry’, ‘Search Screen’, ‘Status Report’.
  • Phrases like, ‘the system will {verb} {object}’. Common verbs are capture, add, update, delete, process, authorise and so on. Object could be any entity the system manages or processes for example, customer, order, product, person, invoice and so on. Features are often named after these verb-object phrases.
  • Does the requirement describe a single large or complex feature or can distinct sub-features be identified? Obviously larger requirements are likely to have several features in scope.
  • Are the features you identify the same features used in a different context or are they actually distinct? For example, a feature used to create addresses might be used in several places such as adding people, organisations and customers.

O – One Scenario per Outcome

More than anything else, a requirement should identify and describe outcomes. An outcome is the required behaviour of the system when one or more situations or scenarios are encountered. We identify each outcome by looking for requirements statements that usually have two positive forms:
  • Active statements that suggest that, ‘…the system will…’
  • Passive statements that suggest that ‘…valid values will …’ or ‘…invalid values will be rejected…’ and so on.
Active statements tend to focus on behaviours that process data, complete transactions successfully and have positive outcomes. Passive statements tend mostly to deal with data or state information in the system.

There is also a negative form of requirement. In this case, the requirement might state, ‘…the system will not…’. What will the system not do? Usually, these requirements refer to situations where the system will not accept or proceed with invalid data or where a feature or a behaviour is prohibited or turned off, either by the user or the state of some data. In almost every case, these ‘negative requirements’ can be transformed into positive requirements, for example, ‘not accept’ could be worded as ‘reject’ or possibly even as ‘do nothing’.

You might list the outcomes that you can identify and use that list as a starting point for scenarios. Obviously, each unique outcome must be triggered by a different scenario. You know that there must be at least one scenario per outcome.

There are several types of outcome, some of which are observable and some of which are not.

Outputs might refer to web pages being displayed, query results being shown or printed, messages being shown or hard-copy reports being produced. Outputs refer to behaviour that is directly observable through the user interface, resulting in human-readable content that is visible or available on some storage format or media (disk files or paper printouts).

Outcomes often relate to changes of state of the system or data in it (for example, updates in a database). Often, these outcomes are not observable through the user interface but can be exposed by looking into the database or system logs perhaps. Sometimes outcomes are messages or commands sent to other features, sub-systems or systems across technical interfaces.

Often an outcome that is not observable is accompanied by a message or display that informs the user what has happened. Bear in mind that it is possible for an outcome or output to be 'nothing'. Literally nothing happens. A typical example here would be the system's reaction to a hacking attempt or selection of a disabled menu option/feature.

Things to look out for:

  • Words (usually verbs) associated with actions or consequences. Words like capture, update, add, delete, create, calculate, measure, count, save and so on.
  • Words (verbs and nouns) associated with output, results or presentation of information. Words like print, display, message, warning, notify and advise.

S – Scenarios – One Scenario per Requirements Decision

We need to capture scenarios for each decision or combination of decisions that we can associate with a feature.

Often, the most common or main success scenario is described in detail and might be called the 'main success' or 'default' scenario. (In the context of use cases, the 'main success scenario' is the normal case; variations to it are called extensions.) The main success scenario might also be called the normal case, the straight-through or plain-vanilla scenario. Other scenarios can represent the exceptions and variations.

Scenarios might be split into those which the system deals with and processes, and those which the system rejects because of invalid or unacceptable data or particular circumstances that do not allow the feature to perform its normal function. These might be referred to as negative, error, input validation or exception condition cases.

The requirement might present the business or technical rules that govern the use of input or stored data or the state of some aspect of the system. For example, a rule might state that a numeric value must lie within a range of values to be treated as valid or invalid, or the way a value is treated depends on which band(s) of values it lies in. These generalised rules might refer to non-numeric items of data being classified in various ways that are treated differently.

A scenario might refer to an item of data being treated differently, depending on its value. A validation check of input data would fall into this category (different error messages might be given depending on the value, perhaps). But it might also refer to a set of input and stored data values in combination. A number of statements describing the different valid and invalid combinations might be stated in text or even presented in a decision-table.
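
As a worked (and entirely invented) example, such a set of statements can be captured as a decision table and checked mechanically. The sketch below assumes a made-up rule that an amount is valid between 1 and 1000 inclusive:

    # Each row: (scenario description, input value, expected outcome).
    DECISION_TABLE = [
        ("below lower bound", 0,    "reject: too small"),
        ("lower bound",       1,    "accept"),
        ("typical value",     500,  "accept"),
        ("upper bound",       1000, "accept"),
        ("above upper bound", 1001, "reject: too large"),
    ]

    def expected_outcome(amount):
        # The (assumed) business rule, codified.
        if amount < 1:
            return "reject: too small"
        if amount > 1000:
            return "reject: too large"
        return "accept"

    # Every scenario/outcome pair in the table should agree with the rule.
    for scenario, value, outcome in DECISION_TABLE:
        assert expected_outcome(value) == outcome, scenario

Tabulating scenarios this way makes missing bands and inconsistent outcomes stand out – exactly what the Ambiguity and Missing steps below look for.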

Things to look out for:

  • Phrases starting (or including) the words ‘if’, ‘or’, ‘when’, ‘else’, ‘either’, ‘alternatively’.
  • Look for statements of choice where alternatives are set out.
  • Where a scenario in the requirement describes numeric values and ranges, what scenarios (normal, extreme, edge and exceptional) should the feature be able to deal with?

P – Prediction

Each distinct scenario in a requirement setting out a situation that the feature must deal with, should also describe the required outcome associated with that scenario. The required outcome completes the definition of a scenario-behaviour statement. In some cases, the outcome is stated in the same sentence as the scenario. Sometimes a table of outcomes is presented, and the scenarios that trigger each outcome are presented in the same table.

A perfect requirement enables the reader to predict the behaviour of the system’s features in all circumstances. The rules defined in the requirements, because they generalise, should cover all of the circumstances (scenarios) that the feature must deal with. The outcome for each scenario will be predictable.

Now, of course, it is completely unrealistic to expect a requirement to predict the behaviour in all possible situations because most situations are either not applicable or apply to the system as a whole, rather than a single feature.

However, where scenarios are identifiable, the aim must be to predict and associate an outcome with each of those scenarios.

When you consider the outcomes identified in the Outcomes stage, you might find it difficult to identify the conditions that cause them. Sometimes outcomes are assumed, or a default outcome may be stated but not associated with scenarios in the requirements text. These 'hanging' outcomes might be important but might never be implemented by a developer. Unless, that is, you focus explicitly on finding them.

Things to look out for:

  • Are all outcomes for the scenarios you have identified predictable from the text?
  • If you cannot predict an outcome, try inventing your own outcomes – perhaps a realistic one and perhaps an absurd one – and keep a note of these. The purpose of this is to force the stakeholder to make a choice and to provide clarification.

A – Ambiguity

The Definitions phase is intended to combat the use of ambiguous or undefined terminology. The other major area to be addressed is ambiguity in the language used to describe outcomes.

Ambiguity strikes in two places. Scenarios identified from different parts of the requirement appear to be identical but have different or undefined outcomes. Or two scenarios appear to have the same outcome but, to be sensible, should perhaps have different outcomes.

There are several possible anomalies to look out for:

  • Different outcomes imply different scenarios but it appears you can obtain the same outcome with multiple scenarios. Is something missing from the requirement?
  • It is possible to derive two different outcomes for the same scenario. The requirement is ambiguous.
In general, the easiest way to highlight these problems to stakeholders is to present the scenario/outcome combinations as you see them and point out their inconsistency or duplication.

Look out for:

  • Different scenarios that appear to have identical outcomes but where common sense says they should differ.
  • Identical scenarios that have different outcomes.

M – Missing

If we have gone through all the previous steps and tabulated all of our glossary definitions, features, scenarios and corresponding outcomes we perform a simple set of checks as follows:
  • Are all terms, in particular nouns and verbs, defined in the glossary?
  • Are there any features missing from our list that should be described in the requirements? For example, we have create, read and update features, but no delete feature.
  • Are there scenarios missing? We may have some but not all combinations of conditions identified in our table.
  • Do we need more scenarios to adequately cover the functionality of a feature?
  • Are outcomes for all of our scenarios present and correct?
  • Are there any outcomes that are not on our list that we think should be?

Workshops

Story-driven requirement validation, based on the DeFOSPAM checklist, is easily managed in a workshop format in phases. Requirements might be clustered or chunked into selected groups to be reviewed. It is helpful if all of the terms requiring definition, clarification or agreement are distributed ahead of the session, so the stakeholders responsible for the definitions can prepare ahead of the meeting.

At the workshop or review meeting, each requirement is considered in turn and the discussion of the stories derived from it is performed:

  • Firstly, consider the definitions to be captured and agreed. The group need to consider the definition of every term individually but also in the context of other related terms as a self-consistent group.
  • For each feature, the scenarios are considered one by one. Where there are suspected omissions or ambiguities, these are discussed and corrected as necessary.
  • The scenarios for a feature are considered as a set: have enough scenarios been documented to understand the requirement? Do they provide enough spread? Do they cover the critical situations?
  • The requirement, stories and scenarios are considered as a whole: is the requirement clear, complete and correct? Are all features identified and all requirements addressed or described by one or more stories? Can the requirements and stories be trusted to proceed to development?
Usually, discrepancies in requirements, definitions and stories can be resolved quickly in the meeting and agreed to.

The text of this post has been extracted from The Business Story Pocketbook written by Paul Gerrard and Susan Windsor.

Tags: #DeFOSPAM #businessstory #requirementsvalidation

Paul Gerrard

First published 18/05/2012

In London, on 18 May I presented a keynote to the Testing and Finance conference. I've been asked for the slides of that talk, so I have uploaded them here. The talk was mostly based on two articles that I originally wrote for Atlassian, and you can find the text of those articles in the blog here: http://gerrardconsulting.com/index.php?q=node/602

Note that the presentation introduces some broader suggestions about influences on the future of testing and testers, including the increasing adoption of continuous delivery.



Tags: #redistributionoftesting #futureoftesting

Paul Gerrard

First published 25/11/2011

Some time ago, Tim Cuthbertson blogged “How I Replaced Cucumber With 65 Lines of Python”. I recently commented on the post and I've expanded on those comments a little here.

I share Tim's frustrations with Cucumber. I think the reuse aspect of the step definitions has some value, but that value is limited. I've heard of several sites having literally thousands of feature files and step definitions and no way to manage them systematically. A bit of a nightmare perhaps.

To address the 'specification by example isn't enough' challenge: SBE isn't enough, and I demonstrate/discuss that here. Although some trivial requirements can be specified by example, most can't, so you need a separate requirement statement to supplement the scenarios/examples and fully describe the requirement.

This doesn't sound very Agile, but I'm not talking Agile here necessarily. I understand that some teams can live with minimalist stories and the spec is the code. I'm talking about teams that require an accurate definition of the requirement and want to drive the creation of tests from stories and scenarios. This need could apply to all project styles and not just Agile.

Gojko Adzic talks about the need for 'Key Examples' in his Specification by Example book. When I spoke to Gojko not too long ago and suggested that more specification content beyond examples is usually required, he agreed. If this is true, it doesn't mean that we need bloated requirements documents. The level of detail in a requirement (as captured by a BA) can be quite compact, because the precision of a business rule doesn't need heavy explanation – the scenarios and tabulated examples (if needed) do that for us.

Successful execution of 'key examples' is a necessary, but not usually sufficient, acceptance criterion. Developers definitely need more tests to cover edge cases, for example. (User) acceptance requires end-to-end tests and probably combinations of examples in sequence to fully satisfy the business users. (Although these types of tests are likely to be manual rather than automated.)

Some time ago, we wrote (a small amount of) code to generate Python unittest code directly from stories, scenarios and example tables, and it works fine. (All we need are different language templates to generate xUnit code in other programming languages.) The test code may be in xUnit format, but the story/scenarios define the content of the tests. xUnit code could drive any style of test in theory. We're also experimenting with generating Robot Framework code and HTML FitNesse tables directly. All seems feasible to me and, in principle, all that's required is a template to generate the correctly formatted output. Additional setup/teardown code and fixtures are held in pre-existing code.
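
To show the shape of the idea – this is a fresh sketch for illustration, not our actual generator, and all names and values are invented – a scenario and its example table can be pushed through a template to produce a unittest test class:

    from string import Template

    # A scenario as it might come out of a story, with an example table.
    scenario = {
        "name": "validate_product_code",
        "rows": [("ABC1234", True), ("AB12345", False)],
    }

    TEMPLATE = Template(
        "import unittest\n"
        "\n"
        "class Test_${name}(unittest.TestCase):\n"
        "    def test_examples(self):\n"
        "        for code, expected in ${rows}:\n"
        "            self.assertEqual(is_valid_product_code(code), expected)\n"
    )

    # str() of the rows list is a valid Python literal, so it can be
    # substituted straight into the generated source.
    code = TEMPLATE.substitute(name=scenario["name"], rows=scenario["rows"])
    print(code)  # generated source, ready to write to a test module

Swapping the template is all it should take to emit Robot Framework keywords or FitNesse tables instead of unittest code.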

Our SaaS product SP.QA can be used to capture requirements and the range of stories/scenarios that example them. Since the requirements and stories are managed in a database and the test code is generated, the developers (or testers) only need to manage a single class or function for each story to implement the tests.

This has the distinct advantage that BAs can treat epic stories/sagas as requirements and build their hierarchical view. Stories that identify features and scenarios that example them can be refined over time. When they are 'trusted' the test code can be generated.

We're offering a commercial product, but I think even in the open source domain, the days of plain text story writing are numbered. We think all requirements tools will move in this direction. In the future, requirements tools will capture requirements and stories/scenarios that example those requirements so that they can be used to drive requirements reviews, to be a starting point for acceptance tests and be used to generate test code for developers and system testers.

When business analysts and users get their hands on these tools, then BDD will really take off.

Tags: #BDD #Python #Cucumber #xUnit #Fitnesse #RobotTestFramework

Paul Gerrard

First published 06/11/2009

If we believe the computer press, the E-Business revolution is here; the whole world is getting connected; that many of the small start-ups of today will become the market leaders of tomorrow; that the whole world will benefit from E-anyWordULike. The web offers a fabulous opportunity for entrepreneurs and venture capitalists to stake a claim in the new territory – E-Business. Images of the Wild West, wagons rolling, gold digging and ferocious competition over territory give the right impression of a gold rush.

Pressure to deliver quickly, using new technology, inexperienced staff, into an untested marketplace and facing uncertain risks is overwhelming. Where does all this leave the tester? In fast-moving environments, if the tester carps about lack of requirements, software stability or integration plans, they will probably be trampled to death by the stampeding project team. In high-integrity environments (where the Internet has made little impact, thankfully), testers have earned the grudging respect of their peers because the risk of failure is unacceptable and testing helps to reduce or eliminate risk. In most commercial IT environments, however, testers are still second-class citizens on the team. Is this perhaps because testers, too often, become anti-risk zealots? Could it be that testers don't acclimatise to risky projects because we all preach 'best practices'?

In all software projects, risks are taken. In one way, testing in high-integrity environments is easy. Every textbook process, method and technique must be used to achieve an explicit aim: to minimise risk. It’s a no-brainer. In fast-moving E-Business projects, risk taking is inevitable. Balancing testing against risk is essential because we never have the time to test everything. It’s tough to get it ‘right’. If we don’t talk to the risk-takers in their language we’ll never get the testing budget approved.

So, testers must become expert in risk. They must identify failure modes and translate these into consequences to the sponsors of the project. 'If xxx fails (and it is likely, if we don’t test), then the consequence to you, as sponsor is...' In this way, testers, management, sponsors can reconcile the risks being taken to the testing time and effort.

How does this help the tester? Firstly, the decision to do more or less testing is arrived at by consensus (no longer will the tester lie awake at night thinking: 'am I doing enough testing?'). Second, the decision is made consciously by those taking the risk. Third, it makes explicit the tests that will not be done – the case for doing more testing was self-evident, but was consciously overruled by management. Fourth, it makes the risks being taken by the project visible to all.

Using risk to prioritise tests means that testers can concentrate on designing effective tests to find faults and not worry about doing ‘too little’ testing.

What happens at the end of the test phase, when time has run out and there are outstanding incidents? If every test case and incident can be traced back to a risk, the tester can say, 'at this moment, here are the risks of release'. The decision to release needn’t be an uninformed guess. It can be based on an objective assessment of the residual risk.
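
As a sketch of what that traceability can look like (the risks, tests and statuses below are invented for illustration):

    # Trace tests back to the risks they address, then report residual risk.
    risks = {
        "R1: payment fails under load": ["T10", "T11"],
        "R2: order total mis-calculated": ["T20"],
        "R3: personal data exposed": ["T30"],
    }
    test_status = {"T10": "passed", "T11": "passed",
                   "T20": "failed", "T30": "not run"}

    for risk, tests in risks.items():
        if any(test_status[t] == "failed" for t in tests):
            state = "risk materialised (failures found)"
        elif all(test_status[t] == "passed" for t in tests):
            state = "risk addressed"
        else:
            state = "risk remains (tests not yet run)"
        print(risk + ": " + state)

Even a report this crude answers the question 'what are the risks of release, right now?' in the sponsor's language.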

Adopting a risk-based approach changes the definition of ‘good’ testing. Our testing is good if it provides evidence of the benefits delivered and of the current risk of release, at an acceptable cost, in an acceptable timeframe. Our testing is good if, at any time during the test phase, we know the status of benefits, and the risk of release. No longer need we wait a year after release before we know whether our testing is perfect (or not). Who cares, one year later anyway?

In a recent E-Business project, we identified 82 product risks of concern. Fewer than 10 had anything to do with functionality. In all E-Business projects, non-functional problems such as usability, browser configuration, performance, reliability and security seem to dominate people's concerns. We used to think of software product risks in one dimension (functionality) and concentrate on that. The number and variety of the risks of E-Business projects forces us to take a new approach.

It could be said that in the early 1990s the tester community began to emerge and gain a voice in the computer industry. Speaking the language of risk, that voice will make testers effective in the ambitious projects coming in the next millennium.

Paul Gerrard, February 2000

Tags: #risk #e-businesstesting #language

Paul Gerrard

First published 06/12/2009

Are you ever asked as a tester, “is the system good enough to ship?” Given our normal experience, where we are never given enough time to do the testing, the system cannot be as good as it should be. When the time comes to make the release decision, how could you answer that question? James Bach introduced the idea called ‘Good Enough’ in 1997 (Bach, 1997). It is helpful in understanding the risk-based test approach, as it seems to hold water as a framework for release decision-making, at least in projects where risks are being taken. So, what is “Good Enough” and how does it help with the release decision?

Many consultants advocate ‘best practices’ in books and conferences. Usually, they preach perfection and they ask leading questions like, “would you like to improve your processes?”, “do you want zero defects?” Could anyone possibly say “no” to these questions? Of course not. Many consultants promote their services using this method of preaching perfection and pushing mantras that sound good. It’s almost impossible to reject them.

Good enough is a reaction to this compulsive formalism, as it is called. It’s not reasonable to aim at zero-defects in software and your users and customers never expect perfection, so why do you pretend that you’re aiming at perfection? The zero-defect attitude just doesn’t help. Compromise is inevitable and you always know it’s coming. The challenge ahead is to make a release decision for an imperfect system based on imperfect information.

The definition of “Good Enough” in the context of a system to be released is:

  1. X has sufficient benefits.
  2. X has no critical problems.
  3. The benefits of X sufficiently outweigh the problems.
  4. In the present situation, and all things considered, improving X would cause more harm than good.
  5. All the above must apply.

To expand on this rather terse definition: X (whatever X is) having sufficient benefits means that enough of the system is deemed to work for us to take it into production, use it, get value and get the benefit. It has no critical problems, i.e. there are no severe faults that make it unusable or unacceptable. At this moment in time, with all things considered, if we spend time trying to perfect X, that time is probably going to cost us more than shipping early with the known problems. This framework allows us to release an imperfect system early because the benefits may be worth it. How does testing fit into this good enough idea?

Firstly, have sufficient benefits been delivered? The tests that we execute must at least demonstrate that the features providing the benefits are delivered completely, so that we have evidence of this. Secondly, are there any critical problems? Our incident reports give us the evidence of the critical problems and many others too. There should be no critical problems for it to be good enough. Thirdly, is our testing good enough to support this decision? Have we provided sufficient evidence to say these risks are addressed and those benefits are available for release?

It is not for a tester to decide whether the system is good enough. An analogy that might help here is to view the tester as an expert witness in a court of law. The main players in this courtroom scene are:

  • The accused (the system under test).
  • The judge (project manager).
  • The jury (the stakeholders).
  • Expert witness (the tester).

In our simple analogy, we will disregard the lawyers’ role. (In principle, they act only to extract evidence from witnesses). Expert witnesses are brought into a court of law to find evidence and present that evidence in a form for laymen (the jury) to understand. When asked to present evidence, the expert is objective and detached. If asked whether the evidence points to guilt or innocence, the expert explains what inferences could be made based on the evidence, but refuses to judge innocence or guilt. In the same way, the software tester might simply state that based on evidence “these features work, these features do not work, these risks have been addressed, these risks remain”. It is for others to judge whether this makes a system acceptable.

The tester simply provides information for the stakeholders to make a decision. Adopting this position in a project seems to be a reasonable one to take. After all, testers do not create software or software faults; testers do not take the risks of accepting a system into production. Testers should advocate to their management and peers this independent point of view. When asked to judge, whether a system is good enough, the tester might say that on the evidence we have obtained, these benefits are available; these risks still exist. The release decision is someone else’s decision to make.

However, you know that the big question is coming your way, so when you are asked, “is it ready?” what should you do? You must help the stakeholders make the decision, but not make it for them. The risks – those problems that we thought, some months ago, could occur and which, in your opinion, would make the system unacceptable – might still exist. Based on the stakeholders' own criteria, the system cannot now be acceptable unless they relax their perceptions of the risk. The judgement on outstanding risks must be as follows:

  • There is enough test evidence now to judge that certain risks have been addressed.
  • There is evidence that some features do not work (the feared risk has materialised).
  • Some risks remain (tests have not been run, or no tests are planned).

This might seem like an ideal independent position that testers could take but you might think that it is unrealistic to think one can behave this way. However, we believe this stance is unassailable since the alternative, effectively, is for the tester to take over the decision making in a project. You may still be forced to give an opinion on the readiness of a system, but we believe taking this principled position (at least at first) might raise your profile and credibility with management. They might also recognise your role in projects in future – as an honest broker.

References

Bach, J. (1997), Good Enough Quality: Beyond the Buzzword, IEEE Computer, August 1997, pp. 96-98

Paul Gerrard, July 2001



Tags: #risk-basedtesting #good-enough

Paul Gerrard

First published 06/11/2009

Presentation to SAST on Risk-Based Testing (PowerPoint PPT file) – This talk is an overview of Risk-Based Testing presented to the Swedish Association of Software Testing (SAST): Why do Risk-Based Testing?, Introduction to Risk, Risks and Test Objectives, Designing the Test Process, Project Intelligence, Test Strategy and Reporting.

Risk – The New Language of E-Business Testing. This talk expands the theme of Risk-Based Testing introduced below. It focuses on E-Business and presents more detail on Risk-Based Test Planning and Reporting. It has been presented to the BCS SIGIST in London and is the opening keynote for EuroSTAR 2000.

Risk-Based Testing – longer introduction. This talk presents a summary of what risk-based testing is about. It introduces risk as the new language of testing and discusses the four big questions of testing: How much testing is enough? When should we stop testing? When is the product good enough? How good is our testing? Metrics (or at least counting bugs) don't give us the answer. The risk-based approach to testing can perhaps help us answer these questions, but it demands that we look at testing from a different point of view. A polemic.

Registered users can download the paper from the link below. If you aren't registered, you can register here.

Tags: #risk-basedtesting

Paul Gerrard

First published 30/03/2007

The raw materials of real engineering – steel, concrete, water, air, soil, electromagnetic waves, electricity – obey the laws of physics.

Software, of course, does not. Engineering is primarily about meeting trivial functional requirements and complex technical requirements using materials that obey the laws of physics.

I was asked recently whether the definitions – Functional and Non-Functional – are useful.

My conclusion was that, at the least, they aren't helpful; at worst, they're debilitating. There are probably half a dozen other themes in the initial statement, but I'll stick to this one.

There is a simple way of looking at F v NF requirements. FRs define WHAT the system must do. NFRs define HOW the system delivers that functionality – e.g. is it secure, responsive, usable, etc.

To call anything 'not something else' can never be intuitively correct, I would suggest, if you need that definition to understand the nature of the concept in hand. It's a different dimension, perhaps. Non-functional means 'not working', doesn't it?

Imagine calling something long, “not heavy”. It's the same idea and it's not helpful. It's not heavy because you are describing a different attribute.

So, to understand the nature of Non-Functional Requirements, it's generally easier to call them technical requirements and have done with it.

Some TRs are functional, of course, and that's another confusion. Access control to data and function is a what, not a how. Security vulnerabilities are, in effect, functional defects: the system does something we would rather it didn't. Pen testing is functional testing. Security invulnerability is a functional requirement – it's just that most folk are overcome by the potential variety of threats. Pen tests use a lot of automation using specialised tools. But they are specialised, not non-functional.

These are functional requirements just like the stuff the users actually want. Installability, documentation, procedure and maintainability are ALL functional requirements, and all functionally tested.

The other confusion is that functional behaviour is Boolean. It works or it doesn't work. Of course, you can count the number of trues and falses, but that is meaningless. 875 out of 1000 test conditions pass. It could be expressed as a percentage – 87.5% – but what exactly does that mean? Not much, until you look into the detail of the requirements themselves. One single condition could be several orders of magnitude more important than another. Apples and oranges? Forget it. Grapes and vineyards!

Technical behaviour is usually measurable on a linear scale. Performance and reliability, for example (if you have enough empirical data to be significant), are measured numerically. (OK, you can say that meets v doesn't meet requirements is a Boolean, but you know what I mean.)

Which brings me to the point.

In proper engineering, say civil/structural... (And betraying a prejudice, structural is engineering, civil includes all sorts of stuff that isn't...)

In structural engineering, for example, the Functional requirements are very straightforward. With a bridge – say the Forth Bridge or the Golden Gate, both built a long, long time ago – the Functional requirements are trivial: “Support two railway lines/four lanes of traffic travelling in both directions (and a footbridge for maintenance).”

The Technical requirements are much more complex. 100% of the engineering discipline is focused on technical requirements. Masses of steel, cross-sections, moments, stresses and strains. Everything is underpinned by the science of materials (which are extensively tested in laboratories, with safety factors applied), and tabulated in blue or green books full of cross-sectional areas, beam lengths, cement/water ratios and so on. All these properties are calculated based on thousands of laboratory experiments, with statistical techniques applied to come up with factors of safety. Most dams, for example, are not 100% safe for all time. They are typically designed to withstand 1-in-200-year floods. And they fail safely, because one guy in the design office is asked to explore the consequences of failure – which in the main are predictable.

Software does not obey the laws of physics.

Software development is primarily about meeting immensely complex functional requirements and relatively simple technical requirements using some ethereal stuff called software that very definitely does not obey laws at all. (Name one? Please?)

Functional testing is easy; meeting functional requirements is not. Technical testing is also easy; meeting technical requirements is (comparatively) easy.

This post isn't about “non-functional requirements versus functional requirements”. It's an argument saying ALL requirements are hard to articulate and meet. So there.

Tags: #ALF

Paul Gerrard

First published 06/02/2010

This was one of two presentations at the fourth Test Management Summit in London on 27 January 2010.

I don't normally preach about test automation as, quite frankly, I find the subject boring. The last time I talked about automation was maybe ten years ago. This was the most popular topic in the session popularity survey we ran a few days before the event, and it was very well attended.

The PowerPoint slides were written on the morning of the event while other sessions were taking place. It would seem that a lot of deep-seated frustrations with regression testing and automation came to the fore. The session itself became something of a personal rant and caused quite a stir.

The slides have been amended a little to include some of the ad-hoc sentiments I expressed in the session and also to clarify some of the messages that came 'from the heart'.

I hope you find it interesting and/or useful.

Regression Testing – What to Automate and How

Tags: #testautomation #automation #regressiontesting

Paul Gerrard