First published 03/07/2006

A couple of weeks ago, after the BCS SIGiST meeting I was chatting to Martin Jamieson (of BT) about tools that test 'beneath the GUI'. A while later, he emailed a question...

At the recent SIGIST Johanna Rothman remarked that automation should be done below the level of the GUI. You then stood up and said you're working on a tool to do this. I was explaining to Duncan Brigginshaw yesterday that things are much more likely to change at the UI level than at the API level. I gave him your example of programs deliberately changing the html in order to prevent hacking – is that right? However, Duncan tells me that he recommends automating at the UI. He says that the commercial tools have ways of capturing the inputs which shield the tester from future changes e.g. as objects. I think you've had a discussion with Duncan and am just wondering what your views are. Is it necessary to understand the presentation layers for example?

Ultimately, all GUI interfaces need testing of course. The rendering and presentation of HTML, the execution of JavaScript, ActiveX and Java objects obviously need a browser involved for a user to validate their behaviour. But Java/ActiveX can be tested through drivers written by programmers (and many are).

Typically, JavaScript isn't directly accessible to GUI tools anyway (it is mostly used for field validation and manipulation, screen formatting and window management). You can write whole applications in JavaScript if you wish.

But note that I'm saying that a browser is essential for a user to validate layout and presentation. If you go down the route of using a tool to automate testing of the entire application from GUI through to server based code, you need quite sophisticated tools, with difficult to use scripting (programming) languages. And lo and behold, to make these tools more usable/accessible to non-programmers, you need tools like AXE to reduce (sometimes dramatically) the complexity of the scripting language required to drive automated tests.

Now, one of the huge benefits of these kinds of testing frameworks, coupled with 'traditional' GUI test tools is they allow less technical testers to create, manage and execute automated tests. But, if you were to buy a Mercury WinRunner or QTP license plus an AXE licence, you'd be paying 6k or 7k PER SEAT, before discounts. This is hugely expensive if you think about what most automated tools are actually used for – compared with a free tool that can execute tests of server-based code directly.

Most automated tools are used to automate regression tests. Full stop. I've hardly ever met a system tester who actually set out to find bugs with tools. (I know 'top US consultants' talk about such people, but they seem to exist as a small minority.) What usually happens is that the tester needs to get a regression test together. Manual tests are run; when the software is stable and tests pass, they get handed over to the automation folk. I know, I know, that AXE and tools like it allow testers to create automated test runs. However, with buggy software, you never get past the first run of a new test. So much for running the other 99 using the tool – why bother?

Until you can run the other 99, you don't know whether they'll find bugs anyway. So folk resort to running them manually because you need a human being checking results and anomalies, not a dumb tool. The other angle is that most bugs aren't what you expect – by definition. For example, checking a calculation result might be useful, but the tab order, screen validation, navigation, window management/consistency, usability and accessibility AREN'T in your prepared test plan anyway. So much for finding bugs proactively using automation. (Although bear in mind that free/cheap tools exist to validate HTML, accessibility, navigation and validation).

And after all this, be reminded, the calculated field is actually generated by the server based code. The expected result is a simple number, state variable or message. The position, font, font size and 57 other attributes of the field it appears in are completely irrelevant to the test case. The automated tool is, in effect, instructed by the framework tool to ignore these things, and focus on the tester's predicted result.

It's interesting (to me anyway) that the paper that is most downloaded from the Gerrard Consulting website is my paper on GUI testing. It was written in 1997. It gets downloaded between 150 and 250 times a month approximately. Why is that for heaven's sake – it's nine years old! The web isn't even mentioned in the paper! I can only think people are obsessed with the GUI and haven't got a good understanding of how you 'divide and conquer' the complex task of testing a GUI into simpler tasks: some that can be automated beneath the GUI, some that can be automated using tools other than GUI test running tools, some that can be automated using GUI test running tools and some that just can't be automated. I'm sure most folk with tools are struggling to meet higher than realistic expectations.

So, what we are left with is an extremely complex product (the browser), being tested by a comparably (probably more) complex product, being controlled by another complex product to make the creation, execution and evaluation of tests of (mainly) server-based software an easy task. Although it isn't of course. Frameworks work best with pretty standard websites or GUI apps with standard technologies. Once you go off the beaten track, the browser vendor, the GUI tool vendor and the framework vendor all need to work hard to make their tools compatible. But all must stick to the HTTP 1.1 protocol which is 10 years(?) old. How many projects set themselves up to use bleeding edge technology, and screw the testers as a consequence? Most, I think.

So. There we have it. If you are fool enough to spend £4-5,000 per seat on a GUI tool, you then need to be smart enough to spend another £2,000 or so on a Framework (PER USER).

Consider the alternative.

Suppose you knew a little about HTML/HTTP etc. Suppose you had a tool that allowed you to get web pages, interpret the HTML, insert values to fields, submit the form, execute the server based form handler, receive the generated form, validate the form in terms of new field values, save copies of the received forms on your PC, compare those forms with previously received forms, and deal with the vagaries of secure HTTPS, and ignore the complexities of the user interface. The tool could have a simple script language, based on keywords/commands, stored in CSV files, managed by Excel.
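To make the idea concrete, here is a minimal sketch of a keyword script runner of the kind described above. It is not the tool discussed in this post: the command names (GET, SET, SUBMIT, EXPECT), the `run_script` function and the `fake_transport` stand-in are all assumptions made for illustration. The transport is passed in as a function so the sketch runs offline; a real driver would replace it with an HTTP/HTTPS client.

```python
import csv
import io
import re


def run_script(csv_text, transport):
    """Run a keyword script and return (line_no, passed, detail) per check.

    transport(method, url, fields) must return the response body as text.
    """
    fields = {}   # values queued up for the next form submission
    page = ""     # body of the most recent response
    results = []
    for line_no, row in enumerate(csv.reader(io.StringIO(csv_text)), start=1):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        command = row[0].strip().upper()
        args = [a.strip() for a in row[1:]]
        if command == "GET":
            page = transport("GET", args[0], {})
        elif command == "SET":
            fields[args[0]] = args[1]
        elif command == "SUBMIT":
            page = transport("POST", args[0], dict(fields))
            fields.clear()
        elif command == "EXPECT":
            # The expected result is a regular expression, so partial matches work.
            results.append((line_no, re.search(args[0], page) is not None, args[0]))
        else:
            results.append((line_no, False, "unknown command: " + command))
    return results


def fake_transport(method, url, fields):
    """Hypothetical stand-in for the server: prices an order at 5 per item."""
    if method == "POST":
        return "<p>Total: %d</p>" % (int(fields["quantity"]) * 5)
    return '<form action="/order"><input name="quantity"></form>'


# The script itself is plain CSV, so it can be maintained in a spreadsheet.
SCRIPT = """\
GET,http://example.test/order
EXPECT,quantity
SET,quantity,3
SUBMIT,http://example.test/order
EXPECT,Total: 15
"""

results = run_script(SCRIPT, fake_transport)
for line_no, passed, detail in results:
    print(line_no, "PASS" if passed else "FAIL", detail)
```

Keeping the transport separate from the script interpreter is a design choice for the sketch only: it lets the same script run against a stub, a test server or the live site.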

If the tool could scan a form, not yet tested, and generate the script code to set the values for each field in the form, you'd have a basic but effective script capture facility. Cut and paste into your CSV file and you have a pretty effective tool. Capture the form map (not gui – you don't need all that complexity, of course) and use the code to drive new test transactions.
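A form scan of this kind could be sketched as follows, using only the Python standard library. The `FormScanner` class, the `generate_script` function and the `SET,name,value` line format are hypothetical names chosen for the example, not the format used by the tool described here.

```python
import csv
import io
from html.parser import HTMLParser


class FormScanner(HTMLParser):
    """Collect (name, value) for every named input, select and textarea."""

    def __init__(self):
        super().__init__()
        self.fields = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("input", "select", "textarea"):
            return
        attr = dict(attrs)
        # Buttons carry no test data, and unnamed fields are never submitted.
        if attr.get("name") and attr.get("type") not in ("submit", "button"):
            self.fields.append((attr["name"], attr.get("value") or ""))


def generate_script(html):
    """Return one 'SET,name,value' CSV line per field found in the HTML."""
    scanner = FormScanner()
    scanner.feed(html)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for name, value in scanner.fields:
        writer.writerow(["SET", name, value])
    return out.getvalue()


FORM = """
<form action="/order" method="post">
  <input type="text" name="customer" value="">
  <input type="hidden" name="session" value="abc123">
  <select name="colour"><option>red</option></select>
  <input type="submit" value="Go">
</form>
"""

print(generate_script(FORM))
```

The generated lines are a starting point: the tester pastes them into the CSV script and fills in the values each test needs.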

That's all pretty easy. The tool I've built does 75-80% of this now. My old Systeme Evolutif website (including online training) had 17,000 pages, with around 100 Active Server Pages script files. As far as I know, there's nothing the tool cannot test in those 17,000 pages. Of course most are relatively simple. But they are only simple in that they use a single technology. There are thousands of lines of server-based code. If/as/when I create a regression test pack for the site, I can (because the tool is run on the command line) run that test every hour against the live site. (Try doing that with QTP). If there is a single discrepancy in the HTML that is returned, the tool would spot it of course. I don't need to use the GUI to do that. (One has to assume the GUI/browser behaves reliably though).

Beyond that, a regression test based on the GUI appearance would never spot things in the HTML unless you wrote code specifically to do that. Programmers often place data in hidden fields. By definition, hidden fields never appear on the GUI. GUI tools would never spot a problem – unless you wrote code to validate HTML specifically. Regression tests focus on results generated by server-based code. Requirements specify outcomes that usually do not involve the user interface. In most cases, the user interface is entirely irrelevant to the successful outcome of a functional test. So, a test tool that validates the HTML content is actually better than a GUI tool (please note). By the way, GUI tools don't usually have very good partial matching facilities. With code-based tools, you can use regular expressions (Regexs). Much better control for the tester than GUI tools.
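As an illustration of both points, the sketch below pulls hidden fields out of a returned page and checks one of them with a regular expression rather than an exact match. The page content, the field names (`order_ref`, `basket_total`) and the reference format are invented for the example.

```python
import re
from html.parser import HTMLParser


class HiddenFields(HTMLParser):
    """Collect the name and value of every <input type="hidden">."""

    def __init__(self):
        super().__init__()
        self.values = {}

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "input" and (attr.get("type") or "").lower() == "hidden":
            self.values[attr.get("name")] = attr.get("value") or ""


def hidden_fields(html):
    parser = HiddenFields()
    parser.feed(html)
    return parser.values


PAGE = """<form>
<input type="hidden" name="order_ref" value="ORD-2006-000123">
<input type="hidden" name="basket_total" value="42.50">
<input type="text" name="comment">
</form>"""

fields = hidden_fields(PAGE)

# Partial match: the format of the reference matters, the serial number does not.
ref_ok = re.fullmatch(r"ORD-\d{4}-\d{6}", fields["order_ref"]) is not None
# Exact match: the total is the predicted result of the test.
total_ok = fields["basket_total"] == "42.50"
print(fields, ref_ok, total_ok)
```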

Finally. If you use a tool to validate returned messages/HTML, you can get the programmer to write code that syncs with the test tool. A GUI with testability! For example, the programmer can work with the tester to provide the 'expected result' in hidden fields. Encrypt them if you must. The developer can 'communicate' directly with the tester. This is impossible if you focus on the GUI. It's really quite hard to pass technical messages in the GUI without the user being aware.

So. A tool that drives server-based code is more useful (to programmers in particular because you don't have the unnecessary complexities of the GUI). They work directly on the functionality to be tested – the server based code. They are simpler to use. They are faster (there's no browser/GUI and test tool in the way). They are free. AND they are more effective in many (more than 50%?) cases.

Where such a tool COULD be used effectively, who in their right mind would choose to spend £6,000-7,000 per tester on LESS EFFECTIVE products?

Oh, and did I say, the same tool could test all the web protocols MAIL, FTP etc. and could easily be enhanced to cover web services (SOAP, WSDL blah blah etc.) – the next big thing – but actually services WITHOUT a user interface! Please don't get me wrong, I'm definitely not saying that GUI automation is a waste of time!

In anything but really simple environments, you have to do GUI automation to achieve coverage (whatever that means) of an application. However, there are aspects of the underlying functionality that can be tested beneath the GUI and sometimes it can be more effective to do that, but only IF there aren't complicated technical issues in the way (that would be hidden behind the GUI and the GUI tool ignores them).

What's missing in all this is a general method that guides testers to using manual, automation above or below the GUI. Have you ever seen anything like that? One of the main reasons people get into trouble with automation is because they have too high expectations and are overambitious. It's the old 80/20 rule. 20% of functionality dominates the testing (but could be automated). Too often, people try and automate everything. Then 80% of the automation effort goes on fixing the tool to run tests of the least important 20% of tests. Or something like that. You know what I mean. The beauty of frameworks is they hide the automation implementation details from the tester. Wouldn't it be nice if the framework SELECTED the optimum automation method as well? I guess this should depend on the objective of a test. If the test objective doesn't require use of the GUI – don't use the GUI tool! Current frameworks have 'modes' based on the interfaces to the tools. Either they do GUI stuff, or they do Webservices stuff or... But a framework ought to be able to deal with gui, under the gui, web services, command-line stuff etc. etc. Just a thought.

I feel a paper coming on. Maybe I should update the 1997 article I wrote!

Thanks for your patience and triggering some thoughts. Writing the email was an interesting way to spend a couple of hours, sat in a dreary hotel room/pub.

Posted by Paul Gerrard on July 4, 2006 03:08 PM


Good points. In my experience the GUI does change more often than the underlying API. But often, using the ability of LR to record transactions is still quicker than hoping I've reverse-engineered the API correctly. More than once I've had to do it without any help from developers or architects. ;–)


Paul responds:

Thanks for that. I'm interested to hear you mention LR (I assume you mean Load Runner). Load Runner can obviously be used as an under the bonnet test tool. And quite effective it is too. But one of the reasons for going under the bonnet is to make life simpler, and as a consequence, a LOT cheaper.

There are plenty of free tools (and scripting languages with neat features) that can be perhaps just as effective as LR in executing basic transactions – and that's the point. Why pay for incredibly sophisticated tools that compensate for each other, when a free simple tool can give you 60, 70, 80% of what you need as a functional tester?

Now LR provides the facilities, but I wouldn't recommend LR as a cheap tool! What's the going rate for an LR license nowadays? $20k, $30k?

Thanks. Paul.

First published 16/06/2010

The third in a series of articles on Anti-Regression Approaches has been posted here: Part III: Regression Testing

Part I: Introduction & Impact Analysis can be found here.

Part II: Regression Prevention and Detection Using Static Techniques can be found here.

The next article will focus on Test Automation.

First published 19/07/2010

In Parts I and II of this article series, we introduced the nature of regression, impact analysis and regression prevention. In Part III we looked at Regression Testing and how we select regression tests. This article focuses on the automation of regression testing.

Automated Regression Testing is One Part of Anti-Regression

Sometimes it feels like more has been written about test automation, especially GUI test automation, than any other testing subject. My motivation in writing this article series was that most things of significance in test automation had been said 8, 10 or 15 years ago and not that much progress has been made since (notwithstanding the varying technology changes that have occurred). I suggest there’s been a lack of progress, because significant and sustained success with automation of (what is primarily) regression testing is still not assured. Evidence of failure, or at least troublesome implementations of automation, is still widespread.

My argument at the January 2010 Test Management Summit was that perhaps the reason for failure in test automation was that people didn’t think it through before they started. In this context, ‘started’ often means getting a good deal on a proprietary GUI Test Automation tool. It’s obvious – buying a tool isn’t the best first step. Automating tests through the user interface may not be the most effective way to achieve anti-regression objectives. Test automation may not be an effective approach at all. It certainly shouldn’t be the only one considered.

Test execution automation promises reliable, error-free, rapid, unattended test execution. In some environments, the promise is delivered, but in most – it is not. In the mid-1990s informal surveys revealed that very few (in one survey, less than 1% of) test automation users achieved ‘significant benefits’. The percentage is much higher nowadays – maybe as high as 50%, but that is probably because most practitioners have learnt their lessons the hard way. Regardless, success is not assured.

Much has been written on the challenges and pitfalls of test automation. The lessons learned by practitioners in the mid-90s are substantially the same as those facing practitioners today. I have to say that it’s a cause of some frustration that many companies still haven’t learnt them.

In this article, there isn’t space to repeat those lessons. The referenced papers, books and blogs at the end of this article focus on implementing automation, primarily from a user interface point of view, and sometimes as an end in itself. To complement these texts, to bring them up to date and focus them on our anti-regression objective, the remainder of this article will set out some wider considerations.

Regression test objectives and (or versus?) automation

The three main regression test objectives are set out below together with some suggestions for test automation. Although the objectives are distinct, the differences between regression testing and automation for the three objectives are somewhat blurred.

Anti-Regression Objective, Source of Tests and Automation Considerations

1. To detect unwanted changes to trusted functionality.
  • Source of tests: functional system tests; integration tests.
  • Automation considerations: consider the criteria in references 6, 7, 8. Most likely to be automated using drivers to component and sub-system interfaces.
2. To detect unwanted changes (to support technical refactoring).
  • Source of tests: test-first, test-driven environments generate automated tests naturally.
  • Automation considerations: consider reference 9 and the discussion of testing in TDD and Agile in general.
3. To demonstrate to stakeholders that they can still do business.
  • Source of tests: acceptance tests, business process flows, ‘end to end’ tests.
  • Automation considerations: consider the criteria in references 6, 7, 8 but expect mostly manual testing for demonstration purposes. See reference 10 for an introduction to Acceptance Driven Development.

Regression objectives reframed: detecting regression v providing confidence

Of the three regression test objectives above, objectives 1 and 2 are similar. What differentiates them is who (and where) they come from. Objective 1 comes from a system supplier perspective and tests are most likely to be sourced from system or integration tests that were previously run (either manually or automated). Objective 2 comes from a developer or technical perspective where the aim is to perform some refactoring in a safe environment. By and large, ‘safe’ refactoring is most viable in a Test-Driven environment where all unit tests are automated, probably in a Continuous Integration regime. (Although refactoring at any level benefits from automated regression testing).

If objectives 1 and 2 require tests to demonstrate ‘functional equivalence’, regression test coverage can be based on the need to exercise the underlying code and cover the system functionality. Potentially, tests based on equivalence partitioning ought to cover the branches in code (but not housekeeping or error-handling functionality – but see below). Tests covering edge conditions or boundary values should verify the ‘precision’ of those decisions. So a reasonable guideline could be – use automation to cover functional paths through the system and data-drive those tests to expand the coverage of boundary conditions. The automation does not necessarily have to execute tests that would be recognisable to the user, if the objective is to demonstrate functional equivalence.
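The guideline can be illustrated with a small data-driven test. This is only a sketch: the `shipping_charge` rule (orders of 5000 pence or more ship free) is a hypothetical function under test, and the table layout and `run_cases` helper are assumptions made for the example. One test procedure covers the functional path, and each extra data row extends coverage to another partition or boundary.

```python
def shipping_charge(order_pence):
    """Hypothetical rule under test: orders of 5000 pence or more ship free."""
    if order_pence < 0:
        raise ValueError("order value cannot be negative")
    return 0 if order_pence >= 5000 else 495


# One row per test: (description, input, expected outcome).
CASES = [
    ("partition: small order", 1000, 495),
    ("boundary: zero", 0, 495),
    ("boundary: just below threshold", 4999, 495),
    ("boundary: on threshold", 5000, 0),
    ("boundary: just above threshold", 5001, 0),
]


def run_cases(func, cases):
    """Return the descriptions of the cases whose actual outcome differs."""
    return [name for name, value, expected in cases if func(value) != expected]


failures = run_cases(shipping_charge, CASES)
print("failures:", failures)
```

A regression that changes `>=` to `>` passes the partition test but fails the 'on threshold' row, which is the precision check the boundary values exist to provide.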

Objective 3 – to provide confidence to stakeholders – is slightly different. In this case, the purpose of a regression test is to demonstrate to end users that they can execute business transactions and use the system to support their business. In this respect, it may be that these tests could be automated and some automated tests that fall under categories 1 and 2 above will be helpful. But experience of testing GUI applications in particular suggests that end users sometimes only trust their own eyes and need to have a hands-on experience to give them the confidence that is required. Potentially, a set of automated tests might be used to drive a number of ‘end to end’ transactions, and reconciliation or control reports could be generated to be inspected by end users. There is a large spectrum of possibilities of course. In summary, automated tests could help, but in some environments, the need for manual tests as a ‘confidence builder’ cannot be avoided.

At what level(s) should we automate regression tests?

In Part III of this article series, we identified three levels at which regression testing might be implemented – at the component, system and business (or integrated system) levels. These levels should be considered as complementary and the choice is where to place emphasis, rather than which to include or exclude. The choice of automation at these levels is not really the point. Rather, a level of regression testing may be chosen primarily to achieve an objective, partly on the value of information generated and partly because of the ease with which the tests can be automated.

What are the technical considerations for automation?

At the most fundamental, technical level, there are two aspects of the system under test that must be considered: how the system under test is stimulated, and how the test outcomes of interest (with respect to regression) will be detected.

Mechanisms for stimulating the system under test

This aspect reflects how a test is driven by either a user or an automated tool. Nowadays, the number of user and technical interfaces in use is large – and growing. The most common are listed below, with some suggestions.

PC/Workstation-based applications and clients
  • Proprietary or open source GUI-object based drivers
  • Hardware (keyboard, video, mouse) based tools – physically connected to clients
  • Software based automation tools driving clients working across VNC connections
Browser/web-based applications
  • Proprietary object-based agents
  • Open source JavaScript-based agents
  • Open source script languages and GUI toolkits
Web-Server-based functionality (HTTP)
  • Proprietary or open source webserver/HTTP/S drivers
Web services
  • Proprietary or open source web services drivers
Mobile applications
  • Mobile OS simulators driven by integrated or separate GUI based toolkits
  • Typically java-based toolkits
Error, failure, spate, race conditions or other situations
  • May be simulated by instrumentation, load generation tools or manipulation of the test environment or infrastructure
  • Don’t forget that environmental conditions influence the behaviour of ALL systems under test.

There are an increasing number of proprietary and open source unit and acceptance testing frameworks available to manage and control the test execution engines above.

Outcome/Output detection and capture

A regression can be detected in as many ways as any outcome (output, change of state etc.) of a system can be exposed and detected. Here’s a list of common outcome/output formats that we may have to deal with. This is not a definitive list.

Browser-rendered output
  • The state of any object in the Document Object Model (DOM) exposed by a GUI tool
Any screen-based output
  • Image recognition by hardware or software based agents
Transaction response times
  • Any automated tool with response time capture capability
Database changes
  • Appropriate SQL or database query tool
Message output and content
  • Raw packets captured by network sniffers
  • Message payloads captured and analysed by protocol-specific tools
Client or server system resources
  • CPU, i/o, memory, network traffic etc. detected by performance monitors
Application or other infrastructure – changes of state
  • (Database, enterprise messaging, object request brokers etc. etc.) - dedicated system/resource monitors or custom-built instrumentation etc.
Changes in accessibility or usability (adherence to standards etc.)
  • Web page HTML scanners, character-based screen or report scanners or screen image scanners
Security (server)
  • Port scanning and server-penetration tools
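As one example from the list, database changes can be detected by taking a snapshot of a table before and after a test transaction and comparing the two. The sketch below uses Python's built-in `sqlite3` module with an in-memory database; the `orders` table, its columns and the `snapshot` and `changed_keys` helpers are hypothetical.

```python
import sqlite3


def snapshot(conn, table, key):
    """Return {key value: row as dict} for every row in the table.

    The table and key names are interpolated into the SQL, so they must
    come from the test script itself, never from untrusted input.
    """
    cur = conn.execute("SELECT * FROM %s ORDER BY %s" % (table, key))
    cols = [c[0] for c in cur.description]
    return {row[cols.index(key)]: dict(zip(cols, row)) for row in cur}


def changed_keys(before, after):
    """Keys of rows that were added, removed or modified between snapshots."""
    keys = set(before) | set(after)
    return sorted(k for k in keys if before.get(k) != after.get(k))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, "paid"), (2, "open")])

before = snapshot(conn, "orders", "id")
# Stand-in for the transaction under test.
conn.execute("UPDATE orders SET status = 'shipped' WHERE id = 2")
conn.execute("INSERT INTO orders VALUES (3, 'open')")
after = snapshot(conn, "orders", "id")

changes = changed_keys(before, after)
print(changes)
```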

Comparison of Outcomes

A fundamental aspect of regression testing is comparison of actual outcomes (in whatever format from whatever source above) to expected outcomes. If we are running a test again, the comparison is between the new ‘actual’ output/outcome and previously captured ‘baseline’ output/outcome.
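A baseline comparison can be sketched in a few lines with Python's standard `difflib` module. The `check` function and the in-memory `store` dictionary are assumptions for the example; a real regime would keep baselines in files or a repository.

```python
import difflib


def compare_to_baseline(baseline, actual):
    """Return unified-diff lines; an empty list means no difference found."""
    return list(difflib.unified_diff(
        baseline.splitlines(), actual.splitlines(),
        fromfile="baseline", tofile="actual", lineterm=""))


def check(name, actual, store):
    """Capture the baseline on the first run, compare against it afterwards."""
    if name not in store:
        store[name] = actual
        return []
    return compare_to_baseline(store[name], actual)


store = {}
first = check("order page", "<td>Total</td>\n<td>15</td>", store)
same = check("order page", "<td>Total</td>\n<td>15</td>", store)
diff = check("order page", "<td>Total</td>\n<td>16</td>", store)
print("\n".join(diff))
```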

Simple comparison functionality of numbers, text, system states, images, mark-up language, database content, reports, message payloads, system resource is not enough. We need to have a capability in our automation to:

Filter content: we may not need to compare ‘everything’. Subsets of database records, screen/image regions, branches or leaves in marked up text, some objects and states but not others etc. may be selected for comparison (from both actual and baseline content).

Mask content: within the content selected by filtering, we may wish to mask out certain patterns of content such as image regions that do not contain field borders; textual report columns or rows that contain dates/times, page numbers, varying/unique record ids etc.; screen fields or objects of certain colours, sizes, that are hidden/visible; patterns of text that can be matched using regular expressions and so on.

Calculate from content: the value, significance or meaning of content may have to be calculated: perhaps the number of rows displayed on a screen is significant; the error message, number or status code displayed on a screen image, extracted by text recognition; the result of a formula in which the variables are extracted from an outputted report and so on.

Identify content meeting/exceeding a threshold: the significance of output is determined by its proximity to thresholds such as: CPU, memory or network bandwidth usage compared to pre-defined limits; the value of a purchase order exceeds some limit; the response time of a transaction exceeds a requirement and so on.
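The four capabilities above can be sketched for a simple text report. Everything here is an assumption made for illustration: the report layout, the mask patterns, the function names and the response-time figures. Filtering is treated as selecting the lines to compare.

```python
import re

# Mask: patterns whose values legitimately vary from run to run.
MASKS = [
    (re.compile(r"\d{2}/\d{2}/\d{4} \d{2}:\d{2}"), "<TIMESTAMP>"),
    (re.compile(r"\bid=\d+"), "id=<ID>"),
]


def filter_lines(text, keep):
    """Filter: keep only the lines that match the `keep` pattern."""
    return [line for line in text.splitlines() if re.search(keep, line)]


def mask(line):
    """Mask: replace volatile content with fixed tokens."""
    for pattern, token in MASKS:
        line = pattern.sub(token, line)
    return line


def comparable(text, keep):
    return [mask(line) for line in filter_lines(text, keep)]


def over_threshold(samples, limit):
    """Threshold: names of the samples whose value exceeds the limit."""
    return sorted(name for name, value in samples.items() if value > limit)


BASELINE = "Report run 01/06/2010 09:15\nrow id=101 total=15\nrow id=102 total=20\nPage 1"
ACTUAL = "Report run 02/06/2010 10:42\nrow id=877 total=15\nrow id=878 total=20\nPage 1"

rows_match = comparable(BASELINE, r"^row") == comparable(ACTUAL, r"^row")
# Calculate: the number of detail rows is itself an outcome worth checking.
row_count = len(comparable(ACTUAL, r"^row"))
slow = over_threshold({"login": 0.8, "search": 2.4, "checkout": 3.1}, 2.0)
print(rows_match, row_count, slow)
```

Without the masks, every run would report a difference because the record ids change, which is exactly the noise that makes raw comparison unusable.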

System Architecture

The architecture of a system may have a significant influence over the choice of regression approach and automation in particular. An example will illustrate. An increasingly common software model is the MVC or model-view-controller architecture. Simplistically (from Wikipedia):

“The model is used to manage information and notify observers when that information changes; the view renders the model into a form suitable for interaction, typically a user interface element; the controller receives input and initiates a response by making calls on model objects. MVC is often seen in web applications where the view is the HTML or XHTML generated by the app. The controller receives GET or POST input and decides what to do with it, handing over to domain objects (i.e. the model) that contain the business rules and know how to carry out specific tasks such as processing a new subscription.”

A change to a ‘read-only’ view may be completely cosmetic and have no impact on models or controllers. Why regression test other views, models or controllers? Why automate testing at all – a manual inspection may suffice.

If a controller changes, the user interaction may be affected in terms of data captured and/or presented but the request/response dialogue may allow complete control of the transaction and examination of the outcome. In many situations, automated control of requests to and from controllers (e.g. HTTP GETs and POSTs) is easier to achieve than automating tests through the GUI (i.e. a rendered web page).
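The sketch below shows one way a controller might be driven directly through its request/response dialogue, with no browser and no network. It assumes the controller is exposed as a Python WSGI application; `subscription_controller`, the `/subscribe` path and the response format are hypothetical. The standard library's `wsgiref.util.setup_testing_defaults` fills in the rest of the request environment.

```python
import io
from urllib.parse import parse_qs, urlencode
from wsgiref.util import setup_testing_defaults


def subscription_controller(environ, start_response):
    """Hypothetical controller: reads POST fields, returns a text outcome."""
    size = int(environ.get("CONTENT_LENGTH") or 0)
    fields = parse_qs(environ["wsgi.input"].read(size).decode("utf-8"))
    email = fields.get("email", [""])[0]
    if "@" in email:
        status, body = "200 OK", "subscribed:" + email
    else:
        status, body = "400 Bad Request", "error:invalid email"
    start_response(status, [("Content-Type", "text/plain; charset=utf-8")])
    return [body.encode("utf-8")]


def post(app, path, fields):
    """Send a form POST straight to a WSGI app; return (status, body)."""
    payload = urlencode(fields).encode("utf-8")
    environ = {}
    setup_testing_defaults(environ)
    environ.update({
        "REQUEST_METHOD": "POST",
        "PATH_INFO": path,
        "CONTENT_TYPE": "application/x-www-form-urlencoded",
        "CONTENT_LENGTH": str(len(payload)),
        "wsgi.input": io.BytesIO(payload),
    })
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status

    body = b"".join(app(environ, start_response)).decode("utf-8")
    return captured["status"], body


good = post(subscription_controller, "/subscribe", {"email": "a@example.com"})
bad = post(subscription_controller, "/subscribe", {"email": "not-an-address"})
print(good, bad)
```

The same test data could later be replayed over real HTTP against a deployed server; only the `post` helper would change.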

Note that cross-browser test automation, to verify the behaviour and appearance of a system’s web pages across different browser types, for example, cannot be handled this way. (Some functional automation may be possible, but some usability/accessibility tests will always be manual).

It is clear that the number and variety of the ways a system can be stimulated and potentially regressive outcomes can be observed is huge. Few, if any tools, proprietary or open source, have all the capabilities we need. The message is clear – don’t ever assume the only way to automate regression testing is to use a GUI-based test execution tool!

Regression test automation – summary

In summary, we strongly advise you to bear in mind the following considerations:

  1. What is the outcome of your impact analysis?
  2. What are the objectives of your anti-regression effort?
  3. How could regressions manifest themselves?
  4. How could those regressions be detected?
  5. How can the system under test be stimulated to exercise the modes of operation of concern?
  6. Where in the development and test process is it feasible to implement the regression testing and automation?
  7. What technology, tools, harnesses, custom utilities, skills, resources and environments do you need to implement the automated regression test regime?
  8. What will be your criteria for automating (new or existing, manual) tests?

Test Automation References

  1. Brian Marick, 1997, Classic Testing Mistakes.
  2. James Bach, 1999, Test Automation Snake Oil.
  3. Cem Kaner, James Bach, Bret Pettichord, 2002, Lessons Learned in Software Testing, John Wiley and Sons.
  4. Dorothy Graham, Paul Gerrard, 1999, The CAST Report, Fourth Edition.
  5. Paul Gerrard, 1998, Selecting and Implementing a CAST Tool.
  6. Brian Marick, 1998, When Should a Test be Automated?
  7. Paul Gerrard, 1997, Testing GUI Applications.
  8. Paul Gerrard, 2006, Automation below the GUI (blog posting).
  9. Scott Ambler, 2002-10, Introduction to Test-Driven Design.
  10. Naresh Jain, 2007, Acceptance-Test Driven Development.

In the final article of this series, we’ll consider how an anti-regression approach can be formulated, implemented and managed and take a step back to summarise and recap the main messages of these articles.

Paul Gerrard 21 June 2010.

First published 14/04/2010


For some years, I’ve avoided getting too involved in test execution automation because I’ve felt it was boring. Yes, I know it has great promise and in principle, surely we should be offloading the tedious, repetitive, clerical tasks to tools. But I think the principles of regression testing haven’t changed and we’ve made little progress in the last fifteen years or so – we haven’t really moved on. It’s stuck in a time-warp.

I presented a talk on test automation at Eurostar 1997. Titled “Testing GUI Applications”, the paper I wrote is still the most popular one on my website with around 300 downloads a month. Why are people still interested in stuff I wrote so long ago? I think it was a good, but not groundbreaking paper; it didn’t mention the web; the recommendations for test automation were sensible, not radical. Books on the subject have been written since then. I’ve been meaning to update the paper for the new, connected world we now inhabit, but haven’t had the time so far.

But at the January 2010 Test Management Summit, I chose to facilitate the session, “Regression Testing: What to Automate and How”. In our build-up to the Summit the topic came top of the popularity survey. We had to incorporate it into the programme, but no one volunteered – so I picked it up. On the day, the frustrations I’ve held for a long time came pouring out and the talk became a rather angry rant. In this series of articles, I want to set out the thoughts I presented at the Summit.

I’m going to re-trace our early steps to automation and try and figure out why we (the testers) are still finding that building and running sustainable, meaningful automated regression test suites is fraught with difficulties. Perhaps these difficulties arise because we didn’t think it through at the beginning?

Regression tests are the most likely to be stable and run repeatedly so automation promises big time savings and, being automated, guarantees reliable execution and results checking. The automation choice seems clear. But hold on a minute!

We regression test because things change. Chaotic, unstable environments need regression testing the most. But when things change regularly, automation is very hard. And this describes one of the paradoxes of testing. The development environments that need and would benefit from automated regression testing are the environments that find it hardest to implement.

Rethinking Regression

What is the regression testing thought process? In this paper, I want to step through the thinking associated with regression testing. To understand why regressions occur, to establish what we mean by regression testing, why we choose to do it and automate it.

How do regressions occur?

Essentially, something changes and this impacts ‘working’ software. This could be the environment in which the software operates, an enhancement is implemented or a bug is fixed (and the change causes side-effects) and so on. It’s been said over many years that software code fixes have a 50% chance of introducing side-effects in working software. Is it 30% or 80%? Who cares? Change is dangerous; the probability of disaster is unpredictable; we have all suffered over the years.

Regressions have a disproportionate impact on rework effort, confidence and even morale. What can we do? The two approaches at our disposal are impact analysis (to support sensible design choices) to prevent regressions and regression testing – to identify regressions when they occur.

Impact Analysis

In assessing the potential damage that change can cause, the obvious choice is to not change anything at all. This isn’t as stupid a statement as it sounds. Occasionally, the value of making a change, fixing a bug, adding a feature is far outweighed by the risk of introducing new, unpredictable problems. All prospective changes need to be assessed for their potential impact on existing code and the likelihood of introducing unwanted side-effects. The problem is that assessing the risk of change – Impact Analysis – is extremely difficult. There are two viewpoints for impact analysis: The business view and the technical view.

Impact Analysis: Business View

The first is the user or business view: the prospective changes are examined to see whether they will impact the functionality of the system in ways that the user can recognise and approve of. Three types of functionality impact are common: business-, data- and process-impacted functionality.

Business-impacts often cause subtle changes in the behaviour of systems. An example might be where a change affects how a piece of data is interpreted: the price of an asset might be calculated dynamically rather than fixed for the lifetime of the asset. An asset stored at a location at one price, might be moved to another location at another price. Suddenly – the value of the non-existent asset at the first location is positive or even negative! How can that be? The software worked perfectly – but the business impact wasn’t thought through.

A typical data-impact would be where a data item required to complete a transaction is made mandatory, rather than optional. It may be that the current users rely on the data item being optional because at the time they execute the affected transaction the information is not known, but captured later. The ‘enhanced’ system might stop all business transactions going ahead or force the users to invent data to bypass the data validation check. Either way, the impact is negative.

Process-impacted functionality is where a change might affect the choices of paths through the system or through the business process itself. The change might for example cause a downstream system feature to be invoked where before it was not. Alternatively, a change might suppress the use of a feature that users were familiar with. Users might find they have to do unnecessary work or they have lost the opportunity to make some essential adjustment to a transaction. Wrong!

Impact Analysis: Technical View

With regards to the technical impact analysis by designers or programmers – there are a range of possibilities and in some technical environments, there are tools that can help. Very broadly, impact analysis is performed at two levels: top-down and bottom-up.

The top down analysis involves the consideration of the alternate design options and looking at their impact on the overall behaviour of the changed system. To fix a bug, enhance the functionality or meet some new or changed requirement, there may be alternative change designs to achieve these goals. A top-down approach looks at these prospective changes in the context of the architecture as a whole, the design principles and the practicalities of making the changes themselves. This approach requires that the designers/developers have an architectural view, but also a set of design principles or guidelines that steer designers away from bad practices. Unfortunately, few organisations have such a view or have design principles so embedded within their teams that they can rely on them.

The bottom-up analysis is code-driven. If the selected design approach impacts a known set of components that will change, the software that calls and depends upon the to-be-changed components can be traced. The higher-level services and features that ultimately depend on the changes can be identified and assessed. This sounds good in principle, especially if you have tools to generate call-trees and collaboration diagrams from code. But there are two common problems here.

The first problem is that the design integrity of the system as a whole may be poor. The dependencies between changed components and those affected by the changes may be numerous. If the code is badly structured, convoluted and looks like ‘spaghetti’, even the software experts may not be able to fathom this complexity and it can seem as though every part of the system is affected. This is a scary prospect.

The second problem is that the software changes may be at such a low level in the hierarchy of calling components that it is impractical to trace the impact of changes through to the higher level features. Although a changed component may be buried deep in the architecture, the effect of a poorly implemented software change may be catastrophic. You may know that a higher level service depends on a lower level component – the problem is, you cannot figure out what that dependency is to predict and assess the impact of the proposed change.
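The bottom-up tracing described above can be sketched in a few lines. This is only an illustration under stated assumptions: the call graph, the component names and the function names (invert, impacted_by) are all hypothetical, and a real code base would need a tool to extract the graph in the first place.

```python
from collections import deque

# Hypothetical call graph: each component maps to the components it calls.
CALLS = {
    "order_screen": ["pricing_service", "stock_service"],
    "invoice_report": ["pricing_service"],
    "pricing_service": ["tax_rules", "currency_utils"],
    "stock_service": ["warehouse_db"],
    "tax_rules": [],
    "currency_utils": [],
    "warehouse_db": [],
}


def invert(calls):
    """Build the reverse graph: component -> set of components that call it."""
    callers = {name: set() for name in calls}
    for caller, callees in calls.items():
        for callee in callees:
            callers.setdefault(callee, set()).add(caller)
    return callers


def impacted_by(changed, calls):
    """Return every component that directly or indirectly calls a changed one."""
    callers = invert(calls)
    seen = set()
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        for caller in callers.get(current, ()):
            if caller not in seen:
                seen.add(caller)
                queue.append(caller)
    return seen


# Which higher-level components depend on the low-level currency_utils?
print(sorted(impacted_by(["currency_utils"], CALLS)))
```

Note that the trace only tells you that a dependency exists. As the second problem above points out, it says nothing about what the dependency is or how the higher-level behaviour will be affected.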

All in all, Impact analysis is a tricky prospect. Can regression testing get us out of this hole?

To be continued...

Paul Gerrard 29 March 2010.

First published 10/05/2010

In Part I of this article series, we looked at the nature of regression and impact analysis. In this article we reframe impact analysis as a regression prevention technique; we compare technical and business impact analysis a little more and we discuss regression prevention and detection using business impact analysis and static code analysis. The next article (Part III) will focus exclusively on regression testing as a regression detection approach.

Regression Prevention and Regression Detection

Before we go any further, it’s worth exploring the relationship between impact analyses (used to prevent regressions) and testing (used to detect regressions). We looked at impact analysis from both business and technical viewpoints, but we can also compare the pre-change activities of impact analysis to the post-change activities of testing.

| | Technical Viewpoint (Design- or Code-Based) | Business Viewpoint |
|---|---|---|
| Pre-Change Impact Analysis (regression prevention) | Technical Impact Analysis: A manual analysis of the designs and source code to determine the potential impact of a change. | Business Impact Analysis: A speculation, based on current behaviour of the system and the business context, of the potential impact of a change. |
| Post-Change Testing (regression detection) | Static Regression Testing: An automated static analysis of the source code to identify analysis differences post-change. | Dynamic Regression Testing: Execution of a pre-existing dynamic test to compare new behaviour with previously trusted behaviour. |

The table summarises the four anti-regression activities in a 2x2 matrix. There are some similarities between the business and technical impact analysis approaches. Both are based on a current understanding of the existing system (but at a technical or behavioural level depending on viewpoint). Both are somewhat speculative and focus on how regressions can be avoided or accommodated in the technology or in the business process.

The post-change techniques focus on how regressions can be detected. They are based on evidence derived from a post-change analysis of the design and code or a demonstration of the functionality implemented using a previously run set of tests.

A comprehensive anti-regression strategy should include all four of these techniques.

Technical Impact Analysis (Regression Prevention)

A technical impact analysis is essentially a review of the prospective changes to a system design at a high level and could take the form of a technical review. Where major enhancements are concerned and the changes to be made are mainly additions to an existing system, a technical review would focus on impact at an architectural level (performance, security, resilience risks). We won’t discuss this further.

At the code level, concerns would focus on new interfaces (direct integration) and changes to shared resources such as databases (indirect integration). Obviously, changes to be made to the existing code base, if they are known, need to be studied in some detail.

Some form of source code analysis needs to be performed by designers and programmers on the pre-change version of the software. The analysis is basically a code inspection or review. The developer speculates on what the impact of the changes could be and traces paths of execution through the code and through other unchanged modules to see what impact the changes could have. Design changes involving the database schema, messaging formats, call mechanisms etc. may require changes in many places. Simple search and scanning tools may help, but this is a labour intensive and error-prone activity of course.

In small code samples that are simple, well designed and have few interfacing modules, the developer will have a reasonable chance of identifying potential problems that are in the near vicinity of the proposed changes. Anomalies found can be eliminated by adjusting the design of the changes to be made (or by adopting a design that avoids risky changes). However, in realistically sized systems, the scale and complexity of this task severely limits its scope and effectiveness.

Some changes will be found by compilers or the build processes used by the developers so these are of less concern. The impact of more subtle changes may require some ‘detective work’. Typically, the programmer may need to perform searches of the entire code base to find specific code patterns. Typical examples might be accesses to a changed database table; usage of a new or changed element in an XML formatted message; usage of a variable that has a changed range of validity and so on.
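As a rough illustration of this kind of ‘detective work’, the sketch below searches a code base for statements that touch a changed database table. Everything in it is an assumption made for the example: the in-memory code base, the file names, the table name asset and the helper find_usages. In practice the same search would run over real source files.

```python
import re

# Hypothetical code base held in memory as {file name: source text}.
CODE_BASE = {
    "orders.py": "rows = db.query('SELECT id, price FROM asset')\n"
                 "total = sum(r.price for r in rows)\n",
    "reports.py": "def header():\n    return 'Monthly report'\n",
    "stock.py": "db.execute('UPDATE asset SET location = ?', loc)\n",
}


def find_usages(code_base, pattern):
    """Return (file, line number, line text) for every line matching pattern."""
    regex = re.compile(pattern)
    hits = []
    for name in sorted(code_base):
        for number, line in enumerate(code_base[name].splitlines(), start=1):
            if regex.search(line):
                hits.append((name, number, line.strip()))
    return hits


# Find SQL statements that read from or write to the changed 'asset' table.
for hit in find_usages(CODE_BASE, r"\b(FROM|UPDATE|INTO)\s+asset\b"):
    print(hit)
```

A textual search like this finds candidate locations only; each hit still needs a programmer to judge whether the change actually affects it.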

Usually, formal static analysis on program source code is performed using tools. These can’t be used to predict the impact on behaviour of changes (unless the changed code is analysed, and we’ll look at that next), but the output of tools could be used to focus the programmers’ attention on specific aspects of the system. Needless to say, a defect database used to identify error-prone modules in the code base is invaluable. These error-prone areas could be given special attention in a targeted inspection of the potential side-effects of changes.

Business Impact Analysis (Regression Prevention)

When additional functionality is required to be added to a new system or an enhancement to existing functionality is required then some form of business impact analysis, driven by an understanding of the proposed and existing behaviour, is required. Occasionally a bug fix requires a significant amount of redesign so a review of the functional areas to be changed and the functional areas that might be affected is in order.

The business impact analysis is really a set of ‘what-if?’ questions asked of the existing system before the proposed changes are made. The answers to those what-if questions may raise concerns about potential failure modes – the consequences – against which a risk-analysis could be conducted. Of course, the number of potential failure modes is huge and the knowledge available to analyse these risks may be very limited. However, the areas of most concern could be highlighted to the developers for special attention and, in particular, used to focus subsequent regression testing.

A business impact analysis follows a fairly consistent process:

  1. PROPOSAL: Firstly, the proposed enhancement, amendment or bug fix resolution is described and communicated to the business users.
  2. CONSEQUENCE: The business users then consider the changes in functionality, and the technical changes that the programmers know will affect certain functional aspects of the system. Are there potential modes of failure of particular concern?
  3. CHALLENGE: Finally, the business users then challenge the programmers and ask, “what would happen if...?” questions to tease out any uncertainties and to highlight any specific regression tests that would be appropriate to address their concerns.

The main output from this process may be a set of statements that focus the attention of the developers. Often, specific features or processes may be deemed critical and must not under any circumstances be adversely affected by changes. These features and processes will surely be the subject of regression testing later.

Static Regression Testing (Regression Detection)

Can static analysis tools help with anti-regression? This section makes some tentative suggestions on how static analysis tools could be used to highlight changes that might be of concern. This is a rather speculative proposal. We would be very interested to hear from practitioners or researchers who are working in this area.

Tools are normally used to scan new or changed source code with a view to detecting poor programming practices and statically-detectable defects, so that developers can eliminate them. However, regressions are often found in unchanged code that calls or is called by the changed code. A scan of code in an unchanged module won’t tell you anything you didn’t already know. So, the analysis must look inside changed code and trace the paths affected by the changes to unchanged modules that call or are called by the changed module. Of course there may be extremely complex interactions for these tools to deal with – but that is exactly what the tools exist to do.

The process would look something like this:

  1. An analysis is performed on the unchanged code of the whole code-base or selected components of the system to be changed (the scope needs to be defined and be consistent in this process).
  2. The same analysis is performed on the changed code-base.
  3. The two analyses are compared and differences highlighted to identify where the code changes have affected the structure and execution paths of the system.

Whereas a differencing tool would tell you what has changed in the code, identifying the differences in the output of the static analyses may help you to locate where the code changes have an impact.
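The three-step process above can be illustrated with a deliberately simple ‘analysis’: the set of functions each function calls. The sketch below uses Python’s standard ast module as a stand-in for a real static analysis tool; the two source versions and the names call_map and compare are hypothetical, and only direct calls by plain name are detected.

```python
import ast

# Hypothetical pre-change and post-change versions of the same module.
BEFORE = '''
def price(asset):
    return lookup(asset)

def report(asset):
    return format_money(price(asset))
'''

AFTER = '''
def price(asset):
    return revalue(lookup(asset), location(asset))

def report(asset):
    return format_money(price(asset))
'''


def call_map(source):
    """Map each top-level function name to the set of plain names it calls."""
    result = {}
    for node in ast.parse(source).body:
        if isinstance(node, ast.FunctionDef):
            result[node.name] = {
                call.func.id
                for call in ast.walk(node)
                if isinstance(call, ast.Call) and isinstance(call.func, ast.Name)
            }
    return result


def compare(before, after):
    """Report functions whose set of outgoing calls differs between versions."""
    old, new = call_map(before), call_map(after)
    changes = {}
    for name in sorted(set(old) | set(new)):
        added = new.get(name, set()) - old.get(name, set())
        removed = old.get(name, set()) - new.get(name, set())
        if added or removed:
            changes[name] = {"added": sorted(added), "removed": sorted(removed)}
    return changes


print(compare(BEFORE, AFTER))
```

Here the comparison shows that price now reaches two functions it did not reach before, which is a pointer to where the behaviour of callers such as report might be affected even though report itself is unchanged.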

Tools can generate many types of analysis nowadays. What types of analyses might be of interest?

Control-flow analysis: if two control flow analyses of a changed system differ, but the differences are in unchanged code, then it is possible that some code is executed that was not used before, or some code that was used before is no longer executed. The new flow of control in the software may be intentional, but of course, it may not be. This analysis simply gives programmers a pointer to functionality and code paths that are worth further investigation. If the tool can generate graphical control-flow graphs, the changes may be obvious to a trained eye. The process could be analogous to a doctor examining x-rays taken at different times to look for growth of a tumour or healing of a broken bone.

Data Flow analysis: Data flow analysis traces the usage of variables: where they are assigned values, where they appear as predicates in decisions (e.g. referenced in an if... then... else... statement) and where their values are used in some other operation, such as assignment to another variable or a calculation. A difference in the usage pattern of a variable defined in a changed module and passed into or out of unchanged modules may indicate an unwanted change in software behaviour.
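A crude version of this comparison can be sketched with Python’s standard ast module, which records whether each name is being written (Store) or read (Load). The source versions and the helpers variable_uses and usage_diff are hypothetical, and this is far simpler than the analysis a real tool performs: it ignores function parameters, attributes and the order in which uses occur.

```python
import ast

# Hypothetical pre-change and post-change versions of one function.
BEFORE = '''
def value(asset):
    price = asset["fixed_price"]
    return price * asset["qty"]
'''

AFTER = '''
def value(asset):
    price = asset["fixed_price"]
    rate = asset["rate"]
    return rate * asset["qty"]
'''


def variable_uses(source):
    """Return a set of (function, variable, 'read' or 'write') triples."""
    uses = set()
    for func in ast.parse(source).body:
        if not isinstance(func, ast.FunctionDef):
            continue
        # Names used as call targets are functions, not data, so skip them.
        call_targets = {id(n.func) for n in ast.walk(func) if isinstance(n, ast.Call)}
        for node in ast.walk(func):
            if isinstance(node, ast.Name) and id(node) not in call_targets:
                if isinstance(node.ctx, ast.Store):
                    uses.add((func.name, node.id, "write"))
                elif isinstance(node.ctx, ast.Load):
                    uses.add((func.name, node.id, "read"))
    return uses


def usage_diff(before, after):
    """List the variable uses that appear or disappear between two versions."""
    old, new = variable_uses(before), variable_uses(after)
    return {"added": sorted(new - old), "removed": sorted(old - new)}


print(usage_diff(BEFORE, AFTER))
```

In this example the difference shows that price is still assigned but is no longer read, which is exactly the sort of changed usage pattern that deserves a second look.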

Not enough organisations use static analysis tools and few tools are designed to search for differences in ‘deep-flow analysis’ outputs across versions of entire systems – but clearly, there are some potential opportunities worth exploring there. This article leaves you with some ideas but won’t take this suggestion further.

So far, we’ve looked at three techniques for preventing and detecting regressions. In the next article, we’ll examine the activity of more interest to programmers and testers – regression testing.

First published 05/10/2010

The first four articles in this series have set out the main approaches to combating regression in changing software systems. From a business and technical viewpoint, we have considered both pre-change regression prevention (impact analysis) and post-change regression detection (regression testing). In this final article of the series, we’ll consider three emerging approaches that promise to reduce the regression threat and present some considerations of an effective anti-regression strategy with a recap of the main messages of the article series.

Three Approaches: Test, Behaviour and Acceptance Test-Driven Development

There is an increasing amount of discussion on development approaches based on the test-driven model. Ten years or so ago, before lightweight (later named Agile) approaches became widely publicized, test-driven development (TDD) was rare. Some TDD happened, but mostly in high integrity environments where component development and testing was driven by the need to meet formal functional and structural test coverage targets.

Over the course of the last ten years however, the notion of developers creating automated tests typically based on stories and discussions with on-site customers is becoming more common. The leaders in the Agile community are tending to preach behaviour- (BDD) and even acceptance test-driven development (ATDD) to improve and make accessible the test assets in Agile projects. They are also an attempt to move the Agile emphasis from coding to delivery of stakeholder value.

The advocates of these approaches (see, for example, ATDD in Practice) would say that the approaches are different and of course, in some respects they are. But from the point of view of our discussion of anti-regression approaches, the relevance is this:

  1. Regression testing performed by developers is probably the most efficient way to demonstrate functional equivalence of software (given the limited scope of unit testing).
  2. The test-driven paradigm ensures that regression test assets are acquired and maintained in synchrony with the code – so are accurate and constantly reusable.
  3. The existence of a set of trusted regression tests means that the programmer is protected (to some degree) from introducing regressions when they change code (to enhance, fix bugs in or refactor code).
  4. Programmers, once they commit to the test-first approach tend to find their design and coding activities more predictable and less stressful.
These approaches obviously increase the effort at the front-end and many programmers are not adopting (and may never adopt) them. However, the trend toward test-first does seem to be gaining momentum.

A natural extension of test-first in Agile and potentially more structured environments is the notion of live specifications. In this approach, the automated tests become the independent and executable definition of the behavior of the system. The tests define the behavior of a system by example, and can be considered to be executable specifications (of a sort). Of course, examples alone cannot define the behavior of systems completely and some level of logical specification will always be required. However, the live-specification approach holds great promise, particularly as a way of reducing regressions.

The ideal seems to be that where a change is required by users, the live specification is changed, new tests added and existing tests changed or retired as required. The software changes are made in parallel. The new and changed tests are run to demonstrate the changes work as required, and the existing (unchanged) tests are, by definition, the regression test pack. The format, content and structure of such live-specifications are evolving and a small number of organisations claim some successes. It will be interesting to see examples of the approach in action.
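To make the idea of specification by example a little more concrete, here is a minimal sketch. The discount rule, its thresholds and the names discount, SPECIFICATION and run_specification are all invented for illustration; real live-specification tools use richer formats than a list of tuples.

```python
# Hypothetical feature under specification: a volume discount rule.
def discount(quantity):
    """Return the discount rate for an order of the given quantity."""
    if quantity >= 100:
        return 0.10
    if quantity >= 10:
        return 0.05
    return 0.0


# The 'live specification': examples that both document and test behaviour.
SPECIFICATION = [
    # (description, input quantity, expected discount)
    ("small orders get no discount", 1, 0.0),
    ("discount starts at 10 items", 10, 0.05),
    ("just below the bulk threshold", 99, 0.05),
    ("bulk orders get the higher rate", 100, 0.10),
]


def run_specification(feature, examples):
    """Run every example and return the descriptions of those that fail."""
    return [desc for desc, given, expected in examples if feature(given) != expected]


failures = run_specification(discount, SPECIFICATION)
print("all examples pass" if not failures else failures)
```

When a change is requested, the examples are edited first; any example left untouched carries on running as part of the regression pack.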

Unified Requirements and Systems Testing

The test-first approaches discussed above are gaining popularity in Agile environments. But what can be done in structured, waterfall, larger developments?

Some years ago (in my first Eurostar paper in 1993), I proposed a ‘Unified Approach to System Functional Testing’. In that paper, I suggested that a tabular notation for capturing examples or test cases could be used to create crude prototypes, review check lists and structured walkthroughs of requirements. These ‘behaviours’ as I called them could be used to test requirements documents, but also reused as the basis of both system and acceptance testing later on. Other interests took priority and I didn’t take this proposal much further until recently.

Several developments in the industry make me believe that a practical implementation of this unified approach is now possible and attractive to practitioners. See for example the model-based papers here: or the tool described here: To date, these approaches have focused on high formality and embedded/industrial applications.

Our approach involves the following activities:

  1. Requirements are tabulated to allow cross-referencing.
  2. Requirements are analysed and stories, comprising feature descriptions and a covering set of scenarios and examples (acceptance criteria), are created.
  3. The scenarios are mapped to paths through the business process and a data dictionary; paper and automated prototypes can be generated from the scenarios.
  4. Using scenario walkthroughs, the requirements are evaluated, omissions and ambiguities identified and fixed.
  5. The process paths, scenarios and examples may be incorporated into software development contracts, if required.
  6. The process paths, scenarios and examples are re-used as the basis of the acceptance test which is conducted in the familiar way.
Essentially, the requirements are ‘exampled’, with features identified and a set of acceptance criteria defined for each – in a structured language. It is the structure of the scenarios that allows tabular definitions of tests for use in manual procedures as well as skeletal automated tests to be generated automatically. There are several benefits deriving from this approach, but the two that concern us here are:
  • The definition of tests and the ability to generate automated scripts occurs before code is written which means that the test-first approach is viable for all projects, not just Agile.
  • The database of requirements, processes, process paths, features, examples and data dictionary are cross-referenced. The database can be used to support more detailed business-oriented impact analysis.
The first benefit has been discussed in the previous section. The second has great potential.

The business knowledge captured in the process will allow some very interesting what-if questions to be asked and answered. If a business process is to change, the system features, requirements, scenarios and tests affected can be traced. If a system feature is to be changed, the scenarios, tests, requirement and business process affected can be traced. This knowledge should provide, at least at a high level, a better understanding of the impact of change. Further, it promotes the notion of live specifications and Trusted Requirements.
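The kind of what-if query described above can be sketched against a tiny cross-reference table. The records, the item naming scheme and the functions trace and affected_by are hypothetical; they are not the schema of any actual tool, just one way the tracing could work.

```python
# Hypothetical cross-reference records: (owning item, dependent item).
LINKS = [
    ("process:Take order", "feature:Order entry"),
    ("process:Take order", "feature:Credit check"),
    ("feature:Order entry", "requirement:R1"),
    ("feature:Credit check", "requirement:R2"),
    ("feature:Order entry", "scenario:S1"),
    ("feature:Credit check", "scenario:S2"),
    ("scenario:S1", "test:T1"),
    ("scenario:S2", "test:T2"),
]


def trace(item, links, downward=True):
    """Follow links transitively from item, down to dependants or up to owners."""
    found, frontier = set(), [item]
    while frontier:
        current = frontier.pop()
        for owner, dependant in links:
            source, target = (owner, dependant) if downward else (dependant, owner)
            if source == current and target not in found:
                found.add(target)
                frontier.append(target)
    return found


def affected_by(item, links):
    """Everything traced below the item plus everything above it."""
    return trace(item, links, downward=True) | trace(item, links, downward=False)


# What is affected if the 'Order entry' feature is changed?
print(sorted(affected_by("feature:Order entry", LINKS)))
```

Tracing down and up separately keeps the answer focused: a change to one feature reports its own requirement, scenario, test and owning process, without dragging in sibling features that merely share that process.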

There is a real possibility that the (typically) huge investment in requirements capture will not be wasted and the requirements may be accurately maintained in parallel with a covering set of scenarios. Further, the business knowledge captured in the requirements and the database can be retained for the lifetime of the system in question.

Improving Software Analysis Tools

The key barrier to performing better technical impact analyses is the lack (and expense) of appropriate tools to provide a range of source code analyses. Tools that provide visualisations of the architecture, relationships between components and hierarchical views of these relationships are emerging. Some obvious challenges make life somewhat difficult though:
  1. Tools are usually language dependent so mixed environments are troublesome.
  2. The source code for third-party components used in your system may not be available.
  3. Visualisation software is available, but for real-size systems, the graphical models can become huge and unworkable.
These tools are obviously focused at architects, designers and developers and are naturally technical in nature.

An example of how tools are evolving in this area is Structure101g by ( This tool can perform detailed structural analyses of several languages (“java, C/C++ and anything”) but can, in principle provide visualisations, navigation and query facilities for any structural model. For example, with the necessary plugins, the tool can provide insights into XML/XSLT libraries and web site maps at varying levels of abstraction.

As tools like this become better established and more affordable, they will surely become ‘must-haves’ for architects and developers in large systems environments.

Anti-Regression Strategy – Making it Happen

We’ll close this article series with some guidelines summarised from this and previous articles. Numerals in brackets refer to the article number.
  1. Regressions in working software affect business users, technical support, testers, developers, software and project management and stakeholders. It is everyone’s problem (I, V).
  2. A comprehensive anti-regression strategy would include both regression prevention and detection techniques from a technical and business viewpoint. (I, II).
  3. Impact analysis can be performed from both business and technical viewpoints. (all)
  4. Technical impact analysis really needs tool support; consider open source or proprietary tools (or consider building your own to meet your specific objectives).
  5. Regression testing may be your main defence against regression, but should never be the only one; impact analysis prevents regression and informs good regression testing. (I, II, IV).
  6. Regression testing can typically be performed at the component, system or business level. These test levels have different objectives, owners and may be automated to different degrees (III).
  7. Regression tests may be created in a test-driven regime, or as part of requirements or design based approaches. Reuse of tests saves time, but check that these tests actually meet your anti-regression objectives (III).
  8. Regression tests become less effective over time; review your test pack regularly, especially when you are about to add to it. (This could be daily in an Agile environment!) (III)
  9. Analyses of production data will tell you the formats, volumes and patterns of data that are most common – use them as a source of test data and a model for coverage, but don’t forget to include the negative tests too! (III)
  10. If you need to be selective in the tests you retain and execute then you’ll need an agreed process, forum, decision-maker or makers or criteria for selection (agreed with all stakeholders in 1 above) (III).
  11. Most regression testing can and should be automated. Understand your context (objectives, test levels, risk areas, developer/tester motivations and capabilities etc.) before defining your automation strategy (III, IV).
  12. Consider what test levels, system stimulation and outcome detection methods, ownership, capabilities and tool usability are required before defining an automation regime (IV).
  13. Creating an automation regime retrospectively is difficult and expensive; test-first approaches build regression testing into the DNA of project teams (V).
  14. There is a lot of thinking, activity and new approaches/tools being developed to support requirements testing, exampling, live-specs and test automation; take a look (V).
I wish you the best of luck in your anti-regression initiatives.

I’d like to express sincere thanks to the Eurostar Team for asking me to write this article series and attendees at the Test Management Summit for inspiring it.

Paul Gerrard 23 August 2010.

First published 30/09/2009

This document presents an approach for:

  • Business Scenario Walkthroughs (BSW) and
  • Business Simulation Testing (BST)

Objectives of Business Simulation

The primary aim of BST is to provide final confirmation that the systems, processes and people work as an integrated whole to meet an organisation's objectives to provide a sophisticated, efficient service to its customers. Business Simulation tests take a more process and people-oriented view of the entire system; User Acceptance Testing is more system-oriented.

The specific objectives of Business Simulation are to demonstrate that:


  • the business processes define the logical, step by step activities to perform the desired tasks
  • for each stage in the process, the inputs (information, resource) are available at the right time, in the right place to complete the task
  • the outputs (documents, events or other outcomes) are sufficiently well-defined to enable them to be produced reliably, completely, consistently
  • paths to be taken through the business process are efficient (i.e. no repeated tasks or convoluted paths)
  • the tasks in the Business Process are sufficiently well defined to enable people to perform the tasks regularly and consistently
  • the process can accommodate both common and unusual variations in inputs to enable tasks to be completed.


  • the people are familiar with the processes such that they can perform the tasks consistently, correctly and without supervision or assistance
  • people can cope with the variety of circumstances that arise when performing the tasks
  • people feel comfortable with the processes. (They don't need assistance, support or direction in performing their tasks)
  • customers perceive the operation and processes as being slick, effective and efficient
  • the training given to users provides them with adequate preparation for the task in hand.


  • the system provides guidance through the business process and leads them through the tasks correctly
  • the system is consistent, in terms of information required and provided, with the business process
  • the level of prompting within the system is about right (i.e. giving sufficient prompting without treating experienced users like first-time users)
  • response times for system transactions are compatible with the tasks which the system supports (i.e. fast response times where task durations are short)
  • users' perception is that the system helps the users, rather than hinders them.

Business Scenario Walkthroughs

The purpose of the Business Scenario Walkthrough (BSW) is to 'test' the business process and demonstrate the process itself is workable. The value of BSWs is that they can be used to simulate how the business process will operate, but without the need for the IT system or other infrastructure to be available. The Walkthroughs usually involve business users who role-play, and props may be used rather than real systems.

This technique is excellent for refining user requirements for systems, but in this case the 'script' will identify the tasks which need to be supported by specified functionality in the system. It will verify that the mapping of functionality to the business processes (to be used in training) is sound and that the other objectives are met.

Test Materials

The test of the business processes requires certain materials to be prepared for use by the participants. These are:

  • Instructions to the participants
  • Materials to be tested (Business Process Descriptions)
  • Business Scenarios
  • Checklist for inspections and Walkthrough
  • Issue logging sheets.


Inspections and Walkthroughs are labour-intensive, can involve 4-7 people (or more), and so can be expensive to perform. To gain the maximum benefit from the sessions, they should be properly planned and detailed preparations made well in advance. Further, it is essential that the procedures for the inspection and Walkthrough are followed to ensure all the materials to be tested are covered in time.


The inventory of scenarios to be covered should be allocated to the inspectors based on their concerns for specific processes to ensure the work is distributed and every scenario is covered. A checklist of rules or requirements to assist inspectors in identifying issues will be prepared. Depending on the viewpoint of the inspector, a different checklist may be issued.


The inspectors should use the scenarios to trace paths through the business processes and look out for issues of usability, consistency or deviation from rules on the checklist. The source document should be marked up and each issue identified should be logged. The marked up documents and issue list should be copied to the inspection leader.

Error Logging Meeting

The issue-lists compiled by the inspectors will be reviewed at an Error-Logging meeting. The purpose of the meeting is not to resolve errors at the meeting, but to work through the documents under test and compile an agreed list of errors to be resolved.

Inspection Follow-Up

The error log will be passed to the authors of the business processes for them to resolve. The corrected documents should be passed to the inspection leader for them to check that every error has been addressed. The person who raised the error should then confirm that the error has actually been resolved in an acceptable way.


The Walkthrough is a stage-managed activity where the business scenarios are used to script a sequence of activities to be performed by business users in the real world. The participants each have copies of the 'script' and should understand their role in the Walkthrough. Other people, who have an interest or contribution to make, may attend as observers. Observers may raise incidents in the same way as the participants.

The Walkthrough is led by one person who ensures the scripts are followed and incidents are logged. The aim is to identify and log problems, but not to solve them. During each session, a 'scribe' who may also be an observer, logs the incidents.

As the Walkthrough proceeds, participants and observers should aim to identify any anomalies in the business process by referencing the BSW checklist.

Incident Logging

The Walkthrough is specifically intended to address the objectives for people and processes presented in section 1.4. Incidents will be raised for any problems relating to those objectives. For example:

  • the business processes fail to provide logical, step by step activities to perform the desired tasks
  • for a stage in the process, the inputs (information, resource) are not available at the right time, in the right place to complete the task

The other objectives for people and processes can be re-cast to represent incident types.

Follow-Up

Incidents will be prioritised and categorised as defined in the Test Strategy. Resolution of the incidents will be dealt with by the Operational Infrastructure team or the Training team.

Where significant changes to processes or re-training is involved, a re-test may be deemed necessary.


The Test Manager must be satisfied that incidents have been correctly resolved and will monitor outstanding incidents closely.

The tester who raised the incident will be responsible for signing off incidents.

Business Simulation

The purpose of Business Simulation tests is to provide final confirmation that the system, processes and people are ready to go live. To test the overall user facility, a simulation of the activities expected to take place will be staged. In essence, a series of prepared test scenarios will be executed to simulate the variety of real business activity. The participants will handle the scenarios exactly as they would in a live situation, performing the tasks defined for the business process using the knowledge and skills gained in training.

It is intended, as far as possible to exercise the complete business processes from end to end. The simulation will cover both processes supported by the system and manual processes. The aim is to test the complete system, processes and people.

Test Materials

The simulation should be scripted. The business scenarios used in the BSW will be re-used as the basis of the BST scripts. There will be two documents used to script and record the results of every test:

  • Test script
  • Test log.

The scripts will be used by a test leader to drive the test. Participants in the test will treat the situation as they would in business, and so will not normally use a test script but may have tables of test data values to use, if necessary. Every script should be logged once it is complete, and any comments or problems must be recorded.


The script will have the following information included:

  • A script reference (unique within the test)
  • A description of the purpose of the scenario to be made. E.g. to get a quotation for a particular product, or to pose an awkward situation for a telesales operator (e.g. not knowing a key piece of information).
  • Information which is required and relevant to the processing of the script – these are the data values which should find their way into the system
  • Instructions (responses) to situations where the information is specifically NOT available. (To simulate the situation where the participants do not have information)
  • A simple questionnaire to record whether the objective of the script was met, whether the service as experienced by participants was smooth and efficient (or timely, accurate, courteous etc.)

Test Log

The Test Log will be used by the participants to log the following:

  • The script reference number (to match the test leader's test script and comments).
  • Comments on difficulties experienced while executing the script. These could be:
    • Problems with the system.
    • Problems with the process.
    • Problems for which the participant was not adequately prepared during training.
  • Date and times for both the start and end of the test.
  • Initials of the participant
  • An indication of whether the objectives of the test were met.
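
If the Test Log is captured electronically rather than on paper, the fields listed above map naturally onto a simple record. The following is only a minimal sketch: the class name, field names and helper methods are illustrative assumptions, not part of any prescribed format.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List


@dataclass
class LogEntry:
    """One participant's record of one executed script (hypothetical layout)."""
    script_ref: str                 # matches the test leader's script reference
    participant_initials: str
    started: datetime
    ended: datetime
    objectives_met: bool
    # Comments on difficulties, split by the three kinds named in the list above
    system_problems: List[str] = field(default_factory=list)
    process_problems: List[str] = field(default_factory=list)
    training_problems: List[str] = field(default_factory=list)

    def has_problems(self) -> bool:
        """True if any difficulty of any kind was recorded."""
        return bool(self.system_problems or self.process_problems
                    or self.training_problems)

    def duration_minutes(self) -> float:
        """Elapsed time between the start and end of the test, in minutes."""
        return (self.ended - self.started).total_seconds() / 60


entry = LogEntry(
    script_ref="S-001",
    participant_initials="AB",
    started=datetime(2009, 1, 1, 10, 0),
    ended=datetime(2009, 1, 1, 10, 12),
    objectives_met=True,
    process_problems=["Unclear who authorises the quotation"],
)
```

Keeping the three kinds of problem separate makes it easier to route incidents later to the team responsible for the system, the process or the training.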

Example Process

The notes below refer to a Business Simulation for a Call Centre application where an Automatic Call Distributor (ACD) and Windows client/server system with various interfaces to other systems was used.


Caller Scripts will be distributed to the callers, Call Logs to the Teleagents who will accept the calls. Both Callers and Teleagents will be briefed on how the test will be conducted.

Test calls will be made by the Callers in a realistic manner (via the PSTN to the ACD numbers or, alternatively, as internal calls to the Teleagent stations) and be conducted exactly as they will occur in live use.

Dummy Calls

At the opening, the caller should state that this is a test call and quote the reference number printed on the script. The Caller should not give any indication of the purpose of the test, but conduct the call in as realistic a way as possible.

At the end of the call, the Caller should record comments on the test call on their test script. The Teleagent should also log the call using the call reference, and record any comments on difficulties experienced and suggestions on how any difficulties were dealt with.

Tester Roles

The test calls do not need to be made simultaneously, so it is planned to have half of the trained Teleagents impersonate callers while the remaining agents take the calls. Roles would then be reversed to complete the test.

Results Checking

The completed test scripts and logs will be matched using the call reference. The test results will be analysed to identify any recurring problems or difficulties experienced from the point of view of the Callers or the Teleagents.
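
Where the scripts and logs are held electronically, the matching step can be automated. The sketch below assumes, purely for illustration, that each side is a dictionary keyed by call reference; the function name and data layout are hypothetical.

```python
def match_results(caller_scripts, teleagent_logs):
    """Pair caller scripts with Teleagent logs using the call reference.

    Both arguments are assumed to be dicts mapping call reference -> comments.
    Returns (matched, scripts_without_log, logs_without_script).
    """
    script_refs = set(caller_scripts)
    log_refs = set(teleagent_logs)

    # Calls recorded by both parties: keep the two sets of comments side by side
    matched = {
        ref: (caller_scripts[ref], teleagent_logs[ref])
        for ref in script_refs & log_refs
    }
    # Calls recorded by only one party deserve a closer look
    scripts_without_log = sorted(script_refs - log_refs)
    logs_without_script = sorted(log_refs - script_refs)
    return matched, scripts_without_log, logs_without_script


matched, no_log, no_script = match_results(
    {"C1": "no problems", "C2": "long wait before answer"},
    {"C2": "caller did not know postcode", "C3": "no problems"},
)
```

An unmatched reference is itself worth investigating: it may mean a call was never answered, or that one party forgot to log it.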

Where printed output (fulfilment pack) is generated for dispatch to the callers (dummy or real) addresses, the fulfilment packs will be checked to ensure:

  • every fulfilment pack is generated
  • the fulfilment pack is complete
  • the information presented on the fulfilment documents is correct, when compared with the information presented on the Caller's Script.

Results checking will be performed by the Callers.

Incident Logging

Incidents will be raised for any of the following:

  • Failure or any other anomaly occurring within the system
  • problems encountered during a call by the Caller
  • problems encountered during a call by the Teleagent
  • failure to generate a fulfilment pack
  • wrong contents in a fulfilment pack
  • incorrect details presented in the fulfilment pack.


Incidents will be prioritised and categorised as defined in the Test Strategy. Resolution of the incidents will be handled as follows:

  • System problems will be handled by the appropriate development team.
  • Process problems will be handled by Operational Infrastructure team.
  • People problems will be handled by the Training team.

Where significant changes to the system and/or processes or re-training is involved, a re-test may be deemed necessary.


The Test Manager must be satisfied that incidents have been correctly resolved and will monitor outstanding incidents closely.

The tester who raised the incident will be responsible for signing off incidents.

First published 11/10/2009

Requirements are the foundations of every project yet we continue to build systems with requirements that have not been tested. We take care to test at every stage during design and development and yet the whole project may be based on untested foundations.

Functional system tests should be based around coverage of the functionality described in the requirements, but it is common for the design document to be used as the baseline for testing because the requirements can't be related to the end product. In the worst case, system tests can become large scale repetitions of unit tests. It is not surprising that many system tests fail to reveal requirements errors.

We ask users to perform acceptance tests against their original requirements. But who can blame enthusiastic users when they become overwhelmed by the task? The system bears so little resemblance to what they asked for that the acceptance test often becomes a superficial hands-on familiarisation exercise. This paper proposes that a unified view of requirements can improve the requirements gathering process, give users a clearer view of their expectations and provide a framework for more effective system and user acceptance tests.

A Unified Approach to System Functional Testing

First published 08/12/2009

The Project Euler site presents a collection of 'maths-related' problems to be solved by computer – 250+ of them and the site allows you to check your answers etc. You don't need to be a mathematician for all of them really, but you do need to be a good algorithm designer/programmer.

But it also reminded me of a recurring thought about something else. Could the problems be used as 'testing' problems too? The neat thing about some of them is that testing them isn't easy. Some problems have only one answer – they aren't very useful for testers – there is only one test case (or you simply need to write/reuse a parallel program to act as an oracle). But others, like problem 22 for example, provide input files to process. The input file could be edited to generate variations – i.e. test cases to demonstrate the code works in general, not just in a specific situation. Because some problems must work for infinite cases, simple test techniques probably aren't enough (are they ever?)
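
To make the idea of input variations concrete, here is a minimal sketch. It assumes the usual reading of problem 22: sort the names alphabetically, score each name as the sum of its letter positions (A=1 ... Z=26), multiply by its 1-based rank in the sorted list, and total the results. The function names are my own, and the tiny lists stand in for edited copies of the real input file.

```python
def name_score(name):
    """Sum of alphabetical positions of the letters (A=1 ... Z=26)."""
    return sum(ord(c) - ord("A") + 1 for c in name.upper())


def total_name_scores(names):
    """Sort the names, weight each score by its 1-based rank, and total them."""
    ranked = enumerate(sorted(names), start=1)
    return sum(rank * name_score(name) for rank, name in ranked)


# Variations a tester might derive by editing the input file
variations = {
    "empty file": [],
    "single name": ["COLIN"],
    "unsorted input": ["B", "A"],
    "same names, different order": ["A", "B"],
    "duplicate names": ["A", "A"],
}

for label, names in variations.items():
    print(label, total_name_scores(names))
```

Small hand-checkable inputs like these act as their own oracle, and properties such as "the order of the input must not change the total" can be checked on inputs of any size without knowing the expected answer.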

The Euler problem statements aren't designed for a specific technique – although they define requirements precisely, they are much closer to a real and challenging problem. The algorithms used to solve the problems are a mystery – and there may be many ways of solving the same problem. (cf screens that simply update records in a database – pretty routine by comparison). The implementation doesn't influence our tests – it's a true black box problem.

Teaching testing “the certified way” starts from the wrong end. We teach technique X, give out a prepared requirement (that happens to fit technique X – sort of) and say, “prepare tests using technique X”. But real life and requirements (or lack of them) aren't like that. Requirements don't usually tell you which technique to use! The process of test model selection (and associated test design and coverage approaches) is rarely taught – even though this is perhaps the most critical aspect of testing.

All of which makes me think that maybe we could identify a set of problem statements (not necessarily 'mathematical') that don't just decompose to partitions and boundaries, states or decisions and we should use these to teach and train. We teach students using small applications and ask them to practice their exploratory testing skills. Why don't we do the same with requirements?

Should training be driven by the need to solve problems rather than trot out test design techniques memorised by rote? Why not create training exercises (and certification(?)) from written instructions, a specification, a PC and some software to test?

Wouldn't this be a better way to train people? To evaluate their ability as testers? This is old hat really – but still few people do it.

What stops us doing it? Is it because really – we aren't as advanced as we think we are? Test techniques will never prove correctness (we know that well) – they are just heuristic, but perhaps more systematic ways of selecting tests. Are the techniques really just clerical aids for bureaucratic processes rather than effective methods for defect detection and evidence collection? Where's the proof that says they are more effective? More effective – than what?

Who is looking at how one selects a test model? Is it just gut feel, IQ, luck, or whatever happens to be on my desk? Is there a method of model selection that could be defined and taught? Examined? Why don't we teach people to invent and choose test models? It seems to me that this needs much more attention than anyone gives it today.

What do you think?

