Paul Gerrard

My experiences in the Test Engineering business; opinions, definitions and occasional polemics. Many have been rewritten to restore their original content.

First published 10/05/2010

The second in a series of articles on Anti-Regression Approaches has been posted here: Part II: Regression Prevention and Detection Using Static Techniques

Part I: Introduction & Impact Analysis can be found here.

Tags: #regressiontesting #impactanalysis #anti-regression #staticanalysis


First published 16/06/2010

The third in a series of articles on Anti-Regression Approaches has been posted here: Part III: Regression Testing

Part I: Introduction & Impact Analysis can be found here.

Part II: Regression Prevention and Detection Using Static Techniques can be found here.

The next article will focus on Test Automation.

Tags: #testautomation #regressiontesting #anti-regression


First published 19/07/2010

In Parts I and II of this article series, we introduced the nature of regression, impact analysis and regression prevention. In Part III we looked at Regression Testing and how we select regression tests. This article focuses on the automation of regression testing.

Automated Regression Testing is One Part of Anti-Regression

Sometimes it feels like more has been written about test automation, especially GUI test automation, than any other testing subject. My motivation in writing this article series was that most things of significance in test automation had been said 8, 10 or 15 years ago, and not much progress has been made since (notwithstanding the technology changes that have occurred). I suggest there has been a lack of progress because significant and sustained success with automation of (what is primarily) regression testing is still not assured. Evidence of failure, or at least of troublesome implementations of automation, is still widespread.

My argument at the January 2010 Test Management Summit was that perhaps the reason for failure in test automation was that people didn’t think it through before they started. In this context, ‘started’ often means getting a good deal on a proprietary GUI test automation tool. It’s obvious – buying a tool isn’t the best first step. Automating tests through the user interface may not be the most effective way to achieve anti-regression objectives. Test automation may not be an effective approach at all. It certainly shouldn’t be the only one considered.

Test execution automation promises reliable, error-free, rapid, unattended test execution. In some environments, the promise is delivered, but in most it is not. In the mid-1990s, informal surveys revealed that very few (in one survey, less than 1% of) test automation users achieved ‘significant benefits’. The percentage is much higher nowadays – maybe as high as 50% – but that is probably because most practitioners have learnt their lessons the hard way. Regardless, success is not assured.

Much has been written on the challenges and pitfalls of test automation. The lessons learned by practitioners in the mid-90s are substantially the same as those facing practitioners today. It’s a cause of some frustration that many companies still haven’t learnt them. There isn’t space in this article to repeat those lessons. The papers, books and blogs referenced at the end of this article focus on implementing automation, primarily from a user interface point of view, and sometimes as an end in itself. To complement these texts, to bring them up to date and to focus them on our anti-regression objective, the remainder of this article sets out some wider considerations.

Regression test objectives and (or versus?) automation

The three main regression test objectives are set out below together with some suggestions for test automation. Although the objectives are distinct, the differences between regression testing and automation for the three objectives are somewhat blurred.

  1. To detect unwanted changes to trusted functionality.
     • Source of tests: functional system tests, integration tests.
     • Automation considerations: consider the criteria in references 6, 7 and 8; most likely to be automated using drivers to component and sub-system interfaces.
  2. To detect unwanted changes (to support technical refactoring).
     • Source of tests: test-first, test-driven environments generate automated tests naturally.
     • Automation considerations: consider reference 9 and the discussion of testing in TDD and Agile in general.
  3. To demonstrate to stakeholders that they can still do business.
     • Source of tests: acceptance tests, business process flows, ‘end to end’ tests.
     • Automation considerations: consider the criteria in references 6, 7 and 8, but expect mostly manual testing for demonstration purposes. See reference 10 for an introduction to Acceptance Driven Development.

Regression objectives reframed: detecting regression v providing confidence

Of the three regression test objectives above, objectives 1 and 2 are similar. What differentiates them is who (and where) they come from. Objective 1 comes from a system supplier perspective and tests are most likely to be sourced from system or integration tests that were previously run (either manually or automated). Objective 2 comes from a developer or technical perspective where the aim is to perform some refactoring in a safe environment. By and large, ‘safe’ refactoring is most viable in a Test-Driven environment where all unit tests are automated, probably in a Continuous Integration regime. (Although refactoring at any level benefits from automated regression testing).

If objectives 1 and 2 require tests to demonstrate ‘functional equivalence’, regression test coverage can be based on the need to exercise the underlying code and cover the system functionality. Potentially, tests based on equivalence partitioning ought to cover the branches in code (but not housekeeping or error-handling functionality – but see below). Tests covering edge conditions or boundary values should verify the ‘precision’ of those decisions. So a reasonable guideline could be – use automation to cover functional paths through the system and data-drive those tests to expand the coverage of boundary conditions. The automation does not necessarily have to execute tests that would be recognisable to the user, if the objective is to demonstrate functional equivalence.
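As an illustration of that guideline, here is a minimal sketch (in Python, using pytest) of data-driving a single functional path with boundary values. The approve_order function and the credit limit of 1000 are hypothetical stand-ins; a real regression test would drive a component, service or UI interface rather than a local function.

    import pytest

    # Hypothetical stand-in for a functional path in the system under test.
    def approve_order(value, credit_limit=1000.00):
        return value <= credit_limit

    # Data-drive one functional path to expand coverage of the boundary condition.
    @pytest.mark.parametrize("order_value, expected", [
        (999.99, True),    # just inside the limit
        (1000.00, True),   # on the limit
        (1000.01, False),  # just outside the limit
    ])
    def test_order_approval_boundary(order_value, expected):
        assert approve_order(order_value) is expected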

Objective 3 – to provide confidence to stakeholders – is slightly different. In this case, the purpose of a regression test is to demonstrate to end users that they can execute business transactions and use the system to support their business. In this respect, it may be that these tests can be automated, and some automated tests that fall under categories 1 and 2 above will be helpful. But experience of testing GUI applications in particular suggests that end users sometimes only trust their own eyes and need a hands-on experience to gain the confidence that is required. Potentially, a set of automated tests might be used to drive a number of ‘end to end’ transactions, and reconciliation or control reports could be generated to be inspected by end users. There is a large spectrum of possibilities of course. In summary, automated tests can help, but in some environments the need for manual tests as a ‘confidence builder’ cannot be avoided.

At what level(s) should we automate regression tests?

In Part III of this article series, we identified three levels at which regression testing might be implemented – at the component, system and business (or integrated system) levels. These levels should be considered as complementary and the choice is where to place emphasis, rather than which to include or exclude. The choice of automation at these levels is not really the point. Rather, a level of regression testing may be chosen primarily to achieve an objective, partly because of the value of the information generated and partly because of the ease with which the tests can be automated.

What are the technical considerations for automation?

At the most fundamental, technical level, there are four aspects of the system under test that must be considered: how the system under test is stimulated; how the test outcomes of interest (with respect to regression) will be detected and captured; how actual outcomes will be compared with expected outcomes; and how the architecture of the system influences the choice of approach.

Mechanisms for stimulating the system under test

This aspect reflects how a test is driven, by either a user or an automated tool. Nowadays, the number of user and technical interfaces in use is large – and growing. The most common are presented below, together with some suggestions.

PC/Workstation-based applications and clients
  • Proprietary or open source GUI-object based drivers
  • Hardware (keyboard, video, mouse) based tools – physically connected to clients
  • Software based automation tools driving clients working across VNC connections
Browser/web-based applications
  • Proprietary object-based agents
  • Open source JavaScript-based agents
  • Open source script languages and GUI toolkits
Web-Server-based functionality (HTTP)
  • Proprietary or open source webserver/HTTP/S drivers
Web services
  • Proprietary or open source web services drivers
Mobile applications
  • Mobile OS simulators driven by integrated or separate GUI based toolkits
Embedded
  • Typically java-based toolkits
Error, failure, spate, race conditions or other situations
  • May be simulated by instrumentation, load generation tools or manipulation of the test environment or infrastructure
Environments
  • Don’t forget that environmental conditions influence the behaviour of ALL systems under test.

There are an increasing number of proprietary and open source unit and acceptance testing frameworks available to manage and control the test execution engines above.
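As a concrete example of one stimulation mechanism, a browser-based application might be driven through an open source GUI toolkit such as Selenium WebDriver. The sketch below is illustrative only: the URL, element IDs and credentials are hypothetical, and in practice the script would sit inside one of the test frameworks mentioned above.

    from selenium import webdriver
    from selenium.webdriver.common.by import By

    # Drive the browser as a user would; the URL and element ids are hypothetical.
    driver = webdriver.Chrome()
    try:
        driver.get("http://localhost:8080/login")
        driver.find_element(By.ID, "username").send_keys("testuser")
        driver.find_element(By.ID, "password").send_keys("secret")
        driver.find_element(By.ID, "login-button").click()
        # A simple regression check on an observable outcome.
        assert "Dashboard" in driver.title
    finally:
        driver.quit()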

Outcome/Output detection and capture

A regression can be detected in as many ways as any outcome (output, change of state etc.) of a system can be exposed and detected. Here’s a list of common outcome/output formats that we may have to deal with. This is not a definitive list.

Browser-rendered output
  • The state of any object in the Document Object Model (DOM) exposed by a GUI tool
Any screen-based output
  • Image recognition by hardware or software based agents
Transaction response times
  • Any automated tool with response time capture capability
Database changes
  • Appropriate SQL or database query tool
Message output and content
  • Raw packets captured by network sniffers
  • Message payloads captured and analysed by protocol-specific tools
Client or server system resources
  • CPU, i/o, memory, network traffic etc. detected by performance monitors
Application or other infrastructure – changes of state
  • (Database, enterprise messaging, object request brokers etc. etc.) - dedicated system/resource monitors or custom-built instrumentation etc.
Changes in accessibility or usability (adherence to standards etc.)
  • Web page HTML scanners, character-based screen or report scanners or screen image scanners
Security (server)
  • Port scanning and server-penetration tools
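As an example of the ‘database changes’ item above, a post-test database check might look like the sketch below. Python’s built-in sqlite3 module is used for brevity; the orders table, its columns and the baseline values are hypothetical.

    import sqlite3

    def capture_order_state(db_path, order_id):
        """Capture the columns of interest after the test has run (hypothetical schema)."""
        with sqlite3.connect(db_path) as conn:
            cur = conn.execute(
                "SELECT status, total, updated_by FROM orders WHERE id = ?",
                (order_id,),
            )
            return cur.fetchone()

    # Compare the post-change state with a baseline captured from the trusted version.
    baseline = ("APPROVED", 250.00, "batch-job")
    actual = capture_order_state("app.db", 1001)
    assert actual == baseline, f"Possible regression in order 1001: {actual} != {baseline}"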

Comparison of Outcomes

A fundamental aspect of regression testing is comparison of actual outcomes (in whatever format from whatever source above) to expected outcomes. If we are running a test again, the comparison is between the new ‘actual’ output/outcome and previously captured ‘baseline’ output/outcome.

Simple comparison of numbers, text, system states, images, mark-up language, database content, reports, message payloads or system resource usage is not enough. We need a capability in our automation to:

Filter content: we may not need to compare ‘everything’. Subsets of database records, screen/image regions, branches or leaves in marked up text, some objects and states but not others etc. may be filtered out (of both actual and baseline content).

Mask content: of the content we retain, we may wish to mask out certain patterns, such as image regions that do not contain field borders; textual report columns or rows that contain dates/times, page numbers or varying/unique record ids; screen fields or objects of certain colours or sizes, or that are hidden/visible; patterns of text that can be matched using regular expressions; and so on.

Calculate from content: the value, significance or meaning of content may have to be calculated: perhaps the number of rows displayed on a screen is significant; an error message, number or status code displayed on a screen image may need to be extracted by text recognition; the result of a formula whose variables are extracted from an output report may need to be checked; and so on.

Identify content meeting/exceeding a threshold: the significance of output is determined by its proximity to thresholds such as: CPU, memory or network bandwidth usage compared to pre-defined limits; the value of a purchase order exceeds some limit; the response time of a transaction exceeds a requirement and so on.
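To make the filtering and masking ideas concrete, here is a minimal sketch that masks volatile content before comparing an actual report against its baseline. The timestamp and record-id patterns are assumptions for illustration; a real comparator would need richer filters, but the shape is the same.

    import difflib
    import re

    # Assumed patterns for volatile content; adjust to the formats your reports actually use.
    MASK_PATTERNS = [
        (re.compile(r"\d{4}-\d{2}-\d{2}[ T]\d{2}:\d{2}:\d{2}"), "<TIMESTAMP>"),
        (re.compile(r"ID-\d+"), "<RECORD-ID>"),
    ]

    def mask(text):
        """Replace content that varies run-to-run so it cannot cause false differences."""
        for pattern, placeholder in MASK_PATTERNS:
            text = pattern.sub(placeholder, text)
        return text

    def regression_differences(baseline, actual):
        """Return the differing lines after masking; an empty list means no difference found."""
        diff = difflib.unified_diff(
            mask(baseline).splitlines(), mask(actual).splitlines(),
            fromfile="baseline", tofile="actual", lineterm="",
        )
        return [line for line in diff
                if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))]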

System Architecture

The architecture of a system may have a significant influence over the choice of regression approach and automation in particular. An example will illustrate. An increasingly common software model is the MVC or model-view-controller architecture. Simplistically (from Wikipedia):

“The model is used to manage information and notify observers when that information changes; the view renders the model into a form suitable for interaction, typically a user interface element; the controller receives input and initiates a response by making calls on model objects. MVC is often seen in web applications where the view is the HTML or XHTML generated by the app. The controller receives GET or POST input and decides what to do with it, handing over to domain objects (i.e. the model) that contain the business rules and know how to carry out specific tasks such as processing a new subscription.”

A change to a ‘read-only’ view may be completely cosmetic and have no impact on models or controllers. Why regression test other views, models or controllers? Why automate testing at all – a manual inspection may suffice.

If a controller changes, the user interaction may be affected in terms of data captured and/or presented but the request/response dialogue may allow complete control of the transaction and examination of the outcome. In many situations, automated control of requests to and from controllers (e.g. HTTP GETs and POSTs) is easier to achieve than automating tests through the GUI (i.e. a rendered web page).
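A sketch of that idea, using Python’s requests library against a hypothetical subscription controller (the URL, request fields and response format are assumptions, not part of any real application):

    import requests

    BASE_URL = "http://localhost:8080"  # hypothetical application under test

    def test_new_subscription_controller():
        """Drive the controller directly with a POST, bypassing the rendered page."""
        response = requests.post(
            BASE_URL + "/subscriptions",
            data={"customer_id": "42", "plan": "standard"},
            timeout=10,
        )
        # Regression checks on the request/response dialogue rather than the GUI.
        assert response.status_code == 200
        assert "subscription_id" in response.json()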

Note that cross-browser test automation, to verify the behaviour and appearance of a system’s web pages across different browser types, for example, cannot be handled this way. (Some functional automation may be possible, but some usability/accessibility tests will always be manual).

It is clear that the number and variety of ways in which a system can be stimulated, and in which potentially regressive outcomes can be observed, is huge. Few tools, if any, proprietary or open source, have all the capabilities we need. The message is clear – don’t ever assume the only way to automate regression testing is to use a GUI-based test execution tool!

Regression test automation – summary

In summary, we strongly advise you to bear in mind the following considerations:

  1. What is the outcome of your impact analysis?
  2. What are the objectives of your anti-regression effort?
  3. How could regressions manifest themselves?
  4. How could those regressions be detected?
  5. How can the system under test be stimulated to exercise the modes of operation of concern?
  6. Where in the development and test process is it feasible to implement the regression testing and automation?
  7. What technology, tools, harnesses, custom utilities, skills, resources and environments do you need to implement the automated regression test regime?
  8. What will be your criteria for automating (new or existing, manual) tests?

Test Automation References

  1. Brian Marick, 1997, Classic Testing Mistakes, http://www.exampler.com/testing-com/writings/classic/checklist.html
  2. James Bach, 1999, Test Automation Snake Oil, http://www.satisfice.com/articles/test_automation_snake_oil.pdf
  3. Cem Kaner, James Bach, Bret Pettichord, 2002, Lessons Learned in Software Testing, John Wiley and Sons
  4. Dorothy Graham, Paul Gerrard, 1999, the CAST Report, Fourth Edition
  5. Paul Gerrard, 1998, Selecting and Implementing a CAST Tool, http://gerrardconsulting.com/?q=node/532
  6. Brian Marick, 1998, When Should a Test be Automated? http://www.stickyminds.com/sitewide.asp?Function=edetail&ObjectType=ART&ObjectId=2010
  7. Paul Gerrard, 1997, Testing GUI Applications, http://gerrardconsulting.com/?q=node/514
  8. Paul Gerrard, 2006, Automation below the GUI (blog posting), http://gerrardconsulting.com/index.php?q=node/555
  9. Scott Ambler, 2002-10, Introduction to Test-Driven Design, http://www.agiledata.org/essays/tdd.html
  10. Naresh Jain, 2007, Acceptance-Test Driven Development, http://www.slideshare.net/nashjain/acceptance-test-driven-development-350264

In the final article of this series, we’ll consider how an anti-regression approach can be formulated, implemented and managed and take a step back to summarise and recap the main messages of these articles.

Paul Gerrard 21 June 2010.

Tags: #testautomation #regressiontesting #automatedregression


First published 14/04/2010

Introduction

For some years, I’ve avoided getting too involved in test execution automation because I’ve felt it was boring. Yes, I know it has great promise and in principle, surely we should be offloading the tedious, repetitive, clerical tasks to tools. But I think the principles of regression testing haven’t changed and we’ve made little progress in the last fifteen years or so – we haven’t really moved on. It’s stuck in a time-warp.

I presented a talk on test automation at Eurostar 1997. Titled “Testing GUI Applications”, the paper I wrote is still the most popular one on my website gerrardconsulting.com with around 300 downloads a month. Why are people still interested in stuff I wrote so long ago? I think it was a good, but not groundbreaking paper; it didn’t mention the web; the recommendations for test automation were sensible, not radical. Books on the subject have been written since then. I’ve been meaning to update the paper for the new, connected world we now inhabit, but haven’t had the time so far.

But at the January 2010 Test Management Summit, I chose to facilitate the session “Regression Testing: What to Automate and How”. In our build-up to the Summit, the topic came top of the popularity survey. We had to incorporate it into the programme, but no one volunteered – so I picked it up. On the day, the frustrations I’d held for a long time came pouring out and the talk became a rather angry rant. In this series of articles, I want to set out the thoughts I presented at the Summit.

I’m going to re-trace our early steps to automation and try and figure out why we (the testers) are still finding that building and running sustainable, meaningful automated regression test suites is fraught with difficulties. Perhaps these difficulties arise because we didn’t think it through at the beginning?

Regression tests are the most likely to be stable and run repeatedly so automation promises big time savings and, being automated, guarantees reliable execution and results checking. The automation choice seems clear. But hold on a minute!

We regression test because things change. Chaotic, unstable environments need regression testing the most. But when things change regularly, automation is very hard. And this describes one of the paradoxes of testing. The development environments that need and would benefit from automated regression testing are the environments that find it hardest to implement.

Rethinking Regression

What is the regression testing thought process? In this paper, I want to step through the thinking associated with regression testing: to understand why regressions occur, to establish what we mean by regression testing, and to examine why we choose to do it and to automate it.

How do regressions occur?

Essentially, something changes and this impacts ‘working’ software. The change could be to the environment in which the software operates, an enhancement could be implemented, or a bug could be fixed (and the fix causes side-effects), and so on. It’s been said over many years that software code fixes have a 50% chance of introducing side-effects in working software. Is it 30% or 80%? Who cares? Change is dangerous; the probability of disaster is unpredictable; we have all suffered over the years.

Regressions have a disproportionate impact on rework effort, confidence and even morale. What can we do? The two approaches at our disposal are impact analysis (to support sensible design choices) to prevent regressions and regression testing – to identify regressions when they occur.

Impact Analysis

In assessing the potential damage that change can cause, the obvious choice is to not change anything at all. This isn’t as stupid a statement as it sounds. Occasionally, the value of making a change, fixing a bug, adding a feature is far outweighed by the risk of introducing new, unpredictable problems. All prospective changes need to be assessed for their potential impact on existing code and the likelihood of introducing unwanted side-effects. The problem is that assessing the risk of change – Impact Analysis – is extremely difficult. There are two viewpoints for impact analysis: The business view and the technical view.

Impact Analysis: Business View

The first is the user or business view: the prospective changes are examined to see whether they will impact the functionality of the system in ways that the user can recognise and approve of. Three types of functionality impact are common: business-, data- and process-impacted functionality.

Business-impacts often cause subtle changes in the behaviour of systems. An example might be where a change affects how a piece of data is interpreted: the price of an asset might be calculated dynamically rather than fixed for the lifetime of the asset. An asset stored at a location at one price, might be moved to another location at another price. Suddenly – the value of the non-existent asset at the first location is positive or even negative! How can that be? The software worked perfectly – but the business impact wasn’t thought through.

A typical data-impact would be where a data item required to complete a transaction is made mandatory, rather than optional. It may be that the current users rely on the data item being optional because at the time they execute the affected transaction the information is not known, but captured later. The ‘enhanced’ system might stop all business transactions going ahead or force the users to invent data to bypass the data validation check. Either way, the impact is negative.

Process-impacted functionality is where a change might affect the choices of paths through the system or through the business process itself. The change might for example cause a downstream system feature to be invoked where before it was not. Alternatively, a change might suppress the use of a feature that users were familiar with. Users might find they have to do unnecessary work or they have lost the opportunity to make some essential adjustment to a transaction. Wrong!

Impact Analysis: Technical View

With regards to the technical impact analysis by designers or programmers – there are a range of possibilities and in some technical environments, there are tools that can help. Very broadly, impact analysis is performed at two levels: top-down and bottom-up.

The top down analysis involves the consideration of the alternate design options and looking at their impact on the overall behaviour of the changed system. To fix a bug, enhance the functionality or meet some new or changed requirement, there may be alternative change designs to achieve these goals. A top-down approach looks at these prospective changes in the context of the architecture as a whole, the design principles and the practicalities of making the changes themselves. This approach requires that the designers/developers have an architectural view, but also a set of design principles or guidelines that steer designers away from bad practices. Unfortunately, few organisations have such a view or have design principles so embedded within their teams that they can rely on them.

The bottom-up analysis is code-driven. If the selected design approach impacts a known set of components that will change, the software that calls and depends upon the to-be-changed components can be traced. The higher-level services and features that ultimately depend on the changes can be identified and assessed. This sounds good in principle, especially if you have tools to generate call-trees and collaboration diagrams from code. But there are two common problems here.

The first problem is that the design integrity of the system as a whole may be poor. The dependencies between changed components and those affected by the changes may be numerous. If the code is badly structured, convoluted and looks like ‘spaghetti’, even the software experts may not be able to fathom this complexity and it can seem as though every part of the system is affected. This is a scary prospect.

The second problem is that the software changes may be at such a low level in the hierarchy of calling components that it is impractical to trace the impact of changes through to the higher level features. Although a changed component may be buried deep in the architecture, the effect of a poorly implemented software change may be catastrophic. You may know that a higher level service depends on a lower level component – the problem is, you cannot figure out what that dependency is to predict and assess the impact of the proposed change.

All in all, impact analysis is a tricky prospect. Can regression testing get us out of this hole?

To be continued...

Paul Gerrard 29 March 2010.

Tags: #testautomation #regressiontesting #impactanalysis


First published 10/05/2010

In Part I of this article series, we looked at the nature of regression and impact analysis. In this article we reframe impact analysis as a regression prevention technique; we compare technical and business impact analysis a little more and we discuss regression prevention and detection using business impact analysis and static code analysis. The next article (Part III) will focus exclusively on regression testing as a regression detection approach.

Regression Prevention and Regression Detection

Before we go any further, it’s worth exploring the relationship between impact analyses (used to prevent regressions) and testing (used to detect regressions). We looked at impact analysis from both business and technical viewpoints, but we can also compare the pre-change activities of impact analysis to the post-change activities of testing.

Pre-Change Impact Analysis (regression prevention)
  • Technical viewpoint (design- or code-based): Technical Impact Analysis. A manual analysis of the designs and source code to determine the potential impact of a change.
  • Business viewpoint (behaviour-based): Business Impact Analysis. A speculation, based on current behaviour of the system and the business context, of the potential impact of a change.

Post-Change Testing (regression detection)
  • Technical viewpoint (design- or code-based): Static Regression Testing. An automated static analysis of the source code to identify analysis differences post-change.
  • Business viewpoint (behaviour-based): Dynamic Regression Testing. Execution of a pre-existing dynamic test to compare new behaviour with previously trusted behaviour.

This gives us four anti-regression activities in a 2x2 matrix: viewpoint (technical or business) against timing (pre-change or post-change). There are some similarities between the business and technical impact analysis approaches. Both are based on a current understanding of the existing system (but at a technical or behavioural level, depending on viewpoint). Both are somewhat speculative and focus on how regressions can be avoided or accommodated in the technology or in the business process.

The post-change techniques focus on how regressions can be detected. They are based on evidence derived from a post-change analysis of the design and code or a demonstration of the functionality implemented using a previously run set of tests.

A comprehensive anti-regression strategy should include all four of these techniques.

Technical Impact Analysis (Regression Prevention)

A technical impact analysis is essentially a review of the prospective changes to a system design at a high level and could take the form of a technical review. Where major enhancements are concerned and the changes to be made are mainly additions to an existing system, a technical review would focus on impact at an architectural level (performance, security, resilience risks). We won’t discuss this further.

At the code level, concerns would focus on new interfaces (direct integration) and changes to shared resources such as databases (indirect integration). Obviously, changes to be made to the existing code base, if they are known, need to be studied in some detail.

Some form of source code analysis needs to be performed by designers and programmers on the pre-change version of the software. The analysis is basically a code inspection or review. The developer speculates on what the impact of the changes could be and traces paths of execution through the code and through other unchanged modules to see what impact the changes could have. Design changes involving the database schema, messaging formats, call mechanisms etc. may require changes in many places. Simple search and scanning tools may help, but this is a labour intensive and error-prone activity of course.

Where the code is small, simple, well designed and has few interfacing modules, the developer will have a reasonable chance of identifying potential problems in the near vicinity of the proposed changes. Anomalies found can be eliminated by adjusting the design of the changes to be made (or by adopting a design that avoids risky changes). However, in realistically sized systems, the scale and complexity of this task severely limits its scope and effectiveness.

Some changes will be found by compilers or the build processes used by the developers so these are of less concern. The impact of more subtle changes may require some ‘detective work’. Typically, the programmer may need to perform searches of the entire code base to find specific code patterns. Typical examples might be accesses to a changed database table; usage of a new or changed element in an XML formatted message; usage of a variable that has a changed range of validity and so on.

Usually, formal static analysis of program source code is performed using tools. These can’t be used to predict the impact of changes on behaviour (unless the changed code is analysed, and we’ll look at that next), but the output of tools could be used to focus the programmers’ attention on specific aspects of the system. Needless to say, a defect database used to identify error-prone modules in the code base is invaluable. These error-prone areas could be given special attention in a targeted inspection of the potential side-effects of changes.

Business Impact Analysis (Regression Prevention)

When additional functionality is to be added to a system, or an enhancement to existing functionality is required, some form of business impact analysis, driven by an understanding of the proposed and existing behaviour, is needed. Occasionally, a bug fix requires a significant amount of redesign, so a review of the functional areas to be changed and the functional areas that might be affected is in order.

The business impact analysis is really a set of ‘what-if?’ questions asked of the existing system before the proposed changes are made. The answers to those what-if questions may raise concerns about potential failure modes – the consequences – against which a risk-analysis could be conducted. Of course, the number of potential failure modes is huge and the knowledge available to analyse these risks may be very limited. However, the areas of most concern could be highlighted to the developers for special attention and, in particular, to focus subsequent regression testing.

A business impact analysis follows a fairly consistent process:

  1. PROPOSAL: Firstly, the proposed enhancement, amendment or bug fix resolution is described and communicated to the business users.
  2. CONSEQUENCE: The business users then consider the changes in functionality, and the technical changes that the programmers know will affect certain functional aspects of the system. Are there potential modes of failure of particular concern?
  3. CHALLENGE: Finally, the business users challenge the programmers with “what would happen if...?” questions to tease out any uncertainties and to highlight any specific regression tests that would be appropriate to address their concerns.

The main output from this process may be a set of statements that focus the attention of the developers. Often, specific features or processes may be deemed critical and must not under any circumstances be adversely affected by changes. These features and processes will surely be the subject of regression testing later.

Static Regression Testing (Regression Detection)

Can static analysis tools help with anti-regression? This section makes some tentative suggestions on how static analysis tools could be used to highlight changes that might be of concern. This is a rather speculative proposal. We would be very interested to hear from practitioners or researchers who are working in this area.

Tools are normally used to scan new or changed source code with a view to detecting poor programming practices and statically-detectable defects, so that developers can eliminate them. However, regressions are often found in unchanged code that calls or is called by the changed code. A scan of code in an unchanged module won’t tell you anything you didn’t already know. So, the analysis must look inside changed code and trace the paths affected by the changes to unchanged modules that call or are called by the changed module. Of course there may be extremely complex interactions for these tools to deal with – but that is exactly what the tools exist to do.

The process would look something like this:

  1. An analysis is performed on the unchanged code of the whole code-base or selected components of the system to be changed (the scope needs to be defined and be consistent in this process).
  2. The same analysis is performed on the changed code-base.
  3. The two analyses are compared and differences highlighted to identify where the code changes have affected the structure and execution paths of the system.

Whereas a differencing tool would tell you what has changed in the code, identifying the differences in the output of the static analyses may help you to locate where the code changes have an impact.
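In outline, the three steps might be scripted as below. The ‘static-analyser --control-flow’ command line is entirely hypothetical; substitute whatever control-flow or data-flow report your own tool can produce, and keep the settings identical for both runs.

    import difflib
    import subprocess

    def analyse(source_tree):
        """Run the (hypothetical) static analyser over a source tree; return its report lines."""
        result = subprocess.run(
            ["static-analyser", "--control-flow", source_tree],
            capture_output=True, text=True, check=True,
        )
        return result.stdout.splitlines()

    # Steps 1 and 2: analyse the unchanged and changed code-bases with identical settings.
    baseline_report = analyse("src-before-change")
    changed_report = analyse("src-after-change")

    # Step 3: compare the analyses; differences that fall in unchanged modules are the
    # ones worth investigating for possible regressions.
    for line in difflib.unified_diff(baseline_report, changed_report,
                                     fromfile="before", tofile="after", lineterm=""):
        print(line)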

Tools can generate many types of analysis nowadays. What types of analyses might be of interest?

Control-flow analysis: if two control-flow analyses of a changed system differ, but the differences are in unchanged code, then it is possible that some code is now executed that was not executed before, or that some code which was executed before is no longer reached. The new flow of control in the software may be intentional, but of course, it may not be. This analysis simply gives programmers a pointer to functionality and code paths that are worth further investigation. If the tool can generate graphical control-flow graphs, the changes may be obvious to a trained eye. The process could be analogous to a doctor examining x-rays taken at different times to look for growth of a tumour or healing of a broken bone.

Data-flow analysis: data-flow analysis traces the usage of variables – in assignments, in decision predicates (e.g. referenced in an if... then... else... statement), or in other operations such as assignment to another variable or use in a calculation. A difference in the usage pattern of a variable that is defined in a changed module and passed into or out of unchanged modules may indicate an unwanted change in software behaviour.

Not enough organisations use static analysis tools and few tools are designed to search for differences in ‘deep-flow analysis’ outputs across versions of entire systems – but clearly, there are some potential opportunities worth exploring there. This article leaves you with some ideas but won’t take this suggestion further.

So far, we’ve looked at three techniques for preventing and detecting regressions. In the next article, we’ll examine the activity of more interest to programmers and testers – regression testing.



Tags: #regressiontesting #impactanalysis #anti-regression #testautomation #staticanalysis


First published 07/04/2016

At Eurostar 2010 in Copenhagen, the organisers asked me to do a brief video blog, and I was pleased to oblige. I had presented a track talk on test axioms in the morning and had mentioned a couple of ideas in the talk. These were the “quantum theory of testing” and “testing relativity”. The video goes into a little more detail.

The slides I presented are included in the slideshare set below. The fonts don't seem to have uploaded, I'm afraid:

Advancing Testing Using Axioms



Tags: #testingrelativity #quantumtesting


First published 03/05/2006

I coach rowing, so I'll use this as an analogy. Consider the crew of rowers in a racing eight. The coach’s intention is to get all eight athletes rowing in harmony, with the same movement with balance, poise and control. In theory, if everyone does the same thing, the boat will move smoothly, and everyone can apply the power of their legs, trunk and arms to moving the boat as quickly as possible (and win races). Of course, one could just show the crew a video of some Olympic champions and say, 'do what they do', 'exactly', 'now'. But how dumb is that? Each person is an individual, having different physical shape and size, physiology, ambition, personality, attitudes and skill levels. Each athlete has to be coached individually to bring them up to the 'gold standard'. But it's harder than that, too. It's not as if each athlete responds to the same coaching messages. The coach has to find the right message to get the right response from each individual. For example, to get rowers to protect their lower backs, they must 'sit up' in the boat. Some rowers respond to 'sit up' others to 'keep your head high', 'be arrogant' and so on. That's just the way it is with people.

In the same way, when we want people to adopt a new way of working – a new 'process', we have to recognise that to get the required level of process adherence and consistency, (i.e. changed behaviours) every individual faces a different set of challenges. For each individual, it's a personal challenge. To get each individual to overcome their innate resistance to change, improve their skill levels, adjust their attitudes, and overall, change their behaviour, we have to recognise that each individual needs individual coaching, encouragement and support.

Typical 'process' improvement attempts start with refined processes, some training, a bit of practice, a pilot, then a roll-out. But where is the personal support in all this? To ask a group of individuals to adopt a new process (any process) by showing them the process and saying 'do it' is like asking a village football team to 'play like Brazil'.

Tags: #Rowing #TesterDevelopment #TestProcessImprovement


First published 22/06/2011

Many thanks to Helmut Pichler and Manfred Baumgartner of Anecon who invited me to speak at their joint seminar with Microsoft at the Microsoft office in Vienna, Austria in May. Thanks also to Andreas Pollak of Microsoft who organised the event and who acted as my able assistant when my remote control did not work.

The event agenda and sessions are described here. Andreas assembled the Powerpoint and a voice recording into a movie which is reproduced below. Apologies for the first minute of the talk, as my remote control gadget didn't work. Andreas kindly offered assistance :O)

The talk, ah yes, the talk. Essentially, it's an attempt to discuss the meaning of quality and how testers use test models. Abstract below.

I hope I don't upset too many British Royal Watchers, Apple Product devotees or McDonalds lovers with this talk. I'm not one of you, I'm afraid.

Abstract: Rain is great for farmers and their crops, but terrible for tourists. Wind is essential for sailors and windmills but bad for the rest of us. Quality, like weather, is good or bad and that depends on who you are. Just like beauty, comfort, facility, flavour, intuitiveness, excitement and risk, quality is a concept that most people understand, but few can explain. It’s worse. Quality is an all-encompassing, collective term for these and many other difficult concepts.

Quality is not an attribute of a system – it is a relationship between systems and stakeholders who take different views, and the model of Quality that prevails has more to do with stakeholders than with the system itself. Measurable quality attributes make techies feel good, but they don’t help stakeholders if they can’t be related to experience. If statistics don’t inform the stakeholders’ vision or model of quality, we think we do a good job. They think we waste their time and money.

Whether documented or not, testers need and use models to identify what is important and what to test. A control flow graph has meaning (and value) to a programmer but not to a user. An equivalence partition has meaning to users but not to the CEO. Control flow and equivalence partitions are models with value in some, but never all, contexts. If we want to help stakeholders to make better-informed decisions, then we need test models that do more than identify tests. We need models that take account of the stakeholders’ perspective and have meaning in the context of their decision-making. If we measure quality using technical models (quality attributes, test techniques), we delude both our stakeholders and ourselves into thinking we are in control of Quality.

We’re not.

In this talk, Paul uses famous, funny and tragic examples of system failures to illustrate ways in which test models (and therefore testing) failed. He argues strongly that the pursuit of Quality requires testers to have better test models, and to know how to create them, fast.

Tags: #quality #tornadoes #testmodels


First published 01/05/2007

At last week's Test Management Forum, Susan Windsor introduced a lively session on estimation – from the top down. All good stuff. But during the discussion, I was reminded of a funny story (well I thought it was funny at the time).

Maybe twenty years ago (my memory isn’t as good as it used to be), I was working at a telecoms company as a development team leader. Around 7pm one evening, I was sat opposite my old friend Hugh. The office was quiet, we were the only people still there. He was tidying up some documentation, I was trying to get some stubborn bug fixed (I’m guessing here). Anyway. Along came the IT director. He was going home and he paused at our desks to say hello, how’s it going etc.

Hugh gave him a brief review of progress and said in closing, “we go live a week on Friday – two weeks early”. Our IT director was pleased but then highly perplexed. His response was, “this project is seriously ahead of schedule”. Off he went scratching his head. As the lift doors closed, Hugh and I burst out laughing. This situation had never arisen before. What a problem to dump on him! How would he deal with this challenge? What could he possibly tell the business? It could be the end of his career! Delivering early? Unheard of!

It’s a true story, honestly. But what it also reminded me of was that if estimation is an approximate process, our errors in estimation in the long run (over or under estimation) expressed as a percentage under or over, should balance statistically around a mean value of zero, and that mean would represent the average actual time or cost it took for our projects to deliver.

Statistically, if we are dealing with a project that is delayed (or advanced!) by unpredictable, unplanned events, we should be overestimating as much as we under estimate, shouldn’t we? But clearly this isn’t the case. Overestimation, and delivering early is a situation so rare, it’s almost unheard of. Why is this? Here's a stab at a few reasons why we consistently 'underestimate'.

First (and possibly foremost), we don't underestimate at all. Our estimates are reasonably accurate, but consistently we get squeezed to fit with pre-defined timescales or budgets. We ask for six people for eight weeks, but we get four people for four weeks. How does this happen? If we've been honest in our estimates, surely we should negotiate a scope reduction if our bid for resources or time is rejected? Whether we descope a selection of tests or not, when the time comes to deliver, our testing is unfinished. Of course, go live is a bumpy period – production is where the remaining bugs are encountered and fixed in a desperate phase of recovery. To achieve a reasonable level of stability takes as long as we predicted. We just delivered too early.

Secondly, we are forced to estimate optimistically. Breakthroughs, which are few and far between, are assumed to be certainties. The last project, which was so troublesome, was an anomaly and it will always be better next time. This is, of course, nonsense. One definition of madness is to expect a different outcome from the same situation and inputs.

Thirdly, our estimates are irrelevant. Unless the project can deliver within some mysterious, predetermined time and cost constraints, it won't happen at all. Where the vested interests of individuals dominate, it could conceivably be better for a supplier to overcommit, and live with a loss-making, troublesome post-go-live situation. In the same vein, the customer may actually decide to proceed with a no-hoper project because certain individuals' reputations, credibility and perhaps jobs depend on the go-live dates. Remarkable as it may seem, individuals within customer and supplier companies may actually collude to stage a doomed project that doesn't benefit the customer and loses the supplier money. Just call me cynical.

Assuming project teams aren't actually incompetent, it's reasonable to assume that project execution is never 'wrong' – execution just takes as long as it takes. There are only errors in estimation. Unfortunately, estimators are suppressed, overruled, pressured into aligning their activities with imposed budgets and timescales, and they appear to have been wrong.

Tags: #estimation


First published 21/10/2009

I'm relieved, excited and delighted to tell you that The Tester's Pocketbook has been published and is available. (It is a pocketbook, with 104 pages and c. 19k words).

The book summarises the thinking on Test Axioms and the axiom definitions are hosted (and will be maintained in future) on the Test Axioms website.

Thanks to all my reviewers and people who supported me.

Tags: #paulgerrard #testaxioms #testerspocketbook
