Paul Gerrard

My experiences and opinions of the Test Engineering business. I republish/rewrite old blogs from time to time.

First published 12/03/2010

At Thursday's SIGIST meeting, it was great to have such a positive reaction to my workshop and closing talk.

The Fallibility axiom (p41) tells us our sources of knowledge are undependable. The tester is a human being and prone to error. The system is being tested because we are uncertain of its behaviour or reliability. As a consequence, the plan for any test worth running cannot be relied upon to be accurate before we follow it. Predictions of test status (e.g. coverage achieved or test pass-rate) at any future date or time are notional. The planning quandary is conveniently expressed in the testing uncertainty principle:

  • One can predict test status, but not when it will be achieved;
  • One can predict when a test will end, but not its status.
Consequently, if a plan defines completion of testing using test exit criteria to be met at a specified date (expressed in terms of tests run and the status of those tests), it is wise to regard those criteria as planning assumptions rather than hard targets.
  • If exit criteria are met on time or earlier, our planning assumptions are sound: We are where we want to be.
  • If exit criteria are not met, or not met on time, our plan was optimistic: We must adjust the plan or relax the criteria.
Whichever outcome arises, we still need to think very carefully about what the criteria actually mean in our project.

Tags: #testaxioms #uncertainty

Paul Gerrard Please connect and contact me using my linkedin profile. My Mastodon Account

First published 01/12/2009

The Knowledge Base has moved HERE!

This is a new website hosting a directory of tools in the DevOps, SDET and Testing support domains, together with a searchable index of those tools. There are 2,208 tools registered, although 693 of them are actually programming languages.

The site also monitors the blog pages of 277 bloggers. These again are indexed and searchable.

Numbers correct as of 25/8/2015.



Tags: #tkb #ToolsKnowledgeBase

Paul Gerrard Please connect and contact me using my linkedin profile. My Mastodon Account

First published 07/04/2016

At Eurostar 2010 in Copenhagen, the organisers asked me to do a brief video blog, and I was pleased to oblige. I had presented a track talk on test axioms in the morning and had mentioned a couple of ideas in the talk. These were the “quantum theory of testing” and “testing relativity”. The video goes into a little more detail.

The slides I presented are included in the slideshare set below. The fonts don't seem to have uploaded, I'm afraid:

Advancing Testing Using Axioms



Tags: #testingrelativity #quantumtesting

Paul Gerrard Please connect and contact me using my linkedin profile. My Mastodon Account

First published 22/06/2011

Many thanks to Helmut Pichler and Manfred Baumgartner of Anecon who invited me to speak at their joint seminar with Microsoft at the Microsoft office in Vienna, Austria in May. Thanks also to Andreas Pollak of Microsoft who organised the event and who acted as my able assistant when my remote control did not work.

The event agenda and sessions are described here. Andreas assembled the PowerPoint slides and a voice recording into a movie, which is reproduced below. Apologies for the first minute of the talk, as my remote control gadget didn't work until Andreas kindly offered assistance :O)

The talk, ah yes, the talk. Essentially, it's an attempt to discuss the meaning of quality and how testers use test models. Abstract below.

I hope I don't upset too many British Royal Watchers, Apple Product devotees or McDonalds lovers with this talk. I'm not one of you, I'm afraid.

Abstract: Rain is great for farmers and their crops, but terrible for tourists. Wind is essential for sailors and windmills but bad for the rest of us. Quality, like weather, is good or bad depending on who you are. Just like beauty, comfort, facility, flavour, intuitiveness, excitement and risk, quality is a concept that most people understand, but few can explain. It’s worse. Quality is an all-encompassing, collective term for these and many other difficult concepts.

Quality is not an attribute of a system – it is a relationship between systems and stakeholders who take different views, and the model of quality that prevails has more to do with the stakeholders than the system itself. Measurable quality attributes make techies feel good, but they don’t help stakeholders if they can’t be related to experience. If statistics don’t inform the stakeholders’ vision or model of quality, we may think we are doing a good job; they think we are wasting their time and money.

Whether documented or not, testers need and use models to identify what is important and what to test. A control flow graph has meaning (and value) to a programmer but not to a user. An equivalence partition has meaning to users but not to the CEO. Control flow graphs and equivalence partitions are models with value in some, but never all, contexts.

If we want to help stakeholders to make better-informed decisions then we need test models that do more than identify tests. We need models that take account of the stakeholders’ perspective and have meaning in the context of their decision-making. If we measure quality using technical models (quality attributes, test techniques) we delude both our stakeholders and ourselves into thinking we are in control of Quality.

We’re not.

In this talk, Paul uses famous, funny and tragic examples of system failures to illustrate ways in which test models (and therefore testing) failed. He argues strongly that the pursuit of quality requires better test models, and that testers need to learn how to create them, fast.

Tags: #quality #tornadoes #testmodels

Paul Gerrard Please connect and contact me using my linkedin profile. My Mastodon Account

First published 01/12/2009

The extraordinary growth of the Internet is sweeping through industry. Small companies can now compete for attention in the global shopping mall – the World Wide Web. Large corporations see ‘The Web’ not only as an inexpensive way to make company information on products and services available to anyone with a PC and browser, but increasingly as a means of doing on-line business with worldwide markets. Companies are using the new paradigm in four ways:

  • Web sites - to publicise services, products, culture and achievements.
  • Internet products - on-line services and information to a global market on the Web.
  • Intranet products - on-line services and information for internal employees.
  • Extranet products - on-line services and information enabling geographically distributed organisations to collaborate.

Web-based systems can be considered a particular type of client/server architecture. However, the way these systems are assembled and used means that some specialist tools are required and, since such tools are becoming available, we will give them some consideration here.

The risks particular to Web-based applications are especially severe where the system may be accessed by thousands or tens of thousands of customers. The very high visibility of some systems being built means that failure in such systems could be catastrophic. Web pages usually comprise text-based files in Hypertext Markup Language (HTML), but now contain executable content, so the traditional separation of ‘code and data’ is no longer possible or appropriate. Browsers, plug-ins, active objects and Java are also new concepts which are still immature.

There are four main categories of tools which support the testing of Web applications:

Application test running

Test running tools that can capture tests of user interactions with Web applications and replay them now exist. These tools are either enhancements to existing GUI test running tools or are new tools created specifically to drive applications accessed through Web browsers. The requirements for these tools are very similar to normal GUI test running tools, but there are some important considerations.

Firstly, the Web testing tool will be designed to integrate with specific browsers. Ask your vendor whether the tool supports the browsers your Web application is designed for, and check whether older versions of those browsers are also supported. To test simple text-oriented HTML pages, the Web testing tool must be able to execute the normal web page navigation commands and recognise HTML objects such as tables, forms, frames, links to other pages, and content consisting of text, images, video clips and so on. HTML text pages are often supplemented by server-side programs, typically in the form of Common Gateway Interface (CGI) scripts, which perform more substantial processing. These should be transparent to the test tool.
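
The capture/replay tools of that era were proprietary, but the approach survives in today's open source browser drivers. As a rough modern analogue (my illustration; Selenium is not a tool discussed in the original article, and the URL and element names are hypothetical), a minimal browser-driving test might look like this:

# Drive a login form through a real browser and check the result.
# Selenium WebDriver is used here as a modern stand-in for the
# capture/replay tools described above.
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
try:
    driver.get("https://example.com/login")
    driver.find_element(By.NAME, "username").send_keys("tester")
    driver.find_element(By.NAME, "password").send_keys("secret")
    driver.find_element(By.NAME, "submit").click()
    assert "Welcome" in driver.title    # a simple expected-result check
finally:
    driver.quit()                       # always release the browser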

Increasingly, web applications will consist of simple text-based web pages, as before, but complemented with ‘active content’ components. These components are likely to be Java applets, Netscape ‘plug-ins’ or ActiveX controls. Tools capable of dealing with these components are only just emerging. Given the portable nature of the Java language, tools written in Java may actually be completely capable of dealing with any legitimate Java object in your Web application, so may be an obvious choice. However, if other non-Java components are present in your application, a ‘pure-Java’ tool may prove inadequate. Another consideration is how tools cope with dynamically generated HTML pages – some tools cannot.

HTML source, link, file and image checkers

Tools have existed for some time which perform ‘static tests’ of Web pages and content. These tools open a Web page (a site Home page, for example) to verify the syntax of the HTML source and check that all the content, such as images, sounds and video clips, can be accessed and played/displayed. Links to other pages on the site can be traversed, one by one. For each linked page, the content is verified, until the tool runs out of unvisited links. These tools are usually configured to stop the search once they encounter a link to an off-server page or another site, but they can effectively verify that every home-site page, link and item of content is present. Most of these tools provide graphical reports on the structure of Web sites which highlight the individual pages, internal and external links, missing links and other missing content.
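
The core of such a checker is a simple crawl loop. Here is a minimal sketch (my illustration in Python, not one of the tools described; a real checker would also verify images and other embedded content):

# Fetch pages, extract anchor links and follow same-site links until
# none remain unvisited; report links that could not be fetched.
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen

class LinkParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.links = []
    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def check_site(home):
    site = urlparse(home).netloc
    to_visit, seen, broken = [home], set(), []
    while to_visit:
        url = to_visit.pop()
        if url in seen:
            continue
        seen.add(url)
        try:
            html = urlopen(url).read().decode("utf-8", errors="replace")
        except OSError:
            broken.append(url)          # missing page or content
            continue
        parser = LinkParser()
        parser.feed(html)
        for link in parser.links:
            absolute = urljoin(url, link)
            if urlparse(absolute).netloc == site:   # stop at off-site links
                to_visit.append(absolute)
    return broken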

Component test-drivers

Advanced Web applications are likely to utilise active components which are not directly accessible using a browser-oriented Web testing tool. Currently, developers have to write customised component drivers, for example using the main{} method in Java classes to exercise the methods in the class without having to go through other classes.
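
The shape of such a driver is straightforward. Here is a minimal sketch of the same idea in Python rather than Java (my choice of language; ShoppingCart is a hypothetical component under test):

# A hand-written component driver: construct one component, exercise its
# methods directly and check the results, without going through a browser.
class ShoppingCart:
    def __init__(self):
        self.items = {}
    def add(self, sku, qty):
        self.items[sku] = self.items.get(sku, 0) + qty
    def total_items(self):
        return sum(self.items.values())

def drive():
    # plays the role of the main{} method in a Java class
    cart = ShoppingCart()
    cart.add("A100", 2)
    cart.add("A100", 1)
    assert cart.total_items() == 3, "add() should accumulate quantities"
    print("component driver: all checks passed")

if __name__ == "__main__":
    drive()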

As web applications become more sophisticated, the demand for specialised component drivers to test low-level functionality and integration of components will increase. Such tools may be delivered as part of a development toolkit but, unfortunately, development tool vendors are more often interested in providing the ‘coolest’ development tools rather than testing tools.

Internet performance testing

Web applications are most easily viewed as a particular implementation of client/server. The Web performance testing tools that are available are all enhanced versions of established client/server-based tools. We can consider the requirements for load generation and for client application response time measurement separately.

Load generation tools rely on a master or control program, running on a server, which drives either physical workstations using a test running tool to drive the application, or test drivers which submit Web traffic across the network to the Web servers. In the first case, all that is new is that it is the Web-oriented test running tool which drives the client application through a browser. For larger tests, test drivers capable of generating Web traffic across the network are used. Here, the test scripts dispatch calls to the web servers directly: rather than a SQL-oriented database protocol such as ODBC, the test scripts use HTTP. All that has changed is the test driver programs.
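
A minimal sketch of that second, driver-based style follows (my illustration in Python; the target URL and workload numbers are hypothetical). Concurrent virtual users submit HTTP requests and record response times:

# Simulate several virtual users issuing HTTP requests concurrently and
# measure response times; a toy version of the load drivers described above.
import threading
import time
from urllib.request import urlopen

TARGET = "https://example.com/"   # hypothetical system under test
CLIENTS = 10                      # simulated virtual users
REQUESTS_PER_CLIENT = 5

timings = []
lock = threading.Lock()

def virtual_user():
    for _ in range(REQUESTS_PER_CLIENT):
        start = time.monotonic()
        try:
            urlopen(TARGET).read()
        except OSError:
            continue                  # count successful requests only
        elapsed = time.monotonic() - start
        with lock:
            timings.append(elapsed)

threads = [threading.Thread(target=virtual_user) for _ in range(CLIENTS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

if timings:
    print(f"{len(timings)} responses, mean {sum(timings) / len(timings):.3f}s")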

Client application response time measurement is done using the Web test running tool. This may be a standalone tool running on a client workstation, or the client application driver controlled by the load generation tool.



Tags: #tools #automation #web #internet

Paul Gerrard Please connect and contact me using my linkedin profile. My Mastodon Account

First published 16/12/2012

A couple of weeks ago I gave a talk that included a couple of slides that focused on the idea of Specification by Example and how it cannot be relied upon to fully define the functionality of a software solution. I thought I'd summarise it here while the thought was fresh in my mind and also because Martin Fowler recently re-posted a blog originally published some time ago.

Martin provides a broader perspective and significantly, he says 'Specification By Example only works in the context of a working relationship where both sides are collaborating and not fighting'. Quite. He quotes a Cedric Beust post that critiques TDD (and Agile projects in general) that promote the use of tests as specifications.

Clearly, SBE can work nicely in an Agile environment where the scenarios are there to capture some key examples of the feature in use. The more general business rules to be implemented are (presumably) discussed and captured elsewhere – specifically in the code, and exemplified in tests. The examples and automated tests based on the conversations are retained to provide evidence that the software 'works' and stays working after changes elsewhere. One obvious, valuable outcome of SBE, Behaviour-Driven or Test-Driven approaches is a set of automated tests that are a quite effective anti-regression measure for projects that practise continuous delivery. But what about non-Agile? Can SBE work in all contexts?

The question is, “can examples alone be trusted to fully describe some system behaviour?” The answer is occasionally yes, but usually no. Here's an example of why not.

The table below shows some scenarios associated with a feature. Call it SBE, BDD or just a shorthand for some TDD tests. Whatever.

given a, b, c are real numbers
when a=<a>
  and b=<b>
  and c=<c>
then r1=<r1> and r2=<r2>

| a  | b   | c  | r1  | r2    |
| 1  | -2  | 1  | 1   | 1     |
| 1  | 3   | 2  | -1  | -2    |
| 12 | -28 | 15 | 1.5 | 0.833 |

It doesn't give much away does it? “Do you know what it is yet?” (as Rolf Harris might ask).

Now, I could keep giving you new examples that are correct from the point of view of the requirement (which I'm not yet sharing with you). Maybe you'd spot the pattern of the inputs and outputs and guess that a, b and c are the coefficients of a quadratic equation and r1, r2 are its roots. Aha. The programmer could easily implement the code as follows:

from math import sqrt

r1 = (-b + sqrt(b*b - 4*a*c)) / (2*a)
r2 = (-b - sqrt(b*b - 4*a*c)) / (2*a)
Sorted. But is it...?

Suppose I then gave you an example that could NOT be processed by the quadratic formula? The example below would probably cause an exception in the code:

| a |  b | c |  r1 | r2  |
| 4 |  3 | 2 | ... | ... |
You can't take square roots of negative numbers. So you could argue that there's a validation rule (not yet specified) that rejects inputs that cause this exception, and change the code accordingly. But in fact one CAN take the square root of a negative number. The results are just called 'complex numbers', that's all. (Mathematicians can be a bit slippery.) Have we got it right yet? We'd have to look at the expected outcomes in the examples provided, generate them in code and hope for the best. Whatever. That's enough maths for one day.
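
To see the point in code rather than maths, here is a minimal Python sketch (my illustration, not from the original post): with complex arithmetic, the 'impossible' example produces roots after all.

import cmath

def roots(a, b, c):
    # cmath.sqrt returns a complex result where math.sqrt would raise
    d = cmath.sqrt(b * b - 4 * a * c)
    return (-b + d) / (2 * a), (-b - d) / (2 * a)

print(roots(1, -2, 1))   # (1+0j, 1+0j): the first example row
print(roots(4, 3, 2))    # complex roots for the 'invalid' example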

The principle must be that examples on their own do not provide enough information to formulate a general solution. It is always possible to code a solution that will satisfy the examples provided. But that is not a solution – it is mimicry. A coded table of pre-defined answers can mimic the general solution, but the very next example, when used to test our solution, will fail if it is not in our coded table. Our model of the solution is incomplete – it's wrong. In fact, to be certain we have the perfect solution, we would effectively need an infinite number of examples that, when tested, generate the required outcomes. Specification by Example ALONE cannot provide a complete specification.
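
To make the mimicry point concrete, here is a hypothetical 'implementation' (my sketch, not from the original post) that satisfies every example given so far while solving nothing:

# A coded table of pre-defined answers: it passes all the examples provided,
# but the very next unseen example raises a KeyError.
ANSWERS = {
    (1, -2, 1): (1, 1),
    (1, 3, 2): (-1, -2),
    (12, -28, 15): (1.5, 0.833),
}

def roots(a, b, c):
    return ANSWERS[(a, b, c)]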

Where does this leave us? Specifications are models of systems. All models are wrong (or at least incomplete) but some are useful. But having a specification is a necessary (but probably not sufficient) condition for building a solution.

Perhaps Specification by Example is mis-named. It should be called Specification AND Example.

The question remains, “How much software is out there that just mimics a solution?”

Tags: #specificationbyexample #SBE

Paul Gerrard Please connect and contact me using my linkedin profile. My Mastodon Account

First published 05/11/2009

This talk sets out some thoughts on what's happening in the testing marketplace. It covers Benefits-Based Testing, Testing Frameworks, Software Success Improvement and Tester Skills, and provides some recommendations for building your career.

Registered users can download the paper from the link below. If you aren't registered, you can register here.

Tags: #testingtrends

Paul Gerrard Please connect and contact me using my linkedin profile. My Mastodon Account

First published 06/11/2009

This talk was presented at EuroSTAR 2007 in Stockholm in December 2007. I also presented it at the EuroSTAR-Testnet mini-event at Nieuwegein, Holland, on the same night as Liverpool played AC Milan in the Champions League Final, hence the picture on slide 2. (It's a shame they lost 2-1.) The focus of the talk is that using lessons learned can help you formulate a better test strategy or, as I am calling them nowadays, 'Project Intelligence Strategies'.

Registered users can download the paper from the link below. If you aren't registered, you can register here.

Tags: #sap #erp #lessonslearned

Paul Gerrard Please connect and contact me using my linkedin profile. My Mastodon Account

First published 06/11/2009

This talk was presented at SQSTEST in 2006. It sets out an alternative way of thinking about 'process improvement'. My argument is that we should focus on results, then define the changes we need to make. It draws on Results-Chain theory and the change management approach of John Kotter.

Registered users can download the paper from the link below. If you aren't registered, you can register here.

Tags: #softwaresuccessimprovement

Paul Gerrard Please connect and contact me using my linkedin profile. My Mastodon Account

First published 14/12/2011

When the testing versus checking debate started with Michael Bolton’s blog here http://www.developsense.com/blog/2009/08/testing-vs-checking/, I read the posts and decided it wasn’t worth getting into. It seemed to be a debate amongst the followers of the blog and the school, rather than a more widespread unsettling of the status quo.

I fully recognise the difference between testing and checking (as suggested in the blogs). Renaming what most people call testing today to 'checking', and redefining testing in the way Michael suggests, upset some folk and cheered others. Most if not all developer testing, and all testing through an API using tools, becomes checking – by definition. I guess developers might sniff at that. Pretty much what exploratory testers do becomes the standard for what the new testing is, so they are happy. Most testers tend not to follow blogs, so they are still blissfully unaware of the debate.

In a tweet, Brian Marick suggested the blogs were a ‘power play’ and pointed to an interesting online conversation here http://tech.groups.yahoo.com/group/agile-testing/message/18116. The suggested redefinitions appear to underplay checking and promote the virtue of testing. Michael clarified his position here http://www.developsense.com/blog/2009/11/merely-checking-or-merely-testing/ and said:

“The distinction between testing and checking is a power play, but it’s not a power play between (say) testers and programmers. It’s a power play between the glorification of mechanizable assertions over human intelligence. It’s a power play between sapient and non-sapient actions.”

In the last year or so, I’ve had a few run-ins with people and presenters at conferences when I asked what they meant by checking when they used the word. They tended to forget the distinction and focus on the glorification bit. They told me testing was good (“that’s what I get paid for”) and checking was bad, useless or for drones. I’m not unduly worried by that – but it’s kind of irritating.

The problem I have is that if the idea (distinguishing test v check) is to gain traction, and I believe it should, then changing the definition of testing is hardly going to help. It will confuse more than clarify. I hold that the scope of testing is much broader than testing software. In our business we test systems (a system could be a web page; it could be a hospital). The word and the activity are in widespread use in almost every business, scientific and engineering discipline you can imagine. People may or may not be checking, but to ask them to change the name and description of what they do seems a bit ambitious. All the textbooks, papers and blogs written by people in our business would have to be reinterpreted and possibly changed. Oh, and how many dictionaries around the world would need a correction? My guess is it won’t happen.

It’s much easier to say that a component of testing is checking. Know exactly what that is and you are a wiser tester. Sapient even.

The test v check debate is significant in the common exploratory context: an individual deciding what to do right now, in an exploratory session perhaps. But it isn’t significant in the context of larger projects and teams. The sapience required in an exploratory session is concentrated in the moment-to-moment decision making of the tester. The sapience in other projects is found elsewhere.

In a large business project, say an SAP implementation, there might be ten to twenty legacy and SAP module system test teams, plus multiple integration test teams, as well as one or several customer test teams, all working at a legacy system, SAP module or integrated system level. SAP projects vary from maybe fifty to several thousand man-years of effort, of which a large percentage (tens of percent) is testing of one form or another. Although there will be some exploration in there, most of the test execution will be scripted, and it’s definitely checking as we know it.

But the checking activity probably accounts for a tiny percentage of the overall effort, and much of it is automated. The sapient effort goes into the logistics of managing the quite large teams of people who must test in this context. Ten to twenty legacy systems must be significantly updated, system tested, then integrated with other legacy systems and kept in step with SAP modules that are being configured with perhaps ten thousand parameter changes. All this takes place in between ten and thirty test environments over the course of one to three years. And in all this time, business-as-usual changes on the legacy systems, and on the systems to be migrated and/or retired, must be accommodated.

As the business and projects learn what it is about, requirements evolve and all the usual instability disturbs things. But change is an inevitable consequence of learning and large projects need very careful change management to make sure the learning is communicated. It’s an exploratory process on a very large scale. Testing includes data migration, integration with customers, suppliers, banks, counterparties; it covers regulatory requirements, cutover and rollback plans, workarounds, support and maintenance processes as well as all the common non-functional areas.

Testing in these projects has some parallels with a military campaign. It’s all about logistics. Test checking activity compares with ‘pulling the trigger’.

Soldiering isn’t just about pulling triggers. In the same way, testing isn’t just about checking. Almost all the sapient activity goes into putting the testers into exactly the right place at the right time, fully equipped with meaningful and reliable environments, systems under test, integral data and clear instructions, with dedicated development, integration, technical, data, domain expert support teams. Checking may be manual or automated, but it’s a small part of the whole.

Exploration in environments like these can’t be done ‘interactively’. It really could take months and tens/hundreds of thousands of pounds/dollars/euros to construct the environment and data to run a speculative test. Remarkably, exploratory tests are part of all these projects. They just need to be wisely chosen and carefully prepared, just like other planned tests, because you have a limited time window and might not get a second chance. These systems are huge production lines for data so they need to be checked endlessly end to end. It’s a factory process so maybe testing is factory-like. It’s just a different context.

The machines on the line (the modules/screens) are extremely reliable commercial products. They do exactly what you have configured them to do, with Teutonic reliability. The exploration is really one of requirements, configuration options and the known behaviour of modules used in a unique combination. Test execution is confirmation, but it seems that it can be done no other way.

It rarely goes smoothly of course. That’s logistics for you. And testing doing what it always does.

Tags: #testingvchecking #checking #factory

Paul Gerrard Please connect and contact me using my linkedin profile. My Mastodon Account