First published 03/12/2009
This paper gives general guidance about selecting and evaluating commercial CAST tools. It is intended to provide a starting point for tool assessment. Although it is not as detailed or specific as a tailored report prepared by a consultant, it should enable you to plan the basics of your own tool selection and evaluation process.
It is easy to make the mistake of considering only tool function, and not the other success factors related to the organisation where the tool will be used; this is one reason that expensive tools end up being unused only a few months after purchase. Following the advice given in this paper should help you to avoid some of these problems.
There are a surprising number of steps in the tool selection process. The following diagram is an overview of the whole process.
Overview of the selection process
Where to start
You are probably reading this report because you want to make your testing process more efficient through the use of a software testing tool. However, there is a wide variety of tools available; which one(s) should you buy?
There are a number of different types of testing tools on the market, and they serve different purposes. Buying a capture/replay tool will not help you measure test coverage; a static analysis tool will not help in repeating regression tests.
You also need to consider the environment where the testing tool will be used: a mainframe tool is no use if you only have PCs.
The skills of the people using the tool also need to be taken into account; if a test execution tool requires programming skills to write test scripts, it would not be appropriate for use by end-users only.
These considerations are critical to the successful use of the tool; if the wrong tool is selected, whether it is the wrong tool for the job, the environment or the users, the benefits will not be achieved.
The tool selection and evaluation team
Someone should be given the responsibility for managing the selection and evaluation process. Generally a single individual would be authorised to investigate what tools are available and prepare a shortlist, although there could be several people involved. The researchers need to have a reasonable idea of what type of tool is needed, who within the organisation would be interested in using it, and what the most important factors are for a tool to qualify for the shortlist.
After the shortlist is prepared, however, it is wise to involve a number of people in the evaluation process. The evaluation team should include a representative from each group planning to use the tool, and someone from each type of job function that will use it. For example, if non-technical end-users will be using a test execution tool to run user acceptance tests, then a tool that needs programming skills would be excluded and an end-user should be on the evaluation team. The usability of a tool has a significant effect on whether the tool becomes accepted as part of the testing process.
If the evaluation team becomes involved in a trial scheme, the intended user must make the recommendation as to the tool’s usability. However, the team may need to have access to a systems support resource. The role of the support person is to assist in overcoming technical problems which appear important to the user or developer but are actually quite easily overcome by a technician.
The selection and evaluation team may also go on to become the implementation team, but not necessarily.
What are the problems to be solved?
The starting point for tool selection is identifying what problem needs to be solved, where a tool might provide a solution. Some examples are:
- tests which are currently run manually are labour-intensive, boring and lengthy
- tests which are run manually are subject to inconsistencies, owing to human error in inputting tests
- we have no measure of how complete or thorough our tests are; the amount of software exercised by a test suite is difficult or impossible to compute manually
- paper records of tests are cumbersome to maintain, leading to tests being repeated or omitted
- when small changes are made to the software, extensive regression tests must be repeated and there is not time to do it manually
- setting up test data or test cases is repetitive and ‘mechanical’: the testers find it uninteresting and make too many ‘simple’ errors
- comparison of test results is tedious and error-prone
- errors are found during testing which could have been detected before any tests were run by examining the code carefully enough
- users find too many errors that could have been found by testing.
The current testing process must be sufficiently well-defined that it is easy to see the areas where automated improvement would actually help.
Is a tool the right solution?
Software testing tools are one way to approach these types of problems, but are not the only way. For example, code inspection could be used to address the problem of detecting errors before test execution. Better organisation of test documentation and better test management procedures would address the problem of omitting or repeating tests. Considering whether all the repetitive ‘mechanical’ test cases are really necessary may be more important for test effectiveness and efficiency than blindly automating them. The use of a testing tool will not help in finding more errors in testing unless the test design process is improved, which is done by training, not by automation.
An automated solution often ‘looks better’ and may be easier to authorise expenditure for than addressing the more fundamental problems of the testing process itself. It is important to realise that the tool will not correct a poor process without additional attention being paid to it. It is possible to improve testing practices alongside implementing the tool, but it does require conscious effort.
However, we will assume that you have decided, upon rational consideration of your own current situation (possibly with some tool readiness assessment advice from an outside organisation), that a testing tool is the solution you will be going for.
How much help should the tool be?
Once you have identified the area of testing you want the tool to help you with, how will you be able to tell whether any tool you buy has actually helped? You could just buy one and see if everyone feels good about it, but this is not the most rational approach. A better way is to define measurable criteria for success for the tool. For example, if the length of time taken to run tests manually is the problem, how much quicker should the tests be run using a tool?
Setting measurable criteria is not difficult, at least to the level of a broad idea of costs, and a general idea is all that is necessary to know whether the tool will be cost-justified. A realistic measurable criterion for a test execution tool might be set out as follows:
Manual execution of tests currently takes 4 man-weeks. In the first 3 months of using the tool, 50–60 per cent of these tests should be automated, with the whole test suite run in 2–2½ man-weeks. Next year at this time we aim to have 80 per cent of the tests automated, with the equivalent test suite being run in 4 man-days.
An approach to measuring the potential savings of a test coverage tool might be:
We currently believe that our test cases ‘completely test’ our programs, but have no way of measuring coverage. As an experiment on a previously tested (but unreleased) program, rerun the dynamic tests using a coverage measurement tool. We will almost certainly find that our tests reached less than 100 per cent coverage. Based on the tool’s report of the unexecuted code we can devise and run additional test cases. If these additional test cases discover errors – serious ones that are deemed likely to have appeared sometime in the future in live running – then the tool would make a saving by detecting those errors during testing, which is less costly than in live running. The potential saving per error is the difference between the cost of an error found in live running (say £5,000 for a modest error that must be corrected) and the cost of the same error found during testing (say £500).
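To make the arithmetic concrete, here is a minimal sketch, purely illustrative, of how the potential saving could be estimated. The £5,000 and £500 figures are the assumed values from the example above, not measured costs.

```python
# Illustrative only: estimate the saving from errors caught by extra
# coverage-driven tests rather than in live running. The cost figures
# are the assumptions used in the example above.

COST_IN_LIVE_RUNNING = 5000   # £ per modest error corrected after release
COST_IN_TESTING = 500         # £ per error found and fixed during testing

def coverage_tool_saving(errors_found_by_extra_tests: int) -> int:
    """Potential saving if these errors would otherwise have reached live running."""
    return errors_found_by_extra_tests * (COST_IN_LIVE_RUNNING - COST_IN_TESTING)

if __name__ == "__main__":
    # e.g. 3 serious errors found in code the original suite never executed
    print(f"Estimated saving: £{coverage_tool_saving(3):,}")   # £13,500
```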
A similar approach could be applied to determining whether a static analysis tool is worth using:
Run the static analysis tool on a group of programs that have been through dynamic testing but are not yet released. Evaluate (in the opinion of a senior analyst) the cost of the genuine errors detected. For those errors which were also found in the original dynamic testing, the static analyser might not save the cost of running the dynamic test (because you do not design your dynamic tests assuming that errors have been found previously), but it might save the cost of all the diagnosis and rerunning that dynamic testing entails. More interesting is the cost of those errors which were only detected by static analysis but which would have caused a problem in live running. The cost of detecting the error through static analysis is likely to be one-hundredth the cost of finding it in live running.
When looking at the measurable benefits it is best to be fairly conservative about what could be accomplished. When a tool is used for the first time it always takes much longer than when people are experienced in using it, so the learning curve must be taken into account. It is important to set realistic goals, and not to expect miracles. It is also important that the tool is used correctly, otherwise the benefits may not be obtained.
If you find that people are prepared to argue about the specific numbers which you have put down, ask them to supply you with more accurate figures which will give a better-quality evaluation. Do not spend a great deal of time ‘polishing’ your estimates: the tool evaluation process should be only as long as is needed to come to a decision, and no longer. Your estimates should reflect this granularity.
How much is this help worth?
The measurable criteria that you have identified as achievable will have a value to your organisation; it is important to quantify this value in order to compare the cost of the tool with the cost saved by the benefits. One of the simplest ways to quantify the benefits is to measure the saving of time and multiply that by approximate staff costs.
For example, if regression tests which normally take 4 man-weeks manually can be done in 2 man-weeks we will save 2 man-weeks of effort whenever those tests are run. If they are run once a quarter, we will save 8 man-weeks a year. If they are run once a month, we will save 24 man-weeks a year. (If they are only run once a year we will only save 2 man-weeks in that year.) If a man-week is costed at say, £2,000, we will save respectively £16,000, £48,000 or £4,000.
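As a worked illustration of this calculation, the sketch below uses the figures from the example above (4 man-weeks manual, 2 man-weeks automated, £2,000 per man-week); substitute your own numbers.

```python
# A minimal sketch of the saving calculation above. All figures are the
# illustrative ones from the text, not real costs.

def annual_saving(manual_weeks: float, automated_weeks: float,
                  runs_per_year: int, cost_per_man_week: float) -> float:
    """Man-week saving per test run, multiplied up to a yearly cash figure."""
    weeks_saved_per_run = manual_weeks - automated_weeks
    return weeks_saved_per_run * runs_per_year * cost_per_man_week

if __name__ == "__main__":
    for runs, label in [(4, "quarterly"), (12, "monthly"), (1, "yearly")]:
        saving = annual_saving(4, 2, runs, 2000)
        print(f"{label:9s}: £{saving:,.0f}")   # £16,000 / £48,000 / £4,000
```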
These savings are realised by redeploying people onto more productive work: development, enhancements or better test design.
There will also be other benefits, which may be very difficult if not impossible to quantify but which should also be mentioned. The risk of an embarrassing public release may be reduced, for example, but it may not be possible to put a monetary value on this. Morale is likely to improve, which in turn is likely to increase productivity, but it may not be possible or desirable to separate this from the productivity gain of using the tool itself. There may be some things that are simply not possible to do manually, which will not be discovered until the tool has been in use; such unanticipated benefits cannot be quantified in advance because no one has yet realised they exist.
Of course this is a very simplistic start to building a proper business case for the tool, but it is essential that some first attempt is made to quantify the benefits, otherwise you will not be able to learn from your tool evaluation experience for next time.
Tool Requirements
What tool features are needed to meet requirements?
The next step is to begin to familiarise yourself with the general capabilities of tools of the type you want.
Which of the features listed are the most important ones to meet the needs and objectives for the tool in your current situation? For example, if you want to improve the accuracy of test results comparison, a capture/replay tool without a comparator would not help.
Make a list of the features, classified as ‘essential’, ‘desirable’ and ‘don’t matter’. The essential features list would rule out any tool which did not provide all of the things on that list, and the desirable features would be used to discriminate among those tools that provide all the essential features.
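The shortlisting rule this implies can be expressed very simply. The sketch below uses made-up tool names and feature labels purely for illustration: a tool missing any essential feature is ruled out, and the remainder are ranked by how many desirable features they offer.

```python
# Illustrative shortlisting logic: all names and features are invented.

ESSENTIAL = {"captures keyboard input", "results comparator", "unattended replay"}
DESIRABLE = {"test log", "data-driven scripts", "coverage reporting"}

candidate_tools = {
    "Tool A": {"captures keyboard input", "results comparator",
               "unattended replay", "test log"},
    "Tool B": {"captures keyboard input", "results comparator"},
}

# Rule out any tool missing an essential feature...
qualifying = {name: feats for name, feats in candidate_tools.items()
              if ESSENTIAL <= feats}

# ...then rank the remainder by how many desirable features they offer.
ranked = sorted(qualifying, key=lambda n: len(qualifying[n] & DESIRABLE),
                reverse=True)
print(ranked)   # ['Tool A']  (Tool B lacks unattended replay)
```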
Note that your feature list will change as you progress in evaluating tools. You will almost certainly discover new features that you think are desirable as you go through the evaluation process. The tool vendors are sure to point out features they can supply but which you did not specifically request. Other tool users may recommend a feature as essential because of their experience, which you may not have thought was so important. For example, you may not consider the importance of being able to update your test scripts whenever the software changes because you are concentrating on the use of the execution tool for capturing tests the first time. However, this may be a significant ‘running cost’ for using the testing tool in the future. It is also possible that a feature that you thought was desirable is not required owing to the way other features are implemented.
As well as the functions that the tool performs, it is important to include some grading of usability as a feature for evaluation. Tools that have sound technical features, but are difficult to use, frequently become shelfware.
What are the constraints?
Environmental constraints
Testing tools are software packages and therefore may be specific to particular hardware, software or operating systems. You would not want a tool that runs only on a VAX VMS system if you have an IBM MVS system and no possibility of acquiring or using anything else.
Most people look for a tool which will run on the environment in which they are developing or maintaining software, but that is not the only possibility. A number of tools can run on a PC, for example, and can execute tests running on a different computer. Even debug and coverage measurement tools can work in a ‘host–target’ or client–server environment.
Having to acquire additional hardware is sometimes more of a psychological barrier than a technical or economic one. In your tool selection process, especially if there are not many tools available for your ‘home’ environment, it is worth considering tools that run on a separate environment.
However, you may need to acquire extra hardware even for a tool that runs on your own current environment, for example extra disk space to store test scripts.
Make sure that you find out exactly what the tool requires in terms of hardware and software versions. For example, you would not want to discover at installation time that you needed to have an operating system upgrade or additional memory before the tool can work. Have you considered security aspects? Do you need a separate language compiler for the test scripts?
Commercial supplier constraints
The company that you buy the tool from will be an important factor for your future testing practices. If you have problems with the tool, you will want them sorted out quickly and competently. If you want to get the best from the tool, you will want to take advantage of their expertise. You may want to influence the future development of the tool to provide for those needs which are not currently met by it.
There are a number of factors that you should take into consideration in evaluating the tool vendor’s organisation:
- Is the supplier a bona fide company?
- How mature are the company and the product? If the company is well established this gives confidence, but if the product has not changed significantly in recent years it may be getting rather out of date. Some organisations will feel that they need to buy products from the product vendor who sets the trend in the marketplace. Others will be wary of new product companies; a brand new organisation or product is an unknown quantity, but it may be just what you need at just the right time, and a new vendor may be much more eager to please their first customers;
- Is there adequate technical support? What would their response be to major or minor problems? Does the vendor run a help desk? What hours is help available? (If your vendor is in California and you are in Europe, there will be no overlap of their working day with yours!) What training courses are provided? How responsive are they to requests for information?
- How many other people have purchased or use this tool? You may or may not want to be the very first commercial user of a new tool. Can you talk to any other users? Is there a user group, and when does it meet and who controls it? Will they provide a reference site for you to talk to?
- What is the tool’s history? Was it developed to support good internal testing practices, to meet a specific client need, or as a speculative product? How many releases have there been to date, and how often is the tool updated? How many open faults are there currently reported?
Your relationship with the tool vendor starts during the selection and evaluation phase. If there are problems with the vendor now (when they want your money), there are likely to be even more problems later.
Cost constraints
Cost is often the most stringent and most visible constraint on tool selection. The purchase price may only be a small factor in the total cost to the organisation in fully implementing the tool. Cost factors include:
- purchase or lease price
- cost basis (per seat, per computer etc.)
- cost of training in the use of the tool
- any additional hardware needed (e.g. a PC, additional disk space or memory)
- support costs
- any additional costs, e.g. consultancy to ensure the tool is used in the best way.
Other constraints
Tool quality factors may include:
- How many people can use the tool at the same time? Can test scripts be shared?
- What skill level is needed to use the tool? How long does it take to become proficient? Are programming skills needed to write test scripts?
- What documentation is supplied? How thorough is it? How usable is it? Are there ‘quick reference guides’, for example?
There may well be other constraints which override all of the others, for example ‘political’ factors, such as having to buy the same tool that the parent company uses (e.g. an American parent enforces an American tool as its standard), or a restriction against buying anything other than a locally supported tool, perhaps limiting the choice to a European tool, or even specifically a British, French or German one. It is frustrating to tool selectors to discover these factors late on in the selection process.
Constructing the shortlist
Use the cross-references in this report to find the tools that meet your environmental requirements and provide the features that are essential for you. Read the descriptions in the tools pages. This should give you enough information to know which tools listed in this report can go on your shortlist for further evaluation.
If there are more than six or seven tools that are suitable for you, you may want to do some initial filtering using your list of desirable features so that you will be looking at only three or four tools in your selection process.
If no suitable tools, or too few, are found in this report, the search could be widened to other countries (e.g. the USA).
Other sources of information include pre-commercial tools (if you can find out about them). It is worth asking your current hardware or system software supplier if they know of any tools that meet your needs. If you are already using a CASE tool, it would be worth asking your vendor about support for testing, either through future development of their tool or by linking to an existing CAST tool. Conferences and exhibitions are where new vendors often go to announce a new tool. In particular the EuroSTAR conference is the prime showcase for testing tools.
The possibility of in-house development of a special-purpose tool should also be assessed. Do not forget to consider any existing in-house written tools within your own organisation that may be suitable for further development to meet your needs. The true cost of in-house development, including the level of testing and support needed to provide a tool of adequate quality, will be significant. It is generally much more than the cost of a commercial tool, but an in-house written tool will be more directly suitable to your own needs. For example, it can help to compensate for a lack of testability in the software under test. A purchased tool may need additional tailoring in order to meet real needs, and this can be expensive.
Another possibility is to use a ‘meta-tool’ to develop a new tailored tool in a short time. A meta-tool provides software for building software tools quickly, using the existing foundation of a standardised but highly tailorable user interface, graphical editor and text editor. It could enable a new testing tool, tailored to a specific organisation, to be built within a few months.
Summary of where to look for tools:
- Existing environment (this report)
- In-house prototype for development
- Future environment of likely vendor
- Conferences and exhibitions
Tool Evaluation
Evaluating the shortlisted candidate tools
Research and compare tool features
Contact the vendors of the shortlisted tools and arrange to have information sent (if you have not done this already). Study the information and compare features. Request further information from the vendors if the literature sent does not explain the tool function clearly enough.
This is the time to consult one or more of the publications which have evaluated testing tools, if the ones you are interested in are covered in such a report. The cost of such reports should be compared to the cost of someone’s time in performing similar evaluations, and the cost of choosing the wrong tool because you did not know about something which was covered in published material. (Do not forget to allow time to read the report.)
Ask the shortlisted vendors to give you the names of some of their existing customers as reference. Contact the reference sites from each shortlisted vendor and ask them a number of questions about the tool. For example, why they bought this tool, how extensively it is now used, whether they are happy with it, what problems they have had, their impression of the vendor’s after-sales support service, how the tool affected their work, what benefits the tool gave them, and what they would do differently next time they were buying a tool. Remember that reference sites are usually the vendor’s best customers, and so will be likely to be very happy with the tool. Their environment is different from yours, so the benefits or problems which they have had may well not be the same as the ones which are important to you. However, the experience of someone else who bought a tool for similar reasons to yours is invaluable and well worth pursuing.
Many vendors are aware that a tool does not always add up to a total solution and are keen to present it as part of a more comprehensive offering, often including consultancy and training beyond just their product. They usually understand the issues covered in this paper because bad selection and bad implementation of their tools gives them a bad reputation. Because the vendors have good experience in getting the best out of their tools, their solutions may enhance the tools significantly and are worth serious examination. Nevertheless, it is always worth bearing in mind that the tool supplier is ultimately trying to persuade you to buy his product.
At any point in the selection and tool evaluation process it may become clear which tool will be the best choice. When this happens, any further activities will not influence the choice of tool but may still be useful in assessing in more detail how well the chosen tool will work in practice. It will either detect a catastrophic mismatch between the selected tool and your own environment, or will give you more confidence that you have selected a workable tool.
Tool demonstrations: preparation
Before contacting the vendor to arrange for a tool demonstration, some preparatory work will help to make your assessment of the competing tools more efficient and unbiased. Prepare two test case suites for tool demonstration:
- one of a normal ‘mainstream’ test case
- one of a worst-case ‘nightmare’ scenario.
Rehearse both tests manually, in order to discover any bugs in the test scenarios themselves. Prepare evaluation forms or checklists:
- general vendor relationship (responsiveness, flexibility, technical knowledge)
- tool performance on your test cases. Set measurable objectives, such as time to run a test on your own (first solo flight), time to run a reasonable set of tests, and time to find an answer to a question in the documentation (a simple form for this is sketched after this list).
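A hypothetical demo-day evaluation form is sketched below; the objective names and units are assumptions used only to illustrate the idea, and one copy would be completed per tool demonstrated.

```python
# A deliberately simple, hypothetical evaluation form. Adjust the fields
# to match your own measurable objectives.

def new_evaluation_form(tool_name: str) -> dict:
    return {
        "tool": tool_name,
        "vendor responsiveness (1-5)": None,
        "vendor flexibility (1-5)": None,
        "vendor technical knowledge (1-5)": None,
        "minutes to run first test unaided": None,   # the 'first solo flight'
        "minutes to run the prepared test suite": None,
        "minutes to find an answer in the documentation": None,
        "notes": "",
    }

forms = [new_evaluation_form(t) for t in ("Tool A", "Tool B", "Tool C")]
```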
It is important that the tools be set up and used on your premises, using your configurations, and we recommend this, if at all possible, for the demonstration. We have had clients report to us that they found this single step to be extremely valuable, when they discovered that their prime candidate tool simply would not run in their environment! Of course, the vendor may be able to put it right but this takes time, and it is better to know about it before you sign on the dotted line, not after.
Invite the vendors of all shortlisted tools to give demonstrations within a short time-frame, for example on Monday, Wednesday and Friday of the same week. This will make sure that your memory of a previous tool is still fresh when you see a different one.
Give vendors both of your test cases in advance, to be used in their demo. If they cannot cope with your two cases in their demo, there probably is not much hope of their tool being suitable. However, be prepared to be flexible about your prepared tests. The tool may be able to solve your underlying problem in a different way than you had pictured. If your test cases are too rigid, you may eliminate a tool which would actually be very suitable for you.
Find out what facilities the vendors require and make sure they are available. Prepare a list of questions (technical and commercial) to ask on the demo day, and prepare one more test case suite to give them on the day. Allow time to write up your reactions to each of the tools, say at the end of each day.
Tool demonstrations from each vendor
Provide facilities for the vendor’s presentation and their demonstration. Listen to the presentation and ask the questions you had prepared.
Observe their running of your prepared test case suites. Try your own ‘hands-on’ demonstration of your prepared test case suites and the new one for the day. Have a slightly changed version of the software being tested, so that the test suite needs to be modified to test the other version. Have the vendors edit the scripts if they insist, but it is better to edit them yourself with their assistance, so that you can see how much work will be involved in maintaining scripts.
Ask (and note) any more questions which occur to you. Note any additional features or functions which you had not realised this tool provided. Note any features or functions which you thought it did provide but does not, or not in the form you had thought.
Try to keep all demonstrations the same as far as possible. It is easy for the last one to incorporate improvements learned during the other demonstrations, but this is not fair to the first one. Save new ideas for use in the competitive trial.
Thank and dismiss the vendor. Write up your observations and reactions to this tool.
Post-demonstration analysis
Go back to the vendors you saw first with any questions that occurred to you while watching a later vendor’s presentation or demonstration. This will give the fairest comparison between the tools.
Assess tool performance against measurable criteria defined earlier, taking any special circumstances into account. Compare features and functions offered by competing tools. Compare non-functional attributes, such as usability. Compare the commercial attributes of vendor companies.
If a clear winner is now obvious, select the winning tool. Otherwise select two tools for final competitive trial. Write to the non-selected vendors giving the reason for their elimination.
Competitive trial
If it is not clear which tool is the most appropriate for you at this point, an in-house trial or evaluation will give a better idea of how you would use the tool for your systems.
Most tool vendors will allow short-term use of the tool under an evaluation licence, particularly for tools which are complex and represent a major investment. Such licences will be for a limited period of time, and the evaluating unit must plan and prepare for that evaluation accordingly.
It is all too easy to acquire the tool under an evaluation licence only to find that those who really ought to be evaluating it are tied up in some higher-priority activity during that time. If they are not able or willing to make the time available during the period of the licence to give the tool more than the most cursory attention, then the evaluation licence will be wasted.
The preparation for the trial period includes the drafting of a test plan and test suites to be used by all tools in the trial. Measurable success criteria for the evaluated tools should be planned in advance, for example length of time to record a test or to replay a test, and the number of discrepancies found in comparison (real, extraneous and any missed). Attending a training course for each tool will help to ensure that they will be used in the right way during the evaluation period.
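One simple way to capture these trial criteria per tool is sketched below; the field names and figures are illustrative assumptions only, not recommended targets.

```python
# Illustrative record of the competitive-trial success criteria mentioned
# above, captured once per tool under evaluation.

from dataclasses import dataclass

@dataclass
class TrialResult:
    tool: str
    minutes_to_record_test: float
    minutes_to_replay_test: float
    real_discrepancies: int        # genuine differences correctly reported
    extraneous_discrepancies: int  # spurious differences reported
    missed_discrepancies: int      # genuine differences not reported

results = [
    TrialResult("Tool A", 30, 5, 4, 1, 0),
    TrialResult("Tool B", 20, 8, 3, 6, 1),
]

for r in results:
    print(f"{r.tool}: replay {r.minutes_to_replay_test} min, "
          f"{r.extraneous_discrepancies} spurious, {r.missed_discrepancies} missed")
```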
When the competi
Tags: #cast
Paul Gerrard