Defect testing
• Testing programs to establish the presence of system defects
Objectives
• To understand testing techniques that are geared to discover program faults
• To introduce guidelines for interface testing
• To understand specific approaches to object-oriented testing
• To understand the principles of CASE tool support for testing
Topics covered
• Defect testing
• Integration testing
• Object-oriented testing
• Testing workbenches
● Component testing
Testing of individual program components
Usually the responsibility of the component developer (except sometimes for
critical systems)
Tests are derived from the developer’s experience
● Integration testing
Testing of groups of components integrated to create a system or sub-system
The responsibility of an independent testing team
Tests are based on a system specification
Testing phases
[Figure: the two testing phases - component testing followed by integration testing.]
Defect testing
●The goal of defect testing is to discover defects in programs
●A successful defect test is a test which causes a program to behave in an anomalous
way
●Tests show the presence not the absence of defects
Testing priorities
●Only exhaustive testing can show a program is free from defects. However, exhaustive
testing is impossible
●Tests should exercise a system's capabilities rather than its components
●Testing old capabilities is more important than testing new capabilities
●Testing typical situations is more important than boundary value cases
●Test cases: inputs to test the system and the predicted outputs from these inputs if the system operates according to its specification
The defect testing process
Black-box testing
●An approach to testing where the program is considered as a ‘black-box’
●The program test cases are based on the system specification
●Test planning can begin early in the software process
[Figure: the black-box testing model. Input test data (Ie) includes the inputs causing anomalous behaviour; output test results (Oe) include the outputs which reveal the presence of defects.]
Equivalence partitioning
●Input data and output results often fall into different classes where all members of a
class are related
●Each of these classes is an equivalence partition where the program behaves in an
equivalent way for each class member
●Test cases should be chosen from each partition
[Figure: equivalence partitioning. Invalid inputs and valid inputs are fed to the system, which produces outputs.]
[Figure: partitions for the number of input values. Invalid partitions: less than 10000 (e.g. 9999) and greater than 99999 (e.g. 100000); valid partition: 10000 to 99999, with boundary values 10000 and 99999 and mid value 50000.]
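To make the partitions concrete, here is a minimal test sketch in Java for the 5-digit input example above (isValidInput is a hypothetical stand-in for the routine under test):

// Hypothetical validator: accepts exactly the 5-digit values 10000..99999.
public class PartitionTest {
    static boolean isValidInput(int n) {
        return n >= 10000 && n <= 99999;
    }

    public static void main(String[] args) {
        // One representative per partition, plus the boundary values.
        int[] invalidLow  = { 9999 };                 // partition: fewer than 5 digits
        int[] valid       = { 10000, 50000, 99999 };  // partition: exactly 5 digits
        int[] invalidHigh = { 100000 };               // partition: more than 5 digits

        for (int n : invalidLow)  assert !isValidInput(n) : "expected invalid: " + n;
        for (int n : valid)       assert  isValidInput(n) : "expected valid: " + n;
        for (int n : invalidHigh) assert !isValidInput(n) : "expected invalid: " + n;
        System.out.println("All partition tests passed.");
    }
}

(Run with assertions enabled, i.e. java -ea PartitionTest.)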
Search routine specification
procedure Search (Key : ELEM ; T: ELEM_ARRAY;
Found : in out BOOLEAN; L: in out ELEM_INDEX) ;
Pre-condition
-- the array has at least one element
T’FIRST <= T’LAST
Post-condition
-- the element is found and is referenced by L
( Found and T (L) = Key)
or
-- the element is not in the array
( not Found and not (exists i, T’FIRST <= i <= T’LAST, T (i) = Key ))
Search routine input partitions
Array Element
Single value In sequence
Single value Not in sequence
More than 1 value First element in sequence
More than 1 value Last element in sequence
More than 1 value Middle element in sequence
More than 1 value Not in sequence
Structural testing
White-box testing
[Figure: white-box testing. Tests are derived from knowledge of the component code; the tests derive the test data, and running the tests against the code gives the test outputs.]
// This is an encapsulation of a binary search function that takes an array of
// ordered objects and a key and returns an object with 2 attributes, namely
//   index - the value of the array index
//   found - a boolean indicating whether or not the key is in the array
// An object is returned because it is not possible in Java to pass basic types by
// reference to a function and so return two values
// the index is -1 if the element is not found
class BinSearch
{
public static void search ( int key, int [] elemArray, Result r )
{
int bottom = 0 ;
int top = elemArray.length - 1 ;
int mid ;
r.found = false ; r.index = -1 ;
while ( bottom <= top )
{
mid = (top + bottom) / 2 ;
if (elemArray [mid] == key)
{
r.index = mid ;
r.found = true ;
return ;
} // if part
else
{
if (elemArray [mid] < key)
bottom = mid + 1 ;
else
top = mid - 1 ;
}
} // while loop
} // search
} // BinSearch
Binary search (Java)
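The Result class assumed by search is not shown in the handout; a minimal sketch consistent with the comments above might be:

// Hypothetical sketch of the Result class used by BinSearch.search.
class Result {
    public boolean found; // true if the key is in the array
    public int index;     // array index of the key, or -1 if not found
}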
[Figure: binary search equivalence classes, showing the equivalence class boundaries and the midpoint element.]
[Table: binary search test cases - not reproduced.]
Path testing
●The objective of path testing is to ensure that the set of test cases is such that each
path through the program is executed at least once
●The starting point for path testing is a program flow graph that shows nodes
representing program decisions and arcs representing the flow of control
●Statements with conditions are therefore nodes in the flow graph
Program flow graphs
●Describes the program control flow. Each branch is shown as a separate path and
loops are shown by arrows looping back to the loop condition node
●Used as a basis for computing the cyclomatic complexity
●Cyclomatic complexity = Number of edges - Number of nodes + 2
Cyclomatic complexity
●The number of tests to test all control statements equals the cyclomatic complexity
●Cyclomatic complexity equals number of conditions in a program
●Useful if used with care. Does not imply adequacy of testing.
●Although all paths are executed, all combinations of paths are not executed
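For example, the binary search flow graph below has 9 nodes and 11 edges (assuming the usual numbering), so its cyclomatic complexity is 11 - 9 + 2 = 4, which matches the four independent paths listed after the graph.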
Binary search flow graph
[Figure: flow graph for binary search, nodes 1-9. Node 2: while (bottom <= top), exiting to node 8 when bottom > top; node 3: if (elemArray[mid] == key), exiting to node 8 when the key is found; node 4: if (elemArray[mid] < key), leading to nodes 5 and 6; node 7: end of the loop body, returning to node 2; node 9: exit.]
Independent paths
●1, 2, 3, 8, 9
●1, 2, 3, 4, 6, 7, 2
●1, 2, 3, 4, 5, 7, 2
●1, 2, 3, 4, 6, 7, 2, 8, 9
●Test cases should be derived so that all of these paths are executed
●A dynamic program analyser may be used to check that paths have been executed
Integration testing
●Tests complete systems or subsystems composed of integrated components
●Integration testing should be black-box testing with tests derived from the specification
●Main difficulty is localising errors
●Incremental integration testing reduces this problem
Incremental integration testing
[Figure: incremental integration testing. Components A, B, C and D are integrated one at a time; at each step the existing tests (T1, T2, T3, ...) are re-run and new tests (T4, T5) are added for the newly integrated component.]
Top-down testing
[Figure: top-down testing. The testing sequence starts with the Level 1 modules; lower levels (Level 2, Level 3, ...) are represented by stubs until they are integrated.]
Bottom-up testing
[Figure: bottom-up testing. The testing sequence starts with the Level N modules, exercised by test drivers; the drivers are then replaced by the real Level N-1 modules.]
Testing approaches
●Architectural validation
•Top-down integration testing is better at discovering errors in the system
architecture
●System demonstration
•Top-down integration testing allows a limited demonstration at an early stage in the
development
●Test implementation
•Often easier with bottom-up integration testing
●Test observation
•Problems with both approaches. Extra code may be required to observe tests
Interface testing
●Takes place when modules or sub-systems are integrated to create larger systems
●Objectives are to detect faults due to interface errors or invalid assumptions about
interfaces
●Particularly important for object-oriented development as objects are defined by their
interfaces
[Figure: interface testing. Test cases are applied at the interfaces between the integrated sub-systems A and B.]
Interface types
●Parameter interfaces
•Data passed from one procedure to another
●Shared memory interfaces
•Block of memory is shared between procedures
●Procedural interfaces
•Sub-system encapsulates a set of procedures to be called by other sub-systems
●Message passing interfaces
•Sub-systems request services from other sub-systems
Interface errors
●Interface misuse
•A calling component calls another component and makes an error in its use of its
interface, e.g. calling it with parameters in the wrong order
●Interface misunderstanding
•A calling component embeds assumptions about the behaviour of the called
component which are incorrect
●Timing errors
•The called and the calling component operate at different speeds and out-of-date
information is accessed
Interface testing guidelines
●Design tests so that parameters to a called procedure are at the extreme ends of their
ranges
●Always test pointer parameters with null pointers
●Design tests which cause the component to fail
●Use stress testing in message passing systems
●In shared memory systems, vary the order in which components are activated
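As a sketch, the first three guidelines might be applied to the binary search routine shown earlier (Result is the hypothetical class sketched above):

// Interface tests for BinSearch.search: extreme parameter values and a
// null array parameter intended to make the component fail.
public class InterfaceTests {
    public static void main(String[] args) {
        Result r = new Result();

        // Extreme ends of the parameter range: empty array, single element.
        BinSearch.search(17, new int[] {}, r);
        assert !r.found : "empty array should not contain the key";

        BinSearch.search(17, new int[] { 17 }, r);
        assert r.found && r.index == 0 : "single-element array, key present";

        // Null pointer parameter: a test designed to cause a failure.
        try {
            BinSearch.search(17, null, r);
            System.out.println("no exception raised for null array");
        } catch (NullPointerException e) {
            System.out.println("null array rejected with NullPointerException");
        }
    }
}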
Stress testing
●Exercises the system beyond its maximum design load. Stressing the system often
causes defects to come to light
●Stressing the system tests its failure behaviour. Systems should not fail catastrophically.
Stress testing checks for unacceptable loss of service or data
●Particularly relevant to distributed systems which can exhibit severe degradation as a
network becomes overloaded
Object-oriented testing
●The components to be tested are object classes that are instantiated as objects
●Larger grain than individual functions so approaches to white-box testing have to be
extended
●No obvious ‘top’ to the system for top-down integration and testing
Testing levels
●Testing operations associated with objects
●Testing object classes
●Testing clusters of cooperating objects
●Testing the complete OO system
Object integration
●Levels of integration are less distinct in object-oriented systems
●Cluster testing is concerned with integrating and testing clusters of cooperating objects
●Identify clusters using knowledge of the operation of objects and the system features
that are implemented by these clusters
Approaches to cluster testing
[Figure: an object interaction (sequence) diagram used for cluster testing, showing the calls request (report), acknowledge (), report (), summarise (), send (report), reply (report) and acknowledge ().]
Testing workbenches
●Most testing workbenches are open systems because testing needs are organisation-specific
●Difficult to integrate with closed design and analysis workbenches
A testing workbench
[Figure: a testing workbench. A test data generator derives test data from the specification; the program is executed, possibly against a simulator; a file comparator checks the execution report against the expected results, and a report generator produces the test results report.]
Key points
●Test parts of a system which are commonly used rather than those which are rarely
executed
●Equivalence partitions are sets of test cases where the program should behave in an
equivalent way
●Black-box testing is based on the system specification
●Structural testing identifies test cases which cause all paths through the program to be
executed
●Test coverage measures ensure that all statements have been executed at least once.
●Interface defects arise because of specification misreading, misunderstanding, errors or
invalid timing assumptions
●To test object classes, test all operations, attributes and states
●Integrate object-oriented systems around clusters of objects
1. What is testing?
The act of checking if a part or a product performs as expected.
2. Why test?
• Gain confidence in the correctness of a part or a product.
• P sorts the input sequence in descending order and prints the sorted
sequence.
Correctness again
P is considered correct with respect to a specification S if and only if:
For each valid input the output of P is in accordance with the specification
S.
7. Errors, defects, faults
Error: A mistake made by a programmer
Example: Misunderstood the requirements.
Defect/fault: Manifestation of an error in a program.
Example:
Incorrect code: if (a<b) {foo(a,b);}
Correct code: if (a>b) {foo(a,b);}
Failure
• Incorrect program behavior due to a fault in the program.
• Failure can be determined only with respect to a set of requirement
specifications.
• A necessary condition for a failure to occur is that execution of the program forces the erroneous portion of the program to be executed. What is the sufficiency condition?
Errors and failure
[Figure: inputs and outputs of a program. Error-revealing inputs cause failure; erroneous outputs indicate failure.]
8. Debugging
• Suppose that a failure is detected during the testing of P.
• The process of finding and removing the cause of this failure is known as
debugging.
• The word bug is slang for fault.
• Testing usually leads to debugging
• Testing and debugging usually happen in a cycle.
Test-debug cycle
[Figure: the test-debug cycle. Run a test; if a failure is observed, debug and test again; otherwise, if testing is complete, stop; else run the next test.]
9. Types of testing
program modules.
• Testing should begin “in the small” and progress toward testing “in the large”.
• Exhaustive testing is not possible.
• To be most effective, testing should be conducted by an independent third
party.
Software Testability
According to James Bach:
Software testability is simply how easily a computer program can be tested.
A set of program characteristics that lead to testable software:
• Operability: “the better it works, the more efficiently it can be tested.”
• Observability: “What you see is what you test.”
• Controllability: “The better we can control the software, the more the testing
can be automated and optimized.”
• Decomposability: “By controlling the scope of testing, we can more quickly
isolate problems and perform smarter retesting.”
• Simplicity: “The less there is to test, the more quickly we can test it.”
• Stability: “The fewer the changes, the fewer the disruptions to testing.”
• Understandability: “The more information we have, the smarter we will test.”
Test Case Design
Two general software testing approaches:
Black-Box Testing and White-Box Testing
Black-box testing:
Knowing the specified functions of the software, design tests that demonstrate each function and check it for errors.
Major focus:
functions, operations, external interfaces, external data and information
White-box testing:
Knowing the internals of the software, design tests that exercise all internals of the software to make sure they operate according to specifications and designs.
Major focus: internal structures, logic paths, control flows, data flows, internal data structures, conditions, loops, etc.
White-Box Testing and Basis Path Testing
White-box testing is also known as glass-box testing.
It is a test case design method that uses the control structure of the procedural design to derive test cases.
Using white-box testing methods, we derive test cases that
• Guarantee that all independent paths within a module have been exercised at least once.
• Exercise all logical decisions on their true and false sides.
• Execute all loops at their boundaries and within their operational bounds.
• Exercise internal data structures to assure their validity.
Basis path testing (a white-box testing technique):
• First proposed by Tom McCabe [MCC76].
• Can be used to derive a logical complexity measure for a procedural design.
• Used as a guide for defining a basis set of execution paths.
• If an input condition specifies a member of a set, one valid and one invalid equivalence class are defined.
• If an input condition is Boolean, one valid and one invalid class are defined.
Examples:
area code: input condition, Boolean - the area code may or may not be present.
input condition, range - value defined between 200 and 900
password: input condition, Boolean - a password may or may not be present.
input condition, value - six character string.
command: input condition, set - containing the commands noted before.
Validation:
– Does a s/w product satisfy its requirement?
alternatively
– Are we building the right-product?
Verification:
– Does a s/w product satisfy its specification?
alternatively
– Are we building the product right?
Levels of testing
Systematic testing
An ad hoc approach to testing is:
– ineffective: too late to influence design decisions
– inefficient: testing cannot be reused during maintenance (e.g.)
– hit and miss
Systematic testing:
– systematic: consists of a test plan and test strategy
– planned: actually design for testability
– documented: for repeatability and understandability
– maintained: repeat tests after every change
Test-case selection
Black-box or functional selection
– Test cases are based on the requirements or specification.
– Note: may miss important aspects of the implementation.
White-box or structural selection
– Test cases are based on the implementation.
– Note: may miss important aspects of the requirements (won’t test missing
functionality).
Function f(p,q)
p: [0..100] and q: {red, green, blue}
calculate test values for p (interval rule)
– correct values: {0, 50, 100}
– exception values: {-100,-1,101,1000}
values for q
– correct values: {red, green, blue}
– exception values: none
number of combinations is 7 * 3 = 21
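A minimal sketch that enumerates these 21 combinations (the value sets come straight from the interval rule above):

// Enumerates the 7 x 3 = 21 test combinations for f(p, q).
public class Combinations {
    public static void main(String[] args) {
        int[] pValues = { 0, 50, 100, -100, -1, 101, 1000 }; // correct + exception values
        String[] qValues = { "red", "green", "blue" };        // correct values only

        int count = 0;
        for (int p : pValues)
            for (String q : qValues)
                System.out.printf("test %2d: f(%d, %s)%n", ++count, p, q);
        // Prints 21 combinations in total.
    }
}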
[The listing of the `triangle' routine is not reproduced; only its closing lines survive:]
return (SC);
}
Test cases for `triangle’
1) valid scalene
2) valid equilateral
3) valid isosceles
4) 3 isosceles: a==b, b==c, a==c
5) a=0
6) a=b=c=0
7) a=-1
8) 3 invalid: a=b+c, b=a+c, c=a+b
9) 3 invalid: a>b+c, b>a+c, c>a+b
Assume the compiler checks for correct types
Myer scoring: 1 point for each test case, and 1 point if you have specified the outcome of
each test case. Maximum score is 10.
In 1978, when the book was published, experienced professional programmers scored
on average about 6.
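Since the handout omits the body of the routine, here is a minimal hypothetical sketch of a triangle-classification function consistent with the test cases above:

// Hypothetical sketch: classifies a triangle given its three sides.
// Sides that are non-positive or fail the triangle inequality are invalid.
public class Triangle {
    static final int SCALENE = 0, ISOSCELES = 1, EQUILATERAL = 2, INVALID = 3;

    static int triangle(int a, int b, int c) {
        int SC;
        if (a <= 0 || b <= 0 || c <= 0 || a >= b + c || b >= a + c || c >= a + b)
            SC = INVALID;               // covers test cases 5-9
        else if (a == b && b == c)
            SC = EQUILATERAL;           // test case 2
        else if (a == b || b == c || a == c)
            SC = ISOSCELES;             // test cases 3-4
        else
            SC = SCALENE;               // test case 1
        return (SC);
    }
}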
Test scaffolding
Top-down testing
Use stubs to test top-level modules
Stepwise replace stubs by lower-level modules
depth-first or breadth-first
Bottom-up testing
Use drivers to test bottom-level modules
Stepwise replace drivers by higher-level modules
Top-down vs bottom-up testing
Top-down testing
+ early availability of executable
+ early detection of flaws in interfaces
- poor controllability and observability
- cost of developing and maintaining stubs
Bottom-up testing
+ better controllability and observability
- executable available very late
- cost of developing and maintaining drivers
System testing
Test the entire system in production environment
Various forms:
volume/stress testing (load limits)
performance testing (efficiency)
reliability testing (e.g. mean-time-to-failure)
recovery testing (e.g. h./w failure)
acceptance testing (by the client)
regression testing (maintenance)
Regression Testing
As a consequence of the introduction of new bugs, program maintenance requires
far more system testing per statement than any other programming. Theoretically,
after each fix, one must run the entire bank of test cases previously run against the
system, to ensure that it has not been damaged in an obscure way ... and this is very
costly.
Three dimensions of quality
operational
correctness (functional testing)
reliability (reliability testing)
efficiency (performance testing)
integrity (recovery testing)
usability (acceptance testing)
transitional
portability, reusability, interoperability
revisional
maintainability, flexibility, testability
Test documents
Test plan:
the specification of the test system
test selection strategy
e.g. IEEE Standard 829 for S/W Test Documentation
Test report
report of the results of the test implementation
Planning
build the test `scaffolding’, which consists of:
drivers: modules that call code under test
stubs: modules called by code under test
generate the test cases
determine the expected output for each test case
write the test plan
Execution
execute the test cases and generate a report that includes the actual outputs
Evaluation
compare the actual and expected outputs
Test plan
an informal specification of the tests
consists of:
assumptions
assumptions on which the testing depends
environment
environment in which testing is performed
test-case selection strategy
choice of test cases
implementation strategy
key aspects of the implementation
[Fragment of a table of stack operations and result types (s_pop: empty; g_top: integer, empty; g_depth: integer); the surrounding context is not reproduced.]
An argument that testing has limits and in certain respects cannot be complete enough to replace proof
Limits of Testing
Concurrent / Distributed / Mobile systems
–Concurrency
–Non-determinism
–Mobility
Non-functional properties
–Performance
–Reliability, Fault Tolerance
(Statistical metrics)
Non-testable condition
–Environment
Testing based solely on analysis of program implementation is not sufficient
Few formal methods are available explicitly for testing
Formal Proof
Exploratory Testing
In scripted testing, tests are first designed and recorded. Then they may be executed at
some later time or by a different tester.
In exploratory testing, tests are designed and executed at the same time, and they often
are not recorded.
Black-Box Testing
• Synonyms for black-box testing include: behavioral, functional, opaque-box, and
closed-box testing.
• Black-box testing treats the system as “black-box”, so it doesn’t explicitly use
knowledge of the internal structure.
• Black-box testing is usually described as focusing on testing functional
requirements.
Black-box testing is used to assess the accuracy of the model's input-output transformation.
[Figure: input -> model -> output.]
Black-box testing is applied by feeding test data to the model and evaluating the corresponding outputs. The concern is how accurately the model transforms a given set of input data into a set of output data. If we can test all input-output transformation paths, then we can get 100% confidence.
Black-box testing is based on the view that any model can be considered to be a function that maps values from its input domain to values in its output range. The content (implementation) of a black-box is not known, and the function of the black-box is understood completely in terms of its inputs and outputs. Many times, we operate very effectively with black-box knowledge; in fact this is central to object orientation. As an example, most people successfully operate automobiles with only black-box knowledge.
The more of the system's input domain is covered in testing, the more confidence we gain in the accuracy of the system's input-output transformation.
Examples of generation of test data
Exhaustive Testing
Random Testing
Systematic Way
Exhaustive Testing
The simple model is designed to print every input integer on the screen.
It is not practical to test this model by running it with every possible data input; the number of elements in the set of int values is clearly too large. In such cases, we don't attempt exhaustive testing.
To test this simple printer model, we may try hundreds of times until we finally feed the integer 9 to it and find there is an error in the model, for it cannot print input integers less than 10 on the screen.
Fortunately, however, there are strategies for testing in a systematic way. One goal-oriented approach is to cover general classes of data. We should test at least one example of each category of inputs, as well as boundaries and other special cases.
A Systematic Way of Testing
There are strategies for testing in a systematic way. One goal-oriented approach is to cover general classes of data.
We should test at least one example of each category of inputs, as well as boundaries and other special cases.
[Figure: three cases of input categories - negative values, zero, and positive values.]
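As a sketch of this category-based approach (printInt below is a hypothetical stand-in for the printer model under test):

// Tests at least one value from each category - negative, zero, positive -
// plus boundaries and other special cases.
public class CategoryTest {
    static String printInt(int n) { return Integer.toString(n); } // stand-in model

    public static void main(String[] args) {
        int[] testValues = {
            Integer.MIN_VALUE, -42,   // negative values
            0,                        // zero
            9, 42, Integer.MAX_VALUE  // positive values, including the failing input 9
        };
        for (int n : testValues)
            System.out.printf("input %d -> output %s%n", n, printInt(n));
    }
}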
Field Testing
Field testing, also known as live environment testing, places the system in an operational (real) environment (i.e. using real data as the input source).
The purpose is to collect as much information as possible.
It is necessary to demonstrate system capabilities for acceptance.
Target environments and associated development tools are usually much more
expensive to provide to system developer than host environments.
It is rare for a host environment to be identical to a target environment. There may be just minor differences in configuration, or there may be major differences, such as between a workstation and an embedded control processor, where the two environments may even have different instruction sets.
Test Implementation
Direct access to target hardware, such as input-output devices.
Calls to target environment system, such as an operating system or real-time
kernel.
Acceptance testing has to be conducted in the target environment!
The final stage of system testing, acceptance testing, has to be conducted in the target
environment. Acceptance of a system must be based on tests of the actual system, and
not a simulation in the host environment.
Some of the mutant models will turn out to be functionally equivalent to the model. That is, they will be indistinguishable from the model under the test data.
A mutation score is a number in the interval [0,1]. A high score indicates that D is very close to being adequate for the model relative to the set of mutants of the model. A low score indicates a weakness in the test data: the test data does not distinguish the test model from the mutant models, which contain errors.
If all mutants of the model being tested give incorrect results on execution with the test data D, we say they die on execution. If all mutants die, it is highly likely that the model being tested is correct.
If some mutants are live and the test data D is adequate, then either the live mutants are functionally equivalent to the model or there still might be complex errors in the model.
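A minimal sketch of a single mutant and test data that kills it (the predicate and values are illustrative only):

// The original predicate a < b is mutated to a <= b. Test data with a == b
// distinguishes ("kills") the mutant; test data with a != b leaves it alive.
public class MutantDemo {
    static boolean original(int a, int b) { return a < b; }
    static boolean mutant(int a, int b)   { return a <= b; }

    public static void main(String[] args) {
        int a = 5, b = 5; // boundary test case: kills the mutant
        System.out.println("original: " + original(a, b) + ", mutant: " + mutant(a, b));
    }
}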
[Figure: verification and validation across the lifecycle. Define requirements is paired with the system test (verify requirements, validate requirements); design system with the integration test (verify system, validate system); code system with the unit test (verify code, validate code).]
Unit Testing
–Testing performed on a single, standalone module or unit of code.
Integration Testing
–Testing performed on groups of related modules to ensure data and control are
passed properly between modules.
System Testing
–A predetermined combination of tests that, when executed successfully, satisfy
management that the system meets specifications
–Validates that the system was built right.
User Acceptance Testing
–Testing to ensure that the system meets the need of the organization and the end
user/customer
–Validates that the right system was built.
Regression Testing
–Testing after changes have been made to ensure that no unwanted changes were
introduced to the software or system.
Functional Tests
–Tests that validate business requirements
–Tests what the system is supposed to do
Black Box Tests
–Functional testing
–Based on external specifications without knowledge of how the system is
constructed
Structural Tests
–Tests that validate the system architecture
–Tests how the system was implemented
White Box or Glass Box Tests
–Structural testing
–Testing based on knowledge of internal structure and logic
–Usually logic driven
If x=curr-date then
set next-val to 03
else
set next-val to 05.
To effectively test systems, both functional and structural testing need to be performed.
The Economics of Testing - Making the Message to Management
Test Strategy
–Reliability
Who Will Conduct Testing?
–Users
–Developers
What Are the Tradeoffs?
–Schedule
–Cost/Resources
–Quality
How Critical is the System to the Organization?
–Risk Assessment
Risk Assessment
Approach (Strategy)
Test Objectives
Description of the system or software to be tested
Test environment
Description of the test team
Milestones/Schedule
Functions and attributes to be tested
Evaluation criteria
Data recording
References
Tests to be performed
How Much Time Should be Spent on Test Planning?
Of the total test time, roughly one-third of the time can be allocated each to:
–Test planning
–Test execution
–Test evaluation
Tips for Test Planning
Start early.
Keep the test plan flexible to deal with change.
Frequently have the test team review the test plan.
Keep the test plan concise and readable.
Step 3 - Execute Tests
Select test tools
Develop test cases
Execute tests
Select Test Tools
A test tool is any vehicle that assists in testing.
May be manual or automated
Automated Tools
Automated Testing
–Data-oriented
–Boundary value analysis
–Decision tables
–Equivalence partitioning
Structural Techniques
–Complexity analysis
–Coverage
•Statement
•Branch
•Condition
•Multi-condition
•Path
Step 4 - Evaluate/Report Test Results
Occurs throughout the testing life cycle
Tracks testing progress
Keeps management informed of testing progress
Valuable Test Metrics
Two key areas to measure
–Time
•For future estimating
–Defects
•For determining effectiveness of testing
•For improving the development and testing processes
Time
–Time per test case
–Time per test script
–Time per unit test
–Time per system test
Sizing
–Function points
–Lines of code
Defects
–Numbers of defects
–Defects per sizing measure
–Defects per phase of testing
–Defect origin
–Defect removal efficiency
[Figure: two levels of quality activities.
Level 2 - Verification: code analyzers, walkthroughs, inspections.
Level 1 - Validation: acceptance test, unit test, integration test, system test.]
Software Testing
Testing often becomes a question of economics. For projects of a large size, more
testing will usually reveal more bugs. The question then becomes when to stop testing,
and what is an acceptable level of bugs. This is the question of 'good enough software'.
It is important to remember that testing assumes that requirements are already
validated.
Basic Methods
White Box Testing
White box testing is performed to reveal problems with the internal structure of a
program. This requires the tester to have detailed knowledge of the internal structure. A
common goal of white-box testing is to ensure a test case exercises every path through
a program. A fundamental strength that all white box testing strategies share is that
the entire software implementation is taken into account during testing, which facilitates
error detection even when the software specification is vague or incomplete. The
effectiveness or thoroughness of white-box testing is commonly expressed in terms of
test or code coverage metrics, which measure the fraction of code exercised by test
cases.
Black Box Testing
Black box tests are performed to assess how well a program meets its requirements,
looking for missing or incorrect functionality. Functional tests typically exercise code with
valid or nearly valid input for which the expected output is known. This includes
concepts such as 'boundary values'.
Performance tests evaluate response time, memory usage, throughput, device
utilization, and execution time. Stress tests push the system to or beyond its specified
limits to evaluate its robustness and error handling capabilities. Reliability tests monitor
system response to representative user input, counting failures over time to measure or
certify reliability.
Testing Levels
Different Levels of Test
Testing occurs at every stage of system construction. The larger a piece of code is when defects are detected, the harder and more expensive it is to find and correct the defects.
The different levels of testing reflect that testing, in the general sense, is not a single phase of the software lifecycle. It is a set of activities performed throughout the entire software lifecycle.
In considering testing, most people think of the activities described in the figure. The activities after Implementation are normally the only ones associated with testing. Software testing must be considered before implementation, as is suggested by the input arrows into the testing activities.
The following paragraphs describe the testing activities from the 'second half' of the
software lifecycle.
Unit Testing
Unit testing exercises a unit in isolation from the rest of the system. A unit is typically a
function or small collection of functions (libraries, classes), implemented by a single
developer. The main characteristic that distinguishes a unit is that it is small enough to
test thoroughly, if not exhaustively. Developers are normally responsible for the testing of
their own units and these are normally white box tests. The small size of units allows a
high level of code coverage. It is also easier to locate and remove bugs at this level of
testing.
Integration Testing
One of the most difficult aspects of software development is the integration and testing of large, untested sub-systems. The integrated system frequently fails in significant and mysterious ways, and it is difficult to fix.
Integration testing exercises several units that have been combined to form a module,
subsystem, or system. Integration testing focuses on the interfaces between units, to
make sure the units work together. The nature of this phase is certainly 'white box', as
we must have a certain knowledge of the units to recognize if we have been successful
in fusing them together in the module.
There are three main approaches to integration testing: top-down, bottom-up and 'big
bang'. Top-down combines, tests, and debugs top-level routines that become the test
'harness' or 'scaffolding' for lower-level units. Bottom-up combines and tests low-level
units into progressively larger modules and subsystems. 'Big bang' testing is,
unfortunately, the prevalent integration test 'method'. This is waiting for all the module
units to be complete before trying them out together.
Bottom-up
Allows early testing aimed at proving the feasibility and practicality of particular modules.
Modules can be integrated in various clusters as desired.
Major emphasis is on module functionality and performance.
Advantages
No test stubs are needed
It is easier to adjust manpower needs
Errors in critical modules are found early
Disadvantages
Test drivers are needed
Many modules must be integrated before a working program is available
Interface errors are discovered late
Comments
At any given point, more code has been written and tested than with top-down testing. Some people feel that bottom-up is a more intuitive test philosophy.
Top-Down
The control program is tested first
Modules are integrated one at a time
Major emphasis is on interface testing
Advantages
No test drivers are needed
The control program plus a few modules forms a basic early prototype
Interface errors are discovered early
Modular features aid debugging
Disadvantages
Test stubs are needed
The extended early phases dictate a slow manpower buildup
Errors in critical modules at low levels are found late
Comments
An early working program raises morale and helps convince management
progress is being made. It is hard to maintain a pure top-down strategy in practice.
Integration tests can rely heavily on stubs or drivers. Stubs stand-in for finished
subroutines or sub-systems. A stub might consist of a function header with no body, or it
may read and return test data from a file, return hard-coded values, or obtain data from
the tester. Stub creation can be a time consuming piece of testing.
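A minimal sketch of such a stub in Java (the interface and values are hypothetical):

// A stub standing in for an unfinished lower-level module, so that the
// higher-level module can be tested in top-down integration.
interface CustomerLookup {
    String nameFor(int customerId);
}

class CustomerLookupStub implements CustomerLookup {
    public String nameFor(int customerId) {
        return "Test Customer " + customerId; // hard-coded test data
    }
}

class StubDemo {
    public static void main(String[] args) {
        CustomerLookup lookup = new CustomerLookupStub();
        System.out.println(lookup.nameFor(42)); // the code under test would call this
    }
}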
The cost of drivers and stubs in the top-down and bottom-up testing methods is what
drives the use of 'big bang' testing. This approach waits for all the modules to be
constructed and tested independently, and when they are finished, they are integrated all
at once. While this approach is very quick, it frequently reveals more defects than the
other methods. These errors have to be fixed and as we have seen, errors that are
found 'later' take longer to fix. In addition, like bottom up, there is really nothing that can
be demonstrated until later in the process.
External Function Testing
The 'external function test' is a black box test to verify the system correctly implements
specified functions. This phase is sometimes known as an alpha test. Testers will run
tests that they believe reflect the end use of the system.
System Testing
The 'system test' is a more robust version of the external function test, and can also be known as an alpha test. The essential difference between 'system' and 'external function' testing is the test platform. In system testing, the platform must be as close as possible to production use in the customer's environment, including factors such as hardware setup and database size and complexity. By replicating the target environment, we can more accurately test 'softer' system features (performance, security and fault-tolerance).
Because of the similarities between the test suites in the external function and system
test phases, a project may leave one of them out. It may be too expensive to replicate
the user environment for the system test, or we may not have enough time to run both.
Acceptance Testing
An acceptance (or beta) test is an exercise of a completed system by a group of end
users to determine whether the system is ready for deployment. Here the system will
receive more realistic testing than in the 'system test' phase, as the users have a better
idea how the system will be used than the system testers.
Regression Testing
Regression testing is an expensive but necessary activity performed on modified
software to provide confidence that changes are correct and do not adversely affect
other system components. Four things can happen when a developer attempts to fix a
bug. Three of these things are bad, and one is good:
                      New Bug    No New Bug
Successful Change     Bad        Good
Unsuccessful Change   Bad        Bad
Because of the high probability that one of the bad outcomes will result from a change to
the system, it is necessary to do regression testing.
It can be difficult to determine how much re-testing is needed, especially near the end of
the development cycle. Most industrial testing is done via test suites; automated sets of
procedures designed to exercise all parts of a program and to show defects. While the
original suite could be used to test the modified software, this might be very time-
consuming. A regression test selection technique chooses, from an existing test set,
the tests that are deemed necessary to validate modified software.
There are three main groups of test selection approaches in use:
Minimization approaches seek to satisfy structural coverage criteria by identifying a
minimal set of tests that must be rerun.
Coverage approaches are also based on coverage criteria, but do not require
minimization of the test set. Instead, they seek to select all tests that exercise changed
or affected program components.
Safe approaches instead attempt to select every test that will cause the modified program to produce different output than the original program.
An interesting approach to limiting test cases is based on whether we can confine testing
to the "vicinity" of the change. (Ex. If I put a new radio in my car, do I have to do a
complete road test to make sure the change was successful?) A new breed of regression
test theory tries to identify, through program flows or reverse engineering, where
boundaries can be placed around modules and subsystems. These graphs can
determine which tests from the existing suite may exhibit changed behavior on the new
version.
Regression testing has been receiving more attention as corporations focus on fixing the
'Year 2000 Bug'. The goal of most Y2K work is to correct the date handling portions of their
system without changing any other behavior. A new 'Y2K' version of the system is
compared against a baseline original system. With the obvious exception of date
formats, the performance of the two versions should be identical. This means not only do
they do the same things correctly, they also do the same things incorrectly. A non-Y2K
bug in the original software should not have been fixed by the Y2K work.
A frequently asked question about regression testing is 'The developer says this problem
is fixed. Why do I need to re-test?’ to which the answer is 'The same person
probably told you it worked in the first place'.
Installation Testing
The testing of full, partial, or upgrade install/uninstall processes.
Completion Criteria
There are a number of different ways to determine the test phase of the software life
cycle is complete. Some common examples are:
All black-box test cases are run
White-box test coverage targets are met
Estimating the number and severity of undetected defects allows informed decisions on
whether the quality is acceptable or additional testing is cost-effective. It is very
important to consider maintenance costs and redevelopment efforts when deciding on the value of additional testing.
Risk Management
Metrics involved in risk management measure how important a particular defect is (or
could be). These measurements allow us to prioritize our testing and repair cycles. A
truism is that there is never enough time or resources for complete testing, making
prioritization a necessity.
One approach is known as Risk Driven Testing, where Risk has specific meaning. The
failure of each component is rated by Impact and Likelihood. Impact is a severity rating,
based on what would happen if the component malfunctioned. Likelihood is an estimate
of how probable it is that the component would fail. Together, Impact and Likelihood
determine the Risk for the piece.
Obviously, a higher rating on each scale corresponds to higher overall risk from defects in the component. With a rating scale, this might be represented visually as a matrix of impact against likelihood.
The relative importance of likelihood and impact will vary from project to project and
company to company.
A system level measurement for risk management is the Mean Time To Failure
(MTTF). Test data sampled from realistic beta testing is used to find the average time until system failure. This data is extrapolated to predict overall uptime and the expected time
the system will be operational. Sometimes measured with MTTF is Mean Time To Repair
(MTTR). This represents the expected time until the system will be repaired and back in
use after a failure is observed. Availability, obtained by calculating MTTF / (MTTF +
MTTR), is the probability that a system is available when needed. While these are
reasonable measures for assessing quality, they are more often used to assess the risk
(financial or otherwise) that a failure poses to a customer or in turn to the system
supplier.
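For example, a system with an MTTF of 500 hours and an MTTR of 5 hours has availability 500 / (500 + 5) ≈ 0.99, i.e. it can be expected to be available about 99% of the time.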
Process Improvement
It is generally accepted that to achieve improvement you need a measure against which to gauge performance. To improve our testing processes we need the ability to compare the results from one process to another.
Popular measures of the testing process report:
Effectiveness: (number of defects found and successfully removed) / (number of defects present)
Efficiency: number of defects found in a given time
It is also important to consider system failures reported in the field by the customer. If a high percentage of customer-reported defects were not revealed in-house, it is a significant indicator that the testing process is incomplete.
A good defect reporting structure will allow defect types and origins to be identified. We can use this information to improve the testing process by altering and adding test activities to improve our chances of finding the defects that are currently escaping detection. By tracking our test efficiency and effectiveness, we can evaluate the changes made to the testing process.
Testing metrics give us an idea how reliable our testing process has been at finding defects, and can be a reasonable indicator of its performance in the future. It must be remembered that measurement is not the goal; improvement through measurement, analysis and feedback is what is needed.
Pros
Testers are usually the only people to use a system heavily as experts;
Independent testing is typically more efficient at detecting defects related to
special cases, interaction between modules, and system level usability and
performance problems
Programmers are neither trained, nor motivated to test
Overall, more of the defects in the product will likely be detected.
Test groups can provide insight into the reliability of the software before it is
actually shipped
Cons
Having separate test groups can result in duplication of effort (e.g., the test group expends resources executing tests developers have already run).
The detection of the defects happens at a later stage, and designers may have to wait for responses from the test group to proceed. This problem can be
exacerbated in situations where the test group is not physically collocated with
the design group.
The cost of maintaining separate test groups
The key to optimizing the use of separate test groups is understanding that developers
are able to find certain types of bugs very efficiently, and testers have greater abilities in
detecting other bugs. An important consideration would be the size of the organization,
and the criticality of the product.
Testing Problems
When trying to effectively implement software testing, there are several mistakes that
organizations typically make. The errors fall into (at least) 4 broad classes:
Misunderstanding the role of testing.
The purpose of testing is to discover defects in the product. Furthermore, it is important
to have an understanding of the relative criticality of defects when planning tests,
reporting status, and recommending actions.
Poor planning of the testing effort.
Test plans often over emphasize testing functionality at the expense of potential
interactions. This mentality also can lead to incomplete configuration testing and
inadequate load and stress testing. Neglecting to test documentation and/or installation
procedures is also a risky decision.
Evidence of the benefits of inspections abounds. The literature (Humphrey 1989) reports
cases where:
inspections are up to 20 times more efficient than testing;
code reading detects twice as many defects/hour as testing;
80% of development errors were found by inspections;
inspections resulted in a 10x reduction in cost of finding errors;
In the face of all this evidence, it has been suggested that "software inspections can
replace testing". While the benefits of inspections are real, they are not enough to
replace testing. Inspections could replace testing if and only if all information gleaned
through testing could be obtained through inspection. This is not true for several
reasons.
Firstly, testing can identify defects due to complex interactions in large systems (e.g. timing/synchronization). While inspections can in principle detect such defects, as systems become more complex the chance of one person understanding all the interfaces and being present at all the reviews is quite small.
Second, testing can provide a measure of software reliability (i.e. failures/execution time)
that is unobtainable from inspections. This measure can often be used as a vital input to
the release decision.
Thirdly, testing identifies system level performance and usability issues that inspections
cannot. Therefore, since inspections and testing provide different, equally important
information, one cannot replace the other. However, depending on the product, the
optimal mix of inspections and testing may be different!
Fault injection is not a new concept. Hardware design techniques have long used
inserted fault conditions to test system behavior. It is as simple as pulling the modem out
of your PC during use and observing the results to determine if they are safe and/or
desired. The injection of faults into software is not so widespread, though it would
appear that companies such as Hughes Information Systems, Microsoft, and Hughes
Electronics have applied the techniques or are considering them. Properly used, fault
insertion can give insight as to where testing should be concentrated, how much testing
should be done, whether or not systems are fail-safe, etc.
As a simple example consider the following code:
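The listing itself is not reproduced in the handout; a minimal hypothetical sketch consistent with the description (perturb, X and T are the names used below) might be:

// Fault injection sketch: perturb(x) corrupts the input X; we count how
// often the derived value T exceeds the catastrophic threshold of 100.
import java.util.Random;

public class FaultInjection {
    static final Random rng = new Random();

    static double perturb(double x) {      // inject a corrupted value of X
        return x + rng.nextGaussian() * 10.0;
    }

    static double computeT(double x) {     // stand-in computation of T from X
        return x * 2.0;
    }

    public static void main(String[] args) {
        int catastrophic = 0;
        for (int i = 0; i < 10000; i++) {
            double t = computeT(perturb(40.0)); // nominal X = 40 gives T = 80
            if (t > 100) catastrophic++;        // catastrophic if T > 100
        }
        System.out.println("catastrophic outcomes: " + catastrophic + " / 10000");
    }
}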
In this case it is catastrophic if T > 100. By using perturb(x) to generate changed values
of X (i.e. a random number generator) you can quickly determine how often corrupted
values of X lead to undesired values of T.
The technique can be applied to internal source code, as well as to 3rd party software, which may be a "black box".
Conclusion
Software testing is an important part of the software development process. It is not a
single activity that takes place after code implementation, but is part of each stage of the
lifecycle. A successful test strategy will begin with consideration during requirements
specification. Testing details will be fleshed out through high and low level system designs,
and testing will be carried out by developers and separate test groups after code
implementation. As with the other activities in the software lifecycle, testing has its
own unique challenges. As software systems become more and more complex, the
importance of effective, well planned testing efforts will only increase.
Testing Objectives:
Myers [MYE79] states a number of rules that can serve well as testing objectives:
The major testing objective is to design tests that systematically uncover types of
errors with minimum time and effort.
Cyclomatic Complexity
When this metric is used in the context of the basis path testing, the value
computed for cyclomatic complexity defines the number of independent paths in
the basis set of a program.
For example,
path 1: 1-2-10-11-13
path 2: 1-2-10-12-13
path 3: 1-2-3-10-11-13
path 4: 1-2-3-4-5-8-9-2-…
path 5: 1-2-3-4-5-6-8-9-2-..
Path 6: 1-2-3-4-5-6-7-8-9-2-..
Step 4: Prepare test cases that will force execution of each path in the basis set.
Equivalence Partitioning
An equivalence class represents a set of valid or invalid states for an input condition.
Boundary Value Analysis
Objective:
Boundary value analysis leads to a selection of test cases that exercise bounding values.
Guidelines:
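The guidelines themselves are not reproduced here; as a minimal sketch, boundary-value selection for the valid range 10000..99999 used earlier might pick:

// Values at, just inside, and just outside each boundary, plus a nominal value.
public class BoundaryValues {
    public static void main(String[] args) {
        int[] testValues = { 9999, 10000, 10001,    // around the lower bound
                             50000,                  // nominal interior value
                             99998, 99999, 100000 }; // around the upper bound
        for (int n : testValues)
            System.out.println("test input: " + n);
    }
}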