1 Introduction
The software engineering community has recently turned its attention to statistical software testing [1, 2, 3]. The main idea is that the reliability of software depends greatly on the manner in which the software is used, and the importance of a failure is largely determined by the likelihood of encountering it. Software is therefore tested according to a model that highlights the critical usage, and many test cases must be executed to simulate the usage of the software statistically. However, manually generating the expected outputs for these test cases and comparing the actual outputs of the application under test (AUT) with them takes a great deal of time and is often error-prone. As a result, an automated test oracle is needed in statistical software testing to generate the expected output and compare the actual output with it automatically. However, very few techniques have been developed to automate the oracle. In most cases, a tester is assumed to provide the expected output of the software, specified by a table of pairs [4], logical expressions to be satisfied by the software [5], or temporal constraints that must not be violated during software execution [6]. Schroeder uses the input-output (IO) relationships of the software to identify the unique combinations of inputs that influence each output; with this information, many test cases including both inputs and expected outputs can be generated automatically [7]. However, determining all IO relationships manually is rather difficult. When testing software with graphical user interfaces (GUI) with a capture/replay tool [8, 9], expected outputs are saved for comparison while recording test scripts, or inserted into the test scripts.
I. King et al. (Eds.): ICONIP 2006, Part III, LNCS 4234, pp. 498 – 507, 2006.
© Springer-Verlag Berlin Heidelberg 2006
Neural Networks Based Automated Test Oracle for Software Testing 499
where $\alpha$ is a positive number called the momentum constant, $\Delta w_{ji}(n)$ is the correction applied to the weight connecting the output of neuron $i$ to the input of neuron $j$ at the $n$th iteration, $\eta$ is the learning rate, $\delta_j(n)$ is the local gradient at the $n$th iteration, and $y_i(n)$ is the function signal appearing at the output of neuron $i$ at the $n$th iteration. The training error can be the sum over output units of the squared difference between the expected outputs $t_k$ given by a teacher and the actual outputs $z_k$:

$$J(w) = \frac{1}{2}\sum_{k=1}^{c}(t_k - z_k)^2 = \frac{1}{2}\|t - z\|^2 \qquad (2)$$

where $t$ and $z$ are the target and network output vectors of length $c$ respectively, and $w$ are the weights in the network. Details about the back-propagation algorithm can be found in references [18, 19].
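As a minimal sketch of the training rule above, the following trains a one-hidden-layer network by full-batch back-propagation with momentum, minimising the error of equation (2). The architecture, tanh units, and hyper-parameters here are illustrative assumptions, not the settings the paper uses (those appear in its Table 2).

```python
import numpy as np

rng = np.random.default_rng(0)

def train_mlp(x, t, hidden=30, eta=0.01, alpha=0.9, epochs=8000):
    """Fit t ~ f(x) by gradient descent with the momentum update
    Delta w(n) = alpha * Delta w(n-1) - eta * dJ/dw."""
    w1 = rng.normal(0.0, 0.5, (1, hidden)); b1 = np.zeros(hidden)
    w2 = rng.normal(0.0, 0.5, (hidden, 1)); b2 = np.zeros(1)
    dw1 = np.zeros_like(w1); db1 = np.zeros_like(b1)   # momentum buffers,
    dw2 = np.zeros_like(w2); db2 = np.zeros_like(b2)   # i.e. Delta w(n-1)
    for _ in range(epochs):
        h = np.tanh(x @ w1 + b1)            # hidden outputs y_i(n)
        z = h @ w2 + b2                     # network output
        g2 = (z - t) / len(x)               # local gradient at output layer
        g1 = (g2 @ w2.T) * (1.0 - h**2)     # back-propagated local gradient
        dw2 = alpha * dw2 - eta * (h.T @ g2); db2 = alpha * db2 - eta * g2.sum(0)
        dw1 = alpha * dw1 - eta * (x.T @ g1); db1 = alpha * db1 - eta * g1.sum(0)
        w1 += dw1; b1 += db1; w2 += dw2; b2 += db2
    return lambda q: np.tanh(q @ w1 + b1) @ w2 + b2

u = np.linspace(0.0, 1.0, 100).reshape(-1, 1)   # inputs scaled to [0, 1]
t = np.sin(3 * np.pi * u)                       # toy target function
predict = train_mlp(u, t)
J = 0.5 * np.sum((t - predict(u))**2)           # training error, eq. (2)
```

The momentum term reuses the previous correction $\Delta w(n-1)$, which smooths the descent direction across iterations.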
Function testing involves executing an AUT and examining the output, a task implemented by an oracle. The general model of an oracle is shown in Fig. 1. In the model, expected outputs are generated from the inputs and then compared with the actual outputs of the AUT; if they differ, a failure is implied. The process of generating expected outputs and comparing them is traditionally done manually: testers compute the expected outputs from program specifications or from their knowledge of how the program should operate, and then compare them with the actual outputs, again using their knowledge, to determine whether a failure occurs.
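The manual model above can be sketched in a few lines. The stand-in AUT, the table of expected outputs, and the seeded fault are all hypothetical, chosen only to make the compare-and-flag flow concrete.

```python
def aut(x):
    """Stand-in application under test with a fault seeded at x == 3."""
    return 2 * x + 1 if x != 3 else 0

# Table of input/expected-output pairs, prepared ahead of time by the tester.
expected = {0: 1, 1: 3, 2: 5, 3: 7}

# The oracle: any mismatch between actual and expected output is a failure.
failures = [x for x in expected if aut(x) != expected[x]]
```

The cost the paper targets is exactly the `expected` table: for statistical testing it would need an entry for every generated test case.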
The relationship from the input to the output of the software is in nature a function. Let $x = (x_1, x_2, \ldots, x_n)$ and $y = (y_1, y_2, \ldots, y_m)$ be the input and output vectors respectively. The relationship can then be represented by $y = f(x)$, where $f$ implements the specification of the software. When $x \in R^n$, $y \in R^m$, and $f$ is continuous, the function $f$ can be approximated by an NN after training. It is for this situation that we propose the automated oracle in this paper. Each component of $y$ is also a function of $x$, that is, $y_j = f_j(x)$. Let $D(x_i)$ be the set of all possible values of $x_i$ and $D(x)$ be the set of all possible values of $x$. Therefore $D(x)$ includes every possible combination of the values of the $x_i$. The automated oracle should generate the approximate output from $x$ and compare it with the actual output automatically.
The model of the automated oracle is shown in Fig. 2. In the model, approximate outputs are generated automatically from the inputs. The approximate output is not identical to the expected output, but it can approach the expected output to any precision. The comparison process now becomes:

$$\|\eta - a\|_\varepsilon = \begin{cases} 0 & \text{if } |\eta - a| \le \varepsilon \\ |\eta - a| - \varepsilon & \text{otherwise} \end{cases} \qquad (3)$$

where $\eta$ and $a$ are the actual output and the approximate output respectively, and $\varepsilon$ is the test precision. The parameter $\varepsilon$ is the criterion for deciding whether the actual value $\eta$ is correct, and it can be adjusted to trade precision against test cost; its effect is examined in the experiments. If $\|\eta - a\|_\varepsilon = 0$, the actual output is correct within precision $\varepsilon$. Otherwise, if $\|\eta - a\|_\varepsilon > 0$, a failure has occurred, because the actual output is not correct within precision $\varepsilon$. To generate the approximate outputs, the relationship from the input to the output of the AUT must be learned from a small set of training samples drawn from the specification. Let $D(y)$ be the co-domain of the AUT. The training samples are then pairs $(x^i, y^i)$ with $x^i \in D'(x) \subseteq D(x)$ and $y^i \in D(y)$, where $y^i$ is the expected output. The approximate outputs are then generated automatically for $\forall x \in D(x)$ by the relationship learned.
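Equation (3) translates directly into a comparison function; the numeric values in the usage comments below are illustrative, not from the paper.

```python
def oracle_compare(actual, approx, eps):
    """Equation (3): returns 0 when the actual output lies within the test
    precision eps of the approximate output (no failure); otherwise returns
    the positive excess |actual - approx| - eps, which signals a failure."""
    diff = abs(actual - approx)
    return 0.0 if diff <= eps else diff - eps

ok = oracle_compare(1.05, 1.00, 0.10)    # within precision -> 0.0
bad = oracle_compare(1.25, 1.00, 0.10)   # exceeds precision -> positive
```

Note that the verdict is graded rather than boolean: a larger return value means the actual output missed the tolerance band by more.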
In theory, an NN can approximate any continuous function. This feature is used to generate the approximate output in the automated oracle. To implement the automated oracle, two
Step 4: When the stopping criterion is satisfied, keep the weights and go to step 5.
Step 5: Obtain the approximate output $a^i$ from the NN for $\forall x^i \in D'(x)$. Set the test precision $\varepsilon = \max_i |y^i - a^i|$.
Step 6: Get the input of the AUT and feed it to the NN. Obtain the approximate output $a$ from the NN.
Step 7: Get the actual output $\eta$ from the AUT and compare it with the approximate output $a$ according to equation (3). Determine from the result of the comparison whether there is a failure.
Step 8: Repeat steps 6 and 7 to test the AUT on other inputs.
Step 9: If testing at a different precision is needed, go to step 2. Otherwise, the process finishes.
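Steps 4-9 can be sketched end to end as follows. As an assumption for illustration, a linear interpolator over 50 specification samples plays the role of the trained NN, and $\sin$ on $[0, 3\pi]$ stands in for the specification $f$ (echoing the experiment in Sect. 4); the paper itself uses a back-propagation network here.

```python
import numpy as np

# Stand-in for the trained NN: linear interpolation over specification samples.
nodes = np.linspace(0, 3 * np.pi, 50)
model = lambda q: np.interp(q, nodes, np.sin(nodes))

# Step 5: derive the test precision eps = max_i |y^i - a^i| over D'(x).
d_prime = np.linspace(0, 3 * np.pi, 200)
eps = np.max(np.abs(np.sin(d_prime) - model(d_prime)))

def check(x, actual):
    """Steps 6-7: get the approximate output a for input x, apply eq. (3).
    A positive return value signals a failure."""
    diff = abs(actual - model(x))
    return diff - eps if diff > eps else 0.0

x0 = d_prime[23]
ok = check(x0, np.sin(x0))          # correct AUT output -> 0.0
bad = check(x0, np.sin(x0) + 0.1)   # seeded fault -> positive verdict
```

Step 8 is simply calling `check` for each further input; step 9 corresponds to retraining the model to a tighter goal and recomputing `eps`.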
4 Experiments
The goal of the experiment is to determine whether the proposed method can effectively expose failures of the AUT. This includes whether an NN can be used to generate approximate outputs close to the expected outputs, whether the value of $\varepsilon$ can be computed from samples, and whether the method can expose failures of the AUT.
The AUT in the experiment is an application with a GUI, shown in Fig. 3. It has three input variables and two output variables; the input and output vectors are $x = (x_1, x_2, x_3)$ and $y = (y_1, y_2)$ respectively. The relationship between input and output is:

The variable $x_1$ is sampled evenly from the interval $[0, 3\pi]$ and the variable $y_1$ is computed manually. The NN is constructed and the training parameters are set as in Table 2. We select samples evenly from $S$ to form the training set $S'$, whose size is 100, and use $S'$ to train the NN. The training process is shown in Fig. 4: the performance reaches the goal quickly, after 192 epochs.
Table 1. Five failures caused by faults seeded in the AUT ($x_1$, $\eta$, and $y_1$ are the inputs of the AUT, the actual outputs, and the expected outputs)

Failure ID   x1      η         y1
1            1.1      0.3900    0.1942
2            1.57    -1.8732   -1.5698
3            3.683   -3.5776   -3.6777
4            5.9     -5.3631   -5.3842
5            7.31     0.4367    0.4467
Fig. 5. Plot of the expected outputs, approximate outputs and their difference when the training
goal is 0.001
The test precision $\varepsilon$ is computed in this step. The trained NN is used to generate the approximate output $a^i$ for $\forall x^i \in D'(x)$, $i = 1, \ldots, 200$. The test precision is set to $\varepsilon = \max_i |y_1^i - a^i|$. The resulting value of $\varepsilon$ is 0.1863, because the values of $(y_1^i - a^i)$ lie in the interval $[-0.1863, 0.0757]$. As a result, if the actual output obtained from the AUT lies in the interval $[a - 0.1863, a + 0.1863]$, there is no failure; otherwise, a failure is exposed. We now check whether the method can expose the failures in Table 1. The result is shown in Table 3, where $\|\eta - a\|_\varepsilon$ is computed by equation (3). Failures 1 and 2 are successfully exposed when the training goal is set to 0.001, because $\|\eta - a\|_\varepsilon > 0$. Failures 3, 4, and 5 cannot be exposed, because the difference between the actual output and the approximate output is below the test precision $\varepsilon$.
Table 3. The failures that can be exposed when $\varepsilon$ is 0.1863 ($x_1$, $\eta$, and $a$ are the input, actual output, and approximate output; $\|\eta - a\|_\varepsilon > 0$ means a failure is exposed)

Failure ID   x1      η         y1        a         ||η − a||_ε
1            1.1      0.3900    0.1942    0.1943    0.0094
2            1.57    -1.8732   -1.5698   -1.5730    0.1139
3            3.683   -3.5776   -3.6777   -3.6690    0
4            5.9     -5.3631   -5.3842   -5.3693    0
5            7.31     0.4367    0.4467    0.4870    0
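The last column of Table 3 can be reproduced directly from equation (3), using the $\eta$ and $a$ values of the table and $\varepsilon = 0.1863$:

```python
eps = 0.1863
# failure id -> (actual output eta, approximate output a), from Table 3
rows = {1: (0.3900, 0.1943), 2: (-1.8732, -1.5730),
        3: (-3.5776, -3.6690), 4: (-5.3631, -5.3693), 5: (0.4367, 0.4870)}

verdicts = {}
for fid, (eta, a) in rows.items():
    diff = abs(eta - a)
    # equation (3), rounded to the table's four decimal places
    verdicts[fid] = round(diff - eps, 4) if diff > eps else 0.0
# verdicts -> {1: 0.0094, 2: 0.1139, 3: 0.0, 4: 0.0, 5: 0.0}
```

Only failures 1 and 2 yield a positive verdict, matching the table: the other three deviations fall inside the tolerance band of width $\varepsilon$.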
When we change the training goal, different test precisions are achieved (Table 4); we can therefore set the goal according to the precision demanded in software testing. If a precision $\varepsilon$ of 0.0165 is needed, we can set the training goal below 5e-005. If the actual output lies in the interval $[a - 0.0165, a + 0.0165]$, where $a$ is the approximate output obtained from the NN, there is no failure; otherwise, a failure has occurred. All failures in Table 1 can be exposed in this situation, which shows that the proposed method can expose failures effectively.
Table 4. Precision $\varepsilon$ achieved under different training goals ($min$ and $max$ are the minimum and maximum differences between expected and approximate outputs)
In the experiment, the value of $\varepsilon$ is obtained from 200 samples. We now obtain the value from more samples to see whether it changes appreciably. The result is shown in Table 5; the meaning of each column is the same as in Table 4. It shows that the value of $\varepsilon$ computed from 200 samples is effective in this experiment, because it is the same as the value computed from far more samples, 100000.

Table 5. Comparison of the value of $\varepsilon$ obtained from different numbers of samples ($min$, $max$, and $\varepsilon$ are obtained with 200 samples; $min'$, $max'$, and $\varepsilon'$ are obtained with 100000 samples)
5 Conclusions
In statistical software testing, many test cases must be executed to simulate statistically the usage model of the AUT. However, it is difficult to manually generate expected outputs for these test cases and compare the actual outputs of the AUT with them. An automated test oracle is therefore proposed in this paper to solve the problem. The oracle can be applied when the relationship from the input to the output of the AUT is a continuous function. From the above results we conclude that an NN can be used to implement the automated test oracle with reasonable precision: it can generate the approximate output for the AUT, and the precision can be adjusted through the training parameters, so the AUT can be tested at the precision needed. With this method we need not generate all expected outputs of the AUT manually, which saves a great deal of time and labor in software testing. The experiment shows that the precision $\varepsilon$ is important for exposing failures and that it can be computed from manually generated samples. However, it has not been verified whether the method remains effective when the relationship from input to output becomes more complicated. We will therefore experiment with more complicated relationships.
Acknowledgment
This work was supported in part by the National High Technology Development Plan
of China (863) under grant no. 2003AA1Z2610.
References
1. Sayre, K.: Improved techniques for software testing based on Markov chain usage models, PhD thesis, University of Tennessee, Knoxville, USA (1999)
2. Bertolini, C., Farina, A.G., Fernandes, P., Oliveira, F.M.: Test case generation using stochastic automata networks: quantitative analysis, In: Proc. of the 2nd International Conf. on Software Engineering and Formal Methods, IEEE Press (2004) 251-260
3. Beyer, M., Dulz, W., Zhen, F.: Automated TTCN-3 test case generation by means of UML sequence diagrams and Markov chains, In: Proc. of the 12th Asian Test Symposium, Piscataway: IEEE Press (2003) 102-105
4. Peters, D., Parnas, D.L.: Generating a test oracle from program documentation, In: Proc. of the International Symposium on Software Testing and Analysis (1994) 58-65
5. Bousquet, L., Ouabdesselam, F., Richier, J., Zuanon, N.: Lutess: a specification-driven testing environment for synchronous software, In: Proc. of the 21st International Conf. on Software Engineering, ACM Press (1999) 267-276
6. Dillon, L.K., Ramakrishna, Y.S.: Generating oracles from your favorite temporal logic specifications, In: Proc. of the 4th ACM SIGSOFT Symposium on the Foundations of Software Engineering, ACM Software Engineering Notes, vol.21 (1996) 106-117
7. Schroeder, P.J., Faherty, P., Korel, B.: Generating expected results for automated black-box testing, In: Proc. of the 17th IEEE International Conf. on Automated Software Engineering, IEEE Press (2002) 139-148
8. Ostrand, T., Anodide, A., Foster, H., Goradia, T.: A visual test development environment for GUI systems, ACM SIGSOFT Software Engineering Notes, vol.23, no.2 (1998) 82-92
9. Chen, W.K., Tsai, T.H., Chao, H.H.: Integration of specification-based and CR-based approaches for GUI testing, In: Proc. of the 19th International Conf. on Advanced Information Networking and Applications, vol.1 (2005) 967-972
10. Memon, A., Nagarajan, A., Xie, Q.: Automating regression testing for evolving GUI software, Journal of Software Maintenance and Evolution: Research and Practice, vol.17, no.1 (2005) 27-64
11. Chen, J., Subramaniam, S.: Specification-based testing for GUI-based applications, Software Quality Journal, vol.10, no.3 (2002) 205-224
12. Hierons, R.M.: Testing from a Z specification, Software Testing, Verification, and Reliability, vol.7 (1997) 19-33
13. McDonald, J., Strooper, P.: Translating Object-Z specifications to passive test oracles, In: Proc. of the 2nd International Conf. on Formal Engineering Methods, IEEE Press (1998) 165-174
14. Aggarwal, K.K., Singh, Y., Kaur, A., Sangwan, O.P.: A neural net based approach to test oracle, ACM SIGSOFT Software Engineering Notes, ACM Press, vol.29, no.3 (2004) 1-6
15. Ramamoorthy, C.V., Ho, S.F., Chen, W.T.: On the automated generation of program test data, IEEE Trans. Software Engineering, vol.SE-2 (1976) 293-300
16. Chen, T., Chen, H.: Approximations of continuous functionals by neural networks with application to dynamic systems, IEEE Trans. Neural Networks, vol.4, no.6 (1993) 910-918
17. Chen, D.S., Jain, R.C.: A robust back propagation learning algorithm for function approximation, IEEE Trans. Neural Networks, vol.5, no.3 (1994) 467-479
18. Duda, R.O., Hart, P.E., Stork, D.G.: Pattern Classification, second edition, John Wiley & Sons (2001)
19. Fausett, L.: Fundamentals of Neural Networks: Architectures, Algorithms, and Applications, Prentice Hall: Englewood Cliffs, New Jersey (1994)