
Neural Networks Based Automated Test Oracle for Software Testing

Ye Mao¹, Feng Boqin¹, Zhu Li², and Lin Yao¹

¹ School of Electronic & Information Engineering, Xi’an Jiaotong University, Xi’an 710049, China
xjtuyemao@163.com
² School of Software, Xi’an Jiaotong University, Xi’an 710049, China

Abstract. Many test cases must be executed in statistical software testing to simulate the usage of the software. An automated oracle is therefore needed to automatically generate the expected outputs for these test cases and to compare the actual outputs with them. This paper attempts to use neural networks as an automated test oracle. The oracle generates an approximate output that is close to the expected output. The actual output from the application under test is then compared with the approximate output to validate its correctness. With this method the oracle can be automated, which is of potential application in software testing.

1 Introduction
The software engineering community has recently turned its attention to statistical software testing [1, 2, 3]. The main idea is that the reliability of software depends greatly on the manner in which the software is used. The importance of a failure is largely determined by the likelihood of encountering it. Software is therefore tested according to a model that highlights the critical usage, and a large number of test cases must be executed to simulate the usage of the software statistically. However, it takes a lot of time and is often error-prone to manually generate the expected outputs for these test cases and compare the actual outputs of the application under test (AUT) with them. As a result, an automated test oracle is needed in statistical software testing to automatically generate the expected output and compare the actual output with it. However, very few techniques have been developed to automate the oracle. In most cases, a tester is assumed to provide the expected output of the software, which is specified by a table of pairs [4], logical expressions to be satisfied by the software [5], or temporal constraints that must not be violated during software execution [6]. Schroeder uses the input-output (IO) relationships of the software to identify unique combinations of the inputs that influence the outputs. With this information, a large number of test cases, including inputs and expected outputs, can be generated automatically [7]. However, determining all IO relationships manually is rather difficult. When testing software with graphical user interfaces (GUI) by a capture/replay tool [8, 9], expected outputs are saved for comparison while recording test scripts or are inserted into the test scripts
manually. Memon presents a planning method to generate expected outputs [10]; it requires constructing a GUI model and setting conditions for every operator manually. Chen tests software by finite state machines (FSM); expected outputs are manually included in the model [11]. The specification language Z uses predicate logic to specify an operation as a relation between the input and output [12]. A test oracle can be generated from the Z specification [13]. However, this requires that the user's requirements of the software be represented in the Z specification language. To implement a more automated oracle, Aggarwal et al. use a neural network (NN) based approach to generate the expected output [14] for the triangle classification problem [15]. From their experiment they conclude that NN can be used as a test oracle with a reasonable degree of accuracy for classification problems in software testing.
We propose in this paper that the relationship from inputs to outputs of an AUT is in essence a function. When the function is continuous, an automated oracle is proposed. NN is used to implement the oracle. The appeal of the NN approach lies in its ability to approximate a function to any precision without the need to have knowledge of that function [16-19]. An experiment has been conducted to validate the effectiveness of the proposed method.
In the next section, we briefly describe the theory of multilayer NN. The automated oracle based on NN is proposed in section 3. Section 4 presents the results of the experiment. Conclusions and future directions are presented in section 5.

2 Multilayer Neural Networks for Function Approximation

Multilayer NN has been established as an effective method to approximate continuous or other kinds of functions defined on compact sets in R^n [16, 17]. It can be used to learn the relationship from inputs to outputs by training on samples. Once the training process finishes, it can be given any input and will produce an output according to the relationship learned. The output generated by the NN can be arbitrarily close to the expected output owing to the generalization capabilities of the network. Back propagation, based on gradient descent in error, is the most popular training algorithm for multilayer NN [17, 18, 19]. In this algorithm the network is initialized with a random set of weights and then trained on a set of input and output pairs, i.e. training samples. The training process stops when the training error is acceptable or a predefined number of epochs have passed. Once trained, the network weights are kept and used to approximate the function. When training the NN, the weight update rule can be expressed as follows:

Δw_{ji}(n) = α Δw_{ji}(n − 1) + η δ_j(n) y_i(n)                                  (1)

where α is a positive number called the momentum constant, Δw_{ji}(n) is the correction which is applied to the weight connecting the output of neuron i to the input of neuron j at the n-th iteration, η is the learning rate, δ_j(n) is the local gradient at the n-th iteration, and y_i(n) is the function signal appearing at the output of neuron i at the n-th iteration. The training error can be the sum over output units of the squared difference between the expected outputs t_k given by a teacher and the actual outputs z_k:

J(w) = (1/2) Σ_{k=1}^{c} (t_k − z_k)² = (1/2) ‖t − z‖²                           (2)
where t and z are the target and the network output vectors of length c, respectively, and w denotes the weights of the network. Details about the back propagation algorithm can be found in references [18, 19].
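
For concreteness, the following is a minimal sketch of such a network in Python/NumPy: a logsig hidden layer, a linear output, and weight updates following the momentum rule of equation (1) while minimizing the criterion of equation (2). The network size, learning rate, momentum constant, and target function here are illustrative assumptions, not the settings used later in the paper.

```python
import numpy as np

# A minimal two-layer network (logsig hidden layer, linear output) trained by
# back propagation with the momentum update of equation (1) and the squared
# error of equation (2).  Parameter values are illustrative only.
rng = np.random.default_rng(0)
n_hidden, eta, alpha = 13, 0.05, 0.9
W1 = rng.uniform(-1, 1, (n_hidden, 1)); b1 = np.zeros((n_hidden, 1))
W2 = rng.uniform(-1, 1, (1, n_hidden)); b2 = np.zeros((1, 1))
dW1 = np.zeros_like(W1); db1 = np.zeros_like(b1)
dW2 = np.zeros_like(W2); db2 = np.zeros_like(b2)

def logsig(v):
    return 1.0 / (1.0 + np.exp(-v))

def forward(x):                      # x has shape (1, N)
    h = logsig(W1 @ x + b1)          # hidden activations
    return h, W2 @ h + b2            # network output z

def train_epoch(x, t):
    global W1, b1, W2, b2, dW1, db1, dW2, db2
    h, z = forward(x)
    e = t - z                                    # error; J(w) = 0.5 * sum(e**2)
    delta2 = e                                   # local gradient of linear output
    delta1 = (W2.T @ delta2) * h * (1.0 - h)     # local gradient of logsig layer
    n = x.shape[1]
    # Equation (1): dw(n) = alpha*dw(n-1) + eta*delta(n)*y(n), batch-averaged
    dW2 = alpha * dW2 + eta * (delta2 @ h.T) / n
    db2 = alpha * db2 + eta * delta2.mean(axis=1, keepdims=True)
    dW1 = alpha * dW1 + eta * (delta1 @ x.T) / n
    db1 = alpha * db1 + eta * delta1.mean(axis=1, keepdims=True)
    W2 = W2 + dW2; b2 = b2 + db2; W1 = W1 + dW1; b1 = b1 + db1
    return 0.5 * float(np.sum(e ** 2)) / n

# Example: learn a continuous 1-D function from evenly spaced training samples
x = np.linspace(0.0, 1.0, 100).reshape(1, -1)
t = np.sin(3 * np.pi * x)
for epoch in range(20000):
    if train_epoch(x, t) < 1e-3:     # stop when the training error is acceptable
        break
```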

3 Automated Test Oracle Based on Neural Networks

3.1 Model of the Automated Oracle

Function testing involves executing an AUT and examining the output, which is done by an oracle. The general model of an oracle is shown in Fig. 1. In this model, expected outputs are generated from the inputs and then compared with the actual outputs from the AUT. If they are not the same, a failure is implied. The process of generating expected outputs and comparing them is traditionally done manually. Testers compute the expected outputs from program specifications or from their knowledge of how the program should operate. Expected outputs are then compared with actual outputs using the tester's knowledge to determine whether a failure occurs.

Fig. 1. General oracle model

The relationship from the input to the output of the software is in essence a function. Let x = (x_1, x_2, …, x_n) and y = (y_1, y_2, …, y_m) be the input and output vectors respectively. The relationship can then be represented by y = f(x), where f implements the specification of the software. When x ∈ R^n, y ∈ R^m, and f is continuous, the function f can be approximated by a NN after training. In this paper we propose the automated oracle for this situation. Each component of y is also a function of x, that is, y_j = f_j(x). Let D(x_i) be the set of all possible values of x_i and D(x) be the set of all possible values of x. Therefore D(x) includes every possible combination of values from the D(x_i), and D(x) = ∏_i D(x_i). Let x^i be a data item in D(x) and y^i = f(x^i); then (x^i, y^i) is a test case. An automated test oracle can generate y^i from x^i and compare it with the actual output automatically.
The model of the automated oracle is shown in Fig. 2. In this model, approximate outputs are automatically generated from the inputs. The approximate output is not the same as the expected output, but it can approach the expected output to any precision. The comparison process now becomes:

|η − a|_ε = 0               if |η − a| ≤ ε
|η − a|_ε = |η − a| − ε     otherwise                                            (3)
where η and a are the actual output and the approximate output respectively, and ε is the test precision. The indicator ε controls the criterion for deciding whether the actual value η is correct or not, and we can adjust ε to trade off precision against test cost. In the experiment we will describe the effect of ε. If |η − a|_ε = 0, the actual output is correct within precision ε. Otherwise, if |η − a|_ε > 0, a failure occurs because the actual output is not correct within precision ε. To generate the approximate outputs, the relationship from the input to the output of the AUT must be learned from a small set of training samples of the specification. Let D(y) be the co-domain of the AUT. Then the training samples are:

S = {(x^i, y^i) | x^i ∈ D'(x), D'(x) ⊂ D(x), y^i = f(x^i), y^i ∈ D(y)}           (4)

where y^i is the expected output. The approximate outputs are then automatically generated for ∀x ∈ D(x) by the relationship learned.
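
As an illustration, the comparison rule of equation (3) can be written as a small function; the name eps_compare and the scalar-output assumption are ours, not part of the paper.

```python
def eps_compare(eta, a, eps):
    """Equation (3): return 0 when the actual output eta agrees with the
    approximate output a within the test precision eps, otherwise return
    the excess |eta - a| - eps (a positive value signals a failure)."""
    diff = abs(eta - a)
    return 0.0 if diff <= eps else diff - eps
```

For example, with ε = 0.1 an actual output of 1.25 compared against an approximate output of 1.20 yields 0 (no failure), while an actual output of 1.35 yields 0.05 and a failure is exposed.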

Fig. 2. Automated oracle model

3.2 Automated Oracle Based on Neural Networks

In theory, NN can approximate any continuous function. This feature is used to generate the approximate output in the automated oracle. To implement the automated oracle, two processes must be automated: one is generating the approximate output by the NN, and the other is comparing the actual output from the AUT with the approximate output. The automated oracle can be summarized in the following steps.
Step 1: Manually generate the sample set S of equation (4) from the specification of the AUT.
Step 2: Construct the NN and initialize the weights. Set the training parameters.
Step 3: Train the NN by the back propagation algorithm on the training set S' ⊂ S.
Step 4: When the stopping criterion is satisfied, keep the weights and go to step 5.
Step 5: Obtain the approximate output a^i from the NN for ∀x^i ∈ D'(x). Set the test precision ε = max_i(abs(y^i − a^i)).
Step 6: Get the input of the AUT and feed it to the NN. Obtain the approximate output a from the NN.
Step 7: Get the actual output η from the AUT and compare it with the approximate output a according to equation (3). Determine whether there is a failure by the result of the comparison.
Step 8: Repeat steps 6 and 7 to test the AUT on other inputs.
Step 9: If testing at a different precision is needed, go to step 2. Otherwise, the process finishes.
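
A minimal sketch of the test-time part of this procedure (steps 5 to 8) is given below; nn stands for the trained network and aut for the application under test, both assumed to be callables taking one input, and all names are illustrative.

```python
def calibrate_precision(nn, xs, ys):
    """Step 5: set the test precision eps = max_i |y_i - a_i| over the sample
    inputs xs with expected outputs ys, where a_i = nn(x_i)."""
    return max(abs(y - nn(x)) for x, y in zip(xs, ys))

def run_oracle(nn, aut, test_inputs, eps):
    """Steps 6-8: for every test input, compare the actual output of the AUT
    with the approximate output of the NN using the rule of equation (3)."""
    failures = []
    for x in test_inputs:
        a = nn(x)                        # step 6: approximate output from the NN
        eta = aut(x)                     # step 7: actual output from the AUT
        if abs(eta - a) - eps > 0:       # equation (3): |eta - a|_eps > 0
            failures.append((x, eta, a))
    return failures
```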

4 Experiments
The goal of the experiment is to determine whether the proposed method is effective at exposing failures of the AUT. This includes whether NN can be used to generate an approximate output that is close to the expected output, whether the value of ε can be computed from samples, and whether the method can expose failures of the AUT.
The AUT in the experiment is an application with a GUI, shown in Fig. 3. It has three input variables and two output variables. The input and output vectors are x = (x_1, x_2, x_3) and y = (y_1, y_2) respectively. The relationship between input and output is:

y_1 = x_1 sin(3x_1) + cos(x_1) e^(1 − x_1²)                                      (5)

y_2 = (x_1² + x_2² + x_3² + 3x_2 x_3 + x_1 x_3) / 15                             (6)
where x_1, x_2, x_3 ∈ [0, 3π]. The output variable verified in the experiment is y_1 in equation (5). The approach can be generalized to more complicated and realistic situations. We manually seed faults into the AUT. The faults seeded in the AUT cause the failures listed in table 1, which shows that the faults make the actual output of the AUT differ from the expected output. The set S with 200 samples is generated manually from the specification of the AUT; it is used for training the NN and for computing the value of ε. The variable x_1 is sampled evenly from the interval [0, 3π] and the variable y_1 is computed manually. The NN is constructed and the training parameters are set as in table 2. We select samples from S evenly to form the training set S', whose size is 100, and use S' to train the NN. The training process is shown in Fig. 4: the performance achieves the goal quickly, after 192 epochs.
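
A sketch of how the sample set of this experiment could be produced is given below: y_1 follows equation (5) and x_1 is sampled evenly from [0, 3π]. Taking every second sample for S' is our assumption; the paper only states that S' is selected evenly from S.

```python
import numpy as np

def y1(x1):
    """Expected output y_1 of the AUT, equation (5)."""
    return x1 * np.sin(3 * x1) + np.cos(x1) * np.exp(1 - x1 ** 2)

x1 = np.linspace(0.0, 3 * np.pi, 200)   # 200 inputs drawn evenly from [0, 3*pi]
S = list(zip(x1, y1(x1)))               # sample set S: (input, expected output) pairs
S_train = S[::2]                        # 100 training samples S' selected evenly from S
```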

Fig. 3. The example of AUT

Table 1. Five failures caused by faults seeded in the AUT (x_1, η, and y_1 are the input of the AUT, the actual output, and the expected output)

Failure ID    x_1      η          y_1
1             1.1       0.3900     0.1942
2             1.57     -1.8732    -1.5698
3             3.683    -3.5776    -3.6777
4             5.9      -5.3631    -5.3842
5             7.31      0.4367     0.4467

Table 2. Neural network architecture and training parameters

Network architecture
  The number of layers                2
  The number of units on the layers   Input: 1; Hidden: 13; Output: 1
  Transfer functions                  logsig in 1st layer, purelin in 2nd layer

Training parameters
  Training function                   trainlm
  Learning function                   learngdm
  Performance function                mse
  Initial weights and biases          The Nguyen-Widrow method
  epochs                              10000
  goal                                0.001
  Adaptive learning rate              0.1

Fig. 4. The process of training the NN

After the NN is trained, we test the performance of the approximation. The expected output y_1, the approximate output a obtained from the trained NN, and the difference ζ = y_1 − a are shown in Fig. 5. From it we can see that the difference is small enough when the training goal is set to 0.001. It shows that the NN can be used to generate an approximate output that is close to the expected output.

Fig. 5. Plot of the expected outputs, approximate outputs and their difference when the training
goal is 0.001

The test precision ε is computed in this step. The trained NN is used to generate the approximate output a^i for ∀x^i ∈ D'(x), i = 1, …, 200. The test precision ε is set to max_i(abs(y_1^i − a^i)). The resulting value of ε is 0.1863 because the values of (y_1^i − a^i) lie in the interval [−0.1863, 0.0757]. As a result, if the actual output obtained from the AUT is in the interval [a − 0.1863, a + 0.1863], there is no failure. Otherwise, a failure is exposed. We now check whether the method can expose the failures in table 1. The result is shown in table 3, in which |η − a|_ε is computed by equation (3). A failure is exposed if |η − a|_ε is larger than 0. Table 3 shows that failures 1 and 2 can be exposed successfully when the training goal is set to 0.001, because |η − a|_ε > 0. Failures 3, 4, and 5 cannot be exposed, because the difference between the actual output and the approximate output is below the test precision ε.

Table 3. The failures that can be exposed when ε is 0.1863 (x_1, η, and a are the input, actual output, and approximate output; |η − a|_ε > 0 means a failure is exposed)

Failure ID    x_1      η          y_1        a          |η − a|_ε
1             1.1       0.3900     0.1942     0.1943     0.0094
2             1.57     -1.8732    -1.5698    -1.5730     0.1139
3             3.683    -3.5776    -3.6777    -3.6690     0
4             5.9      -5.3631    -5.3842    -5.3693     0
5             7.31      0.4367     0.4467     0.4870     0
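
These figures can be checked directly against equation (3): with ε = 0.1863 and the values of η and a from the table, the following sketch reproduces the last column (values copied from tables 1 and 3).

```python
# (x1, actual output eta, approximate output a) for the five seeded failures
cases = [
    (1.1,    0.3900,  0.1943),
    (1.57,  -1.8732, -1.5730),
    (3.683, -3.5776, -3.6690),
    (5.9,   -5.3631, -5.3693),
    (7.31,   0.4367,  0.4870),
]
eps = 0.1863
for x1, eta, a in cases:
    excess = max(0.0, abs(eta - a) - eps)          # equation (3): |eta - a|_eps
    status = "failure exposed" if excess > 0 else "within precision"
    print(x1, status, round(excess, 4))
```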

When we change the training goal, different test precisions can be achieved (table 4). This shows that we can set the goal according to the precision demanded in software testing. If a precision ε of 0.0165 is needed, we can set the training goal below 5e-005. If the actual output is in the interval [a − 0.0165, a + 0.0165], where a is the approximate output obtained from the NN, there is no failure. Otherwise, a failure occurs. All failures in table 1 can be exposed in this situation, which means the proposed method can expose failures effectively.

Table 4. Precision ε achieved under different training goals (min and max are the min and max difference between expected and approximate outputs)

Goal     min        max       ε
0.01     -0.3694    0.1852    0.3694
0.005    -0.2542    0.1378    0.2542
0.001    -0.1863    0.0757    0.1863
5E-04    -0.0818    0.0408    0.0818
1E-04    -0.0789    0.0294    0.0789
5E-05    -0.0149    0.0165    0.0165

In the experiment, the value of ε was obtained from 200 samples. We now obtain the value from more samples to see whether it changes noticeably. The result is shown in table 5; the meaning of each column is the same as in table 4. It shows that the value of ε computed from 200 samples is effective in this experiment, because it is the same as the value computed from a much larger number of samples, 100000.

Table 5. The comparison of the value ε obtained from different samples (min, max, and ε are obtained when the number of samples is 200; min', max', and ε' are obtained when the number of samples is 100000)

Goal     min        max       ε         min'       max'      ε'
0.01     -0.3694    0.1852    0.3694    -0.3694    0.1864    0.3694
0.005    -0.2542    0.1378    0.2542    -0.2542    0.1378    0.2542
0.001    -0.1863    0.0757    0.1863    -0.1863    0.0758    0.1863
5E-04    -0.0818    0.0408    0.0818    -0.0818    0.0409    0.0818
1E-04    -0.0789    0.0294    0.0789    -0.0789    0.0297    0.0789
5E-05    -0.0149    0.0165    0.0165    -0.0149    0.0165    0.0165

5 Conclusions
In statistical software testing, many test cases must be executed to simulate statistically the usage model of the AUT. However, it is difficult to manually generate the expected outputs for these test cases and compare the actual outputs of the AUT with them. As a result, an automated test oracle is proposed in this paper to solve the problem. The oracle can be applied when the relationship from the input to the output of the AUT is a continuous function. From the above results we conclude that NN can be used to implement the automated test oracle with reasonable precision. It can generate the approximate output for the AUT, and the precision can be adjusted by the training parameters. As a result, we can test the AUT at the precision needed. With this method we need not generate all expected outputs of the AUT manually, which can save a lot of time and labor in software testing. The experiment shows that the precision ε is important for exposing failures and that it can be computed from manually generated samples. However, it has not been verified whether the method remains effective when the relationship from input to output becomes more complicated. We will therefore conduct experiments on more complicated relationships.

Acknowledgment
This work was supported in part by the National High Technology Development Plan
of China (863) under grant no. 2003AA1Z2610.

References
1. Sayre, K.: Improved techniques for software testing based on Markov chain usage models,
PhD. thesis, University of Tennessee, Knoxville, USA (1999)
2. Bertolini, C., Farina, A.G., Fernandes, P., Oliveira, F.M.: Test case generation using sto-
chastic automata networks: quantitative analysis, In: Proc. of the second International
Conf. on Software Engineering and Formal Methods, IEEE Press (2004) 251-260

3. Beyer, M., Dulz, W., Zhen, F.: Automated TTCN-3 test case generation by means of UML
sequence diagrams and Markov chains, In: Proc. of the 12th Asian Test Symposium, Pis-
cataway : IEEE Press (2003) 102-105
4. Peters, D., Parnas, D.L.: Generating a test oracle from program documentation, In: Proc. of
the International Symposium on Software Testing and Analysis (1994) 58-65
5. Bousquet, L., Ouabdesselam, F., Richier, J., Zuanon, N.: Lutess: a specification-driven
testing environment for synchronous software, In: Proc. of the 21st International Conf. on
Software Engineering, ACM Press (1999) 267-276
6. Dillon, L.K., Ramakrishna, Y.S.: Generating oracles from your favorite temporal logic
specifications, In: Proc. of the 4th ACM SIGSOFT Symposium on the Foundations of
Software Engineering, ACM Software Engineering Notes, vol.21 (1996) 106-117
7. Schroeder, P.J., Faherty, P., Korel, B.: Generating expected results for automated black-
box testing, In: Proc. of the 17th IEEE International Conf. on Automated Software Engi-
neering, IEEE Press (2002) 139-148
8. Ostrand, T., Anodide, A., Foster, H., Goradia, T.: A visual test development environment
for GUI systems, ACM SIGSOFT Software Engineering Notes, vol.23, no.2 (1998) 82-92
9. Chen, W.K., Tsai, T.H., Chao, H.H.: Integration of specification-based and CR-based ap-
proaches for GUI testing, In: Proc. of the 19th International Conf. on Advanced Informa-
tion Networking and Applications, vol.1 (2005) 967-972
10. Memon, A., Nagarajan, A., Xie, Q.: Automating regression testing for evolving GUI soft-
ware, Journal of Software Maintenance and Evolution: Research and Practice, vol.17, no.1
(2005) 27-64
11. Chen, J., Subramaniam, S.: Specification-based testing for GUI-based applications, Soft-
ware Quality Journal, vol.10, no.3 (2002) 205-224
12. Hierons, R.M.: Testing from a Z specification, Software Testing, Verification, and Reli-
ability, vol.7 (1997) 19-33
13. McDonald, J., Strooper, P.: Translating object-Z specifications to passive test oracles, In:
Proc. of the 2nd International Conf. on Formal Engineering Methods, IEEE Press (1998)
165-174
14. Aggarwal, K.K., Singh, Y., Kaur, A., Sangwan, O.P.: A neural net based approach to test
oracle, ACM SIGSOFT Software Engineering Notes, ACM Press, vol.29, no.3 (2004) 1-6
15. Ramamoorthy, C.V., Ho, S.F., Chen, W.T.: On the automated generation of program test
data, IEEE Trans. Software Engineering, vol.SE-2 (1976) 293-300
16. Chen, T., Chen, H.: Approximations of continuous functionals by neural networks with
application to dynamic systems, IEEE Trans. Neural Networks, vol.4, no.6 (1993) 910-918
17. Chen, D.S., Jain, R.C.: A robust back propagation learning algorithm for function ap-
proximation, IEEE Trans. Neural Networks, vol.5, no.3 (1994) 467-479
18. Duda, R.O., Hart, P.E., Stork, D.G.: Pattern classification, second edition, John Wiley &
Sons (2001)
19. Fausett, L.: Fundamentals of neural networks: architectures, algorithms, and applications,
Prentice Hall: Englewood Cliffs, New Jersey (1994)
