
Outlier modification in Fuzzy Linear Regression with Fuzzy input and output using $T_w$ based arithmetic operations
B. Pushpa, R. Sattanathan and R. Vasuki
Abstract: Fuzzy linear regression has been widely studied and applied in various areas. However, Tanaka's approach is extremely sensitive to outliers. In this paper, to handle the outlier problem, Chen's approach is used together with $T_w$ based arithmetic operations, which can handle fuzzy input-fuzzy output datasets. Some numerical experiments are performed to assess the performance of the proposed approach.
Index Terms: Fuzzy linear regression, $T_w$-based arithmetic operations, fuzzy input and fuzzy output.

1 INTRODUCTION
Regression analysis has widespread applications in fields such as business, engineering and economics, where it is used to explore the statistical relationship between input (independent or explanatory) and output (dependent or response) variables. In a paper published in 1970, Bellman and Zadeh [1] applied fuzzy set theory to decision making. Since then, several authors have constructed different fuzzy regression models and proposed associated solution methods. The article by Tanaka et al. [8] is probably the first research on this topic. In their study, a regression problem with a fuzzy dependent variable and crisp independent variables was formulated as a mathematical programming problem. Their main objective was to minimize the total spread of the fuzzy regression coefficients, subject to the constraint that the regression model satisfy a pre-specified membership value in estimating the fuzzy responses. The main drawback of this approach is that it is scale dependent. Although the approach was later improved by Tanaka [9], Tanaka and Watada [10] and Tanaka et al. [11], it remained extremely sensitive to outliers, as pointed out by Redden and Woodall [6].
The problem of outliers was investigated by Peters [5]. However, Peters considered systems with non-fuzzy input and non-fuzzy output data. The present investigation focuses on fuzzy input and fuzzy output data, with fuzzy arithmetic operations based on the sup-t-norm convolution. This approach simplifies the evaluation of fuzzy linear regression models whose coefficients and input data are fuzzy numbers. An LP based method using the weakest t-norm, which satisfies the scale independence property, is discussed here, and Chen's [2] approach is used to handle the effect of outliers in fuzzy input-fuzzy output datasets.
2 FUZZY LINEAR REGRESSION USING SHAPE
PRESERVING OPERATIONS
In this paper, we consider fuzzy linear regression models based on Tanaka's [8] approach using $T_w$ based arithmetic operations, where both input data and output data are fuzzy numbers. For computational simplicity, it is assumed that the coefficients and variables are symmetric fuzzy numbers.
Let us consider the fuzzy outputs $Y_i = (y_i, e_i)$ and fuzzy inputs $X_i = (X_{i1}, X_{i2}, \dots, X_{ip})$, $i = 1, 2, \dots, n$, where $X_{ij} = (x_{ij}, \gamma_{ij})$, $j = 1, 2, \dots, p$. The problem is to determine fuzzy parameters $A_1, \dots, A_p$ satisfying the following conditions:
(a) The data are represented by a possibilistic linear model:
$$Y_i^* = A_0 + (A_1 X_{i1}) + \dots + (A_p X_{ip}) = A X_i, \quad i = 1, \dots, n,$$
where $A_j = (a_j, \alpha_j)$, $j = 1, \dots, p$.
(b) Given the input-output relations $(X_i, y_i)$, $i = 1, \dots, n$, and a threshold $h$, it must hold that
$$y_i \in [\tilde{Y}_i]_h, \quad i = 1, \dots, n.$$
(c) The index of fuzziness of the possibilistic linear model is defined as
$$\sum_{i=1}^{n} \max_{1 \le j \le p} \left( a_j \gamma_{ij},\; x_{ij} \alpha_j \right).$$
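As an illustration, the $T_w$ based operations behind conditions (a) and (c) can be sketched in Python. The sketch assumes the usual shape preserving rules for symmetric fuzzy numbers under the weakest t-norm: a sum keeps the largest spread of its terms, and a product of $(a, \alpha)$ and $(x, \gamma)$ has spread $\max(|a|\gamma, |x|\alpha)$. The names `SymFuzzy`, `tw_add`, `tw_mul` and `predict` are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SymFuzzy:
    """Symmetric fuzzy number written as (center, spread)."""
    center: float
    spread: float


def tw_add(p, q):
    # Assumed T_w rule: centers add, the larger spread survives.
    return SymFuzzy(p.center + q.center, max(p.spread, q.spread))


def tw_mul(p, q):
    # Assumed T_w rule: centers multiply, spread is max(|a|*gamma, |x|*alpha).
    return SymFuzzy(
        p.center * q.center,
        max(abs(p.center) * q.spread, abs(q.center) * p.spread),
    )


def predict(coeffs, inputs):
    """Evaluate A_0 + A_1 X_1 + ... + A_p X_p with the rules above.

    coeffs = [A_0, A_1, ..., A_p], inputs = [X_1, ..., X_p].
    """
    total = coeffs[0]
    for a_j, x_j in zip(coeffs[1:], inputs):
        total = tw_add(total, tw_mul(a_j, x_j))
    return total


# Illustrative numbers only (not data from the paper).
y_hat = predict([SymFuzzy(1.0, 0.5), SymFuzzy(2.0, 0.1)], [SymFuzzy(3.0, 0.4)])
print(y_hat)
```

Because every operation returns another symmetric fuzzy number, the estimated output never changes shape, which is what keeps the spread of the model expressible through a single maximum.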

- B. Pushpa is with Panimalar Institute of Technology, Poonamallee, Chennai-600123, India.
- R. Sattanathan is with D. G. Vaishnava College, Arumbakam, Chennai-600106, India.
- R. Vasuki is with SIVET College, Gowrivakkam, Chennai-600073, India.
JOURNAL OF COMPUTING, VOLUME 4, ISSUE 7, JULY 2012, ISSN (Online) 2151-9617
https://sites.google.com/site/journalofcomputing
WWW.JOURNALOFCOMPUTING.ORG 70
2012 Journal of Computing Press, NY, USA, ISSN 2151-9617
http://sites.google.com/site/journalofcomputing/
Under the above assumptions, the problem can be solved
as the following mixed LP problem:
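Minimize
$$\sum_{i=1}^{n} \max_{1 \le j \le p} \left( a_j \gamma_{ij},\; x_{ij} \alpha_j \right)$$
Subject to
$$\left| y_i - \sum_{j=1}^{p} a_j x_{ij} \right| \le L^{-1}(h) \max_{1 \le j \le p} \left( a_j \gamma_{ij},\; x_{ij} \alpha_j \right) - L^{-1}(h)\, e_i,$$
$$\alpha_j \ge 0, \quad i = 1, \dots, n, \quad j = 1, \dots, p.$$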
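The membership constraint can be read as requiring that the $h$-level interval of each observed output lie inside the $h$-level interval of the estimated output. The following Python sketch checks that condition for one observation. It assumes a triangular reference function $L(x) = 1 - |x|$, so that $L^{-1}(h) = 1 - h$; the paper does not state which $L$ is used, and the function names are hypothetical.

```python
def l_inverse(h):
    """Inverse of the assumed triangular reference function L(x) = 1 - |x|."""
    if not 0.0 <= h < 1.0:
        raise ValueError("h must lie in [0, 1)")
    return 1.0 - h


def h_level_interval(center, spread, h):
    """h-level interval of the symmetric fuzzy number (center, spread)."""
    half_width = l_inverse(h) * spread
    return center - half_width, center + half_width


def covers(observed, estimated, h, tol=1e-9):
    """True if the h-level interval of `observed` lies inside that of `estimated`.

    Both arguments are (center, spread) pairs. The test is equivalent to
    |y - c| <= L^{-1}(h) * (s - e) for observed (y, e) and estimated (c, s).
    """
    obs_lo, obs_hi = h_level_interval(observed[0], observed[1], h)
    est_lo, est_hi = h_level_interval(estimated[0], estimated[1], h)
    return est_lo <= obs_lo + tol and obs_hi <= est_hi + tol


# Illustrative numbers only: an observation (8.0, 1.8) against two estimates.
print(covers((8.0, 1.8), (8.5, 3.0), h=0.5))
print(covers((8.0, 1.8), (8.5, 2.0), h=0.5))
```

A check of this kind is convenient for validating a solution returned by a solver, since it evaluates the constraint directly instead of relying on the solver's own feasibility report.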

3 MODIFICATION IN OUTLIERS
When outliers exist in the data, possibilistic fuzzy linear regression gives incorrect results in the sense that the estimated interval is too wide to be useful. A reasonably modified interval should therefore be considered. Chen [2] proposed a probabilistic approach based on the linear program given by Tanaka et al. to detect outliers in crisp input-fuzzy output datasets. The main idea of this approach is to keep the difference between the spreads of the estimated outputs and the actual ones below a user-defined value $k \in \mathbb{R}$. The linear program of this approach is given below:
Min
$$\sum_{i=1}^{n} \left( c_0 + c_1 x_{1i} + c_2 x_{2i} + \dots + c_n x_{ni} \right)$$
Subject to
$$a^T x_i + (1 - h)\, c^T x_i \ge y_i + (1 - h)\, e_i,$$
$$a^T x_i - (1 - h)\, c^T x_i \le y_i - (1 - h)\, e_i,$$
$$c^T x_i - e_i \le k.$$
In this approach, setting $k$ to a large value makes the model less sensitive to outliers. The constraints that involve an outlier are modified as:
$$a^T x_i + (1 - h)\, c^T x_i \ge y_i + (1 - \lambda)(1 - h)\, e_i,$$
$$a^T x_i - (1 - h)\, c^T x_i \le y_i - (1 - \lambda)(1 - h)\, e_i,$$
where
$$\lambda = 1 - \frac{e_{i-r} + \dots + e_{i+r}}{2 r e_i}.$$
The weight $\lambda$ decides the influence of the abnormal values on the overall data, that is, the degree of importance of these outliers. To make $\lambda$ more reliable and objective, the value of $r$ should be reasonably large.
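The effect of the outlier weight can be illustrated numerically. The Python sketch below assumes that the sum in the numerator runs over the $2r$ neighbouring spreads on either side of the outlier and excludes the outlier's own spread, which is what makes the denominator $2 r e_i$ an average; this reading, the variable name `lam` for the weight, and the function names are assumptions. The spreads are those of the crisp input-fuzzy output example in Table 1, where the eighth observation is marked as the outlier.

```python
def outlier_weight(spreads, i, r):
    """Weight 1 - (sum of the 2r neighbouring spreads) / (2 * r * e_i).

    `i` is the zero-based index of the outlier. The outlier's own spread
    is assumed to be excluded from the sum.
    """
    if r < 1 or i - r < 0 or i + r >= len(spreads):
        raise ValueError("the window of radius r must fit inside the data")
    neighbours = [spreads[j] for j in range(i - r, i + r + 1) if j != i]
    return 1.0 - sum(neighbours) / (2 * r * spreads[i])


def modified_spread(spreads, i, r):
    """Spread (1 - lam) * e_i that replaces e_i in the outlier's constraints."""
    lam = outlier_weight(spreads, i, r)
    return (1.0 - lam) * spreads[i]


# Spreads e_1, ..., e_10 from Table 1; index 7 is the flagged observation.
spreads = [1.8, 2.2, 2.6, 2.6, 2.4, 2.3, 2.2, 4.8, 1.9, 2.0]
for r in (1, 2):
    print(r, outlier_weight(spreads, 7, r), modified_spread(spreads, 7, r))
```

Under this reading the modified spread is simply the mean spread of the neighbours, so the outlier's constraint is relaxed to the level of the surrounding observations.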
Chen's method above is incorporated into Tanaka's approach using $T_w$ based arithmetic operations, and the linear program of this approach is given below:
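Minimize
$$\sum_{i=1}^{n} \max_{1 \le j \le p} \left( a_j \gamma_{ij},\; x_{ij} \alpha_j \right)$$
Subject to
$$\left| y_i - \sum_{j=1}^{p} a_j x_{ij} \right| \le L^{-1}(h) \max_{1 \le j \le p} \left( a_j \gamma_{ij},\; x_{ij} \alpha_j \right) - L^{-1}(h)\, e_i,$$
$$\max_{1 \le j \le p} \left( a_j \gamma_{ij},\; x_{ij} \alpha_j \right) - e_i \le k, \quad i = 1, \dots, n, \quad j = 1, \dots, p,$$
where
$$k = \max_{1 \le i \le n} \{ e_i \} - \max_{j \ne i} \{ e_j \}.$$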

The above procedure is illustrated with some examples, including fuzzy input-fuzzy output data.
4 EXAMPLES
We first use the numerical values used by Chen [2], which consist of crisp input and fuzzy output data with a modified outlier. The results of the proposed approach, applying Chen's treatment to the outlier in the spreads of the fuzzy output data and solved using LINGO, are given in Table 1. The original data with the outlier and the data with the modified outlier are compared in Figure 1 and Figure 2.
Table 1. Crisp input and fuzzy output data used by Chen [2]

x    $(y_i, e_i)$   Observed       Estimated interval,    Estimated interval,
                    interval       Chen's modified        proposed approach
                                   approach (h = 0)       (h = 0.5)
1    (8.0,1.8)      (6.2,9.8)      (6.2,10.2)             (4.43,8.55)
2    (6.4,2.2)      (4.2,8.6)      (4.2,12.1)             (6.26,10.38)
3    (9.5,2.6)      (6.9,12.1)     (6.2,14.13)            (8.09,12.21)
4    (13.5,2.6)     (10.9,16.1)    (8.2,16.1)             (9.92,14.04)
5    (13.0,2.4)     (10.6,15.4)    (10.2,18.1)            (11.75,15.87)
6    (15.2,2.3)     (13.1,17.3)    (12.2,20.03)           (13.58,17.7)
7    (17.0,2.2)     (15,19)        (14.2,22.0)            (15.37,19.57)
8    (19.3,4.8)*    (14.5,24.1)    (16.2,23.96)           (16.9,21.7)
9    (20.1,1.9)     (18.2,22.0)    (18.2,25.93)           (18.43,23.83)
10   (24.3,2.0)     (22.3,26.3)    (20.2,27.9)            (19.96,25.96)

* indicates the outlier
Table 2. Estimated parameter values of $A_0$ and $A_1$

Method      $A_0$          $A_1$
Tanaka's    (3.84,3.85)    (2.10,0)
Proposed    (4.66,2.06)    (1.83,0.3)
Table 2 shows the estimated parameter values of $A_0$ and $A_1$ for the crisp input-fuzzy output data, obtained with the proposed fuzzy linear regression approach using shape preserving operations under the $T_w$ norm and with Tanaka's approach.
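The link between Table 2 and Table 1 can be checked with a few lines of Python. The sketch assumes that, for a crisp input $x$, the estimated output $A_0 + A_1 x$ has center $a_0 + a_1 x$ and, under the $T_w$ based addition, spread $\max(\alpha_0, |x| \alpha_1)$; the function name `estimated_interval` is hypothetical.

```python
# Proposed coefficients from Table 2, each written as (center, spread).
A0 = (4.66, 2.06)
A1 = (1.83, 0.3)


def estimated_interval(x, a0=A0, a1=A1):
    """Interval (center - spread, center + spread) for a crisp input x."""
    center = a0[0] + a1[0] * x
    # Assumed T_w rule: the spread of a sum is the largest term spread.
    spread = max(a0[1], abs(x) * a1[1])
    return round(center - spread, 2), round(center + spread, 2)


for x in range(1, 11):
    print(x, estimated_interval(x))
```

With these coefficients the computed intervals agree with the last column of Table 1 for all ten observations. The spread stays at 2.06 up to $x = 6$ and grows as $0.3x$ from $x = 7$ onward, which is where the term $A_1 x$ starts to dominate the maximum.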
Figure 2 shows that the proposed method gives a narrower spread than the fuzzy linear regression given by Chen [2] in Figure 3.
D.H. Hong et al. [3] designed the example given in Table 3 to illustrate their fuzzy regression model with fuzzy input and fuzzy output. By applying the proposed algorithm, using LINGO, the outliers are modified and a fuzzy linear regression model is constructed.
Table 3. Fuzzy input and fuzzy output data used by D.H. Hong et al. [3], along with the estimated interval obtained using the shape preserving operations in fuzzy linear regression.

$(x_i, \gamma_i)$   $(y_i, e_i)$    Estimated interval using the
                                    proposed approach (h = 0.4)
(2.0,0.5)           (4.0,0.5)       (1.71,7.35)
(3.5,0.5)           (5.5,0.5)       (2.52,8.16)
(5.5,1.0)           (7.5,1.0)       (3.59,9.23)
(7.0,0.5)           (6.5,0.5)       (4.39,10.03)
(8.5,0.5)           (8.5,0.5)       (5.2,10.84)
(10.5,1.0)          (8.0,1.0)       (6.27,11.91)
(11.0,0.5)          (10.5,2.5)*     (6.54,12.18)
(12.5,0.5)          (9.5,0.5)       (7.34,12.98)

* indicates abnormal values
Table 4 shows the estimated parameter values of $A_0$ and $A_1$ for the fuzzy input-fuzzy output data, obtained with the proposed fuzzy linear regression approach using shape preserving operations under the $T_w$ norm and with the method of Sakawa and Yano [7].

Table 4. Estimated parameter values of $A_0$ and $A_1$

Method             $A_0$             $A_1$
Sakawa and Yano    (3.367,0.4260)    (0.559,0.111)
Proposed           (3.45,2.81)       (0.536,0.0295)
Figure 4 and Figure 5 show that, after modifying the outlier data, we obtain the best fit for the fuzzy linear regression. Compared with the result of Sakawa and Yano [7] given in Table 4, the proposed approach is more effective, even though different methods and different arithmetic operations are used.
To measure how well the fuzzy regression fits the data, we can use an index of confidence, which is measured as
$$IC = 1 - \frac{SSE}{SST}.$$
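A minimal Python sketch of this index is given below. The paper does not spell out how SSE and SST are evaluated for fuzzy data, so the sketch assumes the conventional definitions applied to the centers: SSE is the sum of squared differences between observed and estimated centers, and SST is the sum of squared deviations of the observed centers from their mean. The function name is hypothetical, and the values it produces are not claimed to reproduce the IC figures reported in this paper.

```python
def index_of_confidence(observed, estimated):
    """IC = 1 - SSE / SST, computed on crisp centers (assumed definition)."""
    if len(observed) != len(estimated) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    mean = sum(observed) / len(observed)
    sse = sum((y - y_hat) ** 2 for y, y_hat in zip(observed, estimated))
    sst = sum((y - mean) ** 2 for y in observed)
    if sst == 0:
        raise ValueError("SST is zero: all observed values are identical")
    return 1.0 - sse / sst


# Illustrative numbers only (not data from the paper).
print(index_of_confidence([1.0, 2.0, 3.0], [1.1, 1.9, 3.2]))
```

An IC close to 1 means the estimated centers track the observed ones closely, while an IC near 0 means the model does no better than the mean of the observations.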
Table 5. IC value of the proposed method for crisp/fuzzy input and fuzzy output.

Type of data                  Original data with outlier    Modified outlier data
                              (fuzzy regression using       (fuzzy regression using
                              $T_w$ based operations)       $T_w$ based operations)
Crisp input, fuzzy output     0.4429                        0.5253
Fuzzy input, fuzzy output     0.8912                        0.9289
From Table 5, it is clear that the proposed method is best suited to fuzzy input-fuzzy output data. Chen's [2] treatment of outliers, combined with $T_w$ based arithmetic operations in fuzzy linear regression, yields a higher IC value for fuzzy input-fuzzy output data than for crisp input-fuzzy output data.
5 CONCLUSION
Although the proposed mathematical model is a non-linear problem, it is easily solved with the general-purpose software package LINGO. The performance of the proposed approach is illustrated by examples from the existing literature. The results show that our approach is suitable for fuzzy input and fuzzy output data. In addition, the computational efficiency and effectiveness of the proposed approach appear satisfactory, as demonstrated using the crisp/fuzzy input and fuzzy output examples.
REFERENCES
[1] R.E. Bellman, L.A. Zadeh, Decision making in a fuzzy environment, Manage. Sci. 17B (1970) 141-164.
[2] Y.S. Chen, Outliers detection and confidence interval modification in fuzzy regression, Fuzzy Sets and Systems 119 (2001) 252-279.
[3] D.H. Hong, S. Lee and H.Y. Do, Fuzzy linear regression analysis for fuzzy input-output data using shape preserving operations, Fuzzy Sets and Systems 122 (2001) 513-526.
[4] A. Kolesárová, Additive preserving the linearity of fuzzy intervals, Tatra Mountains Math. Publ. 6 (1995) 75-81.
[5] G. Peters, Fuzzy linear regression with fuzzy intervals, Fuzzy Sets and Systems 63 (1994) 45-55.
[6] D.T. Redden, W.H. Woodall, Properties of certain fuzzy linear regression models, Fuzzy Sets and Systems 44 (1994) 361-375.
[7] M. Sakawa and H. Yano, Multiobjective fuzzy linear regression analysis for fuzzy input and fuzzy output data, Fuzzy Sets and Systems 47 (1992) 173-181.
[8] H. Tanaka, S. Uejima and K. Asai, Linear regression analysis with fuzzy model, IEEE Trans. Systems, Man, Cybernet. 12 (1982) 903-907.
[9] H. Tanaka, Fuzzy data analysis by possibilistic linear models, Fuzzy Sets and Systems 24 (1987) 363-375.
[10] H. Tanaka and J. Watada, Possibilistic linear systems and their application to the linear regression model, Fuzzy Sets and Systems 27 (1988) 275-289.
[11] H. Tanaka, I. Hayashi and J. Watada, Possibilistic linear regression analysis for fuzzy data, Eur. J. Oper. Res. 40 (1989) 389-396.
B. Pushpa received her B.Sc. and M.Phil. degrees in Mathematics from the University of Madras, Chennai. She received her M.Sc. degree from Anna University, Chennai. She is pursuing her Ph.D. at Manonmaniam Sundaranar University, Tirunelveli, Tamilnadu. Currently she is an Assistant Professor in the Mathematics Department at Panimalar Institute of Technology, Poonamallee, Chennai. Her research interests include fuzzy probability and fuzzy regression analysis.
R. Sattanathan received his Ph.D. degree from the University of Madras. His research interests include functional analysis, pattern recognition, fuzzy probability and fuzzy regression analysis.
R. Vasuki received her B.Sc., M.Sc., M.Phil. and Ph.D. degrees in Mathematics from the University of Madras, Chennai. She is an Assistant Professor in the Mathematics Department at SIVET College, Gowrivakkam, Chennai. Her research interests include fuzzy metric spaces and fuzzy regression analysis.