Remote Sensing Image Classification Method Based On Evidence Theory and Decision Tree

LI Xuerong*(a,b), XING Qianguo(b), KANG Lingyan(a,b)
(a) Graduate University, Chinese Academy of Sciences, Beijing 100080, P.R. China;
(b) Yantai Institute of Coastal Zone Research, Chinese Academy of Sciences, 17 Chunhui Road, Laishan District, Yantai 264003, P.R. China
ABSTRACT
Remote sensing image classification is an important and complex problem. Conventional classification methods are mostly based on Bayesian subjective probability theory, which has shortcomings in representing uncertainty. This paper first introduces evidence theory and the decision tree method, and then focuses on the support degree function through which evidence theory is applied to pattern recognition. Combining Dempster-Shafer (D-S) evidence theory with the decision tree algorithm, a D-S evidence theory decision tree method is proposed, in which the support degree function is the link between the two. The method classifies classes such as water, urban land and green land, taking their distinctive spectral feature parameters as input, and produces three classification images of support degree. A proper threshold is then chosen for each image, and the image is binarized. The binary images are overlaid according to class to obtain the initial result, whose accuracy is then assessed. If the initial classification accuracy does not meet the requirement, pixels whose support degree is below the threshold are reclassified until the final classification meets the accuracy requirement. Compared with Bayesian classification, the main advantages of this method are that it supports reclassification and can reach a very high accuracy. Finally, the method is used to classify the land use of the Yantai Economic and Technological Development Zone into three classes (urban land, green land and water), and the results support the effectiveness of the method.
Keywords: evidence theory, decision tree, support degree, remote sensing classification
1. INTRODUCTION
Remote sensing image classification is a branch of pattern recognition in the remote sensing field. It aims to identify the content of remote sensing images, i.e. to recognize and classify the ground cover information in an image, thereby distinguishing the corresponding ground truth and extracting the required information [1-2]. The classification of remote sensing data is important. Remote sensing data are uncertain in the sense that each attribute value carries a confidence level, which stems from the acquisition, transmission and storage of the data.
Dempster-Shafer evidence theory (D-S evidence theory) [3] is an extension of probability theory that constructs a one-to-one relationship between propositions and sets. It is an uncertainty theory that transforms the uncertainty of a proposition into the uncertainty of a set. In recent years, D-S evidence theory has been applied to the description, processing and deduction of uncertain, incomplete or unreliable data and information [4-6].
Classification is an important task in data mining: it constructs models that assign data to different classes. The decision tree classifier [7-8] is a supervised classification method that is nonparametric and does not require the data to be normally distributed. It classifies data according to classification rules, which can be learned during training or predefined. There are many decision tree algorithms, such as ID3, C4.5 and CART, which are effective and widely used in classification, but they cannot deal with uncertain data when constructing the tree or classifying with it.
To address this limitation of traditional decision tree algorithms, a D-S evidence theory decision tree method is proposed, which combines D-S evidence theory with the decision tree classifier. The method can deal with the uncertainty of remote sensing imagery. To apply evidence theory to the decision tree algorithm, a support degree is introduced, which serves as the classification rule of the decision tree. Compared with statistical classification methods, the decision tree method using support degree shows clear advantages. Experimental results demonstrate that the proposed method is effective and can improve the classification accuracy.
*xrli@yic.ac.cn; phone 86-535-2109033; fax 86-535-2109000; yic.ac.cn.
Multispectral, Hyperspectral, and Ultraspectral Remote Sensing Technology, Techniques, and Applications III, edited by Allen M. Larar, Hyo-Sang Chung, Makoto Suzuki, Proc. of SPIE Vol. 7857, 78570Y, 2010 SPIE, CCC code: 0277-786X/10/$18, doi: 10.1117/12.869544
2. D-S EVIDENCE THEORY AND DECISION TREE

2.1 D-S evidence theory
The theory of evidence was first proposed by Dempster in 1967 and then extended by Shafer as a mathematical framework for the representation of uncertainty. D-S evidence theory allows for a representation of both imprecision and uncertainty [9-10] through the definition of two functions, belief ($Bel$) and plausibility ($Pls$), both derived from a mass function $m$ (or basic probability assignment). Mass functions are defined on the power set of the space of discernment $D$, i.e. a mass is attributed to each subset of $D$. In classification problems, $D$ may for instance be the set of classes of interest, and a subset of $D$ represents a union of classes. This is a major difference from probabilistic approaches, which only assign probabilities to singletons (i.e. to subsets of $D$ of cardinality 1). In the following, singletons are called simple hypotheses, whereas subsets containing at least two elements of $D$ are called compound hypotheses. A mass function $m$ is thus a function from $2^D$ into $[0, 1]$ such that
$$m(\emptyset) = 0, \qquad \sum_{A \subseteq D} m(A) = 1. \tag{1}$$
A subset $A$ with non-zero mass value is called a focal element.

The problem of assigning masses to hypotheses becomes more complicated if values have to be assigned to compound
hypotheses. Belief and plausibility functions are derived from the mass function, and are respectively defined by
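$$Bel(\emptyset) = 0, \qquad Bel(A) = \sum_{B \subseteq A} m(B), \quad \forall A \subseteq D,\ A \neq \emptyset, \tag{2}$$
$$Pls(\emptyset) = 0, \qquad Pls(A) = \sum_{B \cap A \neq \emptyset} m(B), \quad \forall A \subseteq D,\ A \neq \emptyset. \tag{3}$$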
Clearly, we have the following properties:

$$Bel(D) = 1, \tag{4}$$
$$Pls(D) = 1, \tag{5}$$
$$Bel(A) \leq Pls(A), \quad \forall A \subseteq D, \tag{6}$$
$$Pls(A) = 1 - Bel(\bar{A}), \quad \forall A \subseteq D. \tag{7}$$
D-S evidence theory provides an explicit measure of ignorance about an event $A$ and its complement $\bar{A}$ as the length of the interval $[Bel(A), Pls(A)]$ (called the belief interval). This length can also be interpreted as the imprecision on the true probability of $A$. The mass assigned to $D$ can be interpreted as the global ignorance, since this weight of evidence is not discernible among the hypotheses. In summary, as in probability theory, numerical values in $[0, 1]$ represent uncertainty, but through the two functions $Bel$ and $Pls$, D-S evidence theory is also able to represent imprecision.
If masses are assigned only to simple hypotheses ($m(A) = 0$ for $|A| > 1$), then the three functions $m$, $Bel$ and $Pls$ are equal and form a probability; $m$ is then called a Bayesian mass function. Otherwise, there is no direct equivalence with probabilities.
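The definitions of belief and plausibility can be illustrated with a short Python sketch. It is only an illustration: the three class names and the mass values are made up for the example and do not come from the experiment, and a mass function is represented here as a dictionary from frozensets (subsets of $D$) to masses.

```python
# Illustrative sketch: Bel and Pls computed from a mass function.
# The frame and the mass values below are hypothetical.

def belief(mass, a):
    """Bel(A): sum of m(B) over all non-empty focal elements B contained in A."""
    a = frozenset(a)
    return sum(value for b, value in mass.items() if b and b <= a)

def plausibility(mass, a):
    """Pls(A): sum of m(B) over all focal elements B that intersect A."""
    a = frozenset(a)
    return sum(value for b, value in mass.items() if b & a)

frame = frozenset({"water", "urban", "green"})
mass = {
    frozenset({"water"}): 0.5,           # simple hypothesis
    frozenset({"urban", "green"}): 0.3,  # compound hypothesis
    frame: 0.2,                          # global ignorance assigned to D
}

bel_water = belief(mass, {"water"})
pls_water = plausibility(mass, {"water"})
print("belief interval for water:", (bel_water, pls_water))
```

With these masses the belief interval of "water" has a non-zero length because part of the mass is assigned to the whole frame; if all mass were placed on singletons, the two functions would coincide, as stated above.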
2.2 Decision tree

The decision tree is one of the most popular classification algorithms [11-15]. It classifies data with a tree-structured model of decisions and their possible consequences, or equivalently with decision rules, constructed by learning from a training dataset. A decision tree consists of one root node, internal branch nodes and leaf nodes. Each internal node represents a test on an attribute of the original dataset (usually called the test attribute), and each branch represents a possible outcome of that test. Each leaf node represents one class or category; different leaf nodes can represent the same class. A decision tree can be described by a group of production rules in IF-THEN style: each path from the root to a leaf node stands for one rule, the condition of the rule is the conjunction of the attribute tests at the nodes along the path, and the result of the rule is the class of the leaf node at the end of the path. In practical applications, rules are often preferred to the tree itself, because they are more concise and easier to understand, apply and adjust when building an expert system. A decision tree can be flexibly adjusted through the conditions of its internal nodes or rules. In image classification, the basic scheme of the decision tree is to split off and mask each target as an image layer, so that one target does not interfere with the extraction of another. The remotely sensed data are gradually classified into the branches of the decision tree according to the rules. CART (Classification and Regression Trees) is a popular tree-growing method that constructs a binary tree from a training dataset for supervised classification. The root node stands for all the samples and is divided into two child nodes; every child node is then divided into lower-level child nodes, and the division continues until no node can be divided further. As a non-parametric, multilayer method that is free of assumptions about the data distribution, the decision tree is robust and flexible for data analysis and interpretation.
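The correspondence between root-to-leaf paths and IF-THEN rules can be sketched in Python as follows. The binary tree, the band names and the thresholds are hypothetical and chosen only to show the mechanism; they are not the tree used in the experiment.

```python
# Illustrative sketch: a binary decision tree stored as nested dictionaries
# and converted into IF-THEN rules. Band names and thresholds are hypothetical.

def tree_to_rules(node, conditions=()):
    """Return one IF-THEN rule per path from the root to a leaf."""
    if "label" in node:  # leaf node: emit the rule for this path
        condition = " AND ".join(conditions) if conditions else "TRUE"
        return [f"IF {condition} THEN class = {node['label']}"]
    passed = f"{node['feature']} <= {node['threshold']}"
    failed = f"{node['feature']} > {node['threshold']}"
    return (tree_to_rules(node["left"], conditions + (passed,))
            + tree_to_rules(node["right"], conditions + (failed,)))

def classify(node, sample):
    """Walk from the root to a leaf, following the outcome of each test."""
    while "label" not in node:
        go_left = sample[node["feature"]] <= node["threshold"]
        node = node["left"] if go_left else node["right"]
    return node["label"]

tree = {
    "feature": "band4", "threshold": 40,
    "left": {"label": "water"},
    "right": {
        "feature": "band3", "threshold": 60,
        "left": {"label": "green land"},
        "right": {"label": "urban land"},
    },
}

rules = tree_to_rules(tree)
for rule in rules:
    print(rule)
```

Each printed rule is the conjunction of the tests along one path, which is why adjusting a single node condition changes every rule that passes through that node.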
3. D-S EVIDENCE THEORY DECISION TREE METHOD

3.1 Image classification
Classification of a digital image is the procedure of assigning image pixels with different properties, structure, etc. to different classes, and pixels with similar properties to the same class. The kernel of classification is to define the center and scope of every class and the corresponding classification decision functions. If two pixels are similar, they should have similar feature vectors, and the distance between the two feature vectors should be small.
Suppose $x$ and $y$ are two vectors from a remote sensing image, each with $m$ features:
$$x = (x_1, x_2, \ldots, x_m)^T, \qquad y = (y_1, y_2, \ldots, y_m)^T.$$
The Euclidean distance $\|x - y\|$ measures the similarity of $x$ and $y$:
$$\|x - y\| = \sqrt{\sum_{i=1}^{m} (x_i - y_i)^2}.$$
The smaller $\|x - y\|$ is, the smaller the difference between $x$ and $y$ over the features; otherwise, the difference is larger.
A pixel $x$ belongs to class $A$ when $x$ is nearest to the center (average vector) of class $A$. Let $A = \{x_1, x_2, \ldots, x_n\}$, where each $x_i$ has $m$ features, the superscript indexing the feature:
$$x_i = (x_i^1, x_i^2, \ldots, x_i^m), \quad i = 1, 2, \ldots, n.$$
The average of the $j$-th feature over the class is
$$y_j = \frac{1}{n} \sum_{i=1}^{n} x_i^j, \quad j = 1, 2, \ldots, m,$$
which gives the average vector $y = (y_1, y_2, \ldots, y_m)$ of class $A$. The pixel $x$ belongs to class $A$ if, among all classes, the distance between $x$ and the average vector is smallest for class $A$.
The distance thus expresses the similarity between $x$ and class $A$. However, it is hard to conclude from the distance alone that the similarity between $x$ and class $A$ is better than the similarity between $x$ and class $B$. A support degree is therefore needed to reflect the similarity between pixel $x$ and class $A$.
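The minimum-distance rule described in this subsection can be sketched with NumPy. The training vectors and the test pixel are hypothetical values for three bands; the function names are introduced only for this illustration.

```python
import numpy as np

# Illustrative sketch of the minimum-distance rule; all sample values are hypothetical.

def class_mean(samples):
    """Average vector y of a class: mean of each feature over the n samples."""
    return np.asarray(samples, dtype=float).mean(axis=0)

def euclidean(x, y):
    """Euclidean distance ||x - y|| between two feature vectors."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.sqrt(np.sum((x - y) ** 2)))

def nearest_class(x, means):
    """Return the class whose average vector is closest to x, plus all distances."""
    distances = {name: euclidean(x, mean) for name, mean in means.items()}
    return min(distances, key=distances.get), distances

training = {
    "water": [[10, 8, 5], [12, 10, 7]],
    "urban": [[60, 70, 80], [64, 74, 84]],
}
means = {name: class_mean(samples) for name, samples in training.items()}
label, distances = nearest_class([14, 12, 9], means)
print(label, distances)
```

The returned distances show the limitation noted above: they rank the classes for one pixel, but they are not on a common scale from pixel to pixel, which motivates the normalized support degree.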
3.2 Support degree of D-S evidence theory

To apply evidence theory to the decision tree algorithm, a support degree is introduced. Compared with statistical classification methods, the decision tree method using support degree shows clear advantages. The final result is obtained by fusing the two classification results with D-S evidence theory. Experimental results demonstrate that the proposed method is feasible and can improve the classification accuracy.
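Let $\theta_i$ denote the proposition $x \in A_i$ $(i = 1, 2, \ldots, n)$, and let $D = \{\theta_1, \theta_2, \ldots, \theta_n\}$ be the recognition frame. A plausibility function ($Pls$) is derived on the recognition frame:
$$Pls(\{\theta_i\}) = \frac{C}{\|x - A_i\|}, \quad i = 1, 2, \ldots, n,$$
where $C$ is a constant and $\|x - A_i\|$ is the distance between $x$ and the average vector of class $A_i$. For any $A \subseteq D$,
$$Pls(A) = \max_{\theta_i \in A} Pls(\{\theta_i\}) = \max_{\theta_i \in A} \frac{C}{\|x - A_i\|} = \frac{\min_{i \in (1, \ldots, n)} \|x - A_i\|}{\min_{\theta_i \in A} \|x - A_i\|},$$
so that $C = \min_{i \in (1, \ldots, n)} \|x - A_i\|$ and $Pls(D) = 1$.
So we can get the support degree function $S(A)$ on the recognition frame $D$ in D-S evidence theory:
$$S(A) = 1 - Pls(\bar{A}) = 1 - \frac{\min_{i \in (1, \ldots, n)} \|x - A_i\|}{\min_{\theta_i \in \bar{A}} \|x - A_i\|},$$
where $\bar{A}$ is the complement of $A$ in $D$.
The larger the value of $S(A)$, the more strongly $x$ is supported as belonging to the classes in $A$. The support degree function therefore serves as the classification rule.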
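A minimal Python sketch of the support degree is given below. It assumes the reading $S(A) = 1 - Pls(\bar{A})$, i.e. one minus the ratio of the smallest distance over all classes to the smallest distance over the classes outside $A$. The distances are hypothetical, classes are identified by their index, and the handling of the edge cases ($A$ empty, $A = D$, zero distance) is an assumption of this sketch.

```python
import numpy as np

# Illustrative sketch of the support degree, assuming S(A) = 1 - Pls(complement of A).
# The distance values are hypothetical.

def support_degree(distances, subset):
    """Support degree of the class subset A given the distances from x to all class means."""
    d = np.asarray(distances, dtype=float)
    in_a = np.array([i in subset for i in range(d.size)])
    if not in_a.any():       # A is empty: no support
        return 0.0
    if in_a.all():           # A = D: the complement is empty, so Pls of it is 0
        return 1.0
    nearest_outside = d[~in_a].min()
    if nearest_outside == 0:  # x coincides with the mean of a class outside A
        return 0.0
    return float(1.0 - d.min() / nearest_outside)

# Distances from one pixel to the means of classes 0, 1 and 2.
distances = [2.0, 8.0, 10.0]
s_nearest = support_degree(distances, {0})     # class 0 is the nearest class
s_other = support_degree(distances, {1})       # class 1 is not the nearest class
s_compound = support_degree(distances, {0, 1})
print(s_nearest, s_other, s_compound)
```

Under this reading, a singleton class receives a positive support degree only when it is the nearest class, and the value grows as the second-nearest class moves further away.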
3.3 D-S evidence theory decision tree method

D-S evidence theory is an important tool for representing the uncertainty of data. The support degree indicates which class each pixel should belong to. By combining the support degree with the decision tree algorithm, a D-S evidence theory decision tree method is proposed. Its technical process consists of the following steps:
(1) Preprocess the remote sensing images, e.g. geometric correction of the TM data and selection of proper bands.
(2) Choose proper bands and representative data as training samples according to the ground objects and classification types, in order to construct the decision tree.
(3) Use ENVI's decision tree algorithm, with the support degree from evidence theory as the branch condition, to construct the decision tree of one object class; the leaf nodes represent different support degrees, such as 0, 0.1, 0.2, 0.3, up to greater than 0.6.
(4) Execute the decision tree to classify the data and produce the classification image of that ground object based on the different support degrees.
(5) Produce the classification images of the other ground objects by repeating steps (2) to (4).
(6) Choose a proper threshold to produce a binary image for each ground object class. Pixels whose support degree is less than the threshold are assigned 0, and the others are assigned 1.
(7) Overlay the binary images of the ground object classes.
(8) Assess the classification accuracy of the overlaid image. If the accuracy is lower than required, return to step (6); otherwise, the classification is finished.
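Steps (6) to (8) can be sketched with NumPy as follows. This is not the ENVI workflow itself: the 2 x 2 support degree images are made-up values, the class codes 1 to 3 are chosen for the example, and the rule that the first matching layer wins where binary images overlap is an assumption of this sketch.

```python
import numpy as np

# Illustrative sketch of steps (6)-(8); all support values are hypothetical.

def binarize(support, threshold):
    """Step (6): pixels with support below the threshold become 0, the others 1."""
    return (np.asarray(support, dtype=float) >= threshold).astype(np.uint8)

def overlay(binary_layers):
    """Step (7): merge binary layers into one label image; 0 means unclassified."""
    first = next(iter(binary_layers.values()))
    result = np.zeros(first.shape, dtype=np.uint8)
    for code, layer in binary_layers.items():
        result[(layer == 1) & (result == 0)] = code  # first matching layer wins
    return result

support = {
    1: np.array([[0.8, 0.0], [0.0, 0.45]]),  # water
    2: np.array([[0.0, 0.6], [0.0, 0.0]]),   # urban land
    3: np.array([[0.0, 0.0], [0.2, 0.0]]),   # green land
}

thresholds = {1: 0.5, 2: 0.4, 3: 0.3}
labels = overlay({c: binarize(s, thresholds[c]) for c, s in support.items()})

# Step (8): if the accuracy is too low, adjust a threshold and repeat from step (6).
thresholds[3] = 0.2
relabeled = overlay({c: binarize(s, thresholds[c]) for c, s in support.items()})
print(labels)
print(relabeled)
```

Pixels left at 0 after the overlay are the ones whose support degree stayed below every threshold; lowering a threshold in the second pass assigns some of them to a class, which is the reclassification loop of step (8).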
4. APPLICATION EXPERIMENT
The experiment uses Landsat 5 TM images of the Yantai Economic and Technological Development Zone acquired in 2006. The D-S evidence theory decision tree method is used to classify the land cover of the zone into four classes (urban land, farmland, forest land and water), and the results support the classification method. The steps are as follows:
(1) Select TM spectral bands 5, 4 and 3, perform geometric correction, and subset the image to the Yantai Economic and Technological Development Zone.
(2) Choose common and representative data as training samples. Classification accuracy depends on the quality and quantity of the samples.
(3) Calculate the maximum, minimum and average values of the spectral bands of interest. These values are used to calculate the support degree of each ground object class.
(4) Construct a decision tree with different support degrees for each of the three classes.
(5) Choose zero as the threshold to produce the binary images of the ground object classes, execute the decision tree algorithm to classify the data, and produce three ground object classification images (figure 1(a)-(c)).
(6) Overlay the binary images of the ground object classes to produce the classification image of the three ground objects (figure 2(a)).
Figure 1. Classification images of support degree, (a) water, (b) urban and (c) green land.
Figure 2. The result of image classification based on evidence theory, (a) the first classification and (b) the second
classification.
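Steps (3) to (5) of the experiment can be approximated by the following NumPy sketch, which computes band statistics from training samples and then a support degree image for each class. The sample values are hypothetical, and the per-class support degree is taken as $1 - \min_i d_i / \min_{i \neq k} d_i$, where $d_i$ is the distance from a pixel to the mean of class $i$; this singleton form is an assumption of the sketch, not the ENVI implementation.

```python
import numpy as np

# Illustrative sketch of steps (3)-(5); all digital numbers are hypothetical.

def band_statistics(samples):
    """Minimum, maximum and mean per band for samples shaped (n_samples, n_bands)."""
    s = np.asarray(samples, dtype=float)
    return {"min": s.min(axis=0), "max": s.max(axis=0), "mean": s.mean(axis=0)}

def support_maps(image, means):
    """Support degree of every class at every pixel.

    image: (rows, cols, bands); means: (classes, bands), at least two classes.
    Returns an array shaped (rows, cols, classes).
    """
    image = np.asarray(image, dtype=float)
    means = np.asarray(means, dtype=float)
    diff = image[:, :, None, :] - means[None, None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))   # distance to each class mean
    nearest = dist.min(axis=-1)                # smallest distance over all classes
    support = np.zeros_like(dist)
    for k in range(means.shape[0]):
        others = np.delete(dist, k, axis=-1).min(axis=-1)
        positive = others > 0                  # avoid division by zero
        support[..., k][positive] = 1.0 - nearest[positive] / others[positive]
    return support

water_samples = [[18, 12, 20], [22, 14, 24], [20, 16, 22]]
urban_samples = [[78, 58, 68], [82, 62, 72]]
stats = band_statistics(water_samples)
means = [stats["mean"], band_statistics(urban_samples)["mean"]]

image = [[[20, 14, 22], [80, 60, 70], [20, 14, 26]]]  # one row, three pixels
support = support_maps(image, means)
print(support)
```

Each slice `support[..., k]` plays the role of one support degree image in figure 1; it is zero wherever class k is not the nearest class, which is consistent with using zero as the first binarization threshold.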
5. APPRAISAL OF CLASSIFICATION ACCURACY
Compared with Bayesian classification, the main advantages of this method are that it supports reclassification and can reach a very high accuracy. Figure 1(a)-(c) shows the three classification images produced by the D-S evidence theory decision tree method. Comparison with the original remote sensing image shows that the accuracy for water (figure 1(a)) is high, while the accuracy of the other two classification results is low. The three classification results are then binarized: pixels whose support degree is zero are assigned to one class, and all other pixels to the other. The three classes, water, urban land and green land, are numbered 1, 2 and 3. The three binary images are then overlaid into one result image (figure 2(a)). We randomly chose 320 points from figure 2(a) and compared them with the reference and original images to obtain the classification error matrix and the accuracy assessment report (table 1).
Table 1. Classification error matrix and the accuracy assessment report.
Class        Water   Urban   Green land   Number of samples   Classification accuracy
Water        69      1       5            75                  0.9200
Urban        1       71      23           95                  0.7474
Green land   3       46      101          150                 0.6733
Total number of samples: 320; correctly classified samples: 241; overall classification accuracy: 0.7531.
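The accuracy figures in table 1 follow directly from the error matrix, as the short Python check below shows. The counts are copied from the table; only the variable names are introduced here.

```python
# Accuracy figures recomputed from the error matrix in table 1.
# Each row holds the counts for the columns Water, Urban, Green land.
matrix = {
    "Water": [69, 1, 5],
    "Urban": [1, 71, 23],
    "Green land": [3, 46, 101],
}

# Per-class accuracy: diagonal entry divided by the number of samples in the row.
per_class = {
    name: row[i] / sum(row)
    for i, (name, row) in enumerate(matrix.items())
}

total = sum(sum(row) for row in matrix.values())
correct = sum(row[i] for i, row in enumerate(matrix.values()))
overall = correct / total

for name, accuracy in per_class.items():
    print(f"{name}: {accuracy:.4f}")
print(f"overall: {correct}/{total} = {overall:.4f}")
```

The off-diagonal entries also show where the errors concentrate: most misclassified samples are confusions between urban land and green land, which is why the thresholds of these classes are adjusted in the second classification.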
From table 1, the overall accuracy is 0.7531, which is similar to the result obtained after six runs of the maximum likelihood classification method (0.7312). Because the accuracy for green land is lower, the thresholds can be adjusted and the image reclassified until the desired accuracy is reached. Because the area of green land is large, we chose the following support degree thresholds for the three cover classes: urban land greater than 0.4, water greater than 0.5 and green land greater than 0.3, and overlaid the three images a second time. Figure 2(b) shows the result. We again randomly chose 320 points from figure 2(b) and compared them with the reference and original images. The overall classification accuracy reaches 0.9023, and the accuracies of the three classes are 0.9600, 0.8532 and 0.9233. The classification accuracy meets our requirements.
6. CONCLUSION
Remote sensing data carry uncertainty, which results from data acquisition, transmission, storage, handling, etc. D-S evidence theory is a powerful tool for expressing this uncertainty. The decision tree is a non-parametric, multilayer classification algorithm that is free of assumptions about the data distribution, which makes it robust and flexible for data analysis and interpretation. Its time complexity is low and its classification speed is fast. Combining D-S evidence theory with the decision tree algorithm, a D-S evidence theory decision tree method is proposed, in which the support degree function is the link between the two. The method classifies classes such as water, urban land and green land, taking their distinctive spectral feature parameters as input, and produces three classification images of support degree. A proper support degree threshold is then chosen for each classification image, and the image is binarized. The binary images are overlaid according to class to obtain the initial result, whose accuracy is then assessed. If the initial accuracy does not meet the requirement, reclassification is carried out by re-choosing the support degree threshold for the image of each ground object class, until the final classification meets the accuracy requirement. Compared with Bayesian classification, the main advantages of this method are that it supports reclassification and can reach a very high accuracy. The method was successfully used to classify the land cover of the Yantai Economic and Technological Development Zone into three classes: water, urban land and green land. The experiment supports the effectiveness of the classification method and yields an accurate classification result.
REFERENCES
[1] McClean S, Scotney B, Shapcott M, "Aggregation of imprecise and uncertain information in databases", IEEE Transactions on Knowledge and Data Engineering 13(6), 902-912 (2001).
[2] Blaschke T, "Object based image analysis for remote sensing", ISPRS Journal of Photogrammetry and Remote Sensing 65, 2-16 (2010).
[3] Shafer G, [A Mathematical Theory of Evidence], Princeton University Press, Princeton (1976).
[4] Duan X S, [Evidence Theory and Decision and Artificial Intelligence], China Renmin University Press, Beijing (1993).
[5] Yager R R, Kacprzyk J, Fedrizzi M, [Advances in the Dempster-Shafer Theory of Evidence], John Wiley and Sons, New York (1994).
[6] Friedl M A, Brodley C E, Strahler A H, "Maximizing Land Cover Classification Accuracies Produced by Decision Trees at Continental to Global Scales", IEEE Transactions on Geoscience and Remote Sensing 37(2), 969-977 (1999).
[7] Safavian S R, Landgrebe D, "A Survey of Decision Tree Classifier Methodology", IEEE Trans. Syst. Man Cybern. 21, 660-674 (1991).
[8] McIver D K, Friedl M A, "Estimating Pixel-scale Land Cover Classification Confidence Using Non-parametric Machine Learning Methods", IEEE Transactions on Geoscience and Remote Sensing 39, 1959-1968 (2001).
[9] Mertikas P, Zervakis M E, "Exemplifying the theory of evidence in remote sensing image classification", International Journal of Remote Sensing 22(6), 1081-1095 (2001).
[10] Bloch I, "Some aspects of Dempster-Shafer evidence theory for classification of multi-modality medical images taking partial volume effect into account", Pattern Recognition Letters 17, 905-919 (1996).
[11] McIver D K, Friedl M A, "Using Prior Probabilities in Decision-tree Classification of Remotely Sensed Data", Remote Sensing of Environment 81, 253-261 (2002).
[12] Friedl M A, Brodley C E, "Decision Tree Classification of Land Cover from Remotely Sensed Data", Remote Sensing of Environment 61, 399-409 (1997).
[13] Niccolai A M, Hohl A, Niccolai M, Dearing Oliver, "Decision rule-based approach to automatic tree crown detection and size classification", International Journal of Remote Sensing 31(12), 3089-3123 (2010).
[14] Yang C-C, Prasher S O, Enright P, Madramootoo C, Burgess M, Goel P K, Callum I, "Application of decision tree technology for image classification using remote sensing data", Agricultural Systems 76, 1101-1117 (2003).
[15] Yuan H, Zhang R, Li X, "Extracting Wetland Using Decision Tree Classification", Proc. WSEAS 8, 240-245.
