Classification by using Spectral and Statistical Texture Data Analysis
Salman Qadri*[a-b], Muzammil-ul-Rehman[a], Mutiullah[a], Muhammad
Amjad Iqbal[b], Muhammad Nazir[b]
[a] Faculty of Information Technology, The University of Central Punjab Lahore, Punjab 54000, Pakistan
[b] Department of Computer Science & IT, Faculty of Management Sciences, The Islamia University of
Bahawalpur, Punjab 63100, Pakistan.
*Author for correspondence; e-mail address: salman.qadri@iub.edu.pk
ABSTRACT
The main objective of this research was to evaluate the importance of a
machine vision approach for the classification of five types of land
cover (LC): fertile, green pasture, desert rangeland, bare and Sutlej
river land. A novel spectra-statistical framework was designed to
classify the subject land cover types accurately. These five types of
land cover are strongly correlated with each other. On the basis of
human perception, three of them (desert rangeland, Sutlej river land and
bare land) have almost similar physical features, and the remaining two,
fertile cultivated land (cropland) and green pasture (grass), are similar
to each other. Remote sensing data of these five land types were
acquired with a handheld CROPSCAN MSR5 device in the form of five
spectral bands (blue, green, red, near infrared and shortwave infrared),
while statistical texture data were acquired with a digital camera by
transforming the acquired images into 229 statistical texture features
for each image. Out of these, the 30 most discriminant features were
obtained by integrating three statistical feature selection techniques:
Fisher co-efficient (F), Probability Of Error plus Average Correlation
Co-efficient (POE+ACC) and Mutual Information Co-efficient (MI). No such
feature selection procedure was required for the spectral data, because
each scene was completely defined by the five spectral bands mentioned
above. The clustering capability of the selected statistical texture
data was verified by Non-Linear Discriminant Analysis (NDA), and a Linear
Discriminant Analysis (LDA) approach was applied to the spectral
features. For classification, these statistical and spectral features
were deployed to an artificial neural network (ANN). By implementing the
cross validation method (80-20), we received an accuracy of 91.3221% for
statistical texture data and 96.40% for spectral data.
Keywords: Textural features, Remote Sensing, Artificial Neural
Network, Land Cover, Mazda Software Version 4.6.
1. INTRODUCTION
Image processing and remote sensing can play a vital role in the
betterment of the agriculture field [1]. By using this technology, we
can classify vast land cover areas into different categories [2]. This
would be helpful not only for the socioeconomic sector but would also
fulfill the needs of the future. In the twenty-first century, the world
is facing the challenge of hunger, food and poverty [3]. This issue can
be resolved by increasing crop production and by better utilization of
cultivated land. Land cover information is necessary for different
policy making, planning and management purposes; for example, land
records of forest, desert, farmland and wetland, as well as other
biophysical resources, are required for land cover information.
Researchers are trying to get the benefits of technology by involving it
in the agriculture field [4]. Attempts are being made to enhance the
cultivated land area and to monitor the land through intensive manual
surveys [5]. For the success of such surveys, a heavy economic and labor
investment is required. In developing countries like Pakistan, it seems
very difficult to spend huge amounts on such projects. Directly or
indirectly, almost 50% of the population of these countries is
associated with the agriculture profession [6]. All the issues discussed
above highlight the importance of proper classification, management,
better utilization, crop growth and production of the land.
According
to
2.
The above mentioned five different types of land cover (LC) plots had an
area of 43560 square feet (1 acre) for each type. Digital photographs of
bare land, desert rangeland, fertile cultivated land, green pasture and
Sutlej river land were acquired
Sr. No   Land Type       Time      Light Intensity
1.       Fertile         1.00 pm   34300 lux
2.       Bare            2.00 pm   34000 lux
3.       Desert          1.30 pm   34500 lux
4.       Green Pasture   1.30 pm   35000 lux
5.       Sutlej River    1.00 pm   34300 lux
geology, zoology, agriculture, forestry, botany, meteorology,
oceanography and civil engineering [13].
2.4 MULTISPECTRAL RADIOMETER (MSR5)
A Multispectral Radiometer (MSR5) manufactured by CROPSCAN Inc. (USA) was
used for data collection. The MSR5 provides data similar to the
LANDSAT 5 TM satellite sensor. It covers five different sections of the
spectrum: blue (450 to 520 nm), green (520 to 600 nm), red (630 to
690 nm), near infrared (760 to 900 nm) and shortwave infrared (1550 to
1750 nm). The CROPSCAN MSR5 has already been used for the assessment and
measurement of crop weed effects [14] and for vegetation cover
estimation and disease estimation [15-16]. For remote sensing data, we
acquired 50 MSR scans of each plot at a height of 5 feet above the land
cover surface. Each MSR scan contains five wavebands: three visible
(blue, green, red) and two invisible (near infrared and shortwave
infrared). The five different types of land cover [17] therefore contain
a total of 250 spectral data instances.
2.5 PREPROCESSING
Each image contained a large irrelevant area, so the relevant portion of
each image was extracted prior to further processing. The extracted
relevant portions of the images were converted to gray scale images (8
2.7 TEXTURE FEATURES
Statistical texture features are categorized into first order features,
which relate to the intensity of individual pixels, and second order
features, which relate to the co-occurrence of neighboring pixels. First
order statistical parameters are based directly on the histogram of an
image, while second order parameters are derived from the Gray Level
Co-occurrence Matrix (GLCM). In this work, a total of 229 statistical
texture features were calculated for each region of interest (ROI) by
using Mazda software version 4.6. The calculated parameters are grouped
as 9 first order statistical parameters and 11 second order (Haralick)
statistical parameters derived from the GLCM in all four directions
(0°, 45°, 90° and 135°) at distances of up to 5 pixels, giving 220
(11 × 4 × 5) second order features, Haralick et al. [19]. This means
that each ROI was defined by 229 textural features, and statistically
the data were presented in a 68700 (300 × 229) dimensional feature
vector space.
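As a minimal sketch of how second order features arise from a GLCM, the following pure Python example builds a symmetric, normalized co-occurrence matrix for one pixel offset and derives two Haralick-style parameters from it. It is not the Mazda implementation; the function names, the toy 4 × 4 image and the choice of a symmetric matrix are assumptions made for illustration only.

```python
def glcm(image, dx, dy, levels):
    """Return a symmetric, normalized gray level co-occurrence matrix.

    image  : 2-D list of integer gray levels in range(levels)
    dx, dy : offset (columns, rows) between the two pixels of a pair
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                a, b = image[r][c], image[r2][c2]
                counts[a][b] += 1  # each pair is counted in both orders,
                counts[b][a] += 1  # which makes the matrix symmetric
    total = sum(sum(row) for row in counts)
    return [[v / total for v in row] for row in counts]


def contrast(p):
    """Sum of p(i, j) * (i - j)^2 over all gray level pairs."""
    n = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))


def angular_second_moment(p):
    """Sum of squared GLCM entries (high for uniform textures)."""
    return sum(v * v for row in p for v in row)


# Toy image with 4 gray levels (illustrative data, not from the study).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]

# Horizontal direction (0 degrees) at a distance of 1 pixel.
p = glcm(image, dx=1, dy=0, levels=4)
features = {
    "contrast": contrast(p),
    "angular_second_moment": angular_second_moment(p),
}
print(features)
```

Repeating the same computation for the four directions and five distances, with eleven parameters per matrix, would give the 11 × 4 × 5 = 220 second order features described above.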
It is worth mentioning that not all of the 229 calculated features were
equally important for land cover classification. Furthermore,
statistically, a huge amount of data is required to obtain reliable
discrimination and classification results on the basis of such a large
number of features, and this was generally not available. So, it was
necessary to reduce the dimensionality of the feature vector space by
selecting the most discriminant features, which had the ability to
discriminate and
2.8 FEATURES SELECTION
Selection of the most suitable features for the classification was a
challenging task. We used three supervised feature selection methods:
Fisher Co-efficient (F), Probability Of Error plus Average Correlation
Co-efficient (POE+ACC) and Mutual Information Co-efficient (MI). These
methods were merged together (F+PA+MI) to get the most discriminant
features. The Fisher Co-efficient (F) [20] is described mathematically
as:
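$$F = \frac{D}{V} = \frac{\dfrac{1}{1-\sum_{a=1}^{M} P_a^{2}} \sum_{a=1}^{M}\sum_{j=1}^{M} P_a P_j \,(L_a - L_j)^{2}}{\sum_{a=1}^{M} P_a M_a^{2}} \qquad (1)$$
Where $D$ is the between-class variance, $V$ is the within-class
variance, $P_a$ is the probability of class $a$, and $L_a$ and $M_a^{2}$
are the mean and the variance of the feature in the given class $a$.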
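Probability of Error plus Average Correlation Co-efficient (POE+ACC)
[21] is defined as:
$$POE(f_j) = \frac{\text{misclassified samples}}{\text{total samples}} \qquad (2)$$
The first feature selected is the one with the minimum $POE$; the second
and the subsequent features are selected as:
$$f_2 = f_j : \min_j \left[ POE(f_j) + \left|\mathrm{Correlation}(f_1, f_j)\right| \right] \qquad (3)$$
$$f_k = f_j : \min_j \left[ POE(f_j) + \frac{1}{k-1} \sum_{a=1}^{k-1} \left|\mathrm{Correlation}(f_a, f_j)\right| \right] \qquad (4)$$
The Mutual Information Co-efficient (MI) between a feature $F$ and the
class variable $C$ is defined as:
$$I(F, C) = \sum_{f}\sum_{c} P(f, c)\, \log_2 \frac{P(f, c)}{P(f)\,P(c)} \qquad (5)$$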
order according to their significance. In this way, a total of 30
features (10 features by each mentioned approach) were selected. As the
combined set of features gave better classification results [23], these
30 features were obtained for further procedures.
PA
MI
1   S(0,3) Correlation
2   S(0,4) Correlation
3   S(0,3) Contrast
4   S(0,4) Contrast
5   S(0,5) Correlation
6   S(0,5) Contrast
7   S(2,2) Correlation
8   S(0,3) Sum Variance
9   S(0,1) Inv Diff Mom
10  S(0,4) Sum Variance
Perc.01%
S(1,1) Sum Variance
S(0,1) Ang. Sec Mom
Skewness
S(0,2) Sum Variance
S(5,5) Entropy
S(5,-5) Inv. Diff. Mom
S(1,0) Sum. Average
S(1,0) Correlation
S(3,3) Entropy
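The Fisher criterion used for ranking features can be illustrated with a small pure Python sketch. It scores a single feature as the ratio of between-class to within-class variance, with class probabilities estimated from class frequencies and population variances (dividing by the class size). The function name and the toy data are hypothetical, and this is not the Mazda implementation.

```python
def fisher_coefficient(values, labels):
    """Ratio of between-class to within-class variance for one feature."""
    classes = sorted(set(labels))
    n = len(values)
    prob, mean, var = {}, {}, {}
    for a in classes:
        xs = [v for v, lab in zip(values, labels) if lab == a]
        prob[a] = len(xs) / n                      # class probability
        mean[a] = sum(xs) / len(xs)                # class mean
        var[a] = sum((x - mean[a]) ** 2 for x in xs) / len(xs)  # class variance
    between = sum(
        prob[a] * prob[j] * (mean[a] - mean[j]) ** 2
        for a in classes
        for j in classes
    ) / (1.0 - sum(p * p for p in prob.values()))
    within = sum(prob[a] * var[a] for a in classes)
    return between / within  # undefined when the within-class variance is 0


labels = [0, 0, 1, 1]
well_separated = [1.0, 3.0, 5.0, 7.0]  # class means 2 and 6
overlapping = [1.0, 7.0, 3.0, 5.0]     # both class means equal 4

f_good = fisher_coefficient(well_separated, labels)
f_poor = fisher_coefficient(overlapping, labels)
print(f_good, f_poor)
```

Features would then be sorted by this score and the highest-scoring ones retained; a feature whose classes overlap completely scores zero.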
2.9 FEATURES REDUCTION
Prior to classification, the feature data were standardized to reduce
the effect of unwanted variation within the data due to outliers and
other artifacts by applying the following mathematical relation:
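$$K'_i = \frac{K_i - \bar{K}_i}{\sigma_i} \qquad (6)$$
Where $K'_i$ is the standardized value of the $i$-th feature and
$i = 1, 2, \ldots, n$; $\bar{K}_i$ is the mean and $\sigma_i$ is the
standard deviation of the $i$-th feature.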
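The standardization step can be sketched in a few lines of Python. The example below rescales one feature column to zero mean and unit standard deviation; the use of the population standard deviation and the sample values are assumptions for illustration.

```python
from statistics import mean, pstdev


def standardize(column):
    """Subtract the column mean and divide by its standard deviation."""
    m = mean(column)
    s = pstdev(column)  # population standard deviation (an assumption)
    return [(value - m) / s for value in column]


feature_column = [2.0, 4.0, 6.0]  # illustrative values, not study data
z = standardize(feature_column)
print(z)
```

After this step every feature has the same scale, so no single feature dominates the later discriminant analysis because of its units.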
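The above mentioned approaches of feature selection
$$h_j = \varphi\left[ W_{j0} + \sum_{i=1}^{N_x} W_{ji} X_i \right], \qquad \varphi(\cdot) = \frac{1}{1+\exp(-\cdot)} \qquad (7)$$
While $j = 1, 2, 3, \ldots, N_h$.
$$g_k = V_{k0} + \sum_{j=1}^{N_h} V_{kj} h_j \qquad (8)$$
Here $k = 1, 2, 3, \ldots, N_g$.
$$Y_n = U_{n0} + \sum_{k=1}^{N_g} U_{nk} g_k \qquad (9)$$
Now here $n = 1, 2, 3, \ldots, N_y$.
Supervised learning methods were based on input patterns and the correct
classes they belong to, $\{x_i, d_i\}$, where $i = 1, 2, 3, \ldots, M$.
For this purpose, the following error function
$$E = \frac{1}{2} \sum_{i=1}^{M} \sum_{n=1}^{N_y} \left( d_{in} - Y_n(X_i; U, V, W) \right)^{2} \qquad (10)$$
While for MSR5 datasets, linear discriminant analysis (LDA) gave the best
results for data clustering and analysis. Let $x_i^{(k)}$ denote the
$i$-th pattern in class $k$, where $i = 1, 2, 3, \ldots, M_k$ and
$k = 1, 2, 3, \ldots, N_c$. Define $C_W$ as:
$$C_W = \frac{1}{M} \sum_{k=1}^{N_c} \sum_{i=1}^{M_k} \left( X_i^{k} - U_k \right)\left( X_i^{k} - U_k \right)^{t} \qquad (12)$$
Where $U_k$ is the mean of class $k$. Similarly, define $C_B$ as:
$$C_B = \frac{1}{M} \sum_{k=1}^{N_c} M_k \left( U_k - U \right)\left( U_k - U \right)^{t} \qquad (13)$$
Where $U$ is the mean of all patterns. The total scatter matrix is:
$$C_t = \frac{1}{M} \sum_{k=1}^{N_c} \sum_{i=1}^{M_k} \left( X_i^{k} - U \right)\left( X_i^{k} - U \right)^{t} \qquad (14)$$
These scatter matrices determine the LDA transform matrix.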
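A small numerical sketch can make the within-class, between-class and total scatter matrices concrete. The pure Python example below computes all three for two toy classes of 2-D patterns and checks that the total scatter equals the sum of the other two. The function names and the data are hypothetical; real MSR5 patterns would have five components, one per band.

```python
def mean_vector(points):
    """Component-wise mean of a list of equal-length vectors."""
    dim = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dim)]


def add_outer(target, u, weight):
    """In place: target += weight * (u u^t)."""
    for r in range(len(u)):
        for c in range(len(u)):
            target[r][c] += weight * u[r] * u[c]


def scatter_matrices(classes):
    """Return (C_W, C_B, C_t) for a list of classes, each a list of vectors."""
    all_points = [x for pts in classes for x in pts]
    m_total = len(all_points)
    dim = len(all_points[0])
    u = mean_vector(all_points)  # mean of all patterns
    c_w = [[0.0] * dim for _ in range(dim)]
    c_b = [[0.0] * dim for _ in range(dim)]
    c_t = [[0.0] * dim for _ in range(dim)]
    for pts in classes:
        u_k = mean_vector(pts)  # mean of this class
        for x in pts:
            add_outer(c_w, [xi - mi for xi, mi in zip(x, u_k)], 1.0 / m_total)
            add_outer(c_t, [xi - mi for xi, mi in zip(x, u)], 1.0 / m_total)
        add_outer(c_b, [a - b for a, b in zip(u_k, u)], len(pts) / m_total)
    return c_w, c_b, c_t


# Two toy classes of 2-D patterns (illustrative data only).
classes = [
    [(1.0, 2.0), (3.0, 4.0)],
    [(7.0, 8.0), (9.0, 6.0)],
]
c_w, c_b, c_t = scatter_matrices(classes)
print(c_w, c_b, c_t)
```

LDA seeks a projection in which the between-class scatter is large relative to the within-class scatter, which is why well separated classes cluster cleanly in the projected space.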
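2.10 CLASSIFICATION
$$Y_k = \varphi\left[ V_{k0} + \sum_{j=1}^{N_h} V_{kj} h_j \right] \qquad (15)$$
Where $k = 1, 2, 3, \ldots, N_y$, $\varphi$ denotes the activation
function and $h_j$ is given as:
$$h_j = \varphi\left[ W_{j0} + \sum_{i=1}^{N_x} W_{ji} X_i \right] \qquad (16)$$
with $j = 1, 2, 3, \ldots, N_h$.
For training and testing purposes, the weight coefficients are adjusted
and it is observed how close the actual output value $Y$ is to the
desired output $d$. Supervised training techniques minimize the
following error function:
$$E = \frac{1}{2} \sum_{i=1}^{M} \sum_{n=1}^{N_y} \left( d_{in} - Y_n(X_i; V, W) \right)^{2} \qquad (17)$$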
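The forward pass and the squared error of a network with one hidden layer can be sketched as follows. This is a minimal illustration that assumes a sigmoid activation in both layers; the layer sizes (5 inputs, 2 hidden units, 5 outputs), the zero weights and the single training pattern are hypothetical choices, not the trained network of this study.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w, v):
    """Network outputs for input x.

    w[j] = [W_j0, W_j1, ...]  bias and input weights of hidden unit j
    v[k] = [V_k0, V_k1, ...]  bias and hidden weights of output unit k
    """
    h = [sigmoid(wj[0] + sum(a * b for a, b in zip(wj[1:], x))) for wj in w]
    return [sigmoid(vk[0] + sum(a * b for a, b in zip(vk[1:], h))) for vk in v]


def squared_error(patterns, w, v):
    """Half the summed squared difference between desired and actual outputs."""
    return 0.5 * sum(
        (d_n - y_n) ** 2
        for x, d in patterns
        for d_n, y_n in zip(d, forward(x, w, v))
    )


n_inputs, n_hidden, n_outputs = 5, 2, 5
w = [[0.0] * (n_inputs + 1) for _ in range(n_hidden)]   # untrained weights
v = [[0.0] * (n_hidden + 1) for _ in range(n_outputs)]

# One pattern: five band readings and a one-hot target for the first class.
patterns = [([0.2, 0.4, 0.1, 0.7, 0.5], [1, 0, 0, 0, 0])]

outputs = forward(patterns[0][0], w, v)
error = squared_error(patterns, w, v)
print(outputs, error)
```

With all weights at zero every output is 0.5, so the error is 0.5 × 5 × 0.25 = 0.625; training adjusts the weights to drive this value down.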
Input Layers=5   Learning Rate Eta=0.25   2nd Hidden Layer=2   Optimized Iteration Limit=70
Input Layers=5   Learning Rate Eta=0.20   2nd Hidden Layer=2   Optimized Iteration Limit=70

3.1 PHOTOGRAPHIC DATA
For the photographic dataset, data clustering and land cover
classification were first verified on the basis of features selected
individually by the F, POE+ACC and MI approaches on ROIs of (64 × 64),
(128 × 128), (256 × 256) and (512 × 512); we received 80%,
features of each selection method) was received by combining these three
approaches on ROI (512 × 512). These 30 features were deployed to RDA,
PCA, LDA and NDA for feature reduction by using the k-fold (80-20) cross
validation method. It was observed that Nonlinear Discriminant Analysis
(NDA) gave the best result, 100%, as compared to the other three feature
reduction approaches. The results are summarized in Table 5.
Statistical Data Analysis K-(80-20)   RDA      PCA       LDA       NDA
1-Fold                                92.5%    92.50%    97.50%    100%
2-Fold                                88.75%   87.92%    96.25%    100%
3-Fold                                90%      89.17%    98.75%    100%
4-Fold                                88.75%   87.50%    96.67%    100%
5-Fold                                90.42%   90.42%    99.17%    100%
Average Accuracy                      90.08%   89.502%   97.668%   100%

The statistical data comparison analysis of RDA, PCA, LDA and NDA is
presented in Figure 5. From this figure, it is clear that NDA leads with
the best classification result of 100% accuracy as compared to the
remaining three approaches RDA, PCA and LDA. Figure 5 represents the
photographic data clustering for five input classes in NDA projection
space
[Figure: bar chart of statistical data analysis accuracy for RDA, PCA, LDA and NDA; accuracy axis from 80.00% to 105.00%]
Statistical Data Iteration (80-20)   Training Data   Train Accuracy %   Test Data   Misclassified Data   Classification %
1-Fold                               240             100%               60          5/60                 91.67%
2-Fold                               240             100%               60          6/60                 90%
3-Fold                               240             100%               60          6/60                 90%
4-Fold                               240             100%               60          3/60                 95%
5-Fold                               240             100%               60          6/60                 90%
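The 80-20 scheme in the table above corresponds to a 5-fold split in which each fold holds out one fifth of the samples for testing. The sketch below shows one simple way to generate such index splits for 300 samples. It uses contiguous folds for clarity, which is an assumption; in practice the samples would normally be shuffled or stratified by class first, and the function name is hypothetical.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of the k folds."""
    fold_size = n_samples // k
    indices = list(range(n_samples))
    for fold in range(k):
        start, stop = fold * fold_size, (fold + 1) * fold_size
        test = indices[start:stop]
        train = indices[:start] + indices[stop:]
        yield train, test


# 300 texture samples split into 5 folds of 240 training and 60 test samples.
folds = list(k_fold_indices(300, 5))
for number, (train, test) in enumerate(folds, start=1):
    print(f"{number}-Fold: train={len(train)} test={len(test)}")
```

Every sample is used for testing exactly once, so the average over the five folds reflects performance on the whole dataset.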
Type                Fertile Land   Green Pasture   Desert Rangeland   Bare Land   Sutlej River Land   Total
Fertile Land        51                                                                                60
Green Pasture                      59                                                                 60
Desert Rangeland                                   48                                                 60
Bare Land                                                             57                              60
Sutlej River Land                                                                 55                  60
SPECTRAL DATA
As already mentioned, each scene was completely described on the basis
of five spectral bands (blue, green, red, near infrared and shortwave
infrared) acquired by the MSR5. The whole dataset (250 scans) acquired by
the MSR5 was deployed to RDA, PCA, LDA and NDA to verify the validity of
data clustering for the classification. We received data clustering
accuracies of 98.7% for RDA, 98.4% for PCA, 99.5% for LDA and 99.4% for
NDA. It is clear that we received the best clustering accuracy with the
LDA approach, as shown in Figure 8. For training and testing the ANN
classifier, the same K-fold (80-20) cross validation method was also
used for spectral data analysis. A
Spectral Data Analysis (80-20)   RDA     PCA     LDA     NDA
1-Fold                           99%     97.5%   99.5%   100%
2-Fold                           99%     99%     100%    99%
3-Fold                           98.5%   98.5%   100%    100%
4-Fold                           98.5%   98.5%   99%     99%
5-Fold                           98.5%   98.5%   99%     99%
Average Accuracy                 98.7%   98.4%   99.5%   99.4%
[Figure: Spectral data analysis results for RDA, PCA, LDA and NDA; accuracy axis from 97.80% to 99.60%]
The Linear Discriminant Analysis (LDA) graph shows the data properly
clustered into its five appropriate classes as compared to the other
reduction techniques employed.
We received an average accuracy of 100% when the classifier was trained
under the architecture settings already discussed in Table 4, and an
average classification accuracy of 96.40% was obtained when the
classifier was tested on MSR5 data. So, the five types of land cover
data were clustered properly by using linear discriminant analysis
(LDA). The MSR5 training data, test data, and properly classified and
misclassified data are presented in Table 9.
Spectral Data Iteration (80-20)   Training Data   Train Accuracy   Test Data   Misclassified Data   Classification %
1-Fold                            200             100%             50          6/50                 88%
2-Fold                            200             100%             50          2/50                 96%
3-Fold                            200             100%             50          0/50                 100%
4-Fold                            200             100%             50          1/50                 98%
5-Fold                            200             100%             50          0/50                 100%
Average Accuracy: (88+96+100+98+100)/5 = 96.40%
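The per-fold percentages and the average in the table above follow directly from the misclassified counts. A short Python check of that arithmetic, using the counts reported for the spectral data (variable names are illustrative):

```python
misclassified = [6, 2, 0, 1, 0]  # misclassified test scans per fold
test_size = 50                   # test scans per fold

# Accuracy of each fold in percent, then the mean over the five folds.
fold_accuracy = [100.0 * (test_size - m) / test_size for m in misclassified]
average_accuracy = sum(fold_accuracy) / len(fold_accuracy)

print(fold_accuracy)
print(round(average_accuracy, 2))
```

The average is the sum of the five fold accuracies divided by five, not the sum itself.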
Type   Fertile Land   Green Pasture   Desert Rangeland   Bare Land   Sutlej River Land   Total
Fertile Land
Green Pasture
Desert Rangeland
Bare Land
Sutlej River Land
47
1
50
50
50
48
50
0
0
0
0
2
1
48
1
0
48
50
50
statistical data. The reason behind this difference in classification
accuracy is that statistical analysis performs best on fine texture
[26]. In this research, the photographic data were taken at a height of
5 feet, so the areas under these photographs were not equally covered
and distributed; besides this, ROIs also play an important role in
classification.
theory, algorithms, practicalities, 1st Edn., Elsevier, 2004.
[13] Panigrahy S., Upadhyay G., Ray S. S. and Parihar J.S., Mapping of cropping system for the Indo-Gangetic plain using multi-date SPOT NDVI-VGT data, Journal of the Indian Society of Remote Sensing, 2010; 38(4): 627-632.
[14] Tsirogiannis I., Katsoulas N., Savvas D., Karras G. and Kittas C., Relationships between Reflectance and Water Status in a Greenhouse Rocket (Eruca sativa Mill.) Cultivation, European Journal of Horticultural Science, 2013; 78(6): 275-282.
[15] Chang J., Clay S.A., Clay D.E. and Dalsted K., Detecting weed-free and weed-infested areas of a soybean field using near-infrared spectral data, Weed Science, 2002; 52(4): 642-648.
[16] Vrindts E., Reyniers M., Darius P., Gilot M., Sadaoui Y., Frankinet M., Hanquet B. and Destain M.F., Analysis of soil and crop properties for precision agriculture for winter wheat, Biosystems Engineering, 2003; 85(2): 141-152.
[17] Szczypiński P.M., Strzelecki M., Materka A. and Klepaczko A., MaZda - A software package for image texture analysis, Computer Methods and Programs in Biomedicine, 2009; 94(1): 66-76.
[18] Gonzalez R. C., Woods R. E. and Eddins S. L., Digital image processing using MATLAB, Pearson Education India, 2004.
[19] Haralick R. M., Shanmugam K. and Dinstein I.H., Textural features for image classification, IEEE Transactions on Systems,