Pattern Recognition Letters
ELSEVIER
Abstract
In this paper, segmentation of colour images is treated as a problem of classification of colour pixels. A hierarchical modular neural network for classification of colour pixels is presented. The network combines different learning techniques, performs analysis in a rough-to-fine fashion, and achieves a high average classification speed together with a low classification error. Experimentally, we have shown that the network is capable of distinguishing among the nine colour classes that occur in an image. A correct classification rate of about 98% has been obtained even for two very similar black colours. © 1997 Elsevier Science B.V.
Keywords: Colour classification; Image segmentation; Modular neural networks
1. Introduction
Colour image processing and analysis is increasingly used in industry, medical applications, and other fields. Quality inspection, process control, material analysis, and medical image processing are a few examples. Therefore, research in colour perception and the development of efficient computational models for real-world problems is of crucial importance. One task that often arises in colour image processing is image segmentation. Colour image segmentation techniques can roughly be categorised into techniques for chromatically dividing an image space and techniques for clustering a feature space derived from an image. Region growing and region splitting and merging are the common approaches used by methods of the first group (Liu
* Corresponding author. E-mail: Antanas.Verikas@cbd.hh.se.
¹ Electronic Annexes available. See http://www.elsevier.nl/locate/patrec.
and Yang, 1994; Panjwani and Healey, 1995). Methods of the second group divide colour space into clusters (Uchiyama and Arbib, 1994; Tominaga, 1992).
The colour image segmentation method we discuss
here belongs to the latter category. We treat the colour
image segmentation problem as a problem of classification of colour pixels.
The most common goal in colour image segmentation is to partition a colour image into a set of uniform colour regions. However, the aim of this work is slightly different. The motivation for this work is the need to determine the colours of inks used to produce a multi-coloured picture created by printing dots of cyan (c), magenta (m), yellow (y) and black (k) primary colours upon each other through screens having differing raster angles. The answer must be given for any possible combination of cyan, magenta, yellow and black ink and for any area of the picture. One factor that influences the colour impression of the picture is the size and shape of the areas covered by the
Table 1
The mean values and standard deviations of the variables R, G and B for five overlapping classes of colours (R, G and B ∈ [0, 255])

Colour class    R mean  R st. dev.  G mean  G st. dev.  B mean  B st. dev.
m               119     5           41      5           53      7
my              112     6           37      5           33      6
cm              32      5           31      6           48      10
cmy             30      5           31      6           33      7
k               25      4           25      3           24      4
different inks. This information can be used to control the amount of ink transferred to the paper in each
of the four printing nips holding cyan, magenta, yellow and black. The measurement of the area covered
by ink of the different colours can be done automatically using an image analysis system, if the image
taken from the printed picture can be segmented into
regions according to the following two rules:
1. Pixels should be assigned to the same cluster
(colour class) if they correspond to areas of the picture
that were printed with the same inks.
2. Pixels corresponding to areas printed with different inks should be assigned to different clusters.
The task is solved by determining a colour class for every pixel of the image. In order to solve the task with acceptable classification accuracy and a high average speed, we propose the use of a hierarchical modular neural network. Note that classification speed is of primary interest in our application.
The rest of the paper is organised as follows. In the
next two sections we briefly describe the input data
and the colour space used. Architecture of the network
is presented in Section 4. Procedures for training the
network are given in Section 5. Section 6 summarises
the results of experimental investigations. Section 7
concludes the work.
2. The data
When mixing dots of cyan, magenta and yellow
colours eight combinations are possible for every pixel
in the picture. The combination emy produces the
black colour. However, in practice black ink is most often also printed. We assume the black ink to be opaque.
Therefore, we have to distinguish between 9 colour
classes, namely c, m, y, w (white paper), cy, cm, my,
emy (black resulting from overlay of cyan, magenta
and yellow) and k (black resulting from black ink).
Discrimination between some of the colour classes is a rather complicated matter, since they are highly overlapping.
Fig. 1. Architecture of the network (a colour pixel x is classified via the distances d(x, c_i), with weights from random optimisation and fuzzy post-processing producing the class label).
3. The colour space

Assuming that the R, G and B values have equal variances σ² and that every pair of them has the same correlation coefficient r, the covariance matrix of the RGB data is

  Σ = σ² [ 1  r  r
           r  1  r
           r  r  1 ],   (1)

where r is the correlation coefficient. The eigensolution of the covariance matrix gives the following eigenvectors (e_i) and the corresponding eigenvalues (λ_i):

  e1 = {1, 1, 1}^T;  e2 = {1, 0, -1}^T;  e3 = {1, -2, 1}^T;   (2)

  λ1 = σ²(1 + 2r);  λ2 = λ3 = σ²(1 - r).   (3)

The eigenvectors lead to the linear combinations
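As a quick numerical check (σ² and r below are illustrative values, not taken from the paper), the eigenstructure of this covariance matrix can be verified with NumPy:

```python
import numpy as np

# Illustrative values, not taken from the paper
sigma2, r = 1.0, 0.9

# Equal-variance, equal-correlation covariance matrix of Eq. (1)
C = sigma2 * np.array([[1.0, r, r],
                       [r, 1.0, r],
                       [r, r, 1.0]])

# Eigenvalues should be sigma^2(1 + 2r) and a double sigma^2(1 - r)
eigvals = np.sort(np.linalg.eigvalsh(C))
expected = np.sort([sigma2 * (1 + 2 * r),
                    sigma2 * (1 - r),
                    sigma2 * (1 - r)])
assert np.allclose(eigvals, expected)

# e1 = {1, 1, 1}^T is the eigenvector of the largest eigenvalue
e1 = np.array([1.0, 1.0, 1.0])
assert np.allclose(C @ e1, sigma2 * (1 + 2 * r) * e1)
```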
  I = R + G + B,   (4)

  J = R - B,   (5)

  K = R - 2G + B,   (6)
which have zero covariances under the above model and are almost uncorrelated for real image data. I, J and K are the variables of the "IJK" colour space.
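The transform of Eqs. (4)-(6) is straightforward to apply per pixel; a small sketch (the function name is ours):

```python
import numpy as np

def rgb_to_ijk(rgb):
    """Map RGB values to the 'IJK' colour space of Eqs. (4)-(6)."""
    rgb = np.asarray(rgb, dtype=float)
    R, G, B = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    I = R + G + B        # achromatic signal, Eq. (4)
    J = R - B            # colour-difference signal, Eq. (5)
    K = R - 2.0 * G + B  # colour-difference signal, Eq. (6)
    return np.stack([I, J, K], axis=-1)

print(rgb_to_ijk([255, 0, 0]))  # a pure-red pixel -> [255. 255. 255.]
```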
According to Hunt (1991), three signals representing colour are transmitted via nerve fibres from the human eye to the brain. One of these signals is usually referred to as an achromatic signal and the other two as colour-difference signals. In this sense the I, J and K variables mimic the signals transmitted to the human brain, since I can be regarded as an achromatic signal, while J and K are colour-difference signals.
Distances measured in the "IJK" as well as in the RGB colour space do not represent colour differences on a uniform scale from the point of view of perception. The CIELuv and CIELab colour spaces are more uniform in this sense. In spite of that, we have chosen the "IJK" colour space.
Fig. 2. A forward-only counterpropagation network (input layer x_1, x_2, ..., x_n; competitive layer with weights W; Grossberg layer with weights V and outputs y_1, y_2, ..., y_m).

4.1. Binary decision tree
The classification performed by the tree is very fast, since the tree consists of only a few nodes and only one neuron is used in every node of the tree.
4.2. Counterpropagation network
The binary decision tree performs the first classification step. Two types of terminal nodes can be encountered in the tree: (1) the node representing one
colour class only, and (2) the node representing a
cluster (a set) of ambiguous colour classes. The classification performed by the tree is final for the pixels arriving at terminal nodes of the first type. Pixels
reaching terminal nodes of the second type are transferred to the CP network for further analysis. The tree
divides colour space into several colour regions. Each ambiguous colour class j is represented by a set of weight vectors

  C_j = {c_i^j, i = 1, ..., N_j},  j = 1, ..., MAC,   (7)

and the whole set of weight vectors of the counterpropagation network is

  {c} = ∪_{j=1}^{MAC} C_j,   (8)

  N = Σ_{j=1}^{MAC} N_j,   (9)

where MAC is the number of ambiguous classes in the set and N is the total number of weight vectors.
As far as highly overlapping colour classes are considered, most of the weight vectors will be located in
177
the overlapping regions of the class-conditional distributions. However, some of the vectors will be also
placed in the non-overlapping "tails" of the distributions. Therefore, the decisions made by using the different weight vectors are not of the same reliability. We
say that the decision is made (when classifying pixel x) by using the weight vector c_i^j (i = 1, 2, ..., N_j; j = 1, ..., MAC) if the minimum distance d(x, c_i^j) has been obtained by using the weight vector c_i^j. Some of
the decisions made by using weight vectors from the
overlapping regions can be rather doubtful. Therefore,
a correction of the decisions (the post-processing)
takes place after the pixels have been classified. The
concept of the correction is as follows.
The decision classes (the colour classes) and the weight vectors c_i^j representing the regions of the colour space are considered as fuzzy sets. Membership values for the fuzzy sets and the fuzziness of the decisions made by the weight vectors are defined. Classification of an image by the counterpropagation network results in the classified image as well as in a number MAC (the number of ambiguous classes in the set) of supplementary images. Every pixel x in the supplementary image j is represented by the value of the membership function A_j(x) of the jth ambiguous class. Post-processing is based on information about the membership values and the fuzziness of the decisions. More details about the post-processing can be found in (Verikas and Malmqvist, 1995).
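The exact post-processing rules are given in the cited paper; purely as an illustration of the idea (the threshold, window size and voting rule below are our assumptions, not the published procedure), doubtful decisions can be corrected from their spatial neighbourhood:

```python
import numpy as np

def correct_doubtful(labels, membership, threshold=0.5):
    """Relabel pixels whose membership in the winning class is low,
    using a 3x3 majority vote.  This is only a sketch of the idea;
    see Verikas and Malmqvist (1995) for the actual rules."""
    out = labels.copy()
    h, w = labels.shape
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            if membership[i, j] < threshold:       # doubtful decision
                patch = labels[i - 1:i + 2, j - 1:j + 2].ravel()
                vals, counts = np.unique(patch, return_counts=True)
                out[i, j] = vals[np.argmax(counts)]
    return out

labels = np.ones((5, 5), dtype=int)
labels[2, 2] = 2                       # isolated, doubtful pixel
membership = np.ones((5, 5))
membership[2, 2] = 0.1
print(correct_doubtful(labels, membership)[2, 2])  # 1
```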
5. Training the network

5.1. Binary decision tree

Every node of the tree contains a single neuron that splits the set of pixels X arriving at the node into two subsets according to the rule

  x ∈ X+ if g(x) ≥ 0;  x ∈ X− if g(x) < 0,  ∀x ∈ X,   (11)

where g(x) is given by

  g(x) = w0 + Σ_{i1,...,iL} w_{i1...iL} x_{i1} ··· x_{iL}.   (12)
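The node decision rule (11)-(12) can be sketched as follows; the weight values here are illustrative, not trained ones:

```python
import numpy as np

def g(x, w0, weights):
    """Polynomial discriminant of Eq. (12).  'weights' maps an index
    tuple (i1, ..., iL) to the weight w_{i1...iL}."""
    return w0 + sum(w * np.prod([x[i] for i in idx])
                    for idx, w in weights.items())

def split(x, w0, weights):
    """Decision rule of Eq. (11): X+ if g(x) >= 0, X- otherwise."""
    return 'X+' if g(x, w0, weights) >= 0 else 'X-'

# second-order discriminant on a 3-dimensional pixel vector
weights = {(0,): 1.0, (1,): -0.5, (0, 1): 0.2, (2, 2): 0.1}
x = np.array([0.3, 0.8, 0.5])
print(split(x, w0=-0.2, weights=weights))  # X-
```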
The learning set X contains labelled as well as unlabelled pixels. The unlabelled pixels are those coming from the borders of the dots. Labels for such pixels are hard or even impossible to obtain. Therefore, an unsupervised learning algorithm that we have recently proposed is used for the binary decision tree construction (Verikas et al., 1995). For every node of the tree the
algorithm tries to locate the decision boundary (12)
in a place with few learning samples. A node of the
tree is labelled as being a terminal node of the first
type when all labelled samples falling in the node belong to the same class or if only one class has the
number of labelled samples above the threshold T1 and
the ratio of samples of two major classes represented
by the node is above the threshold T2. A node of the
tree is labelled as being the terminal node of the second type if the number of labelled samples falling in
the node is above the threshold T1 more than for only
one class and the samples falling into the node form
a "compact cluster". The algorithm that can find the "compact clusters" is given in (Verikas et al., 1996).
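The node-labelling rules above can be sketched as a small predicate (the names and the compact-cluster flag are ours; the actual tests are in the cited papers):

```python
def node_type(counts, T1, T2, compact):
    """Label a tree node from the per-class counts of labelled
    samples falling into it.  'compact' stands for the outcome of
    the compact-cluster test of Verikas et al. (1996)."""
    nonzero = [c for c in counts if c > 0]
    above = [c for c in counts if c > T1]
    if len(nonzero) <= 1:
        return 'terminal, type 1'          # one class only
    major = sorted(counts, reverse=True)[:2]
    if len(above) == 1 and major[0] / major[1] > T2:
        return 'terminal, type 1'          # one clearly dominant class
    if len(above) > 1 and compact:
        return 'terminal, type 2'          # cluster of ambiguous classes
    return 'internal'

print(node_type([120, 4, 0], T1=20, T2=10, compact=False))   # terminal, type 1
print(node_type([80, 70, 0], T1=20, T2=10, compact=True))    # terminal, type 2
```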
5.2. Counterpropagation network
5.2.1. Process of designing the network
The CP networks with input from the second type
nodes of the tree are trained separately. In order to
avoid overtraining and to achieve better generalisation
properties of the network, separate data sets have been
used in different steps of the designing process.
Data sets used to design a pattern recognition system are always limited and very often not representative enough. This often happens because of a lack of
179
The winner k among the weight vectors of the qth class is determined as

  k = arg min_i (d(x, w_i^q) − b_i^q),  i = 1, 2, ..., N_q,   (13)

where d(x, w_i^q) is the distance between pixel x and the ith weight vector of the qth class, and b_i^q is the winning-frequency-sensitive term, which penalises too frequent "winners" and rewards those that win seldom (Verikas and Malmqvist, 1995). Then the winning weight vector w_k^q(t) is updated according to the rule
  w_k^q(t + 1) = w_k^q(t) + α_t [x(t) − w_k^q(t)],   (14)
  v_ij(t + 1) = v_ij(t) + β [y_j − v_ij(t)] z_i,   (15)

where

  z_i = 1, if d(x, w_i^q) = min_{j=1,2,...,N_q} d(x, w_j^q);
  z_i = 0, otherwise.   (16)
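One training step of Eqs. (13)-(15) for a single ambiguous-class set q can be sketched as follows; the form b_i = γ(mean wins − wins_i) of the frequency-sensitive term is only one possible choice (cf. Desieno, 1988), not necessarily the paper's:

```python
import numpy as np

rng = np.random.default_rng(0)

Nq, dim = 4, 3                       # weight vectors per class, input size
W = rng.random((Nq, dim))            # competitive-layer weights w_i^q
V = np.zeros((Nq, 2))                # Grossberg-layer weights (2 outputs)
wins = np.zeros(Nq)                  # winning frequencies

def train_step(x, y, alpha=0.1, beta=0.002, gamma=0.01):
    d = np.linalg.norm(W - x, axis=1)
    b = gamma * (wins.mean() - wins)     # rewards seldom winners
    k = int(np.argmin(d - b))            # Eq. (13)
    wins[k] += 1
    W[k] += alpha * (x - W[k])           # Eq. (14)
    V[k] += beta * (y - V[k])            # Eq. (15): z_i = 1 for the winner only
    return k

k = train_step(rng.random(dim), np.array([1.0, 0.0]))
print(0 <= k < Nq)  # True
```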
5.3. Modified lvq

After the competitive layer has been trained, its weight vectors are fine-tuned by a modified lvq algorithm. The two weight vectors c_i and c_j nearest to the training pixel x(t) are updated according to

  c_k(t + 1) = c_k(t) + ε(t) a(t) [x(t) − c_k(t)],   (19)

for k ∈ {i, j}, where a(t) = +1 if the class of c_k agrees with the class of x(t) and a(t) = −1 otherwise. The modified lvq is similar to that described by Song and Lee (1996). However, we allow only modifications of weights inside the window h.

In addition to the I, J and K values of a pixel, statistics of these variables computed in the neighbourhood of the pixel are used as additional features:

  min[I_i], max[I_i], E[I_i], E[J_i], E[K_i], min[K_i], max[K_i].   (20)

5.4. Determining weights for the Euclidean distance

The weights a_ij that appear in the weighted Euclidean distance are specific for each reference pattern. The weights are found by maximising the following function of classification performance:

  F = (1/N_L) Σ_{i=1}^{Q} (N_ci^t − k N_wi^t),   (21)

where t is the iteration of the optimisation process, N_L is the number of samples in the learning set, k is a constant, Q is the number of classes, and N_wi^t is given by

  N_wi^t = N_ci^0 − N_ci^t, if N_ci^0 > N_ci^t;
  N_wi^t = 0, otherwise,   (22)

where N_ci^0 is the number of samples from class i classified correctly at the zeroth iteration of the optimisation process and N_ci^t the corresponding number at iteration t. The second term in the performance measure penalises an increase in wrong classifications. The Alopex algorithm (Unnikrishnan and Venugopal, 1994), performing a random search in the weight space, is used for the optimisation.
6. Experimental testing
6.1. Learning and testing sets
The learning rate α_t of the competitive layer was decreased according to

  α_t = k1 (1 − t/t1);   α_t = k2 (1 − t/t2).

Values of k1 = 0.4, k2 = 0.02 and t1 = 0.1 t2 were chosen. The value of t2 depended on the number of nodes and was set to 500 × (number of nodes).
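Read as a two-phase schedule (our interpretation: the first formula for t ≤ t1, the second afterwards), this can be written as:

```python
def alpha(t, n_nodes=10, k1=0.4, k2=0.02):
    """Two-phase learning-rate schedule; t2 = 500 * (number of
    nodes) and t1 = 0.1 * t2 as chosen in the paper.  The 10-node
    value and the piecewise reading are only an example."""
    t2 = 500 * n_nodes
    t1 = 0.1 * t2
    if t <= t1:
        return k1 * (1.0 - t / t1)
    return k2 * (1.0 - t / t2)

print(alpha(0), alpha(250), alpha(5000))  # 0.4 0.2 0.0
```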
Note that the parts of the network representing different classes are trained separately. Therefore, the number of nodes is relatively small. To keep the training process of the Grossberg layer well-behaved, the parameter β should be kept suitably small (0 ≤ β ≪ 1). After preliminary experiments the parameter β was set to a value of 0.002. To ensure that there is no increase in incorrect classifications after the optimisation starts, the constant k should be set to a relatively large value. The value k = 10 has been found to be appropriate for our task.
The optimal size of the lvq window depends on the number of training samples. If a large number of samples is available, a narrow window would guarantee the most accurate location of the decision boundary. For good statistical accuracy, however, the number of samples falling into the window must be sufficient (Kohonen, 1990). The optimal value of ε depends on the size of the window, being smaller for narrower windows (Kohonen, 1990). After some preliminary experiments the following values of the lvq parameters have been used: h = 0.01, α = 0.02 and ε = 0.02.
The confidence interval (P1, P2) for the probability of correct classification was calculated as

  P_{1,2} = [2f + t²_{α/2}/N_T ± t_{α/2} √(4f(1 − f)/N_T + t²_{α/2}/N_T²)] / [2(1 + t²_{α/2}/N_T)],   (23)

where f is the correct classification rate measured on the testing set and N_T is the number of testing samples.
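With t_{α/2} = 1.96 (a 95% level, our assumption), Eq. (23) reproduces the intervals listed in Table 2:

```python
from math import sqrt

def confidence_interval(f, N, t=1.96):
    """Confidence limits P1, P2 of Eq. (23) for a correct
    classification rate f estimated from N test samples."""
    t2 = t * t
    half = t * sqrt(4.0 * f * (1.0 - f) / N + t2 / N ** 2)
    denom = 2.0 * (1.0 + t2 / N)
    return (2.0 * f + t2 / N - half) / denom, (2.0 * f + t2 / N + half) / denom

P1, P2 = confidence_interval(0.992, 9e4)
print(round(P1, 4), round(P2, 4))  # 0.9914 0.9926
```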
Fig. 10. An example of an image containing eight classes of colours (no k). A part of the image was classified by the developed neural
network.
Fig. 11. An image of dots printed with black ink on a magenta-yellow background.
Fig. 12. An image of dots printed with cyan, magenta and yellow
inks on a magenta-yellow background.
Table 2
Performance of the network and confidence intervals for different colour classes

Colours       f      P1      P2      N_T
w, y, cy      0.992  0.9914  0.9926   9 × 10^4
c             0.981  0.9801  0.9819  12 × 10^4
m             0.982  0.9811  0.9828   9 × 10^4
my            0.978  0.9770  0.9789   9 × 10^4
cm            0.941  0.9395  0.9424  10 × 10^4
cmy           0.902  0.9003  0.9037  12 × 10^4
k             0.908  0.9064  0.9096  12 × 10^4
cm, cmy, k    0.980  0.9791  0.9808  10 × 10^4
very difficult. A correct classification rate of about 70-75% has been obtained for the black dots on the cyan-magenta-yellow background. We hope that the classification results for such dark areas of pictures can be improved by exploiting knowledge about the light-paper-ink interaction and by using a more elaborate extraction of additional features. Work on how an artificial neural network can be used for finding a set of additional features is in progress. On the other hand, for the application it is important to "find" black ink on the lighter areas of pictures.
In order to evaluate the results obtained from the system, we attempted to distinguish between the two "black" images using another method for colour image segmentation. Good segmentation results for textured colour images obtained using Gaussian Markov random field models have recently been reported (Panjwani and Healey, 1995). In this model it is assumed that the RGB colour vector at each location is a linear combination of the neighbours in all three planes plus Gaussian noise. The coefficients of the combination are estimated as parameters of the model. There are three colour planes and four directions used.
Therefore there are 12 parameters of the model for
each colour plane. For two textures a difference in
estimated values of the parameters indicates the difference between textures themselves. This approach
has been chosen for the comparison. The two "black"
images have been treated as two textures with different spatial interaction of coloured pixels. Table 3
provides an example of the estimated values of the
parameters for the R colour plane. As we can see from the
table there is no significant difference in the values
of the model parameters estimated from the image of
the picture printed in black ink and that printed in
cyan, magenta and yellow inks in this order on top of
each other. The same range of difference between the
estimated parameter values has been obtained for the
colour planes G and B.
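The comparison model can be sketched as a least-squares fit; the particular neighbour set below (one pixel in each of the four directions N, S, E, W, in each of the three planes) is our reading of the description, not necessarily the neighbourhood used in the cited work:

```python
import numpy as np

rng = np.random.default_rng(2)

def estimate_params(img):
    """Least-squares estimate of the 12 linear-prediction
    coefficients for the R plane: each R value is regressed on its
    four neighbours in each of the R, G and B planes."""
    h, w, _ = img.shape
    offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]    # four directions
    rows, target = [], []
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            rows.append([img[i + di, j + dj, p]
                         for p in range(3) for di, dj in offsets])
            target.append(img[i, j, 0])             # centre pixel, R plane
    params, *_ = np.linalg.lstsq(np.array(rows), np.array(target), rcond=None)
    return params

params = estimate_params(rng.random((16, 16, 3)))
print(params.shape)  # (12,)
```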
7. Conclusions
Table 3
Means and standard deviations of the estimated values of model parameters for the R colour plane

              "Black" image        "cmy" image
Parameter     Mean      St. dev.   Mean      St. dev.
1             -0.1330   0.062      -0.1431   0.050
2              0.2248   0.078       0.2458   0.071
3             -0.1364   0.053      -0.1500   0.064
4              0.5558   0.056       0.5577   0.072
5             -0.0545   0.025      -0.0470   0.023
6              0.0751   0.021       0.0653   0.021
7             -0.0352   0.012      -0.0351   0.015
8              0.0040   0.006       0.0151   0.015
9             -0.0200   0.015      -0.0171   0.013
10             0.0411   0.020       0.0351   0.025
11            -0.0251   0.010      -0.0255   0.016
12             0.0072   0.003       0.0062   0.005
The proposed hierarchical modular neural network enables one to obtain a high average classification speed and a low classification error. Experimentally, we have shown that the network is capable of distinguishing among the nine colour classes that occur in a halftone colour image. A correct classification rate of about 98% has been obtained even for two very similar black colours, namely the black printed in black ink and the black printed in cyan, magenta and yellow inks in this order on top of each other.
Acknowledgements
We gratefully acknowledge the support we have received from The Swedish National Board for Industrial and Technical Development and The Royal Swedish Academy of Sciences. We also wish to thank two anonymous reviewers for their valuable comments on the manuscript.
References
Desieno, D. (1988). Adding a conscience to competitive learning.
Proc. ICNN I. IEEE Press, New York, 117-124.
Hunt, R.W.G. (1991). Measuring Colour. Ellis Horwood,
Chichester, UK.
Kohonen, T. (1990). The self-organizing maps. Proc. IEEE 78
(9), 1461-1480.
Liu, J. and Y.-H. Yang (1994). Multiresolution color image
segmentation. IEEE Trans. Pattern Anal. Machine Intell. 16 (7),
689-700.
Panjwani, D. K. and G. Healey (1995). Markov Random Field
models for unsupervised segmentation of textured color images.
IEEE Trans. Pattern Anal. Machine Intell. 17 (10), 939-954.
Song, H.-H. and S.-W. Lee (1996). LVQ combined with simulated
annealing for optimal design of large-set reference patterns.
Neural Networks 9 (2), 329-336.
Tan, T.S.C. and J. Kittler (1993). Colour texture classification
using features from colour histogram. Proc. SCIA-93, Tromso,
Norway, 807-813.
Tominaga, S. (1992). Color classification of natural color images.
Color Research and Application 17 (4), 230-239.
Uchiyama, M. and M.A. Arbib (1994). Color image segmentation
using competitive learning. IEEE Trans. Pattern Anal. Machine
Intell. 16 (12), 1197-1206.
Unnikrishnan, K.P. and K.P. Venugopal (1994). Alopex: A
correlation-based learning algorithm for feedforward and
recurrent neural networks. Neural Computation 6, 469-490.
Verikas, A. and K. Malmqvist (1995). Increasing colour image
segmentation accuracy by means of fuzzy post-processing. Proc.
IEEE Internat. Conf. on Artificial Neural Networks, Perth,
Australia, Vol. 4, 1713-1718.
Verikas, A., K. Malmqvist, L. Bergman and A. Gelzinis (1995).
An unsupervised learning technique for finding decision
boundaries. Proc. 5th European Conf. on Artificial Neural
Networks, ICANN-95, Paris, Vol. 2, 99-104.
Verikas, A., K. Malmqvist and A. Gelzinis (1996a). A new
technique to generate a binary decision tree. Proc. Symposium
on Image Analysis, Lund, Sweden, 164-168.
Verikas, A., K. Malmqvist, L. Malmqvist and L. Bergman (1996b).
Weighting colour space coordinates for colour classification.
Proc. Symposium on Image Analysis, Lund, Sweden, 49-53.