
J Intell Robot Syst (2011) 62:527–546

DOI 10.1007/s10846-010-9463-2

A Road Following Approach Using Artificial Neural Networks Combinations

Patrick Yuri Shinzato · Denis Fernando Wolf

Received: 18 March 2010 / Accepted: 13 August 2010 / Published online: 28 August 2010
© Springer Science+Business Media B.V. 2010

Abstract Navigation is a broad topic that has been receiving considerable attention
from the mobile robotics community over the years. In order to execute autonomous
driving in outdoor urban environments, it is necessary to identify the parts of the
terrain that can be traversed and the parts that should be avoided. This paper describes
an analysis of terrain identification based on different visual information, using MLP
artificial neural networks and combining the responses of many classifiers. Experimental
tests using a vehicle and a video camera have been conducted in real scenarios to
evaluate the proposed approach.

Keywords Image processing · Navigation · Machine learning

1 Introduction

Autonomous navigation capability is a requirement for most mobile robots. In
order to deal with this issue, robots must obtain information about the environment
through sensors and thereby identify safe regions to travel [1]. Outdoor navigation
in unknown terrain is certainly a complex problem. Beyond obstacle avoidance, the
vehicle must be able to identify surfaces on which it can navigate safely. The irregularity
of the terrain and the dynamics of the environment are some of the factors that make
robot navigation a challenging task [2].

Usually it is desirable that the mobile robot (vehicle) have the capacity to move
along the road and avoid obstacles and non-navigable areas. Since these elements
usually differ in color and texture, cameras are a suitable option to identify

P. Y. Shinzato (✉) · D. F. Wolf
Institute of Mathematics and Computer Science,
University of Sao Paulo, Sao Carlos, SP, Brazil
e-mail: shinzato@icmc.usp.br

D. F. Wolf
e-mail: denis@icmc.usp.br

Fig. 1 Vehicle used for data collection

navigable regions. Several techniques for visual road following have been developed
based on certain assumptions about the road scene. Detecting road boundaries
through the use of gradient-based edge techniques is described in [3–5]. These
algorithms assume that road edges are clear and fairly sharp. In [6], an approach
was developed that extracts texture from road images and uses it as a feature for
the segmentation of unmarked roads. The approach presented in [7] divides images
into slices and tries to detect the path in each one.
A work related to artificial neural networks (ANNs) applied to road following is the
Autonomous Land Vehicle In a Neural Network (ALVINN) [8], where a network is
used to classify the entire image and detect the road. Another work that uses ANNs
is presented in [9]. Both works had only one ANN, which had to be re-trained in
order to be able to identify a long path. This work presents the use of a set of ANNs
in order to improve road identification. More specifically, we present an image
classification approach based on many ANNs that take different features obtained
from images as input, and we combine all their outputs to produce one improved
classification. Experimental tests using a vehicle (Fig. 1) and a video camera have
been conducted in real scenarios.
The rest of this paper is organized as follows. Section 2 presents the techniques and
features used to identify the navigable region in the image. Section 3 presents the
concepts of artificial neural networks used in this work. In Section 4, the experimental
results obtained from tests in a real environment are presented. Finally, Section 5
presents conclusions and future work.

2 Block-based Classification Method

Navigation in outdoor spaces is considerably more complex than in structured
indoor spaces. The terrain is composed of a variety of elements such as grass, gardens,
sidewalks, streets and gravel. These elements usually have different colors and
textures, making it possible to use cameras to differentiate them. The first step to
build a vision-based outdoor navigation system is to classify outdoor spaces into two
classes: navigable regions and non-navigable regions. The navigable regions are the
surfaces on which a mobile robot can travel safely. After the terrain classification,
other algorithms available in the literature can perform path planning and obstacle
avoidance [10].

Fig. 2 Blocks generated from a frame of a road scene

A block-based classification method consists of dividing the image into blocks of
pixels and evaluating each block as a single unit. A value is generated to represent
each group; this value can be the average of the RGB values, the entropy, or other
features of the collection of pixels represented. In the grouping step, a frame of
resolution (M × N) pixels is sliced into groups of (K × K) pixels, as shown in Fig. 2.

Suppose an image is represented by a matrix I of size (M × N). The element
I(m, n) corresponds to the pixel in row m and column n of the image, where
0 ≤ m < M and 0 ≤ n < N. Therefore, block B(i, j) contains all the pixels I(m, n)
such that (i · K) ≤ m < (i · K) + K and (j · K) ≤ n < (j · K) + K. For each block,
a feature value is calculated depending on the feature chosen. This strategy has been
used to reduce the number of image elements, allowing faster processing.
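To make the grouping step concrete, the sketch below divides a frame into (K × K) blocks exactly as defined above. It is a minimal illustration in Python with NumPy (the original implementation used SNNS and OpenCV, not necessarily this code; the function name slice_into_blocks is ours):

    import numpy as np

    def slice_into_blocks(image, K):
        # Split an (M x N) image into non-overlapping (K x K) blocks.
        # Block B(i, j) holds the pixels I(m, n) with
        # i*K <= m < i*K + K  and  j*K <= n < j*K + K.
        M, N = image.shape[:2]
        blocks = {}
        for i in range(M // K):
            for j in range(N // K):
                blocks[(i, j)] = image[i * K:(i + 1) * K, j * K:(j + 1) * K]
        return blocks

    # A 320 x 240 frame with K = 10 yields 32 x 24 = 768 blocks,
    # matching the setup reported in Section 4.
    frame = np.zeros((240, 320), dtype=np.uint8)
    assert len(slice_into_blocks(frame, 10)) == 768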

2.1 Statistical Measures as Image Features

We use statistical measures as features, such as the mean, probability, entropy and
variance. Their definitions and equations are described below.

2.1.1 Shannon Entropy


In this work, texture analysis consists of calculating the entropy of the pixels. In a
simple way, entropy can be defined as the degree of regularity of a data set [11].
Mathematically, Shannon entropy can be defined as follows:

E(X) = -\sum_{x \in X} p(x) \log p(x)    (1)

where p(x) is the probability of pixel value x occurring in the collection. In this case,
x corresponds to a pixel value and the block corresponds to the collection. The
calculation depends on the color space and the number of channels used.
530 J Intell Robot Syst (2011) 62:527546

2.1.2 Energy
The energy value measures the presence of high values relative to the other values,
and can be defined as:

\sum_{x=0}^{C-1} \big(p(x)\big)^2    (2)

where p(x) is the probability of pixel value x occurring in the collection and C is the
number of colors in the image (for example, channel R of RGB has 256 colors).

2.1.3 Variance
Variance is a well-known concept in statistics; it represents the dispersion relative to
the average. It can be described as:

\sigma^2 = \sum_{x=0}^{C-1} (x - \mu)^2 \, p(x)    (3)

where p(x) is the probability of pixel value x occurring in the collection, \mu is the
mean of the collection and C is the number of colors in the image.
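A minimal sketch of the three statistical measures above, computed from the value histogram of a single-channel block (NumPy assumed; the function name block_statistics is ours, and log base 2 is one common choice for the entropy):

    import numpy as np

    def block_statistics(block, C=256):
        # Entropy (Eq. 1), energy (Eq. 2) and variance (Eq. 3) of a
        # block whose pixel values lie in [0, C).
        counts = np.bincount(block.ravel(), minlength=C)
        p = counts / counts.sum()    # p(x): relative frequency of value x
        nz = p > 0                   # skip p(x) = 0 to avoid log(0)
        entropy = -np.sum(p[nz] * np.log2(p[nz]))
        energy = np.sum(p ** 2)
        values = np.arange(C)
        mu = np.sum(values * p)      # mean of the collection
        variance = np.sum((values - mu) ** 2 * p)
        return entropy, energy, variance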

2.2 RGB Color Space

RGB is a color space where each color is defined by the values of its R (red), G
(green) and B (blue) components [12]. The classification based on this color space
generates a feature in RGB pixel format. This feature is the weighted average of
the pixel occurrences in the block.

We also use the RGB entropy and energy as features. In order to obtain the
entropy and energy values, the frequency of each pixel in the block is calculated. For
each pixel with value x, p(x) is calculated by dividing the frequency of x by the total
number of pixels in the block. Note that, for pixels x and y in RGB format, x = y if
and only if:
the red of x equals the red of y, and
the green of x equals the green of y, and
the blue of x equals the blue of y.

2.3 HSV Color Space

The HSV color space is composed of hue (H), saturation (S) and value (V, brightness)
[13]. If the saturation component is zero, then the hue can be disregarded.
As in RGB, we generate the average, entropy and energy of HSV, where, for pixels
x and y in HSV format, x = y if and only if:
the hue of x equals the hue of y, and
the saturation of x equals the saturation of y, and
the value of x equals the value of y.
However, for this color space, we also generate the entropy, energy and variance of
each channel independently. In other words, we also generate attributes such as hue
entropy, saturation entropy and value entropy, in addition to the measures previously
mentioned. Another attribute generated was HAS, defined as (hue + saturation)/2.
This attribute was created to take advantage of the consistency of these two channels
when they belong to a pixel of the street. The entropy of HAS was also used in
this work.

Fig. 3 Network with a 1–5–1 topology
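As an illustration of the per-channel HSV attributes and the HAS attribute, the sketch below uses OpenCV's color conversion and reuses block_statistics from the previous sketch. The function name hsv_block_features is ours; note that OpenCV stores 8-bit hue in [0, 180) and saturation/value in [0, 256):

    import cv2
    import numpy as np

    def hsv_block_features(block_bgr):
        hsv = cv2.cvtColor(block_bgr, cv2.COLOR_BGR2HSV)
        h, s, v = cv2.split(hsv)
        feats = {
            "h_average": h.mean(), "s_average": s.mean(), "v_average": v.mean(),
            "h_entropy": block_statistics(h, C=180)[0],
            "s_entropy": block_statistics(s)[0],
            "v_entropy": block_statistics(v)[0],
        }
        # HAS attribute: (hue + saturation) / 2, and its entropy.
        has = ((h.astype(np.float32) + s) / 2).astype(np.uint8)
        feats["has_average"] = has.mean()
        feats["has_entropy"] = block_statistics(has)[0]
        return feats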

3 Artificial Neural Networks as Classifiers

Artificial Neural Networks (ANNs) are well known for properties such as adaptability,
the ability to learn by example and the ability to generalize. In this work, we have
used a multilayer perceptron (MLP) [14], which is a feedforward neural network
model that maps sets of input data onto specific outputs. We used the backpropagation
technique [15], which estimates the weights based on the amount of error in the
output compared to the expected results.

In this work, we used one hidden layer with five neurons, as shown in Fig. 3. All
networks tested have only one neuron in the output layer, which is enough to classify
a block as navigable (returning 1) or non-navigable (returning 0). However, the
networks return decimal values between 0 and 1. For this reason we defined the
responses as follows:
if result ≤ 0.3, the region is classified as non-navigable;
if result ≥ 0.7, the region is classified as navigable;
if 0.3 < result < 0.7, the region is classified as unknown. Notice that the
unknown classification is actually considered an error value.

The size of the input layer corresponds to the number of image attributes used.
Therefore, the differences between the classifiers evaluated are the number of
attributes used and their combination. The networks were evaluated every 100
training cycles, up to 2,000 cycles, which was enough to guarantee their convergence.
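The following sketch shows the network topology and the decision rule above as a plain NumPy forward pass. The weights here are random placeholders; in the original work they were learned with backpropagation in SNNS, so this illustrates the inference side only:

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def mlp_response(features, W_h, b_h, w_o, b_o):
        # n-5-1 MLP: n input features, 5 hidden neurons, 1 output in (0, 1).
        hidden = sigmoid(W_h @ features + b_h)     # shape (5,)
        return float(sigmoid(w_o @ hidden + b_o))  # scalar response

    def classify(result):
        # Decision rule used in this work.
        if result <= 0.3:
            return "non-navigable"
        if result >= 0.7:
            return "navigable"
        return "unknown"  # counted as an error in the evaluation

    # Placeholder weights for a classifier with 5 input attributes.
    rng = np.random.default_rng(0)
    W_h, b_h = rng.normal(size=(5, 5)), np.zeros(5)
    w_o, b_o = rng.normal(size=5), 0.0
    print(classify(mlp_response(rng.random(5), W_h, b_h, w_o, b_o)))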

4 Experiments and Results

In order to analyse the various attribute combinations, several experiments have
been carried out on the university campus. We collected data in realistic
environments under different conditions. More specifically, we recorded the path
traversed by a vehicle in diverse terrain, through streets flanked by sidewalks,
parking areas or vegetation. In addition, portions of the streets had adverse conditions
such as sand and dirt (shown in Fig. 4).

Fig. 4 Example of a dirt road used in the experiments

Our setup for the experiments was a car equipped with a Canon A610 digital
camera. The image resolution was (320 × 240) pixels at 30 FPS. In order to
execute the experiments with ANNs, we used the Stuttgart Neural Network Simulator
(SNNS) [16]. The OpenCV library [17] was used for image acquisition and to
visualize the processed results from SNNS. The block size used was K = 10, which
resulted in 768 blocks per frame.

We performed the experiments in two phases. In the first phase we trained the ANNs
with one simple frame, where the road is flanked by grass and a parking area on each
side. The frame used for evaluation was similar to the one used in the training step.
The second phase was more complex: we used five frames with different conditions
for the training step and evaluated with fifteen frames (the five from the training
step plus ten other frames). We separated the work into two phases because the total
number of classifiers (feature combinations) was very large, approximately 28,000.
The first phase eliminated the combinations of attributes that did not obtain satisfactory
results, reducing the number of candidates in the second phase, which involves a more
complex analysis.

4.1 Phase 1

In Phase 1 we tested combinations of 21 features: average R (red), average G (green),
average B (blue), RGB entropy, average H (hue), average S (saturation), average
V (value), HSV entropy, H entropy, S entropy, V entropy, H variance, S variance,
V variance, RGB energy, HSV energy, H energy, S energy, V energy, HAS and
HAS entropy. Each feature corresponds to a neuron in the input layer of the ANN.
Therefore, we tested different combinations with one, two, three, four and five
attributes, that is, networks with one to five neurons in the input layer, totaling
27,890 different classifiers evaluated.
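The candidate count follows from enumerating all feature subsets of size one to five; a sketch (the names in FEATURES are our shorthand for the 21 attributes listed above):

    from itertools import combinations

    FEATURES = [
        "avg_R", "avg_G", "avg_B", "rgb_entropy", "avg_H", "avg_S", "avg_V",
        "hsv_entropy", "h_entropy", "s_entropy", "v_entropy", "h_variance",
        "s_variance", "v_variance", "rgb_energy", "hsv_energy", "h_energy",
        "s_energy", "v_energy", "has", "has_entropy",
    ]

    candidates = [c for k in range(1, 6) for c in combinations(FEATURES, k)]
    # C(21,1) + ... + C(21,5) = 27,895 subsets, on the order of the
    # 27,890 classifiers reported above (a handful of combinations were
    # presumably excluded in the original study).
    print(len(candidates))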
The frames used in this phase are shown in Fig. 5. An important detail about
this stage is that only the blocks below the horizon line were used both to train and
to evaluate; thus, each frame generated only 480 blocks, as can be seen in the
visualization of the classifier response (Fig. 6). This is due to the fact that much of
the top of the image represents sky, which can be eliminated with a pre-processing
step [18].

Fig. 5 Frames used in Phase 1: (a) frame used for training; (b) frame used for evaluation
Among the results obtained from all classifiers tested, 16,976 classifiers achieved a
hit rate between 90 and 98%, with about one thousand classifiers reaching
approximately 98%. It is important to notice that the hit rate is a percentage of the 480
blocks of the assessment frame, which means that a classifier with a hit rate of 98%
misclassified around 10 blocks. As the number of feature combinations was still high,
we executed the second phase reviewing only the classifiers that achieved a hit rate of
90% or more.

4.2 Phase 2

Based on the results obtained in Phase 1, we evaluated 16,976 classifiers in different
street conditions. The evaluation method and settings were the same as in Phase 1.
For this experiment, we used patterns generated from five frames in the training

(a) Best result. (b) Evaluation of the best result.

Fig. 6 Classification results: (a) blocks classified as non-navigable (magenta) and
navigable (cyan) by the classifier; (b) correct, false-negative, false-positive and
unknown classifications in green, blue, red and yellow, respectively

Fig. 7 Frames 1–5, used for training in Phase 2

step (the frames can be seen in Fig. 7). For the evaluation stage, we used patterns
generated from 15 frames. The frames used in the evaluation step are shown in Fig. 8.
Among these frames, there are scenes of curves, dirt roads and streets with no
defined edges.
Among the results obtained in Phase 2, 5,967 feature combinations achieved a hit
rate of 90% or more. One important detail is that in Phase 1 there were classifiers that
reached a hit rate of 98%, while in Phase 2 the best result obtained was 93%. For
these 5,967 classifiers, we made two analyses: we analysed the best five results, which
reached approximately 93%, and we analysed the number of times each subset of
attributes appeared in these results.
Two analyses have been applied to the classification results:

General analysis is the average error/hit rate obtained over all the test frames
evaluated.
Analysis per frame is the error/hit rate of a single frame compared to the error/hit
rate and standard deviation of the complete set of test frames.

This allows us to know whether the errors of a given classifier are concentrated in a
few frames or spread along all the evaluated frames.

4.3 Analysis of the Best Features Combinations

Analysing the results from all classifiers evaluated, 5,967 achieved a hit rate of 90% or
more in the general analysis. We discuss the best five results, which reached 93%; the other

Fig. 8 Frames 6–15, used for evaluation in Phase 2

classifiers reached more than 90% and less than 93%. These classifiers have the
following attributes:
Classifier 1: B average, RGB entropy, V entropy, S variance, S energy.
Classifier 2: B average, HSV entropy, S entropy, V entropy, S energy.
Classifier 3: R average, B average, H average, V entropy, HSV energy.
Classifier 4: R average, G average, H average, V entropy, HAS entropy.
Classifier 5: R average, H average, H entropy, V entropy (this configuration has
only four attributes).

Since the scenes used in the tests are considerably different, it is possible that the
classification of some frames has bad results (a hit rate of less than 90%) while others
have a hit rate near 100%. Therefore, it is convenient to do a visual analysis of the
results of the classifiers for each scene used in the tests. A short description of the
behavior of each classifier is presented below:

4.3.1 Classifier 1


This classifier misclassified almost all the blocks representing the parking space. It
obtained significant errors on the sidewalk and traffic lane, but it got good results
on the dirt road (Fig. 9a). Overall, the errors were well distributed over the frames,
except in the cases of parking areas and sidewalks (Fig. 9b).

4.3.2 Classifier 2


This classifier completely misclassified the blocks representing the parking space
(Fig. 10a) and walkways (Fig. 10b). On the dirt road, very few blocks were classified
as unknown (Fig. 10c). The classifier obtained a very good performance in the other
scenes (Fig. 10d).

4.3.3 Classifier 3


Among the five classifiers analyzed, this one obtained the highest hit rates in the
parking areas (Fig. 11a) and sidewalks. Unlike the others, it got reasonable results in
the traffic lanes (Fig. 11b). This classifier also had a good performance on the dirt
roads.

4.3.4 Classifier 4


This classifier obtained several errors in the parking area, but most of these errors
correspond to unknown regions (Fig. 12a). It made as many mistakes as the previous
classifier in the traffic lanes and achieved the second best mark on the dirt roads.
The classifier had a good performance in the other scenes (Fig. 12b).

(a) Analysis of frame 14. (b) Analysis of frame 11.

Fig. 9 Results of the Classifier 1 responses: correct, false-negative, false-positive and
unknown classifications in green, blue, red and yellow, respectively

(a) Analysis of frame 1. (b) Analysis of frame 11.
(c) Analysis of frame 5. (d) Analysis of frame 2.

Fig. 10 Results of the Classifier 2 responses: correct, false-negative, false-positive
and unknown classifications in green, blue, red and yellow, respectively

4.3.5 Classifier 5


Like the second classifier, this classifier completely misclassified the blocks
representing the parking spaces (Fig. 13a) and walkways, but it obtained good performance in the

(a) Analysis of frame 1. (b) Analysis of frame 6.

Fig. 11 Results of the Classifier 3 responses: correct, false-negative, false-positive
and unknown classifications in green, blue, red and yellow, respectively

(a) Analysis of frame 1. (b) Analysis of frame 8.

Fig. 12 Results of the Classifier 4 responses: correct, false-negative, false-positive
and unknown classifications in green, blue, red and yellow, respectively

traffic lanes and on the dirt roads (Fig. 13b). It is worth remembering that this
classifier has only four image attributes as input.
All classifiers had problems, in different degrees, at the road edges and traffic
lanes, due to the loss of precision of the block-based method and the different colors
of lanes, curbs and road. Also, many classifiers had problems with parking areas,
because their color and texture are similar to those of streets. In general, all classifiers
obtained a reasonable classification of the main portion of the street where the car
can travel. The graph shown in Fig. 14 describes the error rate per frame for the best
five classifiers. This analysis shows that classifiers with the same hit rate can have
different responses.

Another conclusion to be drawn is that these five classifiers have some attributes
in common. Since many classifiers achieved acceptable performance, we can analyze
the frequency with which combinations of attributes were used. From this analysis we
can learn the real contribution of each attribute to the classifier.

(a) Analysis of frame 7. (b) Analysis of frame 14.

Fig. 13 Results of the Classifier 5 responses: correct, false-negative, false-positive
and unknown classifications in green, blue, red and yellow, respectively

Fig. 14 Analysis per frame

4.4 Frequency Analysis

In this analysis we considered acceptable the classifiers that achieved a hit rate of 90% or
more, which were approximately five thousand. Instead of analyzing them one by one, we
decided to determine the common features among them. The classifiers with these
subsets were retrained and re-evaluated. Based on these new results, it was possible to
see the contribution of each combination of attributes to the ANN classifier.

Among all the 5,967 classifiers that were considered successful, there are
classifiers that use five, four, three, two and even one attribute as input. So, we
counted how many times a given subset was used as input by the acceptable
classifiers.
Evaluating subsets of one element, the attribute that obtained the best result was
V entropy, appearing 2,740 times; all others appeared fewer than two thousand times.
We analysed subsets of five, four and three attributes from all classifiers. Table 1
shows the three subsets of four elements that appeared most frequently. These
subsets appeared 18 times each, which is the best result possible, since there are only
17 remaining attributes to serve as the fifth element, plus the option of having no
fifth element. This means that these combinations yielded good performance
independently of the fifth attribute used.
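A sketch of this counting procedure (Python; the input is assumed to be the attribute tuples of the acceptable classifiers, e.g. the 5,967 with a hit rate of 90% or more, and the function name subset_frequencies is ours):

    from collections import Counter
    from itertools import combinations

    def subset_frequencies(acceptable_classifiers, subset_size):
        # Count how often each attribute subset of the given size occurs
        # among the inputs of the acceptable classifiers.
        counts = Counter()
        for attrs in acceptable_classifiers:
            for subset in combinations(sorted(attrs), subset_size):
                counts[subset] += 1
        return counts

    # Hypothetical usage: the most frequent 4-element subsets would
    # correspond to Table 1 below.
    # top_subsets = subset_frequencies(classifiers, 4).most_common(3)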

Table 1 Attributes that appeared most frequently

B average | Hue average | Hue entropy | Value entropy
Hue average | Hue entropy | Value entropy | HSV energy
Hue average | Hue entropy | Value entropy | HAS entropy

Note that these subsets have Hue average, Hue entropy and Value entropy in
common, which is the most used subset of three elements (it appeared 146 times).
Based on the attributes from Table 1, we reviewed all the classifiers that used these
attributes. More specifically, we retrained the classifiers shown in Table 2 (average
of ten executions), where the AT columns are the attributes, column AVE is the average
hit rate from the general analysis, column SD is the standard deviation of AVE, APF
is the average error rate from the analysis per frame and column SDF is the standard
deviation of APF.

From the results presented in Table 2, we can notice that the top ten from the general
analysis are also in the top ten from the analysis per frame. Note that the best
classifier has only a 6.93% error rate per frame with a low standard deviation; so, for
all frames tested, this classifier missed less than 10%, which can be considered a
good performance. However, it is necessary to determine how these errors are
distributed in the frame, because grouped misclassified blocks are more harmful than
blocks scattered in the region of interest.
For example, the classification of Frame 1 did not obtain satisfactory results with
(Blue average, H average, H entropy, V entropy) nor with (HAS entropy, H average,
H entropy, V entropy). Their error rates for this frame are similar, but the classifier
with Blue average is better than the classifier with HAS entropy, because the
classifier with Blue average classified the region of the parking area as unknown
(shown in Fig. 15a) while the classifier with HAS entropy classified it as navigable
(see Fig. 15b).

Table 2 Evaluation of the most contributing attributes (AVE, APF in %)

Attribute 1 | Attribute 2 | Attribute 3 | Attribute 4 | AVE | SD | APF | SDF
Blue average | H average | H entropy | V entropy | 92 | 0.94 | 6.93 | 3.32
HAS entropy | H average | H entropy | V entropy | 91.7 | 0.82 | 7.74 | 3.68
Blue average | H entropy | V entropy | | 91.6 | 0.7 | 7.28 | 4.49
Blue average | V entropy | | | 91.5 | 0.53 | 7.38 | 4.59
HSV energy | H average | H entropy | V entropy | 91.2 | 0.63 | 7.74 | 3.62
H average | H entropy | V entropy | | 91.0 | 0.47 | 8.18 | 3.21
HAS entropy | H average | V entropy | | 91.0 | 0.0 | 8.54 | 3.59
HSV energy | H average | V entropy | | 90.5 | 0.53 | 8.72 | 3.59
H average | V entropy | | | 90.3 | 0.48 | 8.56 | 3.73
Blue average | H average | V entropy | | 90.2 | 1.32 | 7.39 | 4.3
Blue average | H average | H entropy | | 89.8 | 0.63 | 9.04 | 3.8
H average | H entropy | | | 89.0 | 0.67 | 10.0 | 3.11
HAS entropy | H average | | | 89.0 | 0.0 | 11.93 | 4.08
HSV energy | H average | H entropy | | 89.0 | 0.0 | 11.22 | 2.98
Blue average | H average | | | 88.9 | 0.99 | 10.17 | 3.74
Blue average | H entropy | | | 88.8 | 0.79 | 10.97 | 4.75
HAS entropy | H average | H entropy | | 87.7 | 7.27 | 9.29 | 4.25
HSV energy | H average | | | 87.5 | 0.85 | 13.31 | 4.41
HAS entropy | H entropy | V entropy | | 78.3 | 2.71 | 19.86 | 5.95
HAS entropy | H entropy | | | 74.5 | 0.85 | 23.82 | 6.94
HAS entropy | V entropy | | | 70.6 | 0.7 | 28.15 | 5.45
HSV energy | H entropy | V entropy | | 67.5 | 0.53 | 31.47 | 8.59
H entropy | V entropy | | | 65.7 | 0.48 | 33.54 | 10.38
HSV energy | V entropy | | | 63.5 | 0.71 | 36.92 | 8.86
HSV energy | H entropy | | | 59.3 | 2.11 | 42.22 | 7.24

(a) Answer of the classifier with Blue average. (b) Answer of the classifier with HAS entropy.

Fig. 15 Classification results: blocks classified as non-navigable (magenta) and
navigable (cyan) by the classifiers; yellow represents the unknown classification

(a) Sidewalk as unknown, with Blue average. (b) Sidewalk as navigable, without Blue average.
(c) Errors on the dirt road, with Blue average (it does not classify the dirt road as
navigable). (d) Errors on the dirt road, without Blue average.

Fig. 16 Classification results: blocks classified as non-navigable (magenta),
navigable (cyan) and unknown (yellow)

Analyzing the results of the top four classifiers, we concluded that the classifiers with
Blue average tend to classify the parking lot and the sidewalks as unknown (shown
in Fig. 16a), while the classifiers without Blue average classified them as navigable
(shown in Fig. 16b). However, the classifiers with Blue average obtained more errors
than those without it on the dirt roads, as seen in Fig. 16c.
Another conclusion to be drawn is that the subset (Hue average, Value entropy)
has a good performance, appearing eight times in the top ten. It can also be seen that
these two attributes, when combined with another one, slightly improve their performance.
Due to this fact, we reanalyzed all classifiers of up to three elements that have at
least these two attributes as input. Table 3 presents the same columns as Table 2,
but for these other classifiers.
From the results shown in Table 3, we can notice that the top six from the general
analysis are also in the top six from the analysis per frame. In addition, all classifiers
in Table 3 achieved good results, except classifier 20. This demonstrates that
Hue average and Value entropy are adequate attributes to be used in classifying an
image of a road scene.

In general, classifiers with Blue average, Hue entropy or Saturation entropy
obtained better results in parking lots and sidewalks; however, they missed a large
proportion of the dirt roads. It can be concluded that classifiers that accept the dirt
as a navigable region also include the sidewalks and parking lots, because of their
similar color and texture. The attribute Blue average helps to reduce the similarity
between dirt streets and sidewalks, but it is not enough to classify them as non-navigable.
If we assume that all blocks classified as unknown are non-navigable, then the classifier
can be used in a road-following algorithm with good results.
A good overall classification performance can be seen in Fig. 17, with some errors
in the traffic lanes and at the edges of sidewalks. The most significant error occurred on the

Table 3 Evaluation of Hue average and Value entropy (columns as in Table 2)

# | AT 1 | AT 2 | AT 3 | AVE | SD | APF | SDF
1 | H average | V entropy | G average | 91.3 | 1.1 | 7.54 | 4.0
2 | H average | V entropy | B average | 91.1 | 0.83 | 7.33 | 4.21
3 | H average | V entropy | H entropy | 91.1 | 0.3 | 7.54 | 3.37
4 | H average | V entropy | V average | 90.9 | 1.58 | 6.86 | 4.63
5 | H average | V entropy | S entropy | 90.8 | 0.4 | 7.63 | 3.87
6 | H average | V entropy | S variance | 90.8 | 0.6 | 7.82 | 4.43
7 | H average | V entropy | HAS entropy | 90.8 | 0.4 | 9.0 | 3.32
8 | H average | V entropy | H variance | 90.7 | 0.9 | 7.9 | 4.01
9 | H average | V entropy | HSV entropy | 90.7 | 0.46 | 8.14 | 3.93
10 | H average | V entropy | S average | 90.6 | 1.02 | 8.24 | 2.69
11 | H average | V entropy | RGB entropy | 90.6 | 0.49 | 8.38 | 3.96
12 | H average | V entropy | HSV energy | 90.6 | 0.49 | 8.58 | 3.85
13 | H average | V entropy | RGB energy | 90.4 | 0.49 | 8.51 | 3.8
14 | H average | V entropy | H energy | 90.4 | 0.49 | 8.64 | 3.74
15 | H average | V entropy | R average | 90.4 | 1.5 | 9.49 | 6.16
16 | H average | V entropy | | 90.3 | 0.46 | 8.46 | 3.89
17 | H average | V entropy | V energy | 90.3 | 0.64 | 8.74 | 3.76
18 | H average | V entropy | S energy | 90.2 | 0.4 | 8.78 | 4.01
19 | H average | V entropy | HAS average | 89.8 | 0.6 | 11.01 | 6.29
20 | H average | V entropy | V variance | 88.0 | 7.01 | 7.86 | 4.38

Fig. 17 Classification results for (a) Frame 1, (b) Frame 6, (c) Frame 2, (d) Frame 7,
(e) Frame 3, (f) Frame 8, (g) Frame 9, (h) Frame 10, (i) Frame 4, (j) Frame 11,
(k) Frame 12, (l) Frame 13, (m) Frame 5, (n) Frame 14, (o) Frame 15: blocks
classified as non-navigable (magenta), navigable (cyan) and unknown (yellow)

Fig. 18 Combination of the five responses for frame 6; blocks classified as
navigable area in red

dirt road, where the region in the middle of the road is classified as non-navigable,
which can be expected due to the similarity of its color to that of the sidewalk and plants.

4.5 Combined Classification

Based on the analysis presented in the previous section, we can notice that several
classifiers have almost the same hit rates, but with classification errors in different
parts of the image and in different proportions along the path traveled by the vehicle.
This is due to the use (or not) of certain attributes as input to each classifier, making
it more or less sensitive to certain parts of the environment. Thus, we can build a
more robust classifier by combining the responses of the various classifiers.
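A sketch of this combination step, averaging the raw per-block responses of several networks before applying the 0.3/0.7 decision rule (NumPy assumed; the function name combined_response is ours):

    import numpy as np

    def combined_response(responses):
        # responses: list of 2-D arrays, one per classifier, each holding
        # the raw network output in [0, 1] for every block of the frame.
        return np.mean(np.stack(responses, axis=0), axis=0)

    # A classifier whose output is near 0 or 1 (high certainty) pulls the
    # average further from the 0.3-0.7 "unknown" band than one whose
    # output hovers near the thresholds, matching the behavior discussed
    # at the end of this section.
    maps = [np.random.rand(24, 32) for _ in range(5)]  # 5 classifiers, 768 blocks
    avg = combined_response(maps)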
Results obtained from the average of the responses of the five best classifiers
(Section 4.3) are shown in Fig. 18. This average is calculated for each block in the
image. The top left image is the result of the average of the five best
Fig. 19 Combination of the five responses for frame 11; blocks classified as
navigable area in red

Fig. 20 Combination of the five responses for frame 12; blocks classified as
navigable area in red

responses; the top center image shows the degree of certainty about the navigability.
In this image, the classification is represented in grayscale, where white represents
the value 1 (navigable), black represents the value 0 (non-navigable), and intermediate
values are shown in shades of gray. The top right image is the original camera image.
The five other images, in the second and third rows, are the results obtained from each
individual classifier.

We can see errors in the individual responses, for example, blocks considered
navigable in the vegetation and in some places in the sky. However, by using the average,
these errors are suppressed. This is because, in several cases, the different classifiers
did not produce wrong classifications in the same regions of the image.
To measure the efficiency of the combined classifier, we can compare the error
rate per frame of Fig. 18, which was 4.8%, with the error rates of the individual
classifiers shown in the graph of Fig. 14. We can notice that the result is very close
to the best response obtained. The same happens for frame 11, with 10.6% error,
shown in Fig. 19, and for frame 12, shown in Fig. 20, with an error rate equal to 4%.

The results obtained can be considered satisfactory. The reason why the combined
response stays very close to the best responses in most cases is that the errors of the
classifiers mostly happen because of uncertainty in the response, i.e., values very
close to the limits of 0.3 and 0.7. When the average is calculated, the classifier with
more certainty tends to dominate the overall response.

5 Conclusions and Future Works

Autonomous navigation is one of the main capabilities of autonomous robots.
This paper addresses the problem of identifying navigable areas in the environment
using artificial neural networks and visual information. Different combinations of
attributes have been evaluated in realistic environments.

In general, the results were satisfactory, since many classifiers obtained a
good success rate. Furthermore, the classifiers obtained a good classification of the main
portion of the street where the car can travel, the region of interest. The block-based

method ensures an increase in performance; however, it causes errors at the road
edges and in the traffic lanes in different proportions. By combining the responses of
multiple classifiers, many of these errors were reduced.

As future work we plan to evaluate other image features and more complex
environments. We also plan to integrate our approach with laser mapping, which
provides depth information.

Acknowledgements The authors acknowledge the support granted by CNPq and FAPESP to the
INCT-SEC (National Institute of Science and Technology, Critical Embedded Systems, Brazil),
processes 573963/2008-9 and 08/57870-9.

References

1. Arkin, R.C.: Behavior-Based Robotics. MIT Press, Cambridge (1998)
2. Wolf, D., Sukhatme, G., Fox, D., Burgard, W.: Autonomous terrain mapping and classification
using Hidden Markov models. In: Proceedings of the 2005 IEEE International Conference on
Robotics and Automation (ICRA 2005), pp. 2026–2031 (2005)
3. He, Y., Wang, H., Zhang, B.: Color-based road detection in urban traffic scenes. IEEE Trans.
Intell. Transp. Syst. 5(4), 309 (2004). doi:10.1109/TITS.2004.838221
4. Broggi, A., Bert, S.: Vision-based road detection in automotive systems: a real-time expectation-
driven approach. J. Artif. Intell. Res. 3, 325 (1995)
5. Rotaru, C., Graf, T., Zhang, J.: Extracting road features from color images using a cognitive
approach. In: Intelligent Vehicles Symposium, 2004, pp. 298–303. IEEE (2004). doi:10.1109/
IVS.2004.1336398
6. Zhang, J., Nagel, H.H.: Texture-based segmentation of road images. In: Proceedings of the
Intelligent Vehicles '94 Symposium, pp. 260–265 (1994). doi:10.1109/IVS.1994.639516
7. Ghurchian, R., Takahashi, T., Wang, Z., Nakano, E.: On robot self-navigation in outdoor
environments by color image processing. In: 7th International Conference on Control, Automation,
Robotics and Vision (ICARCV 2002), vol. 2, pp. 625–630 (2002)
8. Pomerleau, D.: Neural network vision for robot driving. In: Arbib, M. (ed.) The Handbook of
Brain Theory and Neural Networks (1995)
9. Foedisch, M.: Adaptive real-time road detection using neural networks. In: Proc. 7th Int. Conf.
on Intelligent Transportation Systems, Washington, DC (2004)
10. Choset, H., Lynch, K.M., Hutchinson, S., Kantor, G.A., Burgard, W., Kavraki, L.E., Thrun, S.:
Principles of Robot Motion: Theory, Algorithms, and Implementations. MIT Press, Cambridge
(2005)
11. Shannon, C.E.: A mathematical theory of communication. Bell Syst. Tech. J. 27, 379–423,
623–656 (1948)
12. Joblove, G.H., Greenberg, D.: Color spaces for computer graphics. SIGGRAPH Comput.
Graph. 12(3), 20 (1978). doi:10.1145/965139.807362
13. Reiter, C.: With J: image processing 2: color spaces. SIGAPL APL Quote Quad 34(3), 3 (2004).
doi:10.1145/1127556.1127557
14. Churchland, P.S., Sejnowski, T.J.: The Computational Brain. MIT Press, Cambridge (1994)
15. Rumelhart, D.E., Hinton, G.E., Williams, R.J.: In: Neurocomputing: Foundations of Research,
pp. 673–695 (1988)
16. University of Stuttgart: SNNS, Stuttgart Neural Network Simulator.
http://www.ra.cs.uni-tuebingen.de/SNNS/ (2010). Visited in March 2010
17. Bradski, G., Kaehler, A.: Learning OpenCV: Computer Vision with the OpenCV Library.
O'Reilly, Cambridge (2008)
18. Lee, J., Crane III, C.D., Kim, S., Kim, J. (eds.): Road Following in an Unstructured Desert
Environment using Monocular Color Vision as Applied to the DARPA Grand Challenge.
International Conference on Control, Automation and Systems (2005)
