Abstract
The majority of weed detection research concentrates on problems where
weeds grow within an artificial plantation in which the weeds exist between lines
of crops. These problems are typically solved by learning the plantation
geometry using various pattern recognition techniques, and then any vegetation
that does not fit in the geometric model is classified as weed. In the case where
weeds are distributed over a natural landscape, the distribution of weed and
non-weed vegetation is more complex and cannot be discriminated by using a
simple geometric model which describes the relative distribution of vegetation.
Further analysis of weed colour and texture is commonly used to improve the
classification results; however, the shape of the weed itself is not
usually explored in this field as a means of weed detection.
In this paper, we examine the use of a template-matching tree crown detection
technique, used successfully in Scandinavian forestry studies, for identifying
woody weeds in a natural landscape based on their shadow. The proposed
algorithm is divided into two stages: the first stage segments the image using
colour and texture features; the second stage applies template matching using
shape information related to the projected shadow of the woody weed, relying on
information about the time of day, sun angle and UAV pose
collected by the onboard navigation system. We present experimental results of
the approach using colour vision data collected from a UAV during operations at
Julia Creek in July 2009 and August 2010.
1. Introduction
It is easy for humans to identify different types of objects in a visual
environment; however, robust and efficient vision-based object recognition is a
challenging problem in the field of robotics and machine vision. Progress has
been made in past decades and different algorithms have been applied in
various applications. Examples include manufacturing control tasks such as bin
picking [Perkins, 1978], [Agrawal et al., 2010] and sorting [Grimson, 1991],
remote sensing object detection [Binford, 1989], [Mundy, 1990],
pedestrian detection [Gavrila, 2000] and vehicle detection [Betke et al., 2000].
This paper concentrates on remote sensing object recognition in the context of
aerial surveying. Various methods have been proposed for detection and
recognition of objects from aerial images [Thompson and Mundy, 1987],
[Brooks et al., 1981]; however, these techniques mostly deal with well-defined
man-made objects whose visual signature can be accurately modelled. Unlike
man-made objects, natural objects such as trees are less uniform and difficult
to generalise with geometric models. A related problem exists in the field of
forest research, where various tree counting algorithms have been proposed to
detect densely populated tree species with well-defined outlines [Pollock, 1996],
[Larsen and Rudemo, 1997], [Olofsson et al., 2006].
An image frame is a two-dimensional projection of the real world from the
camera point of view; this loss of one dimension of the real world is one of the
limiting factors in computer vision using monocular cameras. Computer vision
techniques have been developed to recover the lost dimension using
shading with photometric stereo [Brooks, 1989], and using shadow, by recognising
that the shadow of an object is a projection from the point of view of the
light source. In this paper we study the effect of utilising this extra shadow
projection to add an extra dimension to the otherwise two-dimensional image.
Similar approaches have been used in object recognition and bin picking by
casting shadows [Agrawal et al., 2010] and in the use of ray-traced templates to find
individual trees in aerial photographs [Larsen and Rudemo, 1997].
The aim of this paper is to track or map the distribution of objects in an outdoor
unstructured environment. For outdoor robotic applications, object recognition
has proven challenging due to the structural complexity of natural objects. To
deal with this complexity, this paper introduces a new method for remote sensing
object detection and recognition for aerial surveying of vegetation. The
technique models tree observations using a geometric model as well as colour,
texture and shadow information to detect and identify individual trees.
The proposed algorithm is divided into two stages. The first stage is a basic
image segmentation using colour and texture information: colour and texture
features are selected from the original image, and Support
Vector Machines (SVM) [Burges, 1998], [Vapnik, 2000] are trained to distinguish between
background, trees and shadows in the image. In the second stage, a target
template is generated using prior information. To quantify shape information of
the target template, it is necessary to construct an object geometric model.
This model is then used to produce an appearance template model. Based on
the navigation solution, the position and attitude of the platform and the camera
are known. Combining this information with a solar position model, the ideal
appearance of the object with shading and shadow can be predicted. The
relative position of the target object and its corresponding shadow is treated as
context information and can be used as supporting evidence for detections. The
proposed algorithm utilises different levels of vision features including low level
features such as colour and luminance, intermediate level features such as
shape and textured regions and high level features such as context information.
Low level vision assigns labels to every pixel, whereas high level vision is
responsible for labelling discrete objects. Different levels of vision features are
used in the proposed algorithm to extract as much information as possible from
the information-rich vision data.
2. Related Work
vision feature of luminance and also intermediate level feature of shape.
Tree recognition in aerial images of forest based on a synthetic tree crown image
model was first proposed by [Pollock, 1996]. This algorithm was then expanded
to include shadow [Larsen and Rudemo, 1997], and further improvements and
implementations to discriminate tree species were shown in [Olofsson et al.,
2006].
3. Algorithm
Object recognition is the identification of certain target objects inside an image
frame and is part of a more general pattern recognition problem. Pattern
recognition can be performed using either a priori knowledge or using statistical
information learned from the data. In this object recognition application both
approaches are used. Patterns such as colour and texture which are not
obvious to the human eye and difficult to model directly are learned statistically
from the data set, whereas patterns such as shape, scale and orientation of the
target object can be modelled directly using a priori knowledge.
The object recognition algorithm is divided into two modules: an image
segmentation module that takes the colour and texture information into account
and a model-object matching module that takes the shape, scale and context
information into account. The two-stage approach breaks down the otherwise
difficult vision problem into manageable sub-problems. It also allows
each module to be evaluated separately, so that future
improvements can be made independently, and allows us to generalise
the algorithm to different applications with similar structure by re-learning the
statistical models and using other prior knowledge.
After feature extraction, the colour and texture features are grouped into a
single feature vector consisting of three colour channels and thirty texture
channels. Each feature vector is then assigned a label representing its
class. The aim is to segment the original image into three classes:
object, shadow and background. An SVM is used as the classifier to predict the
labels of the feature vectors.
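As a rough illustration of this classification step, the sketch below trains a one-vs-rest linear classifier with a hinge loss on labelled feature vectors and then labels new vectors. It is a simplified stand-in and not the SVM implementation used in this work: there is no regularisation term and no kernel, and the function names, class names and toy feature values are assumptions made for illustration. In the setting described above each vector would hold 33 values (three colour and thirty texture channels).

```python
from typing import Dict, List, Sequence

# Class names are illustrative labels for the three segmentation classes.
CLASSES = ("background", "object", "shadow")


def train_one_vs_rest(features: Sequence[Sequence[float]],
                      labels: Sequence[str],
                      epochs: int = 500) -> Dict[str, List[float]]:
    """Train one linear hinge-loss classifier per class (bias stored last)."""
    dim = len(features[0])
    weights = {name: [0.0] * (dim + 1) for name in CLASSES}
    for name in CLASSES:
        w = weights[name]
        for _ in range(epochs):
            updated = False
            for x, label in zip(features, labels):
                target = 1.0 if label == name else -1.0
                augmented = list(x) + [1.0]  # constant 1 carries the bias
                score = sum(wi * xi for wi, xi in zip(w, augmented))
                if target * score < 1.0:  # inside the margin: sub-gradient step
                    for i, xi in enumerate(augmented):
                        w[i] += target * xi
                    updated = True
            if not updated:  # every training sample already has margin >= 1
                break
    return weights


def predict(weights: Dict[str, List[float]], x: Sequence[float]) -> str:
    """Return the class whose linear score is highest for feature vector x."""
    augmented = list(x) + [1.0]
    return max(CLASSES, key=lambda name: sum(
        wi * xi for wi, xi in zip(weights[name], augmented)))


# Toy two-dimensional features (hypothetical values, not from the data set).
toy_features = [[0.9, 0.1], [0.85, 0.15], [0.1, 0.9],
                [0.15, 0.85], [0.1, 0.1], [0.15, 0.1]]
toy_labels = ["background", "background", "object",
              "object", "shadow", "shadow"]
toy_weights = train_one_vs_rest(toy_features, toy_labels)
```

Applying predict to the feature vector of every pixel would yield a class-label image of the kind shown in Figure 1.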
Figure 1: Image Segmentation: The classifier converts the original colour images into
images with meaningful class labels.
the exact shape of a natural object, therefore a simple geometric shape or a
combination of geometric shapes can be used as a good approximation. In this
paper the algorithm is applied to trees of ellipsoidal appearance; therefore an
ellipsoid is used to approximate the shape.
In addition to the object model which provides the shape prior information, the
shading and shadow cast by the object can provide extra shape information.
More importantly, the shading and shadow orientation actually provide context
information. The orientation of shading and shadow can be predicted using a
solar model and knowledge of the vehicle pose. Any potential object detections
with the wrong shading and shadow orientation can thus be rejected.
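To make the idea of an object-plus-shadow template concrete, the following sketch draws a crude two-dimensional template: an elliptical crown at the centre and a copy of the same ellipse displaced by a given shadow offset. This is a strong simplification of rendering an ellipsoid with shading and a projected shadow, and the function name, label values and parameters are hypothetical choices made for illustration only.

```python
from typing import List

# Integer labels are an assumed encoding of the three classes.
BACKGROUND, OBJECT, SHADOW = 0, 1, 2


def render_template(size: int, radius_x: float, radius_y: float,
                    shadow_dx: int, shadow_dy: int) -> List[List[int]]:
    """Draw an elliptical crown at the template centre plus a displaced shadow.

    shadow_dx and shadow_dy are the shadow displacement in pixels along the
    column and row axes; they would come from the predicted shadow direction.
    """
    centre = size // 2

    def inside(row: int, col: int, c_row: int, c_col: int) -> bool:
        # Standard ellipse test around (c_row, c_col).
        return (((col - c_col) / radius_x) ** 2
                + ((row - c_row) / radius_y) ** 2) <= 1.0

    template = []
    for row in range(size):
        line = []
        for col in range(size):
            if inside(row, col, centre, centre):
                line.append(OBJECT)  # the crown hides any shadow beneath it
            elif inside(row, col, centre + shadow_dy, centre + shadow_dx):
                line.append(SHADOW)
            else:
                line.append(BACKGROUND)
        template.append(line)
    return template


# Example: a 21 x 21 template whose shadow falls 5 pixels to the right.
example_template = render_template(21, 3.0, 3.0, 5, 0)
```

A candidate region whose shadow lies on the opposite side of the crown would match such a template poorly, which is the sense in which shadow orientation acts as context information.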
A sun path model is used to estimate the position of the sun in the sky at
defined times and locations where images are collected. The model returns a
light vector which represents the orientation of the incident light from the sun.
The image time stamps are used in the sun path model to predict the exact
position of the sun in the sky; combining this with the platform position allows
prediction of the direction of the shadow. The platform pose is used to estimate
the camera pose. Combined with the solar model we are able to predict the
shadow position with respect to the target object in each image frame.
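The geometry of this prediction can be sketched as follows, assuming that the sun path model has already supplied a solar azimuth (measured clockwise from north) and an elevation above the horizon, that the ground is flat, and that the camera looks straight down with the top of the image aligned with the vehicle heading. These conventions, and all function names, are assumptions made for illustration; the sun path model itself is not reproduced here.

```python
import math
from typing import Tuple


def light_vector(azimuth_deg: float,
                 elevation_deg: float) -> Tuple[float, float, float]:
    """Unit vector of the incident sunlight in a local east-north-up frame."""
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    to_sun = (math.sin(az) * math.cos(el),
              math.cos(az) * math.cos(el),
              math.sin(el))
    # Incident light travels from the sun towards the ground.
    return (-to_sun[0], -to_sun[1], -to_sun[2])


def shadow_offset(height_m: float, azimuth_deg: float,
                  elevation_deg: float) -> Tuple[float, float]:
    """East and north ground offset of the shadow of a point at height_m."""
    if elevation_deg <= 0.0:
        raise ValueError("the sun must be above the horizon")
    az = math.radians(azimuth_deg)
    length = height_m / math.tan(math.radians(elevation_deg))
    # The shadow falls on the side opposite to the sun.
    return (-length * math.sin(az), -length * math.cos(az))


def shadow_bearing_in_image(azimuth_deg: float, camera_yaw_deg: float) -> float:
    """Shadow direction in degrees clockwise from the top of the image,
    for a downward-looking camera whose image top points along the heading."""
    return (azimuth_deg + 180.0 - camera_yaw_deg) % 360.0
```

For example, with the sun due north at 45 degrees elevation, a point 2 m above the ground casts its shadow 2 m to the south; for a vehicle heading east, that shadow appears towards the right of the image.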
Figure 2: Algorithm Summary: During each observation both vision and pose
information are obtained. The vision information is processed using statistical learning
and simplified into three classes. The pose information and the solar model are used to
generate the object appearance model, which is then compressed into a prior template.
The outputs of both approaches are combined using correlation template
matching. A correlation map is produced at the end of the algorithm; the map indicates
the likelihood of the target object in image coordinates.
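The correlation step summarised in Figure 2 can be illustrated with a small zero-mean normalised cross-correlation routine. The exact correlation measure used in this work is not stated here, so this particular measure is an assumption, as are the function name and the encoding of the class-label image and template as grids of numbers.

```python
import math
from typing import List

Grid = List[List[float]]


def correlation_map(image: Grid, template: Grid) -> Grid:
    """Zero-mean normalised cross-correlation at every valid template position.

    Entry [r][c] of the result is the score for the template placed with its
    top-left corner at image row r, column c. Scores lie in [-1, 1].
    """
    t_rows, t_cols = len(template), len(template[0])
    t_vals = [v for row in template for v in row]
    t_mean = sum(t_vals) / len(t_vals)
    t_dev = [v - t_mean for v in t_vals]
    t_norm = math.sqrt(sum(d * d for d in t_dev))

    result = []
    for r in range(len(image) - t_rows + 1):
        out_row = []
        for c in range(len(image[0]) - t_cols + 1):
            # Flatten the patch in the same row-major order as the template.
            patch = [image[r + i][c + j]
                     for i in range(t_rows) for j in range(t_cols)]
            p_mean = sum(patch) / len(patch)
            p_dev = [v - p_mean for v in patch]
            p_norm = math.sqrt(sum(d * d for d in p_dev))
            denom = p_norm * t_norm
            if denom > 0.0:
                score = sum(p * t for p, t in zip(p_dev, t_dev)) / denom
            else:
                score = 0.0  # a constant patch or template carries no shape
            out_row.append(score)
        result.append(out_row)
    return result
```

Peaks in the resulting map mark image positions where the local pattern of class labels resembles the predicted template.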
Figure 3: Scaled down J3 Cub: This platform is used to collect aerial image data over
Julia Creek, Queensland, Australia. The mission area is flat with sparsely distributed
trees, which makes the data set well suited to testing the object detection algorithm.
The detection algorithm utilises the colour and texture information from the
segmentation stage, the shape information from the weak object and shadow
detectors and the context information from the strong detector.
In this data set the target objects were trees, which varied slightly in size
depending on environmental conditions. Three different template sizes were
defined according to prior knowledge of the typical size of the trees; these
templates were created to capture most targets within the size range. The
multi-scale correlation map is shown in Figure 4, where the responses to the large,
medium and small templates are colour coded red, green and blue
respectively.
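One possible way to assemble such a multi-scale map is sketched below. It assumes that the three correlation maps have already been brought to a common grid size and that responses are clamped to the range 0 to 1 for display; the function and scale names are hypothetical.

```python
from typing import List, Tuple

SCALE_NAMES = ("large", "medium", "small")


def combine_scales(large: List[List[float]],
                   medium: List[List[float]],
                   small: List[List[float]]
                   ) -> Tuple[List[List[Tuple[float, float, float]]],
                              List[List[str]]]:
    """Build an RGB composite and record the strongest scale at each pixel."""

    def clamp(value: float) -> float:
        return max(0.0, min(1.0, value))

    composite = []
    strongest = []
    for row_l, row_m, row_s in zip(large, medium, small):
        comp_row = []
        scale_row = []
        for resp in zip(row_l, row_m, row_s):
            # Red, green and blue carry the large, medium and small responses.
            comp_row.append((clamp(resp[0]), clamp(resp[1]), clamp(resp[2])))
            best_index = max(range(3), key=lambda i: resp[i])
            scale_row.append(SCALE_NAMES[best_index])
        composite.append(comp_row)
        strongest.append(scale_row)
    return composite, strongest
```

The second return value indicates which template size best explains each location, which could be used as a rough estimate of target size.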
To evaluate the overall performance of the algorithm, the correlation map was
thresholded to produce regions of interest. The centroid of each region was
calculated, and a correction vector was also calculated to compensate for the
difference between the centre of the template and the actual portion of the
geometric model. This process is shown in Figure 4.
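The thresholding and centroid steps can be sketched as below. The use of 4-connected regions, the function name and the way the correction vector is passed in are assumptions made for illustration; the threshold and correction values used in this work are not reproduced here.

```python
from collections import deque
from typing import List, Tuple


def detect_centroids(corr_map: List[List[float]], threshold: float,
                     correction: Tuple[float, float] = (0.0, 0.0)
                     ) -> List[Tuple[float, float]]:
    """Threshold a correlation map, group pixels into 4-connected regions and
    return each region centroid (row, col) shifted by the correction vector."""
    rows, cols = len(corr_map), len(corr_map[0])
    seen = [[False] * cols for _ in range(rows)]
    detections = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or corr_map[r][c] < threshold:
                continue
            # Breadth-first search collects one region of interest.
            queue = deque([(r, c)])
            seen[r][c] = True
            region = []
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and corr_map[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            mean_row = sum(p[0] for p in region) / len(region)
            mean_col = sum(p[1] for p in region) / len(region)
            detections.append((mean_row + correction[0],
                               mean_col + correction[1]))
    return detections
```

Each returned pair is a candidate object location in image coordinates, reported in the order in which the regions are encountered when scanning the map row by row.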
Figure 4: Detection Result: To evaluate the detection rate it is necessary to identify the
location of each detected object. Top left is the original image, top right is the
correlation map, bottom left is the thresholded correlation map with regions of interest,
and bottom right is the original image overlaid with the centroids of the corresponding
regions of interest.
The detection results from individual image frames were then transformed into
global coordinates using an onboard mapping system [Bryson et al., 2010] to
generate the global distribution of the vegetation. The result is shown in Figure
5. The overall sensitivity and specificity of the algorithm are both 80%.
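For reference, sensitivity and specificity follow their standard definitions in terms of true and false positives and negatives. The short sketch below states them in code; the function names are illustrative and no counts from the experiments are implied.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Fraction of actual targets that were detected: TP / (TP + FN)."""
    total = true_positives + false_negatives
    if total == 0:
        raise ValueError("no positive samples")
    return true_positives / total


def specificity(true_negatives: int, false_positives: int) -> float:
    """Fraction of non-targets correctly rejected: TN / (TN + FP)."""
    total = true_negatives + false_positives
    if total == 0:
        raise ValueError("no negative samples")
    return true_negatives / total
```

As a purely hypothetical example, 8 detected targets out of 10 would give a sensitivity of 0.8.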
Figure 5: Mapping over part of the mission area. White dots represent the detected
crowns.
Several future improvements can be made. Firstly, the algorithm
gets confused when target objects are very close together; because the
outline is an approximation rather than an exact model, it is not able to
distinguish whether the detection is one large object or multiple smaller objects.
This ambiguity problem is intrinsic to the detection algorithm because the
performance is restricted by the resolution of the data. It can potentially be
resolved by using an active sensing strategy in which the most ambiguous region
is selected for further surveying at a different resolution. This monocular image
detection algorithm could also be combined with a stereo vision algorithm to
provide extra depth information, although the depth estimate could be of poor
quality due to the small baseline. Depth could be used as an extra input to the
statistical learning algorithm. The algorithm could also be extended to solve
object recognition problems in other aerial imaging scenarios, such as
power-line surveying, traffic monitoring and wildlife population monitoring.
References
A. Agrawal, Y. Sun, J. Barnwell, and R. Raskar. Vision-guided Robot System
for Picking Objects by Casting Shadows. The International Journal of Robotics
Research, 2010.
M. Betke, E. Haritaoglu, and L.S. Davis. Real-time multiple vehicle detection
and tracking from a moving vehicle. Machine Vision and Applications, 12(2):69–
83, 2000.
T.O. Binford. Spatial understanding: the successor system. In Proceedings of a
workshop on Image understanding workshop, page 20. Morgan Kaufmann
Publishers Inc., 1989.
M.J. Brooks and B.K.P. Horn. Shape and source from shading. Shape from
shading, 1989.
R.A. Brooks. Symbolic reasoning among 3-D models and 2-D images. Artificial
Intelligence, 17:285–348, 1981.
M. Bryson, A. Reid, F. Ramos, and S. Sukkarieh. Airborne Vision-Based
Mapping and Classification of Large Farmland Environments. Journal of Field
Robotics, 27(5):632–655, 2010.
C.J.C. Burges. A tutorial on support vector machines for pattern recognition.
Data mining and knowledge discovery, 2(2):121–167, 1998.
M. Erikson. Segmentation of individual tree crowns in colour aerial photographs
using region growing supported by fuzzy rules. Canadian Journal of Forest
Research, 33(8):1557–1563, 2003.
M. Erikson and K. Olofsson. Comparison of three individual tree crown
detection methods. Machine Vision and Applications, 16(4):258–265, 2005.
D. Gavrila. Pedestrian detection from a moving vehicle. Computer Vision ECCV
2000, pages 37–49.
F.A. Gougeon. A crown-following approach to the automatic delineation of
individual tree crowns in high spatial resolution aerial images. Canadian Journal
of Remote Sensing, 21(3):274–284, 1995.
W.E.L. Grimson. Object recognition by computer: the role of geometric
constraints. 1991.
D.P. Huttenlocher. Three-dimensional recognition of solid objects from a two-
dimensional image. AITR-1045, 1988.
D.P. Huttenlocher and S. Ullman. Object recognition using alignment. In
Proceedings of the 1st International Conference on Computer Vision, pages
102–111, 1987.
A.K. Jain and F. Farrokhnia. Unsupervised texture segmentation using Gabor
filters. Pattern recognition, 24(12):1167–1186, 1991.
M. Larsen and M. Rudemo. Using ray-traced templates to find individual trees
in aerial photographs. In Proceedings of the Scandinavian Conference on
Image Analysis, volume 2, pages 1007–1014. Citeseer, 1997.
B.S. Manjunath and W.Y. Ma. Texture features for browsing and retrieval of
image data. Pattern Analysis and Machine Intelligence, IEEE Transactions on,
18(8):837–842, August 1996.
J. Mundy. Object recognition in the geometric era: A retrospective. Toward
Category-Level Object Recognition, pages 3–28, 2006.
J.L. Mundy and A.J. Heller. The evolution and testing of a model-based object
recognition system. In Computer Vision, 1990. Proceedings, Third International
Conference on, pages 268–282, 1990.
K. Olofsson, J. Wallerman, J. Holmgren, and H. Olsson. Tree species
discrimination using Z/I DMC imagery and template matching of single trees.
Scandinavian Journal of Forest Research, 21:106–110, 2006.
W.A. Perkins. A model-based vision system for industrial parts. IEEE
transactions on computers, pages 126–143, 1978.
R.J. Pollock. The automatic recognition of individual trees in aerial images of
forests based on a synthetic tree crown image model. The University of British
Columbia (Canada), 1996.
D. Thompson and J. Mundy. Three-dimensional model matching from an
unconstrained viewpoint. In 1987 IEEE International Conference on Robotics
and Automation. Proceedings, volume 4, 1987.
V.N. Vapnik. The nature of statistical learning theory. Springer Verlag, 2000.