Escolar Documentos
Profissional Documentos
Cultura Documentos
ABSTRACT
We offer in this article, a method for text extraction in images issued from city scenes. This method is used in the
French iTowns project (iTowns ANR project, 2008) to automatically enhance cartographic database by extracting text
from geolocalized pictures of town streets. This task is difficult as 1. text in this environment varies in shape, size,
color, orientation... 2. pictures may be blurred, as they are taken from a moving vehicle, and text may have perspective
deformations, 3. all pictures are taken outside with various objects that can lead to false positives and in unconstrained
conditions (especially light varies from one picture to the other). Then, we can not make the assumption on searched
text. The only supposition is that text is not handwritten. Our process is based on two main steps: a new segmentation
method based on morphological operator and a classification step based on a combination of multiple SVM classifiers.
The description of our process is given in this article. The efficiency of each step is measured and the global scheme is
illustrated on an example.
1 INTRODUCTION
199
CMRT09: Object Extraction for 3D City Models, Road Databases and Traffic Monitoring - Concepts, Algorithms, and Evaluation
¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯
Figure 3: On the left, function f and a set of 2 functions Figure 4: Result of eq. 4 (function s) on an edge and in
h1 and h2 . On the right, function k computed by toggle homogeneous noisy regions.
mapping.
3 TEXT SEGMENTATION
Figure 5: From left to right: 1. Original image, 2. Bina-
rization (function s from eq. 4), 3. Homogeneity constraint
Our segmentation step is based on a morphological oper- (eq. 5), 4. Filling in small homogeneous regions.
ator introduced by Serra (Serra, 1989): Toggle Mapping.
Toggle mapping is a generic operator which maps a func-
tion on a set of n functions: given a function f (defined
on Df ) and a set of n functions h1 , ..., hn , this operator Function s is then improved:
defines a new function k by (Fig. 3):
0 if |h1 (x) − h2 (x)| < cmin
∀x ∈ Df k(x) = hi (x); ∀j ∈ {1..n} 1 if |h1 (x) − h2 (x)| >= cmin
s(x) =
|f (x) − hi (x)| ≤ |f (x) − hj (x)|
(1)
& |h1 (x) − f (x)| < p ∗ |h2 (x) − f (x)|
2 otherwise
The result depends on the choice of the set of functions hi . (5)
A classical use of toggle mapping is contrast enhancement: Then, no boundary will be extracted within homogeneous
this is achieved by applying toggle mapping on an initial areas. s is a segmentation of f (notice that now we have 3
function f (an image) and a set of 2 functions h1 and h2 possible values instead of 2: a low value, a high value and
extensive and anti-extensive respectively. a value that represents homogeneous regions).
To segment a gray scale image f by the use of toggle To use this method efficiently, some parameters must be
mapping, we use a set of 2 functions h1 and h2 with h1 set up: the size of the structuring element used to com-
the morphological erosion of f and h2 the morphological pute a morphological erosion (h1 ) and a dilation (h2 ), the
dilatation of f . These two functions are computed by: minimal contrast cmin and an additional parameter p. Vari-
ations of p influence the thickness of detected structures.
∀x ∈ Df h1 (x) = min f (y); y ∈ v(x) (2) Getting three values in output instead of two can be dis-
∀x ∈ Df h2 (x) = max f (y); y ∈ v(x) (3) turbing. Many strategies can be applied to assign a value
to homogeneous regions (to determine whether the region
with v(x) a small neighborhood (the structuring element) belongs to low value areas or high value ones): if a region
of pixel x. Then, instead of taking the result of toggle is completely surrounded by pixels of the same value, the
mapping k (eq. 1), we keep the number of the function on whole region is assigned to this value. Another strategy
which we map the pixel. This leads us to define function consists in dilating all boundaries onto homogeneous re-
s: gions. In our case, this is not a real issue: as characters
are narrow, it is not common to have homogeneous regions
∀x ∈ Df s(x) = i; ∀j ∈ {1..2}|f (x)−hi (x)| ≤ |f (x)−hj (x)| inside characters and if it occurs, such regions are small.
(4) Then, our strategy consists in studying boundaries of small
Function s(x) takes two values and may be seen as a bi- homogeneous regions in order to fill a possible hole in
narization of image f with a local criterion (Fig. 4 left). characters. Bigger homogeneous regions are mostly left
Our function efficiently detects boundaries but may gener- unchanged, only a small dilation of these boundaries is per-
ate salt and pepper noise in homogeneous regions (Fig. 4 formed.
right): even very small local variations generate an edge. Illustration of the segmentation process is given in Fig-
To avoid this, we introduce a minimal contrast cmin and if ure 5. In the rest of the paper, this method is called Toggle
|h1 (x) − h2 (x)| < cmin , we do not analyse the pixel x. Mapping Morphological Segmentation (TMMS).
200
In: Stilla U, Rottensteiner F, Paparoditis N (Eds) CMRT09. IAPRS, Vol. XXXVIII, Part 3/W4 --- Paris, France, 3-4 September, 2009
¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯
4 FILTERING
5 PATTERN CLASSIFICATION
201
CMRT09: Object Extraction for 3D City Models, Road Databases and Traffic Monitoring - Concepts, Algorithms, and Evaluation
¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯
202
In: Stilla U, Rottensteiner F, Paparoditis N (Eds) CMRT09. IAPRS, Vol. XXXVIII, Part 3/W4 --- Paris, France, 3-4 September, 2009
¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯
203
CMRT09: Object Extraction for 3D City Models, Road Databases and Traffic Monitoring - Concepts, Algorithms, and Evaluation
¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯
204