Vision

Session 1: Introduction to Computer Vision
Departamento CCIA
http://www.rvg.ua.es/master/vision/2011/
Today
Introduction to the course
Introduction to computer vision
Lab work: Python, Scipy, Numpy, OpenCV
Introduction to the course
Syllabus
Theory
1. Introduction to computer vision and its applications
2. Image classification using histograms and features
3. Contour extraction and matching
4. Object tracking
5. 3D vision: camera model and calibration, stereo cameras
6. Matching and 3D reconstruction from point clouds
Lab work
1. Developing computer vision software with OpenCV, Scipy and Python
2. 3D vision with the Point Cloud Library (PCL)
Bibliography
Computer Vision: Algorithms and Applications, by Szeliski
Learning OpenCV
Python Scientific Lecture Notes
Software
Python + Numpy/Scipy + Matplotlib
OpenCV (Python interface)
PCL (Point Cloud Library)
Course slides

The course slides are based on materials available on the websites of the following courses
Svetlana Lazebnik, University of North Carolina at Chapel Hill, Computer Vision
William T. Freeman and Antonio Torralba, MIT, Advances in Computer Vision
Steven Seitz and Rick Szeliski, University of Washington, Computer Vision, Winter 2006 and Spring 2008
Derek Hoiem, University of Illinois, Computer Vision
Michael J. Black, Brown University, Introduction to Computer Vision
James Hays, Brown University, Introduction to Computer Vision
Assessment
50% Lab assignment 1
20% Lab assignment 2
30% Presentation of a project
Presentation of a paper on a practical application that uses one of the techniques covered in the course, going into the technical details of the implementation
It will take place during the lab hours on the last day of the course (Wednesday, 21 December)
Introduction to Computer Vision
What is computer vision?
The goal of computer vision is to develop programs capable of interpreting images and video and extracting information from them
Vision as a measurement device
[Figure panels: Real-time stereo; Structure from motion; Reconstruction from Internet photo collections. Image credits: NASA Mars Rover; Pollefeys et al.; Goesele et al. Source: Fei-Fei Li]
Vision as a source of semantic information
Objects
Activities
Scenes
Locations
Text / writing
Faces
Gestures
Motions
Emotions
[Figure: amusement park photo annotated with labels: amusement park, sky, The Wicked Twister, Cedar Point, Ferris wheel, ride, 12 E, Lake Erie, water, tree, people waiting in line, people sitting on ride, umbrellas, deck, bench, carousel, pedestrians, maxair]
Slide credit: Kristen Grauman (via Fei-Fei Li, Lecture 1, 23-Sep-11)
Relations to other fields
Biology
Psychology
Computer science
Mathematics
Physics
Engineering
[Figure: What is it related to? Computer Vision linked to Biology, Psychology, Neuroscience, Cognitive sciences, Engineering, Robotics, Speech, Computer Science (graphics, algorithms, system, theory), Information retrieval, Image processing, Machine learning, Physics, Maths. Source: Fei-Fei Li, Lecture 1, 23-Sep-11]
Why do research in computer vision?

Vision is useful
Applications in many fields of interest
The full problem does not have to be solved to build something useful
Vision is interesting
It allows for different approaches and techniques
Visual, appealing results
Vision is hard
Half of the cerebral cortex of primates is devoted to visual processing
The research community sets a high bar for publishing papers in conferences and journals
The brain interprets scenes
Kanizsa triangle
Kanizsa in 3D
Perception is ambiguous
Bottom line: perception is an inherently ambiguous problem
Many different 3D scenes could have given rise to a particular 2D picture
Sinha & Adelson 93
(Source: Fei-Fei Li, Lecture 1, 23-Sep-11)
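The ambiguity can be illustrated with a minimal sketch, assuming an ideal pinhole camera at the origin looking along the Z axis. The function name `project`, the focal length and the two squares are hypothetical choices made for this example; they do not come from the slides.

```python
import numpy as np

def project(points_3d, focal_length=1.0):
    """Ideal pinhole projection: (X, Y, Z) -> (f*X/Z, f*Y/Z)."""
    points_3d = np.asarray(points_3d, dtype=float)
    return focal_length * points_3d[:, :2] / points_3d[:, 2:3]

# A small square facing the camera at depth Z = 2 ...
near_square = np.array([[-1, -1, 2],
                        [ 1, -1, 2],
                        [ 1,  1, 2],
                        [-1,  1, 2]], dtype=float)
# ... and a square three times larger, three times farther away.
far_square = 3.0 * near_square

image_near = project(near_square)
image_far = project(far_square)

# Both 3D scenes produce exactly the same 2D image points.
same_image = np.allclose(image_near, image_far)
print(same_image)
```

Scaling a scene about the camera centre leaves every projected point unchanged, so a single image cannot tell the two squares apart without extra cues or prior knowledge.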
Origins of computer vision

Lawrence G. Roberts (Prince of Asturias Award 2002), Machine Perception of Three-Dimensional Solids, Ph.D. thesis, MIT Department of Electrical Engineering, 1963 (original PDF)
Brought to life in the lecture notes of the MIT course 6.869: Advances in Computer Vision (A Simple Vision System)
A quick history of computer vision

1966: Minsky assigns computer vision as an
undergrad summer project
1960s: interpretation of synthetic worlds
1970s: some progress on interpreting selected
images
1980s: ANNs come and go; shift toward geometry
and increased mathematical rigor
1990s: face recognition; statistical analysis in
vogue
2000s: broader recognition; large annotated
datasets available; video processing starts
Guzman 68
Ohta Kanade 78
Turk and Pentland 91
A more formal definition
We need a more formal definition, one that takes into account the physical, computational and mathematical aspects
What properties (cues) of the visual world can we extract or measure?
How can we use our (prior) knowledge about the world to interpret it?
[Figure labels: Light energy; Image plane; Inference process; Prediction process; Internal model of world; Object recognition; Update over time; Your answer (Michael J. Black)]
Image formation
Digital Images (2.3 in Szeliski)
World -> Camera -> Digitizer -> Digital Image
(CS143 Intro to Computer Vision, Michael J. Black)
Techniques and levels
Images
[Figure: Grayscale image shown as a grid of pixel intensity values, x = 58 to 72, y = 41 to 55; bright pixels have values around 200, dark pixels around 60]
How do we go from an array of numbers to recognizing fruit?
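As a minimal sketch of what "an array of numbers" means in practice, a grayscale image can be held in a Numpy array and indexed by row and column. The 4x5 patch below is hypothetical: its values only imitate the bright (around 200) and dark (around 60) intensities in the figure, and the threshold of 128 is an arbitrary choice for illustration.

```python
import numpy as np

# Hypothetical 4x5 patch of 8-bit intensities: bright background on the
# top rows, darker object on the bottom rows.
patch = np.array([
    [210, 206, 201, 216, 221],
    [204, 203, 192, 211, 224],
    [ 71,  56,  69,  57, 173],
    [ 54,  53,  52,  77,  46],
], dtype=np.uint8)

rows, cols = patch.shape      # image height and width
pixel = patch[2, 4]           # row (y) first, then column (x)
dark_mask = patch < 128       # boolean image: True where the pixel is dark
num_dark = int(dark_mask.sum())

print(rows, cols, pixel, num_dark)
```

Even this simple threshold already turns raw numbers into a rough "object versus background" map, which is the starting point for the low-level techniques that follow.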
Low level: images to images
[Figure: sharpening, blurring (CS143 Intro to Computer Vision, Michael J. Black)]
[Figure: original image, Canny (Linda Shapiro, Computer Vision, Washington Univ)]
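The low-level operations named on this slide can be sketched with plain Numpy, assuming a small synthetic image. The helper `filter3x3` is a hypothetical name written for this example, not a library function. The Sobel gradient shown here is only the kind of filter an edge detector such as Canny starts from; it is not the full Canny algorithm.

```python
import numpy as np

def filter3x3(image, kernel):
    """Correlate a 2D float image with a 3x3 kernel, replicating border pixels."""
    padded = np.pad(image, 1, mode="edge")
    out = np.zeros_like(image, dtype=float)
    h, w = image.shape
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out

# Synthetic test image: dark left half, bright right half (a vertical step edge).
image = np.zeros((6, 6), dtype=float)
image[:, 3:] = 100.0

# Blurring: each pixel becomes the mean of its 3x3 neighbourhood.
box = np.full((3, 3), 1.0 / 9.0)
blurred = filter3x3(image, box)

# Sharpening by unsharp masking: add back the detail removed by the blur.
sharpened = image + (image - blurred)

# Horizontal gradient magnitude (Sobel): large only next to the step edge.
sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)
edges = np.abs(filter3x3(image, sobel_x))
```

All three operations take an image and return an image of the same size, which is what "images to images" refers to. In practice one would use the optimized routines in Scipy or OpenCV rather than an explicit loop.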
Techniques and levels
Mid level: images to features
edge image -> circular arcs and line segments -> data structure
original color image -> K-means clustering -> regions of homogeneous color
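The K-means step can be sketched as follows, assuming a tiny synthetic RGB image. The function `kmeans`, its deterministic initialization (darkest to brightest pixels) and the test image are hypothetical simplifications for this example; real implementations usually use random or k-means++ initialization and a convergence test.

```python
import numpy as np

def kmeans(points, k, iterations=10):
    """Plain K-means on an (N, D) float array; returns labels and centers."""
    # Simplified deterministic init: k pixels evenly spaced in brightness order.
    order = np.argsort(points.sum(axis=1))
    init_idx = order[np.linspace(0, len(points) - 1, k).astype(int)]
    centers = points[init_idx].copy()
    for _ in range(iterations):
        # Assign every point to its nearest center.
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each center to the mean of its assigned points.
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:
                centers[j] = members.mean(axis=0)
    return labels, centers

# Synthetic 4x6 color image: reddish left half, bluish right half,
# with a small deterministic variation added to every pixel.
image = np.zeros((4, 6, 3), dtype=float)
image[:, :3] = [200, 30, 30]
image[:, 3:] = [30, 30, 180]
image += np.arange(24).reshape(4, 6, 1) % 5

pixels = image.reshape(-1, 3)                   # one row per pixel
labels, centers = kmeans(pixels, k=2)
segmentation = labels.reshape(image.shape[:2])  # label map with the image layout
print(segmentation)
```

The label map groups pixels by color only; turning those groups into connected regions would need an additional step such as connected-component labeling.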
High level: processing of features
[Figure: Recovering 3D layout and context; object label: BED (Hoiem)]
Computer vision in action

3D modeling
Image from Microsoft's Virtual Earth (see also: Google Earth)
3D urban modeling: Microsoft Photosynth, http://labs.live.com/photosynth/ (Source: S. Seitz)
Photosynth.net, http://photosynth.net/, based on Photo Tourism by Noah Snavely, Steve Seitz, and Rick Szeliski (Fei-Fei Li, Lecture 1, 23-Sep-11)

3D from thousands of images
Building Rome in a Day: Agarwal et al. 2009
Character recognition
Digit recognition, AT&T labs, http://www.research.att.com/~yann/
License plate readers, http://en.wikipedia.org/wiki/Automatic_number_plate_recognition

Face detection and recognition
Digital cameras
Face recognition: Apple iPhoto software, http://www.apple.com/ilife/iphoto/ (Fei-Fei Li, Lecture 1, 23-Sep-11)
Special effects
Pirates of the Caribbean, Industrial Light and Magic
Lord of the Rings, WETA Digital

Biometric data from images (John Daugman, 2001, paper on iris-based recognition)
Security systems
Face recognition systems now beginning to appear more widely
Fingerprint scanners on many new laptops, other devices
http://www.sensiblevision.com/

Sports broadcasts
Object tracking
Sportvision first down line, http://www.sportvision.com/video.html
Nice explanation on www.howstuffworks.com
Interactive games: Kinect
[Figure labels: Object Recognition; 3D; Robot]

Place recognition on mobile devices
Point & Find, Nokia, Google Goggles
Autonomous driving (New York Times article)
NASA's Mars Exploration Rover Spirit captured this westward view from atop a low plateau where Spirit spent the closing months of 2007.
Vision systems (JPL) used for several tasks
Panorama stitching
3D terrain modeling
Obstacle detection, position tracking
For more, read "Computer Vision on Mars" by Matthies et al.
Mobile robots
http://www.robocup.org/
Kurt Konolige
Two research examples

Depth from a single image. Ashutosh Saxena, Sung H. Chung, Andrew Y. Ng, Make3D (IJCV paper, 2007)
http://make3d.cs.cornell.edu/
Bringing pictorial space to life. Antonio Criminisi, Martin Kemp, Andrew Zisserman (2002 paper, video)
[Figure: Image plane (wall, canvas); Reconstructed 3D structure; Vantage point (observer); Reprojection image]
Figure 25: The reprojection image is defined as the image created by projecting the computed three-dimensional reconstruction onto the plane of the painting.
Figure 22: Three-dimensional reconstruction of The Music Lesson. (a) A lady at the virginals with a gentleman (the music lesson), (1662-65), ... cm, by Johannes Vermeer (1632-1675). (b-f) Snapshots of a virtual fly-through inside the reconstructed painting to show different views of the room.
[Page excerpt: 5.2 Accuracy of three-dimensional reconstruction]
Current state of the art

We have seen examples of systems from the last few years
We will see more and more applications on an increasingly varied range of devices
Explosion of applications on mobile devices
The research area is very dynamic

To learn more about vision applications and companies:
David Lowe maintains an excellent page of vision companies and applications: http://www.cs.ubc.ca/spider/lowe/vision.html
Python, Scipy/Numpy, OpenCV
Scientific software
Problems with Matlab
Python: a programming language widely used in scientific fields
Scripting language
Interpreted
Dynamically typed
Rapid prototyping, not very efficient
Scientific computing modules in Python:
Numpy: arrays and matrix operations
Scipy: high-level libraries
Matplotlib: visualization, plots
Open source and cross-platform
Libraries that are widely used and well tested in the scientific community
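A minimal sketch of "arrays and matrix operations" in Numpy; the matrices and the linear system below are arbitrary values chosen for illustration.

```python
import numpy as np

a = np.array([[1.0, 2.0],
              [3.0, 4.0]])
b = np.array([[0.0, 1.0],
              [1.0, 0.0]])

elementwise = a * b             # element-by-element product
matrix_product = a @ b          # matrix product (here: swaps the columns of a)
transposed = a.T                # transpose
column_means = a.mean(axis=0)   # reduce along rows: one mean per column

# Solve the linear system a @ x = rhs for x.
rhs = np.array([5.0, 11.0])
solution = np.linalg.solve(a, rhs)

print(matrix_product)
print(solution)
```

Operations apply to whole arrays at once, so the per-element loops run inside compiled code; this is what makes Python usable for image processing despite the interpreter overhead.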
References
Numpy/Scipy documentation
Blog
OpenCV
History
Library initially developed by Intel
Currently maintained by the robotics company Willow Garage
Main features
Provides a well-tested, optimized open source library for image processing, vision and geometric operations
Written in C and C++, with a focus on a fast and portable implementation
Built for multiple platforms, including embedded platforms and GPUs
Wrappers for several languages, including Python
References
OpenCV wiki
OpenCV 2.1 Python Reference: reference manual for the Python interface, hosted by Willow Garage
References
Computer Vision: Algorithms and Applications, by Szeliski: ch. 1
Learning OpenCV: ch. 1 and 2
Python Scientific Lecture Notes: ch. 1
