3D Reconstruction: Standard Stereo Setup

Zoltan Kato: Computer Vision
Standard stereo setup

Image planes of cameras are pl parallel. Focal points are at same B height. Focal lengths same. epipolar pr lines are horizontal scan lines.
x
6. 3D Reconstruction
Computer Vision Zoltan Kato
http://www.inf.u-szeged.hu/~kato/
Cl
x
f
y
Cr
P( X , Y , Z )
Derive expression for Z as a function of xl , xr , f , B

2
Disparity and depth
Depth reconstruction
P(X,Y,Z)
Z xl f Cl B
adapted from D. Young
xr pl pr Cr
B + xr xl B = Z Z f B Z= f xl xr
Disparity: d=xl-xr
Z= f
Then given Z, we can compute X and Y. B is the stereo baseline d measures the difference in retinal position between corresponding points
3 4
B d
Stereo disparity example
Stereo disparity example
Disparity gets smaller with increasing depth

Left image Right image
courtesy of D. Young
Cameras have parallel optical axes, separated by horizontal translation (approximately)

5
courtesy of D. Young
Left and right edge images, superimposed

6
What If?
p p11
y y
x x
Two view (stereo) geometry
O1 O1
x x
z z
P = ( X ,Y , Z ) 1
f f
p2
p2
y
O2 O2
z
z
Geometric relations between two views are fully described by recovered 3X3 matrix F
7 8
Planar rectification
Brings two views into standard stereo setup
reproject image planes onto common plane parallel to line between optical centers Notice, only focal point of camera really matters (Seitz)
Image pair rectification

simplify stereo matching by warping the images Apply projective transformation so that epipolar lines correspond to horizontal scanlines
map epipole e to (1,0,0) try to minimize image distortion problem when epipole in (or close to) the image
1 0 = He 0
e e
10
Extracting Structure
The key aspect of epipolar geometry is its linear constraint on where a point in one image can be in the other By matching pixels (or features) along epipolar lines and measuring the disparity between them, we can construct a depth map (scene point depth is inversely proportional to disparity)
Stereo matching
What should be matched?
Pixels? Collections of pixels? Edges? Objects?
Correlation-based algorithms
Produce a DENSE set of correspondences
courtesy of P. Debevec
View 1
View 2
Computed depth map

11
Feature-based algorithms
Produce a SPARSE set of correspondences
12
Stereo matching: constraints

Photometric constraint Epipolar constraint (through rectification) Ordering constraint Uniqueness constraint Disparity limit Disparity continuity constraint
Photometric constraint
Photometric constraint
Assumption: Same world point has same intensity in both images Issues: noise, specularity, matching standalone pixels wont work
match windows centered around individual pixels. ?

= f g
14
13
Image Normalization
Even when the cameras are identical models, there can be differences in gain and sensitivity. The cameras do not see exactly the same surfaces, so their overall light levels can differ. For these reasons and more, it is a good idea to normalize the pixels in each window:
I= I
1 Wm ( x , y ) ( u ,v )Wm ( x , y )
Images as Vectors
Left Right
Unwrap image to form vector, using raster scan order wL
I (u, v) [ I (u, v)]

m
Average pixel
2
Wm ( x , y )
Window magnitude Normalized pixel

15
( u ,v )Wm ( x , y )
I ( x, y ) I I ( x, y ) = I I W ( x, y )
Each window is w L a vector in an m2 dimensional vector space. Normalization makes them unit length.
wR
m m
row 1
row 2
wL
row 3
16
Image Metrics
(Normalized) Sum of Squared Differences (SSD)
Correspondence via correlation
wR (d ) wL
CSSD (d ) =
L ( u ,v )Wm ( x , y )
[ I (u, v) I
2
Left
Right
(u d , v)]
= wL wR (d )
Normalized Correlation (NC)
scanline
C NC (d ) =
L ( u ,v )Wm ( x , y )
I (u, v) I
(u d , v)
Rectified images
2
SSD error
= wL wR (d ) = cos
17
disparity
d * = arg min d wL wR (d ) = arg max d wL wR (d )

18
Window size
Epipolar constraint
W=3
W = 20
Better results with adaptive window

T. Kanade and M. Okutomi, A Stereo Matching Algorithm with an Adaptive Window: Theory and Experiment,, Proc. International Conference on Robotics and Automation, 1991. D. Scharstein and R. Szeliski. Stereo matching with nonlinear diffusion. International Journal of Computer Vision, 28(2):155-174, July 1998
19
Match pixels along corresponding epipolar lines (1D search)
(Seitz)
20
Ordering constraint
order of points in two images is usually the same
surface slice
Ordering constraint
Occluded Pixels Left scanline
Right scanline Left scanline Right scanline Disoccluded Pixels
Match Match Occlusion Match
Three cases:
Sequential cost of match Occluded cost of no match Disoccluded cost of no match
Disocclusion
21 22
Surface as a path
surface slice
6 5 4 occlusion left
Uniqueness constraint
surface as a path
In an image pair each pixel has at most one corresponding pixel

In general one corresponding pixel In case of occlusion/disocclusion there is none
1 2 3 4,5 6
1 2,3 4 5 6
2,3 1 1
occlusion right 2 3 4,5 6
23
24
Disparity constraint
Range of expected scene depths guides maximum possible disparity.
Search only a segment of epipolar line surface slice
bounding box
constant disparity surfaces
Disparity continuity constraint

Assume piecewise continuous surface piecewise continuous disparity
In general disparity changes continuously discontinuities at occluding boundaries
surface as a path
ba nd
use reconsructed features to determine bounding box

25 26
di sp ar ity
Unfortunately, this makes the problem 2D again. Solved with a host of graph algorithms, Markov Random Fields, Belief Propagation, .
Stereo matching
Constraints
epipolar ordering uniqueness disparity limit disparity gradient limit
Similarity measure (SSD or NC)
Disparity map
image I(x,y) Disparity map D(x,y) image I(x,y)
Trade-off
Matching cost (data) Discontinuities (prior)
Optimal path (dynamic programming )
(x,y)=(x+D(x,y),y)
(Cox et al. CVGIP96; Koch96; Falkenhagen97; Van Meerbergen,Vergauwen,Pollefeys,VanGool IJCV02)
27 28
Feature-based Methods
Conceptually very similar to Correlation-based methods, but:
They only search for correspondences of a sparse set of image features. Correspondences are given by the most similar feature pairs. Similarity measure must be adapted to the type of feature used.
Features commonly used

Corners
Similarity measured in terms of
surrounding gray values (SSD, NC) location
Edges, Lines
Similarity measured in terms of:

29
orientation contrast coordinates of edge or lines midpoint length of line

30
Example: Comparing lines

ll and lr: line lengths l and r: line orientations (xl,yl) and (xr,yr): midpoints cl and cr: average contrast along lines l m c : weights controlling influence
Correspondence By Features
LEFT IMAGE corner line
structure
The more similar the lines, the larger S is!

31 32
Correspondence By Features
RIGHT IMAGE corner line
Computing Correspondence
Which method is better?
Edges tend to fail in dense texture (outdoors) Correlation tends to fail in smooth featureless areas
structure
Both methods fail for smooth surfaces
Search in the right image the disparity (dx, dy) is the displacement when the similarity measure is maximum
33 34
Three geometric questions

1. Correspondence geometry: Given an image point x in the first view, how does this constrain the position of the corresponding point x in the second image? 2. Camera geometry (motion): Given a set of corresponding image points {xi xi}, i=1,,n, what are the camera matrixes P and P for the two views? 3. Scene geometry (structure): Given corresponding image points xi xi and cameras P, P, what is the position of (their pre-image) X in space?
35
Terminology
Point correspondences: xixi Original scene (pre-image): Xi Projective, affine, similarity reconstruction =
reconstruction that is identical to original up to projective, affine, similarity transformation Literature: Metric and Euclidean reconstruction = similarity reconstruction
36
Computing Structure
Recall that canonical camera matrices P, P can be computed from fundamental matrix F Triangulation: Back-projection of rays from image points x, x to 3-D point of intersection X such that x=PX and x=PX
E.g. P=[I|0] and P=[[e]xF|e],
Reconstruction ambiguity: similarity
-1 x i = PXi = PHS (HSXi )

T T PHS1 = K[R | t]R' - R' t' = K[RR'T | RR'T t'+t] 0
from Hartley & Zisserman
37
38
Reconstruction ambiguity: projective
The projective reconstruction theorem

If a set of point correspondences in two views determine the fundamental matrix uniquely, then the scene and cameras may be reconstructed from these correspondences alone, and any two such reconstructions from these correspondences are projectively equivalent
xi x i P2 = P1H-1
(P1 , P1 ' , {X1i }) (P2 , P2 ' , {X2i })

P2 = P1H-1 X2 = HX1
P2 (HX1i ) = P1H-1HX1i = P1 X1i = xi = P2 X2i

along same ray of
(except : Fxi = xi F = 0)
P2, idem for P2
x i = PXi = PH
-1 P
)(H X )
P i
39
two possibilities: X2i=HX1i, or points along baseline key result: allows reconstruction from pair of uncalibrated images
40
Triangulation: Issues
Errors in
points x, x & F such that xTFx=0 or
Point reconstruction
X such that x=PX and x=PX

This means that rays are skew they dont intersect
x = PX x'= P' X
41
42
Linear triangulation
x = PX x'= P' X
x P' X = 0
3T 2T 1T
Geometric error
Reconstruct matches in projective frame by minimizing the reprojection error
AX = 0
xp 3T p1T yp 3T p 2T A = 3T 1T x' p' p' 3T 2T y ' p' p'
invariance?
( ) ( ) y (p X ) (p X ) = 0 x(p X ) y (p X ) = 0
x p 3T X p1T X = 0
2T
d (x1 , P1X ) + d (x 2 , P2 X )
2
Non-iterative optimal solution (see Hartley&Sturm,CVIU97)
homogeneous
X =1
inhomogeneous
(AH-1 )(HX) = e
algebraic error yes, constraint no (except for affine)
43 44
( X , Y , Z ,1)
Optimal Projective-Invariant Triangulation: Reprojection Error Pick that exactly satisfies camera geometry so that and , and which minimizes Can use as error function for nonlinear minimization on two views
Polynomial solution exists
Covariance of Structure Recovery

Bigger angle between rays Less uncertainty
Cant triangulate points on baseline (epipoles) because rays intersect along entire length
46
45
Line reconstruction
Projective Reconstruction Ambiguity
l T P L= T l' P' LX = 0 for a point X on line L
doesnt work for epipolar plane

Two views from which

47
hence
F and P, P are computed
Reconstructions related by a 4 x 4 projection H

48
Hierarchy of transformations
Less ambiguity
Stratified Reconstruction
Idea: Try to upgrade reconstruction to differ from the truth by a less ambiguous transformation Use additional constraints imposed by:
Scene:
known 3-D points (no 4 coplanar) Euclidean reconstruction Identify parallel, orthogonal lines in scene
Metric/
Camera calibration: Known K, K reconstruction Camera motion: Known R, t
Metric/similarity
Properties of transformations (2-D)
49
50
Projective
Affine Upgrade
Projective
Affine Upgrade
I | 0 H= T
Identify plane at infinity (in the true coordinate frame, = (0, 0, 0, 1)T)
E.g., intersection points of three sets of parallel lines define a plane E.g., if one camera is known to be affine
Then apply 4 x 4 transformation:
(P, P' , {Xi }) T T = ( A, B, C, D) a (0,0,0,1) T H-T = (0,0,0,1)
This is the 3-D analog of affine image rectification via the line at infinity l Things that can be computed/constructed with only affine ambiguity:
Midpoint of two points Centroid of group of points Lines parallel to other lines, planes
51 52
Translational motion
points at infinity are fixed for a pure translation
reconstruction of xi xi is on
Scene constraints
Parallel lines
parallel lines intersect at infinity reconstruction of corresponding vanishing point yields point on plane at infinity 3 sets of parallel lines allow to uniquely determine
F = [e] = [e' ]
P = [I | 0] P'= [I | e' ]
Remarks:
in presence of noise determining the intersection of parallel lines is a delicate problem obtaining vanishing point in one image can be sufficient
53
54
Affine Reconstruction Ambiguity
Affine
Metric Upgrade
Identify absolute conic on via image of absolute conic (IAC)

From scene
E.g., orthogonal lines
From known camera calibration

Completely constrained: = Partially constrained:
Affine reconstructions Zero skew Square pixels
K-T K-1
Same camera took all images (e.g. moving camera)

55 56
The infinity homography

P = [M | m] P'= [M'| m']
H = M' M-1
Affine to metric
~ T X = X,0
( )
Unchanged under affine transformations
~ x = MX
~ x' = M' X
P = [M | m]A a = [MA | Ma + m] 0 1 H = M' AA-1M-1
Identify absolute conic transform so that : X 2 + Y 2 + Z 2 = 0, on then projective transformation relating original and reconstruction is * a similarity transformation * projection in practice, find (image of constraints absolute conic)
back-projects to cone that intersects in
P = [I | 0] P = [H | e]
57
affine reconstruction:
58
Affine to metric
Given
Same camera for all images
P = [M | m]
Same intrinsics absolute conic
same image of the
For example: moving camera
possible transformation from affine to metric is
A-1 0 H= 0 1
Given sufficient images there is in general only one conic that projects to the same image in all images, i.e. the absolute conic
This approach is called self-calibration
AA = M M
T
(cholesky factorisation)
Transfer of IAC:
' = H H
59
-T
-1
60
Direct metric reconstruction using Approach 1

calibrated reconstruction
Direct metric reconstruction using ground truth Use control points XEi with know coordinates to go directly from projective to metric
Need 5 points (no 4 coplanar) 2 linear eq. in H-1 per view, 3 for two views)
= K -TK -1 K
Approach 2
compute projective reconstruction back-project from both images intersection defines and its support plane
X Ei = HXi x i = PH-1X Ei
61
62
Metric Reconstruction Example
Metric Reconstruction
Objective: Given two uncalibrated images compute (PM,PM,{XMi}) (i.e. within similarity of original scene and cameras) Algorithm 1. Compute projective reconstruction (P,P,{Xi})
1. Compute F from xixI 2. Compute P, P from F 3. Triangulate Xi from xixI
2. Rectify reconstruction from projective to metric

1. Direct method: compute H from control points
X Ei = HXi PM = PH-1 PM = PH-1 X Mi = HXi

2. Stratified method:
1. Affine reconstruction: compute
I|0 H=
-1 H = A 0 AAT = MT M 1 0 1
Original views
Synthesized views of reconstruction
2. Metric reconstruction: compute IAC
Only overall scale ambiguity remainsi.e., what are units of length?

63
64
Reconstruction summary
Image information provided View relations and projective objects 3-space objects reconstruction ambiguity
point correspondences point correspondences including vanishing points
projective
F,H F,H ,
affine
Points correspondences and internal camera calibration
metric
65

3D Reconstruction: Standard Stereo Setup

Enviado por

Dados do documento

Descrição original:

Título original

Direitos autorais

Formatos disponíveis

Compartilhar este documento

Compartilhar ou incorporar documento

Opções de compartilhamento

Você considera este documento útil?

Este conteúdo é inapropriado?

Direitos autorais:

Formatos disponíveis

3D Reconstruction: Standard Stereo Setup

Enviado por

Direitos autorais:

Formatos disponíveis

Zoltan Kato: Computer Vision

Standard stereo setup

Derive expression for Z as a function of xl , xr , f , B

Zoltan Kato: Computer Vision

Zoltan Kato: Computer Vision

Disparity and depth

Zoltan Kato: Computer Vision

Zoltan Kato: Computer Vision

Stereo disparity example

Stereo disparity example

Disparity gets smaller with increasing depth

Cameras have parallel optical axes, separated by horizontal translation (approximately)

Left and right edge images, superimposed

Zoltan Kato: Computer Vision

Zoltan Kato: Computer Vision

Two view (stereo) geometry

Zoltan Kato: Computer Vision

Zoltan Kato: Computer Vision

Image pair rectification

Zoltan Kato: Computer Vision

Zoltan Kato: Computer Vision

Computed depth map

Zoltan Kato: Computer Vision

Zoltan Kato: Computer Vision

Stereo matching: constraints

match windows centered around individual pixels. ?

Zoltan Kato: Computer Vision

Zoltan Kato: Computer Vision

Unwrap image to form vector, using raster scan order wL

I (u, v) [ I (u, v)]

Window magnitude Normalized pixel

Zoltan Kato: Computer Vision

Zoltan Kato: Computer Vision

Correspondence via correlation

d * = arg min d wL wR (d ) = arg max d wL wR (d )

Zoltan Kato: Computer Vision

Zoltan Kato: Computer Vision

Better results with adaptive window

Match pixels along corresponding epipolar lines (1D search)

Zoltan Kato: Computer Vision

Zoltan Kato: Computer Vision

Right scanline Left scanline Right scanline Disoccluded Pixels

Match Match Occlusion Match

Zoltan Kato: Computer Vision

Zoltan Kato: Computer Vision

In an image pair each pixel has at most one corresponding pixel

occlusion right 2 3 4,5 6

Zoltan Kato: Computer Vision

Zoltan Kato: Computer Vision

Disparity continuity constraint

use reconsructed features to determine bounding box

Zoltan Kato: Computer Vision

Zoltan Kato: Computer Vision

Optimal path (dynamic programming )

Zoltan Kato: Computer Vision

Zoltan Kato: Computer Vision

Features commonly used

orientation contrast coordinates of edge or lines midpoint length of line

Zoltan Kato: Computer Vision

Zoltan Kato: Computer Vision

Example: Comparing lines

The more similar the lines, the larger S is!