PUBLICATIONS ON GEODESY
NEW SERIES
VOLUME 8
NUMBER 1
THE GEOMETRY OF
GEODETIC INVERSE LINEAR MAPPING
AND NON-LINEAR ADJUSTMENT
by
P. J. G. TEUNISSEN
1985
RIJKSCOMMISSIE VOOR GEODESIE, THIJSSEWEG 11, DELFT, THE NETHERLANDS
SUMMARY

This publication discusses two main topics:

1° The problem of inverse linear mapping

and

2° The problem of non-linear adjustment.

After the introduction, which contains a motivation of our emphasis on geometric thinking, we commence in chapter II with the theory of inverse linear mapping. Amongst other things we show that every inverse B of a given linear map A can be uniquely characterized through the choice of three linear subspaces, denoted by S, C and D.
Chapter III elaborates on the consequences of the inverse linear mapping problem for planar, ellipsoidal and three dimensional geodetic networks. For various situations we construct sets of base vectors for the nullspace Nu(A) of the design map. The chapter is concluded with a discussion on the problem of connecting geodetic networks. We discuss, under fairly general assumptions concerning the admitted degrees of freedom of the networks involved, three alternative methods of connection.
Chapter IV makes a start with the problem of non-linear adjustment. We show that the local convergence behaviour of Gauss' iteration method will practically not be affected by line search strategies if both the least-squares residual vector and the extrinsic curvature are small enough. Next we discuss some conditions which assure global convergence of Gauss' method. Thereupon we show that for a particular class of manifolds, namely ruled surfaces, important simplifications of the non-linear least-squares problem are possible through dimensional reduction. Application of this idea made it possible to obtain an inversion-free solution of a non-linear variant of the classical two dimensional Helmert transformation. This non-linear variant has been called the Symmetric Helmert transformation. We also give an inversion-free solution of its two dimensional rotational invariant generalization. Furthermore we estimate some extrinsic curvatures in practice: we estimate the curvature of some simple 2-dimensional geodetic networks and we briefly discuss some of the consequences of non-linearity for the statistical treatment of an adjustment. Hereby it is also shown that the bias of the least-squares residual vector is determined by the mean curvature of the manifold and that the bias of the least-squares parameter estimator is determined by the trace of the Christoffel symbols of the second kind.
The chapter is concluded with a brief discussion of some problems which are still open for future research.
ACKNOWLEDGEMENTS

The author wishes to thank the Geodetic Institute of the Stuttgart University (FRG) for the facilities offered during the author's stay in Stuttgart.
Finally, special thanks go to Miss Janna Blotwijk for the excellent job she did in typing and preparing the final version of this publication.
CONTENTS

ACKNOWLEDGEMENTS . . . v
SUMMARY
I. INTRODUCTION . . . 1
II.
1. The Principles . . . 10
2. . . . 13
3. Injective and Surjective Maps . . . 18
4. Arbitrary Systems of Linear Equations and Arbitrary Inverses . . . 22
5. Some Common Type of Inverses and their Relation to the Subspaces S, C and D . . . 24
6. C- and S-Transformations . . . 30
III.
1. The Principles . . . 35
2. Geodetic Networks and their Degrees of Freedom . . . 36
2.1. Planar networks . . . 36
2.2. Ellipsoidal networks . . . 42
2.3. Three dimensional networks . . . 52
3. (Free) Networks and their Connection . . . 65
3.1. Types of networks considered . . . 65
3.2. Three alternatives . . . 68
IV.
1. Introduction . . . 84
2. A Brief Introduction into Riemannian Geometry . . . 87
3. Orthogonal Projection onto a Parametrized Space Curve . . . 91
3.1. Gauss' iteration method . . . 91
3.2. The Frenet frame . . . 92
3.3. The "Kissing" circle . . . 95
3.4. One dimensional Gauss- and Weingarten equations . . . 97
3.5. Local convergence behaviour of Gauss' iteration method . . . 98
3.6. Examples . . . 102
3.7. Conclusions . . . 109
4. Orthogonal Projection onto a Parametrized Submanifold . . . 110
4.1. Gauss' method . . . 110
4.2. The Gauss equation . . . 112
4.3. The normal field B . . . 116
4.4. The local rate of convergence . . . 118
4.5. Global convergence . . . 125
5. Supplements and Examples . . . 134
5.1. The two dimensional Helmert transformation . . . 134
5.2. Orthogonal projection onto a ruled surface . . . 139
5.3. The two dimensional Symmetric Helmert transformation . . . 141
5.4. . . . 145
5.5. . . . 148
5.6. The extrinsic curvatures estimated . . . 156
5.7. Some two dimensional networks . . . 163
6. Some Statistical Considerations . . . 166
7. Epilogue . . . 170
REFERENCES . . . 173
I. INTRODUCTION
This publication intends to be a contribution to the theory of geodetic adjustment. The two main topics discussed are

1° The problem of inverse linear mapping

and

2° The problem of non-linear adjustment.

In our discussion of these two problems there is a strong emphasis on geometric thinking as a means of visualizing and thereby improving our understanding of methods of adjustment. It is namely our belief that a geometric approach to adjustment renders a more general and simpler treatment of various aspects of adjustment theory possible. Thus it is possible to carry through quite rigorous trains of reasoning in geometrical terms without translating them into algebra. This gives a considerable economy both in thought and in communication of thought. It also enables us to recognize and understand more easily the basic notions and essential concepts involved. And most important, perhaps, is the fact that our geometrical imagery in two and three dimensions suggests results for more dimensions and offers us a powerful tool of inductive and creative reasoning. At the same time, when precise mathematical reasoning is required it will be carried out in terms of the theory of finite dimensional vector spaces. This theory may be regarded as a precise mathematical framework underlying the heuristic patterns of geometric thought.
In geodesy it is very common to use geometric reasoning. In fact, geodesy benefited considerably from the development of the study of differential geometry, which began very early in history. Practical tasks in cartography and geodesy caused and influenced the creation of the classical theory of surfaces (Gauss, 1827; Helmert, 1880). And differential geometry can now be said to constitute an essential part of the foundation of both mathematical and physical geodesy (Marussi, 1952; Hotine, 1969; Grafarend, 1973).
But it was not only in the development of geodetic models that geometry played such a pivotal rôle. Also in geodetic adjustment theory, adjustment was soon considered as a geometrical problem. Very early, (Tienstra, 1947; 1948; 1956) already advocated the use of the Ricci-calculus in adjustment theory. It permits a consistent geometrization of the adjustment of correlated observations. His approach was later followed by (Baarda, 1967a,b; 1969), (Kooimans, 1958) and many others.
More recently we witness a renewed interest in the geometrization of adjustment theory. See e.g. (Vanicek, 1979), (Eeg, 1982), (Meissl, 1982), (Blais, 1983) or (Blaha, 1984). The incentive to this renewed interest is probably due to the introduction into geodesy of the modern theory of Hilbert spaces with kernel functions (Krarup, 1969). As (Moritz, 1979) has put it rather plainly, this theory can be seen as an infinitely dimensional generalization of Tienstra's theory of correlated observations in its geometrical interpretation.
Probably the best motivation for taking a geometric standpoint in discussing adjustment problems in linear models is given by the following discussion, which emphasizes the geometric interplay between the observation space and its dual.

Let M denote the observation space, i.e. the linear vector space containing the random vector of observables y with mean E{y} and covariance map Q_y. The space M carries the inner product (.,.)_M defined by

  (y_1, y_2)_M = (y_1, Q_y^{-1} y_2),  for all y_1, y_2 ∈ M.

The linear vector space M* denotes the dual space of M and is defined as the set of all real-valued (homogeneous) linear functions y*: M → ℝ. Instead of writing y*(y_1) for each linear function y* ∈ M* and each y_1 ∈ M, one considers the bilinear duality pairing

  (.,.): M* × M → ℝ,  defined by  (y*, y_1) = y*(y_1),  ∀ y* ∈ M*, y_1 ∈ M.

We will assume that the mean of y lies in the linear manifold

  N̄ = {ȳ} + U,

where U is a linear subspace of M.
The problem of linear estimation can now be formulated as: given an observation y_s of the random vector y, its covariance map Q_y and the linear manifold N̄, estimate the position of ȳ in N̄ ⊂ M.
If we restrict ourselves to Best Linear Unbiased Estimation (BLUE), the problem of linear estimation takes the following form. Every linear estimator of (y*_s, ȳ) can be written as

  ŷ(y) = a + (ȳ*, y),  with a ∈ ℝ, ȳ* ∈ M*.

The set of all y* ∈ M* for which (y*, u) = 0 holds for all u ∈ U is a linear subspace of M*; it is called the annihilator of U ⊂ M and is denoted by U⁰. Unbiasedness, i.e.

  E{ŷ(y)} = (y*_s, ȳ)  for all ȳ ∈ {ȳ} + U,

then implies that ȳ* must satisfy

  ȳ* ∈ {y*_s} + U⁰.   (1.7)

Amongst all elements of {y*_s} + U⁰ we look for the one which minimizes the variance

  Variance{ŷ(y)} = (ȳ*, Q_y ȳ*),   (1.8)

and it will be intuitively clear that the minimizing ȳ* is found by orthogonally projecting y*_s onto the orthogonal complement (U⁰)⊥ of U⁰ (see figure 1).

figure 1

Now, before we characterize the map which maps y*_s into ȳ*, we first need some results on linear maps.
Let A: N → M be a linear map from the parameter space N into the observation space M. The image of a subspace U ⊂ N under A is the subspace

  A U = {y ∈ M | y = A x for some x ∈ U}.

For U = N we obtain the range space R(A) = A N of A. In the special case that we take the inverse image of {0} ⊂ M we obtain the nullspace Nu(A) of A. It is easily verified that the map A is injective or one-to-one if x_1 ≠ x_2 implies A x_1 ≠ A x_2, i.e. if Nu(A) = {0}; the map A is surjective or onto if A N = M.

To every 1-form y* ∈ M*, the map A assigns the composition y* ∘ A. Since the composition y* ∘ A is a linear function on N, i.e. an element of N*, we see that the map A induces another linear map A*: M* → N*, defined by A* y* = y* ∘ A. For the inverse image of a subspace V ⊂ M under A, we have the duality relation

  (A^{-1}(V))⁰ = A*(V⁰).

For V = {0} the relation reduces to Nu(A)⁰ = R(A*).

Now let N = U ⊕ V, with "⊕" denoting the direct sum. For every x ∈ N we have the unique decomposition

  x = x_U + x_V,  with x_U ∈ U, x_V ∈ V and N = U ⊕ V.

We can now define a linear map P: N → N through

  P x = x_U,

the projector which projects onto U and along V; it is denoted by P_{U,V} (see figure 2).

figure 2
If P projects onto U and along V then I − P, with I the identity map, projects onto V and along U. Thus

  I − P_{U,V} = P_{V,U}.   (1.14)

It is easily verified that the dual P* of a projector P is again a projector, operating on the dual space. Its range space and nullspace follow from R(P*) = Nu(P)⁰ and Nu(P*) = R(P)⁰. Thus,

  (P_{U,V})* = P_{V⁰,U⁰}  and  (I − P_{U,V})* = (P_{V,U})* = P_{U⁰,V⁰}.

Finally we mention that one can check whether a linear map is a projector by verifying whether the iterated operator coincides with the operator itself (idempotence).
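The construction of a projector from a pair of complementary subspaces, and the idempotence test just mentioned, can be sketched numerically. The bases below are hypothetical choices of complementary subspaces of ℝ³ (numpy is assumed):

```python
import numpy as np

# Hypothetical bases for two complementary subspaces U and V of R^3
# (U a plane, V a line not contained in U); any such choice works.
U = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])          # columns span U
V = np.array([[1.0],
              [1.0],
              [1.0]])               # column spans V

# The projector onto U and along V: express x on the basis (U : V)
# of N = U ⊕ V and keep only the U-part.
T = np.hstack([U, V])               # basis of N, invertible
P = np.hstack([U, np.zeros_like(V)]) @ np.linalg.inv(T)

I = np.eye(3)
assert np.allclose(P @ P, P)                   # idempotence: P is a projector
assert np.allclose((I - P) @ (I - P), I - P)   # I - P projects onto V along U
assert np.allclose(P @ U, U)                   # P acts as the identity on U
assert np.allclose(P @ V, 0)                   # P annihilates V
```

Note that P here is an oblique (non-orthogonal) projector; orthogonality would correspond to the special choice V = U⊥.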
Now let us return to the point where we left our BLUE's problem. We noted that ȳ* could be found by orthogonally projecting y*_s onto (U⁰)⊥, i.e.

  ȳ* = P_{(U⁰)⊥,U⁰} y*_s.

It follows then that the linear function ŷ(y) is the unique BLUE's estimator of (y*_s, ȳ). For the estimator of ȳ itself one finds

  ŷ = y_1 + P_{U,U⊥}(y − y_1),

where y_1 is an arbitrary element of N̄ and U⊥ is the orthogonal complement of U with respect to the metric (.,.)_M of M. This estimator solves the dual problem

  minimize (y − ȳ, y − ȳ)_M over ȳ ∈ N̄,   (1.20)

i.e. the least-squares problem (see figure 3).

figure 3

Thus we have recovered the existing duality between BLUE's estimation and least-squares estimation. We minimize a sum of squares (1.20) and emerge with an optimum estimator, namely one which minimizes another sum of squares (1.8), the variance. From the geometrical viewpoint this arises simply from the duality between the so-called observation space M and estimator space M*.
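The orthogonal-projection picture behind this duality can be sketched numerically. All matrices below are hypothetical, and the metric of the observation space is taken as the inverse of the covariance map, as above:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical linear model: E{y} lies in U = R(A), covariance map Qy.
A = rng.normal(size=(5, 2))          # full column rank (with prob. 1)
L = rng.normal(size=(5, 5))
Qy = L @ L.T + 5 * np.eye(5)         # a positive definite covariance map
ys = rng.normal(size=5)              # one observation sample

W = np.linalg.inv(Qy)                # metric of the observation space M

# Least-squares: orthogonally project ys onto U in the (.,.)_M metric.
xhat = np.linalg.solve(A.T @ W @ A, A.T @ W @ ys)
yhat = A @ xhat                      # = P ys with P = A (A'WA)^{-1} A'W

P = A @ np.linalg.solve(A.T @ W @ A, A.T @ W)
assert np.allclose(P @ P, P)                 # P is a projector
assert np.allclose(P @ A, A)                 # identity on U = R(A)
# BLUE/least-squares duality: the residual is W-orthogonal to U
assert np.allclose(A.T @ W @ (ys - yhat), 0)
```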
Note that in the above no coordinate representation was needed. The linear manifold N̄ can be given explicitly as a subset of the observation space M (in Tienstra's terminology this would be "standard problem II") or implicitly by a set of linear constraints ("standard problem I"). Even a mixed representation is possible. Consequently, in general we have that if a coordinate representation is needed one can take the one which seems to be the most appropriate. That is, the use of a convenient basis rather than a basis fixed at the outset is a good illustration of the fact that coordinate-free does not mean freedom from coordinates so much as it means freedom to choose the appropriate coordinates for the task at hand.
With respect to our first topic, note that a direct consequence of the coordinate-free formulation is that the difficulties are evaded which might possibly occur when a non-injective linear map A is used to specify the linear model. This indicates that the actual problem of inverse linear mapping should not be considered to constitute an essential part of the problem of adjustment. That is, in the context of BLUE's estimation it is insignificant which pre-image of the estimate is taken. This point is, however, still not generally agreed upon. The usually merely algebraic approach taken often makes one omit to distinguish between the actual adjustment problem and the actual inverse mapping problem. As a consequence, published studies in the geodetic literature dealing with the theory of inverse linear mapping often bypass, in our view, the essential concepts involved. We have therefore tried to present an alternative approach; one that is based on the idea that once the causes of the general inverse mapping problem are classified, the problem of inverse linear mapping itself is also solved. Our approach starts from the identification of the basic subspaces involved and next shows that the problem of inverse linear mapping can be reduced to a few essentials.
As to our second topic, that of non-linear adjustment, note that the Gauss-Markov theorem formulates a lot of "ifs" before it states why least-squares should be used: in particular, it assumes that the mean of y lies in a linear manifold. In non-linear models, however, the mean of y lies in a manifold which will be curved in general. Hence, strictly speaking, the Gauss-Markov theorem does not apply anymore in the non-linear case. And consequently one might question whether the excessive use of the theorem in the geodetic literature for theoretical developments is justifiable in all cases.
Since almost all functional relations in our geodetic models are non-linear, one may be surprised to realize how little attention the complicated problem area of non-linear geodetic adjustment has received. One has used and is still predominantly using the ideas, concepts and results from the theory of linear estimation. Of course, one may argue that probably most non-linear models are only moderately non-linear and thus permit the use of a linear(ized) model. This is true. However, it in no way releases us from the obligation of actually proving whether a linear(ized) model is sufficient as approximation. What we need therefore is knowledge of how non-linearity manifests itself at the various stages of adjustment. Here we agree with (Kubik, 1967), who points out that a general theoretical and practical investigation into the various aspects of non-linear adjustment is still lacking.
In the geodetic literature we know of only a few publications in which non-linear adjustment problems are discussed. In the papers by (Pope, 1972), (Stark and Mikhail, 1973), (Pope, 1974) and (Celmins, 1981; 1982) some pitfalls to be avoided when applying variable transformations or when updating and re-evaluating function values in an iteration procedure are discussed. And in (Kubik, 1967) and (Kelley and Thompson, 1978) a brief review is given of some iteration methods. An investigation into the various effects of non-linearity was started in (Baarda, 1967a,b), (Alberda, 1969), (Grafarend, 1970) and more recently in (Krarup, 1982a). (Alberda, 1969) discusses the effect of non-linearity on the misclosures of condition equations when a linear least-squares estimator is used, and illustrates the matter with a quadrilateral. A similar discussion can be found in (Baarda, 1967b), where also an expression is derived for the bias in the estimators. (Grafarend, 1970) discusses a case where the circular normal distribution should replace the ordinary normal distribution. And finally, (Baarda, 1967a) and (Krarup, 1982a) exemplify the effect of non-linearity with the aid of a circular model.
Although we accentuate some different and new aspects of non-linear adjustment, our contribution to the problem of non-linear geodetic adjustment should be seen as a continuation of the work done by the above mentioned authors. We must admit though that unfortunately we do not have a cut and dried answer to all questions. We do hope, however, that our discussion of non-linear adjustment will make one more susceptible to the intrinsic difficulties of non-linear adjustment and that the problem will receive more attention than it has received hitherto.
The plan of this publication is the following:
In chapter II we consider the geometry of inverse linear mapping. We will show that every inverse B of a linear map A can be uniquely characterized through the choice of three subspaces S, C and D. Furthermore, each of these three subspaces has an interesting interpretation of its own. In order to facilitate reference the basic results are summarized in table 1.
In chapter III we start by showing the consequences of the inverse mapping problem for 2- and 3-dimensional geodetic networks. This part is easy-going, since the planar case has to some extent already been treated elsewhere in the geodetic literature. The second part of this chapter presents a discussion of the problem, almost omnipresent in geodesy, of connecting geodetic networks.
Finally, chapter IV makes a start with the problem of non-linear adjustment. A differential geometric approach is used throughout. We discuss Gauss' method in some detail and show how the extrinsic curvatures of the submanifold come into play.

II.

1. The principles
Many problems in physical science involve the estimation or computation of a number of unknown parameters which bear a linear (or linearized) relationship to a set of experimental data. The data may be contaminated by (systematic or random) errors, insufficient to determine the unknowns, redundant, or all of the above; consequently, questions of existence, uniqueness, stability, approximation and the physical description of the set of solutions are all of interest.
In econometrics for instance (see e.g. ...), the problem of determining the unknown parameters from the observations is known as the "identification problem". And in geophysics, where the physical interpretation of an anomalous gravitational field involves deduction of the mass distribution which produces the anomalous field, there is a fundamental non-uniqueness in potential field inversion, such that, for instance, even complete, perfect data on the earth's surface cannot distinguish between two buried spherical density anomalies having the same anomalous mass but different radii (see e.g. ...).
Also in geodesy similar problems can be recognized. The fact that the data are generally only measured at discrete points leaves one in physical geodesy, for instance, with the problem of determining a continuous unknown function from a finite set of data (see e.g. ..., 1982). Also the non-uniqueness in coordinate-system definitions makes itself felt when identifying, interpreting, qualifying and comparing results from geodetic network adjustments (see e.g. Baarda, 1973). The problem of connecting geodetic networks, which will be studied in chapter three, is a prime example in this respect.
All the above mentioned problems are very similar and even formally equivalent if they are described in terms of a possibly inconsistent and under-determined linear system

  y = A x,   (1.1)

where the linear map A maps the n-dimensional parameter space N into the m-dimensional observation space M.
The system is certainly consistent if the rank of A, which is defined as rank A = dim R(A) = r, equals the dimension of M. In this case namely the range space R(A) equals M and therefore y ∈ M = R(A). In all other cases, r < dim M, consistency is no longer guaranteed, since it would be a mere coincidence if the given vector y ∈ M lies in the smaller dimensioned subspace R(A) ⊂ M. Consistency is thus guaranteed if

  y ∈ R(A) = Nu(A*)⁰.
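The consistency condition y ∈ R(A) can be tested numerically by a rank comparison: appending y to the columns of A must not increase the rank. A small hypothetical example:

```python
import numpy as np

# y = Ax is solvable iff y ∈ R(A), i.e. iff rank(A : y) = rank A.
A = np.array([[1., 0.],
              [0., 1.],
              [1., 1.]])                  # rank 2 < dim M = 3
y_in  = np.array([1., 2., 3.])            # = A(1,2): lies in R(A)
y_out = np.array([1., 2., 4.])            # does not lie in R(A)

r = np.linalg.matrix_rank(A)
assert np.linalg.matrix_rank(np.column_stack([A, y_in]))  == r      # consistent
assert np.linalg.matrix_rank(np.column_stack([A, y_out])) == r + 1  # inconsistent
```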
Assuming consistency, the next question one might ask is whether the solution of (1.1) is unique or not, i.e. whether there exist two solutions x_1 and x_2 with x_2 ≠ x_1. If not, the system is said to be under-determined. The solution is only unique if the rank of A equals the dimension of its domain space N, i.e. if r = dim R(A) = n. With m = dim M and n = dim N, it follows from the above considerations that the system (1.1) is uniquely solvable for every y ∈ M if and only if r = m = n, i.e. if and only if the map A is bijective. And if r = m = n, we know that a unique inverse map B of the bijective map A exists, with the properties

  B A = I  and  A B = I.   (1.2)
For non-bijective maps A, however, in general no map B can be found for which (1.2) holds. For such maps therefore a more relaxed type of inverse property is used. Guided by the idea that an inverse-like map B should solve any consistent system, that is, that map B should furnish for each y ∈ R(A) some solution x = B y of (1.1), one requires that

  A B y = y,  ∀ y ∈ R(A).   (1.3)

Maps B: M → N which satisfy this relaxed type of inverse condition are now called generalized inverses of A. Their study in geodesy started with the pioneering work of Bjerhammar (Bjerhammar, 1951), who defined a generalized inverse for rectangular matrices. Since (..., 1955) the literature on generalized inverses has grown considerably.
Consider now the general case where rank A = r < min(m,n), so that the nullspace Nu(A) ⊂ N and the range space R(A) ⊂ M both are proper subspaces. That is,

  dim R(A) = rank A  and  dim Nu(A) = n − rank A.

figure 5 (N: parameter space; M: observation space)

Now, just like there are many ways in which a basis of a subspace can be extended to a basis which generates the whole space, there are many ways to extend the subspaces Nu(A) ⊂ N and R(A) ⊂ M to the full spaces N and M. Let us choose two arbitrary subspaces, say S ⊂ N and C⁰ ⊂ M, such that

  N = S ⊕ Nu(A)  and  M = R(A) ⊕ C⁰.

figure 6 (N: parameter space; M: observation space; dim C⁰ = m − rank A)

The complementarity of S and Nu(A) then implies that the subspace S has a dimension which equals that of R(A), i.e. dim S = dim R(A). The restriction A|_S of A to S is therefore a bijective map between S and R(A), and the restriction of B to R(A) can become the inverse of A|_S:

  B A|_S = I_S  and  A|_S B|_{R(A)} = I_{R(A)}.   (1.4)

figure 7

The inverse-like properties (1.4) are thus the ones which replace (1.2) in the general case of rank A = r < min(m,n). The second equation of (1.4) can be rephrased as ABA = A, and therefore constitutes the classical definition of a generalized inverse of A. The first equation of (1.4) states that

  B A x = x,  ∀ x ∈ S.   (1.5)

In the next section we will prove what is already intuitively clear, namely that equation (1.5) is equivalent to the classical definition (1.3), and therefore (1.5) can just as well be used as a definition of a generalized inverse. In fact, (1.5) has the advantage over (1.3) that it clearly shows why generalized inverses are not unique: the image of S under A is namely only a proper subspace of M. To find a particular map B which satisfies (1.5), we therefore need to specify its failing basis values.

2.

In this section we will follow our lead that a map is only uniquely determined once its basis values are completely specified.
As said, the usual way to define generalized inverses B of A is by requiring

  A B A = A.   (2.1)

This expression, however, is not a very illuminating one, since it does not tell us what generalized inverses of A look like or how they can be computed. We will therefore rewrite expression (2.1) in such a form that it becomes relatively easy to understand the mapping characteristics of B. This is done by the following theorem:

Theorem: A B A = A holds if and only if

1° For some unique subspace S ⊂ N complementary to Nu(A), B A x = x, ∀ x ∈ S, holds;
2° The map A B is a projector which projects onto the range space R(A).

Proof of 1°
(⇒) From premultiplying ABA = A with B follows BABA = BA. The map BA is thus idempotent and therefore a projector. Denote its range space by S = R(BA); then BAx = x, ∀ x ∈ S. Furthermore, ABA = A implies that Nu(BA) = Nu(A) and hence that N = S ⊕ Nu(A). We therefore have BA = P_{S,Nu(A)}.
(⇐) Conversely, if BAx = x, ∀ x ∈ S, with S complementary to Nu(A), then BA = P_{S,Nu(A)}, or finally, after premultiplication with A,

  A B A x = A x,  ∀ x ∈ N.

Proof of 2°
We omit the proof since it is straightforward.
The above theorem thus makes precise what was already made intuitively clear in section one. There are two important points which are put forward by the theorem. First of all, it states that to every generalized inverse B there belongs a unique subspace S ⊂ N which satisfies BAx = x, ∀ x ∈ S. Since

  R(A) = A N = {y ∈ M | y = A x for some x ∈ N}
            = {y ∈ M | y = A x for some x = x_S + x_{Nu(A)}, x_S ∈ S, x_{Nu(A)} ∈ Nu(A)}
            = {y ∈ M | y = A x_S for some x_S ∈ S}
            = A S,

this implies that a generalized inverse B of A maps the subspace R(A) ⊂ M onto a subspace S ⊂ N complementary to Nu(A). Map B therefore determines a one-to-one relation between R(A) and S, and is injective when restricted to the subspace R(A).
A second point that should be noted about the theorem is that it gives a way of constructing arbitrary generalized inverses of A. To see this, observe first that B is fixed on R(A) by

  B A x = x,  ∀ x ∈ S.   (2.2)

Since R(A) = A N = A S, expression (2.2) therefore determines B on the subspace R(A) only. To determine B on the whole of M, choose a subspace C⁰ ⊂ M complementary to R(A), i.e.

  M = R(A) ⊕ C⁰,

with basis vectors c_ρ, ρ = 1, ..., (m−r), and prescribe how B maps C⁰, say:

  B c_ρ = d_ρ,  ρ = 1, ..., (m−r).   (2.3)

The image of C⁰ under B is then the subspace D = B C⁰ ⊂ N. Part 2° of the theorem says that A B is a projector, projecting onto the range space R(A) and along C⁰; this requires that A B c_ρ = A d_ρ = 0, i.e. that

  D ⊂ Nu(A).

Summarizing, an arbitrary generalized inverse B of A is characterized by

  B (A S) = S  and  B C⁰ = D,   (2.5)

with

  N = S ⊕ Nu(A),  M = R(A) ⊕ C⁰  and  D ⊂ Nu(A).

A few things are depicted in figure 8.

figure 8 (N: parameter space; M: observation space; dim S = rank A; dim C⁰ = m − rank A)
Our objective of finding a unique representation of an arbitrary generalized inverse B of A can now be reached in a very simple way indeed. The only thing we have to do is to combine (2.2) and (2.3). If we collect the coordinate expressions of the base vectors of S, C⁰ and D into the matrices S (n×r), C⁰ (m×(m−r)) and D (n×(m−r)), then (2.2) and (2.3) read in matrix form

  B (A S : C⁰) = (S : D).
  n×m  m×m        n×m

Now, since the subspaces R(A) = A S and C⁰ are complementary, the square matrix (A S : C⁰) has full rank and is thus invertible. The unique representation of a particular generalized inverse B of A therefore becomes

  B = (S : D)(A S : C⁰)^{-1}.   (2.7)
  n×m  n×m   m×m

Substituting the coordinate expression of C⁰ into (2.7) gives a second, equivalent form (2.8). With (2.7) or (2.8) we thus have found one expression which covers all the generalized inverses of A. Furthermore we have the important result that each particular generalized inverse of A, defined through (2.2) and (2.3), is uniquely characterized by the choice of the three subspaces: S complementary to Nu(A), C⁰ complementary to R(A), and D ⊂ Nu(A).
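As a numerical sketch of representation (2.7), the matrices below are hypothetical choices of S, C⁰ and D satisfying the stated complementarity conditions (numpy assumed):

```python
import numpy as np

# A rank-deficient map A: N = R^3 -> M = R^3 with rank A = r = 2.
A = np.array([[1., 0., 0.],
              [0., 1., 0.],
              [0., 0., 0.]])

# Hypothetical choices of the three characterizing subspaces:
S  = np.array([[1., 0.], [0., 1.], [1., 1.]])   # S ⊕ Nu(A) = N
C0 = np.array([[1.], [1.], [1.]])               # R(A) ⊕ C⁰ = M
D  = np.array([[0.], [0.], [2.]])               # D ⊂ Nu(A) = span{e3}

# Unique representation (2.7): B = (S : D)(AS : C⁰)^{-1}
B = np.hstack([S, D]) @ np.linalg.inv(np.hstack([A @ S, C0]))

assert np.allclose(A @ B @ A, A)          # classical definition ABA = A
assert np.allclose(B @ A @ S, S)          # BA is the identity on S
assert np.allclose(B @ C0, D)             # C⁰ is mapped onto D
assert np.linalg.matrix_rank(B) == 2 + 1  # rank B = rank A + dim D
```

Varying S, C⁰ and D within the stated conditions sweeps out all generalized inverses of this A.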
In the next two sections we will give the interpretation associated with the three subspaces S, C⁰ and D. Also the relation with the problem of solving an arbitrary system of linear equations will become clear then.
Note that the relations

  B A x = x, ∀ x ∈ S,  and  A B y = y, ∀ y ∈ R(A),

hold for any arbitrary generalized inverse B of A. That is, the maps BA and AB behave like identity maps on respectively the subspaces S ⊂ N and R(A) ⊂ M. If rank A = r = n, the generalized inverses of A become left-inverses, since then BA = I. And similarly they become right-inverses if rank A = r = m, because then AB = I holds.

3. Injective and surjective maps

In order to give an interpretation of the subspace S ⊂ N, consider first the case that rank A = r = m, i.e. the case that A is surjective. If rank A = r = m then R(A) = M and C⁰ = {0}. With (2.5) then also D = {0}, and the general representation (2.7) reduces to

  B = S (A S)^{-1},   (3.2)
  n×m     m×m

with N = S ⊕ Nu(A).

figure 9 (N: parameter space; M: observation space; dim S = dim R(A) = rank A = m; dim Nu(A) = n − m)

Thus the only subspaces which play a rôle in the inverses of surjective maps are the subspaces S complementary to Nu(A).
In order to find out how (3.2) is related to the problem of solving a system of linear equations

  y = A x,  with rank A = r = m,   (3.3)
  m×1  m×n n×1

note that such a system is always consistent, but under-determined if r < n. Its solution set can therefore be written as

  {x} = {x | x = B y + V α},
             n×1   n×(n−r) (n−r)×1

where the columns of the matrix V form a basis of Nu(A). By choosing a value for α, say α := α_1, we single out one particular solution

  x_1 = B y + V α_1.   (3.4)
  n×1

But this means that, since B y ∈ S and therefore (S⊥)^t B y = 0, the particular solution x_1 satisfies

  z_1 := (S⊥)^t x_1 = ((S⊥)^t V) α_1,   (3.5)
  (n−r)×1 (n−r)×n n×1

where the columns of S⊥ span the orthogonal complement of S. Since (3.5) and (3.3) together suffice to determine x_1, the solution of the uniquely solvable system

  (y ; z_1) = (A ; (S⊥)^t) x,   (3.6)
  (m+n−r)×1  (m+n−r)×n  n×1

with ";" denoting row-wise stacking, is precisely x_1:

  x_1 = (A ; (S⊥)^t)^{-1} (y ; z_1) = ( S(AS)^{-1} : V((S⊥)^t V)^{-1} ) (y ; z_1).   (3.7)

Thus we have recovered the rule that, in order to find a particular solution to (3.3), say x_1, we merely need to extend the system of linear equations from (3.3) to (3.6) by introducing additional equations such that the extended matrix becomes square and regular; the rows added determine, through S⊥, a subspace S complementary to Nu(A). Furthermore, the corresponding generalized inverse B = S(AS)^{-1} of A is obtainable from the inverse of this extended matrix.
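The rule can be sketched with a minimal hypothetical example (m = r = 1, n = 2; numpy assumed):

```python
import numpy as np

# A consistent but under-determined system: A is 1x2 with rank r = m = 1.
A = np.array([[1., 1.]])
y = np.array([3.])

S  = np.array([[1.], [0.]])    # S complementary to Nu(A) = span{(1,-1)}
Sp = np.array([[0.], [1.]])    # S⊥: columns span the orthogonal complement of S

# Extend the system rowwise with z1 = S⊥' x until the matrix is square:
z1 = np.array([1.])                          # an arbitrary choice
E  = np.vstack([A, Sp.T])                    # extended (m+n-r) x n matrix
x1 = np.linalg.solve(E, np.concatenate([y, z1]))

assert np.allclose(A @ x1, y)                # x1 solves the original system
# the generalized inverse B = S(AS)^{-1} sits in the extended inverse:
B = S @ np.linalg.inv(A @ S)
assert np.allclose(np.linalg.inv(E)[:, :1], B)
```

Different choices of z1 move x1 along the nullspace Nu(A), while the first column block of E^{-1} stays the same inverse B.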
figure 10 (N: parameter space; M: observation space; dim R(A) = rank A = r = n; M = R(A) ⊕ C⁰; dim C⁰ = m − n; N = S)
X
F o r t h e dual m a p A : M
X
+
sketched in figure 9 (see figure 11). Now, taking advantage of our result (3.2), w e find t h e general
matrix-representation of a n arbitrary generalized inverse B* of A* t o be
-1
t
t
B = C ( A C )
mxn mxn n x n
M*
N*
:e s t i m a t o r s p a c e
.
: co-parameter space
d irn. C =
rank A =
dim.S
= rank A
= r = n
d irn. Nu(A ) =
M = C
N* ,S
NU(A*)
figure 11
= ( C A)
with
M = R(A)
a C
Thus dual to our result (3.2), we find t h a t t h e only subspaces which play a r8le in t h e inverses of
injective maps, a r e t h e subspaces
CO
complementary t o R(A).
With the established duality relations it now also becomes easy to see how (3.8) is related to the problem of solving a generally inconsistent but otherwise uniquely determined system of linear equations

  y = A x,  with rank A = r = n.   (3.9)
  m×1  m×n n×1

Introduce additional unknowns α through

  y = (A : C⁰) (x ; α),   (3.10)
  m×1  m×m     m×1

where the columns of C⁰ span a subspace complementary to R(A). Premultiplication of (3.10) with C^t, where C is chosen such that C^t C⁰ = 0, shows that the x-part of the solution of (3.10) is precisely

  x_1 = (C^t A)^{-1} C^t y = B y.
  n×1

We therefore have recovered the dual rule that in order to find a particular solution to (3.9), we need to extend the system of linear equations from (3.9) to (3.10) by introducing additional unknowns such that the extended matrix

  (A : C⁰)
  m×n m×(m−r)

becomes square and regular. Furthermore the corresponding left-inverse of A is obtainable from the inverse of this extended matrix.
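A minimal numerical sketch of this dual rule (hypothetical numbers; C is chosen such that C^t C⁰ = 0):

```python
import numpy as np

# An inconsistent but otherwise uniquely determined system: rank A = r = n = 1.
A = np.array([[1.], [1.], [1.]])          # 3x1, injective
y = np.array([1., 2., 6.])                # y ∉ R(A): the system is inconsistent

C0 = np.array([[1., 0.],
               [0., 1.],
               [0., 0.]])                 # C⁰ complementary to R(A)
C  = np.array([[0.], [0.], [1.]])         # chosen so that C' C⁰ = 0

# Left-inverse-like map (3.8): B = (C'A)^{-1} C'
B = np.linalg.inv(C.T @ A) @ C.T

# Extend the system columnwise with additional unknowns α:
E   = np.hstack([A, C0])                  # square and regular
sol = np.linalg.solve(E, y)               # sol = (x1, α1)

assert np.allclose(sol[:1], B @ y)        # the x-part is reproduced by B
assert np.allclose(B @ A, np.eye(1))      # B is a left-inverse of A
```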
In this section we thus saw how the underdeterminability of a consistent but underdetermined system of linear equations was removed by extending the matrix A (m×n) rowwise. And here the rôle of the subspace S ⊂ N complementary to Nu(A) in removing the underdeterminability was demonstrated. Similarly we saw how consistency of an inconsistent, but otherwise uniquely determined system of linear equations was restored by extending the matrix A (m×n) columnwise. And here the subspace C⁰ ⊂ M complementary to R(A) played the decisive role. We also observed a complete duality between these results; for the dual of an injective map is surjective and vice versa.
These results are, however, still not general enough. In particular we note that the subspace D did not enter the discussion at all. The reason for this will become clear if we consider the interpretation associated with the subspace D. Since S ∩ D = {0} and dim. S = dim. R(A) = r, we have

  rank B = rank A + dim. D ≥ rank A .

But this shows why the subspace D gets annihilated in case of injective and surjective maps: the left (right) inverses have namely the same rank as the injective (surjective) maps. From the above it also becomes clear that the rank of B is completely determined by the choice made for D. In particular B will have minimum rank if D is chosen to be D = {0}, and maximum rank,

  rank B = min.(m,n) ,

if one can choose D such that dim. D = min.(m,n) − r.
Consider now a general system of linear equations

   y  =  A   x  ,   with rank A = r < min.(m,n) ,    (4.1)
  m×1   m×n n×1

a system which is possibly inconsistent and underdetermined at the same time. From the rank-deficiency of A in (4.1) follows namely that not every y lies in R(A), and that even y ∈ R(A) does not determine x uniquely. Following the same approach as before, we can at once remove this underdeterminability by extending (4.1) rowwise to

  ( y )   (   A   )
  ( 0 ) = ( (V*)ᵗ ) x ,    (4.2)

where V* is an n×(n−r) matrix chosen such that (V*)ᵗV is regular, V being a basis matrix of Nu(A); the subspace S = Nu((V*)ᵗ) is then complementary to Nu(A).
But although the extended matrix of (4.2) has full column rank, the system can still be inconsistent. To remove possible inconsistency we therefore have to extend the matrix of (4.2) columnwise so that the resulting matrix becomes square and regular. Now since M = R(A) ⊕ C⁰, the following extension suggests itself:

  ( y )   (   A     C⊥ ) ( x )
  ( 0 ) = ( (V*)ᵗ   0  ) ( λ ) ,    (4.3)
 (m+n−r)×1  (m+n−r)×(m+n−r) (m+n−r)×1

with C⊥ an m×(m−r) basis matrix of C⁰, so that M = R(A) ⊕ C⁰ and N = S ⊕ Nu(A). The upper-left n×m block of the inverse of this extended matrix is then precisely a generalized inverse

  B = S ( CᵗAS )⁻¹ Cᵗ ,   with N = S ⊕ Nu(A), M = R(A) ⊕ C⁰ ,
 n×m

of A. Thus as a result we also recover our general representation of the generalized inverses of A,

  B = S ( CᵗAS )⁻¹ Cᵗ +   V      X     (V⁰)ᵗ ,   with X arbitrary,
                       n×(n−r) (n−r)×(m−r) (m−r)×m

with V a basis matrix of Nu(A) and V⁰ a basis matrix of Nu(A*); the arbitrary matrix X parametrizes the subspaces V' ⊂ Nu(A) and U' ⊂ Nu(A*) of the D-term.
This result then completes the circle. In section one namely, we started by describing the geometric principles behind inverse linear mapping. In section two these principles were made precise by the stated theorem. This theorem enabled us to find a unique representation concerning all generalized inverses B of a linear map A. In section three we then specialized to injective and surjective maps, showing the relation between the corresponding inverses and the solutions of the corresponding systems of linear equations. And finally this section generalized these results to arbitrary systems of linear equations, whereby our general expression of generalized inverses was again obtained.
With the established representation it becomes very simple indeed to derive most of the standard results which one can find in the many textbooks available. See e.g. (Rao and Mitra, 1971). Below we show which rôle is played by the three subspaces S, C and D in the more common types of inverses used:
- least-squares inverses

Let the observation space M be endowed with the metric (. , .)_M and let Q: M → M* denote the corresponding metric matrix. For x̂ = B y to be a least-squares solution of min_x ( y − A x , y − A x )_M,

  A B = P_{U,U⊥} ,   with U = R(A) ,    (5.2)

must hold, i.e. A B must be the orthogonal projector onto R(A). From (2.8) follows, however, that in general

  A B = A S ( CᵗAS )⁻¹ Cᵗ .
  m×m

And since

  A S ( CᵗAS )⁻¹ Cᵗ  C⊥  =   0        and   A S ( CᵗAS )⁻¹ Cᵗ A S = A S ,
  m×m            m×(m−r)  m×(m−r)           m×m              m×r    m×r

the map A B is the projector onto A S = R(A) along C⁰. Upon comparing this with (5.2) we thus conclude that least-squares inverses are obtained by choosing

   C⊥   =  Q⁻¹   U⊥   ,   with U = R(A) ,
 m×(m−r)  m×m  m×(m−r)

i.e. by choosing C⁰ to be the orthogonal complement of R(A) with respect to (. , .)_M.
- minimum norm inverses

For this case let the parameter space N be endowed with the metric (. , .)_N and let Q: N → N* denote the corresponding metric matrix. For x̂ = B y to be the minimum norm solution of

  min. ( x , x )_N   subject to   y = A x ,
   x

  B A = P_{S,Nu(A)} ,   with S = Nu(A)⊥ ,

must hold. With the same reasoning as above we then find that the minimum norm inverses are obtained by choosing

   S  = Q⁻¹  V⊥  ,   i.e.   S = Q⁻¹ R(A*) ,
  n×r  n×n  n×r

while C⁰ complementary to R(A) and the subspace D remain arbitrary.
- maximum- and minimum rank inverses

In the previous section we already indicated that by varying the choices for D one could manipulate the rank of the corresponding generalized inverse. Inverses with maximum rank min.(m,n) were obtained if one could choose D such that

  dim. D = min.(m,n) − r ,

while inverses of minimum rank, rank B₁ = r, correspond to the choice D = {0}. Such a minimum rank inverse B₁ is characterized by

  B₁ A x = x , ∀x ∈ S₁ ;   B₁ y = 0 , ∀y ∈ C⁰₁ ;
  with N = S₁ ⊕ Nu(A) , S₁ = B₁ R(A) , and M = A S₁ ⊕ Nu(B₁) , A S₁ = U = R(A) .    (5.9)
But the linear map A itself also satisfies similar conditions. For an arbitrary generalized inverse, B say, of A we have namely

  A B y = y , ∀y ∈ U ;   A x = 0 , ∀x ∈ Nu(A) ;
  with M = U ⊕ C⁰ , U = R(A) = A R(B) , and N = S ⊕ Nu(A) , S = B U .    (5.11)

Upon comparing (5.11) with (5.9) we therefore conclude that the linear map A is representable in a way similar to that of B₁ in (5.10), with U = R(A) and V = Nu(A).
Now, pre- and postmultiplication with the appropriate projectors transforms an arbitrary generalized inverse B into the prespecified minimum rank inverse B₁:

  B₁ = S₁ ( VᵗS₁ )⁻¹ Vᵗ  B  U ( C₁ᵗU )⁻¹ C₁ᵗ ,    (5.13)
 n×m        n×n            m×m

where V is a basis matrix with Nu(Vᵗ) = Nu(A) and C₁ a basis matrix with Nu(C₁ᵗ) = C⁰₁, so that the two factors are the projectors P_{S₁,Nu(A)} and P_{U,C⁰₁} respectively,
and this shows how to obtain a prespecified minimum rank inverse from any arbitrary generalized inverse of A. Because of the reciprocal character of minimum rank inverses this relation also works the other way round: A is namely again an inverse of its minimum rank inverses.
Since the minimum norm inverse of A is characterized by (5.7) and the least-squares inverse by

  A B = P_{U,U⊥} ,   with U = R(A) ,    (5.18)

it follows from the combination of (5.17) and (5.18), together with the transformation rule (5.13), that the minimum norm least-squares inverse is the minimum rank inverse whose subspaces S and C⁰ are fixed by the metric tensors of N and M. Note that since no freedom is left in choosing the three subspaces, the minimum norm least-squares inverse must be unique.
In the special case that both N and M are endowed with the ordinary canonical metric, the minimum norm least-squares inverse is commonly known as the pseudo-inverse.
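The characterization of the pseudo-inverse can be checked numerically via the four Moore-Penrose conditions; a minimal sketch, with a rank-deficient matrix A and a hand-computed candidate A⁺ (both invented for illustration):

```python
# Verify a candidate pseudo-inverse of a rank-1 matrix A via the four
# Moore-Penrose conditions; A and Ap are ad hoc examples.
def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def transpose(M):
    return [list(col) for col in zip(*M)]

A = [[1.0, 1.0], [1.0, 1.0], [0.0, 0.0]]      # rank 1: defect in both spaces
Ap = [[0.25, 0.25, 0.0], [0.25, 0.25, 0.0]]   # candidate pseudo-inverse A+

assert matmul(matmul(A, Ap), A) == A               # A A+ A = A
assert matmul(matmul(Ap, A), Ap) == Ap             # A+ A A+ = A+
assert transpose(matmul(A, Ap)) == matmul(A, Ap)   # A A+ symmetric
assert transpose(matmul(Ap, A)) == matmul(Ap, A)   # A+ A symmetric

# minimum norm least-squares solution of an inconsistent system y = A x:
y = [[1.0], [3.0], [5.0]]
print(matmul(Ap, y))   # [[1.0], [1.0]]
```

The symmetry of A A⁺ and A⁺ A reflects exactly the canonical-metric choices of C⁰ = R(A)⊥ and S = Nu(A)⊥ made above.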
- constrained inverses

In section 3 we restored determinability of x in y = A x (m×1 = m×n · n×1) by appending a minimum number of unknowns so that the prolonged matrix became square and regular. Sometimes, however, one can come across the situation where the system of linear equations

   y  =  A   x
  m×1   m×n n×1

is appended with the restrictions

  (T⊥)ᵗ x  =  c  ,   with q > n − r ,
   q×n   n×1  q×1

that is, with the restrictions that x should lie in a linear manifold parallel to a subspace T which is a proper subspace of an S complementary to Nu(A). Although this situation differs from the ones we considered so far, it can be handled just as easily. By the method of prolongation we namely get

  B = T ( CᵗAT )⁻¹ Cᵗ ,   with T ⊂ S , N = S ⊕ Nu(A) , M = A T ⊕ C⁰ ,
 n×m

where the matrix T(CᵗAT)⁻¹Cᵗ is known as a constrained inverse of A (see e.g. Rao and Mitra, 1971). Other types of constrained inverses can be obtained in a similar way.
To conclude this section we have summarized, in order to facilitate reference, the basic results in table 1.

table 1
6. C- and S-transformations

Now that we have found a geometric characterization of the inverse linear mapping problem, let us return to the linear estimation problem which was considered in chapter I.
Consider the linear model in which the mean ȳ of the observation point y_s is known to lie in the linear manifold

  ȳ ∈ { y₁ } + U ,   U = A N ⊂ M ,

for some y₁ ∈ M; that is, ȳ − y₁ ∈ U (see figure 12).

figure 12
An estimate of ȳ can thus be obtained by projecting y_s onto the linear manifold { y₁ } + U along an appropriate subspace. Choose therefore a subspace C ⊂ M and consider the projector which projects onto C along Nu(A*). With (6.2) then follows that the class of linear unbiased estimable functions of ( ȳ − y₁ ) is generated by

  P_{C,Nu(A*)} ( y_s − y₁ ) ,

where C ⊂ M is arbitrary provided that M = C ⊕ Nu(A*). Every such linear function is thus uniquely characterized by the choice made for C. And by varying the choices for C one varies the type of unbiased estimator. Since the projector P_{C,Nu(A*)} always appears (see figure 13), we will, in accordance with the current terminology, speak of C-transformations.

figure 13
A typical example in which a particular choice for C is made can be found in the method of averages due to T. Mayer (Whittaker and Robinson, 1944, p. 258). In this method, which is sometimes used for polynomial approximations (see e.g. Morduchow and Levin, 1959), C is chosen such that the equations of a linear system y = A x are grouped and summed groupwise, which furnishes as many equations as there are unknowns. Although more examples can be given, the most commonly applied estimator is of course the BLUE's estimator, in which the choice of C is governed by the covariance map of the observations. It is
It is
interesting t o note though, t h a t since every (oblique) projector can be interpreted as an orthogonal
projector w i t h respect t o an appropriate m e t r i c tensor, every unbiased estimator can be interpreted
as a BLUE's estimator w i t h respect t o an appropriate covariance map, a f a c t which was already
U=
with
R(A)
and
R(A),
M = R(A) e
CO,
CO'
Thus the problem o f comparing d i f f e r e n t unbiased estimators can i n principle be restricted t o the
problem of analyzing the e f f e c t of assumptions on the m e t r i c tensor. See e.g. (Krarup, 1972).
The estimator of ȳ which corresponds to the choice C thus reads

  ŷ^(c) = y₁ + P_{C,Nu(A*)} ( y_s − y₁ ) ,    (6.6)

or, for the projector onto R(A),

  ŷ^(c) = y₁ + P_{R(A),C⁰} ( y_s − y₁ ) ,   with M = R(A) ⊕ C⁰ ,

so that ŷ^(c) ∈ { y₁ } + R(A). And since the problem of removing inconsistency is in the above context of linear estimation essentially the problem of finding the estimate ŷ^(c), the actual adjustment problem is solved once ŷ^(c) is computed. What remains is the inverse mapping problem of finding a parameter representation of ŷ^(c), i.e. of finding its pre-images in the parameter space N.
From the transformation rule (5.13) and (6.7) it follows then that every inverse image of ŷ^(c) − y₁ under A can be written as

  x̂^(s) = P_{S,Nu(A)} B ( ŷ^(c) − y₁ ) ,   with N = S ⊕ Nu(A) ,

B being an arbitrary inverse of A. To understand what x̂^(s) estimates, one can trace the chain of duality relations involved (see figure 14): for every y* ∈ M*, ( P*_{C,Nu(A*)} y*, y_s − y₁ ) is a LUE of ( y*, ȳ − y₁ ); and for every x* ∈ N*, ( x*, x̂^(s) ) is a LUE of ( x*, P_{S,Nu(A)} x ). In other words, whatever arbitrary inverse B is used, x̂^(s) is an unbiased estimate of P_{S,Nu(A)} x.

figure 14
The transformation between the various inverse images of ŷ^(c) − y₁ under A is then given by

  x̂^(s₂) = P_{S₂,Nu(A)} x̂^(s₁) .    (6.10)

Such transformations are now known as S-transformations. They were first introduced by Baarda in the context of free networks and used to obtain an invariant precision description of geodetic networks (see e.g. Baarda, 1973; Molenaar, 1981; Van Mierlo, 1979; or Teunissen, 1984a). Baarda has used the term "S-transformation", since the projector P_{S,Nu(A)} is in case of geodetic networks derivable from the differential Similarity transformation. In the above general context, however, it would perhaps be more appropriate to call transformation (6.10) a Singularity transformation. This as opposed to the Consistency transformation (6.6).
Note the great resemblance between (6.6) and (6.10). From this comparison also follows the duality result that the C-transformations of A are the S-transformations of A*, and the S-transformations of A are the C-transformations of A*; or, the projector P_{C,Nu(A*)} is the S-transformation of A* and the projector P_{S,Nu(A)} is the C-transformation of A*.
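The action of an S-transformation can be made concrete with a minimal free-network example: a leveling network with a one dimensional height-datum defect, Nu(A) spanned by the all-ones vector (all numerical values invented):

```python
# S-transformation for a leveling network with a 1-D datum defect.
# x_s1: heights computed with point 1 held fixed at zero (invented values).
x_s1 = [0.0, 2.0, 5.0]
V = [1.0, 1.0, 1.0]          # basis vector of Nu(A)

def s_transform(x, fixed):
    """Re-datum x so that point `fixed` is held at zero:
    subtract the Nu(A)-component, x -> x - V*alpha."""
    alpha = x[fixed] / V[fixed]
    return [xi - vi * alpha for xi, vi in zip(x, V)]

x_s2 = s_transform(x_s1, fixed=1)    # re-datum to point 2
print(x_s2)                          # [-2.0, 0.0, 3.0]

# height differences (the estimable quantities) are invariant:
assert x_s1[2] - x_s1[1] == x_s2[2] - x_s2[1]
```

The subtraction of V·α is exactly the projector P_{S,Nu(A)} of (6.10) for the one dimensional nullspace, with S the subspace on which the fixed component vanishes.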
In this section we have seen how the inverse linear mapping theory applies to the problem of linear estimation. We have seen that the actual problem of adjustment and the actual problem of inverse mapping, although dually related, are essentially two problems of a different kind. Were we only interested in adjustment, i.e. in removing inconsistency, then we would only be concerned with the subspace C; the subspace S only comes to the fore when a parameter representation of ŷ^(c) ∈ { y₁ } + R(A) is needed. We would like to stress here the importance of the definite ordering: first adjustment and then inverse mapping, since it shows that in an estimation context no great value should be attached to the subspace D. In fact the only inverses of A which map y_s into the pre-image x̂^(s,c) of ŷ^(c) are those whose D-term annihilates the inconsistent part of y_s; one can not get an arbitrary pre-image x̂^(s) of the least-squares estimate ŷ = y₁ + P_{R(A),R(A)⊥} ( y_s − y₁ ) by mapping y_s with an arbitrary generalized inverse.
1. Introduction

In the preceding chapter we have seen how to characterize an arbitrary inverse of a linear map A: N → M. We saw how the subspace C⁰ complementary to R(A) could be used to make an inconsistent system of linear equations consistent, and how S complementary to Nu(A) gave a way of restoring determinability. We also noted that although inconsistency and underdeterminability are dually related, removing them constitutes two problems of a different kind.
In the present chapter these results are applied to geodetic networks. In particular they lead to the coordinate transforming S-transformation
advocated by (Baarda, 1979). For large networks however, it will not be sufficient to consider only the coordinate transforming S-transformations. In these cases one is almost surely also interested in a description of the fundamental directions like local verticals and the average terrestrial pole. That is, besides the network's point configuration also the configuration of the fundamental directions becomes of interest then. Hence, we also need S-transformations that transform both coordinates and orientation parameters.
Having given the various representations of Nu(A) which are needed to derive the appropriate S-transformations, we turn our attention in section three to the problem of connecting geodetic networks. Without aiming at completeness we mention some examples: densification networks need to be connected to existing national networks, and national networks in turn to continental or global reference systems. And finally similar problems are encountered in deformation analysis (Van Mierlo, 1978). There networks measured at two or possibly more epochs need to be compared in order to affirm projected geophysical hypotheses.
In all the above cases the same principles for connecting networks can be applied, although of course the elaboration can differ from application to application, depending e.g. on the type of network involved and the purposes one needs to serve. That is, although different solution strategies exist, all methods rely on the self-evident principle that the only information suited for comparing networks is the information which is common to both networks.
In our presentation we will discuss three methods for connecting geodetic networks. Although all three alternatives are considered to some extent in the geodetic literature, the treatment below accentuates some aspects which are not discussed elsewhere.
2.1. Planar networks

Let us commence, in order to fix our minds, with the simple example of a two dimensional planar triangulation network in which only angles are measured (see e.g. figure 15). For the adjustment itself a first standard formulation of the linear model will do. In practice, however, one prefers to present the results of the adjustment in the form of coordinates, all having the same reference in common. The benefit being that with coordinates the relative position of any two points in a network is easily obtained without need to bother about the way in which these two network points are connected by the measured elements. Consequently, coordinates are very tractable for drawing maps or making profiles of the whole or parts of the network.
With this motivation in mind we are thus looking for a way to present our results of adjustment by means of (cartesian) coordinates.
However, in order to compute coordinates we first need to fix some reference, i.e. in the case of a planar triangulation network we need to fix the position, orientation and scale of the network. One way to accomplish this is of course by fixing two points of the network, i.e.
the points P_r and P_s of figure 16. The coordinates of a point P_i then follow from

  x_i = x_r + l_rs cos A_rs + l_si cos( A_rs + π + α_rsi )
  y_i = y_r + l_rs sin A_rs + l_si sin( A_rs + π + α_rsi ) .    (2.1)

figure 16
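A short numerical sketch of the forward computation (2.1); all numerical values (base point, azimuth, distances, angle) are invented for illustration:

```python
# Forward computation of P_i from base point P_r per (2.1); values invented.
import math

x_r, y_r = 0.0, 0.0
A_rs = 0.3        # azimuth from P_r to P_s (radians)
l_rs = 100.0      # distance P_r - P_s
alpha_rsi = 0.8   # angle at P_s between the directions P_s->P_r and P_s->P_i
l_si = 70.0       # distance P_s - P_i

x_s = x_r + l_rs * math.cos(A_rs)
y_s = y_r + l_rs * math.sin(A_rs)
# azimuth from P_s to P_i: back-azimuth of A_rs plus the measured angle
A_si = A_rs + math.pi + alpha_rsi
x_i = x_r + l_rs * math.cos(A_rs) + l_si * math.cos(A_si)
y_i = y_r + l_rs * math.sin(A_rs) + l_si * math.sin(A_si)

# consistency check: P_i lies at distance l_si from P_s
assert abs(math.hypot(x_i - x_s, y_i - y_s) - l_si) < 1e-9
print(round(x_i, 3), round(y_i, 3))
```

In an angle-only network l_rs and l_si are of course not observed; they enter through the a priori scale choice discussed next.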
Linearization of (2.1) gives (the upper indices "0" indicate the approximate values):

  Δx_i = Δx_r + x⁰_rs Δln l_rs − y⁰_rs ΔA_rs + x⁰_si Δln l_si − y⁰_si ( ΔA_rs + Δα_rsi )
  Δy_i = Δy_r + y⁰_rs Δln l_rs + x⁰_rs ΔA_rs + y⁰_si Δln l_si + x⁰_si ( ΔA_rs + Δα_rsi ) ,    (2.2)

which we can write as

  ( Δx_i )   ( x⁰_si ( Δln l_si − Δln l_rs ) − y⁰_si Δα_rsi )   ( 1 0 x⁰_ri −y⁰_ri ) ( Δx_r     )
  ( Δy_i ) = ( y⁰_si ( Δln l_si − Δln l_rs ) + x⁰_si Δα_rsi ) + ( 0 1 y⁰_ri  x⁰_ri ) ( Δy_r     ) .    (2.3)
                                                                                     ( Δln l_rs )
                                                                                     ( ΔA_rs    )
Since all the angular type of information is collected in the first term on the right-hand side of (2.3) we see that, in order to be able to introduce coordinates, we need to assign a priori values to the second term. One way is of course to take points P_r and P_s as reference- or base points by assigning to them the non-stochastic approximate coordinates x⁰_r, y⁰_r and x⁰_s, y⁰_s, i.e. by taking

  Δx_r = Δy_r = ΔA_rs = Δln l_rs = 0 .    (2.4)
The coordinates of any other point P_i of the network are then computed from (2.3) with the second term on the right-hand side put to zero, giving (2.5); the upper indices (r,s) indicate that these coordinates are computed with respect to the basepoints P_r and P_s.
Although the choice of fixing the two points P_r and P_s in (2.3) is an obvious one, there are also other ways of introducing coordinates. One could for instance take two other points of the network as base points, or fix linear combinations of the coordinate increments of network points. Essential is, irrespective of the choice made, that the positional, orientational and scale degrees of freedom of the network are taken care of. This is best seen by observing that (2.3) combined with (2.5) essentially constitutes the two dimensional differential similarity transformation

  ( Δx_i )   ( Δt_x )   ( Δλ  −Δφ ) ( x⁰_i )
  ( Δy_i ) = ( Δt_y ) + ( Δφ   Δλ ) ( y⁰_i ) ,    (2.6)

under the assumptions that λ⁰ = 1, φ⁰ = 0 and t⁰_x = t⁰_y = 0.
Since there are many different ways of introducing coordinates, it is important that one recognizes that in general coordinate sets referring to different references differ.
Hence, if one wants to compare two sets of coordinates, where the two sets are computed from two different and independent observational campaigns (as for instance in deformation analysis), it is essential that these coordinates are all defined with respect to the same reference.
Now in order to get all coordinates in the same reference system one needs to be able to transform from one system to another. For the above defined (r,s)-system this transformation is easily obtained.
From substituting the restrictions (2.4) into the decomposition (2.3) we obtain the transformation (2.9),
which shows how t o transform f r o m an arbitrary coordinate system t o the prespecified (r,s)-system.
To find the general procedure for deriving such transformations, note that the definition of the (r,s)-system and the derivation of (2.9) followed from the decomposition formula (2.3). With (2.5) and (2.8) this decomposition formula reads in matrix notation as

  x = x̄ + V p ,    (2.10)

where
  x = ( Δx_r, Δy_r, Δx_s, Δy_s, ..., Δx_i, Δy_i, ... )ᵗ

is the vector of the coordinate increments of all network points, x̄ collects the angular type of information, p = ( Δx_r, Δy_r, Δln l_rs, ΔA_rs )ᵗ, and the columns of the matrix V contain, per point, the coefficients of the second term of (2.3). An alternative decomposition of x is

  x = ( I − V ( S_iᵗ V )⁻¹ S_iᵗ ) x + V ( S_iᵗ V )⁻¹ S_iᵗ x ,    (2.11)

where R(S_i) is arbitrary but complementary to R(V), in the sense that S_iᵗV is regular. The first part of (2.11) is invariant for changes of x along R(V) and is thus a function of the angular type of information only, whereas the second part is a part for which additional a priori information is needed. Now, just like decomposition (2.10) suggested to choose the restrictions (2.4), (2.11) suggests that we take the restrictions

  S_iᵗ x = 0 ,

so that we find from (2.11)

  x^(s_i) = ( I − V ( S_iᵗ V )⁻¹ S_iᵗ ) x .    (2.14)
This is the general expression one can use for deriving transformations like (2.9). We thus see that in order to derive such a transformation we only need to know R(V) and to choose an S_i such that R(S_i) is complementary to R(V).
So far we discussed planar networks of the angular type. But formula (2.14) is of course valid for other types of networks too. The only difference is that we need to modify R(V) accordingly. For a network in which azimuths and distances are measured, for instance, we find that the appropriate (differential) similarity transformation is in this case the one in which scale and rotation are excluded, i.e. R(V) is spanned by the two translational columns only.
Note that in case of, for instance, an angular type of network all linear(ized) functions of the angular observables are invariant to the differential similarity transformation (2.6). Thus if the adjustment of the planar triangulation network of e.g. figure 15 is formulated as a free network adjustment, the residuals and all estimable functions are unaffected by the particular choice of S_i.

- a remark concerning the choice of S = R(S) complementary to R(V)
Some authors, when dealing with free network adjustments, prefer to take the coordinate system definition corresponding to the choice

  S_i = V .    (2.20)

This is of course a legitimate choice, since it is just one of the many possible. However, we cannot endorse their claim that one always should choose (2.20) because it gives the "best" coordinate system definition possible.
They motivate their claim by pointing out that the covariance map of the pre-image of the BLUE's estimate ŷ of ȳ = Ax corresponding with (2.20) has minimum trace:

  trace Q_{x̂^(v)} ≤ trace Q_{x̂^(s)}   for all pre-images x̂^(s) of ŷ under A.

This in itself is true of course. In case of free networks however, it is unfortunately without any meaning. All the essential information available is namely contained in ŷ, whereby x̂^(s) is nothing but a convenient way of representing this information. A theoretical basis for preferring (2.20) does therefore not exist in free network adjustments. At the most one can decide to choose (2.20) on the basis of computational convenience, which might in some cases be due to the symmetry of the projector

  I − V ( Vᵗ V )⁻¹ Vᵗ .
One could also rephrase the above as follows: since every (oblique) projector can be interpreted as an orthogonal projector with respect to an appropriately chosen metric, the difference between the choice (2.20) and any other admissible choice of S_i is merely a difference in parameter-space norm. And since there is no reason to prefer one particular norm above another, we do not have, as in physical geodesy, a norm choice problem in free network adjustments.
2.2. Ellipsoidal networks

So far we discussed the inverse linear mapping problem of planar geodetic networks. But let us now assume that we have to compute a geodetic network, the points of which are forced to lie on a given ellipsoid of revolution, defined by

  ( x² + y² ) / a² + z² / b² = 1 ,    (2.21)

where a and b are respectively the ellipsoid's major and minor axes.
In view of the foregoing discussion the three main questions we are interested in are then: (i) how does the theory of S-transformations apply to the ellipsoidal model, (ii) how does it compare to the results we already obtained for the planar case, and (iii) what are the consequences for practical network computations.
On an intuitive basis it is not too difficult to answer these three questions provisionally. From the rotational symmetry of the ellipsoid of revolution follows namely that the ellipsoidal model will at most admit one degree of freedom. And since this degree of freedom is of the longitudinal type it follows that the ellipsoidal counterpart of transformation (2.6) will read as

  Δλ_i = Δε_z ,    (2.22)

where Δλ_i is the geodetic longitude increment of point P_i and Δε_z the differential rotation angle.
Hence, transformation (2.22) can be used to derive the appropriate S-transformations for the ellipsoidal model.
As to the second question, if one wants to understand in what way and to what extent the ellipsoidal model differs from the planar model, we need a way of comparing both models. One can achieve this by considering the planar model as a special degenerate case of the ellipsoidal model. Assume therefore that we are given a geodetic triangle (i.e. a triangle bounded by geodesics) on the ellipsoid of revolution (2.21). By letting e² = ( a² − b² ) / a², the first numerical eccentricity squared, approach zero, the ellipsoidal triangle reduces to a spherical triangle on a sphere of radius R; and for increasing values of R the spherical triangle will ultimately reduce to an ordinary planar triangle.
Summarizing one could therefore say that the difference between ellipsoidal geometry and planar Euclidean geometry is primarily made up by the two factors e² and R. And one can thus expect that if both the ellipsoidal eccentricity factor e² and the spherical curvature 1/R are small enough, no significant differences will be recognizable between ellipsoidal geometry and planar Euclidean geometry.
But what about the admitted degrees of freedom? We note namely a drastic change in the maximal number of admitted degrees of freedom when the two limits e² → 0 and R → ∞ are taken. The ellipsoidal model only admits the longitudinal degree of freedom, whereas the planar model admits a
maximum of four degrees of freedom. Still, despite this difference in admitted degrees of freedom it seems reasonable to expect that the actual estimation problem of the ellipsoidal model will not be too different from that of the planar model if e² and 1/R both are small enough. Consequently, it can be questioned whether in this case transformation (2.22) suffices to characterize the degrees of freedom admitted by the ellipsoidal model. Theoretically it does of course. But for practical applications it becomes questionable whether the rotational degree of freedom as described by (2.22) is the only degree of freedom the ellipsoidal model admits if both e² and 1/R are small.
This then brings us to the third question concerning the consequences for practical network computations. Namely, the smaller e² and 1/R get, the worse the conditioning of the ellipsoidal network's design matrix A can be expected to be. That is, although theoretically the maximum defect of A equals one, it can be expected that for small enough values of e² and 1/R more than one of the columns of the design matrix A will show near linear dependencies. As a consequence one can therefore expect that the ill-conditioning of A will affect the estimation of the explanatory variables x in the model y = A x. Near collinear variables namely do not provide information that is very different from that already inherent in others. It becomes difficult therefore to infer the separate influence of such explanatory variables on the response ŷ. A further consequence arises from the fact that a near collinear relation can readily result in a situation in which some of the observed systematic influences of the explanatory variables get masked by the model's random error term. And it will be clear that under these circumstances, estimation can be hindered.
To find out whether for practical ellipsoidal networks the estimation of geodetic coordinates is indeed hindered by the expected ill-conditioning of A, one can follow different but related routes. One way is to investigate numerically to what extent the shape of an ellipsoidal network as measured by its angles can be considered to be invariant to a change of its position, orientation and scale. Another way is to compute the non-zero singular values of A or the non-zero eigenvalues of the normal matrix AᵗA. Eigenvalues small relative to the largest eigenvalue of the normal matrix will then reflect the poor conditioning of A. And finally one could try to show analytically that the estimated geodetic coordinates lack precision if only the longitudinal degree of freedom is taken care of.
The first approach, which is based on the idea that for planar geodetic networks of the angular type the invariance to position, orientation and scale changes is complete, has been followed by (Nibbelke, 1984). And he found that for practical ellipsoidal triangulation networks one can indeed consider the network's position, orientation and scale as non-estimable. That is, one is, just as in the planar case, forced to fix four linearly independent functions of the geodetic coordinate increments. The theoretical deformations of the network's shape, which possibly follow from these restrictions, are then negligible. The same conclusion was also reached by (Kube and Schnädelbach, 1975), who used the second approach.
Eigenvalue computations for such a network show that in case of, for instance, an ellipsoidal triangulation network, four eigenvalues of the normal matrix will be so small that a sensible estimation of the network's position, orientation and scale is not attainable. This conclusion is also in agreement with the result found by (Krarup, 1982a), who indicated that the position of a trilateration network on an ellipsoid of revolution is practically non-estimable.
As an example and also to support the above mentioned findings we will now show analytically that the estimation of geodetic coordinates indeed lacks precision if only the longitudinal degree of freedom is taken care of. For this purpose assume that we have a full rank linear model

   y  =  A   x  ,    (2.23)
  m×1   m×n n×1

in which the scalar parameter x₂ of the partitioning

  y = A₁ x₁ + A₂ x₂ ,   x = ( x₁ᵗ, x₂ )ᵗ ,
    m×(n−1)   m×1

is held responsible for the poor conditioning of A. It follows then that the column vector A₂ depends almost linearly on the columns of A₁. Using the reparametrization
we can write (2.23) as

  y = A₁ x₁ + A₂ x₂ ,

or as

  y = A₁ x̄₁ + Ā₂ x₂ ,   with   x̄₁ = x₁ + ( A₁ᵗA₁ )⁻¹ A₁ᵗ A₂ x₂ ,    (2.24)

and with

  Ā₂ = ( I − A₁ ( A₁ᵗA₁ )⁻¹ A₁ᵗ ) A₂ .    (2.25)
From the fact that A₂ depends almost linearly on the columns of A₁ it now follows that one can reasonably expect Ā₂ to be a rather short column vector. Geometrically this is seen as follows. Since I − A₁(A₁ᵗA₁)⁻¹A₁ᵗ is an orthogonal projector, we have that (see figure 17)

  Ā₂ᵗ Ā₂ = A₂ᵗ ( I − A₁ ( A₁ᵗA₁ )⁻¹ A₁ᵗ ) A₂ = A₂ᵗ A₂ sin²θ ,    (2.26)

where θ denotes the angle between A₂ and its orthogonal projection on the subspace spanned by the columns of A₁.

figure 17
From the near linear dependency of A₁ and A₂ thus follows that the angle θ will be small. Hence, the length of Ā₂ will be small as well. With (2.24) and (2.26) the variance of the estimate x̂₂ becomes, σ² being the variance of unit weight,

  σ²_x̂₂ = σ² / ( Ā₂ᵗ Ā₂ ) = σ² / ( A₂ᵗ A₂ sin²θ ) ,    (2.27)

which shows that x̂₂ gets poorly estimable for small θ. But to be able to judge to what extent the diagnosed ill-conditioning of A affects the estimation of x₂ we need to have a reasonable estimate of (2.27).
Since we know that the possible lack of precision of the estimated parameter x̂₂ is a consequence of the near linear dependency between A₁ and A₂, it follows that there must exist a vector, z = ( z₁ᵗ, z₂ )ᵗ say, for which

  A z = A₁ z₁ + A₂ z₂ = v ,    (2.28)

with v a vector of small length. Premultiplication with the orthogonal projector I − A₁(A₁ᵗA₁)⁻¹A₁ᵗ gives Ā₂ z₂ = ( I − A₁(A₁ᵗA₁)⁻¹A₁ᵗ ) v, and since

  vᵗ ( I − A₁ ( A₁ᵗA₁ )⁻¹ A₁ᵗ ) v ≤ vᵗ v ,    (2.29)

we get with (2.26) and (2.27) the lower bound

  σ²_x̂₂ ≥ σ² z₂² / ( vᵗ v ) .    (2.30)

We can thus use the lower bound of (2.30) to prove that the estimation of x₂ indeed lacks precision.
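The mechanism behind the bound (2.30) can be sketched numerically; the matrices A₁, A₂ and the vector z below are invented so that A₂ almost lies in the column space of A₁:

```python
# Numerical sketch of the variance bound (2.30); all values invented.
eps = 0.125
A1 = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]   # orthonormal columns
A2 = [1.0, 1.0, eps]                        # nearly a combination of A1's columns

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# projection onto R(A1) is trivial here: (I - P1) A2 drops the last two... no,
# it simply zeroes the first two components:
A2_bar = [0.0, 0.0, eps]                    # (I - P1) A2, cf. (2.25)

# choose z = (z1, z2) with z2 = 1 so that v = A z is short, cf. (2.28):
z1, z2 = [-1.0, -1.0], 1.0
v = [A1[i][0] * z1[0] + A1[i][1] * z1[1] + A2[i] * z2 for i in range(3)]

sigma2 = 1.0                                # variance of unit weight
var_x2 = sigma2 / dot(A2_bar, A2_bar)       # exact variance, cf. (2.27)
bound = sigma2 * z2 ** 2 / dot(v, v)        # lower bound, cf. (2.30)
assert var_x2 >= bound
print(var_x2, bound)   # both 64.0 here
```

Equality is attained in this example because v happens to be orthogonal to R(A₁); in general the bound is only a lower bound.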
Now, to apply the above to our case of ellipsoidal networks, recall that we made it plausible that the difference between ellipsoidal, spherical and planar geometry is primarily made up by the two factors e² and 1/R. This suggests that, for small enough values of e² and 1/R, the eigenvectors of spherical and planar networks' design matrices belonging to zero eigenvalues are the proper candidates for the z-vector of (2.28). For this purpose we thus first need to find the spherical analogon of (2.3) (or (2.15)).
We will start from the three dimensional differential similarity transformation

  ( Δx_i )   ( Δt_x )   ( x⁰_i )       ( x⁰_i )   ( Δε_x )
  ( Δy_i ) = ( Δt_y ) + ( y⁰_i ) Δκ + ( y⁰_i ) × ( Δε_y ) .    (2.31)
  ( Δz_i )   ( Δt_z )   ( z⁰_i )       ( z⁰_i )   ( Δε_z )

With the differential relation between cartesian and geodetic coordinates,

  ( Δx_i )   ( −sinφ⁰_i cosλ⁰_i   −sinλ⁰_i   cosφ⁰_i cosλ⁰_i ) ( ( M⁰_i + h⁰_i ) Δφ_i )
  ( Δy_i ) = ( −sinφ⁰_i sinλ⁰_i    cosλ⁰_i   cosφ⁰_i sinλ⁰_i ) (  N⁰_i cosφ⁰_i Δλ_i   ) ,
  ( Δz_i )   (  cosφ⁰_i             0         sinφ⁰_i        ) (  Δh_i                 )

where M and N denote the radii of curvature in the meridian and in the prime vertical respectively, this gives the geodetic coordinate increments ( Δφ_i, Δλ_i, Δh_i ) as linear functions of the seven transformation parameters ( Δt_x, Δt_y, Δt_z, Δε_x, Δε_y, Δε_z, Δκ ). In particular, the height-row of this relation reads

  Δh_i = cosφ⁰_i cosλ⁰_i Δt_x + cosφ⁰_i sinλ⁰_i Δt_y + sinφ⁰_i Δt_z
       − e² N⁰_i cosφ⁰_i sinφ⁰_i sinλ⁰_i Δε_x + e² N⁰_i cosφ⁰_i sinφ⁰_i cosλ⁰_i Δε_y
       + N⁰_i ( 1 − e² sin²φ⁰_i ) Δκ .    (2.32)
Since the network points are forced to lie on the ellipsoid of revolution, we must have that

  h⁰_i = 0   and   Δh_i = 0 ,   ∀i = 1, ... .    (2.33)

Hence, it follows from (2.32) that

  0 = cosφ⁰_i cosλ⁰_i Δt_x + cosφ⁰_i sinλ⁰_i Δt_y + sinφ⁰_i Δt_z
    − e² N⁰_i cosφ⁰_i sinφ⁰_i sinλ⁰_i Δε_x + e² N⁰_i cosφ⁰_i sinφ⁰_i cosλ⁰_i Δε_y
    + N⁰_i ( 1 − e² sin²φ⁰_i ) Δκ ,   ∀i = 1, ... .    (2.34)

But this means that for a regular network (i.e. a network which excludes degenerate cases like λ⁰_i = constant, ∀i = 1, ...)

  Δt_x = Δt_y = Δt_z = Δε_x = Δε_y = Δκ = 0 ,    (2.35)

which confirms our earlier statement that the ellipsoidal model only admits the longitudinal degree of freedom.
In an analogous way we can find the type of degrees of freedom admitted by the spherical model. In spherical coordinates R_i, φ_i and λ_i the counterpart of (2.32) expresses ( ΔR_i, R⁰_i Δφ_i, R⁰_i cosφ⁰_i Δλ_i ) in the transformation parameters; the ΔR-row now reads

  ΔR_i = cosφ⁰_i cosλ⁰_i Δt_x + cosφ⁰_i sinλ⁰_i Δt_y + sinφ⁰_i Δt_z + R⁰_i Δκ ,    (2.36)

the rotational terms dropping out since on the sphere the position vector and the surface normal are parallel. And by setting

  R⁰_i = R   and   ΔR_i = 0 ,   ∀i = 1, ... ,    (2.37)

we get for a regular network that

  Δt_x = Δt_y = Δt_z = Δκ = 0 ,    (2.39)

so that the spherical model admits the three rotational degrees of freedom Δε_x, Δε_y, Δε_z.
To find the expression which corresponds to (2.3) (or (2.15)), we first need to know the relation between ( Δε_x, Δε_y, Δε_z )ᵗ and ( Δφ_r, Δλ_r, ΔA_rs )ᵗ, i.e. between the rotational degrees of freedom and the increments of the coordinates of a base point P_r and of an azimuth A_rs. This relation, an invertible linear one whose coefficients are built from the approximate values φ⁰, λ⁰ of the points involved, is given by (2.41).
The spherical analogon of (2.3) (or (2.15)) then finally follows from substituting

  sin( l_ir / R ) sin A_ri = cosφ_i sin( λ_i − λ_r )    (2.42.a)

and

  sin( l_ir / R ) cos A_ri = sin( ½π − φ_r ) cos( ½π − φ_i ) − cos( ½π − φ_r ) sin( ½π − φ_i ) cos( λ_i − λ_r )
                           = cosφ_r sinφ_i − sinφ_r cosφ_i cos( λ_i − λ_r )    (2.42.b)

into (2.41), which gives (2.43); the three columns of the resulting matrix contain entries built from cosφ⁰_r, cos( λ⁰_i − λ⁰_r ) and R sin( l_ir / R ) sin A⁰_ri. (2.42.a) and (2.42.b) follow from applying the sine rule, sin a / sin A = sin b / sin B, and the so-called five-elements' rule, sin c cos a − cos c sin a cos B = cos A sin b (see figure 18), of spherical trigonometry.
Expression (2.43) shows moreover that the spherical model, in contrast with the planar model, admits no scale degree of freedom. We
are here thus confronted with a situation where angles alone suffice to determine scale. But still, although scale is theoretically estimable, one can expect, as was made clear in the foregoing introductory discussion, that if the spherical curvature is small enough scale will only be very poorly estimable. And indeed it turns out that for practical spherical networks, scale can be considered as non-estimable. See for instance (Molenaar, 1980a, p.20) or the earlier cited references.
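The two spherical-trigonometry rules invoked above are easily verified numerically; a minimal check on an arbitrary spherical triangle (the vertex coordinates are invented):

```python
# Numerical check of the sine rule and the five-elements' rule on a
# spherical triangle with invented unit-vector vertices.
import math

def normalize(p):
    n = math.sqrt(sum(c * c for c in p))
    return [c / n for c in p]

def angle(u, v):
    return math.acos(max(-1.0, min(1.0, sum(a * b for a, b in zip(u, v)))))

def vertex_angle(p, q, r):
    """Spherical angle at vertex p of triangle (p, q, r)."""
    def tangent(a, b):
        d = sum(x * y for x, y in zip(a, b))
        return normalize([y - d * x for x, y in zip(a, b)])
    return angle(tangent(p, q), tangent(p, r))

P = normalize([1.0, 0.2, 0.1])
Q = normalize([0.1, 1.0, 0.3])
R = normalize([0.2, 0.1, 1.0])

a, b, c = angle(Q, R), angle(P, R), angle(P, Q)   # sides opposite P, Q, R
A, B, C = vertex_angle(P, Q, R), vertex_angle(Q, P, R), vertex_angle(R, P, Q)

# sine rule:  sin a / sin A = sin b / sin B
assert abs(math.sin(a) / math.sin(A) - math.sin(b) / math.sin(B)) < 1e-12
# five-elements' rule:  sin c cos a - cos c sin a cos B = cos A sin b
lhs = math.sin(c) * math.cos(a) - math.cos(c) * math.sin(a) * math.cos(B)
assert abs(lhs - math.cos(A) * math.sin(b)) < 1e-12
```

Sides are measured as arc lengths on the unit sphere (l/R in the notation of the text), vertex angles as angles between the great-circle tangent directions.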
I n the same manner it is concluded i n these publications that the scale, orientation and position o f
practical ellipsoidal networks, can considered t o be non-estimable. To support these findings we w i l l
now show analytically, t h a t the geodetic coordinates lack precision i f only the longitudinal degree o f
freedom o f the ellipsoidal model is taken care of. F o r this purpose consider expression (2.43).
The
three columns o f the m a t r i x on the right-hand side o f (2.43) span the nullspace o f the design m a t r i x o f
a spherical triangulation network, whereas the f i r s t column vector provides a basis o f the nullspace o f
an ellipsoidal network's design matrix. Thus, i f the eccentricity factor e2 is small enough one can
expect t h a t both the second and t h i r d column vector o f (2.43) get almost annihilated by the ellipsoidal
network's design matrix. Hence, we can use one of these vectors, say
[expressions (2.45)–(2.47): the chosen vector reads a^t = ( ..., −sin A_ij^0, −cos A_ij^0, ..., −sin A_ji^0, −cos A_ji^0, ... ), and its image under the design matrix involves the products sin φ_i sin(λ_i − λ_r), sin φ_j sin(λ_j − λ_r), cos(λ_i − λ_r) and cos(λ_j − λ_r); garbled in the source]
Let us therefore identify

sin A_ij = sin A_ij^0 + cos A_ij^0 ΔA_ij

[the remaining linearized expressions (2.48), which combine sin A_ij^0, cos A_ij^0, cos(λ_i − λ_r) and sin φ_i sin(λ_i − λ_r) with the increments ΔA_ij and ΔA_ji, are garbled in the source] and estimate the magnitudes |ΔA_ij| and |ΔA_ji|.
From this estimate and (2.30) it thus indeed follows that in case of practical ellipsoidal networks (l_ij = 64 km, R = 6400 km, σ = 10^-4 l_ij, e = 1/300) geodetic coordinates will lack precision if only the longitudinal degree of freedom is taken care of.
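To get a feeling for why the curvature-carried information is so weak, one can compare the spherical arc with its flat-earth approximation for the practical values quoted above. This is only an order-of-magnitude sketch:

```python
import math

# Order-of-magnitude sketch: for a typical side length and earth radius the
# spherical correction terms are tiny, which is why curvature-carried scale
# information is so weak in practical networks (numbers from the text).
l_ij = 64.0e3            # side length [m]
R = 6400.0e3             # earth radius [m]
x = l_ij / R             # = 0.01
rel = 1.0 - math.sin(x) / x   # relative flat-earth error, ~ x**2/6 ~ 1.7e-5
print(x, rel)
```

A relative effect of a few parts in 10^5 is easily drowned by observational noise, which is the analytic content of the claim above.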
2.3.
Now that we have considered the inverse mapping problem in two dimensions it is not too difficult to generalize to three dimensions.
We will first assume that only angles and distance ratios are measured in the three dimensional geodetic network. The generalization of (2.1) to three dimensions then becomes:
[equation (2.50): the three dimensional analogue of (2.1), built from terms like (l_si/l_sr) cos α_rsi, (l_si/l_sr) sin α_rsi, l_rs sin A_rs and l_rs cos A_rs; the matrix is garbled in the source]     (2.50)
where the action of the matrix in (2.50) equals the action of the vector cross product "×" (2.51). With (2.51), expression (2.50) therefore suggests the following generalization to three dimensions:
[equation (2.52): expresses the coordinate differences (x_i − x_r, y_i − y_r, z_i − z_r)^t in terms of l_rs sin Z_rs sin A_rs, l_rs sin Z_rs cos A_rs, l_rs cos Z_rs, the ratios l_si/l_sr and the unit-normal components n_1, n_2, n_3; the matrix is garbled in the source]
where Z_rs denotes the vertical angle of the line P_rP_s (see figure 19.a) and n = (n_1 n_2 n_3)^t is the unit normal of the plane through the points P_r, P_s and P_i (see figure 19.b), defined as
figure 19
We thus see that one way of introducing coordinates for three dimensional networks of the angular type is by starting to fix the two points P_r and P_s. This takes care of six degrees of freedom: namely, three translational degrees of freedom, two rotational degrees of freedom and one freedom of scale. The remaining rotational degree of freedom, namely rotation of the network around the line P_rP_s, is then taken care of by fixing the direction of the unit normal n in the plane perpendicular to the line P_rP_s. The so defined coordinate system thus corresponds to fixing two points P_r and P_s, and the plane through these two points and a third point, P_t say. Following (Baarda, 1979) we will denote this S-system as the (r,s;t)-system. The S_(r,s;t) matrix by which the (r,s;t)-system is defined then follows from the restrictions
where n^0 = (n_1^0 n_2^0 n_3^0)^t can be computed from (2.53) for i = t using approximate values. The S_(r,s;t) matrix, which follows from the three dimensional differential similarity transformation (2.31), then reads

[equation (2.56): the three dimensional S-transformation matrix, with entries built from the coordinate differences x_tr, y_tr, z_tr and the normal components n_1, n_2, n_3; garbled in the source]
Expression (2.56) can be considered as the natural generalization of (2.9). Namely, if we restrict our attention in (2.56) to the Δx-, Δy-parts of the points P_i, P_r and P_s, take z_i^0 = 0, ∀i = 1, ..., and also set n_1^0 = n_2^0 = 0, n_3^0 = 1, then (2.56) reduces to the two dimensional S-transformation (2.9).
In deriving the three dimensional S-transformation (2.56) we assumed that only angles and distance ratios were observed. But this assumption is generally only valid in local three dimensional surveys (e.g. construction works). In large three dimensional networks, one will usually have besides the angles and distance ratios also direction measurements like astronomical azimuth, latitude and longitude at one's disposal. It is likely then that one is not only interested in the (cartesian) coordinates describing the network's configuration but also in the orientation (and possibly scale) parameters describing fundamental directions like local verticals and the earth's average rotation axis. It seems therefore that for large three dimensional networks transformations like (2.56), which only transform coordinates (and their co-variances), do not really suffice. And this becomes even more apparent if one thinks of connecting such networks. For large networks we therefore need S-transformations that also transform orientation (and scale) parameters.
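That angles are blind to translation, rotation and scale — the degrees of freedom an S-transformation has to fix — can be illustrated numerically. The sketch below uses made-up planar coordinates, applies a similarity transformation and checks that an angle is unchanged:

```python
import numpy as np

# A hypothetical planar network of four points (coordinates are made up).
pts = np.array([[0., 0.], [4., 1.], [3., 5.], [-1., 3.]])

def azimuth(p, q):                       # azimuth of q as seen from p (from north)
    d = q - p
    return np.arctan2(d[0], d[1])

def angle(p, j, k):                      # angle at p between the directions to j and k
    return azimuth(p, k) - azimuth(p, j)

def similarity(x, t, alpha, s):          # translation t, rotation alpha, scale 1+s
    R = np.array([[np.cos(alpha), -np.sin(alpha)],
                  [np.sin(alpha),  np.cos(alpha)]])
    return (1.0 + s) * x @ R.T + t

moved = similarity(pts, t=np.array([0.3, -0.2]), alpha=1e-3, s=1e-3)

before = angle(pts[0], pts[1], pts[2])
after = angle(moved[0], moved[1], moved[2])
print(abs(after - before))               # angles are invariant: ~0
```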
Now before deriving such S-transformations let us first draw a parallel with the two dimensional planar case. Since in practice the observation equations are usually written down in terms of directions r_ij and pseudo-distances l_ij instead of in terms of angles and distance ratios, the parameter vector will also contain orientation- and scale unknowns. Hence, the linear model of two dimensional planar networks will in practice be of the same form as that of large three dimensional networks.
Thus also in case of two dimensional networks one can in principle decide to involve the orientation- and scale unknowns in the many S-systems possible. Of course in practice one will not do so, since in two dimensional planar networks these unknowns are generally of no particular interest. But still, let us, for the sake of comparison between the two- and three dimensional case, pursue the idea of involving these unknowns in the many S-systems possible.
Consider for this purpose a two dimensional planar network with direction- and pseudo-distance measurements r_ij and l_ij. In figure 20 a part of such a network is drawn, together with the theodolite frames in the points P_i.
figure 20
Analogous to (2.1) we can then write down the observation equations (2.57). Hence, if the unknowns in the linear model (2.57) are ordered like

x^t = (x_1^t, x_2^t) = ( ..., Δx_i, Δy_i, ..., Δθ_i, Δln κ_i, ... ),

the corresponding S-transformation (2.59) follows [its explicit expression is garbled in the source]. That is, instead of fixing coordinates like we did in (2.4) we may just as well fix one network point, one direction of zero-reading and one scale parameter. The corresponding S-transformation then follows from (2.59) as
[definitions (2.60): the triads E_I, I = 1, 2, 3, and the theodolite frame T_I(P_i), I = 1, 2, 3, in point P_i, where T_{I=3} completes the right-handed system; garbled in the source]
figure 21
The relation between the two frames E_I and T_I(P_i) is given by
[equations (2.61)–(2.64): the rotation matrix R(θ_{1,i}, θ_{2,i}, θ_{3,i}), a product of elementary rotations with entries like cos θ_{3,i}, sin θ_{3,i} and cos θ_{1,i} sin θ_{2,i}; partly garbled in the source] together with

X(P_rP_i) = (κ_r l_ri sin Z_ri sin r_ri , κ_r l_ri sin Z_ri cos r_ri , κ_r l_ri cos Z_ri)^t ,

where κ_r is a scale factor.
From (2.63), (2.64) and X(P_rP_i) = (X(P_i) − X(P_r)) E_I it follows then that

[equation (2.65): x_i, y_i, z_i expressed in x_r, y_r, z_r plus the vector (κ_r l_ri sin Z_ri sin(θ_{3,r} + r_ri), κ_r l_ri sin Z_ri cos(θ_{3,r} + r_ri), κ_r l_ri cos Z_ri)^t rotated over θ_{1,r} and θ_{2,r}; the matrix is garbled in the source]     (2.65)

which shows that one can start computing coordinates once the seven parameters x_r, y_r, z_r, θ_{1,r}, θ_{2,r}, θ_{3,r} and κ_r are fixed. The corresponding S-system would then be

Δx_r = Δy_r = Δz_r = Δθ_{1,r} = Δθ_{2,r} = Δθ_{3,r} = Δln κ_r = 0 .     (2.66)
Since (2.65) generalizes the first two equations of (2.50), linearization of (2.65) would give us the
three dimensional analogue of the first two equations in (2.59). But this is of course only half of the
story. We also need to know how the last two equations of (2.59) read in three dimensions. For scale
this is trivial:
Δln κ_i = ( observables ) + Δln κ_r .     (2.67)
To find the corresponding transformation for the orientational parameters though, we need to know how the orientational parameters θ_{1,i}, θ_{2,i}, θ_{3,i} in point P_i are affected by differential changes in the seven parameters x_r, y_r, z_r, θ_{1,r}, θ_{2,r}, θ_{3,r} and κ_r. Since we can rule out differential changes in the scale- and translational parameters, this leaves us with the problem of finding a differential relation which expresses the Δθ_{1,i}, Δθ_{2,i}, Δθ_{3,i} in terms of the Δθ_{1,r}, Δθ_{2,r}, Δθ_{3,r}. Let us assume that the non-linear relation reads
Linearization gives (2.69). Since the first term on the right-hand side of (2.69) only contains observables, it is the second term we are really interested in. In components (2.69) reads then

(Δθ_{1,i}, Δθ_{2,i}, Δθ_{3,i})^t = ( observables )^t + [3×3 matrix with entries like cos θ_{1,r}/cos θ_{2,i}, cos(θ_{2,i} − θ_{2,r}) and tan θ_{2,i}; garbled in the source] (Δθ_{1,r}, Δθ_{2,r}, Δθ_{3,r})^t .     (2.70)
We are now in the position to collect our results. From (2.70), (2.67) and the linearized expression of (2.65) follows the three dimensional analogue of (2.59) as
[equation (2.71): the three dimensional S-transformation for coordinates, orientation- and scale parameters; its block matrix contains the coordinate differences x_ri, y_ri, z_ri and trigonometric functions of θ_{1,r}, θ_{2,r}, θ_{1,i} and θ_{2,i}; garbled in the source]
Thus if the unknowns in the linear model of the three-dimensional network are ordered accordingly, the S-transformation (2.71) applies. The frames involved are: the earth-fixed triad E_I, of which E_{I=1} points towards the line of intersection of the plane of the average terrestrial equator and the plane containing the Greenwich vertical; the triad E_I^*; the theodolite frames T_I(P_i), I = 1, 2, 3, in point P_i, one axis of which points towards the local astronomical zenith; and the frames *T_I(P_i). If we assume that the theodolite frames are levelled, then the following relations between the four triads E_I, E_I^*, T_I(P_i) and *T_I(P_i) hold:
[equations (2.73)–(2.74): with Δα = Δβ = Δγ = 0 and the rotation matrix R(θ_{1,r}, θ_{2,r}, θ_{3,r}), the increments (Δθ_{1,r}, Δθ_{2,r}, Δθ_{3,r})^t are related to (Δα, Δβ, Δγ)^t through a matrix with entries such as cos Λ_r, sin Λ_r, −cos Φ_r sin Λ_r and −sin Φ_r; garbled in the source]     (2.74)
Since α^0 = β^0 = γ^0 = 0, we can replace θ_{1,i}^0 and θ_{2,i}^0 in (2.71) by Φ_i and Λ_i. With (2.74) we then find that for large three dimensional networks, in which also astronomical latitude, longitude and azimuth are observed, the decomposition reads

[equation (2.75): its matrix contains entries such as cos Φ_r cos(Λ_i − Λ_r), −cos Φ_r sin Λ_r, (−z_ri cos Φ_r sin Λ_r + y_ri sin Φ_r) and (z_ri cos Φ_r cos Λ_r − x_ri sin Φ_r); garbled in the source]

where we have denoted the first column vector on the right-hand side of (2.71), in which it says "observables", by a separate symbol.
One may wonder why there are still seven degrees of freedom. Aren't the Φ_i, Λ_i and A_ij supposed to take care of the rotational degrees of freedom? The reason for this apparent discrepancy is of course that the network's point configuration and fundamental directions are described with coordinates referring to the frame E*, which is essentially an arbitrary one. We have chosen for this approach because it enables us to describe the most general situation, i.e. it allows us to introduce any reference system we like. That is, we do not restrict ourselves beforehand to those reference systems which might be the obvious ones to choose because of the available Φ_i, Λ_i and A_ij. But, would one aspire after this more conventional S-system definition, then decomposition formula (2.75) is easily modified. To see this, let us consider the two dimensional situation. Assume that azimuths A_ij, horizontal directions r_ij and distances l_ij are observed. By taking the general case of describing the network in an arbitrary system (see figure 22) we get from linearizing
the relations x_i = x_r + κ l_ri sin(A_ri − α) and y_i = y_r + κ l_ri cos(A_ri − α), together with the corresponding relations for the orientation- and scale unknowns [partly garbled in the source], that
where the upper indices (r,//) indicate that these coordinates are computed in the S-system which is defined through fixing the point P_r (Δx_r = Δy_r = 0), the scale (Δln κ = 0) and the orientation parallel (if α^0 = 0) to the north direction (Δα = 0). From decomposing (2.77) like

[decomposition (2.78): it splits the increments Δx_i, Δy_i into an (r,//)-part and contributions of Δα and Δln κ; garbled in the source]

it follows that the reference systems one usually considers when azimuths and distances are observed are of the (//)-type. They are defined through Δα = 0, Δln κ = 0. Explicitly stated, the conventional S-system chosen when azimuths and distances are measured is:
figure 22
In three dimensions (2.79) generalizes to (2.80), and it will now be clear that the usual phrase "astronomical latitude, longitude and azimuth take care of the rotational degrees of freedom" essentially means that one has fixed the orientation of the reference system through Δα = Δβ = Δγ = 0. From (2.75) it follows that the decomposition corresponding to S-system (2.80) is given by:
The corresponding S-transformation is then easily found from bringing the second term on the right-hand side of (2.81) to the left-hand side (see also Teunissen, 1984a). Note that since Δα^(r,//) = Δβ^(r,//) = Δγ^(r,//) = 0, one can replace Δθ_{1,i}^(r,//) and Δθ_{2,i}^(r,//) in (2.81) by respectively ΔΦ_i^(r,//) and ΔΛ_i^(r,//).
Instead of (2.80) one could of course also consider still other types of S-system definitions. One could for instance take the restrictions given by (2.54). The orientation of the earth-fixed frame and the directions of the local verticals are then given by respectively Δα^(r,s;t), Δβ^(r,s;t), Δγ^(r,s;t) and Δθ_{1,i}^(r,s;t), Δθ_{2,i}^(r,s;t). And if one replaces the cartesian coordinates in e.g. (2.75) by geodetic coordinates, and the direction unknowns Δθ_{1,i}, Δθ_{2,i} by the deflection of the vertical components ξ, η through using

ξ_i = Δθ_{1,i} − Δφ_i ,  η_i = (Δθ_{2,i} − Δλ_i) cos φ_i^0 ,

one can show that also the following sets of restrictions are legitimate choices for defining an S-system:
(a) ξ_r = η_r = ΔA_rs = 0 , Δφ_r = Δλ_r = Δh_r = 0 , Δln κ = 0
    (⇔ Δθ_{1,r} = Δθ_{2,r} = ΔA_rs = 0 , Δφ_r = Δλ_r = Δh_r = 0 , Δln κ = 0)
(b) Δα = Δβ = Δγ = 0 , Δφ_r = Δλ_r = Δh_r = 0 , Δln κ = 0
(c) Δα = Δβ = ΔA_rs = 0 , Δφ_r = Δλ_r = Δh_r = 0 , Δln κ = 0     (2.82)
(see also Strang v. Hees, 1977; Yeremeyev and Yurkina, 1969). And in this way many more sets of necessary and sufficient restrictions can be found. Note that also the geodetic coordinates should be given an upper index referring to the S-system through which they are defined.
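The replacement of direction unknowns by deflection-of-the-vertical components amounts to two simple differences; a minimal numerical illustration, using the classical definitions ξ = Φ − φ and η = (Λ − λ) cos φ of which the increment expressions above are the linearized counterparts (all values below are made up):

```python
import math

# Deflection-of-the-vertical components from astronomical (Phi, Lam) and
# geodetic (phi, lam) coordinates, in radians; the values are illustrative.
def deflection(Phi, Lam, phi, lam):
    xi = Phi - phi                        # meridian component
    eta = (Lam - lam) * math.cos(phi)     # prime-vertical component
    return xi, eta

xi, eta = deflection(Phi=0.90001, Lam=0.50002, phi=0.90000, lam=0.50000)
print(xi, eta)
```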
In principle of course there is no need for introducing deflection of the vertical components. For computing three dimensional networks one can just as well do without them. Due, however, to the fact that many existing large networks lack the necessary zenith distances, one has in the past preferred the classical method of reductions to a reference ellipsoid and computation by means of ellipsoidal quantities to the theoretically more attractive spatial triangulations of Bruns and Hotine (see e.g. Hotine, 1969; Torge and Wenzel, 1978; Engler et al., 1982). Instead of solving the height problem by using zenith distances one resorts to the astrogeodetic (or gravimetric) method. The problem of the network computation is then split into two nearly independent problems, namely
(a) the φ, λ - problem, and
(b) the ξ, η, h - problem.
The procedure followed is in short the following (see also Heiskanen and Moritz, 1967). One starts by defining a three dimensional S-system (geodetic datum). Usually one takes the datum given by (2.82.b) or (2.82.c). Using the approximate information available on ξ_P, η_P, h_P, Φ_P^0 and Λ_P^0, one reduces the observed angles and distances to the ellipsoid and computes on it the geodetic coordinate increments Δφ_i, Δλ_i. After having solved for (a), one enters the solution of (b), where new heights and new deflections of the vertical need to be determined based on the new ellipsoidal values of φ_i and λ_i. With these new values the whole procedure is repeated. One can consider this iteration procedure as a block Gauss-Seidel type of iteration where a linear system is solved iteratively, block by block.
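A block Gauss-Seidel iteration of the kind referred to here alternates between solving the two diagonal blocks of a partitioned system; a minimal sketch (the 2×2-block system below is made up and diagonally dominant, so the iteration converges):

```python
import numpy as np

# Block Gauss-Seidel sketch for a partitioned system
#   [N11 N12] [x1]   [b1]
#   [N21 N22] [x2] = [b2]
N = np.array([[4.0, 1.0, 0.5, 0.0],
              [1.0, 3.0, 0.0, 0.5],
              [0.5, 0.0, 5.0, 1.0],
              [0.0, 0.5, 1.0, 4.0]])
b = np.array([1.0, 2.0, 3.0, 4.0])
N11, N12, N21, N22 = N[:2, :2], N[:2, 2:], N[2:, :2], N[2:, 2:]

x1, x2 = np.zeros(2), np.zeros(2)
for _ in range(50):                       # alternate between the two blocks
    x1 = np.linalg.solve(N11, b[:2] - N12 @ x2)
    x2 = np.linalg.solve(N22, b[2:] - N21 @ x1)

x = np.concatenate([x1, x2])
print(np.max(np.abs(N @ x - b)))          # residual ~ 0 after convergence
```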
A practical point of concern is, however, the reduction procedure. In many cases the necessary gravity field information, needed to perform a proper reduction of the observational data, is lacking (see e.g. ...). When it is available, the classical method of reduction to the ellipsoid can be seen to be formally equivalent to the truly three dimensional method, and both methods, if applied correctly, will give the same results (Wolf, 1963a; Levallois, 1960). Hence, the final iterated solution of the classical method for the network's shape will be free from any deterministic effects of the arbitrarily introduced datum. The intermediate solutions of the iteration procedure, however, do theoretically depend on the choice of datum. It is gratifying to know therefore, as has been shown in subsection 2.2, that these effects are practically negligible.
Now that we have given representations of Nu(A) in various situations we can start discussing the problem of connecting geodetic networks.
In principle this problem is not too difficult. Essential is to know the type of information the two networks have in common. Based on this information one can then formulate the appropriate model and perform the adjustment.
As to the methods of connecting geodetic networks one can distinguish between three solution strategies. Two of them need the parameters describing the two separate adjusted networks, while the third method starts from the assumption that the original observation equations (or rather the reduced normal equations) are still available.
In the first method (method I) use is made of condition equations. The idea is to first eliminate all non-common information from the two sets of parameters describing the two separate adjusted networks. The so transformed parameters are then finally used on an equal footing in the method of condition equations.
It is curious that this method has found so little attention in the literature. We only know of a few references. Perhaps this is due to the general aversion one has for the method of condition equations, since it is known to be cumbersome in computation. In the present case, however, this argument does not hold. On the contrary, the method can in many cases be very tractable indeed.
The second method (method II) is essentially the counterpart of the above mentioned method. In this method one starts by determining the transformation parameters. This is done by means of a least-squares adjustment. After the adjustment one then applies the transformation parameters to obtain the final estimates of the parameters describing the two connected networks.
Method II seems to be very popular with those working on the problem of connecting satellite networks with terrestrial networks (see e.g. Peterson, 1974). A weak point of many discussions on this method is, however, that often the starting assumptions are not explicitly formulated. As we will see, this may avenge itself on the general applicability of the method and also may affect the interpretability of the transformation parameters.
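The first step of method II — estimating the transformation parameters by least squares and then applying them — can be sketched for the two dimensional similarity (Helmert) transformation. The point coordinates and parameter values below are made up; the parametrization a = s cos α, b = s sin α makes the observation equations linear:

```python
import numpy as np

# Common points in the two systems (made-up, noise-free data).
src = np.array([[0.0, 0.0], [10.0, 0.0], [10.0, 10.0], [0.0, 10.0]])
true = {"a": 1.001 * np.cos(0.002), "b": 1.001 * np.sin(0.002), "t": np.array([5.0, -3.0])}
dst = np.column_stack([true["a"] * src[:, 0] - true["b"] * src[:, 1] + true["t"][0],
                       true["b"] * src[:, 0] + true["a"] * src[:, 1] + true["t"][1]])

# Design matrix of the linear observation equations for (a, b, tx, ty):
#   x' = a*x - b*y + tx ,  y' = b*x + a*y + ty
A = np.zeros((2 * len(src), 4))
A[0::2, 0], A[0::2, 1], A[0::2, 2] = src[:, 0], -src[:, 1], 1.0
A[1::2, 0], A[1::2, 1], A[1::2, 3] = src[:, 1],  src[:, 0], 1.0
y = dst.reshape(-1)

p, *_ = np.linalg.lstsq(A, y, rcond=None)   # least-squares estimate of (a, b, tx, ty)
print(p)
```

With noise-free data the estimated parameters reproduce the assumed ones; with real data the residuals of this adjustment carry the incompatibility of the two networks.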
Finally the third method (method III) makes use of the so-called Helmert blocking procedure. It is therefore essentially a phased type of adjustment, applied to the original models of the two overlapping networks (e.g. Wolf, 1978).
Usually when one applies this method one starts from the principle that both the reduced normals are regular, thereby suggesting that the two overlapping networks have no degrees of freedom at all. For a general application of the method this is of course a too restrictive assumption to start with. We will therefore have to show how the method applies in the general case. Here too one finds in the literature that either the assumptions are too restrictive to render a general application of the methods possible or they are not too precisely formulated.
For a proper course of things let us therefore start by stating our basic assumptions.
First consider the original models. We assume that the first network is described by the linear(ized) model

Δy = (A_1 : A_2)(Δx_1^t, Δx_2^t)^t , Q_y , with dim. Nu(A_1 : A_2) = q ,     (3.1.1.a)

and the second network by

Δȳ = (Ā_1 : Ā_3)(Δx̄_1^t, Δx̄_3^t)^t , Q_ȳ , with dim. Nu(Ā_1 : Ā_3) = q̄ .     (3.1.1.b)

[the dimension annotations (m×n, m×n_2, (n+n_2)×1, etc.) are garbled in the source]
We further assume that the second network, apart from some additional degrees of freedom, has the same type of degrees of freedom as the first network. This means that we assume that the nullspace of (3.1.1.a)'s normal system, reduced for Δx_2, is contained in that of the second network's reduced normal system, i.e., with the projectors

P_2 = I − A_2 (A_2^t Q_y^{-1} A_2)^{-1} A_2^t Q_y^{-1}  and  P̄_3 = I − Ā_3 (Ā_3^t Q_ȳ^{-1} Ā_3)^{-1} Ā_3^t Q_ȳ^{-1} ,

Nu(P_2 A_1) ⊂ Nu(P̄_3 Ā_1). And finally we assume that the two parameter sets are related through

Δx̄_1 = Δx_1 + V_1 Δp , with Δp of order r×1 ,     (3.1.1.d)

[parts of (3.1.1.c) are garbled in the source].
Since some of the derivations and formulae in the next section become quite elaborate, we will from time to time use the following example. The first network is planar, with astronomical azimuth, distance and angle measurements; and the second network can be considered to be planar with (magnetic) azimuth and angle measurements only.
NU(A :A
1
2
( A ; ~ , A; 3 )
and
.. .. ..
.. ..
1 0
R(
0 1
1 0
N ~ ( A.A- ) = R (
1
3
O
) , w i t h q = 2 and
0
X.
0I )
withq=3.
.. .. ..i
.. ..
The second network has namely apart f r o m the t w o translational degrees o f freedom also an
additional freedom o f scale.
Furthermore, transformation (3.1.1.d)
with r = 4
R(V ) = R ( ( V ) 1
1
1 1
CJ
R((V1)2)
= R ( ( v ~ ) ~CJ ) Nu(F3Al),
...
>
with
R((V ) )
1 1
= R(
we can i d e n t i f y the A pl
'i
o ) and
-X
1 0
i
I
0
R ( ( V ) ) = R(
1 2
0 1 Yi
t
parameter o f A p = (A p1 A
1,
.. .. ..
pi
)
3.2. Three alternatives

Since the above mentioned first two methods are closely related we will discuss them together.

Method I and II
Both methods are applicable if the parameters describing the two separate adjusted networks and their covariances are available. Thus we assume given (see figure 23):

Δx̂^(S) , with S = R(S) complementary to Nu(A_1 : A_2)   (first network)     (3.2.1)
Δx̄̂^(S̄) , with S̄ = R(S̄) complementary to Nu(Ā_1 : Ā_3)   (second network)

figure 23
Our goal is now to solve for the transformation parameters Δp and the increments (Δx̂_1^(S), Δx̂_2^(S), Δx̂_3^(S)), expressed in the same coordinate system as that of the first network. For our example in subsection 3.1 this means that we wish our results to take the scale and orientation of the first network. This is a sensible choice, since the first network contains by assumption more information than the second. Before the two solutions can be compared, we thus need to take care of the in both cases existing translational degrees of freedom. But as we
know from the previous section, this can be done in very many ways, the simplest way being to fix just one network point. Having done this we thus finally end up with two sets of coordinates, each describing one of the two separate adjusted networks. How are we now to compare these two coordinate sets? Not by blithely comparing the coordinates of corresponding network points, for these were introduced in a rather arbitrary way. In general namely, the two fixed network points will be different ones. In fact, even if one would have fixed the same network point in both networks, one still should exercise great care. This is because the numerical values assigned to the fixed point need not be identical for both networks. Now if we disregard this possibility for the moment and assume that the same set of approximate coordinates is used for linearizing the observation equations of both networks, we would still have the inequality Δx̂_1^(S) ≠ Δx̄̂_1^(S̄) even if S = S̄.
The reason being of course that the first network is orientated with respect to astronomical north and the second with respect to magnetic north. Thus the only information the two networks have in common is of the distance- and angular type. But again we can take care of this discrepancy by using the appropriate S-transformation, namely one that eliminates the azimuthal information from both networks.
Finally we complicate the situation a bit further by assuming that the second network lacks distance measurements, i.e. lacks scale. In this case we are in the situation as described by the example of the previous subsection, but now the second network also has an additional freedom of scale. In this case we thus certainly will have the inequality.
But as will be clear now, one can again overcome this discrepancy by using the appropriate S-transformation, namely one which reduces both networks to ones of the angular type.
Summarizing, we can conclude from the above discussion that although the causes for the incompatibility of Δx̂_1^(S) and Δx̄̂_1^(S̄) may differ, one can always find an appropriate S-transformation to eliminate this discrepancy. And in view of our general assumptions (3.1.1) it follows that the connection model can be formulated as (3.2.3), or equivalently as (3.2.4), which for our example reads in cartesian coordinates as a set of n−4 angular condition equations, or a set of n−4 linear equations which is in one-to-one correspondence to such a set of n−4 angular condition equations.
Some authors have expressed their hesitation towards the above described procedure for using S-transformations. They argue that by using an S-transformation which eliminates e.g. the available azimuthal and scale information, one eliminates information which is important in its own right. This, however, is a misunderstanding: nothing is irrevocably lost. The S-system on which the adjustment for connecting both networks, whether based on (3.2.3) or (3.2.4), is computed can be chosen freely. After the adjustment one can then always, if so desired, transform the adjusted coordinates back to one of the original coordinate systems. In the above example for instance one can always transform back to the system of the first network, the one that contains scale- and orientation information.
Now let us consider the actual solution strategies of the two methods I and II. We will start with method I. Although it is customary in the literature to start from model formulation (3.2.3), we will, for reasons to be explained, start from model formulation (3.2.4).
This formulation of the least-squares solution of method I is, however, not yet in concurrence with the formulation one usually finds in the literature (see e.g. Baarda, 1973, p.125 or Van Mierlo, 1978, p.9-26). We therefore have to rewrite (3.2.5) a bit. For this purpose take the following abbreviation:
Since R(Â_1) = R(S̄), we can rewrite (3.2.5) by means of the projector P := P_{R(S̄),R(V_1)}. Premultiplying the resulting expression with V_1 (V_1^t (Q_Δx̂1^(S) + Q_Δx̄̂1^(S̄)) V_1)^{-1} V_1^t, with R(S̄) complementary to R(V_1), and using (3.1.1.d), one arrives at formulation (3.2.10.a), which involves an arbitrary inverse d^− [the intermediate expressions (3.2.6)–(3.2.10) are garbled in the source]. This is also the solution one can find in (Baarda, 1973), although there the result is derived under the more restrictive assumption that Nu(P_2 A_1) = Nu(P̄_3 Ā_1) = R(V_1).
Compared with (3.2.5), formulation (3.2.10.a) is however not very attractive computationwise. A more direct way is suggested by (3.2.7): the inverse of Q_Δx̂1^(S) + Q_Δx̄̂1^(S̄) can be computed from (3.2.11). In the general case that Nu(P_2 A_1) ⊂ Nu(P̄_3 Ā_1) this matrix ceases to be regular, and a constrained inverse has to be used; with (5.21) of chapter II follows then (3.2.11'). Thus, since a representation of R(V_1) is usually readily available, we see that instead of (3.2.10.a) one can also use formulation (3.2.5), with (3.2.7) computed via (3.2.11') (or (3.2.11)).
In method II one starts from a model relating the two sets of common unknowns. Usually this model will constitute the differential similarity transformation (3.2.12), e.g. when combining doppler networks with terrestrial networks (Peterson, 1974). However, since the common unknowns of the two overlapping networks need not be restricted to coordinates, relation (3.2.12) could be a kind of modified differential similarity transformation such as for instance (2.81). In fact, relation (3.2.12) need not be restricted to the differential similarity transformation at all. It could for instance also include additional "transformation" parameters which describe projected geophysical hypotheses in a deformation analysis. Or it could include, say, a refraction model.
When we solve for (3.2.12) we immediately notice a difficulty which is often overlooked in the literature. Namely, that the covariance sum Q_Δx̂1^(S) + Q_Δx̄̂1^(S̄) can turn out to be singular. Assume for instance that S = R(S) is complementary to Nu(P_2 A_1) and that S̄ = R(S̄) is complementary to Nu(P̄_3 Ā_1). Then a non-trivial nullspace Nu(Q_Δx̂1^(S) + Q_Δx̄̂1^(S̄)) will exist. One could of course take a generalized inverse of Q_Δx̂1^(S) + Q_Δx̄̂1^(S̄). We refrain however from further elaboration on this point, since if one really insists on using (3.2.12), one can either transform one of the covariance matrices by means of an appropriate S-transformation so that the sum Q_Δx̂1^(S) + Q_Δx̄̂1^(S̄) becomes regular again, or, what is more practical, add the matrix (V_1')(V_1')^t to Q_Δx̂1^(S) + Q_Δx̄̂1^(S̄). The solution of (3.2.12) follows then from straightforward least-squares adjustment:
[equations (3.2.13)–(3.2.15): the least-squares solution of (3.2.12) and an accompanying projector identity involving I − (Q_Δx̂1^(S) + Q_Δx̄̂1^(S̄)) V_1 [ ... ]^{-1} V_1^t; garbled in the source]
This brings us to another important point, namely that of the interpretability of the transformation parameters Δp̂. One might think that all transformation parameters are estimable and that one is allowed, in the context of testing alternative hypotheses, to test whether some or all of the transformation parameters are significant or not. Here, however, one should exercise great care. In particular one should be aware that one cannot test whether an arbitrary linear function of the transformation parameters, c^t Δp say, is zero or not, i.e.

H_0: Δx̄̂_1^(S) − Δx̂_1^(S) = V_1 Δp , c^t Δp = 0 ,  against  H_A: Δx̄̂_1^(S) − Δx̂_1^(S) = V_1 Δp , c^t Δp ≠ 0 .

The reason is that, in the general case we are considering here, one cannot treat all transformation parameters on an equal footing. In case of our example of subsection 3.1, for instance, only the orientational parameter Δp_1 admits such a test.
Finally we like to point out the great resemblance between (3.2.10) and (3.2.15). The two methods only differ in the way the increments (Δx̂_1^(S), Δx̂_2^(S), Δx̂_3^(S)) are computed. There seems therefore little reason to prefer one method over the other, unless one chooses on the basis of computational convenience. One can argue namely that method I is to be preferred, since it only needs the inverse of the covariance matrix of the difference vector, Q_Δx̂1^(S) + Q_Δx̄̂1^(S̄) of (3.2.7), whereas method II needs the inverses of both Q_Δx̂1^(S) + Q_Δx̄̂1^(S̄) + (V_1')(V_1')^t and (V_1')^t (Q_Δx̂1^(S) + Q_Δx̄̂1^(S̄) + (V_1')(V_1')^t)^{-1} (V_1').
Let us now consider
method III
The Helmert-blocking method is essentially a phased type of adjustment applied to a second standard problem formulation. Instead of performing the adjustment in one step, the original set of observation equations is divided into two groups, each describing one of the two overlapping networks. After having formed the corresponding normal systems one then reduces to obtain the reduced normals pertaining to the common unknowns of the two networks. Through inversion of the sum of these reduced normals one solves for the final adjusted values of the common unknowns. The remaining unknowns are found by means of back-substitution (e.g. Wolf, 1978).
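In linear-algebra terms the reduction-and-back-substitution scheme just described is a Schur-complement elimination; a minimal sketch with a made-up regular normal matrix:

```python
import numpy as np

# Reduction of a normal system to the common unknowns (Schur complement),
# then back-substitution -- the core of Helmert blocking. Data are made up.
rng = np.random.default_rng(1)
A = rng.normal(size=(12, 5))
N = A.T @ A + 0.1 * np.eye(5)             # regular normal matrix (toy)
n = rng.normal(size=5)

c, r = [0, 1], [2, 3, 4]                  # common / remaining unknowns
Ncc, Ncr = N[np.ix_(c, c)], N[np.ix_(c, r)]
Nrc, Nrr = N[np.ix_(r, c)], N[np.ix_(r, r)]

# Reduced normals pertaining to the common unknowns:
Nbar = Ncc - Ncr @ np.linalg.solve(Nrr, Nrc)
nbar = n[c] - Ncr @ np.linalg.solve(Nrr, n[r])
xc = np.linalg.solve(Nbar, nbar)
xr = np.linalg.solve(Nrr, n[r] - Nrc @ xc)   # back-substitution

x = np.empty(5)
x[c], x[r] = xc, xr
print(np.max(np.abs(N @ x - n)))             # agrees with the full solution: ~0
```

In the two-network setting the reduced normals of both networks are formed this way and added before solving for the common unknowns.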
If we reduce (3.1.1.a)'s normal system for the Δx_2-parameters, it becomes N_1 Δx̂_1 = Δn_1, with

P_2 = I − A_2 (A_2^t Q_y^{-1} A_2)^{-1} A_2^t Q_y^{-1} ,  N_1 = (P_2 A_1)^t Q_y^{-1} (P_2 A_1) ,

and S = R(S) complementary to Nu(P_2 A_1). Likewise, reducing (3.1.1.b)'s normal system for the Δx̄_3-parameters gives N̄_1 Δx̄̂_1 = Δn̄_1, with

P̄_3 = I − Ā_3 (Ā_3^t Q_ȳ^{-1} Ā_3)^{-1} Ā_3^t Q_ȳ^{-1} ,  N̄_1 = (P̄_3 Ā_1)^t Q_ȳ^{-1} (P̄_3 Ā_1) ,

and S̄ = R(S̄) complementary to Nu(P̄_3 Ā_1). [The explicit solution formulae (3.2.16) are garbled in the source.]
is taken care
and
NU(N')
1
= R(Vl)
, or i1
For our example of subsection 3.1 such a modification of N₁ would mean that we eliminate the scale- and orientational information of the first network. And likewise, elimination of the orientational information of the second network would correspond to modifying Ñ₁ to a matrix with null space R(V̄₁'). Since by assumption the first network contains more information than the second, we will opt for modifying Ñ₁. For our example this means that we eliminate the orientation of the second network in favour of the astronomical orientation of the first network.
Modifying Ñ₁ corresponds to extending model (3.1.1.b) with the extra transformation parameters, so that the extended parameter vector is of order (n₁ + n₃ + r − q) × 1. The associated projector and S-system read

P̄₃ = I − A₃(A₃^t Q_ȳ^-1 A₃)^-1 A₃^t Q_ȳ^-1, and S̄ = R(S̄) complementary to Nu(P̄₃A₁) ⊃ R(V̄₁').

Note that the two solutions (3.2.16.b) and (3.2.18) transform with an appropriate S-transformation to (3.2.17).
Relaxing the reduced normal system of the second network with the aid of (V₁') and adding it to that of the first network, one obtains the solution

Δx̂₂^(S) = (A₂^t Q_y^-1 A₂)^-1 A₂^t Q_y^-1 (Δy − A₁ Δx̂₁^(S)),
Δp̂ = −((V₁')^t (P̄₃A₁)^t Q_ȳ^-1 (P̄₃A₁)(V₁'))^-1 (V₁')^t (P̄₃A₁)^t Q_ȳ^-1 (Δȳ − A₁ Δx̂₁^(S)),
Δx̂₃^(S̄) = (A₃^t Q_ȳ^-1 A₃)^-1 A₃^t Q_ȳ^-1 [Δȳ − (A₁(V₁') + A₃(V₃')) Δp̂ − A₁ Δx̂₁^(S)],

with S = R(S) complementary to Nu(P₂A₁), and with the reduced normals

N₁ = (P₂A₁)^t Q_y^-1 (P₂A₁) and Ñ₁ = (P̄₃A₁)^t Q_ȳ^-1 (P̄₃A₁).   (3.2.20)

Summarizing, the procedure reads:   (3.2.21)

a) Reduce the normal systems of the two networks to the common unknowns: N₁ Δx̂₁ = Δn₁ and Ñ₁ Δx̂₁ = Δñ₁.
b) Relax the reduced normal system of the second network with the aid of (V₁').
c) Add the two reduced normal systems and solve for the common unknowns Δx̂₁^(S).
d) Solve for the transformation parameters Δp̂.
e) Find the remaining unknowns by back-substitution into (3.1.1.a) and (3.1.1.b), with the solutions given above.
In the above approach to the Helmert blocking procedure we have seen that, as a consequence of our general assumptions (3.1.1), the reduced normals are singular: the rank deficiencies have not been taken care of before applying the Helmert blocking procedure. The reduced normals will namely be regular if for instance the S-systems of both networks are defined a priori in their non-overlapping parts. The question that remains to be answered is then, whether one can still apply the procedure as outlined in (3.2.21). With some slight modifications we will see that the answer is in the affirmative.
If the S-systems of both networks are defined a priori in their non-overlapping parts, the two networks give the solutions

Δx̂₁ = ((P₂'A₁)^t Q_y^-1 (P₂'A₁))^-1 (P₂'A₁)^t Q_y^-1 Δy,
Δx̂₂^(S₂) = S₂(S₂^t A₂^t Q_y^-1 A₂ S₂)^-1 S₂^t A₂^t Q_y^-1 (Δy − A₁ Δx̂₁),   (3.2.22.a)

with P₂' = I − A₂S₂(S₂^t A₂^t Q_y^-1 A₂ S₂)^-1 S₂^t A₂^t Q_y^-1, and
S₂ = R(S₂) complementary to Nu(P₁A₂) = R((I − A₁(A₁^t Q_y^-1 A₁)^-1 A₁^t Q_y^-1)A₂),

and

Δx̂₁ = ((P̄₃'A₁)^t Q_ȳ^-1 (P̄₃'A₁))^-1 (P̄₃'A₁)^t Q_ȳ^-1 Δȳ,
Δx̂₃^(S̄₃) = S̄₃(S̄₃^t A₃^t Q_ȳ^-1 A₃ S̄₃)^-1 S̄₃^t A₃^t Q_ȳ^-1 (Δȳ − A₁ Δx̂₁),   (3.2.22.b)

with P̄₃' = I − A₃S̄₃(S̄₃^t A₃^t Q_ȳ^-1 A₃ S̄₃)^-1 S̄₃^t A₃^t Q_ȳ^-1, and
S̄₃ = R(S̄₃) complementary to Nu(P̄₁A₃) = R((I − A₁(A₁^t Q_ȳ^-1 A₁)^-1 A₁^t Q_ȳ^-1)A₃).

These two solutions are easily verified by transforming with the appropriate S-transformations the two solutions (3.2.16.a) and (3.2.16.b).
For the Helmert blocking procedure we have in the above case the disposal of the reduced normal systems

N₁'' Δx̂₁ = Δn₁'' and Ñ₁'' Δx̂₁ = Δñ₁'',

with

N₁'' = (P₂''A₁)^t Q_y^-1 (P₂''A₁), Δn₁'' = (P₂''A₁)^t Q_y^-1 Δy,
Ñ₁'' = (P̄₃''A₁)^t Q_ȳ^-1 (P̄₃''A₁), Δñ₁'' = (P̄₃''A₁)^t Q_ȳ^-1 Δȳ.

These reduced normals are regular because of the a priori S-system definition. Applying the steps of (3.2.21), i.e. relaxing the reduced normal system of the second network with the aid of (V₁'), adding the two reduced normal systems and solving for Δx̂₁'' and Δp̂, we get the solution (3.2.23.e).
We thus see that also in case the S-systems of both networks are defined a priori, one can apply the procedure as outlined in (3.2.21). Note, however, that Δp̂ will not be invariant to the choice of S-systems. This emphasizes once more our earlier remark about the interpretability of the transformation parameters.
Note that solution (3.2.23.e) is essentially the same as solution (3.2.19) or (3.2.21). One can verify this by showing that (Δx̂₁'', Δx̂₂^(S₂), Δx̂₃^(S̄₃)) transforms with an appropriate S-transformation to (Δx̂₁^(S), Δx̂₂^(S), Δx̂₃^(S̄)).
In this section we have seen how the three customary methods for connecting geodetic networks generalize if one starts from the general assumptions (3.1.1).
As to the first two methods, it is interesting to remark that in the geodetic literature one usually assumes either one of the following two attitudes when discussing the problem of connecting geodetic networks: Either one places the whole discussion in the context of free networks, thereby suggesting that free networks are really something special and that they should not be confused, let alone be compared, with "ordinary" networks. Or, one assumes the attitude that the coordinates of the two overlapping networks merely differ by a similarity transformation, which is easily taken care of by estimating the transformation parameters in a least-squares sense. Both attitudes are however needlessly restrictive. Although in the first approach one is normally very careful in stating what type of networks are involved, one usually starts from the too restrictive assumption that the networks admit all degrees of freedom of a similarity transformation. We also found that method I is more tractable computationwise than method II, in some cases at least.
As to the third method, we showed how one should go about it when the S-systems are defined either before or after the merging of the two reduced normals. Here also the fact that in general not all transformation parameters can be treated on an equal footing became apparent.
Some authors have proposed in the context of method III to give weights to some of the transformation parameters. They argue that in case of, for instance, two networks which are both known to contain orientational information, this is a way of deciding how much of the orientational information of both networks is carried over to the final solution. This in itself is true of course, but we do not think that in general this is an advisable way to go about it, since it has an element of arbitrariness in it. So far namely, no objective criterion has been proposed on the basis of which to decide to follow such a procedure. It seems therefore more advisable to decide on the basis of statistical tests whether or not the two networks significantly differ in their orientation.
As a final remark we mention that in this chapter we have adopted the customary assumption that the coordinate systems in which the two networks are described differ only differentially. If this is not the case then one has to have recourse to either a preliminary transformation which makes the two networks coincide approximately or to an iteration. In the next chapter we will see that in some cases one can do without an iteration and formulate an exact non-linear solution.
As a general solution of the linear unbiased estimation problem we found that the actual adjustment problem was solved by an inverse B: M → N of the linear map A: N → M.
In this chapter we take up the study of non-linear adjustment, a problem which heretofore has almost been avoided in the geodetic literature. To this end we replace the linear map A by a non-linear map y: N → M. Instead of the linear model (1.1) we then have the non-linear model (1.2).
It seems natural now to extend our results of the linear theory to the companion problem of non-linear operators. But unfortunately one can very seldom extend the elegant formulations and solution techniques from linear to non-linear situations.
In correspondence with the linear theory the problem of non-linear adjustment can roughly be divided into (a) the problem of finding the estimates x̂ and ŷ, and (b) the problem of finding the statistical properties of x̂ and ŷ. As to the first problem, we have the least-squares adjustment problem

min_{x∈N} E(x) = min_{y∈N̄=y(N)} (y_s − y, y_s − y)_M = (y_s − ŷ, y_s − ŷ)_M,

and maps P̂: M → N̄ and y^-1: N̄ → N   (1.3)

such that ŷ = P̂(y_s) and x̂ = y^-1(ŷ), with y^-1 ∘ y = identity.
Due, however, to the non-linearity of map y it is very seldom that one can find closed expressions for the maps P̂ and y^-1.
Iterative techniques must therefore be used. These produce a sequence of points x₀, x₁, x₂, ..., starting from x₀, the initial guess, and converging to the point x̂. Most methods proceed according to the following scheme:

x_{q+1} = x_q + t_q Δx_q,  q = 1, ..., n;  no summation over q,   (1.4)

(i) choose an initial guess x₀;
(ii) determine an increment vector Δx_q;
(iii) determine a scalar t_q such that

||y_s − y(x_{q+1})||_M ≤ ||y_s − y(x_q)||_M,

i.e. such that the q-th step may be considered to be an improvement over the (q−1)-th step. If no further improvement can be made, accept x_q as an approximation of x̂.
Generally one can say that the individual methods falling under (1.4) differ in their choice of the increment vector Δx_q and the scalar t_q. The iterative techniques fall roughly into two classes: direct search methods and gradient methods. Direct search methods are those which do not require the explicit evaluation of any partial derivatives of the function E, but instead rely solely on values of the objective function E, plus information gained from the earlier iterations. Gradient methods on the other hand are those which select the direction Δx_q on the basis of partial derivatives of the objective function E of adjustment problem (1.3). For a comprehensive survey of the various methods we refer the reader to the encyclopaedic work of (Ortega and Rheinboldt, 1970). Instead, we restrict ourselves to that gradient method which seems to be preeminently suited for our least-squares adjustment problem, namely Gauss' iteration method. This method can be considered as the natural generalization of the linear case and it is the only method which fully exploits the sum of squares structure of the objective function E.
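As a numerical illustration, one step of Gauss' iteration method solves the least-squares problem of the model linearized at the current point. The sketch below is a minimal implementation under assumptions: the distance-observation example, the names and the fixed step length t_q = 1 are hypothetical, not taken from the text.

```python
import numpy as np

def gauss_method(y_fun, jac, ys, x0, Qy_inv, iters=20):
    """A sketch of Gauss' iteration method: each step solves the
    linearized least-squares problem at the current point x_q,
        dx_q = (J^t Qy^-1 J)^-1 J^t Qy^-1 (ys - y(x_q)),
    i.e. ys is projected orthogonally onto the tangent space of the
    submanifold y(N) at y(x_q); the scalar t_q is kept at 1."""
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        J = jac(x)                          # Jacobian of y at x_q
        r = ys - y_fun(x)                   # residual vector
        x = x + np.linalg.solve(J.T @ Qy_inv @ J, J.T @ Qy_inv @ r)
    return x

# hypothetical example: two distance observations from the known
# stations (0,0) and (4,0) to the unknown point x
stations = np.array([[0.0, 0.0], [4.0, 0.0]])
y_fun = lambda x: np.linalg.norm(stations - x, axis=1)
jac = lambda x: (x - stations) / np.linalg.norm(stations - x, axis=1)[:, None]
ys = np.array([5.0, np.sqrt(17.0)])         # consistent with the point (3,4)
x_hat = gauss_method(y_fun, jac, ys, [3.0, 2.0], np.eye(2))
```

Because the simulated observations are consistent (zero residual vector), the iteration converges to the exact point.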
As to the second problem, namely that of finding the statistical properties of the estimators involved, we will not present a complete treatment of the statistical theory dealing with non-linear adjustment. We cannot expect a well working theory for the non-linear model as we know it for the linear model. The distribution of the estimator x̂, for instance, depends on both the non-linear map P̂ and on the distribution of the data. Hence, it depends on the "true" values of the parameters, which are generally unknown. Therefore, even when we can derive a precise formula for the distribution of the estimator, we can evaluate in general only the approximation obtained by substituting the estimated parameter values for the "true" ones.
T + ΔT = cos(kΔs) T + sin(kΔs) N,
N + ΔN = −sin(kΔs) T + cos(kΔs) N,

i.e., the frame (T, N) undergoes a rotation depending on the curvature k of the plane curve as one moves from the point on the curve corresponding to s to the point corresponding to s + Δs.
In section 4 we study how the curvatures of the submanifold affect the local behaviour of the multivariate Gauss' iteration method. At the end of subsection 4.4 we summarize the more important conclusions. The section is ended with a subsection in which we show how Gauss' method can be made into a globally convergent iteration method.
In section 5 we start by considering the classical two dimensional Helmert transformation as a typical example of a totally geodesic submanifold, i.e. a submanifold whose extrinsic curvatures vanish identically.
We cannot expect to convey here much of the theory of Riemannian geometry. For a comprehensive treatment of the theory we refer the reader to the relevant mathematical literature (see e.g. Spivak, 1975).
Riemannian geometry is a generalization of metric differential geometry of surfaces. Instead of surfaces one considers n-dimensional Riemannian manifolds. These are obtained from differential manifolds by introducing a Riemannian metric, that is, a metric defined by a quadratic differential form whose coefficients are the components of a two times covariant positive definite symmetric tensor field. The corresponding geometry is called Riemannian geometry.
Surfaces, with their usual metric inherited or induced from the ambient 3-dimensional Euclidean space, are 2-dimensional Riemannian manifolds, and part of our considerations will be a generalization of ideas from the theory of surfaces and curves. However, for n = 1 or 2 there are many simplifications that have no counterpart when n > 2.
In this section we only present briefly some of the basic notions of Riemannian geometry. We first consider manifolds. An n-dimensional differentiable or smooth manifold can roughly be described as a set of points tied together continuously and differentiably, so that the points in any sufficiently small region can be put into a one-to-one correspondence with the points of an open region of IR^n. That correspondence furnishes then a coordinate system for the neighbourhood. Moreover the passage from one coordinate system to another is assumed to be smooth in the overlapping region.
The manifold concept generalizes and includes the special cases of the real line, plane, linear vector space and surfaces which are studied in the classical theory. The mathematician (see e.g. Hirsch, 1976) usually begins his development of differential topology by introducing some primitive concepts, such as sets and topology of sets, then builds an elaborate framework out of them and uses that framework to define the concept of a differential manifold. For our present application, however, we can ignore most of the topological aspects. They are either very natural, such as continuity and connectedness, or highly technical. Moreover, our analysis in subsequent sections will mainly be of a local nature, i.e. differential geometry in the small. For differential geometry in the small one can do without the global considerations in most cases, since one assumes that a single coordinate system without singularities covers the portion of the manifold studied.
We have chosen to define manifolds as subsets of some big, ambient space IR^k. This has the advantage that manifolds appear as objects already familiar to those who studied the classical theory of surfaces and it also enables us to surpass many of the topological concepts. Suppose that N is a subset of some big, ambient space IR^k. Then N is an n-dimensional manifold if it is locally diffeomorphic to IR^n; this means that each point x of N has a neighbourhood U in N which is diffeomorphic to an open set V of IR^n. The maps h: V → U and h^-1: U → V are then said to be smooth, and writing h^-1 = (x¹, ..., x^n), the n functions x^a, a = 1, ..., n, furnish the local coordinates of the neighbourhood.
As a simple geodetic example of a manifold, let N be the set of all planar geodetic networks having, say, ½n number of points. Each network is then described by its n cartesian coordinates, and N can be identified with IR^n. The most simple coordinization is to take h^-1: N → IR^n as the identity map. The coordinate functions are then the standard cartesian coordinates. However, one could of course also take polar coordinates, cylindrical coordinates, spherical coordinates or any of the other customary curvilinear coordinates, provided they are suitably restricted so as to be one-to-one and have non-zero Jacobian determinant.
As a second example, consider the subset of N consisting of all planar geodetic networks having ½n number of points, with the additional restrictions that, say, some distances between some network points are taken to be constant. Then this subset can be shown to be a manifold as well.
Next we consider tangent vectors. Let c(t) be a smooth curve on N. In terms of local coordinate functions the curve is given by the n functions c^a(t) = x^a(c(t)), and the tangent vector T of the curve acts on a smooth function as the directional derivative along the curve, so that T(x^β) = dc^β/dt. Applying T to the local coordinate functions x^a, a = 1, ..., n, we obtain the traditional velocity vector, i.e. T^a = dc^a/dt. So, a tangent vector T is now a differential operator on the smooth functions of N at c and is written as

T = T^a ∂_a.

In terms of local coordinates the differential operators ∂_a, a = 1, ..., n, form a basis of T_cN. If the components T^a(x) are smooth functions, then T = T^a(x) ∂_a is called a vector field on N.
The subject of connections begins by observing that the tangent spaces T_xN, T_{x'}N at two neighbouring points x and x' are different vector spaces, so that a vector at x' cannot directly be compared to a vector at x. A connection is essentially a structure that relates T_{x'}N to T_xN. Let T be a tangent vector field along the curve c. The covariant derivative of T is the rate of change of T with respect to t. This covariant derivative will differ from the ordinary partial derivative; the quantity that measures this difference is the connection.
Let X and Y be vector fields on N. The covariant derivative of Y with respect to X is then denoted by ∇_X Y and it is a vector field on N. The application of the operator ∇ is defined to be linear in both its arguments and must satisfy the chain rule

∇_X(fY) = X(f) Y + f ∇_X Y,

where f is any real-valued smooth function on N. With the local coordinate expressions X = X^α ∂_α, Y = Y^β ∂_β, we therefore get

∇_X Y = X^α (∂_α Y^β) ∂_β + X^α Y^β ∇_{∂_α} ∂_β,

which shows that ∇_X Y is totally specified once ∇_{∂_α} ∂_β is given. It is customary to express these vector fields in the coordinate fields ∂_γ as

∇_{∂_α} ∂_β = Γ^γ_{αβ} ∂_γ,  α, β, γ = 1, ..., n.   (2.2)

The n³ real-valued smooth functions Γ^γ_{αβ} are called the connection coefficients.
A curve c(t) is called auto-parallel if its covariant derivative with respect to the direction T = dc/dt is identically zero. These curves are called geodesics. Since the covariant derivative ∇_T T measures the rate of change of T in the direction T under parallel transport, an equation describing the above definition of a geodesic is simply

∇_T T = 0.   (2.4)

With T = (dc^α/dt) ∂_α this reads in local coordinates

d²c^γ/dt² + Γ^γ_{αβ} (dc^α/dt)(dc^β/dt) = 0,  γ = 1, ..., n.
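To make the geodesic equation ∇_T T = 0 concrete, the sketch below integrates its local coordinate form on a hypothetical example not taken from the text: the unit sphere with coordinates (θ, φ), whose nonzero connection coefficients are Γ^θ_{φφ} = −sin θ cos θ and Γ^φ_{θφ} = Γ^φ_{φθ} = cot θ.

```python
import math

def sphere_geodesic(theta, phi, dtheta, dphi, h=1e-3, steps=1000):
    """Euler integration of the geodesic equation on the unit sphere:
        theta'' = sin(theta) cos(theta) phi'^2
        phi''   = -2 cot(theta) theta' phi'."""
    for _ in range(steps):
        ddtheta = math.sin(theta) * math.cos(theta) * dphi * dphi
        ddphi = -2.0 * (math.cos(theta) / math.sin(theta)) * dtheta * dphi
        theta, phi = theta + h * dtheta, phi + h * dphi
        dtheta, dphi = dtheta + h * ddtheta, dphi + h * ddphi
    return theta, phi

# starting on the equator and moving along it: the geodesic (a great
# circle) stays on the equator
theta_end, phi_end = sphere_geodesic(math.pi / 2, 0.0, 0.0, 1.0)
```

On the equator both connection terms vanish, so θ remains π/2 while φ advances linearly, exactly as the auto-parallel definition requires.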
So far we have equipped the manifold N only with a connection, given by the defining equation (2.2). We will now give it some additional structure. Assume given a smooth, real-valued, symmetric and positive definite bilinear form (.,.): T_xN × T_xN → IR on each tangent space of N; (.,.) is called the metric of N. A connection satisfying ∇_{∂_α}∂_β = ∇_{∂_β}∂_α is said to be symmetric or torsion-free, and a connection that leaves the metric invariant under parallel transport is said to be metric.
Up till now we have considered only one manifold N. Let us now consider two manifolds N and M, and a smooth injective map y between them, i.e. y: N → M. Then N̄ = y(N) ⊂ M defines a submanifold of M.
The map y provides a way of mapping vectors on N into vectors on M. The image of T_xN under y is a tangent space of N̄ at y(x). The map induced by y is written y_*: T_xN → T_{y(x)}N̄ and is called the push forward of y. The precise action on a vector X ∈ T_xN is such that given a function f on M, so that f(y(x)) is a function on N, then y_*X is defined by (y_*X)f = X f(y(x)). With X = X^α ∂_α this would give in local coordinates

y_*X = X^α (∂_α y^i) ∂_i,

where y^i, i = 1, ..., m, is the coordinization of the map y: N → M and ∂_i, i = 1, ..., m, the corresponding coordinate vector fields.
Although it is possible to suppress explicit reference to the map y, i.e. to identify N with the subset y(N) of M and T_xN with the subset y_*(T_xN) of T_{y(x)}M, we will not do so. Recall namely that also in the case of linear maps we are not used to identifying the range space with the domain space.
As a closing of this section we define the observation- and parameter space of our adjustment. In our least-squares adjustment context the observation space M is taken to be Euclidean with Euclidean metric (.,.)_M. Manifold N will play the role of the parameter space and the non-linear map y replaces the linear map A which has been used hitherto. Manifold N will be endowed with a Riemannian metric by pulling the metric of M back by y. That is, given the metric of M we define the metric of N by

(X, Y)_N = (y_*X, y_*Y)_M  for any X, Y ∈ T_xN.
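In coordinates, pulling the Euclidean metric of M back by y amounts to g_{αβ}(x) = (J^t J)_{αβ}, with J the Jacobian of y at x. The following is a minimal sketch on a hypothetical map (the unit circle in M = IR²), not an example from the text.

```python
import numpy as np

def induced_metric(jac, x):
    """Coordinate expression of the pulled-back metric of N:
    g(x) = J(x)^t J(x), with J the (m x n) Jacobian of y: N -> M."""
    J = np.atleast_2d(jac(x))
    return J.T @ J

# hypothetical one dimensional example: y(t) = (cos t, sin t) in M = R^2
jac = lambda t: np.array([[-np.sin(t)], [np.cos(t)]])
g = induced_metric(jac, 0.3)   # the 1x1 "metric matrix" of the parameter t
```

For this map the tangent vector has unit length everywhere, so the induced metric is constant; rescaling the curve rescales the metric accordingly.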
3.1. Gauss' iteration method

We start with the simplest class of problems, namely those in which manifold N is one dimensional. In case of our least-squares problem

min_{y∈N̄=y(N)} (y_s − y, y_s − y)_M = (y_s − ŷ, y_s − ŷ)_M,

the map y: t ∈ IR = N → M describes a curve, which we will denote in this section by c: IR = N → M. Our adjustment problem then reads

min_{y∈N̄=c(N)} (y_s − y, y_s − y)_M = (y_s − ĉ, y_s − ĉ)_M.   (3.3)

A necessary condition for ĉ to be the least-squares solution of (3.3) is that

(c_*(d/dt)|_t̂, y_s − ĉ)_M = 0,   (3.4)

with d/dt a basis of T_t IR = T_t N.
In the linear case it was necessary and also sufficient for the residual vector to be orthogonal to the linear submanifold N̄ = AN. Since the submanifold N̄ = c(N) is now non-linear, the residual vector needs to be orthogonal to the linear tangent space T_ĉ(c(N)) = T_ĉN̄ at ĉ. But due to the non-linearity we need to know ĉ in order to construct this tangent space, and ĉ is generally unknown a priori. Hence our minimization problem cannot be solved directly. Expression (3.4) does however suggest a way of solving our adjustment problem. Instead of orthogonally projecting y_s onto N̄, one computes the orthogonal projection of y_s onto the tangent space T_{c(t_q)}N̄ at the current approximation. In general this projection will not yet satisfy (3.4); the remaining non-orthogonality is measured by

(c_*(d/dt)|_{t_q}, y_s − c(t_q))_M.   (3.5)
But by pulling the non-orthogonality as measured by (3.5) back to the Riemannian manifold N, we get

Δt_q = g(t_q)^-1 (c_*(d/dt)|_{t_q}, y_s − c(t_q))_M,   (3.6)

and the iteration scheme

t_{q+1} = t_q + Δt_q.   (3.7)

This is Gauss' iteration method and it consists of successively solving a linear least-distance adjustment problem until condition (3.4) is met.
Before we now proceed with studying the local behaviour of Gauss' iteration method (3.7), we will first derive some local geometric properties of the space curve c itself. An appropriate approach for studying the local geometry of curve c is by using its Frenet frame. Let T be the unit tangent field of c. Since (T, T)_M = 1 for all admissible t ∈ IR, we have

(D_T T, T)_M = 0,

which shows that D_T T is orthogonal to the unit tangent field T. We define the first curvature k₁ as

k₁ = ||D_T T||_M,

and when k₁ ≠ 0 the first normal N₁ by D_T T = k₁N₁. Geometrically the first curvature k₁ can be seen to determine the rate of change of the direction of the tangent to the curve with respect to its arclength, where arclength is defined as s(t) = ∫ ||V(τ)||_M dτ. Thus D_T N₁ + k₁T is orthogonal to T, and similarly it follows from ||N₁||_M = 1 that it is orthogonal to N₁ as well; its length defines the second curvature k₂, and when k₂ ≠ 0 the second normal N₂ by D_T N₁ + k₁T = k₂N₂. We can proceed in this way to define k₃, N₃ etc. The vectors T, N₁, N₂, ... are called the Frenet vectors and the equations that express the D_T T, D_T N_i in terms of the Frenet vectors are called the Frenet equations. For the case m = 3 they read as

D_T T = k₁N₁,
D_T N₁ = −k₁T + k₂N₂,
D_T N₂ = −k₂N₁.
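The first curvature can also be checked numerically: with arclength parametrization, D_T T is simply the second derivative of c with respect to s, so a central difference approximates k₁. The circle example below (radius 2, hence k₁ = 1/2) is a hypothetical illustration, not an example from the text.

```python
import numpy as np

def first_curvature(c, s, h=1e-4):
    """Numerical first curvature k1 = ||D_T T|| of a curve in Euclidean
    space, assuming c is parametrized by arclength s: D_T T is then the
    second derivative of c with respect to s (central difference)."""
    ddc = (c(s + h) - 2.0 * c(s) + c(s - h)) / (h * h)
    return np.linalg.norm(ddc)

# circle of radius r, arclength-parametrized; theory gives k1 = 1/r
r = 2.0
c = lambda s: np.array([r * np.cos(s / r), r * np.sin(s / r)])
k1 = first_curvature(c, 0.7)
```

The result is independent of the evaluation point s, as it must be for a circle, whose curvature is constant.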
In order to find the relative position of the curve c with respect to its Frenet frame at some regular point, we can study the projections of the curve onto the planes of the Frenet frame. For convenience we assume that the curve c has been parametrized with the arclength parameter s. Now let our point, P say, correspond to the value s = 0. The curve can then be written in the form

c(s) = c(0) + s (dc/ds)_o + (s²/2)(d²c/ds²)_o + (s³/6)(d³c/ds³)_o + o(s³).

The subscript "o" denotes that the value is taken at the point corresponding to s = 0, and the symbol o(s³) means that o(s³)/s³ → 0 if s → 0. Since

D_T T = k₁N₁,

it follows that

D_T D_T T = D_T(k₁N₁) = T(k₁)N₁ + k₁ D_T N₁ = (dk₁/ds) N₁ + k₁(−k₁T + k₂N₂)
          = −k₁² T + (dk₁/ds) N₁ + k₁k₂ N₂.
Choose now a special coordinate system in M such that the point P under consideration is the origin and the vectors T_o, N_{1,o}, N_{2,o} are the first three basis vectors. With respect to this frame the curve has the canonical representation

c¹(s) = s − (k₁,o²/6) s³ + o(s³),
c²(s) = (k₁,o/2) s² + ((dk₁/ds)_o/6) s³ + o(s³),
c³(s) = (k₁,o k₂,o/6) s³ + o(s³),

at P and s = 0. It will be clear that many curves exist which have up to o(s³) the same canonical representation as c(s). For s small enough these curves behave alike and are thus indistinguishable.
We will now give a characterization of such "kissing" curves and of one of them in particular, namely the "kissing" circle.

3.3. The "kissing" circle

Consider two curves c₁(s₁) and c₂(s₂) having a point P in common. They are said to have a contact of order at least two at P if their coordinate functions agree at P up to and including the second derivatives, i.e.

c₁^i(0) = c₂^i(0), (dc₁^i/ds)(0) = (dc₂^i/ds)(0), (d²c₁^i/ds²)(0) = (d²c₂^i/ds²)(0),  i = 1, ..., m,

where the coordinates of the two curves are given with respect to a fixed frame of M. With (3.15) follows then that two curves have a contact of order at least two at a common point P if and only if they have at P a common tangent vector T_o, a common normal N_{1,o} and moreover, the same curvature k₁(0). All such curves will thus have the same canonical representation up to second order. And in the above sense of contact such curves can be considered each other's best approximation.
Now, if we recall our iteration scheme (3.7) we observe that only first order derivative information is used. Hence, for a small enough portion of the curve c(s) about the least-squares solution ĉ = c(0), we may replace c(s) by any curve having a contact of order at least two with it at ĉ. In fact, with the same approximation we can replace the space curve c(s) by the circle C(s) which lies in the plane spanned by T_o and N_{1,o}, passes through ĉ and has radius k₁(0)^-1. This follows from the fact that such a circle has at ĉ the same tangent, the same first normal and the same first curvature as c(s). Thus we can use the circle C(s) to replace the curve c(s) in a neighborhood of P. The circle C(s) is known as the osculating (= "kissing") circle of c(s) at ĉ = c(0), or the circle of curvature.
Note that by replacing c(s) by C(s) we achieve a drastic simplification of our original non-linear least-squares adjustment problem. Firstly, the circle is a much simpler object of study than the space curve c(s). And secondly we can now exploit the simple geometry of the osculating circle C(s) in order to understand the local behaviour of Gauss' iteration method (3.7).
Consider therefore the situation as sketched in figure 24: the observation point y_s, the osculating circle C(s) with radius k₁(0)^-1, the least-squares solution ĉ, and the successive iteration points C(s₁) and C(s₂).

figure 24

From the figure one obtains with elementary trigonometry the relation between ||ĉ − C(s₂)||_M and ||y_s − C(s₁)||_M in terms of the angles α and φ₁. With φ₁ = k₁(0)s₁ and s₂ = s₁ + Δs₁ this yields expression (3.20).
From this expression we can now formulate several important conclusions concerning the local behaviour of Gauss' iteration method as applied to the curve c(s): First of all expression (3.20) tells us that in case k₁(0) ≠ 0, the local convergence behaviour of Gauss' iteration method as applied to the space curve c(s) is linear. That is, the computed arclength of the curve c(s) from ĉ to c(s_{q+1}) depends linearly on the computed arclength from ĉ to the point c(s_q) of the preceding step. Secondly, a necessary condition for convergence of Gauss' iteration method is that this linear convergence factor is less than one in absolute value. And thirdly, expression (3.20) shows that the local linear convergence behaviour is determined by two terms, namely the first curvature k₁ of the curve at ĉ and the projection (N_{1,o}, y_s − ĉ)_M of the residual vector y_s − ĉ onto the first normal.
So far the curve was parametrized by its arclength s. In general, however, the curve will be given as c(t), with t an arbitrary admissible parameter. The question that remains is then whether the above given conclusions still hold when t ≠ s. To study this more general case, it seems appropriate to look for the direct analogon of the Frenet equations (3.13). With V = c_*(d/dt) = s'T, where s' = ds/dt, it follows that

D_V V = (s')^-1 (s'') V + (s')² k₁ N₁,
D_V N₁ = (s') D_T N₁, and D_V N₂ = (s') D_T N₂.

With these last three equations we can now replace (3.13) by their arbitrary-parameter counterparts. For m = 3, these equations can be considered as the one-dimensional analogons of the Gauss- and Weingarten equations.
Now let us return to our adjustment problem and see how the equations (3.23) come to our use for describing the local properties of iteration scheme (3.7). First observe that (3.7) can also be written as a fixed point iteration in t_q. Expanding the right-hand side into a Taylor series about the least-squares solution t̂, and using (3.23) together with the orthogonality condition at t̂, one finds for the local linear convergence factor the very same expression as before. But this is exactly the result we obtained in (3.20) for the special case t = s. Hence, we have as a fourth conclusion that the local linear convergence behaviour of Gauss' iteration method as applied to the space curve c(t) is invariant to any admissible non-linear parameter transformation. It is thus idle hope to think that one can improve the convergence behaviour by changing to a different coordinate system.
Consider now the special case k₁ = 0. Then

D_T T = 0,

which means that the unit tangent vector T is parallel along the whole curve c(t). And since M is Euclidean by assumption, this means that the curve c(t) is a straight line. From (3.25) follows then that the linear convergence factor vanishes, and with g(t̂) = (s'(t̂))² follows then, for k₁ = 0, a local quadratic convergence behaviour.
Hence, for the case the curve c(t) is a straight line (k₁ = 0), we have a local quadratic convergence behaviour. But how is this possible? Doesn't orthogonal projection onto a straight line correspond to the case of linear least-squares adjustment? And if so, wouldn't that mean that iterations are superfluous? The answer is partly in the affirmative and partly in the negative. It essentially boils down to our earlier remarks made in the previous chapters, namely that adjustment in the general sense should be thought of as being composed of the problem of adjustment in the narrow sense, i.e. the problem of finding the estimate ŷ in the submanifold N̄ nearest to y_s ∈ M, and the problem of inverse mapping. To see this we need to be more precise as to what we mean by "linear least-squares adjustment". Usually one means by "linear least-squares adjustment" that the coordinate functions y^i(x^a), i = 1, ..., m, a = 1, ..., n, of the map y are linear. We will, however, call a least-squares adjustment problem linear, if the submanifold N̄ is linear or flat. For our problem of orthogonal projection onto the curve c this means that the adjustment problem is termed linear if k₁ = 0. But it also means that linear least-squares problems may admit non-linear functions c^i(t), i = 1, ..., m. The non-linearity in c^i(t) is then only caused by the choice of the parameter t. That is, by choosing another parameter it is possible to eliminate the non-linearity in c^i(t). In particular if one takes the arclength parameter s, the functions c^i(s) will become linear. As a consequence we see that the local quadratic convergence factor of (3.29) is not a property of the curve c(t) itself, but instead depends on its parametrization. In the special case namely of t = s no iterations are needed then. Thus we see that with (3.29) we are actually solving for the inverse mapping problem, instead of the actual adjustment problem.
To put the argument geometrically, consider an arbitrary parametrization of the straight line c such that the parameter t is not a linear function of the arclength s. The length ||V(t)||_M of the curve's tangent vector V changes then when moving along the curve from point to point. Hence, the coordinate expression of the induced metric of N,

g(t) = (c_*(d/dt), c_*(d/dt))_M,

will be a function of the parameter t. But this means that when one applies formula (3.7) of Gauss' iteration method one is in fact using two different "yardsticks". One yardstick given by the pulled back metric of the tangent space of the curve c at point c(t_q), namely g(t_q), and a second yardstick, namely g(t), the induced metric of the parameter space N itself. And it will be clear that the induced metric g(t_q) of the linear tangent space T_{t_q}N will be constant for the whole space, whereas the induced metric g(t) of N itself changes from point to point. Thus when one computes the tangent vector Δt_q (d/dt) through (3.7), to obtain t_{q+1} from t_q, one is in fact neglecting that T_{t_q}N and N carry different metrics (see figure 25).
figure 25: the Euclidean metric g_ij = (∂_i, ∂_j) of M, the constant metric g(t₁) of the tangent space T_{t₁}N, and the varying induced metric g(t) = (d/dt, d/dt) of N
For the straight line N̄ = c(N) follows namely that one is still forced to have recourse to an iteration in order to find the parameter estimate t̂, but no iteration is needed if one instead is satisfied with ĉ alone. A similar remark applies to the quadratic convergence rule (3.29) for zero residual vector adjustment problems. This is in fact not very surprising since for both the cases k₁ = 0 and y_s − ĉ = 0 we do not need an iteration to solve the actual adjustment problem. In case of k₁ = 0 the actual adjustment problem is namely linear, and in case of y_s − ĉ = 0 the least-squares solution coincides with the observation point itself.
To illustrate the theory developed so far and to demonstrate the various effects mentioned we will now give some examples.
3.6 Examples

Our first model reads

x̃_i = x̄_i cos θ − ȳ_i sin θ,
ỹ_i = x̄_i sin θ + ȳ_i cos θ,   (3.30)

where:
- i = 1, ..., n = number of network points,
- the tilde "~" sign stands for the mathematical expectation,
- x_i, y_i are cartesian coordinates of the network points,
- x̄_i, ȳ_i are the fixed given coordinates,
- y_s = (x₁, y₁, ..., x_n, y_n)^t_s is the observation vector, and
- θ is the rotation angle to be estimated.

It will be clear that the above model (3.30) determines a curve c(θ) in the observation space M. To solve for (3.30) in a least-squares sense we have to project the observation point y_s orthogonally onto c(θ). We will now compute the curvature of c(θ) and the convergence factor cf. of Gauss' iteration method as applied to (3.30).
After some rearrangement the model can be written in terms of two vectors e₁ and e₂ satisfying

(e₁, e₂)_M = 0, (e₁, e₁)_M = (e₂, e₂)_M = 1.
Hence our non-linear model (3.30) describes a circle which lies in the two-dimensional plane spanned by the orthonormal vectors e₁ and e₂ (see figure 26).

figure 26

The radius of this circle is given by the square root of I. Thus we have immediately that the first curvature of c(θ) is the inverse of this radius. For the convergence factor we furthermore need the length of the component of the residual vector projected onto N₁, the first normal of c(θ); that is, we need the length of the pseudo residual vector ȳ_s − ĉ, where ȳ_s is the orthogonal projection of y_s onto the plane spanned by e₁ and e₂. The least-squares solution θ̂ itself follows from (3.36), and the adjusted coordinates read

x̂_i = x̄_i cos θ̂ − ȳ_i sin θ̂, ŷ_i = x̄_i sin θ̂ + ȳ_i cos θ̂.

It will also be clear from the figure that solution (3.36) is a global minimum of the minimization problem (3.37), unless ||ȳ_s||_M = 0.
We will now give an alternative interpretation of the non-linear model (3.30). For the moment this alternative interpretation is only of theoretical interest. Observe that we can write model (3.30) in the form of

( x̃_i  ỹ_i ) = ( x_i°  y_i° ) ( cos θ  sin θ ; −sin θ  cos θ ),  i = 1, ..., n,

which we abbreviate as Ỹ = A X. Thus Ỹ stands for the n×2 matrix on the left hand side of (3.38), A for the n×2 matrix on the right hand side, and X for the 2×2 rotation matrix. We will denote the linear vector space of n×2 real matrices by M(n×2), and the space of 2×2 orthogonal matrices by O(2). It can be shown that O(n) is an n(n−1)/2 dimensional manifold. Thus, with

N̄ = A O(2) ⊂ M,  dim. M = 2n,

we have that dim. N = dim. N̄ = 1. It is easily verified that the inner product can be written as

( . , . )_M = trace[ ( . )^t ( . ) ].

With this and (3.41) we are now in the position of rephrasing our original least-squares problem (3.37) as

min_{X ∈ N = O(2)} ( y_s − A X, y_s − A X )_M = min_{X ∈ N = O(2)} trace[ ( Y_s − A X )^t ( Y_s − A X ) ].
And this is the formulation which we will use in our discussion of the three dimensional Helmert transformation (see subsection 5.5).
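The matrix formulation above can also be attacked numerically. As an aside (not the route taken in the text), minimizing trace[(Y − AX)^t(Y − AX)] over orthogonal X is the orthogonal Procrustes problem, solvable with one SVD. The sketch below assumes the standard metric and noise-free data; the matrices are hypothetical.

```python
import numpy as np

def fit_orthogonal(A, Y):
    """Minimize trace[(Y - A X)^t (Y - A X)] over orthogonal 2x2 X.
    Standard orthogonal-Procrustes solution via the SVD of A^t Y."""
    U, _, Vt = np.linalg.svd(A.T @ Y)
    return U @ Vt

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 2))            # n x 2 matrix of given coordinates
th = 0.3
R = np.array([[np.cos(th), np.sin(th)],
              [-np.sin(th), np.cos(th)]])  # a rotation in O(2)
X_hat = fit_orthogonal(A, A @ R)           # noise-free data: exact recovery
```

With noise-free data the orthogonal minimizer coincides with the rotation that generated the data.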
In the remaining four examples of this section we give some numerical results for some simple models, to demonstrate the various effects of Gauss' iteration method mentioned above. In all these examples we take the metric of M to be the standard metric.
Our model reads as: c^{i=1}(t) = cos(t), c^{i=2}(t) = sin(t), and the observation point given is: y_s^{i=1} = 0.5, y_s^{i=2} = 0.0.
iteration step q    t_q (rad.)    c^{i=1}(t_q)    c^{i=2}(t_q)
 1                  0.43178       0.90822         0.41849
 2                  0.22254       0.97534         0.22070
 3                  0.11218       0.99371         0.11195
 4                  0.05621       0.99842         0.05618
 5                  0.02812       0.99960         0.02812
 6                  0.01406       0.99990         0.01406
 7                  0.00703       0.99997         0.00703
 8                  0.00352       0.99999         0.00352
 9                  0.00176       0.99999         0.00176
10                  0.00088       0.99999         0.00088
11                  0.00044       0.99999         0.00044
12                  0.00022       1.00000         0.00022
13                  0.00011       1.00000         0.00011
14                  0.00005       1.00000         0.00005
15                  0.00003       1.00000         0.00003

table 1
Since the unit circle has curvature k_1 = 1, we have with the observation point y_s^{i=1} = 0.5, y_s^{i=2} = 0.0 that (k_1 N_1, y_s − ĉ)_{ĉ,M} = 0.5. And this local convergence factor is indeed clearly recognizable from the above given numerical results.
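The iteration of table 1 is easy to reproduce. The sketch below implements one Gauss step for the unit-circle model under the standard metric; the initial guess t_0 = 1.0 is an assumption (the text does not state it), so the individual iterates differ from table 1, but the error-halving behaviour is the same.

```python
import math

def gauss_step(t, ys):
    """One Gauss step for projecting ys onto the unit circle
    c(t) = (cos t, sin t): move along the unit tangent
    c'(t) = (-sin t, cos t) by the projected residual (c'(t), ys - c(t))."""
    r0, r1 = ys[0] - math.cos(t), ys[1] - math.sin(t)
    return t + (-math.sin(t)) * r0 + math.cos(t) * r1

ys = (0.5, 0.0)     # observation point of table 1
ts = [1.0]          # hypothetical initial guess t0
for _ in range(40):
    ts.append(gauss_step(ts[-1], ys))
ratio = ts[-1] / ts[-2]   # tends to the convergence factor 0.5
```

For this model the step reduces to t_{q+1} = t_q − 0.5 sin t_q, so near the solution t̂ = 0 the error is indeed halved each step.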
For the second example we take the same model, but now with the observation point y_s^{i=1} = 1.5, y_s^{i=2} = 0.0 on the other side of the curve than the centre of curvature; the iterates t_q, q = 1, ..., 6, are listed in table 2.

table 2
Here (k_1 N_1, y_s − ĉ)_{ĉ,M} = −0.5, which follows from the fact that the residual vector y_s − ĉ now points away from the centre of curvature. This example reveals another feature, namely that when the observation point y_s and the centre of curvature are on opposite sides of the curve, the steplength of each iteration step will be too long, resulting in an overshoot. Hence the oscillatory behaviour of the above iteration.
In the previous example the observation point y_s and centre of curvature were on the same side of the curve. And in that case the steplength will be too short (see figure 27). This effect is indeed clearly recognized from table 1, where the points in the sequence t_1, t_2, ... approach t̂ from one side.

figure 27

For the third example our model reads as: c^{i=1}(t) = e^{10t}, c^{i=2}(t) = e^{10t}, and the observation point is given by: y_s^{i=1} = 0, y_s^{i=2} = 2e.
iteration step q    t_q        c^{i=1}(t_q)    c^{i=2}(t_q)
 1                  0.17183    5.57494         5.57494
 2                  0.12059    3.33967         3.33967
 3                  0.10198    2.77267         2.77267
 4                  0.10002    2.71881         2.71881
 5                  0.10000    2.71828         2.71828

table 3
Since the curve onto which the observation point is projected has no curvature, the local convergence behaviour of Gauss' iteration method as applied to the above model must be quadratic. In fact, with ½(s'(t̂))⁻¹s''(t̂) = 5, the quadratic convergence rule can indeed be verified from table 3.
When viewing the t_q column of table 3 we also notice another interesting feature. We see that all iterates t_q except the initial guess t_0 stay on the same side of the solution t̂. The explanation is that the induced metric function, which for the above model reads g(t) = 200 e^{20t}, is monotonic and increasing. With a monotonic and increasing metric function one will namely have an overshoot. In the above iteration this has the following effect. Since t_0 < t̂, with the graph of g(t) we are going uphill. Hence, in the first iteration step we will have an overshoot. Thus t_1 > t̂. But for the next step this means that with the graph of g(t) we are going downhill. Hence, for the second and succeeding steps we will have an undershoot, which explains why t_1, t_2, ... all approach t̂ from one side.
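The whole of table 3 can be regenerated from the Gauss step for this model. Assuming the initial guess t_0 = 0 (the table is consistent with that choice), the iterates below match the listed values.

```python
import math

def gauss_step(t, ys):
    """One Gauss step for c(t) = (e^{10t}, e^{10t}): the increment is
    (c'(t), ys - c(t)) / (c'(t), c'(t)) with c'(t) = (10 e^{10t}, 10 e^{10t})
    and induced metric g(t) = (c'(t), c'(t)) = 200 e^{20t}."""
    c = math.exp(10.0 * t)
    num = 10.0 * c * (ys[0] - c) + 10.0 * c * (ys[1] - c)
    return t + num / (200.0 * c * c)

ys = (0.0, 2.0 * math.e)
t, iterates = 0.0, []
for _ in range(5):
    t = gauss_step(t, ys)
    iterates.append(t)
# iterates -> 0.17183, 0.12059, 0.10198, 0.10002, 0.10000 (cf. table 3)
```

The step simplifies to t_{q+1} = t_q + (e·e^{−10 t_q} − 1)/10, which exhibits the quadratic convergence to t̂ = 0.1 discussed above.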
For the fourth example we take again the unit circle, but now with an observation point lying on the circle itself: y_s^{i=1} = 1.0, y_s^{i=2} = 0.0.

iteration step q    t_q        c^{i=1}(t_q)    c^{i=2}(t_q)
 1                  0.07821    0.99694         0.07821
 2                  0.00008    1.00000         0.00008
 3                  0.00000    1.00000         0.00000
Although the unit circle has a curvature of k_1 = 1, the observation point lies on the circle. Hence, we expect a local quadratic convergence behaviour governed by rule (3.29). However, a closer look at the above results reveals a third order behaviour instead of second order. The explanation is given by the fact that t equals the natural arclength parameter s of the unit circle. Thus (s'(t))⁻¹s''(t) = 0.
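For this last example the Gauss step reduces to t_{q+1} = t_q − sin t_q, and since sin t = t − t³/6 + ..., the error indeed shrinks cubically. A small check (the initial guess is an assumption):

```python
import math

t = 0.7853          # hypothetical initial guess; the solution is t_hat = 0
errs = []
for _ in range(3):
    t = t - math.sin(t)      # Gauss step for ys = (1, 0) on the unit circle
    errs.append(abs(t))      # error |t_q - t_hat|
```

Each new error is of the order of the cube of the previous one, confirming the third order behaviour.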
3.7. Conclusions
We have seen that the least-squares solution t̂ is a strict local minimum if (k_1 N_1, y_s − ŷ_s)_{ŷ,M} < 1. We also found that the local convergence behaviour of Gauss' method is invariant to any non-linear admissible parameter transformation. The decisive factors which determine the local convergence rate are given by k_1 and y_s − ŷ_s. If either of them or both are equal to zero, then Gauss' method will have a local quadratic convergence behaviour. Instead of solving the actual adjustment problem, one is then solving for the inverse mapping problem: given ŷ_s, find the pre-image t̂ under the map c: t ∈ N = IR → M.
Can we expect the generalization to the multivariate case to be simple and straightforward? In most cases yes, although there are two points which are worth mentioning. Firstly, when we consider manifolds other than curves, we must in some way take care of the increase in dimensions. And secondly, we must recognize that a surface in a three dimensional space is the simplest object having its own internal or intrinsic geometry. In our investigation of the space curve c(t) we were led to the invariants of curvature. But these are invariants rather of the way the curve is situated in space than internal to the curve. That is, they are extrinsic invariants. A curve has no intrinsic invariants, since essentially the only candidate for this status is the natural parameter of arclength s. But this parameter it shares with a straight line, i.e. we can coordinatize a straight line with the same parameter s in such a way that distances along both curve and straight line are measured in the same way. For surfaces and manifolds in general the situation is different. It is impossible, for instance, to coordinatize the sphere so that the formula for distance on the sphere in terms of these coordinates is the same as the usual distance formula in the ambient space. A consequence is that where in the univariate case the possible local quadratic convergence behaviour of Gauss' method could be reduced to a third order behaviour by taking the arclength s as parameter, this is in general not possible in the multivariate case.
4.1. Gauss' method

In this section we will consider Gauss' method for the multivariate case of non-linear least-squares adjustment. Thus we assume dim. N = n > 1 and consider the map y: N → M. When we speak of the metric of N we mean as before the induced metric, i.e. the metric obtained by pulling the metric of M back to N:

(X, Y)_N = (y_*(X), y_*(Y))_M.
For the least-squares solution the residual vector y_s − ŷ_s must be orthogonal to the tangent space T_ŷN̄ of the submanifold N̄ = y(N) at ŷ_s, i.e.

(y_s − ŷ_s, y_*(X))_M = 0 for all X ∈ T_x̂N, with ŷ_s = y(x̂), x̂ ∈ N.  (4.2)

But ŷ_s is generally unknown a priori. Hence, our adjustment problem cannot be solved directly in general. But as in the previous section, (4.2) suggests that we take as a first approximation the orthogonal projection of y_s onto a chosen nearby tangent space T_{y_q}N̄ of N̄ at y_q = y(x_q). Of course the orthogonality condition will then in general not yet be satisfied. But by pulling the non-orthogonality as measured by (4.3) back to the Riemannian manifold N, we get

(Δx_q, X)_{x_q,N} = (y_s − y_q, y_*(X))_M for all X ∈ T_{x_q}N.  (4.4)

This scheme is thus the multivariate generalization of (3.7), and it consists of successively solving a linear least-distance adjustment problem until condition (4.2) is met.
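In coordinates, one Gauss step solves the normal equations of the linearized problem. A compact sketch for the standard metric on M (the 2-D paraboloid submanifold is hypothetical, chosen only to exercise the scheme):

```python
import numpy as np

def gauss_method(y_fun, jac_fun, ys, x0, tol=1e-12, max_iter=50):
    """Multivariate Gauss iteration: repeatedly project ys onto the tangent
    space at y(x_q) (normal equations of (4.4), standard metric on M) and
    update x_q with the pulled-back increment."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        J = jac_fun(x)                            # matrix of y_*
        r = ys - y_fun(x)                         # residual in M
        dx = np.linalg.solve(J.T @ J, J.T @ r)    # increment in T_x N
        x = x + dx
        if np.linalg.norm(dx) < tol:
            break
    return x

# hypothetical 2-D submanifold of R^3: y(x) = (x1, x2, x1^2 + x2^2)
y_fun = lambda x: np.array([x[0], x[1], x[0] ** 2 + x[1] ** 2])
jac_fun = lambda x: np.array([[1.0, 0.0],
                              [0.0, 1.0],
                              [2.0 * x[0], 2.0 * x[1]]])
ys = np.array([0.3, 0.0, 0.2])
x_hat = gauss_method(y_fun, jac_fun, ys, [0.3, 0.1])
```

At convergence the residual is orthogonal to the tangent space, which is exactly condition (4.2).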
In order to understand the local behaviour of Gauss' method we shall now proceed in a way similar to that of the previous section. One of the problems, however, we have to deal with is the increase in dimensions. Nevertheless, the linearity of the local rate of convergence of Gauss' method (4.5) is easily shown. From Taylorizing the iteration map about the solution x̂, and since grad E(x̂) = 0, we get (4.6), which proves our statement. Thus, for points close enough to the solution, the coordinate-differences of the current point x_{q+1} and x̂ depend linearly on those of the previous point x_q and x̂. Upon comparing (4.6) with our univariate result (3.26) we see that we still lack a proper geometric interpretation of the convergence factor of Gauss' method (4.5), although we can expect that in some way the curvature behaviour of the submanifold N̄ at ŷ_s will play a role, as given in (3.23).
To do so, we first recall t h a t t h e connection D of M satisfies
+ IR
,N
i.e.
= Y*(Z),
or in components
Now, l e t X,Y and Z be three vector fields on N and l e t V, W and U be their extensions. As in (3.23),
we then decompose DVW restricted t o
Dvw I i j
i,
= ~ a n g . ( ~ -w)
+ Norm.
V IN
(DW -1
V IN
With t h e connection properties (4.7), (4.8) and (4.9) of D we can then derive the following properties
for
(i)
L e t f and g be smooth functions on M and denote their pullbacks by
f = f o y,
g = g o y.
i and g
respectively, i.e.
or
Hence,
V- i y = i x ( i ) y +
fX
igvXy
~ ( i x , ; ~ )= i g
and
B(x,Y).
(4.12)
IN
(via iwj
av
ajl hi
= y*(XY-YX)
Y*(VXY)
B(X,Y)
y*(V?)
B(Y,X)
= yf ( m - Y X )
or
VXY
V?
= XY
YX
and
B(X,Y)
(iii)
F r o m (4.9), (4.10) and (4.11) follows that
And since
= B(Y,X)
(4.13)
it follows t h a t also
Concluding, (4.12),
is metric, i.e.
is
the unique Riemannian connection (also known as the induced or L e v i - C i v i t a connection) o f N which
is completely described by t h e induced m e t r i c
( .,.X
The Christoffel symbols Γ^γ_{αβ} of ∇ are defined by ∇_{∂_α}∂_β = Γ^γ_{αβ} ∂_γ. With X = ∂_α, Y = ∂_β, Z = ∂_γ, and since we assume that the coordinate vector fields commute, the metric property of ∇ gives an expression for ∂_α g_{βγ}. Cyclically permuting the indices gives then three equations which, with (4.16), show that the Γ^γ_{αβ} are completely determined by the induced metric. Weingarten's equation is obtained from applying D_V to a unit normal field of N̄ in M (see e.g. Spivak, 1975). And with a similar derivation as used above one can show that K_N(X) is bilinear in N and X.
(We have given B_c the subindex "c" to emphasize that the normal field B_c belongs to the space curve c viewed as a one dimensional manifold.)
With (4.16) and (4.17) it follows that the first term of (4.21) can be written in terms of the Christoffel symbols. Hence, if we put

B_c = B(c_*(d/dt), c_*(d/dt)) = (s'(t))² k_1(t) N_1, where N_1 is the unit normal,

we get for the second term of (4.21) that

D_V V = c_*( (s'(t))⁻¹ s''(t) d/dt ) + (s'(t))² k_1(t) N_1, with V = c_*(d/dt),

which is (3.23). Note from comparing (4.20) and (4.24) that (s'(t))⁻¹ s''(t) generalizes to Γ^γ_{αβ}, and that B_c = (s'(t))² k_1(t) N_1 generalizes to B(∂_α, ∂_β).

4.3. The normal field B

We now consider the normal field B of a curve c: IR → N̄ ⊂ M in more detail. According to (4.23), we can express the normal field B_c through the curvature and the induced metric of the curve.
Now i n order t o find the proper multivariate generalization o f this expression, one of the problems we
have t o deal w i t h is the increase i n dimensions. We can, however, get round this d i f f i c u l t y i f we
consider t w o curves, one i n N, which we denote by c:l
denote by c2: IR
QC
M.
BC
IR + N , and one i n
= y o c
1'
M,
which we
following situation
2
of M and N respectively, we can then apply the univariate Gauss'
V = c
d
(-)
2% d t
and X = c
d
(-)
1%d t
this gives
D V = c ((s'(t))-'s"(t)
V
2%
d
2
it)
+ ( ~ ' ( t ) ) k2,1(t)%,1,
(4.26.a)
and
V = y (X).
%
c 2 = y o c.l
Hence
gives
follows then t h a t
Hence, f o r curve c 2 which lies entirely i n c M the normalfield B equals the orthogonal component
I2
of ( S ' ( L ) ) k2,1(t)N2,1.
Thus f o r an arbitrary unit normal N E T N we have
since ( s l ( t ) )
=(x,x).
(t)N2,1,NX(
the extrinsic curvature o f curve c 2 w i t h respect t o the unitnormal N and
2,l
denote it by kN(t) (the f i r s t curvature k
(t) of curve cl i n N is sometimes called the intrinsic or
1,l
geodesic curvature). We can now w r i t e (4.28) as
As a consequence of the symmetry of B, the eigenvalue problem (4.30) has n real eigenvalues k_N^r, r = 1, ..., n, the principal curvatures. The eigenvectors determine then the corresponding mutually orthogonal principal directions, denoted by X_r, r = 1, ..., n, of submanifold N̄ = y(N). For later reference we define the mean curvature of N̄ for the normal N ∈ T^⊥N̄ as the average of the principal curvatures.
4.4. The local rate of convergence of Gauss' iteration method

Recall from (4.6) that the local rate of convergence is linear. Hence, since y_s − ŷ_s ∈ T^⊥_ŷN̄, we can bring the convergence factor at x̂ into a form which better resembles our univariate result (3.25) if we make use of the eigenvalue problem (4.30). Assume therefore that the principal directions X_r, r = 1, ..., n, are taken as basis of T_x̂N. Then (B(∂_α, ∂_β), N) is diagonalized, and we have

u^r_{q+1} = (k_N^r N, y_s − ŷ_s)_{ŷ,M} u^r_q + O(u^t_q δ u_q), r = 1, ..., n,  (4.35)

where u_q denotes the coordinate-differences of x_q and x̂ with respect to this basis.
Some conclusions can now be drawn:

(i) The scalars (k_N^r N, y_s − ŷ_s)_{ŷ,M}, r = 1, ..., n, determine the local convergence behaviour of Gauss' method.

(ii) {X_r} are the principal directions of N̄ at ŷ_s.

(iii) As in the univariate case, the local convergence behaviour of Gauss' method is invariant to any admissible parameter transformation of N̄ itself.

(iv) The local convergence factor reads lcf. = max_r |(k_N^r N, y_s − ŷ_s)_{ŷ,M}|. The scalars (k_N^r N, y_s − ŷ_s)_{ŷ,M} can either be positive, zero or negative. But they are always real, since B is symmetric in its arguments.

(v) The fact that ŷ_s is a strict local minimum does however not ensure local convergence of Gauss' method. See (4.36).

(vi) If (k_N^r N, y_s − ŷ_s)_{ŷ,M} < 0, the observation point and the corresponding centre of curvature lie on opposite sides of N̄. Hence, the steplength will be too long in each iteration step if (k_N^r N, y_s − ŷ_s)_{ŷ,M} < 0, resulting in an overshoot.
An interesting point of the above conclusion (vi) is that it indicates the possibility of adjusting the steplength in each iteration step with the aid of the curvature behaviour of N̄, so as to improve the convergence behaviour (4.35) of Gauss' method. Let us therefore pursue this argument a bit further. Instead of (4.5) we take
x_{q+1} = x_q + t_q Δx_q, where Δx_q follows from (4.4) and t_q > 0 is a steplength still to be chosen (r = 1, ..., n; no summation over r). As could be expected, it follows from (4.39) that the scalar t_q should be chosen from

min_{t_q > 0} [ max_r { |(1 − t_q) + (k_N^r N, y_s − ŷ_s)_{ŷ,M} t_q| } ],

where the maximum is attained for either the smallest or the largest of the scalars (k_N^1 N, y_s − ŷ_s)_{ŷ,M} and (k_N^n N, y_s − ŷ_s)_{ŷ,M}. From figure 28 follows then, that if ŷ_s is a strict local minimum, the optimal choice for t_q is the one which balances these two extremes.

figure 28

And from this follows that the smallest attainable linear convergence factor (lcf.) for Gauss' method with a line search strategy is given by:

lcf. = [ (k_N^n N, y_s − ŷ_s)_{ŷ,M} − (k_N^1 N, y_s − ŷ_s)_{ŷ,M} ] / [ 2 − (k_N^1 N, y_s − ŷ_s)_{ŷ,M} − (k_N^n N, y_s − ŷ_s)_{ŷ,M} ].  (4.42)
Of course, such an optimal line search strategy is not executable as such, since we generally lack the curvature information needed. Nevertheless, the above results are of some importance, since with (4.42) we have obtained a lower bound on the linear convergence factor attainable for Gauss' method with a line search strategy. This means that when one decides to use a line search strategy in practice, one should choose a strategy which gives a rate of convergence close to (4.42). Apart from the minimization rule, which will be used in the next section to establish global convergence, we shall not discuss in the sequel any of the existing line search strategies. For details the reader is therefore referred to the relevant literature (see e.g. Ortega & Rheinboldt, 1970). Our decision of not including a discussion on various line search strategies is mainly based on the following:
(vii) If (k_N^r N, y_s − ŷ_s)_{ŷ,M} is small, then t_q = 1 is a good choice for a line search strategy (see (4.40)). Hence, for small residual adjustment problems and moderately curved submanifolds N̄, Gauss' method will practically not be affected by line search strategies.

(viii) From (4.35) follows that Gauss' method has a local quadratic convergence behaviour if either the normal field B vanishes identically on N̄, i.e. B ≡ 0, or y_s ∈ N̄, i.e. y_s = ŷ_s.
Of course, we still have to prove (4.43). But it is reasonable to expect that (4.43) holds, since we know from the previous section that for geodesics Gauss' method has a local quadratic convergence behaviour with convergence factor −½(s'(t̂))⁻¹s''(t̂). If B ≡ 0 then the tangent space T_yN̄ is constant along N̄, which means that our actual adjustment problem is linear. Hence, if B ≡ 0 then a single step x_{q+1} = x_q + Δx_q already gives the solution, from which follows that no iteration is needed. This already shows that indeed the convergence behaviour will be the same if either B ≡ 0 or y_s = ŷ_s holds. Remember that in both cases we are actually solving the inverse mapping problem: given for sure ŷ_s ∈ N̄, find the pre-image x̂ under map y. To prove the quadratic convergence behaviour (4.43), we Taylorize the right-hand side of x_{q+1} = x_q + Δx_q about x̂. With (4.45), and since grad E(x̂) = 0, this gives the desired result.
As another point, note the role of the Christoffel symbols Γ^γ_{αβ}(x̂). With respect to the univariate case there is however one big difference. In the univariate case we could always find a parametrization for which (s'(t))⁻¹s''(t) vanishes identically. In the multivariate case however this is only possible if B ≡ 0. The explanation is that in the univariate case T_tN and N are identifiable irrespective of the curvature of the space curve c, whereas in the multivariate case T_xN and N are only identifiable if B ≡ 0. Namely, only if B ≡ 0 can one find a parametrization for which the Christoffel symbols Γ^γ_{αβ} vanish identically. For arbitrary B, however, one can still make the Γ^γ_{αβ} vanish locally, by taking so-called geodesic polar coordinates.
The procedure of finding geodesic polar coordinates is the following. According to the theory of ordinary differential equations, a geodesic c(s) through a point x_o is locally uniquely characterized by the coordinates of x_o and the tangent vector c_*(d/ds) at x_o. Hence a point x = c(s) ∈ N on this geodesic can be identified by c_*(d/ds) at x_o, and s. Or in coordinates: the point x ∈ N with coordinates x^a = c^a(s) can be identified locally with the point of T_{x_o}N having coordinates s (dc^a/ds)(0), where x_o = c(0). For geodesics it follows then with the geodesic equation that the Γ^a_{βγ} vanish at x_o when the metric is expressed in these new coordinates. And since along a geodesic the coordinates x^a are linear in s, we are indeed dealing here with the proper multivariate generalization of the case considered in the previous section, where the univariate parameter t was chosen as a linear function of s so as to eliminate the necessity of iteration for solving the inverse mapping problem.
In the above local analysis of Gauss' method we have seen that both the initial guess x_o had to be sufficiently close to the solution x̂, and |(k_N^r N, y_s − ŷ_s)_{ŷ,M}| < 1 had to hold for all r = 1, ..., n, in order to assure convergence. For most practical problems we indeed believe that these conditions are satisfied. Nevertheless, it would be dissatisfactory not to have an iteration method which guarantees convergence almost independently of the chosen initial guess and curvature behaviour of the submanifold N̄. With such a method one usually obtains independency of the initial guess x_o, i.e. usually one will have global convergence to a local minimum. The method we will discuss is essentially the above discussed Gauss' method, but now with the so-called minimization rule as line search strategy. In formulating the method we have chosen to start from some general principles so as to get a better understanding of how the various assumptions contribute to the overall proof of global convergence.
As a start we assume that the method generates a sequence {x_q} for which

E(x_{q+1}) ≤ E(x_q).  (4.52)

This seems a natural condition to start with, since we are looking for an iteration method which can locate a local minimum of E. From (4.52) follows that the sequence {E(x_q)} converges to a limit, since the sum of squares function E is bounded from below (0 ≤ E(x), ∀x) and the sequence {E(x_q)} is non-increasing.
In ordinary vector analysis the gradient of a scalar field E is defined as the vector field with components ∂_a E, and the direction of steepest descent as the direction of −∂_a E. However, this ordinary definition of gradient is not invariant under a change of coordinates. With our geometric exposition of the preceding sections in mind we can therefore expect that the simplicity of the ordinary vector analytic definition of the gradient almost inevitably forces difficulties and awkwardness when problems involving change of coordinates are encountered. A way out of this dilemma is offered if we bring the requirements of invariance under change of coordinates to the foreground. Therefore, given a function E: N → IR, we define its gradient, grad E, invariantly by

(grad E, X)_N = X(E) for every vector field X on N, or in components (grad E)^a = g^{ab} ∂_b E.

With ∂_a E(x) = −y^j_{,a}(x) g_{ji} (y^i_s − y^i(x)) it follows that

Δx(x) = −grad E(x) ∈ T_xN,  (4.55)

i.e. the Gauss increment is minus the gradient, the direction of steepest descent.
But this does not necessarily imply that the function value of E(x) decreases by taking Δx(x) as incremental step. In fact we already saw in the previous section that the descent property only holds if the steplength is chosen sufficiently small. Given a curve c_q(t) with c_q(t=0) = x_q, we can define the next iterate x_{q+1} by x_{q+1} = c_q(t_q), with

c_q*(d/dt) = Δx_q = Δx(x_q) = −grad E(x_q).  (4.56)

Since

lim_{t↓0} [ E(c_q(t)) − E(c_q(0)) ] / t = dE(c_q)/dt (0) = (grad E, Δx(x_q))_{x_q},

it follows with c_q*(d/dt) = Δx(x_q) = −grad E(x_q) that, if Δx(x_q) ≠ 0,

lim_{t↓0} [ E(c_q(t)) − E(c_q(0)) ] / t = −(grad E, grad E)_{x_q} < 0.

Hence, for Δx(x_q) ≠ 0 the function value decreases along c_q(t) for small enough t > 0. This is in particular the case when t_q is determined by the minimization rule

E(c_q(t_q)) = min_{t>0} E(c_q(t)).  (4.58)

So far we did not specify the type of curve c_q(t) chosen. The simplest way computationwise would be to choose the curve c_q(t) so that its coordinate functions are given by c_q^a(t) = x_q^a + t Δx^a(x_q).
But other choices are also possible. And since the particular type of curve chosen is not important for our convergence analysis, we just assume that a rule is given which smoothly assigns a unique curve c_q: t ∈ IR → N to every point x_q, so that the initial conditions (4.56) hold. That is, we assume that the coordinate functions c_q^a, a = 1, ..., n, of the curve c_q are smooth functions of not only the parameter t but also of the initial conditions. Instead of c_q(t) we may therefore write c(t, x_q, Δx(x_q)), and by Taylor's formula we have the corresponding expansion.
The iteration method thus reads:

(i) Assign a curve c_q: t ∈ IR → N to every point x_q, with c_q(0) = x_q and c_q*(d/dt) = Δx(x_q) ≠ 0.

(ii) Compute Δx(x_q) = −grad E(x_q).

(iii) Determine t_q from the minimization rule (4.58).

(iv) Compute x_{q+1} = c_q(t_q), set q := q + 1 and return to (ii).

For the global convergence proof we assume in addition that the level set {x | E(x) ≤ E(x_o)} is bounded and that E is smooth. We will show that every limit point of the generated sequence is a critical point of E.
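The steps above can be sketched directly. The example below uses the simplest curve choice c_q(t) = x_q + t Δx_q and a crude ternary-search realization of the minimization rule; the quadratic sum-of-squares function E is hypothetical, chosen so that each line-search section is unimodal.

```python
import numpy as np

def minimization_rule(E, x, dx, t_max=4.0, iters=80):
    """Crude realization of the minimization rule (4.58): ternary search
    for the t in (0, t_max] minimizing E(x + t dx). Assumes the section
    t -> E(x + t dx) is unimodal on the interval (true for the example)."""
    lo, hi = 0.0, t_max
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if E(x + m1 * dx) < E(x + m2 * dx):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

def descent(E, grad, x0, steps=60):
    """Steps (i)-(iv) with the curve choice c_q(t) = x_q + t dx_q
    and dx_q = -grad E(x_q)."""
    x = np.asarray(x0, dtype=float)
    values = [E(x)]
    for _ in range(steps):
        dx = -grad(x)
        if np.linalg.norm(dx) < 1e-12:
            break
        t = minimization_rule(E, x, dx)
        x = x + t * dx
        values.append(E(x))
    return x, values

# hypothetical sum-of-squares function: E(x) = 0.5 ||A x - ys||^2
A = np.array([[2.0, 0.0], [1.0, 3.0]])
ys = np.array([1.0, 1.0])
E = lambda x: 0.5 * float(np.sum((A @ x - ys) ** 2))
grad = lambda x: A.T @ (A @ x - ys)
x_hat, values = descent(E, grad, [3.0, -2.0])
```

The sequence {E(x_q)} is non-increasing, as assumed in (4.52), and the iterate approaches a critical point of E.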
We denote the unique curve assigned to x̄ by c(t, x̄, Δx(x̄)), the positive scalar t̄ satisfying

E(c(t̄, x̄, Δx(x̄))) = min_{t>0} E(c(t, x̄, Δx(x̄)))

by t̄ = t(x̄), and similarly the scalar assigned to an arbitrary point x, satisfying E(c(t(x), x, Δx(x))) = min_{t>0} E(c(t, x, Δx(x))), by t(x). Now assume that x̄ = lim_{i→∞} x_{q_i} is a limit point of the generated sequence which is not a critical point of E. At a non-critical point the minimization rule gives a strict decrease E(c(t(x), x, Δx(x))) < E(x), and since this decrease depends continuously on x, there exist an ε > 0 and an index r such that

E(x_{q_{i+1}}) ≤ E(c(t(x_{q_i}), x_{q_i}, Δx(x_{q_i}))) ≤ E(x_{q_i}) − ε for every i ≥ r.

Hence lim_{i→∞} E(x_{q_i}) = −∞, which contradicts the fact that E is bounded from below. Thus x̄ is a critical point. Note that this does not mean that the sequence {x_q} itself converges to a critical point; it means that every limit point of {x_q} is one. This concludes the proof of the following global convergence theorem (Ortega & Rheinboldt, 1970): if the level set {x | E(x) ≤ E(x_o)} is bounded, then the sequence {E(x_q)} converges and every limit point x̄ of {x_q} satisfies grad E(x̄) = 0.
To conclude this section we will prove the following result on the rate of convergence of the globally convergent iteration method (4.60): if the Hessian of E at x̂ is positive definite with eigenvalues λ_1 ≤ ... ≤ λ_n, then

lim_{q→∞} [ E(x_{q+1}) − E(x̂) ] / [ E(x_q) − E(x̂) ] ≤ ( (λ_n − λ_1) / (λ_n + λ_1) )² < 1.  (4.67)

For the proof we may restrict attention to curves of the type c(t, x_q, Δx(x_q)); all admissible curve choices are equivalent in the sense that the curves are all tangent at the starting point x_q.
The Hessian H of E at x, defined by

H X = ∇_X grad E for all X ∈ T_xN,

satisfies

(H X, Y)_N = (X, Y)_N − (B(X, Y), y_s − y)_M for all X, Y ∈ T_xN.  (4.70)

This follows from applying D to the defining relation of grad E. And with

D_{y_*(X)}(y_s − y) = −y_*(X),

this gives

(H X, Y)_N = (y_*(X), y_*(Y))_M − (D_{y_*(X)} N, y_*(Y))_M.  (4.71)

Since 0 = D_{y_*(X)}(N, y_s − y)_M for a unit normal field N, the normal component can be expressed in terms of B, which yields (4.70).

figure 29
S,
(1
grad E(x )
(1
t = lly*(grad E(x
1) I l M t
Furthermore we know that the scalar t satisfies the minimization rule. Therefore
9
where
N1 is the f i r s t
normal o f y (c ( t 1)
Compare w i t h (4.72).
To make relation (4.75) precise we recall that geodesics are characterized by

∇_V V = 0, with V = c_*(d/dt).

Taking for c_q a geodesic with c_q*(d/dt) = −grad E(x_q), this gives

E(c_q(t)) − E(c_q(0)) = −(grad E, grad E)_{x_q} t + ½ (H grad E, grad E)_{x_q} t² + O(||grad E||³_{x_q} t³).

Compare with (4.75).
Now, to continue our proof of (4.67), we substitute (4.76) into

E(c_q(t_q)) − E(c_q(0)) = (grad E, V)_{c_q(0)} t_q + ½ (∇_V grad E, V)_{c_q(0)} t_q² + O(t_q³),

and find

E(x_{q+1}) − E(x_q) = −½ (grad E, grad E)²_{x_q} / (H grad E, grad E)_{x_q} + O(||grad E||³_{x_q}).  (4.77)
By assuming that x_q is close enough to x̂, we can write, for an arbitrary parallel field U along c(s) (i.e. ∇U = 0 along c(s)),

E(c(s)) − E(c(0)) = (grad E, U) s + ½ (H U, U) s² + O(s³),

and, since grad E(x̂) = 0,

0 = (grad E, U) + (H U, U) s + O(s²)

at the minimum. Hence,

E(x̂) − E(x_q) = −½ (H⁻¹ grad E, grad E)_{x_q} + O(||grad E||³_{x_q}).  (4.78)
With Δx(x_q) = −H⁻¹ grad E(x_q) and grad E(x̂) = 0, combining (4.77) and (4.78) gives

[ E(x_{q+1}) − E(x̂) ] / [ E(x_q) − E(x̂) ] = 1 − (grad E, grad E)²_{x_q} / [ (H grad E, grad E)_{x_q} (H⁻¹ grad E, grad E)_{x_q} ] + O(||grad E||_{x_q}).  (4.80)

By assuming that x̂ is a strict local minimum, so that H is positive definite, we can now apply Kantorovich' inequality to (4.80). Kantorovich' inequality (see Rao, 1973, p. 74) states namely that if a linear map H: T_xN → T_xN is positive definite with eigenvalues λ_1 ≤ ... ≤ λ_n, then

(H b_x, b_x)(H⁻¹ b_x, b_x) ≤ (λ_1 + λ_n)² / (4 λ_1 λ_n)

for every unit vector b_x ∈ T_xN, i.e. (b_x, b_x) = 1. Application to (4.80) gives (4.67). Note that with (4.70) the eigenvalues can be expressed in terms of the extrinsic curvatures:

λ_r = 1 − k_N^r ||y_s − ŷ_s||_M, r = 1, ..., n.
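Kantorovich' inequality is easy to probe numerically. The sketch below samples unit vectors for a random symmetric positive definite H and checks the upper bound as well as the Cauchy-Schwarz lower bound (Hx,x)(H⁻¹x,x) ≥ 1; the matrix is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
B = rng.standard_normal((4, 4))
H = B @ B.T + 4.0 * np.eye(4)          # symmetric positive definite map
lam = np.linalg.eigvalsh(H)            # eigenvalues l_1 <= ... <= l_n
bound = (lam[0] + lam[-1]) ** 2 / (4.0 * lam[0] * lam[-1])

worst = 0.0
for _ in range(1000):
    x = rng.standard_normal(4)
    x /= np.linalg.norm(x)             # unit vector: (x, x) = 1
    val = (x @ H @ x) * (x @ np.linalg.solve(H, x))
    worst = max(worst, val)
```

The sampled products all lie between 1 and the Kantorovich bound, in line with the statement used in the proof of (4.67).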
In this section we will consider some examples to illustrate the theory developed in the previous
sections. Apart from the examples, we also present new results on the Helmert transformation and
give some suggestions as to how to estimate the extrinsic curvatures.
In subsection 3.6 we have seen that the solution of the Helmert transformation only admitting a rotation could be found by orthogonally projecting the observation point onto a circle with radius equalling the square root of the moment of inertia of the network involved. We will now generalize this result and consider the full Helmert transformation. That is, we will assume the scale and translation parameters to be included as well. Of course, the solution to the two dimensional Helmert transformation is well known (see e.g. Kijchle, 1982). It is therefore not so much our purpose to present the solution, but to emphasize the geometry involved. And the method chosen for deriving the solution prepares us for the case considered in our next example.
The model of the Helmert transformation reads

x̃_i = u_i λ cos θ − v_i λ sin θ + t_x,
ỹ_i = u_i λ sin θ + v_i λ cos θ + t_y,  (5.1)

where:
- i = 1, ..., n = number of points,
- u_i, v_i are the given coordinates,
- λ, θ, t_x and t_y are respectively the scale, orientation and translation parameters, which need to be estimated, and
- x_i, y_i are the observed cartesian coordinates.
If we write model (5.1) as ỹ = λ cos θ x_1 + λ sin θ x_2 + t_x x_3 + t_y x_4, the least-squares problem becomes

min_{λ,θ,t_x,t_y} E(λ,θ,t_x,t_y) = min_{λ,θ,t_x,t_y} || y_s − λ cos θ x_1 − λ sin θ x_2 − t_x x_3 − t_y x_4 ||²_M.  (5.3)

We shall solve (5.3) in two steps. For fixed λ and θ we first solve the subproblem

min_{t_x,t_y} E(λ,θ,t_x,t_y).  (5.4)

Let t_x(λ,θ), t_y(λ,θ) denote its solution. Then we solve

min_{λ,θ} E(λ,θ,t_x(λ,θ),t_y(λ,θ)).  (5.5)

Let λ̂, θ̂ denote the solution of (5.5). The overall solution of our original least-squares problem (5.3) is then λ̂, θ̂, t_x(λ̂,θ̂), t_y(λ̂,θ̂). By taking this two-step procedure we have separated our original four-dimensional least-squares problem (5.3) into two two-dimensional least-squares problems (5.4) and (5.5).
With the abbreviation

y_s(λ,θ) = y_s − λ cos θ x_1 − λ sin θ x_2,  (5.7)

the subproblem (5.4) reads

min_{t_x,t_y} E(λ,θ,t_x,t_y) = min_{t_x,t_y} || y_s(λ,θ) − t_x x_3 − t_y x_4 ||²_M.  (5.8)

And geometrically this problem can of course be seen as the problem of finding the point in the plane spanned by the orthogonal vectors x_3 and x_4 (as before we assume that the observation space is endowed with the standard metric) which is nearest to y_s(λ,θ) (see figure 30).

figure 30

Since the two vectors x_3 and x_4 are orthogonal with squared length n, it follows that the point in the plane spanned by x_3 and x_4 closest to y_s(λ,θ) is its orthogonal projection. Hence,

t_x(λ,θ) = (1/n)(x_3, y_s(λ,θ))_M,  t_y(λ,θ) = (1/n)(x_4, y_s(λ,θ))_M,

or with (5.7)

t_x(λ,θ) = (1/n)(x_3, y_s − λ cos θ x_1 − λ sin θ x_2)_M,
t_y(λ,θ) = (1/n)(x_4, y_s − λ cos θ x_1 − λ sin θ x_2)_M.  (5.9)
Substitution into (5.5) gives

min_{λ,θ} E(λ,θ,t_x(λ,θ),t_y(λ,θ)) = min_{λ,θ} || y_s^c − λ cos θ x_1^c − λ sin θ x_2^c ||²_M,  (5.10)

where the centred vectors are

y_s^c = y_s − (1/n)(x_3, y_s)_M x_3 − (1/n)(x_4, y_s)_M x_4,
x_1^c = x_1 − (1/n)(x_3, x_1)_M x_3 − (1/n)(x_4, x_1)_M x_4,
x_2^c = x_2 − (1/n)(x_3, x_2)_M x_3 − (1/n)(x_4, x_2)_M x_4.
figure 31

Since the two vectors (1/R) x_1^c and (1/R) x_2^c, with R = ||x_1^c|| = ||x_2^c||, are orthonormal, it follows that the point in the plane spanned by x_1^c and x_2^c closest to y_s^c is its orthogonal projection. Hence,

λ̂ cos θ̂ = (1/R²)(x_1^c, y_s^c)_M,  λ̂ sin θ̂ = (1/R²)(x_2^c, y_s^c)_M,  (5.11)

so that

θ̂ = tan⁻¹ [ Σ_{i=1}^n (u_i^c y_i^c − v_i^c x_i^c) / Σ_{i=1}^n (u_i^c x_i^c + v_i^c y_i^c) ],

λ̂ = √[ ( Σ_{i=1}^n (u_i^c x_i^c + v_i^c y_i^c) )² + ( Σ_{i=1}^n (u_i^c y_i^c − v_i^c x_i^c) )² ] / Σ_{j=1}^n ( (u_j^c)² + (v_j^c)² ),

where: x_i^c = x_i − (1/n) Σ_{j=1}^n x_j, y_i^c = y_i − (1/n) Σ_{j=1}^n y_j, u_i^c = u_i − (1/n) Σ_{j=1}^n u_j, v_i^c = v_i − (1/n) Σ_{j=1}^n v_j.  (5.12)
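The closed-form solution above is easily turned into code. The sketch below assumes the standard metric and uses a hypothetical noise-free point field, in which case the parameters are recovered exactly.

```python
import math

def helmert_2d(uv, xy):
    """Closed-form least-squares estimates of scale, rotation and
    translations of the classical 2-D Helmert transformation
    x_i = u_i l cos(th) - v_i l sin(th) + tx, analogous to (5.12)."""
    n = len(uv)
    um = sum(u for u, v in uv) / n; vm = sum(v for u, v in uv) / n
    xm = sum(x for x, y in xy) / n; ym = sum(y for x, y in xy) / n
    P = Q = S = 0.0
    for (u, v), (x, y) in zip(uv, xy):
        uc, vc, xc, yc = u - um, v - vm, x - xm, y - ym
        P += uc * xc + vc * yc        # sum(u^c x^c + v^c y^c)
        Q += uc * yc - vc * xc        # sum(u^c y^c - v^c x^c)
        S += uc * uc + vc * vc        # sum((u^c)^2 + (v^c)^2)
    lam = math.hypot(P, Q) / S
    th = math.atan2(Q, P)
    tx = xm - lam * (um * math.cos(th) - vm * math.sin(th))
    ty = ym - lam * (um * math.sin(th) + vm * math.cos(th))
    return lam, th, tx, ty

# noise-free check: transform a hypothetical point field, then recover
uv = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (2.0, 1.5)]
l0, th0, tx0, ty0 = 1.3, 0.4, 5.0, -2.0
xy = [(u * l0 * math.cos(th0) - v * l0 * math.sin(th0) + tx0,
       u * l0 * math.sin(th0) + v * l0 * math.cos(th0) + ty0) for u, v in uv]
est = helmert_2d(uv, xy)
```

Note how centring the coordinates eliminates the translations exactly as in the two-step procedure (5.4)-(5.5).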
Note that (5.12) together with (5.13) also solves the corresponding inverse mapping problem. This follows from (5.11) if one solves for the parameters λ and θ. Also note that we are by no means restricted to the particular two-step procedure chosen in (5.4) and (5.5). Instead of taking the above two-step procedure, we could for instance have decided to only fix part of the parameters in the first step. This brings us to the subject of ruled manifolds.
A ruled surface is a surface which has the property that through every point of the surface there passes a straight line which lies entirely in the surface. Thus the surface is covered by straight lines, called rulings, which form a family depending on one parameter. In order to find a parametrization of a ruled surface, choose on the surface a curve transversal to the rulings. Let this curve be given by c(t_1), and let T(t_1) be a vector pointing along the ruling which passes through the point c(t_1). This vector obviously depends on t_1. Thus we have the parametrization

y(t_1, t_2) = c(t_1) + t_2 T(t_1).

The parameter t_1 indicates the ruling on the surface, and the parameter t_2 shows the position on the ruling. If in an adjustment context the submanifold N̄ is ruled, one can take advantage of the special properties of N̄: it is flat along the rulings, whilst curved in the directions transversal to it. Hence, it might turn out to be advantageous to perform the adjustment in two steps. In the first step one would then solve a linear least-squares adjustment problem, and in the second step a non-linear adjustment problem of a reduced dimension. As solution of the first step we get an adjusted point on the surface which depends on the choice of ruling, i.e. on the choice of t_1. The second step consists then of orthogonally projecting y_s onto the curve given by (5.16). This problem is of course in general still non-linear, but it has the advantage of being of a smaller dimension than the original adjustment problem.
As an example one could think of a cylinder (this is in fact a very special ruled surface, since it is developable). Then we have (see figure 32):

y^{i=1}(t_1) = R cos(t_1),  y^{i=2}(t_1) = R sin(t_1),  y^{i=3}(t_1, t_2) = t_2.

The first, linear, step projects y_s onto the ruling through c(t_1); the second step is the reduced one dimensional problem

min_{t_1} ( y_s − c(t_1), y_s − c(t_1) )_M,

with stationarity condition ( y_s − c(t_1°), c_*(d/dt_1) ) = 0.

figure 32

It will be clear that the above described procedure also holds for ruled-type of manifolds.
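For the cylinder both steps happen to have closed-form solutions, which makes the two-step idea easy to demonstrate (standard metric assumed):

```python
import math

def project_to_cylinder(ys, R):
    """Two-step adjustment for the cylinder y = (R cos t1, R sin t1, t2):
    step 1 (linear, along the ruling) gives t2 = ys[2] for every t1;
    step 2 is the reduced 1-D problem over t1, here solvable in closed form."""
    t2 = ys[2]                       # linear step: position on the ruling
    t1 = math.atan2(ys[1], ys[0])    # reduced non-linear step, closed form
    return (R * math.cos(t1), R * math.sin(t1), t2)

p = project_to_cylinder((3.0, 4.0, 7.0), R=2.0)
# nearest point on the cylinder of radius 2: (1.2, 1.6, 7.0)
```

For general ruled surfaces step 2 would of course need an iteration, but of dimension one instead of two.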
5.3.

As a nice application of the idea described in the previous example we have what we shall call the two dimensional Symmetric Helmert transformation. Recall the model of the two dimensional Helmert transformation (see (5.1)) and note that the model in its classical formulation favours one point field above the other. This can also be seen from the rather asymmetric solution of the scale parameter (see (5.12)). It has bothered the present author for some time that one was satisfied with the classical formulation (5.1). A better formulation would namely be:
x̃_i = ũ_i λ cos θ − ṽ_i λ sin θ + t_x,
ỹ_i = ũ_i λ sin θ + ṽ_i λ cos θ + t_y,  (5.17)

where:
- i = 1, ..., n = number of points,
- the tilde "~" sign stands for the mathematical expectation,
- x_i, y_i and u_i, v_i are the observed cartesian coordinates of both point fields, and
- λ, θ, t_x and t_y are the transformation parameters.

The submanifold determined by (5.17) is ruled along the t_x, t_y and u_i, v_i, i = 1, ..., n, directions, and curved transversal to the λ, θ-coordinate lines. Thus in the first adjustment step we can either fix the u_i, v_i, i = 1, ..., n, or λ and θ. It turns out that the choice of fixing λ and θ is the most advantageous one. Skipping the tedious but trivial adjustment derivation, we find for fixed λ and θ the solution of the first adjustment step as:
(1
yc +
(1
%,X
= Cc
ti(h,8)
iX(X,8)
= C,
,t.y ( X , 8 )
where:
Ci(X,8)
yc +
C,
-1
x ~ ) (X:
cos 8
X sin 8
X ~ XC O S
XTX
y c X
- yc
e -
sin 8
s i n 8,
X cos 8,
yy
yy
X s i n 81,
X cos 81,
(5.18)
-C
X
,,
F .j,
= n
~ = J1
j=l
Yi = Yi -C
Y c 9
c
C
= n
=
-1
j=1
-
yC
X.,
J
-1
Yi = Yi
c,
j=1
(5.1~)
-C
2
+ ( l + A )
-1
2 c
(Ax.
iC
+ (1 + A
-1
)
-C
C
X .
' cos 0
-S
in 8
A cos 8 +
A cos 8
sin
y Ci
A sin 8) + e
'
y . A s i n e ) + eI
x '
i
sJ(lj"
cos 8
(5.19)
where e a r e t h e residuals.
The sum of the squared residuals reads then:

    f(λ,θ) = ( 1 + λ² )⁻¹ Σ_{i=1..n} || λ ( x̃_i^c , ỹ_i^c )ᵀ - R(θ) ( x_i^c , y_i^c )ᵀ ||² ,        (5.20)

with R(θ) the 2×2 rotation matrix of (5.17). With the reparametrization λ = tan φ this can be rewritten as

    f(φ) = ( cos φ e_1 + sin φ e_2 , cos φ e_1 + sin φ e_2 ) ,        (5.22)

where e_1 and e_2 are 2n-vectors built from the rotated first point field and from the second point field respectively. The curve y(φ) = cos φ e_1 + sin φ e_2 describes an ellipse in the plane spanned by the vectors e_1 and e_2. Hence, to minimize f(φ) we need to find that point on the ellipse

    y(φ) = cos φ e_1 + sin φ e_2

which is closest to the origin. This minimization problem results then in the following eigenvalue problem

    | (e_1,e_1) - ρ      (e_1,e_2)     |
    |  (e_1,e_2)      (e_2,e_2) - ρ    | = 0 .

Hence,

    ρ_min = ½ { (e_1,e_1) + (e_2,e_2) - √( ( (e_2,e_2) - (e_1,e_1) )² + 4 ( (e_1,e_2) )² ) } .

Substitution of ρ = ρ_min gives

    tan φ̂ = [ ( (e_2,e_2) - (e_1,e_1) ) - √( ( (e_2,e_2) - (e_1,e_1) )² + 4 ( (e_1,e_2) )² ) ] / ( 2 (e_1,e_2) ) .        (5.24)
With (5.24'), (5.21) and (5.18) the least-squares solution of the two dimensional Symmetric Helmert transformation (5.17) finally becomes:

    t̂_x = x_c - x̃_c λ̂ cos θ̂ - ỹ_c λ̂ sin θ̂ ,
    t̂_y = y_c + x̃_c λ̂ sin θ̂ - ỹ_c λ̂ cos θ̂ ,
    û_i = x̃_c + ( 1 + λ̂² )⁻¹ ( x̃_i^c + λ̂ ( x_i^c cos θ̂ - y_i^c sin θ̂ ) ) ,
    v̂_i = ỹ_c + ( 1 + λ̂² )⁻¹ ( ỹ_i^c + λ̂ ( x_i^c sin θ̂ + y_i^c cos θ̂ ) ) ,

with

    tan θ̂ = Σ_i ( ỹ_i^c x_i^c - x̃_i^c y_i^c ) / Σ_i ( x̃_i^c x_i^c + ỹ_i^c y_i^c ) ,

and, writing for brevity

    S = Σ_i ( (x_i^c)² + (y_i^c)² ) ,  S̃ = Σ_i ( (x̃_i^c)² + (ỹ_i^c)² ) ,
    C = Σ_i ( x̃_i^c x_i^c + ỹ_i^c y_i^c ) ,  D = Σ_i ( ỹ_i^c x_i^c - x̃_i^c y_i^c ) ,

the scale estimate

    λ̂ = { S - S̃ + √( ( S - S̃ )² + 4 ( C² + D² ) ) } / { 2 √( C² + D² ) } ,        (5.27)

which demonstrates the symmetry in our least-squares solution of the scale parameter. This in contrast to solution (5.12) of the classical Helmert transformation.
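The closed-form solution above can be checked numerically. The following Python sketch (function and variable names are ours; the sign conventions follow the rotation matrix used in (5.17)) recovers scale, rotation and translation from two consistent point fields:

```python
import math

def symmetric_helmert_2d(xy, uv):
    """Least-squares Symmetric Helmert transformation of model (5.17):
    x_i = lam*( u_i cos th + v_i sin th) + tx,
    y_i = lam*(-u_i sin th + v_i cos th) + ty,
    with both point fields observed.  xy, uv: lists of (x, y) pairs."""
    n = len(xy)
    xc = sum(p[0] for p in xy) / n; yc = sum(p[1] for p in xy) / n
    uc = sum(p[0] for p in uv) / n; vc = sum(p[1] for p in uv) / n
    X = [(x - xc, y - yc) for x, y in xy]          # centred field 1
    U = [(u - uc, v - vc) for u, v in uv]          # centred field 2
    S  = sum(x * x + y * y for x, y in X)          # sum of squares, field 1
    St = sum(u * u + v * v for u, v in U)          # sum of squares, field 2
    C = sum(u * x + v * y for (x, y), (u, v) in zip(X, U))
    D = sum(v * x - u * y for (x, y), (u, v) in zip(X, U))
    th = math.atan2(D, C)                          # rotation estimate
    r = math.hypot(C, D)
    lam = (S - St + math.sqrt((S - St) ** 2 + 4 * r * r)) / (2 * r)  # (5.27)
    tx = xc - lam * (uc * math.cos(th) + vc * math.sin(th))
    ty = yc + lam * uc * math.sin(th) - lam * vc * math.cos(th)
    return lam, th, tx, ty

# Consistent (noise-free) data must be reproduced exactly.
uv = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 1.0)]
lam0, th0, tx0, ty0 = 1.7, 0.4, 5.0, -2.0
xy = [(lam0 * ( u * math.cos(th0) + v * math.sin(th0)) + tx0,
       lam0 * (-u * math.sin(th0) + v * math.cos(th0)) + ty0) for u, v in uv]
print(symmetric_helmert_2d(xy, uv))   # ~ (1.7, 0.4, 5.0, -2.0)
```

Note that no iteration and no matrix inversion is needed: the entire solution reduces to centroids, four sums, a square root and an arctangent.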
Up till now we assumed the simplest structure possible for the covariance matrices of the observed cartesian coordinates. In many practical applications this assumption will do, but it will not be sufficient for all applications. Unfortunately one cannot expect to find a solution like (5.27) if the observed coordinates are allowed to have an arbitrary covariance matrix. One of the reasons that the derivation of (5.27) went so smoothly is namely that the covariance matrices used for the two sets of coordinates are scaled versions of each other and are invariant to rotations. This indicates, however, that a solution may still be found for covariance matrices Q and Q̃ of the two point fields which satisfy

    Q̃ = k² Q   for some k ∈ ℝ⁺ ,        (5.28)

and

    Rᵀ Q R = Q ,        (5.29)

where R is a 2n×2n block-diagonal matrix with 2×2 blocks

    (  cos θ   sin θ )
    ( -sin θ   cos θ )

for every candidate value of θ.
To solve for the Symmetric Helmert transformation with the new covariance structure (5.28), (5.29) we apply the same two-step procedure as used before.
For fixed λ and θ we get then as solution of the first step:

    t̂_x(λ,θ) = x_c - λ ( x̃_c cos θ + ỹ_c sin θ ) ,
    t̂_y(λ,θ) = y_c + λ x̃_c sin θ - λ ỹ_c cos θ ,
    û_i(λ,θ) = x̃_c + ( 1 + (kλ)² )⁻¹ ( x̃_i^c + k² λ ( x_i^c cos θ - y_i^c sin θ ) ) ,        (5.31)
    v̂_i(λ,θ) = ỹ_c + ( 1 + (kλ)² )⁻¹ ( ỹ_i^c + k² λ ( x_i^c sin θ + y_i^c cos θ ) ) ,

where e are the residuals, which follow from substituting these estimates back into the observation equations (5.32).
Hence, the weighted sum of the squared residuals reads then
as a quadratic form (5.33) in the centred coordinates x_i^c, y_i^c, x̃_i^c, ỹ_i^c, with as coefficients the elements d^{ij} of the inverse covariance matrix (in general a full symmetric matrix; in the band-structured case only the elements d^{i-1,i}, d^{i,i}, d^{i,i+1} differ from zero), and with λ and θ as the remaining unknowns.
With the reparametrization

    k λ = tan φ ,   0 < φ < π/2 ,        (5.34)

we can rewrite (5.33) as
    f(φ,θ) = cos²φ Σ_{i,j} ( x̃_i^c d^{ij} x̃_j^c + ỹ_i^c d^{ij} ỹ_j^c )
             - 2 k sin φ cos φ [ cos θ Σ_{i,j} ( x̃_i^c d^{ij} x_j^c + ỹ_i^c d^{ij} y_j^c )
                                 + sin θ Σ_{i,j} ( ỹ_i^c d^{ij} x_j^c - x̃_i^c d^{ij} y_j^c ) ]
             + sin²φ k² Σ_{i,j} ( x_i^c d^{ij} x_j^c + y_i^c d^{ij} y_j^c ) ,        (5.35)

which is again of the elliptical form (5.22). Its minimization results, just as before, in an eigenvalue problem of the type (5.23), with smallest root ρ_min given by the analogue of (5.24); this is the content of (5.36).
The minimizing θ̂ and φ̂ then follow, analogous to (5.24'), as

    tan θ̂ = Σ_{i,j} ( ỹ_i^c d^{ij} x_j^c - x̃_i^c d^{ij} y_j^c ) / Σ_{i,j} ( x̃_i^c d^{ij} x_j^c + ỹ_i^c d^{ij} y_j^c ) ,        (5.38)

and, with k λ̂ = tan φ̂,

    k λ̂ = { k² Σ_{i,j} ( x_i^c d^{ij} x_j^c + y_i^c d^{ij} y_j^c ) - Σ_{i,j} ( x̃_i^c d^{ij} x̃_j^c + ỹ_i^c d^{ij} ỹ_j^c )
            + √( [ Σ_{i,j} ( x̃_i^c d^{ij} x̃_j^c + ỹ_i^c d^{ij} ỹ_j^c ) - k² Σ_{i,j} ( x_i^c d^{ij} x_j^c + y_i^c d^{ij} y_j^c ) ]²
                 + 4 k² [ ( Σ_{i,j} ( x̃_i^c d^{ij} x_j^c + ỹ_i^c d^{ij} y_j^c ) )² + ( Σ_{i,j} ( ỹ_i^c d^{ij} x_j^c - x̃_i^c d^{ij} y_j^c ) )² ] ) }
          / { 2 k √( ( Σ_{i,j} ( x̃_i^c d^{ij} x_j^c + ỹ_i^c d^{ij} y_j^c ) )² + ( Σ_{i,j} ( ỹ_i^c d^{ij} x_j^c - x̃_i^c d^{ij} y_j^c ) )² ) } .        (5.39)

The adjusted coordinates and translation parameters can be found by substituting (5.38) and (5.39) into (5.31).
5.5. The three dimensional Helmert transformation and its symmetrical generalization
Now that we have found the solution to the two dimensional Helmert transformation and its non-linear generalization, it is natural to try to generalize these results to three dimensions.
We will first consider the classical three dimensional Helmert transformation. The model for the three dimensional Helmert transformation reads:
where: i = 1, ..., n = number of points; λ, t_x, t_y, t_z and α, β, γ are the transformation parameters; e_{x_i}, e_{y_i}, e_{z_i} are the errors, and

    R₁(α) = ( 1     0       0
              0   cos α   sin α
              0  -sin α   cos α ) ,

    R₂(β) = (  cos β   0  -sin β
               0       1   0
               sin β   0   cos β ) ,

    R₃(γ) = (  cos γ   sin γ   0
              -sin γ   cos γ   0
               0       0       1 ) .
In contrast to the two dimensional case, the submanifold of the three dimensional Helmert transformation is curved. This complicates matters considerably. However, a number of simplifications can be obtained if we again apply the appropriate two-step procedure. In the first step we therefore assume the orientation parameters α, β and γ to be fixed, and solve for the scale parameter λ(α,β,γ) and the translation parameters t_x(α,β,γ), t_y(α,β,γ), t_z(α,β,γ). Since this first step consists of a linear adjustment problem, it is relatively easy to solve. The second adjustment step, where we have to solve for the orientation parameters, is however still non-linear. We will solve this second adjustment step by making use of the alternative formulation as discussed in example 1 of section 3.6.
To apply the alternative formulation which makes use of the trace operator, we take the abbreviations

    X (n×3): matrix of observed coordinates,  U (n×3): matrix of known coordinates,
    H = ( 1, 1, ..., 1 )ᵀ (n×1),  t = ( t_x, t_y, t_z )ᵀ,  R = R₃(γ) R₂(β) R₁(α) ,

and write (5.40) as

    min over λ, t, α, β, γ of  f = trace[ ( X - λ U Rᵀ - H tᵀ )ᵀ ( X - λ U Rᵀ - H tᵀ ) ] .        (5.43)
If we define the matrix differentiation operator ∂/∂L as the matrix whose element on place ij is ∂/∂L_ij, then

    (a)  ∂ trace( K L M ) / ∂L = Kᵀ Mᵀ ,
    (b)  ∂ trace( L K Lᵀ M ) / ∂L = 2 M L K ,  when K and M symmetric ,        (5.44)
    (c)  ∂ trace( L M Lᵀ K ) / ∂L = 2 K L M ,  when K and M symmetric .

The proofs of these relations are straightforward, and we illustrate the method by proving (5.44.a): let

    trace( K L M ) = K_ij L_jk M_ki .

Then

    ∂ trace( K L M ) / ∂L_mn = K_im M_ni ,   or   ∂ trace( K L M ) / ∂L = Kᵀ Mᵀ .
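Rule (5.44.a) is easy to verify numerically. The sketch below (plain Python, random 3×3 matrices of our own choosing) compares the analytic derivative KᵀMᵀ of trace(K L M) with central differences in every element of L:

```python
import random

def trace(M):
    return sum(M[i][i] for i in range(len(M)))

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(r) for r in zip(*A)]

random.seed(1)
n = 3
K = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
L = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
M = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(n)]

# analytic derivative of trace(K L M) with respect to L:  K^T M^T
analytic = matmul(transpose(K), transpose(M))

# numerical derivative by central differences in every element L[m][j]
h = 1e-6
numeric = [[0.0] * n for _ in range(n)]
for m in range(n):
    for j in range(n):
        L[m][j] += h
        fp = trace(matmul(matmul(K, L), M))
        L[m][j] -= 2 * h
        fm = trace(matmul(matmul(K, L), M))
        L[m][j] += h
        numeric[m][j] = (fp - fm) / (2 * h)

err = max(abs(analytic[i][j] - numeric[i][j])
          for i in range(n) for j in range(n))
print("max deviation:", err)
```

Since trace(K L M) is linear in L, the central difference agrees with the analytic derivative up to floating-point rounding only.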
Application of the rules (5.44) to (5.43) gives

    (a)  ∂f/∂t = -2 ( X - λ U Rᵀ )ᵀ H + 2 n t = 0 ,
    (b)  ∂f/∂λ = -2 trace( ( X - H tᵀ )ᵀ ( U Rᵀ ) ) + 2 λ trace( Uᵀ U ) = 0 .        (5.45)

From (5.45.a) follows that

    t̂ = n⁻¹ ( X - λ U Rᵀ )ᵀ H .        (5.46)

Substitution of (5.46) into (5.45.b) gives

    trace( Xᵀ ( I - n⁻¹ H Hᵀ ) U Rᵀ ) - λ trace( Uᵀ ( I - n⁻¹ H Hᵀ ) U ) = 0 .

Note that ( I - n⁻¹ H Hᵀ ) is a projector, i.e. ( I - n⁻¹ H Hᵀ )( I - n⁻¹ H Hᵀ ) = ( I - n⁻¹ H Hᵀ ). With the abbreviations

    U^C = ( I - n⁻¹ H Hᵀ ) U   and   X^C = ( I - n⁻¹ H Hᵀ ) X ,   both of order n×3 ,        (5.47), (5.48)

this gives

    λ̂ = trace( ( X^C )ᵀ ( U^C ) Rᵀ ) / trace( ( U^C )ᵀ ( U^C ) ) .        (5.49)
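The first adjustment step can be sketched numerically. In the following Python sketch (data and names are hypothetical; row i of X and U holds the coordinates of point i, and the column convention x = λ R u + t is used) the scale (5.49) and translation (5.46) are computed for a fixed rotation R:

```python
import math

def helmert_first_step(X, U, R):
    """First adjustment step of the 3D Helmert transformation: for a fixed
    rotation R, compute the scale (5.49) and translation (5.46).
    X, U: n-by-3 lists of observed resp. known point coordinates."""
    n = len(X)
    mean = lambda M: [sum(row[j] for row in M) / n for j in range(3)]
    mX, mU = mean(X), mean(U)
    XC = [[row[j] - mX[j] for j in range(3)] for row in X]   # centred X
    UC = [[row[j] - mU[j] for j in range(3)] for row in U]   # centred U
    rot = lambda u: [sum(R[j][k] * u[k] for k in range(3)) for j in range(3)]
    # (5.49): lam = trace((X^C)^T (U^C) R^T) / trace((U^C)^T (U^C))
    lam = (sum(x[j] * rot(u)[j] for x, u in zip(XC, UC) for j in range(3))
           / sum(u[j] * u[j] for u in UC for j in range(3)))
    # (5.46): t = n^-1 (X - lam U R^T)^T H  -- the mean misclosure vector
    t = [sum(X[i][j] - lam * rot(U[i])[j] for i in range(n)) / n
         for j in range(3)]
    return lam, t

# noise-free check: rotation about the z-axis, scale 2.5 (hypothetical data)
c, s = math.cos(0.3), math.sin(0.3)
R = [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]
U = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 2.0, 1.0], [1.0, 1.0, 3.0]]
lam0, t0 = 2.5, [10.0, -4.0, 1.0]
X = [[lam0 * sum(R[j][k] * u[k] for k in range(3)) + t0[j] for j in range(3)]
     for u in U]
print(helmert_first_step(X, U, R))   # ~ (2.5, [10.0, -4.0, 1.0])
```

Because the first step is linear for fixed orientation, no iteration is involved here; only the orientation search of the second step is non-linear.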
Formula (5.46) together with (5.49) constitute the solution of the first step. To formulate the second adjustment step, we substitute (5.46) and (5.49) into (5.43):

    min over α, β, γ of
    trace( ( X^C )ᵀ ( X^C ) ) - [ trace( ( X^C )ᵀ ( U^C ) Rᵀ ) ]² / trace( ( U^C )ᵀ ( U^C ) )
    subject to  R = R₃(γ) R₂(β) R₁(α) .

Since trace( ( U^C )ᵀ ( U^C ) ) does not depend on the orientation parameters, it follows that the second step amounts to

    trace( ( X^C )ᵀ ( U^C ) Rᵀ )  is max.   subject to  R = R₃(γ) R₂(β) R₁(α) .        (5.53)

It is well known that the 3×3 matrix ( X^C )ᵀ ( U^C ) can
be factorized in the form

    ( X^C )ᵀ ( U^C ) = V₁ D V₂ᵀ ,        (5.52)

where V₁ and V₂ are orthogonal matrices of order 3×3 respectively, and D is a diagonal matrix of the form D = diag( d₁, d₂, d₃ ), where the singular values d_i, i = 1,2,3, satisfy d₁ ≥ d₂ ≥ d₃ ≥ 0.
From (5.52) follows that ( U^C )ᵀ ( X^C )( X^C )ᵀ ( U^C ) = V₂ D² V₂ᵀ. Thus the column vectors of V₂ form an orthonormal set of eigenvectors of this symmetric matrix, with the squared singular values as the corresponding eigenvalues. With (5.52) we can rewrite (5.53) as

    trace( V₁ D V₂ᵀ Rᵀ ) = trace( V₂ᵀ Rᵀ V₁ D )  is max.   subject to  R = R₃(γ) R₂(β) R₁(α) .        (5.54)

Writing a_i, i = 1,2,3, for the diagonal elements of the orthogonal triple product V₂ᵀ Rᵀ V₁, it follows that

    trace( V₂ᵀ Rᵀ V₁ D ) = Σ_{i=1..3} a_i d_i ,   with  | a_i | ≤ 1 .        (5.55)

Let us now first assume that all three singular values are non-zero. Then, since the singular values d_i are positive and the matrices in the triple product V₂ᵀ Rᵀ V₁ are orthogonal, it follows that (5.55) is maximal if a_i = 1, i = 1,2,3, i.e. if V₂ᵀ Rᵀ V₁ = I, or with (5.52)

    R̂ = V₁ V₂ᵀ ,

where the column vectors of V₂ are provided by the eigenvectors of the symmetric matrix ( U^C )ᵀ ( X^C )( X^C )ᵀ ( U^C ).
The eigenvalues of this 3×3 matrix follow from its characteristic equation, a cubic ρ³ + a ρ² + b ρ + c = 0. Substitution of

    p = b - (1/3) a²   and   q = c + (2/27) a³ - (1/3) a b ,

together with ρ = μ - a/3, gives the reduced cubic (5.58), μ³ + p μ + q = 0. According to the Cardanian formula (see e.g. Griffiths, 1947) the three roots of (5.58) are

    μ_k = ω^(k-1) ( -q/2 + √( q²/4 + p³/27 ) )^(1/3) + ω^(1-k) ( -q/2 - √( q²/4 + p³/27 ) )^(1/3) ,   k = 1,2,3 ,        (5.59)

where ω = cos(2π/3) + i sin(2π/3) and i² = -1. Thus with (5.59) one can compute the eigenvalues of the symmetric matrix ( U^C )ᵀ ( X^C )( X^C )ᵀ ( U^C ) without any iterative eigenvalue routine.
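For a symmetric matrix all three roots are real, and the Cardanian formula then takes its trigonometric form. The following sketch (Python; the function name is ours, and the coefficients a, b, c are those of the characteristic polynomial ρ³ + aρ² + bρ + c) is equivalent to (5.59) for this all-real case:

```python
import math

def eig_sym3_cardano(a, b, c):
    """Real roots of rho^3 + a rho^2 + b rho + c = 0 via the reduction
    rho = mu - a/3 and the trigonometric Cardano formula (valid when all
    roots are real, as for a symmetric matrix)."""
    p = b - a * a / 3.0
    q = c + 2.0 * a ** 3 / 27.0 - a * b / 3.0
    if abs(p) < 1e-14:                       # degenerate (triple-root) case
        mu = -math.copysign(abs(q) ** (1.0 / 3.0), q)
        mus = [mu, mu, mu]
    else:
        # mu_k = 2 sqrt(-p/3) cos(phi/3 - 2 pi k / 3),  phi = acos(arg)
        arg = 3.0 * q / (2.0 * p) * math.sqrt(-3.0 / p)
        arg = max(-1.0, min(1.0, arg))       # clamp rounding noise
        phi = math.acos(arg)
        r = 2.0 * math.sqrt(-p / 3.0)
        mus = [r * math.cos(phi / 3.0 - 2.0 * math.pi * k / 3.0)
               for k in range(3)]
    return sorted(mu - a / 3.0 for mu in mus)

# characteristic polynomial of [[2,1,0],[1,2,0],[0,0,3]]:
#   rho^3 - 7 rho^2 + 15 rho - 9 = 0,  eigenvalues 1, 3, 3
print(eig_sym3_cardano(-7.0, 15.0, -9.0))
```

This is the inversion- and iteration-free route used above: singular values, and hence the optimal rotation, follow from a closed-form cubic solution.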
Although the case of zero singular values will not occur very often in practice, let us now assume that one of the singular values, say d_j, equals zero. It follows then again that (5.55) is maximal if and only if R = V₁ V₂ᵀ, and with (5.52) we can write the solution in terms of the eigenvectors as before.
Finally we consider the case of multiple zero singular values. The case d₁ = d₂ = d₃ = 0 is trivial, since then the orthogonal matrix R is indeterminate and may take any arbitrary form. In case only two of the singular values, say d₂ and d₃, equal zero, we find that (5.55) is maximized if R takes the form

    R = V₁ ( 1     0       0
             0   cos ψ  -sin ψ
             0   sin ψ   cos ψ ) V₂ᵀ ,

where ψ is arbitrary. Thus in case the two singular values d_j and d_k equal zero we find with (5.52) that the orthogonal matrix R is determinate only up to an arbitrary rotation angle ψ about the remaining axis.
Now that we have found the solution of the three dimensional Helmert transformation (5.42), we will consider the three dimensional generalization of the Symmetric Helmert transformation (5.17). Using our alternative formulation the model can be written with two observed coordinate sets X and X̃ (of order n×3) and the matrix U of unknown point coordinates u_i, v_i, w_i. As in section 5.4 we assume the covariance matrices of the two sets to be scaled versions of each other,

    Q̃ = k² Q   for some k ∈ ℝ⁺ .

The adjustment problem then reads:

    minimize over u_i, v_i, w_i, λ, t_x, t_y, t_z, α, β, γ :   f( u_i, v_i, w_i, λ, t_x, t_y, t_z, α, β, γ ) ,        (5.63)

where the element of the n×n symmetric matrix G on place ij is given by d^{ij}.
To solve (5.63) we apply again the two-step procedure, and fix in the first step the scale λ and the orientation parameters α, β, γ:

    minimize over u_i, v_i, w_i, t_x, t_y, t_z :
    g = k² trace[ ( X - λ U Rᵀ - H tᵀ )ᵀ G ( X - λ U Rᵀ - H tᵀ ) ] + trace[ ( X̃ - U )ᵀ G ( X̃ - U ) ] .        (5.64)
With the aid of the matrix differentiation rules of (5.44) we find that the critical point of g should satisfy:

    (a)  ∂g/∂U = 2 ( k²λ² + 1 ) G U - 2 λ k² G ( X - H tᵀ ) R - 2 G X̃ = 0 ,
    (b)  ∂g/∂t = -2 k² ( X - λ U Rᵀ )ᵀ G H + 2 k² t Hᵀ G H = 0 .        (5.65)

From (5.65.b) we find that

    t̂ = ( Hᵀ G H )⁻¹ ( X - λ U Rᵀ )ᵀ G H ,        (5.66)

and substitution into (5.65.a) gives

    Û^C = ( 1 + (kλ)² )⁻¹ ( X̃^C + k² λ X^C R ) ,        (5.68)

with

    X^C = ( I - H ( Hᵀ G H )⁻¹ Hᵀ G ) X   and   X̃^C = ( I - H ( Hᵀ G H )⁻¹ Hᵀ G ) X̃ .        (5.69)

(5.68) and (5.69) constitute the solution of our first adjustment step. Compare (5.68) and (5.69) with (5.31).
To commence with our second adjustment step we substitute (5.68) and (5.69) into (5.63') and find (5.70). In a similar way as (5.56) was derived, we find that for fixed scale the conditional minimum of (5.70) is obtained by

    R̂ = V₁ V₂ᵀ ,        (5.71)

where the diagonal matrix D contains the singular values of ( X^C )ᵀ G ( X̃^C ) and the column vectors of the orthogonal matrix V₂ are provided by the eigenvectors of the 3×3 matrix ( X̃^C )ᵀ G ( X^C )( X^C )ᵀ G ( X̃^C ).
To find the least-squares estimate of λ, we substitute (5.71) into (5.70) and apply the reparametrization

    k λ = tan φ ,   0 < φ < π/2 .

This gives a one dimensional minimization problem in φ of the same elliptical type as before, from which k λ̂ follows in a form completely analogous to (5.39), with the quadratic forms now replaced by k² trace( ( X^C )ᵀ G ( X^C ) ), trace( ( X̃^C )ᵀ G ( X̃^C ) ) and trace( ( X^C )ᵀ G ( X̃^C ) R̂ᵀ ). The adjusted translation parameters and coordinates then follow from substitution into (5.66) and (5.68) respectively.
To judge in practice whether non-linearity is significant, one needs an idea of the size of the extrinsic curvatures. As possibilities we mention:

(i) Try to compute the extrinsic curvatures analytically. Those cases where this is possible will, however, be rare. Let us take as an example the Symmetric Helmert transformation (5.17). For convenience we reparametrize (5.17) as

    x_i = a u_i + b v_i + t_x ,   y_i = -b u_i + a v_i + t_y ,

where a = λ cos θ and b = λ sin θ.
We assume that the observation equations y^I( x^a ), a = 1, ..., 2n+4, I = 1, ..., 4n, read in partitioned form as (5.79), where: the 2n×2n matrix A is block-diagonal with equal 2×2 blocks

    (  a   b )
    ( -b   a ) ,

B is the 2n×4 matrix with 2×4 blocks

    ( u_i    v_i   1   0 )
    ( v_i   -u_i   0   1 ) ,

C = I_{2n}  and  D = 0  (2n×4).
We also assume that the observation space has the standard metric, i.e. g_IJ = δ_IJ. It follows then that the induced metric g_ab reads in partitioned form as (5.81). Furthermore it follows from (5.78) that the only non-zero second derivatives of y^I( x^a ) are the mixed derivatives with respect to ( u_i, v_i ) and ( a, b ), for i = 1, ..., n.
Hence, for an arbitrary unit normal vector N the principal curvatures k_N follow from the eigenvalue problem

    | ( B( ∂_a , ∂_b ) , N )_M - k_N g_ab | = 0 .        (5.84)

Now assume that k_N ≠ 0. Working out (5.84), one finds the two conditions

    Σ_{i=1..n} ( N^{2i-1} u_i + N^{2i} v_i ) = 0   and   Σ_{i=1..n} ( N^{2i-1} v_i - N^{2i} u_i ) = 0 ,

and, with Aᵀ A = λ² I_{2n}, the characteristic equation (5.85) reduces to an explicit equation for k_N in which the point field enters only through the centred sum

    Σ_{i=1..n} ( u_i² + v_i² ) - n ( u_c² + v_c² ) = Σ_{i=1..n} ( ( u_i^c )² + ( v_i^c )² ) ,

so that the non-zero extrinsic curvatures of the Symmetric Helmert manifold are governed by the centred coordinates of the point field.
(ii) Use bounds on the extrinsic curvatures which can be computed in advance. The following upperbound (5.88) can then be used; the proof of (5.88) goes similar to that of (4.37.a).
(iii)
L e t k denote the i n absolute value largest principal curvature f o r the normal direction
N =
/ 116 1
have then
M '
with
= y
S
I I .I 1
-9.
With (5.89) and the Cauchy-Schwarz inequality we obtain the following upperbound:
a8
gYY(R) = t r a c e ( g
(R)).
with
2
aaB
theorems known f r o m the literature. F o r instance, one of the simplest exclusion theorems is:
F o r a l l eigenvalues p of a m a t r i x AaB one has
where
I I .I I
F o r the max-norm
l l XB I l
= max
l XB l
I I 5
mgx
I AaB I ,
For a diagonal dominant m a t r i x one could take Gershgorin's theorem, which says t h a t the
union of a l l discs
Am
12 I I
AaB
I},
(5.94)
(no s u m s t i o n over a ) ,
B =l
B b
n matrix A
a8
Instead of using exclusion theorems one coul.d also t r y t o compute the spectral radius o f
2
i
directly. This can turn out t o be feasible especially when per observation equation
aaB y
only a few parameters are involved.
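The Gershgorin statement (5.94) is cheap to evaluate. A minimal sketch (Python; the example matrix is our own, diagonally dominant choice):

```python
def gershgorin_discs(A):
    """Gershgorin discs of (5.94): every eigenvalue of A lies in the union
    of the discs { rho : |rho - A[a][a]| <= sum_{b != a} |A[a][b]| }."""
    n = len(A)
    return [(A[i][i], sum(abs(A[i][j]) for j in range(n) if j != i))
            for i in range(n)]

def spectral_radius_bound(A):
    """Spectral-radius bound read off from the discs; for real entries it
    coincides with the max-norm bound max_a sum_b |A[a][b]|."""
    return max(abs(c) + r for c, r in gershgorin_discs(A))

A = [[4.0, 1.0, 0.0],
     [1.0, 5.0, 1.0],
     [0.0, 1.0, 6.0]]
print(gershgorin_discs(A))        # [(4.0, 1.0), (5.0, 2.0), (6.0, 1.0)]
print(spectral_radius_bound(A))   # 7.0
```

The discs also localize the eigenvalues individually, which is more information than the plain row-sum norm provides.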
As an alternative to (5.91) we could make use of condition equations if they are available. Let

    y*^p ( y^i ) = 0 ,   p = 1, ..., (m-n) ,

denote the condition equations. Since they hold identically on the submanifold, y*( y( x ) ) = 0 can be differentiated twice along the submanifold. The first differentiation shows that the gradients ∂y*^p are normal to the submanifold; the second relates the second fundamental tensor B to the second derivatives of y*:

    ( B( X, Y ) , ∂y*^p )_M = - X^i Y^j ∂²_ij y*^p .

And with X = ∂_a, Y = ∂_b this gives expression (5.96), which bounds the principal curvatures in terms of the first and second derivatives of the condition equations.
..
(i)
figure 33
(ii)
aB
...
g..
= d i a g (... o
I J
- 2 ...)
Y
(5.99)
Now i f we also assume t h a t a l l distances i n the network are about the same, i.e.
= I
P4
and t h a t the variances o f the estimated parameters do not d i f f e r too much, we get w i t h
(5.98) and (5.99) f o r (5.91):
(iii)
!?1 1.
i=l
cos A.
= 0,
and
f!
i=l
I.
sin^
= 0
I f we assume that the observations are uncorrelated and the variances satisfy
then
and
where t h e odd numbered residuals r e f e r t o the distance residuals and the even numbered
residuals r e f e r t o the azimuth residuals.
Furthermore
it
follows
that
the
following
two
2n
2n
matrices
Im
g=l
and
Im
2 p =2
U
mn
a 2
sin A
1.
i
l
- a 2 sin A
i
Ai
A.
COS
1.
l
and
,
cos A.
- a 2 I cos A
A. i
i
Ai
sin^
2n m a t r i x
is block-diagonal w i t h blocks
( a cos A.
b sinA.)
2 2
4(a +b )
3 ( a cos A.
b sinA.)
2'
where l
2
is the distance f o r which U / l
li
i1
...,n ,
is the greatest.
i(5.105)
,
In the previous sections we dealt with the problem of finding the least-squares solution x̂, ŷ of a non-linear adjustment problem. For applications, however, knowing how to compute the solution is not enough; one also needs to find the statistical properties of the estimators involved and formulate ways of testing statistical hypotheses. Unfortunately we are not able yet to present a complete treatment of the statistical theory dealing with non-linear geodesic estimation, although it will be clear that in considering non-linear models one cannot expect a well working theory as we know it for linear models. In the following we will restrict ourselves therefore to a few general remarks.
As we have seen, Gauss' method enabled us, given the observation point y_s ∈ M, to compute the least-squares estimate x̂ of x and the adjusted observation point ŷ = y( x̂ ) such that

    ŷ = P( y_s )   and   x̂ = ( y⁻¹ ∘ P )( y_s ) ,   y⁻¹ ∘ P : M → N ,        (6.2)

where P denotes the least-squares projection onto the submanifold y( N ) ⊂ M. As maps of the normally distributed observations y_s ∈ M, the estimators ŷ and x̂ obtain probability distributions which depend both on the nature of the maps involved and on the assumed normal distribution of the observations. For making statistical inferences it is therefore important to know the statistical properties of the estimators involved.
i
I n case the coordinate functions y ( xa ) of the map y are linear, i t is not d i f f i c u l t t o derive the
precise distribution of the least-squares estimators. The following distributional properties are w e l l
known:
However, these results do not carry over t o the non-linear case. Only i n the exceptional case that one
is dealing w i t h a totally geodesic submanifold
s t i l l hold. O f course, a similar complete theory as we know i t for linear models can hardly be
expected. Essential properties which are used repeatedly i n the development of the linear theory
break down completely i n the non-linear case. Take for instance the mathematical expectation
operator E{
i.e.,
.)
the mean of the image differs generally f r o m the image o f the mean. Hence, we can hardly
expect our least-squares estimators t o be unbiased i n the non-linear case. Consequently, one cannot
justify least-squares estimation anymore by referring to the Gauss-Markov theorem. O f course this by
no means implies that one should do away w i t h the least-squares estimators. Under the usual
assumption o f normality the least-squares estimators are namely s t i l l maximum likelihood estimators.
Besides, when one overemphasizes the importance o f exactly unbiased estimators, one can f i n d
oneselves i n an impossible situation. Very o f t e n namely we have a natural estimator which is,
although biased, still perfectly reasonable, such as the estimator g( x̂ ) of a non-linear function g( x̄ ); requiring exact unbiasedness would force one to reject it. Also the notion of estimability does not carry over directly. For the linear model E{ y } = Ax, x ∈ N, a linear function ( x*, x ) of x, with x* ∈ N*, is called estimable if it possesses an unbiased linear estimator. However, this definition cannot be used for a non-linear model. First of all since a restriction to linear estimators is not reasonable anymore, and secondly since non-linear estimators are almost always biased. Thus what we need is a more general definition of estimability, one which for linear models reduces to the above given one. The answer is given by the dual relation

    x* = A* y*  for some  y* ∈ M* ,   or equivalently:  there is no x with  Ax = 0  and  ( x*, x ) ≠ 0

(see e.g. Grafarend and Schaffrin, 1974). Therefore in general it would seem more appropriate to couple the definition of estimability to the property of invariance.
Since it is impossible in general to derive precise formulae for the distributional properties of the non-linear estimators, the best we can do seems to be to find approximations. Three approaches suggest themselves:
When one has a non-linear model it is natural to hope that it is only moderately non-linear so that application of the linear theory is justified. In practical applications the first step taken should therefore be to prove whether a linear(ized) model is sufficient as approximation, because then the statistical treatment is much more simple. And since the origin of all complications in non-linear adjustment lies in the presence of curvatures, it seems reasonable to take the mean curvature as a measure of non-linearity. Let us therefore Taylorize the expressions in (6.2) about the true values ȳ = y( x̄ ). With e = y_s - ȳ this gives:
    ŷ^k = ( P( y_s ) )^k = ȳ^k + ∂_i ( P( ȳ ) )^k e^i + ½ ∂²_ij ( P( ȳ ) )^k e^i e^j + ... ,

and similarly for x̂^a with P replaced by y⁻¹ ∘ P. Taking expectations and using E{ e^i e^j } = σ² g^ij then gives

    E{ ŷ^k - ȳ^k } = ½ σ² ∂²_ij ( P( ȳ ) )^k g^ij   and   E{ x̂^a - x̄^a } = ½ σ² ∂²_ij ( ( y⁻¹ ∘ P )( ȳ ) )^a g^ij .        (6.6)

And with the definitions of the unique mean curvature normal η (see (4.33)) and the Christoffel symbols of the second kind Γ^a_{bc} (see (4.17)), and by using the fact that y_s - P( y_s ) ∈ T_ŷ N^⊥, one will find that one can rewrite (6.6) as

    (a)  E{ ŷ - ȳ } = ½ σ² n η ,
    (b)  E{ x̂^a - x̄^a } = - ½ σ² g^{bc} Γ^a_{bc} ,        (6.7)

where η and Γ^a_{bc} are evaluated at the true values, n is the dimension of N and a, b, c = 1, ..., n. With (6.7.a) it follows that the estimators ŷ are unbiased to this order whenever the mean curvature normal vanishes; for the Symmetric Helmert transformation one can verify that this is indeed the case.
Similar estimates as given by (6.7) can also be derived for the higher order moments of the non-linear estimators.
Fortunately our rather pessimistic estimates in section 5 indicate that the application of the theory of linear statistical inference is generally justified in geodetic network adjustments. But we must admit that it is not clear to us yet what to do when the model is significantly non-linear, and therefore much more research needs to be done in this area. Such being the case one may be surprised to realize how little developed the statistical theory of non-linear estimation is for practical applications. See for instance the survey papers (Cox, 1977), (Bunke, 1980); the book (Goldfeld and Quandt, 1972) and the very recent book (Humak, 1984).
An alternative way to estimate the properties of the distribution of the estimators involved would be to use computer simulation. One could replicate the series of experiments as many times as one pleases, each time with a new sample of errors drawn from the prescribed normal distribution, and so obtain the relevant distributional properties by averaging over all replications. Although this approach could give us valuable insight into the effect of non-linearity, it must be carried out on a system whose parameters are known in advance, and such a system may not always be realistic. But then again, since the distributions of the estimators involved depend on the actual distribution of the observational data, which in its turn depends on the "true" values of the parameters, one is almost always faced with the problem that even when one can derive exact formulae for the distributions, one can evaluate only the approximation obtained by substituting the estimated parameters for the true ones.
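A minimal replication experiment of this kind (Python; the distance-from-coordinates estimator and all numbers are our own illustration, not from the text) already exhibits the non-linear bias discussed above — the mean of the image differs from the image of the mean:

```python
import math
import random

# Monte Carlo study of estimator bias for a simple non-linear map:
# the distance r = sqrt(x^2 + y^2) estimated from noisy coordinates.
# True point (1, 0), i.i.d. normal errors with sigma = 0.5.  The mean of
# the estimated distances exceeds the true distance 1.
random.seed(42)
sigma, trials = 0.5, 20000
true_x, true_y = 1.0, 0.0
est = [math.hypot(true_x + random.gauss(0.0, sigma),
                  true_y + random.gauss(0.0, sigma)) for _ in range(trials)]
mean_r = sum(est) / trials
print("true distance 1.0, mean of estimates %.3f, bias %.3f"
      % (mean_r, mean_r - 1.0))
```

Averaging over replications estimates the bias directly, but — as noted above — only for the assumed "true" parameter values used to generate the samples.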
Finally we mention the possibility to rely on results from asymptotic theory. The central idea of asymptotic theory is that when the number m of observations is large and the errors of estimation correspondingly small, simplifications become available that are not available in general. The rigorous mathematical development involves limiting distributional results holding as m → ∞, and is closely related to the classical limit theorems of probability theory. In recent years many researchers have concentrated on developing an asymptotic theory for non-linear least-squares estimation. In (Jennrich, 1969) a first complete account was given of the asymptotic properties of non-linear least-squares estimators. And in (Schmidt, 1982) it was shown how the asymptotic theory can be utilized to formulate asymptotically exact test statistics. See also the very recent book (Bierens, 1984). Roughly speaking one can say that under suitable conditions one gets the same asymptotic results for the non-linear model as for the linear one. Unfortunately, we doubt whether the results obtained up to now can satisfy the requirements of applications in practice. In particular, the theory still seems to lack statements concerning the accuracy of the approximations by limit distributions.
7. Epilogue
In this chapter we have tried to show how contemporary differential geometry can be used to improve our understanding of non-linear adjustment. We have seen that unfortunately one can very seldom extend the elegant formulations and solution techniques from linear to non-linear situations. For most non-linear problems one will therefore have to have recourse, in practice, to methods which are iterative in nature. As our analysis showed, Gauss' method is pre-eminently suited for small extrinsic curvature non-linear adjustment problems. On the whole, one could say that solutions to linear problems are prefabricated, while exact solutions to non-linear problems are custom made. An important example is our inversion-free solution to the Symmetric Helmert transformation.
Although we have treated a number of new aspects of non-linear adjustment, we must recognize that we are only on the brink of understanding the complex of problems of non-linear adjustment. Many problems and topics were left untouched or were not further elaborated upon.
For instance, in our proof of the global convergence theorem (4.66) we made use of the line search strategy known as the minimization rule. However, its practical application is limited by the fact that the line search must be exact, i.e., the minimum along the search direction must be located exactly.
In our discussion of Gauss' method, we assumed the non-linear map y to be injective. However, in many practical applications the matrix of first derivatives ∂_a y^i becomes of non-maximum rank (see e.g. chapter III) and the required inverses cannot be calculated. A way out of this dilemma is then suggested by the theory of inverse linear mapping. Instead of an ordinary inverse of g_ab, one then takes a generalized inverse, g^-ab say, of g_ab. To show that Δx^a = - g^-ab ∂_b E is still in a descent direction, note that grad E lies in the range space of g_ab, say ∂_a E = g_ab w^b. Then

    - ( grad E , Δx ) = ∂_a E g^-ab ∂_b E = w^a g_ab g^-bc g_cd w^d = w^a g_ab w^b ≥ 0 ,

so that, if Δx ≠ 0, the step Δx has a positive inner product with the negative gradient direction and is therefore still a direction of descent.
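The descent argument can be made tangible with a tiny rank-deficient example (Python; the particular matrices are our own choice, with g g⁻ g = g easy to verify by hand):

```python
def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

# A rank-deficient "normal matrix" g and one of its generalized inverses.
g      = [[1.0, 1.0],
          [1.0, 1.0]]
gminus = [[1.0, 0.0],
          [0.0, 0.0]]

# Any gradient in the range space of g:  grad E = g w.
w = [1.0, 2.0]
grad = matvec(g, w)                      # [3.0, 3.0]
dx = [-v for v in matvec(gminus, grad)]  # generalized Gauss step
print(dx, dot(grad, dx))                 # descent: (grad E, dx) = -9.0 < 0
```

The inner product (grad E, Δx) is negative whatever admissible generalized inverse is chosen, which is exactly the property needed for the iteration to keep descending.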
Of the many iteration methods available, we only discussed Gauss' method. We did not mention any of the possible alternative iteration methods such as, for instance, Newton's method, Levenberg-Marquardt's compromise or the method of conjugate-gradients (see e.g. Ortega and Rheinboldt, 1970). Although more intricate, these methods can become quite attractive in case of large curvature problems since they take care, in one way or the other, of the curvature behaviour of the submanifold.
We also did not discuss the interesting point of view which is provided if one interprets the iteration process as a dynamical system. Consider namely Gauss' method

    (a)  Δx^b_q = g^{ba}( x_q ) ∂_a y^i( x_q ) g_ij ( y_s^j - y^j( x_q ) ) ,
    (b)  x^b_{q+1} = x^b_q + t_q Δx^b_q ,        (7.1)

and associate with it the autonomous dynamical system

    dx^b / dt = Δx^b ( x(t) ) .

Its solution is a curve x(t) which passes through the initial value x_0 at time t = 0 and which has its velocity given by the value of the vector field Δx, i.e. essentially by - grad E. One can show that under suitable conditions

    lim_{t→∞} x(t) = x̂ ,   with   grad E( x̂ ) = 0 ,

although the critical point is never reached exactly. This is like the pendulum paradox, which says that the pendulum once it is in motion can never come to a state of rest, but only approximate one arbitrarily closely. Thus, given an initial guess x_0 which is not a critical point of E, one can try to solve our non-linear adjustment problem by solving the system of differential equations (7.1).
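The gradient-flow interpretation can be tried out on a toy example (Python; the circle model and the numbers are ours, not from the text):

```python
import math

# Gradient flow dx/dt = -grad E(x) for the model y(x) = (cos x, sin x)
# (unit circle) and observation y_s = (2, 0).  Here
#   E(x) = ||y_s - y(x)||^2 / 2 = (5 - 4 cos x) / 2,   grad E(x) = 2 sin x,
# and the flow drives any start value in (-pi, pi) to the critical point 0.
def grad_E(x):
    return 2.0 * math.sin(x)

x, dt = 1.0, 0.05          # initial guess x0 = 1, Euler step size
for _ in range(500):       # forward-Euler integration of the flow
    x -= dt * grad_E(x)
print("limit point: %.6f   grad E there: %.6f" % (x, grad_E(x)))
```

Just as in the pendulum picture, the iterate only approximates the equilibrium arbitrarily closely: after finitely many steps x is small but never exactly zero.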
In connection with the above dynamical interpretation we also mention the potential value which a study of the qualitative theory of the global behaviour of dynamical systems, and of Morse theory, can have for a betterment of our understanding of non-linear adjustment. This qualitative theory is namely concerned with the existence of equilibrium behaviour of a dynamical system, together with questions of local and global stability (see e.g. ...). Morse theory studies, amongst other things, the equilibrium configuration of a gradient system. The Morse inequalities, for instance, place restrictions on the number of critical points that a function E can have due to the topology of the manifold on which it is defined (see e.g. Hirsch, 1976).
Another topic left untouched is the case where the parameters are required to satisfy additional non-linear constraints. This would correspond to a non-linearly constrained adjustment problem. Although the geometry of the problem is not too different from the one discussed in this chapter, the various methods for actually solving a constrained problem can become quite involved (see e.g. ...). One way to go about it is to prolong the original constrained problem, with the aid of the Lagrange multiplier rule, to one which is unconstrained. It is interesting to point out that although the Lagrange multipliers are often thought of as being merely dummy variables, which are just needed to prolong the constrained problem into an unconstrained one, they actually have an important interpretation of their own. In fact, there is a very rich duality theory connected with the Lagrangian formulation (see e.g. Rockafellar, 1969). The Lagrangian formulation has namely the physical significance that it replaces the given (kinematical) constraints by forces which maintain those constraints. As a result the multipliers equal the forces of constraint (see e.g. Krarup, 1982b). The multipliers can therefore be used as test statistics. For linear models one can show that the standardized Lagrangian multiplier equals Baarda's W-test statistic (see Teunissen, 1984b).
That many more problems and topics related to non-linear adjustment can be brought forward is indisputable. Many questions are still open for future research and it will probably take some time before we understand non-linear geodesic adjustment as well as we understand linear adjustment. We therefore conclude by expressing the wish that the rather unsurveyed area of non-linear adjustment and statistical inference will receive more serious attention than it has received hitherto.
REFERENCES
Adám, J., F. Halmos and M. Varga (1982): On the Concepts of Combination of Doppler Satellite and Terrestrial Geodetic Networks, Acta Geodaet. Sci. Hung.
Eerste deel"),
(1983): Duality Considerations, Manuscripta
D.T.
Gravesteijn, H.M.
Teunissen
(1982): The Delft Approach for the Design and Computation of Geodetic Networks, In: "Forty Years of Thought ..."
Engler, K.,
Geodetic Networks with Observables in Geometry and Gravity Space. DGK, Reihe B, Heft Nr. 258/VII, München, pp. 119-141.
Flemming, W. (1977): Functions of Several Variables, Springer Verlag.
Gauss, C.F.
(1887): Abhandlungen zur Methode der Kleinsten Quadrate, Deutsch herausgegeben von
and R.E.
Grafarend, E.,
Christoffels, in E.B. Christoffel, The Influence of his Work on Mathematics and the Physical Sciences. Edited by P.L. Butzer and F. Fehér, Birkhäuser Verlag.
Grafarend, E.W.,
E.H.
Helmert, F.R. (1880): Die Mathemat. und Physikal. Theorieen der Höheren Geodäsie, Leipzig.
Hestenes, M.R. (1975): Optimization Theory, The F i n i t e Dimensional Case, John Wiley, New York.
Hirsch, M.W. (1976): Differential Topology, Springer-Verlag.
Hirsch, M.W.
Jackson, J. (1982): Survey Adjustment, Survey Review, Vol. 26, No. 203, pp. 248-249.
Jennrich, R.I.
Kooimans, A.H.
Krarup, T. (1982a): Non-Linear Adjustment and Curvature, In: "Daar heb ik veertig jaar over nagedacht ...", pp. 145-159.
Krarup, T. (1982b): Mechanics of Adjustment, Peter Meissl ..., T.U. Graz.
Kube, R. and K. Schnädelbach (1975): Geometrical Adjustment of the European Triangulation Networks - Report of ... II.
Kubik, K. (1967): Iterative Methoden zur Lösung des Nichtlinearen Ausgleichsproblems, ZfV., Nr. 6, pp. 214-225.
Levallois, J.J.
Marussi, A. (1952): Intrinsic Geodesy, The Ohio State Research Foundation, Project No. 485,
Columbus.
Meissl, P. (1973): Distortions of Terrestrial Networks caused by Geoid Errors, Bollettino di Geodesia e
Scienze Affini, N. 2, pp. 41-52.
Meissl, P. (1982): Least Squares Adjustment, A Modern Approach, Mitteilungen der geodätischen
Institute der Technischen Universität Graz, Folge 43.
Molenaar, M. (1981a): A Further Inquiry into the Theory of S-transformations and Criterion Matrices,
Netherlands Geodetic Commission, Vol. 7, Nr. 1, Delft.
Molenaar, M. (1981b): S-transformations and Artificial Covariance Matrices in Photogrammetry, ITC
Journal, No. 1, pp. 70-79.
Morduchow, M. and L. Levin (1959): Comparison of the Method of Averages with the Method of
Least-Squares: Fitting a Parabola.
and W.C.
413.
Peterson, A.E.
Satellite Data, The Canadian Surveyor, Vol. 28, No. 5, pp. 487-495.
Pope, A. (1972): Some Pitfalls to be Avoided in the Iterative Adjustment of Nonlinear Problems,
Proceedings of the 38th Annual Meeting, American Society of Photogrammetry.
Pope, A. (1974): Two Approaches to Nonlinear Least-Squares Adjustments, The Canadian Surveyor,
Vol. 28, No. 5, pp. 663-669.
Rao, C.R. (1973): Linear Statistical Inference and its Applications, Wiley, New York.
Rao, C.R. and S.K. Mitra (1971): Generalized Inverse of Matrices and its Applications, J. Wiley, New
York.
Rockafellar, R.T. (1969): Convex Analysis, Princeton University Press, Princeton, N.J.
Rummel, R. and P.J.G. Teunissen (1982): A Connection between Geometric and Gravimetric Geodesy,
In: "Forty Years of Thought ...", Anniversary edition on the occasion of the 65th birthday of
Professor W. Baarda, Vol. II, pp. 602-623.
Rummel, R. (1984): From the Observational Model to Gravity Parameter Estimation, Lecture Notes
of the International Summer School on Local Gravity Field Approximation, Beijing, China,
Aug. 21 to Sept. 4.
Sansò, F. (1973): An Exact Solution of the Roto-Translation Problem, Photogrammetria, 29, pp. 203-216.
Schmidt, W.H. (1982): Testing Hypotheses in Nonlinear Regressions, Math. Operationsforsch. Statist.
E. and E. Mikhail (1973): Observations to the Reference Ellipsoid, Bull. Géod. 56, no. 4, pp. 356-363.
Teunissen, P.J.G. (1984a): Generalized Inverses, Adjustment, The Datum Problem and
S-transformations, 10 May 1984.
25 April
Teunissen, P.J.G.: Paper presented at the 16th European Meeting of Statisticians, Marburg (FRG),
3-7 Sept. 1984.
Tienstra, J.M. (1948): The Foundation of the Calculus of Observations and the Method of Least-Squares.
Tienstra, J.M. (1956): Theory of the Adjustment of Normally Distributed Observations, N.V.
Uitgeverij Argus, Amsterdam.
Torge, W. and H.G.
and M.I.