IMPORTANT Ray Optics Notes 01

Ray Optics for Imaging Systems
Course Notes for IMGS-321

11 December 2013
Roger Easton
Chester F. Carlson Center for Imaging Science
Rochester Institute of Technology
54 Lomb Memorial Drive
Rochester, NY 14623
1-585-475-5969
easton@cis.rit.edu

Contents
Preface
0.1 References:
1 Introduction
1.1 Models of Light and Propagation
1.1.1 Ray model of light (geometrical optics)
1.1.2 Wave model of light (physical optics):
1.1.3 Photon model of light (quantum optics):
2 Ray (Geometric) Optics
2.1 What is an imaging system?
2.1.1 Simplest Imaging System: Pinhole in Absorber
2.2 First-Order Optics
2.3 Third-Order Optics
2.3.1 Higher-Order Approximations
2.4 Notations and Sign Conventions
2.4.1 Nature of Objects and Images:
2.5 Human Eye
2.6 Principle of Least Time
2.7 Fermat's Principle for Reflection
2.7.1 Plane Mirrors
2.8 Fermat's Principle for Refraction:
2.8.1 Dispersion
2.8.2 Refractive Constants for Glasses
2.9 Image Formation in the Ray Model
2.9.1 Refraction at a Spherical Surface
2.9.2 Imaging with Spherical Mirrors
2.10 First-Order Imaging with Thin Lenses
2.10.1 Examples of Thin Lenses
2.10.2 Spherical Mirror
2.11 Image Magnifications
2.11.1 Transverse Magnification:
2.11.2 Longitudinal Magnification:
2.11.3 Angular Magnification
2.12 Single Thin Lenses
2.12.1 Positive Lens
2.12.2 Negative Lens
2.12.3 Meniscus Lenses
2.12.4 Simple Microscope (magnifier, magnifying glass, loupe)
2.13 Systems of Thin Lenses
2.13.1 Two-Lens System
2.13.2 Effective (Equivalent) Focal Length
2.13.3 Summary of Distances for Two-Lens System
2.13.4 Effective Power of Two-Lens System
2.13.5 Lenses in Contact: t = 0
2.13.6 Positive Lenses Separated by t < f1 + f2
2.13.7 Cardinal Points
2.13.8 Lenses separated by t = f1 + f2: Afocal System (Telescope)
2.13.9 Positive Lenses Separated by t = f1 or t = f2
2.13.10 Positive Lenses Separated by t > f1 + f2
2.13.11 Compound Microscopes
2.13.12 Two Positive Lenses with Different Focal Lengths and Different Separations
2.13.13 Systems of One Positive and One Negative Lens
2.13.14 Newtonian Form of Imaging Equation
2.13.15 Example (1) of Two-Lens System
2.13.16 Example (2) of Two-Lens System: Telephoto Lens
2.13.17 Images from Telephoto System:
2.13.18 Example (3) of Two-Lens System: Two Negative Lenses
2.14 Plane and Spherical Mirrors
2.14.1 Comparison of Thin Lens and Concave Mirror
2.15 Stops and Pupils
2.15.1 Focal Ratio: f-number
2.15.2 Example: Focal Ratio of Lens-Aperture Systems
2.15.3 Example: Exit Pupils of Telescopic Systems
2.15.4 Pupils and Diffraction
2.15.5 Field Stop
2.16 Marginal and Chief Rays
2.16.1 Telecentricity
2.16.2 Marginal and Chief Rays for Telescopes
3 Tracing Rays Through Optical Systems
3.1 Paraxial Ray Tracing Equations
3.1.1 Paraxial Refraction
3.1.2 Paraxial Transfer
3.1.3 Linearity of the Paraxial Refraction and Transfer Equations
3.1.4 Paraxial Ray Tracing
3.2 Matrix Formulation of Paraxial Ray Tracing
3.2.1 Refraction Matrix
3.2.2 Ray Transfer Matrix
3.2.3 Vertex-to-Vertex Matrix for System
3.2.4 Example 1: System of Two Positive Thin Lenses
3.2.5 Example 2: Telephoto Lens
3.2.6 $M_{VV'}$ Derived From Two Rays
3.3 Object-to-Image (Conjugate) Matrix
3.3.1 Matrix of the Relaxed Eye (focused at $\infty$)
3.4 Vertex-Vertex Matrices of Simple Imaging Systems
3.4.1 Magnifier (magnifying glass, loupe)
3.4.2 Galilean Telescope of Thin Lenses
3.4.3 Keplerian Telescope of Thin Lenses
3.4.4 Thick Lenses
3.4.5 Microscope
3.5 Image Location and Magnification
3.6 Marginal and Chief Rays for the System
3.6.1 Examples of Marginal and Chief Rays for Systems
4 Depth of Field and Depth of Focus
4.0.2 Examples of Depth of Field from Video and Film
4.1 Criterion for Acceptable Blur
4.2 Depth of Field via Rayleigh's Quarter-Wave Rule
4.3 Hyperfocal Distance
4.4 Methods for Increasing Depth of Field
4.5 Sidebar: Transverse Magnification vs. Focal Length
5 Aberrations
5.1 Chromatic Aberration
5.2 Third-Order Optics, Monochromatic Aberrations
5.2.1 Names of Aberrations
5.2.2 Aberration Coefficients
5.2.3 Fourth-Order (Third-Order Ray) Aberrations:
5.2.4 Zernike Polynomials
5.3 Structural Aberration Coefficients
5.4 Optical Imaging Systems and Sampling
5.5 Optical System Rules of Thumb
Preface
This book is intended to introduce the mathematical tools that can be applied to model and predict
the action of optical imaging systems.
0.1 References:
Many references exist for the subject of wave optics, some from the point of view of physics and many
others from the subdiscipline of optics. Unfortunately, relatively few from either camp concentrate
on the aspects that are most relevant to imaging.
Useful Optics Texts:

[P3] (the three) Pedrottis, Introduction to Optics, Pearson Prentice-Hall, 2007.
[G] Gaskill, Jack D., Linear Systems, Fourier Transforms, and Optics, John Wiley, 1978.
[JG] Goodman, Joseph, Introduction to Fourier Optics, Third Edition, Roberts & Company,
2005.
[H] Eugene Hecht, Optics, 4th Edition, Addison-Wesley, 2002.
[PON] Reynolds, DeVelis, Parrent, Thompson, The New Physical Optics Notebook, SPIE,
1989.
[BW] Max Born and Emil Wolf, Principles of Optics, 7th Expanded Edition, Cambridge
University Press, 2005.
[GF] Grant R. Fowles, Introduction to Modern Optics (Second Edition), Dover Publications,
1975.
[RHW] Robert H. Webb, Elementary Wave Optics, Dover Publications, 1997.
[FLS] R. Feynman, R. Leighton, M. Sands, The Feynman Lectures on Physics, Addison-
Wesley, 1964.
[KF] M.V. Klein and T.E. Furtak, Optics, Second Edition, Wiley, 1986
[JW] F. Jenkins and H. White, Fundamentals of Optics, 4th Edition, McGraw-Hill, 1976.
[NP] A. Nussbaum and R. Phillips, Contemporary Optics for Scientists and Engineers,
Prentice-Hall, 1976.
[I] K. Iizuka, Engineering Optics, Springer-Verlag, 1985.
[FBS] D. Falk, D. Brill, and D. Stork, Seeing the Light, Harper and Row, 1986.
Lawrence Mertz, Transformations in Optics, John Wiley & Sons, 1965.
Physics Texts with useful discussions:

[HR] D. Halliday and R. Resnick, Physics, 3rd Edition, Wiley, 1978.
[C] F. Crawford, Waves, Berkeley Physics Series Vol. III, McGraw-Hill, 1968.
John D. Jackson, Classical Electrodynamics, Third Edition, Wiley, 1998, 6.
Feynman, Leighton, and Sands, Lectures on Physics, particularly Volume 1.25-33 and
Volume II 32-33
Curriculum: Geometrical Optics and Imaging

1. Models for light propagation
(a) ray model (geometric optics)

(b) wave model (physical optics)
(c) photon model (quantum optics)
2. First-order optics
(a) third-order optics, aberrations

(b) higher-order approximations
3. Sign conventions for distances and angles
(a) Nature of objects and images (real and virtual)

4. Human eye
5. Refractive index
(a) Optical path length

(b) Fermat's principle of least time (P3 2.2, H 4.5, BW 3.3)
(c) Snell's law for reflection: $\theta_2 = \theta_1$
i. plane mirrors
(d) Snell's law for refraction: $n_1 \sin[\theta_1] = n_2 \sin[\theta_2]$
i. plane interface between two media
(e) Dispersion (variation in $n$ with $\lambda$)
i. relationship between mean refractive index and dispersion
ii. crown and flint glasses
(f) Dispersing prisms
6. Refraction at a Spherical Surface
(a) Paraxial approximation, imaging equation

(b) Reflection at a spherical surface
7. Imaging with thin lenses
(a) Imaging equation in terms of object and image distances and focal length
(b) system power
(c) spherical mirrors
(d) object/image conjugates
(e) Image magnifications
i. Transverse magnification
ii. Longitudinal magnification
iii. Angular magnification
(f) Single thin lenses
i. positive lens
ii. negative lens
iii. meniscus lens
iv. simple microscope
(g) Systems of thin lenses
i. lenses in contact
ii. effective focal length and power of two-lens system
iii. focal and principal points
iv. afocal systems (telescopes)
v. eyeglasses
vi. compound microscopes
vii. Newtonian form of imaging equation
viii. telephoto lens
ix. Stops and pupils
A. aperture stop
B. entrance and exit pupils
C. field stop
(h) Marginal and chief (principal) rays
i. telecentricity
8. Tracing rays through optical systems
(a) paraxial ray tracing equations

i. paraxial refraction
ii. paraxial transfer
iii. linearity of equations
(b) matrix formulation of paraxial ray tracing
i. refraction matrix
ii. transfer matrix
iii. Lagrangian invariant
iv. vertex-to-vertex matrix for imaging system
v. object-to-image (conjugate) matrix
vi. matrix for eye model
(c) Examples of imaging system matrices
i. magnifier
ii. Galilean telescope
iii. Keplerian telescope
iv. thick lens
v. microscope
(d) image location and magnification
(e) Depth of field and depth of focus
i. examples from film and video
ii. criterion for acceptable blur
iii. depth of field via Rayleigh's quarter-wave rule
iv. hyperfocal distance
v. methods for increasing depth of field
vi. transverse magnification vs. focal length
(f) Aberrations
i. Chromatic aberration
A. achromatic doublet
B. apochromatic triplet
ii. Third-Order (Seidel) Aberrations
A. spherical aberration (relation to defocus)
B. coma
C. astigmatism
D. distortion
E. curvature of field
F. piston error
9. Computed Ray Tracing, OSLO™

Chapter 1
Introduction
The obvious first question to consider is "what is optics?" (or perhaps "what are optics?"; heh, heh).
One reasonable definition of optics is "the application of physical principles and observed phenomena
to manipulate light in useful ways." This presupposes the definition of "light," which I specify as
"electromagnetic radiation of any color, temporal frequency, and wavelength." This is more general
than the definition put forth by humanocentrics (e.g., color scientists), but is much more reasonable
in our field, where we want to take advantage of all measurable radiation to learn information
about objects that emit, reflect, refract, or otherwise modify radiation. The definition in imaging
is somewhat narrower: "the application of the properties of materials and of light to form images,"
which are recognizable (though approximate) replicas of the spatial and spectral distribution of
light reflected, transmitted, and/or emitted by an object.
To design optical image-forming systems, we must model the propagation of light from the
object (source) to the optic, the action of the optic on the incident light distribution, and finally
propagation from the optic to the sensor. The last step of conversion of the spatial (and possibly
spectral) distribution of incident light into measurable physical and/or chemical changes in some
medium by the sensor, is outside the scope of this discussion.
We hope to find a mathematical model of optical imaging as a system, where an output
distribution $g$ is created from an input object distribution $f$ by the action of an imaging system
$\mathcal{O}$, e.g., $g[x, y, \lambda] = \mathcal{O}\{f[x, y, z, \lambda]\}$. We generally use this model to (try to) solve the inverse imaging
problem by inferring the input object from the output image and knowledge of the system. The task
may be difficult or even impossible; it is easy to see one difficulty because most sensors measure only
a 2-D distribution of monochromatic light and therefore cannot possibly recover the three spatial
dimensions of a realistic object from a single image.
Schematic of an optical system that acts on an input with three spatial dimensions, time, and
wavelength $f[x, y, z, t, \lambda]$ to produce a 2-D monochrome (gray scale) image $g[x', y']$.
1.1 Models of Light and Propagation

To be able even to write down, let alone solve, the imaging equation(s) for optical systems, we
need to specify the mathematical model of light that will describe its behavior as it propagates and
interacts with input objects, optical systems, and output sensors. To simplify the descriptions in
the different contexts, three physical models for light and its interactions are used that are (loosely
speaking) distinguished by the physical scale of the phenomena:
1.1.1 Ray model of light (geometrical optics)

1. macroscopic-scale phenomena (e.g., reflection, refraction)
(a) light propagates as RAYS that travel in straight lines until encountering a change in
properties of a medium or an interface between media. Except to differentiate the "color"
of light, the wavelength and temporal frequency of the light are assumed to be zero
and infinity, respectively ($\lambda \to 0$, $\nu \to \infty$), which means that there are no effects due to
diffraction;
(b) uses Fermat's principle of least time to derive Snell's law, which describes the phenomena
of reflection and refraction;
(c) useful for designing imaging systems (to locate the images and determine their
magnifications);
(d) calculations for modeling the behavior of optical systems (lenses and/or mirrors) are
(relatively) simple and may be easily implemented in software;
(e) the quality of images from the system is assessed in terms of aberrations of the optical
system, which describe deviations of the image from ideal behavior.
1.1.2 Wave model of light (physical optics):

1. microscopic-scale phenomena (diffraction/interference, reflection, refraction, refractive index,
...)
(a) considers light (electromagnetic radiation) to propagate as WAVES;
(b) propagation and interaction of light are described by Maxwell's equations;
(c) light propagates with velocity $c$ in vacuum ($c \approx 3 \times 10^{8}\ \mathrm{m\,s^{-1}}$) and velocity $v < c$ in
transparent materials;
(d) light is described by its wavelength in vacuum $\lambda_0$ and oscillation frequency $\nu_0$, whose
values affect any interactions with matter;
(e) the oscillation frequency $\nu_0$ of waves emitted by a particular light source is constant
regardless of medium and is related to the vacuum wavelength $\lambda_0$ via:
\[ \lambda_0 \nu_0 = c \]
(f) the ratio of the propagation velocities in vacuum and in a medium is the index of refraction
of the medium:
\[ n \equiv \frac{c}{v} \]
(g) the wavelength of the wave in a medium is shorter than the vacuum wavelength $\lambda_0$ via:
\[ \lambda_{medium} = \frac{\lambda_0}{n} \]
(h) wave optics explains the image-forming phenomena of reflection, refraction, diffraction
(and interference, which is really just another name for diffraction) and the phenomena
of polarization and dispersion that affect the quality of images;
(i) mathematical calculations in wave optics are more complicated than those in ray optics
and often not easy to implement in computers. For example, it is difficult to evaluate the
exact form of light after propagating a short distance from the source;
(j) uses the Huygens-Fresnel principle to derive the mathematical model for propagation of
light, which is often divided into three regions:
i. linear, shift-invariant model in the Rayleigh-Sommerfeld diffraction region (valid
everywhere)
ii. linear, shift-invariant approximation in the near field for propagation by a
sufficiently large distance from the source (Fresnel diffraction)
iii. linear, shift-variant approximation in the far field for propagation to very large
distances from the source (Fraunhofer diffraction);
(k) wave/physical optics is useful for assessing the quality of the images produced by systems.
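The relations in items (e) through (g) can be checked numerically. The following is a minimal sketch, not part of the original notes: the function names and the example values (500 nm light in a glass of index 1.5) are assumptions chosen only for illustration.

```python
C_VACUUM = 2.998e8  # speed of light in vacuum [m/s]

def frequency_from_vacuum_wavelength(wavelength_0):
    """Temporal frequency nu_0 [Hz] from lambda_0 * nu_0 = c."""
    return C_VACUUM / wavelength_0

def velocity_in_medium(n):
    """Propagation velocity v = c / n [m/s] for refractive index n."""
    return C_VACUUM / n

def wavelength_in_medium(wavelength_0, n):
    """Wavelength inside the medium, lambda_0 / n [m]."""
    return wavelength_0 / n

# Hypothetical example values: green light entering a glass with n = 1.5
lambda_0 = 500e-9
n_glass = 1.5
nu_0 = frequency_from_vacuum_wavelength(lambda_0)
v = velocity_in_medium(n_glass)
lambda_m = wavelength_in_medium(lambda_0, n_glass)
print(f"nu_0 = {nu_0:.4e} Hz, v = {v:.4e} m/s, lambda_medium = {lambda_m:.4e} m")
```

Dividing v by lambda_m returns the same frequency nu_0, which is the statement in item (e) that the frequency does not change with the medium.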
1.1.3 Photon model of light (quantum optics):

1. atomic-scale phenomena (emission and absorption of radiation)
(a) light is composed of PHOTONS with both wave and particle characteristics;
(b) used to explain/analyze the physical interaction of light and matter, such as emission by
sources (e.g., lasers), and the photoelectric effect in sensors;
(c) Fundamental relationships: $E_0 = h\nu_0 = \frac{hc}{\lambda_0}$ and momentum $p = \frac{E}{c} = \frac{h}{\lambda_0}$, where $h$ is
Planck's constant:
\[ h = 6.626 \times 10^{-34}\ \mathrm{J\,s} = 4.136 \times 10^{-15}\ \mathrm{eV\,s} \]
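As a rough numerical illustration of the photon relations in item (c), the sketch below evaluates the energy and momentum of a single photon. The function names and the 500 nm example wavelength are assumptions made for this example; the constants are the values quoted in the text, with c taken as 2.998 x 10^8 m/s.

```python
H_JOULE_SECONDS = 6.626e-34  # Planck's constant [J s]
H_EV_SECONDS = 4.136e-15     # Planck's constant [eV s]
C_VACUUM = 2.998e8           # speed of light in vacuum [m/s]

def photon_energy_joules(wavelength_0):
    """E_0 = h c / lambda_0 in joules."""
    return H_JOULE_SECONDS * C_VACUUM / wavelength_0

def photon_energy_ev(wavelength_0):
    """E_0 = h c / lambda_0 in electron volts."""
    return H_EV_SECONDS * C_VACUUM / wavelength_0

def photon_momentum(wavelength_0):
    """p = h / lambda_0 in kg m/s."""
    return H_JOULE_SECONDS / wavelength_0

# Hypothetical example: a photon with vacuum wavelength 500 nm
lambda_0 = 500e-9
energy_j = photon_energy_joules(lambda_0)
energy_ev = photon_energy_ev(lambda_0)
momentum = photon_momentum(lambda_0)
print(f"E = {energy_j:.3e} J = {energy_ev:.3f} eV, p = {momentum:.3e} kg m/s")
```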
Phenomena described by the ray and wave models are most relevant to imaging, though the
quantum model is vital for understanding the properties and artifacts of light sensing. You probably
have seen some consideration of ray optics in undergraduate physics, and any such experience will
be useful in this course. The most common treatments of optics consider rays first because the
mathematical models and calculations are simpler. However, the preparation of linear systems you
just had makes it possible and even desirable to consider the wave model first by applying the
concepts of the impulse response and transfer function; these may significantly simplify the concepts
and calculations.
There are several goals to be reached by the conclusion of this discussion; we want to have the
capabilities to do several things:
• locate the image(s) of an object generated by the lens, mirror, or system of lenses and/or
mirrors;
• determine the character (real or virtual) and the size(s) (i.e., the transverse magnification)
of the image(s);
• determine the field of view of the imaging system, i.e., the angular subtense of the object
that is imaged;
• determine the range of distances in the scene from the optical system that appears to be "in
focus" (the depth of field);
• determine the capability of the optics to distinguish closely spaced objects; this is the spatial
resolution of the system (often specified in terms of measurements from the point spread
function or the modulation transfer function (MTF), which are optical analogues of
the impulse response and transfer function that are considered in the course on Fourier
methods);
• understand the constraints on system performance due to the properties of materials used in the
imaging system, such as the variation in refractive index of glass with wavelength (dispersion).
Much of this discussion (especially about depth of field and spatial resolution) will benefit from
concepts derived in the course on Fourier methods, but we must also be aware of the limitations in
these concepts due to nonlinearities and/or shift-variant properties of the optical system.
Chapter 2
Ray (Geometric) Optics
Ray optics (commonly, though unfortunately, called "geometric optics") uses the model of light as a
ray to evaluate the locations and properties of images created by systems of lenses and/or mirrors.
It does not consider any effects due to the wave model of light, such as interference or diffraction
(which are actually just different words for the same phenomenon: "interference" considers few light
sources and "diffraction" considers an infinite number, or just "many"). The ray model also cannot
explain other wave-propagation phenomena, such as total internal reflection. The subject of ray
optics may be subdivided into categories of "first-order," "third-order," and even higher-order
optical computations.
2.1 What is an imaging system?

As a simple definition, we may consider an imaging system to map the distribution of the input
object to a "similar" distribution at the output image (where the meaning of "similar" is to be
determined). Often the input and output amplitudes are represented in different units. For example,
the input often is electromagnetic radiation with units of, say, watts per unit area, while the output
may be a transparent negative emulsion measured in dimensionless units of "density" or
"transmittance." In other words, the system often changes the form of the energy; it is a "transducer."
In the ray model, we can think of the imaging system as selecting and/or redirecting rays of
light to map the energy onto the image sensor. The selection or redirection process uses some
type of physical interaction between light and matter to remap the energy emitted or modified by
the object onto the sensor. Among the more obvious physical interactions in our experience are
refraction and reflection, but these are not the only, nor even the simplest, possible mechanisms.
The very simplest interaction between light and matter is absorption, where the light energy is
transferred to matter and "disappears" (of course, it does not really vanish; most often it is
converted into heat in the matter, but it is no longer available to create an image, so it may as well
have disappeared). We can use an absorber to create the simplest imaging system: the pinhole
camera.
2.1.1 Simplest Imaging System: Pinhole in Absorber

Consider a 3-D volume of space that contains the object. Occasionally, a ray of light emitted (or
reflected) from a location in the volume is selected by the pinhole and reaches the sensor.
• every point in space is "in focus" on the sensor
• the transverse magnification $M_T$ is determined by the relative distances:
\[ M_T = -\frac{z_2}{z_1} \]
• the negative sign means that the image is inverted
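The pinhole magnification can be evaluated in a few lines. This is a minimal sketch under the assumption that z1 is the (positive) distance from the object to the pinhole and z2 is the (positive) distance from the pinhole to the sensor; the function name and the example numbers are hypothetical.

```python
def pinhole_magnification(z1, z2):
    """Transverse magnification M_T = -z2 / z1 of a pinhole camera.

    z1: distance from object to pinhole (assumed positive)
    z2: distance from pinhole to sensor (assumed positive)
    """
    if z1 <= 0 or z2 <= 0:
        raise ValueError("both distances are assumed to be positive")
    return -z2 / z1

# Hypothetical example: a 1 m tall object located 2 m in front of the
# pinhole, with the sensor 0.1 m behind the pinhole.
object_height = 1.0
m_t = pinhole_magnification(z1=2.0, z2=0.1)
image_height = m_t * object_height
print(f"M_T = {m_t}, image height = {image_height} m (negative: inverted)")
```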
The number of rays from the object that actually reach the image is small. The interaction
with the sensor requires the quantum model of discrete energy packets, so the number of packets
is small if the hole diameter is small. If the object is a uniformly emitting planar source, the
numbers of packets measured from different locations in the field are different (Poisson statistics);
these numerical variations in what should be identical measurements appear as noise. The metric
of noise is determined by the mean value $\mu$ of the signal and the variation about that mean, which
is described by the standard deviation $\sigma$. The signal-to-noise ratio is a dimensionless quantity that
may be defined many ways, but we'll use a simple definition that will suit this purpose:
\[ SNR = \frac{\mu}{\sigma} = \sqrt{\mu} \quad \text{(Poisson statistics)} \]
More photons lead to larger signals ($\mu \uparrow$) and larger standard deviation ($\sigma \uparrow$), but the mean increases
faster than the standard deviation $\sigma = \sqrt{\mu}$ (the variance of a Poisson process is $\sigma^2 = \mu$), so the SNR
increases: better statistics and less relative noise.
The quality of the image depends on the diameter $d_0$ of the pinhole. The statistics improve with
the number of photons, which requires a larger dose or a larger pinhole. The blur of the image, on
the other hand, is smaller for a smaller pinhole because there is less uncertainty in the ray path.
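The tradeoff between pinhole size and noise can be sketched numerically. The example assumes Poisson statistics (standard deviation equal to the square root of the mean count) and a mean count proportional to the pinhole area; the function names, the diameters, and the count density are hypothetical values, not taken from the notes.

```python
import math

def poisson_snr(mean_count):
    """SNR = mu / sigma, with sigma = sqrt(mu) for Poisson-distributed counts."""
    sigma = math.sqrt(mean_count)
    return mean_count / sigma

def mean_count_for_pinhole(diameter, counts_per_unit_area):
    """Mean photon count through a circular pinhole of area pi * d^2 / 4."""
    return counts_per_unit_area * math.pi * diameter ** 2 / 4

# Hypothetical diameters (mm) and count density (photons per mm^2 per exposure)
for d in (0.1, 0.2, 0.4):
    mu = mean_count_for_pinhole(d, counts_per_unit_area=1.0e4)
    print(f"d = {d} mm: mean count = {mu:.1f}, SNR = {poisson_snr(mu):.2f}")
```

Under these assumptions the SNR grows in proportion to the pinhole diameter, while the geometric blur grows as well, which is the compromise described above.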
How to improve?
Longer exposure time
multiple pinholes
Depth of field
Redirect rays:
reflective pinholes
Reflection
Refraction
Diffraction (wave property), e.g., holography
2.2 First-Order Optics

Of most concern to us will be first-order, paraxial, or Gaussian optics, where the angles of
light rays measured relative to the optical axis are assumed to be small, so that the ray heights
remain small as the rays propagate down the optical axis. This is the source of the common
term "paraxial optics," meaning that the ray remains near the optical axis. In cases where
the ray angle $\theta \cong 0$, we can approximate trigonometric functions by the first terms in their
power-series expansions (the Taylor series):
\begin{align*}
f[x] &= \frac{(x - x_0)^0}{0!} f[x_0] + \frac{(x - x_0)^1}{1!} \left.\frac{df}{dx}\right|_{x = x_0} + \frac{(x - x_0)^2}{2!} \left.\frac{d^2 f}{dx^2}\right|_{x = x_0} + \cdots + \frac{1}{n!} \left.\frac{d^n f}{dx^n}\right|_{x = x_0} (x - x_0)^n + \cdots \\
&= \sum_{n=0}^{\infty} \frac{(x - x_0)^n}{n!} f^{(n)}[x_0]
\end{align*}
If the base value and the derivatives are evaluated at the origin, we have a Maclaurin series:
\[ f[x] = \sum_{n=0}^{\infty} \frac{1}{n!} f^{(n)}[0]\, x^n \]
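The Maclaurin series for the sine is:
\begin{align*}
\sin[\theta] &= \sum_{n=0}^{\infty} \frac{1}{n!} \left.\left(\frac{d^n}{d\theta^n} \sin[\theta]\right)\right|_{\theta = 0} \theta^n \\
&= \frac{1}{0!} \sin[0]\, \theta^0 + \frac{1}{1!} (+\cos[0])\, \theta^1 + \frac{1}{2!} (-\sin[0])\, \theta^2 + \frac{1}{3!} (-\cos[0])\, \theta^3 + \frac{1}{4!} (+\sin[0])\, \theta^4 + \cdots \\
&= 0 + \theta + 0 - \frac{\theta^3}{3!} + 0 + \frac{\theta^5}{5!} - \cdots \\
&= \theta - \frac{\theta^3}{3!} + \frac{\theta^5}{5!} - \cdots \\
&= \theta - \frac{\theta^3}{6} + \frac{\theta^5}{120} - \cdots
\end{align*}
Note that only odd powers of $\theta$ are present in the series for $\sin[\theta]$, because the sine is an odd
(antisymmetric) function that satisfies the condition $\sin[-\theta] = -\sin[+\theta]$.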
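The convergence of the sine series is easy to observe by summing its first few nonzero terms. The following is a small sketch; the function name and the test angle of 0.5 radian are assumptions made for illustration.

```python
import math

def sin_maclaurin(theta, n_terms):
    """Sum the first n_terms nonzero terms of the Maclaurin series of the sine:
    theta - theta^3/3! + theta^5/5! - ...
    """
    total = 0.0
    for k in range(n_terms):
        power = 2 * k + 1  # only odd powers appear because the sine is odd
        total += (-1) ** k * theta ** power / math.factorial(power)
    return total

theta = 0.5  # radians (hypothetical test angle)
for n_terms in (1, 2, 3):
    approx = sin_maclaurin(theta, n_terms)
    print(f"{n_terms} term(s): {approx:.8f}, error = {approx - math.sin(theta):+.2e}")
```

One term gives the first-order (paraxial) value and two terms give the third-order value used later in the notes.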
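The corresponding series for the even (or symmetric) cosine includes only even powers of $\theta$:
\begin{align*}
\cos[\theta] &= 1 - \frac{\theta^2}{2!} + \frac{\theta^4}{4!} - \cdots = \sum_{n=0}^{\infty} (-1)^n \frac{\theta^{2n}}{(2n)!} \\
&\Longrightarrow \lim_{\theta \to 0} \{\cos[\theta]\} = 1 \\
&\Longrightarrow \cos[\theta] \cong 1 - \frac{\theta^2}{2}
\end{align*}
So the approximation of the cosine with two terms is the difference of a constant and a parabola.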
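The series for the (odd, antisymmetric) tangent is less commonly known and includes only the
odd powers of $\theta$:
\[ \tan[\theta] = \theta + \frac{\theta^3}{3} + \frac{2\theta^5}{15} + \cdots = \sum_{n=1}^{\infty} (-1)^{n-1} \frac{2^{2n} \left(2^{2n} - 1\right)}{(2n)!} B_{2n}\, \theta^{2n-1} \Longrightarrow \lim_{\theta \to 0} \{\tan[\theta]\} = \theta \]
where $B_n$ is the $n^{th}$ Bernoulli number ($B_2 = \frac{1}{6}$, $B_4 = -\frac{1}{30}$, $B_6 = \frac{1}{42}$). The first-, third-, and
fifth-order series approximations for the tangent are:
\begin{align*}
\tan[\theta] &\cong \theta \quad \text{for } \frac{\pi}{2} > |\theta| \gtrsim 0 \\
\tan[\theta] &\cong \theta + \frac{\theta^3}{3} \\
\tan[\theta] &\cong \theta + \frac{\theta^3}{3} + \frac{2}{15} \theta^5
\end{align*}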
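The Bernoulli-number form of the tangent series can be checked with exact rational arithmetic. This sketch assumes the convention in which $B_1 = -\frac{1}{2}$ and $B_2 = \frac{1}{6}$, and takes the coefficient of $\theta^{2n-1}$ to be $(-1)^{n-1} 2^{2n}(2^{2n}-1) B_{2n} / (2n)!$; the function names are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial

def bernoulli_numbers(m_max):
    """Return [B_0, ..., B_m_max] using the standard recurrence
    B_m = -1/(m+1) * sum_{k<m} C(m+1, k) B_k  (convention B_1 = -1/2)."""
    b = [Fraction(1)]
    for m in range(1, m_max + 1):
        s = sum(comb(m + 1, k) * b[k] for k in range(m))
        b.append(-s / (m + 1))
    return b

def tan_coefficient(n, b):
    """Coefficient of theta^(2n-1) in the Maclaurin series of the tangent."""
    numerator = (-1) ** (n - 1) * 2 ** (2 * n) * (2 ** (2 * n) - 1)
    return Fraction(numerator, factorial(2 * n)) * b[2 * n]

b = bernoulli_numbers(6)
coeffs = [tan_coefficient(n, b) for n in (1, 2, 3)]
print(coeffs)  # coefficients of theta, theta^3, theta^5
```

The three coefficients reproduce the terms $\theta$, $\theta^3/3$, and $2\theta^5/15$ quoted in the text.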
The validity of these approximations is perhaps more obvious from the graphs, where we can see
that $\sin[\theta] \lesssim \theta$ and $\tan[\theta] \gtrsim \theta$ for small positive values of $\theta$.
Comparison of $\theta$ (black), $\sin[\theta]$ (red), and $\tan[\theta]$ (blue) for $0 \le \theta \le +0.5$ radians, showing that
$\sin[\theta] \lesssim \theta$ and $\tan[\theta] \gtrsim \theta$ over this domain.
The corresponding first-order approximation to the cosine is the unit constant:
\[ \lim_{\theta \to 0} \{\cos[\theta]\} = 1 \]
$\cos[\theta]$ (red) compared to its first-order approximation, the unit constant (black), for
$0 \le \theta \le 0.2$ radians, showing that the two are very similar for small values of $\theta$.
The advantage of the first-order approximation is that evaluation of the ray heights and angles
becomes simple because of the proportionality.
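To get a feeling for how small "small" must be, one can search for the largest angle at which the first-order replacement of $\sin[\theta]$ by $\theta$ stays within a chosen relative error. The sketch below uses a simple bisection; the function names and the 1% tolerance are assumptions for illustration only.

```python
import math

def sin_relative_error(theta):
    """Relative error of the first-order approximation sin[theta] ~ theta."""
    return (theta - math.sin(theta)) / math.sin(theta)

def max_paraxial_angle(tolerance, upper=math.pi / 2):
    """Largest angle in (0, upper) whose relative error is below tolerance.

    Uses bisection; the relative error grows monotonically on this interval.
    """
    lo, hi = 0.0, upper
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if sin_relative_error(mid) < tolerance:
            lo = mid
        else:
            hi = mid
    return lo

theta_max = max_paraxial_angle(0.01)  # hypothetical 1% tolerance
print(f"1% limit: {theta_max:.4f} rad = {math.degrees(theta_max):.2f} degrees")
```

With a 1% tolerance the limit comes out near a quarter of a radian, consistent with the leading error term $\theta^2/6$.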
2.3 Third-Order Optics

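It likely is obvious from the definition of first-order optics that "third-order optics" includes the
second term in the expansions:
\begin{align*}
\sin[\theta] &\cong \theta - \frac{\theta^3}{3!} = \theta - \frac{\theta^3}{6} \\
\tan[\theta] &\cong \theta + \frac{\theta^3}{3} \\
\cos[\theta] &\cong 1 - \frac{\theta^2}{2!} = 1 - \frac{\theta^2}{2}
\end{align*}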
Comparison of the third-order approximations of $\sin[\theta]$ (red) and $\tan[\theta]$ (blue) to the linear term $\theta$
(black) for $0 \le \theta \le 0.5$ radians.
Note that the third-order approximation for the cosine is a biased parabola:
$\cos[\theta]$ (black) and its third-order approximation $1 - \frac{\theta^2}{2}$ (red) for $0 \le \theta \le 0.5$ radians.
The results for ray angles using third-order optics will differ from those of first-order optics; these
differences lead to image aberrations.
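The size of the difference between the first-order and third-order values can be tabulated directly. This is a minimal sketch, with hypothetical function names and a test angle of 0.5 radian, that compares both approximations with the exact trigonometric values.

```python
import math

def first_order(theta):
    """First-order (paraxial) approximations of sin, tan, and cos."""
    return {"sin": theta, "tan": theta, "cos": 1.0}

def third_order(theta):
    """Third-order approximations: the next term of each series is included."""
    return {
        "sin": theta - theta ** 3 / 6,
        "tan": theta + theta ** 3 / 3,
        "cos": 1.0 - theta ** 2 / 2,
    }

def exact(theta):
    return {"sin": math.sin(theta), "tan": math.tan(theta), "cos": math.cos(theta)}

theta = 0.5  # radians (hypothetical test angle)
errors = {}
for name in ("sin", "tan", "cos"):
    err1 = abs(first_order(theta)[name] - exact(theta)[name])
    err3 = abs(third_order(theta)[name] - exact(theta)[name])
    errors[name] = (err1, err3)
    print(f"{name}: first-order error = {err1:.5f}, third-order error = {err3:.5f}")
```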
2.3.1 Higher-Order Approximations
We clearly can add additional terms to the power series that will increase the accuracy of any
calculations at the cost of significantly more complexity.
2.4 Notations and Sign Conventions

One of the simplest and most difficult aspects of ray optics is the set of conventions to be adopted for
all of the quantities to be measured. As in many aspects of optics, there are competing choices for
conventions that have their own distinct advantages, but that lead to different equations for image
locations, etc. We are going to use the "directed distance" convention, where distances are positive
if measured from left to right. The problem becomes remembering which are the points measured
"from" and "to," respectively. The figure shows sign conventions for the different quantities. Note
that in all cases, light travels from left to right in all media with positive refractive index (n > 0), so
the distances are positive if measured in the same direction of light travel and negative if measured
in the other direction.
Sign conventions for distances, heights, angles, and curvatures. The distance is positive if measured
from left to right; the height is positive if the endpoint is above the axis; the angle from the axis or
from a normal is positive if measured in the counterclockwise direction; and the
curvature is positive if its center is to the right of the vertex (intersection of the surface and the
optical axis).
Now consider the example in the figure where an optical system acts on a red object (the
upright red arrow) located at the object point labeled by O to produce an image at O′. The
horizontal black line is the line of symmetry of the optical system and is called the optical axis.
Sign conventions for a specific case: the object height at O is positive, while the image height at O′
is negative. The angle of the (blue) ray from the base of the object to the (green) first surface is
positive. The radius of curvature R of the first surface is positive.
The front and rear surfaces of the optical system are shown in green; their intersections with the
optical axis are the vertices of the system. The object space includes all features to the left of the
vertex V that is closer to the object, so V is the object-space vertex of the imaging system. Similarly,
the image space includes all features to the right of the vertex V′ that is closer to the image O′,
so V′ is the image-space vertex. The ray shown in blue from the object O to the green optical
surface makes an angle measured from the optical axis to the ray; since this angle is measured
counterclockwise, it is positive. The angle of the image-space ray from V′ to O′, measured from
the axis, is clockwise and therefore negative.
The front surface of the optical system has a radius of curvature R that is measured from the
vertex to the center of curvature, i.e., $R = \overline{VC}$, where the overscored pair of letters denotes the
distance from the first feature to the second. In this case, the distance from V to C is measured
from left to right, so $\overline{VC} \equiv R > 0$. In the same manner, the distance from the rear vertex V′ to
its center of curvature C′ is measured from right to left, so $R' \equiv \overline{V'C'} < 0$; R′ is negative in this
example.
Two other features are shown in the figure that we have not yet described, one each in object
and image space. F and F′ are the object-space and image-space focal points, respectively. They
are endpoints of the object-space and image-space focal lengths; the other endpoints are either
the vertices (if the lenses are thin) or the principal points (which we shall label as H and H′,
respectively). That discussion will have to wait until later.
We will often need to propagate a light ray through an optical system consisting of
a set of different thin lenses or a set of surfaces separated by different media. The cascade of
calculations requires distances measured from the object to the lens or front surface and from the lens
or back surface to the image. The need to express multiple distances will be addressed by both
subscripts and primed notation, depending on context, where the unprimed notation will refer
to the distance before the lens or surface and the primed notation to that after. When multiple
surfaces are needed, the first will be denoted by the subscript 1, the next by 2, etc.
Notation can also be a problem. The two different lower-case Greek letters for phi (straight $\phi$
and cursive $\varphi$) will be used in different ways: $\phi$ represents the power of a lens or surface and is
measured in reciprocal length, most commonly reciprocal meters ($\mathrm{m}^{-1}$), a unit named the diopter.
The cursive phi ($\varphi$) will be used to represent an angle, and therefore is dimensionless. The cursive
letter $f$ is used to represent a function, e.g., $f[x, y, t]$, whereas the straight letter $\mathrm{f}$ will be used
to denote the focal length with dimensions of length. This means that:
\[ \phi = \frac{1}{\mathrm{f}} \]
2.4.1 Nature of Objects and Images:

1. Real Object: Rays incident on the lens are diverging from the source; the object distance is
positive
2. Virtual Object: Rays into the lens are converging toward the source located behind the
lens; object distance is negative
3. Real Image: Rays emerging from the lens are converging toward the image; image distance is
positive
4. Virtual Image: Rays emerging from the lens are diverging, so that the image is behind the
lens and the image distance is negative
2.5 Human Eye

Since this course considers optics of imaging systems, and since the images generated by many
optical systems are viewed by human eyes, we need to at least introduce the optics of the eye; we
will consider it in more detail when we trace rays through the standard eye model later.
The optics of the human eye include the curved surface (the cornea, which exhibits most of
the power of the system) and a deformable lens. The system is intended to form an image on the
retina, which is a fixed distance from the cornea. The lens is deformed by action of ciliary muscles
to change the plane that is viewed in focus. When the muscles are relaxed, the lens is flatter,
i.e., the radii of curvature of the surfaces are larger. To view an object close up, the focal length
of the eye lens must be shortened by making the lens shape more spherical. This is accomplished by
tightening the ciliary muscles (which is the reason why your eyes get tired after an extended time
of viewing objects up close).
If the retina is located too far from the cornea, so that the image is in front of the retina
when the muscles are relaxed, then the eye sees a blurry image of distant objects, but nearby
objects may be well focused. This is the condition of nearsightedness or myopia. If the retina
is too close to the cornea, the image is focused behind it and the eye sees distant objects more
sharply than nearby ones (hyperopia or farsightedness).
2.6 Principle of Least Time

The mathematical model of ray optics is based on a principle stated by Fermat. Long before that,
Hero of Alexandria hypothesized a model of light propagation that could be called the principle of
least distance:
A ray of light traveling between two arbitrary points traverses the shortest possible path in space. (Hero of Alexandria)
This statement applies to reflection and transmission through homogeneous media (i.e., the medium
is characterized by a single index of refraction). However, Hero's principle is not valid if the object
and observation points are located in different media (as is the normal situation for refraction) or if
multiple media are present between the points.
In 1657, Pierre Fermat modified Hero's statement to formulate the principle of least time (which
actually works):
A light ray travels the path that requires the least time to traverse. (Fermat)
The laws of reflection and refraction may be easily derived from Fermat's principle. A moving ray
(or car, bullet, or baseball) traveling a distance s at a velocity v requires t seconds:
\[ t = \frac{s}{v} \]
If the ray travels at different velocities for different increments of distance, the total travel time is
the summation over the different distances and different velocities:
\[ t = \sum_{m=1}^{M} \frac{s_m}{v_m} \]
If we define the velocity of a light ray in a medium of index n to be $v = \frac{c}{n}$, then:
\[ t = \sum_{m=1}^{M} \frac{s_m}{c / n_m} = \frac{1}{c} \sum_{m=1}^{M} (n_m s_m) \equiv \frac{\mathrm{OPL}}{c} \]
where the optical path length (OPL) is defined:
\[ \mathrm{OPL} \equiv \sum_{m=1}^{M} (n_m s_m) \]
For a single medium, the optical path length is:
\[ \mathrm{OPL} = n s \]
Note that the optical path length is at least as long as the physical path length because light travels
more slowly in the medium than in vacuum ($n_m \geq 1$); it is the distance that a ray would travel in
vacuum in the same time that it takes to travel the physical distance s in the medium. The principle
of least time may be restated as "a light ray requires the least time to traverse the path with the
shortest optical path length," or:
A ray traverses the route with the shortest optical path length.
This suggests a philosophical question: "How does the light ray know which path to take before
it leaves the source?" I leave it to you to ponder this question, but will say that the difficulty of
formulating an answer suggests the limitation of the (simple) ray model for light propagation.
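The optical path length and the travel time defined above can be evaluated with a few lines of Python. This is only a minimal sketch: the function names and the three-segment example path (air, glass, air) are hypothetical illustrations, not values from these notes.

```python
C_VACUUM = 299_792_458.0  # speed of light in vacuum, in m/s

def optical_path_length(segments):
    """Sum n_m * s_m over (refractive index, physical length) pairs."""
    return sum(n * s for n, s in segments)

def travel_time(segments):
    """Travel time t = OPL / c for the same list of segments."""
    return optical_path_length(segments) / C_VACUUM

# Hypothetical path: 10 cm of air, 1 cm of glass (n = 1.5), 5 cm of air.
path = [(1.0, 0.10), (1.5, 0.01), (1.0, 0.05)]
physical_length = sum(s for _, s in path)

print("physical length [m]:", physical_length)
print("optical path length [m]:", optical_path_length(path))
print("travel time [s]:", travel_time(path))
```

Because every index in the list is at least 1, the optical path length returned here cannot be shorter than the physical length.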
2.7 Fermat's Principle for Reflection
Now consider the path traveled upon reflection that minimizes an easily evaluated optical path
length:
Schematic for determining the angle of reflection using Fermat's principle.
As drawn, the angle $\theta_1$ is positive (measured from the normal to the ray) and $\theta_2$ is negative (from the
normal to the ray). The ray travels in the same medium of index n both before and after reflection.
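The components of the optical path length are:
\[ \overline{so} = \sqrt{h^2 + x^2} \]
\[ \overline{op} = \sqrt{b^2 + (a - x)^2} \]
And the expression for the total optical path length is:
\[ \mathrm{OPL} = n \left( \overline{so} + \overline{op} \right) = n \left( \sqrt{h^2 + x^2} + \sqrt{b^2 + (a - x)^2} \right) = \mathrm{OPL}[x] \quad \text{(a function of } x \text{)} \]
By Fermat's principle, the path traveled is the one that minimizes the optical path length, so the
position of o along the x-axis is found by setting the derivative of the OPL with respect to x to zero:
\[ \frac{d\,\mathrm{OPL}}{dx} = n \frac{d}{dx} \left( \sqrt{h^2 + x^2} + \sqrt{b^2 + (a - x)^2} \right) = 0 \]
\[ = n \left( \frac{2x}{2\sqrt{h^2 + x^2}} - \frac{2(a - x)}{2\sqrt{b^2 + (a - x)^2}} \right) \]
\[ \implies \frac{x}{\sqrt{h^2 + x^2}} - \frac{a - x}{\sqrt{b^2 + (a - x)^2}} = 0 \]
\[ \implies \frac{x}{\sqrt{h^2 + x^2}} = \frac{a - x}{\sqrt{b^2 + (a - x)^2}} \]
From the drawing, note that:
\[ \sin[\theta_1] = \frac{x}{\sqrt{h^2 + x^2}} \]
\[ \sin[-\theta_2] = \frac{a - x}{\sqrt{b^2 + (a - x)^2}} \]
\[ \implies \sin[\theta_1] = \sin[-\theta_2] \]
\[ \implies \theta_2 = -\theta_1 \]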
In words, the magnitudes of the angles of incidence and reflection are equal (as already derived
by evaluating Maxwell's equations at the boundary). The negative sign is necessary because of
the sign convention for the angle: the angle is measured from the normal and increases in the
counterclockwise direction. Because reflection reverses the propagation direction of the ray, the
same result may also be explained by assuming that the index of refraction for the image space is
the negative of that for the object space.
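The minimization can also be checked numerically. The sketch below replaces the analytic derivative with a simple golden-section search (a stand-in technique, not the method used in these notes) and confirms that the minimizing position gives equal sines for the two angles. The geometry values h, b, and a and the function names are hypothetical.

```python
import math

def reflection_opl(x, h, b, a, n=1.0):
    """OPL = n * (sqrt(h^2 + x^2) + sqrt(b^2 + (a - x)^2))."""
    return n * (math.hypot(h, x) + math.hypot(b, a - x))

def golden_section_minimize(func, lo, hi, tol=1e-9):
    """Locate the minimum of a unimodal function on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0  # about 0.618
    while hi - lo > tol:
        c = hi - inv_phi * (hi - lo)
        d = lo + inv_phi * (hi - lo)
        if func(c) < func(d):
            hi = d  # the minimum lies in [lo, d]
        else:
            lo = c  # the minimum lies in [c, hi]
    return 0.5 * (lo + hi)

# Hypothetical geometry: source height h, observer height b, separation a.
h, b, a = 1.0, 2.0, 3.0
x_min = golden_section_minimize(lambda x: reflection_opl(x, h, b, a), 0.0, a)

sin_theta1 = x_min / math.hypot(h, x_min)
sin_minus_theta2 = (a - x_min) / math.hypot(b, a - x_min)
print("x at minimum OPL:", x_min)
print("sin[theta1]:", sin_theta1, " sin[-theta2]:", sin_minus_theta2)
```

For this geometry the equal-angle condition gives x = a h / (h + b) = 1, which the search reproduces to within its numerical tolerance.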
Snell's law for reflection at interface.
Note that Snell's law for reflection does not include either refractive index n, which means that
the outgoing ray angle is not affected by the different refractive indices of the two media, so the
image location and quality are not influenced by the indices. The amount of the ray that is reflected
IS affected by the two refractive indices via the Fresnel equations, which require the principles of
wave optics for explanation. At this point, we will just introduce the relationship without proof. If
light is incident normally to the interface between two media ($\theta = 0$) with refractive indices $n_1$ and
$n_2$, the reflectivity of the surface obeys:
\[ R = \left( \frac{n_1 - n_2}{n_1 + n_2} \right)^2 \quad \text{if } \theta = 0 \]
If the first medium is air with $n \cong 1$ and the second is glass with $n \cong 1.5$, the reflectivity is:
\[ R = \left( \frac{1 - 1.5}{1 + 1.5} \right)^2 = 0.04 \]
Note that the reflectivity is the same if the first medium is glass and the second is air:
\[ R = \left( \frac{1.5 - 1}{1.5 + 1} \right)^2 = 0.04 \]
The reflectivity at different incident angles obeys more complicated expressions, in part because the
light must be decomposed into different polarizations depending on the direction of oscillation of
the electric field.
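The normal-incidence reflectivity is easy to evaluate for other pairs of media. The following minimal sketch uses a hypothetical function name; the air and glass indices are the ones quoted above.

```python
def normal_incidence_reflectivity(n1, n2):
    """R = ((n1 - n2) / (n1 + n2))^2, valid only at normal incidence."""
    return ((n1 - n2) / (n1 + n2)) ** 2

r_air_to_glass = normal_incidence_reflectivity(1.0, 1.5)
r_glass_to_air = normal_incidence_reflectivity(1.5, 1.0)
print("air to glass:", r_air_to_glass)
print("glass to air:", r_glass_to_air)
```

Squaring the ratio removes the sign of the index difference, which is why the two directions give the same reflectivity.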
2.7.1 Plane Mirrors
Other than perhaps the pinhole, the simplest image-forming system is the plane mirror, which is
so familiar that it may seem hardly worth mentioning. Clearly its action obeys Snell's reflection
law that $\theta_2 = -\theta_1$, which means that the appearance of an image is reversed relative to the
object, i.e., the parity of the image is inverted. It also allows introduction of the concepts of object
space and image space, which will be used thenceforth and forevermore. The object space is the
locus of points where objects may exist, which is all points in front of the mirror (real objects)
and behind the mirror (virtual objects). A real object forms a virtual image behind the mirror,
and a virtual object forms a real image in front of the mirror. In other words, the object and
image spaces for reflection by a plane mirror both include the entire 3-D space.
Object and image space for a plane mirror. Rays diverging from a real object form a virtual image
behind the mirror, but rays converging to a virtual object behind the mirror form a real image
in front of the mirror.
2.8 Fermat's Principle for Refraction:
Schematic for refraction using Fermat's principle.
In this drawing, both $\theta_1$ and $\theta_2$ are positive (measured from the normal to the interface in the
counterclockwise direction). The optical path length is:
\[ \mathrm{OPL} = n_1 \overline{so} + n_2 \overline{op} = n_1 \sqrt{h^2 + x^2} + n_2 \sqrt{b^2 + (a - x)^2} \]
By Fermat's principle, the path traveled is the one for which the OPL is minimized, so we again set the
derivative of the OPL with respect to x to zero and identify trigonometric functions for the resulting ratios.
\[ \frac{d\,\mathrm{OPL}}{dx} = n_1 \frac{2x}{2\sqrt{h^2 + x^2}} - n_2 \frac{2(a - x)}{2\sqrt{b^2 + (a - x)^2}} = 0 \]
\[ \implies n_1 \frac{x}{\sqrt{h^2 + x^2}} - n_2 \frac{a - x}{\sqrt{b^2 + (a - x)^2}} = 0 \]
\[ \sin[\theta_1] = \frac{x}{\sqrt{h^2 + x^2}} \]
\[ \sin[\theta_2] = \frac{a - x}{\sqrt{b^2 + (a - x)^2}} \]
\[ \implies n_1 \sin[\theta_1] = n_2 \sin[\theta_2] \quad \text{(Snell's Law for refraction)} \]
Note that with this sign convention, Snell's law may be applied to reflection by setting the refractive
index of the second medium to be the negative of the first:
\[ n_1 \sin[\theta_1] = n_2 \sin[\theta_2] \]
\[ \implies n_1 \sin[\theta_1] = -n_1 \sin[\theta_2] \]
\[ \implies \sin[\theta_1] = -\sin[\theta_2] \]
\[ \implies \theta_2 = -\theta_1 \]
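The expression of Snell's law for refraction is general, but we can easily apply the first-order paraxial
approximation that $\sin[\theta] \cong \theta$ if the ray angles are small ($\theta \cong 0$):
\[ n_1 \sin[\theta_1] = n_2 \sin[\theta_2] \implies n_1 \theta_1 \cong n_2 \theta_2 \quad \text{in paraxial approximation} \]
\[ \implies \theta_2 \cong \frac{n_1}{n_2} \theta_1 \quad \text{in paraxial approximation} \]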
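To see how quickly the paraxial form departs from the exact law, the sketch below evaluates both for rays passing from air into glass. The function names and the sample angles are illustrative assumptions; angles are in radians.

```python
import math

def refract_exact(theta1, n1, n2):
    """Exact Snell's law: theta2 = asin(n1 * sin(theta1) / n2)."""
    s = n1 * math.sin(theta1) / n2
    if abs(s) > 1.0:
        raise ValueError("no refracted ray (total internal reflection)")
    return math.asin(s)

def refract_paraxial(theta1, n1, n2):
    """Paraxial Snell's law: theta2 = (n1 / n2) * theta1."""
    return n1 * theta1 / n2

n_air, n_glass = 1.0, 1.5
for theta1 in (0.05, 0.2, math.radians(30.0)):
    exact = refract_exact(theta1, n_air, n_glass)
    paraxial = refract_paraxial(theta1, n_air, n_glass)
    print(f"theta1 = {theta1:.4f}: exact = {exact:.6f}, "
          f"paraxial = {paraxial:.6f}, difference = {paraxial - exact:.2e}")
```

The difference is negligible at the smallest angle and grows to nearly a hundredth of a radian at 30 degrees, which is the regime where third-order terms matter.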
2.8.1 Dispersion
Unlike the reflection law, Snell's law for refraction DOES include the refractive indices. This means
that the angle of refraction will change as the indices change, as with wavelength. All (or perhaps
I should say ALL) transparent materials exhibit a variation in refractive index with wavelength,
which is called dispersion. Note that the features of dispersion depend on the material (e.g., glass).
The full explanation of dispersion is beyond the scope of this course, so we will just describe its
effects.
In a transparent material over the range of visible wavelengths, the refractive index n DECREASES
with increasing $\lambda$. In the study of wave optics, this ensures that the phase velocity
for the average wave $v = \frac{\omega}{k}$ is larger than the group or modulation velocity $\frac{d\omega}{dk}$. Among other
things, this ensures that a signal transmitted as a modulation of a light wave cannot travel at a
speed faster than the velocity of light. A schematic dispersion for a hypothetical glass is shown in
the figure; note that the slope of the dispersion curve decreases with increasing $\lambda$; the curve flattens
out as $\lambda$ increases in the visible range.
Typical dispersion curve for glass at visible wavelengths, showing the decrease in n with increasing
$\lambda$ and the three spectral wavelengths specified by Fraunhofer and used to specify the refractivity,
mean dispersion, and partial dispersion of a material.
The refractive indices for several real glasses show an additional feature of dispersion curves:
the relationship between the amount of dispersion and the refractive index. Glasses with lower
refractive index ($n \cong 1.5$, the so-called crown glasses) have a flatter graph and therefore less
dispersion. In other words, $n_{blue}$ is larger than $n_{red}$, but not much larger, so that the smaller the
refractive index, the smaller the dispersion. Flint glasses have larger values of the refractive index
($n \cong 1.7$) and larger variations across the visible spectrum:
\[ (n_{blue} - n_{red})_{flint} > (n_{blue} - n_{red})_{crown} \]
Dispersion curves for various optical glasses as a function of wavelength in the visible region of
the spectrum (measured in Angstroms, where 1 Å = 0.1 nm = $10^{-10}$ m, 4000 Å = 400 nm). The rapid
rise in the index at wavelengths in the ultraviolet region is due to the atomic resonances there.
If we use the paraxial approximation for rays in air entering a glass with refractive index $n_2$, the
outgoing ray angle $\theta_2$ is:
\[ \theta_2 \cong \frac{\theta_1}{n_2} \quad \text{in paraxial approximation} \]
Dispersion ensures that $(n_2)_{blue} > (n_2)_{red}$, which means that $(\theta_2)_{blue} < (\theta_2)_{red}$ and the deviation
angle $\delta$ satisfies $\delta_{blue} > \delta_{red}$.
Since the outgoing ray angles are different for different colors, images will be formed at different
distances in different colors. This is the source of chromatic aberration in imaging systems.
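A short numerical illustration of this ordering follows. It is a sketch only: the two indices are the F-line (blue) and C-line (red) values that these notes tabulate for a crown glass, while the incident angle, the function names, and the definition of the deviation as the difference between the incident and refracted angles are assumptions made for the example.

```python
import math

N_BLUE = 1.52225  # crown glass, F line (486.13 nm)
N_RED = 1.51418   # crown glass, C line (656.28 nm)

def refraction_angle(theta1, n2, n1=1.0):
    """Exact Snell's law for a ray passing from index n1 into index n2."""
    return math.asin(n1 * math.sin(theta1) / n2)

def deviation_angle(theta1, n2, n1=1.0):
    """Change in ray direction at the surface (assumed definition)."""
    return theta1 - refraction_angle(theta1, n2, n1)

theta1 = 0.2  # incident angle in radians (arbitrary choice)
for label, n2 in (("blue", N_BLUE), ("red", N_RED)):
    print(f"{label}: theta2 = {refraction_angle(theta1, n2):.6f}, "
          f"deviation = {deviation_angle(theta1, n2):.6f}")
```

The blue ray emerges closer to the normal than the red ray, so its deviation from the original direction is the larger of the two.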
Effect of dispersion on refraction: since the refractive index for red light is smaller, the angle of
refraction measured from the normal is larger. Put another way, this means that the deviation
angle due to refraction is smaller for red light than for blue light.
In imaging, we often think of dispersion in refractive elements as an unfortunate bug in the
system, but you probably also know that it can be a very useful feature; it provides a tool for
spreading white light into its constituent spectrum in a dispersing prism.
Dispersing prism with the two refractions, showing that the angle of deviation from the original
path is larger for blue light than for red light.
From the figure, note that the angle of deviation $\delta$ of the ray from the original path is larger for blue
light due to the dispersion of light:
\[ \delta_{blue} > \delta_{red} \quad \text{for prism} \]
The relationship between the wavelength and the deviation angle is complicated for refraction.
As a side comment, note that light may also be dispersed into its spectrum by the phenomenon
of diffraction in gratings. However, the relationship between the wavelength and the deviation angle
$\delta$ for diffraction is very simple: the angle of deviation is proportional to the wavelength (for small
angles):
\[ \implies \delta_{blue} < \delta_{red} \quad \text{for grating} \]
This means that it is easier to construct an accurate spectrometer based on diffraction than based
on refractive dispersion.
2.8.2 Refractive Constants for Glasses

The refractive properties of glass are approximately specified by the refractivity and the measured
differences in refractive index at the three Fraunhofer wavelengths F, D, and C:

Quantity            Definition                               Notes
Refractivity        $n_D - 1$                                $1.75 \geq n_D \geq 1.5$
Mean Dispersion     $n_F - n_C > 0$                          difference between blue and red indices
Partial Dispersion  $n_D - n_C > 0$                          difference between yellow and red indices
Abbe Number         $\nu \equiv \frac{n_D - 1}{n_F - n_C}$   ratio of refractivity and mean dispersion, $25 \leq \nu \leq 65$
(note that larger dispersions result in smaller Abbe numbers)

Glasses are specified by six-digit numbers abcdef, where $n_D = 1.abc$, to three decimal places,
and the Abbe number $\nu = de.f$. Note that larger values of the refractivity mean that the refractive
index is larger and thus so is the deviation angle in Snell's law. A larger Abbe number means that
the mean dispersion is smaller and thus there will be a smaller difference in the angles of refraction.
Such glasses with larger Abbe numbers and smaller indices and less dispersion are crown glasses,
while glasses with smaller Abbe numbers are flint glasses, which are denser. Examples of glass
specifications include borosilicate crown glass (BSC), which has a specification number of 517645, so
its refractive index in the D line is 1.517 and its Abbe number is $\nu = 64.5$. The specification number
for a common flint glass is 619364, so $n_D = 1.619$ (relatively large) and $\nu = 36.4$ (smallish). Now
consider the refractive indices in the three lines for two different glasses: crown (with a smaller n)
and flint:
Line   $\lambda$ [nm]   n for Crown   n for Flint
C      656.28           1.51418       1.69427
D      589.59           1.51666       1.70100
F      486.13           1.52225       1.71748
The glass specification numbers for the two glasses are evaluated to be:
For the crown glass:
refractivity: $n_D - 1 = 0.51666 \cong 0.517$
Abbe number: $\nu = \frac{1.51666 - 1}{1.52225 - 1.51418} \cong 64.0$
Glass number = 517640
For the flint glass:
refractivity: $n_D - 1 = 0.70100 \cong 0.701$
Abbe number: $\nu = \frac{1.70100 - 1}{1.71748 - 1.69427} \cong 30.2$
Glass number = 701302
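The same arithmetic can be scripted. The sketch below assumes the six-digit convention described above (three digits of refractivity followed by the Abbe number to one decimal place); the function names are hypothetical, and the indices are the tabulated crown and flint values.

```python
def abbe_number(n_c, n_d, n_f):
    """Ratio of refractivity (n_D - 1) to mean dispersion (n_F - n_C)."""
    return (n_d - 1.0) / (n_f - n_c)

def glass_number(n_c, n_d, n_f):
    """Six-digit code: 1000 * (n_D - 1), then 10 * Abbe number, both rounded."""
    refractivity_digits = round((n_d - 1.0) * 1000)
    abbe_digits = round(abbe_number(n_c, n_d, n_f) * 10)
    return f"{refractivity_digits:03d}{abbe_digits:03d}"

crown = {"n_c": 1.51418, "n_d": 1.51666, "n_f": 1.52225}
flint = {"n_c": 1.69427, "n_d": 1.70100, "n_f": 1.71748}

print("crown:", glass_number(**crown), "Abbe number:", round(abbe_number(**crown), 1))
print("flint:", glass_number(**flint), "Abbe number:", round(abbe_number(**flint), 1))
```

Rounding to the nearest digit is an assumption of this sketch; it reproduces the two glass numbers worked out by hand above.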
Dispersion curve of a material from very short to very long wavelengths. The index increases with
increasing $\lambda$ as additional resonances are passed, but the index of refraction decreases with
increasing wavelength in the visible wavelengths (bold face).
The dispersion curves for optically transparent materials, such as glass and air, exhibit some very
similar features, though the details may be significantly different. Starting at very short wavelengths
($\lambda \cong 0$), the refractive index n is approximately unity. In words, the wavelength is so short (and
the oscillation frequency so large) that the energy per photon is very large, so that photons pass
through the material without interacting with the atoms; the material appears to be vacuum. For
longer (but still very short) wavelengths (hard X rays), the refractive index actually is slightly
less than unity, which means that X rays incident on a prism are refracted away from the prism's
base, rather than towards the base in the manner of visible light. This is the reason why X rays can
be totally reflected at grazing incidence, which is the focusing mechanism used in X-ray telescopes
(such as Chandra). As the wavelength of the incident light increases further, though still within the
X-ray region, the radiation incident on the material is heavily absorbed; this is the K-absorption
edge where the energy of the incident X rays is just sufficient to ionize an electron in the innermost
atomic shell, the K shell. For example, the wavelength of this absorption is $\lambda_K \cong 0.67$ nm
for silicon. Other absorptions occur at yet longer wavelengths (smaller incident photon energies),
where electrons in the L and M shells, etc., of the atom are ionized. The spectrum of a material
with a large atomic number (and thus several filled electron shells) will exhibit several such resonant
absorptions.
Ionization of a K-shell electron by an incoming X ray of sufficient energy. This is the reason for
the large absorptions of hard X rays by materials. Lower-energy (longer-wavelength) X rays will
ionize electrons in the L or M shells, thus producing other absorption edges.
As the wavelength of the incident radiation increases further, into the far ultraviolet region of
the spectrum, the real part of the refractive index decreases to a value much less than unity within
a wide band of anomalous dispersion. The fact that $n < 1$ in this region may be confusing because
it seems that the velocity of light exceeds c, but these waves do not propagate in the material due
to the strong absorption (large absorption coefficient). The wavelength of maximum absorption corresponds to
the largest of the several natural oscillation frequencies of bound electrons in the material.
In the visible region of the spectrum, the dispersion curve exhibits the familiar decrease in n
with $\lambda$ that was shown above. For example, the index of air is $n = 1.000279$ at $\lambda = 486.1$ nm
(Fraunhofer's F line) and $n = 1.000276$ at $\lambda = 656.3$ nm (C line). The corresponding values for
diamond are $n_F = 2.4354$ and $n_C = 2.4100$. The closer the nearest ultraviolet absorption to the
visible spectrum, the steeper will be the slope $\frac{dn}{d\lambda}$ in the visible region and thus the larger the visible
dispersion (defined below).
The dispersion curve descends yet more steeply somewhere in the near infrared region and then
rises due to anomalous dispersion in the vicinity of an infrared absorption band (labeled $\lambda_2$ on
the graph). For quartz (crystalline SiO2), the center of this band is located at $\lambda = 8.5$ μm, but the
absorption already is quite strong for wavelengths as short as $\lambda = 4$ μm. Most optical materials have
several such infrared absorption bands and the base level of the index of refraction is larger after
each such band. This behavior is confirmed by far-infrared measurements of the refractive index of
quartz (crystalline SiO2), which varies over the interval $2.40 \geq n \geq 2.14$ for 51 μm $\leq \lambda \leq$ 63 μm. The
large values of n ensure that the focal length of a convex quartz lens is much shorter at far-infrared
wavelengths than at visible wavelengths.

As the wavelength is increased still further into the radio region of the spectrum after the last
absorption band, the refractive index decreases slowly due to normal dispersion from that last
absorption and approaches a limiting value of $\sqrt{\frac{\epsilon}{\epsilon_0}}$.
2.9 Image Formation in the Ray Model

We know that light rays are deviated at interfaces between media with different refractive indices.
The goal in this section is to use interfaces of specified shapes to collect the light and reshape
the wavefronts in a way that recreates images of the original sources.
2.9.1 Refraction at a Spherical Surface

Optical systems typically are used to form images of the source distribution by constructing optical
elements (lenses) made out of transparent media with different refractive indices to redirect the
electromagnetic radiation. Until rather recently, lenses were fabricated almost exclusively from
glass, which required the optical surfaces to be ground to the desired curvature and polished to
remove scratches, etc., from the grinding. Two pieces of glass are typically employed in the grinding
process: the optic and the tool. Water and a grinding compound composed of flecks of some
hard substance resembling sand are placed on the surface of one glass and the two surfaces rubbed
together with some force applied to the top optic.
The surface that is easiest to fabricate is a sphere, because the two surfaces will be in contact
at all translations. Glass is ground out of the center of the top piece and off of the edges of the
bottom piece, leaving a concave sphere on top and a convex sphere on the bottom. The grit
of the grinding compound is reduced gradually to leave a smoother surface. The surface is then
polished using very fine jeweler's rouge to produce smooth surfaces of optical quality. More
recently, optical elements have been fabricated from thin plates cemented over a hollowed-out grid
to lighten the weight. Also, plastics and other materials have been developed that may be cast to
produce optical surfaces of various shapes with minimal polishing.
Grinding optical surfaces: a slurry of water and grinding compound (e.g., carborundum) is placed
between two glass surfaces. The top glass is pushed down and moved around to grind glass from the
center region of the top piece. The resulting surfaces must be spherical because they are the only
curves that remain in contact at all locations.
Consider the action of a spherical surface of a medium with index n2 on an incident ray in a
medium of index n1 :
Refraction at a spherical surface between two media of refractive index n1 and n2 .
The point source is located at s and its distance to the vertex v is $\overline{sv} \equiv z_1 > 0$. The distance
from vertex v to the observation point p is $\overline{vp} \equiv z_2 > 0$. The physical distance traveled by a ray in
medium $n_1$ to the surface is $\overline{sa} \equiv \ell_1$ and that in medium $n_2$ is $\overline{ap} \equiv \ell_2$. The radius of curvature of
the surface is $\overline{vc} = \overline{ac} \equiv R > 0$ as drawn. For emphasis, we repeat that $z_1$, $z_2$, and R are all positive
in our convention. The ray intersects the surface at angle $\varphi$ (the position angle) measured from
the center of curvature c. The optical path length of the ray from s to p through a is
\[ \mathrm{OPL} = n_1 \ell_1 + n_2 \ell_2 = n_1 \overline{sa} + n_2 \overline{ap} \]
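The triangle $\triangle sac$ has sides R and $z_1 + R$ that enclose the angle $\varphi$ at c, with third side $\overline{sa} \equiv \ell_1$, while
$\triangle acp$ has sides R and $z_2 - R$ that enclose the angle $\pi - \varphi$ at c, with third side $\overline{ap} \equiv \ell_2$. The physical
lengths $\ell_1$ and $\ell_2$ may be evaluated from the other two sides and the included angle via the law of cosines:
\[ \triangle sac \implies \ell_1^2 = (z_1 + R)^2 + R^2 - 2R (z_1 + R) \cos[\varphi] \]
\[ \implies \ell_1 = \sqrt{(z_1 + R)^2 + R^2 - 2R (z_1 + R) \cos[\varphi]} \]
\[ \triangle acp \implies \ell_2^2 = (z_2 - R)^2 + R^2 - 2R (z_2 - R) \cos[\pi - \varphi] \]
\[ \implies \ell_2 = \sqrt{(z_2 - R)^2 + R^2 + 2R (z_2 - R) \cos[\varphi]} \]
\[ = \sqrt{(z_2 - R)^2 + R^2 - 2R (R - z_2) \cos[\varphi]} \]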
The corresponding optical path length is:
\[ \mathrm{OPL} = n_1 \ell_1 + n_2 \ell_2 \]
\[ = n_1 \sqrt{(z_1 + R)^2 + R^2 - 2R (R + z_1) \cos[\varphi]} + n_2 \sqrt{(z_2 - R)^2 + R^2 - 2R (R - z_2) \cos[\varphi]} \]
which is obviously a function of the position angle $\varphi$. We can now apply Fermat's principle to find
the angle $\varphi$ for which the OPL is a minimum:
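\[ \frac{d}{d\varphi} (\mathrm{OPL}) = 0 \]
\[ = \frac{n_1 \cdot 2R (R + z_1) \sin[\varphi]}{2\sqrt{(z_1 + R)^2 + R^2 - 2R (R + z_1) \cos[\varphi]}} + \frac{n_2 \cdot 2R (R - z_2) \sin[\varphi]}{2\sqrt{(z_2 - R)^2 + R^2 - 2R (R - z_2) \cos[\varphi]}} \]
\[ = R \sin[\varphi] \left( \frac{n_1 (R + z_1)}{\ell_1} + \frac{n_2 (R - z_2)}{\ell_2} \right) \]
which may be rearranged to:
\[ 0 = R \sin[\varphi] \left( \frac{n_1 (R + z_1)}{\ell_1} + \frac{n_2 (R - z_2)}{\ell_2} \right) \]
\[ \implies 0 = \frac{n_1 (R + z_1)}{\ell_1} + \frac{n_2 (R - z_2)}{\ell_2} \]
\[ \implies \frac{n_1 R}{\ell_1} + \frac{n_2 R}{\ell_2} = \frac{n_2 z_2}{\ell_2} - \frac{n_1 z_1}{\ell_1} \]
\[ \implies \frac{n_1}{\ell_1} + \frac{n_2}{\ell_2} = \frac{1}{R} \left( \frac{n_2 z_2}{\ell_2} - \frac{n_1 z_1}{\ell_1} \right) \]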
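This last relation between the physical path lengths $\ell_1$ and $\ell_2$ and the distances $z_1$ and $z_2$ is exact.
Now we use the expression for the physical path length $\ell_1$ to find its ratio relative to the axial
distance $z_1$ and use simple algebra to rearrange:
\[ \frac{\ell_1}{z_1} = \frac{\sqrt{(z_1 + R)^2 + R^2 - 2R (z_1 + R) \cos[\varphi]}}{z_1} \]
\[ = \left( \frac{(z_1 + R)^2 + R^2 - 2R (z_1 + R) \cos[\varphi]}{z_1^2} \right)^{\frac{1}{2}} \]
\[ = \left( \frac{z_1^2 + R^2 + 2R z_1 + R^2 - 2R^2 \cos[\varphi] - 2R z_1 \cos[\varphi]}{z_1^2} \right)^{\frac{1}{2}} \]
\[ = \left( 1 + \left( \frac{2R^2}{z_1^2} + \frac{2R}{z_1} \right) (1 - \cos[\varphi]) \right)^{\frac{1}{2}} \]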
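This relation also is exact, but may be approximated by applying a truncated series for $\cos[\varphi]$:
\[ \cos[\varphi] = 1 - \frac{\varphi^2}{2!} + \frac{\varphi^4}{4!} - \frac{\varphi^6}{6!} + \cdots \cong 1 \quad \text{if } \varphi \cong 0 \]
\[ \implies 1 - \cos[\varphi] = 1 - \left( 1 - \frac{\varphi^2}{2!} + \frac{\varphi^4}{4!} - \frac{\varphi^6}{6!} + \cdots \right) \]
\[ = \frac{\varphi^2}{2!} - \frac{\varphi^4}{4!} + \frac{\varphi^6}{6!} - \cdots \cong 0 \quad \text{if } \varphi \cong 0 \]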
This leads to the first-order approximation that the path length and axial length are approximately
equal:
\[ \frac{\ell_1}{z_1} \cong 1 \implies \ell_1 \cong z_1 \]
Similarly, we can show that:
\[ \ell_2 \cong z_2 \]
This paraxial or Gaussian approximation (also called first-order optics because it is based on only
the first-order term in the cosine series) is valid only for small ray angles measured from the optical
axis. In words, the optical path lengths of rays that travel along the optical axis and rays that travel
away from the axis (but still with $\varphi \cong 0$) are equal.
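The quality of this approximation can be checked directly from the law-of-cosines expression for the path length in object space. In the sketch below the function names and the sample values of the object distance, radius, and position angles are hypothetical choices made only for illustration.

```python
import math

def path_length_ratio(z1, radius, position_angle):
    """Exact ratio l1 / z1 from the law of cosines for triangle sac."""
    ell1 = math.sqrt((z1 + radius) ** 2 + radius ** 2
                     - 2.0 * radius * (z1 + radius) * math.cos(position_angle))
    return ell1 / z1

def path_length_ratio_rearranged(z1, radius, position_angle):
    """Same ratio written as sqrt(1 + (2R^2/z1^2 + 2R/z1) * (1 - cos))."""
    factor = 2.0 * radius ** 2 / z1 ** 2 + 2.0 * radius / z1
    return math.sqrt(1.0 + factor * (1.0 - math.cos(position_angle)))

z1, radius = 200.0, 50.0  # hypothetical distances in the same length unit
for angle in (0.0, 0.05, 0.2, 0.5):
    print(f"position angle = {angle:.2f}: l1/z1 = "
          f"{path_length_ratio(z1, radius, angle):.6f}")
```

The ratio equals 1 on the axis and stays within a fraction of a percent of 1 for small position angles, which is the content of the paraxial approximation.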
The simplified imaging equation has the form:
\[ \frac{1}{R} \left( \frac{n_2 z_2}{\ell_2} - \frac{n_1 z_1}{\ell_1} \right) \cong \frac{1}{R} (n_2 - n_1) \]
\[ \implies \frac{n_1}{z_1} + \frac{n_2}{z_2} = \frac{1}{R} (n_2 - n_1) \]
This is the paraxial imaging equation for a single surface; clearly it is an approximation to the true
equation, and also clearly it is similar to the imaging equation we have already considered.
Object at Infinite Distance

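Now consider some pairs of object and image distances $z_1$ and $z_2$. If the object is located at infinity
($z_1 = \infty$), then:
\[ \frac{n_1}{\infty} + \frac{n_2}{z_2} = \frac{n_2}{z_2} = \frac{1}{R} (n_2 - n_1) \]
\[ \implies z_2 = \frac{n_2 R}{n_2 - n_1} \equiv \mathrm{f}_2 \quad \text{the image-space focal length} \]
which is what we normally think of as being the focal length of the optic.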
Image at Infinite Distance

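If the image is located at $+\infty$, the object distance must be
\[ \frac{n_1}{z_1} = \frac{1}{R} (n_2 - n_1) \implies z_1 = \frac{n_1 R}{n_2 - n_1} \equiv \mathrm{f}_1 \quad \text{the object-space focal length} \]
\[ \frac{n_1}{\mathrm{f}_1} = \frac{1}{R} (n_2 - n_1) \]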
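Also note that:
\[ \frac{\mathrm{f}_1}{\mathrm{f}_2} = \frac{\left( \frac{n_1 R}{n_2 - n_1} \right)}{\left( \frac{n_2 R}{n_2 - n_1} \right)} = \frac{n_1}{n_2} \implies n_1 \mathrm{f}_2 = n_2 \mathrm{f}_1 \]
In words, the ratio of the focal lengths in the two spaces (object and image) is the ratio of the indices
of refraction in the two spaces.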
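The single-surface relations can be collected into a small calculator. This is a minimal sketch using the paraxial equation and the positive-distance conventions of this section; the function names and the example surface (air to glass, R = 50) are hypothetical.

```python
import math

def surface_image_distance(z1, n1, n2, radius):
    """Solve n1/z1 + n2/z2 = (n2 - n1)/R for the image distance z2."""
    n2_over_z2 = (n2 - n1) / radius - n1 / z1
    if n2_over_z2 == 0.0:
        return math.inf  # object at the object-space focal point
    return n2 / n2_over_z2

def object_space_focal_length(n1, n2, radius):
    """f1 = n1 * R / (n2 - n1)."""
    return n1 * radius / (n2 - n1)

def image_space_focal_length(n1, n2, radius):
    """f2 = n2 * R / (n2 - n1)."""
    return n2 * radius / (n2 - n1)

n1, n2, radius = 1.0, 1.5, 50.0  # hypothetical air-to-glass surface
f1 = object_space_focal_length(n1, n2, radius)
f2 = image_space_focal_length(n1, n2, radius)
print("f1 =", f1, " f2 =", f2, " f1/f2 =", f1 / f2)
print("image distance for z1 = 300:", surface_image_distance(300.0, n1, n2, radius))
print("image distance for distant object:", surface_image_distance(math.inf, n1, n2, radius))
```

The distant-object case returns the image-space focal length, and the ratio of the two focal lengths equals the ratio of the indices.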
Rule of Thumb: Estimating focal lengths of converging lenses: For a single positive
(converging) lens (i.e., not a lens system with multiple elements), it is easy to estimate the focal
length of a lens by finding the distance from the lens to the image of a distant bright object. The
requirement for "distant" is not critical; forming the image of a ceiling lamp on the floor or a tabletop
will give a useful estimate for a positive lens with a short focal length.
2.9.2 Imaging with Spherical Mirrors

The equation for a single refractive surface may be used to derive the focal length of a spherical
mirror by setting the refractive index of image space to the negative of that in object space:
\[ \frac{1}{\mathrm{f}} = \frac{1}{R} (-n_1 - n_1) = -\frac{2 n_1}{R} \]
In air, the equation for the focal length of a spherical mirror is:
\[ \mathrm{f} = -\frac{R}{2n} \to -\frac{R}{2} \quad \text{in air} \]
In words, the magnitude of the focal length of a spherical mirror is half of the radius of curvature; the
focal length is positive (converging) if $R < 0$ and negative (diverging) if $R > 0$, as shown.
Spherical mirrors: concave mirror with negative radius of curvature $R = \overline{VC} < 0$ makes outgoing
light rays converge and so $\mathrm{f} > 0$; convex mirror with positive radius of curvature makes rays diverge
and $\mathrm{f} < 0$.
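A two-line check of the mirror sign convention follows; it simply evaluates the relation that the focal length is minus one half of the radius of curvature in air, and the function name and radii are hypothetical.

```python
def mirror_focal_length(radius):
    """Focal length of a spherical mirror in air: f = -R / 2."""
    return -radius / 2.0

concave_f = mirror_focal_length(-200.0)  # concave mirror, R < 0
convex_f = mirror_focal_length(200.0)    # convex mirror, R > 0
print("concave mirror (R = -200): f =", concave_f)
print("convex mirror (R = +200): f =", convex_f)
```

The concave mirror returns a positive (converging) focal length and the convex mirror a negative (diverging) one, in agreement with the figure caption.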
2.10 First-Order Imaging with Thin Lenses

Normally we do not consider the case of an object in one medium with the image in another; usually
both object and image are in air and a lens (a device composed of material with different refractive
index n and curved surfaces) diverts the rays to form the image. We can derive the formula for the
object and image distances if we know the radii of the lens surfaces and the indices of refraction.
We merely cascade the formula for a single surface:
\[ \text{At first surface:} \quad \frac{n_1}{z_1} + \frac{n_2}{z_1'} = \frac{n_2 - n_1}{R_1} \]
\[ \text{At second surface:} \quad \frac{n_2}{z_2} + \frac{n_3}{z_2'} = \frac{n_3 - n_2}{R_2} \]
where $z_1$ is the (usually known) object distance, $z_1'$ is the image distance for rays refracted by the
first surface, $z_2$ is the object distance for the second surface, and $z_2'$ is the image distance for rays
exiting the second surface (and thus from the lens). For the common convex-convex lens, the
center of curvature of the first surface is to the right of the vertex, and thus the radius R1 of the
first surface is positive. Since the vertex is to the right of the center of curvature of the second
surface, then R2 < 0. If the lens is thin, then the ray encounters the second surface immediately
after refraction at the first surface, so the ray heights at the two surfaces are the same. The object
distance for the second surface is the negated image distance from the first: $z_2 = -z_1'$. Put another
way, the absolute value of the image distance for the front surface $|z_1'|$ is the same as the object
distance for the second surface $|z_2|$. If the lens is thick, then the object distance for the second
surface is different from the image distance for the first, and the ray heights will be different if the ray
angle is not zero. The thickness $t$ of the lens must satisfy the relationship:
\[
z_1' + z_2 = t \implies z_2 = t - z_1' \quad \text{for a thick lens}
\]
For a thin lens with $t = 0$:
\[
z_2 = 0 - z_1' \implies z_2 = -z_1' \quad \text{for a thin lens}
\]
The equations for the two surfaces may be added and the RHS may be rearranged to obtain a
single imaging equation for a lens with two surfaces:

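\[
\left(\frac{n_1}{z_1} + \frac{n_2}{z_1'}\right) + \left(\frac{n_2}{z_2} + \frac{n_3}{z_2'}\right)
= \frac{n_2 - n_1}{R_1} + \frac{n_3 - n_2}{R_2}
= \left(\frac{n_3}{R_2} - \frac{n_1}{R_1}\right) + n_2 \left(\frac{1}{R_1} - \frac{1}{R_2}\right)
\]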
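For a thin lens with $t = 0$, substitute $z_2 = -z_1'$ (equivalently $z_1' = -z_2$) to obtain:
\[
t = 0 \implies \frac{n_1}{z_1} + \frac{n_3}{z_2'} = \frac{n_1}{z_1} + \frac{n_2}{-z_2} + \frac{n_2}{z_2} + \frac{n_3}{z_2'}
\]
\[
\frac{n_1}{z_1} + \frac{n_3}{z_2'} = \left(\frac{n_3}{R_2} - \frac{n_1}{R_1}\right) + n_2 \left(\frac{1}{R_1} - \frac{1}{R_2}\right)
\]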
where the object is immersed in index n1 , the lens has index n2 , and the image is immersed in index
n3 .
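In the usual case of both object and image in air, so that $n_3 = n_1 = 1$, the equation simplifies to:
\[
\frac{1}{z_1} + \frac{1}{z_2'} = \left(\frac{1}{R_2} - \frac{1}{R_1}\right) + n_2 \left(\frac{1}{R_1} - \frac{1}{R_2}\right)
\]
\[
\frac{1}{z_1} + \frac{1}{z_2'} = (n_2 - 1) \left(\frac{1}{R_1} - \frac{1}{R_2}\right)
\]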
Note the similarity between this equation and that we inferred from the derivation of the image
plane using wave optics:
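\[
\frac{1}{z_1} + \frac{1}{z_2} = \frac{1}{f}
\]
where the distances $z_1$ and $z_2$ from the object to the lens and from the lens to the image are what we
have called $z_1$ and $z_2'$ here, and we identify:
\[
(n_2 - 1) \left(\frac{1}{R_1} - \frac{1}{R_2}\right) = \frac{1}{f} = \frac{1}{z_1} + \frac{1}{z_2'} \quad \text{(Lensmaker's Equation)}
\]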
which defines the focal length of the thin lens in terms of its physical parameters.
This is the so-called lensmaker's equation for thin lenses IN AIR; it determines the distance $z_2'$ to
the image for object distance $z_1$, the radii of curvature $R_1$ and $R_2$ of the spherical surfaces, and the
index of refraction $n_2$ of the glass. Note that the object distance $z_1$ and the image distance $z_2'$ both
appear with the same algebraic sign, which may be interpreted as demonstrating an "equivalence"
of the object and image because the propagation of light rays may be reversed to exchange the roles
of object and image. Corresponding object and image points (or object and image lines or object
and image planes) are called conjugate points (or lines or planes).
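In the more general case where the refractive index of object space is $n_1$ and that of image space is
$n_3 \neq n_1$, the power of the lens is the right-hand side of the general thin-lens equation:
\[
\phi = \frac{n_2 - n_1}{R_1} + \frac{n_3 - n_2}{R_2} = \frac{n_1}{z_1} + \frac{n_3}{z_2'}
\]
which reduces to $(n_2 - 1)\left(\frac{1}{R_1} - \frac{1}{R_2}\right) = \frac{1}{f}$ when $n_1 = n_3 = 1$.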
2.10.1 Examples of Thin Lenses

1. Plano-convex lens, curved side forward (convexo-planar lens)
\[
R_1 = |R_1| > 0, \qquad R_2 = \infty \ \text{(sign has no effect)}
\]
\[
\frac{1}{z_1} + \frac{1}{z_2'} = (n_2 - 1) \left(\frac{1}{|R_1|} - \frac{1}{\infty}\right) = \frac{n_2 - 1}{|R_1|} > 0
\]
If $z_1 = +\infty$, then $z_2' = f > 0$, the focal length:
\[
\frac{1}{f} = \frac{n_2 - 1}{R_1} = \phi = \text{system power (measured in m}^{-1} = \text{diopters)}
\]
\[
f = \frac{R_1}{n_2 - 1} \cong 2 R_1 \quad (\text{since } n_2 \cong 1.5 \text{ for glass})
\]
We often use the power $\phi = f^{-1}$ (measured in m$^{-1}$ = diopters) instead of the focal length
$f$ to describe the lens, since powers of different lenses combine by addition, instead of as
reciprocals of sums of reciprocals. The power measures the ability of the lens or lens system
to deviate rays, i.e., to change the ray angle.
2. Plano-convex lens, plane side forward:
\[
R_1 = \infty, \qquad R_2 = -|R_2| < 0
\]
\[
\frac{1}{z_1} + \frac{1}{z_2'} = -\frac{n_2 - 1}{R_2} = +\frac{n_2 - 1}{|R_2|} > 0
\]
\[
f = \frac{|R_2|}{n_2 - 1} \cong 2 |R_2|
\]
So the focal length of the lens is the same regardless of its orientation (front-to-back). Since
the focal lengths for the two configurations (curved side in front or behind lens) are the same,
you might assume that the same image quality can be expected for the two configurations.
This is NOT the case, but the explanation requires the theory of aberrations. At this point,
we will just try to give a bit of motivation for another rule of thumb, while postponing the
proof.
Rule of Thumb: Orientation of Plano-Convex Lens: When using a plano-convex lens
to form an image, the quality of the image is better if the power is more evenly divided between
the two surfaces. This means that the curved side of the lens is placed towards the longer
conjugate (which usually is towards the object) and the plane side towards the shorter conjugate.
This minimizes the spherical aberration that causes rays from a point object to cross the
optical axis at different distances from the lens. This perhaps may be visualized better if we
consider the case of a distant object (assume $z_1 = \infty$) and a plano-convex lens with the flat
side towards the object. For an object at infinity, the rays incident upon the lens are parallel
(collimated) both when they are incident to and when they exit the flat surface. In other
words, the flat side contributes no power to the imaging, so all of the focusing power comes
from the curved surface.
Rule of thumb: when using a plano-convex lens, place the curved side towards the longer conjugate
to get a better image.
3. Plano-concave, plane side forward:
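\[
R_1 = \infty, \qquad R_2 = +|R_2| > 0
\]
\[
\frac{1}{z_1} + \frac{1}{z_2'} = (n_2 - 1) \left(\frac{1}{\infty} - \frac{1}{+|R_2|}\right) = -\frac{n_2 - 1}{|R_2|} < 0
\]
\[
f = -\frac{|R_2|}{n_2 - 1} \cong -2 |R_2|
\]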
4. Double convex lens with equal radii:
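\[
R_1 = |R| > 0, \qquad R_2 = -R_1 = -|R|
\]
\[
\frac{1}{z_1} + \frac{1}{z_2'} = (n_2 - 1) \left(\frac{1}{|R|} - \frac{1}{-|R|}\right) = \frac{2 (n_2 - 1)}{|R|} > 0
\]
\[
\frac{1}{f} = \phi = \frac{2 (n_2 - 1)}{|R|}
\]
\[
f = \frac{|R|}{2 (n_2 - 1)} \cong |R| > 0 \quad \text{if } n_2 \cong 1.5
\]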
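The four examples can be reproduced numerically with a minimal sketch of the thin-lens lensmaker's equation in air. The function names and the radius $|R| = 100$ mm are illustrative assumptions; a plane surface is represented by an infinite radius so that $1/R = 0$.

```python
import math


def thin_lens_power(n2, r1, r2):
    """Power 1/f = (n2 - 1) * (1/R1 - 1/R2) of a thin lens in air.

    A plane surface is passed as math.inf, so that 1/R evaluates to 0.
    """
    return (n2 - 1.0) * (1.0 / r1 - 1.0 / r2)


def thin_lens_focal_length(n2, r1, r2):
    """Focal length of the thin lens; math.inf if the lens has no power."""
    power = thin_lens_power(n2, r1, r2)
    return math.inf if power == 0 else 1.0 / power


N_GLASS = 1.5  # nominal index used in the examples above
R = 100.0      # hypothetical |R| in mm, chosen only for illustration

cases = {
    "plano-convex, curved side forward": (R, math.inf),
    "plano-convex, plane side forward": (math.inf, -R),
    "plano-concave, plane side forward": (math.inf, R),
    "double convex, equal radii": (R, -R),
}
for name, (r1, r2) in cases.items():
    f = thin_lens_focal_length(N_GLASS, r1, r2)
    print(f"{name}: f = {f:+.1f} mm")
```

With $n_2 = 1.5$ the two plano-convex orientations give the same $f \cong 2|R|$, the plano-concave lens gives $f \cong -2|R|$, and the equiconvex lens gives $f \cong |R|$, in agreement with the algebra above.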
2.10.2 Spherical Mirror

The mirror changes the direction of rays by reflection that obeys Snell's law for reflection so that
the angle of reflection is the negative of the angle of incidence (measured from the normal to the
surface). For a concave spherical mirror, the incident ray angle varies with height above the optical
axis. The difference in analysis between the single refractive surface and the mirror may be simplified by
recognizing that the mirror reverses the direction of propagation of light, which may be modeled
by setting $n_2 = -n_1 = -1$:
\[
\phi = \frac{1}{f} = \frac{-1 - 1}{R} = -\frac{2}{R} \implies f = -\frac{R}{2}
\]
In words, the magnitude of the focal length of a spherical mirror is half of the radius of curvature. A concave
mirror, whose radius is negative (center to the left of the vertex), has a positive focal length.
2.11 Image Magnifications

The most common use for a lens is to change the apparent size of an object (or image) via the
magnifying properties of the lens. The mapping of object space to image space distorts the size
and shape of the image, i.e., some regions of the image are larger and some are smaller than the
original object. We can define three types of magnification: transverse, longitudinal, and angular,
where the first two describe the impact of the imaging system on lengths that are respectively
perpendicular to and parallel to the optical axis, while the last refers to the action on the angles
of rays measured from the optical axis. Note that the very name of "magnification" is rather
misleading because most imaging systems produce images that are smaller than the object; they
actually "minify" the features because the magnitudes of the magnifications are smaller than unity.
2.11.1 Transverse Magnification:

The transverse magnification $M_T$ is what we usually think of as magnification; it is the ratio of
image to object dimension measured transverse to the optical axis. In the figure, note the two similar
triangles $\triangle a_1 b_1 c$ and $\triangle a_2 b_2 c$:
The transverse magnification of the image is the ratio of the height of the image to that of the
object:
\[
M_T \equiv \frac{y_2}{y_1}
\]
It is easy to see that:
\[
\frac{y_1}{z_1} = \frac{|y_2|}{z_2} = -\frac{y_2}{z_2} \quad (\text{because } y_2 < 0)
\]
\[
\implies \frac{y_2}{y_1} = M_T = -\frac{z_2}{z_1}
\]
If |MT | is larger than or smaller than unity, the image is magnified or minified, respectively. If
MT > 0, the image is upright or erect and if MT < 0, the image is inverted (upside down).
2.11.2 Longitudinal Magnification:
The longitudinal magnification $M_L$ is the ratio of the length or depth of the image measured
along the optical axis to the corresponding length of the object; it is the ratio of differential
elements of length of the image and object in the limit of infinitesimal lengths:
\[
M_L \equiv \frac{\Delta z_2}{\Delta z_1}, \qquad \lim_{\Delta z_1 \to 0} \frac{\Delta z_2}{\Delta z_1} = \frac{dz_2}{dz_1}
\]
The expression may be derived by evaluating the total derivative of the lensmaker's equation:
\[
\frac{1}{z_1} + \frac{1}{z_2} = (n - 1) \left(\frac{1}{R_1} - \frac{1}{R_2}\right)
\]
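Since the imaging equation relates the reciprocal distances $z_1^{-1}$ and $z_2^{-1}$, the longitudinal
magnification varies for different object distances. The total derivative of the left-hand side of the
imaging equation is:
\[
d\left(\frac{1}{z_1} + \frac{1}{z_2}\right) = d\left(\frac{1}{z_1}\right) + d\left(\frac{1}{z_2}\right)
= -\frac{1}{z_1^2}\, dz_1 - \frac{1}{z_2^2}\, dz_2
\]
The derivative of the right-hand side is:
\[
d\left[(n - 1) \left(\frac{1}{R_1} - \frac{1}{R_2}\right)\right] = (n - 1)\, d\left(\frac{1}{R_1} - \frac{1}{R_2}\right)
= 0 \quad (\text{because } n, R_1, \text{ and } R_2 \text{ are constants})
\]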
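We combine these to see that:
\[
-\frac{1}{z_1^2}\, dz_1 - \frac{1}{z_2^2}\, dz_2 = 0 \implies \frac{1}{z_1^2}\, dz_1 = -\frac{1}{z_2^2}\, dz_2
\implies \frac{dz_2}{dz_1} = -\left(\frac{z_2}{z_1}\right)^2
\]
We can now identify the ratio of the two differential lengths along the axis as the longitudinal
magnification $M_L$:
\[
M_L = \frac{dz_2}{dz_1} = -\left(\frac{z_2}{z_1}\right)^2 = -(M_T)^2 < 0
\]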
The longitudinal magnification is negative because the image moves away from the lens (increasing
$z_2$) as the object moves towards the lens (decreasing $z_1$). The longitudinal magnification affects the
irradiance of the image (i.e., the flux density of the rays at the image); if $|M_L|$ is large, then the
light in the vicinity of an on-axis location is "spread out" over a longer longitudinal dimension at
the image, which requires the irradiance of the image to decrease.
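The relation $M_L = -(M_T)^2$ can be checked numerically by differencing the imaging equation. The sketch below assumes a thin lens in air with a hypothetical focal length of 100 mm; the function names and the step size are illustrative choices.

```python
def image_distance(z1, f):
    """Solve 1/z1 + 1/z2 = 1/f for the image distance z2."""
    return 1.0 / (1.0 / f - 1.0 / z1)


def transverse_magnification(z1, f):
    """M_T = -z2 / z1."""
    return -image_distance(z1, f) / z1


def longitudinal_magnification_numeric(z1, f, dz=1e-4):
    """Central-difference estimate of dz2/dz1 at object distance z1."""
    return (image_distance(z1 + dz, f) - image_distance(z1 - dz, f)) / (2.0 * dz)


f = 100.0  # hypothetical focal length in mm
for z1 in (150.0, 200.0, 400.0):
    mt = transverse_magnification(z1, f)
    ml = longitudinal_magnification_numeric(z1, f)
    print(f"z1 = {z1:5.0f} mm  M_T = {mt:+.4f}  M_L = {ml:+.4f}  -M_T^2 = {-mt**2:+.4f}")
```

The last two columns agree to within the accuracy of the finite difference, and $M_L$ is negative at every object distance, as the derivation requires.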
The scaling of the 3-D image along the three axes. The scaling along the transverse axes x and
y defines the transverse magnification, while the scaling of the image along the z-axis is determined
by the longitudinal magnification.
The effect of longitudinal magnification on the irradiance of the image of a uniformly luminous rod
of length ab. The section at $z_1 = 2f$ is imaged with unit negative transverse magnification at
$z_2 = 2f$. Sections of the rod with $z_1 > 2f$ are imaged at $z_2 < 2f$, and the energy density is
remapped to account for the nonlinear distance relationship $\frac{1}{z_1} + \frac{1}{z_2} = \frac{1}{f}$.
2.11.3 Angular Magnification

This is the ratio of the angles of the outgoing ray and the corresponding incoming ray measured
relative to the optical axis. Angular magnification is particularly relevant for systems that do not
form images, e.g., afocal telescopes. We shall shortly utilize this concept when considering the
single-lens magnifier.
\[
M_\theta \equiv \frac{\theta_{out}}{\theta_{in}}
\]
If $|M_\theta| > 1$, then the angle of the emerging ray is larger than that of the corresponding entering
ray. This will increase the angular separation between rays generated by two objects so that it will
be easier for the eye to resolve them. The angular magnification is sometimes called the magnifying
power of the lens.
2.12 Single Thin Lenses

2.12.1 Positive Lens
The power of a single lens with two surfaces is determined by the lensmaker's equation:
\[
\phi = \frac{1}{f} = \phi_1 + \phi_2 = (n_2 - 1) \left(\frac{1}{R_1} - \frac{1}{R_2}\right)
\]
The power is positive if 0 < R1 < |R2 |. The most common case is the double convex lens
where R1 > 0, R2 < 0, which means that the ray encounters positive power at both surfaces. The
action of a single thin positive lens with known focal length on an object with known location may
be solved graphically by sketching three specific rays from the tip of the object:
1. the ray parallel to the optical axis; this ray is refracted by the lens to pass through the image-
space focal point $F'$,
2. the ray through the center of the lens, which is not refracted by the thin lens and so maintains
the same angle relative to the optical axis, and
3. the ray through the object-space focal point $F$ to the lens; this ray is refracted and travels
parallel to the optical axis.
The intersection of these three rays (or obviously of any two) is the location of the image of
the tip of the object:
The example in the figure closely matches the situation where the image is an inverted replica of
the object, so that $h' = -h$ and $M_T = -1$. The two equations that must be satisfied are
\[
z_2 = z_1 \implies M_T = -1
\]
\[
\frac{1}{z_1} + \frac{1}{z_2} = \frac{1}{f} \implies z_1 = z_2 = 2f
\]
This situation where the object and image distances are twice the focal length is often called imaging
at "equal conjugates."
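This drawing assumes that the indices of refraction in object and image space are identical. If the
indices are different (e.g., if the object is in water and the image in air), then the imaging equation
must be modified. With object-space index $n_1$, image-space index $n_2$, and lens index $n$:
\[
\phi = \frac{n - n_1}{R_1} + \frac{n_2 - n}{R_2} = \frac{n_1}{z_1} + \frac{n_2}{z_2}
\]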
If the refractive indices in object and image spaces are larger than that of the lens, such as a
case where the object and image are in glass or water and the lens is made of air, the curvatures
must be reversed, so that R1 < 0 and R2 > 0 to make a positive lens.
Lens made of rare medium (e.g., air) within a dense medium (e.g., glass, water). The reversal of
refractive indices requires inverting of the signs of the radii of curvature.
2.12.2 Negative Lens
A lens with negative power at both surfaces may be constructed if R1 is negative and R2 is positive.
Two (or more) rays that have passed through a lens with negative power will exhibit a larger
divergence on the output side than on the input side.
2.12.3 Meniscus Lenses
A lens with radii of curvature with the same sign on both surfaces is a meniscus lens. If both radii
are positive, then the sum of the powers of the two surfaces is:
\[
\phi = \phi_1 + \phi_2 = \frac{n - 1}{|R_1|} + \frac{1 - n}{|R_2|} = (n - 1) \left(\frac{1}{|R_1|} - \frac{1}{|R_2|}\right)
\]
which may be positive or negative depending on the relative sizes of $R_1$ and $R_2$; the power is positive
if $R_2 > R_1$ and negative if $R_2 < R_1$. An example of a meniscus lens with positive power is shown in
the figure.
Meniscus lens with positive power; the radii of curvature of both surfaces are positive since the
vertices are to the left of the centers, but the fact that $R_2 > R_1$ ensures that $\phi > 0$.
Examples of meniscus lenses with positive and negative power are also shown:
Meniscus lenses with positive and negative powers from the Newport optics catalog. The red lines
represent rays that show the respective converging and diverging actions of the lenses.
2.12.4 Simple Microscope (magnifier, magnifying glass, loupe)

This is arguably the simplest imaging system, but some of the concepts it illustrates are sufficiently
sophisticated that many optickers and/or imaging scientists may not understand them entirely. The
simple microscope is a single lens with positive focal length that is used to form a larger image on
the retina than could be formed with the eye alone. (It also may be called the magnifying
glass if handheld or a loupe if designed to rest on the object.) You may know already that the
eye lens is deformed by ciliary muscles that are relaxed when the lens is flatter, i.e., the radii of
curvature of the surfaces are larger so the focal length is longer. To view an object close up, the
focal length of the eye lens must be shortened by making the lens shape more spherical. This is
accomplished by tightening the ciliary muscles (which is the reason why your eyes get tired after an
extended time of viewing objects up close).
The closest distance to an object that appears to be sharply focused by the unaided eye is the
near point, which (obviously) depends on the flexibility of the deformable eyelens and the capability
of the ciliary muscles, which (obviously) vary with individual, and with age for a single individual.
The distance to the near point may be as close as 50 mm = 2 in for a young child and in the range
between 1000 mm and 2000 mm for an elderly person. This reduction in accommodation for close
objects is one of the signs of aging. The near point of an ideal eye is assumed to be 250 mm = 10 in
from the front surface. For nearsighted individuals, the near point is closer to the eye, thus increasing
the angular subtense of fine details for those individuals. For this reason, nearsighted individuals
in ancient times (before optical correction) often were attracted to professions requiring fine work,
such as goldsmithing. Since nearsightedness can be a genetic trait, descendents often continued in
these crafts.
The reference for angular magnification is the angle subtended by the object if viewed at the
near point of the average eye so that $z_1 = 250$ mm. If the object height is $y$, the angle when viewed
at the near point is:
\[
\theta_{250\,\text{mm}} = \tan^{-1}\left[\frac{y}{250\text{ mm}}\right] \cong \frac{y}{250\text{ mm}}
\]
where the first-order approximation $\tan[\theta] \cong \theta$ if $\theta \cong 0$ is used in the last step.
Magnifier with Object at Focal Point of Positive Lens
If the object is positioned at the object-space (front) focal point of a positive lens with focal length
$f_{lens}$, then the rays from the tip of the object are parallel when they exit the lens and so may be
viewed in focus by an eye with a relaxed lens, as for an object at an infinite distance away. The angle
subtended by the object one focal length away is:
\[
\theta_{lens} = \tan^{-1}\left[\frac{y}{f_{lens}}\right] \cong \frac{y}{f_{lens}}
\]
Magnifier with object at focal point of lens. Figure (a) at top shows the angle $\theta_{250\,\text{mm}}$ subtended by
the object when located at the near point; (b) shows the angle $\theta_{lens}$ subtended by the object when
located at the object-space focal point of the lens. The blue ray in (b) emerges parallel to the optic
axis, which shows that the object distance $z_1 = f$.
The angular magnification or magnifying power of the magnifier is the ratio of the angle subtended
by the object when viewed at the closer distance through the lens to the angular subtense viewed
at the near point:
\[
M_\theta = \frac{\theta_{lens}}{\theta_{250\,\text{mm}}}
= \frac{\tan^{-1}\left[\frac{y}{f_{lens}}\right]}{\tan^{-1}\left[\frac{y}{250\text{ mm}}\right]}
\cong \frac{\;\frac{y}{f_{lens}}\;}{\frac{y}{250\text{ mm}}}
\]
\[
M_\theta \cong \frac{250\text{ mm}}{f_{lens}}, \quad \text{object at focal point}
\]
If the focal length of the magnifying lens is, say, $f = 50$ mm, then the magnifying power of the lens
for the object at the focal point is:
\[
M_\theta = \frac{250\text{ mm}}{50\text{ mm}} = 5
\]
Magnifier with Image Formed at Near Point
We can instead use the magnifying lens held close to the eye to form a virtual image at the near
point of the eye. This means that the distance from the lens to the virtual image formed by the lens
is the distance to the near point, taken as negative because the image is virtual:
$\overline{V'O'} = z_2 = -250$ mm. Substitute this distance into the imaging equation:
\[
\frac{1}{z_1} + \frac{1}{-250\text{ mm}} = \frac{1}{f}
\]
\[
\implies \frac{1}{z_1} = \frac{1}{f} + \frac{1}{250\text{ mm}} \implies z_1 = \frac{250\text{ mm} \cdot f}{250\text{ mm} + f}
\]
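The angle subtended by the object at the near point is the same as before:
\[
\theta_{250\,\text{mm}} = \tan^{-1}\left[\frac{y_1}{250\text{ mm}}\right] \cong \frac{y_1}{250\text{ mm}}
\]
but the angle subtended by the image when positioned at the near point viewed through the lens is
different:
\[
\theta_{lens} = \tan^{-1}\left[\frac{y_2}{|-250\text{ mm}|}\right] \cong \frac{y_2}{250\text{ mm}} = \frac{y_1}{z_1}
\]
where the similarity of the triangles has been used. This expression may be recast by substituting
the expression for $z_1$:
\[
\theta_{lens} \cong \frac{y_1}{z_1} = \frac{y_1}{\left(\frac{250\text{ mm} \cdot f}{250\text{ mm} + f}\right)}
= y_1\, \frac{250\text{ mm} + f}{f \cdot 250\text{ mm}}
\]
The magnifying power is:
\[
M_\theta = \frac{\theta_{lens}}{\theta_{250\,\text{mm}}}
\cong \frac{y_1 \left(\frac{250\text{ mm} + f}{f \cdot 250\text{ mm}}\right)}{y_1 \left(\frac{1}{250\text{ mm}}\right)}
= \frac{250\text{ mm} + f}{f}
\]
\[
M_\theta \cong \frac{250\text{ mm}}{f_{lens}} + 1, \quad \text{image at near point}
\]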
Magnifier with image at near point of eye. The top figure again shows the angle $\theta_{250\,\text{mm}}$ subtended
by the object when located at the near point. The second figure shows the image at the near point,
which is more distant than the object.
2.13 Systems of Thin Lenses

The images produced by systems of thin lenses may be located by finding the intermediate image
produced by the first lens, which then becomes in turn the object for the second lens, which generates
an image that is the object for the third lens, etc. This type of analysis also may be applied
directly to the more realistic case of thick lenses, where the "first lens" actually represents the
first surface of the thick lens and the light propagates through the glass between the surfaces.
Though straightforward, this sequential solution for the image may be tedious and also not very
illuminating (pun intended) about the action of the system of lenses. The object distance for
the nth lens will be denoted by $z_n$ and the corresponding image distance by the primed quantity $z_n'$.
2.13.1 Two-Lens System

Consider a two-lens system with first lens L1 and second lens L2 separated by the distance $t$. The
object for the system shown in the figure is labelled by $O$ and the corresponding image by $O'$, the
object- and image-space focal points are $F$ and $F'$, and the object- and image-space vertices (first
and last surfaces of the system) by $V$ and $V'$.
Imaging by a system of two thin lenses L1 and L2 separated by the distance $t$. The object and
image distances for the first lens are $z_1$ and $z_1'$ and for the second lens are $z_2$ and $z_2'$.
From the diagram, we see that $z_1'$ (the image distance from the first lens), $z_2$ (the object distance for
the second lens), and the lens separation $t$ are related by:
\[
z_1' + z_2 = t
\]
so the object distance for the second lens is $z_2 = t - z_1'$. The imaging equation for the first lens
determines $z_1'$:
\[
\frac{1}{z_1} + \frac{1}{z_1'} = \frac{1}{f_1} \implies \frac{1}{z_1'} = \frac{1}{f_1} - \frac{1}{z_1} = \frac{z_1 - f_1}{z_1 f_1}
\implies z_1' = \frac{z_1 f_1}{z_1 - f_1}
\]
If $z_1 = \infty$, then the image distance from the first lens is:
\[
z_1' = \lim_{z_1 \to \infty} \frac{z_1 f_1}{z_1 - f_1} = f_1 \lim_{z_1 \to \infty} \frac{z_1}{z_1 - f_1} = f_1 \cdot 1 = f_1
\]
In words, the image distance from the first lens for an object at $\infty$ is the focal length of the first
lens, as it should be. The object distance to the second lens is $z_2 = t - z_1'$, which may be rewritten
in terms of $z_1$, $f_1$, and $t$ for the general case:
\[
z_2 = t - z_1' = t - \frac{z_1 f_1}{z_1 - f_1}
= \frac{z_1 t - f_1 t - z_1 f_1}{z_1 - f_1}
= \frac{z_1 (t - f_1) - f_1 t}{z_1 - f_1}
\]
In the limit of infinite object distance, the object distance to the second lens is:
\[
z_2\,[\text{for } z_1 = \infty] = \lim_{z_1 \to \infty} \left[\frac{z_1}{z_1 - f_1}\,(t - f_1) - \frac{f_1 t}{z_1 - f_1}\right]
= 1 \cdot (t - f_1) - 0 = t - f_1
\]
which is the difference between the separation of the lenses and the distance from the first lens to its
image-space focal point; this often is a negative distance (i.e., a virtual object for the second lens).
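In the general case, apply the imaging equation for the second lens and substitute the expression
for $z_2$:
\[
\frac{1}{z_2'} = \frac{1}{f_2} - \frac{1}{z_2} = \frac{1}{f_2} - \frac{z_1 - f_1}{z_1 (t - f_1) - f_1 t}
\]
\[
\frac{1}{z_2'} = \frac{t - (f_1 + f_2)\,\frac{z_1}{z_1 - f_1} + \frac{f_1 f_2}{z_1 - f_1}}{f_2 t - f_1 f_2\, \frac{z_1}{z_1 - f_1}}
\]
\[
\implies z_2' = \frac{f_2 t - f_1 f_2\, \frac{z_1}{z_1 - f_1}}{t - (f_1 + f_2)\,\frac{z_1}{z_1 - f_1} + \frac{f_1 f_2}{z_1 - f_1}}
= \frac{f_2 \left(t - \frac{f_1 z_1}{z_1 - f_1}\right)}{t - \frac{z_1 (f_1 + f_2) - f_1 f_2}{z_1 - f_1}}
\]
The image distance for a specified (non-infinite) object location is called the back focal distance by
some authors:
\[
BFD = z_2' = \overline{V'O'} = \frac{f_2 \left(t - \frac{f_1 z_1}{z_1 - f_1}\right)}{t - \frac{z_1 (f_1 + f_2) - f_1 f_2}{z_1 - f_1}}
\]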
In the limit of infinite object distance, the BFD becomes the back focal length BFL:
\[
\lim_{z_1 \to \infty} [z_2'] = z_2'[f_1, f_2, t; z_1 = \infty] \equiv \overline{V'F'}
= \lim_{z_1 \to \infty} \frac{f_2 \left(t - f_1 \frac{z_1}{z_1 - f_1}\right)}{t - \frac{z_1 (f_1 + f_2) - f_1 f_2}{z_1 - f_1}}
\]
\[
= \frac{f_2 (t - f_1 \cdot 1)}{t - 1 \cdot (f_1 + f_2) + 0 \cdot f_1 f_2}
= \frac{t f_2 - f_1 f_2}{t - (f_1 + f_2)} = \frac{f_1 f_2 - f_2 t}{(f_1 + f_2) - t}
\]
\[
BFL = \overline{V'F'} = \frac{f_1 f_2 - f_2 t}{(f_1 + f_2) - t} = \frac{(f_1 - t)\, f_2}{(f_1 + f_2) - t}
\]
These complicated expressions for the image distances measured from the second lens, in terms of
the two focal lengths $f_1$ and $f_2$, the separation $t$, and the distance $z_1$ from the object to the first
lens, are useful, but they tell us little on their face about the entire lens system. We would much prefer
establishing relationships from the object to the lens system and from the system to the image. The
first step in this analysis is to define an equivalent or effective focal length for the entire system,
which is the focal length of the equivalent single thin lens.
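The sequential procedure described above (image from the first lens becomes the object for the second) is easy to sketch in code and to compare against the closed-form back focal length. The function names are illustrative assumptions; the numerical values $f_1 = 100$ mm, $f_2 = 25$ mm, $t = 75$ mm are the specimen values used in the comparison of BFL and BFD in these notes.

```python
import math


def image_distance(z, f):
    """Thin-lens imaging equation 1/z + 1/z' = 1/f solved for z'.

    An object at infinity is passed as math.inf; an image at infinity
    (object exactly at the focal point) is returned as math.inf.
    The degenerate case z = 0 (object at the lens) is not handled.
    """
    inv = 1.0 / f - 1.0 / z
    return math.inf if inv == 0 else 1.0 / inv


def back_focal_distance(z1, f1, f2, t):
    """Image distance z2' from the second lens, found by sequential imaging."""
    z1_prime = image_distance(z1, f1)  # intermediate image from lens 1
    z2 = t - z1_prime                  # object distance for lens 2 (may be negative)
    return image_distance(z2, f2)


def back_focal_length(f1, f2, t):
    """Closed form (f1 - t) f2 / ((f1 + f2) - t) for an object at infinity."""
    return (f1 - t) * f2 / ((f1 + f2) - t)


f1, f2, t = 100.0, 25.0, 75.0  # mm
print("BFL, sequential :", back_focal_distance(math.inf, f1, f2, t))
print("BFL, closed form:", back_focal_length(f1, f2, t))
print("BFD for z1 = 1 m:", round(back_focal_distance(1000.0, f1, f2, t), 3))
```

The sequential trace and the closed form give the same back focal length, and moving the object in from infinity to 1 m pushes the image farther from the rear lens.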
2.13.2 Effective (Equivalent) Focal Length

We can use the results just derived to find an expression for the imaging action of a two-lens system
by finding the location and focal length of the equivalent single lens that would generate the same
image. This is an important concept, so we will do a rigorous derivation, which is perhaps simplified
by adding some details to the figure:
Ray diagram of system of two positive thin lenses to illustrate the concept of effective (or
equivalent) focal length $f_e$, back focal length $BFL = z_2' = \overline{V'F'}$, and principal point $H'$.
The continuations of the incoming and outgoing rays intersect at B, whose projection onto the optical
axis is at $H'$; this is the location of the equivalent single lens that would generate the same outgoing
ray from the incoming ray. The distance from $H'$, the image-space principal point, to $F'$ is the
image-space effective (or equivalent) focal length:
\[
\overline{H'F'} \equiv f_e
\]
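We have already evaluated the back focal length, which is the image location for an object at infinity:
\[
\overline{V'F'} = z_2'[z_1 = \infty] = \frac{(f_1 - t)\, f_2}{(f_1 + f_2) - t}
\]
Compare two sets of similar triangles, $\triangle AVF_1' \sim \triangle CV'F_1'$ and $\triangle BH'F' \sim \triangle CV'F'$,
shown in the figures:
From the first pair of triangles $\triangle AVF_1' \sim \triangle CV'F_1'$, we can construct ratios of their
heights and axial lengths:
\[
\frac{h_1}{\overline{VF_1'}} = \frac{h_2}{\overline{V'F_1'}} \implies \frac{h_2}{h_1} = \frac{\overline{V'F_1'}}{\overline{VF_1'}}
\]
Now note that the distance $\overline{VF_1'} = f_1$, while $\overline{V'F_1'}$ may be rewritten:
\[
\overline{V'F_1'} = \overline{VF_1'} - \overline{VV'} = f_1 - t
\]
so the ratio may be rewritten:
\[
\frac{h_2}{h_1} = \frac{f_1 - t}{f_1}
\]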

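From the second pair of similar triangles $\triangle BH'F' \sim \triangle CV'F'$, we can define the distance
$\overline{H'F'} \equiv f_e$ and $\overline{V'F'} = BFL = z_2'[z_1 = \infty]$, so we now have two expressions for the ratio:
\[
\frac{h_2}{h_1} = \frac{\overline{V'F'}}{\overline{H'F'}} = \frac{BFL}{f_e}
\]
Equate the two expressions for the ratio $h_2 / h_1$:
\[
\frac{f_1 - t}{f_1} = \frac{BFL}{f_e} \implies \frac{1}{f_e} = \frac{1}{BFL} \cdot \frac{f_1 - t}{f_1}
\]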
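Now substitute the formula for the back focal length BFL, which is $z_2'$ if $z_1 = \infty$:
\[
z_2' = \frac{f_2 (t - f_1)}{t - (f_1 + f_2)} \implies \frac{1}{z_2'} = \frac{(f_1 + f_2) - t}{(f_1 - t)\, f_2}
\]
\[
\frac{1}{f_e} = \frac{1}{BFL} \cdot \frac{f_1 - t}{f_1} = \frac{(f_1 + f_2) - t}{(f_1 - t)\, f_2} \cdot \frac{f_1 - t}{f_1}
\]
which may be rearranged to obtain a relationship for the reciprocal of the effective focal length in
terms of the reciprocals of the individual focal lengths:
\[
\frac{1}{f_e} = \frac{(f_1 + f_2) - t}{(f_1 - t)\, f_2} \cdot \frac{f_1 - t}{f_1}
= \frac{(f_1 + f_2) - t}{f_2 f_1} = \frac{1}{f_1} + \frac{1}{f_2} - \frac{t}{f_1 f_2}
\]
\[
\frac{1}{f_e} = \frac{1}{f_1} + \frac{1}{f_2} - \frac{t}{f_1 f_2} \implies f_e = \frac{f_1 f_2}{(f_1 + f_2) - t}
\]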
These two equivalent expressions specify what is certainly the most important equation we have
derived to date and arguably the most important to be derived in this class. It determines the effect
on the image of separating two thin lenses by some distance $t$.
This expression may also be written in terms of the powers of the two lenses, where the power
of the nth lens is the reciprocal of the focal length, $\phi_n \equiv f_n^{-1}$:
\[
\phi_e = \phi_1 + \phi_2 - \phi_1 \phi_2 t
\]
Note that if
\[
t = f_1 + f_2 = \frac{1}{\phi_1} + \frac{1}{\phi_2} = \frac{\phi_1 + \phi_2}{\phi_1 \phi_2}
\]
then $f_e = \infty = BFL$ and $\phi_e = 0$; the object and image are both an infinite distance
from the system. The focal points are located at $\pm\infty$ and the system is called afocal. Such a system
has infinite focal length and no power, which means that the image of an object at infinity is also
at infinity. Since $z_1 = z_2' = \infty$, then the transverse magnification is zero. However, such a system
exhibits a useful angular magnification, as we shall see.
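A brief sketch shows the two equivalent forms side by side and the afocal limit. The function names are illustrative assumptions, and the lens values (100 mm and 50 mm) are chosen to match the worked example used later in these notes; powers are in diopters when lengths are in meters.

```python
import math


def effective_power(phi1, phi2, t):
    """phi_e = phi1 + phi2 - phi1 * phi2 * t (t in meters if phi in diopters)."""
    return phi1 + phi2 - phi1 * phi2 * t


def effective_focal_length(f1, f2, t):
    """f_e = f1 f2 / ((f1 + f2) - t); math.inf for an afocal system."""
    denom = (f1 + f2) - t
    return math.inf if denom == 0 else f1 * f2 / denom


f1, f2 = 100.0, 50.0  # mm
for t in (0.0, 75.0, 150.0):  # in contact, separated, afocal (t = f1 + f2)
    fe = effective_focal_length(f1, f2, t)
    # Convert mm to m so that the power comes out in diopters.
    phi_e = effective_power(1000.0 / f1, 1000.0 / f2, t / 1000.0)
    print(f"t = {t:5.1f} mm  f_e = {fe:8.3f} mm  phi_e = {phi_e:6.3f} D")
```

At $t = f_1 + f_2$ the power vanishes and the focal length is reported as infinite, which is the afocal (telescope) configuration described above.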
Back Focal Length and Image-Space Principal Point

We have evaluated the back focal length:
\[
BFL = \overline{V'F'} = \frac{f_1 f_2 - f_2 t}{(f_1 + f_2) - t}
\]
and the system focal length:
\[
f_e = \frac{f_1 f_2}{(f_1 + f_2) - t}
\]
We now define the image-space principal point $H'$ to be the point that is located one effective focal
length from the image-space focal point, i.e., so that $\overline{H'F'} = f_e$:
\[
\overline{H'F'} \equiv f_e = \frac{f_1 f_2}{(f_1 + f_2) - t}
\]
We can think of $H'$ as the location of the single equivalent thin lens that generates the same outgoing
ray that emerges from the two-lens system. For a single thin lens, $H'$ coincides with the image-space
vertex $V'$, which in turn coincides with the object-space vertex $V$ since the thin lens has thickness
$t = 0$.
From the equation for the BFL and the definition of the principal point, we can also specify the
distance from the principal point to the vertex:
\[
f_e \equiv \overline{H'F'} = \overline{H'V'} + \overline{V'F'} = \overline{H'V'} + BFL
\]
\[
\implies \overline{H'V'} = f_e - BFL = \frac{f_1 f_2}{(f_1 + f_2) - t} - \frac{f_1 f_2 - f_2 t}{(f_1 + f_2) - t}
\]
\[
\overline{H'V'} = \frac{f_2 t}{(f_1 + f_2) - t}
\]
We can (and will) derive corresponding results in the object space, i.e., object-space principal and
focal points.
A pair of positive thin lenses showing the image-space principal and focal points $H'$ and $F'$,
respectively.
Compare Back Focal Length and Back Focal Distance
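As the object distance decreases from $\infty$, the distance from the rear vertex to the image typically
increases, so that the BFD for a finite object distance typically is larger than the BFL for an infinite
object distance. This can be seen by comparing the two expressions for some specimen focal lengths.
For $f_1 = 100$ mm, $f_2 = 25$ mm, and $t = 75$ mm, the focal length of the equivalent single lens is:
\[
f_e = \left(\frac{1}{100\text{ mm}} + \frac{1}{25\text{ mm}} - \frac{75\text{ mm}}{100\text{ mm} \cdot 25\text{ mm}}\right)^{-1} = +50\text{ mm}
\]
The back focal length (distance from rear vertex to focal point) is:
\[
BFL = z_2'[z_1 = \infty] = \frac{(f_1 - t)\, f_2}{(f_1 + f_2) - t}
= \frac{25\text{ mm} \cdot (75\text{ mm} - 100\text{ mm})}{75\text{ mm} - (100\text{ mm} + 25\text{ mm})} = 12.5\text{ mm}
\]
If the object distance is decreased from $z_1 = \infty$ to $z_1 = 1000$ mm, the back focal distance is:
\[
BFD = \frac{f_2 t - f_1 f_2\, \frac{z_1}{z_1 - f_1}}{(t - f_2) - f_1\, \frac{z_1}{z_1 - f_1}}
= \frac{f_2 t - f_1 f_2\, \frac{z_1}{z_1 - f_1}}{t - (f_1 + f_2)\, \frac{z_1}{z_1 - f_1} + \frac{f_1 f_2}{z_1 - f_1}}
\]
\[
BFD[z_1 = 1\text{ m}] = \frac{25\text{ mm} \cdot 75\text{ mm} - 100\text{ mm} \cdot 25\text{ mm} \cdot \frac{1000\text{ mm}}{1000\text{ mm} - 100\text{ mm}}}{(75\text{ mm} - 25\text{ mm}) - 100\text{ mm} \cdot \frac{1000\text{ mm}}{1000\text{ mm} - 100\text{ mm}}}
\]
\[
= \frac{25\text{ mm} \cdot 75\text{ mm} - 100\text{ mm} \cdot 25\text{ mm} \cdot \frac{1000\text{ mm}}{1000\text{ mm} - 100\text{ mm}}}{75\text{ mm} - (100\text{ mm} + 25\text{ mm}) \cdot \frac{1000\text{ mm}}{1000\text{ mm} - 100\text{ mm}} + \frac{100\text{ mm} \cdot 25\text{ mm}}{1000\text{ mm} - 100\text{ mm}}}
\cong 14.773\text{ mm} > BFL
\]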
In words, as the object distance decreases from infinity, the image distance moves back away from
the focal point.
Front Focal Length
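The front focal length (FFL) $\overline{FV}$ is the distance $z_1$ in the case where $z_2' = \infty$. It is calculated by
setting the denominator of the expression for $z_2'$ to zero:
\[
\begin{aligned}
(t - f_2) - \frac{z_1 f_1}{z_1 - f_1} &= 0 \\
\implies \frac{z_1 f_1}{z_1 - f_1} &= t - f_2 \\
\implies \frac{z_1}{z_1 - f_1} &= \frac{t - f_2}{f_1} \\
\implies z_1 f_1 &= (t - f_2)(z_1 - f_1) \\
\implies z_1 f_1 &= t z_1 - t f_1 - z_1 f_2 + f_1 f_2 \\
\implies z_1 (f_1 + f_2 - t) &= f_1 f_2 - t f_1
\end{aligned}
\]
\[
\lim_{z_2' \to \infty} z_1 = \overline{FV} = \frac{f_1 (f_2 - t)}{(f_1 + f_2) - t} = FFL
\]
Note that this expression has the same form as the back focal length except that $f_1$ and $f_2$ are
swapped.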
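The swap symmetry can be checked directly: reversing the order of the lenses should turn the front focal length into a back focal length, and an object placed at $z_1 = FFL$ should form its intermediate image at the front focal point of the second lens so that the output is collimated. The sketch below uses illustrative function names and the hypothetical values $f_1 = 100$ mm, $f_2 = 50$ mm, $t = 75$ mm.

```python
import math


def image_distance(z, f):
    """Thin-lens equation 1/z + 1/z' = 1/f solved for z' (math.inf if collimated)."""
    inv = 1.0 / f - 1.0 / z
    return math.inf if inv == 0 else 1.0 / inv


def back_focal_length(f1, f2, t):
    """BFL = (f1 - t) f2 / ((f1 + f2) - t)."""
    return (f1 - t) * f2 / ((f1 + f2) - t)


def front_focal_length(f1, f2, t):
    """FFL = f1 (f2 - t) / ((f1 + f2) - t)."""
    return f1 * (f2 - t) / ((f1 + f2) - t)


f1, f2, t = 100.0, 50.0, 75.0  # mm
ffl = front_focal_length(f1, f2, t)

# Reversing the system (swapping the lenses) turns the FFL into a BFL.
print("FFL            :", ffl)
print("BFL of reversed:", back_focal_length(f2, f1, t))

# An object at z1 = FFL must present lens 2 with an object at its focal point.
z2 = t - image_distance(ffl, f1)
print("object distance for lens 2:", z2, " (f2 =", f2, ")")
```

A negative FFL, as in this example, indicates that the object-space focal point lies to the right of the first lens, so the corresponding object is virtual.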
Front Focal Distance
Also note that the front focal distance (FFD) is the axial distance from an object to the first surface
(front vertex) of the imaging system; it applies for finite image distances. This is synonymous with the
term "working distance," a concept often used in microscopy. In terms of the image distance $z_2'$
measured from the second lens:
\[
FFD = \overline{OV} = \frac{f_1 \left(t - \frac{f_2 z_2'}{z_2' - f_2}\right)}{t - \frac{z_2' (f_1 + f_2) - f_1 f_2}{z_2' - f_2}}
\]
Object-Space Principal Point
We have already shown how to find the location of the equivalent single lens on the output side
by extending the rays entering and exiting the system until they meet. We can locate the equivalent
single lens in object space by reversing the system and introducing rays from the left again.
Since we know the distance from the object-space focal point to the object-space vertex and the
effective focal length, we can find the distance from the vertex to the principal point in object space.
\[
\overline{FH} = f_e = \frac{f_1 f_2}{(f_1 + f_2) - t}
= \overline{FV} + \overline{VH} = FFL + \overline{VH}
= \frac{f_1 (f_2 - t)}{(f_1 + f_2) - t} + \overline{VH}
\]
This implies that the distance from the object-space vertex to the object-space principal point is:
\[
\overline{VH} = \frac{f_1 f_2}{(f_1 + f_2) - t} - \frac{f_1 (f_2 - t)}{(f_1 + f_2) - t}
\]
\[
\overline{VH} = \frac{f_1 t}{(f_1 + f_2) - t}
\]
2.13.3 Summary of Distances for Two-Lens System

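\[
\begin{aligned}
f_e &= \overline{H'F'} = \overline{FH} = \frac{f_1 f_2}{(f_1 + f_2) - t} \\
BFL &= \overline{V'F'} = \frac{f_2 (f_1 - t)}{(f_1 + f_2) - t} \\
\overline{H'V'} &= \overline{H'F'} - \overline{V'F'} = \frac{f_2 t}{(f_1 + f_2) - t} \\
FFL &= \overline{FV} = \frac{f_1 (f_2 - t)}{(f_1 + f_2) - t} \\
\overline{VH} &= \overline{FH} - \overline{FV} = \frac{f_1 t}{(f_1 + f_2) - t}
\end{aligned}
\]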
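The five summary distances share one denominator, so they are convenient to evaluate together. The following is a minimal sketch; the container and field names are illustrative assumptions, and the sample values $f_1 = 100$ mm, $f_2 = 50$ mm, $t = 75$ mm are those of the worked example for two separated positive lenses in these notes.

```python
from typing import NamedTuple


class TwoLensDistances(NamedTuple):
    fe: float         # effective focal length, H'F' = FH
    bfl: float        # back focal length, V'F'
    hv_image: float   # H'V', image-space principal point to rear vertex
    ffl: float        # front focal length, FV
    vh_object: float  # VH, front vertex to object-space principal point


def two_lens_distances(f1, f2, t):
    """Evaluate the summary formulas for two thin lenses separated by t."""
    denom = (f1 + f2) - t
    if denom == 0:
        raise ValueError("afocal system (t = f1 + f2): the distances are infinite")
    return TwoLensDistances(
        fe=f1 * f2 / denom,
        bfl=f2 * (f1 - t) / denom,
        hv_image=f2 * t / denom,
        ffl=f1 * (f2 - t) / denom,
        vh_object=f1 * t / denom,
    )


d = two_lens_distances(100.0, 50.0, 75.0)  # mm
for name, value in d._asdict().items():
    print(f"{name:>9s} = {value:8.3f} mm")
# Consistency checks: H'V' = fe - BFL and VH = fe - FFL.
print(abs(d.hv_image - (d.fe - d.bfl)) < 1e-9, abs(d.vh_object - (d.fe - d.ffl)) < 1e-9)
```

Raising an error for the afocal case is a design choice of this sketch; returning infinities would be an equally reasonable convention.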
2.13.4 Effective Power of Two-Lens System
The expression for the power of the system composed of two lenses in air with focal lengths $f_1$ and
$f_2$ is:
\[
\phi_e\,[\text{diopters}] = \frac{1}{f_e\,[\text{m}]} = \frac{1}{f_1\,[\text{m}]} + \frac{1}{f_2\,[\text{m}]} - \frac{t}{f_1 f_2}
\]
\[
\phi_e\,[\text{diopters}] = \phi_1 + \phi_2 - \phi_1 \phi_2 t
\]
Clearly the power is zero if the separation distance $t$ is equal to the sum of focal lengths; this is the
recipe for a telescope. If the two lenses have positive power and the separation is just less than the
sum of focal lengths, the effective focal length can be very large. This is also the case if one of the
two lenses has negative power (so that the numerator is negative) and the separation is just larger
than the sum of the focal lengths (so that the denominator is negative and approximately zero).
2.13.5 Lenses in Contact: t = 0
If the lenses are in contact, then $t = 0$ and the front and back focal lengths are equal to the focal
length of the equivalent single thin lens:
\[
FFL = BFL = \frac{f_1 f_2}{f_1 + f_2} = f_e, \quad \text{if } t = 0
\]
\[
\phi = \frac{1}{f_e} = \frac{1}{f_1} + \frac{1}{f_2}, \quad \text{if } t = 0
\]
Two thin positive lenses in contact. The focal length of the system is shorter than the focal
length of either lens, and may be evaluated to see that $f_e = \frac{f_1 f_2}{f_1 + f_2}$. The image-space principal
point is the location of the equivalent thin lens. Since both lenses are thin, the principal point coincides
with the locations of both lenses, so that $V' = H' = H = V$.
The power of the system composed of two thin lenses in contact is the sum of the powers:
\[
\phi_e\,[\text{diopters}] = \phi_1 + \phi_2 - \phi_1 \phi_2 \cdot 0 = \phi_1 + \phi_2 \quad \text{for two thin lenses in contact}
\]
This is the assumed system for the magnifier with the lens held close to the eye.
2.13.6 Positive Lenses Separated by t < f1 + f2
If two positive thin lenses are separated by less than the sum of the focal lengths, the image-space
focal point $F'$ is closer to the first lens than it would have been had the second lens been absent. As
shown, the effective focal length of the system is $f_e < f_1$. We can apply the equation for $f_e$ to this
case to see that:
\[
f_e = \frac{f_1 f_2}{(f_1 + f_2) - t} > 0 \quad \text{if } f_1 + f_2 > t > 0
\]
A pair of positive thin lenses separated by less than the sum of the focal lengths.
Consider a specific example with f1 = 100 mm, f2 = 50 mm, and t = 75 mm. The focal length of
the equivalent single lens is:
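\[ f_e = \frac{f_1 f_2}{(f_1 + f_2) - t} = \frac{(100\,\mathrm{mm})(50\,\mathrm{mm})}{(100\,\mathrm{mm} + 50\,\mathrm{mm}) - 75\,\mathrm{mm}} = \frac{200}{3}\,\mathrm{mm} = 66\tfrac{2}{3}\,\mathrm{mm} \]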
The image formed by the first lens is located at its focal point:
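\[ z_1' = \left( \frac{1}{f_1} - \frac{1}{z_1} \right)^{-1} = \left( \frac{1}{100\,\mathrm{mm}} - \frac{1}{\infty} \right)^{-1} = 100\,\mathrm{mm} \]
The object distance to the second lens is therefore the difference $t - z_1'$:
\[ z_2 = t - z_1' = 75\,\mathrm{mm} - 100\,\mathrm{mm} = -25\,\mathrm{mm} \]
The image of an object located at $z_1 = \infty$ appears at $z_2'$:
\[ z_2' = \left( \frac{1}{f_2} - \frac{1}{z_2} \right)^{-1} = \left( \frac{1}{50\,\mathrm{mm}} - \frac{1}{-25\,\mathrm{mm}} \right)^{-1} = \frac{50}{3}\,\mathrm{mm} = 16\tfrac{2}{3}\,\mathrm{mm} \]
\[ \mathrm{V'F'} = 16\tfrac{2}{3}\,\mathrm{mm} \]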
measured from the rear vertex V' of the system. We already know that the system focal length is
66 2/3 mm, so the image-space principal point H' (the position of the equivalent thin lens) is located
66 2/3 mm IN FRONT of the system focal point, i.e., 50 mm in front of the second lens and 25 mm
behind the first lens.
\[ \mathrm{H'F'} = f_e = 66\tfrac{2}{3}\,\mathrm{mm} \]
\[ \mathrm{V'F'} = \mathrm{BFL} = 16\tfrac{2}{3}\,\mathrm{mm} \]
\[ \mathrm{H'V'} = \mathrm{H'F'} - \mathrm{V'F'} = 66\tfrac{2}{3}\,\mathrm{mm} - 16\tfrac{2}{3}\,\mathrm{mm} = 50\,\mathrm{mm} \]
We have already shown how to find the location of the equivalent single lens on the output side
by extending the rays entering and exiting the system until they meet. We can locate the equivalent
single lens in object space by reversing the system, as shown in the figure. The first lens in
the system is now (what we have called the second lens) L2 with f2 = 50 mm. The second lens is
L1 with f1 = 100 mm and the separation is t = 75 mm. The resulting effective focal length remains
unchanged at fe = 200/3 mm = 66 2/3 mm. If we bring in a ray from an object at infinity, the
intermediate image formed by L2 is located at the focal point of L2:
\[ z_1' = \left( \frac{1}{f_2} - \frac{1}{z_1} \right)^{-1} = \left( \frac{1}{50\,\mathrm{mm}} - \frac{1}{\infty} \right)^{-1} = 50\,\mathrm{mm} \]
Thus the object distance to L1 is:
\[ z_2 = t - z_1' = 75\,\mathrm{mm} - 50\,\mathrm{mm} = +25\,\mathrm{mm} \]
The image of the object at $z_1 = \infty$ produced by the entire system is located at $z_2'$:
\[ z_2' = \left( \frac{1}{f_1} - \frac{1}{z_2} \right)^{-1} = \left( \frac{1}{100\,\mathrm{mm}} - \frac{1}{+25\,\mathrm{mm}} \right)^{-1} = -\frac{100}{3}\,\mathrm{mm} = -33\tfrac{1}{3}\,\mathrm{mm} \]
measured from the second lens L1 (or equivalently from the second vertex). The image distance is
negative, so the image lies in front of the second lens and is thus virtual. The object-space principal
point H is the point such that the distance FH = fe = 66 2/3 mm, which means that H is located
33 1/3 mm + 66 2/3 mm = 100 mm in front of L1, i.e., 25 mm IN FRONT of L2 in the reversed system.
The object-space principal point H may be located by reversing the system and bringing in a
ray from an object at infinity.
When we re-reverse the system to graph the object- and image-space principal points, H is
located behind the lens L2 , as shown in the graphical rendering of the entire system:
The principal and focal points of the two-lens imaging system in both object and image spaces.
The object-space principal point is the location of the equivalent thin lens if the imaging system
is reversed. We can now use these locations of the equivalent thin lens in the two spaces to locate
the images by applying the thin-lens (Gaussian) imaging equation, BUT the distances z and z' are
respectively measured from the object O to the object-space principal point H and from the
image-space principal point H' to the image point O'. The process is demonstrated after first locating
the images via a direct calculation.
Brute Force Calculation of Image

Now consider the location and magnification of the image created by the original two-lens imaging
system (with L1 in front) for an object located 1000 mm in front of the system (so that OV =
1000 mm). We can locate the image step by step:
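Intermediate image created by L1:
\[ z_1' = \left( \frac{1}{f_1} - \frac{1}{z_1} \right)^{-1} = \left( \frac{1}{100\,\mathrm{mm}} - \frac{1}{1000\,\mathrm{mm}} \right)^{-1} = \frac{1000}{9}\,\mathrm{mm} \cong 111.11\,\mathrm{mm} \]
Transverse magnification of intermediate image:
\[ (M_T)_1 = -\frac{z_1'}{z_1} = -\frac{\frac{1000}{9}\,\mathrm{mm}}{1000\,\mathrm{mm}} = -\frac{1}{9} \]
Distance from intermediate image to L2:
\[ z_2 = t - z_1' = 75\,\mathrm{mm} - \frac{1000}{9}\,\mathrm{mm} = -\frac{325}{9}\,\mathrm{mm} \cong -36.11\,\mathrm{mm} \]
Distance from L2 to final image:
\[ z_2' = \left( \frac{1}{f_2} - \frac{1}{z_2} \right)^{-1} = \left( \frac{1}{50\,\mathrm{mm}} - \frac{1}{-\frac{325}{9}\,\mathrm{mm}} \right)^{-1} = +\frac{650}{31}\,\mathrm{mm} \cong +20.97\,\mathrm{mm} \]
Transverse magnification of second image:
\[ (M_T)_2 = -\frac{z_2'}{z_2} = -\frac{\frac{650}{31}\,\mathrm{mm}}{-\frac{325}{9}\,\mathrm{mm}} = +\frac{18}{31} \]
The transverse magnification of the image from the entire system is the product of the transverse
magnifications from each lens:
\[ M_T = (M_T)_1 \cdot (M_T)_2 = \left( -\frac{1}{9} \right) \left( +\frac{18}{31} \right) = -\frac{2}{31} \]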
which indicates that the image is minified and inverted.
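The step-by-step calculation can be reproduced with exact rational arithmetic. This is a minimal sketch, assuming the sign convention used above (z' = (1/f - 1/z)^-1, z2 = t - z1', and M_T = -z'/z); the helper name `thin_lens_image` is illustrative.

```python
from fractions import Fraction


def thin_lens_image(z, f):
    """Solve the Gaussian thin-lens equation 1/z + 1/z' = 1/f for z'.

    Raises ZeroDivisionError when z == f (image at infinity).
    """
    return 1 / (Fraction(1) / f - Fraction(1) / z)


f1, f2, t = 100, 50, 75          # mm
z1 = Fraction(1000)              # object 1000 mm in front of L1

z1p = thin_lens_image(z1, f1)    # intermediate image distance from L1
m1 = -z1p / z1                   # magnification of the first lens
z2 = t - z1p                     # object distance for L2 (negative: virtual object)
z2p = thin_lens_image(z2, f2)    # final image distance from L2
m2 = -z2p / z2                   # magnification of the second lens
mt = m1 * m2                     # system magnification

print("z1' =", z1p, "mm")
print("z2  =", z2, "mm")
print("z2' =", z2p, "mm")
print("M_T =", mt)
```

Using `Fraction` keeps the intermediate results as exact ratios (1000/9, -325/9, 650/31), so they can be compared directly with the hand calculation.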
Imaging Equation using Principal Points
We have just seen that the object- and image-space principal points are the reference locations
from which the system focal length is measured:
\[ f_e = \mathrm{FH} = \mathrm{H'F'} \]
In exactly the same way, these principal points are the reference locations from which the object
and image distances are measured:
\[ z = \mathrm{OH} \]
\[ z' = \mathrm{H'O'} \]
The ray entering the system can be modeled as traveling from the object O to the object-space
principal point H. The resulting outgoing (image) ray travels from the image-space principal point
H' to the image point O'. This may seem a little weird, but actually makes perfect sense if we
relate the measurements to the equation for a single thin lens. In that situation, focal lengths are
measured from the object-space focal point to the thin lens and from the lens to the image-space
focal point. In other words, the object- and image-space vertices V and V' of a thin lens coincide
with the principal points H and H'. We know that an object located at the lens (z = 0) generates
an image at the lens (z' = 0) with magnification of +1; the heights of the object and image at the
principal points are identical. In the realistic system where the object- and image-space principal
points are at different locations, the image of an object located at the object-space principal point
is formed at the image-space principal point with unit transverse magnification MT = +1. In other
words, the principal points are the locations of conjugate points with unit transverse magnification.
Contrast this with the situation where the object distance OH = 2f, so that the image distance
H'O' = 2f with transverse magnification MT = -1:
\[ \mathrm{OH} = z = 2f \]
\[ \frac{1}{z} + \frac{1}{z'} = \frac{1}{f} \implies z' = \mathrm{H'O'} = 2f \]
\[ M_T = -\frac{2f}{2f} = -1 \]
This case, where the object and image distances are equal so that the transverse magnification is -1,
is often called imaging at equal conjugates.
Note the positions of the principal and focal planes of the system we just analyzed: f1 = +100 mm,
f2 = +50 mm, and t = +75 mm. The principal points are crossed, which means that the object-space
principal point is farther towards image space than the image-space principal point (H is
behind H'). Such a system is more compact, because the image is closer to the object-space
principal point, so that F' is closer than V'O'.
Principal points of an imaging system: The dashed ray from the object at O reaches the
object-space principal point H with height h. The image ray (solid line) departs from the
image-space principal point H' with the same height h and goes to the image point O', so that the
distances OH = z and H'O' = z' satisfy the imaging equation $\frac{1}{z} + \frac{1}{z'} = \frac{1}{f_e}$.
Location of Image using Principal Points

We can also analyze this system by using the model of the single thin lens located at the object-
and image-space principal points. We have already shown that the focal length of the system is:
\[ f_e = \mathrm{FH} = \mathrm{H'F'} = +\frac{200}{3}\,\mathrm{mm} \]
The object and image distances z and z' of the single lens equivalent to the two-lens system are
respectively measured from the principal points: z = OH and z' = H'O'.
The object distance is measured to the object-space principal point, which is 100 mm behind L1 (or
V), thus the object distance is the distance from O to L1 plus 100 mm:
\[ z = \mathrm{OV} + \mathrm{VH} = 1000\,\mathrm{mm} + 100\,\mathrm{mm} = 1100\,\mathrm{mm} \]

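The single-lens imaging equation may be used to find the image distance z', which now is
MEASURED FROM THE IMAGE-SPACE PRINCIPAL POINT H' (and NOT from the image-space
vertex V').
\[ z' = \left( \frac{1}{f_e} - \frac{1}{z} \right)^{-1} = \left( \frac{1}{\frac{200}{3}\,\mathrm{mm}} - \frac{1}{1100\,\mathrm{mm}} \right)^{-1} = \mathrm{H'O'} = \frac{2200}{31}\,\mathrm{mm} \cong 70.97\,\mathrm{mm} \]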
The image distance from the vertex is calculated by subtracting the distance from the image-space
principal point H' to the image-space vertex V':
\[ \mathrm{V'O'} = \mathrm{H'O'} - \mathrm{H'V'} = \frac{2200}{31}\,\mathrm{mm} - 50\,\mathrm{mm} = \frac{650}{31}\,\mathrm{mm} \cong +20.97\,\mathrm{mm} \]
The resulting transverse magnification is:
\[ M_T = -\frac{z'}{z} = -\frac{\frac{2200}{31}\,\mathrm{mm}}{1100\,\mathrm{mm}} = -\frac{2}{31} \cong -0.065 \]
Both the image distance and the transverse magnification match the values obtained with the step-
by-step calculation performed above (as they must!).
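The principal-point method can be checked the same way. The sketch below assumes the two-lens distance formulas for fe, VH and H'V' listed in the summary of distances; the variable names are illustrative.

```python
from fractions import Fraction

f1, f2, t = 100, 50, 75              # mm
d = Fraction(f1 + f2 - t)            # common denominator (f1 + f2) - t

fe = f1 * f2 / d                     # effective focal length, 200/3 mm
VH = f1 * t / d                      # front vertex to object-space principal point
HpVp = f2 * t / d                    # image-space principal point to rear vertex

OV = 1000                            # object to front vertex, mm
z = OV + VH                          # object distance measured to H
zp = 1 / (1 / fe - 1 / z)            # image distance measured from H'
VpOp = zp - HpVp                     # image distance measured from the rear vertex
mt = -zp / z                         # transverse magnification

print("z    =", z, "mm")
print("z'   =", zp, "mm")
print("V'O' =", VpOp, "mm")
print("M_T  =", mt)
```

The rear-vertex image distance (650/31 mm) and the magnification (-2/31) agree with the lens-by-lens trace, which is the consistency check the text describes.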
2.13.7 Cardinal Points

The object-space and image-space focal and principal points are four of the six so-called cardinal
points that determine the paraxial properties of an imaging system. There are three pairs of locations
where one of each pair is in object space and the other is in image space. The object- and
image-space focal points are F and F', while the principal points H and H' are the locations on the axis
in object and image space that are images of each other with transverse magnification MT = +1.
The nodal points N and N' are the points in object and image space where the ray angles of the
entering and exiting rays are identical, which means that the angular magnification of rays into
and out of the nodal points is $M_\theta = +1$. The principal and nodal points coincide for systems with
the object and image spaces in the same medium (e.g., both object space and image space in air).
A table of significant points on the axis of a paraxial system is given below:

Axial Point            Object Space (front)   Image Space (back)   Conjugate Points? (object and image?)
Focal Points           F                      F'                   No
Nodal Points           N                      N'                   Yes: M_theta = +1
Principal Points       H                      H'                   Yes: MT = +1
Vertices               V                      V'                   No
Object/Image           O                      O'                   Yes: MT = -H'O'/OH = -z'/z
Entrance/Exit Pupils   E                      E'                   Yes: MT varies
Equal Conjugates       OH = 2 fe              z' = H'O' = 2 fe     Yes: MT = -1

2.13.8 Lenses separated by t = f1 + f2 : Afocal System (Telescope)
If the two lenses are separated by the sum of the focal lengths, then an object at infinity forms an
image at infinity; the system focal length is infinite. Since the focal points are both located at
infinity, we say that the system is afocal; it has zero power, i.e., rays that enter the system parallel
to one another also exit parallel to one another. If the focal length of the first lens is longer than
that of the second, the system is a telescope.
Two thin lenses separated by the sum of their focal lengths. An object located an infinite distance
from the first lens forms an intermediate image at the image-space focal point F1' of the first lens.
The second lens forms an image at infinity. Both object- and image-space focal lengths of the
equivalent system are infinite: $f = f' = \infty$. The system has no focal points; it is afocal.
The focal length of this system is:
\[ \frac{1}{f_e} = 0 = \frac{1}{f_1} + \frac{1}{f_2} - \frac{t}{f_1 f_2} \]
\[ \implies \frac{t}{f_1 f_2} = \frac{1}{f_1} + \frac{1}{f_2} = \frac{f_1 + f_2}{f_1 f_2} \]
\[ \implies t = f_1 + f_2 \]
which shows that the separation between the two lenses is t = f1 + f2.
Angular Magnification of a Telescope
The telescope has infinite focal length and therefore no power, but you already know that it does
something. Consider the system's effect on a ray that enters the first lens at its center at angle $\theta$,
so it is transmitted through the lens with no change in angle. The ray crosses the axis at
the first lens and travels the distance z2 = f1 + f2 to the second lens, where it is deviated to make
the angle $\theta'$ with the optical axis. We need to relate $\theta$ and $\theta'$ to evaluate the angular magnification.
Angular magnification of a telescope: the red ray strikes the center of the first lens at angle $\theta$ and
is transmitted without deviation (because the sides are parallel at the center and the lens is thin).
The ray is deviated by the second lens at angle $\theta'$. The angular magnification is the ratio of these
two angles.
From the figure, note that the angle of the entering ray is positive and that of the exiting ray is
negative. The angle of the entering ray may be determined from the triangle between the lenses
with sides (f1 + f2) and h:
\[ \tan[\theta] \cong \theta = \frac{h}{f_1 + f_2} \]
To find the exiting angle $\theta'$, we need to find the distance from the second lens to the point where
the ray crosses the axis. This is easy to find using the imaging equation for a thin lens in air:
\[ \frac{1}{z_2} + \frac{1}{z_2'} = \frac{1}{f_2} \implies z_2' = \frac{z_2 f_2}{z_2 - f_2} \]
where the object distance z2 is the distance between the lenses:
\[ z_2 = t = f_1 + f_2 \]
so the image distance for the red ray is:
\[ z_2' = \frac{z_2 f_2}{z_2 - f_2} = \frac{(f_1 + f_2) f_2}{(f_1 + f_2) - f_2} = \frac{f_2}{f_1} (f_1 + f_2) \]
The angle $\theta'$ satisfies the condition:
\[ \tan[\theta'] \cong \theta' = -\frac{h}{z_2'} = -\frac{h}{\frac{f_2}{f_1} (f_1 + f_2)} = -\frac{f_1}{f_2} \cdot \frac{h}{f_1 + f_2} \]
So the angular magnification is:
\[ M_\theta = \frac{\theta'}{\theta} = \frac{-\frac{f_1}{f_2} \cdot \frac{h}{f_1 + f_2}}{\frac{h}{f_1 + f_2}} = -\frac{f_1}{f_2} \]
where the negative sign means that the two angles have different algebraic signs. In words, the
magnitude of the angular magnification of a telescope is the ratio of the focal length of the objective
to that of the ocular. If the two lenses
are both positive (Keplerian telescope), then the angular magnification is negative. If the objective
(first lens) has positive power and the ocular (second lens) is negative (Galilean telescope), then
the angular magnification is positive.
The angular magnification shows that two distant objects separated by a small angle (as a double
star in the sky) will be separated by a larger angle if viewed through a telescope.
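The result for the angular magnification can be cross-checked with 2x2 paraxial ray-transfer matrices. This is a sketch under stated assumptions: the matrix formalism, with ray vector (height, angle), thin-lens matrix [[1, 0], [-1/f, 1]] and free-space matrix [[1, t], [0, 1]], is not developed in this section, and the helper names are illustrative. For an afocal system the lower-left element vanishes and the lower-right element is the angular magnification.

```python
def matmul2(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [
        [a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]],
        [a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]],
    ]


def thin_lens(f):
    """Ray-transfer matrix of a thin lens in air."""
    return [[1.0, 0.0], [-1.0 / f, 1.0]]


def gap(t):
    """Ray-transfer matrix of free propagation over distance t."""
    return [[1.0, t], [0.0, 1.0]]


def telescope_matrix(f1, f2):
    """System matrix for two thin lenses separated by t = f1 + f2."""
    return matmul2(thin_lens(f2), matmul2(gap(f1 + f2), thin_lens(f1)))


keplerian = telescope_matrix(100.0, 25.0)    # both lenses positive
galilean = telescope_matrix(100.0, -25.0)    # negative ocular

for name, m in (("Keplerian", keplerian), ("Galilean", galilean)):
    print(f"{name}: power term = {m[1][0]:+.3e}, angular magnification = {m[1][1]:+.2f}")
```

Both systems have zero power, and the angular magnifications come out as -4 (Keplerian) and +4 (Galilean), i.e., -f1/f2 in each case.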
2.13.9 Positive Lenses Separated by t = f1 or t = f2
We now continue the sequence of examples for two positive lenses separated by increasing distances.
If two positive lenses are separated by the focal length of the first lens, then the focal length of the
system is:
\[ f_e = \frac{f_1 f_2}{(f_1 + f_2) - f_1} = \frac{f_1 f_2}{f_2} = f_1 \quad (\text{if } t = f_1) \]
In words, the focal length of a system of two lenses separated by the focal length of the first lens is
equal to the focal length of the first lens.
If the two lenses are separated by the focal length of the second lens, then the system focal length
is f2:
\[ f_e = \frac{f_1 f_2}{(f_1 + f_2) - f_2} = \frac{f_1 f_2}{f_1} = f_2 \quad (\text{if } t = f_2) \]
Recall that the transverse magnification is approximately proportional to the focal length if the
object is distant:
\[ M_T = -\frac{z'}{z} = -\frac{1}{z} \cdot \frac{z f}{z - f} = -\frac{f}{z - f} = -\frac{f}{z} \cdot \frac{1}{1 - \frac{f}{z}} \]
\[ = -\frac{f}{z} \sum_{n=0}^{\infty} \left( \frac{f}{z} \right)^{n} = -\sum_{n=0}^{\infty} \left( \frac{f}{z} \right)^{n+1} \]
\[ \cong -\frac{f}{z} \quad \text{if } z \gg f \]
where the formula for the converging geometric series has been used. In words, the transverse
magnification of a distant object formed by an imaging system is approximately proportional to the
focal length (which is why long focal lengths are used to image distant objects).
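A quick numerical check shows how fast the series converges for distant objects. This is a minimal sketch; the function names `exact_mt` and `series_mt` and the sample distances are illustrative choices.

```python
def exact_mt(z, f):
    """Exact transverse magnification of a thin lens, M_T = -f / (z - f)."""
    return -f / (z - f)


def series_mt(z, f, terms):
    """Partial sum of the geometric series -sum_{n>=0} (f/z)**(n+1); needs z > f."""
    ratio = f / z
    return -sum(ratio ** (n + 1) for n in range(terms))


f = 100.0  # mm
for z in (1_000.0, 10_000.0, 100_000.0):
    exact = exact_mt(z, f)
    first = series_mt(z, f, 1)        # leading term, -f/z
    rel_err = abs(exact - first) / abs(exact)
    print(f"z = {z:9.0f} mm  exact = {exact:+.6f}  -f/z = {first:+.6f}  rel. error = {rel_err:.4f}")
```

The relative error of the leading term is f/z itself, so the approximation M_T = -f/z is good to about 1% once the object is 100 focal lengths away.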
For the purpose of this example, we analyze the second case because it is the basis for probably
the most common application of imaging optics. The extension to the first case is trivial. Since
the focal length of the system is identical to the focal length of the second lens, the natural
question is how the image changes when the front lens is added.
Effect of adding lens L1 at the object-space focal point of lens L2, so that t = f2 and fe = f2. The
upper sketch is the lens L2 alone, and the lower drawing shows the situation with L1 added.
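Consider a specific case with f2 = 100 mm and f1 = 200 mm. If only L2 is present and the object
distance is z2 = 1100 mm, then the image distance is:
\[ z_2' = \left( \frac{1}{f_2} - \frac{1}{z_2} \right)^{-1} = \left( \frac{1}{100\,\mathrm{mm}} - \frac{1}{1100\,\mathrm{mm}} \right)^{-1} = 110\,\mathrm{mm} \]
The associated transverse magnification is:
\[ (M_T)_{L_2\ \text{alone}} = -\frac{z_2'}{z_2} = -\frac{+110\,\mathrm{mm}}{+1100\,\mathrm{mm}} = -\frac{1}{10} \]
Now add L1 at the front focal point of L2 and find the associated image. The object distance to
L1 is 1100 mm - 100 mm = 1000 mm. The first lens forms an image at distance:
\[ z_1' = \left( \frac{1}{f_1} - \frac{1}{z_1} \right)^{-1} = \left( \frac{1}{200\,\mathrm{mm}} - \frac{1}{1000\,\mathrm{mm}} \right)^{-1} = 250\,\mathrm{mm} \]
with transverse magnification:
\[ (M_T)_1 = -\frac{z_1'}{z_1} = -\frac{+250\,\mathrm{mm}}{+1000\,\mathrm{mm}} = -\frac{1}{4} \]
The object distance to the second lens is:
\[ z_2 = t - z_1' = 100\,\mathrm{mm} - 250\,\mathrm{mm} = -150\,\mathrm{mm} \]
and the resulting image distance behind lens L2 is:
\[ z_2' = \left( \frac{1}{f_2} - \frac{1}{z_2} \right)^{-1} = \left( \frac{1}{100\,\mathrm{mm}} - \frac{1}{-150\,\mathrm{mm}} \right)^{-1} = +60\,\mathrm{mm} \]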
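Compare the image distances behind lens L2 and the system focal lengths without and with L1 in
the system:
\[ z_2'\,(\text{without } L_1) = \mathrm{V'O'}\,(\text{without } L_1) = +110\,\mathrm{mm} > \mathrm{V'O'}\,(\text{with } L_1) = +60\,\mathrm{mm} \]
so the image has moved closer to lens L2, while the focal length is unchanged:
\[ f_e\,(\text{without } L_1) = 100\,\mathrm{mm} = f_e\,(\text{with } L_1) \]
Now check the other attributes of the image. Recall that MT = -0.1 if using L2 alone. If using
both lenses, the transverse magnification of the image formed by the second lens is:
\[ (M_T)_2 = -\frac{60\,\mathrm{mm}}{-150\,\mathrm{mm}} = +\frac{2}{5} \]
The magnification of the system is the product of the magnifications due to each lens:
\[ M_T\ \text{for system with } L_1 \text{ and } L_2 = (M_T)_1 \cdot (M_T)_2 = \left( -\frac{1}{4} \right) \left( +\frac{2}{5} \right) = -\frac{1}{10} = M_T\ \text{for } L_2 \text{ alone} \]
\[ M_T\,(\text{without } L_1) = M_T\,(\text{with } L_1) \quad \text{if } t = f_2 \]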
which is the same as for lens L2 alone! The transverse magnification of the system is not
changed by the addition of lens L1 with focal length f1 placed at the front focal point
of lens L2. If f1 > 0, the image distance measured from L2 is shorter if L1 is present than if L1
is missing. Conversely, if the first lens has negative power (f1 < 0), the image distance measured
from L2 is longer if L1 is present than if L1 is missing. Put another way, the addition of lens L1
located at the object-space focal point of lens L2 moves the principal points and focal points by
equal distances either forward (towards L2) if f1 > 0 or backwards (farther from L2) if f1 < 0,
but the focal length is unchanged. This system demonstrates the principle of eyeglass lenses,
where the ideal location for the corrective lens is at the object-space focal point of the eyelens (this
is the reason that eyeglasses are on your nose). The corrective action of a negative lens L1 placed
at the front focal point of L2 moves the image location backwards (away from L2) to correct
nearsightedness without changing the transverse magnification of the imaging system. A positive
lens L1 placed at the front focal point of L2 will move the image forwards (towards L2) to correct
farsightedness.
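The invariance of the magnification can be verified for both a positive and a negative front lens. The sketch below assumes the same thin-lens conventions as the example (object 1100 mm in front of L2, f2 = 100 mm, t = f2); the function names and the choice f1 = -200 mm for the negative case are illustrative.

```python
from fractions import Fraction


def image_distance(z, f):
    """Thin-lens image distance z' = (1/f - 1/z)**-1."""
    return 1 / (Fraction(1) / f - Fraction(1) / z)


def with_front_lens(f1, f2, object_to_l2):
    """Image distance behind L2 and system M_T with L1 placed at t = f2 in front of L2."""
    t = f2
    z1 = object_to_l2 - t            # object distance to L1
    z1p = image_distance(z1, f1)
    z2 = t - z1p                     # object distance to L2
    z2p = image_distance(z2, f2)
    return z2p, (-z1p / z1) * (-z2p / z2)


f2 = 100
object_to_l2 = Fraction(1100)

z2p_alone = image_distance(object_to_l2, f2)
mt_alone = -z2p_alone / object_to_l2

z2p_pos, mt_pos = with_front_lens(200, f2, object_to_l2)    # positive L1
z2p_neg, mt_neg = with_front_lens(-200, f2, object_to_l2)   # negative L1

print("L2 alone:      z2' =", z2p_alone, " M_T =", mt_alone)
print("f1 = +200 mm:  z2' =", z2p_pos, " M_T =", mt_pos)
print("f1 = -200 mm:  z2' =", z2p_neg, " M_T =", mt_neg)
```

The image moves from 110 mm to 60 mm with the positive lens and to 160 mm with the negative lens, while M_T stays at -1/10 in all three cases.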
2.13.10 Positive Lenses Separated by t > f1 + f2
If the two positive lenses are separated by more than the sum of the focal lengths, the focal length
of the resulting system is negative:
\[ f_e = \frac{f_1 f_2}{(f_1 + f_2) - t} < 0 \]
If the object distance is infinite, the first lens forms an intermediate image at its image-space focal
point, i.e., at $z_1' = f_1$. Since the object distance z2 measured from the second lens is larger than f2, a
real image is formed by the second lens at the system focal point F'. If we extend the exiting ray
until it intersects the incoming ray from the object at infinity, we can locate the equivalent single
thin lens for the system, i.e., the image-space principal point H'. In this case, this is located farther
from the second lens than the focal point. The effective focal length fe = H'F' < 0, so the system
has negative power.
The system composed of two thin lenses separated by t > f1 + f2. The image-space focal point F' of
the system is beyond the second lens, but the image-space principal point H' is located even farther
from L2. The distance H'F' = fe < 0, so the system has negative power!
2.13.11 Compound Microscopes

We have already discussed the simple magnifier, where the object is located closer to the positive
lens than the focal length, thus forming a larger upright virtual image close to the near point of
the eye. In the compound magnifier (more commonly called the compound microscope) formed from
two lenses, the objective and eyelens generally have a short positive focal length and a longer focal
length, respectively. The focal points of the two lenses are separated by a fixed distance, the tube
length, which is now standardized by the Royal Microscopical Society as t = 160 mm, though some
companies manufacture other lengths (e.g., Leitz with t = 170 mm). Although it does not matter in
this class, it is important to ensure that the objective is used with the correct tube length to minimize
aberrations in the final image.
Modern microscope systems are often infinity corrected, which means that the object is located
in the front focal plane of the objective so that the rays emerging are parallel (collimated). This
feature allows a beamsplitter to be introduced in the light path for a second eyelens, camera, or other
apparatus. A lens within the microscope tube (the tube lens, duh) creates an intermediate image
that is viewed by the eyelens. In more traditional microscopes, the object typically is located just
beyond the focal point of the short-focal-length positive objective lens (so that the object distance
$z_1 \cong f_1$), thus forming a large real inverted image that is positioned at the front focal point of the
ocular (eyelens). The eyelens then forms an image at infinity, i.e., the parallel rays emerging from
the ocular are viewed by a relaxed eye.
Microscope objectives and eyepieces are labeled by magnifying powers, e.g., 10X to 40X for the
objective and 10X for the ocular. The total magnification is the product, so that a 10X objective
and 10X ocular yield a magnification of 100X.
The magnifying power of an objective with focal length f1 and tube length 160 mm is:
\[ M_1 = -\frac{160\,\mathrm{mm}}{f_1} \]
For example, objectives with these focal lengths have magnifying powers:
\[ f_1 = 16\,\mathrm{mm} \implies M_1 = -10\mathrm{X} \]
\[ f_1 = 1.6\,\mathrm{mm} \implies M_1 = -100\mathrm{X} \]
The magnifying power of the eyelens is calculated from the same formula used for the simple
magnifier:
\[ M_2 = \frac{250\,\mathrm{mm}}{f_2} \]
with sample value:
\[ f_2 = 25.4\,\mathrm{mm} \implies M_2 \cong 10\mathrm{X} \]
The magnifying power of the compound microscope is the product of the two magnifying powers:
\[ \mathrm{M.P.} = M_1 \cdot M_2 = -\frac{160\,\mathrm{mm}}{f_1} \cdot \frac{250\,\mathrm{mm}}{f_2} \]
\[ = -\frac{160\,\mathrm{mm}}{1.6\,\mathrm{mm}} \cdot \frac{250\,\mathrm{mm}}{25.4\,\mathrm{mm}} \cong -1000\mathrm{X} \]
where again the negative sign means that the image is inverted.
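The magnifying-power formulas are simple enough to tabulate. This is a minimal sketch, assuming the 160 mm tube length and the 250 mm near-point distance used above; the function name and default arguments are illustrative.

```python
def microscope_power(f_objective_mm, f_eyelens_mm, tube_mm=160.0, near_point_mm=250.0):
    """Return (objective power, eyelens power, total magnifying power).

    The objective power is negative because the intermediate image is inverted.
    """
    m_objective = -tube_mm / f_objective_mm
    m_eyelens = near_point_mm / f_eyelens_mm
    return m_objective, m_eyelens, m_objective * m_eyelens


for f_obj in (16.0, 1.6):
    m1, m2, total = microscope_power(f_obj, 25.4)
    print(f"f1 = {f_obj:4.1f} mm: M1 = {m1:7.1f}X, M2 = {m2:5.2f}X, M.P. = {total:8.1f}X")
```

With f2 = 25.4 mm the eyelens power is 9.84X rather than exactly 10X, so the 1.6 mm objective gives about -984X, which is the value rounded to -1000X in the text.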
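2.13.12 Two Positive Lenses with Different Focal Lengths and Different Separations
From the list of distances for a two-lens system:
\[ f_e = \mathrm{H'F'} = \mathrm{FH} = \frac{f_1 f_2}{(f_1 + f_2) - t} \]
\[ \mathrm{BFL} = \mathrm{V'F'} = \frac{(f_1 - t) f_2}{(f_1 + f_2) - t} \]
\[ \mathrm{H'V'} = \mathrm{H'F'} - \mathrm{V'F'} = \frac{f_2\, t}{(f_1 + f_2) - t} \]
\[ \mathrm{FFL} = \mathrm{FV} = \frac{f_1 (f_2 - t)}{(f_1 + f_2) - t} \]
\[ \mathrm{VH} = \mathrm{FH} - \mathrm{FV} = \frac{f_1\, t}{(f_1 + f_2) - t} \]
we can determine the impact of the lens separation t for the specific example: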
f1 = +100 mm
f2 = +25 mm

t                      FFL            BFL            fe
0 mm                   +20 mm         +20 mm         +20 mm
+25 mm = f2            0 mm           +18.75 mm      +25 mm = f2
+50 mm                 -33 1/3 mm     +16 2/3 mm     +33 1/3 mm
+75 mm                 -100 mm        +12.5 mm       +50 mm
+100 mm = f1           -300 mm        0 mm           +100 mm = f1
+125 mm = f1 + f2      infinite       infinite       infinite (afocal)
+150 mm                +500 mm        +50 mm         -100 mm
+175 mm                +300 mm        +37.5 mm       -50 mm

The effect of varying the lens separation t on the effective focal length fe for f1 = +100 mm and
f2 = +25 mm, with a magnified view in (b). The system is afocal if t = f1 + f2 = 125 mm; fe > 0
for t < f1 + f2 and fe < 0 for t > f1 + f2.
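The table entries follow directly from the three formulas for FFL, BFL and fe. The following sketch regenerates them, assuming thin lenses in air; `table_row` is an illustrative helper name.

```python
def table_row(f1, f2, t):
    """Return (FFL, BFL, fe) for separation t, or None if the system is afocal."""
    d = (f1 + f2) - t
    if d == 0:
        return None
    return f1 * (f2 - t) / d, f2 * (f1 - t) / d, f1 * f2 / d


f1, f2 = 100.0, 25.0  # mm
rows = {t: table_row(f1, f2, t) for t in (0, 25, 50, 75, 100, 125, 150, 175)}

print(f"{'t [mm]':>7s} {'FFL [mm]':>10s} {'BFL [mm]':>10s} {'fe [mm]':>10s}")
for t, row in rows.items():
    if row is None:
        print(f"{t:7d} {'afocal':>10s}")
    else:
        ffl, bfl, fe = row
        print(f"{t:7d} {ffl:10.2f} {bfl:10.2f} {fe:10.2f}")
```

Scanning the output shows the sign change of fe on either side of t = f1 + f2 = 125 mm, and that the FFL and BFL pass through zero at t = f2 and t = f1, respectively.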
2.13.13 Systems of One Positive and One Negative Lens

We also consider the case where f1 = +100 mm and f2 = -25 mm. The focal length for t = 0 is:
\[ f_e = \left( \frac{1}{f_1} + \frac{1}{f_2} \right)^{-1} = \left( \frac{1}{+100\,\mathrm{mm}} + \frac{1}{-25\,\mathrm{mm}} \right)^{-1} = -\frac{100}{3}\,\mathrm{mm} \cong -33.33\,\mathrm{mm} \]
The system focal length is negative for t < f1 + f2 = 75 mm, the system is afocal for t = 75 mm, and
the focal length is positive for t > 75 mm.
The effect of varying the lens separation t on the effective focal length fe for f1 = +100 mm and
f2 = -25 mm, with a magnified view in (b). The system is afocal if t = f1 + f2 = 75 mm; fe < 0
for t < f1 + f2 and fe > 0 for t > f1 + f2.
2.13.14 Newtonian Form of Imaging Equation

We have already seen the familiar Gaussian form of the imaging equation:
\[ \frac{1}{z} + \frac{1}{z'} = \frac{1}{f} \]
An equivalent form is obtained by defining the distances x and x' that are the differences between
the object and image distances and the focal length:
\[ z = x + f \implies x = z - f \]
\[ z' = x' + f \implies x' = z' - f \]
In the case of a real object O and real image O' as shown in the figure, both x and x' are positive.
The definition of the parameters x, x' in the Newtonian form of the imaging equation. For a real
image, both x and x' are positive.
By simple substitution into the imaging equation, we obtain:
\[ \frac{1}{f} = \frac{1}{x + f} + \frac{1}{x' + f} = \frac{(x' + f) + (x + f)}{(x + f)(x' + f)} = \frac{x + x' + 2f}{x x' + (x + x') f + f^2} \]
\[ \implies f = \frac{x x' + (x + x') f + f^2}{(x + x') + 2f} \]
\[ \implies x x' + f^2 = 2 f^2 \]
\[ \implies x \cdot x' = f^2 \]
This is the Newtonian form of the imaging equation. The same expression applies for virtual images,
but the sign of the distances must be adjusted, as shown:
The parameters x, x' of the Newtonian form for a virtual image.
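The equivalence of the Gaussian and Newtonian forms is easy to confirm numerically. This is a minimal sketch with an arbitrarily chosen lens (f = 100 mm) and object distance (z = 300 mm); neither value comes from the notes.

```python
from fractions import Fraction

f = Fraction(100)        # focal length, mm (illustrative value)
z = Fraction(300)        # object distance, mm (illustrative value)

# Gaussian form: 1/z + 1/z' = 1/f
zp_gauss = 1 / (1 / f - 1 / z)

# Newtonian form: x * x' = f**2, with x = z - f and z' = x' + f
x = z - f
xp = f ** 2 / x
zp_newton = xp + f

print("Gaussian  z' =", zp_gauss, "mm")
print("Newtonian z' =", zp_newton, "mm  (x =", x, "mm, x' =", xp, "mm)")
```

Both forms give z' = 150 mm, with x = 200 mm and x' = 50 mm so that x x' = f^2 = 10000 mm^2.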

2.13.15 Example (1) of Two-Lens System

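Find the cardinal points of the two-lens system
\[ f_1 = +100\,\mathrm{mm}, \quad f_2 = +25\,\mathrm{mm}, \quad t = +50\,\mathrm{mm} \]
The effective focal length is:
\[ f_e = \frac{f_1 f_2}{(f_1 + f_2) - t} = \frac{(100\,\mathrm{mm})(25\,\mathrm{mm})}{100\,\mathrm{mm} + 25\,\mathrm{mm} - 50\,\mathrm{mm}} = +\frac{100}{3}\,\mathrm{mm} = +33\tfrac{1}{3}\,\mathrm{mm} \]
Now find the location of the focal point from the formula for the back focal length:
\[ \mathrm{BFL} = \mathrm{V'F'} = \frac{f_2 (f_1 - t)}{(f_1 + f_2) - t} = \frac{(25\,\mathrm{mm})(100\,\mathrm{mm} - 50\,\mathrm{mm})}{(100\,\mathrm{mm} + 25\,\mathrm{mm}) - 50\,\mathrm{mm}} = +\frac{50}{3}\,\mathrm{mm} \]
Alternatively, we can track a ray from infinity through the system. The image distance from the
first lens is f1 = +100 mm, so the object distance to the second lens is
\[ z_2 = t - f_1 = 50\,\mathrm{mm} - 100\,\mathrm{mm} = -50\,\mathrm{mm} \]
The image distance from the second lens is:
\[ z_2' = \frac{z_2 f_2}{z_2 - f_2} = \frac{(-50\,\mathrm{mm})(+25\,\mathrm{mm})}{(-50\,\mathrm{mm}) - (+25\,\mathrm{mm})} = +\frac{50}{3}\,\mathrm{mm} = \mathrm{V'F'} \]
(parenthetical note: this is half the focal length).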
We can now draw the image-space focal and principal points:
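To find the object-space focal point, we can evaluate the front focal length for f1 = +100 mm,
f2 = +25 mm, and t = +50 mm:
\[ \mathrm{FFL} = \mathrm{FV} = \frac{f_1 (f_2 - t)}{(f_1 + f_2) - t} = \frac{(+100\,\mathrm{mm})(25\,\mathrm{mm} - 50\,\mathrm{mm})}{(100\,\mathrm{mm} + 25\,\mathrm{mm}) - 50\,\mathrm{mm}} = -\frac{100}{3}\,\mathrm{mm} \]
which says that the object-space focal point is to the right of the object-space vertex. From the
effective focal length, we can locate the object-space principal point:
\[ \mathrm{FH} = f_e = +\frac{100}{3}\,\mathrm{mm} \]
\[ \mathrm{FV} = \mathrm{FH} + \mathrm{HV} \]
\[ -\frac{100}{3}\,\mathrm{mm} = +\frac{100}{3}\,\mathrm{mm} + \mathrm{HV} \]
\[ \implies \mathrm{HV} = -\frac{100}{3}\,\mathrm{mm} - \frac{100}{3}\,\mathrm{mm} = -\frac{200}{3}\,\mathrm{mm} \]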
Alternatively, we turn the system around and bring in light from the left. The image distance
from the first lens (actually L2) is equal to its focal length:
\[ z_1' = f_2 = +25\,\mathrm{mm} \]
So the object distance to the lens with f1 = +100 mm is:
\[ z_2 = t - z_1' = 50\,\mathrm{mm} - 25\,\mathrm{mm} = +25\,\mathrm{mm} \]
So the distance from this lens to the system image-space focal point is:
\[ z_2' = \frac{z_2 f_1}{z_2 - f_1} = \frac{(+25\,\mathrm{mm})(+100\,\mathrm{mm})}{(+25\,\mathrm{mm}) - (+100\,\mathrm{mm})} = -\frac{100}{3}\,\mathrm{mm} \]
The object-space focal point is virtual and the object-space principal point is located at the distance
fe in front of it in the reversed system (and therefore behind it once the system is returned to its
original orientation).
We can now reverse the second case and plot the four cardinal points (F, F', H, H') on the same
graph:
Object-space and image-space cardinal points for two-lens system with f1 = +100 mm,
f2 = +25 mm, t = +50 mm. The ray from infinity on the object side is in red, that from infinity on
the image side is in blue.
In this case, the object-space focal point F just happens to coincide with the image-space principal
point H' and the same is true for the object-space principal point H and the image-space focal point
F'. This is of no real significance, since the two spaces are independent.
Images from System: (1) Object at Object-Space Focal Point
An object located at the object-space (front) focal point of the system is at the distance equal to
the FFL from the first lens. In this case:
\[ z_1 = \mathrm{FFL} = -\frac{100}{3}\,\mathrm{mm} \]
\[ z_1' = \frac{z_1 f_1}{z_1 - f_1} = \frac{\left( -\frac{100}{3}\,\mathrm{mm} \right)(100\,\mathrm{mm})}{-\frac{100}{3}\,\mathrm{mm} - 100\,\mathrm{mm}} = +25\,\mathrm{mm} \]
\[ z_2 = t - z_1' = +50\,\mathrm{mm} - 25\,\mathrm{mm} = +25\,\mathrm{mm} \]
which is the same as the focal length of the second lens, which means that the image distance from
the second lens is infinite (as expected).
Images from System: (2) Object at Object-Space Principal Point
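An object located at the object-space (front) principal point of the system is at the distance equal
to FFL - fe from the first lens. In this case:
\[ z_1 = \mathrm{FFL} - f_e = -\frac{100}{3}\,\mathrm{mm} - \frac{100}{3}\,\mathrm{mm} = -\frac{200}{3}\,\mathrm{mm} \]
\[ z_1' = \frac{z_1 f_1}{z_1 - f_1} = \frac{\left( -\frac{200}{3}\,\mathrm{mm} \right)(100\,\mathrm{mm})}{-\frac{200}{3}\,\mathrm{mm} - 100\,\mathrm{mm}} = +40\,\mathrm{mm} \]
\[ (M_T)_1 = -\frac{z_1'}{z_1} = -\frac{40\,\mathrm{mm}}{-\frac{200}{3}\,\mathrm{mm}} = +\frac{3}{5} \]
\[ z_2 = t - z_1' = +50\,\mathrm{mm} - 40\,\mathrm{mm} = +10\,\mathrm{mm} \]
\[ z_2' = \frac{z_2 f_2}{z_2 - f_2} = \frac{(10\,\mathrm{mm})(25\,\mathrm{mm})}{10\,\mathrm{mm} - 25\,\mathrm{mm}} = -\frac{50}{3}\,\mathrm{mm} \]
\[ (M_T)_2 = -\frac{z_2'}{z_2} = -\frac{-\frac{50}{3}\,\mathrm{mm}}{10\,\mathrm{mm}} = +\frac{5}{3} \]
The system magnification for that object distance is the product of the two:
\[ (M_T)_{\text{system}} = (M_T)_1 \cdot (M_T)_2 = \left( +\frac{3}{5} \right) \left( +\frac{5}{3} \right) = +1 \]
as expected for the object and image at the principal points.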
Images from System: (3) Equal Conjugates
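If we move the object so that it is one focal length from the focal point and two focal lengths from
the principal point, the object distance is:
\[ z_1 = \mathrm{FFL} + f_e = -\frac{100}{3}\,\mathrm{mm} + \frac{100}{3}\,\mathrm{mm} = 0\,\mathrm{mm} \]
\[ z_1' = 0\,\mathrm{mm} \]
\[ (M_T)_1 = +1 \]
\[ z_2 = t - z_1' = +50\,\mathrm{mm} - 0\,\mathrm{mm} = +50\,\mathrm{mm} \]
\[ z_2' = \frac{z_2 f_2}{z_2 - f_2} = \frac{(+50\,\mathrm{mm})(25\,\mathrm{mm})}{+50\,\mathrm{mm} - 25\,\mathrm{mm}} = +50\,\mathrm{mm} \]
\[ (M_T)_2 = -\frac{z_2'}{z_2} = -\frac{50\,\mathrm{mm}}{50\,\mathrm{mm}} = -1 \]
\[ (M_T)_{\text{system}} = (M_T)_1 \cdot (M_T)_2 = (+1)(-1) = -1 \]
as expected for the object and image at the equal-conjugate points.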
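The three special object positions can be traced with one small routine. The sketch below uses the image-distance formula z' = z f / (z - f) from the example and, as an assumption to avoid dividing by zero when the object sits at the lens (z = 0), writes the single-lens magnification in the algebraically equivalent form M_T = f / (f - z). The helper names are illustrative, and the routine assumes the first lens forms its image at a finite distance.

```python
from fractions import Fraction


def lens(z, f):
    """Return (z', M_T) for one thin lens; (None, None) if the image is at infinity."""
    if z == f:
        return None, None
    return z * f / (z - f), f / (f - z)


def two_lens_image(z1, f1, f2, t):
    """Trace an object at distance z1 from L1 through both lenses."""
    z1p, m1 = lens(z1, f1)          # assumes z1 != f1
    z2p, m2 = lens(t - z1p, f2)
    if z2p is None:
        return None, None
    return z2p, m1 * m2


f1, f2, t = Fraction(100), Fraction(25), Fraction(50)   # mm
d = f1 + f2 - t
fe = f1 * f2 / d
ffl = f1 * (f2 - t) / d

at_focal_point = two_lens_image(ffl, f1, f2, t)          # image at infinity
at_principal_point = two_lens_image(ffl - fe, f1, f2, t)
at_equal_conjugates = two_lens_image(ffl + fe, f1, f2, t)

print("object at F: ", at_focal_point)
print("object at H: ", at_principal_point)
print("object at 2f:", at_equal_conjugates)
```

The three cases return an image at infinity, M_T = +1 with z2' = -50/3 mm, and M_T = -1 with z2' = +50 mm, matching the hand calculations.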
2.13.16 Example (2) of Two-Lens System: Telephoto Lens
Now consider a system composed of a positive lens and a negative lens separated by just a bit more
than the sum of the focal lengths: f1 = +100 mm, f2 = -25 mm, and t = +80 mm. The focal length
of the equivalent thin lens is fe = 500 mm:
\[ f_e = \frac{f_1 f_2}{f_1 + f_2 - t} = \frac{(100\,\mathrm{mm})(-25\,\mathrm{mm})}{100\,\mathrm{mm} + (-25\,\mathrm{mm}) - 80\,\mathrm{mm}} = +500\,\mathrm{mm} \]
Note that the focal length of the system is MUCH longer than the focal lengths of either lens.
Now locate the image-space focal point and principal point. For an object located at infinity, the
BFL is found by substitution into the appropriate equation:
\[ \mathrm{BFL} = \mathrm{V'F'} = \frac{(f_1 - t) f_2}{(f_1 + f_2) - t} = \frac{(100\,\mathrm{mm} - 80\,\mathrm{mm})(-25\,\mathrm{mm})}{(100\,\mathrm{mm} + (-25\,\mathrm{mm})) - 80\,\mathrm{mm}} = +100\,\mathrm{mm} \]
The image of an object at infinity is located 100 mm behind the second lens, and thus 180 mm behind
the first lens; this distance VF' = 180 mm is the physical length, which is MUCH shorter than the
focal length of 500 mm. This is the advantage of a telephoto lens; the focal length is much longer
than the lens itself.
The location of the image-space principal point is determined from the back and equivalent focal
lengths:
\[ \mathrm{H'F'} = \mathrm{H'V'} + \mathrm{V'F'} \]
\[ 500\,\mathrm{mm} = \mathrm{H'V'} + 100\,\mathrm{mm} \]
\[ \mathrm{H'V'} = +400\,\mathrm{mm} \]
\[ \mathrm{H'V} = \mathrm{H'V'} - \mathrm{VV'} = 400\,\mathrm{mm} - 80\,\mathrm{mm} = +320\,\mathrm{mm} \]
so the principal point is located 320 mm in front of the object-space vertex V. A sketch of the
system and the image-space cardinal points is shown below:
Image-space focal and principal points of the telephoto system. The equivalent focal length of the
system is fe = +500 mm, but the image-space focal point is only +100 mm behind the rear vertex
V'. The image-space principal point is 500 mm in front of the focal point.
The object-space focal point is located by applying the expression for the front focal distance:
\[ \mathrm{FFL} = \mathrm{FV} = \frac{f_1 (f_2 - t)}{(f_1 + f_2) - t} = \frac{(+100\,\mathrm{mm})((-25\,\mathrm{mm}) - 80\,\mathrm{mm})}{(100\,\mathrm{mm} + (-25\,\mathrm{mm})) - 80\,\mathrm{mm}} = +2100\,\mathrm{mm} \]
which is far in front of the object-space vertex V. The object-space principal point is found from:
\[ \mathrm{FH} = \mathrm{FV} + \mathrm{VH} \]
\[ +500\,\mathrm{mm} = +2100\,\mathrm{mm} + \mathrm{VH} \]
\[ \mathrm{VH} = 500\,\mathrm{mm} - 2100\,\mathrm{mm} = -1600\,\mathrm{mm} \implies \mathrm{HV} = -\mathrm{VH} = +1600\,\mathrm{mm} \]
So the object-space principal point is very far in front of the first vertex.
Object-space focal and principal points of the telephoto system. Both are located far ahead of the
front vertex V.
We can locate the image of an object at a finite distance, say 3 m in front of the first lens (OV =
3000 mm), using three methods: (1) brute-force calculation, (2) the Gaussian
imaging formula for distances measured from the principal points, and (3) the Newtonian
imaging equation.
(1) Brute-Force Calculation
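The distance from the object to the first thin lens is 3000 mm, so the intermediate image distance
satisfies:
\[ \frac{1}{z_1} + \frac{1}{z_1'} = \frac{1}{f_1} \]
\[ z_1' = \left( \frac{1}{100\,\mathrm{mm}} - \frac{1}{3000\,\mathrm{mm}} \right)^{-1} = \frac{3000}{29}\,\mathrm{mm} \cong 103.45\,\mathrm{mm} \]
The transverse magnification of the image from the first lens is:
\[ (M_T)_1 = -\frac{z_1'}{z_1} = -\frac{1}{29} \]
The object distance to the second lens is negative:
\[ z_2 = t - z_1' = 80\,\mathrm{mm} - \frac{3000}{29}\,\mathrm{mm} = -\frac{680}{29}\,\mathrm{mm} \cong -23.45\,\mathrm{mm} \]
so the object is virtual. The image distance from the second lens is:
\[ \frac{1}{z_2} + \frac{1}{z_2'} = \frac{1}{f_2} \]
\[ z_2' = \left( \frac{1}{-25\,\mathrm{mm}} + \frac{29}{680\,\mathrm{mm}} \right)^{-1} = +\frac{3400}{9}\,\mathrm{mm} \cong +377.8\,\mathrm{mm} \]
The corresponding transverse magnification is:
\[ (M_T)_2 = -\frac{z_2'}{z_2} = -\frac{+\frac{3400}{9}\,\mathrm{mm}}{-\frac{680}{29}\,\mathrm{mm}} = +\frac{145}{9} \cong +16.1 \]
The system magnification is the product of the component transverse magnifications:
\[ M_T = (M_T)_1 \cdot (M_T)_2 = \left( -\frac{1}{29} \right) \left( +\frac{145}{9} \right) = -\frac{5}{9} \]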
(2) Gaussian Formula
Now evaluate the same image using the Gaussian formula for distances measured from the principal
points. The distance from the object to the object-space principal point is:
\[ z = \mathrm{OH} = \mathrm{OV} + \mathrm{VH} = 3000\,\mathrm{mm} + (-1600\,\mathrm{mm}) = +1400\,\mathrm{mm} \]
The image distance measured from the image-space principal point is found from the Gaussian image
formula:
\[ \frac{1}{z'} = \frac{1}{f_e} - \frac{1}{z} \implies z' = \mathrm{H'O'} = \left( \frac{1}{500\,\mathrm{mm}} - \frac{1}{1400\,\mathrm{mm}} \right)^{-1} = +\frac{7000}{9}\,\mathrm{mm} \cong 777.8\,\mathrm{mm} \]
The distance from the rear vertex to the image is found from the known value for H'V' = +400 mm:
\[ \mathrm{V'O'} = \mathrm{H'O'} - \mathrm{H'V'} = +\frac{7000}{9}\,\mathrm{mm} - 400\,\mathrm{mm} = \frac{3400}{9}\,\mathrm{mm} \cong 377.8\,\mathrm{mm} \]
thus matching the distance obtained using brute force. The transverse magnification of the image
created by the system is:
\[ M_T = -\frac{z'}{z} = -\frac{+\frac{7000}{9}\,\mathrm{mm}}{+1400\,\mathrm{mm}} = -\frac{5}{9} \]
(3) Newtonian Lens Formula

Now repeat the calculation for the image position using the Newtonian lens formula. The distance
from the object to the object-space focal point is:
$$x = OF = OV + VF = OV - FV = 3000\text{ mm} - 2100\text{ mm} = 900\text{ mm}$$
Therefore the distance from the image-space focal point to the image is:
$$x' = F'O' = \frac{f_e^2}{x} = \frac{(500\text{ mm})^2}{900\text{ mm}} = \frac{2500}{9}\text{ mm} \cong 277.8\text{ mm}$$
So the distance from the rear (image-space) vertex $V'$ to the image is:
$$V'O' = V'F' + F'O' = 100\text{ mm} + \frac{2500}{9}\text{ mm} = \frac{3400}{9}\text{ mm} \cong 377.8\text{ mm}$$
which again agrees with the result obtained by the other two methods.
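The agreement of the three methods can be checked numerically. The following is a minimal sketch, not part of the original notes: the helper name `thin_lens_image` is hypothetical, exact rational arithmetic is used so the three results can be compared without rounding, and the cardinal-point values (fe, BFL, FFL) are taken as quoted above rather than recomputed.

```python
from fractions import Fraction as F

def thin_lens_image(z, f):
    """Image distance z' from 1/z + 1/z' = 1/f (real object: z > 0)."""
    return z * f / (z - f)

# Telephoto values quoted in the notes (all in mm)
f1, f2, t = F(100), F(-25), F(80)
fe, bfl, ffl = F(500), F(100), F(2100)
OV = F(3000)                      # object distance from the front vertex

# (1) Brute force: image through lens 1, then through lens 2
z1p = thin_lens_image(OV, f1)
z2 = t - z1p                      # negative: virtual object for lens 2
brute = thin_lens_image(z2, f2)   # distance V'O'

# (2) Gaussian: distances measured from the principal points
z = OV - (ffl - fe)               # OH = OV + VH, with VH = -(FFL - fe)
gauss = thin_lens_image(z, fe) - (fe - bfl)   # H'O' - H'V'

# (3) Newtonian: distances measured from the focal points
x = OV - ffl                      # OF
newton = bfl + fe**2 / x          # V'F' + F'O'

print(float(brute), float(gauss), float(newton))
```

All three variables hold the same rational number, 3400/9 mm, which is the 377.8 mm found above.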
2.13.17 Images from Telephoto System:

Image (1): Object at Object-Space Focal Point
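$$z_1 = FFL = +2100\text{ mm}$$
$$z_1' = \frac{z_1 f_1}{z_1 - f_1} = \frac{(+2100\text{ mm})(100\text{ mm})}{(+2100\text{ mm}) - (100\text{ mm})} = +105\text{ mm}$$
$$z_2 = t - z_1' = +80\text{ mm} - 105\text{ mm} = -25\text{ mm}$$
The image distance from the second lens is infinite (as expected):
$$z_2' = \frac{z_2 f_2}{z_2 - f_2} = \frac{(-25\text{ mm})(-25\text{ mm})}{(-25\text{ mm}) - (-25\text{ mm})} = \infty$$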
Image (2) from Telephoto System: Object at Object-Space Principal Point

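$$z_1 = FFL - f_e = 2100\text{ mm} - 500\text{ mm} = 1600\text{ mm}$$
$$z_1' = \frac{z_1 f_1}{z_1 - f_1} = \frac{(1600\text{ mm})(100\text{ mm})}{(1600\text{ mm}) - (100\text{ mm})} = +\frac{320}{3}\text{ mm}$$
$$(M_T)_1 = -\frac{z_1'}{z_1} = -\frac{+\frac{320}{3}\text{ mm}}{1600\text{ mm}} = -\frac{1}{15}$$
The object distance to the second lens is:
$$z_2 = t - z_1' = +80\text{ mm} - \frac{320}{3}\text{ mm} = -\frac{80}{3}\text{ mm}$$
$$z_2' = \frac{z_2 f_2}{z_2 - f_2} = \frac{\left(-\frac{80}{3}\text{ mm}\right)(-25\text{ mm})}{\left(-\frac{80}{3}\text{ mm}\right) - (-25\text{ mm})} = -400\text{ mm}$$
$$(M_T)_2 = -\frac{z_2'}{z_2} = -\frac{(-400\text{ mm})}{-\frac{80}{3}\text{ mm}} = -15$$
$$(M_T)_{system} = (M_T)_1 \, (M_T)_2 = \left(-\frac{1}{15}\right)(-15) = +1$$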
which again confirms that the transverse magnification is that expected for the object and image at
the principal points.
Image (3) from Telephoto System: Equal Conjugates
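$$z_1 = FFL + f_e = 2100\text{ mm} + 500\text{ mm} = 2600\text{ mm}$$
$$z_1' = \frac{z_1 f_1}{z_1 - f_1} = \frac{(+2600\text{ mm})(100\text{ mm})}{(+2600\text{ mm}) - (100\text{ mm})} = +104\text{ mm}$$
$$(M_T)_1 = -\frac{z_1'}{z_1} = -\frac{(+104\text{ mm})}{(2600\text{ mm})} = -\frac{1}{25}$$
$$z_2 = t - z_1' = +80\text{ mm} - 104\text{ mm} = -24\text{ mm}$$
$$z_2' = \frac{z_2 f_2}{z_2 - f_2} = \frac{(-24\text{ mm})(-25\text{ mm})}{(-24\text{ mm}) - (-25\text{ mm})} = +600\text{ mm}$$
$$(M_T)_2 = -\frac{z_2'}{z_2} = -\frac{(+600\text{ mm})}{(-24\text{ mm})} = +25$$
$$(M_T)_{system} = (M_T)_1 \, (M_T)_2 = \left(-\frac{1}{25}\right)(+25) = -1$$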

2.13.18 Example (3) of Two-Lens System: Two Negative Lenses
Now consider a system composed of two negative lenses separated by the sum of the magnitudes of their focal lengths: $f_1 = -100$ mm, $f_2 = -25$ mm, and $t = +125$ mm. The focal length of the equivalent thin lens is:
$$f_e = \frac{f_1 f_2}{f_1 + f_2 - t} = H'F' = FH$$
$$f_e = \frac{(-100\text{ mm})(-25\text{ mm})}{(-100\text{ mm}) + (-25\text{ mm}) - 125\text{ mm}} = -10\text{ mm}$$
Note that the focal length of the system is negative and shorter in magnitude than that of either lens.
Now locate the image-space focal point and principal point. For an object located at $\infty$, the BFL and FFL are found by substitution into the appropriate equation:
$$BFL = V'F' = \frac{(f_1 - t)\, f_2}{(f_1 + f_2) - t} = \frac{(-100\text{ mm} - 125\text{ mm})(-25\text{ mm})}{(-100\text{ mm}) + (-25\text{ mm}) - 125\text{ mm}} = -\frac{45}{2}\text{ mm} = -22.5\text{ mm}$$
$$FFL = FV = \frac{f_1 (f_2 - t)}{(f_1 + f_2) - t} = \frac{(-100\text{ mm})(-25\text{ mm} - 125\text{ mm})}{(-100\text{ mm}) + (-25\text{ mm}) - 125\text{ mm}} = -60\text{ mm}$$
(1) Object at Object-Space Focal Point

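$$z_1 = FFL = -60\text{ mm} \quad \text{(virtual object)}$$
$$z_1' = \frac{z_1 f_1}{z_1 - f_1} = \frac{(-60\text{ mm})(-100\text{ mm})}{(-60\text{ mm}) - (-100\text{ mm})} = +150\text{ mm}$$
$$z_2 = t - z_1' = +125\text{ mm} - 150\text{ mm} = -25\text{ mm}$$
The image distance from the second lens is infinite (as expected):
$$z_2' = \frac{z_2 f_2}{z_2 - f_2} = \frac{(-25\text{ mm})(-25\text{ mm})}{(-25\text{ mm}) - (-25\text{ mm})} = \frac{625\text{ mm}^2}{0\text{ mm}} = \infty$$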
Images from System: (2) Object at Object-Space Principal Point

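$$z_1 = FFL - f_e = -60\text{ mm} - (-10\text{ mm}) = -50\text{ mm}$$
$$z_1' = \frac{z_1 f_1}{z_1 - f_1} = \frac{(-50\text{ mm})(-100\text{ mm})}{(-50\text{ mm}) - (-100\text{ mm})} = +100\text{ mm}$$
$$(M_T)_1 = -\frac{z_1'}{z_1} = -\frac{+100\text{ mm}}{-50\text{ mm}} = +2$$
$$z_2 = t - z_1' = +125\text{ mm} - 100\text{ mm} = +25\text{ mm}$$
$$z_2' = \frac{z_2 f_2}{z_2 - f_2} = \frac{(+25\text{ mm})(-25\text{ mm})}{(+25\text{ mm}) - (-25\text{ mm})} = -12.5\text{ mm}$$
$$(M_T)_2 = -\frac{z_2'}{z_2} = -\frac{(-12.5\text{ mm})}{(+25\text{ mm})} = +\frac{1}{2}$$
$$(M_T)_{system} = (M_T)_1 \, (M_T)_2 = (+2)\left(+\frac{1}{2}\right) = +1$$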
which again confirms that the transverse magnification is that expected for the object and image at
the principal points.
Images from System: (3) Equal Conjugates

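$$z_1 = FFL + f_e = -60\text{ mm} + (-10\text{ mm}) = -70\text{ mm}$$
$$z_1' = \frac{z_1 f_1}{z_1 - f_1} = \frac{(-70\text{ mm})(-100\text{ mm})}{(-70\text{ mm}) - (-100\text{ mm})} = +\frac{700}{3}\text{ mm} = 233\tfrac{1}{3}\text{ mm}$$
$$(M_T)_1 = -\frac{z_1'}{z_1} = -\frac{\frac{700}{3}\text{ mm}}{(-70\text{ mm})} = +\frac{10}{3}$$
$$z_2 = t - z_1' = +125\text{ mm} - \frac{700}{3}\text{ mm} = -\frac{325}{3}\text{ mm} \cong -108.3\text{ mm}$$
$$z_2' = \frac{z_2 f_2}{z_2 - f_2} = \frac{\left(-\frac{325}{3}\text{ mm}\right)(-25\text{ mm})}{\left(-\frac{325}{3}\text{ mm}\right) - (-25\text{ mm})} = -32.5\text{ mm}$$
$$(M_T)_2 = -\frac{z_2'}{z_2} = -\frac{(-32.5\text{ mm})}{-\frac{325}{3}\text{ mm}} = -\frac{3}{10}$$
$$(M_T)_{system} = (M_T)_1 \, (M_T)_2 = \left(+\frac{10}{3}\right)\left(-\frac{3}{10}\right) = -1$$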
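The cardinal-point formulas and the two special magnifications for this example can be verified with a short script. This is a sketch under the sign conventions used in these notes (real objects and real images have positive distances); the function names `two_lens_system` and `trace` are illustrative and do not come from the notes.

```python
from fractions import Fraction as F

def two_lens_system(f1, f2, t):
    """Equivalent focal length, BFL and FFL of two thin lenses separated by t."""
    d = f1 + f2 - t
    return f1 * f2 / d, (f1 - t) * f2 / d, f1 * (f2 - t) / d

def trace(z1, f1, f2, t):
    """Return (image distance from lens 2, system transverse magnification)."""
    z1p = z1 * f1 / (z1 - f1)      # image formed by the first lens
    z2 = t - z1p                   # object distance for the second lens
    z2p = z2 * f2 / (z2 - f2)      # image formed by the second lens
    return z2p, (-z1p / z1) * (-z2p / z2)

# Two negative lenses, values from the example above (mm)
f1, f2, t = F(-100), F(-25), F(125)
fe, bfl, ffl = two_lens_system(f1, f2, t)
print(fe, bfl, ffl)

_, m_principal = trace(ffl - fe, f1, f2, t)   # object at the principal point H
_, m_equal = trace(ffl + fe, f1, f2, t)       # object at the equal-conjugate point
print(m_principal, m_equal)
```

The object at the focal point is left out deliberately, since its image distance is infinite and the second imaging step would divide by zero.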
2.14 Plane and Spherical Mirrors
One of the most familiar optical elements is the plane mirror (you probably see one every morning!).
For each ray incident at angle $\theta$ measured from the normal to the surface, a reflected ray is generated at angle $-\theta$ relative to the normal. Consider a full sphere with a reflective surface on the inside and a point object O at the center, as shown in (a) in the figure. All rays from the object encounter the surface at normal incidence and reflect back to form an image at the center. We can infer the focal length of the spherical concave mirror from this observation by noting that the object and image distances are identically $R$, so the focal length is determined by the thin-lens imaging equation:
$$\frac{1}{f} = \frac{1}{z_1} + \frac{1}{z_2}$$
$$z_1 = z_2 = R \implies \frac{1}{f} = \frac{1}{R} + \frac{1}{R} = \frac{2}{R} \implies f = \frac{R}{2}$$
Note that in this case of a complete sphere, the algebraic sign of the radius of curvature is not well defined, but since rays converge to form the image, the focal length clearly must be positive. Because the object and image distances are equal, this clearly is imaging at equal conjugates with transverse magnification $M_T = -1$:
$$M_T = -\frac{z_2}{z_1} = -\frac{2f}{2f} = -1$$
The negative sign on MT means that if the object source is moved upward from its position on
the horizontal axis at the center, then the reflected rays will converge to a point below the optic
axis, as shown in part (b) of the figure.
In part (c) of the figure, half of the spherical mirror surface is removed so that all rays emitted
towards the left will escape without striking the mirror and all rays emitted towards the right will
strike the surface one time before returning to the image at the center and then escaping to the
right. This mirror surface clearly makes rays converge to a real image coincident with the object
and so must have a positive focal length EVEN THOUGH the radius of curvature R is negative
(because V is to the right of C).
Spherical mirror: (a) rays from point source at center of sphere are all normal to the surface and reflect back upon themselves to form a point image at object, so that $z_1 = z_2 = R$; (b) if the point source is moved upward, the image moves downward, which shows that $M_T = -1$; (c) half the sphere is removed leaving a hemisphere with $R = -CV < 0$.
Derivation of the focal length of a concave spherical mirror. The magnified section at the bottom shows the triangles used to evaluate $f$ in terms of $R$: $f = -\frac{R}{2}$ in the paraxial approximation.
We can consider the hemispherical concave mirror with radius of curvature R = VC < 0. Even
though the radius is negative, we have already inferred that the focal length of this system is positive
since the image rays converge, so we have:
$$f = \frac{|R|}{2} = \frac{-R}{2} = -\frac{R}{2}$$
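Consider a ray from an object at infinity that is close to (and parallel to) the optical axis, as shown in the figure. From triangle CAV in the magnified view, it is apparent that:
$$\sin[\theta] = \frac{x}{CV} = \frac{x}{-VC} = \frac{x}{-R}$$
From triangle $F'AV$, we see that
$$\tan[2\theta] = \frac{x}{F'V'}$$
Now apply the paraxial approximation that $\sin[\theta] \cong \tan[\theta] \cong \theta$ if $\theta \cong 0$:
$$\sin[\theta] \cong \theta = \frac{x}{-R} \implies x = -R\,\theta$$
$$\tan[2\theta] \cong 2\theta = \frac{x}{f} \implies x = f \cdot 2\theta$$
Now equate the two terms to find a relationship between $f$ and $R$:
$$-R\,\theta = f \cdot 2\theta \implies f = -\frac{R}{2}$$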
This expression for the focal length may be substituted into the imaging equation for a single thin
lens:
$$\frac{1}{z_1} + \frac{1}{z_2} = \frac{1}{f} = -\frac{2}{R}$$
For the case just considered of a concave surface, R < 0 and f > 0. If the object distance z1 > f ,
then the image distance z2 is positive, BUT IS MEASURED FROM RIGHT TO LEFT. If the
mirror is a convex spherical surface with R = VC > 0; the image of a ray from an object at infinity
crosses the axis at the image-space focal point behind the mirror, so the optic makes rays diverge
and therefore has negative power.
Convex mirror has positive radius of curvature ($R > 0$) but the reflected rays diverge and so the surface has negative focal length via $f = -\frac{R}{2}$.
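The paraxial result $f = -R/2$ for the concave mirror can be compared with the exact axis crossing of a reflected ray. The sketch below is an illustration added here, not the derivation in the notes: it assumes a ray parallel to the axis at height $x$ and uses the fact that the triangle formed by the center of curvature C, the crossing point, and the point of incidence A is isosceles. The function name is hypothetical.

```python
import math

def axis_crossing_from_vertex(x, radius):
    """Distance from the vertex to where the reflected ray crosses the axis.

    x: height of an incoming ray parallel to the axis.
    radius: magnitude |R| of the concave mirror's radius of curvature.
    """
    theta = math.asin(x / radius)          # angle of incidence at the mirror
    # The triangle C-F-A has two angles equal to theta, so it is isosceles:
    # CF = |R| / (2 cos(theta)), and the crossing lies |R| - CF from the vertex.
    return radius - radius / (2.0 * math.cos(theta))

R = -100.0                                  # concave mirror: R < 0 (mm)
f_paraxial = -R / 2.0
for x in (1.0, 10.0, 50.0):
    print(x, axis_crossing_from_vertex(x, abs(R)), f_paraxial)
```

Rays near the axis cross very close to $|R|/2$, while rays far from the axis cross closer to the mirror, which is why the paraxial restriction matters.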
2.14.1 Comparison of Thin Lens and Concave Mirror
Comparison of the vertices, focal points, principal points, and equal-conjugate points of a concave mirror and a thin lens. The vertices and the principal points coincide in both cases so that $M_T = +1$ for object and image at the vertex of the mirror and at the surfaces of the lens. The object- and image-space focal points of the mirror coincide at the distance $f_e = -\frac{R}{2}$ from the vertex, and the equal conjugate points are located at the center of curvature so that $z_1 = z_2 = 2f_e$. For the lens, the equal conjugate points are also located such that $z_1 = z_2 = 2f_e$ with $M_T = -1$.
2.15 Stops and Pupils

In any multielement optical system, the beam of light that passes through the system is shaped like a solid circular spindle with different radii at different axial locations. The diameter of one optical element will limit the size of the ray spindle that exits the system; this limiting element is the aperture stop of the system and may be a lens or an aperture with no power (an iris diaphragm) that is placed specifically to limit the diameter of the ray cone. A larger exiting ray cone means that more light reaches the image to make it brighter, so the diameter of the aperture stop is the limiting factor for image brightness. Consider the example of a two-lens system with an iris positioned between them shown in the figure. The iris limits the cone of rays from the object at O.
Schematic of the aperture stop S and entrance and exit pupils $E$ and $E'$, respectively, for a system
formed from two positive lenses and an iris with no power. The entrance pupil E is the image of
the stop S seen from the left through the first lens L1 , while the exit pupil is the image of S seen
from the right through the second lens L3 . Note that the element that is the stop may vary with
object location O.
Obviously, the aperture stop in an imaging system composed of a single lens is that lens. In a
two-element system, the stop will be one of the two lenses, determined by the relative diameters
and the locations of the lenses. The image of the stop seen from the input side of the lens is the
entrance pupil, which determines the angular spread of the ray cone from an object point that gets
into the optical system, and thus determines the brightness of the image. The image of the stop
seen from the output side is the exit pupil (once called the Ramsden disk).
In an imaging system intended for viewing by eye, it is useful to locate the exit pupil at the iris
of the eye and to match its diameter to that of the iris of the eye to ensure that all light through
the optical system makes it into the eye to form the viewable image.
2.15.1 Focal Ratio (f-number)

For multilens systems, the size of the entrance pupil determines the angular extent of the ray cone
that enters the system from a point source. The figure shows a simple hypothetical imaging system
with object-space and image-space principal points $H$ and $H'$, respectively, and an aperture stop of diameter $d_0$ as the first element in the system (the same analysis applies for systems with the entrance pupil at other locations for an object at infinity). In this system, the stop also is the entrance pupil. A point source at infinity creates a plane wave through the entrance pupil, which is then incident on the object-space principal plane $H$ with the same diameter. The unit transverse magnification of the two principal planes ensures that the light emerging from the image-space principal plane $H'$ has that same diameter $d_0 = d_{NP}$. The cone angle of rays incident on the image plane at the image-space focal point $F'$ is the ratio of the diameter to the distance $H'F' = f_e$:
$$\frac{d_0}{f_e} = \frac{d_{NP}}{f_e}$$
This means that the focal ratio of the system is:
$$f/\# = \frac{f_e}{d_{NP}}$$
Note that a corresponding expression could be constructed based on the diameter of the exit pupil,
but the propagation distance then would have to be the distance from the exit pupil to the image,
which (in this case) is longer than the effective focal length.
Specification of the system focal ratio: the plane wave from a point source at infinity is incident
through the aperture stop with diameter $d_0$ onto the object-space principal plane $H$. The light emerging from the image-space principal plane $H'$ has the same diameter $d_0$. The light propagates the focal length $f_e$ to the image. The ray cone is specified by the ratio $\frac{f_e}{d_0}$, which is the system focal ratio f/#.
This f-number specifies the ability of the system to collect light.
2.15.2 Example: Focal Ratio of Lens-Aperture Systems

The focal ratio of a single thin lens obviously is the ratio of the focal length to the diameter of the
lens:
$$f/\# = \frac{f}{d_0}$$
Note that the smallest possible focal ratio exists for a full sphere (which is anything but thin, and the paraxial approximation certainly does not apply over its full diameter). It might be useful to determine the focal ratio for such a case with normal glass ($n = 1.5$). The focal length of the sphere in the (ridiculously invalid) thin-lens paraxial approximation where $R = 12.5$ mm is obtained from the lensmaker's equation:
$$f = \left[(n_2 - 1)\left(\frac{1}{R_1} - \frac{1}{R_2}\right)\right]^{-1} = \left[(1.5 - 1)\left(\frac{1}{12.5\text{ mm}} - \frac{1}{-12.5\text{ mm}}\right)\right]^{-1} = 12.5\text{ mm}$$
The focal ratio is:
$$f/\# = \frac{f}{d_0} = \frac{12.5\text{ mm}}{25\text{ mm}} = \frac{1}{2}$$
This is ridiculously invalid because it assumes that the sphere is simultaneously thin and fat.
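If we assume the spherical lens is composed of two thin lenses at the vertices, each with the power of a single surface:
$$f_1 = f_2 = \left(\frac{1.5 - 1}{12.5\text{ mm}}\right)^{-1} = 25\text{ mm}$$
$$t = 25\text{ mm}$$
$$f_e = \frac{f_1 f_2}{f_1 + f_2 - t} = \frac{25\text{ mm} \cdot 25\text{ mm}}{25\text{ mm} + 25\text{ mm} - 25\text{ mm}} = 25\text{ mm}$$
$$BFL = \frac{(f_1 - t)\, f_2}{f_1 + f_2 - t} = 0$$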
Single Thin Lens + Aperture in front
Consider a system with a diaphragm (iris or aperture) of diameter $d_0$ located at a distance $t$ in front of the lens with focal length $f_1$ and diameter $d_1$. Since the aperture has no power to refract light ($\phi = 0$ diopters), its focal length is infinite ($f_0 = \infty$). The focal length of the two-lens system is:
$$f_e = \frac{f_0 f_1}{(f_0 + f_1) - t} = f_1 \cdot \lim_{f_0 \to \infty} \frac{f_0}{(f_0 + f_1) - t} = f_1$$
which makes sense: the focal length of a system consisting of one refracting element and one non-
refracting element is that of the refracting lens.
For an object at infinity ($z_1 = \infty \implies z_2 = f_1$), the diaphragm is the aperture stop if its diameter is smaller than that of the lens:
$$d_0 < d_1 \implies \text{iris is aperture stop}$$
and the iris is also the entrance pupil. The focal ratio of the system is:
$$f/\# = \frac{f_1}{d_0}$$
The exit pupil may be located by applying the imaging equation:
$$z_{XP} = \frac{t f_1}{t - f_1}$$
which shows that the exit pupil is virtual (behind the lens as seen from image space) if $t < f_1$. Note that if $t = f_1$ so that the aperture is located at the object-space focal point of the system, then the distance from the lens to the exit pupil is infinite: the system is telecentric in image space. The exit pupil is real (and may be visualized on an observation screen) if $z_{XP} > 0 \implies t > f_1$.
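Consider some examples with $f_1 = 100$ mm, $d_1 = 25$ mm, $t = 25$ mm, and $d_0 = 10$ mm. If the iris is deleted, then the lens is the stop and the focal ratio is:
$$f/\# = \frac{f_e}{d_1} = \frac{100\text{ mm}}{25\text{ mm}} = 4 \quad (f/4)$$
With the iris in place, it is the stop and entrance pupil. The location of the exit pupil is:
$$z_{XP} = \frac{t f_1}{t - f_1} = \frac{25\text{ mm} \cdot 100\text{ mm}}{25\text{ mm} - 100\text{ mm}} = -\frac{100}{3}\text{ mm}$$
$$M_{XP} = -\frac{-\frac{100}{3}\text{ mm}}{25\text{ mm}} = +\frac{4}{3}$$
$$d_{XP} = d_0 \cdot M_{XP} = \frac{40}{3}\text{ mm} = 13\tfrac{1}{3}\text{ mm}$$
The iris is the stop and entrance pupil, so the focal ratio is:
$$f/\# = \frac{f_e}{d_{NP}} = \frac{100\text{ mm}}{10\text{ mm}} = 10 \quad (f/10)$$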
Single Thin Lens + Aperture behind
If the lens comes first in the system, then we need to find the condition on the iris diameter to determine if it is the aperture stop. At some risk of confusion, we'll maintain the notation where the diameter of the lens is $d_1$ and that of the aperture is $d_0$ even though it is second in the system. For an object at infinity, the figure shows that the distance to the iris must be less than the focal length to have any possibility of being the aperture stop. The image of the aperture seen from object space is located at
$$z = \frac{t f_1}{t - f_1}$$
which is negative (so the entrance pupil is a virtual image) if $t < f_1$. The transverse magnification of the entrance pupil is:
$$M_T = -\frac{z}{t} = -\frac{f_1}{t - f_1}$$
which implies that the diameter of the image of the iris is:
$$d_0' = M_T \cdot d_0$$
If we use the same numerical values as before but with the iris behind, the distance to the entrance pupil is:
$$z_{NP} = \frac{t f_1}{t - f_1} = \frac{25\text{ mm} \cdot 100\text{ mm}}{25\text{ mm} - 100\text{ mm}} = -\frac{100}{3}\text{ mm}$$
$$M_{NP} = -\frac{z_{NP}}{25\text{ mm}} = -\frac{-\frac{100}{3}\text{ mm}}{25\text{ mm}} = +\frac{4}{3}$$
$$d_{NP} = +\frac{4}{3} \cdot 10\text{ mm} = \frac{40}{3}\text{ mm}$$
This is the diameter of the incoming beam at the lens, so the focal ratio is:
$$f/\# = \frac{f_e}{d_{NP}} = \frac{100\text{ mm}}{\frac{40}{3}\text{ mm}} = 7.5 \quad (f/7.5)$$
Three examples of systems: the first is a single thin lens with the aperture stop at the lens, so the
stop coincides with the entrance and exit pupils; the second moves the iris in front of the lens so
that it is also the entrance pupil; in the third, the iris is behind the lens and the magnified diameter
of the entrance pupil is the relevant parameter for the focal ratio.
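The three focal ratios can be reproduced with a few lines of arithmetic. This is a minimal sketch using the numerical values of the example; the helper `image_of_stop` is a hypothetical name, and the code simply assumes (as the text does) that the iris is the aperture stop whenever it is present.

```python
from fractions import Fraction as F

def image_of_stop(t, f):
    """Image distance and transverse magnification of an iris imaged by a
    thin lens a distance t away, from 1/t + 1/z' = 1/f."""
    z = t * f / (t - f)
    return z, -z / t

f1, d1, t, d0 = F(100), F(25), F(25), F(10)   # mm, values from the text

# Lens alone: the lens is the stop and both pupils
print("lens only:     f/", float(f1 / d1))

# Iris in front: iris is stop and entrance pupil; exit pupil is its image
z_xp, m_xp = image_of_stop(t, f1)
print("iris in front: f/", float(f1 / d0), "exit pupil at", z_xp, "mm, diameter", d0 * m_xp, "mm")

# Iris behind: the entrance pupil is the image of the iris seen through the lens
z_np, m_np = image_of_stop(t, f1)
d_np = d0 * m_np
print("iris behind:   f/", float(f1 / d_np))
```

The same imaging step serves both pupil calculations because the iris sits the same distance from the lens in the two cases; only the side from which it is viewed changes.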
2.15.3 Example: Exit Pupils of Telescopic Systems
Galilean Telescope
Consider a telescopic system, such as binoculars, composed of an objective lens $L_1$ with diameter $d_1$ and an eyelens $L_2$ with diameter $d_2$, where the two lenses are separated by the sum of their focal lengths. Take the specific example of a Galilean telescope with $f_1 = +200$ mm, $d_1 = 50$ mm, $f_2 = -25$ mm, $d_2 = 25$ mm, and $t = f_1 + f_2 = 175$ mm. We have already seen that the angular magnification of the system is the ratio of the focal lengths of the two lenses:
$$M_\theta = -\frac{f_1}{f_2} = -\frac{+200\text{ mm}}{-25\text{ mm}} = +8$$
To determine which element is the aperture stop for an object at infinity, we need to determine where the ray through the edge of the first lens strikes the second lens. In this case, it strikes well within the lens diameter; the ray height from the first lens is:
$$y = \frac{d_1}{2}\left(1 - \frac{t}{f_1}\right) = 25\text{ mm}\left(1 - \frac{175\text{ mm}}{200\text{ mm}}\right) = \frac{25}{8}\text{ mm} = 3.125\text{ mm} < \frac{d_2}{2}$$
so the first lens is the aperture stop, and therefore also the entrance pupil.
Location of aperture stop for the specified Galilean telescope. Since the ray from infinity that strikes
the edge of the positive lens passes well within the boundary of the negative lens, the aperture stop
is the positive lens for an object at infinity.
The exit pupil is the image of the aperture stop (first lens) seen through the second lens, which
has negative focal length, ensuring that the exit pupil will be virtual. The distance from the stop
to the second lens is:
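$$z_2 = t = f_1 + f_2 = 175\text{ mm}$$
and the image distance from the second lens is:
$$z_2' = \frac{z_2 f_2}{z_2 - f_2} = \frac{175\text{ mm} \cdot (-25\text{ mm})}{175\text{ mm} - (-25\text{ mm})} = -\frac{175}{8}\text{ mm} = -21.875\text{ mm}$$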
Figure 2.1:
The size of the exit pupil is determined from the transverse magnification:
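$$M_T = -\frac{z_2'}{z_2} = -\frac{-\frac{175}{8}\text{ mm}}{175\text{ mm}} = +\frac{1}{8}$$
Since the diameter of the stop is $d_1 = 50$ mm, the diameter of the exit pupil is:
$$d_{XP} = M_T \cdot d_{Stop} = +\frac{1}{8} \cdot 50\text{ mm} = +6.25\text{ mm}$$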
For the Galilean telescope, the exit pupil is virtual (located 21.875 mm behind the eyelens) and
small.
Keplerian Telescope
Now repeat the analysis for a corresponding Keplerian telescope with $f_1 = +200$ mm, $d_1 = 50$ mm, $f_2 = +25$ mm, $d_2 = 25$ mm, $t = f_1 + f_2 = 225$ mm, and angular magnification:
$$M_\theta = -\frac{f_1}{f_2} = -\frac{+200\text{ mm}}{+25\text{ mm}} = -8$$
Again, the ray from an object at infinity that passes through the edge of the first lens has height at the second lens:
$$y = \frac{d_1}{2}\left(1 - \frac{t}{f_1}\right) = 25\text{ mm}\left(1 - \frac{225\text{ mm}}{200\text{ mm}}\right) = -\frac{25}{8}\text{ mm} = -3.125\text{ mm}$$
$$|y| < \frac{d_2}{2}$$
The first element is still the stop and the entrance pupil. The image of the first lens through the
second is the exit pupil; its location and size are determined using the thin-lens imaging equation:
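$$z_2 = t = f_1 + f_2 = 225\text{ mm}$$
$$z_2' = \frac{z_2 f_2}{z_2 - f_2} = \frac{225\text{ mm} \cdot 25\text{ mm}}{225\text{ mm} - 25\text{ mm}} = +\frac{225}{8}\text{ mm} = +28.125\text{ mm}$$
$$M_T = -\frac{z_2'}{z_2} = -\frac{\frac{225}{8}\text{ mm}}{225\text{ mm}} = -\frac{1}{8}$$
$$d_{XP} = d_{Stop} \cdot M_T = 50\text{ mm} \cdot \left(-\frac{1}{8}\right) = -6.25\text{ mm}$$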
The exit pupil is real (outside of the system at a distance of 28.125 mm beyond the eyelens) and
inverted.
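In both of the telescopes just considered, note that the magnitude of the diameter of the exit pupil is the ratio of the focal length of the eyepiece and the focal ratio of the objective lens:
$$d_{XP} = \frac{-f_2}{\left(\frac{f_1}{d_1}\right)} = \frac{d_1}{\left(-\frac{f_1}{f_2}\right)} = \frac{d_1}{M_\theta}$$
$$(d_{XP})_{Galilean} = \frac{50\text{ mm}}{+8} = 6.25\text{ mm}$$
$$(d_{XP})_{Keplerian} = \frac{50\text{ mm}}{-8} = -6.25\text{ mm}$$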
In words, the diameter of the exit pupil is equal to the ratio of the diameter of the entrance pupil
(which is the objective in this case) and the magnifying power; more power means a smaller exit
pupil.
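Both telescope calculations follow the same recipe, which can be sketched as a single function. The function name and return layout are illustrative choices, not notation from the notes; the exit pupil is computed as the image of the objective, so the result is meaningful only when the stop test reports that the objective is the stop.

```python
from fractions import Fraction as F

def telescope_exit_pupil(f1, d1, f2, d2):
    """Stop test and exit pupil for a two-lens afocal telescope (t = f1 + f2)."""
    t = f1 + f2
    y = (d1 / 2) * (1 - t / f1)        # height at lens 2 of the ray through the rim of lens 1
    objective_is_stop = abs(y) < d2 / 2
    z = t * f2 / (t - f2)              # image of lens 1 formed by lens 2
    m_t = -z / t                       # transverse magnification of that image
    return objective_is_stop, z, d1 * m_t

for name, f2 in (("Galilean", F(-25)), ("Keplerian", F(25))):
    f1, d1, d2 = F(200), F(50), F(25)
    is_stop, z_xp, d_xp = telescope_exit_pupil(f1, d1, f2, d2)
    m_angular = -f1 / f2
    # The last two printed numbers should agree: d_XP = d1 / M
    print(name, is_stop, float(z_xp), float(d_xp), float(d1 / m_angular))
```

The sign of the returned diameter carries the orientation of the pupil image: positive (upright) for the Galilean telescope and negative (inverted) for the Keplerian.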
Common binoculars used for birdwatching are listed as $10 \times 50$, which means that the angular magnification (magnifying power) is 10 and the diameter of the entrance pupil (which is that of the objective lens) is 50 mm $\cong$ 2 in. The diameter of the exit pupil is:
$$d_{XP} = \frac{50\text{ mm}}{10} = 5\text{ mm}$$
Until recently, the most common variety of binocular was the $7 \times 50$, which has a magnifying power of 7 and objectives with $d = 50$ mm, so the diameter of the exit pupil is:
$$d_{XP} = \frac{50\text{ mm}}{7} \cong 7\text{ mm}$$
This is a close match to the diameter of the iris of the dark-adapted eye, so these binoculars are a good choice for astronomical viewing; for that reason, $7 \times 50$ binoculars were known as night glasses. When used with the smaller iris diameter of the eye during daytime, much of the diameter of the exit pupil would illuminate the opaque iris and not contribute to the brightness of the image on the retina.
For a formerly common amateur telescope with a mirror objective with $d_1 = 6$ in $\cong 150$ mm and a focal length $f_1 = 48$ in $\cong 1220$ mm, the focal ratio is:
$$f/\# = \frac{48\text{ in}}{6\text{ in}} = 8$$
so the diameter of the exit pupil when viewed through an eyelens with focal length $f_2$ is
$$d_{XP} = \frac{f_2}{f/\#} = \frac{f_2}{8}$$
If the focal length of the eyelens is $f_2 = 25$ mm $\cong 1$ in, then the diameter of the exit pupil is about 3 mm, which is pretty small. If the focal length of the eyelens is $f_2 = 4$ mm $\cong \frac{1}{6}$ in, the magnifying power of the system has magnitude:
$$|M_\theta| = \frac{f_1}{f_2} = \frac{48\text{ in}}{\frac{1}{6}\text{ in}} = 288$$
which is a large number that will impress a naive user. BUT the diameter of the exit pupil is very small
$$d_{XP} = \frac{f_2}{8} = \frac{\frac{1}{6}\text{ in}}{8} = \frac{1}{48}\text{ in} \cong 0.5\text{ mm}$$
so it would be very difficult to see anything through this telescope. This illustrates the flaw in the strategy that was once used often by manufacturers of cheap telescopes intended as gifts for children; the manufacturers would often quote a very large value for the magnifying power that required an eyepiece with a very short focal length and therefore a very small exit pupil. The images were very difficult to see by novices and experienced users alike.
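The exit-pupil figures quoted for the binoculars and for the amateur telescope are easy to tabulate. The sketch below only restates the two relations used above (exit pupil diameter as aperture over magnifying power, and as eyepiece focal length over focal ratio); the function name is an illustrative choice.

```python
def exit_pupil_diameter(objective_diameter, magnifying_power):
    """Exit pupil diameter = entrance pupil diameter / magnifying power (magnitudes)."""
    return objective_diameter / abs(magnifying_power)

# Binoculars specified as (power) x (objective diameter in mm)
for power, aperture_mm in ((10, 50.0), (7, 50.0)):
    print(f"{power} x {aperture_mm:.0f}: {exit_pupil_diameter(aperture_mm, power):.2f} mm")

# 6-inch f/8 reflector from the text, using d_XP = f2 / (f/#)
f_number = 8.0
for f2_mm in (25.0, 4.0):
    print(f"eyepiece {f2_mm:.0f} mm: exit pupil {f2_mm / f_number:.2f} mm")
```

Comparing the printed diameters with the roughly 7 mm iris of the dark-adapted eye shows at a glance which combinations waste light and which produce an exit pupil too small to use comfortably.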
The location of the exit pupil also is important. It is useful to have it placed outside the
imaging system where the eye would be located so that it is feasible to get all of the light through
the pupil into the eye. The distance from the rear vertex of the system to the exit pupil is the eye
relief :
$$V'E' = \text{eye relief}$$
An imaging system with lots of eye relief may be easier to view through, since the location where
the eye is optimally placed is back away from the eyelens. An example of a system that needs a
large eye relief is a rifle scope, where the eyepiece lens will be located far in front of the viewing
eye.
For different object distances, it is possible for the aperture stop to move around, i.e., the element that defines the aperture stop may change with object distance. The locations and sizes of the pupils are determined by applying the ray-optics imaging equation to these objects. To some, the concept of finding the image of a lens may seem confusing, but it is no different from before: just think of the lens as a regular opaque object at its location and find the images through the optics that come after (for the exit pupil) or that came before (entrance pupil).
Which element in a multielement system is the stop depends on the relative sizes of the lenses.
In the first case shown below, the first lens (the objective) is small enough that it acts as the stop
(and thus also the entrance pupil). The image of the objective lens seen through the eyelens is the
exit pupil, and is between the two lenses and very small. Because the exit pupil is small and
remote (located within the optical system), so is the field of view of the Galilean telescope. In
the second example, the smaller eyelens is the stop and also the exit pupil, while the image of the
eyelens seen through the objective is the entrance pupil and is far behind the eyelens and relatively
large.
More Examples of Galilean and Keplerian Telescopes
Consider the two two-lens telescope designs. The Galilean telescope has a positive-power objective
and a negative-power ocular or eyelens. The Keplerian telescope has a positive objective and a
positive eyelens. Assume that the objective is identical in the two cases with f1 = +100 mm and
$d_1 = 30$ mm. The focal lengths of the oculars (eyepieces) are $f_2 = -15$ mm for the Galilean telescope and $f_2 = +15$ mm for the Keplerian telescope, and the diameter is $d_2 = 15$ mm in both cases (these are the approximate dimensions and focal lengths of the lenses in the OSA Optics Discovery Kit). The lenses of a telescope are separated by $f_1 + f_2$ ($f_1 + f_2 = 85$ mm and 115 mm for the Galilean telescope and Keplerian telescope, respectively). We want to locate the stops and pupils. The stop is found by tracing a ray from an object at $\infty$ through the edge of the first element and finding the ray height at the second lens. If this ray height is small enough to pass through the second lens, then the first lens is the stop; if not, then the second lens is the stop.
Galilean telescope for object at $z_1 = +\infty$: (a) the objective lens is the aperture stop and entrance
pupil because it limits the cone of entering rays. The image of the stop seen through the eyelens is
the (very small) exit pupil; (b) the larger objective means that the eyelens is the aperture stop and
the exit pupil. The image of the eyelens seen through the objective is the entrance pupil, and is
behind the eyelens because the object distance to the objective is less than the focal length.
Consider the Galilean telescope first. The ray height at the first lens is the semidiameter of the lens: $\frac{d_1}{2} = 15$ mm; it is not called the radius to avoid confusion with a radius of curvature. From there, the ray height would decrease to 0 mm at a distance of $f_1 = +100$ mm, but it first encounters the negative lens at a distance of $t = +85$ mm. The ray height at this lens is
$$15\text{ mm} \cdot \frac{100\text{ mm} - 85\text{ mm}}{100\text{ mm}} = 2.25\text{ mm}$$
which is much smaller than the lens semidiameter of $\frac{d_2}{2} = 7.5$ mm. Hence the first lens (the objective lens) is the stop.
The entrance pupil is the image of the stop through all of the elements that come before the stop.
In this case, the first lens is also the entrance pupil and its transverse magnification is unity. The
exit pupil is the image of the stop through all elements that come afterwards, which is the negative
lens. The distance to the object is f1 + f2 = 85 mm, so the imaging equation is used to locate the
exit pupil and determine its magnification:
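$$\frac{1}{85\text{ mm}} + \frac{1}{z'} = \frac{1}{f_2} = \frac{1}{-15\text{ mm}}$$
$$z' = \left(-\frac{1}{15\text{ mm}} - \frac{1}{85\text{ mm}}\right)^{-1} = -\frac{51}{4}\text{ mm} = -12.75\text{ mm}$$
$$M_T = -\frac{z'}{z} = -\frac{-12.75\text{ mm}}{85\text{ mm}} = +0.15$$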
The exit pupil is upright, but more important, its distance from the second lens is negative; the exit
pupil is a virtual image and not accessible to the eye. The viewer sees the exit pupil in front of
the eye. This limits the field of view of the Galilean telescope.
Follow the same procedure to determine the stop and locate the pupils and their magnifications
for the Keplerian telescope. The ray height at the first lens for an object located at $\infty$ is again 15 mm. The ray height decreases to 0 mm at the focal point, but then decreases still farther until encountering the ocular lens at a distance of $f_1 + f_2 = 115$ mm. The ray height $h$ at this lens is determined from similar triangles:
$$\frac{15\text{ mm}}{h} = -\frac{100\text{ mm}}{15\text{ mm}} \implies h = -2.25\text{ mm}$$
So the first lens is the stop and entrance pupil (with unit magnification) in this case too. The
distance from the stop to the second lens is f1 + f2 = 115 mm, so the imaging equation for locating
the exit pupil and determining its magnification is:
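$$\frac{1}{115\text{ mm}} + \frac{1}{z'} = \frac{1}{f_2} = \frac{1}{+15\text{ mm}}$$
$$z' = \left(+\frac{1}{15\text{ mm}} - \frac{1}{115\text{ mm}}\right)^{-1} = +\frac{69}{4}\text{ mm} = +17.25\text{ mm}$$
$$M_T = -\frac{z'}{z} = -\frac{+17.25\text{ mm}}{115\text{ mm}} = -0.15$$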
The exit pupil is a real image of the aperture stop in the Keplerian telescope; we can place our eye at it and see a larger field of view.
Vignetting
The location of the aperture stop is determined for an object located on the optical axis. If the
object is off the axis, the cone of rays that get through the system is skewed or tilted. If
other elements in the system (lenses or diaphragms) constrain parts of the skewed cone of rays,
then the cone of rays is truncated and the brightness of the image is reduced; this phenomenon is
vignetting.
Example of vignetting; the brightness of the scene at the edges is reduced due to the presence of an
out-of-focus aperture in the system.
2.15.4 Pupils and Diffraction

The concept of pupils may be combined with diffraction to evaluate the effective focal ratio (f/number) of the imaging system. For a single thin lens, the diffraction spot is determined by the size and shape specified by the pupil function $p[x, y]$ or $p(r)$ and the distance to the image. If the lens has a circular pupil of diameter $d_0$, the pupil function
$$p(r) = CYL\left(\frac{r}{d_0}\right)$$
determines the extent of the ray cone that enters the system. We derived the resulting diffraction pattern, which is proportional to a scaled circularly symmetric sombrero function. This function is the analogue of the SINC function, is constructed from the first-order Bessel function, and therefore is sometimes called the besinc function:
$$h(r) \propto \frac{\pi d_0^2}{4} \, SOMB\left(\frac{r}{\left(\frac{\lambda_0 z_2}{d_0}\right)}\right)$$
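If the object distance is large, then the image distance $z_2 \cong f$ and the amplitude of the impulse response is:
$$h(r) \propto SOMB\left(\frac{r}{\left(\frac{\lambda_0 f}{d_0}\right)}\right)$$
The diameter of the Airy disk is approximately:
$$D_0 = 2.44\,\lambda_0\,\frac{f}{d_0} = 2.44\,\lambda_0 \cdot f/\#$$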
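To get a feel for the size of the Airy disk, the expression can be evaluated for the focal ratios found in the lens-aperture examples. This is a small sketch; the wavelength of 0.55 micrometers (green light) is an assumed illustrative value that does not appear in the notes, and the function name is hypothetical.

```python
def airy_disk_diameter(wavelength, f_number):
    """Approximate Airy disk diameter D0 = 2.44 * wavelength * f/#."""
    return 2.44 * wavelength * f_number

wavelength_um = 0.55   # assumed wavelength (green light), in micrometers
for n in (4.0, 7.5, 10.0):
    print(f"f/{n}: D0 = {airy_disk_diameter(wavelength_um, n):.2f} um")
```

The diameter grows linearly with the focal ratio, so stopping a lens down from f/4 to f/10 enlarges the diffraction spot by a factor of 2.5.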
2.15.5 Field Stop

As suggested by its name, a field stop limits the field of view of the system. It may be as simple as
the finite size of the sensor (e.g., a rectangular piece of photosensitive emulsion or a CCD sensor),
or it may be placed at an intermediate image within the system or even at the object itself. Images
of the field stop are located at the same locations as intermediate images of the object.
2.16 Marginal and Chief Rays

Many important characteristics of an optical system, including the possible presence of vignetting,
are determined by the trace of two specific rays through the imaging system. For an object $O$ with image $O'$, aperture stop $S$, entrance pupil $E$, and exit pupil $E'$, the marginal ray traces from the center of $O$ to the edge of $S$ and back to the center of $O'$. The chief ray (or principal ray) is traced from the edge of $O$ (or edge of the field of view) through the center of $S$ to the edge of $O'$. Since $E$ and $E'$ are images of the stop $S$, the marginal and chief rays also go through the edges and centers of the pupils, respectively.
The marginal ray is specified by its ray heights $y$ and ray angle $u$ at different points on the optical axis; the corresponding notation for the chief ray includes overscores or bars: $\bar{y}$, $\bar{u}$. Heights and angles of the marginal ray after refraction at a surface are primed, e.g., $y'$ and $u'$. The corresponding quantities for the chief ray are $\bar{y}'$ and $\bar{u}'$.
From the definition of the marginal ray, an object or image is located at any location (value of $z$) where $y = 0$. Similarly, the aperture stop, entrance pupil, and exit pupil are located at values of $z$ where $\bar{y} = 0$. An image exists wherever the marginal ray crosses the axis and the aperture stop or pupils are located wherever the chief ray crosses the axis. Complete specification of these two rays is sufficient to characterize the location of object and image(s), the field of view, and the magnifications.
The chief ray is the axis of the unvignetted light beam from a point at the edge of the field of view.
The radius of the unvignetted light beam (or perhaps more appropriately called the semidiameter
to avoid potential confusion with the radius of curvature) is the sum of the heights of the marginal
and chief rays:
$$\frac{d_{unvignetted}}{2} = y + \bar{y} \quad \text{at any location } z$$
Figure 2.2: The marginal and chief rays for a two-element imaging system where the second element
is the stop. The marginal ray comes from the center of the object $O$, grazes the edge of the stop, and passes through the center of the image $O'$. The chief ray travels from the edge of the object through the center of the stop to the edge of the image.
Because paraxial calculations are linear, it is customary to normalize the ray heights and angles for the calculation and then scale the results to satisfy the conditions of the specific system. For example, we generally select the chief ray height $\bar{y} = 1$ and the marginal ray angle $u = 1$ at the object.
Clearly the choice of unit ray angle (in radians) is inconsistent with the paraxial approximation, but
this is just a computational convenience because all quantities are scalable.
2.16.1 Telecentricity
If the aperture stop is located such that the entrance and/or exit pupils are at infinity, then the
system is telecentric. One way to do this is to place the aperture stop at one of the focal points of
the system, which means that the corresponding pupil is at the same location and the other pupil
is at infinity. As shown in the figure, if the stop is located at the object-space focal point of a single
thin lens, then the entrance pupil is at the same location and the exit pupil is at infinity in image
space; this is an image-space telecentric system.
Telecentric system consisting of a single thin lens with the aperture stop placed at the object-space focal point,
showing the chief ray (solid blue) and marginal ray (red). The chief ray intersects the optical axis at that
focal point and so emerges from the lens parallel to the optical axis. The dashed blue lines parallel to
the chief ray intersect at the image. The defocused image is the same height as the focused image.
If the stop is located at the image-space focal plane, then the entrance pupil is at infinity, forming
an object-space telecentric system. If either the entrance or exit pupil is at infinity, then the chief
ray must be parallel to the optical axis on that side of the imaging system. This means that the
system transverse magnification will be constant even if the image is blurry. Put another way, a
blurred image has the correct magnification.
A double telecentric system is an afocal system (telescope) with the stop located at the common
focal plane of the two lenses. This means that both the entrance and exit pupils are at infinity. The
fact that the magnification of the system does not depend on accuracy of focusing makes telecentric
systems particularly useful for metrology.
Double telecentric system with the aperture stop at the common focal point of the two lenses. The
marginal ray is shown in red and the chief ray in solid blue.
2.16.2 Marginal and Chief Rays for Telescopes

The marginal ray of an afocal system used to image an object at infinity travels parallel to the
optical axis before the first lens and after the last ($u = 0$, $u' = 0$). The relative sizes of the two
lenses determine which is the aperture stop; for a Galilean telescope, the aperture stop is usually
the negative ocular lens.
MORE TO COME
Chapter 3
Tracing Rays Through Optical Systems
The imaging equation(s) become quite complicated in systems with more than a very few lenses.
However, we can determine the effect of the optical system by ray tracing, where the action on
two (or more) rays is determined. Ray tracing may be paraxial or exact. Historically, graphical,
matrix, or worksheet ray tracing were commonly used in optical design, but most ray tracing is now
implemented in computer software, so exact solutions are more commonly computed than
before.
3.1 Paraxial Ray Tracing Equations

Consider the schematic of a two-element optical system made of thick lenses, so the vertices and
principal planes of individual lenses do not coincide at the same points.
Schematic of ray tracing of a provisional marginal ray from an object at an infinite distance. The
system has two elements and the locations $H_n$ and $H_n'$ are the principal planes of the $n$th element.
The ray height at the $n$th element is $y_n$ and the ray angle during transfer between elements $n - 1$
and $n$ is $u_n$.
The two elements are represented by their two principal planes, which are the planes of unit
magnification. The refractive power of the first element changes the ray angle of the input ray. In
the example shown, the input ray angle u1 = 0 radians, i.e., the ray is parallel to the optical axis.
The height of this ray above the axis at the object-space principal plane $H_1$ is $y_1$ units. The ray
emerges from the principal plane $H_1'$ at the same height $y_1$ but with a new ray angle $u_2$. The ray
transfers to the second element through the distance $t_2$ in the index $n_2$ and has ray height $y_2$ at
principal plane $H_2$. The ray emerges from the principal plane $H_2'$ at the same height but with a new angle
$u_3$.
Figure 3.1: Refraction of a paraxial ray at a surface with radius of curvature $R$ between media with
refractive indices $n$ and $n'$. The ray height and angle at the surface are $y$ and $u$, respectively. The
angle of the ray measured at the center of curvature is $\alpha$. The height and angle immediately after
refraction are $y$ and $u'$. The object and image distances are $s$ and $s'$ (which are now called $z$ and
$z'$ in the text).
3.1.1 Paraxial Refraction
Consider refraction of a paraxial ray emitted from the object O at a surface with radius of curvature
R. For a paraxial ray, the surface may be drawn as vertical. The height of the ray at the surface
is y.
From the drawing, the incoming ray angle $u$ measured from the optical axis is:
\[ u = \tan^{-1}\left[\frac{y}{z}\right] \cong \frac{y}{z} > 0 \]
and the corresponding equation for the outgoing ray measured from the optical axis is:
\[ u' = \tan^{-1}\left[\frac{y}{z'}\right] \cong \frac{y}{z'} > 0 \]
The angle subtended at the center of curvature by the ray height at the refractive surface is:
\[ \alpha = \tan^{-1}\left[-\frac{y}{R}\right] \cong -\frac{y}{R} \]
The incident and refracted angles measured from the surface normal at height $y$ are the angles of incidence
and refraction. From the drawing:
\[ i = u - \alpha \]
\[ i' = u' - \alpha \]
Now apply Snell's law in the paraxial approximation:
\[ n \sin[i] = n' \sin[i'] \implies n\,i \cong n'\,i' \]
\[ n\,(u - \alpha) = n'\,(u' - \alpha) \]
\[ \implies n'u' = nu - n\alpha + n'\alpha = nu + (n' - n)\,\alpha \]
\[ = nu + (n' - n)\left(-\frac{y}{R}\right) \]
\[ = nu - y\,\frac{(n' - n)}{R} \]
\[ \equiv nu - y\phi \]
\[ n'u' = nu - y\phi \]
The paraxial refraction equation in terms of the incident angle $u$, refracted angle $u'$, ray height $y$,
surface power $\phi = \frac{1}{f}$, and indices of refraction $n$ and $n'$ is:
\[ \frac{n'u' - nu}{y} = -\phi \]
3.1.2 Paraxial Transfer
Paraxial transfer from one surface to the next in a medium with refractive index n0 .
The transfer equation determines the ray height $y'$ at the next surface given the initial ray height
$y$, the physical distance $t'$, and the ray angle $u'$ in the medium with index $n'$. From the drawing, we
have:
\[ y' = y + t'u' \]
\[ y' = y + \frac{t'}{n'}\,(n'u') \]
where the substitution was made to put the ray angle in the same form $n'u'$ that appeared in the
refraction equation. The distance $\frac{t'}{n'}$ is called the reduced thickness (note the potential for
confusing the reduced thickness $\frac{t'}{n'}$ with the optical path length $n't'$).
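The refraction and transfer equations translate directly into two short functions. The following is a minimal sketch, not a definitive implementation: the function names `refract` and `transfer` are hypothetical, the ray is carried as the pair (height, reduced angle), and the numerical values are borrowed from the cornea of the eye model used later in this chapter.

```python
def refract(y, nu, phi):
    """Paraxial refraction: height unchanged, reduced angle n'u' = nu - y*phi."""
    return y, nu - y * phi


def transfer(y, nu, t, n):
    """Paraxial transfer: reduced angle unchanged, height y' = y + (t/n)*(n'u')."""
    return y + (t / n) * nu, nu


# Assumed illustration: surface with R = +7.8 mm between n = 1.0 and n' = 1.336,
# followed by a 3.6 mm transfer in the medium with index 1.336.
phi = (1.336 - 1.0) / 7.8            # surface power in 1/mm
y, nu = refract(1.0, 0.0, phi)       # input ray parallel to the axis at height 1 mm
y, nu = transfer(y, nu, 3.6, 1.336)  # move to the next surface
print("n'u' = %.6f rad, y' = %.6f mm" % (nu, y))
```

Carrying the reduced angle rather than the angle itself means that neither function needs to divide by the index after refraction.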
3.1.3 Linearity of the Paraxial Refraction and Transfer Equations
Note that both the paraxial refraction and transfer equations are linear in the height and angle,
i.e., neither includes any operations involving squares or nonlinear functions (such as sine, tangent,
or logarithm). Among other things, this means that they may be scaled by direct multiplication
to obtain other equivalent rays, e.g., to match the marginal ray height to the semidiameter of the
aperture stop or the chief ray angle to the semidiameter of the field stop. For example, the output
angle may be scaled by scaling the input ray angle and the height by a constant factor $\beta$:
\[ \beta\,(nu - y\phi) = \beta\,(nu) - (\beta y)\,\phi = \beta\,(n'u') \]
We will often take advantage of this linear scaling property to find the exact marginal
and chief rays from their provisional counterparts.
3.1.4 Paraxial Ray Tracing
To characterize the paraxial properties of a system, two provisional rays are traced:
1. Initial height of marginal ray at first surface: $y = 1.0$, initial marginal ray angle $nu = 0$;
2. Initial height of chief ray at first surface: $\bar{y} = 0.0$, initial chief ray angle $n\bar{u} = 1$.
We have already named these rays; the first is the provisional marginal ray that intersects the
optical axis at the object (and thus also at every image of the object). The second ray (distinguished
by the overscore) is called the provisional chief (or principal) ray and travels from the edge of the
object to the edge of the field of view through the center of the stop (and thus through the center of
the pupils, which are images of the stop). Since the paraxial ray tracing equations are linear, these
provisional rays may be scaled to the parameters of the system.
The process of ray tracing is perhaps best introduced by example. Consider a two-element
three-surface system. The first surface is the cornea, with a radius of curvature in the model of
$R_1 = +7.8$ mm. The aqueous humor between the cornea and the lens has a thickness in the
model of 3.6 mm and a refractive index of $n_2 = 1.336$. The surfaces of the lens have radii of curvature
$R_2 = +10$ mm and $R_3 = -6$ mm, and the lens has a thickness of 3.6 mm and a refractive index of $n_3 = 1.413$. The
vitreous humor between the lens and the retina has the same refractive index, $n_4 = 1.336$, as the
aqueous humor.
Marginal and chief rays traced through the three-surface optical system.
The refraction at the first surface changes the angle but not the height of a ray from the object.
If the incident ray angle is 0 radians, then the new ray angle for the provisional marginal ray is:

\[ (n'u')_1 = (nu)_1 - y_1\,[\text{mm}] \cdot \phi_1\,[\text{mm}^{-1}] \]
\[ = 0 - (1.0)(+0.043077) \]
\[ = -0.043077 \text{ radian} \]
Note that we are retaining 6 decimal places in this calculation to ensure the best result at the end.
We will then round the value to a more reasonable precision.
The transfer equation for the provisional marginal ray between the first and second surfaces
changes the height of the ray but not the angle. The height at the second surface is:
\[ y_1' = y_1 + \left(\frac{t'}{n'}\right)_1 (n'u')_1 \;[\text{mm}] \]
\[ = 1 + \frac{3.6}{1.336}\,(-0.043077) = +0.883924 \text{ mm} \]
Thus the ray exits the first surface at the reduced angle $n'u' \cong -0.04$ radians and arrives at the
second surface at height $y' \cong +0.88$ units. The corresponding equations for the chief ray at the first
surface are:
\[ (n'\bar{u}')_1 = (n\bar{u})_1 - \bar{y}_1\,\phi_1 \]
\[ = 1 - (0.0)(+0.043077) \]
\[ = 1 \text{ radian} \]
\[ \bar{y}_1' = \bar{y}_1 + \left(\frac{t'}{n'}\right)_1 (n'\bar{u}')_1 \]
\[ = 0 + \frac{3.6}{1.336}\,(1) = +2.694611 \text{ mm} \cong 2.695 \text{ mm} \]
Since the provisional chief ray passes through the vertex of the first surface (where its height is zero), its angle did not change. The
height of the chief ray at the second surface is proportional to the ray angle.
Ray-Tracing Table
The equations may be evaluated in sequence to compute the rays through the system. These are
presented in the table. Each column in the table represents a surface in the system and the primed
quantities refer to distances and angles following the surface. In words, the entries t' are the
distances from the surface in that column to the next surface.

Parameter         Initial   Surface 1        Surface 2        Surface 3        Image surface
R                           +7.8 mm          +10.0 mm         -6.0 mm
t'                          3.6 mm           3.6 mm
n'                1.0       1.336            1.413            1.336
φ = (n' - n)/R              0.043077 mm^-1   0.007700 mm^-1   0.012833 mm^-1
t'/n'                       2.694611 mm      2.547771 mm      12.699 mm
Rays
y                 1 mm      1 mm             0.883924 mm      0.756833 mm      0 mm
n'u'              0         -0.043077 rad    -0.049883 rad    -0.059596 rad    -0.059596 rad
ȳ                           0 mm             2.694611 mm      5.189519 mm      16.779317 mm
n'ū'              1 rad     1 rad            0.979251 rad     0.912654 rad
The ray trace indicates that the provisional marginal ray emerges from the last surface with height
and angle
\[ \begin{bmatrix} y \\ n'u' \end{bmatrix} = \begin{bmatrix} 0.756833 \text{ mm} \\ -0.059596 \text{ radians} \end{bmatrix} \]
These are used to calculate the reduced distance to the image location (where the marginal ray height is 0):
\[ y' = 0 = y + \frac{t'}{n'}\,(n'u') \]
\[ 0 = (+0.756833) + \frac{t'}{n'}\,(-0.059596) \]
\[ \implies \frac{t'}{n'} = \frac{+0.756833}{0.059596} = +12.699 \text{ mm} \]
This is the reduced distance in the image medium with index $n_4$; the physical distance $t'$ is:
\[ t' = \frac{+0.756833}{0.059596} \text{ mm} \cdot n' = 12.699 \cdot 1.336 = 16.966 \text{ mm} \]
The height and angle of the provisional chief ray at the image location are $\bar{y} = 16.78$ mm and
$n'\bar{u}' \cong 0.91$ radians, respectively, which may be scaled to the size of a known sensor to determine
the field of view.
This particular system is often used as a model for the human eye with the lens relaxed to view
objects at $\infty$. The first surface represents the cornea of the eye, while the other two surfaces are
the front and back of the lens. Note that the power of the cornea ($0.043077$ mm$^{-1}$ $\cong 43$ diopters) is
considerably larger than the powers of the lens surfaces (7.7 diopters and 12.8 diopters, respectively).
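The whole worksheet can be reproduced by looping the refraction and transfer equations over the surfaces. The sketch below is illustrative only: the function name `trace` and the tuple layout (R, n, n', t') are assumptions made for this example, while the radii, indices, and thicknesses are those of the eye model above.

```python
def trace(surfaces, y, nu):
    """Trace one paraxial ray through a list of (R, n, n_prime, t_after) surfaces.

    Returns the pair (y, n'u') immediately after refraction at each surface.
    """
    history = []
    for R, n, n_prime, t_after in surfaces:
        nu = nu - y * (n_prime - n) / R   # refraction: n'u' = nu - y*phi
        history.append((y, nu))
        y = y + (t_after / n_prime) * nu  # transfer to the next surface
    return history


# Eye model (lengths in mm); the transfer after the last surface is left at zero
# because the image distance is what we want to solve for.
eye = [
    (7.8, 1.0, 1.336, 3.6),
    (10.0, 1.336, 1.413, 3.6),
    (-6.0, 1.413, 1.336, 0.0),
]

marginal = trace(eye, 1.0, 0.0)  # provisional marginal ray: y = 1, nu = 0
chief = trace(eye, 0.0, 1.0)     # provisional chief ray: y = 0, nu = 1

y3, nu3 = marginal[-1]
reduced_image_distance = -y3 / nu3               # where the marginal ray height is zero
image_distance = reduced_image_distance * 1.336  # physical distance in the vitreous humor

for row in marginal + chief:
    print("y = %.6f mm, n'u' = %.6f rad" % row)
print("image distance = %.3f mm" % image_distance)
```

The code keeps full floating-point precision throughout, so its last digits can differ slightly from the table, which was built from values rounded to six decimal places.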
3.2 Matrix Formulation of Paraxial Ray Tracing

The same linear paraxial ray tracing equations may be conveniently implemented as matrices acting
on ray vectors for the marginal and chief rays whose components are the height and angle. The ray
vectors may be defined as:
\[ \text{paraxial marginal ray vector: } \begin{bmatrix} y \\ nu \end{bmatrix} \]
\[ \text{paraxial chief ray vector: } \begin{bmatrix} \bar{y} \\ n\bar{u} \end{bmatrix} \]
Note that there is nothing magical about the convention for the ordering of $y$ and $nu$ (i.e., which goes
on top of the vector); this is the convention used by Roland Shack at the Optical Sciences Center
at the University of Arizona, but Willem Brouwer's book Matrix Methods in Optical Instrument
Design uses the opposite order. The choice of convention determines the form of the
system matrix, but the two choices are equivalent.
In this notation, the two column vectors that represent the marginal and chief rays may be
combined to form a ray matrix $L$:
\[ L \equiv \begin{bmatrix} y & \bar{y} \\ nu & n\bar{u} \end{bmatrix} \]
which may be evaluated at any point in the system. The determinant of this ray matrix is:
\[ \det[L] = y\,(n\bar{u}) - (nu)\,\bar{y} \equiv \aleph \]
which we shall show to be a constant, the so-called Lagrange invariant. In words, the Lagrange
invariant is the product of the chief ray height and marginal ray angle subtracted from the product
of the marginal ray height and chief ray angle. We denote it by the symbol $\aleph$ (aleph, chosen here
for the simple reason that it is distinctive). We shall see that $\aleph$ is unaffected by both refraction
and transfer, and therefore is invariant as we progress through different locations in the system.
3.2.1 Refraction Matrix
Given the ray vectors or the ray matrix, we can now define operators for refraction and transfer.
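Recall that paraxial refraction of a marginal ray and of a chief ray at a surface with power $\phi$ changes
the ray angles but not the heights (at the surfaces):
\[ n'u' = nu - y\phi \quad \text{for marginal ray} \]
\[ n'\bar{u}' = n\bar{u} - \bar{y}\phi \quad \text{for chief ray} \]
The refraction process may be written as a matrix $R$; its product with the ray matrix has
the same ray heights and different angles:
\[ R \begin{bmatrix} y & \bar{y} \\ nu & n\bar{u} \end{bmatrix} = \begin{bmatrix} y & \bar{y} \\ n'u' & n'\bar{u}' \end{bmatrix} \]
\[ R = \begin{bmatrix} a & c \\ b & d \end{bmatrix} \]
where we need to evaluate the four values $a$ through $d$. Consider the action of the refraction matrix on the
marginal ray:
\[ R \begin{bmatrix} y \\ nu \end{bmatrix} = \begin{bmatrix} a & c \\ b & d \end{bmatrix} \begin{bmatrix} y \\ nu \end{bmatrix} = \begin{bmatrix} y \\ n'u' \end{bmatrix} \]
\[ ay + c\,(nu) = y \implies a = 1,\ c = 0 \]
\[ by + d\,(nu) = n'u' = nu - y\phi \implies b = -\phi,\ d = 1 \]
Substitute these values to see the form of the refraction matrix:
\[ R = \begin{bmatrix} 1 & 0 \\ -\phi & 1 \end{bmatrix} \]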
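The determinant of the refraction matrix is:
\[ \det R = \det \begin{bmatrix} 1 & 0 \\ -\phi & 1 \end{bmatrix} = (1)(1) - (-\phi)(0) = 1 \]
The action of a refraction matrix $R$ on a ray matrix $L$ is:
\[ R\,L = L' \]
\[ \begin{bmatrix} 1 & 0 \\ -\phi & 1 \end{bmatrix} \begin{bmatrix} y & \bar{y} \\ nu & n\bar{u} \end{bmatrix} = \begin{bmatrix} y' & \bar{y}' \\ n'u' & n'\bar{u}' \end{bmatrix} = \begin{bmatrix} y & \bar{y} \\ nu - y\phi & n\bar{u} - \bar{y}\phi \end{bmatrix} \]
The determinant of the ray matrix after refraction is:
\[ \det L' = y\,(n\bar{u} - \bar{y}\phi) - \bar{y}\,(nu - y\phi) \]
\[ = y\,n\bar{u} - y\bar{y}\phi - \bar{y}\,nu + \bar{y}y\phi \]
\[ = y\,n\bar{u} - \bar{y}\,nu = \aleph = \det[L] \]
which confirms that the Lagrange invariant is not affected by refraction.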
3.2.2 Ray Transfer Matrix

The transfer of the marginal ray from one surface to the next within the medium with index $n'$ is
\[ y' = y + \frac{t'}{n'}\,(n'u') \]
which also may be written as the product of a transfer matrix $T$ with the marginal ray vector:
\[ T \begin{bmatrix} y \\ n'u' \end{bmatrix} = \begin{bmatrix} y + \frac{t'}{n'}\,(n'u') \\ n'u' \end{bmatrix} = \begin{bmatrix} 1 & \frac{t'}{n'} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} y \\ n'u' \end{bmatrix} = \begin{bmatrix} y' \\ n'u' \end{bmatrix} \]
so the determinant of the transfer matrix also is 1:
\[ \det \begin{bmatrix} 1 & \frac{t'}{n'} \\ 0 & 1 \end{bmatrix} = (1)(1) - (0)\left(\frac{t'}{n'}\right) = 1 \]
The action of the transfer matrix $T$ on the ray matrix $L$ is:
\[ L' = T\,L = \begin{bmatrix} 1 & \frac{t'}{n'} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} y & \bar{y} \\ n'u' & n'\bar{u}' \end{bmatrix} = \begin{bmatrix} y + \frac{t'}{n'}\,n'u' & \bar{y} + \frac{t'}{n'}\,n'\bar{u}' \\ n'u' & n'\bar{u}' \end{bmatrix} \]
and the determinant of the ray matrix after the transfer operation is:
\[ \det[L'] = \det[T\,L] \]
\[ = \left(y + \frac{t'}{n'}\,n'u'\right)(n'\bar{u}') - \left(\bar{y} + \frac{t'}{n'}\,n'\bar{u}'\right)(n'u') \]
\[ = y\,n'\bar{u}' + \frac{t'}{n'}\,(n'u')(n'\bar{u}') - \bar{y}\,n'u' - \frac{t'}{n'}\,(n'\bar{u}')(n'u') \]
\[ = y\,n'\bar{u}' - \bar{y}\,n'u' = \aleph = \det[L] \]
so the determinants of the ray matrix before and after transfer are also identically the Lagrange
invariant $\aleph$; in other words, neither the refraction nor the transfer matrix has any effect on the
determinant of a ray matrix, so the Lagrange invariant is preserved by refraction or transfer (hence
its name!).
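The invariance of the determinant is easy to confirm numerically. The sketch below is a hedged illustration with hypothetical helper names (`matmul`, `det`, `refraction`, `transfer`); it uses exact rational arithmetic, takes the provisional rays $y = 1$, $nu = 0$ and $\bar{y} = 0$, $n\bar{u} = 1$ as the columns of the ray matrix, and borrows two thin lenses ($f_1 = 100$ mm, $f_2 = 50$ mm, separated by 75 mm in air) as an assumed test system.

```python
from fractions import Fraction as F


def matmul(P, Q):
    """Product of two 2x2 matrices stored as ((a, b), (c, d))."""
    return tuple(
        tuple(sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def det(P):
    """Determinant of a 2x2 matrix."""
    return P[0][0] * P[1][1] - P[0][1] * P[1][0]


def refraction(phi):
    """Refraction matrix for a surface or thin lens of power phi."""
    return ((1, 0), (-phi, 1))


def transfer(reduced_thickness):
    """Transfer matrix for the reduced thickness t'/n'."""
    return ((1, reduced_thickness), (0, 1))


# Ray matrix: columns are the marginal and chief ray vectors (y, nu).
L = ((F(1), F(0)), (F(0), F(1)))
invariant = det(L)

# Apply refraction, transfer, refraction and check the determinant each time.
for op in (refraction(F(1, 100)), transfer(F(75)), refraction(F(1, 50))):
    L = matmul(op, L)
    assert det(L) == invariant

print(L, invariant)
```

Because the starting ray matrix here is the identity, the final ray matrix is also the product of the three operators.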
Ray Transfer Matrix for an Optical System
The refraction and transfer matrices may be combined in sequence to model a complete system. If
we start with the marginal ray vector at the input object, the first operation is transfer to the first
surface. The next is refraction by that surface, transfer to the next, and so forth until a final transfer
to the output image:
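\[ T_n R_n \cdots T_2 R_2\, T_1 R_1\, T_0\, L_{\text{object}} = L_{\text{image}} \]
If the initial ray matrix is located at the object (as usual), the marginal ray height is zero, so the
ray matrix at the object and any images has the form:
\[ L_{\text{object}} = \begin{bmatrix} 0 & \bar{y}_{in} \\ (nu)_{in} & (n\bar{u})_{in} \end{bmatrix} \]
\[ L_{\text{image}} = \begin{bmatrix} 0 & \bar{y}_{out} \\ (nu)_{out} & (n\bar{u})_{out} \end{bmatrix} \]
so the system from object to image is:
\[ S \equiv T_n R_n \cdots T_2 R_2\, T_1 R_1\, T_0 \]
\[ S\, L_{\text{object}} = L_{\text{image}} \]
\[ (T_n R_n \cdots T_2 R_2\, T_1 R_1\, T_0) \begin{bmatrix} 0 & \bar{y}_{in} \\ (nu)_{in} & (n\bar{u})_{in} \end{bmatrix} = \begin{bmatrix} 0 & \bar{y}_{out} \\ (nu)_{out} & (n\bar{u})_{out} \end{bmatrix} \]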
Note that the individual refraction and transfer matrices are sequenced in inverse order, i.e., the
last matrix is the first in the sequence for the system. The transfer matrix T 0 acts on the input ray
matrix, so it must appear on the right.
Ray Matrix for Provisional Marginal and Chief Rays
The system is characterized by using provisional marginal and chief rays located at the object. The
linearity of the computations ensures that the rays may be scaled subsequently to satisfy other system
constraints, such as the diameter of the stop. The provisional marginal ray at the object has height
$y = 0$ and ray angle $nu = +1$, while the provisional chief ray at the object has height $\bar{y} = +1$ and
angle $n\bar{u} = 0$. Thus the provisional ray matrix at the object is:
\[ L_0 = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \]
3.2.3 Vertex-to-Vertex Matrix for System
We can construct a matrix that represents JUST the optical system by excluding the input ray
matrix, the transfer matrix from object to object-space vertex, the transfer from image-space vertex
to image, and the output ray matrix. This subset is the vertex-to-vertex matrix $M_{VV'}$ of the
system and is a complete specification of the paraxial properties of the system. The general form
for the matrix is:
\[ M_{VV'} = (R_n \cdots T_2 R_2\, T_1 R_1) = \begin{bmatrix} A & B \\ C & D \end{bmatrix} \]
where $A$, $B$, $C$, $D$ are factors to be determined from the various refractions and transfers for a specific
system. The entries $A$ and $D$ in the matrix are pure numbers (without units), while $B$ and $C$
have dimensions of length and reciprocal length, respectively. From matrix algebra, it is possible to
show that the determinant of a matrix product is the product of the determinants. We already
know that the determinant of the matrix for any transfer or refraction is unity, which establishes
a constraint on the vertex-to-vertex matrix:
\[ \det[M_{VV'}] = \det R_n \cdot \det T_{n-1} \cdots \det R_2 \cdot \det T_1 \cdot \det R_1 \]
\[ = 1 \cdot 1 \cdots 1 \cdot 1 = 1 \]
\[ \implies \det[M_{VV'}] = 1 \]
\[ \implies AD - BC = 1 \]
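Consider a simple example of the matrix $M_{VV'}$ for a two-lens system with powers $\phi_1 = (f_1)^{-1}$
and $\phi_2 = (f_2)^{-1}$ separated by $t$. The product of the two refraction matrices and the transfer matrix
is:
\[ M_{VV'} = R_2\, T_1\, R_1 \]
\[ = \begin{bmatrix} 1 & 0 \\ -\phi_2 & 1 \end{bmatrix} \begin{bmatrix} 1 & t \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ -\phi_1 & 1 \end{bmatrix} \]
\[ = \begin{bmatrix} 1 - \phi_1 t & t \\ -(\phi_1 + \phi_2 - \phi_1 \phi_2 t) & 1 - \phi_2 t \end{bmatrix} \]
\[ M_{VV'} = \begin{bmatrix} 1 - \phi_1 t & t \\ -\phi_e & 1 - \phi_2 t \end{bmatrix} \]
where the known expression for the system power
\[ \frac{1}{f_e} = \frac{1}{f_1} + \frac{1}{f_2} - \frac{t}{f_1 f_2} \implies \phi_e = \phi_1 + \phi_2 - \phi_1 \phi_2 t \]
has been substituted in the last expression. It is easy to confirm that the determinant of this system
matrix is unity.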
We have four equations in the four unknowns $A$, $B$, $C$, $D$, which may be combined to find useful
system metrics in terms of the elements in the vertex-to-vertex matrix $M_{VV'}$ (here $m$ is the transverse magnification):
effective focal length of system: $f_e = \frac{1}{\phi_e} = -\frac{1}{C}$
front focal length: $FFL = \frac{FV}{n} = -\frac{D}{C}$
back focal length: $BFL = \frac{V'F'}{n'} = -\frac{A}{C}$
distance from front vertex to object-space principal point: $\frac{VH}{n} = \frac{D - 1}{C}$
distance from image-space principal point to rear vertex: $\frac{H'V'}{n'} = \frac{A - 1}{C}$
distance from rear vertex to image (if object distance $t_1$ is known): $\frac{V'O'}{n'} = \frac{t_2}{n'} = \frac{m - A}{C} = -\frac{B + A t_1}{D + C t_1}$
distance from object to front vertex (if image distance $t_2$ is known): $\frac{OV}{n} = \frac{t_1}{n} = \frac{\frac{1}{m} - D}{C} = -\frac{B + D t_2}{A + C t_2}$
When evaluating matrices, note that you need to retain plenty of significant figures in the
calculation (at least 6) to ensure that the derived values are sufficiently accurate.
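These relations are convenient to collect in one helper. The following is a sketch under stated assumptions rather than a reference implementation: the function name `gaussian_properties` and its dictionary keys are hypothetical, the signs are chosen so that $FV$ and $V'F'$ are positive when the focal points lie outside the system, and $H'V'$ is taken as $(A - 1)/C$. The sample system is two thin lenses in air with $f_1 = 100$ mm, $f_2 = 50$ mm, and $t = 75$ mm.

```python
from fractions import Fraction as F


def gaussian_properties(A, B, C, D, n=1, n_prime=1):
    """Hypothetical helper: system metrics from the vertex-to-vertex matrix."""
    if A * D - B * C != 1:
        raise ValueError("determinant of the vertex-to-vertex matrix must be 1")
    return {
        "efl": -1 / C,                 # effective focal length f_e
        "ffl": -n * D / C,             # FV, front focal point to front vertex
        "bfl": -n_prime * A / C,       # V'F', rear vertex to rear focal point
        "VH": n * (D - 1) / C,         # front vertex to object-space principal point
        "HpVp": n_prime * (A - 1) / C, # image-space principal point to rear vertex
    }


# Two thin lenses in air: A = 1 - phi1*t, B = t, C = -(phi1 + phi2 - phi1*phi2*t), D = 1 - phi2*t
phi1, phi2, t = F(1, 100), F(1, 50), F(75)
A = 1 - phi1 * t
B = t
C = -(phi1 + phi2 - phi1 * phi2 * t)
D = 1 - phi2 * t

props = gaussian_properties(A, B, C, D)
for name, value in props.items():
    print(name, value)
```

Exact fractions make the determinant check meaningful; with floating-point inputs the equality test would need to be replaced by a tolerance.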
3.2.4 Example 1: System of Two Positive Thin Lenses

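To illustrate, consider the system of two thin lenses in the last section with $f_1 = +100$ mm, $f_2 =
+50$ mm, and $t = 75$ mm, which we showed to have $f_e = +\frac{200}{3}$ mm $\cong 66.7$ mm. The system matrix
is:
\[ M_{VV'} = \begin{bmatrix} 1 - \phi_1 t & t \\ -(\phi_1 + \phi_2 - \phi_1 \phi_2 t) & 1 - \phi_2 t \end{bmatrix} = \begin{bmatrix} A & B \\ C & D \end{bmatrix} \]
\[ = \begin{bmatrix} 1 & 0 \\ -\frac{1}{50 \text{ mm}} & 1 \end{bmatrix} \begin{bmatrix} 1 & 75 \text{ mm} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ -\frac{1}{100 \text{ mm}} & 1 \end{bmatrix} = \begin{bmatrix} \frac{1}{4} & 75 \text{ mm} \\ -\frac{3}{200 \text{ mm}} & -\frac{1}{2} \end{bmatrix} \]
and its determinant evaluates to one:
\[ \det \begin{bmatrix} \frac{1}{4} & 75 \text{ mm} \\ -\frac{3}{200 \text{ mm}} & -\frac{1}{2} \end{bmatrix} = \left(\frac{1}{4}\right)\left(-\frac{1}{2}\right) - (75)\left(-\frac{3}{200}\right) = 1 \]
From the values in the last section, we can see that
\[ B = 75 \text{ mm} = t \]
\[ -\frac{1}{C} = \frac{200}{3} \text{ mm} = f_e \]
which in turn demonstrates our old result that the power of a two-lens system is:
\[ -C = \phi = \frac{1}{f_e} = \phi_1 + \phi_2 - \phi_1 \phi_2 t = \frac{1}{f_1} + \frac{1}{f_2} - \frac{t}{f_1 f_2} \]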
The input ray matrix consists of the provisional marginal and chief rays at the object, which
pass through the transfer matrix from object to front surface. For example, if the object is located
1000 mm from the front vertex, the transfer matrix is:
\[ T_0 = \begin{bmatrix} 1 & 1000 \text{ mm} \\ 0 & 1 \end{bmatrix} \]
If a ray is cast out from the center of the object ($y = 0$) at an angle of 1 radian, the ray at the front vertex is:
\[ T_0 \begin{bmatrix} y \\ nu \end{bmatrix} = T_0 \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} y' \\ n'u' \end{bmatrix} = \begin{bmatrix} 1000 \text{ mm} \\ 1 \end{bmatrix} \]
In words, the height of the provisional marginal ray at the front vertex is 1000 mm and the angle is
1 radian, a HUGE angle, but remember that all equations in this paraxial assumption are linear, so
the angle and ray height can be scaled to any value. The emerging provisional marginal ray is:
\[ \begin{bmatrix} \frac{1}{4} & 75 \text{ mm} \\ -\frac{3}{200 \text{ mm}} & -\frac{1}{2} \end{bmatrix} \begin{bmatrix} 1000 \text{ mm} \\ 1 \end{bmatrix} = \begin{bmatrix} 325 \text{ mm} \\ -\frac{31}{2} \end{bmatrix} = \begin{bmatrix} y' \\ n'u' \end{bmatrix} \]
In words, the marginal ray from an object 1000 mm from the front vertex, launched at an angle of 1 radian,
emerges from the image-space vertex with height $y' = 325$ mm and angle $n'u' = -\frac{31}{2}$ radians.
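To find the location of the image, find the distance at which the marginal ray height $y = 0$:
\[ T_{V'O'} \begin{bmatrix} 325 \text{ mm} \\ -\frac{31}{2} \end{bmatrix} = \begin{bmatrix} 1 & \frac{t'}{n'} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 325 \text{ mm} \\ -\frac{31}{2} \end{bmatrix} = \begin{bmatrix} 0 \\ -\frac{31}{2} \end{bmatrix} \]
\[ \implies 325 \text{ mm} + \left(-\frac{31}{2}\right) \frac{t'}{n'} = 0 \]
\[ \implies \frac{t'}{1} = 325 \text{ mm} \cdot \frac{2}{31} = +\frac{650}{31} \text{ mm} \cong +20.97 \text{ mm} \]
which agrees with the result obtained earlier. We observed that the transverse magnification of the
image in this configuration is
\[ M_T = -\frac{z'}{z} = -\frac{H'O'}{OH} = -\frac{2}{31} \cong -0.0645 \]
so the provisional marginal ray at the image point is:
\[ \begin{bmatrix} y' \\ n'u' \end{bmatrix} = \begin{bmatrix} 0 \\ -\frac{31}{2} \end{bmatrix} = \begin{bmatrix} 0 \\ M_T^{-1} \end{bmatrix} \]
The marginal ray out of the vertex-to-vertex matrix for the object distance OV = 1000 mm.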
Back Focal Length (BFL)

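The image of an object located at $\infty$ is the image-space focal point of the system. This ray enters
the system with angle $nu = 0$ and arbitrary height, which we can model as $y = 1$. The emerging
ray is:
\[ \begin{bmatrix} \frac{1}{4} & 75 \\ -\frac{3}{200} & -\frac{1}{2} \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} \frac{1}{4} \\ -\frac{3}{200} \end{bmatrix} \]
The ray height is $\frac{1}{4}$ mm and the angle is $n'u' = -\frac{3}{200}$. The distance to the point where the ray
height is zero is the back focal distance:
\[ BFL = V'F': \quad T \begin{bmatrix} \frac{1}{4} \\ -\frac{3}{200} \end{bmatrix} = \begin{bmatrix} 1 & \frac{t'}{n'} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} \frac{1}{4} \\ -\frac{3}{200} \end{bmatrix} = \begin{bmatrix} 0 \\ -\frac{3}{200} \end{bmatrix} \]
\[ \implies \frac{1}{4} \text{ mm} + \left(-\frac{3}{200}\right) \frac{t'}{n'} = 0 \]
\[ \implies \frac{t'}{1} = \frac{1}{4} \cdot \frac{200 \text{ mm}}{3} = \frac{100}{6} \text{ mm} \cong 16.7 \text{ mm} \]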
Front Focal Length (FFL): Ray Through Reversed System

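To find the front focal distance, we can trace the provisional marginal ray backwards through
the system, or trace it through the reversed system where the lenses are placed in the opposite
order. The reversed system matrix is:
\[ (M_{VV'})_{\text{reversed}} = \begin{bmatrix} 1 & 0 \\ -\frac{1}{100} & 1 \end{bmatrix} \begin{bmatrix} 1 & 75 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ -\frac{1}{50} & 1 \end{bmatrix} = \begin{bmatrix} -\frac{1}{2} & 75 \\ -\frac{3}{200} & \frac{1}{4} \end{bmatrix} \]
Note that the diagonal elements of the forward and reversed vertex-to-vertex matrices are
swapped, while the off-diagonal elements are identical.
If the input ray height is 1 and the angle is 0, the outgoing ray from the reversed matrix is:
\[ \begin{bmatrix} -\frac{1}{2} & 75 \\ -\frac{3}{200} & \frac{1}{4} \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} -\frac{1}{2} \text{ mm} \\ -\frac{3}{200} \end{bmatrix} \]
The ray height reaches zero at the distance $-\frac{y}{n'u'} = -\frac{1/2 \text{ mm}}{3/200} = -\frac{100}{3}$ mm beyond the last
vertex of the reversed system. The negative sign means that the ray has already crossed the axis
inside the system, so the front focal point lies $\frac{100}{3}$ mm behind the front vertex of the original system:
\[ FFL = FV = -\frac{D}{C} = -\frac{100}{3} \text{ mm} \]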
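The image distance and back focal length found in this example can be checked with exact rational arithmetic. This is a minimal sketch assuming air on both sides ($n = n' = 1$); the helper names `apply` and `axis_crossing` are hypothetical, and the matrix entries are copied from the example.

```python
from fractions import Fraction as F


def apply(M, ray):
    """Apply a 2x2 matrix ((a, b), (c, d)) to a ray vector (y, nu)."""
    (a, b), (c, d) = M
    y, nu = ray
    return (a * y + b * nu, c * y + d * nu)


def axis_crossing(ray, n_prime=1):
    """Distance beyond the current plane at which the ray height reaches zero."""
    y, nu = ray
    return -n_prime * y / nu


# Vertex-to-vertex matrix for f1 = 100 mm, f2 = 50 mm, t = 75 mm
M = ((F(1, 4), F(75)), (F(-3, 200), F(-1, 2)))

# Provisional marginal ray (y = 0, nu = 1) transferred 1000 mm to the front vertex
ray_at_front_vertex = apply(((1, F(1000)), (0, 1)), (F(0), F(1)))
ray_at_rear_vertex = apply(M, ray_at_front_vertex)
image_distance = axis_crossing(ray_at_rear_vertex)

# Ray from an object at infinity (y = 1, nu = 0) gives the back focal length
bfl = axis_crossing(apply(M, (F(1), F(0))))

print(ray_at_rear_vertex, image_distance, bfl)
```

Working in fractions avoids the need to carry six or more significant figures by hand.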
3.2.5 Example 2: Telephoto Lens

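To illustrate, we apply the vertex-to-vertex matrix for the thin-lens telephoto considered in the last
section with $f_1 = +100$ mm, $f_2 = -25$ mm, and $t = +80$ mm:
\[ M_{VV'} = \begin{bmatrix} 1 & 0 \\ +\frac{1}{25 \text{ mm}} & 1 \end{bmatrix} \begin{bmatrix} 1 & +80 \text{ mm} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ -\frac{1}{100 \text{ mm}} & 1 \end{bmatrix} \]
\[ = \begin{bmatrix} \frac{1}{5} & 80 \text{ mm} \\ -\frac{1}{500 \text{ mm}} & \frac{21}{5} \end{bmatrix} = \begin{bmatrix} 1 - \phi_1 t & t \\ -(\phi_1 + \phi_2 - \phi_1 \phi_2 t) & 1 - \phi_2 t \end{bmatrix} \]
\[ \implies f_e = -\frac{1}{C} = +500 \text{ mm} \]
\[ \implies BFL = -\frac{A}{C} = -\frac{1}{5}\,(-500 \text{ mm}) = +100 \text{ mm} \]
\[ \implies FFL = -\frac{D}{C} = -\frac{21}{5}\,(-500 \text{ mm}) = +2100 \text{ mm} \]
\[ \implies \frac{VH}{n} = \frac{D - 1}{C} = \left(\frac{21}{5} - 1\right)(-500 \text{ mm}) = -1600 \text{ mm} \implies HV = +1600 \text{ mm} \]
\[ \implies \frac{H'V'}{n'} = \frac{A - 1}{C} = \left(\frac{1}{5} - 1\right)(-500 \text{ mm}) = +400 \text{ mm} \implies V'H' = -400 \text{ mm} \]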
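If the object is located 1000 mm from the first surface, the ray vector at the front vertex of the
system is:
\[ T_0 \begin{bmatrix} y \\ nu \end{bmatrix} = T_0 \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 & 1000 \text{ mm} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 1000 \text{ mm} \\ 1 \end{bmatrix} \]
The height of the provisional marginal ray at the front vertex is 1000 units and the angle is 1 radian,
which are huge values, but can be scaled to any value because all equations are linear.
\[ \begin{bmatrix} \frac{1}{5} & 80 \text{ mm} \\ -\frac{1}{500 \text{ mm}} & \frac{21}{5} \end{bmatrix} \begin{bmatrix} 1000 \text{ mm} \\ 1 \end{bmatrix} = \begin{bmatrix} 280 \text{ mm} \\ \frac{11}{5} \end{bmatrix} = \begin{bmatrix} y' \\ n'u' \end{bmatrix} \]
In words, the marginal ray from an object 1000 mm in front of the lens emerges with height 280 mm
and angle of $+\frac{11}{5}$ radians.
To find the location of the image, find the distance at which the marginal ray height $y = 0$:
\[ T_{V'O'} \begin{bmatrix} 280 \text{ mm} \\ \frac{11}{5} \end{bmatrix} = \begin{bmatrix} 1 & \frac{t'}{n'} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 280 \text{ mm} \\ \frac{11}{5} \end{bmatrix} = \begin{bmatrix} 0 \\ \frac{11}{5} \end{bmatrix} \]
\[ \implies 280 \text{ mm} + \left(+\frac{11}{5}\right) \frac{t'}{n'} = 0 \]
\[ \implies \frac{t'}{1} = -280 \text{ mm} \cdot \frac{5}{11} = -\frac{1400}{11} \text{ mm} \cong -127.3 \text{ mm} \]
which indicates that the image is virtual. (Figure out why!)

The magnification of the image in this configuration is
\[ M_T = -\frac{z'}{z} = -\frac{H'O'}{OH} = +\frac{5}{11} \cong +0.455 \]
which is the reciprocal of the emerging reduced angle $n'u' = \frac{11}{5}$ of the provisional marginal ray.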
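The telephoto numbers can be regenerated by multiplying the three thin-lens and transfer matrices explicitly. This is an illustrative sketch with hypothetical helper names (`matmul`, `thin_lens`, `gap`), assuming thin lenses in air and the sign conventions $f_e = -1/C$, $BFL = -A/C$, and $FFL = -D/C$.

```python
from fractions import Fraction as F


def matmul(P, Q):
    """Product of two 2x2 matrices stored as ((a, b), (c, d))."""
    return tuple(
        tuple(sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def thin_lens(f):
    """Refraction matrix of a thin lens with focal length f (in mm)."""
    return ((1, 0), (-1 / F(f), 1))


def gap(t):
    """Transfer matrix for a distance t (in mm) in air."""
    return ((1, F(t)), (0, 1))


# Matrices are written right to left: first lens, gap, second lens.
M = matmul(thin_lens(-25), matmul(gap(80), thin_lens(100)))
(A, B), (C, D) = M

efl = -1 / C
bfl = -A / C
ffl = -D / C

# Provisional marginal ray from an object 1000 mm in front of the first lens
y_in, nu_in = F(1000), F(1)
y_out = A * y_in + B * nu_in
nu_out = C * y_in + D * nu_in
image_distance = -y_out / nu_out   # negative value: image lies in front of the rear vertex

print(M, efl, bfl, ffl, image_distance)
```

The negative image distance is the numerical signature of the virtual image noted in the text.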
3.2.6 MVV0 Derived From Two Rays
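Consider the action of the vertex-vertex matrix on two rays that we know both before and after the
system. For two arbitrary (but noncollinear) rays, we have:
\[ M_{VV'} \begin{bmatrix} y_1 \\ nu_1 \end{bmatrix} = \begin{bmatrix} y_1' \\ n'u_1' \end{bmatrix} \]
\[ M_{VV'} \begin{bmatrix} y_2 \\ nu_2 \end{bmatrix} = \begin{bmatrix} y_2' \\ n'u_2' \end{bmatrix} \]
In actual use, the marginal ray and chief ray are the rays of choice. The marginal ray goes from
the center of the object to the center of the image while grazing the edge of the aperture stop (and
therefore the edge of the entrance and exit pupils), while the chief ray goes from the edge of the
object through the center of the aperture stop (and therefore of the pupils) to the edge of the image.
The vertex-vertex matrix applied to the incoming marginal ray from the center of the object yields the
emerging marginal ray:
\[ M_{VV'} \begin{bmatrix} y \\ nu \end{bmatrix} = \begin{bmatrix} y' \\ n'u' \end{bmatrix} \]
and the same relation for the chief ray is:
\[ M_{VV'} \begin{bmatrix} \bar{y} \\ n\bar{u} \end{bmatrix} = \begin{bmatrix} \bar{y}' \\ n'\bar{u}' \end{bmatrix} \]
We can combine the two vectors to form a $2 \times 2$ matrix:
\[ M_{VV'} \begin{bmatrix} y & \bar{y} \\ nu & n\bar{u} \end{bmatrix} = \begin{bmatrix} y' & \bar{y}' \\ n'u' & n'\bar{u}' \end{bmatrix} \]
\[ M_{VV'}\, L = L' \]
We can now use the properties of the $2 \times 2$ matrix to derive the form of the vertex-vertex matrix:
\[ (M_{VV'}\, L)\, L^{-1} = L'\, L^{-1} \]
\[ (M_{VV'}\, L)\, L^{-1} = M_{VV'}\,(L\, L^{-1}) = M_{VV'}\, I \]
\[ \implies L'\, L^{-1} = M_{VV'} \]
In words, we can evaluate the vertex-vertex matrix from its action on the marginal and chief rays.
The inverse of the input-ray matrix is easy to derive:
\[ L = \begin{bmatrix} y & \bar{y} \\ nu & n\bar{u} \end{bmatrix} \]
\[ \implies L^{-1} = \frac{1}{\det L} \begin{bmatrix} n\bar{u} & -\bar{y} \\ -nu & y \end{bmatrix} = \frac{1}{y\,n\bar{u} - \bar{y}\,nu} \begin{bmatrix} n\bar{u} & -\bar{y} \\ -nu & y \end{bmatrix} \equiv \frac{1}{\aleph} \begin{bmatrix} n\bar{u} & -\bar{y} \\ -nu & y \end{bmatrix} \]
where $\aleph \equiv y\,n\bar{u} - \bar{y}\,nu$ is the previously defined Lagrange invariant. So the vertex-vertex
matrix has the form:
\[ M_{VV'} = \begin{bmatrix} y' & \bar{y}' \\ n'u' & n'\bar{u}' \end{bmatrix} \frac{1}{y\,n\bar{u} - \bar{y}\,nu} \begin{bmatrix} n\bar{u} & -\bar{y} \\ -nu & y \end{bmatrix} \]
\[ = \frac{1}{\aleph} \begin{bmatrix} y' & \bar{y}' \\ n'u' & n'\bar{u}' \end{bmatrix} \begin{bmatrix} n\bar{u} & -\bar{y} \\ -nu & y \end{bmatrix} \]
\[ = \frac{1}{\aleph} \begin{bmatrix} y'\,n\bar{u} - \bar{y}'\,nu & y\,\bar{y}' - \bar{y}\,y' \\ n'u'\,n\bar{u} - n'\bar{u}'\,nu & y\,n'\bar{u}' - \bar{y}\,n'u' \end{bmatrix} \]
\[ = \frac{1}{\aleph} \begin{bmatrix} \begin{vmatrix} y' & \bar{y}' \\ nu & n\bar{u} \end{vmatrix} & \begin{vmatrix} y & \bar{y} \\ y' & \bar{y}' \end{vmatrix} \\[2ex] \begin{vmatrix} n'u' & n'\bar{u}' \\ nu & n\bar{u} \end{vmatrix} & \begin{vmatrix} y & \bar{y} \\ n'u' & n'\bar{u}' \end{vmatrix} \end{bmatrix} \]
where we have used the shorthand notation for the determinant in the last expression:
\[ \det \begin{bmatrix} y' & \bar{y}' \\ nu & n\bar{u} \end{bmatrix} \equiv \begin{vmatrix} y' & \bar{y}' \\ nu & n\bar{u} \end{vmatrix} \]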
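As a check on this result, the vertex-to-vertex matrix can be recovered from two traced rays. The sketch below is an assumed illustration: the helper names `matmul` and `inverse` are hypothetical, and the ray values are those of the two-positive-lens example (marginal ray $(1000, 1) \rightarrow (325, -\frac{31}{2})$ and chief ray $(1, 0) \rightarrow (\frac{1}{4}, -\frac{3}{200})$ at the front and rear vertices).

```python
from fractions import Fraction as F


def matmul(P, Q):
    """Product of two 2x2 matrices stored as ((a, b), (c, d))."""
    return tuple(
        tuple(sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def inverse(P):
    """Inverse of a 2x2 matrix via the adjugate divided by the determinant."""
    (a, b), (c, d) = P
    determinant = a * d - b * c
    return ((d / determinant, -b / determinant),
            (-c / determinant, a / determinant))


# Columns are the marginal and chief ray vectors (y, nu) at the front vertex ...
L_in = ((F(1000), F(1)), (F(1), F(0)))
# ... and at the rear vertex after passing through the system.
L_out = ((F(325), F(1, 4)), (F(-31, 2), F(-3, 200)))

M_recovered = matmul(L_out, inverse(L_in))
print(M_recovered)
```

Any two rays work as long as the input ray matrix has a nonzero determinant, which is the numerical meaning of "noncollinear".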
3.3 Object-to-Image (Conjugate) Matrix

The system matrix applied to a test ray with height $y$ and angle $u$ in index $n$ gives the
output ray:
\[ M \begin{bmatrix} y \\ nu \end{bmatrix} = \begin{bmatrix} A & B \\ C & D \end{bmatrix} \begin{bmatrix} y \\ nu \end{bmatrix} = \begin{bmatrix} y' \\ n'u' \end{bmatrix} \]
\[ y' = A\,y + B\,(nu) \]
\[ n'u' = C\,y + D\,(nu) \]
For rays emerging from one plane and converging to the corresponding conjugate plane (the image),
the output ray height at the image is a function ONLY of the object ray height; the angles of all
rays at the object do not matter, since they all converge to the image. In mathematical terms:
\[ y' = A\,y + B\,(nu) = f[y] \quad \text{(does not depend on angle)} \]
\[ \implies B = 0 \]
\[ \implies y' = A\,y \]
We know the relationship between $y'$ and $y$ is the transverse magnification:
\[ \frac{y'}{y} = M_T = A \]
Rays (a, b, c) diverge from the object and converge as (a', b', c') to form the image; the choice of
specific ray angle at the object has no effect on the location of the convergence; only the heights of
the rays at the object matter.
If we define the angular magnification to be the ratio of the angles from the object and to
the image:
\[ \frac{\Delta u'}{\Delta u} = M_\theta \]
we can find a relationship from the matrices:
\[ n'u_1' = C\,y + D\,(nu_1) \]
\[ n'u_2' = C\,y + D\,(nu_2) \]
Evaluate the difference of these:
\[ n'\,(u_2' - u_1') = C\,y - C\,y + D\,(nu_2 - nu_1) \]
\[ n'\,\Delta u' = n\,D\,(\Delta u) \]
\[ \frac{\Delta u'}{\Delta u} = M_\theta = \frac{n}{n'}\,D \]
\[ \implies D = \frac{n'}{n}\,M_\theta \]
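We can combine these two observations to see the form of the conjugate-to-conjugate matrix:
\[ M_{OO'} = \begin{bmatrix} M_T & 0 \\ -\frac{1}{f_e} & \frac{n'}{n}\,M_\theta \end{bmatrix} \]
We know that the determinant of this matrix must also be one, which implies that:
\[ M_T\,\frac{n'}{n}\,M_\theta = 1 \implies \frac{n'}{n}\,M_\theta = \frac{1}{M_T} \]
so we can also write the conjugate matrix as:
\[ M_{OO'} = \begin{bmatrix} M_T & 0 \\ -\frac{1}{f_e} & \frac{1}{M_T} \end{bmatrix} \]
The principal planes H and H' are those for which $M_T = +1$:
\[ M_{HH'} = \begin{bmatrix} +1 & 0 \\ -\frac{1}{f_e} & +1 \end{bmatrix} \]
The points of equal conjugates are related by $M_T = -1$, so the object-image matrix for these points
is:
\[ M_{OO'} = \begin{bmatrix} -1 & 0 \\ -\frac{1}{f_e} & -1 \end{bmatrix} \]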
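We can include the translation matrices from object to vertex and from vertex to image along
with the vertex-to-vertex matrix $M_{VV'}$:
\[ M_{VV'} = \begin{bmatrix} A & B \\ C & D \end{bmatrix} \]
The matrix that relates two conjugate planes (object O and image O') may be obtained by adding
transfer matrices for the appropriate distances from the object to the front vertex, $t_1 = \frac{OV}{n_1}$,
and from the rear vertex to the image, $t_2 = \frac{V'O'}{n_2}$, which yields for $n_1 = n_2 = 1$:
\[ M_{OO'} = \begin{bmatrix} 1 & t_2 \\ 0 & 1 \end{bmatrix} M_{VV'} \begin{bmatrix} 1 & t_1 \\ 0 & 1 \end{bmatrix} \]
\[ = \begin{bmatrix} 1 & t_2 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} A & B \\ C & D \end{bmatrix} \begin{bmatrix} 1 & t_1 \\ 0 & 1 \end{bmatrix} \]
\[ = \begin{bmatrix} A + t_2 C & (A + t_2 C)\,t_1 + B + t_2 D \\ C & C t_1 + D \end{bmatrix} \]
\[ = \begin{bmatrix} M_T & 0 \\ -\phi & \frac{1}{M_T} \end{bmatrix} \]
\[ \implies M_T = A + t_2 C = (C t_1 + D)^{-1} \]
\[ \implies \phi = -C \]
\[ 0 = (A + t_2 C)\,t_1 + B + t_2 D \]
We know that the marginal ray heights at the object and image are zero ($y_{in} = y_{out} = 0$), which
sets some limits on the conjugate-to-conjugate matrix. Apply this matrix to the ray matrix $L$ at
the object and at the image:
\[ M_{OO'}\, L = L' \]
\[ \begin{bmatrix} A + t_2 C & (A + t_2 C)\,t_1 + B + t_2 D \\ C & C t_1 + D \end{bmatrix} \begin{bmatrix} 0 & \bar{y}_{in} \\ (nu)_{in} & (n\bar{u})_{in} \end{bmatrix} = \begin{bmatrix} 0 & \bar{y}_{out} \\ (nu)_{out} & (n\bar{u})_{out} \end{bmatrix} \]
Evaluate the inverse matrix $L^{-1}$ and apply to both sides from the right:
\[ (M_{OO'}\, L)\, L^{-1} = L'\, L^{-1} \]
\[ \begin{bmatrix} A + t_2 C & (A + t_2 C)\,t_1 + B + t_2 D \\ C & C t_1 + D \end{bmatrix} = \begin{bmatrix} 0 & \bar{y}_{out} \\ (nu)_{out} & (n\bar{u})_{out} \end{bmatrix} \begin{bmatrix} 0 & \bar{y}_{in} \\ (nu)_{in} & (n\bar{u})_{in} \end{bmatrix}^{-1} \]
\[ = \begin{bmatrix} \dfrac{\bar{y}_{out}}{\bar{y}_{in}} & 0 \\[2ex] \dfrac{(n\bar{u})_{out}\,(nu)_{in} - (nu)_{out}\,(n\bar{u})_{in}}{\bar{y}_{in}\,(nu)_{in}} & \dfrac{(nu)_{out}}{(nu)_{in}} \end{bmatrix} \]
The ratio of the chief ray heights at the object and image, $\frac{\bar{y}_{out}}{\bar{y}_{in}}$, is the transverse magnification $M_T$,
whereas the ratio of the marginal ray angles is $\frac{(nu)_{out}}{(nu)_{in}} = \frac{1}{M_T}$.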
Example: System with Two Positive Thin Lenses

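Again, consider the example of a system composed of two thin lenses with $f_1 = +100$ mm, $f_2 =
+50$ mm, and $t = +75$ mm:
\[ M_{VV'} = \begin{bmatrix} 1 & 0 \\ -\frac{1}{50 \text{ mm}} & 1 \end{bmatrix} \begin{bmatrix} 1 & 75 \text{ mm} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ -\frac{1}{100 \text{ mm}} & 1 \end{bmatrix} = \begin{bmatrix} \frac{1}{4} & 75 \text{ mm} \\ -\frac{3}{200 \text{ mm}} & -\frac{1}{2} \end{bmatrix} \]
From the table of properties of the matrix, we see that:
\[ f_e = -\frac{1}{C} = +\frac{200}{3} \text{ mm} \]
\[ FFL = FV = -\frac{D}{C} = -\frac{100}{3} \text{ mm} \]
\[ BFL = V'F' = -\frac{A}{C} = +\frac{50}{3} \text{ mm} \]
\[ VH = \frac{D - 1}{C} = +100 \text{ mm} \]
\[ H'V' = \frac{A - 1}{C} = +50 \text{ mm} \]
which again match the results obtained before. The matrix that relates the object and image planes
for the two-lens system presented above is:
\[ T_2\, M_{VV'}\, T_1 = \begin{bmatrix} 1 & \frac{650}{31} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} \frac{1}{4} & 75 \\ -\frac{3}{200} & -\frac{1}{2} \end{bmatrix} \begin{bmatrix} 1 & 1000 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} -\frac{2}{31} & 0 \\ -\frac{3}{200} & -\frac{31}{2} \end{bmatrix} \]
which has the form of the principal plane matrix except that the diagonal elements are not both unity.
However, note that they are reciprocals of each other, so that
\[ \det \begin{bmatrix} -\frac{2}{31} & 0 \\ -\frac{3}{200} & -\frac{31}{2} \end{bmatrix} = 1 \]
We had evaluated the transverse magnification in this configuration to be $M_T = -\frac{2}{31}$, so we note
that the upper-left component of the conjugate-to-conjugate matrix is the transverse magnification.
The general form of a conjugate-to-conjugate matrix is:
\[ M_{OO'} = \begin{bmatrix} M_T & 0 \\ -\phi & \frac{1}{M_T} \end{bmatrix} \]
and the specific form that relates the principal planes with $M_T = 1$ is
\[ M_{HH'} = \begin{bmatrix} 1 & 0 \\ -\phi & 1 \end{bmatrix} \]
This is the matrix of the equivalent single thin lens.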
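The conjugate matrix for this example can be assembled and inspected in a few lines. This is a minimal sketch with hypothetical helper names (`matmul`, `transfer`), assuming air on both sides; the image distance is computed from the condition that the upper-right element vanishes, $t_2 = -(B + A t_1)/(D + C t_1)$.

```python
from fractions import Fraction as F


def matmul(P, Q):
    """Product of two 2x2 matrices stored as ((a, b), (c, d))."""
    return tuple(
        tuple(sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def transfer(t):
    """Transfer matrix for a reduced distance t."""
    return ((1, t), (0, 1))


M = ((F(1, 4), F(75)), (F(-3, 200), F(-1, 2)))
(A, B), (C, D) = M

t1 = F(1000)                          # object to front vertex, in mm
t2 = -(B + A * t1) / (D + C * t1)     # rear vertex to image, in mm

M_conjugate = matmul(transfer(t2), matmul(M, transfer(t1)))
(m_t, upper_right), (lower_left, lower_right) = M_conjugate

print(t2, M_conjugate)
```

The upper-right element comes out exactly zero only because fractions are used; a floating-point version would leave a small residual there.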
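3.3.1 Matrix of the Relaxed Eye (focused at ∞)

The vertex-to-vertex matrix for the three refractions and two transfers is:
\[
M_{VV'} =
\begin{bmatrix} 1 & 0 \\ -\phi_3 & 1 \end{bmatrix}
\begin{bmatrix} 1 & \frac{t_2'}{n_2'} \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ -\phi_2 & 1 \end{bmatrix}
\begin{bmatrix} 1 & \frac{t_1'}{n_1'} \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ -\phi_1 & 1 \end{bmatrix}
\]
where the individual terms evaluate to:
\[
\begin{aligned}
\phi_1 &= \frac{n_1' - n_1}{R_1} = \frac{1.336 - 1}{7.8\text{ mm}} = 4.3077 \times 10^{-2}\text{ mm}^{-1} = 43.077\text{ m}^{-1} = 43.077\text{ Diopters} \\
\frac{t_1'}{n_1'} &= \frac{3.6\text{ mm}}{1.336} = 2.6946\text{ mm} \\
\phi_2 &= \frac{n_2' - n_2}{R_2} = \frac{1.413 - 1.336}{10\text{ mm}} = 0.77 \times 10^{-2}\text{ mm}^{-1} = 7.7\text{ Diopters} \\
\frac{t_2'}{n_2'} &= \frac{3.6\text{ mm}}{1.413} = 2.5478\text{ mm} \\
\phi_3 &= \frac{n_3' - n_3}{R_3} = \frac{1.336 - 1.413}{-6\text{ mm}} = 1.2833 \times 10^{-2}\text{ mm}^{-1} = 12.833\text{ Diopters}
\end{aligned}
\]
so the vertex-to-vertex matrix has the form:
\[
M_{VV'} = \begin{bmatrix} 0.75683 & 5.1895\text{ mm} \\ -5.9596 \times 10^{-2}\text{ mm}^{-1} & 0.91265 \end{bmatrix}
\]
\[
\begin{aligned}
-\frac{1}{C} &= f_{eye} = \frac{1}{5.9596 \times 10^{-2}\text{ mm}^{-1}} = +16.780\text{ mm} \\
-C &= \phi_{eye} = 5.9596 \times 10^{-2}\text{ mm}^{-1} = 59.596\text{ m}^{-1} \cong 60\text{ Diopters}
\end{aligned}
\]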
A ray from infinity has a ray angle of zero, but the ray height is determined from the diameter of
the iris. If we assume that the iris diameter is 1 mm, then the output ray vector is:

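\[
\begin{bmatrix} 0.75683 & 5.1895\text{ mm} \\ -5.9596 \times 10^{-2}\text{ mm}^{-1} & 0.91265 \end{bmatrix}
\begin{bmatrix} 1\text{ mm} \\ 0 \end{bmatrix}
=
\begin{bmatrix} 0.75683\text{ mm} \\ -5.9596 \times 10^{-2} \end{bmatrix}
=
\begin{bmatrix} y' \\ n'u' \end{bmatrix}
\]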
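As a numerical cross-check of the eye model, the five matrices can be multiplied in code. This is a minimal sketch in Python; the function names `refraction` and `transfer` are illustrative, and the surface data (indices, radii, thicknesses) are the values quoted in the section above.

```python
def matmul(a, b):
    """Product a*b of two 2x2 matrices stored as ((m11, m12), (m21, m22))."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def refraction(n_before, n_after, radius_mm):
    """Refraction matrix of a spherical surface with power (n' - n)/R."""
    power = (n_after - n_before) / radius_mm
    return ((1.0, 0.0), (-power, 1.0))

def transfer(thickness_mm, index):
    """Translation matrix using the reduced distance t'/n'."""
    return ((1.0, thickness_mm / index), (0.0, 1.0))

# Elements in the order the light meets them
elements = [
    refraction(1.0, 1.336, 7.8),     # cornea
    transfer(3.6, 1.336),            # to the front of the lens
    refraction(1.336, 1.413, 10.0),  # front surface of the lens
    transfer(3.6, 1.413),            # through the lens
    refraction(1.413, 1.336, -6.0),  # rear surface of the lens
]

m_eye = ((1.0, 0.0), (0.0, 1.0))
for element in elements:
    m_eye = matmul(element, m_eye)

(A, B), (C, D) = m_eye
f_eye_mm = -1.0 / C
power_diopters = -C * 1000.0  # mm^-1 converted to m^-1

# Ray from infinity at height 1 mm and angle 0
y_out, nu_out = A * 1.0, C * 1.0

print(m_eye)
print(f_eye_mm, power_diopters, (y_out, nu_out))
```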
3.4 Vertex-Vertex Matrices of Simple Imaging Systems

We now get to where the rubber meets the road; the discussion of simple examples of actual
imaging systems. It is useful to emphasize the point that optical systems may create a real image
that may be sensed by a CCD or photographic emulsion, while those for human viewing will
produce virtual images or are afocal (image at infinity).
3.4.1 Magnifier (magnifying glass, loupe)

The magnifier or loupe is a lens (or system of lenses) with positive focal length that is used to
make the image on the retina larger than could be formed with the eye alone. Recall that
when the ciliary muscles that deform the eye lens are relaxed, the lens becomes flatter, increasing
the focal length. To view an object close up, the focal length of the lens must shorten by making
the lens more spherical. The closest distance to an object that appears to be sharply focused by
the unaided eye is the near point, which depends on the flexibility of the deformable
eyelens and the capability of the ciliary muscles; both vary among individuals, and with
age for a single individual. The distance to the near point may be as close as 50 mm for a young
child and 1000 mm to 2000 mm for an elderly person. This reduction in accommodation is one
of the signs of aging. The near point of an ideal eye is assumed to be 250 mm (about 10 in) from
the front surface. For nearsighted individuals, the near point is closer to the eye, thus increasing
the angular subtense of fine details for those individuals. For this reason, nearsighted individuals
in ancient times (before optical correction) often were attracted to professions requiring fine work,
such as goldsmithing. Since nearsightedness can be a genetic trait, descendants often continued in
these crafts.
In use, the object is held closer to the eye than the near point and viewed through the positive
lens, which in turn is held closer to the eye than its focal length to create a virtual image behind
the lens at the near point. If the focal length of the magnifying lens is $f = 50$ mm and the virtual
image is at the near point (image distance $-250$ mm), the object-to-image matrix for an object at
distance $z$ is:
\[
M_{OO'} =
\begin{bmatrix} 1 & -250\text{ mm} \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ -\frac{1}{50\text{ mm}} & 1 \end{bmatrix}
\begin{bmatrix} 1 & z \\ 0 & 1 \end{bmatrix}
=
\begin{bmatrix} 6 & 6z - 250\text{ mm} \\ -\frac{1}{50\text{ mm}} & 1 - \frac{z}{50\text{ mm}} \end{bmatrix}
\]
Since this has the form of an object-to-image matrix, the off-diagonal element in the upper-right
corner must evaluate to zero:
\[
6z - 250\text{ mm} = 0 \implies z = \frac{250\text{ mm}}{6} = 41\tfrac{2}{3}\text{ mm}
\]
The diagonal element in the upper-left corner of the object-to-image matrix is the transverse
magnification
\[
M_T = +6 = 1 + \frac{250\text{ mm}}{f}
\]
This is the transverse magnification of the magnifier if the image is at the near point.
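If the object is located at the object-space focal point ($z = f = 50$ mm), then the image is at infinity.
The matrix from the object to the lens is:
\[
M_{OV'} =
\begin{bmatrix} 1 & 0 \\ -\frac{1}{50\text{ mm}} & 1 \end{bmatrix}
\begin{bmatrix} 1 & 50\text{ mm} \\ 0 & 1 \end{bmatrix}
=
\begin{bmatrix} 1 & 50\text{ mm} \\ -\frac{1}{50\text{ mm}} & 0 \end{bmatrix}
\]
The lower-right element vanishes, so the angle of the outgoing ray depends only on the height of the
object point and not on the angle of the incoming ray. All rays from one object point emerge
parallel to each other, which means that the image is at infinity.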
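The near-point calculation can be reproduced numerically. The sketch below (Python, exact fractions) assumes $f = 50$ mm and a virtual image 250 mm in front of the lens, as in the matrices above; the function name `conjugate_matrix` and the variable names are illustrative only.

```python
from fractions import Fraction

def matmul(a, b):
    """Product a*b of two 2x2 matrices stored as ((m11, m12), (m21, m22))."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

f = Fraction(50)             # focal length of the magnifier (mm)
near_point = Fraction(250)   # distance of the virtual image (mm)

def conjugate_matrix(z):
    """Object-to-image matrix T(-near_point) * R(f) * T(z)."""
    to_image = ((1, -near_point), (0, 1))
    lens = ((1, 0), (-1 / f, 1))
    from_object = ((1, z), (0, 1))
    return matmul(to_image, matmul(lens, from_object))

# The upper-left element does not depend on z and is the magnification
m_t = conjugate_matrix(Fraction(0))[0][0]

# The upper-right element is m_t * z - near_point, which must vanish
z_object = near_point / m_t
m_oo = conjugate_matrix(z_object)

print("M_T =", m_t, " z =", z_object, " matrix =", m_oo)
```

The lower-right element of `m_oo` comes out as $1/M_T$, consistent with the general conjugate-to-conjugate form.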
3.4.2 Galilean Telescope of Thin Lenses
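The Galilean telescope is an afocal system formed from an objective lens with positive power and
an eyelens with negative power separated by the sum of the focal lengths. If the focal lengths of the
objective and eyelens are $f_1 = +200$ mm and $f_2 = -25$ mm, the separation is $t = (200 - 25)$ mm $= 175$ mm.
The system matrix is:
\[
M_{VV'} =
\begin{bmatrix} 1 & 0 \\ -\frac{1}{(-25\text{ mm})} & 1 \end{bmatrix}
\begin{bmatrix} 1 & 175\text{ mm} \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ -\frac{1}{(+200\text{ mm})} & 1 \end{bmatrix}
=
\begin{bmatrix} \frac{1}{8} & 175\text{ mm} \\ 0 & 8 \end{bmatrix}
\]
Note that the system power $\phi = 0 \implies f_e = \infty$, as it must be for an afocal system (both object-
and image-space focal points at infinity). The ray from an object at $\infty$ with unit height generates
the outgoing ray:
\[
\begin{bmatrix} \frac{1}{8} & 175\text{ mm} \\ 0 & 8 \end{bmatrix}
\begin{bmatrix} 1\text{ mm} \\ 0 \end{bmatrix}
=
\begin{bmatrix} y' \\ n'u' \end{bmatrix}
=
\begin{bmatrix} \frac{1}{8}\text{ mm} \\ 0 \end{bmatrix}
\]
so the outgoing ray is at height $\frac{1}{8}$ mm and the angle is zero; both incoming and outgoing rays are parallel
to the axis. Note that the diagonal elements of $M_{VV'}$ are positive and the determinant is 1.
For a provisional chief ray into the system with height 0 and angle 1, the outgoing ray is:
\[
\begin{bmatrix} \frac{1}{8} & 175\text{ mm} \\ 0 & 8 \end{bmatrix}
\begin{bmatrix} 0 \\ 1 \end{bmatrix}
=
\begin{bmatrix} y' \\ n'u' \end{bmatrix}
=
\begin{bmatrix} 175\text{ mm} \\ 8 \end{bmatrix}
\]
So the outgoing ray angle is 8 times larger; this is the angular magnification of the telescope; the
image is upright since the incoming and outgoing ray angles are both positive. The form of the matrix
of an afocal system is:
\[
M_{VV'}\,(\text{afocal system}) = \begin{bmatrix} \frac{1}{m} & B \\ 0 & m \end{bmatrix}
\]
where $m$ is the angular magnification. The element $B$ depends on the lens separation (175 mm
here) and vanishes only if the matrix relates a pair of conjugate planes.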
3.4.3 Keplerian Telescope of Thin Lenses

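The Keplerian telescope is formed from two positive lenses, here with $f_1 = +200$ mm and $f_2 = +25$ mm
and separation $t = (200 + 25)$ mm $= 225$ mm. The system matrix is:
\[
\begin{bmatrix} 1 & 0 \\ -\frac{1}{(+25\text{ mm})} & 1 \end{bmatrix}
\begin{bmatrix} 1 & 225\text{ mm} \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ -\frac{1}{(+200\text{ mm})} & 1 \end{bmatrix}
=
\begin{bmatrix} -\frac{1}{8} & 225\text{ mm} \\ 0 & -8 \end{bmatrix}
\]
The diagonal elements are negative, the determinant is 1, and the system power $\phi = 0 \implies f_e = \infty$.
The outgoing ray angle for unit incoming angle is $-8$, which specifies that the angular magnification
is $-8$ and the image is inverted.
The ray from an object at $\infty$ with unit height generates the outgoing ray:
\[
\begin{bmatrix} -\frac{1}{8} & 225\text{ mm} \\ 0 & -8 \end{bmatrix}
\begin{bmatrix} 1\text{ mm} \\ 0 \end{bmatrix}
=
\begin{bmatrix} y' \\ n'u' \end{bmatrix}
=
\begin{bmatrix} -\frac{1}{8}\text{ mm} \\ 0 \end{bmatrix}
\]
so the outgoing ray is at height $-\frac{1}{8}$ mm (the image is inverted) and the angle is zero.
The provisional chief ray into the system has height 0 and angle 1; the outgoing ray is:
\[
\begin{bmatrix} -\frac{1}{8} & 225\text{ mm} \\ 0 & -8 \end{bmatrix}
\begin{bmatrix} 0 \\ 1 \end{bmatrix}
=
\begin{bmatrix} y' \\ n'u' \end{bmatrix}
=
\begin{bmatrix} 225\text{ mm} \\ -8 \end{bmatrix}
\]
So the outgoing ray angle is 8 times larger than the incoming ray angle but negative (which implies that
the image is inverted).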
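Both telescope matrices follow from the same three-matrix product, so they are convenient to compare in code. This is a minimal sketch in Python with exact fractions; `telescope` and `apply` are illustrative helper names, and the focal lengths are the ones used in the two sections above.

```python
from fractions import Fraction

def matmul(a, b):
    """Product a*b of two 2x2 matrices stored as ((m11, m12), (m21, m22))."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def thin_lens(f):
    """Refraction matrix of a thin lens in air with focal length f (mm)."""
    return ((Fraction(1), Fraction(0)), (-1 / Fraction(f), Fraction(1)))

def telescope(f1, f2):
    """Vertex-to-vertex matrix of two thin lenses separated by f1 + f2."""
    gap = ((Fraction(1), Fraction(f1 + f2)), (Fraction(0), Fraction(1)))
    return matmul(thin_lens(f2), matmul(gap, thin_lens(f1)))

def apply(m, ray):
    """Apply a system matrix to a ray (y, nu)."""
    y, nu = ray
    return (m[0][0] * y + m[0][1] * nu, m[1][0] * y + m[1][1] * nu)

galilean = telescope(200, -25)
keplerian = telescope(200, 25)

for name, m in (("Galilean", galilean), ("Keplerian", keplerian)):
    print(name, "C =", m[1][0], " angular magnification =", m[1][1])
    print("  ray from infinity at unit height ->", apply(m, (1, 0)))
    print("  provisional chief ray (0, 1)     ->", apply(m, (0, 1)))
```

In both cases the lower-left element is zero (afocal system) and the lower-right element is the angular magnification, positive for the Galilean telescope and negative for the Keplerian.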
3.4.4 Thick Lenses

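The matrix method is convenient for thick lenses. Consider a thick lens made of glass with $n' = 1.5$,
radii of curvature $R_1 = +50$ mm and $R_2 = -100$ mm, and thickness $t'$ (which we shall vary). It
is useful first to evaluate the focal length of the single thin lens with these radii and refractive index
from the lensmaker's equation:
\[
\frac{1}{f} = (n - 1)\left( \frac{1}{R_1} - \frac{1}{R_2} \right)
\]
\[
f = \left[ (1.5 - 1.0)\left( \frac{1}{50\text{ mm}} - \frac{1}{-100\text{ mm}} \right) \right]^{-1} = +\frac{200}{3}\text{ mm} = 66\tfrac{2}{3}\text{ mm}
\]
The powers of the two surfaces are:
\[
\begin{aligned}
\phi_1 &= \frac{n' - n}{R_1} = \frac{1.5 - 1}{50\text{ mm}} = +\frac{0.5}{50\text{ mm}} = +\frac{1}{100\text{ mm}} \\
\phi_2 &= \frac{n - n'}{R_2} = \frac{1 - 1.5}{-100\text{ mm}} = \frac{-0.5}{-100\text{ mm}} = +\frac{1}{200\text{ mm}}
\end{aligned}
\]
so if the thickness is zero, the focal length evaluates to:
\[
\begin{aligned}
\phi_e &= \phi_1 + \phi_2 - \phi_1 \phi_2\, t \\
&= \frac{1}{100\text{ mm}} + \frac{1}{200\text{ mm}} - \frac{1}{100\text{ mm}} \cdot \frac{1}{200\text{ mm}} \cdot 0 = \frac{3}{200\text{ mm}} \\
f_e &= \left( \frac{1}{f_1} + \frac{1}{f_2} - \frac{t}{f_1 f_2} \right)^{-1} = +\frac{200}{3}\text{ mm}
\end{aligned}
\]
which agrees with the result obtained from the lensmaker's equation.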
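The system matrix for the lens with thickness $t'$ may be evaluated with this parameter:
\[
\begin{aligned}
M_{VV'} = R_2\, T_1\, R_1
&=
\begin{bmatrix} 1 & 0 \\ -\frac{1}{200\text{ mm}} & 1 \end{bmatrix}
\begin{bmatrix} 1 & \frac{t'}{1.5} \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ -\frac{1}{100\text{ mm}} & 1 \end{bmatrix} \\
&=
\begin{bmatrix}
1 - 0.0066667\, t' & 0.6666667\, t'\text{ mm} \\
\frac{1}{100\text{ mm}}\left( 0.0033333\, t' - 1 \right) - \frac{1}{200\text{ mm}} & 1 - 0.0033333\, t'
\end{bmatrix}
\end{aligned}
\]
where $t'$ in the final matrix is the numerical value of the thickness in mm.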
Note that the thickness $t'$ is present in each of the four terms in the matrix. Now we can derive
matrices for different values of the thickness: $t' = 0$ mm, 1 mm, 2 mm, 5 mm, and 10 mm, where we
substitute into the table of properties to find the BFL, FFL, $VH$, and $H'V'$:
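$t' = 0$ mm (thin lens)
\[
M_{VV'}(t' = 0\text{ mm}) =
\begin{bmatrix}
1 - 0.0066667 \cdot 0 & 0.6666667 \cdot 0\text{ mm} \\
\frac{1}{100\text{ mm}}\left( 0.0033333 \cdot 0 - 1 \right) - \frac{1}{200\text{ mm}} & 1 - 0.0033333 \cdot 0
\end{bmatrix}
=
\begin{bmatrix} 1 & 0 \\ -\frac{3}{200\text{ mm}} & 1 \end{bmatrix}
\]
\[
\begin{aligned}
f_e &= -\frac{1}{C} = +\frac{200}{3}\text{ mm} = 66\tfrac{2}{3}\text{ mm} \\
FFL = FV &= -\frac{D}{C} = \frac{1}{\frac{3}{200\text{ mm}}} = +\frac{200}{3}\text{ mm} = f_e \\
BFL = V'F' &= -\frac{A}{C} = \frac{1}{\frac{3}{200\text{ mm}}} = +\frac{200}{3}\text{ mm} = f_e \\
VH &= \frac{D - 1}{C} = \frac{(1 - 1)}{-\frac{3}{200\text{ mm}}} = 0\text{ mm} \\
H'V' &= \frac{A - 1}{C} = \frac{(1 - 1)}{-\frac{3}{200\text{ mm}}} = 0\text{ mm}
\end{aligned}
\]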
All quantities correspond to the values we would expect for the single thin lens: the front and back
focal lengths are identical to the effective focal length, which means that the principal points coincide
with the vertices; they are all located AT the lens.
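$t' = 1$ mm
\[
M_{VV'}(t' = 1\text{ mm}) =
\begin{bmatrix}
1 - 0.0066667 \cdot 1 & 0.6666667 \cdot 1\text{ mm} \\
\frac{1}{100\text{ mm}}\left( 0.0033333 \cdot 1 - 1 \right) - \frac{1}{200\text{ mm}} & 1 - 0.0033333 \cdot 1
\end{bmatrix}
=
\begin{bmatrix} 0.99333 & 0.66667\text{ mm} \\ -\frac{1}{66.814\text{ mm}} & 0.99667 \end{bmatrix}
\]
\[
\begin{aligned}
f_e &= -\frac{1}{C} = 66.814\text{ mm} \\
FFL = FV &= -\frac{D}{C} = 0.99667 \times 66.814\text{ mm} = 66.592\text{ mm} \\
BFL = V'F' &= -\frac{A}{C} = 0.99333 \times 66.814\text{ mm} = 66.368\text{ mm} \\
VH &= \frac{D - 1}{C} = -(0.99667 - 1) \times 66.814\text{ mm} = 0.2225\text{ mm} \\
H'V' &= \frac{A - 1}{C} = -(0.99333 - 1) \times 66.814\text{ mm} = 0.4456\text{ mm}
\end{aligned}
\]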
So the object- and image-space principal planes are within the lens and close to the surfaces. Note
that the front and back focal lengths are slightly different: the image-space principal point is more
within the lens since the second surface has less power than the front surface.
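$t' = 2$ mm
\[
M_{VV'}(t' = 2\text{ mm}) =
\begin{bmatrix}
1 - 0.0066667 \cdot 2 & 0.6666667 \cdot 2\text{ mm} \\
\frac{1}{100\text{ mm}}\left( 0.0033333 \cdot 2 - 1 \right) - \frac{1}{200\text{ mm}} & 1 - 0.0033333 \cdot 2
\end{bmatrix}
=
\begin{bmatrix} 0.98667 & 1.3333\text{ mm} \\ -1.4933 \times 10^{-2}\text{ mm}^{-1} & 0.99333 \end{bmatrix}
\]
\[
\begin{aligned}
f_e &= -\frac{1}{C} = \frac{1}{1.4933 \times 10^{-2}\text{ mm}^{-1}} = 66.966\text{ mm} \\
FFL = FV &= -\frac{D}{C} = \frac{0.99333}{1.4933 \times 10^{-2}\text{ mm}^{-1}} = 66.519\text{ mm} \\
BFL = V'F' &= -\frac{A}{C} = \frac{0.98667}{1.4933 \times 10^{-2}\text{ mm}^{-1}} = 66.073\text{ mm} \\
VH &= \frac{D - 1}{C} = \frac{(0.99333 - 1)}{-1.4933 \times 10^{-2}\text{ mm}^{-1}} = 0.4467\text{ mm} \\
H'V' &= \frac{A - 1}{C} = \frac{(0.98667 - 1)}{-1.4933 \times 10^{-2}\text{ mm}^{-1}} = 0.8926\text{ mm}
\end{aligned}
\]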
Note that the same behavior exists for this lens: the image-space principal point is farther inside
the lens than the object-space principal point.
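$t' = 5$ mm
\[
M_{VV'}(t' = 5\text{ mm}) =
\begin{bmatrix}
1 - 0.0066667 \cdot 5 & 0.6666667 \cdot 5\text{ mm} \\
\frac{1}{100\text{ mm}}\left( 0.0033333 \cdot 5 - 1 \right) - \frac{1}{200\text{ mm}} & 1 - 0.0033333 \cdot 5
\end{bmatrix}
=
\begin{bmatrix} 0.96667 & 3.3333\text{ mm} \\ -1.4833 \times 10^{-2}\text{ mm}^{-1} & 0.98333 \end{bmatrix}
\]
\[
\begin{aligned}
f_e &= -\frac{1}{C} = \frac{1}{1.4833 \times 10^{-2}\text{ mm}^{-1}} = 67.417\text{ mm} \\
FFL = FV &= -\frac{D}{C} = \frac{0.98333}{1.4833 \times 10^{-2}\text{ mm}^{-1}} = 66.293\text{ mm} \\
BFL = V'F' &= -\frac{A}{C} = \frac{0.96667}{1.4833 \times 10^{-2}\text{ mm}^{-1}} = 65.170\text{ mm} \\
VH &= \frac{D - 1}{C} = \frac{(0.98333 - 1)}{-1.4833 \times 10^{-2}\text{ mm}^{-1}} = 1.1238\text{ mm} \\
H'V' &= \frac{A - 1}{C} = \frac{(0.96667 - 1)}{-1.4833 \times 10^{-2}\text{ mm}^{-1}} = 2.247\text{ mm}
\end{aligned}
\]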
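$t' = 10$ mm
\[
M_{VV'}(t' = 10\text{ mm}) =
\begin{bmatrix}
1 - 0.0066667 \cdot 10 & 0.6666667 \cdot 10\text{ mm} \\
\frac{1}{100\text{ mm}}\left( 0.0033333 \cdot 10 - 1 \right) - \frac{1}{200\text{ mm}} & 1 - 0.0033333 \cdot 10
\end{bmatrix}
=
\begin{bmatrix} 0.93333 & 6.6667\text{ mm} \\ -1.4667 \times 10^{-2}\text{ mm}^{-1} & 0.96667 \end{bmatrix}
\]
\[
\begin{aligned}
f_e &= -\frac{1}{C} = \frac{1}{1.4667 \times 10^{-2}\text{ mm}^{-1}} = 68.180\text{ mm} \\
FFL = FV &= -\frac{D}{C} = \frac{0.96667}{1.4667 \times 10^{-2}\text{ mm}^{-1}} = 65.908\text{ mm} \\
BFL = V'F' &= -\frac{A}{C} = \frac{0.93333}{1.4667 \times 10^{-2}\text{ mm}^{-1}} = 63.635\text{ mm} \\
VH &= \frac{D - 1}{C} = \frac{(0.96667 - 1)}{-1.4667 \times 10^{-2}\text{ mm}^{-1}} = 2.2724\text{ mm} \\
H'V' &= \frac{A - 1}{C} = \frac{(0.93333 - 1)}{-1.4667 \times 10^{-2}\text{ mm}^{-1}} = 4.5456\text{ mm}
\end{aligned}
\]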
From these results, we see that the effective focal length gets LONGER as the lens gets THICKER
for the same radii of curvature and that the image-space principal point penetrates more inside
the lens as the lens thickness is increased.
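The trend with thickness can be tabulated directly. The following is a minimal sketch in Python with exact fractions; `thick_lens` and `cardinal` are illustrative names, and the sign choices in `cardinal` are assumptions picked to reproduce the values listed above. Because the arithmetic is exact, the last digits can differ slightly from the rounded values in the text.

```python
from fractions import Fraction

def matmul(a, b):
    """Product a*b of two 2x2 matrices stored as ((m11, m12), (m21, m22))."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def thick_lens(t, n=Fraction(3, 2), r1=Fraction(50), r2=Fraction(-100)):
    """Vertex-to-vertex matrix of a lens in air with center thickness t (mm)."""
    phi1 = (n - 1) / r1          # power of the front surface
    phi2 = (1 - n) / r2          # power of the rear surface
    first = ((1, 0), (-phi1, 1))
    gap = ((1, Fraction(t) / n), (0, 1))   # reduced thickness t'/n'
    second = ((1, 0), (-phi2, 1))
    return matmul(second, matmul(gap, first))

def cardinal(m):
    """Cardinal-point distances from the matrix elements (assumed signs)."""
    (A, B), (C, D) = m
    return {
        "f_e": -1 / C,
        "FFL": -D / C,
        "BFL": -A / C,
        "VH": (D - 1) / C,
        "H'V'": (A - 1) / C,
    }

table = {t: cardinal(thick_lens(t)) for t in (0, 1, 2, 5, 10)}
for t, row in table.items():
    print(t, {name: round(float(value), 3) for name, value in row.items()})
```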
3.4.5 Microscope
A simple microscope is also composed of two lenses (assumed to be thin in this discussion, though
the optical components generally are composed of multiple elements). The distance $t$ between the
image-space (rear) focal point of the first lens and the object-space (front) focal point of the ocular
(the "tube length") is fixed, often at $t = 160$ mm. The first lens (the objective) has a (very) short
focal length and the object typically is placed just outside its object-space focal point so that
$z_1 \cong f_1$. The objective generates a real image between the objective and eyepiece (or ocular),
which is a lens with a short focal length used as a simple magnifier.
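Assume $f_1 = 5$ mm and $f_2 = 50$ mm. The numerical results below use a lens separation of 160 mm and
a lower-left element of $+\frac{1}{50\text{ mm}}$ in the matrix of the second lens, which corresponds to
$f_2 = -50$ mm; for a positive ocular this element would be $-\frac{1}{50\text{ mm}}$.
\[
M_{VV'} =
\begin{bmatrix} 1 & 0 \\ +\frac{1}{50\text{ mm}} & 1 \end{bmatrix}
\begin{bmatrix} 1 & 160\text{ mm} \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ -\frac{1}{5\text{ mm}} & 1 \end{bmatrix}
=
\begin{bmatrix} -31 & 160\text{ mm} \\ -\frac{41}{50\text{ mm}} & \frac{21}{5} \end{bmatrix}
\]
\[
\det \begin{bmatrix} -31 & 160\text{ mm} \\ -\frac{41}{50\text{ mm}} & \frac{21}{5} \end{bmatrix} = 1
\]
\[
\begin{aligned}
f_e &= -\frac{1}{C} = +\frac{50}{41}\text{ mm} = +1.220\text{ mm} \\
FFL = FV &= -\frac{D}{C} = \frac{\frac{21}{5}}{\frac{41}{50\text{ mm}}} = +\frac{210}{41}\text{ mm} = 5.12\text{ mm} \\
BFL = V'F' &= -\frac{A}{C} = \frac{-31}{\frac{41}{50\text{ mm}}} = -\frac{1550}{41}\text{ mm} = -37.8\text{ mm} \\
VH &= \frac{D - 1}{C} = \frac{\frac{21}{5} - 1}{-\frac{41}{50\text{ mm}}} = -\frac{160}{41}\text{ mm} = -3.902\text{ mm} \\
H'V' &= \frac{A - 1}{C} = \frac{-31 - 1}{-\frac{41}{50\text{ mm}}} = \frac{1600}{41}\text{ mm} = 39.02\text{ mm}
\end{aligned}
\]
For an object 3 mm in front of the first vertex, the matrix from the object to the rear vertex is:
\[
M_{OV'} =
\begin{bmatrix} 1 & 0 \\ +\frac{1}{50\text{ mm}} & 1 \end{bmatrix}
\begin{bmatrix} 1 & 160\text{ mm} \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ -\frac{1}{5\text{ mm}} & 1 \end{bmatrix}
\begin{bmatrix} 1 & 3\text{ mm} \\ 0 & 1 \end{bmatrix}
=
\begin{bmatrix} -31 & 67\text{ mm} \\ -\frac{41}{50\text{ mm}} & \frac{87}{50} \end{bmatrix}
\]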
3.5 Image Location and Magnification
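\[
\frac{1}{z_1} + \frac{1}{z_2} = \frac{1}{f}
\]
\[
M_T = -\frac{z_2}{z_1} \cong -\frac{f}{z_1} \quad \text{in the usual case}
\]
\[
\frac{1}{z_1} + \frac{1}{z_2} = \frac{1}{f} \implies z_2 = \left( \frac{1}{f} - \frac{1}{z_1} \right)^{-1} = \frac{z_1 f}{z_1 - f}
\]
\[
M_T = -\frac{z_2}{z_1} = -\frac{f}{z_1 - f} \cong -\frac{f}{z_1} \propto f \quad \text{if } z_1 \gg f
\]
In words, if the object distance $z_1$ is large (compared to the focal length $f$), then the transverse
magnification is (approximately) proportional to the focal length. Therefore, doubling the focal
length doubles the magnification if the object is distant (with the caveat that the magnification is
still negative and smaller than unity in magnitude, $-1 < M_T < 0$).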
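The approximation $M_T \cong -f/z_1$ is easy to test numerically. The sketch below (Python) evaluates the exact thin-lens expressions for a distant object; the object distance of 10 m and the two focal lengths are arbitrary illustrative values, not taken from the notes.

```python
def image_distance(z1, f):
    """Image distance z2 from 1/z1 + 1/z2 = 1/f (same length units)."""
    return z1 * f / (z1 - f)

def transverse_magnification(z1, f):
    """Exact thin-lens magnification M_T = -z2/z1 = -f/(z1 - f)."""
    return -image_distance(z1, f) / z1

z1 = 10_000.0  # object distance in mm (assumed value)
m_50 = transverse_magnification(z1, 50.0)
m_100 = transverse_magnification(z1, 100.0)

print("M_T for f = 50 mm: ", m_50)
print("M_T for f = 100 mm:", m_100)
print("ratio:", m_100 / m_50)
```

The ratio is close to, but not exactly, 2 because the exact denominator is $z_1 - f$ rather than $z_1$.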
3.6 Marginal and Chief Rays for the System

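\[
L = \begin{bmatrix} y & \bar{y} \\ nu & n\bar{u} \end{bmatrix}
\]
\[
\det [L] = y\, n\bar{u} - \bar{y}\, nu
\]
where the first column is the marginal ray and the second column (barred quantities) is the chief ray.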
The marginal ray goes through the center of the object and any image(s) (i.e., the point where the
marginal ray crosses the optical axis is either the object or an image of the object). It also grazes
the edge of the aperture stop, so if we know the location and the diameter of the aperture stop in
the system, we can scale the height of the marginal ray so that its height matches the semidiameter
of the aperture stop at that location.
The chief ray goes through the center of the stop (and of the entrance and exit pupils), so we set
the chief ray height at the location of the stop to be zero and its angle to be arbitrary (say unity),
then propagate that provisional ray forward towards the image-space vertex and backwards
towards the object-space vertex (note that when tracing backwards toward the first lens, the
matrices in the ray trace must be inverted). During the tracing, we find the element that most
constrains the chief ray, and then scale the height of the provisional chief ray to make sure that it
gets through the other elements. The angle of the chief ray emerging from the front vertex to the
object is the half-angle of the field of view; the angle of the chief ray emerging from the image-space
vertex is the half angle of the image field at the sensor.
3.6.1 Examples of Marginal and Chief Rays for Systems
In the lab, you constructed Keplerian and/or Galilean telescopes with an iris diaphragm at various
locations. We can use this as a model for demonstrating how to evaluate the marginal and chief
rays. To evaluate the location of the stop, we must know the diameters as well as the locations of
the lenses. We can cast a provisional marginal ray into the system from the object to determine
which element is the aperture stop. We then scale the provisional marginal ray so that its height
and the semidiameter of the stop match. We then propagate a provisional chief ray forward and
backward from the center of the stop and scale its angle so that it grazes the element that constrains
it. From the angle of the chief ray entering and exiting the system, we can determine the field of
view. We will use the Galilean telescope as the first example.
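Example 1: Galilean telescope, object at ∞
Consider a telescope with the following parameters.
L1 : f1 = +200 mm, d1 = 40 mm
L2 : f2 = −40 mm, d2 = 5 mm
t = f1 + f2 = 160 mm
\[
R_1 = \begin{bmatrix} 1 & 0 \\ -\frac{1}{+200\text{ mm}} & 1 \end{bmatrix}, \quad
T = \begin{bmatrix} 1 & 160\text{ mm} \\ 0 & 1 \end{bmatrix}, \quad
R_2 = \begin{bmatrix} 1 & 0 \\ -\frac{1}{-40\text{ mm}} & 1 \end{bmatrix}
\]
The vertex-vertex matrix of this system is
\[
M_{VV'} =
\begin{bmatrix} 1 & 0 \\ -\frac{1}{-40\text{ mm}} & 1 \end{bmatrix}
\begin{bmatrix} 1 & 160\text{ mm} \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ -\frac{1}{+200\text{ mm}} & 1 \end{bmatrix}
=
\begin{bmatrix} \frac{1}{5} & 160\text{ mm} \\ 0 & 5 \end{bmatrix}
\]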
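for which element C = 0, which is characteristic of an afocal system. For an object at infinity, the
provisional marginal ray into the system has an angle of zero and height equal to the semidiameter
of the first element:
\[
\begin{bmatrix} y \\ nu \end{bmatrix}_{provisional} = \begin{bmatrix} \frac{d_1}{2} \\ 0 \end{bmatrix} = \begin{bmatrix} 20\text{ mm} \\ 0 \end{bmatrix}
\]
We can propagate this ray through the first lens and translate it to the second lens:
\[
T\, R_1 \begin{bmatrix} y \\ nu \end{bmatrix}_{provisional}
=
\begin{bmatrix} 1 & 160\text{ mm} \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ -\frac{1}{+200\text{ mm}} & 1 \end{bmatrix}
\begin{bmatrix} 20\text{ mm} \\ 0 \end{bmatrix}
=
\begin{bmatrix} 4\text{ mm} \\ -\frac{1}{10} \end{bmatrix}
\]
In words, the height of the provisional marginal ray at the second lens is 4 mm. Note that the ray
after the second lens has the form:
\[
M_{VV'} \begin{bmatrix} y \\ nu \end{bmatrix}_{provisional}
=
\begin{bmatrix} \frac{1}{5} & 160\text{ mm} \\ 0 & 5 \end{bmatrix}
\begin{bmatrix} 20\text{ mm} \\ 0 \end{bmatrix}
=
\begin{bmatrix} 4\text{ mm} \\ 0 \end{bmatrix}
\]
so that the height of the provisional marginal ray at the second lens is the same before and after
refraction (no surprise there) and that the ray angle after the second lens is 0 (parallel to the optical
axis, again no surprise). Note that the ray height at L2 is larger than the specified semidiameter of
the second lens:
\[
y' = 4\text{ mm} > \frac{d_2}{2} = \frac{5\text{ mm}}{2} = 2.5\text{ mm} \implies \text{L2 is the aperture stop}
\]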
This means two things: (1) that the second lens is the aperture stop, and (2) that we must scale the
height and angle of the provisional marginal ray to ensure that it grazes the edge of the stop. The
scaling factor is the ratio of the semidiameter of the stop to the height of the provisional marginal
ray at the stop:
\[
\frac{d_2 / 2}{y \text{ at L2}} = \frac{2.5\text{ mm}}{4\text{ mm}} = \frac{5}{8}
\]
We apply this scale factor to the marginal ray at all locations in the system. The marginal ray at
the first lens from an object at infinite distance is:
\[
\begin{bmatrix} y \\ nu \end{bmatrix}_{\text{at L1}}
= \frac{5}{8} \begin{bmatrix} y \\ nu \end{bmatrix}_{provisional}
= \frac{5}{8} \begin{bmatrix} 20\text{ mm} \\ 0 \end{bmatrix}
= \begin{bmatrix} 12.5\text{ mm} \\ 0 \end{bmatrix}
\]
which means that the marginal ray strikes the first lens well inside of the semidiameter; the entering
tube of rays does not fill the lens.
Now that we know that the second lens is the aperture stop, we can propagate a provisional chief
ray from the center of the stop in both directions. One possible choice for the provisional chief ray is:
\[
\begin{bmatrix} y' \\ n'u' \end{bmatrix}_{provisional} = \begin{bmatrix} 0\text{ mm} \\ 1 \end{bmatrix}
\]
where again an angle of 1 radian is HUGE, but we will scale it based on the parameters of the rest
of the system. Propagate this ray through the stop (towards image space) to obtain
\[
R_2 \begin{bmatrix} 0\text{ mm} \\ 1 \end{bmatrix}
=
\begin{bmatrix} 1 & 0 \\ -\frac{1}{-40\text{ mm}} & 1 \end{bmatrix}
\begin{bmatrix} 0\text{ mm} \\ 1 \end{bmatrix}
=
\begin{bmatrix} 0\text{ mm} \\ 1 \end{bmatrix}
\]
so the height and angle of the provisional chief ray are unchanged by the action of the lens L2 that is
the stop because the ray passes through the center of the lens.
The provisional chief ray may be propagated from the stop backwards towards the first lens.
The translation matrix is inverted because we are tracing the ray backwards, from right to left:
\[
T^{-1} \begin{bmatrix} 0\text{ mm} \\ 1 \end{bmatrix}
=
\begin{bmatrix} 1 & +160\text{ mm} \\ 0 & 1 \end{bmatrix}^{-1}
\begin{bmatrix} 0\text{ mm} \\ 1 \end{bmatrix}
=
\begin{bmatrix} -160\text{ mm} \\ 1 \end{bmatrix}
\]
The height of the provisional chief ray at the first element is negative, which means that it is BELOW
the optical axis at a MUCH LARGER distance than the semidiameter $\frac{d_1}{2} = 20$ mm of L1. To ensure
that the chief ray gets through the first lens, we have to scale its angle by the factor:
\[
\frac{d_1 / 2}{|y|} = \frac{20\text{ mm}}{160\text{ mm}} = \frac{1}{8}
\]
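So now go back to the original prescription for the provisional chief ray and scale it to obtain the
actual chief ray:
\[
\begin{bmatrix} y \\ nu \end{bmatrix}_{\text{at L2}}
= \frac{1}{8} \begin{bmatrix} y \\ nu \end{bmatrix}_{provisional}
= \frac{1}{8} \begin{bmatrix} 0\text{ mm} \\ 1 \end{bmatrix}
= \begin{bmatrix} 0\text{ mm} \\ \frac{1}{8} \end{bmatrix}
\]
Translated backwards to the first lens, the same ray is
\[
\begin{bmatrix} y' \\ n'u' \end{bmatrix}_{\text{at L1}} = \begin{bmatrix} -20\text{ mm} \\ \frac{1}{8} \end{bmatrix}
\]
We can now propagate this ray through L1. The chief ray emerging from the front vertex is:
\[
R_1^{-1}\, T^{-1} \begin{bmatrix} 0\text{ mm} \\ \frac{1}{8} \end{bmatrix}
=
\begin{bmatrix} 1 & 0 \\ -\frac{1}{+200\text{ mm}} & 1 \end{bmatrix}^{-1}
\begin{bmatrix} 1 & +160\text{ mm} \\ 0 & 1 \end{bmatrix}^{-1}
\begin{bmatrix} 0\text{ mm} \\ \frac{1}{8} \end{bmatrix}
=
\begin{bmatrix} -20\text{ mm} \\ \frac{1}{40} \end{bmatrix}
\]
Now propagate this chief ray forwards through the system by multiplying by $M_{VV'}$:
\[
\begin{bmatrix} \frac{1}{5} & 160\text{ mm} \\ 0 & 5 \end{bmatrix}
\begin{bmatrix} -20\text{ mm} \\ \frac{1}{40} \end{bmatrix}
=
\begin{bmatrix} 0\text{ mm} \\ \frac{1}{8} \end{bmatrix}
\]
which has height of zero emerging from L2 (the aperture stop), as expected.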
The field of view of the system is twice the angle at the front of L1:
\[
\text{FoV} = 2 \cdot \frac{1}{40}\text{ radian} = \frac{1}{20}\text{ radian} = \frac{1}{20} \cdot \frac{180^\circ}{\pi} \cong 2.864^\circ
\]
The exit pupil is located at the aperture stop L2, while the entrance pupil is the image
of the stop in object space, so we can evaluate the location of the entrance pupil from the calculation
of the chief ray emerging from the front vertex:
\[
\begin{bmatrix} y \\ nu \end{bmatrix} \text{(emerging from front vertex)} = \begin{bmatrix} -20\text{ mm} \\ \frac{1}{40} \end{bmatrix}
\]
The height is $-20$ mm and the angle is $\frac{1}{40}$ radian, so the distance from the vertex to the location
where the ray crosses the optical axis is:
\[
z_{V,NP} = -\frac{y}{nu} = \frac{20\text{ mm}}{\frac{1}{40}} = +800\text{ mm}
\]
The distance from the vertex to the entrance pupil is positive, so the pupil is behind the objective
and is virtual. The transverse magnification of the entrance pupil is:
\[
M_T = \frac{800\text{ mm}}{160\text{ mm}} = +5
\]
so the diameter of the entrance pupil is magnified:
\[
d_{NP} = 5 \cdot 5\text{ mm} = 25\text{ mm}
\]
Marginal ray (red) and chief ray (blue) from object at infinity traced through Galilean telescope with
aperture stop at second lens (eyepiece).
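The steps of Example 1 (find the stop with a provisional marginal ray, then scale a provisional chief ray traced backward from the stop) can be sketched in a few lines of Python. The variable names are illustrative, and the chief-ray part of the sketch assumes that L2 turns out to be the aperture stop, as it does for these diameters.

```python
from fractions import Fraction
from math import degrees

f1, d1 = Fraction(200), Fraction(40)   # objective L1 (mm)
f2, d2 = Fraction(-40), Fraction(5)    # eyelens L2 (mm)
t = f1 + f2                            # separation, 160 mm

# Provisional marginal ray from infinity, grazing the rim of L1
y1, nu = d1 / 2, Fraction(0)
nu = nu - y1 / f1                      # refraction at L1
y2 = y1 + t * nu                       # height at L2
l2_is_stop = abs(y2) > d2 / 2
scale = (d2 / 2) / abs(y2) if l2_is_stop else Fraction(1)
marginal_at_l1 = scale * y1

# Provisional chief ray (0, 1) at the stop L2, traced backward to L1
cy, cnu = Fraction(0), Fraction(1)
cy = cy - t * cnu                      # inverse translation
chief_scale = (d1 / 2) / abs(cy)       # make it graze the rim of L1
cy, cnu = chief_scale * cy, chief_scale * cnu
cnu_in = cnu + cy / f1                 # inverse refraction at L1

fov_deg = degrees(float(2 * cnu_in))
z_entrance_pupil = -cy / cnu_in        # where the incoming chief ray crosses the axis
d_entrance_pupil = 2 * marginal_at_l1

print("stop at L2:", l2_is_stop, " marginal height at L1:", marginal_at_l1)
print("chief ray into L1:", (cy, cnu_in), " FoV (deg):", fov_deg)
print("entrance pupil:", z_entrance_pupil, "mm from V, diameter", d_entrance_pupil, "mm")
```

The entrance-pupil diameter obtained from the scaled marginal ray agrees with the value obtained from the pupil magnification.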
Example 2: Galilean telescope with aperture stop at FIRST lens, object at ∞
We already know that the height of the provisional marginal ray at the second lens was
y = 4 mm, so we can select a diameter for L2 whose semidiameter exceeds this value, so that the
aperture stop is now the first lens:
L1 : f1 = +200 mm, d1 = 40 mm
L2 : f2 = −40 mm, d2 = 10 mm
t = f1 + f2 = 160 mm
The vertex-vertex matrix is the same as before:
\[
M_{VV'} = \begin{bmatrix} \frac{1}{5} & 160\text{ mm} \\ 0 & 5 \end{bmatrix}
\]
We know from the results just calculated that if d2 = 10 mm, then its semidiameter exceeds
the height of the provisional marginal ray, so the aperture stop then becomes the first lens. The
marginal ray we calculated for the first lens then becomes the actual marginal ray; at the first lens,
the marginal ray is:
\[
\begin{bmatrix} y \\ nu \end{bmatrix} (\text{at L1}) = \begin{bmatrix} 20\text{ mm} \\ 0 \end{bmatrix}
\]
and the marginal ray leaving the system after L2 is:
\[
\begin{bmatrix} y' \\ n'u' \end{bmatrix} (\text{after L2})
= M_{VV'} \begin{bmatrix} 20\text{ mm} \\ 0 \end{bmatrix}
= \begin{bmatrix} \frac{1}{5} & 160\text{ mm} \\ 0 & 5 \end{bmatrix} \begin{bmatrix} 20\text{ mm} \\ 0 \end{bmatrix}
= \begin{bmatrix} 4\text{ mm} \\ 0 \end{bmatrix}
\]
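Since the aperture stop has moved to L1 from L2, we have to evaluate a different chief ray; it will go
through the center of L1, so the provisional chief ray at L1 is:
\[
\begin{bmatrix} y \\ nu \end{bmatrix}_{provisional} (\text{at L1}) = \begin{bmatrix} 0\text{ mm} \\ 1 \end{bmatrix}
\]
After the first refraction, the provisional chief ray is:
\[
\begin{bmatrix} y' \\ n'u' \end{bmatrix}_{provisional} (\text{after L1})
= \begin{bmatrix} 1 & 0 \\ -\frac{1}{+200\text{ mm}} & 1 \end{bmatrix} \begin{bmatrix} 0\text{ mm} \\ 1 \end{bmatrix}
= \begin{bmatrix} 0\text{ mm} \\ 1 \end{bmatrix}
\]
which again should be no surprise: since the chief ray goes through the center of L1, the lens has no
impact on the ray.
Now propagate the provisional chief ray to L2 by applying the translation matrix:
\[
T \begin{bmatrix} 0\text{ mm} \\ 1 \end{bmatrix}
= \begin{bmatrix} 1 & 160\text{ mm} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 0\text{ mm} \\ 1 \end{bmatrix}
= \begin{bmatrix} 160\text{ mm} \\ 1 \end{bmatrix}
\]
so the ray height of the chief ray is again MUCH larger than the semidiameter of the lens. The
scaling factor that must be applied to the provisional chief ray is the ratio of the semidiameter of
L2 to the ray height:
\[
\frac{d_2 / 2}{y} = \frac{5\text{ mm}}{160\text{ mm}} = \frac{5}{160} = \frac{1}{32}
\]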
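Therefore the true chief ray at the first lens is:
\[
\begin{bmatrix} y \\ nu \end{bmatrix} (\text{at L1})
= \frac{1}{32} \begin{bmatrix} y \\ nu \end{bmatrix}_{provisional}
= \frac{1}{32} \begin{bmatrix} 0\text{ mm} \\ 1 \end{bmatrix}
= \begin{bmatrix} 0\text{ mm} \\ \frac{1}{32} \end{bmatrix}
\]
In words, the angle of the chief ray into the first lens (and therefore into the aperture stop) is $\frac{1}{32}$
radian, so the full-angle field of view of the system is:
\[
\text{FoV} = 2u = \frac{1}{16}\text{ radian} = \frac{1}{16} \cdot \frac{180^\circ}{\pi} \cong 3.58^\circ
\]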
which is larger than the field of view in the first case with the smaller diameter for L2 .
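Just for fun, propagate both the marginal and chief rays through the system at the same time:
\[
M_{VV'} \begin{bmatrix} y & \bar{y} \\ nu & n\bar{u} \end{bmatrix}
= \begin{bmatrix} y' & \bar{y}' \\ n'u' & n'\bar{u}' \end{bmatrix}
\]
\[
\begin{bmatrix} \frac{1}{5} & 160\text{ mm} \\ 0 & 5 \end{bmatrix}
\begin{bmatrix} 20\text{ mm} & 0\text{ mm} \\ 0 & \frac{1}{32} \end{bmatrix}
=
\begin{bmatrix} 4\text{ mm} & 5\text{ mm} \\ 0 & \frac{5}{32} \end{bmatrix}
\]
So the ray height of the marginal ray after the second lens is 4 mm and the ray angle is 0 radians
(propagates to the image at infinity), while the chief ray height after L2 is 5 mm and the angle is $\frac{5}{32}$
radians. The full angle of the image field is $2 \cdot \frac{5}{32} = \frac{5}{16}$ radians $\cong 17.9^\circ$.
Figure: marginal and chief rays traced through the Galilean telescope with the aperture stop at the first lens.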
The entrance pupil coincides with the aperture stop in this system, while the exit pupil is the
image of the aperture stop seen through L2 . The object distance to the stop is f1 + f2 = 160 mm, so
the exit pupil distance is:
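\[
z_{XP} = \frac{z_1 f_2}{z_1 - f_2} = \frac{160\text{ mm} \cdot (-40\text{ mm})}{160\text{ mm} - (-40\text{ mm})} = -32\text{ mm}
\]
and the diameter of the exit pupil is:
\[
d_{XP} = M_T \cdot 40\text{ mm} = \frac{32\text{ mm}}{160\text{ mm}} \cdot 40\text{ mm} = +8\text{ mm}
\]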
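The exit pupil found from the imaging equation can be cross-checked against the chief ray, which must cross the axis at the exit pupil after leaving L2. This is a minimal sketch in Python with exact fractions, using the parameters of Example 2; the variable names are illustrative.

```python
from fractions import Fraction

f1, d1 = Fraction(200), Fraction(40)   # objective L1 (aperture stop)
f2, d2 = Fraction(-40), Fraction(10)   # eyelens L2
t = f1 + f2                            # 160 mm

# Image of the stop (at L1) formed by L2, from the thin-lens equation
z_xp = t * f2 / (t - f2)
m_t = -z_xp / t
d_xp = abs(m_t) * d1

# Cross-check: chief ray from the center of L1, scaled to graze the rim of L2
y, nu = Fraction(0), Fraction(1)
y = y + t * nu                         # provisional height at L2
chief_scale = (d2 / 2) / abs(y)
y, nu = chief_scale * y, chief_scale * nu
nu_out = nu - y / f2                   # refraction at L2
z_cross = -y / nu_out                  # axis crossing measured from L2

print("exit pupil distance:", z_xp, "mm, diameter:", d_xp, "mm")
print("chief ray after L2:", (y, nu_out), " crosses axis at", z_cross, "mm")
```

Both routes place the exit pupil 32 mm in front of L2, so it is virtual.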
Example 3: Galilean telescope with aperture stop between lenses, object at ∞
Now consider the result if we place an iris diaphragm with diameter d = 8 mm midway between L1
and L2. The prescription for the system is:
L1 : f1 = +200 mm, d1 = 40 mm
L2 : f2 = −40 mm, d2 = 10 mm
t = f1 + f2 = 160 mm
S : VS = 80 mm, SV' = 80 mm, dStop = 8 mm
The matrix for the imaging elements is unchanged:
\[
M_{VV'} = \begin{bmatrix} \frac{1}{5} & 160\text{ mm} \\ 0 & 5 \end{bmatrix}
\]
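but we need to confirm that the new iris is the aperture stop. Cast in a provisional marginal ray
from an object at infinity:
\[
R_1 \begin{bmatrix} 20\text{ mm} \\ 0 \end{bmatrix}
= \begin{bmatrix} 1 & 0 \\ -\frac{1}{+200\text{ mm}} & 1 \end{bmatrix} \begin{bmatrix} 20\text{ mm} \\ 0 \end{bmatrix}
= \begin{bmatrix} 20\text{ mm} \\ -\frac{1}{10} \end{bmatrix}
\]
Now propagate this ray to the iris, located at a distance of 80 mm after L1:
\[
T \begin{bmatrix} 20\text{ mm} \\ -\frac{1}{10} \end{bmatrix}
= \begin{bmatrix} 1 & 80\text{ mm} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 20\text{ mm} \\ -\frac{1}{10} \end{bmatrix}
= \begin{bmatrix} 12\text{ mm} \\ -\frac{1}{10} \end{bmatrix}
\implies y = 12\text{ mm} > \frac{d_{Stop}}{2} = \frac{8\text{ mm}}{2} = 4\text{ mm at iris}
\]
So again we need to scale the provisional marginal ray by the ratio:
\[
\frac{d_{Stop} / 2}{y} = \frac{4\text{ mm}}{12\text{ mm}} = \frac{1}{3}
\]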
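So the marginal ray at the first lens is:
\[
\begin{bmatrix} y \\ nu \end{bmatrix}
= \frac{1}{3} \begin{bmatrix} 20\text{ mm} \\ 0 \end{bmatrix}
= \begin{bmatrix} \frac{20}{3}\text{ mm} \\ 0 \end{bmatrix}
= \begin{bmatrix} 6\tfrac{2}{3}\text{ mm} \\ 0 \end{bmatrix}
\]
Now propagate this ray through the first surface to the iris:
\[
\begin{bmatrix} 1 & 80\text{ mm} \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ -\frac{1}{+200\text{ mm}} & 1 \end{bmatrix}
\begin{bmatrix} \frac{20}{3}\text{ mm} \\ 0 \end{bmatrix}
=
\begin{bmatrix} 4\text{ mm} \\ -\frac{1}{30} \end{bmatrix}
\]
We can now propagate this from the iris to and through the second lens:
\[
\begin{bmatrix} 1 & 0 \\ -\frac{1}{-40\text{ mm}} & 1 \end{bmatrix}
\begin{bmatrix} 1 & 80\text{ mm} \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 4\text{ mm} \\ -\frac{1}{30} \end{bmatrix}
=
\begin{bmatrix} \frac{4}{3}\text{ mm} \\ 0 \end{bmatrix}
\]
So the marginal ray exiting the system is at a height of $\frac{4}{3}$ mm and an angle of 0 radians (parallel to
the axis, as expected for a telescope).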
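Now propagate the provisional chief ray backward (toward L1) from the iris. The provisional chief
ray at the stop is:
\[
\begin{bmatrix} y \\ nu \end{bmatrix}_{\text{at stop}} = \begin{bmatrix} 0\text{ mm} \\ 1 \end{bmatrix}
\]
and the inverse translation from the iris to L1 gives:
\[
\begin{bmatrix} 1 & +80\text{ mm} \\ 0 & 1 \end{bmatrix}^{-1} \begin{bmatrix} 0\text{ mm} \\ 1 \end{bmatrix}
= \begin{bmatrix} 1 & -80\text{ mm} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 0\text{ mm} \\ 1 \end{bmatrix}
= \begin{bmatrix} -80\text{ mm} \\ 1 \end{bmatrix}
\]
If we propagate the provisional chief ray from the iris towards L2, we obtain:
\[
\begin{bmatrix} 1 & +80\text{ mm} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 0\text{ mm} \\ 1 \end{bmatrix}
= \begin{bmatrix} +80\text{ mm} \\ 1 \end{bmatrix}
\]
Note that both ray heights are too large, but that the ray height of the provisional chief ray at L2 is
much larger relative to the semidiameter than its height at L1; the ratios are:
\[
\frac{d_1 / 2}{80\text{ mm}} = \frac{20\text{ mm}}{80\text{ mm}} = \frac{1}{4}
\]
\[
\frac{d_2 / 2}{80\text{ mm}} = \frac{5\text{ mm}}{80\text{ mm}} = \frac{1}{16}
\]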
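So the second lens constrains the chief ray. Apply the scaling factor to the provisional chief ray to
find the true chief ray at the iris:
\[
\frac{1}{16} \begin{bmatrix} 0\text{ mm} \\ 1 \end{bmatrix}
= \begin{bmatrix} 0\text{ mm} \\ \frac{1}{16} \end{bmatrix}
= \begin{bmatrix} y \\ nu \end{bmatrix}_{\text{at stop}}
\]
Propagate it backward towards and through L1 to find the prescription for the chief ray entering
the system:
\[
\begin{bmatrix} 1 & 0 \\ -\frac{1}{+200\text{ mm}} & 1 \end{bmatrix}^{-1}
\begin{bmatrix} 1 & +80\text{ mm} \\ 0 & 1 \end{bmatrix}^{-1}
\begin{bmatrix} 0\text{ mm} \\ \frac{1}{16} \end{bmatrix}
=
\begin{bmatrix} -5\text{ mm} \\ \frac{3}{80} \end{bmatrix}
= \begin{bmatrix} y \\ nu \end{bmatrix}_{\text{into L1}}
\]
The field of view of the system is twice the chief ray angle into the system:
\[
\text{FoV} = 2 \cdot \frac{3}{80}\text{ radians} = \frac{3}{40}\text{ radians} = \frac{3}{40} \cdot \frac{180^\circ}{\pi} \cong 4.30^\circ
\]
Propagate the chief ray towards and through L2 to find the chief ray exiting the system:
\[
\begin{bmatrix} 1 & 0 \\ -\frac{1}{-40\text{ mm}} & 1 \end{bmatrix}
\begin{bmatrix} 1 & +80\text{ mm} \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 0\text{ mm} \\ \frac{1}{16} \end{bmatrix}
=
\begin{bmatrix} +5\text{ mm} \\ \frac{3}{16} \end{bmatrix}
= \begin{bmatrix} y \\ nu \end{bmatrix}_{\text{out of L2}}
\]
Figure: marginal and chief rays traced through the Galilean telescope with the iris diaphragm between the lenses.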
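With three apertures in the system, the stop is the element with the smallest ratio of semidiameter to provisional marginal ray height. The sketch below (Python, exact fractions) applies that rule to Example 3; the dictionary keys and variable names are illustrative, and the chief-ray part assumes that the iris is found to be the stop.

```python
from fractions import Fraction

f1, f2 = Fraction(200), Fraction(-40)
semidiameter = {"L1": Fraction(20), "iris": Fraction(4), "L2": Fraction(5)}
gap = Fraction(80)   # L1 to iris, and iris to L2 (mm)

# Provisional marginal ray from infinity at the rim of L1
y_l1, nu = semidiameter["L1"], Fraction(0)
nu -= y_l1 / f1                    # refraction at L1
y_iris = y_l1 + gap * nu
y_l2 = y_iris + gap * nu
height = {"L1": y_l1, "iris": y_iris, "L2": y_l2}

# The aperture stop has the smallest ratio semidiameter / |ray height|
ratio = {name: semidiameter[name] / abs(h) for name, h in height.items()}
stop = min(ratio, key=ratio.get)
marginal_scale = ratio[stop]

# Provisional chief ray (0, 1) at the iris reaches heights -gap and +gap
chief_scale = min(semidiameter["L1"] / gap, semidiameter["L2"] / gap)
nu_stop = chief_scale
y_at_l1 = -gap * nu_stop
nu_in = nu_stop + y_at_l1 / f1     # inverse refraction at L1
y_at_l2 = gap * nu_stop
nu_out = nu_stop - y_at_l2 / f2    # refraction at L2

print("stop:", stop, " marginal scale:", marginal_scale)
print("chief ray into L1:", (y_at_l1, nu_in), " out of L2:", (y_at_l2, nu_out))
print("field of view (radians):", 2 * nu_in)
```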
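Example 4: Keplerian telescope, object at ∞
Substitute a positive lens with the diameter of 5 mm for L2, which also means that we have to change
the distance between the lenses:
L1 : f1 = +200 mm, d1 = 40 mm
L2 : f2 = +40 mm, d2 = 5 mm
t = f1 + f2 = 240 mm
The vertex-vertex (system) matrix is:
\[
M_{VV'} =
\begin{bmatrix} 1 & 0 \\ -\frac{1}{+40\text{ mm}} & 1 \end{bmatrix}
\begin{bmatrix} 1 & 240\text{ mm} \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ -\frac{1}{+200\text{ mm}} & 1 \end{bmatrix}
=
\begin{bmatrix} -\frac{1}{5} & +240\text{ mm} \\ 0 & -5 \end{bmatrix}
\]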
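The prescription for the provisional marginal ray into the system from an object at infinity has the
same ray height as the semidiameter of L1:
\[
\begin{bmatrix} y \\ nu \end{bmatrix}_{provisional} = \begin{bmatrix} 20\text{ mm} \\ 0 \end{bmatrix}
\]
The outgoing provisional marginal ray from the system is:
\[
M_{VV'} \begin{bmatrix} y \\ nu \end{bmatrix}_{provisional}
= \begin{bmatrix} -\frac{1}{5} & 240\text{ mm} \\ 0 & -5 \end{bmatrix} \begin{bmatrix} 20\text{ mm} \\ 0 \end{bmatrix}
= \begin{bmatrix} -4\text{ mm} \\ 0 \end{bmatrix}
\]
Since the magnitude of the ray height of the provisional ray is larger than the semidiameter of L2,
L2 is the aperture stop:
\[
|y'| = 4\text{ mm} > \frac{d_2}{2} = 2.5\text{ mm} \implies \text{L2 is the aperture stop}
\]
so we must scale the provisional marginal ray by a factor
\[
\begin{bmatrix} y \\ nu \end{bmatrix}
= \left( \frac{d_2 / 2}{|y'|} \right) \begin{bmatrix} y \\ nu \end{bmatrix}_{provisional}
= \frac{\frac{5}{2}\text{ mm}}{4\text{ mm}} \begin{bmatrix} y \\ nu \end{bmatrix}_{provisional}
= \frac{5}{8} \begin{bmatrix} y \\ nu \end{bmatrix}_{provisional}
\]
\[
\begin{bmatrix} y \\ nu \end{bmatrix}_{\text{at L1}}
= \frac{5}{8} \begin{bmatrix} 20\text{ mm} \\ 0 \end{bmatrix}
= \begin{bmatrix} 12.5\text{ mm} \\ 0 \end{bmatrix}
\]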
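Now to the chief ray; the provisional chief ray emerging from the center of the aperture stop has zero
height and angle of unity:
\[
\begin{bmatrix} y' \\ n'u' \end{bmatrix}_{provisional} = \begin{bmatrix} 0\text{ mm} \\ 1 \end{bmatrix}
\]
The ray is propagated backwards to the first lens:
\[
T^{-1} \begin{bmatrix} 0\text{ mm} \\ 1 \end{bmatrix}
= \begin{bmatrix} 1 & +240\text{ mm} \\ 0 & 1 \end{bmatrix}^{-1} \begin{bmatrix} 0\text{ mm} \\ 1 \end{bmatrix}
= \begin{bmatrix} -240\text{ mm} \\ 1 \end{bmatrix}
\]
so the height of the provisional chief ray at the first element is |y| = 240 mm, which is MUCH larger
than the semidiameter $\frac{d_1}{2} = 20$ mm of L1. To ensure that the chief ray gets through the first lens,
we have to scale its angle by the factor:
\[
\frac{20\text{ mm}}{240\text{ mm}} = \frac{1}{12}
\]
So now go back to the original prescription for the provisional chief ray and scale it:
\[
\begin{bmatrix} y' \\ n'u' \end{bmatrix}
= \frac{1}{12} \begin{bmatrix} y' \\ n'u' \end{bmatrix}_{provisional}
= \frac{1}{12} \begin{bmatrix} 0\text{ mm} \\ 1 \end{bmatrix}
= \begin{bmatrix} 0\text{ mm} \\ \frac{1}{12} \end{bmatrix}
\]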
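We can now propagate it from the rear vertex to and through the front vertex of the system. The
chief ray emerging from the front vertex is:
\[
\begin{bmatrix} 1 & 0 \\ -\frac{1}{+200\text{ mm}} & 1 \end{bmatrix}^{-1}
\begin{bmatrix} 1 & +240\text{ mm} \\ 0 & 1 \end{bmatrix}^{-1}
\begin{bmatrix} 1 & 0 \\ -\frac{1}{+40\text{ mm}} & 1 \end{bmatrix}^{-1}
\begin{bmatrix} 0\text{ mm} \\ \frac{1}{12} \end{bmatrix}
=
\begin{bmatrix} -20\text{ mm} \\ -\frac{1}{60} \end{bmatrix}
\]
In words, the chief ray height at the front surface is $y = -20$ mm and the chief ray angle is $nu = -\frac{1}{60}$
radian (where the negative sign again just means that the ray angle into the system has the opposite
sign from that emerging therefrom). The field of view of the system is twice the magnitude of the angle:
\[
\text{FoV} = 2 \cdot \frac{1}{60}\text{ radian} = \frac{1}{30}\text{ radian} = \frac{1}{30} \cdot \frac{180^\circ}{\pi} \cong 1.91^\circ
\]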
Marginal ray (red) and chief ray (blue) from object at infinity traced through Keplerian telescope
with aperture stop at second lens.

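Example 5: Keplerian telescope, stop at eyepiece, nearby object OV = 500 mm
Consider a telescope with the following parameters.
L1 : f1 = +200 mm, d1 = 40 mm
L2 : f2 = +40 mm, d2 = 5 mm
t = f1 + f2 = 240 mm
z1 = OV = 500 mm
The provisional marginal ray goes from the center of the object to the edge of the first lens, through
the system, and to the center of the image. The first provisional ray is:
\[
\begin{bmatrix} y \\ nu \end{bmatrix}_{provisional} (\text{at object}) = \begin{bmatrix} 0\text{ mm} \\ 1 \end{bmatrix}
\]
It is useful to locate the image by propagating this provisional ray through the system:
\[
\begin{bmatrix} 1 & 0 \\ -\frac{1}{+40\text{ mm}} & 1 \end{bmatrix}
\begin{bmatrix} 1 & 240\text{ mm} \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ -\frac{1}{+200\text{ mm}} & 1 \end{bmatrix}
\begin{bmatrix} 1 & 500\text{ mm} \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 0\text{ mm} \\ 1 \end{bmatrix}
=
\begin{bmatrix} 140\text{ mm} \\ -5 \end{bmatrix}
\]
So the image location relative to the rear vertex is:
\[
V'O' = -\frac{y}{u} = -\frac{140\text{ mm}}{-5\text{ radians}} = +28\text{ mm}
\]
so the image is real.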
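Now find the height of the provisional marginal ray at L1:
\[
\begin{bmatrix} y \\ nu \end{bmatrix}_{provisional} (\text{at L1})
= \begin{bmatrix} 1 & 500\text{ mm} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 0\text{ mm} \\ 1 \end{bmatrix}
= \begin{bmatrix} 500\text{ mm} \\ 1 \end{bmatrix}
\]
where the ray height is MUCH too large and must be scaled to fit into the lens. The scale factor is:
\[
\frac{d_1 / 2}{y\ (\text{at lens})} = \frac{20\text{ mm}}{500\text{ mm}} = \frac{1}{25}
\]
So the second iteration of the provisional marginal ray at the front of the first lens is:
\[
\frac{1}{25} \begin{bmatrix} 500\text{ mm} \\ 1 \end{bmatrix} = \begin{bmatrix} 20\text{ mm} \\ \frac{1}{25} \end{bmatrix}
\]
which has a much smaller incident angle.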
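Now propagate this ray through the first lens to the second lens:
\[
T\, R_1 \begin{bmatrix} 20\text{ mm} \\ \frac{1}{25} \end{bmatrix}
= \begin{bmatrix} 1 & 240\text{ mm} \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ -\frac{1}{+200\text{ mm}} & 1 \end{bmatrix}
\begin{bmatrix} 20\text{ mm} \\ \frac{1}{25} \end{bmatrix}
= \begin{bmatrix} \frac{28}{5}\text{ mm} \\ -\frac{3}{50} \end{bmatrix}
= \begin{bmatrix} 5\tfrac{3}{5}\text{ mm} \\ -\frac{3}{50} \end{bmatrix}
\]
so the ray height is still too large; it is blocked by L2 (which therefore is the aperture stop); scale
this ray to fit into the second lens by applying the factor:
\[
\frac{d_2 / 2}{y\ (\text{at L2})} = \frac{2.5\text{ mm}}{\frac{28}{5}\text{ mm}} = \frac{12.5}{28} = \frac{25}{56}
\]
So the third iteration produces the actual marginal ray from an object at a distance of 500 mm from
L1:
\[
\begin{bmatrix} y \\ nu \end{bmatrix}_{\text{at object}}
= \frac{25}{56} \cdot \frac{1}{25} \begin{bmatrix} 0\text{ mm} \\ 1 \end{bmatrix}
= \begin{bmatrix} 0\text{ mm} \\ \frac{1}{56} \end{bmatrix}
\cong \begin{bmatrix} 0\text{ mm} \\ 0.017857 \end{bmatrix}
\]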
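The prescription for the marginal ray at L1 is:
\[
\begin{bmatrix} 1 & 500\text{ mm} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 0\text{ mm} \\ \frac{1}{56} \end{bmatrix}
= \begin{bmatrix} \frac{125}{14}\text{ mm} \\ \frac{1}{56} \end{bmatrix}
\cong \begin{bmatrix} 8.929\text{ mm} \\ \frac{1}{56} \end{bmatrix}
\]
where the ray height is much smaller than the semidiameter of L1, so the lens is overly large.
We can propagate this through the system to find the actual prescription for the exiting marginal
ray:
\[
M_{VV'} \begin{bmatrix} 1 & 500\text{ mm} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 0\text{ mm} \\ \frac{1}{56} \end{bmatrix}
= \begin{bmatrix} -\frac{1}{5} & 240\text{ mm} \\ 0 & -5 \end{bmatrix}
\begin{bmatrix} 1 & 500\text{ mm} \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 0\text{ mm} \\ \frac{1}{56} \end{bmatrix}
= \begin{bmatrix} \frac{5}{2}\text{ mm} \\ -\frac{5}{56} \end{bmatrix}
= \begin{bmatrix} y \\ nu \end{bmatrix}_{\text{at } V'}
\]
Just to check, find the distance to the image to make sure it matches the result for the provisional
marginal ray:
\[
V'O' = -\frac{y}{nu} = -\frac{\frac{5}{2}\text{ mm}}{-\frac{5}{56}} = +28\text{ mm}
\]
which agrees with what we found earlier.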
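Now that we know that L2 is the aperture stop for the specified object location, we can propagate
a provisional chief ray from the center of the stop in both directions. (We will find that the chief ray is
unaffected by the location of the object.) The provisional chief ray is:
\[
\begin{bmatrix} y' \\ n'u' \end{bmatrix}_{provisional} = \begin{bmatrix} 0\text{ mm} \\ +1 \end{bmatrix}
\]
Propagate through the stop towards image space to obtain
\[
R_2 \begin{bmatrix} 0\text{ mm} \\ 1 \end{bmatrix}
= \begin{bmatrix} 1 & 0 \\ -\frac{1}{+40\text{ mm}} & 1 \end{bmatrix} \begin{bmatrix} 0\text{ mm} \\ 1 \end{bmatrix}
= \begin{bmatrix} 0\text{ mm} \\ 1 \end{bmatrix}
\]
so the height and angle of the provisional chief ray are unchanged by the action of the lens L2 that is
the stop because the ray passes through the center of the lens.
The provisional chief ray may be propagated from the stop backwards towards the first lens.
The inverse translation matrix yields the ray height and angle at the first lens:
\[
T^{-1} \begin{bmatrix} 0\text{ mm} \\ 1 \end{bmatrix}
= \begin{bmatrix} 1 & +240\text{ mm} \\ 0 & 1 \end{bmatrix}^{-1} \begin{bmatrix} 0\text{ mm} \\ 1 \end{bmatrix}
= \begin{bmatrix} -240\text{ mm} \\ 1 \end{bmatrix}
= \begin{bmatrix} y' \\ n'u' \end{bmatrix}_{provisional} (\text{at L1})
\]
Note that the height of the provisional chief ray at L1 is $y = -240$ mm, which means that it is
BELOW the optical axis at a MUCH larger distance than the semidiameter $\frac{d_1}{2} = 20$ mm of L1. To ensure
that the chief ray gets through the first lens, we have to scale its angle by the factor:
\[
\frac{d_1 / 2}{|y|} = \frac{20\text{ mm}}{240\text{ mm}} = \frac{1}{12}
\]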
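So now go back to the original prescription for the provisional chief ray and scale it to obtain the
actual chief ray:
\[
\begin{bmatrix} y' \\ n'u' \end{bmatrix}
= \frac{1}{12} \begin{bmatrix} y' \\ n'u' \end{bmatrix}_{provisional}
= \frac{1}{12} \begin{bmatrix} 0\text{ mm} \\ 1 \end{bmatrix}
= \begin{bmatrix} 0\text{ mm} \\ \frac{1}{12} \end{bmatrix}
\]
Note that this is the same chief ray as for the case where the object is at infinity. In words, the chief
ray is determined by the stop and the diameters of the other elements, not by the location of the
object.
We can now propagate the scaled chief ray from the stop to and through the front vertex
of the system. The chief ray emerging from the front vertex is:
\[
\begin{bmatrix} 1 & 0 \\ -\frac{1}{+200\text{ mm}} & 1 \end{bmatrix}^{-1}
\begin{bmatrix} 1 & +240\text{ mm} \\ 0 & 1 \end{bmatrix}^{-1}
\begin{bmatrix} 0\text{ mm} \\ \frac{1}{12} \end{bmatrix}
=
\begin{bmatrix} -20\text{ mm} \\ -\frac{1}{60} \end{bmatrix}
\]
which has the correct ray height (the semidiameter of L1) $|y| = 20$ mm and angle $nu = -\frac{1}{60}$ radian.
The field of view of the system is twice the magnitude of the angle:
\[
\text{FoV} = 2 \cdot \frac{1}{60}\text{ radian} = \frac{1}{30}\text{ radian} = \frac{1}{30} \cdot \frac{180^\circ}{\pi} \cong 1.91^\circ
\]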
The exit pupil is located at the aperture stop L2, while the entrance pupil is the
image of the stop in object space, so we can evaluate the location of the entrance pupil from the
calculation of the chief ray emerging from the front vertex:
\[
\begin{bmatrix} y \\ nu \end{bmatrix} \text{(emerging from front vertex)} = \begin{bmatrix} -20\text{ mm} \\ -\frac{1}{60} \end{bmatrix}
\]
The height is $-20$ mm and the angle is $-\frac{1}{60}$ radian, so the distance from the vertex to the location
where the ray crosses the optical axis is:
\[
z_{V,NP} = -\frac{y}{nu} = -\frac{-20\text{ mm}}{-\frac{1}{60}} = -1200\text{ mm}
\]
that is, 1200 mm in front of the objective; the entrance pupil is real and its magnification is:
\[
M_T = \frac{-1200\text{ mm}}{240\text{ mm}} = -5
\]
so the diameter of the entrance pupil is:
\[
d_{NP} = |M_T| \cdot d_{Stop} = 5 \cdot 5\text{ mm} = 25\text{ mm}
\]
Marginal ray (red) and chief ray (blue) from object at a distance of 500 mm from the first lens
traced through Keplerian telescope with aperture stop at second lens.
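For a nearby object the three scaling iterations of Example 5 collapse into a single step: trace one provisional axial ray, then scale it by the smallest ratio of semidiameter to ray height. The sketch below (Python, exact fractions) does this for the parameters of Example 5; the function name `trace` and the variable names are illustrative.

```python
from fractions import Fraction

f1, d1 = Fraction(200), Fraction(40)   # objective L1 (mm)
f2, d2 = Fraction(40), Fraction(5)     # eyelens L2 (mm)
t = f1 + f2                            # 240 mm
z_obj = Fraction(500)                  # object distance OV (mm)

def trace(nu0):
    """Trace an axial ray leaving the object at angle nu0.

    Returns the heights at L1 and L2 and the angle after L2.
    """
    y1 = z_obj * nu0          # translation from object to L1
    nu1 = nu0 - y1 / f1       # refraction at L1
    y2 = y1 + t * nu1         # translation to L2
    nu2 = nu1 - y2 / f2       # refraction at L2
    return y1, y2, nu2

# Provisional marginal ray with unit angle locates the image
y1, y2, nu2 = trace(Fraction(1))
image_distance = -y2 / nu2    # V'O'

# The element with the smallest ratio is the aperture stop
ratio = {"L1": (d1 / 2) / abs(y1), "L2": (d2 / 2) / abs(y2)}
stop = min(ratio, key=ratio.get)
nu_marginal = ratio[stop] * 1
m1, m2, m_nu2 = trace(nu_marginal)

print("image distance V'O' =", image_distance, "mm, stop:", stop)
print("marginal ray angle at object:", nu_marginal)
print("marginal ray at L1:", m1, "mm, at V':", (m2, m_nu2))
```

The single scale factor 1/56 equals the product of the two factors (1/25 and 25/56) applied one after the other in the text.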
Chapter 4
Depth of Field and Depth of Focus
From experience with snapshots or movies, we all know that the optical images are not in focus
for objects at all distances from the lens; objects at distances other than that focused appear blurry.
This is not necessarily bad; it is used as a creative tool by photographers and cinematographers
to concentrate the attention of the viewer on particular objects of interest. However, in many (if
not all) scientific applications, this limitation to the region of good imaging is detrimental; we'd
like to see the entire 3-D object in sharp focus. For this reason, it is essential to understand the
factors that affect the depth of the region of sharp focus, which is the so-called "depth of field"
on the object as seen through the imaging system.
The concept of depth of field and focus and the dependence on f/# is illustrated in the figure
for a specified linear dimension of acceptable sharpness. The extent of the cone of rays between
the two locations truncated by this sharpness criterion is the depth of focus. Clearly this range is
larger for a smaller cone angle (larger f/#). This would lead us to the conclusion that the depth of
focus (and also its object-space equivalent, the depth of field) is proportional to the f/#:
\[
\Delta z \propto f/\#
\]
A more accurate criterion requires application of the principles of wave optics to show that diffraction
induces a blur spot whose linear dimension also increases with focal ratio and that defines the dimension
of acceptable blur. A "hybrid" combination of the principles of ray and wave optics leads to a
criterion that the depths of field and of focus actually vary with the square of the f/#:
\[
\Delta z \propto (f/\#)^2
\]
This hybrid criterion is discussed after illustrating the concept of depths of field and focus using
examples from film and television.
The depth of focus for a known linear dimension of acceptable sharpness depends on the angle of
the cone of rays, which is determined by the focal ratio (f/#) of the system. If the angle of the ray cone is
large (small f/#), then the extent of the cone in front of and behind the point of best focus is small;
if the angle of the ray cone is small (large f/#), then a wider range of depths appears in focus.
4.0.2 Examples of Depth of Field from Video and Film
Extensive discussion in Wikipedia at http://en.wikipedia.org/wiki/Depth_of_field
1. The Colbert Report, video image with "normal" lens, shows the difference in apparent sharpness
with depth in the scene. This naturally draws attention to the object that is in focus and
often serves as a cue to the audience about which is the object of interest. There are three
areas of interest at different distances from the lens, which is focused on the nearest plane
(Stephen Colbert); the more distant plane where Jon Stewart sits is noticeably blurry, and the
bookshelf in the most distant plane is very blurry.
Note the difference in sharpness with depth; Stephen Colbert in the foreground is in sharp focus,
Jon Stewart is clearly less sharp, and the items in the background are quite blurry.
2. Sherlock (© 2011, Masterpiece Mystery from the BBC), using limited depth of field to draw
attention to the point of interest
This example shows how the director draws the attention of the audience to the desired point of
interest. The two frames are from "A Scandal in Belgravia," the first episode in the second season
of Sherlock broadcast by the BBC and PBS. The two frames are taken from the same camera
position and separated in time by approximately two seconds. In the first frame, Sherlock
(Benedict Cumberbatch) is speaking about the camera phone of Irene Adler (Lara Pulver).
After he finishes speaking, the camera focus shifts rapidly to Adler in the background for her
reply. Note that her form is barely distinguishable in the first frame, which focuses the viewer's
attention upon Sherlock in the foreground.
Use of limited depth of field to draw the attention of the audience to the subject of interest. The
camera shifts focus rapidly from the foreground character (at top) to the background character (at
bottom).
3. Citizen Kane by Orson Welles, small aperture (large f/#) = large depth of field
Both foreground and background are in focus; note the cheek of Mr. Bernstein (Everett Sloane) in
the near foreground on the right and the venetian blinds in the windows at the back. Walter Thatcher
(George Coulouris) on the left and Charles Foster Kane (Orson Welles) in the center are in focus. The
distance to the windows appears to be small because of the sharp focus.
Different frame of the same scene from Citizen Kane shot with the same focus setting. George Coulouris
(as Walter Thatcher) and Everett Sloane (as Mr. Bernstein) remain in focus in the
foreground. Orson Welles (as Charles Foster Kane) has walked to the windows, which are now
clearly many feet from the foreground characters. Kane's stature appears to have been
diminished.
The film Citizen Kane (© 1941 RKO Pictures, Inc.) is famous for its creative cinematography
by Gregg Toland and the director/star Orson Welles, including original camera angles
(especially upward shots from the floor or even from beneath the floor plane), movements,
transitions, and the use of deep focus. Consider the two frames from the film of a group of
three characters: the standing Orson Welles in the center (at age 26 as the elderly Charles
Foster Kane, a testament to the skill of makeup artist Maurice Seiderman), George Coulouris
on the left (as Walter Parks Thatcher, who had been Kane's guardian), and Everett Sloane
on the right (as Kane's assistant Mr. Bernstein). In the first frame, the three characters
are grouped together and the entire scene appears to be in focus, from the skin on Bernstein's
face on the right to the venetian window blinds in the back. From the sharp focus of the background
windows and expectations about depth of field based on past experience, viewers likely
will surmise that the windows must be physically close to the characters and therefore that
Kane is much taller than the background window sill. Between the first and second images,
the standing Kane has taken 18 steps to walk to the windows (perhaps 35-50 feet from the
foreground characters), while remaining in focus the entire time. His height is now shown to
be approximately the same as the height of the window sill. The apparent shrinking of his
size during the walk may be interpreted as an artistic metaphor for the diminishing stature of
Kane due to the partial failure of his media empire during the Depression. He subsequently
walks back to the foreground to sign the agreement held by Mr. Thatcher that sells much of
his publishing/broadcasting empire back to Thatcher's bank. The very large depth of field can
only be obtained by a small aperture stop, which reduces the light reaching the sensor. Clearly
the emulsion must have good sensitivity (it must have been a "fast" film) and the lighting must
be sufficiently strong to record useful images. The sequence is available on YouTube at
http://www.youtube.com/watch?v=WTmVlDh2V2g. Interested readers might want to view
the documentary about the movie (http://www.youtube.com/watch?v=eCkYlCBFV6w). Another
scene in the movie that is interesting from the perspective of optics is the so-called "mirror
scene," which is at the end of the 1-minute clip at http://www.youtube.com/watch?v=8fIP7g9en10
Still from the mirror scene in Citizen Kane. Again, note the depth of field.
4. Spellbound, by Alfred Hitchcock (© Selznick International Pictures, Vanguard Films 1945)

The climactic scene in this classic movie is a confrontation between Dr. Murchison (Leo G.
Carroll) and Dr. Constance Petersen (Ingrid Bergman), where Petersen reveals she has evidence
that Murchison murdered Dr. Anthony Edwardes, whose substitute imposter is played
by Gregory Peck. Frames from the scene are shown in the figure. The frames from the viewpoint
of Dr. Murchison show the view of his hand, the gun, and Ingrid Bergman, with all apparently
in focus. To avoid problems with depth of field, the hand and gun are actually models
that are larger than life size that were positioned closer to Bergman than to the camera. The
website for Turner Classic Movies states that the scene took a week to set up and 19 takes to
get the final result (http://www.tcm.com/this-month/article/18621%7C0/Spellbound.html).
YouTube clip available from http://www.youtube.com/watch?v=8rDMotFmCJc.
Scenes from Spellbound (© Selznick International Pictures 1945), showing (a) Leo G. Carroll
holding a revolver; (b) Ingrid Bergman walking towards the door as Carroll's character aims the
revolver; (c) and (d) after Bergman's exit, the hand and gun turn towards the camera and fire.
An additional note of interest in this black-and-white film is that two color frames as the gun
fires were spliced into each print by hand.
One of the two color frames of the gunshot spliced into each print of the film Spellbound.
5. Somewhere in Time, split-diopter lens to focus on two distances simultaneously, giving the
appearance of expanded depth of field
Split-diopter lens (Fig. 5.13 from Visual Effects Cinematography by Zoran Perisic), which is
attached to the front of a normal lens and which adds power on one side of the field of view.
The frame from Somewhere in Time (© Universal Studios, 1980) illustrates the action of
the split-diopter lens added to the normal camera lens. Both the foreground field on the
right (with Christopher Reeve as Richard Collier) and the left-hand background field (with Jane
Seymour as Elyse McKenna, the white garden bench, and the trees) appear to be in focus.
The split-diopter lens adds refractive power (thus shortening the focal length) for half the field.
Because the sensor is the same distance from the rear vertex of these two "half-systems," the
object plane that is in focus in the half field with the additional power is closer to the lens. In
this example, the split-diopter lens is oriented to split the fields through the vertical white
pillar and adds power to the right half of the field. The left side of the vertical pillar is fuzzier
than the right side, where the features of the wood grain are visible. Note that the trees in
the background on the right are out of focus, while those on the left are sharp. The audience
likely does not notice the discrepancies in the image planes.
Frame from Somewhere in Time (© Universal Studios, 1980) showing use of split-diopter
lens. Both foreground and background are in focus, but note that the left side of the foreground
pillar is fuzzy while the right side is sharp.
A system consisting of both optics and sensor is diffraction-limited if the pixel size of the sensor
(smallest resolvable spot) is smaller than the linear dimension of the diffraction spot. The system
is detector-limited / sensor-limited if the linear dimension of the individual sensor elements is
larger than the diffraction spot.
4.1 Criterion for Acceptable Blur

The discussion of the limiting blur of an imaging system may be extended to characterize the
range of distances (or "depths") over which images of point objects exhibit the same (or at
least similar) blur dimensions. If specified in object space, the distance range is called the depth
of field; the same metric in image space is the depth of focus. The depth of field may be thought
of as the zone of acceptable sharpness for object locations.
There is no one way to define the depths of field and focus, but we can rather easily derive
a metric based on ray optics and a "hybrid" metric that includes the concept of diffraction from
wave optics (where those aspects must be taken on faith at this point). The measurement is based
upon the linear dimension $B'$ of the acceptable blur. This may be due to a metric of acceptable
spatial resolution or the size of the sensor elements, or the diameter of the diffraction spot in the
hybrid metric. Consider a hypothetical value of $B'$ shown in the figure. From this value, it is easy
to determine the range of possible axial distances that correspond to $B'$ in the ray model and use
that to evaluate the corresponding dimension $B$ in object space via the transverse magnification
\[
M_T = -\frac{z'}{z} = \frac{B'}{B}.
\]
The calculation of depth of field: $B'$ is the linear dimension of the blur for the system (either the
diameter of the diffraction spot in a diffraction-limited system or the dimension of the sensor
element in a detector-limited system). The locations $z' \pm \delta'$ specify locations in image space where
the geometrical blur has the same linear size. The corresponding locations in object space are the
limits of the depth of field.
As shown in the figure for a given $B'$, the blur spots are located at two positions equidistant
from the in-focus image. We assign the name $\delta'$ to the distance between the in-focus image and
the geometrically blurred images, so these two planes are located at $z' \pm \delta'$. The depth of focus in
this model is twice $\delta'$:
\[
\Delta z' = 2\delta'
\]
In the ray model, the drawing shows that:
\[
\frac{D}{z'} = \frac{B'}{\delta'} \implies \delta' = B' \cdot \frac{z'}{D} \cong B' \cdot \frac{f}{D} = B' \cdot f/\#
\]
(in the case where the object distance is many focal lengths so that the image distance is only
slightly longer than a focal length). If $B'$ is small, so must be $\delta'$; if the f/# is large, so must be $\delta'$.
The object distances $z_1$ and $z_2$ corresponding to these image locations may be evaluated from
the imaging equation for the corresponding image distances $z_1' = z' - \delta'$ and $z_2' = z' + \delta'$. It is easy
to see that the absolute magnification $|M_T|$ is smaller for the smaller image distance, i.e., $|M_T|$ for
$z_1' = z' - \delta'$ is smaller than $|M_T|$ for the larger image distance $z_2' = z' + \delta'$. The nonlinearity of the
imaging equation ensures that the distances between the in-focus object distance $z$ and the extrema
are not equal, i.e., $z_1 - z \neq z - z_2$, thus requiring labels for both: $z_1 = z + \delta_1$ and $z_2 = z - \delta_2$.
However, if $\delta'$ is small, then the concept of longitudinal magnification $M_L$ allows simple approximate
expressions for the object distances. We already derived a simple expression for $M_L$ in terms of the
transverse magnification $M_T$:
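Differentiate both sides of the imaging equation:
\[
d\left(\frac{1}{z_1} + \frac{1}{z_2}\right) = d\left(\frac{1}{f}\right) = 0
\]
\[
d\left(\frac{1}{z_1} + \frac{1}{z_2}\right) = -\frac{1}{z_1^2}\,dz_1 - \frac{1}{z_2^2}\,dz_2 = 0
\]
\[
\implies \frac{dz_2}{dz_1} = -\frac{z_2^2}{z_1^2} = -\left(\frac{z_2}{z_1}\right)^2 = -(M_T)^2 < 0
\]
\[
M_L = \frac{(\Delta z)'}{\Delta z} = -\left(\frac{z_2}{z_1}\right)^2 = -(M_T)^2 < 0
\]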
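The relation between the longitudinal and transverse magnifications can be checked numerically by differencing the imaging equation. The following is a minimal sketch; the focal length, object distance, and step size are hypothetical values chosen only for illustration.

```python
def image_distance(z1, f):
    """Thin-lens imaging equation 1/z1 + 1/z2 = 1/f solved for z2."""
    return 1.0 / (1.0 / f - 1.0 / z1)


f = 50.0      # mm, hypothetical focal length
z1 = 1000.0   # mm, hypothetical object distance
dz1 = 1e-3    # mm, small step for the central difference

z2 = image_distance(z1, f)
m_t = -z2 / z1

# Central-difference estimate of dz2/dz1 (the longitudinal magnification)
m_l_numeric = (image_distance(z1 + dz1, f) - image_distance(z1 - dz1, f)) / (2.0 * dz1)
# Closed-form result: minus the square of the transverse magnification
m_l_formula = -m_t ** 2

print(m_l_numeric, m_l_formula)
```

Both values are negative: moving the object away from the lens moves the image towards it.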
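The increments in object distance are related to the increments in image distance via the longitudinal
magnification:
\[
\delta' \cong |M_L| \cdot \delta_1 \cong |M_L| \cdot \delta_2 \implies \delta_1 \cong \delta_2 \cong \frac{\delta'}{|M_L|}
\]
\[
z_1 = z + \delta_1 \cong z + \frac{\delta'}{|M_L|} = z + \frac{\delta'}{M_T^2}
\]
\[
z_2 = z - \delta_2 \cong z - \frac{\delta'}{|M_L|} = z - \frac{\delta'}{M_T^2}
\]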
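So the depth of field is proportional to the f/# and to the linear dimension of the acceptable blur:
\[
\Delta z = z_1 - z_2 = \delta_1 + \delta_2 \cong 2\,\frac{\delta'}{|M_L|} = 2\,\frac{\delta'}{M_T^2} = 2\,\frac{B' \cdot f/\#}{M_T^2}
\]
\[
\Delta z \cong 2\,\frac{B'}{M_T^2} \cdot f/\# \propto f/\#
\]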
In the detector-limited case where the blur dimension is determined by the pixel dimension $b'$,
the depth of field is proportional to the f/#:
\[
\Delta z \cong 2\,\frac{b'}{M_T^2} \cdot f/\# \propto f/\# \quad \text{(in ray model)}
\]
Note that the depth of field is larger in "slower" systems (with large f-numbers and small cone
angles).
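The linear scaling with focal ratio in the detector-limited ray model can be tabulated directly. This is a sketch under stated assumptions: the 5 µm pixel and the transverse magnification of -0.05 are hypothetical values, and the function names are illustrative.

```python
def depth_of_focus_ray(blur, f_number):
    """Total depth of focus in the ray model: 2 * blur * f/# (distant object)."""
    return 2.0 * blur * f_number


def depth_of_field_ray(blur, f_number, m_t):
    """Depth of field = depth of focus / M_T**2 (small-defocus approximation)."""
    return depth_of_focus_ray(blur, f_number) / m_t ** 2


pixel = 0.005   # mm, hypothetical 5 micrometer sensor element
m_t = -0.05     # hypothetical transverse magnification

for f_number in (2, 4, 8, 16):
    focus = depth_of_focus_ray(pixel, f_number)
    field = depth_of_field_ray(pixel, f_number, m_t)
    print(f"f/{f_number}: depth of focus = {focus:.3f} mm, depth of field = {field:.1f} mm")
```

Each doubling of the f/# doubles both depths in this model, in contrast to the quadratic dependence that appears once diffraction sets the blur size.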
If we add the wave concept of diffraction, the linear dimension $B'$ is determined by the diffraction
pattern, which may be written in terms of the wavelength $\lambda_0$ and the focal ratio. Assume that
the linear dimension of image blur has been measured for a particular imaging system at the specific
pair of object and image distances ($z$ and $z'$ respectively) of interest:
Blur in a diffraction-limited system with aperture diameter $D$. The image of the point source is a
diffraction pattern at the image plane whose linear dimension (using some criterion) is $B'$.
For example, the image of a point source located a distance $z$ from the system could be measured
to find this limiting blur diameter $B'$, where the prime indicates that the measurement is made
in image space. In a diffraction-limited system, the discussion of Fraunhofer diffraction in imaging
shows that one possible measure for $B'$ is the diameter of the central lobe of the diffraction spot:
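\[
B' = 2.44\,\lambda_0\,\frac{z'}{D} \cong 2.44\,\lambda_0\,\frac{f}{D} = 2.44\,\lambda_0 \cdot f/\#
\]
\[
\implies B' \cong 2.44\,\lambda_0 \cdot f/\#
\]
\[
\Delta z \cong 2\,(2.44\,\lambda_0 \cdot f/\#)\,\frac{f/\#}{M_T^2} = 4.88\,\frac{\lambda_0\,(f/\#)^2}{M_T^2}
\]
\[
\Delta z \cong 4.88\,\frac{\lambda_0\,(f/\#)^2}{M_T^2} \quad \text{(if accounting for diffraction)}
\]
So the depths of field and of focus are proportional to the square of the f/# in the diffraction-limited
case.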
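The quadratic dependence of the hybrid metric on focal ratio can be sketched as follows. The wavelength of 500 nm and the magnification of -0.05 are hypothetical inputs; the factor 2.44 is the central-lobe diameter criterion quoted above.

```python
def diffraction_blur(wavelength, f_number):
    """Diameter of the central diffraction lobe: 2.44 * wavelength * f/#."""
    return 2.44 * wavelength * f_number


def depth_of_field_hybrid(wavelength, f_number, m_t):
    """Ray-model depth of field with the diffraction spot as the acceptable blur."""
    blur = diffraction_blur(wavelength, f_number)
    return 2.0 * blur * f_number / m_t ** 2


wavelength = 0.5e-3   # mm (500 nm), hypothetical
m_t = -0.05           # hypothetical transverse magnification

for f_number in (2, 4, 8, 16):
    dof = depth_of_field_hybrid(wavelength, f_number, m_t)
    print(f"f/{f_number}: blur = {diffraction_blur(wavelength, f_number) * 1e3:.2f} um, "
          f"depth of field = {dof:.2f} mm")
```

Doubling the f/# quadruples the depth of field here, because the blur spot and the cone geometry each contribute one factor of the focal ratio.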
4.2 Depth of Field via Rayleigh's Quarter-Wave Rule

We can also derive the depth of focus by finding the range of image locations that satisfy Rayleigh's
rule applied to defocus, and then transform those image distances back into object space via the
imaging relation to find the depth of field.
The necessary task is to find the change in the image location for a given change in the wavefront error
at the edge of the pupil. In the figure, the ideal reference wavefront has radius $R_1$ ($R_1 = f$ if the
object is a large distance away) and the wavefront with defocus has radius $R_2 = R_1 + \delta' = f + \delta'$,
where $\delta'$ is the change in location of the focal plane with an added quadratic phase of $W_{020} = \frac{\lambda_0}{4}$.
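The quadratic-phase approximation to the new wavefront is:
\[
\begin{aligned}
W[x, y] &= \frac{x^2 + y^2}{2R_2} = \frac{x^2 + y^2}{2\,(R_1 + \delta')} = \frac{x^2 + y^2}{2R_1\left(1 + \frac{\delta'}{R_1}\right)} \\
&= \frac{x^2 + y^2}{2R_1}\left(1 + \frac{\delta'}{R_1}\right)^{-1} = \frac{x^2 + y^2}{2R_1}\sum_{n=0}^{+\infty}\left(-\frac{\delta'}{R_1}\right)^n \\
&= \frac{x^2 + y^2}{2R_1}\left(1 + (-1)\frac{\delta'}{R_1} + \frac{(-1)(-2)}{2!}\left(\frac{\delta'}{R_1}\right)^2 + \frac{(-1)(-2)(-3)}{3!}\left(\frac{\delta'}{R_1}\right)^3 + \cdots\right) \\
&\cong \frac{x^2 + y^2}{2R_1}\left(1 - \frac{\delta'}{R_1}\right) \quad \left(\text{if } \frac{\delta'}{R_1} \cong \frac{\delta'}{f} \ll 1\right) \\
&= \frac{x^2 + y^2}{2R_1} - \delta'\,\frac{x^2 + y^2}{2R_1^2}
\end{aligned}
\]
where the first term is the quadratic-phase approximation to the ideal wavefront and the second
term is the additional effect of the defocus.
Change in image position $\delta'$ as a function of the wavefront error $\Delta W = W_{020}$ for defocus.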
In the limit where the object distance is large, the image distance $R_1$ is approximately equal to the
focal length $f$, so this expression simplifies to:
\[
W[x, y] \cong \frac{x^2 + y^2}{2f} - \delta'\,\frac{x^2 + y^2}{2f^2}
\]
\[
\Delta W[x, y] = W[x, y] - \frac{x^2 + y^2}{2f} \implies \Delta W[x, y] = -\delta'\,\frac{x^2 + y^2}{2f^2}
\]
If the wavefront error is positive, $\Delta W > 0 \implies \delta' < 0$, which means that the image moves towards
the lens as shown in the figure.
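The magnitude of the wavefront error at the edge of the pupil (where, say, $x = \frac{d_0}{2}$ and $y = 0$) is:
\[
|\Delta W| = \left|\Delta W\left[x = \frac{d_0}{2},\, y = 0\right]\right| = |\delta'|\,\frac{\left(\frac{d_0}{2}\right)^2 + 0^2}{2f^2} = |\delta'|\,\frac{d_0^2}{8f^2}
\]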
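We can now apply Rayleigh's rule that the image is effectively ideal if the maximum wavefront error
is less than a quarter wave, so that the single-sided depth of focus is easy to evaluate:
\[
\frac{\lambda_0}{4} > |\Delta W| = |\delta'|\,\frac{d_0^2}{8f^2} \implies |\delta'| < \frac{\lambda_0}{4} \cdot \frac{8f^2}{d_0^2} = 2\lambda_0\,\frac{f^2}{d_0^2} = 2\lambda_0\left(\frac{f}{d_0}\right)^2
\]
\[
\implies |\delta'| \cong 2\lambda_0\,(f/\#)^2 \quad \text{using Rayleigh's rule for ideal imaging}
\]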
In visible light with $\lambda_0 \cong 0.5\ \mu$m, the change in image position under the Rayleigh criterion is
\[
|\delta'|\,[\lambda_0 = 0.5\ \mu\mathrm{m}] \cong (f/\#)^2\ [\mu\mathrm{m}]
\]
In words, an image in visible light appears to be in focus if the distance of the actual image plane
from the ideal image plane in micrometers is no larger than the square of the f/#. For example, if
the lens is used at f/4, the actual image plane must be within 16 µm of the ideal location; if at f/16,
the actual image plane must be within 256 µm ≅ 0.25 mm of the ideal location. Note the similarities
and the differences with the rule of thumb that the size of the diffraction spot in micrometers is
equal to the f/#.
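The rule of thumb is easy to tabulate. A minimal sketch, assuming a default wavelength of 0.5 µm as in the text; the function name is illustrative.

```python
def rayleigh_defocus_um(f_number, wavelength_um=0.5):
    """Single-sided focus tolerance 2 * wavelength * (f/#)**2, in micrometers."""
    return 2.0 * wavelength_um * f_number ** 2


for f_number in (2, 4, 8, 16):
    print(f"f/{f_number}: image plane within +/- {rayleigh_defocus_um(f_number):.0f} um")
```

At other wavelengths the tolerance scales linearly, so passing `wavelength_um=1.0` doubles every entry.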
The depth of focus is twice this value because we can defocus on either side of the ideal image
plane:
\[
\text{Depth of focus: } (\Delta z)' = 2\,|\delta'| \cong 4\lambda_0\,(f/\#)^2 \cong 2\,(f/\#)^2\ [\mu\mathrm{m}]
\]
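Now convert this to the object space via the longitudinal magnification to find the depth of field:
\[
M_L = \frac{(\Delta z)'}{\Delta z} = -(M_T)^2
\]
\[
\Delta z \cong 2\,\frac{|\delta'|}{|M_L|} = \frac{(\Delta z)'}{|M_L|} = \frac{(\Delta z)'}{(M_T)^2}
\]
\[
\Delta z \cong \frac{4\lambda_0\,(f/\#)^2}{(M_T)^2}
\]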
which again is proportional to the square of the f-ratio and is quite similar to the "hybrid" metric
for depth of field in the diffraction-limited case from the last section:
\[
\text{Depth of field: } \Delta z_{\text{Hybrid}} \cong 4.88\,\frac{\lambda_0\,(f/\#)^2}{(M_T)^2} \simeq \Delta z_{\text{Rayleigh}} \cong 4\,\frac{\lambda_0\,(f/\#)^2}{(M_T)^2}
\]
The fact that these two expressions are not identical should be no surprise
since they were derived using different assumptions.
Note that the depth of field increases as the square of the f/#, so stopping down the lens by
a factor of 2 has a big impact: it increases the depth of field by about a factor of 4. Since the
magnitude of the transverse magnification is less than unity for most real imaging setups (and a lot less for distant
objects), the depth of field increases rapidly as the object distance increases.
It might be useful to do an example. Consider a "normal" lens with $f = 50$ mm acting in visible
light ($\lambda_0 = 500$ nm $= 0.5\ \mu$m) with the aperture wide open (say, f/2 so that the diameter of the
entrance pupil is $d_0 = 25$ mm) imaging a nearby object with $z_1 = 1$ m:
\[
z_2 = \left(\frac{1}{50\ \mathrm{mm}} - \frac{1}{1000\ \mathrm{mm}}\right)^{-1} \cong 52.63\ \mathrm{mm}
\]
\[
M_T = -\frac{z_2}{z_1} = -\frac{52.63\ \mathrm{mm}}{1000\ \mathrm{mm}} \cong -0.05263
\]
where (again) the negative sign on the transverse magnification means that the image is upside
down compared to the object. The depth of focus is:
\[
\text{depth of focus at f/2: } (\Delta z)' = 2\,|\delta'| \cong 4 \cdot 0.5\ \mu\mathrm{m} \cdot 2^2 = 8\ \mu\mathrm{m}
\]
And the depth of field is obtained by dividing by the square of the transverse magnification:
\[
\text{depth of field at f/2: } \Delta z \cong \frac{(\Delta z)'}{M_T^2} = \frac{8\ \mu\mathrm{m}}{(0.05263)^2} \cong 2.89\ \mathrm{mm}
\]
If we stop the lens down to, say, f/16 (a factor of 8), the depths of focus and field are much
larger:
\[
\text{depth of focus at f/16: } (\Delta z)' = 2\,|\delta'| \cong 4 \cdot 0.5\ \mu\mathrm{m} \cdot 16^2 = 512\ \mu\mathrm{m} \cong 0.5\ \mathrm{mm}
\]
\[
\text{depth of field at f/16: } \Delta z \cong \frac{512\ \mu\mathrm{m}}{(0.05263)^2} \cong 185\ \mathrm{mm}
\]
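If the object is a large distance away, say $z_1 = 100$ m with the lens wide open at f/2, the transverse
magnification is much smaller:
\[
z_2 = \left(\frac{1}{50\ \mathrm{mm}} - \frac{1}{100\ \mathrm{m}}\right)^{-1} \cong 50.025\ \mathrm{mm}
\]
\[
M_T = -\frac{z_2}{z_1} = -\frac{50.025\ \mathrm{mm}}{100\ \mathrm{m}} \cong -5.0025 \times 10^{-4}
\]
The depth of focus is the same as it was for the close-up image at f/2:
\[
(\Delta z)' = 4 \cdot 0.5\ \mu\mathrm{m} \cdot 2^2 = 8\ \mu\mathrm{m}
\]
but the much smaller value for the transverse magnification means that the depth of field
is much larger, both at f/2 and at f/16:
\[
\Delta z \cong \frac{8\ \mu\mathrm{m}}{(5.0025 \times 10^{-4})^2} \cong 32\ \mathrm{m} \quad \text{(at f/2)}
\]
\[
\Delta z \cong \frac{512\ \mu\mathrm{m}}{(5.0025 \times 10^{-4})^2} \cong 2\ \mathrm{km} \quad \text{(at f/16)}
\]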
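The arithmetic of this example can be reproduced with a short script. It is a sketch of the Rayleigh-based estimate only: all lengths are converted to millimeters, the function names are illustrative, and the small-defocus approximation is used even where the resulting depth of field is comparable to the object distance.

```python
def image_distance(z1, f):
    """Thin-lens imaging equation 1/z1 + 1/z2 = 1/f solved for z2."""
    return 1.0 / (1.0 / f - 1.0 / z1)


def rayleigh_depths(f, f_number, z1, wavelength):
    """Return (z2, M_T, depth of focus, depth of field); all lengths in mm."""
    z2 = image_distance(z1, f)
    m_t = -z2 / z1
    depth_of_focus = 4.0 * wavelength * f_number ** 2   # quarter-wave rule, both sides
    depth_of_field = depth_of_focus / m_t ** 2          # divide by |M_L| = M_T**2
    return z2, m_t, depth_of_focus, depth_of_field


WAVELENGTH_MM = 0.5e-3    # 500 nm expressed in mm
FOCAL_LENGTH_MM = 50.0

for z1 in (1.0e3, 1.0e5):             # object at 1 m and at 100 m
    for f_number in (2, 16):
        z2, m_t, focus, field = rayleigh_depths(FOCAL_LENGTH_MM, f_number, z1, WAVELENGTH_MM)
        print(f"z1 = {z1:.0f} mm, f/{f_number}: z2 = {z2:.3f} mm, M_T = {m_t:.4e}, "
              f"depth of focus = {focus:.3f} mm, depth of field = {field:.3f} mm")
```

Printing the magnification alongside the depths makes it easy to see that the depth of focus depends only on the f/#, while the depth of field also depends strongly on the object distance.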
Depth of field of lens focused at $z_1 = 20$ ft $\cong 6$ m for three focal ratios: f/1.8, f/5.6, and f/16,
showing increase in depth of field with increasing focal ratio (from http://www.engadget.com).
4.3 Hyperfocal Distance

The last example just presented, where the object distance $z_1 = 100$ m and the depth of field $\Delta z \cong 2$ km,
suggests another useful imaging metric: the shortest object distance for which the depth of
field extends to infinity, which is called the hyperfocal distance $(z_1)_{\text{hyperfocal}}$. The corresponding
image distance $(z_2)_{\text{hyperfocal}}$ is the sum of the focal length and the defocus distance $\delta'$:
\[
(z_1)_{\text{hyperfocal}} + \delta_1 = \infty \implies (z_2)_{\text{hyperfocal}} - \delta' = f \implies (z_2)_{\text{hyperfocal}} = f + \delta'
\]
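The hyperfocal object distance $(z_1)_{\text{hyperfocal}}$ satisfies the imaging equation for this image distance:
\[
\frac{1}{(z_1)_{\text{hyperfocal}}} + \frac{1}{(z_2)_{\text{hyperfocal}}} = \frac{1}{f}
\]
\[
\begin{aligned}
\text{Hyperfocal Distance } (z_1)_{\text{hyperfocal}} &= \left(\frac{1}{f} - \frac{1}{f + \delta'}\right)^{-1} = \frac{f^2 + \delta' f}{\delta'} = f + \frac{f^2}{\delta'} \\
&\cong f + \frac{f^2}{2\lambda_0\,(f/\#)^2} \\
&\cong \frac{f^2}{2\lambda_0\,(f/\#)^2}
\end{aligned}
\]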
where we can also interpret this in terms of the diameter of the diffraction spot:
\[
(z_1)_{\text{hyperfocal}} \cong \frac{f^2}{(2\lambda_0 \cdot f/\#)\,(f/\#)} = \frac{f^2}{(f/\#) \cdot d_{\text{diffraction spot}}}
\]
where $d_{\text{diffraction spot}} \cong 2\lambda_0 \cdot f/\#$. So if we have a so-called "normal" lens with $f = 50$ mm acting
at f/2 (close to wide open) and in light with $\lambda_0 = 500$ nm, the hyperfocal distance is:
\[
(z_1)_{\text{hyperfocal}} \cong \frac{(50\ \mathrm{mm})^2}{2 \cdot 500\ \mathrm{nm} \cdot 2^2} = 625\ \mathrm{m}
\]
which is quite distant. If we stop the lens down to f/16, we get:
\[
(z_1)_{\text{hyperfocal}} \cong \frac{(50\ \mathrm{mm})^2}{2 \cdot 500\ \mathrm{nm} \cdot 16^2} \cong 9.8\ \mathrm{m}
\]
which is quite a lot closer to the lens. This means that objects at all distances in the interval
$10\ \mathrm{m} \lesssim z_1 < \infty$ should appear to be in focus if the lens is used at f/16.
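The hyperfocal distance under the quarter-wave criterion can be evaluated for any lens with a few lines. This sketch keeps the leading term f that the final approximation drops, so the results differ from the rounded values in the text by one focal length; the function name is illustrative.

```python
def hyperfocal_distance(f, f_number, wavelength):
    """Hyperfocal object distance f + f**2 / defocus, with the single-sided
    Rayleigh focus tolerance defocus = 2 * wavelength * (f/#)**2.
    All lengths must share the same unit."""
    defocus = 2.0 * wavelength * f_number ** 2
    return f + f ** 2 / defocus


FOCAL_LENGTH_MM = 50.0
WAVELENGTH_MM = 0.5e-3   # 500 nm expressed in mm

for f_number in (2, 4, 8, 16):
    z_hyp_m = hyperfocal_distance(FOCAL_LENGTH_MM, f_number, WAVELENGTH_MM) / 1000.0
    print(f"f/{f_number}: hyperfocal distance = {z_hyp_m:.2f} m")
```

Because the distance scales inversely with the square of the f/#, each additional two stops (a factor of 2 in f/#) brings the hyperfocal distance in by about a factor of 4.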
4.4 Methods for Increasing Depth of Field

1. Google Lens: http://www.google.com/patents/US6320979
2. Focus stacking: digital combinations of images collected at different focus settings. Different
images are combined based on local sharpness to produce an image with extended depth of
field.
3. Light-field camera = plenoptic camera that captures the four-dimensional field [x, y, z, t]. An
example of such a camera is the Lytro, which uses a matrix of microlenses to collect ray
direction information in addition to color and lightness. This stored information allows recovery
of focused information at different depths.
4. Cameras with different focal settings for different colors of light. The information is combined
digitally to merge the sharp edge data from the color with the large f/# with the blurrier
structure in the other colors.
4.5 Sidebar: Transverse Magnification vs. Focal Length

It may be useful to derive the relationship between transverse magnification and focal length for a
given object distance. We know the imaging equation for object distance $z_1$, image distance $z_2$, and
focal length $f$:
\[
\frac{1}{f} = \frac{1}{z_1} + \frac{1}{z_2}
\]
We already know that for an imaging system consisting of two or more lenses, the object distance is
measured to the object-space principal point, the image distance is measured from the image-space
principal point, and the focal length is replaced by the effective focal length. For a specific object
distance $z_1$ and a fixed focal length $f$, the equation may be rearranged to determine the image
distance:
\[
z_2 = \frac{z_1 f}{z_1 - f}
\]
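We can substitute into the expression for the transverse magnification:
\[
M_T = -\frac{z_2}{z_1} = -\frac{1}{z_1} \cdot \frac{z_1 f}{z_1 - f} = -\frac{f}{z_1 - f} = -\frac{f}{z_1}\left(\frac{1}{1 - \frac{f}{z_1}}\right)
\]
If the focal length is shorter than the object distance, then the term $\frac{f}{z_1} < 1$:
\[
\begin{aligned}
M_T &= -\frac{f}{z_1}\left(\frac{1}{1 - \frac{f}{z_1}}\right) \\
&= -\frac{f}{z_1}\sum_{n=0}^{\infty}\left(\frac{f}{z_1}\right)^n \\
&= -\frac{f}{z_1}\left(1 + \frac{f}{z_1} + \left(\frac{f}{z_1}\right)^2 + \cdots\right) \\
&= -\frac{f}{z_1} - \left(\frac{f}{z_1}\right)^2 - \left(\frac{f}{z_1}\right)^3 - \cdots
\end{aligned}
\]
\[
M_T \cong -\frac{f}{z_1} \quad \text{if } f \ll z_1
\]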
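where the series for $(1 - t)^{-1}$ has been used. For a lens with a fixed focal length $f$ but two object
distances $(z_1)_a$ and $(z_1)_b$ the transverse magnifications are:
\[
(M_T)_a \cong -\frac{f}{(z_1)_a}
\]
\[
(M_T)_b \cong -\frac{f}{(z_1)_b}
\]
so the difference in transverse magnifications is:
\[
(M_T)_a - (M_T)_b = \Delta M_T \cong -\frac{f}{(z_1)_a} + \frac{f}{(z_1)_b}
\]
\[
\Delta M_T = (-f)\left(\frac{1}{(z_1)_a} - \frac{1}{(z_1)_b}\right) = (-f)\,\frac{(z_1)_b - (z_1)_a}{(z_1)_a\,(z_1)_b}
\]
\[
\Delta M_T = f\,\frac{(z_1)_a - (z_1)_b}{(z_1)_a\,(z_1)_b} = f\,\frac{\Delta z_1}{(z_1)_a\,(z_1)_b}
\]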
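The quality of the distant-object approximation for the transverse magnification can be examined by comparing the exact thin-lens value with truncated geometric-series sums. This is a minimal sketch; the focal length and object distances are hypothetical.

```python
def m_t_exact(z1, f):
    """Exact thin-lens transverse magnification, M_T = -f / (z1 - f)."""
    return -f / (z1 - f)


def m_t_series(z1, f, terms):
    """Partial sum of the geometric series for M_T: -(f/z1) * sum over n of (f/z1)**n."""
    t = f / z1
    return -t * sum(t ** n for n in range(terms))


f = 50.0   # mm, hypothetical focal length
for z1 in (200.0, 1000.0, 100000.0):   # mm, hypothetical object distances
    exact = m_t_exact(z1, f)
    first_order = m_t_series(z1, f, 1)   # the approximation M_T ~ -f/z1
    print(f"z1 = {z1:.0f} mm: exact = {exact:.6f}, first order = {first_order:.6f}, "
          f"relative error = {abs(first_order - exact) / abs(exact):.2%}")
```

The relative error of the first-order term equals f/z1, so the approximation is good to about 5% at twenty focal lengths and improves steadily with distance.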
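We have already seen that the transverse magnification varies with the focal length of the lens:
\[
\frac{1}{f} = \frac{1}{z_1} + \frac{1}{z_2} = \frac{1}{z_2}\left(\frac{z_2}{z_1} + 1\right) = \frac{1}{z_2}\left(1 - \left(-\frac{z_2}{z_1}\right)\right) = \frac{1}{z_2}\,(1 - M_T)
\]
\[
\implies \frac{z_2}{f} = (1 - M_T)
\]
\[
\implies \frac{f}{z_2} = \frac{1}{1 - M_T}
\]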
If the object distance $z_1$ is large, then $|M_T| \cong 0$, which means that we can substitute the geometric
series:
\[
\frac{1}{1 - t} = \sum_{\ell=0}^{+\infty} t^{\ell} \quad \text{if } |t| < 1
\]
\[
\frac{f}{z_2} = \frac{1}{1 - M_T} = \sum_{\ell=0}^{+\infty} (M_T)^{\ell} = 1 + M_T + (M_T)^2 + \cdots
\]
\[
\frac{f}{z_2} \cong 1 + M_T \quad \text{if } |M_T| \ll 1 \implies z_2 \simeq f
\]
which implies that the magnification increases with the focal length.
We should check this for some known cases: if the object distance $z_1 = +\infty$, then $z_2 = f$ and:
\[
z_1 = \infty \implies \frac{f}{z_2} = 1 \cong 1 + M_T \implies |M_T| \cong 0, \text{ correct answer}
\]
If the object distance is $z_1 = 100 \cdot f$, then the image distance and approximate transverse
magnification are:
\[
z_2 = \frac{100}{99}\,f \implies \frac{f}{z_2} = \frac{99}{100} \cong 1 + M_T \implies M_T \cong -\frac{1}{100}
\]
The actual transverse magnification is:
\[
M_T = -\frac{z_2}{z_1} = -\frac{\frac{100}{99}}{100} = -\frac{1}{99} \cong -\frac{1}{100}
\]
so the approximation is still quite good.
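The same check can be repeated for other object distances with exact rational arithmetic. A minimal sketch using Python's standard `fractions` module, with distances expressed in units of the focal length; the function names are illustrative.

```python
from fractions import Fraction


def exact_m_t(ratio):
    """Exact M_T = -z2/z1 for an object at z1 = ratio * f (thin lens)."""
    z2_over_f = ratio / (ratio - 1)      # from 1/f = 1/z1 + 1/z2
    return -z2_over_f / ratio


def approx_m_t(ratio):
    """First-order estimate from f/z2 ~ 1 + M_T."""
    z2_over_f = ratio / (ratio - 1)
    return 1 / z2_over_f - 1


for ratio in (Fraction(10), Fraction(100), Fraction(1000)):
    print(f"z1 = {ratio} f: exact M_T = {exact_m_t(ratio)}, approximate M_T = {approx_m_t(ratio)}")
```

Working in units of f removes the focal length from the comparison, which shows that the accuracy of the approximation depends only on the ratio of object distance to focal length.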
Now consider two distant objects a and b at object distances $(z_1)_a > (z_1)_b \gg f$; we have:
\[
\frac{(z_1)_a}{f} - \frac{(z_1)_b}{f} = \frac{\Delta z_1}{f}
\]
\[
\frac{(z_1)_a}{f} - \frac{(z_1)_b}{f} \cong (1 + M_T)_a - (1 + M_T)_b = (M_T)_a - (M_T)_b = \Delta M_T
\]
\[
\implies \frac{\Delta z_1}{f} \cong \Delta M_T
\]
which shows that the difference in transverse magnifications decreases as the focal length $f$ increases
for fixed $\Delta z_1$. In words, if two distant objects are separated along the optical axis by the distance
$\Delta z_1$, the transverse magnifications for the two objects are more similar if the focal length $f$ is large,
which gives the impression to the viewer that the objects are close together.
Consider the example shown below; the subjects are a pair of 15-inch-diameter Rodman smoothbore
cannon dating from 1864 that are preserved on restored carriages at Fort Foote, Maryland, near my
childhood home (when I was growing up, the two barrels had not been mounted, but were lying on
the ground). The near and distant cannons are separated by the fixed distance $\Delta z_1$. The images were
taken with a zoom lens: the first used a telephoto setting with equivalent focal length $f_1 = 140$ mm
for the 35 mm film format (the actual focal length was $f_1 = 22.2$ mm). The second image was taken
with equivalent focal length $f_2 = 32$ mm for the 35 mm format (a wide-angle lens; the actual focal
length $f_2 = 6.6$ mm). The difference in transverse magnifications clearly is smaller with the long focal
length (first image) as the distant cannon is readily visible; the tiny distant cannon is barely visible
in the second image. The transverse magnifications for the background cannon differ by nearly a
factor of 2.5 for the two images. This effect leads to the statement that "telephoto lenses compress
the depth of field" (though some vigorously dispute this statement for psychological reasons!).
Illustration of the variation in transverse magnification with focal length of the lens. The equivalent
focal length of the lens used to make the top image is $f \cong 140$ mm (telephoto) and that for the
bottom is $f = 32$ mm (wide angle). The background cannon is MUCH smaller in the second image.
Chapter 5
Aberrations
Aberrations may be loosely defined as deviations from predicted behavior of an optical system.
Chromatic aberrations describe deviations from predicted behavior due to variations in the refractive
index for different wavelengths of light. Monochromatic aberrations are variations from calculated
behavior due to the approximations used. For example, if we use just the first-order approximation
\[
\sin[\theta] \cong \tan[\theta] \cong \theta
\]
we can describe the deviations from predicted first-order behavior as the third-order aberrations.
The aberrations may be described in terms of waves or of rays. The wave aberration is the
departure of the wavefront from the ideal spherical wave that should emerge from the exit pupil
of the system to the image:
\[
p[x, y]\,\exp\left[+i\,\Phi[x, y]\right] = p[x, y]\,\exp\left[+i\,W[x, y]\right]
\]
where $W[x, y]$ is the scalar wave aberration function measured in units of radians at each point
in the exit pupil. Note that the spherical wave converges to a real image or diverges from a
virtual image.
The wave aberration function is the difference of the actual emerging wave from the ideal sphere,
which has the form:
\[
x^2 + y^2 + z^2 = R^2 \implies z = R\,\sqrt{1 - \frac{x^2 + y^2}{R^2}}
\]
5.1 Chromatic Aberration
In the earliest days of optics, all optical systems were constructed from single lenses (singlets) and
therefore suffered from chromatic aberrations due to the physical mechanism of dispersion. We saw
that the index of refraction of optical materials decreases with increasing wavelength in regions of
normal dispersion. At longer wavelengths in a regime with normal dispersion, a lens with positive
power will have less refractive power (longer focal length $f$). Conversely, a lens with negative
power will have a longer negative focal length at longer wavelengths.
The impact of chromatic aberration on the image is minimized if the focal length is long and the focal
ratio is large. For this reason, early telescopes for astronomical viewing were made very long, in part
for magnification and in part to reduce the visibility of chromatic aberrations.
The aerial telescope of Johannes Hevelius with a focal length of $f = 45$ m $\cong 148$ ft and an aperture
diameter of $d \cong 220$ mm $\cong 8.5$ in.
The observation that different glasses have different dispersions is the basis for the principle of
achromatization (from the Greek words for "without color"), where two optical elements made from
glasses with different dispersion characteristics are combined to match the focal lengths at two
different wavelengths (typically red and blue). An achromatic doublet is fabricated from a positive
element made from crown glass with a lower refractive index and lower dispersion, and a negative
element made of flint glass with a larger refractive index and a larger dispersion. For an achromat
with a positive focal length (converging lens), the lens is made of a positive lens from crown glass
and a negative lens from flint glass so that the chromatic aberrations act in opposition to match at
the two wavelengths. If the component lenses are in contact (and often the curvatures are designed
to match so that they may be cemented together), then the positive power must be larger (focal
length must be shorter).
Lens systems may be built that correct for three or more wavelengths. It may be obvious that the
number of elements must match or exceed the number of corrected wavelengths. Apochromats have
at least three elements to correct the focal length at three different wavelengths (typically red, green,
and blue) and are fabricated from three glass elements with different dispersion characteristics. Of
course, the need for the additional element(s) means that apochromats tend to be more expensive
than achromats.
Principle of the achromat: the first singlet lens exhibits chromatic aberration because of the
dispersion of the glass ($n_{\text{red}} < n_{\text{green}} < n_{\text{blue}}$), which means that red light focuses farther away.
Add a second element of flint glass with negative power that matches the focal lengths for red and
blue light to form an achromat.
Apochromat made of three elements to correct focus at three wavelengths.
The traditional wavelengths used to design optics were specified by Fraunhofer based on absorption
lines in the solar spectrum:
Line    λ [nm]    n for Crown    n for Flint
C       656.28    1.51418        1.69427
D       589.59    1.51666        1.70100
F       486.13    1.52225        1.71748
The design of achromats is based on the dispersion of the glass, which we already specified:
Refractivity          $n_D - 1$                                  $1.5 \le n_D \le 1.75$
Mean Dispersion       $n_F - n_C > 0$                            difference between blue and red indices
Partial Dispersion    $n_D - n_C > 0$                            difference between yellow and red indices
Abbe Number           $\nu \equiv \dfrac{n_D - 1}{n_F - n_C}$    ratio of refractivity and mean dispersion, $25 \le \nu \le 65$
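For a single thin lens, the power of the system is:
$$\phi = \frac{1}{f} = (n-1)\left(\frac{1}{R_1} - \frac{1}{R_2}\right) \equiv (n-1)\,(C_1 - C_2)$$
where the curvature of each surface is
$$C \equiv \frac{1}{R}$$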
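The effect of dispersion on the power is obtained by differentiating:
$$\frac{d\phi}{dn} = (C_1 - C_2) = \frac{\phi}{n-1} \implies d\phi = \phi\,\frac{dn}{n-1} = \phi\,\frac{n_F - n_C}{n-1} = \frac{\phi}{\nu}$$
where $\nu$ is the Abbe number.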
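As a rough numerical illustration of the relation between the change in power and the Abbe number, the sketch below computes the Abbe numbers of the crown and flint glasses from the tabulated C, D, and F indices and estimates the separation of the red and blue foci of a singlet. The 100 mm focal length and the function names are assumptions chosen for illustration, and the focal shift uses the small-change approximation $|df| \cong f^2\,d\phi = f/\nu$.

```python
# Refractive indices at the Fraunhofer C, D, and F lines (values from the table in the notes).
GLASSES = {
    "crown": {"C": 1.51418, "D": 1.51666, "F": 1.52225},
    "flint": {"C": 1.69427, "D": 1.70100, "F": 1.71748},
}


def abbe_number(n):
    """Abbe number: refractivity divided by mean dispersion."""
    return (n["D"] - 1.0) / (n["F"] - n["C"])


def focal_shift(f_d, nu):
    """Approximate distance between the C (red) and F (blue) foci of a thin singlet.

    Uses d(phi) = phi / nu and |df| ~ f**2 * d(phi) = f / nu (small-change approximation).
    """
    return f_d / nu


if __name__ == "__main__":
    f_d = 100.0  # hypothetical singlet focal length at the D line, in mm
    for name, n in GLASSES.items():
        nu = abbe_number(n)
        print(f"{name}: nu = {nu:.2f}, red-blue focal shift = {focal_shift(f_d, nu):.2f} mm")
```

The flint glass has the smaller Abbe number, so a singlet of the same focal length made from flint spreads the colors over a longer axial distance than one made from crown.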
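For a two-lens system, we have already determined the formula for the power:
$$\phi_e = \phi_1 + \phi_2 - \phi_1\phi_2 t$$
$$\implies d\phi_e = d\phi_1 + d\phi_2 - \phi_2 t\,d\phi_1 - \phi_1 t\,d\phi_2 = (1 - \phi_2 t)\,d\phi_1 + (1 - \phi_1 t)\,d\phi_2$$
The power at the two wavelengths is matched so that:
$$d\phi_e = 0 = (1 - \phi_2 t)\,d\phi_1 + (1 - \phi_1 t)\,d\phi_2 = (1 - \phi_2 t)\,\frac{\phi_1}{\nu_1} + (1 - \phi_1 t)\,\frac{\phi_2}{\nu_2}$$
$$\implies (1 - \phi_2 t)\,\frac{\phi_1}{\nu_1} = -(1 - \phi_1 t)\,\frac{\phi_2}{\nu_2}$$
$$\implies t = \frac{\dfrac{\nu_1}{\phi_1} + \dfrac{\nu_2}{\phi_2}}{\nu_1 + \nu_2} = \frac{\nu_1 f_1 + \nu_2 f_2}{\nu_1 + \nu_2}$$
$$\phi_e = \frac{\nu_1\phi_1 + \nu_2\phi_2}{\nu_1 + \nu_2}$$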
If the two lenses are in contact so that $t = 0$, then:
$$\frac{\phi_2}{\phi_1} = \frac{f_1}{f_2} = -\frac{\nu_2}{\nu_1}$$
This is the condition for an achromat that has the same focal length for red light (C line, $\lambda = 656.28$ nm)
and blue light (F line, $\lambda = 486.13$ nm).
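The contact condition can be combined with the requirement that the two powers add to the desired total power to solve for the two elements. The sketch below does this for a cemented crown-flint doublet using the tabulated indices; the 100 mm system focal length and the function name `contact_achromat` are assumptions for illustration, and the elements are treated as ideal thin lenses in contact.

```python
def contact_achromat(f_total, nu_1, nu_2):
    """Thin-lens contact doublet: solve
        phi_1 + phi_2 = 1 / f_total       (total power)
        phi_1/nu_1 + phi_2/nu_2 = 0       (equal power at the C and F lines)
    and return the two element focal lengths (f_1, f_2).
    """
    phi = 1.0 / f_total
    phi_1 = phi * nu_1 / (nu_1 - nu_2)
    phi_2 = -phi * nu_2 / (nu_1 - nu_2)
    return 1.0 / phi_1, 1.0 / phi_2


# Abbe numbers computed from the C, D, F indices tabulated in the notes.
nu_crown = (1.51666 - 1.0) / (1.52225 - 1.51418)
nu_flint = (1.70100 - 1.0) / (1.71748 - 1.69427)

if __name__ == "__main__":
    f_crown, f_flint = contact_achromat(100.0, nu_crown, nu_flint)  # hypothetical 100 mm doublet
    print(f"crown element: f = {f_crown:.1f} mm, flint element: f = {f_flint:.1f} mm")
```

The positive crown element comes out with the shorter focal length, in line with the statement that the positive power must be the larger of the two.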
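Note that it is possible to use the same glass and adjust the focal lengths and distance to
achromatize. If $\nu_1 = \nu_2 \equiv \nu$, then
$$t = \frac{f_1\nu + f_2\nu}{\nu + \nu} \implies t = \frac{(f_1 + f_2)\,\nu}{2\nu} = \frac{f_1 + f_2}{2}$$
$$\phi_e = \frac{\nu\phi_1 + \nu\phi_2}{2\nu} = \frac{\phi_1 + \phi_2}{2} = \frac{1}{f_e} \implies f_e = \frac{2}{\dfrac{1}{f_1} + \dfrac{1}{f_2}} = \frac{2 f_1 f_2}{f_1 + f_2}$$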
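The same-glass result can be checked numerically. The sketch below assumes two thin lenses whose powers both scale as $(n - 1)$, separates them by $t = (f_1 + f_2)/2$, and evaluates how the system power changes when the common index changes slightly. The focal lengths (100 mm and 50 mm), the design index, and the function names are hypothetical choices for illustration.

```python
def system_power(phi_1, phi_2, t):
    """Power of two thin lenses separated by t."""
    return phi_1 + phi_2 - phi_1 * phi_2 * t


def power_at_index(n, n_design, f1, f2, t):
    """System power when both lenses share one glass and the index changes
    from n_design to n; each thin-lens power scales as (n - 1)."""
    scale = (n - 1.0) / (n_design - 1.0)
    return system_power(scale / f1, scale / f2, t)


if __name__ == "__main__":
    f1, f2, n_d = 100.0, 50.0, 1.51666   # hypothetical design values (mm, mm, index)
    t = 0.5 * (f1 + f2)                  # separation from the same-glass condition
    for n in (1.51418, 1.51666, 1.52225):
        print(f"n = {n:.5f}: f_e = {1.0 / power_at_index(n, n_d, f1, f2, t):.4f} mm")
```

Because the condition sets the first derivative of the power to zero, the remaining change in focal length between the C and F lines is second order in the index change rather than exactly zero.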
5.2 Third-Order Optics, Monochromatic Aberrations
Aberrations may be interpreted as corrections to the paraxial imaging behavior of optics that result
from adding the second term to the approximations for the trigonometric functions:
$$\sin[\theta] \cong \theta - \frac{\theta^3}{3!}$$
$$\cos[\theta] \cong 1 - \frac{\theta^2}{2!}$$
$$\tan[\theta] \cong \theta + \frac{\theta^3}{3}$$
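The expression for the cosine may be substituted into the formula for the path length $\ell_1$ of the ray
in terms of the object distance $z_1$, the angle $\theta$, and the radius of curvature $R$:
$$\frac{\ell_1}{z_1} = \left[1 + \left(\frac{2R^2}{z_1^2} + \frac{2R}{z_1}\right)\left(1 - \cos[\theta]\right)\right]^{1/2}$$
$$\left.\frac{\ell_1}{z_1}\right|_{\text{third order}} \cong \left[1 + \left(\frac{2R^2}{z_1^2} + \frac{2R}{z_1}\right)\left(1 - \left(1 - \frac{\theta^2}{2!}\right)\right)\right]^{1/2} = \left[1 + \frac{R}{z_1}\left(\frac{R}{z_1} + 1\right)\theta^2\right]^{1/2}$$
$$\implies \ell_1 \cong \left[z_1^2 + R\,(R + z_1)\,\theta^2\right]^{1/2}$$
which is a significantly more complicated expression than the first-order solution:
$$\left.\frac{\ell_1}{z_1}\right|_{\text{first order}} \cong 1 \implies \ell_1 \cong z_1$$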
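To get a feel for the sizes involved, the sketch below compares the exact path length with the third-order and first-order approximations. It assumes that $z_1$ and $R$ are positive distances and that $\theta$ is in radians; the numerical values and function names are hypothetical.

```python
import math


def path_exact(z1, R, theta):
    """Exact path length from the formula with (1 - cos[theta])."""
    return z1 * math.sqrt(1.0 + (2.0 * R * R / z1**2 + 2.0 * R / z1) * (1.0 - math.cos(theta)))


def path_third_order(z1, R, theta):
    """Path length after substituting cos[theta] ~ 1 - theta**2 / 2."""
    return math.sqrt(z1 * z1 + R * (R + z1) * theta**2)


def path_first_order(z1, R, theta):
    """Paraxial (first-order) path length: cos[theta] ~ 1."""
    return z1


if __name__ == "__main__":
    z1, R = 200.0, 50.0  # hypothetical object distance and radius of curvature (mm)
    for theta in (0.02, 0.1, 0.3):
        e = path_exact(z1, R, theta)
        print(f"theta = {theta:4.2f}: exact = {e:.5f}, "
              f"third-order error = {path_third_order(z1, R, theta) - e:.2e}, "
              f"first-order error = {path_first_order(z1, R, theta) - e:.2e}")
```

For small angles the third-order expression tracks the exact path length far more closely than the paraxial value, and the gap between the two approximations grows with the angle.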
The wavefront emerging from the aperture of the system (the exit pupil) may be characterized
by its shape or by rays at different locations in the pupil that are orthogonal to the wavefront. The
rays are defined by the end-point coordinates in the pupil plane (the height $r$ from which they
emerge) and in the image plane (the height $r_0$ to which they travel). The deviations of the wavefront
or of the rays from the ideal behavior are characterized by the concept of ray aberrations, which
typically are specified as a set of numerical values (coefficients) that describe the amount of deviation
of the ray or of the wavefront from the ideal. The order of the aberrations is determined by the highest
power of the term kept in the expansion for the sine in Snell's law:
$$\sin[\theta] = \theta - \frac{\theta^3}{3!} + \frac{\theta^5}{5!} - \cdots$$
The larger powers in the expansion account for the larger deviations of the actual behavior
from the paraxial calculation at larger off-axis angles.
We can also consider deviations of the actual wavefront from the ideal in first-order paraxial or
Gaussian optics. For example, a translation of the ideal wavefront down the z-axis from the ideal
image location may be characterized by an aberration that is called defocus.
The decomposition of the wavefront into deviations from the ideal requires six coefficients of
powers of $r$ and $r_0$:
Spherical Aberration    $r^4$
Coma                    $r^3 r_0 \cos[\theta]$
Astigmatism             $r^2 r_0^2 \cos^2[\theta]$
Curvature of Field      $r^2 r_0^2$
Distortion              $r\,r_0^3 \cos[\theta]$
Piston Error            $r_0^4$
The last of these, piston error, is a measure of a z-axis translation of the wavefront analogous to
defocus. As such, it has no effect on the image and often is not included in the list of aberrations.
In spherical aberration with positive coefficients, the rays from the margin of the pupil cross the
axis closer to the optic than the paraxial rays. The image of a point object created by a system with
spherical aberration shows a bright central region surrounded by a halo of light from the margin
of the pupil.
Spherical aberration describes the deviation of the rays emerging from the pupil from the ideal
convergence to an image point. If the aberration coefficient is positive, the rays emerging from the
margin of the pupil cross the optical axis closer to the optic than the paraxial rays close to the axis.
In other words, the focal length for marginal rays is shorter than that for paraxial rays. Spherical
aberration is a circularly symmetric deviation of the wavefront from the quadratic-phase ideal of
Gaussian optics. The resulting wavefront error varies as the 4th power of the pupil coordinate,
which has the shape of a china bowl. This shows that the rays near the edge of the pupil
are directed towards a point on the axis that is closer to the optic. Since spherical aberration is
a function only of the pupil-plane coordinates, it describes a shift-invariant deviation that may be
characterized by an impulse response.
The shape of the wavefronts emerging from the pupil for spherical aberration (black) and defocus
(red). Marginal rays emerging from a pupil that exhibits spherical aberration will cross the axis
(i.e., focus) closer to the pupil than the paraxial rays.
For coma, the deviations from ideal performance are larger for larger values of the
image-plane coordinate $r_0$. If a point source and its image are located on axis, coma in the system
will have no effect on the image, but the image of a point source located off axis will be spread
differently at different values of the image-plane coordinates. The image of an off-axis point source
will be teardrop shaped.
To introduce the concept of monochromatic aberrations, consider the complex amplitude of the
wavefront diverging from a specific object point $[x_0, y_0]$ to the location $[x, y]$ in the entrance pupil:
$$w[x, y; x_0, y_0] = p[x, y]\,\exp\left[+i\,\Phi[x, y; x_0, y_0]\right]$$
where:
$$\exp\left[+i\,\Phi[x, y; x_0, y_0]\right] = \exp\left[+2\pi i\,\frac{z_1}{\lambda_0}\right]\exp\left[+i\pi\,\frac{r^2}{\lambda_0}\left(\frac{1}{z_1} - \frac{1}{f}\right)\right]\exp\left[+2\pi i\,\Delta\Phi[x, y; x_0, y_0]\right]$$
Here $\Phi$ is the phase at the pupil due to a point source located at $[x_0, y_0]$ in the object plane, which includes
the quadratic phase of the ideal spherical wavefront converging to the image point plus any
phase error $\Delta\Phi[x, y; x_0, y_0]$, and $p[x, y]$ specifies the magnitude function of the pupil (the so-called
apodization function). A similar expression may be written for light converging to the image point
$[x_0', y_0']$ from the location $[x', y']$ in the exit pupil. If the actual wavefront at $[x, y]$ in the pupil lags
behind the ideal sphere (actually a paraboloid), then the light from that location converging to the
image plane must have been emitted earlier in time; the phase difference $\Delta\Phi$ at that location $[x, y]$
in the pupil is positive. The map of $\Delta\Phi[x, y; x_0, y_0]$ may be decomposed into different shapes
described by different powers of the object coordinates $[x_0, y_0]$ and of the pupil coordinates $[x, y]$.
The weights of each of these different shapes present in the actual wavefront are the aberration
coefficients, which are commonly used to specify the differences of the behavior from the ideal.
Comparison of ideal and actual wavefronts emerging from optical system. The difference between
the wavefronts may be specified by the difference in phase or by the intersections of rays normal to
the wavefront.
Alternatively, we can describe the difference in action of the optic from the ideal in terms of the
rays from different points in the pupil. The rays are (of course) perpendicular to the wavefront
emerging from the pupil. Unaberrated rays should all cross the optical axis exactly at the image
point. Rays from an aberrated wavefront will cross at different locations.
Rays from different points on the wavefront emerging from the pupil of an optic with spherical
aberration; the rays cross the optical axis at different locations.
The aberration function $\Delta\Phi$ specifies the difference in optical phase between the actual and ideal
wavefronts that converge to the ideal real image point (or diverge from the ideal virtual image
point). Since the shape of the wavefront due to a point object generally varies with its location in
the object plane, the aberration function generally depends on coordinates in both the object and
pupil planes; it is a 4-D function. The coordinates used in the calculations of the rays are shown in
the figure:
Coordinates used to evaluate aberrations. Light propagates from the pupil plane (coordinates
without subscripts) over the distance $z_2$ to the image plane (coordinates with subscripts). Note that
the pupil and image plane coordinates are normalized so that $r_{max} = (r_0)_{max} = 1$.
A ray of light with wavelength $\lambda_0$ that emerges from the exit pupil at $[x, y]$ and crosses the image
plane at $[x_0, y_0]$ has the form:
$$w[x, y; x_0, y_0] = p[x, y]\,\exp\left[+2\pi i\,\Phi[x, y; x_0, y_0]\right]$$
where $p[x, y]$ specifies the magnitude of the pupil transmittance of the exit pupil (the so-called
apodization function) and $\Phi[x, y; x_0, y_0]$ is the phase (here measured in cycles) at the pupil location
$[x, y]$ for light travelling to the point at coordinates $[x_0, y_0]$. The phase includes the converging spherical (actually
parabolic) wave and the phase difference term:
$$\Phi[x, y; x_0, y_0] = \frac{r^2}{2\lambda_0}\left(\frac{1}{f} - \frac{1}{z_2}\right) + \Delta\Phi[x, y; x_0, y_0]$$
We consider the locations in polar coordinates: the image location is $[x_0, y_0] = (r_0, \theta_0)$ and
the pupil coordinates are $[x, y] = (r, \theta)$. If the optical system has a circular cross-section (i.e., if the
optical system is rotationally symmetric), then the behavior of the aberration does not depend on
the absolute azimuthal coordinates but only on their difference, so that we can consider a
three-dimensional description based on the radial coordinates $r$, $r_0$, and the relative azimuthal angle $\theta - \theta_0$;
i.e., we can write the phase error function in the form $\Delta\Phi[r, r_0, \theta - \theta_0]$. The relative phase between the
object point and a location in the pupil is $2\pi$ radians (per cycle) multiplied by the number of cycles,
which is the ratio of the distance between the locations in the object plane and in the pupil divided
by the wavelength $\lambda_0$:
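$$\text{distance: } R = \left\{z^2 + \left(r\cos[\theta] - r_0\cos[\theta_0]\right)^2 + \left(r\sin[\theta] - r_0\sin[\theta_0]\right)^2\right\}^{1/2}$$
$$\begin{aligned}
\Phi[x, y; x_0, y_0, z] &= 2\pi\,\frac{R}{\lambda_0} = \frac{2\pi}{\lambda_0}\left\{z^2 + \left(r\cos[\theta] - r_0\cos[\theta_0]\right)^2 + \left(r\sin[\theta] - r_0\sin[\theta_0]\right)^2\right\}^{1/2} \\
&= \frac{2\pi}{\lambda_0}\left[z^2 + r^2\cos^2[\theta] + r_0^2\cos^2[\theta_0] - 2rr_0\cos[\theta]\cos[\theta_0] + r^2\sin^2[\theta] + r_0^2\sin^2[\theta_0] - 2rr_0\sin[\theta]\sin[\theta_0]\right]^{1/2} \\
&= \frac{2\pi}{\lambda_0}\left[z^2 + r^2 + r_0^2 - 2rr_0\left(\cos[\theta]\cos[\theta_0] + \sin[\theta]\sin[\theta_0]\right)\right]^{1/2} \\
&= \frac{2\pi}{\lambda_0}\left[z^2 + r^2 + r_0^2 - 2rr_0\cos[\theta - \theta_0]\right]^{1/2} \\
&= 2\pi\,\frac{z}{\lambda_0}\left[1 + \frac{r^2 + r_0^2}{z^2} - \frac{2rr_0}{z^2}\cos[\theta - \theta_0]\right]^{1/2} \\
&\equiv 2\pi\,\frac{z}{\lambda_0}\left[1 + \frac{r^2 + r_0^2}{z^2} - \frac{2rr_0}{z^2}\cos[\theta]\right]^{1/2}
\end{aligned}$$
where the relative azimuthal angle $\theta - \theta_0$ is written simply as $\theta$ in the last line and in what follows.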
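This expression may be expanded into a power series via the binomial theorem:
$$(1 + u)^n = 1 + \frac{n}{1!}\,u + \frac{n\,(n-1)}{2!}\,u^2 + \cdots$$
$$\implies (1 + u)^{1/2} = 1 + \frac{1}{2}u - \frac{1}{8}u^2 + \frac{1}{16}u^3 - \cdots$$
In the current expression, we can identify:
$$u \equiv \frac{r^2 + r_0^2}{z^2} - \frac{2rr_0}{z^2}\cos[\theta]$$
$$\implies \frac{1}{2}u = \frac{r^2 + r_0^2}{2z^2} - \frac{rr_0}{z^2}\cos[\theta]$$
$$\begin{aligned}
\implies -\frac{1}{8}u^2 &= -\frac{1}{8}\left(\frac{r^2 + r_0^2}{z^2} - \frac{2rr_0}{z^2}\cos[\theta]\right)^2 \\
&= -\frac{1}{8}\left[\left(\frac{r^2 + r_0^2}{z^2}\right)^2 + \left(\frac{2rr_0}{z^2}\cos[\theta]\right)^2 - 2\,\frac{r^2 + r_0^2}{z^2}\cdot\frac{2rr_0}{z^2}\cos[\theta]\right] \\
&= -\frac{1}{8}\left[\frac{r^4 + r_0^4 + 2r^2r_0^2}{z^4} + \frac{4r^2r_0^2}{z^4}\cos^2[\theta] - 4\,\frac{\left(r^2 + r_0^2\right)rr_0}{z^4}\cos[\theta]\right] \\
&= -\frac{1}{8}\left[\frac{r^4 + r_0^4 + 2r^2r_0^2}{z^4} + \frac{4r^2r_0^2}{z^4}\cos^2[\theta] - 4\,\frac{r^3r_0}{z^4}\cos[\theta] - 4\,\frac{rr_0^3}{z^4}\cos[\theta]\right]
\end{aligned}$$
$$\implies -\frac{1}{8}u^2 = -\frac{r^4 + r_0^4 + 2r^2r_0^2}{8z^4} - \frac{r^2r_0^2}{2z^4}\cos^2[\theta] + \frac{r^3r_0}{2z^4}\cos[\theta] + \frac{rr_0^3}{2z^4}\cos[\theta]$$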
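So the power series for the phase function truncated to the second order becomes:
$$\begin{aligned}
\Phi[x, y; x_0, y_0, z] \cong\; & 2\pi\,\frac{z}{\lambda_0}\left[1 + \frac{r^2 + r_0^2}{2z^2} - \frac{rr_0}{z^2}\cos[\theta]\right] \\
& + 2\pi\,\frac{z}{\lambda_0}\left[-\frac{r^4 + r_0^4 + 2r^2r_0^2}{8z^4} - \frac{r^2r_0^2}{2z^4}\cos^2[\theta] + \frac{r^3r_0}{2z^4}\cos[\theta] + \frac{rr_0^3}{2z^4}\cos[\theta]\right]
\end{aligned}$$
Now we can multiply through by the leading factor of $2\pi z/\lambda_0$, which produces 10 terms: a constant,
three terms from the first-order polynomial, and six from the second-order polynomial:
$$\begin{aligned}
\Phi[x, y; x_0, y_0, z] \cong\; & 2\pi\,\frac{z}{\lambda_0} + 2\pi\,\frac{r^2 + r_0^2}{2\lambda_0 z} - 2\pi\,\frac{rr_0}{\lambda_0 z}\cos[\theta] \\
& - 2\pi\,\frac{r^4 + r_0^4 + 2r^2r_0^2}{8\lambda_0 z^3} - 2\pi\,\frac{r^2r_0^2}{2\lambda_0 z^3}\cos^2[\theta] + 2\pi\,\frac{r^3r_0}{2\lambda_0 z^3}\cos[\theta] + 2\pi\,\frac{rr_0^3}{2\lambda_0 z^3}\cos[\theta] \\
=\; & 2\pi\,\frac{z}{\lambda_0} \\
& + 2\pi\,\frac{r^2}{2\lambda_0 z} + 2\pi\,\frac{r_0^2}{2\lambda_0 z} - 2\pi\,\frac{rr_0}{\lambda_0 z}\cos[\theta] \\
& - 2\pi\,\frac{r^4}{8\lambda_0 z^3} - 2\pi\,\frac{r_0^4}{8\lambda_0 z^3} - 2\pi\,\frac{r^2r_0^2}{4\lambda_0 z^3} - 2\pi\,\frac{r^2r_0^2}{2\lambda_0 z^3}\cos^2[\theta] + 2\pi\,\frac{r^3r_0}{2\lambda_0 z^3}\cos[\theta] + 2\pi\,\frac{rr_0^3}{2\lambda_0 z^3}\cos[\theta]
\end{aligned}$$
which may be reordered into:
$$\begin{aligned}
\Phi[x, y; x_0, y_0, z] \cong\; & 2\pi\,\frac{z}{\lambda_0} \\
& + \frac{\pi}{\lambda_0 z}\,r^2 + \frac{\pi}{\lambda_0 z}\,r_0^2 - \frac{2\pi}{\lambda_0 z}\,r\,r_0\cos[\theta] \\
& - \frac{\pi}{4\lambda_0 z^3}\,r^4 + \frac{\pi}{\lambda_0 z^3}\,r^3r_0\cos[\theta] - \frac{\pi}{2\lambda_0 z^3}\,r^2r_0^2 \\
& - \frac{\pi}{\lambda_0 z^3}\,r^2r_0^2\cos^2[\theta] + \frac{\pi}{\lambda_0 z^3}\,r\,r_0^3\cos[\theta] - \frac{\pi}{4\lambda_0 z^3}\,r_0^4
\end{aligned}$$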
In other words, we have decomposed the phase of the spherical wave into terms with different
powers of the coordinate in the pupil plane (with coordinates $[x, y] = (r, \theta)$) and in the image plane
(with coordinates $[x_0, y_0] = (r_0, \theta_0)$) in a manner analogous to the decomposition into sinusoidal
components in the Fourier transform. Our goal will be to decompose the phase difference between
the ideal and actual wavefronts using these same terms. Again, since the system is assumed circularly
symmetric, only the difference in azimuthal coordinates is relevant.
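The ten-term decomposition can be checked numerically against the exact spherical-wave phase. The sketch below is one such check, assuming that all distances are in the same units as the wavelength and that the angle argument is the relative azimuthal angle; the function names and sample values are hypothetical.

```python
import math


def phase_exact(r, r0, theta, z, wavelength):
    """Exact phase 2*pi*R/lambda_0 for pupil radius r, image height r0, relative angle theta."""
    dist = math.sqrt(z * z + r * r + r0 * r0 - 2.0 * r * r0 * math.cos(theta))
    return 2.0 * math.pi * dist / wavelength


def phase_terms(r, r0, theta, z, wavelength):
    """The ten terms of the truncated series: constant, three second-order, six fourth-order."""
    c = math.cos(theta)
    k = math.pi / wavelength
    return [
        2.0 * k * z,                            # propagation
        k * r**2 / z,                           # r^2
        k * r0**2 / z,                          # r0^2
        -2.0 * k * r * r0 * c / z,              # r r0 cos
        -k * r**4 / (4.0 * z**3),               # r^4
        k * r**3 * r0 * c / z**3,               # r^3 r0 cos
        -k * r**2 * r0**2 / (2.0 * z**3),       # r^2 r0^2
        -k * r**2 * r0**2 * c**2 / z**3,        # r^2 r0^2 cos^2
        k * r * r0**3 * c / z**3,               # r r0^3 cos
        -k * r0**4 / (4.0 * z**3),              # r0^4
    ]


if __name__ == "__main__":
    args = (1.0, 2.0, 0.7, 100.0, 1.0)  # hypothetical r, r0, theta, z, wavelength
    exact = phase_exact(*args)
    terms = phase_terms(*args)
    print("error with four terms:", sum(terms[:4]) - exact)
    print("error with ten terms: ", sum(terms) - exact)
```

Keeping the six fourth-order terms reduces the residual phase error by several orders of magnitude when the transverse coordinates are small compared with the propagation distance.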
5.2.1 Names of Aberrations
The difference in the shape of the actual wavefront from the ideal spherical wavefront is decomposed
into the same terms as the phase; each term has its unique shape and name, and will be
described by a coefficient that determines how much of each shape is present in the phase difference.
From the series above, we can apply weighting coefficients to the three relevant coordinates
distinguished by subscripts: the index $j$ of the power of the radial coordinate $r_0$ at the image (the
image height), the index $m$ of the power of the radial coordinate $r$ at the pupil, and the index $n$
of the power of $\cos[\theta]$. From the series above we can see that only some powers are included in the
summation, so we can write the phase difference as
$$\begin{aligned}
\Delta\Phi[x, y; x_0, y_0, z] &= \Phi_{ideal}[x, y; x_0, y_0, z] - \Phi_{actual}[x, y; x_0, y_0, z] \\
&= \sum_{j,m,n} W_{jmn}\; r_0^{\,j}\; r^m \cos^n[\theta] \\
&= W_{000} \quad \text{(propagation from pupil to image)} \\
&\quad + W_{200}\,r_0^2 \;\text{(piston error)} + W_{111}\,r_0\,r\cos[\theta] \;\text{(tip-tilt)} + W_{020}\,r^2 \;\text{(defocus)} \\
&\quad + W_{040}\,r^4 \;\text{(spherical aberration)} + W_{131}\,r_0\,r^3\cos[\theta] \;\text{(coma)} \\
&\quad + W_{220}\,r_0^2\,r^2 \;\text{(curvature of field)} + W_{222}\,r_0^2\,r^2\cos^2[\theta] \;\text{(astigmatism)} \\
&\quad + W_{311}\,r_0^3\,r\cos[\theta] \;\text{(distortion)} + W_{400}\,r_0^4 \;\text{(piston error)} \\
&\quad + \cdots
\end{aligned}$$
The coefficients $W_{jmn}$ measure the amplitudes of the individual terms and typically are
specified in units of wavelengths (the number of waves of the aberration) at the edge of the pupil
(i.e., at $r = 1$); they must be multiplied by $2\pi$ radians per wavelength to convert to phase angle. For
example, a sample system might be specified as having one-half wave of spherical aberration and a
quarter wave of astigmatism.
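A specification of this kind can be evaluated directly from the sum over the indices. The sketch below stores each coefficient under its index triple (image-height power, pupil power, cosine power) and evaluates the wavefront error in waves and in radians for the sample system just mentioned. The dictionary layout and function names are illustrative assumptions, and the coordinates are taken to be normalized so that the pupil edge is at unit radius.

```python
import math


def wavefront_error(coeffs, r, r0, theta):
    """Sum of W_jmn * r0**j * r**m * cos(theta)**n, in waves.

    coeffs maps (j, m, n) -> W_jmn, with j the power of the image height r0,
    m the power of the pupil radius r, and n the power of cos(theta).
    """
    return sum(w * r0**j * r**m * math.cos(theta)**n
               for (j, m, n), w in coeffs.items())


def phase_error(coeffs, r, r0, theta):
    """Same sum converted to radians (2*pi radians per wave)."""
    return 2.0 * math.pi * wavefront_error(coeffs, r, r0, theta)


# One-half wave of spherical aberration and a quarter wave of astigmatism.
SAMPLE = {(0, 4, 0): 0.5, (2, 2, 2): 0.25}

if __name__ == "__main__":
    for r0 in (0.0, 0.5, 1.0):
        w = wavefront_error(SAMPLE, r=1.0, r0=r0, theta=0.0)
        print(f"image height {r0:.1f}: {w:.4f} waves at the pupil edge")
```

On axis only the spherical term contributes; the astigmatism term grows with the square of the image height.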
Shift Invariant or Not?
Note that phase errors that depend on $r_0$ will produce different images for different image heights
and therefore are shift-variant effects that strictly cannot be characterized by impulse responses
and/or transfer functions. That being said, it is common practice to examine the impulse
response and/or the transfer function in a local region as though the aberration were shift invariant,
which allows the analyst to create a (pseudo) frequency-domain description of the action of
the aberration.
5.2.2 Aberration Coefficients
To get an idea of the behavior in the wavefront due to these terms, we can plot graphs of these
shapes at the pupil for specified locations in the object plane. The examples are plotted for
different object locations and assuming that $\lambda_0 = z_2 = 1$. The aberrations are grouped by the
numerical powers of the radial terms in the series, e.g., $j + m = 0$ for $W_{000}$, $j + m = 2$ for $W_{200}$,
$W_{111}$, and $W_{020}$, $j + m = 4$ for $W_{040}$, $W_{131}$, etc. You might expect that the second-order grouping
would include $W_{200}$ (piston error), $W_{111}$ (tip-tilt), and $W_{020}$ (defocus). However, for historical
reasons, the groupings are based on the powers for the rays derived from the wavefronts via
the gradient operator (a first-order derivative), so these three form the group of the first-order
aberrations. The terms with $j + m = 4$ are the third-order aberrations, etc.
Zero-Order Term:
Propagation:
constant phase (zero-order piston error = propagation from pupil to image):
$$\Delta\Phi[x, y; x_0, y_0, z] = 2\pi\,W_{000} \times \begin{cases} 1 & \text{if } \sqrt{x^2 + y^2} \le 1 \\ 0 & \text{if } \sqrt{x^2 + y^2} > 1 \end{cases}$$
The coefficient $W_{000}$ is the number of incremental wavelengths due to propagation downstream
from the object to the pupil, which is a normal part of the imaging; it is not considered to be an aberration.
In any event, its only effect on the irradiance is the constant attenuation of the image field due to
the inverse square law, identical to the constant phase term in the Fresnel and Fraunhofer diffraction
terms.
zero-order term, constant phase, piston error aberration

Second-Order Wave (First-Order Ray) Aberrations:
These include the three terms for which the sums of the powers of $r$ and $r_0$ equal two. Since the rays
are oriented orthogonal to the wavefront (and must be calculated by derivatives), these correspond to
the first-order aberrations for rays. In fact, these three terms often are not considered to be aberrations
since the only one that has a degrading effect on an irradiance image is defocus, which may (of course) be
compensated by changing the location of the sensor so that it coincides with the image.
Constant Phase First-Order Piston Error
constant phase (first-order piston error):
$$\Delta\Phi[x, y; x_0, y_0] = 2\pi\,W_{200} \times \begin{cases} +\dfrac{r_0^2}{2\lambda_0 z} & \text{if } \sqrt{x^2 + y^2} \le 1 \\ 0 & \text{if } \sqrt{x^2 + y^2} > 1 \end{cases}$$
This is an additional constant phase due to the off-axis location in the image plane; it is quadratic
in the image coordinate, but constant in the pupil coordinate, so it is a constant for a particular
image location. Since this measures the constant phase difference, it has no effect on the measured
irradiance and therefore no impact on the quality of the image.
constant phase from first-order terms: piston error

Bilinear-Phase Tip-Tilt
linear phase from both object and pupil (tip or tilt):
$$\Delta\Phi[x, y; x_0, y_0] = 2\pi\,W_{111} \times \begin{cases} -\dfrac{r\,r_0}{\lambda_0 z}\cos[\theta] & \text{if } \sqrt{x^2 + y^2} \le 1 \\ 0 & \text{if } \sqrt{x^2 + y^2} > 1 \end{cases}$$
A phase that has linear contributions from the pupil location $r$ and image location $r_0$ (a bilinear
phase) means that the shape of the field emerging from the pupil for a particular object location is
a flat plane tilted in proportion to the off-axis position of the object and the image. Because it is
a linear phase in the pupil, it displaces the resulting image towards the direction where the phase is
negative.
In atmospheric imaging scenarios (imaging along a vertical path through turbulence), the time-
varying tip-tilt aberration is dominant. For example, the centers of the images of individual stars
appear to move around over short time intervals of the order of hundredths of a second. The
correction of tip-tilt aberration has a very significant positive effect on the quality of the resulting
image. For an example, see the animated GIF file at URL:
http://www.ast.cam.ac.uk/~optics/Lucky_Web_Site/100Her_10ms_200fr.gif
first-order linear term, tip-tilt error

Quadratic-Phase Error, Focus Shift = Defocus
quadratic phase = defocus = focus shift
$$\Delta\Phi[x, y; x_0, y_0] = 2\pi\,W_{020} \times \begin{cases} +\dfrac{r^2}{2\lambda_0 z} & \text{if } \sqrt{x^2 + y^2} \le 1 \\ 0 & \text{if } \sqrt{x^2 + y^2} > 1 \end{cases}$$
This quadratic term is the error in the Fresnel propagation from the exit pupil if the observation
plane does not coincide with the image plane and is therefore called defocus. Since it is not a
result of flaws in the optics, it is often not considered to be an aberration, but there is reason to
do so in some applications. As an example, consider the atmospheric imaging scenario mentioned
under tip-tilt; any time-varying quadratic contribution to the relative phase displaces the focal
plane (slightly), so images through atmospheric turbulence with quadratic contributions appear to
go in and out of focus over short time intervals (but, as already mentioned, the tip-tilt aberration
is dominant, totalling 87% of the light energy under certain assumptions; see Noll, JOSA, 66,
pp. 207-211, 1976 and van Dam & Lane, JOSA A, 19, pp. 745-752).
first-order quadratic term, focus shift error = defocus
Since defocus is a function only of the pupil-plane coordinates, it is shift invariant at the image
plane; the effect of defocus does not vary with image height and therefore may be described by
an impulse response and a transfer function. For example, consider a small first-order focus error of
$\pi$ radians at the edge of a rectangular pupil with linear dimension $d_0 = 1$ unit. The complex-valued
wavefront has the form shown:
Pupil function with defocus of $\pi$ radians at edge of the pupil (half-wave of defocus): (a) real
part; (b) imaginary part; (c) magnitude; (d) phase, showing quadratic nature.
The incoherent transfer function is the scaled autocorrelation of the pupil and the impulse
response is the inverse Fourier transform of the transfer function. The MTF has a zero at the
normalized spatial frequency $\xi = 0.5$. Note that the image with defocus is wider and the peak
irradiance is smaller than the diffraction-limited image.
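The location of this zero can be reproduced from the autocorrelation. For one axis of a square pupil with a quadratic phase of a given number of waves at the pupil edge, the normalized autocorrelation evaluates in closed form to a triangle multiplied by a sinc. The sketch below evaluates that closed form, assuming the spatial frequency is normalized to the incoherent cutoff; the function names are hypothetical and the result is a 1-D profile along one axis only.

```python
import math


def sinc(u):
    """sin(pi u) / (pi u), with the limiting value 1 at u = 0."""
    return 1.0 if u == 0.0 else math.sin(math.pi * u) / (math.pi * u)


def otf_defocus_1d(freq, waves):
    """Incoherent transfer function along one axis of a square pupil.

    freq  : spatial frequency normalized to the cutoff (|freq| <= 1)
    waves : defocus in waves at the edge of the pupil
    Closed form of the normalized pupil autocorrelation:
        H(freq) = (1 - |freq|) * sinc(8 * waves * |freq| * (1 - |freq|))
    """
    a = abs(freq)
    if a >= 1.0:
        return 0.0
    return (1.0 - a) * sinc(8.0 * waves * a * (1.0 - a))


if __name__ == "__main__":
    for freq in (0.0, 0.25, 0.5, 0.75):
        print(f"freq = {freq:.2f}: no defocus {otf_defocus_1d(freq, 0.0):+.3f}, "
              f"half wave {otf_defocus_1d(freq, 0.5):+.3f}, "
              f"one wave {otf_defocus_1d(freq, 1.0):+.3f}")
```

With one-half wave of defocus the argument of the sinc reaches unity only at half the cutoff frequency, which is where the zero appears; with a full wave the sinc changes sign, so the transfer function is negative over part of the passband.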
(a) MTF of incoherent optical system with square aperture with one-half wave of defocus compared
to MTF without defocus (red); (b) psf with one-half wave of defocus (black) and without defocus
(red).
Other examples of transfer functions (MTFs) and impulse responses for square apertures with
different amounts of defocus (measured in waves at the edge of the pupil) are shown. Note in particular
that the intermediate frequencies are degraded more rapidly than either the smallest or largest spatial
frequencies. Note that the MTF at certain frequencies is negative, which means that the modulation
has changed sign (lighter regions in the original object become darker in the defocused image).
This can be seen in an object with different spatial frequencies.
MTF and corresponding psfs for square pupil with different amounts of defocus from $\lambda_0/4$ at the edge
of the pupil to $1.5\lambda_0$. Note that the decrease in MTF is most pronounced at intermediate spatial
frequencies. For larger amounts of defocus, the MTF goes negative over regions of the frequency
domain (contrast reversal). The psf widens with increasing defocus.
The spatial frequency of a radial grating $f[x, y]$ increases as the reciprocal of the distance from
the center. In the examples shown, the irradiance is biased up so that its normalized maximum and
minimum amplitudes are 1 and 0, respectively. The grating is imaged through a real optical system
onto a CCD sensor that samples the image and thus the image is aliased at large spatial frequencies
(near the center). The three images are at the focal plane (i.e., in focus) and with two increments
of defocus. Track a radial line in the original (in red) to see that the amplitude of the in-focus
image does not vary from unity (except where there is aliasing), while the defocused image exhibits
several changes in phase, from light to dark to light, etc. The contrast of the smallest spatial
frequency (at the edge of the image) is reversed in the image with more defocus, and this image
also exhibits more changes in phase.
Effect of two increments of defocus on the image of a radial grating. The negative regions of the
MTF of defocus imply that the contrast of those spatial frequencies is reversed (darker gray
becomes lighter gray and vice versa). Track the lightness along the red lines to see the contrast reversals.
Note that the in-focus image exhibits some sampling (aliasing) artifacts in the center where
the azimuthal spatial frequency is large.
This artifact is often called spurious resolution, because the object is not reproduced at the
locations of the phase change.
5.2.3 Fourth-Order (Third-Order Ray) Aberrations: the Seidel Aberrations
$-\dfrac{r^4}{8\lambda_0 z^3}$ = no variation at object, quartic phase at pupil = spherical aberration, $W_{040}$ (LSI)
$+\dfrac{r^3 r_0}{2\lambda_0 z^3}\cos[\theta]$ = linear phase at object, cubic phase at pupil = coma, $W_{131}$
$-\dfrac{r^2 r_0^2}{4\lambda_0 z^3}$ = quadratic phase at object and pupil = field curvature, $W_{220}$
$-\dfrac{r^2 r_0^2}{2\lambda_0 z^3}\cos^2[\theta]$ = quadratic phase at object and pupil + azimuth variation = astigmatism, $W_{222}$
$+\dfrac{r\,r_0^3}{2\lambda_0 z^3}\cos[\theta]$ = cubic phase at object, linear phase at pupil = distortion, $W_{311}$
$-\dfrac{r_0^4}{8\lambda_0 z^3}$ = quartic phase at object, no variation at pupil = third-order piston error, $W_{400}$
Note that four of these six terms have even powers of both the pupil coordinate $r$ and the image
coordinate $r_0$, whereas coma and distortion include odd powers of both.
Spherical Aberration
This is the simplest third-order aberration to describe mathematically since it depends only on
the coordinates in the pupil plane; its effect is constant across the image plane. This means that
spherical aberration is the only one of the six Seidel terms that is shift invariant (and may therefore
be described as a convolution). The wavefront shape for spherical aberration resembles a deeper
bowl than the paraboloid for defocus. Note that the negative sign on the phase means that the
spherical aberration is negative if the phase contribution is positive.
quartic phase at pupil, no variation at object (spherical aberration):
$$\Delta\Phi[x, y; x_0, y_0] = 2\pi\,W_{040} \times \begin{cases} -\dfrac{r^4}{2\lambda_0 z^3} & \text{if } \sqrt{x^2 + y^2} \le 1 \\ 0 & \text{if } \sqrt{x^2 + y^2} > 1 \end{cases}$$
quartic term from second order of expansion: spherical aberration

If the numerical coefficient of spherical aberration is positive, then rays from the marginal regions
of the pupil have a steeper slope than those from the paraxial region near the optical axis. In other
words, the marginal focus is closer to the lens than the ideal paraxial focus. The paraxial image
of a point object is not sharp but exhibits a halo of light around a bright central core.
Negative coefficient of spherical aberration of positive lens: rays from the margin of the pupil cross
axis closer to the optic than paraxial rays. The image of a point object at the paraxial focus exhibits
a bright central region surrounded by a halo of light from the margin of the pupil.
Because it is a shift-invariant effect at the image plane, spherical aberration may be described
by an impulse response and by a transfer function. Spherical aberration is a distortion of the true
spherical wavefront that makes a deeper bowl so that the incremental phase error is large near
the edge of the pupil (far from the optical axis, for the marginal part of the wave) and small near
the center of the pupil (near the optical axis, for the paraxial part of the wave).
Example of quartic wavefront error of spherical aberration compared to quadratic error from
defocus. Spherical aberration error is a deeper bowl.
Consider an example for spherical aberration where the phase error is $\pi$ radians at the edge of a
square pupil, the same phase error at the edge that was considered for defocus. The profiles of the
phase in the pupil are:
Pupil function for one-half wave of spherical aberration: (a) real part; (b) imaginary part; (c)
magnitude; (d) phase in units of radians, showing the fourth-power behavior.
The incoherent MTF shows a significant decrease as the frequency approaches cutoff and the psf
is noticeably wider and shorter:
(a) MTF of incoherent optical system with square aperture with one-half wave of negative spherical
aberration at the edge of the pupil compared to MTF without aberration (red); (b) psf with one-half
wave of aberration (black) and without aberration (red). Note that the image with spherical
aberration is shorter and fatter.
MTF and corresponding psfs for square pupil with different amounts of spherical aberration from
$\lambda_0/4$ at the edge of the pupil to $1.5\lambda_0$. The MTF behaves similarly to that for defocus; it decreases
most rapidly at the middle frequencies rather than at smallest or largest, and it may go negative at
some frequencies. The MTF for spherical aberration decreases more slowly than for defocus because
the phase changes more slowly except near the edge of the pupil.
The uncorrected optical system in the Hubble Space Telescope suffered from significant spherical
aberration due to flaws in the primary mirror that were disguised during mirror testing.
Spherical aberration of the wave emerging from different parts of the pupil may be partially
balanced by changing the focus, i.e., by adding defocus. For example, the phase at the edge of the
pupil may be compensated by applying a defocus aberration in the opposite direction so that
$$2\pi\,W_{040}\left(-\frac{1^4}{2\lambda_0 z^3}\right) + 2\pi\,W_{020}\left(+\frac{1^2}{2\lambda_0 z}\right) = 0 \implies W_{020} = \frac{W_{040}}{z^2}$$
If we use defocus to cancel the phase error due to spherical aberration at the edge of the pupil, the
resulting transfer function and image have the form shown, so that the image is improved markedly
by using the appropriate amount of defocus.
Application of defocus to balance spherical aberration at edge of square pupil: (a) MTF without
aberrations (black), with 1/2 wave of spherical aberration (red), and after balancing with -1/2 wave
of defocus; (b) corresponding impulse responses.
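The benefit of balancing can be quantified by the largest remaining wavefront error across the pupil. The sketch below assumes a normalized pupil radius and the generic power-series convention in which the wavefront error along a radius is a spherical coefficient times the fourth power of the radius plus a defocus coefficient times its square, both in waves; in that convention the error vanishes at the pupil edge when the two coefficients are equal and opposite. The function names and the sampling density are hypothetical.

```python
def residual(r, w040, w020):
    """Wavefront error in waves at normalized pupil radius r (0 <= r <= 1)."""
    return w040 * r**4 + w020 * r**2


def peak_abs(w040, w020, samples=1001):
    """Largest absolute wavefront error over uniformly sampled radii."""
    return max(abs(residual(i / (samples - 1), w040, w020)) for i in range(samples))


if __name__ == "__main__":
    w040 = 0.5  # one-half wave of spherical aberration at the pupil edge
    print("unbalanced peak error:", peak_abs(w040, 0.0), "waves")
    print("balanced peak error:  ", round(peak_abs(w040, -w040), 4), "waves")
```

Cancelling the error at the edge leaves a residual that peaks part-way out in the pupil at one quarter of the original edge value, which is consistent with the marked improvement described for the balanced case.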
Coma
linear phase at object, cubic phase at pupil, azimuthal variation:
$$\Delta\Phi[x, y; x_0, y_0] = 2\pi\,W_{131} \times \begin{cases} +\dfrac{r_0\,r^3}{2\lambda_0 z^3}\cos[\theta] & \text{if } \sqrt{x^2 + y^2} \le 1 \\ 0 & \text{if } \sqrt{x^2 + y^2} > 1 \end{cases}$$
The surface shape is proportional to the image height and to the cube of the height
of the ray in the pupil. This produces a different phase error, and therefore different images, for
different values of the image height $r_0$ as shown in the example. The images have a comet-like
shape, hence the name for the aberration.
Star field imaged through optical system with coma; elongation of the star images increases with
distance from optical axis (which is located below bottom of the image). Credit: Star Gazing with
Telescope and Camera, George T. Keene, Amphoto, Garden City, 1967, p. 93.
Curvature of Field
quadratic phase from object and pupil
$$\Delta\Phi[x, y; x_0, y_0] = 2\pi\,W_{220} \times \begin{cases} -\dfrac{r_0^2\,r^2}{2\lambda_0 z^3} & \text{if } \sqrt{x^2 + y^2} \le 1 \\ 0 & \text{if } \sqrt{x^2 + y^2} > 1 \end{cases}$$
As indicated by the name, the best images in systems with this aberration are on a curved surface.
Some imaging systems (e.g., Schmidt cameras) are deliberately designed with curved fields because
they produce good images over wide fields of view. The sensors used in wide-field Schmidt
astronomical cameras were glass plates that were predistorted prior to being installed in the camera.
Since the plates could be as large as 14" square, this was a touchy operation.
Astigmatism
The Latin word for points is stigmata, so that a system with astigmatism is not capable of
producing points. It focuses horizontal and vertical patterns at different focal planes, as shown:
Astigmatism focuses vertical and horizontal lines at different planes (horizontal lines in the
sagittal plane and vertical lines in the meridional plane)
http://www.olympusmicro.com/primer/anatomy/aberrations.html
The aberration coefficient for astigmatism is:
quadratic phase from object and pupil and azimuthal variation
$$\Delta\Phi[x, y; x_0, y_0] = 2\pi\,W_{222} \times \begin{cases} -\dfrac{r_0^2\,r^2}{2\lambda_0 z^3}\cos^2[\theta] & \text{if } \sqrt{x^2 + y^2} \le 1 \\ 0 & \text{if } \sqrt{x^2 + y^2} > 1 \end{cases}$$
The error is quadratic with an azimuthal dependence; the additional quadratic is maximized along
the azimuthal directions $\theta = 0$ and $\theta = \pi$, and is zero along the orthogonal direction. It therefore adds
an azimuthally dependent focusing power. In other words, object lines oriented along different
directions are focused at different distances from the optic.
The eye systems of many people exhibit astigmatism, which means that the corrective lenses
must have different powers along the orthogonal axes; in other words, lenses with cylindrical power
are needed.
Lenses that have been corrected for astigmatism are known as anastigmats.
Distortion
cubic phase at object, linear phase at pupil, azimuthal variation
$$\Delta\Phi[x, y; x_0, y_0] = 2\pi\,W_{311} \times \begin{cases} +\dfrac{r_0^3\,r}{2\lambda_0 z^3}\cos[\theta] & \text{if } \sqrt{x^2 + y^2} \le 1 \\ 0 & \text{if } \sqrt{x^2 + y^2} > 1 \end{cases}$$
This is a cubic dependence on the image coordinate and a linear variation in the pupil coordinate.
Like coma, the effect of distortion varies with image height.
The image shapes resulting from distortion with coefficients of different algebraic signs are different.
If $W_{311} < 0$ or $W_{311} > 0$, the images suffer from pincushion distortion or barrel distortion,
respectively.
Images of a grid object through systems with (a) no aberrations; (b) pincushion distortion
($W_{311} < 0$); (c) barrel distortion ($W_{311} > 0$).
Piston Error
quartic phase at object
$$\Delta\Phi[x, y; x_0, y_0] = 2\pi\,W_{400} \times \begin{cases} -\dfrac{r_0^4}{2\lambda_0 z^3} & \text{if } \sqrt{x^2 + y^2} \le 1 \\ 0 & \text{if } \sqrt{x^2 + y^2} > 1 \end{cases}$$
This is a constant phase due to the off-axis distance at the image plane and has no effect on the
irradiance of the image, hence it often is not considered to be an aberration. However, it does have
an important effect on optical systems with sparse primary elements, such as multiple-mirror
telescopes.
constant term from second-order expansion: piston error
Of course, the ultimate resolution of optical systems may be due in part to other uncontrollable
factors. For example, ground-based astronomical telescopes are ultimately limited by random
variations in local air temperature that create random variations in the refractive index of atmospheric
patches. These variations are often decomposed into the Seidel aberrations. The constant phase
(piston) error has no effect on the irradiance (the squared magnitude of the amplitude). Linear
phase errors move the image from side to side and/or top to bottom (tip-tilt). Quadratic phase
errors (defocus) add or subtract power from the lens to move the image plane along the axis
forwards (towards the optic) or backwards (away from the optic), respectively. In correction for
atmospheric phase errors, the tip-tilt error is most significant, which means that correcting this
aberration significantly improves the image quality. The field of correcting atmospheric aberrations
is called adaptive optics, and is an active research area.
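The effects of piston and tip-tilt can be checked numerically with a short sketch. It assumes the usual Fourier-optics model in which the impulse response is the squared magnitude of the Fourier transform of the pupil function; the grid size, pupil radius, and phase values are illustrative choices, not values from these notes.

```python
import numpy as np

N = 64
# Pupil-plane sample coordinates, centered on the array
y, x = np.indices((N, N)) - N // 2
pupil = (np.hypot(x, y) <= N // 8).astype(float)  # circular aperture

def psf(phase):
    """Irradiance in the image plane for a given pupil phase (radians)."""
    field = pupil * np.exp(1j * phase)
    amplitude = np.fft.fftshift(np.fft.fft2(np.fft.ifftshift(field)))
    return np.abs(amplitude) ** 2

psf_flat = psf(np.zeros((N, N)))                 # unaberrated
psf_piston = psf(np.full((N, N), 1.234))         # constant phase
psf_tilt = psf(2.0 * np.pi * 3 * x / N)          # linear phase, 3 cycles across the array

peak_flat = np.unravel_index(np.argmax(psf_flat), psf_flat.shape)
peak_tilt = np.unravel_index(np.argmax(psf_tilt), psf_tilt.shape)

print("piston changes irradiance:", not np.allclose(psf_flat, psf_piston))
print("peak (row, col) without tilt:", peak_flat, "with tilt:", peak_tilt)
```

The piston term leaves the irradiance unchanged, while the linear phase translates the impulse response along one axis without changing its shape, which is why a fast tip-tilt mirror recovers much of the lost image quality.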
5.2.4 Zernike Polynomials
It should be no surprise that other useful decompositions of the wavefront errors exist. Another
common set of basis functions is the Zernike polynomials, which are often used for fitting data from
interferometric optical testing (though NOT in the presence of air turbulence; Zernikes have little
value in this situation). The Zernike polynomials are functions of the radial and azimuthal coordinates
$(r, \theta)$ that describe surfaces on the unit circle such that the average value of each nonconstant polynomial is zero:
$$Z_n^{\ell}(r, \theta) = R_n^{\ell}(r)\, \cos(\ell\theta)$$

$$Z_n^{-\ell}(r, \theta) = R_n^{\ell}(r)\, \sin(\ell\theta)$$
where the radial part is defined as:

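$$R_n^{\ell}(r) = \begin{cases} \displaystyle\sum_{k=0}^{(n-\ell)/2} \frac{(-1)^k\, (n-k)!}{k!\, \left(\frac{n+\ell}{2} - k\right)!\, \left(\frac{n-\ell}{2} - k\right)!}\; r^{\,n-2k} & \text{if } n - \ell \text{ is even} \\ 0 & \text{if } n - \ell \text{ is odd} \end{cases}$$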
So that:
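$$R_0^0(r) = \frac{(-1)^0\, 0!}{0!\, 0!\, 0!}\, r^0 = 1(r) \implies Z_0^0(r, \theta) = 1(r) \cos(0 \cdot \theta) = 1(r)$$
$$R_1^1(r) = \frac{(-1)^0\, 1!}{0!\, 1!\, 0!}\, r^1 = r \implies Z_1^1(r, \theta) = r \cos(1 \cdot \theta) = r \cos(\theta)$$
$$Z_1^{-1}(r, \theta) = R_1^1(r) \sin(1 \cdot \theta) = r \sin(\theta)$$
etc.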
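The radial sum is easy to evaluate by machine. The following is a minimal sketch that assumes the standard form of the radial polynomial, with a radial index `n` and an azimuthal index here called `ell` (the function and argument names are illustrative). A negative `ell` selects the sine term.

```python
import math
from math import factorial

def zernike_radial(n, ell, r):
    """Radial polynomial R_n^ell(r); zero if n - |ell| is odd or |ell| > n."""
    ell = abs(ell)
    if ell > n or (n - ell) % 2:
        return 0.0
    total = 0.0
    for k in range((n - ell) // 2 + 1):
        numerator = (-1) ** k * factorial(n - k)
        denominator = (factorial(k)
                       * factorial((n + ell) // 2 - k)
                       * factorial((n - ell) // 2 - k))
        total += numerator / denominator * r ** (n - 2 * k)
    return total

def zernike(n, ell, r, theta):
    """Zernike polynomial: cosine term for ell >= 0, sine term for ell < 0."""
    radial = zernike_radial(n, ell, r)
    if ell >= 0:
        return radial * math.cos(ell * theta)
    return radial * math.sin(-ell * theta)

# The two cases worked above, plus the defocus-like term R_2^0(r) = 2 r^2 - 1
print(zernike_radial(0, 0, 0.3), zernike_radial(1, 1, 0.3), zernike_radial(2, 0, 0.5))
```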
One advantage of the Zernike polynomials is that distinct polynomials are orthogonal over the unit
circle (i.e., the scalar product of any pair of distinct Zernike polynomials vanishes):

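$$\int_{r=0}^{r=1} R_n^{\ell}(r)\, R_m^{\ell}(r)\, r\, dr \propto \delta_{nm} \equiv \begin{cases} 1 & \text{if } n = m \\ 0 & \text{if } n \neq m \end{cases}$$
where $\delta_{nm}$ is the Kronecker delta function. The set of the first 36 (nonconstant) Zernike polynomials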
yields a decomposition with minimum RMS wavefront error. Since they all represent wavefront errors
at the exit pupil, the corresponding impulse responses and transfer functions may be calculated; the
former are shown in a figure.
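As a hand check of the orthogonality relation, consider the two rotationally symmetric radial polynomials with zero azimuthal index, $R_2^0(r) = 2r^2 - 1$ and $R_4^0(r) = 6r^4 - 6r^2 + 1$, which follow from the standard radial sum (the superscript notation for the azimuthal index is an assumption made here). Their weighted product integrates to zero over the unit radius:

$$\int_0^1 R_2^0(r)\, R_4^0(r)\, r\, dr = \int_0^1 \left(12 r^7 - 18 r^5 + 8 r^3 - r\right) dr = \tfrac{3}{2} - 3 + 2 - \tfrac{1}{2} = 0$$

whereas the integral of a polynomial with itself does not vanish:

$$\int_0^1 \left[R_2^0(r)\right]^2 r\, dr = \int_0^1 \left(4 r^5 - 4 r^3 + r\right) dr = \tfrac{2}{3} - 1 + \tfrac{1}{2} = \tfrac{1}{6}$$

The nonzero value is $1/(2n+2)$ with $n = 2$, so the integral is proportional to the Kronecker delta rather than equal to it unless the polynomials are renormalized.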
First 28 Zernike polynomials ordered by azimuthal index (horizontally) and radial index (vertically).
Ref: http://scien.stanford.edu/class/psych221/projects/03/pmaeda/index_files.
PSFs (impulse responses) of the aberrations for each of the first 28 Zernike polynomials (ref:
http://scien.stanford.edu/class/psych221/projects/03/pmaeda/index_files/image096.gif )
5.3 Structural Aberration Coefficients

Structural aberration coefficients are due to the configuration or orientation of the lens. We
have just seen that the lensmaker's equation ensures that there are many prescriptions for a thin
lens with a fixed focal length made from one glass. For example, if $n_2 = 1.5$ and $f = 100$ mm, we
can have $R_1 = -R_2 = 100$ mm (double convex), or $R_1 = 50$ mm and $R_2 = \infty$ (plano-convex, curved
side towards object), or $R_1 = \infty$ and $R_2 = -50$ mm (plano-convex, curved side towards image), and
many other possibilities. It is perhaps logical that the aberrations from these different prescriptions
will be different too. The calculation leads to one of the rules of thumb for optical systems: a
better image is generated by an optical system if the side of the optic with the larger radius is on
the side with the shorter conjugate, which divides the power of the lens more equally between the
two surfaces.
For example, for a plano-convex lens with the source point at infinity (so that the image is at the
focal point), the image exhibits better quality if the curved side of the lens is towards the object.
With the flat side towards the object, the front flat surface contributes no power to the image.
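The three prescriptions quoted above can be verified with a few lines of code. This sketch assumes the thin-lens lensmaker's equation for a lens in air, $1/f = (n - 1)(1/R_1 - 1/R_2)$, with the sign convention that a radius is positive when the center of curvature lies on the image side; the function name is illustrative.

```python
import math

def thin_lens_focal_length(n, r1, r2):
    """Focal length of a thin lens in air from the lensmaker's equation.

    Radii use the convention: positive if the center of curvature is on the
    image side. A flat surface is represented by math.inf.
    """
    power = (n - 1.0) * (1.0 / r1 - 1.0 / r2)
    return 1.0 / power

# (R1, R2) in mm for the same glass, n = 1.5
prescriptions = {
    "double convex": (100.0, -100.0),
    "plano-convex, curved side towards object": (50.0, math.inf),
    "plano-convex, curved side towards image": (math.inf, -50.0),
}

focal_lengths = {
    name: thin_lens_focal_length(1.5, r1, r2)
    for name, (r1, r2) in prescriptions.items()
}

for name, f in focal_lengths.items():
    print(f"{name}: f = {f:.1f} mm")
```

All three shapes give the same paraxial focal length, so the choice among them is free to be made on the basis of the aberrations.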
5.4 Optical Imaging Systems and Sampling

Q factor
5.5 Optical System Rules of Thumb

1. If imaging with a singlet lens, the aberrations are smaller if the lens surface with more curvature
(shorter radius of curvature) is on the side of the longer conjugate. Since the transverse
magnification is smaller than 1 in most cases (distant object), the more curved side of the
lens should be towards the distant object. This divides the power of the surfaces more evenly
and minimizes the spherical aberration.
2. If imaging in visible light, the diameter of the diffraction spot in micrometers is approximately
equal to the f-number of the system.
3. The MTF at the Rayleigh limit is about 9% (www.normankoren.com/Tutorials/MTF1A.html).

Lenses are sharpest in the interval of about two stops between the (small) aperture where
diffraction starts to dominate and two stops smaller than the maximum aperture. For 35mm
lenses, the maximum aperture often is of the order of f/2 to f/2.8, so two stops smaller is typically
f/4 to f/5.6. The aperture at which diffraction starts to dominate depends on wavelength, but is generally
accepted as about f/22. Therefore the sharpest range for a 35mm lens is between about
f/5.6 and f/11. At larger apertures (smaller f/ numbers), resolution is limited by aberrations
(astigmatism, coma, etc.); at small apertures, resolution is limited by diffraction. The MTF
if the lens is used wide open is almost always poorer than the MTF at f/8 because of the
aberrations. Note that this discussion does not consider the effects of the sensor, just the lens.
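4. Image is visually unaberrated if the Strehl ratio $D \gtrsim 0.8$, which requires an RMS wavefront
error $\sigma_W \lesssim 0.075\,\lambda_0 \approx \lambda_0/14$ and corresponds roughly to a maximum wavefront error
$W_{max} \lesssim \lambda_0/4$.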
5. If imaging in visible light, the image appears to be in focus if the defocus distance measured
in micrometers is smaller than $(f/\#)^2$.
6. Depending on source, the resolution $r$ of lens in line pairs per mm is approximately

$$\frac{1390}{f/\#} \lesssim r \lesssim \frac{1600}{f/\#}$$
7. More to come...
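Several of the numerical rules above can be reproduced from standard diffraction formulas. The sketch below is one possible reading, not the derivation used in these notes: it assumes an Airy-disk diameter of $2.44\,\lambda N$, the Maréchal approximation $D \approx \exp[-(2\pi\sigma)^2]$ for the Strehl ratio, a quarter-wave defocus tolerance of $2\lambda N^2$, and a Rayleigh separation of $1.22\,\lambda N$, where $N$ is the f-number. The default wavelengths are illustrative values for visible light.

```python
import math

def airy_disk_diameter_um(f_number, wavelength_um=0.5):
    """Diameter of the Airy disk (to the first dark ring) in micrometers."""
    return 2.44 * wavelength_um * f_number

def strehl_from_rms(rms_waves):
    """Marechal approximation: Strehl ratio for an RMS wavefront error in waves."""
    return math.exp(-(2.0 * math.pi * rms_waves) ** 2)

def quarter_wave_defocus_um(f_number, wavelength_um=0.5):
    """Focus shift (micrometers) that gives a quarter wave of defocus at the pupil edge."""
    return 2.0 * wavelength_um * f_number ** 2

def rayleigh_resolution_lp_per_mm(f_number, wavelength_um=0.55):
    """Reciprocal of the Rayleigh separation 1.22 * lambda * N, in line pairs per mm."""
    return 1000.0 / (1.22 * wavelength_um * f_number)

for n in (2.8, 5.6, 11.0):
    print(f"f/{n}: spot {airy_disk_diameter_um(n):.1f} um, "
          f"defocus tolerance {quarter_wave_defocus_um(n):.0f} um, "
          f"resolution {rayleigh_resolution_lp_per_mm(n):.0f} lp/mm")

print(f"Strehl at 0.075 waves RMS: {strehl_from_rms(0.075):.3f}")
```

With these assumptions the spot diameter is within about 25% of the f-number, the defocus tolerance at a wavelength of 0.5 micrometers equals the square of the f-number, and the Rayleigh resolution falls between the two bounds quoted in rule 6.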
