Robert D. Guenther
Duke University
Durham, NC, USA
EDITORS
Duncan G. Steel
University of Michigan
Ann Arbor, MI, USA
Leopold Bayvel
London, UK
CONSULTING EDITORS
John E. Midwinter
University College London, London, UK
Alan Miller
University of St Andrews, Fife, UK
EDITORIAL ADVISORY BOARD
R Alferness, Lucent Technologies Bell Laboratories, Holmdel, NJ, USA
D J Brady, Duke University, Durham, NC, USA
J Caulfield, Diversified Research Corporation, Cornersville, TN, USA
R L Donofrio, Display Device Consultants LLC, Ann Arbor, MI, USA
J Dudley, Université de Franche-Comté, Besançon, France
J G Eden, University of Illinois, Urbana, IL, USA
H O Everitt, U.S. Army Research Office, Research Triangle Park, NC, USA
E L Hecht, Adelphi University, Garden City, NY, USA
H S Hinton, Utah State University, Logan, UT, USA
D Killinger, University of South Florida, Tampa, FL, USA
T A King, University of Manchester, UK
E Leith, University of Michigan, Ann Arbor, MI, USA
A Macleod, Thin Film Center, Inc., Tucson, AZ, USA
P F McManamon, Air Force Research Laboratory, Wright-Patterson AFB, OH, USA
J N Mait, U.S. Army Research Laboratory, Adelphi, MD, USA
D Phillips, Imperial College, London, UK
C R Pidgeon, Heriot-Watt University, Edinburgh, UK
R Pike, King's College London, UK
Preface
We live in a world powered by light, but much of the understanding of light was developed around the time of
the French Revolution. Before the 1950s, optics technology was viewed as well established and there was little
expectation of growth either in scientific understanding or in technological exploitation. The end of the Second
World War brought about huge growth in scientific exploration, and the field of optics benefited from that
growth. The key event was the discovery of methods for producing a source of coherent radiation, with the key
milestone being the demonstration of the first laser by Ted Maiman in 1960. Other lasers, nonlinear optical
phenomena, and technologies such as holography and optical signal processing, followed in the early 1960s. In
the 1970s the foundations of fiber optical communications were laid, with the development of low-loss glass
fibers and sources that operated in the wavelength region of low loss. The 1980s saw the most significant
technological accomplishment: the development of efficient optical systems and resulting useful devices. Now,
some forty years after the demonstration of the first coherent light source, we find that optics has become the
enabling technology in a wide range of areas: we find ourselves depending on CDs for data storage, on digital cameras and printers to produce our family photographs, on high-speed internet connections based on optical fibers, and on optics-based DNA sequencing systems, while our physicians make use of new therapies and diagnostic techniques founded on optics.
To contribute to such a wide range of applications requires a truly multidisciplinary effort drawing together
knowledge spanning many of the traditional academic boundaries. To exploit the accomplishments of the past
forty years and to enable a revolution in world fiber-optic communications, new modalities in the practice of
medicine, a more effective national defense, exploration of the frontiers of science, and much more, a resource
to provide access to the foundations of this field is needed. The purpose of this Encyclopedia is to provide a
resource for introducing optical fundamentals and technologies to the general technical audience for whom
optics is a key capability in exploring their field of interest.
Some 25 internationally recognized scientists and engineers served as editors. They helped in selecting the
topical coverage and choosing the over 260 authors who prepared the individual articles. The authors form an
international group who are expert in their discipline and come from every part of the technological
community spanning academia, government and industry. The editors and authors of this Encyclopedia hope
that the reader finds in these pages the information needed to provide guidance in exploring and utilizing
optics.
As Editor-in-Chief I would like to thank all of the topical editors, authors and the staff of Elsevier for each of
their contributions. Special thanks should go to Dr Martin Ruck of Elsevier, who provided not only
organizational skills but also technological knowledge which allowed all of the numerous loose ends to be tied.
B D Guenther
Editor-in-Chief
A
ALL-OPTICAL SIGNAL REGENERATION
O Leclerc, Alcatel Research & Innovation, Marcoussis, France

© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

The breakthrough of optical amplification, combined with the techniques of wavelength division multiplexing (WDM) and dispersion management, has made it possible to exploit a sizeable fraction of the optical fiber bandwidth (several terahertz). Systems based on a 10 Gbit/s per-channel bit-rate and showing capacities of several terabit/s, with transmission capabilities of hundreds or even thousands of kilometers, have reached the commercial arena.

While greater capacities and spectral efficiencies are likely to be reached with current technologies, there is potential economic interest in reducing the number of wavelength channels by increasing the channel rate (e.g., 40 Gbit/s). However, such a fourfold increase in the channel bit-rate clearly results in a significant increase in propagation impairments, stemming from the combined effects of noise accumulation, fiber dispersion, fiber nonlinearities, and inter-channel interactions, and contributing to two main forms of signal degradation. The first is related to the amplitude domain: the power levels of marks and spaces can suffer from random deviations arising from interaction between the signal and amplified spontaneous emission (ASE) noise, or with signals from other channels through cross-phase modulation (XPM), and from distortions induced by chromatic dispersion. The second type of signal degradation occurs in the time domain: the time position of pulses can also suffer from random deviations arising from interactions between the signal and ASE noise through fiber dispersion. Preservation of a high power contrast between '1' and '0', and of both amplitude fluctuations and timing jitter below some acceptable levels, is mandatory for high transmission quality, evaluated through bit-error-rate (BER) measurements or estimated by Q-factors. Moreover, in future optical networks, it appears mandatory to ensure a similarly high optical signal quality at the output of any node in the network, so as to enable successful transmission of the data over arbitrary distances.

Among the possible solutions to overcome such system limitations is the implementation of Optical Signal Regeneration, either in-line for long-haul transmission applications or at the output of network nodes. Such Signal Regeneration performs, or should be able to perform, three basic signal-processing functions: Re-amplifying, Re-shaping, and Re-timing, hence the generic acronym '3R' (Figure 1). When Re-timing is absent, one usually refers to the regenerator as a '2R' device, which has only re-amplifying and re-shaping capabilities. Thus, full 3R regeneration with retiming capability requires clock extraction.

Given the system impairments after some transmission distance, two solutions remain for extending the actual reach of an optical transmission system or the scalability of an optical network. The first consists in segmenting the system into independent trunks, with full electronic repeater/transceivers at the interfaces (we shall refer to this as 'opto-electronic regeneration' or O/E Regeneration henceforth). The second solution, i.e., all-optical Regeneration, is not simply the optical version of the first with higher bandwidth capability, but performs the same signal-restoring functions with far reduced complexity. At this point, it should be noted that Optical 3R techniques are not necessarily void of any electronic functions (e.g., when using electronic clock recovery and O/E modulation), but their main feature is that these electronic functions are narrowband (as opposed to broadband in the case of electronic regeneration).

Some key issues have to be considered when comparing such Signal Regeneration approaches. The first is that today's and future optical transmission systems and/or networks are WDM networks.
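The Q-factor mentioned above is the standard Gaussian-noise figure of merit for a binary optical signal; under the usual assumptions it maps to the BER through BER = ½ erfc(Q/√2). A minimal numeric illustration (the relation is textbook-standard; the sample Q values are arbitrary):

```python
import math

def ber_from_q(q: float) -> float:
    """BER for Gaussian mark/space noise: BER = 0.5 * erfc(Q / sqrt(2))."""
    return 0.5 * math.erfc(q / math.sqrt(2.0))

# Q = 6 is the classic benchmark corresponding to a BER of about 1e-9
for q in (3.0, 6.0, 7.0):
    print(f"Q = {q:.1f}  ->  BER = {ber_from_q(q):.2e}")
```

Higher Q (larger mark/space separation relative to the noise) drives the BER down super-exponentially, which is why small OSNR degradations translate into large BER penalties.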
Figure 1 Principle of 3R regeneration, as applied to NRZ signals. (a) Re-Amplifying; (b) Re-Shaping; (c) Re-Timing. NB: Such eye
diagrams can be either optical or electrical eye diagrams.
element can operate either on electrical signals (standard electrical DFF), provided that optical → electrical and electrical → optical signal conversion stages are added, or directly on optical signals, using the different techniques described below. The clock signal can be of an electrical nature – as for the electrical decision element in an O/E regenerator – or either an electrical or a purely optical signal in all-optical regenerators.

Prior to reviewing and describing the various current technology alternatives for such Optical Signal Regeneration, the issue of the actual characterization of regenerator performance needs to be explained and clarified. As previously mentioned, the core element of any Signal Regenerator is the decision element, showing a nonlinear transfer function that can be of varying steepness. As will be seen in Figure 3, the actual regenerative performance of the regenerator will indeed depend upon the degree of nonlinearity of the decision element transfer function.

Figure 3 shows the principle of operation of a regenerator incorporating a decision element with two steepnesses of the nonlinear transfer function. In any case, the '1' and '0' symbols' amplitude probability densities (PD) are squeezed after passing through the decision element. However, depending upon the addition of a clock reference for triggering the decision element, the symbol arrival-time PD will also be squeezed (clocked decision = 3R regeneration) or enlarged (no clock reference = 2R regeneration), resulting in conversion of amplitude fluctuations into time-position fluctuations.

As for system performance – expressed through BER – the regenerative capabilities of any regenerator simultaneously depend upon both the output amplitude and arrival-time PD of the '1' and '0' symbols. In the unusual case of 2R regeneration (no clocked decision), a tradeoff then has to be derived, considering both the reduction of the amplitude PD and the enlarged arrival-time PD induced by the regenerator, to ensure sufficient signal improvement. In Figure 3a, we consider a step function for the transfer function of the decision element. In this case, the amplitude PDs are squeezed to Dirac PDs after the decision element and, depending upon the addition or not of a clock reference, the arrival-time PD is reduced (3R) or dramatically enlarged (2R). In Figure 3b, the decision element exhibits a moderately nonlinear transfer function. This results in an asymmetric and less-pronounced squeezing of the amplitude PD compared to the previous case, but in turn in a significantly less enlarged arrival-time PD when no clock reference is added (2R regeneration). Comparison of these two decision elements with different nonlinear transfer functions indicates that, for 3R regeneration applications, the more nonlinear the transfer function of the decision element, the better the performance, the ideal case being the step function. In the case of 2R regeneration applications, a tradeoff between the actual reduction of the amplitude PD and the enlargement of the timing PD is to be derived, and clearly depends upon the actual shape of the nonlinear transfer function of the decision element.

Qualification of Signal Regenerator Performance
Figure 3 Signal Regeneration process using Nonlinear Gates. (a) Step transfer function (= electronic DFF); (b) 'moderately' nonlinear transfer function. As an illustration of the regenerator operation, the '1' and '0' symbol amplitude probability densities (PD) and arrival-time probability densities (PD) are shown in light gray and dark gray, respectively.
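The amplitude-PD squeezing compared in Figure 3 can be illustrated with a small simulation. The sketch below (illustrative parameters only, not taken from the article) passes noisy mark levels through an ideal step decision and through a 'moderately' nonlinear (sigmoid) transfer function, and compares the residual amplitude spread:

```python
import math
import random
from statistics import pstdev

random.seed(1)

def step_gate(x, threshold=0.5):
    """Ideal step decision: squeezes the amplitude PDs to Dirac-like peaks."""
    return 1.0 if x > threshold else 0.0

def soft_gate(x, threshold=0.5, steepness=8.0):
    """'Moderately' nonlinear transfer function (sigmoid of finite steepness)."""
    return 1.0 / (1.0 + math.exp(-steepness * (x - threshold)))

# Noisy marks ('1' symbols) at the regenerator input
marks = [random.gauss(1.0, 0.1) for _ in range(2000)]

sigma_in = pstdev(marks)
sigma_step = pstdev(step_gate(x) for x in marks)  # pstdev accepts any iterable
sigma_soft = pstdev(soft_gate(x) for x in marks)

print(f"input spread      : {sigma_in:.4f}")
print(f"after step gate   : {sigma_step:.4f}")
print(f"after sigmoid gate: {sigma_soft:.4f}")
```

With the step gate the marks collapse onto a single level, while the softer gate leaves a reduced but nonzero spread, mirroring the (a)/(b) comparison of Figure 3.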
To further illustrate the impact of the actual shape of the nonlinear transfer function of the decision element in 3R applications, the theoretical evolution of the BER with the number of concatenated regenerators has been plotted for regenerators having different nonlinear responses. Figure 4 shows the numerically calculated evolution of the BER of a 10 Gbit/s NRZ signal, with fixed optical signal-to-noise ratio (OSNR) at the input of the 3R regenerator, as a function of the number of cascaded regenerators incorporating nonlinear gates with nonlinear transfer functions of different depths.

From Figure 4, different behaviors can be seen depending on the shape of the nonlinear function. As previously stated, the best regeneration performance is obtained with an ideal step function (case (a)), which is actually the case for an O/E regenerator using an electronic decision flip-flop (DFF). In that case, the BER increases linearly (i.e., more errors) along the cascade. Conversely, when the nonlinearity is reduced (cases (b) and (c)), both BER and noise accumulate, until the concatenation of nonlinear functions reaches some steady-state pattern, from which the BER increases linearly. Concatenation of nonlinear devices thus magnifies the shape differences in their nonlinear response, and hence their regenerative capabilities. Moreover, as can be seen in Figure 4, all the curves standing for different regeneration efficiencies pass through a common point defined after the first device. This clearly indicates that it is not possible to qualify the regenerative capability of any regenerator by considering the output signal after only one regenerator. Indeed, the BER is the same for either a 3R regenerator or a mere amplifier if only measured after a single element. This originates from the initial overlap between the noise distributions associated with marks and spaces, which cannot be suppressed but only minimized by a single decision element through threshold adjustment.

As a general result, the actual characterization of the regenerative performance of any regenerator should in fine be conducted considering a cascade of regenerators. In practice this can easily be done with the experimental implementation of the regenerator under study in a recirculating loop. Moreover, such an investigation tool will also enable access to the regenerator performance with respect to the transmission capabilities of the regenerated signal, which should not be overlooked.

Let us now consider the physical implementation of such all-optical Signal Regenerators, along with the key features offered by the different alternatives.

All-Optical 2R/3R Regeneration Using Optical Nonlinear Gates

Prior to describing the different solutions for all-optical in-line Optical Signal Regeneration, it should be mentioned that, since the polarization states of the optical data signals cannot be preserved during propagation, it is required that the regenerator exhibit an extremely low polarization sensitivity. This clearly translates into a careful optimization of all the different optical components making up the 2R/3R regenerator. It should be noted that this also applies to the O/E solutions, but there it is of limited impact, since only the photodiode has to be polarization insensitive.

Figure 5 illustrates the generic principle of operation of an all-optical 3R Regenerator using optical nonlinear gates. Contrary to what occurs in an O/E regenerator, where the extracted clock signal drives the electrical decision element, here the incoming and distorted optical data signal triggers the nonlinear gate, hence generating a switching window.
Figure 4 Evolution of the BER with concatenated regenerators, for nonlinear gates with nonlinear transfer functions of decreasing depth from case (a) (step function) to case (c).
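The cascading argument above can be sketched with a small Monte-Carlo model (the parameters are illustrative, not those used for Figure 4): noise is added span by span, and a plain amplifier chain is compared against a chain with an ideal step regenerator after every span:

```python
import random

random.seed(42)

N_BITS, N_SPANS, SIGMA = 4000, 20, 0.18
bits = [random.choice((0.0, 1.0)) for _ in range(N_BITS)]

def propagate(signal, regenerate):
    """Add Gaussian (ASE-like) noise per span; optionally re-decide each span."""
    out = list(signal)
    for _ in range(N_SPANS):
        out = [x + random.gauss(0.0, SIGMA) for x in out]
        if regenerate:
            out = [1.0 if x > 0.5 else 0.0 for x in out]  # ideal step gate
    return out

def count_errors(received):
    return sum(1 for b, x in zip(bits, received) if (x > 0.5) != (b > 0.5))

err_plain = count_errors(propagate(bits, regenerate=False))
err_regen = count_errors(propagate(bits, regenerate=True))
print(f"errors without regeneration     : {err_plain}")
print(f"errors with per-span 3R decision: {err_regen}")
```

Without regeneration the noise variance grows with the number of spans, while with a per-span step decision only the small per-span error probability accumulates, roughly linearly, as described for case (a) of Figure 4.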
of the device for rejecting the signal at the λ1 wavelength, but operation of the MZI is limited by its speed. At this point, one should mention that the photo-induced modulation effects in SOAs are intrinsically limited in speed by the gain recovery time, which is a function of the carrier lifetime and the injection current. An approach referred to as differential operation mode (DOM), illustrated in Figure 6, which takes advantage of the MZI's interferometric properties, makes it possible to artificially increase the operation speed of such 'slow' devices up to 40 Gbit/s.

Figure 6 Schematic and principle of operation of the SOA-MZI in differential mode.

As discussed earlier, the nonlinear response is a key parameter for regeneration efficiency. Combining two interferometers is a straightforward means to improve the nonlinearity of the decision element transfer function, and hence the regeneration efficiency. This approach was validated at 40 Gbit/s using a cascade of two SOA-MZIs (see Figure 7 (left)). Such a scheme offers the advantage of restoring the data polarity and wavelength, hence making the regenerator inherently transparent. Finally, the second conversion stage can be used as an adaptation interface to the transmission link, achieved through chirp tuning in this second device.

Such an Optical 3R Regenerator was upgraded to 40 Gbit/s, using DOM in both SOA-MZIs, with validation in a 40 Gbit/s RZ loop transmission. The 40 Gbit/s eye diagrams monitored at the regenerator output after 1, 10, and 100 circulations are shown in Figure 7 (right) and remain unaltered by distance. With this all-optical regenerator structure, the minimum OSNR tolerated by the regenerator (1 dB sensitivity penalty at 10⁻¹⁰ BER) was found to be as low as 25 dB/0.1 nm. Such results clearly illustrate the high performance of this SOA-based regenerator structure for 40 Gbit/s optical data signals.

Figure 7 Optimized structure of a 40 Gbit/s SOA-based 3R regenerator. 40 Gbit/s eye diagram evolution: (a) B-to-B; (b) 1 lap; (c) 10 laps; (d) 100 laps.

Such a complex mode of operation for addressing 40 Gbit/s bit-rates will probably be discarded when we consider the recent demonstration of standard-mode wavelength conversion at 40 Gbit/s, which uses a newly-designed active-passive SOA-MZI incorporating evanescent-coupling SOAs. The device architecture is flexible in the number of SOAs, thus enabling easier optimization of the operation and reduced power consumption, leading to simplified architectures and operation for 40 Gbit/s optical 3R regeneration.

Based on the same concept of wavelength conversion for Optical 3R Regeneration, it should be noted that many devices have been proposed and experimentally validated as wavelength converters at rates up to 84 Gbit/s, but with cascadability issues still to be demonstrated in order to assess their actual regenerative properties.

Optical Clock Recovery (CR)

Next to the decision, the CR is the second key function in 3R regenerators. One possible approach for CR uses electronics, while another uses only optics. The former goes with O/E conversion by means of a photodiode and subsequent E/O conversion through a modulator. This conversion becomes more complex and power-hungry as the data-rate increases. It is clear that the maturity of electronics gives a current advantage to this approach. But considering the pros and cons of electronic CR for cost-effective implementation, the all-optical approach seems more promising, since full regenerator integration is potentially possible with reduced power consumption. In this view, we shall focus here on the optical approach, and more specifically on the self-pulsating effect in three-section distributed feedback (DFB) lasers or, more recently, in distributed Bragg reflector (DBR) lasers. Recent experimental results illustrate the potential of such devices for high bit-rates (up to 160 Gbit/s), broad dynamic range, broad frequency tuning, polarization insensitivity, and relatively short locking times (1 ns). This last feature makes these devices good candidates for operation in asynchronous-packet regimes.

Optical Regeneration by Saturable Absorbers

We next consider saturable absorbers (SA) as nonlinear elements for optical regeneration. Figure 8 (left) shows a typical SA transfer function and illustrates the principle of operation. When illuminated with an optical signal with peak power below some threshold (P_sat), the photonic absorption of the SA is high and the device is opaque to the signal (low transmittance). Above P_sat, the SA transmittance rapidly increases and asymptotically saturates to transparency (passive loss being overlooked). Such a nonlinear transfer function only applies to 2R optical regeneration.

Different technologies for implementing SAs are available, but the most promising approach uses semiconductors. In this case, the SA relies upon the control of carrier dynamics through the material's recombination centers. Parameters such as the on-off contrast (ratio of the transmittances at high and low incident powers), the recovery time (1/e), and the saturation energy are key to device optimization. In the following, we consider a newly-developed ion-irradiated MQW-based device incorporated in a micro-cavity and shown in Figure 8 (right). The device operates as a reflection-mode vertical cavity, providing both a high on/off extinction ratio, by canceling reflection at low intensity, and a low saturation energy of 2 pJ. It is also intrinsically polarization-insensitive. Heavy ion-irradiation of the SA ensures recovery times (at 1/e) shorter than 5 ps (hence compatible with bit-rates above 40 Gbit/s), while maintaining a dynamic contrast in excess of 2.5 dB at a 40 GHz repetition rate.

The regenerative properties of SAs make it possible to reduce accumulated amplified spontaneous emission (ASE) in the '0' bits, resulting in a higher contrast between marks and spaces, hence increasing system performance. Yet SAs do not suppress intensity noise in the marks, which makes the regenerator incomplete. A solution for this noise suppression is optical filtering with nonlinear (soliton) pulses. The principle is as follows. In the absence of chirp, the soliton temporal width scales as the reciprocal of its spectral width (Fourier-transform limit) times its intensity (fundamental soliton relation). Thus, an increase in pulse intensity corresponds to both time narrowing and spectral broadening. Conversely, a decrease in pulse intensity corresponds to time broadening and spectral narrowing. Thus, the filter causes higher loss when the intensity increases, and lower loss when the intensity decreases. The filter thus acts as an automatic power control (APC) in feed-forward mode, which causes power stabilization.

The resulting 2R regenerator (composed of the SA and the optical filter) is fully passive, which is of high interest for submarine systems, where the power consumption must be minimal, but it does not include any control in the time domain (no Re-timing). System demonstrations of such passive SA-based Optical Regeneration have been reported with a 20 Gbit/s single-channel loop experiment. Implementation of the SA-based 2R Regenerator with a 160 km loop periodicity made it possible to double the error-free distance (Q = 15.6 dB, or 10⁻⁹ BER) of a 20 Gbit/s RZ signal. So as to extend the capability of passive 2R regeneration to 40 Gbit/s systems, an improved configuration was derived from numerical optimization and experimentally demonstrated in a 40 Gbit/s WDM-like, dispersion-managed loop transmission, showing more than a fourfold increase in the WDM transmission distance at 10⁻⁴ BER (1650 km without the SA-based regenerator and 7600 km when implementing the 2R regenerator with a 240 km periodicity).
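The SA power-transfer characteristic described in this section can be sketched with a simple saturation law. The functional form and all the numbers below are illustrative assumptions, not measured data from the device of Figure 8:

```python
import math

def sa_transmittance(p, p_sat=1.0, t_low=0.25, t_high=0.55):
    """Illustrative saturable-absorber model: low transmittance well below
    P_sat, rising and saturating toward t_high (residual passive loss kept)."""
    s = (p / p_sat) / (1.0 + p / p_sat)   # saturation factor, goes 0 -> 1
    return t_low + (t_high - t_low) * s

def contrast_db(p_on, p_off):
    """On-off contrast: transmittance ratio at high vs. low power, in dB."""
    return 10.0 * math.log10(sa_transmittance(p_on) / sa_transmittance(p_off))

# '1' pulses well above saturation, '0' residue (ASE) well below
print(f"dynamic contrast: {contrast_db(10.0, 0.1):.2f} dB")
```

Low-power ASE in the '0' slots sees the low transmittance and is attenuated, while the marks pass with much less loss; the few-dB contrast obtained with these toy numbers is of the same order as the 2.5 dB dynamic contrast quoted for the real device.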
Figure 8 (a) Saturable Absorber (SA) ideal transfer function. (b) Structure of Multi-Quantum Well SA.
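The soliton-filtering argument above can be made semi-quantitative with the fundamental-soliton relation (in normalized units, P0·T0² = const) and the transform-limit scaling of the spectral width with 1/T0; the crude Gaussian filter model below is an assumption for illustration only:

```python
import math

def soliton_width(p0):
    """Fundamental-soliton relation in normalized units: P0 * T0^2 = 1."""
    return 1.0 / math.sqrt(p0)

def filter_transmission(p0, bw=2.0):
    """Fraction of the pulse spectrum passed by a fixed narrowband filter,
    modeled crudely as exp(-(spectral_width / filter_bandwidth)^2)."""
    spectral_width = 1.0 / soliton_width(p0)   # transform limit: ~1/T0
    return math.exp(-((spectral_width / bw) ** 2))

for p0 in (0.8, 1.0, 1.2):
    print(f"P0 = {p0:.1f} -> T0 = {soliton_width(p0):.3f}, "
          f"filter transmission = {filter_transmission(p0):.3f}")
```

A power increase narrows the pulse, broadens its spectrum, and therefore raises the loss at the fixed filter, while a power decrease does the opposite: exactly the feed-forward automatic power control described in the text.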
Such a result illustrates the potentially high interest of such passive optical 2R regeneration for long-haul transmission (typically in noise-limited systems), since the implementation of the SA increases the system's robustness to OSNR degradation without any extra power consumption. Reducing both the saturation energy and the insertion loss, along with increasing the dynamic contrast, represent the key future device improvements. Regeneration of WDM signals with the same device, such as one SA chip with multiple fibers implemented between Mux/Demux stages, should also be thoroughly investigated. In this respect, the SA wavelength selectivity in quantum dots could possibly be advantageously exploited.

Synchronous Modulation Technique

All-optical 3R regeneration can also be achieved through in-line synchronous modulation (SM) associated with narrowband filtering (NF). Figure 9 shows the basic layout of such an Optical 3R Regenerator. It is composed of an optical filter followed by an Intensity and Phase Modulator (IM/PM) driven by a recovered clock. Periodic insertion of SM-based modulators along the transmission link provides efficient jitter reduction and asymptotically controls the ASE noise level, resulting in virtually unlimited transmission distances. The Re-shaping and Re-timing provided by the IM/PM intrinsically require nonlinear (soliton) propagation in the trunk fiber following the SM block. Therefore, one can refer to the approach as distributed optical regeneration. This contrasts with lumped regeneration, where 3R is completed within the regenerator (see above, Optical Regeneration using nonlinear gates) and is independent of the line transmission characteristics.

However, when using a new approach referred to as 'black box' optical regeneration (BBOR), it is possible to make the SM regeneration function and the transmission work independently, in such a way that any type of RZ signal (soliton or non-soliton) can be transmitted through the system. The BBOR technique includes an adaptation stage for the incoming RZ pulses in the SM-based regenerator, which ensures high regeneration efficiency regardless of the RZ signal format (linear RZ, DM-soliton, C-RZ, etc.). This is achieved using a local and periodic soliton conversion of the RZ pulses, by means of launching an adequate power into some length of fiber with anomalous dispersion. The actual experimental demonstration of the BBOR approach, and of its superiority over the 'classical' SM-based scheme for DM transmission, was carried out in 40 Gbit/s DM loop transmission. Under these conditions, one can then independently exploit dispersion management (DM) techniques for increasing the spectral efficiency in long-haul transmission, while ensuring high transmission quality through BBOR.

One of the key properties of the SM-based all-optical Regeneration technique relies on its WDM compatibility. The first (Figure 10, left) and straightforward solution to apply Signal Regeneration to WDM channels amounts to allocating a regenerator to each WDM channel. The second consists in sharing a single modulator, thus processing the WDM channels at once in a serial fashion. This approach requires WDM synchronicity, meaning that all bits be synchronous with the modulation, which can be achieved either by use of appropriate time-delay lines located within a DMux/Mux apparatus (Figure 10, upper right), or by making the WDM channels inherently time-coincident at specific regenerator locations (Figure 10, bottom right). Clearly, the serial regeneration scheme is far simpler and more cost-effective than the parallel version; however, optimized re-synchronization schemes still remain to be developed for realistic applications.
Figure 9 Basic layout of the all-optical Regenerator by Synchronous Modulation and Narrowband Filtering and illustration of the
principle of operation.
Figure 10 Basic implementation schemes for WDM all-Optical Regeneration. (a) parallel asynchronous; (b) serial re-synchronized;
(c) serial self-synchronized.
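The retiming action of the synchronous modulator can be sketched numerically: a pulse arriving early or late sees the flank of the clock-driven transmission window, which pulls its centroid back toward the clock time. Everything below (window shape, pulse widths, normalized units) is an illustrative assumption, not a model of an actual IM/PM:

```python
import math

DT = 0.001
T = [i * DT for i in range(-3000, 3001)]  # normalized time axis, -3 .. 3

def pulse(t, t0, width=0.5):
    """Gaussian pulse arriving with timing offset t0."""
    return math.exp(-(((t - t0) / width) ** 2))

def sm_window(t, period=4.0):
    """Clock-synchronous intensity-modulation window centered on t = 0."""
    return math.cos(math.pi * t / period) ** 2

def centroid(intensity):
    return sum(t * i for t, i in zip(T, intensity)) / sum(intensity)

offset = 0.4  # pulse arrives late by 0.4 time units
before = [pulse(t, offset) for t in T]
after = [pulse(t, offset) * sm_window(t) for t in T]
print(f"centroid before modulator: {centroid(before):+.3f}")
print(f"centroid after modulator : {centroid(after):+.3f}")
```

Each pass through the modulator pulls the pulse part-way back toward the clock time, so that, repeated every span, the timing jitter remains bounded; in the real device the phase modulation, combined with the line dispersion and soliton dynamics, completes this retiming.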
Experimental demonstration of this concept was assessed by means of a 4 × 40 Gbit/s dispersion-managed transmission over 10 000 km (BER < 5·10⁻⁸), in which a single modulator was used for the simultaneous regeneration of the 4 WDM channels.

Considering next all-optical regeneration schemes with ultra-high-speed potential, a compact and loss-free 40 Gbit/s Synchronous Modulator, based on an optically-controlled SOA-MZI, was proposed and loop-demonstrated at 40 Gbit/s with an error-free transmission distance in excess of 10 000 km. Moreover, the potential ultra-high-speed operation of this improved BBOR scheme was recently experimentally demonstrated by means of an 80 GHz clock conversion with appropriate characteristics through the SOA-MZI. One should finally mention all-fiber-based devices, such as NOLMs and NALMs, for addressing ultra-high-speed SM-based Optical Regeneration, although no successful experimental demonstrations have been reported so far in this field.

Conclusion

In summary, optical solutions for Signal Regeneration present many key advantages. They are the only solutions to date that can possibly ensure WDM compatibility of the regeneration function (mostly 2R related). Such optical devices clearly exhibit the best 2R regeneration performance (with respect to O/E solutions), as a result of their moderately nonlinear transfer function (which in turn can be considered a drawback in 3R applications), but the optimum configuration is still to be clearly derived and identified depending upon the system application. Optics also allows one to foresee, and possibly target, ultrafast applications above 40G for signal regeneration, if needed.

Among the current drawbacks, one should mention the relative lack of wavelength/format flexibility of these solutions (compared to O/E solutions): it is complex or difficult to restore the input wavelength, to address any C-band wavelength at the output of the device, or to successfully regenerate modulation formats other than RZ. In that respect, investigations should be conducted to derive new optical solutions capable of processing more advanced modulation formats at 40G. Finally, the fact that the nonlinear transfer function of the optical gate is in general triggered by the instantaneous power of the input signal also turns out to be a drawback, since it requires control circuitry. The issue of the cost (footprint, power consumption, etc.) of these solutions, compared to O/E ones, is still open. In this respect, purely optical solutions incorporating all-optical clock recovery, the performance of which is still to be technically assessed, are of high interest for reducing costs. Complete integration of an all-optical 2R/3R regenerator, or of such parallel regenerators, onto a single semiconductor chip should also contribute to making all-optical solutions cost-attractive, even though acceptable performance of such fully integrated devices is still to be demonstrated.

From today's status concerning the two alternative approaches for in-line regeneration (O/E or all-optical), it is safe to say that the choice between either solution will be primarily dictated by both engineering and economic considerations. It will result from a tradeoff between overall system performance, system complexity and reliability, availability, time-to-market, and rapid returns from the technology investment.

See also
Interferometry: Overview. Optical Amplifiers: Semiconductor Optical Amplifiers. Optical Communication Systems: Wavelength Division Multiplexing.
BABINET'S PRINCIPLE

Babinet's principle states that

B(x) = G - A(x)

must be the Fraunhofer diffraction field of the complementary aperture. We may rewrite this equation for the diffraction field as

\tilde{E}_P = \frac{ia\,e^{-ikD}}{\lambda r D} \iint f(x,y)\, e^{-\frac{ik}{2r}\left[(x-x_0)^2+(y-y_0)^2\right]}\, dx\, dy   [2]

where the distance between the source and observation point is […]

Figure 2 Geometry for the analysis of Fresnel diffraction of a circular aperture. Reproduced with permission from Guenther R (1990) Modern Optics. New York: Wiley. Copyright © 1990 John Wiley & Sons.

We assume that the transmission function of the aperture is a constant, f(r, φ) = 1, to make the integral as simple as possible. After performing the integration over the angle φ, [3] can be written as

\tilde{E}_{Pa} = \frac{ia\pi\,e^{-ikD}}{D} \int_0^{a} e^{-i\pi\rho^2/r\lambda}\, \frac{2\rho\, d\rho}{r\lambda}   [4]

Performing the integration of [4] results in the Fresnel diffraction amplitude

\tilde{E}_{Pa} = \frac{a}{D}\, e^{-ikD} \left(1 - e^{-i\pi a^2/r\lambda}\right)   [5]

The intensity of the diffraction field is

I_{Pa} = 2I_0 \left(1 - \cos\frac{\pi a^2}{r\lambda}\right) = 4I_0 \sin^2\frac{\pi a^2}{2r\lambda}   [6]

The result of the application of Babinet's principle is the conclusion that, at the center of the geometrical shadow cast by the disk, there is a bright spot with the same intensity as would exist if no disk were present:

I_{Pd} = I_0

This is Poisson's spot. The intensity of this spot is independent of the choice of the observation point along the z-axis.

See also

Diffraction: Fraunhofer Diffraction; Fresnel Diffraction.
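The Poisson spot result can be checked numerically from eqns [5] and [6]. The sketch below (not from the article; the wavelength and geometry are illustrative assumptions) evaluates the on-axis aperture field, forms the complementary disk field by Babinet's principle, and confirms that the shadow-center intensity equals I₀ for any disk radius:

```python
import numpy as np

# On-axis Fresnel field of a circular aperture of radius a, eqn [5],
# normalized so that the unobstructed field is 1 (i.e., I0 = 1).
def aperture_field(a, r, lam):
    return 1.0 - np.exp(-1j * np.pi * a**2 / (r * lam))

# Babinet's principle: disk field = free field - aperture field.
def disk_field(a, r, lam):
    return 1.0 - aperture_field(a, r, lam)

lam, r = 633e-9, 1.0   # assumed: HeNe wavelength, 1 m to the observation point
for a in (0.1e-3, 0.5e-3, 1.0e-3):
    I_ap = abs(aperture_field(a, r, lam))**2    # oscillates per eqn [6]
    I_disk = abs(disk_field(a, r, lam))**2      # Poisson's spot
    print(f"a = {a*1e3:.1f} mm: I_Pa/I0 = {I_ap:.4f}, I_Pd/I0 = {I_disk:.4f}")
```

The aperture intensity reproduces the 4I₀ sin²(πa²/2rλ) oscillation of eqn [6], while the disk intensity stays pinned at I₀, independent of a.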
CHAOS IN NONLINEAR OPTICS

where κ is the cavity decay rate, γ⊥ is the decay rate of atomic polarization, γ∥ is the decay rate of population inversion, and λ is the pumping parameter. Described in this way, the laser acts as a single oscillator in much the same way as a mechanical nonlinear oscillator, where laser output represents damping of the oscillator and excitation of the atoms is the driving mechanism to sustain lasing (oscillation). For ordered or coherent emission, it is necessary for one or both of the material variables (which are responsible for further lasing emission) to respond sufficiently fast to ensure a phase correlation with the existing cavity field. This is readily obtained in many typical lasers with output mirrors of relatively high reflectivity, since the field amplitude within the laser cavity will then vary slowly compared to the fast material variables, which may then be considered through their equilibrium values. This situation, commonly referred to as adiabatic elimination of fast variables, reduces the dynamics of lasing action to that of one (or two) variables, the field (and population), all others being forced to adapt constantly to the slowly varying field variable. Our familiar understanding of DC laser action presupposes such conditions to hold. However, when, for example, the level of excitation of the atoms is increased beyond a certain value, i.e., the second laser threshold, all three dynamical variables may have to be considered, which satisfies the minimum requirement of three degrees of freedom (variables) for a system to display chaos. So this simple laser is capable of chaotic motion, for which the emission is aperiodic in time and has a broadband spectrum. Prediction of such behavior was initially made by Haken in 1975 through establishing the mathematical equivalence of the Maxwell–Bloch equations to those derived earlier by Lorenz describing chaotic motion in fluids.

Nonlinear Dynamics and Chaos in Lasers

In general there is no analytical approach for solving nonlinear systems such as the Maxwell–Bloch equations. Instead, solutions are obtained by numerical means and analyzed through geometric methods originally developed by Poincaré as early as 1892. This involves the study of the topological structures of the dynamical trajectories in the phase space of the variables describing the system. If an initial condition of a dissipative nonlinear dynamical system, such as a laser, is allowed to evolve for a long time, the system, after all the transients have died out, will eventually approach a restricted region of the phase space called an attractor. A dynamical system can have more than one attractor, in which case different initial conditions lead to different types of long-time behavior. The simplest attractor in phase space is a fixed point, which is a solution with just one set of values for its dynamical variables; the nonlinear system is attracted towards this point and stays there, giving a DC solution. For other control conditions the system may end up making a periodic motion; the attractor of this motion is called a limit cycle. However, when the operating conditions exceed a certain critical value, the periodic motion of the system breaks down into a more complex dynamical trajectory, which never repeats. This motion represents a third kind of attractor in phase space, called a chaotic or strange attractor.

Figure 1 shows a sequence of trajectories of the Lorenz equations in the phase space of (x, y, z) on increasing the value of one of the control parameters; this corresponds to (E, P, D) in the Maxwell–Bloch equations on increasing the laser pump. For a control setting near zero, all trajectories approach stable equilibrium at the origin, the topological structure of the basin of attraction being hyperbolic about zero (Figure 1a). For the laser equations, this corresponds to operation below lasing threshold, for which the magnitude of the control parameter (laser gain) is insufficient to produce lasing. As the control parameter is increased, the basin lifts at its zero point to create an unstable fixed point there, but with now two additional fixed points located in the newly formed troughs on either side of the zero point, which is now a saddle point. This is illustrated in Figure 1b, where the system eventually settles to one or other of the two new and symmetrical fixed points, depending on the initial conditions. This corresponds to DC or constant lasing as defined by the parameter values of one or other of these points, which are indistinguishable. Pictorially, think of a conventional saddle shape comprising a hyperbolic and an inverted hyperbolic form in mutually perpendicular planes, connected tangentially at the origin. With the curve of the inverted hyperbola turned up at its extremity, and filling in the volume with similar profiles which allow the two hyperbolic curves to merge into a topological volume, one sees that a ball placed at the origin is constrained to move most readily down either side of the inverted hyperbolas into one or other of the troughs formed by this volume. For the Lorenz system, chaotic behavior occurs at larger values of the control parameter, when all three fixed points become saddles. Since none of the equilibrium points are now attracting, the behavior of the system cannot be a steady motion. Although perhaps difficult to visualize topologically, it is then possible to find a region in this surface enclosing all three points and
Figure 1 Trajectories in the three-dimensional phase space of the Lorenz attractor on increasing one of the control parameters. (Reproduced with permission from Thompson JMT and Stewart HB (1987) Nonlinear Dynamics and Chaos. New York: John Wiley.)
large enough that no trajectories leave the region. Thus, all initial conditions outside the region evolve into the region and remain inside from then on. A corresponding chaotic trajectory is shown in Figure 1c. A point spirals outward from the proximity of one of the new saddle points until the motion brings it under the influence of the symmetrically placed saddle, the trajectory then heading towards the center of this region, from where outward spiraling again occurs. The spiraling out and switching over continues forever, though the trajectory never intersects itself. In this case, arbitrarily close initial conditions lead to trajectories which, after a sufficiently long time, diverge widely. Since a truly exact assignment of the initial condition is never possible, even numerically, a solution comprising several such trajectories therefore evolves and, as a consequence, long-term predictability is impossible. This is in marked contrast to the fixed point and limit cycle attractors, which settle down to the same solutions. A recording of one of these variables in time, say for a laser the output signal amplitude (in practice recorded as signal intensity, proportional to the square of the field amplitude), then gives oscillatory emission of increasing intensity with sudden discontinuities (resulting from flipping from one saddle to the other in Figure 1c), as expected.

While the Lorenz/Haken model is attractive for its relative simplicity, many practical lasers cannot be reduced to this description. Nevertheless, the
underlying topology of the trajectory of solutions in phase space for these systems is often found to be relatively simple and quite similar, as for three-level models descriptive of optically pumped lasers. An experimental example of such behavior for a single-mode far-infrared molecular laser is shown in Figure 2.

Figure 2 Lorenz-type chaos in the NH3 laser (emitting at 81 μm), optically pumped by an N2O laser.

In other lasers, for which there is adiabatic elimination of one or other of the fast variables, chaotic behavior is precluded if, as a consequence, the number of variables of the system is reduced to fewer than three. Indeed this is the case for many practical laser systems. For these systems the addition of an independent external control parameter, such as cavity length or loss modulation, has been extensively used as a means of providing the extra degrees of freedom. Examples include gas lasers with saturable absorbers, solid-state lasers with loss modulation, and semiconductor lasers with external cavities, to name but a few. In contrast, for multimode rather than single-mode lasers, intrinsic modulation of inversion (or photon flux) by multimode parametric interaction ensures the additional degrees of freedom. Furthermore, when the field is detuned from gain center, the dynamical variables E, P, D are complex, providing five equations for single-mode systems, which is more than sufficient to yield deterministic chaos for suitable parameter values. Also of significance is the remarkably low threshold found for the generation of instabilities and chaos in single-mode laser systems in which the gain is inhomogeneously broadened, an example being the familiar He–Ne laser.

Erratic and aperiodic temporal behavior of any of the system's variables implies a corresponding continuous spectrum for its Fourier transform, which is thus a further signature of chaotic motion. Time series, power spectra, and routes to chaos collectively provide evidence of deterministic behavior. Of the wide range of possible routes to chaos, three have emerged as particularly common and are frequently observed in lasers. These are the period-doubling, intermittency, and two-frequency scenarios. In the first, a solution which is initially stable is found to oscillate, the period of which successively doubles at distinct values of the control parameter. This continues until the number of fixed points becomes infinite at a finite parameter value, where the variation in time of the solution becomes irregular. For the intermittency route, a signal that behaves regularly in time becomes interrupted by statistically distributed periods of irregular motion, the average number of which increases with the external control parameter until the condition becomes chaotic. The two-frequency route is more readily identified with early concepts of turbulence, considered to be the limit of an infinite sequence of instabilities (Hopf bifurcations) evolving from an initial stable solution, each of which creates a new basic frequency. It is now known that only two or three instabilities (frequencies) are sufficient for the subsequent generation of chaotic motion.

Applications of Chaos in Lasers

It has been accepted as axiomatic since the discovery of chaos that chaotic motion is in general neither predictable nor controllable. It is unpredictable because a small disturbance will produce an exponentially growing perturbation of the motion. It is uncontrollable because small disturbances lead to other forms of chaotic motion and not to any other stable and predictable alternative. It is, however, this sensitivity to small perturbations that has more recently been used to stabilize and control chaos, essentially using chaos to control itself. Among the many methods proposed in the late 1980s, a feedback control approach was proposed by Ott, Grebogi, and Yorke (OGY), in which tiny feedback is used to stabilize unstable periodic orbits or fixed points of chaotic attractors. This control strategy is best understood from a schematic of the OGY control algorithm for stabilizing a saddle point P*, as shown in Figure 3. Curved trajectories follow a stable (unstable) manifold towards (away from) the saddle point. Without perturbation of a parameter, the starting state, s1, would evolve to the state s2.
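The behavior described above, decay to a fixed point below threshold and bounded, never-repeating motion with extreme sensitivity to initial conditions beyond it, is easy to reproduce numerically. A minimal sketch (not from the article; σ, b, and the step size are the usual illustrative choices) integrating the Lorenz equations with fixed-step RK4:

```python
import numpy as np

def lorenz(state, sigma=10.0, b=8.0 / 3.0, r=28.0):
    """Right-hand side of the Lorenz equations in (x, y, z)."""
    x, y, z = state
    return np.array([sigma * (y - x), r * x - y - x * z, x * y - b * z])

def rk4(f, state, dt, steps, **params):
    """Fixed-step fourth-order Runge-Kutta integration; returns the trajectory."""
    traj = np.empty((steps + 1, 3))
    traj[0] = state
    for i in range(steps):
        k1 = f(state, **params)
        k2 = f(state + 0.5 * dt * k1, **params)
        k3 = f(state + 0.5 * dt * k2, **params)
        k4 = f(state + dt * k3, **params)
        state = state + (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
        traj[i + 1] = state
    return traj

# Below threshold (r < 1): every trajectory decays to the origin (cf. Figure 1a).
below = rk4(lorenz, np.array([1.0, 1.0, 1.0]), 0.01, 5000, r=0.5)
# Chaotic regime (r = 28): motion stays bounded but never repeats (cf. Figure 1c).
chaotic = rk4(lorenz, np.array([1.0, 1.0, 1.0]), 0.01, 5000, r=28.0)
# Sensitivity: a 1e-8 offset in the initial condition grows to macroscopic size.
near = rk4(lorenz, np.array([1.0 + 1e-8, 1.0, 1.0]), 0.01, 5000, r=28.0)
```

This is a generic illustration of the dynamics, not the laser-specific Maxwell–Bloch system; in the Haken correspondence, (x, y, z) play the roles of (E, P, D) and r that of the pump.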
Figure 4 Schematic of communication with chaotic erbium-doped fiber amplifiers (EDFAs). Injecting a message into the transmitter laser folds the data into the chaotic frequency fluctuations. The receiver reverses this process, thereby recovering a high-fidelity copy of the message. (Reproduced with permission from Gauthier DJ (1998) Chaos has come again. Science 279: 1156–1157.)
See also

Fourier Optics. Optical Amplifiers: Erbium Doped Fiber Amplifiers for Lightwave Systems. Polarization: Introduction.

Further Reading

Abraham NB, Lugiato LA and Narducci LM (eds) (1985) Instabilities in active optical media. Journal of the Optical Society of America B 2: 1–272.
Arecchi FT and Harrison RG (1987) Instabilities and Chaos in Quantum Optics. Synergetics Series 34, 1–253. Berlin: Springer Verlag.
Arecchi FT and Harrison RG (eds) (1994) Selected Papers on Optical Chaos. SPIE Milestone Series, vol. MS 75. SPIE Optical Engineering Press.
Gauthier DJ (1998) Chaos has come again. Science 279: 1156–1157.
Gills Z, Iwata C and Roy R (1992) Tracking unstable steady states: extending the stability regime of a multimode laser system. Physical Review Letters 69: 3169–3172.
Haken H (1985) Light. In: Laser Light Dynamics, vol. 2. Amsterdam: North-Holland.
Harrison RG (1988) Dynamical instabilities and chaos in lasers. Contemporary Physics 29: 341–371.
Harrison RG and Biswas DJ (1985) Pulsating instabilities and chaos in lasers. Progress in Quantum Electronics 10: 147–228.
Hunt ER (1991) Stabilizing high-period orbits in a chaotic system – the diode resonator. Physical Review Letters 67: 1953–1955.
Ott E, Grebogi C and Yorke JA (1990) Controlling chaos. Physical Review Letters 64: 1196–1199.
Pecora LM and Carroll TL (1990) Synchronization of chaotic systems. Physical Review Letters 64: 821–824.
Roy R, et al. (1992) Dynamical control of a chaotic laser: experimental stabilization of a globally coupled system. Physical Review Letters 68: 1259–1262.
VanWiggeren GD and Roy R (1998) Communication with chaotic lasers. Science 279: 1198–1200.
VanWiggeren GD and Roy R (1998) Optical communication with chaotic waveforms. Physical Review Letters 81: 3547–3550.

CHEMICAL APPLICATIONS OF LASERS / Detection of Single Molecules in Liquids
Figure 1 Schematic illustration of the molecular absorption–emission cycle and timescales for the component processes. Competing processes that may reduce the ultimate photon yield are also shown.
region for several milliseconds. The rapid photon absorption–emission cycle may therefore be repeated many times during the residence period, resulting in a burst of fluorescence photons as the molecule transits the beam. The burst size is limited theoretically by the ratio of the beam transit time (t_r) and the fluorescence lifetime (t_f):

N_{photons} = t_r / t_f   [1]

For typical values of t_r (5 ms) and t_f (5 ns), up to one million photons may be emitted by the single molecule. In practice, photobleaching and photodegradation processes limit this yield to about 10^5 photons. Furthermore, advances in optical collection and detection technologies enable registration of about 1–5% of all photons emitted. This results in a fluorescence burst signature of up to a few thousand photons or photoelectrons.

Successful single molecule detection (SMD) depends critically on the optimization of the fluorescence burst size and the reduction of background interference from the bulk solvent and impurities. Specifically, a molecule is well suited for SMD if it is efficiently excited by an optical source (i.e., possesses a large molecular absorption cross-section at the wavelength of interest), has a high fluorescence quantum efficiency (favoring radiative deactivation of the excited state), has a short fluorescence lifetime, and is exceptionally photostable.

Ionic dyes are often well suited to SMD, as fluorescence quantum efficiencies can be close to unity and fluorescence lifetimes below 10 nanoseconds. For example, xanthene dyes such as Rhodamine 6G and tetramethylrhodamine isothiocyanate are commonly used in SMD studies. However, other highly fluorescent dyes, such as fluorescein, are unsuitable for such applications due to unacceptably high photodegradation rate coefficients. Furthermore, some solvent systems may enhance nonradiative processes, such as intersystem crossing, and yield a significant reduction in the photon output. Structures of three common dyes suitable for SMD are shown in Figure 2.

Signal vs Background

The primary challenge in SMD is to ensure sufficient reduction in background levels to enable discrimination between signal and noise. As an example, in a 1 nM aqueous dye solution each solute molecule occupies a volume of approximately 1 fL. However, this same volume also contains in excess of 10^10 solvent molecules. Despite the relatively small scattering cross-section for an individual water molecule (~10^-28 cm² at 488 nm), the cumulative scattering
Figure 4 (a) 1/e² Gaussian intensity contours plotted for a series of laser beam radii (λ = 488 nm, f = 1.6 mm, n = 1.52). (b) Cylindrical and curved components of the Gaussian probe volume. The curved contribution is more significant for larger beam radii and correspondingly tight beam waists.
simply defined according to

V = \int_{-Z_0}^{Z_0} \pi w(z)^2 \, dz   [4]

Solution of eqn [4] yields

V = 2\pi w_0^2 Z_0 + \frac{2\lambda^2}{3\pi w_0^2} Z_0^3   [5]

The Gaussian volume expression contains two terms. The first term, 2πw₀²Z₀, corresponds to a central cylindrical volume; the second term has a more complex form that describes the extra curved volume (Figure 4b). The diffraction-limited beam waist radius w₀ can be defined in terms of the focusing objective focal length f, the refractive index n, and the collimated beam radius R according to

w_0 = \frac{\lambda f}{n\pi R}   [6]

Substitution in eqn [5] yields

V = 2\pi \left(\frac{\lambda f}{n\pi R}\right)^2 Z_0 + \frac{2\lambda^2}{3\pi} \left(\frac{n\pi R}{\lambda f}\right)^2 Z_0^3 = \frac{2\lambda^2 f^2}{\pi n^2 R^2} Z_0 + \frac{2\pi n^2 R^2}{3f^2} Z_0^3   [7]

The volume is now expressed in terms of identifiable experimental variables and constants. Once again, the first term may be correlated with the cylindrical contribution to the volume, and the second term is the additional volume due to the curved contour. It is clear from Figure 4a that, for a given focal length, the noncylindrical contribution to the probe volume increases with incident beam diameter, when the diffraction-limited focus is correspondingly sharp and narrow. Furthermore, it can also be seen that the second term in eqn [7] is inversely proportional to f², and thus the extent to which the probe volume is underestimated by the cylindrical approximation increases with decreasing focal length. This fact is significant when performing confocal measurements, since high numerical aperture objectives with short focal lengths are typical. Some realistic experimental parameters give an indication of typical dimensions for the probe volume in confocal SMD systems. For example, if λ = 488 nm, f = 1.6 mm, Z₀ = 1.0 μm, and n = 1.52, a minimum optical probe volume of 1.1 fL is achievable with a collimated beam diameter of 1.1 mm.

Intensity Fluctuations: Photon Burst Statistics

When sampling a small volume within a system that may freely exchange particles with a large surrounding analyte bath, a Poisson distribution of particles is predicted. A Poisson distribution is a discrete series that is defined by a single parameter m equating to the mean and variance of the distribution:

P(n = x) = \frac{m^x e^{-m}}{x!}   [8]
Common Poisson processes include radioactive disintegration, random walks, and Brownian motion. Although particle number fluctuations in the excitation volume are Poissonian in nature, the corresponding fluorescence intensity modulation induces a stronger correlation between photon counts. For a single molecular species the model is described by two parameters: an intrinsic molecular brightness and the average occupancy of the observation volume. A super-Poissonian distribution has a width, or variance, that is greater than its mean; in a Poisson distribution the mean value and the variance are equal. The fractional deviation Q is defined as the scaled difference between the variance and the expectation value of the photon counts, and gives a measure of the broadening of the photon counting histogram (PCH). Q is directly proportional to the molecular brightness factor ε and the shape factor γ of the optical point spread function (γ is constant for a given experimental geometry):

Q = \frac{\langle \Delta n^2 \rangle - \langle n \rangle}{\langle n \rangle} = \gamma\varepsilon   [9]

A pure Poisson distribution has Q = 0; for super-Poissonian statistics Q > 0. Deviation from the Poisson function is maximized at low number density and high molecular brightness.

In a typical SMD experiment, raw data are generally collected with a multichannel scaler and photons are registered in binned intervals. Figure 5 illustrates typical photon burst scans demonstrating the detection of single molecules (R-phycoerythrin) in solution. Fluorescence photon bursts, due to single molecule events, are clearly distinguished above a low background baseline (top panel) of less than 5 counts per channel in the raw data. It is noticeable that bursts vary in both height and size. This is in part due to the range of possible molecular trajectories through the probe volume, photobleaching kinetics, and the nonuniform illumination intensity in the probe region. In addition, it can be seen that the burst frequency decreases as the bulk solution concentration is reduced. This effect is expected, since the properties of any given single-molecule event are determined by molecular parameters alone (e.g., photophysical and diffusion constants) and concentration merely controls the frequency/number of events.

Although many fluorescence bursts are clearly distinguishable from the background, it is necessary to set a count threshold for peak discrimination in order to correctly identify fluorescence bursts above the background. A photocount distribution can be used as the starting point for determining an appropriate threshold for a given data set. The overlap between signal and background photocount distributions affects the efficiency of molecular detection. Figure 6 shows typical signal and background photocount probability distributions, with a threshold set at approximately 2 photocounts per channel. The probability of spurious (or false)
Figure 5 Photon burst scans originating from 1 nM and 500 pM R-phycoerythrin buffered solutions. The sample is contained within a 50 μm square fused silica capillary. Laser illumination = 5 mW; channel width = 1 ms. The top panel shows a similar burst scan originating from a deionized water sample measured under identical conditions.
detection resulting from statistical fluctuations in the background can be quantified by the area under the 'background' curve at photocount values above the threshold. Similarly, the probability that 'true' single molecule events are neglected can be estimated by the area under the 'fluorescence' curve at photocount values below the threshold. Choice of a high threshold value will ensure a negligible probability of calling a false positive, but will also exclude a number of true single molecule events that lie below the threshold value. Conversely, a low threshold value will generate an unacceptably high number of false positives. Consequently, choice of an appropriate threshold is key to efficient SMD.

Since the background shot noise is expected to exhibit Poisson statistics, the early part of the photocount distribution (i.e., the portion that is dominated by low, background counts) can be modeled with a Poisson distribution to set a statistical limit for the threshold. Photon counting events above this threshold can be defined as photon bursts associated with the presence of single molecules. In an analogy with Gaussian systems, the selected peak discrimination threshold can be defined as three standard deviations from the mean count rate:

n_{threshold} = m + 3\sqrt{m}   [10]

Adoption of a threshold that lies 3 standard deviations above the mean yields a confidence limit that is typically greater than 99%. Figure 7 illustrates a sample photocount distribution, a least-squares fit to an appropriate Poisson distribution, and the calculated threshold that results. Once the threshold has been calculated, its value is subtracted from all channel counts and a peak search utility used to identify burst peaks in the resulting data set.

Data Filtering

As stated, the primary challenge in detecting single molecules in solution is not the maximization of the detected signal, but the maximization of the signal-to-noise ratio (or the reduction of background interferences). Improving the signal-to-noise ratio in such experiments is important, as background levels can often be extremely high.

Several approaches have been used to smooth SMD data with a view to improving signal-to-noise ratios. However, the efficacy of these methods is highly dependent on the quality of the raw data obtained in experiment. As examples, three common methods are briefly discussed. The first method involves the use of a weighted quadratic sum (WQS) smoothing filter. The WQS function creates a weighted sum of adjacent terms according to

\tilde{n}_{k,WQS} = \sum_{j=0}^{m-1} w_j (n_{k+j})^2   [11]

The range of summation m is of the same order as the burst width, and the weighting factors w_j are chosen
to best discriminate between single-molecule signals and random background fluctuations. This method proves most useful for noisy systems, in which the raw signal is weak. There is a practical drawback, in that peak positions are shifted by the smoothing function, and subsequent burst analysis is therefore hampered.

Another popular smoothing filter is the Lee filtering algorithm. The Lee filter preferentially smooths background photon shot noise and is defined according to

\tilde{n}_k = \bar{n}_k + (n_k - \bar{n}_k) \frac{\sigma_k^2}{\sigma_k^2 + \sigma_0^2}   [12]

where the running mean (n̄_k) and running variance (σ_k²) are defined by

\bar{n}_k = \frac{1}{2m+1} \sum_{j=-m}^{m} n_{k+j}, \quad m < k \le N - m   [13]

\sigma_k^2 = \frac{1}{2m+1} \sum_{j=-m}^{m} (n_{k+j} - \bar{n}_{k+j})^2, \quad 2m < k \le N - 2m   [14]

for a filter (2m + 1) channels wide. Here, n_k is the number of detected photons stored in channel k, σ₀ is a constant filter parameter, and N is the total number of data points.

A final smoothing technique worth mentioning is the Savitzky–Golay smoothing filter. This filter uses a least-squares method to fit an underlying polynomial function (typically a quadratic or quartic function) within a moving window. This approach works well for smooth line profiles of a similar width to the filter window, and tends to preserve features such as peak height, width, and position, which may be lost by simple adjacent averaging techniques. Figure 8 shows the effects of using each approach to improve signal-to-noise for raw burst data.

Figure 8 Effect of various smoothing filters on a 750 ms photon burst scan originating from a 1 nM R-phycoerythrin buffered solution. Raw data are shown in the top panel.

Photon Burst Statistics

A valuable quantitative method for the analysis of fluorescence bursts utilizes Poisson statistics. Burst interval distributions are predicted to follow a Poissonian model, in which peak separation frequencies adopt an exponential form. The probability of a single molecule (or particle) event occurring after an interval Δt is given by eqn [15]:

N(\Delta t) = \lambda \exp(-\beta \Delta t)   [15]

where λ is a proportionality constant and β is a characteristic frequency at which single molecule events occur. The recurrence time t_R can then be simply defined as

t_R = \frac{1}{\beta}   [16]
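The Poisson threshold of eqn [10] and the Lee filter of eqns [12]–[14] can be sketched as below (a minimal implementation; the window half-width m and the parameter σ₀ are illustrative choices, not values from the article):

```python
import numpy as np

def poisson_threshold(background_mean):
    """Peak-discrimination threshold of eqn [10]: mean + 3 standard deviations."""
    return background_mean + 3.0 * np.sqrt(background_mean)

def lee_filter(n, m=5, sigma0=2.0):
    """Lee filter of eqns [12]-[14] over a (2m+1)-channel window."""
    n = np.asarray(n, dtype=float)
    window = np.ones(2 * m + 1) / (2 * m + 1)
    nbar = np.convolve(n, window, mode="same")                # running mean, eqn [13]
    var = np.convolve((n - nbar) ** 2, window, mode="same")   # running variance, eqn [14]
    # Low-variance (background) regions collapse toward the running mean;
    # high-variance bursts pass through nearly unchanged.
    return nbar + (n - nbar) * var / (var + sigma0**2)

# A background of mean 4 counts/channel gives a threshold of 10 counts:
print(poisson_threshold(4.0))   # 10.0
# On a flat trace the interior channels (the validity range of eqn [14])
# are returned unchanged; the zero-padded edges are not meaningful.
flat = lee_filter(np.full(100, 4.0))
```

Note that `np.convolve` zero-pads at the boundaries, which is why eqns [13] and [14] restrict k to the interior of the record.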
Figure 9 Burst interval distribution analysis of photon burst scans. Data originate from 1 μm fluorescent beads moving through 150 μm wide microchannels at flow rates of 200 nL min⁻¹ (circles) and 1000 nL min⁻¹ (squares). Least-squares fits to a single exponential function are shown by the solid lines.

Equation [15] simply states that longer intervals between photon bursts are less probable than shorter intervals. Furthermore, the recurrence time reflects a combination of factors that control mobility, probe volume occupancy, or other parameters in the single-molecule regime. Consequently, it is expected that t_R should be inversely proportional to concentration, flow rate, or solvent viscosity in a range of systems. Figure 9 shows an example of frequency N(Δt) versus time plots for two identical particle systems moving at different velocities through the probe volume. A least-squares fit to a single exponential function yields values of t_R = 91 ms for a volumetric flow rate of 200 nL/min and t_R = 58 ms for a volumetric flow rate of 1000 nL/min.

\langle F(t) \rangle = \frac{1}{T} \int_0^T F(t)\, dt   [18]

Here, T is defined as the total measurement time, F(t) is the fluorescence signal at time t, and ⟨F(t)⟩ is the temporal signal average. Fluctuations in the fluorescence intensity, δF(t) = F(t) − ⟨F(t)⟩, with time t, about an equilibrium value ⟨F⟩, can be statistically investigated by calculating the normalized autocorrelation function, G(τ), where

G(\tau) = \frac{\langle F(t+\tau) F(t) \rangle}{\langle F \rangle^2} = \frac{\langle \delta F(t+\tau)\, \delta F(t) \rangle}{\langle F \rangle^2} + 1   [19]

In dedicated fluorescence correlation spectroscopy experiments, the autocorrelation curve is usually generated in real time in a high-speed digital correlator. Post-acquisition calculation is also possible using the following expression:

G(\tau) = \sum_{t=0}^{N-1} g(t)\, g(t+\tau)   [20]
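The post-acquisition correlation of eqns [19] and [20] can be sketched as below (a minimal estimator applied to a synthetic trace; the burst pattern is invented for illustration):

```python
import numpy as np

def autocorrelation(F, tau):
    """Normalized fluctuation autocorrelation G(tau) of eqn [19], estimated
    from a discrete trace F (the lag products of eqn [20], normalized)."""
    F = np.asarray(F, dtype=float)
    dF = F - F.mean()                    # delta F(t) = F(t) - <F>
    if tau == 0:
        overlap = dF * dF
    else:
        overlap = dF[:-tau] * dF[tau:]   # delta F(t) * delta F(t + tau)
    return overlap.mean() / F.mean() ** 2 + 1.0

# A flat trace has no fluctuations, so G(tau) = 1 at every lag.
flat = np.full(1000, 5.0)
# A trace with periodic bursts decorrelates: G(0) exceeds unity.
bursty = np.tile(np.r_[np.full(20, 1.0), np.full(5, 50.0)], 40)
print(autocorrelation(flat, 3))     # 1.0
print(autocorrelation(bursty, 0))   # > 1
```

A hardware correlator computes the same quantity, but accumulates the lag products in real time as photons arrive.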
Applications
The basic tools and methods outlined in this
chapter have been used to perform SMD in a
variety of chemically and biologically relevant
systems, and indeed there is a large body of
Figure 10 Autocorrelation analysis of photon burst scans of work describing the motion, conformational
1 mm fluorescent beads moving through 150 mm wide micro- dynamics and interactions of individual molecules
channels at flow rates of 500 nL min21 (stars) and 1000 nL min21
(circles). Solid lines represent fits to the data according to
(see Further Reading). A primary application area
eqn [23]. has been in the field of DNA analysis, where SMD
methods have been used in DNA fragment sizing,
single-molecule DNA sequencing, high-throughput
DNA screening, single-molecule immunoassays,
related to the translational diffusion coefficient for the and DNA sequence analysis. SMD methods have
molecule: also proved highly useful in studying protein
structure, protein folding, protein-molecule inter-
w2
D¼ ½22 actions, and enzyme activity.
4tD More generally, SMD methods may prove to be
In a flowing system, the autocorrelation function highly important as a diagnostic tool in systems
depends on the average flow time through the probe where an abundance of similar molecules masks
volume tflow : A theoretical fit to the function can be the presence of distinct molecular anomalies that
described according to are markers in the early stages of disease or
cancer.
( 2 )
1 t
GðtÞ ¼ 1 þ A exp A
N tflow
! ½23 See also
21 2
t v t Microscopy: Confocal Microscopy.
A¼ 1þ 1þ
td z td
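As a numerical illustration of the analysis above, the correlator sum of eqn [20] and the flow model of eqn [23] can be sketched as follows. This is a minimal sketch: the function and variable names are illustrative, and the numbers used in testing are arbitrary, not values from the text.

```python
import numpy as np

def autocorrelation(g, max_lag):
    """Direct post-acquisition evaluation of eqn [20]:
    G(tau) = sum_{t=0}^{N-1} g(t) g(t + tau) over a photon-count record g."""
    g = np.asarray(g, dtype=float)
    N = len(g)
    return np.array([np.sum(g[:N - lag] * g[lag:]) for lag in range(max_lag)])

def flow_model(tau, N_mol, tau_D, tau_flow, w_over_z):
    """Fit function of eqn [23] for a flowing system:
    G(tau) = 1 + (1/N) A exp[-A (tau/tau_flow)^2], with
    A = (1 + tau/tau_D)^(-1) [1 + (w/z)^2 (tau/tau_D)]^(-1/2)."""
    A = (1.0 + tau / tau_D) ** -1.0 * (1.0 + w_over_z ** 2 * tau / tau_D) ** -0.5
    return 1.0 + (A / N_mol) * np.exp(-A * (tau / tau_flow) ** 2)
```

In practice a measured G(τ) would be fitted to `flow_model` by nonlinear least squares (for example with `scipy.optimize.curve_fit`) to extract τ_D and τ_flow.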
Figure 1 Schematic of counterpropagating fluxes in a diffusely irradiated scattering medium.

can be constructed, with I the incident flux and J the diffusely reflected flux, and i and j the fluxes traveling upwards and downwards through an infinitesimally small thickness element dx. Two further parameters may be defined which are characteristic of the medium.

K  The absorption coefficient. Expresses the attenuation of light due to absorption per unit thickness.

S  The scattering coefficient. Expresses the attenuation of light due to scattering per unit thickness.

Both of these parameters can be thought of as arising due to the particle (or chromophore) acting as a sphere of characteristic size, which casts a shadow either due to the prevention of on-axis transmission or due to absorption. The scattering or absorption coefficient then depends on the effective size of this sphere, and the number density of spheres in the medium. In each case, the probability P of a photon being transmitted through a particular thickness X of a medium is related exponentially to the absorption or scattering coefficient:

P = exp(−KX)   [3]

P = exp(−SX)   [4]

The scattering coefficient S depends on the refractive index difference between the particle and the dispersion medium. The scattering coefficient is also dependent upon the particle size, and shows an inverse correlation, i.e., the scattering coefficient increases as the particle size decreases. This effect is a function of the size of the particle relative to the wavelength of light impinging on it, with the scattering coefficient increasing as the wavelength decreases. For the passage of the two fluxes through the element dx, the losses and gains are

i2 = i1 − i1(S + K)dx   [5]

j2 = j1 − j1(S + K)dx   [6]

i2 = i1 + j1S dx   [7]

j2 = j1 + i1S dx   [8]

The net effect of this in the attenuation of the flux propagating into the sample (i) and the flux backscattered from the sample (j) can be expressed as the following differential equations:

di = (S + K)i1 dx − j1S dx   [9]

dj = −(S + K)j1 dx + i1S dx   [10]

For a sample of infinite thickness, these equations can be solved to give an analytical solution for the observed reflectance of the sample in terms of the absorption and scattering coefficients:

K/S = (1 − R∞)² / (2R∞)   [11]

where R∞ is the reflectance from a sample of such thickness that an increase in sample thickness has no effect on the observed reflectance. The absorption coefficient K is dependent upon the concentration of absorbers in the sample through

K = 2εc   [12]

with ε the naperian absorption coefficient, c the concentration of absorbers, and the factor 2 the geometrical factor for an isotropic scatterer. For multiple absorbers, K is simply the linear sum of absorption coefficients and concentrations. Hence for a diffusely scattering medium, an expression analogous to the Beer–Lambert law can be derived to relate concentration to a physically measurable parameter, in this case the sample reflectance. The ratio K/S is usually referred to as the Kubelka–Munk remission function, and is the parameter usually quoted in this context. It is important to note that the relationship
CHEMICAL APPLICATIONS OF LASERS / Diffuse-Reflectance Laser Flash Photolysis

with concentration is only valid for a homogeneous distribution of absorbers within a sample.

Transient Concentration Profiles

As has been discussed in the previous section, a light flux propagating through a scattering medium is attenuated by both scattering and absorption events, whilst in a nonturbid medium attenuation is by absorption only. Hence even in a sample with a very low concentration of absorbers, the light flux is rapidly attenuated as it penetrates into the sample by scattering events. This leads to a significant flux gradient through the sample, resulting in a reduction in the concentration of photo-induced species as penetration depth into the sample increases. There are three distinct concentration profiles which may be identified within a scattering sample, which are dependent on the scattering and absorption coefficients. These concentration profiles are interrogated by a beam of analyzing light, and hence an understanding of the effect of these differing profiles on the diffusely reflected intensity is vital in interpreting transient absorption data. The transient depth profiles are illustrated in Figure 2.

Kubelka–Munk Plug

This occurs when the photolysis flash, i.e., laser fluence, is high and the absorption coefficient is low at the laser wavelength. If we assume a simple photophysical model involving simply the ground state S0, the first excited singlet state S1 and first excited triplet state T1, and we assume that either the quantum yield of triplet state production is high or the S1 lifetime is very much shorter than the laser pulse allowing re-population of the ground state, then it is possible at high enough fluence to completely convert all of the S0 states to T1 states within the sample. Provided the fluence is high enough, this complete conversion will penetrate some way into the sample (Figure 2). Under circumstances where the T1 state absorbs strongly at some wavelength other than the photolysis wavelength, probing at this wavelength will result in the probe beam being attenuated significantly within a short distance of the surface of the sample, and thus will only interrogate regions where there is a homogeneous excited state concentration. Under these circumstances the reflectance of the sample as a function of the concentration of excited states is described by the Kubelka–Munk equation, and the change in remission function can be used to probe the change in excited state concentration.

Exponential Fall-Off of Concentration

This occurs when the laser fluence is low and the absorption coefficient at the laser wavelength high. Again considering the simple photophysical model, most of the laser flux will be absorbed by the ground state in the first few layers of sample, and little will penetrate deeply into the sample. In the limiting case this results in an exponential fall-off of transient concentration with penetration depth into the sample (Figure 2). Here the distribution of absorbers is not random and the limiting Kubelka–Munk equation is no longer applicable, since the mean absorption coefficient varies with sample penetration depth. Lin and Kan solved eqns [9] and [10] with the absorption coefficient K varying exponentially with penetration depth and showed that the series solution converges for changes in reflectance of less than 10%, such that the reflectance change is a linear function of the number of absorbing species.

Intermediate Case

transient concentration under these circumstances, but a precise knowledge of the absorption and scattering coefficients is required. Under most circumstances, this case is avoided and diffuse reflectance flash photolysis experiments are arranged such that one of the two limiting cases above, generally the exponential fall-off (usually achieved by attenuation of the laser beam), prevails in a particular experiment.

Sample Geometry

In the case of conventional nanosecond laser flash photolysis, it is generally the case that right-angled geometry is employed for the photolysis and probe beams (Figure 3a). This geometry has a number of advantages over alternatives. The photolysis beam and analyzing beam are spatially well separated, such that the analyzing beam intensity is largely unaffected by scattered photolysis light. Also, the fluorescence generated by the sample will be emitted in all directions, whilst the analyzing beam is usually collimated, allowing spatial separation of fluorescence and analyzing light; sometimes an iris is used to aid this spatial discrimination. This geometry is appropriate for quite large beam diameters and fluences; where smaller beams are used, collinear geometry may be more appropriate in order to achieve long interaction pathlengths.

In the case of nanosecond diffuse reflectance flash photolysis, the geometry required is quite different (Figure 3b,c). Here, the photolysis beam and analyzing beam must be incident on the same sample
surface, and the diffusely reflected analyzing light is
collected from the same surface. The geometry is
often as shown in Figure 3b, where the analyzing light
is incident almost perpendicularly on the sample
surface, with the photolysis beam incident at an angle
such that the specular reflection of the photolysis
beam passes between detector and analyzing beam
(not shown). Alternatively, the geometry shown in
Figure 3c may be employed, where diffusely reflected
light is detected emerging perpendicular to the sample
surface. In both cases, the geometry is chosen such
that specular (mirror) reflection of either exciting or
analyzing light from the sample is not detected, since
specular reflection is light which has not penetrated
the sample and therefore contains no information
regarding the concentrations of species present.
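As a numerical sketch of the Kubelka–Munk analysis described earlier (eqns [11] and [12]), the remission function and its inverse can be written as follows. Since K = 2εc is linear in concentration, the remission function computed from a measured R∞ tracks absorber concentration at fixed S. The function names are illustrative, not from the original article.

```python
import math

def remission(R_inf):
    """Kubelka-Munk remission function, eqn [11]: F(R_inf) = K/S = (1 - R_inf)^2 / (2 R_inf)."""
    return (1.0 - R_inf) ** 2 / (2.0 * R_inf)

def reflectance_from_KS(K_over_S):
    """Invert eqn [11] for R_inf, taking the physical root with 0 < R_inf <= 1.
    From (1 - R)^2 = 2 R (K/S): R^2 - 2(1 + K/S) R + 1 = 0."""
    b = 1.0 + K_over_S
    return b - math.sqrt(b * b - 1.0)
```

A non-absorbing sample (K = 0) gives R∞ = 1 and a remission function of zero; increasing K/S drives R∞ toward zero.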
A requirement, as in conventional flash photolysis,
is that the analyzing beam probes only those areas
which are excited by the photolysis beam, requiring
the latter to be larger than the former. The nature
of the scattering described previously means that
the required geometry does not give spatial
discrimination at the detector between photolysis
and analyzing light, and fluorescence. This is because
both analyzing and photolysis light undergo scatter-
ing and absorption processes (although with wave-
length-dependent absorption and scattering
coefficients) and emerge with the same spatial
profiles. Fluorescence, principally that stimulated by
the photolysis beam since this is of greatest intensity,
originates from within the sample but again under-
goes absorption and scattering and emerges with the
same spatial distribution as the exciting light. In
diffuse reflectance flash photolysis, separation of the
analyzing light from the excitation or fluorescence
must be achieved using spectral (filters and/or monochromators) or temporal, rather than spatial, discrimination. Time-gated charge coupled devices

Figure 3 Sample geometries for (a) conventional and (b,c) diffuse reflectance flash photolysis. →, photolysis beam; →, analyzing beam.
in the first-order rate constant is then

ln k = ln k̄ + γx   [16]

There are a number of approaches to kinetic analysis in opaque, often heterogeneous systems, as described below.

Physical Models

A number of these have been applied to very specific data sets, where parameters of the systems are accurately known. These include random walk models in zeolites, and time-dependent fractal-dimensional rate constants to describe kinetics on silica gel surfaces. The advantage of these methods over rate constant distributions is that, since they are based on the physical description of the system, they yield physically meaningful parameters such as diffusion coefficients from analysis of kinetic data. However, for many systems the parameters are known in insufficient detail for accurate models to be developed.

Examples

An example of bimolecular quenching data is shown in Figure 5. Here, anthracene at a concentration of 1 μmol g⁻¹ is co-adsorbed to silica gel from acetonitrile solution together with azulene at a concentration of 0.8 μmol g⁻¹. Laser excitation at 355 nm from an Nd:YAG laser produces the excited triplet state of the anthracene, which undergoes energy transfer to the co-adsorbed azulene molecule as a result of the rapid diffusion of the latter.

The data shown in Figure 5 are recorded monitoring at 420 nm, and the laser energy (approximately 5 mJ per pulse) is such that an exponential fall-off of transient concentration with penetration depth is expected, such that reflectance change is proportional to transient concentration (see section on Transient Concentration Profiles above). The data are shown on a logarithmic time axis for clarity. The fitted line is obtained by applying the model of Albery et al. (see section on Rate Constant Distributions above) with fitting parameters k̄ = 1.01 × 10⁴ s⁻¹ and γ = 0.73.

Where ion–electron recombination is concerned, the Albery model often fails to adequately describe the data obtained, since it does not allow sufficiently small rate constants relative to the value of k̄, given the constraints of the Gaussian distribution. The Lévy stable distribution is ideal in this application since it allows greater flexibility in the shape of the distribution.

Figure 6 shows example data for naphthalene adsorbed to silica gel (1 μmol g⁻¹). The laser pulse energy at 266 nm (approximately 40 mJ per pulse) is such that photo-ionization of the naphthalene occurs, producing the radical cation. The subsequent decay of the radical cation via ion–electron recombination can be monitored at 680 nm. Note that decay is observed on a time-scale of several thousand seconds. The fitted line is according to a Lévy stable distribution with parameters k̄ = 8.2 × 10⁻⁴ s⁻¹, γ = 0.5 and α = 1.7 (see section on Rate Constant Distributions above). This model allows the shape of the distribution to deviate from a Gaussian, and can be more successful than the model of Albery et al. in modeling the complex kinetics which arise on surfaces such as silica gel. Note that where the model of Albery et al. can successfully model the data, the data can also be described by a Lévy stable distribution with α = 2.

Figure 6 Transient absorption decay for the naphthalene (1 μmol g⁻¹) radical cation monitored at 680 nm on silica gel. Fitted using a Lévy stable distribution.
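The dispersed kinetics fitted above can be illustrated with a minimal numerical sketch of the Albery model (eqn [16]): rate constants are log-normally distributed, ln k = ln k̄ + γx with x Gaussian, and the observed decay is the Gaussian-weighted average of exponentials, which Gauss–Hermite quadrature evaluates directly. This is a generic sketch under those assumptions, not the authors' fitting code.

```python
import numpy as np

def albery_decay(t, k_mean, gamma, n_nodes=64):
    """Normalized transient decay for the Albery model (eqn [16]):
    C(t)/C(0) = integral exp(-x^2) exp(-k_mean * exp(gamma*x) * t) dx
                / integral exp(-x^2) dx,
    evaluated by Gauss-Hermite quadrature (nodes x_i, weights w_i)."""
    x, w = np.polynomial.hermite.hermgauss(n_nodes)
    k = k_mean * np.exp(gamma * x)                   # dispersed rate constants
    decay = np.exp(-np.outer(np.atleast_1d(t), k))   # exp(-k t) at each node
    return decay @ w / w.sum()
```

With γ = 0 this reduces to a single exponential in k̄; with γ > 0 the slow tail of the distribution makes the long-time decay markedly slower than exp(−k̄t), which is the behavior the Albery (and, with an extra shape parameter α, the Lévy) fits capture.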
the principle and method of laser manipulation is described and then its applications and possibilities in nanotechnology and biotechnology are summarized.

Principle and Method

When light is reflected or refracted at the interface between two media with different refractive indices, the momentum of photons is changed, which leads to the generation of photon force as a reaction to the momentum change. For example, when a laser beam is reflected by a mirror, the momentum of photons is changed by Δp (Figure 1a). As a result, a photon force, Fphot, acts on the mirror to deflect it vertically away from the reflected beam. In refraction, the momentum of photons is also changed, so that the photon force acts on the interface, as shown in Figure 1b. Thus light does not only give its energy to materials via absorption but also applies a mechanical force to them. However, when we are exposed to light, such as from a halogen lamp, we are never aware of the photon force because its magnitude is less than an order of pN. However, the photon force of the pN order acquired from a focused laser beam is sufficient to manipulate nm–μm-sized particles in solution, under an optical microscope.

Figure 1 Optical pressure originating from the momentum change of photons.

Principle of Laser Trapping

If the size of a particle is larger than the wavelength of a trapping laser beam (μm-sized particles), the principle of single-beam gradient force optical trapping can be explained in terms of geometrical optics. When a tightly focused laser beam is irradiated onto a transparent dielectric particle, an incident laser beam is refracted at the interface between the particle and medium, as represented by Figure 2a. The propagation direction of the beam is changed, i.e., the momentum of the photon is changed, and consequently, a photon force is generated. As the laser beam leaves the particle and enters the surrounding medium, this refraction causes a photon force to be exerted again on that interface. Summing up the force contributions of all rays, if the refractive index of the particle is higher than that of the medium, a resultant force exerted on the particle is directed toward the focal point as an attractive force. However, reflection at the surface of the particle is negligible if that particle is transparent and the refractive index ratio of the particle to medium is close to unity. Therefore, the incident beam is reflected at two surfaces by a small amount, and as a result, the particle is directed slightly to the propagation direction of the incident light. Where the particle absorbs the incident beam, the photon force is also generated to push it in the propagation direction.

Figure 2 Principle of single-beam gradient force optical trapping explained by ray optics.

CHEMICAL APPLICATIONS OF LASERS / Laser Manipulation in Polymer Science

The negligible effects of the reflection and absorption occur in trapping experiments of transparent particles, such as polymer particles, silica microspheres, etc. However, particles with high reflectance and absorption coefficients, such as metallic particles, exert a more dominant force at the surface where absorption and reflection occur. This force is a repulsive force away from the focal point; consequently, metallic microparticles cannot be optically trapped.

If a dielectric particle is much smaller than the wavelength of a trapping light (nm-sized), it can be regarded as a point dipole (Rayleigh approximation) and the photon force (Fphot) acting on it is given by

Fphot = Fgrad + Fscat   [1]

Here, Fgrad and Fscat are called the gradient force and scattering force, respectively. The scattering force is caused by the scattering of light, and it pushes the particle toward the direction of light propagation. On the other hand, the gradient force is generated when a particle is placed in a heterogeneous electric field of light. If the dielectric constant of the particle is higher than that of the surrounding medium, the gradient force acts on the particle to push it toward the higher intensity region of the beam. In the case of laser trapping of dielectric nanoparticles, the magnitude of the gradient force is much larger than the scattering force. Consequently, the particle is trapped at the focal point of the trapping laser beam, where the beam intensity (electric field intensity) is maximum. The photon force can be expressed as follows:

Fphot ≈ Fgrad = (1/2) εm α ∇|E|²   [2]

α = 3V (εp − εm) / (εp + 2εm)   [3]

where E is the electric field of the light, V is the volume of the particle, and εp and εm are the dielectric constants of the particle and medium. In trapping nanoparticles that absorb the laser beam, such as gold nanoparticles, the complex dielectric constant and attenuation of the electric field in the particle need to be taken into consideration. Although the forces due to light absorption and scattering both propel the particle toward the direction of light propagation, the magnitude of the gradient force is much larger than these forces. This is in contrast to the μm-sized metallic particle, which cannot be optically trapped.

Laser Manipulation System

An example of a laser micromanipulation system, with a pulsed laser to induce photoreactions, is shown in Figure 3. A linearly polarized laser beam from a CW Nd³⁺:YAG laser is modulated to the circularly polarized light by a λ/4 plate and then split into horizontally and vertically polarized beams by a
polarizing beamsplitter (PBS1). Both laser beams are combined by another (PBS2), introduced coaxially into an optical microscope, and then focused onto a sample through a microscope objective. The focal spots of both beams in the sample solution can be scanned independently by two sets of computer-controlled galvano mirrors. Even if the beams are spatially overlapping, interference does not take place because of their orthogonal polarization relation.

Pulsed lasers, such as a Q-switched Nd³⁺:YAG laser, are used to induce photoreactions, such as photopolymerization, photothermal reaction, laser ablation, etc.

Laser Manipulation and Patterning of Nanoparticles

Laser manipulation techniques enable us to capture and mobilize fine particles in solution. Most studies using this technique have been conducted on μm-sized objects such as polymer particles, microcrystals, living cells, etc. Because it is difficult to identify individual nanoparticles with an optical microscope, laser manipulation techniques have rarely been applied to nanotechnology and nanoscience. However, the laser trapping technique can be a powerful tool for manipulation of nanoparticles in solution where individual particles can be observed. This is achieved by detection of fluorescence emission from labeled dyes or of scattered light. Now, even single-molecule fluorescence spectroscopy has been achieved by the use of a highly sensitive photodetector, and a single metallic nanoparticle can be examined by detecting the light scattered from it. There have been several reports on the application of laser manipulation techniques for patterning of nm-sized particles. Here, fixation methods of nanoparticles, using the laser manipulation technique and local photoreactions, are introduced.

entered the region, irradiated by a near-infrared laser beam (1064 nm), was trapped at the focal point, and moved onto the surface of the glass substrate by handling the 3D stage of the microscope. An additional fixation laser beam (355 nm, 0.03 mJ, pulse duration ~6 ns, repetition rate ~5 Hz) was then focused to the same point for ~10 s, which led to the generation of acrylamide gel around the trapped nanoparticle. By repeating the procedure, patterning of single polymer nanoparticles on a glass substrate was achieved, and a fluorescence image of single nanoparticles as a letter 'H' is shown in Figure 4. A magnified atomic force microscope (AFM) image of one of the fixed nanoparticles is also shown, which confirms that only one polymer nanoparticle was contained in the polymerized gel.

Multiple polymer nanoparticles can also be gathered, patterned, and fixed on a glass substrate by scanning both trapping and fixation laser beams with the use of two pairs of galvano mirrors. The optical transmission and fluorescence images of the 'H'-patterned nanoparticles on a glass substrate are shown in Figure 5a and b, respectively. The letter 'H' consists of three straight lines of patterned and fixed nanoparticles. The trapping laser beam (180 mW) was scanned at 30 Hz along each line with a length of ~10 μm on a glass substrate for 300 s. Nanoparticles were gathered and patterned along the locus of the focal point on the substrate. Then the fixation laser beam (0.097 mJ) was scanned for another 35 s. As a result, the straight lines of

Figure 5 Optical transmission (a) and fluorescence (b) images of patterned and fixed polymer nanoparticles on the glass substrate, as the letter 'H'.
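The pN-order forces invoked throughout this article can be checked with a rough Rayleigh-regime estimate based on eqns [2] and [3]. The sketch below assumes a Gaussian focus and crudely approximates the field-squared gradient as |∇|E|²| ≈ |E|²/w; the particle and beam parameters in the usage note are illustrative assumptions, not values from the text.

```python
import math

EPS0 = 8.854e-12    # vacuum permittivity, F/m
C_LIGHT = 2.998e8   # speed of light, m/s

def gradient_force_estimate(radius, n_particle, n_medium, power, waist):
    """Order-of-magnitude gradient force on a Rayleigh particle.
    Eqn [3]: alpha = 3V (eps_p - eps_m)/(eps_p + 2 eps_m)  (units of volume);
    eqn [2]: F ~ (1/2) eps_m alpha |grad E^2|, with |grad E^2| ~ E_peak^2 / waist.
    All inputs SI; returns force in newtons."""
    V = 4.0 / 3.0 * math.pi * radius ** 3
    eps_p, eps_m = n_particle ** 2, n_medium ** 2
    alpha = 3.0 * V * (eps_p - eps_m) / (eps_p + 2.0 * eps_m)
    intensity = 2.0 * power / (math.pi * waist ** 2)          # peak intensity, W/m^2
    E2 = 2.0 * intensity / (C_LIGHT * n_medium * EPS0)        # |E|^2 at focus, (V/m)^2
    return 0.5 * (EPS0 * eps_m) * alpha * E2 / waist
```

For a 100 nm polystyrene sphere (n ≈ 1.59) in water (n ≈ 1.33), with 100 mW focused to a 0.5 μm waist, this estimate gives a force of order a fraction of a pN, consistent with the magnitudes discussed above.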
Figure 10 A schematic illustration of efficient collection and transfer of cells based on dual-beam laser manipulation.
The other trapping beam B was used to move individual cells to the line sequentially (Figure 10b). Finally, by moving the microscope stage, many particles trapped at the scanning line can be transferred to another chamber (Figure 10c).

An example is shown in Figure 11. The trapping beam A was scanned on a line with a rate of 15 Hz, whose length was 30 μm. The trapping beam B was used to trap a yeast cell and move it to the line, which was controlled by a computer mouse pointer. The cells stored on the line were successfully isolated, and with this method three cells can be collected within 15 s. Furthermore, the cells could be transferred with

Figure 11 Collection and alignment of fission yeast cells using dual-beam laser manipulation.

See also

Time-Resolved Fluorescence: Measurements in Polymer Science.

Further Reading

Ashkin A, Dziedzic MJ, Bjorkholm JE and Chu S (1986) Observation of a single-beam gradient force optical trap for dielectric particles. Optics Letters 11: 288–290.
Ashkin A, Dziedzic MJ and Yamane T (1987) Optical trapping and manipulation of single cells using infrared laser beams. Nature 330: 769–771.
Berns WM, Aist J, Edwards J, et al. (1981) Laser microsurgery in cell and developmental biology. Science 213: 505–513.
Hoffmann F (1996) Laser microbeams for the manipulation of plant cells and subcellular structures. Plant Science 113: 1–11.
Hosokawa Y, Masuhara H, Matsumoto Y and Sato S (2002) Dual-beam laser micromanipulation for sorting
biological cells and its device application. Proceedings of SPIE 4622: 138–142.
Ito S, Yoshikawa H and Masuhara H (2001) Optical patterning and photochemical fixation of polymer nanoparticles on glass substrates. Applied Physics Letters 78: 2566–2568.
Ito S, Yoshikawa H and Masuhara H (2002) Laser manipulation and fixation of single gold nanoparticles in solution at room temperature. Applied Physics Letters 80: 482–484.
Kamentsky AL, Melamed RM and Derman H (1965) Spectrophotometer: new instrument for ultrarapid cell analysis. Science 150: 630–631.
Kim BH, Kogi O and Kitamura N (1999) Single-microparticle measurements: laser trapping-absorption microscopy under solution-flow conditions. Analytical Chemistry 71: 4338–4343.
Masuhara H, De Schryver FC, Kitamura N and Tamai N (eds) (1994) Microchemistry: Spectroscopy and Chemistry in Small Domains. Amsterdam: Elsevier Science.
Sasaki K, Koshioka M, Misawa H, Kitamura N and Masuhara H (1991) Laser-scanning micromanipulation and spatial patterning of fine particles. Japanese Journal of Applied Physics 30: L907–L909.
Schütze K, Clement-Segelwald A and Ashkin A (1994) Zona drilling and sperm insertion with combined laser microbeam and optical tweezers. Fertility and Sterility 61: 783–786.
Shikano S, Horio K, Ohtsuka Y and Eto Y (1999) Separation of a single cell by red-laser manipulation. Applied Physics Letters 75: 2671–2673.
Svoboda K and Block SM (1994) Optical trapping of metallic Rayleigh particles. Optics Letters 19: 930–932.
Nonlinear Spectroscopies
S R Meech, University of East Anglia, Norwich, UK

© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

Chemistry is concerned with the induction and observation of changes in matter, where the changes are to be understood at the molecular level. Spectroscopy is the principal experimental tool for connecting the macroscopic world of matter with the microscopic world of the molecule, and is, therefore, of central importance in chemistry. Since its invention, the laser has greatly expanded the capabilities of the spectroscopist. In linear spectroscopy, the monochromaticity, coherence, high intensity, and high degree of polarization of laser radiation are ideally suited to high-resolution spectroscopic investigations of even the weakest transitions. The same properties allowed, for the first time, the investigation of the nonlinear optical response of a medium to intense radiation. Shortly after the foundations of nonlinear optics were laid, it became apparent that these nonlinear optical signals could be exploited in molecular spectroscopy, and since then a considerable number of nonlinear optical spectroscopies have been developed. This short article is not a comprehensive review of all these methods. Rather, it is a discussion of some of the key areas in the development of the subject, and indicates some current directions in this increasingly diverse area.

Nonlinear Optics for Spectroscopy

The foundations of nonlinear optics are described in detail elsewhere in the encyclopedia, and in some of the texts listed in the Further Reading section at the end of this article. The starting point is usually the nonlinearity in the polarization, P_i, induced in the sample when the applied electric field, E, is large:

P_i = ε0 [χ(1)_ij E_j + (1/2) χ(2)_ijk E_j E_k + (1/4) χ(3)_ijkl E_j E_k E_l + · · ·]   [1]

where ε0 is the vacuum permittivity, χ(n) is the nth-order nonlinear susceptibility, the indices represent directions in space, and the implied summation over repeated indices convention is used. The signal field, resulting from the nonlinear polarization, is calculated by substituting it as the source polarization in Maxwell's equations and converting the resulting field to the observable, which is the optical intensity. The nonlinear polarization itself arises from the anharmonic motion of electrons under the influence of the oscillating electric field of the radiation. Thus, there is a microscopic analog of eqn [1] for the induced molecular dipole moment, μ_i:

μ_i = ε0 α_ij d_j + ε0^(−2) β_ijk d_j d_k + ε0^(−3) γ_ijkl d_j d_k d_l + · · ·   [2]

in which α is the polarizability, β the first molecular hyperpolarizability, γ the second, etc. The power series is expressed in terms of the displacement field d rather than E to account for local field effects. This somewhat complicates the relationship between the molecular hyperpolarizabilities, for example, γ_ijkl and
Second-Order Spectroscopies

Inspection of eqn [1] shows that in gases, isotropic liquids, and solids where the symmetry point group contains a center of inversion, χ(2) is necessarily zero. This is required to satisfy the symmetry requirement that the polarization must change sign when the direction of the field is inverted, yet for a quadratic, or any even-order, dependence on the field strength it must remain positive. Thus second-order nonlinearities might not appear very promising for spectroscopy. However, there are two cases in which second-order nonlinear optical phenomena are of very great significance in molecular spectroscopy: harmonic generation and the spectroscopy of interfaces.

Harmonic conversion

Almost every laser spectroscopist will have made use of second-harmonic or sum frequency generation for frequency conversion of laser radiation. Insertion of an oscillating field of frequency ω into eqn [1] yields a second-order polarization oscillating at 2ω. If two different frequencies are input, the second-order polarization oscillates at their sum and difference frequencies. In either case, the second-order polarization acts as a source for the second-harmonic (or sum, or difference frequency) emission, provided χ(2) is nonzero. The latter can be arranged by selecting a noncentrosymmetric medium for the interaction, the growth of such media being an important area of materials science. Optically robust and transparent materials with large values of χ(2) are available for the generation of wavelengths from shorter than 200 nm to longer than 5 μm. Since such media are birefringent by design, a judicious choice of angle and orientation of the crystal with respect to the input beams allows a degree of control over the refractive indices experienced by each beam. Under the correct phase matching conditions n_ω ≈ n_2ω, and very long interaction lengths result, so the efficiency of signal generation is high.

Higher-order harmonic generation in gases is an area of growing importance for spectroscopy in the

Figure 2 Experimental geometries and corresponding phase matching diagrams for (a) CARS, (b) DFWM, (c) RIKES. Note that only the RIKES geometry is fully phase matched.

signal arises from the coherent oscillation of induced dipoles in the sample. Constructive interference can lead to a large enhancement of the signal strength. For this to occur the individual induced dipole moments must oscillate in phase – the signal must be phase matched. This requires the incident and the generated frequencies to travel in the sample with the same phase velocity, ω/k_ω = c/n_ω, where n_ω is the index of refraction and k_ω is the wave propagation constant at frequency ω. For the simplest case of second-harmonic generation (SHG), in which the incident field oscillates at ω and the generated field at 2ω, phase matching requires 2k_ω = k_2ω. The phase-matching condition for the most efficient generation of the second harmonic is when the phase mismatch Δk = 2k_ω − k_2ω = 0, but this is not generally fulfilled due to the dispersion of the medium, n_ω < n_2ω. In the case of SHG, a coherence length L can be defined as the distance traveled before the two waves are 180° out of phase, L = |π/Δk|. For cases in which the input frequencies also have different directions, as is often the case when laser beams at two frequencies are combined (e.g., CARS), the phase-matching condition must be expressed as a vectorial relationship; hence, for CARS, Δk = k_AS − 2k_1 + k_2 ≈ 0, where k_AS is the wavevector of the signal at the anti-Stokes frequency (Figure 2). Thus for known input wavevectors one can easily calculate the expected direction
CHEMICAL APPLICATIONS OF LASERS / Nonlinear Spectroscopies 49
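The coherence-length definition above is easy to check numerically. A minimal sketch, using an illustrative index mismatch rather than data for any particular crystal: with $\Delta k = 2k_\omega - k_{2\omega} = (4\pi/\lambda)(n_\omega - n_{2\omega})$, the coherence length reduces to $L = \lambda/(4|n_{2\omega} - n_\omega|)$.

```python
import math

def coherence_length(lam_m: float, n_w: float, n_2w: float) -> float:
    """L = |pi/dk| with dk = 2*k_w - k_2w = (4*pi/lam)*(n_w - n_2w)."""
    dk = (4.0 * math.pi / lam_m) * (n_w - n_2w)
    return abs(math.pi / dk)

# Illustrative (hypothetical) indices with dn = 0.02 at a 1064 nm fundamental:
L = coherence_length(1064e-9, 1.654, 1.674)
print(L * 1e6)   # ≈ 13.3 micrometers
```

For dispersion of order $10^{-2}$ the coherence length is only tens of micrometers, which is why birefringent phase matching ($n_\omega \approx n_{2\omega}$) is needed to reach useful interaction lengths.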
hostile environments. CARS spectroscopy has been widely applied to record the vibrational spectra of flames. Such measurements would obviously be very challenging for linear Raman or IR, due to the strong emission from the flame itself. The directional CARS signal, in contrast, can be spatially filtered, minimizing this problem. CARS has been used both to identify unstable species formed in flames, and to probe the temperature of the flame (e.g., from measured population differences, eqn [3]). A second advantage of the technique is that the signal is only generated when the two input beams overlap in space. Thus, small volumes of the sample can be probed. By moving the overlap position around in the sample, spatially resolved information is recovered. Thus, it is possible to map the population of a particular transient species in a flame.

CARS is probably the most widely used of the coherent Raman methods, but it does have some disadvantages, particularly in solution phase studies. In that case the resonant $\chi^{(3)}$ signal (eqn [3]) is accompanied by a nonresonant third-order background. The interference between these two components may result in unusual and difficult-to-interpret lineshapes. In this case, some other coherent Raman methods are more useful. The phase matching scheme for Raman-induced Kerr effect spectroscopy (RIKES) was shown in Figure 2. The RIKES signal is always phase matched, which leads to a long interaction length. However, the signal is at $\omega_2$ and in the direction of $\omega_2$, which would appear to be a severe disadvantage in terms of detection. Fortunately, if the input polarizations are correctly chosen, the signal can be isolated by its polarization. In an important geometry, the signal ($\omega_2$) is isolated by transmission through a polarizer oriented at 45° to a linearly polarized pump ($\omega_1$). The pump is overlapped in the sample with the probe ($\omega_2$), linearly polarized at −45°. Thus the probe is blocked by the polarizer, but the signal is transmitted. This geometry may be viewed as pump-induced polarization of the isotropic medium to render it birefringent, thus inducing ellipticity in the transmitted probe, such that the signal leaks through the analyzing polarizer (hence the alternative name optical Kerr effect).

In this geometry, it is possible to greatly enhance signal-to-noise ratios by exploiting interference between the probe beam and the signal. Placing a quarterwave plate in the probe beam with its fast axis aligned with the probe polarization, and reorienting it slightly (~1°), yields a slightly elliptically polarized probe before the sample. A fraction of the probe beam, the local oscillator (LO), then also leaks through the analyzing polarizer, temporally and spatially overlapped with the signal. Thus, the signal and LO fields are seen by the detector, which measures the intensity as:

$$I(t) = \frac{nc}{8\pi}\left|E_{\mathrm{LO}}(t) + E_{S}(t)\right|^{2} = I_{\mathrm{LO}}(t) + I_{S}(t) + \frac{nc}{4\pi}\,\mathrm{Re}\!\left[E_{S}^{*}(t)\cdot E_{\mathrm{LO}}(t)\right] \qquad [4]$$

where the final term may be very much larger than the original signal, and is linear in $\chi^{(3)}$. This term is usually isolated from the strong $I_{\mathrm{LO}}$ by lock-in detection. This method is called optical heterodyne detection (OHD), and generally leads to excellent signal to noise. It can be employed with other coherent signals by artificially adding the LO to the signal, provided great care is taken to ensure a fixed phase relationship between LO and signal. In the RIKES experiment, however, the phase relationship is automatic. The arrangement described yields an out-of-phase LO, and measures the real part of $\chi^{(3)}$, the birefringence. Alternatively, the quarterwave plate is omitted, and the analyzing polarizer is slightly reoriented, to introduce an in-phase LO, which measures the imaginary part of $\chi^{(3)}$, the dichroism. This is particularly useful for absorbing media. The OHD-RIKES method has been applied to measure the spectroscopy of the condensed phase, and has found particularly widespread application in transient studies (below).

Degenerate four wave mixing (DFWM) spectroscopy is a simple and widely used third-order spectroscopic method. As the name implies, only a single frequency is required. The laser beam is split into three, and recombined in the sample, in the geometry shown in Figure 2. The technique is also known as laser-induced grating scattering. The first two beams can be thought of as interfering in the sample to write a spatial grating, with fringe spacing dependent on the angle between them. The signal is then scattered from the third beam in the direction expected for diffraction from that grating. The DFWM experiment has been used to measure electronic spectra in hostile environments, by exploiting resonances with electronic transitions. It has also been popular in the determination of the numerical value of $\chi^{(3)}$, partly because it is an economical technique, requiring only a single laser, but also because different polarization combinations make it possible to access different elements of $\chi^{(3)}$. The technique has also been used in time-resolved experiments, where the decay of the grating is monitored by diffraction intensity from a time-delayed third pulse.

Two-photon or, more generally, multiphoton excitation has applications in both fundamental spectroscopy and analytical chemistry. Two relevant level
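The heterodyne advantage expressed by eqn [4] can be illustrated with a minimal numerical sketch. The field amplitudes here are arbitrary illustrative numbers, and the $nc/8\pi$ prefactor is dropped for clarity:

```python
# Heterodyne detection: the detector measures |E_LO + E_S|^2; the cross term
# is linear in the weak signal field and much larger than |E_S|^2 alone.
E_LO, E_S = 1.0, 1e-3         # real field amplitudes, LO >> signal

I_LO, I_S = E_LO**2, E_S**2
I_total = (E_LO + E_S)**2
cross = I_total - I_LO - I_S   # heterodyne term, = 2*E_S*E_LO for real fields

print(round(cross / I_S))      # 2000: a gain of 2*E_LO/E_S over homodyne detection
```

The cross term scales as the first power of the signal field, so a signal a thousand times weaker than the LO is still amplified two-thousand-fold relative to its homodyne intensity.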
demonstrated to yield 3D images of the distribution of vibrations in living cells. It would be difficult to recover such data by linear optical microscopy. SHG has also been applied in microscopy. In this case, by virtue of the symmetry selection rule referred to above, a 3D image of orientational order is recovered. Both these and other nonlinear signals provide important new information on complex heterogeneous samples, most especially living cells.

See also

Spectroscopy: Nonlinear Laser Spectroscopy; Raman Spectroscopy.

Further Reading

Andrews DL and Allcock P (2002) Optical Harmonics in Molecular Systems. Weinheim, Germany: Wiley-VCH.
Andrews DL and Demidov AA (eds) (2002) An Introduction to Laser Spectroscopy. New York: Kluwer Academic.
Butcher PN and Cotter D (1990) The Elements of Nonlinear Optics. Cambridge, UK: Cambridge University Press.
Cheng JX and Xie XS (2004) Coherent anti-Stokes Raman scattering microscopy: instrumentation, theory, and applications. Journal of Physical Chemistry B 108: 827.
de Boeij WP, Pshenichnikov MS and Wiersma DA (1998) Ultrafast solvation dynamics explored by femtosecond photon echo spectroscopies. Annual Review of Physical Chemistry 49: 99.
Eisenthal KB (1992) Equilibrium and dynamic processes at interfaces by second harmonic and sum frequency generation. Annual Review of Physical Chemistry 43: 627.
Fleming GR (1986) Chemical Applications of Ultrafast Spectroscopy. Oxford, UK: Oxford University Press.
Fleming GR and Cho M (1996) Chromophore–solvent dynamics. Annual Review of Physical Chemistry 47: 109.
Fourkas JT (2002) Higher-order optical correlation spectroscopy in liquids. Annual Review of Physical Chemistry 53: 17.
Hall G and Whitaker BJ (1994) Laser-induced grating spectroscopy. Journal of the Chemical Society, Faraday Transactions 90: 1.
Heinz TF (1991) Second order nonlinear optical effects at surfaces and interfaces. In: Ponath H-E and Stegeman GI (eds) Nonlinear Surface Electromagnetic Phenomena, pp. 353.
Hesselink WH and Wiersma DA (1983) Theory and experimental aspects of photon echoes in molecular solids. In: Hochstrasser RM and Agranovich VM (eds) Spectroscopy and Excitation Dynamics of Condensed Molecular Systems, chap. 6. Amsterdam: North Holland.
Levenson MD and Kano S (1988) Introduction to Nonlinear Laser Spectroscopy. San Diego, CA: Academic Press.
Meech SR (1993) Kinetic application of surface nonlinear optical signals. In: Lin SH, Fujimura Y, and Villaeys A (eds) Advances in Multiphoton Processes and Spectroscopy, vol. 8, pp. 281. Singapore: World Scientific.
Mukamel S (1995) Principles of Nonlinear Optical Spectroscopy. Oxford: Oxford University Press.
Rector KD and Fayer MD (1998) Vibrational echoes: a new approach to condensed-matter vibrational spectroscopy. International Reviews in Physical Chemistry 17: 261.
Richmond GL (2001) Structure and bonding of molecules at aqueous surfaces. Annual Review of Physical Chemistry 52: 357.
Shen YR (1984) The Principles of Nonlinear Optics. New York: Wiley.
Smith NA and Meech SR (2002) Optically heterodyne detected optical Kerr effect – applications in condensed phase dynamics. International Reviews in Physical Chemistry 21: 75.
Tolles WM, Nibler JW, McDonald JR and Harvey AB (1977) Review of theory and application of coherent anti-Stokes Raman spectroscopy (CARS). Applied Spectroscopy 31: 253.
Wright JC (2002) Coherent multidimensional vibrational spectroscopy. International Reviews in Physical Chemistry 21: 185.
Zipfel WR, Williams RM and Webb WW (2003) Nonlinear magic: multiphoton microscopy in the biosciences. Nature Biotechnology 21: 1368.
Zyss J (ed.) (1994) Molecular Nonlinear Optics. San Diego, CA: Academic Press.
generation of reactive oxidizing intermediates which are toxic to cells, and this process ultimately leads to tumor destruction. PDT is a relatively low-power, nonthermal, photochemical technique that uses fluence rates not exceeding 200 mW/cm² and total light doses, or fluences, of typically 100 J/cm². Generally red or near-infrared light is used, since tissue is relatively transparent at these wavelengths. It is a promising alternative approach to the local destruction of tumors for several reasons. First, it is a minimally invasive treatment, since laser light can be delivered with great accuracy to almost any site in the body via thin flexible optical fibers, with minimal damage to overlying normal tissues. Second, the nature of PDT damage to tissues is such that healing is safer and more complete than after most other forms of local tissue destruction (e.g., radiotherapy). PDT is also capable of some degree of selectivity for tumors when the sensitizer levels, light doses, and irradiation geometry are carefully controlled. This selectivity is based on the higher sensitizer retention in tumors after administration relative to the adjacent normal tissues in which the tumor arose (generally 3:1 for extracranial tumors, but up to 50:1 for brain tumors).

The photosensitizer is administered intravenously to the patient and time allowed (3–96 hours depending on the sensitizer) for it to equilibrate in the body before the light treatment (Figure 1). This time is called the drug–light interval. PDT may also be useful for treating certain nonmalignant conditions, in particular psoriasis and dysfunctional menorrhagia (a disorder of the uterus), and for local treatment of infections, such as genital papillomas and infections in the mouth and the upper gastrointestinal tract. In certain cases, the photosensitizer may be applied directly to the lesions, particularly for treatment of skin tumors, as discussed later. The main side-effect of PDT is skin photosensitivity, owing to retention of the drug in the skin, so patients must avoid exposure to sunlight in particular for a short period following treatment. Retreatment is then possible once the photosensitizer has cleared from the skin, since these drugs have little intrinsic toxicity, unlike many conventional chemotherapy agents.

Photoproperties of Photosensitizers

By definition, there are three fundamental requirements for obtaining a photodynamic effect: (a) light of the appropriate wavelength, matched to the photosensitizer absorption; (b) a photosensitizer; and (c) molecular oxygen. The ideal photochemical and biological properties of a photosensitizer may be easily summarized, although the assessment of a sensitizer in these terms is not as straightforward as might be supposed, because the heterogeneous nature of biological systems can sometimes profoundly affect these properties. Ideally, though, a sensitizer should possess the following attributes: (a) red or near-infrared light absorption; (b) nontoxic, and with low skin photosensitizing potency; (c) selective retention in tumors relative to normal adjacent tissue; (d) an efficient generator of cytotoxic species, usually singlet oxygen; (e) fluorescence, for visualization; (f) a defined chemical composition; and (g) preferably water soluble. A list of several photosensitizers possessing the majority of the above-mentioned attributes is given in Table 1.

The reasons for these requirements are partly self-evident, but worth amplifying. Strong absorption is desirable in the red and near-infrared spectral region, where tissue transmittance is optimum, enabling penetration of the therapeutic light within the tumor (Figure 2). To minimize skin photosensitization by solar radiation, the sensitizer absorption spectrum should ideally consist of a narrow red wavelength band, with little absorption at other wavelengths down to 400 nm, below which solar irradiation falls off steeply. Another advantage of red wavelength irradiation is that the potential mutagenic effects encountered with UV-excited sensitizers (e.g., psoralens) are avoided. Since the object of the treatment is the selective destruction of tumor tissue, leaving surrounding normal tissue undamaged, some degree of selective retention of the dye in tumor tissue is desirable. Unfortunately, the significance of this aspect has been exaggerated in the literature, and in many cases treatment selectivity owes more to careful light irradiation geometry. Nevertheless, many normal tissues have the capacity to heal safely following PDT damage.

The key photochemical property of photosensitizers is to mediate production of some active

Figure 1 Photodynamic therapy: from sensitization to treatment.
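The treatment parameters quoted above (fluence rates up to 200 mW/cm², total doses of typically 100 J/cm²) fix the irradiation time directly; a minimal sketch using those illustrative numbers:

```python
# Exposure time implied by a given fluence rate and total light dose.
fluence_rate_W_cm2 = 0.200     # 200 mW/cm^2, the upper limit quoted in the text
total_dose_J_cm2 = 100.0       # typical total light dose

exposure_s = total_dose_J_cm2 / fluence_rate_W_cm2
print(exposure_s, exposure_s / 60.0)   # 500 s, i.e. roughly 8 minutes
```

At these nonthermal power densities, delivering a full dose therefore takes minutes rather than seconds, which is one reason efficient light delivery matters clinically.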
CHEMICAL APPLICATIONS OF LASERS / Photodynamic Therapy of Cancer 55
Table 1 Photosensitizers used in PDT (column headings: Compound; λ/nm (ε/M⁻¹ cm⁻¹); Drug dose/mg kg⁻¹; Light dose/J cm⁻²; Diseases treated).
Figure 4 The photophysical and photochemical mechanisms involved in PDT. PS denotes the photosensitizer, SUB denotes the
substrate, either biological, a solvent or another photosensitizer; * denotes the excited state and † denotes a radical.
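A useful number for the type II energy-transfer step discussed in the surrounding text: the 94 kJ/mol singlet-oxygen excitation energy corresponds to a photon wavelength near 1270 nm, which sets a floor on usable sensitizer triplet-state energies. A quick check using standard physical constants:

```python
# Wavelength equivalent of the singlet-oxygen excitation energy (94 kJ/mol).
h = 6.62607015e-34      # Planck constant, J s
c = 2.99792458e8        # speed of light, m/s
N_A = 6.02214076e23     # Avogadro constant, 1/mol

E_per_molecule = 94e3 / N_A            # J per molecule
lam_nm = h * c / E_per_molecule * 1e9
print(lam_nm)                          # ≈ 1270 nm
```

A triplet state lying below 94 kJ/mol (i.e., corresponding to a wavelength longer than ~1270 nm) cannot drive the near-resonant transfer to O₂, which is why deep-infrared-absorbing dyes fail as type II sensitizers.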
produce free radicals or electron transfer, resulting in ion formation. The type II mechanism, in contrast, exclusively involves interaction between molecular oxygen and the triplet state to form singlet oxygen, which is highly reactive in biological systems. The near-resonant energy transfer from the triplet state to O₂ can be highly efficient, and the singlet oxygen yield can approach the triplet state yield, provided that the triplet state energy exceeds 94 kJ/mol, the singlet oxygen excitation energy. It is widely accepted that the type II mechanism underlies the oxygen-dependent phototoxicity of sensitizers used for photodynamic therapy. Both proteins and lipids (the main constituents of membranes) are susceptible to photooxidative damage induced via a type II mechanism, which generally results in the formation of unstable peroxide species. For example, unsaturated lipids may be oxidized by the ‘ene’ reaction, where the singlet oxygen reacts with a double bond. Other targets include such important biomolecules as cholesterol, certain amino acid residues, collagen, and the coenzyme NADPH. Another synergistic mode of action involves the occlusion of the microcirculation to tumors and reperfusion injury. Reperfusion injury relates to the formation of xanthine oxidase from xanthine dehydrogenase in anoxic conditions; the oxidase reacts with xanthine or hypoxanthine products of ATP dephosphorylation to convert the restored oxygen into superoxide anion, which, directly or via the Fenton reaction, causes cellular damage.

An important feature of the type II mechanism, which is sometimes overlooked, is that when the sensitizer transfers electronic energy to O₂ it returns to its ground state. Thus the cytotoxic singlet oxygen species is generated without chemical transformation of the sensitizer, which may then absorb another photon and repeat the cycle. Effectively, a single photosensitizer molecule is capable of generating many times its own concentration of singlet oxygen, which is clearly a very efficient means of photosensitization, provided the oxygen supply is adequate.

Lasers in PDT

Lasers are the most popular light source in PDT, since they have several key characteristics that differentiate them from conventional light sources, namely coherence, monochromaticity, and collimated output. The two main practical features that make them so useful in PDT are their monochromaticity and their combination with fiber-optic delivery. The monochromaticity is important since the laser can be tuned to a specific absorption peak of a photosensitizer, thus ensuring that all the energy delivered is utilized for the excitation and photodynamic activation of the photosensitizer. This is not true for a conventional light source (for example, a tungsten lamp), where the output power is divided over several hundred nanometers throughout the UV, visible, and IR regions, and only a fraction of the power lies within the absorption band of the photosensitizer. To illustrate this numerically, let us simplify things by representing the lamp output as a square profile and consider a photosensitizer absorption band of about 30 nm full width at half maximum. If a laser with a spectral span of about 2 nm and a lamp with the same power but a spectral span of about 600 nm are used, then the useful fraction of the lamp output is about 0.05; if we consider a Gaussian profile for the sensitizer absorption band, this fraction drops even lower, perhaps to 0.01. That means the lamp would achieve an excitation rate 100 times lower than that of the laser; in other words, we would require a lamp with an output power 100 times that of the laser to achieve the same rate of excitation, and consequently the same treatment time, provided we use a low enough laser power to remain within the linear regime of the photosensitizer excitation. The other major disadvantage of a lamp source is the lack of collimation, which results in low efficiency for fiber-optic delivery.

We now review the different lasers that have found application in PDT. Laser technology has significantly advanced in recent years, and there is now a range of options in PDT laser sources for closely matching the absorption profiles of the various photosensitizers. Moreover, since PDT is becoming more widely used clinically, practical considerations such as portability and turn-key operation are increasingly important. In Figure 5 the optical spectrum is shown from about 380 nm (violet–blue) to 800 nm (near-infrared, NIR). We have superimposed on this spectrum the light penetration depth of tissue, to roughly illustrate how deeply light can penetrate (and consequently activate photosensitizer molecules) into lower-lying tissue layers. The term ‘penetration depth’ is a measure of the light attenuation by the tissue, so the figure of 2 mm corresponds to 1/e attenuation of the incident intensity at a tissue depth of 2 mm. It is obvious that light at the blue end of the spectrum has only a superficial effect, whereas as we shift further into the red and NIR regions the penetration depth increases. This is due to two main factors: absorption and scattering of light by various tissue component molecules. For example, in the blue/green region the absorption of melanin and hemoglobin is relatively high. But there is a noticeable increase in penetration depth beyond about 600 nm, leading to an optimal wavelength region, or
Figure 5 Photosensitizers and laser light sources available in the visible and near-infrared spectral regions.
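The lamp-versus-laser comparison made in the text reduces to a simple overlap fraction. A minimal sketch of that arithmetic (the Gaussian-band figure of 0.01 is the text's quoted estimate, not computed here):

```python
# Square-profile estimate: only the part of the lamp's 600 nm output span that
# overlaps the ~30 nm sensitizer absorption band contributes to excitation.
lamp_span_nm = 600.0
band_fwhm_nm = 30.0

frac_square = band_fwhm_nm / lamp_span_nm    # useful fraction for a flat profile
frac_gauss = 0.01                            # text's estimate for a Gaussian band

print(frac_square, 1.0 / frac_gauss)         # 0.05, and a ~100x power penalty
```

This is why, watt for watt, a lamp matched to a narrow sensitizer band is roughly two orders of magnitude less effective than a laser tuned to the same band.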
‘therapeutic window’ for laser therapy of around 600–1100 nm (Figure 5). The fundamental wavelength of the Nd:YAG laser lies within this therapeutic window at 1064 nm, and this laser is now widely used in thermal laser therapy. Nd:YAG lasers can be operated in cw (output power ~200 W multimode), long-pulse (~500 W average power at 50 Hz), or Q-switched
(50 MW peak power at around 10 ns pulse duration) modes. Nd:YAG is a solid-state laser with a yttrium aluminum garnet crystal, doped with about 1% of trivalent Nd ions, as the active medium; with another transition of the Nd ion, this laser (with the choice of different optics) can also operate at 1320 nm. Although dyes are available which absorb from 800–1100 nm, the generation of singlet oxygen via the type II mechanism is energetically unfavorable, because the triplet state energies of these dyes are too low. However, the fundamental frequency of the laser can be doubled (second-harmonic generation, SHG) or tripled (third-harmonic generation, THG) with the use of nonlinear crystals, to upconvert the output radiation to the visible range, thus rendering this laser suitable for PDT: i.e., frequency doubling to 532 nm and tripling to 355 nm. Note that the 532 nm output is suitable for activation of the absorption band of hypericin, with a maximum at 550 nm, even though it is not tuned to the maximum of this absorption.

In the early days of PDT and other laser therapies, ion lasers were widely used. The argon ion laser uses an ionized argon plasma as gain medium, and produces two main wavelengths, at 488 nm and 514 nm. The second of these lies exactly at the maximum of the 514 nm m-THPC absorption. Argon ion lasers are operated in cw mode and usually have output powers in the region of 5–10 W at 514 nm (the most powerful argon ion line). Krypton ion lasers, which emit at 568 or 647 nm, are similar to their argon ion counterparts; however, they utilize an ionized krypton plasma as gain medium. The 568 nm output can be used to activate the Rose Bengal peak at 559 nm, whereas the 647 nm output, the most powerful line of krypton ion lasers, has been used for activation of m-THPC (652 nm).

Ideally, for optimal excitation, it is best to tune the laser wavelength exactly to the maxima of photosensitizer absorption bands. For this reason tunable organic dye lasers have been widely used for PDT. In these lasers the gain medium is an organic dye with a high quantum yield of fluorescence. Due to the broad nature of their fluorescence gain profile, tuning elements within the laser cavity can be used for selection of the lasing frequency within the gain band. Tunable dye lasers with the currently available dyes can quite easily cover the range from about 350–1000 nm. However, their operation is not of a ‘turn-key’ nature, since they require frequent replacement of their active material, either for tuning to a different spectral region or for better performance when the dye degrades. Tunable dye lasers also need to be pumped by some other light source, such as an argon ion laser, an excimer laser, a solid state laser (e.g., Nd:YAG at 532 or 355 nm), a copper vapor laser, or lamps. Dye lasers have been used for the excitation of HpD at 630 nm, protoporphyrin IX at 635 nm, photoprotoporphyrin at 670 nm, and phthalocyanines around 675 nm. A further example is hypericin which, apart from the band at 550 nm, has a second absorption band at 590 nm. This is the optimum operation wavelength of dye lasers with rhodamine 590, better known as rhodamine 6G.

Dye lasers can operate in either pulsed or cw mode. However, there is a potential disadvantage in using a pulsed laser for PDT, which becomes apparent with lower repetition rate, higher pulse energy (>1 mJ per pulse) laser excitation. High pulse energies can induce saturation or transient bleaching of the sensitizer during the laser pulse, and consequently much of the energy supplied is not efficiently absorbed by the sensitizer. It has been suggested that this effect accounts for the lack of photosensitized damage in tissue sensitized with a phthalocyanine and irradiated by a 5 Hz, 25 mJ-per-pulse flashlamp-pumped dye laser. However, using a low pulse energy, high repetition rate copper vapor pumped dye laser (see below), the results were indistinguishable from cw irradiation with an argon ion pumped dye laser.

Another laser that has found clinical application in PDT is the copper vapor laser. In this laser the active medium is copper vapor at high temperature, maintained in the tube by a repetitively pulsed discharge current. Copper vapor lasers are pulsed, with typical pulse durations of about 50 ns and repetition rates reaching 20 kHz. They have been operated at quite high average powers, up to about 40 W. The output radiation is produced at two distinct wavelengths, namely 511 and 578 nm, both of which have found clinical use. Copper vapor pumped dye lasers have also been widely used for activating red-absorbing photosensitizers, and the analogous gold vapor lasers, operating at 628 nm, have been used to activate HpD.

A relatively new class of tunable lasers is the solid-state Ti:sapphire laser. The gain medium in this laser is a ~1% Ti-doped sapphire (Al₂O₃) crystal, and its output may be tuned throughout the 690–1100 nm region, covering many photosensitizer absorption peaks. For example, most of the bacteriochlorin-type photosensitizers have their absorption bands in that spectral region, e.g., m-THPBC, with an absorption maximum at 740 nm. Also, texaphyrin sensitizers absorb strongly in this spectral region: lutetium texaphyrin has an absorption maximum at 732 nm.

Although tunable dye or solid-state lasers are widely used in laboratory studies for PDT, fixed-wavelength semiconductor diode lasers are gradually supplanting tunable lasers, owing to their practical convenience. It is now generally the case that when
a photosensitizer enters clinical trials or enters standard clinical use, laser manufacturers provide dedicated diode lasers with outputs matched to the chosen sensitizer. In this context there are diode lasers available at 630 nm for use with HpD, at 635 nm for use with ALA-induced protoporphyrin IX, at 670 nm for ATSX-10 or phthalocyanines, or even in the region of 760 nm for use with bacteriochlorins. The advantage of these diode lasers is that they are tailor-made for use with a particular photosensitizer, they are highly portable, and they have relatively easy turn-key operation.

So far we have concentrated on red wavelengths, but most of the porphyrin and chlorin family of photosensitizers exhibit quite strong absorption bands in the blue region, known as ‘Soret’ bands. Despite the fact that tissue penetration in this spectral region is minimal, these bands are far more intense than the corresponding red absorption bands of the sensitizer. In this respect it is possible to activate these sensitizers at blue wavelengths, especially in the case of treatment of superficial (e.g., skin) malignant lesions. There are now blue diode lasers (and light-emitting diodes) for the activation of these bands, in particular at 405 nm, where the Soret band of protoporphyrin IX lies, or at 430 nm, for selective excitation of the photoprotoporphyrin species. For very superficial lesions (e.g., in the retina) it may even be possible to use multiphoton excitation provided by femtosecond diode lasers operating at about 800 nm.

Clinical Applications of Lasers in PDT

In the previous section we reviewed the various types of lasers being used for PDT, but their combination with fiber-optic delivery is also of key importance for their clinical application. The laser light may be delivered to the target lesion either externally, using surface irradiation, or internally within the lesion, which is denoted interstitial irradiation, as depicted in Figure 6. For example, in the case of superficial cutaneous lesions the laser irradiation is applied externally (Figure 6a). The area of the lesion is marked prior to treatment to determine the diameter of the laser spot. For a given fluence rate (100–200 mW/cm²) and the area of the spot, the required laser output power is then calculated. A multimode optical fiber, terminated with a microlens to ensure uniform irradiation, is then positioned at a distance from the lesion yielding the desired beam waist, and the surrounding normal tissue is shielded with dark material. However, if the lesion is a deeper-lying solid tumor, interstitial PDT is employed
Figure 6 Clinical application of PDT. (a) Surface treatment. (b) Interstitial treatment.
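For the surface-treatment geometry of Figure 6a, the required laser output power follows from the fluence rate and the spot area. A minimal sketch with a hypothetical lesion size (the 2 cm spot is an assumption for illustration only):

```python
import math

# Surface irradiation: power needed for a circular spot at a given fluence rate.
spot_diameter_cm = 2.0          # hypothetical lesion/spot size
fluence_rate_W_cm2 = 0.150      # mid-range of the 100-200 mW/cm^2 quoted above

area_cm2 = math.pi * (spot_diameter_cm / 2.0) ** 2   # ~3.14 cm^2
power_W = fluence_rate_W_cm2 * area_cm2
print(power_W)                  # ≈ 0.47 W at the tissue surface
```

Sub-watt output therefore suffices for small cutaneous lesions, which is within easy reach of the dedicated diode lasers described above.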
(Figure 6b). Surgical needles are inserted in an equidistant parallel pattern within the tumor, either freehand or under image (MRI/CT) guidance. Bare-tip optical fibers are guided through the surgical needles to the lesion and flagged for position. A beamsplitter is used to divide the laser output into 2–4 components, so that two to four optical fibers can be simultaneously employed. The laser power is adjusted so that the output power of each of the optical fibers is 100–200 mW. Even light distribution, in the form of overlapping spheres of approximately 1 cm diameter, ensures the treatment of an equivalent volume of tissue around each fiber tip. Once each ‘station’ is treated in this way, the fibers are repositioned accordingly, employing a pull-back technique, in order for another station to be treated. Once the volume of the tumor has been scanned in this manner, the treatment is complete. Interstitial light delivery allows PDT to be used in the treatment of large, buried tumors, and is particularly suitable for those in which surgery would involve extensive resection.

The treatment of tumors of internal hollow organs is also possible with PDT. The most representative example is the treatment of precancerous lesions in the esophagus, known medically as ‘Barrett’s esophagus’. In this case, a balloon applicator is used to house a special fiber with a light-diffusing tip, or ‘diffuser’ (Figure 7). These diffusers are variable in length and transmit light uniformly along the whole of their length, in order to facilitate treatment of circumferential lesions. The balloon applicator is endoscopically inserted into the patient. The balloon is inflated to stretch the esophageal walls, and the flagged diffuser fiber is placed in position, central to the balloon optical window. In this way the whole of the lesion may be treated circumferentially at the same time.

Finally, the clinical response to PDT treatment is shown in Figure 8. The treated area becomes necrotic and inflamed within two to four days following PDT. This necrotic tissue usually sloughs off or is mechanically removed. The area eventually heals with minimal scarring (Figure 8).

Future Prospects

A major factor in the future development of PDT will be the application of relatively cheap and portable semiconductor diode lasers. High-power systems, consisting of visible diode arrays coupled into multimode fibers with an output of several watts, have recently become available, and these will greatly ease the technical difficulties which have held back PDT. The use of new photosensitizers in the treatment of certain tumors by PDT is making steady progress, and although toxicology testing and clinical evaluation are lengthy and costly processes, the expectation is that these compounds will become more widely available for patient use during the next five years. Regarding the current clinical status of PDT, Photofrin has recently been approved for treatment of lung and esophageal cancers in Europe, the USA, and Japan. Clinical trials are in progress for several of the second-generation photosensitizers, which offer significant improvements over Photofrin in terms of chemical purity, photoproperties, and skin
62 CHEMICAL APPLICATIONS OF LASERS / Pump and Probe Studies of Femtosecond Kinetics
clearance. With the increasing clinical application of Dolmans DEJGJ, Fukumura D and Jain RK (2003)
compact visible diode lasers the prospects for photo- Photodynamic therapy for cancer. Nature Reviews
dynamic therapy are therefore encouraging. (Cancer) 3: 380 – 387.
Hopper C (2000) Photodynamic therapy: a clinical
reality in the treatment of cancer. Lancet Oncology
See also 1: 212 – 219.
Lasers: Dye Lasers; Free Electron Lasers; Metal Vapor Milgrom L and MacRobert AJ (1998) Light years ahead.
Lasers. Ultrafast Laser Techniques: Generation of Chemistry in Britain 34: 45– 50.
Femtosecond Pulses. Rosen GM, Britigan BE, Halpern HJ and Pou S (1999) Free
Radicals Biology and Detection by Spin Trapping. New
York: Oxford University Press.
Further Reading Svelto O (1989) Principles of Lasers, 3rd edn. New York:
Bonnet R (2000) Chemical Aspects of Photodynamic Plenum Press.
Therapy. Amsterdam: Gordon and Breach Science Vo-Dinh T (2003) Biomedical Photonics Handbook. Boca
Publishers. Raton, FL: CRC Press.
spectrum of the ground-state absorption (see Figure 5). Thus the transmitted probe intensity is increased relative to that without the pump pulse. The GSR contribution to ΔI_pr(ω, T) decreases with T according to the excited-state lifetime of the photoexcited species.

Figure 5 (a) shows the absorption and emission spectrum of a dilute solution of oxazine-1 in water. (b) shows the transient absorption spectrum of this system at various delays after pumping at 600 nm using picosecond pulses. The transient spectrum, ground state recovery, and stimulated emission resemble the sum of the absorption and emission spectra.

The Coherent Spike

To calculate the pump-probe signal over delay times that are of the order of the pulse duration, one must calculate all possible ways that a signal can be generated with respect to all possible time orderings of the two pump interactions and the probe interactions (Liouville space pathways). Such a calculation reveals coherent as well as sequential contributions to the response of the system owing to the entanglement of time orderings that arise from the overlap of pulses of finite duration. The nonclassical, coherent, contributions dominate the signal for the time period during which pump and probe pulses overlap, leading to the coherent spike (or coherent 'artifact') (Figure 6).

Nuclear Wavepackets

The electronic absorption spectrum represents a sum over all possible vibronic transitions, each weighted by a corresponding nuclear overlap factor according to the Franck–Condon principle. It was recognized by Heller and co-workers that in the time-domain representation of the electronic absorption spectrum, the sum over vibrational overlaps is replaced by the propagation of an initially excited nuclear wavepacket on the excited state according to $|i(t)\rangle = \exp(-iHt/\hbar)\,|i(0)\rangle$. Thus the absorption spectrum is written as

$$\sigma_i(\omega) = \frac{2\pi\omega}{3c}\int_{-\infty}^{\infty} dt\,\exp\!\left[-i(\omega - \omega_{eg}^{i})t - g(t)\right]\langle i(0)|i(t)\rangle \qquad [6]$$

for mode i. The Hamiltonian H is determined by the displacement of mode i, Δ_i. A large displacement results in a strong coupling between the vibration and the electronic transition from state g to e. In the frequency domain this is seen as an intense vibronic progression. In the time domain this translates to an oscillation in the time-dependent overlap ⟨i(0)|i(t)⟩, with frequency ω_i, thereby modulating the intensity of the linear response as a function of t.

It follows that an impulsively excited coherent superposition of vibrational modes, created by an optical pulse that is much shorter than the vibrational period of each mode, propagates on the excited state as a wavepacket. The oscillations of this wavepacket then modulate the intensity of the transmitted probe pulse as a function of T to introduce features into the pump-probe signal known as quantum beats (see Figure 6). In fact, nuclear wavepackets propagate on both the ground and excited states in a pump-probe experiment. The short probe pulse interrogates the evolution of the nuclear wavepackets by projecting them onto the manifold of vibrational wavefunctions in an excited or ground state.
Femtosecond Kinetics

According to eqn [4] the kinetics of the evolution of the system initiated by the pump pulse are governed by the terms K_SE, K_ESA, and K_GSR. Each of these terms describes the kinetics of decay, production, or recovery of excited state and ground state populations. In the simplest conceivable case, each might be a single exponential function. The stimulated emission term K_SE contains information about the depopulation of the initially excited state, which, for example, may occur through a chemical reaction involving the excited state species or energy transfer from that state. The excited state absorption term K_ESA contains information on the formation and decay of all excited state populations that give a transient absorption signal. The ground-state recovery term K_GSR contains information on the timescales for return of all population to the ground state, which will occur by radiative and nonradiative processes. Usually the kinetics are assumed to follow a multi-exponential law such that

$$K_m(T; \lambda_{exc}, \omega) = \sum_{j=1}^{N} A_j(\lambda_{exc}, \omega)\,\exp(-T/\tau_j) \qquad [7]$$

where the amplitude A_j, but not the decay time coefficient τ_j, of each contribution in the coupled N-component system depends on the excitation wavelength λ_exc and the signal frequency ω. The index m denotes SE, ESA, or GSR.

In order to extract meaningful physical parameters from analysis of pump-probe data, a kinetic scheme in the form of eqn [7] needs to be constructed, based on a physical model to connect the decay of initially excited state population to formation of a product excited state and subsequent recovery of the ground-state population. An example of such a model is depicted in Figure 7. Here the stimulated emission contribution to the signal is attenuated according to the rate that state 1 goes to 2. The transient absorption appears at that same rate and decays according to the radiationless process 2 to 3. The rate of ground state recovery depends on 1 to 2 to 3 to 5 and possibly pathways involving the isomer 4.

Figure 7 Free energy curves corresponding to a model for an excited state isomerization reaction, with various contributions to a pump-probe measurement indicated. See text for a description. (Courtesy of Jordanides XJ and Fleming GR.)

The kinetic scheme is not easily extracted from pump-probe data because contributions from each of K_SE, K_ESA, and K_GSR contribute additively at each detection wavelength ω, as shown in eqns [2]–[4]. Moreover, additional, wavelength-dependent, decay or rise components may appear in the data on ultrafast timescales owing to the time-dependent Stokes shift (the g″(t) terms indicated in eqn [4]). A spectral shift can resemble either a decay or a rise component. In addition, the very fastest dynamics may be hidden beneath the coherent spike. For these reasons, simple fitting of pump-probe data usually cannot reveal the underlying kinetic model. Instead the data should be simulated according to eqns [3] and [4].

Alternative strategies involve global analysis of wavelength-resolved data to extract the kinetic evolution of spectral features, known as 'species associated decay spectra'. Global analysis of data effectively reduces the number of adjustable parameters in the fitting procedure and allows one to obtain physically meaningful information by associating species with their spectra. These methods are described in detail by Holzwarth.

List of Units and Nomenclature

ESA   excited state absorption
GSR   ground state recovery
SE    stimulated emission
S_n   nth singlet excited state
TR3   time-resolved resonance Raman
χ(n)  nth-order nonlinear optical susceptibility

See also

Coherent Lightwave Systems. Coherent Transients: Ultrafast Studies of Semiconductors. Nonlinear Optics, Applications: Raman Lasers. Optical Amplifiers: Raman, Brillouin and Parametric Amplifiers. Scattering: Raman Scattering. Spectroscopy: Raman Spectroscopy.
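The multi-exponential law of eqn [7] and the global-analysis idea described above can be sketched numerically. In this toy version (all lifetimes and amplitudes are invented) the lifetimes τ_j are shared between detection wavelengths, so with the τ_j held fixed the wavelength-dependent amplitudes follow from a single linear least-squares solve; in a real analysis the lifetimes would also be optimized, typically by nonlinear least squares:

```python
import numpy as np

# Eqn [7]: K_m(T) = sum_j A_j * exp(-T / tau_j), with lifetimes shared
# between detection wavelengths. All numbers here are invented.
taus = np.array([0.5, 5.0])            # shared decay times (ps)
T = np.linspace(0.0, 20.0, 400)        # pump-probe delay (ps)

def kinetics(amps):
    """Multi-exponential law of eqn [7] for one detection wavelength."""
    return sum(a * np.exp(-T / tau) for a, tau in zip(amps, taus))

# Synthetic signals at two probe wavelengths (one row per wavelength)
data = np.vstack([kinetics([1.0, 0.3]), kinetics([-0.4, 0.8])])

# Global analysis: with tau_j fixed, every wavelength's amplitudes come
# from one linear least-squares fit against the same exponential basis,
# which is what keeps the number of adjustable parameters small.
basis = np.exp(-T[None, :] / taus[:, None])     # shape (components, delays)
amps, *_ = np.linalg.lstsq(basis.T, data.T, rcond=None)
print(np.round(amps.T, 3))   # one amplitude pair per wavelength
```

The amplitude patterns recovered across many wavelengths are exactly the 'species associated decay spectra' mentioned above.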
68 CHEMICAL APPLICATIONS OF LASERS / Time-Correlated Single-Photon Counting
cards that contain all the timing electronics, etc., required for TCSPC, for example, Becker and Hickl (Germany).

This approach to obtaining the fluorescence decay profile can only work if the detected photons are 'randomly selected' from the total emission from the sample. In practice, this means that the rate of detection of photons should be of the order of 1% of the excitation rate: that is, only one in a hundred excitation pulses gives rise to a detected photon. This histogram mirrors the fluorescence decay of the sample convolved with the instrument response function (IRF). The IRF is itself a convolution of the temporal response of the optical light pulse, the optical path in the collection optics, and the response of the detector and ancillary electronics. Experimentally, the IRF is usually recorded by placing a scattering material in the spectrometer.

In principle, the decay function of the sample can be extracted by deconvolution of the IRF from the decay function, although in practice it is more common to use the method of iterative re-convolution using a model decay function(s). Thus, the experimentally determined IRF is convolved with a model decay function and parameters within the function are systematically varied to obtain the best fit for a specific model. The most commonly used model assumes the decaying species to follow one or more exponential functions (sum of exponentials) or a distribution of exponentials. Further details regarding the mechanics of data analysis and fitting procedures can be found in the literature. A typical fit is illustrated in Figure 2.
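The iterative re-convolution loop can be sketched in a few lines. The Gaussian IRF, the 1.50 ns lifetime (borrowed from Figure 2), and the grid search standing in for a proper chi-square minimizer are all illustrative simplifications:

```python
import numpy as np

# Iterative re-convolution sketch: convolve a (here synthetic, Gaussian) IRF
# with a single-exponential trial decay and vary the lifetime for the best fit.
dt = 0.01                                     # channel width (ns)
t = np.arange(0.0, 10.0, dt)
irf = np.exp(-0.5 * ((t - 1.0) / 0.1) ** 2)   # IRF centred at 1 ns, ~235 ps FWHM
irf /= irf.sum()

def model(tau):
    """IRF convolved with exp(-t/tau), truncated to the measurement window."""
    return np.convolve(irf, np.exp(-t / tau))[: t.size]

data = model(1.50)                            # noise-free stand-in for a decay

# Grid search in place of the usual nonlinear chi-square minimization
taus = np.arange(1.0, 2.0, 0.01)
chi2 = [np.sum((model(tau) - data) ** 2) for tau in taus]
best = float(taus[int(np.argmin(chi2))])
print(round(best, 2))                         # recovers the 1.50 ns input
```

With real, Poisson-noisy counts the sum of squares would be weighted by the counts per channel, which is where the reduced chi-square and Durbin–Watson statistics quoted with Figure 2 come from.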
The main aim of this review is to illustrate how recent advances in laser technology have changed the face of TCSPC and to discuss the types of laser light sources that are in use. The easiest way of approaching this topic is to define the criteria for an ideal light source for this experiment, and then to discuss the ways in which new technologies are meeting these criteria.

Figure 1 Schematic diagram of a TCSPC spectrometer. Fluorescence from the sample is collected by L1, which may also include a monochromator/wavelength selection filter. A synchronization pulse is obtained either from the laser driver electronics directly, or via a photodiode and second constant fraction discriminator. The TAC is shown operating in a 'reversed mode' as is normal with a high repetition rate laser source.

Figure 2 The fluorescence decay of Rhodamine B in water. The sample was excited using a 397 nm pulsed laser diode (1 MHz, ~200 ps FWHM), and the emission was observed at 575 nm. Deconvolution of the IRF (dark gray trace) and observed decay (light gray trace) with a single exponential function gave a lifetime of 1.50 ns, with χ²_R = 1.07 and a Durbin–Watson parameter of 1.78. The fitted curve is shown overlaid in gray, and the weighted residuals are shown offset in black.

High Repetition Rate

A typical decay data set may contain something of the order of 0.1–5 million 'counts': it is desirable to …
Nd:YAG pump lasers, this medium currently dominates the entire ultrafast laser market, from high repetition, low pulse energy systems through to systems generating very high pulse energies at kHz repetition rates.

Ti-sapphire is, in many respects, an ideal laser gain medium: it has a high energy storage capacity, a very broad gain bandwidth, and good thermal properties. Following the first demonstration of laser action in a Ti-sapphire crystal, the CW Ti-sapphire laser was quickly commercialized. It proved to be a remarkable new laser medium, providing greater efficiency, with outputs of the order of 1 W. The laser also provided an unprecedentedly broad tuning range, spanning from just below 700 nm to 1000 nm. The most important breakthrough for time-resolved users came in 1991, when self-mode-locking in the medium was demonstrated. This self-mode-locking, sometimes referred to as Kerr-lens mode-locking, arises from the nonlinear refractive index of the material and is implemented by simply placing an aperture within the cavity at an appropriate point. The broad gain bandwidth of the medium results in very short pulses: pulse durations of the order of tens of femtoseconds can be routinely achieved with commercial laser systems.

The mode-locked Ti-sapphire laser was quickly seized upon and commercialized. The first generation of systems used multiwatt CW argon ion lasers to pump them, making the systems unwieldy and expensive to run, but in recent years these have been replaced by diode-pumped Nd:YAG lasers, providing an all solid-state, ultrafast laser source. Equally significantly, the current generation of lasers are very efficient and do not require significant electrical power or cooling capacity. There are several commercial suppliers of pulsed Ti-sapphire lasers, including Coherent and Spectra Physics.

A typical mode-locked Ti-sapphire laser provides an 80 MHz train of pulses with a FWHM of ~100 fs, with an average power of up to 1.5 W, and is tuneable from 700 nm to almost 1000 nm. The relatively high peak powers associated with these pulses mean that second- or even third-harmonic generation is relatively efficient, providing radiation in the ranges 350–480 nm and 240–320 nm. The output powers of the second and third harmonics far exceed those required for TCSPC experiments, making them an attractive source for the study of fluorescence lifetime measurements. The simple mode-locked Ti-sapphire laser has some drawbacks: the very short pulse duration provides a broad-bandwidth excitation, the repetition rate is often too high with an inter-pulse separation of ~12.5 ns, and the laser does not provide continuous tuning from the deep-UV through to the visible. Incrementally, all of these points have been addressed by the laser manufacturers.

As well as operating as a source of femtosecond pulses, with their associated bandwidths of many tens of nm, some modern commercial Ti-sapphire lasers can, by adjustment of the optical path and components, be operated in a mode that provides picosecond pulses whilst retaining their broad tuning curve. Under these conditions, the bandwidth of the output is greatly reduced, making the laser resemble a synchronously pumped mode-locked dye laser.

As stated above, a very high repetition rate can be undesirable if the system under study has a long fluorescence lifetime, due to the fluorescence from an earlier pulse not decaying completely before the next excitation pulse. Variation of the output repetition frequency can be addressed by two means. The first is to use a pulse-picker. This is an acousto-optic device which is placed outside the laser cavity. An electrical signal is applied to the device, synchronized to the frequency of the mode-locked laser, to provide pulses at frequencies from ~4 MHz down to single-shot. The operation of a pulse picker is somewhat inefficient, with less than half of the original pulse energy being obtained in the output, although output power is not usually an issue when using these sources. More importantly, when using pulse-pickers, it is essential to ensure that there is no pre- or post-pulse which can lead to artifacts in the observed fluorescence decays. The second means of reducing the repetition rate is cavity dumping. This is achieved by extending the laser cavity and placing an acousto-optic device, known as a Bragg cell, at a beam-waist within the cavity. When a signal is applied to the intracavity Bragg cell, synchronized with the passage of a mode-locked pulse through the device, a pulse of light is extracted from the cavity. The great advantage of cavity-dumping is that it builds up energy within the optical cavity, and when pulses are extracted these are of a higher pulse energy than the mode-locked laser alone. This in turn leads to more efficient harmonic generation. The addition of the cavity dumper system extends the laser cavity, hence reducing the overall frequency of the mode-locked laser to ~50 MHz, and can provide pulses at 9 MHz down to single shot.
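The pile-up argument for pulse-picking can be quantified with one line of arithmetic: the fraction of fluorescence left when the next pulse of an 80 MHz train arrives is exp(−12.5 ns/τ). The lifetimes below are arbitrary examples:

```python
import math

# Residual fluorescence from the previous pulse at the moment the next
# pulse of an 80 MHz train (12.5 ns spacing) arrives, for example lifetimes.
rep_rate_hz = 80e6
spacing_ns = 1e9 / rep_rate_hz          # 12.5 ns between excitation pulses

for tau_ns in (1.0, 5.0, 20.0):
    residual = math.exp(-spacing_ns / tau_ns)
    print(f"tau = {tau_ns:4.1f} ns -> {100 * residual:4.1f}% of the decay remains")
```

For a 20 ns lifetime more than half of the previous decay is still present at the next pulse, which is exactly the situation a pulse-picker or cavity dumper is meant to avoid.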
Figure 3 An illustration of the spectral coverage provided by currently available pulsed semiconductor lasers and LEDs, and a modern Ti-sapphire laser and OPO. Note that the OPO's output is frequency doubled, and the wavelengths in the range ~440–525 nm may be obtained by sum frequency mixing of the Ti-sapphire's fundamental output with the output of the OPO. * denotes that there are many long wavelength pulsed laser diodes falling in the range λ > 720 nm.
beams of longer wavelengths. Synchronous pumping of the OPOs with either a fs or ps Ti-sapphire laser ensures reasonably high efficiencies. The outputs from the OPO may be frequency doubled, with the option of intracavity doubling enhancing the efficiency still further. Using an 800 nm pump wavelength, an OPO provides typical signal and idler tuning ranges of 1050 to >1350 nm and 2.1–3.0 μm, respectively. The signal can be doubled to give 525–680 nm, effectively filling the visible spectrum. The average powers obtainable from Ti-sapphire pumped OPOs make them eminently suitable for TCSPC experiments. A summary of the spectral coverage of the Ti-sapphire laser and OPO, and of low-cost solid-state sources, is illustrated in Figure 3.

An additional advantage of the Ti-sapphire laser is the possibility of multiphoton excitation. The combination of high average power and very short pulse duration of a mode-locked Ti-sapphire laser means that the peak powers of the pulses can be very high. Focusing the near-IR output of the laser gives rise to exceedingly high photon densities within a sample, which can lead to the simultaneous absorption of two or more photons. This is of particular significance when recording fluorescence lifetimes in conjunction with a microscope, particularly when the microscope is operated in a confocal configuration. Such systems are now being used for a derivative of TCSPC, known as fluorescence lifetime imaging (FLIM).

In summary, the Ti-sapphire laser is a very versatile and flexible source that is well suited to the TCSPC experiment. Off-the-shelf systems provide broad spectral coverage and variable repetition rates. Undoubtedly this medium will be the mainstay of TCSPC for some time to come.

Diode Lasers

Low-power diode lasers, with outputs in the range 630–1000 nm, have been commercially available for some time. These devices, which are normally operated as CW sources, can be operated in a pulsed mode, providing a high frequency train of sub-ns pulses. However, the relatively long output wavelengths produced by this first generation of diode lasers severely restricted the range of chromophores that could be excited, and this proved to be a limiting factor for their use as TCSPC sources. Attempts to generate shorter wavelengths from these NIR laser diodes by second harmonic generation (SHG) were never realized: the SHG process is extremely inefficient due to the very low peak output power of these lasers. Shorter wavelengths could be obtained by pulsing conventional light-emitting diodes (LEDs), which were commercially available with emission wavelengths covering the visible spectrum down to ~450 nm. However, these had the disadvantage of exhibiting longer pulse durations than those achievable from laser diodes and have
CHEMICAL APPLICATIONS OF LASERS / Transient Holographic Grating Techniques in Chemical Dynamics 73
much broader emission bandwidths than a laser source.

The introduction of violet and blue diode lasers and UV-LEDs by the Nichia Corporation (Japan) in the late 1990s resulted in a renewed interest in sources utilizing these small, solid-state devices. At least three manufacturers produce ranges of commercial TCSPC light sources based upon pulsed LEDs and laser diodes, and can now boast a number of pulsed laser sources in their ranges including 375, 400, 450, 473, 635, 670, and 830 nm. Currently, commercial pulsed laser diodes and LEDs are manufactured by Jobin-Yvon IBH (UK), PicoQuant (Germany), and Hamamatsu (Japan). These small, relatively low-cost devices offer pulse durations of ~100 ps and excitation repetition rates of up to tens of MHz, making them ideally suited to 'routine' lifetime measurements. Although these diode laser sources are not tunable and do not span the full spectrum, they are complemented by pulsed LEDs which fill the gap in the visible spectrum. Pulsed LEDs generally have a longer pulse duration and broader, lower intensity spectral output, but are useful for some applications. At the time of writing there are no diode laser sources providing wavelengths of <350 nm, although Jobin-Yvon IBH (UK) have recently announced pulsed LEDs operating at 280 nm and 340 nm. As LEDs, and possibly diode lasers, emitting at these shorter wavelengths become more widely available, the prospects of low running costs and ease of use make these semiconductor devices an extremely attractive option for many TCSPC experiments. Commercial TCSPC spectrometers based around these systems offer a comparatively low-cost, turnkey approach to fluorescence lifetime based experiments, opening up what was a highly specialized field to a much broader range of users.

See also

Lasers: Noble Gas Ion Lasers. Light Emitting Diodes. Ultrafast Laser Techniques: Pulse Characterization Techniques.

Further Reading

Lakowicz JR (ed.) (2001) Principles of Fluorescence Spectroscopy. Plenum Press.
O'Connor DV and Phillips D (1984) Time-Correlated Single Photon Counting.
Spence DE, Kean PN and Sibbett W (1991) Optics Letters 16: 42–44.
$$\tilde{n} = n + iK \qquad [3]$$

where n is the refractive index and K is the attenuation constant.

The hologram created by the interaction of the crossed pump pulses consists in periodic, one-dimensional spatial modulations of n and K. Such distributions are nothing but phase and amplitude gratings, respectively. A third laser beam at the probe wavelength, λ_pr, striking these gratings at the Bragg angle, θ_B = arcsin(λ_pr/2Λ), will thus be partially diffracted (Figure 1b). The diffraction efficiency, η, depends on the modulation amplitude of the optical properties. In the limit of small diffraction …

where I_dif and I_pr are the diffracted and the probe intensity, respectively, d is the sample thickness, and A = 4πdK/(λ ln 10) is the average absorbance. The first and the second terms in the square bracket describe the contribution of the amplitude and phase gratings, respectively, and the exponential term accounts for the reabsorption of the diffracted beam by the sample.

Figure 1 Principle of the transient grating technique: (a) grating formation (pumping), (b) grating detection (probing).

The main processes responsible for the variation of the optical properties of an isotropic dielectric material are summarized in Figure 2.

The modulation of the absorbance, ΔA, is essentially due to the photoinduced concentration change, ΔC, of the different chemical species i (excited state, photoproduct, …):

$$\Delta A(\lambda_{pr}) = \sum_i \varepsilon_i(\lambda_{pr})\,\Delta C_i \qquad [5]$$

where ε_i is the absorption coefficient of the species i.

The variation of the refractive index, Δn, has several origins and can be expressed as

$$\Delta n = \Delta n_K + \Delta n_p + \Delta n_d \qquad [6]$$

Δn_K is the variation of refractive index due to the optical Kerr effect (OKE). This nonresonant interaction results in an electronic polarization (electronic OKE) and/or in a nuclear reorientation of the molecules (nuclear OKE) along the direction of the electric field associated with the pump pulses. As a consequence, a transient birefringence is created in the material. This effect is usually discussed within the framework of nonlinear optics in terms of the intensity-dependent refractive index or third-order nonlinear susceptibility. Electronic OKE occurs in any dielectric material under sufficiently high light intensity. On the other hand, nuclear OKE is mostly observed in liquids and gases and depends strongly on the molecular shape.

Δn_p is the change of refractive index related to population changes. Its magnitude and wavelength dependence can be obtained by Kramers–Kronig transformation of ΔA(λ) or ΔK(λ):

$$\Delta n_p(\lambda) = \frac{1}{2\pi^2}\int_0^{\infty}\frac{\Delta K(\lambda')}{1 - (\lambda'/\lambda)^2}\,d\lambda' \qquad [7]$$

Δn_d is the change of refractive index associated with density changes. Density phase gratings can have essentially three origins: … that of the reactant, and a positive volume change can be expected. This will lead to a decrease of the density and to a negative Δn_vd.

Finally, Δn_ed is related to electrostriction in the sample by the electric field of the pump pulses. Like OKE, this is a nonresonant process that also contributes to the intensity-dependent refractive index. Electrostriction leads to material compression in the regions of high electric field strength. The periodic compression is accompanied by the generation of two counterpropagating acoustic waves with wave vectors k_ac = ±(2π/Λ)î, where î is the unit vector along the modulation axis. The interference of these acoustic waves leads to a temporal modulation of Δn_ed at the acoustic frequency ν_ac, with 2πν_ac = k_ac v_s, v_s being the speed of sound. As Δn_ed oscillates between negative and positive values, the diffracted intensity, which is proportional to (Δn_ed)², shows a temporal oscillation at twice the acoustic frequency. In most cases, Δn_ed is weak and can be neglected if the pump pulses are within an absorption band of the sample.

The modulation amplitudes of absorbance and refractive index are not constant in time and their temporal behavior depends on various dynamic processes in the sample. The whole point of the transient grating techniques is precisely the measurement of the diffracted intensity as a function of time after excitation to deduce dynamic information on the system. In the following, we will show that the various processes shown in Figure 2 that give rise to a diffracted signal can in principle be separated by choosing appropriate experimental parameters, such as the timescale, the probe wavelength, the polarization of the four beams, or the crossing angle.
Applications

The Transient Density Phase Grating Technique

If the grating is probed at a wavelength far from any absorption band, the variation of absorbance, ΔA(λ_pr), is zero and the corresponding change of refractive index, Δn_p(λ_pr), is negligibly small. In this case, eqn [4] simplifies to

$$\eta \approx \left(\frac{\pi d}{\lambda_{pr}\cos\theta_B}\right)^2 \Delta n_d^2 \qquad [9]$$

In principle, the diffracted signal may also contain contributions from the optical Kerr effect, Δn_K, but we will assume here that this non-resonant …

Figure 5 Time profile of the diffracted intensity measured with a solution of malachite green after excitation with two pulses with close to counterpropagating geometry.
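For a feel of the magnitudes in eqn [9], the following sketch evaluates η for plausible but invented values of the sample thickness, Bragg angle, and density-grating amplitude:

```python
import math

# Order-of-magnitude estimate of the diffraction efficiency from eqn [9]:
#   eta ~ (pi * d / (lambda_pr * cos(theta_B)))**2 * dn_d**2
# All values are invented for illustration.
d = 100e-6              # sample thickness (m)
dn_d = 1e-5             # density-grating refractive index modulation
lambda_pr = 590e-9      # probe wavelength (m)
theta_B = math.radians(22.0)   # Bragg angle

eta = (math.pi * d * dn_d / (lambda_pr * math.cos(theta_B))) ** 2
print(f"eta ~ {eta:.1e}")      # weak diffraction, but detected background free
```

Even a refractive index modulation of 1e-5 diffracts a measurable fraction of the probe, which is why the background-free geometry is so sensitive.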
Λ = λ_pu/2n and, with UV pump pulses, an acoustic period of the order of 100 ps can be obtained in a typical organic solvent.

For example, Figure 7a shows the energy diagram for a photoinduced electron transfer (ET) reaction between benzophenone (BP) and an electron donor (D) in a polar solvent. Upon excitation at 355 nm, ¹BP* undergoes intersystem crossing (ISC) to ³BP* with a time constant of about 10 ps. After diffusional encounter with the electron donor, ET takes place and a pair of ions is generated. With this system, the whole energy of the 355 nm photon (E = 3.49 eV) is converted into heat. The different heat-releasing processes can be differentiated according to their timescale. The vibrational relaxation to ¹BP* and the ensuing ISC to ³BP* induce an ultrafast release of 0.49 eV as heat. With donor concentrations of the order of 0.1 M, the heat deposition process due to the electron transfer is typically in the ns range. Finally, the recombination of the ions produces a heat release in the microsecond timescale. Figure 7b shows the time profile of the diffracted intensity measured after excitation at 355 nm of BP with 0.05 M D in acetonitrile. The 30 ps pump pulses were crossed at 27° (τ_ac = 565 ps) and the 10 ps probe pulses were at 590 nm. The oscillatory behavior is due to the ultrafast heat released upon formation of ³BP*, while the slow dc rise is caused by the heat dissipated upon ET. As the amount of energy released in the ultrafast process is known, the energy released upon ET can be determined by comparing the amplitudes of the fast and slow components of the time profile. The energetics as well as the dynamics of the photoinduced ET process can thus be determined. Figure 7c shows the decay of the diffracted intensity due to the washing out of the grating by thermal diffusion. As both the acoustic frequency and the thermal diffusion depend on the grating wavevector, the experimental time window in which a heat-releasing process can be measured depends mainly on the crossing angle of the pump pulses, as shown in Table 1.
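The geometry of this example can be checked numerically. The sketch below combines the crossing-angle relation Λ = λ_pu/(2 sin(θ/2)) with the Bragg-angle formula quoted earlier; the speed of sound for acetonitrile is an approximate literature value (my assumption, not from this article), and the result comes out in the vicinity of the 565 ps acoustic period quoted above, with the residual difference attributable to the exact in-sample angle and v_s:

```python
import math

# Transient grating geometry for the benzophenone example: 355 nm pump
# pulses crossed at 27 degrees, probed at 590 nm. The speed of sound of
# acetonitrile is an approximate literature value, not from the article.
lambda_pu = 355e-9                    # pump wavelength (m)
lambda_pr = 590e-9                    # probe wavelength (m)
theta = math.radians(27.0)            # pump crossing angle

spacing = lambda_pu / (2 * math.sin(theta / 2))               # fringe spacing Lambda
theta_B = math.degrees(math.asin(lambda_pr / (2 * spacing)))  # probe Bragg angle
v_s = 1280.0                          # speed of sound in acetonitrile (m/s), approx.
tau_ac = spacing / v_s                # acoustic period Lambda / v_s

print(f"Lambda = {spacing*1e9:.0f} nm, theta_B = {theta_B:.1f} deg, "
      f"tau_ac = {tau_ac*1e12:.0f} ps")
```

Shrinking the crossing angle widens the fringes and lengthens τ_ac, which is exactly the trade-off summarized in Table 1.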
phase grating, such as that shown in Figure 7c. This technique can be used with a large variety of bulk materials as well as films, surfaces, and interfaces.

Investigation of Population Dynamics

The dynamic properties of a photogenerated species can be investigated by using a probe wavelength within its absorption or dispersion spectrum. As shown by eqn [4], either ΔA or Δn_p has to be different from zero. For this application, Δn_K and Δn_d should ideally be equal to zero. Unless working with ultrashort pulses (<1 ps) and a weakly absorbing sample, Δn_K can be neglected. Practically, it is almost impossible to find a sample system for which some fraction of the energy absorbed as light is not released as heat. However, by using a sufficiently small crossing angle of the pump pulses, the formation of the density phase grating, which depends on the acoustic period, can take as much as 30 to 40 ns. In this case, Δn_d is negligible during the first few ns after excitation and the diffracted intensity is due to the population grating only:

$$I_{dif}(t) \propto \sum_i \Delta C_i^2(t) \qquad [14]$$

where ΔC_i is the modulation amplitude of the concentration of every species i whose absorption and/or dispersion spectrum overlaps with the probe wavelength.

The temporal variation of ΔC_i can be due either to the dynamics of the species i in the illuminated grating fringes or to processes taking place between the fringes, such as translational diffusion, excitation transport, and charge diffusion. In liquids, the decay of the population grating by translational diffusion is slow and occurs in the microsecond to millisecond timescale, depending on the fringe spacing. As thermal diffusion is typically a hundred times faster, eqn [14] is again valid in this long timescale. Therefore, if the population dynamics is very slow, the translational diffusion coefficient of a chemical species can be obtained by measuring the decay of the diffracted intensity as a function of the fringe spacing. This procedure has also been used to determine the temperature of flames. In this case, however, the decay of the population grating by translational diffusion occurs typically in the sub-ns timescale.

In the condensed phase, these interfringe processes … case, the transient grating technique is similar to transient absorption, and it thus allows the measurement of population dynamics. However, because holographic detection is background free, it is at least a hundred times more sensitive than transient absorption.

The population gratings are usually probed with monochromatic laser pulses. This procedure is well suited for simple photoinduced processes, such as the decay of an excited state population to the ground state. For example, Figure 8 shows the decay of the diffracted intensity at 532 nm measured after excitation of a cyanine dye at the same wavelength. These dynamics correspond to the ground state recovery of the dye by non-radiative deactivation of the first singlet excited state. Because the diffracted intensity is proportional to the square of the concentration changes (see eqn [14]), its decay time is twice as short as the ground state recovery time. If the reaction is more complex, or if several intermediates are involved, the population grating has to be probed at several wavelengths. Instead of repeating many single wavelength experiments, it is preferable to perform multiplex transient grating. In this case, the grating is probed by white light pulses generated by focusing high-intensity fs or ps laser pulses in a dispersive medium. If the crossing angle of the pump pulses is small enough (<1°), the Bragg angle for probing is almost independent of the wavelength. A transient grating spectrum is obtained by dispersing the diffracted signal in a spectrograph. This spectrum consists in the sum of the square of the transient absorption and transient dispersion spectra. Practically, it is very similar to a transient absorption spectrum, but with a much superior signal-to-noise ratio. Figure 9 shows the transient grating spectra measured at various time delays after excitation of a solution of chloranil (CA) and …

Figure 8 Time profile of the diffracted intensity at 532 nm …
are of minor importance when measuring the measured after excitation at the same wavelength of a cyanine
diffracted intensity in the short timescale, i.e., before dye (inset) in solution and best fit assuming exponential ground
the formation of the density phase grating. In this state recovery.
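The factor of two between the signal decay and the population decay can be checked numerically. The following Python sketch is not from the article; the 4 ns recovery time is an assumed example value. It fits the apparent decay time of the squared concentration modulation of eqn [14]:

```python
import numpy as np

# Population grating signal, eqn [14]: the diffracted intensity is
# proportional to the square of the concentration modulation, so it
# decays twice as fast as the ground-state recovery itself.
T = 4.0e-9                             # assumed ground-state recovery time (4 ns)
t = np.linspace(0.0, 2.0e-8, 2001)     # probe delays

dC = np.exp(-t / T)                    # concentration modulation amplitude
I_dif = dC ** 2                        # diffracted intensity

slope = np.polyfit(t, np.log(I_dif), 1)[0]
T_apparent = -1.0 / slope              # decay time of the diffracted signal
print(round(T_apparent / T, 6))        # -> 0.5
```

The fitted signal lifetime is half the population lifetime, which is why the ground-state recovery time extracted from a fit such as that of Figure 8 is twice the measured decay time of the diffracted intensity.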
80 CHEMICAL APPLICATIONS OF LASERS / Transient Holographic Grating Techniques in Chemical Dynamics
The tensor R(p) depends on the polarization anisotropy of the sample, r, created by excitation with polarized pump pulses:

   r(t) = [N∥(t) − N⊥(t)] / [N∥(t) + 2N⊥(t)] = (2/5) P₂[cos(γ)] exp(−k_re t)   [19]

where N∥ and N⊥ are the numbers of molecules with the transition dipole oriented parallel and perpendicular to the polarization of the probe pulse, respectively, γ is the angle between the transition dipoles involved in the pump and probe processes, P₂ is the second Legendre polynomial, and k_re is the rate constant for the reorientation of the transition dipole, for example, by rotational diffusion of the molecule or by energy hopping.

The table shows that R₁₂₁₂(d) = 0, i.e., the contribution of the density grating can be eliminated with the set of polarizations (0°, 90°, 0°, 90°), the so-called crossed grating geometry. In this geometry, the two pump pulses have orthogonal polarizations, the polarization of the probe pulse is parallel to that of one pump pulse, and the polarization component of the signal that is orthogonal to the polarization of the probe pulse is measured. In this geometry, R₁₂₁₂(p) is nonzero as long as there is some polarization anisotropy (r ≠ 0). In this case, the diffracted intensity is …

… diffusion, the excited lifetime of rhodamine being about 4 ns.

On the other hand, if the decay of r is slower than that of ΔC, the crossed grating geometry can be used to measure the population dynamics without any interference from the density phase grating. For example, Figure 11 shows the time profile of the diffracted intensity after excitation of a suspension of TiO₂ particles in water. The upper profile was measured with the set of polarizations (0°, 0°, 0°, 0°) and thus reflects the time dependence of R₁₁₁₁(p) and R₁₁₁₁(d). R₁₁₁₁(p) is due to the trapped electron population, which decays by charge recombination, and R₁₁₁₁(d) is due to the heat dissipated upon both charge separation and recombination. The lower time profile was measured in the crossed grating geometry and thus reflects the time dependence of R₁₂₁₂, in particular that of R₁₂₁₂(p).

Each contribution to the signal can be selectively eliminated by using the set of geometry (ζ, 45°, 0°, 0°), where the value of the 'magic angle' ζ for each contribution is listed in Table 2. This approach allows, for example, the nuclear and electronic contributions to the optical Kerr effect to be measured separately.
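The behavior of the anisotropy in eqn [19] can be sketched numerically in Python. The rate constant below is an arbitrary example value, and the 54.7° angle at which P₂ vanishes is a property of eqn [19] itself (the 'magic angle' ζ of Table 2 for the signal polarization is a related but distinct quantity):

```python
import numpy as np

def P2(x):
    """Second Legendre polynomial, P2(x) = (3x^2 - 1)/2."""
    return 0.5 * (3.0 * x ** 2 - 1.0)

def anisotropy(t, gamma, k_re):
    """Polarization anisotropy of eqn [19]: r(t) = (2/5) P2(cos gamma) e^{-k_re t}."""
    return 0.4 * P2(np.cos(gamma)) * np.exp(-k_re * t)

k_re = 1.0e9                               # assumed reorientation rate [s^-1]
print(anisotropy(0.0, 0.0, k_re))          # parallel dipoles: r(0) = 0.4
magic = np.arccos(1.0 / np.sqrt(3.0))      # ~54.7 degrees, where P2 vanishes
print(abs(anisotropy(0.0, magic, k_re)) < 1e-12)   # True: r = 0 at all times
```

For parallel pump and probe transition dipoles the initial anisotropy takes its maximum value of 2/5, while at 54.7° it vanishes for all times, independently of the reorientation rate.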
Concluding Remarks

The transient grating techniques offer a large variety of applications for investigating the dynamics of chemical processes. We have only discussed the cases where the pump pulses are time coincident and at the same wavelength. Excitation with pump pulses at different wavelengths results in a moving grating. The well-known CARS spectroscopy is such a moving grating technique. Finally, the three-pulse photon echo can be considered as a special case of transient grating, where the sample is excited by two pump pulses which are at the same wavelength but are not time coincident.

List of Units and Nomenclature

C       concentration [mol L⁻¹]
C_v     heat capacity [J K⁻¹ kg⁻¹]
d       sample thickness [m]
D_th    thermal diffusivity [m² s⁻¹]
I_dif   diffracted intensity [W m⁻²]
I_pr    intensity of the probe pulse [W m⁻²]
I_pu    intensity of a pump pulse [W m⁻²]
k̃_ac    acoustic wavevector [m⁻¹]
k_r     rate constant of a heat-releasing process [s⁻¹]
k_re    rate constant of reorientation [s⁻¹]
k_th    rate constant of thermal diffusion [s⁻¹]
K       attenuation constant
n       refractive index
ñ       complex refractive index
N       number of molecules per unit volume [m⁻³]
r       polarization anisotropy
R       fourth-rank response tensor [m² V⁻²]
v_s     speed of sound [m s⁻¹]
V       volume [m³]
α_ac    acoustic attenuation constant [m⁻¹]
β       cubic volume expansion coefficient [K⁻¹]
γ       angle between transition dipoles
Λ       fringe spacing [m]
Δn_d    variation of refractive index due to density changes
Δn_ed   Δn_d due to electrostriction
Δn_K    variation of refractive index due to optical Kerr effect
Δn_d^p  Δn_d due to volume changes
Δn_p    variation of refractive index due to population changes
Δn_td   Δn_d due to temperature changes
Δx      peak-to-null variation of the parameter x
ε       molar decadic absorption coefficient [cm⁻¹ L mol⁻¹]
ζ       angle of polarization of the diffracted signal
η       diffraction efficiency
θ_B     Bragg angle (angle of incidence of the probe pulse)
θ_pu    angle of incidence of a pump pulse
λ_pr    probe wavelength [m]
λ_pu    pump wavelength [m]
ρ       density [kg m⁻³]
τ_ac    acoustic period [s]
ν_ac    acoustic frequency [s⁻¹]

See also

Chemical Applications of Lasers: Nonlinear Spectroscopies; Pump and Probe Studies of Femtosecond Kinetics. Holography, Applications: Holographic Recording Materials and their Processing. Holography, Techniques: Holographic Interferometry. Materials Characterization Techniques: χ(3). Nonlinear Optics Basics: Four-Wave Mixing. Ultrafast Technology: Femtosecond Chemical Dynamics, Gas Phase; Femtosecond Condensed Phase Spectroscopy – Structural Dynamics.

Further Reading

Bräuchle C and Burland DM (1983) Holographic methods for the investigation of photochemical and photophysical properties of molecules. Angewandte Chemie International Edition in English 22: 582–598.
Eichler HJ, Günter P and Pohl DW (1986) Laser-Induced Dynamic Gratings. Berlin: Springer.
Fleming GR (1986) Chemical Applications of Ultrafast Spectroscopy. Oxford: Oxford University Press.
Fourkas JT and Fayer MD (1992) The transient grating: a holographic window to dynamic processes. Accounts of Chemical Research 25: 227–233.
Hall G and Whitaker BJ (1994) Laser-induced grating spectroscopy. Journal of the Chemical Society, Faraday Transactions 90: 1–16.
Levenson MD and Kano SS (1987) Introduction to Nonlinear Laser Spectroscopy, revised edn. Boston: Academic Press.
Mukamel S (1995) Nonlinear Optical Spectroscopy. Oxford: Oxford University Press.
Rullière C (ed.) (1998) Femtosecond Laser Pulses. Berlin: Springer.
Terazima M (1998) Photothermal studies of photophysical and photochemical processes by the transient grating method. Advances in Photochemistry 24: 255–338.
84 COHERENCE / Overview
cross-sectional area 10⁴ times and dispersed the beam across it. The CPA technique makes it possible to use conventional laser amplifiers and to stay below the onset of nonlinear effects.

See also

Diffraction: Diffraction Gratings. Ultrafast Laser Techniques: Pulse Characterization Techniques.
COHERENCE
Contents
Overview
Coherence and Imaging
Speckle and Coherence
Figure 3 Spatial coherence and coherence area.

… interference produced using the radiation from points within this volume will produce fringes of good contrast.

Mathematical Description of Coherence

Analytical Field Representation

Coherence properties associated with fields are best analyzed in terms of a complex representation for optical fields. Let the real function V^(r)(r,t) represent the scalar field, which could be one of the transverse Cartesian components of the electric field associated with the electromagnetic wave. One can then define a complex analytical signal V(r,t) such that

   V^(r)(r,t) = Re[V(r,t)],   V(r,t) = ∫₀^∞ Ṽ^(r)(r,ν) exp(−2πiνt) dν   [5]

Time averages, denoted ⟨·⟩, are taken as

   ⟨f(t)⟩ = lim_{T→∞} (1/2T) ∫_{−T}^{T} f(t) dt   [6]

Mutual Coherence

In order to define the mutual coherence, the key concept in the theory of coherence, we consider Young's interference experiment, as shown in Figure 4, where the radiation from a broad source of size S illuminates a screen with two pinholes P₁ and P₂. The light emerging from the two pinholes produces interference, which is observed on another screen at a distance R from the first screen. The fields at point P due to the pinholes would be K₁V(r₁, t − t₁) and K₂V(r₂, t − t₂), respectively, where r₁ and r₂ define the positions of P₁ and P₂, t₁ and t₂ are the times taken by the light to travel to P from P₁ and P₂, and K₁ and K₂ are imaginary constants that depend on the geometry and size of the respective pinhole and its distance from the point P. Thus, the resultant field at point P would be given by

   V(r,t) = K₁V(r₁, t − t₁) + K₂V(r₂, t − t₂)   [7]

Since the optical periods are extremely small compared to the response time of a detector, a detector placed at point P will record only the time-averaged intensity:

   I(P) = ⟨V*(r,t) V(r,t)⟩   [8]
where some constants have been ignored. With eqn [7], the intensity at point P would be given by

   I(P) = I₁ + I₂ + 2 Re{K₁K₂* Γ(r₁, r₂; t − t₁, t − t₂)}   [9]

where I₁ and I₂ are the intensities at point P due to the radiation from pinholes P₁ and P₂ independently (defined as Iᵢ = ⟨|KᵢV(rᵢ,t)|²⟩), and

   Γ(r₁, r₂; t − t₁, t − t₂) ≡ Γ(r₁, r₂, τ) = ⟨V*(r₁,t) V(r₂, t + τ)⟩   [10]

is the mutual coherence function of the fields at P₁ and P₂; it depends on the time difference τ = t₂ − t₁, since the random process is assumed to be stationary. This function is also sometimes denoted by Γ₁₂(τ). The function Γᵢᵢ(τ) ≡ Γ(rᵢ, rᵢ, τ) defines the self-coherence of the light from the pinhole at Pᵢ, and |Kᵢ|² Γᵢᵢ(0) defines the intensity Iᵢ at point P due to the light from pinhole Pᵢ. The mutual coherence function can be normalized as

   γ₁₂(τ) ≡ γ(r₁, r₂, τ) = Γ(r₁, r₂, τ) / [Γ₁₁(0) Γ₂₂(0)]^(1/2)   [11]

which is called the complex degree of coherence. It can be shown that 0 ≤ |γ₁₂(τ)| ≤ 1. The intensity at point P, given by eqn [9], can now be written as

   I(P) = I₁ + I₂ + 2√(I₁I₂) Re γ(r₁, r₂, τ)   [12]

Expressing γ(r₁, r₂, τ) as

   γ(r₁, r₂, τ) = |γ(r₁, r₂, τ)| exp[iα(r₁, r₂, τ) − 2πiν₀τ]   [13]

where α(r₁, r₂, τ) = arg[γ(r₁, r₂, τ)] + 2πν₀τ and ν₀ is the mean frequency of the light, the intensity in eqn [12] can be written as

   I(P) = I₁ + I₂ + 2√(I₁I₂) |γ(r₁, r₂, τ)| cos[α(r₁, r₂, τ) − 2πν₀τ]   [14]

Now, if we assume that the source is quasi-monochromatic, i.e., its spectral width Δν ≪ ν₀, the quantities γ(r₁,r₂,τ) and α(r₁,r₂,τ) vary slowly over the observation screen, and the interference fringes are mainly due to the cosine term. Thus, defining the visibility of fringes as 𝒱 = (I_max − I_min)/(I_max + I_min), we obtain

   𝒱 = [2(I₁I₂)^(1/2)/(I₁ + I₂)] |γ(r₁, r₂, τ)|   [15]

which shows that for maximum visibility the two interfering fields must be completely coherent (|γ(r₁,r₂,τ)| = 1). On the other hand, if the fields are completely incoherent (|γ(r₁,r₂,τ)| = 0), no fringes are observed (I_max = I_min). The fields are said to be partially coherent when 0 < |γ(r₁,r₂,τ)| < 1. When I₁ = I₂, the visibility is the same as |γ(r₁,r₂,τ)|. The relation in eqn [15] shows that in an interference experiment one can obtain the modulus of the complex degree of coherence by measuring I₁, I₂, and the visibility. Similarly, eqn [14] shows that from the measurement of the positions of the maxima one can obtain the phase of the complex degree of coherence, α(r₁, r₂, τ).

The self-coherence, Γ₁₁(τ), of the light from pinhole P₁ is a direct measure of the temporal coherence of the source. On the other hand, if the source is of finite size and we observe the interference at point O, which corresponds to τ = 0, the mutual coherence function would be

   Γ₁₂(0) = Γ(r₁, r₂, 0) = ⟨V*(r₁,t) V(r₂,t)⟩ ≡ J₁₂   [17]

which is called the mutual intensity and is a direct measure of the spatial coherence of the source. In general, however, the function Γ₁₂(τ) measures, for a source of finite size and spectral width, a combination of the temporal and spatial coherence, and only in some limiting cases can the two types of coherence be separated.

Spectral Representation of Mutual Coherence

One can also analyze the correlation between two fields in the spectral domain. In particular, one can define the cross-spectral density function W(r₁, r₂, ν), which defines the correlation between the amplitudes of the spectral components of frequency ν of the light
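Equation [15] can be verified with a short numerical sketch (Python; the intensities and |γ| below are assumed example values, not taken from the article):

```python
import numpy as np

# Two-beam interference with partially coherent light, eqn [14]:
# I = I1 + I2 + 2 sqrt(I1 I2) |gamma| cos(phase)
I1, I2, mod_gamma = 1.0, 0.25, 0.6
phase = np.linspace(0.0, 2.0 * np.pi, 4001)   # slowly varying cosine argument
I = I1 + I2 + 2.0 * np.sqrt(I1 * I2) * mod_gamma * np.cos(phase)

# fringe visibility compared with the prediction of eqn [15]
visibility = (I.max() - I.min()) / (I.max() + I.min())
predicted = 2.0 * np.sqrt(I1 * I2) / (I1 + I2) * mod_gamma
print(visibility, predicted)   # both approximately 0.48
```

With equal intensities (I₁ = I₂) the prefactor becomes unity and the measured visibility directly equals |γ(r₁,r₂,τ)|, which is the basis of visibility-based coherence measurements.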
Γ(r₁,r₂,τ) satisfies the wave equations:

   ∇₁² Γ(r₁,r₂,τ) = (1/c²) ∂²Γ(r₁,r₂,τ)/∂τ²

   ∇₂² Γ(r₁,r₂,τ) = (1/c²) ∂²Γ(r₁,r₂,τ)/∂τ²   [24]

where ∇ⱼ² is the Laplacian with respect to the point rⱼ. Here the first of eqns [24], for instance, gives the variation of the mutual coherence function with respect to r₁ and τ for fixed r₂. Further, the variable τ is the time difference defined through the path difference, since τ = (R₂ − R₁)/c, and the actual time does not affect the mutual coherence function (as the fields are assumed to be stationary).

Using the relation in eqn [20], one can also obtain from eqn [24] the propagation equations for the cross-spectral density function W(r₁, r₂, ν):

   ∇₁² W(r₁,r₂,ν) + k² W(r₁,r₂,ν) = 0

   ∇₂² W(r₁,r₂,ν) + k² W(r₁,r₂,ν) = 0   [25]

where k = 2πν/c.

Thompson and Wolf Experiment

In this experiment, narrow band light from a mercury lamp (not shown in the figure) illuminated a hole of size 2a in an opaque screen A. A mask consisting of two pinholes, each of diameter 2b, with their axes separated by a distance d, was placed symmetrically about the optical axis of the experimental setup at plane B between two lenses L₁ and L₂, each having focal length f. The source was at the front focal plane of the lens L₁ and the observations were made at the plane C at the back focus of the lens L₂. The separation d was varied to study the spatial coherence function on the mask plane by measuring the visibility and the phase of the fringes formed in plane C.

Using the van Cittert–Zernike theorem, the complex degree of coherence at the pinholes P₁ and P₂ on the mask after the lens L₁ is

   γ₁₂ = |γ₁₂| exp(iβ₁₂) = 2J₁(v)/v   with v = 2πνad/(cf)   [26]

for symmetrically placed pinholes about the optical axis. Here β₁₂ is the phase of the complex degree of coherence and, in this special case where the two holes are equidistant from the axis, β₁₂ is either zero or π, respectively, for positive or negative values of 2J₁(v)/v. Let us assume that the intensities at the two pinholes P₁ and P₂ are equal. The interference pattern observed at the back focal plane of lens L₂ is due to the superposition of the light diffracted from the pinholes. The beams are partially coherent, with degree of coherence given by eqn [26]. Since the pinholes are symmetrically placed, the intensity due to either of the pinholes at a point P at the back focal plane of the lens L₂ is the same and is given by the Fraunhofer formula for diffraction from circular apertures, i.e.:

   I₁(P) = I₂(P) = |2J₁(u)/u|²   with u = 2πνb sin(φ)/c   [27]

The resulting intensity pattern is then

   I(P) = 2I₁(P)[1 + |2J₁(v)/v| cos(δ + β₁₂)]   [28]

where δ = 2πνd sin(φ)/c is the phase difference between the two beams reaching P from P₁ and P₂. For the on-axis point O, the quantity δ is zero.

The maximum and minimum values of I(P) are given by

   I_max(P) = 2I₁(P)[1 + |2J₁(v)/v|]   [29a]

   I_min(P) = 2I₁(P)[1 − |2J₁(v)/v|]   [29b]
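The falloff of coherence with source size or pinhole separation in eqn [26] can be sketched numerically. In this Python example (not from the article) J₁ is evaluated from its standard integral representation so that only NumPy is needed:

```python
import numpy as np

def J1(v):
    """Bessel function of the first kind, order 1, from the integral form
    J1(v) = (1/pi) * Integral_0^pi cos(theta - v sin(theta)) d(theta)."""
    theta = np.linspace(0.0, np.pi, 20001)
    y = np.cos(theta - v * np.sin(theta))
    h = theta[1] - theta[0]
    return (y.sum() - 0.5 * (y[0] + y[-1])) * h / np.pi   # trapezoidal rule

def coherence_degree(v):
    """|gamma_12| = |2 J1(v)/v| of eqn [26], with v = 2 pi nu a d/(c f)."""
    return 1.0 if v == 0.0 else abs(2.0 * J1(v) / v)

print(coherence_degree(0.0))          # 1.0: point-like source, full coherence
print(coherence_degree(1.0))          # ~0.880: partial coherence
print(coherence_degree(3.83) < 0.01)  # True: near the first zero of J1,
                                      # the fringes practically vanish
```

Near the first zero of J₁ the fringe visibility vanishes, and beyond it 2J₁(v)/v changes sign, which is exactly the β₁₂ = π phase jump discussed above.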
Figure 6 Schematic of the Thompson and Wolf experiment.

Figure 7 Two-beam interference patterns obtained by using partially coherent light. The wavelength of light used was 0.579 μm and the separation d of the pinholes in the screen at B was 0.5 cm. The figure shows the observed fringe pattern and the calculated intensity variation for three sizes of the secondary source. The dashed lines show the maximum and minimum intensity. The values of the diameter, 2a, of the source and the corresponding values of the magnitude, |γ₁₂|, and the phase, β₁₂, are also shown in each case. Reproduced with permission from Thompson BJ (1958) Illustration of phase change in two-beam interference with partially coherent light. Journal of the Optical Society of America 48: 95–97.

Types of Fields

As mentioned above, γ₁₂(τ) is a measure of the correlation of the complex field at any two points P₁ and P₂ at a specific time delay τ. Extending this definition, an optical field may be called coherent or incoherent if |γ₁₂(τ)| = 1 or |γ₁₂(τ)| = 0, respectively, for all pairs of points in the field and for all time delays. In the following, we consider some specific cases of fields and their properties.

Perfectly Coherent Fields

A field would be termed perfectly coherent, or self-coherent, at a fixed point if it has the property that |γ(R,R,τ)| = 1 at some specific point R in the domain of the field for all values of time delay τ. It can also be shown that γ(R,R,τ) is periodic in time, i.e.:

   γ(R,R,τ) = exp(−2πiν₀τ) for ν₀ > 0   [30]

For any other point r in the domain of the field, γ(R,r,τ) and γ(r,R,τ) are also unimodular and periodic in time.

A perfectly coherent optical field at two fixed points has the property that |γ(R₁,R₂,τ)| = 1 for any two fixed points R₁ and R₂ in the domain of the field for all values of time delay τ. For such a field, it can be shown that γ(R₁,R₂,τ) = exp[i(β − 2πν₀τ)], where ν₀ (> 0) and β are real constants. Further, as a corollary, |γ(R₁,R₁,τ)| = 1 and |γ(R₂,R₂,τ)| = 1 for all τ, i.e., the field is self-coherent at each of the field points R₁ and R₂, and

   γ(R₁,R₁,τ) = γ(R₂,R₂,τ) = exp(−2πiν₀τ) for ν₀ > 0

   γ(R₁,R₂,τ) = γ(R₁,R₂,0) exp(−2πiν₀τ)

It can be shown that for any other point r within the field

   γ(r,R₂,τ) = γ(R₁,R₂,0) γ(r,R₁,τ) = γ(r,R₁,τ) exp(iβ)   [31]

A perfectly coherent optical field for all points in the domain of the field has the property that |γ(r₁,r₂,τ)| = 1 for all pairs of points r₁ and r₂ in the domain of the field, for all values of time delay τ.
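The unimodular, periodic self-coherence of eqn [30] can be checked directly for a strictly monochromatic field (a Python sketch; frequency and delays are arbitrary example values):

```python
import numpy as np

nu0 = 3.0                                   # assumed frequency (arbitrary units)
t = np.linspace(0.0, 10.0, 100001)          # averaging window
V = np.exp(-2j * np.pi * nu0 * t)           # monochromatic analytic signal

def gamma(tau):
    """Degree of self-coherence <V*(t) V(t+tau)> / <|V(t)|^2> (here |V| = 1)."""
    return np.mean(np.conj(V) * np.exp(-2j * np.pi * nu0 * (t + tau)))

for tau in (0.0, 0.37, 2.5):
    g = gamma(tau)
    # |gamma| = 1 at every delay, and gamma(tau) = exp(-2 pi i nu0 tau), eqn [30]
    print(abs(g), np.isclose(g, np.exp(-2j * np.pi * nu0 * tau)))
```

For a field with finite spectral width the same average would decay with τ, which is what distinguishes partially coherent light from this idealized, never-realized limit.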
It can then be shown that γ(r₁,r₂,τ) = exp{i[α(r₁) − α(r₂)] − 2πiν₀τ}, where α(r′) is a real function of a single position r′ in the domain of the field. Further, the mutual coherence function Γ(r₁,r₂,τ) for such a field has a factorized periodic form:

   Γ(r₁,r₂,τ) = U(r₁) U*(r₂) exp(−2πiν₀τ)   [32]

which means that the field is monochromatic with frequency ν₀ and U(r) is any solution of the Helmholtz equation:

   ∇²U(r) + k₀²U(r) = 0,   k₀ = 2πν₀/c   [33]

The spectral and cross-spectral densities of such fields are represented by Dirac δ-functions with their singularities at some positive frequency ν₀. Such fields, however, can never occur in nature.

Quasi-Monochromatic Fields

Optical fields are found in practice for which the spectral spread Δν is much smaller than the mean frequency ν̄. These are termed quasi-monochromatic fields. The cross-spectral density W(r₁,r₂,ν) of quasi-monochromatic fields attains an appreciable value only in the small region Δν, and falls to zero outside this region:

   W(r₁,r₂,ν) = 0 for |ν − ν̄| > Δν, with Δν ≪ ν̄   [34]

The mutual coherence function Γ(r₁,r₂,τ) can be written as the Fourier transform of the cross-spectral density function as

   Γ(r₁,r₂,τ) = exp(−2πiν̄τ) ∫₀^∞ W(r₁,r₂,ν) exp[−2πi(ν − ν̄)τ] dν   [35]

If we limit our attention to small τ such that Δν|τ| ≪ 1 holds, the exponential factor inside the integral is approximately equal to unity and eqn [35] reduces to

   Γ(r₁,r₂,τ) = exp(−2πiν̄τ) ∫₀^∞ W(r₁,r₂,ν) dν   [36]

which gives

   Γ(r₁,r₂,τ) = Γ(r₁,r₂,0) exp(−2πiν̄τ)   [37]

Equation [37] describes the behavior of Γ(r₁,r₂,τ) for a limited range of τ values for quasi-monochromatic fields; in this range it behaves as a monochromatic field of frequency ν̄. However, due to the factor Γ(r₁,r₂,0), the quasi-monochromatic field may be coherent, partially coherent, or even incoherent.

A complex degree of coherence that can be expressed as the product of a component dependent only on the space coordinates and a component dependent on the time delay is called reducible. In the case of perfectly coherent light, the complex degree of coherence is reducible, as we have seen above, e.g., in eqn [32], and in the case of quasi-monochromatic fields it is reducible approximately, as shown in eqn [37]. There also exists a very special case of a cross-spectrally pure field for which the complex degree of coherence is reducible. A field is called cross-spectrally pure if the normalized spectrum of the superimposed light is equal to the normalized spectrum of the component beams, a concept introduced by Mandel. In the space–frequency domain, the intensity interference law is expressed as the so-called spectral interference law:

   S(r,ν) = S^(1)(r,ν) + S^(2)(r,ν) + 2 √S^(1)(r,ν) √S^(2)(r,ν) Re[μ(r₁,r₂,ν) exp(−2πiντ)]   [38]

where μ(r₁,r₂,ν) is the spectral degree of coherence, defined in eqn [21], and τ is the relative time delay that is needed by the light from the pinholes to reach any point on the screen; S^(1)(r,ν) and S^(2)(r,ν) are the spectral densities of the light reaching P from the pinholes P₁ and P₂, respectively (see Figure 4), and it is assumed that the spectral densities of the field at pinholes P₁ and P₂ are the same [S(r₁,ν) = C S(r₂,ν)].

Now, if we consider a point for which the time delay is τ₀, then it can be seen that the last term in eqn [38] would be independent of frequency, provided that

   μ(r₁,r₂,ν) exp(−2πiντ₀) = f(r₁,r₂,τ₀)   [39]

where f(r₁,r₂,τ₀) is a function of r₁, r₂, and τ₀ only, and the light at this point would have the same spectral density as that at the pinholes. If a region exists around the specified point on the observation plane such that the spectral distribution of the light in this region is of the same form as the spectral distribution of the light at the pinholes, the light at the pinholes is cross-spectrally pure light.

In terms of the spectral distribution of the light S(r₁,ν) at pinhole P₁ and S(r₂,ν)[= C S(r₁,ν)] at pinhole P₂, the mutual coherence function at the pinholes can be written as

   Γ(r₁,r₂,τ) = √C ∫ S(r₁,ν) μ(r₁,r₂,ν) exp(−2πiντ) dν   [40]

The complex degree of coherence γ(r₁,r₂,τ) is thus expressible as the product of two factors: one factor characterizes the spatial coherence at the two pinholes at time delay τ₀ and the other characterizes the temporal coherence at one of the pinholes. Equation [41] is known as the reduction formula for cross-spectrally pure light. It can further be shown that

   μ(r₁,r₂,ν) = γ(r₁,r₂,τ₀) exp(2πiντ₀)   [42]

Thus, the absolute value of the spectral degree of coherence is the same for all frequencies and is equal to the absolute value of the degree of coherence for the point specified by τ₀. It has been shown that cross-spectrally pure light can be generated, for example, by linear filtering of light that emerges from the two pinholes in Young's interference experiments.

Types of Sources

The cross-spectral densities of a source (subscript Q) and of the field it generates (subscript V) are the Fourier transforms of the corresponding mutual coherence functions:

   W_Q(r₁,r₂,ν) = ∫_{−∞}^{∞} Γ_Q(r₁,r₂,τ) exp(2πiντ) dτ   [43]

   W_V(r₁,r₂,ν) = ∫_{−∞}^{∞} Γ_V(r₁,r₂,τ) exp(2πiντ) dτ   [44]

The cross-spectral density functions of the source and of the field are related as

   (∇₂² + k²)(∇₁² + k²) W_V(r₁,r₂,ν) = (4π)² W_Q(r₁,r₂,ν)   [45]

The solution of eqn [45] is represented as

   W_V(r₁,r₂,ν) = ∫_S ∫_S W_Q(r′₁,r′₂,ν) [exp(ik(R₂ − R₁))/(R₁R₂)] d³r′₁ d³r′₂   [46]

where R₁ = |r₁ − r′₁| and R₂ = |r₂ − r′₂| (see Figure 8). Using eqn [46], one can then obtain an expression for the spectrum at a point (r₁ = r₂ = r = ru) in the far field (r ≫ r′₁, r′₂) of a source as

   S^(∞)(ru, ν) = (1/r²) ∫_S ∫_S W_Q(r′₁,r′₂,ν) exp[−iku·(r′₁ − r′₂)] d³r′₁ d³r′₂   [47]

Figure 8 Geometry of a 3-dimensional primary source S and the radiation from it.
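The spectral interference law, eqn [38], is easy to explore numerically. The sketch below (Python; the Gaussian spectra, the real frequency-independent μ, and the delay are all assumed example values) shows the characteristic spectral modulation of the superposed light:

```python
import numpy as np

nu = np.linspace(0.5, 1.5, 2001)            # frequency (arbitrary units)
S1 = np.exp(-((nu - 1.0) / 0.1) ** 2)       # S^(1)(r, nu) = S^(2)(r, nu)
mu, tau = 0.8, 20.0                         # spectral degree of coherence, delay

# eqn [38] with equal spectra and real mu:
# Re[mu exp(-2 pi i nu tau)] = mu cos(2 pi nu tau)
S = 2.0 * S1 * (1.0 + mu * np.cos(2.0 * np.pi * nu * tau))

print(S.min() >= 0.0)                            # True: a valid spectral density
print(S.max() <= 2.0 * (1.0 + mu) * S1.max())    # True: bounded envelope
```

Only when the cosine factor is frequency independent at the observation point (the condition of eqn [39]) does the superposed spectrum keep the shape of the single-pinhole spectrum, which is the defining property of cross-spectrally pure light.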
For a planar source characterized by an intensity distribution I₀(r̄), a normalized spectrum g₀(ν), and a spectral degree of coherence μ₀(r′,ν), the far-zone spectrum takes the form

   S^(∞)(ru, ν) = [k² cos²θ/(2π²r²)] ∫_s ∫_s I₀(r̄) g₀(ν) μ₀(r′,ν) exp[−iku⊥·(r₁ − r₂)] d²r₁ d²r₂   [54]

Noting that r̄ = (r₁ + r₂)/2 and r′ = r₂ − r₁, one can transform the variables of the integration and obtain, after some manipulation:

   S^(∞)(ru, ν) = [k² cos²θ/(2πr)²] Ĩ₀ μ̃₀(ku⊥, ν) g₀(ν)   [55]

where

   μ̃₀(ku⊥, ν) = ∫_s μ₀(r′, ν) exp(−iku⊥·r′) d²r′   [56]

and

   Ĩ₀ = ∫_s I₀(r̄) d²r̄   [57]

Equation [55] shows that the spectrum of the field in the far-zone depends on the coherence properties of the source through its spectral degree of coherence μ₀(r′,ν), and on the normalized source spectrum g₀(ν). Similar considerations apply to three-dimensional scattering potentials, and to two-dimensional primary and secondary sources.

Equivalence Theorems

The study of partially coherent sources led to the formulation of a number of equivalence theorems, which show that sources of any state of coherence can produce the same distribution of radiant intensity as a fully spatially coherent laser source. These theorems provide the conditions under which sources of different spatial distributions of spectral density and of different states of coherence will generate fields that have the same radiant intensity. It has been shown, by taking examples of Gaussian–Schell model sources, that sources of completely different coherence properties and different spectral distributions across the source generate identical distributions of radiant intensity. Experimental verifications of the results of these theorems have also been carried out. For further details on this subject, the reader is referred to Mandel and Wolf (1995).

Correlation-Induced Spectral Changes

It was long assumed that the spectrum is an intrinsic property of light that does not change as the radiation propagates in free space, until studies on partially coherent sources showed that this is not, in general, true.

Scaling Law

The reason why coherence-induced spectral changes were not observed until recently is that the usual thermal sources employed in laboratories or
commonly encountered in nature have special coherence properties: the spectral degree of coherence has the functional form

   μ^(0)(r₂ − r₁, ν) = f[k(r₂ − r₁)]   with k = 2πν/c   [58]

which shows that the spectral degree of coherence depends only on the product of the frequency and the space coordinates. This formula expresses the so-called scaling law, which was enunciated by Wolf. Commonly used sources satisfy this property. For example, the spectral degree of coherence of Lambertian sources and blackbody sources can be shown to be

   μ₀(r₂ − r₁, ν) = sin(k|r₂ − r₁|) / (k|r₂ − r₁|)   [59]

This expression evidently satisfies the scaling law. If the spectral degree of coherence does not satisfy the scaling law, the normalized far-field spectrum will, in general, vary in different directions in the far-zone and will differ from the source spectrum.

Spectral Changes in Young's Interference Experiment

Spectral changes in Young's interference experiment with broadband light are not as well understood as in experiments with quasi-monochromatic light, probably because in such experiments no interference fringes are formed. However, if one were to analyze the spectrum of the light in the region of superposition, one would observe changes in the form of a shift in the spectrum for narrowband light, and spectral modulation for broadband light. One can readily derive an expression for the spectrum of light in the region of superposition. Let S^(1)(P,ν) be the spectral density of the light at P which would be obtained if the small aperture at P₁ alone were open; S^(2)(P,ν) has a similar meaning if only the aperture at P₂ were open (Figure 4).

Let us assume, as is usually the case, that S^(2)(P,ν) ≈ S^(1)(P,ν), and let d be the distance between the two pinholes. Consider the spectral density at the point P, at distance x from the axis of symmetry in an observation plane located at a distance R from the plane containing the pinholes. Assuming that x/R ≪ 1, one can make the approximation R₂ − R₁ ≈ xd/R. The spectral interference law (eqn [38]) can then be written as

   S(P,ν) ≈ 2S^(1)(P,ν){1 + |μ(P₁,P₂,ν)| cos[β(P₁,P₂,ν) + 2πνxd/(cR)]}   [60]

where β(P₁,P₂,ν) denotes the phase of the spectral degree of coherence. Equation [60] implies two results:

(i) at any fixed frequency ν, the spectral density varies sinusoidally with the distance x of the point from the axis, with the amplitude and the phase of the variation depending on the (generally complex) spectral degree of coherence μ(P₁,P₂,ν); and

(ii) at any fixed point P in the observation plane, the spectrum S(P,ν) will, in general, differ from the spectrum S^(1)(P,ν), the change also depending on the spectral degree of coherence μ(P₁,P₂,ν) of the light at the two pinholes.

Experimental Confirmations

Experimental tests of the theoretical predictions of spectral invariance and noninvariance due to correlation of fluctuations across the source were performed just after the theoretical predictions. Figure 11 shows the results of one such experiment, in which spectrum changes in Young's experiment were studied. Several other experiments also reported confirmation of the source-correlation-dependent spectral changes. One of the important applications of these observations has been to explain the discrepancies in the maintenance of the spectroradiometric scales by national laboratories in different countries. These studies also have potential application in determining experimentally the spectral degree of coherence of partially coherent fields. The knowledge of the spectral degree of coherence is often important in remote sensing, e.g., for determining the angular diameters of stars.

Figure 11 Correlation-induced changes in spectrum in Young's interference. The dashed line represents the spectrum when only one of the slits is open, the continuous curve shows the spectrum when both slits are open, and the circles are the measured values in the latter case. Reproduced with permission from Santarsiero M and Gori F (1992) Spectral changes in Young interference pattern. Physics Letters 167: 123–128.
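Result (ii) can be made concrete with a short sketch (Python; the geometry, |μ|, and β below are assumed example values, not taken from the experiment of Figure 11): multiplying a narrowband Gaussian line by the cosine factor of eqn [60] displaces its peak.

```python
import numpy as np

c = 3.0e8                                       # speed of light [m/s]
nu = np.linspace(5.0e14, 7.0e14, 20001)         # optical frequencies [Hz]
S1 = np.exp(-((nu - 6.0e14) / 2.0e13) ** 2)     # single-pinhole spectrum S^(1)

x, d, R = 1.25e-3, 1.0e-3, 1.0                  # off-axis point and geometry [m]
mu, beta = 0.9, 0.0                             # |mu(P1,P2,nu)| and its phase

# eqn [60]: spectrum observed with both pinholes open
S = 2.0 * S1 * (1.0 + mu * np.cos(beta + 2.0 * np.pi * nu * x * d / (c * R)))

peak_S1 = nu[np.argmax(S1)]
peak_S = nu[np.argmax(S)]
print(peak_S != peak_S1)    # True: the observed spectral peak is displaced
```

With these numbers the cosine factor is near its minimum at the line center, so the observed peak moves away from the single-pinhole peak, a purely correlation- and geometry-induced spectral change with no new frequencies generated.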
Interference Spectroscopy

An early application of these ideas, predating coherence theory, was Michelson's use of his interferometer (Figure 2) to determine the energy distribution in spectral lines. The method he developed is capable of resolving spectral lines that are too narrow to be analyzed by a spectrometer. The visibility of the interference fringes depends on the energy distribution in the spectrum of the light, and its measurement can give information about the spectral lines. In particular, if the energy distribution in a spectral line is symmetric about some frequency ν₀, its profile is simply the Fourier transform of the visibility variation as a function of the path difference between the two interfering beams. This method is the basis of interference spectroscopy, or Fourier transform spectroscopy.

Within the framework of second-order coherence theory, if the mean intensity of the two beams in a Michelson interferometer is the same, then the visibility of the interference fringes in the observation plane is related to the complex degree of coherence of the light at a point on the beamsplitter where the two beams superimpose (eqn [15]). The two quantities are related as 𝒱(τ) = |γ(τ)|, where γ(τ) ≡ γ(r₁, r₁, τ) is the complex degree of self-coherence at the point r₁ on the beamsplitter. Following eqn [19], γ(τ) can be represented by a Fourier integral as

γ(τ) = ∫₀^∞ s(ν) exp(−i2πντ) dν  [63]

where s(ν) is the normalized spectral density of the light, defined as s(ν) = S(ν)/∫₀^∞ S(ν) dν, and S(ν) ≡ S(r₁, ν) = W(r₁, r₁, ν) is the spectral density of the beam at the point r₁. The method is usually applied to very narrow spectral lines, for which one can assume that the peak occurs at ν₀, and γ(τ) can be represented as

γ(τ) = γ̃(τ) exp(−i2πν₀τ) with γ̃(τ) = ∫₋∞^∞ s̃(μ) exp(−i2πμτ) dμ  [64]

where s̃(μ) is the shifted spectrum such that

s̃(μ) = s(ν₀ + μ) for μ ≥ −ν₀, and s̃(μ) = 0 for μ < −ν₀

From the above one readily gets

𝒱(τ) = |γ(τ)| = |γ̃(τ)| = |∫₋∞^∞ s̃(μ) exp(−i2πμτ) dμ|  [65]

If γ̃(τ) is real and positive, so that γ̃(τ) = 𝒱(τ), Fourier inversion would give:

s̃(μ) = s(ν₀ + μ) = 2∫₀^∞ 𝒱(τ) cos(2πμτ) dτ  [66]

which can be used to calculate the spectral energy distribution for a spectrum symmetric about ν₀ from the visibility curve. However, for an asymmetric spectral distribution, both the visibility and the phase of the complex degree of coherence must be determined, as the Fourier transform of the shifted spectrum is no longer real everywhere.

Higher-Order Coherence

So far we have considered correlations of the fluctuating field variables at two space–time (r, t) points, as defined in eqn [10]. These are termed second-order correlations. One can extend the concept of correlations to more than two space–time points, which will involve higher-order correlations. For example, one can define the space–time cross-correlation function of order (M, N) of the random field V(r, t), represented by Γ⁽ᴹ'ᴺ⁾, as an ensemble average of the product of the field values V(r, t) at N space–time points and V*(r, t) at M other points. In this notation, the mutual coherence function as defined in eqn [10] would now be Γ⁽¹,¹⁾. Among higher-order correlations, the one with M = N = 2 is of practical significance and is called the fourth-order correlation function, Γ⁽²,²⁾(r₁, t₁; r₂, t₂; r₃, t₃; r₄, t₄). The theory of Gaussian random variables tells us that any higher-order correlation can be written in terms of second-order correlations over all permutations of pairs of points. In addition, if we assume that (r₃, t₃) = (r₁, t₁) and (r₄, t₄) = (r₂, t₂), and that the field is stationary, then Γ⁽²,²⁾ is called the intensity–intensity correlation and is given as

Γ⁽²,²⁾(r₁, r₂, t₂ − t₁) = ⟨V(r₁, t₁)V(r₂, t₂)V*(r₁, t₁)V*(r₂, t₂)⟩
 = ⟨I(r₁, t₁)I(r₂, t₂)⟩
 = ⟨I(r₁, t₁)⟩⟨I(r₂, t₂)⟩[1 + |γ⁽¹,¹⁾(r₁, r₂, t₂ − t₁)|²]  [67]

where

γ⁽¹,¹⁾(r₁, r₂, t₂ − t₁) = Γ⁽¹,¹⁾(r₁, r₂, t₂ − t₁) / [⟨I(r₁, t₁)⟩⟨I(r₂, t₂)⟩]^{1/2}  [68]
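The inversion in eqn [66] can be sketched numerically (all units arbitrary, chosen for illustration): starting from a symmetric Gaussian line, compute the visibility of eqn [65] on a grid and invert it with the cosine transform; the recovered profile matches the original to within discretization error.

```python
import numpy as np

# Sketch of eqns [65]-[66]: fringe visibility of a symmetric Gaussian line
# and its inversion back to the line profile. Grids and widths are arbitrary.
mu = np.linspace(-8.0, 8.0, 1601)            # frequency offset from nu0
dmu = mu[1] - mu[0]
s = np.exp(-mu ** 2)
s /= s.sum() * dmu                           # normalized shifted spectrum s(mu)

tau = np.linspace(0.0, 3.0, 1501)            # path-delay axis
dtau = tau[1] - tau[0]

# eqn [65]: V(tau) = | integral s(mu) exp(-i 2 pi mu tau) dmu |
V = np.abs(np.exp(-2j * np.pi * np.outer(tau, mu)) @ s) * dmu

# eqn [66]: s(mu) = 2 * integral_0^inf V(tau) cos(2 pi mu tau) dtau
s_rec = 2 * (np.cos(2 * np.pi * np.outer(mu, tau)) @ V) * dtau

err = np.max(np.abs(s_rec - s))
print(f"max reconstruction error: {err:.2e}")
```

For an asymmetric line this cosine transform would fail, as the text notes, because γ̃(τ) acquires a phase that the visibility alone does not capture.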
and then the correlation of intensity fluctuations becomes

⟨ΔI₁ΔI₂⟩ = ⟨I(r₁, t₁)I(r₂, t₂)⟩ − ⟨I(r₁, t₁)⟩⟨I(r₂, t₂)⟩
 = ⟨I(r₁, t₁)⟩⟨I(r₂, t₂)⟩|γ⁽¹,¹⁾(r₁, r₂, t₂ − t₁)|²  [69]

where we have used eqn [67]. Equation [69] forms the basis for intensity interferometry.

Hanbury-Brown and Twiss Experiment

In this landmark experiment, conducted on both the laboratory scale and the astronomical scale, Hanbury-Brown and Twiss demonstrated the existence of intensity–intensity correlations in terms of the correlations in the photocurrents of two detectors, and thus measured the squared modulus of the complex degree of coherence. In the laboratory experiment, the arc of a mercury lamp was focused onto a circular hole to produce a secondary source. The light from this source was then divided equally into two parts by a beamsplitter. Each part was detected by one of two photomultiplier tubes with identical square apertures in front. One of the tubes could be translated normal to the direction of propagation of the light, and was so positioned that its image through the splitter could be made to coincide with the other tube. Thus, by suitable translation, a measured separation d between the two square apertures could be introduced. The output currents from the photomultiplier tubes were taken by cables of equal length to a correlator. In the path of each cable a high-pass filter was inserted, so that only the current fluctuations could be transmitted to the correlator. Thus, the normalized correlation between the two current fluctuations,

C(d) = ⟨ΔJ₁(t)ΔJ₂(t)⟩ / {⟨[ΔJ₁(t)]²⟩^{1/2}⟨[ΔJ₂(t)]²⟩^{1/2}}  [70]

could be measured as a function of the detector separation d. Now, when the detector response time is much larger than the time-scale of the fluctuations in intensity, it can be shown that the correlations in the fluctuations of the photocurrents are proportional to the correlations of the fluctuations in intensity of the light being detected. Thus, we would have

C(d) ≃ ξ|γ⁽¹,¹⁾(r₁, r₂, 0)|²  [71]

where ξ is a constant factor (less than one). Equation [71] represents the Hanbury-Brown–Twiss effect.

Stellar Intensity Interferometry

Michelson stellar interferometry can resolve stars having angular sizes of the order of 0.01″; for smaller stars, the separation between the primary mirrors runs into several meters, and maintaining the stability of the mirrors so that the optical paths do not change, even by a fraction of a wavelength, is extremely difficult. Atmospheric turbulence adds further to this problem, and obtaining a stable fringe pattern becomes next to impossible for very small stars. Hanbury-Brown and Twiss applied intensity interferometry, based on their photoelectric correlation technique, to determining the angular sizes of such stars. Two separate parabolic mirrors collected light from the star, and the output of the photodetectors placed at the focus of each mirror was sent to a correlator. The cable lengths were made unequal so as to compensate for the time difference of the light arrival at the two mirrors. The normalized correlation of the fluctuations of the photocurrents was determined as described above. This gives the variation of the modulus-squared degree of coherence as a function of the mirror separation d, from which the angular size of the star can be estimated. The advantage of the stellar intensity interferometer over the stellar (amplitude) interferometer is that the light need not interfere as in the latter, since the photodetectors are mounted directly at the focus of the primary mirrors of the telescope. Thus, the constraint on the large path difference between the two beams is removed, and large values of d can now be used. Moreover, atmospheric turbulence and mirror movements have very little effect. Stellar angular diameters as small as 0.0004″ of arc, with a resolution of 0.00003″, could be measured by such interferometers.

See also

Coherence: Coherence and Imaging. Coherent Lightwave Systems. Coherent Transients: Coherent Transient Spectroscopy in Atomic and Molecular Vapours. Coherent Control: Applications in Semiconductors; Experimental; Theory. Interferometry: Overview. Information Processing: Coherent Analogue Optical Processors. Terahertz Technology: Coherent Terahertz Sources.
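Equation [67] is straightforward to verify with a Monte-Carlo sketch for Gaussian (thermal-like) light; the construction of the field samples and the value of |γ⁽¹,¹⁾| below are illustrative assumptions.

```python
import numpy as np

# Monte-Carlo check of eqn [67] for Gaussian light: two field samples with a
# prescribed degree of coherence gamma. Parameters are illustrative.
rng = np.random.default_rng(1)
gamma = 0.6                                   # |gamma^(1,1)| between the two points
n = 400_000

def ccg(size):
    # unit-variance circular complex Gaussian samples
    return (rng.standard_normal(size) + 1j * rng.standard_normal(size)) / np.sqrt(2)

v1 = ccg(n)
v2 = gamma * v1 + np.sqrt(1 - gamma ** 2) * ccg(n)   # <v1* v2> = gamma, <|v2|^2> = 1
i1, i2 = np.abs(v1) ** 2, np.abs(v2) ** 2

g2 = np.mean(i1 * i2) / (np.mean(i1) * np.mean(i2))
print(f"<I1 I2>/<I1><I2> = {g2:.3f}  (eqn [67] predicts {1 + gamma ** 2:.3f})")
```

The excess over unity is exactly the |γ⁽¹,¹⁾|² term of eqns [67] and [69] that the Hanbury-Brown and Twiss correlator measures.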
Further Reading

Born M and Wolf E (1999) Principles of Optics. New York: Pergamon Press.
Carter WH (1996) Coherence theory. In: Bass M (ed.) Handbook of Optics. New York: McGraw-Hill.
Davenport WB and Root WL (1960) An Introduction to the Theory of Random Signals and Noise. New York: McGraw-Hill.
Goodman JW (1985) Statistical Optics. Chichester: Wiley.
Hanbury-Brown R and Twiss RQ (1957) Interferometry of intensity fluctuations in light: I. Basic theory: the correlation between photons in coherent beams of radiation. Proceedings of the Royal Society 242: 300–324.
Hanbury-Brown R and Twiss RQ (1957) Interferometry of intensity fluctuations in light: II. An experimental test of the theory for partially coherent light. Proceedings of the Royal Society 243: 291–319.
Kandpal HC, Vaishya JS and Joshi KC (1994) Correlation-induced spectral shifts in optical measurements. Optical Engineering 33: 1996–2012.
Mandel L and Wolf E (1965) Coherence properties of optical fields. Reviews of Modern Physics 37: 231–287.
Mandel L and Wolf E (1995) Optical Coherence and Quantum Optics. Cambridge, UK: Cambridge University Press.
Marathay AS (1982) Elements of Optical Coherence. New York: John Wiley.
Perina J (1985) Coherence of Light. Boston, MA: D. Reidel.
Santarsiero M and Gori F (1992) Spectral changes in Young interference pattern. Physics Letters 167: 123–128.
Schell AC (1967) A technique for the determination of the radiation pattern of a partially coherent aperture. IEEE Transactions on Antennas and Propagation AP-15: 187.
Thompson BJ (1958) Illustration of phase change in two-beam interference with partially coherent light. Journal of the Optical Society of America 48: 95–97.
Thompson BJ and Wolf E (1957) Two beam interference with partially coherent light. Journal of the Optical Society of America 47: 895–902.
Wolf E and James DFV (1996) Correlation-induced spectral changes. Reports on Progress in Physics 59: 771–818.
transmission of O(x, y). The imaging optics produce an optical image that is viewed directly by the viewer. A growing number of image formation systems now include a solid-state image detector, as shown in Figure 2. The raw intensity optical image is converted to an electronic signal that can be digitally processed and then displayed on a monitor. In this system, the spatial intensity distribution emanating from the monitor is the final image. The task of the optical system to the left of the detector is to gather spatial information about the light properties of the object. In fact, the information gathered by the detector includes information about the object, the illumination system, and the image formation optics. If the observer is only interested in the light transmission properties of the object, the effects of the illumination and image formation optics must be well understood. When the light illuminating the object is known to be coherent or incoherent, reasonably simple models for the overall image formation process can be used. More generally, partially coherent illumination can produce optically formed images that differ greatly from the intensity pattern leaving the object. Seen from this perspective, partially coherent illumination is an undesirable property that creates nonideal images and complicates the image analysis.

The ideal image as described above is not necessarily the optimum image for a particular task. Consider the case of object recognition when the object has low optical contrast. Using the above definition, an ideal image would mimic the low contrast and render recognition difficult. An image formation system that alters the contrast to improve the recognition task would be better than the so-called ideal image. The image formation system that maximizes the appropriate performance metric for a particular recognition task would be considered optimal. In optimizing the indirect imaging system of Figure 2, the designer can adjust the illumination, the imaging optics, and the post-detection processing. In fact, research microscopes often include an adjustment that modifies illumination coherence and often alters the image contrast. Darkfield and phase contrast imaging microscopes usually employ partially coherent illumination combined with pupil modification to view otherwise invisible objects. Seen from this perspective, partially coherent illumination is a desirable property that provides more degrees of freedom to the imaging system designer. Quite often, partially coherent imaging systems provide a compromise between the performance of coherent and incoherent systems.
Figure 2 Indirect view optical system. The raw intensity optical image is detected electronically, processed, and a final image is presented to the human observer on a display device.

Figure 3 The presence of high contrast fringes in a Michelson interferometer indicates high temporal coherence. (Diagram labels: narrowband point source, beamsplitter, prism, mirror, variable phase delay, lens; intensity plotted against distance.)
COHERENCE / Coherence and Imaging 101
Figure 4 A Young’s two pinhole interferometer produces: (a) a uniform high contrast fringe pattern for light that is highly spatially
and temporally coherent; (b) high contrast fringes only near the axis for low temporal coherence and high spatial coherence light; and
(c) no fringe pattern for high temporal coherence, low spatial coherence light.
highly temporally coherent. Light from very narrowband lasers can maintain coherence over very large optical path differences. Light from broadband light sources requires very small optical path differences to add in wave amplitude.

Spatial coherence is a measure of the ability of two separate points in a field to interfere. The Young's two pinhole experiment in Figure 4 measures the coherence between two points sampled by the pinhole mask. Only a one-dimensional pinhole mask is shown for simplicity. Figure 4a shows an expanded laser beam illuminating two spatially separated pinholes and recombining to form an intensity fringe pattern. The high contrast of the fringes indicates that the wave amplitude of the light from the two pinholes is highly correlated. Figure 4b shows a broadband point source expanded in a similar fashion. The fringe contrast is high near the axis because the optical path difference between the two beams is zero on the axis and relatively low near the axis. For points far from the axis, the fringe pattern disappears because the low temporal coherence of the broadband source results in a loss of correlation of the wave amplitudes. A final Young's experiment example, in Figure 4c, shows that highly temporally coherent light can be spatially incoherent. In the figure, the light illuminating the two pinholes comes from two separate and highly temporally coherent lasers that are designed to have the same center frequency. Since the light from the lasers is not synchronized in phase, any fringes that might form for an instant will move rapidly and average to a uniform intensity pattern over the integration time of a typical detector. Since the fringe contrast is zero over a practical integration time, the light at the two pinholes is effectively spatially incoherent.

Two-point Imaging

In a typical partially coherent imaging experiment we need to know how light from two pinholes adds at the optical image plane, as shown in Figure 5. Diffraction and system aberrations cause the image of a single point to spread, so that the images of two spatially separated object points overlap in the image plane. The wave amplitude image of a single pinhole is called the coherent point spread function (CPSF). Since the CPSF is compact, two CPSFs will only overlap when the corresponding object points are closely spaced. When the two points are sufficiently close, the relevant optical path differences will be small, so full temporal coherence can be assumed. Spatial coherence will be the critical factor in determining how to add the responses to pairs of
point images. We assume full temporal coherence in the subsequent analyses, and the light is said to be quasimonochromatic.

Coherent Two-Point Imaging

Consider imaging a two-point object illuminated by a spatially coherent plane wave produced from a point source, as depicted in Figure 6a. In the figure, the imaging system is assumed to be space-invariant. This means that the spatial distribution of the CPSF is the same regardless of the position of the input pinhole. The CPSF in the figure is broad enough that the image plane point responses overlap for this particular pinhole spacing. Since the wave amplitudes from the two pinholes are correlated, the point responses add in wave amplitude, resulting in a two-point image intensity given by

I_2coh(x) = I₀|h(x − x₁) + h(x − x₂)|²  [1]

where h(x) is the normalized CPSF and I₀ is a scaling factor that determines the absolute image intensity value. Since the CPSF has units of wave amplitude, the magnitude-squaring operation accounts for the square-law response of the image plane detector. More generally, the image intensity for an arbitrary object distribution can be found by breaking the object into a number of points and adding the CPSFs due to each point on the object. The resultant image plane intensity is the spatial convolution of the object amplitude transmittance with the CPSF, and is given by

I_coh(x) = I₀|∫ O(ξ)h(ξ − x) dξ|²  [2]

where O(x) is the object amplitude transmittance.

Incoherent Two-Point Imaging

Two-pinhole imaging with spatially incoherent light is shown in Figure 6b, where a spatially extended blackbody radiator is placed directly behind the pinholes. Once again the imaging optics are assumed to be space-invariant and have the same CPSF as the system in Figure 6a. Since the radiation originates from two completely different physical points on the source, no correlation is expected between the wave amplitudes of the light leaving the pinholes, and the light is said to be spatially incoherent. The resultant image intensity corresponding to the two-pinhole object with equal amplitude transmittance values is calculated by adding the individual intensity responses to give

I_2inc(x) = I₀|h(x − x₁)|² + I₀|h(x − x₂)|²  [3]

For more general object distributions, the image intensity is the convolution of the object intensity transmittance with the incoherent PSF, and is given by

I_inc(x) = I₀∫ |O(ξ)|²|h(x − ξ)|² dξ  [4]

Figure 5 Imaging of two points. The object plane illumination coherence determines how the light adds in the region of overlap of the two image plane points.
Figure 6 Imaging of two points for (a) spatially coherent illumination and (b) spatially incoherent illumination.
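Equations [1], [3], and [5] can be compared directly with a small numerical sketch. The sinc-shaped CPSF and the value μ = 0.5 below are illustrative assumptions; the two points are placed at the Rayleigh separation (peak of one on the first zero of the other).

```python
import numpy as np

# Two pinholes at the Rayleigh separation imaged with an assumed sinc CPSF
# h(x) = sinc(x) (np.sinc = sin(pi x)/(pi x), first zero at x = 1).
x = np.linspace(-3.0, 3.0, 1201)
x1, x2 = -0.5, 0.5                               # Rayleigh spacing: x2 - x1 = 1
h1, h2 = np.sinc(x - x1), np.sinc(x - x2)

I_coh = np.abs(h1 + h2) ** 2                     # eqn [1], coherent (mu = +1)
I_inc = np.abs(h1) ** 2 + np.abs(h2) ** 2        # eqn [3], incoherent (mu = 0)
mu = 0.5                                         # eqn [5], partially coherent
I_pc = I_inc + 2 * np.real(mu * h1 * np.conj(h2))
I_mask = np.abs(h1 - h2) ** 2                    # mu = -1: half-wave phase mask at one pinhole

i_mid = np.argmin(np.abs(x))                     # midpoint between the two points
i_pt = np.argmin(np.abs(x - x1))                 # location of one point image

for name, I in [("coherent", I_coh), ("partial", I_pc), ("incoherent", I_inc)]:
    print(f"{name:>10}: midpoint/point ratio = {I[i_mid] / I[i_pt]:.2f}")
print(f"phase mask: midpoint intensity = {I_mask[i_mid]:.2e}")
```

Only the incoherent sum shows a dip between the points (ratio below one); with μ = −1, the half-wave phase screen discussed later in connection with Figure 9c, the midpoint becomes an exact null.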
where the intensity transmittance is the squared magnitude of the amplitude transmittance and the incoherent PSF is the squared magnitude of the CPSF.

Partially Coherent Two-Point Imaging

Finally, we consider two-pinhole imaging illuminated by partially spatially coherent light. Now the two-point responses do not add simply in amplitude or in intensity. Rather, the image intensity is given by the more general equation:

I_2pc(x) = I₀[|h(x − x₁)|² + |h(x − x₂)|² + 2Re{μ(x₁, x₂)h(x − x₁)h*(x − x₂)}]  [5]

where the * denotes complex conjugation and μ(x₁, x₂) is the normalized form of the mutual intensity of the object illumination evaluated at the two object points in question. The mutual intensity function is often denoted as J₀(x₁, x₂) and is a measure of the cross-correlation of the wave amplitude distributions leaving the two pinholes. Rather than providing a rigorous definition, we note that the magnitude of J₀(x₁, x₂) corresponds to the fringe contrast that would be produced if the two object pinholes were placed at the input of a Young's interference experiment. The phase is related to the relative spatial shift of the fringe pattern. When the light is uncorrelated, μ(x₁, x₂) = 0 and eqn [5] collapses to the incoherent limit of eqn [3]. When the light is coherent, μ(x₁, x₂) = 1 and eqn [5] reduces to the coherent form of eqn [1]. Image formation for a general object distribution is given by the bilinear equation:

I_pc(x) = I₀∬ O(ξ₁)O*(ξ₂)h(x − ξ₁)h*(x − ξ₂)J₀(ξ₁ − ξ₂) dξ₁ dξ₂  [6]

Note that, in general, J₀(x₁, x₂) must be evaluated for all pairs of object points. Close examination of this equation reveals that, just as a linear system can be evaluated by considering all possible single-point responses, a bilinear system requires consideration of all possible pairs of points. This behavior is much more complicated and does not allow the application of the well-developed linear system theory.

Source Distribution and Object Illumination Coherence

According to eqn [6], the mutual intensity of all pairs of points at the object plane must be known to calculate a partially coherent image. Consider the telecentric Kohler illumination imaging system shown in Figure 7. The primary source is considered to be spatially incoherent and illuminates the object after passing through lens L1, located one focal distance away from the source and one focal distance away from the object. Even though the primary source is spatially incoherent, the illumination at the object plane is partially coherent. The explicit mutual intensity function corresponding to the object plane illumination is given by applying the van Cittert–Zernike theorem:

J₀(Δx) = ∫ S(ξ) exp(j2πΔxξ/λF) dξ  [7]

where S(x) is the intensity distribution of the spatially incoherent primary source, F is the focal length of the lens, λ is the illumination wavelength, and Δx = x₁ − x₂. The van Cittert–Zernike theorem reflects a Fourier transform relationship between the source intensity distribution and the mutual intensity at the object plane. When the source plane is effectively spatially incoherent, the object plane mutual intensity is only a function of the separation distance. For a two-dimensional object, the mutual intensity needs to be characterized for all pairs of unique spacings in x and y. In Figure 7, the lens arrangement ensures that the object plane is located in the far field of the primary source plane. In fact, the van Cittert–Zernike theorem applies more generally, even in the Fresnel propagation regime, as long as the standard paraxial approximation for optics is valid.
Figure 7 A telecentric Kohler illumination system with a spatially incoherent primary source imaged onto the pupil.
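As a numerical sketch of eqn [7] (all parameter values are illustrative assumptions): for a uniform one-dimensional source of width a, the normalized mutual intensity reduces to the sinc function quoted below in the discussion of Figure 8.

```python
import numpy as np

# van Cittert-Zernike sketch, eqn [7]: mutual intensity at the object plane
# as the Fourier transform of the source intensity. Values are illustrative.
lam, F = 0.6e-6, 0.1                   # wavelength (m), lens focal length (m)
a = 1.0e-3                             # width of the uniform 1-D source (m)
xi = np.linspace(-a / 2, a / 2, 2001)  # source coordinate
dxi = xi[1] - xi[0]
S = np.ones_like(xi)                   # uniform source intensity S(xi)

dx = np.linspace(-2e-3, 2e-3, 401)     # object-plane separation Delta-x (m)
J0 = np.array([(S * np.exp(2j * np.pi * d * xi / (lam * F))).sum() * dxi for d in dx])
J0 /= J0[len(dx) // 2].real            # normalize so that J0(0) = 1

expected = np.sinc(a * dx / (lam * F))   # sin(pi a dx/lam F)/(pi a dx/lam F)
err = np.max(np.abs(J0 - expected))
print(f"max |J0 - sinc| = {err:.1e}")
```

The first zero of J₀ falls at Δx = λF/a, so enlarging the source (larger a) shrinks the coherence width at the object plane, which is the trade-off exploited in the discussion that follows.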
Figure 8 Light propagates from the source plane to the object plane and produces (a) coherent illumination from a single source point; (b) incoherent illumination from an infinite extent source; and (c) partially coherent illumination from a finite extent primary source.
Figure 8 shows some simple examples of primary source distributions and the corresponding object plane mutual intensity functions. Figure 8a assumes an infinitesimal point source, and the corresponding mutual intensity is 1.0 for all possible pairs of object points. Figure 8b assumes an infinite extent primary source and results in a Dirac delta function for the mutual intensity function. This means that there is no correlation between any two object points having a separation greater than zero, so the object plane illumination is spatially incoherent. In Figure 8c, a finite extent uniform source gives a mutual intensity function of the form sin(aπΔx)/(aπΔx). The finite-sized source corresponds to partially coherent imaging and shows that the response to pairs of points is affected by the spatial distribution of the source in a complicated way. Note that a large primary source corresponds to a large range of angular illumination at the object plane.

Varying the size of the source in the imaging system of Figure 7 will affect the spatial coherence of the illumination and hence the optical image intensity. Many textbook treatments discuss the imaging of two points separated by the Rayleigh resolution criterion, which corresponds to the case where the first minimum of one point image coincides with the maximum of the adjacent point image. With a large source that provides effectively incoherent light, the two-point image has a modest dip in intensity between the two points, as shown in Figure 9a. Fully coherent illumination of two points separated by the Rayleigh resolution produces a single large spot with no dip in intensity, as shown in Figure 9b. Varying the source size, and hence the illumination spatial coherence, produces a dip that is less than the incoherent intensity dip. This result is often used to suggest that coherent imaging gives poorer resolution than incoherent imaging. In fact, generalizations about two-point resolution can be misleading. Recent developments in optical lithography have shown that coherence can be used to effectively increase two-point resolution beyond traditional diffraction limits. In Figure 9c, one of the pinholes in a coherent two-point imaging experiment has been modified with a phase shift corresponding to one half of a wavelength of the illuminating light. The two images add in wave amplitude, and the phase shift creates a distinct null at the image plane, effectively enhancing the two-point resolution. This approach is termed phase screen lithography and has been exploited to produce finer features in lithography by purposely introducing small phase shift masks at the object mask. In practice, the temporal and spatial coherence of the light is engineered to give sufficient coherence to take advantage of the two-point enhancement while maintaining sufficient incoherence to avoid the speckle-like artifacts associated with coherent light.

Spatial Frequency Modeling of Imaging

Spatial frequency models of image formation are also useful in understanding how coherence affects image formation. A spatially coherent imaging system has a particularly simple spatial frequency model. In the
Figure 9 Comparison of incoherent and coherent imaging of two pinholes separated by the Rayleigh distance. (a) The incoherent image barely resolves the two points. (b) The coherent image cannot resolve the two points. (c) Coherent illumination with a phase shifting plate at one pinhole produces a null between the two image points.
Figure 10 The object Fourier transform is filtered by the pupil function in a coherent imaging system.
spatially coherent imaging system shown in Figure 10, the on-axis plane wave illumination projects the spatial Fourier transform of the object, or Fraunhofer pattern, onto the pupil plane of the imaging system. The pupil acts as a frequency domain filter that can be modified to perform spatial frequency filtering. The complex transmittance of the pupil is the wave amplitude spatial frequency transfer function of the imaging system. It follows that the image plane wave amplitude frequency spectrum, Ũ_c(f_x), is given by

Ũ_c(f_x) = Õ(f_x)H(f_x)  [8]

where Õ(f_x) is the spatial Fourier transform of the object amplitude transmittance function and H(f_x) is proportional to the pupil plane amplitude transmittance. Equation [8] is the frequency domain version of the convolution representation given by eqn [2], but does not account for the squared magnitude response of the detector.

In the previous section, we learned that an infinite extent source is necessary to achieve fully incoherent imaging for the imaging system of Figure 7. As a thought experiment, one can start with a single point source on the axis and keep adding mutually incoherent source points to build up from a coherent imaging system to an incoherent imaging system. Since the individual source points are assumed to be incoherent with each other, the images from each point source can be added in intensity. In Figure 11,
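A minimal sketch of eqn [8], with assumed object and cutoff values: a binary grating's amplitude spectrum is multiplied by an ideal low-pass pupil and inverse-transformed, and the detector then squares the resulting amplitude (the step eqn [8] itself omits).

```python
import numpy as np

# Sketch of eqn [8]: the pupil acts as an amplitude transfer filter,
# U_c(fx) = O(fx) H(fx). Object period and pupil cutoff are illustrative.
n = 512
x = np.arange(n)
obj = (x % 64 < 32).astype(float)            # binary grating amplitude transmittance

fx = np.fft.fftfreq(n)                        # spatial frequencies (cycles/sample)
H = (np.abs(fx) <= 0.05).astype(float)        # ideal low-pass pupil, cutoff 0.05

U_img = np.fft.ifft(np.fft.fft(obj) * H)      # image-plane wave amplitude
I_img = np.abs(U_img) ** 2                    # detector squares the amplitude

# The grating fundamental (1/64 cycles/sample) and third harmonic pass;
# harmonics above the 0.05 cutoff are removed.
contrast = (I_img.max() - I_img.min()) / (I_img.max() + I_img.min())
print(f"image contrast: {contrast:.2f}")
```

The sharp amplitude cutoff is what gives coherent systems their characteristic high-contrast, ringing-prone edge response discussed below.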
Figure 11 A second source point projects a second object Fourier transform onto the pupil plane. The images produced by the two
source points add in intensity at the image plane.
only two point sources are shown. Each point source projects an object Fraunhofer pattern onto the pupil plane. The centered point source will result in the familiar coherent image. The off-axis point source projects a displaced object Fraunhofer pattern that is also filtered by the pupil plane before forming an image of the object. The image formed from this second point source can be modeled with the same coherent imaging model with a shifted pupil plane filter. Since the light from the two source points is uncorrelated, the final image is calculated by adding the intensities of the two coherently formed images.

The two source point model of Figure 11 can be generalized to an arbitrary number of source points. The final image is an intensity superposition of a number of coherent images. This suggests that partially coherent imaging systems behave as a number of redundant coherent imaging systems, each having a slightly different amplitude spatial frequency transfer function due to the relative shift of the pupil filter with respect to the object Fraunhofer pattern. As the number of point sources is increased to infinity, the primary source approaches an infinite extent spatially incoherent source. In practice, the source need not be infinite. When the source is large enough to effectively produce linear-in-intensity imaging, the imaging system is effectively spatially incoherent. The corresponding image intensity in the spatial frequency domain, Ĩ_inc(f_x), is given by

Ĩ_inc(f_x) = Ĩ_obj(f_x)OTF(f_x)  [9]

where Ĩ_obj(f_x) is the spatial Fourier transform of the object intensity transmittance and OTF(f_x) is the incoherent optical transfer function, which is proportional to the spatial autocorrelation of the pupil function.

Many texts include detailed discussions comparing the coherent transfer function and the OTF and attempt to draw conclusions about the relative performance of coherent and incoherent systems. These comparisons are often misleading, since one transfer function describes the wave amplitude spatial frequency response and the other describes the intensity spatial frequency response. Here we note that incoherent imaging systems do indeed allow higher object amplitude spatial frequencies to participate in the image formation process. This argument is often used to support the claim that incoherent systems have higher resolution. However, both systems have the same image intensity spatial frequency cutoff. Furthermore, the nature of the coherent transfer function tends to produce high contrast images that are typically interpreted as higher-resolution images than their incoherent counterparts. Perhaps the real conclusion is that the term resolution is not well defined, and direct comparisons between coherent and incoherent imaging must be treated carefully.

The frequency domain treatment for partially coherent imaging of two-dimensional objects involves a four-dimensional spatial frequency transfer function that is sometimes called the bilinear transfer function or the transmission cross-coefficient model. This model describes how constituent object wave amplitude spatial frequencies interact to form image plane intensity frequencies. The utility of this approach to analysis is limited for someone new to the field, but it is often used in numerical simulations of the partially coherent imaging systems used in optical lithography.

Experimental Examples of Important Coherence Imaging Phenomena

Perhaps the best way to gain an understanding of coherence phenomena in imaging is to examine experimental results. In the following section we use experimental data to see how coherence affects edge response, noise immunity, and depth of field. Several experimental configurations were used to collect
light is delivered through a large multifiber lightguide and illuminates a moving diffuser. The main purpose of the diffuser is to ensure that the entire object is illuminated uniformly. The low temporal coherence of the source and the long propagation distances are usually enough to destroy any spatial coherence at the plane of the diffuser. The rotation further ensures that no residual spatial coherence exists at the primary source plane. A chromatic filter centered around 600 nm, with a spectral width of approximately 100 nm, is shown in the light path. The filter was used to minimize the effects of chromatic aberrations in the system. The wide spectral width certainly qualifies as a broadband source in relation to a laser. High-contrast photographically defined apertures determine the spatial intensity distribution at the plane of the primary source. In the following results, the primary source distributions were restricted to circular sources of varying sizes as well as annular sources. These shapes are representative of most of the sources employed in microscopy, machine vision, and optical lithography.

Noise Immunity

Coherent imaging systems are notorious for introducing speckle-like noise artifacts at the image. Dust and optically rough surfaces within an optical system result in a complicated textured image plane intensity distribution that is often called speckle noise. We refer to such effects as coherent artifact noise. Figure 15a shows the image of a standard binary target as imaged by a benchtop imaging system with the highly coherent illumination method shown in Figure 13. Some care was taken to clean individual lenses and optical surfaces within the system, and no dust was purposely introduced. The image is corrupted by a complicated texture that is due in
Figure 15 High temporal coherence imaging with disk sources Figure 16 High spatial coherence disk illumination (K ¼ 0:1)
corresponding to (a) extremely high spatial coherence (K ¼ 0:05) imaging through a dust covered surface with (a) narrowband laser
and (b) slightly reduced but high spatial coherence (K ¼ 0:1). illumination and (b) with broadband illumination.
part to imperfections in the laser beam itself and in part to unwanted dust and optically rough surfaces. Certainly, better imaging performance can be obtained with more attention to surface cleanliness and laser beam filtering, but the result shows that it can be difficult to produce high-quality coherent images in a laboratory setting.

Lower temporal coherence and lower spatial coherence will reduce the effect of speckle noise. The image of Figure 15b was obtained with laser illumination according to Figure 14a with a source mask corresponding to a source to pupil diameter ratio of K = 0.1. The object plane illumination is still highly temporally coherent and the spatial coherence has been reduced but is still very high. The artifact noise has been nearly eliminated by the modest reduction of spatial coherence. The presence of diagonal fringes in some regions is a result of multiple reflections produced by the cover glass in front of the CCD detector. The reflections produce a weak secondary image that is slightly displaced from the main image and the high temporal coherence allows the two images to interfere.

The noise performance was intentionally perturbed in the image of Figure 16a by inserting a slide with a modest sprinkling of dust at an optical surface in between the object and the pupil plane. The illumination conditions are the same as for the image of Figure 15b. The introduction of the dust has further degraded the image. The image of Figure 16b maintains the same spatial coherence (K = 0.1) but employs the broadband source. The lower temporal coherence eliminates the unwanted diagonal fringes, but the speckle noise produced by the dust pattern does not improve significantly relative to the laser illumination K = 0.1 system. Figures 17a–c show that increasing the source size (and hence the range of angular illumination) reduces
Figure 17 Low temporal coherence imaging through a dust-covered surface with various source sizes and shapes: (a) disk source with K = 0.3; (b) disk source with K = 0.7; (c) disk source with K = 2.0; and (d) a thin annular source with an outer diameter corresponding to K = 0.5 and inner diameter corresponding to K = 0.45.
Depth of Field
Coherent imaging systems exhibit an apparent
increase in depth-of-field compared to spatially
incoherent systems. Figure 19 shows spatially
incoherent imagery for four different focus positions.
Figure 20 shows imagery with the same amount of
defocus produced with highly spatially coherent
light. Finally, Figure 21 gives an example of how
defocused imagery depends on the illumination
coherence. The images of a spoke target were all
gathered with a fixed amount of defocus and the
source size was varied to control the illumination
coherence. The images of Figure 21 differ greatly,
even though the CPSF was the same for all the cases.
Figure 19 Spatially incoherent imaging for (a) best focus, (b) moderate misfocus, and (c) high misfocus.
The images of Figures 22a and b are digitally restored versions of the images of Figures 19b and 20b. The point spread function was directly measured and used to create a linear restoration filter that presumed spatially incoherent illumination. The restored image in Figure 22a is faithful since the assumption of spatially incoherent light was reasonable. The restored image of Figure 22b suffers from a loss in fidelity since the actual illumination coherence was relatively high. The image is visually pleasing and correctly conveys the presence of three bar targets. However, the width of the bars is not faithful to the original target, which had spacings equal to the widths of the bars.

As discussed earlier, the ideal image is not always the optimal image for a given task. In general, restoration of partially coherent imaging with an implicit spatially incoherent imaging model will produce visually pleasing images that are not necessarily faithful to the object plane intensity. While they are not faithful, they often preserve and even enhance edge information, and the overall image may appear sharper than the incoherent image. When end task performance is improved, the image may be considered more optimal than the ideal image. It is important to keep in mind that some of the spatial information may be misleading, and a more complete understanding of the coherence may be needed for precise image plane measurements. This warning is relevant in microscopy, where the user is often encouraged to increase image contrast by reducing illumination spatial coherence. Annular sources can produce highly complicated spatial coherence functions that will strongly impact the restoration of such images.

Summary and Discussion

The coherence of the illumination at the object plane is important in understanding image formation.
Figure 20 High spatial coherence disk source (K = 0.2) imaging for (a) best focus, (b) moderate misfocus, and (c) high misfocus.
Highly coherent imaging systems produce high-contrast images with high depth of field and provide the opportunity for sophisticated manipulation of the image with frequency plane filtering. Darkfield imaging and phase contrast imaging are examples of frequency plane filtering. Unfortunately, coherent systems are sensitive to optical noise and are generally avoided in practical system design. Relaxing the temporal coherence of the illumination can provide some improvement, but reducing the spatial coherence is more powerful in combating noise artifacts.

Research microscopes, machine vision systems, and optical lithography systems are the most prominent examples of partially coherent imaging systems. These systems typically employ adjustable spatially incoherent extended sources in the shapes of disks or annuli. The exact shape and size of the primary source shapes the angular extent of the illumination at the object plane and determines the spatial coherence at the object plane. Spatial coherence effects can be significant, even for broadband light. Control over the object plane spatial coherence allows the designer to find a tradeoff between the various strengths and weaknesses of coherent and incoherent imaging systems.

As more imaging systems employ post-detection processing, there is an opportunity to design fundamentally different systems that effectively split the image formation process into a physical portion and a post-detection portion. The simple example of image deblurring cited in this article shows that object plane coherence can affect the nature of the digitally restored image. The final image can be best understood when the illumination coherence effects are well understood. The spatially incoherent case results in the most straightforward model for image interpretation, but is not always the best choice, since the coherence can often be manipulated to increase the contrast and hence the amount of useful information in the raw image. A completely general approach to
Figure 21 Imaging with a fixed amount of misfocus and varying object spatial coherence produced by: (a) highly incoherent disk source illumination (K = 2); (b) moderate coherence disk source (K = 0.5); (c) high coherence disk source (K = 0.2); and (d) annular source with outer diameter corresponding to K = 0.5 and inner diameter corresponding to K = 0.45.

Figure 22 Digital restoration of blurred imagery with an inherent assumption of linear-in-intensity imaging for (a) highly spatially incoherent imaging (K = 2) and (b) high spatial coherence (K = 0.2).
114 COHERENCE / Speckle and Coherence
imaging would treat the coherence of the source, the imaging optics, and the post-detection restoration all as free variables that can be manipulated to produce the optimal imaging system for a given task.

See also

Coherent Lightwave Systems. Information Processing: Coherent Analogue Optical Processors. Terahertz Technology: Coherent Terahertz Sources.

Further Reading

Bartelt H, Case SK and Hauck R (1982) Incoherent optical processing. In: Stark H (ed.) Applications of Optical Fourier Transforms, pp. 499–535. London: Academic Press.
Born M and Wolf E (1985) Principles of Optics, 6th edn. (corrected), pp. 491–554. Oxford, UK: Pergamon Press.
Bracewell RN (1978) The Fourier Transform and its Applications. New York: McGraw Hill.
Gonzales RC and Woods EW (1992) Digital Image Processing, 3rd edn., pp. 253–304. Reading, MA: Addison-Wesley.
Goodman JW (1985) Statistical Optics, 1st edn. New York: John Wiley & Sons.
Goodman JW (1996) Introduction to Fourier Optics, 2nd edn. New York: McGraw Hill.
Hecht E and Zajac A (1979) Optics, 1st edn., pp. 397–435. Reading, MA: Addison-Wesley.
Inoue S and Spring KR (1997) Video Microscopy, 2nd edn. New York: Plenum.
Reynolds GO, DeVelis JB, Parrent GB and Thompson JB (1989) Physical Optics Notebook: Tutorials in Fourier Optics, 1st edn. Bellingham, WA: SPIE; New York: AIP.
Saleh BEA and Teich MC (1991) Fundamentals of Photonics, pp. 343–381. New York: John Wiley & Sons.
Figure 1 Ground glass in sunlight: (a) the image of a ground glass, illuminated by the sun, observed with small aperture (from large
distance), displays no visible speckles; (b) a medium observation aperture (smaller distance) displays weak visible speckles;
(c) observation with very high aperture displays strong contrast speckles; and (d) the image of the ground glass on a cloudy day does
not display speckles, due to the very big aperture of illumination. Reproduced with permission from Häusler G (1999) Optical sensors
and algorithms for reverse engineering. In: Donah S (ed.) Proceedings of the 3rd Topical Meeting on Optoelectronic
Distance/Displacement Measurements and Applications. IEEE-LEOS.
this incoherent averaging starts for distances larger than 250 mm, which is the comfortable viewing distance. Adults cannot see objects from much closer distances. We generally do not see speckles, not just at the limit of maximum accommodation. For larger distances, if we take zo = 2500 mm, the resolvable distance at the object will be 1.1 mm, which is 10 times larger than the diameter of the coherence area. Averaging over such a large area will drop the interference contrast by a factor of 10. Note that such a small interference contrast might not be visible, but it is not at zero!

Coming to the second assumption, we can laterally resolve distances smaller than dG. In order to understand this, we first have to learn what is an 'optically rough' object surface. Figure 6 illustrates the problem.

A surface is commonly called 'rough' if the local height variations are larger than λ. However, a surface appears rough only if the height variation within the distance that can be resolved by the observer is larger than λ (reflected light will travel the distance twice, hence a roughness of λ/2 will be sufficient). Then the scattered different phasors, or 'Huygens elementary waves':

us ∼ exp[i2kz(x, y)]   [6]

scattered from the object at x, y may have big phase differences, so that destructive interferences (and speckle) can occur in the area of the diffraction image. So the attribute 'rough' depends on the object as well as on the observation aperture. With a high observation aperture (microscope), the diffraction image is small and within that area the phase differences might be small as well. So a ground glass may then look locally like a mirror, while with a small observation aperture it appears 'rough'.

With the 'rough' observation mode assumption, we can summarize what we have assumed: with resolving distances smaller than the coherence width dG we will see high interference contrast; in fact we see a speckle contrast of C = 1 if the roughness of the object is smaller than the coherence length of the source. This is the case for most metal surfaces, for ground glasses, or for worked plastic surfaces. The assumption is not true for 'translucent' surfaces, such as skin, paper, or wood. This will be discussed below.

The observation that coherent imaging is achieved if we can resolve distances smaller than the coherence width dG is identical to the simple rule that fully coherent imaging occurs if the observation aperture uo is larger than the illumination aperture uq.

As mentioned, we will incoherently average in the image plane, over some object area determined by the size of the diffraction spot. According to the rules of speckle-averaging, the speckle contrast C decreases with the inverse square root of the number N of incoherently averaged speckle patterns. This number N is equal to the ratio of the area Adiff of the diffraction spot divided by the coherence area AG = dG². So we obtain for the speckle contrast C:

C = 1   for uq < uo   [7a]

C = 1/√N = uo/uq   for uq ≥ uo   [7b]

We summarize the results in Figure 7.

Equation [7b] has an interesting consequence: we never get rid of speckle noise, even for large illumination apertures and small observation apertures. In many practical instruments, such as a slide projector, the illumination 'aperture stop' cannot be greater than the imaging lens. Fortunately, the observer's eye commonly has a pupil smaller than that of the projector, and/or looks from a distance at the projection screen. Laser projection devices, however, cause strong and disturbing speckle effects for the user, and significant effort is invested to cope with this effect.
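The 1/√N averaging rule behind eqn [7b] is easy to check with a toy Monte Carlo experiment, assuming the standard exponential intensity statistics of fully developed speckle (sample counts below are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)

def contrast(intensity):
    """Speckle contrast C = standard deviation / mean of the intensity."""
    return intensity.std() / intensity.mean()

M = 200_000                                    # samples per pattern (arbitrary)
single = rng.exponential(1.0, M)               # one fully developed pattern: C = 1
N = 16                                         # number of independent patterns averaged
averaged = rng.exponential(1.0, (N, M)).mean(axis=0)

C_single = contrast(single)                    # close to 1
C_avg = contrast(averaged)                     # close to 1/sqrt(16) = 0.25
```

Averaging N = 16 statistically independent speckle patterns drops the contrast by the predicted factor of √N = 4, mirroring the incoherent averaging over the diffraction spot described above.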
Figure 8 Microfiche projection. For high observation aperture (a) strong speckles occur, and for small observation aperture (b), the
speckle contrast is low.
wavelengths, and come to white light interferometry; see section on speckles as carriers of information below. The result is – not surprisingly – the same as that found for the Rayleigh depth of field:

dz = λ/sin²uo   [9]

Coherent noise is the source of the fundamental limit of the distance measuring uncertainty dz of triangulation-based sensors:

dz = C·λ/(sin uo sin u)   [10]

where C is the speckle contrast, λ is the wavelength, sin uo is the aperture of observation, and u is the angle of triangulation (between the direction of the projection and the direction of observation). For a commercially available laser triangulation sensor (see Figure 11), with sin uo ≈ 0.01, C = 1 and u = 30°, the measuring uncertainty dz will be larger than 100 μm. We may add that for sin uo = sin u, which is valid for auto-focus sensors (such as the confocal scanning microscope), eqn [10] degenerates to the well-known Rayleigh depth of field (eqn [9]).

The above results are remarkable in that we cannot know the accurate position of the projected spot or the position of an intrinsic reflectivity feature; we also cannot know the accurate local reflectivity of a coherently illuminated object. A further consequence is that we do not know the accurate shape of an object in 3D-space. We can calculate this 'physical measuring uncertainty' from the considerations above.

These consequences hold for optically rough surfaces, and for measuring principles that exploit the local intensity of some image, such as with all triangulation types of sensors. The consequences of these considerations are depressing: triangulation with a strong laser is not better than triangulation with only one single photon. The deep reason is that all coherent photons stem from the same quantum mechanical phase cell and are indistinguishable.

Figure 11 Principle of laser triangulation. The distance of the projected spot is calculated from the location of the speckled spot image and from the angle of triangulation. The speckle noise introduces an ultimate limit of the achievable measuring uncertainty. Reproduced with permission from Physikalische Blätter (May 1997): 419.

Hence, many photons do not supply more information than one single photon.

Why Are We Not Aware of Coherence?

Generally, coherent noise is not well visible; even in sophisticated technical systems the visibility might be low. There are two main reasons. First, the observation aperture might be much smaller than the aperture of illumination. This is often true for large-distance observation, even with small apertures of illumination. The second reason holds for technical systems, if the observation is implemented via pixelized video targets. If the pixel size is much larger than a single speckle, which is commonly so, then, by averaging over many speckles, the noise is greatly reduced. However, we have to take into account that we pay for this by a loss of lateral resolution 1/dx. We can formulate another uncertainty relation:

dx > λ/(C·sin uo)   [11]

which says that if we want to reduce the speckle contrast C (noise) by lateral averaging, we can do this but lose lateral resolution 1/dx.

Can We Overcome Coherence Limits?

Since the daily interaction of light with matter is coherent scattering, we can overcome the limit of eqn [10] only by looking for measuring principles that are not based on the exploitation of local reflectivity, i.e., on 'conventional imaging'. Concerning optical 3D-measurements, the principle of triangulation uses the position of some image detail for the calculation of the shape of objects, and has to cope with coherent noise.

Are there different mechanisms of photon–matter interaction, with noncoherent scattering? Fluorescence and thermal excitation are incoherent mechanisms. It can be shown (see Figure 12) that triangulation utilizing fluorescent light displays much less noise than given by eqn [10]. This is exploited in fluorescent confocal microscopy.

Thermally excited matter emits perfectly incoherent radiation as well. We use this incoherence to measure the material wear in laser processing (see Figure 13), with, again, much better accuracy than given by eqns [9,10].

The sensor is based on triangulation. Nevertheless, by its virtually zero coherent noise, it allows a measuring uncertainty which is limited only by camera noise and other technical imperfections. The uncertainty of the depth measurement through the narrow aperture of the laser nozzle is only a few
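The worked example quoted for eqn [10] (sin uo ≈ 0.01, C = 1, u = 30°) can be reproduced directly; the wavelength is not stated in the text, so a HeNe-like λ = 0.633 μm is assumed here:

```python
import math

def triangulation_uncertainty(C, wavelength, sin_uo, theta_deg):
    """Speckle-limited distance uncertainty of a triangulation sensor, eqn [10]:
    dz = C * lambda / (sin uo * sin theta)."""
    return C * wavelength / (sin_uo * math.sin(math.radians(theta_deg)))

# values quoted in the text: sin uo ~ 0.01, C = 1, theta = 30 deg
dz = triangulation_uncertainty(C=1.0, wavelength=0.633e-6, sin_uo=0.01, theta_deg=30.0)
# dz comes out near 1.3e-4 m, i.e., larger than 100 micrometers, as stated

# for sin uo = sin theta the formula degenerates to the Rayleigh depth of field, eqn [9]
dz_rayleigh = triangulation_uncertainty(
    C=1.0, wavelength=0.633e-6, sin_uo=0.1,
    theta_deg=math.degrees(math.asin(0.1)))
```

The second call checks the degenerate auto-focus case: with sin uo = sin u the result equals λ/sin²uo.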
Broadband Illumination

Speckle noise can hardly be avoided by broadband illumination. The speckle contrast C is related to the surface roughness σz and the coherence length lc by eqn [12]. So, only for very rough surfaces and white light illumination can C be reduced. However, the majority of technical objects are smooth, with a roughness of a few micrometers, hence speckles can be observed even with white light illumination (lc ≈ 3 μm). This situation is different for 'translucent' objects such as skin, paper, or some plastic material.
We call the real shape of the object, or the surface profile, z(x, y), where z is the local distance at the position (x, y). We give the measured data an index 'm':

(i) The physical measuring uncertainty of the measured data is independent from the observation aperture. This is quite a remarkable property, as it enables us to take accurate measurements at the bottom of deep boreholes.

(ii) The standard deviation σz of the object can be calculated from the measured data zm: σz = ⟨|zm|⟩. According to German standards, σz corresponds to the roughness measure Rq.

This measure Rq was calculated from measurements at different roughness gauges, with different observation apertures. Two measurements are depicted in Figure 16.

The experiments shown in Figure 16 display the correct roughness measure, even if the higher frequencies of the object spectrum Z(ν, μ) (Figure 17) are not optically resolved. The solution of this paradox might be explained as follows.

From the coherently illuminated object each point scatters a spherical wave. The waves scattered from different object points have, in general, a different phase. At the image plane, we see the laterally
Figure 14 ‘Coherence radar’, white light interferometry at rough surfaces. The surface under test is illuminated with high spatial
coherence and low temporal coherence. We acquire the temporal coherence function (correlogram) within each speckle, by scanning
the object under test in depth.
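The roughness estimate of item (ii) can be sketched on synthetic data; the Gaussian profile and its scale below are assumptions for illustration only:

```python
import numpy as np

rng = np.random.default_rng(1)

# synthetic measured profile z_m (hypothetical units of micrometers), zero mean
z_m = rng.normal(0.0, 2.0, 100_000)

# roughness measure from the text: sigma_z = <|z_m|>, the mean absolute deviation
sigma_z = float(np.mean(np.abs(z_m - z_m.mean())))
```

Note that for a Gaussian profile ⟨|zm|⟩ equals σ√(2/π) ≈ 0.8σ, so this estimator is proportional to, rather than identical with, the rms roughness.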
Pedrotti F and Pedrotti L (1993) Introduction to Optics, 2nd edn. New Jersey: Prentice Hall.
Pérez J-Ph (1996) Optik. Heidelberg: Spektrum Akademischer Verlag.
Sharma A, Kandpal HC and Ghatak AK (2004) Optical Coherence. See this Encyclopedia.
van der Gracht J (2004) Coherence and Imaging. See this Encyclopedia.
Young M (1993) Optics and Lasers, Including Fibers and Optical Wave Guides, 4th revised edn. Berlin, Heidelberg: Springer Verlag.
COHERENT CONTROL
Contents
Theory
Experimental
Applications in Semiconductors
that in Figure 1 would also apply to steering about the density matrix. The process depicted in Figure 1 may be thought of as a microscale analog of the classic double-slit experiment for light waves. As the goal is often high-finesse focusing into a particular target quantum state, success may call for the manipulation of many quantum pathways (i.e., the notion of a 'many-slit' experiment). Most applications are concerned with achieving a desirable outcome for the expectation value ⟨ψf|O|ψf⟩ associated with some observable Hermitian operator O.

The practical realization of the task above becomes a control problem when the system is expressed through its Hamiltonian H = H0 + Vc, where H0 is the free Hamiltonian describing the dynamics without explicit control; it is assumed that the free evolutionary dynamics under H0 will not satisfactorily achieve the physical objective. Thus, a laboratory-accessible control term Vc is introduced in the Hamiltonian to achieve the desired manipulation. Even just considering radiative interactions, the form of Vc could be quite diverse depending on the nature of the system (e.g., nuclear spins, electrons, atoms, molecules, etc.) and the intensity of the radiation field. Many problems may be treated through an electric dipole interaction Vc = −μ·ε(t), where μ is a system dipole moment and ε(t) is the laser electric field. Where appropriate, this article will consider the interaction in this form, but other suitable radiative control interactions may be equally well treated. Thus, the laser field ε(t) as a function of time (or frequency) is at our disposal for attempted manipulation of quantum systems.

Before considering any practical issues associated with the identification of shaped laser control fields, a fundamental question concerns whether it is, in principle, possible to steer about any particular quantum system from an arbitrary initial state |ψi⟩ to an arbitrary final state |ψf⟩. Questions of this sort are addressed by a controllability analysis of Schrödinger's equation

iħ ∂|ψ(t)⟩/∂t = [H0 − μ·ε(t)]|ψ(t)⟩,   |ψ(0)⟩ = |ψi⟩   [1]

Controllability concerns whether, in principle, some field ε(t) exists, such that the quantum system described by H = H0 − μ·ε(t) permits arbitrary degrees of control. For finite-dimensional quantum systems (i.e., those described by evolution amongst a discrete set of quantum states), the formal tools for such a controllability analysis exist both for evaluating the controllability of the wave function, as well as the more general time evolution operator U(t), which satisfies

iħ ∂U/∂t = [H0 − μ·ε(t)]U,   U(0) = 1   [2]

Analyses of this type can be quite insightful, but they require detailed knowledge about H0 and μ. Cases involving quantum information science applications are perhaps the most demanding with regard to achieving total control. Most other physical applications would likely accept much more modest levels of control and still be categorized as excellent achievements.

Theoretical tools and concepts have a number of roles in considering the control of quantum systems, including: (1) an exploration of physical/chemical phenomena under active control; (2) the design of viable control fields; (3) the development of algorithms to actively guide laboratory control experiments towards achieving their dynamical objectives; and (4) the introduction of special algorithms to reveal the physical mechanisms operative in the control of quantum phenomena. Activities (1)–(3) have thus far been the primary focus of theoretical studies in this area, and it is anticipated that item (4) will grow in importance in the future.

A few comments on the history of laser control over quantum systems are relevant, as they speak to the special nature of the currently employed successful closed-loop quantum control experiments. Starting in the 1960s and spanning roughly 20 years, it was thought that the design of lasers to manipulate molecular motion could be achieved by the application of simple physical logic and intuition. In particular, the thinking at the time focused on using cw laser fields resonant with one or more local modes of the molecule, as state or energy localization was believed to be the key to successful control. Since quantum dynamics phenomena typically occur on ultrafast time-scales, possibly involving spatially dispersed wave packets, expecting to achieve high quality control with a light source operating at one or two resonant frequencies is generally wishful thinking. Quantum dynamics phenomena occur in a multifrequency domain, and controls with a few frequencies will not suffice. Over approximately the last decade, it became clear that successful control often calls for manipulating multiple interfering quantum pathways (cf. Figure 1). In turn, this recognition led to the need for broad-bandwidth laser sources (i.e., tailored laser pulses). Fortunately, the necessary laser pulse-shaping technologies have become available, and these sources continue to expand into new frequency ranges with enhanced bandwidth capabilities. Many chemical and physical
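Eqns [1] and [2] can be propagated numerically for a toy two-level system. All parameters below (ħ = 1 units, level spacing, dipole operator, field amplitude) are illustrative assumptions, not values from the article:

```python
import numpy as np

H0 = np.diag([0.0, 1.0])                      # free Hamiltonian, hbar = 1 (assumed units)
mu = np.array([[0.0, 1.0], [1.0, 0.0]])       # dipole operator (assumed)

def eps(t):
    return 0.2 * np.cos(1.0 * t)              # field resonant with the 0 -> 1 gap

def step(U, t, dt):
    """One short-time step of eqn [2]: U <- exp(-i (H0 - mu*eps(t)) dt) U."""
    H = H0 - mu * eps(t)
    w, V = np.linalg.eigh(H)                  # exact exponential of the Hermitian H
    return (V * np.exp(-1j * w * dt)) @ V.conj().T @ U

U = np.eye(2, dtype=complex)                  # U(0) = 1
dt = 0.01
for n in range(5000):
    U = step(U, n * dt, dt)

p1 = abs(U[1, 0]) ** 2                        # population driven from |0> into |1>
```

Because each step applies an exact exponential of a Hermitian matrix, U stays unitary, and the resonant field steers substantial population between the two states, a minimal instance of the steering the controllability analysis addresses.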
COHERENT CONTROL / Theory 125
applications of control over quantum phenomena involve performing a significant amount of work (i.e., quantum mechanical action), and sufficient laser field intensity is required. A commonly quoted adage is 'no field, no yield'; at the other extreme, there was much speculation that operating nonlinearly at high field intensities would lead to a loss of control because the quantum system would effectively act as an amplifier of even weak laser field noise. Fortunately, the latter outcome has not occurred, as is evident from a number of successful high field laser control experiments.

Quantum control may be expressed as an inverse problem with a prescribed chemical or physical objective, the task being the discovery of a suitable control field ε(t) to meet the objective. Normally, Schrödinger's equation [1] is viewed as linear, and this perspective is valid if the Hamiltonian is known a priori, leaving the equation to be solved for the wave function. However, in the case of quantum control, neither the wave function nor the control field is known a priori, and the two enter bilinearly on the right-hand side of eqn [1]. Thus, quantum control is mathematically a nonlinear inverse problem. As such, one can anticipate possibly diverse behavior in the process of seeking successful controls, as well as in the ensuing quantum dynamical control behavior. The control must take into account the evolving system dynamics, thereby resulting in the control field being a function(al) of the current state of the system, ε(ψ), in the case of quantum control design, or dependent on the system observations, ε(⟨O⟩), in the case of laboratory closed-loop field discovery efforts, where ⟨O⟩ = ⟨ψ|O|ψ⟩. Furthermore, the control field at the current time t will depend in some implicit fashion on the future state of the evolving system at the final target time T. It is evident that control field design and laboratory discovery present highly nontrivial tasks.

Although the emphasis in this review is on the theoretical aspects of quantum control theory, the subject exists for its laboratory realizations, and those realizations, in turn, intimately depend on the capabilities of theoretical and algorithmic techniques for their successful implementations. Accordingly, theoretical laser control field design techniques will be discussed, along with algorithmic aspects of current laboratory practice. This material respectively will focus on optimal control theory (OCT) and optimal control experiments (OCEs), as seeking optimality provides the best means to achieve any posed objective. Finally, the last part of the article will present some general conclusions on the state of the quantum control field.

Quantum Optimal Control Theory for Designing Laser Fields

When considering control in any domain of application, a reasonable approach is to computationally design the control for subsequent implementation in the laboratory. Quantum mechanical laser field design has taken on a number of realizations. At one limit is the application of simple intuition for design purposes, and in some special cases (e.g., the weak field perturbation theory regime), this approach may be applicable. Physical insights will always play a central role in laser field design, but to be especially useful, they need to be channeled into the proper mathematical framework. Many of the interesting applications operate in the strong-field nonperturbative regime. In this domain, serious questions arise regarding whether sufficient information is available about the Hamiltonian to execute reliable designs. Regardless of whether the Hamiltonian is known accurately or is an acknowledged model, the theoretical study of quantum control can provide physical insight into the phenomena involved, as well as possibly yielding trial laser pulses for further refinement in the laboratory.

Achieving control over quantum mechanical phenomena often involves a balance of competing dynamical processes. For example, in the case of aiming to create a particular excitation in a molecule or material, there will always be the concomitant need to minimize other unwanted excitations. Often, there are also limitations on the form, intensity, or other characteristics of the laser controls that must be adhered to. Another goal is for the control outcome to be as robust as possible to laser field fluctuations and Hamiltonian uncertainties. Overriding all of these issues is merely the desire to achieve the best possible physical result for the posed quantum mechanical objective. In summary, all of these desired extrema conditions translate over to posing the design of control laser fields as an optimization problem. Thus, optimal control theory forms a basic foundation for quantum control field design.

Control field design starts by specification of the Hamiltonian components H0 and μ in eqn [1], with the goal of finding the best control electric field ε(t) to balance the competing objectives. Schrödinger's equation must be solved as part of this process, but this effort is not merely a forward propagation task, as the control field is not known a priori. As an optimal design is the goal, a cost functional J = J(objectives, penalties, ε(t)) is prescribed, which contains the information on the physical objectives, competing penalties, and any costs or constraints associated with the structure of the laser field. The functional J could
126 COHERENT CONTROL / Theory
contain many terms if the competing physical goals are highly complex. As a specific simple illustration, consider the common goal of steering the system to achieve an expectation value ⟨ψ(T)|O|ψ(T)⟩ as close as possible to the target value O_target associated with the observable operator O at time T. An additional cost is imposed to minimize the laser fluence. These criteria may be embodied in a cost functional of the form

J = [⟨ψ(T)|O|ψ(T)⟩ − O_target]² + ω ∫₀ᵀ ε²(t) dt − 2 Im ∫₀ᵀ dt ⟨λ(t)| [iħ ∂/∂t − H₀ + μ·ε(t)] |ψ(t)⟩   [3]

Here, the parameter ω ≥ 0 is a weight to balance the significance of the fluence term relative to achieving the target expectation value, and |λ(t)⟩ serves as a Lagrange multiplier, assuring that Schrödinger's equation is satisfied during the variational minimization of J with respect to the control field. Carrying out the latter minimization will lead to eqn [1], along with the additional relations

iħ (∂/∂t)|λ(t)⟩ = [H₀ − μ·ε(t)] |λ(t)⟩,   |λ(T)⟩ = −σ O |ψ(T)⟩   [4]

ε(t) = (1/ω) Im ⟨λ(t)|μ|ψ(t)⟩   [5]

σ = ⟨ψ(T)|O|ψ(T)⟩ − O_target   [6]

Equations [1] and [4]–[6] embody the OCT design equations that must be solved to yield the optimal field ε(t) given in eqn [5]. These design equations have an unusual mathematical structure within quantum mechanics. First, insertion of the field expression in eqn [5] into eqns [1] and [4] produces two cubically nonlinear coupled Schrödinger-like equations. These equations are in the same family as the standard nonlinear Schrödinger equation, and as such, one may expect that unusual dynamical behavior could arise. Although eqn [4] for |λ(t)⟩ is identical in form to the Schrödinger equation [1], only |ψ(t)⟩ is the true wavefunction, with |λ(t)⟩ serving to guide the controlled quantum evolution towards the physical objective. Importantly, eqn [1] is an initial value problem, while eqn [4] is a final value problem. Thus, the two equations together form a two-point boundary value problem in time, which is an inherent feature of temporal engineering optimal control as well. The boundary value nature of these equations typically leads to the existence of multiple solutions, where σ in eqn [6] plays the role of a discrete eigenvalue, specifying the quality of the particular achieved control field design. The fact that there are generally multiple solutions to the laser design equations can be attractive from a physical perspective, as the designs may be sorted through to identify those most attractive for laboratory implementation.

The overall mathematical structure of eqns [1] and [4]–[6] can be melded together into a single design equation

iħ (∂/∂t)|ψ(t)⟩ = [H₀ − μ·ε(ψ, σ, t)] |ψ(t)⟩,   |ψ(0)⟩ = |ψᵢ⟩   [7]

The structure of this equation embodies the comments in the introduction that control field design is inherently a nonlinear process. The main source of complexity arising in eqn [7] is the control field ε(ψ, σ, t) depending not only on the wavefunction at the present time t, but also on the future value of the wavefunction at the target time T, contained in σ. This structure again reflects the two-point boundary value nature of the control equations.

Given the typical multiplicity of solutions to eqns [1] and [4]–[6], or equivalently eqn [7], it is attractive to consider approximations to these equations (except perhaps for the inviolate Schrödinger equation [1]), and much work continues to be done along these lines. One case involves what is referred to as tracking control, whereby a path is specified for ⟨ψ(t)|O|ψ(t)⟩ evolving from t = 0 out to t = T. Tracking control eliminates σ to produce a field of the form ε(ψ, t), thereby permitting the design equations to be explicitly integrated as a forward-marching problem toward the target; the tracking equations are still nonlinear with respect to the evolving state. Many variations on these concepts can be envisioned, and other approximations may also be introduced to deal with special circumstances. One technique appropriate for at least few-level systems is stimulated Raman adiabatic passage (STIRAP), which seeks robust adiabatic passage from an initial state to a particular final state, typically with nearly total destructive interference occurring in the intermediate states. This design technique can have especially attractive robustness characteristics with respect to field errors. STIRAP, tracking, and various perturbation theory-based control design techniques can be expressed as special cases of OCT, with suitable cost functionals and constraints. Further approximation methods will surely be developed, with the rigorous OCT concepts forming the general foundation for control field design.

As the OCT equations are inherently nonlinear, their solution typically requires numerical iteration,
and a variety of procedures may be utilized for this purpose. Almost all of the methods employ local search techniques (e.g., gradient methods), and some also show monotonic convergence with respect to each new iteration step. Local techniques typically evolve to the nearest solution in the space of possible control fields. Such optimizations can be numerically efficient, although the quality of the attained result can depend on the initial trial for the control field and the details of the algorithm involved. In contrast, global search techniques, such as simulated annealing or genetic algorithms, can search more broadly for the best solution in the control field space. These more expansive searches are attractive, but at the price of typically requiring more intensive computations. Effort has also gone into seeking global or semiglobal input → output maps relating the electric field ε(t) structure to the observable O[ε(t)]. Such maps may be learned, hopefully, from a modest number of computations with a selected set of fields, and then utilized as high-speed interpolators over the control field space to permit the efficient use of global search algorithms to attain the best control solution possible. At the foundation of all OCT design procedures is the need to solve the Schrödinger equation. Thus, computational technology to improve this basic task is of fundamental importance for designing laser fields.

Many computations have been carried out with OCT to attain control field designs for manipulating a host of phenomena, including rotational, vibrational, electronic, and reactive dynamics of molecules, as well as electron motion in semiconductors. Every system has its own rich details, which, in turn, are compounded by the fact that multiple control designs will typically exist in many applications producing comparable physical outcomes. Collectively, these control design studies confirm the manipulation of constructive and destructive quantum wave interferences as the general mechanism for achieving successful control over quantum phenomena. This conclusion may be expressed as the following principle:

Control field–system cooperativity principle: Successful quantum control requires that the field must have the proper structure to take full advantage of all of the dynamical opportunities offered by the system, to best satisfy the physical objective.

This simple statement of system–field cooperativity embodies the richness, as well as the complexity, of seeking control over quantum phenomena, and also speaks to why simple intuition alone has not proved to be a generally viable design technique. Quantum systems can often exhibit highly complex dynamical behavior, including broad dispersion of the wave packet over spatial domains or multitudes of quantum states. Handling such complexity can require fields with subtle structure to interact with the quantum system in a global fashion to manage all of the motions involved. Thus, we may expect that successful control pulses will often have broad bandwidth, including amplitude and phase modulation. Until relatively recently, laser sources with these characteristics were not available, but the technology is now in hand and rapidly evolving (see the discussion later).

The cooperativity principle above is of fundamental importance in the control of all quantum phenomena, and a simple illustration of this principle is shown in Figure 2 for the control of wave packet motion on an excited state of the NO molecule. The target for the control is a narrow wave packet located over the well of the excited B state. The optimal control field consists of two coordinated pulses, at early and late times, with both features having internal structure. The figure indicates that the ensuing excited state wave packet has two components, one of which actually passes through the target region during the initial evolution, only to return again and meet the second component at just the right place and time, to achieve the target objective as best as possible. Similar cooperativity interpretations can be found in virtually all implementations of OCT.

The main reason for performing quantum field design is to ultimately implement the designs in the laboratory. The design procedures must face laboratory realities, which include the fact that most Hamiltonians are not known to high accuracy (especially for polyatomic molecules and complex solid-state structures), and, secondly, that a variety of laboratory field imperfections may unwittingly be present. Notwithstanding these comments, OCT has been fundamental to the development of quantum control, including laying out the logic for how to perform the analogous optimal control experiments. Design implementations and further OCT development will continue to play a basic role in the quantum control field. At present, perhaps the most important contribution of OCT has been to (1) highlight the basic control cooperativity principle above, and (2) provide the basis for developing algorithms to successfully guide optimal control experiments, as discussed below.

Algorithms for Implementing Optimal Control Experiments

The ultimate purpose of considering control theory within quantum mechanics is to take the matter into
Figure 2 Control of wave packet evolution on the electronically excited B state of NO, with the goal of creating a narrow final packet over the excited state well. (a) indicates that the ground-state packet is brought up in two pieces using the coordinated dual pulse in (b). The piece of the packet from the first pulse actually passes through the target region, then bounces off the right side of the potential, to finally meet the second piece of the packet at just the right time over the target location r* for successful control. This behavior is an illustration of the control field–system cooperativity principle stated in the text.
the laboratory for implementation and exploitation of its capabilities. Ideally, theoretical control field designs may be attained using the techniques discussed above, followed by the achievement of successful control upon execution of the designs in the laboratory. This appealing approach is burdened with three difficulties: (1) Hamiltonians are often imprecisely known; (2) accurately solving the design equations can be a significant task; and (3) realization of any given design will likely be imperfect due to laboratory noise or other unaccounted-for systematic errors. Perhaps the most serious of these difficulties is point (1), especially considering that the best quality control will be achieved by maximally drawing on subtle constructive and destructive quantum wave interference effects. Exploiting such subtleties will generally require high-quality control designs that, in turn, depend on having reliable Hamiltonians. Although various designs have been carried out seeking robustness with respect to Hamiltonian uncertainties, the issue in point (1) should remain of significance in the foreseeable future, especially for the most complex (and often, the most interesting!) chemical/physical applications. Mitigating this serious problem is the ability to create shaped laser pulse controls and apply them to a quantum system, followed by a probe of their effects, at an unprecedented rate of thousands or more independent trials per minute. This unique capability led to the suggestion of partially, if not totally, sidestepping the design process by performing closed-loop experiments to let the quantum system teach the laser how to achieve its control in the laboratory. Figure 3 schematically shows this closed-loop process, drawing on the following logic:

1. The molecular view. Although there may be theoretical uncertainty about the system Hamiltonian, the actual chemical/physical system under study 'knows' its own Hamiltonian precisely! This knowledge would also include any unusual exigencies, perhaps associated with structural or other defects in the particular sample. Furthermore, upon exposure to a control field, the system 'solves' its own Schrödinger equation impeccably accurately and as fast as possible, in real time. Considering these points, the aim is to replace the arduous offline digital computer control field design effort with the actual quantum system under study acting as a precise analog computer, solving the true equations of motion.

2. Control laser technology. Pulse shaping under full computer control may be carried out using even hundreds of discrete elements in the frequency domain controlling the phase and amplitude structure of the pulse. This technology is readily available and expanding in terms of pulse center frequency flexibility and bandwidth capabilities.
Figure 3 A schematic of the closed-loop concept for allowing the quantum system to teach the laser how to achieve its control. The actual quantum system under control is in a loop with a laser and pulse shaper, all slaved together with a pattern recognition algorithm to guide the excursions around the loop. The success of this concept relies on the ability to perform very large numbers of control experiments in a short period of time. In principle, no knowledge of the system Hamiltonian is required to steer the system to the desired final objective, although a good trial design ε₀(t) may accelerate the process. The ith cycle around the loop attempts to find a better control field εᵢ(t) such that the system response ⟨O(T)⟩ᵢ for the observable operator O comes closer to the desired value O_target.
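The closed-loop logic of Figure 3 can be illustrated with a toy simulation, entirely hypothetical: an eight-knob 'pulse shaper' drives a black-box 'molecule' whose hidden phase profile plays the role of the unknown Hamiltonian, and a genetic algorithm adjusts the knob settings using only the measured observable, in the spirit of letting the system act as its own analog computer.

```python
import math
import random

random.seed(1)

N_KNOBS = 8     # spectral-phase 'knobs' on the hypothetical pulse shaper
TARGET = 1.0    # desired value of the measured observable <O>

# Hidden phase profile known only to the 'molecule'; the learning loop
# never reads it directly, it only sees the measured observable.
_HIDDEN = [random.uniform(0, 2 * math.pi) for _ in range(N_KNOBS)]

def measure(phases):
    """Black-box 'experiment': interfere N unit-amplitude field components
    against the hidden profile and report a yield in [0, 1]."""
    re = sum(math.cos(p - h) for p, h in zip(phases, _HIDDEN))
    im = sum(math.sin(p - h) for p, h in zip(phases, _HIDDEN))
    return (re * re + im * im) / N_KNOBS**2

def fitness(phases):
    # Laboratory-style cost: depends only on the observed outcome.
    return -(measure(phases) - TARGET) ** 2

def learn(pop_size=40, generations=60, sigma=0.3):
    """Closed-loop learning: propose fields, measure, bias the next round."""
    pop = [[random.uniform(0, 2 * math.pi) for _ in range(N_KNOBS)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[:pop_size // 4]            # keep the best quarter
        children = []
        while len(children) < pop_size - len(parents):
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, N_KNOBS)   # one-point crossover
            child = a[:cut] + b[cut:]
            if random.random() < 0.5:            # occasionally nudge a knob
                child[random.randrange(N_KNOBS)] += random.gauss(0, sigma)
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)

best = learn()
print(f"learned <O> = {measure(best):.3f} (target {TARGET})")
```

Only measure() stands in for the apparatus; replacing it with a real pulse-shaper-plus-detector combination would leave the learning loop untouched, which is the essential point of the closed-loop formulation.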
3. Quantum mechanical objectives. Many chemical/physical objectives may be simply expressed as the desire to steer a quantum mechanical flux out one clearly defined channel versus another.

4. Detection of the control action. Detection of the control outcome can often be carried out by a second laser pulse or any other suitable high-duty-cycle detection means, such as laser-induced mass spectrometry. The typical circumstance in point (3) implies that little, if any, time-consuming offline data analysis will be necessary beyond that of simple signal averaging.

5. Fast learning algorithms. All of the four points above may be slaved together using pattern recognition learning algorithms to identify those control fields which are producing better results, and bias the next sequence of experiments in their favor for further improvement. Although many algorithms may be employed for this purpose, the global search capabilities of genetic algorithms are quite attractive, as they may take full advantage of the high-throughput nature of the experiments.

In linking together all of these components into a closed-loop OCE, it is important that no single operational step significantly lags behind any other, for efficiency reasons. Fortunately, this economy is coincident with all the tasks and technologies involved. For achieving control alone, there is no requirement that the actual laser pulse structure be identified on each cycle of the loop. Rather, the laser control 'knob' settings are adequate information for the learning algorithm, as the only criterion is to suitably adjust the knobs to achieve an acceptable value for the chemical/physical objective. The learning algorithm in point (5) operates with a cost functional J, analogous to that used for computational design of fields, but now the cost functional can only depend on those quantities directly observable in the laboratory (i.e., minimally, the currently achieved target expectation value). In cases where a physical understanding is sought about the control mechanism, the laser field structure must also be measured. This additional information is typically not a burden to attain, as it is often only required for the single best control field at the end of the closed-loop learning experiments.

The OCE process in Figure 3 and steps (1)–(5) above constitute the laboratory process for achieving optimal quantum control, and exactly the same reasons put forth for OCT motivate the desire for attaining optimal performance: principally, seeking
the best control result that can possibly be obtained. Seeking optimal performance also ensures some degree of inherent robustness to field noise, as fleeting observations would not likely survive the signal averaging carried out in the laboratory. Additionally, robustness as a specific criterion may also be included in the laboratory cost functional in item (5). When a detailed physical interpretation of the controlled dynamics is desired, it is essential to remove any extraneous control field features that have little impact on the system manipulations. Without the latter clean-up carried out during the experiments, the final field may be contaminated by structures that have little physical impact. Although quantum dynamics typically occurs on femtosecond or picosecond time-scales, the loop excursions in Figure 3 need not be carried out on the latter time-scales. Rather, the loop may be traversed as rapidly as convenient, consistent with the capabilities of the apparatus. A new system sample is introduced on each cycle of the loop. This process is referred to as learning control, to distinguish it from real-time feedback control.

The closed-loop quantum learning control algorithm can be self-starting, requiring no prior control field designs, under favorable circumstances. Virtually all of the current experiments were carried out in this fashion. It remains to be seen if this procedure of 'going in blind' will be generally applicable in highly complex situations where the initial state is far from the target state. Learning control can only proceed if at least a minimal signal is observed in the target state. Presently, this requirement has been met by drawing on the overwhelmingly large number of exploratory experiments that can be carried out, even in a brief few minutes, under full computer control. However, in some situations, the performance of intermediate observations along the way to the target goal may be necessary in order to at least partially guide the quantum dynamics towards the ultimate objective. Beneficial use could also be made of prior theoretical laser control designs ε₀(t) capable of at least yielding a minimal target signal. The OCE closed-loop procedure was based on the growing number of successful theoretical OCT design calculations, even with all of their foibles, especially including less than perfect Hamiltonians. The OCT and OCE processes are analogous, with their primary distinction involving precisely what appears in their corresponding cost functionals. Theoretical guidance can aid in identifying the appropriate cost functionals and learning algorithms for OCE, especially for addressing the needs associated with attaining a physical understanding about the mechanisms governing controlled quantum dynamics phenomena.

A central issue is the rate at which learning control occurs upon excursions around the loop in Figure 3. A number of factors control this rate of learning, including the nature of the physical system, the choice of objective, the presence of field uncertainties and measurement errors, the number of control variables, and the capabilities of the learning algorithm employed. Achieving control is a matter of discrimination, and some of these factors, whether inherent to the system or introduced by choice, may work against attaining good-quality discrimination. At this juncture, little is known quantitatively about the limitations associated with any of these factors.

The latter issues have not prevented the successful performance of a broad variety of quantum control experiments, and at present the ability to carry out massive numbers of closed-loop excursions has overcome any evident difficulties. The number of examples of successful closed-loop quantum control is rapidly growing, with illustrations involving the manipulation of laser dye fluorescence, chemical reactivity, high harmonic generation, semiconductor optical switching, fiber optic pulse transmission, and dynamical discrimination of similar chemical species, amongst others. Perhaps the most interesting cases are those carried out at high field intensities, where prior speculation suggested that such experiments would fail due to even modest laser field noise being amplified by the quantum dynamics. Fortunately, this outcome did not occur, and theoretical studies have indicated that surprising degrees of robustness to field noise may accompany the manipulation of quantum dynamics phenomena. Although the presence of field noise may diminish the beneficial influence of constructive and destructive interferences, evidence shows that field noise does not appear to kill the actual control process, at least in terms of obtaining reasonable values for the control objectives. One illustration of control in the high-field regime is shown in Figure 4, demonstrating the learning process for the dissociative rearrangement of acetophenone to form toluene. Interestingly, this case involves breaking two bonds, with the formation of a third, indicating that complex dynamics can be managed with suitably tailored laser pulses.

Operating in the strong-field regime, especially for chemical manipulations, also has the important benefit of producing a generic laser tool to control a broad variety of systems by overcoming the long-standing problem of having sufficient bandwidth. For example, the result in Figure 4 uses a Ti:sapphire laser in the near-infrared regime. Starting with a bandwidth-limited pulse of intensity near
Figure 5 Three possible closed-loop formulations for attaining laser control over quantum phenomena. (a) is a learning control process, where a new system is introduced on each excursion around the loop. (b) utilizes feedback control, with the loop being traversed with the same sample, generally calling for an accurate model of the system to adjust the controls on each loop excursion. (c) is, at present, a dream machine, where the laser and the sample under control are one unit operating without external algorithmic guidance. The closed-loop learning control procedure in (a) appears to form the most practical means to achieve laser control over quantum phenomena, especially for complex systems.

quantum dynamics may be permitted to occur between one control pulse and the next, to make the time issue manageable. While the learning control process in Figures 3 and 5a can be performed model-free, the feedback algorithm in Figure 5b generally must operate with a sufficiently reliable system Hamiltonian to carry out fast design corrections to the control field, based on the previous control outcome. These are very severe demands, but they may be met under certain circumstances. One reason for considering closed-loop feedback control in Figure 5b is to explore the basic physical issue of quantum mechanical limitations inherent in the statement 'to observe is to disturb'. It is suggestive that control over many types of quantum phenomena may fall into the semiclassical regime, lying somewhere between classical engineering behavior and the

Conclusions

This article presented an overview of theoretical concepts and algorithmic considerations associated with the control of quantum phenomena. Theory has played a central role in this area by revealing the fundamental principles underlying quantum control, as well as by providing algorithms for designing controls and guiding experiments to discover successful controls in the laboratory. The challenge of controlling quantum phenomena, in one sense, is an old subject, going back at least 40 years. However, roughly the first 30 of these years would best be described as a period of frustration, due to a lack of full understanding of the principles involved and the nature of the lasers needed to achieve success. Thus, from another perspective, the subject is quite young, with perhaps the most notable development being the introduction of closed-loop laboratory learning procedures. At the time of writing this article, these procedures are just beginning to be explored for their full capabilities. To appreciate the young nature of this subject, it is useful to note that the analogous engineering control disciplines presently occupy many thousands of engineers worldwide, both theoreticians and practitioners, and have done so for many years. Yet, engineering control is far from being considered a mature subject. Armed now with the basic concepts and proper laboratory tools, one may anticipate a thorough exploration of control over quantum phenomena, including its many possible applications.
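The iterative solution of the OCT design equations described above can also be sketched numerically. The following minimal example is hypothetical throughout (a two-level system with unit level spacing, a σx dipole, and a plain gradient-descent field update standing in for a direct solution of eqn [5]); it shows the characteristic structure of the iteration: propagate |ψ⟩ forward from its initial condition, propagate an adjoint state backward from a terminal condition built from σ of eqn [6], and refine ε(t).

```python
import numpy as np

# Hypothetical two-level model (hbar = 1): level spacing 1, dipole sigma_x,
# observable O = excited-state projector, target expectation 1 (full transfer).
H0 = np.diag([0.0, 1.0])
MU = np.array([[0.0, 1.0], [1.0, 0.0]])
O = np.diag([0.0, 1.0])
O_TARGET = 1.0
OMEGA = 1e-3                        # fluence weight, omega in eqn [3]
T, NSTEP = 20.0, 400
DT = T / NSTEP
TGRID = np.arange(NSTEP) * DT

def step(vec, eps_k, dt):
    """Propagate one interval exactly for a piecewise-constant field."""
    w, V = np.linalg.eigh(H0 - MU * eps_k)
    return (V * np.exp(-1j * w * dt)) @ (V.conj().T @ vec)

def forward(eps):
    psi = [np.array([1.0 + 0j, 0.0])]   # initial condition, as in eqn [1]
    for k in range(NSTEP):
        psi.append(step(psi[-1], eps[k], DT))
    return psi

def cost_and_grad(eps):
    psi = forward(eps)
    sigma = (psi[-1].conj() @ O @ psi[-1]).real - O_TARGET   # eqn [6]
    J = sigma**2 + OMEGA * DT * float(np.sum(eps**2))
    chi = O @ psi[-1]                # adjoint terminal condition at t = T
    grad = np.empty(NSTEP)
    for k in range(NSTEP - 1, -1, -1):
        chi = step(chi, eps[k], -DT)     # bring the adjoint back to t_k
        grad[k] = DT * (-4.0 * sigma * (chi.conj() @ MU @ psi[k]).imag
                        + 2.0 * OMEGA * eps[k])
    return J, grad

eps = 0.10 * np.sin(TGRID)           # weak resonant initial guess
J0, _ = cost_and_grad(eps)
for _ in range(200):                 # plain gradient descent on J
    _, grad = cost_and_grad(eps)
    eps = eps - 0.2 * grad
Jf, _ = cost_and_grad(eps)
pop = abs(forward(eps)[-1][1]) ** 2
print(f"J: {J0:.4f} -> {Jf:.4f}, final excited-state population {pop:.3f}")
```

The two-point boundary value structure noted in the text appears directly here: |ψ⟩ carries an initial condition, while the adjoint carries a terminal condition built from the outcome at time T.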
COHERENT CONTROL / Experimental 133
Experimental
R J Levis, Temple University, Philadelphia, PA, USA

© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

The experimental control of quantum phenomena in atoms, molecules, and condensed phase materials has been a long-standing challenge in the physical sciences. For example, the idea that a laser beam could be used to selectively cleave a chemical bond has been pursued since the demonstration of the laser in the early 1960s. However, it was not until very recent times that one could realize selective bond cleavage. One key to the rapid acceleration of such experiments is the capability of manipulating the relative phases between two or more paths leading to the same final state. This is done in practice by controlling the phase of two (or two thousand!) laser frequencies coupling the initial state to some desired final state. The present interest in phase-related control stems in part from research focusing on quantum information sciences, controlling chemical reactivity, and developing new optical technologies, such as biophotonics for imaging cellular materials. Ultimately, one would like to identify coherent control applications having impact similar to linear-regime applications like compact disk reading (and writing), integrated circuit fabrication, and photodynamic therapy. Note that in each of these the phase of the electromagnetic field is not an important parameter, and only the brightness of the laser at a particular frequency is used. While phase is of negligible importance in any one-photon process, for excitations involving two or more photons, phase not only becomes a useful characteristic, it can modulate the yield of a desired process between zero and one hundred percent.

The use of optical phase to manipulate atomic and molecular systems rose to the forefront of laser science in the 1980s. Prior to this there was considerable effort directed toward using the intensity and high spectral resolution features of the laser for various applications. For example, the high spectral resolution character was employed in purifying nuclear isotopes. Highly intense beams were also used in the unsuccessful attempt to pump enough energy into a single vibrational resonance to selectively dissociate a molecule. The current revolution in coherent control has been driven in part by the realization that controlling energy deposition into a single degree of freedom (such as selectively breaking a bond in a polyatomic molecule) cannot be accomplished by simply driving a resonance with an ever more intense laser source. In the case of laser-induced photodissociation, such excitation ultimately results in statistical dissociation throughout the molecule due to vibrational mode coupling. In complex systems, controlling energy deposition into a desired quantum state requires a more sophisticated approach, and one scheme involves coherent control.

The two distinct regimes for excitation in coherent control, the time and frequency domain experiments, are formally equivalent. While the experimental approach and level of success for each domain is vastly different, both rely on precisely specifying the phase relations for a series of excitation frequencies. The basic idea behind phase control is simple. One emulates the interference effects that can be easily visualized in a water tank experiment with a wave propagating through two slits. In the coherent control experiment, two or more indistinguishable pathways must be excited that connect one quantum state to another. In either water waves or coherent control, the probability for observing constructive or destructive interference at a target state depends on the phase difference between the paths connecting the two locations in the water tank, or the initial and final states in the quantum system. In the case of coherent control, the final state may have some technological
value, for instance, a desirable chemical reaction product or an intermediate in the implementation of quantum computation.

Frequency domain methods concentrate on interfering two excitation routes in a system by controlling the phase difference between two distinct frequencies coupling an initial state to a final state. Perhaps the first mention of the concept of phase control in the optical regime can be traced to investigations in Russia in the early 1960s, where the interference between a one- and three-photon path was theoretically proposed as a means to modulate a two-photon absorption. This two-path interference idea lay dormant until the reality of controlling the relative phase of two distinct frequencies was achieved in 1990 using gas phase pressure modulation for frequency domain experiments. The idea was theoretically extended to the control of nuclear motion in 1988. The first experimental example of two-color phase control over nuclear motion was demonstrated in 1991. It is interesting that the first demonstrations of coherent control did not, however, originate in the frequency domain approach, but came from time domain investigations in the mid-1980s. This work grew directly out of the chemical dynamics community and will be described below.

While frequency domain methods were developed exclusively for the observation of quantum interference, the experimental implementation of time-domain methods was first developed for observing nuclear motion in real time. In the time domain, a superposition of multiple states is prepared in a system using a short duration (~50 fs) laser pulse. The superposition state will subsequently evolve according to the phase relationships of the prepared states and the Hamiltonian of the system. At some later time the evolution must be probed using a second short-duration laser pulse. The time-domain methods for coherent control were first proposed in 1985 and were based on wavepacket propagation methods developed in the 1970s. The first experiments were reported in the mid-1980s.

To determine whether the time or frequency domain implementation is more applicable for a particular control context, one should compare the relevant energy level spacing of the quantum system to the bandwidth of the excitation laser. For instance, control of atomic systems is more suited to frequency domain methods because the characteristic energy level spacing in an atom (several electron volts) is large compared to the bandwidth of femtosecond laser pulses (millielectron volts). In this case, preparation of a superposition of two or more states is impractical with a single laser pulse at the present time, dictating that the control scheme must employ the initial state and a single excited eigenstate. In practice, quasi-CW nanosecond duration laser pulses are employed for such experiments because current ultrafast (fs duration) laser sources cannot typically prepare a superposition of electronic states. One instructive exception involves the excitation of Rydberg states in atoms. Here the electronic energy level spacing can be small enough that the bandwidth of a femtosecond excitation laser may span several electronic states and thus can create a superposition state. One should also note that the next generation of laser sources in the attosecond regime may permit the preparation of a wavepacket from low-lying electronic states in an atom. In the case of molecules, avoiding a superposition state is nearly impossible. Such experiments employ pico- to femtosecond duration laser pulses, having a bandwidth sufficient to excite many vibrational levels, either in the ground or excited electronic state manifold of a molecule. To complete the time-dependent measurement, the propagation of the wavepacket to the final state of interest must be observed (probed) using a second ultrafast laser pulse that can produce fluorescence, ionization, or stimulated emission in a time-resolved manner. The multiple paths that link the initial and final state via the pump and the probe pulses are the key to the equivalence of the time and frequency domain methods.

There is another useful way to classify experiments that have been performed recently in coherent control: into either open-loop or closed-loop techniques. Open-loop signifies that a calculation may be performed to specify the required pulse shape for control before the experiment is attempted. All of the frequency-based, and most of the time-based, experiments fall into this category. Closed-loop, on the other hand, signifies that the results from experimental measurements must be used to assist in determining the correct parameters for the next experiment; this pioneering approach was suggested in 1992. It is interesting that in the context of control experiments performed to date, closed-loop experiments represent the only route to determining the optimal laser pulse shape for controlling even moderately complex systems. In this approach, no input from theoretical models is required for coherent control of complex systems. This is unlike typical control engineering environments, where closed-loop signifies the use of experiments to refine parameters used in a theoretical model.

There is one additional distinction that is of value for understanding the state of coherent control at the present time. Experiments may be classified into systems having Hamiltonians that allow calculation of frequencies required for the desired interference
patterns (thus enabling open-loop experiments) and those having ill-defined Hamiltonians (requiring closed-loop experiments). The open-loop experiments have demonstrated the utility of various control schemes, but are not amenable to systems having even moderate complexity. The closed-loop experiments are capable of controlling systems of moderate complexity but are not amenable to precalculation of the required control fields. Systems requiring closed-loop methods include propagation of short pulses in an optical fiber, control of high harmonic generation for soft X-ray generation, control of chemical reactions, and control of biological systems. In the experimental regime, the value of learning control for dealing with complex systems has been well documented.

The remainder of this article presents an overview of the experimental implementation of coherent control. The earliest phase control experiments involved time-dependent probing of molecular wavepackets, and were followed by two-path interference in an atomic system. The most recent, and most general, implementation of control involves tailoring a time-dependent electromagnetic field for a desired objective using closed-loop techniques. Since the use of tailored laser pulses appears to be rather general, a more complete description of that experiment is provided.

Time-Dependent Methods

The first experiments demonstrating the possibility of coherent control were performed to detect nuclear motion in molecules in real time. These investigations were the experimental realization of the pump-dump or pulse-timing control methods as originally described in 1985. The experimental applications have expanded to numerous systems including diatomic and triatomic molecules, small clusters, and even biological molecules. For such experiments, a vibrational coherence is prepared by exciting a superposition of states. These measurements implicitly used phase control both in the generation of the ultrashort pulses and in the coherent probing of the superposition state by varying the time delay between the pump and probe pulses, in effect modulating the outcome of a quantum transition. Since the earliest optical experiment in the mid-1980s, there have been many hundreds of such experiments reported in the literature. The motivation for the very first experiments was not explicitly phase control, but rather the observation of nuclear motion of molecules in real time. In such measurements, oscillations in the superposition states were observable for many cycles, up to many tens of picoseconds in favorable cases, i.e., diatomic molecules in the gas phase. The loss of signal in the coherent oscillations is ultimately due to passage of the superposition state into surrounding bath states. This may be due to collision with another molecule, dissociation of the system, or redistribution of the excitation energy into other modes of the system not originally excited in the superposition. It is notable that coherence can be maintained in solution phase systems for tens of picoseconds, and for the case of resonances having weak coupling to other modes, coherence can even be modulated in large biological systems on the picosecond time scale.

Decoherence represents loss of phase information about a system. In this context, phase information represents our detailed knowledge of the electronic and nuclear coordinates of the system. In the case of vibrational coherence, decoherence may be represented by intramolecular vibrational energy transfer to other modes of the molecule. This can be thought of as the propensity of vibrational modes of a molecule to couple together in a manner that randomizes deposited energy throughout the molecule. Decoherence in a condensed phase system includes transfer of energy to the solvent modes surrounding the system of interest. For an electronically excited molecule or atom, decoherence can involve spontaneous emission of radiation (fluorescence), or dissociation in the case of molecules. Certain excited states may have a high probability for fluorescence and would represent a significant decoherence pathway and an obstacle for coherent control. These states may be called 'lossy' and are often used in optical pumping schemes as well as for stimulated emission pumping. An interesting question arises as to whether one can employ such states in a pathway leading to a desired final state, or whether such lossy states must be avoided at all costs in coherent control. This question has been answered to some degree by the method of rapid adiabatic passage. In this experiment, an initial state is coupled to a final state by way of a lossy state. The coupling is a coherent process and the preparation involves dressed states that are eigenstates of the system. In this coherent superposition the lossy state is employed, but no population is allowed to build up and thus decoherence is circumvented.

Two-Path Interference Methods

Interference methods form perhaps the most intuitive example of coherent control. The first experimental demonstration was reported in 1990 and involved the modulation of ionization probability in Hg by interfering a one-photon and three-photon excitation
pathway to an excited electronic state (followed by two-photon ionization). The long delay between the introduction of the two-path interference concept in 1960 and the demonstration in 1990 was due to the difficulty in controlling the relative phase of the one- and three-photon paths. The solution involved generating the third harmonic of the fundamental in a nonlinear crystal, and copropagating the two beams through a low-density gas. Changing the pressure of the gas changes the optical phase retardance at different rates for different frequencies, thus allowing relative phase control between the two colors. The ionization yield of Hg was clearly modulated as a function of pressure. Such schemes have been extended to diatomic molecular systems, and the concept of determining molecular phase has been investigated. While changing the phase of more than two frequencies has not been demonstrated using pressure tuning (because of experimental difficulties), a more flexible method has been developed to alter the relative phase of up to 1000 frequency bands, as will be described next.

Closed-Loop Control Methods

Perhaps the most exciting new aspect of control in recent years concerns the prospect of performing closed-loop experiments. In this approach a more complex level of control over matter is achieved in comparison to the two-path interference and pump-probe control methods. In the closed-loop method, an arbitrarily complex time-dependent electromagnetic field is prepared that may have multiple subpulses and up to thousands of interfering frequency bands. The realization of closed-loop control over complex processes relies on the confluence of three technologies: (i) production of large bandwidth ultrafast laser pulses; (ii) spatial modulation methods for manipulating the relative phases and amplitudes of component frequencies of the ultrafast laser pulse; and (iii) closed-loop learning control methods for sorting through the vast number of pulse shapes available as potential control fields. In the closed-loop method, the result of a laser-molecule interaction is used to tailor an optimized pulse.

In almost all of the experimental systems used for closed-loop control today, Kerr lens mode-locking in the Ti:sapphire crystal is used to phase lock the wide bandwidth of available frequencies (750-950 nm) to create the ultrafast pulse. In this phenomenon, the intensity variation in the short laser pulse creates a transient lens in the Ti:sapphire lasing medium that discriminates between free lasing and mode-locked pulses. As a result, all of the population inversion available in the lasing medium may be coerced into enhancing the ultrashort traveling wave in the cavity. The wide bandwidth available may then be amplified and tailored into a shaped electromagnetic field. Spatial light modulation is one method for adjusting the relative phases and amplitudes of the component frequencies in the ultrafast laser pulse and thereby tailoring the shaped laser pulse. Pulse shaping first involves dispersing the phase-locked frequencies on an optical grating; the dispersed radiation is then collimated using a cylindrical lens and the individual frequencies are focused to a line forming the Fourier plane. At the Fourier plane, the time-dependent laser pulse has been transformed into a series of phase-locked continuous wave (CW) laser frequencies, and thus Fourier transformed from time to frequency space. The relative phases and amplitudes may be modulated in the Fourier plane using an array of liquid crystal pixels. After altering the spectral phase and amplitude profile, the frequencies are recombined on a second grating to form the shaped laser pulse. With one degree of phase resolution and 10 pixels there is an astronomical number (360^10) of pulse shapes that may be generated in this manner. (At the present time, pulse shapers have up to 2000 independent elements.) Evolutionary algorithms are employed to manage the available phase space. Realizing that the pulse shape is nothing more than the summation of a series of sine waves, each having an associated phase and amplitude, we find that a certain pulse shape can be represented by a genome consisting of an array of these frequency-dependent phases and amplitudes. This immediately suggests that the methods of evolutionary search strategies may be useful for determining the optimal pulse shape for a desired photoexcitation process.

The method of closed-loop optimal control for experiments was first proposed in 1992, and the first experiments were reported in 1997, concerning optimization of the efficiency of a laser dye molecule in solution. Since that time a number of experiments have been performed in both the weak and strong field regimes. In the weak field regime, the intensity of the laser does not alter the field-free structure of the excited states of the system under investigation. In the strong field regime, the field-free states of the system are altered by the electric field of the laser. Whether one is in the weak or strong field regime depends on parameters including the characteristic level spacing of the states of the system to be controlled and the intensity of the laser coupling into those states. Examples of weak field closed-loop control include optimization of laser dye efficiency, compression of an ultrashort laser pulse, dissociation of alkali clusters, optimization of coherent anti-Stokes
COHERENT CONTROL / Applications in Semiconductors 137
Applications in Semiconductors
H M van Driel and J E Sipe, University of Toronto, Toronto, ON, Canada

© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

Interference phenomena are well known in classical optics. In the early 19th century, Thomas Young showed conclusively that light has wave properties with the first interference experiment ever performed. He passed quasi-monochromatic light from a single source through a pair of double slits using the configuration of Figure 1a and observed an interference pattern consisting of a series of bright and dark fringes on a distant screen. The intensity distribution can be explained only if it is assumed that light has wave or phase properties. In modern terminology, if E_1 and E_2 are the complex electric fields of the two light beams arriving at the screen, by the superposition principle of field addition the intensity at a particular point on the screen can be written as

I ∝ |E_1 + E_2|^2 = |E_1|^2 + |E_2|^2 + 2|E_1||E_2| cos(φ_1 − φ_2)   [1]

where φ_1 − φ_2 is the phase difference of the beams arriving at the screen. While this simple experiment was used originally to demonstrate the wave properties of light, it has subsequently been used for many other purposes, such as measuring the wavelength of light. However, for our purposes here, the Young's double slit apparatus can also be viewed as a device that redistributes light, or
W = |a_1 + a_2|^2
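The relation W = |a_1 + a_2|^2 above is the quantum counterpart of eqn [1]: the probability W of reaching a target state is the squared modulus of the sum of the complex transition amplitudes for the interfering paths, so sweeping their relative phase modulates the yield between its destructive and constructive limits. A minimal numerical sketch of this modulation (the unit amplitudes and the function name are illustrative choices, not values from the article):

```python
import cmath
import math

def yield_vs_phase(dphi, a1=1.0, a2=1.0):
    """Two-path interference: W(dphi) = |a1 + a2*exp(i*dphi)|**2."""
    return abs(a1 + a2 * cmath.exp(1j * dphi)) ** 2

# Sweeping the relative phase carries the yield from the fully
# constructive limit (dphi = 0) to the fully destructive one (dphi = pi).
for dphi in (0.0, math.pi / 2, math.pi):
    print(f"dphi = {dphi:.2f} rad  W = {yield_vs_phase(dphi):.3f}")
```

With equal path amplitudes the modulation is complete (W runs between 0 and 4 in these units); unequal amplitudes reduce the contrast but leave the cos Δφ dependence intact, which is the working principle behind the one- versus three-photon schemes discussed in the text.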
might be controlled through laser-induced interference processes. The use of phase as a control parameter also represents a novel horizon of applications for the laser, since most previous applications had involved only amplitude (intensity).

Of course, just as the 'visibility' of the screen pattern for a classical Young's double slit interferometer is governed by the coherence properties of the two-slit source, the evolution of a quantum system reflects the coherence properties of the perturbations, and the degree to which one can exercise phase control via external perturbations is also influenced by the interaction of the system of interest with a reservoir – essentially any other degrees of freedom in the system – which can cause decoherence and reduce the effectiveness of the 'matter interferometer'. Since the influence of a reservoir will generally increase with the complexity of a system, one might think that coherence control can only be effectively achieved in simple atomic and molecular systems. Indeed, some of the earliest suggestions for coherence control of a system involved using interference between single- and three-photon absorption processes, connecting the same initial and final states with photons of frequency 3ω and ω, respectively, and controlling the population of excited states in atomic or diatomic systems. However, it has been difficult to extend this 'two color' paradigm to more complex molecular systems. Since the reservoir leads to decoherence of the system overall, shorter and shorter pulses must be used to overcome decoherence effects. However, short pulses possess a large bandwidth, with the result that, for complex systems, selectivity of the final state can be lost. One must therefore consider using interference between multiple pathways in order to control the system, as dictated by the details of the unperturbed Hamiltonian of the system.

For a polyatomic molecule or a solid, the complete Hamiltonian is virtually impossible to determine exactly, and it is therefore equally difficult to prescribe the optimal spectral (amplitude and phase) content of the pulses that should be used for control purposes. An alternative approach has therefore emerged, in which one foregoes knowledge of the eigenstates of the Hamiltonian and details of the different possible interfering transition paths in order to achieve control of the end state of a system. This branch of coherence control has come to be known as optimal control. In optimal control one employs an optical source for which one can (ideally) have complete control over the spectral and phase properties. One then uses a feedback process in which experiments are carried out, the effectiveness of achieving a certain result is determined, and the pulse characteristics are then altered to obtain a new result. A key component is the use of an algorithm to select the new pulse properties as part of the feedback system. In this approach the molecule teaches the external control system what it 'requires' for a certain result to be optimally achieved. While the 'best' pulse properties may not directly reveal details of the multiple interference process required to achieve the optimal result, this technique can nonetheless be used to gain some insight into the properties of the unperturbed Hamiltonian and, regardless of such understanding, achieve a desirable result. It has been used to control chemical reaction rates involving several polyatomic molecules, with considerable enhancement in achieving a certain product relative to what can be done using simple thermodynamics.

Coherence Control in Semiconductors

Since coherence control of chemical reactions involving large molecules has represented a significant challenge, it was generally felt that ultrafast decoherence processes would also make control in solids, in general, and semiconductors, in particular, difficult. Early efforts therefore focused on atomic-like situations in which the electrons are bound and associated with discrete states and long coherence times. In semiconductors, the obvious choices are the excitonic system, defect states, or discrete states offered by quantum wells. Population control of excitons and directional ionization of electrons from quantum wells have been clearly demonstrated, similar to related population control of, and directional ionization from, atoms. Other manifestations of coherence control of semiconductors include control of electron–phonon interactions and intersubband transitions in quantum wells.

Among the different types of coherence control in semiconductors is the remarkable result that it is possible to coherently control the properties of free electrons associated with continuum states. Although optimal control might be used for some of these processes, we have achieved clear illustrations of control phenomena based on the use of harmonically related beams and interference of single- and two-photon absorption processes connecting the continuum valence and conduction band states, as generically illustrated in Figure 1c. Details of the various processes observed can be found elsewhere. Of course, the momentum relaxation time of electrons or holes in continuum states is typically of the order of 100 fs at room temperature, but this time is sufficiently long that it can permit phase-controlled processes. Indeed, with respect to conventional carrier transport, this 'long time' lapse is responsible
for the typically high electrical mobilities in crystalline semiconductors such as Si and GaAs. In essence, a crystalline semiconductor with translational symmetry has crystal momentum as a good quantum number, and selection rules for scattering prevent the momentum relaxation time from being prohibitively small. Since electrons or holes in any continuum state can participate in such control processes, one need not be concerned about pulsewidth or bandwidth, unless the pulses were to be so short that carriers of both positive and negative effective mass were generated within one band.

Coherent Control of Electrical Current Using Two Color Beams

We now illustrate the basic principles that describe how the interference process involving valence and conduction band states can be used to control properties of a bulk semiconductor. We do so in the case of using one or two beams to generate and control carrier population, electrical current, and spin current. In pointing out the underlying ideas behind coherence control of semiconductors, we will skip or suppress many of the mathematical details which are required for a full understanding but which might obscure the essential physics. In particular, we consider how phase-related optical beams with frequencies ω and 2ω interact with a direct gap semiconductor such that ħω < E_g < 2ħω, where E_g is the electronic bandgap. For simplicity we consider exciting electrons from a single valence band via single-photon absorption at 2ω and two-photon absorption at ω. As is well known, within an independent particle approximation, the states of electrons and holes in semiconductors can be labeled by their vector crystal momentum k; the energy of states near the conduction or valence bandedges varies quadratically with |k|. For one-photon absorption the transition amplitude can be derived using a perturbation Hamiltonian of the form H = (e/mc) A_2ω · p, where A_2ω is the vector potential associated with the light field and p is the momentum operator. The transition amplitude is therefore of the form

a_2ω ∝ E_2ω e^(iφ_2ω) p_cv   [4]

where p_cv is the interband matrix element of p along the field (E_2ω) direction; for illustration purposes the field is taken to be linearly polarized. The overall transition rate between two particular states of the same k can be expressed as W_1 ∝ a_2ω (a_2ω)* ∝ I_2ω |p_cv|^2, where I_2ω is the intensity of the beam. This rate is independent of the phase of the light beam as well as the sign of k. Hence the absorption of light via single-photon transitions generally populates states of equal and opposite momentum with equal probability or, equivalently, establishes a standing electron wave with zero crystal momentum. This is not surprising, since photons possess very little momentum and, in the approximation of a uniform electric field, do not give any momentum to an excited electron. The particular states that are excited depend on the light polarization and crystal orientation. However, the main point is that while single-photon absorption can lead to anisotropic filling of electron states, the distribution in momentum space is not polar (dependent on the sign of k). Similar considerations apply to the excited holes, but to avoid repetition we will focus on the electrons only.

For two-photon absorption involving the ω photons and connecting states similar to those connected by single-photon absorption, one must employ second-order perturbation theory using the Hamiltonian H = (e/mc) A_ω · p. For two-photon absorption there is a transition from the valence band to an (energy nonallowed) intermediate state followed by a transition from the intermediate state to the final state. To determine the total transition rate one must sum over all possible intermediate states. For semiconductors, by far the dominant intermediate state is the final state itself, so that the associated transition amplitude has the form

a_ω ∝ (E_ω e^(iφ_ω) p_cv)(E_ω e^(iφ_ω) p_cc) ∝ (E_ω)^2 e^(2iφ_ω) p_cv ħk   [5]

where the matrix element p_cc is simply the momentum of the conduction band state (ħk) along the direction of the field. Note that, unlike in an atomic system, p_cc is nonzero, since Bloch states in a crystalline solid do not possess inversion symmetry. If two-photon absorption acts alone, the overall transition rate between two particular k states would be W_2 ∝ (I_ω)^2 |p_cv|^2 k^2. As with single-photon absorption, because this transition rate is independent of the sign of k, two-photon absorption leads to production of electrons with no net momentum.

When both single- and two-photon transitions are present simultaneously, the transition amplitude is the sum of the transition amplitudes expressed in eqn [3]. The overall transition rate is then found using eqn [1] and yields W = W_1 + W_2 + Int, where the interference term Int is given by

Int ∝ E_2ω (E_ω)^2 sin(φ_2ω − 2φ_ω) k   [6]

Note, however, that the interference effect depends on the sign of k and hence can be constructive for one part of the conduction band but destructive for other parts of that band, depending on the value of the
the maximum electrical current injection in the case of GaAs occurs for linearly polarized beams both oriented along the (111) or equivalent directions. However, large currents can also be observed with the light beams polarized along other high-symmetry directions.

Coherent Control of Carrier Density, Spin Population, and Spin Current Using Two Color Beams

The processes described above do not exhaust the coherent control effects that can be observed in bulk semiconductors using harmonic beams. Indeed, for noncentrosymmetric materials, certain light polarizations and crystal orientations allow one to coherently control the total carrier generation rate with or without generating an electrical current. In the case of the cubic material GaAs, provided the ω beam has electric field components along two of the three principal crystal axes, with the 2ω beam having a component along the third direction, one can coherently control the total carrier density. However, the overall degree of control here is determined by how 'noncentrosymmetric' the material is.

The spin degrees of freedom of a semiconductor can also be controlled using two color beams. Due to the spin–orbit interaction, the upper valence bands of a typical semiconductor have certain spin characteristics. To date, optical manipulation of electron spin has been largely based on the fact that partially spin-polarized carriers can be injected into a semiconductor via one-photon absorption of circularly polarized light from these upper valence bands. In such carrier injection – where in fact two-photon absorption could be used as well – spins with no net velocity are injected, and then are typically dragged by a bias voltage to produce a spin-polarized current. However, given the protocols discussed above, it should not come as a surprise that the two color coherence scheme, when used with certain light polarizations, can coherently control the spin polarization, making it dependent on Δφ. Furthermore, for certain polarization combinations it is also possible to generate a spin current with or without an electrical current. Given the excitement surrounding the field of spintronics, where the goal is the use of the spin degree of freedom for data storage and processing, the control of quantities involving the intrinsic angular momentum of the electron is of particular interest.

Various polarization and crystal geometries can be examined for generating spin currents with or without electrical current. For example, with reference to Figure 3a, for both beams propagating in the z (normal) direction of a crystal and possessing the same circular polarization, an electrical current can be injected in the (xy) plane, at an angle from the crystallographic x direction dependent on the relative phase parameter Δφ; this current is spin-polarized in the z direction. As well, the injected carriers have a ±z component of their velocity, as many with one component as with the other; but those going in one direction are preferentially spin-polarized in one direction in the (xy) plane, while those in the other direction are preferentially spin-polarized in the opposite direction. This is an example of a spin current in the absence of an electrical current.

A pure spin current that is perhaps more striking is observable with the two beams cross linearly polarized, for example with the fundamental beam polarized in the x direction and the second harmonic beam in the y direction, as shown in Figure 3b. Then there is no net spin injection; the average spin in any

Figure 3 (a) Excitation of a semiconductor by co-circularly polarized ω and 2ω pulses. Arrows indicate that a spin-polarized electrical current is generated in the x–y plane in a direction dependent on Δφ, while a pure spin current is generated in the beam propagation (z) direction. (b) Excitation of a semiconductor by orthogonally, linearly polarized ω and 2ω pulses. Arrows indicate that a spin-polarized electrical current is generated in the direction of the fundamental beam polarization as well as along the beam propagation (z) direction.
direction is zero. And for a vanishing phase parameter Δφ there is no net electrical current injection. Yet, for example, the electrons injected with a +x velocity component will have one spin polarization with respect to the z direction, while those injected with a −x component to their velocity will have the opposite spin polarization with respect to the z direction.

The examples given above are the spin current analogs of the two-color electrical current injection discussed above. There should also be the possibility of injecting a spin current with a single beam into crystals lacking center-of-inversion symmetry.

The simple analysis presented above relies on simple quantum mechanical ideas and calculations using essentially nothing more than Fermi's Golden Rule. Yet the process involves a mixing of two frequencies, and can therefore be thought of as a nonlinear optical effect. Indeed, one of the key ideas to emerge from the theoretical study of such phenomena is an interpretation of coherence control effects, alternate to that provided by the simple quantum interference picture, in terms of the usual susceptibilities of nonlinear optics. These susceptibilities are, of course, based on quantum mechanics, but the macroscopic viewpoint allows for the identification and classification of the effects in terms of second-order nonlinear optical effects, third-order optical effects, etc. Indeed, one can generalize many

Nonetheless, the efficacy of this process is limited by the fact that it relies on the breaking of center-of-inversion symmetry of the underlying crystal. It is also clear that the maximum current occurs for circularly polarized light, and that right and left circularly polarized light lead to a difference in sign of the current injection. Finally, for this particular single beam scheme, when circularly polarized light is used the electrical current is partially spin-polarized.

Conclusions

Through the phase of optical pulses it is possible to control electrical and spin currents, as well as carrier density, in bulk semiconductors on a time-scale that is limited only by the rise time of the optical pulse and the intrinsic response of the semiconductor. A few optically based processes that allow one to achieve these types of control have been illustrated here. Although at one level one can understand these processes in terms of quantum mechanical interference effects, at a macroscopic level one can understand these control phenomena as manifestations of nonlinear optical phenomena. For fundamental as well as applied reasons, our discussion has focused on coherence control effects using the continuum states in bulk semiconductors, although related effects can also occur in quantum dot, quantum well, and superlattice semiconductors. Applications of these
of the processes we have discussed above to a control effects will undoubtedly exploit the all-optical
hierarchy of frequency mixing effects or high-order nature of the process, including the speed at which the
nonlinear processes involving multiple beams with effects can be turned on or off. The turn-off effects,
frequency nv; mv; pv… with n; m; p being integers. although not discussed extensively here, are related
Many of theses schemes require high intensity of one to transport phenomena as well as momentum
or more of the beams, but can occur in a simple scattering and related dephasing processes.
semiconductor.
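The phase dependence described above can be illustrated with a toy calculation. The sinusoidal dependence of the injected current on the phase parameter Δφ is an assumed model form (consistent with the statement that the current vanishes for Δφ = 0 and changes sign with the sense of circular polarization), not an expression taken from the article; the function `injected_current` and the normalization `j0` are hypothetical.

```python
import numpy as np

# Toy model (an assumption, not from the article): the injected current
# arises from interference between a one-photon (2w) and a two-photon (w)
# absorption pathway.  The interference term -- and hence the net
# current -- is taken to vary as sin(dphi), dphi = 2*phi_w - phi_2w,
# vanishing at dphi = 0.

def injected_current(phi_w, phi_2w, j0=1.0):
    """Relative injected current density for given optical phases."""
    dphi = 2.0 * phi_w - phi_2w
    return j0 * np.sin(dphi)

for dphi in (0.0, np.pi / 2, np.pi, 3 * np.pi / 2):
    print(f"dphi = {dphi:5.2f} rad -> J = {injected_current(dphi / 2, 0.0):+.2f}")
```

Sweeping Δφ through 2π reverses the current direction, which is the experimental signature of the two-color interference.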
Figure 3 Coherent detection and processing of a subcarrier modulated signal lightwave. The power spectrum of the detected signal is
shown: (a) after IF filtering; (b) after multiplication by the IF oscillator; (c) after multiplication with the subcarrier oscillator (in this case
channel 1’s carrier frequency); and (d) after low pass filtering prior to demodulation and data detection. LPF: lowpass filter.
Detection Schemes
There are two classes of demodulation schemes:
synchronous and asynchronous. The former exploits
the frequency and phase of the carrier signal to
perform the detection. The latter only uses the
envelope of the carrier to perform the detection. In
the following we review the principal demodulation
schemes.
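The distinction between the two classes can be sketched numerically. The model below is deliberately idealized and noiseless, and the carrier frequency, bit rate, and crude bit-period averaging filter are illustrative assumptions rather than values from the text: the synchronous branch collapses to zero with a 90° reference-phase error, while the envelope branch is phase-insensitive.

```python
import numpy as np

# Toy comparison of synchronous vs. asynchronous demodulation
# (illustrative parameters, no noise).  The synchronous branch multiplies
# the carrier by a phase-locked reference and low-pass filters; the
# asynchronous branch rectifies and low-pass filters, using only the
# carrier envelope.

fc, rb, fsamp = 1e6, 1e4, 1e8          # carrier, bit rate, sample rate (Hz)
t = np.arange(0, 1 / rb, 1 / fsamp)    # one bit period

def lowpass(x):
    return x.mean()                     # crude LPF: average over one bit

def synchronous(a, phase_err=0.0):
    s = a * np.cos(2 * np.pi * fc * t)
    return 2 * lowpass(s * np.cos(2 * np.pi * fc * t + phase_err))

def asynchronous(a):
    s = a * np.cos(2 * np.pi * fc * t)
    return lowpass(np.abs(s)) * np.pi / 2   # rectified mean -> amplitude

print(f"synchronous, locked:     {synchronous(1.0):.3f}")
print(f"synchronous, 90 deg off: {synchronous(1.0, np.pi / 2):.3f}")  # signal lost
print(f"asynchronous, any phase: {asynchronous(1.0):.3f}")
```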
COHERENT LIGHTWAVE SYSTEMS

Figure 4 Modulation schemes for coherent optical communications: (a) ASK; (b) FSK; and (c) PSK.

PSK Homodyne Detection

In an ideal PSK homodyne receiver, shown in Figure 5, the signal light and local oscillator have identical optical frequency and phase. If the resulting baseband information signal is positive, then a 1 has been received. If the baseband information signal is negative, then a 0 bit has been received. This scheme has the highest theoretical sensitivity but requires an OPLL and is also highly sensitive to phase noise.

ASK Homodyne Detection

An ideal ASK homodyne receiver is identical to an ideal PSK homodyne receiver. Because the baseband information signal is either zero (0 bit) or nonzero (1 bit), the receiver sensitivity is 3 dB less than the PSK homodyne receiver.

PSK Heterodyne Synchronous Detection

Figure 6 shows an ideal PSK synchronous heterodyne receiver. The IF signal is filtered by a bandpass filter, multiplied by the phase-locked reference oscillator to move the signal to baseband, lowpass filtered, and sent to a decision circuit that decides if the received bit is a 0 or 1. The receiver requires synchronization of the reference oscillator with the IF signal. This is not easy to achieve in practice because of the large amount of semiconductor laser phase noise.

ASK Heterodyne Synchronous Detection

This scheme is basically the same as PSK heterodyne detection and is similarly difficult to design.

FSK Heterodyne Synchronous Detection

This scheme is shown in Figure 7 for a binary FSK signal consisting of two possible frequencies f₁ = f_s − Δf and f₂ = f_s + Δf. There are two separate branches in which correlation with the two possible IF signals at f₁ and f₂ is performed. The two resulting signals are subtracted from each other and a decision taken on the sign of the difference. Because of the requirement for two electrical phase-locked loops, the scheme is not very practical.
γ_c = 2ηN_s [16]

The actual detected number of photons per bit is N_R = ηN_s, so

γ_c = 2N_R [17]

Figure 11 Shot-noise-limited BER versus number of received photons/bit for various demodulation schemes: (a) PSK homodyne; (b) synchronous PSK; (c) synchronous FSK; (d) ASK envelope; (e) FSK dual-filter; and (f) CPFSK and DPSK.
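Curves of the kind plotted in Figure 11 can be approximated from commonly quoted shot-noise-limited BER expressions. The closed forms below are standard textbook results assumed here, not equations reproduced from this article, and the brute-force helper `photons_for_ber` is hypothetical.

```python
import math

# Commonly quoted shot-noise-limited BER expressions (assumed standard
# forms, not taken from the article), in terms of detected photons/bit N_R:
#   PSK homodyne:            0.5*erfc(sqrt(2*N_R))
#   PSK heterodyne (sync):   0.5*erfc(sqrt(N_R))
#   DPSK:                    0.5*exp(-N_R)
#   FSK dual-filter:         0.5*exp(-N_R/2)

def photons_for_ber(ber_fn, target=1e-9):
    """Smallest integer photons/bit reaching the target BER."""
    n = 1
    while ber_fn(n) > target:
        n += 1
    return n

schemes = {
    "PSK homodyne":    lambda n: 0.5 * math.erfc(math.sqrt(2 * n)),
    "PSK heterodyne":  lambda n: 0.5 * math.erfc(math.sqrt(n)),
    "DPSK":            lambda n: 0.5 * math.exp(-n),
    "FSK dual-filter": lambda n: 0.5 * math.exp(-n / 2),
}
for name, fn in schemes.items():
    print(f"{name:15s}: ~{photons_for_ber(fn)} photons/bit for BER 1e-9")
```

The ordering of the results reproduces the qualitative ranking of the curves (a), (b), and (e)–(f) in Figure 11.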
a data rate of 1 Gb/s. The narrow signal and local-oscillator linewidths required are only achievable with external-cavity semiconductor lasers.

The Costas-type and decision-driven PLLs also utilize π-hybrids and two photodetectors but, as they have full suppression of the carrier (θ = π/2), nonlinear processing of the detected signals is required to produce a phase error signal that can be used by the PLL for locking. In the decision-driven PLL, shown in Figure 14, the signal from one of the photodetector outputs is sent to a decision circuit and the output of the circuit is multiplied by the signal from the second photodetector. The mixer output is sent back to the local-oscillator laser for phase locking. The Δν required for a power penalty of <1 dB is typically less than 2 × 10⁻⁴. This is superior to both the balanced PLL and Costas-type PLL receivers.

Heterodyne Phase Locking

In practical synchronous heterodyne receivers, phase locking is performed in the electronic domain utilizing techniques from radio engineering. In the PSK case, electronic analogs of the OPLL schemes used in homodyne detection can be used. Synchronous heterodyne schemes have better immunity to phase noise than the homodyne schemes.

Asynchronous Systems

Asynchronous receivers do not require optical or electronic phase locking. This means that they are less sensitive to phase noise than synchronous receivers. They do have some sensitivity to phase noise because IF filtering leads to phase-to-amplitude noise conversion.

Differential Detection

Differential detection receivers have phase-noise sensitivities between those of synchronous and asynchronous receivers because, while they do not require phase locking, they use phase information in the received signal.

Phase-Diversity Receivers

Asynchronous heterodyne receivers are relatively insensitive to phase noise but require a much higher receiver bandwidth for a given bit rate. Homodyne receivers only require a receiver bandwidth equal to the detected signal bandwidth but require an OPLL, which is difficult to implement. The phase-diversity receiver, shown in Figure 15, is an asynchronous homodyne scheme, which does not require the use of an OPLL and has a bandwidth approximately equal to that of synchronous homodyne detection. This is at the expense of increased receiver complexity and reduced sensitivity compared to synchronous homodyne detection. The phase-diversity scheme can be used with ASK, DPSK, and CPFSK modulation. ASK modulation uses a squarer circuit for the demodulator component, while DPSK and CPFSK modulation use a delay line and mixer.
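The linewidth requirements quoted above can be explored with a simple phase-noise model. Treating the laser phase as a Wiener process whose variance grows as 2πΔν·τ over an interval τ is a standard assumption (the usual Lorentzian-lineshape model), not something derived in this article; it shows why only the linewidth-to-bit-rate ratio Δν/R_b matters.

```python
import numpy as np

# Wiener (random-walk) model of laser phase noise: over one bit period
# the phase change is Gaussian with variance 2*pi*dnu/Rb.  Monte-Carlo
# estimate of the rms drift per bit for several dnu/Rb ratios.

rng = np.random.default_rng(0)

def rms_phase_drift(dnu_over_rb, n_trials=200_000):
    """Monte-Carlo rms phase change over one bit period (radians)."""
    dphi = rng.normal(0.0, np.sqrt(2 * np.pi * dnu_over_rb), n_trials)
    return np.sqrt(np.mean(dphi ** 2))

for ratio in (2e-4, 1e-3, 1e-2):
    theory = np.sqrt(2 * np.pi * ratio)
    print(f"dnu/Rb = {ratio:7.0e}: rms drift ~ {rms_phase_drift(ratio):.4f} rad"
          f" (theory {theory:.4f})")
```

At the Δν/R_b ≈ 2 × 10⁻⁴ level mentioned for the decision-driven PLL, the rms phase drift per bit is only a few hundredths of a radian.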
In the case of ASK modulation, the outputs of the N-port hybrid can be written as

E_k(t) = (1/√N) { a(t)√(2P_s) cos[2πf_s t + φ_s(t) + 2πk/N] + √(2P_LO) cos[2πf_LO t + φ_LO(t)] },  k = 1, 2, …, N;  N ≥ 3 [26]

For N = 2, the phase term is π/2 for k = 2. The voltage input to the k-th demodulator (squarer), after the dc terms have been removed by the blocking capacitor, is given by

v_k(t) = (RR_L/N) a(t) 2√(P_s P_LO) cos[2π(f_s − f_LO)t + φ_s(t) − φ_LO(t) + 2πk/N] [27]

These voltages are then squared and added to give an output voltage

v_L(t) = Σ_{k=1}^{N} α (RR_L/N)² 4P_s P_LO a²(t) { 1 + cos[4π(f_s − f_LO)t + 2φ_s(t) − 2φ_LO(t) + 4πk/N] } [28]

where α is the squarer parameter. The lowpass filter at the summing circuit output effectively integrates v_L(t) over a bit period. If 2(f_s − f_LO) ≪ B_T and the difference between the signal and local-oscillator phase noise is small within a bit period, then the input voltage to the decision circuit is

v_D(t) = (α(RR_L)²/N) 4P_s P_LO a²(t) [29]

which is proportional to the original data signal. In shot-noise-limited operation, the receiver sensitivity is 3 dB worse than the ASK homodyne case. The scheme is very tolerant of laser phase noise. The Δν for a power penalty of <1 dB is of the same order as for ASK heterodyne envelope detection.

Polarization

The mixing efficiency between the signal and local-oscillator lightwaves is a maximum when their polarization states are identical. In practice the polarization state of the signal arriving at the receiver is unknown and changes slowly with time. This means that the IF photocurrent can change with time; in the worst case, when the polarization states of the signal and local oscillator are orthogonal, the IF photocurrent will be zero. There are a number of possible solutions to this problem.

1. Polarization-maintaining (PM) fiber can be used in the optical link to keep the signal polarization from changing. It is then a simple matter to adjust the local-oscillator polarization to achieve optimum mixing. However, the losses associated with PM fiber are greater than for conventional single-mode (SM) fiber. In addition, most installed fiber is SM fiber.
2. An active polarization controller, such as a fiber coil using bending-induced birefringence, can be used to ensure that the local-oscillator polarization tracks the signal polarization.
3. A polarization scrambler can be used to scramble the polarization of the transmitted signal at a speed greater than the bit rate. The sensitivity degradation due to the polarization scrambling is 3 dB compared to a receiver with perfect polarization matching. However, this technique is only feasible at low bit rates.
4. The most general technique is to use a polarization-diversity receiver, as shown in Figure 16. In this scheme, the signal light and local-oscillator
are split into orthogonally polarized lightwaves by a polarization beamsplitter. The orthogonal components of the signal and local-oscillator are separately demodulated and combined. The resulting decision signal is polarization independent. The demodulator used depends on the modulation format.

Table 3 General comparisons between various modulation/demodulation schemes. The sensitivity penalty is relative to ideal homodyne PSK detection.

Modulation/demodulation scheme | Sensitivity penalty (dB) | Immunity to phase noise | IF bandwidth/bit rate | Complexity

Table 4 Coherent systems experiments. The penalty is with respect to the theoretical sensitivity.

Receiver | Bit rate (Mb/s) | Sensitivity (dBm) | Sensitivity (photons/bit) | Penalty (dB)
ASK | 400 | −47.2 | 365 | 9.7
FSK (dual-filter) | 680 | −39.1 | 1600 | 16
CPFSK (m = 0.5) differential detection | 1000 | −37 | 1500 | 18.6
  | 4000 | −31.3 | 1445 | 18.6
DPSK | 400 | −53.3 | 45.1 | 3.5
  | 2000 | −39.0 | 480 | 13.8

Comparisons Between the Principal Demodulation Schemes

General comparisons between the demodulation schemes discussed above are given in Table 3. FSK modulation is relatively simple to achieve. FSK demodulators have good receiver sensitivity and low immunity to phase noise. PSK modulation is also easy to achieve and gives good receiver sensitivity. ASK modulation has no real merit compared to FSK or PSK.

In general, homodyne schemes have the best theoretical sensitivity but require optical phase locking and are very sensitive to phase noise. Phase-diversity receivers are a good alternative to homodyne receivers, in that they do not require an OPLL.

Asynchronous heterodyne receivers have similar sensitivities to synchronous receivers but are much less complex. However, they require an IF bandwidth much greater than the data rate.

Differential detection schemes have good receiver sensitivity, do not require wide-bandwidth IF circuits, and have moderate immunity to phase noise. The advantage of CPFSK over DPSK is that the immunity to phase noise can be increased by increasing the FSK modulation index.

Polarization-diversity receivers can be used with any modulation/demodulation scheme and offer the most general approach to overcoming polarization effects.

System Experiments

Most coherent communication system experiments date from the 1980s and early 1990s and were based on the technology available at that time. This restricted the bit rates possible to typically <4 Gb/s. Some coherent systems experimental results obtained around this time are listed in Table 4. All of the
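The phase-insensitivity expressed by eqns [26]–[29] earlier can be checked numerically. The script below evaluates the squared branch voltages of eqn [27] directly for N = 3, lumping the constants R, R_L, P_s, and P_LO into a single assumed factor.

```python
import numpy as np

# Numerical check of eqns [27]-[29] for an N-port phase-diversity ASK
# receiver (N = 3).  Each demodulator squares its branch voltage; the
# sum over the ports is phase-independent because
#   sum_k cos^2(psi + 2*pi*k/N) = N/2   for N >= 3.

N = 3
C = 1.0                    # stands in for (R*RL/N)*2*sqrt(Ps*PLO), assumed

def decision_voltage(a, psi):
    """Sum of squared branch voltages (eqn [28]) for data value a and
    residual phase psi = 2*pi*(fs - fLO)*t + phi_s - phi_LO."""
    vk = [C * a * np.cos(psi + 2 * np.pi * k / N) for k in range(1, N + 1)]
    return sum(v ** 2 for v in vk)

# The output depends only on a^2, not on the unknown phase psi:
for psi in (0.0, 0.7, 2.1):
    print(f"psi = {psi:3.1f}: vD = {decision_voltage(1.0, psi):.6f}")
print(f"expected a^2 * C^2 * N/2 = {1.0 * C ** 2 * N / 2:.6f}")
```

With N = 2 the hybrid instead uses a π/2 phase for k = 2, as noted in the text, so that cos² + sin² = 1 plays the same role.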
COHERENT TRANSIENTS / Coherent Transient Spectroscopy in Atomic and Molecular Vapors
Conclusion

Coherent optical communication systems offer the potential for increased sensitivity and selectivity compared to conventional IM/DD systems. However, coherent optical receivers are more complex and have more stringent design parameters compared to IM/DD receivers. Moreover, advances in optoelectronic component design and fabrication, along with the advent of reliable optical amplifiers, have meant that coherent systems are not a viable alternative to IM/DD systems in the vast majority of optical communication systems.

List of Units and Nomenclature

e Electronic charge
h Planck constant

Further Reading

Barry JR and Lee EA (1990) Performance of coherent optical receivers. IEEE Proceedings 78: 1369–1394.
Cvijetic M (1996) Coherent and Nonlinear Lightwave Communications. Boston, MA: Artech House.
Kazovsky LG (1989) Phase- and polarization-diversity coherent optical techniques. Journal of Lightwave Technology 7: 279–292.
Kazovsky L, Benedetto S and Willner A (1996) Optical Fiber Communication Systems. Boston, MA: Artech House.
Linke RA and Gnauck AH (1988) High-capacity coherent lightwave systems. Journal of Lightwave Technology 6: 1750–1769.
Ryu S (1995) Coherent Lightwave Systems. Boston, MA: Artech House.
COHERENT TRANSIENTS
excitation, the time-evolution of the atoms is monitored, from which transition frequencies and relaxation rates can be obtained.

The earliest coherent transient effects were observed with atomic nuclear spins in about 1950 by H C Torrey, who discovered transient spin nutation, and by E L Hahn, who detected spin echoes and free induction decay (FID). The optical analogs of these and other nuclear magnetic resonance (NMR) effects have now been detected using either optical pulses or techniques involving Stark or laser frequency switching.

In this brief introduction to coherent optical transients, several illustrative examples of coherent transient phenomena are reviewed. The optical Bloch equations are derived and coupled to Maxwell's equations. The Maxwell–Bloch formalism is used to analyze free precession decay, photon echoes, stimulated photon echoes, and optical Ramsey fringes. Experimental results are presented along with the relevant theoretical calculations. In addition, coherent transient phenomena involving quantized matter waves are discussed. Although the examples considered in this chapter are fairly simple, it should be appreciated that sophisticated coherent transient techniques are routinely applied as a probe of complex structures such as liquids and solids, or as a probe of single atoms and molecules.

Optical Bloch Equations

Many of the important features of coherent optical transients can be illustrated by considering the interaction of a radiation field with a two-level atom. The lower state |1⟩ has energy −ħω₀/2 and the upper state |2⟩ has energy ħω₀/2. For the moment, the atom is assumed to be fixed at R = 0 and all relaxation processes are neglected. The incident electric field is

E(R = 0, t) = ½ ε̂ [E(t)e^(−iωt) + E*(t)e^(iωt)] [1]

where E(t) is the field amplitude, ε̂ is the field polarization, and ω is the carrier frequency. The time dependence of E(t) allows one to consider pulses having arbitrary shape. A time-dependent phase for the field could have been included, allowing one to consider the effect of arbitrary frequency 'chirps,' but such effects are not included in the present discussion.

It is convenient to expand the atomic state wave function in a field interaction representation as

|ψ(t)⟩ = c₁(t)e^(iωt/2)|1⟩ + c₂(t)e^(−iωt/2)|2⟩ [2]

The atom–field interaction potential is

V(R, t) = −μ·E(R, t) [3]

where μ is a dipole moment operator. When eqn [2] is substituted into Schrödinger's equation and rapidly varying terms are neglected (rotating-wave approximation), one finds that the state amplitudes evolve according to

iħ dc/dt = H̃c,  H̃ = (ħ/2) [ −δ₀, Ω₀(t); Ω₀(t), δ₀ ] [4]

where c is a vector having components (c₁, c₂),

δ₀ = ω₀ − ω [5]

is an atom–field detuning,

Ω₀(t) = −μE(t)/ħ = −(μ/ħ)√(2S(t)/ε₀c) [6]

is a Rabi frequency, μ = ⟨1|μ·ε̂|2⟩ = ⟨2|μ·ε̂|1⟩ is a dipole moment matrix element, ε₀ is the permittivity of free space, and S(t) is the time-averaged Poynting vector of the field. Equation [4] can be solved numerically for arbitrary pulse envelopes.

Expectation values of physical observables are conveniently expressed in terms of density matrix elements defined by

ρᵢⱼ = cᵢcⱼ* [7]

which obey the equations of motion

ρ̇₁₁ = −iΩ₀(t)(ρ₂₁ − ρ₁₂)/2
ρ̇₂₂ = iΩ₀(t)(ρ₂₁ − ρ₁₂)/2
ρ̇₁₂ = −iΩ₀(t)(ρ₂₂ − ρ₁₁)/2 + iδ₀ρ₁₂
ρ̇₂₁ = iΩ₀(t)(ρ₂₂ − ρ₁₁)/2 − iδ₀ρ₂₁ [8]

An alternative set of equations in terms of real variables can be obtained if one defines the new parameters

u = ρ₁₂ + ρ₂₁;  ρ₁₂ = (u + iv)/2
v = i(ρ₂₁ − ρ₁₂);  ρ₂₁ = (u − iv)/2
w = ρ₂₂ − ρ₁₁;  ρ₂₂ = (m + w)/2
m = ρ₁₁ + ρ₂₂;  ρ₁₁ = (m − w)/2 [9]

which obey

u̇ = −δ₀v
v̇ = δ₀u − Ω₀(t)w
ẇ = Ω₀(t)v
ṁ = 0 [10]
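Equations [10] can be integrated numerically. The sketch below applies a resonant (δ₀ = 0) square pulse of area θ = π with a plain RK4 stepper; the Rabi frequency value and step count are illustrative assumptions. The Bloch vector should move from w = −1 (ground state) to w = +1 (inverted) while keeping unit length.

```python
import numpy as np

# RK4 integration of the optical Bloch equations without decay, eqn [10],
# for a resonant square pulse of area theta = pi.

def bloch_rhs(U, omega0, delta0):
    u, v, w = U
    return np.array([-delta0 * v,
                     delta0 * u - omega0 * w,
                     omega0 * v])

def evolve(U, omega0, delta0, T, n=4000):
    dt = T / n
    for _ in range(n):                      # classic RK4 steps
        k1 = bloch_rhs(U, omega0, delta0)
        k2 = bloch_rhs(U + 0.5 * dt * k1, omega0, delta0)
        k3 = bloch_rhs(U + 0.5 * dt * k2, omega0, delta0)
        k4 = bloch_rhs(U + dt * k3, omega0, delta0)
        U = U + (dt / 6) * (k1 + 2 * k2 + 2 * k3 + k4)
    return U

omega0 = 2 * np.pi * 1e9                    # Rabi frequency (rad/s), assumed
T_pi = np.pi / omega0                       # pulse length giving area pi
U = evolve(np.array([0.0, 0.0, -1.0]), omega0, 0.0, T_pi)
print(f"after pi pulse: u, v, w = {U[0]:.3f}, {U[1]:.3f}, {U[2]:.3f}")
print(f"|U| = {np.linalg.norm(U):.6f}")     # unit length is conserved
```

With θ = π/2 instead (T = π/(2Ω₀)), the same stepper leaves the vector on the equator of the Bloch sphere, (u, v, w) ≈ (0, 1, 0), i.e., maximum coherence.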
The last of these equations reflects the fact that ρ₁₁ + ρ₂₂ = 1, while the first three can be rewritten as

U̇ = Ω(t) × U [11]

where the Bloch vector U has components (u, v, w) and the pseudofield vector Ω(t) has components [Ω₀(t), 0, δ₀].

Equations [8] or [10] constitute the optical Bloch equations without decay. The vector U has unit magnitude and traces out a path on the Bloch sphere as it precesses about the pseudofield vector. The component w is the population difference of the two atomic states, while u and v are related to the quadrature components of the atomic polarization (see below).

Equations [8] and [10] can be generalized to include relaxation. A simplified relaxation scheme that has a wide range of applicability is one in which state 2 decays spontaneously to state 1 with rate γ₂, while the relaxation of the coherence ρ₁₂ is characterized by a complex decay parameter Γ₁₂ resulting from phase-changing collisions with a background gas. When such processes are included in eqn [8], the equations are modified as

ρ̇₁₁ = −iΩ₀(t)(ρ₂₁ − ρ₁₂)/2 + γ₂ρ₂₂ [12a]

ρ̇₂₂ = iΩ₀(t)(ρ₂₁ − ρ₁₂)/2 − γ₂ρ₂₂ [12b]

ρ̇₁₂ = −iΩ₀(t)(ρ₂₂ − ρ₁₁)/2 − γρ₁₂ + iδρ₁₂ [12c]

ρ̇₂₁ = iΩ₀(t)(ρ₂₂ − ρ₁₁)/2 − γρ₂₁ − iδρ₂₁ [12d]

where γ = γ₂/2 + Re(Γ₁₂) and the detuning δ = δ₀ − Im(Γ₁₂) is modified to include the collisional shift. With the addition of decay, the length of the Bloch vector is no longer conserved. The quantity γ₂ is referred to as the longitudinal relaxation rate. Moreover, one usually refers to T₁ = γ₂⁻¹ as the longitudinal relaxation time and T₂ = γ⁻¹ as the transverse relaxation time. In the case of purely radiative broadening, γ₂ = 2γ and T₁ = T₂/2.

Maxwell–Bloch Equations

As a result of atom–field interactions, it is assumed that a polarization is created in the medium of the form

P(R, t) = ½ ε̂ { P(Z, t)e^(i[kZ−ωt]) + P*(Z, t)e^(−i[kZ−ωt]) } [13]

which gives rise to a signal electric field of the form

E_s(R, t) = ½ ε̂ { E_s(Z, t)e^(i[kZ−ωt]) + E_s*(Z, t)e^(−i[kZ−ωt]) } [14]

The z-axis has been chosen in the direction of k. It follows from Maxwell's equations that the quasi-steady-state field amplitude is related to the polarization by

∂E_s(Z, t)/∂Z = (ik/2ε₀) P(Z, t) [15]

To arrive at eqn [15], it is necessary to assume that transverse field variations can be neglected, that the phase-matching condition k = ω/c is met, and that the complex field amplitudes P(Z, t) and E_s(Z, t) vary slowly in space compared with e^(ikZ) and slowly in time compared with e^(iωt).

The polarization P(R, t), defined as the average dipole moment per unit volume, is given by the expression

P(R, t) = N [ μ₂₁⟨ρ₁₂(Z, t)⟩e^(−i[kZ−ωt]) + μ₁₂⟨ρ₂₁(Z, t)⟩e^(i[kZ−ωt]) ] [16]

where N is the atomic density and ρ₁₂(Z, t) and ρ₂₁(Z, t) are single-particle density matrix elements that depend on the atomic properties of the medium. As such, eqn [16] provides the link between Maxwell's equations and the optical Bloch equations. The ⟨ ⟩ brackets in eqn [16] indicate that there may be additional averages that must be carried out. For example, in a vapor, there is a distribution of atomic velocities that must be summed over, while, in a solid, there may be an inhomogeneous distribution of atomic frequencies owing to local strains in the media. By combining eqns [9], [13], [15] and [16], and using the fact that μ = μ₁₂·ε̂ = μ₂₁·ε̂, one arrives at

∂E_s(Z, t)/∂Z = (ikNμ/ε₀)⟨ρ₂₁(Z, t)⟩ = (ikNμ/2ε₀)⟨u(Z, t) − iv(Z, t)⟩ [17]

which, together with eqn [12], constitutes the Maxwell–Bloch equations. From eqn [17], it is clearly seen that the coherence ⟨ρ₂₁(Z, t)⟩ drives the signal field. If ⟨ρ₂₁(Z, t)⟩ is independent of Z, the intensity of radiation exiting a sample of length L is proportional to

I(L, t) = (kNμL/ε₀)² |⟨ρ₂₁(t)⟩|² [18]
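The relaxation structure of eqns [12] can be made concrete for free decay (Ω₀ = 0), where the equations integrate in closed form. The initial condition below (ρ₂₂ = 1/2, ρ₁₂ = i/2, as left by a π/2 pulse) and the unit rates are illustrative assumptions.

```python
import numpy as np

# Free decay (Omega0 = 0) of the damped Bloch equations, eqns [12a]-[12d]:
# the population rho22 decays at gamma2 = 1/T1 and the coherence rho12 at
# gamma = 1/T2.  With purely radiative broadening gamma = gamma2/2, so
# the coherence lives twice as long as the population: T2 = 2*T1.

gamma2 = 1.0                     # longitudinal rate, arbitrary units
gamma = gamma2 / 2               # purely radiative broadening
delta = 0.0                      # no collisional shift

t = np.linspace(0, 5, 6)
rho22 = 0.5 * np.exp(-gamma2 * t)                 # closed-form solution
rho12 = 0.5j * np.exp((-gamma + 1j * delta) * t)  # closed-form solution

for ti, p, c in zip(t, rho22, np.abs(rho12)):
    print(f"t = {ti:3.1f}: rho22 = {p:.4f}, |rho12| = {c:.4f}")
```

The slower decay of |ρ₁₂| relative to ρ₂₂ is exactly the T₁ = T₂/2 relation quoted above for radiative broadening.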
Alternatively, the outgoing field can be heterodyned with a reference field to extract the field amplitude rather than the field intensity.

Coherent Transient Signals

The Maxwell–Bloch equations can be used to describe the evolution of coherent transient signals. In a typical experiment, one applies one or more 'short' radiation pulses to an atomic sample and monitors the radiation field emitted by the atoms following the pulses. A 'short' pulse is one satisfying

|δ|τ, kuτ, γτ, γ₂τ ≪ 1 [19]

where τ is the pulse duration. Conditions [19] allow one to neglect any effects of detuning (including Doppler shifts or other types of inhomogeneous broadening) or relaxation during the pulse's action. The evolution of the atomic density matrix during each pulse is determined solely by the pulse area defined by

θ = ∫_{−∞}^{∞} Ω₀(t) dt [20]

In terms of density matrix elements in a standard interaction representation defined by

ρᴵ₁₂ = ρ₁₂e^(−i[kZ+δt]),  ρᴵ₂₁ = ρ₂₁e^(i[kZ+δt]),  ρᴵ₂₂ = ρ₂₂,  ρᴵ₁₁ = ρ₁₁ [21]

the change in density matrix elements for an interaction occurring at time Tⱼ is determined by the pulse area θ. Between the pulses, the density matrix elements evolve as

ρ̇ᴵ₁₁ = γ₂ρᴵ₂₂;  ρ̇ᴵ₂₂ = −γ₂ρᴵ₂₂;  ρ̇ᴵ₁₂ = −γρᴵ₁₂;  ρ̇ᴵ₂₁ = −γρᴵ₂₁ [22]

A coherent transient signal is constructed by piecing together field interaction zones and free evolution periods. There are many different types of coherent transient signals. A few illustrative examples are given below.

Free Polarization Decay

Free polarization decay (FPD), also referred to as optical FID, is the optical analog of free induction decay (FID) in NMR. This transient follows preparation of the atomic sample by any of three methods: (1) an optical pulse resonant with the atomic transition; (2) a Stark pulse that switches the transition frequency of a molecule (atom) into resonance with an incident cw laser beam; or (3) an electronic pulse that switches the frequency of a cw laser into coincidence with the molecular (atomic) transition frequency. In methods (2) and (3), the FPD emission propagates coherently in the forward direction along with the cw laser beam, automatically generating a heterodyne beat signal at a detector that can be orders of magnitude larger than the FPD signal. The FPD field intensity itself exiting the sample is equal to

I(L, t) = (kNμL/ε₀)² |⟨ρ₂₁(t)⟩|² = (kNμL/2ε₀)² sin²θ |∫ dv W₀(v) e^(−[γ−i(δ+k·v)]t)|² [23]
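The velocity average in eqn [23] can be evaluated numerically for a Gaussian W₀(v). All parameter values below (wavelength, speed scale, homogeneous width) are illustrative assumptions; the point is that for ku ≫ γ the FPD amplitude dephases on the fast inhomogeneous timescale ~1/(ku) rather than 1/γ.

```python
import numpy as np

# Numerical evaluation of the velocity integral in eqn [23] for a
# Maxwellian (Gaussian) velocity distribution W0(v).

gamma = 1.0e6                      # homogeneous decay rate (s^-1), assumed
k = 2 * np.pi / 780e-9             # wavevector for an assumed 780 nm line
u = 300.0                          # most probable speed (m/s), assumed
delta = 0.0                        # detuning

v = np.linspace(-5 * u, 5 * u, 4001)
dv = v[1] - v[0]
W0 = np.exp(-(v / u) ** 2) / (u * np.sqrt(np.pi))   # normalized Maxwellian

def fpd_amplitude(t):
    """|integral dv W0(v) exp(-[gamma - i(delta + k*v)] t)|, eqn [23]."""
    return abs(np.sum(W0 * np.exp(-(gamma - 1j * (delta + k * v)) * t)) * dv)

for t in (0.0, 0.5e-9, 1.0e-9, 2.0e-9):
    print(f"t = {t:.1e} s: relative FPD amplitude = {fpd_amplitude(t):.3e}")
# For gamma*t << 1 this follows the Gaussian dephasing exp(-(k*u*t)**2/4).
```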
List of Units and Nomenclature

(u, v, w)  Elements of Bloch vector
U(t)  Bloch vector
v  Atomic velocity
W₀(v)  Atomic velocity distribution
γ  Transverse relaxation rate
γ₂  Excited state decay rate or longitudinal relaxation rate
δ  Atom–field detuning
θ  Pulse area
μ  Dipole moment matrix element
ρᴵᵢⱼ  Density matrix element in 'normal' interaction representation
ρᵢⱼ(Z, t)  Density matrix element in a field interaction representation
τ  Pulse duration
ω  Field frequency
ω_k  Recoil frequency
ω₀  Atomic transition frequency
Ω(t)  Generalized Rabi frequency
Ω(t)  Pseudofield vector
Ω₀(t)  Rabi frequency

See also

Interferometry: Gravity Wave Detection; Overview.

Further Reading

Allen L and Eberly J (1975) Optical Resonance and Two-Level Atoms, chapters 2, 4, 9. New York: Wiley.
Berman PR and Steel DG (2001) Coherent optical transients. In: Bass M, Enoch JM, Van Stryland E and Wolf WL (eds) Handbook of Optics, vol. IV, chap. 24. New York: McGraw-Hill. (Material presented in this chapter is a condensed version of this reference.)
Brewer RG (1977) Coherent optical spectroscopy. In: Balian R, Haroche S and Liberman S (eds) Frontiers in Laser Spectroscopy, Les Houches, Session XXVII, vol. 1, pp. 341–396. Amsterdam: North-Holland.
Brewer RG (1977) Coherent optical spectroscopy. In: Bloembergen N (ed.) Nonlinear Spectroscopy, Proc. Int. School Phys. Enrico Fermi, Course 64. Amsterdam: North-Holland.
Brewer RG (1977) Coherent optical transients. Physics Today 30(5): 50–59.
Brewer RG and Hahn EL (1984) Atomic memory. Scientific American 251(6): 50–57.
Journal of the Optical Society of America (1986) Special issue on coherent optical spectroscopy, A3(4).
Levenson MD and Kano SS (1988) Introduction to Nonlinear Laser Spectroscopy, chapter 6. Boston: Academic Press.
Meystre P and Sargent M III (1999) Elements of Quantum Optics, 3rd edn, chapter 12. Berlin: Springer-Verlag.
Shen YR (1984) The Principles of Nonlinear Optics, chapter 17. New York: Wiley.
Shoemaker RL (1978) Coherent transient infrared spectroscopy. In: Steinfeld JI (ed.) Laser and Coherence Spectroscopy, pp. 197–371. New York: Plenum.
Zinth W and Kaiser W (1988) Ultrafast coherent spectroscopy. In: Kaiser W (ed.) Ultrashort Laser Pulses and Applications, vol. 60, pp. 235–277. Berlin: Springer-Verlag.
observable can be as long as seconds for certain characteristic optical wavelength l; the emitted
atomic transitions, or it may be as short as a few electric field resulting from the polarization is
femtoseconds (10215 s), e.g., for metals, surfaces, or proportional to its second time derivative,
highly excited semiconductors. E/ð›2 =›t2 ÞP. Thus the measurement of the emitted
Coherent spectroscopy and the analysis of coherent field dynamics yields information about the temporal
transients has provided valuable information on the evolution of the optical material polarization.
nature and dynamics of the optical excitations. Often Microscopically the polarization is determined by
it is possible to learn about the interaction processes the transition amplitudes between the different states
among the photoexcitations and to follow the of the system. These may be the discrete states of
temporal evolution of higher-order transitions atoms or molecules, or the microscopic valence and
which are only accessible if the system is in a conduction band states in a dielectric medium, such
non-equilibrium state. as a semiconductor. In any case, the macro-
The conceptually simplest experiment which one scopic polarization P is computed by summing P over
may use to observe coherent transients is to time all microscopic transitions pcv via P ¼ c;v ðmcv pcv þ
resolve the transmission or reflection induced by a c:c:Þ; where mcv is the dipole matrix element which
single laser pulse. However, much richer information determines the strength of the transitions between the
can be obtained if one excites the system with several states v and c, and c:c: denotes the complex conjugate.
pulses. A pulse sequence with well-controlled delay If 1c and 1v are the energies of these states, their
times makes it possible to study the dynamical dynamic quantum mechanical evolution is described
evolution of the photo-excitations, not only by time by the phase factors e2i1c t=" and e2i1v t=" ; respectively.
resolving the signal, but also by varying the delay. Therefore, each pcv is evolving in time according to
Prominent examples of such experiments are pump- e2ið1c 21v Þt=" : Assuming that we start at t ¼ 0
probe measurements, which usually are performed with pcv ðt ¼ 0Þ ¼ pcv;0 ; which may be induced by a
with two incident pulses, or four-wave mixing, for short opticalPpulse, we have for the optical polariza-
which one may use two or three incident pulses. tion PðtÞ ¼ c;v ðmcv pcv;0 e2ið1c 21v Þt=" þ c:c:Þ: Thus PðtÞ
Besides the microscopic interaction processes, the is given by a summation over microscopic transitions
outcome of an experiment is determined by the which all oscillate with frequencies proportional to
quantum mechanical selection rules for the transitions and by the symmetries of the system under investigation. For example, if one wants to investigate coherent optical properties of surface states one often relies on phenomena, such as second-harmonic or sum-frequency generation, which give no signal in perfect systems with inversion symmetry. Due to the broken translational invariance, such experiments are therefore sensitive only to the dynamics of surface and/or interface excitations.

In this article the basic principles of coherent transients are presented and several examples are discussed. The basic theoretical description and its generalization to the case of semiconductors are introduced.

Basic Principles

In the absence of free charges and currents, Maxwell's equations show that the electromagnetic field interacts with matter via the optical polarization. This polarization P, or more precisely its second derivative with respect to time (∂²/∂t²)P, appears as a source term in the wave equation for the electric field E. Consequently, if the system is optically thin, such that propagation effects within the sample can be ignored, and if measurements are performed in the far-field region, i.e., at distances exceeding the […] the energy differences between the involved states. Hence, the optical polarization is clearly a coherent quantity which is characterized by amplitude and phase. Furthermore, the microscopic contributions to P(t) add up coherently. Depending on the phase relationships, one may obtain either constructive superposition, interference phenomena like quantum beats, or destructive interference leading to a decay (dephasing) of the macroscopic polarization.

Optical Bloch Equations

The dynamics of photo-excited systems can be conveniently described by a set of equations, the optical Bloch equations, named after Felix Bloch (1905–1983), who first formulated such equations to analyze the spin dynamics in nuclear magnetic resonance. For the simple case of a two-level model, the Bloch equations can be written as

  iℏ ∂p/∂t = Δε p + E·μ I   [1]

  iℏ ∂I/∂t = 2 E·μ (p − p*)   [2]

Here, Δε is the energy difference and I the inversion, i.e., the occupation difference between the upper and lower state. The field E couples the polarization to the product of the Rabi energy E·μ and the inversion I.
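The two-level Bloch equations [1]–[2] are easy to integrate numerically. The following is a minimal sketch, not taken from the article, assuming units with ℏ = 1, the resonant limit Δε = 0, and a constant Rabi energy E·μ = 0.5; it reproduces the expected Rabi flopping of the inversion from −1 to +1 and conserves the quantity 4|p|² + I² = 1:

```python
import numpy as np

# Optical Bloch equations, eqns [1]-[2], with hbar = 1:
#   i dp/dt = deps*p + Emu*I,   i dI/dt = 2*Emu*(p - conj(p))
# Illustrative parameters (assumptions, not values from the article):
# resonant limit deps = 0 and constant Rabi energy Emu = 0.5, which
# gives Rabi flopping of the inversion at the angular frequency 2*Emu.
deps = 0.0
Emu = 0.5

def rhs(y):
    p, I = y
    return np.array([-1j * (deps * p + Emu * I),
                     -1j * 2.0 * Emu * (p - np.conj(p))])

def rk4_step(y, dt):
    # classical fourth-order Runge-Kutta step
    k1 = rhs(y)
    k2 = rhs(y + 0.5 * dt * k1)
    k3 = rhs(y + 0.5 * dt * k2)
    k4 = rhs(y + dt * k3)
    return y + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

dt = 0.001
y = np.array([0.0 + 0.0j, -1.0 + 0.0j])          # ground state: p = 0, I = -1
for _ in range(int(np.pi / dt)):                 # half a Rabi period, t = pi
    y = rk4_step(y, dt)

inversion = y[1].real                            # flops from -1 to +1
radius = 4.0 * abs(y[0]) ** 2 + inversion ** 2   # conserved, stays equal to 1
```

With these parameters the upper state becomes fully occupied after half a Rabi period; a finite detuning Δε would make the flopping incomplete.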
COHERENT TRANSIENTS / Foundations of Coherent Transients in Semiconductors 165

In the absence of the driving field, i.e., E ≡ 0, eqn [1] describes the free oscillation of p discussed above. The inversion is determined by the combined action of the Rabi energy and the transition p. The total occupation N, i.e., the sum of the occupations of the lower and the upper states, remains unchanged during the optical excitation since light does not create or destroy electrons; it only transfers them between different states. If N is initially normalized to 1, then I can vary between −1 and 1. I = −1 corresponds to the ground state of the system, where only the lower state is occupied. In the opposite extreme, i.e., for I = 1, the occupation of the upper state is 1 and the lower state is completely depleted.

One can show that eqns [1]–[2] contain another conservation law. Introducing P = p + p* = 2 Re[p] and J = i(p − p*) = −2 Im[p], which are real quantities and often named polarization and polarization current, respectively, one finds that the relation

  P² + J² + I² = 1   [3]

is fulfilled. Thus the coupled coherent dynamics of the transition and the inversion can be described on a unit sphere, the so-called Bloch sphere. The three terms P, J, and I can be used to define a three-dimensional Bloch vector S = (P, J, I). With these definitions, one can reformulate eqns [1]–[2] as

  ∂S/∂t = Ω × S   [4]

with Ω = (−2μ·E, 0, Δε)/ℏ and × denoting the vector product.

The structure of eqn [4] is mathematically identical to the equations describing either the angular momentum dynamics in the presence of a torque or the spin dynamics in a magnetic field. Therefore, the vector S is often called pseudospin. Moreover, many effects which can be observed in magnetic resonance experiments, e.g., free decay, quantum beats, echoes, etc., have their counterparts in photo-excited optical systems.

The vector product on the right-hand side of eqn [4] shows that S is changed by Ω in a direction perpendicular to S and Ω. For vanishing field the torque Ω has only a z-component. In this case the z-component of S, i.e., the inversion, remains constant, whereas the x- and y-components, i.e., the polarization and the polarization current, oscillate with the frequency Δε/ℏ on the Bloch sphere around Ω.

Semiconductor Bloch Equations

If one wants to analyze optical excitations in crystalline solids, i.e., in systems which are characterized by a periodic arrangement of atoms, one has to include the continuous dispersion (bandstructure) in the description. In translationally invariant, i.e., ordered, systems this can be done by including the crystal momentum ℏk as an index to the quantities which describe the optical excitation of the material, e.g., p and I in eqns [1]–[2]. Hence, one has to consider a separate two-level system for each crystal momentum ℏk. As long as all other interactions are disregarded, the bandstructure simply introduces a summation over the uncoupled contributions from the different k states, leading to inhomogeneous broadening due to the presence of a range of transition frequencies Δε(k)/ℏ.

For any realistic description of optical processes in solids, it is essential to go beyond the simple picture of non-interacting states and to treat the interactions among the elementary material excitations, e.g., the Coulomb interaction between the electrons and the coupling to additional degrees of freedom, such as lattice vibrations (electron–phonon interaction) or other bath-like subsystems. If crystals are not ideally periodic, imperfections, which can often be described as a disorder potential, need to be considered as well. All these effects can be treated by proper extensions of the optical Bloch equations introduced above. For semiconductors the resulting generalized equations are known as the semiconductor Bloch equations, where the microscopic interactions are included at a certain level of approximation. For a two-band model of a semiconductor, these equations can be written schematically as

  iℏ ∂p_k/∂t = Δε_k p_k + Ω_k (n^c_k − n^v_k) + iℏ ∂p_k/∂t|_corr   [5]

  iℏ ∂n^c_k/∂t = (Ω_k* p_k − Ω_k p_k*) + iℏ ∂n^c_k/∂t|_corr   [6]

  iℏ ∂n^v_k/∂t = −(Ω_k* p_k − Ω_k p_k*) + iℏ ∂n^v_k/∂t|_corr   [7]

Here p_k is the microscopic polarization and n^c_k and n^v_k are the electron populations in the conduction and valence bands (c and v), respectively. Due to the Coulomb interaction and possibly further processes, the transition energy Δε_k and the Rabi energy Ω_k both depend on the excitation state of the system, i.e., they are functions of the time-dependent polarizations and populations. This leads, in particular, to a coupling among the excitations for all different values of the crystal momentum ℏk. Consequently, in the presence of interactions, the optical excitations can no longer be described as independent two-level systems but have to be treated as a coupled many-body system.
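The precession form of eqn [4] can be checked directly. The sketch below (again with ℏ = 1, the field switched off, and an illustrative Δε) propagates the Bloch vector S = (P, J, I) under ∂S/∂t = Ω × S and confirms the free-oscillation behavior described above: the inversion stays constant while (P, J) rotates about the z-axis at the frequency Δε/ℏ, with the Bloch-sphere radius of eqn [3] conserved:

```python
import numpy as np

# Pseudospin precession, eqn [4]: dS/dt = Omega x S with
# Omega = (-2*mu*E, 0, deps)/hbar.  Here hbar = 1 and E = 0, so Omega
# points along z and S precesses at the angular frequency deps.
deps = 2.0                        # illustrative transition energy
omega = np.array([0.0, 0.0, deps])

def rhs(S):
    return np.cross(omega, S)

dt = 1e-4
S = np.array([1.0, 0.0, 0.0])     # coherence fully in P, inversion I = 0
t_final = 2.0 * np.pi / deps      # exactly one precession period
for _ in range(int(round(t_final / dt))):
    # fourth-order Runge-Kutta step on the unit (Bloch) sphere
    k1 = rhs(S)
    k2 = rhs(S + 0.5 * dt * k1)
    k3 = rhs(S + 0.5 * dt * k2)
    k4 = rhs(S + dt * k3)
    S = S + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
```

After one full period the Bloch vector returns to its starting point, while its z-component (the inversion) never changes, exactly as stated in the text for vanishing field.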
A prominent and important example in this context is the appearance of strong exciton resonances which, as a consequence of the Coulomb interaction, show up in the absorption spectra of semiconductors energetically below the fundamental bandgap.

The interaction effects lead to significant mathematical complications, since they induce couplings between all the different quantum states in a system and introduce an infinite hierarchy of equations for the microscopic correlation functions. The terms given explicitly in eqns [5]–[7] arise in a treatment of the Coulomb interaction on the Hartree–Fock level. Whereas this level is sufficient to describe excitonic resonances, there are many additional effects, e.g., excitation-induced dephasing due to Coulomb scattering and significant contributions from higher-order correlations, like excitonic populations and biexcitonic resonances, which make it necessary to treat many-body correlation effects that are beyond the Hartree–Fock level. These contributions are formally included by the terms denoted as |_corr in eqns [5]–[7]. The systematic truncation of the many-body hierarchy and the analysis of controlled approximations is the basic problem in the microscopic theory of optical processes in condensed matter systems.

Examples

Radiative Decay

The emitted field originating from the non-equilibrium coherent polarization of a photo-excited system can be monitored by measuring the transmission and the reflection as a function of time. If only a single isolated transition is excited, the dynamic evolution of the polarization, and therefore of the transmission and reflection, is governed by radiative decay. This decay is a consequence of the coupling of the optical transition to the light field, described by the combination of the Maxwell and Bloch equations. Radiative decay simply means that the optical polarization is converted into a light field on a characteristic time scale 2T_rad. Here, T_rad is the population decay time, often also denoted as the T1 time. This radiative decay is a fundamental process which limits the time over which coherent effects are observable for any photo-excited system. Due to other mechanisms, however, the non-equilibrium polarization often vanishes considerably faster.

The value of T_rad is determined by the dipole matrix element and the frequency of the transition, i.e., T_rad ∝ (|μ|² Δω)⁻¹, with Δω = Δε/ℏ. The temporal evolution of the polarization and of the emitted field is proportional to e^(−iΔωt − t/(2T_rad)). Usually one measures the intensity of a field, i.e., its squared modulus, which evolves as e^(−t/T_rad) and thus simply shows an exponential decay, the so-called free-induction decay (see Figure 1). T_rad can be very long for transitions with small matrix elements. For semiconductor quantum wells, however, it is of the order of only 10 ps, as a result of the strong light–matter interaction.

The time constant on which the optical polarization decays is often called T2. In the case where this decay is dominated by radiative processes, we thus have T2 = 2T_rad.

Figure 1 A two-level system is excited by a short optical pulse at t = 0. Due to radiative decay (inset) and possibly other dephasing mechanisms, the polarization decays exponentially as a function of time with the time constant T2, the dephasing time. The intensity of the optical field which is emitted as a result of this decay is proportional to the squared modulus of the polarization, which falls off with the time constant T2/2. The decay of the squared modulus of the polarization is shown in (a) on a linear scale and in (b) on a logarithmic scale. The dephasing time was chosen to be T2 = 10 ps, which models the radiative decay of the exciton transition of a semiconductor quantum well.
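Free-induction decay is simple enough to model in a few lines. The sketch below uses T2 = 2T_rad = 10 ps, the quantum-well exciton value quoted for Figure 1, and an arbitrary illustrative detuning; the emitted intensity |p|² then falls off with the time constant T2/2 = T_rad:

```python
import numpy as np

# Free-induction decay of a single resonance: p(t) ∝ e^{-i dω t - t/T2},
# with T2 = 2*T_rad.  T2 = 10 ps models a quantum-well exciton (Figure 1);
# the detuning d_omega is purely illustrative.
T2 = 10.0                          # dephasing time in ps
d_omega = 1.0                      # rad/ps, illustrative transition frequency
t = np.linspace(0.0, 50.0, 5001)   # time axis in ps
p = np.exp(-1j * d_omega * t - t / T2)
intensity = np.abs(p) ** 2         # emitted intensity ∝ |p|² = e^{-t/(T2/2)}
```

Plotting `intensity` on a logarithmic scale gives the straight line of Figure 1b, with slope −1/T_rad.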
Superradiance

The phenomenon of superradiance can be discussed considering an ensemble of N two-level systems which are localized at certain positions R_i. In this case […]

Figure 2 (a) N identical two-level systems are depicted which are regularly arranged with a spacing that equals an integer multiple of λ/2, where λ is the wavelength of the system's resonance. Due to their coupling via Maxwell's equations, all fields emitted from the two-level systems interfere constructively and the coupled system behaves effectively like a single two-level system with an optical matrix element increased by √N. (b) The temporal decay of the squared modulus of the polarization is shown on a logarithmic scale. The radiative decay rate is proportional to N and thus the polarization of the coupled system decays N times faster than that of an isolated system. This effect is called superradiance. (c) Measured time-resolved reflection of a semiconductor single quantum well (dashed line) and an N = 10 Bragg structure, i.e., a multiple quantum well where the individual wells are separated by λ/2 (solid line), on a logarithmic scale. [Part (c) is reproduced with permission from Haas S, Stroucken T, Hübner M, et al. (1998) Intensity dependence of superradiant emission from radiatively coupled excitons in multiple-quantum-well Bragg structures. Physical Review B 57: 14860. Copyright (1998) by the American Physical Society.]

Destructive Interference

As the next example we now consider a distribution of two-level systems which have slightly different transition frequencies, characterized by a distribution function g(Δω − ω̄) which is peaked at the mean value ω̄ and has a spectral width of δω (see Figure 3a). Ignoring the radiative coupling among the resonances, the optical polarization of the total system evolves after excitation with a short laser pulse at t = 0 proportional to ∫dΔω g(Δω − ω̄) e^(−iΔωt) ∝ g̃(t) e^(−iω̄t), where g̃(t) denotes the Fourier transform of the frequency distribution function. g̃(t), and thus the optical polarization, decays on a time scale which is inversely proportional to the spectral width δω of the distribution function. Thus the destructive interference of many different transition frequencies results in a rapid decay of the polarization (see Figure 3b).
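The dephasing integral of this section can be checked numerically. Assuming a Gaussian distribution function (all parameters illustrative), the summed polarization of the ensemble reproduces the Fourier-transform envelope g̃(t) = e^(−(δω t)²/2), which decays on the time scale 1/δω:

```python
import numpy as np

# Dephasing by destructive interference: P(t) ∝ Σ_j g_j e^{-i ω_j t} for an
# inhomogeneous ensemble.  For a Gaussian frequency distribution of width δω
# the envelope is the Fourier transform g̃(t) = e^{-(δω t)²/2}, so the
# polarization decays on the time scale 1/δω.  Parameters are illustrative.
omega_bar = 5.0                          # mean transition frequency
d_omega = 0.5                            # spectral width δω

omegas = omega_bar + d_omega * np.linspace(-8.0, 8.0, 4001)
g = np.exp(-((omegas - omega_bar) ** 2) / (2.0 * d_omega ** 2))
g /= g.sum()                             # normalized weights of the ensemble

t = np.linspace(0.0, 12.0, 121)
P = np.array([(g * np.exp(-1j * omegas * ti)).sum() for ti in t])
envelope = np.abs(P)                     # ≈ e^{-(δω t)²/2}
```

The discrete sum over frequencies matches the analytic Gaussian envelope to high accuracy, illustrating that no individual oscillator has decayed: the macroscopic polarization vanishes purely by interference.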
[…] uncoupled or coupled. As shown in Figure 4, the two uncoupled two-level systems give the same linear polarization as a three-level system where the two transitions share a common state. It is, however, possible to decide about the nature of the underlying transitions if one performs nonlinear optical spectroscopy. This is due to the fact that the so-called quantum beats, i.e., the temporal modulations of the polarization of an intrinsically coupled system, show a different temporal evolution than the so-called polarization interferences, i.e., the temporal modulations of the polarization of uncoupled systems. The former are also much more stable in the presence of inhomogeneous broadening than the latter.

In semiconductor heterostructures, quantum-beat spectroscopy has been widely used to investigate the temporal dynamics of excitonic resonances. Also the coupling among different optical resonances has been explored in pump-probe and four-wave-mixing measurements.

Coherent Control

In the coherent regime the polarization induced by a first laser pulse can be modified by a second, delayed pulse. For example, the first short laser pulse incident at t = 0 induces a polarization proportional to e^(−iΔωt) and a second pulse arriving at t = τ > 0 induces a polarization proportional to e^(−iφ) e^(−iΔω(t−τ)). Thus for t ≥ τ the total polarization is given by e^(−iΔωt)(1 + e^(−iΔφ)) with Δφ = φ − Δωτ. If Δφ is an integer multiple of 2π, the polarizations of both pulses interfere constructively and the amplitude of the resulting polarization is doubled as compared to a single pulse. If, on the other hand, Δφ is an odd multiple of π, the polarizations of both pulses interfere destructively and the resulting polarization vanishes after the action of both pulses. Thus by […]

Figure 5 (a) Two Gaussian laser pulses separated by the time delay τ can lead to interferences. As shown by the thick solid line in the inset, the spectral intensity of a single pulse is a Gaussian with a maximum at the central frequency of the pulse. The width of this Gaussian is inversely proportional to the duration of the pulse. The spectral intensity of the field consisting of both pulses shows interference fringes, i.e., is modulated with a spectral period that is inversely proportional to τ. As shown by the thin solid and the dotted lines in the inset, the phase of the interference fringes depends on the phase difference Δφ between the two pulses. Whereas for Δφ = 0 the spectral intensity has a maximum at the central frequency, it vanishes at this position for Δφ = π. (b) The temporal dynamics of the squared modulus of the polarization is shown for a two-level system excited by a pair of short laser pulses, neglecting dephasing, i.e., setting T2 → ∞. The first pulse excites at t = 0 an optical polarization with a squared modulus normalized to 1. The second pulse, which has the same intensity as the first one, also excites an optical polarization, at t = 2 ps. For t > 2 ps the total polarization is given by the sum of the polarizations induced by the two pulses. Due to interference, the squared modulus of the total polarization depends sensitively on the phase difference Δφ between the two pulses. For constructive interference the squared modulus of the total polarization after the second pulse is four times bigger than that after the first pulse, whereas it vanishes for destructive interference. One may achieve all values in between these extremes by using phase differences Δφ which are not multiples of π. It is thus shown that the second pulse can be used to coherently control the polarization induced by the first pulse.

Figure 6 Transient optical nonlinearities can be investigated by exciting the system with two time-delayed optical pulses. The pump pulse (E_p) puts the system in an excited state and the test (probe) pulse (E_t) is used to measure its dynamics. The dynamic nonlinear optical response may be measured in the transmitted or reflected directions of the pump and test pulses, i.e., ±k_p and ±k_t, respectively. In a pump-probe experiment one often investigates the change of the test absorption induced by the pump pulse, by measuring the absorption in the direction k_t with and without E_p. One may also measure the dynamic nonlinear optical response in scattering directions like 2k_p − k_t and 2k_t − k_p, as indicated by the dashed lines. This is what is done in a four-wave-mixing or photon-echo experiment.
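The interference factor derived under Coherent Control can be verified with delta-like pulses and no dephasing (all parameters illustrative):

```python
import numpy as np

# Coherent control with two delta-like pulses (no dephasing).  For t >= tau
# the total polarization is e^{-i dω t}(1 + e^{-i dphi}) with
# dphi = phi - dω*tau, so its squared modulus |1 + e^{-i dphi}|² is
# 4 for dphi = 2πn and 0 for odd multiples of π.
d_omega = 3.0                     # illustrative transition frequency
tau = 2.0                         # delay of the second pulse

def total_polarization(t, phi):
    p1 = np.exp(-1j * d_omega * t)                       # from pulse 1 (t = 0)
    p2 = np.exp(-1j * phi) * np.exp(-1j * d_omega * (t - tau))  # from pulse 2
    return p1 + p2

t = 5.0                           # any time after the second pulse
constructive = abs(total_polarization(t, d_omega * tau)) ** 2          # dphi = 0
destructive = abs(total_polarization(t, d_omega * tau + np.pi)) ** 2   # dphi = π
```

Intermediate phase differences give every value between 0 and 4, which is precisely the coherent-control handle described in the text and in Figure 5.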
Spectral Oscillations

For negative time delays τ < 0, i.e., if the test pulse precedes the pump, the absorption change Δα(ω) is characterized by spectral oscillations around Δω which vanish as τ approaches zero (see Figure 8a). The spectral period of the oscillations decreases with increasing |τ|. Figure 8b shows measured differential transmission spectra of a multiple quantum-well structure for different negative time delays. As with the differential absorption, the differential transmission spectra are also dominated by spectral oscillations around the exciton resonance, whose period and amplitude decrease with increasing |τ|.

The physical origin of the coherent oscillations is the pump-induced perturbation of the free-induction decay of the polarization generated by the probe pulse. This perturbation is delayed by the time τ at which the pump arrives. The temporal shift of the induced polarization changes in the time domain leads to oscillations in the spectral domain, since the Fourier transformation translates a delay in one domain into a phase factor in the conjugate domain.
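The last statement is the Fourier shift theorem. As a sketch, take a simple exponentially decaying response with an analytically known spectrum (parameters illustrative): delaying it by τ multiplies the spectrum by e^(iωτ), whose real part oscillates in ω with period 2π/τ, just like the spectral oscillations described above:

```python
import numpy as np

# Fourier shift theorem: if f(t) has spectrum F(ω), then f(t - tau) has
# spectrum e^{i ω tau} F(ω).  A delay in time therefore appears as a
# spectral modulation with period 2π/tau - the origin of the coherent
# spectral oscillations.  All parameters below are illustrative.
tau = 5.0                                  # delay of the perturbation
T2 = 2.0                                   # decay time of the response

def spectrum(omega, delay):
    # analytic Fourier transform of θ(t - delay) e^{-(t - delay)/T2}
    return np.exp(1j * omega * delay) / (1.0 / T2 - 1j * omega)

omega = np.linspace(-4.0, 4.0, 801)
fringes = spectrum(omega, tau).real        # oscillates with period 2π/tau
smooth = spectrum(omega, 0.0).real         # no delay: a smooth Lorentzian

period = 2.0 * np.pi / tau                 # expected fringe period
```

The undelayed spectrum is a smooth, strictly positive Lorentzian, whereas the delayed one swings through zero with the predicted period, and the period shrinks as τ grows, matching the behavior of Figure 8.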
Photon Echo

[…] two laser pulses. Further echo phenomena can also be observed as the result of the coupling of the photo-excitation to nuclear degrees of freedom, and also due to the spatial motion of the electrons, i.e., electronic currents.

A time-resolved four-wave-mixing signal, measured on an inhomogeneously broadened exciton resonance of a semiconductor quantum well structure for different time delays, is presented in Figure 9b. Compared to Figure 9a, the origin of the time axis starts at the arrival of the second pulse, i.e., for a non-interacting system the maximum of the echo is expected at t = τ. Due to the inhomogeneous broadening, the maximum of the signal shifts to longer times with increasing delay. Further details of the experimental results, in particular the exact position of the maximum and the width of the signal, cannot be explained on the basis of non-interacting two-level systems, but require the analysis of many-body effects in the presence of inhomogeneous broadening.
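The rephasing behind a photon echo can be sketched with an idealized two-pulse model (Gaussian random detunings and perfect phase conjugation by the second pulse; all parameters illustrative). Each coherence evolves as e^(−iδ_j t) until the second pulse at t = τ conjugates the accumulated phase, after which the whole ensemble realigns at t = 2τ:

```python
import numpy as np

# Idealized two-pulse photon echo for an inhomogeneously broadened ensemble.
# Pulse 1 (t = 0) creates coherences e^{-i delta_j t}; an ideal pulse 2 at
# t = tau conjugates the accumulated phase, so for t > tau each coherence
# goes as e^{-i delta_j (t - 2*tau)} and all members rephase at t = 2*tau.
rng = np.random.default_rng(0)
delta = rng.normal(0.0, 2.0, 5000)       # random detunings (illustrative width)
tau = 4.0                                # delay between the two pulses

t = np.linspace(tau, 3.0 * tau, 201)     # times after the second pulse
P = np.abs(np.exp(-1j * np.outer(t - 2.0 * tau, delta)).mean(axis=1))
t_echo = t[np.argmax(P)]                 # macroscopic polarization peaks here
```

Measured from the arrival of the second pulse, as in Figure 9b, the echo maximum of this non-interacting model sits at t = τ; the shifts of the maximum and the signal widths seen experimentally are the many-body effects mentioned above.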
See also

Coherent Control: Applications in Semiconductors; Experimental; Theory. Spectroscopy: Nonlinear Laser Spectroscopy.

Further Reading

Allen L and Eberly JH (1975) Optical Resonance and Two-Level Atoms. New York: Wiley.
Baumert T, Helbing J, Gerber G, et al. (1997) Coherent control with femtosecond laser pulses. Advances in Chemical Physics 101: 47–82.
Bergmann K, Theuer H and Shore BW (1998) Coherent population transfer among quantum states of atoms and molecules. Reviews of Modern Physics 70: 1003–1025.
Bloembergen N (1965) Nonlinear Optics. New York: Benjamin.
Brewer RG (1977) Coherent optical spectroscopy. In: Balian R, et al. (eds) Frontiers in Laser Spectroscopy, Les Houches Session XXVII, Course 4, vol. 1. Amsterdam: North-Holland.
Cohen-Tannoudji C, Dupont-Roc J and Grynberg G (1989) Photons and Atoms. New York: Wiley.
Haug H and Koch SW (2004) Quantum Theory of the Optical and Electronic Properties of Semiconductors, 4th edn. Singapore: World Scientific.
Koch SW, Peyghambarian N and Lindberg M (1988) Transient and steady-state optical nonlinearities in semiconductors. Journal of Physics C 21: 5229–5250.
Macomber JD (1976) The Dynamics of Spectroscopic Transitions. New York: Wiley.
Mukamel S (1995) Principles of Nonlinear Optical Spectroscopy. New York: Oxford University Press.
Peyghambarian N, Koch SW and Mysyrowicz A (1993) Introduction to Semiconductor Optics. Englewood Cliffs, NJ: Prentice-Hall.
Pötz W and Schroeder WA (eds) (1999) Coherent Control in Atoms, Molecules, and Semiconductors. Dordrecht: Kluwer.
Schäfer W and Wegener M (2002) Semiconductor Optics and Transport Phenomena. Berlin: Springer.
Shah J (1999) Ultrafast Spectroscopy of Semiconductors and Semiconductor Nanostructures, 2nd edn. Berlin: Springer.
Yeazell JA and Uzer T (eds) (2000) The Physics and Chemistry of Wave Packets. New York: Wiley.
conserve the energy and momentum, and the quantum kinetic approach becomes more appropriate. This is true for photo-excited semiconductors as well as for the quantum transport regime in semiconductors. Also, the semiconductor Bloch equations have been further extended by including four-particle correlations. Finally, the most general approach to the description of the nonequilibrium and coherent response of semiconductors following excitation by an ultrashort laser pulse is based on nonequilibrium Green's functions.

Experimental studies on the coherent response of semiconductors can be broadly divided into three categories:

1. investigation of novel aspects of coherent response not found in other simpler systems such as atoms;
2. investigation of how the initial coherence is lost, to gain insights into dephasing and decoherence processes; and
3. coherent control of processes in semiconductors using phase-locked pulses.

Numerous elegant studies have been reported. This section presents some observations on the trends in this field.

Historically, the initial experiments focused on the decay of the coherent response (time-integrated FWM as a function of the delay between the pulses), analyzed them in terms of the independent two-level model, and obtained very useful information about various collision processes and rates (exciton–exciton, exciton–carrier, carrier–carrier, exciton–phonon, etc.). It was soon realized, however, that the noninteracting two-level model is not appropriate for semiconductors because of the strong influence of Coulomb interactions. Elegant techniques were then developed to explore the nature of the coherent response of semiconductors. These included investigation of the time- and spectrally resolved coherent response, and of both the amplitude and phase of the response to complement intensity measurements. These studies provided fundamental new insights into the nature of semiconductors and many-body processes and interactions in semiconductors. Many of these observations were well explained by the semiconductor Bloch equations. When lasers with ~10 fs pulse widths became laboratory tools, the dynamics was explored on time-scales much shorter than the characteristic phonon oscillation period (≈115 fs for GaAs longitudinal optical phonons), the plasma oscillation period (≈150 fs for a carrier density of 5 × 10^17 cm^−3), and the typical electron–phonon collision interval (≈200 fs in GaAs). Some remarkable features of quantum kinetics, such as reversal of the electron–phonon collision, memory effects, and energy nonconservation, were demonstrated. More recent experiments have demonstrated the influence of four-particle correlations on the coherent nonlinear response of semiconductors such as GaAs in the presence of a strong magnetic field.

The ability to generate phase-locked pulses with controllably variable separation between them provides an exciting opportunity to manipulate a variety of excitations and processes within a semiconductor. If the separation between the two phase-locked pulses is adjusted to be less than the dephasing time of the system under investigation, then the amplitudes of the excitations produced by the two phase-locked pulses can interfere either constructively or destructively, depending on the relative phase of the carrier waves in the two pulses. The nature of the interference changes as the second pulse is delayed over an optical period. A number of elegant experiments have demonstrated coherent control of exciton population, spin orientation, resonant emission, and electrical current. Phase-sensitive detection of the linear or nonlinear emission provides additional insights into different aspects of semiconductor physics.

These experimental and theoretical studies of the ultrafast coherent response of semiconductors have led to fundamental new insights into the physics of semiconductors and their coherence properties. Discussion of novel coherent phenomena is given in a subsequent section. Coherent spectroscopy of semiconductors continues to be a vibrant research field.

Incoherent Relaxation Dynamics

The qualitative picture that emerges from investigation of coherent dynamics can be summarized as follows. Consider a direct gap semiconductor excited by an ultrashort laser pulse of duration τ_L (with a spectral width hΔν_L) centered at a photon energy hν_L larger than the semiconductor bandgap E_g. At the beginning of the pulse the semiconductor does not know the pulse duration, so coherent polarization is created over a spectral region much larger than hΔν_L. If there are no phase-destroying events during the pulse (dephasing rate 1/τ_D ≪ 1/τ_L), then destructive interference destroys the coherent polarization away from hν_L with increasing time during the excitation pulse. Thus, the coherent polarization exists only over the spectral region hΔν_L at the end of the pulse. This coherence will eventually be destroyed by collisions and the semiconductor will be left in an incoherent (off-diagonal elements of the density matrix are 0) nonequilibrium state with peaked electron and hole population distributions whose central energies and energy widths are

176 COHERENT TRANSIENTS / Ultrafast Studies of Semiconductors

determined by hν_L, hΔν_L and E_g, if the energy and momentum relaxation rates (1/τ_E, 1/τ_M) are much smaller than the dephasing rates.
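The spectral narrowing during the pulse can be illustrated in lowest-order perturbation theory, where the coherence at detuning δ from the pulse carrier grows as p_δ(t) ∝ ∫ (up to t) E(t′) e^(iδt′) dt′. This is a sketch under simple assumptions of my own choosing (rotating frame, ℏ = 1, Gaussian envelope, illustrative numbers), not a calculation from the text:

```python
import numpy as np

# First-order (perturbative) coherence at detuning delta from the pulse
# carrier: p_delta(t) ∝ ∫_{-∞}^{t} E(t') e^{i delta t'} dt'.
# Mid-pulse, the truncated integral leaves coherence far outside the pulse
# bandwidth; by the end of the pulse it equals the pulse spectrum, which is
# narrow: destructive interference has removed the far-detuned polarization.
tg = np.linspace(-5.0, 5.0, 10001)         # time grid (pulse width sigma = 1)
dt = tg[1] - tg[0]
E = np.exp(-tg ** 2 / 2.0)                 # Gaussian pulse envelope

def p_delta(delta, t_stop):
    m = tg <= t_stop
    return np.abs(np.sum(E[m] * np.exp(1j * delta * tg[m])) * dt)

delta_far = 6.0                            # detuning >> pulse bandwidth
mid_ratio = p_delta(delta_far, 0.0) / p_delta(0.0, 0.0)   # middle of the pulse
end_ratio = p_delta(delta_far, 5.0) / p_delta(0.0, 5.0)   # after the pulse
```

Mid-pulse, the far-detuned coherence is an appreciable fraction of the resonant one; by the end of the pulse it has been suppressed by orders of magnitude, leaving polarization only within the pulse bandwidth hΔν_L, as described above.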
The simplifying assumptions that neatly separate the excitation, dephasing, and relaxation regimes are obviously not realistic. A combination of all these (and some other) physical parameters will determine the state of a realistic system at the end of the exciting pulse. However, after times ~τ_D following the laser pulse, the semiconductor can be described in terms of electron and hole populations whose distribution functions are not of Maxwell–Boltzmann or Fermi–Dirac type, but nonthermal, in the large majority of cases in which the energy and momentum relaxation rates are much smaller than the dephasing rates. This section discusses the dynamics of this incoherent state.

Extensive theoretical work has been devoted to quantitatively understanding these initial athermal distributions and their further temporal evolution. This includes Monte Carlo simulations based on the Boltzmann kinetic equation approach (typically appropriate for time-scales longer than ~100 fs, the carrier–phonon interaction time) as well as the quantum kinetic approach (for times typically less than ~100 fs) discussed above. We present below some observations on this vast field of research.

Nonthermal Regime

A large number of physical processes determine the evolution of the nonthermal electron and hole populations generated by the optical pulse. These include electron–electron, hole–hole, and electron–hole collisions, including plasma effects if the density is high enough; intervalley scattering in the conduction band and intervalence-band scattering in the valence bands; intersub-band scattering in quantum wells; and electron–phonon and hole–phonon scattering processes. The complexity of the problem is evident when one considers that many of these processes occur on the same time-scale. Of these myriad processes, only carrier–phonon interactions and electron–hole recombination (generally much slower) can alter the total energy in the electronic system. Under typical experimental conditions, the redistribution of energy takes place before substantial transfer of energy to the phonon system. Thus, the nonthermal distributions first become thermal distributions with the same total energy. Investigation of the dynamics of how the nonthermal distribution evolves into a thermal distribution provides valuable information about the nature of the various scattering processes (other than carrier–phonon scattering) described above.

This discussion of separating the processes that conserve energy in the electronic system and those that transfer energy to other systems is obviously too simplistic and depends strongly on the nature of the problem one is considering. The challenge for the experimentalist is to devise experiments that can isolate these phenomena so each can be studied separately. If the laser energy is such that hν_L − E_g < ℏω_LO, the optical phonon energy, then the majority of the photo-excited electrons and holes do not have sufficient energy to emit an optical phonon, the fastest of the carrier–phonon interaction processes. The initial nonthermal distribution is then modified primarily by processes other than phonon scattering and can be studied experimentally without the strong influence of the carrier–phonon interactions. Such experiments have indeed been performed both in bulk and quantum well semiconductors. These experiments have exhibited spectral hole burning in the pump-probe transmission spectra and thus demonstrated that the initial carrier distributions are indeed nonthermal and evolve to thermal distributions. Such experiments have provided quantitative information about various carrier–carrier scattering rates as a function of carrier density. In addition, experiments with hν_L − E_g > ℏω_LO and hν_L − E_g larger than the intervalley separation have provided valuable information about intravalley as well as intervalley electron–phonon scattering rates.

Experiments have been performed in a variety of different semiconductors, including bulk semiconductors and undoped and modulation-doped quantum wells. The latter provide insights into intersub-band scattering processes and quantitative information about the rates.

These measurements have shown that under typical experimental conditions, the initial nonthermal distribution evolves into a thermal distribution in times of the order of, or less than, 1 ps. However, the characteristic temperatures of the thermal distributions can be different for electrons and holes. Furthermore, different phonons may also have different characteristic temperatures, and may even be nonthermal when the electronic distributions are thermal. This is the hot carrier regime that is discussed in the next subsection.

Hot Carrier Regime

The hot carriers, with characteristic temperatures T_c (T_e and T_h for electrons and holes, respectively), have high-energy tails in the distribution functions that extend several kT_c higher than the respective Fermi energies. Some of these carriers have sufficient energy to emit an optical phonon. Since the carrier–optical phonon interaction rates and the phonon energies are rather large, this may be the most effective energy-loss mechanism in many cases, even though the number of such carriers is rather small.
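The size of this high-energy tail is easy to estimate for a nondegenerate (Boltzmann) distribution in a three-dimensional parabolic band, where the density of states scales as √E. The sketch below is my own back-of-the-envelope illustration, using the GaAs LO phonon energy ℏω_LO ≈ 36 meV and illustrative carrier temperatures:

```python
import numpy as np

# Fraction of carriers in a 3D Boltzmann distribution (density of states
# ∝ √E) with kinetic energy above the LO phonon energy E0 = ħω_LO ≈ 36 meV
# in GaAs.  Carrier temperatures are illustrative, not from the article.
kB = 8.617e-5            # Boltzmann constant in eV/K
E0 = 0.036               # eV, GaAs LO phonon energy

def hot_fraction(T):
    E = np.linspace(0.0, 1.0, 200001)          # eV; tail beyond 1 eV negligible
    w = np.sqrt(E) * np.exp(-E / (kB * T))     # √E Boltzmann weight
    return w[E >= E0].sum() / w.sum()

frac_100K = hot_fraction(100.0)   # only a few percent can emit an LO phonon
frac_300K = hot_fraction(300.0)   # the tail grows rapidly with temperature
```

At T_c = 100 K only a few percent of the carriers can emit an LO phonon, yet, as noted in the text, this channel can still dominate the energy loss because both the emission rate and the energy lost per event are large.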
The emitted optical phonons have relatively small wavevectors and nonequilibrium populations larger than expected for the lattice temperature. These optical phonons are often referred to as hot phonons, although nonequilibrium phonons may be a better term, since they probably have nonthermal distributions. The dynamics of hot phonons will be discussed in the next subsection, but we mention here that in the case of a large population of hot phonons, one must consider not only phonon emission but also phonon absorption. Acoustic phonons come into the picture at lower temperatures, when the fraction of carriers that can emit optical phonons is extremely small or negligible, and also in the case of quantum wells with sub-band energy separations smaller than the optical phonon energy.

The dynamics of hot carriers have been studied extensively to obtain a deeper understanding of carrier–phonon interactions and quantitative measures of the carrier–phonon interaction rates. Time-resolved luminescence and pump-probe transmission spectroscopy are the primary tools used for such studies. For semiconductors like GaAs, the results indicate that polar optical phonon scattering (longitudinal optical phonons) dominates for electrons, whereas polar and nonpolar optical phonon scattering contribute for holes. Electron–hole scattering is sufficiently strong in most cases to maintain a common electron and hole temperature for times longer than ~1 ps. Since many of these experiments are performed at relatively high densities, they provide important information about many-body effects such as screening. Comparison of bulk and quasi-two-dimensional semiconductors (quantum wells) has been a subject of considerable interest. The consensus appears to be that, in spite of significant differences in […] that operate at high electric fields, thus in the regime of hot carriers. Another important aspect is that electrons and holes can be investigated separately by using different doping. Furthermore, the energy-loss rates can be determined quantitatively because the energy transferred from the electric field to the electrons or holes can be accurately determined by electrical measurements.

Isothermal Regime

Following the processes discussed above, the excitations in the semiconductor reach equilibrium with each other and with the thermal bath. Recombination processes then return the semiconductor to its thermodynamic equilibrium. Radiative recombination typically occurs over a longer time-scale, but there are some notable exceptions, such as the radiative recombination of excitons in quantum wells occurring on picosecond time-scales.

Hot Phonons

As discussed above, a large population of nonequilibrium optical phonons is created if a large density of hot carriers has sufficient energy to emit optical phonons. Understanding the dynamics of these nonequilibrium optical phonons is of intrinsic interest.

At low lattice temperatures, the equilibrium population of optical phonons is insignificant. The optical phonons created as a result of photo-excitation occupy a relatively narrow region of wavevector (k) space near k = 0 (~1% of the Brillouin zone). This nonequilibrium phonon distribution can spread over a larger region of k-space by various processes, or decay anharmonically into multiple large-wavevector acoustic phonons. These acoustic phonons then
the nature of electronic states and phonons, similar scatter or decay into small-wavevector, low-energy
processes with similar scattering rates dominate both acoustic phonons that are eventually thermalized.
types of semiconductors. These studies reveal that Pump-probe Raman scattering provides the best
the energy loss rates are influenced by many factors means of investigating the dynamics of nonequili-
such as the Pauli exclusion principle, degenerate brium optical phonons. Such studies, for bulk and
(Fermi – Dirac) statistics, hot phonons, and screening quantum well semiconductors, have provided invalu-
and many-body aspects. able information about femtosecond dynamics of
Hot carrier distribution can be generated not only phonons. In particular, they have provided the rate at
by photo-excitation, but also by applying an electric which the nonequilibrium phonons decay. However,
field to a semiconductor. Although the process of this information is for one very small value of the
creating the hot distributions is different, the pro- phonon wavevector so the phonon distribution
cesses by which such hot carriers cool to the lattice function within the optical phonon branch is not
temperature are the same in two cases. Therefore, investigated. The large-wavevector acoustic phonons
understanding obtained through one technique can are even less accessible to experimental techniques.
be applied to the other case. In particular, the physical Measurements of thermal phonon propagation pro-
insights obtained from the optical excitation case can vide some information on this subject. Finally, the
be extremely valuable for many electronic devices nature of phonons is considerably more complex in
178 COHERENT TRANSIENTS / Ultrafast Studies of Semiconductors
quantum wells, and some interesting results have transfer of carriers to the other quantum well by
been obtained on the dynamics of these phonons. tunneling can then be determined by measuring
dynamic changes in the spectra. Such measurements
have provided much insight into the nature and rates
Tunneling and Transport Dynamics of tunneling, and demonstrated phonon resonances.
Ultrafast studies have also made important contri- Resonant tunneling, i.e., tunneling when two elec-
butions to tunneling and transport dynamics in tronic levels are in resonance and are split due to
semiconductors. Time-dependent imaging of the strong coupling, has also been investigated exten-
luminescing region of a sample excited by an sively for both electrons and holes. One interesting
ultrashort pulse provides information about spatial insight obtained from such studies is that tunneling
transport of carriers. Such a technique was used, for and relaxation must be considered in a unified
example, to demonstrate negative absolute mobility framework, and not as sequential events, to explain
of electrons in p-modulation-doped quantum wells. the observations.
The availability of near-field microscopes enhances
the resolution of such techniques to measure lateral
Novel Coherent Phenomena
transport to the subwavelength case.
A different technique of ‘optical markers’ has been Ultrafast spectroscopy of semiconductors has pro-
developed to investigate the dynamics of transport vided valuable insights into many other areas. We
and tunneling in a direction perpendicular to the conclude this article by discussing some examples of
surface (vertical transport). This technique relies how such techniques have demonstrated novel
heavily on fabrication of semiconductor nanostruc- physical phenomena.
tures with desired properties. The basic idea is to The first example once again considers an a-DQW
fabricate a sample in which different spatial regions biased in such a way that the lowest electron energy
have different spectral signatures. Thus if the carriers level in the wide quantum well is brought into
are created in one spatial region and are transported resonance with the first excited electron level in the
to another spatial region under the influence of an narrow quantum well. The corresponding hole
applied electric field, diffusion, or other processes, energy levels are not in resonance. By choosing
the transmission, reflection, and/or luminescence the optical excitation photon energy appropriately,
spectra of the sample change dynamically as the the hole level is excited only in the wide well and the
carriers are transported. resonant electronic levels are excited in a linear
This technique has provided new insights into the superposition state such that the electrons at t ¼ 0
physics of transport and tunneling. Investigation of also occupy only the wide quantum well. Since the
perpendicular transport in graded-gap superlattices two electron eigenstates have slightly different ener-
showed remarkable time-dependent changes in the gies, their temporal evolutions are different with the
spectra, and analysis of the results provided new result that the electron wavepacket oscillates back
insight into transport in a semiconductor of inter- and forth between the two quantum wells in the
mediate disorder. Dynamics of carrier capture in absence of damping. The period of oscillation is
quantum wells from the barriers provided infor- determined by the splitting between the two resonant
mation not only about the fundamentals of capture levels and can be controlled by an external electric
dynamics, but also about how such dynamics affects field. Coherent oscillations of electronic wave-
the performance of semiconductor lasers with quan- packets have indeed been experimentally observed
tum well active regions. by using the coherent nonlinear technique of four-
The technique of optical markers has been applied wave mixing. Semiconductor quantum wells provide
successfully to investigate tunneling between the two an excellent flexible system for such investigations.
quantum wells in an a-DQW (asymmetric double Another example is the observation of Bloch
quantum well) structure in which two quantum wells oscillations in semiconductor superlattices. In 1928
of different widths (and hence different energy levels) Bloch demonstrated theoretically that an electron
are separated by a barrier. The separation between the wavepacket composed of a superposition of states
energy levels of the system can be varied by applying from a single energy band in a solid undergoes a
an electric field perpendicular to the wells. In the periodic oscillation in energy and momentum space
absence of resonance between any energy levels, the under certain conditions. An experimental demon-
wavefunction for a given energy level is localized in stration of Bloch oscillations became possible only
one well or the other. In this nonresonant case, it is recently by applying ultrafast optical techniques to
possible to generate carriers in a selected quantum semiconductor superlattices. The large period of a
well by proper optical excitation. Dynamics of superlattice (compared to the atomic period in a
solid) makes it possible to satisfy the assumptions underlying Bloch's prediction. The experimental demonstration of Bloch oscillations was also performed using four-wave mixing techniques. This provides another prime example of the power of ultrafast optical studies.

Summary

Ultrafast spectroscopy of semiconductors and their nanostructures is an exciting field of research that has provided fundamental insights into important physical processes in semiconductors. This article has attempted to convey the breadth of this field and the diversity of physical processes addressed by it. Many exciting developments in the field have been omitted out of necessity. The author hopes that this brief article inspires the readers to explore some of the Further Reading.

See also

Coherence: Coherence and Imaging. Coherence Control: Experimental; Theory. Coherent Transients: Ultrafast Studies of Semiconductors. Nonlinear Optics, Basics: Four-Wave Mixing. Scattering: Scattering Theory.

Further Reading

Allen L and Eberly JH (1975) Optical Resonance and Two-Level Atoms. New York: Wiley Interscience.
Chemla DS (1999) Chapter 3. In: Garmire E and Kost AR (eds) Nonlinear Optics in Semiconductors, pp. 175–256. New York: Academic Press.
Chemla DS and Shah J (2001) Many-body and correlation effects in semiconductors. Nature 411: 549–557.
Haug H (2001) Quantum kinetics for femtosecond spectroscopy in semiconductors. In: Willardson RK and Weber ER (eds) Semiconductors and Semimetals, vol. 67, pp. 205–229. London: Academic Press.
Haug H and Jauho AP (1996) Quantum Kinetics in Transport and Optics of Semiconductors. Berlin: Springer-Verlag.
Haug H and Koch SW (1993) Quantum Theory of the Optical and Electronic Properties of Semiconductors. Singapore: World Scientific.
Kash JA and Tsang JC (1992) Chapter 3. In: Shank CV and Zakharchenya BP (eds) Spectroscopy of Nonequilibrium Electrons and Phonons, pp. 113–168. Amsterdam: Elsevier.
Mukamel S (1995) Principles of Nonlinear Optical Spectroscopy. New York: Oxford University Press.
Shah J (1992) Hot Carriers in Semiconductor Nanostructures: Physics and Applications. Boston, MA: Academic Press.
Shah J (1999) Ultrafast Spectroscopy of Semiconductors and Semiconductor Nanostructures, 2nd edn. Heidelberg, Germany: Springer-Verlag.
Steel DG, Wang H, Jiang M, Ferrio KB and Cundiff ST (1994) In: Phillips RT (ed.) Coherent Optical Interactions in Solids, pp. 157–180. New York: Plenum Press.
Wegener M and Schaefer W (2002) Semiconductor Optics and Transport. Heidelberg, Germany: Springer-Verlag.
Zimmermann R (1988) Many-Particle Theory of Highly Excited Semiconductors. Leipzig, Germany: Teubner.

COLOR AND THE WORLD 179
in his famous treatise 'Opticks', are the foundations of our understanding of color and color perception. Based on his experiments, Newton showed that different regions of the visible spectrum are perceived as different colors and that solar radiation (white light) consists of the spectral colors violet, blue, cyan, green, yellow, orange, and red. Thus, natural daylight is quite colorful though we perceive it as colorless. Table 1 gives approximate wavelength ranges of the spectral colors. Superposition of all the spectral colors results in a perception of white.

Colors can be produced in a number of ways:

1. Gas discharges emit characteristic radiation in the visible region. For example, a neon discharge is red in color, an argon discharge blue, and a sodium discharge yellow; the helium-neon laser gives red radiation and the argon-ion laser blue-green radiation. A variety of fluorescent lamps are available; they provide a continuous spectral power distribution with the characteristic lines of mercury in the blue-green region. Gas-filled sodium and fluorescent lamps offer long life and high efficiency and are popular for indoor and outdoor lighting.

2. Transition metal (Cr, Mn, Fe, Co, Ni, Cu) compounds or transition metal impurities are responsible for the colors of minerals, paints, gems, and pigments. These involve inorganic compounds of transition metal or rare earth ions with unpaired d or f electrons. About 1% chromium sesquioxide (Cr2O3) in colorless aluminum oxide results in beautiful red ruby, a gemstone. The Cr3+ impurity in pure beryllium aluminum silicate gives a beautiful green, resulting in emerald, another gemstone. Transition metal impurities are added to glass in the molten state to prepare colored glass. For example, adding Cr3+ gives green, Mn3+ purple, Fe3+ pale yellow, Co3+ reddish blue, Ni2+ brown, and Cu2+ greenish blue color.

3. Color in organic compounds – Most natural and synthetic dyes and biological colorants (of vegetable and animal origin) are complicated organic molecules. They absorb light, exciting molecular orbitals. The dyes are used for food coloration, clothing, photography, and as sensitizers.

4. Colors of metals, semiconductor materials, and color centers – The colors of these materials are caused by electronic transitions involving the energy bands.
   Diode lasers emit radiation corresponding to the bandgap of the semiconductor materials used for their fabrication. They are inexpensive, compact, and can be operated using dry cell batteries. They are extensively used in compact disks, barcode readers, and optical communications. Light emitting diodes (LEDs) are also prepared using semiconductor materials and likewise emit radiation corresponding to the bandgap of the semiconductor materials used for their fabrication. LEDs are available in a variety of colors, red (GaAsP), orange (AlInGaP), yellow (InGaAs), green (CdSSe), blue (CdZnS), and are widely used in clocks, toys, tuners, displays, electronic billboards, and appliances.
   Color centers may be produced by irradiating some alkali halides and glass materials with electromagnetic radiation (gamma rays, X-rays, ultraviolet, and visible radiation) or with charged or uncharged particles (electrons, protons, neutrons). The irradiation produces an electron-hole pair and the electron is trapped, forming an electron center. The electron or hole, or both, form a color center absorbing part of the visible radiation. The color centers can be reversed by heating to high temperatures.

5. Optical phenomena lead to spectacular colors in nature and in biological materials. Scattering of light by the atmosphere is responsible for the colors of the sky, beautiful sunsets and sunrises, twilight, the blue moon, and urban glows. Dispersion and polarization cause rainbows and halos. Interference gives rise to the beautiful colors of thin films on water, foam, bubbles, and some biological colors (butterflies). Diffraction is responsible for the colors of liquid crystals, coronae, the color of gems (opal), the color of animals (sea mouse), the colors of the butterfly Morpho rhetenor, and diffraction gratings.

Table 1 Approximate wavelength ranges of different spectral colors
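The bandgap-to-color relation invoked in item 4 can be made quantitative: an emitted photon carries roughly the bandgap energy E_g, so λ = hc/E_g, or λ[nm] ≈ 1240/E_g[eV]. A minimal sketch (the bandgap figures below are rough illustrative room-temperature values, not taken from this article):

```python
# Convert a semiconductor bandgap (eV) to the emission wavelength (nm)
# of an LED or diode laser: lambda = h*c / E_g.
H_C_EV_NM = 1239.84  # h*c expressed in eV*nm

def bandgap_to_wavelength_nm(e_gap_ev: float) -> float:
    """Peak emission wavelength (nm) for a direct-gap emitter."""
    if e_gap_ev <= 0:
        raise ValueError("bandgap must be positive")
    return H_C_EV_NM / e_gap_ev

# Illustrative, approximate bandgaps (assumed values for demonstration):
for name, eg in [("GaAs", 1.42), ("red LED material", 1.9), ("GaP", 2.26)]:
    print(f"{name}: E_g = {eg} eV -> {bandgap_to_wavelength_nm(eg):.0f} nm")
```

The rule of thumb makes the article's point concrete: a wider bandgap pushes the emission toward shorter (bluer) wavelengths.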
because of their shapes. Typically, there are about 7 million cones and 110–125 million rods. Cones are activated during photopic (daylight) conditions and the rods during scotopic (night) conditions. Cones are responsible for color perception whereas rods can perceive only white and gray shades. Cones are predominantly located near the fovea, the central region of the retina. When light falls on the rods and the cones, the photosensitive pigments in them (rhodopsin in the case of the rods) are activated, generating an electrical signal which is further processed in the visual system and transmitted to the visual cortex. Considerable data processing takes place within the retina.

Color Perception

Subjective nature of color
Color is a perception and is subjective. There is no way one can define color in an absolute or quantitative manner, even though it can be precisely characterized by the spectral power distribution (SPD) of the radiation. For monochromatic light of different wavelengths we see the spectral colors detailed in Table 1. In general, the SPD of light from an object is complex, and predicting the exact color of such a spectrum is quite difficult. However, in the following we discuss some simple color schemes.

Additive color mixing
Newton's experiments demonstrated that, using the three primary color light sources, blue, green, and red, and by varying their relative intensities, one can generate most of the colors by additive color mixing. Additive color mixing is employed in stage lighting and large-screen color television projection systems. If we are concerned only with the hue and ignore saturation and brightness, the superposition of colored light beams of equal brightness on a white screen gives the following simple additive color scheme:

Green + Red ⇒ Yellow
Blue + Red ⇒ Magenta
Blue + Green ⇒ Cyan
Blue + Green + Red ⇒ White

Subtractive color mixing
When white light passes through, say, a red filter, the radiation corresponding to red is transmitted and all other parts of the visible spectrum are absorbed. If we insert more red filters, the brightness of the transmitted light becomes less, though the hue is likely to be more saturated. If you insert red and blue filters together in a white beam, you will see no light transmitted, because the red filter blocks the entire spectrum except the radiation corresponding to red, which in turn is blocked by the blue filter.

The subtractive color scheme is also applicable when paints of various colors are mixed together. The subtractive primary colors are magenta, yellow, and cyan. Artists often call them red, yellow, and blue. By appropriately mixing the primary colored paints one can, in principle, generate most of the colors.

How many colors can we differentiate?
If we do a simple experiment beginning from the lowest wavelength of the visible spectrum, say 380 nm, and gradually move to the long wavelength end of the visible spectrum, one can differentiate about 150 spectral colors. For each spectral color there will be a number of colors that vary in brightness and saturation. Additionally, we can have additive mixtures of these spectral colors, which in turn have variations in brightness and saturation. It is estimated that under optimal conditions we can differentiate as many as 7 to 10 million different colors and shades.

Temporal Response of the Human Visual System

Positive afterimages
The visual system responds to changes in the stimulation in time. However, the response of the visual system is both delayed and persists for some time after the stimulation is off. Because of this persistence of vision, one sees positive afterimages. The afterimages last about 1/20 s under normal conditions but are shorter under bright light conditions. A rapid succession of images thus results in a perception of continuous motion, which is the basis for movies and TV.

Close your eyes for about five minutes so that any afterimages of the objects you have seen before are cleared. Open your eyes to see a bright colored object and then close your eyes again; you will see positive afterimages of the object in its original colors. Even though the positive afterimages at the initial stages will be of the same colors as the original, they gradually change color, showing that different cones recover at different rates.

Negative afterimages
If you stare at a color picture intensely for about a minute and then look at a white background, such as a sheet of white paper, you will see the complementary colors of the picture. The image of the object falls on a particular part of the retina, and if we look at a green colored object, the cones that are more sensitive to the mid-wavelength region are activated and are likely to be saturated. If you look at a bright sheet of white paper immediately, the cones that are sensitive to the middle wavelength region are inactivated, since they need some time to replenish the active chemical, whereas the cones that have maximum sensitivity for the low and high wavelengths are active; therefore you will see the object in complementary colors. Yellow will be seen after looking at a blue image, cyan after looking at a red image, and vice versa.

Spatial Resolution of the Human Visual System

Lateral inhibition (color constancy)
By and large the perceived color of an object is independent of the nature of the illuminating white light source. The color of the object does not change markedly under different white light sources of illumination. For example, we see basically the same color of the object in the daytime in the light of the blue sky or under an incandescent light source, even though the spectral power distributions of these sources are different and the intensity of light reaching the eye in the two cases is also drastically different. This color constancy is a result of chromatic lateral inhibition. The human visual system has a remarkable way of comparing the signals from the object, manipulating them with the signals generated from the surroundings, and picking up the correct color independent of the illuminating source. Color constancy and simultaneous lightness contrast are due to lateral inhibition. Lateral inhibition was exploited by a number of artists to provide certain visual effects that would not have been possible otherwise. For example, Georges Seurat in his famous painting 'La Poseuse en Profil' improved the light intensity contrast in different regions of his painting by employing edge enhancement effects resulting from lateral inhibition. Victor Vasarely's artwork 'Arcturus' demonstrates the visual effects of lateral inhibition in different colors.

Spatial assimilation
The color of a region assimilates the color of the neighboring region, thus changing the perception of its color. Closely spaced colors, when viewed from a distance, tend to mix partitively and are not seen as distinct colors but as a uniform color due to additive color mixing. Georges Seurat in his painting 'Sunday Afternoon on the Island of La Grande Jatte' uses thousands of tiny dots of different colors to produce a variety of color sensations, a technique which is known as pointillism. The visual effects of spatial assimilation were clearly demonstrated by Richard Anuszkiewicz in his painting 'All Things Do Live in the Three'.

Color Perception Models

Thomas Young, in 1801, postulated that there are three types of sensors, thus proposing trichromacy to explain the three attributes of color perception: hue, saturation, and brightness. Data on microspectrophotometry of excised retinas, reflection densitometry of normal eyes, and psychophysical studies of different observers confirm the existence of three types of cones. As shown in Figure 1, the three types of cones have different responsivities to light. The cones that have maximum responsivity in the short wavelength region are often referred to as S-cones, the cones that have their maximum in the intermediate wavelength range as I-cones, and the cones that have their maximum in the long wavelength region as L-cones.

The three response curves have considerable overlap. The overlap of the response curves and the complex data processing that takes place in the visual system enable us to differentiate as many as 150 spectral colors and about seven to ten million different colors and shades. An important consequence of the existence of only three types of color sensors is that different spectral power distributions can produce identical stimuli, resulting in the perception of the same color. Such spectral power distributions are called metamers.

The neural impulses generated by the cones are processed through a complex cross-linking of bipolar cells, horizontal cells, and amacrine cells, and communicated to ganglion cells leading to the optic chiasma through the optic nerve. Information gathered by about 100 million photoreceptors, after considerable processing, is transmitted by about a million ganglion cells.

Figure 1 Relative spectral absorption of the three types of cones. S, I, and L respectively stand for the cones that have maximum responsivity for the short, intermediate, and long wavelength regions.
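The additive color scheme given earlier (Green + Red ⇒ Yellow, etc.) treats each primary as an on/off channel; it can be sketched as a toy model that, as the text stipulates, ignores saturation and brightness:

```python
# Toy additive mixing: each beam either contains a primary or not,
# and superposition turns a channel on if any beam supplies it.
PRIMARIES = {"red": (1, 0, 0), "green": (0, 1, 0), "blue": (0, 0, 1)}
HUE_NAMES = {
    (1, 0, 0): "red", (0, 1, 0): "green", (0, 0, 1): "blue",
    (1, 1, 0): "yellow", (1, 0, 1): "magenta", (0, 1, 1): "cyan",
    (1, 1, 1): "white",
}

def mix(*beams: str) -> str:
    """Superpose equal-brightness primary beams and name the resulting hue."""
    channels = [PRIMARIES[b] for b in beams]
    combined = tuple(max(ch[i] for ch in channels) for i in range(3))
    return HUE_NAMES[combined]

print(mix("green", "red"))          # yellow
print(mix("blue", "green", "red"))  # white
```

Subtractive mixing would instead take the minimum of each channel, which is why stacking a red and a blue filter transmits nothing.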
The zone theory of color vision assumes two separate zones where the signals are sequentially processed. In the first zone the signals are generated by the cones, and in the second zone the signals are processed to generate one achromatic and two opponent chromatic signals. Possibly other zones exist where the signals generated in the first and the second zones are further processed and communicated to the visual cortex, where they are processed and interpreted based on the temporal and spatial response and comparisons with the information already available in the memory.

Measuring Color and Colorimetry

Since color is a human perception, how do we measure color? Colorimetry is based on the human visual system and how different observers perceive color. Though the spectral power distribution of light is helpful for characterizing spectral hues, it is very difficult to infer the color from the spectral power distribution.

The three attributes of color perception are hue, saturation, and brightness. Hue is the color, such as red, green, or blue. Saturation refers to the purity of the color: a color is said to be more saturated if the whiteness in the color is less, and vice versa. Brightness is related to the intensity of light. For the sun at sunrise the hue is red, saturated, and of low brightness. About half an hour after sunrise, the hue is yellow, less saturated, and of higher brightness. Though a variety of color schemes are available in the literature, in the following we discuss the Munsell system and the C.I.E. diagram only.

Munsell Color System

Albert Munsell, a painter and art teacher, devised a color atlas in 1905 by arranging different colors in an ordered three-dimensional space. He used the three color attributes hue, chroma, and value, corresponding to hue, saturation, and brightness. In this system there are 10 basic hues, each of which is divided into 10 equal gradations, resulting in 100 equally spaced hues. Each hue has a chart with a number of small chips arranged in rows and columns. Under daylight illumination, the chips in any column are supposed to have colors of equal chroma and the chips in any row are supposed to be of equal brightness. The brightness increases from the bottom of the chart to the top in steps that are perceptually equal. The saturation of the color increases from the inside edge to the outside edge of the chart in steps that are also perceptually equal. Each chip is identified by hue, value/chroma. In the Munsell scale, black is given a value of 0 and white 10, and the nine grays are uniformly placed in between. Though the Munsell system has gone through many revisions, typically it has a total of about 1450 chips. The Munsell color scheme, though subjective in nature, is widely used because of its simplicity and ease.

The Commission Internationale de l'Eclairage (C.I.E.) Diagram

Even though the visible light reaching the eye has a very complex SPD, color perception is mostly dependent on the signals generated by the three different types of cones. However, we do not know precisely the absorption and sensitivity of the cones and therefore are not in a position to calculate the signal strengths generated by the three different cones. Additionally, the response of the cones is specific to the individual. Therefore, the C.I.E. system is based on the average response of a large sample of normal observers. The C.I.E. system employs three color matching functions x̄(λ), ȳ(λ), and z̄(λ) (Figure 2), based on the psychological observations of a large number of standard observers, to calculate the 'tristimulus values' X, Y, and Z defined as

X = c ∫ S(λ) x̄(λ) dλ  [1]

Y = c ∫ S(λ) ȳ(λ) dλ  [2]

Z = c ∫ S(λ) z̄(λ) dλ  [3]

Here, λ is the wavelength, S(λ) the spectral power distribution, and c the normalization constant. The integration is carried out over the visible region, normally 380 to 780 nm. The system is based on the assumption of additivity and linearity. The sum of the tristimulus values (X + Y + Z) is normalized to 1. We choose ȳ(λ) so that the Y tristimulus value is proportional to the luminance, which is a quantitative measure of the intensity of light leaving the surface. We further calculate x and y, which are called chromaticity coordinates:

x = X / (X + Y + Z)  [4]

y = Y / (X + Y + Z)  [5]

The psychological color is then specified by the coordinates (Y, x, y). The relationship between colors is usually displayed by plotting the x and y values on a two-dimensional Cartesian coordinate system with z as an implicit parameter (z = 1 − x − y), which is known as the C.I.E. chromaticity diagram (Figure 3). In this plot, the monochromatic hues are on the perimeter of a horseshoe-shaped curve called the spectrum locus. A straight line joining the ends of the spectrum is called the line of purples. Every chromaticity is represented by means of two coordinates (x, y) in the chromaticity diagram. The point corresponding to x = 1/3 and y = 1/3 (also z = 1/3) is the point of 'equal energy' and represents the achromatic point. Complementary colors lie on opposite sides of the achromatic point. The complementary wavelength of a color is obtained by drawing a line from the color through the achromatic point to the perimeter on the other side.

Figure 2 Color-matching functions x̄(λ), ȳ(λ), and z̄(λ) of the C.I.E. 1931 standard observer.

Continuum (white) sources
A perfect white source is one that has a constant SPD over the entire visible region. Such a source will not distort the color of the objects it illuminates. A number of sources are in general considered as white sources; they include incandescent lamps, day skylight, and a variety of fluorescent lamps. It should be noted that the SPD of these sources is not constant and varies with the source. Typical SPDs of commonly used light sources are given in Figure 4.

Color temperature
The thermal radiation emitted by a black body has a characteristic SPD which depends solely on its temperature. The wavelength λ of the peak of this distribution is inversely proportional to its absolute temperature. As the absolute temperature of the blackbody increases, the peak shifts towards shorter wavelengths (blue region) and the width of the distribution decreases. Since the peak of the blackbody radiation depends on the absolute temperature, the color of the blackbody can be approximately defined by its temperature.

The color temperature of a body is that temperature at which the SPD of the body best matches that of the blackbody radiation at that temperature. The concept of color temperature is often used to characterize sources. Low color temperature sources have a reddish tinge and high color temperature sources tend to be slightly bluish. For example, the color temperature of the red-looking star Antares is about 5000 K and that of the bluish white looking Sirius is about 11 000 K. One should note that the color temperature of a body is different from its actual temperature. The fluorescent lamps have a
COLOR AND THE WORLD 185
Figure 3 Chromaticity diagram of the C.I.E. 1931 standard observer. The chromaticities of the incandescent sources and their color temperatures are given inside the curve. A, B, and C are the C.I.E. standard illuminants. The chromaticities of the helium-neon laser (red, 633 nm) and the argon-ion laser (green, 515 nm; blue, 488 nm; and indigo, 458 nm) are also shown.
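The geometric construction described above — drawing a line from a color through the achromatic point (1/3, 1/3) to the far boundary — can be sketched numerically. The helper below is a hypothetical illustration only: finding the actual complementary wavelength would additionally require the tabulated spectrum-locus coordinates, which are not reproduced in this excerpt.

```python
# Sketch of the complementary-color construction on the CIE 1931
# chromaticity diagram. The achromatic ("equal energy") point is
# (x, y) = (1/3, 1/3); the complement of a color lies on the ray from
# the color through that point, extended to the opposite boundary.

ACHROMATIC = (1.0 / 3.0, 1.0 / 3.0)

def ray_through_achromatic(x, y, t):
    """Point at parameter t on the line from (x, y) through the
    achromatic point; t = 0 gives the color, t = 1 the achromatic
    point, and t > 1 points on the complementary side."""
    ax, ay = ACHROMATIC
    return (x + t * (ax - x), y + t * (ay - y))

# Example: a reddish chromaticity; for t > 1 the ray heads toward the
# blue-green part of the diagram, where the complement lies.
xc, yc = 0.60, 0.35
print(ray_through_achromatic(xc, yc, 0.0))  # the color itself
print(ray_through_achromatic(xc, yc, 1.0))  # the achromatic point
print(ray_through_achromatic(xc, yc, 2.0))  # on the complementary side
```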
Figure 5 Spectral irradiance (kW m⁻² μm⁻¹) of the solar radiation measured above the atmosphere and at the Earth's surface at sea level on a clear day with the sun at the zenith.
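The inverse relation quoted earlier between blackbody peak wavelength and absolute temperature is Wien's displacement law; it is easy to evaluate for the color temperatures mentioned in the text. The sketch below assumes the standard value of Wien's constant, b ≈ 2.898 × 10⁻³ m K, which is not stated in this excerpt.

```python
# Wien's displacement law: the peak wavelength of a blackbody's SPD is
# inversely proportional to its absolute temperature.
WIEN_B = 2.898e-3  # m·K (assumed value of Wien's displacement constant)

def peak_wavelength_nm(T_kelvin):
    """Peak wavelength (in nm) of blackbody radiation at temperature T."""
    return WIEN_B / T_kelvin * 1e9

print(round(peak_wavelength_nm(5000)))   # Antares, ~5000 K  → 580
print(round(peak_wavelength_nm(11000)))  # Sirius, ~11 000 K → 263
```

The hotter star peaks at a much shorter wavelength, consistent with the shift of the blackbody peak toward the blue end of the spectrum as temperature rises.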
the radiation. According to Lord Rayleigh's theory of scattering, the scattering probability varies inversely as the fourth power of the wavelength of the radiation. If we consider the radiation corresponding to blue (400 nm) and the radiation corresponding to, say, red (700 nm), the photons corresponding to blue radiation are scattered about 9.4 times more than the photons corresponding to the red radiation. During noontime, when we look at the sky, not directly at the sun, the sky looks unsaturated blue because of the predominant scattering experienced by the high-frequency radiation corresponding to blue. However, it should be noted that the daytime skylight contains all the spectral colors of the solar radiation, though the spectral power distribution has changed, resulting in an increase in the intensity of the high-frequency (blue) radiation. During noontime at the horizon, the sky looks close to white with a bluish tinge, because the solar radiation has passed through many air masses. As the thickness of the atmosphere increases, the intensity of the blue radiation decreases and ultimately, as the radiation reaches the horizon, all the spectral colors are scattered to about the same level, resulting in a white or bluish-white sky.
2. Polarization of the solar radiation by the atmosphere. Light scattered by a molecule is polarized. The blue skylight is polarized, with a maximum polarization of about 75% to 85% at 90° from the sun.
3. Distant mountains seem bluish in color. When an artist wishes to paint a distant mountain in perspective, he/she paints it blue; why? This is called airlight. The short-wavelength part of the solar radiation is scattered more prominently, giving the mountain a blue cast when observed from a distance. However, it should be mentioned that distant mountains may take on any color, depending on the scattering and absorption of the solar radiation, sometimes resulting in 'Purple Mountain's majesty.'
4. The sun looks red at sunrise and sunset. At sunset, as the sun approaches the horizon, the sun changes its color from dazzling white to yellow, orange, bright red, and dull red; the order is reversed at sunrise. At sunset and sunrise, the solar radiation travels through many airmasses, compared to approximately one airmass when the sun is at the zenith. Because of the large mass of air the radiation has to pass through, first the blue part of the spectrum is removed and the transmitted green to red spectral colors give the appearance of yellow. As this spectrum passes through more airmass, green is scattered most, leaving spectral yellow, orange, and red. As the radiation passes further through air, yellow and orange are also scattered, leaving only red.
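The λ⁻⁴ dependence in item 1 can be checked directly; the factor of about 9.4 between blue (400 nm) and red (700 nm) scattering follows from the wavelength ratio alone:

```python
# Rayleigh scattering: the scattering probability varies as 1/lambda^4,
# so the blue-to-red scattering ratio is (700/400)^4.
def rayleigh_ratio(lambda_short_nm, lambda_long_nm):
    """Relative scattering strength of the shorter wavelength."""
    return (lambda_long_nm / lambda_short_nm) ** 4

print(rayleigh_ratio(400, 700))  # → 9.37890625, i.e. about 9.4
```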
coming from different waves and different parts of the waves results in glitter. Since you are looking at the reflected light, it will be polarized.
4. Sky pools and landpools. The light coming from a wavy water surface of a lake consists of oval-shaped bright colored patterns that continually change in shape. This results from the reflection of skylight by the wavy water. If the waves are not turbulent, the reflected light reaching your eye continually changes, resulting in bright colored patterns. Landpools are caused by a nearby sloping landscape and the reflected light pattern from the landscape.
5. Thin films. Oil floats on water and spreads into thin films because of its low surface tension. When viewed in reflected natural white light, the light reflected from the top surface of the oil and the light reflected from the interfacial surface of oil and water interfere to produce colorful patterns. Brilliant colored patterns are commonly seen on roads after rain because of gasoline on water. Foam ordinarily consists of bubbles ranging typically from 10⁻⁴ mm to several mm. Even though each bubble produces colorful light due to interference effects, the light coming out of the foam adds up to white light.
6. Wet surfaces look darker. A water spot on a concrete road looks darker. Since concrete is not porous, a thin layer of water sticks to the surface. At the water surface, part of the incident light is reflected and a large part is transmitted. The transmitted light is diffusely scattered, part of which undergoes total internal reflection, and only a small part reaches the eye, making the surface look darker.
   When a fabric is wet it looks darker because of a slightly different mechanism. The surface of the fabric gets coated with a thin layer of water (refractive index = 1.33), which is smaller than the refractive index of the fabric. When the cloth is wet, more light is transmitted and less gets reflected than before. The transmitted light penetrates the fabric and gets scattered, resulting in a darker appearance.
7. Rainbows. Water, with a refractive index of 1.33, is a dispersive medium. White light, consisting of different wavelengths, travels at different speeds in the medium and emerges in different directions. If the dispersion is adequate, different spectral colors can be seen by the naked eye. A dew-covered lawn on a sunny day displays the spectral colors. If you stand close to the dew drop without blocking the sun's radiation falling on the drop, at some angle one should see the spectral colors. The primary rainbow is produced by the water droplets in the atmosphere after rain. The radius of the primary rainbow is about 42°, with a width of about 2°. The color of the primary rainbow runs from blue to red, outward from the center of the rainbow. A rainbow is observed at the antisolar point. The brightness of the rainbow depends on the ambient sunlight, the size of the droplets, and whether the sun is low or high in the sky. The secondary rainbow, which is fainter, is often seen outside the primary rainbow at about 51°. The secondary rainbow runs from red to blue, outward from its center, and is approximately half as bright as the primary rainbow. The secondary rainbow is caused by two total internal reflections within the droplet, whereas the primary rainbow is the result of only one total internal reflection. Because of the total internal reflection process, light is about 95% tangentially polarized in the primary rainbow.
8. Heiligenschein. Heiligenschein, or the 'holy light', can be seen around the shadow of your head on a lawn on a sunny morning when dew drops are present. Usually, the dew drops are held by the tiny hairs of the grass at some distance from the blade of the grass. As light falls on the droplet, the spherical drop converges the light at its focus and, if the grass blade is located at that point, the light gets retroreflected approximately towards the sun, resulting in the holy light around your shadow.
9. Coronae. Colored concentric patchy rings seen around the moon and the sun are called coronae. They are primarily produced by the water droplets in thin clouds. Even though the water droplets are randomly located, the diffracted light from the droplets has a defined angle, and the superposition of the diffraction patterns from different water droplets results in colored ring structures. Coronae can be easily seen around the full moon viewed through thin clouds.
10. The Glory. On a foggy day with the sun at your back, one may see bright colored rings in the form of arcs of about 5° to 10° at the antisolar point. The light in the glory is polarized, and the ring structure is produced by reflection of the solar radiation by the fog.
11. Clouds. Clouds contain suspended water droplets or ice crystals. The size of the droplets ranges from less than a micrometer to about 100 micrometers (10⁻⁶ m). When the air is supersaturated with water vapor, the water condenses into water droplets, resulting in a definite shape to the cloud. When the sun's radiation falls on
the cloud, the light is scattered by the water droplets. Even though one would expect the scattered light to have some color distribution, since we look at the scattered light from a very large number of droplets, the summation of all the colors results in white. Some clouds look dark or gray because either they are in the shadow of other clouds or they are so thick that most of the light is absorbed. During sunset and sunrise, the clouds look bright orange because of the colored light illuminating them. Even though all clouds are basically white, they may appear colored because of the selective absorption of certain colors in the atmosphere or due to the colored background radiation. Often you see a silver lining at the edges of a cloud because at the edges the density of water is low; when the sun illuminates the cloud from the back, the radiation is scattered in the forward direction quite efficiently, resulting in a bright and shiny 'silver lining' to the cloud.
12. Lightning. Lightning is caused by an intense electrical discharge due to the high potential differences developed. On a cloudy and rainy day, as the water droplets grow in size, the falling drops get distorted to an ellipsoidal shape and get polarized, the top smaller portion becoming negative and the lower larger part becoming positive. As the deformation increases, the drop ultimately breaks, the smaller drops being negatively charged and the larger ones positively charged. The wind currents easily lift the buoyant small drops to higher altitudes, and the heavier, positively charged droplets fall towards the Earth. The final result is that the upper region of the cloud is negatively charged whereas the lower region of the cloud is positively charged. The portion of the Earth under the cloud becomes negatively charged due to induction. The localized ions in the positively charged cloud produce a strong electric field, which is further augmented by the induced negative charges on the Earth below. The strong electric field ionizes the air molecules, setting up a current discharge to the Earth. This discharge is called the stepped leader. Following the stepped leader, the charges in the upper region of the cloud also discharge, resulting in what is called the return stroke. The sequence of leader and return strokes occurs typically about ten times in a second. Since the clouds are charged and they are close to each other, lightning is more frequent between oppositely charged clouds than between a cloud and the Earth. It is estimated that lightning between clouds is about five times more frequent compared to the lightning events between a cloud and the Earth. In this process, the nitrogen and oxygen molecules in air are excited and ionized. The flash is produced by the de-excitation of these molecular species. Since the temperatures involved are quite high (30 000 K), the flash has a strong unsaturated bluish hue. Some hydrogen lines are also observed in the flash. Hydrogen is produced as a result of the dissociation of water molecules during the discharge process.
13. Ice and Halos. Snow reflects most of the light falling on it, with a reflectivity of about 95%. Snowflakes are crystalline and have a hexagonal structure. In nature they occur in assorted sizes and shapes. As white light falls on them, light may be simply reflected from their surfaces, making them shine white; these reflections are often called glints. The light may also pass through the crystalline material, get dispersed, and emerge with different colors; these are known as sparkles.
   The halos appear as thin rings around the sun or the moon. They are produced because of the minimum deviation of light in snow crystals. The 22° halo is most common and is often seen a number of times in a year. The halos are circular and have a faint reddish outer edge and a bluish white diffused spot at the center. They often look white because of the superposition of different colors, resulting in a more or less white color.

Color in Art and Holography

Sources of Color

There are two major color sources: dyes and paint pigments. Dyes are usually dissolved in a solvent. The dye molecules have selective absorption, resulting in selective transmission. Paint pigments are powdered and suspended in oil or acrylic. The paint particles reflect part of the radiation falling on them, selectively absorb part of the radiation, and transmit the balance. Lakes are made of translucent materials, such as tiny grains of alumina, soaked in a dye of appropriate color.
In the case of watercolors and printer's inks, part of the incident light is reflected from the surface and the remainder transmitted, which is selectively absorbed by the dye. The transmitted light reflected by the paper passes through the dye again and emerges. Each transmission produces color by a subtractive color scheme, and the emerging light is mixed partitively by the observer.
In the case of paints, the color depends on the size and concentration of the pigment particles and the
medium in which they are suspended. Part of the light incident on the artwork gets reflected at the surface and the remainder is selectively absorbed and transmitted, which in turn is further reflected and transmitted by other pigment particles, each time following a subtractive color scheme. The light reaching the canvas gets reflected, passes through the medium, and emerges. Again, all the emerging beams are partitively mixed by the observer.
Holograms may be broadly categorized as transmission and reflection holograms. Both transmission and reflection (volume) holograms are prepared by producing the interference pattern of the reference and object laser beams. Transmission holograms can be viewed in the presence of the reference laser beam that was employed for their preparation. Spectacular three-dimensional colored images are reconstructed by reflection holograms, depending on the angle of incidence of the white light and the direction of observation. Embossed holograms are popular for advertising, security applications (credit cards), display, and banknotes. Some studios prepare three-dimensional holographic portraits that can be viewed in natural light. In the future, three-dimensional videos and even holographic three-dimensional movies may be a reality.

Structural Colors in Nature

In the case of the butterfly Morpho rhetenor, the metallic blue color is produced by the periodic structure of its wings. Its wings have layers of scales, each about 200 μm long and 80 μm wide, with about 1300 thin parallel structures per mm, which form a diffraction grating.
Bird feathers viewed through an optical microscope show the structural components which are responsible for their coloration. The structural details of the peacock's eye pattern, examined by scanning electron microscopy, show photonic structures on the peacock barbules. Barbules are the strap-like branches on the peacock feathers. The four principal colors on the peacock feathers, blue, green, yellow, and brown, were identified with specific structures. The interference between the light reflected from the front surface and that reflected from the back was shown to be responsible for the brown color in the peacock's eye. The photonic structure of the green barbule has a periodic structure of melanin cylinders with a spacing of about 150 nm, whereas the blue barbule has a periodic structure of 140 nm, and yellow barbules have a periodicity of 165 nm.

See also

Dispersion Management. Holography, Techniques: Color Holography. Photon Picture of Light. Polarization: Introduction. Scattering: Scattering Theory. Nanoscale Photonic Lattices.
DETECTION

Contents
Fiber Sensors
Heterodyning
Image Post-Processing and Electronic Distribution
Smart Pixel Arrays
Interferometric Sensors
Interferometric fiber sensors operate on the principle
of interference between two or more light beams to
convert phase differences to intensity changes. Common configurations used include Michelson, Mach-Zehnder, and Sagnac interferometers, polarimetric
systems, grating and etalon-based interferometers
and ring resonators. Interferometric fiber sensors
have extremely high sensitivity and are able to resolve path differences of the order of 10⁻⁶ of the light

Figure 3 (a) Transmission and (b) reflective coupling-based fiber sensors.
DETECTION / Fiber Sensors 193
Figure 6 Michelson interferometric strain sensor. Part of the signal arm is embedded in a material, such as concrete (in civil
engineering structures) or carbon composite (used in the aerospace industry).
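All of the interferometric sensors described here convert phase to intensity through the two-beam interference transfer function. A minimal sketch, assuming equal-intensity arms and unit fringe visibility (idealizations not spelled out in the text):

```python
import math

# Two-beam interference: two equal beams of intensity I0/2 with relative
# phase phi combine to give I = (I0/2) * (1 + cos(phi)). A measurand that
# perturbs the phase in one arm is thus converted to an intensity change
# at the detector.
def two_beam_intensity(I0, phi):
    """Detected intensity of a two-beam interferometer with input
    intensity I0 and relative phase phi (radians)."""
    return 0.5 * I0 * (1.0 + math.cos(phi))

print(two_beam_intensity(1.0, 0.0))      # beams in phase: maximum
print(two_beam_intensity(1.0, math.pi))  # beams out of phase: minimum
```

The cosine form also shows why a small phase change is easiest to detect when the interferometer is biased near quadrature (phi ≈ π/2), where the slope of the transfer function is steepest.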
where Δβ is the fiber (linear) birefringence and l the fiber length. Both Δβ and l can be changed by the measurand. A polarimetric strain sensor, using Hi-Bi fiber embedded in carbon fiber composite, is shown in Figure 9. Light from a linearly polarized He-Ne laser is launched into the Hi-Bi fiber through a half-waveplate and lens. The waveplate is used to rotate the plane of polarization of the light so that it is at 45° to the principal axes of the fiber. This ensures that half of the input light power is coupled to each of the fast and slow modes of the Hi-Bi fiber. The output light from the Hi-Bi fiber is passed through a polarization beamsplitter. The beamsplitter separates the light into two orthogonally polarized beams, which are then detected. The output from each of the detectors is given by V1 = I0 sin²φ and V2 = I0 cos²φ, respectively, where I0 is the input light intensity. The state of polarization (SOP) of the output light from the Hi-Bi
It is a simple matter to calculate the SOP, from which φ can be determined. The form of eqn [2] is very similar to the two-beam interferometer transfer function, shown in Figure 7.
A polarimetric electric current sensor based on the Faraday effect is shown in Figure 10. The Faraday effect provides a rotation of the light's polarization state when a magnetic field is parallel to the optical path in glass. If an optical fiber is closed around a current-carrying conductor, the Faraday rotation is directly proportional to the current. By detecting the polarization rotation of light in the fiber, the current can be measured.
Linearly polarized light is equivalent to a combination of equal-intensity right and left circularly polarized components. The linear state of polarization of the polarized laser light rotates in the presence of a magnetic field because the field produces
circular birefringence in the fiber. This means that right and left circularly polarized light will travel at different speeds and accumulate a relative phase difference given by

φ = VBl   [3]

where V is the Verdet constant (rad T⁻¹ m⁻¹) of the glass, B the magnetic flux density (T), and l the length of the fiber exposed to the magnetic field. In the polarimetric current sensor, the polarization rotation is converted to an intensity change by the output polarizer. The output at the detector is proportional to 1 + sin(2φ), from which φ and the current can be determined.
The linear birefringence of conventional silica fiber is much greater than its circular birefringence. To make a practical current sensor, the linear birefringence must be removed from the fiber, and this can be achieved by annealing the fiber. Annealing involves raising the temperature of the fiber, during manufacture, to a temperature above the strain point for a short period of time and slowly cooling back to room temperature. This reduces stresses in the glass, which are the principal cause of linear birefringence, and also causes physical and chemical changes to the glass. Waveguide-induced birefringence cannot be removed by annealing, but can be significantly reduced by twisting the fiber.
White light, or more accurately low-coherence, interferometry utilizes broadband sources such as LEDs and multimode lasers in interferometric measurements. Optical path differences (OPDs) are observed through changes in the interferometric fringe pattern. A processing interferometer is required, in addition to the sensing interferometer, to extract the fringe information.
The processing of white light interferometry signals relies on two principal techniques. The first technique involves scanning the OPD of the processing interferometer to determine regions of optical path balance. The second technique involves determination of the optical spectrum using an optical spectrum analyzer. The fringe spacing of the resulting fringe pattern is then related to the OPD of the sensing interferometer.
An example of the optical path scanning technique is shown in Figure 11a. The sensing interferometer is designed such that its OPD is much greater than the coherence length of the light source, so at its output no interference fringes are observed. The output of the sensing interferometer is fed to the processing interferometer. The OPD of the processing interferometer is scanned using a piezoelectric fiber stretcher driven by a suitable scanning voltage. As the processing interferometer is scanned, interference fringes are observed at two distinct points, as shown in Figure 11b. The first set of fringes occurs when the OPD of the processing interferometer is within the coherence length of the optical source (close to zero). The second set of fringes occurs when the OPDs in the two interferometers are equal. The OPD in the sensing interferometer can be determined by measuring the OPD in the processing interferometer between the two positions of maximum visibility in the output signal from the detector. This, in turn, can be related to OPD changes due to the action of the measurand, in this case strain.
Sagnac interferometric sensors can be used to create highly sensitive gyroscopes to sense angular velocity (e.g., in aircraft navigation systems). They are based on the Sagnac effect: rotation of the fiber coil introduces a phase difference between light beams counter-propagating around the coil. A basic open-loop fiber gyroscope is shown in Figure 12. Light from the broadband source (e.g., a superluminescent diode) is split into two counter-propagating beams traveling in the clockwise and anticlockwise directions. The polarizer is used to ensure the reciprocity of the counter-propagating waves. The inherent nonlinear response of the gyroscope can be overcome by using a phase modulator and signal processing techniques. In the ideal case the detector output is proportional to 1 + cos φs. The Sagnac phase shift φs between the
two returning beams is given by

φs = 8πANΩ / (cλ0)   [4]

where A is the area enclosed by a single loop of the fiber, N the number of turns, Ω the component of the angular velocity perpendicular to the plane of the loop, λ0 the free-space wavelength of the optical source, and c the speed of light in a vacuum. Sensitivities greater than 10⁻⁸ rad/s and dynamic ranges in excess of 40 dB have been achieved with open-loop fiber gyroscopes. More advanced gyroscopes can greatly improve on this performance.

Fiber Grating Sensors

Many types of fiber gratings can be used in sensing applications, including Bragg, long-period, and
Figure 11 Low-coherence interferometric sensor. (a) Schematic diagram. (b) Output of the scanned processing interferometer.
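The reason fringes appear only near optical path balance with a broadband source — the basis of the scanned processing interferometer of Figure 11 — can be sketched by summing cosine fringes over a band of wavenumbers. The center wavenumber, bandwidth, and sample count below are arbitrary illustration values, not taken from the article:

```python
import math

# Low-coherence interference: summing fringes cos(k * OPD) over a band
# of wavenumbers k washes the fringes out once the OPD exceeds the
# coherence length (roughly 1/bandwidth), leaving high fringe contrast
# only near OPD = 0.
def fringe_term(opd_um, k0=8.0, dk=1.0, n=501):
    """Normalized interference term at a given OPD (in um) for a flat
    spectrum of n wavenumber samples centered on k0 with full width dk
    (rad/um). All values are illustrative."""
    ks = [k0 - dk / 2 + dk * i / (n - 1) for i in range(n)]
    return sum(math.cos(k * opd_um) for k in ks) / n

print(abs(fringe_term(0.0)))          # full contrast at zero OPD
print(abs(fringe_term(100.0)) < 0.1)  # fringes washed out far from balance
```

This is why the sensing interferometer, with an OPD far beyond the coherence length, shows no fringes on its own, and why fringes reappear when the scanned processing interferometer brings the net OPD back near zero.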
Figure 14 Quasi-distributed strain sensing using a wavelength division multiplexed array of FBGs.
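Each grating in a wavelength-division multiplexed array such as that of Figure 14 reflects at its own Bragg wavelength, λB = 2 n_eff Λ — a standard relation not derived in this excerpt. The effective index and grating periods below are illustrative assumptions only:

```python
# Bragg condition for a fiber Bragg grating: lambda_B = 2 * n_eff * period.
# In a WDM array, each FBG is written with a slightly different period so
# that each sensor occupies its own slice of the source spectrum.
N_EFF = 1.45  # assumed effective index of the fiber core mode

def bragg_wavelength_nm(period_nm, n_eff=N_EFF):
    """Bragg (reflection) wavelength in nm for a grating period in nm."""
    return 2.0 * n_eff * period_nm

# Three illustrative grating periods give three distinct WDM channels:
for period in (530.0, 532.0, 534.0):
    print(round(bragg_wavelength_nm(period), 1))  # 1537.0, 1542.8, 1548.6
```

Strain or temperature applied to one grating shifts only that grating's λB, so the measurand at each location can be read out independently from the wavelength of its reflection peak.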
When the cavity is subject to weak dynamic strain, the laser output is frequency modulated. This frequency modulation can be detected using interferometric techniques. The main advantage of the FBG laser sensor over direct interferometry is that it is possible to obtain comparable strain sensitivities using a much shorter length of fiber.
Long-period fiber gratings (LPFGs) are attracting much interest for use in sensing applications. They are more sensitive to measurands than FBGs and easier to manufacture. A typical LPFG has a length of tens of mm with a grating period of hundreds of μm. Its operation differs from that of an FBG in that coupling occurs between the forward-propagating core mode and co-propagating cladding modes. The high attenuation of the cladding modes results in a series of minima occurring in the transmission spectrum of the fiber. This means that the spectral response is strongly influenced by the optical properties of the cladding and surrounding medium. This can be exploited for chemical sensing, as shown in Figure 16, where a broadband source is used to interrogate an LPFG. The wavelength shifts of the output spectrum minima can be used to determine the concentration of particular chemicals in the substance surrounding the grating. The longest-wavelength attenuation bands are the most sensitive to the refractive index of the substance surrounding the grating. This is because higher-order cladding modes extend a greater distance into the external medium. LPFGs can also be used as strain, temperature, refractive index, bend, and load sensors.

Fiber Laser Doppler Velocimeter

Laser Doppler velocimetry (LDV) is a technique used for measuring velocity, especially of fluid flows. A fiber LDV system and associated scattering geometry is shown in Figure 17. In LDV, two coherent laser beams intersect in a small measurement volume where they can interfere. The light reflected by a seed particle passing through the measurement volume is modulated at a frequency (the Doppler difference frequency Δf) proportional to the spatial frequency of the interference fringes and the component of its velocity
normal to the interference fringes. Δf is given by

Δf = (2nV cos β / λ0) sin(θ/2)   [6]

where n is the fluid refractive index, V the particle velocity, and λ0 the laser free-space wavelength. The output of the detector is processed to extract Δf and therefore V cos β. Δf is independent of the scattered light direction, so collection of the scattered light using a lens increases the system sensitivity. The form of eqn [6] indicates that the direction of flow cannot be ascertained. The simplest technique to resolve this ambiguity is to apply a frequency shift Δfs to one of the input beams. This can be achieved by the use of a piezoelectric frequency shifter. The frequency shift causes a phase shift to appear between the two beams. The phase shift increases linearly with time. This results in a fringe pattern of spacing s, which moves with constant velocity Vf = sΔfs. In this case the measured Δf will be less than or greater than Δfs, depending on whether the particle is moving with or against the fringe motion. There will then be an unambiguous detectable range of velocities from zero to Vf.

Luminescence-Based Fiber Sensors

Luminescence-based fiber sensors are usually based on fluorescence or amplified spontaneous emission occurring in rare-earth materials. They can be used in many applications, such as chemical, humidity, and temperature sensing. It is possible to connect a fiber to a luminescent material or to introduce luminescent dopants into the fiber. An example of the latter, used to detect chemical concentration, is shown in Figure 18. A laser pulse causes the doped section of the fiber to luminesce at a longer wavelength than the
laser. The luminescence intensity I(t) at the detector has an exponential decay profile given by

I(t) = I0 exp(−(k1 + k2C)t)   [7]

where I0 is the initial luminescence intensity, t time, k1, k2 constants, and C the chemical concentration. The luminescence time constant 1/(k1 + k2C) can be determined by comparing the luminescence intensity at various times after excitation by the laser pulse, from which C can be determined. The use of time-division multiplexing allows quasi-distributed measurement of chemical concentration. The use of plastic optical fiber for luminescence-based sensors is attracting much interest.

Distributed Fiber Sensing

We have seen that both point and quasi-distributed sensing are possible using fiber sensors. Distributed sensing can be achieved through the use of linear or nonlinear backscattering or forward-scattering techniques. In linear backscattering systems, light backscattered from a pulse propagating in an optical fiber is time resolved and analyzed to obtain the spatial distribution of the measurand field; e.g., polarization optical time-domain reflectometry analyzes the polarization state of backscattered light to determine the spatial distribution of electromagnetic fields. Nonlinear backscattering schemes use effects such as Raman or Brillouin scattering. An important example of the former is the anti-Stokes Raman thermometry system shown in Figure 19. A light pulse is transmitted down the sensing fiber. Spontaneous Raman scattering causes Stokes and anti-Stokes photons to be generated along the fiber. Some of these photons travel back along the fiber to a fast detector. The intensity of the Stokes line is temperature independent. The anti-Stokes line intensity is a function of temperature. The ratio of the two intensities provides a very accurate measurement of temperature. The measurement location is determined by the timing of the laser pulse.

See also

Environmental Measurements: Laser Detection of Atmospheric Gases. Fiber Gratings. Interferometry: Overview; Phase Measurement Interferometry; White Light Interferometry. Optical Materials: Smart Optical Materials.

Further Reading

Culshaw B and Dakin JP (1988, 1989, 1997) Optical Fiber Sensors, vols. I, II, III. Norwood, MA: Artech House.
Grattan KTV and Meggitt BT (1994, 1997, 1999, 1999) Optical Fiber Sensor Technology, vols. I, II, III and IV. Boston, MA: Kluwer Academic.
Grattan KTV and Meggitt BT (2000) Optical Fiber Sensor Technology. Boston, MA: Kluwer Academic.
Kersey AD, Davis MA, Patrick HJ, et al. (1997) Fiber grating sensors. Journal of Lightwave Technology 15: 1442-1463.
Krohn DA (1999) Fiber Optic Sensors: Fundamentals and Applications, 3rd edn. Research Triangle Park, NC: Instrumentation Society of America.
Lopez-Higuera JM (2001) Handbook of Optical Fiber Sensing Technology. New York: Wiley.
Othonos A and Kalli K (1999) Fiber Bragg Gratings: Fundamentals and Applications in Telecommunications and Sensing. Norwood, MA: Artech House.
Udd E (1991) Fiber Optic Sensors: An Introduction for Engineers and Scientists. New York: Wiley.
Udd E (1995) Fiber Optic Smart Structures. New York: Wiley.
Yu FTS and Yin S (2002) Fiber Optic Sensors. New York: Marcel Dekker.
DETECTION / Heterodyning 201
Heterodyning
T-C Poon, Virginia Polytechnic Institute and State University, Blacksburg, VA, USA

© 2005, Elsevier Ltd. All Rights Reserved.

Heterodyning, also known as frequency mixing, is a frequency translation process. Heterodyning has its roots in radio engineering. The principle of heterodyning was discovered in the late 1910s by radio engineers experimenting with radio vacuum tubes. The Russian cellist and electronic engineer Leon Theremin invented the so-called Thereminvox, one of the earliest electronic musical instruments, which generates an audio signal by combining two different radio signals. The American electrical engineer Edwin Armstrong invented the so-called super-heterodyne receiver. The receiver shifts the spectrum of the modulated signal, that is, the frequency contents of the modulated signal, so as to demodulate the audio information from it. Commercial amplitude modulation (AM) broadcast receivers nowadays are of the super-heterodyne type. After illustrating the basic principle of heterodyning with some simple mathematics, we will discuss some examples of how the principle of heterodyning is used in radio and in optics.

For a simple case, when signals of two different frequencies are heterodyned, or mixed, the resulting signal contains two new frequencies, the sum and the difference of the two original frequencies. Figure 1 illustrates how signals of two frequencies are mixed, or heterodyned, to produce the two new frequencies ω₁ + ω₂ and ω₁ − ω₂ by simply multiplying the two signals cos(ω₁t + θ₁) and cos(ω₂t), where ω₁ and ω₂ are the radian frequencies of the two signals and θ₁ is the phase angle between the two signals. Note that when the frequencies of the two signals to be heterodyned are the same, i.e., ω₁ = ω₂, the phase information of cos(ω₁t + θ₁) can be extracted to get cos θ₁ if we use an electronic lowpass filter (LPF) to filter out the term cos(2ω₁t + θ₁). This is shown in Figure 2. Heterodyning is often referred to as homodyning for the mixing of two signals of the same frequency.

For a general case of heterodyning, we can have a signal represented by s(t) with spectrum S(ω), given by the Fourier transform of s(t); when s(t) is multiplied by cos(ω₂t), the resulting spectrum is frequency-shifted to new locations in the frequency domain as:

F{s(t) cos(ω₂t)} = (1/2) S(ω − ω₂) + (1/2) S(ω + ω₂)   [1]

where F{s(t)} = S(ω) and F{·} denotes the Fourier transform of the bracketed quantity. The spectrum of s(t) cos(ω₂t), along with the spectrum of s(t), is illustrated in Figure 3. It is clear that multiplying s(t) by cos(ω₂t) is a process of heterodyning, as we have translated the spectrum of the signal s(t). This process is known as modulation in communication systems.

In order to appreciate the process of heterodyning, let us now consider, for example, heterodyning in radio. In particular, we consider AM. While the frequency content of an audio signal (from 0 to around 3.5 kHz) is suitable to be transmitted over a pair of wires or coaxial cables, it is, however, difficult to transmit in air. By impressing the audio information onto a higher frequency, say 550 kHz, one of the broadcast bands in AM radio, i.e., by

Figure 2 Heterodyning becomes homodyning when the two frequencies to be mixed are the same: homodyning allows extraction of the phase information of the signal.

Figure 3 Spectrum of s(t) and s(t) cos(ω₂t). It is clear that the spectrum of s(t) has been translated to new locations upon multiplying by a sinusoidal signal.
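The sum- and difference-frequency picture, and the lowpass extraction of cos θ₁ in the homodyne case, can be checked numerically. The sketch below (not from the original article; the frequencies and phase are illustrative) mixes two cosines, inspects the product spectrum, and then recovers the phase term by crude lowpass filtering:

```python
import numpy as np

fs = 10_000                              # sampling rate (Hz), illustrative
t = np.arange(0, 1.0, 1 / fs)
f1, f2, theta1 = 300.0, 100.0, 0.7       # illustrative frequencies and phase

# Heterodyning: the product of two cosines contains only the sum
# (f1 + f2) and difference (f1 - f2) frequencies.
mixed = np.cos(2 * np.pi * f1 * t + theta1) * np.cos(2 * np.pi * f2 * t)

spec = np.abs(np.fft.rfft(mixed))
freqs = np.fft.rfftfreq(len(t), 1 / fs)
peaks = freqs[spec > 0.25 * spec.max()]  # dominant spectral lines: 200 and 400 Hz

# Homodyning (f1 == f2): lowpass filtering the product extracts cos(theta1)/2.
homodyne = np.cos(2 * np.pi * f1 * t + theta1) * np.cos(2 * np.pi * f1 * t)
dc = homodyne.mean()                     # crude LPF: keep only the DC term
```

Here the mean over an integer number of periods plays the role of the electronic lowpass filter that removes the cos(2ω₁t + θ₁) term.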
optical signal intensity. This mode of photodetection is called direct detection or incoherent detection in optics.

Let us now consider the heterodyning of two plane waves on the surface of the photodetector. We assume an information-carrying plane wave, also called the signal plane wave, A_s exp{i[(ω₀ + ω_s)t + s(t)]} exp(−ik₀x sin φ), and a reference plane wave, A_r exp(iω₀t), also called a local oscillator in radio. The situation is shown in Figure 6.

Note that the frequency of the signal plane wave is ω_s higher than that of the reference signal, and that it is inclined at an angle φ with respect to the reference plane wave, which is normally incident on the photodetector. Also, the information content s(t) is in the phase of the signal wave. In the situation shown in Figure 6, the two plane waves interfere on the surface of the photodetector, giving the total light field ψ_t:

ψ_t = A_r exp(iω₀t) + A_s exp{i[(ω₀ + ω_s)t + s(t)]} exp(−ik₀x sin φ)   [4]

Again, the photodetector responds to intensity, giving the current

i ∝ ∫_S |ψ_t|² dx dy = ∫_{−a}^{a} ∫_{−a}^{a} [A_r² + A_s² + 2A_r A_s cos(ω_s t + s(t) − k₀x sin φ)] dx dy   [5]

where we have assumed that the photodetector has a 2a × 2a square area. The current can be evaluated to be:

i(t) ∝ 2a(A_r² + A_s²) + 4A_r A_s cos(ω_s t + s(t)) [sin(k₀a sin φ)/(k₀ sin φ)]   [6]

The current output has two parts: the DC current and the AC current. The AC current, at frequency ω_s, is commonly known as the heterodyne current.

Note that the information content s(t), originally embedded in the phase of the signal plane wave, has now been preserved and transferred to the phase of the heterodyne current. The above process is called optical heterodyning. In optical communications, it is often referred to as optical coherent detection. In contrast, if the reference plane wave is not used for the detection, we have incoherent detection. The information content carried by the signal plane wave would then be lost: for A_r = 0, the above equation gives only a DC current at a value proportional to the intensity of the plane wave, A_s².

Let us now consider some of the practical issues encountered in coherent detection. Again, the AC part of the current given by the above equation is the heterodyne current i_het(t), given by

i_het(t) ∝ A_r A_s cos(ω_s t + s(t)) [sin(k₀a sin φ)/(k₀ sin φ)]   [7]

We see that, since the two plane waves propagate in slightly different directions, the heterodyne current output is degraded by a factor of

sin(k₀a sin φ)/(k₀ sin φ) = a sinc(k₀a sin φ)

where sinc(x) = sin(x)/x. For small angles, i.e., sin φ ≈ φ, the current amplitude falls off as sinc(k₀aφ). Hence, the heterodyne current is at a maximum when the angular separation between the signal plane wave and the reference plane wave is zero, i.e., when the two plane waves propagate in exactly the same direction. The current goes to zero when k₀aφ = π, or φ = λ₀/2a, where λ₀ is the wavelength of the laser light. To see how critical it is for the angle φ to be aligned in order to have any heterodyne output, assume the size of the photodetector is 2a = 1 cm and the laser used is red, i.e., λ₀ ≈ 0.6 μm; φ is then calculated to be about 3.4 × 10⁻³ degrees. Hence, to be able to work with coherent detection, we need precise optomechanical mounts for angular rotation.
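The sinc-type falloff and the first-null angle φ = λ₀/2a can be checked numerically. The sketch below (the helper name is ours; the numbers are the example values quoted in the text) uses NumPy's normalized sinc:

```python
import numpy as np

# Heterodyne-current degradation factor sinc(k0*a*sin(phi)) from eqn [7],
# evaluated for the example in the text: 2a = 1 cm, red light.
wavelength = 0.6e-6          # lambda0 = 0.6 um
k0 = 2 * np.pi / wavelength
a = 0.5e-2                   # half-width of the 2a = 1 cm photodetector

def falloff(phi):
    """Normalized heterodyne amplitude sinc(k0*a*sin(phi)), sinc(x) = sin(x)/x."""
    x = k0 * a * np.sin(phi)
    return np.sinc(x / np.pi)          # numpy's sinc is sin(pi*x)/(pi*x)

phi_null = wavelength / (2 * a)        # first null at phi = lambda0 / 2a
phi_null_deg = np.degrees(phi_null)    # a few thousandths of a degree
```

The tiny value of `phi_null_deg` is what makes precise angular alignment mandatory in coherent detection.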
Figure 6 Optical heterodyne detection.

Finally, we describe a famous application of heterodyning in optics: holography. As we have seen, the mixing of two light fields with different temporal frequencies produces a heterodyne current at the output of the photodetector. But we can also record two spatial light fields of the same temporal frequency with photographic film instead of electrical devices, since photographic film likewise responds to light intensity. We discuss holographic recording of a point source object as a simple example. Figure 7 shows a collimated laser beam split into two plane waves and recombined by the use of two mirrors (M) and two beamsplitters (BS). One plane wave is used to illuminate the pinhole aperture (our point source object), and the other
illuminates the recording film directly. The plane wave that is scattered by the point source object, which is located a distance z₀ away from the film, generates a diverging spherical wave. This diverging wave is known as the object wave in holography. According to Fresnel diffraction, the object wave arising from the point source object is given on the film by:

ψ₀ = A₀ exp(−ik₀z₀) (ik₀/2πz₀) exp[−ik₀(x² + y²)/2z₀] exp(iω₀t)   [8]

This object wave is a spherical wave, where A₀ is the amplitude of the spherical wave.

The plane wave that directly illuminates the photographic plate is known as the reference wave, ψ_r. For the reference plane wave, we assume that the plane wave has the same phase as the point source object at a distance z₀ away from the film. Its field distribution on the film is, therefore, ψ_r = A_r exp(−ik₀z₀) exp(iω₀t), where A_r is the amplitude of the plane wave. The film now records the interference of the reference wave and the object wave; i.e., what is recorded on the film as a 2D pattern is given by t(x, y) ∝ |ψ_r + ψ₀|². The resulting recorded 2D pattern, t(x, y), is called the hologram of the object. This kind of recording is known as holographic recording, distinct from a photographic recording, in which the reference wave does not exist and hence only the object wave is recorded. Now:

t(x, y) ∝ |ψ_r + ψ₀|²
= |A_r exp(−ik₀z₀) exp(iω₀t) + A₀ exp(−ik₀z₀) (ik₀/2πz₀) exp[−ik₀(x² + y²)/2z₀] exp(iω₀t)|²
= A + B sin[k₀(x² + y²)/2z₀]   [9]

where A and B are some inessential constants. Note that the result of recording two spatial light fields of the same temporal frequency preserves the phase information (notably the depth parameter z₀) of the object wave. This can be considered optical homodyning, as is clear from the result shown in Figure 2. The intensity distribution recorded on the film, upon being developed, gives a transmittance described by the above equation. This expression is called the Fresnel zone plate, which is the hologram of a point source object; we shall call it the point-object hologram. Figure 8 shows the hologram for particular values of z₀ and k₀. The importance of this phase-preserving recording is that when we
illuminate the hologram with a plane wave, called the reconstruction wave, a point object is reconstructed at the same location as the original point source object if we look toward the hologram, as shown in Figure 9; that is, the point source is reconstructed at a distance z₀ from the hologram, as if it were at the same distance away from the recording film. For an arbitrary 3D object, we can imagine the object as a collection of points, and therefore we have a collection of Fresnel zone plates on the hologram. Upon reconstruction, such as by illumination of the hologram with a plane wave, the observer would see a 3D virtual object behind the hologram.

As a final example, we discuss a state-of-the-art holographic recording technique called optical scanning holography, which employs optical heterodyning and electronic homodyning to achieve an optical scanning beam to raster scan a point object located z₀ away from the point source that generates the spherical wave. The situation is shown in Figure 10. Point C is the point source that generates the spherical wave (shown with dashed lines) on the pinhole object, our point object. This point source, for example, can be generated by a focusing lens. The solid parallel rays represent the plane wave. The interference of the two waves generates a Fresnel zone plate type pattern on the pinhole object:

I_s(x, y; z₀, t) = |A_r exp(−ik₀z₀) exp(iω₀t) + A₀ exp(−ik₀z₀) (ik₀/2πz₀) exp[−ik₀(x² + y²)/2z₀] exp[i(ω₀ + Ω)t]|²

Figure 10 Optical scanning holography (use of optical heterodyning and electronic homodyning to record holographic information upon scanning the object in two dimensions).
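The point-object hologram of eqn [9] can be synthesized numerically as a sampled Fresnel zone plate. In the sketch below, the parameter values (z₀, the constants A and B, and the film patch size) are illustrative choices, not values from the article:

```python
import numpy as np

# Sample t(x, y) = A + B sin(k0 (x^2 + y^2) / (2 z0)) on a grid: the
# Fresnel zone plate that eqn [9] predicts for a point-source hologram.
wavelength = 0.6e-6                 # red laser, 0.6 um
k0 = 2 * np.pi / wavelength
z0 = 0.3                            # point source 30 cm from the film (assumed)
A, B = 0.5, 0.4                     # "inessential constants" of eqn [9], assumed

x = np.linspace(-2e-3, 2e-3, 512)   # 4 mm x 4 mm film patch
X, Y = np.meshgrid(x, x)
t = A + B * np.sin(k0 * (X**2 + Y**2) / (2 * z0))

# The transmittance oscillates about A, and the rings become denser with
# radius: the local fringe frequency is what encodes the depth z0.
```

Plotting `t` as an image would show the familiar concentric-ring zone-plate pattern of Figure 8.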
while the goal-directed techniques may be different, based on the problems at hand, the pre- and post-processing techniques and tools are very similar. Thus, most of the common image pre-processing techniques are also applicable for image post-processing purposes.

Electronic image post-processing techniques involve three broad domains: (i) spatial; (ii) frequency; and (iii) time-frequency. Each of these domain techniques may be grouped into different types of processing: (i) spatial domain techniques include filtering and spatial processing such as histogram modification, morphological processing, and texture processing; (ii) frequency domain techniques include filtering; and (iii) time-frequency domain techniques include the short-time Fourier transform, wavelet, and Wigner–Ville transforms.

On the other hand, it is useful to identify the typical lifecycle of a digital image from the distribution perspective, as shown in Figure 2. The image may be created using a number of different acquisition methods and sensors, as mentioned above. The processing block next combines all different types of processing, as mentioned in Figure 1. The image may then be transmitted over the internet, or it can leave the digital world for the print world. It may then go through wear and tear, subsequently being scanned back into the digital domain. After some processing, it may again be distributed over the internet. Consequently, the image may be copied, legally or fraudulently, processed, and used by a user. Further, the threats and vulnerabilities of the internet have also increased proportionately. The rapid development and availability of high-end computers, printers, and scanners has made the tools of counterfeiting easily accessible and more affordable. With these products, one may counterfeit digital images, which are easy to distribute and modify. As a result, copyright ownership poses a formidable challenge to image post-processing and distribution.

In the subsequent sections, we briefly review relevant background in image post-processing and distribution techniques. We also present a few representative application examples of image post-processing and copyright protection in image distribution.
Histogram processing
One of the important issues in post-processing is image enhancement. Distortions in an image may be introduced by different noise factors, relative motion between camera and object, a camera that is out of focus, or atmospheric turbulence. Noise may be due to thermal measurement errors, the recording medium, the transmission medium, or the digitization process. The histogram processing operation is one of the simplest techniques for enhancing an image distorted by certain types of noise. Histogram equalization is a contrast enhancement technique with the objective of obtaining a new enhanced image with a uniform histogram. This can be achieved by using the normalized cumulative histogram as the gray scale mapping function. Histogram modeling is usually introduced using continuous domain functions. Consider that the images of interest contain continuous intensity levels in the interval [0,1] and that the transformation function t, which maps an input image f(x, y) onto an output image w(x, y), is continuous within this interval. We assume that the transfer function, K_f = t(K_w), is single-valued and monotonically increasing, such that it is possible to define the inverse law K_w = t⁻¹(K_f). An example of such a transfer function is illustrated in Figure 4. All pixels in the input image with densities in the interval D_f to D_f + dD_f are mapped using the transfer function K_f = t(K_w), such that they assume an output pixel density value in the interval from D_w to D_w + dD_w. If the histogram h is regarded as a continuous probability density function p describing the distribution of the intensity levels, then we obtain:

p_w(w) = p_f(D_f)/d(D_f)   [10]

where

D_f = du(f)/df   [11]

In the case of histogram equalization, the output probability densities are all an equal fraction of the maximum number of intensity levels in the input image, thus generating a uniform intensity distribution. Sometimes it is desirable to control the shape of the output histogram in order to enhance certain intensity levels in an image. This can be accomplished by the histogram specification operator, which maps a given intensity distribution into a desired distribution using a histogram-equalized image as an intermediate stage.

Figure 4 A histogram transformation function.

Fractals for Texture Analysis

The fractal aspect of an image has successfully been exploited for image texture analysis. The fractal concept developed by Mandelbrot, who coined the term fractal from the Latin 'fractus', provides a useful tool to explain a variety of naturally occurring phenomena. A fractal is an irregular geometric object with an infinite nesting of structure at all scales. Fractal objects can be found everywhere in nature, such as coastlines, fern trees, snowflakes, clouds, mountains, and bacteria. Some of the most important properties of fractals are self-similarity, chaos, and noninteger fractal dimension (FD). The FD of an object can be correlated with the object's properties. Object textures in images are candidates for characterization using fractal analysis because of their highly complex structure. The fractal theory developed by Mandelbrot is based, in part, on the work of the mathematicians Hausdorff and Besicovitch. The Hausdorff–Besicovitch dimension, D_H, is defined as:

D_H = lim_{r→0⁺} (ln N)/(ln 1/r)   [12]

where N is the number of elements of box size r required to form a cover of the object. The FD can be defined as the exponent of the number of self-similar pieces (N) with magnification factor (1/r) into which a figure may be broken. The FD is a noninteger
210 DETECTION / Image Post-Processing and Electronic Distribution
value in contrast to objects that lie strictly in Euclidean space. The equation for the FD is as follows:

FD = ln(number of self-similar pieces)/ln(magnification factor) = (ln N)/(ln 1/r)   [13]

Frequency Domain Image Filtering

Ignoring the noise contribution, we may obtain the basic filtering expression using eqns [7] and [8] as follows:

One of the fundamental aspects of early image representation is the notion of scale. An inherent property of objects in the world, and of details in an image, is that they exist as meaningful entities only over certain ranges of scale. Consequently, a multiscale representation is of crucial importance if one aims at describing the structure of the world, or more specifically the structure of projections of the three-dimensional world onto two-dimensional electronic images. Multiscale image post-processing may be facilitated by the wavelet transform (WT) computation.

A wavelet is, by definition, an amplitude-varying short waveform with a finite bandwidth. Naturally, wavelets are more effective than the sinusoids of Fourier analysis for matching and reconstructing signal features. In wavelet transformation and inversion, all transient or periodic data features can be detected and reconstructed by stretching or contracting a single wavelet to generate the matching building blocks. Consequently, wavelet analysis provides many flexible and effective ways to post-process images that surpass classical techniques, making it very attractive for data analysis and modeling. The continuous WT is defined as

F(a, b) = |a|^(−1/2) ∫_{−∞}^{∞} x(t) Ψ((t − b)/a) dt   [15]

where a and b are scaling and translation factors, and Ψ(t) is the mother wavelet. The WT may be viewed as a multiresolution analysis with shifted and scaled versions of the mother wavelet. The corresponding WT coefficients c_{m,n} are obtained with the dyadic wavelets

Ψ_{m,n}(x) = 2^(−m/2) Ψ(2^(−m)x − n)   [18]

The above eqn [18] forms the basis for the 'sub-band' implementation of multiresolution analysis (MRA) in wavelet processing.

Image Post-Processing Examples

Some of the important application areas for electronic image post-processing involve the following:

(i) Biomedical image analysis and object detection;
(ii) Image compression for transmission and distribution;
(iii) Satellite image processing and noise removal;
(iv) IR/SAR detection and clutter removal; and
(v) Image classification and machine vision.

In this article, we provide examples of some of these image post-processing applications.
Example I: Spatial Domain Processing: Image Post-Processing in Biomedical Images

Megavoltage X-ray images are X-ray images taken using high-voltage X-radiation. These images are often characterized by low resolution, low contrast, and blur. The importance of megavoltage X-ray images arises in the context of radiation therapy for cancer treatment, wherein the megavoltage X-radiation is used as a therapeutic agent. Figure 5 shows such a megavoltage X-ray image, also known as an electronic portal imaging device (EPID) image, with a very low level of detail. Figure 5a shows poor dynamic range. In order to improve the contrast of this image without affecting the structure (i.e., geometry) of the information contained therein, we can apply the histogram equalization operation discussed with eqn [10]. Figure 5b presents the result of the histogram image enhancement post-processing. The low level of detail in the bony anatomy is easily visible in Figure 5b.

Example II: Frequency Domain Processing: Image Post-Processing in Biomedical Tissue Detection

Fractional Brownian motion (fBm) is part of the set of 1/f processes, corresponding to a generalization of ordinary Brownian motion B(s). It is a nonstationary zero-mean Gaussian random function, defined as

B_H(t) − B_H(s) = [1/Γ(H + 0.5)] { ∫_{−∞}^{0} [(t − s)^(H−0.5) − (−s)^(H−0.5)] dB(s) + ∫_{0}^{t} (t − s)^(H−0.5) dB(s) },  B_H(0) = 0   [19]

where the Hurst coefficient H, restricted to 0 < H < 1, is the parameter that characterizes the fBm; t and s correspond to different observation times of the process B_H; and Γ is Euler's Gamma function. From the frequency-domain perspective, it is known that some of the most frequently seen structures in fractal geometry, generally known as 1/f processes, show a power spectrum satisfying the power-law relationship:
Figure 5 (a) Original EPID image; and (b) Histogram enhanced image.
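The histogram-equalization step used in Example I, with the normalized cumulative histogram as the gray-scale mapping function, can be sketched as follows (the function name and the synthetic low-contrast test image are ours, for illustration):

```python
import numpy as np

def histogram_equalize(img, levels=256):
    """Map gray levels through the normalized cumulative histogram,
    as described in the text, to flatten the intensity distribution."""
    hist = np.bincount(img.ravel(), minlength=levels)
    cdf = np.cumsum(hist) / img.size          # normalized cumulative histogram
    mapping = np.round((levels - 1) * cdf).astype(img.dtype)
    return mapping[img]

# Low-contrast stand-in image: values squeezed into [100, 120], much like
# the poor dynamic range of Figure 5a.
rng = np.random.default_rng(0)
low_contrast = rng.integers(100, 121, size=(64, 64)).astype(np.uint8)
enhanced = histogram_equalize(low_contrast)
# The equalized image spreads over (nearly) the full 0..255 range.
```

Because the mapping is monotonic, the geometry of the image content is preserved while the contrast is stretched, which is the point made about Figure 5b.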
where D_E is the Euclidean dimension that contains the fBm (i.e., the position of each point of the process is described by the vector x = (x₁, …, x_E)).

We exploit wavelet MRA, combined with complementary signal processing techniques such as time-frequency descriptions and parameter estimation of stochastic signals in the fBm framework, for tumor detection in CT images. We demonstrate an example of image post-processing for tumor segmentation. We test our models to estimate the FD in brain MR images, as shown in Figures 6a and b, respectively. The images are scanned pixel by pixel with a sliding window that defines the subimage for analysis. Pre-processing, such as median filtering, is used to remove noise from the original images.

The results for the spectral analyses of the CT image are shown in Figure 7. The results suggest that the FD values do not depend on the mother wavelet used. The window size that offers the best results is 16 × 16 pixels, while better discrimination between low and high H values is obtained using the biorthogonal wavelet. To improve the tumor detection performance of the spectral technique, we apply post-processing filters on the results in Figures 8a and b, respectively. In Figure 8, as an example, we show the filtered fractal analysis results corresponding to a 16 × 16 pixel window with the Daubechies 4 wavelet. It is possible to obtain an even better description of the location and shape of the tumor after applying a two-dimensional post-processing filter to the matrices derived from the variance estimation algorithms with the biorthogonal 1.5 wavelet and an 8 × 8 pixel window, respectively. The results are shown in Figure 9 and are similar for both averaging and Gaussian filters.
Figure 7 MR fractal analysis using the spectral method: (a) and (b) correspond to a 16 × 16 analyzing window for Daubechies 4 and biorthogonal 1.5 wavelets, respectively.
Figure 8 Post-processing results using: (a) 5 × 5 Gaussian filtering with σ = 1; and (b) 3 × 3 averaging filter.

Figure 9 Post-processing on the fractal analysis results of Figure 8 for: (a) 5 × 5 Gaussian filtering with σ = 1; and (b) 3 × 3 averaging filtering.
Figure 10 (a) Noisy image; (b) de-noised image with JHWT-filtering; and (c) de-noised image with WP-filtering.
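A plain 3 × 3 averaging filter of the kind applied in Figures 8 and 9 can be sketched in NumPy as follows (the edge handling by replication and the synthetic noise image are our illustrative choices):

```python
import numpy as np

def average_filter3(img):
    """3 x 3 averaging (mean) filter with edge replication: a plain NumPy
    sketch of the smoothing post-processing filter used above."""
    padded = np.pad(img.astype(float), 1, mode="edge")
    out = np.zeros_like(img, dtype=float)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            out += padded[1 + dy : 1 + dy + img.shape[0],
                          1 + dx : 1 + dx + img.shape[1]]
    return out / 9.0

# Averaging over 9 pixels reduces the variance of i.i.d. noise by
# roughly a factor of 9, at the cost of some blurring of detail.
rng = np.random.default_rng(1)
noisy = rng.normal(0.0, 1.0, size=(128, 128))
smooth = average_filter3(noisy)
```

A Gaussian kernel, as in Figure 8a, trades off the same smoothing against better frequency behavior; the averaging kernel is simply the flat special case.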
communication systems, wireless multimedia has recently undergone enormous development. However, to design an efficient multimedia communication system over wireless channels, there still exist many challenges due to severe wireless channel conditions, the special characteristics of compressed multimedia data, and resource limitations such as power supply. Since multimedia data, such as an image, contain redundancy, source compression may be necessary to efficiently utilize limited resources. Further, as mentioned above, the ease of image creation, processing, transmission, and distribution has also magnified the associated counterfeiting problems. We discuss the relevant issues of image distribution in the following sections.

Techniques in Image Distribution

The spectacular development of imaging sensors, communication, and internet infrastructure has also made counterfeiting tools easily accessible and more affordable. In addition, the threats and vulnerabilities of the internet have increased manyfold. All of these together pose a formidable challenge to the secure distribution and post-processing of images.

Requirements of Secure Digital Image Distribution

In light of the above-mentioned vulnerabilities of image distribution channels and processes, below are a few requirements of secure image distribution:

• Data Integrity and Authentication: A digital image is easy to tamper with. Powerful publicly available image processing software packages such as Adobe Photoshop or PaintShop Pro make digital forgeries a reality. Integrity of the image content, and authentication of the source and destination of the image distribution, are therefore of utmost importance.

• Digital Forensic Analysis: Consider litigation involving a medical image that is used for diagnosis and treatment of a disease, as shown in the post-processing examples in Figures 7 and 8 above. Integrity of the image, again, is vital for proper forensic analysis.

• Copyright Protection: Digital copyright protection has become an essential issue for maintaining the healthy growth of the internet and protecting the rights of on-line content providers. This is something that a textual notice alone cannot do, because it can so easily be forged. An adversary can steal an image, use an image processing program to replace the existing copyright notice with a different notice, and then claim to own the copyright himself. This requirement has resulted in a generalized digital rights management initiative that involves copyright protection technologies, policies, and legal issues.

• Transaction Tracking/Fingerprinting: In the continuous (re)distribution of digital images through the internet and other digital media (CD, camera, etc.), it is important to track the distribution points in order to trace possible illegal distribution of images.

Next, we discuss how digital watermarking technology can help enforce these security requirements.

Example of secure image distribution: digital watermarking

Digital watermarking is the process of embedding a digital code (watermark) into digital content such as an image, audio, or video. The embedded information, sometimes called a watermark, depends on the security requirements mentioned above. For example, for a copyright application, the embedded information could be the copyright notice. Figure 11 shows a block diagram of the digital watermarking process.

Consider that we want to embed a message M in an image f_o, also known as the original image. For the sake of brevity, we drop the indices representing the pixel coordinates. The message is first encoded using source coding, optionally with the help of error correction and detection coding, represented in Figure 11 as e(M), such that

W_m = e(M)   [26]

The encoded message W_m is then combined with the key-based reference pattern W_k and the scaling factor α to give W_a, which is the signal that is actually added to the original image:

W_a = α(f_o) [W_m ⊕ W_k]   [27]

Note that the scaling coefficient α may be constant, image dependent, or derived from a model of image perceptual quality. When the scaling factor α depends on the original image, we can have informed embedding, which may result in more perceptually adaptive watermarking. The operation involving W_m and W_k is the modulation part, which depends on the actual watermarking algorithm used. As an example, in typical spread-spectrum based watermarking, this operation is usually an exclusive NOR (XNOR) operation. The final operation is an additive or multiplicative embedding of W_a into the original image. We show a simple additive watermarking process as

f_w = f_o + W_a   [28]
Figure 12 (a) Original image, (b) watermarked image, and (c) the difference image.
The result of this process is the watermarked image f_w, which may actually need some clipping and clamping operations to keep the image pixel values in the desired range.

The detector, as shown in Figure 11, performs the inverse operations of the embedder. If the original image is known, as in the case of nonblind watermarking, it is first subtracted from the watermarked image. A blind watermarking detector, on the other hand, uses some filtering with the goal of estimating and subtracting the original image component. The result of the filtering is then passed to the decoder, which collects the message bits with the help of the shared key and finally recovers the decoded message M′.

Figure 12 shows the original image, the watermarked image, and the difference image using a spread-spectrum based watermarking algorithm, respectively. The enhanced difference image resembles a noise-like image, wherein the scaling coefficient α is clearly dependent on the original image. As evident from Figure 12b, the hidden message is imperceptible, as is the change in the original image. Imperceptibility is one of the primary criteria of digital watermarking. There are other criteria, such as unobtrusiveness, capacity, robustness, security, and detectability. Thus, digital watermarking is a multidimensional problem, which attempts to make a trade-off among a number of different criteria. The inserted
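The embed-and-detect chain of eqns [26]–[28] can be sketched end to end. The sketch below uses ±1 (antipodal) bits so that the XNOR-style modulation becomes a multiplication, a constant scaling factor α, a random stand-in image, and nonblind detection; these are all illustrative choices, not the article's exact algorithm:

```python
import numpy as np

rng = np.random.default_rng(42)
fo = rng.integers(0, 256, size=(64, 64)).astype(float)   # stand-in original image
message_bits = np.array([1, 0, 1, 1, 0, 0, 1, 0])        # encoded message Wm (eqn [26])

# Spread each bit over the whole image with a key-based +/-1 pattern Wk;
# in +/-1 arithmetic, XNOR-style modulation is plain multiplication.
key = np.random.default_rng(7)                           # shared secret key
Wk = key.choice([-1.0, 1.0], size=(len(message_bits),) + fo.shape)
Wm = np.where(message_bits == 1, 1.0, -1.0)[:, None, None]
Wa = 2.0 * (Wm * Wk).sum(axis=0)                         # alpha = 2, constant (eqn [27])
fw = fo + Wa                                             # additive embedding (eqn [28])

# Nonblind detector: subtract the known original, then correlate with Wk.
residual = fw - fo
corr = (residual * Wk).mean(axis=(1, 2))
decoded = (corr > 0).astype(int)                         # recovered message M'
```

A blind detector would replace the exact subtraction with a filtering step that estimates `fo`, which is why blind detection is noisier and needs stronger spreading.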
watermark should not be obtrusive to the intended use of the original image. Because the watermarking process does not increase the size of the original image, it may be desirable to add as much information as possible. Generally speaking, however, the more information one adds, the more severe the impact on the perceptual quality of the image. From the detection point of view, the watermark should be robust enough to tolerate a range of image post-processing and degradation. In addition, the embedded watermark should be secure enough that it cannot easily be removed from the watermarked image. Finally, detectability of the embedded watermark is an important criterion that places some constraints on the embedding algorithm.

In order to fully appreciate the use of watermarking for image distribution, it is instructive to look at the different classes of watermark, delineated in Figure 13. For example, depending on the robustness criterion, watermarks can be classified into three categories: robust, fragile, and semi-fragile. Note that while the robustness of a watermark is important, it is not always equally desirable in all applications. Let us take authentication as an example. If you, as the owner of a digital image, embed some authentication information into the image using watermarking, you might want to see the authentication fail when someone steals your image. This is an example of a nonrobust watermark, where the fragility of the embedded watermark is required to detect any forging done on the digital image.

Now consider how the watermark provides the security requirements for secure image distribution. In particular, we look into authentication via the watermark. Figure 14 shows the use of a watermark for image authentication, which has some similarities with cryptographic authentication using a digital signature. However, there are potential benefits to using watermarks in content authentication and verification. Unlike cryptography, watermarking offers both in-storage and in-transit authentication. Watermarking is also faster than encryption, which is very important in internet-based image distribution. By comparing a watermark against a known reference, it may be possible to infer not just that an alteration occurred, but what, when, and where changes have occurred.

As shown in Figure 14, first we identify some features from the image, and then we compute the digital signature, which is then embedded into the image. The detector takes a test image, computes the signature and extracts the embedded
signature, and then compares the two. If they match, the image is authenticated. If they do not match, then the image may have gone through some forging/tampering, and the algorithm may optionally detect the tampered area as well.
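The embed/verify flow of Figure 14 can be sketched in a few lines of Python. This is a toy illustration, not the algorithm from the text: the "features" are taken to be the top seven bit-planes, the signature is a SHA-256 digest, and the mark is written into the least-significant bits — a deliberately fragile choice, in the spirit of the fragile authentication watermarks discussed above.

```python
import hashlib

def _signature_bits(pixels):
    # Features: the top seven bit-planes (LSBs are reserved for the mark),
    # so embedding the signature does not change the features themselves.
    features = bytes(p & 0xFE for p in pixels)
    digest = hashlib.sha256(features).digest()
    return [(byte >> i) & 1 for byte in digest for i in range(8)]  # 256 bits

def embed(pixels):
    """Embed a fragile authentication watermark into the image LSBs."""
    bits = _signature_bits(pixels)
    assert len(pixels) >= len(bits), "image too small to carry the mark"
    marked = list(pixels)
    for i, b in enumerate(bits):
        marked[i] = (marked[i] & 0xFE) | b
    return marked

def verify(pixels):
    """Recompute the signature and compare it with the embedded one."""
    bits = _signature_bits(pixels)
    extracted = [pixels[i] & 1 for i in range(len(bits))]
    return extracted == bits

img = [(3 * i + 7) % 256 for i in range(1024)]   # synthetic 'image'
marked = embed(img)
print(verify(marked))        # True: image authenticated
tampered = list(marked)
tampered[500] ^= 0x10        # forge one pixel
print(verify(tampered))      # False: tampering detected
```

A robust or semi-fragile scheme would instead embed by correlation in a transform domain, so that the mark survives benign post-processing; the comparison step is the same.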
One of the limitations of the watermarking approach in secure image distribution is that the technology is not yet widely deployed, nor is the protocol satisfactorily standardized. Another major limitation is that the watermarking process cannot be made sufficiently robust against arbitrary types of attack. In practice, therefore, a hybrid combination of cryptography and watermarking is expected to improve the secure distribution of images across the unreliable internet.

Concluding Remarks

In this chapter, we discussed the principles of image post-processing and distribution. We showed a few representative examples of post-processing and distribution. We also explained the relevant issues in image distribution and presented digital watermarking as one of the solutions for secure image distribution.

Acknowledgment

One of the authors (KMI) wishes to acknowledge the partial support of this work through a grant (RG-01-0125) from the Whitaker Foundation.

See also

Imaging: Information Theory in Imaging.
DETECTION / Smart Pixel Arrays 219
A derivative improvement that is desired is the increase of the dynamic range D/R of the complete detection process, as defined, for example, by

$D/R = 10 \log_{10}\left( Q_{\max}^{2} / \sigma_{Q}^{2} \right)$  [4]

where Q_max denotes the maximum (or saturation) charge and σ_Q is the total charge noise of the electronic detection circuit.

In addition to the above described improvements of conventional pixel functionality, a huge demand exists for the capability to measure the parameters of modulated electromagnetic signals, i.e., to demodulate such signals. This demand is related to the large number of very powerful electronic and opto-electronic measurement principles, capable of improving the sensitivity of the measurement process significantly by employing modulated signals, either in the temporal or in the spatial domain. For this reason, smart pixel arrays are sought that offer this combination of photodetection and demodulation.

Modular Building Blocks for Smart Pixel Arrays

Towards the end of the last century, a large body of literature was created, addressing various approaches for the realization of one of the above described opto-electronic functionalities. The result is a powerful 'toolbox' of modular building blocks for custom smart pixel arrays, with which optimized solutions for optical measurement problems and practical applications can be constructed.

High-Sensitivity Charge Detection

From the noise formula (eqn [3]), two possibilities for the improvement of the sensitivity of image sensors can be deduced. A first approach is the reduction of the capacitance value. An excellent example of this is the invention of the 'double gate' FET, in which the charge to be measured is not placed on a gate electrode at the surface but resides in the semiconductor bulk. In this way the effective capacitance of the pixel can be reduced by at least one order of magnitude compared to conventional pixels. The resulting capacitance of less than 1 fF leads to a readout noise of about one electron at room temperature and for a readout bandwidth of several MHz. Unfortunately, the small capacitance allows the storage of only around ten thousand electrons. For such small charge packets, quantum noise is still relevant, leading to significantly degraded image quality, even when the images are taken under favorable lighting conditions.

A second approach to high-sensitivity photosensing exploits the dependence of the charge detection variance on the measurement bandwidth, as indicated in eqn [3]. One method for exploiting this consists of reading out the image sensor at a significantly reduced bandwidth of a few tens of kHz. At the same time, the image sensor is cooled to temperatures of around −100 °C. This becomes necessary because readout at such slow speeds increases the time to read one complete image to many minutes, requiring a substantial reduction of the dark current. Instead of working at low readout bandwidths, it is also possible to read out the same information repeatedly and non-destructively at high speed, and to average many measurements of the same amount of photocharge. Both methods have been shown to lead to readout noise levels of about one electron; on the other hand, both approaches require very long times for the acquisition of complete images.

This problem can be resolved by exploiting the massive parallelism that smart pixel arrays offer: low-bandwidth techniques can be employed without losing effective readout speed. Pixels and columns are read out with low-bandwidth amplifiers that are all active simultaneously, and only the final off-chip readout amplifier needs to operate at a large bandwidth. In practice, this approach has been used successfully in single-chip digital cameras with Megapixel resolution, exhibiting a readout noise of only a very few electrons.

A completely different approach is based on the use of a physical amplification mechanism with low intrinsic noise before detecting the amplified charge packets with a conventional electronic circuit; the avalanche multiplication effect in high electric fields is very well suited to this approach. Each pixel is supplied with its own combination of stabilized avalanche multiplier and conventional charge detection circuit. This approach has been used for the realization of a fully integrated 2D array of single photon detectors that can be fabricated with industry-standard CMOS processes. Similar performance is obtained with the Impactron series of high-sensitivity CCD image sensors offered by Texas Instruments Inc., making use of a single multistage avalanche multiplier at the output of the image sensor. For both high-sensitivity smart pixel arrays, single-photoelectron performance is reported at room temperature and at readout frequencies of several MHz, as required for video applications.
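Equation [4] is straightforward to evaluate. The sketch below compares the double-gate FET pixel described above (about 10 000 e⁻ full well, about 1 e⁻ read noise) with an assumed conventional pixel; the 100 000 e⁻ / 20 e⁻ figures for the latter are illustrative, not taken from the text.

```python
import math

def dynamic_range_db(q_max_electrons, read_noise_electrons):
    # Eqn [4]: D/R = 10 log10(Qmax^2 / sigma_Q^2) = 20 log10(Qmax / sigma_Q).
    return 10.0 * math.log10(q_max_electrons ** 2 / read_noise_electrons ** 2)

# Double-gate FET pixel from the text: ~10 000 e- full well, ~1 e- read noise.
print(round(dynamic_range_db(10_000, 1)))     # 80 (dB)
# Assumed conventional pixel: larger well, but much higher read noise.
print(round(dynamic_range_db(100_000, 20)))   # 74 (dB)
```

Despite its small full well, the low-noise pixel wins: dynamic range is set by the ratio of saturation charge to read noise, not by the well depth alone.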
DETECTION / Smart Pixel Arrays 223
Figure 8 Images of a scene with high dynamic range, taken with an image sensor exhibiting linear illumination response (a) and linear-
logarithmic illumination response (b). A similar pixel architecture as illustrated in Figure 7 is employed. (Courtesy of Photonfocus AG,
Lachen, Switzerland.)
$R_{k,l} = \sum_{i=-N}^{N} \sum_{j=-N}^{N} P_{i,j}\, w_{k-i,\,l-j}$  [6]

can be used for the determination of complete, programmable, real-time image convolutions, with arbitrary kernels that are fully under the dynamic control of digital logic or microprocessors. Practical experience with such convolution imagers shows that the convolution results exhibit inaccuracies of less than 1%.

Temporal Demodulation

In many relevant optical measurement techniques, temporally modulated light fields are employed. For this reason, smart pixel arrays are desired that provide synchronous demodulation of these signals. An electromagnetic signal S(t) that is harmonic as a function of time with given frequency ω, having unknown modulation amplitude A, offset B, and phase delay φ, is described by

$S(t) = A \sin(\omega t + \varphi) + B$  [7]

The three unknowns A, B, and φ can be measured if three or more sampling points are sampled synchronously per period in so-called 'lock-in' pixels. An example of a lock-in pixel employing the drift-field transport principle with four accumulation taps is illustrated in Figure 13. Demodulation frequencies of several tens of MHz have already been demonstrated in practice.

The most important application of such synchronous demodulation pixels is in optical time-of-flight range cameras. Such a camera incorporates a light source, usually an array of infrared or visible LEDs, that is modulated at a few tens of MHz. The modulated light is reflected diffusely by the various objects, and the camera's lens images it onto a lock-in image sensor. Each of its pixels carries out synchronous sampling/accumulation of photocharges. At the end of the exposure time, the accumulated photocharge samples are read out, and the three modulation parameters of the reflected light are determined for each pixel. The phase shift is a direct measure of the local distance. It has been shown that for a wide range of operating conditions, the distance resolution of such optical range cameras is already limited just by the photon shot noise. A typical result of such a depth image is shown in Figure 14.

Photonic Systems-on-chip

The microelectronics revolution is based on the key notion of integration. More functionality integrated monolithically on the same chip leads to faster, smaller, better performing, and cheaper products. The same is true for smart pixel arrays, for which progress through integration is attained in several respects.
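For the common case of exactly four taps per period, the recovery of A, B, and φ from eqn [7] has a closed form. The sketch below assumes ideal point sampling at quarter-period intervals (real taps integrate over a quarter period, which only rescales the recovered amplitude) and converts the phase delay to a time-of-flight distance.

```python
import math

def lockin_demodulate(s0, s1, s2, s3):
    """Recover A, B, phi of eqn [7] from four samples per period.

    Assumes S(t) = A sin(w t + phi) + B sampled at t = 0, T/4, T/2, 3T/4,
    so that s0 - s2 = 2A sin(phi) and s1 - s3 = 2A cos(phi).
    """
    offset = (s0 + s1 + s2 + s3) / 4.0
    amplitude = 0.5 * math.hypot(s0 - s2, s1 - s3)
    phase = math.atan2(s0 - s2, s1 - s3)
    return amplitude, offset, phase

def tof_distance(phase, f_mod):
    # Round-trip phase delay -> distance: d = c * phi / (4 pi f_mod).
    c = 299_792_458.0
    return c * phase / (4.0 * math.pi * f_mod)

# Synthesize the four samples for A = 2, B = 5, phi = 0.3 rad and recover them.
A, B, phi = 2.0, 5.0, 0.3
s = [A * math.sin(2 * math.pi * k / 4 + phi) + B for k in range(4)]
print(lockin_demodulate(*s))      # ~(2.0, 5.0, 0.3)
print(tof_distance(0.3, 20e6))    # ~0.36 m at 20 MHz modulation
```

The same four-bucket arithmetic is what each lock-in pixel performs in charge domain; only the final division and arctangent are done off-pixel.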
Seeing Chips
Already, in the mid-1980s, researchers were excited
about the possibility of integrating local signal
processing capabilities in each pixel, implementing
some functions of biological vision systems. It was
envisioned that an increasing amount of bio-inspired
functionality would lead to smart image sensors,
exhibiting a primitive sense of vision. The idea of a
‘seeing chip’, with the capability to interpret a scene,
was born.
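Returning to the convolution imager of eqn [6], its operation is easy to sketch in software; the box-blur kernel and single-pixel test image below are illustrative choices, not from the text.

```python
def convolve2d(P, w, N):
    """Eqn [6]: R[k,l] = sum over i,j in [-N, N] of P[i,j] * w[k-i, l-j].

    P is a dict {(i, j): value} over the pixel array and w(a, b) a kernel
    function; a 'programmable' kernel simply swaps w at run time.
    """
    def R(k, l):
        return sum(P.get((i, j), 0.0) * w(k - i, l - j)
                   for i in range(-N, N + 1) for j in range(-N, N + 1))
    return R

# 3x3 box blur as the programmable kernel.
w = lambda a, b: 1.0 / 9.0 if abs(a) <= 1 and abs(b) <= 1 else 0.0
P = {(0, 0): 9.0}                         # a single bright pixel
R = convolve2d(P, w, N=2)
print(round(R(0, 0), 6), round(R(1, 1), 6), R(2, 2))   # 1.0 1.0 0.0
```

In the imager, the same multiply-accumulate happens in parallel in every pixel, which is why real-time operation with sub-1% inaccuracy is possible.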
Unfortunately, even today, our understanding of the enormously successful biological vision systems has not yet reached a mature enough state to allow us to implement signal processing functions that approximate any 'image understanding'. Published algorithms are still incapable of interpreting a scene and 'perceiving' its contents. Obviously, no set of vision algorithms is yet known with which seeing chips could be realized. Although progress is slower than envisioned, single-chip machine vision systems of growing complexity and increasing functionality are being developed.

Figure 15 Schematic block diagram of a monolithically integrated photonic microsystem that is expected to be producible in the near future with organic semiconductors. All opto-electronic devices are fabricated on the same planar surface: photovoltaic power supplies (PV); analog and digital signal processing (ASP and DSP); photodetectors (PD); photoemitters (PE) – used for bidirectional communication or as parts of a photonic detector subsystem – a light guiding structure (LG) that can be sensitized for specific sensing applications, and a display subsystem (DS).

Organic Semiconductors for Monolithic Photonic Microsystems

The realization that organic materials with conjugated double bonds, in particular noncrystalline polymers, are direct bandgap semiconductors has recently led to a flurry of R&D activities in the domain of organic photonics. Organic light-emitting diodes (OLEDs), color displays, lighting panels, photodiodes, image sensors, photovoltaic cells, RF transponders and tags, and several other organic photonic devices have not only been demonstrated but many are already commercially available. The special appeal of organic semiconductors stems from their potential for monolithic, low-cost integration of complete, large-area photonic microsystems, even on thin, flexible substrates. In Figure 15, such a monolithic photonic microsystem is illustrated schematically, combining a photovoltaic power supply, a photonic detector subsystem, analog and digital signal processing, as well as bidirectional optical communications with the outside world, possibly including also an on-chip display. The photonic detector subsystem typically consists of a light source, a light guiding structure that is sensitized – for example chemically – and a photodetector/image sensor subsection. In such a monolithic photonic microsystem, the term 'smart pixel array' obtains a much broader meaning, opening up a huge number of practical applications in essentially all domains of our daily and professional lives. This domain of organic semiconductors for integrated photonic microsystems is still largely unexplored, and it is offering exciting opportunities for research and product development.

Smart Pixel Arrays Beyond Electromagnetic Field Imaging

The sensing capabilities of smart pixel arrays are not restricted to electromagnetic radiation; this technology can also be used for the acquisition of other two- or three-dimensional physical fields. As an example, electrostatic imaging can be used for the temporally and spatially resolved acquisition of surface potential distributions in living cells. Electrostatic potential values of several hundred mV can be acquired with a precision of 1 mV and a temporal resolution below 1 millisecond. Other acquisition modalities include magnetostatic field imaging, for example by employing the Hall effect, or imaging of local electrostatic charge or current density distributions.

Summary

The relentless advances of semiconductor technology make it possible to integrate an increasing amount of photosensitive, analog, and digital functionality in the same smart pixel and on the same smart pixel array. Unusual photosensor devices can be fabricated with commercially available
DIFFRACTION
Contents
Diffraction Gratings
Fraunhofer Diffraction
Fresnel Diffraction
Figure 1 Examples of linear gratings. (a) Binary grating; (b) sinusoidal grating; (c) triangular grating. Here d is the grating period, H is the thickness of the modulated region, θ is the angle of incidence, and n denotes refractive index.
Figure 2 Diffraction by gratings. (a) Linear grating; (b) definitions of direction angles in the case of a biperiodic grating.
wave not propagating in the x–z plane (conical incidence).

Some two-dimensionally periodic gratings are illustrated in Figure 3. The structure illustrated in Figure 3a is an extension of a binary grating to the biperiodic geometry: here a single two-dimensional feature is placed within the grating period but, of course, more features with different outlines could be included as well. Moreover, each period could contain a continuous 'mountain-range' profile or consist of a 'New York'-style pixel structure. Figure 3b illustrates a structure consisting of an array of rectangular-shaped holes pierced in a flat screen, while Figure 3c defines an array of isolated particles on a flat substrate. If the grid or the particles are metallic, surrounded by dielectrics from all sides, one speaks of inductive and capacitive structures, respectively: in the former case relatively free-flowing currents are induced in the metallic grid structure, but in the latter case the isolated metal particles become polarized (and therefore capacitive) when illuminated by an electromagnetic field.
DIFFRACTION / Diffraction Gratings 231
Figure 3 Examples of biperiodic gratings. (a) General binary grating; (b) inductive structure; (c) capacitive structure.
scaling law. The situation is much worse for biperiodic gratings, for which the exponent is around six. Thus no foreseeable developments in computer technology will solve the problem, and approximate analysis methods have to be sought. A multitude of such methods are under active development, but in this context we discuss the simplest example only.

If d ≫ λ and the grating profile does not contain wavelength-scale features, one can describe the effects of a grating on the incident field merely by considering it as a complex-amplitude-modulating object, which imposes a position-dependent phase shift and amplitude transmission (or reflection) coefficient on the incident field. This model is known as the thin-element approximation (TEA).

At normal incidence the plane-wave expansion of any scalar field component U immediately behind the grating (at z = H) takes the form:

$U(x, y, H) = \sum_{m=-\infty}^{\infty} \sum_{n=-\infty}^{\infty} T_{mn} \exp[\mathrm{i} 2\pi (m x/d_x + n y/d_y)]$  [4]

where

$T_{mn} = \frac{1}{d_x d_y} \int_0^{d_x} \int_0^{d_y} U(x, y, H) \exp[-\mathrm{i} 2\pi (m x/d_x + n y/d_y)] \, \mathrm{d}x \, \mathrm{d}y$  [5]

are the complex amplitudes of the transmitted diffraction orders. In TEA the field at z = H may be connected to the field at z = 0 (a unit-amplitude plane wave is assumed here) by optical path integration along a straight line through the modulated region of the grating:

$U(x, y, H) = \exp\left[ \mathrm{i} \frac{2\pi}{\lambda} \int_0^H \hat{n}(x, y, z) \, \mathrm{d}z \right]$  [6]

where $\hat{n}(x, y, z) = n(x, y, z) + \mathrm{i}\kappa(x, y, z)$ is the complex refractive-index distribution in the region 0 < z < H. With TEA the diffraction efficiencies can be calculated from a simple formula:

$\eta_{mn} = |T_{mn}|^2$  [7]

which, however, ignores Fresnel reflections. Equations [5]–[7] allow one to obtain closed-form expressions for the diffraction efficiencies η_mn in many grating geometries. For example, a y-invariant triangular profile with H = λ/|n_I − n_III| gives η₋₁ = 100% if n₁ = n_III, n₂ = n_I, and θ = 0. For a Q-level stair-step-quantized version of this grating we obtain:

$\eta_m = \mathrm{sinc}^2(1/Q)$  [8]

where sinc x = sin(πx)/(πx). It should be noted, however, that these results are valid only for gratings with d ≫ λ, even if they are corrected by Fresnel transmission losses at the interface between the two media.

Fabrication of Gratings

Gratings can be manufactured in a number of different ways, perhaps the simplest being to plot an equidistant array of straight black and white lines and to reduce the scale of the pattern photographically. This method can be readily adapted to the generation of more complex grating structures and diffractive elements, but it is not suitable for the production of high-quality gratings required in, for example, spectroscopy.

Ruling engines represent the traditional way of making gratings with periods of the order of the wavelength of light, especially for spectroscopic applications. These are machines equipped with a triangularly shaped diamond ruling edge and an interferometrically controlled two-dimensional mechanical translation stage that moves the substrate with respect to the ruling edge. Triangular-profile gratings of high quality can be produced with such machines, but the fabrication process is slow and the cost of the manufactured gratings is therefore high. In addition, even with real-time interferometric control of the ruling-edge position, it is impossible to avoid small local errors in the grating geometry. Such errors are particularly harmful if they are periodic since, according to the grating equation, these errors will generate spurious spectral lines known as ghosts. Such lines, generated by the grating 'super-period', are weak but they may be confused with real spectral lines in the source spectrum. If the ruling errors are random, they result in 'pedestal-type' background noise. While such errors do not introduce ghosts, they nevertheless reduce the visibility of weak original spectral lines.

The introduction of lasers in the early 1960s made it possible to form stable and high-contrast interference patterns over a sufficient time-scale to allow the exposure of photographic films and other photosensitive materials such as polymeric photoresists. This interferometric technique offers a direct method for the fabrication of large diffraction gratings without appreciable periodic errors. With this technique it is possible to make highly regular
gratings for spectroscopy, but it does not permit easy generation of arbitrarily shaped grating structures.

The two intersecting plane waves can also be generated with a linear grating using diffraction orders m = +1 and m = −1, suppressing the zero order by appropriate design of the grating profile and choosing the period in such a way that all higher transmitted orders are evanescent. This so-called phase mask technique is suitable, for example, for the exposure of Bragg gratings in fibers.

To generate arbitrary gratings one usually employs lithographic techniques. Some of the possible processes are illustrated in Figure 4. The exposure can be performed with scanning photon, electron, or ion beams, etc. The process starts with substrate preparation, i.e., coating it with a beam-sensitive layer known as resist, and in the case of electron beam exposure with a thin conductive layer to prevent charging. Following the exposure the resist is developed chemically, and in this process either the exposed or nonexposed parts of the resist are dissolved (depending on whether the resist is of positive or negative type), leading to a surface-relief profile. Several alternative fabrication steps may follow the development, depending on the desired grating type. If a high-contrast resist is used, a slightly 'undercut' profile as illustrated in Figure 4c is obtained. This can be coated with a metal layer and the resist can be 'lifted off' chemically. Thus a binary metallic grating is formed. To produce a binary dielectric profile one bombards the substrate with ions either purely mechanically (ion beam milling) or with chemical assistance (reactive ion etching), and finally the residual metal is dissolved.

Alternatively, the particle beam dose can be varied with position to obtain an analog profile in the resist after development. This resist profile can be transferred into the substrate by proportional etching, which leads to an analog dielectric surface-relief structure (see Figure 4f). This profile can be made
Figure 4 Lithographic processes for grating fabrication. (a) Substrate after preparation for electron beam exposure; (b) exposure by a scanning electron beam; (c) slightly undercut profile in high-contrast resist after development, under metal film evaporation; (d) analog resist profile after variable-dose exposure and resist development; (e) finished binary metallic grating after resist lift-off; (f) transfer of the analog surface-relief profile into the substrate by proportional etching; (g) etching of a binary profile into the substrate with a metal mask; (h) growth of a metal film on an analog surface profile.
Figure 5 Spectral diffraction efficiency curves of spectroscopic gratings in TE polarization. (a) Binary; (b) sinusoidal; and (c) triangular reflection gratings illuminated at the Bragg angle. Solid lines: d = λ₀. Dashed lines: d = 2λ₀.
reflective by depositing a thin metal film on top of it. If large-scale fabrication of identical gratings is the goal, a thick (~100 μm) metal layer (for example nickel) can be grown on the surface by electroplating. After separation from the substrate, the electroplated metal film, which is a negative of the original profile, can be used as a stamp in replicating the original profile in plastics using methods such as hot embossing, ultraviolet radiation curing, and injection molding. In particular, the latter method provides the key to high-volume and low-cost production of various types of diffraction gratings and other grating-based diffractive structures on the industrial scale.

Spectroscopic Gratings

The purpose of a spectroscopic grating is to divide the incident light into a spectrum, i.e., to direct different frequency components of the incident beam into different directions given by the grating equation (eqn [1]). When the grating period is reduced, the angular dispersion ∂θ_m/∂λ, and therefore the resolution of the spectrograph, increases.

In the numerical examples presented in Figure 5 we assume a reflection grating in a Littrow mount, i.e., θ_m^r = −θ. This mount is common in spectroscopy; it obviously requires that the grating is rotated when the spectrum is scanned. We see from eqn [1] that in the Littrow mount:

$\sin\theta = \frac{m\lambda}{2d}$  [9]

$\frac{\partial\theta_m}{\partial\lambda} = -\frac{m}{2d\cos\theta}$  [10]

Figure 7 Diffraction efficiency versus wavelength for triangular transmission gratings illuminated at normal incidence. Solid lines: TE polarization. Dashed lines: TM polarization.

The metal is assumed to be aluminum (n₁ and n_III are complex-valued). In Figure 5a binary gratings with one groove (width d/2) per period are considered. Parametric optimization gives H = 422 nm if d = 1λ, and H = 168 nm if d = 2λ. Figure 5b illustrates the results for sinusoidal gratings with H = 345 nm if d = 1λ, and H = 520 nm if d = 2λ. Finally, triangular gratings with H = 431 nm if d = 1λ, and H = 316 nm if d = 2λ, are considered. The triangular grating profile is seen to be the best choice in general.

Diffractive Lenses

Diffractive lenses are gratings with locally varying period and groove orientation. Figure 6 illustrates the central part of a diffractive microlens fabricated by lithographic methods. It is seen to consist of concentric zones with radially decreasing width.

The phase transmission function of an ideal diffractive focusing lens can be obtained by a simple optical path calculation, and is given by

$\phi(r) = \frac{2\pi}{\lambda_0}\left(f - \sqrt{f^2 + r^2}\right) \approx -\frac{\pi}{\lambda_0 f}\, r^2$  [11]

where f is the focal length, r is the radial distance from the optical axis, and the approximate expression is valid in the paraxial region r ≪ f. Here λ₀ is the design wavelength, which determines the modulation depth of the locally blazed grating profile:

$H = \frac{\lambda_0}{n(\lambda_0) - 1}$  [12]

and n(λ₀) is the refractive index of the dielectric material at λ = λ₀.

The zone boundaries r_n are defined by the condition φ(r_n) = −2πn, and therefore we obtain from eqn [11]:

$r_n = \sqrt{2 n f \lambda_0 + (n\lambda_0)^2} \approx \sqrt{2 n f \lambda_0}$  [13]

Obviously, when the numerical aperture of the lens is large, the zone width in the outer regions of the lens is reduced to values of the order of λ. Figure 7 illustrates the local diffraction efficiency in different parts of a diffractive lens: we use the order m = −1 of a locally triangular diffraction grating of the
Figure 9 Diffraction pattern produced by a continuous-profile beamsplitter grating when (a) d = 200λ and (b) d = 50λ.
Figure 10 Diffraction pattern produced by a binary beamsplitter grating when (a) d = 200λ and (b) d = 50λ.
type shown in Figure 1c with θ = 0, assuming that n_I = n₂ = 1.5 (fused silica) and n₁ = n_III = 1. The efficiency η₋₁ is calculated as a function of λ using rigorous diffraction theory for both TE and TM polarization (the electric vector points in the groove direction in the TE case and is perpendicular to it in the TM case). It is seen that the local efficiency of the lens becomes poor in regions where the local grating period is small, and thus diffractive lenses operate best in the paraxial region (it is, however, possible to obtain improved nonparaxial performance by optimizing the local grating profile).

A factor that limits the use of diffractive lenses with polychromatic light is their large chromatic aberration: in the paraxial limit the focal length depends on the wavelength according to the formula:

$f(\lambda) = f\,\frac{\lambda_0}{\lambda}$  [14]

However, since the dispersion of a grating has an opposite sign compared to that of a prism, a weak positive diffractive lens can be combined with a relatively strong positive refractive lens to form a single-element achromatic lens. Such a hybrid lens is substantially thinner and lighter than a conventional all-refractive achromatic lens made of crown and flint glasses.

Beam Splitter Gratings

Multiple beamsplitters, also known as array illuminators, are gratings with sophisticated periodic structure that are capable of transforming an incident plane wave into a set of diffraction orders with a specified distribution of diffraction efficiencies. Most often one wishes to obtain a one- or two-dimensional array of diffracted plane waves with equal efficiency. This goal can be achieved by appropriate design of the grating profile, which may be of binary, multilevel, or continuous form.

Figure 8 illustrates binary and continuous grating profiles capable of dividing one incident plane wave into eight diffracted plane waves (the central odd-numbered orders of the grating) with equal efficiency in view of TEA. Figure 9 shows the distribution of energy among the different diffraction orders in and around the array produced by the continuous profile, evaluated by rigorous theory for two different grating periods, while Figure 10 gives corresponding results for the binary grating.
Figure 12 The far-field diffraction pattern produced by the grating in Figure 11.
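The zone-radius and chromatic-dispersion formulas [13] and [14] for diffractive lenses are easy to check numerically; the design wavelength and focal length below are illustrative values, not taken from the text.

```python
import math

def zone_radius(n, f, lam0):
    # Eqn [13]: r_n = sqrt(2 n f lam0 + (n lam0)^2) ~ sqrt(2 n f lam0).
    return math.sqrt(2 * n * f * lam0 + (n * lam0) ** 2)

def focal_length(lam, f_design, lam0):
    # Eqn [14]: f(lam) = f * lam0 / lam (paraxial chromatic dispersion).
    return f_design * lam0 / lam

f, lam0 = 10e-3, 633e-9   # illustrative: 10 mm lens designed for 633 nm
print(round(zone_radius(1, f, lam0) * 1e6, 1))        # ~112.5 (first zone, um)
print(round(focal_length(488e-9, f, lam0) * 1e3, 2))  # ~12.97 (f at 488 nm, mm)
```

The strong wavelength dependence of f(λ) is exactly the chromatic aberration that the hybrid diffractive-refractive achromat discussed above is designed to cancel.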
Fraunhofer Diffraction
B D Guenther, Duke University, Durham, NC, USA

© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

Geometrical optics does a reasonable job of describing the propagation of light in free space unless that light encounters an obstacle. Geometrical optics predicts that the obstacle would cast a sharp shadow. On one side of the shadow's edge would be a bright uniform light distribution, due to the incident light, and on the other side there would be darkness. Close examination of an actual shadow edge reveals, however, dark fringes in the bright region and bright fringes in the dark region. This departure from the predictions of geometrical optics was given the name diffraction by Grimaldi.

There is no substantive difference between diffraction and interference. The separation between the two subjects is historical in origin and is retained for pedagogical reasons. Since diffraction and interference are actually the same physical process, we expect the observation of diffraction to be a strong function of the coherence of the illumination. With incoherent sources, geometrical optics is usually all that is needed to predict the performance of an optical system, and diffraction can be ignored except at dimensions on the scale of a wavelength. With light sources having a large degree of spatial coherence, it is impossible to neglect diffraction in any analysis.
240 DIFFRACTION / Fraunhofer Diffraction
Box 1
Green's Function

1. Kirchhoff Green's function: Assume that the wave, Ψ, is due to a single point source at r₀. At an observation point, r₁, the Green's function is given by

$\Psi(\mathbf{r}_1) = \frac{e^{-i\mathbf{k}\cdot\mathbf{r}_{01}}}{r_{01}}$  [1]

where r₀₁ = |r₀₁| = |r₁ − r₀| is the distance from the source of the Green's function, at r₀, to the observation position, at r₁.

2. Sommerfeld Green's function: Assume that there are two point sources, one at P₀ and one at P₀′ (see Figure 2), where the source P₀′ is positioned to be the mirror image of the source at P₀ on the opposite side of the screen; thus

$r_{01} = r_{01}', \qquad \cos(\hat{n}, \mathbf{r}_{01}) = -\cos(\hat{n}, \mathbf{r}_{01}')$  [2]

and

$\Psi' = \frac{e^{-i\mathbf{k}\cdot\mathbf{r}_{01}}}{r_{01}} - \frac{e^{-i\mathbf{k}\cdot\mathbf{r}_{01}'}}{r_{01}'}$  [3a]

or

$\Psi' = \frac{e^{-i\mathbf{k}\cdot\mathbf{r}_{01}}}{r_{01}} + \frac{e^{-i\mathbf{k}\cdot\mathbf{r}_{01}'}}{r_{01}'}$  [3b]
Box 2
Green's Theorem

To calculate the complex amplitude, Ẽ(r), of a wave at an observation point defined by the vector r, we need to use a mathematical relation known as Green's Theorem. Green's theorem states that, if φ and Ψ are two scalar functions that are well behaved, then

   ∬_S (φ∇Ψ·n̂ − Ψ∇φ·n̂) ds = ∭_V (φ∇²Ψ − Ψ∇²φ) dv    [4]

   ∇Ψ·n̂ = ∂Ψ/∂n,   ∇φ·n̂ = ∂φ/∂n    [5]

This equation is the prime foundation of scalar diffraction theory, but only the proper choice of Ψ, φ, and the surface, S, allows direct application to the diffraction problem.
Figure 3 Integration surface for diffraction calculation. P2 is the source and P0 is the point where the field is to be calculated. Reprinted
with permission from Guenther RD (1990) Modern Optics. New York: John Wiley & Sons.
that Ψ and ∂Ψ/∂n are known only if the problem has already been solved.

The Sommerfeld Green's functions will remove the inconsistencies of the Kirchhoff theory. If we use the Green's function [3a] then Ψ vanishes in the aperture while ∂Ψ/∂n can assume the value required by Kirchhoff's boundary conditions. If we use the Green's function [3b] then the converse holds, i.e., ∂Ψ/∂n is zero in the aperture. This improved Green's function makes the problem harder with little gain in accuracy, so we will retain the Kirchhoff formalism.

As a result of the above discussion, the surface integral [15] is only performed over the surface, S, in the aperture. The evaluation of the Green's function on this surface can be simplified by noting that normally r_{01} ≫ λ, i.e., k ≫ 1/r_{01}; thus on S,

   ∂Ψ/∂n = cos(n̂, r_{01}) (−ik − 1/r_{01}) e^{−ikr_{01}}/r_{01}
          ≈ −ik cos(n̂, r_{01}) e^{−ikr_{01}}/r_{01}    [20]

Substituting this approximate evaluation of the derivative into [15] yields

   φ(r_0) = (1/4π) ∬_S [e^{−ikr_{01}}/r_{01}] [∂φ/∂n + ikφ cos(n̂, r_{01})] ds    [21]

The source of the incident wave is a point source located at P_2, with a position vector, r_2, measured with respect to the coordinate system and a distance |r_{21}| away from P_1, a point in the aperture (see Figure 3). The incident wave is therefore a spherical wave of the form

   φ(r_{21}) = A e^{−ikr_{21}}/r_{21}    [22]

which fills the aperture. Here also we will assume that r_{21} ≫ λ, so that the derivative of the incident wave assumes the same form as [20]. Then [21] can be written

   φ(r_0) = (iA/λ) ∬_S [e^{−ik(r_{21}+r_{01})}/(r_{21} r_{01})] [cos(n̂, r_{01}) − cos(n̂, r_{21})]/2 ds    [23]

This relationship is called the Fresnel–Kirchhoff diffraction formula. It is symmetric with respect to r_{01} and r_{21}, making the problem identical when the source and measurement point are interchanged.

A physical understanding of [23] can be obtained if it is rewritten as

   φ(r_0) = ∬_S F(r_{21}) e^{−ikr_{01}}/r_{01} ds    [24]

The field at P_0 is due to the sum of an infinite number of secondary Huygens sources in the aperture S. The secondary sources are point sources radiating spherical waves of the form

   F(r_{21}) e^{−ikr_{01}}/r_{01}    [25]

with amplitude F(r_{21}), defined by

   F(r_{21}) = (i/λ) A [e^{−ikr_{21}}/r_{21}] [cos(n̂, r_{01}) − cos(n̂, r_{21})]/2    [26]

The imaginary constant, i, causes the wavelets from each of these secondary sources to be phase shifted with respect to the incident wave. The obliquity factor in the amplitude [26],

   [cos(n̂, r_{01}) − cos(n̂, r_{21})]/2    [27]

causes the secondary sources to have a forward-directed radiation pattern.

If we had used the Sommerfeld Green's function, the only modification to our result would be a change in the obliquity factor to cos(n̂, r_{01}). In our discussions below we will assume that the angles are all small, allowing the obliquity factor to be replaced by 1, so that in the remaining discussion the choice of Green's function has no impact.

We will assume the source of light is at infinity, z_1 = ∞ in Figure 3, so that the aperture is illuminated by a plane wave, traveling parallel to the z-axis. With this assumption

   u′ = cos(n̂, r_{21}) ≈ −1    [28]

We will also make the paraxial approximation, which assumes that the viewing position is close to the z-axis, leading to

   u = cos(n̂, r_{01}) ≈ 1    [29]

With these assumptions eqn [24] becomes identical to the Huygens–Fresnel integral discussed in the section on Fresnel diffraction:

   Ẽ(r_0) = (i/λ) ∬_S [Ẽ(r)/R] e^{−ikR} ds    [30]
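Eqn [30] lends itself to direct numerical evaluation. The sketch below is not from the text (the function name and parameter choices are illustrative); it sums discretized Huygens wavelets over a one-dimensional slit, with the obliquity factor taken as 1 as in the small-angle assumption above:

```python
import cmath
import math

def huygens_fresnel_1d(x_obs, z, half_width, wavelength, n_src=2000):
    """Directly sum Huygens wavelets over a 1-D slit of the given
    half-width, illuminated by a unit-amplitude plane wave: a crude
    discretization of the surface integral in eqn [30]."""
    k = 2 * math.pi / wavelength
    ds = 2 * half_width / n_src
    total = 0j
    for j in range(n_src):
        x_src = -half_width + (j + 0.5) * ds       # midpoint sample
        R = math.sqrt((x_obs - x_src) ** 2 + z ** 2)
        total += cmath.exp(-1j * k * R) / R * ds   # spherical wavelet
    return (1j / wavelength) * total

# 0.2 mm slit, 500 nm light, screen 1 m away (well into the far field):
# bright on axis, nearly dark at the first zero xi = lambda*z/(2*x0)
# predicted later in the article (eqn [56]).
lam, x0, z = 500e-9, 0.1e-3, 1.0
on_axis = abs(huygens_fresnel_1d(0.0, z, x0, lam))
first_zero = abs(huygens_fresnel_1d(lam * z / (2 * x0), z, x0, lam))
```

The wavelet sum reproduces the fringe structure that the closed-form approximations below derive analytically.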
The modern interpretation of the Fresnel–Kirchhoff (or now, in the small-angle approximation, the Fresnel–Huygens) integral is to view it as a convolution integral. By considering free space to be a linear system, the result of propagation can be calculated by convolving the incident (input) wave with the impulse response of free space (our Green's function),

   (i/λ) e^{−ikR}/R    [31]

The job of calculating diffraction has only begun with the derivation of the Fresnel–Kirchhoff integral. In general, an analytic expression for the integral cannot be found because of the difficulty of performing the integration over R. There are two approximations that will allow us to obtain analytic expressions of the Fresnel–Kirchhoff integral. In both approximations, all dimensions are assumed to be large with respect to the wavelength. In one approximation, the viewing position is assumed to be far from the obstruction; the resulting diffraction is called Fraunhofer diffraction and will be discussed here. The second approximation, which leads to Fresnel diffraction, assumes that the observation point is nearer the obstruction, to be quantified below. This approximation, which is the more difficult mathematically, is discussed in the article on Diffraction: Fresnel Diffraction.

… can be approximated by plane waves. A consequence of this requirement is that the entire waveform passing through the aperture contributes to the observed diffraction.

The geometry to be used in this derivation is shown in Figure 4. The wave incident normal to the aperture, S, is a plane wave, and the objective of the calculation is to find the departure of the transmitted wave from its geometrical optical path. The calculation will provide the light distribution, transmitted by the aperture, as a function of the angle the light is deflected from the incident direction. We assume that diffraction makes only a small perturbation on the predictions of geometrical optics. The deflection angles encountered in this derivation are, therefore, small, as assumed above, and we will be able to use the paraxial approximation.

The distance from a point P, in the aperture, to the observation point P_0, of Figure 4, is

   R² = (x − ξ)² + (y − η)² + Z²    [32]

From Figure 4 we see R_0 is the distance from the center of the screen to the observation point, P_0,

   R_0² = ξ² + η² + Z²    [33]

The difference between these two vectors is
Figure 4 Geometry for Fraunhofer diffraction. Reprinted with permission from Guenther RD (1990) Modern Optics. New York:
John Wiley & Sons.
The equation for the position of point P in the aperture can be written in terms of [34]:

   r = R_0 − R = (R_0² − R²)/(R_0 + R)
     = [2(xξ + yη) − (x² + y²)]/(R_0 + R)    [35]

The reciprocal of (R_0 + R) can be written as

   1/(R_0 + R) = 1/(2R_0 + R − R_0) = (1/2R_0)[1 + (R − R_0)/(2R_0)]^{−1}    [36]

Now using [36]

   R_0 − R = [(xξ + yη)/R_0 − (x² + y²)/(2R_0)][1 − (R_0 − R)/(2R_0)]^{−1}
           = [(xξ + yη)/R_0 − (x² + y²)/(2R_0)][1 − r/(2R_0)]^{−1}    [37]

If the diffraction integral [30] is to have a finite (nonzero) value, then

   k|R_0 − R| ≪ kR_0    [38]

This requirement insures that all the Huygens wavelets, produced over the aperture from the center out to the position r, will have similar phases and will interfere to produce a nonzero amplitude at P_0. This requirement results in

   [1 − r/(2R_0)]^{−1} ≈ 1    [39]

From this equation we are tempted to state that the aperture dimension must be small relative to the observation distance. However, for the exponent to provide a finite contribution in the integral, it is kr that is small, not the aperture dimension, r.

Using the approximation for (R_0 − R), we obtain the diffraction integral

   Ẽ_P = (ia e^{−ikR_0}/λR_0) ∬_S f(x, y) e^{ik[(xξ+yη)/R_0 − (x²+y²)/(2R_0)]} dx dy    [40]

The change in the amplitude of the wave due to the change in R, as we move across the aperture, is neglected, allowing R in the denominator of the Huygens–Fresnel integral to be replaced by R_0 and moved outside of the integral. We have introduced the complex transmission function, f(x, y), of the aperture to allow a very general aperture to be treated.

If the aperture function described the variation in absorption of the aperture as a function of position, as would be produced by a photographic negative, then f(x, y) would be a real function. If the aperture function described the variation in transmission of a biological sample, it might be entirely imaginary.

The argument of the exponent in [40] is

   ik[(xξ + yη)/R_0 − (x² + y²)/(2R_0)] = i2π[(xξ + yη)/(λR_0) − (x² + y²)/(2λR_0)]    [41]

If the observation point, P_0, is far from the screen, we can neglect the second term and treat the phase variation across the aperture as a linear function of position. This is equivalent to assuming that the diffracted wave is a collection of plane waves. Mathematically, the second term in [41] can be neglected if

   (x² + y²)/(2λR_0) ≪ 1    [42]

This is called the far-field approximation and the theory yields Fraunhofer diffraction. If the quadratic term of [41] is on the order of 2π then the fraction is

   (x² + y²)/(2λR_0) ≈ O(1)    [43]

and we must retain the quadratic term, and the theory yields Fresnel diffraction.

We define spatial frequencies in the x and y direction by

   ω_x = (2π/λ) sin θ_x = −2πξ/(λR_0)
   ω_y = (2π/λ) sin θ_y = −2πη/(λR_0)    [44]

We define the spatial frequencies with a negative sign to allow equations involving spatial frequencies to have the same form as those involving temporal frequencies. The negative sign is required because ωt and k·r appear in the phase of the wave with opposite signs, and we want the Fourier transform in space and in time to have the same form.

With the variables defined in [44], the integral becomes

   Ẽ_P(ω_x, ω_y) = (ia e^{−ikR_0}/λR_0) ∬_S f(x, y) e^{−i(ω_x x + ω_y y)} dx dy    [45]
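The far-field criterion [42] is easy to turn into a back-of-the-envelope calculator. A minimal sketch, not from the text (the function names and the 0.01 threshold are illustrative assumptions):

```python
import math

def fraunhofer_ok(aperture_radius, wavelength, R0, threshold=0.01):
    """Far-field test of eqn [42]: the quadratic phase term
    (x^2 + y^2)/(2*lam*R0), evaluated at the aperture edge, must be
    small; the threshold value is a judgment call."""
    return aperture_radius ** 2 / (2 * wavelength * R0) < threshold

def min_fraunhofer_distance(aperture_radius, wavelength, threshold=0.01):
    """Smallest R0 satisfying the same criterion."""
    return aperture_radius ** 2 / (2 * wavelength * threshold)

# A 1 mm-radius hole in 500 nm light: the far field starts ~100 m out,
# which is why in practice a lens is often used to bring the Fraunhofer
# pattern to its focal plane.
R_min = min_fraunhofer_distance(1e-3, 500e-9)
```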
The result of our derivation is that the Fraunhofer diffraction field, Ẽ_P, can be calculated by performing the two-dimensional Fourier transform of the aperture's transmission function.

Definition of Fourier Transform

In this discussion the function F(ω) is defined as the Fourier transform of f(t):

   F{f(t)} ≡ F(ω) ≡ ∫_{−∞}^{∞} f(t) e^{−iωt} dt    [46]

The transformation from a temporal to a frequency representation given by [46] does not destroy information; thus, the inverse transform can also be defined:

   F^{−1}{F(ω)} = f(t) = (1/2π) ∫_{−∞}^{∞} F(ω) e^{iωt} dω    [47]

By assuming that the illumination of an aperture is a plane wave and that the observation position is in the far field, the diffraction of an aperture is found to be given by the Fourier transform of the function describing the amplitude transmission of the aperture. The amplitude transmission of the aperture, f(x, y), may thus be interpreted as the superposition of mutually coherent plane waves.

Diffraction by a Rectangular Aperture

We will now use Fourier transform theory to calculate the Fraunhofer diffraction pattern from a rectangular slit and point out the reciprocal relationship between the size of the diffraction pattern and the size of the aperture.

Consider a rectangular aperture with a transmission function given by

   f(x, y) = 1,  |x| ≤ x_0, |y| ≤ y_0
   f(x, y) = 0,  all other x and y    [48]

Because the aperture is two-dimensional, we need to apply a two-dimensional Fourier transform, but it is not difficult because the amplitude transmission function is separable in x and y. The diffraction amplitude distribution from the rectangular slit is simply one-dimensional transforms carried out individually for the x and y dimensions:

   Ẽ_P = (ia e^{−ikR_0}/λR_0) ∫_{−x_0}^{x_0} f(x) e^{−iω_x x} dx ∫_{−y_0}^{y_0} f(y) e^{−iω_y y} dy    [49]

Since both f(x) and f(y) are defined as symmetric functions, we need only calculate the cosine transforms to obtain the diffracted field. To calculate the Fourier transform defined by [46] we can rewrite the transform as

   F(ω) = ∫_{−∞}^{∞} f(t) cos ωt dt − i ∫_{−∞}^{∞} f(t) sin ωt dt    [50]

If f(t) is a real, even function then the Fourier transform can be obtained by simply calculating the cosine transform:

   ∫_{−∞}^{∞} f(t) cos ωt dt    [51]

   Ẽ_P = (i4x_0 y_0 a/λR_0) e^{−ikR_0} [sin(ω_x x_0)/(ω_x x_0)] [sin(ω_y y_0)/(ω_y y_0)]    [52]

The intensity distribution of the Fraunhofer diffraction produced by the rectangular aperture is

   I_P = I_0 [sin²(ω_x x_0)/(ω_x x_0)²] [sin²(ω_y y_0)/(ω_y y_0)²]    [53]

The maximum intensities in the x and y directions occur at

   ω_x x_0 = ω_y y_0 = 0    [54]

The area of this rectangular aperture is defined as A = 4x_0 y_0, resulting in an expression for the maximum intensity of

   I_0 = |i4x_0 y_0 a e^{−ikR_0}/(λR_0)|² = A²a²/(λ²R_0²)    [55]

The minima of [53] occur when ω_x x_0 = nπ or when ω_y y_0 = mπ. The location of the zeros can be specified as a dimension in the observation plane or, using the paraxial approximation, in terms of the angle defined by [44]:

   sin θ_x ≈ θ_x ≈ ξ/R_0 = nλ/(2x_0)
   sin θ_y ≈ θ_y ≈ η/R_0 = mλ/(2y_0)    [56]

The dimensions of the diffraction pattern are characterized by the location of the first zero, i.e., when n = m = 1, and are given by the observation plane coordinates, ξ and η. The dimensions of the diffraction pattern are inversely proportional to the dimensions of the aperture. As the aperture dimension expands, the width of the diffraction pattern decreases until, in the limit of an infinitely wide aperture, the diffraction pattern becomes a delta function.
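The intensity pattern [53] and its zero condition can be sketched in a few lines (not from the text; names are illustrative):

```python
import math

def sinc2(u):
    """Single-slit intensity factor sin^2(u)/u^2 from eqn [53]."""
    return 1.0 if u == 0 else (math.sin(u) / u) ** 2

def rect_aperture_intensity(wx, wy, x0, y0):
    """Relative Fraunhofer intensity of a 2*x0 by 2*y0 rectangular
    aperture, eqn [53], normalized so the central maximum equals 1."""
    return sinc2(wx * x0) * sinc2(wy * y0)

# Central maximum at wx*x0 = wy*y0 = 0 (eqn [54]); zeros wherever
# wx*x0 or wy*y0 is a nonzero multiple of pi (minima condition of [53]).
x0 = 1.0
center = rect_aperture_intensity(0.0, 0.0, x0, 1.0)
first_zero = rect_aperture_intensity(math.pi / x0, 0.0, x0, 1.0)
```

Scaling x0 up compresses the pattern in ω_x, illustrating the inverse relationship stated above.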
Figure 5 The amplitude of the wave diffracted by a rectangular slit. This sinc function describes the light wave’s amplitude that would
exist in both the x and y directions. The coordinate would have to be scaled by the dimension of the slit. Reprinted with permission from
Guenther RD (1990) Modern Optics. New York: John Wiley & Sons.
Figure 5 is a theoretical plot of the amplitude of the wave diffracted by a rectangular slit.

Diffraction from a Circular Aperture

To obtain the diffraction pattern from a circular aperture we first convert from the rectangular coordinate system we have used to a cylindrical coordinate system. The new cylindrical geometry is shown in Figure 6. At the aperture plane

   x = s cos φ,   y = s sin φ
   f(x, y) = f(s, φ),   dx dy = s ds dφ    [57]

At the observation plane

   ξ = ρ cos θ,   η = ρ sin θ    [58]

In the new, cylindrical, coordinate system at the observation plane, the spatial frequencies are written as

   ω_x = −kξ/R_0 = −(kρ/R_0) cos θ
   ω_y = −kη/R_0 = −(kρ/R_0) sin θ    [59]

From Figure 6 we see that the observation point, P, can be defined in terms of the angle ψ, where

   sin ψ = ρ/R_0    [60]

This allows an angular representation of the size of the diffraction pattern if it is desired.

Using [57] and [59], we may write

   ω_x x + ω_y y = −(ksρ/R_0)(cos θ cos φ + sin θ sin φ) = −(ksρ/R_0) cos(θ − φ)    [61]

The Huygens–Fresnel integral for Fraunhofer diffraction can now be written in terms of cylindrical coordinates:

   Ẽ_P = (ia e^{−ikR_0}/λR_0) ∫_0^{a/2} ∫_0^{2π} f(s, φ) e^{−ik(sρ/R_0) cos(θ−φ)} s ds dφ    [62]

We will use this equation to calculate the diffraction amplitude from a clear aperture of diameter a, defined by the equation

   f(s, φ) = 1,  s ≤ a/2, all φ
   f(s, φ) = 0,  s > a/2    [63]

The symmetry of this problem is such that

   f(s, φ) = f(s)g(φ) = f(s)    [64]

   F{f(s, φ)} = F{f(s)} = F(ρ)    [65]
Figure 6 Geometry for calculation of Fraunhofer diffraction from a circular aperture. Reprinted with permission from Guenther RD
(1990) Modern Optics. New York: John Wiley & Sons.
The second integral in [67] belongs to a class of functions called the Bessel function, defined by the integral

   J_n(sρ) = (1/2π) ∫_0^{2π} e^{i[sρ sin φ − nφ]} dφ    [68]

In [67] the integral corresponds to the n = 0, zero-order, Bessel function. Using this definition, we can rewrite [67] as

   F(ρ) = ∫_0^{∞} f(s) J_0(sρ) s ds    [69]

This transform is called the Fourier–Bessel transform or the Hankel zero-order transform.

Using these definitions, the transform of the aperture function, f(s), can be evaluated. A plot of the function contained within the bracket of [73] is given in Figure 7. If we define

   u = kaρ/(2R_0)    [74]

then the spatial distribution of intensity in the diffraction pattern can be written in a form known as the Airy formula,

   I = I_0 [2J_1(u)/u]²    [75]

where we have defined

   I_0 = (aA/λR_0)²    [76]
Figure 7 Amplitude of a wave diffracted by a circular aperture. The observed light distribution is constructed by rotating this Bessel
function around the optical axis. Reprinted with permission from Guenther RD (1990) Modern Optics. New York: John Wiley & Sons.
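The Airy formula [75] can be checked numerically using the integral definition of the Bessel function itself, eqn [68]. The sketch below is illustrative, not from the text; it evaluates J_1 by a midpoint-rule quadrature of the equivalent real integral:

```python
import math

def bessel_j(n, x, steps=2000):
    """J_n(x) via the real form of the integral definition (cf. eqn [68]):
    J_n(x) = (1/pi) * integral_0^pi cos(n*t - x*sin t) dt (midpoint rule)."""
    h = math.pi / steps
    return sum(math.cos(n * ((k + 0.5) * h) - x * math.sin((k + 0.5) * h))
               for k in range(steps)) * h / math.pi

def airy_intensity(u):
    """Airy formula, eqn [75]: I/I0 = (2*J1(u)/u)^2, u = k*a*rho/(2*R0)."""
    if u == 0:
        return 1.0  # limiting value at the center of the pattern
    return (2 * bessel_j(1, u) / u) ** 2

# First dark ring at u ~ 3.8317, the first zero of J1, which corresponds
# to the familiar sin(psi) ~ 1.22 * lambda / a for an aperture of diameter a.
dark_ring = airy_intensity(3.8317)
```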
The Fourier transform of a convolution of two functions is the product of the Fourier transforms of the two functions.

The aperture transmission function representing an array of apertures will be the sum of the distributions of the individual apertures, represented graphically in Figure 8 and mathematically by the summation

   Ψ(x) = Σ_{n=1}^{N} ψ(x − x_n)    [82]

The Fraunhofer diffraction from this array is F(ω_x), the Fourier transform of Ψ(x),

   F(ω_x) = ∫_{−∞}^{∞} Ψ(x) e^{−iω_x x} dx    [83]

which can be rewritten as

   F(ω_x) = Σ_{n=1}^{N} ∫_{−∞}^{∞} ψ(x − x_n) e^{−iω_x x} dx    [84]

We now make use of the fact that ψ(x − x_n) can be expressed in terms of a convolution integral. The Fourier transform of ψ(x − x_n) is, from the convolution theorem [80], the product of the Fourier transforms of the individual functions that make up the convolution:

   F(ω_x) = Σ_{n=1}^{N} F{ψ(x − α)} F{δ(α − x_n)}
          = F{ψ(x − α)} Σ_{n=1}^{N} F{δ(α − x_n)}
          = F{ψ(x − α)} F{Σ_{n=1}^{N} δ(α − x_n)}    [85]

The first transform in [85] is the diffraction pattern of the generalized aperture function and the second transform is the diffraction pattern produced by a set of point sources with the same spatial distribution as the array of identical apertures. We will call this second transform the array function.

To summarize, the array theorem states that the diffraction pattern of an array of similar apertures is given by the product of the diffraction pattern from a single aperture and the diffraction (or interference) pattern of an identically distributed array of point sources.

N Rectangular Slits

An array of N identical apertures is called a diffraction grating in optics. The Fraunhofer diffraction patterns produced by such an array have two important properties: a number of very narrow beams are produced by the array, and the beam positions are a function of the wavelength of illumination of the apertures and the relative phase of the waves radiated by each of the apertures.

Because of these properties, arrays of diffracting apertures have been used in a number of applications.

• At radio frequencies, arrays of dipole antennas are used to both radiate and receive signals in both radar and radio astronomy systems. One advantage offered by diffracting arrays at radio frequencies is that the beam produced by the array can be electrically steered by adjusting the relative phase of the individual dipoles.
• An optical realization of a two-element array of radiators is Young's two-slit experiment, and a realization of a two-element array of receivers is Michelson's stellar interferometer. Two optical implementations of arrays, containing more diffracting elements, are diffraction gratings and holograms. In nature, periodic arrays of diffracting elements are the origin of the colors observed on some invertebrates.
• Many solids are naturally arranged in three-dimensional arrays of atoms or molecules that act as diffraction gratings when illuminated by X-ray wavelengths. The resulting Fraunhofer diffraction patterns are used to analyze the ordered structure of the solids.

The array theorem can be used to calculate the diffraction pattern from N rectangular slits, each of width a and separation d. The aperture function of a single slit is equal to

   ψ(x, y) = 1,  |x| ≤ a/2, |y| ≤ y_0
   ψ(x, y) = 0,  all other x and y    [86]

The Fraunhofer diffraction pattern of this aperture has already been calculated and is given by [52]:

   F{ψ(x, y)} = K sin α/α    [87]

where the constant K and the variable α are

   K = (i2a y_0 a/λR_0) e^{−ikR_0} [sin(ω_y y_0)/(ω_y y_0)]
   α = (ka/2) sin θ_x    [88]

The array function is

   A(x) = Σ_{n=1}^{N} δ(x − x_n),  where x_n = (n − 1)d    [89]
and its Fourier transform is given by

   F{A(x)} = sin(Nβ)/sin β,   β = (kd/2) sin θ_x    [90]

The Fraunhofer diffraction pattern's intensity distribution in the x-direction is thus given by

   I_θ = I_0 [sin²α/α²] [sin²(Nβ)/sin²β]    [91]

where the first factor is the shape factor and the second is the grating factor. We have combined the variation in intensity in the y-direction into the constant I_0 because we assume that the intensity variation in the x-direction will be measured at a constant value of y.

A physical interpretation of [91] views the first factor as arising from the diffraction associated with a single slit; it is called the shape factor. The second factor arises from the interference between light from different slits; it is called the grating factor. The fine detail in the spatial light distribution of the diffraction pattern is described by the grating factor and arises from the coarse detail in the diffracting object. The coarse, overall light distribution in the diffraction pattern is described by the shape factor and arises from the fine detail in the diffracting object.

Young's Double Slit

The array theorem makes the analysis of Young's two-slit experiment a trivial exercise. This application of the array theorem will demonstrate that the interference between two slits arises naturally from an application of diffraction theory. The result of this analysis will support a previous assertion that interference describes the same physical process as diffraction and the division of the two subjects is an arbitrary one.

The intensity of the diffraction pattern from two slits is obtained from [91] by setting N = 2:

   I_θ = I_0 [sin²α/α²] cos²β    [92]

The sinc function describes the energy distribution of the overall diffraction pattern, while the cosine function describes the energy distribution created by interference between the light waves from the two slits. Physically, α is a measure of the phase difference between points in one slit and β is a measure of the phase difference between similar points in the two slits. Zeros in the diffraction intensity occur whenever α = nπ or whenever β = (2n + 1)π/2. Figure 9 shows the interference maxima, from the grating factor, under the central diffraction peak, described by the shape factor. The number of interference maxima contained under the central maximum is given by

   2d/a − 1    [93]

In Figure 9 three different slit spacings are shown, with the ratio d/a equal to 3, 6, and 9, respectively.

The Diffraction Grating

In this section we will use the array theorem to derive the diffraction intensity distribution of a large number of identical apertures. We will discover that the positions of the principal maxima are a function of the illuminating wavelength. This functional relationship has led to the application of a diffraction grating to wavelength measurements.

The diffraction grating normally used for wavelength measurements is not a large number of diffracting apertures but rather a large number of reflecting grooves cut in a surface such as gold or aluminum. The theory to be derived also applies to these reflection gratings but a modification must be made to the theory. The shape of the grooves in the reflection grating can be used to control the fraction of light diffracted into a principal maximum and we will examine how to take this into account. A grating whose groove shape has been controlled to enhance the energy contained in a particular principal maximum is called a blazed grating. The use of special groove shapes is equivalent to the modification of the phase of individual elements of an antenna array at radio frequencies.

The construction of a diffraction grating for use in an optical device for measuring wavelength was first suggested by David Rittenhouse, an American astronomer, in 1785, but the idea was ignored until Fraunhofer reinvented the concept in 1819. Fraunhofer's first gratings were fine wires spaced by wrapping the wires in the threads of two parallel screws. He later made gratings by cutting (ruling) grooves in gold films deposited on the surface of glass. H.A. Rowland made a number of well-designed ruling machines, which made possible the production of large-area gratings. Following a suggestion by Lord Rayleigh, Robert Williams Wood (1868–1955) developed the capability to control the shape of the individual grooves.
Figure 9 The number of interference fringes beneath the main diffraction peak of a Young’s two-slit experiment with rectangular
apertures. Reprinted with permission from Guenther RD (1990) Modern Optics. New York: John Wiley & Sons.
If N is allowed to assume values much larger than 2, the appearance of the interference fringes, predicted by the grating factor, changes from a simple sinusoidal variation to a set of narrow maxima, called principal maxima, surrounded by much smaller, secondary maxima. To calculate the diffraction pattern we must evaluate [91] when N is a large number.

Whenever Nβ = mπ, m = 0, 1, 2, …, the numerator of the second factor in [91] will be zero, leading to an intensity that is zero, I_θ = 0. The denominator of the second factor in [91] is zero when β = lπ, l = 0, 1, 2, …. If both conditions occur at the same time, the ratio of m and N is equal to an integer, m/N = l, and instead of zero intensity, we have an indeterminate value for the intensity, I_θ = 0/0. To find the actual value for the intensity, we must apply L'Hospital's rule:

   lim_{β→lπ} sin(Nβ)/sin β = lim_{β→lπ} N cos(Nβ)/cos β = N    [94]

L'Hospital's rule predicts that whenever

   β = lπ = (m/N)π    [95]

where l is an integer, a principal maximum in the intensity will occur with a value given by

   I_θP = N² I_0 sin²α/α²    [96]

Secondary maxima, much weaker than the principal maxima, occur when

   β = [(2m + 1)/(2N)]π,   m = 1, 2, …    [97]

(When m = 0, m/N is an integer, thus the first value that m can have in [97] is m = 1.) The intensity of each secondary maximum is given by

   I_θS = I_0 [sin²α/α²][sin²(Nβ)/sin²β]
        = I_0 [sin²α/α²] {sin²[(2m + 1)π/2] / sin²[(2m + 1)π/(2N)]}
        = I_0 [sin²α/α²] [1/sin((2m + 1)π/(2N))]²    [98]
The quantity (2m + 1)/(2N) is a small number for large values of N, allowing the small-angle approximation to be made:

   I_θS ≈ I_0 [sin²α/α²] [2N/(π(2m + 1))]²    [99]

The ratio of the intensity of a secondary maximum and a principal maximum is given by

   I_θS/I_θP = [2/(π(2m + 1))]²    [100]

The strongest secondary maximum occurs for m = 1 and, for large N, has an intensity that is about 4.5% of the intensity of the neighboring principal maximum.

The positions of principal maxima occur at angles specified by the grating formula

   β = (kd sin θ)/2 = lπ = (m/N)π    [101]

The angular position of the principal maxima (the diffracted angle θ_d) is given by the Bragg equation

   sin θ_d = lλ/d    [102]

where l is called the grating order. This simple relationship between the angle and the wavelength can be used to construct a device to measure wavelength, the grating spectrometer.

The model we have used to obtain this result is based on a periodic array of identical apertures. The transmission function of this array is a periodic square wave. If, for the moment, we treat the grating as infinite in size, we discover that the principal maxima in the diffraction pattern correspond to the terms of the Fourier series describing a square wave. The zero order, m = 0, corresponds to the temporal average of the periodic square wave and has an intensity proportional to the spatially averaged transmission of the grating. Because of its equivalence to the temporal average of a time-varying signal, the zero-order principal maximum is often called the dc term. The first-order principal maximum corresponds to the fundamental spatial frequency of the grating, and the higher orders correspond to the harmonics of this frequency.

The dc term provides no information about the wavelength of the illumination. Information about the wavelength of the illuminating light can only be obtained by measuring the angular position of the first or higher-order principal maxima.

Grating Spectrometer

The curves shown in Figure 10 display the separation of the principal maxima as a function of sin θ_d. The angular separation of principal maxima can be converted to a linear dimension by assuming a distance, R, from the grating to the observation plane. In the lower right-hand curve of Figure 10, a distance of 2 meters was assumed. Grating spectrometers are classified by the size in meters of R used in their design. The larger R, the easier it is to resolve wavelength differences. For example, a 1 meter spectrometer is a higher-resolution instrument than a 1/4 meter spectrometer.

The fact that the grating is finite in size causes each of the orders to have an angular width that limits the resolution with which the illuminating wavelength can be measured. To calculate the resolving power of the grating, we first determine the angular width of a principal maximum. This is accomplished by measuring the angular change of the principal maximum's position when β changes from β = lπ = mπ/N to β = (m + 1)π/N, i.e., Δβ = π/N. Using the definition of β

   β = (kd sin θ_d)/2    [103]
   Δβ = (πd cos θ_d/λ) Δθ

and the angular width is

   Δθ = λ/(Nd cos θ_d)    [104]

The derivative of the grating formula gives

   Δλ = (d/l) cos θ_d Δθ,   Δθ = lΔλ/(d cos θ_d)    [105]

Equating this spread in angle to eqn [104] yields

   lΔλ/(d cos θ_d) = λ/(Nd cos θ_d)    [106]

The resolving power of a grating is therefore

   λ/Δλ = Nl    [107]

The improvement of resolving power with N can be seen in Figure 10. A grating 2 inches wide and containing 15 000 grooves per inch would have a resolving power in second order (l = 2) of 6 × 10⁴. At a wavelength of 600 nm this grating could resolve two waves differing in wavelength by 0.01 nm.

The diffraction grating is limited by overlapping orders. If two wavelengths, λ and λ + Δλ, have successive orders that are coincident, then
Figure 10 Decrease in the width of the principal maxima of a transmission grating with an increasing number of slits. The various principal maxima are called orders, numbering from zero, at the origin, out to as large as seven in this example. Also shown is the effect of different ratios, d/a, on the number of visible orders. Reprinted with permission from Guenther RD (1990) Modern Optics. New York: John Wiley & Sons.
transmission phase grating with a uniform phase variation across the aperture of the grating is very difficult. For this reason, a second approach, based on the use of reflection gratings, is the practical solution to the problems listed above.

By tilting the reflecting surface of each groove of a reflection grating, Figure 11, the position of the shape factor's maximum can be controlled. Problems 1, 3, and 4 are eliminated because the shape factor maximum is moved from the optical axis out to some angle with respect to the axis. The use of reflection gratings also removes Problem 2 because all of the incident light is reflected by the grating.

Robert Wood, in 1910, developed the technique of producing grooves of a desired shape in a reflective grating by shaping the diamond tool used to cut the grooves. Gratings, with grooves shaped to enhance their performance at a particular wavelength, are said to be blazed for that wavelength. The physical properties on which blazed gratings are based can be understood by using Figure 11. The groove faces can be treated as an array of mirror surfaces. The normal to each of the groove faces makes an angle θB with the normal to the grating surface. We can measure the angle of incidence and the angle of diffraction with respect to the grating normal or with respect to the groove normal, as shown in Figure 11. From Figure 11, we can write a relationship between the angles

θi = φi − θB,    −θd = −φd + θB    [110]

(The sign convention used defines positive angles as those measured in a counterclockwise rotation from the normal to the surface. Therefore, θB is a negative angle.) The blaze angle provides an extra degree of freedom that will allow independent adjustment of the angular location of the principal maxima of the grating factor and the zero-order, single-aperture, diffraction maximum. To see how this is accomplished, we must determine, first, the effect of off-axis illumination of a diffraction grating.

Off-axis illumination is easy to incorporate into the equation for the diffraction intensity from an array. To include the effect of an off-axis source, the phase of the illuminating wave is modified by changing the incident illumination from a plane wave of amplitude, E, traveling parallel to the optical axis, to a plane wave with the same amplitude, traveling at an angle θi to the optical axis: Ẽ e^{−ikx sin θi}. (Because we are interested only in the effects in a plane normal to the direction of propagation, we ignore the phase associated with propagation along the z-direction, kz cos θi.) The off-axis illumination results in a modification of the parameter for single-aperture diffraction from

α = (ka/2) sin θd    [111]

to

α = (ka/2)(sin θi + sin θd)    [112]

and for multiple aperture interference from

β = (kd/2) sin θd    [113]

to

β = (kd/2)(sin θi + sin θd)    [114]

The zero-order, single-aperture, diffraction peak occurs when α = 0. If we measure the angles with respect to the groove face, this occurs when

α = (ka/2)(sin φi + sin φd) = 0    [115]

The angles are therefore related by

sin φi = −sin φd,    φi = −φd    [116]

We see that the single-aperture, diffraction maximum (the shape factor's maximum) occurs at the same angle that reflection from the groove faces occurs. We can write this result in terms of the angles measured with respect to the grating normal

θi = −(θd + 2θB)    [117]

Figure 11 Geometry for a blazed reflection grating. Reprinted with permission from Guenther RD (1990) Modern Optics. New York: John Wiley & Sons.

The blaze condition requires the single aperture diffraction maximum to occur at the lth principal
DIFFRACTION / Fresnel Diffraction 257
Fresnel Diffraction
B D Guenther, Duke University, Durham, NC, USA
© 2005, Elsevier Ltd. All Rights Reserved.

Fresnel was a civil engineer who pursued optics as a hobby, after a long day of road construction. Based on his own very high-quality observations of diffraction, Fresnel used the wave propagation concept developed by Christiaan Huygens (1629–1695) to develop a theoretical explanation of diffraction. We will use a descriptive approach to obtain the Huygens–Fresnel integral by assuming an aperture can be described by N pinholes which act as sources for Huygens' wavelets. The interference between these sources will lead to the Huygens–Fresnel integral for diffraction.

Huygens' principle views wavefronts as the product of wavelets from various luminous points acting together. To apply Huygens' principle to the propagation of light through an aperture of arbitrary shape, we need to develop a mathematical description of the field from an array of Huygens' sources filling the aperture. We will begin by obtaining the field from a pinhole that is illuminated by a plane wave,

Ẽi(r, t) = Ẽi(r) e^{iωt}

Following the lead of Fresnel, we will use the theory of interference to combine the fields from two pinholes and then generalize to N pinholes. Finally, by letting the areas of each pinhole approach an infinitesimal value, we will construct an arbitrary aperture of infinitesimal pinholes. The result will be the Huygens–Fresnel integral.

We know that the wave obtained after propagation through an aperture must be a solution of the wave equation,

∇²Ẽ = με ∂²Ẽ/∂t²

We will be interested only in the spatial variation of the wave so we need only look for solutions of the Helmholtz equation

(∇² + k²)Ẽ = 0

The problem is further simplified by replacing this vector equation with a scalar equation,

(∇² + k²)Ẽ(x, y, z) = 0

This replacement is proper for those cases where nE(x, y, z) [where n is a unit vector] is a solution of the vector Helmholtz equation. In general, we cannot substitute nE for the electric field E because of Maxwell's equation

∇·E = 0
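The claim used below, that a spherical wavelet satisfies the Helmholtz equation, can be checked symbolically. A sketch (taking the amplitude as 1 and the phase constant as zero):

```python
from sympy import symbols, sqrt, exp, I, diff, simplify

x, y, z, k = symbols("x y z k", real=True, positive=True)
r = sqrt(x**2 + y**2 + z**2)
E = exp(-I * k * r) / r    # spherical wave e^{-ikr}/r

# Cartesian Laplacian of the wave
laplacian = diff(E, x, 2) + diff(E, y, 2) + diff(E, z, 2)

# (nabla^2 + k^2) E vanishes everywhere away from the source point r = 0
residual = simplify(laplacian + k**2 * E)
```

The same check fails for a plane wave multiplied by 1/r, which is one way to see that the 1/r amplitude falloff and the spherical phase fronts belong together.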
Rather than working with the magnitude of the electric field, we should use the scalar amplitude of the vector potential. We will neglect this complication and assume the scalar, E, is a single component of the vector field, E. A complete solution would involve a scalar solution for each component of E.

The pinhole is illuminated by a plane wave, and the wave that leaves the pinhole will be a spherical wave which is written in complex notation as

Ẽ(r) e^{iωt} = A (e^{−iδ} e^{−ik·r}/r) e^{iωt}

The complex amplitude

Ẽ(r) = A e^{−iδ} e^{−ik·r}/r    [1]

is a solution of the Helmholtz equation. The field at P0, in Figure 1, from two pinholes: one at P1, located a distance r01 = |r0 − r1| from P0, and one at P2, located a distance r02 = |r0 − r2| from P0, is a generalization of Young's interference. The complex amplitude is

Ẽ(r0) = (A1/r01) e^{−ik·r01} + (A2/r02) e^{−ik·r02}    [2]

We have incorporated the phases δ1 and δ2 into the constants A1 and A2 to simplify the equations.

The light emitted from the pinholes is due to a wave, Ei, incident onto the screen from the left. The units of Ei are per unit area so to obtain the amount of light passing through the pinholes, we must multiply Ei by the areas of the pinholes. If Δs1 and Δs2 are the areas of the two pinholes, respectively, then

A1 ∝ Ẽi(r1)Δs1,    A2 ∝ Ẽi(r2)Δs2

Ẽ(r0) = C1 (Ẽi(r1)/r01) e^{−ik·r01} Δs1 + C2 (Ẽi(r2)/r02) e^{−ik·r02} Δs2    [3]

where Ci is the constant of proportionality. The constant will depend on the angle that r0i makes with the normal to Δsi. This geometrical dependence arises from the fact that the apparent area of the pinhole decreases as the observation angle approaches 90°. We can generalize [3] to N pinholes

Ẽ(r0) = Σ_{j=1}^{N} Cj (Ẽi(rj)/r0j) e^{−ik·r0j} Δsj    [4]

The pinhole's diameter is assumed to be small compared to the distances to the viewing position but large compared to a wavelength. In the limit as Δsj goes to zero, the pinhole becomes a Huygens source. By letting N become large, we can fill the aperture with these infinitesimal Huygens' sources and convert the summation to an integral.

It is in this way that we obtain the complex amplitude, at the point P0, see Figure 2, from a wave exiting an aperture, S0, by integrating over the area of the aperture. We replace r0j, in [4], by R, the position of P1, the infinitesimal Huygens' source of area, ds, measured with respect to the observation point, P0. We also replace rj by r, the position of the infinitesimal area, ds, with respect to the origin of the coordinate system. The discrete summation [4] becomes the integral

Ẽ(r0) = ∫∫_A C(r) (Ẽi(r)/R) e^{−ik·R} ds    [5]

This is the Fresnel integral. The variable C(r) depends upon θ, the angle between n, the unit vector normal to the aperture, and R, shown in Figure 2. We now need to determine how to treat C(r).

The Obliquity Factor

When using Huygens' principle, a problem arises with the spherical wavelet produced by each
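The pinhole sum of eqn [4] can be evaluated directly on a computer. A sketch (hypothetical geometry; the obliquity factor and the overall proportionality constant are dropped, so only relative magnitudes are meaningful) that fills a circular aperture with grid-cell "pinholes" and sums their spherical wavelets at an on-axis point:

```python
import numpy as np

def onaxis_field(aperture_radius, Z, wavelength, ngrid=400):
    """Discrete version of eqn [4]: each grid cell inside the aperture is a
    pinhole radiating exp(-ikR)/R toward an axial point a distance Z away."""
    k = 2.0 * np.pi / wavelength
    xs = np.linspace(-aperture_radius, aperture_radius, ngrid)
    X, Y = np.meshgrid(xs, xs)
    ds = (xs[1] - xs[0]) ** 2                  # pinhole area
    inside = X**2 + Y**2 <= aperture_radius**2
    R = np.sqrt(Z**2 + X**2 + Y**2)            # pinhole-to-observation distance
    wavelets = np.exp(-1j * k * R) / R * ds
    return wavelets[inside].sum()

lam, Z = 500e-9, 1.0
r1 = np.sqrt(Z * lam)                        # radius exposing one half-period zone
E1 = onaxis_field(r1, Z, lam)                # one zone: wavelets add constructively
E2 = onaxis_field(np.sqrt(2) * r1, Z, lam)   # two zones: adjacent zones cancel
```

|E2| comes out far smaller than |E1|, anticipating the half-period-zone cancellation discussed later in this article.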
The aperture is irregular in shape, at least on the scale of a wavelength, thus in general, kRM will vary over many multiples of 2π as we integrate around the aperture. For this reason, we should be able to ignore the second integral in [13] if we confine our attention to the light distribution on the z-axis. After neglecting the second term, we are left with only the geometrical optics component of [13]:

Ẽ(z0) = 2πC (a e^{−ikz0}/ik)    [14]

From the geometry of Figure 6, the two paths from the source, S, to the observation point, P0, are

R1 + R2 = √(z1² + (x1 + b)²) + √(z2² + (x2 + b)²)

R′1 + R′2 = √(z1² + x1²) + √(z2² + x2²)

By assuming that the aperture, of width b, is small compared to the distances z1 and z2, the difference between these distances, Δ, can be rewritten as a binomial expansion of a square root
Figure 7 Geometry for Fresnel diffraction. Reprinted with permission from Guenther RD (1990) Modern Optics. New York: John Wiley
& Sons.
The geometry of Figure 7 allows the distances R and R′ to be written as

R² = (x − ξ)² + (y − η)² + Z²    [20]

R′² = (xs − x)² + (ys − y)² + Z′²

To solve these expressions for R and R′, we apply the binomial expansion [17] and retain the first two terms:

R + R′ ≈ Z + Z′ + (x − ξ)²/2Z + (xs − x)²/2Z′ + (y − η)²/2Z + (ys − y)²/2Z′    [21]

In the denominator of [19], R is replaced by Z and R′ by Z′. (By making this replacement, we are implicitly assuming that the amplitudes of the spherical waves are not modified by the differences in propagation distances over the area of integration. This is a reasonable approximation because the two propagation distances are within a few hundred wavelengths of each other.) In the exponent, we must use [21], because the phase changes by a large fraction of a wavelength, as we move from point to point in the aperture.

If the wave, incident on the aperture, were a plane wave we can assume R′ = ∞ and [19] becomes

ẼP0 = (ia e^{−ikZ}/λZ) ∫∫_S f(x, y) e^{−(ik/2Z)[(x−ξ)² + (y−η)²]} dx dy    [22]

The physical interpretation of [22] states that when a plane wave illuminates the obstruction, the field at point P0 is a spherical wave, originating at the aperture, a distance Z away from P0:

e^{−ikZ}/Z

The amplitude and phase of this spherical wave are modified by the integral containing a quadratic phase dependent on the obstruction's spatial coordinates. By defining three new parameters,

ρ = ZZ′/(Z + Z′)  or  1/ρ = 1/Z + 1/Z′    [23a]

x0 = (Z′ξ + Zxs)/(Z + Z′)    [23b]

y0 = (Z′η + Zys)/(Z + Z′)    [23c]

the more general expression for Fresnel diffraction of a spherical wave can be placed in the same format as [22]. The newly defined parameters x0 and y0 are the coordinates, in the aperture plane, of the stationary point, S. The parameters in [23] can be used to express, after some manipulation, the spatial dependence of the phase, in the integrand of [19]

R + R′ = Z + Z′ + [(ξ − xs)² + (η − ys)²]/2(Z + Z′) + [(x − x0)² + (y − y0)²]/2ρ

A further simplification to the Fresnel diffraction integral can be made by obtaining an expression for the distance, D, between the source and the observation point:

D = Z + Z′ + [(ξ − xs)² + (η − ys)²]/2(Z + Z′)    [24]
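The "after some manipulation" regrouping of the phase can be verified symbolically. A sketch (sympy, one transverse dimension for brevity) checking that the x-terms of eqn [21] equal the regrouped x-terms built from [23a] and [23b]:

```python
from sympy import symbols, cancel

x, xi, xs = symbols("x xi x_s", real=True)
Z, Zp = symbols("Z Zp", positive=True)   # Zp stands for Z'

rho = Z * Zp / (Z + Zp)                  # eqn [23a]
x0 = (Zp * xi + Z * xs) / (Z + Zp)       # eqn [23b]

lhs = (x - xi) ** 2 / (2 * Z) + (xs - x) ** 2 / (2 * Zp)            # from eqn [21]
rhs = (xi - xs) ** 2 / (2 * (Z + Zp)) + (x - x0) ** 2 / (2 * rho)   # regrouped form
residual = cancel(lhs - rhs)
```

The identity is exact, so the Fresnel approximation enters only through the binomial expansion of [21], not through this regrouping.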
We use this definition of D to also write

1/ZZ′ = [(Z + Z′)/ZZ′][1/(Z + Z′)] ≈ 1/ρD

Using the parameters we have just defined, we may rewrite [19] as

ẼP0 = (ia e^{−ikD}/λρD) ∫∫ f(x, y) e^{−(ik/2ρ)[(x−x0)² + (y−y0)²]} dx dy    [25]

With the use of the variables defined in [23], the physical significance of the general expression of the Huygens–Fresnel integral can be understood in an equivalent fashion as was [22]. At point P0, a spherical wave

e^{−ikD}/D

originating at the source, a distance D away, is observed, if no obstruction is present. Because of the obstruction, the amplitude and phase of the spherical wave are modified by the integral in [25]. This correction to the predictions of geometrical optics is called Fresnel diffraction.

Mathematically, [25] is an application of the method of stationary phase, a technique developed in 1887 by Lord Kelvin to calculate the form of a boat's wake. The integration is nonzero only in the region of the critical point we have labeled S. Physically, the light distribution, at the observation point, is due to wavelets from the region around S. The phase variations of light, coming from other regions in the aperture, are so rapid that the value of the integral over those spatial coordinates is zero. The calculation of the integral for Fresnel diffraction is complicated because, if the observation point is moved, a new integration around a different stationary point in the aperture must be performed.

Rectangular Apertures

If the aperture function, f(x, y), is separable in the spatial coordinates of the aperture, then we can rewrite [25] as

ẼP0 = (ia e^{−ikD}/λρD) ∫_{−∞}^{∞} f(x) e^{−ig(x)} dx ∫_{−∞}^{∞} f(y) e^{−ig(y)} dy

ẼP0 = A[C(x) − iS(x)][C(y) − iS(y)]

A represents the spherical wave from the source, a distance D away,

A = (ia/2D) e^{−ikD}

If we treat the aperture function as a simple constant, C and S are integrals of the form

C(x) = ∫_{x1}^{x2} cos[g(x)] dx    [26a]

S(x) = ∫_{x1}^{x2} sin[g(x)] dx    [26b]

The integrals, C(x) and S(x), have been evaluated numerically and are found in tables of Fresnel integrals. To use the tabulated values for the integrals, [26] must be written in a general form

C(w) = ∫_0^w cos(πu²/2) du    [27]

S(w) = ∫_0^w sin(πu²/2) du    [28]

The variable u is an aperture coordinate, measured relative to the stationary point, S(x0, y0), in units of √(λρ/2):

u = √(2/λρ) (x − x0)  or  u = √(2/λρ) (y − y0)    [29]

The parameter w in [27] and [28] specifies the location of the aperture edge relative to the stationary point S. The parameter w is calculated through the use of [29] with x and y replaced by the coordinates of the aperture edge.

The Cornu Spiral

The plot of S(w) versus C(w), shown in Figure 8, is called the Cornu spiral in honor of M. Alfred Cornu (1841–1902) who was the first to use this plot for graphical evaluation of the Fresnel integrals.

To use the Cornu spiral, the limits w1 and w2 of the aperture are located along the arc of the spiral. The length of the straight line segment drawn from w1 to w2 gives the magnitude of the integral. For example, if there were no aperture present, then, for the x-dimension, w1 = −∞ and w2 = ∞. The length of the line segment from the point (−1/2, −1/2) to (1/2, 1/2) would be the value of EP0, i.e., √2. An identical value is obtained for the y-dimension, so that

ẼP0 = 2A = a e^{−ikD}/D
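The tabulated integrals [27] and [28] are available in scipy: scipy.special.fresnel(w) returns the pair (S(w), C(w)) with exactly this convention. A sketch reproducing the no-aperture chord of the Cornu spiral:

```python
import numpy as np
from scipy.special import fresnel    # fresnel(w) -> (S(w), C(w))

# The spiral endpoints: (C, S) -> (1/2, 1/2) as w -> +inf and
# (-1/2, -1/2) as w -> -inf; w = 1e6 is effectively infinite here.
S2, C2 = fresnel(1e6)
S1, C1 = fresnel(-1e6)

# chord from w1 = -inf to w2 = +inf: the sqrt(2) quoted in the text
chord = np.hypot(C2 - C1, S2 - S1)
```

The chord length, not the arc length, is what matters: the arc length of the spiral is infinite, while the chord converges to √2.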
This is the spherical wave that is expected when no aperture is present.

As an example of the application of the Fresnel integrals, assume the diffracting aperture is an infinitely long straight edge reducing the problem to one dimension:

f(x, y) = 1, x ≥ x1;  0, x < x1

The value of EP0 in the y-dimension is, from above, √2 since w1 = −∞ and w2 = ∞. In the x-dimension, the value of w at the edge is

w_{x1} = √(2/λρ) (x1 − x0)

We will assume that the straight edge blocks the negative half-plane so that x1 = 0. The straight edge is treated as an infinitely wide slit with one edge located at infinity so that w2 = ∞ in the upper half-plane. When the observation point is moved, the coordinate x0 of S changes and the origin of the aperture's coordinate system moves. New values for the w's must therefore be calculated for each observation point.

Figure 9 shows the assumed geometry. A set of observation points, P1–P5, on the observation screen are selected. The origin of the coordinate system in the aperture plane is relocated to a new position (given by the position of S) when a new observation point is selected. The value of w1, the position of the edge with respect to S, must be recalculated for each observation point. The distance from the origin to the straight edge, w1, is positive for P1 and P2, zero for P3, and negative for P4 and P5.

Figure 8 shows the geometrical method used to calculate the intensity values at each of the observation points in Figure 9 using the Cornu spiral. The numbers labeling the straight line segments in Figure 8 are associated with the labels of the observation points in Figure 9.

Figure 8 Cornu spiral obtained by plotting the values for C(w) versus S(w) from a table of Fresnel integrals. The numbers indicated along the arc of the spiral are values of w. The points 1–5 correspond to observation points shown in Figure 9 and are used to calculate the light intensity diffracted by a straight edge. Reprinted with permission from Guenther RD (1990) Modern Optics. New York: John Wiley & Sons.

Figure 9 A geometrical construct to determine w1 for the stationary point S associated with five different observation points. The values of w are then used in Figure 8 to calculate the intensity of light diffracted around a straight edge. Reprinted with permission from Guenther RD (1990) Modern Optics. New York: John Wiley & Sons.

To obtain an accurate calculation of Fresnel diffraction from a straight edge, a table of Fresnel integrals provides the input to the following equation

IP = (I0/2){[1/2 − C(w1)]² + [1/2 − S(w1)]²}

where I0 = 2A². Table 1 shows the values extracted from the table of Fresnel integrals to find the relative intensity at various observation points.

Table 1 Fresnel integrals for straight edge

w1      C(w1)     S(w1)     IP/I0    Pj
∞       0.5       0.5       0
2.0     0.4882    0.3434    0.01
1.5     0.4453    0.6975    0.021    P1
1.0     0.7799    0.4383    0.04
0.5     0.4923    0.0647    0.09     P2
0       0         0         0.25     P3
−0.5    −0.4923   −0.0647   0.65
−1.0    −0.7799   −0.4383   1.26     P4
−1.2    −0.7154   −0.6234   1.37
−1.5    −0.4453   −0.6975   1.16
−2.0    −0.4882   −0.3434   0.84     P5
−2.5    −0.4574   −0.6192   1.08
−∞      −0.5      −0.5      1.0

The result obtained by using either method for calculating the light distribution in the observation plane, due to the straight edge in Figure 9, is plotted in Figure 10. The relative intensities at the observation points
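The straight-edge table can be regenerated with scipy rather than read from printed tables. A sketch (normalized so that IP/I0 tends to 1 deep in the illuminated region, i.e., IP/I0 = (1/2){[1/2 − C(w1)]² + [1/2 − S(w1)]²}):

```python
from scipy.special import fresnel    # fresnel(w) -> (S(w), C(w))

def edge_relative_intensity(w1):
    """I_P/I_0 for Fresnel diffraction by a straight edge."""
    S1, C1 = fresnel(w1)
    return 0.5 * ((0.5 - C1) ** 2 + (0.5 - S1) ** 2)

# a few table entries: w1 = 1.5 (shadow side), 0 (geometrical edge),
# -1.0 and -1.2 (illuminated side, near the strongest fringe)
vals = {w: float(edge_relative_intensity(w)) for w in (1.5, 0.0, -1.0, -1.2)}
```

The value 0.25 at the geometrical edge (one quarter of the unobstructed intensity) and the fringe maximum near w1 ≈ −1.2 both agree with the tabulated values.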
Fresnel Zones

Fresnel used a geometrical construction of zones to evaluate the Huygens–Fresnel integral. The Fresnel zone is a mathematical construct, serving the role of a Huygens source in the description of wave propagation. Assume that at time t, a spherical wavefront from a source at P1 has a radius of R′. To determine the field at the observation point, P0, due to this wavefront, a set of concentric spheres of radii Z, Z + (λ/2), Z + 2(λ/2), …, Z + j(λ/2), … are constructed, where Z is the distance from the wavefront to the observation point on the line connecting P1 and P0 (see Figure 11). These spheres divide the wavefront into a number of zones, z1, z2, …, zj, …, called Fresnel zones, or half-period zones.

We treat each zone as a circular aperture illuminated from the left by a spherical wave of the form

Ẽj(R′) = A e^{−ik·R′}/R′ = A e^{−ikR′}/R′

R′ is the radius of the spherical wave. The field at P0 due to the jth zone is obtained by using [5]

Ẽj(P0) = (A e^{−ikR′}/R′) ∫∫_{zj} C(θ) (e^{−ik·R}/R) ds    [30]

For the integration over the jth zone, the surface element is

ds = R′² sin θ dθ dφ    [31]

The limits of integration extend over the range

Z + (j − 1)(λ/2) ≤ R ≤ Z + j(λ/2)

The variable of integration is R; thus, a relationship between R′ and R must be found. This is accomplished by using the geometrical construction shown in Figure 12.

Figure 12 Geometry for finding the relationship between R and R′ yielding eqn [32]. Reprinted with permission from Guenther RD (1990) Modern Optics. New York: John Wiley & Sons.

A perpendicular is drawn from the point Q on the spherical wave to the line connecting P1 and P0, in Figure 12. The distance from the source to the
observation point is x1 + x2 and the distance from the source to the plane of the zone is the radius of the incident spherical wave, R′. The distance from P1 to P0 can be written

x1 + x2 = R′ + Z

The distance from the observation point to the zone is

R² = x2² + y² = [(R′ + Z) − x1]² + (R′ sin θ)²
   = R′² + (R′ + Z)² − 2R′(R′ + Z) cos θ

The derivative of this expression yields

amplitudes of the Huygens' wavelets from each zone should be approximately equal. The alternation in sign, from zone to zone, is due to the phase change of the light wave from adjacent zones because the propagation paths for adjacent zones differ by λ/2. To find the total field strength at P0 due to N zones, the collection of Huygens' wavelets is added:

E(P0) = Σ_{j=1}^{N} Ẽj(P0) = (−iλA/D) e^{−ikD} Σ_{j=1}^{N} (−1)^j Cj    [37]

To evaluate the sum, the elements of the sum are regrouped and rewritten as
Figure 13 Vector addition of waves from nine subzones constructed in the first Fresnel zone. Reprinted with permission from Guenther RD (1990) Modern Optics. New York: John Wiley & Sons.

Figure 14 Vibration curve for determining the Fresnel diffraction from a circular aperture. The change of the diameter of the half-circles making up the spiral has been exaggerated for easy visualization. The actual changes are one part in 10⁵. The position of point B is determined by the aperture size. Reprinted with permission from Guenther RD (1990) Modern Optics. New York: John Wiley & Sons.

If the radius of curvature of the arc were calculated, we would discover that it is a constant, except for the contribution of the obliquity factor,

r = (λR′/(R′ + Z)) C(θ)

Because the obliquity factor for a single zone is a constant to 2 parts in 10⁷, the radius of curvature of the vibration curve can be considered a constant over a single zone. If we let the number of subzones approach infinity, the vibration curve becomes a semicircle whose chord is equal to the wavelet produced by zone one, i.e., E1. If we subdivide additional zones, and add the subzone contributions, we create other half-circles whose radii decrease at the same rate as the obliquity factor. The vibration curve for the total wave is a spiral, constructed of semicircles, which converges to a point halfway between the ends of the first half-circle, see Figure 14. The length of the vector from the start to the end of the spiral is E = E1/2, as we derived above. When this same construction technique is applied to a rectangular aperture, the vibration curve generated is the Cornu spiral.

Circular Aperture

The vector addition technique, described in Figure 13, can be used to evaluate Fresnel diffraction, at a point P0, from a circular aperture and yields the intensity distribution along the axis of symmetry of the circular aperture. The zone concept will also allow a qualitative description of the light distribution normal to this axis.

To develop a quantitative estimate of the intensity at an observation point on the axis of symmetry of the circular aperture, we construct a spiral, Figure 14, to represent the Fresnel zones of a spherical wave incident on the aperture. The point B on the spiral shown in Figure 14 corresponds to the portion of the spherical wave unobstructed by the screen. The length of the chord, AB, represents the amplitude of the light wave at the observation point P0. As the diameter of the aperture increases, B moves along the spiral, in a counterclockwise fashion, away from A. The first maximum occurs when B reaches the point labeled A1 in Figure 14; the aperture has then uncovered the first Fresnel zone. At this point, the amplitude is twice what it would be with no obstruction: four times the intensity!

If the aperture's diameter continues to increase, B reaches the point labeled A2 in Figure 14 and the amplitude is very nearly zero; two zones are now exposed in the aperture. Further maxima occur when an odd number of zones are in the aperture and further minima when an even number of zones are exposed. Figure 15 shows an aperture containing four exposed Fresnel zones. The amplitude at the observation point would correspond to the chord drawn from A to A4 in Figure 14.

The aperture diameter can be fixed and the observation point P0 can move along, or perpendicular to, the axis of symmetry of the circular aperture. As P0 is moved away from the aperture, along the symmetry axis, i.e., as Z increases, the radius of the Fresnel zones increases without limit. For small values of n, the radius of the nth zone can be approximated by

rn ≈ √(nZλ)    [38]
Figure 15 Aperture with four Fresnel zones exposed. Reprinted with permission from Guenther RD (1990) Modern Optics. New York: John Wiley & Sons.

At Zmax, the light intensity is a maximum, given by the chord length from A to A1 in Figure 14. If λ = 500 nm and a = 0.5 mm, then this maximum occurs when Z = 0.5 m:

Zmax = a²/λ    [39]

If we start at Zmax and move toward the aperture, along the axis, as Z decreases in value, a point will be reached when the intensity on the axis becomes a minimum. The value of Z where the first minimum in intensity is observed is equal to

Zmin = a²/2λ

In Figure 14 the chord would extend from A to A2. As the observation point, P0, is moved along the axis toward the aperture, and Z assumes values less than Zmin, the point B in Figure 14 spirals inward, toward the center of the spiral, and the intensity cycles through maximum and minimum values. The cycling of the intensity, as the observation point moves toward the aperture, will not continue indefinitely. At some point, the field on the axis will approach the field observed without the aperture because the distance between Zmax and Zmin shrinks to the wavelength of light, making the intensity variation unobservable.

For values of Z that exceed Zmax of [39], the aperture radius, a, will be smaller than the radius of the first zone, and Fraunhofer diffraction will be observed because the aperture contains only one zone and the phase in the aperture is a constant.

Opaque Screen

If the screen containing a circular aperture, of radius a, is replaced by an opaque disk of radius a, the intensity distribution on the symmetry axis, behind the disk, is found to be equal to the value that would be observed with no disk present. This prediction was first derived by Poisson, to demonstrate that wave theory was incorrect; however, experimental observation supported the prediction and verified the theory.

We construct a spiral, shown in Figure 16, similar to the one for the circular aperture in Figure 14. The point B on the spiral represents the edge of the disk. The shaded portion of the spiral from A to B does not contribute because that portion of the wave is covered by the disk and the zones, associated with that portion of the wave, cannot be seen from the observation point. The amplitude at P0 is the length of the chord from B to A∞ shown in Figure 16. If the observation point moves toward the disk, then B moves along the spiral toward A∞. There is always intensity on the axis for this configuration, though it slowly decreases until it reaches zero when the observation point reaches the disk; this corresponds to point B reaching point A∞ on the spiral. Physically, zero intensity occurs when the disk blocks the entire light wave. There are no maxima or minima observed as the disk diameter, a, increases or as the observation distance changes. If the observation point is moved perpendicular to the symmetry axis, a set of concentric bright rings are observed. The origin of these bright rings can be explained using Fresnel zones, in a manner similar to the one used to explain the bright rings observed in a Fresnel diffraction pattern from a circular aperture.

Figure 16 Vibration curve for opaque disk. The shaded region makes no contribution because it is associated with the portion of the wave obstructed by the opaque object. Reprinted with permission from Guenther RD (1990) Modern Optics. New York: John Wiley & Sons.
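The sequence of on-axis maxima and minima described above follows from a standard closed-form result (not derived in this article) for plane-wave illumination in the Fresnel approximation: I(Z) = 4 I0 sin²(πa²/2λZ), with I0 the unobstructed intensity. A sketch using the λ = 500 nm, a = 0.5 mm values quoted above:

```python
import numpy as np

def onaxis_intensity(Z, a, lam, I0=1.0):
    """On-axis intensity behind a circular aperture of radius a,
    plane-wave illumination, Fresnel approximation."""
    return 4.0 * I0 * np.sin(np.pi * a**2 / (2.0 * lam * Z)) ** 2

lam, a = 500e-9, 0.5e-3
Zmax = a**2 / lam        # eqn [39]: one zone exposed -> intensity 4*I0
Zmin = a**2 / (2 * lam)  # two zones exposed -> intensity near zero
```

The factor of 4 at Zmax is the "four times the intensity" noted for the one-zone aperture.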
Zone Plate

In the construction of Fresnel zones, each zone was assumed to produce a Huygens wavelet, out of phase with the wavelets produced by its nearest neighbors. If every other zone were blocked, then there would be no negative contributions to [37]. The intensity on-axis would be equal to the square of the sum of the amplitudes produced by the in-phase zones, exceeding the intensity of the incident wave. An optical component made by the obstruction of either the odd or the even zones could therefore be used to concentrate the energy in a light wave.

The boundaries of the opaque zones, used to block out-of-phase wavelets, are seen from [38] to increase as the square root of the integers. An array of opaque rings, constructed according to this prescription, is called a zone plate, see Figure 17.

A zone plate will perform like a lens with focal length

f = ± r1²/λ    [40]

The zone plate, shown in Figure 17, will act as both a positive and negative lens. What we originally called the source can now be labeled the object point, O, and what we called the observation point can now be labeled the image point, I. The light passing through the zone plate is diffracted into two paths, labeled C and D in Figure 17. The light waves, labeled C, converge to a real image point I. For these waves the zone plate performs the role of a positive lens with a focal length given by the positive value of [40]. The light waves, labeled D in Figure 17, appear to originate from the virtual image point labeled I. For these waves the zone plate performs the role of a negative lens with a focal length given by the negative value of [40].

The zone plate will not have a single focus, as is the case for a refractive optical element, but rather will have multiple foci. As we move toward the zone plate, from the first focus, given by [40], the effective Fresnel zones will decrease in diameter. The zone plate will no longer obstruct out-of-phase Fresnel zones and the light intensity on the axis will decrease. However, additional maxima, of the on-axis intensity, will be observed at values of Z for which the first zone plate opening contains an odd number of zones. These positions can also be labeled as foci of the zone plate; however, the intensity at each of these foci will be less than the intensity at the primary focus.

Lord Rayleigh suggested that an improvement of the zone plate design would result if, instead of blocking every other zone, we shifted the phase of alternate zones by 180°. The resulting zone plate, called a phase-reversal zone plate, would, more efficiently, utilize the incident light. R.W. Wood was the first to make such a zone plate. Holography provides an optical method of constructing phase-reversal zone plates in the visible region of the spectrum. Semiconductor lithography has been used to produce zone plates in the x-ray region of the spectrum.
Figure 17 A zone plate acts as if it were both a positive and a negative lens. Light from the object, O, is diffracted into waves traveling
in both the D and the C directions. The light traveling in the C direction produces a real image of O at I. The light traveling in the D
direction appears to originate from a virtual image at I. Reprinted with permission from Guenther RD (1990) Modern Optics. New York:
John Wiley & Sons.
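A sketch of the zone-plate prescription (hypothetical design values): the zone boundaries grow as the square root of the integers, in the spirit of eqn [38] with Z replaced by the design focal length, and the first boundary radius recovers eqn [40]:

```python
import numpy as np

def zone_radii(n_zones, f, lam):
    """Boundary radii r_n = sqrt(n * f * lam) of a zone plate whose
    primary focus is at f for wavelength lam (small-n approximation)."""
    n = np.arange(1, n_zones + 1)
    return np.sqrt(n * f * lam)

lam = 500e-9
r = zone_radii(8, f=0.25, lam=lam)   # hypothetical f = 25 cm plate
f_primary = r[0] ** 2 / lam          # eqn [40], positive branch
```

Blocking every other ring bounded by these radii gives the binary zone plate; shifting alternate rings by 180° instead gives the phase-reversal plate described above.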
this area that the optical path length does not change. Figure 18a shows the neighborhood, about the true optical path, as a cross-hatched region equal to the first Fresnel zone.

Light waves do travel over wrong paths, but we do not observe them because the phase differences for those waves that travel over the ‘wrong’ paths are such that they destructively interfere. By moving the neighborhood, defined in the previous paragraph as the first Fresnel zone, to a region displaced from the true optical path, we can use the zone construction to see that this statement is correct. In Figure 18b the neighborhood is constructed about a ray that is improper according to Fermat’s principle. We see that this region of space would contribute no energy at the observation point because of destructive interference between the large number of partial zones contained in the neighborhood.

This leads us to another interpretation of diffraction. If an obstruction is placed in the wavefront so that we block some of the normally interfering wavelets, we see light not predicted by Fermat’s principle, i.e., we see diffraction.

Figure 18 (a) A set of Fresnel zones has been constructed about the optical path taken by some light ray. If the optical path of the ray is varied over the cross-hatched area shown in the figure, then the optical path length does not change. This cross-hatched area is equal to the first Fresnel zone and is described as the neighborhood of the light ray. (b) The neighborhood defined in (a) is moved so that it surrounds an incorrect optical path for a light ray. We see that this region of space would contribute no wave amplitude at the observation point because of the destructive interference between the large number of partial zones contained in the neighborhood. Reprinted with permission from Guenther RD (1990) Modern Optics. New York: John Wiley & Sons.
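The size of the ‘neighborhood’ discussed above can be put in numbers. A minimal sketch (using the standard plane-wave zone relation r_m = √(m λ L), which is not derived in the text above; the function name and example values are mine):

```python
import math

def fresnel_zone_radius(m, wavelength, distance):
    """Radius of the m-th Fresnel zone boundary for a plane wave
    observed at distance L: r_m = sqrt(m * lambda * L)."""
    return math.sqrt(m * wavelength * distance)

# Size of the "neighborhood" (first zone) for 633 nm light observed at 1 m
r1 = fresnel_zone_radius(1, 633e-9, 1.0)
print(f"{r1 * 1e3:.2f} mm")   # about 0.80 mm
```

The zone radii grow only as the square root of the zone index, which is why the first zone dominates the ‘neighborhood’ picture.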
DIFFRACTIVE SYSTEMS
in addition. In particular, for applications with a broad wavelength spectrum (e.g., an achromatic lens) or a broad spectrum of angles of the incident rays, the diffraction efficiency has to be analyzed carefully to obtain the energy distribution in the different diffraction orders. But this is beyond the scope of this article about aberration correction, and we assume that we know the local diffraction efficiency for each ray.

Sweatt Model for the Ray Tracing Simulation of DOEs

An aspheric refractive surface with height h as a function of the lateral coordinates x and y and a refractive index n below the surface and air (refractive index 1) above the surface can be approximately replaced by a DOE with the phase function (see Figure 1)

a value of more than about 3.5 (for silicon used in the near infrared). But, in a numerical simulation the refractive index can be arbitrarily high, e.g., several thousand. This is the reason why the so-called Sweatt model for tracing rays through DOEs uses a weakly curved aspheric surface with height h_sweatt and a very high refractive index n_sweatt to simulate a diffractive phase function Φ:

h_sweatt(x, y) = [λ/(n_sweatt − 1)] Φ(x, y)/(2π)    [2]

To take into account the wavelength dispersion of the DOE, the refractive index n_sweatt in the Sweatt model has to be proportional to the wavelength λ:

n_sweatt(λ) = n_0 λ/λ_0    [3]
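Equations [2] and [3] translate directly into code. A minimal sketch (the function names, the chosen value of n_sweatt, and the quadratic example phase standing in for a diffractive lens are my own assumptions):

```python
import math

N_SWEATT = 10_000.0   # arbitrarily high index, as the Sweatt model allows

def sweatt_height(phi, wavelength):
    """Eqn [2]: sag h = lambda/(n_sweatt - 1) * phi/(2*pi) of the
    weakly curved Sweatt surface that replaces a DOE phase phi."""
    return wavelength / (N_SWEATT - 1.0) * phi / (2.0 * math.pi)

def sweatt_index(wavelength, n0=N_SWEATT, lambda0=633e-9):
    """Eqn [3]: the Sweatt index scales linearly with wavelength."""
    return n0 * wavelength / lambda0

# Example: quadratic phase of an f = 100 mm diffractive lens at 633 nm
wl, f = 633e-9, 0.1
phi = -math.pi * (1e-3) ** 2 / (wl * f)   # phase at radius r = 1 mm
print(sweatt_height(phi, wl))             # tiny sag, since n_sweatt is huge
```

The sub-nanometre sag illustrates why the Sweatt surface is only weakly curved: the whole ray deflection is carried by the enormous refractive index.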
diffraction efficiency. Then, it can be checked whether undesired diffraction orders of the DOE are blocked in the optical system by aperture stops or whether they will disturb the output of the optical system. The local diffraction efficiency of each ray for a value m can be calculated to a good approximation by using a diffractive analysis of the corresponding local grating, i.e., that of an infinitely extended grating which would have the same grating period and the same structure shape (e.g., depth) as the DOE at the specific point.

If the y component of the incident ray and the local grating frequency ν_y are both zero, the ray components in the x direction can be expressed via the sine of the angle of incidence φ_in or of the angle of the diffracted ray φ_out (see Figure 2). In this case the first row of eqn [9] can be written as

n_out sin φ_out = n_in sin φ_in + m λ/p_x    [10]

where the local grating period p_x = 1/ν_x has been introduced. This is the well-known grating equation.

In this article eqns [4]–[10] assume that the DOE is on a planar substrate and that x and y are the lateral coordinates on the substrate. However, the equations can also be written in a vector formulation which is valid for DOEs on curved substrates (see Further Reading).

So, we now have a powerful simulation method with which to design and analyze, in the following, several optical systems containing DOEs to correct monochromatic or chromatic aberrations.

Correction of Monochromatic Aberrations

For practical applications it is important to have an efficient algorithm to calculate the phase function of a DOE to correct the aberrations in a given optical system. Several different cases have to be distinguished. An easy case is the design of a DOE to produce a desired output wavefront Φ_out from a known input Φ_in. In this case, eqn [4] can be used to calculate the phase function Φ of the DOE directly. Two examples are given in the following. Afterwards, some aspects of a more general numerical optimization procedure are described.

Aberration Correction for Interferometrically Recorded Holograms

Although the following example of the aberration correction of interferometrically recorded holograms by using DOEs may seem quite special, it exemplifies several principal steps that are necessary to design a DOE for correcting monochromatic aberrations.

In this article the name DOE is used for diffractive elements with a microstructured surface (binary, multistep, or blazed). Such elements have the property that their diffraction efficiency depends only weakly on the wavelength or the angle of incidence of the illuminating light. Another interesting class of diffractive elements are holograms in volume media, which are recorded interferometrically using two coherent recording waves. The refractive index in a volume hologram is modulated periodically, and the thickness has to amount to several periods of the interference pattern along the z-direction in order to have a so-called thick hologram with a high diffraction efficiency. If there are only a few periods along the z-direction, the quite small modulation of the refractive index (typically about 0.02) will result in a low diffraction efficiency. Therefore, if the angle between the two recording waves is small, the resulting interference pattern will have large periods, and only a few periods are recorded in the photosensitive layer along the z-direction. But this means that the holographic optical element (HOE) can then only have a small diffraction efficiency. From this fact, it is clear that the deflection angle between the incident and the diffracted wave of a recorded HOE must not be smaller than a certain value, because otherwise the thickness has to be very large or the diffraction efficiency decreases. Therefore, such elements with high diffraction efficiency, which will be named holographic optical elements (HOEs) in the following, are always off-axis elements: for on-axis elements all rays in the neighborhood of the axis are deflected only by a small angle, and the diffraction efficiency would then be low. For the correct wavelength and angle of incidence, i.e., when the Bragg condition of holography is fulfilled, HOEs, like blazed DOEs, exhibit only one dominant diffraction order m = m_dom (normally m_dom = 1). But if the angle of incidence or the wavelength changes, the diffraction efficiency in the order m_dom decreases rapidly and light passes through the HOE without being deflected, i.e., most of the light is then in the order m = 0. For this reason, a HOE can be used, for example, in a head-up display to fade in some information under the Bragg angle without hindering the normal view of the environment.

Traditional materials for HOEs, like dichromated gelatin (DCG), are sensitive only in the blue–green spectral region (e.g., to light of an argon ion laser with λ = 488 nm). However, once recorded, the HOE will in many applications (e.g., optical interconnects) be illuminated by a wavelength in the red–infrared spectral region. Unfortunately, the wavelength mismatch between recording and reconstruction illumination introduces aberrations.
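Returning to the local-grating picture, eqn [10] can be evaluated per ray. A minimal sketch (the function name and the example period are mine; angles are in radians):

```python
import math

def diffracted_angle(phi_in, wavelength, period, m=1, n_in=1.0, n_out=1.0):
    """Solve eqn [10], n_out*sin(phi_out) = n_in*sin(phi_in) + m*lambda/p_x,
    for the diffracted angle phi_out of order m at a local grating."""
    s = (n_in * math.sin(phi_in) + m * wavelength / period) / n_out
    if abs(s) > 1.0:
        return None            # evanescent: this order does not propagate
    return math.asin(s)

# A 2 um local period at 633 nm, normal incidence, first order:
out = diffracted_angle(0.0, 633e-9, 2e-6)
print(math.degrees(out))       # about 18.4 degrees
```

Orders for which the right-hand side of eqn [10] exceeds unity are returned as `None`; in a ray-tracing simulation such rays would simply be discarded.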
DIFFRACTIVE SYSTEMS / Aberration Correction with Diffractive Elements 275
If the two desired reconstruction waves at the wavelength λ_2 have the phases Φ_r,λ2 and Φ_o,λ2 in the surface plane of the HOE (the subscripts r and o refer to the reference and object waves of interferometry, respectively), the phase function Φ of the HOE which has to be recorded is

Φ(x, y) = Φ_o,λ2(x, y) − Φ_r,λ2(x, y)    [11]

If our discussion here is restricted to holographic lenses, the two reconstruction waves will be spherical (or plane) waves (see Figure 3b) with the centers of curvature at the points r_r,λ2 and r_o,λ2, and eqn [11] is:

Φ(x, y) = (2π s_o,λ2/λ_2) √((x − x_o,λ2)² + (y − y_o,λ2)² + z_o,λ2²) − (2π s_r,λ2/λ_2) √((x − x_r,λ2)² + (y − y_r,λ2)² + z_r,λ2²)    [12]

The parameters s_o,λ2 and s_r,λ2 are either +1 for a divergent spherical wave or −1 for a convergent spherical wave. The surface of the HOE itself is in the plane at z = 0.

The task is now to find two recording waves which construct nearly the same three-dimensional interference pattern in the hologram at the recording wavelength λ_1. This means that eqn [12] has to be fulfilled exactly for the difference of the phases of the two recording waves in the plane z = 0 and, additionally, the Bragg condition has to be fulfilled at least approximately. So, the next step is to calculate one of the recording waves with phase Φ_r,λ1 so that the Bragg condition is fulfilled. This can be done approximately with a third-order theory, resulting in a spherical wave, or nearly exactly with an iterative numerical procedure, resulting generally in an aspheric recording wave. If Φ_r,λ1 is known, the other recording wave with phase Φ_o,λ1 is calculated by

Φ_o,λ1(x, y) = Φ(x, y) + Φ_r,λ1(x, y)    [13]

At least this recording wave will be aspheric. To generate such an aspheric wave, it is traced by ray tracing to a plane where the two recording waves are spatially separated. There, either a DOE or a combination of a DOE and a lens is used to generate the aspheric wave. The calculation of the phase function of this DOE is done again by using an equation like [11], whereby one wave is the propagated aspheric wave (possibly propagated also through a lens) and the other is normally a plane off-axis wave with which the DOE will be illuminated in the real recording setup (see Figure 3a). Here, the off-axis angle is necessary to allow spatial filtering of disturbing diffraction orders of the DOE, which will normally be fabricated as a binary DOE.

Our own experiments showed (see the literature) that nearly diffraction-limited holographic lenses with a numerical aperture of NA = 0.4 can be fabricated by using a DOE in the recording setup to correct the aberrations of about 28 wavelengths peak-to-valley (for the reconstruction at 633 nm wavelength) caused by the wavelength mismatch between recording at 488 nm and reconstruction at 633 nm.
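Eqns [11]–[13] can be sketched in code for a holographic lens. A hedged illustration (all names and the example geometry are mine; only spherical-wave phases in the plane z = 0 are modeled, and the Bragg-condition step of choosing Φ_r,λ1 is left to the caller):

```python
import math

def spherical_phase(x, y, center, s, wavelength):
    """Phase of a spherical wave in the HOE plane z = 0 (one term of
    eqn [12]); s = +1 for a divergent, -1 for a convergent wave."""
    xc, yc, zc = center
    r = math.sqrt((x - xc) ** 2 + (y - yc) ** 2 + zc ** 2)
    return 2.0 * math.pi * s / wavelength * r

def hoe_phase(x, y, obj, ref, s_obj, s_ref, wavelength):
    """Eqns [11]/[12]: desired HOE phase = object phase - reference phase."""
    return (spherical_phase(x, y, obj, s_obj, wavelength)
            - spherical_phase(x, y, ref, s_ref, wavelength))

def second_recording_phase(x, y, hoe, ref_rec, s_ref, lambda_rec):
    """Eqn [13]: phase of the second recording wave, given the HOE phase
    function and a chosen reference wave at the recording wavelength."""
    return hoe(x, y) + spherical_phase(x, y, ref_rec, s_ref, lambda_rec)

# Holographic lens for 633 nm: far (plane-like) reference, convergent object
hoe = lambda x, y: hoe_phase(x, y, obj=(0, 0, 0.05), ref=(0, 0, 10.0),
                             s_obj=-1, s_ref=+1, wavelength=633e-9)
print(second_recording_phase(1e-3, 0.0, hoe, (0, 0, 0.2), +1, 488e-9))
```

In a real design, `second_recording_phase` would then be propagated by ray tracing to the plane of the correcting DOE, as described above.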
global polynomial fit but a spline fit, i.e., a set of adjacent local polynomials with several continuity conditions at the border area of each local polynomial. In this case the total number of coefficients used to describe the phase functions Φ_in and Φ_out of the DOE will be increased compared to the description by a global polynomial. Then, the number N of rays for the calculation also has to be increased accordingly.

So, in this example the DOE acts as a null-compensator for the asphere, and at first sight it is not a typical example of aberration correction. However, regarding the complete system, the DOE compensates the aberrations, i.e., deviations from a spherical wave, generated by the asphere, which are primary and higher-order spherical aberrations if the asphere is rotationally symmetric, and more general aberrations otherwise. This example shows how the phase function of a DOE can be designed straightforwardly for correcting aberrations if there is only one well-defined wavefront propagating through the optical system. This design method can also be used for other optical systems where the input and the output wavefront of the DOE are well defined.

Numerical Optimization of DOEs in Optical Systems

In a general optical system, where a DOE is used to compensate the aberrations not only for one specific incident wave but for different incident waves (an optical system with a field), it is not possible to calculate the optimum phase function in such a straightforward way as given above. However, a DOE is in principle similar to an asphere in an optical system, so that the design rules for including aspherics in optical systems can be used in some cases as long as monochromatic aberrations are considered. Nevertheless, one should keep in mind that a DOE is, according to the Sweatt model, an asphere with high refractive index and small height, i.e., the deflection of the light rays occurs in a plane (or on a curved surface if the DOE is situated on a curved substrate). Therefore, the design rules for aspheric surfaces can be used only if the high refractive index of the Sweatt model is used in the design.

In the general case only a numerical optimization can be used. For this, it is very important to select an appropriate mathematical representation of the phase function of the DOE.

For rotationally symmetric optical systems the phase function of the DOE should also be rotationally symmetric. Therefore, in such a case it is useful to write the phase function Φ as a polynomial in the radius coordinate r, as given in eqns [14] and [15]. Then, the (G + 1) or (G/2 + 1) coefficients a_j have to be optimized, whereby a_2 is proportional to the focal power of the DOE, whereas the other coefficients represent aspheric terms. Depending on the numerical aperture and on the application, the degree G of the polynomial can be increased in order to achieve a better solution. Nevertheless, the degree in the optimization should be as small as necessary in order to avoid ‘swinging effects’ and numerical instabilities of the polynomial. To determine the smallest adequate degree, an iterative method has to be used, starting with a quite small degree (e.g., 4). If the result of the optimization is not good enough, the degree has to be increased until the result of the optimization is satisfactory.

For non-rotationally symmetric optical systems a Cartesian polynomial like that in eqn [16] can be taken, and all the (G + 2)(G + 1)/2 coefficients a_jk have to be optimized. Since the number of free parameters is in this case larger than in the rotationally symmetric case, the convergence of the optimization may take a long time, and the risk of getting stuck in a local minimum of the multidimensional parameter space increases.

In principle, the calculation of a DOE used, e.g., for testing an aspheric surface can also be done by optimizing the coefficients of the polynomial representing the phase function of the DOE. Nevertheless, as a general rule it can be said that a direct calculation of the phase function is always more efficient than a numerical optimization, if the direct calculation is possible. In practice, it can be seen that the maximum degree of a polynomial which is directly calculated by the method described in the last section is always smaller than the degree which is necessary in a numerical optimization to achieve convergence of the optimization.

Of course, in such cases where several different types of aberrations (on-axis aberrations like spherical aberration, off-axis aberrations like astigmatism and coma, or field aberrations like field curvature or distortion) have to be corrected in an optical system, a direct calculation of the phase function is not possible, so that only a numerical optimization can provide a solution.

Up to now only monochromatic aberrations have been discussed. But, as stated at the beginning of this article, the strong dispersion of a DOE, having opposite sign to the dispersion of a refractive element, offers additional applications of DOEs for correcting chromatic aberrations. The principle of doing this is described in the following.

Correction of Chromatic Aberrations

The focal power 1/f (f is the focal length of the lens) of a thin refractive lens consisting of two spherical
surfaces with radii of curvature R_1 and R_2 and made of a material (glass) with refractive index n is

1/f(λ) = (n(λ) − 1)(1/R_1 − 1/R_2)    [18]

where λ is the wavelength of the illuminating light. This equation is valid for all wavelengths and so also for the commonly used wavelength λ_d = 587.6 nm (yellow d-line of helium), which is near the center of the visible spectrum. Then 1/R_1 − 1/R_2 can be expressed as 1/((n_d − 1)f_d), and the focal power depends on the wavelength as

1/f(λ) = [(n(λ) − 1)/(n_d − 1)] · 1/f_d    [19]

where n_d and f_d are the refractive index and the focal length at λ_d = 587.6 nm, respectively. So, by combining two thin lenses made of materials with refractive indices n_1 and n_2 and focal lengths f_1 and f_2, the focal power of the doublet, assuming zero distance between the two lenses, is:

1/f_ges(λ) = 1/f_1(λ) + 1/f_2(λ) = [(n_1(λ) − 1)/(n_1,d − 1)] · 1/f_1,d + [(n_2(λ) − 1)/(n_2,d − 1)] · 1/f_2,d    [20]

For an achromatic doublet the focal power should be equal for two different wavelengths, which are commonly λ_F = 486.1 nm (blue F-line of hydrogen) and λ_C = 656.3 nm (red C-line of hydrogen), marking the limits of the well visible spectrum (of course the visible spectrum itself is broader, but the sensitivity of the human eye to wavelengths below λ_F or beyond λ_C is quite low). Therefore, the following equations result:

1/f_ges(λ_F) = 1/f_ges(λ_C)  ⇒  [(n_1(λ_F) − n_1(λ_C))/(n_1,d − 1)] · 1/f_1,d = −[(n_2(λ_F) − n_2(λ_C))/(n_2,d − 1)] · 1/f_2,d    [21]

By introducing the Abbe number V_d

V_d = (n_d − 1)/(n_F − n_C) = [n(λ_d = 587.6 nm) − 1]/[n(λ_F = 486.1 nm) − n(λ_C = 656.3 nm)]    [22]

equation [21] can be simplified to

V_1,d f_1,d = −V_2,d f_2,d    [23]

Now, the wavelength dispersion of all conventional materials (glasses, polymers, etc.) is such that the refractive index decreases with increasing wavelength, i.e., the material has a so-called normal dispersion. This means that the Abbe number V_d is always positive and ranges from about 20 for high-dispersive glasses (e.g., SF59) to more than about 80 for low-dispersive glasses (e.g., Ultran 20). So, to fabricate a refractive achromatic doublet with a positive focal power it is necessary to take a positive lens with focal length f_1 made of a low-dispersive glass with a large Abbe number V_1,d and a negative lens with focal length f_2 made of a high-dispersive glass with a small Abbe number V_2,d (see Figure 5a). Therefore, the resulting focal power of the achromatic doublet

1/f_ges,d = 1/f_1,d + 1/f_2,d = (1/f_1,d)(1 − V_2,d/V_1,d)    [24]

is always smaller than the focal power of the positive lens with focal power 1/f_1,d alone.

Figure 5 Structure of achromats: (a) refractive achromatic doublet consisting of a positive lens with low dispersion and a negative lens with high dispersion; (b) hybrid achromatic lens consisting of a positive refractive lens and a positive diffractive lens.

The situation is completely different if a refractive and a diffractive lens are used. From the Sweatt model
of a DOE (eqn [3]) it is known that the refractive index n_sweatt is proportional to the wavelength λ

n_sweatt(λ) = (n_0/λ_0) λ = (n_d/λ_d) λ    [25]

Therefore, the Abbe number of a DOE is

V_d = (n_d − 1)/(n_F − n_C) = [λ_d/(λ_F − λ_C)] (1 − 1/n_d) → λ_d/(λ_F − λ_C) = −3.452 (for n_d → ∞)    [26]

Here, we used the fact that the accuracy of the Sweatt model increases with increasing refractive index, so that the Abbe number of a DOE is obtained in the limit of an infinite refractive index. So, it can be seen that the Abbe number of a DOE is a small negative constant. This means that the dispersion of a DOE is high and opposite in sign to that of a refractive element. Therefore, according to eqns [23] and [26], a positive refractive lens with a large focal power (i.e., small focal length) and a positive diffractive lens with a small focal power (i.e., long focal length) can be combined to form an achromatic doublet with a focal power larger than the focal power of either of the two lenses alone (see Figure 5b). By using eqns [19] and [25], the focal power of a diffractive lens can be calculated with the Sweatt model (n very large, so n − 1 ≈ n):

1/f_DOE(λ) = λ/(λ_d f_d)    [27]

In practice, the lenses are neither thin, nor is the distance between them zero. In this case the equations for the design of a hybrid achromatic doublet are more complex (see Further Reading). Nevertheless, the given equations can be used for a first design and to understand the principal behavior.

Another advantage of a hybrid achromatic doublet is that the DOE can in addition behave like an asphere. Therefore, e.g., the spherical aberration can be corrected with the aspheric terms of the DOE, provided that the spherical aberration is considerably smaller than the phase term encoded in the DOE to correct the chromatic aberrations. Otherwise, the correction of the monochromatic (spherical) aberration will disturb the correction of the chromatic aberrations. A comparison of a simple hybrid achromat, consisting of a planoconvex lens with the diffractive structure made directly onto the plane surface of the refractive lens, with a conventional refractive achromatic doublet shows that the optical performance of the hybrid achromat is a little worse than that of the refractive achromatic doublet regarding, for example, off-axis aberrations. A simple explanation of this is that a refractive achromatic doublet has three spherical surfaces and two glasses, i.e., in total five parameters, whereas a hybrid achromat has one spherical surface, one glass, and the DOE with a paraxial term and an aspheric term, i.e., in total about four parameters. Of course, the merit of these different types of parameters is not identical, so this assessment is only qualitative.

In the case of hybrid achromatic elements for a broad wavelength range (e.g., the whole visible wavelength range), the dependence of the diffraction efficiency on the wavelength may be a limiting factor, because the other diffraction orders behave like stray light and reduce the maximum contrast of the image. Therefore, special laminated blazed diffractive optical elements made of laminae of two or more different materials are described in the literature, which show a nearly achromatic behavior of the diffraction efficiency but the ‘normal’ dispersion of DOEs. DOEs are in the meantime used in zoom lenses (see the cited patent in Further Reading) to correct chromatic aberrations.

Summary

In this article the possibilities of correcting monochromatic and chromatic aberrations in optical systems by using diffractive optical elements have been discussed. There are several advantages and disadvantages of using DOEs.

Advantages:

• DOEs offer many degrees of freedom in the design. By using modern lithographic fabrication techniques, the fabrication of a DOE with a strongly aspheric phase function is just as expensive as fabricating a simple Fresnel zone lens with a quadratic phase function. So, DOEs can in some cases be used instead of aspheric surfaces, which are more expensive. In addition, the lithographic fabrication technique guarantees a quite good accuracy concerning the phase function of the DOE, compared to the more difficult fabrication of well-defined aspheric surfaces.

• The strong chromatic dispersion, with the opposite sign to that of a refractive element, allows the correction of chromatic aberrations. DOEs also give in this case an additional degree of freedom in the design, because aspheric terms of the phase function can in addition correct monochromatic aberrations of the system, as long as these aberrations are considerably smaller than the quadratic term of the phase function which corrects the chromatic aberrations.
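As a quick numerical check of the thin-lens relations, eqns [23] and [24] can be solved for the component focal lengths; with the DOE Abbe number of −3.452 from eqn [26], both components of a hybrid achromat indeed come out positive. A sketch (the glass Abbe numbers are illustrative values, not taken from the article):

```python
# Thin-lens achromat design: eqn [23] (V1*f1 = -V2*f2) together with
# 1/f_ges = 1/f1 + 1/f2 (eqn [24]) fixes both component focal lengths.

def achromat_focal_lengths(f_ges, v1, v2):
    """Return (f1, f2) for a thin achromatic doublet of total focal
    length f_ges built from components with Abbe numbers v1 and v2."""
    f1 = f_ges * (1.0 - v2 / v1)   # from eqn [24]
    f2 = -v1 * f1 / v2             # from eqn [23]
    return f1, f2

# Conventional doublet: crown glass (V ~ 64) plus high-dispersion flint (V ~ 20)
f1, f2 = achromat_focal_lengths(100.0, 64.2, 20.0)
print(f1, f2)   # positive crown lens, negative flint lens

# Hybrid achromat: the same crown lens plus a DOE with V = -3.452 (eqn [26])
f1, f2 = achromat_focal_lengths(100.0, 64.2, -3.452)
print(f1, f2)   # both focal lengths positive; the DOE is only weakly powered
```

The sign flip of the DOE Abbe number is what turns the usual positive/negative glass pair into a pair of two positive elements.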
Stone Th and George N (1988) Hybrid diffractive-refractive lenses and achromats. Applied Optics 27: 2960–2971.
Sweatt WC (1977) Describing holographic optical elements as lenses. Journal of the Optical Society of America 67: 803–808.
Upatnieks J, Vander Lugt A and Leith E (1966) Correction of lens aberrations by means of holograms. Applied Optics 5: 589–593.
Welford WT (1975) A vector raytracing equation for hologram lenses of arbitrary shape. Optical Communications 14: 322–323.
Winick K (1982) Designing efficient aberration-free holographic lenses in the presence of a construction-reconstruction wavelength shift. Journal of the Optical Society of America 72: 143–148.
determining the merit of the system. This is not the case for wafer steppers.

Even though the optical requirements for the lenses in the illumination system are not as high as in the projection system, they are in practice still far too demanding for micro-optical elements. Therefore, the only area where micro-optics could be used in the illumination system is the generation of the various illumination patterns needed for the off-axis illumination schemes used for resolution enhancement, as well as overall homogenization of the light. An element meeting these goals is essentially a custom-designed diffuser, where both the generated angular range and the relative intensity of illumination in any given angular direction are controlled to a high degree.

Custom Diffusers for Illumination

As mentioned earlier, one of the strengths of micro-optics lies in its ability to combine different optical functions into a single element. When used in lithographic illumination systems, micro-optical elements can serve a dual purpose. The element can serve as an artificial aperture, effectively distributing the light over some predetermined angular range and at the same time minimizing light outside the range, allowing generation of the various illumination patterns needed for resolution enhancement via off-axis illumination. Unlike conventional solutions based on the use of actual apertures to generate the desired illumination patterns, distribution of light using micro-optical elements can be done with efficiencies up to 95%, with the typical target efficiency being in the range of 90%, depending on the pattern complexity. At the same time, the element can be used to smooth the light distribution, improving the uniformity of the illumination. Typically the required uniformity error, i.e., the allowed variation from the average illumination, is in the range of 5%, but can be as low as 1%.

In addition to the above-mentioned high-efficiency and -uniformity targets, the optical quality requirements of modern lithography systems introduce unique challenges to the micro-optics used in such systems. First, micro-optical elements used in lithography systems must also exhibit low amounts of stray light, especially in the formation of hot-spots. Typically, hot-spots are allowed to be only a few percent of the average illumination level. Because the size, shape, and profile of the illumination beam can vary, partly due to variations in the source and partly because such variations improve the functionality of the lithography system, by, for example, controlling the partial coherence factor, the micro-optical elements must be virtually space-invariant. The elements must perform the same function regardless of where and how the illumination coming from the source hits the element. Furthermore, the elements themselves must have very low absorption, because the high power used could otherwise lead to destruction of the elements. This means that the materials must be carefully selected. Finally, the optical wavelengths used in advanced lithography systems are 248 nm or less. Because the feature size of a diffractive element scales down with the wavelength, the features in the elements used are much smaller than in the case of elements used for applications in the visible region. Consequently, the accurate fabrication of such elements becomes a formidable challenge, because most fabrication errors, such as mask misalignment, etch depth error, or the way the sharp corners in diffractive elements get rounded during fabrication, increase in significance when the feature size is decreased.

There exist two distinct approaches for obtaining such custom-designed diffusers. We will now discuss the benefits and weaknesses of these two techniques in terms of efficiency, pattern uniformity, stray light, and ease of fabrication, in order to illustrate the trade-off a designer creating micro-optical elements for a lithography system has to consider.

Aperture Modulated Diffusers

One approach to acquiring custom diffusers for illumination systems is to use an array of refractive or diffractive micro-lenses. More generally, the lens phase can be replaced by any phase obtained by geometrical considerations, i.e., by a geometrical map transform between the input and output planes of the optical system. The basic principle of an aperture modulated diffuser is to use the phase function to accurately control the angular divergence of the diffuser, while the shape of the desired intensity distribution is created by introducing a properly shaped aperture in which the phase function is contained. This basic principle is shown in Figure 4.

Figure 4 Basic principle of an aperture modulated diffuser. The lens f-number determines the angular extent of the far-field pattern, while the lens aperture shape determines the shape of the pattern.
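The f-number dependence stated in the Figure 4 caption can be illustrated with a toy thin-lens estimate (the function name and the example f-number are my own):

```python
import math

def diffuser_half_angle(f_number):
    """Half-angle of the far-field pattern of an aperture modulated
    diffuser cell, from the marginal ray of a thin lens with the given
    f-number N = f/D: tan(theta) = 1 / (2 * N)."""
    return math.atan(1.0 / (2.0 * f_number))

# An f/5 micro-lens cell spreads light over roughly +/- 5.7 degrees
print(math.degrees(diffuser_half_angle(5.0)))
```

Lowering the f-number of the micro-lens cells widens the generated angular range, while the aperture shape alone sets the pattern outline.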
286 DIFFRACTIVE SYSTEMS / Applications of Diffractive and Micro-Optics in Lithography
The main benefit of using a lens phase lies in the ease One needs to consider several issues with regard to
of the design process, which essentially consists of the fabrication of aperture modulated diffusers. For
selecting the numerical aperture of the lens to meet refractive aperture modulated diffusers, the main issue
the angular size of the desired pattern, while more is the accurate realization of the required surface
complicated phase profiles can be used to improve profile. The fabrication of large arrays of refractive
some aspects of the performance of the diffuser. elements having arbitrary surface profiles is still
In terms of optical performance, aperture modu- challenging. Resist reflow techniques, where the
lated diffusers typically achieve efficiency of up to micro-lenses are realized by melting small islands of
95% and low levels of stray light, easily fulfilling the target specifications outlined earlier, especially when refractive surfaces are used. However, since the lens aperture in the element plane controls the shape of the intensity distribution, the range of intensity distribution shapes is limited: it must be possible to tile the element plane with the required aperture shape in order to realize an array of elements with high fill factor, i.e., to realize a space-invariant diffuser. Furthermore, introducing an aperture into the element plane means that intensity oscillations, due to boundary diffraction at the edges of the apertures, cannot be avoided. Consequently, the uniformity of the generated intensity distributions suffers. This is especially true where the desired intensity pattern has sharp corners; there the oscillations due to boundary diffraction can easily reach 50% of the average intensity of the desired pattern. Figure 5 illustrates a typical far-field intensity distribution obtainable with aperture modulated diffusers.

resist, lead to a spherical lens profile. Complicated and hard-to-control profile reshaping techniques are required if dramatic alterations from the spherical lens profile are needed. Even though fabrication through direct-writing techniques, such as laser writing, offers more freedom, realization of arbitrary refractive surfaces remains a challenge. Therefore, the use of refractive elements is mainly limited to cases where the lens profile can be close to the spherical shape. The most important such cases are patterns with uniform circular or rectangular distributions. The former can be realized by proper selection of the numerical aperture of the lens, while the latter can be realized by combining two properly selected cylindrical lenses on opposite sides of the substrate.

For diffractive aperture modulated diffusers there are considerably fewer limitations on the phase profiles that can be attained. As long as the opening angle, i.e., the angular size of the pattern, is below 10 degrees, the phase can be realized as a diffractive element. With diffractive elements, the main issue to consider is therefore the effect of typical fabrication errors, such as profile depth and shape errors. The typical consequence of fabrication errors in the element is a loss of efficiency via an increase of stray light. Fortunately, the errors do not generate sharp stray-light peaks, with the obvious exception of the on-axis hot spot. This results from the fact that both lens functions and more complicated phase functions designed by geometrical considerations exhibit locally periodic structures; the introduction of common fabrication errors then redistributes light away from the desired signal orders of these local structures into higher or lower orders. On the level of the whole intensity distribution, such redistribution appears as a constant rise in the background noise level, but does not lead to sharp intensity fluctuations.

Another issue related to the fabrication of aperture modulated diffusers appears when diffuser elements are fabricated using a series of photomasks, as is typically the case for elements of relatively large size. Although mask misalignment is essentially a random error, i.e., the direction and amount of the error are difficult to predict beforehand, it easily leads to systematic and partly symmetric uniformity problems with aperture modulated diffusers.

Figure 5 Typical intensity distribution obtainable with an aperture modulated diffuser. The overall distribution is smooth apart from the slow intensity oscillations at the edges resulting from diffraction at the aperture boundary.
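The edge oscillations described above have a simple one-dimensional analogue that is easy to sketch numerically. The following Python fragment is an illustration added here, not taken from the article: an ideal low-pass filter stands in for the band-limiting effect of a finite aperture, and the classic Gibbs-type overshoot appears at the edges of a flat-top pattern.

```python
import numpy as np

N = 4096
target = np.zeros(N)
target[N // 2 - 256:N // 2 + 256] = 1.0   # desired flat-top intensity pattern

# A finite aperture band-limits the achievable field; model that crudely
# as an ideal low-pass filter on the target pattern.
spectrum = np.fft.fft(target)
freq = np.fft.fftfreq(N)
spectrum[np.abs(freq) > 0.02] = 0.0
achieved = np.real(np.fft.ifft(spectrum))

overshoot = achieved.max() - target.max()  # ripple height at the pattern edge
print(f"edge overshoot: {overshoot:.1%} of the plateau intensity")
```

With a sharp spectral cutoff the overshoot lands near the classical ~9% Gibbs value; as noted above, real aperture modulated diffusers producing patterns with sharp corners can show considerably stronger ripple.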
DIFFRACTIVE SYSTEMS / Applications of Diffractive and Micro-Optics in Lithography 287
Electron beam lithography is widely used in the origination of micro-optical components, as the relatively high cost of creating the master component can be offset in the final product by the inexpensive replication process. Such an approach allows the replicated end-product to meet the high optical requirements achievable only by high-resolution surface control, yet remain affordable for use in, for example, consumer electronics. This has led some in the micro-optics community to conclude that electrons are better suited for fabricating components that control photons, while photons are better suited for fabricating components that control electrons.

In addition to particle beam lithography, other noteworthy techniques exist. For our purposes, the most interesting is so-called micro-lens array lithography, where one or more micro-lens arrays are used to image the pattern of the photomask onto the resist layer on the substrate. Although the resolution of this approach is limited to 3–5 microns, for applications such as flat panel displays, micromechanics, or multichip module fabrication, this technique is of considerable interest.

It is well known that geometrical aberrations scale down proportionally with the physical lens size. Thus, large areas can be imaged favorably by use of arrays of micro-lenses, where each lens images only a small section of the whole field. Such a setup is shown in Figure 10. With this setup it is possible to expose a comparatively large area with a single exposure, thus eliminating the need for costly stepping operations and therefore increasing the throughput of the system. Analysis of the basic system shows that an array based on spherical micro-lenses of 1 mm focal length and 300 micron aperture can produce a nearly diffraction-limited resolution on the order of 3–5 micrometers over a 300 micron field.

In terms of conventional lithography systems, micro-lens array lithography has most in common with proximity printing. Both techniques achieve comparable resolutions, yet avoid damaging the mask or substrate by keeping the two separated by some finite working distance. However, micro-lens array lithography offers two distinct advantages over conventional proximity printing. First, the working distance between the imaging system and the substrate surface can be much larger, which further minimizes the risk of mask or substrate damage. Furthermore, the added working distance allows lithography steps on nonflat surfaces, e.g., inside holes or grooves, where conventional proximity printing would not be possible. Another advantage of micro-lens array lithography is the larger depth of focus. With proximity printing, the resolution rapidly decreases with increasing distance. In micro-lens array lithography, the optimum resolution is obtained at the image plane, while a region of reasonable resolution extends forward and backward from this plane (Figure 11). This extension means that within the depth of field, resolution is less sensitive to changes in the working distance, which makes the technique better suited for practical applications.

In addition to the optical benefits of micro-lens array lithography, there also exist other practical benefits. Many micro-devices have a repetitive and
Further, recent advances in computing and fabrication technologies have increased the number of applications in which diffractive elements are used, which has established diffractive optics as yet another tool in an optical designer's kit.

The first modern diffractive element was the hologram. A hologram is a photographic record of the constructive and destructive interference between two coherent beams known as the object and write beams. The object beam contains information about the object recorded. The write beam is necessary to record the hologram but contains no information itself. When the hologram is played back with a third beam, the read beam, the recorded interference pattern diffracts it in such a way that it reconstructs the object beam.

Unfortunately, the functions that a natural hologram can produce are limited to those whose wavefields can be produced in a laboratory. Thus, although it is possible to record a hologram that can focus or deflect a single beam, it is difficult to record a hologram that splits a single beam into multiple foci. However, the advent of computing technology made it possible to design such holograms analytically. The first diffractive element designed with the aid of a computer was demonstrated in 1966. Since then the ability to create elements that respond arbitrarily has had widespread application in imaging systems, telecommunications, and sensing.

The two most basic diffractive elements are the grating and the Fresnel zone plate. The diffractive behavior of a grating is predominantly transverse to the optic axis. It causes light to split into discrete beams, referred to as diffraction orders, each with a different angle of propagation relative to the optic axis. The Fresnel zone plate also divides incoming light into discrete orders; however, the orders are produced along the optic axis. This is useful in producing the diffractive equivalent of a refractive lens. Arbitrary transverse and longitudinal behavior is realized by combining characteristics of these two basic diffractive elements. In the remainder of this article, we discuss the design and fabrication of general diffractive elements.

Design

Once an optical system designer determines the functional behavior required for a particular application, for example, focusing or multiple beamsplitting, he or she needs to design a diffractive element that realizes this function. However, unlike refractive and reflective optical elements, whose material properties and surface curvatures determine their functionality, the functionality of a diffractive element is determined by a pattern of wavelength-scale features. Determining this pattern is the goal of diffractive design, but to do so one must first understand the physical relationship between the DOE and the wavefield it produces.

Scalar diffraction theory is one of the simpler models that relates the effect of a surface relief, or diffractive profile, d(x,y), on an incident wavefield U_in(x,y):

U_out(x,y) = exp[iθ(x,y)] U_in(x,y)   [1]

where the phase transformation θ(x,y) is related to the diffractive profile d(x,y) by

θ(x,y) = 2π(n − 1) d(x,y)/λ   [2]

The constant n is the refractive index of the optical substrate and λ is the wavelength of illumination in a vacuum.

The effects of the phase change are not apparent until the wavefield U_out(x,y) propagates over some distance z. In scalar diffraction theory the relationship between U_out(x,y) and the wavefield U_z(x,y) at a plane a distance z away is described by the Rayleigh–Sommerfeld diffraction equation:

U_z(x,y) = ∫∫ U_out(ξ,η) exp[i(2πz/λ) √(1 + ((x−ξ)/z)² + ((y−η)/z)²)] / {izλ [1 + ((x−ξ)/z)² + ((y−η)/z)²]} dξ dη   [3]

where the integrals run from −∞ to ∞. Huygens' principle is reflected in this formulation by the convolution between the transmitted field U_out(x,y) and a spherical wave h(x,y), which represents free-space propagation over the distance z:

U_z(x,y) = U_out(x,y) ** h(x,y)   [4]

where

h(x,y) = exp[i(2πz/λ) √(1 + (x/z)² + (y/z)²)] / {izλ [1 + (x/z)² + (y/z)²]}   [5]

The Rayleigh–Sommerfeld formulation is valid so long as the angles at which energy diffracts from the aperture are small. If the distance z is sufficiently large that the angles stay clustered about the optic axis, one can replace the spherical wave response of free space
292 DIFFRACTIVE SYSTEMS / Design and Fabrication of Diffractive Optical Elements
with a parabolic wave:

h(x,y) ≈ [exp(i2πz/λ)/(izλ)] exp[i(π/λz)(x² + y²)]   [6]

This simplification is referred to as the paraxial, or Fresnel, approximation. The diffraction integral is now given by:

U_z(x,y) ≈ [exp(i2πz/λ)/(izλ)] ∫∫ U_out(ξ,η) exp{i(π/λz)[(x − ξ)² + (y − η)²]} dξ dη
         = [exp(i2πz/λ)/(izλ)] exp[i(π/λz)(x² + y²)] ∫∫ U_out(ξ,η) exp[i(π/λz)(ξ² + η²)] exp{−i2π[ξ(x/λz) + η(y/λz)]} dξ dη   [7]

The second representation results from an expansion of the exponent and allows the diffraction integral to be written as a Fourier transform. The quadratic phase terms can be removed if a positive lens with focal length z is placed next to the diffractive element and a second positive lens with focal length z is placed in the image plane (see Figure 1). Under these conditions, U_out(x,y) and U_z(x,y) form a Fourier transform pair:

U_z(x,y) = FT[U_out(x,y)] |_{u=x/λz, v=y/λz} = {FT[exp(iθ(x,y))] ** FT[U_in(x,y)]} |_{u=x/λz, v=y/λz}   [8]

Figure 1 Optical Fourier transform module.

The objective of diffractive design is to find a surface relief d(x,y) that generates a desired response q(x,y) somewhere in the response U_z(x,y). However, eqns [3] and [7] neglect the fabrication process, which affects the fields that can actually be realized. A diffractive optic designer can incorporate fabrication constraints into the design either directly or indirectly. In a direct approach, the designer represents the diffractive structure parametrically; for example, a multilevel phase element can be represented by a discrete set of allowed phase values, and a binary phase element can be represented by the contours that separate one phase value from the other. A designer defines the diffractive element by specifying and modifying these parameters.

In an indirect approach, the designer first determines an intermediate function without consideration for fabrication and afterward applies the fabrication constraints to this function. For example, if f(u,v) is the desired response one wishes to generate in the z-plane using U_in(x,y) as the input, then from eqn [8] the optical function required to do so is

Q(x,y) = F(x,y)/U_in(x,y)   [9]

where F(x,y) is the Fourier transform of f(u,v). The designer must then find a phase θ(x,y) that satisfies the fabrication constraints and insures that exp[iθ(x,y)] ≈ Q(x,y).

For example, fabrication using multiple binary masks yields a phase structure quantized to 2^N levels. An effective design should, therefore, yield a quantized phase structure that minimizes reconstruction error. A naïve approach is to set the magnitude of Q(x,y) in eqn [9] to unity and quantize the phase. However, this introduces considerable error. A direct approach to design searches only over the set of quantized phase structures for the one that, for example, generates the desired response with maximum energy and minimum error. An indirect design seeks first a phase structure whose response produces a minimum amount of error between it and Q(x,y), and then quantizes this structure.

Performance of the element can be improved if one exploits certain freedoms available in design. For example, all systems have a finite-sized detector. It is, therefore, unnecessary to specify the performance of the diffractive element everywhere in the image plane. Its behavior within the detector window is all that is important. If only the intensity of the response is specified, then the phase of the response within the window is also a design freedom. Finally, one can also control the amount of energy contained in the signal window by scaling the amplitude of the response. This allows a designer to balance diffraction efficiency with error in the signal window.

Two general nonlinear optimization algorithms for the design of DOEs are represented in Figure 2. Indicative of the nature of data flow, Figure 2a is
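As an illustrative aside (not part of the original article), the thin-element transformation of eqn [1] and the Fourier-transform relationship of eqn [8] can be exercised numerically in one dimension. A binary 0/π phase grating, the simplest multiple beamsplitter, should put essentially no energy in the zero order and about 4/π² ≈ 40.5% into each first order, consistent with the 41% binary-element efficiency quoted in the text:

```python
import numpy as np

N, period = 1024, 64                       # samples; grating period in samples
# Binary 0/pi phase profile with 50% duty cycle (thin-element model, eqn [1])
theta = np.where(np.arange(N) % period < period // 2, 0.0, np.pi)
U_out = np.exp(1j * theta)                 # unit-amplitude plane-wave illumination

# Far field via the Fourier-transform relationship (eqn [8])
U_far = np.fft.fft(U_out) / N
intensity = np.abs(U_far) ** 2             # fraction of input power per FFT bin
order = np.fft.fftfreq(N) * period         # diffraction-order index of each bin

eta_0 = intensity[0]                               # zero order: suppressed
eta_1 = intensity[np.argmin(np.abs(order - 1.0))]  # first order: ~4/pi^2
print(f"eta_0 = {eta_0:.2e}, eta_+1 = {eta_1:.3f}")
```

The π step makes the average transmitted amplitude vanish (no zero order), while the square-wave Fourier coefficient gives the ±1 orders amplitudes of magnitude 2/π each.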
When the pattern is transferred into the substrate material via etching, the photoresist acts as an etch stop. For the first mask, the etch depth into the substrate corresponds to a π-phase shift. The process is repeated for each subsequent mask; that is, the substrate is coated with a layer of photoresist, UV illumination exposes the photoresist through the mask, the photoresist is developed, and the substrate is etched through the remaining pattern. The etch depth of each mask is half that of the previous mask. Figure 3 illustrates this process.

The substrate can be etched using either wet or dry techniques. In wet etching, one uses reactive chemicals, such as hydrofluoric acid, to etch the substrate. However, wet etching is isotropic, that is, substrate etching occurs equally in all directions. As a result, it is difficult to preserve vertical sidewalls and to achieve high fidelity in etching high-aspect-ratio features. As a consequence, dry etching is more commonly used.

A dry etch uses a beam of ions to remove substrate material. The process for removal can be either chemical or kinetic. The process used is determined by the value of the plasma power that creates the source of ions relative to the bias power used to accelerate the ions toward the substrate. If the bias power is large and the plasma power small, a small stream of ions is accelerated rapidly toward the substrate. In this case, the operational mode is similar to a micromilling machine, wherein inert particles chip away at the surface by virtue of their kinetic energy. Alternatively, a small bias power and a large plasma power produce a large population of ions with no preferential directionality. This yields an isotropic process similar to wet etching.

Two methods for performing dry etching are reactive ion etching (RIE) and inductively coupled plasma (ICP) etching. In RIE systems the bias and plasma power are not independent due to capacitive coupling; thus, adjusting one affects the other. ICP, on the other hand, provides independent control over the bias and plasma powers. Independent control allows one to control precisely the anisotropic nature of an etch process. As mentioned above, this is achieved by varying the bias power relative to the plasma power, which controls the directionality of the acceleration.

Independent control is important because the ideal process for a given DOE depends on the substrate material and the characteristics of the diffractive profile. Independent control allows one to adjust the respective powers, as well as gas flow rates and their partial pressures, to determine an optimal etch process. Dry etching also allows one to incorporate endpoint detection, which provides precise control over the etch depth. As presented below, achieving the proper etch depth is important to generating responses with high diffraction efficiency.

Care must be used when determining the number of phase levels appropriate for a particular application. The number of phase levels has a direct impact on the portion of input energy one can hope to diffract into the desired response. This is referred to as the diffraction efficiency. According to scalar diffraction theory, the relationship between the maximum diffraction efficiency η and the number of masks N is:

η = sinc²(1/2^N)   [13]

Thus, using two masks to create a 4-level phase element, as opposed to a single mask to create a binary-phase element, doubles the upper bound on the diffraction efficiency from 41% to 81%. The maximum diffraction efficiency for 8-level and 16-level phase elements is 95% and 99%, respectively. However, this increase in diffraction efficiency incurs fabrication costs. After the first etch step, each subsequent exposure requires careful mask alignment. Alignment is accomplished using a standard tool from integrated circuit fabrication known as a mask aligner. The quality of a mask aligner is determined by its precision, and the majority of commercially available aligners can register a mask and substrate to within less than a micrometer. This degree of alignment accuracy is sufficient to fabricate many useful multilevel diffractive elements.

To summarize, a multilevel surface-relief process uses a set of N masks to generate a surface profile that exhibits 2^N phase levels. The etch depth for the kth mask is d_k = λ/[2^k (n − 1)]. In the next section, we describe a process for fabricating continuous DOE profiles in a single expose-and-etch cycle.

Continuous DOE Profiles Using Grayscale Lithography

Grayscale photolithography has become increasingly popular because of its flexibility and convenience in fabricating continuous diffractive profiles. Methods include directly writing partially exposed e-beam resist, resist reflow lithography, and using grayscale masks in a single-step exposure. Here we discuss only the latter method because the processes are similar to those used in multiple-mask fabrication; thus, the same facilities can be used to produce both. In particular, we consider the application of high-energy-beam-sensitive (HEBS) glass to grayscale lithography and the fabrication of continuous profile DOEs.

The process for creating a diffractive element using HEBS glass is similar to fabricating a binary-phase element from a single mask. However, mask creation is markedly different. HEBS glass is a white crown glass within which a thin buried layer of silver ions is incorporated. The layer of silver ions is created via ion exchange in an aqueous acidic solution and essentially serves the same function as silver grains in photographic film. When the glass is exposed to a beam of electrons, the silver-containing crystal complexes are chemically reduced and the glass darkens. By controlling the electron dose, one controls the opacity of the glass to produce a grayscale pattern.

Although the attraction of single-step grayscale lithography is reduced fabrication cost, the process incurs high calibration costs to insure quality pattern transfer through several steps. The parameter that has the greatest impact on the overall process is the desired size of the smallest diffractive feature. The minimum feature size affects the selection of the accelerating potential, beam current, addressing grid, and electron dose of the e-beam used to expose the HEBS glass.

The e-beam accelerating potential sets the resolution limit of the process and affects the sensitivity of the glass to the electron dose, the total charge per area incident upon the glass. For example, for a constant electron dose, increasing the accelerating potential increases the darkening response of the material. A large accelerating potential produces a large flux, which achieves the desired electron dose in a short period using energetic electrons. This, in turn, enhances the sensitivity of the glass. However, increasing the accelerating potential also increases scattering, which limits the minimum feature size achievable. Thus, low beam energies are generally used to write small features. But this, unfortunately, reduces the range of available optical densities because the material is less sensitive at low beam energies. To compensate for the reduced sensitivity, it is possible to increase the electron dose by increasing the exposure time. However, this undermines to some extent the use of a low-energy beam to achieve small features, due to broadening that results from long exposure times.

Once the accelerating potential for writing the mask is chosen, the beam current and addressing grid need to be specified. The beam current is related to the spot size and, therefore, directly affects the length of time needed to write the mask. High current levels produce large spots. Since write times can easily reach into
hours for large structures, high current levels are desirable to reduce them.

The addressing grid represents the physical grid over which the electron beam is translated in order to complete the exposure. This grid should be chosen as finely as possible; however, too fine a grid can significantly increase the required dwell time on each grid point. In this respect, a compromise is required. High beam currents produce short dwell times, which may necessitate using a coarse address grid to insure the beam is capable of performing the exposure.

Once the process parameters are set, calibration between the HEBS transmission and the electron dose is necessary. Analyzing the optical density of written test structures by some experimental means and determining its functional dependence on the electron dose completes the calibration. This allows one to specify the electron doses required to achieve the features dictated by the device design.

If resolution requirements cannot be met through judicious selection of the accelerating potential, then some of the contrast in the mask can be sacrificed for an increase in available resolution. As we have mentioned, reducing the maximum optical density, or contrast, can alleviate scattering-induced broadening. It is possible, however, to compensate to some degree for the reduced contrast during the photolithography process, where UV exposure and develop times provide additional latitude in the pattern transfer process.
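The transmission-to-dose calibration described above amounts to a small interpolation exercise. The sketch below uses hypothetical placeholder numbers, not measured HEBS data; it only illustrates converting measured transmissions to optical densities (OD = −log10 T) and reading off the dose needed for a target density:

```python
import numpy as np

# Hypothetical test-structure measurements: electron dose vs transmission
dose = np.array([5.0, 10.0, 20.0, 40.0, 80.0])        # assumed units, e.g. uC/cm^2
transmission = np.array([0.79, 0.63, 0.40, 0.16, 0.025])

optical_density = -np.log10(transmission)              # OD of written test structures

# Invert the calibration: dose required for the densities a design calls for
target_od = np.array([0.3, 0.6, 1.2])
required_dose = np.interp(target_od, optical_density, dose)
print(dict(zip(target_od, np.round(required_dose, 1))))
```

In practice the measured dose-to-density curve would be fit with a smooth model rather than piecewise-linear interpolation, but the inversion step is the same.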
Figure 4 represents the fabrication of a continuous-profile DOE using a grayscale mask. The substrate is spin-coated with a thin film of photoresist, placed in intimate contact with the grayscale mask, and UV-illuminated to expose the resist film. The resist is developed and the pattern transferred into the substrate using either wet or dry etching techniques. Because the solubility of the resist is proportional to exposure, developing the photoresist produces a variable-height profile.

Figure 4 Fabrication of a continuous phase DOE using a grayscale mask.

Figure 5 HEBS glass fabrication. (a) Data map of the optical density cross-section of an exposed HEBS grayscale mask and (b) its corresponding visible image.

Grayscale lithography is challenging because a linear resist response is required, yet modern photoresists are designed to be highly nonlinear. Traditional photoresists are designed to amplify mask contrast and produce the sharpest possible resist profile; thus they either dissolve away quickly and completely, or remain impervious to developer. The exposure threshold of traditional resists is, therefore, steep, and considerable effort is required to
Figure 6 Characterization of a continuous phase diffractive lens. (a) Two-dimensional profile. (b) Three-dimensional profile. (c) Line
scan of the two-dimensional plot. (d) Image of element from a scanning electron microscope.
insure that resist exposure is on the linear portion of the curve.

Although the resist exposure process is sensitive to all parameters, the behavior of a resist can be characterized using simulation models. The details of the photoresist models can be found elsewhere, but the simulations predict that the greatest utilization of the available resist occurs for short exposure times (seconds), and that these short exposures require long developing times (minutes). Further, the models predict that low mask contrast does not adversely affect the usable resist height. It is, therefore, not unreasonable to use a low contrast mask to achieve small features.

Through simulation it is possible to determine process parameters that maximize the linear response range of the photoresist. However, these parameters must be validated in the laboratory. The results presented in Figures 5 and 6 provide convincing evidence of the efficacy of these techniques.

Final Remarks

Advances in passive photonic devices have been made possible by leveraging fabrication processes from the electronics industry. Modifying standard photolithographic techniques and photoresist processes has produced an industrial capacity for fabricating diffractive elements. This capability will no doubt lead to future advances in telecommunications, signal processing, and imaging.

See also

Diffraction: Fresnel Diffraction. Diffractive Systems: Aberration Correction with Diffractive Elements. Holography, Techniques: Overview.

Further Reading

Däschner W, Long P, Stein R, Wu C and Lee SH (1997) Cost-effective mass fabrication of multilevel diffractive optical elements by use of a single optical exposure with a gray-scale mask on high-energy beam-sensitive glass. Applied Optics 36: 4675–4680.
Dillon T, Balcha A, Murakowski JA and Prather DW (2004) Continuous-tone grayscale mask fabrication using high-energy-beam-sensitive (HEBS) glass. Journal of Microlithography, Microfabrication, and Microsystems.
Fienup JR (1980) Iterative method applied to image reconstruction and to computer-generated holograms. Optical Engineering 19: 297–306.
Goodman JW (1968) Introduction to Fourier Optics. New York: McGraw-Hill.
Kohler M (1998) Etching in microsystem technology. In: Turunen J and Wyrowski F (eds) Diffractive Optics for Industrial and Commercial Applications. Berlin: Wiley-VCH.
298 DIFFRACTIVE SYSTEMS / Diffractive Laser Resonators
LeCompte M, Gao X and Prather DW (2001) Photoresist characterization and linearization procedure for the grayscale fabrication of diffractive optical elements. Applied Optics 40: 5921–5927.
Leith EN and Upatnieks J (1962) Reconstructed wavefronts and communication theory. Journal of the Optical Society of America 52: 1123–1130.
Lohmann AW and Paris DP (1967) Binary Fraunhofer holograms generated by computer. Applied Optics 6: 1739–1748.
Mait JN (1995) Understanding diffractive optical design in the scalar domain. Journal of the Optical Society of America A 12: 2145–2158.
Stern MB and Rubico-Jay T (1994) Dry etching for coherent refractive microlens arrays. Optical Engineering 33: 3547–3551.
Swanson GJ (1989) Binary optics technology: the theory and design of multi-level diffractive optical elements. Technical Report 854, MIT.
Swanson GJ (1991) Binary optics technology: theoretical limits on the diffraction efficiency of multi-level diffractive optical elements. Technical Report 914, MIT.
Swanson GJ and Veldkamp WB (1990) High-efficiency, multilevel diffractive optical elements. United States Patent 4,895,790.
Wang MR and Su H (1998) Laser direct-write gray-level mask and one-step etching for diffractive microlens fabrication. Applied Optics 37: 7568–7576.
Wu CK (1994) High Energy Beam Sensitive Glasses. United States Patent 5,285,517.
Wyrowski F (1991) Digital holography as part of diffractive optics. Reports on Progress in Physics: 1481–1571.
eqns [7] and [8]. A comparison of both equations can state that lossy modes do not have the property to
yields: be phase conjugated during their reflection at both
p
mirrors. As a consequence, the design of diffractive
Uþ ðx; y; zÞ ¼ U2 ðx; y; zÞ ½9 mirrors for unstable laser resonators, in general
characterized by energy outcoupling via diffractive
For reasons of simplicity we neglect here, and in the
losses, is more complex than simply calculating a
following, the subscript m for the mode order if only
phase-conjugating mirror structure for stable laser
one mode is considered. Because this equation must
resonators.
hold in all planes z, it is satisfied in the mirror
Up to now, design methods described in the
resonator planes z0 and zb . Further, by definition of
literature for unstable laser resonators with user-
the operators R1 and R2:
defined output beam shapes have all been based on an
Uþ ðx; y; zb Þ ¼ R1 U2 ðx; y; zb Þ ½10 iterative calculation of the mirror reflection function.
Because a detailed description of these methods is
U2 ðx; y; z0 Þ ¼ R2 Uþ ðx; y; z0 Þ ½11 rather complex, we restrict ourselves to a sketch of
the principal ideas behind them.
The combination of eqns [10] and [11] with eqn [9]
Without loss of generality, we consider an unstable
yields the following expressions for the mirror
laser resonator from which the light is coupled out by
operators:
a mirror with spatially varying transmission and
Uþ ðx; y; zb Þ reflection. The second mirror is fully reflective within
R1 ¼ p ðx; y; z Þ ½12 its aperture. The particular application defines only
Uþ b
few design constraints which are in most cases the
U2 ðx; y; z0 Þ light wavelength, the dimensions of the resonator,
R2 ¼ p ðx; y; z Þ ½13
U2 0 and the shape of the outcoupled beam U0 ðx; y; z0 Þ: As
a consequence of the outcoupling principle, the shape
Equations [12] and [13] imply that lossless modes are
of the mode inside the resonator is not finally defined
phase-conjugated on reflection at the resonator
by the outcoupled beam. In regions where the
mirrors. Because we have not made any assumptions
outcoupling mirror is reflective and not transmissive,
on the shape of the mode, these equations represent
the mode amplitude can be chosen freely. Also the
a general property of lossless resonator modes.
amplitude of the mode in the plane of the back
Furthermore, they provide a general design rule for
resonator mirror is not fixed but influenced by the
the construction of stable laser resonators with
amplitude and phase of the mode in the plane of the
predefined fundamental mode shape.
outcoupling mirror.
If one considers limiting apertures inside the
The general starting situation for an unstable
resonator, an equivalent statement for lossy modes
resonator design is sketched in Figure 2. The design
can be made by making assumptions about phase
procedure must yield the complex amplitudes of four
conjugation in a lossy system and showing that they
waves: the mode directly in front of the outcoupling
lead to a contradiction. Without loss of generality, we
mirror U2 ðx; y; z0 Þ, the mode after reflection at the
assume an aperture on the outcoupling resonator
outcoupling mirror Uþ ðx; y; z0 Þ, and the complex
mirror that clips the mode. We now consider the wave
amplitudes of the forward and reverse modes in the
U2 ðx; y; z0 Þ after reflection at this mirror and assume
plane of the back resonator mirror Uþ ðx; y; zb Þ and
this wave is still phase-conjugated after propagation
U2 ðx; y; zb Þ, respectively. Since we assume no losses at
through the resonator during its reflection at the back
resonator mirror according to:
p
Uþ ðx; y; zb Þ ¼ U2 ðx; y; zb Þ ½14
Since we assume no losses at this mirror, the amplitudes must satisfy the condition:

|U+(x, y, zb)| = |U−(x, y, zb)|   [15]

If we further assume no absorption at the outcoupling mirror, the amplitudes of the three waves in z0 are related according to:

|U−(x, y, z0)| + |U0(x, y, z0)| = |U+(x, y, z0)|   [16]

To design the resonator, one starts with the choice of initial intensity distributions for the wavefields U−(x, y, z0) and U+(x, y, zb). In order to obtain a good filling of the available mode volume, a homogeneous intensity across the mirror apertures is recommendable. To avoid strong diffraction effects at the mirror edges, the corresponding intensity distributions should be smoothed out. Together with the desired outcoupled beam amplitude |U0(x, y, z0)|, one can calculate the two remaining amplitudes for the wavefields U−(x, y, zb) and U+(x, y, z0) according to eqns [15] and [16].

An iterative algorithm, such as the well-known Gerchberg–Saxton algorithm, can be used to find the phases of the beams U−(x, y, z0) and U+(x, y, z0) that ensure the wavefields have almost equal amplitudes once they are propagated or back-propagated, respectively, to the plane zb. In order to find wavefields that satisfy eqns [15] and [16], two successive iteration runs with different constraints on the complex amplitudes have to be performed. During the first iteration run, no change in the initial amplitudes in plane z0 is allowed; the iteration only looks for phases for U−(x, y, z0) and U+(x, y, z0) which result in the amplitude |U+(x, y, zb)| once they are propagated and back-propagated, respectively, to plane zb. Usually the results of a Gerchberg–Saxton algorithm with such constraints are wavefields U+(x, y, zb) and U−(x, y, zb) with similar, but not equal, amplitudes. Thus, in order to determine wavefields that satisfy eqn [15], a second iteration run with the Gerchberg–Saxton algorithm has to be added, in which slight variations of the wavefield amplitudes in plane z0 are also allowed. Because the application of the constraints to ensure convergence of the iteration is rather complex, we do not go into details here but refer the reader to the Further Reading at the end of this article.

Upon completion of the iterations, the mirror reflectance operators R1 and R2 are calculated according to:

R1 = U+(x, y, zb) / U−(x, y, zb)   [17]

R2 = U−(x, y, z0) / U+(x, y, z0)   [18]

The design method described so far yields a mathematical expression for the resonator mirror reflection operator. Practical realization of the resonator requires this expression to be converted into a description of a physical property of the mirror; in most practical cases this is the surface profile. Since the design method considers only phase-only functions, the physical descriptions of the reflection operators are:

R1 = exp[−iφ1(x, y)] and R2 = exp[−iφ2(x, y)]   [19]

Using the so-called thin-element approximation, the surface profiles h1(x, y) and h2(x, y) of the two mirrors follow directly from the phases φ1(x, y) and φ2(x, y):

h1(x, y) = −(λ/4π) φ1(x, y) and h2(x, y) = −(λ/4π) φ2(x, y)   [20]

with λ being the laser wavelength. The expressions in eqn [20] contain a factor of 1/2 due to reflection. These surface profiles are commonly fabricated by microlithographic techniques. For stable laser resonators, where the mirrors perform a static phase conjugation, the mirror profile is a surface of equal phase of the fundamental mode.

Use of General Resonator Modes

Applications of diffractive laser resonators can be classified into beam shaping for a particular task external to the laser, and resonator optimization via modifying the eigen-space of the round trip operator.

Internal beam shaping for external tasks is relevant to applications where it is advantageous to use a fixed beam profile other than a Gaussian, as is the case, for instance, in materials processing. Shaping the beam internally avoids the energy losses usually associated with external beam shaping techniques. However, lasers that have a non-Gaussian outcoupled beam profile are often less flexible in their general applicability.

However, there is considerable potential for diffractive laser resonators to enhance resonator characteristics and thereby improve the lasing operation itself. The conventional Gaussian-shaped fundamental mode in customary laser resonators varies considerably in intensity from its central maximum to its outer wings. The transfer of energy from the pumped active medium into the propagating mode is dependent upon the local transverse intensity and is
302 DIFFRACTIVE SYSTEMS / Diffractive Laser Resonators
optimal for homogeneous illumination. As a result, the efficiency with which energy is transferred into the fundamental mode can be increased by using diffractive resonator mirrors to match the shape of the lateral fundamental mode to that of the cross-section of the active medium.

An important resonator property, especially in high-power laser systems, is the difference of round trip losses V between the fundamental mode and the higher, usually undesired, laser modes. This difference is referred to as modal discrimination. With no limiting apertures, the wavefronts of all Gauss–Hermite or Gauss–Laguerre modes of common laser resonators are phase conjugated upon reflection at the spherical mirrors. As we have shown above, there is a tight connection between the phase conjugation property of resonator modes and their property of having no round trip losses. Thus, the phase conjugation property of Gauss–Hermite and Gauss–Laguerre modes directly implies small round trip losses; in fact, in resonators without apertures they all have an eigenvalue |γ| = 1 and, according to eqn [5], the round trip losses are zero. High-order modes, whose diameters are large, can be suppressed by decreasing the size of the mirror aperture; however, because the difference in mode diameter between modes of neighboring order m is small, the achievable discrimination is limited to a maximum ratio Vm+1/Vm ≈ 10. The structured surface profile of the mirrors in a diffractive laser resonator whose fundamental mode is non-Gaussian is tightly adapted to the phase of the design mode. The wavefronts of the higher-order modes in these resonators often do not fit the mirror profile, and it can be observed that their round trip losses are significantly higher. Thus, diffractive laser resonators usually show an improved modal discrimination compared to resonators with spherical mirrors. This effect can be further improved by inserting into the resonator additional phase plates with diffuser-like properties.

Shaping the surface profile of the resonator mirrors, or inserting additional phase plates, significantly improves the flexibility of the resonator design. As is demonstrated by the examples below, there are resonator configurations in which high modal discrimination or adaptation of the mode volume to the cross-section of the active medium is combined with a Gaussian shape of the outcoupled beam.

In the design procedure described so far, the resonators were treated as passive optical systems; the propagation of the mode through the resonator has not been influenced by the active medium. However, in real laser systems this simplification is often not applicable. Due to the pumping process and the connected heat insertion, the refractive index distribution inside the active medium is not homogeneous, which affects the propagation of the modes. Furthermore, changes or fluctuations in the pump energy and cooling mechanisms introduce temporal changes in the optical function associated with the refractive index of the active medium. Further, although the static or averaged optical power of the active medium is included in the round trip operator and the calculation of the mirror profiles, experiments with lasers that have diffractive resonator mirrors indicate that the temporal fluctuations can strongly influence the shape and round trip losses of the fundamental resonator mode. In general, diffractive laser resonators are more sensitive to deviations from design than conventional resonators.

Examples

We consider as an example the adaptation of the fundamental mode shape to the cross-section of the active medium. In solid state lasers, such as Nd:YAG lasers, the active medium is often a crystal rod with a circular cross-section. Instead of using this sharp-edged cross-sectional shape as the intensity distribution for the desired fundamental mode, a smoothed version, approximated by a super-Gaussian intensity distribution in the center plane of the rod, is chosen to be the fundamental mode. The super-Gaussian intensity distribution has a more desirable propagation characteristic than a sharp circular one. However, even these modes do not exhibit the advantageous property of propagation-invariant beam shape characteristic of Gaussian beams.

In our first example, we consider the design of a stable diffractive laser resonator with a super-Gaussian fundamental mode shape inside the active medium and a Gaussian beam coupled out from the resonator. The resonator configuration is based upon a beam shaping arrangement for a Gauss to super-Gauss beam transformation. Its optical setup is sketched in Figure 3. In the plane of the outcoupling

Figure 3 Setup for the Gauss-to-super-Gauss resonator.
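The Gaussian and super-Gaussian design intensities of this example can be compared numerically. A minimal sketch, assuming the common convention I(r) = exp[−2(r/w)^(2n)] for a super-Gaussian of order n (the beam radius of 1 mm and the order n = 10 are the parameter values stated for this example; the exact functional form is an assumption here, not taken from the article):

```python
import numpy as np

# Radial design intensities for the first example: a Gaussian in the
# outcoupling plane and a super-Gaussian of order n = 10 in the rod center
# plane, both with 1 mm radius.
r = np.linspace(0.0, 2e-3, 201)               # radius (m)
w = 1e-3                                      # beam radius, 1 mm
I_gauss = np.exp(-2.0 * (r / w) ** 2)
I_supergauss = np.exp(-2.0 * (r / w) ** 20)   # order n = 10 -> exponent 2n = 20

# The super-Gaussian keeps nearly full intensity over most of the rod
# cross-section and falls off sharply near r = w, unlike the Gaussian.
print(I_gauss[75], I_supergauss[75])          # values at r = 0.75 mm
```

The flat top with steep edges is what lets the mode fill a circular rod cross-section far more uniformly than a Gaussian of the same radius.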
mirror z0, the fundamental mode has a Gaussian amplitude Ug(x, y, z0). A phase element, located directly in front of the outcoupling mirror, shapes the mode amplitude into a super-Gaussian one, Usg(x, y, zm), in the center plane zm of the active medium. For this special beam shaping problem it is possible to calculate a phase plate surface profile that transforms the beams without energy losses; the corresponding calculation procedure is described in the Further Reading section at the end of this article.

The mirror profiles have to be realized to satisfy the previously derived phase conjugation property for the described beam. Since the Gaussian beam has its minimum waist at the outcoupling mirror, its phase is planar in that plane. Thus, to ensure phase conjugation, a plane mirror is all that is required in front of the beam shaping phase plate. In the plane of the back resonator mirror the beam has a modulated wavefront, due to its transformation from a Gaussian to a super-Gaussian amplitude distribution, so a surface-modulated mirror profile is required to phase conjugate this beam during reflection. Figures 4a and b represent the phase profiles of the beam shaping element and the back resonator mirror, respectively, for the following set of parameters: laser wavelength λ = 1064 nm; radii of the Gaussian and super-Gaussian beams in the planes z0 and zm, respectively, wg = 1 mm and wsg = 1 mm; order of the super-Gaussian n = 10; distance zm − z0 = 200 mm; and resonator length zb − z0 = 250 mm. The amplitude distribution for the fundamental mode in this resonator, simulated with the Fox–Li algorithm, is shown in Figure 4c for the plane of the outcoupling mirror and in Figure 4d for the center of the active medium.

In our second example, the flexibility of diffractive laser resonators is demonstrated by a resonator with a user-defined far-field intensity distribution. In our case the far-field intensity |Uf(x, y, zf)|² is shaped like the Greek letter lambda, λ. Because we are concerned only with the intensity of the wavefield, the phase φf(x, y, zf) of the far-field Uf(x, y, zf) can be chosen freely. As shown in the following, this phase freedom can be used to improve the interaction of the mode with the active medium, as was the case in our first design example. Again, the mode shape in the center of the active medium has a super-Gaussian intensity distribution, which ensures high extraction efficiency. We used an iterative Fourier-transform algorithm to calculate a far-field phase that yields a nearly super-Gaussian intensity distribution in the outcoupling plane of the resonator. The resulting wave:

U0(x, y, z0) = a0(x, y, z0) exp[iφ0(x, y, z0)]   [21]

is generated directly as the laser output. The amplitude a0(x, y, z0) is generated as a mode inside the cavity using a surface-modulated back resonator mirror, which performs a static phase conjugation of the desired mode in the mirror plane. The phase φ0(x, y, z0) is provided by an additional element on the outside of the outcoupling mirror. A sketch of the resonator configuration is shown in Figure 5.

For the experimental realization of this example, the thermal index gradient of the active medium was included in the resonator design, and the two diffractive elements were fabricated on a quartz substrate using electron beam lithography and ion-beam etching. The measured mode shape in the plane of the outcoupling mirror and the far-field
Figure 4 Gauss-to-super-Gauss resonator: (a) phase of the beam shaping element; (b) phase of the back resonator mirror; simulated amplitude distributions for the fundamental resonator mode (c) in the plane of the outcoupling mirror and (d) in the center of the active medium.
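The Fox–Li simulation mentioned above can be illustrated with a toy one-dimensional power iteration: a trial field is sent through repeated resonator round trips until only the lowest-loss eigenfunction of the round trip operator survives. This sketch assumes a plane-mirror cavity with a hard aperture and an angular-spectrum propagator; the geometry and all dimensions are illustrative assumptions, not the diffractive-mirror design of the article.

```python
import numpy as np

# Toy 1-D Fox-Li power iteration on a plane-mirror cavity with hard apertures.
n, width = 512, 8e-3                      # samples, transverse window (m)
lam, L = 1064e-9, 0.25                    # wavelength (m), mirror spacing (m)
x = (np.arange(n) - n // 2) * (width / n)
fx = np.fft.fftfreq(n, d=width / n)       # spatial frequencies (1/m)
kz = 2 * np.pi * np.sqrt(np.maximum(0.0, lam ** -2 - fx ** 2))
prop = np.exp(1j * kz * L)                # angular-spectrum propagator over L
aperture = (np.abs(x) < 2e-3).astype(float)   # 4 mm hard mirror aperture

def round_trip(u):
    """Propagate to the far mirror and back, clipping at each mirror."""
    u = np.fft.ifft(np.fft.fft(u) * prop) * aperture
    u = np.fft.ifft(np.fft.fft(u) * prop) * aperture
    return u

u = np.exp(-(x / 1e-3) ** 2)              # rough Gaussian starting guess
for _ in range(200):
    u = round_trip(u)
    u /= np.sqrt((np.abs(u) ** 2).sum())  # renormalize each round trip

gamma = np.conj(u) @ round_trip(u)        # round-trip eigenvalue estimate
print(1 - abs(gamma) ** 2)                # diffraction loss per round trip
```

The surviving profile approximates the fundamental mode, and the magnitude of the round-trip eigenvalue estimate gives its diffraction loss (the article's eqn [5] relates this eigenvalue to the round trip losses).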
Figure 5 Configuration for resonator-internal generation of a user-defined far-field intensity distribution. (With permission from Zeitner UD, Wyrowski F and Zellmer H (2000) External design freedom for optimization of resonator originated beam shaping. IEEE Journal of Quantum Electronics 36: 1105–1109. © 2001 IEEE.)
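The two-plane amplitude constraints used in the design procedure are the natural setting for a Gerchberg–Saxton iteration. A minimal one-dimensional sketch, in which a single Fourier transform stands in for the propagation step between the two planes (the article's design uses Fresnel propagation between the mirror planes and a more elaborate constraint schedule):

```python
import numpy as np

def gerchberg_saxton(amp_a, amp_b, n_iter=300, seed=0):
    """Toy Gerchberg-Saxton iteration: find a phase so that a field with
    amplitude amp_a in one plane acquires amplitude amp_b in the plane
    reached by a Fourier transform (standing in for propagation)."""
    rng = np.random.default_rng(seed)
    field = amp_a * np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, amp_a.size))
    for _ in range(n_iter):
        far = np.fft.fft(field)
        far = amp_b * np.exp(1j * np.angle(far))      # impose amplitude in plane b
        field = np.fft.ifft(far)
        field = amp_a * np.exp(1j * np.angle(field))  # impose amplitude in plane a
    return field

n = 256
x = np.linspace(-4.0, 4.0, n)
amp_a = np.exp(-x ** 2)                        # Gaussian amplitude in plane a
# Build a reachable target: magnitude of the transform of a chirped Gaussian,
# so that a consistent solution is known to exist.
amp_b = np.abs(np.fft.fft(amp_a * np.exp(0.5j * x ** 2)))

field = gerchberg_saxton(amp_a, amp_b)
err = np.linalg.norm(np.abs(np.fft.fft(field)) - amp_b) / np.linalg.norm(amp_b)
print(err)   # relative amplitude mismatch remaining in plane b
```

From a converged pair of fields, the mirror reflectance operators follow as ratios of the fields in each mirror plane, in the manner of eqns [17] and [18], and the surface profiles then follow from eqn [20].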
Figure 6 Measured mode intensities for the resonator of Figure 5: (a) in the plane of the outcoupling mirror; and (b) in the far-field of the resonator. (With permission from Zeitner UD, Wyrowski F and Zellmer H (2000) External design freedom for optimization of resonator originated beam shaping. IEEE Journal of Quantum Electronics 36: 1105–1109. © 2000 IEEE.)
intensity distribution are shown in Figures 6a and b, respectively. The mode in the outcoupling plane of the resonator fills an area of about 2 mm diameter with nearly homogeneous intensity. The far-field shows the desired λ-shaped intensity distribution. The variations in intensity are due to the nonlinear response of the active medium; as mentioned above, the response can be improved if these effects are included in the design of the resonator mirrors.

Conclusion

Diffractive laser resonators are a generalization of common spherical mirror resonators. The use of arbitrary resonator mirror profiles and additional internal phase plates offers a huge flexibility for the generation of user-defined mode shapes and the optimization of resonator properties such as extraction efficiency and modal discrimination. The drawback of the use of structured mirrors is an increased adjustment effort, because the diffractive mirrors have more degrees of freedom to be aligned. Furthermore, diffractive resonators tend to be more sensitive to changes of the optical properties of the active medium than spherical mirror resonators; such changes may occur due to temporal fluctuations in pump power or cooling power.

The most promising applications for diffractive resonators are:

• lasers for materials processing, which require a fixed intensity pattern for the sample treatment and a fixed laser power;
• highly efficient homogeneous laser illumination, for example, for data projection or sensing applications; and
• suppression of higher-order modes and maintenance of single-mode operation for high-brightness lasers.

Some example applications and solutions have been presented here which cannot be achieved by conventional resonator optics. For stable laser resonators, the phase-conjugation property of the
modes is used for a straightforward calculation of the required diffractive mirror profiles. Unstable resonator geometries with diffractive outcoupling require more complex iterative design procedures. However, especially for high-power and high-gain lasers, the latter sometimes have more advantageous properties, such as better discrimination of undesired higher-order modes and simpler mirror cooling options.

See also

Lasers: Carbon Dioxide Lasers. Phase Control: Phase Conjugation and Image Correction.

Further Reading

Aagedal H, Wyrowski F and Schmid M (1997) Paraxial beam splitting and shaping. In: Turunen J and Wyrowski F (eds) Diffractive Optics for Industrial and Commercial Applications, pp. 165–188. Berlin: Akademie Verlag.
Belanger PA, Lachance RL and Pare C (1992) Super-Gaussian output from a CO2 laser by using a graded-phase mirror resonator. Optics Letters 17: 739–741.
Fox A and Li T (1961) Resonant modes in a maser interferometer. Bell Systems Technical Journal 40: 453–488.
Gerchberg R and Saxton W (1972) A practical algorithm for the determination of phase from image and diffraction plane pictures. Optik 35: 237–246.
Herzig H (1997) Micro Optics. London: Taylor & Francis.
Leger JR (1997) Diffractive laser resonators. In: Turunen J and Wyrowski F (eds) Diffractive Optics for Industrial and Commercial Applications, pp. 165–188. Berlin: Akademie Verlag.
Leger JR, Chen D and Dai K (1994) High modal discrimination in a Nd:YAG laser resonator with internal phase gratings. Optics Letters 19: 1976–1978.
Makki S and Leger JR (2001) Mode shaping of a graded-reflectivity-mirror unstable resonator with an intracavity phase element. IEEE Journal of Quantum Electronics 37: 80–86.
Siegman AE (1986) Lasers. Mill Valley, CA: University Science Books.
Zeitner UD and Wyrowski F (2001) Design of unstable laser resonators with user-defined mode shape. IEEE Journal of Quantum Electronics 37: 1594–1599.
Zeitner UD, Wyrowski F and Zellmer H (2000) External design freedom for optimization of resonator originated beam shaping. IEEE Journal of Quantum Electronics 36: 1105–1109.
Diffractives in Animals
A R Parker, University of Oxford, Oxford, UK
© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

Structural coloration involves the selective reflectance of incident light by the physical nature of a structure. Although the color effects often appear considerably brighter than those of pigments, structural colors often result from completely transparent materials. In addition, animal structures can be designed to provide the opposite effect to a light display: they can be antireflectors, causing 'all' of the incident light to penetrate a surface (like glass windows, smooth surfaces cause some degree of reflection).

The study of natural structural colors began in the seventeenth century, when Hooke and Newton correctly explained the structural colors of silverfish (Insecta) and peacock feathers (Figure 1), respectively. Nevertheless, accurate, detailed studies of the mechanisms of structural colors did not begin until the introduction of the electron microscope in 1942.

Structural colors generally may be formed by one of three mechanisms: thin-film reflectors, diffraction gratings, or structures causing scattering of light waves. This is a simplistic characterization resulting from the application of geometric optics to the reflective elements. Some structures, however, rather fall between the above categories, such as photonic crystals (Figure 2) – these become apparent when quantum optics is considered. In some cases, this has led to confusion over the reflector type. For example, the reflectors in some scarab beetles have been categorized by different authors as multilayer reflectors, three-dimensional diffraction gratings, photonic crystals, and liquid crystal displays. Perhaps all are correct! Since the above categories are academic, I place individual cases of structural colors in their most appropriate, not unequivocal, category. Also, in this article I will employ geometric optical treatments only.

The array of structural colors found in animals today results from millions of years of evolution. Structures that produce metallic colors have also been identified in extinct animals. Confirmation of this
fact, from ultrastructural examination of exceptionally well-preserved fossils, such as those from the Burgess Shale (Middle Cambrian, British Columbia), 515 million years old, permits the study of the role of light in ecosystems throughout geological time, and consequently its role in evolution. In some cases, such as for ostracod crustaceans, diffraction gratings have been found to increase in their optical sophistication throughout 350 million years of evolution.

Multilayer Reflectors

Light may be strongly reflected by constructive interference between reflections from the different interfaces of a stack of thin films (of actual thickness d) of alternately high and low refractive index (n). For this to occur, reflections from successive interfaces must emerge with the same phase, and this is achieved when the so-called 'Bragg condition' is fulfilled. The optical path difference between the light reflected from successive interfaces is an integral number of wavelengths, as expressed in the equation:

2nd cos Θ = (m + 1/2)λ   [1]

from which it can be seen that the effect varies with angle of incidence (Θ, measured to the surface normal), wavelength (λ), and the optical thickness of the layers (nd). There is a phase change of half a wavelength in waves reflected from every low-to-high refractive index interface only. The optimal narrowband reflection condition is therefore achieved where the optical thickness (nd) of every layer in the stack is a quarter of a wavelength. A multilayer consisting of a large number of layers with a small variation of index is more selective than one with a smaller number of layers with a large difference of index. The former therefore gives rise to more saturated colors, corresponding to a narrow spectral bandwidth, and these colors therefore vary more with a change of angle of incidence. Both conditions can be found in animals – different colored effects are appropriate for different functions under different conditions.

For an oblique angle of incidence, the wavelength of light that interferes constructively will be shorter than that for light at normal incidence. Therefore, as the angle of the incident light changes, the observed color also changes (Figure 3).

Single layer reflectors are found in nature, where light is reflected, and interferes, from the upper and lower boundaries (Figure 4). A difference in the thickness of the layer provides a change in the color observed from unidirectional polychromatic light; the wings of some houseflies act as a single thin film and appear different colors as a result of this phenomenon. A single quarter-wavelength film of guanine in cytoplasm, for example, reflects about 8% of the incident light. However, in a multilayer reflector with ten or more high index layers, reflection efficiencies can reach 100%. Thus, animals possessing such reflectors may appear highly metallic.

The reflectance of the multilayer system increases very rapidly with increasing number of layers. If the dimensions of the system deviate from the quarter-wave condition (i.e., nd is not equal for all layers), then the reflector is known as 'nonideal' in a theoretical sense (it may be 'ideal' for some natural situations). 'Nonideal' reflectors have a reduced proportional reflectance (not always a significant reduction) for a given number of layers and this
Figure 3 Butterfly (Parthenos sylvia (Nymphalidae) from Malaysia), showing the change in color with angle of wings. This animal may
have evolved a temporal signal rather than a static one.
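The angle dependence described by eqn [1] is easy to evaluate: solving for λ gives the constructively reflected wavelength at each angle. A small sketch; the layer values (n = 1.56, d = 90 nm, zero order m = 0) are illustrative assumptions, not measurements from the text:

```python
import numpy as np

def bragg_wavelength(n, d, theta_deg, m=0):
    """Wavelength reflected constructively by eqn [1]:
    2 n d cos(Theta) = (m + 1/2) lambda, Theta measured from the normal."""
    theta = np.radians(theta_deg)
    return 2.0 * n * d * np.cos(theta) / (m + 0.5)

# Illustrative chitin-like layer: n = 1.56, thickness d = 90 nm.
for angle in (0.0, 30.0, 60.0):
    print(angle, bragg_wavelength(1.56, 90e-9, angle) * 1e9)  # wavelength in nm
```

The cos Θ factor is why the reflected wavelength shortens at oblique incidence — the color shift with wing angle shown in Figure 3.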
A broadband, wavelength-independent reflectance, appearing silver or mirror-like to the human eye, can be achieved in a multilayer stack in at least three ways in invertebrates (Figure 8). These are: (a) a composite of regular multilayer stacks, each tuned to a specific wavelength; (b) a stack with systematically changing optical thicknesses with depth in the structure, termed a 'chirped' stack; and (c) a disordered arrangement of layer thicknesses about a mean value, termed a 'chaotic' stack (Figure 8).

The nauplius eye of the copepod Macrocyclops (Crustacea) has regularly arranged platelets ~100 nm thick in stacks of 20–60, achieving the first condition. Silver beetles and the silver and gold chrysalises of butterflies in the genera Euoplea and Amauris owe their reflection to the second condition. The mirror-like reflectors in the scallop Pecten eye comprise alternating layers of cytoplasm (n = 1.34) and guanine crystals (n = 1.83), and approximate an 'ideal' quarter-wave system in the same manner as within fish skin, using the third mechanism. The ommatidia of the superposition compound eyes of Astacus (Crustacea) are lined with a multilayer of isoxanthopterin (a pteridine) crystals, which again fall into the third category. Multilayer reflectors can also be found in the eyes of certain spiders, butterflies, and possibly flies, where they assist vision, as discussed further below.

Squid and cuttlefish, for example, possess mirror-like reflectors in photophores (light organs) and
Figure 6 Bivalve shell with outer pigment removed to reveal an incidental multilayer reflector. Layers evolved to provide strength and
prevent crack propagation.
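The figures quoted above — about 8% reflectance for a single quarter-wave guanine film in cytoplasm, rising toward 100% for a stack with ten or more high-index layers — can be checked with a standard characteristic-matrix (transfer-matrix) calculation at normal incidence. The indices n = 1.34 and n = 1.83 are from the text; the design wavelength and the cytoplasm surround are assumptions of this sketch:

```python
import numpy as np

def multilayer_reflectance(ns_layers, ds, lam, n_in, n_out):
    """Normal-incidence reflectance of a thin-film stack via the standard
    characteristic-matrix method. ns_layers/ds: index and physical thickness
    of each layer; lam: vacuum wavelength."""
    M = np.eye(2, dtype=complex)
    for nn, d in zip(ns_layers, ds):
        delta = 2.0 * np.pi * nn * d / lam
        M = M @ np.array([[np.cos(delta), 1j * np.sin(delta) / nn],
                          [1j * nn * np.sin(delta), np.cos(delta)]])
    B, C = M @ np.array([1.0, n_out])
    r = (n_in * B - C) / (n_in * B + C)
    return abs(r) ** 2

lam = 500e-9
n_hi, n_lo = 1.83, 1.34                           # guanine, cytoplasm
d_hi, d_lo = lam / (4 * n_hi), lam / (4 * n_lo)   # quarter-wave thicknesses

# One guanine film in cytoplasm vs. a stack with ten high-index layers
single = multilayer_reflectance([n_hi], [d_hi], lam, n_lo, n_lo)
stack_n = ([n_hi, n_lo] * 10)[:-1]                # 10 H layers, 9 L spacers
stack_d = ([d_hi, d_lo] * 10)[:-1]
ten = multilayer_reflectance(stack_n, stack_d, lam, n_lo, n_lo)
print(single, ten)
```

The single film comes out near 9% — the same order as the ~8% quoted — while ten high-index layers exceed 99%, consistent with the highly metallic appearance described above.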
Figure 14 Scanning electron micrograph of the compound eye of Zalea minor (Diptera). The corneal surface at the junction of
two ommatidia is shown with antireflection gratings (positioned on sloping regions, i.e., at high angles to the normal of the eye) phasing
into moth-eye-type protuberances. Scale bar represents 2 mm.
colored light from the background that would dilute or alter the color. Additionally, if a dark pigment such as melanin is interspersed with the scattering elements, the reflection will appear a shade of gray or brown. The cells of the reflector in the photophore of a beetle ('fire-fly') are packed with sphaerocrystals of urate that cause a diffuse reflection (Figure 20).

Reflection and refraction that occur at the interfaces of strata with different refractive indices may result in the display of white light; the degree of whiteness depends upon the difference in refractive indices. This mechanism is evident in the shells of many lamellibranch molluscs. Between the outer, often pigmented layer and the mantle is a thick middle layer of crystalline calcium carbonate, and the inner surface of this (nacreous) layer is lined with multiple laminations of the same salt. In most species these laminations are sufficiently thick (>1 μm) to render the inner lining white, although in some species they become thin so as to form a multilayer reflector. Calcium carbonate similarly produces whiteness in Foraminifera and in calcareous sponges, corals, echinoderms, and crustacean cuticles. Also in the class of white solids is silica, in diatom tests and the skeletons of hexactinellid sponges.

Other forms of scattering also exist and result in a blue colored effect (red when the system is viewed in transmission). Tyndall or Mie scattering occurs in a colloidal system where the particle size approximates the wavelength of light; here, diffraction is important. Rayleigh scattering occurs in molecules in a two-photon process by which a photon is absorbed and raises the molecule to an excited electronic state, from which it re-radiates a photon when it returns to the ground state; diffraction is not involved here.

Tyndall scattered light is polarized under obliquely incident light. The intensity of the resultant blue is increased when it is viewed against a dark background, such as melanin. The relative sizes of particles determine the shade of blue. If the particles responsible for the scattering coalesce to form particles with a diameter greater than about 1 μm, then white light is observed (see above). A gradation from blue to white scattering ('small' to 'large' particles) occurs on the wings of the dragonfly Libellula pulchella.

Scattered blues can also be found in other dragonflies. In the aeschnids and agrionids, the epidermal cells contain minute colorless granules and a dark base. The males of libellulids and agrionids produce a waxy secretion that scatters light similarly over their dark cuticle. The green of the female Aeschna cyanea is the combined result of Tyndall scattering and a yellow pigment, both within the epidermal cells (degradation of the yellow pigment turns the dead dragonfly blue).

Scattered blues are also observed from the skin of the cephalopod Octopus bimaculatus, where a pair of ocelli are surrounded by a broad blue ring. Blue light is scattered from this region as a result of fine granules of purine material within cells positioned above melanophore cells. The color and conspicuousness of the ring are controlled by the regulation of the melanophores, by varying the distribution of melanin
Figure 20 Sections of photophores (bioluminescent organs), external surface of animals on the right. (a) The ‘fire-fly’ beetle
Pyrophorus sp., abdominal photophore of male – reflector is based on a scattering system. (b) The shrimp Sergestes prehensilis,
showing lens layers and absorptive pigment – reflector is based on a multilayer type.
and consequently the density of the absorbing screen. The squid Onychia caribbaea can produce rapidly changing blue colors similarly. The bright blue patterns produced by some nudibranch molluscs result from reflecting cells containing small vesicular bodies, each composed of particles about 10 nm in diameter and therefore appropriate for Rayleigh scattering.

See also

Diffraction: Diffraction Gratings. Scattering: Scattering Theory.

Further Reading

Fox DL (1976) Animal Biochromes and Structural Colours. Berkeley, CA: University of California Press.
Fox HM and Vevers G (1960) The Nature of Animal Colours. London: Sidgwick and Jackson Ltd.
Gale M (1989) Diffraction, beauty and commerce. Physics World 2: 24–28.
Harvey EN (1952) Bioluminescence. New York: Academic Press Inc.
Herring PJ, Campbell AK, Whitfield M and Maddock L (1990) Light and Life in the Sea. Cambridge, UK: Cambridge University Press.
Hutley MC (1982) Diffraction Gratings. London: Academic Press.
Land MF and Nilsson D-E (2002) Animal Eyes. Oxford, UK: Oxford University Press.
Pain S (1998) Seeing the light. New Scientist 2161: 42–46.
Parker AR (1999) Light-reflection strategies. American Scientist 87: 248–255.
Parker AR (2003) In the Blink of an Eye. London: Simon and Schuster / Cambridge, USA: Perseus Publishing.
Parker AR, Welch VL, Driver D and Martini N (2003) An opal analogue discovered in a weevil. Nature 426: 786–787.
Parker AR, McPhedran RC, McKenzie DR, Botten LC and Nicorovici N-AC (2001) Aphrodite's iridescence. Nature 409: 36–37.
Microstructure Fibers
R S Windeler, OFS Laboratories, Murray Hill, NJ, USA
© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

Microstructure fibers have unique properties that cannot be obtained from traditional fibers (i.e., all-glass, doped silica fibers) and can deliver functionalities superior to many of today's best transmission and specialty fibers. Their unique properties are obtained from an intricate cross-section of high and low index regions that traverse the length of the fiber. The vast majority of these fibers consist of silica for the high-index region and air for the low-index region. These fibers are known by several different names, including microstructure fiber, holey fiber, and photonic crystal fiber. The term microstructure fiber is used in this chapter because it is the most generic, encompassing all the fiber types.

Microstructure fiber properties vary greatly and are determined by the size and arrangement of air holes in the fiber. For example, fibers have been fabricated such that the numerical aperture of the core or cladding approaches one. They have been fabricated with unique dispersion profiles, such as a zero group-velocity dispersion in the near-visible regime or an ultraflat dispersion over hundreds of nanometers wide. They have been fabricated to generate a supercontinuum over two octaves wide. Some guide light with an air core via bandgap guidance. In addition, the air holes in these microstructure fibers can be filled with highly tunable materials, giving one the capability of controlling the fiber properties for use in energy-efficient devices.

Microstructure fibers can have doped regions like traditional fibers. These hybrid fibers combine the benefits of traditional and microstructure fibers. The doped core typically guides most of the light, but its guidance properties can be strongly influenced by the air regions. Applications for hybrid fibers include dispersion-compensating, polarization-maintaining, and bend-insensitive fibers.

Loss is always an important factor when determining whether a microstructure fiber will compete with or replace a traditional optical fiber. The index profiles that make these fibers so special can also lead to fibers that have an inherently high loss at connections or along the length of the fiber. Loss at connections typically occurs due to mode mismatch between the two fibers, undesired hole collapse, or poor alignment. Loss along the fiber length can occur due to impurities, hole surface roughness, or poor confinement.

The dispersion profile (zero dispersion wavelength, dispersion slope, normal or anomalous) of microstructure fibers can be optimized more than a traditional fiber, because the hole size and placement
DIFFRACTIVE SYSTEMS / Microstructure Fibers 317
can be arranged to make the cladding index strongly wavelength dependent. Dispersion causes a pulse to spread because the phase velocity is wavelength dependent. Normal dispersion is when longer wavelengths travel faster than shorter wavelengths. Anomalous dispersion is just the opposite – shorter wavelengths travel faster than longer wavelengths. Dispersion determines the amount of interaction between different wavelengths. For applications such as transmission fiber, one wants little interaction, and for highly nonlinear applications, one wants significant interaction.

This chapter examines the methods for fabricating microstructure fibers and reviews the different types of microstructure fibers and their properties. Fibers that guide by total internal reflection are reviewed separately from bandgap-guided fibers because their properties are significantly different. Lastly, a description and example of an active device is presented in which the air regions of a microstructure fiber are filled with controllable material.

Microstructure Fiber Fabrication

At first glance, fabrication of microstructure fibers looks similar to traditional fibers in that the fibers are fabricated in a two-step process. First a preform is fabricated and then it is drawn (stretched) into a fiber. However, when the process is examined in more detail, both the preform fabrication and draw depart significantly from traditional methods. First, a novel preform fabrication method is used to incorporate air holes that run the length of the preform. Second, novel drawing procedures are used to keep the air holes open as the preform diameter is reduced by several orders of magnitude during draw.

Preform Fabrication

… which will cause deviations between the fiber and preform profiles.

Several methods are used to fabricate microstructure preforms, including stacking, casting, extrusion, and drilling. Each process has its advantages and disadvantages. By far, the most common method is stacking, in which tubes, rods, and core rods (rods containing doped cores fabricated by traditional optical fiber techniques) are stacked in a close-packed pattern such as a triangular or hexagonal lattice (Figure 1). The bundle of tubes and rods is then bound together with a large tube, called an overclad tube. Fibers can be made with complicated or asymmetric index profiles by strategically placing tubes with the same outer diameter (for good stacking) but a different inner diameter to change the index in that location. For example, a fiber with smaller holes on opposite sides of the core produces high birefringence.

The main advantages of stacking are that no special equipment is needed for fabricating microstructure preforms and that doped cores are easily added. However, there are several disadvantages. First, the stacking method is limited to simple geometries of air holes because the tubes are stacked in a close-packed arrangement. Second, unless hexagonal tubes are used, interstitial areas are created between the tubes that may not be desired in the final fiber. Third, the method is labor intensive and requires significant glass handling.

A variation of the stacking method is used for fabricating air-clad fibers. Air-clad preforms are created by placing a layer of small tubes around the perimeter of a large core rod. The assembly is then overcladed for strength (Figure 2).

Extrusion and casting of glass powder, polymer, or sol–gel slurry are also used for making single material …
wavelength of the guided light, the cladding index can be modeled as an average index weighted by the volume fraction of the two materials (typically air and silica). As the hole diameter or spacing approaches the wavelength of the guided light, complicated models are used, such as the finite difference method, finite element method, beam propagation method, or multipole method.

Endlessly single-mode fiber

The endlessly single-mode fiber consists of a solid core surrounded by a two-dimensional array of small air holes in the cladding (Figure 3). The size and position of the air holes are arranged such that only the fundamental mode exists in the core regardless of the wavelength of the guided light.

The prevention of higher-order modes can be understood by examining the V number. The V number is a dimensionless parameter that describes the guiding nature of the waveguide. In traditional fiber, when the V number is less than 2.405, the fiber will guide only the fundamental mode. The V number is given by

V = (2πa/λ)(n_core² − n_clad²)^(1/2)   [1]

where a is the core diameter, λ is the wavelength of the guided light, and n_core and n_clad are the indices of refraction of the core and cladding, respectively. In traditional fiber, the core and cladding indices (n_core and n_clad) are for the most part constant, so the V number increases as the wavelength decreases. However, for the endlessly single-mode microstructure fiber, the V number does not increase indefinitely with decreasing wavelength, but approaches a constant value. Obtaining a V number independent of wavelength is understood by examining the distance the light propagates into the cladding as a function of wavelength. As the wavelength becomes shorter, light penetrates a shorter distance into the cladding, interacting with less of the air regions. This causes the effective cladding index (n_clad) to increase with decreasing wavelength, such that V approaches a constant value.

Microstructure fibers are made endlessly single-moded over all wavelengths by properly designing the hole diameter-to-pitch ratio (d/Λ). When d/Λ is low enough, the microstructure fiber guides light only in the single mode regardless of wavelength and core size (Figure 4). However, these large-core-diameter fibers are limited by long-wavelength bend loss in the same way as large-core single-mode traditional fibers.

High delta core fiber

High delta core microstructure fibers (HDCMF) consist of a small (typically less than 2 microns) solid silica core surrounded by one or more layers of large air holes closely spaced together (Figure 5). This creates a very large index difference between the core and cladding. A core-to-cladding index difference of 0.4 is easily achieved in a HDCMF, which is an order of magnitude larger than can be achieved with traditional fiber.

Even though the core is very small, these fibers are almost always multimoded due to the large index difference between the core and cladding. However, because of its small core, it is difficult to launch anything other than the fundamental mode into the fiber. Once a mode is guided in the fiber, it remains in that mode due to the large difference in effective indices between modes (Figure 6). When higher-order modes are purposely generated in the fiber, they also …

Figure 3 Photograph of an endlessly single-mode fiber (Reprinted with permission from Birks TA, Knight JC and Russell P (1997) Endlessly single mode photonic crystal fiber. Optics Letters 22: 961–963).

Figure 4 The effective V number as a function of Λ/λ for various d/Λ (Reprinted with permission from Birks TA, Knight JC and Russell P (1997) Endlessly single mode photonic crystal fiber. Optics Letters 22: 961–963).
Figure 7 Far-field patterns show that higher order modes propagate long distances in the HDCMF without coupling to other modes
(Reprinted with permission from Ranka JK, Windeler RS and Stentz AJ (2000) Optical properties of high-delta air-silica microstructure
optical fibers. Optics Letters 25: 796– 798).
Figure 8 Plot of a supercontinuum spectrum over two octaves wide generated from a 75 cm piece of HDCMF (Reprinted with
permission from Ranka JK, Windeler RS and Stentz AJ (2000) Visible continuum generation in air-silica microstructure optical fibers with
anomalous dispersion at 800 nm. Optics Letters 25: 25–27).
Figure 9 Photograph of fiber generating the supercontinuum. The bottom photo shows the continuum after passing through
a prism.
Figure 10 Schematic drawing of the tapered microstructure fiber (Reprinted with permission from Chandalia JK, Eggleton BJ, Windeler RS, Kosinski SG, Liu X and Xu C (2001) Adiabatic coupling in tapered air-silica microstructures optical fiber. IEEE Photonics Tech. Letters 13(1) (© 2001 IEEE)).
cladding is surrounded by a layer of air holes and then a protective outer silica cladding for strength. Since the core of the TMF is very similar to a traditional single-mode fiber core, the splice loss is low. Splice losses below 0.075 dB are typical.

Properties similar to the HDCMF are obtained by adiabatically tapering the TMF while maintaining the cross-sectional profile of the fiber. In the taper region, the fundamental mode is no longer guided by the germanium core and evolves into the fundamental mode of the silica region, where it is confined by the ring of air holes. The taper waist is typically ten to twenty centimeters long and has properties identical to the HDCMF. When the fiber is adiabatically expanded back to its original size, the light is guided back into the standard-diameter germanium core, making it easy to splice to traditional fiber with low loss.

The same effect can be obtained by stretching a traditional fiber from 125 microns down to 2 microns. However, there are several disadvantages compared to the tapered microstructure fiber. The taper region is much longer in the traditional fiber; the optimal diameter of the silica core is harder to maintain; the taper region must remain clean and uncoated; and the taper region is much weaker.

Air-clad fiber

Air-clad fibers consist of a doped core surrounded by a silica inner cladding. The inner cladding is surrounded by a ring of air, which in turn is surrounded by an outer silica cladding for strength (Figure 2). Thin silica webs connect the inner and outer claddings to hold the inner fiber region in place. The region consisting of air and thin silica webs creates an optical barrier, preventing most of the light from escaping the inner cladding. Such fibers are referred to as double-clad fiber.

In traditional double-clad fibers, a material of lower index (typically another layer of glass or a layer of polymer) immediately surrounds the inner silica cladding. A key optical parameter describing the light-guiding properties of the inner cladding is the numerical aperture (NA). The NA is defined as the sine of the largest (acceptance) angle of light that will be guided in the inner cladding. The upper NA values of traditional double-clad fibers (~0.22 and ~0.45 for glass and polymer outer claddings, respectively) are limited by the refractive indices of the available cladding materials.

For the air-clad fiber, the NA is determined by the ability of light to escape from the inner cladding through the silica webs. The NA of the air-clad fiber can be calculated by modeling the web as an infinitely long slab waveguide. As seen in Figure 11, the NA increases as the webs become thinner or the wavelength of light becomes longer. To obtain an NA similar to what is achieved by coating a traditional fiber with a low-index polymer (NA ≈ 0.45), the web thickness must be about equal to the wavelength of the light. The NA of the air-clad fiber increases dramatically as the web thickness becomes less than half the wavelength of the guided light. Numerical apertures above 0.90 have been experimentally demonstrated.

The large NA values achievable with air-clad fibers are an important advantage for advanced double-clad …

Figure 11 Predicted NA as a function of the web thickness divided by the wavelength using an infinite slab model.

Figure 12 Schematic of the mechanism of (a) index-guided light and (b) bandgap-guided light (Reprinted with permission from Cregan RF, Mangan BJ, Knight JC, Birks TA, Russell PJ, Roberts PJ and Allan DC (1999) Single-mode photonic band gap guidance of light in air. Science 285. Copyright 1999 American Association for the Advancement of Science).
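For a step-index double-clad geometry, the NA limits quoted above follow directly from the inner/outer index contrast. A minimal sketch, with assumed refractive indices (silica ≈ 1.45; outer-cladding indices chosen only to reproduce the ~0.22 and ~0.45 limits; none are values given in the article):

```python
import math

def double_clad_na(n_inner, n_outer):
    # NA = sin(theta_max) = sqrt(n_inner^2 - n_outer^2) for a step-index
    # inner cladding surrounded by a lower-index outer cladding.
    return math.sqrt(n_inner**2 - n_outer**2)

na_glass = double_clad_na(1.45, 1.433)   # glass outer cladding (assumed index)
na_polymer = double_clad_na(1.45, 1.38)  # low-index polymer (assumed index)
# Replacing the outer cladding with air (n = 1) drives the formula past 1,
# i.e., the acceptance angle saturates and NA -> 1.
na_air = min(1.0, double_clad_na(1.45, 1.0))
print(round(na_glass, 2), round(na_polymer, 2), na_air)
```

The thin-web air-clad NA itself requires the slab-waveguide model of Figure 11 rather than this bulk-index formula, since light tunnels through the webs.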
Figure 13 Photographs and drawing of three types of bandgap fibers: (a) triangular array cladding with an air core (Cregan RF,
Mangan BJ, Knight JC, Birks TA, Russell PJ, Roberts PJ and Allan DC (1999) Single-mode photonic band gap guidance of light in air.
Science vol. 285), (b) tunable with a silica core, (c) concentric rings with an air core (Reprinted with permission from Temelkuran et al.
Nature 420. Copyright 1999 American Association for the Advancement of Science).
Figure 15 Holes in the microstructure fibers are filled with controllable materials to affect the fiber properties actively. Photos show
(a) original fiber and (b) fiber after polymer is inserted and cured in the microstructure fiber.
Bandgap fibers guide light in the core (also referred to as a defect) by confining the light through constructive interference due to Bragg scattering (Figure 12). Unlike traditional fibers, this mechanism allows light to propagate in a core that has an index lower than the cladding. Bandgap fibers with an air core could theoretically be very low loss, propagate high powers, have a large effective-area core, and exhibit very low nonlinearities. In addition, the fibers can be used in novel devices by filling the core with special gases or liquids for nonlinear processes.

Bandgap fibers can be generalized into three types (Figure 13). The first type of fiber has an air core surrounded by a cladding that consists of a periodic array of air holes in silica. Fibers of this type have been designed to guide a single mode and hold the greatest promise of guiding light over long distances with very low loss. The second type of bandgap fiber consists of a silica core and a silica cladding with a periodic array of holes which are filled with a high-index, tunable liquid. This design allows the bandgap properties to be tuned and is of interest for use in devices. The third fiber type consists of rings of alternating high- and low-index dielectric material with a center air core. They typically have a large, multimode core which is ideal for sending high powers. In addition, very little light is in the dielectric materials, so the fiber can be designed to guide any wavelength with relatively little loss.

The spectrum of a bandgap fiber is quite different from that of an index-guided fiber (Figure 14). Frequencies that lie within the bandgap cannot propagate in the cladding and are confined to the defect in the lattice (the core). Bandgap fibers with ideal geometry are predicted to have losses over an order of magnitude lower than can be obtained in an ideal traditional fiber. Although the loss of bandgap fibers has not been demonstrated to be less than traditional fiber at telecommunication wavelengths, bandgap fibers have been shown to have much lower loss than traditional fibers at other wavelengths.

Tunable Microstructure Fiber Devices

Unique tunable devices are made by filling the air holes of microstructure fibers with materials whose properties can be controlled actively (Figure 15). The materials can be positioned in the core or cladding to affect the corresponding modes. Most commonly, the index of refraction is changed using temperature-sensitive liquids or polymers, although
Figure 18 Schematic of a tunable device using both high and low index liquids with two heaters. The plot shows that the position and
strength of the spectrum are adjusted independently (Reproduced with permission from Mach P, Dolinski M, Baldwin KW, Rogers JA,
Kerbage C, Windeler RS and Eggleton BJ (2002) Tunable microfluidic optical fiber. Applied Physics Letters 80(23). Copyright (2002) by
the American Physical Society).
materials that respond to electric or magnetic fields enable a quicker response time. These material-filled active fibers can be used to fabricate devices like tunable filters, switches, broadband attenuators, and fibers with tunable birefringence. Below, a few examples of tunable filters are presented in which the device properties depend on the type and position of the liquid inserted in the fiber holes.

A microstructure tunable filter works by controlling the spectral position, shape, or strength of a longitudinal long-period grating written in the fiber. A grating is written in the fiber before filling the holes with tunable liquid. The position of the peak absorption wavelength is controlled by placing a liquid, with an index close to silica and strongly temperature dependent, in air holes in the cladding region over the grating (Figure 16). The cladding index is tuned by varying the temperature of the liquid, using a small thin-film heater located over the liquid. The position of the peak absorption wavelength moves as the cladding index is changed.

The strength or depth of the filter is controlled by inserting a movable plug of high-index fluid in a sealed air section of a microstructure fiber (Figure 17). The strength of the grating is determined by the fraction of the grating surrounded by high-index fluid. The position of the liquid plug relative to the grating can be finely adjusted with a thin-film heater located over the air region adjacent to the material plug. As heat is applied to the air, the air expands and pushes the liquid plug over the grating, decreasing the coupling of the fundamental mode to higher-order cladding modes. Figure 18 shows an example in which the position and strength of a filter are varied independently using two heaters and two adjacent materials.

See also

Photonic Crystals: Photonic Crystal Lasers, Cavities and Waveguide.

Further Reading

Birks TA, Knight JC and Russell P (1997) Endlessly single mode photonic crystal fiber. Optics Letters 22(13).
Bjarklev A, Broeng J and Bjarklev AS, Photonic Crystal Fibers. Boston: Kluwer Academic Publishers.
Broeng J, Mogilevstev D, Bjarklev A and Barkou SE (1999) Photonic crystal fibers: A new class of optical waveguides. Optical Fiber Technology 5.
Chandalia JK, Eggleton BJ, Windeler RS, Kosinski SG, Liu X and Xu C (2001) Adiabatic coupling in tapered air-silica microstructures optical fiber. IEEE Photonics Tech. Letters 13(1).
Cregan RF, Mangan BJ, Knight JC, Birks TA, Russell PJ, Roberts PJ and Allan DC (1999) Single-mode photonic band gap guidance of light in air. Science 285.
Eggleton BJ, Westbrook PS, White CA, Kerbage C, Windeler RS and Burdge GL (2000) Cladding-mode resonance in air-silica microstructure optical fiber. Journal of Lightwave Technology 18(8).
Joannopoulos JD, Meade RD and Winn JN (1995) Photonic Crystals. Princeton, NJ: Princeton University Press.
Johnson SG, Ibanescu M, Skorobogatiy M, Weisberg O, Engeness TD, Soljacic M, Jacobs SA, Joannopoulos JD and Fink Y (2001) Low-loss asymptotically single-mode propagation in large-core OmniGuide fibers. Optics Express 9(13).
Kerbage C, Hale A, Yablon A, Windeler RS and Eggleton BJ (2001) Integrated all-fiber variable attenuator based on hybrid microstructure fiber. Applied Physics Letters 79(19).
Kerbage C, Windeler RS, Eggleton BJ, Dolinskoi M, Mach P and Rogers JA (2002) Tunable devices based on dynamic positioning of micro-fluids in micro-structure optical fiber. Optics Communications 204: 179–184.
Knight JC, Birks TA, Russell PJ and Atkin DM (1996) All-silica single-mode optical fiber with photonic crystal cladding. Optics Letters 21(19).
Knight JC, Birks TA and Russell PJ (2001) 'Holey' silica fibers. In: Markel VA and George TF (eds) Optics of Nanostructured Materials. New York: John Wiley & Sons.
Mach P, Dolinski M, Baldwin KW, Rogers JA, Kerbage C, Windeler RS and Eggleton BJ (2002) Tunable microfluidic optical fiber. Applied Physics Letters 80(23).
Ranka JK, Windeler RS and Stentz AJ (2000) Visible continuum generation in air-silica microstructure optical fibers with anomalous dispersion at 800 nm. Optics Letters 25(1).
Ranka JK, Windeler RS and Stentz AJ (2000) Optical properties of high-delta air-silica microstructure optical fibers. Optics Letters 25: 796–798.
DIFFRACTIVE SYSTEMS / Omnidirectional Surfaces and Fibers 327
… dielectric. Metallic mirrors reflect light over a broad range of frequencies incident from arbitrary angles (i.e., omnidirectional reflectance). However, at infrared and optical frequencies a few percent of the incident power is typically lost due to absorption. Multilayer dielectric mirrors are used primarily to reflect a narrow range of frequencies incident from a particular angle or particular angular range. Unlike their metallic counterparts, dielectric reflectors can be extremely low loss. The ability to reflect light of arbitrary angle of incidence for all-dielectric structures has been associated with the existence of a complete photonic bandgap, which can exist only in a system with a dielectric function that is periodic along three orthogonal directions. In fact, a sufficient condition for the achievement of omnidirectional reflection in a periodic system with an interface is the …

… frequency of ω = c|k|. The wavevector together with the normal to the periodic structure defines a mirror plane of symmetry which allows us to distinguish between two independent electromagnetic modes: transverse electric (TE) modes and transverse magnetic (TM) modes. For the TE mode the electric field is perpendicular to the plane, as is the magnetic field for the TM mode.

General features of the transport properties of the finite structure can be understood when the properties of the infinite structure are elucidated. In a structure with an infinite number of layers, translational symmetry along the direction perpendicular to the layers leads to Bloch wave solutions of the form:

u_K(x, y) = E_K(y) e^(iKy) e^(i k_x x)   [1]

Figure 1 Schematic of the multilayer system showing the layer parameters (n_a, h_a – index of refraction and thickness of layer a), the incident wavevector k and the electromagnetic mode convention.
where E_K(y) is periodic, with a period of length a, and K is the Bloch wave number. These waves represent solutions to an eigenvalue problem and are completely and uniquely defined by the specification of K, k_x, and ω.

Solutions can be propagating or evanescent, corresponding to real or imaginary Bloch wave numbers, respectively. It is convenient to display the solutions of the infinite structure by projecting the ω(K, k_x) function onto the ω–k_x plane; Figures 2a,b are examples of such projected structures.

The gray background areas highlight phase space where K is strictly real, that is, regions of propagating states, while the white areas represent regions containing evanescent states. The shape of the projected bandstructures for the multilayer film can be understood intuitively. At k_x = 0, the bandgap for waves traveling normal to the layers is recovered. For k_x > 0, the bands curve upward in frequency. As k_x → ∞, the modes become largely confined to the slabs with the high index of refraction and do not couple between layers (and are therefore independent of k_x).

In a finite structure, the translational symmetry in the directions parallel to the layers is preserved, hence k_x remains a conserved quantity and can be used to label solutions even though these solutions will no longer be of the Bloch form. The relevance of the band diagram to finite structures is that it allows for the prediction of regions of phase space where waves are evanescently decaying in the multilayer structure.
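The propagating-versus-evanescent distinction can be made concrete with the standard normal-incidence (k_x = 0) Bloch relation for a two-layer stack, cos(Ka) = cos(φ1)cos(φ2) − ½(n1/n2 + n2/n1)sin(φ1)sin(φ2) with φi = ni ω hi/c: |cos(Ka)| ≤ 1 gives real K (propagating states, the gray regions), while |cos(Ka)| > 1 forces a complex K (evanescent states, the white regions). This textbook transfer-matrix result is not derived in the article; the sketch below applies it to the Figure 2b film parameters (n1 = 4.6, h1 = 0.8, n2 = 1.6, h2 = 1.6):

```python
import math

def cos_Ka(f, n1, h1, n2, h2):
    # Normal-incidence Bloch dispersion for a bilayer of period a = h1 + h2;
    # f is the dimensionless frequency omega*a/(2*pi*c).
    a = h1 + h2
    p1 = 2 * math.pi * f * n1 * h1 / a
    p2 = 2 * math.pi * f * n2 * h2 / a
    return (math.cos(p1) * math.cos(p2)
            - 0.5 * (n1 / n2 + n2 / n1) * math.sin(p1) * math.sin(p2))

def propagating(f, layers=(4.6, 0.8, 1.6, 1.6)):
    # A real Bloch wave number K (propagating state) requires |cos(Ka)| <= 1.
    return abs(cos_Ka(f, *layers)) <= 1.0

# Scan for the first stop band at k_x = 0 for the Figure 2b film:
gap = [f / 1000 for f in range(1, 1000) if not propagating(f / 1000)]
print(f"first gap edge near f = {gap[0]} (units of c/a)")
```

Frequencies collected in `gap` are bandgap states: K is complex there, the field decays into the stack, and light at those frequencies is reflected – exactly the white regions of the projected band structure along k_x = 0.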
Since we are primarily interested in waves originating from the homogeneous medium external to the periodic structure, we will focus only on the portion of phase space lying above the light line. Waves originating from the homogeneous medium satisfy the condition ω ≥ ck_x/n_0, where n_0 is the refractive index of the homogeneous medium, and therefore they must reside above the light line. States of the homogeneous medium with k_x = 0 are normally incident, and those lying on the ω = ck_x/n_0 line with k_y = 0 are incident at an angle of 90°.

The states in Figure 2a that lie in the restricted phase space defined by the light line and that have an (ω, k_x) corresponding to the propagating solutions (gray areas) of the crystal can propagate in both the homogeneous medium and in the structure. These waves will partially or entirely transmit through the film. Those with (ω, k_x) in the evanescent regions (white areas) can propagate in the homogeneous medium but will decay in the crystal – waves corresponding to this portion of phase space will be reflected off the structure.
The multilayer system leading to Figure 2a represents a structure with a limited reflectivity cone, since for any frequency one can always find a k_x vector for which a wave at that frequency can propagate in the crystal – and hence transmit through the film. The necessary and sufficient criterion for omnidirectional reflectivity at a given frequency is that there exists no transmitting state of the structure inside the light cone – this criterion is satisfied by the frequency ranges marked in light gray in Figure 2b. In fact, the system leading to Figure 2b exhibits two omnidirectional reflectivity ranges.

The omnidirectional range is defined from above by the normal-incidence bandedge ω_h(k_y = π/a, k_x = 0) (Figure 2b) and from below by the intersection of the top of the TM allowed bandedge with the light line, ω_l(k_y = π/a, k_x = ω_l/c) (Figure 2b).

Figure 2 (a) Projected bandstructure of a multilayer film with the light line and Brewster line, exhibiting a reflectivity range of limited angular acceptance (n_1 = 2.2, n_2 = 1.7, and a thickness ratio of h_2/h_1 = 2.2/1.7). (b) Projected bandstructure of a multilayer film together with the light line and Brewster line, showing an omnidirectional reflectance range at the first and second harmonic (propagating states – dark gray, evanescent states – white, omnidirectional reflectance range – light gray). The film parameters are n_1 = 4.6 and n_2 = 1.6 with a thickness ratio of h_2/h_1 = 1.6/0.8. These parameters are similar to the actual polymer–tellurium film parameters measured in the experiment. Reproduced with permission from Fink Y, Winn JN, Fan S, et al. (1998) A dielectric omnidirectional reflector. Science 282: 1679–1682. Copyright 1998 American Association for the Advancement of Science.
Figure 7 SEM micrographs of 400 µm OD fiber cross-section. The entire fiber is embedded in epoxy. (a) shows the entire fiber cross-
section, with mirror structure surrounding the PES core; (b) demonstrates that the majority of the fiber exterior is free of significant
defects and that the mirror structure adheres well to the fiber substrate; and (c) reveals the ordering and adhesion within the alternating
layers of As2Se3 (bright layers) and PES. Stresses developed during sectioning caused some cracks in the mounting epoxy that are
deflected at the fiber interface. Fibers from this batch were used in the reflectivity measurements recorded below in Figure 9a.
Reproduced with permission from Hart SD, Maskaly GR, Temelkuran B, Prideaux PH, Joannopoulos JD and Fink Y (2002) External
reflection from omnidirectional dielectric mirror fibers. Science 296: 510– 513. Copyright 2002 American Association for the
Advancement of Science.
simultaneously sampling reflected light from multiple fibers, agree quite well with single-fiber data (Figure 9b).

These reflectivity results are strongly indicative of uniform layer-thickness control, good interlayer adhesion, and low interdiffusion through multiple thermal treatments. This was confirmed by SEM inspection of fiber cross-sections (Figure 7). The layer thicknesses observed (a = 0.90 µm for the 400 µm fibers; a = 0.45 µm for the 200 µm fibers) correspond well to the measured reflectivity spectra. The fibers have a hole in the center, due to the choice of a hollow rod as the preform substrate, which experienced some nonuniform deformation during draw. The rolled-up mirror structure included a double outer layer of PES for mechanical protection, creating a noticeable absorption peak in the reflectivity spectrum at ~3.2 µm (Figure 9a).

A combination of spectral and direct imaging data demonstrates excellent agreement with the photonic band diagram. The measured gap width (range to mid-range ratio) of the fundamental gap for the 400 µm OD fiber is 27%, compared to 29% in the photonic band diagram.

Tunable 'Fabry–Perot' Fibers

The fabrication of fibers surrounded by or lined with alternating layers of materials with a large disparity in their refractive indices presents interesting opportunities for passive and active optical devices. While a periodic multilayer structure, such as the one reported above, leads to the formation of photonic bandgaps and an associated range of high reflectivity, it is the incorporation of intentional deviations from periodicity, also called 'defects', which allows for the creation of localized electromagnetic modes in the vicinity of the defect. These structures, sometimes called optical cavities, in turn can provide the basis for a large number of interesting passive and active optical devices such as vertical-cavity surface-emitting lasers (VCSELs), bi-stable switches, tuneable dispersion compensators, tuneable drop filters, etc. Here we report on the fabrication of a fiber surrounded by a Fabry–Perot cavity structure and demonstrate that by application of axial mechanical stress, the spectral position of the resonant Fabry–Perot mode can be reversibly tuned.

Structure and Optical Properties of the Fabry–Perot Fibers

The fabrication technique described above allows for the accurate placement of optical cavities that can encompass the entire or partial fiber circumference. The fibers discussed above are made of As2Se3 and PES and have a low-index Fabry–Perot cavity (Figure 10).

This structure was achieved by introducing an extra polymer layer in the middle of the periodic multilayer structure of the preform, thus generating a defect mode in the photonic bandgaps of the drawn fibers. The position of the bandgap center is linearly related to the optical thickness of the layers by the Bragg condition. Through accurate outer-diameter control afforded by the laser micrometer mounted on the draw tower, we have been able to place gaps (Figure 11) at wavelengths ranging from 11 microns to below 1 micron. Such cost-effective tuneable optical filters could lead to applications such as optical …
Figure 10 Schematic of the structure of dielectric mirror fibers made of As2Se3 (light gray) and PES (dark gray) with low index Fabry–
Perot cavity. The local cylindrical coordinate system is represented as well as the applied axial strain 1zz . Typical radii are
,150 mm ^ 100 mm. Backscattered SEM micrographs of the cross-sections of a 460 micron diameter fiber (a, b, c) and of a 240 micron
diameter fiber (d, e, f) embedded in epoxy and microtomed. (a) and (d) show the entire cross-section of the fibers, (b) and (e)
demonstrate long-range layer uniformity, and (c) and (f) reveal the ordering and adhesion of the Fabry– Perot cavity structure.
Bar scales have been redrawn for clarity. Reproduced with permission from Benoit G, et al. (2003) Static and dynamic properties of
optical cavities in photonic bandgap gains. Advanced Materials 15: 2053– 2056.
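The linear Bragg-condition scaling invoked above (the bandgap center tracks the optical thickness of the layers) can be sketched numerically. This is an illustrative sketch only: the normal-incidence form of the Bragg condition is assumed, the layer thicknesses are assumptions, and for hollow-core guidance (discussed later in this article) the gap position also depends on the angle of incidence, so the numbers are indicative rather than the article's measured values.

```python
# Illustrative sketch: fundamental bandgap center of a periodic two-material
# stack from the normal-incidence Bragg condition, and its linear scaling
# with the draw-down ratio. Indices ~2.8 (As2Se3) and ~1.55 (PES) are quoted
# in the article; the bilayer thicknesses below are assumptions.

def bragg_center(n_hi, d_hi, n_lo, d_lo):
    """Gap center lambda_0 = 2*(n_hi*d_hi + n_lo*d_lo), same units as d."""
    return 2.0 * (n_hi * d_hi + n_lo * d_lo)

n_glass, n_polymer = 2.8, 1.55
d_glass, d_polymer = 0.27, 0.90          # microns (assumed bilayer)

lam0 = bragg_center(n_glass, d_glass, n_polymer, d_polymer)

# Drawing the same preform to a smaller outer diameter scales every layer,
# and hence the gap center, by the same draw-down ratio:
draw_ratio = 600.0 / 670.0               # e.g., OD 670 um -> 600 um (assumed)
lam_scaled = lam0 * draw_ratio
print(round(lam0, 3), round(lam_scaled, 3))
```

Because every layer thickness scales with the outer diameter during draw, outer-diameter monitoring alone suffices to place the gap, which is the basis of the laser-micrometer control described in the text.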
334 DIFFRACTIVE SYSTEMS / Omnidirectional Surfaces and Fibers
Using Hooke's law and Lamé's equations, we can derive a general expression for u_r and σ_rr, with two unknowns A and B per layer:

u_r = A/r + B r   [6]

σ_rr = −2GA/r² + 2(η + G)B + η ε_zz   [7]

where G = E/[2(1 + ν)] is the shear modulus and η = Eν/[(1 + ν)(1 − 2ν)] the Lamé modulus. The continuity of the displacement and the radial stress at each interface, plus the boundary conditions (σ_rr vanishes at free surfaces), can be expressed as a linear system whose unique solution allows us to relate the radial strain in each layer to the applied axial strain by a linear relation ε_rr = C ε_zz, where C can be interpreted as an effective Poisson ratio. Finally, taking into account that the Fabry–Perot resonant mode itself is linearly shifted within the bandgap because of the different effective Poisson ratios of the glass and polymer layers, we obtain a linear relation between the normalized shift of the Fabry–Perot resonant mode and the applied axial strain:

Δλ/λ = −0.373 ε_zz   [8]

All the layers (except the outer protective polymer layer) are under tensile radial stress, whose maximum (0.22 MPa under 1% axial strain) is located at the interface between the layers and the polymer core, where delamination is most likely to occur.

Mechanical Tuning Experiment and Discussion

Measurements were performed on fibers ~30 cm long, which were fixed at one end with epoxy to a load cell (Transducer Techniques MDB-2.5) while the other end was attached with strong tape to a pole mounted on a stepper rotational stage (Newport PR50) (Figure 13). This end of the fiber was also screwed to the pole to further secure it in place. The diameter of the pole was equal to 2.1 cm, leading to a normalized shift precision below 0.005%. The uncertainty on the Young's modulus, due to the precision of the load cell, was lower than 30 MPa. All the reflectivity spectra were normalized to a background taken with a flat gold mirror. The measurements were realized as far as possible from the fixed ends of the fiber, where edge effects are likely to occur. These edge effects result in a reduction of the length of the fiber that deforms uniformly, and consequently increase the real strain far from the edges by a factor of 1.14 and 1.15 for the 240 and the 460 micron diameter fiber, respectively (determined experimentally by measuring the position of two reference points on the fiber), compared to the strain calculated from the rotation of the pole. Moreover, to avoid measuring slight variations in the spectral position of the bandgap resulting from outer diameter variations, typically of the order of 4–5 microns over meters of fibers, the measurements were realized at a fixed reference position on the fiber.

Figure 13 Experimental setup used for mechanical tuning demonstration. Reproduced with permission from Benoit G, et al. (2003) Static and dynamic properties of optical cavities in photonic bandgap fibers. Advanced Materials 15: 2053–2056.

Figure 14 Reflectivity versus wavelength plot showing the shift of the Fabry–Perot resonant mode of the 240 micron diameter fiber for three increasing values of the applied axial strain. Reproduced with permission from Benoit G, et al. (2003) Static and dynamic properties of optical cavities in photonic bandgap fibers. Advanced Materials 15: 2053–2056.

By focusing on the fundamental bandgap of the 240 and the 460 micron diameter fiber, we demonstrated the tuning of the Fabry–Perot resonant mode under increasing axial strain (Figure 14). A drop in the reflectivity of 13% was observed at 1.71 μm (dashed line) for the 240 micron diameter fiber when increasing the applied axial strain from 0.23% (light gray) to 1.07% (dark gray). The normalized-shift versus strain curves (Figure 15) appeared to be linear for both fibers up to approximately 0.9% (dashed line), with a slope equal to −0.3859 and −0.3843 for the 240 and the 460 micron diameter fiber, respectively, close to the predicted value (−0.373, eqn [8]). The normalized shift was equal to −0.347% for 0.9% applied axial strain, which seems to be the limit of the elastic regime and corresponds to small applied
loads: 76 and 300 g for the 240 and 460 micron diameter fiber, respectively. Under higher strains, the normalized-shift versus strain curves were no longer linear and the measured loads would start decreasing slightly with time, which could be a sign of delamination between the layers and the polymer core, of plastic deformation, and/or of relaxation in the layers. The stress–strain curves also exhibit an elastic regime up to approximately 1% strain, with a corresponding Young's modulus equal to 2.59 GPa for the 240 μm OD fiber and to 2.39 GPa for the 460 μm OD fiber. These values are slightly lower than the predicted value, possibly due to relaxation effects in the polymer and the glass layers (for As2Se3 and PES, Tg = 175 and 220 °C, respectively).

Figure 15 Normalized shift of the Fabry–Perot resonant mode (defined as Δλ/λ) versus applied axial strain for the 240 micron (diamonds) and the 460 micron diameter fiber (dots). The curve obtained for the 240 micron diameter fiber is lower because the experimental 0% strain is likely to correspond to a nonzero positive strain, necessary to keep this thinner fiber straight for the measurement (equivalent to a load less than 10 g). Reproduced with permission from Benoit G, et al. (2003) Static and dynamic properties of optical cavities in photonic bandgap fibers. Advanced Materials 15: 2053–2056.

Wavelength-Scalable Hollow Optical Fibers with Large Photonic Bandgaps for CO2 Laser Transmission

Hollow optical transmission fibers offer the potential to circumvent fundamental limitations associated with conventional index guided fibers and thus have been the subject of active research in recent years. Here we report on the materials selection, design, fabrication, and characterization of extended lengths of hollow optical fiber lined with an interior omnidirectional dielectric mirror. These fibers consist of a hollow air core surrounded by multiple alternating submicron-thick layers of a high-refractive-index glass and a low-index polymer, resulting in large infrared photonic bandgaps. These gaps provide strong confinement of optical energy in the hollow fiber core and lead to light guidance in the fundamental and up to fourth-order gaps. We show that the fiber transmission windows can be scaled over a large wavelength range covering at least 0.75 to 10.6 microns. The utility of our approach is further demonstrated by the design and fabrication of tens of meters of hollow photonic bandgap fibers for 10.6 micron radiation transmission. We demonstrate transmission of carbon dioxide (CO2) laser light with high power density through more than 4 meters of hollow fiber and measure the losses to be less than 1.0 dB/m at 10.6 microns. This establishes suppression of fiber waveguide losses by orders of magnitude compared to the intrinsic fiber material losses.

Silica optical fibers have been extremely successful in telecommunications applications, and other types of solid-core fibers have been explored at wavelengths where silica is not transparent. However, all fibers that rely on light propagation principally through a solid material have certain fundamental limitations stemming from nonlinear effects, light absorption by electrons or phonons, material dispersion, and Rayleigh scattering, which limit maximum optical transmission power and increase attenuation losses. These limitations have, in turn, motivated the study of a fundamentally different light-guiding methodology: the use of hollow waveguides having highly reflecting walls. Light propagation through air in a hollow fiber eliminates or greatly reduces the problems of nonlinearities, thermal lensing, and end-reflections, facilitating high-power laser guidance and other applications which may be impossible using conventional fibers. Hollow metallic or metallo-dielectric waveguides have been studied fairly extensively and found useful practical application, but their performance has been bounded by the notable losses occurring in metallic reflections at visible and infrared (IR) wavelengths, as well as by the limited length and mechanical flexibility of the fabricated waveguides. Hollow all-dielectric fibers, relying on specular or attenuated total reflection, have also been explored, but high transmission losses have prevented their broad application. More recently, all-dielectric fibers,
consisting of a periodic array of air holes in silica, have been used to guide light through air using narrow photonic bandgaps. Solid-core, index-guiding versions of these silica photonic crystal fibers have also been explored for interesting and important applications, such as very large core single-mode fibers, nonlinear enhancement and broadband supercontinuum generation, polarization maintenance, and dispersion management. However, the air-guiding capabilities of such waveguides thus far remain inferior to transmission through solid silica, due to various factors such as the difficulties in fabricating long, uniform fibers which must have a high volume fraction of air and many air-hole periods, as well as the large electromagnetic (EM) penetration depths associated with the small photonic bandgaps achievable in these air–silica structures. In our fiber, the hollow core is surrounded by a solid high-refractive-index-contrast multilayer structure, leading to large photonic bandgaps and omnidirectional reflectivity. The pertinent theoretical background and recent analyses indicate that such fibers may be able to achieve ultralow losses and other unique transmission properties. The large photonic bandgaps result in very short EM penetration depths within the layer structure, significantly reducing radiation and absorption losses while increasing robustness. Omnidirectional reflectivity is expected to reduce intermode coupling losses.

To achieve high index contrast in the layered portion of the fiber, we combined a chalcogenide glass with a refractive index of ~2.8 (As2Se3) and a high-performance polymer with a refractive index of ~1.55 (PES). We recently demonstrated that these materials could be thermally co-drawn into precisely layered structures without cracking or delamination, even under large temperature excursions. The same polymer was used as a cladding material, resulting in fibers composed of ~98% polymer by volume (not including the hollow core), thus combining high optical performance with polymeric processability and mechanical flexibility. We fabricated a variety of fibers by depositing a 5–10 micron thick As2Se3 layer through thermal evaporation onto a 25–50 micron thick PES film and the subsequent 'rolling' of that coated film into a hollow multilayer tube called a fiber preform. This hollow macroscopic preform was consolidated by heating under vacuum and cladded with a thick outer layer of PES; the layered preform was then placed in an optical fiber draw tower and drawn down into tens or hundreds of meters of fiber having well-controlled submicron layer thicknesses. The nominal positions of the photonic bandgaps were determined by laser monitoring of the fiber OD during the draw process. Typical standard deviations in the fiber OD were ~1% of the OD. The resulting fibers were designed to have large hollow cores, useful in high-energy transmission.

Figure 16 Cross-sectional SEM micrographs at various magnifications of hollow cylindrical multilayer fiber mounted in epoxy. The hollow core appears black, the PES layers and cladding gray, and the As2Se3 layers bright white. This fiber has a fundamental photonic bandgap at a wavelength of ~3.55 microns. Reproduced with permission from Temelkuran B, Hart SD, Benoit G, Joannopoulos JD and Fink Y (2002) Wavelength-scalable hollow optical fibres with large photonic bandgaps for CO2 laser transmission. Nature 420: 650–653.

SEM analysis (Figure 16) reveals that the drawn fibers maintain proportionate layer thickness ratios and that the PES and As2Se3 films adhere well during rigorous thermal cycling and elongation. Within the multilayer structure shown in Figure 16, the PES layers (gray) have a thickness of 900 nm, and the As2Se3 layers (bright) are 270 nm thick (except for the first and last As2Se3 layers, which are 135 nm). Broadband fiber transmission spectra were measured with a Fourier transform infrared (FTIR) spectrometer (Nicolet Magna 860), using a parabolic mirror to couple light into the fiber and an external detector. The results of these measurements are shown in the lower panel of Figure 2 for fibers having two different layer structures. For each spectrum, light is guided at the fundamental and high-order photonic bandgaps. Also shown, in the upper panel of Figure 2, is the corresponding photonic band diagram for an infinite periodic multilayer structure calculated using the experimental parameters of our fiber (layer thicknesses and indices). Good agreement is found between the positions of the measured transmission peaks and the calculated bandgaps, corroborated with the SEM-measured layer thicknesses, verifying that transmission is dominated by the photonic bandgap mechanism. In order to demonstrate 'wavelength scalability' (i.e., the control of transmission through the fiber's structural parameters), another fiber was produced, having the same cross-section but with thinner layers. We compared the transmission
spectra for the original 3.55 micron bandgap fibers to the fiber with the scaled-down layer thicknesses. Figure 17 shows the shifting of the transmission bands, corresponding to fundamental and high-order photonic bandgaps, from one fiber to the next. The two fibers analyzed in Figure 17 were fabricated from the same fiber preform using different draw-down ratios (fibers with a fundamental bandgap centered near 3.55 microns have an OD of 670 microns; those with a gap at 3.1 microns have an OD of 600 microns). The high-order bandgaps are periodically spaced in frequency, as expected for such a photonic crystal structure.

The wavelength scalability of our fibers was further demonstrated in the fabrication of hollow fibers designed for the transmission of 10.6 micron EM radiation. This not only shows that these structures can be made to guide light at extremely disparate wavelengths, but that specific useful bandgap wavelengths can be accurately targeted during fabrication and fiber drawing. Powerful and efficient CO2 lasers are available that emit at 10.6 microns and are used in such applications as laser surgery and materials processing, but waveguides operating at this wavelength have remained limited in length or loss levels. Using the fabrication techniques outlined above, we produced fibers having hollow core diameters of 700–750 microns and ODs of 1300–1400 microns, with a fundamental photonic bandgap spanning the 10–11 micron wavelength regime, centered near 10.6 microns. Figure 5 depicts a typical FTIR transmission spectrum for these fibers, measured using ~30 cm long straight fibers.

In order to quantify the transmission losses in these 10.6 micron bandgap hollow fibers, fiber cutback measurements were performed. This involved the comparison of transmitted intensity through ~4 meters of straight fiber with the intensity of transmission through the same section of fiber cut to shorter lengths (Figure 18 inset). This test was performed on multiple sections of fiber, and the results were found to be nearly identical for the different sections tested. The measurements were performed using a 25 watt CO2 laser (GEM-25, Coherent-DEOS) and high power detectors (Newport 818T-10). The fiber was held straight, fixed at both ends as well as at multiple points in the middle, to prevent variations in the input coupling and propagation conditions during fiber cutting. The laser beam was sent through focusing lenses as well as 500 micron diameter pinhole apertures, and the input end face of the fiber was coated with a metal film to prevent accidental laser damage from misalignment. The transmission losses in the fundamental bandgap at 10.6 microns were measured to be 0.95 dB/m, as shown in the inset of Figure 18, with an estimated measurement uncertainty of 0.15 dB/m. These loss measurements are comparable to some of the best reported loss values for other types of waveguides operating at 10.6 microns. A bending analysis for fibers with a bandgap centered at 10.6 microns revealed bending losses below 1.5 dB for 90 degree bends with bending radii from 4–10 cm. We expect that these loss levels could be lowered even further by increasing the number of layers, through optimization of the layer thickness ratios, and by creating a cylindrically symmetric multilayer fiber with no inner seam (present here because of the 'rolling' fabrication method). In addition, using a polymer with lower intrinsic losses should greatly improve the transmission characteristics.

One reasonable figure of merit for optical transmission losses through hollow all-dielectric photonic bandgap fibers is to compare the hollow fiber losses to the intrinsic losses of the materials used to make the fiber. As2Se3 has been explored as an IR-transmitting material, yet the losses at 10.6 microns reported in the literature are ~10 dB/m for highly purified material, and more typically are greater than 10 dB/m for commercially available materials such as those used in our fabrication. Based on FTIR transmission and spectroscopic ellipsometer measurements that we have performed on PES, the optical losses associated with propagation through solid PES should be greater than 40,000 dB/m at 10.6 microns. This demonstrates that guiding light through air in our hollow bandgap fibers leads to waveguide losses that are orders of magnitude lower than the intrinsic fiber material losses, which has been one of the primary goals of hollow photonic bandgap fiber research. These comparatively low losses are made possible by the very short penetration depths of EM waves in the high refractive index contrast photonic crystal structure, allowing these materials to be used at wavelengths that may have been thought improbable. Another long-standing motivation of infrared fiber research has been the transmission of high-power laser light. As a qualitative demonstration of the potential of these fibers for such applications, both straight and smoothly bent fibers of lengths varying from 0.3–2.5 meters were used to transmit enough CO2 laser energy to burn holes through paper. The maximum laser power density coupled into our fibers in these trials was approximately 300 W/cm2, more than sufficient to burn a homogeneous polymer material, including PES. No damage to the fibers was observed when the laser beam was properly coupled into the hollow fiber core. These results indicate the feasibility of using hollow multilayer photonic bandgap fibers as a low-loss wavelength-scalable transmission medium for high-power laser light.

See also

Diffractive Systems: Design and Fabrication of Diffractive Optical Elements. Fiber and Guided Wave Optics: Fabrication of Optical Fiber. Lasers: Carbon Dioxide Lasers. Spectroscopy: Fourier Transform Spectroscopy.

Further Reading

Abeles F (1950) Investigations on the propagation of sinusoidal electromagnetic waves in stratified media. Application to thin films. Annales de Physique 5: 706.
Benoit G and Fink Y, Spectroscopic Ellipsometry Database, http://mit-pbg.mit.edu/Pages/Ellipsometry.html
Born M and Wolf E (1980) Principles of Optics, 6th edn. New York: Pergamon Press, p. 67.
Engeness TD, Ibanescu M, Johnson SG, et al. (2003) Dispersion tailoring and compensation by modal interactions in OmniGuide fibers. Optics Express 11: 1175–1198. http://www.opticsexpress.org/abstract.cfm?URI=OPEX-11-10-1175
Fink Y, Winn JN, Fan S, et al. (1998) A dielectric omnidirectional reflector. Science 282: 1679–1682.
Fink Y, Ripin DJ, Fan S, Chen C, Joannopoulos JD and Thomas EL (1999) Guiding optical light in air using an all-dielectric structure. Journal of Lightwave Technology 17: 2039–2041.
Hart SD, Maskaly GR, Temelkuran B, Prideaux PH, Joannopoulos JD and Fink Y (2002) External reflection from omnidirectional dielectric mirror fibers. Science 296: 510–513.
Ibanescu M, Johnson SG, Soljacic M, et al. (2003) Analysis of mode structure in hollow dielectric waveguide fibers. Physical Review E 67: 046608-1–8.
Johnson SG, Ibanescu M, Skorobogatiy M, et al. (2001) Low-loss asymptotically single-mode propagation in large-core OmniGuide fibers. Optics Express 9: 748–779. http://www.opticsexpress.org/abstract.cfm?URI=OPEX-9-13-748
Kuriki K, Shapira O, Hart SD, et al. (2004) Hollow multilayer photonic bandgap fibers for NIR applications. Optics Express 12: 8.
Soljacic M, Ibanescu M, Johnson SG, Joannopoulos JD and Fink Y (2003) Optical bistability in axially modulated OmniGuide fibers. Optics Letters 28: 516–518.
Temelkuran B, Hart SD, Benoit G, Joannopoulos JD and Fink Y (2002) Wavelength-scalable hollow optical fibres with large photonic bandgaps for CO2 laser transmission. Nature 420: 650–653.
Yeh P, Yariv A and Hong C-H (1977) Electromagnetic propagation in periodic stratified media. 1. General theory. Journal of the Optical Society of America 67(4): 423.
Yeh P, Yariv A and Marom E (1978) Theory of Bragg fiber. Journal of the Optical Society of America 68: 1196–1201.
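As a numerical supplement to the mechanical tuning analysis earlier in this article (eqns [6]–[8]): the layered-cylinder model reduces to a small linear system, with two unknowns A and B per layer, continuity of u_r and σ_rr at each interface, and σ_rr = 0 at the free outer surface. The sketch below assumes a solid innermost region (so A = 0 there, by regularity at the axis) and uses illustrative, normalized moduli and radii, not the fiber's actual layer data.

```python
import numpy as np

def lame_constants(E, nu):
    """Shear modulus G and Lame modulus eta of eqn [7] from E and nu."""
    G = E / (2.0 * (1.0 + nu))
    eta = E * nu / ((1.0 + nu) * (1.0 - 2.0 * nu))
    return G, eta

def radial_solution(radii, E, nu, eps_zz=1.0):
    """Solve for (A_i, B_i) in u_r = A/r + B*r (eqn [6]) for a solid
    multilayer cylinder under applied axial strain eps_zz; radii[i] is
    the outer radius of layer i (normalized units keep the linear
    system well conditioned)."""
    n = len(E)
    GE = [lame_constants(E[i], nu[i]) for i in range(n)]
    M = np.zeros((2 * n, 2 * n))
    rhs = np.zeros(2 * n)

    def sigma_rr_row(i, r):
        # sigma_rr = -2*G*A/r^2 + 2*(eta + G)*B + eta*eps_zz   (eqn [7])
        G, eta = GE[i]
        row = np.zeros(2 * n)
        row[2 * i] = -2.0 * G / r**2
        row[2 * i + 1] = 2.0 * (eta + G)
        return row, eta * eps_zz

    M[0, 0] = 1.0                                # regularity at r = 0: A_0 = 0
    eq = 1
    for i in range(n - 1):                       # interface of layers i, i+1
        r = radii[i]
        M[eq, 2 * i:2 * i + 2] = (1.0 / r, r)    # u_r continuous (eqn [6])
        M[eq, 2 * i + 2:2 * i + 4] = (-1.0 / r, -r)
        eq += 1
        row_a, c_a = sigma_rr_row(i, r)          # sigma_rr continuous
        row_b, c_b = sigma_rr_row(i + 1, r)
        M[eq] = row_a - row_b
        rhs[eq] = c_b - c_a
        eq += 1
    row, c = sigma_rr_row(n - 1, radii[-1])      # free outer surface
    M[eq] = row
    rhs[eq] = -c
    # eps_rr = -A/r^2 + B; eps_rr/eps_zz is the effective Poisson ratio C.
    return np.linalg.solve(M, rhs).reshape(n, 2)

# Sanity check: a homogeneous cylinder must recover eps_rr = -nu * eps_zz.
AB = radial_solution([0.5, 1.0, 1.5], [2.5, 2.5, 2.5], [0.37, 0.37, 0.37])
```

For a homogeneous cylinder the solver returns A ≈ 0 and B ≈ −ν in every layer, the textbook uniaxial result; with dissimilar glass and polymer layers the same system yields the layer-dependent radial strains underlying the slope of eqn [8].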
340 DIFFRACTIVE SYSTEMS / Wave Optical Modeling and Design
Figure 2 Formally, a system may be subdivided into a sequence of homogeneous dielectrics, like air, and inhomogeneous regions which include the elements. We indicate a region by the letter j.
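The subdivision sketched in Figure 2 suggests a direct computational representation: the system operator is the ordered composition of region operators, each mapping the sampled field at one plane to the next. A minimal sketch; the operator names and the trivial placeholder propagation are illustrative assumptions, not constructs from the text:

```python
import numpy as np

def element(t):
    """Region containing an element, idealized as pointwise
    multiplication of the field by a complex transmission t(x, y)."""
    return lambda u: t * u

def free_space(dz):
    """Placeholder for propagation through a homogeneous dielectric;
    a real implementation would apply a Fourier-domain operator."""
    return lambda u: u

def system_operator(regions):
    """Compose the region operators j = 1..J in order."""
    def apply(u):
        for op in regions:
            u = op(u)
        return u
    return apply

# A simple chain: propagation, phase element, propagation.
shape = (64, 64)
phase = np.zeros(shape)                  # flat phase, for the check below
chain = system_operator([free_space(10e-3),
                         element(np.exp(1j * phase)),
                         free_space(20e-3)])
u_out = chain(np.ones(shape, dtype=complex))
```

With a flat phase and do-nothing propagations the chain is the identity; the point is only the structure: regions are interchangeable operators, which is the framing the following sections build on.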
In this sense, free propagation is a scalar problem, which allows the convenient use of U(x, y, z) instead of two field components for the sake of simpler mathematical expressions.

Equation [6] allows the derivation of other rigorous versions of the free propagation operator P_SPW, for instance Rayleigh's integral formula, which is well known from diffraction theory. Moreover, eqn [6] leads to important approximate operators for far-field propagation and the propagation of paraxial fields. Propagation of a field into its far field is given by the operator:

U(r s) = P_far U(x, y, z_0) = −i 2π k s_z [exp(ikr)/r] [F U(x, y, z_0)]|_(k s_x, k s_y)   [7]

with the wave number k = 2πn/λ, r = |r| = [x² + y² + (Δz)²]^(1/2), and the direction vector s = (s_x, s_y, s_z) = r/r. Here, and in what follows, λ denotes the vacuum wavelength. |_(k s_x, k s_y) indicates the variables to be inserted into the Fourier-transformed field. Though the far-field formula is derived for r → ∞, it is also of great practical value for the propagation of fields generated by nonparaxially emitting sources like laser diodes. Roughly speaking, eqn [7] predicts a spherical wave in the far field which is modulated by the spectrum of planewaves of the field in z_0. P_far propagates from the plane z_0 to the spherical surface with radius r, and not to a plane z.

The paraxial approximation of eqn [6] follows if the spectrum of planewaves F U(x, y, z_0) possesses significant values only for small k_x and k_y, that is, if both electric field components propagate approximately along the z-axis. The resulting propagation operator is known as Fresnel's integral formula. In practice, its formulation with the Fourier transform F is typically more suitable:

U(x, y, z) = [exp(ikΔz)/(iλΔz/n)] exp[iπn(x² + y²)/(λΔz)] [F {U(x', y', z_0) exp[iπn(x'² + y'²)/(λΔz)]}]|_(x, y)n/λΔz   [8]

|_(x, y)n/λΔz indicates the variables to be inserted after the Fourier transform. The far-field approximation of eqn [8] leads to the well-known Fraunhofer diffraction formula.

Equations [6]–[8] provide fundamental operators for the propagation of fields in wave-optical engineering. All integrals are based on Fourier transforms, which allows the application of the fast Fourier transform (FFT) in practice. Dependent on the propagation technique, one or two FFTs must be performed. In practice, the sampling of fields is of great concern. Naturally, the amount of data should be as small as possible. According to the rigorous propagation operator eqn [6], the (x, y)-coordinate system, and therefore the sampling period, in z_0 and z are identical; that is, scaling does not occur. As illustrated in Figure 3, that can lead to an enormous amount of data for large propagation distances. For demonstration purposes, the field amplitude depicted at the top is chosen. The field window has the size 1 mm × 4 mm. The wavelength λ is 632.8 nm, n = 1, and the sampling distance is 10 μm. That results in 100 × 400 sampling points. Propagation of this field with the spectrum of planewaves operator maintains the size of the field window. The resulting field for a propagation distance Δz = 10 mm is shown in the middle. Because of diffraction, the field widens. As long as the initial window is large enough to encompass the propagated field, the propagation method works well. At the bottom, the resulting field for Δz = 200 mm is shown. The field exceeds the size of the window, and that leads to numerical errors (aliasing). This can be avoided by embedding the initial field into a window which is large enough with respect to the expected size of the propagated field. As a result, one obtains the

Figure 3 Illustration of numerical properties of the spectrum of planewaves propagation technique. The field amplitude depicted at the top is numerically propagated in free space, and the distributions in the middle and at the bottom are obtained. The simulations in this and all following figures were performed with the wave-optical engineering software VirtualLab from LightTrans GmbH.
periodically. An important example is given for diffractive lenses in the outer zones of high numerical aperture lenses. In the center, and for low numerical apertures, TEA is appropriate for propagating through diffractive lenses.

Design Principles

The design problem in wave-optical engineering can be stated as follows: specify system parameters p_a in a way that ensures, for a given input field f_in, an output field f_out which satisfies the quality criteria defined by the merit functions V_m. In addition, one may demand a system which is most simple, particularly cheap, or various other system properties of concern in practice.

Design constitutes an optimization problem. Restrictions to the system parameters and the merit function set design constraints as well as design freedoms.

System parameters can be restricted, for instance, by the materials available, the maximum size of a system, and the types of element profiles. Within these limits the system parameters are chosen freely during optimization to achieve the optical function.

The merit function determines which parameters of the output field are important and should be emphasized, for instance the intensity distribution. All output fields whose parameters satisfy the constraints satisfy the design problem. In most applications, not all field parameters are constrained. For instance, in laser materials processing or illumination applications, the merit function is related to energy quantities, and then the phase of the output field can be used as a free parameter distribution. Moreover, the output field is typically not constrained in the entire output plane, but only in so-called signal regions, which introduces freedom of amplitude outside the signal regions. Hence, dependent on the application, there exist field parameters that can be chosen partly or completely freely during design. Successful design strategies use these freedoms to improve the merit function. It is often not clear a priori if a solution does exist. Then the field constraints are satisfied as well as possible in terms of the merit function.

In imaging systems, typical system parameters include, for instance, distances, refractive indices, curvature of lenses, and parameters to specify aspherical surfaces. For high-end lens systems, the resulting number of parameters can be large, but they are still small enough to utilize parameter optimization algorithms if the designer starts with a reasonable initial guess. In wave-optical engineering, the number of system parameters can become huge due to the generalized nature of the structures, which typically prevents the use of system parameter optimization techniques. For example, the profile of diffractive elements is typically specified by 10^4–10^6 parameters. Thus a direct optimization of the system parameters, which requires the evaluation of S(p_a) (see eqn [3]) after any change of system parameters p_a, is not realistic.

Instead of optimizing the system parameters directly, which is referred to as design in the structural embodiment of the system, a technique well known from Fourier optics and diffractive optics can be generalized for wave-optical engineering. It does not start with structure-related element parameters but replaces all, or at least some, of the operators S_j(p_aj) of eqn [3] by mathematical operators T_j(t_j), where t denotes the parameters of T. The propagation operator through a system expressed in this functional embodiment has the form:

f_out = S_functional f_in = P_J(n_J, Δz_J) T_{J−1}(t_{J−1}) ··· T_2(t_2) P_1(n_1, Δz_1) f_in   [10]

if all propagations through elements are expressed in functional form. A very important example of a functional operator T is given by a simple multiplication of both field components with identical or different complex functions t(x, y), that is:

T(t) f_in = [ t_xx(x, y)   0 ; 0   t_yy(x, y) ] f_in(x, y)   [11]

The values of both t(x, y) form the parameters t of T. The functions t(x, y) are also called ideal transmission functions of elements. Of special interest are phase-only transmission functions t(x, y) = exp[iφ(x, y)] (see discussion below). If t(x, y) = t_xx(x, y) = t_yy(x, y), the design is simplified. Fortunately, in most actual applications this assumption is valid.

Designing in the functional embodiment requires the evaluation of S_functional instead of S of eqn [3], which reduces the numerical complexity enormously. That improves the likelihood of developing efficient design algorithms to optimize the distances Δz_j, the refractive indices n_j, and the parameters t_j.

The design in the functional embodiment does not deliver the structure parameters of elements, but only the parameters t of the operator T, which give information about how a field should be changed at the position z where T takes effect. Thus, in a next design step, elements must be obtained which cause the required effect. This so-called structure design
346 DIFFRACTIVE SYSTEMS / Wave Optical Modeling and Design
step is also generally challenging because it requires the solution of an inverse propagation problem through interfaces. For a further discussion, the most important T, that is the transmission-type operator of eqn [11], is considered. For the sake of simplicity, t(x, y) = t_xx(x, y) = t_yy(x, y) is assumed. Then a solution which uses geometrical optics LPIA in combination with phase-only transmissions is known in the form of a recursive structure design strategy. In its paraxial approximation, that is TEA, the solution becomes very simple. For a transmission t(x, y) = exp[iφ(x, y)] obtained from a functional design, an element with the surface profile

h(x, y) = λφ(x, y) / [2π (n_element − n_free)]   [12]

to form the transmission is obtained. It is not necessary to decide about the use of a diffractive or a refractive element in an early stage of the design. Moreover, it provides the designer with insight into selecting optical elements to achieve the function. If the phase-only transmission can be unwrapped, a refractive as well as a diffractive element can be used. An optical engineer has the freedom to decide about the suitable choice, taking fabrication, size, costs, and other additional constraints into account. However, as soon as the phase-only transmission includes so-called vortices, unwrapping is not possible and diffractive optics is mandatory. An example of a vortex is shown in Figure 9. It turns out that vortices are more likely to appear in wave-optical engineering the less image-like the optical function is.

[…] domain and suffers from aberrations for moderate and large numerical apertures, which are more severe than for the diffractive lens. The use of inverse geometrical optics LPIA as a structure design technique is not restricted to the paraxial domain, and therefore it leads to a suitable aspherical surface in order to ensure a spherical phase transmission for larger numerical apertures too. This is illustrated in Figure 8. The lens example shows another advantage of the two-step design method, which consists of the design in the functional embodiment prior to the one in the structural form of the element. In the functional […] (see also last section). It is important to note that functional design assumes a diagonal operator matrix, that is, it excludes crosstalk of the two field channels. Transferring the system into a structural form may violate this assumption because of vectorial effects in the nonparaxial domain. For instance, geometrical optics LPIA in general does not correspond to a diagonal operator. Therefore, it is necessary to analyze the final system for f_in – even in the case of linearly polarized light – using more accurate propagation techniques than S_TEA if it is not ensured that the paraxial approximation is accurate enough.

The so-called thin lens, or ideal lens, in conventional optical engineering constitutes a special case of the two-step design strategy. In the first design step, distances and lens focal lengths are determined, without taking into account the actual shape of a lens. In the second step, lenses, or even lens systems, are inserted according to the position and the focal length specified in the first step. Thus, wave-optical engineering extends our understanding by broadening established methods in optical engineering.
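The surface-profile relation of eqn [12] is a one-line computation once the phase is known. A minimal sketch in Python, with illustrative wavelength, refractive index, and lens parameters (none of them taken from the text), contrasts the unwrapped (refractive) and modulo-2π (diffractive) realizations of the same phase-only transmission:

```python
import math

def surface_profile(phase, wavelength, n_element, n_free=1.0):
    """Eqn [12]: surface height that realizes a given (unwrapped) phase."""
    return wavelength * phase / (2 * math.pi * (n_element - n_free))

# Illustrative values (not from the text): HeNe wavelength, n = 1.5 glass.
wavelength = 633e-9   # m
n_element = 1.5

# A full 2*pi phase step maps to a depth of lambda/(n_element - n_free),
# i.e. the blaze depth of the equivalent diffractive element.
blaze_depth = surface_profile(2 * math.pi, wavelength, n_element)
print(blaze_depth)   # 1.266e-06 m for these values

# Refractive vs. diffractive realization of a paraxial lens phase at x = 1 mm
# for an assumed focal length of 100 mm:
phi = -math.pi * (1e-3) ** 2 / (wavelength * 0.1)
h_refractive = surface_profile(phi, wavelength, n_element)
h_diffractive = surface_profile(phi % (2 * math.pi), wavelength, n_element)
print(h_refractive, h_diffractive)
```

The diffractive height always stays within one blaze depth, which is why a wrapped (non-unwrappable) phase still has a fabricable diffractive realization.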
Fundamental Systems
A smart approach to design minimizes the number of elements required to achieve a desired light transformation. Thus, it is logical to try a functional design with a system of the form shown in Figure 10.
Figure 10 (a) A basic system for light transformation can be described functionally by two propagations through homogeneous dielectrics with refractive index n_free and one phase-only transmission function. (b) If the second propagation is replaced by an operation in functional form, here the Fourier transform, the setup shown on the right side results.
the signal region cannot be avoided. As a result, it is not possible to generate the desired response with 100% efficiency. For any given signal field U_sig, an upper bound η_bound(U_sig) of the achievable efficiency η(U_sig) of light transformation with a phase-only transmission has been derived, that is, η(U_sig) ≤ η_bound(U_sig). The equal sign is only valid for the special case of a linear phase transmission, for which the signal field consists of one k-direction only. Then η_bound(U_sig) = η(U_sig) = 1 is obtained. After structure design for a linear phase-only transmission with TEA, a so-called blazed grating results. If TEA is a sufficient approximation, one concludes that a blazed grating has a theoretical efficiency of 100%. For all other signals the theoretical efficiency is smaller than 100%. Moreover, the upper bound is a function of U_sig and thus also of its phase. Therefore, in a first step of a transmission design it is recommendable to perform a phase synthesis which maximizes the upper bound. This first design step delivers information about the achievable efficiency and a suitable initial signal field for the IFTA.

If fabrication requires phase quantization, amplitude freedom can be used to separate quantization noise from the signal regions. Unfortunately, this reduces the achievable efficiency even more. If the phase to be quantized is homogeneously distributed, the upper bound for a phase-only transmission quantized to Q equidistant phase levels is given by:

η_bound(U_sig; Q) = sinc²[1/Q] η_bound(U_sig)   [15]

For the special case of linear phase, the well-known formula η(Q) = sinc²[1/Q] follows. The values of sinc²[1/Q], and therefore of quantized blazed gratings in the paraxial approximation, are 0.405, 0.811, 0.950, 0.987, and 0.997 for Q = 2, 4, 8, 16, and 32, respectively. This result does not mean that, for example, a diffractive beamsplitter with a splitting ratio 1 : M_beams and 16 height levels will provide ~99% efficiency. According to eqn [15], the theoretical values are smaller and depend on M_beams, because η_bound(U_sig) depends via U_sig on M_beams. If fabrication processes and the inherent imperfections are taken into consideration, the values decrease even more. As a rule of thumb, it is possible to claim that efficiencies larger than 90% are always challenging to obtain. With sophisticated design and fabrication techniques, efficiencies larger than 80% are reasonable.

Up to now, we have discussed only the importance of the signal field phase to efficiency. But the phase of the initial signal field has another extremely important effect on the result of the transmission design. From mapping-type considerations, a smooth initial signal field phase can be constructed in order to realize a light transformation. An example is illustrated in Figure 12. It shows the central part of the phase of the transmission function which transforms a Gaussian input beam into a flat-top line beam. Because of the smooth signal phase, the transmission phase is also smooth and therefore unwrappable. This approach is typical for a so-called beam shaping design. It requires a well-defined input beam and high accuracy in the adjustment of the beam shaping optics to obtain the output field with good quality. If, instead of a smooth initial phase, a random one is applied, the situation illustrated in Figure 13 (left) is achieved. Here the phase of the transmission looks random, which is typical for beamsplitters and diffusers. Because such phase functions possess numerous vortices, they do not allow an unwrapping. Thus, diffractive elements are the logical result of structure design. An example is shown in Figure 13 (right). Figure 14 shows how light from a laser diode passes through the diffuser element of Figure 13 (right). The radiation is transformed into an intensity pattern that depicts the ‘skyline’ of Jena. Although globally the intensity possesses the requested shape, a closer look shows that the intensity distribution is speckled. However, in numerous applications this is not a problem so long as the size of the speckles is well adapted to the resolution of the detector.
Figure 12 The input beam (left) propagates through the transmission function, the central part of whose phase is shown in the middle; the amplitude that results in the signal plane is shown on the right. This Gaussian-to-line transformation is a typical beam shaping design task.
Figure 13 A typical phase distribution of a transmission (left) which results from design with IFTA for a random initial signal field phase. Numerous vortices prevent its unwrapping; therefore, structure design leads to diffractive elements. A quantization of the phase is then demanded in most cases. An example is shown on the right-hand side. It depicts a portion of a four-level diffuser which results in the output field shown in Figure 14. The pixel size is 400 nm, and the corresponding element was designed and fabricated in a cooperation between LightTrans GmbH and the Institute of Applied Physics at the University of Jena.
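The efficiency cost of such phase quantization is easy to check numerically. The sketch below evaluates the sinc²(1/Q) factor of eqn [15] for the listed numbers of phase levels:

```python
import math

def quantized_efficiency(Q):
    """sinc^2(1/Q) factor of eqn [15] for Q equidistant phase levels
    (with the signal-dependent bound eta_bound set to 1, i.e. linear phase)."""
    x = math.pi / Q
    return (math.sin(x) / x) ** 2

for Q in (2, 4, 8, 16, 32):
    print(Q, round(quantized_efficiency(Q), 3))
# Efficiency approaches 1 as Q grows: 0.405, 0.811, 0.95, 0.987, 0.997
```

Beyond eight levels the gain per added level is small, which is one reason few-level (e.g. four-level) diffusers such as the one in Figure 13 are attractive to fabricate.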
understanding of the optical properties of the system. After this final investigation, it is ready for fabrication and application, providing a further innovation through optics and photonics on the basis of wave-optical engineering.

See also

Fourier Optics.
DISPERSION MANAGEMENT
A E Willner, Y-W Song, J McGeehan and Z Pan, University of Southern California, Los Angeles, CA, USA

B Hoanca, University of Alaska Anchorage, Anchorage, AK, USA

© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

Optical communications systems have grown explosively in terms of the capacity that can be transmitted over a single optical fiber. This trend has been fueled by two complementary techniques: the increase in data-rate-per-channel coupled with the increase in the total number of parallel wavelength channels. However, there are many considerations as to the total number of wavelength-division-multiplexed (WDM) channels that can be accommodated in a system, including cost, information spectral efficiency, nonlinear effects, and component wavelength selectivity.

Dispersion is one of the critical roadblocks to increasing the transmission capacity of optical fiber. The dispersive effect in an optical fiber has several ingredients, including intermodal dispersion in a multimode fiber, waveguide dispersion, material dispersion, and chromatic dispersion. In particular, chromatic dispersion is one of the critical effects in a single mode fiber (SMF), resulting in a temporal spreading of an optical bit as it propagates along the fiber. At data rates ≤ 2.5 Gbit/s, the effects of chromatic dispersion are not particularly troublesome. For data rates ≥ 10 Gbit/s, however, transmission can be tricky and the chromatic dispersion-induced degrading effects must be dealt with in some way, perhaps by compensation. Furthermore, the effects of chromatic dispersion rise quite rapidly as the bit rate increases – when the bit rate increases by a factor of four, the effects of chromatic dispersion increase by a factor of 16! This article will only deal with the management of chromatic dispersion in single mode fiber.

One of the critical limitations of optical fiber communications comes from chromatic dispersion, which results in a pulse broadening as it propagates along the fiber. This occurs as photons of different frequencies (created by the spreading effect of data modulation) travel at different speeds, due to the frequency-dependent refractive index of the fiber core. Compounding the problems caused by chromatic dispersion is the fact that as bit rates rise, chromatic dispersion effects rise quadratically with respect to the increase in the bit rate. One can eliminate these effects using fiber with zero chromatic dispersion, known as dispersion shifted fiber (DSF). However, with zero dispersion, all channels in a wavelength-division-multiplexed (WDM) system travel at the same speed, in-phase, and a number of deleterious nonlinear effects such as cross-phase modulation (XPM) and four-wave mixing (FWM) result. Thus, in WDM systems, some amount of chromatic dispersion is necessary to keep channels out-of-phase, and as such chromatic dispersion compensation is required. Any real fiber link may also suffer from ‘dispersion slope’ effects, in which a slightly different dispersion value is produced in each WDM channel. This means that while one may be able to compensate one channel exactly, other channels may progressively accumulate increasing amounts of dispersion, which can severely limit the ultimate length of the optical link and the wavelength range that can be used in a WDM system. This article will address the concepts of chromatic dispersion and dispersion slope management, followed by some examples highlighting the need for tunability to enable robust optical WDM systems in dynamic environments. Some dispersion monitoring techniques are then discussed and examples given.

Chromatic Dispersion in Optical Fiber Communication Systems

In any medium (other than vacuum) and in any waveguide structure (other than ideal infinite free space), different electromagnetic frequencies travel at different speeds. This is the essence of chromatic dispersion. As the real fiber-optic world is rather distant from the ideal concepts of both vacuum and infinite free space, dispersion will always be a concern when one is dealing with the propagation of electromagnetic radiation through fiber.

The velocity in fiber of a single monochromatic wavelength is constant. However, data modulation causes a broadening of the spectrum of even the most monochromatic laser pulse. Thus, all modulated data have a nonzero spectral width which spans several wavelengths, and the different spectral components of modulated data travel at different speeds. In particular, for digital data intensity modulated on an optical carrier, chromatic
dispersion leads to pulse broadening – which in turn leads to chromatic dispersion limiting the maximum data rate that can be transmitted through optical fiber (see Figure 1).

Considering that the chromatic dispersion in optical fibers is due to the frequency-dependent nature of the propagation characteristics, for both the material (the refractive index of glass) and the waveguide structure, the speed of light of a particular wavelength λ can be expressed as follows, using a Taylor series expansion of the value of the refractive index as a function of the wavelength:

v(λ) = c₀ / n(λ) = c₀ / [n₀(λ₀) + (∂n/∂λ) δλ + (∂²n/∂λ²) (δλ)²]   [1]

Here, c₀ is the speed of light in vacuum, λ₀ is a reference wavelength, and the terms in ∂n/∂λ and ∂²n/∂λ² are associated with the chromatic dispersion and the dispersion slope (i.e., the variation of the chromatic dispersion with wavelength), respectively. Transmission fiber has positive dispersion, i.e., longer wavelengths result in longer propagation delays. The units of chromatic dispersion are picoseconds per nanometer per kilometer, meaning that shorter time pulses, wider frequency spread due to data modulation, and longer fiber lengths will each contribute linearly to temporal dispersion. Higher data rates inherently have both shorter pulses and wider frequency spreads. Therefore, as network speed increases, the impact of chromatic dispersion rises precipitously as the square of the increase in data rate. The quadratic increase with the data rate is a result of two effects, each with a linear contribution. On one hand, a doubling of the data rate makes the spectrum twice as wide, doubling the effect of dispersion. On the other hand, the same doubling of the data rate makes the data pulses only half as long (hence twice as sensitive to dispersion). The combination of a wider signal spectrum and a shorter pulse width is what leads to the overall quadratic impact. Moreover, the data modulation format used can significantly affect the sensitivity of a system to chromatic dispersion. For example, the common nonreturn-to-zero (NRZ) data format, in which the optical power stays high throughout the entire time slot of a ‘1’ bit, is more robust to chromatic dispersion than is the return-to-zero (RZ) format, in which the optical power stays high in only part of the time slot of a ‘1’ bit. This difference is due to the fact that RZ data have a much wider channel frequency spectrum compared to NRZ data, thus incurring more chromatic dispersion. However, in a real WDM system, the RZ format increases the maximum allowable transmission distance by virtue of its reduced duty cycle (compared to the NRZ format), making it less susceptible to fiber nonlinearities, as can be seen in Figure 2.

A rule for the maximum distance over which data can be transmitted is to consider a broadening of the pulse equal to the bit period. For a bit rate B, a dispersion value D, and a spectral width Δλ (itself proportional to B), the dispersion-limited distance is given by

L_D = 1 / (D · B · Δλ) ∝ 1/B²   [2]

(see Figure 3). For example, for single mode fiber, D = 17 ps/nm/km, so for 10 Gbit/s data the distance is L_D = 52 km. In fact, a more exact calculation
Figure 1 The origin of chromatic dispersion in data transmission. (a) Chromatic dispersion is caused by the frequency-dependent
refractive index in fiber. (b) The nonzero spectral width due to data modulation. (c) Dispersion leads to pulse broadening, proportional to
the transmission distance and the data rate.
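Eqn [2] can be evaluated directly. The sketch below assumes a modulation-limited spectral width Δλ ≈ λ²B/c (an assumption, not from the text; with a different spectral-width estimate one obtains the quoted 52 km rather than this sketch's ~73 km) and demonstrates the quadratic bit-rate scaling:

```python
# Dispersion-limited distance, eqn [2]: L_D = 1 / (D * B * d_lambda).
# Assumption (not from the text): spectral width d_lambda ~ lambda^2 * B / c,
# so L_D scales as 1/B^2.
C = 3e8          # speed of light, m/s
LAM = 1550e-9    # carrier wavelength, m

def dispersion_limited_km(bit_rate_hz, D_ps_nm_km=17.0):
    D = D_ps_nm_km * 1e-12 / (1e-9 * 1e3)    # convert to s/m^2
    d_lambda = LAM ** 2 * bit_rate_hz / C     # modulation spectral width, m
    return 1.0 / (D * bit_rate_hz * d_lambda) / 1e3

L10 = dispersion_limited_km(10e9)
L20 = dispersion_limited_km(20e9)
print(round(L10, 1), round(L10 / L20, 1))   # ~73 km at 10 Gbit/s; ratio 4.0
```

Doubling the bit rate quarters the reach, which is the factor-of-16 effect quoted above for a fourfold rate increase.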
in rad/km/W. A typical range of values for γ is between 10 and 30 rad/km/W. Although the nonlinear coefficient is small, the long transmission lengths and high optical powers that have been made possible by the use of optical amplifiers can cause a large enough nonlinear phase change to play a significant role in state-of-the-art lightwave systems.

Cross-phase modulation (XPM)

When considering many WDM channels co-propagating in a fiber, photons from channels 2 through N can distort the index profile that is experienced by channel 1. The photons from the other channels ‘chirp’ the signal frequencies on channel 1, which will interact with fiber chromatic dispersion and cause temporal distortion. This effect is called cross-phase modulation. In a two-channel system, the frequency chirp in channel 1, due to power fluctuation within both channels, is given by

ΔB = dΦ_NL/dt = γL_eff dP₁/dt + 2γL_eff dP₂/dt   [6]

where dP₁/dt and dP₂/dt are the time derivatives of the pulse powers of channels 1 and 2, respectively. The first term on the right-hand side of the above equation is due to SPM, and the second term is due to XPM. Note that the XPM-induced chirp term is double that of the SPM-induced chirp term. As such, XPM can impose a much greater limitation on WDM systems than can SPM, especially in systems with many WDM channels.

Four-wave-mixing (FWM)

The optical intensity propagating through the fiber is related to the electric field intensity squared. In a WDM system, the total electric field is the sum of the electric fields of each individual channel. When squaring the sum of different fields, products emerge that are beat terms at various sum and difference frequencies to the original signals. Figure 5 depicts that if a WDM channel exists at one of the four-wave-mixing beat-term frequencies, then the beat term will interfere coherently with this other WDM channel and potentially destroy the data.

Dispersion Maps

While zero-dispersion fiber is not a good idea, a large value of the accumulated dispersion at the end of a fiber link is also undesirable. An ideal solution is to have a ‘dispersion map,’ alternating sections of positive and negative dispersion, as can be seen in Figure 6. This is a very powerful concept: at each point along the fiber the dispersion has some nonzero value, eliminating FWM and XPM, but the total dispersion at the end of the fiber link is zero, so that no pulse broadening is induced (Table 1). The most advanced systems require periodic dispersion compensation, as well as pre- and post-compensation (before and after the transmission fiber).

The addition of negative dispersion to a standard fiber link has been traditionally known as ‘dispersion compensation’; however, the term ‘dispersion management’ is more appropriate. SMF has positive dispersion, but some new varieties of nonzero dispersion-shifted fiber (NZDSF) come in both positive and negative dispersion varieties. Some examples are shown in Figure 7. Reverse dispersion fiber is also now available, with a large dispersion
Figure 5 (a) and (b) FWM induces new spectral components via nonlinear mixing of two wavelength signals. (c) The signal degradation due to FWM products falling on a third data channel can be reduced by even small amounts of dispersion. (Reproduced with permission from Tkach RW, Chraplyvy AR, Forghieri F, Gnauck AH and Derosier RM (1995) Four-photon mixing and high-speed WDM systems. Journal of Lightwave Technology 13(5): 841–849.)
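The four-wave-mixing beat terms described above can be enumerated directly. The sketch below, using an illustrative equally spaced three-channel grid (frequencies are assumed, not from the text), shows that several f_i + f_j − f_k products fall exactly on live channels, which is why some local dispersion is needed to break the phase matching:

```python
# Enumerate four-wave-mixing products f_i + f_j - f_k for a three-channel,
# 100 GHz-spaced grid (illustrative frequencies, in GHz).
channels = [193100, 193200, 193300]

products = {fi + fj - fk
            for fi in channels for fj in channels for fk in channels
            if fk != fi and fk != fj}

on_channel = sorted(products & set(channels))
new_tones = sorted(products - set(channels))
print(on_channel)   # [193100, 193200, 193300]: beat terms land on live channels
print(new_tones)    # [192900, 193000, 193400, 193500]: new mixing sidebands
```

On an equally spaced grid every channel coincides with some beat term, so the coherent interference of Figure 5 is unavoidable without dispersion.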
comparable to that of SMF, but with the opposite sign. When such flexibility is available in choosing both the magnitude and sign of the dispersion of the fiber in a link, dispersion-managed systems can be fully optimized to the desired dispersion map using a combination of fiber and dispersion compensation devices (see Figure 8).

Dispersion is a linear process, so first-order dispersion maps can be understood as linear systems. However, the effects of nonlinearities cannot be ignored, especially in WDM systems with many tens of channels, where the launch power may be very high. In particular, in systems deploying dispersion compensating fiber (DCF), the large nonlinear coefficient of the DCF can dramatically affect the dispersion map.

Corrections to Linear Dispersion Maps

Chromatic dispersion is a necessity in WDM systems, to minimize the effects of fiber nonlinearities. A chromatic dispersion value as small as a few ps/nm/km is usually sufficient to make XPM and FWM negligible. To mitigate the effects of nonlinearities but maintain small amounts of chromatic dispersion, NZDSF is commercially available. Due to these nonlinear effects, chromatic dispersion must be managed, rather than eliminated. If a dispersion-management system were perfectly linear, it would be irrelevant whether the dispersion along a path is small or large, as long as the overall dispersion is compensated to zero (end to end). Thus, in a linear system the performance should be similar regardless of whether the transmission fiber is SMF, with dispersion compensation modules deployed every 60 km, or NZDSF (with approximately a quarter of the dispersion value of SMF), with dispersion compensation modules deployed every 240 km. In real life, optical nonlinearities are very important, and recent results seem to favor the use of large, SMF-like dispersion values in the transmission path and correspondingly high dispersion compensation devices. A recent study of performance versus channel spacing showed that the capacity of SMF could be more than four times that of NZDSF. This is because the nonlinear coefficients are much higher in NZDSF than in SMF, and for dense WDM the channel interactions become a limiting factor. A critical conclusion is that not all dispersion compensation maps are created equal: a simple calculation of the dispersion compensation, to cancel the overall dispersion value, does not lead to optimal dispersion map designs.
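The dispersion-map concept reduces to a running sum of D·L over alternating spans. The sketch below, with illustrative span values (not from the text), keeps the local dispersion nonzero everywhere while the accumulated dispersion returns to zero at the link end:

```python
# A simple dispersion map: alternating spans of transmission fiber and DCF.
# Span values are illustrative (80 km SMF at +17 ps/nm/km, 16 km DCF at
# -85 ps/nm/km), chosen so each pair cancels exactly.
spans = [(+17.0, 80.0), (-85.0, 16.0)] * 4   # (D in ps/nm/km, length in km)

accumulated = 0.0
profile = [accumulated]                      # accumulated dispersion, ps/nm
for D, length in spans:
    accumulated += D * length
    profile.append(accumulated)

print(profile[-1])                   # 0.0: compensated end to end
print(max(abs(p) for p in profile))  # 1360.0: peak accumulated excursion
```

Pre- and post-compensation would simply prepend or append further negative-dispersion entries to `spans`; the design question discussed above is how large the excursions should be allowed to get, not merely whether the final sum is zero.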
Additionally, several solutions have been shown either to be resistant to dispersion or to rely on dispersion itself for transmission. Such solutions include chirped pulses (where prechirping emphasizes the spectrum of the pulses so that dispersion does not broaden them too much), dispersion assisted transmission (where an initial phase modulation tailored to the transmission distance leads to full-scale amplitude modulation at the receiver end due to the dispersion), and various modulation formats robust to chromatic dispersion and nonlinearities.

Dispersion Management Solutions

Fixed Dispersion Compensation

From a systems point of view, there are several requirements for a dispersion compensating module: low loss, low optical nonlinearity, broadband (or multichannel) operation, small footprint, low weight, low power consumption, and clearly low cost. It is unfortunate that the first dispersion compensation modules, based on DCF only, met only two of these requirements: broadband operation and low power consumption. On the other hand, several solutions have emerged that can complement or even replace these first-generation compensators.

Dispersion compensating fiber (DCF)

One of the first dispersion compensation techniques was to deploy specially designed sections of fiber with negative chromatic dispersion. The technology for DCF emerged in the 1980s and has developed dramatically since the advent of optical amplifiers in 1990. DCF is the most widely deployed dispersion compensator, providing broadband operation and stable dispersion characteristics, and the lack of a dynamic, tunable DCF solution has not reduced its popularity.

As can be seen in Figure 9, the core of the average dispersion compensating fiber is much smaller than that of standard SMF, and beams with longer wavelengths experience relatively large changes in mode size (due to the waveguide structure), leading to greater propagation through the cladding of the fiber, where the speed of light is greater than that of the core. This leads to a large negative dispersion value. Additional cladding layers can lead to improved DCF designs that can include negative dispersion slope to counteract the positive dispersion slope of standard SMF.

In spite of its many advantages, DCF has a number of drawbacks. First, it is limited to a fixed compensation value. In addition, DCF has a weakly guiding structure and has a much smaller core cross-section, 19 μm², compared to the 85 μm² of SMF. This leads to higher nonlinearity, higher splice losses, as well as higher bending losses. Last, the length of DCF required to compensate for SMF dispersion is rather long, about one-fifth of the length of the transmission fiber for which it is compensating. Thus DCF modules induce loss, and are relatively bulky and heavy. The bulk is partly due to the mass of fiber, but also due to the resin used to hold the fiber securely in place. One other contribution to the size of the module is the higher bend loss associated with the refractive index profile of DCF; this limits the radius of the DCF loop to 6–8 inches, compared to the minimum bend radius of 2 inches for SMF.

Traditionally, DCF-based dispersion compensation modules are usually located at amplifier sites. This serves several purposes. First, amplifier sites offer relatively easy access to the fiber, without requiring any digging or unbraiding of the cable. Second, DCF has high loss (usually at least double that of standard SMF), so a gain stage is required
Figure 9 Typical DCF (a) refractive index profile and (b) dispersion and loss as a function of wavelength. Δn is defined as the refractive index variation relative to the cladding.
Figure 10 System demonstration of dispersion compensation using DCF. (Reproduced with permission from Park YK, Yeates PD,
Delavaux J-MP, et al. (1995) A field demonstration of 20-Gb/s capacity transmission over 360 km of installed standard (non-DSF) fiber.
Photon. Technol. Lett. 7(7): 816– 818.)
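The one-fifth rule of thumb for DCF length follows directly from the ratio of the two dispersion coefficients. A minimal numeric sketch; the coefficient values of +17 ps/nm/km for SMF near 1550 nm and -85 ps/nm/km for DCF are typical illustrative assumptions, not figures taken from this article:

```python
# Sketch (assumed values): length of DCF needed to cancel the accumulated
# chromatic dispersion of a standard single-mode fiber (SMF) span.

D_SMF = 17.0    # ps/nm/km, transmission fiber dispersion (assumed)
D_DCF = -85.0   # ps/nm/km, compensating fiber dispersion (assumed)
L_SMF = 100.0   # km, span length

# Zero net accumulated dispersion: D_SMF*L_SMF + D_DCF*L_DCF = 0
L_DCF = -D_SMF * L_SMF / D_DCF

print(L_DCF)                            # 20 km, one-fifth of the span
print(D_SMF * L_SMF + D_DCF * L_DCF)    # net accumulated dispersion: 0 ps/nm
```

With these assumed coefficients the compensating fiber is exactly one-fifth the span length, matching the rule of thumb quoted above.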
before the DCF module to avoid excessively low signal levels. DCF has a cross-section four times smaller than SMF, hence a higher nonlinearity, which limits the maximum launch power into a DCF module. The compromise is to place the DCF in the mid-section of a two-section EDFA. This way, the first stage provides pre-DCF gain, but not to a power level that would generate excessive nonlinear effects in the DCF. The second stage amplifies the dispersion compensated signal to a power level suitable for transmission through the fiber link. This launch power level is typically much higher than could be transmitted through DCF without generating large nonlinear effects. Many newer dispersion compensation devices have better performance than DCF, in particular lower loss and lower nonlinearities. For this reason, they may not have to be deployed at the mid-section of an amplifier. Figure 10 shows results from a field demonstration of dispersion compensation using DCF.

Chirped fiber Bragg gratings

Fiber Bragg gratings have emerged as major components for dispersion compensation because of their low loss, small footprint, and low optical nonlinearity. Bragg gratings are sections of single-mode fiber in which the refractive index of the core is modulated in a periodic fashion, as a function of the spatial coordinate along the length of the fiber. When the spatial periodicity of the modulation matches what is known as a Bragg condition with respect to the wavelength of light propagating through the grating,
the periodic structure acts like a mirror, reflecting the optical radiation that is traveling through the core of the fiber. An optical circulator is traditionally used to separate the reflected output beam from the input beam.

When the periodicity of the grating is varied along its length, the result is a chirped grating, which can be used to compensate for chromatic dispersion. The chirp is understood as the rate of change of the spatial frequency as a function of position along the grating. In chirped gratings the Bragg matching condition for different wavelengths occurs at different positions along the grating length. Thus, the roundtrip delay of each wavelength can be tailored by designing the chirp profile appropriately. Figure 11 compares the chirped FBG with the uniform FBG. In a data pulse that has been distorted by dispersion, different frequency components arrive with different amounts of relative delay. By tailoring the chirp profile such that the frequency components see a relative delay which is the inverse of the delay of the transmission fiber, the pulse can be compressed back. The dispersion of the grating is the slope of the time delay as a function of wavelength, which is related to the chirp.

The main drawback of Bragg gratings is that the amplitude profile and the phase profile as a function of wavelength have some amount of ripple. Ideally, the amplitude profile of the grating should have a flat (or rounded) top in the passband, and the phase profile should be linear (for linearly chirped gratings) or polynomial (for nonlinearly chirped gratings). The grating ripple is the deviation from the ideal profile shape. Considerable effort has been expended on reducing the ripple. While early gratings were plagued by more than 100 ps of ripple, published results have shown vast improvement, to values close to ±3 ps.

Higher order mode dispersion compensation fiber

One of the challenges of designing standard DCF is that high negative dispersion is hard to achieve unless the cross-section of the fiber is small (which leads to high nonlinearity and high loss). One way to reduce both the loss and the nonlinearity is to use a higher-order mode (HOM) fiber (LP11 or LP02 near cutoff instead of the LP01 mode in the transmission fiber). Such a device requires a good-quality mode converter between LP01 and LP02 to interface between the SMF and HOM fiber. HOM fiber has a dispersion per unit length greater than six times that of DCF. Thus, to compensate for a given transmission length in SMF, the length of HOM fiber required is only one sixth the length of DCF. Thus, even though losses and nonlinearity per unit length are larger for HOM fiber than for DCF, they are smaller overall, because of the shorter HOM fiber length. As an added bonus, the dispersion can be tuned slightly by changing the cutoff wavelength of LP02 (via temperature tuning). A soon to be released HOM fiber-based commercial dispersion compensation module is not tunable, but can fully compensate for dispersion slope.

Tunable Dispersion Compensation

The need for tunability

In a perfect world, all fiber links would have a known, discrete, and unchanging value of chromatic dispersion. Network operators would then deploy fixed dispersion compensators periodically along every fiber link to exactly match the fiber dispersion. Unfortunately, several vexing issues may necessitate that dispersion compensators be tunable, that is, have the ability to adjust the amount of dispersion to match system requirements.

First, there is the most basic business issue of inventory management. Network operators typically do not know the exact length of a deployed fiber link nor its chromatic dispersion value. Moreover, fiber plants periodically undergo upgrades and maintenance, leaving new and nonexact lengths of fiber behind. Therefore, operators would need to keep in stock a large number of different compensator models, and even then the compensation would only be approximate. Second, we must consider the sheer difficulty of 40 Gbit/s signals. The tolerable threshold for accumulated dispersion for a 40 Gbit/s data channel is 16 times smaller than at 10 Gbit/s. If the compensation value does not match the fiber to within a few percent of the required dispersion value, then the communication link will not work. Tunability is considered a key enabler for this bit rate (see Figures 12 and 13). Third, the accumulated dispersion changes slightly with temperature, which begins to be an issue for 40 Gbit/s systems and 10 Gbit/s ultra long-haul systems. In fiber, the zero-dispersion wavelength changes with temperature at a

Figure 11 Uniform and chirped FBGs. (a) A grating with uniform pitch has a narrow reflection spectrum and a flat time delay as a function of wavelength. (b) A chirped FBG has a wider bandwidth, a varying time delay, and a longer grating length.
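The definition above, dispersion as the slope of the group delay versus wavelength, is easy to illustrate numerically. A minimal sketch with illustrative (not article-stated) delay profiles: a linear chirp gives a constant dispersion, while a third-order (cubic) delay profile gives a dispersion that varies across the band.

```python
# Sketch: grating dispersion as the slope of the group delay vs wavelength.
# The delay profiles and numbers below are illustrative assumptions.

def tau_linear(lam, lam0=1550.0, slope=-1000.0):
    """Group delay (ps) of an idealized linearly chirped grating."""
    return slope * (lam - lam0)

def tau_cubic(lam, lam0=1550.0, slope=-1000.0, curv=-2000.0):
    """Group delay (ps) of a third-order (nonlinearly) chirped grating."""
    return slope * (lam - lam0) + curv * (lam - lam0) ** 3

def dispersion(tau, lam, dl=1e-3):
    """Numerical slope d(tau)/d(lam) in ps/nm (central difference)."""
    return (tau(lam + dl) - tau(lam - dl)) / (2 * dl)

# Linear chirp: the dispersion is the same at every channel wavelength.
print(dispersion(tau_linear, 1549.8), dispersion(tau_linear, 1550.2))

# Cubic delay: the dispersion now varies (quadratically) across the band,
# which is what makes dispersion-slope tailoring possible.
print(dispersion(tau_cubic, 1550.0), dispersion(tau_cubic, 1550.2))
```

Stretching such a grating shifts the delay curve in wavelength, which is how the slope seen at a fixed channel wavelength can be tuned.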
nonlinearly-chirped grating is stretched, the time delay curve is shifted toward longer wavelengths, but the slope of the ps-vs.-nm curve at a specific channel wavelength changes continuously. Ultimately, tunable dispersion compensators should accommodate multichannel operation. Several WDM channels can be accommodated by a single chirped FBG in one of two ways: fabricating a much longer (i.e., meters-long) grating, or using a sampling function when writing the grating, thereby creating many replicas of the transfer function of the FBG in the wavelength domain (see Figure 16).

One free-space-based tunable dispersion compensation device is the virtually imaged phased array (VIPA), based on the dispersion of a Fabry–Perot interferometer. The design requires several lenses, a movable mirror (for tunability), and a glass plate with a thin film layer of tapered reflectivity for good mode matching. Light incident on the glass plate undergoes several reflections inside the plate. As a result, the beam is imaged at several virtual locations, with a spatial distribution that is wavelength-dependent.

Several devices used for dispersion compensation can be integrated on a chip, using either an optical chip medium (a semiconductor-based laser or amplifier medium) or an electronic chip. One such technology is the micro-ring resonator, a device that, when used in a structure similar to that of an all-pass filter (see Figure 17), can be used for dispersion compensation on a chip scale. Although these technologies are not yet ready for deployment as dispersion compensators, they have been used in other applications and have the potential to offer very high performance at low cost.

As the ultimate optical dispersion compensation devices, photonic bandgap fibers (holey fibers) are an interesting class in themselves (see Figure 18). These are fibers with a hollow structure, with holes engineered to achieve a particular functionality. Instead of being drawn from a solid preform, holey fibers are drawn from a group of capillary tubes fused together. This way, the dispersion, the dispersion slope, and the nonlinear coefficients could in principle all be precisely designed and controlled, up to very small or very large values, well outside the range of those of solid fiber.
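The sampled-grating approach mentioned above exploits a Fourier relationship: superimposing a spatial sampling function of period P on the grating replicates its reflection band at regular wavelength intervals. A minimal sketch using the standard sampled-grating channel-spacing relation, delta_lam = lam^2/(2*n_eff*P); the relation and all numbers below are illustrative assumptions, not taken from this article:

```python
# Channel spacing of a sampled FBG: replicas of the reflection band appear
# every delta_lam = lam**2 / (2 * n_eff * P) in the wavelength domain.
# Assumed values: 1550 nm design wavelength, effective index 1.45,
# 1 mm sampling period.

lam = 1550e-9     # m, design wavelength (assumed)
n_eff = 1.45      # effective refractive index of the fiber mode (assumed)
P = 1e-3          # m, spatial period of the sampling function (assumed)

delta_lam = lam**2 / (2 * n_eff * P)    # m
print(delta_lam * 1e9, "nm")            # roughly 0.83 nm channel spacing
```

With these assumed values the replicas fall about 0.83 nm apart, close to the 0.8 nm (100 GHz) grid used in WDM systems, which is why sampled gratings are attractive for multichannel compensation.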
Figure 18 (a) SEM image of a photonic crystal fiber (holey fiber), and (b) net dispersion of the fiber at 1550 nm as a function of the core diameter. (Reproduced with permission from Birks TA, Mogilevtsev D, Knight JC and Russell PStJ (1999) Dispersion compensation using single-material fibers. Photon. Technol. Lett. 11(6): 674–676.)
Figure 20 Tunable dispersion slope compensation using a third-order nonlinearly chirped FBG. (a) Cubic time delay curves of the grating, and (b) quadratic dispersion curves showing the change in dispersion in each channel before and after tuning. (Reproduced with permission from Song YW, Motaghian SMR, Starodubov D, et al. (2002) Tunable dispersion slope compensation for WDM systems using a non-channelized third-order-chirped FBG. Optical Fiber Communication Conference 2002. Paper ThAA4.)
any tunable chromatic dispersion compensation modules on the fly as the network changes. An in-line chromatic dispersion monitor can quickly measure the required dispersion compensation value while data are still being transmitted through the optical link. This is very different from the more traditional chromatic dispersion measurement techniques, where dark fiber is used and the measurement is done off-line over many hours (or days).

Chromatic dispersion monitoring is most often done at the receiving end, where the Q-factor of the received data or some other means is employed to assess the accumulated dispersion. Existing techniques that can monitor dispersion in-line, fast, and with relatively low cost include: (i) general performance monitoring using the bit-error rate (BER) or eye opening, although this approach cannot differentiate among different degrading effects; (ii) detecting the intensity modulation induced by phase modulation at the transmitter; (iii) extracting the bit-rate frequency component (clock) from photodetected data and monitoring its RF power (see Figure 21); (iv) inserting a subcarrier at the transmitter and subsequently monitoring the subcarrier power degradation; (v) extracting several frequency components from the optical data using very narrow-band optical filters and detecting the optical phase; (vi) dithering the optical-carrier frequency at the transmitter and measuring the resultant phase modulation of the clock extracted at the receiver with an additional phase-locked loop; and (vii) using an optical filter to select the upper and lower vestigial sideband (VSB) signals in the transmitted optical data and determine the relative group delay caused by dispersion (see Figure 22).
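Technique (iii) can be sketched in a few lines of simulation: an ideal NRZ intensity waveform has no spectral line at the bit rate, but chromatic dispersion (modeled here, as is conventional, by a quadratic spectral phase on the optical field, followed by square-law detection) regenerates power at the clock frequency. Everything below, including the bit pattern, sample counts, and the phase coefficient beta, is an illustrative assumption, not data from the article:

```python
import cmath
import random

SPB, NBITS = 8, 32          # samples per bit and number of bits (assumed)
N = SPB * NBITS

def dft(x, sign=-1.0):
    """O(N^2) discrete Fourier transform, kept dependency-free."""
    n = len(x)
    return [sum(x[m] * cmath.exp(sign * 2j * cmath.pi * k * m / n)
                for m in range(n)) for k in range(n)]

def idft(X):
    return [v / len(X) for v in dft(X, sign=1.0)]

def clock_tone(intensity):
    """Magnitude of the detected intensity at the bit-rate (clock) bin."""
    return abs(sum(intensity[m] * cmath.exp(-2j * cmath.pi * NBITS * m / N)
                   for m in range(N)))

def disperse(field, beta=0.02):
    """Chromatic dispersion as a quadratic spectral phase on the field.
    beta is an arbitrary strength in units of the squared bin index."""
    spec = dft(field)
    for k in range(N):
        w = k if k < N // 2 else k - N       # signed frequency index
        spec[k] *= cmath.exp(-0.5j * beta * w * w)
    return idft(spec)

random.seed(1)
bits = [random.randint(0, 1) for _ in range(NBITS)]
field = [complex(bits[m // SPB]) for m in range(N)]   # rectangular NRZ field

p0 = clock_tone([abs(e) ** 2 for e in field])             # back-to-back
p1 = clock_tone([abs(e) ** 2 for e in disperse(field)])   # after dispersion
print(p0 < 1e-6, p1 > p0)   # clock power is ~0 until dispersion creates it
```

The monotonic growth of this clock power with residual dispersion (Figure 21) is what makes it usable as an in-service monitor signal.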
Conclusion

Chromatic dispersion is a phenomenon with profound implications for optical fiber communications systems. It has negative effects, broadening data pulses, but it also helps reduce the effects of fiber nonlinearities. For this reason, managing dispersion, rather than trying to eliminate it altogether, is the key. Fixed dispersion components are suitable for point-to-point OC-192 systems. Tunable dispersion components are essential for dispersion management in reconfigurable and OC-768 systems. These dispersion compensation elements must have low loss and low nonlinearity, and must be cost effective.

Figure 21 Clock regenerating effect due to chromatic dispersion for NRZ data. As the amount of residual dispersion increases, so does the amount of power at the clock frequency. This power can be used to monitor the amount of uncompensated chromatic dispersion. (Reproduced with permission from Pan Z, Yu Q, Xie Y, et al. (2001) Chromatic dispersion monitoring and automated compensation for NRZ and RZ data using clock regeneration and fading without adding signalling. Optical Fiber Communication Conference 2001. Paper WH5.)
Figure 22 Chromatic dispersion monitoring using the time delay (Δt) between two VSB signals, which is a function of chromatic dispersion. (Reproduced with permission from Yu Q, Yan L-S, Pan Z and Willner AE (2002) Chromatic dispersion monitor for WDM systems using vestigial-sideband optical filtering. Optical Fiber Communication Conference 2002. Paper WE3.)
Although several technologies have emerged that meet some or all of the above requirements, no technology is a clear winner. The trend is towards tunable devices, or even actively self-tunable compensators, and such devices will allow system designers to cope with the shrinking system margins and with the emerging rapidly reconfigurable optical networks.

List of Units and Nomenclature

Chromatic dispersion: The time-domain pulse broadening in an optical fiber caused by the frequency dependence of the refractive index. This results in photons at different frequencies traveling at different speeds [ps/nm/km].

Chromatic dispersion slope: Different wavelengths experience different dispersion values in an optical fiber [ps/nm²/km].

Conventional single mode fiber (SMF): Fiber that transmits only a single optical mode by virtue of a very small core diameter relative to the cladding. Provides lowest loss in the 1,550 nm region and has zero dispersion at 1,300 nm.

Cross-phase modulation (XPM): A nonlinear Kerr effect in which a signal undergoes a nonlinear phase shift induced by a copropagating signal at a different wavelength in a WDM system.

Dispersion compensating fiber (DCF): Optical fiber that has both a large negative dispersion and dispersion slope around 1,550 nm.

Fiber Bragg gratings (FBGs): A small section of optical fiber in which there is a periodic change of refractive index along the core of the fiber. An FBG acts as a wavelength-selective mirror, reflecting only specific wavelengths (Bragg wavelengths) and passing others.

Four-wave mixing (FWM): A nonlinear Kerr effect in which two or more signal wavelengths interact to generate a new wavelength.

Kerr effect: A nonlinear effect in which the refractive index of optical fiber varies as a function of the intensity of light within the fiber core.

Modulation: The process of encoding digital data onto an optical signal so it can be transmitted through an optical network.

Nonlinear effects: The nonlinear responses of a dielectric to intense electromagnetic fields. One such effect is the intensity dependence of the refractive index of an optical fiber (Kerr effect).

Nonzero dispersion-shifted fiber (NZDSF): Optical fiber that has a small amount of dispersion in the 1,550 nm region in order to reduce deleterious nonlinear effects.

Polarization mode dispersion (PMD): Dispersion resulting from the fact that different light polarizations within the fiber core travel at different speeds through the optical fiber [ps/km^0.5].

Self-phase modulation (SPM): A nonlinear Kerr effect in which a signal undergoes a self-induced phase shift during propagation in optical fiber.

Wavelength division multiplexing (WDM): Transmitting many different wavelengths down the same optical fiber at the same time in order to increase the amount of information that can be carried.

See also

Nonlinear Optics, Basics: Four-Wave Mixing. Photonic Crystals: Nonlinear Optics in Photonic Crystal Fibers.

Further Reading

Agrawal GP (1997) Fiber-Optic Communication Systems. John Wiley & Sons.
Agrawal GP (1997) Nonlinear Fiber Optics, 2nd edn. Academic Press.
Gowar J (1993) Optical Communication Systems, 2nd edn. Prentice Hall.
Kaminow IP and Koch TL (1997) Optical Fiber Telecommunications IIIA & IIIB. Academic Press.
Kashyap R (1999) Fiber Bragg Gratings. Academic Press.
Kazovsky L, Benedetto S and Willner A (1996) Optical Fiber Communication Systems. Artech House.
Willner AE and Hoanca B (2002) Fixed and tunable management of fiber chromatic dispersion. In: Kaminow IP and Li T (eds) Optical Fiber Telecommunications IV. Academic Press.
Yariv A and Yeh P (1984) Optical Waves in Crystals. John Wiley & Sons.
366 DISPLAYS
DISPLAYS
R L Donofrio, Display Device Consultants LLC, Ann Arbor, MI, USA

© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

This article is a brief review of display technology being used today. It emphasizes some of the optical aspects of displays. The details of the basic construction of the display devices are also discussed to aid understanding. The displays covered are the CRT, VFD, FED, PDP, LED, OLED, LCD-AMLCD, and transmissive and transflective LCDs.

Definition of a Display

A display can be defined as an object which transfers information to the viewer and/or an output unit that gives a visual representation of data.

Optical Characteristics of a Display

A display has (in general) the optical characteristics of luminance, reflectivity, contrast ratio, color, surface gloss, resolution, and image sharpness (sometimes called small area contrast). Measurement methods can be obtained from the EIA (Electronics Industries Alliance) JT-31 (Optical Characteristics of Display Devices), the SAE (Society of Automotive Engineers), and the ISO and VESA standards, to name a few.

Luminance

The luminance of a display is its light output and is given in units of candela per square meter, or cd/m² (in the English system, fL). Luminance is measured using a photometer, photometer/colorimeter, or spectroradiometer.

Illuminance

The illuminance is the light incident on the surface being measured. Its units are lumen per m², or lux (and in the English system, ft·cd). This is measured using illuminance meters.

Reflectivity

The reflectivity for CRTs (cathode ray tubes) is measured with incident light at 45 degrees (in some cases for LCDs, a smaller 30 degree angle is used). The reflected light from the surface is measured with a detector normal to (perpendicular to) the surface. Devices such as 'tube face reflectivity' (TFR) meters can be used to measure this parameter, and the method is published in the EIA JT-31 measurement methods. Other similar methods are used for flat panel displays.

Color

Although color perception is different for different individuals, standard responses to color stimuli have been developed and we can measure the color of different objects and display images. Color can be measured with such equipment as a colorimeter or spectroradiometer. The output data are given in various color systems. Color systems such as the 1931 CIE (Commission Internationale de l'Eclairage) color system (x, y), the 1960 CIE (u, v), or the 1976 CIE (u′, v′) color coordinates are all used. All these color systems can be converted to each other by simple mathematical relations, and these conversions are usually available to the user in the data output. The choice of the color space/system is dependent on application, but the system in most general use for displays is the 1976 CIE (u′, v′). This color system has become more widely used because equally perceived color differences in this color system are at almost equal distances in this color space. In reality, the quest for a system with equal distances equating to equal color differences can be achieved through a complicated matrix approach in which the constants used in the matrix vary from point to point in the color space.

Gloss

Gloss is a measure of surface reflectivity under Snell's Law criteria (equal incidence and reflection angles). The measurement angles of 20, 45, 60, and 85 degrees are used to measure the gloss of common surfaces. Display glass surface measurements tend to use the 60 degree gloss angle. The surface of a display can be made rougher to scatter incident light away from the viewer, or it can be optically coated to reduce the reflected light to achieve improved viewing. Sometimes both methods are used. Reduced gloss surfaces are also involved with the reduction in surface glare. The level of increased roughness of the surface affects the user's ability to see detail.

Contrast Ratio

When people talk about contrast they generally mean 'contrast ratio'. The contrast ratio (CR) of an area on a display is a unitless parameter and is defined as the excited luminance (Le) divided by
the unexcited luminance (Lu). The excited luminance (Le) equals the glass transmission (T) times some excited internal luminance (Lei), and for simplicity, Lu equals T times an internal zero-ambient unexcited luminance light level (Lui). Factors such as internal scattered light in LCDs, and scattered electrons and scattered light in CRTs, give a nonzero unexcited luminance Lu.

When dealing with the CR, the effect of ambient illumination A contributes to the numerator and the denominator. Thus, when the ambient illumination is increased, it lowers the contrast ratio. In a simple case where the ambient illumination is reflected from the surface of the display whose reflectivity is r, the contrast ratio is given below:

CR = (Le + Ar)/(Lu + Ar)   [1]

For the case of the reflectivity of a phosphor screen (reflectance R) coated panel with a glass transmission (T), the reflectivity to first approximation is

r = cRT²   [2]

where c is a constant.

So we see that changes in contrast can come about through changing the display's glass panel transmission. For example, if the Le of the white field of a CRT was 100 fL with a 90% panel glass (along with R = 0.5) and it had an Lu of 2 fL, then the zero-ambient contrast would be 50. However, with an ambient illumination of 50 ft·cd the contrast would be 4.76. Now, when we use a 50% transmission panel glass at the same ambient illumination of 50 ft·cd, the CR is increased to approximately 8.39. Equations [1] and [2] show how to calculate the contrast. It should be noted that decreased panel glass transmission, or the use of 'dark glass', increases contrast because the reflectivity is a 'T squared' effect whereas the luminance L is a function of T. There are second-order effects involved with the reduction of reflected light and unexcited luminance which make the equations more complicated. These second-order effects will not be covered, as they are beyond the basic discussion of this article.

In comparing display CR, one must compare the displays at the same ambient illumination. Additionally, you will find that a number of flat panel display advertising reports will report the CR at zero ambient illumination. As we have shown, these CR values are the highest ones for the display. The CR for emissive displays tends to be constant over a wide range of viewing angles, whereas the contrast ratio for liquid crystal displays (LCDs) is more variable over large-angle viewing. Recent electrode-position developments, such as the 'in-plane switching' technology for active matrix liquid crystal displays (AMLCDs), have gone a long way to reduce this reduced angle-of-view effect in that type of LCD display.

Figure 1 shows how a viewer in a sport utility vehicle (SUV) would see the changes in contrast with viewing angle for two types of twisted nematic LCDs (TN-LCDs). One of the TN LCDs does not have the film compensation and the other does. For relative contrast appraisal, the red areas show the highest contrast regions and the blue the lowest. It is obvious that different technologies can give significant changes in viewed contrast.

Figure 1 A typical rectangular viewing envelope for a mid-size SUV vehicle superimposed on LCD iso-contrast plots. Left is a standard TN LCD. Right is a film-compensated TN LCD. Reproduced from Smith-Gillespie R, et al. (2001) Design requirements for automotive entertainment displays. Proceedings of the 8th Annual Symposium on Vehicle Displays (2001). Permission for reprint courtesy of the Society for Information Display.

Phosphor Aging

The fall-off of luminance as the phosphor screen is bombarded with the electron beam is called phosphor aging. This effect is due to two main factors: (i) the thermal quenching of the phosphor luminescence (which is reversible); and (ii) a nonreversible aging due to damage to the screen components (such as phosphor, binder, and aluminum coating). It has been found that the aging (when we are using an electron beam) is a function of the number of electrons bombarding the screen. Aging models help us to understand and predict the amount of aging a tube will show after a fixed amount of time. The first aging model, in 1951, was on ion bombardment and showed that:

I = I0/(1 + AN)   [3]

where I0 is the initial luminance before aging, and I is the final luminance.

In 1961 a similar (one parameter) model, I = I0/(1 + CN), was used by Pfahnl, where C is the aging constant to half luminance and N is the number of coulombs. In 1981, a two-component model for low-current aging was presented, where:

I = I0/(1 + N(B/QD))   [4]

spectral filters. However, for the usual computer monitor, the 'specular reflection' (Snell's Law type of reflection) off the front of the display is due to a localized external light source. This reflection is reduced by methods such as anti-glare (AG) and anti-reflecting (AR) surfaces applied to the front surface of the display. A flat versus curved front surface of the display can also be a glare reduction technique. Surface roughness is used in AG surfaces to scatter a portion of the specular light away from the viewer. Surface roughness can be achieved through surface abrading, surface plastic coating, surface silicate coating, and surface acid etching, to name a few methods. AR surfaces use optical interference techniques to reduce the reflected light to the viewer. In some cases, both methods are applied to a display surface. Surface gloss of the display is usually measured with a gloss meter in 60 degree gloss values, but 20, 45, and 80 degree angles are also used. Resolution is measured in line pairs/mm or pixels/mm, and image sharpness is associated with modulation transfer function (MTF) measurements. We should note here that resolution and image sharpness are not the same thing. High resolution is equated with the high-frequency limit of the MTF measurement. Since the reduction in specular reflection through the use of surface roughness can affect the image sharpness, we see that a system of trade-offs can be developed to optimize a display for a given viewing environment. AR coatings improve glare reduction without scattering and thus give higher-resolution capability to the displays, but this is usually obtained at a higher price than the AG coatings.
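The contrast-ratio relations (eqns [1] and [2]) and the single-parameter aging model (eqn [3]) are easy to exercise numerically. A minimal sketch; the constant c, the charge load, and the half-luminance constant below are illustrative assumptions, not values from the article:

```python
# Contrast ratio of a display under ambient light, per eqns [1] and [2]:
#   CR = (Le + A*r) / (Lu + A*r),   r = c*R*T**2
# Lowering the glass transmission T lowers the luminances (a 'T' effect)
# but lowers the surface reflectivity faster (a 'T squared' effect), so
# the contrast ratio under ambient light goes up.

def reflectivity(T, R=0.5, c=1.0):          # c is an assumed constant
    return c * R * T**2

def contrast_ratio(Le, Lu, A, r):
    return (Le + A * r) / (Lu + A * r)

A = 50.0                                     # ambient illumination (assumed)
cr_clear = contrast_ratio(100.0, 2.0, A, reflectivity(0.9))
cr_dark_glass = contrast_ratio(100.0 * 0.5 / 0.9, 2.0 * 0.5 / 0.9,
                               A, reflectivity(0.5))
print(cr_clear < cr_dark_glass)              # darker glass -> higher ambient CR

# Single-parameter phosphor aging model, eqn [3] (Pfahnl form):
#   I = I0 / (1 + C*N), with C chosen so luminance halves at N = 1/C.
def aged_luminance(I0, C, N):
    return I0 / (1.0 + C * N)

C = 1.0 / 40.0                               # assumed: half luminance at N = 40 C
print(aged_luminance(100.0, C, 40.0))        # 50.0, half the initial luminance
```

With the assumed c = 1, the 50% glass case gives an ambient contrast near 8.4, in line with the worked example in the text; the exact figures there depend on the unstated value of c.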
Table 1 Table of emissive and nonemissive displays

Emissive         Nonemissive
CRT              LCD and AMLCD
VFD              Bi-stable LCD
FED              Digital mirror display
EL – inorganic   LCOS (Liquid Crystal on Si)
LED              Electronic inks
OLED
PLASMA

lumophors (or phosphors). The lumophors are called cathodoluminescent, photoluminescent, and electroluminescent, depending on whether the excitation action was due to the interaction of the phosphor with cathode rays, photons, or low-voltage electrons. Cathodoluminescent lumophors are used in cathode ray tubes (CRTs – television tubes, monitor tubes, projection CRTs), vacuum fluorescent displays (VFDs), and field emission displays (FEDs).

EL displays can use inorganic lumophors (phosphors) or organic lumophors such as those used in OLEDs. Under photoluminescence displays, we have 'plasma displays' (plasma TV), IR up-conversion, and fluorescent lamps. In plasma displays, the excitation of the plasma in a localized region gives rise to UV photons which strike the phosphor screen in that region. In IR up-conversion, IR photons are absorbed in such a way that two IR photons can excite an electron with twice the energy of one IR photon. This then allows the use of IR lasers to create displays through the excitation of these up-conversion phosphors. In fluorescent lamps, the UV created by the excitation of Hg or Xe strikes the special lamp phosphors (coated on the inside of the glass envelope), which become luminescent.

Under the heading of shutters we can consider LCDs, which consist of AMLCDs, super twisted nematic liquid crystal displays (STN LCDs), and bistable LCDs. Other shutter types are the liquid crystal on silicon (LCOS) display, the movable digital mirror display (DMD), and the digital light processing (DLP) projection engines. Another spatial display type is both normal and dynamic inks. A dynamic ink can be a 'polarizable' ink display in which a two-shade (light/dark) ink sphere element can rotate under the presence of an electric field to show the light or dark side. Although they are very interesting, we will not cover ink displays in this article.

The shutter action is also used in some projection TV sets (or projectors), in which the LCD either transmits or reflects (such as LCOS) the light from an arc lamp. The DMD movable mirrors reflect the digital information to the lens system, and the unwanted rays are reflected away from the lens to a black light capture area.

The Cathode Ray Tube (CRT)

The CRT is a display which has been in use for over 100 years (the Braun CRT tube was developed in 1898). The CRT uses a phosphor screen which is bombarded by energetic electrons and emits visible light radiation. The single electron gun CRT was used for years in early monochrome television sets and monochrome computer displays. Monochrome green, amber, and white phosphors have been used in monitor tubes. Some color selectivity can be obtained with the single electron gun using the following three methods: (i) voltage penetration phosphors, (ii) current sensitive phosphors, and (iii) beam index methods. In a CRT, the penetration of the electron beam follows Hans Bethe's energy loss equation for electron penetration in solids and thus is a function of the electron energy or CRT voltage.

Hans Bethe's energy loss equation is:

dE/dx = −2πe⁴(Ne/E) ln(aE/I)   [5]

where: E = energy of the incident electron; x = depth of material; Ne = density of atomic electrons; I = mean excitation energy; and a = a constant.

In the voltage penetration tube, a cascaded screen is used to give a color change when the electron voltage is increased. For example, if a cascade screen is made from a green phosphor layer placed on top of the red phosphor layer, then the electrons strike the green phosphor first. Then, with increased voltage, the electrons pass through the top phosphor and excite the bottom red phosphor. In the current sensitive color tube, the phosphor screen is composed of (for example) red and green phosphors whose luminance versus current curves are different. In these cases, we are referring to the linearity of the change in luminescence from the phosphor as a result of a change in electron beam current. The red phosphor (generally a rare earth oxide) is a linear phosphor, whereas the green is nonlinear. Changing the current thus can change the color of the display. In the beam index tube, a UV-emitting index layer (secondary electron emitters have also been used) is put alongside a red, green, or blue triad (for example, between the blue and the red phosphor stripes). The scanning electron beam, through the use of a UV sensor (or secondary electron detector) mounted in the rear portion of the tube and the scanning electronics, can determine the location of the red, green, and blue phosphors. The electron beam can then excite only the individual phosphor areas of interest. With this tube, generally, the
370 DISPLAYS
electron beam is at a low current to give a small enough cross-section to only excite one color, and thus it has limited application.
Another application of the monochrome CRT is the projection CRT system. Here, three tubes for red, green, and blue are used to project an image on a viewing screen. The terminology is such that, when the CRTs are in the back of the screen and the viewer is in front of the screen, it is called a 'rear' projection set, and if the CRTs and the viewer are in front of the screen it is a 'front' projection set. The principle here is that, either using one CRT tube at a time or by a superposition of two or three of the CRTs, the color of the displayed image will be within the color gamut defined by the primary colors.

The Color CRT

The more ubiquitous method of obtaining a color display is the device called the shadow mask CRT (and the similar Trinitron CRT). This color CRT is designed in such a manner that, through the use of an aperture mask (seen in Figure 1), the electrons from the red electron gun strike the red phosphor, and likewise electrons from the green and blue electron guns strike the green and blue phosphors. The combination allows the viewer to see a color image. This color separation action is seen in Figure 2. Although Figure 2 shows a delta tri-color design, there are also in-line systems in which the mask consists of an array of slots which are somewhat rectangular in nature. In this array, many slots can be on a line, or a mask can be made of a series of wires, as used in the Trinitron design.
The disadvantage of the shadow mask technique is that the mask can capture as much as 80% of the electrons from the electron guns. This yields lower screen luminance and lower contrast due to scattered electrons. As the screen size gets larger, from 19 inches to 36 inches or more, additional temperature compensation methods must be used to reduce the effects of mask heating. A number of methods to control the shift of the electron beam on the phosphor as the mask heats up have been developed. Initially, axially positioned bimetal springs were used on masks to compensate for the heating effect. Then lighter masks were used with corner lock suspension systems. Another method is using mask materials with lower coefficients of expansion, such as INVAR.
Contrast in color CRTs has been improved by first applying a carbon coating between the panel glass and the phosphor screen. This requires additional process steps which form 'windows' in the carbon coating (matrix) through which the excited phosphor is seen. The use of pigmented phosphors also helped to reduce the reflectivity of the phosphor/matrix screen.
The phosphors used in the shadow mask color CRT are the red Y2O2S:Eu or, in some cases, the Y2O3:Eu phosphor. The green phosphor is ZnS:Cu and the blue phosphor is ZnS:Ag.
The CRT seen in Figure 3 shows the new high-definition television-type (HDTV) tube with the 16 × 9 aspect ratio, as contrasted with the older 4 × 3 aspect ratio (screen width × height).
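The relation between the quoted diagonal sizes and the two aspect ratios can be made concrete. A minimal sketch (the 36-inch diagonal is simply one of the screen sizes mentioned above, used here as an illustrative value):

```python
import math

# Given a diagonal size and an aspect ratio (width:height), recover the
# screen width and height: the unit rectangle (w, h) is scaled so that its
# diagonal sqrt(w^2 + h^2) equals the quoted diagonal.

def screen_dimensions(diagonal, aspect_w, aspect_h):
    scale = diagonal / math.hypot(aspect_w, aspect_h)
    return aspect_w * scale, aspect_h * scale

# A 36-inch 4:3 tube versus a 36-inch 16:9 HDTV tube:
w43, h43 = screen_dimensions(36, 4, 3)      # 28.8 x 21.6 inches
w169, h169 = screen_dimensions(36, 16, 9)   # about 31.4 x 17.7 inches
```

For equal diagonals, the 16:9 tube comes out wider but shorter than the 4:3 tube, which is the point of the widescreen HDTV format.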
material is cylindrical instead of spherical in nature, has great strength, and has high conductivity.
There are two types of FEDs, a low-voltage (LVFED) and a high-voltage (HVFED). The HVFED uses an aluminum film over the phosphor to protect the phosphor screen from aging. The HVFED can use the high-voltage (2–10 kV) electrons to bombard the aluminized screen, and it can use the more efficient, somewhat standard CRT phosphors: ZnS:Cu green, ZnS:Ag blue, and Y2O3:Eu red. The LVFED (12–200 V) cannot use the standard CRT phosphors but uses a low-voltage phosphor such as the ZnO:Zn green phosphor. It does not use an aluminum layer, since about 2 kV is needed to penetrate the Al layer. The commonly used CRT red phosphor Y2O2S:Eu was found unsuitable for HVFED use due to release of the sulfur during electron bombardment.
Figure 5 shows that, for this basic FED structure, the electrons from the Spindt emitter are controlled by the voltages of the anode, control electrode, and emitter electrode.

Plasma Displays

Earlier plasma displays, made using excited Ne gas with the characteristic amber glow, required no phosphor and were used in the printing industry. Present plasma displays used for television and information signs are photoluminescent displays. As shown in Figure 6, the excitation of the plasma gas, made of elements such as He, Ne, Ar, and Xe, gives rise to a glow discharge and the creation of vacuum ultraviolet photons that strike the screen and cause it to luminesce. In construction, the display is similar to the FED in that it is a matrix-addressed device. It has been described in the literature that the similarity goes even further, with the need for anode and cathode surfaces, a means of evacuating, and a perimeter seal.
There are two types of plasma displays, DC and AC. The AC design appears to be the more successful. Typical PDP phosphors for the triad are (Y,Gd)BO3:Eu for the red, Zn2SiO4:Mn for the green, and BaMgAl10O17:Eu (called BAM for short) for the blue. Figure 6 shows the basic structure of an AC plasma display. One phosphor element is shown with barrier ribs and address and scan electrodes. The discharge of the plasma takes place in the region of the phosphor and excites the photoluminescent phosphor, which is shaped (in this example) along the side of the barrier ribs and over the address electrode.

Electroluminescence

Electroluminescence (EL) was first seen in the 1930s by Destriau. EL is the generation of (nonthermal) light by use of an electric field. High-electric-field EL units emit light through the impact of high-energy electrons in luminescent centers in a semiconducting material. The electric field can be alternating current (AC) or direct current (DC). Powder phosphor AC EL displays appear to be one of the successful inorganic EL display methods. Because of their rugged nature, alternating current electroluminescent (ACEL) displays have been used in trucks, tractors, and the M1 Abrams tank. Thick-film EL panels are also used in some vehicle displays as backlights for LCDs. These light panels are free from mercury and are more environmentally friendly.

Inorganic LED Displays

Light-emitting diodes are being used as backlights for LCD displays, replacing some of the fluorescent backlighting methods. They are also used in indicator lights. Large displays have been made of LEDs. Using a matrix array of row and column connectors,
we can digitally select a given color LED and pattern in an LED array and generate a color image. Similar methods were carried out many years ago using incandescent lights. However, now they are bright and fast-reacting. LEDs operate by semiconductor P–N junction rules, and the flow of electrons and holes under forward-bias operation gives rise to junction light emission. Materials such as GaP:Zn,O are used to create a red LED with an emission maximum wavelength of 699 nm.
Red, green, and blue-UV/blue LEDs are now available, with an increasing demand for white LEDs. Two methods are used to produce a white LED. One method is to use a blue LED with the front of the LED coated with a green photoluminescent phosphor. If the phosphor is thin enough, the blue light penetrates the screen and excites the green to give a blue/green white. Another method is to use a blue LED and coat it with a white photoluminescent phosphor.

Organic LEDs

The basic process of light generation in OLEDs is the formation of a P–N junction. Electrons injected from the N-type material and holes from the P-type material recombine at the junction and light is generated. There are presently two competing methods to fabricate an OLED. The first, developed by Kodak in 1987, is the small-molecule molecular weight material approach. In OLED operation, the electron–hole pairs recombine to form excitons (on the average, one singlet and three triplet excitons are formed for every four electron–hole pairs). The light for the Kodak OLED is from the recombination of singlet excitons and the energy transferred to the fluorescent dye. The terminology is such that we have small-molecule OLEDs (sm-OLEDs) and polymer OLEDs (PLEDs). In later developments, the sm-OLED device structure was reported to have three layers: the hole-transport-enhancing layer of copper phthalocyanine (CuPc), an NPB layer (60 nm), and the light-emitting aluminum tris-8-hydroxyquinolate (Alq) layer (75 nm). The electron transport layer is in contact with this three-layer structure. The top electrode is a low-work-function material, such as MgAg (75–200 nm thick), called an electron injection layer, and is topped with a transparent indium tin oxide (ITO) layer for transparent OLEDs. All of these coatings are applied in a vacuum chamber.
In the phosphorescent material method, the light emission is from the triplet excitons through the use of phosphorescent dyes and the polymer-based devices. The common base material is a polyphenylene-vinylene (PPV), and other polymers such as polyfluorene have been used. The construction of the phosphorescent OLED is similar in that the ITO is also used as a starting substrate, but the hole transport layer is applied by spin casting from solution. Spin casting is also used for the electron transfer material. After drying (removal of solvents), a calcium cathode material is applied. A contrast of 3000/1 has been achieved (for a military-type application) using a green OLED with an AR-coated circular polarizer at an ambient light level of 500 lux, and a 6/1 contrast ratio has been achieved with an ambient front illumination of 50 000 lux and an output of 370 cd/m2. Additionally, OLEDs can be made on flexible (F) substrates and are called FOLEDs. The use of a transparent (T) cathode gives rise to a top-emission OLED, or TOLED.
The first OLED display used in auto products was developed by Pioneer and used in a radio display, and the Ford Motor Company has shown a white OLED display in one of their 'concept cars'. This display allows them to have a reconfigurable instrument panel (IP) and is made up of a number of 5 inch × 8 inch displays across the instrument panel (Figure 7).
The POLED, as seen in Figure 8, is now being used by Philips Mobile Display Systems in a rear-view car mirror. It is the display from a video camera looking at the rear of a car to see any obstacles to backing up safely.

Figure 7 OLED partial cross-section – Kodak type. Reproduced from Green P and Haven D (1998) Fundamentals of emissive displays. SID Short Course S-3. Permission for reprint courtesy of the Society for Information Display.

Figure 8 Philips POLED display in an auto mirror. Reproduced with permission of Philips Mobile Display Systems.

Thin Film EL (TFEL) Displays

A simple type of TFEL display is the color-by-white structure shown below. In this case, the color is generated using a white phosphor and red, green, and blue optical filters. To make this ACEL display function properly, insulators are used on either side of the phosphor. In Figure 9, the ITO conducting stripes (depicted horizontally on the top and vertically on the bottom) give the means of addressing each color pixel. The white phosphor, for example, is a ZnS:Mn, SrS:Ce blend which gives a white electroluminescence.
Other phosphors such as ZnS:Mn can be used, but its yellow luminescence requires the use of a green and red filter and only two metal electrodes on the bottom to give a two-color display. Better red–green color separation can be achieved through the use of a Zn1−xMgxS:Mn and ZnS:Mn phosphor mix. New EL blue phosphor developments have allowed larger color gamuts to be achieved.
One of the important characteristics of alternating current thin-film electroluminescence (ACTFEL) is that it has an operating temperature range from −40 °C to 85 °C, which is key for automotive use. It has a viewing angle of 160 degrees and a life of more than 50 000 hours. Active matrix (active-addressed) AMEL structures allow fast

Figure 11 The operation of a simple liquid crystal display. Reproduced from Lueder E (1997) Fundamentals of passive and active-addressed liquid crystal displays. SID Short Course S-1. Permission for reprint courtesy of the Society for Information Display.
response and moving graphic displays. These have found application in 'heads up displays' (HUD) for the military and medical industries, and in consumer goods such as portable computers, pagers, cell phones, and 3D gaming.

The Active Matrix LCD

The active matrix liquid crystal display (AMLCD) device consists of a series of liquid crystal cells or windows whose characteristics can be controlled by an applied voltage to pass or restrict the passage of light through the windows via the external polarizers and the polarizing action of the LCD cells. These displays are used in laptop and tabletop computers, TVs, and automotive displays (for driving information and entertainment). Figure 10 shows some present applications for AMLCDs in the automobile.
When the simple LCD is used without a transistor matrix, it is termed a passive display rather than an active one. Figure 11 shows how a simple LCD cell operates. The action shown in this figure is the twisted alignment nature of the liquid crystal material. When the polarizer on the left has its high-transmission axis horizontal and the polarizer on the right of this cell has perpendicular alignment, then the light from the backlight (f) is passed through the crossed polarizers. However, when voltage is applied to the cell, the liquid crystal is no longer twisted and the crossed polarizers prevent the light from going through the cell to the viewer. If the polarizers were not crossed but the polarization transmission axis was in the same direction in both polarizers, then the opposite effect would occur.
In order to show a color image, the LCD is designed to have three or more cells clustered per color pixel. Figure 12 shows the transmission profiles of each of these red, green, and blue filters. The backlight used is a fluorescent lamp with a white phosphor (made up of red, green, and blue photoluminescent phosphors). Figure 13 shows what would happen if only the blue phosphor was used, and also shows the superposition of the mercury lines that penetrate the phosphor and contribute to the final spectral energy distribution.

Reflective LCD

For applications where there is suitable external light to view the display, no internal light source is used.

Figure 12 Typical color LCD filter transmission characteristics. Reproduced from Donofrio RL and Abileah A (1998) LCD backlighting system – phosphors and colors. In: Donofrio RL and Musa S (eds) Flat Panel and Vehicle Display '98, p. 123. Permission for reprint courtesy of the Society for Information Display.

Figure 14 The action and construction of a reflective LCD. Reproduced with permission from Watanabe R and Tomita O (2003) Active-Matrix LCDs for Mobile Telephones in Japan. SID Information Display Magazine 7(3): 30–35. Permission for reprint courtesy of the Society for Information Display.
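The crossed-polarizer switching described above can be sketched with a toy Jones-calculus model. This is an illustrative idealization, not a cell design: the field-off 90° twist is treated as a perfect polarization rotator (the large-birefringence, or Mauguin, limit), and the field-on state as having no twist at all.

```python
import numpy as np

# Toy Jones-calculus model of the twisted-nematic cell between crossed polarizers.

def polarizer(theta):
    """Jones matrix of an ideal linear polarizer at angle theta."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c * c, c * s], [c * s, s * s]])

def rotator(theta):
    """Polarization rotation by angle theta (ideal twisted cell, field off)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

horizontal = np.array([1.0, 0.0])      # light leaving the entrance polarizer
analyzer = polarizer(np.pi / 2)        # crossed exit polarizer

field_off = analyzer @ rotator(np.pi / 2) @ horizontal  # twist rotates the light
field_on = analyzer @ horizontal                        # voltage removes the twist

bright = float(np.sum(np.abs(field_off) ** 2))  # ~1: crossed polarizers transmit
dark = float(np.sum(np.abs(field_on) ** 2))     # ~0: the cell appears dark
```

A real cell is usually modeled by slicing the twisted layer into many thin birefringent slabs; the ideal rotator used here is the limiting case of that calculation.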
This reflective LCD consists of a polarizer and retarder plates stacked on the front side of the cell. The cell with spacers contains the liquid crystal material, and on the bottom of the cell is located a reflecting surface. It has been found that a dimpled reflecting surface makes a diffuse reflecting surface (Figure 14).

Transflective LCD

The importance of the transflective device is that it has better contrast under high ambient illumination than the standard transmissive backlighted LCD, and fair contrast for low ambient illumination.
The construction of this device consists of front and rear circular polarizers, a backlight, reflective elements, and novel filters. The backlight is in the rear of the display. However, like the reflective LCD, internal reflectors are used on each pixel. The filters are designed spatially to be more transmissive for the reflected light than for the backlight. Additionally, the cell gap for the reflected light is smaller than that for the transmitted light, because the reflected light passes the cell twice whereas the transmitted light passes the cell once. This type of device, seen in Figure 15, is finding uses in the auto industry, where the LCD is needed for both low and high ambient light levels.

Figure 15 The action and cross-section of a transflective LCD. Reproduced with permission from Watanabe R and Tomita O (2003) Active-Matrix LCDs for Mobile Telephones in Japan. SID Information Display Magazine 7(3): 30–35. Permission for reprint courtesy of the Society for Information Display.

Conclusions

In this brief review of displays we have tried to cover many different types of displays. The reader will in time find that there are many other variations on the

See also

Color and the World. Instrumentation: Photometry. Light Emitting Diodes. Materials for Nonlinear Optics: Liquid Crystals for NLO.

Further Reading

Donofrio RL (1981) Low Current Density Aging. Proceedings of the Electro-Chemical Society, Abstract #152, May 11, 1981.
Donofrio RL (2003) The road show. Information Display 7(3): 26–29.
Donofrio RL and Abileah A (1998) LCD backlighting. Flat Panel and Vehicle Displays '98: 119–124. Metropolitan Detroit Chapter of the SID.
Donofrio RL, Musa S and Wise K (eds) (1999) Vehicle Displays and Microsensors '99. Proceedings of the 6th Strategic Forum and Technical Symposium, Sept 22–23, 1999.
Flat Panel Display Measurements Standard, Version 1.0. VESA (Video Electronics Standards Association), May 15, 1998.
Forrest S (2002) Fundamentals of Organic Light-Emitting Devices. Short Course S-2, SID.
Green P and Haven D (1998) Fundamentals of Emissive Displays. Short Course S-3, SID.
Keller P (1991) The Cathode-Ray Tube: Technology, History, and Applications. New York: Palisades Press.
Leverenz HW (1968) An Introduction to Luminescence of Solids. Dover Publications.
MacDonald W and Lowe A (1997) Display Systems – Design and Applications. SID Series in Display Technology. Wiley.
Sherr S (1993) Electronic Displays, 2nd edn. John Wiley.
Shionoya S and Yen WM (1998) Phosphor Handbook. CRC Press.
Stupp EH and Brennesholtz MS (1999) Projection Displays. SID Series in Display Technology. Wiley.
Watanabe R and Tomita O (2003) Active-Matrix LCDs for Mobile Telephones in Japan. SID Information Display Magazine 7(3): 30–35.
Web site atis.org/tg2k/. American National Standard T1.523-2001 'Def of Display Device'.
Wu ST and Yang DK (2001) Reflective Liquid Crystal Displays. SID Series in Display Technology. Wiley.
ELECTROMAGNETICALLY INDUCED TRANSPARENCY
J P Marangos, Imperial College of Science, Technology and Medicine, London, UK
© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

In this article we examine how the process of electromagnetically induced transparency (EIT) can be used to increase the efficiency of nonlinear optical frequency mixing schemes. We illustrate the basic ideas by concentrating upon the enhancement of resonant four-wave mixing schemes in atomic gases. To start we introduce the quantum interference phenomenon that gives rise to EIT. The calculation of EIT effects in a three-level atomic medium using a density matrix approach is presented. Next we examine the modified susceptibilities that result and that can be used to describe nonlinear mixing, and show how large enhancements in nonlinear conversion efficiency can result. Specific examples are briefly discussed, along with a further discussion of the novel aspects of refractive index and pulse propagation modifications that result from EIT. The potential benefits to nonlinear optics of electromagnetically induced transparency are discussed. In an article of this nature it is not possible to credit all of the important contributions to this field, but we include a bibliography for further reading indicating some of the seminal contributions.
Electromagnetically induced transparency is the term applied to the process by which an otherwise opaque medium can be rendered transparent through laser-induced quantum mechanical interference processes. The linear susceptibility is substantially modified by EIT. Absorption is cancelled due to destructive interference between excitation pathways. The dispersion of the medium is likewise modified such that where there is no absorption the refractive index is unity, and there can be a large value of normal dispersion, leading to low values of group velocity. In contrast to the linear susceptibility, the nonlinear susceptibility involving these same resonant laser fields will undergo constructive rather than destructive interference. This leads to a strong enhancement in nonlinear frequency mixing. For a normally transparent medium the nonlinear optical couplings will be small unless large-amplitude fields are applied. The most important consequence of EIT, in contrast to the usual case when a medium is transparent, is that the dispersion is not vanishing and the nonlinear couplings can be very large.
To understand how this comes about we must consider the system illustrated in Figure 1a where, because of the resonant condition satisfied for the applied fields, the atom can be simplified to just three levels. The important parameters in this model system are that state |2⟩ is metastable and has a dipole-allowed coupling to state |3⟩. The coupling between states |1⟩ and |3⟩ is also dipole-allowed

Figure 1 A three-level atomic system coupled to laser fields in the lambda configuration is shown on the left-hand side. In the limit of a strong coupling field this is equivalent to the dressed states |a⟩ and |b⟩ coupled to the ground state by the probe field alone, as shown on the right-hand side.
378 ELECTROMAGNETICALLY INDUCED TRANSPARENCY
and as a consequence state |3⟩ can decay radiatively to either |1⟩ or |2⟩. Let a coupling field be applied resonantly to the |2⟩–|3⟩ transition, which may be cw or pulsed but should in the latter case be transform-limited. We define the Rabi coupling as Ω23 = μ23E/ℏ, where E is the laser electric field amplitude and μ23 is the dipole moment of the transition. The Rabi frequency needs to be larger than the inhomogeneous broadening of the sample. A second field, the probe, typically of much lower intensity, is then applied in resonance with the |1⟩–|3⟩ transition. If the condition Ω23 ≫ Ω13 is satisfied then it is convenient to replace the bare atomic states |2⟩ and |3⟩ with the dressed states (see Figure 1b):

|a⟩ = (1/√2)(|2⟩ + |3⟩)   [1a]

|b⟩ = (1/√2)(|2⟩ − |3⟩)   [1b]

Transitions from state |1⟩ induced by the probe field to this pair of near-resonant dressed states are subject to exact cancellation at resonance if |2⟩ is perfectly metastable. This is because the only contribution to the excitation amplitude comes from the matrix elements ⟨1|r·E|3⟩, as the |1⟩–|2⟩ transition is dipole forbidden. The contributions from the equally detuned states |a⟩ and |b⟩ thus enter into the overall amplitude with opposite signs and equal magnitude, as can be seen by inspection of eqns [1a] and [1b]. This leads to cancellation of the absorption amplitude. This type of absorption cancellation is well known and closely related to the so-called dark states.

Theoretical Treatment of EIT in a Three-Level Medium

It was realized by several workers in the 1970s that laser-induced interference effects could lead to a cancellation of absorption at certain frequencies. To gain a more quantitative understanding of the effects of the coupling field upon the optical properties of a dense ensemble of three-level atoms we require a treatment that computes the optical susceptibility of the medium. A treatment originally carried out by Harris et al. for a Λ scheme similar to that illustrated in Figure 1 was the first to derive the modified susceptibilities that will be discussed below. In that treatment the state amplitudes in the three-level atom were solved in the steady-state limit and from these the linear and nonlinear susceptibilities (see below) are then derived. In what follows we adopt a density matrix treatment as employed by various workers. This readily allows the inclusion of dephasing as well as population decay terms. The critical parameters in this system are the strengths of the fields (described in terms of the Rabi couplings), the detunings of the fields from resonance Δω13 = ω13 − ωP and Δω23 = ω23 − ωC (see Figure 2), the radiative decays from |3⟩ to |1⟩ and |2⟩, γc and γd, and from |2⟩–|1⟩, γa (although the latter is anticipated to be smaller). Extension to other configurations of the three states, such as a V or ladder scheme, is implicit within this general treatment.
Figure 2 illustrates the system considered. This treatment will also address the nonlinear response in a four-wave mixing scheme completed by the two-photon resonant coupling applied between states |1⟩ and |2⟩.
We write the interaction Hamiltonian as:

H = H0 + V   [2]

Figure 2 A four-wave mixing scheme incorporating the three-level lambda system in which electromagnetically induced transparency has been created for the generated field ωd by the coupling field ωc. The decay rates γi from the three atomic levels are also shown. For a full explanation of this system see text.

where H0 is the unperturbed Hamiltonian of the system and is written as

H0 = ℏω1|1⟩⟨1| + ℏω2|2⟩⟨2| + ℏω3|3⟩⟨3|   [3]

and V is the interaction Hamiltonian and can be expressed

V = ℏΩa e^(−iωa t)|2⟩⟨1| + ℏΩc e^(−iωc t)|2⟩⟨3|   [4]

For the off-diagonal elements we make the substitution:

ρ̃12 = ρ12 e^(−iΔa t)
ρ̃23 = ρ23 e^(−iΔc t)   [8]
ρ̃31 = ρ31 e^(−iΔd t)

This operation conveniently eliminates the time dependencies in the rate equations, and the equations of motion for the density matrix are given by:

P13(ω) = ε0 χ(ω)E   [13]

Im χ(1)_D(−ωd; ωd) = (|μ13|²N/ε0ℏ) × [8Δ21²γd + 2γa(|Ω2|² + γdγa)] / [(4Δ31Δ21 − γdγa − |Ω2|²)² + 4(γdΔ21 + γaΔ31)²]   [14]

Re χ(1)_D(−ωd; ωd) = (|μ13|²N/ε0ℏ) × [−4Δ21(|Ω2|² − 4Δ21Δ32) + 4Δ31γa²] / [(4Δ31Δ21 − γdγa − |Ω2|²)² + 4(γdΔ21 + γaΔ31)²]   [15]

We now consider the additional two-photon resonant field ωa. This leads to a four-wave mixing process that generates a field at ωd (i.e., the probe frequency). The nonlinear susceptibility that describes the coupling of the fields in a four-wave mixing process

Figure 3 Electromagnetically induced transparency is shown in the case of significant Doppler broadening. The Doppler averaged values of the imaginary Im[χ(1)] and real Re[χ(1)] parts of the linear susceptibility and the nonlinear susceptibility χ(3) are plotted in the three frames. The Doppler width is taken to be 20γd, Ωc = 100γd and γd = 50γa, γc = 0.
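The interference built into the dressed susceptibilities of eqns [14] and [15] can be checked numerically. In this sketch the overall prefactor |μ13|²N/ε0ℏ is set to one (so only relative magnitudes are meaningful) and the rate values are illustrative, not taken from any experiment; the names d21, d31, d32 and om2 stand for Δ21, Δ31, Δ32 and |Ω2|:

```python
# Numerical check of the dressed-medium susceptibilities, eqns [14] and [15].
# Prefactor |mu13|^2 N / (eps0 hbar) set to 1; all rates in units of gamma_d.

def denom(d21, d31, gd, ga, om2):
    return (4*d31*d21 - gd*ga - om2**2)**2 + 4*(gd*d21 + ga*d31)**2

def im_chi(d21, d31, gd, ga, om2):       # eqn [14]: absorptive part
    return (8*d21**2*gd + 2*ga*(om2**2 + gd*ga)) / denom(d21, d31, gd, ga, om2)

def re_chi(d21, d31, d32, gd, ga, om2):  # eqn [15]: dispersive part
    return (-4*d21*(om2**2 - 4*d21*d32) + 4*d31*ga**2) / denom(d21, d31, gd, ga, om2)

gd, ga = 1.0, 0.01    # |3> decays rapidly; |2> is nearly metastable
om2 = 10.0            # strong coupling field, |Omega_2| >> gamma_d

# On resonance (all detunings zero) the residual loss scales as ga/|Omega_2|^2,
# far below the loss of the same transition without the coupling field:
loss_eit = im_chi(0.0, 0.0, gd, ga, om2)
loss_bare = im_chi(0.0, 0.0, gd, ga, 0.0)
# loss_eit / loss_bare comes out close to ga*gd/om2**2

# At the same point the dispersive part vanishes: the refractive index takes
# its vacuum value exactly where the medium is transparent.
resonant_re = re_chi(0.0, 0.0, 0.0, gd, ga, om2)
```

Scanning d21 = d31 = Δ through resonance with these functions reproduces the Autler–Townes doublet, with absorption maxima near Δ = ±|Ω2|/2 and the transparency window between them.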
also incorporated by computing the susceptibilities (cw) limit. We make the assumptions also that there is
over the inhomogeneous profile (see below for no pump field absorption and that we have plane
more details). waves. Under these assumptions the generated field
By inspection of eqn [14] we see that the amplitude Ad is given in terms of the other field
absorptive loss at the minimum of Im[x (1)] varies amplitudes Ai by:
as ga =V2c . This dependence is a consequence of the
interference discussed above. In the absence of › v v
A ¼ i d xð3Þ A2a Ac e2 iDkd z 2 d Im½xð1Þ Ad
interference (i.e., just simply Autler– Townes split- ›z d 4c 2c
ting) the minimum loss would vary as gd =V2c . Since vd ð1Þ
þi Re½x Ad ½16
l3. 2 l1. is an allowed decay channel (in contrast 2c
to l2. 2 l1.) it follows that gb q ga and so the
absorption is much less as a consequence of EIT. where the wavevector mismatch is given by:
Re[x (1)] in eqn [15] and Figure 3 is also modified
significantly. The resonant value of refractive index Dkd ¼ kd þ kc 2 2ka ½17
is equal to the vacuum value where the absorption is
a minimum, the dispersion is normal in this region
The wavevector mismatch will be zero on resonance
with a gradient determined by the strength of the
for the three-level atom considered in this treatment.
coupling laser, a point we will return to shortly. For
In fact the contribution to the refraction from all
an unmodified system the refractive index is also
the other atomic levels must be taken into account
unity at resonance but in that case there is high
whilst computing Dkd and it is implicit that these
absorption and steep anomalous absorption.
make a finite contribution to the wavevector
Reduced group velocities result from the steep
mismatch.
normal dispersion that accompanies EIT. Inspection
We can solve this first-order differential equation
of the expression [16] and Figure 3 shows that x (3) is
with the boundary condition Ad ðz ¼ 0Þ ¼ 0 to find
also modified by the coupling field. The nonlinear
the generated intensity Iðvd Þ after a length z:
susceptibility depends upon 1/V2c as is expected for a
laser dressed system, however in this case there is not
3nv2d ð3Þ 2
destructive but rather constructive interference, Iðvd Þ¼ lx l lAa l4 lAc l2
between the field-split components. This result is of 8Z0 c2
vd vd
ð1Þ ð1Þ v
great importance: it ensures that the absorption can 1þe2 c Im½x z 22e2 c Im½x z cos Dkþ d Re½xð1Þ z
2c
be minimized at frequencies where the nonlinear 2
absorption remains large. v2d ð1Þ 2 vd ð1Þ
Im½x þ Dkþ Re½ x
As a consequence of constructive interference the 4c2 2c
nonlinear susceptibility remains resonantly enhanced
whilst the linear susceptibility vanishes or becomes where Z0 is the impedance of free space. This
very small at resonance due to destructive inter- expression is quantitatively correct for the case of
ference. Large nonlinearity accompanying vanishing the assumptions made. More generally the qualitative
absorption (transparency) of course match con- predictions and general scaling remain valid in the
ditions for efficient frequency mixing as a large limit where the pulse duration is significantly longer
atomic density can then be used. Moreover the than the time required to traverse the medium. Note
dispersion (controlling the refractive index) also that both real and imaginary parts of x (1) and x (3)
vanishes at resonance; this leads to perfect phase play an important role, as we would expect for
matching (i.e., zero wavevector mismatch between resonant frequency mixing. The influence of that part
the fields) in the limit of a simple three-level system. of the medium refraction which is due to other levels
As a result of these features a large enhancement of is contained in the terms with Dk. In the case of a
the conversion efficiency in this type of scheme can completely transparent medium this becomes a major
be achieved. limit to the conversion efficiency.
To compute the generated field strength Maxwell’s
equations must be solved using these expression for
the susceptibility to describe the absorption, refrac- Propagation and Wave-Mixing in
tion, and nonlinear coupling in the medium. We will
treat this within the slowly varying envelope approxi-
a Doppler Broadened Medium
mation since the fields are cw or nanosecond pulses. Doppler shifts arising from the Maxwellian velocity
To be specific we assume that the susceptibilities are distribution of the atoms in the medium lead to
time independent, i.e., that we are in the steady-state a corresponding distribution in the detunings for
382 ELECTROMAGNETICALLY INDUCED TRANSPARENCY
the various members of the atomic ensemble. The response of the medium, as characterized by the susceptibilities, must therefore include the Doppler effect by performing a weighted sum over possible detunings. The weighting is determined by the Gaussian form of the Maxwellian velocity distribution. From this operation the effective values of the susceptibilities at a given frequency are obtained, and these quantities can be used to calculate the generated field. This step is of considerable practical importance, as in most up-conversion schemes it is not possible to achieve Doppler-free geometries, and the use of laser-cooled atoms, although in principle possible, limits the atomic density that can be employed. The interference effects persist in the dressed profiles provided the coupling laser Rabi frequency is comparable to or larger than the inhomogeneous width. This is because the Doppler profile follows a Gaussian distribution, which falls off much faster in the wings of the profile than the Lorentzian profile due to natural broadening.

In the case considered, with a weak probe field, the excited-state populations and coherences remain small. The two-photon transition need not be strongly driven (i.e., a small two-photon Rabi frequency can be used), but a strong coupling laser is required. The coupling laser must be intense enough that its Rabi frequency is comparable to or exceeds the inhomogeneous widths in the system (i.e., the Doppler width), and a laser intensity above 1 MW cm⁻² is required for a typical transition. This is trivially achieved even for unfocused pulsed lasers, but does present a serious limit for cw lasers unless a specific Doppler-free configuration is employed. The latter is not normally suitable for a frequency up-conversion scheme if a significant up-conversion factor is required, e.g., to the vacuum ultraviolet (VUV); however, recent experiments report significant progress in cw frequency up-conversion using EIT, and likewise a number of other possibilities, e.g., laser-cooled atoms and standing-wave fields, have been proposed.

A transform-limited single-mode laser pulse is essential for the coupling laser field, since a multimode field will cause an additional dephasing effect on the coherence, resulting in a deterioration of the quality of the interference. In contrast, whilst it is advantageous for the field driving the two-photon transition to be single mode (in order to achieve optimal temporal and spectral overlap with the EIT hole induced by the dressing laser), this is not essential, since this field does not need to drive the coherence responsible for interference.

When a pulsed laser field is used, additional issues must be considered. The group velocity is modified for pulses propagating in the EIT medium; large reductions in the group velocity, e.g., by factors down to ~c/100, have been observed. Another consideration beyond that found in the simple steady-state case is that the medium can only become transparent if the pulse contains enough energy to dress all the atoms in the interaction volume. The minimum pulse energy to prepare a transparency is:

E_preparation = (f13/f23) N L ℏω  [18]

where fij are the oscillator strengths of the transitions and NL is the product of the density and the length. Essentially, the number of photons in the pulse must exceed the number of atoms in the interaction volume to ensure all atoms are in the appropriate dressed state. This puts additional constraints on the laser pulse parameters.

Up-conversion to the UV and vacuum UV has been enhanced by EIT in a number of experiments. Only pulsed fields have so far been up-converted to the VUV with EIT enhancement. The requirement of a minimum value of Ωc > ΔDoppler constrains the conversion efficiency that can be achieved, because the 1/Ωc² factor in eqn [17] ultimately leads to diminished values of χ(3). The use of gases of higher atomic weight at low temperatures is therefore highly desirable in any experiment utilizing EIT for enhancement of four-wave mixing to the VUV. Conversion efficiencies, defined in terms of the pulse energies by Ed/Ea or Ed/Ec, of a few percent have been achieved using the EIT enhancement technique. It is typically most beneficial to maximize the conversion efficiency defined by the first ratio, since ωa is normally in the UV and Ea is the lower energy of the two applied pulses.

Nonlinear Optics with a Pair of Strong Coupling Fields in Raman Resonance

An important extension of the EIT concept occurs when two strong fields are applied in Raman resonance between a pair of states in a three-level system. Considering the system illustrated in Figure 1, we can imagine that both applied fields are now strong. Under appropriate adiabatic conditions the system evolves to produce the maximum possible value for the coherence ρ12 = 0.5. Adiabatic evolution into the maximally coherent state is achieved by adjusting either the Raman detuning or the pulse sequence (counterintuitive order). The pair of fields may also be in single-photon resonance with a
The slope of the dispersion profile leads to a reduced group velocity vg:

1/vg = 1/c + (π/λ)(∂χ(1)/∂ω)  [20]

From the expression for ∂χ/∂ω we see that this slope is steepest (and so vg minimum) in the case where Ωc ≫ Γ2 and Ωc² ≫ Γ2Γ3, but Ωc is still small compared to Γ3 (i.e., Ωc < Γ3), and hence ∂χ/∂ω ∝ 1/Ωc². In the limit of small Ωc the following expression for vg therefore holds:

vg = ℏcε0Ωc² / (2ωd|μ13|²N)  [21]

Extremely low group velocities, down to a few meters per second, are achieved in this limit using excitation of the hyperfine-split ground states in either laser-cooled atomic samples or Doppler-free configurations in finite-temperature samples. Recently, similar light slowing has been observed in solids. Storage of the optical pulse within the medium has also been achieved by adiabatically switching off the coupling field, thus trapping the optical excitation as an excitation within the hyperfine ground states, for which the storage time can be very long (>1 ms) due to very low dephasing rates. Since the storage scenario should be valid even for single photons, this process has recently attracted considerable attention as a means to enable quantum information storage.

Extremely low values of Ωc are sufficient to induce complete transparency (albeit in a very narrow dip), and at this location the nonlinear response is resonantly enhanced. Very high-efficiency nonlinear frequency mixing and phase conjugation at low light levels have been reported under these conditions. It is expected that high-efficiency nonlinear optical processes will persist to ultralow intensities (the few-photon level) in an EIT medium of this type.

One example of highly enhanced nonlinear interactions is the predicted large values of the Kerr-type nonlinearity (nonlinear refractive index). The origin of this can be seen by considering the steep dispersion profile in the region of the transparency dip in Figure 4. Imagine that we apply an additional field ωf, perhaps a very weak one, at a frequency close to resonance between state |2⟩ and a fourth level |4⟩. The ac Stark shift caused by this new field to the other three levels, although small, will have a dramatic effect upon the value of the refractive index because of the extreme steepness of the dispersion profile. Essentially, even a very small shift of the resonant wavelength causes a large change in the refractive index for a field applied close to this frequency. It is predicted that strong cross-phase modulations will be created in this process between the fields ωf and ωd, even in the quantum limit for these fields. This is predicted to lead to improved schemes for quantum nondemolition measurements of photons through the measurement of the phase shifts they induce (through cross-phase modulation) on another quantum field. This type of measurement has direct application in quantum information processing.

See also
Nonlinear Optics, Applications: Phase Matching; Raman Lasers. Scattering: Raman Scattering.

Further Reading
Arimondo E (1996) Coherent population trapping in laser spectroscopy. Progress in Optics 35: 257–354.
Harris SE (1997) Electromagnetically induced transparency. Physics Today 50: 36–42.
Harris SE and Hau LV (1999) Non-linear optics at low light levels. Physical Review Letters 82: 4611–4614.
Harris SE, Field JE and Imamoglu A (1990) Non-linear optical processes using electromagnetically induced transparency. Physical Review Letters 64: 1107–1110.
Hau LV, Harris SE, Dutton Z and Behroozi CH (1999) Light speed reduction to 17 metres per second in an ultra-cold atomic gas. Nature 397: 594–598.
Marangos JP (2001) Electromagnetically induced transparency. In: Bass M, et al. (ed.) Handbook of Optics, vol. IV, ch. 23. New York: McGraw-Hill.
Merriam AJ, Sharpe SJ, Xia H, et al. (1999) Efficient gas-phase generation of vacuum ultra-violet radiation. Optics Letters 24: 625–627.
Schmidt H and Imamoglu A (1996) Giant Kerr nonlinearities obtained by electromagnetically induced transparency. Optics Letters 21: 1936–1938.
Scully MO (1991) Enhancement of the index of refraction via quantum coherence. Physical Review Letters 67: 1855–1858.
Scully MO and Zubairy MS (1997) Quantum Optics. Cambridge, UK: Cambridge University Press.
Sokolov AV, Walker DR, Yavuz DD, et al. (2001) Femtosecond light source for phase-controlled multi-photon ionisation. Physical Review Letters 87: 033402.
Zhang GZ, Hakuta K and Stoicheff BP (1993) Nonlinear optical generation using electromagnetically induced transparency in hydrogen. Physical Review Letters 71: 3099–3102.
ENVIRONMENTAL MEASUREMENTS / Doppler Lidar 385
ENVIRONMENTAL MEASUREMENTS
Contents
Doppler Lidar
Hyperspectral Remote Sensing of Land and the Atmosphere
Laser Detection of Atmospheric Gases
Optical Transmission and Scatter of the Atmosphere
Aerosol particles, the other component of the atmosphere that scatters laser light, produce Mie scattering, which applies when the diameter of the scatterers is not orders of magnitude smaller than the incident wavelength. Aerosol particles present in the atmosphere include dust, soot, smoke, and pollen, as well as liquid water and ice. Although for Mie scattering the relationship between the power backscattered by an ensemble of aerosol particles and the incident wavelength is not simply characterized, most studies have shown that the backscattered energy increases roughly in proportion to the first or second power of the inverse of the incident wavelength, depending on the characteristics of the scattering aerosol particles. In a polluted environment, such as in the vicinity of urban areas, an abundance of large particles often results in a roughly linear relationship between the inverse of the incident wavelength and the backscattered energy, while in more pristine environments, such as the free troposphere, the inverse-wavelength/backscatter relationship can approach or exceed a square law.

The primary objective in Doppler lidar is to measure the Doppler frequency shift of the scattered radiation introduced by the movements of the scattering particles. Figure 1 shows a typical spectrum of the radiation collected at the lidar receiver for a volume of atmosphere irradiated by a monochromatic laser pulse. Molecular scattering produces the broadband distribution in Figure 1, where the broadening results from the Doppler shifts of the radiation backscattered from molecules moving at their thermal velocities. The width of the molecular velocity distribution in the atmosphere ranges from ~320 to 350 m s⁻¹, scaling as the square root of the temperature. In the center of the spectrum is a much narrower peak resulting from scattering of the light by aerosol particles. Because the thermal velocity of the much larger aerosol particles is very low, the width of the aerosol return is determined by the range of velocities of particles moved about by small-scale turbulence within the scattering volume; this is typically on the order of a few m s⁻¹. Also shown in Figure 1 is an additional broadband distribution, due to scattered solar radiation collected at the receiver. If the laser source is not monochromatic, the backscattered signal spectrum is additionally broadened, the resulting spectrum being the convolution of the spectrum shown in Figure 1 with the spectrum of the laser pulse. As seen in the figure, the entire spectrum is Doppler-shifted in frequency relative to the frequency of the laser pulse. The object of a Doppler lidar system is to measure this shift, given by δf = 2v_rad/λ, where v_rad is the component of the mean velocity of the particles in the direction of propagation of the lidar pulse and λ is the laser wavelength.

Elements of a Doppler Lidar

Doppler lidar systems can be designed primarily to measure winds from aerosol-scattered radiation, molecule-scattered radiation, or both. The type of system places specific requirements on the primary components that comprise a Doppler lidar system. A Doppler lidar consists of a laser transmitter to produce pulses of energy that irradiate the atmospheric volume of interest, a receiver which collects the backscattered photons and estimates the energy and Doppler shift of the return, and a beam-pointing mechanism for directing the laser beam and receiver field of view together in various directions to probe different atmospheric volumes. Independent of whether the primary scatterers are molecules or aerosol particles, the system design criteria for a Doppler system are driven by a fundamental relationship between the error in the estimate of mean frequency shift δf1, the bandwidth of the return f2 (proportional to the distribution of velocities), and the number of incident backscattered photons detected, N:

δf1 ∝ f2/N^(1/2)  [1]

Thus, the precision of the velocity estimate is improved by increasing the number of detected signal photons and/or decreasing the bandwidth of the backscattered signal. It is obvious from the equation that a significantly greater number of photons are required to achieve the same precision in a Doppler measurement from a molecular backscattered signal,

Figure 1 Distribution of wind velocities in a lidar return (relative intensity versus wavelength: a narrow aerosol return superimposed on a broad molecular return, above a flat background-light level), showing Doppler shift resulting from wind-induced translation of the scatterers. The ratio of aerosol to molecular signals increases with increasing wavelength.
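The shift–velocity relation quoted above is easily inverted; a minimal sketch (the wavelength and velocity are chosen only for illustration):

```python
def doppler_shift_hz(v_rad_ms, wavelength_m):
    """Doppler shift of a lidar return: df = 2 * v_rad / lambda."""
    return 2.0 * v_rad_ms / wavelength_m

def radial_velocity_ms(shift_hz, wavelength_m):
    """Inverse relation: v_rad = df * lambda / 2."""
    return shift_hz * wavelength_m / 2.0

# A 10 m/s radial wind seen by a 2 um coherent lidar
shift = doppler_shift_hz(10.0, 2.0e-6)       # 1.0e7 Hz, i.e. a 10 MHz shift
v = radial_velocity_ms(shift, 2.0e-6)        # recovers 10.0 m/s
print(shift, v)
```

Note the factor of two: the Doppler shift is acquired once on the outbound pulse and once on backscatter, so shorter wavelengths give proportionally larger shifts for the same wind.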
Although measurements at wavelengths near 1.06 μm have been demonstrated, coherent Doppler lidar systems used for regular atmospheric probing operate in the eye-safe, infrared portion of the spectrum at wavelengths longer than 1.5 μm. Currently, the two most common system wavelengths for coherent lidar wind systems are in the atmospheric window spectral regions around 2 and 10.6 μm. Early coherent Doppler lidar measurements, beginning in the 1970s, employed CO2 laser transmitters and local oscillators operating at wavelengths near 10.6 μm. Pulsed CO2 laser transmitters with as much as 10 J of energy have since been demonstrated, and systems with 1 J lasers have probed the atmosphere to ranges of 30 km or more. In the late 1980s, solid-state laser transmitters operating near 2 μm wavelengths were introduced into coherent lidar wind-measuring systems. The compact size and potential reliability of solid-state transmitters, in which the transmitter laser is optically pumped by an array of laser diodes, provide advantages over larger CO2 laser technology. Also, because for a given measurement accuracy the obtainable range resolution is proportional to wavelength, 2 μm instruments have an enhanced capability to probe small-scale features. However, although development of higher-energy, single-frequency coherent lidar sources operating in the 2 μm region, as well as at 1.5 μm, is currently an active research area, solid-state lasers with pulse energies greater than several tens of mJ have yet to be incorporated into lidar systems for atmospheric probing.

Applications of Coherent Doppler Lidar

Coherent lidars have been used to measure winds for a variety of applications, and from an assortment of platforms, such as ships and aircraft. Since these lidars operate in the infrared, where aerosol scattering dominates molecular scattering, they require aerosol particles to be present at some level to obtain usable returns. Although clouds also provide excellent lidar targets, most of the more useful applications of coherent lidars involve probing the atmospheric boundary layer or lower troposphere, where aerosol content is highest. Because of the capability to scan the narrow lidar beam directly adjacent to terrain, a unique application of lidar probing is the measurement of wind structure and evolution in complex terrain such as mountains and valleys. Over the past two decades, Doppler lidar studies have been used, for example, to study the structure of damaging windstorms on the downwind side of mountain ranges, the advection of pollution by drainage flows from valleys, and the formation of mountain lee-side turbulence as a potential hazard to landing aircraft. The interactions of wind flows with complex terrain can produce dangerous conditions for transportation, especially aircraft operations. Figure 3 shows a strong mountain wave measured by a Doppler lidar near Colorado Springs, during an investigation of the effects of downslope winds and turbulence on aircraft operations at the Colorado Springs airport. The mountains are on the right of the figure, where the strong wave, with a wavelength of about 6 km, can be seen. Note the reversal in winds near the location where the wave descends to the surface. The Colorado Springs study was prompted by the crash of a passenger jet while landing at the airport during a period when the winds were down the slope of the Rocky Mountains just to the west of the airport.

A Doppler lidar was recently deployed in an operational mode for wind shear detection at the new Hong Kong International Airport. Because winds flowing over mountainous terrain just south of the
Figure 3 Vertical scan of the winds in a mountain wave measured by a 10.6 μm coherent Doppler lidar near Colorado Springs, Colorado. Courtesy L. Darby, NOAA.
airport can produce mountain waves and channeled flows, wind shear is frequently encountered during landing and takeoff operations. Figure 4 shows a gust front situated just to the west of the airport, characterized by a sharp wind change (which would produce a corresponding change in airspeed) approaching 13 m s⁻¹ over a distance of less than one kilometer. Providing a capability to detect such events near the airport and warn pilots during approach or preparation for takeoff was the primary reason for deployment of the lidar. Within the first year of operation the Hong Kong lidar was credited with improving wind shear detection rates and providing more timely warnings for pilots of the potentially hazardous conditions. As of August 2003 this lidar had logged more than 10 000 hours of operation.

Because coherent Doppler lidars are well matched to applications associated with probing small-scale, turbulent phenomena, they have also been deployed at airports to detect and track wing-tip vortices generated by arriving or departing aircraft on nearby parallel runways. In the future, a network of ground-based lidars could provide information on vortex location and advection speed as well as wind shear, leading to a potential decrease in congestion at major airports. Also, compact lidar systems looking directly ahead of research aircraft have shown the capability to detect wind changes associated with potentially hazardous clear-air turbulence. In the future, a Doppler lidar installed on the commercial aircraft fleet could potentially look ahead and provide a warning to passengers to fasten seat belts before severe turbulence is encountered.

The high resolution obtainable in a scanning lidar enables visualization of finely structured wind and turbulence layers. Figure 5 shows an image of turbulence associated with a nocturnal low-level jet just 50 m above the surface, obtained at night over flat
Figure 4 Horizontal depiction of the radial component of the wind field measured at Hong Kong Airport by a 2.02 μm Doppler lidar during a gust front passage, showing strong horizontal shear just west of the airport. Courtesy C. M. Shun, Hong Kong Observatory.
Figure 5 Turbulence structure within a nocturnal stable layer at a height of 50 m. Note the stretching of the vertical scale. Courtesy
R. Banta, NOAA.
terrain in Kansas. Although a low-level jet was present on nearly every evening during the experiment, similar images obtained at different times by the Doppler lidar illustrated markedly different characteristics, such as a wide variation in the observed mechanical turbulence along the interface. Such observations enable researchers to improve parameterizations of turbulence in models and to better understand the conditions associated with vertical turbulent transport and mixing of atmospheric constituents such as pollutants.

By deploying Doppler lidars on moving platforms such as ships and aircraft, the spatial coverage of the measurements can be greatly increased. Although removal of platform motion and changes in orientation is not trivial, an aircraft-mounted lidar can map out such features as the low-level jet and the structure of boundary-layer convective plumes. Figure 6 shows horizontal and vertical motions associated with convective plumes measured along an approximately 60 km path over the southern Great Plains of the United States. From such measurements, estimates of turbulence intensity, integral scale, and other boundary-layer characteristics can be computed.

Direct Detection Doppler Lidar

Description

Direct detection, or incoherent, Doppler lidar has received significant attention in recent years as an alternative to coherent lidar for atmospheric wind measurements. In contrast to coherent lidar, in which an electrical signal is processed to estimate Doppler shift, an optical interferometer, usually a Fabry–Perot etalon, serves as the principal element in a direct detection lidar receiver for determining the frequency shift of the backscattered radiation.

One implementation of a direct-detection Doppler lidar receiver is the 'fringe imaging' technique. In this design, the interferometer acts as a spectrum analyzer. The backscattered radiation is directed through a
Figure 6 Vertical motion in convective plumes measured by a vertically pointing airborne coherent Doppler lidar. Total horizontal
extent of the measurements is approximately 60 km, reddish colors correspond to upward motions.
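Correcting airborne or shipborne measurements for platform motion amounts, at its simplest, to subtracting the projection of the platform velocity onto the beam direction from each radial estimate. A minimal sketch of that step (the vectors and numbers are hypothetical; operational systems must also handle attitude changes and timing):

```python
def motion_corrected_radial(v_measured, platform_velocity, beam_unit):
    """Remove the platform's own motion from a measured radial velocity.

    v_measured: radial velocity seen by the lidar (m/s, positive away from the lidar)
    platform_velocity: (vx, vy, vz) of the aircraft or ship in Earth coordinates (m/s)
    beam_unit: unit vector of the beam direction in the same coordinates
    """
    # Component of platform motion along the beam, which contaminates the return
    projection = sum(p * b for p, b in zip(platform_velocity, beam_unit))
    return v_measured - projection

# Hypothetical example: aircraft flying at 100 m/s along x, beam pointing straight down,
# so the beam is perpendicular to the motion and no correction is needed
vertical = motion_corrected_radial(5.0, (100.0, 0.0, 0.0), (0.0, 0.0, -1.0))
# Same aircraft with the beam pointing forward: the full 100 m/s must be removed
forward = motion_corrected_radial(5.0, (100.0, 0.0, 0.0), (1.0, 0.0, 0.0))
print(vertical, forward)
```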
Figure 7 Schematic of a Fabry–Perot etalon in a direct detection, fringe-imaging lidar receiver. Courtesy P. Hays, Michigan Aerospace Corporation.
Fabry–Perot interferometer, which produces a ring pattern in the focal plane (Figure 7). The spectral content information of the incident radiation is contained in the radial distribution of the light. Each ring corresponds to an order of the interferometer and is equivalent to a representation of the backscattered signal frequency spectrum. As the mean frequency of the backscattered radiation changes, the rings move inward or outward from the center. To extract a spectrum of the backscattered light, one or
more of the rings are imaged onto a two-dimensional detector, and the resulting pattern is analyzed. The rings can either be detected using a multi-element ring detector, or converted to a linear pattern with a circle-to-line converter optic and then imaged onto a charge-coupled device array.

An alternative direct detection receiver configuration, generally called the double-edge technique, is to employ two interferometers as bandpass filters, with the center wavelength of each filter set above and below the laser transmitter wavelength, as shown in Figure 8. The incoming radiation is split between the two interferometers, and the wavelength shift is computed by examining the ratio of the radiation transmitted by each interferometer. Both the double-edge and fringe-imaging techniques have demonstrated wind measurements to heights well into the stratosphere. The major challenge associated with the double-edge receiver is optimizing the instrument when both aerosol- and molecular-scattered radiation are present, since in general the change in transmission as a function of velocity is different for the aerosol and molecular signals. Double-edge receivers optimized for both aerosol and molecular returns place the bandpass of the etalon filters at the precise wavelength where the change in transmission with change in Doppler shift is the same for both aerosol and molecular returns.

For both types of direct detection receivers described above, much of the radiation incident on the interferometer is reflected out of the system, reducing the overall efficiency of the receiver. Recently, designs that incorporate fiber optics to collect a portion of the reflected radiation and 'recycle' it back into the etalon have been demonstrated as a method to improve the transmission efficiency of the etalon in a fringe-imaging receiver.

Doppler Wind Measurements Based on Molecular Scatter

One of the primary advantages of direct detection Doppler lidar is its capability for measurements based on scatter from atmospheric molecules. Measurement of Doppler shifts from molecular-scattered radiation is challenging because of the large Doppler-broadened bandwidth of the return. Because one wants to measure a mean wind velocity with a precision of a few m s⁻¹ or better, on the order of 10⁵ photons are required. Some combination of multiple-pulse averaging, powerful lasers, and large receiver optics is required to obtain these high photon counts from backscattered returns. Molecular-scatter wind measurements have been demonstrated in the visible spectral region at 532 nm wavelength as well as at 355 nm in the near ultraviolet. The ultraviolet region has the dual advantages of enhanced molecular scatter and less restrictive laser eye-safety requirements. Figure 9 shows the time series of a wind profile measured in the troposphere from both aerosols and clouds using a molecular-scatter, ground-based 355 nm wavelength Doppler fringe-imaging lidar. The figure shows measurements from receiver channels optimized for the wideband molecular signal and the narrowband aerosol return. For this measurement, backscattered photons were collected by a 0.5 m receiver aperture, averaged for 1 minute, and processed. In the absence of clouds, direct detection Doppler lidars have measured wind profiles continuously from the surface to beyond 15 km height. Figure 10 shows that the estimated wind error for the same 355 nm lidar is less than 1 m s⁻¹ to about 10 km, and less than 4 m s⁻¹ at 15 km height.
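The double-edge ratio described above can be sketched numerically. In this toy model (all filter offsets, widths, and spectra are assumed values for illustration, not instrument parameters), two Gaussian edge filters sit symmetrically about the transmitted laser frequency, and the normalized difference of the two transmitted signals grows monotonically with the Doppler shift of a Gaussian return:

```python
import numpy as np

def edge_ratio(doppler_shift_hz, edge_offset_hz=1.7e9, filter_fwhm_hz=1.8e9,
               signal_fwhm_hz=2.6e9):
    """Normalized double-edge response (T+ - T-)/(T+ + T-) for a Gaussian return."""
    f = np.linspace(-2e10, 2e10, 40001)  # frequency grid relative to the laser (Hz)

    def gauss(center_hz, fwhm_hz):
        sigma = fwhm_hz / 2.3548  # FWHM -> standard deviation
        return np.exp(-0.5 * ((f - center_hz) / sigma) ** 2)

    signal = gauss(doppler_shift_hz, signal_fwhm_hz)  # Doppler-shifted return spectrum
    # Energy transmitted through the upper and lower edge filters
    t_plus = np.sum(signal * gauss(+edge_offset_hz, filter_fwhm_hz))
    t_minus = np.sum(signal * gauss(-edge_offset_hz, filter_fwhm_hz))
    return (t_plus - t_minus) / (t_plus + t_minus)

# Positive radial velocity -> positive shift -> more light through the upper filter
r_neg = edge_ratio(2 * (-50.0) / 355e-9)  # df = 2*v/lambda at 355 nm
r_zero = edge_ratio(0.0)
r_pos = edge_ratio(2 * (+50.0) / 355e-9)
print(r_neg, r_zero, r_pos)
```

The sign and magnitude of the ratio therefore encode the radial velocity; an instrument inverts a calibrated version of this curve rather than the idealized one above.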
[Figure: molecular wind lidar concept — signal versus frequency, showing a Doppler-broadened molecular return of width ΔνD centered on the laser frequency.]

Heterodyne and Direct-Detection Doppler Trade-Offs

Lively debates within the lidar community have occurred over the past decade regarding the relative merits of heterodyne versus direct detection Doppler lidars. To a large extent, the instruments are complementary. Generally, heterodyne instruments are much more sensitive when significant aerosols are present. Coherent lidar processing techniques have
Figure 9 Time series of radial wind speed profiles measured by a 355 nm fringe-imaging direct detection lidar aerosol channel (top) and molecular channel (bottom) at Mauna Loa Observatory, HI. Change in wind speed and blocking of the return by clouds is clearly seen. Courtesy C. Nardell, Michigan Aerospace Corp.
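The ~10⁵-photon figure quoted for molecular measurements follows directly from eqn [1] earlier in this article: the velocity error scales as the spectral width divided by √N, so reaching ~1 m s⁻¹ from a molecular return whose width corresponds to hundreds of m s⁻¹ requires N ≈ (width/target)². A sketch with illustrative numbers:

```python
def required_photons(spectral_width_ms, target_precision_ms):
    """Photon count implied by eqn [1], to within a constant of order unity."""
    return (spectral_width_ms / target_precision_ms) ** 2

# Molecular return: thermal width ~330 m/s; aerosol return: turbulent width ~2 m/s
n_molecular = required_photons(330.0, 1.0)  # ~1e5 photons for 1 m/s precision
n_aerosol = required_photons(2.0, 1.0)      # only a handful of photons' statistics
print(n_molecular, n_aerosol)
```

The contrast between the two numbers is why coherent (aerosol-based) systems can work with far weaker returns, while molecular systems need pulse averaging, powerful lasers, and large apertures.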
Further Reading
Banta RM, Darby LS, Kaufmann P, Levinson DH and Zhu C-J (1999) Wind flow patterns in the Grand Canyon as revealed by Doppler lidar. Journal of Applied Meteorology 38: 1069–1083.
Frehlich R (1997) Coherent Doppler lidar measurements of winds in the weak signal regime. Applied Optics 36: 3491–3499.
Gentry B, Chen H and Li SX (2000) Wind measurements with a 355 nm molecular Doppler lidar. Optics Letters 25: 1231–1233.
Grund CJ, Banta RM, George JL, et al. (2001) High-resolution Doppler lidar for boundary-layer and cloud research. Journal of Atmospheric and Oceanic Technology 18: 376–393.
Huffaker RM and Hardesty RM (1996) Remote sensing of atmospheric wind velocities using solid state and CO2 coherent laser systems. Proceedings of the IEEE 84: 181–204.
Menzies RT and Hardesty RM (1989) Coherent Doppler lidar for measurements of wind fields. Proceedings of the IEEE 77: 449–462.
Post MJ and Cupp RE (1990) Optimizing a pulsed Doppler lidar. Applied Optics 29: 4145–4158.
Rees D and McDermid IS (1990) Doppler lidar atmospheric wind sensor: reevaluation of a 355 nm incoherent Doppler lidar. Applied Optics 29: 4133–4144.
Rothermel J, Cutten DR, Hardesty RM, et al. (1998) The multi-center airborne coherent lidar atmospheric wind sensor. Bulletin of the American Meteorological Society 79: 581–599.
Skinner WR and Hays PB (1994) Incoherent Doppler lidar for measurement of atmospheric winds. Proceedings of SPIE 2216: 383–394.
Souprayen C, Garnier A, Hertzog A, Hauchecorne A and Porteneuve J (1999) Rayleigh–Mie Doppler wind lidar for atmospheric measurements. I: Instrumental setup, validation, first climatological results. Applied Optics 38: 2410–2421.
Souprayen C, Garnier A and Hertzog A (1999) Rayleigh–Mie Doppler wind lidar for atmospheric measurements. II: Mie scattering effect, theory, and calibration. Applied Optics 38: 2422–2431.
[Figure: hyperspectral data concept — images taken simultaneously in 100–200 spectrally contiguous and spatially registered spectral bands, with reflectance spectra spanning 0.4–2.5 µm.]
of radiance emitted from a perfectly emitting material These molecular vibrations, and overtones of those
of the same temperature. vibrations, are less energetic processes than the
Materials covering the Earth's surface, or gases in the atmosphere, can be identified in hyperspectral data on the basis of absorption features, or bands, in the spectra recorded by the sensor. Absorption bands result from the preferential absorption of energy in some wavelength interval. The principal types of processes by which light can be absorbed include, in order of decreasing energy of the process, electronic charge transfers, electronic crystal field effects, and molecular vibrations. The former two processes are observed in minerals and manmade materials that contain transition group metal cations, most usually iron. Charge transfer absorptions are the result of cation-to-anion or cation-to-cation electron transfers. Crystal field absorptions are explainable by the quantum mechanical theory of atomic structure, wherein an atom's electrons are contained in orbital shells. The transition group metals have incompletely filled d orbital shells. When a transition metal cation, such as iron, is surrounded by anions, and potentially other cations, an electric field, known as the crystal field, exists. Crystal field absorptions occur when radiant energy causes that cation's orbital shell energy levels to be split by interaction with the crystal field. The reflectance spectra of different iron-bearing minerals are unique, because the iron cation has a unique positioning with respect to the anions and other cations that compose the mineral.

A material's component molecules have bonds to other molecules and, as a result of interaction with radiant energy, these bonds can stretch or bend. These vibrations are lower in energy than the electronic absorption processes described above, and so the resulting absorption features occur at longer wavelengths. Fundamental vibrational features occur in the MWIR and LWIR. Overtones and combinations of these vibrations are manifested as weaker absorption features in the SWIR. These overtone features in the SWIR include absorptions diagnostic of carbonate and certain clay minerals. Absorption features caused by vibrational overtones of C–H, O–H, and N–H stretches are also manifested in the SWIR and these absorption features are characteristic of important vegetative biochemical components such as cellulose, lignin, starch, and glucose. An example of the effect of the changes in reflected energy recorded at a sensor, due to the absorption of incident energy by green vegetation and other surface materials, is provided in Figure 2.

Laboratory spectrometers have been used for many years to analyze the diagnostic absorptions of materials caused by the above effects. More recently, spectrometers were mounted on telescopes to determine the composition of the Moon and the other solid bodies in our solar system. Technology progressed to where profiling spectrometers (instruments that measure a successive line of points on the surface) could be mounted on aircraft. The next logical step was the construction of imaging spectrometers (hyperspectral sensors) that measured spectra in two spatial dimensions. In fact, the data produced by a hyperspectral sensor are often thought of as an image cube (Figure 3) because they consist of three dimensions: two spatial and one spectral.
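The cube arrangement can be sketched with NumPy. The dimensions below follow the AVIRIS scene size quoted later in the article; the array itself is a zero-filled stand-in for real sensor data:

```python
import numpy as np

# Hypothetical cube with AVIRIS scene dimensions (quoted later in the
# article): 512 lines x 614 samples x 224 bands of 16-bit integers.
lines, samples, bands = 512, 614, 224
cube = np.zeros((lines, samples, bands), dtype=np.int16)

spectrum = cube[100, 200, :]   # one spatial sample -> a 224-band spectrum
band_image = cube[:, :, 50]    # one band -> a 512 x 614 image

size_mb = cube.nbytes / 1e6    # matches the 140.8 Mbyte figure in the text
```

Fixing one band yields a single-band image; fixing one spatial position yields a full spectrum, which is the view most of the processing algorithms below operate on.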
ENVIRONMENTAL MEASUREMENTS / Hyperspectral Remote Sensing of Land and the Atmosphere 397
Figure 2 Portion of a scene from the Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) from Utah that includes a basalt flow, circularly irrigated grasses, sparsely vegetated plains, and damp alluvial sediments. AVIRIS channels centered at 0.45, 0.55, 0.81, and 2.41 µm are shown. Green vegetation is brightest in the 0.81 µm band, which samples the peak of the near infrared plateau (see Figure 7) just beyond the 'red edge'. The green vegetation and damp sediments are darkest in the 2.41 µm band where water has a very low reflectance (note also Figure 8). The bright patches in the basalt flow in the 2.41 µm image are occurrences of oxidized red cinders which have a high reflectance in the SWIR.
Figure 3 Visualization of a hyperspectral image cube. A 1.7 µm image band from the Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) over the Lunar Crater Volcanic Field, Nevada is shown with each of the 224 rightmost columns and 224 topmost lines arrayed behind it. In the lower right-hand corner, the low reflectance of ferric oxide-rich cinders in the blue and green shows up as dark in the stacking of columns, with the high reflectance in the infrared showing up as bright.
Data Processing Approaches

An early impediment to the widespread use of hyperspectral data was the sheer volume of the data sets. For instance, a single scene from NASA's Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) is 614 samples by 512 lines by 224 bands. With its short integer storage format, such a scene takes up 140.8 Mbytes. However, since the dawn of the hyperspectral era in the 1980s, computing power has expanded immensely and computing tasks which once seemed prohibitive are now relatively effortless. Also, numerous processing algorithms have been developed which are especially well suited for use with hyperspectral data.

Figure 4 Parameters that describe an absorption band (plotted as reflectance versus wavelength; labeled features include the band minimum and the full width at half maximum (FWHM)).

Spectral Matching

In one class of algorithms, the spectrum associated with each spatial sample is compared against one or a group of model spectra and some similarity metric is applied. The model spectrum can be measured by a laboratory or field portable spectrometer of pure materials or can be a single pixel spectrum, or average of pixel spectra, covering a known occurrence of the material(s) to be mapped. A widely used similarity metric is the Spectral Angle Mapper (SAM). SAM is the angle, θ, obtained from the dot product of the image pixel spectrum, t, and the library spectrum, r, as expressed by

θ = cos⁻¹[(t · r)/(||t|| ||r||)]   [1]

SAM is insensitive to brightness variations and determines similarity based solely on spectral shape. Lower θ values mean more similar spectra.

In a related approach, matching is done, not on the entire spectrum, but rather on characteristic absorption features of the materials of interest. In Figure 4, the geometry of an absorption band is plotted. To map a given mineral, with a specific absorption feature, a high resolution laboratory spectrum is resampled to the spectral resolution of the hyperspectral sensor. The spectrum is subsampled to match the specific absorption feature (i.e., only bands from the short wavelength shoulder of the absorption to the long wavelength shoulder are included). A straight line continuum (calculated based on a line between the two band shoulders) is divided out from both the laboratory and the sensor spectra. Contrast between the library and the sensor spectra is mitigated through the use of an additive constant. This constant is incorporated into a set of equations which are solved through the use of standard least squares. The result of such a band-fitting algorithm is two data numbers per spatial sample of the hyperspectral scene. First the band depth (or, alternatively, band area) of the feature is calculated, and second, a goodness-of-fit parameter is calculated. These two parameters are most often combined in a band-depth times goodness-of-fit image. More sophisticated versions of the band-mapping algorithm target multiple absorption features and are, essentially, expert systems.
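Eqn [1] can be implemented in a few lines of NumPy; as a hedged sketch, the function name and the test spectra below are invented for illustration:

```python
import numpy as np

def spectral_angle(t, r):
    """Spectral Angle Mapper (eqn [1]): angle between an image pixel
    spectrum t and a library spectrum r; smaller angles indicate more
    similar spectral shapes."""
    t = np.asarray(t, dtype=float)
    r = np.asarray(r, dtype=float)
    cos_theta = np.dot(t, r) / (np.linalg.norm(t) * np.linalg.norm(r))
    # Clip to guard against round-off outside [-1, 1]
    return np.arccos(np.clip(cos_theta, -1.0, 1.0))

# SAM is insensitive to brightness: scaling a spectrum leaves the angle at 0.
ref = np.array([0.2, 0.4, 0.6, 0.4])
assert np.isclose(spectral_angle(ref, 3.0 * ref), 0.0)
```

The final assertion illustrates the brightness-insensitivity property noted in the text: a uniformly scaled copy of a spectrum has zero angle to the original.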
Spectral Mixture Analysis

A fundamentally different methodology for analyzing hyperspectral data sets is to model the measured spectrum of each spatial sample as a linear combination of endmember spectra. This methodology is based on the fact that the ground area corresponding to any given spatial sample likely will be covered by more than one material, with the consequence that the measured reflectance or emittance spectrum is a mixed pixel spectrum. The objective of linear SMA is to try to determine the fractional abundance of the component materials, or endmembers. The basis of linear spectral unmixing is to model the response of each spatial element as a linear combination of endmember spectra. The basis equation for linear SMA is

r(x, y) = aM + ε   [2]

where r(x, y) is the relative reflectance spectrum for the pixel at position (x, y), a is the vector of endmember abundances, M is the matrix of endmember spectra, and ε is the vector of residuals between the modeled and the measured reflectances.
Application of SMA results in a series of fraction images for each endmember wherein the data numbers range, ideally, between 0 and 1. Fraction image pixels with a digital number (DN) of 0 are taken to be devoid of the endmember material. Fraction image pixels with a DN of 1.0 are taken to be completely covered with the endmember material.

Techniques related to SMA have been developed which also map the abundance of a target material on a per pixel basis. These include implementations highly similar to SMA such as Orthogonal Subspace Projection (OSP) wherein, as with SMA, all endmembers must be determined a priori. Other approaches such as Constrained Energy Minimization (sometimes referred to as 'Matched Filter') do not require a priori knowledge of the endmembers. Instead, only the reflectance or emittance spectrum of the target need be known and the undesired spectral background is estimated, and ultimately compensated for, using the principal eigenvectors of the sample correlation or covariance matrix of the hyperspectral scene.
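A minimal sketch of eqn [2] solved by ordinary least squares; the endmember spectra and abundances below are invented, and a real implementation would typically add sum-to-one and non-negativity constraints on the abundances:

```python
import numpy as np

# Hypothetical endmember matrix M (2 endmembers x 4 bands); rows are
# endmember spectra. All numbers are invented for illustration.
M = np.array([[0.90, 0.80, 0.70, 0.60],   # e.g., dry soil
              [0.10, 0.30, 0.50, 0.20]])  # e.g., green vegetation

a_true = np.array([0.6, 0.4])             # fractional abundances
r_pix = a_true @ M                        # noise-free mixed-pixel spectrum

# Solve r = aM in the least-squares sense (lstsq expects the unknowns
# as a column vector, hence the transpose). The residuals correspond
# to the vector eps in eqn [2]; here they are zero by construction.
a_est, residuals, rank, sv = np.linalg.lstsq(M.T, r_pix, rcond=None)
```

With a noise-free mixed pixel the estimated abundances recover the true fractions exactly, which is the idealized behavior of the fraction images described above.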
Expert Systems and Artificial Neural Networks

Another means of extracting information from hyperspectral data sets is to use computer approaches that in some way mimic the human thought process. This can take the form of an expert system which applies a set of rules or tests as each pixel spectrum is successively analyzed. For example, if a spectrum has an absorption at 2.2 µm it could tentatively be classified as being caused by a clay mineral. Additionally, if the absorption is a doublet, the specific clay mineral is likely a kaolinite. In practice, expert system codes are lengthy and complex although excellent results can be demonstrated from their use, albeit at the expense of processing time. Another approach in this vein is the use of an artificial neural network (ANN) for data processing. The use of ANNs is motivated by their power in pattern recognition. ANN architectures are well suited for a parallel processing approach and thus have the potential for rapid data processing.
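The rule-based logic described above can be caricatured in a few lines; the band positions, tolerance, and return labels are all invented for this sketch, and real expert-system codes are far more elaborate:

```python
# Toy caricature of the expert-system rules in the text: an absorption
# near 2.2 um suggests a clay mineral, and a doublet there suggests
# kaolinite. Inputs are band centers (micrometres) from a hypothetical
# feature-finding step.
def classify_spectrum(absorption_centers_um, doublet_near_2_2=False):
    has_2_2 = any(abs(c - 2.2) < 0.05 for c in absorption_centers_um)
    if has_2_2 and doublet_near_2_2:
        return "kaolinite (likely)"
    if has_2_2:
        return "clay mineral (tentative)"
    return "unclassified"

assert classify_spectrum([2.21], doublet_near_2_2=True) == "kaolinite (likely)"
assert classify_spectrum([0.95]) == "unclassified"
```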
Hyperspectral Remote Sensing of the Land

Geology

Early efforts with airborne hyperspectral sensors focused on geologic remote sensing. This was, at least in part, a consequence of the mineralogic diversity of the Earth's surface and the fact that many minerals have absorption features which are unique and diagnostic of the mineral's identity. The expectation, which was soon borne out by results from these early sensors, was that, given exposure of the surface, mineralogic maps as detailed as the spatial resolution of the sensor could be derived from hyperspectral data. As some of the processing techniques discussed above have become more readily available, it has become possible to produce mineralogic maps from hyperspectral data far more rapidly and cost-effectively than geologic maps produced by standard means (e.g., by walking over the ground and manually noting which lithologies are underfoot).

The propensity for transition metals, most especially iron, to absorb energy through charge transfer and crystal field effects, is noted above. Fe–O and Fe²⁺–Fe³⁺ charge transfers cause a profound absorption in the reflectance spectrum of Fe-bearing minerals shortwards of 0.4 µm. The wing of this absorption causes the low blue and green reflectance of Fe-bearing minerals. Crystal field bands cause an absorption feature in the 0.9 to 1.0 µm region. The ability to discriminate subtle differences among Fe-bearing minerals in hyperspectral data has proven extremely valuable in applications such as mineral exploration, volcanology, and in the characterization of abandoned mine lands.

In the SWIR, absorptions are caused by vibrational overtones of molecular bonds within minerals. These include absorptions in the 2.2 and 2.3 µm region which are characteristic of many Al- and Mg-bearing clay minerals. Absorptions in the SWIR are narrower in width than the Fe-generated crystal field bands discussed previously. In fact, the requirement to efficiently resolve these vibrational overtone absorption bands (which have full width at half maximum (FWHM) bandwidths of 20 to 40 nm) helped to drive the selection of the nominal 10 nm bandwidth of early and most current hyperspectral sensors. Certain clay minerals can be indicators of hydrothermal activity associated with economic mineral deposits; thus, the SWIR is an important spectral region for mineral exploration. Clay minerals, by definition, have planar molecular structures that are prone to failure if subjected to shearing stresses. The ability to uniquely identify and map these minerals, using hyperspectral data, has thus been used to good effect to identify areas on volcanoes and other hydrothermally altered terrains that could be subject to landslides.

Minerals with adsorbed or molecularly bound OH⁻ and/or water have vibrational overtone absorptions near 1.45 and 1.9 µm, although these features are masked in remotely sensed data by prominent atmospheric water vapor bands at 1.38 and 1.88 µm. The reflectance of water and OH-bearing minerals decreases to near zero at wavelengths longwards of
While different minerals are, by definition, generally composed of diverse component molecules, different

Figure 7 Major spectral features in the reflectance spectrum of green vegetation.
0.7 and 0.78 µm. Past the red edge and into the SWIR, the spectrum of green vegetation is dominated by water in the leaf, with leaf water absorptions occurring at 0.97, 1.19, 1.45, 1.93, and 2.50 µm. Studies of vegetation, using multispectral systems, have made use of a number of broadband indices. Hyperspectral systems allow for the discrimination of individual absorption features and subtle spectral shape differences in vegetation spectra which are unresolvable using broadband multispectral systems. For example, the absorptions caused by chlorophyll can be used for studies of chlorophyll content in vegetative canopies. The unique interaction of chlorophyll with leaf structures can provide a morphology to the green peak that is unique to a given species and thus is mappable using a spectral feature fitting approach.

A desired goal in the study of vegetation and ecosystem processes is to be able to monitor the chemistry of forest canopies. The foliar biochemical constituents making up forest canopies have associated absorption features in the SWIR that result from vibrational overtones and combinations of C–O, O–H, C–H, and N–H molecular bonds. However, the absorptions from these biochemical components overlap so that the chemical abundance of any one plant component cannot be directly related to any one absorption feature. An even more serious challenge is introduced by the strong influence of water in the SWIR in green vegetation spectra. Water constitutes 40 to 80% of the weight of leaves. However, the unique characteristics of high-quality hyperspectral data (e.g., excellent radiometric and spectral calibration, high signal-to-noise ratio, high spectral resolution) have been used to detect even these subtle features imprinted on the stronger water features.

The amount of water present in plant leaves is, in itself, a valuable piece of information. Hyperspectral data can be used to determine the equivalent water thickness present in vegetative canopies. Conversely, the amount of dry plant litter and/or loose wood can be estimated using spectral mixture analysis and related techniques. Taken together, all these data (vegetation species maps, determinations of leaf water content, and the relative fractions of live vegetation versus litter) can be used to characterize woodlands for forest fire potential. The ability to map different species of vegetation has also proved useful in the field of agriculture, for assessing the impact of invasive weed species on farm and ranch land. Being able to assess the health of crops, in terms of leaf water content, is also important for agriculture.

Another subtle vegetative spectral feature that provides information on plant vitality is the aforementioned red edge. Shifts in the position of the red edge have been linked to vegetation stress. Studies have shown that vegetation stress and consequent shifts in the position of the red edge can be caused by a number of factors, including insufficient water intake or the intake of pernicious trace metals. Vegetation growing over mineral deposits can display red edge shifts and this can be used to assist in mineral exploration efforts.

Another application area in which hyperspectral remote sensing has found great utility is in the analysis of snow-covered areas. Over the reflective solar spectrum, the reflectance of snow varies from values near zero in the SWIR to values near one in the blue. Differences in the grain size of snow result in differences in reflectance, as is illustrated in Figure 8. The spectral variability of snow makes it easily distinguishable from other Earth surface materials.
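One common way to exploit the red edge, sketched here on a synthetic sigmoidal spectrum, is to locate the wavelength of maximum slope; the spectral model and band spacing below are assumptions made for this sketch, not a published algorithm from the article:

```python
import numpy as np

# Synthetic vegetation spectrum with a sigmoidal red edge near 0.72 um.
# The logistic shape, window, and 5 nm sampling are invented inputs.
wl = np.linspace(0.60, 0.90, 61)                        # wavelength, um
refl = 0.05 + 0.45 / (1.0 + np.exp(-(wl - 0.72) / 0.01))

slope = np.gradient(refl, wl)      # first derivative of the spectrum
red_edge = wl[np.argmax(slope)]    # wavelength of maximum slope
```

Tracking this derivative-maximum position across a scene is one simple way a shift of the red edge, of the kind linked above to vegetation stress, could be quantified.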
Figure 8 Differences in snow reflectance spectra resulting from differences in snow grain size. Courtesy of Dr Anne Nolin of Oregon
State University.
Linear spectral unmixing, using hyperspectral data, has been shown to successfully discriminate between snow, vegetation, rock, and clouds and to accurately map snow cover fraction in mixed pixels. Accurate maps of snow cover, types of snow, and fractions of liquid water admixed with snow grains are required for forecasting snowmelt runoff and stream discharge in watersheds dominated by snow cover.

Hyperspectral Remote Sensing of the Atmosphere

The physical quantity most often sought in land remote sensing studies is surface reflectance. However, the quantity recorded by a hyperspectral sensor is radiance at the entrance aperture of the sensor. In order to obtain reflectance, the background energy level of the Sun and/or Earth must be removed and the scattering and absorbing effects of the atmosphere must be compensated for. In the VNIR through SWIR, there are seven atmospheric constituents with significant absorption features: water vapor, carbon dioxide, ozone, nitrous oxide, carbon monoxide, methane, and oxygen. Molecular scattering (commonly called Rayleigh scattering) is strong in the blue but decreases rapidly with increasing wavelength. Above 1 µm, its effect is negligible. Scattering caused by atmospheric aerosols, or Mie scattering, also is more prominent at shorter wavelengths and decreases with increasing wavelength, but the dropoff of effects from Mie scattering is not as profound as that for Rayleigh scattering. Consequently, effects from aerosol scattering can persist into the SWIR.

There are three primary categories of methods for removing the effects of atmosphere and solar insolation from hyperspectral imagery, in order to derive surface reflectance. Methods of atmospheric correction can be considered as being either an image-based, empirical, or model-based approach. An image-based, or 'in-scene', approach uses only data measured by the instrument. Empirical methods make use of the remotely sensed data in combination with field measurements of reflectance, to solve a simplified equation of at-sensor radiance such as eqn [3]:

Ls = Ar + B   [3]

where Ls is the at-sensor radiance, r is the reflectance of the surface, and A and B are quantities that incorporate, respectively, all multiplicative and additive contributions to the at-sensor radiance. All the quantities expressed in eqn [3] can be considered as varying as a function of wavelength, λ. Approximations for A and B from eqn [3] constitute a set of gains and offsets derived from the empirical approach.
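A minimal sketch of the gain/offset inversion of eqn [3]; all numbers below are invented for illustration:

```python
import numpy as np

# Per-band gains A and offsets B as would be derived from calibration
# targets in an empirical correction; values are invented.
A = np.array([120.0, 150.0, 90.0])    # multiplicative term, per band
B = np.array([10.0, 12.0, 8.0])       # additive (path radiance) term

r_true = np.array([0.25, 0.40, 0.10])
Ls = A * r_true + B                   # simulated at-sensor radiance, eqn [3]

r_recovered = (Ls - B) / A            # invert the gain/offset relation
```

Because A and B are derived from targets in one part of the scene, the same pair is applied everywhere, which is exactly the limitation the model-based approaches described next are designed to avoid.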
In some empirical approaches, atmospheric path radiance is ignored and only a multiplicative correction is applied.

Model-based approaches seek to model what the at-sensor radiance should be on a pixel-by-pixel basis, including the contribution of the atmosphere. The at-sensor radiance, Ls, at any wavelength, λ, can be expressed as

Ls = (1/π)(Er + MT)τθ + Lu   [4]

where E is the irradiance at the surface of the Earth, r is the reflectance of the Earth's surface, MT is the spectral radiant exitance of the surface at temperature T, τθ is the transmissivity of the atmosphere at zenith angle θ, and Lu is the spectral upwelling path radiance of the atmosphere. The ability to solve eqn [4] and do atmospheric correction on a pixel-by-pixel basis is appealing, in that it negates the shortcomings of an empirical approach where a correction based on calibration targets in one part of a scene might not be appropriate for another part of the scene, due to differences in atmospheric pathlength or simple atmospheric heterogeneity. Model-based approaches are also able to take advantage of the greater spectral dimensionality of hyperspectral data sets for the derivation of atmospheric properties (e.g., the amount of column water vapor, CO2 band depths, etc.) directly from the data.
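Rearranging eqn [4] for surface reflectance gives r = (π(Ls - Lu)/τθ - MT)/E. A sketch with invented values follows; a real model-based code would obtain E, τθ, and Lu from a radiative-transfer model on a per-pixel, per-wavelength basis:

```python
import numpy as np

# Invert eqn [4] for surface reflectance:
#   r = (pi * (Ls - Lu) / tau - MT) / E
# All numbers are invented; MT (the thermal term) is set to zero, as
# it is negligible in the reflective solar region.
def reflectance_from_radiance(Ls, E, MT, tau, Lu):
    return (np.pi * (Ls - Lu) / tau - MT) / E

E, MT, tau, Lu = 1500.0, 0.0, 0.8, 20.0
r_true = 0.3
Ls = (E * r_true + MT) * tau / np.pi + Lu   # forward model, eqn [4]
r_est = reflectance_from_radiance(Ls, E, MT, tau, Lu)
```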
The main atmospheric component affecting imagery in the reflective solar portion of the spectrum is water vapor. Model-based atmospheric correction techniques determine the column water vapor for each pixel in the scene, based on the depth of the 940 and/or 1130 nm atmospheric water bands. Thus for each pixel, an appropriate amount of water can be removed. The maps of atmospheric water vapor distribution are themselves of interest to atmospheric scientists. Absorption features caused by well-mixed gases, such as the 760 nm O2 band, can be used by model-based programs to produce images of approximate scene topography. More advanced model-based approaches also solve for the scattering effects of aerosols on the hyperspectral data.

As noted above, atmospheric constituents, including industrial effluents, can be detected and spatially mapped by hyperspectral sensors. The fundamental vibrational absorption of gases of interest occurs in the MWIR and the LWIR. Overtones of these gaseous molecular vibrations occur in the VNIR to SWIR, but are generally too weak to be detected. The ability to detect anthropogenically produced gases depends on a number of factors, including the
ENVIRONMENTAL MEASUREMENTS / Laser Detection of Atmospheric Gases 403
temperature difference between the gas cloud and the background atmosphere, the concentration of gas within the cloud, and the size of the cloud (e.g., the path length of light through the cloud).

See also

Imaging: Infrared Imaging. Instrumentation: Spectrometers.

Further Reading

Clark RN (1999) Spectroscopy of rocks and minerals, and principles of spectroscopy. In: Rencz AN (ed.) Remote Sensing for the Earth Sciences, pp. 3–57. New York: John Wiley and Sons.
Farrand WH and Harsanyi JC (1997) Mapping the distribution of mine tailings in the Coeur d'Alene River Valley, Idaho through the use of a Constrained Energy Minimization technique. Remote Sensing of Environment 59: 64–76.
Gao B-C and Goetz AFH (1995) Retrieval of equivalent water thickness and information related to biochemical components of vegetation canopies from AVIRIS data. Remote Sensing of Environment 52: 155–162.
Gao B-C, Heidebrecht KH and Goetz AFH (1993) Derivation of scaled surface reflectances from AVIRIS data. Remote Sensing of Environment 44: 165–178.
Goetz AFH (1989) Spectral remote sensing in geology. In: Asrar G (ed.) Theory and Applications of Optical Remote Sensing, pp. 491–526. New York: John Wiley & Sons.
Goetz AFH, Vane G, Solomon JE and Rock BN (1985) Imaging spectrometry for Earth remote sensing. Science 228: 1147–1153.
Hapke B (1993) Theory of Reflectance and Emittance Spectroscopy. Cambridge, UK: Cambridge University Press.
Kirkland LE, Herr KC and Salisbury JW (2001) Thermal infrared spectral band detection limits for unidentified surface materials. Applied Optics 40: 4852–4862.
Martin ME, Newman SD, Aber JD and Congalton RG (1998) Determining forest species composition using high spectral resolution remote sensing data. Remote Sensing of Environment 65: 249–254.
Painter TH, Roberts DA, Green RO and Dozier J (1998) The effect of grain size on spectral mixture analysis of snow-covered area with AVIRIS data. Remote Sensing of Environment 65: 320–332.
Pieters CM and Englert PAJ (eds) Remote Geochemical Analysis: Elemental and Mineralogical Composition. Cambridge, UK: Cambridge University Press.
Pinzon JE, Ustin SL, Castaneda CM and Smith MO (1997) Investigation of leaf biochemistry by hierarchical foreground/background analysis. IEEE Transactions on Geoscience and Remote Sensing 36: 1–15.
Roberts DA, Green RO and Adams JB (1997) Temporal and spatial patterns in vegetation and atmospheric properties using AVIRIS. Remote Sensing of Environment 62: 223–240.
Wessman CA, Aber JD and Peterson DL (1989) An evaluation of imaging spectrometry for estimating forest canopy chemistry. International Journal of Remote Sensing 10: 1293–1316.
detector noise, and this type of error can be reduced by signal averaging. Systematic errors arise from uncompensated instrument and atmospheric effects and must be carefully considered in system design and operation and data processing.

Another approach, the Raman lidar approach, uses inelastic scattering (scattered/emitted light has a different wavelength than the illuminating light) from gases where the wavelength shift corresponds to vibrational or rotational energy levels of the molecules. Any illuminating laser wavelength can be used, but since the Raman scattering cross-section varies as λ⁻⁴, where λ is the wavelength, the shorter laser wavelengths, such as in the near UV spectral region, are preferred. While even shorter wavelengths in the solar blind region below ~300 nm would permit operation in daytime, the atmospheric attenuation, due to Rayleigh scattering and UV molecular absorption, limits the measurement range. High-power lasers are used due to the low Raman scattering cross-sections. The Raman lidar measurements have to be carefully calibrated against a profile measured using another approach, such as is done for water vapor (H2O) measurements with the launch of hygrometers on radiosondes, since the lidar system constants and atmospheric extinctions are not usually well known or modeled. In order to obtain mixing ratios of H2O with respect to atmospheric density, the H2O signals are ratioed to the nitrogen Raman signals. However, care must be exercised in processing these data due to the spectral dependences of atmospheric extinction.
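The ratioing step can be sketched as follows; the calibration constant and signal values are invented, and real processing would also correct for the differential atmospheric extinction noted above:

```python
# Sketch of the ratioing step: scale the H2O Raman signal by the N2
# Raman signal and a calibration constant (obtained, e.g., from a
# radiosonde comparison). Signal counts and the constant are invented,
# and spectral extinction differences between the two channels are
# ignored in this sketch.
def h2o_mixing_ratio(s_h2o, s_n2, cal_const):
    return cal_const * s_h2o / s_n2

q = h2o_mixing_ratio(s_h2o=2.0e3, s_n2=1.0e6, cal_const=500.0)
```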
Surface-Based Lidar Systems

Surface-based lidar systems measure the temporal evolution of atmospheric profiles of aerosols and gases. The first lidar systems used to remotely measure atmospheric gases were Raman lidar systems. The Raman lidar systems were thought to be very promising since one high-power laser could be used to measure a variety of gases. They are easier to develop and operate than DIAL systems, because the laser does not have to be tuned to a particular gas absorption feature. However, the Raman scattering cross-section is low, and the weak, inelastically scattered signal is easily contaminated by daylight background radiation, resulting in greatly reduced performance during daytime. Raman lidar systems have been developed primarily for measuring H2O and retrieving atmospheric temperature.

Surface-based UV DIAL systems that are part of the Network for the Detection of Stratospheric Change (NDSC) have made important contributions to the understanding of stratospheric O3. There are DIAL systems at many locations around the Earth, with sites in Ny-Ålesund, Spitzbergen (78.9°N, 11.9°E), Observatoire Haute Provence, France (43.9°N, 5.7°E), Table Mountain, California (34.4°N, 118.2°W), Mauna Loa, Hawaii (19.5°N, 155.6°W), and Lauder, New Zealand (45.0°S, 169.7°E). First established in the 1980s, these DIAL systems are strategically located so that O3 in different latitude bands can be monitored for signs of change. In addition, they can provide some profiles for comparison with space-based O3 measuring instruments. The parameters of a typical ground-based UV DIAL system used in the NDSC are given in Table 1.

Table 1 Parameters for the Jet Propulsion Laboratory's Mauna Loa DIAL system for stratospheric O3 measurements

Lasers: XeCl (on-line); Nd:YAG, 3rd harmonic (off-line)
  Wavelength (nm): 308 (on); 355 (off)
  Pulse energy (mJ): 300 (XeCl); 150 (Nd:YAG)
  Pulse repetition frequency (Hz): 200 (XeCl); 100 (Nd:YAG)
Receiver
  Area (m²): 0.79
  Optical efficiency (%): ~40
  Wavelengths, Rayleigh (nm): 308; 355
  N2 Raman (nm, for aerosol correction): 332; 387
System performance
  Measurement range* (km): 15–55
  Vertical resolution (km): 3 at bottom, 1 at O3 peak, 8–10 at top
  Measurement averaging time (hours): 1.5
  Measurement accuracy: <5% at peak, 10–15% at 15 km, >40% at 45 km

*Chopper added to block beam until it reaches 15 km in order to avoid near-field signal effects.

Airborne DIAL Systems

Airborne lidar systems expand the range of atmospheric studies beyond those possible by surface-based lidar systems, by virtue of being located in aircraft that can be flown to high altitudes and to remote locations. Thus, they permit measurements at locations inaccessible to surface-based lidar systems. In addition, they can make measurements of large atmospheric regions in times that are short compared with atmospheric motions, so that the large-scale patterns are discernible. Another advantage of airborne lidar operation is that lidar systems perform well in the nadir (down) direction since the atmospheric density and aerosol loading generally increase with decreasing altitude towards the surface, which helps to compensate for the R⁻² decrease in the lidar
signal with range. For the zenith direction, the advantage is that the airborne lidar system is closer to the region being measured.

Field Measurement Programs

Global O3 Measurements

The first airborne DIAL system, which was developed by the NASA Langley Research Center (LaRC), was flown for O3 and aerosol investigations in conjunction with the Environmental Protection Agency's Persistent Elevated Pollution Episodes (PEPE) field experiment, conducted over the east coast of the US in the summer of 1980. This initial system has evolved into the advanced UV DIAL system that will be described in the next section. Airborne O3 DIAL systems have also been developed and used in field measurement programs by several other groups in the United States, Germany, and France.

The parameters of the current NASA LaRC airborne UV DIAL system are given in Table 2. The on-line and off-line UV wavelengths of 288.2 and 299.6 nm are used for DIAL O3 measurements during tropospheric missions, and 301 and 310 nm are used for stratospheric missions. This system also transmits 1064 and 600 nm beams for aerosol and cloud measurements. The time delay between the on- and off-wavelength pulses is 400 µs, which is sufficient time for the return at the first set of wavelengths to end, but short enough that the same region of the atmosphere is sampled. This system has a demonstrated absolute accuracy for O3 measurements of better than 10% or 2 ppbv (parts per billion by volume), whichever is larger, and a measurement precision of 5% or 1 ppbv with a vertical resolution of 300 m and an averaging time of 5 minutes (about 70 km horizontal resolution at typical DC-8 ground speeds).

The NASA LaRC airborne UV DIAL systems have made significant contributions to the understanding of both tropospheric and stratospheric O3, aerosols, and clouds. These systems have been used in 18 international and 3 national field experiments over the past 24 years, and during these field experiments, measurements were made over, or near, all of the oceans and continents of the world. A few examples of the scientific contributions made by these airborne UV DIAL systems are given in Table 3.

The NASA LaRC airborne UV DIAL system has been used extensively in NASA's Global Tropospheric Experiment (GTE) program which, started in the early 1980s, had as its primary mission the study of tropospheric chemistry in remote regions of the Earth, in part to study and gain a better understanding of atmospheric chemistry in the unperturbed atmosphere. A related goal was to document the Earth's atmosphere in a number of places during seasons when anthropogenic influences are largely absent, then return to these areas later to document the changes. The field missions have included campaigns in Africa, Alaska, the Amazon Basin,
Table 3 Examples of significant contributions of airborne O3 DIAL systems to the understanding of tropospheric and stratospheric O3 (see also Figures 1–5)
Troposphere:
Air mass characterizations in a number of remote regions
Continental pollution plume characterizations (see Figures 1 and 2)
Study of biomass burn plumes and the effect of biomass burning on tropospheric O3 production (see Figure 3)
Case study of warm conveyor belt transport from the tropics to the Arctic
Study of stratospheric intrusions and tropopause fold events (see Figure 4)
Observation of the decay of a cutoff low
Power plant plume studies
Stratosphere:
Contributions to the chemical explanation of behavior of Antarctic O3
Quantification of O3 depletion in the Arctic (see Figure 5)
Polar stratospheric clouds – particle characterizations
Intercomparison with surface-based, airborne and space-based instruments
Quantification of O3 reduction in tropical stratosphere after the June 1991 eruption of Mount Pinatubo and characterization of the
tropical stratospheric reservoir edge
Cross-vortex boundary transport
Figure 1 Latitudinal distribution of ozone over the western Pacific Ocean obtained during the second Pacific Exploratory Mission
(PEM West B) in 1994.
Canada, and North America, and over the Pacific Ocean from Antarctica to Alaska, with concentrated measurements off the east coast of Asia and in the tropics. One of the bigger surprises of the two decades of field missions was the discovery in the mid-1990s that the tropical and South Pacific Ocean had very high tropospheric O3 concentrations in plumes from biomass burning in Africa and South America during the austral spring. There were few indications from surface-based or space-based measurements that there were extensive biomass burn plumes in the area, primarily since the plumes were largely devoid of aerosols, which were stripped out during cloud convective lofting. Once over the ocean, horizontal transport appears to proceed relatively unimpeded unless a storm system is encountered.

The NASA LaRC airborne UV DIAL system has also been flown in all the major stratospheric O3 campaigns, starting in 1987 with the Airborne Antarctic Ozone Experiment (AAOE) to determine the cause of the Antarctic ozone hole. The UV DIAL system documented the O3 loss across the ozone hole region. Later, when attention turned to the Arctic and the possibility of an ozone hole there, the system was used to produce an estimate of O3 loss during the winter season as well as to better characterize polar stratospheric cloud (PSC) particles. The UV DIAL system was also used to study O3 loss in the tropical stratospheric reservoir following the eruption of Mount Pinatubo in June 1991. The loss was spotted by the Microwave Limb Sounder (MLS) on the Upper Atmosphere Research Satellite (UARS), but the MLS
ENVIRONMENTAL MEASUREMENTS / Laser Detection of Atmospheric Gases 407
Figure 2 Pollution outflow from China over the South China Sea (right side of figure) with clean tropical air on the south side of a front (left side of figure), observed during PEM West B in 1994.
was unable to study the loss in detail due to its low vertical resolution (5 km) compared to the small-scale (2–3 km) features of the O3 loss. Other traditional space-based O3 measuring instruments also had difficulty during this period, due to the high aerosol loading in the stratosphere following the eruption.

Space-Based O3 DIAL System

In order to obtain nearly continuous, global distributions of O3 in the troposphere, a space-based O3 DIAL system is needed. A number of key issues could be addressed by a space-based O3 DIAL system, including: the global distribution of photochemical O3 production/destruction and transport in the troposphere; location of the tropopause; and stratospheric O3 depletion and dynamics. High-resolution airborne O3 DIAL and other aircraft measurements show that to study tropospheric processes associated with biomass burning, transport of anthropogenic pollutants, tropospheric O3 chemistry and dynamics, and stratosphere–troposphere exchange, a vertical profiling capability from space with a resolution of 2–3 km is needed, and this capability cannot currently be achieved using passive remote sensing satellite instruments. An example of the type of latitudinal O3 cross-section that could be provided by a space-based O3 DIAL system is shown in Figure 1. This figure shows many different aspects of O3 loss and production; vertical and horizontal transport; and stratosphere–troposphere exchange that occurs from the tropics to high latitudes.
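The on-line/off-line measurement principle used by these DIAL systems can be illustrated with the standard two-wavelength retrieval, in which the gas number density in each range bin is obtained from the ratio of the backscattered powers at the two wavelengths; the range-dependent backscatter and instrument terms cancel in the ratio. The sketch below is a minimal illustration with synthetic signals; the function name and numerical values are ours, not from the source:

```python
import numpy as np

def dial_number_density(p_on, p_off, delta_sigma_m2, delta_r_m):
    """Retrieve number density (molecules/m^3) from range-binned DIAL returns.

    p_on, p_off : backscattered power at the on-line and off-line
        wavelengths on the same range grid.
    delta_sigma_m2 : differential absorption cross-section (m^2).
    delta_r_m : range-bin spacing (m).
    """
    # Ratio of adjacent-bin returns; backscatter and geometry cancel
    ratio = (p_off[1:] * p_on[:-1]) / (p_on[1:] * p_off[:-1])
    return np.log(ratio) / (2.0 * delta_sigma_m2 * delta_r_m)

# Synthetic test: 1/R^2 geometry plus on-line absorption by a uniform gas
r = np.arange(1000.0, 10000.0, 100.0)       # range grid (m)
n_true = 2.5e18                              # molecules/m^3 (made up)
sigma = 1.0e-22                              # differential cross-section (m^2)
p_off = 1.0 / r**2
p_on = p_off * np.exp(-2.0 * sigma * n_true * r)
retrieved = dial_number_density(p_on, p_off, sigma, 100.0)
```

With noise-free synthetic signals the retrieval recovers the input density exactly; in practice the on/off pulse-pair spacing (400 μs above) ensures both wavelengths see essentially the same atmosphere.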
Figure 3 Biomass burning plume over the Atlantic Ocean arising from biomass burning in the central part of western Africa, observed
during the TRACE-A mission in 1992.
Figure 4 Ozone distribution observed on a flight across the US during the SASS (Subsonic Assessment) Ozone and Nitrogen Experiment (SONEX) in 1997. A stratospheric intrusion is clearly evident on the left side of the figure, and low-ozone air transported from the tropics to mid-latitudes can be seen in the upper troposphere on the right.
Figure 5 Ozone cross-sections in the stratosphere measured in the winter of 1999/2000 during the SOLVE mission. The change in ozone number density in the Arctic polar vortex due to chemical loss during the winter is clearly evident at latitudes north of 72°N.
system using an abandoned searchlight mirror was employed by the NASA Goddard Space Flight Center in the mid-1980s to show that if the signal collector (telescope) was large enough, useful Raman measurements could be made to distances of several kilometers. Until recently, Raman lidar measurements of H2O were largely limited to night-time, due to the high sunlight background interference. Measurements are now made during daytime, using very narrow telescope fields of view and very narrowband filters in the receiver. A good example of a Raman lidar system used to measure H2O is that at the Department of Energy's Atmospheric Radiation Measurement – Cloud and Radiation Test Bed (ARM-CART) site in Oklahoma. The site is equipped with a variety of instruments aimed at studying the radiation properties of the atmosphere. The system parameters are given in Tables 4a and 4b.

Raman lidar systems have been used for important measurements of H2O from a number of ground-based locations. Some of the important early work dealt with the passage of cold and warm fronts; one of these studies captured the arrival of a wedge-shaped cold front pushing the warm air up. Raman lidars have also been located at the ARM-CART site in Kansas and Oklahoma, where they could both provide a climatology of H2O and provide correlative measurements of H2O for validation of space-based instruments.

Table 4a Parameters for the US Department of Energy's surface-based Raman lidar system for measurements of H2O

Laser: Nd:YAG
Wavelength (nm): 355
Pulse energy (mJ): 400
Pulse repetition frequency (Hz): 30
Receiver:
  Area (m²): 1.1
  Wavelengths (nm): water vapor 407; nitrogen 387
System performance:
  Measurement range (km): night-time, near surface to 12; daytime, near surface to 3
  Range resolution (m): 39 at low altitudes, 300 above 9 km
  Measurement accuracy (%): night-time, <10 below 7 km (1 min avg), 10–30 at high altitudes (10–30 min avg); daytime, 10 below 1 km, 5–15 for 1–3 km (10 min avg)

Table 4b Parameters for the NASA Goddard Space Flight Center's scanning Raman lidar system for measurements of H2O

                                   Day/night        Night only
Laser:                             Nd:YAG           XeF
Wavelength (nm):                   355              351
Pulse energy (mJ):                 300              30–60
Pulse repetition frequency (Hz):   30               400
Receiver:
  Area (m²):                       1.8              1.8
  Water vapor wavelength (nm):     407              403
  Nitrogen wavelength (nm):        387              382
System performance:
  Measurement range (km): night-time, near surface to 12; daytime, near surface to 4
  Range resolution (m): 7.5 at low altitudes, 300 above 9 km
  Measurement accuracy (%): night-time, <10 below 5 km (10 s and 7.5 m range resolution), <10 below 7 km (1 min avg), 10–30 at high altitudes (10–30 min avg); daytime, <10 below 4 km (5 min avg)

H2O DIAL Systems

H2O was first measured with the DIAL approach using a temperature-tuned ruby laser lidar system in the mid-1960s. The first aircraft-based H2O DIAL system was developed at NASA LaRC and was flown in 1982 as an initial step towards the development of a space-based H2O DIAL system. This system was based on Nd:YAG-pumped dye laser technology, and it was used in the first airborne H2O DIAL atmospheric investigation, which was a study of the marine boundary layer over the Gulf Stream. This laser was later replaced with a flashlamp-pumped solid-state alexandrite laser, which had high spectral purity, i.e., little out-of-band radiation (a requirement since water vapor lines are narrow), and this system was used to make accurate H2O profile measurements across the lower troposphere.

A third H2O DIAL system, called LASE (Lidar Atmospheric Sensing Experiment), was developed as a prototype for a space-based H2O DIAL system and was completed in 1995. It was the first fully autonomously operating DIAL system. LASE uses a Ti:sapphire laser that is pumped by a double-pulsed, frequency-doubled Nd:YAG laser to produce pulses in the 815 nm absorption band of H2O (see Table 5). The wavelength of the Ti:sapphire laser is controlled by injection seeding with a diode laser that is frequency locked to a H2O line using an
absorption cell. Each pulse pair consists of an on-line and off-line wavelength for the H2O DIAL measurements. To cover the large dynamic range of H2O concentrations in the troposphere (over 3 orders of magnitude), up to three line pair combinations are needed. LASE uses a novel approach of operating from more than one position on a strongly absorbing H2O line. In this approach, the laser is electronically tuned to the line center, the side of the line, and near the wing of the line to achieve the required absorption cross-section pairs (on and off). LASE has demonstrated measurements of H2O concentrations across the entire troposphere using this 'side-line' approach. The accuracy of LASE H2O profile measurements was determined to be better than 6% or 0.01 g/kg, whichever is larger, over the full dynamic range of H2O concentrations in the troposphere. LASE has participated in over eight major field experiments since 1995. See Table 6 for a listing of topics studied using airborne H2O DIAL systems (Figures 6–8).

A space-based H2O DIAL system could be flown on a long-duration space mission this decade. Space-based DIAL measurements can provide a global H2O profiling capability which, when combined with passive remote sensing with limited vertical resolution, can lead to 3-dimensional measurements of global H2O distributions. High vertical resolution H2O (≤1 km), aerosol (≤100 m), and cloud top (≤50 m) measurements from the lidar along the satellite ground-track can be combined with the horizontally contiguous data from nadir passive sounders to generate more complete high-resolution H2O, aerosol, and cloud fields for use in the various studies indicated above. In addition, the combination of active and passive measurements can provide significant synergistic benefits leading to improved temperature and relative humidity measurements. There is also strong synergism with aerosol and cloud imaging instruments and with future passive instruments that are being planned or proposed for missions addressing atmospheric chemistry, radiation, hydrology, natural hazards, and meteorology.

Table 5 Parameters of the LASE H2O DIAL system

Laser: Ti:sapphire
Wavelength (nm): 813–818
Pulse-pair repetition frequency (Hz): 5 (on- and off-line pulses separated by 300 μs)
Pulse energy (mJ): 100
Linewidth (pm): <0.25
Stability (pm): <0.35
Spectral purity (%): >99
Beam divergence (mrad): <0.6
Pulse width (ns): 35
Receiver:
  Area (m²): 0.11
  Optical efficiency (%): 50 (night), 35 (day)
  Avalanche photodiode (APD) detector quantum efficiency (%): 80
  Field of view (mrad): 1.0
  Noise equivalent power (W Hz⁻⁰·⁵): 2 × 10⁻¹⁴
  Excess noise factor: 3
System performance:
  Measurement range (altitude) (km): 15
  Range resolution (m): 300–500
  Measurement accuracy (%): 5

Table 6 Examples of significant contributions of airborne H2O DIAL systems to the understanding of H2O distributions (see also Figures 6–8)

NASA Langley H2O DIAL systems, including LASE:
  Study of marine boundary layer over Gulf Stream
  Observation of H2O transport at a land/sea edge
  Study of large-scale H2O distributions across troposphere (see Figure 6)
  Correlative in situ and remote measurements
  Observations of boundary layer development (see Figure 7)
  Cirrus cloud measurements
  Hurricane studies (see Figure 8)
  Relative humidity effects on aerosol sizes
  Ice supersaturation in the upper troposphere
  Studies of stratospheric intrusions
  H2O distributions over remote Pacific Ocean
Other airborne H2O DIAL systems:
  Boundary layer humidity fluxes
  Lower-stratospheric H2O studies
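The side-line idea can be sketched with a toy Lorentzian line: tuning the on-line wavelength from the line center toward the wing lowers the effective absorption cross-section by roughly two orders of magnitude, which is how a single strong line can cover the three-decade dynamic range of tropospheric H2O. The line parameters below are illustrative only, not LASE's actual line data:

```python
import numpy as np

def lorentz_cross_section(detuning_cm, half_width_cm, peak_sigma):
    """Relative cross-section of a Lorentzian line at a given detuning (cm^-1)."""
    return peak_sigma * half_width_cm**2 / (half_width_cm**2 + detuning_cm**2)

# Illustrative tunings: line center, side of the line, and near the wing
half_width = 0.1   # cm^-1, an assumed pressure-broadened half-width
peak = 1.0         # normalized peak cross-section
for detuning, label in [(0.0, "line center"), (0.3, "side line"), (1.0, "near wing")]:
    sigma = lorentz_cross_section(detuning, half_width, peak)
    print(f"{label}: relative cross-section {sigma:.4f}")
```

For these assumed numbers the center-to-wing cross-section ratio is about 100, so each tuning position keeps the on-line optical depth in a usable range for a different decade of H2O concentration.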
Figure 6 LASE measurements of water vapor (left) and aerosols and clouds (right) across the troposphere on an ER-2 flight from
Bermuda to Wallops during the Tropospheric Aerosol Radiative Forcing Experiment (TARFOX) conducted in 1996.
Figure 7 Water vapor and aerosol cross-section obtained on a flight across a cold front during the Southern Great Plains (SGP) field
experiment conducted over Oklahoma in 1997.
Figure 8 Measurements of water vapor, aerosols, and clouds in the inflow region of Hurricane Bonnie during the 1998 Convection and Moisture Experiment (CAMEX-3). A rain band can be clearly seen at the middle of Leg-AB on the satellite and LASE cross-sections.
high sensitivity to the fact that the laser frequency is modulated at a high frequency, permitting a small spectral region to be scanned rapidly. The second- or fourth-harmonic of the scan frequency is used in the data acquisition, effectively eliminating much of the low-frequency noise due to mechanical vibrations and laser power fluctuations. In addition, multipass cells are employed, thereby generating long paths for the absorption measurements.

TDL systems have been used in a number of surface-based measurement programs. One system was mounted on a ship doing a latitudinal survey in the Atlantic Ocean and monitored NO2, formaldehyde (HCHO), and H2O2. Another TDL system was located at the Mauna Loa Observatory and was used to monitor HCHO and H2O2 during a photochemistry experiment, finding much lower concentrations of both gases than models had predicted. The TDL system used to measure HCHO has been used in a number of additional ground-based measurement programs and has also been flown on an aircraft in a couple of tropospheric missions.

One TDL system, called DACOM (Differential Absorption CO Measurement), has been used in a large number of NASA GTE missions. It makes measurements of CO in the 4.7 μm spectral region and CH4 in the 3.3 or 7.6 μm spectral region. This instrument is able to make measurements at a 1 Hz rate and with a precision of 0.5–2.0%, depending on the CO value, and an accuracy of ±2%. DACOM has been very useful in determining global distributions of CO due to the wide-ranging nature of the GTE missions. It has also contributed to
the characterization and understanding of CO in air masses encountered during the GTE missions (Table 7).

A pair of TDL instruments, the Airborne Tunable Laser Absorption Spectrometer (ATLAS) and the Aircraft (ER-2) Laser Infrared Absorption Spectrometer (ALIAS), have been flown on several NASA missions to explore polar O3 chemistry and atmospheric transport. The ATLAS instrument measures N2O, and ALIAS is a 4-channel spectrometer that measures a large variety of gases, including HCl, N2O, CH4, NO2, and CO, and is currently configured to measure water isotopes across the tropopause.

A new class of lasers, tunable quantum-cascade (QC) lasers, is being added to the list of those available for in situ gas measurement systems. Using a cryogenically cooled QC laser during a series of 20 aircraft flights beginning in September 1999 and extending through March 2000, measurements were made of CH4 and N2O up to ~20 km in the stratosphere over North America, Scandinavia, and Russia, on the NASA ER-2. Compared with its companion lead salt diode lasers, which were also flown on these flights, the single-mode QC laser, cooled to 82 K, produced higher output power (10 mW), narrower laser linewidth (17 MHz), increased measurement precision (a factor of 3), and better spectral stability (<0.1 cm⁻¹ K⁻¹). The sensitivity of the QC laser channel was estimated to correspond to a minimum-detectable mixing ratio of approximately 2 ppbv of CH4.

Laser Long-Path Measurements

Laser systems can also be used in long-path measurements of gases. The most important of such programs entailed the measurement of hydroxyl radical (OH) in the Rocky Mountains west of
Table 7 Examples of significant contributions of airborne tunable diode laser systems to the understanding of atmospheric chemistry
and transport
Balloon-borne – stratosphere
The first in situ measurements of the suite NO2, NO, O3, and the NO2 photolysis rate to test NOx (NO2 + NO) photochemistry
The first in situ stratospheric measurements of NOx over a full diurnal cycle to test N2O5 chemistry
In situ measurements of NO2 and HNO3 over the 20–35 km region to assess the effect of Mt. Pinatubo aerosol on heterogeneous
atmospheric chemistry
Measurements of HNO3 and HCl near 30 km
Measurements of CH4, HNO3, and N2O for validation of several satellite instruments
Intrusions from midlatitude stratosphere to tropical stratospheric reservoir
Aircraft mounted – troposphere
Carbon monoxide
Measurement of CO from biomass burning from Asia, Africa, Canada, Central America, and South America
Detection of thin layers of CO that were transported thousands of miles in the upper troposphere
Observed very high levels of urban pollution in plumes off the Asian continent
Emission indices for many gases have been calculated with respect to CO
Methane
Determined CH4 flux over the Arctic tundra, which led to rethinking of the significance of tundra regions as a global source of CH4
Found that biomass burning is a significant source of global CH4
Found strong enhancements of CH4 associated with urban plumes
Aircraft mounted – stratosphere
Polar regions
Extreme denitrification observed in Antarctic winter vortex from NOy:N2O correlation study
Observed very low N2O in Antarctic winter vortex, which helped refute the theory that the O3 hole is caused by dynamics
Contributed to the study of transport out of the lower stratospheric Arctic vortex by Rossby wave breaking
Measurement of concentrations of gases involved in polar stratospheric O3 destruction and production
Activation of chlorine in the presence of sulfate aerosols
Mid-latitudes
Measurements in aircraft exhaust plumes in the lower stratosphere, especially of reactive nitrogen species and CO
Vertical profiles of CO in the troposphere and lower stratosphere
Determination of the hydrochloric acid and the chlorine budget of the lower stratosphere
Measurement of NO2 for testing atmospheric photochemical models
Trends in HCl/Cly in the stratosphere at ~21 km, 1992–1998
Gas concentration measurements for comparison with a balloon-borne Fourier transform spectrometer observing the Sun
Near-IR TDL laser hygrometers (ER-2, WB57, DC-8) for measuring H2O and total water in the lower stratosphere and upper
troposphere
Remnants of Arctic winter vortex detected many months after breakup
Tropics
Tropical entrainment time scales inferred from stratospheric N2O and CH4 observations
Laser-Induced Fluorescence (LIF)

The laser-induced fluorescence approach has been used to measure several molecular and ionic species in situ. The LIF approach has been used to measure OH and HO2 (HOx) on the ground and on aircraft platforms. One airborne LIF system uses a diode-pumped Nd:YAG-pumped, frequency-doubled dye laser to generate the required energy near 308 nm. The laser beam is sent into a White cell where it can make 32–36 passes through the gas in the cell to increase the LIF signal strength. NO is used to convert HO2 to OH. The detection limit in 1 minute is about 2–3 ppqv (10⁻¹⁵) above 5 km altitude, which translates into a concentration of about 4 × 10⁴ molec/cm³ at 5 km and 2 × 10⁴ molec/cm³ at 10 km altitude. One of the interesting findings from such measurements is that HOx concentrations are up to 5 times larger than model predictions based on NOx concentrations, suggesting that NOx emissions from aircraft could have a greater impact on O3 production than originally thought.

NO is detected using the LIF technique in a two-photon approach: electrons are pumped from the ground state using 226 nm radiation and from that state to an excited state using 1.06 μm radiation. The 226 nm radiation is generated by frequency doubling a dye laser to 287 nm and then mixing that with 1.1 μm radiation derived from H2 Raman shifting of frequency-mixed radiation from a dye laser and a Nd:YAG laser. From the excited level, 187–201 nm radiation is emitted. In order to measure NO2, it is first converted to the photofragment NO via pumping at 353 nm from a XeF excimer laser. One of the interesting findings from airborne measurements in the South Pacific is that there appeared to be a large missing source for NOx in the upper troposphere.

See also

Imaging: Lidar. Scattering: Raman Scattering.

Further Reading

Browell EV (1994) Remote sensing of trace gases from satellites and aircraft. In: Calvert J (ed.) Chemistry of the Atmosphere: The Impact on Global Change, pp. 121–134. Cambridge, MA: Blackwell Scientific Publications.
Browell EV, Ismail S and Grant WB (1998) Differential absorption lidar (DIAL) measurements from air and space. Applied Physics B 67: 399–410.
Dabas A, Loth C and Pelon J (eds) (2001) Advances in Laser Remote Sensing. Selected Papers presented at the 20th International Laser Radar Conference (ILRC), Vichy, France, 10–14 July 2000, pp. 357–360. Palaiseau Cedex, France: A. Edition d'Ecole Polytechnique.
Goldsmith JEM, Bisson SE, Ferrare RA, et al. (1994) Raman lidar profiling of atmospheric water vapor: simultaneous measurements with two collocated systems. Bulletin of the American Meteorological Society 75: 975–982.
Goldsmith JEM, Blair FH, Bisson SE and Turner DD (1998) Turn-key Raman lidar for profiling atmospheric water vapor, clouds, and aerosols. Applied Optics 37: 4979–4990.
Grant WB (1995) Lidar for atmospheric and hydrospheric studies. In: Duarte F (ed.) Tunable Laser Applications, pp. 213–305. New York: Marcel Dekker Inc.
Grant WB, Browell EV, Menzies RT, Sassen K and She C-Y (eds) (1997) Laser Applications in Remote Sensing. SPIE Milestones series, 690 pp, 86 papers.
Grisar R, Boettner H, Tacke M and Restelli G (1992) Monitoring of Gaseous Pollutants by Tunable Diode Lasers. Proceedings of the International Symposium held in Freiburg, Germany, 17–18 October 1991, 372 pp. Dordrecht, The Netherlands: Kluwer Academic Publishers.
416 ENVIRONMENTAL MEASUREMENTS / Optical Transmission and Scatter of the Atmosphere
McDermid IS, Walsh TD, Deslis A and White M (1995) Optical systems design for a stratospheric lidar system. Applied Optics 34: 6201–6210.
Mégie G (1988) Laser measurements of atmospheric trace constituents. In: Measures RM (ed.) Laser Remote Chemical Analysis, pp. 333–408. New York: J Wiley & Sons.
Podolske JR and Loewenstein M (1993) Airborne tunable diode laser spectrometer for trace gas measurements in the lower stratosphere. Applied Optics 32: 5324–5330.
Pougatchev NS, Sachse GW, Fuelberg HE, et al. (1999) Pacific Exploratory Mission-Tropics carbon monoxide measurements in historical context. Journal of Geophysical Research 104: 26,195–26,207.
Sachse GW, Collins Jr JE, Hill GF, et al. (1991) Airborne tunable diode laser sensor for high precision concentration and flux measurements of carbon dioxide and methane. Proceedings of the SPIE International Society for Optical Engineering 1443: 145–156.
Svanberg S (1994) Differential absorption lidar (DIAL). In: Sigrist MW (ed.) Air Monitoring by Spectroscopic Techniques, pp. 85–161. New York: J Wiley & Sons.
Webster CR, et al. (1994) Aircraft (ER-2) laser infrared absorption spectrometer (ALIAS) for in situ measurements of HCl, N2O, CH4, NO2, and HNO3. Applied Optics 33: 454–472.
Whiteman DN, Melfi SH and Ferrare RA (1992) Raman lidar system for the measurement of water vapor and aerosols in the Earth's atmosphere. Applied Optics 31: 3068–3082.
Zanzottera E (1990) Differential absorption lidar techniques in the determination of trace pollutants and physical parameters of the atmosphere. Critical Reviews in Analytical Chemistry 21: 279–319.
Introduction

The ability to understand and model radiative transfer (RT) processes in the atmosphere is critical for remote sensing, environmental characterization, and many other areas of scientific and practical interest. At the Earth's surface, the bulk of this radiation, which originates from the sun, is found in the ultraviolet to infrared range between around 0.3 and 4 μm. This article describes the most significant RT processes for these wavelengths, which are absorption (light attenuation along the line of sight, LOS) and elastic scattering (redirection of the light). Transmittance, T, is defined as one minus the fractional extinction (absorption plus scattering). At longer wavelengths (in the mid- and long-wave infrared) the major light source is thermal emission. A few other light sources are mentioned here. The moon is the major source of visible light at night.

The change in radiance along the LOS is governed by the RT equation

(1/k) dI/du = −I + J   [1]

where I is the LOS radiance (watts per unit area per unit wavelength per steradian), k is the extinction coefficient for the absorbing and scattering species (per unit concentration per unit length), u is the material column density (in units of concentration times length), and J is the radiance source function. I is a sum of direct (i.e., from the sun) and diffusely scattered components. The source function represents the diffuse light scattered into the LOS, and is the angular integral over all directions Ωi of the product of the incoming radiance, I(Ωi), and the scattering phase function, p(Ωo, Ωi):

J(Ωo) = ∫ p(Ωo, Ωi) I(Ωi) dΩi   [2]

where the integral is taken over all incoming directions Ωi.
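For the special case of isotropic scattering, the phase function is constant, p = 1/(4π), and the source-function integral in eq [2] reduces to the average of the incoming radiance over the sphere. A minimal numerical sketch of eq [2] for this case (the function name and grid sizes are ours, not from the source):

```python
import numpy as np

def source_function_isotropic(radiance_fn, n_mu=64, n_phi=64):
    """Evaluate eq [2] for isotropic scattering, p = 1/(4*pi).

    radiance_fn(mu, phi) returns the incoming radiance for direction
    cosine mu = cos(zenith) and azimuth phi.
    """
    mu = np.linspace(-1.0, 1.0, n_mu)
    phi = np.linspace(0.0, 2.0 * np.pi, n_phi, endpoint=False)
    MU, PHI = np.meshgrid(mu, phi, indexing="ij")
    I = radiance_fn(MU, PHI)
    # Mean of the integrand times the solid-angle measure (dOmega = dmu dphi,
    # total 2 * 2*pi = 4*pi over the sphere)
    return np.mean(I / (4.0 * np.pi)) * (2.0 * 2.0 * np.pi)

# Sanity check: for a uniform radiance field I = I0, J must equal I0
J_uniform = source_function_isotropic(lambda mu, phi: np.ones_like(mu))
```

For an anisotropic phase function the same structure applies, with p(Ωo, Ωi) evaluated inside the integrand instead of the constant 1/(4π).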
For the direct (unscattered) radiance, setting J = 0 in eq [1] and integrating gives the familiar equation for transmittance:

T = I₀/I₀⁰ = exp(−ku)   [3]

where I₀⁰ is the direct radiance at the boundary of the LOS. The quantity ku = ln(1/T) is known as the optical depth.

Atmospheric Constituents

The atmosphere has a large number of constituents, including numerous gaseous species and suspended liquid and solid particulates. Their contributions to extinction are depicted in Figure 1. The largest category is gases, of which the most important are water vapor, carbon dioxide, and ozone. Of these, water vapor is the most variable and carbon dioxide the least, although the CO2 concentration is gradually increasing. In atmospheric RT models, carbon dioxide is frequently taken to have a fixed and altitude-independent concentration, along with other 'uniformly mixed gases' (UMGs). The concentration profiles of the three major gas species are very different. Water vapor is located mainly in the lowest 2 km of the atmosphere. Ozone has a fairly flat profile from the ground through the stratosphere (~30 km). The UMGs decline exponentially with altitude, with a scale height of around 8 km.

Gases

Gas molecules both scatter and absorb light. Rayleigh scattering by gases scales inversely with the fourth power of the wavelength, and is responsible for the sky's blue color. For typical atmospheric conditions, the optical depth for Rayleigh extinction is approximately 0.009/λ⁴ per air mass (λ is in μm and air mass is defined by the vertical column from ground to space). The Rayleigh phase function has a (1 + cos²θ) dependence.

Absorption by gases may consist of a smooth spectral continuum (such as the ultraviolet and visible electronic transitions of ozone) or of discrete spectral lines, which are primarily rotation lines of molecular vibrational bands. In the lower atmosphere (below around 25 km altitude), the spectral shape of these lines is determined by collisional broadening and described by the normalized Lorentz line shape formula:

f_Lorentz(ν) = (α_c/π)/(α_c² + (ν − ν₀)²)   [4]

Here ν is the wavenumber (in cm⁻¹) [ν = (10 000 μm/cm)/λ], ν₀ is the molecular line transition frequency, and α_c is the collision-broadened half-width (in cm⁻¹), which is proportional to pressure. The constant of proportionality, known as the pressure-broadening parameter, has a typical value on the order of 0.06 cm⁻¹ atm⁻¹ at ambient temperature. The extinction coefficient k(ν) is the product of f_Lorentz(ν) and the integrated line strength S, which is commonly in units of atm⁻¹ cm⁻².

At higher altitudes in the atmosphere the pressure is reduced sufficiently that Doppler broadening
Figure 1 Spectral absorbance (1 – transmittance) for the primary sources of atmospheric extinction. The 12 nm resolution data were
generated by MODTRAN for a vertical path from space with a mid-latitude winter model atmosphere.
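The quantities above fit together in a few lines of code: Beer's law (eq [3]) converts the Rayleigh optical depth estimate (~0.009/λ⁴ per air mass) into a transmittance, and eq [4] gives the normalized pressure-broadened line shape. A minimal sketch; the numerical coefficients are the article's approximations, while the function names are ours:

```python
import numpy as np

def rayleigh_optical_depth(wavelength_um, air_masses=1.0):
    """Approximate vertical Rayleigh optical depth, ~0.009 / lambda^4 (lambda in um)."""
    return 0.009 / wavelength_um**4 * air_masses

def transmittance(optical_depth):
    """Beer's law, eq [3]: T = exp(-ku), where ku is the optical depth."""
    return np.exp(-optical_depth)

def lorentz_shape(nu_cm, nu0_cm, alpha_c_cm):
    """Normalized Lorentz line shape, eq [4]; integrates to 1 over wavenumber."""
    return (alpha_c_cm / np.pi) / (alpha_c_cm**2 + (nu_cm - nu0_cm)**2)

# Example: Rayleigh extinction of direct sunlight at 0.55 um for one air mass
tau = rayleigh_optical_depth(0.55)   # ~0.098
t = transmittance(tau)               # ~0.91 direct transmittance
```

Multiplying by eq [4]'s line shape and a line strength S (atm⁻¹ cm⁻²) would give the monochromatic molecular extinction coefficient k(ν) discussed in the text.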
computational burden, as a large number of photons is spectral bin molecular attenuation into 3 com-
required for reasonable convergence; the Gaussian ponents:
error in the calculation declines with the square root of
the number of photons. The convergence problem is . Line center absorption from molecular transitions
moderated to a large extent by using the physics of the centered within the spectral bin;
problem being solved to bias the selection of photon . Line tail absorption from the tails of molecular
paths toward those trajectories which contribute lines centered outside of the spectral bin but within
most; mathematically, this is equivalent to requiring 25 cm21; and
that the integrand be sampled most often where its . H2O and CO2 continuum absorption from distant
contributions are most significant. (.25 cm21) lines.
Within the terrestrial atmosphere, only H2O and CO2
An Example Atmospheric RT Model: MODTRAN have sufficient concentrations and line densities to
MODTRAN, developed collaboratively by the Air warrant inclusion of continuum contributions. These
Force Research Laboratory and Spectral Sciences, Inc., is the most widely used atmospheric radiation transport model. It defines the atmosphere using stratified layering and computes transmittances, radiances, and fluxes using a moderate spectral resolution band model with IR through UV coverage. The width of the standard spectral interval, or bin, in MODTRAN is 1 cm⁻¹. At this resolution, spectral correlation among extinction sources is well characterized as random. Thus, the total transmittance from absorption and scattering of atmospheric particulates and molecular gases is computed as the product of the individual components.

Rayleigh, aerosol, and cloud extinction are all spectrally slowly varying and well represented by Beer's Law absorption and scattering coefficients on a 1 cm⁻¹ grid. Calculation of molecular absorption is more complex because of the inherent spectral structure and the large number of molecular transitions contributing to individual spectral bins. As illustrated in Figure 2, MODTRAN partitions the molecular absorption in each bin into line center, line tail, and continuum contributions. Continuum absorption features are relatively flat and accurately modeled using 5 cm⁻¹ spectral resolution Beer's Law absorption coefficients. Spectral bin contributions from neighboring line tails drop off, often rapidly, from their spectral bin edge values, but the spectral curves are typically simple, containing at most a single local minimum. For MODTRAN, these spectral contributions are pre-computed for a grid of temperature and pressure values, and fit with Padé approximants, specifically the ratio of quadratic polynomials in wavenumber. These fits are extremely accurate and enable line tail contributions to be computed on an arbitrarily fine grid. MODTRAN generally computes this absorption at a resolution equal to one-quarter the spectral bin width, i.e., 0.25 cm⁻¹ for the 1.0 cm⁻¹ band model.

Figure 2 Components of molecular absorption. The line center, line tail, and continuum contributions to the total absorption are illustrated for the central 1 cm⁻¹ spectral bin.

The most basic ansatz of the MODTRAN band model is the stipulation that molecular line center absorption can be approximated by the absorption of n identical Voigt lines randomly distributed within the band model spectral interval, Δν. Early in the development of RT theory, Plass derived the expression for the transmittance from these n randomly distributed lines:

$$T = \left(1 - \frac{W'_{\mathrm{sl}}}{\Delta\nu}\right)^{n} \qquad [7]$$

MODTRAN's evolution has resulted in a fine tuning of the methodology used to define both the effective line number n and the in-band single-line equivalent width W′_sl. The effective line number is initially estimated from a relationship developed by Goody, in which lines are weighted according to the square root of their strength, but MODTRAN combines nearly degenerate transitions into single lines because these multiplets violate the random distribution assumption. The initial effective line number values are refined to ensure a match with higher resolution transmittance predictions degraded to the band model resolution. The in-band equivalent
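The band-model transmittance of eqn [7], and the multiplicative combination of components that the random-correlation argument permits, can be sketched numerically. All parameter values below are illustrative assumptions, not MODTRAN outputs:

```python
import math

def plass_transmittance(w_sl: float, n_lines: float, delta_nu: float = 1.0) -> float:
    """Eqn [7]: transmittance of n identical lines randomly placed in a bin.

    w_sl     -- in-band single-line equivalent width W'_sl (cm^-1)
    n_lines  -- effective line number n
    delta_nu -- band-model bin width (cm^-1); 1.0 in standard MODTRAN
    """
    return (1.0 - w_sl / delta_nu) ** n_lines

# Because spectral correlation between extinction sources is random at
# 1 cm^-1 resolution, the total in-bin transmittance is the product of
# the individual components (illustrative values):
t_molecular = plass_transmittance(w_sl=0.02, n_lines=8.0)
t_aerosol   = math.exp(-0.1)   # Beer's-Law aerosol extinction, optical depth 0.1
t_rayleigh  = math.exp(-0.05)  # Beer's-Law Rayleigh scattering, optical depth 0.05
t_total = t_molecular * t_aerosol * t_rayleigh
print(round(t_molecular, 4), round(t_total, 4))
```

Note that the molecular factor is not a simple exponential in absorber amount, which is exactly why MODTRAN needs the correlated-k option described below before multiple-scattering solvers can be applied.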
ENVIRONMENTAL MEASUREMENTS / Optical Transmission and Scatter of the Atmosphere 421
MODTRAN scattering calculations are optimally performed using the DISORT discrete ordinates algorithm developed by Stamnes and co-workers. Methods for computing multiple scattering, such as DISORT, require additive optical depths, i.e., Beer's Law transmittances. Since the in-band molecular transmittances of MODTRAN do not satisfy Beer's Law, MODTRAN includes a correlated-k algorithm option. The basic band model ansatz is re-invoked to efficiently determine k-distributions; tables of k-data are pre-computed as a function of Lorentz and Doppler half-widths and effective line number, assuming spectral intervals contain n randomly distributed identical molecular lines. Thus, MODTRAN k-distributions are statistical, dependent only on band model parameter values, not on the exact distribution of absorption coefficients in each spectral interval.

Applications

Among the many applications of atmospheric transmission and scattering calculations, we briefly describe two complementary ones in the area of remote sensing, which illustrate many of the RT features discussed earlier as well as current optical technologies and problems of interest.

Earth Surface Viewing

The first example is Earth surface viewing from aircraft or spacecraft with spectral imaging sensors. These include hyperspectral sensors, such as AVIRIS, which typically have a hundred or more contiguous spectral channels, and multispectral sensors, such as Landsat, which typically have between three and a few tens of channels. These instruments are frequently used to characterize the surface terrain, materials, and properties for such applications as mineral prospecting, environmental monitoring, precision agriculture, and military uses. In addition, they are sensitive to properties of the atmosphere such as aerosol optical depth and column water vapor. Indeed, in order to characterize the surface spectral reflectance, it is necessary to characterize and remove the extinction and scattering effects of the atmosphere.

Figure 3 also shows an example of data collected by the AVIRIS sensor at ~3 km altitude over thick vegetation. The apparent reflectance spectrum is the observed radiance divided by the Top-of-Atmosphere horizontal solar flux. Atmospheric absorption by water vapor, oxygen, and carbon dioxide is evident, as well as Rayleigh and aerosol scattering. Figure 3 shows the surface reflectance spectrum inferred by modeling and then removing these atmospheric effects (this process is known as atmospheric removal, compensation, or correction). It has the characteristic smooth shape expected of vegetation, with strong chlorophyll absorption in the visible and water bands at longer wavelengths. A detailed analysis of such a spectrum may yield information on the vegetation type and its area coverage and health.

Figure 3 Vegetation spectrum viewed from 3 km altitude before (apparent reflectance) and after (reflectance) removal of atmospheric scatter and absorption.

Sun and Sky Viewing

The second remote sensing example is sun and sky viewing from the Earth's surface with a spectral radiometer, which can yield information on the aerosol content and optical properties, as well as estimates of column concentrations of water vapor, ozone, and other gases. Figure 4 shows data from a Yankee Environmental Systems, Inc. multi-filter rotating shadow-band radiometer, which measures both 'direct flux' (the direct solar flux divided by the cosine of the zenith angle) and diffuse (sky) flux in narrow wavelength bands. The plot of ln(direct signal) versus the air mass ratio is called a Langley plot, and is linear for most of the bands, illustrating Beer's Law. The extinction coefficients (slopes) vary with wavelength, consistent with a combination of Mie and Rayleigh scattering. The water-absorbing 940 nm band has the lowest values and a curved
plot, in accordance with the square-root dependence of the equivalent width for optically thick lines.

Figure 4 Direct solar flux versus air mass at the surface, measured by a Yankee Environmental Systems, Inc. multi-filter rotating shadow-band radiometer at wavelengths of 415, 500, 610, 665, 862, and 940 nm.

The diffuse fluxes, which arise from aerosol and Rayleigh scattering, have a very different dependence on air mass than the direct flux. In particular, the diffuse contributions often increase with air mass for high sun conditions.

The ratio of the diffuse and direct fluxes is related to the total single-scattering albedo, which is defined as the ratio of the total scattering coefficient to the extinction coefficient. Results from a number of measurements showing lower than expected diffuse-to-direct ratios have suggested the presence of black carbon or some other continuum absorber in the atmosphere, which would have a significant impact on radiative energy balance at the Earth's surface and in the atmospheric boundary layer.

List of Units and Nomenclature

Line shape function (cm): f(ν)
Lorentz line shape function (cm): f_Lorentz(ν)
Voigt line shape function (cm): f_Voigt(ν)
Scattering asymmetry parameter: g
Direct plus diffuse radiance (W cm⁻¹ sr⁻¹ or W cm⁻² μm⁻¹ sr⁻¹): I
Direct radiance (W cm⁻¹ sr⁻¹ or W cm⁻² μm⁻¹ sr⁻¹): I₀
Direct radiance at the LOS boundary (W cm⁻¹ sr⁻¹ or W cm⁻² μm⁻¹ sr⁻¹): I₀⁰
Source function (W cm⁻¹ sr⁻¹ or W cm⁻² μm⁻¹ sr⁻¹): J
Extinction coefficient (cm⁻¹ atm⁻¹): k
Effective number of lines in bin: n
Scattering phase function (sr⁻¹): p
Line strength (cm⁻² atm⁻¹): S
Transmittance: T
Column density (atm cm): u
Complex error (probability) function: w
Single-line total equivalent width (cm⁻¹): W_sl
Single-line in-band equivalent width (cm⁻¹): W′_sl
Collision-broadened half-width at half maximum (cm⁻¹): α_c
Doppler half-width at 1/e (cm⁻¹): α_d
Spectral interval (bin) width (cm⁻¹): Δν
Scattering angle (radian): θ
Wavelength (μm): λ
Wavenumber (cm⁻¹): ν
Molecular line transition frequency: ν₀
Direction of incoming light (sr): Ω_i
Direction of outgoing light (sr): Ω_o
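The Langley analysis described above can be sketched numerically: under Beer's Law the direct signal is V = V₀ e^(−τm), so a straight-line fit of ln V against air mass m returns the extinction optical depth as the (negated) slope and the extraterrestrial signal as the intercept. The optical depth and noise level below are invented for illustration, not instrument values:

```python
import numpy as np

# synthetic direct-beam signals following Beer's Law: V = V0 * exp(-tau * m)
tau_true, v0_true = 0.18, 1.0        # assumed optical depth and TOA signal
m = np.linspace(1.0, 6.0, 25)        # air mass ratios over a morning
rng = np.random.default_rng(1)
v = v0_true * np.exp(-tau_true * m) * (1 + 0.005 * rng.standard_normal(m.size))

# Langley plot: ln(V) versus m is linear; slope = -tau, intercept = ln(V0)
slope, intercept = np.polyfit(m, np.log(v), 1)
tau, v0 = -slope, np.exp(intercept)
print(f"tau = {tau:.3f}, V0 = {v0:.3f}")
```

The curvature of the 940 nm band in a real measurement would show up here as a systematic residual, since a strongly saturated water line grows only as the square root of absorber amount rather than linearly in the exponent.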
FIBER AND GUIDED WAVE OPTICS

Contents
Overview
Dispersion
Fabrication of Optical Fiber
Light Propagation
Measuring Fiber Characteristics
Nonlinear Effects (Basics)
Nonlinear Optics
Optical Fiber Cables
Passive Optical Components
Active Fiber Compatible Components

Our discussion of propagation in fiber optic waveguides would be a rather sterile one if there were not a number of fiber compatible components available to us that can generate, amplify, and detect light streams, in order to construct fiber optic systems. In this section, we will try to discuss some of the theory of operation of active optical components, in order to help elucidate some of the general characteristics of such components and the conditions that the inclusion of such components in a fiber system imposes on the form that the system must take.

In a passive component such as an optical fiber, we need only consider the characteristics of the light that is guided. Operation of active components requires that the optical fields interact with a medium in such a manner that energy can be transferred from the field through an excitation of the medium to the controlling electrical stream and/or vice versa. Generally one wants the exchange to go only one way, that is, from field to electrical stream or from electrical stream to field. In a detector this is not difficult to achieve, whereas in lasers and amplifiers it is quite hard to eliminate the back reaction of the field on the device.

Whereas a guiding medium can be considered as a passive homogeneous continuum, the active medium has internal degrees of freedom of its own, as well as a granularity associated with the distribution of the microscopic active elements. At the least, we must consider the active medium as having an active index of refraction with a mind (a set of differential equations, anyway) of its own. If one also wants to consider noise characteristics, one needs to consider the lattice of active elements which convert the energy. The wavelength of operation of that active medium is determined by the energy spacing between the upper and lower levels of a transition, which is determined by the microscopic structure of these grains, or quanta, that make up the medium. In a semiconductor, we consider these active elements to be so uniformly distributed and ideally spaced that we can consider the quanta (electron–hole pairs) to be delocalized but numerous, and labeled by their momentum vectors. In atomic media, such as the rare earth doped optical fibers which serve as optical amplifiers, we must consider the individual atoms as the players. In the case of the semiconductor, a current, flowing in an external circuit, controls the population of the upper (electronic) and lower (hole) levels of each given momentum state within the semiconductor. When the momentum states are highly populated with electrons and holes, there is a flow of energy to the field (at the transition wavelength), and when the states are depopulated, the field will tend to be absorbed and populate the momentum states by giving up its energy to the medium. The situation is similar with the atomic medium, except that the pump is generally optical (rather than electrical) and at a different wavelength than the wavelength of the field to be amplified. That is to say, one needs to use a minimum of three levels of the localized atoms in the medium in order to carry out an amplification scheme.

The composition of a semiconductor determines its bandgap, that is, the minimum energy difference between the electron and hole states of a given momentum value. A source, be it a light emitting diode (LED) or a laser diode (see Figure 4), will emit light at a wavelength which corresponds to an energy slightly above the minimum gap energy (wavelength equals the velocity of light times Planck's constant divided by energy), whereas a detector can detect almost any energy above the bandedge. Only semiconductors with bandgaps which exhibit a minimum energy at zero momentum transfer can be made to emit light strongly. Silicon does not exhibit a direct gap, and although it can be used as a detector, it cannot be used as a source material. The silicon laser is and has been the 'Holy Grail' of electronics because of the ubiquity of silicon electronics. To even believe that a 'Holy Grail' exists requires a leap of faith. Weak luminescence has been

Figure 4 A schematic depiction of the workings of a semiconductor laser light source. The source is fabricated as a diode, and normal operation of this diode is in forward bias, that is, the p-side of the junction is biased positively with respect to ground, which is attached to the n-side of the junction. With the p-side positively biased, current should freely flow through the junction. This is not quite true, as there is a heterojunction region between the p- and n-regions in which electrons from the n-side may recombine with holes from the p-side. This recombination gives off a photon which is radiated. In light emitting diodes (LEDs), the photon simply leaves the light source as a spontaneously emitted photon. In a laser diode, the heterojunction layer serves as a waveguide and the endfaces of the laser as mirrors to provide feedback and allow laser operation, in which the majority of the photons generated are generated in stimulated processes.
FIBER AND GUIDED WAVE OPTICS / Overview 429
Dispersion

L Thévenaz, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland

© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

Dispersion designates the property of a medium to propagate the different spectral components of a wave with a different velocity. It may originate from the natural dependence of the refractive index of a dense material on the wavelength of light, and one of the most evident and magnificent manifestations of dispersion is the appearance of a rainbow in the sky. In this case dispersion gives rise to a different refraction angle for the different spectral components of the white light, resulting in an angular dispersion of the sunlight spectrum and explicating the etymology of the term dispersion. Despite this spectacular effect, dispersion is mostly seen as an impairment in many applications, in particular for optical signal transmission. In this case, dispersion causes a rearrangement of the signal spectral components in the time domain, resulting in temporal spreading of the information and eventually in severe distortion.

There is a trend to designate all phenomena resulting in a temporal spreading of information as a type of dispersion, namely the polarization dispersion in optical fibers that should be properly named polarization mode delay (PMD). In this article chromatic dispersion will be solely addressed, which designates the dependence of the propagation velocity on wavelength and corresponds to the strict etymology of the term.

Effect of Dispersion on an Optical Signal

An optical signal may always be considered as a sum of monochromatic waves through a normal Fourier expansion. Each of these Fourier components propagates with a different phase velocity if the optical medium is dispersive, since the refractive index n depends on the optical frequency ν. This phenomenon is linear and causal and gives rise to a signal distortion that may be properly described by a transfer function in the frequency domain.

Let n(ν) be the frequency-dependent refractive index of the propagation medium. When the optical wave propagates in a waveguiding structure the effective wavenumber is properly described by a propagation constant β, that reads:

$$\beta(\nu) = \frac{2\pi\nu}{c_0}\, n(\nu) \qquad [1]$$

where c₀ is the vacuum light velocity. The propagation constant corresponds to an eigenvalue of the wave equation in the guiding structure and normally takes a different value for each solution or propagation mode. For free-space propagation this constant is simply equal to the wavenumber of the corresponding plane wave.

The complex electrical field E(z, t) of a signal propagating in the z direction may be properly described by the following expression:

$$E(z,t) = A(z,t)\, e^{i(2\pi\nu_0 t - \beta_0 z)}, \quad \text{with } \nu_0 \text{ the central optical frequency and } \beta_0 = \beta(\nu_0) \qquad [2]$$

where A(z, t) represents the complex envelope of the signal, supposed to be slowly varying compared to the carrier term oscillating with the frequency ν₀. Consequently, the signal spectrum will spread over a narrow band around the central frequency ν₀ and the propagation constant β can be conveniently approximated by a limited expansion to the second order:

$$\beta(\nu) = \beta_0 + \left.\frac{d\beta}{d\nu}\right|_{\nu=\nu_0}(\nu-\nu_0) + \frac{1}{2}\left.\frac{d^2\beta}{d\nu^2}\right|_{\nu=\nu_0}(\nu-\nu_0)^2 \qquad [3]$$

For a known signal A(0, t) at the input of the propagation medium, the problem consists in
FIBER AND GUIDED WAVE OPTICS / Dispersion 433
determining the signal envelope A(z, t) after propagation over a distance z. The linearity and the causality of the system make possible a description using a transfer function H_z(ν) such that:

$$\tilde{A}(z,\nu) = H_z(\nu)\, \tilde{A}(0,\nu) \qquad [4]$$

where Ã(z, ν) is the Fourier transform of A(z, t).

To make the transfer function H_z(ν) explicit, let us assume that the signal corresponds to an arbitrary harmonic function:

$$A(0,t) = A_0\, e^{i 2\pi f t} \qquad [5]$$

Since eqns [6] and [7] must represent the same quantities and using the definition in eqn [4], a simple comparison shows that the transfer function must take the following form:

$$H_z(\nu) = e^{-i[\beta(\nu) z - \beta_0 z]} \qquad [8]$$

The transfer function takes a more analytical form using the approximation in eqn [3]:

$$H_z(\nu) = e^{-i 2\pi (\nu-\nu_0)\tau_D}\, e^{-i \pi D_\nu (\nu-\nu_0)^2 z} \qquad [9]$$

In the transfer function interpretation the first term represents a delay term. It means that the signal is delayed after propagation by the quantity:

$$\tau_D = \frac{1}{2\pi}\left.\frac{d\beta}{d\nu}\right|_{\nu_0} z = \frac{z}{V_g} \qquad [10]$$

where V_g represents the signal group velocity. This term therefore brings no distortion to the signal and thus states that the signal is replicated at the distance z with a delay τ_D.

The second term is the distortion term, which is similar in form to a diffusion process and normally results in time spreading of the signal. In the case of a light pulse it will gradually broaden while propagating along the fiber, like a hot spot on a plate gradually spreading as a result of heat diffusion. The effect of this distortion is proportional to the distance z and to the coefficient D_ν, named group velocity dispersion (GVD):

$$D_\nu = \frac{1}{2\pi}\frac{d^2\beta}{d\nu^2} = \frac{d}{d\nu}\!\left(\frac{1}{V_g}\right) \qquad [11]$$

so that the distortion of the signal may be calculated by a simple convolution of the impulse response with the signal envelope in the time domain.

The effect of dispersion on the signal can be more easily interpreted by evaluating the dispersive propagation of a Gaussian pulse. In this particular case the calculation of the resulting envelope can be carried out analytically. If the signal envelope takes the following Gaussian distribution at the origin:

$$A(0,t) = A_0\, e^{-t^2/t_0^2} \qquad [13]$$

with t₀ the 1/e half-width of the pulse, the envelope at distance z is obtained by convolving the initial envelope with the impulse response h_z(t):

$$A(z,t) = h_z(t) \otimes A(0,t) = A_0\,\sqrt{\frac{i z_0}{z + i z_0}}\; e^{\,i\pi \frac{(t-\tau_D)^2}{D_\nu (z + i z_0)}} \qquad [14]$$
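The broadening predicted by eqn [14] can be checked numerically by applying the transfer function of eqn [9] (with the delay term dropped) to a sampled Gaussian pulse: the closed form implies a 1/e intensity half-width t(z) = t₀√(1+(z/z₀)²) with z₀ = πt₀²/D_ν. All parameter values below are illustrative assumptions, not figures from the article:

```python
import numpy as np

t0 = 10e-12                 # 1/e half-width of the input pulse: 10 ps
d_nu = 2.0e-26              # GVD coefficient D_nu (s^2/m), assumed value
z = 50e3                    # propagation distance: 50 km

t = np.linspace(-400e-12, 400e-12, 1 << 14)
a0 = np.exp(-t**2 / t0**2)                      # eqn [13]

nu = np.fft.fftfreq(t.size, d=t[1] - t[0])      # baseband frequency nu - nu0
h = np.exp(-1j * np.pi * d_nu * nu**2 * z)      # eqn [9], distortion term only
az = np.fft.ifft(h * np.fft.fft(a0))            # eqn [4] applied in the Fourier domain

# width from the second moment of |A|^2 (for a Gaussian intensity, t(z) = 2*sigma)
p = np.abs(az)**2
sigma = np.sqrt(np.sum(t**2 * p) / np.sum(p))
z0 = np.pi * t0**2 / d_nu
t_z = t0 * np.sqrt(1 + (z / z0)**2)
print(f"FFT width {2*sigma*1e12:.2f} ps vs analytic {t_z*1e12:.2f} ps")
```

The two widths agree to well under a percent, confirming that the quadratic spectral phase of eqn [9] reproduces the analytic Gaussian broadening.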
This means that a beautiful natural phenomenon such as a rainbow originates, on a microscopic scale, from the sluggishness of the medium molecules to react to the presence of light. For signal propagation, it results in a distortion of the signal and in most cases in a pulse spreading, but the microscopic causes are in essence identical. This noninstantaneous response is tightly related to the molecules' vibrations that also give rise to light absorption. For this reason it is convenient to express the propagation in an absorptive medium by adding an imaginary part to the susceptibility χ(ν):

$$\chi(\nu) = \chi'(\nu) + i\chi''(\nu) \qquad [28]$$

so that the refractive index n(ν) and the absorption coefficient α(ν) read, in a weakly absorbing medium:

$$n(\nu) = \sqrt{1 + \chi'(\nu)}, \qquad \alpha(\nu) = -\frac{2\pi\,\chi''(\nu)}{\lambda\, n(\nu)} \qquad [29]$$

Since the response of the medium, given by the time-dependent susceptibility χ(t) in eqn [25], is real and causal, the real and imaginary parts of the susceptibility in eqn [28] are not entirely independent and are related by the famous Kramers–Kronig relations:

$$\chi'(\nu) = \frac{2}{\pi}\int_0^{\infty}\frac{s\,\chi''(s)}{s^2 - \nu^2}\,ds, \qquad \chi''(\nu) = \frac{2}{\pi}\int_0^{\infty}\frac{\nu\,\chi'(s)}{\nu^2 - s^2}\,ds \qquad [30]$$

Absorption and dispersion act in an interdependent way on the propagating optical wave, and knowing either the absorption or the dispersion spectrum is theoretically sufficient to determine the entire optical response of the medium. The interdependence between absorption and dispersion is typically illustrated in Figure 4. The natural tendency is to observe a growing refractive index for increasing frequencies in low absorption regions. In this case the dispersion is called normal, and this is the most observed situation in transparency regions of a material, which obviously offer the largest interest for optical propagation. In an absorption line the tendency is opposite, a diminishing index for increasing frequency, and such a response is called anomalous dispersion. The dispersion considered here is the phase velocity dispersion, represented by the slope of n(ν), that must not be mistaken for the group velocity dispersion (GVD) that only matters for signal distortion. The difference between these two quantities is clarified below.

To demonstrate that a frequency-dependent refractive index n(ν) gives rise to a group velocity dispersion, and thus a signal distortion, we use eqns [1] and [10], so the group velocity V_g can be expressed:

$$V_g = \frac{c_0}{N} \quad \text{with} \quad N = n + \nu\frac{dn}{d\nu} = n - \lambda\frac{dn}{d\lambda} \qquad [31]$$

N is called the group velocity index and differs from the phase refractive index n only if n shows a spectral dependence. In a region of normal dispersion (dn/dν > 0), the group index is larger than the phase index, and this is the situation observed in the great majority of transparent materials. From eqn [31] and using eqn [24], the GVD coefficient D_λ can be expressed as a function of the refractive index n(λ):

$$D_\lambda = \frac{d}{d\lambda}\!\left(\frac{1}{V_g}\right) = \frac{d}{d\lambda}\!\left(\frac{N}{c_0}\right) = -\frac{\lambda}{c_0}\frac{d^2 n}{d\lambda^2} \qquad [32]$$

The GVD coefficient is proportional to the second derivative of the refractive index n with respect to
Further Reading

Gloge D (1971) Dispersion in weakly guiding fibers. Applied Optics 10: 2442–2445.
Jeunhomme LB (1989) Single Mode Fiber Optics: Principles and Applications. Optical Engineering Series, No. 23. New York: Marcel Dekker.
Marcuse D (1974) Theory of Dielectric Optical Waveguides, chap. 2. Series on Quantum Electronics – Principles and Applications. New York: Academic Press.
Marcuse D (1980) Pulse distortion in single-mode fibers. Applied Optics 19: 1653–1660.
Marcuse D (1981) Pulse distortion in single-mode fibers. Applied Optics 20: 2962–2974.
Marcuse D (1981) Pulse distortion in single-mode fibers. Applied Optics 20: 3573–3579.
Mazurin OV, Streltsina MV and Shvaiko-Shvaikovskaya TP (1983) Silica Glass and Binary Silicate Glasses (Handbook of Glass Data, Part A). Series on Physical Sciences Data, vol. 15, pp. 63–75. Amsterdam: Elsevier.
Murata H (1988) Handbook of Optical Fibers and Cables, chap. 2. Optical Engineering Series, No. 15. New York: Marcel Dekker.
Saleh BEA and Teich MC (1991) Fundamentals of Photonics, chaps. 5, 8 and 22. Wiley Series in Pure and Applied Optics (Goodman JW, ed.). New York: Wiley.
2. The materials used must be low loss, providing transmission with no absorption or scattering of light.
3. The materials used must have suitable thermal and mechanical properties to allow them to be drawn down in diameter into a fiber.

Silica (SiO2) can be made into a glass relatively easily. It does not easily crystallize, which means that scattering from unwanted crystalline centers within the glass is negligible. This has been a key factor in the achievement of a low-loss fiber. Silica glass has high transparency in the visible and near-infrared wavelength regions and its refractive index can be easily modified. It is stable and inert, providing excellent chemical and mechanical durability. Moreover, the purification of the raw materials used to synthesize silica glass is quite straightforward.

In the first stage of achieving an optical fiber, silica glass is synthesized by one of three main chemical vapor processes. All use silicon tetrachloride (SiCl4) as the main precursor, with various dopants to modify the properties of the glass. The precursors are reacted with oxygen to form the desired oxides. The end result is a high purity solid glass rod with the internal core and cladding structure of the desired fiber.

In the second stage, the rod, or preform as it is known, is heated to its softening temperature and stretched to diameters of the order of 125 microns. Tens to hundreds of kilometers of fiber are produced from a single preform, which is drawn continuously, with minimal diameter fluctuations. During the drawing process one or more protective coatings are applied, yielding long lengths of strong, low-loss fiber, ready for immediate application.

Preform Fabrication

The key step in preparing a low-loss optical fiber is to develop a technique for completely eliminating transition metal and OH ion contamination during the synthesis of the silica glass. The three methods most commonly used to fabricate a glass optical fiber preform are: the modified chemical vapor deposition process (MCVD); the outside vapor deposition process (OVD); and the vapor-axial deposition process (VAD).

In a typical vapor phase reaction, halide precursors undergo a high temperature oxidation or hydrolysis to form the desired oxides. The completed chemical reaction for the formation of silica glass is, for oxidation:

SiCl4 + O2 → SiO2 + 2Cl2  [1]

For hydrolysis, which occurs when the deposition takes place in a hydrogen-containing flame, the reaction is:

SiCl4 + 2H2O → SiO2 + 4HCl  [2]

These processes produce fine glass particles, spherical in shape with a size of the order of 1 nm. These glass particles, known as soot, are then deposited and subsequently sintered into a bulk transparent glass. The key to low-loss fiber is the difference in vapor pressures of the desired halides and the transition metal halides that cause significant absorption loss at the wavelengths of interest. The carrier gas picks up a pure vapor of, for example, SiCl4, and any impurities are left behind.

Part of the process requires the formation of the desired core/cladding structure in the glass. In all cases, silica-based glass is produced with variations in refractive index produced by the incorporation of dopants. Typical dopants are germania (GeO2), titania (TiO2), alumina (Al2O3), and phosphorous pentoxide (P2O5) for increasing the refractive index, and boron oxide (B2O3) and fluorine (F) for decreasing it. These dopants also allow other properties to be controlled, such as the thermal expansion of the glass and its softening temperature. In addition, other materials, such as the rare earth elements, have also been used to fabricate active fibers that are used to produce optical fiber amplifiers and lasers.

MCVD Process

Optical fibers were first produced by the MCVD method in 1974, a breakthrough that completely solved the technical problems of low-loss fiber fabrication (Figure 2).

As shown schematically in Figure 3, the halide precursors are carried in the vapor phase by an oxygen carrier gas into a pure silica substrate tube. An oxy-hydrogen burner traverses the length of the tube, which it heats externally. The tube is heated to temperatures of about 1,400 °C, which then oxidizes the halide vapor materials. The deposition temperature is sufficiently high to form a soot made of glassy particles, which are deposited on the inside wall of the substrate tube, but low enough to prevent the softened silica substrate tube from collapsing. The process usually takes place on a horizontal glass-working lathe.

During the deposition process, the repeated traversing of the burner forms multiple layers of soot. Changes to the precursors entering the tube, and thus the resulting glass composition, are introduced for layers which will form the cladding and then the core. The MCVD method allows germania to be doped into
442 FIBER AND GUIDED WAVE OPTICS / Fabrication of Optical Fiber
the silica glass and the precise control of the refractive index profile of the preform. When deposition is complete, the burner temperature is increased and the hollow, multilayered structure is collapsed to a solid rod. A characteristic of fiber formed by this process is a refractive index dip in the center of the core of the fiber. In addition, the deposition takes place in a closed system, which dramatically reduces contamination by OH⁻ ions and maintains low levels of other impurities. The high temperature allows high deposition rates compared to traditional chemical vapor deposition (CVD), and large preforms, built up from hundreds of layers, can be produced.

The MCVD method is still widely used today, though it has some limitations, particularly on the preform size that can be achieved and thus the manufacturing cost. The diameter of the final preform is determined by the size of the initial silica substrate tube, which, due to the high purity required, accounts for a significant portion of the cost of the preform. A typical preform fabricated by MCVD yields about 5 km of fiber. Its success has spurred improvements to the process, in particular to address the fiber yield from a single preform.

Figure 2 Fabrication of an optical fiber preform by the MCVD method.

Figure 3 Schematic of preform fabrication by the MCVD method.

OVD Process

The outside vapor deposition (OVD) method, also known as outside vapor phase oxidation (OVPO), synthesizes the fine glass particles within a burner flame. The precursors, oxygen, and fuel for the burner are introduced directly into the flame. The soot is deposited onto a target rod that rotates and traverses in front of the burner. As in MCVD, the preform is built up layer by layer, though now initially by depositing the core glass and then building up the cladding layers over this. After the deposition process is complete, the preform is removed from the target rod and is collapsed and sintered into a transparent glass preform. The center hole remains, but disappears during the fiber drawing process. This technique has advantages in both the size of preform which can be obtained and the fact that a high-quality silica substrate tube is no longer required. These two advantages combine to make a more economical process. From a single preform, several hundred kilometers of fiber can be produced.

VAD Process

The most recent refinement to the fabrication process was developed, again, to aid the mass production of high-quality fibers. In the VAD process, both core and cladding glasses are deposited simultaneously. Like OVD, the soot is synthesized and deposited by flame hydrolysis; as shown in Figure 4, the precursors are blown from a burner, oxidized, and deposited onto a silica target rod.

Burners for the VAD process consist of a series of concentric nozzles. The first delivers an inert carrier gas and the main precursor SiCl4, the second delivers an inert carrier gas and the dopants, the third delivers hydrogen fuel, and the fourth delivers oxygen. Gas flows are up to a liter per minute and deposition rates can be very high. The main advantage is that this is a continuous process: as the soot which forms the core and cladding glasses is deposited axially onto the end of the rotating silica rod, the rod is slowly drawn upwards into a furnace which sinters and consolidates the soot into a transparent glass. The upward motion is such that the end at which deposition is occurring remains in a fixed position, and essentially the preform is grown from this base.

The advantages of the VAD process are a preform without a central dip and, most importantly, the mass production associated with a continuous process. The glass quality produced is uniform and the resulting fibers have excellent reproducibility and low loss.
Figure 4 Schematic of preform fabrication by the VAD method. Figure 5 A commercial scale optical fiber drawing tower.
Other Methods of Preform Fabrication

There are other methods of preform fabrication, though these are not generally used for today's silica glass based fiber. Some methods, such as fabrication of silica through sol–gel chemistry, could not provide the large low-loss preforms obtained by chemical vapor deposition techniques. Other methods, such as direct casting of molten glass into rods, or formation of rods and tubes by extrusion, are better suited to, and find application in, other glass chemistries and speciality fibers.

Fiber Drawing

Once a preform has been made, the second step is to draw the preform down in diameter on a fiber drawing tower. The basic components and configuration of the drawing tower have remained unchanged for many years, although furnace design has become more sophisticated and processes like coating have become automated. In addition, fiber drawing towers have increased in height to allow faster pulling speeds (Figure 5).
Essentially, the fiber drawing process takes place as follows. The preform is held in a chuck which is mounted on a precision feed assembly that lowers the preform into the drawing furnace at a speed which matches the volume of preform entering the furnace to the volume of fiber leaving the bottom of the furnace. Within the furnace, the fiber is drawn from the molten zone of glass down the tower to a capstan, which controls the fiber diameter, and then onto a drum. During the drawing process, immediately below the furnace, is a noncontact device that measures the diameter of the fiber. This information is fed back to the capstan, which speeds up to reduce the diameter or slows down to increase it. In this way, a constant-diameter fiber is produced. One or more coatings are also applied in-line to maintain the pristine surface quality as the fiber leaves the furnace and thus to maximize the fiber strength.
The schematic drawing in Figure 6 shows a fiber drawing tower and its key components. Draw towers are commercially available, ranging in height from 3 m to greater than 20 m, with the tower height increasing as the draw speed increases. Typical drawing speeds are on the order of 1 m s⁻¹. The increased height is needed to allow the fiber to cool sufficiently before entering the coating applicator, although forced air-cooling of the fiber is commonplace in a manufacturing environment.
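The feed-rate matching described above is volume conservation per unit time: the preform cross-sectional area times the feed speed must equal the fiber cross-sectional area times the draw speed. A small sketch with assumed, illustrative dimensions:

```python
def feed_speed_mm_per_s(draw_speed_m_per_s, preform_diameter_mm, fiber_diameter_um=125.0):
    """Preform feed speed that balances the volume of glass entering the
    furnace against the volume of fiber leaving it."""
    area_ratio = (fiber_diameter_um * 1e-3 / preform_diameter_mm) ** 2
    return draw_speed_m_per_s * 1e3 * area_ratio

# Drawing 125 um fiber at 1 m/s from a hypothetical 20 mm preform requires
# feeding the preform downward at only a few hundredths of a millimetre
# per second.
print(feed_speed_mm_per_s(1.0, 20.0))
```

The quadratic area ratio is why even a fast draw consumes preform length only slowly, and why the diameter feedback loop acts on the capstan speed rather than the feed.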
Types of Fiber
The majority of silica fiber drawn today is single
mode. This structure consists of a core whose
diameter is chosen such that, with a given refractive
index difference between the core and cladding, only
a single guide mode propagates at the wavelength
of interest. With a discrete index difference between
core and cladding, it is often referred to as a step
index fiber. In typical telecommunications fiber, single
mode operation is obtained with core diameters of
2–10 microns with a standard outer diameter of
125 microns.
Multimode fiber has core diameters considerably
larger, typically 50, 62.5, 85, and 110 microns, again
with a cladded diameter of 125 microns. Multimode
fibers are often graded index, that is the refractive
index is a maximum in the center of the fiber and
smoothly decreases radially until the lower cladding index is reached. Multimode fibers find use in non-telecommunication applications, for example optical fiber sensing and medicine.

Figure 7 Structure of (a) dispersion shifted fiber and (b) two methods of achieving polarization preserving fiber.
Single mode fibers, which are capable of maintaining a linear polarization input to the fiber, are known as polarization preserving fibers. The structure of these fibers provides a birefringence that removes the degeneracy of the two possible polarization modes. This birefringence is a small difference in the effective refractive index of the two polarization modes that can be guided, and it is achieved in one of two ways. Common methods for the realization of this birefringence are an elliptical core in the fiber, or stress rods, which modify the refractive index in one orientation. These fiber structures are shown in Figure 7.
While the loss minimum of silica-based fiber is near 1.55 microns, step index single-mode fiber offers zero dispersion close to 1.3 micron wavelengths, and dispersion at the loss minimum is considerable. A modification of the structure of the fiber, and in particular a segmented refractive index profile in the core, can be used to shift this dispersion minimum to 1.55 microns. This fiber, illustrated in Figure 7, is known as dispersion shifted fiber. Similarly, fibers with a relatively low dispersion over a wide wavelength range, known as dispersion flattened fibers, can be obtained by the use of multiple cladding layers.

List of Units and Nomenclature

Dopants: Elements or compounds added, usually in small amounts, to a glass composition to modify its properties.
Fiber drawing: The process of heating and thus softening an optical fiber preform and then drawing out a thin thread of glass.
Fiber loss: The transmission loss of light as it propagates through a fiber, usually measured in dB of loss per unit length of fiber. Loss can occur through the absorption of light in the core or scattering of light out of the core.
Glass: An amorphous solid formed by cooling from the liquid state to a rigid solid with no long range structure.
Modified chemical vapor deposition (MCVD): A process for the fabrication of an optical preform where gases flow into the inside of a rotating tube, are heated, and react to form particles of glass which are deposited onto the wall of the glass tube. After deposition, the glass particles are consolidated into a solid preform.
Outside vapor deposition (OVD): A process for the fabrication of an optical preform where glass soot particles are formed in an oxy-hydrogen flame and deposited on a rotating rod. After deposition, the glass particles are consolidated into a solid preform.
Preform: A fiber blank; a bulk glass rod consisting of a core and cladding glass composite which is drawn into fiber.
Refractive index: A characteristic property of glass, defined by the speed of light in the material relative to the speed of light in a vacuum.
Silica: A transparent glass formed from silicon dioxide.
Vapor-axial deposition (VAD): A process similar to OVD, where the core and cladding layers are deposited simultaneously.

See also

Detection: Fiber Sensors. Fiber and Guided Wave Optics: Dispersion; Light Propagation; Measuring Fiber Characteristics; Nonlinear Effects (Basics); Optical Fiber Cables. Fiber Gratings. Optical Amplifiers: Erbium Doped Fiber Amplifiers for Lightwave Systems. Optical Communication Systems: Architectures of Optical Fiber Communication Systems; Historical Development; Wavelength Division Multiplexing. Optical Materials: Optical Glasses.
Light Propagation

F G Omenetto, Los Alamos National Laboratory, Los Alamos, NM, USA

© 2005, Elsevier Ltd. All Rights Reserved.

Optical fibers have become a mainstay of our modern way of life. From phone calls to the internet, their presence is ubiquitous. The unmatched ability that optical fibers have to transmit light is an amazing asset that, besides being strongly part of our present, is poised to shape the future.
The foundation of the unique ability of optical fibers to transmit light hinges on the well-known physical concept of total internal reflection. This phenomenon is described by considering the behavior of light traveling from one material (A) to a different material (B). When light traverses the interface of the two materials with a specific angle, its direction of propagation is altered with respect to the normal to the surface, according to the well-known Snell law, which relates the incident angle (onto the interface between A and B) to the refracted angle (away from the interface between A and B). This is written as nA sin i = nB sin r, where i is the angle of incidence, r the angle of refraction, and nA and nB are the refractive indices of the materials A and B. Under appropriate conditions, a critical value of the incidence angle (iCR) exists for which the light does not propagate into B and is totally reflected back into A. This value is given by iCR = arcsin(nB/nA). It follows that total internal reflection occurs when light is traveling from a medium with a higher index of
refraction into a medium with a lower index of refraction. As an approximation, this is the basic idea behind light propagation in fibers.
Optical fibers are glass strands that include a concentric core with a cladding wrapped around it. The index of refraction of the core is slightly higher than the index of refraction of the cladding, thereby creating the necessary conditions for the confinement of light within them. Such index mismatches between the core and the cladding are determined by using different materials for each, by doping the glass matrix before pulling the fiber to create an index gradient, by introducing macroscopic defects such as air-holes in the structure of the fiber cladding (such as in the recently developed photonic crystal fibers), or by using materials other than glass, such as polymers or plastic.
The real physical picture of light guiding in fibers is more complex than the conceptual description stated above. The light traveling in the fiber is, more precisely, an electromagnetic wave with frequencies that lie in the optical range, and the optical fiber itself constitutes an electromagnetic waveguide. As such, the waveguide supports guided electromagnetic modes which can take on a multitude of shapes (i.e., of energy distribution profiles), depending on the core/cladding index of refraction profiles. These profiles are often complex and determine the physical boundary conditions that influence the field distribution within the fiber. The nature of the fiber modes has been treated extensively in the literature, where an in-depth description of the physical solutions can be found.
There are many distinguishing features that identify different optical fiber types. Perhaps the most important one, in relation to the electromagnetic modes that the fiber geometry supports, is the distinction that is made between single-mode fibers and multimode fibers. As the name implies, a single-mode fiber is an optical fiber whose index of refraction profile between core and cladding is designed in such a way that there is only one electromagnetic field distribution that the fiber can support and transmit. That is to say, light is propagated very uniformly across the fiber. A very common commercially used fiber of this kind is the Corning SMF-28 fiber, which is employed in a variety of telecommunication applications. Multimode fibers have larger cores and are easier to couple light into, yet the fact that many modes are supported implies that there is less control over the behavior of light during its propagation, since it is more difficult to control each individual mode.
The condition for single-mode propagation in optical fibers is determined during the design and manufacture of the fiber by its geometrical parameters and is specific to a certain operation wavelength. A quantity V is defined as V = (2π/λ) a (ncore² − ncladding²)^(1/2), where a is the core radius and λ is the wavelength of light. For example, a step-index fiber (i.e., a fiber where the transition from the index of refraction of the core to the index of refraction in the cladding is abrupt) supports a single electromagnetic mode if V < 2.405.
Light propagation in optical fibers, however, is affected by numerous other factors besides the electromagnetic modes that the fiber geometry allows.
First, the ability to efficiently transmit light in optical fiber is dependent on the optical transparency of the medium that the fiber is made of. The optical transparency of the medium is a function of the wavelength (λ) of light that needs to be transmitted through the fiber. Optical attenuation is particularly high for shorter wavelengths, and the transmission losses for ultraviolet wavelengths become considerably higher than in the near infrared. The latter region, due to this favorable feature, is the preferred region of operation for optical fiber based telecommunication applications. This ultimate physical limit is determined by the process of light being scattered off the atoms that constitute the glass, a process called Rayleigh scattering. The losses due to Rayleigh scattering are inversely proportional to the fourth power of the wavelength (losses ∝ λ⁻⁴), and therefore become quite considerable as λ is decreased.
The ability of optical fibers to transmit light over a certain set of wavelengths is quantified by their optical attenuation. This parameter is defined as the ratio of the output power versus the input power and is usually measured in units of decibels (dB), i.e., attenuation (dB) = 10 log10(Pout/Pin).
The values that are found in the literature relate optical attenuation to distance and express the attenuation per unit length (such as dB/km). Single-mode fibers used in telecommunications are usually manufactured with doped silica glasses designed to work in the near infrared, in a wavelength range between 1.3 and 1.6 μm (1 μm = 10⁻⁶ m), which provides the lowest levels of Rayleigh scattering and is located before the physical infrared absorption of silica (~1.65 μm) kicks in. Such fibers, which have been developed and refined for decades, have attenuation losses of less than 0.2 dB/km at a wavelength of 1.55 μm. Many other materials have been used to meet specific optical transmission needs. While silica remains by far the most popular material used for optical fiber manufacturing, other glasses are better suited for different portions of the spectrum: quartz glass can be particularly transmissive in the ultraviolet whereas plastic is sometimes
conveniently used for short distance transmission in the visible.
The ability to collect light at the input of the fiber represents another crucial and defining parameter. Returning to total internal reflection, such an effect can only be achieved by sending light onto an interface at a specific angle. In fibers, the light needs to be coupled into a circular core and reach the core-cladding interface at that specific geometrical angle. The measure of the ability to couple light in the fiber is referred to as 'coupling efficiency'. The core dimension defines the geometry of the coupling process which, in turn, determines the efficiency of the propagation of light inside the fiber. In order to be coupled efficiently and therefore guided within the fiber, the light has to be launched within a range of acceptable angles. Such a range is usually defined by, or derivable from, the numerical aperture (NA) of the fiber, which is approximately NA = (ncore² − ncladding²)^(1/2), or can alternatively be described as a function of the critical angle for total internal reflection within the fiber by NA = ncore cos iCR. Also, light has to be focused into the core of the fiber, and therefore the interplay between core size, focusability, and NA (i.e., acceptance angle) of the fiber influences the coupling efficiency into the fiber (which in turn affects transmission). Typical telecommunication fibers have a 10 μm core (an NA of ~0.2) and have very high coupling efficiencies (of the order of 70–80%). Specialized single mode fibers (such as photonic crystal fibers) can have core diameters down to a few microns, significantly lower coupling efficiencies, and high numerical apertures. Multimode fibers, in contrast, are easy to couple into given their large core size.
Other factors that contribute to losses during propagation are due to the physical path that the actual optical fiber follows. A typical optical fiber strand is rarely laid along a straight line. These types of losses are called bending losses and can be intuitively understood by thinking that when the fiber is straight, the light meets its total reflection condition at the core-cladding interface, but when the fiber bends, the interface is altered, which varies the incidence conditions of light at the interface. If the bend is severe, light is no longer totally reflected and escapes from the fiber.
Another important issue that influences propagation arises when the light that travels through the fiber is sent in bursts or in pulses of light. Pulsed light is very important in fiber transmission because the light pulse is the carrier of optical information, most often representing a binary digit of data. The ability to transmit pulses reliably is at the heart of modern day telecommunication systems. The features that affect the transmission of pulses through optical fibers depend, again, largely on the structure and constituents of the fiber. One of the most important effects on pulsed light is dispersion. This phenomenon causes the broadening in time of the light pulses during their travel through the fiber. This can be easily explained by a Fourier analogy, by thinking that a short pulse in time (such as the 10–20 picosecond pulse duration that is used for optical communications, 1 picosecond = 10⁻¹² seconds) has a large frequency content (i.e., is composed of a variety of 'colors'). Since the speed of light through bulk media, and therefore through the fiber constituent, depends on wavelength, different 'colors' will travel at different speeds, arriving at the end of the fiber slightly delayed with respect to one another. This causes a net pulse broadening and is dependent on the properties of the fiber that are depicted in the dispersion curve of the fiber. This type of dispersion is called chromatic dispersion and is dependent on the distance the pulse travels in the fiber. A measure of the pulse broadening can be calculated by multiplying the dispersion value for the fiber (measured in ps/(nm·km)) at the wavelength of operation by the spectral width of the pulse and by the distance in kilometers that the pulse travels.
There are other types of dispersion that affect pulses traveling through fibers: light with different polarization travels at a different speed, thereby causing a problem analogous to chromatic distortion. In multimode fibers, different modes travel with different properties. Dispersion issues are extremely important in the design of next-generation telecommunication systems. As a push is made toward higher transmission capacity, pulses become shorter and these issues have to be dealt with carefully. Specifically, transmission of light in optical fibers becomes more complex because the interaction between the light and the fiber material constituents becomes nonlinear. As pulses become shorter, their energy is confined in a smaller temporal interval, making their peak power (which is equal to energy/time) increase. If the peak power becomes sufficiently high, the atoms that form the glass of which the fiber is made are excited in a nonlinear fashion, adding effects that need to be addressed with caution. Energetic pulses at a certain wavelength change their shape during propagation and can become severely distorted or change wavelength (i.e., undergo a frequency conversion process), thereby destroying the information that they were meant to carry.
Several physical manifestations of optical nonlinearity affect pulsed light propagation, giving rise to new losses but also providing new avenues for efficient transmission.
FIBER AND GUIDED WAVE OPTICS / Measuring Fiber Characteristics 449
Introduction

The optical fiber is divided into two types: multimode and singlemode. Each type is used in different applications and wavelength ranges and is consequently characterized differently. Furthermore, the corresponding test methods also vary.
The optical fiber characteristics may be divided into four categories:

• The optical characteristics (transmission related);
• The dimensional characteristics;
• The mechanical characteristics; and
• The environmental characteristics.

These categories will be reviewed in the following sections, together with their corresponding test methods.

Fiber Optical Characteristics and Corresponding Test Methods

The following sections will describe the following optical characteristics:

• attenuation;
• macrobending sensitivity;
• microbending sensitivity;
• cut-off wavelength;
• multimode fiber bandwidth;
• differential mode delay for multimode fibers;
• chromatic dispersion;
• polarization mode dispersion;

Attenuation

The spectral attenuation of an optical fiber follows exponential power decay from the power level at cross-section 1 to the power level at cross-section 2, over a fiber length L, as follows:

P2(λ) = P1(λ) · e^(−γ(λ)L)   [1]

P1(λ) is the optical power transmitted through the fiber core cross-section 1, expressed in mW; P2(λ) is the optical power transmitted through the fiber core cross-section 2, away from cross-section 1, expressed in mW; γ(λ) is the spectral attenuation coefficient in linear units, expressed in km⁻¹; and L is the fiber length expressed in km.
Attenuation may be characterized at one or more specific wavelengths or as a function of wavelength. In the latter case, attenuation is referred to as spectral attenuation. Figure 1 illustrates such power decay. Equation [1] may be expressed in relative units as follows:

log10 P2 = (log10 P1) − γL · log10 e   [2]

P is expressed in dBm units using the following definition.

Figure 1 Power decay in an optical fiber due to attenuation.
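Equations [1] and [2] can be checked numerically. The sketch below applies the exponential decay, verifies the logarithmic form, and converts the linear coefficient γ (km⁻¹) to the dB/km figure used in practice; the numerical values are illustrative:

```python
import math

def output_power_mw(p1_mw, gamma_per_km, length_km):
    """Eq. [1]: P2 = P1 * exp(-gamma * L)."""
    return p1_mw * math.exp(-gamma_per_km * length_km)

p1, gamma, length = 1.0, 0.05, 10.0
p2 = output_power_mw(p1, gamma, length)

# Eq. [2]: log10(P2) = log10(P1) - gamma * L * log10(e)
assert abs(math.log10(p2) - (math.log10(p1) - gamma * length * math.log10(math.e))) < 1e-12

# The dB/km attenuation coefficient is 10 * log10(e) * gamma, i.e. ~4.343 * gamma.
alpha_db_per_km = 10 * math.log10(math.e) * gamma
print(round(alpha_db_per_km, 4))
```

The factor 10·log10(e) ≈ 4.343 is the bridge between the linear coefficient of eq. [1] and the decibel-based quantities defined in the equations that follow.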
The power in dBm is equal to 10 times the logarithm in base 10 of the power in mW, or

0 dBm = 10 log10(1 mW)   [3]

Then

γ(λ) [km⁻¹] = [(10 log10 P1) − (10 log10 P2)] / (10 L log10 e)   [4]

α(λ) [dB/km] = [P1(dBm) − P2(dBm)] / L   [5]

cut to the cut-back length, for example 2 m from the launching point. The FUT attenuation, between the points where P1 and P2 have been measured, is calculated using P1 and P2, from the definition equations provided above.

Backscattering method. The attenuation coefficient of a singlemode fiber is characterized using bidirectional backscattering measurements. This method is also used for:
curve together with the FUT group index of refraction, ng, as:

Lf = cfs · ΔTf / ng   [7]

cfs is the velocity of light in free space. The bidirectional backscattering curve is obtained using two unidirectional backscattering curves and calculating the average loss between them. The end-to-end FUT attenuation coefficient is obtained from the difference between the two losses divided by the difference of their corresponding distances.

Insertion loss method. The insertion loss method consists of the measurement of the power loss due to the FUT insertion between a launching and a receiving system, previously interconnected (reference condition). The powers P1 and P2 are evaluated in a less straightforward way than in the cut-back method. Therefore, this method is not intended for use in manufacturing. The insertion loss technique is less accurate than the cut-back, but has the advantage of being non-destructive for the FUT. Therefore, it is particularly suitable in the field.

Macrobending Sensitivity

Macrobending sensitivity is the property by which there is a certain amount of light leaking (loss) into the cladding when the fiber is bent and the bending angle is such that the condition of total internal reflection is no longer met at the core-cladding interface. Figure 5 illustrates such a case.
Macrobending sensitivity is a direct function of the
wavelength: the longer the wavelength and/or the
smaller the bending diameter, the more loss the fiber
experiences. It is recognized that 1625 nm is a
wavelength that is very sensitive to macrobending
(see Figure 6).
The macrobending loss is measured by the
power monitoring method (OTDR, especially for
field assessment, see Figure 6) or the cut-back method.
Figure 5 Total internal reflection and macrobending effect on the light rays.

Microbending Sensitivity

Microbending is a fiber property by which the core-cladding concentricity randomly changes along the fiber length. It causes the core to wobble inside the cladding and along the fiber length.
Four methods are available for characterizing microbending sensitivity in optical fibers:

• expandable drum for singlemode fibers and optical fiber ribbons over a wide range of applied linear pressure or loads;
• fixed diameter drum for step-index multimode, singlemode, and ribbon fibers for a fixed linear pressure;
• wire mesh and applied loads for step-index multimode and singlemode fibers over a wide range of applied linear pressure or loads; and
• 'basketweave' wrap on a fixed diameter drum for singlemode fibers.

The results from the four methods can only be compared qualitatively. The test is nonroutine for general evaluation of optical fiber.

140 mm radius loosely constrained, with the rest of the fiber kept essentially straight. The presence of a primary coating on the fiber usually will not affect λfc. However, the presence of a secondary coating may result in λfc being significantly shorter than that of the primary coated fiber.
Cable cut-off wavelength: Cable cut-off wavelength is measured prior to installation on a substantially straight 22 m cable length prepared by exposing 1 m of primary-coated fiber at both ends, the exposed ends each incorporating a 40 mm radius loop. Alternatively, this parameter may be measured on 22 m of primary-coated uncabled fiber in the same configuration as for the λfc measurement.
Jumper cable cut-off wavelength: Jumper cable cut-off wavelength is measured over 2 m with one loop of 76 mm radius, or equivalent (e.g., split mandrel), with the rest of the jumper cable kept essentially straight.
The electric field propagating into the FUT may be simply described as follows:

E(t, z) = E0 sin(ωt − βz)   [8]

ω = 2πc/λ [rad/s] is the angular frequency; β = kn = (ω/c)n, where n is the effective index; as β has units of m⁻¹, it is often referred to as the wavenumber or sometimes the propagation constant; k is the free-space propagation constant. The group delay tg is then given by:

dβ/dω = β1 = tg   [9]

An example of the group delay spectral distribution is shown in Figure 10. Assuming n is not complex, β1 is the first-order derivative of β. The group velocity vg is given by

vg = (dβ/dω)⁻¹ = −(2πc/λ²)(dβ/dλ)⁻¹   [10]

The FUT input–output delay is given by

td = L/vg   [11]

L is the FUT length. The group index of refraction ng is given by

ng = c/vg = n − λ(dn/dλ)   [12]

The dispersion parameter or dispersion coefficient D (ps/(nm·km)) is given by

D = −(ω/λ)(dtg/dω) = −(2πc/λ²)(d²β/dω²) = −(λ/c)(d²n/dλ²)   [13]

d²β/dω² = β2   [14]

An example of the spectral distribution of D obtained from the group delay is shown in Figure 10. β2 (ps²/km) is the group velocity dispersion parameter, so D may be related to β2 as follows:

D = −(ω/λ)β2   [15]

When β2 is positive then D is negative, and vice-versa. The region where β2 is positive is called normal dispersion, while the negative-β2 region is called anomalous dispersion. At λ0, β1 is minimum and β2 = 0, so D = 0.
The dispersion slope S (ps/(nm²·km)), also called the differential dispersion parameter or second-order dispersion, is given by

S = dD/dλ = (ω/λ)²β3 + (2ω/λ²)β2   [16]

β3 = dβ2/dω = d³β/dω³

At λ0, β1 is minimum and β2 = 0, so D = 0; but S is not zero and depends on β3. An example of the spectral distribution of S and S0 is illustrated in Figure 10.
Overall, the general expression of β is given by

β(ω) = β0 + (ω − ω0)β1 + (1/2)(ω − ω0)²β2 + (1/6)(ω − ω0)³β3 + ···   [17]

Figure 11 illustrates the difference between dispersion unshifted fiber (ITU-T Rec. G.652), dispersion shifted fiber (ITU-T Rec. G.653), and nonzero dispersion shifted fiber (ITU-T Rec. G.655).

Test methods for the determination of chromatic dispersion

All methods measure the group delay at specific wavelengths over a range and use agreed fitting functions to evaluate λ0 and S0.
In the phase shift method, the group delay is measured in the frequency domain, by detecting, recording, and processing the phase shift of a sinusoidal modulating signal between a reference, a secondary fiber path, and the channel signal. Setup variants exist and some do not require the secondary reference path. For instance, by using a reference optical filter at the FUT output it is possible to completely decouple the source from the phase-meter. With such an approach, chromatic dispersion may now be measured in the field over very long links using optical amplifiers (see Figure 12).
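The material-dispersion form of eq. [13], D = −(λ/c)(d²n/dλ²), can be evaluated numerically for any refractive-index model. The sketch below uses the widely quoted three-term Sellmeier fit for fused silica — the coefficients are an assumption of this example, not values from the text — with a central-difference second derivative:

```python
import math

# Three-term Sellmeier fit for fused silica (illustrative coefficients).
B = (0.6961663, 0.4079426, 0.8974794)
L_UM = (0.0684043, 0.1162414, 9.896161)   # resonance wavelengths, um

def n_silica(lam_um):
    """Refractive index n(lambda) from the Sellmeier equation."""
    s = sum(b * lam_um**2 / (lam_um**2 - l**2) for b, l in zip(B, L_UM))
    return math.sqrt(1.0 + s)

def dispersion_ps_nm_km(lam_um, h=1e-3):
    """Eq. [13]: D = -(lambda/c) * d2n/dlambda2, returned in ps/(nm km)."""
    d2n = (n_silica(lam_um + h) - 2 * n_silica(lam_um) + n_silica(lam_um - h)) / h**2
    c_um_per_ps = 299.792458                      # speed of light in um/ps
    return -(lam_um / c_um_per_ps) * d2n * 1e6    # convert 1/um^2 to 1/(nm km)

# D is negative (normal dispersion) at short wavelengths and positive
# (anomalous dispersion) at 1.55 um, with the zero crossing near 1.3 um,
# matching the behavior sketched in the text.
print(dispersion_ps_nm_km(1.1) < 0 < dispersion_ps_nm_km(1.55))
```

The sign change reproduces the normal/anomalous distinction drawn above from β2, since D and β2 have opposite signs per eq. [15].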
Figure 10 Relation between the pulse (group) delay and the (chromatic) dispersion.
Figure 12 Test results for field-related chromatic dispersion using the phase-shift technique.
In the differential phase shift method, two detection systems are used together with two wavelength sources at the same time. In this case the chromatic dispersion may be determined directly from the two group delays. This technique usually offers faster and more reliable results, but costs much more than the phase-shift technique, which is usually preferred.
In the interferometric method, the group delay between the FUT and a reference path is measured by a Mach–Zehnder interferometer. The reference delay line may be an air path or a singlemode fiber standard reference material (SRM). The method can be used to determine the following characteristics:

• longitudinal chromatic dispersion homogeneity; and
• effect of overall or local influences, such as temperature changes and macrobending losses.

In the pulse delay method, the group delay is measured in the time domain, by detecting, recording, and processing the delay experienced by pulses at various wavelengths.

Polarization Mode Dispersion

Polarization mode dispersion (PMD) causes an optical pulse to spread in the time domain and may impair the performance of a telecommunications system. The effect can be related to differential phase and group velocities and the corresponding arrival times of different polarization components of the pulse signal. For a sufficiently narrowband source, the effect can be related to a differential group delay (DGD), Δτ, between a pair of orthogonally polarized principal states of polarization (PSP) at a given wavelength or optical frequency (see Figure 13a).
Figure 13 PMD effect on the pulse broadening and its possible pulse impairment. (a) The pulse is spread by the DGD Δτ but the bit rate is too low to create an impairment. (b) The pulse is spread by the DGD Δτ and the bit rate is high enough to create an impairment. (c) The DGD Δτ is large enough even at low bit rate to spread the pulse and create an impairment.
In an ideal circularly symmetric fiber, the two PSPs propagate with the same velocity. However:

• a real fiber is not perfectly circular;
• the core is not perfectly concentric with the cladding;
• the core may be subjected to microbending;
• the core may present localized clusters of dopants; and
• the environmental conditions may stress the deployed cable and affect the fiber.

Each of these asymmetries subjects the fiber to local stresses and consequently to birefringence. These asymmetry characteristics vary randomly along the fiber and in time, leading to the statistical behavior of PMD.
For a deployed cabled fiber at a given time and optical frequency, there always exist two PSPs such that the pulse spreading due to PMD vanishes if only one PSP is excited. On the contrary, the maximum pulse spread due to PMD occurs when both PSPs are equally excited, and is related to the difference in their group delays, the DGD associated with the two PSPs.
For broadband transmission, the DGD varies statistically as a function of wavelength or frequency, resulting in an output pulse that is spread out in the time domain (see Figures 13a–c). In this case, the spreading can be related to the RMS (root mean square) of the DGD values, ⟨Δτ²⟩^(1/2). However, if a known distribution such as the Maxwell distribution can be fitted to the DGD probability distribution, then a mean (or average) value of the DGD, ⟨Δτ⟩, may be correlated to the RMS value and used as a system performance predictor, in particular together with a maximum value of the DGD distribution associated with a low probability of occurrence. This maximum DGD may then be used to define the quality of service that tolerates values below this maximum.

Test methods for polarization mode dispersion
Three methods are generically used for measuring PMD. Other methods or analyses may exist, but they are generally not standardized or are limited in their applications. All methods use a linearly polarized source at the FUT input and are suitable for laboratory measurements of factory lengths of fiber and cable. However, the interferometric method is the only one appropriate for measurements of cabled fiber that may be moving or vibrating, as found in the field.

Stokes parameter evaluation. SPE determines PMD by measuring the response to a change of narrowband light across a wavelength range, either from a tuneable light source with a broadband detector (JME, Jones matrix eigenanalysis) or from a broadband source with a filtered detector such as an interferometer (PSA, Poincaré sphere analysis). The Stokes vector of the output light is measured at each wavelength. The change of these Stokes vectors with angular optical frequency ω (i.e., with wavelength), and with the change in input SOP (state of polarization), yields the DGD as a function of wavelength.
For both JME and PSA analyses, three distinct and known linear SOPs (orthogonal on the Poincaré sphere) must be launched for each wavelength. Figure 14 illustrates the test setup and examples of test results. The JME and PSA methods are mathematically equivalent.

Fixed analyzer. FA determines PMD by measuring the response to a change of narrowband light across a known wavelength range. For each SOP, the change in output power that is filtered through a fixed polarization analyzer, relative to the power detected without the analyzer, is measured as a function of wavelength. Figure 15 illustrates a test setup and examples of test results.
The resulting measured function can be analyzed in one of two ways:

• by counting the number of peaks and valleys of the curve (extrema counting, EC) and applying a formula; this analysis is considered a frequency-domain approach; and
• by taking the Fourier transform (FT) of the measured function; this FT is equivalent to the pulse spreading obtained by TINTY.
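The relation between the mean and RMS DGD under the Maxwell assumption, ⟨Δτ⟩ = √(8/3π) ⟨Δτ²⟩^(1/2) ≈ 0.92 ⟨Δτ²⟩^(1/2), is easy to confirm numerically. The sketch below (illustrative code, not part of any standardized test method) draws Maxwellian DGD samples as the norm of a three-component Gaussian vector:

```python
import math
import random

random.seed(42)

# A Maxwell-distributed DGD sample is the norm of a 3D zero-mean Gaussian vector.
def dgd_sample(sigma=1.0):
    return math.sqrt(sum(random.gauss(0.0, sigma) ** 2 for _ in range(3)))

samples = [dgd_sample() for _ in range(200_000)]
mean_dgd = sum(samples) / len(samples)
rms_dgd = math.sqrt(sum(s * s for s in samples) / len(samples))

# Theory for the Maxwell distribution: mean/RMS = sqrt(8 / (3*pi)) ~ 0.921.
ratio = mean_dgd / rms_dgd
```

The ratio is what allows a measured RMS DGD to be converted into the mean DGD quoted as a system performance predictor.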
pattern interferogram. Two analyses are available to obtain the PMD:

• TINTY (the traditional interferometric analysis) uses a set of specific operating conditions for its successful application, and a basic setup; and
• GINTY (the general interferometric analysis) uses no limiting operating conditions but, in addition to the same basic setup, also uses a modified setup compared to TINTY.

Figure 16 illustrates the test setup for both approaches and examples of test results.

Polarization Crosstalk

Polarization crosstalk is a characteristic of the energy mixing/transfer/coupling between the two PSPs in a PMF (polarization maintaining fiber) when their isolation is imperfect. It is the measure of the strength of mode coupling, or of the output power ratio, between the PSPs within a PMF. A PMF is an optical fiber capable of transmitting, under external perturbations such as bending or lateral pressures, both the HE11x and HE11y polarization modes, whose electric field vector directions are orthogonal to each other and which have different propagation constants βx and βy.
Two methods are available for measuring the polarization crosstalk of a PMF:

• the power ratio method, which uses the maximum and minimum values of the output power at a specified wavelength, and is applicable to fibers and connectors jointed to a PMF, and to two or more PMFs joined in series; and
• the in-line method, which uses an analysis of the Poincaré sphere, and is applicable to single or cascaded sections of PMF, and to PMF interconnected with optical devices.

Nonlinear Effects

When the power of the transmission signal is increased to achieve longer span lengths at high bit
The pulse will then broaden with negative (normal) dispersion and shorten with positive (anomalous) dispersion. SPM may then be used for dispersion compensation, considering that self-phase modulation imposes C > 0 (positive chirp); it can cancel the dispersion if properly managed as a function of the sign of the dispersion. SPM is one of the most critical nonlinear effects for the propagation of solitons or very short pulses over very long distances.

Cross-phase modulation
XPM is the effect that a powerful pulse has on the phase of an adjacent pulse from another WDM system channel traveling at the same or slightly different group velocity. It concerns spectral interference between two WDM channels:

• the increasing I(t) at the leading edge of the interfering pulse shifts the other pulse to longer wavelengths; and
• the decreasing I(t) at the trailing edge of the interfering pulse shifts the other pulse to shorter wavelengths.

This produces spectral broadening, which dispersion converts to temporal broadening, depending on the sign of the dispersion. The XPM effect is similar to SPM, except that it depends on the channel count.

Four-wave (four-photon) mixing
FWM is a by-product effect of two or more WDM channels. For two channels, I(t) modulates the phase of each signal (ω1 and ω2). An intensity modulation appears at the beat frequency ω1 − ω2. Two sideband frequencies are created in a way similar to harmonic generation. New wavelengths are created in a number equal to N²(N − 1)/2, where N is the number of original wavelengths.

Fiber Dimension Characteristics and Corresponding Test Methods

Table 1 provides a list of the various fiber dimensional characteristics and their corresponding test methods.

Table 1  Fiber dimensional characteristics

Fiber Geometry Characteristics

The fiber geometry is related to the core and cladding characteristics.

Core
The core center is the center of the circle that best fits the points at a constant level in the near-field intensity profile emitted from the central region of the fiber, using wavelengths above and/or below the cut-off wavelength.
The RIP can be measured by the refracted near-field (RNF) or transverse interferometry techniques, and by the transmitted near field (TNF).
The core concentricity error is the distance between the core center and the cladding center. This definition applies very well to multimode fibers. The distance between the center of the near-field profile and the center of the cladding is also used for singlemode fibers.
The mode field diameter (MFD) represents a measure of the transverse electromagnetic field intensity of the mode in a fiber cross-section, and is defined from the far-field intensity distribution. The MF is the singlemode field distribution of the LP01 mode, giving rise to a spatial intensity distribution in the fiber. The MF concentricity error is the distance between the MF center and the cladding center.
The core noncircularity is a measure of the core ellipticity. This parameter is one of the causes of birefringence in the fiber, and consequently of PMD.

Cladding
The cladding is the outermost region of constant refractive index in the fiber cross-section.
The cladding center is the center of the circle best fitting the outer limit (boundary) of the cladding. The cladding diameter is the diameter of the circle defining the cladding center.
The cladding noncircularity is a measure of the difference between the diameters of the two circles defined by the cladding tolerance field, divided by the nominal cladding diameter.
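The FWM product count N²(N − 1)/2 quoted above can be verified by directly enumerating the mixing combinations fi + fj − fk, with the pair (i, j) taken unordered (i may equal j) and k distinct from both. A minimal sketch:

```python
def fwm_product_count(n):
    """Count FWM mixing products f_i + f_j - f_k for n channels:
    (i, j) unordered (i may equal j), k different from both i and j."""
    count = 0
    for i in range(n):
        for j in range(i, n):
            for k in range(n):
                if k != i and k != j:
                    count += 1
    return count

# Matches N^2 (N - 1) / 2: two sidebands for two channels, and a count that
# grows roughly with the cube of the channel count.
counts = {n: fwm_product_count(n) for n in (2, 3, 8)}
```

Note that on an equally spaced channel grid many of these products coincide in frequency, which is precisely why they fall on top of existing WDM channels and cause interference.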
Measurement of the fiber geometrical attributes
The fiber geometry is measured by the methods described below. Test instrumentation may incorporate two or more methods, such as the one shown in Figure 18.

Transmitted near-field technique. The cladding diameter, core concentricity error, and cladding noncircularity are determined from the near-field intensity distribution. Figure 19 provides a series of examples of test results from TNF measurements.

Refracted near-field technique. The RIP across the entire fiber (core and cladding) can be directly obtained from the RNF measurement, as shown in Figure 20. The geometrical characteristics of the fiber can then be obtained from the refractive index distribution using suitable algorithms; Figure 21 illustrates the core geometry.

Side-view technique/transverse interference. The side-view method is applied to singlemode fibers to determine the core concentricity error, cladding diameter, and cladding noncircularity by measuring the intensity distribution of light that is refracted inside the fiber. The method is based on an interference microscope focused on the side view of an FUT illuminated perpendicular to the FUT axis. The fringe pattern is used to determine the RIP.
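One example of such an algorithm is a least-squares circle fit applied to contour points extracted from the measured profile, from which a center (e.g., the core or cladding center) and a diameter follow. The sketch below uses the algebraic (Kåsa) fit on synthetic, hypothetical contour points; this particular algorithm is an illustration, not one prescribed by the measurement standards:

```python
import math

def fit_circle(points):
    """Kasa algebraic circle fit: least-squares solution of
    x^2 + y^2 = 2a*x + 2b*y + c, giving center (a, b) and c = r^2 - a^2 - b^2."""
    # Accumulate the 3x3 normal equations M p = v for p = (a, b, c).
    M = [[0.0] * 3 for _ in range(3)]
    v = [0.0] * 3
    for x, y in points:
        row = (2 * x, 2 * y, 1.0)
        z = x * x + y * y
        for i in range(3):
            v[i] += row[i] * z
            for j in range(3):
                M[i][j] += row[i] * row[j]
    # Gaussian elimination with partial pivoting.
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        v[col], v[piv] = v[piv], v[col]
        for r in range(col + 1, 3):
            f = M[r][col] / M[col][col]
            for j in range(col, 3):
                M[r][j] -= f * M[col][j]
            v[r] -= f * v[col]
    p = [0.0] * 3
    for r in (2, 1, 0):  # back substitution
        s = v[r] - sum(M[r][j] * p[j] for j in range(r + 1, 3))
        p[r] = s / M[r][r]
    a, b, c = p
    return (a, b), math.sqrt(c + a * a + b * b)

# Synthetic cladding contour: a 62.5/2 um-radius circle whose center is
# offset by a hypothetical (1.0, -0.5) um from the reference origin.
pts = [(1.0 + 31.25 * math.cos(t), -0.5 + 31.25 * math.sin(t))
       for t in (2 * math.pi * k / 12 for k in range(12))]
(cx, cy), r = fit_circle(pts)
```

With real data the contour points are noisy, and the residuals of the fit feed directly into the noncircularity figures defined above.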
TNF image technique. The TNF image technique, also called near-field light distribution, is used for the measurement of the geometrical characteristics of singlemode fibers. The measurement is based on analysis of magnified images at the FUT output. Two subsets of the method are available:

• the grey-scale technique, which performs an x–y near-field scan using a video system; and
• the single near-field scanning technique, which performs a one-dimensional scan.

Mechanical diameter. This is a precision mechanical diameter measurement technique used to accurately determine the cladding diameter of silica fibers. The technique uses an electronic micrometer, such as one based on a double-pass Michelson interferometer. The technique is used for providing calibrated fibers to the industry as SRMs.

Numerical Aperture

The NA is an important attribute for multimode fibers in order to predict their launching efficiency, joint loss at splices, and micro/macrobending characteristics. A method is available for the measurement of the angular radiant intensity (far-field) distribution or the RIP at the output of an FUT. NA can be determined from the analysis of the test results.
Among the test methods for the mode field diameter:

• bidirectional backscattering uses an OTDR and bidirectional measurements to determine the MFD by comparing the FUT results with a reference fiber.
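For a step-index multimode fiber, the theoretical NA also follows directly from the core and cladding refractive indices, NA = (n1² − n2²)^(1/2), which fixes the acceptance half-angle in air. A minimal sketch with assumed index values (not taken from this article):

```python
import math

# Hypothetical step-index multimode fiber indices (illustrative values).
n_core, n_clad = 1.480, 1.460

na = math.sqrt(n_core**2 - n_clad**2)    # numerical aperture
theta_acc = math.degrees(math.asin(na))  # acceptance half-angle in air, degrees
```

For these values NA is about 0.24 and the acceptance half-angle about 14°, which is representative of common 50/125 μm graded-index fiber specifications.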
Residual stress is the stress built up by the thermal expansion difference between core and cladding during the fiber drawing process or splicing. Methods are available for measuring residual stress based on polarization effects. A light beam produced by a rotating polarizer propagates in the x-axis direction, while the beam polarization is in the y–z plane and the fiber longitudinal axis lies along the z-axis. The light experiences different phase shifts along the y- and z-axes due to the FUT birefringence. The photoelastic effect gives the relationship between this phase change and the residual stresses.

Stress Corrosion Susceptibility

The stress corrosion susceptibility is related to the dependence of crack growth on applied stress. It depends on the environmental conditions, and static and dynamic values may be observed.

Table 2  Fiber mechanical characteristics

Attribute
Proof stress
Residual stress
Stress corrosion susceptibility
Tensile strength
Stripability
Fiber curl

Environmental Characteristics

Table 3 lists the fiber characteristics related to the effect of the environment.
Nuclear radiation is considered on the basis of the steady-state response of optical fibers and cables exposed to gamma radiation, in order to determine the level of related radiation-induced attenuation produced in singlemode or multimode, cabled or uncabled, fibers. The fiber attenuation generally increases when exposed to gamma radiation. This is primarily due to the trapping of radiolytic electrons and holes at defect sites in the glass (i.e., the formation of 'color centers'). Two regimes are considered:

• the low dose rate, suitable for estimating the effect of environmental background radiation; and
• the high dose rate, suitable for estimating the effect of adverse nuclear environments.

The effects of environmental background radiation are measured by the attenuation (cut-back) method. The effects of adverse nuclear environments are tested by power monitoring before, during, and after FUT exposure.

See also

Fiber and Guided Wave Optics: Nonlinear Effects (Basics). Imaging: Interferometric Imaging. Interferometry: Gravity Wave Detection; Overview. Polarization: Introduction. Scattering: Scattering Phenomena in Optical Fibers.

Further Reading

Agrawal GP (1997) Fiber-Optic Communication Systems, 2nd edn. New York: John Wiley & Sons.
FIBER AND GUIDED WAVE OPTICS / Nonlinear Effects (Basics) 467
Agrawal GP (2001) Nonlinear Fiber Optics, 3rd edn. San Diego, CA: Academic Press.
Girard A (2000) Guide to WDM Technology and Testing. Quebec: EXFO, pp. 194.
Hecht J (1999) Understanding Fiber Optics, 3rd edn. Upper Saddle River, NJ: Prentice Hall.
Masson B and Girard A (2004) FTTx PON Guide, Testing Passive Optical Networks. Quebec: EXFO, pp. 56.
Miller JL and Friedman E (2003) Optical Communications Rules of Thumb. New York: McGraw-Hill, pp. 428.
Neumann E-G (1988) Single-Mode Fibers, Fundamentals. Berlin: Springer-Verlag.
Introduction

Many physical systems in various areas, such as condensed matter or plasma physics, biological sciences, or optics, give rise to localized large-amplitude excitations having a relatively long lifetime. Such excitations lead to a host of phenomena referred to as nonlinear phenomena. Of the many disciplines of physics, the optics field is probably the one in which practical applications of nonlinear phenomena have been the most fruitful, in particular since the discovery of the laser in 1960. This discovery has thus led to the advent of a new branch in optics, referred to as nonlinear optics. The applications of nonlinear phenomena in optics include the design of various kinds of laser sources, optical amplifiers, light converters, and light-wave communication systems for data transmission purposes, to name a few.
In this article, we present an overview of some basic principles of nonlinear phenomena that result from the interaction of light waves with dielectric waveguides such as optical fibers. These nonlinear phenomena can be broadly divided into two main categories, namely parametric effects and scattering phenomena. Parametric interactions arise whenever the state of the dielectric matter is left unchanged by the interaction, whereas scattering phenomena imply transitions between energy levels in the medium. More fundamentally, parametric interactions originate from the electron motion under the electric field of a light wave, whereas scattering phenomena originate from the motion of heavy ions (or molecules).

R1 = a1E1    [1]

where a1 is a constant. This type of behavior corresponds to the so-called linear response. In general, a physical system executes a linear response when the superposition of two (or more) input signals E1 and E2 yields a response which is a superposition of the output signals, as schematically represented in Figure 1:

R = a1R1 + a2R2    [2]

Now, in almost all real physical systems, if the amplitude of an excitation E1 becomes sufficiently large, distortions will occur in the output signals. In other words, the response of the system will no longer be proportional to the excitation and, consequently, the law of superposition of states will no longer be observed. In this case, the response of the system may take the following form:

R = a1E1 + a2E1² + a3E1³ + …    [3]

which involves not only a signal at the input frequency ω, but also signals at frequencies 2ω, 3ω, and so on. Thus, harmonics of the input signal are generated. This behavior, called nonlinear response, is at the origin of a host of important phenomena in many
branches of sciences, such as in condensed matter physics, biological sciences, or in optics.

Physical Origin of Optical Nonlinearity

Optical nonlinearity originates fundamentally from the action of the electric field of a light wave on the charged particles of a dielectric waveguide. In contrast to conductors, where charges can move throughout the material, dielectric media consist of bound charges (ions, electrons) that can execute only relatively limited displacements around their equilibrium positions. The electric field of an incoming light wave will cause the positive charges to move in the polarization direction of the electric field, whereas negative charges will move in the opposite direction. In other words, the electric field will cause the charged particles to become dipoles, as schematically represented in Figure 2.
Then each dipole will vibrate under the influence of the incoming light, thus becoming a source of radiation. The global light radiated by all the dipoles represents the scattered light. When the charge displacements are proportional to the excitation (incoming light), i.e., in the low-amplitude limit of scattered radiation, the output light vibrates at the same frequency as that of the excitation. This process corresponds to Rayleigh scattering. On the other hand, if the intensity of the excitation is sufficiently large to induce displacements that are not negligible with respect to atomic distances, then the charge displacements will no longer be proportional to the excitation. In other words, the response of the medium, which is no longer proportional to the excitation, becomes nonlinear. In this case, the scattered waves will be generated not only at the excitation frequency, say ω (the Kerr effect), but also at frequencies that differ from ω (e.g., 2ω, 3ω). On the other hand, it is worth noting that when all induced dipoles vibrate coherently (that is, their relative phase does not vary randomly), their individual radiation may, under certain conditions, interfere constructively and lead to a global field of high intensity. The condition of constructive interference is the phase-matching condition.
In practice, the macroscopic response of a dielectric is given by the polarization, which corresponds to the total amount of dipole moment per unit volume of the dielectric. As the mass of an ion is much larger than that of an electron, the amplitude of the ion motion is generally negligible with respect to that of the electrons. As a consequence, the electron motion generally leads to the dominant contribution in the macroscopic properties of the medium. The behavior of an electron under an optical electric field is similar to that of a particle embedded in an anharmonic potential. A very simple model (called the Lorentz model) that provides a deep insight into the dielectric response consists of an electron of mass m and charge −e connected to an ion by an elastic spring (see Figure 2). Under the electric field E(t), the electron executes a displacement x(t) with respect to its equilibrium position, which is governed by the following equation:

d²x/dt² + 2λ(dx/dt) + ω0²x + (a(2)x² + a(3)x³ + ···) = −(e/m)E(t)    [4]

where a(2), a(3), and so on, are constant parameters, ω0 is the resonance angular frequency of the electron, and λ is the damping coefficient resulting from the dipolar radiation. When the amplitude of the electric field is sufficiently large, the restoring force on the electron becomes a nonlinear function of x; hence the presence of terms such as a(2)x², a(3)x³, and so on, in eqn [4]. In this situation, the macroscopic response of the dielectric is the polarization

P = −e Σ x(ω, 2ω, 3ω, …)    [5]

where x(ω, 2ω, 3ω, …) is the solution of eqn [4] in the frequency domain, and the summation extends over all the dipole moments per unit volume. In terms of the electric field E, the polarization may be written as

P = ε0(χ(1)E + χ(2)E² + χ(3)E³ + ···)    [6]

where χ(1), χ(2), χ(3), and so on, represent the susceptibility coefficients. Figure 3 (top left) illustrates schematically the polarization as a function of the electric field.
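A quick numerical illustration of eqn [6]: the sketch below (plain Python, with made-up susceptibility values and ε0 set to 1) drives P = χ(1)E + χ(2)E² + χ(3)E³ with E = cos(ωt) and projects the result onto cos(nωt). The quadratic and cubic terms produce components at 2ω and 3ω that vanish when χ(2) = χ(3) = 0:

```python
import math

def harmonic_amplitude(chi1, chi2, chi3, n, samples=4096):
    """Fourier cosine coefficient of P(t) at the n-th harmonic of the drive,
    for P = chi1*E + chi2*E^2 + chi3*E^3 with E(t) = cos(t)."""
    total = 0.0
    for k in range(samples):
        t = 2 * math.pi * k / samples
        e = math.cos(t)
        p = chi1 * e + chi2 * e * e + chi3 * e ** 3
        total += p * math.cos(n * t)
    return 2 * total / samples  # (2/T) * integral over one period

# Linear medium: only the fundamental is radiated (Rayleigh regime).
lin = [harmonic_amplitude(1.0, 0.0, 0.0, n) for n in (1, 2, 3)]
# Nonlinear medium: the E^2 and E^3 terms radiate at 2w and 3w as well.
nl = [harmonic_amplitude(1.0, 0.5, 0.2, n) for n in (1, 2, 3)]
```

Analytically, cos²t contributes χ(2)/2 at 2ω and cos³t contributes χ(3)/4 at 3ω, which is exactly what the projection recovers.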
Figure 2  Electric dipoles in a dielectric medium under an external electric field.

In particular, one can clearly observe that when the amplitude of the incoming electric field is sufficiently small (bottom left in Figure 3), the polarization (top right) is proportional to the electric field (bottom left), thus implying that the electric dipole radiates a wave having the same frequency as that of the incoming light (bottom right). On the other hand, Figure 4
shows that for an electric field of large amplitude, the polarization is no longer proportional to the electric field, leading to radiation at harmonic frequencies (see Figure 4, bottom right).
This nonlinear behavior leads to a host of important phenomena in optical fibers, which are useful for many optical systems but detrimental for other systems.

Figure 3  Polarization induced by an electric field of small amplitude. Nonlinear dependence of the polarization as a function of field amplitude (top left); time dependence of the input electric field (bottom left); time dependence of the induced polarization (top right); and corresponding intensity spectrum (bottom right).

Parametric Phenomena in Optical Fibers

In anisotropic materials, the leading nonlinear term in the polarization, i.e., the χ(2) term, leads to phenomena such as harmonic generation or optical rectification. This χ(2) term vanishes in homogeneous isotropic materials such as cylindrical
optical fibers, and there the leading nonlinear term becomes the χ(3) term. Thus, most of the outstanding nonlinear phenomena in optical fibers originate from the third-order nonlinear susceptibility χ(3). Some of those phenomena are described below.

Optical Kerr Effect

The optical Kerr effect is probably the most important nonlinear effect in optical fibers. This effect induces an intensity dependence of the refractive index, which leads to a vast wealth of fascinating phenomena such as self-phase modulation (SPM), cross-phase modulation (CPM), four-wave mixing (FWM), modulational instability (MI), or optical solitons. The Kerr effect can be conveniently described in the frequency domain through a direct analysis of the polarization, which takes the following form:

PNL(ω) = (3/4)ε0 χ(3)(ω)|E(ω)|²E(ω)    [7]

The constant 3/4 comes from the symmetry properties of the tensor χ(3). Setting PNL(ω) = ε0 εNL E(ω), where εNL = (3/4)χ(3)|E|² is the nonlinear contribution to the dielectric constant, the total polarization takes the form

P(ω) = PL + PNL = ε0[χ(1)(ω) + εNL]E(ω)    [8]

As eqn [8] shows, the refractive index n, at a given frequency ω, is given by

n² = 1 + χ(1) + εNL = (n0 + ΔnNL)²    [9]

with n0² = 1 + χ(1). In practice ΔnNL ≪ n0, and then the refractive index is given by

n(ω, |E|²) = n0(ω) + n2e|E|²    [10]

where n2e is the nonlinear refractive index defined by n2e = 3χ(3)/(8n0). The linear polarization PL is responsible for the frequency dependence of the refractive index, whereas the nonlinear polarization PNL causes an intensity dependence of the refractive index, which is referred to as the optical Kerr effect. Knowing that the wave intensity I is given by I = a|E|², with a = (1/2)ε0cn0, the refractive index can then be rewritten as

n(ω, I) = n0(ω) + n2I    [11]

with n2 = n2e/a = 2n2e/(ε0cn0). For fused silica fibers one has typically n2 = 2.66 × 10⁻²⁰ m² W⁻¹. For example, an intensity of I = 1 GW cm⁻² leads to ΔnNL = 2.66 × 10⁻⁷, which is much smaller than n0 ≈ 1.45.

Four-Wave Mixing

The four-wave mixing (FWM) process is a third-order nonlinear effect in which four waves interact through an energy exchange process. Let us consider two intense waves, E1(ω1) and E2(ω2), with ω2 > ω1, called pump waves, propagating in an optical fiber. Hereafter we consider the simplest case, when the waves propagate with the same polarization. In this situation the total electric field is given by

Etot(r, t) = E1 + E2 = A1(ω1) exp[i(k1·r − ω1t)] + A2(ω2) exp[i(k2·r − ω2t)]    [12]

where k1 and k2 are the wavevectors of the fields E1 and E2, respectively. Equation [7], which gives the nonlinear polarization induced by a single monochromatic wave, remains valid provided that the frequency spacing between the two waves is relatively small, i.e., |Δω| = |ω2 − ω1| ≪ ω0 = (ω1 + ω2)/2. In this context, eqn [7] leads to

PNL ≈ (3/4)ε0 χ(3)(ω0)|Etot|²Etot    [13]

Substituting eqn [12] into eqn [13] yields

PNL = 2n0n2ε0[(|E1|² + 2|E2|²)E1 + (|E2|² + 2|E1|²)E2 + E1²E2* + E2²E1*]    [14]

The term |E1|²E1 on the right-hand side of eqn [14] represents the self-induced Kerr effect on the wave ω1. The term |E2|²E1, which corresponds to a modification of the refractive index seen by the wave ω1 due to the presence of the wave ω2, represents the cross-Kerr effect. Thus, the refractive index at frequency ω1 depends simultaneously on the intensities of the two pumps, i.e., n = n(ω1, |E1|², |E2|²). Similarly, the third and fourth terms on the right-hand side of eqn [14] illustrate the self-induced and cross-Kerr effects on the wave ω2, respectively. Note that the self-induced Kerr effect is responsible for self-phase modulation, which induces a spectral broadening of a pulse propagating through the optical fiber, whereas the cross-Kerr effect leads to cross-phase modulation, which induces a spectral broadening of one wave in the presence of a second wave. The two last terms on the right-hand side of eqn [14] correspond to the generation of new waves at frequencies 2ω1 − ω2 and 2ω2 − ω1, respectively. The wave with the smallest frequency, 2ω1 − ω2 = ωs, is called the Stokes wave, whereas the wave with the highest frequency, 2ω2 − ω1 = ωas, is called the anti-Stokes wave. These two newly generated waves interact
Conclusion

Optical fibers constitute a key device for many areas of optical science, and in particular for ultrafast optical communications. The nonlinear phenomena that arise through the nonlinear refractive index change (induced mainly by the Kerr effect) have been extensively investigated over the last two decades for applications such as all-optical wavelength conversion, parametric amplification, and the generation of new optical frequencies or ultrahigh repetition rate pulse trains. However, in many other optical systems, such as optical communication systems, these nonlinear phenomena become harmful effects, and in this case control processes are being developed in order to suppress, or at least reduce, their impact on the system.
Nonlinear Optics

K Thyagarajan and A K Ghatak, Indian Institute of Technology, New Delhi, India

© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

The invention of the laser provided us with a light source capable of generating extremely large optical power densities (several MW/m²). At such large power densities, matter behaves in a nonlinear fashion and we come across new optical phenomena such as second-harmonic generation (SHG), sum and difference frequency generation, intensity-dependent refractive index, mixing of various frequencies, etc. In SHG, an incident light beam at frequency ω interacts with the medium and generates a new light wave at frequency 2ω. In sum and difference frequency generation, two incident beams at frequencies ω1 and ω2 mix with each other, producing sum (ω1 + ω2) and difference (ω1 − ω2) frequencies at the output. Higher-order nonlinear effects, such as self-phase modulation, four-wave mixing, etc., can also be routinely observed today. The field of nonlinear optics dealing with such nonlinear interactions is gaining importance due to numerous demonstrated applications in many diverse areas such as optical fiber communications, all-optical signal processing, the realization of novel sources of optical radiation, etc.
Nonlinear optical interactions become prominent when the optical power densities are high and the interaction takes place over long lengths. The usual method to increase optical intensity is to focus the light beam using a lens system. For a given optical power, the tighter the focusing, the larger will be the intensity, but the greater will be the divergence of the beam. Thus tighter focusing produces larger intensities, but over shorter interaction lengths (see Figure 1a). Optical waveguides, in which the light beam is confined to a small cross-sectional area, are currently being explored for realizing efficient nonlinear devices. In contrast to bulk media, in waveguides diffraction effects are balanced by waveguiding, and the beam can have small cross-sectional areas over much longer interaction lengths (see Figure 1b). A simple optical waveguide consists of a high-index dielectric medium surrounded by a lower-index dielectric medium, so that light waves can be trapped in the high-index region by the phenomenon of total internal reflection. Figure 2 shows a planar waveguide, a channel waveguide, and an optical fiber. In the planar waveguide a film of refractive index nf is deposited/diffused on a substrate of refractive index ns and has a cover of refractive index nc (with ns, nc < nf). The waveguide has typical cross-sectional dimensions of a few micrometers. Unlike planar waveguides, in which guidance takes place only in one dimension, in channel waveguides the diffused region in a substrate is surrounded on all sides by lower-index media. Optical fibers are structures with cylindrical symmetry and have a central cylindrical core of doped silica surrounded by a concentric cylindrical cladding of pure silica, which has a slightly lower refractive index. Light guidance in all these waveguides takes place through the phenomenon of total internal reflection.
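The focusing trade-off described above has a neat quantitative form: for a Gaussian beam, the peak intensity scales as 2P/(πw0²) while the usable interaction length (the confocal parameter) scales as 2πw0²/λ, so their product is independent of the spot size; a guided mode escapes this limit. The sketch below illustrates this with assumed, representative numbers (1 mW power, 1 μm wavelength, 25 μm² mode area, 1 km of fiber):

```python
import math

P = 1e-3      # optical power, W (assumed)
lam = 1.0e-6  # wavelength, m (assumed)

def bulk_figure(w0):
    """Intensity-length product for a Gaussian focus of waist w0:
    peak intensity 2P/(pi w0^2) times confocal parameter 2 pi w0^2 / lam."""
    intensity = 2 * P / (math.pi * w0 ** 2)
    length = 2 * math.pi * w0 ** 2 / lam
    return intensity * length  # reduces to 4P/lam, independent of w0

# Waveguide: intensity P/Aeff is maintained over the full device length.
Aeff = 25e-12   # ~25 um^2 mode area (figure quoted in the text)
L_fiber = 1e3   # 1 km of fiber (assumed)
fiber_figure = (P / Aeff) * L_fiber
```

Whatever waist is chosen, the bulk figure stays at 4P/λ, while even a modest length of fiber multiplies the intensity-length product by many orders of magnitude, which is the whole case for waveguide nonlinear optics.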
Figure 1  (a) In bulk media, tighter focusing produces larger intensities, but over shorter interaction lengths. (b) In optical waveguides, diffraction effects are balanced by waveguiding and the interaction lengths are much larger.

In contrast to bulk interactions requiring beam focusing, in the case of optical waveguides the beam can be made to have extremely small cross-sectional
areas (~25 μm²) over very long interaction lengths (~20 mm in integrated optical waveguides, and tens of thousands of kilometers in the case of optical fibers). This leads to greatly increased nonlinear interaction efficiencies even at moderate powers (approximately a few tens of mW).
In the following sections, we will discuss some of the important nonlinear interactions that are being studied with potential applications in various branches of science and engineering. Obviously it is impossible to cover all aspects of nonlinear optics – several books have been written in this area (see the Further Reading section at the end of this article) – so what we will do is discuss the physics of some of the important nonlinear effects.
Apart from intensity and length of interaction, one of the most important requirements for many efficient nonlinear interactions is phase matching. The concept of phase matching can be easily understood from the point of view of SHG. In SHG, the incident wave at frequency ω generates a nonlinear polarization at frequency 2ω, and this nonlinear polarization is responsible for the generation of the wave at 2ω. Now, the phase velocity of the nonlinear polarization wave at 2ω is the same as the phase velocity of the electromagnetic wave at freq-

Here χ(2), χ(3), … are the higher-order susceptibilities in the polarization expansion, giving rise to the nonlinear terms. The second term on the right-hand side is responsible for SHG, sum and difference frequency generation, parametric interactions, etc., while the third term is responsible for third-harmonic generation, intensity-dependent refractive index, self-phase modulation, four-wave mixing, etc. For media possessing inversion symmetry, χ(2) is zero and there is no second-order nonlinear effect. Thus silica optical fibers, which form the heart of today's communication networks, do not possess the second-order nonlinearity.
We will first discuss second-harmonic generation and parametric amplification, which arise due to the second-order nonlinear term, and then go on to effects due to third-order nonlinear interaction.

Second-Harmonic Generation in Crystals

The first demonstration of SHG was made in 1961 by focusing a 3 kW ruby laser pulse (λ0 = 6943 Å) on a quartz crystal. An incident beam from a ruby laser (red color), after passing through a crystal of KDP, gets partly converted into ultraviolet light, which is the second harmonic. Ever since, SHG has been one of the most
uency v; which is usually different from the phase widely studied nonlinear interactions.
velocity of the electromagnetic wave at frequency 2v; We consider a plane wave of frequency v propagat-
this happens due to wavelength dispersion in the ing along the z-direction through a medium and
medium. When the two phase velocities are unequal, consider the generation of the second-harmonic
the polarization wave at 2v (which is the source) and frequency 2v as the beam propagates through the
the electromagnetic wave at 2v pass in and out of medium. Now, the field at v generates a polarization
phase with each other as they propagate through the at 2v; which acts as a source for the generation of an
medium. Due to this, the energy flowing in from v to electromagnetic wave at 2v: Corresponding to the
2v cannot add constructively and the efficiency of frequencies v and 2v; the electric fields are assumed
second-harmonic generation is limited. If the phase to be given by
velocities of the waves at v and 2v are matched then
the polarization wave and the wave at 2v remain in 1
EðvÞ ¼ ðE ðzÞeiðvt2k1 zÞ þ c:c:Þ ½3
phase leading to drastically increased efficiencies. This 2 1
condition is referred to as phase matching and plays a
and
very important role in most nonlinear interactions.
1
Eð2vÞ ¼ ðE ðzÞeið2vt2k2 zÞ þ c:c:Þ ½4
Nonlinear Polarization 2 2
In a linear medium, the electric polarization P is respectively; c.c. stands for the complex conjugate of
assumed to be a linear function of the electric field E: the preceding quantities. The quantities:
P ¼ 10 xE ½1 pffiffiffiffiffiffiffiffiffi
k1 ¼ v ð11 m0 Þ ¼ ðv=cÞnv ½5
where, for simplicity, a scalar relation has been
written. The quantity x is termed as linear dielectric and
susceptibility. At high optical intensities (which pffiffiffiffiffiffiffiffiffi
corresponds to high electric fields), all media behave k2 ¼ 2v ð12 m0 Þ ¼ ð2v=cÞn2v ½6
in a nonlinear fashion. Thus eqn [1] is modified to represent the propagation vectors at v and 2v;
respectively; 11 and 12 represent the dielectric
P ¼ 10 ðxE þ xð2Þ E2 þ xð3Þ E3 þ …Þ ½2 permittivities at v and 2v; and nv and n2v represent
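The size of the phase mismatch between k₂ and 2k₁ can be made concrete with a short numerical sketch. The indices below are the KDP ordinary-wave values quoted later in this article (eqn [20]); the coherence-length expression Lc = π/Δk used here is consistent with eqn [28] further on.

```python
import math

# Illustrative values: ruby-laser fundamental in KDP (ordinary wave),
# indices as quoted in eqn [20] of this article
lam0 = 0.6943e-6   # fundamental wavelength (m)
n_w  = 1.50502     # refractive index at omega
n_2w = 1.53269     # refractive index at 2*omega

# Wavevector mismatch: Delta k = k2 - 2*k1 = (4*pi/lam0)*(n_2w - n_w)
dk = 4 * math.pi / lam0 * (n_2w - n_w)

# Coherence length: distance over which the polarization wave and the
# second-harmonic wave slip out of phase by pi
Lc = math.pi / dk
print(f"Delta k = {dk:.2e} m^-1, Lc = {Lc*1e6:.1f} um")
```

Without phase matching, the second harmonic can therefore grow over only a few micrometers, which is why the phase-matching techniques discussed in the text matter so much.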
474 FIBER AND GUIDED WAVE OPTICS / Nonlinear Optics
where we have used eqn [15] for Δk. The length Lc is called the phase coherence length and represents the maximum crystal length up to which the second-harmonic power increases. Thus, if the length of the crystal is less than Lc, the second-harmonic power increases almost quadratically with z. For z > Lc, the second-harmonic power begins to reduce again.

In general, because of dispersion, it is very difficult to satisfy the phase-matching condition. However, in a birefringent medium, for example with no > ne, it may be possible to find a direction along which the refractive index of the o-wave for ω equals the refractive index of the e-wave for 2ω (see Figure 4). For media with ne > no, the direction would correspond to that along which the refractive index of the e-wave for ω equals the refractive index of the o-wave for 2ω. This can readily be understood by considering a specific example. We consider SHG in KDP corresponding to the incident ruby laser wavelength (λ₀ = 0.6943 μm; ω = 2.7150 × 10¹⁵ s⁻¹). For this frequency:

n_o(ω) = 1.50502,  n_e(ω) = 1.46532
n_o(2ω) = 1.53269,  n_e(2ω) = 1.48711   [20]

The refractive index variation for the e-wave is given by

n_e(θ) = [sin²θ/n_e² + cos²θ/n_o²]^(−1/2)   [21]

where θ is the angle that the wave propagation direction makes with the optic axis. Now, as can be seen from the above equations,

n_e < n_e(θ) < n_o   [22]

Since n_o at 0.6943 μm lies between n_e and n_o at 0.34715 μm, there will always exist an angle θm along which n_e(2ω)(θm) = n_o(ω). We can solve for θm and obtain

cos θm = (n_o(2ω)/n_o(ω)) [(n_o(ω)² − n_e(2ω)²) / (n_o(2ω)² − n_e(2ω)²)]^(1/2)   [23]

For the values given by eqn [20], we find θm = 50.5°.

Figure 4 In a birefringent medium, for example with no > ne, it may be possible to find a direction along which the refractive index of the o-wave for ω equals the refractive index of the e-wave for 2ω.

Quasi Phase Matching (QPM)

As mentioned earlier, phase matching is extremely important for any efficient nonlinear interaction. The technique of quasi phase matching (QPM) has recently become a convenient way to achieve phase matching at any desired wavelength in any material. QPM relies on the fact that the phase mismatch between the two interacting beams can be compensated by periodically readjusting the phase of the interaction, through periodically modulating the nonlinear characteristics of the medium at a spatial frequency equal to the wavevector mismatch of the two interacting waves. Thus, in SHG, when the nonlinear polarization at 2ω and the electromagnetic wave at 2ω have accumulated a phase difference of π, the sign of the nonlinear coefficient is reversed so that the energy flowing from the polarization to the wave continues to add constructively to the existing energy (see Figure 5). Thus, by properly choosing the period of the spatial modulation of the nonlinear coefficient, one can achieve phase matching. This scheme, QPM, is being very widely studied for application to nonlinear interactions in bulk media and in waveguides.

Figure 5 By reversing the sign of the nonlinear coefficient after every coherence length, the energy in the second harmonic can be made to grow. (a) Perfect phase matching; (b) Quasi phase matching; (c) Nonphase matched.
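The phase-matching angle of eqn [23] can be checked numerically with the KDP indices of eqn [20]; this is a sketch of the calculation, not production code.

```python
import math

# KDP indices from eqn [20] (ruby fundamental at 0.6943 um)
n_o_w,  n_e_w  = 1.50502, 1.46532   # ordinary / extraordinary at omega
n_o_2w, n_e_2w = 1.53269, 1.48711   # ordinary / extraordinary at 2*omega

# eqn [23]: angle at which the e-wave index at 2*omega equals n_o(omega)
ratio = (n_o_w**2 - n_e_2w**2) / (n_o_2w**2 - n_e_2w**2)
cos_tm = (n_o_2w / n_o_w) * math.sqrt(ratio)
theta_m = math.degrees(math.acos(cos_tm))
print(f"theta_m = {theta_m:.1f} deg")
```

The result reproduces the 50.5° quoted in the text.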
In a ferroelectric material such as lithium niobate, the signs of the nonlinear coefficients are linked to the direction of the spontaneous polarization. Thus a periodic reversal of the domains of the crystal can be used for achieving QPM (see Figure 6). This is currently the technique used to obtain high-efficiency SHG and other nonlinear interactions in LiNbO3, LiTaO3, and KTP. The most popular technique today for achieving periodic domain reversal in LiNbO3 is electric field poling. In this method, a high electric field pulse is applied to a properly oriented lithium niobate crystal, using lithographically defined electrode patterns, to produce a permanent periodic domain-reversed pattern. Such a periodically domain-reversed LiNbO3 crystal, with the reversed domains going through the entire depth of the crystal, is also referred to as PPLN (periodically poled lithium niobate, pronounced 'piplin') and is now commercially available.

In order to analyze SHG in a periodically poled material, let us assume that the nonlinear coefficient d varies sinusoidally with a period Λ:

d = d₀ sin(Kz)   [24]

where d₀ is the amplitude of modulation of the nonlinear coefficient and K (= 2π/Λ) represents the spatial frequency of the periodic modulation. For easier understanding we are assuming the modulation to be sinusoidal; in general, the modulation will be periodic but not sinusoidal. Any periodic modulation can be written as a superposition of sinusoidal and cosinusoidal variations, so our discussion is valid for one of the Fourier components of the variation. Using eqn [24], eqn [11] for the variation of the amplitude of the second harmonic becomes

dE₂/dz = −(iμ₀d₀cω/n(2ω)) E₁²(z) e^{iΔkz} sin(Kz)   [25]

which can be written as

dE₂/dz = −(μ₀d₀cω/2n(2ω)) E₁²(z) e^{iΔkz} [e^{iKz} − e^{−iKz}]   [26]

Using similar arguments as earlier, it can be shown that if Δk − K = 0, then only the second term within the brackets in eqn [26] contributes to the growth of the second harmonic; similarly, if Δk + K = 0, then only the first term within the brackets contributes. The first condition implies that

(4π/λ₀)(n(2ω) − n(ω)) − 2π/Λ = 0   [27]

where Λ (= 2π/K) represents the spatial period of the modulation of the nonlinear coefficient, and λ₀ is the wavelength of the fundamental. Thus the modulation period Λ required for QPM SHG is

Λ = λ₀ / [2(n(2ω) − n(ω))] = 2Lc   [28]

Figure 7 shows the vector diagram corresponding to QPM SHG. The phase mismatch between the fundamental and the second harmonic is compensated by the spatial frequency vector of the periodic variation of d.

In the case of waveguides, n(ω) and n(2ω) would represent the effective indices of the modes at the fundamental and second-harmonic frequencies. It may be noted that the index difference between the fundamental and the second harmonic is typically 0.1, so the required spatial period for a fundamental wavelength of 800 nm is ~4 μm.

The main advantage of QPM is that it can be used at any wavelength within the transparency window of the material; only the period needs to be correctly chosen for a specific fundamental wavelength. One can also choose the polarization appropriately to make use of the largest nonlinear optical coefficients. Another advantage is the possibility of varying the domain reversal period (chirp) to achieve specific interaction characteristics such as increased bandwidth.

In general, the spatial variation of the nonlinear grating is not sinusoidal. In this case the efficiency of interaction is determined by the Fourier component of the spatial variation at the spatial frequency corresponding to the period given by eqn [28]. Also, since the required periods are very small, it is possible to use a higher spatial period of modulation and use one of its Fourier components for the nonlinear interaction process. Thus, in the case of periodic reversal of the nonlinear coefficient with a spatial period given by

Λg = m λ₀ / [2(n(2ω) − n(ω))],  m = 1, 3, 5, …   [29]

which happens to be the mth harmonic of the fundamental spatial period required for QPM, the nonlinear coefficient responsible for SHG is the Fourier amplitude at that spatial frequency. This can be taken into account by defining an effective nonlinear coefficient

dQPM = 2d₀/(mπ)   [30]

Of course, the largest effective nonlinear coefficient is achieved by using the fundamental spatial frequency with m = 1. Higher spatial periods are easier to fabricate but lead to reduced nonlinear efficiencies.

Compared to bulk devices, in waveguides the interacting waves propagate as modes having specific intensity distributions in the transverse plane. Because of this, the nonlinear interaction also depends on the overlap between the periodically inverted nonlinear medium and the fields of the fundamental and second-harmonic waves. Thus, if Eω(x,y) and E2ω(x,y) represent the electric field distributions of the waveguide modes at the fundamental and second-harmonic frequencies, the efficiency of SHG depends on the following overlap integral:

Iovl = ∫∫ d₀(x,y) Eω²(x,y) E2ω(x,y) dx dy   [31]

where d₀(x,y) represents the transverse variation of the nonlinear coefficient of the waveguide. Optimization of SHG efficiency in waveguide interactions therefore has to take account of this overlap integral.

Since QPM relies on periodic phase matching, it is highly wavelength dependent: the required period Λ is different for different fundamental frequencies. Any deviation in the frequency of the fundamental leads to a reduction in efficiency due to deviation from the QPM condition. The pump laser therefore needs to be highly stabilized at a frequency corresponding to the fabricated period.

Apart from SHG, QPM has been used for other three-wave interaction processes, such as difference frequency generation and parametric amplification. Among the many materials that have been studied for SHG using QPM, the important ones are lithium niobate, lithium tantalate, and potassium titanyl phosphate (KTP). Many techniques have been developed for periodic domain reversal to achieve a periodic variation in the nonlinear coefficient; these include electric field poling, electron bombardment, and thermal diffusion treatment.

Third-Order Nonlinear Effects

In the earlier sections, we discussed effects arising out of the second-order nonlinearity, i.e., the term proportional to E² in the nonlinear polarization. This term is present only in media lacking inversion symmetry; amorphous materials and crystals possessing inversion symmetry do not exhibit second-order effects. The lowest-order nonlinear effect in such media is of order three, wherein the nonlinear polarization is proportional to E³.

Self-phase modulation (SPM), cross-phase modulation (XPM), and four-wave mixing (FWM) represent some of the most important consequences of third-order nonlinearity. These effects have become all the more important as they play a significant role in wavelength division multiplexed optical fiber communication systems.

Self-Phase Modulation (SPM)

Consider the propagation of a plane light wave at frequency ω through a medium having χ^(3) nonlinearity. The polarization generated in the medium is given by

P = ε₀χE + ε₀χ^(3)E³   [32]

If we consider a single-frequency wave with an electric field given by

E = E₀ cos(ωt − kz)   [33]

then

P = ε₀χE₀ cos(ωt − kz) + ε₀χ^(3)E₀³ cos³(ωt − kz)   [34]

Expanding cos³(ωt − kz) in terms of cos(ωt − kz) and cos 3(ωt − kz), we obtain the following expression for the polarization at frequency ω:

P = ε₀(χ + (3/4)χ^(3)E₀²) E₀ cos(ωt − kz)   [35]

For a plane wave given by eqn [33], the intensity is

I = (1/2) cε₀n₀E₀²   [36]

where n₀ is the refractive index of the medium at low intensity. Then

P = ε₀(χ + (3χ^(3)/2cε₀n₀) I) E   [37]

The polarization P and electric field E are related through the following equation:

P = ε₀(n² − 1)E   [38]

where n is the refractive index of the medium.
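The QPM design relations above are easy to evaluate numerically; the sketch below uses the numbers quoted in the text (800 nm fundamental, index difference 0.1).

```python
import math

lam0 = 800e-9   # fundamental wavelength (m)
dn   = 0.1      # n(2w) - n(w), typical waveguide value from the text

# eqn [28]: first-order QPM domain-reversal period
Lambda = lam0 / (2 * dn)

# eqn [30]: effective nonlinear coefficient (relative to d0) for order m
m = 1
d_qpm_over_d0 = 2 / (m * math.pi)

print(f"Lambda = {Lambda*1e6:.1f} um, d_QPM/d0 = {d_qpm_over_d0:.3f}")
```

First-order QPM thus retains 2/π ≈ 64% of the intrinsic nonlinear coefficient, while higher orders (m = 3, 5, …) trade efficiency for easier fabrication of the larger domain periods.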
Comparing eqns [37] and [38], we get

n² = n₀² + (3χ^(3)/2cε₀n₀) I   [39]

where

n₀² = 1 + χ   [40]

Since the last term in eqn [39] is usually very small, we get

n = n₀ + n₂I   [41]

where

n₂ = 3χ^(3)/(4cε₀n₀²)   [42]

is the nonlinear coefficient.

For fused silica, n₀ ≈ 1.47 and n₂ ≈ 3.2 × 10⁻²⁰ m²/W. If we consider a power of 100 mW over a cross-sectional area of 100 μm², the resultant intensity is 10⁹ W/m² and the corresponding change in refractive index is

Δn ≈ n₂I ≈ 3.2 × 10⁻¹¹   [43]

Although this is very small, when the beam propagates through an optical fiber over long distances (a few hundred to a few thousand kilometers), the accumulated nonlinear effects can be significant.

In the case of an optical fiber, the light beam propagates as a mode having a specific transverse electric field distribution, and thus the intensity is not constant across the cross-section. In such a case, it is convenient to express the nonlinear effect in terms of the power carried by the mode (rather than in terms of intensity). If the linear propagation constant of the mode is represented by β, then in the presence of nonlinearity the effective propagation constant is given by

βNL = β + (k₀n₂/Aeff) P   [44]

where k₀ = 2π/λ₀ and P is the power carried by the mode. The quantity Aeff represents the effective transverse cross-sectional area of the mode and is defined by

Aeff = [∫₀^∞ ∫₀^{2π} ψ²(r) r dr dφ]² / ∫₀^∞ ∫₀^{2π} ψ⁴(r) r dr dφ   [45]

where ψ(r) represents the transverse mode field distribution of the mode. For example, under the Gaussian approximation

ψ(r) = ψ₀ e^{−r²/w₀²}   [46]

where ψ₀ is a constant and 2w₀ represents the mode field diameter (MFD), we get

Aeff = πw₀²   [47]

It is usual to express the nonlinear characteristic of an optical fiber by the coefficient

γ = k₀n₂/Aeff   [48]

Thus, for the same input power and wavelength, smaller values of Aeff lead to greater nonlinear effects in the fiber. Typically,

Aeff ~ 50–80 μm² and γ ≈ 2.4 W⁻¹ km⁻¹ to 1.5 W⁻¹ km⁻¹   [49]

When a light beam propagates through an optical fiber, the power decreases because of attenuation, and the corresponding nonlinear effects also reduce. Indeed, the phase change suffered by a beam in propagating from 0 to L is given by

φ = ∫₀^L βNL dz = βL + γ ∫₀^L P dz   [50]

If α represents the attenuation coefficient, then

P(z) = P₀ e^{−αz}   [51]

and we get

φ = βL + γP₀Leff   [52]

where

Leff = (1 − e^{−αL})/α   [53]

is referred to as the effective length. For αL ≫ 1, Leff ≈ 1/α, and for αL ≪ 1, Leff ≈ L. The effective length represents the length of fiber over which most of the nonlinear effects have accumulated. For a loss coefficient of 0.20 dB/km, Leff ≈ 21 km.

If we consider a fiber length much longer than Leff, then to have reduced impact of SPM we must have

γP₀Leff ≪ 1   [54]

or

P₀ ≪ 1/(γLeff) ≈ α/γ   [55]

For α = 4.6 × 10⁻² km⁻¹ (which corresponds to an attenuation of 0.2 dB/km) and γ = 2.4 W⁻¹ km⁻¹, we get

P₀ ≪ 19 mW   [56]
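The power budget of eqns [53]-[56] can be reproduced directly; this sketch uses the article's values for the loss and nonlinear coefficients.

```python
import math

alpha_db_km = 0.20                        # fiber loss (dB/km)
alpha = alpha_db_km * math.log(10) / 10   # convert to 1/km (~4.6e-2)
gamma = 2.4                               # nonlinear coefficient (W^-1 km^-1)
L = 1000.0                                # fiber length (km), L >> Leff

# eqn [53]: effective interaction length; ~1/alpha for long fibers
L_eff = (1 - math.exp(-alpha * L)) / alpha

# eqn [55]: power level below which SPM has reduced impact
P_max = 1 / (gamma * L_eff)

print(f"alpha = {alpha:.3e} /km, Leff = {L_eff:.1f} km, P0 << {P_max*1e3:.0f} mW")
```

The computation recovers Leff ≈ 21 km and the ~19 mW limit quoted in eqn [56].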
Propagation of a Pulse

When an optical pulse propagates through a medium, it suffers from the following effects:

(i) attenuation;
(ii) dispersion; and
(iii) nonlinearity.

Attenuation refers to the reduction in pulse energy due to various mechanisms, such as scattering and absorption. Dispersion is caused by the fact that a light pulse consists of various frequency components, each of which travels at a different group velocity. Dispersion causes the temporal width of the pulse to change; in most cases it results in an increase in pulse width, although in some cases the temporal width can also decrease. Dispersion is accompanied by chirping, the variation of the instantaneous frequency of the pulse within the pulse duration. Since both attenuation and dispersion cause a change in the temporal variation of the optical power, they closely interact with nonlinearity in deciding the pulse evolution as it propagates through the medium.

Let E(x,y,z,t) represent the electric field variation of an optical pulse. It is usual to express E in the following way:

E(x,y,z,t) = (1/2)[A(z,t) ψ(x,y) e^{i(ω₀t − β₀z)} + c.c.]   [57]

where A(z,t) represents the slowly varying complex envelope of the pulse, ψ(x,y) the transverse electric field distribution, ω₀ the center frequency, and β₀ the propagation constant at ω₀.

In the presence of attenuation, second-order dispersion, and third-order nonlinearity, the complex envelope A(z,t) can be shown to satisfy the following equation:

∂A/∂z = −(α/2)A − β₁ ∂A/∂t + i(β₂/2) ∂²A/∂t² − iγ|A|²A   [58]

Here

β₁ = (dβ/dω)|_{ω=ω₀} = 1/vg   [59]

represents the inverse of the group velocity of the pulse, and

β₂ = (d²β/dω²)|_{ω=ω₀} = −(λ₀²/2πc) D   [60]

where D represents the group velocity dispersion (measured in ps/(km nm)). The various terms on the RHS of eqn [58] represent the following:

I term: attenuation
II term: group velocity
III term: second-order dispersion
IV term: nonlinearity

If we change to a moving frame defined by the coordinate T = t − β₁z, eqn [58] becomes

∂A/∂z = −(α/2)A + i(β₂/2) ∂²A/∂T² − iγ|A|²A   [61]

If we neglect the attenuation term, we obtain the following equation, which is also referred to as the nonlinear Schrödinger equation:

∂A/∂z = i(β₂/2) ∂²A/∂T² − iγ|A|²A   [62]

The above equation has a solution given by

A(z,T) = A₀ sech(sT) e^{−igz}   [63]

with

A₀² = −(β₂/γ)s²,  g = −(β₂/2)s²   [64]

Since A₀² must be positive, eqn [64] requires β₂ < 0, i.e., anomalous group velocity dispersion. Equation [63] represents an envelope soliton and has the property that it propagates undispersed through the medium. The full width at half maximum (FWHM) of the pulse envelope is τf = 2t₀, where

sech²(st₀) = 1/2   [65]

which gives the FWHM:

τf = 2t₀ = (2/s) ln(1 + √2) ≈ 1.7627/s   [66]

The peak power of the pulse is

P₀ = |A₀|² = |β₂|s²/γ   [67]

Replacing s by τf, we obtain

P₀τf² ≈ λ₀²D/(2cγ)   [68]

where we have used eqn [60]. The above equation gives the peak power required, for a given τf, for the formation of a soliton pulse. As an example, with τf = 10 ps, γ = 2.4 W⁻¹ km⁻¹, λ₀ = 1.55 μm, and D = 2 ps/(km nm), the required peak power is P₀ ≈ 33 mW.
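The 33 mW figure follows from eqns [60], [66], and [67]; the sketch below uses the exact 1.7627 factor rather than the ≈ λ₀²D/(2cγ) approximation of eqn [68], with all quantities converted to SI units.

```python
import math

tau_f = 10e-12     # soliton FWHM (s)
gamma = 2.4e-3     # 2.4 W^-1 km^-1 expressed in W^-1 m^-1
lam0  = 1.55e-6    # wavelength (m)
D     = 2e-6       # 2 ps/(km nm) expressed in s/m^2
c     = 3.0e8      # speed of light (m/s)

beta2 = lam0**2 * D / (2 * math.pi * c)      # |beta2| from eqn [60]
s = 2 * math.log(1 + math.sqrt(2)) / tau_f   # eqn [66]: s = 1.7627/tau_f
P0 = beta2 * s**2 / gamma                    # eqn [67]: peak power (W)

print(f"P0 = {P0*1e3:.1f} mW")
```

This reproduces the ~33 mW quoted in the text.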
Soliton pulses are being extensively studied for application to long distance optical fiber communication. In actual systems, the pulses have to be optically amplified at regular intervals to compensate for the loss suffered by the pulses. The amplification can be carried out using erbium-doped fiber amplifiers (EDFAs) or fiber Raman amplifiers.

Spectral Broadening due to SPM

In the presence of only nonlinearity, eqn [61] becomes

dA/dz = −iγ|A|²A   [69]

whose solution is given by

A(z,t) = A(z = 0, t) e^{−iγPz}   [70]

where P = |A|² is the power in the pulse. If P is a function of time, then the time-dependent phase term at z = L becomes

e^{iφ(t)} = e^{i[ω₀t − γP(t)L]}   [71]

We can define an instantaneous frequency as

ω(t) = dφ/dt = ω₀ − γL dP/dt   [72]

For a Gaussian pulse,

P = P₀ e^{−2T²/τ₀²}   [73]

giving

ω(t) = ω₀ + (4γLP₀/τ₀²) T e^{−2T²/τ₀²}   [74]

Thus the instantaneous frequency within the pulse changes with time, leading to chirping of the pulse (see Figure 8). Note that since the pulsewidth has not changed but the pulse is chirped, the frequency spectrum of the pulse has increased. Thus SPM leads to the generation of new frequencies. By Fourier transform theory, an increased spectral width implies that the pulse can now be compressed in the temporal domain by passing it through a medium with the proper sign of dispersion. This is indeed one of the standard techniques used to realize ultrashort femtosecond optical pulses.

Cross Phase Modulation (XPM)

Like SPM, cross-phase modulation also arises from the intensity dependence of the refractive index and leads to spectral broadening. Unlike SPM, in XPM the intensity variations of a light beam at one frequency modulate the phase of a light beam at another frequency. If the signals at both frequencies are pulses, then due to the difference in group velocities of the pulses there is a walk off between the two pulses, i.e., if they start together, they will separate as they propagate through the medium. Nonlinear interaction takes place as long as they physically overlap in the medium. The smaller the dispersion, the smaller the difference in group velocities (assuming closely spaced wavelengths) and the longer the pulses overlap; this leads to stronger XPM effects. At the same time, if two pulses pass through each other, then since one pulse interacts with both the leading and the trailing edge of the other pulse, the net XPM effect will be nil provided there is no attenuation. In the presence of attenuation in the medium, the pulse will still get modified due to XPM. A similar situation can occur when the two interacting pulses pass through an optical amplifier.

To study XPM, we assume simultaneous propagation of two waves at two different frequencies through the medium. If ω₁ and ω₂ represent the two frequencies, then one obtains for the variation of the amplitude A₁ at frequency ω₁:

dA₁/dz = −iγ(P̃₁ + 2P̃₂)A₁   [75]

where P̃₁ and P̃₂ represent the powers at frequencies ω₁ and ω₂, respectively. The first term in eqn [75] represents SPM while the second corresponds to XPM. If the powers are assumed to attenuate at the same rate, i.e.:
power at ω₂ on the light beam at frequency ω₁; we will refer to the wave at frequency ω₂ as the pump, and the wave at frequency ω₁ as the probe or signal. From eqn [77] it is apparent that the phase of the signal at frequency ω₁ is modified by the power at another frequency. This is referred to as XPM. Note also that XPM is twice as effective as SPM.

Similar to the case of SPM, we can now write for the instantaneous frequency in the presence of XPM, as in eqn [72]:

ω(t) = ω₀ − 2γLeff dP₂/dt   [78]

Hence the part of the signal that is influenced by the leading edge of the pump will be down-shifted in frequency (since in the leading edge dP₂/dt > 0) and the part overlapping with the trailing edge will be up-shifted in frequency (since dP₂/dt < 0). This leads to a frequency chirping of the signal pulse, just as in the case of SPM.

If the probe and pump beams are pulses, then XPM can lead to induced frequency shifts depending on whether the probe pulse interacts with only the leading edge, only the trailing edge, or both, as they propagate through the medium. Let us consider a case where the group velocity of the pump pulse is greater than that of the probe pulse. If both pulses enter the medium together, then since the pump pulse travels faster, the probe pulse will interact only with the trailing edge of the pump. Since in this case dP₂/dt is negative, the probe pulse suffers a blue-induced frequency shift. Similarly, if the pulses enter at different instants but completely overlap at the end of the medium, then dP₂/dt > 0 and the probe pulse suffers a red-induced frequency shift. Indeed, if the two pulses start separately and walk through each other, then there is no induced shift, due to cancellation of the shifts induced by the leading and trailing edges of the pump. Figure 9 shows the measured induced frequency shift of a 532 nm probe pulse as a function of the delay between this pulse and a pump pulse at 1064 nm, when they propagate through a 1 m long optical fiber.

When pulses of light at two different wavelengths propagate through an optical fiber, due to the different group velocities of the pulses they pass through each other, resulting in what could be termed a collision. In the linear case, the pulses pass through without affecting each other, but when intensity levels are high, XPM induces phase shifts in both pulses. We can define a parameter termed the walk off length, Lwo, which is the length of fiber required for the interacting pulses to walk off relative to each other. The walk off length is given by

Lwo = Δτ/(D Δλ)   [79]

where D represents the dispersion coefficient and Δλ represents the wavelength separation between the interacting pulses. For return to zero (RZ) pulses, Δτ represents the pulse duration, while for nonreturn to zero (NRZ) pulses, Δτ represents the rise time or fall time of the pulse. Closely spaced channels will thus interact over longer fiber lengths, leading to greater XPM effects. Larger dispersion coefficients will reduce Lwo and thus the effects of XPM. Since the medium is attenuating, the power carried by the pulses decreases as they propagate, leading to a reduced XPM effect. The characteristic length for attenuation is the effective length Leff, defined by eqn [53]. If Lwo ≪ Leff, then over the length of interaction of the pulses the intensity levels do not change appreciably, and the magnitude of the XPM-induced effects will be inversely proportional to the wavelength spacing Δλ. For small Δλ, Lwo ≫ Leff and the interaction length is then determined by the fiber losses (rather than by walk off), so the XPM-induced effects become almost independent of Δλ. Indeed, if we consider XPM effects between a continuous wave (cw) probe beam and a sinusoidally intensity-modulated pump beam, then the amplitude of the XPM-induced

Figure 9 Measured induced frequency shift of a 532 nm probe pulse as a function of the delay between this pulse and a pump pulse at 1064 nm, when they propagate through a 1 m long optical fiber. (Reproduced from Baldeck PL, Alfano RR and Agrawal GP (1988) Induced frequency shift of copropagating ultrafast optical pulses. Applied Physics Letters 52: 1939–1941, with permission from the American Institute of Physics.)
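The walk-off length of eqn [79] is easily estimated. The pulse duration, dispersion coefficient, and channel spacing below are illustrative values chosen for this sketch, not numbers taken from the article.

```python
tau_ps = 10.0     # RZ pulse duration (ps), assumed
D = 17.0          # dispersion (ps/(km nm)), typical standard fiber, assumed
d_lam_nm = 0.8    # channel separation (nm), ~100 GHz grid, assumed

# eqn [79]: fiber length over which the two pulses slide past each other
L_wo = tau_ps / (D * d_lam_nm)   # km
print(f"L_wo = {L_wo:.2f} km")
```

Here Lwo comes out well below a typical Leff of ~21 km, so for this channel spacing the walk off, rather than the loss, limits the XPM interaction.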
ω₁ = ω₃ + ω₄ − ω₂   [81]

ω₁ = 2ω₃ − ω₂   [82]

where, as before, Aᵢ(z) represents the amplitude of the wave, ψᵢ(x,y) the transverse field distribution, and βᵢ the propagation constant of the wave. The total electric field is given by

E = E₁ + E₂ + E₃ + E₄   [84]

Substituting the total electric field in the equation for the nonlinear polarization, the term at frequency ω₁ comes out as

PNL(ω₁) = (1/2)[PNL(ω₁) e^{i(ω₁t − β₁z)} + c.c.]   [85]

where

PNL(ω₁) = (3ε₀/2) χ^(3) A₂* A₃A₄ ψ₂ψ₃ψ₄ e^{−iΔβz}   [86]

and

Δβ = β₃ + β₄ − β₂ − β₁   [87]

In writing eqn [86], we have considered only the FWM term, neglecting the SPM and XPM terms. Substituting the expression for PNL(ω₁) in the wave equation for ω₁ and making the slowly varying approximation (in a manner similar to that employed in the case of SPM and XPM), we obtain the following equation for A₁(z):

dA₁/dz = −2iγ A₂* A₃A₄ e^{−iΔβz}   [88]

where γ is defined by eqn [48] with k₀ = ω/c, ω representing the average frequency of the four interacting waves, and Aeff the average effective area of the modes.

Assuming all waves to have the same attenuation coefficient α, and neglecting depletion of the waves at frequencies ω₂, ω₃, and ω₄ due to nonlinear conversion, we obtain for the power at frequency ω₁:

P₁(L) = 4γ² P₂P₃P₄ Leff² η e^{−αL}   [89]

where

η = [α²/(α² + Δβ²)] [1 + 4e^{−αL} sin²(ΔβL/2)/(1 − e^{−αL})²]   [90]

frequency, say ω₂. In such a case, we obtain

Δβ = (ω₃ − ω₂)(ω₃ − ω₁) (d²β/dω²)|_{ω=ω₂}   [92]

In optical fiber communication systems, the channels are usually equally spaced. Thus, we assume the frequencies to be given by

ω₄ = ω₂ + Δω,  ω₃ = ω₂ − 2Δω  and  ω₁ = ω₂ − Δω   [93]

Using these frequencies and eqn [60], eqn [92] gives us

Δβ = −(4πDλ₀²/c)(Δν)²   [94]

where Δω = 2πΔν. Thus maximum FWM takes place when D = 0. This is the main problem in using wavelength division multiplexing (WDM) in dispersion shifted fibers, which are characterized by zero dispersion at the operating wavelength of 1550 nm, as FWM will then lead to crosstalk among the various channels. FWM efficiency can be reduced by using fiber with nonzero dispersion. This has led to the development of nonzero dispersion shifted fibers (NZDSF), which have a finite nonzero dispersion of about ±2 ps/(km nm) at the operating wavelength. From eqn [94], we also notice that for a given dispersion coefficient D, the FWM efficiency reduces as Δν increases.

In order to get a numerical appreciation, we consider a case with D = 0, i.e., Δβ = 0. For such a case η = 1. If all channels are launched with equal power Pin, then

P₁(L) = 4γ² Pin³ Leff² e^{−αL}   [95]

Thus the ratio of the power generated at ω₁ due to FWM to that existing at the same frequency is

Pg/Pout = P₁(L)/(Pin e^{−αL}) = 4γ² Pin² Leff²   [96]

Typical values are Leff = 20 km and γ = 2.4 W⁻¹ km⁻¹. Thus:
Supercontinuum Generation
Conclusions
The advent of lasers provided us with an intense
source of light and led to the birth of nonlinear optics.
Nonlinear optical phenomena exhibit a rich variety of effects and, coupled with optical waveguides and fibers, such effects can be observed even at moderate optical powers. With the realization of compact femtosecond lasers, highly nonlinear holey optical fibers, quasi-phase-matching techniques, and related technologies, the field of nonlinear optics is expected to grow further and lead to commercial exploitation, for example in compact blue lasers, soliton laser sources, multiwavelength sources, and fiber optic communication.
Loose Jacket

In a loose protection the fiber can be inserted into a cable with only the primary coating. A thin layer of colored plastic may be added onto the primary coating to identify the fiber in the cable. The fibers are arranged in a tube or in a groove on a plastic core (slotted core). The diameter of the tube or the dimension of the core must be far larger than the

Fiber Identification

To identify the fibers in a cable, a proper coloration with defined color codes must be given to the single fibers. In the tight protection, the tight buffer is directly colored during extrusion. In the loose protection, a thin colored layer is added on the primary coating of the fiber; tubes or slotted core may
FIBER AND GUIDED WAVE OPTICS / Optical Fiber Cables 489
Cable Components

The fundamental components of an optical cable are: the cable core with the optical fibers, the strength member (one or more) which provides the mechanical strength of the cable, sheaths which provide the external protection, and filling materials.

or as service channels, or to detect the presence of moisture in the cable (in this case, the insulation is missing or partial). The conductors may cause problems due to the voltages induced by power lines or lightning.

Plastic fillers are often used to complete the cable core geometry. Cushion layers are used to protect the cable core against radial compression. Normally, they are based on plastic materials wound around the core. Tapes are wound around the cable core to hold the assemblies together and, secondly, to provide a heat barrier during the sheath extrusion. Mylar tapes are usually employed.

Optical Cable Structures, Types, and Applications

Cable Structures

The structure of an optical cable is strictly dependent on the construction method and the application type that determines the design and the grade of external protection. The main core (or inner) structures of an optical cable can be classified as: stranded structures (tight and loose); slotted core cable; or ribbon cable. In this section, a few examples of cable structures are presented. Below, separate sections are devoted to submarine cables and to special application cables, for their particular structures and features.

In the stranded tight structure, a low number of tight or loose fibers is stranded around a strength member to form a cable unit. Several such cable units may be joined to constitute a cable with a medium/high number of fibers (Figure 5a). Alternatively the fibers may be arranged in a multilayer structure (Figure 5b). The main advantages of a tight stranded structure are the small dimensions of the cable core, even when a large number of fibers is involved, and also the facility of handling. Care must be taken in the design of the strength member: in fact, the fibers are immediately subjected to tension when the cable is pulled into the ducts, or compressed when subjected to low temperatures.

In the stranded loose structures, two categories of cables are defined: one fiber per tube, and a group of fibers (up to 20) per tube (Figure 5c). As already mentioned, in the loose structures the fibers are free inside the tube, and for this reason the cable is less critical than the tight cable. The external dimensions of loose cables are larger than those of tight cables, but the solution with multiform tubes allows for cables with a high number of fibers in relatively small dimensions.

Slotted core cables (Figure 2c,d) contain fibers with only the primary coating and may be divided, as in the stranded loose structure, into one fiber per groove or a group of fibers per groove. The cable design and the choice of a suitable plastic material for the slotted core, joined to a proper strength member, make it possible to obtain cables with fiber stability over a wide range of temperatures. As in the loose stranded solutions, the slotted core solution is adopted to obtain cables with a high number of fibers in small dimensions.

In the ribbon solution, the fibers (from 4 to 16) are placed in a linear ribbon array. Many ribbons are stacked together to constitute the optical unit. The ribbon cable normally has a loose structure (loose tube or slotted core). Three structures are shown in Figure 6. In the first example (classical AT&T cable), 12 ribbons of 12 fibers are directly stacked and stranded in a tube (Figure 6a). In the second and third examples, ribbons are inserted into tubes or grooves (Figure 6b,c). Several tubes or slotted cores can be assembled together to give a cable with very high fiber density. The ribbon approach may be efficient when a large number of fibers must be
This geometrical mismatching between the two fibers is represented in Figure 1.

The function that describes the joint attenuation is different for single-mode and multimode connectors, but the parameters that contribute to it are the same. The general formulation for these functions is complicated and, in particular, for the multimode connector it involves an integral over the shape distribution of the light in the fiber core. These functions become simpler in particular conditions: for example, if we consider a single-mode joint without a gap between the two fibers (that is the more usual condition in modern connectors) the z parameter is nil, and in this case the function that describes the attenuation of the

where n is the fiber core refractive index and λ is the wavelength of the light.

Figure 2 shows the trend of eqn [1], where the parameter ranges in the graphs are typical for the most common products on the market. It is evident that the main contribution to attenuation arises from the lateral offset.

For connectors of good quality, both for single-mode and multimode fibers, the attenuation values are in the range of a few tenths of a dB (typically less than 0.3 dB). Another important factor for a joint, in particular for connectors used in networks for high-speed or analog signals, is the return loss (RL). This is a parameter that gives a measure of the quantity of light back-reflected at the optical discontinuity of the connector, due to the Fresnel effect. The RL is defined as the ratio in dB between the incident light power (Pinc) and the reflected light power (Pref):

RL = −10 log(Pref/Pinc)   [2]
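Eqn [2] is easy to evaluate for the worst case of an open glass–air gap, using the standard Fresnel normal-incidence reflectance; the round-number core index of 1.5 below is an assumed illustrative value:

```python
import math

def return_loss_db(reflectance: float) -> float:
    """RL = -10 log10(Pref/Pinc), eqn [2]."""
    return -10 * math.log10(reflectance)

# Fresnel reflectance at a glass-air interface (normal incidence),
# assuming a round-number fiber core index n = 1.5
n_core, n_air = 1.5, 1.0
R = ((n_core - n_air) / (n_core + n_air)) ** 2
print(f"R = {R:.3f}, RL = {return_loss_db(R):.1f} dB")
```

This reproduces the roughly 4% reflection (RL of about 14 dB) discussed next for a connector with an air gap between the fiber end faces.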
Figure 2 Single mode connector attenuation as a function of the geometrical parameters. Each curve is plotted keeping the other
parameter constant.
496 FIBER AND GUIDED WAVE OPTICS / Passive Optical Components
core to the air in the gap. This causes the reflection of 4% of the incident light which, as RL, is about 14 dB.

To improve the RL performance of an optical joint, it is necessary to reduce the refractive index difference. This can be achieved either by placing index matching material between the two fibers, which reduces the difference between the refractive index of the fiber and that of the gap, or by achieving physical contact between the fiber heads. The first technique is typically used in mechanical splices but, due to problems related to contamination of the index matching material, it is not a good solution for connectors that are to be opened and reconnected several times.

The physical contact solution is the one most frequently adopted for these connectors. Physical contact is obtained by polishing the ferrule and the fiber end-faces in a convex shape, with the fiber core positioned at the apex (Figure 3). With this technique, the typical values of return loss for a single mode connector are in the range from 35 dB to 55 dB, depending on the polishing grade. The residual back-reflected light is generated in the thin layer at the fiber end surface, because of slight changes in the refractive index due to the polishing process.

Figure 3 Back reflection effects in PC connectors.

Higher return loss values (a smaller quantity of reflected light) are achieved with the angled physical contact (APC) technique. This is obtained by polishing the convex ferrule end face at an angle with respect to the fiber axis. In this way, the light is not back reflected into the fiber core, but into the cladding, and is thus eliminated (Figure 4). In APC connectors, typical return loss values are greater than 60 dB.

Figure 4 Back reflection effects in APC connectors.

Optical Connectors

Several types of optical connectors are on the market. It is possible to divide them into three main groups:

. Connectors with direct alignment of bare fiber;
. Connectors with alignment by means of a ferrule; and
. Multifiber connectors.

Connectors with direct alignment of bare fiber
In this type of connector the bare fibers are aligned on an external structure that is usually a V-groove (Figure 5). Two configurations can be realized: plug and socket, or plug–adapter–plug. In the first case, the fiber in the socket is fixed and the one in the plug aligns to it. In the plug–adapter–plug configuration, two identical plugs containing the bare fibers are locked onto an adapter in which the fibers are placed on the alignment structure. These connectors are cheaper than the connectors with a ferrule (described below), but this solution is less reliable. In fact, the bare fibers are delicate and brittle and they can be affected by external mechanical or environmental stresses.

Figure 5 Scheme of the bare fibers alignment on a V-groove.

Connectors with alignment by means of a ferrule
In this type of connector, the fibers are fixed in a secondary alignment structure, usually of cylindrical shape, called a ferrule. The fiber alignment is obtained by inserting the two ferrules containing the fibers into a sleeve. Usually this type of connector is in the plug–adapter–plug configuration (Figure 6). With this technique, the precision of the alignment is moved from the V-groove precision to the dimensional tolerance of the ferrule. Typical tolerance values on the parameters for a cylindrical ferrule with a diameter of 2.5 mm are in the range of a tenth of a micrometer, and must be verified for the diameter, the eccentricity between the hole (in which the fiber is fixed) and the external surface, and the cylindricity of the external surface.

Multifiber connectors
In some types of optical cable, the fibers are organized in multiple structures, in which many fibers are glued together to form a ribbon. The ribbon can have
Figure 6 Scheme of a plug–adapter–plug connector with alignment by ferrules and sleeve.
Mechanical Splices
Figure 11 Scheme of the process to obtain a planar coupler and an example of a device.
These components are used to combine several optical signals with different wavelengths into a single fiber at the input side of an optical link, and to separate them at the receiving end. This technology allows more communication channels to be sent along a fiber, thus increasing the total bit rate transmitted. These types of components can be obtained using filters to select a single wavelength (as a diffractive grating that resolves the light into its monochromatic components) or using the phenomenon of evanescent field coupling occurring between two adjacent fiber cores (as described for coupler devices), controlling the coupling zone in a precise way; this is, in fact, dependent on the wavelength (Figure 12). Using wavelength filters, as they select a single wavelength, it is necessary to build a cascaded filter structure for separating more than two wavelengths.

Micro-optic technologies are generally used for multimode fiber due to the critical collimation problem. For the single-mode fiber the evanescent field coupling technique is applied directly to fiber or to planar optical guides.

Another efficient technique to create WDM devices is with a fiber Bragg grating (FBG) wavelength filter directly applied to the fiber. FBGs consist of a periodic modulation (see Figure 13) of the refractive index along the core of the fiber, and they act as reflection filters. During the fabrication process (see Figure 14) the core of the optical fiber is exposed to UV radiation, and this exposure increases the refractive index of the core in defined zones, with a periodic shape obtained by the interference fringes created by crossing two coherent beams.
FIBER GRATINGS
P S Westbrook, OFS Laboratories, Murray Hill, NJ, USA
B J Eggleton, University of Sydney, Sydney, NSW, Australia
© 2005, Elsevier Ltd. All Rights Reserved.

and as strain and temperature sensors in various industries. We review optical fibers and the fundamentals of fiber grating characteristics as well as fabrication. Finally we consider several of the major industrial and scientific applications of fiber gratings.
use fiber gratings. A single-mode fiber can be viewed as a thin cylinder of silica with a central raised index region of small enough diameter so that only a single mode is guided through total internal reflection at the core boundary. The electric field (E-field) of the fundamental mode propagating in the core is described by a characteristic transverse E-field profile and a propagation constant k:

E = E(r, φ) e^{−i(ωt − kz)},   k = 2π n_eff/λ   [1]

boundary a continuum of 'radiation modes' results. Although these modes are not guided, they are exploited in a variety of applications and can be carefully tailored by a suitably designed fiber grating. Note that the radiation and cladding modes exist regardless of the presence of a grating; a grating simply couples power from the core mode to the cladding/radiation modes; the grating is typically only a small perturbation on the index profile of the waveguide and the corresponding mode fields.
Figure 1 Fiber modes. Refractive index profile of a step index, single-mode fiber (dashed line). Radial profiles of mode E-fields of the
fundamental (LP01) core mode, guided within the raised index region defined by the core, and one of the many higher-order cladding
modes guided by the outer, air– silica interface. Also shown is a fiber indicating the ray paths for core, cladding, and radiation modes.
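The propagation constant of eqn [1] is easy to evaluate; the effective index values below are assumed illustrative numbers for a standard step-index fiber at 1550 nm, not figures from the text:

```python
import math

# Propagation constant from eqn [1]: k = 2*pi*n_eff/lam.
# The n_eff values are assumed illustrative numbers for a standard
# step-index fiber at 1550 nm (core mode slightly above the cladding mode).
lam = 1.55e-6
for name, n_eff in [("LP01 core mode", 1.447), ("cladding mode", 1.440)]:
    k = 2 * math.pi * n_eff / lam
    print(f"{name}: n_eff = {n_eff:.3f}, k = {k:.4e} 1/m")
```

These closely spaced k values are the "hash marks" that appear on the vertical axis of the phase-matching picture discussed below.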
transmission spectrum of the grating (see Figure 3a). The ability of a fiber grating to selectively reflect the core light at one wavelength is one of its most important practical properties since it can be used as a narrowband filter or reflector. As a fiber component, the fiber grating has the advantage of compactness and provides in-line wavelength filtering.

The condition for Bragg reflection may be expressed more generally as a phase-matching condition. Phase matching refers to the condition when the phases of all the grating-scattered waves are the same and there is constructive interference. More specifically, this condition expresses the relationship between the spatial frequencies of the incident light, the scattered light, and the grating modulation, necessary for constructive interference in the scattered light. For a grating, the phase matching condition may be expressed in terms of the grating wavevector, K_grating = 2π/Λ_grating, and of the two mode propagation constants, k_incident and k_scattered (defined in eqn [1]):

k_incident − K_grating = k_scattered   [3]

Figure 2 Bragg reflection. Each period of the grating reflects a small fraction of the incident light. When the period of the grating is adjusted so that these reflections add coherently, a strong Bragg reflection results and a narrow dip is observed in the transmission spectrum of the fiber (see Figure 3a). As shown in Figures 3 and 4, Bragg reflection can also occur into cladding modes.

Figure 3 Phase matching. Fiber grating resonances occur when the phase matching condition is satisfied: k_inc − K_grating = k_scat. Propagation constants (k = 2π n_eff/λ) of the fiber modes are indicated as hash marks on the vertical line. The region indicates the regime of unguided radiation modes. The fiber grating is represented by its longitudinal spatial frequency (wavevector) K_grating = 2π/Λ_grating, where Λ_grating is the period of the grating. Assuming an incident forward-going core mode, the grating wavevector then phase matches to (a) counter-propagating core and cladding modes (fiber Bragg grating, FBG), (b) a continuum of radiation modes, or (c) a copropagating cladding mode (long-period grating, LPG).

Using phase matching, we can understand the various transmission spectra that result from fiber
gratings. Figure 3 shows the transmission characteristic observed for core guided light in a single-mode fiber containing three different types of gratings: Bragg reflector, radiation mode coupler, and copropagating mode coupler. Each grating has a different type of phase matching condition. Figure 3 also depicts the phase matching condition in eqn [3] graphically. The vertical line contains hash marks representing the propagation constants, k, of modes of a single mode fiber (both positive and negative for the two propagation directions) and a solid black region representing the radiation mode continuum. Bragg reflection into backward-propagating core and cladding modes occurs for short grating periods (K_grating ~ 2k_core). Such gratings are known as fiber Bragg gratings (FBGs) and a typical transmission spectrum with a strong Bragg reflection and several cladding mode resonances is shown in Figure 3a (a typical Bragg reflection spectrum is shown in Figure 8 below; the FBG grating period was ~500 nm). Core–cladding mode reflection results in loss resonances for wavelengths shorter than the core–core Bragg reflection. When the grating period is very long (in general, K_grating < k_core (n_cladding − 1)/n_cladding; typically Λ_grating is ~1000 times that of an FBG), phase matching to copropagating cladding modes occurs. Such gratings are known as long-period gratings (LPGs). A typical LPG spectrum is shown in Figure 3c. The LPG resonance corresponds to coupling between the core mode and a single copropagating cladding mode. For wavevectors in between these two regimes, coupling to the continuum of radiation modes is possible. An example of such a grating spectrum is shown in Figure 3b. We note that fiber cladding modes can be either well guided (resulting in the sharp grating resonances of Figure 3a) or poorly guided (resulting in the broad continuum of Figure 3b) depending on the fiber surface. The radiation mode approximation is valid in the limit that the cladding mode is very poorly guided (i.e., has high loss) within the grating region.

While phase matching yields the resonance wavelengths of a fiber grating, the strengths of these resonances must be derived from a full solution of Maxwell's equations. The most common approximation used in these calculations is known as coupled mode theory (CMT). In CMT, the longitudinal refractive index variation is assumed to be a small perturbation to the waveguide modes, which results only in a change in the propagation constant of the modes but does not change the transverse dependence of the mode E-fields (see eqn [1]). CMT then assumes that the E-field amplitude of each mode varies slowly (compared to wavelength) along the grating. A pair of coupled equations relating the two mode amplitudes as a function of position along the grating can then be derived, and these allow for computation of the grating reflection and transmission. The coupling between two modes is determined by an overlap integral κ12 between the product of the E-fields of the two modes (E1 and E2) and the periodic refractive index variation:

κ12 ∝ ∫∫_Area E1(r, φ) E2(r, φ) δn(r, φ) dA   [4]

where δn(r, φ) is derived from the refractive index modulation:

δn(r, φ, z) = δn(r, φ) cos(K_grating z)   [5]

This overlap integral is useful because it allows for a rough comparison of the strength of resonances for different modes without the need to obtain a full CMT solution. In the simple case of Bragg reflection of the core mode in a standard single-mode fiber (see Figure 1):

κ = π δn η / λ   [6]

where η ≈ 0.8 is the overlap integral of the core mode intensity with the core.

To understand fiber grating spectra we first discuss a 'uniform' Bragg grating, of length L. Such a grating has uniform index modulation along its length and is shown in Figure 4a. Depending on the length and refractive index modulation of the grating, it may be classified as either weak or strong. The division between these two regimes may be understood by considering the light scattered by each successive period of the grating. The fraction of light scattered from each period can be approximated by δn/2n using the Snell's law reflectance formula. The boundary between a strong and a weak grating occurs when the grating is either long enough or strong enough that the sum of the scattering from all periods is of order 1. This division may also be expressed in terms of κ and L: κL ~ 1.

Figure 4 Fiber Bragg grating refractive index profiles and spectra. (a) A uniform profile with transmission spectra for both strong and weak index modulations. (b) An 'unapodized' Gaussian grating profile showing Fabry–Perot-like structure within the grating resonance. (c) A '100% dc apodized' Gaussian grating profile showing a symmetric transmission spectrum. (d) A superstructure grating exhibiting four resonance peaks.

In the weak limit, the incident light propagates through the grating with very little scattering. In this case, both the LPG and FBG resonances have transmission spectra that are approximately related to the Fourier transform of the grating index profile. (For LPGs, this may not hold if modal dispersion is large; see the discussion below.) In the case of a uniform grating profile, this results in a 1 − sinc² wavelength dependence of the transmission spectrum and a spectral width δλ/λ ~ Λ_grating/L = 1/N_periods, as shown in Figure 4a. An important feature of the uniform grating is the side lobes. The side lobes result from the sudden 'turn on' of the grating. In the FBG,
they can be understood as resonances of the effective Fabry–Perot cavity defined by the sharp grating boundaries.

In the strong limit, a substantial fraction of light is scattered out of the incident mode, and the FBG and LPG transmission spectra become different. As shown in Figure 4a, the FBG spectrum widens to a width determined only by δn: δλ/λ = δn/n. The incident light penetrates only a short distance (~λ/δn) into the grating before being completely reflected. The strong LPG, on the other hand, becomes 'overcoupled' as the second mode is reconverted into the incident mode. The result is a higher transmission for the incident core mode (provided that the scattered mode propagates without loss).

While uniform gratings are easy to model and manufacture, most practical fiber gratings have a nonuniform profile. In a nonuniform grating profile, both the refractive index modulation amplitude, δn_AC(z), and the average effective index value, δn_DC(z), may vary along the fiber grating length:

δn(z) = δn_DC(z) + δn_AC(z) cos(K_grating z)   [7]

By controlling the ac and dc index change, a greatly improved filter may be fabricated. Figure 4b shows the index profile and the grating transmission spectrum of a fiber Bragg grating with an 'unapodized' Gaussian profile. Note that due to the smooth increase in modulation amplitude, the sidebands of the uniform grating have been eliminated. However, because the dc index (and hence Bragg condition) is not constant within the grating, sharp resonant structure is observed within the grating resonance. This structure can be understood as Fabry–Perot resonances in the cavity formed by the nonuniform Bragg condition within the grating. Figure 4c shows a grating with a '100% dc apodized' Gaussian profile. This grating has a constant dc refractive index within the grating and therefore the resonance is smooth and symmetric. Most useful bandpass filters require a profile close to 100% apodization. Although not shown in Figure 4, a smoothly varying index profile also eliminates the sidelobes in LPG resonances.

Other important types of nonuniform fiber gratings include chirped fiber gratings and superstructure fiber gratings. In a chirped fiber grating the grating period
slowly changes along the length of the grating. As discussed below, chirped fiber gratings may be used as chromatic dispersion compensators. They may also be used as broadband reflectors. LPGs may also be chirped to produce a broadband response. Superstructure gratings have oscillations in the period and/or amplitude of the refractive index modulations. The resulting grating can have several resonance peaks (see Figure 4d). The properties of nonuniform gratings may be computed using CMT and also understood intuitively using 'band-diagrams' and an effective index model of the fiber grating response.

Grating spectra showing radiation mode coupling are unlike FBGs or LPGs because there is no conversion of the radiation modes back to the incident core mode. The sharp resonances smooth out into a broad continuum. The grating transmission spectrum depends less on the refractive index profile of the grating than on the overlap of the core mode and grating with the radiation modes. Radiation mode coupling is greatly affected by tilting the grating planes with respect to the axis of the fiber. As discussed below, such tilted gratings may be designed as loss filters and optical fiber taps. Grating tilt has also been applied to decrease core mode back-reflection.

LPG resonances (and to some degree FBGs) may also exhibit strong dependence on the modal dispersion (or variation of the propagation constant, k, with wavelength). This wavelength dependence can dominate the wavelength dependence arising from CMT, making the resonance more or less narrow than predicted by CMT without modal dispersion. In an extreme case, very broadband LPG resonances (40 nm) may be produced between the core mode and a specially designed higher-order mode whose propagation constant varies such that it is resonant with the core mode over a large bandwidth.

ranging from 484 nm to 155 nm. These include P, Ce, Pb, Sn, N, Sn-Ge, Sb, and Ge-B. Other dopants, such as Al and F, show very little photosensitivity.

A second major discovery enabling practical manufacture of fiber gratings was the effect of 'hydrogen loading'. The fiber is placed in a pressurized hydrogen atmosphere until molecular hydrogen has diffused into the core region to a level of a few mole percent (a level similar to that of the photosensitive dopants). UV irradiation of the resulting fiber produces index changes an order of magnitude larger than in the unloaded fiber. Hydrogen loading allows gratings to be easily imprinted in conventional single-mode fibers, whose Ge content is too low to allow for strong gratings. The increase in photosensitivity is true for most photosensitive dopants. Index changes in excess of 10⁻² have been produced with hydrogen loading. In order to avoid increased OH absorption in the telecommunications band (1520–1630 nm), deuterium (D2) is normally used in place of hydrogen.

The origins of UV photosensitivity are not fully understood; however, at least two mechanisms are agreed to contribute to the index change. Firstly, as is evident from Figure 5, the UV absorption spectrum is modified during UV exposure. This bleaching corresponds to a change in refractive index in the infrared which may be computed using the Kramers–Kronig relations. In particular, the absorption band at
242 nm resulting from germanium oxygen deficient centers is bleached, indicating that these defects are modified during exposure. The infrared index change results both from this bleaching and from the change in the UV absorption edge at ~190 nm. Secondly, absorption of UV causes the silica matrix to contract. This compaction and the resulting stress distribution also change the refractive index.

Both mechanisms are enhanced by the presence of hydrogen in the silica lattice. The hydrogen reacts with oxygen, thus increasing the number of UV-absorbing germanium oxygen-deficient centers. As a result, in hydrogen loaded fibers, more of the Ge atoms participate in the index change, explaining the large increase in photosensitivity. The number of oxygen-deficient defects may also be increased during fiber manufacture by collapsing the fiber preform in a low-oxygen atmosphere. Such fibers show increased photosensitivity without the need for hydrogen loading. Co-doping a Ge-doped fiber with boron has also been shown to increase the level of photosensitivity of a fiber for a given level of index change in the core.

An important aspect of UV-induced index changes is that they require thermal annealing to ensure long-term stability. Sufficient exposure to elevated temperatures can completely erase the refractive index change in some Ge-doped gratings. Gratings can be completely erased by several hours of heating to greater than 500 °C. Thermal stability has been described by an activation energy model which assumes a distribution of energies for the defects giving rise to the refractive index change. Therefore, at a given anneal temperature a fixed fraction of the index change will quickly decay, leaving only stable defects that easily survive at lower temperatures. By annealing at several temperatures for a fixed time, a relationship between temperature and the time required for a given refractive index decay may then be derived and used in accelerated aging tests of fiber gratings. For 20-year grating reliability of ~1% decrease in UV-induced refractive index change, a typical Ge-doped grating requires annealing at 140 °C for 12 hours.

Grating Inscription

The first method used in UV grating inscription is known as 'internal writing'. In this method light propagating in the fiber causes a standing wave to form within the core, and UV photosensitivity from a two-photon process results in permanent grating formation. Both LPG and FBG resonances have been fabricated using this method. The technique is inflexible, however, because the grating parameters are not easily controlled.

Another important advance that made possible the widespread use of fiber gratings is the 'transverse side-writing' technique. In this method, a UV interference pattern is projected through the side of the fiber onto the core region. The first demonstration of this technique employed a free space interferometer to produce the UV fringes. A significant improvement on this technique employs a zero-order nulled phase mask to generate the interference pattern. The phase mask acts to split the incident beam, and these two beams then form the interference pattern. The method is shown in Figure 6. The advantage of the phase mask
technique is that the fringes are very stable, since the and more than 10 nm in compression. Passive
UV beams propagate only a short distance. Another thermal stabilization of grating resonances may also
important technique for writing fiber gratings is the be achieved by designing packages in which strain
‘point-by point’ writing method. In a point-by-point and temperature variations cancel each other.
technique the UV interference pattern is modulated
and translated along a fiber in such a way that a long Fiber Grating Stabilized Diode Lasers and Fiber
grating may be formed. Although complex and Lasers
sensitive, such methods can yield gratings as long as One of the first important commercial applications of
a few meters and also allow for the fabrication of fiber gratings was in stabilization of fiber coupled
complex grating profiles. Point-by-point methods diode lasers. Normally these lasers operate on several
have been used to fabricate broadband chirped cavity modes making the output unstable. If feedback
dispersion compensating gratings and Bragg reflec- is provided in the form of a narrowband reflection
tors with very low dispersion. then the laser output will be only in this narrow
LPGs may also be fabricated with UV irradiation. bandwidth. The fiber grating is implemented in the
An amplitude mask is typically used. Requirements fiber pigtail of the diode laser and typically provides a
for beam coherence are greatly reduced because of few percent reflectivity over a bandwidth of , 1 nm.
the long period (typically a few hundred microns). The application is depicted schematically in Figure 7.
Other techniques may also be used to fabricate LPGs. More generally, fiber gratings have been used to
The fiber may be periodically deformed by heating realize fiber laser cavities and fiber distributed feed-
or with periodic microbends. Dynamic LPGs may back lasers. One important example is the cascaded
also be produced by propagating acoustic waves Raman resonator fiber laser, in which several FBGs
along the fiber. recycle Raman pump light to produce high-power,
fiber coupled light at any design wavelength.
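The reflection wavelengths quoted above are set by the Bragg phase-matching condition discussed earlier in the article; a minimal sketch of the standard first-order relation (the numbers are illustrative assumptions, not values from the article):

```python
def bragg_wavelength(n_eff, period_nm):
    """First-order Bragg condition: lambda_B = 2 * n_eff * Lambda."""
    return 2.0 * n_eff * period_nm

# Typical telecom FBG: effective index ~1.45, period ~535 nm
print(bragg_wavelength(1.45, 535.0))  # -> 1551.5 (nm), in the C-band
```

Tuning the period (by strain or temperature) shifts the resonance proportionally, which is the basis of the packaging schemes mentioned above.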
Figure 8 Top: Add/drop node employing fiber grating reflector. Bottom: characteristic reflection and transmission spectrum of an
add/drop fiber grating filter.
a fiber grating based add/drop multiplexer. The grating is located between two three-port optical circulators. (A circulator is a bulk optic element that allows transmission only from port n to port n + 1.) Different channels are incident upon the grating from the left and only one particular channel undergoes reflection at the grating. Note that by carefully designing the grating properties the reflectivity in the adjacent channel can be made to be negligible. This one channel is reflected and re-routed by the three-port circulator. The remainder of the channels are transmitted through the grating. The grating also enables the 'add' functionality. In this case light originating from a different part of the optical network enters the link again through an optical circulator and is reflected off the fiber grating.

Gain-Flattening Loss Filters

A primary strength of fiber gratings is the diversity of filter shapes that is achievable through modification of the grating profile. Gain-flatteners are an important application requiring such filter design. Gain-flattening filters are important in WDM telecommunications systems in order to smooth the wavelength dependence of fiber amplifiers (such as Er or Raman amplifiers). Both LPGs and FBGs may be employed as gain-flattening filters. The desired loss spectrum is translated into a corresponding grating profile. LPGs have the advantage of low return loss, but the LPG resonance wavelengths and strengths are very sensitive to the fiber geometry, making the computation of the grating profile less accurate. FBGs may also be used in both tilted and untilted form; however, they typically have back-reflections that require careful fiber grating growth or an isolator to suppress back-reflections. Figure 9 shows an example of a gain-equalizing loss filter using LPGs.

Figure 9 Long-period grating gain equalizer. Several individual gratings (dashed line) are combined to produce the overall shape (solid line).

Dispersion Compensator

Chromatic dispersion can limit the transmission distances and capacities of long-haul fiber links. Chromatic dispersion arises from the variation in propagation velocity with wavelength. An initially sharp pulse is thus dispersed and begins to overlap with adjacent pulses, resulting in degraded signal quality. A key technology for long-haul fiber links is therefore 'dispersion compensation'. The most common solution has been 'dispersion compensating fiber' (DCF), which is simply fiber in which the chromatic dispersion has been engineered to be the exact opposite of the dispersion in the fiber link. Limitations of DCF include nonlinearities, higher-order dispersion, and lack of tunability. Fiber gratings offer an alternative approach that overcomes these problems.

Fiber Bragg gratings can provide strong chromatic dispersion when operated in reflection with a three-port circulator (see Figure 10). In this case the grating period is made to vary along the length of the grating. This means that different wavelengths in a pulse reflect at different positions in the grating. This gives rise to wavelength-dependent group delay and therefore chromatic dispersion. The advantage of the fiber grating compared to DCF is that the device can be made very compact, with potentially lower insertion loss, reduced optical nonlinearity, and controllable optical characteristics. In particular, fiber gratings offer the possibility of engineering the chromatic dispersion to allow for compensation of dispersion slope and other degradations such as polarization mode dispersion.

Optical Monitors

Fiber gratings may also be employed as compact optical taps coupling core guided light to the radiation modes propagating outside the fiber (see Figure 3b). When a conventional Bragg grating is tilted with respect to the fiber axis it can scatter light out of the fiber in a highly directional manner. The angle of scatter is also wavelength dependent because of the phase matching condition discussed above. This wavelength dependence may form the basis of a compact wavelength monitor. The fiber grating is used to simultaneously couple light out of the fiber and separate the wavelengths. A simple […]
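The wavelength-dependent group delay of the chirped grating described above can be estimated with a back-of-envelope calculation: a wavelength reflected at depth z makes a round trip of delay 2·n_eff·z/c, and mapping the chirped bandwidth across the grating length gives the dispersion. (A hedged sketch; the numbers are illustrative and the relation is an order-of-magnitude estimate, not a design formula from the article.)

```python
# Rough dispersion of a linearly chirped fiber Bragg grating:
# spreading the round-trip delay 2 * n_eff * L / c over the chirped
# bandwidth gives roughly D = 2 * n_eff * L / (c * d_lambda).

C = 299792458.0  # speed of light, m/s

def chirped_fbg_dispersion_ps_per_nm(n_eff, length_m, bandwidth_nm):
    delay_s = 2.0 * n_eff * length_m / C      # round-trip delay across grating
    return (delay_s * 1e12) / bandwidth_nm    # ps of delay spread per nm

# Illustrative values: 10 cm grating, n_eff = 1.45, 1 nm chirped bandwidth
print(round(chirped_fbg_dispersion_ps_per_nm(1.45, 0.10, 1.0)))  # -> 967 ps/nm
```

A device of only ~10 cm can thus compensate roughly the dispersion of tens of kilometers of standard fiber in its bandwidth, which is the compactness advantage noted in the text.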
Further Reading

Espindola RP, Windeler RS, Abramov AA, et al. (1999) External refractive index insensitive air-clad long period fiber grating. Electronics Letters 35: 327–328.
Hill KO, Fujii Y, Johnson DC and Kawasaki BS (1978) Photosensitivity in optical fiber waveguides: applications to reflection filter fabrication. Applied Physics Letters 32: 647–649.
IEEE Journal of Lightwave Technology (1997) Special issue on fiber gratings, photosensitivity and poling. 15(8).
Kashyap R (1999) Fiber Bragg Gratings. New York: Academic Press.
Kogelnik H (1990) Theory of optical waveguides. In: Tamir T (ed.) Guided-Wave Optoelectronics. Berlin: Springer-Verlag.
Lemaire PJ, Atkins RM, Mizrahi V and Reed WA (1993) High-pressure H2 loading as a technique for achieving ultrahigh UV photosensitivity and thermal sensitivity in GeO2 doped optical fibers. Electronics Letters 29: 1191–1193.
Marcuse D (1974) Theory of Dielectric Optical Waveguides. New York: Academic Press.
Meltz G, Morey WW and Glenn WH (1989) Formation of Bragg gratings in optical fibers by a transverse holographic method. Optics Letters 14: 823–825.
Othonos A and Kalli K (1999) Fiber Bragg Gratings: Fundamentals and Applications in Telecommunications and Sensing. Norwood, MA: Artech House.
Poladian L (1993) Graphical and WKB analysis of nonuniform Bragg gratings. Physical Review E 48: 4758–4767.
Westbrook PS, Eggleton BJ, Windeler RS, et al. (2001) Cladding mode loss in hybrid polymer-silica microstructured optical fiber gratings. IEEE Photonics Technology Letters 12: 495–497.
FOURIER OPTICS
S Jutamulia, University of Northern California, Petaluma, CA, USA

© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

Coherent optical information processing is almost entirely based on the Fourier-transform property of a lens. A Fourier-transform lens is actually an ordinary lens. If the input transparency is placed in the front focal plane of the lens and illuminated with coherent collimated light (a plane wave), the amplitude function in the back focal plane of the lens will be the Fourier transform of the input transparency, as shown in Figure 1. A coherent optical processor usually consists of two lenses, as shown in Figure 2; thus it performs two Fourier transforms successively. The first lens transforms the input function from the space domain into the frequency domain, and the second lens transforms the frequency function from the frequency domain back to the output function in the space domain. […] simple geometrical theory based on geometrical optics.

We start our discussion by looking at Figure 1. Consider that a coherent point source is at the center of the front focal plane of the lens, which is denoted by plane (x, y). The point source generates a collimated beam parallel to the optical axis. The amplitude function at the back focal plane of the lens, which is denoted by plane (u, v), can be described as unity. Consequently, we can express mathematically:

f(x, y) = δ(x, y)   [1]

F(u, v) = 1   [2]

where f(x, y) is the amplitude function in the front focal plane, and F(u, v) is the amplitude function in the back focal plane.

If the point source shifts a distance a along the y-axis, the amplitude function in the front focal plane is

f(x, y) = δ(x, y + a)   [3]
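The shift property in eqns [1]–[3] is the Fourier shift theorem: displacing the point source tilts the output plane wave but leaves its magnitude at unity. A discrete NumPy sketch (the FFT stands in for the optical transform, with the λf coordinate scaling dropped):

```python
import numpy as np

# Discrete analogue of eqns [1]-[3]: shifting a point source by `a` samples
# multiplies its Fourier transform by a linear phase, while the magnitude
# stays unity -- the collimated beam is merely tilted.
N, a = 64, 5
f0 = np.zeros(N); f0[0] = 1.0          # delta at the origin, eqn [1]
fa = np.zeros(N); fa[a] = 1.0          # delta shifted by a, cf. eqn [3]

F0 = np.fft.fft(f0)                    # eqn [2]: identically 1
Fa = np.fft.fft(fa)                    # linear phase exp(-2*pi*1j*a*k/N)

k = np.arange(N)
print(np.allclose(F0, 1.0))                               # True
print(np.allclose(np.abs(Fa), 1.0))                       # True
print(np.allclose(Fa, np.exp(-2j * np.pi * a * k / N)))   # True
```

The linear phase here plays the role of the tilted wavefront derived in eqns [5]–[9] below.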
the optical axis. In other words, plane (u, v) is not parallel to the wavefront.

If the phase at the center of plane (u, v) is considered zero, the phase at any point of plane (u, v) can be calculated referring to Figure 3. Using simple geometry, we know that

d(u, v) = (a/f) v   [5]

where d(u, v) is the optical path difference. Correspondingly, the phase is

φ(u, v) = (2π/λ) d(u, v) = (2π/λf) a v   [6]

where λ is the wavelength of light.

In complex representation, the amplitude function is expressed as

F(u, v) = A(u, v) exp[iφ(u, v)]   [7]

where A(u, v) and φ(u, v) are the magnitude and phase of the amplitude function, respectively. From eqn [4], we know that

A(u, v) = 1   [8]

Substituting eqns [8] and [6] into [7] yields

F(u, v) = exp[i (2π/λf) a v]   [9]

Equation [9] is the amplitude function in the back focal plane of the lens, when the amplitude function in the front focal plane of the lens is given by eqn [3].

Consider now that coordinate (x, y) is rotated by angle α to new coordinates (x′, y′) as shown in Figure 4a. The amplitude function in the front focal plane now becomes

f(x′, y′) = δ(x′ + p′, y′ + q′)   [10]

where

p′ = a sin α   [11a]
q′ = a cos α   [11b]

Correspondingly, coordinate (u, v) is also rotated by the same angle α to new coordinates (u′, v′) as shown in Figure 4b. The old coordinates u and v can be described in terms of the new coordinates u′ and v′ as follows:

u = u′ cos α − v′ sin α   [12a]
v = u′ sin α + v′ cos α   [12b]
Substituting eqn [12b] into eqn [6] yields By again changing parameters p and q to x and y;
respectively, eqn [21] becomes
2p
fðu0 ; v0 Þ ¼ aðu0 sin a þ v0 cos aÞ ½13 " #
lf ðð1 2p
Fðu;vÞ¼ f ðx;yÞexp 2i ðxuþyvÞ dxdy ½22
Further, substituting eqns [11a] and [11b] into 21 lf
[13] yields
which is the optical Fourier transform of the input
2p 0 0 function f ðx;yÞ:
fðu0 ; v0 Þ ¼ ðp u þ q0 v0 Þ ½14
lf
Thus, the amplitude function in the back focal plane 4-f Coherent Optical Processor
of lens is If the amplitude function in the front focal plane of a
" # lens is denoted by f ðx; yÞ; then the amplitude function
0 0 2p 0 0 0 0 in the back focal plane is simply Fðu; vÞ given in
Fðu ; v Þ ¼ exp i ðp u þ q v Þ ½15
lf eqn [22], which is the Fourier transform of f ðx; yÞ:
Mathematically, f ðx; yÞ can also be derived from
For convenience, we now change parameters x0 and Fðu; vÞ (eqn [4]). After neglecting the normalization
y to x and y; u0 and v0 to u and v; p0 and q0 to 2p and
0
factor, it can be expressed as follows:
2q: Equations [10] and [15] become
" #
ð ð1 2p
f ðx; yÞ ¼ dðx 2 p; y 2 qÞ ½16 f ðx; yÞ ¼ Fðu; vÞexp i ðxu þ yvÞ du dv ½23
" # 21 lf
2p
Fðu; vÞ ¼ exp 2 i ðpu þ qvÞ ½17 The normalization factor ð1=lf Þ2 is caused by the
lf
coordinate scale factor lf in the Fourier-transform
We may simply consider that f ðx; yÞ is an input plane ðu;vÞ:
function and Fðu; vÞ is an output function of a system, Equations [22] and [23] are Fourier transform and
since they are both physically the same amplitude inverse Fourier transform, respectively. The operation
functions. If there are two mutually coherent point of Fourier transform is represented by a notation F{·};
sources at ð0; 0Þ and ð0; 2aÞ with different amplitudes and inverse Fourier transform is F21 {·}:
k and l in the front focal plane, the input function is Two Fourier transform lenses can be combined
into a two-lens processor, which is better known as a
f ðx; yÞ ¼ kdðx; yÞ þ ldðx; y þ aÞ ½18 4-f optical processor, as shown in Figure 2. Figure 2
illustrates the principle of a 4-f optical processor, in
These two point sources will generate two collimated
which f ðx; yÞ; Fðu; vÞ; and f ðj; hÞ are amplitude
beams. Since they are mutually coherent, two
functions in input, Fourier transform, and output
collimated beams will interfere, and the resultant
planes, respectively. Since Fðu; vÞ is the function of
amplitude function in the back focal plane is simply
spatial frequency, Fourier-transform plane is also
the superposition of two beams, which is
called a frequency plane. Since a lens performs
!
2p only Fourier transform, to perform inverse Fourier
Fðu; vÞ ¼ k þ l exp i av ½19 transform, the output plane must be rotated by
lf
1808. Thus, orientations of j and h correspond to
Thus, the lens is a linear system when the light is 2x and 2y:
coherent. Furthermore, if input function f ðx; yÞ is a In practice, f ðx; yÞ is represented by an optical
group of mutually coherent point sources with differ- mask, which can be a film transparency or a spatial
ent amplitudes (i.e., an optical transparency illumina- light modulator. The optical mask with f ðx; yÞ
ted with a planewave), it can be expressed as follows: amplitude transmittance is illuminated with colli-
mated coherent light to produce the amplitude
ð ð1 function f ðx; yÞ: The Fourier transform Fðu; vÞ is
f ðx; yÞ ¼ f ðp; qÞdðx 2 p; y 2 qÞdp dq ½20
21 physically formed in the frequency plane. The
frequency content of the input function f ðx; yÞ can
Since the lens is a linear system, output function will be directly altered by placing an optical mask in the
be the same superposition as follows: frequency plane. Finally, an image with amplitude
" # function f ðj; hÞ is formed in the output plane. Notice
ðð1 2p
Fðu;vÞ¼ f ðp;qÞexp 2i ðpuþqvÞ dpdq ½21 that all detectors, including our eyes, detect intensity
21 lf that is the square of amplitude.
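The "two successive forward transforms" picture of the 4-f processor can be checked discretely: applying the FFT twice returns the input flipped through the origin, which is exactly the 180° rotation of the output plane noted above. (A sketch; the FFT stands in for the optical transform, with scale factors removed by hand.)

```python
import numpy as np

rng = np.random.default_rng(0)
f = rng.random((8, 8))                 # input amplitude f(x, y)

# Lens 1: forward transform to the frequency plane (eqn [22], scaling dropped)
F = np.fft.fft2(f)

# A clear frequency plane (no mask), then lens 2: a second forward transform
out = np.fft.fft2(F) / f.size          # remove the N*N FFT scaling

# The output equals the input rotated by 180 degrees: f(-x, -y)
flipped = np.roll(f[::-1, ::-1], shift=(1, 1), axis=(0, 1))  # index 0 stays fixed
print(np.allclose(out, flipped))       # True
```

Placing a mask in the frequency plane amounts to multiplying `F` elementwise before the second transform, which is the spatial filtering operation discussed next.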
The function H(u, v), which is the Fourier transform of the impulse response h(x, y), is called the transfer function. Actually, H(u, v) changes the spectrum of the input function F(u, v) to become O(u, v); the spectrum of the output function is O(u, v) = H(u, v)F(u, v).

The first application of spatial filtering may be the restoration of a photograph. The image formed by a camera is a collection of spots. If the image is in focus, the size of the spots is very fine, and thus the image looks sharp. If the image is out of focus, the spot size is fairly large and the image is blurred. Interestingly, an out-of-focus image can be restored using the spatial filtering technique.

Consider that the original image is f(x, y). Because it is out of focus, a point in the image becomes a spot expressed by h(x, y). If the image is in focus, h(x, y) is a delta function. If the in-focus image is f(x, y), its frequency function is F(u, v). The defocused image o(x, y) is expressed by eqn [28], and its frequency function is F(u, v)H(u, v). To produce the in-focus image, the frequency function must be F(u, v). Thus F(u, v)H(u, v) must be multiplied by the filter function 1/H(u, v) to produce F(u, v) only. The filter of 1/H(u, v) can be made from a photographic plate to provide its magnitude, and a glass plate having two values of thickness to provide the +1 and −1 signs. The glass plate is essentially a phase filter. The photographic plate with varying density function is an amplitude filter. By passing the frequency function F(u, v)H(u, v) through the combination of amplitude and phase filters, the result will be F(u, v) only. When F(u, v) is again Fourier transformed by a lens, an in-focus image f(x, y) will be produced.

[…] recording the interference pattern of G(u, v) and the reference beam R(u, v). R(u, v) is an oblique collimated beam, having a magnitude of unity, which is expressed mathematically as

R(u, v) = exp(iαu)   [34]

and

α = (2π sin θ)/λ   [35]

where θ is the angle of the reference beam and λ is the wavelength of light.

The transmittance of the recorded hologram will be

T(u, v) = |G(u, v) + R(u, v)|²
        = |G(u, v)|² + 1 + G(u, v) exp(−iαu) + G*(u, v) exp(iαu)   [36]

where * denotes the complex conjugate. The hologram, which is called a complex matched spatial filter, is then placed back in the Fourier-transform plane, as shown in Figure 7. A new input function f(x, y) is displayed in the input plane. The Fourier transform of the input function, F(u, v), will be generated in the Fourier-transform plane. The amplitude of light, […]
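The inverse-filter restoration described above can be sketched numerically: blur an image by multiplying its spectrum by H, then restore it by multiplying by 1/H. (A hedged, noiseless discrete sketch; with noise, or where H has zeros, a regularized filter would be needed instead of plain division.)

```python
import numpy as np

rng = np.random.default_rng(1)
f = rng.random((16, 16))                 # "in-focus" image f(x, y)

# A small asymmetric blur h(x, y); its transfer function H(u, v) has no
# zeros on the 16x16 frequency grid, so the inverse filter is well defined.
h = np.zeros((16, 16))
h[0, 0], h[0, 1], h[1, 0], h[1, 1] = 0.5, 0.25, 0.15, 0.1

F = np.fft.fft2(f)
H = np.fft.fft2(h)                       # transfer function H(u, v)
O = F * H                                # spectrum of the defocused image

# Multiplying by the inverse filter 1/H restores F(u, v); a further
# transform then recovers the in-focus image, as in the text above.
restored = np.fft.ifft2(O / H).real
print(np.allclose(restored, f))          # True
```

The amplitude/phase split of the optical filter corresponds here to dividing by |H| and by its phase factor separately.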
Figure 8 Joint Fourier transform of two input functions.

Figure 9 Fourier transform of the joint power spectrum.
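The joint-transform arrangement of Figure 8 can be simulated discretely, and the result anticipates the term-by-term analysis below: placing two copies of a pattern at ±a and inverse-transforming the recorded joint power spectrum yields correlation peaks at 0 and ±2a. (A 1-D NumPy stand-in for the optical system, with unit sampling and the λf scaling dropped.)

```python
import numpy as np

N, a = 256, 40
t = np.zeros(N)
t[:8] = [1, -1, 1, 1, -1, 1, -1, -1]      # a small reference pattern f

# Joint input: f shifted to +a and an identical g shifted to -a
s = np.roll(t, a) + np.roll(t, -a)

# A square-law detector records the joint power spectrum |S|^2
P = np.abs(np.fft.fft(s)) ** 2

# Inverse transform of |S|^2 gives the four correlation terms
o = np.fft.ifft(P).real

# Autocorrelations sit at the origin; cross-correlations peak at +/-2a
peaks = sorted(int(i) for i in np.argsort(o)[-3:])
print(peaks)  # [0, 80, 176], i.e. 0 and +/-2a (mod N)
```

Because f and g are identical here, the two cross-correlation peaks are equal, as noted at the end of the analysis below.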
S(u, v) is the joint Fourier spectrum of f(x, y) and g(x, y). It can be written as

S(u, v) = F(u, v) exp[i (2π/λf) a u] + G(u, v) exp[−i (2π/λf) a u]   [45]

where F(u, v) and G(u, v) are the Fourier transforms of f(x, y) and g(x, y), respectively.

If a square-law detector is placed in the back focal plane of the lens, the recorded pattern of the joint power spectrum |S(u, v)|² is

|S(u, v)|² = |F(u, v)|² + F(u, v)G*(u, v) exp[i (4π/λf) a u]
           + F*(u, v)G(u, v) exp[−i (4π/λf) a u] + |G(u, v)|²   [46]

where * denotes the complex conjugate. A photographic plate is a square-law detector and can be used to record |S(u, v)|². When the developed plate is placed in the input plane, as shown in Figure 9, the input transmittance is |S(u, v)|² as given in eqn [46].

The inverse Fourier transform of the joint power spectrum |S(u, v)|², in the back focal plane of the lens, is

o(ξ, η) = ∫∫_{−∞}^{∞} |S(u, v)|² exp[i (2π/λf)(ξu + ηv)] du dv   [47]

which consists of four terms, according to eqn [46]. We now analyze this equation term by term. The first term is

o₁(ξ, η) = ∫∫_{−∞}^{∞} |F(u, v)|² exp[i (2π/λf)(ξu + ηv)] du dv   [48]

which is the auto-correlation of f(x, y):

o₁(ξ, η) = ∫∫_{−∞}^{∞} f(α, β) f*(α − ξ, β − η) dα dβ   [49]

The second term is

o₂(ξ, η) = ∫∫_{−∞}^{∞} F(u, v)G*(u, v) exp[i (4π/λf) a u] exp[i (2π/λf)(ξu + ηv)] du dv   [50]

which is

o₂(ξ, η) = ∫∫_{−∞}^{∞} f(α, β) g*[α − (ξ + 2a), β − η] dα dβ   [51]

Equation [51] is the correlation between f(x, y) and g(x, y), shifted by −2a in the ξ-axis direction. Similarly, the third term will be the correlation between g(x, y) and f(x, y), shifted by 2a in the ξ-axis direction:

o₃(ξ, η) = ∫∫_{−∞}^{∞} g(α, β) f*[α − (ξ − 2a), β − η] dα dβ   [52]

And the fourth term is

o₄(ξ, η) = ∫∫_{−∞}^{∞} g(α, β) g*(α − ξ, β − η) dα dβ   [53]

which is the auto-correlation of g(x, y) and overlaps with the first term at the center of the output plane.

From the analysis given above, it is understood that the correlation of f(x, y) and g(x, y), and the correlation of g(x, y) and f(x, y), are shifted by −2a and 2a, respectively, along the ξ-axis. The correlation outputs are centered at (−2a, 0) and (2a, 0) in the back focal plane of the lens. If f(x, y) and g(x, y) are the same, the two correlation outputs are identical. The first and fourth terms are the autocorrelations of f(x, y) and g(x, y), centered at (0, 0). Thus the second and third terms, which are correlation signals, can be isolated. A versatile real-time joint transform
correlator, using inexpensive liquid crystal televisions, is widely used in laboratory demonstrations.

See also

Information Processing: Coherent Analogue Optical Processors. Spectroscopy: Fourier Transform Spectroscopy.

Further Reading

Abbe E (1873) Beiträge zur Theorie des Mikroskops und der mikroskopischen Wahrnehmung. Archiv für Mikroskopische Anatomie 9: 413–468.
Bracewell RN (1965) The Fourier Transform and Its Applications. New York: McGraw-Hill.
Cutrona LJ, Leith EN, Palermo CJ and Porcello LJ (1960) Optical data processing and filtering systems. IRE Transactions on Information Theory IT-6: 386–400.
Goodman JW (1968) Introduction to Fourier Optics. New York: McGraw-Hill.
Jutamulia S (1992) Joint transform correlators and their applications. Proceedings of the SPIE 1812: 233–243.
Jutamulia S and Asakura T (2002) Optical Fourier-transform theory based on geometrical optics. Optical Engineering 41: 13–16.
Leith EN and Upatnieks J (1962) Reconstructed wavefronts and communication theory. Journal of the Optical Society of America 52: 1123–1130.
Porter AB (1906) On the diffraction theory of microscopic vision. Philosophical Magazine 11: 154–166.
Rau JE (1966) Detection of differences in real distributions. Journal of the Optical Society of America 56: 1490–1494.
Tsujiuchi J (1963) Correction of optical images by compensation of aberrations and by spatial frequency filtering. Progress in Optics 2: 133–180.
Vander Lugt A (1964) Signal detection by complex spatial filtering. IEEE Transactions on Information Theory IT-10: 139–145.
Weaver CS and Goodman JW (1966) A technique for optically convolving two functions. Applied Optics 5: 1248–1249.
Yu FTS (1983) Optical Information Processing. New York: Wiley.
Yu FTS and Ludman JE (1988) Joint Fourier transform processor. Microwave and Optical Technology Letters 1: 374–377.
Yu FTS, Jutamulia S, Lin TW and Gregory DA (1987) Adaptive real-time pattern recognition using a liquid crystal TV based joint transform correlator. Applied Optics 26: 1370–1372.
GEOMETRICAL OPTICS
where n and n′ are the indices of refraction of the two media and θ and θ′ are the angles as measured from the normal. Figure 4 shows a ray of light leaving an object point P and striking the first surface of a lens at point P₁. It is refracted there, and proceeds to the second surface. All rays shown lie in the z,x-plane or meridional plane, which passes through the symmetry axis OZ. The amount of refraction is specified by Snell's law, eqn [2], which we shall now simplify. The sine of an angle can be approximated by the value of the angle itself, if this value is small when expressed in radians (less than 0.1 rad), and […] but using the paraxial approximation simplifies this to

φ = x₁/r₁   [6]

Substituting for the angles gives

n′₁α′₁ = ((n₁ − n′₁)/r₁) x₁ + n₁α₁   [7]

Notice that the distance from the point P₁ to the axis is labeled as either x₁ or x′₁. This strange notation leads to the trivial relation

x₁ = x′₁   [8]
the refraction matrix R₁ for surface 1, defined as

R₁ = | 1  −k₁ |
     | 0    1 |   [11]

Next, we look at what happens to the ray as it travels from surface 1 to surface 2. As it goes from P₁ to P₂, its distance from the axis becomes

x₂ = x′₁ + t′₁ tan α′₁   [12]

or, using the paraxial approximation for small angles:

x₂ = x′₁ + t′₁ α′₁   [13]

Another approximation we shall use is to regard t′₁ as being equal to the distance between the lens vertices V₁ and V₂. Using the identity

α₂ = α′₁   [14]

leads to a second matrix equation:

| n₂α₂ |   | 1        0 | | n′₁α′₁ |
| x₂   | = | t′₁/n′₁  1 | | x′₁    |   [15]

where the 2 × 2 translation matrix T₂₁ is defined as

T₂₁ = | 1        0 |
      | t′₁/n′₁  1 |   [16]

To get an equation which combines a refraction followed by a translation, we take the one-column matrix on the left side of eqn [9] and substitute it into eqn [15] to obtain

| n₂α₂ |          | n₁α₁ |
| x₂   | = T₂₁R₁  | x₁   |   [17]

Note that the 2 × 2 matrices appear in right-to-left order and that the multiplication of two square matrices is an extension of the rule given above. This equation gives the location and slope of the ray which strikes surface 2 after a refraction and a translation. To continue the ray trace at surface 2, we introduce a second refraction matrix R₂ by expressing k₂ in terms of the two indices at this surface and the radius. Then eqn [17] is extended to give the relation:

| n′₂α′₂ |            | n₁α₁ |
| x′₂    | = R₂T₂₁R₁  | x₁   |   [18]

This process can obviously be applied to any number of components. The product of the three 2 × 2 matrices in the above equation is known as the system matrix S₂₁ and it completely specifies the effect of the lens on the incident ray passing through it. It is also written as

S₂₁ = R₂T₂₁R₁ = |  b  −a |
                | −d   c |   [19]

where the four quantities a, b, c, and d are known as the Gaussian constants. They are used extensively in specifying the behavior of a lens or an optical system. We now state the sign and notation conventions which are necessary for the application of paraxial matrix methods:

1. Light normally travels from left to right.
2. The optical systems we deal with are symmetrical about the z-axis. The intersections of the refracting or reflecting surfaces with this axis are the vertices and are designated in the order encountered as V₁, V₂, etc.
3. Positive directions along the axes are measured from the origin in the usual Cartesian fashion, so that horizontal distances (that is, along the z-axis) are positive if measured from left to right. Angles are positive when measured up from the z-axis.
4. Quantities associated with the incident ray are unprimed; those for the refracted ray are primed.
5. A subscript denotes the associated surface.
6. If the center of curvature of a surface is to its right, the radius is positive, and vice versa.
7. There is a special rule for mirrors, to be explained below.

Using the Gaussian Constants

Figure 5 shows the double convex lens of Figure 4 with the assumed locations of the cardinal points indicated. These positions, which we shall now determine accurately, are at distances designated as lF, l′F, lH, and l′H and are measured from the associated vertex. The object position can be called t, a positive quantity which is measured to the right from object to the first vertex, or it can be called t₁ if measured in the opposite direction. For the first choice, the matrix specifying the translation from object to lens will have the quantity t/n₁ in its lower left-hand corner. However, it is both logical and convenient to use the first vertex as the reference point. Hence, we replace t/n₁ in the translation matrix with the quantity −t₁/n₁, and remember to specify t₁ as a negative number when calculating the image position. The equation connecting object and image is obtained by starting with this matrix, multiplying it by the system matrix of eqn [19], and finally by a translation matrix
corresponding to the translation t′₂ from lens to image, to obtain:

| n′₂α′₂ |   | 1        0 | |  b  −a | | 1        0 | | n₁α₁ |
| x′     | = | t′₂/n′₂  1 | | −d   c | | −t₁/n₁   1 | | x    |   [20]

where α or α₁ is the angle that the ray from the top of the object makes with the z-axis. This equation takes the initial value of the inclination α and of the position x of the ray leaving the object, and determines the final values, α′ and x′, at the image. The product of the three 2 × 2 matrices on the right-hand side can be consolidated into a single matrix called the object-image matrix S_P′P. Then eqn [20] can be written as

| n′₂α′₂ |           | n₁α₁ |
| x′     | = S_P′P   | x    |   [21]

and this matrix has the complicated form:

S_P′P = | b + at₁/n₁                                 −a           |
        | bt′₂/n′₂ + at₁t′₂/(n₁n′₂) − d − ct₁/n₁     c − at′₂/n′₂ |   [22]

If we put this matrix into the previous equation, we obtain an unsatisfactory result: the value of x′ will depend on the angle α₁ made by the incident ray, as can be seen by multiplying the two matrices on the right side. The ratio of x′ to x is called the image magnification m; that is,

m = x′/x   [23]

A perfect image can be formed only if the magnification is determined solely by the object distance and the constants of the lens. To eliminate this difficulty, the lower left-hand element in the matrix of eqn [22] is required to be equal to zero, and the magnification is then:

m = x′/x = c − at′₂/n′₂   [24]

Figure 5 Definitions of cardinal plane locations.

The determinant of this matrix is unity, since it is the product of three matrices whose individual determinants are unity. It follows that

1/m = b + at₁/n₁   [25]

so that

| n′₂α′₂ |   | 1/m  −a | | n₁α₁ |
| x′     | = | 0     m | | x    |   [26]

Using the fact that the lower left-hand element of the matrix in eqn [22] is zero shows that

t′₂/n′₂ = (d + ct₁/n₁)/(b + at₁/n₁),   t₁/n₁ = (d − bt′₂/n′₂)/(−c + at′₂/n′₂)   [27]

This equation connects the object distance t₁ with the image distance t′₂, both quantities being measured from the appropriate vertex. This relation is known as the generalized Gauss lens equation. It applies to all lens systems, no matter how complicated.

We can determine the location of the unit planes by letting m = 1. This t′₂ in eqn [27] is the location l′H of the image-space unit plane, and it becomes

l′H = n′₂(c − 1)/a   [28]

This relation expresses the location of the unit plane on the image side as the distance from V₂ to H′. Similarly, the location of H with respect to V₁ is

lH = n₁(1 − b)/a   [29]

To locate the focal planes, consider a set of parallel rays coming from infinity and producing a point image at F′. The terms containing t₁ in eqn [27] are
much larger than d or b, and this relation becomes

[…]

The focal lengths f and f′ shown in Figure 5 were defined earlier as the distances between their respective focal and unit planes. Hence

[…]

By eqn [10]:

k₂ = (1.0 − 1.5)/(−1.0) = 0.50   [37]

The system matrix S₂₁ is

[…]

and we note that the determinant is

(0.83)(0.92) − (0.71)(−0.33) = 1.00   [39]

Figure 6 Positions of cardinal planes for an asymmetric lens.
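The matrix bookkeeping above maps directly to code. A sketch (not from the article) that builds the refraction and translation matrices of eqns [11] and [16], composes them right to left as in eqn [19], and repeats the unit-determinant check of eqn [39] for an illustrative biconvex lens:

```python
import numpy as np

def refraction(k):
    """Refraction matrix R = [[1, -k], [0, 1]], with power k = (n' - n)/r."""
    return np.array([[1.0, -k], [0.0, 1.0]])

def translation(t, n):
    """Translation matrix T = [[1, 0], [t/n, 1]] for thickness t in index n."""
    return np.array([[1.0, 0.0], [t / n, 1.0]])

# Illustrative biconvex glass lens in air: r1 = +2, r2 = -2, t' = 0.5, n = 1.5
n_air, n_glass = 1.0, 1.5
R1 = refraction((n_glass - n_air) / 2.0)
R2 = refraction((n_air - n_glass) / -2.0)
S = R2 @ translation(0.5, n_glass) @ R1    # right-to-left order, eqn [19]

b, a = S[0, 0], -S[0, 1]                   # Gaussian constants from S
d, c = -S[1, 0], S[1, 1]
print(round(b * c - a * d, 12))            # determinant bc - ad = 1 -> 1.0
```

Tracing a ray is then a single matrix-vector product on the column (nα, x), exactly as in eqn [18].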
Since x = 0 at the starting point, eqn [26] shows that

n′₂α′₂ = n₁α₁/m   [40]

The ratio of the final angle to the initial angle in this equation is the angular magnification γ, which is then

γ = n₁/(m n′₂)   [41]

The locations of object and image for unit angular magnification are called the nodal points, labeled N and N′. The above definition shows that the linear and angular magnifications are reciprocals of one another when object and image are in air or the same medium. The procedure that located the unit points will show that the nodal points have positions given by

lN/n₁ = (n′₂/n₁ − b)/a   [42]

l′N/n′₂ = (c − n₁/n′₂)/a   [43]

When the two indices are identical, these relations are identical to those for the points H and H′. That is why the rays from P to H and H′ to P′ in Figure 3 are parallel.

Compound Lenses

A great advantage of the paraxial matrix method is the ease of dealing with systems having a large number of lenses. To start simply, consider the cemented doublet of Figure 10: a pair of lenses for which surface 2 of the first element and surface 1 of the second element match perfectly. The parameters of this doublet are given in the manner shown below. It is understood that the first entry under r is r₁, the second entry is r₂, and so on. Note that this doublet has only three surfaces; if the two parts were separated by an air space, producing an air-spaced doublet, then there would be four surfaces to specify. The values of the indices n′ and the spacings t′ are placed between the values of r. The constants of the doublet are then

r        n′       t′
1.0
         1.500    0.5
−2.0
         1.632    0.4
∞                        [44]

S₃₁ = R₃T₃₂R₂T₂₁R₁   [45]

where

k₂ = (n′₂ − n₂)/r₂ = (1.632 − 1.500)/(−2.0)   [46]

Approximations for Thin Lenses

An advantage of paraxial matrix optics is the ease of obtaining useful relations. For example, combining the several expressions for the Gaussian constant a, we obtain:

a = −n₁/f = n′₂/f′ = k₁ + k₂ − k₁k₂t′₁/n′₁   [47]
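The combination rule in eqn [47] can be spot-checked against the matrix product itself (illustrative numbers; k is defined as before):

```python
import numpy as np

# Check eqn [47]: the Gaussian constant a extracted from S = R2 T R1
# equals the combination k1 + k2 - k1*k2*t1p/n1p.
k1, k2, t1p, n1p = 0.25, 0.25, 0.5, 1.5    # assumed sample surface powers

R1 = np.array([[1.0, -k1], [0.0, 1.0]])
T  = np.array([[1.0, 0.0], [t1p / n1p, 1.0]])
R2 = np.array([[1.0, -k2], [0.0, 1.0]])
S = R2 @ T @ R1

a = -S[0, 1]
print(np.isclose(a, k1 + k2 - k1 * k2 * t1p / n1p))  # True
```

This is the matrix form of the familiar two-surface (Gullstrand-type) power addition.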
The thin-lens approximation assumes that the spacing t′1 is approximately zero. Then

1/f′ = (n′1 − 1)(1/r1 − 1/r2)   [49]

which is the form often seen in texts. We then realize that the original form is valid for lenses of any thickness, but the thin lens form can be used if the spacing is much less than the two radii. It is also seen that when the thickness is approximately zero, then

b = c = 1   [50]

and the unit planes have locations

lH = l′H = 0   [51]

Thus, they are now coincident at the center of the lens. A ray from any point on the object would pass undeviated through the geometrical center of the lens, as is often shown in the elementary books. These new values of the Gaussian constants give a system matrix of the form:

S21 = (1  −1/f′; 0  1)   [52]

This matrix contains a single constant, the focal length of the lens, so that the preliminary design of an optical system is merely a matter of specifying the focal lengths of the lenses involved and using the resulting matrices to determine the object–image relationship. We also note that this matrix looks like a refraction matrix; the non-zero element is in the upper right-hand corner. This implies that the refraction–translation–refraction procedure that actually occurs can be replaced, for a thin lens, by a single refraction occurring at the coinciding unit planes. Ophthalmologists take advantage of the thin lens approximation by specifying focal lengths in units called diopters: the reciprocal of the focal length in meters determines the refracting power in diopters. For example, a lens with f′ = 10 cm will be a 10 diopter lens. If two lenses are placed in contact, the combined power is the sum of the individual powers, for multiplying matrices of the above form gives the upper right-hand element as (−1/f′1 − 1/f′2).

Chemical engineers have long known that the index of refraction of a liquid can be determined by observing the empty bore of a thick glass tube with a telescope having a calibrated eyepiece and then measuring the magnification of the bore when the liquid flows through it. Consider a liquid with an index of 1.333 and a tube with bore radius of 3 cm, outer radius of 4 cm, and glass of index equal to 1.500. The first step is to specify the parameters of the optical system. These are the radii r1 = −3, r2 = −4 and the thickness t′1 = 1, regarding the tube as a concentric lens with an object to its left; the indices are n1 = 4/3 (fractions are convenient), n′1 = 3/2, n′2 = 1. Calculating the elements of the two refraction matrices and the translation matrix, the system matrix is

S21 = (1  −1/8; 0  1)(1  0; 2/3  1)(1  1/18; 0  1)   [53]

Multiplying, the Gaussian constants are

a = 2/27, b = 11/12, c = 28/27, d = −2/3   [54]

An important check is to verify that bc − ad = 1, as is the case here. The expressions given previously for the positions of the cardinal points then lead to the values

lH = 3/2, l′H = 1/2, lF = −33/2, l′F = 14, lN = −3, l′N = −4   [55]

These points are shown in Figure 11. Although the same scale has been used for both horizontal and vertical distances, this is usually not necessary; the vertical scale, which merely serves to verify the magnification, can be arbitrary. The entire liquid column is the object in question, but it is simpler to use a vertical radius as the object. Then the image, generated as described in all the previous examples,
coincides with the object and it is enlarged, erect, and virtual. We confirm the image location shown by using the generalized Gaussian lens equation, giving

t′ = [(−2/3) + (28/27)(−3)/(4/3)] / [(11/12) + (2/27)(−3)/(4/3)] = −4   [56]

and this agrees with the sketch. The denominator of this expression was shown to be the reciprocal of the magnification 1/m, with a value of 3/4, and m itself can be calculated from t′ as c − at′ = 4/3; this is another important check on the accuracy of the calculation; the sketch obeys this conclusion, and the purpose of this analysis is confirmed.

Reflectors

To show how reflecting surfaces are handled with matrices, consider a plane mirror located at the origin and normal to the z-axis (Figure 12). The ray which strikes it at an angle α1 leaves at an equal angle, resulting in specular reflection. Since k = 0 for a flat surface, the refraction matrix reduces to the unit matrix and, by eqn [9]:

n′1α′1 = n1α1   [57]

This contradicts Figure 12, since α1 and α′1 are opposite in sign. If, however, we specify that

n′1 = −n1   [58]

then the difficulty is removed. Hence, reflections involve a reversal of the sign on the index of refraction, and two successive reflections restore the original sign. For a mirror, the refraction power is

k1 = (n′1 − n1)/r1 = (−1 − 1)/r1 = −2/r1   [59]

The system matrix for a single refracting surface thus becomes

S11 = R1 = (1  −k1; 0  1) = (1  2/r1; 0  1)   [60]

with

a = −2/r1, b = c = 1, d = 0   [61]

The connection between object and image can be expressed as

(n′1α′1; x′1) = (1  0; t′/n′1  1)(b  −a; −d  c)(1  0; −t/n1  1)(n1α1; x1)   [62]

and using the same procedure that was applied to eqns [28]–[31], the locations of the six cardinal points are

lF = −n1b/a;   l′F = n′1c/a
lH = n1(1 − b)/a;   l′H = n′1(c − 1)/a   [63]
lN/n1 = ((n′1/n1) − b)/a;   l′N/n′1 = (c − (n1/n′1))/a

so that for the spherical mirror:

lF = (−1)(1)/(−2/r1) = r1/2 = l′F   [64]

lH = l′H = 0   [65]

and

lN = ((−1) − 1)/(−2/r1) = r1   [66]

but

l′N/(−1) = (1 − (−1))/(−2/r1);   l′N = r1   [67]
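Both worked examples above (the tube refractometer of eqns [53]–[56] and the spherical mirror of eqns [59]–[67]) can be checked with exact rational arithmetic. This sketch is not from the article; the mirror radius r1 = 10 is an arbitrary illustrative value:

```python
# Sketch: checking the tube-refractometer matrices [53]-[56] and the
# spherical-mirror cardinal points [59]-[67] with exact fractions.
from fractions import Fraction as F

def mat_mul(A, B):
    return [[A[0][0]*B[0][0] + A[0][1]*B[1][0], A[0][0]*B[0][1] + A[0][1]*B[1][1]],
            [A[1][0]*B[0][0] + A[1][1]*B[1][0], A[1][0]*B[0][1] + A[1][1]*B[1][1]]]

R = lambda k: [[F(1), -k], [F(0), F(1)]]      # refraction, k = (n' - n)/r
T = lambda t, n: [[F(1), F(0)], [t/n, F(1)]]  # translation, as in eqn [53]

# Tube refractometer: r1 = -3, r2 = -4, t'1 = 1; n1 = 4/3, n'1 = 3/2, n'2 = 1.
n1, ng, n2 = F(4, 3), F(3, 2), F(1)
k1 = (ng - n1)/F(-3)                           # = -1/18
k2 = (n2 - ng)/F(-4)                           # = 1/8
S = mat_mul(R(k2), mat_mul(T(F(1), ng), R(k1)))
b, a = S[0][0], -S[0][1]                       # S = (b  -a; -d  c), eqn [62]
d, c = -S[1][0], S[1][1]
assert (a, b, c, d) == (F(2, 27), F(11, 12), F(28, 27), F(-2, 3))  # eqn [54]
assert b*c - a*d == 1                                              # determinant check

# Cardinal points, eqn [63] with the final index n'2 = 1 (eqn [55]):
assert -n1*b/a == F(-33, 2) and n2*c/a == 14                  # lF, l'F
assert n1*(1 - b)/a == F(3, 2) and n2*(c - 1)/a == F(1, 2)    # lH, l'H
assert n1*((n2/n1) - b)/a == -3 and n2*(c - n1/n2)/a == -4    # lN, l'N

# Image location and magnification, eqn [56]: vertical-radius object at t = -3.
tn = F(-3)/n1
t_img = (d + c*tn)/(b + a*tn)
assert t_img == -4 and c - a*t_img == F(4, 3)

# Spherical mirror: reflection as refraction with n' = -n (eqn [58]).
nm, nmp, r1m = F(1), F(-1), F(10)
k1m = (nmp - nm)/r1m                           # = -2/r1, eqn [59]
a_m, b_m, c_m = k1m, F(1), F(1)                # eqn [61]
assert -nm*b_m/a_m == r1m/2                    # lF  = r1/2, eqn [64]
assert nmp*c_m/a_m == r1m/2                    # l'F = r1/2
assert nm*((nmp/nm) - b_m)/a_m == r1m          # lN  = r1, eqn [66]
assert nmp*(c_m - nm/nmp)/a_m == r1m           # l'N = r1, eqn [67]
print("all checks pass")
```

Exact fractions reproduce every value quoted in eqns [54]–[56] and confirm that the nodal points of a spherical mirror sit at its center of curvature.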
Aberrations

A Nussbaum, University of Minnesota, Minneapolis, MN, USA

© 2005, Elsevier Ltd. All Rights Reserved.

the approximate or paraxial form of Snell's law, which we write as

n u = n′ u′   [1]

and

n′1L′1 = n1L1 − K1x1   [7]
expected at the focal point F′. The rays a little farther out – the intermediate rays – will cross the axis just a little to the left of F′, and those near the edge of the lens – the marginal rays – will fall very short, as indicated. Rotating this diagram about the axis, we realize that spherical aberration produces a very blurry circular image at the paraxial plane. The three-dimensional envelope of the group of rays produced by this rotation is known as the caustic surface, and the narrowest cross-section of this surface, slightly to the left of the paraxial focal plane, is called the circle of least confusion. If this location is chosen as the image plane, then the amount of spherical aberration can be reduced somewhat. Other methods of improving the image will be considered below. Figure 2 was obtained by adding a graphics step to the computer program described above. Figure 3 shows how to define spherical aberration quantitatively. Two kinds of aberration are specified in this figure. The place where the ray crosses the axis, lying to the left of F′, is at a distance, measured from the paraxial focus, called the longitudinal spherical aberration (LSA), while the distance below the axis at the focal plane is the transverse spherical aberration (TSA).

To reduce this aberration, note in Figure 2 that the amount of refraction at the two surfaces of the lens for marginal rays is not the same. Equalizing the refraction at the two surfaces will improve the situation. Suppose we alter the lens while keeping the index, thickness, and focal length constant; only the radii will change. This can be done if the new radii satisfy the paraxial equation giving the focal length in terms of the lens constants. The process of varying the shape of a lens, while holding the other parameters constant, is called bending the lens. This can be easily accomplished by a computer program, obtaining what are known as optimal shape lenses, which are commercially available. To study their effect on spherical aberration, define the shape factor s of a lens as

s = (r2 + r1)/(r2 − r1)   [9]
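The bending procedure can be sketched by inverting eqn [9] together with the thin-lens focal length relation; this inversion is not given in the article, and the values of f′ and n below are illustrative:

```python
# Sketch: "bending" a thin lens. From s = (r2 + r1)/(r2 - r1) (eqn [9])
# and 1/f = (n - 1)(1/r1 - 1/r2), the radii for a chosen shape factor are
# r1 = 2f(n - 1)/(s + 1) and r2 = 2f(n - 1)/(s - 1). The data are made up.

def bend(f, n, s):
    """Return (r1, r2) for a thin lens of focal length f, index n and
    shape factor s. Assumes s != +1 and s != -1 (the plano-convex
    limits, where one radius becomes infinite)."""
    K = 2.0*f*(n - 1.0)
    return K/(s + 1.0), K/(s - 1.0)

f, n = 100.0, 1.5
for s in (-0.5, 0.0, 0.5):
    r1, r2 = bend(f, n, s)
    # focal length is held constant ...
    assert abs((n - 1.0)*(1.0/r1 - 1.0/r2) - 1.0/f) < 1e-12
    # ... while the shape factor takes the requested value
    assert abs((r2 + r1)/(r2 - r1) - s) < 1e-9
```

A design program would loop over s, trace marginal rays for each candidate shape, and keep the one with the smallest spherical aberration; s = 0 is the symmetric biconvex case.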
Figure 3 Definitions of transverse and longitudinal spherical aberration.

Coma

We have seen how to trace nonparaxial, meridional rays and found that spherical aberration appears because the intermediate and marginal rays do not behave like the central rays. We now wish to deal with skew rays. The meaning of the definition given above is that coma is the failure of skew rays to match the behavior of meridional rays, just as spherical aberration is the failure of meridional rays to match the behavior of paraxial rays. Most texts show coma in its ideal form.

Astigmatism

We have shown that spherical aberration is the failure of meridional rays to obey the paraxial approximation and coma is the failure of skew rays to match the behavior of meridional rays. To see the connection between these two aberrations, consider the object point P of Figure 11. This very complicated diagram can actually be very helpful in understanding the third of the five geometrical aberrations. Let P be the source of a meridional fan: this is the group of rays with the top ray labeled PA and the bottom ray PB. If the lens is completely corrected for spherical aberration, this fan will have a sharp image point P′T lying directly below the z-axis. Now let P also be the source of a fan bounded by the rays PC and PD. This fan is at right angles to the other one; we can think of these rays as having the maximum skewness possible. If the lens has no coma, this fan also will produce a sharp image point P′S which, for a converging lens, will be farther from the lens than the tangential focal point. In other words, even though the lens has been fully corrected for both aberrations, the two corrections will not necessarily produce a common image. In the figure, the meridional fan is called a tangential fan and the skew fan is called a sagittal fan; these are the terms commonly used by lens designers. The failure of the sagittal and tangential rays to produce a single image in a lens corrected for both spherical aberration and coma is known as astigmatism.

Astigmatism, coma, and spherical aberration are the point aberrations that an optical system can have. We have already noted that spherical aberration is the only one that can be associated with a point on the z-axis; the others require an off-axis object. Astigmatism is very common in the human eye; to see how it shows up, look at the sagittal image point P′S of Figure 11. This point is a distance z′S from the x,y-plane. A plane through this point will have a line image on it, the sagittal line, due to the tangential fan, which has already come to a focus and is now diverging. The converse effect, producing a tangential line, occurs at the other image plane. If we locate an image plane halfway between the two, then both fans contribute to the image and in the ideal case it will be a circle. As the image plane is moved forwards or backwards, these images become ellipses and eventually reduce to a sagittal or a tangential line, as shown in Figure 12.

Figure 14 Petzval curvature of field.

Because there are two different image planes, an object with spokes (Figure 13) will
have an image for which the vertical lines are sharp
and the others get gradually poorer as they become
more horizontal, or vice versa, depending on which
image plane is chosen. Eye examinations detect
astigmatism if the spokes appear to go from black
to gray. This example of the effect of astigmatism was
often based on a radial pattern of arrows, which
explains the origin of the word sagittal, derived from
the Latin ‘sagitta’ for arrow.
Figure 15 Barrel and pincushion distortion.
Curvature of Field and Distortion
Having covered the three point aberrations, we now turn to the two lens defects associated with extended objects.
If we move the object point P in Figure 11 closer to or
farther from the z-axis, we would expect the positions
of the tangential and sagittal focal planes to shift, for
it is only when the paraxial approximation holds that
these image points are independent of x: Hence, we
obtain the two curves of Figure 14, which shows what
astigmatism does to the image of a two-dimensional
object. If the astigmatism could be eliminated, the
effect would be to make these curved image planes
coincide, but we have no guarantee that the common
image will be flat, or paraxial. The resulting defect is
called Petzval curvature or curvature of field. For a
single lens, the Petzval surface can be flattened by a
stop in the proper place, and this is usually done in
inexpensive cameras. Petzval curvature is associated
with the z-axis. If we take the object in Figure 11 and
move it along the y-axis, then all rays leaving it are
skew and this introduces distortion, the aberration
associated with the coordinates normal to the
symmetry axis. Distortion is what causes vertical lines to bulge outward (barrel distortion) or inward (pincushion distortion) at the image plane, as shown in Figure 15. A pinhole camera will have no distortion, since there are no skew rays, and a single thin lens with a small aperture will have very little. Placing a stop near the lens to reduce astigmatism and curvature of field introduces distortion because, as shown in Figure 16a, the rays for object points far from the axis are limited to off-center sections of the lens. The situation in this figure corresponds to barrel distortion; placing the stop on the other side of the lens produces pincushioning. Distortion can therefore be reduced by a design which consists of two identical groupings (Figure 16b), with the iris diaphragm placed between them; the distortion of the front half cancels that of the back half.

Figure 16 (a) Distortion-creating stop. (b) Symmetric-component lens that reduces distortion.

Figure 17 shows the output for a calculation using approximate formulas rather than exact ray tracing. Note that the designer has arranged for astigmatism to vanish at the lens margin. This example indicates the nature of astigmatism in a system for which the coma and the spherical aberration are low but not necessarily zero; it is the failure of the tangential and sagittal image planes to coincide.

Nonspherical Surfaces

So far, we have considered optical systems that use only spherical or plane reflecting and refracting surfaces. However, many astronomical telescopes use parabolic and hyperbolic surfaces. In recent years, nonspherical or aspheric glass lenses have become more common because they can be mass produced by computer-controlled machinery, and plastic lenses can be molded with very strict tolerances. The ray tracing procedures given above can easily be extended to the simplest kind of aspheric surfaces: the conic sections. Iterative procedures may be applied to surfaces more complicated than the conic sections. A well-known example is the curved mirror; a spherical reflecting surface will have all the aberrations mentioned here, but a parabolic mirror will bring parallel rays to a perfect focus at the paraxial focal point, thus eliminating the spherical aberration. Headlamp reflectors in automobiles are a well-known example. A plano-convex lens with this defect corrected will be one whose first surface is flat and whose second surface is a hyperbola with its eccentricity equal to the index of refraction of the glass composing the lens. One form of astronomical telescope, which has a large primary concave mirror to collect the light and reflect it back to a smaller secondary convex mirror, is the Ritchey–Chrétien design of 1924. Both mirrors are hyperbolic in shape, completely eliminating spherical aberration. This is the design of the Hubble Space Telescope. The 2.4-meter diameter primary mirror, formed by starting with a spherical mirror and converting it to a hyperbola by computer-controlled polishing, was originally fabricated incorrectly and caused spherical aberration. From a knowledge of the computer calculations, it was possible to design, build, and place in orbit a large correcting optical element which removed all the aberrations from the emerging image.

Conclusion

The three point aberrations and the two extended-object aberrations have been defined and illustrated. Simple ways of reducing them have been mentioned, but the best way of dealing with aberrations is in the process of lens design. In addition, there is a completely separate defect known as chromatic aberration. All the above calculations are based on ray tracing for a single wavelength (or color) of visible light. However, the index of refraction of a transparent material varies with the wavelength of the light passing through it, and a design good for one value of the wavelength will give an out-of-focus image for any other wavelength. It has been known for many years that an optical system composed of two different kinds of glass will provide a complete correction for two colors – usually red and blue – and other colors will be partially corrected. Such lenses are said to be achromatic. For highly demanding applications, such as faithful color reproductions
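The two-glass correction just described can be illustrated with the standard thin-lens achromat condition. This sketch is not from the article: it uses Abbe numbers V (a dispersion measure not introduced here), and the crown/flint values below are typical figures assumed for illustration:

```python
# Sketch: thin-lens achromatic doublet in contact. Two conditions:
# the powers add to the desired total, P1 + P2 = P, and the chromatic
# variation cancels, P1/V1 + P2/V2 = 0 (V = Abbe number, assumed here).

def achromat_powers(P, V_crown, V_flint):
    """Split a total power P between crown and flint elements."""
    P1 = P*V_crown/(V_crown - V_flint)
    P2 = -P*V_flint/(V_crown - V_flint)
    return P1, P2

P = 5.0                                   # desired total power, diopters
P1, P2 = achromat_powers(P, V_crown=60.0, V_flint=36.0)
assert abs(P1 + P2 - P) < 1e-12           # red and blue foci coincide at power P
assert abs(P1/60.0 + P2/36.0) < 1e-12     # chromatic change cancels
print(P1, P2)                             # → 12.5 -7.5
```

The positive crown element carries more power than the final lens, and the negative flint element removes the excess while cancelling the color spread, which is why cemented achromats pair a strong convex crown lens with a weaker concave flint one.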
See also
Geometrical Optics: Lenses and Mirrors; Prisms.
Prisms

A Nussbaum, University of Minnesota, Minneapolis, MN, USA

© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

Prisms are solid structures made from a transparent material (usually glass). Figure 1 shows the simplest type of prism – one that has many uses. The vertical sides are two identical rectangles at right angles to each other, and a larger third rectangle is at 45° to the other two. The top and bottom are 45°–45°–90° triangles. The most common applications of prisms are:

A light ray enters the prism at right angles to one of the smaller sides, is reflected at the larger side, and emerges at right angles to the other small side. The path shown is determined by Snell's law. This law has the form:

n sin θ = n′ sin θ′   [1]
For the triangle formed by the projected normals, which meet at the angle A:

r = r′ + A   [4]

Hence

δ = i + i′ − A   [5]

sin i = n sin r;   sin i′ = n sin r′   [7]

cos²r {1 − n² sin²(A − r)} = cos²(A − r) {1 − n² sin²r}   [11]

or:

cos²r = cos²(A − r)   [12]

and finally:

i = (δ + A)/2   [15]

and:

n = sin i / sin r = sin{(δ + A)/2} / sin(A/2)   [16]
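Eqn [16] can be exercised numerically. The round-trip check below, with an assumed 60° apex angle and index 1.5, is an illustrative sketch rather than a calculation from the article:

```python
# Sketch: minimum-deviation refractometry. At minimum deviation i = i',
# so delta = 2*asin(n*sin(A/2)) - A; eqn [16] then inverts a measured
# delta back to the index. The prism data (A = 60 deg, n = 1.5) are assumed.
import math

A = math.radians(60.0)                    # apex angle
n_true = 1.5
delta = 2.0*math.asin(n_true*math.sin(A/2.0)) - A   # angle of minimum deviation
n_measured = math.sin((delta + A)/2.0)/math.sin(A/2.0)  # eqn [16]
assert abs(n_measured - n_true) < 1e-12
print(round(math.degrees(delta), 2))      # → 37.18
```

Because δ at minimum deviation depends only on n and A, measuring it with a spectrometer table is a classical high-precision way of determining the refractive index of the prism glass.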
Contents
Art Holography
High-Resolution Holographic Imaging and Subsea Holography
Holographic Recording Materials and Their Processing
who have concentrated on holography as their medium of choice.

Some artists build their own optical studio in which to experiment in a hands-on environment, some collaborate with scientific and research establishments, where the work, facilities, and expertise are shared, and others design and supervise the production of their work, which is produced for them by a commercial company. All of them produce a unique contribution to the visual arts.

Early Exploration

Holography offers a unique method of recording and reproducing three dimensions on a flat surface without the need for special glasses (for stereoscopic viewing), lenses (such as lenticular viewing), or linear perspective which, since the Renaissance, has been our main method of tricking the eye into believing that an image recedes into the distance and displays three dimensions.

Many of the very early works produced by artists concentrated on the obvious still life nature of the medium and explored the unusual phenomenon of objects ‘frozen’ in holographic space. Almost all of these early works were recorded using continuous wave helium neon lasers, either in scientific/research facilities, which artists had arranged to use, or in collaboration with established scientists and optical imaging researchers.

As early as 1969, British artist Margaret Benyon was experimenting with this new medium. Her laser transmission hologram ‘Picasso’ is a recording of images taken from Pablo Picasso’s painting ‘Demoiselles D’Avignon’. Here Benyon took a copy of the painting, cut out the characters, placed them in a three-dimensional spatial arrangement with real objects, and then made a hologram of this ‘scene’. Picasso attempted to show, through many of his paintings, multiple views of the same subject on a two-dimensional surface, one of the main thrusts of Cubism in the early twentieth century. Benyon achieved this in ‘real space’ by recording a hologram of the flat reproductions. By doing so, she demonstrated that holography automatically achieved what Picasso, and many other key contemporary visual artists, had been trying to do for years.

Holography demands absolute stability of the subject: if an object moves during the exposure, no recording of the object will take place. Unlike photography, where a moving object simply becomes blurred, in holography it will not be recorded at all. Clearly, if no recording can be made, optical researchers working with holography, while noting the phenomenon, would not necessarily feel the need to make holograms which would fail. Several artists, on the other hand, not confined by research deadlines, have acknowledged this strange phenomenon and exploited this limitation of the holographic process. One of the very early examples is ‘Hot Air’ (Figure 1), a laser transmission hologram by Margaret Benyon, now in the Australian National Gallery Collection. Produced in 1970, it shows a three-dimensional composition of a jug, cup, and the artist’s hand. The jug and cup were filled with hot water and, during the recording of the holographic plate with a continuous wave laser, the heat from the hot water disturbed the air above these objects. This tiny amount of movement resulted in the area above the jug and cup not being recorded. Similarly, the artist’s hand moved slightly during the exposure and therefore did not record. Strangely, when viewing the resulting laser transmission hologram, the hand is visible. What is displayed is a holographic hand-shaped black hole, the three-dimensional recording of the space once occupied by the hand; the dimensional recording of the absence of an object.
Artists in display holography place continual demands on the technology. Their natural tendency is to break rules. I believe that most successful holographic artists belong to this ‘suck it and see’ category of experimentalists. On hearing about the ‘rules’ of holography, they break them, just to see what happens. (Margaret Benyon, PhD Thesis, 1994).

Benyon started by breaking the rules and has continued to develop from these early pioneering laser transmission works, which helped establish a visual ‘vocabulary’ including many important social and political issues.

Other artists have exploited the ‘limitations’ of the holographic process to produce visually stunning works which question how we look at the three-dimensional world and question what is ‘real’. Rick Silberman (USA) developed a shadowgram technique where the hologram is made of a laser-illuminated ground glass screen. Objects are placed in front of this luminous surface and their three-dimensional silhouette is recorded. The viewer of the hologram is looking at the space where the object used to be. A pivotal work in this field is ‘The Meeting’ 1979 (Figure 2), a white light reflection hologram which had been recorded in such a way as to cause the three-dimensional shadow of a wine glass to protrude out of the holographic plate and be viewable several centimeters in front of it, between the observer and the holographic emulsion. The actual wine glass used in the recording has been broken, and what remains (the base, stem, and small part of the bowl of the glass) is placed on a platform attached to the holographic plate. When viewing the work head on, the missing parts of the broken glass are recreated, in three dimensions, by the holographic shadow. When viewed obliquely, from the side for example, only the broken (real) glass is visible. Not only does this piece have an element of performance in the recording, breaking, mounting, and reconstruction of the wine glass, it also explores questions of visual perception and concepts of reality.

The recording of the shadow of an object, or objects, has now become commonplace in creative holography, with many artists employing techniques to mix different objects together, change their color, and rearrange them in space. One advantage of this technique is that dependence on absolute stability of the objects used is reduced. The objects are ‘encouraged’ to move during the recording process and so create ‘negative’ images. Steve Weinstock (USA), Marie-Andrée Cossette (Canada), and Michael Wenyon and Susan Gamble (USA/UK) have all contributed to this area in some of their early works.

Public Display
As the technology and optical processes of holo-
graphy have developed, so greater flexibility, and
public accessibility, have emerged. The single most
important aspect has been the ability to display
holograms in white light.
Emmett Leith and Juris Upatnieks invented laser
transmission holography in 1964. They demonstrated
the practicalities of Gabor’s original invention, by
utilizing laser light, to produce three-dimensional
images. To display these recorded images, laser light
needed to be shone through the holographic plate, at
the original angle used by the recording reference
beam (off-axis). The results were breathtaking, with
full visual parallax in all directions. Although it was
the laser light which made the recording and
reconstruction of these unusual images possible, the
laser was also a limiting factor. Each time a hologram
was displayed, a laser was required. This relegated
the technique to displays in scientific conferences,
laboratories, or other locations with optical equip-
ment. What was needed was a technique allowing
holograms to be displayed without the need for this
expensive and dangerous laser light. White light
holography was the answer.

Figure 2 ‘The Meeting’, Rick Silberman 1979. White light reflection hologram (shadowgram). Copyright © 1980 ATP. Reproduced with permission.

Although a laser is still required to record the original hologram, by using a copying technique, a
restricted aperture, or one-step process, the resulting holograms can be reconstructed using a small source of white light such as a spotlight or even direct sunlight. As expensive lasers were no longer needed to display holograms, large public exhibitions could be mounted. Gallery and museum spaces could include holography in their exhibitions, and individuals could buy and display holography in domestic locations. This technique also benefited the commercial display market, where holograms could now be used for advertising and promotional events.

Stephen Benton (USA) (1942–2003), inventor of white light transmission (rainbow) holography in 1968, made a major contribution to the use of the medium by artists. His technique not only allowed holograms to be viewed in white light, but they were extremely bright and offered the possibility of creating multicolor images. His ‘Crystal Beginnings’ 1977 (Figure 3) shows just how much space can be created with very simple points of light. Benton even taught master classes for artists so that he could explain how his technique functioned and explain the mathematics involved. He also worked with several artists to help them realize their individual projects.

Harriet Casdin-Silver (USA) was one of the first pioneering artists to work with Benton and produced several key works now seen as integral to the art of holography. Their early collaboration ‘Cobweb Space’, 1972, sculpted pure light and was one of the very first art holograms to be displayed using Benton’s rainbow hologram technique. A later work, ‘Equivocal Forks’, 1977, exists in different formats: as a laser transmission hologram showing the pseudoscopic display of domestic forks which protrude, inside out, from the holographic plate, and as a number of white light transmission holograms of the same image, which were shown in an outdoor installation mounted by The Center for Advanced Visual Studies, part of the Massachusetts Institute of Technology. This demonstrated to a wide, nonscientific audience that holograms could be accessible, easily illuminated, and weatherproof. Casdin-Silver has continued to produce art holography and has, most recently, worked with large-scale portraiture.

Although Benton’s process increased the accessibility of holography, an earlier technique, invented by Yuri Denisyuk (Russia) in 1962, allowed holograms to be recorded in a single step and then displayed using white light. The political situation between the USSR and the rest of the world during the early 1960s meant that information and results from much of this early work were not openly accessible, and communication between Denisyuk and other key researchers in holography was limited. However, as soon as details of the Denisyuk technique became more widely available in the 1970s, artists began to embrace it as a practical alternative.

The work of Nicholas Phillips (UK), whose
research improved chemical processing of display
holograms during the late 1970s, helped artists and
the rest of the holographic field to achieve much
better results when making reflection holograms, a
technique favored in Europe. He was also heavily
involved with the rock band, ‘The Who’, and their
Holoco company which mounted two very successful
exhibitions of display holography at the Royal
Academy of Art, London, at this time. Artists who
might not have come across the holographic
phenomenon became aware of its potential due to
its presence in one of the UK’s more traditional art
venues. Other large exhibitions in the late 1970s
and early 1980s in USA, Sweden, and Japan
also highlighted the immense possibilities holo-
graphy offered to a nonscientific community of
gallery visitors.
As interest in all aspects of holography increased,
dedicated spaces opened to sell, display, and archive
the output from artists, scientists, and commercial
producers. One of the most influential of these was
the Museum of Holography, located in the SoHo art district of Lower Manhattan, New York (Figure 4). From 1976 through to 1992, it regularly mounted group and solo exhibitions by artists, giving many of

Figure 3 ‘Crystal Beginnings’, Stephen Benton 1977. White light transmission (rainbow) hologram. Jonathan Ross Collection, London, UK. One of the best known works by the inventor of the rainbow hologram. Copyright © 2004 J. Ross. Reproduced with permission.
So much interest was generated in pulsed holography that several studios emerged for use by artists, including independent facilities and those within educational establishments. No longer was this the domain of the well-funded scientific research facility. Several of these ‘studios’ still exist and are offering artists access to equipment through specialized artist-in-residence programs.

Abstraction

Although holography is an exceptional process for recording every detail of an object and the volume it occupies, many artists have wanted to work with the abstract and kinetic possibilities the process offers. Light can be recorded and displayed in three dimensions; it can be animated, inverted, compressed, sculpted, and projected. Douglas Tyler (USA), in his ‘Dream Point Series’ 1983 (Figure 9), recorded small laser transmission holograms but displayed them using normal white light, producing a smearing effect of the three-dimensional object. He then mounted these onto transparent surfaces (Plexiglas) and drew on the surface of the plastic to incorporate etched lines and shapes. Rudie Berkhout (Netherlands/USA) began his holographic exploration with stunning abstract landscapes made of luminous, multicolor, animated objects which took advantage of his mastery of the white light transmission technique. Most recently he has been recording reflection holograms which display undulating colored ‘waves’ of light as an observer moves past the hologram. These have been included in specially produced paintings for large-scale architectural commissions. Sam Moree (USA) combines multiple images, exposures, diffraction gratings, and smaller holograms into a single work. The finished holograms display a complex series of image fragments and colors within the depth of the holographic space he has constructed. Ruben Nuñez (Venezuela) originally approached holography from a background in glass production and used the ability of the holographic process to ‘capture’ light reflecting and passing through his glass objects. The resulting white light transmission holograms display an abstract ‘photonic’ display of form and color. Fred Unterseher (USA) uses holography to record optical distortions and diffraction gratings which display abstract patterns of light. Many of these early works were mandala images which displayed an intense kinetic effect as an observer walked past the surface of the holographic plate.

Pseudo Holograms

In 1973, Lloyd Cross invented the multiplex (integral) hologram technique, which used sequential two-dimensional images recorded on movie film combined with white light (rainbow) hologram techniques. The result was a holographic stereogram, which gave viewers the illusion of seeing animated images in three dimensions. The final holographic sheet was displayed either curved or as a complete cylinder, which was often motorized so that viewers could stand in front of the unit and watch the animation take place as the hologram rotated. Such a display does give the impression that an observer can walk all the way round the image, enhancing its perceived three-dimensional effect.

As the original images are recorded photographically, there are several advantages – anything which can be recorded on inexpensive black and white
movie film can be converted into a holographic
display. Short animated sequences can be presented
while accepted film techniques such as zoom, pan,
tilt, and transitions can be incorporated into the
finished sequence of images. Although very popular
for advertizing and promotional campaigns during
the 1980s, particularly in the USA, several artists used
this technique. They could make the movie film
sequences themselves and then send this film to a
commercial producer for a hologram to be made for
them. This offered an interesting balance in the
creative process. The artist could concentrate on the
filmic image, using facilities and photographic tech-
niques they were familiar with and not have to learn
the holographic process or build a holographic
Figure 9 ‘Dream Passages’: 3200 £ 13800 (3 panels 3200 £ 4600 ).
Douglas E. Tyler 1983, Laser transmission holograms, tape and laboratory themselves. Artists such as Anait (USA)
Plexiglas, illuminated with white light. Detail of center panel. produced several experiments in the area, one of
Copyright q 1983 D.E. Tyler. Reproduced with permission. which showed her inside the holographic display,
HOLOGRAPHY, APPLICATIONS / Art Holography 33
sensitive emulsion is coated. These tend to be in the region of 20 × 23 cm, 30 × 40 cm, or meter-square formats. Generally, the larger the finished hologram, the larger the laboratory needed to produce it. This impacts a great deal on artists who do not have the facilities, or finances, to produce ‘large’ holograms. A solution has been to combine smaller holographic plates together, as a mosaic, to give the size and scale needed. In recent years, several artists have completed public commissions for large-scale interior and exterior holographic installations.

Setsuko Ishii (Japan) has produced several installations in public spaces where individual holographic plates are mounted together to form a larger composite structure. Many of these pieces use holograms recorded on dichromate gelatin plates, which produce an exceptionally bright image, ideal for public spaces where visibility from multiple angles is important.

Michael Bleyenberg (Germany) produced an entire exterior holographic wall (13 × 5 meters) during 2002 (Figure 12), for the extension of a research building in Bonn. The installed graphic wall, ‘EyeFire’ (AugenFeuer), is made up of 26 smaller panels, each constructed from tiny holographic ‘pixels’ (holographic optical elements, HOEs) to give an intense color effect. Working in collaboration with the Institute for Light and Building Technology, Cologne, Bleyenberg was able to design his graphic panels on a computer screen and have this information translated into the thousands of tiny holographic pixels which, when seen together, make up the complex and high-intensity colour design.

Dieter Jung (Germany) has produced commissioned art works for large interior architectural spaces. In ‘Perpetuum Mobile’ 2001, located at the European Patent Office, The Hague, holograms are placed on the floor, which can be walked over, and an extensive holographic mobile structure hangs overhead. The light and color from the abstract holograms in the mobile sections can be reflected in the abstract holographic floor and combine to produce an interior lightscape impossible to produce with more traditional media.

Holograms can be mass produced using a mechanical embossing process which transfers the optical contents of the original master hologram onto plastic sheeting for duplication in large runs. One novel use of this material was the interior of the Canadian pavilion at Expo ‘92 (Figure 13). Melissa Crenshaw (Canada/USA) used huge quantities of embossed holographic material to cover the 3,000 square foot interior walls of the specially built pavilion. Lighting was installed to illuminate this optical diffracting wall and the entire structure was covered in glass, over which water flowed. The resulting interior appeared to be a wall of liquid light which changed color and hue depending on where it was viewed from.

Sally Weber (USA) used holographic embossed material for several outdoor commissions. Structures were clad in embossed holographic sheets and made water resistant. Sunlight illuminated the installations to produce surfaces of constantly changing color. In direct (bright) sunlight, intense rainbow colors would be diffracted by the holographic surfaces. On an overcast day, pastel, more muted colors would appear. Although she is often known for these large-scale outdoor embossed works, one of her permanent commissions is housed in the Karl Ernst Osthaus-Museum, Hagen, Germany, where specially made holographic diffraction gratings are incorporated into an existing building. ‘Signature of Source’ 1997 (Figure 14) is a permanent, site-specific installation producing a line of light which enters the museum space from the skylight it is built into. This abstract line of pure color becomes a sculptural element within the space of the museum and is generated by the action of light from outside the building being manipulated and ‘formed’ as it passes through the holographic window.

One of the most celebrated and very early attempts to incorporate large-scale holograms into art works

Twenty-First Century Art

Art holography has developed and matured over the past 50 years to become a viable and significant element in the contemporary visual arts. Critics are still asking ‘is it art?’ in the same way they used to worry about whether photography was art. While they consider this problem, practitioners in the medium are continuing to find ways to push past its technical and visual boundaries (Figure 16). They continue to collaborate with the scientific and research world and use holography as a medium to comment on life and society in the early twenty-first century.

…I hope, that one day I will be able to give you, artists, something really powerful; panoramic holograms, extending to infinity, in natural colors. Dennis Gabor, inventor of holography, in a letter to artist Margaret Benyon, 28th February 1973.

Figure 16 ‘Holography as Art’, Margaret Benyon 1980. Manifesto. Copyright © 1980 M. Benyon. Reproduced with permission.

See also

Coherence: Coherence and Imaging. Diffraction: Diffraction Gratings. Holography, Techniques: Overview. Imaging: Multiplex Imaging. Interferometry: White Light Interferometry. Introductory Article: Early Development of the He-Ne Laser (Title TBC).

Further Reading

Benyon M (1994) How is Holography Art? PhD thesis, Royal College of Art, London.

Brill LM (ed.) (1989) Holography as an art medium. Leonardo, Journal of the International Society of Arts, Sciences and Technology 22 (3 and 4).
HOLOGRAPHY, APPLICATIONS / High-Resolution Holographic Imaging and Subsea Holography 37
Coyle R and Hayward P (1995) Apparition, Holographic Art in Australia. University of Sydney, NSW, Australia: Power Publications.

Casdin-Silver H (1998) The Art of Holography, Exhibition Catalogue. Lincoln, MA, USA: DeCordova Museum and Sculpture Park.

Holographische Visionen, Bilder durch Licht zum Leben erweckt (1991) Exhibition Catalogue, Museum für Holographie und neue Visuelle Medien, Pulheim, Germany: Edition Braus.

Holosphere. Formerly published by the Museum of Holography, New York, USA; archived at the MIT Museum, Boston.

Jackson R and Burns J (1976) Holografi: Det 3-Dimensionella Mediet, Exhibition Catalogue. Museum of Holography, New York: Stockholm Cultural Center.

Jung D (ed.) (2003) Holographic Network. Bramsach, Germany: Rasch Verlag.

Orazem V and Schepers J (1992) Olografia, Holografie, Holography, Exhibition Catalogue. Rocca Paolina, Italy: Centro Espositivo.

Lipp A and Zec P (1985) Mehr Licht/More Light, Exhibition Catalogue. Hamburger Kunsthalle: Kabel-Verlag.

Pepper AT (ed.) (1992) The Creative Holography Index: The International Catalogue for Holography. Germany: Monand Press.

Pepper AT (ed.) (1996) Art in Holography 2, International Symposium on Art in Holography. Nottingham, UK: University of Nottingham.

Benyon M (1989) Phases, Exhibition Catalogue. New York, USA: Museum of Holography.

Tyler DE (ed.) (1990) Report, International Congress on Art in Holography. Notre Dame, Indiana, USA: Saint Mary’s College.

Tyler DE (ed.) (2003) Leading Lights, Women in Holography, Exhibition Catalogue. Notre Dame, Indiana, USA: Saint Mary’s College.
sample volume towards the holographic plate and records the optical interference between light diffracted by the object and the undeviated portion of the illuminating beam. The subject volume is illuminated in transmission and the scene needs an overall transparency of about 80% so that speckle noise does not seriously degrade image quality. Particles down to around 5 μm dimension can be readily recorded and identified. The particular benefits of ILH are its geometric simplicity, minimization of laser coherence and energy requirements, and high resolution over a large depth of field.

For recording an off-axis reference beam hologram (OAH), a two-beam geometry is utilized: one beam illuminates the scene and the other directly illuminates the holographic film at an oblique incidence angle (Figure 2). Interference occurs between diffuse light reflected from the scene and the angularly separated reference beam. On replay, the real and virtual images are similarly angularly separated, which makes their interrogation easier. OAH is primarily applied to opaque subjects of large volume. Although there is no real upper limit to the size of particles that can be recorded (this is determined by available energy and coherence of the laser), the practical lower limit to the resolution that can be achieved is around 50 to 100 μm. The scene can be front, side, or back illuminated (or all three combined), allowing high concentrations of large particles to be recorded, provided that they reflect sufficient light to the film, and allowing subject visibility to be optimized. OAH provides a more complete record than ILH, in the sense that it gives information on the surface
HOLOGRAPHY, APPLICATIONS / High-Resolution Holographic Imaging and Subsea Holography 39
structure of the object, while ILH records only a silhouette.

Holograms are replayed or reconstructed in a variety of ways, depending on the recording parameters and subsequent processing of the recording medium. In scientific and engineering applications the requirements of high resolution and image fidelity demand that they are replayed with a laser, and viewed with some auxiliary optical system and (usually) a video camera. Depending on the orientation of the hologram, either the virtual image, which is apparently located behind the hologram and seen through it, or the real image, which is formed in space in front of the hologram, will be visible.

Considering first the off-axis hologram in the virtual image mode, the processed hologram is replaced in the position in which it was recorded and illuminated with an exact duplicate of the original reference wave, in terms of its wavelength, curvature, and beam angle. In this mode, which might be termed ‘normal’ viewing of an off-axis hologram, a virtual image (Figure 3) is observed as if viewing the scene through a window. Other than the colour, this image is almost optically indistinguishable from the original scene; it retains features such as parallax and three-dimensionality and does not suffer from perspective distortion or defocusing of the image.

For data extraction and measurement from a hologram, creation of the virtual image is not the best method of observation. Here the projected real image (Figure 4) is most useful. If we illuminate an off-axis hologram from behind, with a wave which is the exact conjugate of the original (i.e., one which possesses the same wavelength as the original but with the opposite direction and curvature), we generate an image that appears to float in the real space in front of the observer. Often a collimated reference beam is used in recording, since reconstruction is then a simple case of turning the hologram around. No lens is needed to form this image, and it is optically identical to the original wave except that it is reversed left-to-right and back-to-front (pseudoscopic). Planar sections of the real image can be optically interrogated by directly projecting onto a video camera (often with the lens removed) which can be moved throughout the 3D volume. It is this ‘optical sectioning’ feature of a holographic recording that is so useful when nondestructive (noninvasive) evaluation must be made, at high resolution, of collections of objects which extend over a considerable depth.

A series of images taken from the real image of one hologram and captured on a video camera (Figure 5) illustrates how the viewing plane can be moved sequentially through the reconstruction to show firstly one marine organism and then another. The axial separation of the organisms is about 35 mm.

To replay an in-line hologram, it is also placed in a conjugate of the recording beam (Figure 6). The replayed hologram simultaneously forms two images, one virtual, the other real, which are located on the optic axis on opposite sides of the holographic plate. These images are co-axial, and located (for the typical case of collimated recording and replay beams) at equal distances in front of and behind the hologram plane. Although a true 3D real image is produced, it has almost no parallax, and cannot easily be seen by the unaided eye. This image has inverted depth perspective and is seen in darkfield against the out-of-focus virtual image.

Figure 5 A series of images taken from a single hologram showing the concept of moving the viewing plane through the real image.
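The ‘optical sectioning’ described above has a direct numerical analogue when holograms are recorded or digitized electronically: instead of moving a camera through the projected real image, the recorded intensity pattern is back-propagated in software to a stack of candidate planes. The sketch below is purely illustrative and is not taken from the text — the grid size, pixel pitch, wavelength, and the synthetic opaque-disc object are all assumptions — but it shows the principle using the standard angular-spectrum propagation method.

```python
import numpy as np

def angular_spectrum_propagate(field, wavelength, dx, z):
    """Propagate a complex field a distance z using the angular-spectrum method."""
    n = field.shape[0]
    fx = np.fft.fftfreq(n, d=dx)               # spatial frequencies (1/m)
    FX, FY = np.meshgrid(fx, fx)
    # At this pixel pitch every sampled frequency is a propagating wave,
    # but clip anyway so evanescent components cannot produce NaNs.
    arg = np.maximum(0.0, 1.0 / wavelength**2 - FX**2 - FY**2)
    transfer = np.exp(2j * np.pi * z * np.sqrt(arg))
    return np.fft.ifft2(np.fft.fft2(field) * transfer)

# Illustrative numbers only: 532 nm light, 10 um pixels, 256 x 256 grid.
wavelength, dx, n = 532e-9, 10e-6, 256
x = (np.arange(n) - n // 2) * dx
X, Y = np.meshgrid(x, x)

# Synthetic in-line geometry: a plane wave minus a 100 um opaque disc,
# recorded (as intensity only) after propagating 50 mm to the 'film'.
obj = 1.0 - (X**2 + Y**2 < (50e-6) ** 2)
holo = np.abs(angular_spectrum_propagate(obj, wavelength, dx, 50e-3)) ** 2

# Numerical sectioning: back-propagate the hologram to several planes;
# the disc reappears sharply only near the true recording distance.
sections = {z: angular_spectrum_propagate(np.sqrt(holo), wavelength, dx, -z)
            for z in (30e-3, 50e-3, 70e-3)}
```

In practice a focus metric (edge sharpness, amplitude variance, and so on) is evaluated on each section to locate objects in depth, which is the digital counterpart of stepping the video camera through the 3D volume; the out-of-focus twin image described in the text appears here too, as a defocused halo around the refocused disc.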
precisely, although in practice an argon laser provides a useful selection of wavelengths. A mismatch between record and replay wavelengths will lead to distortion in the image, because longitudinal and lateral magnifications will no longer have the same value.

In contrast to off-axis holograms, the interference patterns on an in-line hologram are localized about the geometric shadow of the objects. The lack of a spatially separate reference beam leads to a requirement that there be a large fraction (>80%) of the beam unobscured. This limits the size of particles that can be recorded and sets an upper limit on the recordable concentration of particles (around 40 particles cm⁻³ for 20 μm diameter particles and a recorded depth of 1 m). There is also a need to balance the size of the particles with the object-to-hologram distance. The range of object size recordable on an in-line hologram has an upper limit set by the requirement to record in the far-field of the object, so that the conditions of Fraunhofer recording are satisfied:

z > d²/λ  [2]

where z is the object-to-film distance and d is the maximum dimension of the object to be recorded. Following this criterion, the upper limit for good size measurement is about 2 to 3 mm particle size over a recording distance of a meter.

In-line holograms are relatively easy to replay, as the only requirement is to place the hologram normal to the replay beam. Unlike off-axis holograms, which are extremely sensitive to misalignment, the in-line geometry minimizes the effects of film rotation about its axes. A collimated recording beam is by far the easiest to duplicate upon replay. Small deviations from collimation can significantly affect conjugacy and hence resolution, introducing aberrations (mainly spherical) and a depth-dependent magnification (e.g., a cuboid replays as a trapezoidal prism).

In general, the interference field is captured on photographic emulsions; these may be laid on acetate or polyester film bases or on glass plates. If film is used, it is essential that it be held as flat as possible between thin flat glass plates or under vacuum, and should not be stretched or put under any strain during exposure. An in-line hologram does not require an extremely high-resolution recording medium. The fringe spacing is not as fine as the typical carrier fringe frequency used in off-axis holograms, and holograms can be made on conventional slow photographic emulsions. In practice, however, ultra-fine grained holographic film is almost always used, since the intensity of the forward scattered light (a function of grain size) is then reduced: this is important in increasing the signal-to-noise ratio, as the replayed image is viewed against a background of the attenuated replay beam. Since the proportion of the illuminating beam that goes into the images is very low, this lack of angular separation requires that the average optical density of the hologram is high. The high optical density attenuates the undiffracted light, thus the image appears bright against a dark background, resulting in contrast reversal.

The exposed film is chemically processed to render the interference field permanent. This process is a crucial step in the holographic procedure, and especially critical for good off-axis holograms. Some of the factors which have to be considered include image brightness, image resolution, reconstruction wavelength, emulsion shrinkage, humidity, and noise level. It has been well established that for bright off-axis holograms on silver halide film, bleaching of the emulsion, with its consequent conversion into a phase hologram, is essential to maximize diffraction efficiency. Some forms of bleaching produce a nonuniform shrinkage of the emulsion which will give rise to astigmatism in the reconstructed image. Bleaching significantly increases the background noise levels of the hologram and introduces a high degree of variability into the result. In practice, the best repeatability will be obtained with amplitude processing if the low brightness can be tolerated.

Underwater Holography

When holography is extended to the inspection of underwater scenes, the hologram is recorded in water but the image is replayed, in the laboratory, in air. Because of the refractive index difference between recording and replay spaces, optical aberrations will be introduced into the reconstructed image, and will seriously impair the potential for precision measurement, unless some method of aberration reduction is incorporated. In ILH, only spherical aberration is significant, since both reference and object beam angles are normal to the recording plane. However, in OAH the dominant aberrations are astigmatism and coma, which increase with the field angle of the subject in the reconstructed image. These limit resolution and introduce uncertainty in co-ordinate location, since the focal ‘point’ is now distributed over a finite region of image space. A key point here is an appreciation that the additional aberrations introduced are entirely due to the refractive index mismatch between object and image spaces and are unconnected with the holographic process itself: they are in essence the same distortions seen when looking into a fish-tank. Furthermore, the water
itself may be expected to influence the quality of the images produced. An increase in the overall turbidity of the water will adversely affect both in-line and off-axis techniques and would be expected to create a background noise that will reduce image fidelity.

In most underwater applications the objects (or the camera) are in motion during the exposure. The effect of this motion is to blur out the finer fringes, and thus reduce resolution and contrast. In-plane motion is the most severe; adopting the experimentally verified criterion that the maximum allowable motion is less than one-tenth of the minimum required fringe spacing of the smallest object, for in-line holograms the maximum object motion must be less than one-tenth of the object’s dimension. For particles of 10 μm dimension, and a typical Q-switched YAG laser pulse duration of 20 ns, a high transverse velocity of up to 50 m s⁻¹ can be tolerated. Off-axis holograms are more demanding in their requirements and the maximum allowed velocity is reduced to about 3 m s⁻¹. This is, however, more than adequate for most field applications of the technique.

Techniques such as immersing the holographic plate in water during recording, or direct replay back into water, have been used to minimize underwater aberrations, but such methods are generally impractical in the field. One practical solution, unique to holography, compensates for the change of effective wavelength of light as it passes through water by allowing a deliberate mismatch of recording and replay reference beams to balance the refractive index mismatch.

For holograms recorded and replayed in air, a usual prerequisite is that the reconstruction wavelength should remain unchanged from that used in recording. By noting the dependence of wavelength on the refractive index, we can apply the more general condition that it is the ratio λ/n that must remain
constant. This relationship suggests that a hologram, immersed in water and recorded at a wavelength λa, will produce an aberration-free image in air when replayed at a reconstruction wavelength, λc, which is itself equivalent to the wavelength of light when passing through water, λw, i.e.:

λc = λw = λa na/nw  [3]

where na and nw are the respective refractive indices of air and water. Thus, in principle, the aberrations may be completely eliminated by the introduction of an appropriate wavelength change between recording and replay. If a green (532 nm) laser is used in recording, the ideal replay wavelength is about 400 nm (i.e., 532 nm/1.33). However, no convenient laser lines exist at exactly this wavelength. Furthermore, complete correction assumes that the entire recording system be located in water. Since this is both impractical and undesirable, holograms are usually recorded with the holographic film in air behind a planar glass window. The additional glass and air paths affect the compensation of aberrations. However, third-order aberration theory shows that if the window-to-air path length ratio is appropriately chosen for a specific replay wavelength, then aberration balancing occurs and residual aberrations are reduced to a minimum over a wide range of field angles and object locations. For an air gap of 120 mm and a BK7 window of 30 mm thickness, a blue replay wavelength of 442 nm (HeCd laser) achieves a good performance over a full field angle of about 40°.

Underwater Holographic Cameras

The first-known use of holography for recording living marine plankton was by Knox in 1966, who recorded in-line holograms (ILH), using a short coherence length ruby laser, of a variety of living marine plankton species in a tank. However, to avoid problems with refractive index and wavelength changes, the holographic plate was immersed directly in water, which is completely impractical for field use. It was not until the late 1970s that quantitative data from replayed holograms of marine particles recorded in situ were obtained by Carder and his team. Small volumes (10 or 20 cm³) were recorded using ILH and a low-power HeNe laser, which limited the technique to slowly moving objects. Later, Heflinger used the off-axis geometry with both pulsed argon-ion (514 nm) and pulsed xenon (535 nm) lasers to record plankton on 35 mm film. In order to compensate for the aberrations introduced by replaying the real image in air, the real image was projected back into the water tank using a conjugate beam from a continuous-wave argon laser. The sides of their tank were flexible so that the real image could be placed close to the exit window and viewed via an external microscope.

Knox later developed a ship-borne in-line holographic camera which was deployed from the RV Ellen B Scripps. A pulsed xenon laser was used to record water volumes of up to 10⁵ cm³ to a water depth of 100 m.

It was not until the late 1990s that significant use of underwater holography for plankton recording came to the fore again. Joseph Katz at Johns Hopkins University, Baltimore, USA, developed a camera based
around a ruby laser to record in-line holograms on rolls of film, in a 63 mm diameter sample volume that can be varied from 100 to 680 mm in length. The camera housing utilizes a two-tube design, with one tube holding the laser and another containing the film holder and power units. The recording volume is situated 890 mm above the camera to minimize disturbance to flow. The camera is designed as a ‘Lagrangian drifter’ to move with the water current and was successfully deployed in Chesapeake Bay, Baltimore, and Solomon’s Island, Maryland. Holograms were recorded of various species of phytoplankton and zooplankton and represented a significant advance in the deployment of sub-sea holography.

From the early 1980s, a team led by Aberdeen University, Scotland, investigated the use of underwater holography for offshore applications. Primary interest was in inspection of subsea pipelines for corrosion and damage, and much of the work concentrated on an analysis of the optical aberrations introduced by recording in water and replaying in air, and methods of compensating for such aberrations. However, this work was later adapted for holography of plankton and led to the consequent development of the HoloMar system. The HoloMar camera was the first to incorporate simultaneous ILH and OAH recording.

More recent developments have utilized digital or electronic recording of holograms in the in-line mode, and it is to be expected that much more use will be made of ‘e-holography’ in the future.

For field applications of holography, it is essential to record the holograms using a pulsed laser in order to create stable interference fringes. For subsea purposes, a laser with a wavelength in the blue-green region of the spectrum will match the oft-quoted peak transmission window of seawater. In practice, a frequency-doubled Nd-YAG laser with a wavelength of 532 nm is a good choice, although ruby lasers operating at 694 nm are also acceptable if transmission distance is not an issue. Usually OAH requires more energy and places stricter constraints on coherence properties than ILH. Beam quality is governed by the quality of the optical components in the beam path and also by the output beam of the laser itself. Because off-axis holography uses diffuse front/side illumination, energies of up to several hundred millijoules are often required for large (up to a meter in diameter) poorly reflective objects, and coherence lengths of greater than a meter are needed to record large scene depths. To obtain a long coherence length, and thus enable large-volume objects to be recorded, it is necessary that the laser operate in a single longitudinal cavity mode so that the bandwidth of the laser line is kept small. To enable smooth, even illumination of both subject and holographic emulsion, good spatial beam quality is needed. This usually dictates laser output beams with either a Gaussian (TEM₀₀) mode profile or a flat profile, and spatial filtering during beam expansion. Compared to off-axis holography, the demands placed upon lasers for ILH are minimal. The only essential requirement is for a good spatial mode structure. Energy requirements are moderate: as little as 10 mJ is enough, since little light is lost during the passage of the beam through the recording volume. The laser coherence length can be quite short, and only depends on the

Figure 7 The HoloMar camera configuration showing the separate in-line and off-axis beam paths.
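The pulsed-laser requirements discussed above reduce to simple arithmetic that is easy to sanity-check. The sketch below is illustrative only: the helper names are my own, the 100 MHz single-mode linewidth is an assumed figure (not from the text), while the 10 μm particle size, 20 ns pulse duration, and one-tenth-of-a-fringe motion criterion are the values quoted earlier in the article.

```python
# Back-of-envelope checks for pulsed-laser holographic recording.
# Function names and the example linewidth are illustrative assumptions.
C = 3.0e8  # speed of light in vacuum (m/s)

def coherence_length(bandwidth_hz):
    """Approximate coherence length L ~ c / delta-nu for a laser line."""
    return C / bandwidth_hz

def max_velocity(feature_size_m, pulse_s, fraction=0.1):
    """Largest in-plane speed whose smear during one pulse stays below
    `fraction` of the smallest resolved feature (criterion in the text)."""
    return fraction * feature_size_m / pulse_s

# An assumed single-longitudinal-mode linewidth of 100 MHz gives a
# coherence length of a few meters, consistent with the >2 m quoted
# for the HoloMar laser.
print(coherence_length(100e6))     # -> 3.0 (meters)

# In-line recording of 10 um particles with a 20 ns Q-switched pulse:
print(max_velocity(10e-6, 20e-9))  # ~ 50 m/s, the tolerance given earlier
```

The same one-line estimate with the off-axis carrier-fringe spacing in place of the particle size reproduces the much tighter off-axis velocity limit the article mentions, which is why OAH places stricter demands on the laser than ILH.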
effective hologram area; thus lasers operating on more than one longitudinal mode are usually quite suitable.

The HoloMar System

The HoloMar system (shown diagrammatically in Figure 7) was developed to record simultaneous in-line and off-axis holograms of partially overlapping, but orthogonal, views of the water column. Although the recording views are orthogonal to each other, the arrangement provides an element of cross-correlation between holograms. It is designed around a Q-switched, frequency-doubled Nd-YAG laser with an output energy of 700 mJ in a single pulse of less than 10 ns duration and a coherence length in excess of 2 m. The laser output is split into two beams: one of 100 mJ and the other of 600 mJ energy. The 100 mJ beam is further split into two equal energy beams and expanded and collimated to a diameter of about 90 mm. One path forms the illuminating beam for the in-line mode and the other path forms the reference beam for the off-axis mode. The in-line path records a water column of 470 mm by 90 mm diameter (9,000 cm³) at 400 mm from the front face of the housing, whereas the off-axis hologram records almost the entire volume (50,000 cm³) delineated by the front window to the end of the arms that house the in-line system (see Figure 7). Three ‘lightrods’, encompassing hollow perspex tubes with 10 glass beam-splitter plates distributed equally along the length, positioned at 45° to the main axis, provide roughly even side illumination to the off-axis volume. The front face of the camera is shown in Figure 8; the lightrods are positioned on either side of the recording volume (two on one side, one on the other). The exit window for the in-line beam path and the off-axis window can also be seen.

Figure 8 Front-end of the HoloMar camera showing the ‘lightrods’ and in-line and off-axis windows.

Two motor-driven plate holders and transport mechanisms hold up to 20 plates for in-line and 24 for off-axis holograms. One pair of plates can be exposed every 10 seconds, or they can be exposed separately. The camera is remotely operated from the surface via a control console (PC). A network of micro-controllers serves as the backbone of the in-camera control system, and communication between topside and camera modules is implemented using a Controller Area Network (CAN) bus via an umbilical cable. The entire camera is enclosed in a water-tight stainless steel housing of 1,000 mm diameter by 2,200 mm length. The prototype system is capable of either ship deployment or attachment to a fixed buoy and allows recording down to a water depth of 100 m.

The HoloMar camera is shown, in Figure 9, being lowered into Loch Etive, a sea loch on the west coast of Scotland, on its maiden dive. Figures 10 and 11 show holographic images of plankton recorded on this cruise. The first shows a view from a virtual

Figure 9 The HoloMar camera being lowered into Loch Etive, Scotland.

Figure 10 A photograph taken from a virtual image replay of an off-axis hologram recorded in Loch Etive.
replay of an off-axis hologram at a wavelength of 514 nm: this is what a viewer would see when looking from the inside of the holocamera into the laser-illuminated water. Each bright dot signifies an object of interest; two fiducial wires can also be seen fixed diagonally across the arms at the end of the camera, some 300 mm from the off-axis window. In the lower left of the image a long ‘string-like’ object about 40 mm long is probably ‘floc’, the dead organic matter that forms the food-stuff of many zooplankton. Objects like this would be destroyed in net-collection. Figure 11 shows images taken from the real image projection of in-line and off-axis holograms. The holograms are replayed using a

Figure 11 Images of calanoid copepods from (a) an in-line hologram recorded at 70 m depth and (b) an off-axis hologram recorded at 60 m depth; the millimeter scale is added at the replay stage. Each organism is approximately 4 mm long.

See also

Holography, Applications: Holographic Recording Materials and Their Processing. Holography, Techniques: Computer-Generated Holograms. Phase Control: Phase Conjugation and Image Correction.

Further Reading

Beers JR, Knox C and Strickland JDH (1970) A permanent record of plankton samples using holography. Limnology and Oceanography 15: 967–970.

Carder KL (1979) Holographic microvelocimeter for use in studying ocean particle dynamics. Optical Engineering 18: 524–525.

Hariharan P (1984) Optical Holography, 2nd edn. Cambridge, UK: Cambridge University Press.

Heflinger LO, Stewart GL and Booth CR (1978) Holographic motion pictures of microscopic plankton. Applied Optics 17: 951–954.

Hobson PR and Watson J (2002) The principles and practice of holographic recording of plankton. Journal of Optics A: Pure and Applied Optics 4: S34–S49.

Katz J, Donathay PL, Zhang J, King S and Russell K (1999) Submersible holocamera for detection of particle
HOLOGRAPHY, APPLICATIONS / Holographic Recording Materials and Their Processing 47
characteristics and motions in the ocean. Deep-Sea marine particulates. Optical Engineering 39:
Research 46: 1455– 1481. 2187– 2197.
Kilpatrick JM and Watson J (1993) Underwater hologram- Stewart GL, Beers JR and Knox C (1970) Application of
metry: reduction of aberrations by index compensation. holographic techniques to the study of marine plankton
Journal of Physics D: Applied Physics 26: 177 – 182. in the field and the laboratory. Proceedings of SPIE
Knox C (1966) Holographic microscopy as a technique for 41: 183 – 188.
recording dynamic microscopic subjects. Science 153: Sun H, Dong H, Player MA, et al. (2002) In-line digital
989– 990. holography for the study of erosion processes in
Malkiel E, Alquaddoomi O and Katz J (1999) Measure- sediments. Measurement Science and Technology 13:
ments of plankton distribution in the ocean submersible L1– L6.
holography. Measurement Science and Technology 10: Watson J, Alexander S, Craig G, et al. (2001) Simultaneous
1141– 1152. in-line and off-axis subsea holographic recording of
Owen RB and Zozulya AA (2000) In-line digital holo- plankton and other marine particles. Measurement
graphic sensor for monitoring and characterizing Science and Technology 12: L9– L15.
For high-quality holograms the resolution limit of the material must be higher than the minimum value obtained according to the above formula. For a reflection hologram recorded in blue light (λ = 400 nm) with an angle of 180° between the beams, a minimum resolving power of 7600 lines/mm is required.

In holography, the resolution of the holographic image and the resolving power of the recording material are not directly related in the way that they are in photography. Equation [1] gives the minimum resolving power needed to record a hologram. The resolution of the image depends on the area (diameter D) of the recording material, the recording laser wavelength (λ), and the distance (L) between the recording material and the object (Figure 2). Theoretically, the resolution of the holographic image is the diffraction-limited resolution that can be obtained when the information is collected over an aperture equal to the size of the recording holographic plate. In principle, the larger the holographic plate, the better the resolution will be. To obtain high image resolution, it is important that the recorded interference fringes are not changed or distorted during processing. To prevent that from happening, a stable support for the emulsion (such as a glass plate) is needed, and the processing method applied must not affect the recorded fringe pattern in the emulsion. In coherent imaging systems laser speckles are present, which may also have an effect upon image resolution.

There are two main types of holograms: amplitude and phase holograms. In a pure amplitude hologram only the absorption varies with the exposure (after processing), whereas in a pure phase hologram either the refractive index or the emulsion thickness varies with the exposure. If the hologram is thin (d ≈ 0), phase variations are caused by surface relief variations only. If the hologram is thick and has negligible surface relief (Δd = 0), phase variations are caused by index variations only. In many cases, the phase modulation in a hologram is a combination of the two extreme types (a complex hologram).

For amplitude holograms, the curve of amplitude transmission Ta against exposure or log exposure is used for characterization (Figure 3). For a phase hologram, the corresponding curve is the phase shift against log exposure (Figure 4).

One of the most important hologram characteristics is its diffraction efficiency η. It is defined as the diffracted intensity Ei of the wanted diffraction order of the hologram in relation to the incident

Figure 3 Amplitude transmission Ta versus the logarithm of exposure H, used for the characterization of amplitude transmission holograms.
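The minimum resolving power quoted above for a 400 nm reflection hologram can be checked numerically. This sketch assumes the standard fringe-frequency relation ν = 2n sin(θ/2)/λ and a gelatin emulsion index of n ≈ 1.52; neither the formula (eqn [1] is not reproduced in this excerpt) nor the index value appears here, so both should be read as assumptions.

```python
# Minimum fringe frequency a recording material must resolve, assuming
# nu = 2 * n * sin(theta/2) / lambda, with theta the inter-beam angle
# inside the emulsion and n ~ 1.52 an assumed gelatin refractive index.
from math import sin, radians

def min_resolving_power(lambda_nm, theta_deg, n=1.52):
    """Minimum resolving power in lines per mm."""
    lambda_mm = lambda_nm * 1e-6          # convert nm to mm
    return 2 * n * sin(radians(theta_deg) / 2) / lambda_mm

nu = min_resolving_power(400, 180)        # blue light, counter-propagating beams
print(round(nu))  # 7600 lines/mm, matching the value stated in the text
```

For a transmission geometry the inter-beam angle is much smaller, so the required resolving power drops accordingly.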
well as on the intensity of the interference pattern. Some holographic materials must be processed after exposure in a specific way to obtain a hologram. The recorded intensity variations are converted during this processing step to local variations in optical density or refractive index and/or thickness of the recording layer.

Exposure H is defined as the incident intensity E times the time t of exposure of the recording material. If the intensity is constant during the whole exposure time, which is usually the case, then

H = Et [6]

Holographic materials are usually characterized using radiometric units. The radiometric equivalent of illuminance is irradiance. The unit of irradiance is W/m² and the exposure will then be expressed in J/m². The sensitivity of a holographic emulsion is most often expressed in μJ/cm² or mJ/cm². Knowing the sensitivity of the material used and having measured the irradiance at the position of the holographic plate, the exposure time can be calculated using the above formula, i.e.,

Exposure time = sensitivity / irradiance

Holographic materials are sensitized in such a way that they are optimized for laser wavelengths commonly used in holography.

Silver Halide Materials

A silver halide photographic recording material is based on one type, or a combination, of silver halide crystals embedded in a gelatin layer. The emulsion is coated on a flexible or stable substrate material. Silver-halide grain sizes vary from about 10 nanometers for the ultra-fine-grain holographic emulsions to a few micrometers for highly sensitive photographic emulsions (Table 1). Silver salts are only sensitive to UV, violet, and deep blue light. Therefore special sensitizers (dyes) must be added to the emulsion to make it sensitive to other parts of the spectrum. Orthochromatic emulsions are sensitive to green light. If the material has been sensitized to both green and red light, it is said to be panchromatic.

In the following, silver halide materials manufactured by photographic and holographic companies are described. The main difference between the new materials and the previous emulsions from Agfa, Ilford, etc., is that the current silver halide emulsions have smaller grain sizes. The present holographic silver halide emulsion manufacturing process is more or less based on the Russian ultrafine-grain emulsion technology.

Substrates for Holographic Emulsions

The material on which the emulsion is coated has a strong bearing on the final quality of the hologram. The best choice is often a glass plate, as it is mechanically stable and optically inactive. High-resolution imaging, hologram interferometry, holographic optical elements (HOEs), and spatial filters are a few examples where a very stable emulsion support is important. Mass-produced holograms are mainly recorded on film substrates, as the use of film has many advantages compared to that of glass (breakage, weight, cost, size, etc.). Film substrates are used exclusively in the production of large-format holograms. Film substrates are of mainly two types: a polyester (polyethylene terephthalate); or a cellulose ester, commonly triacetate (cellulose triacetate) or acetate-butyrate. Polyester is mechanically more stable than triacetate film and it is also less sensitive to humidity. Because of its higher tensile strength, the polyester film can be made thinner than the triacetate film. On the other hand, polyester is birefringent, which can cause problems when recording holograms. In Table 2 most current commercial silver halide materials for holography are listed.
Material | Thickness [μm] | Spectral sensitivity [nm] | Sensitivity [μJ/cm²] | Resolving power [lp/mm] | Grain size [nm]

THERMOPLASTIC MATERIALS
Tavex America
  Pan TCC-2 | — | <800 | 1 | 1500 | —
MD Diffusion
  Custom orders only

PHOTORESIST MATERIALS
Towne Technologies
  UV-Blue Shipley 1800 | 1.5–2.4 | <450 | 1.5 × 10⁵ | 1000 | —
Hoya Corporation
  Custom orders only

BACTERIORHODOPSIN MATERIALS
MIB GmbH
  BR-WT B-type | 30–100 | <650 | 80 × 10³ | 5000 | —
  BR-D96N M-type | 30–100 | <650 | 30 × 10³ | 5000 | —
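Given a sensitivity like those tabulated above and a measured irradiance, the earlier relations H = Et and exposure time = sensitivity/irradiance are straightforward to apply. The numbers below are illustrative assumptions, not values for any specific material.

```python
# Exposure-time calculation from the relations in the text:
#   H = E * t  (eqn [6])  and  exposure time = sensitivity / irradiance.
sensitivity = 100.0   # assumed emulsion sensitivity, in uJ/cm^2
irradiance = 50.0     # assumed measured irradiance at the plate, in uW/cm^2

t = sensitivity / irradiance   # uJ/uW = seconds
H = irradiance * t             # exposure actually delivered, in uJ/cm^2
print(t, H)  # 2.0 100.0 -> a 2 s exposure delivers exactly the required 100 uJ/cm^2
```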
The developing agent is the most important constituent in a holographic developer. In addition to the developing agent, a developer consists of the following components: a preservative (or antioxidant); a weak silver-solvent agent; an accelerator (or activator); a restrainer; and a solvent, which is usually water. Employing a developer which contains a silver-solvent agent, solution-physical development
Methylphenidone 0.2 g
Catechol 10 g
Ascorbic acid 5 g / 10 g / 18 g / 16 g / 18 g
Hydroquinone 5 g
Metol 2.5 g
Phenidone 0.5 g / 6 g
Sodium sulfite 5 g / 100 g
Sodium phosphate (dibasic) 28.4 g / 28.4 g
Urea 50 g
Potassium hydroxide 5 g
Sodium hydroxide 12 g / 12 g
Sodium carbonate 30 g / 55.6 g / 60 g
Ammonium thiocyanate 12 g
Distilled water 1 l / 1 l / 1 l / 1 l / 1 l / 1 l
Dilution: no / 15 ml developer + 400 ml water / no / no / no / no
Amidol or metol* 1 g
Cupric bromide 1 g
Potassium persulfate 10 g / 20 g
Potassium dichromate 4 g
p-Benzoquinone 2 g
Ferric (III) sulfate 30 g / 30 g
Copper sulfate 35 g
Disodium EDTA 30 g
Boric acid 1.5 g
Citric acid 50 g
Potassium bromide 30 g / 30 g / 20 g / 30 g
Sodium hydrogen sulfate 30 g / 30 g / 30 g
Sulfuric acid 4 ml
Distilled water 1 l / 1 l / 1 l / 1 l / 1 l / 1 l

* To be added after all other constituents have been dissolved.
used. Recipes for common holographic bleach baths are found in Table 4.

Commercial Silver Halide Emulsions

In Table 2 most of the current silver halide holographic emulsions are listed. The main manufacturer of holographic silver halide emulsions is the Micron branch of the SLAVICH Joint Stock Company, a photographic company located outside Moscow (2 pl. Mendeleeva, 152140 Pereslavl-Zalessky, Russia; www.slavich.com). Both film (triacetate) and glass plates are manufactured. Many different sizes are produced, including 115 cm by 10 m film rolls. Recommended holographic developers: CW-C2, SM-6. Many rehalogenating holographic bleach baths work well and, in particular, the PBU-type bleaches are recommended.

Colourholographic Ltd is a new company based in England (Braxted Park, Great Braxted, Witham CM8 3BX, England; www.colourholographic.com). The Colourholographic materials are based on the HRT emulsion, previously manufactured in Germany by Dr Birenheide. At the present time only glass plates are manufactured in England. Recommended processing solutions for the new HRT materials are a sodium-sulfite-free ascorbic acid-metol developer followed by a rehalogenating ferric EDTA or PBU bleach. For reflection holograms, a pyrogallol developer and the ferric-EDTA bleach can be used.

Eastman Kodak's holographic materials are produced in America (343 State Street, Rochester, New York 14650, USA; www.kodak.com/go/PCBproducts). These plates are usually made to order only. Materials can be ordered with (marked -01) or without (marked -02) an anti-halation layer. The 120 emulsion can be used for recording holograms using a pulsed ruby laser. Kodak recommends the developer Kodak D-19 for the processing.

FilmoTec is located in Germany (Chemie Park Bitterfeld-Wolfen, Röntgenstrasse, D-06766 Wolfen, Germany; www.filmotec.de). The company manufactures ORWO holographic emulsions, which are coated only on triacetate film, 104 cm wide, in 10 and 30 m lengths.

In France, M. Yves Gentet has started a small-scale emulsion manufacturing company: Atelier de Création d'Art en Holographie Yves Gentet (50, rue Dubourdieu, F-33800 Bordeaux, France; www.perso.wanadoo.fr/holographie). Monochromatic and panchromatic ultrafine-grain emulsions called 'Ultimate' are produced. Plates and film up to 60 cm by 80 cm are produced. An ascorbic acid-metol developer is recommended, followed by a ferric EDTA bleach. In addition, the Russian GP-2 developer can be used. A special developer can also be ordered from the company.

Konica in Japan is manufacturing a green-sensitive emulsion for use in the American holographic VOXEL 3D medical imaging system and, in addition, for the Japanese market only.

Dichromated Gelatin Materials

Dichromated gelatin (DCG) is an excellent recording material for volume phase holograms and HOEs. The grainless material has its highest sensitivity in the UV region, but extends into the blue and green parts of the spectrum. There are possibilities to sensitize the
emulsion in the red part of the spectrum as well. However, DCG is most often exposed with blue laser wavelengths. Depending on processing parameters, diffraction efficiency and bandwidth can be controlled. It is easy to obtain high diffraction efficiency combined with a large SNR.

During the exposure of a DCG emulsion to UV or blue light, the hexavalent chromium ion (Cr⁶⁺) is photo-induced to the trivalent chromium ion (Cr³⁺), which causes cross-linking between neighboring gelatin molecules. The areas exposed to light are hardened and become less soluble than the unexposed areas. Developing consists of a water wash which removes the residual or unreacted chemical compounds. Dehydration of the swollen gelatin follows after the material has been immersed in isopropanol, which causes rapid shrinkage resulting in voids and cracks in the emulsion, thus creating a large refractive-index modulation. The underlying mechanism is not completely clear, because high modulation can also be caused by the binding of isopropanol molecules to chromium atoms at the cross-linked sites. The DCG material has a rather low sensitivity of about 100 mJ/cm². In many cases DCG plates are prepared by fixing an unexposed silver halide emulsion and then sensitizing it in a dichromate solution. It is also possible, for example, to make a 10 μm thick emulsion on an 8″ by 10″ glass plate using 1 g ammonium dichromate mixed with 3 g photographic-grade gelatin in 25 ml deionized water. The emulsion is then spin-coated on the glass substrate, or applied by the doctor-blade coating technique. Many HOE manufacturers prefer to produce their own emulsions, but there are also commercial DCG emulsions available on the market.

Processing of DCG holograms is performed in graded alcohol (propanol) baths, starting with water-propanol solutions of high water content and ending with pure propanol (cold or hot) baths. Depending on the wanted HOE characteristics, one has to carefully control the propanol/water mixtures as well as the temperature of the different baths. Cold baths produce better uniformity and lower noise. Warm baths can yield high index modulation, but often with increased noise. The bandwidth can be controlled by the processing temperature and the ratio of the propanol/water mixtures.

Commercial DCG Materials

Slavich in Russia is one manufacturer of presensitized dichromated plates for holography. The DCG emulsion is marked PFG-04. Plates up to a size of 30 by 40 cm can be ordered. The FilmoTec DCG emulsions are only available on 190 μm triacetate film, 104 cm wide, in 10 and 30 m lengths; the emulsion thickness is 6 or 20 μm. Note that the film needs to be sensitized in dichromate solution before recording. This means that the company only supplies large-format gelatin-coated film.

Holotec is a German-based company which offers presensitized DCG coated on both plates and film (Jülicher Strasse 191, D-52070 Aachen, Germany; www.holotec.de). The Holotec emulsion is the high-quality DCG emulsion developed by Professor Stojanoff in Aachen. The company can supply large-format DCG glass plates (m² size) or film (PET, polyethylene terephthalate), pre-sensitized and ready to use.

Photopolymer Materials

Photopolymer materials have become popular for recording phase holograms and HOEs, in particular for mass production of holograms, since some photopolymer materials only require dry processing techniques.

A photopolymer recording material, such as the DuPont material, consists of three parts: a photopolymerizable monomer; an initiator system (which initiates polymerization upon exposure to light); and a polymer (the binder). First, an exposure is made to the information-carrying interference pattern. This exposure polymerizes a part of the monomer. Monomer concentration gradients, formed by variation in the amount of polymerization due to the variation in exposure, give rise to diffusion of monomer molecules from the regions of high concentration to the regions of lower concentration. The material is then exposed to regular light of uniform intensity until the remaining monomer is polymerized. A difference in the refractive index within the material is obtained. The DuPont material requires only a dry processing technique (exposure to UV light and a heat treatment) to obtain a hologram.

The DuPont photopolymer materials have a coated film layer thickness of about 20 μm. The photopolymer film is generally coated in a 12.5″ width on a 14″ wide Mylar® polyester base which is 0.002″ thick. The film is protected with a 0.00092″ thick Mylar® polyester cover sheet.

The recording of a hologram on DuPont polymer is rather simple. The film has to be laminated to a piece of clean glass or attached to a glass plate using an index-matching liquid. Holograms can be recorded manually, but in order to produce large quantities of holograms, a special machine is required. For hologram replication, a laser line-scanning technique can provide the highest production rate.
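The exposure-then-diffusion mechanism described above can be illustrated with a rough one-dimensional numerical sketch. The rate constant, diffusivity, grid, and time step below are arbitrary illustrative assumptions, not DuPont material parameters; the point is only that polymerization at the bright fringes, followed by monomer diffusion down the resulting concentration gradient, redistributes material and so leaves a spatial density (and hence refractive-index) modulation after the uniform flood cure.

```python
import numpy as np

# Illustrative 1-D sketch: a sinusoidal fringe pattern polymerizes monomer
# fastest where it is bright; the depleted monomer is replenished by
# diffusion from the dark regions, so total material piles up at the
# bright fringes. All parameters are made-up, dimensionless values.
N = 200
L = 1.0                                     # one fringe period (arbitrary units)
dx = L / N
x = np.arange(N) * dx
I = 0.5 * (1 + np.cos(2 * np.pi * x / L))   # fringe intensity, bright at x = 0
m = np.ones(N)                              # monomer concentration (normalized)
p = np.zeros(N)                             # polymer concentration (immobile)
k, D, dt = 2.0, 0.01, 1e-4                  # assumed rate, diffusivity, time step

for _ in range(5000):
    dm = k * I * m * dt                     # local polymerization consumes monomer
    p += dm
    # explicit diffusion step for the mobile monomer (periodic boundaries)
    lap = (np.roll(m, 1) - 2 * m + np.roll(m, -1)) / dx**2
    m += -dm + D * lap * dt

total = m + p                               # total (monomer + polymer) density
# Material has migrated toward the bright fringe: density is highest at
# x = 0 and lowest at the dark fringe x = L/2, while the mean is conserved.
print(total[0] > total[N // 2])  # True
```

A uniform flood exposure would then convert the remaining monomer in place, freezing this density modulation in as the phase hologram.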
The photopolymer material needs an exposure of about 10 mJ/cm². After the exposure is finished, the film has to be exposed to strong white or UV light. DuPont recommends about 100 mJ/cm² exposure at 350–380 nm. After that, the hologram is put in an oven at a temperature of 120 °C for two hours in order to increase the brightness of the image. The process is simple and very suitable for machine processing using, for example, a baking scroll oven.

Commercial Photopolymer Materials

Currently the only manufacturer of holographic photopolymer materials is E.I. du Pont de Nemours & Co. (DuPont Holographics, POB 80352, Wilmington, DE 19880, USA; www.dupont.com/holographics). In the past it was possible to obtain photopolymer materials from DuPont, but in 2002 the company changed its policy and today it is producing materials mainly for DuPont Holographics, although approved customers may still be able to buy material from DuPont. The reason for the restrictions is that many applications of mass-produced holograms are in the field of document security, where DuPont's photopolymers are used to produce optical variable devices (OVDs).

Photoresist Materials

Photoresist materials in holography are used for making masters for display and security holograms. The recorded relief image in the photoresist plate is then used to make the nickel shim needed as the tool for embossing holograms into plastic materials. The exposure to actinic radiation produces changes in the photoresist layer that result in a solvency differentiation as a function of exposure. Processing is done with a suitable solvent, dissolving either the unexposed or the exposed regions, depending on whether the resist is of the negative or the positive type. For optical recording, positive photoresist (exposed resist removed during development) is preferred to the negative type because of its higher resolving power and low scatter. On the resulting surface relief pattern, particles of nickel are deposited by electrolysis to make a mold, which can then be used as an embossing tool. The photoresist process can be used for making transmission holograms only. If an embossed hologram is mirror-backed by using, for example, an aluminum coating process, it can be used in the reflection reconstruction mode as well.

Photoresists are UV and deep-blue sensitive. The material is spin-coated on glass substrates to obtain a thickness of between 0.5 and 2 μm; it is then baked at about 75 °C for 15 minutes. A typical photoresist for holography (e.g., Shipley Microposit 1350) has a sensitivity of about 10 mJ/cm². The most common laser wavelengths for recording photoresist holograms are 413 nm (krypton-ion), 442 nm (helium-cadmium), and 458 nm (argon-ion). At longer wavelengths, e.g., 488 nm, the sensitivity is rather low. Development of Shipley resist is performed in Shipley 303A developer, diluted 1 part developer + 5 parts de-ionized water. Normally only a 10 second development time is needed (under agitation). After that the plate has to be rinsed in de-ionized water for 2 minutes. The plate then needs to be dried quickly with a high-pressure stream of compressed nitrogen or dry, filtered compressed air.

Commercial Resist Materials

The main American manufacturer of photoresist is Shipley Co. (1457 Macarthur Rd., #101, Whitehall, PA 18052, USA). There are a few companies which make plates based on Shipley resist.

Towne Technologies, Inc. (6–10 Bell Ave., Sommerville, NJ 08876, USA; www.townetech.com) produces spin-coated plates using Shipley resist (Shipley S-1800 series). Towne plates have a sublayer of iron oxide which enhances adhesion of the resist during electroplating operations. In addition, the sublayer eliminates unwanted backscatter from the glass substrate. The plates are developed in Shipley 303A developer.

Hoya Corporation (7-5 Naka-Ochiai 2-chome, Shinjuku-ku, Tokyo, Japan; www.hoya.co.jp) is a producer of mask blanks coated with 0.5 μm thick Shipley S-1800 resist (SLW plates). Different types of glass as well as quartz plates are produced. Hoya supplies plates for holography, but on custom orders only.

Thermoplastic Materials

The holographic thermoplastic is a multilayer structure coated on glass or film. The substrate is first coated with a conducting layer, e.g., evaporated gold, then a photoconductor, e.g., poly-N-vinyl carbazole (PVK) that has been sensitized with, e.g., 2,4,7-trinitro-9-fluorenone (TNF), and on top of this layer a thermoplastic coating is deposited (usually a styrene-methacrylate material). The recording of a hologram starts with a uniform charging of the surface of the thermoplastic material using a corona charger. The charge is divided between the photoconductor and the thermoplastic layer. Exposure and the consequent photogeneration in the photoconductor cause charges of opposite signs to migrate to the interface with the thermoplastic layer and
the substrate. This will not change the charge, but only decrease the surface potential. Before the image can be developed, the material has to be recharged with the uniform corona charger. This step adds additional charges to the exposed areas, in proportion to the reduced potential, which is in turn proportional to the exposure. The material is then heated to a softening temperature of the thermoplastic layer, which is deformed by the electrostatic forces acting on it. The material is then cooled and the image is fixed as a surface relief pattern in the thermoplastic layer. The processing time after the exposure necessary to obtain the image is between ten seconds and half a minute. If the material is heated to a somewhat higher temperature, the pattern disappears and the image is erased. The thermoplastic material can then be used for the recording of another hologram, repeating the recording procedure. Such a cycling technique can be repeated hundreds of times without any serious effect on the quality of the image. That, and the fast dry processing, made such materials popular for use in holographic cameras for nondestructive testing. The sensitivity is between 10 and 100 μJ/cm² over the whole visible spectrum.

Commercial Thermoplastic Materials

In the past, several companies produced thermoplastic materials and recording equipment for holography. Since there is very little demand for such materials today, there is only one company which can deliver thermoplastic equipment for holography: Tavex America Inc. (14 Garden Road, Natick, MA 01760, USA; www.tavexamerica.com). Tavex America manufactures a thermoplastic camera, TCC-2, in which 40 mm by 40 mm plates are used. For obsolete systems, thermoplastic film can still be obtained from MD Diffusion in France (93 rue d'Adelshoffen, F-67300 Schiltigheim, France).

Bacteriorhodopsin

Photochromic materials, a class to which bacteriorhodopsin (BR) belongs, have not been used much in holography. Photochromics are real-time recyclable materials which need no processing for development. They can easily be erased and reused over and over again. In holography the only successful material has been the BR film, which was introduced only a few years ago. BR is a biomolecule with several applications in photonics, comprising a light-driven molecular proton pump with exceptional photoelectric and photochromic properties. BR is a photochromic protein, found in the photosynthetic system, termed the purple membrane (PM), of the cell membrane of the salt-marsh bacterium Halobacterium salinarum. BR thin films are made from polymer suspensions of the isolated purple membrane. Some of the desirable properties of BR films are real-time writing and erasing on microsecond time-scales, durability allowing millions of write/erase cycles without degradation, very high spatial resolution of 5000 lp/mm, reasonably good light sensitivity of approximately 10 mJ/cm², the encoding of both amplitude and phase patterns, and the use of visible laser light for writing and reading. The composition of the films may also be altered by chemical means to control the optical characteristics, and the BR molecules themselves may be mutagenically changed for performance optimization. Particular properties that can be enhanced in these ways are image lifetimes and the diffraction efficiency of holograms recorded in the films. One advantage of this material is that it is reversible. It has therefore been considered for use in holographic storage systems and for real-time holographic interferometry applications. BR film can be used for recording polarization holograms because of the BR film's inherent photo-inducible optical anisotropy. The material can also be exposed with pulsed lasers, which makes it suitable for fast holographic recording.

The BR film works in the following way. When illuminated with green light, the BR molecule undergoes an absorption change resulting in a maximum shift of about 160 nm towards the red, and its color changes from purple to yellow. The yellow image can be erased by illuminating the BR material with blue light. The recording-erasing cycle can be repeated about a million times. Suitable BR films for holography are obtained by embedding purple membranes in inert matrices like polyvinylalcohol or polyacrylamide. For hologram recording and erasure, two photochemical conversions, B → M (B-type holograms) and M → B (M-type holograms), can be employed. After excitation by light, BR cycles through a sequence of spectrally distinguishable intermediates and returns to the initial state. The time associated with this thermal relaxation process depends on several aspects, for example, the type of BR (wildtype or variant), the pH value of the film matrix, the film temperature, etc. For example, the M intermediate is populated from the B state (absorption spectrum peaked at 570 nm) by a green pumping beam, and information is recorded with blue light, which initiates the photochemical M → B transition. The M state has its absorption peak at 410 nm. In BR variants in which the thermal relaxation of the M state has been prolonged, photocontrolled switching between these two states can be achieved. Only when
the pumping beam is incident upon the BR film can (SBN); bismuth silicon oxide (BSO); and bismuth
information be recorded with blue light. The pump- germanium oxide (BGO), consist of bulk space
ing beam can be used as a readout beam for the charge patterns. An interference pattern acting upon
recorded hologram.

Commercial BR Materials

A holographic camera based on a BR film is manufactured in Germany. The holographic system, FringeMaker-plus, developed at the University of Marburg, is marketed by Munich Innovation Biomaterials (MIB) GmbH (Hans-Meerwein-Str., D-35032 Marburg, Germany; www.mib-biotech.de). MIB is also a manufacturer of both wildtype and genetically modified BR film for holography and other scientific/technical applications. BR films offered by MIB can be used in two different recording configurations: the transition from the initial B to M state (induced with yellow light) serves as the basis for optical data recording and processing (B-type recording); or the photochemically induced transition from the M to B state (M-type recording). In the latter case, yellow light is used to control the population of the M state and blue light is used to write the information. MIB offers BR films with normal and slow thermal relaxation characteristics. Typical relaxation time ranges (RTR), after which the photochemically excited BR molecules have returned to the original state, are between 0.3 and 80 s. The BR film is sealed between two windows of high-quality optical glass. The BR film layer has a thickness of 30–100 µm, depending on the optical density of the film. The diameter of the film device is 25 mm and the clear aperture is 19 mm.

Photorefractive Crystal Materials

Photorefractive crystals are electro-optic materials which are suitable for recording volume phase holograms in real time. These crystals are important components in holographic storage systems but are seldom used in other holographic image applications. Holograms recorded in crystals such as lithium niobate (LiNbO3), barium titanate (BaTiO3), potassium tantalate niobate (KTN), barium sodium niobate […] a crystal will generate a pattern of electronic charge carriers that are free to move. The carriers move to areas of low optical illumination, where they become trapped. This effect forms patterns of net space charge, creating an electric field pattern. As these crystals have strong electro-optic properties, the electric field pattern creates a corresponding refractive index variation pattern, i.e., a phase hologram. These holograms can be immediately reconstructed with a laser beam differing from the beams that created the hologram. A hologram can also be erased and a new hologram recorded in the crystal. Holograms can be fixed by converting the charge patterns into patterns of ions, which are not sensitive to light. In some materials the hologram can be fixed by heating the crystal to above 100 °C, based on thermally activated ionic conductivity. The resulting ionic pattern is frozen upon cooling the crystal back to room temperature. More details about these materials and storage applications are found in the Further Reading.

See also

Holography, Techniques: Overview.

Further Reading

Bjelkhagen HI (1993) Silver Halide Recording Materials for Holography and Their Processing. Springer Series in Optical Sciences, vol. 66. Heidelberg: Springer-Verlag.
Bjelkhagen HI (ed.) (1996) Selected Papers on Holographic Recording Materials. SPIE Milestone Series, vol. MS130. Bellingham, WA: SPIE.
Lessard RA (ed.) (1995) Selected Papers on Photopolymers: Physics, Chemistry, and Applications. SPIE Milestone Series, vol. MS114. Bellingham, WA: SPIE.
Sincerbox GT (ed.) (1994) Selected Papers on Holographic Storage. SPIE Milestone Series, vol. MS95. Bellingham, WA: SPIE.
Smith HM (ed.) (1977) Holographic Recording Materials. Springer Topics in Applied Physics, vol. 20. Berlin: Springer-Verlag.
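The grating-formation sequence described in the Photorefractive Crystal Materials section (illumination pattern → carrier migration → trapped space charge → field → index change) can be illustrated with a toy one-dimensional model. Everything below is schematic and illustrative; none of the amplitudes or proportionality constants are material parameters from the text:

```python
import numpy as np

# Toy 1-D sketch of photorefractive grating formation, following the sequence
# described above. All quantities are schematic, not real material values.
x = np.linspace(0.0, 4.0, 4000)              # position in units of the fringe period
I = 1.0 + np.cos(2.0 * np.pi * x)            # two-beam interference illumination

# Carriers excited in the bright fringes migrate and are trapped in the dark
# regions, so the net trapped charge mirrors the illumination modulation.
rho = -(I - I.mean())                        # net space-charge density (schematic)

# In 1-D, Gauss's law makes the field the integral of the charge density.
dx = x[1] - x[0]
E_sc = np.cumsum(rho) * dx
E_sc -= E_sc.mean()

# Linear electro-optic response: index change proportional to the field.
# In this model the index grating comes out a quarter period away from
# the bright fringes, i.e., proportional to -sin where I goes as cos.
dn = E_sc / np.abs(E_sc).max()
print(np.round(np.corrcoef(dn, -np.sin(2.0 * np.pi * x))[0, 1], 3))  # ~1.0
```

The final print confirms that the resulting index pattern is sinusoidal with the same period as the illumination, which is the phase hologram described in the text.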
58 HOLOGRAPHY, TECHNIQUES / Overview
HOLOGRAPHY, TECHNIQUES
Contents
Overview
Color Holography
Computer-Generated Holograms
Digital Holography
Holographic Interferometry
Sandwich Holography and Light in Flight
The first successful technique for separating the twin images was developed by Leith and Upatnieks in 1962. This is called the off-axis or side-band technique of holography. We shall consider mainly the off-axis technique here.

Basic Holography Principle

Light from a laser, or any light beam, is characterized by spatial coherence (lc) and temporal coherence (τc), which are discussed in all textbooks on optics. For typical laboratory hologram recording, the required degree of coherence of the laser radiation is determined by the type of object and the geometrical arrangement being used for the recording. The following condition must hold:

L ≪ cτc   [1]

where τc is the coherence time and L the maximum path length difference between the two waves chosen to record the hologram. To begin recording, two wavefronts are derived from the same laser source. One wavefront is used to illuminate the object and the other is used as a reference wavefront. The wavefront derived by illuminating the object is superimposed on the reference wave and the interference pattern is recorded. These object and reference waves must:

• be coherent (derived from the same laser); and
• have a fixed relative phase at each point on the recording medium.

If the above conditions are not met, the fringes will move during the exposure and the holographic record will be smeared. Therefore, to record the hologram one should avoid air currents and vibrations. The recording medium must have sufficient resolution to record the fine details in the fringe pattern, which will typically be of the order of the wavelength. To create a three-dimensional image from the holographic process one then has to record and reconstruct the hologram. The object and reference waves are derived from the same laser.

Holograms record an interference pattern formed by the interference of the object and reference wavefronts, as explained above. We may mention here that for two plane waves propagating at an angle θ between them, the spacing of the interference fringes is given by

Λ = λ0/(2 sin(θ/2))   [2]

where λ0 is the free-space wavelength.

Processing of the hologram (developing, fixing, and washing of the recording material) yields a plate with alternating transparent and opaque parts, a variation of refractive index, or a variation of height corresponding to the intensity variation in the fringe pattern. Such a plate can be regarded as a complicated diffraction grating. In the reconstruction process, the hologram is illuminated with monochromatic, coherent light. The hologram diffracts this light into wavefronts which are essentially indistinguishable from the original waves which were diffracted from the object. These diffracted waves produce all the optical phenomena that can be produced by the original waves. They can be collected by a lens and brought into focus, thereby forming an image of the original object, even though the object has since been removed. If the reconstructed waves are intercepted by the eye of an observer, the effect is exactly as if the original waves were being observed; the observer sees what appears to be the original object in true three-dimensional form. As the observer changes his viewing position, the perspective of the image scene changes; parallactic effects are evident, and the observer must refocus when the observation point is changed from a near to a distant object in the scene. Assuming that both the construction and reconstruction of the hologram are made with the same monochromatic light source, there is no visual test which can be made to distinguish between the real object and the reconstructed image of the object. It is as if the hologram were a window through which the apparent object is viewed.

Hologram of a Point Object

Consider a hologram recorded with a collimated reference wave normal to the recording plate and a point object inclined at a certain angle. If the hologram is illuminated once again with the same collimated reference wave, it reconstructs two images, one a true virtual image and the other a real image. However, the two images differ in one very important respect.

While the virtual image is located in the same position as the object and exhibits the same parallax properties, the real image is formed at the same distance from the hologram but in front of it. Corresponding points on the real and virtual images are located at equal distances from the plane of the hologram; the real image has the curious property that its depth is inverted. Such an image is not formed by a normal optical system; it is therefore called a pseudoscopic image, as opposed to a normal or orthoscopic image.

This depth inversion results in conflicting visual clues, which make viewing of the real image
psychologically unsatisfactory. Thus, if O1 and O2 are two elements in the object field, and if O1 blocks the light scattered by O2 at a certain angle, the hologram records information only on the element O1 at this angle and records no information about this part of O2. An observer viewing the real image from the corresponding direction then cannot see this part of O2, which, contrary to normal experience, is obscured by O1, even though O2 is in front of O1.

Production of an Orthoscopic Real Image

An orthoscopic real image of an object can be produced by recording two holograms in succession. In the first step, a hologram is recorded of the object with a collimated reference beam. When this hologram is illuminated once again with the collimated reference beam that was used to record it, it reconstructs two images of the object at unit magnification, one of them being an orthoscopic virtual image, while the other is a pseudoscopic real image. A second hologram is then recorded of this real image with a second collimated reference beam. When the second hologram is illuminated with a collimated beam, it reconstructs a pseudoscopic virtual image located in the same position as the first real image, while the real image formed by the second hologram is an orthoscopic image. Since a collimated reference beam is used throughout, the final real image is the same size as the original object and free from aberrations.

In addition to these characteristics linked with the three-dimensional nature of the reconstruction, the holographic recording has several other properties. Each portion of the hologram can reproduce the entire image scene. If a smaller and smaller portion of the hologram is used for reconstruction, there is a loss of image intensity and resolution. When a hologram is reversed, such as in contact printing processes, it will still reconstruct a positive image indistinguishable from the image produced by the original hologram.

Simple Mathematical Description of Holography

Let the object and reference waves be given by

Uo = O(x, y)e^{iφ(x,y)}   [3]

and

Ur = R e^{iky sin θ}   [4]

Ur describes the reference (plane) wave propagating at angle θ to the z-axis. The intensity at the hologram plane is

I = |Ur + Uo|² = |Ur|² + |Uo|² + Ur*Uo + UrUo*   [5]

As can be seen from the above equation, the amplitude and phase of the wave are encoded as the amplitude and phase modulation of a set of interference fringes. The material used to record the patterns of interference fringes described above is assumed to provide a linear mapping of the incident intensity into the amplitude transmitted by, or reflected from, the recorded material during the reconstruction process. Usually both light detection and wavefront modulation are performed by a photographic plate/film. We assume the amplitude transmission properties of the plate/film after processing to be described by

T = T0 − β(E − E0)   [6]

where the exposure E at the film is E = It; here t is the exposure time and T0 is the transmittance of the unexposed plate.

If the hologram is reilluminated by the reference wave, the transmitted wave amplitude will be

Ut = UrT0 − βtUr(|Uo|² + Ur*Uo + UrUo*)   [7]

The first term is the reference wave times a constant. The second term is the reference wave modulated by |Uo|² = O(x, y)²; it gives small-angle scattering about the reference wave direction. The third term is proportional to Uo: it is the same as the original object wave (note this is only so if the reconstruction wave is identical to the reference wave). The fourth term is

−βtUr²Uo* = −βtR²e^{i2ky sin θ}O(x, y)e^{−iφ(x,y)}   [8]

This is essentially a wave traveling in a direction sin⁻¹(−sin θ) to the z-axis, with the correct object amplitude modulation but its phase reversed in sign, producing a conjugate wave.

Types of Holograms

Primarily, holograms are classified as thin or thick, based on the thickness of the recording medium. When the thickness of the recording medium is small compared with the average spacing of the interference fringes, the hologram can be treated as a thin hologram. Such holograms are characterized by a
spatially varying complex amplitude transmittance:

t(x, y) = |t(x, y)| exp[−iφ(x, y)]   [9]

Thin holograms can be further categorized as thin amplitude holograms or thin phase holograms. If the amplitude transmittance of the hologram is such that φ(x, y) is constant while t(x, y) varies over the hologram, the hologram is termed an amplitude hologram. For a lossless phase hologram, |t(x, y)| = 1, so that the variation in complex amplitude transmittance is caused by variation in phase.

When the thickness of the hologram recording material is large compared to the spacing of the interference fringes, the holograms may be considered as volume holograms. These may be treated as a three-dimensional system of layers corresponding to a periodic variation of absorption coefficient or refractive index, and the diffracted amplitude is at a maximum when Bragg's condition is satisfied. In general, the behavior of thin and thick holograms corresponds to the Raman–Nath and Bragg diffraction regimes. The distinction between the two regimes is made on the basis of a parameter Q, which is defined by the relation:

Q = 2πλ0d/(n0Λ²)   [10]

where

Λ = spacing of the fringes on the hologram, measured normal to the surface;
d = thickness of the recording medium;
n0 = mean refractive index of the medium; and
λ0 = wavelength of light.

Small values of Q (Q < 1) correspond to thin gratings, while large values of Q (Q > 10) correspond to volume gratings. For values of Q between 1 and 10, intermediate properties are exhibited. However, this criterion is not always adequate. The boundaries between thin and thick holograms are given by the value of

P = λ0²/(Λ²n0n1)   [11]

where P⁻² is the relative power diffracted into higher orders, with P < 1 for thin holograms and P > 10 for thick holograms. For P having values between 1 and 10, the hologram may be thin or thick, depending on the other parameters involved.

Holograms can also be classified based on whether they are reconstructed in transmitted light or reflected light. For transmission holograms, during the recording stage, the two interfering wavefronts make equal but opposite angles to the surface of the recording medium and are incident on it from the same side. Reflection holograms reconstruct images in reflected light. They are recorded such that the interfering wavefronts are symmetrical with respect to the surface of the recording medium but are incident on it from opposite sides. When the angle between the interfering wavefronts is at its maximum (180°), the spacing between the fringe planes in the recording medium is at its minimum. Under such conditions, reflection holograms may have wavelength selectivity high enough to be reconstructed even with white light.

An important development in the field of holography was the invention of rainbow holograms by Benton in 1969. This provides a method for utilizing white light for illumination when viewing the holograms. The technique does so by minimizing the blur introduced by color dispersion in transmission holograms, at the price of giving up parallax information in one dimension.

Recording Materials

An ideal recording medium for holography should have a spectral sensitivity well matched to available laser wavelengths, linear transfer characteristics, high resolution, and low noise. It should also be either inexpensive or indefinitely recyclable. Toward achieving the above-mentioned properties, several materials have been studied, but none has been found so far that meets all the requirements. Materials investigated include: silver halide photographic emulsions; dichromated gelatin plates/films; silver halide sensitized gelatin plates/films; photoresists; photopolymer systems; photochromics; photothermoplastics; and ferroelectric crystals. Recently, the storage and processing capabilities of computers, together with CCD cameras, have been used for recording holograms.

Silver halide photographic emulsions are a commonly used recording material for holography, mainly because of their relatively high sensitivity and easy availability. The manufacture, use, and processing of these emulsions have been well standardized. The need for wet processing and drying may constitute a major drawback. Dichromated gelatin can be considered an almost ideal recording material for volume phase holograms, as it has a large refractive index modulation capability, high resolution, and low absorption and scattering.

Because of these features, dichromated gelatin has been extensively investigated. Its most significant disadvantage is its comparatively low sensitivity. Pennington et al. developed an alternative technique, which combines the advantage of silver halide emulsions (high sensitivity) with that of dichromated
gelatin (low absorption and scattering, and high stability). It involves exposing a silver halide photographic emulsion and then processing it so as to obtain a volume phase hologram consisting solely of hardened gelatin.

Thin phase holograms can be recorded on photoresists, which are light-sensitive organic films yielding a relief image after exposure and development. They offer the advantage of easy replication using thermoplastics, but are slow in response and undergo nonlinear effects at diffraction efficiencies greater than ~0.05. Shipley AZ-1350 is a widely used photoresist, with maximum sensitivity in the ultraviolet, dropping rapidly for longer wavelengths towards the blue.

Photopolymers are being keenly investigated because they offer advantages such as ease of handling, low cost, and real-time recording for applications in holography and nonlinear optics. However, they have low sensitivity and a short shelf life.

Thin phase surface relief holograms can be recorded in a thin layer of thermoplastic. These materials have a reasonably high sensitivity over the whole visible spectrum and fairly high diffraction efficiency, and do not require wet processing. Their application in holographic interferometry, optical information processing, and in making compact holographic devices has been widely reported.

Photorefractive crystals such as Bi12SiO20, LiNbO3, Bi12GeO20, BaTiO3, LiTaO3, etc., as recording materials, offer high angular sensitivity and provide the capability to read and write volume holographic data in real time. Besides the materials discussed here, several other materials have been investigated for recording holograms; these include photochromics, elastomer devices, magneto-optic materials, etc.

Application of Holography

Holograms can be constructed not only with the light waves of lasers, but also with sound waves, microwaves, and other waves in the electromagnetic spectrum of radiation. Holograms made with ultraviolet light or X-rays can record images of objects/particles smaller than the wavelength of visible light, e.g., atoms or molecules. Acoustical holography uses sound waves to see through solid objects. Holography has a vast scope of practical applications, which can be classified into two major categories:

1. applications requiring three-dimensional images for visual perception; and
2. applications in which holography is used as a measuring tool.

The unique ability of holography – to record and reconstruct both electromagnetic and sound waves – makes it a valuable tool for education, science, industry, and business. Below are some of the important applications:

1. Holographic interferometry (HI) is one of the most powerful measuring tools. In HI, two states of an object, i.e., the initial and the deformed state, are recorded on the same photographic plate. After reconstruction, the light waves corresponding to the two states of the object interfere, and deformations are displayed in terms of the interference pattern. A change of distance of one tenth of a micron, or less, can be resolved. HI provides scientists/engineers with crucial data for the design of critical machine parts of power-generating equipment, in the aircraft industry, the automobile industry, and nuclear installations (for example, in the design of containers to transport nuclear materials, or to improve the design of aircraft wings and turbine blades). Presently, HI is widely used in mechanical engineering, acoustics, and aerodynamics for nondestructive testing, to investigate oscillations in diaphragms, and to study flow around various objects.
2. Microwave holography can detect objects deep in space, by recording the radio waves they emit.
3. Another important application of holography is the design of optical elements which possess special properties. A holographic recording of a concave mirror behaves in much the same way as the mirror itself, i.e., it can focus light. In some cases chromatism can be introduced in the design of elements so that the location of the point where the beams are focused depends on the wavelength. This can be achieved by accurately choosing the recording arrangement of the focusing elements; these elements are, in fact, diffraction gratings. They can have low noise levels, freedom from astigmatism, and other useful properties. Holographic optical elements have found applications in supermarket scanners that read barcodes, head-up displays in fighter aircraft for observing critical cockpit instruments, etc.
4. A telephone credit card used in Europe and other developed countries carries embossed surface holograms with a monetary value. When the card is inserted into the telephone, a card reader discerns the amount due and deducts the appropriate amount to cover the cost of the call.
5. Holography has applications in analog and digital computers, offering remarkable opportunities to realize various logical operators, devices for identifying images based on matched filtering, and computer memory units. The basic advantage of holographic memory is that a relatively large volume of information can be stored and that there are a limited number of ways to change the record. The arrival of the first prototype optical computers, which use holograms as the storage material for data, could have a dramatic impact on the holographic market. The yet-to-be-unveiled optical computers will be able to deliver trillions of bits of information faster than the current generation of computers.
6. Optical tweezers are going to become an important tool for the study of micro-organisms/bacteria, etc.
7. Holograms can be used to locate and retrieve information without knowing its location in the storage medium, only needing to know some of its content.
8. The links between computer science and holography are now well established and developing, with at least two aspects making computer-generated holograms extremely interesting. First, such holograms enable us to obtain visual three-dimensional reconstructions of imagined objects. For example, one can reconstruct in three dimensions a model of an object still in the design stage. Second, computer-generated holograms can be used to reconstruct lightwaves with specified wavefronts. This means specially computed and manufactured holograms may function as optical elements that transform the incident light wave into desired wavefronts.
9. Another important application of holography is its use to compensate for the distortion that occurs when viewing objects through optically heterogeneous media. This can be achieved based on the principle of beam reversal by the hologram.
10. Finally, let us consider the application of holography to art. The development of holography gives very effective ways of creating high-quality three-dimensional images. Thus, a new independent area of holographic creative work, representational/artistic holography, has appeared. The art of holographic depiction has developed along two major routes. The first is the creation of view holograms, that is, holograms of natural objects that are to be displayed in exhibitions and museums; these are also known as artistic holograms. Portrait holography is also classified under this category. Progress in portrait holography is hampered partly by the imperfection of pulsed lasers and partly by the deterioration of photographic material when exposed to pulsed electromagnetic radiation.
The principle behind the creation of illusion by using composite holograms is also very convincing for the display of objects. To synthesize a composite hologram, photographs of various aspects of a scene are printed onto the photographic plate. The synthesis techniques for preparing composite holograms are very complicated, and the images created by such holograms are still far from perfect. However, there is no doubt that composite holography opens holography up as an artistic technique. Recently, rainbow holography has been very popular for displaying such objects.

See also

Holography, Applications: Art Holography. Holography, Techniques: Color Holography; Computer-Generated Holograms; Digital Holography; Holographic Interferometry. Phase Control: Phase Conjugation and Image Correction.

Further Reading

Benton SA (1969) On a method for reducing the information content of holograms. Journal of the Optical Society of America 59: 1545.
Denisyuk YuN (1962) Photographic reconstruction of the optical properties of an object in its own scattered radiation field. Soviet Physics Doklady 7: 543.
Denisyuk YuN (1963) On the reproduction of the optical properties of an object by the wave field of its scattered radiation, Pt. I. Optics and Spectroscopy 15: 279.
Denisyuk YuN (1965) On the reproduction of the optical properties of an object by the wave field of its scattered radiation, Pt. II. Optics and Spectroscopy 18: 152.
Gabor D (1948) A new microscopic principle. Nature 161: 777.
Gabor D (1949) Microscopy by reconstructed wavefronts. Proceedings of the Royal Society A197: 454.
Gabor D (1951) Microscopy by reconstructed wavefronts: II. Proceedings of the Physical Society B64: 449.
Hariharan P (1983) Optical Holography: Principles, Techniques and Applications. Cambridge, UK: Cambridge University Press.
Leith EN and Upatnieks J (1962) Reconstructed wavefronts and communication theory. Journal of the Optical Society of America 52: 1123.
Leith EN and Upatnieks J (1963) Wavefront reconstruction with continuous-tone objects. Journal of the Optical Society of America 53: 1377.
Leith EN and Upatnieks J (1964) Wavefront reconstruction with diffused illumination and three-dimensional objects. Journal of the Optical Society of America 54: 1295.
Pennington KS and Lin LH (1965) Multicolor wavefront reconstruction. Applied Physics Letters 7: 56.
Stroke GW and Labeyrie AE (1966) White-light reconstruction of holographic images using the Lippmann–Bragg diffraction effect. Physics Letters 20: 368.
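The quantitative relations in the Overview above (the coherence condition [1], the fringe-spacing formula [2], the reconstruction expansion of eqs [5]–[8], and the Q criterion of eq. [10]) can be exercised numerically. The following Python sketch uses illustrative values throughout: a He–Ne wavelength, a 30° inter-beam angle, a 7 µm emulsion of index 1.5, and arbitrary plate constants T0, β, and t; none of these numbers come from the text:

```python
import numpy as np

# --- Coherence and fringe spacing (eqs [1] and [2]); all values illustrative ---
c = 3.0e8                                   # speed of light, m/s
tau_c = 1e-9                                # assumed coherence time, s
print(f"path differences must satisfy L << {c * tau_c:.2f} m")   # eq. [1]

lam0 = 633e-9                               # He-Ne wavelength, m
theta = np.radians(30.0)                    # angle between the two recording beams
spacing = lam0 / (2 * np.sin(theta / 2))    # fringe spacing, eq. [2]
print(f"fringe spacing ~ {spacing * 1e9:.0f} nm "
      f"({1e-3 / spacing:.0f} lines/mm of resolution needed)")

# --- Thin vs volume classification (eq. [10]) ---
d, n0 = 7e-6, 1.5                           # assumed emulsion thickness and mean index
Q = 2 * np.pi * lam0 * d / (n0 * spacing**2)
regime = "thin" if Q < 1 else "volume" if Q > 10 else "intermediate"
print(f"Q = {Q:.1f} -> {regime} grating")

# --- Recording and reconstruction (eqs [3]-[8]) on a 1-D plate ---
y = np.linspace(0.0, 50e-6, 2048)           # coordinate along the plate, m
k = 2 * np.pi / lam0
R, T0, beta, t = 1.0, 0.8, 0.5, 1.0         # illustrative plate parameters
U_r = R * np.exp(1j * k * y * np.sin(theta))        # reference wave, eq. [4]
U_o = 0.1 * np.exp(1j * 2 * np.pi * y / 10e-6)      # a simple object wave, eq. [3]
I = np.abs(U_r + U_o) ** 2                          # recorded intensity, eq. [5]
E0 = np.mean(I * t)                                 # bias exposure
U_t = U_r * (T0 - beta * (I * t - E0))              # reillumination, eqs [6]-[7]
terms = (U_r * (T0 + beta * E0 - beta * t * np.abs(U_r) ** 2)
         - beta * t * U_r * np.abs(U_o) ** 2
         - beta * t * np.abs(U_r) ** 2 * U_o        # reconstructed object wave
         - beta * t * U_r ** 2 * np.conj(U_o))      # conjugate wave, eq. [8]
print("four-term expansion matches:", np.allclose(U_t, terms))
```

With these numbers the grating lands in the volume (Bragg) regime, and the final check confirms that the transmitted field splits exactly into the four terms discussed after eq. [7].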
Color Holography
H I Bjelkhagen, De Montfort University, Leicester, UK

© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

Since the appearance of the first laser-recorded monochromatic holograms in the 1960s, the possibility of recording full-color high-quality holograms has now become a reality. Over many years, holographic techniques have been created to produce holograms with different colors, referred to as pseudo- or multicolor holograms. For example, in the field of display holography, it is common to make multiple-exposed reflection holograms using a single-wavelength laser and emulsion-thickness manipulation to obtain an image with different colors. In this way many beautiful pseudocolor holograms have been made by artists. Applying Benton's rainbow hologram technique (for example, the one used for embossing holograms on credit cards), it has been possible to record transmission holograms as well as to mass-produce embossed holograms with 'natural' colors. However, a rainbow hologram has a very limited viewing position from which a correct color image can be seen.

This article will acquaint the reader with the topic of color holography (full-color or true-color holography). This term describes a holographic technique for obtaining a color 3D image of an object where the color rendition is as close as possible to the color of the real object. A brief review of the development of color holography, its current status, and the prospects of this new 3D imaging technique are presented. Also included are recording techniques using three lasers providing red, green, and blue (RGB) laser light, as well as computer-generating techniques for producing full-parallax 3D color images.

The Development of Color Holography

In theory, the first methods for recording color holograms were established in the early 1960s. Leith and Upatnieks (1964) had already proposed multicolor wavefront reconstruction in one of their first papers on holography. The early methods concerned mainly transmission holograms recorded with three different wavelengths from a laser or lasers, combined with different reference directions to avoid cross-talk. Such a color hologram was then reconstructed (displayed) by using the original laser wavelengths from the corresponding reference directions. However, the complicated and expensive reconstruction setup prevented this technique from becoming popular. More suitable for holographic color recording is reflection holography, which can be reconstructed and viewed in ordinary white light. Most likely, the three-beam transmission technique might eventually become equally applicable, provided inexpensive multicolor semiconductor lasers become widely available. Such RGB lasers can be used to display a very realistic, deep, and sharp holographic image of, for example, a room.

Relatively few improvements in color holography were made during the 1960s and 1970s. Not until the end of the 1970s did a few examples of color holograms appear. During the 1980s, new and improved techniques were introduced. A review of various transmission and reflection techniques for color holography was published by Hariharan (1983). From 1990 until today, many high-quality color holograms have been recorded, mainly due to the introduction of new and improved panchromatic recording materials. On the market are ultra-fine-grain silver halide emulsions (manufactured by Slavich in Russia) as well as photopolymer materials (manufactured by E. I. du Pont de Nemours & Co. in the USA).

Color reflection holography presents no problems with regard to the geometry of the recording setup, but the final result is highly dependent on the recording material used and the processing techniques applied. There are some problems associated with recording color holograms in general, and specifically in silver halide emulsions:

• Scattering occurring in the blue part of the spectrum, typical of many holographic coarse-grain silver halide emulsions, makes them unsuitable for the recording of color holograms.
• Multiple storage of interference patterns in a single emulsion reduces the diffraction efficiency of each individual recording. The diffraction efficiency of a three-color recording in a single-layer emulsion is lower than that of a single-wavelength recording in the same emulsion.
• During wet processing, emulsion shrinkage frequently occurs, causing a wavelength shift. White-light-illuminated reflection holograms normally show an increased bandwidth upon reconstruction, thus affecting the color rendition.
• The fourth problem, related to some extent to the recording material itself, is the selection of appropriate laser wavelengths and their combination in order to obtain the best possible color rendition of the object.

In the beginning, when no suitable panchromatic emulsion existed, the sandwich technique was used to make color reflection holograms. Two plates were sandwiched together, in which, for example, two different types of recording materials were used. The most successful demonstration of the sandwich recording technique was made by Kubota (1986) in Japan. He used a dichromated gelatin (DCG) plate for the green (515 nm) and blue (488 nm) components, and an Agfa 8E75 silver halide plate for the red (633 nm) component of the image. Kubota's sandwich color hologram of a Japanese doll, recorded in 1986, demonstrated the potential of high-quality color holography for the first time. Color holograms have also been recorded in red-sensitized DCG materials.

Extensive work in the field of reflection color holography was performed by Hubel and Solymar (1991). They primarily employed Ilford silver halide materials for the recording of color holograms, applying the sandwich technique.

Very important is the research on recording materials by Usanov and Shevtsov (1990) in Russia. Their work is based on the formation of a microcavity structure in gelatin, by applying a special processing technique to silver halide emulsions. Such holograms recorded in a silver halide emulsion have a high diffraction efficiency and exhibit a typical DCG hologram quality.

Not until panchromatic ultra-fine-grain silver halide emulsions were introduced in Russia in the early 1990s was it possible to record high-quality color holograms in a single emulsion layer, as demonstrated by the present author. Although it is […] color holograms, as well as monochrome holograms, being more widely used.

Recording of Color Holograms

Currently, most transmission color holograms are of the rainbow type. Large-format holographic stereograms made from color-separated movie or video recordings have been produced. There are also some examples of embossed holograms in which a correct color image is reproduced. In order to generate a true-color rainbow hologram, a special setup is required in which the direction of the reference beam can be changed between the recordings of the color-separated RGB primary images. However, as already mentioned, holographic color images of the rainbow type can reconstruct a correct color image only along a horizontal line in front of the film or plate and are, thus, of less interest for serious color holography applications. Reflection holography can offer full-parallax, large-field-of-view 3D color images where the colors do not change when observed from different directions.

Silver Halide Materials

To be able to record high-quality color reflection holograms it is necessary to use recording materials with extremely low light scattering. This means, for example, the use of ultra-fine-grain silver halide emulsions (grain size about 10 nm). Currently, the only producer of a commercial holographic panchromatic ultra-fine-grain silver halide material is the Micron branch of the Slavich photographic company located outside Moscow. The PFG-03c emulsion comes in standard sizes from 63 mm × 63 mm format up to 300 mm × 406 mm glass plates and is also available on film. Some characteristics of the Slavich material are presented in Table 1.

By using suitable processing chemistry for the PFG-03c emulsion it has been possible to obtain high-quality color holograms. These holograms are recorded in a single-layer emulsion, which greatly simplifies the recording process as compared with many earlier techniques. This single-layer technique is described below.

Table 1 Characteristics of the Slavich panchromatic emulsion

Silver halide material: PFG-03C
Emulsion thickness: 7 µm
Grain size: 12–20 nm
now possible to produce color holograms, they need Resolution ,10,000 lp(mm)21
to be displayed correctly. An important improvement Blue sensitivity ,1.0– 1.5 £ 1023 J(cm)22
would be to find an edge-illuminating technique to Green sensitivity ,1.2– 1.6 £ 1023 J(cm)22
make color holograms easier to display. The display Red sensitivity ,0.8– 1.2 £ 1023 J(cm)22
Color sensitivity peaked at 633 nm and 530 nm
problem still remains the main obstacle preventing
66 HOLOGRAPHY, TECHNIQUES / Color Holography
Figure 1 The 1976 CIE uniform scales chromaticity diagram shows the gamut of surface colors and positions of common laser
wavelengths. Optimal color-recording laser wavelengths are also indicated.
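The gamut comparison of Figure 1 can be explored numerically. The sketch below is purely illustrative and not part of the article: it uses the analytic approximation to the CIE 1931 color-matching functions published by Wyman, Sloan, and Shirley (2013) to place a laser triad on the chromaticity diagram and to compute the area of the triangle the triad spans; the function names and the choice of the 476/532/647 nm triad are assumptions.

```python
import numpy as np

# Piecewise-Gaussian fit of the CIE 1931 color-matching functions
# (Wyman, Sloan, Shirley 2013); adequate for rough gamut estimates only.
def _g(lam, mu, s1, s2):
    s = np.where(lam < mu, s1, s2)   # different widths left/right of the peak
    return np.exp(-0.5 * ((lam - mu) / s) ** 2)

def xyz(lam):
    lam = np.asarray(float(lam))
    x = (1.056 * _g(lam, 599.8, 37.9, 31.0) + 0.362 * _g(lam, 442.0, 16.0, 26.7)
         - 0.065 * _g(lam, 501.1, 20.4, 26.2))
    y = 0.821 * _g(lam, 568.8, 46.9, 40.5) + 0.286 * _g(lam, 530.9, 16.3, 31.1)
    z = 1.217 * _g(lam, 437.0, 11.8, 36.0) + 0.681 * _g(lam, 459.0, 26.0, 13.8)
    return float(x), float(y), float(z)

def chroma(lam):
    """xy chromaticity coordinates of a monochromatic line at lam [nm]."""
    X, Y, Z = xyz(lam)
    return X / (X + Y + Z), Y / (X + Y + Z)

def gamut_area(triad):
    """Area of the chromaticity triangle spanned by three laser lines."""
    (x1, y1), (x2, y2), (x3, y3) = (chroma(l) for l in triad)
    return 0.5 * abs(x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2))

print(gamut_area((476, 532, 647)))   # area of the laser triangle in xy space
```

Comparing such areas for different triads mirrors the maximize-the-gamut-area criterion discussed in the text, although a serious analysis would use tabulated color-matching functions rather than this fit.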
There are several considerations that must be taken into account when choosing the wavelengths for color holograms. One of these important considerations is the question of whether three wavelengths are really sufficient for color holography. The wavelength selection problem for color holography has been treated in several papers, for example by Peercy and Hesselink (1994). Hubel and Solymar (1991) provided a definition of color recording in holography:

    A holographic technique is said to reproduce 'true' colors if the average vector length of a standard set of colored surfaces is less than 0.015 chromaticity coordinate units, and the gamut area obtained by these surfaces is within 40% of the reference gamut. Average vector length and gamut area should both be computed using a suitable white light standard reference illuminant.

An important factor to bear in mind when working with silver halide materials is that a slightly longer blue wavelength than the optimal one might give higher-quality holograms (better signal-to-noise ratio) because of reduced Rayleigh scattering during the recording. Another important factor to consider is the reflectivity of the object at the primary spectral wavelengths. It has been shown that the reflectivity of an object at three wavelength bands, peaked at 450, 540, and 610 nm, has an important bearing on color reconstruction. These wavelengths can also be considered optimal for the recording of color holograms, even though the laser lines are very narrow-band. A triangle larger than necessary can be considered in order to compensate for the color desaturation (color shifting towards white) that takes place when reconstructing color reflection holograms in white light. According to Hubel's color rendering analysis, the optimal wavelengths for the sandwich silver halide recording technique are 464, 527, and 606 nm. If the calculations are performed to maximize the gamut area instead, the following set of wavelengths is obtained: 456, 532, and 624 nm. However, when approached from a different viewpoint, with the optimal trio of wavelengths based on a reconstructing light source of 3400 K, a 6 μm thick emulsion with a refractive index of 1.63, and an angle of 30° between the object and the reference beam, the following wavelengths were obtained: 466.0, 540.9, and 606.6 nm.

Peercy and Hesselink (1994) discussed wavelength selection by investigating the sampling nature of the holographic process. During the recording of a color hologram, the chosen wavelengths point-sample the surface-reflectance functions of the object. The effect of this sampling on color perception can be investigated through the tristimulus values of points in the reconstructed hologram, which is mathematically equivalent to integral approximations for the tristimulus integrals. Peercy and Hesselink used both Gaussian quadrature and Riemann summation for the approximation of the tristimulus integrals. In the first case they found the wavelengths to be 437, 547, and 665 nm; in the second case the wavelengths were 475, 550, and 625 nm. According to Peercy and Hesselink, the sampling approach indicates that three monochromatic sources are almost always insufficient to preserve all of the object's spectral information accurately. They claim that four or five laser wavelengths are required.

Only further experiments will show how many wavelengths are necessary and which combination is the best for practical purposes. Another factor that may influence the choice of the recording wavelengths is the availability of the cw lasers currently in use in holographic recording, for example, argon ion, krypton ion, diode-pumped solid-state (DPSS) frequency-doubled Nd:YAG, helium neon, and helium cadmium lasers (Table 2).

Table 2 Wavelengths from cw lasers

Wavelength [nm]   Laser type       Single-line power [mW]
442               Helium cadmium   ~100
457               DPSS blue        ~500
458               Argon ion        ~500
468               Krypton ion      ~250
476               Krypton ion      ~500
477               Argon ion        ~500
488               Argon ion        ~2000
497               Argon ion        ~500
502               Argon ion        ~400
514               Argon ion        ~5000
521               Krypton ion      ~100
529               Argon ion        ~600
531               Krypton ion      ~250
532               DPSS green       ~3000
543               Green neon       ~10
568               Krypton ion      ~100
633               Helium neon      ~80
647               Krypton ion      ~2000
656               DPSS red         ~1000

The recent progress in DPSS laser technology has made available both red and blue DPSS lasers. These lasers are air-cooled and small, and each laser requires less than a hundred watts of electric power to operate. A set of three DPSS lasers will usually be the best choice of cw lasers for color holography in the future.

Setup for Recording Color Holograms

A typical reflection hologram recording setup is illustrated in Figure 2.
The different laser beams necessary for the exposure of the object pass through the same beam expander and spatial filter. A single-beam Denisyuk arrangement is used, i.e., the object is illuminated through the recording holographic plate. The light reflected from the object constitutes the object beam of the hologram. The reference beam is formed by the three expanded laser beams. This 'white' laser beam illuminates both the holographic plate and the object itself through the plate. Each of the three primary laser wavelengths forms its individual interference pattern in the emulsion, all of which are recorded simultaneously during the exposure. In this way, three holographic images (a red, a green, and a blue image) are superimposed upon one another in the emulsion.

Three laser wavelengths are employed for the recording: 476 nm, provided by an argon ion laser; 532 nm, provided by a cw frequency-doubled Nd:YAG laser; and 647 nm, provided by a krypton laser. Two dichroic filters are used for combining the three laser beams. The 'white' laser beam goes through a spatial filter, illuminating the object through the holographic plate.

By using the dichroic-filter beam-combination technique, it is possible to perform simultaneous-exposure recording, which makes it possible to control independently the RGB ratio and the overall exposure energy in the emulsion. The RGB ratio can be varied by individually changing the output power of the lasers, while the overall exposure energy is controlled solely by the exposure time. The overall energy density for exposure is about 3 mJ/cm².

In the initial experiments performed by the author in 1993, a specially designed test object, consisting of the 1931 CIE chromaticity diagram, a rainbow ribbon cable, pure yellow dots, and a cloisonné elephant, was used for the color balance adjustments and exposure tests. Later, another test target was employed – the Macbeth ColorChecker® chart, which was used for color rendering tests.

Color reflection holograms can also be produced using copying techniques, so that projected real images can be obtained; however, these are normally associated with a restricted field of view. For many display purposes, the very large field of view obtainable in a Denisyuk color hologram is often more attractive.

Processing of Color Holograms

The dry processing of color holograms recorded on photopolymer materials has already been described. The process is simple and very suitable for machine processing using, for example, a baking scroll oven. The processing of silver halide emulsions is more difficult and critical. The Slavich emulsion is rather soft, and it is important to harden the emulsion before the development and bleaching take place. Emulsion shrinkage and other emulsion distortions caused by the active solutions used for the processing must be avoided. In particular, when recording master color holograms intended for photopolymer replication, shrinkage control is extremely important. The processing steps are summarized in Table 3.
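The exposure bookkeeping described earlier (the RGB ratio set by the individual laser powers, the overall energy density by the common exposure time) reduces to a few lines of arithmetic. In the sketch below the irradiance values at the plate are hypothetical placeholders, chosen so that the per-channel energies land near the Table 1 sensitivities; only the ~3 mJ/cm² total comes from the text.

```python
# Hypothetical irradiances at the plate [W/cm^2]; in practice these are
# tuned by adjusting each laser's output power (the RGB ratio).
TOTAL_E = 3.0e-3                      # overall exposure energy [J/cm^2]
irradiance = {
    "blue_476nm":  1.0e-4,
    "green_532nm": 1.2e-4,
    "red_647nm":   0.8e-4,
}
t = TOTAL_E / sum(irradiance.values())              # common exposure time [s]
energy = {k: p * t for k, p in irradiance.items()}  # per-channel [J/cm^2]
print(t, energy)
```

With these assumed numbers the common exposure time comes out at about ten seconds, and the energy split is 1.0/1.2/0.8 mJ/cm² for blue/green/red.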
Table 4 Chromaticity coordinates from color hologram recording tests using the Macbeth ColorChecker®
Object: White #19, Blue #13, Green #14, Red #15, Yellow #16, Magenta #17, Cyan #18
sub-beam is directed through a sequence of digital images on a liquid-crystal screen. Each resulting exposure, about two millimeters square, is called a 'hogel'. The full-color hogels are the holographic building blocks of a finished CGH image. In an automated step-and-repeat process, 9 × 10⁴ hogels are formed on a flat square tile of DuPont panchromatic photopolymer film by simultaneous RGB laser exposures. The 60 cm by 60 cm tile itself is the finished hologram, or, to obtain larger holograms, the 3D image is made in 60 cm by 60 cm tiled increments. So far, the largest image created was of Ford's P2000 Prodigy concept car, in which ten such holograms tiled together made up the large color reflection hologram (1.2 m by 3 m) which is reproduced in Figure 6.

This technique has opened the door to real 3D computer graphics. However, generating such a large hologram is a very time-consuming process. The Ford P2000 hologram took almost two weeks (300 hours) to produce, since each individual panel requires a 24-hour recording time.

The Future of Color Holography

The manufacturing of large-format color reflection holograms is now possible. Mass production of color holograms on photopolymer materials has started in Japan. Although good color rendition can be obtained, some problems remain to be solved, such
Figure 6 Computer-generated color hologram of Ford's P2000 Prodigy concept car. A ten-tile, full-parallax mosaic hologram produced by Zebra Imaging, Inc.; size 1.2 m by 3 m.
as desaturation, image resolution, signal-to-noise ratio, and dynamic range. Other limitations concerning holographic color recording include the fact that some colors we see are the result of fluorescence, which cannot be recorded in a hologram. There are some differences in the recorded colors in one of the test charts shown here. However, color rendition is a very subjective matter. Different ways of rendition may be preferable for different applications, and different people may have different color preferences.

At the present time, research on processing color reflection holograms recorded on ultra-fine-grain materials is still in progress. Work is carried out on microcavity-structure holograms in order to produce low-noise reflection holograms with high diffraction efficiency. For producing large quantities of color holograms, the holographic photopolymer materials are most suitable and will be further improved in the future.

Another important field of research is the development of a three-wavelength pulsed laser. Employing such a laser, dynamic events, as well as portraits, can be recorded in holographic form. Currently, research work is in progress at the French-German Research Institute ISL, Saint-Louis, France, as well as at the Geola company in Lithuania.

The virtual color image behind a color holographic plate represents the most realistic image of an object that can be obtained today. The extensive field of view adds to the illusion of beholding a real object rather than an image of it. The wavefront reconstruction process accurately recreates the three-wavelength light scattered off the object during the recording of the color hologram. This 3D imaging technique has many obvious applications, in particular in displaying unique and expensive artifacts. There are also many potential commercial applications of this new feature of holographic imaging, provided that the display problem can be solved using some sort of integrated lighting.

Today, it is technologically possible to record and replay acoustical waves with very high fidelity. Hopefully, holographic techniques will be able to offer the same possibility in the field of optical waves, wavefront storage, and reconstruction. Further development of computer-generated holographic images will make it possible to display extremely realistic full-parallax 3D color images of nonexisting objects, which could be applied in various spheres, for example in product prototyping, as well as in other applications in 3D visualization and three-dimensional art. Eventually, as computers become faster and possess greater storage capacity, it will become possible to generate true-holographic 3D color images in real time. However, the most important issue is the development of the extremely high-resolution electronic display devices which are necessary for electronic holographic real-time imaging.

See also

Holography, Techniques: Computer-Generated Holograms.

Further Reading

Bjelkhagen HI (1993) Silver Halide Recording Materials for Holography and Their Processing. Springer Series in Optical Sciences, vol. 66. New York: Springer-Verlag.
Bjelkhagen HI, Jeong TH and Vukičević D (1996) Color reflection holograms recorded in a panchromatic ultrahigh-resolution single-layer silver halide emulsion. Journal of Imaging Science & Technology 40: 134–146.
Bjelkhagen HI, Huang Q and Jeong TH (1998) Progress in color reflection holography. In: Jeong TH (ed.) Sixth International Symposium on Display Holography, Proc. SPIE, vol. 3358, pp. 104–113. Bellingham, WA: The International Society for Optical Engineering.
Hariharan P (1983) Colour holography. In: Wolf E (ed.) Progress in Optics, vol. 20, pp. 263–324. Amsterdam: North-Holland.
Hubel PM and Solymar L (1991) Color reflection holography: theory and experiment. Applied Optics 30: 4190–4203.
Klug MA, Klein A, Plesniak W, Kropp A and Chen B (1997) Optics for full parallax holographic stereograms. In: Benton SA and Trout TJ (eds) Practical Holography XI and Holographic Materials III, Proc. SPIE, vol. 3011, pp. 78–88. Bellingham, WA: The International Society for Optical Engineering.
Kubota T (1986) Recording of high quality color holograms. Applied Optics 25: 4141–4145.
Leith EN and Upatnieks J (1964) Wavefront reconstruction with diffused illumination and three-dimensional objects. Journal of the Optical Society of America 54: 1295–1301.
Peercy MS and Hesselink L (1994) Wavelength selection for true-color holography. Applied Optics 33: 6811–6817.
Usanov YuE and Shevtsov MK (1990) Principles of fabricating micropore silver-halide-gelatin holograms. Optics and Spectroscopy (USSR) 69: 112–114.
Watanabe M, Matsuyama T, Kodama D and Hotta T (1999) Mass-produced color graphic arts holograms. In: Benton SA (ed.) Practical Holography XIII, Proc. SPIE, vol. 3637, pp. 204–212. Bellingham, WA: The International Society for Optical Engineering.
Computer-Generated Holograms
W J Dallas, University of Arizona, Tucson, AZ, USA
A W Lohmann, University of Erlangen-Nürnberg, Germany

© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

A hologram is a tool for shaping the amplitude and the phase of a light wave. Shaping the amplitude distribution is easy: one way is by a photographic plate, another way is by binary pixels as used in 'half-tone printing'. The phase distribution can be shaped by RE-fraction on a thin film of variable thickness, or by DIF-fraction on a diffraction grating that has been deliberately distorted.

Both kinds of holograms can be manufactured based on a computer design, the refractive version being called a 'kinoform', and the diffractive version being known as a computer-generated hologram (CGH). We will explain the fundamentals of computer holography with emphasis on the 'Fourier CGH', and will also highlight the large variety of applications.

After the usual photo-chemical treatment, the photographic plate is called a hologram. In the reconstruction step, the hologram acts as a diffracting object that is illuminated by a replica of the reference light. Due to diffraction, the light behind the hologram is split into three parts (Figure 1b). One part proceeds to the observer, who perceives a virtual object where the genuine object had been originally during the recording step.
The amplitude transmittance of the hologram is

T_H(x) = c₀ + c₁I_H(x) = c₁u₀u_R* + ···   [2]

The coefficients c₀ and c₁ represent the photochemical development. In the reconstruction process (Figure 1b) the transmittance T_H is illuminated by the reference beam. Hence, we get

u_R(x)T_H(x) = c₁u₀(x)|u_R(x)|² + ···   [3]

If the reference intensity |u_R(x)|² is constant, this term represents a reconstruction of the object wave u₀(x). The omitted parts in eqns [2] and [3] describe those beams that are ignored by the observer (Figure 1b). The hologram transmittance T_H(x) is an analog signal:

0 ≤ T_H ≤ 1   [4]

Before the 1960s, when computer holograms were invented, the available plotters could produce only binary transmittance patterns. Hence, it was necessary to replace the analog transmission T_H(x) by a binary transmission B_H(x), which is possible without loss of image quality. In addition, binary computer holograms have better light efficiency and a better signal-to-noise ratio. They are also more robust when being copied or printed.

Another advantage of a CGH is that the object does not have to exist in reality, which is important for some applications. The genuine object may be difficult to manufacture or, if it is three-dimensional, difficult to illuminate. The CGH may be used to implement an algorithm for optical image processing. In that case, the term 'object' becomes somewhat fictitious. We will discuss more about applications towards the end of this article.

From a Diffraction Grating to a Fourier CGH

Now we will explain, in some detail, the 'Fourier-type' CGH. Fourier holograms are more popular than two other types: 'image-plane' CGHs and 'Fresnel-type' CGHs. The Fresnel type will be treated in the section on 'software' because of some interest in 3-D holograms. Certain hardware issues will also be described.

The explanation of Fourier CGHs begins with grating diffraction (Figure 2). The only uncommon feature in Figure 2 is the off-axis location of the source. It is arranged such that the plus-first diffraction order will hit the center of the output plane.

We will now convert a simple Ronchi grating (Figure 3a) into a Fourier CGH. The Ronchi grating transmittance can be expressed as the Fourier series:

G(x) = Σ_(m) C_m exp(2πimx/D);  C₀ = 1/2;  C_m = sin(πm/2)/(πm)   [5]

Now we shift the grating bars by an amount PD. Then we modify the width from D/2 to WD, as shown in Figure 3b. The Fourier coefficients are now

C_m = exp(2πimP) sin(πmW)/(πm)   [6]

The C₊₁ coefficient is responsible for the complex amplitude at the center of the readout plane:

C₊₁ = exp(2πiP) sin(πW)/π   [7]

We are now able to control the beam magnitude A = (1/π)sin(πW) by the relative bar width W, and the phase φ = 2πP by the relative bar shift P. This particular kind of phase shift is known as 'detour phase'.

So far, the light that ends up at the center of the exit plane leaves the grating as a plane wave with a wave vector parallel to the optical axis. Now we distort the grating on purpose (Figure 3c):

W → W(x, y);  P → P(x, y)   [8]
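Equations [5]–[7] are easy to check numerically. The following sketch is an illustration, not part of the article: it samples one period of a binary grating with relative bar width W and relative bar shift P and compares the first Fourier coefficient against eqn [7]; the sign convention chosen for the shift reproduces the factor exp(2πiP).

```python
import numpy as np

def fourier_coeff(m, W, P, n=200_000):
    """Numerical Fourier coefficient C_m of a binary grating (period D = 1)
    with relative bar width W and relative bar shift P."""
    x = (np.arange(n) + 0.5) / n                 # midpoint samples, one period
    # bar of width W centered at x = -P (mod 1); this shift convention
    # reproduces C_m = exp(2*pi*i*m*P) * sin(pi*m*W) / (pi*m) of eqn [6]
    bar = np.abs((x + P + 0.5) % 1.0 - 0.5) < W / 2
    return np.mean(bar * np.exp(-2j * np.pi * m * x))

W, P = 0.3, 0.15
c1 = fourier_coeff(1, W, P)
pred = np.exp(2j * np.pi * P) * np.sin(np.pi * W) / np.pi   # eqn [7]
print(abs(c1 - pred))   # small discretization residual
```

The magnitude of C₊₁ depends on W alone and its phase on P alone, which is exactly the detour-phase principle used to encode a complex amplitude cell by cell.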
the Fourier transform of v(x′, y′):

u_H(x, y) = ∬ v(x′, y′) exp[2πi(xx′ + yy′)/(λf)] dx′ dy′   [12]

The condition 'mild enough distortions' has been treated in the literature. There is not enough space here to present that part of the CGH theory. But we show an actual CGH with 64 × 64 cells (Figure 4). This CGH is no longer a grating, but it shows enough resemblance to our heuristic derivation. Hence, the CGH theory appears to be trustworthy. A CGH looks less regular if the image v(x′, y′) contains a random phase that simulates a diffuser. The diffuser spreads light across the CGH and so levels out unacceptably large fluctuations in the amplitude.

With 512 × 512 or 1024 × 1024 cells in a CGH one can obtain an image of the quality seen in Figure 5. The reconstruction setup is shown in Figure 6, here with the letter F as the image.

Figure 6 Reconstruction setup for a Fourier CGH.

In the case of a rectangular object u₀(x, y), the number of pixels is Δx′Δy′/(δx′δy′).

λf/Δx_H ≤ δx′   [14]

Δx′/δx′ ≤ Δx_H/D   [15]

This means that the number of image pixels Δx′/δx′ is bounded by the number of CGH cells Δx_H/D. The generalization to two dimensions is straightforward. Again, Fourier CGHs benefit from the fact that the light propagation from the virtual object to the hologram plane can be described by a Fourier transformation. Hence, the FFT can be used.

And now we move to the synthesis of a Fresnel hologram. A Fresnel hologram is recorded at a finite distance z from the object. Hence, we have to simulate the wave propagation in free space from the object (at z = 0) to the hologram at a distance z. This free-space propagation can be described (in the paraxial approximation) as a convolution of the object with a quasi-spherical wave:

u₀(x) → ∫ u₀(x′) exp[iπ(x − x′)²/(λz)] dx′ = u(x, z)   [16]

(with the integral taken from −∞ to ∞). In the Fourier domain the corresponding operation is:

ũ₀(μ) → ũ₀(μ) exp[−iπλzμ²] = ũ(μ, z)   [17]

It is easy to proceed from here to the CGH synthesis of a 3-D object. One starts with an object detail u₁(x) at z₁, for example, the house as in Figure 7. Then one propagates to the next object detail at z₂. This second detail may act as a multiplication (transmission) and as an addition (source elements). This process is repeated as the wave moves towards the plane of the hologram. There, a reference wave has to be added, as in the Fourier case.

Now to describe the algorithmic effort for computing the convolution: the u₀(x) has N pixels, and the lateral size of the exponential is essentially M = z/δz, where δz represents the 'depth of focus'. M describes the lateral size of the diffraction cone, measured in pixel-size units δx′, and the convolution involves MN multiply/adds. A comparison of these two algorithms is dictated by the ratio

M/(4 log N)   [18]

The direct convolution is preferable if the distance z is fairly small. The integration limits of eqn [16] depend on the oscillating phase πx²/λz. Effectively, nothing is contributed to this integral when this phase varies by more than 2π over the length δx′ of an object pixel. Note that the pixel size of u(x, z) does not change with the distance z, because the bandwidth Δμ(z) = Δμ₀ is z-independent. This is so because the power spectrum is z-invariant:

|ũ(μ, z)|² = |ũ₀(μ)|²   [19]

A very powerful algorithm for the CGH synthesis is 'IFTA' (the iterative Fourier transform algorithm). Suppose we want to design a Fourier CGH which looks like a person and whose reconstructed image shows the signature of that person (Figure 8a,b). In other words, the two intensities

|u_H(x, y)|² and |ũ_H(μ, ν)|²   [20]

are determined. But the phase distributions of u_H and ũ_H are still at our disposal. The IFTA will yield those phases in many cases, at the expense of ten, a hundred, or more Fourier transformations. The IFTA is sometimes called 'Fourier Ping-Pong' because it bounces back and forth between the (x, y) and the (μ, ν) domain. The amplitudes |u_H| and |ũ_H| are enforced at every opportunity, but the associated phases are free to vary and, eventually, to converge. If so, u_H and ũ_H are completely known. The IFTA will not always converge, for example not if

|u_H(x, y)| = δ(x, y) and |ũ_H(μ, ν)| = δ(μ, ν)   [21]
Figure 8 IFTA-CGH: (a) The Fourier CGH. (b) The optical reconstruction thereof. (Courtesy of H.O. Bartelt.)
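A minimal version of the Ping-Pong loop can be written in a few lines. The sketch below is illustrative only (array size, random seed, and the self-consistent choice of target amplitudes are assumptions, not from the article): the two prescribed amplitudes are enforced alternately while the phases are left free.

```python
import numpy as np

# Minimal IFTA / Gerchberg-Saxton sketch: amplitudes are prescribed in
# both domains; phases 'ping-pong' between the (x, y) and (mu, nu) planes.
rng = np.random.default_rng(0)
N = 64
a_obj  = rng.random((N, N))            # prescribed |u_H(x, y)|
a_spec = np.abs(np.fft.fft2(a_obj))    # prescribed |~u_H(mu, nu)|
# (deriving the spectral target from the object target guarantees that a
# consistent solution exists)

u = a_obj * np.exp(2j * np.pi * rng.random((N, N)))   # random start phase

def spec_err(u):
    """Relative residual of the spectral amplitude constraint."""
    return (np.linalg.norm(np.abs(np.fft.fft2(u)) - a_spec)
            / np.linalg.norm(a_spec))

err0 = spec_err(u)
for _ in range(200):
    U = np.fft.fft2(u)
    U = a_spec * np.exp(1j * np.angle(U))   # enforce spectral amplitude
    u = np.fft.ifft2(U)
    u = a_obj * np.exp(1j * np.angle(u))    # enforce object amplitude
print(err0, spec_err(u))                    # residual decreases
```

Because the spectral target here is derived from the object target, a solution exists and the residual shrinks; with independently chosen targets, such as the pair in eqn [21], the loop may stagnate.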
other one has negative phase contrast (Figure 10b). Another simple spatial filter (Figure 11a) produces a derivative of the input u(x, y): ∂u(x, y)/∂x (Figure 11b). Equations [22]–[24] explain this differential filtering experiment.

u(x, y) → ∂u(x, y)/∂x = v(x, y)   [22]

ũ(μ, ν) → 2πiμ·ũ(μ, ν) = ṽ(μ, ν)   [23]

p̃(μ, ν) = 2π|μ|·{ +i (μ > 0); −i (μ < 0) }   [24]

The phase portion of the filter (Figure 11a) is the same as for the Hilbert filter, but the filter amplitude now enhances the higher frequencies. The object used for getting the results of Figures 10b and 11b was a phase-only object: a bleached photograph.

A very famous spatial filtering experiment is 'matched filtering', which can be used for pattern recognition. The original matched filters were classical Fourier holograms of the target pattern. Computer holograms can do the same job, even with spatially incoherent objects, this work being initiated by Katyl.

Fourier holograms are also attractive for data storage, due to their robustness against local defects. However, a local error in the Fourier domain spreads out all over the image domain. The computed hologram has an advantage over classical holograms because of the freedom to incorporate any error detection and correction codes for reducing the bit error rate even further.

One of the most recent advances is in matched filtering with totally incoherent light. This step allows moving spatial filtering out of the well-guarded laboratory into hostile outdoor environments.

Finally, a word about terminology; several terms are used in place of 'CGH': synthetic hologram, digital hologram, holographic optical element (HOE), diffractive optical element (DOE), or digital optical element.

CGHs have also been made for other waves, such as acoustical waves, ocean waves, and microwaves. Digital antenna steering represents a case of computer holography.

See also

Diffraction: Fraunhofer Diffraction; Fresnel Diffraction. Fourier Optics. Holography, Techniques: Computer-Generated Holograms; Holographic Interferometry; Overview. Imaging: Information Theory in Imaging; Inverse Problems and Computational Imaging; Three-Dimensional Field Transformations. Information Processing: Coherent Analogue Optical Processors; Free-Space Optical Computing; Incoherent Analog Optical Processors. Optical Communication Systems: Free-Space Optical Communications. Optical Processing Systems. Phase Control: Wavefront Coding.

Further Reading

Andrés P (1999) Opt. Lett. 24: 1331.
Brown BR and Lohmann AW (1966) Appl. Opt. 5: 967.
Brown BR and Lohmann AW (1969) IBM J. R&D 13: 160.
Bryngdahl O (1973) Progr. Opt. 11: 168.
Bryngdahl O and Wyrowski F (1990) Progr. Opt. 28: 1–86.
Dallas WJ (1971) Appl. Opt. 10: 673.
Dallas WJ (1973) Appl. Opt. 12: 1179.
Dallas WJ (1980) Topics in Appl. Phys. 40: 291–366.
Dallas WJ (1999–2003) http://www.radiology.arizona.edu/dallas/CGH.htm. University of Arizona, Tucson, AZ.
Dresel T, Beyerlein M and Schwider J (1996) Appl. Opt. 35: 4615.
Fienup JR (1981) Proc. SPIE 373: 147.
Fienup JR (1982) Appl. Opt. 21: 2758.
Gerchberg RW and Saxton WO (1972) Optik 35: 237.
Goodman JW and Silvestri AM (1970) IBM J. R&D 14: 478.
Hansler RL (1968) Appl. Opt. 7: 1863.
Hauk D and Lohmann AW (1958) Optik 15: 275.
Ichioka Y and Lohmann AW (1972) Appl. Opt. 11: 2597.
Katyl RH (1972) Appl. Opt. 11: 1255.
Kozma A and Kelly DL (1965) Appl. Opt. 4: 387.
Lee WH (1978) Progr. Opt. 16: 219–232.
Lesem LB, Hirsch PM and Jordan JA (1969) IBM J. R&D 13: 150.
Lohmann AW and Paris DP (1967) Appl. Opt. 6: 1746.
Lohmann AW and Paris DP (1968) Appl. Opt. 7: 651.
Lohmann AW and Sinzinger S (1995) Appl. Opt. 34: 3172.
Lohmann AW and Werlich HW (1971) Appl. Opt. 10: 670.
McGovern AJ and Wyant JC (1971) Appl. Opt. 10: 619.
Pe'er A, Wang D, Lohmann AW and Friesem AA (1999) Opt. Lett. 24: 1469; (2000) 25: 776.
Sinzinger S and Jahns J (1999) Micro Optics. New York: Wiley.
Vander Lugt AB (1964) IEEE Trans. Inform. Theory IT-10: 139.
Wyrowski F and Bryngdahl O (1991) Rep. Prog. Phys. 1481–1571.
Yaroslavskii LP and Merzlyakov NS (1980) Methods in Digital Holography. New York: Plenum Press.
Digital Holography
W Osten, University of Stuttgart, Stuttgart, Germany

© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

The sinusoidal nature of light, its ability to interfere, and the precise knowledge of its wavelength make it possible to measure several metrological quantities, such as distances and displacements, with high accuracy. However, the high frequency of light results in the difficulty that the primary quantity to be measured, i.e., the phase of the light wave, cannot be observed directly. All quadratic sensors (CCD, CMOS, photo plates) are only able to record intensities. Therefore, one makes use of a trick: the transformation of phase changes into recordable intensity changes, as the basic principle of interferometry and holography. Holography provides the possibility to record and to reconstruct the complete information of optical wave fields. Numerous potential applications, such as 3D imaging of natural objects, fabrication of diffractive optical elements, and interferometric investigation of complex technical components, become possible. Further progress has been made by recording holograms directly with electronic targets instead of photographic emulsions. In this way, the independent numerical reconstruction of phase and intensity can be accomplished by the computer. One important consequence is that interferometric techniques, such as 3D displacement analysis and 3D contouring, can be implemented easily and with high flexibility. Furthermore, digitally recorded holograms can be transmitted via telecommunication networks and optically reconstructed at any desired place using an appropriate spatial light modulator. Consequently, the complete optical information of complex objects can be transported between different places with high velocity. This opens the possibility to compare nominally identical but physically different objects (master-sample comparison) available at different locations with interferometric sensitivity. We call this remote metrology.

Direct Phase Reconstruction by Digital Holography

In digital speckle pattern interferometry (DSPI), the object is focused onto the target of an electronic sensor. Thus, an image-plane hologram is formed as a result of the interference with an inline reference wave. In contrast with DSPI, a digital hologram is recorded without imaging. The target records the superposition of the reference and the object wave in the near-field region – a so-called Fresnel hologram.

The basic optical setup in digital holography for recording holograms is the same as in conventional holography (Figure 1a). A laser beam is divided into two coherent beams, one illuminating the object and forming the object wave, the other entering the target directly and forming the reference wave. On this basis, very compact solutions are possible. Figure 1b shows a holographic camera where the interferometer, containing a beamsplitter and some wavefront-shaping components, is mounted in front of the camera target.

For a description of the principle of digital Fresnel holography, we use a simplified version of the optical setup (Figure 2a). The object is modeled by a plane rough surface that is located in the (x, y)-plane and illuminated by laser light. The scattered wavefield forms the object wave u(x, y). The target of an electronic sensor (e.g., a CCD or a CMOS) used for recording the hologram is located in the (ξ, η)-plane at the distance d from the object. Following the basic principles of holography, the hologram h(ξ, η) originates from the interference of the object wave u(ξ, η) and the reference wave r(ξ, η)
80 HOLOGRAPHY, TECHNIQUES / Digital Holography
Figure 1 Setup for recording digital holograms. (a) Schematic setup for recording a digital hologram onto a CCD target. (b) Implementation of a holographic camera by mounting a compact holographic interferometer in front of a CCD target (4 illumination directions are used). Reproduced from Seebacher S, Osten W, Baumbach Th and Jüptner W (2001) The determination of material parameters of microcomponents using digital holography. Optics and Lasers in Engineering 36(2): 103–126, copyright © 2001, with permission from Elsevier.
u' = Tc = t(h)c   [7]

and

u' = a(c|u|² + c|r|² + c u r* + c r u*) + c t₀   [8]

With

k = 2π/λ   [11]

as the wave number, the obliquity factor, cos θ, represents the cosine of the angle between the outward normal and the vector joining P to O'. This term is given as:

cos θ = d'/r   [12]

and therefore the Huygens–Fresnel principle can be rewritten:

u'(x', y') = (d'/iλ) ∬ t[h(ξ, η)] r(ξ, η) [exp(ikr)/r²] dξ dη   [13]

This numerical reconstruction delivers the complex amplitude of the wavefront. Consequently, the phase distribution, φ(x', y'), and the intensity, I(x', y'), can be calculated directly from the reconstructed complex function, u'(x', y'):

φ(x', y') = arctan [Im u'(x', y') / Re u'(x', y')] ∈ [−π, π]   [14]

I(x', y') = u'(x', y') u'*(x', y')   [15]

The direct approach to the phase yields several advantages for imaging and metrology applications that are discussed later.

Reconstruction Principles in Digital Holography

Different numerical reconstruction principles have been investigated: the Fresnel approximation, the convolution approach, the lensless Fourier approach, the phase-shifting approach, and the phase-retrieval approach. In this section the main techniques are briefly described. In the Fresnel approximation the reconstructed field is

u'(x', y') = [exp(ikd')/(iλd')] ∬ t[h(ξ, η)] r(ξ, η) exp{(ik/2d')[(ξ − x')² + (η − y')²]} dξ dη   [17]

Equation [17] is a convolution integral that can be expressed as

u'(x', y') = ∬ t[h(ξ, η)] r(ξ, η) H(ξ − x', η − y') dξ dη   [18]

with the convolution kernel:

H(x', y') = [exp(ikd')/(iλd')] exp[(ik/2d')(x'² + y'²)]   [19]

Another notation of eqn [17] is found if the term exp[(ik/2d')(x'² + y'²)] is taken out of the integral:

u'(x', y') = [exp(ikd')/(iλd')] exp[i(k/2d')(x'² + y'²)] ∬ t[h(ξ, η)] r(ξ, η) exp[i(k/2d')(ξ² + η²)] exp[−i(2π/λd')(ξx' + ηy')] dξ dη   [20]

or

u'(x', y') = [exp(ikd')/(iλd')] exp[i(k/2d')(x'² + y'²)] FT_{λd'}{t[h(ξ, η)] r(ξ, η) exp[i(k/2d')(ξ² + η²)]}   [21]

where FT_{λd'} indicates the 2-dimensional Fourier transform that has been modified by a factor 1/(λd'). Equation [21] makes clear that the diffracted field can be obtained from a single Fourier transform of the hologram multiplied by the reference wave and a quadratic phase factor.
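As a sketch of this direct evaluation, the single scaled FFT of eqn [21] together with the phase and intensity extraction of eqns [14] and [15] can be written in NumPy. The grid origin, the dropped constant phase factors, and the numeric values used below are illustrative assumptions, not part of the original text:

```python
import numpy as np

def fresnel_reconstruct(hologram, ref_wave, wavelength, d, pitch):
    """Discrete Fresnel reconstruction of a digital hologram (eqn [21]).

    hologram  : 2-D real array t[h(xi, eta)] (amplitude transmission)
    ref_wave  : 2-D complex array r(xi, eta)
    wavelength, d, pitch : wavelength, reconstruction distance, pixel
                           pitch, all in metres.
    Returns the complex field u'(x', y'); constant phase factors in
    front of the integral are dropped, as in eqn [22] below.
    """
    M, N = hologram.shape
    k = 2 * np.pi / wavelength
    xi = (np.arange(M) * pitch)[:, None]
    eta = (np.arange(N) * pitch)[None, :]
    # chirp factor exp[i k/(2d) (xi^2 + eta^2)] on the sensor grid
    chirp = np.exp(1j * k / (2 * d) * (xi**2 + eta**2))
    return np.fft.fft2(hologram * ref_wave * chirp) / (wavelength * d)

def phase_and_intensity(u):
    """Eqns [14] and [15]: wrapped phase in (-pi, pi] and intensity."""
    return np.angle(u), (u * np.conj(u)).real
```

The returned phase map is already wrapped, which is exactly the quantity needed for the interferometric evaluation discussed below.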
In discretized form, the reconstructed field of eqn [21] reads

u'(m, n) = Σ_{j=0}^{M−1} Σ_{l=0}^{N−1} t[h(jΔξ, lΔη)] r(jΔξ, lΔη) exp[(iπ/d'λ)(j²Δξ² + l²Δη²)] exp[−i2π(jm/M + ln/N)]   [22]

where constants and pure phase factors preceding the sums have been omitted. The main parameters are the pixel number M × N and the pixel pitches Δξ and Δη in the two directions, which are defined by the used sensor chip.

The discrete phase distribution, φ(m, n), of the wavefront and the discrete intensity distribution, I(m, n), on the rectangular grid of M × N sample points can be calculated from the reconstructed complex function, u'(m, n), by using eqns [14] and [15], respectively. The wrapped phase distribution is computed directly without the need of additional temporal or spatial phase modulation such as phase shifting or spatial heterodyning.

For interferometric applications with respect to displacement and shape measurement, the phase difference, δ(m, n), of two reconstructed wave fields, u'₁(m, n) and u'₂(m, n), needs to be calculated. The algorithm can be reduced to the following equation:

δ(m, n) = φ₁(m, n) − φ₂(m, n)   [23]

Figure 4 shows the principle of displacement measurement by digital holography. Besides the investigation of surface displacements in the region of the wavelength, digital holography can be applied advantageously for the measurement of the shape of complex objects. In Figure 5, a turbine blade was investigated with the 2-wavelength contouring method. Both the digitally reconstructed mod 2π-phase and its demodulated version after phase unwrapping are presented. Phase unwrapping is necessary in digital holography, since the fringe-counting problem still remains. However, there are some efficient approaches to overcome the difficulties of this process.

Figure 4 The principle of digital holographic interferometry demonstrated on an example of a loaded small beam. The interference phase is the result of the subtraction of the two phases which correspond to the digital holograms of both object states.

Following eqn [22] the pixel spacing in the reconstructed image is

Δx' = d'λ/(MΔξ)   and   Δy' = d'λ/(NΔη)   [24]

In addition to the real image, a blurred virtual image and a bright dc-term, the zero-order diffracted field, are reconstructed. This term can effectively be eliminated by preprocessing the stored hologram, or the different terms can be separated by using the off-axis instead of the in-line technique. However, the spatial separation between the object and the reference field requires a sufficient space-bandwidth product of the used CCD-chip, as discussed in the section on direct phase reconstruction above.

Numerical Reconstruction by the Convolution Approach

In the section above, we have already mentioned that the connection between the image term, u'(x', y'), and the product, t[h(ξ, η)]r(ξ, η), can be described by a linear space-invariant system. The diffraction formula [17] is a superposition integral with the impulse response:

H(x' − ξ, y' − η) = [exp(ikd')/(iλd')] exp{(ik/2d')[(ξ − x')² + (η − y')²]}   [25]

A linear space-invariant system is characterized by a transfer function G, that can be calculated as the
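A minimal sketch of eqns [23] and [24], assuming NumPy; rewrapping the phase difference back into (−π, π] is a common convention that the text leaves implicit:

```python
import numpy as np

def interference_phase(u1, u2):
    """Eqn [23]: interference phase of two reconstructed fields.
    Multiplying by the conjugate and taking the angle rewraps the
    difference phi1 - phi2 into (-pi, pi] in one step."""
    return np.angle(u1 * np.conj(u2))

def image_pixel_pitch(d, wavelength, M, N, d_xi, d_eta):
    """Eqn [24]: pixel spacing (dx', dy') in the reconstructed image."""
    return d * wavelength / (M * d_xi), d * wavelength / (N * d_eta)
```

For a typical CCD (e.g., 1024 × 1024 pixels of 6.8 µm at d' = 0.5 m and λ = 633 nm, values chosen only for illustration) the reconstructed pixel is about 45 µm.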
Figure 5 Phase reconstruction by digital holographic interferometry on an example of 2-wavelength contouring of a turbine blade. (a)
Image of the turbine blade. (b) Mod-2p phase distribution. (c) Unwrapped phase distribution. (d) CAD reconstruction of the turbine blade.
Fourier transform of the impulse response:

G(f_x, f_y) = FT{H(ξ, η; x', y')}   [26]

with the spatial frequencies f_x and f_y. Consequently, the convolution theorem can be applied, which states that the Fourier transform of the convolution of t[h(ξ, η)]r(ξ, η) with H is the product of the individual Fourier transforms, FT{t[h(ξ, η)]r(ξ, η)} and FT{H}. Thus u'(x', y') can be calculated by the inverse Fourier transform of the product of the Fourier-transformed convolution partners:

u'(x', y') = FT⁻¹{FT[t(h)r] · FT[H]}   [27]

or

u'(x', y') = [t(h)r] ⊗ H   [28]

with ⊗ as the convolution symbol. The computing effort is comparatively high: two complex multiplications and three Fourier transforms. The main difference to the Fresnel transform is the different pixel size in the reconstructed field:

Δx' = Δξ   and   Δy' = Δη   [29]

Numerical Reconstruction by the Lensless Fourier Approach

We have already mentioned that the limited spatial resolution of the sensor restricts the angle resolution of the digital hologram. The sampling theorem requires that the angle between the object beam and the reference beam at any point of the electronic sensor be limited in such a way that the
microinterference spacing is larger than double the pixel size. In general, the angle between the reference beam and the object beam varies over the sensor's surface, and so does the maximum spatial frequency. Thus, for most holographic setups, the full spatial bandwidth of the sensor cannot be used. However, it is important to use the entire spatial bandwidth of the sensor because the lateral resolution of the reconstructed image depends on a complete evaluation of the entire information obtained from the sensor.

Even if the speed of digital signal processing is increasing rapidly, algorithms should be as simple and as fast to compute as possible. For the Fresnel approach and the convolution approach, several fast Fourier transforms and complex multiplications are necessary. Therefore, a more effective approach, such as the subsequently described algorithm, seems promising.

The lensless Fourier approach is the fastest and most suitable algorithm for small objects. The corresponding setup is shown in Figure 6. It allows us to choose the lateral resolution in a range from a few microns to hundreds of microns without any additional optics. Each point, (ξ, η), on the hologram is again considered as a source point of a spherical elementary wavefront (Huygens' principle). The intensity of these elementary waves is modulated by t[h(ξ, η)] – the amplitude transmission of the hologram. The reconstruction algorithm for lensless Fourier holography is based on the Fresnel reconstruction. Here again, u(x, y) is the object wave in the object plane, h(ξ, η) the hologram, r(ξ, η) the reference wave in the hologram plane, and u'(x', y') the reconstructed wavefield.

For the specific setup of lensless Fourier holography, a spherical reference wave is used with an origin at the same distance from the sensor as the object itself. In the case that d' = −d, x = x', and y = y', both virtual images are reconstructed and the reconstruction algorithm then reads:

u'(x', y') = [exp(ikd)/(iλd)] exp[−i(k/2d)(x² + y²)] FT_{λd}{t[h(ξ, η)] · r(ξ, η) · exp[−i(k/2d)(ξ² + η²)]}   [30]

with FT_{λd} as the 2-dimensional Fourier transformation which has been scaled by a factor 1/(λd). In this recording configuration, the effect of the spherical phase factor associated with the Fresnel diffraction pattern of the object is eliminated by the use of a spherical reference wave, r(ξ, η), with the same average curvature:

r(ξ, η) = const. × exp[i(π/λd)(ξ² + η²)]   [31]

This results in a simpler reconstruction algorithm, which can be described by

u'(x', y') = const. × exp[−i(π/λd)(x² + y²)] FT_{λd}{h(ξ, η)}   [32]

Besides the simpler and faster reconstruction algorithm (only one Fourier transformation has to be computed), the Fourier algorithm uses the full space-bandwidth product (SBP) of the sensor chip because it adapts the curvature of the reference wave to the curvature of the object wave.

Influences of Discretization

For the estimation of the lateral resolution in digital holography, three different effects related to discretization have to be considered: averaging, sampling, and the limited sensor size. We assume a quadratic sensor with N × M pixels of the size Δξ × Δξ. Each pixel has a light-sensitive region with a side length γΔξ = Δξ_eff; γ² is the so-called fill factor (0 ≤ γ ≤ 1) that indicates the active area of the pixel. The sensor averages the incident light, which has to be considered because of possible intensity fluctuations over this area. The continuous expression for the intensity, I(ξ, η) = t[h(ξ, η)], registered by the sensor has to be integrated over the light-sensitive area. This can be expressed mathematically by the convolution of the incident intensity, I(ξ, η), with a rectangle function:

I₁(ξ, η) ∝ I ⊗ rect_{Δξ_eff, Δξ_eff}   [33]

Figure 6 Scheme of a setup for digital lensless Fourier holography of diffusely reflecting objects.
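A sketch of the single-FFT lensless Fourier reconstruction of eqn [32] in NumPy; dropping the constant factor and the residual quadratic image-plane phase, and using fftshift for centring, are assumptions made here for display purposes only:

```python
import numpy as np

def lensless_fourier_reconstruct(hologram):
    """Eqn [32]: one Fourier transform of the stored hologram. The
    spherical phase factors cancel because the reference point source
    lies in the object plane (eqn [31]); the remaining constant and the
    quadratic phase exp[-i pi/(lambda d)(x^2 + y^2)] are omitted, which
    is harmless when only the intensity image |u'|^2 is displayed."""
    return np.fft.fftshift(np.fft.fft2(hologram))
```

For a simple cosine test hologram the reconstruction shows the expected dc term in the centre and the symmetric twin images beside it.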
The consequences are amplitude distortions, aliasing, and speckling that have to be considered in the reconstruction procedure.

Advantages of Digital Holography

Besides the electronic processing and the direct access to the phase, some further advantages recommend digital holography for several applications such as metrology and microscopy:

† The availability of the independently reconstructed phase distributions of each individual state of the object and interferometer, respectively, offers the possibility to record a series of digital holograms with increased load amplitude. In the evaluation process, the convenient states can be compared interferometrically. Furthermore, a series of digital holograms with increasing load can be applied to unwrap the mod 2π-phase temporally. In this method, the total object deformation is subdivided into many measurement steps, in which the phase differences are smaller than 2π. By adding up those intermediate results, the total phase change can be obtained without any further unwrapping. This is an important feature, since it is essential to have an unwrapped phase to be able to calculate the real deformation data from the phase map. Figure 7 shows an example of such a measurement. The left image shows the wrapped deformation phase for a clamped coin which was loaded with heat. The right image shows the temporally unwrapped phase which has been obtained by dividing the total deformation into 85 sub-measurements. Thus, the displacement of the object can be observed almost in real time during the loading process.

† The independent recording and reconstruction of all states also gives a new degree of freedom for optical shape measurement. In the case of multiwavelength contouring, each hologram can be stored and reconstructed independently with its corresponding wavelength. This results in a drastic decrease of aberrations and makes it possible to use larger wavelength differences for the generation of shorter synthetic wavelengths.

† Because all states of the inspected object can be stored and evaluated independently, only seven digital holograms are necessary to measure the 3D-displacement field and shape of an object under test: one hologram for each illumination direction before and after the loading, respectively, and one hologram with a different wavelength (or a different source-point of illumination) which can interfere with one of the other holograms to perform two-wavelength contouring (or two-source-point contouring) for shape measurement. If four illumination directions are used, nine holograms are necessary.

† The direct access to all components of the wavefield makes it possible to correct wavefront aberrations effectively by computer.

† The digital hologram containing all information of the wavefront can be transmitted via the internet to any location and can be reconstructed digitally by computer or as an analogue version by a spatial light modulator. Consequently, remote access to wavefronts that are generated at distant places is possible.

† Finally, the possibility to miniaturize complex holographic sensor setups and to use this method for remote comparative interferometry makes digital holography a versatile tool for the solution of numerous inspection and measurement problems.

See also

Diffraction: Fresnel Diffraction. Fourier Optics. Holography, Techniques: Holographic Interferometry.
Holographic Interferometry
P Rastogi, Swiss Federal Institute of Technology, Lausanne, Switzerland

© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

The development of holographic interferometry has redefined our perception of optical interferometry. The unique property of holographic interferometry to bring a wavefront, generated at some earlier time, stored in a hologram and released at a later time, to interfere with a comparison wavefront, has made possible the interferometric comparison of a rough surface, which is subject to stress, with its normal state. Assuming the surface deformations to be very small, the interference of the two speckled wavefronts forms a set of interference fringes that are indicative of the amount of displacement and deformation undergone by the diffuse object.

Holographic interferometry is an important and exciting area of research and development. It has established itself as a highly promising noninvasive technique in the field of optical metrology. Techniques within its folds for the measurement of static and dynamic displacements, topographic contours, and flow fields have been developed and demonstrated with success in a wide range of problems. Significant and extensive contributions to the development of holographic interferometry have been realized from both the theoretical and experimental perspectives. A wide range of procedures have been developed which make it not only possible to measure surface displacements and deformations of engineering structures to an accuracy of a fraction of a micrometer but also to detect material flaws and inhomogeneities having escaped the manufacturing process. The methodology of quantitative analysis of holographic interferograms has undergone extensive development during the last decade. This article provides a brief review of the holographic techniques with emphasis placed on bringing out their relevance in experimental mechanics and nondestructive testing. The series of books given in the Further Reading section at the end of this article treat the field in detail.

Basic Methods in Holographic Interferometry

The ability of holography to store a wavefront and release it for reconstruction at a later time offers access to the possibility to compare wavefronts which have existed at different times. There are several schemes that have been established to obtain the interferometric comparison of the wavefronts.

Double-Exposure Holographic Interferometry

In the frozen form of wavefront comparison, two holograms of the object are recorded on a photographic plate. The first hologram is made with the object in its initial, unstressed state and the second hologram is recorded with the object in its final, stressed state. Upon reconstruction, the hologram releases the two stored wavefronts, one corresponding to the object in its unstressed state and the other corresponding to the object in its stressed state (Figure 1). The virtual image is overlaid by a set of
bright and dark interference fringes, which are due to the displacement of the object between the two exposures.

The fringe pattern is permanently stored on the holographic plate. The interference fringes denote the loci of points that have undergone an equal change of optical path between the light source and the observer. The image intensity of the holographic interferogram is

I(x, y) = I₀(x, y){1 + V(x, y) cos(φ(x, y) − φ'(x, y))}   [1]

where I₀(x, y) is the background intensity, V(x, y) is the fringe contrast, and φ and φ' are the phases of the waves from the object in its initial and deformed states, respectively.

A bright fringe is produced whenever

φ(x, y) − φ'(x, y) = 2nπ,   n = 0, 1, 2, …   [2]

where n is the fringe number. The wave scattered by a rough surface shows rapid variations of phase, which have little correlation across the wavefront. The phase difference varies randomly over the ray directions contained within the aperture of the viewing system, which would normally prevent the observation of fringes with large-aperture optics near such points. However, a surface may exist at a distance from the object at which the value of phase difference is stationary over the cone of ray pairs defined by the viewing aperture. Interference fringes are localized in the region where the variation in phase difference is minimum over the range of viewing directions. This approach has been widely used to compute fringe localization for any object displacement and for any arbitrary illumination and observation geometry.

Real-Time Holographic Interferometry

A hologram is made of an arbitrarily shaped rough surface. After development, the hologram is placed back in exactly the same position. Upon reconstruction, the hologram produces the original wavefront. A person looking through the hologram sees a superposition of the original object and its reconstructed image (Figure 2). The object wave interferes with the reconstructed wave to produce a dark field due to destructive interference. If the object is now slightly deformed, interference fringes are produced which are related to the change in shape of the object. A dark fringe is produced whenever:

φ(x, y) − φ'(x, y) = 2nπ,   n = 0, 1, 2, …   [3]

The method is very useful for determining the direction of the object displacement, and for compensating on the interferogram the influence of the rigid-body motion of the object when subjected to a stress field. The optical setup for real-time holography uses an immersion tank, which is mounted on a universal stage and contains the photographic plate. The holographic plate is followed by a vidicon camera, which visualizes the interferograms directly onto the TV monitor screen connected to a video-recording tape to memorize the information. The use of thermoplastic plates as recording material has gained in importance in recent years for implementing real-time and double-exposure holographic interferometry setups, as it can be processed rapidly in situ using electrical and thermal processing.
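The two-beam fringe model of eqn [1] and the bright-fringe condition of eqn [2] can be sketched numerically; the uniform background and contrast values below are illustrative assumptions:

```python
import numpy as np

def interferogram(I0, V, phi, phi_prime):
    """Eqn [1]: image intensity of a holographic interferogram."""
    return I0 * (1.0 + V * np.cos(phi - phi_prime))

# Bright fringes occur at phase differences of 2*n*pi (eqn [2]);
# for unit contrast the minima lie halfway between, at odd multiples of pi.
```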
where Δψᵢ is the phase shift which is produced, for example, by shifting the phase of the reference wave. A minimum of three intensity patterns is required to calculate the phase at each point on the object surface. Known phase shifts are commonly introduced using a piezoelectric transducer, which serves to shorten or elongate the reference beam path by a fraction of a wavelength; the phase-shifter is placed in one arm of the interferometer. Numerous phase-shifting techniques have been developed which can be incorporated into holographic interferometry to generate a phase map, Δφ(x, y), corresponding to the object displacement. The calculated phase Δφ(x, y) is independent of the terms I₀(x, y) and V(x, y), and considerably reduces the dependence of the accuracy of measurements on the fringe quality.

The introduction of phase-shift interferometry not only provides accurate phase measurements but also eliminates the problem of phase-sign ambiguity of the interference fringes. Dynamic phase-shifting techniques allow for producing time sequences of deformation maps for studying objects subjected to time-varying loads. The concept is based on considering the normal phase variations caused by a dynamic phenomenon as equivalent to the phase shifting that is introduced in a holographic interferometer in order to perform the phase evaluation task. This extends the possibility of applying the method to continuous deformation measurements.

Basic Interferometers for Static Displacement Measurement

Equation [6] shows that observation along a single direction yields information only about the resolved part of the surface displacement in one particular direction. The approach to measuring three-dimensional vector displacement would be to record holograms from three different directions in order to obtain three independent resolved components of displacement. The computation of the resulting system of equations would then allow for determining the complete displacement vector.

Out-of-Plane Displacement Component Measurement

The out-of-plane component of displacement is usually measured by illuminating and viewing the object surface in a near-normal direction. The fringe equation corresponding to the line-of-sight displacement component is given by

w = nλ/2   [8]

where w is the out-of-plane component of displacement. An example of a fringe pattern depicting the out-of-plane displacements in a rectangular aluminum plate, clamped along the edges and drawn out at the center by means of a bolt, subjected to three-point loading, is shown in Figure 4.

In-Plane Displacement Component Measurement

The in-plane component of displacement can be measured by illuminating the object along two directions symmetric to the surface normal and observing along a common direction. This is schematized in Figure 5. The in-plane displacements, in a direction containing the two beams, are mapped as a moiré between the interferograms due to each illumination beam. The moiré appears as a family of fringes:

u = nλ/(2 sin θₑ)   [9]
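The article does not prescribe a particular phase-shifting algorithm; as one common choice (an assumption here, not the text's prescription), a four-step scheme with 90° shifts recovers the phase independently of background and contrast:

```python
import numpy as np

def four_step_phase(I1, I2, I3, I4):
    """Phase from four interferograms I_k = I0[1 + V cos(dphi + (k-1)*pi/2)].
    I4 - I2 = 2*I0*V*sin(dphi) and I1 - I3 = 2*I0*V*cos(dphi), so the
    result is independent of the background I0 and the contrast V, and
    arctan2 resolves the phase-sign ambiguity of the raw fringes."""
    return np.arctan2(I4 - I2, I1 - I3)
```

Three frames with known shifts would suffice, as stated above; the four-step variant is merely the most compact to write down.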
Figure 4 Example of an interferogram corresponding to the whole-field distribution of the out-of-plane component of displacement, w, on an aluminum plate subjected to three-point bending.
92 HOLOGRAPHY, TECHNIQUES / Holographic Interferometry
Figure 5 Schematic of the geometrical configuration for measuring the in-plane component of displacement, u:
Δw = nλ/2   [10]
Figure 7 Comparative holography reveals the presence of defects in aluminum plates using (a) identical and (b) unequal loadings.
Figure 8 (a) Schematic of the geometrical configuration of a vibrating cantilever beam; (b) plot of fringe function for time-average
holography.
Time-Average Holographic Interferometry

Consider that a hologram is recorded of an object which is vibrating sinusoidally in a direction normal to its surface (Figure 8a). The exposure time is supposed to be much longer than the period of vibration. The intensity distribution of the image reconstructed by this hologram is

I(x, y) = I₀(x, y) J₀²[(2π/λ) d(x, y)(cos θₑ + cos θₒ)]   [11]

where J₀ is the zero-order Bessel function of the first kind. The virtual image is modulated by the J₀²(ξ) function. The dark fringes correspond to the zeros of the function J₀²(ξ). A plot of the function, shown in Figure 8b, is characterized by a comparatively brighter zero-order fringe, which corresponds to the nodes, a decreasing intensity, and an unequal spacing between the successive zeros. The interferogram in Figure 9 shows the vibrational modes of a helicopter component. The nodes that represent zero motion are clearly seen as the brightest areas in the time-average reconstructed pattern.

Real-Time Time-Average Holographic Interferometry

The possibility of studying the response of vibrating objects in real time extends the usefulness of the technique to identify the resonance states (resonant vibration mode shapes and resonant frequencies) of the object. A single exposure is made of the object in its state of rest. The plate is processed, returned to its original condition and reconstructed. The observer looking through the hologram at the sinusoidally
Surface Contouring
A practical way to display the shape of an object is to
obtain contour maps showing the intersection of the
object with a set of equidistant planes orthogonal to
the line of sight.
Δz = λ/[2(n'₁ − n'₂)]   [15]
Figure 13 Example of a moiré topographic contour pattern
obtained using the two-beam method.
Figure 14 Examples of (a) shape measurement: the distance between two adjacent contour planes is 208 mm; and, (b) very large
out-of-plane deformation measurement on a cantilever beam. The distance between two adjacent contour planes is 19 mm.
Figure 16 Examples of (a) moiré fringe contour, and (b) phase distribution corresponding to slope change produced on a centrally
loaded aluminum plate clamped along its boundary.
Δφ = (2π/λ)[(∂u/∂x) sin θₑ + (∂w/∂x)(1 + cos θₑ)] Δx   [17]

∂w/∂x = nλ/(2Δx)   [18]
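Eqn [18] relates fringe order directly to surface slope; a tiny numeric sketch, in which the wavelength and the shear Δx are illustrative values and not taken from the text:

```python
# Eqn [18]: slope change per fringe order n, dw/dx = n * wavelength / (2 * dx)
wavelength = 633e-9   # assumed He-Ne wavelength, illustration only
dx = 5e-3             # assumed shear offset in metres, illustration only

def slope_from_fringe_order(n):
    """Surface slope dw/dx corresponding to fringe order n."""
    return n * wavelength / (2 * dx)
```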
Figure 20 Fringe contours showing (a) a specific limb movement of a chick embryo during incubation, and (b) the hatching behavior.
The point on the shell receiving blows is clearly visible.
The Principle of Sandwich Holography

In the following discussion on sandwich holography, we will only study the simplified example where the directions of illumination and observation are normal to the studied displacement and therefore α is equal to zero. The reflected light from an object point combined with the reference light produces a set of hyperbolas (Young's fringes) on the hologram plate.

When the object point is displaced between exposures on two different hologram plates, the hyperbolas will be positioned differently on the plates. There is a one-to-one relationship between the ellipsoids and the hyperbolas. Thus, one point on the hologram plate will be crossed by one hyperbola if the corresponding object point is moved so that it crosses one of the ellipses of the holodiagram. Thus, the moiré effect of the displaced hyperbolas at the hologram plate represents the moiré effect of the ellipses in the object space.

The interference pattern in the reconstructed object image can therefore be manipulated by moving one plate in relation to the other during reconstruction. We have found that the simplest way to do this is to glue the two plates together, separated by the glass thickness of one plate. To make the situation identical during exposure and reconstruction, the plates also have to be recorded in pairs. Thus, the first exposure can be made with one front plate and one back plate in contact in the hologram holder (Figure 2). The second exposure is made in exactly the same position with two new plates. After processing, the front plate of the second exposure is glued in front of the back plate of the first exposure.

Then a reconstruction beam similar to the reference beam is used to reconstruct the sandwich hologram. The interference fringes seen on the image of the object are manipulated by tilting the sandwich hologram, which results in an addition or subtraction of new fringes, as in the following examples:

1. If the object is rotated by the small angle φ₁, the number of fringes can, during reconstruction, be
HOLOGRAPHY, TECHNIQUES / Sandwich Holography and Light in Flight 101
Figure 2 The exposures and reconstructions of a sandwich hologram.

eliminated by rotating the sandwich in the same direction by the much larger angle φ2:

φ1 = (d/2Ln) φ2   [7]

2. If the sandwich during reconstruction is rotated around an axis perpendicular to that of the object rotation, new fringes are introduced at the angle γ2, which is analogous to, but much larger than, the object rotation φ1:

φ1 = (d/2Ln) tan γ2   [8]

3. When the sandwich is rotated as described in (2), fringes are formed, the radius of curvature (r2) of which is much larger than that of the bent object (r1):

r1 = (d φ2/2Ln) r2   [9]

Examples

The object consisted of three vertical steel bars (Figure 3) that were fixed by screws at their lower end to a rigid frame that surrounded all the bars. The middle bar was also supported at the upper end. Forces were applied at the middle of the bars. The right bar is deformed away from the observer, while the other two are deformed towards him. After deformation of the three objects, the second pair of plates was exposed. Then all four plates were processed.

The back plate of the first exposure (B1 in Figure 2) was repositioned in the plate holder in front of the plate of the second exposure (F2). The two plates were bonded together to form a sandwich hologram. O1 and O2 represent the two positions of one object, which between the two exposures was bent at angle φ1 toward the hologram. L is the distance between the front hologram and the object (~1 m); L is also the distance between the point of illumination and the object. D is the distance between the point source of the reference beam and the hologram plate (D = 2L); d is the thickness of one hologram plate (~1.3 mm); and n is the refractive index of the plates (~1.5).

When the two plates were properly repositioned in the plate holder, fringes were seen everywhere, both on the steel bars and on the supposedly fixed frame, which apparently had made an accidental movement (Figure 3). By tilting the plate holder in different directions, the fringes on the frame could be eliminated and the hologram could be evaluated as if the frame had remained fixed (Figure 4).
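As a quick numerical sanity check of eqn [7], the example values quoted above (d ≈ 1.3 mm, n ≈ 1.5, L ≈ 1 m) give the leverage factor 2Ln/d between the sandwich tilt and the object tilt. This is a sketch only; the variable names are mine:

```python
# Sanity check of eqn [7], phi1 = (d/2Ln) * phi2: the sandwich tilt
# phi2 that cancels an object tilt phi1 is larger by the factor 2Ln/d.
# The numeric values are the ones quoted in the text.
d = 1.3e-3  # thickness of one hologram plate (m)
n = 1.5     # refractive index of the plates
L = 1.0     # distance between front hologram and object (m)

magnification = 2 * L * n / d   # phi2 / phi1
print(round(magnification))     # on the order of 2000
```

A sub-milliradian object rotation thus corresponds to a sandwich rotation of the order of a radian, which is what makes the fringe manipulation practical.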
The plate holder was then tilted backwards at angle φ2, and horizontal fringes were found on the frame while the top of the left bar became fringe-free (Figure 5). The number of fringes on the right bar increased, proving that it had been tilted in the opposite direction. If the sandwich were instead

Figure 5 Another tilt of the sandwich hologram reveals that the left and the right bar have been deformed in opposite directions.

Introduction to Light-in-flight Recording by Holography

Recording of light pulses in flight has been possible since the beginning of the 1970s, using two-photon fluorescence or ultrafast Kerr cells driven by laser pulses. Very fast optical phenomena, such as refractive-index changes in laser-produced plasmas, have been recorded by holography using short illumination pulses.

In 1968, Staselko et al. published a method of studying the time coherence pattern of wave trains from pulsed lasers by the use of autocorrelation, in which the unknown pulse is compared to itself. The method we name 'light-in-flight recording by holography' (LIF) is based on studying both the temporal and the spatial shape of pulses by correlation, which means that we study the change in shape of a pulse before and after it has passed through the transformation we want to study. For this process, we want to start with as short a pulse (or short coherence length) as possible and then compare the transformed pulse to the original pulse.
Figure 8 Resulting reconstructed images of the light reflected by a mirror at about 45°. These four photos were taken during reconstruction while the camera behind the hologram plate was moved from left to right in between each of the exposures.
Other Uses

Like all methods of 'gated viewing', LIF can be used to look through scattering media, e.g., to look through human tissue. By observing only the first light, the influence of the later-arriving scattered light is eliminated. Thus, LIF can be used as a diagnostic method based on the idea that the scattering, absorption, or refractive index of cancer is different from that of healthy tissue. Another use of LIF is to image just a thin section of a gas or liquid filled with particles. In this way it is possible to observe the distribution and velocity in a certain localization without influence of the scattering from particles in front of or behind the studied section.

Up to now we have discussed gated viewing by LIF in the laboratory, using picosecond light pulses and distances of up to a few meters. Instead we could look at astronomical objects with light pulses of days (e.g., a supernova) and distances of many light years. Even in these cases we will see particles in space illuminated by ellipsoids, with (A) being the supernova and (B) a telescope on Earth. While each ellipsoid in the laboratory may represent a pathlength of some nanoseconds, each ellipsoid in space might represent light years.

However, the methods we have shown here are not limited to the study of the apparent shape of spheres of light. They can also be used to study other objects moving at ultrahigh velocities. If an exploding supernova throws out particles in all directions at a rather slow velocity, it can be seen surrounded by an expanding sphere. If, instead, this sphere expanded with the speed of light (which is not possible), it would appear to us just like an ellipsoid of the holodiagram. If, however, it expanded at a very high velocity, but lower than that of light, the sphere would appear to us in the shape of an egg with its pointed end towards the observer. The velocity of expansion could, in this case, appear superluminal (higher than the speed of light). If the velocity is v, its approaching velocity will, to a first approximation, appear to be the true velocity divided by (1 − v/c).

Finally, we will show a relationship between LIF and special relativity by looking at a modification of the 'Minkowski light cones'. Figure 11 shows a diagram where x and y represent space coordinates, while z represents time multiplied by c. A sphere from a short pulse of light that expands from the point A is represented by a cone with its apex at (A). A sphere from a short observation that converges to (B) is represented by a cone with its apex at (B). Only those object points that are positioned at the elliptic intersection of the two cones can be seen illuminated. This ellipse is identical to one of the ellipses of the holodiagram. The distance in time and space separating (A) and (B) could be either caused by the fixed distance between the source of laser light and the hologram plate in holographic LIF, or by a high velocity of the observer. Thus, the Lorentz contraction in special relativity could be explained as a measuring error, because the high velocity of the observer traveling from (A) to (B) causes his sphere of observation to be transformed into an ellipsoid of observation. The tilt of the elliptic intersection indicates which observations appear to be simultaneous, depending on the separation of (A) and (B) along the y-axis, both in LIF and in relativity.

Figure 11 A modified Minkowski diagram consisting of one cone of illumination (A) and one cone of observation (B). The separation in time and space between (A) and (B) could either be static, as in LIF, or caused by a relativistic velocity of the observer from (A) to (B).

See also

Holography, Techniques: Holographic Interferometry.

Further Reading

Abramson N (1971) Moiré patterns and hologram interferometry. Nature 231: 65–67.
Abramson N (1981, 1986) The Making and Evaluation of Holograms. London: Academic Press.
Abramson N (1996) Light in Flight, or, The Holodiagram – The Columbi Egg of Optics, vol. PM 27. USA: SPIE Press.
An W and Carlsson T (1999) Digital measurement of three-dimensional shapes using light-in-flight speckle interferometry. Optical Engineering 38: 1366–1370.
Duguay MA and Mattick AT (1971) Ultrahigh speed photography of picosecond light pulses and echoes. Applied Optics 10: 2162–2170.
Garrington S (1992) Invisible jet sheds new light. Nature 355: 771–772.
Giordmaine J and Rentzepis P (1967) Two-photon excitation of fluorescence by picosecond light pulses. Applied Physics Letters 11: 216.
Hinrichs H and Hinsch KD (1997) Light-in-flight holography for flow visualization and velocimetry in three-dimensional flows. Optics Letters 22: 828–830.
Malin D and Allen D (1990) Echoes of the supernova. Sky and Telescope, January: 22–25.
Powell R and Stetson K (1965) Interferometric analysis by wavefront reconstruction. Journal of the Optical Society of America 55: 1593–1598.
Spears K and Serafin J (1989) Chrono-coherent imaging for medicine. IEEE Transactions on Medical Engineering 36: 1210–1220.
Staselko DI, Denisyuk YN and Smirnow AG (1969) Holographic recording of the time-coherence pattern of a wave train from a pulsed laser source. Optics and Spectroscopy 26: 413.
I
IMAGING
Contents
2. Signal coding, to encode the acquired signal for the efficient transmission and/or storage of data, and decode the received signal for the restoration of images. The encoding may be either lossless (without loss of information) or lossy (with some loss of information), if a higher compression is required;
3. Image restoration, to produce a digital representation of the scene and transform this representation into a continuous image for the observer.

Human vision may be included in this model as a fourth stage, to characterize imaging systems in terms of the information rate that the eye of the observer conveys from the displayed image to the higher levels of the brain.

These stages are similar to each other in one respect: each contains one or more transfer functions, either continuous or discrete, followed by one or more sources of noise, also either continuous or discrete. The continuous transfer function of the image-gathering device (and of the human eye) is followed by sampling that transforms the captured radiance field into a discrete signal with analog magnitudes. In most devices the photodetection mechanism conveys this signal serially into an analog-to-digital (A/D) converter to produce a digital signal. However, analog signal processing in a parallel structure, akin to that in human vision, has many advantages that are increasingly emulated by neural networks.

This model of imaging differs from the classical model of communication that Shannon and Wiener addressed in two fundamental ways:

1. The continuous-to-discrete transformation in image gathering. Whereas communication is constrained critically only by bandwidth and noise, image gathering is constrained also by the compromise between blurring and aliasing, due to limitations in the response of optical apertures and in the sampling density of photodetection mechanisms. This additional constraint requires the rigorous treatment of insufficiently sampled signals throughout the imaging system.
2. The sequence of image gathering followed by signal coding. Whereas the characterization of communication addresses the perturbations that occur at the encoder and decoder or during transmission, the characterization of imaging systems must also address the perturbations in the image-gathering process prior to coding. These additional perturbations require a clear distinction between the information rate of the encoded signal and the associated theoretical minimum data rate or, more generally, between information and entropy.

The mathematical model of imaging that is outlined here accounts for the appropriate deterministic and stochastic (random) properties of natural scenes and for the critical limiting factors that constrain the performance of imaging systems. The corresponding formulations of the figures of merit are sufficiently general to account for a wide range of radiance field properties, photodetection mechanisms, and signal processing structures. However, the illustrations emphasize: (a) the salient properties of natural scenes that dominate the captured radiance field in the visible and near-infrared region under normal daylight conditions; and (b) the photodetector arrays that are used most frequently as the photodetection mechanism of image-gathering devices.

Radiance Field

The radiance field L(x, y; λ), reflected by natural scenes, may be modeled approximately as

L(x, y; λ) = (1/π) I(λ) r(λ) r(x, y)   [1]

where I(λ) and r(λ), respectively, represent the spectral irradiance and reflectance properties as a function of the wavelength λ, and r(x, y) represents the random reflectance properties as a function of the spatial coordinates (x, y). Figure 2 illustrates a model of r(x, y) for which the power spectral density (PSD) Φ̂r(ν, ω) corresponds closely to that typical of natural scenes, as given by

Φ̂r(ν, ω) = 2π μr² σr² / [1 + (2π μr ζ)²]^(3/2)   [2]

where ζ = (ν² + ω²)^(1/2) is the radial spatial frequency.
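To give a feel for the PSD model of eqn [2], white noise can be shaped in the frequency domain so that its spectrum follows the model. The sketch below is illustrative only; the function name and default values are my assumptions, with mu_r and sigma_r playing the roles of μr and σr in the text:

```python
import numpy as np

def natural_scene_field(size=128, mu_r=1.0, sigma_r=0.2, seed=0):
    """Random field whose PSD follows eqn [2] up to normalization:
    PSD(zeta) = 2*pi*mu_r**2*sigma_r**2 / (1 + (2*pi*mu_r*zeta)**2)**1.5,
    with zeta the radial spatial frequency in cycles/sample."""
    rng = np.random.default_rng(seed)
    freqs = np.fft.fftfreq(size)               # cycles per sample
    nu, om = np.meshgrid(freqs, freqs)
    zeta = np.hypot(nu, om)
    psd = 2 * np.pi * mu_r**2 * sigma_r**2 / (
        1 + (2 * np.pi * mu_r * zeta) ** 2) ** 1.5
    # shape white Gaussian noise by the square root of the PSD
    white = np.fft.fft2(rng.standard_normal((size, size)))
    field = np.fft.ifft2(white * np.sqrt(psd)).real
    # enforce zero mean and the prescribed reflectance variance
    field -= field.mean()
    field *= sigma_r / field.std()
    return field

r = natural_scene_field()
print(r.shape, round(float(r.std()), 3))
```

The heavy low-frequency content produced by the (3/2)-power roll-off is what gives such fields their cloud-like, natural-scene appearance.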
IMAGING / Information Theory in Imaging 109
X = 2 tan⁻¹(Xp/2ℓp) ≈ (Xp/ℓp) rad = (57.3 Xp/ℓp) deg   [3]

where δ(x, y) is the Dirac delta function, and the gain is given by

K = (1/π) Aℓ Ω ∫ I(λ) r(λ) R(λ) dλ   [6]

where Aℓ = πD²/4 is the area of the objective lens with diameter D, and R(λ) is the responsivity of the photodetectors. This gain relates the magnitude of the signal s(x, y) directly to that of the reflectance r(x, y), and, together with the variances σr² and σp² of the reflectance and the photodetector noise, respectively, it also determines the rms-signal-to-rms-noise ratio (SNR) Kσr/σp.

The DFT of s(x, y) is the periodic spectrum s̃(ν, ω) given by

s̃(ν, ω) = K r̂(ν, ω) τ̂(ν, ω) ∗ m̂ + ñp(ν, ω)   [7]

where r̂(ν, ω) is the spectrum of the reflectance, τ̂(ν, ω) is the spatial frequency response (SFR) of the image-gathering device, and ñp(ν, ω) is the spectrum of the photodetector noise. The symbol m̂ is the Fourier transform of the sampling function m. It accounts for the sidebands generated by the sampling process and is given by

m̂ ≡ m̂(ν, ω) = Σ(v,w) δ(ν − v, ω − w)   [8]

where (v, w) = (m/X, n/Y). The corresponding sampling passband is

B̂ = {(ν, ω); |ν| < 1/2X, |ω| < 1/2Y}   [9]

where

n̂(ν, ω) = n̂a(ν, ω) + ñp(ν, ω)   [11]

If, for spatial coordinates at either the scene or the photodetector array, the unit of (x, y) is meters, then the unit of (ν, ω) is cycles m⁻¹; and if, for angular coordinates centered at the objective lens, the unit of (x, y) is radians, then the unit of (ν, ω) is cycles rad⁻¹. Either way, it is convenient to normalize the sampling intervals to unity (i.e., X = Y = 1). Then the area of the sampling passband is |B̂| = 1, the unit of (x, y) is samples, and the unit of (ν, ω) is cycles sample⁻¹.

The SFR τ̂(ν, ω) of the image-gathering device is

τ̂(ν, ω) = τ̂ℓ(ν, ω) τ̂p(ν, ω)   [14]

where τ̂ℓ(ν, ω) and τ̂p(ν, ω) are the SFRs of the objective lens and photodetector apertures, respectively. This SFR may be modeled approximately by the circularly symmetric Gaussian shape

τ̂(ν, ω) = exp[−(ζ/ζc)²]   [15]

where, as shown in Figure 6, the optical-response index ζc controls the trade-off between the blurring that is caused by the decrease of the SFR inside the sampling passband and the aliasing that is caused by the extension of the SFR beyond this passband.

Signal Coding

Signal coding may be divided into two stages, as depicted in Figure 1. The first stage, which is represented by the digital operator τe(x, y) and the quantization noise nq(x, y; k), accounts for manipulations of the acquired signal that seek to extract those features of the captured radiance field considered to be perceptually most significant, which usually entails some loss of information. The second stage, which is represented by the encoder, accounts for manipulations that seek to reduce redundancies in the acquired signal, without loss of information. The resultant encoded signal se(x, y; k) may be expressed in the same form as eqn [4] for the acquired signal s(x, y).

…then the blurring and raster effects of the image-display process are suppressed entirely. This condition leads not only to the best realizable image quality but also to the simplest expression for the continuous displayed image, as given in the same form as eqns [10] and [17] by

R̂(ν, ω; k) = r̂(ν, ω) Ĝr(ν, ω; k) + n̂r(ν, ω; k)   [23]
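The blurring–aliasing trade-off governed by the optical-response index ζc in eqn [15] can be illustrated numerically. The sketch below is a deliberate simplification (it ignores the scene PSD and simply integrates the squared Gaussian SFR inside and outside the unit sampling passband); the function name and grid parameters are my assumptions:

```python
import numpy as np

def sfr_energy_split(zeta_c, half_width=4.0, npts=401):
    """Fraction of the squared Gaussian SFR of eqn [15],
    exp[-(zeta/zeta_c)**2], that lies outside the unit sampling
    passband |nu|, |omega| < 1/2 (i.e., the part subject to aliasing)."""
    f = np.linspace(-half_width, half_width, npts)
    nu, om = np.meshgrid(f, f)
    tau2 = np.exp(-2.0 * (nu**2 + om**2) / zeta_c**2)  # squared SFR
    cell = (f[1] - f[0]) ** 2
    inside = (np.abs(nu) < 0.5) & (np.abs(om) < 0.5)
    e_in = tau2[inside].sum() * cell
    e_out = tau2[~inside].sum() * cell
    return e_out / (e_in + e_out)

for zc in (0.3, 0.4, 0.5):          # the zeta_c values of Table 1
    print(zc, round(sfr_energy_split(zc), 4))
```

As ζc grows, the SFR extends further beyond the passband and the aliased fraction rises, while a small ζc confines the response to the passband at the price of in-band blurring.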
These simplifying conditions permit the Wiener filter given above and the figures of merit given below to be expressed as a function of the SNR Kσr/σp, which can be specified without accounting explicitly for either the gain K or the variances σr² and σp².

Figures of Merit

Information Rate H

The information rate H (in bits sample⁻¹) of the encoded signal se(x, y; k) is defined by

H = E[s̃e(ν, ω; k)] − E[s̃e(ν, ω; k) | r̂(ν, ω)],  (ν, ω) ∈ B̂   [30]

where the first term, E[·], represents the entropy of s̃e(ν, ω; k) and the second term, E[·|·], represents the conditional entropy of this signal when the reflectance spectrum r̂(ν, ω) within the sampling passband B̂ is known. For the assumptions that have been made about image gathering and coding, the conditional entropy becomes the entropy of the accumulated noise ñe(ν, ω; k), so that H simplifies to

H = E[s̃e(ν, ω; k)] − E[ñe(ν, ω; k)],  (ν, ω) ∈ B̂
  = (1/2) ∫∫_B̂ log[1 + Φ̂⁰r(ν, ω)|Ĝe(ν, ω)|² / Φ̂⁰nₑ(ν, ω; k)] dν dω   [31]

The theoretical upper bound of H is Shannon's channel capacity C for a bandwidth-limited system with white photodetector noise and an average power limitation (here the variance σr²).

Theoretical Minimum Data Rate E

The theoretical minimum data rate E (also in bits sample⁻¹) that is required to convey the information rate H is defined by

E = E[s̃e(ν, ω; k)] − E[s̃e(ν, ω; k) | s̃(ν, ω)]   [34]

where the second term, E[·|·], is the uncertainty of s̃e(ν, ω; k) when s̃(ν, ω) is known. For the assumptions that have been made about the coding process, the conditional entropy becomes the entropy E[ñq(ν, ω; k)] of the quantization noise, so that E simplifies to

E = E[s̃e(ν, ω; k)] − E[ñq(ν, ω; k)]
  = (1/2) ∫∫_B̂ log[1 + (Φ̂⁰r(ν, ω)|Ĝe(ν, ω)|² + Φ̂⁰nₑ(ν, ω)) / Φ̃⁰q(ν, ω; k)] dν dω   [35]

This expression for E represents the entropy of completely decorrelated data.

Information Efficiency H/E

The information efficiency of completely decorrelated data is defined by the ratio H/E. This ratio reaches its upper bound H/E = 1 when Φ̂⁰nₑ(ν, ω) ≪ Φ̂⁰r(ν, ω)|Ĝe(ν, ω)|² and Φ̂⁰nₑ(ν, ω) ≪ Φ̃⁰q(ν, ω; k), because then both eqn [31] for H and eqn [35] for E reduce to the same expression

H = E = (1/2) ∫∫_B̂ log[1 + Φ̂⁰r(ν, ω)|Ĝe(ν, ω)|² / Φ̃⁰q(ν, ω; k)] dν dω   [36]
or, equivalently, as a function of the integrand Ĥ(ν, ω; k) of the information rate H given by eqn [31] as

F = ∫∫ Φ̂⁰r(ν, ω)[1 − 2^(−2Ĥ(ν, ω; k))] dν dω   [38]

This relationship between fidelity and information rate can be extended to other goals. For example, the enhancement of spatial features (e.g., edges and boundaries), for robotic vision, can be accommodated by letting τ̃e(ν, ω) be the desired feature-enhancement filter.

Performance and Design Trade-offs

Electro-optical Design

Figure 7 presents the information rate H (for lossless encoding) as a function of the optical-response index ζc and the mean spatial detail μ. The information rate H versus ζc is given for several SNRs Kσr/σp and μ = 1, and H versus μ is given for the three informationally optimized designs specified in Table 1.

The curves show that H can reach its highest possible value only when both of the following conditions are met:

1. The relationship between the SFR τ̂(ν, ω) and the sampling passband B̂ is chosen to optimize H for the available SNR Kσr/σp. This design is said to be informationally optimized.
2. The relationship between the sampling passband B̂ and the PSD Φ̂r(ν, ω) is chosen to optimize H. For the PSDs Φ̂r(ν, ω) typical of natural scenes, the best match occurs when the sampling interval is near the mean spatial detail of the scene (i.e., when μ ≈ 1).

These conditions are consistent with those for which the information rate H reaches Shannon's channel capacity C. However, blurring and aliasing constrain H to values that are smaller than C by a factor of nearly two (i.e., H ≲ C/2). Hence, these conditions replace the role of white and band-limited signals in classical communication theory, to establish an upper bound on the information rate in imaging systems.

The above two conditions appeal intuitively when the problem of image restoration is considered:

1. The result that the SFR τ̂(ν, ω) that optimizes H depends on the available SNR is consistent with the observation that, in one extreme, when the SNR is low, substantial blurring should be avoided because the photodetector noise would constrain the enhancement of fine spatial detail even if aliasing were negligible. In the other extreme, when the SNR is high, substantial aliasing should be avoided so that this enhancement is relieved from any constraints except those that the sampling passband inevitably imposes.

Table 1 Informationally optimized designs

Design | Kσr/σp | ζc | H*
1 | 256 | 0.3 | 4.4
2 | 64 | 0.4 | 3.3
3 | 16 | 0.5 | 2.2

*μ = 1.
Figure 7 Information rate H versus the optical-response index ζc and the mean spatial detail μ (relative to the sampling interval). The designs are specified in Table 1, and the SFRs τ̂(ν, ω) for ζc are shown in Figure 6.
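An eqn [31]-style integral can be evaluated numerically for the three designs of Table 1. The sketch below rests on stated assumptions that are mine, not the article's: the eqn [2] PSD shape with μr = 1 and unit variance, the Gaussian SFR of eqn [15], white photodetector noise of variance 1/SNR², and aliasing from the four nearest sidebands folded in as extra noise. The absolute values are therefore model-dependent; only the ordering of the designs is meaningful here:

```python
import numpy as np

def info_rate(zeta_c, snr, n=129):
    f = np.linspace(-0.5, 0.5, n)          # sampling passband, X = Y = 1
    nu, om = np.meshgrid(f, f)

    def psd(u, v):                          # eqn [2] shape, mu_r = 1
        return 1.0 / (1.0 + (2.0 * np.pi) ** 2 * (u**2 + v**2)) ** 1.5

    def tau2(u, v):                         # squared SFR, eqn [15]
        return np.exp(-2.0 * (u**2 + v**2) / zeta_c**2)

    signal = psd(nu, om) * tau2(nu, om)
    # signal folded in from the first sidebands, treated as noise
    alias = sum(psd(nu - a, om - b) * tau2(nu - a, om - b)
                for a, b in ((1, 0), (-1, 0), (0, 1), (0, -1)))
    integrand = 0.5 * np.log2(1.0 + signal / (alias + 1.0 / snr**2))
    cell = (f[1] - f[0]) ** 2
    return float(integrand.sum() * cell)

for design, (snr, zc) in enumerate(((256, 0.3), (64, 0.4), (16, 0.5)), 1):
    print(design, round(info_rate(zc, snr), 2))
```

Even this crude model reproduces the qualitative trend of the H column of Table 1: the information rate decreases from design 1 to design 3 as the SNR drops.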
2. The result that H reaches its highest value when the sampling interval is near the mean spatial detail of the scene is consistent with the observation that, ordinarily, it would not be possible to restore spatial detail that is finer than the sampling interval, whereas it would be possible to restore much coarser detail from fewer samples.

Information Rate and Data Rate

Figure 8 presents an information-entropy H(E) plot that characterizes the relationship between the information rate H of the encoded signal and the associated theoretical minimum data rate, or entropy, E for h-bit quantization. The encoder SFR τ̃e(ν, ω) is assumed to be unity. The plot illustrates the trade-off between H and E in the selection of the number of quantization levels for encoding the acquired signal. It shows, in particular, that the design that realizes the highest information rate H also enables the encoder to realize the highest information efficiency H/E. This result appeals intuitively, because the perturbations due to aliasing and photodetector noise that interfere with the image restoration can also be expected to do so with the signal decorrelation.

Throughput SFR

Figure 9 illustrates the SFRs τ̂(ν, ω) and Ψ̂(ν, ω; k) of the image-gathering device and Wiener filter, respectively, and of their product, the throughput SFR Ĝr(ν, ω; k), for the three designs specified in Table 1. As the accumulated noise becomes negligible, Ĝr(ν, ω; k) approaches the ideal SFR of the classical communication channel, which is unity inside the sampling passband and zero outside.
Figure 9 SFRs of image gathering, restoration, and throughput. The designs are specified in Table 1.
spatial detail μ would usually remain uncertain because it depends not only on the properties of the scene, but also on the angular resolution and viewing distance of the image-gathering device. In practice, therefore, image restorations have to rely on estimates of the statistical properties of scenes. The tolerance of the quality of the restored image to errors in these estimates is referred to as the robustness of the image restoration.

Figure 10 presents the information rate H and the corresponding maximum-realizable fidelity F for matched and mismatched Wiener restorations. These restorations represent the digital image R(x, y; k) or, equivalently, the continuous image R(x, y; k) displayed on a noise-free medium. The matched restorations use the correct value of μ, whereas the mismatched restorations use wrong estimates of μ. These curves reveal that the informationally optimized designs produce the highest fidelity and robustness, and that both improve with increasing H. Moreover, these curves reveal that, ordinarily, one cannot go far wrong in Wiener restorations by assuming that the mean spatial detail is equal to the sampling interval (i.e., μ = 1). This observation appeals intuitively because spatial detail that is much smaller cannot be resolved, and detail that is larger is not degraded substantially by blurring and aliasing.
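A one-dimensional toy comparison of matched versus mismatched Wiener restorations illustrates this robustness numerically. The scene PSD model parameterized by a mean spatial detail mu, the Gaussian SFR, and all numeric values below are illustrative assumptions, not the article's exact formulation:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 1024
freqs = np.fft.fftfreq(N)                      # cycles per sample
tau = np.exp(-(freqs / 0.3) ** 2)              # image-gathering SFR

def scene_psd(mu):
    # exponential-autocorrelation PSD with mean spatial detail mu
    return 2.0 * mu / (1.0 + (2.0 * np.pi * mu * freqs) ** 2)

# scene realization with true mean spatial detail mu = 1 sample
scene = np.fft.ifft(np.sqrt(scene_psd(1.0))
                    * np.fft.fft(rng.standard_normal(N))).real

noise_var = 1e-4
recorded = (np.fft.ifft(tau * np.fft.fft(scene)).real
            + rng.normal(0.0, noise_var ** 0.5, N))

def wiener_restore(signal, mu_est):
    psd = scene_psd(mu_est)
    filt = tau * psd / (tau**2 * psd + noise_var)   # Wiener filter
    return np.fft.ifft(filt * np.fft.fft(signal)).real

mse = lambda r: float(np.mean((r - scene) ** 2))
matched = mse(wiener_restore(recorded, 1.0))        # correct mu
mismatched = mse(wiener_restore(recorded, 8.0))     # wrong estimate of mu
print(matched < mismatched)
```

The filter built with the correct μ yields the lower restoration error, while the badly mismatched estimate over-suppresses fine detail, mirroring the matched/mismatched comparison of Figure 10.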
Figure 11 Images restored with the Wiener filter and their three components for designs 1 (top) and 3 (bottom).
Figure 12 Images restored with the Wiener filter and their three components for designs 1 (top) and 3 (bottom).
photodetector noise. The contrast of the illustrations of the two noise components has been increased by a factor of four to improve their visibility.

The extraneous structure that aliasing produces in the images of the resolution wedges is commonly referred to as a Moiré pattern. The corresponding distortion in the images of the random polygons emerges more subtly as jagged (or staircase) edges. The images are formed from a set of 64 × 64 samples with a Z = 4 interpolation and have a 4 × 4 cm format. If these images were displayed in a smaller format, then the jagged edges could not be resolved by the observer and would appear to be the result of blurring or photodetector noise instead of aliasing. Aliasing has, therefore, often been overlooked as a significant source of image degradation.

The Wiener restoration produces images in which fine spatial detail near the limit of resolution normally has a high contrast approaching that in the scene. However, these images also tend to contain visually annoying defects, due to the accumulated noise and the ringing near sharp edges (Gibbs phenomenon). Hence, it is often desirable to combine this restoration with an enhancement filter that gives the user some control over the trade-off among fidelity, sharpness, and clarity.

Figure 13 presents images for which the Wiener restorations given in Figures 11 and 12 are enhanced for visual quality by suppressing some of the defects due to aliasing and ringing. This enhancement improves clarity at the cost of a small loss in sharpness. The images are produced again from 64 × 64 pixels, but they are displayed in three different formats. The smallest format, with a density of 64 pixels/cm, corresponds to a display of 256 × 256 pixels in a 4 × 4 cm area, which is typical of the images found in the prevalent digital image-processing literature. As can be observed, this small format leads to images that are indistinguishable from each other because they hide many of the distortions that larger formats clearly reveal.

In general, if the images are displayed in a format that is large enough to reveal the finest spatial detail near the limit of resolution of the imaging system, then distortions due to the perturbations in this system also become visible. When no effort is spared to reduce these distortions without a perceptual loss of resolution and sharpness, then it becomes strongly evident that the best visual quality with which images can be restored improves with increasing information rate, even after the fidelity has essentially reached its upper bound. This improvement continues until it is gradually ended by the unavoidable compromise among sharpness, aliasing, and ringing, as well as by the granularity of the image display.
Figure 13 Images restored with the Wiener filter and enhanced for visual quality. The large, medium, and small formats contain 16, 32 and 64 pixels/cm, respectively.

Concluding Remarks

The performance of image-gathering devices and the human eye is constrained by the same critical limiting factors. When these factors are accounted for properly, then it emerges that the design of the image-gathering device that is optimized for the highest realizable information rate corresponds closely, under appropriate conditions, to the design of the human eye that evolution has optimized for viewing the real world. This convergence of information theory and evolution toward the same design clearly supports the extension of information theory to the characterization of imaging systems.

See also

Information Processing: Optical Digital Image Processing. Modulators: Electro-optics.

Further Reading

Fales CL, Huck FO, Alter-Gartenberg R and Rahman Z (1996) Image gathering and digital restoration. Philosophical Transactions of the Royal Society of London A354: 2249–2287.
Huck FO and Fales CL (2001) Characterization of image systems. Encyclopedia of Imaging Science and Technology. New York: John Wiley & Sons.
Huck FO, Fales CL and Rahman Z (1996) An information theory of visual communication. Philosophical Transactions of the Royal Society of London A354: 2193–2248.
Huck FO, Fales CL and Rahman Z (1997) Visual Communication: An Information Theory Approach. Boston, MA: Kluwer Academic Publishers.
Lesurf JCG (1995) Information and Measurement. Bristol, UK: Institute of Physics Publishing.
O'Sullivan JA, Blahut RE and Snyder DL (1998) Information-theoretic image formation. IEEE Transactions on Information Theory 44: 2094–2123.
Shannon CE (1948) A mathematical theory of communication. Bell System Technical Journal 27: 379–423 and 28: 623–656.
Shannon CE and Weaver W (1964) The Mathematical Theory of Communication. Urbana, IL: University of Illinois Press.
Wiener N (1949) Extrapolation, Interpolation, and Smoothing of Stationary Time Series. New York: John Wiley & Sons.
118 IMAGING / Inverse Problems and Computational Imaging
and ω, the spatial frequencies associated with the space variables x), tells us how a signal of a fixed frequency is propagated through the linear system, so that the blurring can also be viewed as a sort of frequency filtering.

Under these assumptions the image g is the convolution product of object f and PSF K:

g(x) = ∫ K(x − x′) f(x′) dx′   [1]

and the solution of the direct problem is just the computation of a convolution product.

The image g, as given by eqn [1], is the so-called noise-free image. The actually recorded image gr is affected by a noise term and, in general, it can be written in the following form:

gr(x) = g(x) + n(x)   [2]

where n(x) is a function describing noise contamination. It is a random function and, therefore, it is unknown even if one understands its statistical properties. We point out that eqn [2] does not imply that the noise is additive or signal independent. For instance, in microscopy and astronomy, the contamination of the image is due both to photon noise, which satisfies Poisson statistics, and read-out noise, which is basically Gaussian and white. Therefore the noise term in eqn [2] must only be intended as the difference between the noisy and the noise-free image.

In a first attempt at approaching the inverse problem, it seems quite natural to ignore the noise term n(x) and to solve eqn [1] for f(x) with g(x) replaced by gr(x). Then, by taking the Fourier transform of both sides of this equation and using the well-known convolution theorem, one finds the following relationship:

ĝr(ω) = K̂(ω) f̂(ω)   [3]

which relates the Fourier transform of f and g to the transfer function of the imaging system.

The solution of this equation looks elementary. However, a first problem is due to the fact that an optical system is generally band-limited; its band Ω is the bounded domain of spatial frequencies where the transfer function is different from zero. Since the noise is not band-limited or, at least, has a band much broader than that of the optical system, outside Ω the right-hand side of eqn [3] is zero while the left-hand side is not; in other words, the solution of the deconvolution problem, as formulated above, does not exist because no object f satisfies eqn [3] everywhere. In this particular case, one can circumvent the problem by applying a suitable filter to gr(x), in order to suppress the out-of-band noise.

The second problem is the nonuniqueness of the solution of the problem with the filtered data. This is due to the so-called invisible objects, namely objects whose Fourier transform is zero on Ω, so that the corresponding images are exactly zero. If we find a solution of the deconvolution problem, by adding an arbitrary invisible object to this solution, we find another solution of the same problem. A well-defined solution can be obtained by requiring its Fourier transform to be zero outside Ω. In other words, we set to zero what is not transmitted by the imaging system.

The standard way for approaching the previous questions is to look for solutions in the least-squares sense, namely for objects which minimize the functional:

ε²(f) = ‖K ∗ f − gr‖²   [4]

where ∗ denotes the convolution product, as defined in eqn [1], and the right-hand side is the square of the L2-norm (which can also be interpreted as energy) of the discrepancy between the recorded image gr and the computed image K ∗ f. If one looks for a least-squares solution with minimal energy, then one automatically re-obtains the solution discussed above, namely an object whose Fourier transform in Ω is given by

f̂inv(ω) = ĝr(ω) / K̂(ω)   [5]

and is zero outside Ω. Such a solution is that provided by the so-called inverse filter.

However, there is an additional difficulty in that the transfer function usually tends to zero at the boundary of the band, while the Fourier transform of the noise, hence that of gr, does not, or tends to zero in a different way. It follows that f̂inv(ω) is divergent or very large at the boundary of the band. We conclude that its inverse Fourier transform does not exist or, if it exists, is affected by large artifacts due to noise propagation. The latter is the typical situation in the case of digital images. In Figure 1 we give the result of a numerical simulation showing the typical effect produced by the inverse filter. The object is a beautiful picture of a galaxy recorded by the Hubble Space Telescope (HST), while the PSF is a computed one describing the blurring of HST before installation of the optics correction. The blurred image is obtained by convolving the object with the PSF and by adding white noise. The restored image produced by the inverse filter is completely corrupted by noise.
120 IMAGING / Inverse Problems and Computational Imaging
Figure 1 Illustrating the effect of the inverse filter: (a) Image of the Circinus galaxy taken on 10 April 1999, with the Wide Field
Planetary Camera 2 of the Hubble Space Telescope; (b) the PSF used for blurring the RGB components of the image (here shown in
red); (c) the blurred image obtained by convolving the components of the image in (a) with the corresponding PSFs and adding noise; (d)
the restoration obtained by applying the inverse filter to the three components of (c). Part (a) courtesy of NASA/Space Telescope
Science Institute.
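The failure mode just described is easy to reproduce numerically. The following 1D sketch is our own illustrative construction (the object, PSF, and noise level are not the article's data), assuming periodic convolution computed with NumPy FFTs; it builds g_r as in eqns [1] and [2] and applies the inverse filter of eqn [5]:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative 1D stand-in for eqns [1]-[5]: object f, unit-sum Gaussian PSF K,
# recorded image g_r = K * f + n (all names here are ours).
n = 256
x = np.arange(n)
f = np.zeros(n)
f[100:110] = 1.0                                  # bright feature on a dark background
K = np.exp(-0.5 * ((x - n / 2) / 4.0) ** 2)
K /= K.sum()                                      # PSF normalized to unit sum
K = np.roll(K, -n // 2)                           # center the PSF for circular convolution

K_hat = np.fft.fft(K)
g = np.real(np.fft.ifft(K_hat * np.fft.fft(f)))   # noise-free image, eqn [1]
g_r = g + 0.01 * rng.standard_normal(n)           # recorded image, eqn [2]

# Inverse filter, eqn [5]: divide by the transfer function inside the band only.
band = np.abs(K_hat) > 1e-12                      # crude stand-in for the band V
f_inv_hat = np.where(band, np.fft.fft(g_r) / np.where(band, K_hat, 1.0), 0.0)
f_inv = np.real(np.fft.ifft(f_inv_hat))

# The restoration error dwarfs the data error: the noise is amplified by 1/K_hat
# precisely where the transfer function is small.
print(np.linalg.norm(f_inv - f), np.linalg.norm(g_r - g))
```

Because 1/K̂(ν) is enormous near the edge of the band, the tiny noise term dominates the restoration, which is exactly the behavior shown in Figure 1d.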
The inverse filter provides a solution which fits the data in the best possible way and therefore also fits the noise in the frequency domains where the signal is weakly transmitted by the imaging system. This can be achieved at the cost of an energy (L2-norm) of the solution which is too large. Therefore, such a conclusion suggests that one must search for a reasonable compromise between data fitting and energy of the solution. This is the basic idea of regularization theory, at least in its most simple form. The mathematical formulation is obtained by looking for objects which minimize the functional:

F_μ(f) = ‖K ∗ f − g_r‖² + μ‖f‖²   [6]

where μ, the so-called regularization parameter, is a parameter controlling the trade-off between data fitting and energy of the solution: when μ is small, the data fitting is good and the energy is large while, when μ is large, the data fitting is poor and the energy is small.
For a given μ, the function f_μ which minimizes the functional of eqn [6] is called the regularized solution of the problem. It is easy to show that its Fourier transform is given by

f̂_μ(ν) = K̂*(ν) ĝ_r(ν) / (|K̂(ν)|² + μ)   [7]

The regularized solutions form a one-parameter family of functions. When μ is small these functions are strongly contaminated by noise, whose effect is gradually reduced when μ increases. For too large values of μ, one re-obtains blurred versions of the original object.
Such behavior is illustrated by the sequence of regularized solutions given in Figure 2, corresponding to the example of Figure 1. The regularization parameter increases starting from left to right and from top to bottom. Its values are μ = 0 (inverse filter), 0.001, 0.01, 0.05, 0.1, and 0.5 (the noise is of the order of a few percent and the PSF is normalized in such a way that the sum of all its values is 1). The sequence makes evident the well-established theoretical result of the existence of an optimal value of the regularization parameter. Such an optimal value can
Figure 2 Illustrating the effect of the regularization parameter. The blurred image is that shown in Figure 1c. The sequence is obtained by increasing the value of the regularization parameter: (a) μ = 0 (inverse filter); (b) μ = 0.001; (c) μ = 0.01; (d) μ = 0.05; (e) μ = 0.1; (f) μ = 0.5. It is evident that the best restoration is that shown in (c).
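Both the regularized solution of eqn [7] and the discrepancy principle for selecting μ can be sketched on a toy 1D problem. Everything below is our own illustrative construction (object, PSF, noise level), assuming periodic FFT-based convolution:

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative 1D setup: object, unit-sum Gaussian PSF, noisy blurred data.
n = 256
x = np.arange(n)
f = np.zeros(n)
f[100:110] = 1.0
K = np.exp(-0.5 * ((x - n / 2) / 4.0) ** 2)
K /= K.sum()
K = np.roll(K, -n // 2)
K_hat = np.fft.fft(K)
sigma = 0.01
g_r = np.real(np.fft.ifft(K_hat * np.fft.fft(f))) + sigma * rng.standard_normal(n)
g_r_hat = np.fft.fft(g_r)

def regularized(mu):
    # eqn [7]: conj(K_hat) * g_r_hat / (|K_hat|^2 + mu)
    return np.real(np.fft.ifft(np.conj(K_hat) * g_r_hat / (np.abs(K_hat) ** 2 + mu)))

# Restoration error for small, intermediate and large mu: the intermediate value
# wins, mirroring the optimal-parameter behavior seen in Figure 2.
errors = {mu: np.linalg.norm(regularized(mu) - f) for mu in (1e-12, 1e-3, 0.5)}

# Discrepancy principle: choose mu so that the residual energy ||K * f_mu - g_r||^2
# matches the expected noise energy n * sigma^2. The residual grows with mu, so a
# simple bisection (here on a log scale) suffices.
def discrepancy(mu):
    residual = np.real(np.fft.ifft(K_hat * np.fft.fft(regularized(mu)))) - g_r
    return np.sum(residual ** 2)

lo, hi = 1e-12, 0.5
for _ in range(60):
    mid = np.sqrt(lo * hi)
    lo, hi = (mid, hi) if discrepancy(mid) < n * sigma ** 2 else (lo, mid)
mu_star = np.sqrt(lo * hi)
print(errors, mu_star)
```

On simulations the optimal μ can be read off the error curve; the discrepancy principle provides the data-driven surrogate described in the text, fitting the data to an accuracy comparable with their uncertainty.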
be determined in the case of numerical simulations such as that represented in Figure 2. However, its estimation in the case of real images is more difficult. Several methods for the selection of μ have been proposed. We only mention the so-called discrepancy principle. It consists of determining the value of μ such that the value of the discrepancy functional at f_μ, ε²(f_μ), coincides with an estimate of the energy of the noise. In other words, the data are fitted with an accuracy comparable with their uncertainty.
As shown by eqn [7], the regularized solution has the same band V as the imaging system, because K̂*(ν) is zero outside V. As is known, a band-limited function can be represented in terms of its samples taken at a rate which is roughly proportional to the size of the band. This is the content of the famous Shannon sampling theorem (more precisely, its 2D extension). On the other hand, the sampling distance can be taken as a measure of the resolution provided by the restored image, so that one concludes that the
restoration method discussed above does not produce an improvement of the resolution of the imaging system. It certainly produces an improvement of the quality of the image which makes possible a visual verification of that resolution.
In many circumstances, the term super-resolution is used for describing methods which allow an improvement of resolution beyond the limit provided by the sampling theorem. A super-resolving method must therefore produce a restored image with a band broader than that of the imaging system, so that super-resolution is related to the problem of out-of-band extrapolation.
Such an approach was already proposed in the 1960s, when it was observed that the Fourier transform of an object with a finite spatial extent (all objects have this property) is analytic; then the theorem of unique analytic continuation implies that the Fourier transform of the object can be determined everywhere from its values on the band of the imaging system.
Unfortunately this problem is ill-posed, so that out-of-band extrapolation is not feasible in practice. Only recently has it been recognized that a considerable amount of super-resolution is possible when the spatial extent of the object is not much greater than the resolution distance of the instrument. For example, super-resolving methods could be used for detecting unresolved binary stars in astronomical images.
The important point is that a super-resolving method must implement explicitly the finite extent of the object: the domain of the object must be estimated and the method must search for a solution which is zero outside this domain. This introduces an important question in object restoration and, more generally, in the theory of inverse problems, namely the design of regularized solutions satisfying additional conditions (constraints). This question is also important from the theoretical point of view: the problem is ill-posed because of insufficient information on the object as transmitted by the imaging system; the use of additional constraints reduces the class of the solutions which are compatible with the data and therefore can improve the restoration. This is the use of a priori information in the solution of inverse problems.
A constraint which has been widely investigated and used is the positivity of the solution, i.e., all the values of the restored image must be positive or zero. The physical meaning of the constraint is obvious. Its beneficial effect is to reduce ringing artifacts which affect the regularized solutions previously discussed, in cases where the object contains bright spots over a black background or sharp intensity variations from zero to some positive value. The ringing is essentially related to the well-known Gibbs effect in the truncation of the Fourier series.
The requirement of positivity is the requirement of a particular lower bound on the values of the solution. Therefore, one can consider more general constraints in requiring a given lower and/or upper bound on the values of the solution, eventually combined with a constraint on the domain to produce, for instance, a super-resolved positive solution. The most general form of these constraints requires that the solution belongs to a given closed and convex set C in some functional space. A general method producing regularized solutions in a convex set is an iterative method which is essentially a gradient method for the minimization of the least-squares functional, with a projection on the set C at each iteration. In the case of object deconvolution, the least-squares functional is that given in eqn [4] and the method has the following form: if f_k is the result of the k-th iteration, then f_{k+1} is given by

f_{k+1} = P_C[f_k + τ K^T ∗ (g_r − K ∗ f_k)]   [8]

where P_C is the convex projection onto the set C, τ is a relaxation parameter and K^T(x) = K(−x). In general, the algorithm, known as the projected Landweber method, is initialized with f_0 = 0. An important property is that it has a regularization effect, in the sense that iterations must not be pushed to convergence but stopped after a suitable number of iterations to avoid strong noise contamination. In Figure 3 we compare, in a specific example, the result obtained by means of the linear regularization method and that obtained by means of the projected iterative method with a lower and upper bound on the solution. The reduction of ringing is evident.
Among the methods producing positive solutions we must mention the most popular one in astronomy, the so-called Richardson–Lucy method, which is an iterative method for Maximum-Likelihood estimation. If the PSF has been normalized in such a way that the sum of all its values is one, it takes the following form:

f_{k+1} = f_k { K^T ∗ [g_r / (K ∗ f_k)] }   [9]

The method is normally initialized with a uniform image and it must not be pushed to convergence, to avoid noise amplification. Another algorithm suggested by statistical arguments and producing positive solutions is the so-called Maximum Entropy Method (MEM).
The image restoration methods described above have wide applications both in microscopy and
Figure 3 Illustrating the effect of constraints on image restoration: (a) Image of the Earth taken during the Apollo 11 mission; (b)
blurred image of (a) obtained by assuming out-of-focus blur and adding noise; (c) optimal restoration provided by the regularization
method; the ringing around the boundary of the Earth is evident; (d) optimal restoration obtained by means of the projected Landweber
method with a lower and an upper bound on the solution at each iteration. Part (a) courtesy of NASA.
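Both iterative schemes, the projected Landweber method of eqn [8] (here with the convex set C = {f ≥ 0}, so that P_C is a simple clipping) and the Richardson–Lucy method of eqn [9], can be sketched on a toy 1D problem. Everything below is our own illustrative construction (object, PSF, noise level, iteration counts), assuming periodic FFT-based convolutions and an object with a small positive background so that the divisions are well behaved:

```python
import numpy as np

rng = np.random.default_rng(2)

n = 256
x = np.arange(n)
f = np.full(n, 0.1)                    # small positive background
f[100:110] = 1.0                       # bright feature
K = np.exp(-0.5 * ((x - n / 2) / 4.0) ** 2)
K /= K.sum()                           # PSF normalized to unit sum
K = np.roll(K, -n // 2)

conv = lambda a, b: np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))
KT = np.roll(K[::-1], 1)               # K^T(x) = K(-x) on the periodic grid
g_r = conv(K, f) + 0.005 * rng.standard_normal(n)   # recorded image (stays positive here)

# Projected Landweber, eqn [8]: gradient step plus projection onto C = {f >= 0}.
tau = 1.0                              # relaxation parameter; safe since max|K_hat| = 1
f_lw = np.zeros(n)
for _ in range(200):                   # stopped early rather than run to convergence
    f_lw = np.clip(f_lw + tau * conv(KT, g_r - conv(K, f_lw)), 0.0, None)

# Richardson-Lucy, eqn [9], initialized with a uniform image.
f_rl = np.full(n, g_r.mean())
for _ in range(200):
    f_rl = f_rl * conv(KT, g_r / conv(K, f_rl))

print(np.linalg.norm(f_lw - f), np.linalg.norm(f_rl - f), np.linalg.norm(g_r - f))
```

Both restorations respect the positivity of the object; as the text notes, the number of iterations acts as the regularization parameter, so neither loop should be run to convergence on noisy data.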
while the variable s takes values between −a and a, if a is the radius of a disc containing the phantom. As is evident, the sinogram is symmetric with respect to the point (0, π). Each point in the rectangle corresponds to a straight line crossing the phantom, and the straight lines through a fixed point of the phantom describe a sinusoidal curve in the (s, φ)-plane. This property has given rise to the name sinogram.
The sinogram is, in a sense, the image of the body slice under investigation, as provided directly by a CT-scanner. However, an inspection of this image does not easily provide information about the body. Therefore, it must be processed to obtain a more significant image. Radon transform inversion is an example of an inverse and ill-posed problem and its solution must be treated with extra care.
The crucial point in Radon transform inversion is the Fourier slice theorem, whose content is the following: the Fourier transform of the projection in the direction θ of the function f is the Fourier transform of f on the straight line passing through the origin and having direction θ:

P̂_θ(ν) = f̂(νθ)   [11]

This theorem clarifies the information content of CT data: each projection provides the Fourier transform of the function f along a well-defined straight line in the plane of the spatial frequencies. It is also the starting point for deriving the basic algorithm of data inversion in tomography, the filtered back-projection (FBP). It is obtained from Fourier transform inversion in polar coordinates and is a two-step algorithm: the first step is a filtering of the projections by means of a ramp-filter; the second is the back-projection of the filtered sinogram. More precisely, the two steps are as follows:

• Filtering: for each θ the projection P_θ(s) is replaced by a filtered projection given by

Q_θ(s) = ∫_{−∞}^{+∞} |ν| P̂_θ(ν) e^{iνs} dν   [12]
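A minimal numerical sketch of the two FBP steps follows. It is our own illustrative construction (the phantom, grid sizes, and nearest-neighbor back-projection are simplifying assumptions, and the overall normalization is left arbitrary). The phantom is a centered disc of radius R, whose projection at offset s is the chord length 2√(R² − s²), the same for every angle θ:

```python
import numpy as np

n_s, n_theta = 129, 180
a = 1.0                                          # radius of the disc containing the phantom
s = np.linspace(-a, a, n_s)
ds = s[1] - s[0]
thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)

R = 0.4
P = np.where(np.abs(s) < R, 2.0 * np.sqrt(np.maximum(R ** 2 - s ** 2, 0.0)), 0.0)
proj = np.tile(P, (n_theta, 1))                  # identical projection for every angle

# Step 1 (eqn [12]): ramp-filter each projection in the Fourier domain.
freqs = np.fft.fftfreq(n_s, d=ds)
Q = np.real(np.fft.ifft(np.fft.fft(proj, axis=1) * np.abs(freqs), axis=1))

# Step 2: back-project the filtered sinogram over all angles.
grid = np.linspace(-a, a, n_s)
X, Y = np.meshgrid(grid, grid)
recon = np.zeros_like(X)
for theta, q in zip(thetas, Q):
    t = X * np.cos(theta) + Y * np.sin(theta)    # offset s of the ray through each pixel
    idx = np.clip(np.round((t + a) / ds).astype(int), 0, n_s - 1)
    recon += q[idx]
recon *= np.pi / n_theta                         # quadrature weight for the angle integral

print(recon[n_s // 2, n_s // 2])                 # strongly positive inside the disc
```

This mirrors Figure 8: back-projection alone of unfiltered projections only produces a blurred maximum over the disc, while the ramp filter of eqn [12] yields a much sharper reconstruction.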
Figure 7 (a) Picture of a brain slice; (b) the corresponding sinogram. As explained in the text, the sinogram is a gray-level representation of the Radon transform in the (s, φ) plane. Each horizontal line in (b) corresponds to a projection of (a).
Figure 8 (a) The effect of the projection operation is shown by drawing the projections of a circular phantom corresponding to three
directions. (b) shows the result obtained by applying the back-projection operation to the three projections given in (a). It is evident that
the picture provided by this operation has a maximum on the domain of the circular phantom.
Two different modalities of ECT are usually considered:

• single photon emission computerized tomography (SPECT), which makes use of radio-isotopes where a single γ-ray is emitted per nuclear disintegration;
• positron emission tomography (PET), which makes use of β⁺-emitters, where the final result of a nuclear disintegration is a pair of γ-rays, propagating in opposite directions, produced by the annihilation of the emitted positron in the tissues.

problems are basically nonlinear, so that the computational cost of the methods for their solution is, in general, too high for clinical applications. However, the research in these areas is very active and, therefore, the situation could be improved in the near future.

See also
Tomography: Tomography and Optical Imaging.
Adaptive Optics
C Pernechele, Astronomical Observatory of Padova, Padova, Italy
© 2005, Elsevier Ltd. All Rights Reserved.

Introduction
The image quality of a perfect optical system is ideally limited by its diffraction figure. In the real world there are many effects which degrade that physical limit, resulting in poorer image quality at the focal plane. These effects may be static (i.e., time-independent, as, for example, an error in the alignment of the optical train) or dynamic (i.e., variable with time, such as optical misalignment introduced by thermo-mechanical distortion). Active and adaptive optics, the topics of this article, are technologies able to correct for the optical aberrations induced by dynamical effects.
128 IMAGING / Adaptive Optics
There are a variety of causes responsible for dynamical image degradation, among them time-variable optical misalignments, pointing and tracking errors (for motorized instruments), and atmospheric perturbations. In practice, both atmospheric turbulence and misaligned instruments introduce random spatial and temporal perturbations on the optical beam, distorting the spherical (ideal) wavefront incoming onto the focal plane. This leads to loss of both the diffraction-limited condition and image resolution. A typical situation for astronomical applications is shown in Figure 1. A target (a star in this case) emits light whose wavefront, being far from the observing system, can be considered plane. When this wavefront passes through the atmosphere it is distorted with a temporal frequency of the order of hundreds of Hertz. This dynamically distorted wavefront fills the aperture of the optical system and will once more be distorted by the dynamic misalignment of the optical system itself. The final wavefront will be deformed with respect to the ideal (spherical) wavefront.
Dynamic optical misalignment may depend on gravitational and thermal deformation, while pointing error may appear as very slow (drift), slow (wander), or fast (jitter) tilt of the image on the focal plane. Atmospheric perturbations are produced by air turbulence and may be natural (as in the astronomical or satellite tracking cases) or induced by the experiment itself (thermal blooming in high power laser applications). All these effects appear on very different time-scales, as shown in Figure 2. Notably, atmospheric turbulence introduces aberrations of all orders, at frequencies above about 1 Hz. Pointing and tracking errors introduce an image tilt with different time-scales. Gravitational effects induce mechanical deformation of the optics, mainly in astronomical applications because of the massive components (the telescope primary mirror in particular). Thermal deformation produces optical misalignment among optical components, mainly resulting in defocus. Figure 2 is purely illustrative, because some effects may produce wavefront aberrations at different frequencies than indicated there. For example, drift may be faster than indicated in the figure, and thermal blooming (a thermal effect) introduces aberrations at a higher frequency, and so on.
The correction of the wavefront errors is usually
made by means of devices able to change the wave phase somewhere in the light path. Though the
concept for correcting aberrated wavefront at slow
or high frequency is the same, for example, the use of
one or more deformable elements in the optical path,
the technologies developed for each area are quite
different: variations at slow frequency (approximately below 0.1 Hz in Figure 2) are indicated as quasi-static errors and the techniques developed to overcome them are properly called active optics, while adaptive optics indicates the techniques to overcome the unwanted effects at high frequency (above about 1 Hz). The frequency value dividing active from adaptive optics is not well defined and depends on the field of application and its scientific and engineering communities. The important point is that a correct use of a combination of the two techniques may ultimately lead to bringing the imaging system close to its diffraction limit.

Slow Corrections: Active Optics
The analysis of the wavefront is required before its correction, together with one or more deformable/movable mirrors and their control system to operate the corrections. A typical system for low frequency correction is shown in Figure 3. A telescope contains three mirrors (M1, M2, and M3) and an instrument to detect its scientific target, usually placed on its optical axis. The dynamic deformations present in this kind of instrument are mainly third-order spherical (thermal deformation of M1), third-order astigmatism (a saddle-horse deformation of M1 due to gravity when the telescope is pointing away from the zenith), and third-order coma (due to gravitational decentering between M1 and M2). Since a telescope has a tracking system, some pointing error may be present, causing motion of the image in the focal plane. The observation of an off-axis target (typically a star for astronomical applications) is made to measure the deformations of the incoming wavefront by means of a wavefront sensor (WS) system. This step is completed using some of the wavefront sensing techniques described elsewhere in this encyclopedia. The WS must measure the information on the wavefront distortion with adequate spatial resolution to compensate for aberrations across the full aperture of the system. Once the wavefront distortion has been measured by the WS, the needed corrections are sent to the control system, which applies the appropriate mirror deformations/movements. In the example of Figure 3 only the primary mirror (entrance pupil) is deformable and it is used for correcting spherical and astigmatism aberrations. M2 is used to compensate the decentering coma by means of lateral displacement. M3 may be tilted to correct for low-frequency pointing errors. In astronomical applications active optics is essential for telescopes with primary-mirror diameters larger than approximately four meters, which cannot be manufactured rigidly at reasonable cost. Active optics requires a thin deformable or a segmented primary mirror maintained in the correct shape at every elevation. Using this method, low-frequency correction has been achieved to date in telescopes with primary mirrors of the order of ten meters.
distance over which the wavefront phase is highly correlated (in this way it provides a measure of the turbulence scale size). For an undistorted plane wave, r0 is infinity, while for atmospheric distortion it typically varies between 5 cm (bad sites) and 20 cm (excellent sites). We depict Fried's parameter schematically in Figure 5, as bubbles of dimension r0 in the turbulent layer (referred to as Fried's cells). r0 is a statistical parameter, which depends on time and wavelength (r0 ∝ λ^{6/5}). Its variability with time may be considerable, sometimes changing by a factor of 2 within a few seconds. A small telescope, with a diameter less than r0, will always operate at its diffraction limit. When r0 is smaller than the telescope aperture, the angular size of the image is different if we consider long- or short-time exposures. If the exposure time is very short (less than hundredths of a second) it is possible to image a structure in the focal plane (see the short exposure image in Figure 5). This structure, which varies with time, is due to high-order wavefront distortions that occur within the aperture. The short exposure image is broken into a large number of speckles of diameter 2.44 λ/D, which are randomly distributed over a circular region of approximately 2.44 λ/r0 in angular diameter. A lower-order wavefront distortion is correlated with the size of the largest disturbances produced by atmospheric turbulence (the so-called outer scale), which varies considerably (from 1 m to over 1 km). Such disturbances produce a time-dependent tilt of the overall wavefront that causes fluctuations in the angle of arrival of the light from a star and randomly displaces the image. For D of the order of the Fried parameter r0, image motion dominates, and a simple tip-tilt mirror (not a deformable one) provides useful correction. For D greater than r0, speckles dominate and a deformable mirror is necessary to account for the wavefront distortion. For long exposures the dimension of the image is determined by the ratio λ/r0 rather than λ/D. The angular resolution is diminished due to the atmospheric turbulence (see Figure 5). For a 4 m telescope, atmospheric distortion degrades the spatial resolution by more than one order of magnitude compared to the diffraction limit, and the intensity at the center of the star image is lowered by a factor of 100 or more. Even at the best astronomical sites, ground-based telescopes observing at visible wavelengths cannot, because of atmospheric turbulence alone, achieve an angular resolution better than telescopes of about 20 cm diameter.
Most of the distortions imposed by the atmosphere are in the wave-phase, and an initially plane wavefront traveling 20 km through the turbulent atmosphere accumulates, across the diameter of a large telescope, phase errors of a few micrometers.
Turbulence is distributed along the propagation path through the atmosphere, and then the wavefront errors become uncorrelated as the field angle increases. This effect is known as angular isoplanatism and is one of the major limitations to astronomical adaptive optics. The structure becomes completely uncorrelated out of the isoplanatic angle, approximately equal to r0/H, where H is the altitude of the turbulence layer. This decorrelation is sketched in Figure 5, where the short exposure images for objects 1 and 2 have different speckle structures.
In Earth's atmosphere significant turbulence occurs at altitudes up to 20 km, often with large peaks in the vicinity of the tropopause at around 10 km. There exist, in practice, more than one parameter H in Figure 5, one for each turbulence layer. This three-dimensional distribution of the turbulence considerably restricts the angle over which adaptive compensation is effective and leads to the design of multiconjugate adaptive optics, as we next explain.

Figure 6 A sketch of a deformable mirror.

Adaptive System Design
As stated above, an active system is composed of at least one wavefront sensor and one deformable mirror. Typically, the wavefront is corrected by making independent tip, tilt, and piston movements of the deformable mirror surface, as shown in Figure 6. As a rule of thumb, a deformable mirror used for atmospheric correction should have one actuator for each Fried's cell, such that the total number of actuators should be equal to the number of Fried's cells filling the telescope aperture, i.e., D²/r0². For instance, a near-perfect correction for an observation in visible light (0.6 μm) with an 8 m telescope placed in a good astronomical site (r0 = 20 cm) would require a deformable mirror with approximately 6400 actuators in its adaptive optics system. Recalling that r0 is strongly dependent on the wavelength (i.e., proportional to λ^{6/5}), similar performance in the infrared at 2 μm requires only 250 actuators. This is one reason why adaptive correction is easier in the infrared. A large number
of Fried's cells requires a similarly large number of subapertures in the wavefront sensor. Commonly used WSs are based on modifications of the classic Hartmann screen test or of curvature type. In the former case the adaptive device is made with individual piezoelectric actuators (see Figure 6), while in the latter the correction is achieved with a bimorph adaptive mirror, made of two bonded piezoelectric plates. With both methods, wavefront sensing is done on a reference off-axis object (Figure 3), or even on the observed object itself (Figure 4) if it is bright enough. In the former case the angular distance of the reference object must fall inside the isoplanatic angle and it must be sufficiently luminous. In typical astronomical sites the isoplanatic angle at a wavelength of 0.6 μm is ~5 arcsec, while it can be ~20 arcsec at 2 μm. This is an additional reason why adaptive optics systems are most efficient in the infrared range. Correction in the visible requires a reference star approximately 25 times brighter than in the infrared. Most current astronomical systems are designed to provide diffraction-limited images in the near-infrared (1 to 2 μm) with the capability for partial correction in the visible. However, some military systems for satellite observations do provide full correction in the visible on at least 1 m class telescopes.
The control system is generally a specialized computer that calculates, from the WS measurements, the commands to send to the actuators of the deformable mirror. The calculation must be performed quickly (within 0.5 to 1 ms), otherwise the state of the atmosphere may have changed, making the wavefront correction inaccurate. The required computing power can exceed several hundred million operations for each command set sent to a 250-actuator deformable mirror. As in active optics systems, zonal or modal control methods are used. In zonal control, each zone or segment of the mirror is controlled independently by wavefront signals that are measured for the subaperture of that zone. In modal control, the wavefront is expressed as the best-fit linear combination of polynomial modes of the wavefront distorted by the atmospheric perturbations.

Future Development
As explained above, a convenient method for measuring the wavefront distortion is to look for an off-axis reference star (also known as a natural guide star, NGS) within the isoplanatic angle. However, it is generally impossible to find a bright reference star close to an arbitrary astronomical target. Conditions are far better in the infrared than in the visible because atmospheric turbulence (and in particular its high spatial frequencies) is, for a given AO correction, less pronounced at longer wavelengths. The spatial and temporal sampling of the disturbed wavefront can be reduced, which in turn can permit the use of fainter reference stars. Coupled with the larger isoplanatic angle in the infrared, this gives a much better outlook for AO correction than in the visible. Nevertheless, even for observations at 2.2 μm, the sky coverage achievable by this technique (equal to the probability of finding a suitable reference star in the isoplanatic patch around the chosen target) is of the order of 0.5 to 1%. It is therefore unsurprising that most scientific applications of AO have to date been limited to the study of objects which provide their own reference object. The most promising way to overcome the isoplanatic angle limitation is by the use of artificial reference stars, also referred to as laser guide stars (LGS). These are patches of light created by the back-scattering of pulsed laser light off sodium atoms in the high mesosphere or by molecules and particles located in the low stratosphere. The laser beam is focused at an altitude of about 90 km in the first case (sodium resonance) and 10 to 20 km in the second case (Rayleigh diffusion). Such an artificial reference star can be created as close to the astronomical target as desired, and a wavefront sensor measuring the scattered laser light is used to correct the wavefront aberrations on the target object.
Military laboratories have reported the successful operation of adaptive optics devices at visible wavelengths using laser guide stars on both a 60 cm telescope and a 1.5 meter telescope, achieving an image resolution of 0.15 arcsec. Nevertheless, there are still physical limitations of an LGS. The first problem, focus anisoplanatism, or the cone effect, was realized long ago. Since the artificial star is created at a relatively low altitude, back-scattered light collected by the telescope forms a conical beam, which does not cross the same turbulence-layer areas as light from the astronomical target. This leads to a phase estimation error, which in principle can be solved by the simultaneous use of several laser guide stars surrounding the targeted object. A more significant problem is image motion or tilt determination. Since the paths of the outgoing and returning light rays are the same, the centroid of the artificial light appears to be stationary in the sky, while the apparent position of an astronomical source suffers lateral motions (also known as tip/tilt). The simplest solution is to supplement the AO system and LGS with a tip/tilt corrector set on a (generally) faint close NGS. Performance is then limited by poor photon statistics in the tip/tilt correction.
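The rules of thumb quoted earlier (actuator count from Fried's cells, the λ^{6/5} growth of r0, and the isoplanatic angle r0/H) can be checked with a few lines. The telescope diameter and r0 below are the example values from the text; the assumption of an actuator pitch of r0/2, which reconciles D²/r0² = 1600 with the quoted figure of 6400, is ours, and all results should be read as order-of-magnitude estimates:

```python
# Numerical check of the adaptive-optics rules of thumb (illustrative only).
D = 8.0                       # telescope diameter (m)
r0_vis = 0.20                 # Fried parameter at 0.6 um, excellent site (m)
arcsec = 180.0 / 3.141592653589793 * 3600.0   # radians -> arcseconds

cells_vis = (D / r0_vis) ** 2                 # Fried cells in the aperture: ~1600
actuators_vis = (2 * D / r0_vis) ** 2         # with an r0/2 actuator pitch: ~6400

# r0 scales as lambda^(6/5); evaluate it in the infrared at 2 um.
r0_ir = r0_vis * (2.0 / 0.6) ** 1.2           # roughly 0.85 m
actuators_ir = (2 * D / r0_ir) ** 2           # a few hundred, versus thousands in the visible

# Isoplanatic angle ~ r0 / H for a turbulent layer at altitude H = 10 km.
H = 10e3
iso_vis = r0_vis / H * arcsec                 # ~4 arcsec, cf. the ~5 arcsec quoted
iso_ir = r0_ir / H * arcsec                   # ~18 arcsec, cf. the ~20 arcsec quoted

print(cells_vis, actuators_vis, round(r0_ir, 3), round(actuators_ir), round(iso_vis, 1), round(iso_ir, 1))
```

The infrared count comes out at a few hundred actuators and the isoplanatic angles at roughly 4 and 18 arcsec, in line with the ~250 actuators and ~5/~20 arcsec figures quoted in the article.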
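The modal control strategy described above, in which the measured wavefront is expressed as a best-fit linear combination of polynomial modes, reduces to a linear least-squares problem. The sketch below is hypothetical (the basis, sample counts, and coefficient values are all invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)

# Wavefront-sensor sample positions in the unit pupil (illustrative).
n_pts = 500
xp = rng.uniform(-1.0, 1.0, n_pts)
yp = rng.uniform(-1.0, 1.0, n_pts)

# A small polynomial basis: piston, tip, tilt, a defocus-like term and two
# astigmatism-like terms (low-order stand-ins for Zernike modes).
modes = np.column_stack([
    np.ones(n_pts), xp, yp,
    2.0 * (xp ** 2 + yp ** 2) - 1.0,
    xp ** 2 - yp ** 2,
    xp * yp,
])

true_coeffs = np.array([0.0, 0.3, -0.2, 0.15, 0.05, -0.1])
wavefront = modes @ true_coeffs + 0.01 * rng.standard_normal(n_pts)  # noisy WS data

# Modal control: least-squares fit; the fitted modes become the mirror commands.
coeffs, *_ = np.linalg.lstsq(modes, wavefront, rcond=None)
residual = wavefront - modes @ coeffs
print(coeffs.round(2), float(residual.std()))
```

Zonal control would instead drive each actuator from its own subaperture's signal; the modal version concentrates the fit into a few smooth, well-sensed shapes.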
In the field of telecommunications, studies are in progress for using adaptive techniques in data transmission between ground-based stations and satellites.

See also
Environmental Measurements: Doppler Lidar; Optical Transmission and Scatter of the Atmosphere. Instrumentation: Astronomical Instrumentation; Telescopes. Phase Control: Wavefront Coding; Wavefront Sensors and Control (Imaging Through Turbulence).

Further Reading
Beckers JM (1993) Adaptive optics for astronomy: principles, performance and application. ARAA 31: 13–62.
Fried DL (1965) Statistics of a geometric representation of wavefront distortion. JOSA 55: 1427–1435.
Hardy JW (1998) Adaptive Optics for Astronomical Telescopes. New York: Oxford University Press.
Jumper EJ and Fitzgerald EJ (2001) Recent advances in aero-optics. Progress in Aerospace Sciences 37: 299–339.
Liang J, Williams DR and Miller DT (1997) Supernormal vision and high-resolution retinal imaging through adaptive optics. JOSA A 14(11): 2884–2892.
Lukin VP (1995) Atmospheric Adaptive Optics. Bellingham: SPIE Optical Engineering Press.
Ragazzoni R, Marchetti E and Valente G (2000) Adaptive-optics corrections available for the whole sky. Nature 403(6765): 54–56.
Roddier F (1999) Adaptive Optics in Astronomy. Cambridge: Cambridge University Press.
Roggeman M and Welsh B (1996) Imaging Through Turbulence. Boca Raton, FL: CRC Press.
Tyson RK (2000) Introduction to Adaptive Optics. Bellingham: SPIE Optical Engineering Press.
Welford W (1974) Aberrations of the Symmetrical Optical System. New York: Academic Press.
Wilson RW and Jenkins CR (1996) Adaptive optics for astronomy: theoretical performance and limitations. MNRAS 268: 39.
Hyperspectral Imaging
M L Huebschman, R A Schultz and H R Garner, University of Texas, Dallas, TX, USA

© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

Hyperspectral imaging techniques on a microscopic level have paralleled the hyperspectral remote astronomical sensing which has developed over the years. It is especially apparent in remote sensing of our own planet's land masses and atmosphere, such as in the LANDSAT satellite program. In biology, biochemistry, and medicine, the terms hyperspectral imaging and multispectral imaging are often interchanged, although there is a distinct difference between the two techniques.

A multispectral imager produces images of each of a number of discrete but relatively wide spectral bands. Multispectral imagers employ several wide bandpass filters to collect the data for the images. A hyperspectral imager produces images over a contiguous range of spectral bands and, thus, offers the potential for spectroscopic analysis of data. Hyperspectral imagers may use gratings or interferometers to produce the contiguous range of spectral bands to record.

Fluorescent dyes (also referred to as fluorophores or fluorochromes) and, recently, quantum dots, used as markers, are a very powerful tool for analyzing organic tissues for researchers in cytology (the study of cell function), cytogenetics (the study of the human chromosomes), histology (the study of the structure and function of tissues), microarray scanning (analysis of concentrations of multiple varied probes), and biological material science. Without the correlation of the spectral and spatial data in biological images, there would be little useful information for the researcher. Therefore, hyperspectral imaging in the biological sciences must combine imaging, spectroscopy, and fluorescent probe techniques. Such imaging requires high spatial resolution to present a true physical picture, and the spectroscopy must also have high spectral resolution in order to determine the quantities of biochemical probes present. The correlation must provide the color-position relation for identification of meaningful information.

This description will begin with an initial review of multispectral imaging to lead into a discussion of hyperspectral imaging. The discussion will examine aspects of hyperspectral imaging based on an interferometric method and on a grating dispersion method similar to the methods in multispectral imaging.
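The distinction can be made concrete with a small numpy sketch: a multispectral imager reduces a spectrum to a handful of wide-band integrals, while a hyperspectral imager samples the same spectrum over a contiguous run of narrow bands. The dye, band edges, and sampling below are illustrative, not taken from any particular instrument:

```python
import numpy as np

wavelengths = np.arange(400.0, 700.0, 1.0)   # nm, 1 nm sampling, 300 points
# Hypothetical dye emission: a Gaussian band centred at 520 nm
spectrum = np.exp(-0.5 * ((wavelengths - 520.0) / 15.0) ** 2)

# Multispectral: a few discrete, wide bandpass filters -> one number per band
bands = [(450.0, 500.0), (500.0, 550.0), (550.0, 600.0)]
multi = np.array([spectrum[(wavelengths >= lo) & (wavelengths < hi)].sum()
                  for lo, hi in bands])

# Hyperspectral: contiguous narrow (5 nm) bands -> a full spectrum per pixel
hyper = spectrum.reshape(-1, 5).sum(axis=1)   # 60 contiguous bands

assert multi.shape == (3,) and hyper.shape == (60,)
# Contiguous bands preserve essentially all the emitted energy;
# the wide bands capture only the portion inside their passbands
assert hyper.sum() >= multi.sum()
```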
IMAGING / Hyperspectral Imaging 135
Image Parameters

The analysis of fluorophores involves the excitation of the dye with a shorter wavelength, and thus higher energy, than the emission spectrum. The use of the appropriate excitation wavelength will yield a maximum emission intensity at a characteristic wavelength. The quality of a fluorescent image generally combines contrast and brightness. Contrast depends on the ratio of emission intensity to that of background light. Consequently, the use of filters to block the excitation wavelength and prevent it from overwhelming the field is critical in providing high contrast for the detection of fluorophore emission. However, the use of narrow filters to restrict the background can severely limit the brightness of the fluorescent image. This is due to the barrier filters allowing only a portion of the actual emission spectrum to pass. Brightness is an intrinsic property of each individual fluorophore, but is also dependent on the ability to illuminate with the peak excitation wavelength for that fluorophore.

The balance between contrast and brightness may be further complicated by applications designed to facilitate the evaluation of multiple fluorophores. The selection of appropriate excitation and emission filters within a single set to allow for the simultaneous viewing of multiple fluorophores can be very inhibiting. Although quad-band beamsplitters are currently available, commercial advances to date have yielded only triple-pass combinations for simultaneous detection, such as one which permits the simultaneous detection of the widely used fluorophore combination fluorescein-5-isothiocyanate (FITC), Texas Red, and 4′,6-diamidino-2-phenylindole (DAPI) (e.g., Chroma Technology Corp.).

Although the capture of digital images has propelled this field, efforts to collect multicolored images as digital files also suffer from technical limitations. Color CCD cameras often utilize filters or liquid crystals to segregate the red, green, and blue wavelengths onto the imaging chip. The employment of these cameras is restricted to resolving colors as discriminated by the filters used. For scientific applications, then, the greatest sensitivity and selectivity has been achieved through black-and-white CCD cameras with optimized filters for a specific application. While this approach is widely used, it suffers from the inherent limits of the spectra of data that can be analyzed, since it is defined by the characteristics of the optical filters employed. Thus, the system is restricted to the use of fluorophores that can be effectively resolved from each other by filters, and the filters selected limit the data collected to a portion of the total emission spectrum for any one fluorophore.

Multispectral Imaging

Multispectral imaging systems in biomedical research are composed of a microscope, filters and/or an interferometer for spectral selection, a charge-coupled device (CCD) camera to record the data, and computer control for acquisition, analysis, and display (Figure 1a). The introduction of CCD cameras has offered a means to document results and capture images as digital computer files. This has facilitated the analysis of results with more and more complex probe sets and has allowed the use of more sophisticated computer algorithms for data analysis. The other major component of the multispectral imaging system is a light source to excite the sample's fluorescent dyes, whose emissions are selected by filters or an interferometer for recording on the CCD camera.

An example of the use of multispectral imaging is in the identification of the 24 human chromosomes. A scheme was presented for the analysis of human chromosomes by differentially labeling each chromosome with a unique combination of 5 fluorophore-labeled nucleotides (building blocks of DNA and RNA, consisting of a nitrogenous base, a five-carbon sugar, and a phosphate group). These complex probes are simultaneously hybridized to metaphase chromosomes and analyzed to determine which of the fluorophores are present. Combinations of one, two, or three of the five fluorophores are enough to distinguish a particular chromosome from the others.

An interferometric multispectral system used to perform this type of analysis is the Spectral Karyotyping (chromosome typing) or SKY System (Figure 1a). This technology does not represent hyperspectral imaging, but is presented for comparison. Illumination light passes through a triple band-pass dichroic filter which allows for the simultaneous excitation of all the fluorochromes in, for instance, Speicher's scheme. Modern multi-band-pass filters add little to image distortion while effectively blocking unwanted autofluorescence or excitation illumination. The fluorescence emissions from the sample are split into two separate beams by the interferometer. The beams are then recombined with an optical path difference (OPD) between them such that the two beams are no longer in register and interfere with each other. By adjusting the OPD, a particular wavelength band can be selected for constructive interference and thus collected by the CCD camera pixels. After adjusting the OPD to record
Figure 1 The instruments for multispectral and hyperspectral imaging are similar. (a) Multispectral imagers use interferometers or bandpass filters to collect noncontiguous emission wavelengths. (b) Hyperspectral imagers employ interferometers or gratings to collect a contiguous wavelength band. Computers, microscopes, CCD cameras, and excitation light sources are common to all the techniques. The systems which use filter sets or interferometers use motorized filter placement systems or automated interferometers. The grating systems employ a motorized stage for the sample movement. Adapted with permission from Schultz RA, Nielsen T, Zavaleta JR, Ruch R, Wyatt R and Garner HR (2001) Hyperspectral imaging: a novel approach for microscopic analysis. Cytometry 43: 239–247.
the 5 dye wavelengths, each pixel's 5 records are then compared to the known probe-labeling scheme to identify the corresponding chromosome material. The image of the metaphase chromosome spread can be karyotyped using specific pseudo-colors for maximal visual discrimination of the chromosomes.

Another nonhyperspectral method is the Multi-Fluorescence In Situ Hybridization or M-FISH System. It uses five single-band-pass excitation/emission filter sets, where each filter set is specific for one of the five fluorochromes used in the combinatorial labeling process (Figure 1a). Again, five separate exposures are taken (one exposure for each filter set) and combined to create a multicolor image. Most M-FISH systems now have motorized filter set selectors which minimize any image registration
shifts that may occur between exposures. For each pixel, the presence/absence of signal from each of the five fluorochromes is assessed and then matched to the probe-coding scheme to determine which chromosome material is present. As with the SKY System, an image of the metaphase spread is converted into a karyotype, and specific pseudo-colors are used to maximize visual discrimination of the chromosomes.

Hyperspectral Imaging

As has been stated, the main difference between multi- and hyper-spectral imaging is the collection of a broad contiguous band of wavelengths. The apparatus and methods for hyperspectral imaging are similar to multispectral imaging (Figure 1). Instead of collecting sections of the emission spectrum using filters or interferometers, hyperspectral imaging collects a continuous emission spectrum by dispersing the wavelength of light using gratings and measuring the wavelength intensity, or by dispersing Fourier components of light using interferometers and measuring interference intensity (Figure 1b). Multispectral images are limited to measuring total intensity within the selected wavelength band. Determining the abundance of more than one dye per pixel is less accurate when only a small segment of each dye's spectrum is passed by the emission filter. This has not been considered a limitation in the past because most research depended on just identifying the presence, and not the abundance, of several spectrally broad and overlapping dyes which also occupied relatively large spatial areas. With the advent of smaller and smaller featured microarrays, the desire to use more dyes in identification schemes, and the use of quantum-dot markers which can be used to indicate both genetic makeup and quantity, the limitations of multispectral imaging have led to a move to hyperspectral imaging.

In contrast to the fluorescence multispectral imaging approaches, a hyperspectral image is fully three-dimensional. Every XY spatial pixel of the image has an arbitrarily large number of wavelengths allied with it. An intensity cube of data is collected. Two of the axes of the cube are spatial and one axis is wavelength (see Figure 2). At each of the 3D points, there is an associated intensity of emitted energy. Therefore, it is a four-dimensional information space. The individual intensity signatures of differing emitting bodies can then be separated to give a complete view. Spectral decomposition is a mathematical/algorithmic process by which the individual emission spectra of the dyes can be uniquely separated from a composite (full spectral) image by knowing the signature of each dye's emission spectrum, measured separately. Spectral decomposition may be performed using standard algorithms with codes which are readily available.

Interferometric Hyperspectral Imaging

The system architecture of the interferometric hyperspectral imager resembles that of the SKY System (Figure 1). But, instead of just sequencing the interferometer to select a set of specific wavelengths to record as in the SKY System, the interferometer in the hyperspectral imager is sequenced through
Figure 2 Intensity data cube. In this data cube, each CCD exposure captures a measurement in intensity (number of counts) by wavelength for a small spatial area in Y and ΔX. Many displacements in X with subsequent CCD acquisitions yield a 'cube' of data in X, Y, and λ. The intensity counts may be thought of as another dimension, I(x, y, λ). Another form of the data cube is with each CCD acquisition comprising intensity counts for a spatial X and Y area for a small Δλ, and multiple displacements in λ for each acquisition.
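The intensity cube and the per-pixel spectral decomposition described above can be sketched in a few lines of numpy: each pixel's spectrum is modeled as a linear mixture of known single-dye signatures and unmixed by linear least squares. The three Gaussian "dye" signatures and cube dimensions are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)
n_lambda = 40
lam = np.linspace(450.0, 650.0, n_lambda)

def gauss(center, width):
    return np.exp(-0.5 * ((lam - center) / width) ** 2)

# Library of known single-dye emission signatures (hypothetical dyes)
signatures = np.stack([gauss(500, 12), gauss(560, 15), gauss(610, 10)], axis=1)

# Intensity cube I(x, y, lambda): each pixel is a mixture of dye abundances
abund_true = rng.uniform(0, 1, size=(8, 8, 3))
cube = abund_true @ signatures.T                     # shape (8, 8, n_lambda)

# Spectral decomposition: per-pixel linear least squares against the library
pixels = cube.reshape(-1, n_lambda).T                # (n_lambda, n_pixels)
abund, *_ = np.linalg.lstsq(signatures, pixels, rcond=None)
abund = abund.T.reshape(8, 8, 3)

assert np.allclose(abund, abund_true, atol=1e-6)     # exact in the noiseless case
```

With measurement noise, the same fit returns the best least-squares abundances rather than an exact recovery.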
discrete optical path differences to record the intensity at each pixel for each OPD. Fourier spectroscopy techniques are used to transform the OPD intensities at each pixel into the wavelength intensities at each pixel. Each pixel may be thought of as a Fourier spectrometer which is not dependent on any other pixel. Then, spectral decomposition algorithms can be run to determine which dyes are present (see the next section).

Like the multifilter methods, the interferometer approach provides a real two-dimensional image that can be observed, focused, and optimized prior to measurement. But like the filtering methods, the excitation source must be filtered for sufficient contrast to view the fluorescent emissions. Interferometers have high optical throughput, relative to other methods, and can have high and variable spectral resolution. The spectral range can cover from the ultraviolet to the infrared. The high and variable spectral resolution allows the user to trade off sensitivity, spectral resolution, and acquisition time in a way that is not possible with any other technique. And interferometers do not cause polarization of light as a filter may.

However, interferometric instruments are susceptible to image registration and lateral coherence difficulties. They are limited by the extreme precision required of the optical components, which results in the continuous need for monitoring and adjustment, since aberrations are not tolerated in interferometric instruments. The throughput is reduced by loss due to the beamsplitter design and from the mirrors. Additionally, the Fourier transform instrument deconvolutes based on a two-step process: the correlation of the light passed by an interferometer, followed by an inverse Fourier transform in software.

To mitigate the extreme precision and sensitivity of most interferometric systems, Sagnac interferometers have been employed because of their common-path intrinsic stability. The common path compensates for any misalignment, vibration, or temperature effect. The OPD is provided by the Sagnac effect. When the Sagnac interferometer is rotated, the path traveled in one direction will be different from that of the light on the common path in the other direction. Although this interferometer method trades one difficulty for another, it is somewhat easier to manage the rotation of a physical system than to maintain the exacting precision of mirror alignments in a standard interferometer.

Spatial Resolution

The spatial resolution of a Sagnac Hyperspectral Imager is set by the size of the pixels of the CCD camera and the microscope system magnification, since the interferometric hyperspectral viewer sees the entire field of view at once. Let the x- and y-dimensions of the CCD pixel be dx and dy, respectively, and the magnification be M; then the spatial resolutions in the x and y directions are 2dx/M and 2dy/M, respectively. For CCD pixel sizes of 10 microns square, binning together two pixels in the x- and two pixels in the y-direction, and an overall magnification of 60, the x and y resolutions are both 0.66 microns.

Field of View

For N pixel rows and C pixel columns, the field of view will be N·dy/M and C·dx/M for the length of the area along the y- and x-axis, respectively. For a CCD array of 1536 columns by 1024 rows, and the parameters above, the field of view is 171 microns by 256 microns along y and x, respectively.

The sensitivity, dynamic range, and spectral response are set by the CCD camera optical components.

Spectral Resolution

Spectral resolution will depend on the ability to vary the OPD and the Fourier transform. While the interferometer is used basically as a filter in the multispectral imager to select a wavelength, in the hyperspectral imager it is used to vary the OPD for all the wavelengths whose energy is integrated at the CCD camera pixels.

The electric field, E, for emissions at one wavelength, λ, or wave number, k = 1/λ, radial frequency, ω, and amplitude, A, can be written as

E(k) = A(k) e^(−i(2πkx − ωt))   [1]

and the intensity which is measured is the magnitude squared of the electric field:

I = E*E   [2]

Emissions from fluorescent sources are usually noncoherent and composed of a range of wavelengths, even with only one dye present. The intensity measured at a pixel is the superposition of many wave numbers. After traveling a certain OPD, L, the intensity integrated over all wave numbers may be written as

I_pxl(L) = (1/2) [ ∫ I(k) dk + ∫ I(k) cos(2πkL) dk ]   [3]

or

I_pxl(L) = C_0 + C_1 · Re{FT[I(k)]}   [4]

where FT stands for Fourier transform.
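Equations [3] and [4] are easy to check numerically: build a synthetic interferogram from a known one-sided spectrum on a discrete OPD grid, then recover the spectrum from the real part of its FFT. The grid sizes and the Gaussian "dye" band below are illustrative:

```python
import numpy as np

N, dL = 1024, 1.0                      # number of OPD steps and step size (arb. units)
k = np.arange(N) / (N * dL)            # wavenumber grid conjugate to the OPD grid
L = np.arange(N) * dL                  # optical path differences

# Hypothetical one-sided emission spectrum I(k): a single Gaussian band
I_k = np.exp(-0.5 * ((k - 0.1) / 0.01) ** 2)
I_k[N // 2:] = 0.0

# Interferogram at one pixel, eq. [3]: I_pxl(L) = 1/2 [ ∫I dk + ∫I cos(2πkL) dk ]
S = 0.5 * (I_k.sum() + np.cos(2 * np.pi * np.outer(L, k)) @ I_k)

# Eq. [4] inverted: the spectrum sits in the real part of the FFT of the interferogram
I_rec = 4.0 / N * np.real(np.fft.fft(S))
assert np.allclose(I_rec[1:N // 2], I_k[1:N // 2], atol=1e-6)
```

Each CCD pixel's OPD sequence is processed independently in exactly this way, which is why each pixel can be regarded as its own Fourier spectrometer.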
It includes features for analyzing and displaying the resultant X-Y images and spectral information for the grating spectrometer Hyperspectral Imager. During a scan, individual pictures are taken in the Y-λ plane, while the stage is incremented in the X-direction. A collection of such Y-λ scan files, which can total from 50 to 1000, is merged to build an image cube (Figure 2). Another advantage of the single-line acquisition mode is that time-dependent features can be monitored continuously. The software was designed to permit repeated collection of a single line at variable time points. The utility of this feature has been demonstrated by capturing images of fluorescent objects moving through UV-transparent capillaries.

The software was designed to display the resulting X-Y plane extracted in various ways from the image cube. A graph of the emission spectrum for any pixel within that image may be viewed by clicking on any pixel of interest in the X-Y image (Figure 3c). A standard linear curve-fitting algorithm is used to determine the contribution of individual dyes to the measured emission spectrum. The software also features a windowing technique (integration of the emission spectra over a fixed window) which emulates standard filtering. An overlay feature within the software allows the curve-fitting results to be displayed in false color and compared by viewing the composite for multiple fluorophores in a single image. Additionally, it permits the contribution of a fluorophore to be emphasized for visualization purposes, by either displaying the contribution of a single fluorescent signature in the absence of the others that were present in the image or by scaling individual display intensity values up or down. Overlapping pseudo-colors assigned to represent each fluorophore (either by curve fitting or by windowing) may be displayed as added, averaged, maximum, or minimum values. For each new fluorophore–excitation source combination, the characteristic emission spectrum is acquired and added to a library of available spectra that can be used as components in the spectral decomposition. This permits the user to introduce new fluorophores at any time. Moreover, background spectra (defined as a region of interest by the user) can be acquired and used as one element of the decomposition. This background component can be turned off in the false color view (as any of the spectral components may be), resulting in enhanced noise reduction and visual contrast improvement.
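The windowing technique mentioned above, integrating each pixel's spectrum over a fixed wavelength window to emulate a bandpass filter, is a one-line operation on the intensity cube. The cube shape and window edges here are illustrative:

```python
import numpy as np

lam = np.arange(450.0, 650.0, 5.0)                  # 40 spectral samples per pixel
rng = np.random.default_rng(0)
cube = rng.uniform(size=(64, 64, lam.size))         # I(x, y, λ) intensity cube

# Windowing: integrate over a fixed λ window, emulating a 500-550 nm bandpass filter
window = (lam >= 500.0) & (lam < 550.0)
filtered_view = cube[:, :, window].sum(axis=2)      # pseudo-filtered X-Y image

assert filtered_view.shape == (64, 64)
assert np.all(filtered_view <= cube.sum(axis=2))    # a window passes only part of the energy
```

Unlike a physical filter, the window edges can be changed after acquisition, since the full spectrum is retained in the cube.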
Figure 3 The benefit of spectral imaging is demonstrated in a spatial high-resolution scan of an expression microarray. (a) A portion of
the scan of an expression microarray on a microscope slide using the Genepix4000 scanner. It records the intensity emitted by the Cy3
dye when excited with a 532 nm laser. (b) A portion of the scan of the same microarray with the Hyperspectral Imaging Microscope using
a 40X objective and Hg excitation lamp. False color is displayed to indicate the intensity variations. (c) A Hyper Scope computer program
display of the spectrum recorded when the cursor is placed over one pixel in a scan.
Further Reading

Huebschman ML, Schultz RA and Garner HR (2002) Characteristics and capabilities of the hyperspectral imaging microscope. Eng in Med and Bio 21: 104–117.
Inoue T and Gliksman N (1998) Techniques for optimizing microscopy and analysis through digital image processing. Methods in Cell Biology 56: 63–90.
Jain AK (1997) Fundamentals of Digital Image Processing, pp. 80–128. New Jersey: Prentice-Hall.
Jalal SM and Law ME (1999) Utility of multicolor fluorescent in situ hybridization in clinical cytogenetics. Genet Med 1: 181–186.
Johannes C, Chudoba I and Obe G (1999) Analysis of X-ray-induced aberrations in human chromosome 5 using high-resolution multicolour banding FISH (mBAND). Chromosome Research 7: 625–633.
Kruse F (1993) The Spectral Image Processing System (SIPS) – interactive visualization and analysis of imaging spectrometer data. Remote Sensing Environment 44: 145–163.
Lu YJ, Morris JS, Edwards PAW and Shipley J (2000) Evaluation of 24-color multifluor-fluorescence in-situ hybridization (M-FISH) karyotyping by comparison with reverse chromosome painting of the human breast cancer cell line T-47. Chromosome Research 8: 127–132.
Malik Z, Cabib D, Buckwald RA, Talmi A, Garini Y and Lipson SG (1996) Fourier transform multipixel spectroscopy for quantitative cytology. Journal of Microscopy 182: 133–140.
Press WH, Flannery BP, Teukolsky SA and Vetterling WT (1986) Numerical Recipes, the Art of Scientific Computing, p. 546. New York: Cambridge University Press.
Raap AK (1998) Advances in fluorescence in situ hybridization. Mutat. Res. Fundam. Mol. Mech. Mutagen 400: 287–298.
Sakamoto M, Pinkel D, Mascio L, et al. (1995) Semiautomated DNA probe mapping using digital imaging microscope: II. System performance. Cytometry 19: 60–69.
Schrock E, duManoir S, Veldman T, et al. (1996) Multicolor spectral karyotyping of human chromosomes. Science 273: 494–497.
Schultz RA, Nielsen T, Zavaleta JR, Ruch R, Wyatt R and Garner HR (2001) Hyperspectral imaging: a novel approach for microscopic analysis. Cytometry 43: 239–247.
Schwartz S (1999) Molecular cytogenetics: show me the colors. Genet Med 1: 178–180.
Speicher MR, Ballard SG and Ward DC (1996) Karyotyping human chromosomes by combinatorial multi-fluor FISH. Nature Genetics 12: 368–375.
Wang YL (1998) Digital deconvolution of fluorescence images for biologists. Methods in Cell Biology 56: 305–315.
Zhao L, Hayes K and Glassman A (2000) Enhanced detection of chromosomal abnormalities with the use of RxFISH multicolor banding technique. Cancer Genet Cytogenetics 118: 108–111.
Figure 2 Between an ultrafast light source S and a detector D, various classes of photons can be distinguished which can be used for imaging.
IMAGING / Imaging Through Scattering Media 145
μ_s′ = μ_s(1 − g)   [4]

scattering properties, and is given by

exp(−L/a) with a² = 1/[3μ_a(μ_s′ + μ_a)]   [5]

This formula reflects the fact that the photon's pathlength is increased by scattering, leading to an increase of the damping, which is related to absorption.

These few examples provide a 'static' view of light behavior in scattering media. Studying the spatial and temporal distribution of light in more detail is more difficult.

Photon Migration in Scattering Media

Photon migration models have been inspired by other fields, such as astrophysics, soft matter physics, and neutronics, where physical media are random in both space and time. To avoid difficulties linked to the microscopic complexity, such investigations rely on a statistical basis.

The most widely used theory is the time-dependent diffusion approximation to the transport equation:

∇·(D∇Φ(r,t)) − μ_a Φ(r,t) = (1/c) ∂Φ(r,t)/∂t − S(r,t)   [6]

where r and t are spatial and temporal coordinates, c is the speed of light in tissue, and D is the diffusion coefficient, related to the absorption and scattering coefficients by the formula:

D = 1/(3[μ_a + μ_s′])   [7]

The quantity Φ(r,t) is called the fluence, defined as the power divided by the unit area. This equation does not reflect the scattering angular dependence; the use of μ_s′ instead of μ_s ensures that isotropic scattering has been reached. Indeed, for the use of the diffusion theory, which implies anisotropic scattering, the diffusion coefficient is expressed in terms of the transport-corrected scattering coefficient; S(r,t) is the source term.

The diffusion equation has been solved analytically for different types of measurements, such as reflection and transmission modes, assuming that the optical properties remain invariant throughout the tissue. To incorporate the finite boundaries, the method of images has been used.

The diffusion approximation equation in the frequency domain is the Fourier transform of the time-domain equation. This leads to the equation:

∇·(D∇Φ(r,ω)) − (μ_a + iω/c) Φ(r,ω) + S(r,ω) = 0   [8]

Here the time variable is replaced by the frequency ω, which is the modulation angular frequency of the source. In this model, the fluence can be seen as a complex number describing the amplitude and phase of the so-called 'photon density wave' in addition to a DC component:

Φ(r,ω) = Φ_AC(r,ω) + Φ_DC(r,0) = I_AC exp(iθ) + Φ_DC(r,0)   [9]

where θ is the phase shift of the diffusing wave, whose wavelength is:

2π √(2c/(3μ_s′ω))   [10]

Although analytical techniques provide a rapid understanding of most phenomena involved in static or dynamic situations, real biological samples are much more complex than homogeneous isotropic media. In these cases, direct and inverse algorithms map the spatially varying optical properties.

Monte Carlo simulations are very popular nowadays; a number of useful programs are available on the websites of various groups. Results may include time or polarization dependence of the photons along their paths.

The other method used in tissue optics is random walk theory (RWT) on a lattice. RWT models the diffusion-like motion of photons in turbid media in a probabilistic manner. Using RWT, an expression may be derived for the probability of a photon arriving at any point and time given a specific starting point and time.

Imaging in the Ballistic Regime: Taking Advantage of Various Coherence Properties

Shallow Tissue Imaging Through Selection of Ballistic Photons

When imaging thin (less than 2 mm) tissue samples, it is possible to use photons which have not been scattered, called the ballistic photons, which can be used to form high-resolution images in the same manner as if no scattering had taken place. Their number, however, decreases exponentially with propagation distance, and this ballistic signal is usually hidden by multiply scattered photons that obscure the image. For relatively shallow tissue depths it is possible to use various spatial filtering techniques to block the multiply scattered photons, and there is much current research that aims to
extend optical imaging and biopsy to depths of a few mm.

Confocal microscopy rejects much of the scattered light through spatial filtering and has been shown to form 3-D images at tissue depths of up to 200–300 μm, both in vitro and in vivo, which corresponds to about 50 dB of damping for the ballistic photons.

Two-photon fluorescence microscopy is particularly interesting in this regime, since it ensures that all the detected fluorescence photons originate from the desired image plane. This technique also ensures better penetration (using IR excitation) and better resolution (nonlinear response) than single-photon confocal microscopes.

Imaging to much greater tissue depths does not appear to be practical using optical microscopy and spatial filtering alone, and more sophisticated techniques must also be employed to discriminate in favor of the ballistic photons. These techniques exploit either the fact that the ballistic signal photons retain the spatial coherence of the incident light, or the fact that the ballistic photons keep the optical phase memory that multiply diffused photons have lost.

One of the most successful ballistic light-imaging techniques based on the latter filtering approach is optical coherence tomography (OCT), which uses coherence gating with low temporal coherence light and interferometric detection. Coherent detection is used to discriminate against the multiply scattered photons that scatter back into the trajectories of ballistic photons. The ballistic light signal is measured by interference with a reference beam, and the resulting pattern is detected with high sensitivity using homodyne or heterodyne detection. By using low coherence length sources and matching the time-of-flight of the reference beam signal to that of the desired ballistic light, OCT can acquire depth-resolved images in scattering media such as biological tissues. For powers in the mW range for tissue irradiation in the infrared, the unscattered ballistic (or more precisely, single back-scattered) component of the light will be limited by the shot noise detection limit after propagating through approximately 25 scattering mean free paths (MFP) of a scattering medium (or more than 100 dB). This corresponds to a 'typical' tissue thickness of about 1 mm tissue depth for the reflection geometry, which is potentially useful for clinical applications such as screening for skin cancer, or for retinal examination.

OCT has proved clinically useful as a means of acquiring depth-resolved images in the eye and is being developed for application in strongly scattering biological tissue. Although the image acquisition requires pixel-by-pixel scanning, the use of new high-power superluminescent diodes in the 100 mW range, or of ultrafast lasers to provide high average power, low coherence length radiation (sub-two-cycle lasers), has resulted in an in vivo OCT system providing real-time imaging.

There are a number of other approaches to coherent imaging, including whole-field 2-D acquisition techniques which remove the need to scan transversely pixel by pixel. Whole-field imaging using a temporally and spatially incoherent source and a well-balanced interferometric microscope intrinsically provides higher image acquisition rates and can take advantage of inexpensive spatially incoherent, broadband sources such as high-power LEDs and white-light thermal sources. As an example, Figure 5 shows a Linnik-type white-light interferometer with optical path modulation, and Figure 6 an image of an ex vivo rat retina where one can recognize the main features expected from histology.

Holography, which works in the Fourier space rather than in the object or image space, is also an interferometric technique able to discriminate between ballistic and scattered photons. Since the development of electronic holography, which takes advantage of available arrays of detectors with multimillion pixels, this approach is rapidly growing. Real-time 3-D imaging systems based on photorefractive holography in multi-quantum-well devices, which can potentially acquire depth-resolved images at thousands of frames per second, have been demonstrated.

These ballistic light techniques can extend the depth of tissue imaging to the mm range when

Figure 5 The full-field OCT setup based on a Linnik interferometer. The PZT actuator is used to modulate the path difference between the two arms.
148 IMAGING / Imaging Through Scattering Media
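The ballistic-light budget quoted above (roughly 25 scattering mean free paths, or more than 100 dB, detectable at the shot-noise limit) follows directly from Beer-Lambert attenuation of the unscattered component. The coherence-gate width (axial resolution) of a Gaussian-spectrum source is a standard OCT result assumed here, not quoted in this article; the 800 nm center wavelength and 50 nm bandwidth are illustrative values:

```python
import math

def ballistic_attenuation_db(num_mfp: float) -> float:
    """Beer-Lambert attenuation of the ballistic component after num_mfp
    scattering mean free paths: I/I0 = exp(-num_mfp), expressed in dB."""
    return 10.0 * num_mfp * math.log10(math.e)

def coherence_length_um(center_um: float, bandwidth_um: float) -> float:
    """Axial coherence gate of a Gaussian-spectrum source,
    l_c = (2 ln 2 / pi) * lambda^2 / d_lambda (standard OCT formula,
    assumed here rather than taken from the article)."""
    return (2.0 * math.log(2.0) / math.pi) * center_um ** 2 / bandwidth_um

print(ballistic_attenuation_db(25.0))   # ~108.6 dB, i.e., "more than 100 dB"
print(coherence_length_um(0.8, 0.05))   # ~5.6 um depth gate
```

The 25-MFP figure thus corresponds to about 109 dB of round-trip attenuation, consistent with the "more than 100 dB" statement in the text.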
Figure 7 Polarization imaging: here the degree of circular polarization is measured in the upper image. The lower image shows the
field without polarization filtering.
mentioned problem of the inhomogeneity of the optical properties of human tissues means that the results are still poor in terms of resolution, and sub-centimeter structures can hardly be observed in breast imaging, for instance.

One important exception to this limited success in finding precisely the position, the size, and the optical properties of hidden structures is the case of 'activation' studies. This corresponds to a 'differential' situation in which an unknown structure is locally changing with time. Here the goal is not to reach a full description of the unknown structure but only to characterize the local changes induced by the activation.

The main field of application of this approach is certainly in brain activation studies. A matrix of light emitters (usually at two wavelengths which differentiate oxy- and deoxy-hemoglobin, for instance 780 and 840 nm) is coupled to a matrix of light detectors. The signals are usually balanced, and any local change which breaks the symmetry of the scattered photon paths induces a differential signal that is easy to detect, because there is no background signal before activation takes place.
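The two-wavelength measurement described above can be sketched as a modified Beer-Lambert inversion: changes in optical density at the two wavelengths are mapped to concentration changes of oxy- and deoxy-hemoglobin by solving a 2x2 linear system. The extinction coefficients, path length, and concentration changes below are illustrative placeholders, not tabulated values:

```python
import numpy as np

# Illustrative (not tabulated) extinction coefficients at 780 and 840 nm,
# units 1/(mM cm); rows = wavelengths, columns = [HbO2, Hb].
EPS = np.array([[0.7, 1.1],   # 780 nm: deoxy-Hb absorbs more
                [1.0, 0.8]])  # 840 nm: oxy-Hb absorbs more

def concentration_changes(d_od, path_length_cm):
    """Solve the modified Beer-Lambert system for [dHbO2, dHb] in mM."""
    return np.linalg.solve(EPS * path_length_cm, d_od)

# Forward-simulate an activation (+0.01 mM HbO2, -0.005 mM Hb, 6 cm path),
# then invert the simulated optical-density changes.
true_dc = np.array([0.01, -0.005])
d_od = EPS @ true_dc * 6.0
print(concentration_changes(d_od, 6.0))   # recovers ~[0.01, -0.005]
```

Because the pre-activation signal is balanced away, only these small differential optical-density changes need to be measured.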
Moreover, it has been demonstrated that for weak perturbations (weak variations of the optical parameters), the inverse problem can be rigorously solved. This geometrical selection provides a cheap and simple setup with rather spectacular results. Sometimes a modulation is added, and the phase shift of the photon density wave is enhanced by the differential approach.

Imaging in the Multiple Scattering Regime: Taking Advantage of all the Photons

Light Only: Transillumination and Fluorescence Imaging

The optical systems used for CW measurements are rather simple and require only the alignment of a number of light sources and detectors. Compared to time-gated techniques, a much stronger path-length distribution is present in the signal due to multiple scattering, and the result is a loss of localization and resolution. In the so-called transillumination geometry, collimated detection is used to isolate multiply scattered photons close to the normal emergence angle.

Direct imaging with the required resolution using CW techniques in very thick tissues, such as breast tissue, has not been established even with sophisticated multi-source/multi-detector combinations (typically a few tens of sources and detectors at three different wavelengths).

Despite rapid progress in the modeling of light behavior in scattering media, it has not yet been demonstrated that CW imaging can provide images with suitable resolution in thick tissue, and the time-dependent measurement techniques discussed earlier are dominantly used.

For fluorescence imaging, one has to account for the strong attenuation of light as it passes through tissue. Fluorescent dyes (porphyrin, indocyanine, and more recently quantum dots) have been developed that are excited and re-emit in the 'biological window' at near-infrared (NIR) wavelengths, where scattering and absorption coefficients are lower, which is favorable for deep fluorescence imaging. Fluorescent intensity in deep tissues depends on the location, size, concentration, and intrinsic characteristics (e.g., lifetime, quantum efficiency) of the fluorophores, and on the scattering and absorption coefficients of the tissue at both the excitation and emission wavelengths. In order to extract intrinsic characteristics of fluorophores within tissue, it is necessary to describe the statistics of photon path lengths, which depend on all these different parameters. The amount of light emitted at the site of the fluorophore is directly proportional to the total amount of light impinging on the fluorophore. Using fluorescence contrast has the unique feature of carrying two spectroscopic contrasts, related to both the excitation and the fluorescent photons.

The next two techniques take advantage of another way to handle the optical information, by coupling optics and acoustics.

Coupling Light and Sound: Acousto-Optic and Photo-Acoustic

Although the two approaches that we will now describe are different in their basic principles as well as in their experimental setups, their names are similar – opto-acoustics (sometimes called photo-acoustics) and acousto-optics.

Opto- (or Photo-)Acoustics

In this technique a pulsed (or modulated) electromagnetic beam irradiates the structure under examination. Despite their tortuous paths (about 10 to 100 cm), photons propagate through the volume in a few nanoseconds. Local absorbing centers then absorb the electromagnetic energy, experience a fast thermal expansion, and become sources of acoustic waves.

Figure 8 Photo-acoustic signal generation: 1-D model.

The optical problem of tomographic reconstruction is then transformed into a simpler problem of acoustic reconstruction of the source distribution, because the speed of sound is much slower than the speed of light. More precisely, we can compute the signal generated by a short light pulse impinging on a scattering medium which contains an absorber (thickness e), as can be seen in Figure 8, when eμ_a ≪ 1. The amount of power per unit surface which is deposited in the absorbing slice is Eμ_a (E is the fluence in W/m²). When the light pulse duration τ is short enough, despite the propagation in the scattering medium (τ ≪ e/v, where v is the speed of sound),
we find a variation of the mass per unit volume:

Δρ ≈ ρβΔT ≈ ρ (βEμ_a e)/(C_v e)   [11]

The corresponding pressure is:

Δp = (v²/γ) Δρ, or Δp ≈ ρ (v²βEμ_a)/(γC_v)   [12]

Although this approach is new compared to the purely optical transillumination approach, it has led to a large number of studies, in which time gating brings new information for the accurate determination of the US emitter depth. It has been applied to breast tumor examination to complement standard imaging techniques such as echography and X-ray mammography. Recently this technique has demonstrated submillimeter resolution for in vivo blood distribution in rat brain imaging.

Acousto-Optic

Here a DC single-frequency laser is used to illuminate the sample. For a typical biological sample a few cm thick, the trajectories are distributed between the sample thickness value and more than ten times this value. Due to the good temporal coherence of the source, all the scattered photons are likely to interfere when they emerge from the sample volume. These interferences are seen as a speckle field, which can be observed at a suitable distance from the sample surface.

As seen in Figure 9, adding an ultrasonic field will mainly induce a small (smaller than the optical wavelength) displacement of the scatterers (it also induces a local variation of the refractive index), which will cause a speckle modulation at the ultrasonic frequency.

If the zone irradiated by the ultrasonic field is optically absorbing, fewer photons will emerge from this zone than from a nonabsorbing one, and thus less modulation will be seen when scanning across the sample volume. This modulation can be detected by selecting a single speckle grain and averaging over a number of uncorrelated speckle distributions to obtain a suitable average value (for instance with latex particles). This is not always possible, in particular because the overall geometry is rather stable in a semisolid tissue. Using a single detector which receives many speckle grains gives less contrast. A multiple detector and parallel processing of many speckle grains greatly improve the speed and the signal-to-noise ratio: a typical image of absorbing spheres, embedded in a scattering phantom, is shown in Figure 10.
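The photoacoustic generation estimate of eqs [11] and [12] can be sanity-checked with order-of-magnitude numbers. All material parameters below are illustrative water-like values assumed here, not taken from the article; C_v is treated as a volumetric heat capacity, so the slab thickness e cancels:

```python
# Order-of-magnitude check of eqs [11]-[12] with assumed water-like values.
beta  = 2.0e-4   # thermal expansion coefficient (1/K), assumed
Cv    = 4.18e6   # volumetric heat capacity (J/(m^3 K)), assumed
rho   = 1.0e3    # density (kg/m^3)
v     = 1.5e3    # speed of sound (m/s)
gamma = 1.0      # Cp/Cv, close to 1 for water
mu_a  = 1.0e2    # absorption coefficient (1/m), i.e., 1/cm
E     = 1.0e2    # fluence (J/m^2), i.e., 10 mJ/cm^2

dT   = E * mu_a / Cv            # temperature rise of the absorbing slice
drho = rho * beta * dT          # eq [11] (slab thickness e cancels)
dp   = (v ** 2 / gamma) * drho  # eq [12]
print(f"dT = {dT*1e3:.2f} mK, dp = {dp:.0f} Pa")   # ~2.4 mK, ~1 kPa
```

A temperature rise of only a few mK thus already produces a pressure transient of order 10³ Pa, which is readily detectable with ultrasonic transducers.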
Infrared Imaging
K Krapels and R G Driggers, US Army Night Vision and Electronic Sensors Directorate, Fort Belvoir, VA, USA

Published by Elsevier Ltd.

Introduction

All objects above 0 Kelvin emit electromagnetic radiation associated with the thermal activity on the surface of the object. For terrestrial temperatures (around 300 Kelvin), objects emit a good portion of the electromagnetic flux in the infrared part of the electromagnetic spectrum. The visible band spans wavelengths from 0.4 to 0.7 micrometers (infrared engineers typically use the micrometer, μm, for wavelength rather than the nanometer or wavenumber more commonly used in other fields). Infrared imaging devices convert energy in the infrared portion of the electromagnetic spectrum into images that can be viewed by the human eye. The unaided human eye cannot image infrared energy because the lens of the eye is opaque to infrared radiation.

The infrared spectrum begins at the red end of the visible spectrum, where the eye can no longer sense energy. It spans from 0.7 μm to 100 μm. The infrared spectrum is, by common convention, broken into five different bands (this may vary according to the application). The bands are typically defined as: near infrared (NIR) from 0.7 to 1.0 μm; short wavelength infrared (SWIR) from 1.0 to 3.0 μm; mid-wavelength infrared (MWIR) from 3.0 to 5.0 μm; long wavelength infrared (LWIR) from 8.0 to 14.0 μm; and far infrared (FIR) from 14.0 to 100 μm. These bands are depicted graphically in Figure 1, which shows the atmospheric transmission for a 1-kilometer horizontal ground path for a 'standard' day in the United States. These types of transmission graphs can be tailored for any conditions, using sophisticated atmospheric models such as MODTRAN (from www.ontar.com). Note that
IMAGING / Infrared Imaging 153
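The band convention quoted above can be captured in a small helper function (a hypothetical illustration, not part of any standard library; the 5–8 μm gap is left unassigned here because, as the text notes, the atmosphere is essentially opaque there):

```python
# Band edges per the convention quoted in the text (they vary by application).
IR_BANDS = [("NIR", 0.7, 1.0), ("SWIR", 1.0, 3.0), ("MWIR", 3.0, 5.0),
            ("LWIR", 8.0, 14.0), ("FIR", 14.0, 100.0)]

def ir_band(wavelength_um: float) -> str:
    """Return the conventional IR band name for a wavelength in micrometers."""
    for name, lo, hi in IR_BANDS:
        if lo <= wavelength_um < hi:
            return name
    return "outside the conventional IR bands"

print(ir_band(4.0))   # MWIR
print(ir_band(10.0))  # LWIR
print(ir_band(6.0))   # outside the conventional IR bands (5-8 um gap)
```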
Figure 2 Visible image – reflected energy.
Figure 3 Infrared image – self-emitted energy.
there are many atmospheric 'windows', such that an imager designed with such a band selection can see through the atmosphere.

The primary difference between a visible-spectrum camera and an infrared imager is the physical phenomenology of the radiation from the scene being imaged. The energy used by a visible camera is predominantly reflected solar, or some other illuminating energy, in the visible spectrum. The energy imaged by infrared imagers (Forward Looking Infrared, commonly known as FLIRs) in the MWIR and LWIR bands is primarily self-emitted radiation. From Figure 1, the MWIR band has an atmospheric window in the 3 to 5 μm region and the LWIR band has an atmospheric window in the 8 to 12 μm region. The atmosphere is opaque in the 5 to 8 μm region, so it would be pointless to construct a camera that responds to this waveband.

Figures 2 and 3 are images that show the difference in the source of the radiation sensed by the two types of cameras. The visible image (Figure 2) is light that was provided by the Sun, propagated through Earth's atmosphere, reflected off the objects in the scene, traversed a second atmospheric path to the sensor, and was then imaged with a lens and a visible-band detector. A key here is that the objects in the scene are represented by their reflectivity characteristics. The image characteristics can also change with any change in the atmospheric path or in the source. The atmospheric path characteristics from the Sun to the objects change frequently, because the Sun angle changes throughout the day and also weather and
cloud conditions change. The visible imager characterization model is therefore a multipath problem that is extremely difficult.

The LWIR image given in Figure 3 is obtained primarily by the emission of radiation by objects in the scene. The amount of electromagnetic flux depends on the temperature and emissivity of the objects. A higher temperature and a higher emissivity correspond to a higher flux. The image shown is 'white-hot': the whiter a point in the image, the higher the flux leaving the object. It is interesting to note that trees have a natural self-cooling process, since a high temperature can damage foliage. Objects that have absorbed a large amount of solar energy are hot and are emitting large amounts of infrared radiation. This is sometimes called 'solar loading'.

The characteristics of the infrared radiation emitted by an object are described by Planck's black-body law in terms of spectral radiant emittance:

M_λ = ε(λ) c₁ / {λ⁵ [exp(c₂/λT) − 1]}  W/(cm² μm)   [1]

where c₁ and c₂ are constants of 3.7418 × 10⁴ W μm⁴/cm² and 1.4388 × 10⁴ μm K. The wavelength, λ, is provided in micrometers, and ε(λ) is the emissivity of the surface. A black-body source is defined as an object with an emissivity of 1.0, so it is a perfect emitter. Source emissions of black-bodies at typical terrestrial temperatures are shown in Figure 4. Often, in modeling, the terrestrial background temperature is assumed to be 300 K. The source emittance curves are shown for other temperatures for comparison purposes. One curve corresponds to an object colder than the background, and two curves correspond to temperatures hotter than the background.

Planck's equation describes the spectral shape of the source as a function of wavelength. It is readily apparent that the peak shifts to the left (shorter wavelengths) as the body temperature increases. If the temperature of a black-body is increased to that of the Sun (5900 Kelvin), the peak of the spectral shape would decrease to 0.55 μm, or green light. This peak wavelength is described by
Figure 4 Planck’s blackbody radiation curves for four temperatures from 290 K to 320 K.
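Equations [1] and [2] are easy to check numerically. The sketch below evaluates the Planck exitance with the constants quoted above (emissivity 1), locates the spectral peak for a 300 K black-body, and computes the in-band contrast of a +10 K target; the contrast definition (M_T − M_B)/(M_T + M_B) is inferred from the values in Table 1 rather than stated in the article:

```python
import math

C1 = 3.7418e4   # W um^4 / cm^2
C2 = 1.4388e4   # um K

def planck_exitance(wavelength_um: float, temp_k: float) -> float:
    """Spectral radiant exitance of a black-body, W/(cm^2 um), eq [1] with emissivity 1."""
    return C1 / (wavelength_um ** 5 * (math.exp(C2 / (wavelength_um * temp_k)) - 1.0))

def band_exitance(lo_um, hi_um, temp_k, steps=2000):
    """Numerically integrate eq [1] over a band (midpoint rule), W/cm^2."""
    dl = (hi_um - lo_um) / steps
    return sum(planck_exitance(lo_um + (i + 0.5) * dl, temp_k) for i in range(steps)) * dl

# Wien's displacement law, eq [2]: the peak sits at 2898/T um.
peak = max((planck_exitance(l / 100.0, 300.0), l / 100.0) for l in range(200, 3000))[1]
print(peak)                               # ~9.66 um = 2898/300

m_bg  = band_exitance(8.0, 12.0, 300.0)
m_tgt = band_exitance(8.0, 12.0, 310.0)
print((m_tgt - m_bg) / (m_tgt + m_bg))    # in-band contrast of a +10 K target, ~0.08
```

The few-percent contrast riding on a large in-band background is exactly the "signal on a pedestal" behavior discussed in the text.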
Wien's displacement law:

λ_max = 2898/T  μm   [2]

Figure 5 shows the peak wavelength as a function of temperature through the longwave region. It is important to note that the difference between the black-body curves is the 'signal' in the infrared bands. For an infrared sensor, if the background is at 300 K and the target is at 302 K, the signal is the difference in flux between these curves. Signals in the infrared ride on very large amounts of background flux. However, in the visible band this is not the case. For example, consider the case of a white target on a black background. The black background is generating no signal, whereas the white target is generating a maximum signal (given that the sensor gain has been adjusted). Dynamic range may be fully utilized in a visible sensor. In the case of an IR sensor, a portion of the dynamic range is used by the large background flux radiated by everything in the scene. This flux never has a small value, hence sensitivity and dynamic range requirements are much more difficult to satisfy in IR sensors than in visible sensors. Table 1 summarizes this characteristic numerically for the 300 K background and targets shown in Figure 4.

Table 1 Signal/dynamic range limitation in IR bands

Blackbody temperature (K)   Exitance in 8–12 μm band (W/cm²)   Contrast
290 (−10 ΔT target)         1.25E−02                           −8.48%
300 (background)            1.48E−02
310 (+10 ΔT target)         1.74E−02                            7.97%
320 (+20 ΔT target)         2.02E−02                           15.39%

A typical infrared imaging system scenario is depicted in Figure 6. The scene consists of two major components, the target and the background. In an IR scene, the majority of the energy is emitted from the constituents of the scene. This emitted energy is transmitted through the atmosphere to the sensor. As it propagates through the atmosphere it is degraded by absorption and scattering. Obscuration by intervening objects and additional energy emitted by the path also affect the target energy. All of these contributors, which are not the target of interest, essentially reduce one's ability to discriminate the target. So, at the entrance aperture of the sensor, the target signature is reduced from the values observed at the target. Then, the signal is degraded by the optics and scanner (as applicable) of the sensor. The energy is then sampled by the detector array and converted to electrical signals. Various electronics amplify and condition this signal before it is presented to either a display for human interpretation or an algorithm such as an automatic target recognizer for machine interpretation. A linear systems approach to modeling allows the components' transfer functions to be treated separately as contributors to the overall system performance. This approach allows for straightforward modifications to a performance model for changes in the sensor or environment when performing trade-off analyses.

History of Infrared Imaging

Night vision systems began to be developed extensively during World War II. Satisfying the requirement to image at night was approached with two different methodologies. The first method was image intensification (I²). This method involves amplification of any small light that is available and displaying it directly to the eye. Typically, I² devices are sensitive to visible light and a small portion of the short-wave IR (SWIR, 0.7–3.0 μm) band. They are often classified (along with visible band imagers) as electro-optical (EO) imagers. With an I² imager, there must be some source of illumination for them to function well (as little as starlight is sufficient for approx 20/40 acuity with current systems). The second type of night vision device is the infrared or thermal imager (also known as FLIRs). FLIRs are sensitive to the radiation in either the mid-wave IR (MWIR, 3–5 μm) or long-wave IR (LWIR, 8–14 μm) bands.

The first infrared cameras used photographic film that was sensitive to infrared wavelengths. Intelligence and reconnaissance imaging from aircraft drove infrared system development as a night augmentation to existing visible cameras. These imagers used either continuously moving film or stationary film, like an ordinary camera. A slit aperture was used in the continuously moving film cameras, and aircraft motion provided the scan of the scene along the film to build a continuous image. The introduction of television, as electronic imaging in the visible spectrum, led to the development of electronic imagers in the infrared band. Electronic infrared imagers were not restricted to looking down and using aircraft motion for scan, hence these devices came to be known as FLIRs (Forward Looking InfraRed). Today, an electronic infrared imager may be called a FLIR or a thermal imager, with the trend being to use FLIR to describe military applications.

Early FLIRs were often accompanied by illumination devices. They were not very sensitive compared to modern FLIRs, so active illumination was necessary. Infrared searchlights or illuminators were used to get higher signal-to-noise ratios. These types of FLIRs are called 'active' IR imagers, while FLIRs that do not use illuminators are 'passive' imagers. Generally, active imagers are used in civil applications and are declining in military use. Broadly, applications for thermal imagers fall into two categories: military and commercial. Table 2 lists some of the purposes and users for modern thermal imagers. The design and performance criteria vary widely for some of the applications.

Table 2 Applications of infrared imaging

Category      User                      Application
Military      Intelligence analyst      Intelligence, surveillance, and reconnaissance
              Pilot, driver             Navigation/pilotage
              Gunner, weapons officer   Targeting/fire control
              Air defense               Search and track
Commercial
 Civil govt   Police                    Surveillance, fugitive tracking
              Fire                      Rescue, hot-spot location
 Environmental EPA                      Emission tracking
              Interior department       Resource management
              NIMA\USGS                 Mapping
              NOAA\NWS                  Weather forecasting
 Industrial   Manufacturers             Machine vision
              Maintenance               Non-destructive testing
 Medical      Doctors                   Diagnostic imaging

Types of Infrared Imagers

Infrared imagers are classified by different characteristics: scan type, detector material/cooling requirements, and detector physics. The scan type refers to the method used to construct the electronic image. The camera may use a single detector which is raster scanned over the input scene to build an image. Alternatively, a parallel scan uses a linear array of detectors scanned across a scene to build an image. The latest advances in materials have led to staring arrays of detectors. In a staring system, a detector is present in two dimensions to represent each image pixel, and no mechanical motion of the focal plane is necessary to construct the image. There are some hybrid FLIR types that combine the different imaging techniques. Usually this combination leads to an improvement in signal-to-noise ratio or a reduction in undersampling.

The second classification, by detector material/cooling requirements, usually describes FLIRs built using differing materials. For example, a typical LWIR FLIR material is mercury cadmium telluride (HgCdTe or MCT). In order to achieve high sensitivity, these devices are cooled to decrease dark current. Usually the cooling is based on liquid nitrogen or a cryogenic cooler, and the detectors operate at 77 K. Another common detector material is indium antimonide (InSb), which is used for MWIR FLIRs and is also cooled. A newer class of infrared cameras is not cooled, being referred to as 'uncooled' FLIRs. These uncooled FLIRs are microbolometers (resistive elements) or pyrometers (capacitive elements) and have less sensitivity than cooled imagers. Typically, the cooled imagers comprise photon detectors, while the uncooled imagers are based on thermal detectors, the uncooled FLIRs being used in lower-performance applications.

Infrared Imager Performance

There are three general categories of infrared sensor performance characterizations. The first is sensitivity and the second is resolution. When end-to-end, or human-in-the-loop (HITL), performance is required, the third type of performance characterization
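The linear-systems view described earlier (each component's transfer function treated separately) amounts to multiplying per-component MTFs to obtain the system MTF. The functional forms and cutoff values below are assumptions chosen for illustration, not parameters from the article:

```python
import numpy as np

f = np.linspace(0.0, 20.0, 201)          # spatial frequency (cycles/mrad)

# Hypothetical component MTFs for a notional imager:
mtf_optics   = np.exp(-(f / 12.0) ** 2)  # optics roll-off (assumed Gaussian)
mtf_detector = np.abs(np.sinc(f / 15.0)) # detector sampling aperture (sinc)
mtf_display  = np.exp(-f / 30.0)         # display/electronics roll-off (assumed)

# The system MTF is the product of the component transfer functions,
# which is what makes component-level trade-off studies straightforward.
mtf_system = mtf_optics * mtf_detector * mtf_display
print(mtf_system[0])                     # 1.0 at zero spatial frequency
```

Swapping in a different optic or detector then only changes one factor in the product, leaving the rest of the performance model untouched.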
emitted and solar reflected flux. In the longwave, the flux is primarily emitted, with both day and night yielding a flux of about 2 × 10¹⁷.

As the pattern frequency increases beyond the half-sample rate, the AMOP drops rapidly to zero. The effect is to extend the prediction of the MRT beyond the half-sample rate and to introduce a parameter called the minimum temperature difference perceived (MTDP), to replace MRT in the range performance estimate and also in the lab to characterize an undersampled system.

The concept of MTDP follows from observation of standard 4-bar images as the pattern frequency passes the imager half-sample rate. The image changes from four bars to three, two, and finally one (as can be perceived). The phase of the target image on the detector array may need to be adjusted to observe this progression. The standard MRT measurement requires that all four bars be resolvable by the observer during the measurement process. For use in the lab, the MTDP is defined as the minimum temperature difference at which two, three, or four bars can be resolved in the image of the standard 4-bar test pattern by an observer, with the test pattern positioned at optimum phase. The optimum phase is the phase at which two, three, or four bars are 'best perceived'. TRM3 uses the standard definition of MRT for adequately sampled imagers. The TRM3 approach is implemented if the imager is considered undersampled, as defined when the prefilter MTF at the half-sample frequency is larger than 10%. The MTDP equation is given by

MTDP(f) = √2 · SNR_th · C(f) / AMOP(f)   [3]

where SNR_th is a threshold signal-to-noise ratio and C is the total system filtered noise. MTDP is used in the same manner as MRTD in range calculations.

Another approach is the triangle orientation discrimination (TOD) threshold. In the TOD, the test pattern is a (nonperiodic) equilateral triangle in four possible orientations (apex up, down, left, or right), and the measurement procedure is a robust 4AFC (four-alternative forced-choice) psychophysical procedure. In this procedure, the observer has to indicate which triangle orientation he sees, even if he is not sure. Variation of triangle contrast and/or size leads to a variation in the percentage correct between 25% (complete guess) and 100%, and by interpolation the exact 75% correct threshold can be obtained. A complete TOD curve (comparable to an MRTD curve) is obtained by plotting the contrast thresholds as a function of the reciprocal of the triangle angular size (Figure 9).

The TOD method has a large number of theoretical and practical advantages: it is suitable for undersampled and well-sampled electro-optical and optical
imaging systems in both the thermal and visual domains, it has a close relationship to real target acquisition, and the observer task is easy. The results are free from observer bias and allow statistical significance tests. The method may be implemented in current MRTD test equipment with little effort, and the TOD curve can be used easily in a target acquisition (TA) model such as ACQUIRE. Two validation studies with real targets show that the TOD curve predicts TA performance for under-sampled and well-sampled imaging systems very well. Currently, a theoretical sensor model to predict the TOD (comparable to NVTherm or TRM3) is under development. The lab measurement and field performance appear sound, but the model is not yet available.

Testing Infrared Imagers: NETD, MTF, and MRTD (Sensitivity, Resolution, and Acuity)

In general, imager sensitivity is a measure of the smallest signal that is detectable by a sensor. Sensitivity is determined using the principles of radiometry and the characteristics of the detector. For infrared imaging systems, the noise equivalent temperature difference (NETD) is a measure of sensitivity. The system intensity transfer function (SITF) can be used to estimate the NETD, which is the system rms noise voltage over the noise differential output. NETD is the smallest measurable signal produced by a large target (extended source). Equation [4] describes NETD as a function of noise voltage and the system intensity transfer function. The measured values are determined from a line of video stripped from the image of a test target, as depicted in Figure 10. A square test target is placed before a black-body source. The delta T is the difference between the black-body temperature and the mask. This target is then placed at the focal point of an off-axis parabolic mirror. The mirror serves the purpose of a long optical path length to the target, yet relieves the tester from concerns over atmospheric losses to the temperature difference. The image of the target is shown in Figure 10b. The SITF slope for the scan line in Figure 10c is ΔS/ΔT, where ΔS is the signal measured for a given ΔT. The N_rms is the background signal on the same line:

NETD = N_rms [volts] / SITF_Slope [volts/K]   [4]

Resolution is a general term that describes the size of resolvable features in an imager's output. With infrared imaging systems, the resolution is described by the system modulation transfer function (MTF). Consider Figure 11 for the determination of MTF from either the point spread function or the edge spread function (psf or esf). In Figure 11a, a test target is placed at the focal point of a collimator. The target may be a point, edge, or line, as shown in Figure 11b. The IR system images this target. The output of a detector scan, or of a line of detectors in the staring array case, is taken for analysis, as in Figure 11c. For the edge spread function, the spatial derivative is taken to get the psf. The Fourier transform is then calculated to give the system MTF, as shown in Figure 11d. The point spread function is really the impulse response of the system, so a narrow psf is desirable, as is a wide MTF. Such a psf/MTF combination gives a higher resolution.

The MRTD of an infrared system is defined as the differential temperature of a four-bar target that makes the target just resolvable by a particular observer. It is a measure of observer visual acuity when a typical observer is using the infrared imager. It results in a descriptor that is a function, not just a single value: a plot of sensitivity as a function of spatial frequency.
This parameter can be extended to field performance using Johnson's criteria.

As a laboratory measurement, it uses a 7:1 aspect ratio bar target. The procedure is shown in Figure 12. A bar target mask, with a black-body illuminator, is placed at the focal plane of a collimator (Figures 12a and 12b). The temperature is increased from a small value until the bars are just resolvable to a trained
observer (Figure 12c). Then the temperature difference is decreased until the bars are no longer visible. These values are averaged for a particular target. The same procedure is performed with a negative contrast bar target. The average of the positive and negative values is the MRT for a particular spatial frequency. These data points are plotted, and a curve may be fitted to interpolate or extrapolate performance at other than the discrete spatial frequencies of the targets, as in Figure 12d.

Summary

We have provided a general description of infrared imaging systems in terms of characteristics, modeling, field performance, and performance measurement. The characteristics of infrared imagers continue to change. Significant changes have occurred in the past five years, including the development of higher-performance uncooled imagers and ultra-narrow field of view photon systems. Large format detector arrays are commercially available in the mid-wave and are becoming more available in the longwave. Still in the research phase are dual-band focal plane arrays.

At first sight, it appears that the longwave flux characteristics are as good as those of a daytime visible system; however, there are two other factors limiting performance. First, the energies of infrared photons are much smaller than those of visible photons, so the detectors require correspondingly small energy bandgaps and suffer from higher dark current. The detectors are usually cooled to reduce this effect. Second, the reflected light in the visible is modulated by target and background reflectivities that typically range from 7 to 20%. In the infrared, all terrestrial objects are emitting with an emission temperature of around 300 K. Typically, a two-degree equivalent blackbody difference in photon flux is considered a high-contrast target-to-background flux. The flux difference between two blackbodies at 302 K and 300 K can be calculated in a manner similar to that shown in Figure 4. This difference in signal is what provides an image, and it is small compared to the ambient background flux: in the longwave, the signal is three percent of the mean flux, and in the mid-wave it is six percent of the mean flux. This means that there is a large flux pedestal associated with imaging in the infrared. Unfortunately, two problems accompany this large pedestal. First, photon noise is the dominant noise; it is determined by the mean of the pedestal, yet it must be compared against the small signal differences. Second, the charge well storage in infrared detectors is limited to around 10^7 charge carriers. A longwave system against a hot desert background would generate 10^10 charge carriers in a 33 millisecond integration time. Smaller F-numbers, spectral bandwidths, and integration times are used so that the charge well does not saturate. This results in an SNR 10 to 30 times below the ideal. It has been suggested that some on-chip compression may be a solution to the well pedestal problem. In many scanning FLIR systems, the pedestal is eliminated by AC-coupling the detector signals. Infrared focal plane array (IRFPA) readout circuits have been designed and fabricated to perform charge skimming and charge partitioning.
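The three percent (longwave) and six percent (mid-wave) contrast figures quoted above can be checked by integrating the blackbody spectral photon radiance over each band. The sketch below assumes 8–12 μm and 3–5 μm band edges (the article does not state them) and standard physical constants; the exact percentages depend on the band edges chosen.

```python
import math

H = 6.626e-34    # Planck constant (J s)
C = 2.998e8      # speed of light (m/s)
KB = 1.381e-23   # Boltzmann constant (J/K)

def band_photon_radiance(temp_k, lo_um, hi_um, steps=2000):
    """Blackbody photon radiance integrated over a band (photons/s/m^2/sr)."""
    width_m = (hi_um - lo_um) * 1e-6 / steps
    total = 0.0
    for i in range(steps):
        lam = (lo_um + (i + 0.5) * (hi_um - lo_um) / steps) * 1e-6
        x = H * C / (lam * KB * temp_k)
        # Spectral photon radiance: (2c / lam^4) / (exp(hc/(lam k T)) - 1)
        total += (2.0 * C / lam**4) / math.expm1(x) * width_m
    return total

def band_contrast(lo_um, hi_um, t_bg=300.0, t_target=302.0):
    """Relative photon-flux difference between target and background blackbodies."""
    q_bg = band_photon_radiance(t_bg, lo_um, hi_um)
    q_tg = band_photon_radiance(t_target, lo_um, hi_um)
    return (q_tg - q_bg) / q_bg

lwir = band_contrast(8.0, 12.0)  # longwave: a few percent of the mean flux
mwir = band_contrast(3.0, 5.0)   # mid-wave: roughly twice the longwave contrast
```

With these band edges the computation lands near 3% in the longwave and 6–7% in the mid-wave, consistent with the pedestal discussion above.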
162 IMAGING / Infrared Imaging
D'Agostino and Webb created the 3D noise model, with both spatial and temporal considerations, for analyzing system noise. This technique divided the noise into manageable and understandable components. It also simplified the integration of these effects into models.

The introduction of focal plane array (FPA) imagers has created a requirement to improve model performance to account for the difference between scanning arrays and staring arrays. This change has made sampling an important area for improvement. A new model called NVTHERM is a result of the NVESD model improvement effort, which addresses undersampled imaging system performance.

Sensor characterization is approached in three ways: theoretical models, field performance, and laboratory measurements (Figure 9). Theoretical models are developed that describe sensor sensitivity, resolution, and human performance for the purpose of evaluating new conceptual sensors. Acquisition models, and other models, are developed to relate the theoretical models to field performance. This link allows theoretical models to be converted to field performance quantities (e.g., probabilities of detection, recognition, and identification). Field performance is measured in the field so that the theoretical models can be refined and become more accurate as sensors develop. Since field performance activities are so expensive, methods for the direct measurement of sensor performance are developed for the laboratory. Field performance testing of every infrared sensor built, including buy-off, acceptance, and life-cycle testing, is out of the question. Laboratory measurements of sensor performance are therefore developed such that, given these measurements, the field performance of a system can be predicted. Sensor characterization programs require accurate sensor models, field performance estimates with acquisition models, and repeatable laboratory measurements.

There are a few alternatives to the US NVTherm model. One candidate approach to undersampled imager modeling is Germany's TRM3, or the MDTP approach. The impact of undersampling in the image of a 4-bar target can readily be seen by simply observing the change in the image as a function of spatial frequency and phase. The spatial frequency is defined as line pairs per angular extent of the target, and the phase specifies the relative location of the target image with respect to the detector array raster. These effects are not independent of each other, but for each target there can be found an optimal phase where the observer can see the maximum amplitude modulation in the image of the target. MRT measurements in the past have utilized this variation with phase by allowing the observer to optimize the displayed image through target or system line-of-sight changes during the measurement process. For undersampled imagers, TRM3 addresses the MRT calculation's inability to predict beyond the half sample rate of the sensor by replacing the MTF in the denominator of the MRT equation with an appropriately scaled term called the average modulation at optimum phase (AMOP). AMOP is the average signal difference in the image of the standard 4-bar pattern, with the test pattern positioned at optimum phase. AMOP oscillates between the pre-sample MTF and the bar modulation beyond a frequency of 0.8 times the sample rate, or 1.6 times the half sample rate. Dual-band focal plane arrays and quantum well detector systems will find their place in applications in both military and commercial sectors.

Further Reading

Bijl P and Valeton MJ (1998) Triangle orientation discrimination: the alternative to minimum resolvable temperature difference and minimum resolvable contrast. Optical Engineering 37(7): 1976–1983.
D'Agostino J and Webb C (1991) 3-D analysis framework and measurement methodology for imaging system noise. Proceedings of SPIE 1488: 110–121.
Driggers RG, Cox P and Edwards T (1998) Introduction to Infrared and Electro-Optical Systems. Boston, MA: Artech House, p. 8.
Holst GC (1995) Electro-Optical Imaging System Performance. Winter Park, FL: JCD Publishing, p. 347.
Holst GC (1995) Electro-Optical Imaging System Performance. Winter Park, FL: JCD Publishing, p. 432.
Johnson J (1958) Analysis of image forming systems. Proceedings of IIS 249–273.
Johnson J and Lawson W (1974) Performance modeling methods and problems. Proceedings of IRIS.
Ratches JA (1973) NVL Static Performance Model for Thermal Viewing Systems. USA Electronics Command Report ECOM 7043, AD-A011212.
Schade OH (1948) Electro-optical characteristics of television systems. RCA Review IX(1–4).
Sendall R and Lloyd JM (1970) Improved specifications for infrared imaging systems. Proceedings of IRIS 14(2): 109–129.
Vollmerhausen RH and Driggers RG (1999) NVTHERM: next generation night vision thermal model. Proceedings of IRIS Passive Sensors 1.
Wittenstein W (1998) Thermal range model TRM3. Proceedings of SPIE 3436.
164 IMAGING / Interferometric Imaging
Interferometric Imaging
D L Marks, University of Illinois at Urbana-Champaign, Urbana, IL, USA

© 2005, Elsevier Ltd. All Rights Reserved.

Interferometric imaging is image formation by measuring the interference between electromagnetic signals from incoherently emitting or illuminated sources. To learn about interferometric image formation using coherent light, see Holography, Techniques: Overview. To understand what optical coherence is, see Coherence: Overview. To learn what kinds of sources produce incoherent light, see Incoherent Sources: Lamps. To learn about interferometric instruments, see Interferometry: Overview. To learn about interferometric instruments utilizing incoherent light, see Interferometry: White Light Interferometry. To better understand the effect of the coherence of light sources on image formation, see Coherence: Coherence and Imaging.

Interferometric imaging attempts to improve the resolution of images by interferometrically combining the signals from several apertures. By doing so, a resolution can be achieved that is superior to that achievable through standard image formation by any of the individual apertures. This is especially useful in astronomy, where resolving more distant and smaller stellar objects is always desirable. It is also finding increasing use in microscopy. It has several advantages over conventional noninterferometric imaging methods. The method can combine the light gathered from several imaging instruments to form an image superior to that which can be formed by any one of the instruments. It can also obviate the need to produce a single very large aperture to achieve the equivalent resolution, which in many cases is of impractical size. In addition, because it can produce a phase-resolved measurement, the data can be processed more flexibly to form a computed image estimate. There are also considerable disadvantages to interferometric imaging compared to conventional imaging methods. The amount of signal gathered is far less than could be gathered with an equivalent single aperture. It is difficult to mechanically or feedback stabilize the time delay between two widely separated apertures to obtain an accurate phase measurement. Often only a small bandwidth of the source light can be utilized, further reducing the available signal. The interference component of the combined signal can be very small. Typically an interferometric image must be computed from the data, rather than being directly measured on photographic film or an electronic focal plane array. Even with these disadvantages, interferometric imaging is frequently more feasible than building a single large aperture to achieve the highest available resolution images.

In contrast to interferometric imaging, a standard image forming instrument such as a telescope has a resolution limited by the telescope mirror aperture in the absence of aberrations and atmospheric turbulence. As telescope mirrors become very large, they become bulky, extremely heavy, and difficult to manipulate, and they deform under their own weight and temperature gradients. However, adaptive optics is being more frequently utilized to dynamically correct for instrument aberrations as well as atmospheric turbulence. The images captured by telescopes are typically intensity-only images, which are amenable to image processing and deconvolution, but nevertheless lack phase information. Interferometric imaging is an alternative, where the resolution of a large aperture can be attained by measuring the interference between individual points of the electromagnetic field within an area equivalent to the large aperture extent. The interferometric combining of light from sub-apertures to achieve the resolution of a larger aperture is called aperture synthesis.

Unlike image formation by a lens, interferometric images are never physically formed. Conventional images are formed when diverging spherical waves emanating from sources are focused by a lens to converging spherical waves on a photographic film or an electronic sensor. In this case, the image information is directly contained in the exposure at each position on the sensor. Interferometric measurements contain the image information as statistical correlations between the electromagnetic fields at pairs of spatial points. The statistical properties of electromagnetic waves are modeled by optical coherence theory. Interferometric imaging works by measuring the statistical correlations between various pairs of points in the electromagnetic field, and then uses optical coherence theory to infer what the image of the source of the radiation is.

The light that emanates from most natural and many artificial sources is incoherent. Incoherent radiation is optical random noise. It is analogous to the sound of static heard on a radio receiving no signal. Because it is random, the fluctuations of the electromagnetic field produced by an incoherent source are completely unrelated to the fluctuations of any other source. The origin of the randomness is
usually from microscopic processes such as the thermal motions of electrons or the spontaneous emission of an excited atom. Because these microscopic processes tend to act independently of one another, the light produced by different sources tends to be mutually incoherent. Because incoherent sources do not radiate a deterministic field, the average amount of power they radiate is usually specified rather than the field itself, which is random. However, as the fields propagate away from their sources, the fields can become partially coherent. To see this, consider the light waves propagating away from a single point-like incoherent source. The field produced at the source point consists of random fluctuations in time, but these fluctuations travel outward in a spherical wave at the speed of light. The fluctuations of the field at any two points an equal distance from the source will be the same, or perfectly correlated, because they arrive from the source at the same time. This is illustrated in Figure 1a, where a random wave emanates from a point incoherent source. Because of the complete correlation, we can say that these two points are fully spatially coherent. So while the source itself produces random noise, the fluctuations of the field at points distant from the source become correlated because they originate from the same source.

Partially coherent waves occur when the measured field contains a superposition of the fields from more than one incoherent source. If there are now two point sources instead of one, each field point in space will receive radiation from both sources. The amount of the contribution from each source to each point in space depends on the distance between the field point and the source. In general, because both sources can be different distances from the two field points, the fluctuations at the field points will not be the same. For example, two field points that are the same distances from both sources will be perfectly correlated, or coherent, because they receive the same combination of fields from both sources. This situation is illustrated in Figure 1b. However, if one considers two points that are each near one of the respective sources, each point will mostly receive light from its nearby source. Because the sources are incoherent, and each point is mostly receiving radiation from its closer source, these points will be nearly uncorrelated, as shown in Figure 1c.

A spatial distribution of incoherent sources radiating various amounts of power will create a pattern of correlations in the field remote from the source. By using interferometry, the correlation between pairs of field points remote from the sources can be measured. The interference between two points in the electromagnetic field is achieved by relaying the two fields (e.g., with mirrors) to a common location that has an equal travel time for the fields from both points, where the fields are superimposed. The power of the superimposed beams is then detected, by a photocathode, for example. If the two signals being combined are incoherent, then no interference will take place, and the total power measured on the photodetector is simply the sum of the powers of the two signals. However, if the two signals are partially coherent, there will be a deviation of the measured power from the sum of the powers of the two signals. The magnitude of this deviation relative to the power of the constituent signals indicates the degree of partial coherence of the signal.

An interferometer, such as depicted in Figure 2, gathers the light from two areas of the electromagnetic field remote from an object and combines them together at the same point to form an interferogram. In this example, two telescopes of a known separation gather light from a star. The primary mirror of each telescope is of insufficient size to resolve the star as anything but a point. However, a telescope is used in practice to collect enough light to produce a

Figure 1 The distance between incoherent sources and two field points determines the coherence between the field points. (a) Because there is only one source, the field everywhere is determined by the radiation of that source. Therefore, the same fluctuations of the wave arrive at two points an equal distance from the source at the same time, and produce identical fields at both points, which are therefore completely coherent. (b) Two field points are now the same distance from two incoherent sources. Because both sources contribute identical fields to both points (because the fluctuations of the wave from both sources arrive at both points at the same time), the superpositions of the fields from both sources at both field points are identical, and so the fields are coherent. (c) Two field points are now each near one of two sources. Because each field point receives radiation mostly from its nearby source and very little from the other source, the fields at each field point are almost completely mutually incoherent.
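The behavior sketched in Figure 1 can be reproduced numerically. For quasi-monochromatic incoherent point sources, the correlation (mutual intensity) between two field points is the intensity-weighted sum of spherical-wave phasors from each source — a discrete form of the Van Cittert–Zernike relation used later in this article. The geometry below (wavelength, source and field-point positions) is purely illustrative.

```python
import cmath, math

LAM = 0.5e-6                 # wavelength (m) -- illustrative
K = 2.0 * math.pi / LAM

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def degree_of_coherence(sources, p1, p2):
    """|mu| between field points p1 and p2 for incoherent point sources.

    Each source is (position, intensity). The mutual intensity is the
    intensity-weighted sum of spherical-wave phasors exp(ik(d1-d2))/(d1*d2);
    normalizing by the intensities at each point gives 0 <= |mu| <= 1.
    """
    j12 = 0.0 + 0.0j
    i1 = i2 = 0.0
    for pos, inten in sources:
        d1, d2 = dist(pos, p1), dist(pos, p2)
        j12 += inten * cmath.exp(1j * K * (d1 - d2)) / (d1 * d2)
        i1 += inten / d1**2
        i2 += inten / d2**2
    return abs(j12) / math.sqrt(i1 * i2)

one_source = [((0.0, 0.0), 1.0)]
two_sources = [((-1.0, 0.0), 1.0), ((1.0, 0.0), 1.0)]

fig1a = degree_of_coherence(one_source, (0.3, 5.0), (-0.2, 7.0))   # single source: coherent
fig1b = degree_of_coherence(two_sources, (0.0, 5.0), (0.0, 7.0))   # equidistant points: coherent
fig1c = degree_of_coherence(two_sources, (-1.0, 0.1), (1.0, 0.1))  # each point near one source
```

The single-source and equidistant cases give |μ| ≈ 1, while the two points sitting near their respective sources come out nearly uncorrelated, as Figure 1 describes.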
Figure 4 The image of two interfering wavefronts from a point object with various phase differences between the two wavefronts.
resolution is to be able to observe planets and planet formation around distant stars, to identify possible signs of life in other solar systems.

A similar principle is used for interferometric image formation in radio astronomy. The chief differences are the methods used to collect, detect, and correlate the electromagnetic radiation. Radio astronomical observatories, such as the Very Large Array in New Mexico, USA, and MERLIN at the Jodrell Bank Observatory in Cheshire, UK, use collection dishes and radio antennas to directly measure the electric field, which is not feasible at optical frequencies. These fields are converted to electrical voltages, which can then be directly correlated using electronic mixers, rather than detecting the correlation indirectly through interference. Because the wavelength is much larger, the stability requirements are greatly relaxed. Radio and optical frequency interferometric images are formed in essentially the same way, using the Van Cittert–Zernike theorem.

Interferometric imaging methods have also become of interest for microscopy applications. Aperture synthesis is not needed in microscopy, because the objects of interest are very small and therefore single apertures can be used to achieve diffraction-limited resolution. However, interferometric detection of partially coherent light affords greater flexibility because an image may be computed from the data rather than being directly detected. Typically this is achieved by creating an incoherent hologram of the object, often through shearing interferometry. For example, by using a rotational shearing interferometer, one can measure an incoherent hologram that can be used to produce a representation of an object with a greatly increased depth-of-field. In addition, direct detection of the partial coherence of sources can allow the methods of computed tomography to be used to produce three-dimensional images of volumes of incoherently emitting objects. Because of the versatility that interferometric detection can provide, interferometric imaging will surely be increasingly used outside of astronomy.

See also

Coherence: Coherence and Imaging; Overview. Holography, Techniques: Overview. Incoherent Sources: Lamps. Interferometry: Overview; White Light Interferometry.

Further Reading

Baldwin JE and Haniff CA (2002) The application of interferometry to optical astronomical imaging. Philosophical Transactions of the Royal Society of London A 360: 969–986.
Beckers JM (1991) Interferometric imaging with the very large telescope. Journal of Optics 22: 73–83.
Born M and Wolf E (1997) Principles of Optics. New York: Cambridge University Press.
Goodman JW (1996) Introduction to Fourier Optics. New York: McGraw-Hill Inc.
Goodman JW (2000) Statistical Optics. New York: Wiley-Interscience.
Mandel L and Wolf E (1995) Optical Coherence and Quantum Optics. New York: Cambridge University Press.
Marks DL, Stack RA, Brady DJ, Munson DC Jr and Brady RB (1999) Visible cone-beam tomography with a lensless interferometric camera. Science 284: 2164–2166.
Monnier JD (2003) Optical interferometry in astronomy. Reports on Progress in Physics 66: 789–857.
Potuluri P, Fetterman MR and Brady DJ (2001) High depth-of-field microscopic imaging using an interferometric camera. Optics Express 8: 624–630.
Thompson AR, Moran JM and Swenson GW (2001) Interferometry and Synthesis in Radio Astronomy. New York: Wiley-Interscience.
Lidar
M L Simpson and D P Hutchinson, Oak Ridge National Laboratory, Oak Ridge, TN, USA

© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

Light detection and ranging (LIDAR) refers to an optical system that measures any or all of a variety of target parameters including range, velocity, and chemical constituents. Usually a LIDAR system consists of a transmitting laser source, transmitter optics, receiver optics, and a detector. Several configurations of LIDAR systems are possible, based on the type of laser source and detection scheme selected. Basically, sources can be continuous wave (CW), pulsed, or chirped. Sources that are CW are most often used to measure velocity or chemical constituents. Pulsed and chirped sources
170 IMAGING / Lidar
Figure 1 Triangular-wave modulation envelope on the transmitted beam (black) and return beam (gray).
Finally, one can solve for the distance to the target, R, as a function of the beat frequency and chirp rate:

R = (cT/2B) Δf   [5]

The difference frequency, Δf, is the same for the up and down ramps of the modulation signal. Figure 2 demonstrates the return signal for the case of a target moving toward the LIDAR system. The difference frequency now contains a component proportional to the target-induced Doppler shift, given by

fd = 2v/λ   [6]

where v is the line-of-sight target velocity and λ is the laser wavelength. From Figure 2, the Doppler shift causes the difference frequency on the up ramp to be less than the range-induced frequency shift given by eqn [5], and the net difference frequency is greater on the down slope. In this case, the difference frequency is given by

Δf = 2BR/(cT) + fd   [7]

The range and Doppler contributions to the difference frequency may be separated by measuring the frequency difference over both the up and down chirps of the waveform. The Doppler frequency is fd = [Δf(down) − Δf(up)]/2, and the range is given by the average of Δf(up) and Δf(down) over a complete cycle of the modulating triangular wave. Therefore, substituting in eqns [5] and [6], the range may be determined as

R = (cT/2B) [Δf(up) + Δf(down)]/2   [8]

and the line-of-sight (LOS) velocity is

v = (λ/2) [Δf(down) − Δf(up)]/2   [9]

The range resolution of an FM-CW LIDAR system is the range for which the difference frequency equals one cycle in the sweep time interval T, or Δf = 1/T. Then, from eqn [5], the range resolution, ΔR, is given by

ΔR = (cT/2B)(1/T) = c/(2B)   [10]

Therefore, the range resolution is only a function of the sweep bandwidth and not the laser wavelength. However, the Doppler shift is dependent on the laser frequency. The accuracy with which the difference frequency may be determined is a function of the number of cycles over which the measurement is made. The number of cycles is maximized for the case of τ = T/2, so T should be set to twice the time delay for the maximum range of interest.

Numerical Example

A range resolution of 30 cm requires a chirp bandwidth, B, of 500 MHz. For a total range of interest of 10 km, the chirp time, T, should be 133.33 μs, corresponding to a chirp rate of 3750 Hz (one up and one down ramp per cycle). The difference frequency for a 10 km range target will be 250 MHz, which demonstrates the need for wideband detectors to capture high-resolution range images (Table 1).
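The relations in eqns [5]–[10] and the numerical example can be verified with a few lines of code. The laser wavelength below is an assumption (a CO2-laser value; the article does not state one) — it affects only the Doppler term, not the range resolution.

```python
C = 3.0e8       # speed of light (m/s), rounded as in the example
B = 500e6       # chirp bandwidth (Hz)
T = 133.33e-6   # chirp ramp time (s)
LAM = 10.6e-6   # laser wavelength (m) -- assumed, not given in the text

def beat_frequencies(rng, vel):
    """Up/down-ramp difference frequencies for range rng (m), closing velocity vel (m/s)."""
    f_range = 2.0 * B * rng / (C * T)  # range-induced beat, from eqn [5]
    f_dopp = 2.0 * vel / LAM           # Doppler shift, eqn [6]
    return f_range - f_dopp, f_range + f_dopp  # up ramp, down ramp (eqn [7])

def range_and_velocity(df_up, df_down):
    """Invert the measured beats using eqns [8] and [9]."""
    rng = (C * T / (2.0 * B)) * (df_up + df_down) / 2.0
    vel = (LAM / 2.0) * (df_down - df_up) / 2.0
    return rng, vel

range_resolution = C / (2.0 * B)  # eqn [10]: 0.3 m, independent of wavelength
df_up, df_down = beat_frequencies(10e3, 30.0)
rng_est, vel_est = range_and_velocity(df_up, df_down)
```

The average beat of roughly 250 MHz for a 10 km target reproduces the wideband-detector point made in the numerical example.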
Table 1 Summary of FM-CW LIDAR characteristics

Parameter               Value
Range resolution (m)    0.3
Chirp rate (Hz)         3750
Chirp bandwidth (Hz)    500 × 10^6
Maximum range (km)      10

Optical Configuration for Imaging LIDAR

Imaging LIDAR systems can be constructed for virtually any wavelength range with appropriate sources, optical components, and detectors. The design of an optical receiver system capable of recording coherent images requires additional considerations over those necessary for a video camera or single-channel receiver. First, the collection optics must have resolution capabilities consistent with both the detector grid dimensions and the coherence properties of the source. Next, the local oscillator (LO) power must be applied to each array element. For a typical infrared (HgCdTe) detector, one to ten milliwatts of LO power might be required to reach the shot noise limit. For large arrays, this may result in unacceptable heat loads, even for momentary application. Thus, detector arrays with substantial spacing between detector elements benefit from the application of individual LO 'beamlets' to shield the insensitive portions of the array substrate. These beamlets can be generated by illuminating a mask containing an array of holes with the LO Gaussian laser beam. This scheme is illustrated in Figure 3.

In the configuration shown in the figure, the signal beam (from the target) and the LO beamlets are combined at the beamsplitter prior to being focused onto the detector plane by the objective lens. The detector plane coincides with the far-field image plane of the objective lens, and any additional optical elements utilized to enhance the camera's resolution are not shown. An expanded LO beam illuminates an aperture array located at the far-field object plane of the ancillary lens. For instance, if the objective lens and the ancillary lens are identical, the LO mask is imaged onto the detector plane with unity magnification. The detector surface is assumed to be flat with uniform video and heterodyne quantum efficiency. Several aspects of this design are discussed below. We first discuss how the optical parameters may be adjusted to give nearly Gaussian LO profiles at each detector position. The receiver antenna array is then shown to consist of nonoverlapping sub-beams spaced with the detector array geometry. Signal considerations for both incoherent and coherent sources are then discussed, and it is found that optimal detector size and array spacing depend on the coherence properties of the intended object field.

Aperture Array

As an alternative to holographic LO beam array generation, the LO mask design presented here consists of an array of apertures with size and spacing equal to the detector dimensions. The LO is expanded to coherently illuminate the entire mask with approximately uniform intensity. The mask is located at the front focal plane of the ancillary lens and its image is located at the back focal plane (detector plane) of the objective lens. The coherent image of the mask formed at the detector plane by the ancillary-objective lens combination can be described by a two-stage Fraunhofer diffraction process.

The finite dimensions of the ancillary pupil result in spatial filtering, which can affect the image profile of the LO mask. If desired, the ancillary pupil may be chosen to provide nearly Gaussian LO beamlet profiles with beam waists at each detector location. It is useful to first consider the coherent image of a single aperture uniformly illuminated with plane LO wavefronts. In one dimension, this wave train diffracts from the aperture to form a far-field diffraction pattern at the Fraunhofer plane of the ancillary lens given by

EF(β) = E0 sin(β)/β   [11]

where β = (1/2)kb sin(θ), k = 2π/λ, λ is the wavelength, b is the aperture width, and θ is measured from the optical axis. For small θ, β = πbn/(λf), where n measures a point on the Fraunhofer plane and f is the focal length of the ancillary lens. The finite diameter D (determined by an aperture stop in the Fraunhofer plane) of the ancillary lens spatially filters the diffraction pattern, and this limits the range of β values in eqn [11]. We define an aperture-limited pupil as one which limits β to the range −π < β < π; this will be the case when f/D = b/(2λ).

Consider now a linear aperture array consisting of N individual apertures of width b and separation a > b. Assume that each aperture is illuminated by plane, uniform, mutually coherent LO wavefronts of equal intensity. In this case, the LO profile at the Fraunhofer plane is characterized by the diffraction pattern for a coherently illuminated linear diffraction grating:

EF(θ) = E0 sinc(β) sin(Nα)/(N sin(α))   [12]

where α = (1/2)ka sin(θ) and where β, θ, and k are defined above.
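Eqns [11] and [12] are easy to evaluate numerically. The sketch below checks two properties of the grating pattern: on axis the field equals E0, and the magnitude never exceeds the single-aperture envelope. The aperture width, spacing, count, and wavelength are illustrative values only.

```python
import math

LAM = 10.6e-6          # wavelength (m) -- illustrative
K = 2.0 * math.pi / LAM
B_AP = 100e-6          # aperture width b (m) -- illustrative
A_SP = 200e-6          # aperture separation a > b (m) -- illustrative
N = 5                  # number of apertures
E0 = 1.0

def single_aperture(theta):
    """Eqn [11]: E0*sin(beta)/beta with beta = (1/2)*k*b*sin(theta)."""
    beta = 0.5 * K * B_AP * math.sin(theta)
    return E0 if beta == 0.0 else E0 * math.sin(beta) / beta

def aperture_array(theta):
    """Eqn [12]: sinc envelope times grating factor sin(N*alpha)/(N*sin(alpha))."""
    alpha = 0.5 * K * A_SP * math.sin(theta)
    grating = 1.0 if math.sin(alpha) == 0.0 else math.sin(N * alpha) / (N * math.sin(alpha))
    return single_aperture(theta) * grating

thetas = [i * 1e-4 for i in range(1000)]   # angles out to ~0.1 rad
peak = aperture_array(0.0)                 # grating factor and sinc both equal 1 on axis
envelope_ok = all(abs(aperture_array(t)) <= abs(single_aperture(t)) + 1e-12 for t in thetas)
```

The first null of the single-aperture pattern falls at sin(θ) = λ/b (β = π), which is where the spatial filtering of an aperture-limited pupil truncates the pattern.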
Figure 3 The individual beamlets are generated by the local oscillator aperture mask. Reproduced with permission from Simpson ML, Bennett CA, Emery MS, et al. (1997) Coherent imaging with two-dimensional focal-plane arrays: design and applications. Applied Optics 36: 6913–6920.
Propagation of the LO from the ancillary lens Fraunhofer plane to the detector plane can be described by applying the Fraunhofer diffraction integral to eqn [12] with integration limits appropriate to the ancillary pupil dimensions. The larger pupil gives increased illumination at the detector edges, while the aperture-limited pupil gives a nearly Gaussian LO profile with 1/e points close to the detector edges.

Receiver Antenna Array

The requirement of wavefront matching between the LO field and the signal field leads to the notion of an 'antenna beam' or, equivalently, a 'backward propagating LO' (BPLO) to define the field of view of an individual heterodyne receiver element. This beam is formed by allowing a reverse projection of the local oscillator field to diffract from the detector surface, through all collection optics in the signal beam path, to a point on the object field of the imaging system. A receiver array will produce an array of such BPLO beams, which must maintain the detector array geometry in the object field if the LO and signal beams are to mix efficiently.

As above, spatial filtering by the receiver pupil (assumed to be located at the Fraunhofer plane of the objective lens) will affect the BPLO profile. In the Fraunhofer plane of the objective lens, the uniformly illuminated single-detector BPLO profile is given by eqn [11], and we define an aperture-limited detector as one which limits the range of β to ±π. Similarly, eqn [12] characterizes the BPLO profile resulting from a detector array uniformly illuminated with an extended uniform LO wavefront. The uniform-illumination BPLO profiles in the object field differ from the detector plane profiles only in horizontal scale, which now is consistent with the magnified detector dimensions.

For our purposes, we may define the far field of the objective lens according to the condition for maximum Gaussian beam collimation:

ωd ω0 = fλ/π   [13]

where ω0 is an effective BPLO Gaussian beam waist at the Fraunhofer plane and ωd is an effective BPLO beam waist at the detector. Under this circumstance, the BPLO Rayleigh range is ZD = πω0^2/λ = f(ω0/ωd). Identifying ω0/ωd as the magnification (notice that for the BPLO, the magnification is greater than unity) gives do = (di/f)ZD > ZD, where do is the object distance and di is the image distance. Thus the far field of the objective lens is at least one BPLO Rayleigh range away.

The object field BPLO profiles for an aperture-limited detector array consist of nearly Gaussian sub-beams with 1/e points located approximately at the magnified detector edge locations. Since the far-field object distance exceeds ZD, the divergence of each BPLO sub-beam is constant, and the sub-beams remain distinct and nonoverlapping for all do > ZD.

Imaging Applications Involving Incoherent Sources

For an object plane in the far field of aperture-limited collection optics, the nearly Gaussian-profile BPLO propagates from the receiver aperture according to
the condition for maximum Gaussian beam collimation. The waist is at the Fraunhofer plane, where the BPLO wavefront is nearly plane, while at the object field the BPLO wavefronts are spherical with radius equal to

R(d_o) = d_o [1 + (Z_D/d_o)^2]    [14]

where we have set d_o − f ≈ d_o.

It will generally be the case that an imaging system will be used to analyze extended objects whose size exceeds the resolution limit. Following convention, we define a spatially coherent object as one which is illuminated with spatially coherent light, with appreciable structure and roughness of depth within the coherence length of the illuminating light, and where the target motion does not appreciably perturb the speckle field during the measurement process. The speckle effects, which so seriously degrade the information content in images formed of spatially coherent objects, are not present in images formed of incoherent objects such as thermal emitters. Speckle may be reduced by time-averaging speckle fluctuations resulting from target motion, atmospheric effects, and independent transmitter pulses; in the limit, such objects become incoherent pseudothermal sources. Similar remarks apply to spatial averages over detector subarrays.

Independent radiators on the object field produce disturbances which can diffract to illuminate the entire receiver aperture, as well as a finite coherence area A_ci in the image field given by

A_ci ≈ λ^2 / Ω_ci    [15]

where Ω_ci is the solid angle subtended by the receiver aperture at the image field. Alternatively, we may interpret A_ci as the area of an aperture in the object plane which, if illuminated with wavefronts characterized by eqn [14] (reciprocal illumination), would just fill the receiver aperture with a central diffraction order. Thus the independent disturbances within the detector's field of view on the object plane combine to form a spatially coherent disturbance over the receiver aperture and within A_ci. According to the Van Cittert–Zernike theorem, the complex degree of spatial coherence μ_s is a unity-amplitude wavetrain that propagates exactly as a signal beam resulting from the reciprocal coherent illumination of an object-field aperture of area A_ci. Thus, μ_s and the BPLO are Helmholtz-reciprocal pairs whose overlap determines the mixing efficiency with which heterodyne signal is generated from radiation collected from an extended incoherent source. Following convention, we refer to the overlap of μ_s and the BPLO as the effective receiver aperture.

Detectors larger than the aperture limit, whose area exceeds A_ci, exhibit reduced effective receiver apertures and a corresponding decrease in heterodyne mixing efficiency. For example, varying the receiver pupil while maintaining constant LO illumination results in a heterodyne signal proportional to the received power for detector diameters much less than the aperture limit, while detector diameters much greater than the aperture limit produce heterodyne signal only in proportion to the square root of the received power. Thus, for constant LO illumination and incoherent sources, increasing the étendue past the aperture limit will always increase the heterodyne signal, but by less than for direct detection.

Optimal Detector Size

As noted above, an aperture-limited system is one where detectors of diameter d are matched with an objective lens with f/D = d/2λ. Commercially available detectors for use at 10 μm are commonly 100 μm wide, suggesting the use of an f/5 lens; however, aspheric lenses as fast as f/1 are also commercially available. Apart from aberrations, it would be advantageous to utilize a 5 × 5 subarray of 20 μm detectors with an f/1 lens in place of a single 100 μm detector for applications involving coherent sources. Applications involving incoherent thermal sources would generally favor a wide detector/fast lens combination, since thermal sources provide weak heterodyne signals.

Noise Analysis and Performance

Radar Range Equation

The laser power requirements for the LIDAR system depend on the signal-to-noise ratio of the FPA, the i.f. amplifier performance, atmospheric transmission, and the optical system design. The following calculations are used to determine the performance of the LIDAR system for a single pixel of the FPA. The radar range equation is

P_r = P_t (G_t / 4πR^2) (σ_t / 4πR^2) (πD^2 / 4) η_atm η_sys    [16]

where P_r = received power; P_t = transmitted power; G_t = antenna gain; σ_t = radar cross-section; R = range; D = transmitting and receiving aperture diameter; η_atm = atmospheric transmission; and η_sys = optical system efficiency.
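As a numerical sketch of eqn [16] (the parameter values below are illustrative assumptions, not figures taken from this article), the received power can be computed directly:

```python
import math

def radar_range_received_power(Pt, Gt, sigma, R, D, eta_atm, eta_sys):
    """Radar range equation, eqn [16]:
    Pr = Pt * (Gt / 4 pi R^2) * (sigma / 4 pi R^2) * (pi D^2 / 4) * eta_atm * eta_sys
    All quantities in SI units; sigma is the radar cross-section."""
    return (Pt
            * Gt / (4.0 * math.pi * R**2)
            * sigma / (4.0 * math.pi * R**2)
            * (math.pi * D**2 / 4.0)
            * eta_atm * eta_sys)

# Assumed example values: 1 W transmitted power, antenna gain 1e5,
# 1 m^2 cross-section target at 1 km range, 10 cm aperture,
# 80% atmospheric transmission, 50% optical system efficiency.
Pr = radar_range_received_power(Pt=1.0, Gt=1e5, sigma=1.0, R=1e3,
                                D=0.1, eta_atm=0.8, eta_sys=0.5)
```

Because the range enters twice, the received power falls as 1/R^4: doubling the range costs a factor of 16 in signal, which is why the transmitter power requirement is so sensitive to the operating range.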
same target. New technology developments in quantum well infrared photodetector (QWIP) detectors and arrays, and in HgCdZnTe technology, are now paving the way toward thermo-electrically cooled LWIR imaging array detectors. Also, waveguide and folded-cavity laser resonator designs are providing compact, lightweight packages for CO2 lasers. Coherent receivers that use the new focal plane arrays and compact sources provide the capability of making parallel LIDAR measurements without the added hardware and software overhead of scanning systems. In the following section, several examples are provided that demonstrate the versatility of imaging LIDAR systems that use detector arrays for measurement applications such as chemical plumes, velocity gradients, and speckle reduction.

Wideband heterodyne receivers are useful in measuring chemical species whose absorption peaks are not well aligned with the laser source line. If the species absorption peak corresponds directly with the laser line, an absorption measurement can be made using direct detection (amplitude only). For off-line (off-resonance) measurements, heterodyning is needed, where the bandwidth of the heterodyne receiver dictates how far away from the laser line an absorption peak can be measured. For example, the CO2 laser line spacing in the 9 to 12 micron region of the spectrum averages around 50 GHz. With the new QWIP detectors theoretically having heterodyne bandwidths of 30 GHz or more, full high-resolution spectral coverage for the long-wave IR is within grasp.

For chemical detection with heterodyne receivers, there are two modes of operation. The first is termed passive heterodyne radiometry, where the source is a blackbody radiator and the absorption is measured for chemical species located between the blackbody and the receiver. The second mode of operation is thermoluminescence, where a laser transmitter is used to excite a chemical species in the atmosphere and the resulting emission (luminescence) is measured with the heterodyne receiver. The advantage of imaging in both cases is the ability to capture chemical plumes in the presence of wind and other anomalies.

To illustrate the ability of heterodyne imaging receivers to spectroscopically resolve chemical plumes, Figure 6 shows a series of 10 × 10 pixel images of a small bottle of concentrated ammonium hydroxide against a 212 °C blackbody, recorded in both passive heterodyne radiometry and direct detection modes. Figure 6a shows a heterodyne image of the bottle with the top in place, whereas in Figure 6b the top is removed and the absorbing plume is clearly imaged. The heterodyne images in Figures 6a and 6b were recorded with the LO adjusted to emit the 9R(30) line, which is known to
be absorbed strongly by NH3. In Figure 6c, the same image was recorded by direct detection and fails to detect the ammonia plume. Similarly, in Figure 6d, the LO was adjusted to emit a nearby line known not to be absorbed by NH3, and the plume is again not detected. Both the objective and ancillary lenses were f/1, 38 mm, and the object field was about three meters away. The top of the bottle was approximately 2 cm in diameter and is located at the bottom of the field of view. Measurements utilizing hot ammonium hydroxide imaged against a cold background resulted in similar images (albeit at substantially reduced signal-to-noise ratios), where the plume was bright and the background was dark.

Another use of imaging LIDAR is the measurement of velocity profiles. Applications include wake vortex measurements for aircraft, wind shear detection, and vibration signature measurements for identification. In another experiment, active-mode images were recorded by illuminating the target with a portion of the LO laser beam which had been shifted in frequency by 40 MHz using an acousto-optic modulator. The resulting Doppler image is recorded using an 8 × 8 HgCdTe array with discrete electronics for each pixel element in the array. Figure 7 shows the active image of a vertical squirrel-cage fan, where each pixel is rendered to represent the peak Doppler shift measured with the detector at the corresponding location in the image plane. The moving target scatters incoherently and thus speckle effects do not degrade the image. The measured Doppler shifts are consistent with the known rotational velocity of the target.

Speckle effects associated with coherent mixing of laser light reflected from an extended target (constructive and destructive interference) are a major source of noise in LIDAR measurements. These speckle effects can be reduced by signal averaging. For single-detector LIDAR, averaging over many transmitter pulses can reduce speckle, but this technique also reduces temporal resolution. The alternative considered here is to sacrifice spatial resolution by averaging over an array of detectors, all of which record the signal returned from a single transmitter pulse. Detectors separated by a coherence area diameter will record signals that approach statistical independence, and in this case speckle effects can be reduced by an amount approaching the square root of the number of detectors in the subarray.

The following laboratory experiment demonstrates speckle reduction by spatial averaging with a fixed focal plane array. The receiver used in this experiment consists of a 3 × 3 array of HgCdTe detectors. The 50-micron diameter detectors are arranged in a square pattern, with 100 micron center-to-center spacing. Custom electronics multiplex the output of each detector for subsequent processing. As with the velocity profile experiment, LIDAR illumination is produced by frequency shifting a portion of the LO beam by 40 MHz with an acousto-optic modulator prior to illuminating the object field. Polarization of the illumination beam in the experiment is parallel to the LO prior to scattering from the target. The measured receiver bandwidth, limited mostly by the dewar design, is about 100 MHz, which permits the acquisition of the 40 MHz heterodyne signal. The collection optic consists of a 38 mm f/2 asphere, which produces a 2 mm diameter pixel image at a 3 m object distance.

The target for this experiment is a 30 cm diameter disk coated with an aluminum powder. The particle size of this powder ranges from 1 to 100 microns, as determined by electron microscopy. The disk is attached to a stepper motor that is incremented between frames to provide statistically independent speckle fields. Figure 8 shows the distribution of
measurements taken with a single pixel, superimposed on the distribution taken by averaging over the focal plane array. A reduction in speckle noise by a factor of the square root of the number of averaged pixels is obtained, as predicted.
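The square-root reduction can be reproduced with a small Monte Carlo sketch. This simulation is an illustration under the standard assumption that fully developed speckle intensity at a point detector is exponentially distributed; it is not part of the original experiment:

```python
import random

def speckle_contrast(n_detectors, n_frames=20000, seed=1):
    """Speckle contrast (std/mean) of the average of n_detectors
    statistically independent, fully developed speckle intensities.
    Each single-detector intensity is exponentially distributed
    (unit mean), for which the contrast of one detector is 1."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_frames):
        avg = sum(rng.expovariate(1.0) for _ in range(n_detectors)) / n_detectors
        samples.append(avg)
    mean = sum(samples) / n_frames
    var = sum((s - mean) ** 2 for s in samples) / n_frames
    return var ** 0.5 / mean

# For a 3x3 subarray, contrast should drop by about sqrt(9) = 3
# relative to a single detector.
c1 = speckle_contrast(1)   # close to 1 for a single detector
c9 = speckle_contrast(9)   # close to 1/3 for nine independent detectors
```

Averaging nine independent detectors reduces the speckle contrast by very nearly a factor of three, matching the square-root law demonstrated by the focal plane array measurement.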
In this chapter, an overview of imaging LIDAR systems has been presented. Imaging LIDAR system design, performance, and applications were discussed. With new technology developments in wide-bandwidth QWIP detector arrays, thermo-electrically cooled detectors, and compact sources, substantial potential exists for imaging LIDAR systems in the next generation of remote sensing. Due to the parallel, multifunction measurement capability of these new receivers, a single sensor system can provide three-dimensional information, including range, shape, velocity, and chemical composition. In addition to the sensor-fusion advantages, snapshot imaging allows the capture of high-speed, rapidly evolving events that elude present scanning systems.

See also

Environmental Measurements: Doppler lidar. Imaging: Infrared Imaging.

Further Reading

Born M and Wolf E (1964) Principles of Optics. London: Pergamon.
Forrester AT (1956) On coherence properties of light waves. American Journal of Physics 24: 192–196.
Goodman JW (1975) Statistical properties of laser speckle patterns. In: Dainty JC (ed.) Laser Speckle and Related Phenomena, Chap. 2, pp. 9–75. New York: Springer-Verlag.
Haner AB and Isenor NR (1970) Intensity correlations from pseudothermal light sources. American Journal of Physics 38: 748–750.
Holzhauser E and Massig JH (1978) An analysis of optical mixing in plasma scattering experiments. Plasma Physics 20: 867–877.
Jelalian AV (1992) Laser Radar Systems, pp. 3–5. Norwood, MA: Artech House.
Kogelnik H and Li T (1966) Laser beams and resonators. Applied Optics 5: 1550–1566.
Kusters JF, Rye BJ and Walker AC (1989) Spatial weighting in laboratory light scattering experiments. Applied Optics 28: 657–664.
Martienssen W and Spiller E (1964) Coherence and fluctuations in light beams. American Journal of Physics 32: 919–926.
Measures RM (1984) Laser Remote Sensing: Fundamentals and Applications. New York: John Wiley.
Rye BJ (1979) Antenna parameters for incoherent backscatter heterodyne lidar. Applied Optics 18: 1390–1398.
Shannon RR (1997) The Art and Science of Optical Design. New York: Cambridge University Press.
Siegman AE (1966) The antenna properties of optical heterodyne receivers. Applied Optics 5: 1588–1594.
Simpson ML, Bennett CA, Emery MS, et al. (1997) Coherent imaging with two-dimensional focal-plane arrays: design and applications. Applied Optics 36: 6913–6920.
Veldkamp WB (1983) Holographic local oscillator beam multiplexing for array heterodyne detection. Applied Optics 22: 891–900.
Veldkamp WB and Van Allen EJ (1983) Binary holographic LO beam multiplexer for IR imaging detector arrays. Applied Optics 22: 1497–1506.
Zernike F (1938) The concept of degree of coherence and its application to optical problems. Physica 5: 785–795.
Multiplex Imaging
A Lacourt, Université de Franche-Comté, Besançon, France

© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

The ability to project an image onto a screen in a darkened room has been known since antiquity, and Leonardo da Vinci (1515) precisely described principles of the camera obscura. When one exposes a paper sheet directly to sunlight, it is uniformly lighted. However, if one pierces a little hole through the paper sheet, then an image of the sun can be displayed on a screen beyond the hole (Figure 1). Similarly, the shape of the light spot we can see in a dark room, formed by the rays passing through a small aperture in a closed shutter, is not that of the aperture. Generally, the spot has a circular shape, as an image of the sun. Actually, it is neither a shadow of the hole nor an image of the sun, but a combination of both.

Light incident on an aperture contains the entire information about the light distribution of the object, but this information has to be 'deblurred', or decoded. The aperture acts as an elementary spatial 2D filter that selects and preserves the direction and intensity of light rays coming from each point of the scene. The projection on the screen of each pencil of light passing through the hole is a small light spot, the geometry of which is the same as that of the aperture, and
IMAGING / Multiplex Imaging 179
Background

Electromagnetic Wave Propagation
photographic plate. Obviously, the light distribution on the plate is blurred. Indeed, each elementary transparency of the mask projects onto the receptor an inverted image of the object, located in the shadow of the mask. Such a distribution is expressed by a convolution. Let O(x, y) be the conical projection of the object on the receptor and Z(x, y) that of the mask. The shadow distribution on the receptor is (the variables being normalized):

I_0(x', y') = Z(x, y) * O(x, y) = ∫ Z(x, y) O(x' − x, y' − y) dx dy    [5]

The reading step is a decorrelation processing of the recorded intensity (Figure 7). It consists in performing a new correlation of I_0 with an arbitrary function T:

I_0''(x, y) = T(x, y) * [Z(x, y) * O(x, y)] = [T(x, y) * Z(x, y)] * O(x, y)    [6]

If T is chosen such that T(x, y) * Z(x, y) = δ(x, y), the decoded distribution reduces to

∫ δ(x, y) O(x' − x, y' − y) dx dy = O(x', y')    [7]

by a cylindrical light beam, the FZP is a multiple-foci diffractive lens (Figure 8).

The reading step is either analogical or numerical. Through the analogical method, we consider the recorded plate as a pseudohologram (Figure 9). When the 'shadowgram' is lighted with a parallel beam of laser light, the FZPs recorded around each point concentrate the light on their foci. Then an image of the object is reconstructed in the focal planes, a method that is simple and elegant. Nevertheless, the processing is not linear, since we record an intensity distribution at the recording step, while the reading step by diffraction involves amplitude. Moreover, the S/N ratio decreases as the amount of data recorded on the photographic plate increases, because of its finite dynamic range.

Mertz and Young had the idea of using a second FZP to perform the deconvolution processing by optical methods. In this case, the SPSF, R(x, y), is the autocorrelation function of the FZP transparency (Figure 10). It presents a narrow central peak emerging from a pyramidal ground (Figure 11). The optical setup is the same as at the recording step.

Figure 8 Properties of FZP.
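The recording and reading steps of eqns [5] and [6] can be sketched numerically. The example below is hypothetical: it is one-dimensional, uses a random binary mask in place of an FZP, and takes T = Z, so the decoded signal is the object convolved with the mask autocorrelation (a sharp peak on a flat pedestal). NumPy is assumed.

```python
import numpy as np

rng = np.random.default_rng(0)

# 1D stand-ins for the 2D quantities of eqns [5]-[6]:
# O is the object, Z the mask transparency (random 0/1 pattern used
# here for illustration in place of a Fresnel zone plate).
n = 256
O = np.zeros(n)
O[100] = 4.0                     # dominant point source
O[40] = 1.0                      # weaker sources
O[180] = 0.5
Z = (rng.random(n) < 0.5).astype(float)

# Recording step, eqn [5]: shadowgram I0 = Z * O
# (circular convolution via FFT for simplicity).
I0 = np.real(np.fft.ifft(np.fft.fft(Z) * np.fft.fft(O)))

# Reading step, eqn [6], with T = Z: correlating I0 with the mask gives
# O convolved with the mask autocorrelation, i.e. a sharp peak at each
# source location sitting on a roughly constant pedestal.
decoded = np.real(np.fft.ifft(np.fft.fft(I0) * np.conj(np.fft.fft(Z))))
decoded -= decoded.mean()        # remove the pedestal

peak = int(np.argmax(decoded))   # strongest decoded sample
```

The strongest decoded sample lands at the dominant source location, illustrating why a decoding function T whose correlation with Z is sharply peaked (ideally a delta function, as in eqn [7]) recovers the object.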
correction of this distribution to match the model prediction with the actual measurement. The modeling part of the imaging problem is usually referred to as the forward problem, and the correction procedure is known as an inverse problem.

the transport equation (TE):

(1/v) ∂L(r, Ω̂, t)/∂t + ∇·[Ω̂ L(r, Ω̂, t)] + (μ_s + μ_a) L(r, Ω̂, t) = μ_s ∫ L(r, Ω̂', t) f(Ω̂, Ω̂') dΩ̂' + S(r, Ω̂, t)    [1]
approximation. The diffusion approximation is valid under the following three conditions: (a) the light sources are isotropic, or the distance between the source and the detector is much larger than l; (b) the characteristic source modulation frequencies ν are low enough that vμ'_s/(2πν) ≫ 1; and (c) μ_a ≪ μ'_s. In the biomedical applications of PDWI, condition (a) is fulfilled when the source–detector distance is larger than 1 cm. Condition (b) is valid when the characteristic modulation frequencies of the light source are less than 1 GHz. Condition (c) is fulfilled in most biological tissues, for which the scattering is highly isotropic, i.e., g_1 ≪ 1. Thus, the forward problem of PDWI, i.e., the modeling of light propagation, can be solved using the diffusion approximation in a large variety of situations.

PDWs described by equation [4] are scalar damped traveling waves, characterized by coherent changes of light energy (photon density) at different locations on time intervals from microseconds to nanoseconds. Usually, pulsed or harmonically modulated light sources are used to generate PDWs. The properties of the PDWs from a harmonic source are fundamental, because the PDW from a source with any type of modulation can be considered as a superposition of harmonic waves from sources having different frequencies, using the Fourier transform.

In order to get an idea of the basic properties of PDWs, one can consider the analytical solution of [4] for the case of a harmonically modulated point-like source placed in an infinite homogeneous scattering medium. This solution is

U(r, t) = U_dc(r) + U_ac(r) exp[−iωt + iφ(r)]    [5]

where U_ac, U_dc, and φ are the ac, dc, and phase, respectively, of the wave measured at the distance r from the source, and ω = 2πν. U_ac, U_dc, and φ depend on the medium optical properties, source intensity Q_0, modulation amplitude A_0, and initial phase ψ_0 as

U_ac(r) = (Q_0 A_0 / 4πD) exp{−r (vμ_a/2D)^(1/2) [(1 + x^2)^(1/2) + 1]^(1/2)} / r    [6a]

U_dc(r) = (Q_0 / 4πD) exp[−r (vμ_a/D)^(1/2)] / r    [6b]

φ(r) = r (vμ_a/2D)^(1/2) [(1 + x^2)^(1/2) − 1]^(1/2)    [6c]

where x = ω/(vμ_a). Equations [5]–[6] describe a spherical wave whose wavelength, at μ_a = 0.1 cm⁻¹ and μ'_s = 8 cm⁻¹ (typical for biological tissues) and at modulation frequency ν = 100 MHz, is close to 35 cm, so that the phase of the wave changes by approximately 10° per cm. At the same time, the wave decays exponentially over a distance of about 3 cm, so that PDWs are strongly damped. A similar analytical solution in the form of a spherical wave can be obtained for the case of a semi-infinite homogeneous medium with a point source placed on the surface of the medium (see Figure 1). These analytical solutions are employed to measure μ_a and μ'_s of a homogeneous medium. The method is based on fitting the measured ac, dc, and phase of the PDW, as functions of source–detector distance, by the analytical solution.

Figure 1 Equiphase surfaces of the PDW induced by a point source in a semi-infinite homogeneous medium (solid lines) and in a medium including a heterogeneity (dashed lines).

Inhomogeneities in the optical properties of a turbid medium cause distortion of the PDW from sphericity (see Figure 1). PDWs in homogeneous media of limited size, or with strong surface curvature, also usually have complex structure. For a limited number of configurations, one can obtain solutions to the DE using analytical methods of mathematical physics, such as the method of Green's functions. However, in practice, the complexity of the inhomogeneities and surfaces usually requires numerical solution of the DE. This can be achieved using numerical methods for the solution of partial differential equations, such as the finite element method and the finite difference method. The finite difference method can also be used to solve the TE. This may be necessary in situations when the DE is not valid, for example to accurately account for the influence of the cerebrospinal fluid (which has relatively low scattering) on light transport in the human skull. However, the solution of the TE requires formidable computational resources, which increase proportionally to the volume of the medium. Therefore, hybrid DE–TE methods were developed, in
188 IMAGING / Photon Density Wave Imaging
which the DE is used to describe light propagation in most of the medium except small low-scattering areas.

Neither the transport model nor the diffusion model allows inversion in a closed analytical form for a general boundary configuration and spatial distribution of optical properties. The basic reason for the complexity of the inverse problem is the complexity of the light paths in the highly scattering medium. This situation is opposite to that in CT, where the x-ray radiation propagates through the medium along straight lines, almost without scattering, which significantly simplifies the forward problem and allows its analytical inversion. From the mathematical point of view, the complexity of the inverse problem of PDWI is due to its nonlinearity and ill-posedness. Only approximate methods of inversion have been developed so far.

The simplest method is called backprojection, by analogy with the method of the same name used in CT. The method attempts to reconstruct the actual spatial distribution of the optical properties μ_a and/or μ'_s from the 'measured' bulk values μ_a^m and μ'_s^m:

{μ_a(r), μ'_s(r)} = F⁻¹[G]    [8]

G − G^e = J[{μ_a(r), μ'_s(r)}^e] {Δμ_a(r), Δμ'_s(r)} + …    [10]

where J[…] is the Jacobian, and {Δμ_a(r), Δμ'_s(r)} = {μ_a(r), μ'_s(r)} − {μ_a(r), μ'_s(r)}^e. Neglecting terms after the first one in [10], the problem of finding the correction {Δμ_a(r), Δμ'_s(r)} to the estimate {μ_a(r), μ'_s(r)}^e reduces to the calculation of the Jacobian and the solution of a linear algebraic equation.

This approach provides the basis for the PDWI algorithm, which is the following. First, one makes an initial guess at the optical properties of the medium. Then one calculates the ac, dc, and phase of the PDW using a model of light propagation through the tissue and the geometry of the source–detector configuration. The calculated values are compared with the measured ones. If the error is above a set tolerance level, one calculates corrections to the current values of μ_a and μ'_s by solving eqn [10]. The algorithm is completed when the required error level is achieved. Thus, the PDWI algorithm iteratively applies both the forward and the inverse procedures.
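The iterate-compare-correct loop just described can be sketched for a single unknown. This is a hypothetical scalar illustration: the stand-in forward model below is an arbitrary smooth function, loosely shaped like the exp[−r(vμ_a/D)^(1/2)] dependence of eqn [6b], not the article's actual multi-parameter PDW computation.

```python
import math

def forward_model(mu_a):
    # Stand-in forward model (assumption for illustration): a smooth,
    # monotonic map from an absorption coefficient to a detected
    # dc amplitude.
    return math.exp(-3.0 * math.sqrt(mu_a))

def reconstruct(measured, guess=0.2, tol=1e-10, max_iter=50):
    """One-unknown version of the PDWI iteration: linearize the forward
    model (as in eqn [10]), solve for the correction, and repeat until
    the data mismatch falls below tolerance."""
    mu = guess
    for _ in range(max_iter):
        predicted = forward_model(mu)
        error = measured - predicted
        if abs(error) < tol:
            break
        h = 1e-6                     # numerical Jacobian (scalar case)
        jacobian = (forward_model(mu + h) - forward_model(mu)) / h
        mu += error / jacobian       # linearized correction step
    return mu

true_mu_a = 0.1                      # cm^-1, a typical tissue value
mu_rec = reconstruct(forward_model(true_mu_a))
```

Starting from a deliberately wrong guess, the loop recovers the value used to generate the synthetic "measurement". In the real imaging problem the unknown is a spatial map of {μ_a(r), μ'_s(r)}, the Jacobian is a large matrix, and the linear correction step is an ill-posed system that must be regularized.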
Applications

PDWI was proposed and developed as a noninvasive biomedical imaging technique. Although the spatial resolution of PDWI cannot compete with that of CT or MRI, PDWI has the advantages of low cost, high temporal resolution, and high biochemical specificity. The latter is based on the optical spectroscopic approach. Biological tissues are relatively transparent to light in the near-infrared band between 700 and 1000 nm (the near-infrared window). The major biological chromophores in this window are oxy- and deoxyhemoglobin, and water. One can separate the contributions of these chromophores to light absorption by employing light sources at different wavelengths. Multi-wavelength PDWI can also be used to measure the redox state of cytochrome oxidase, which is the terminal electron acceptor of the mitochondrial electron transport chain and is responsible for 90% of cellular oxygen consumption.

One of the hot topics in cognitive neuroscience is the hemodynamic mechanism of the oxygen supply to neurons and the removal of products of cerebral metabolism. High temporal resolution and biochemical specificity make PDWI a unique noninvasive tool for studies of functional cerebral hemodynamics, although light can penetrate no deeper than the very surface of the adult brain. Figure 2 displays a typical arrangement of light sources and detectors used in PDWI of functional cerebral hemodynamics, and the images of the activated area in the brain.

Figure 2 (a) Geometrical arrangement of the 16 sources and two detectors on the head of a human subject. Each one of the eight small red circles (numbered 1–8) represents one pair of source fibers at 758 and 830 nm. Two larger green circles represent detector fiber bundles. The rectangle inside the head outline is the 4 × 9 cm² imaged area. (b) Maps of oxyhemoglobin ([HbO2], left panels) and deoxyhemoglobin ([Hb], right panels) concentration changes at rest (top) and at the time of maximum response during the third tapping period (bottom). The measurement protocol involved hand-tapping with the right hand (contralateral to the imaged brain area). Adapted with permission from Franceschini MA, Toronov V, Filiaci M, Gratton E and Fantini S (2000) On-line optical imaging of the human brain with 160-ms temporal resolution. Optics Express 6(3): 49–57 (Figures 1 and 4).

Because of its low cost, noninvasiveness, and the compact size of the optical sensors, PDWI is also considered to be a promising tool for mammography and for structural imaging of anomalies in the brain. PDWI is an especially promising tool for neonatology, because infant tissues have low absorption.

PDWI instruments employ lasers, light-emitting diodes, and gas-discharge tubes as sources of light. Among the detectors there are photomultiplier
tubes, avalanche photodiodes, and CCD cameras. Both sources and detectors can be coupled with the tissue by means of optical fibers. A number of commercially produced near-infrared instruments for PDWI are available on the market. According to the type of source modulation, there are three groups of instruments: frequency domain (harmonically modulated sources), time domain (pulsed sources), and steady state. Steady-state instruments are simpler and cheaper than frequency-domain and time-domain ones, but their capabilities are much lower. Time-domain instruments can theoretically provide a maximum of information about the tissue structure, but technically it is difficult to build a time-domain instrument with high temporal resolution and high signal-to-noise ratio. Frequency-domain instruments provide temporal resolution up to several milliseconds and have high quantitative accuracy in the determination of optical properties.

Nomenclature

Photon mean free path, l [cm]
Radiance, L(r, Ω̂, t) [W cm⁻² steradian⁻¹]
Radiation fluence, Φ(r, t) [W cm⁻²]
Radiation flux, J(r, t) [W cm⁻²]
Reduced scattering coefficient, μ'_s [cm⁻¹]
Scattering coefficient, μ_s [cm⁻¹]
Set of measured values (ac, dc, and phase), G
Set of predicted values (ac, dc, and phase), G^e
Source term, Q(r, t) [J cm⁻³ s⁻¹]
Source–detector distance, r [cm]
Spatial coordinate, r [cm]
Speed of light in medium, v [cm/s]
Spherical harmonics, Y_nm(Ω̂)
Transport equation (TE)
Weight function, h(r)
x = ω/(vμ_a)

See also

Coherence: Coherence and Imaging. Fiber and Guided Wave Optics: Light Propagation. Imaging: Imaging Through Scattering Media. Optical Materials: Heterogeneous Materials. Scattering: Scattering Theory.
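The orders of magnitude quoted after eqns [6a]–[6c] can be checked numerically. The refractive index and the diffusion-coefficient convention D = v/[3(μ_a + μ'_s)] used below are assumptions, so the result reproduces the quoted ~35 cm wavelength and ~10°/cm phase slope only to within such conventional choices:

```python
import math

# Typical tissue parameters quoted in the text
mu_a = 0.1            # absorption coefficient [1/cm]
mu_s_prime = 8.0      # reduced scattering coefficient [1/cm]
nu = 100e6            # modulation frequency [Hz]
v = 3e10 / 1.4        # assumed speed of light in tissue [cm/s], n ~ 1.4

D = v / (3.0 * (mu_a + mu_s_prime))   # assumed diffusion coefficient convention
omega = 2.0 * math.pi * nu
x = omega / (v * mu_a)                # as defined in the nomenclature

# Damping and phase coefficients read off eqns [6a] and [6c]
k_damp = math.sqrt(v * mu_a / (2.0 * D)) * math.sqrt(math.sqrt(1.0 + x * x) + 1.0)
k_phase = math.sqrt(v * mu_a / (2.0 * D)) * math.sqrt(math.sqrt(1.0 + x * x) - 1.0)

wavelength = 2.0 * math.pi / k_phase  # PDW wavelength [cm]
phase_slope_deg = 360.0 / wavelength  # phase change [degrees per cm]
```

With these conventions the wavelength comes out at a few tens of centimeters and the phase slope near 10°/cm, the same order as the values quoted in the text; the damping coefficient always exceeds the phase coefficient, which is why PDWs are strongly damped.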
of wave optical effects, notably diffraction. Therefore, the following sections provide a description of field transformations based on wave optics.

Systems Approach to 3D Imaging

In this article we deal with linear imaging systems. Hence, the relation between three-dimensional (3D) input (object) signals and 3D output (image) signals can be completely specified by the 3D impulse response, or point spread function (PSF), of the system, h(x, y, z; x_0, y_0, z_0). This function specifies the response of the system at image-space coordinates (x, y, z) to a δ-function input (point source) at object coordinates (x_0, y_0, z_0). The object and image signals of the optical system are related by the superposition integral:

i(x, y, z) = ∫∫∫_V o(x', y', z') h(x, y, z; x', y', z') dx' dy' dz'    [3]

where V is the object domain. It is clear that, for a generic linear imaging system, the image signal is completely specified by the response to input impulses located at all possible points of the object domain.

If the system is shift invariant, the PSF due to a point source located anywhere in the object domain will be a linearly shifted version of the response due to a reference point source. This approximation is in general not valid, and the system's PSF is a function of the position of the point source in the object domain. Shift variance is traditionally divided into radial and axial shift variance. Radial shift variance is the change in the image of a point source as a function of its radial distance from the optical axis; it is caused by lens aberrations. Axial shift variance is the change in the image of a point source as a function of its axial position, z_0, in the object space. Axial shift variance is inherent to any rotationally symmetric lens system, because such a system can have only one pair of conjugate surfaces, i.e., object–image corresponding surfaces.

In most practical cases the image is detected at a single plane, z = z_1, transverse to the optical axis Oz. In those situations, we need to specify the impulse response on this plane:

h(x, y, z_1; x_0, y_0, z_0) = h_z0(x, y; x_0, y_0)    [4]

The condition for lateral shift invariance is then expressed as h_z0(x, y; x_0, y_0) = h̃_z0(x − x_0, y − y_0). In this case, the superposition integral is a two-dimensional (2D) convolution, and it is possible to define the transfer function of the system, which is the 2D Fourier transform of the impulse response h̃_z0(x − x_0, y − y_0).

Point Spread Function for Coherent, Incoherent, and Partially Coherent Illumination

The input–output relations of an optical imaging system relate different magnitudes according to the type of illumination. If the illumination is coherent, the system is linear in the complex amplitude, while if the illumination is fully incoherent, the system is linear in intensity. The 3D impulse responses of the incoherent (h_Inc) and coherent (h_Coh) systems are related as follows:

h_Inc(x, y, z; x', y', z') = |h_Coh(x, y, z; x', y', z')|^2    [5]

Notwithstanding, many optical systems cannot be considered fully coherent or fully incoherent. In these situations the system is said to be partially coherent. If the optical path differences within the optical system are much shorter than the coherence length of the illumination, the light effectively has complete temporal coherence, and the magnitude of interest is the mutual intensity. The mutual intensity is defined as the statistical correlation of the complex wavefunction U(r, t) between different points in space:

J(r_1, r_2) = ⟨U*(r_1, t) U(r_2, t)⟩    [6]

where r_i = (x_i, y_i, z_i), i = 1, 2. The illumination is then called quasi-monochromatic. For the case of partially coherent quasi-monochromatic illumination, the system is linear in the mutual intensity. The partially coherent impulse response (h_PC) is related to the coherent impulse response as follows:

h_PC(r_1, r_2; r'_1, r'_2) = h*_Coh(r_1, r'_1) h_Coh(r_2, r'_2)    [7]

The 3D superposition integral that relates object and image mutual intensities is now:

J_image(r_1, r_2) = ∫∫∫_V h_PC(r_1, r_2; r'_1, r'_2) J_object(r'_1, r'_2) dr'_1 dr'_2    [8]

Therefore, we conclude that knowledge of the coherent PSF is enough to characterize the optical system for illumination of various degrees of coherence.

Spatial Frequency Content of the 3D Point Spread Function

As previously stated, the coherent impulse response of the optical system completely defines its imaging characteristics. Therefore, it is justified to wonder
IMAGING / Three-Dimensional Field Transformations 193
what the limitations are in the specification of this The previous derivation did not consider any
response. In this section we analyze the frequency limitation in the range of spatial frequencies allowed
domain of coherent and incoherent impulse through the system. We now assume a finite aperture
responses. that will introduce an effective radial cutoff frequency
The impulse response of the optical system will fc , 1=l and will further limit the spatial spectral
satisfy the corresponding wave equations regardless domain of hðx; y; zÞ: This is depicted in Figure 1. Note
of the particulars of the optical system. In particular, that the transverse and longitudinal spatial frequen-
assuming a homogeneous isotropic medium in the cies are related, hence the longitudinal spatial
imaging domain, the coherent response hðrÞ to an frequenciesqffiffiffiffiffiffiffiffiffiffiffi
are located
ffi within a bandpass of width
impulse at r0 will satisfy the homogeneous Helmholtz Dfz ¼ 1 2 l 2 fc :
22 2
equation in the image domain: Let us now consider the frequency content
of possible incoherent PSFs. From eqn [5] we derive
72 hðrÞ þ k2 hðrÞ ¼ 0 ½9
the 3D Fourier transform of the incoherent impulse
where hðrÞ is the complex amplitude and k ¼ 2p=l; response:
with l the wavelength in the medium. If the field at a
given origin of coordinates z ¼ 0; f ðx; yÞ; is known, { }
HInc ðfx ; fy ; fz Þ ¼ FT hInc ðx; y; z; x0 y0 ; z0 Þ
hðrÞ can be calculated using the angular spectrum 0 0 0 2
¼ FT{lhCoh ðx; y; z; x y ; z Þl }
representation as follows:
¼ H^Hðfx ; fy ; fz Þ ½14
ð ð1
hðx; y; zÞ ¼ Fðfx ; fy Þ where ^ represents a 3D correlation. The domain
21
of HInc now becomes a volume of revolution as
£ exp½i2p ðfx x þ fy y þ fz zÞdfx dfy ½10
shown in Figure 2. Moreover, because hInc is real,
qffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi
HInc is conjugate symmetric, i.e., HInc ¼ HIncp
: Now,
where fz ¼ l22 2 ðfx2 þ fy2 Þ and Fðfx ; fy Þ ¼ Inc
the transverse cutoff frequency is fc ¼ 2fc while
FT{f ðx; yÞ}; FT represents a 2D Fourier transform- the longitudinal cutoff frequency is fclInc ¼ 2Dfz :
ation and fx ; fy are the spatial frequencies along the x
and y axes respectively.
Equation [10] can be written as a 3D integral as
follows:
ð ð1 qffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi
hðx; y; zÞ ¼ Fðfx ; fy Þ·d fz 2 l22 2 fx2 2 fy2
21
£ exp½i2p ðfx x þ fy y þ fz zÞdfx dfy dfz
½11
where d represents the dirac impulse. Next we
represent the 3D impulse response hðx; y; zÞ as a 3D
Fourier decomposition:
ð ð1 Figure 1 Three-dimensional spatial-frequency domain of the
hðx; y; zÞ ¼ Hðfx ; fy ; fz Þ coherent point spread function.
21
£ exp½i2p ðfx x þ fy y þ fz zÞdfx dfy dfz
½12
where Hðfx ; fy ; fz Þ ¼ FT 3D {hðx; y; zÞ}: Comparing
eqns [11] and [12] we can infer that:
qffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi
Hðfx ; fy ; fz Þ ¼ Fðfx ; fy Þ·d fz 2 l22 2fx2 2fy2 ½13
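Eqn [10] maps directly onto an FFT-based computation. The sketch below is an illustrative implementation of angular spectrum propagation; the grid size, sampling, wavelength, and Gaussian test field are assumed values, not taken from the article. Components with f_x² + f_y² > λ⁻² are evanescent and are discarded.

```python
# Illustrative FFT-based sketch of the angular spectrum propagation of
# eqn [10]. Grid size, sampling, wavelength, and the Gaussian test field
# are assumed values, not taken from the article.
import numpy as np

def angular_spectrum_propagate(f0, wavelength, dx, z):
    """Propagate the sampled field f0, given at z = 0, to the plane z."""
    n = f0.shape[0]
    freqs = np.fft.fftfreq(n, d=dx)                 # f_x (and f_y) samples
    fxx, fyy = np.meshgrid(freqs, freqs, indexing="ij")
    F = np.fft.fft2(f0)                             # F(f_x, f_y) = FT{f(x, y)}
    arg = wavelength**-2 - fxx**2 - fyy**2          # f_z^2; negative -> evanescent
    fz = np.sqrt(np.maximum(arg, 0.0))
    # Keep propagating components only and apply the phase exp(i 2 pi f_z z):
    F = np.where(arg > 0.0, F * np.exp(1j * 2.0 * np.pi * fz * z), 0.0)
    return np.fft.ifft2(F)

x = (np.arange(64) - 32.0) / 8.0
field = np.exp(-x[:, None] ** 2 - x[None, :] ** 2).astype(complex)
out = angular_spectrum_propagate(field, wavelength=0.5e-6, dx=1e-5, z=1e-3)
```

With this sampling every component lies below the 1/λ cutoff, so the propagation is unitary and the field's total power is conserved.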
Diffraction Calculation of the 3D Point Spread Function

The 3D-image field created by a point source in the object space can be predicted by proper modeling of the optical system. In the absence of aberrations, the distortion of the PSF originates in the diffraction created by the limiting aperture of the optical system. Diffraction is thus the ultimate limitation to the resolution of the optical system. Under the design conditions of the system, i.e., assuming all aberrations are corrected, the image of a point source is the diffraction pattern produced when a spherical wave converging to the (geometrical) image point is diffracted by the limiting aperture of the system. This is called a diffraction limited image.

There are many formulations to describe the 3D diffraction pattern obtained with a converging wave through an aperture. Here we review the Kirchhoff formulation. Let us consider the aperture A and the field at a point P_d originated from a converging spherical wave focused at point P_F (see Figure 3). We define the vectors r joining a generic point on the aperture P_0 and P_F, and s joining P_0 and P_d. The Kirchhoff formulation states that the complex amplitude U(P_d/P_F) is the integral over the aperture of the complex amplitude at each point propagated to P_d by a spherical wave exp(iks)/s modified by the directional factor cos(n̂, r) − cos(n̂, s); n̂ is the unit vector normal to the aperture:

U(P_d/P_F) = (−iC/2λ) ∫∫_A [exp(−ik(r − s))/(rs)] [cos(n̂, r) − cos(n̂, s)] dS   [15]

where r = ‖r‖, s = ‖s‖, and C is the amplitude of the spherical wave.

The directional factor varies slowly and in many situations can be approximated by a constant equal to 2 for small image-field angles. The amplitude over the aperture is also slowly varying and can be approximated by a constant, i.e., C/r ≈ C_0. Different formulations of the Kirchhoff integral differ in the estimates of the factor 1/s and the phase k(r − s). For example, one such formulation leads to the following expression for the 3D diffraction pattern for a focal point on the axis of an optical system with circular aperture:

U(P_d/P_F) = (−ika²C_0/z_d) exp[ik(z_d − z_F + r_d²/2z_d)] ∫_0^1 J_0(kaρr_d/z_d) exp[−ika²ρ²(z_d − z_F)/2z_d z_F] ρ dρ   [16]

where a is the radius of the aperture, r_d is the radial coordinate of P_d, and ρ is the normalized radial coordinate of P_0.

Figure 3 Diffraction of a spherical wave by a planar aperture. P_0 is a generic point on the aperture A, with unit normal n̂; the wave converges toward the focal point P_F(x_F, y_F, z_F) on the z axis and the field is evaluated at P_d(x_d, y_d, z_d).

Figure 4 Lines of equal intensity in a meridional plane near the focus of a converging spherical wave diffracted at a circular aperture (λ = 0.5 μm, a = 0.5 cm, z_F = 4 cm).
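The focal-region integral of eqn [16] is easy to evaluate numerically. The sketch below restricts attention to the optical axis (r_d = 0, so the J_0 factor is unity) and uses the parameters of Figure 4; setting C_0 = 1 and the sample count are assumptions.

```python
# Numerical sketch of the focal-region integral of eqn [16] on the optical
# axis (r_d = 0, so the J0 factor is 1). Parameters follow Figure 4
# (a = 0.5 cm, z_F = 4 cm, wavelength 0.5 um); C0 = 1 and the sample
# count are assumptions.
import numpy as np

def axial_field(z_d, z_F=0.04, a=0.005, wavelength=0.5e-6, n=4000):
    k = 2.0 * np.pi / wavelength
    rho = np.linspace(0.0, 1.0, n)     # normalized radial aperture coordinate
    phase = -k * a**2 * rho**2 * (z_d - z_F) / (2.0 * z_d * z_F)
    integral = np.sum(np.exp(1j * phase) * rho) * (rho[1] - rho[0])
    prefactor = (-1j * k * a**2 / z_d) * np.exp(1j * k * (z_d - z_F))
    return prefactor * integral

I_focus = abs(axial_field(0.04)) ** 2      # at the geometric focus
I_defocus = abs(axial_field(0.0401)) ** 2  # 0.1 mm of defocus
```

Even 0.1 mm of defocus drops the axial intensity well below its focal value, consistent with the tight focal region mapped in Figure 4.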
IMAGING / Volume Holographic Imaging 195
hologram, described in Figure 2. The volume hologram is created by interfering a spherical wave and a planewave, as shown in Figure 2a. The spherical wave originates at the coordinate origin. The planewave is off-axis and its wavevector lies on the xz plane. As in most common holographic systems, the two beams are assumed to be at the same wavelength λ, and mutually coherent. The volume hologram results from exposure of a photosensitive material to the interference of these two beams.

First, assume that the object is a simple point source. The intermediate image is also approximately a point source, which we refer to as the 'probe,' located somewhere in the vicinity of the reference point source. Assuming the wavelength of the probe source is the same as that of the reference and signal beams, volume diffraction theory shows that:

(i) If the probe point source is displaced in the y direction relative to the reference point source, the image formed by the volume hologram is also displaced by a proportional amount. Most common imaging systems would be expected to operate this way.

(ii) If the probe point source is displaced in the x direction relative to the reference point source, the image disappears (i.e., the detector plane remains dark).

(iii) If the probe point source is displaced in the z direction relative to the reference point source, a defocused and faint image is formed on the detector plane. 'Faint' here means that the fraction of energy of the defocused probe transmitted to the detector plane is much smaller than the fraction that would have been transmitted if the probe had been at the origin.

Now consider an extended, monochromatic, spatially incoherent object and intermediate image, as in Figure 2f. According to the above description, the volume hologram acts as a 'Bragg slit' in this case: because of Bragg selectivity, the volume hologram transmits light originating from the vicinity of the y axis, and rejects light originating anywhere else. For the same reason, the volume hologram affords depth selectivity (like the pinhole of a confocal microscope). The width of the slit is determined by the recording geometry and the thickness of the volume hologram. For example, in the transmission recording geometry of Figure 2a, where the reference beam originates a distance z_0 away from the hologram, the planewave propagates at angle θ with respect to the optical axis z (assuming θ ≪ 1 radian), and the hologram thickness is L (assuming L ≪ z_0), the width of the slit is found to be

Δx ≈ λz_0/Lθ   [2]

The imaging function becomes richer if the object is polychromatic. In addition to its Bragg slit function, the volume hologram then exhibits dispersive behavior, like all diffractive elements. In this particular case, dispersion causes the hologram to image simultaneously multiple Bragg slits, each at a different color and parallel to the original slit at wavelength λ, but displaced along the z axis. Light from all these slits finds itself in focus at the detector plane, thus forming a 'rainbow image' of an entire slice through the object, as shown in Figure 2g.

To further exploit the capabilities of volume holograms, we recall that in general it is possible to 'multiplex' (superimpose) several volume gratings within the same volume by successive exposures. In the imaging context, suppose that we multiplex several gratings similar to the grating described in Figure 2a but with spherical reference waves originating at different locations and plane signal waves at different orientations. When the multiplexed volume hologram is illuminated by an extended polychromatic source, each grating forms a separate image of a rainbow slice, as described earlier. By spacing appropriately the angles of propagation of the plane signal waves, we can ensure that the rainbow images are formed on nonoverlapping areas on the detector plane, as shown in Figure 2h. This device is now performing true four-dimensional (4D) imaging: it is separating the spatial and spectral components of the object illumination so that they can be measured independently by the detector array. Assuming the photon count is sufficiently high and the number of detector pixels is sufficient, this 'spatio-spectral slicing' operation can be performed in real time, without need for mechanical scanning.

The complete theory of VHI in the spectral and spatial domains is given in the Further Reading section below. Experimental demonstrations of VHI have been performed in the context of a confocal microscope where the volume hologram performs the function of a 'Bragg pinhole' as well as a real-time 4D imager. The limited diffraction efficiency η of volume holograms poses a major concern for VHI systems. Holograms, however, are known to act as matched filters. It has been shown that the matched filtering nature of volume holograms as imaging elements is superior to other filtering elements (e.g., pinholes or slits) in an information-theoretic sense. Efforts are currently underway to strengthen this property by design of volume holographic imaging elements which are more elaborate than described here.

Figure 2 (a) Recording of a transmission-geometry volume hologram with a spherical wave and an off-axis plane wave with its wave-vector on the xz plane. (b) Imaging of a probe point source that replicates the location and wavelength of the reference point source using the volume hologram recorded in part (a). (c) Imaging of a probe point source at the same wavelength but displaced in the y direction relative to the reference point source. (d) Imaging of a probe point source at the same wavelength but displaced in the x direction relative to the reference point source. (e) Imaging of a probe point source at the same wavelength but displaced in the z direction relative to the reference point source. (f) 'Bragg slitting:' Imaging of an extended monochromatic, spatially incoherent object using a volume hologram recorded as in (a). (g) Joining Bragg slits from different colors to form rainbow slices: Imaging of an extended polychromatic object using a volume hologram recorded as in (a). (h) Multiplex imaging of several slices using a volume hologram formed by multiple exposures.

See also

Holography, Techniques: Overview. Imaging: Three-Dimensional Field Transformations.

Further Reading

Barbastathis G and Brady DJ (1999) Multidimensional tomographic imaging using volume holography. Proceedings of the IEEE 87(12): 2098-2120.
Barbastathis G and Sinha A (2001) Information content of volume holographic images. Trends in Biotechnology 19(10): 383-392.
Barbastathis G, Balberg M and Brady DJ (1999) Confocal microscopy with a volume holographic filter. Optics Letters 24(12): 811-813.
Barbastathis G, Sinha A and Neifeld MA (2001) Information-theoretic treatment of axial imaging. In: OSA Topical Meeting on Integrated Computational Imaging Systems, pp. 18-20. Albuquerque, NM.
Liu W, Psaltis D and Barbastathis G (2004) Real time spectral imaging in three spatial dimensions. Optics Letters, in press.
van der Lugt A (1964) Signal detection by complex spatial filtering. IEEE Transactions on Information Theory IT-10(2): 139-145.
200 IMAGING / Wavefront Sensors and Control (Imaging Through Turbulence)
… telescope. As a result, the effective spatial resolution in long-exposure images in the presence of atmospheric turbulence is given by λ/r_o, not λ/D.

Atmospheric turbulence distorts a field's wavefront by delaying it in a way that is spatially and temporally random. The origin of these delays is the random variation in the index of refraction of the atmosphere as a result of turbulent air motion brought about by temperature variations. A Fourier optics imaging model is useful in characterizing the effects of atmospheric turbulence on spatial resolution in images. Let i(x) be an image of an object o(x), where x is a two-dimensional spatial vector, and let I(f) and O(f) be their spatial Fourier transforms, where f is a two-dimensional spatial frequency vector. In the absence of atmospheric turbulence, the diffraction-limited Fourier transform I_d(f) is given by

I_d(f) = O(f) H_d(f)   [1]

where the subscript 'd' denotes the diffraction-limited case and where H_d(f) is the diffraction-limited telescope transfer function. In the presence of atmospheric turbulence, the Fourier transform of a long-exposure image i_le(x) is given by

I_le(f) = O(f) H_d(f) H_le(f) = I_d(f) H_le(f)   [2]

where the subscript 'le' denotes long exposure. If the turbulence satisfies Kolmogorov statistics, as is typically assumed, then:

H_le(f) = exp[−3.44 (λd|f|/r_o)^{5/3}]   [3]

where d is the distance between the imaging system's exit pupil and detector plane. As can be seen from eqn [2], the effect of atmospheric turbulence on a long-exposure image is a multiplication of the diffraction-limited image's Fourier transform by a transfer function due to atmospheric turbulence. In Figure 5, an amplitude plot of a diffraction-limited transfer function is displayed along with an amplitude plot of a long-exposure transfer function for a 2.3 m diameter telescope, an imaging wavelength of 0.5 μm, and an r_o value of 10 cm. Notice that the long-exposure transfer function significantly reduces the amount of high spatial frequency information in an image. Diffraction-limited and long-exposure turbulence-limited images of a binary star are shown in Figures 6 and 7, illustrating the significant loss of spatial resolution brought on by atmospheric turbulence.

Figure 7 Computer-simulated long-exposure image of a binary star.

Labeyrie's key insight came from analysis of short-exposure star images. In these images (see Figure 3), a number of speckles are present, where the individual speckle sizes are roughly the size of the diffraction-limited spot of the telescope. From this, he realized that high spatial frequency information is retained in short-exposure images, albeit distorted. He then showed that the energy spectra E_I(f) of the individual short-exposure images can be averaged to obtain an estimate of the average energy spectrum that retains high signal-to-noise ratio (SNR) information at high spatial frequencies. The square root of the average energy spectrum (the amplitude spectrum) of a star, calculated using Labeyrie's method (also known as speckle interferometry), is shown in Figure 5, along with the diffraction-limited and long-exposure amplitude spectra. Notice that meaningful high spatial frequency information is retained in the average amplitude spectrum out to the diffraction limit of the telescope. Because the atmospherically corrupted amplitude spectrum estimate is significantly lower than the diffraction-limited amplitude, a number of short-exposure images must be used to build up the SNRs at the higher spatial frequencies.

To reconstruct an image, both the Fourier amplitudes and Fourier phases of the image must be known. In fact, it is well known that the Fourier phase of an image carries most of the information in the image. Because only the Fourier amplitudes are available using Labeyrie's method, other techniques were subsequently developed to calculate the Fourier phase spectrum from a sequence of short-exposure images. Techniques that combine amplitude spectrum estimation using Labeyrie's method along with phase spectrum estimation are called speckle imaging techniques. The two most widely used phase spectrum estimation methods are the Knox-Thompson (or cross spectrum) and bispectrum algorithms. Both of these algorithms calculate Fourier-domain quantities that retain high spatial frequency information about the Fourier phase spectrum when averaged over multiple short-exposure images. The cross spectrum and bispectrum techniques calculate the average cross spectrum C(f, Δf) and bispectrum B(f_1, f_2), respectively, that are given by

C(f, Δf) = I(f) I*(f + Δf)   [4]

… are the cross spectrum elements obtained with sufficiently high SNRs to be useful in the reconstruction process. The equivalent constraint on the bispectrum elements is that either f_1, f_2, or f_1 + f_2 must be less than r_o/λ. Once the average cross spectrum or bispectrum is calculated from a series of short-exposure images, the phase spectrum φ_I(f) of the underlying image can be calculated from their phases using one of a number of algorithms based upon the following equations:

φ_I(f + Δf) = φ_I(f) − φ_C(f + Δf)   [6]

for the cross spectrum and:

φ_I(f_1 + f_2) = φ_I(f_1) + φ_I(f_2) − φ_B(f_1, f_2)   [7]

for the bispectrum. Because the phases are only known modulo 2π, phasor versions of eqns [6] and [7] must be used. For the cross spectrum technique, each short-exposure image must be centroided prior to calculating its cross spectrum and adding it to the total cross spectrum. The bispectrum technique is insensitive to tilt, so no centroiding is necessary; however, the bispectrum technique requires setting two image phase spectrum values to arbitrary values to initialize the phase reconstruction process. Typically, the image phase spectrum values at f = (0, 1) and f = (1, 0) are set equal to zero, which centroids the reconstructed image but does not affect its morphology.

To obtain an estimated Fourier transform I_e(f) of the image from the energy and phase spectrum estimates, the square root of the average energy spectrum is combined with the average phase spectrum calculated from either the bispectrum or cross spectrum. The result is given by

I_e(f) = [E_I(f)]^{1/2} exp[jφ_I(f)]   [8]
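The phase recursion of eqn [6] and the recombination of eqn [8] can be illustrated in one dimension. The sketch below assumes a noiseless, centroided single frame (so one frame stands in for the ensemble average) and a toy two-point object; both are illustrative choices, not from the article.

```python
# One-dimensional sketch of cross-spectrum phase recovery (eqn [6]) and the
# amplitude/phase recombination of eqn [8]. A single noiseless, centroided
# frame stands in for the ensemble average; the two-point object is a toy.
import numpy as np

obj = np.zeros(64)
obj[30], obj[34] = 1.0, 0.6             # toy 1-D "binary star"
I = np.fft.fft(obj)                     # Fourier transform of the image

# Cross spectrum for the unit offset df = 1: C(f, df) = I(f) I*(f + df),
# whose phase carries the difference phi_I(f) - phi_I(f + 1):
C = I * np.conj(np.roll(I, -1))
phi_C = np.angle(C)

# Eqn [6] recursion; phi_I(0) = 0 because I(0) is real and positive:
phi = np.zeros(64)
for f in range(63):
    phi[f + 1] = phi[f] - phi_C[f]

# Eqn [8]: combine the amplitude spectrum with the recovered phase spectrum:
I_e = np.abs(I) * np.exp(1j * phi)
recovered = np.fft.ifft(I_e).real
```

Because only phasors enter eqn [8], the modulo-2π ambiguity of the accumulated phases is harmless, and the recovered array reproduces the toy object.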
where j = √−1. The result in eqn [8] includes atmospheric attenuation of the Fourier amplitudes, so it must be divided by the amplitude spectrum of the atmosphere to obtain the desired object spectrum estimate. The atmospheric amplitude spectrum estimate is obtained by using Labeyrie's technique on a sequence of short-exposure images of a nearby unresolved star. No phase spectrum correction is necessary because it has been shown theoretically and verified experimentally that atmospheric turbulence has a negligible effect upon φ_I(f).

Blind Deconvolution

Speckle imaging techniques are effective as well as easy and fast to implement. However, it is necessary to obtain a separate estimate of the atmospheric energy spectrum using an unresolved star in order to reconstruct a high-resolution image. Usually, the star is not co-located with the object and the star images are not collected simultaneously with the images of the object. As a result, the atmospheric energy spectrum estimate may be inaccurate. At the very least, collecting the additional star data expends valuable observation time. In addition, there is often meaningful prior knowledge about the object and the data, such as the fact that measured intensities are positive (positivity), that many astronomical objects are imaged against a black background (support), and that the measured data are band-limited, among others. Speckle imaging algorithms do not exploit this additional information. As a result, a number of techniques have been developed to jointly estimate the atmospheric blurring effects as well as an estimate of the object being imaged using only the short-exposure images of the object and available prior knowledge. These techniques are called blind deconvolution techniques. Two of the most common techniques are known as multi-frame blind deconvolution (MFBD) and projection-based blind deconvolution (PBBD).

The MFBD algorithm is based upon maximizing a figure of merit to determine the best object and atmospheric short-exposure point spread function (PSF) estimates given the data and available prior knowledge. This figure of merit is created using a maximum likelihood problem formulation. Conceptually, this technique produces estimates of the object and the atmospheric turbulence short-exposure PSFs that most likely generated the measured short-exposure images. Mathematically, the MFBD algorithm obtains these estimates by maximizing the conditional probability density function (pdf) p[i_m(x)|o(x), h_m(x)], m = 1, …, M, where i_m(x) is the mth of M short-exposure images and is given by

i_m(x) = h_m(x) * o(x) + n_m(x)   [9]

where h_m(x) and n_m(x) are the atmospheric/system blur and detector noise functions for the mth image, respectively, and the asterisk denotes convolution. The noise is assumed to be zero mean with variance σ²(x). The conditional pdf p[i_m(x)|o(x), h_m(x)], m = 1, …, M is the pdf of the noise with a mean equal to h_m(x) * o(x) and, as a function of o(x) and h_m(x), it is known as a likelihood function. For spatially independent Gaussian noise, as is the case for camera read noise, the likelihood function is given by

L_G[o(x), h_m(x); m = 1, …, M] = ∏_m ∏_x (2πσ²(x))^{−1/2} exp{−[i_m(x) − (o * h_m)(x)]² / 2σ²(x)}   [10]

For Poisson noise, the likelihood function is given by

L_P[o(x), h_m(x); m = 1, …, M] = ∏_m ∏_x (o * h_m)(x)^{i_m(x)} exp{−(o * h_m)(x)} / i_m(x)!   [11]

In practice, logarithms of eqns [10] and [11] are taken prior to searching for the estimates of o(x) and h_m(x) that maximize the likelihood function.

In general, there are an infinite number of estimates of o(x) and h_m(x) that reproduce the original dataset when convolved together. However, when additional prior knowledge is included to constrain the solution space, such as positivity, support, and the knowledge that h_m(x) is band-limited, along with regularization, the MFBD algorithm almost always converges to the correct estimates of o(x) and h_m(x).

The PBBD technique also uses prior knowledge constraints to jointly estimate o(x) and h_m(x). However, instead of creating and maximizing a likelihood function, the PBBD technique uses the method of convex projections to find estimates of o(x) and h_m(x) that are consistent with the measured short-exposure images and all the prior knowledge constraints. The data consistency constraint and the prior knowledge constraints are mathematically formulated as convex sets and the PBBD technique repeatedly takes the current estimates of o(x) and h_m(x) and projects them onto the convex constraint sets. As long as all the sets are truly convex, the algorithm is guaranteed to converge to a solution that is in the intersection of all the sets, as long as the algorithm doesn't stagnate at erroneous solutions called traps. These traps are the projection equivalent of local minima in cost function minimization algorithms.

A key component of this algorithm is the ability to formulate the measured short-exposure images and prior knowledge constraints into convex sets. As an example, consider the convex set C_pos[u(x)] that corresponds to the positivity constraint. This set is given by

C_pos[u(x)] ≜ {u(x) such that u(x) ≥ 0}   [12]

It is straightforward to show that C_pos[u(x)] is a convex set. Other constraints can be converted into convex sets in a similar manner. Although all constraints can be formulated into sets, most but not all of these sets are convex.

The projection process is straightforward to implement. For example, the projections of the current estimates of o(x) and h_m(x) onto C_pos[u(x)] are carried out by zeroing out the negative values in these estimates. Similarly, the projections of o(x) and h_m(x) onto the convex sets that correspond to their support constraints are carried out by zeroing out all their values outside of their supports. In practice, some versions of PBBD algorithms have been formulated to use minimization routines as part of the projection process in order to minimize the possibility of stagnation.

Deconvolution from Wavefront Sensing

A key distinguishing feature of post-detection compensation algorithms is the type of additional data required to estimate and remove atmospheric blurring in the measured short-exposure images. Blind deconvolution techniques need no additional data. Speckle imaging techniques estimate average atmospheric blurring quantities and need an estimate of the average atmospheric energy spectrum. Because average quantities are calculated when using speckle imaging techniques, only an estimate of the average atmospheric energy spectrum is needed, not the specific energy spectra that corrupted the measured short-exposure images. As a result, the additional data need not be collected simultaneously with the short-exposure images. The technique described in this section, deconvolution from wavefront sensing (DWFS), seeks to deconvolve the atmospheric blurring in each short-exposure image by estimating the specific atmospheric field phase perturbations that blur each short-exposure image. Thus, the additional data must be collected simultaneously with the short-exposure images. The atmospheric field phase perturbations are calculated from data obtained with a wavefront sensor using a different region of the optical spectrum than used for imaging, so that no light is taken from the imaging sensor. Most commonly, the wavefront sensor data consists of the first derivatives of the phasefront (phase differences) as obtained with Shack-Hartmann or shearing interferometer wavefront sensors, but curvature sensing wavefront sensors are also used that produce data that are proportional to the second derivatives of the phasefront.

The original algorithm to implement the DWFS technique produced an estimate Ô(f) of the Fourier transform O(f) of the object being imaged using the following estimator:

Ô(f) = Σ_m I_m(f) H̃*_m(f) / Σ_m |H̃_m(f)|²   [13]

where H̃_m(f) is the overall system transfer function (including the telescope and atmospheric effects) for the mth short-exposure image estimated from the wavefront sensor data. To create H̃_m(f), the atmospheric field phase perturbations corrupting the mth image must be calculated from the wavefront sensor measurements. This topic is discussed in more detail in Imaging: Adaptive Optics. Then H̃_m(f) is obtained using standard Fourier optics techniques. Although eqn [13] is easily implemented given an existing system that includes a wavefront sensor, it suffers from the fact that it is a biased estimator. An unbiased version of eqn [13] is given by

Ô(f) = Σ_m I_m(f) H̃*_m(f) / Σ_n H_{n,ref}(f) H̃*_{n,ref}(f)   [14]

where H̃_{n,ref}(f) and H_{n,ref}(f) are estimates of the overall system transfer function obtained from data collected on a reference star using wavefront sensor data and image-plane data, respectively. Thus, to obtain an unbiased estimator, two more datasets must be collected. The number of frames in the reference star dataset need not be the same as for the imaging dataset. As for speckle imaging, the atmospheric statistics must be the same for the reference star and imaging datasets.

Another technique that uses at least one additional data channel along with the measured image is called
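A minimal numerical sketch of the DWFS estimator of eqn [13], under the simplifying assumptions of noiseless frames and toy unit-modulus transfer functions (real systems have neither), shows the estimator returning the object spectrum exactly:

```python
# Minimal sketch of the DWFS estimator of eqn [13] on synthetic data:
# noiseless frames I_m = O H_m with toy unit-modulus transfer functions
# (both simplifying assumptions; real OTFs also attenuate amplitudes).
import numpy as np

rng = np.random.default_rng(1)
n, M = 32, 8
obj = rng.random((n, n))
O = np.fft.fft2(obj)                    # true object spectrum

# Toy per-frame transfer functions: random phase screens of unit modulus:
H = np.exp(1j * rng.uniform(-np.pi, np.pi, size=(M, n, n)))
H[:, 0, 0] = 1.0                        # preserve total flux in every frame
I_frames = O[None, :, :] * H            # noise-free short-exposure spectra

O_hat = (np.sum(I_frames * np.conj(H), axis=0)   # sum_m I_m(f) H_m*(f)
         / np.sum(np.abs(H) ** 2, axis=0))       # sum_m |H_m(f)|^2
```

With perfectly known H_m and no noise the denominator equals M and O_hat reproduces O; in practice the H̃_m are themselves estimates from wavefront sensor data, which is what motivates the unbiased form of eqn [14].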
… by the atmosphere, these multiple regions tend to not interfere in phase, reducing the amplitude of the image's Fourier component at that spatial frequency. This amplitude reduction effect occurs for all spatial frequencies greater than the limit imposed by the atmosphere, r_o/λ. Therefore, techniques have been developed to modify the wavefront by masking part of the pupil prior to measuring the field with an imaging sensor. This class of techniques has various names including optical aperture synthesis, pupil masking, and aperture masking. Here this class of techniques will be referred to as partially redundant pupil masking (PRPM).

PRPM techniques function by masking the pupil in such a way as to minimize the amount of out-of-phase interference of light on the imaging detector. Early efforts used pupil masks that were opaque plates with r_o-sized holes in them. Because the atmospheric coherence length is r_o, all the light from an r_o-sized hole is essentially in phase. By choosing the holes in the mask so that only two holes contribute to a single spatial frequency in the image, out-of-phase interference problems are removed. This type of mask is called a nonredundant pupil mask. Of course, a significant amount of light is lost due to this masking, so only bright objects benefit from this particular implementation. Often, the pupil mask needs to be rotated in order to obtain enough spatial frequencies in the image to be able to reconstruct a high-quality estimate of the object. To increase the light levels in the image, masks have been made to allow some out-of-phase interference for a given spatial frequency, thus trading off the amplitude of the spatial frequencies allowed in the image with the amount of light in the image. Masks with this property are called partially redundant pupil masks.

One version of PRPM is called the variable geometry pupil (VGP) technique. It is envisioned to work in conjunction with an adaptive optics system that does not fully correct the wavefront for atmospheric phase distortions. It employs a spatial light modulator or a micro-electromechanical mirror between the deformable mirror and the imaging sensor. Either of these two devices is driven by signals from the wavefront sensor in the adaptive optics … into an image with reasonable quality. These techniques are essentially the same techniques as used in interferometric imaging, not those described in this article. When partially redundant pupil masks are used (including VGP masks), the image may include enough spatial frequency content that the methods described in the post-detection compensation section can be applied.

See also

Imaging: Adaptive Optics; Interferometric Imaging; Inverse Problems and Computational Imaging.

Further Reading

Ayers GR, Northcott MJ and Dainty JC (1988) Knox-Thompson and triple-correlation imaging through atmospheric turbulence. Journal of the Optical Society of America A 5: 963-985.
Bertero M and Boccacci P (1998) Introduction to Inverse Problems in Imaging. Bristol, UK: Institute of Physics Publishing.
Craig IJD and Brown JC (1986) Inverse Problems in Astronomy. Bristol, UK: Adam Hilger Ltd.
Dainty JC (1984) Stellar speckle interferometry. In: Dainty JC (ed.) Topics in Applied Physics, vol. 9: Laser Speckle and Related Phenomena, 2nd edn, pp. 255-320. Berlin: Springer-Verlag.
Goodman JW (1985) Statistical Optics. New York: John Wiley and Sons.
Haniff CA and Buscher D (1992) Diffraction-limited imaging with partially redundant masks I: Infrared imaging of bright objects. Journal of the Optical Society of America A 9: 203-218.
Katsaggelos AK and Lay KT (1991) Maximum likelihood identification and restoration of images using the expectation-maximization algorithm. In: Katsaggelos AK (ed.) Digital Image Restoration, pp. 143-176. Berlin: Springer-Verlag.
Nakajima T and Haniff CA (1993) Partial adaptive compensation and passive interferometry with large ground-based telescopes. Publications of the Astronomical Society of the Pacific 105: 509-520.
Roggemann MC and Welsh B (1996) Imaging Through
system in such a way that the portions of the pupil Turbulence. Boca Raton, FL: CRC Press.
Schulz TJ (1993) Multiframe blind deconvolution of
where the phasefront has a sufficiently large second
astronomical images. Journal of the Optical Society of
derivative is blocked, while the remainder of the America A 10: 1064– 1073.
phasefront is transmitted. Tyson RK (1991) Principles of Adaptive Optics. San Diego,
All PRPM techniques benefit from post-compen- CA: Academic Press.
sation processing. For the case of nonredundant pupil Youla DC and Webb H (1982) Image restoration by the
masks, post-processing techniques are required to method of convex projections: Part 1 – Theory. IEEE
convert the discrete-spatial-frequency information Transactions on Medical Imaging M1-1: 81– 94.
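The nonredundancy condition described above — no two hole pairs in the mask sharing the same baseline vector, so each spatial frequency in the image receives light from at most one pair — can be checked numerically. A minimal sketch in Python (the hole coordinates are illustrative, not taken from this article; units are arbitrary):

```python
from itertools import combinations

def is_nonredundant(holes, tol=1e-9):
    """Check that no two hole pairs in a pupil mask share a baseline.

    Each pair of holes contributes to the image's Fourier component at the
    spatial frequency set by its baseline (separation vector); a mask is
    nonredundant when every baseline is unique, so out-of-phase light from
    different pairs never lands on the same spatial frequency.
    """
    baselines = []
    for (x1, y1), (x2, y2) in combinations(holes, 2):
        bx, by = x2 - x1, y2 - y1
        for cx, cy in baselines:
            same = abs(cx - bx) < tol and abs(cy - by) < tol
            mirrored = abs(cx + bx) < tol and abs(cy + by) < tol  # b and -b alias
            if same or mirrored:
                return False
        baselines.append((bx, by))
    return True

# Golomb-ruler spacings (0, 1, 4, 6) give a nonredundant 1-D mask ...
print(is_nonredundant([(0, 0), (1, 0), (4, 0), (6, 0)]))   # True
# ... while evenly spaced holes repeat the unit baseline.
print(is_nonredundant([(0, 0), (1, 0), (2, 0)]))           # False
```

Rotating such a mask, as described above, changes the set of baselines and so fills in additional spatial frequencies over successive exposures.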
208 INCOHERENT SOURCES / Lamps
INCOHERENT SOURCES
Contents
Lamps
Synchrotrons
Under the assumption that the color temperature of an incandescent lamp is approximately the same as the operating temperature of the filament, we can combine eqns [2] and [4] to obtain eqn [6], and combine eqns [3] and [4] to obtain eqn [7]:

LIFE1/LIFE2 = (TEMPERATURE2/TEMPERATURE1)^30.95   [6]

EFFICACY1/EFFICACY2 = (TEMPERATURE1/TEMPERATURE2)^4.52   [7]

One way to allow operation with higher filament temperatures, while not decreasing filament life, is to fill the lamp with a high molecular weight and/or high pressure gas, as either will retard tungsten evaporation. Lamps using krypton or xenon in place of the normal argon fill allow for a moderate increase in filament temperature. Better yet is a fill of halogen gas, such as HBr, CH3Br, or CH2Br2 plus Kr or Xe at total pressures that range from 2 to 15 atmospheres. The high total pressure inhibits tungsten evaporation from the filament, and the halogen fill gas reacts with tungsten that deposits on the wall of the lamp to remove this tungsten and maintain light output over the life of the lamp. The tungsten removed from the wall of the lamp is redeposited on the tungsten filament, increasing lamp life under some circumstances. Due to the high gas pressures involved, tungsten halogen incandescent lamps are constructed in small 'filament tubes' to minimize both the amount of gas needed and the total explosive energy. The filament tubes are manufactured from quartz or special high temperature glass. For general purpose applications, the quartz or high temperature glass filament tube is enclosed in a larger glass envelope to protect the user should the filament tube explode and to protect the filament tube from environmental contaminants, such as salts from human fingers.

Since 90% to 95% of the energy consumed by an incandescent lamp is radiated as infrared energy, it is easy to see that the potential exists to dramatically increase the efficacy of incandescent lamps by reflecting some of this infrared energy back onto the filament, thus allowing the filament to be maintained at its normal operating temperature with far less electrical power. The reflector must have very low absorption in the visible region of the spectrum, high reflectivity in the IR, and a reasonably sharp transition between the two, because the greatest IR emission is just below the visible band. The reflector and filament must also be precisely positioned so that the reflected IR radiation hits the filament, and the reflector must also be low enough in cost to be commercially viable.

Successful IR reflectors have been made from multilayer dielectric films. These are similar to the coatings used to make high reflectivity mirrors for He–Ne lasers, but they have much broader bandwidth and lower reflectivity. The reflector-filament alignment problem is most easily solved using the geometry of a tungsten halogen filament tube as the carrier for the IR reflector. The typical efficacy gain is about 50%: a 60-watt IR-halogen lamp produces about as much light as a 90-watt halogen lamp.

Other methods being investigated to increase the efficacy of incandescent lamps involve modification of the filament to decrease its emissivity in the IR and/or increase its emissivity in the visible, and hence turn it into more of a 'selective radiator' than natural tungsten. Three general methods are under investigation: one involves creating a dense pattern of submicron size cavities on the surface of the tungsten to inhibit IR radiation; a second involves coating the tungsten with a material that is itself a selective radiator that favors the visible portion of the spectrum; while a third involves replacing the tungsten filament with a material that is a natural selective radiator. No products exist that employ these techniques, but active R&D projects are underway that may yield a more efficient incandescent lamp in the future.

Incandescent lamps can be used as standards for total flux and color temperature if carefully constructed and operated. The National Institute of Standards and Technology (NIST) in the US sells 1000-watt tungsten halogen lamps calibrated to an accuracy of 0.6% in luminous intensity and 8 K in color temperature at a color temperature of 2856 K. Similar services are provided by standards organizations in other countries.

Discharge Lamps

The most efficient lamps produced today are based on electric discharges in gases. These include 'low pressure' lamps such as fluorescent lamps and low pressure sodium lamps; and 'high pressure' or high intensity discharge (HID) lamps such as metal halide, high pressure sodium, high pressure mercury, and sulfur lamps. These lamps are all more efficient than incandescent lamps because they are 'selective radiators' instead of blackbody sources and have been designed to generate as much radiant energy as possible in the visible band while minimizing energy produced outside the visible band. The best white light discharge lamps have conversion efficiencies from electric power to visible light of 25% to 30%, making them almost five times as efficient as the typical incandescent lamp.
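The life-versus-efficacy trade-off captured by eqns [6] and [7] above can be made concrete with a short calculation. A minimal sketch (the two filament temperatures are illustrative values, not from this article):

```python
def life_ratio(t1_k, t2_k):
    """Eqn [6]: LIFE1/LIFE2 = (TEMPERATURE2/TEMPERATURE1)**30.95."""
    return (t2_k / t1_k) ** 30.95

def efficacy_ratio(t1_k, t2_k):
    """Eqn [7]: EFFICACY1/EFFICACY2 = (TEMPERATURE1/TEMPERATURE2)**4.52."""
    return (t1_k / t2_k) ** 4.52

# Illustrative: run the filament at 3000 K instead of 2800 K.
print(f"efficacy gain: {efficacy_ratio(3000, 2800):.2f}x")   # ~1.37x
print(f"life ratio:    {life_ratio(3000, 2800):.3f}")        # ~0.118
```

A roughly 7% hotter filament thus buys about a third more light per watt at the cost of nearly 90% of the lamp's life, which is why fill gases that merely permit higher filament temperature at unchanged life are so valuable.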
Fluorescent Lamps

Figure 3 Linear fluorescent lamp. © 2005 Roberts Research and Consulting, Inc. Published by Elsevier Ltd. All rights reserved.

• Blue: BaMg2Al16O27:Eu2+
• Green: CeMgAl11O19(Ce3+):Tb3+
• Red: Y2O3:Eu3+
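These phosphors convert the 254 nm and 185 nm UV photons of the mercury discharge into visible light, and the energy budget of that conversion follows directly from E = hc/λ. A quick check (hc ≈ 1239.84 eV nm; the 2.43 eV average visible-photon energy is the figure used in this article, which quotes a slightly different 4.885 eV for the 254 nm line, implying a marginally different constant):

```python
HC_EV_NM = 1239.84   # h*c in eV*nm (CODATA-level approximation)
E_VISIBLE = 2.43     # average visible-photon energy (eV) used in this article

def photon_energy_ev(wavelength_nm):
    """Photon energy E = hc / lambda, in eV."""
    return HC_EV_NM / wavelength_nm

def stokes_loss(uv_wavelength_nm):
    """Fraction of a UV photon's energy lost when a quantum-efficiency-1.0
    phosphor converts it to a single average visible photon."""
    return 1.0 - E_VISIBLE / photon_energy_ev(uv_wavelength_nm)

print(f"254 nm: {photon_energy_ev(254):.2f} eV, Stokes loss {stokes_loss(254):.0%}")
print(f"185 nm: {photon_energy_ev(185):.2f} eV, Stokes loss {stokes_loss(185):.0%}")
```

The computed losses — a bit over 50% for the 254 nm line and about 64% for the 185 nm line — match the figures quoted below for the mercury discharge.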
Rare earth phosphors provide a combination of high efficiency, high color rendition index, and high resistance to degradation that far exceeds the performance of the halophosphate phosphors used in older generations of fluorescent lamps.

However, due to the Stokes shift, there is a substantial energy loss converting the UV photons produced by excited mercury atoms into visible light, even using a 'perfect' phosphor that has a quantum efficiency of 1.0. The average energy of a visible photon is 2.43 eV, using the 380 nm to 780 nm limits for visible radiation discussed at the start of this article. When one 254 nm photon that has 4.885 eV of energy is converted to one visible photon with an average energy of 2.43 eV, a bit over 50% of the photon's energy is lost. The situation is worse for the 185 nm photons, which each have 6.707 eV of energy. They lose 64% of their energy when they are converted to visible photons, so it is good that the fluorescent lamp discharge generates six to seven 254 nm photons for each 185 nm photon. The Stokes shift is responsible for the largest single energy loss mechanism in fluorescent lamps. Work is underway to develop phosphors with a quantum efficiency greater than 1.0, so less energy will be lost in the UV-to-visible light conversion process, but no practical phosphors have been developed to date.

The mercury pressure for optimum UV production is about 6 × 10⁻³ Torr, which is the vapor pressure of mercury over a liquid pool at 40 °C. At higher mercury pressure there is too much absorption by ground state mercury atoms, and at lower pressures there is an insufficient number of mercury atoms for excitation by the electrons. The diameter of most linear lamps is chosen so that the coolest spot on the interior of the lamp wall will be at 40 °C when the lamp is operated at its design power in an ambient temperature of 25 °C. Usually this 'cold spot' will be near the middle of the lamp, but in some higher power linear lamps the wall temperature is well over 40 °C. In these lamps a cold spot is often created at the end of the lamp by use of a longer electrode mount and a heat shield mounted between the electrode and the end of the lamp. Obviously, the performance of fluorescent lamps designed in this manner is highly sensitive to ambient temperature. In some smaller, high-power density lamps, most commonly compact fluorescent lamps (CFLs), it is not possible to create a region with a temperature as low as 40 °C. In these lamps the mercury pressure is controlled by amalgamating the mercury with Pb, Bi, or Sn, creating a compound that both reduces the mercury vapor pressure at any given temperature and also reduces the variability of mercury vapor pressure with temperature. These mercury amalgams have the secondary advantage of making the fluorescent lamp less sensitive to ambient temperature.

Mercury, which is the key ingredient in fluorescent lamps, is considered to be a hazardous material by regulatory agencies in many countries. At least as early as the mid-1980s, researchers began to study ways to reduce or eliminate mercury from fluorescent lamps.

Development of a mercury-free fluorescent lamp is very difficult, and the only mercury-free fluorescent lamp that exists as of mid-2003 has an efficacy of about 10 lm W⁻¹, one-tenth the efficacy of the best mercury-based fluorescent lamp (Osram Sylvania LINEX® Linear Excimer Lamp System, Model LINEX A3-10W40). There are three key problems that have to be solved to develop an efficient mercury-free fluorescent lamp system:

(i) Identify a nontoxic material that generates UV photons and has its optimum gas density at a convenient and safe temperature. The most promising candidate is xenon, used as either an atomic radiator or an excimer.
(ii) Develop a phosphor that produces high-quality white light with high efficiency when excited by UV photons from the chosen mercury-replacement material. This has proven a major challenge. The resonance line of atomic Xe has a wavelength of 147 nm, while the output from a xenon excimer discharge is at 172 nm. The Stokes shift loss converting each of these to visible light is 71% for atomic xenon and 66% for xenon excimers. To reduce the Stokes shift loss, a quantum-splitting phosphor, one that generates more than one visible photon for each UV photon, is needed. Most of the effort to develop mercury-free fluorescent lamps has been concentrated in this area. No viable phosphor has been developed, but work continues.
(iii) Develop a 'reservoir' for xenon or the other working gas. The mass of 'active' gas needed for a fluorescent lamp is small; a 4-foot, 1.5-inch diameter lamp needs 0.1 mg of mercury in the gas phase to operate properly. However, during life, some of this gas is consumed through physical attachment or chemical reaction and must be replaced. The drop of liquid mercury or small mass of mercury amalgam used in conventional fluorescent lamps provides a reservoir that maintains the mercury gas pressure in the presence of these consumption mechanisms.

In contrast to the failure to develop a mercury-free fluorescent lamp, there has been great success in
reducing the amount of mercury used in each lamp. A typical 4-foot, 1.5-inch diameter fluorescent lamp produced in the 1970s would have 50 mg to 100 mg of mercury. This high amount of mercury was mostly the result of the type of manufacturing equipment used, but mercury consumption required that each lamp had 15 mg to 20 mg of mercury in order to survive for its rated 20 000-hour life. The development of coatings and other materials that reduce mercury consumption, when combined with new manufacturing technology, has allowed the amount of mercury used in modern 4-foot fluorescent lamps to be reduced to 3 mg to 5 mg, and further reductions are possible with improved technology.

As stated above, the length-to-diameter ratio of most fluorescent lamps is significantly greater than one. The reason for this is that there is a 10 to 15 volt drop at the electrodes, and the power lost in these regions does not produce a significant amount of light. A large length-to-diameter ratio increases the voltage and power of the positive column, where the bulk of the UV photons are generated, relative to the power lost in the electrode regions. When the energy crunch hit in the mid-1970s, there was strong interest in developing a fluorescent lamp that could replace the inefficient incandescent lamp. However, the need for high length-to-diameter ratios meant that it would not be practical to make a fluorescent lamp shaped like an incandescent lamp. This need created the technology to manufacture CFLs, which are constructed of long tubes bent or coiled into shapes that allow them to fit into spaces occupied by incandescent lamps.

Fluorescent lamps and other discharge lamps have a negative incremental voltage vs. current (V/I) characteristic, which means that their operating voltage decreases as the operating current increases. Due to this, such discharges need to be operated from a current limited power source, commonly known as a 'ballast' in the lighting industry. Ballasts were originally simple series inductors, and lamps operated at the power line or mains frequency, though many line frequency ballasts are now considerably more complex than an inductor. In the 1940s, it was discovered that the efficacy of fluorescent lamps could be increased by operation at frequencies of a few kHz. It was not until the 1970s, however, that the cost of power electronics came down to the point where it was possible to make affordable high frequency electronic ballasts. Electronic ballasts operating in the 20 kHz to 100 kHz range are now used with almost all CFLs and in most commercial applications using linear fluorescent lamps.

Figure 5 Basic electrodeless fluorescent lamp with closed ferrite core. © 2005 Roberts Research and Consulting, Inc. Published by Elsevier Ltd. All rights reserved.

Figure 6 Open core electrodeless lamp with integral ballast. © 2005 Roberts Research and Consulting, Inc. Published by Elsevier Ltd. All rights reserved.

The desire for fluorescent lamps shaped like incandescent lamps also sparked the development of electrodeless fluorescent lamps. Since the need for high length-to-diameter ratios is driven by the losses at the lamp electrodes, researchers realized they could make fluorescent lamps in short, wide shapes if they eliminated the electrodes. As shown in Figure 5, an electrodeless lamp is just like a transformer, except it has a single-turn discharge secondary instead of a multiturn copper or aluminum secondary. An open core version of an electrodeless fluorescent lamp with integral ballast is shown in Figure 6. The electric field is created by a time-varying magnetic field. Because of the relatively high voltage-per-turn imposed by the discharge secondary, these lamps operate only at high frequency. There are three electrodeless fluorescent lamps on the market. The closed core design
shown in Figure 7 operates near 250 kHz with an external ballast, while the open core design shown in Figure 8 operates near 2.5 MHz. Figure 9 shows an open core design with an integral ballast that also operates near 2.5 MHz. Some people have suggested that electrodeless fluorescent lamps are fundamentally different from normal fluorescent lamps and that the high frequency field drives the phosphor to emit light. This is incorrect. Other than the manner in which the electric field is created, and the new shapes this allows, electrodeless fluorescent lamps operate in the same manner as normal fluorescent lamps.

Due to the strong dependence of fluorescent performance on mercury temperature, these lamps are not useful as flux standards. However, the five mercury visible lines listed above, plus the 365 nm line of mercury, make the ordinary fluorescent lamp an ideal source for wavelength calibration of spectrometers. Germicidal lamps, which do not use phosphor but are constructed in quartz envelopes and use the same type of mercury-rare gas discharge as fluorescent lamps, are a convenient source of 254 nm radiation that can be used to calibrate UV spectrometers. Care should be taken when using these lamps, as the 254 nm radiation can damage human eyes and skin.

Figure 7 Closed core electrodeless fluorescent lamp, Osram Sylvania Icetron/Endura Lamp, from US Patent 5,834,905.

Figure 8 Open core electrodeless fluorescent lamp with separate ballast, Philips QL lamp, from US Patent 6,373,198 B1.

Figure 9 GE Genura® open core electrodeless fluorescent lamp with integral ballast. Courtesy of General Electric Company.

High Intensity Discharge Lamps

If the mercury pressure in a discharge lamp is raised from 6 × 10⁻³ Torr to an atmosphere or
greater, the 254 nm and 185 nm resonance lines are trapped via absorption by ground-state mercury atoms, and the radiation output from the discharge shifts to the five optically thin visible lines listed in the discussion of fluorescent lamps, plus the 365 nm near-UV line and a number of lines near 300 nm. The result is a high pressure mercury lamp that operates with an efficacy of about 55 lm W⁻¹ and has a CRI of about 50. The high mercury vapor pressure requires a wall temperature of at least 360 °C, which in turn requires that the arc tube be constructed from quartz and operated at high power density. A 400-watt high-pressure mercury arc tube has a diameter of 22 mm and a length of about 100 mm. In contrast, a 32-watt fluorescent lamp has a diameter of 25.4 mm and a length of about 1200 mm. This accounts for the classification of these high pressure lamps as high intensity discharge (HID) lamps. Due to the high temperature of the arc tube, most HID lamps are constructed with the arc tube mounted inside a hard glass outer jacket that protects both the user and the arc tube. The high-pressure mercury lamp is a direct visible radiator, but the arc tube does generate a small amount of radiation in the 254 nm to 365 nm region; that radiation is trapped by the outer glass jacket. Some high-pressure mercury lamps have a phosphor coating on the inside of the outer jacket to convert this UV radiation into visible radiation, mostly in regions of the spectrum where the high-pressure mercury arc is deficient in radiation, thus improving the color of the lamp.

Most high-pressure mercury lamps operate at a mercury pressure of about four atmospheres. Because their spectrum is dominated by mercury line emission, the output of these lamps is deficient in the red portion of the spectrum and the lamps have poor color rendition. In 1992, Philips received a US patent for a new type of ultra-high pressure mercury lamp that they call UHP, for 'Ultra High Performance'. This lamp operates at a mercury pressure of 200 atmospheres or more. The output spectrum is characterized by a significant amount of continuum radiation from quasi-molecular states of mercury, and this continuum includes energy in the red portion of the spectrum that is missing from conventional high pressure mercury lamps. UHP lamps also include a small amount of halogen, such as bromine, which creates a tungsten-halogen cycle similar to that discussed earlier in relation to tungsten-halogen incandescent lamps. During lamp operation, tungsten is sputtered and evaporated from the electrodes. In the absence of the halogen fill, this tungsten rapidly deposits on the walls, leading to very short lamp life. The halogen combines with most of the tungsten before it has a chance to reach the wall of the arc tube. The halogen will also combine with any tungsten that has deposited on the wall of the arc tube. The tungsten-halogen compound formed is gaseous at the operating temperature of the lamp and will diffuse back toward the core of the discharge, where the high temperature of the arc dissociates the tungsten-halogen compound. The tungsten is deposited back on the electrodes, thereby greatly extending the life of the lamp. Due to their very high pressures, UHP lamps are constructed in small discharge tubes and have a very short discharge length, typically 1 to 2 mm, making these lamps a virtual point source. Due to their excellent spectral characteristics and small source size, UHP lamps have found use in various projection applications, including LCD-based electronic projectors.

The performance of high-pressure mercury lamps can be substantially improved by adding metals such as sodium and scandium to the high-pressure mercury arc. The sodium emits near the peak of the eye sensitivity curve, providing high efficacy, while the scandium has many transitions in the visible, which fill out the spectrum and provide high CRI. If metallic elements were added directly to the high-pressure mercury arc, they would deposit on or react with the wall of the arc tube. To prevent this they are added as iodides, such as NaI or ScI3. This creates a regenerative cycle in which the NaI and ScI3 are heated and dissociated in the 5000 K core of the arc, allowing the free metals to radiate. As the metals diffuse toward the cooler sections of the arc near the walls, they recombine with the iodine to re-form the iodides. These metal halide lamps produce light with an efficacy of about 90 lm W⁻¹ to 100 lm W⁻¹ and a CRI of about 65 to 75.

The high-pressure sodium (HPS) lamp is an HID lamp that uses hot sodium vapor to generate a golden white light. The lamp uses a mixture of sodium, mercury, and a rare gas such as xenon. The core temperature of the sodium-mercury-xenon discharge is typically about 4000 K, while the temperature at the edge of the discharge and the walls of the ceramic arc tube is typically about 1500 K. At low pressure, emission from sodium is dominated by the resonance lines at 589.0 and 589.6 nm, so the light is virtually monochromatic. However, in the HPS lamp, the sodium pressure is typically 75 Torr, which is high enough to almost completely trap emission from the resonance lines, forcing radiant energy to escape in the wings of
these lines, thereby increasing the color quality of the light. Since sodium will attack quartz, this mixture is contained in a translucent alumina (aluminum oxide) ceramic arc tube. As with other HID lamps, the alumina arc tube is contained in a hard glass outer bulb. HPS lamps are the most efficient generators of 'white light' available. A typical 400 watt HPS lamp produces light with an efficacy of 125 lm W⁻¹, this high efficacy being due to the fact that the emission matches the peak of the eye sensitivity curve. The light has a CRI of only 22; however, the HPS lamp is widely used for roadway and other outdoor lighting applications.

Solid State Sources

The newest incoherent sources to be used for general purpose lighting are solid state devices: light emitting diodes (LEDs) and organic light emitting diodes (OLEDs). LEDs are very similar to laser diodes but without the two mirrors used to form the laser cavity in laser diodes. Photons are generated as a result of electron-hole recombination as electrons move across the energy gap of a biased p-n junction. By appropriate selection of materials, photons can be created at visible wavelengths. Low brightness infrared and red LEDs have been available for quite some time. It was not until the mid-1980s that high brightness red LEDs were developed. These were followed by increasingly challenging yellow, green, and then blue high brightness LEDs, developed in 1994 by Shuji Nakamura of Nichia Chemical Industries. Nakamura added phosphor to his high brightness blue LEDs in 1996 and developed the first 'white' LED, thus triggering the current level of excitement concerning the use of LEDs to replace incandescent and discharge lamps. Monochromatic high brightness LEDs have already replaced color-filtered incandescent lamps in many applications, such as traffic lights, vehicle signal lights, and signage. In these applications, LEDs are the clear winner in life and efficacy. However, when compared to unfiltered sources, the best white LEDs are about as efficient as tungsten halogen incandescent lamps and far less efficient than fluorescent and metal halide lamps. They also cost far more per lumen than even the most expensive discharge lamp. LEDs have high internal quantum efficiency, so the main technical challenges are to fabricate devices that minimize internal light trapping, in order to improve device efficiency and reduce the cost per lumen produced.

Organic light emitting diodes (OLEDs) generate light by the passage of electric current through special organic molecules, using a process known as electrophosphorescence. The electrophosphorescent material is usually coated on a transparent and perhaps flexible substrate, and overcoated with a transparent conductor. OLEDs currently have lower efficacy and shorter life than LEDs, but they also have lower manufacturing costs. If LEDs are the 'new' point sources that replace incandescent lamps and perhaps CFLs, then OLEDs are the 'new' diffuse sources that some say will replace linear fluorescent lamps.

See also

Light Emitting Diodes.

Further Reading

Cobine JD (1958) Gaseous Conductors: Theory and Engineering Applications. New York: Dover Publications.
de Groot JJ and van Vliet JAJM (1986) The High Pressure Sodium Lamp. Philips Technical Library. Deventer–Antwerpen: Macmillan Education.
Elenbaas W (1951) The High Pressure Mercury Vapor Discharge. Amsterdam: North-Holland Publishing Company.
Fisher HE (1996) High Pressure Mercury Discharge Lamp, U.S. Patent 5,497,049.
Fisher HE and Hörster H (1998) High-Pressure Mercury-Vapor Discharge Lamp, U.S. Patent 5,109,181.
Godyak VA (2002) Bright idea: Radio-frequency light sources. IEEE Industry Applications Magazine 8: 42–49.
Hirsh MN and Oskam HJ (1978) Gaseous Electronics, Volume 1, Electrical Discharges. New York: Academic Press.
Mitchell AGC and Zemansky MW (1971) Resonance Radiation and Excited Atoms, 2nd edn. London: Cambridge University Press.
Nichia's Shuji Nakamura: Dream of the Blue Laser Diode, ScienceWatch, January/February 2000. Available at: http://www.sciencewatch.com/jan-feb2000/sw_jan-feb2000_page3.htm
Pekarski P, Hechtfischer U, Heusler G, Koerber A, Moench H, Retsch JP, Weichmann U and Kitz A (2003) UHP Lamps for Projection Systems, XXVI International Conference on Phenomena in Ionized Gases, Greifswald, Germany. Available at: http://www.icpig.uni-greifswald.de/proceedings/data/Pekarski_1
Raizer YP (1991) Gas Discharge Physics. Berlin: Springer-Verlag.
Rea MS (2000) IESNA Lighting Handbook, 9th edn. New York: Illuminating Engineering Society of North America.
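The efficacies quoted in this article for the major discharge-lamp families can be tabulated for a side-by-side comparison. A small sketch (values are those quoted in the text; the 95 lm W⁻¹ entry for metal halide is the midpoint of the quoted 90 to 100 range, and the 400 W operating power is illustrative):

```python
# Efficacies as quoted in this article, in lumens per electrical watt.
EFFICACY_LM_PER_W = {
    "mercury-free fluorescent (Xe)": 10,
    "high-pressure mercury":         55,
    "metal halide":                  95,
    "high-pressure sodium":         125,
}

def luminous_flux_lm(lamp, power_w):
    """Luminous flux [lm] = efficacy [lm/W] * electrical input power [W]."""
    return EFFICACY_LM_PER_W[lamp] * power_w

for lamp, eff in EFFICACY_LM_PER_W.items():
    print(f"{lamp:32s} {eff:4d} lm/W -> {luminous_flux_lm(lamp, 400):6d} lm at 400 W")
```

The spread — a factor of more than twelve between the mercury-free excimer lamp and the HPS lamp — summarizes why the choice of discharge chemistry dominates lamp efficacy.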
Synchrotrons
R Clarke, University of Michigan, Ann Arbor, MI, USA

© 2005, Elsevier Ltd. All Rights Reserved.

Since their discovery by Röntgen in 1895, X-rays have been used in a wide variety of applications ranging from medical and industrial radiography to crystallographic studies of the structure of biomolecules. The short wavelength (λ ~ 0.1 nm) and high penetrating power of hard X-rays make them an ideal probe of the internal structure and composition of condensed matter. Indeed, Watson and Crick's elucidation of the double helix structure of DNA in the early 1950s, based on the X-ray diffraction patterns of Rosalind Franklin, has revolutionized biology and provided an atomistic view of the stuff of life.

The quest for ever higher spatial resolution, and the desire to obtain high-quality X-ray data on shorter time-scales, have both been prime motivations for the development of high intensity hard X-ray sources since the mid-1970s. This evolutionary trend has led to the design and construction of large synchrotron radiation facilities, serving many thousands of users and enabling forefront experimental studies across a broad range of disciplines. One such example of a national facility, dedicated to high-brightness X-ray synchrotron radiation, is the Advanced Photon Source (APS), located at Argonne National Laboratory in the United States (Figure 1). Facilities of a similar scope and purpose have also been constructed in Europe and Japan, and a number of new high brightness sources are currently under construction in several countries around the world (see Table 1).

Figure 1 Aerial view of the Advanced Photon Source, Argonne National Laboratory. Courtesy of Advanced Photon Source.
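Radiation from a storage-ring bending magnet is collimated in the vertical direction to a half-angle of roughly γ⁻¹ = m₀c²/E, as discussed in this article. A short numerical check of that relation (the 10 m source-to-observation distance matches the example used in the text):

```python
M0C2_MEV = 0.511   # electron (or positron) rest-mass energy, MeV

def divergence_half_angle_rad(beam_energy_gev):
    """Vertical divergence half-angle, gamma^-1 = m0*c^2 / E, in radians."""
    return M0C2_MEV / (beam_energy_gev * 1000.0)

DISTANCE_M = 10.0  # distance from the bending magnet source, as in the text

for e_gev in (2.0, 3.0):
    theta = divergence_half_angle_rad(e_gev)
    print(f"E = {e_gev:.0f} GeV: {theta * 1e3:.2f} mrad half-angle, "
          f"~{theta * DISTANCE_M * 1e3:.1f} mm vertical spread at {DISTANCE_M:.0f} m")
```

For a 2 to 3 GeV ring this gives roughly 0.17 to 0.26 mrad and a few millimeters of vertical extent at 10 m, consistent with the ~0.3 mrad and ~3 mm figures quoted in the text.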
In this article we will draw on examples from APS to illustrate the characteristics and performance of the current generation of synchrotron sources in the hard X-ray region.

From the earliest days of particle physics studies in the 1930s and 40s, it was realized that copious amounts of X-rays could be produced by accelerating charged particles in a curved trajectory, as in a cyclotron or a synchrotron. For particle physics such radiation is highly detrimental because it limits the ultimate energy of particle accelerators and forces their circumference to become inordinately large in order to reduce radiation losses.

Originally considered to be a major source of concern in particle accelerator design, synchrotron radiation soon began to emerge as a boon to X-ray science. This came about in the early 1970s after the capability was developed to produce and store significant circulating currents of high-energy charged particles (electrons or positrons) for colliding beam experiments. At that time X-ray scientists were tolerated as 'parasitic' experimenters at such storage rings, taking X-ray radiation from the bending magnets which maintained the stored current in an overall circular trajectory. The PETRA (Positron–Electron Tandem Ring Accelerator) facility at HASYLAB/DESY in Hamburg, Germany, and the Cornell Electron Storage Ring (CESR)/Cornell High-Energy Synchrotron Source (CHESS) at Cornell University, Ithaca, New York, are examples of accelerator facilities that were primarily intended for particle physics experiments and then became partly dedicated to X-ray production. Such shared synchrotron facilities are referred to as 'first-generation sources'.

The first-generation sources demonstrated several attractive features of synchrotron radiation, including a high degree of collimation in the vertical direction, with a divergence half-angle given approximately by the relativistic factor $\gamma^{-1} = m_0 c^2/E$, where $m_0 c^2$ is the rest-mass energy (0.51 MeV) of the electron (or positron) and E is the energy of the stored particle beam. For a bending magnet source with E = 2–3 GeV the beam would have a vertical collimation of ~0.3 mrad (17 mdeg). This would produce an X-ray beam with a vertical extent of ~3 mm at a distance of 10 m from the bending magnet source. The horizontal fan of bend magnet radiation is broad and is usually limited by a mask or a slit to a few mrad. The most useful characteristic of bend magnet radiation is its large bandwidth, whereby the X-ray spectrum is continuous and increasing in intensity (see Figure 2) until some critical energy above which the intensity begins to fall off. Bending magnet radiation is therefore useful for X-ray measurements requiring a 'white' beam, for example, Laue X-ray diffraction, energy dispersive diffraction, and Extended X-ray Absorption Fine Structure (EXAFS) analysis.

Beginning in the 1980s, several second-generation storage rings were constructed in various locations around the world. The design of these facilities was for the first time optimized for the production of bright beams of X-ray and XUV (soft X-ray to ultraviolet) radiation, and their operation was dedicated exclusively to synchrotron radiation science. The increase in brightness provided by these new storage ring sources, compared to standard laboratory X-ray sources (e.g., rotating anode generators), was dramatic (5 or 6 orders of magnitude) and this opened up many new experiments that were not previously feasible. However, it was soon realized that the capabilities of such sources could be extended to
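The bend-magnet collimation estimate given above can be checked numerically. The sketch below takes the half-angle as roughly $\gamma^{-1}$ and the visible beam size as half-angle times distance; the 2 GeV and 10 m figures are those used in the text:

```python
M0C2_MEV = 0.511   # electron (or positron) rest-mass energy [MeV]

def divergence_half_angle_rad(beam_energy_gev):
    """Vertical divergence half-angle of bend-magnet radiation, ~gamma^-1 = m0*c^2/E."""
    return M0C2_MEV / (beam_energy_gev * 1e3)

def vertical_extent_mm(beam_energy_gev, distance_m):
    """Rough vertical beam size at a given distance: half-angle x distance."""
    return divergence_half_angle_rad(beam_energy_gev) * distance_m * 1e3

# E = 2 GeV: half-angle ~0.26 mrad and ~2.6 mm at 10 m, consistent with the
# ~0.3 mrad / ~3 mm figures quoted in the text
```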
Figure 3 Schematic of undulator (alternating N–S magnet poles, gap ≈ 1–3 cm, magnet period; electrons passing through emit synchrotron radiation as X-rays). Courtesy of Advanced Photon Source.

relativistic speeds through an alternating magnetic field. The magnetic field in the vertical direction (y) is taken to be periodic:
$$B_y = B_0 \sin[(2\pi/\lambda_u)z] \qquad [3]$$
where z is in the direction of average particle motion. The transverse particle trajectory is described by:
$$x = (\lambda_u \phi_{\max}/2\pi)\sin[(2\pi/\lambda_u)z] \qquad [4]$$
where $\lambda_u$ is the magnet period, and $\phi_{\max}$ is the maximum particle deflection angle:
$$K = 93.4\,B_0(\text{Tesla})\,\lambda_u(\text{meters}) \qquad [6]$$
The field strength parameter K serves as a useful quantity for discussing the characteristics of various insertion device designs. For K ≤ 1, which defines the undulator regime, the transverse angular deflections are less than or on the order of the angular spread of the emitted radiation, and in this case the radiation from each of the deflections adds coherently. This is distinguished from the case of a multipole wiggler, where K ≫ 1, leading to incoherent superposition.

Most of the interesting features of undulator radiation are revealed by considering the trajectory of the particle in a Lorentz frame moving at speed $v_p$ with respect to the laboratory frame. In this frame the particle will execute oscillatory motion, with a corresponding oscillation frequency
$$\omega_0 = 2\pi/T_0 = (2\pi c/\lambda_u)\gamma_p$$
For K ≪ 1, the motion in the co-moving frame is purely transverse and nonrelativistic. However, for K ~ 1, the oscillatory motion in this frame can have a longitudinal component, and in this case the transverse motion becomes nonharmonic, with frequency components at odd multiples of the fundamental frequency $\omega_0$. This nonharmonic motion is a consequence of the fact that in the laboratory frame the particle is moving with constant speed ≈ c, but its path length is slightly longer than $\lambda_u$ owing to the transverse motion. This constrains the motion in the co-moving frame to be nonharmonic.

Undulator Spectrum

It now remains to transform $\omega_0$ into the laboratory frame so that we can obtain the spectral distribution of the output of the undulator. Invoking the Doppler transformation:
$$\omega = \gamma_p \omega_0 (1 + \beta_p \cos\theta') \qquad [8a]$$
$$\tan\theta = (\sin\theta')/[\gamma_p(\cos\theta' + \beta_p)] \qquad [8b]$$
where θ is the angle at which the radiation is observed:
$$\omega_n(\theta) = \frac{4\pi c n \gamma^2}{\lambda_u\left[1 + \tfrac{1}{2}K^2 + (\gamma\theta)^2\right]} \qquad [9a]$$
Equation [9] is the central expression describing the spectral output of an undulator. Notice how the macroscopic period of the undulator $\lambda_u$ is transformed into a microscopic wavelength $\lambda_n(\theta)$ by the relativistic factor $\gamma^2$. For third-generation hard X-ray sources with particle energies in the 6–8 GeV range, γ ~ 14 000, taking cm scales down to the angstrom level.

Equation [9] describes a spectrum of monochromatic peaks, for on-axis radiation (θ = 0), with wavelengths that increase with increasing K. Since $K \propto B_0\lambda_u$ (eqn [6]), any particular harmonic can be tuned either by changing $B_0$ or $\lambda_u$. In most undulator designs, $\lambda_u$ is fixed by the construction of the permanent magnet array, so the tuning is conveniently accomplished by varying $B_0$. In practice this is done by changing the vertical distance between the poles
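The tuning behavior of eqns [6] and [9] can be sketched numerically. The snippet below uses the equivalent wavelength form of eqn [9]; the 3.3 cm period is an illustrative value, not a parameter taken from the article:

```python
import math

M0C2_GEV = 0.511e-3   # electron rest-mass energy in GeV

def k_parameter(b0_tesla, lambda_u_m):
    """Field-strength parameter K = 93.4 * B0 [Tesla] * lambda_u [meters] (eqn [6])."""
    return 93.4 * b0_tesla * lambda_u_m

def harmonic_wavelength_m(lambda_u_m, beam_energy_gev, K, n=1, theta_rad=0.0):
    """Wavelength form of the undulator equation (eqn [9]):
    lambda_n = lambda_u / (2*n*gamma^2) * (1 + K^2/2 + (gamma*theta)^2)."""
    gamma = beam_energy_gev / M0C2_GEV
    return lambda_u_m / (2 * n * gamma**2) * (1 + 0.5 * K**2 + (gamma * theta_rad)**2)

# 7 GeV beam (gamma ~ 14 000), 3.3 cm period, K = 1:
lam1 = harmonic_wavelength_m(0.033, 7.0, K=1.0)   # ~1.3 angstrom on axis
```

Raising K (by closing the magnet gap and increasing B0) lengthens every harmonic, which is exactly the tuning mechanism described in the text.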
[Figure: on-axis brilliance (ph/s/mrad²/mm²/0.1%bw) of APS Undulator A, harmonics n = 1, 3, 5; vertical scale 1.0–5.0 × 10¹⁴.]

$$\theta_{1/2} = \frac{1}{\gamma}\left[\frac{1 + \tfrac{1}{2}K^2}{2Nn}\right]^{1/2} = \left[\frac{\lambda_n}{L}\right]^{1/2} \qquad [10]$$

where L is the length of the undulator. For a typical third-generation source undulator, such as the Type A undulator at the Advanced Photon Source, L = 2.4 m, giving $\theta_{1/2} \approx 6$ μrad for the first harmonic tuned to 12 keV. This corresponds to an X-ray beam size of ~180 μm at a distance 30 m from the source. The very tight collimation of undulator radiation
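The central-cone numbers quoted for APS Undulator A can be reproduced from the compact form of eqn [10], $\theta_{1/2} = [\lambda_n/L]^{1/2}$; this is only a numerical check of the figures in the text:

```python
import math

def central_cone_half_angle_rad(lambda_n_m, undulator_length_m):
    """theta_1/2 = sqrt(lambda_n / L), the compact form of eqn [10]."""
    return math.sqrt(lambda_n_m / undulator_length_m)

# 12 keV first harmonic (lambda ~ 1.03 angstrom) from an L = 2.4 m undulator:
lam = 1.23984e-9 / 12.0                          # E [keV] -> lambda [m]
theta = central_cone_half_angle_rad(lam, 2.4)    # ~6.6e-6 rad, i.e. ~6 microrad
size = theta * 30.0                              # ~2e-4 m beam size 30 m away
```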
INFORMATION PROCESSING
Contents
All-Optical Multiplexing/Demultiplexing
Coherent Analog Optical Processors
Free-Space Optical Computing
Incoherent Analog Optical Processors
Optical Bit-Serial Computing
Optical Digital Image Processing
Optical Neural Networks
. Switching is normally carried out by separating each wavelength of each fiber onto different physical outputs. Space switches are then used to spatially switch the separated wavelengths, an extremely inefficient way of utilizing network resources;
. The need for amplifiers with high gain and flat spectra.

In order to overcome these problems, OTDM was introduced, which offers the following:

. Flexible high bandwidth on demand (> 1 Tbit/s compared to the bit rates of 2.5–40 Gbit/s per wavelength channel in WDM systems);
. The total bandwidth offered by a single-channel network is equal to DWDM;
. In a network environment, OTDM provides potential improvements in:
  . Network user access time, delay and throughput, depending on the user rates and statistics.
  . Less complex end node equipment (single-channel versus multichannel).
. Self-routing and self-clocking characteristics;
. Can operate at second- and third-transmission windows:
  . 1550 nm (like WDM) due to the Erbium-doped fiber amplifier (EDFA);
  . 1300 nm wavelengths.
. Offers both broadcast- and switched-based networks.

Principle of OTDM

Figure 1 shows the generic block diagram of a point-to-point OTDM transmission link, where N optical data channels, each of capacity M Gbps, are multiplexed to give an aggregate rate of N × M Gbps. The fundamental components are a pulsed light source, an optical modulator, a multiplexer, a channel, an add/drop unit, and a demultiplexer. The light source needs to have good stability and be capable of generating ultrashort pulses (< 1 ps). Direct modulation of the laser source is possible, but the preferred method is based on external modulation, where the optical signal is gated by the electronic data. The combination of these techniques allows the time division multiplexed data to be encoded inside a subnanosecond time slot, which is subsequently interleaved into a frame format. Add/drop units provide added versatility (see Figure 2), allowing the 'adding' and 'dropping' of selected OTDM channels to interchange data at chosen points on the link. At the receiving end, the OTDM pulse stream is demultiplexed down to the individual channels at the initial M Gbps data rate. Data retrieval is then within the realm of electronic devices, and the distinction between electronic and optical methods is no longer relevant. Demultiplexing requires high-speed all-optical switching and can be achieved using a number of methods, which will be discussed in more detail below.

Figure 2 Add/drop unit in an OTDM network node.

The optical multiplexing (or interleaving) can be carried out at the bit level (known as bit interleaving) or at the packet level (known as packet interleaving), where blocks of bits are interleaved sequentially. This is in accord with the popular conception of packet-switched networks. The high data rates required, and the consequently narrow time slots, necessitate strict tolerances at processing nodes, e.g., switches. As such, it is important that the duration of the optical pulses is chosen to be significantly shorter than the bit period of the highest multiplexed line rate, in order to reduce the crosstalk between channels.

Bit Interleaved OTDM

A simple conceptual description of a bit interleaved multiplexer is shown in Figure 3. It uses a number of different length optical fiber delay lines (FDL) to interleave the channels. The propagation delay of each FDL is chosen to position the optical channel in its corresponding time slot in relation to the aggregate OTDM signal. Prior to this, each optical pulse train is modulated by the data stream. The output of the modulators and an undelayed pulse train, labeled the framing signal, are combined, using a star coupler or combiner, to produce the high bit rate OTDM signal (see Figure 3b). As shown in Figure 3b, the framing pulse has a higher intensity for clock recovery purposes. At the demultiplexer, the incoming OTDM pulse train is split into two paths (see Figure 4a). The lower path is used to recover the framing pulse by means of thresholding; this is then delayed by an amount corresponding to the position of the ith (wanted) channel (see Figure 4b). The delayed framing pulse and the OTDM pulse stream are then passed through an AND gate to recover the ith channel. The AND operation can be carried out all-optically using, for example, a nonlinear loop mirror, a terahertz optical asymmetric demultiplexer, or a soliton-trapping gate.

Packet Interleaved OTDM

Figure 5a shows the block diagram of a system used to demonstrate packet interleaving. Note that the time interval between successive pulses now needs to be much less than the bit interval T. This is achieved by passing the low bit rate output packet of the modulator into a compressor, which is based on a feed-forward delay line structure. The feed-forward delay lines are based on a cascade of passive M–Z interferometers with unequal arm lengths. This configuration generates the delay times required, i.e., $T-\tau,\ 2(T-\tau),\ \ldots,\ 2^{n-1}(T-\tau)$, etc., where $n = \log_2 k$ is the number of stages, T and τ are the
Figure 3 Bit interleaved OTDM; (a) block diagram and (b) timing waveforms.
INFORMATION PROCESSING / All-Optical Multiplexing/Demultiplexing 227
incoming bit duration and the outgoing bit duration, of the compressor is:
respectively, and k is the packet length in bits.
Pulse i’s location at the output is given by X
k21
ð2n 2 1ÞðT 2 tÞ þ ði 2 1Þt: The signal at the input Iin ðtÞ ¼ dðt 2 iTÞAi ½1
i¼0
1 X
k21
Oi ðtÞ ¼ Ii ½t 2 iðT 2 tÞ ½2
2nþ1 i¼0
X
k21
1 X k21
k21 X
Figure 4 Bit interleaving demultiplexer: (a) block diagram and Iout ¼ Oi ¼ Ii ½t 2 ði þ jÞT þ jtAi ½3
(b) typical waveforms. i¼0 2nþ1 i¼0 j¼0
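The compressor timing just described (the cascade of stage delays and the resulting output pulse positions) can be sketched numerically; the 100 ps / 25 ps figures below are illustrative, not values from the article:

```python
import math

def stage_delays(T, tau, k):
    """Delays of the n = log2(k) feed-forward stages: (T-tau), 2(T-tau), ..., 2^(n-1)*(T-tau)."""
    n = int(math.log2(k))
    return [2**j * (T - tau) for j in range(n)]

def output_pulse_time(i, T, tau, k):
    """Position of pulse i in the compressed packet: (2^n - 1)(T - tau) + (i - 1)*tau."""
    n = int(math.log2(k))
    return (2**n - 1) * (T - tau) + (i - 1) * tau

# k = 4 bits arriving at T = 100 ps spacing, compressed to tau = 25 ps spacing
times = [output_pulse_time(i, 100e-12, 25e-12, 4) for i in range(1, 5)]
# successive output pulses are exactly tau apart, i.e. the packet is compressed
```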
Figure 5 Packet interleaving multiplexer: (a) block diagram and (b) typical waveforms.
Figure 6 Packet interleaving demultiplexer: (a) block diagram and (b) typical waveforms.
For a packet delay of (i + j)T, packets are being built up. For the case (i + j) = 3, four bits are compressed into a packet (see Figure 5b). An optical gating device is used to select the only complete usable compressed copy of the incoming packets from the unwanted copies.

At the receiving end, demultiplexing is carried out by decompressing the OTDM packet stream using the same delay line structure (see Figure 6). Multiple copies of the compressed packet are generated, and each is delayed by (T − τ). An optical demultiplexer placed at the output of the delay lines generates a sequence of switching windows at the rate of 1/T. By positioning the switching pulses at the appropriate position, a series of adjacent packet (channel) bits from each of the copied compressed packets is extracted (see Figure 6b). The output of the demultiplexer is the decompressed packet signal, which can now be processed at a much slower rate using electronic circuitry.

Components of an OTDM System

Optical Sources

In an ultrahigh speed OTDM system, it is essential that the optical sources are capable of generating transform-limited subnanosecond pulses having low duty cycles, tuneability, and a controllable repetition rate for synchronization. A number of suitable light sources are: gain-switched distributed feedback (DFB) lasers, active mode-locked lasers (MLL) (capable of generating repetitive optical pulses), and harmonic mode-locking Erbium-doped fiber (EDF) lasers. Alternative techniques include super-continuum pulse generation in a dispersion-shifted fiber with EDF pumping, adiabatic soliton compression using dispersion-flattened dispersion-decreasing fiber, and pedestal reduction of compressed pulses using a dispersion-imbalanced nonlinear optical loop mirror.

As shown in Figure 1, the laser light source is power split to form the pulse source for each channel. These are subsequently modulated with an electrical data signal (external modulation). External modulation is preferred in an OTDM system as it can achieve narrow carrier linewidth, thus reducing the timing jitter of the transmitted pulse. For ultrahigh bit rate OTDM systems, the optical pulses emerging from the external modulators may also need compressing. One option is to frequency-chirp the pulses and pass them through an anomalous dispersive medium. Using this approach, it is essential that the frequency chirp be linear throughout the duration of the pulse; in addition, the midpoint of the linear-frequency chirp should coincide with the center of the pulse. When a frequency-chirped pulse passes through a dispersive medium, different parts of the pulse travel at different speeds, due to a temporal variation of frequency. If the trailing edge travels faster than the leading edge, the result would be pulse compression. An optical fiber cable or a semiconductor laser amplifier (SLA) can be used to create the frequency chirp.
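The chirp-and-disperse compression just described can be illustrated with a small FFT simulation: a linearly chirped Gaussian pulse is given a quadratic spectral phase (modeling the anomalous dispersive element) whose sign cancels the chirp. All numbers (pulse width, chirp parameter, grid) are illustrative, not taken from the article:

```python
import numpy as np

t = np.linspace(-50e-12, 50e-12, 4096)          # time grid [s]
t0 = 5e-12                                       # initial Gaussian width
C = 5.0                                          # linear chirp parameter
field = np.exp(-(1 + 1j * C) * t**2 / (2 * t0**2))

# quadratic spectral phase; the sign and magnitude are chosen so that, under
# numpy's FFT sign convention, it cancels the pulse's chirp
w = 2 * np.pi * np.fft.fftfreq(t.size, d=t[1] - t[0])
gdd = C * t0**2 / (1 + C**2)
out = np.fft.ifft(np.fft.fft(field) * np.exp(-0.5j * gdd * w**2))

def fwhm(x, y):
    """Full width at half maximum of the intensity |y|^2 on grid x."""
    inten = np.abs(y)**2
    above = np.where(inten >= inten.max() / 2)[0]
    return x[above[-1]] - x[above[0]]

# the output pulse is roughly sqrt(1 + C**2) ~ 5x shorter than the input
```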
Figure 7 Configuration of a PLC-OTDM-MUX. Reproduced with permission from Ohara T, Takara H, Shake I, et al. (2003) 160-Gb/s optical time-division multiplexing with PPLN hybrid integrated planar lightwave circuit. IEEE Photonics Technology Letters 15(2): 302–304. © 2003 IEEE.
Figure 9 SLA carrier density response to input pulse.

The refractive index of an SLA is a function of the semiconductor carrier density, which can be modulated optically, resulting in fast modulation. If a high-intensity pulse is input to an SLA, the carrier density changes nonlinearly, as shown in Figure 9. Phase modulation is effected via the material chirp index (dN/dn), which represents the gradient of the refractive-index versus carrier-density curve for the material. The phase modulation is quite strong and, in contrast to the intensity-based nonlinearity in an optical fiber (see Kerr effect below), is a consequence of a resonant interaction between the optical signal and the material. The nonlinearity is relatively long-lived and would be expected to limit the switching speed when using semiconductors. Semiconductors, when in excitation from an optical field, tend to experience the effect quite quickly (ps) with a slow release

$$\frac{I_{\mathrm{out}}(t)}{I_{\mathrm{in}}(t)} = 0.25\left(G_{\mathrm{SLA1}}(t) + G_{\mathrm{SLA2}}(t) \pm 2\sqrt{G_{\mathrm{SLA1}}(t)\,G_{\mathrm{SLA2}}(t)}\,\cos\Delta\phi(t)\right) \qquad [4]$$

where $G_{\mathrm{SLA1}}(t)$ and $G_{\mathrm{SLA2}}(t)$ refer to the gain profiles of the respective SLAs, and $\Delta\phi(t)$ is the time-dependent phase difference between them. Assuming that the gain and phase profiles of an excited amplifier are given by G(t) and $\phi(t)$, respectively, then the signals passing through the upper and lower arms experience optical properties of the material given by:
$$G(t),\ G(t - T_d),\ \phi(t)\ \text{and}\ \phi(t - T_d) \qquad [5]$$
where $T_d$ is given by $2L_d N_{\mathrm{SLA}}/c$, $L_d$ is the distance between SLA1 and SLA2, $N_{\mathrm{SLA}}$ is the SLA index, and c is the speed of light in a vacuum. As the width of the $\Delta\phi(t)$ profile is dependent on the distance between the SLAs, placing them in close proximity allows high-resolution switching. Less emphasis has been placed on the SLA gain, as this is considered to be less effective when phase differences of π are reached. It only remains for the gain and phase
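Equation [4] can be evaluated directly as a quick numerical check of the switching behavior; the unit gains used below are illustrative only:

```python
import math

def mz_sla_transmittance(g1, g2, dphi, bar_port=False):
    """Intensity transfer of the SLA Mach-Zehnder switch, eqn [4]:
    Iout/Iin = 0.25*(G1 + G2 +/- 2*sqrt(G1*G2)*cos(dphi))."""
    sign = 1.0 if bar_port else -1.0
    return 0.25 * (g1 + g2 + sign * 2.0 * math.sqrt(g1 * g2) * math.cos(dphi))

# with equal gains the cross port swings from 0 (dphi = 0) to full transmission
# (dphi = pi), which is the switching action exploited by the demultiplexer
```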
Sagnac Interferometers
There are two main types: the nonlinear optical loop mirror (NOLM), and the terahertz optical asymmetric demultiplexer (TOAD).
NOLM
In this method of switching, the inherent nonlinearity of an optical fiber known as the Kerr effect is used. The phase velocity of any light beam passing through the fiber will be affected by its own intensity and the intensity of any other beams present. When the intrinsic third-order nonlinearity of silica fibers is considered via the intensity-dependent nonlinear refraction component, then the signal phase shift is
$$\Delta\phi_{\mathrm{signal}} = \frac{2\pi}{\lambda_{ds}}\, n_2 L I_{ds} + \frac{2\pi}{\lambda_{ds}}\, 2 n_2 L I_c \qquad [6]$$

Figure 11 Nonlinear optical loop mirror.
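Equation [6] is simple enough to evaluate directly; note the factor of 2 on the cross-phase-modulation term, which makes the control beam twice as effective per unit intensity as the signal's own self-phase modulation. A minimal sketch with symbolic (dimensionless) arguments:

```python
import math

def kerr_phase_shift(n2, L, lam_ds, intensity_signal, intensity_control):
    """Signal phase shift of eqn [6]: SPM plus the factor-of-2 XPM term,
    dphi = (2*pi/lambda_ds) * n2 * L * (I_ds + 2*I_c)."""
    return (2 * math.pi / lam_ds) * n2 * L * (intensity_signal + 2.0 * intensity_control)
```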
Assuming unity gain around the loop, the transmittance of the NOLM is
$$T_x = 1 - \cos^2\!\left(\frac{\Delta\phi(t)}{2}\right) = 1 - \cos^2\!\left(\frac{2\pi}{\lambda_{ds}}\, n_2 \int_0^L I_C(t)\,\mathrm{d}x\right) \qquad [8]$$
As it stands, the switching resolution is determined by the width of the control pulse; however, it is possible to allow the signal and control to 'walk off' each other, allowing the window width to be increased by an amount determined by the 'walk off'. The phase shift is now
$$\Delta\phi(t) = \frac{2\pi}{\lambda_{ds}}\, 2 n_2 \int_0^L I_C(t - T_w x)\,\mathrm{d}x \qquad [9]$$
where the parameter $T_w$ is the walk-off time per unit length between control and signal pulses. The increased window width is accompanied by a lower peak transmission (see Figure 13).

TOAD

The TOAD uses a loop mirror architecture that incorporates an SLA and only needs a short length of fiber loop (see Figure 14a). The SLA carrier density is modulated by a high-intensity control pulse, as in the M–Z-based demultiplexer. The operation of the TOAD is similar to the NOLM, where the loop transmittance is determined by the phase difference between CW and CCW traveling pulses. Strictly speaking, the gain of the SLA must also be taken into account; however, without any loss of generality the effect is adequately described by considering only the phase property. Figure 14a shows the timing diagram associated with a TOAD demultiplexer. The control pulse, shown in Figure 14b(1), is incident at the SLA at a time t1; the phase profiles for the CW and CCW data pulses are shown in Figures 14b(2) and (3), respectively. The resulting transmission window is shown in Figure 14b(4). The window width is given by $t_{\mathrm{asy}} = 2\Delta x/c_f$, where Δx is the SLA offset from the loop center and $c_f$ is the speed of light in the fiber. As in the NOLM, if the phase difference is of sufficient magnitude, then data can be switched to port 2. The switch definition is, in principle, determined by how close the SLA is placed to the loop center when the asymmetry is relatively large. However, for small asymmetries the switching window is asymmetric, which is due to the CW and CCW gain profiles being different (see Figure 15). Assuming the phase modulation dominates the transmission, then the normalized transmission for a small asymmetry loop is as depicted in Figure 16. The gain response (not shown) would have a similar shape and temporal position as the phase response.
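The window-width relation $t_{\mathrm{asy}} = 2\Delta x/c_f$ gives a direct feel for the required placement tolerance; the default fiber light speed below (~2 × 10⁸ m/s, silica) is an assumed value:

```python
def toad_window_width_s(dx_m, c_fiber_m_s=2.0e8):
    """Switching-window width t_asy = 2*dx/c_f for an SLA offset dx from the
    loop center."""
    return 2.0 * dx_m / c_fiber_m_s

# a 1 mm offset gives a 10 ps switching window, so picosecond-scale windows
# demand sub-millimeter placement of the SLA
```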
inset in Figure 21). The intensity of the demultiplexed optical signal is boosted by the optical pre-amplifier (EDFA). Amplified spontaneous emission (ASE) from the optical preamplifier adds a wide spectrum of optical fields onto the demultiplexed optical signal. Although an optical bandpass filter (BPF) can reduce the ASE, it still remains one of the major noise sources. The function of the optical filter is to reduce the excess ASE to within the range of the signal spectrum. The PIN photodiode converts the received optical power into an equivalent electrical current, which is then converted to a voltage signal by an electrical amplifier. Finally, the output voltage is sampled periodically by a decision circuit for estimating the correct state (mark/space) of each bit.

The BER performance of an OTDM system deteriorates because of the noise and crosstalk
$$\mathrm{BER} = \frac{\exp(-Q^2/2)}{\sqrt{2\pi}\,Q} \qquad [13]$$
where Q is defined as:
$$Q = \frac{I_m - I_s}{\sqrt{\sigma^2_{\mathrm{RIN},m} + \sigma^2_{\mathrm{RIN},s} + \sigma^2_{\mathrm{amp},m} + \sigma^2_{\mathrm{amp},s} + \sigma^2_{\mathrm{rec},m} + \sigma^2_{\mathrm{rec},s}}} \qquad [14]$$
and $I_m$ and $I_s$ are the average photocurrents for a mark and a space, respectively. The $\sigma_{x,i}$ are the variances of the relative intensity noise (RIN), the optical pre-amplifier, and the receiver, for a mark and a space. For a 100 Gb/s (10-channel) OTDM system, the BER against average received optical power for optimized NOLM and TOAD demultiplexers is shown in Figure 22.
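Equations [13] and [14] translate directly into code; the photocurrent and variance values passed in below are placeholders, not measured figures:

```python
import math

def q_factor(i_mark, i_space, noise_variances):
    """Eqn [14]: Q = (Im - Is) / sqrt(sum of the mark/space noise variances)."""
    return (i_mark - i_space) / math.sqrt(sum(noise_variances))

def bit_error_rate(q):
    """Eqn [13]: BER = exp(-Q^2/2) / (sqrt(2*pi)*Q)."""
    return math.exp(-q**2 / 2.0) / (math.sqrt(2.0 * math.pi) * q)

# Q = 6 gives the oft-quoted BER of ~1e-9
```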
List of Units and Nomenclature

$A_i$: ith data bit in a packet
$c$: Speed of light in a vacuum
$c_f$: Speed of light within the fiber
$E_c$: Electric field of the control signal
$E_{ds}$: Electric field of the data signal
$f_{\mathrm{FWM}}$: Frequency due to four-wave mixing
$G_{\mathrm{SLA}}(t)$: Gain profile of the SLA
$I_c$: Intensity of the control signal
$I_D$: Intensity of the data signal
$I_m$: Average photocurrent for a mark
$I_s$: Average photocurrent for a space
$k$: Packet length in bits
$L$: Fiber length
$L_d$: Distance between two SLAs
$L_E$: Fiber effective length
$n$: Number of stages
$n_2$: Kerr coefficient
$N_{\mathrm{SLA}}$: SLA index
$P$: Polarization
$P_c$: Control signal launch power
$P_{ds}$: Data signal launch power
$t_{\mathrm{asy}}$: Window width, $t_{\mathrm{asy}} = 2\Delta x/c_f$
$T$: Incoming bit interval
$T_w$: Walk-off time per unit length between control and data pulses
$T_x$: Transmittance of the NOLM
$\delta(\cdot)$: Optical carrier pulse shape
$\Delta\phi(t)$: Time-dependent phase difference
$\Delta x$: SLA offset from the fiber loop center
$\varepsilon_0$: Vacuum permittivity of the medium
$\lambda_s$: Data signal wavelength
$\sigma_{\mathrm{amp},m}$: Variance for the optical pre-amplifier for a mark
$\sigma_{\mathrm{amp},s}$: Variance for the optical pre-amplifier for a space
$\sigma_{\mathrm{rec},m}$: Variance for the receiver for a mark
$\sigma_{\mathrm{rec},s}$: Variance for the receiver for a space
$\sigma_{\mathrm{RIN},m}$: Variance of RIN for a mark
$\sigma_{\mathrm{RIN},s}$: Variance of RIN for a space
$\tau$: Outgoing bit interval
$\phi(t)$: Phase profile of the SLA
$\chi^{(3)}$: Third-order nonlinear susceptibility of an optical fiber

See also

Interferometry: Overview. Nonlinear Optics, Basics: Four-Wave Mixing. Optical Communication Systems: Optical Time Division Multiplexing; Wavelength Division Multiplexing.

Further Reading

Agrawal GP (2002) Fiber-Optic Communication Systems, 3rd edn. New York: J Wiley and Sons Inc.
Chang K (ed.) (2003) Handbook of Optical Components and Engineering, 2nd edn. New York: Wiley and Sons Inc.
De Marchis G and Sabella R (1999) Optical Networks Design and Modelling. Dordrecht: Kluwer Academic.
Hamilton SA, Robinson BS, Murphy TE, Savage SJ and Ippen EP (2002) 100 Gbit/s optical time-division multiplexed networks. Journal of Lightwave Technology 20(12): 2086–2100.
Jhon YM, Ki HJ and Kim SH (2003) Clock recovery from 40 Gbps optical signal with optical phase-locked loop based on a terahertz optical asymmetric demultiplexer. Optics Communications 220: 315–319.
Kamatani O and Kawanishi S (1996) Ultrahigh-speed clock recovery with phase locked loop based on four wave mixing in a traveling-wave laser diode amplifier. Journal of Lightwave Technology 14: 1757–1767.
Nakamura S, Ueno Y and Tajima K (2001) Femtosecond switching with semiconductor-optical-amplifier-based symmetric-Mach–Zehnder-type all-optical switch. Applied Physics Letters 78(25): 3929–3931.
Ohara T, Takara H, Shake I, et al. (2003) 160-Gb/s optical time-division multiplexing with PPLN hybrid integrated planar lightwave circuit. IEEE Photonics Technology Letters 15(2): 302–304.
Ramaswami R and Sivarajan KN (2002) Optical Networks: A Practical Perspective. San Francisco: Morgan Kaufmann.
Reman E (2001) Trends and evolution of optical networks and technology. Alcatel Telecommunications Review, 3rd Quarter: 173–176.
Sabella R and Lugli P (1999) High Speed Optical Communications. Dordrecht: Kluwer Academic.
Seo SW, Bergman K and Prucnal PR (1996) Transparent optical networks with time division multiplexing. Journal of Lightwave Technology 14(5): 1039–1051.
Sokoloff JP, Prucnal PR, Glesk I and Kane M (1993) A terahertz optical asymmetric demultiplexer (TOAD). IEEE Photonics Technology Letters 5(7): 787–790.
Stavdas A (ed.) (2001) New Trends in Optical Network Design and Modelling. Dordrecht: Kluwer Academic.
Ueno Y, Takahashi M, Nakamura S, Suzuki K, Shimizu T, Furukawa A, Tamanuki T, Mori K, Ae S, Sasaki T and Tajima K (2003) Control scheme for optimizing the interferometer phase bias in the symmetric-Mach–Zehnder all-optical switch (invited paper). Joint special issue on recent progress in optoelectronics and communications. IEICE Transactions on Electronics E86-C(5): 731–740.
INFORMATION PROCESSING / Coherent Analog Optical Processors 237
After expanding the squares in the exponentials, there will be three terms: one can go outside the integral and one will be approximately equal to unity, so the diffracted pattern becomes
$$U(x_1,y_1) = \frac{\exp\!\left(i\,\frac{2\pi d}{\lambda}\right)}{i\lambda d}\exp\!\left(i\,\frac{\pi}{\lambda d}(x_1^2 + y_1^2)\right)\int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} U(x_0,y_0)\,\exp\!\left(-i\,\frac{2\pi}{\lambda d}(x_0 x_1 + y_0 y_1)\right)\mathrm{d}x_0\,\mathrm{d}y_0 \qquad [7]$$
This is the Fourier transform of $U(x_0,y_0)$, scaled by a factor λd, with an extra quadratic phase factor
$$\exp\!\left(i\,\frac{\pi}{\lambda d}(x_1^2 + y_1^2)\right)$$
The factor $\exp(i2\pi d/\lambda)/i$ is usually of no importance since it is a phase factor distributed across the output plane. In some applications we only want to know the power spectrum (squared modulus) of the Fourier transform. In such cases the quadratic phase factor disappears because the squared modulus of any complex exponential is equal to 1. Performing a Fourier transform without a lens is possible within the Fraunhofer approximation. But if it is required to obtain the Fourier transform of an image which extends about 1 cm from the optical axis using a coherent He–Ne laser (λ = 632.8 nm), the condition to be satisfied is d ≫ 496 m: the required observation distance d is at least a few kilometers. This kind of optical setup is usually impractical, but can fortunately be alleviated by means of lenses.

Object on the Lens

Instead of having only the transparency in the input plane, we place a lens immediately after it, as shown in Figure 2. For mathematical simplicity the lens and input plane are superimposed (more realistic cases are considered below). The lens acts as a phase factor
$$\exp\!\left(-i\,\frac{\pi}{\lambda f}(x_0^2 + y_0^2)\right)$$
where f is the focal length of the lens. Now in the input plane the object $U(x_0,y_0)$ is multiplied by the phase factor of the lens:
$$U_0(x_0,y_0) = U(x_0,y_0)\exp\!\left(-i\,\frac{\pi}{\lambda f}(x_0^2 + y_0^2)\right) \qquad [8]$$
Inserting this expression into the integral formula for Fresnel diffraction yields the diffraction pattern observed in the focal plane of the lens:
$$U(x_1,y_1) = \frac{\exp\!\left(i\,\frac{2\pi f}{\lambda}\right)}{i\lambda f}\exp\!\left(i\,\frac{\pi}{\lambda f}(x_1^2 + y_1^2)\right)\int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} U(x_0,y_0)\,\exp\!\left(-i\,\frac{2\pi}{\lambda f}(x_0 x_1 + y_0 y_1)\right)\mathrm{d}x_0\,\mathrm{d}y_0 \qquad [9]$$
The lens exactly compensates for the other phase factor that was in the integral. This expression for the Fraunhofer approximation is almost the same as before, but the focal length f of the lens replaces the distance d. So by placing a lens immediately after the input plane we can have a much more compact setup, since lenses are available with a wide range of focal lengths, usually from a few centimeters to a few meters. Another way to look at this is to consider that the effect of a lens is to bring the pattern at infinity to the focal plane of the lens, except for the phase factor.

Figure 2 Diffraction with the object on the lens.

Object Before the Lens

Figure 3 Diffraction with the object before the lens.

Another setup used to perform the Fourier transform is shown in Figure 3, where the lens is placed somewhere between the input and the output planes. Using the Fresnel formula for diffraction it can be
INFORMATION PROCESSING / Coherent Analog Optical Processors 239
shown that the diffraction pattern in the focal plane is: scale of the Fourier transform by changing the
distance between the lens and the object.
1 Note that although two of the three methods yield
Uðx1 ; y1 Þ ¼
ilf Fourier transforms that are accompanied by a
" ! # quadratic phase factor, the latter is not necessarily
p d ð1 ð1
2 2
exp i 12 ðx1 þ y1 Þ Uðx0 ; y0 Þ an impediment to correlation, contrary to popular
lf f 21 21
" # belief. Indeed all three configurations have been used
exp[−i(2π/λf)(x_0 x_1 + y_0 y_1)] dx_0 dy_0   [10]

Object After the Lens

Still another configuration is the one shown in Figure 4, where the lens is placed at a distance d before the input plane. For this configuration the Fresnel diffraction yields the following pattern in the focal plane of the lens:

U(x_1, y_1) = [exp(iπ(x_1² + y_1²)/λd)/(iλd)] ∫∫ U(x_0, y_0) exp[−i(2π/λd)(x_0 x_1 + y_0 y_1)] dx_0 dy_0   [11]

Once again the Fourier transform of U(x_0, y_0) is multiplied by a quadratic phase factor. It is interesting to note that, in this case, the Fourier transform is scaled by a factor λd. This can be used to control the scale of the Fourier transform for correlation pattern recognition, and each configuration has its advantages.

(f ⋆ h)(−x, −y) = FT[F(μ, ν)H*(μ, ν)]   [12]

The correlation can be considered as the result of three operations: a Fourier transform; a product; and an additional Fourier transform. The Fourier transform of a 2D function can be obtained by means of coherent optics. The equivalent of multiplication in optics is transmissivity, so multiplication is accomplished by sending light successively through two superimposed transparencies.

There are two main categories of optical correlators: serial correlators and joint transform correlators.

Serial Correlators

A serial correlator functions in two stages. The first stage performs the Fourier transform of the input image f(x, y), which is multiplied in the Fourier plane with a transparency H(μ, ν). The filter H(μ, ν) can be a hologram, a spatial light modulator (SLM) or a simple transparency. A second Fourier transform displays in the output plane the correlation of the input image with the impulse response h(x, y), which is the Fourier transform of the filter function H(μ, ν).

Serial correlators directly implement the two Fourier transforms and the multiplication operation in a pipelined manner. The input plane and the correlation plane are conjugate planes of the system; that is, in the absence of a filter, the output plane displays an inverted image of the input f(x, y). Between those conjugate planes, at the conjugate plane of the source with respect to the first lens, is the Fourier plane where the spatial frequency filter is placed. When the input is illuminated with parallel light, the frequency plane is at the back focal plane of the first lens.

Figure 4 Diffraction with the object after the lens.
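The three-operation decomposition of eqn [12] — a transform, a product with H*, and a second transform — can be checked numerically. The sketch below is illustrative only: a naive one-dimensional discrete Fourier transform in pure Python, with hypothetical 8-point signals f and h standing in for the input image and the impulse response. In the discrete case the second transform returns the correlation with inverted coordinates, up to the factor N of the transform.

```python
import cmath

def dft(x):
    # Forward discrete Fourier transform (naive O(N^2)); stands in for the
    # optical Fourier transform performed by a lens.
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def correlate(f, h):
    # Direct circular cross-correlation: sum over m of f[m] * conj(h[m - n]).
    N = len(f)
    return [sum(f[m] * h[(m - n) % N].conjugate() for m in range(N))
            for n in range(N)]

f = [0, 1, 2, 1, 0, 0, 0, 0]   # hypothetical input image (1-D)
h = [0, 0, 1, 1, 0, 0, 0, 0]   # hypothetical impulse response

F, H = dft(f), dft(h)
product = [Fk * Hk.conjugate() for Fk, Hk in zip(F, H)]  # multiplication in the Fourier plane
lhs = dft(product)                                       # second forward transform
N = len(f)
rhs = correlate(f, h)

# Eqn [12]: the second transform yields the correlation at inverted coordinates.
for n in range(N):
    assert abs(lhs[n] / N - rhs[(-n) % N]) < 1e-9
```

The forward/forward transform pair (rather than forward/inverse) is exactly what produces the coordinate inversion noted in the text for the output plane.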
240 INFORMATION PROCESSING / Coherent Analog Optical Processors
A SLM can be placed in the input plane to feed the images to be correlated; a second SLM can be placed in the Fourier plane to display the filter. The filter is often calculated by a digital computer, and usually takes the form of a hologram if phase information must be used in the filter.

The 4-f correlator

The 4-f correlator is the most intuitive optical correlator system. We showed above that an f-f system can be used to optically obtain a Fourier transform (Figure 5). If f(x, y) is in the front focal plane, its Fourier transform F(μ, ν) appears at the back focal plane of the first lens. If a filter function H*(μ, ν) is inserted at this point, by means of a SLM or a hologram, then a second f-f system can perform a second Fourier transform to yield the correlation plane at the back focal plane of the second lens.

Convergent beam correlator

This kind of correlator consists of two conjugate systems. This time, the Fourier transform is obtained by placing the input object in a convergent beam after the first lens. The second lens is placed so that the output plane is conjugate to the input plane, and as before, the filter is placed at the point of convergence of the light after the first lens. It is important to note that this is not the focal plane of the lens, but the conjugate plane of the source of light. In order to have an imaging system between the input and output planes, the distances must satisfy the conjugate relation:

1/s_0 + 1/s_i = 1/f_2   [13]

where s_i and s_0 are defined in Figure 6. As previously discussed for convergent light, the Fourier transform will have an additional parabolic phase factor. This must be taken into account when constructing the filter H.

This configuration has some advantages over the 4-f correlator. By moving the object between the first lens and the filter, it is possible to control the scale of the Fourier transform. And because the first lens is imaging only a single point on the optical axis, only spherical aberration needs to be compensated for. In 4-f systems, the first lens must be corrected for all aberrations over the whole input plane, and for a given extent of the input object, the first lens must be larger than that for the convergent system in order to avoid space variance effects. Unfortunately, specially designed expensive lenses are required to correct for those aberrations, so well-corrected Fourier transform lenses for 4-f systems are expensive, whereas a simple doublet is sufficient to correct for on-axis spherical aberration, which is all that is required for the convergent light system. In addition, the convergent light system does not require the expensive collimator that is needed in the 4-f system to obtain the parallel illuminating beam. Aberration correction requirements for the second lens are less stringent than for the first because, in most cases, a slight deformation of the output plane can be acceptable, whereas comparable distortions in the Fourier plane can cause misregistrations between the Fourier transform and the filter that can seriously degrade the correlation.
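The conjugate relation of eqn [13] is easy to exercise numerically. The helper below (hypothetical numbers, distances in arbitrary units) solves it for the image distance s_i:

```python
def image_distance(s0, f):
    # Solve the conjugate (thin-lens) relation 1/s0 + 1/si = 1/f for si.
    return 1.0 / (1.0 / f - 1.0 / s0)

# With the object at twice the focal length, the image is also at 2f —
# the shortest possible conjugate distances, used by the single lens correlator.
assert abs(image_distance(200.0, 100.0) - 200.0) < 1e-9
# Unit magnification in that case: |si / s0| == 1.
assert abs(image_distance(200.0, 100.0) / 200.0 - 1.0) < 1e-9
```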
Divergent beam correlators

Instead of placing the input object in the convergent beam, as in the previous case, another configuration that has been used places the input object in the divergent beam before the first lens. This configuration is rarely seen today, because it exacerbates the space variance and aberration problems of the 4-f system, and the control it allows over the scale of the Fourier transform can be achieved with fewer penalties by means of the convergent light system.

Single lens correlator

Consider an imaging system using only one lens illuminated with a parallel beam of coherent light (Figure 7). The input and output planes are located at distances 2f on either side of the lens – the shortest conjugate distances possible. If a filter H(μ, ν) is placed at the focal plane of the lens, on the object's Fourier transform, the correlation will appear on the output plane. This system requires satisfying impractical conditions – aberration corrections for both the Fourier transform and the imaging operation – so it is rarely seen.

Filters

Many types of filters can be used for different applications. Table 1 lists some commonly used filters for pattern recognition. The classical matched filter yields the optimum signal to noise ratio in the presence of additive noise. The phase-only filter (POF) is frequently used because of its high discrimination capability. The inverse filter is sometimes used because it theoretically yields very sharp correlation peaks, but it is very sensitive to noise, and when the spectrum F(μ, ν) has some zeros, the realization of 1/F(μ, ν) can be impossible. A close relative of the inverse filter is the Wiener filter, which avoids the problem of the inverse filter by introducing a quantity S(μ, ν) in the denominator of the filter function. The function S(μ, ν) can sometimes be a constant. H(μ, ν) is usually chosen to minimize the effect of noise or to optimize some parameter such as the peak to sidelobe ratio, or the discrimination ability of the system.

Table 1 Commonly used filters for image processing

Name               Filter
Matched filter     H_CMF(μ, ν) = F*(μ, ν)
Phase-only filter  H_POF(μ, ν) = F*(μ, ν)/|F(μ, ν)|
Inverse filter     H_INV(μ, ν) = 1/F(μ, ν)
Wiener filter      H_W(μ, ν) = F*(μ, ν)/(|F(μ, ν)|² + S(μ, ν))

Joint Transform Correlators

In serial transform correlators, the filter is generally a complex function. Spatial light modulators are usually conceived with only one degree of freedom for each pixel, which can be optimized for amplitude modulation or for phase modulation. In order to achieve a fully complex filter, two modulators are required, or some spatial bandwidth of a modulator is sacrificed to encode a computer-generated hologram.

The joint-transform correlator (JTC) does not require a fully complex filter. The filter is stored in the spatial domain, so there is usually no need for the computation of a filter in the spectral domain. In the JTC, the correlation of a scene with a reference image is contained in the power spectrum of an input plane consisting of the scene placed side by side with the reference. The correlation terms are revealed when the Fourier transform of the power spectrum is computed.
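The filters of Table 1 translate directly into code. The sketch below builds all four from a hypothetical one-dimensional reference spectrum F (pure Python; the Wiener term S is taken as a constant, as the text allows):

```python
def filters_from_reference(F, S=0.01):
    # Build the four classical filters of Table 1 from the reference
    # spectrum F(mu, nu); S is the Wiener regularization term.
    H_cmf = [Fk.conjugate() for Fk in F]                       # matched filter
    H_pof = [Fk.conjugate() / abs(Fk) for Fk in F]             # phase-only filter
    H_inv = [1.0 / Fk for Fk in F]                             # inverse filter
    H_w   = [Fk.conjugate() / (abs(Fk) ** 2 + S) for Fk in F]  # Wiener filter
    return H_cmf, H_pof, H_inv, H_w

F = [1 + 1j, 2 - 1j, 0.5 + 0j, -1 + 2j]   # hypothetical reference spectrum (no zeros)
H_cmf, H_pof, H_inv, H_w = filters_from_reference(F)

# The phase-only filter keeps unit magnitude at every frequency.
assert all(abs(abs(Hk) - 1.0) < 1e-9 for Hk in H_pof)
# The inverse filter exactly flattens the reference: F * H_inv == 1,
# which is why it yields very sharp (delta-like) correlation peaks.
assert all(abs(Fk * Hk - 1.0) < 1e-9 for Fk, Hk in zip(F, H_inv))
# As S -> 0 the Wiener filter approaches the inverse filter.
H_w0 = filters_from_reference(F, S=1e-12)[3]
assert all(abs(Hk - 1.0 / Fk) < 1e-6 for Hk, Fk in zip(H_w0, F))
```

Note that the sample spectrum was deliberately chosen without zeros; at a zero of F(μ, ν) the inverse filter is undefined, which is exactly the failure mode the Wiener filter is designed to avoid.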
The joint transform correlation can be summarized in three steps:

1. The scene and the reference are located side by side in the input plane.
2. The Fourier transform of the input plane is computed, or carried out optically by one of the methods previously described. The transform intensity is measured and constitutes the joint power spectrum. Some optional enhancements, based on spatial filtering and/or nonlinear processing of the power spectrum, can also be performed.
3. The Fourier transform of this joint power spectrum is carried out, yielding the correlation plane containing the correlation of the reference with the scene.

A mathematical analysis is required to understand the location of the correlation peaks. The side-by-side input scene s(x, y) and reference image r(x, y) give the Fourier transform of the input plane:

I(μ, ν) = R(μ, ν)e^{−i2π(μx_r + νy_r)} + S(μ, ν)e^{−i2π(μx_s + νy_s)}   [14]

The joint power spectrum E(μ, ν) is equal to the modulus squared of the Fourier transform of the input plane:

E(μ, ν) = |I(μ, ν)|²
        = |R(μ, ν)|² + |S(μ, ν)|²
        + R(μ, ν)S*(μ, ν)e^{−i2π[μ(x_r − x_s) + ν(y_r − y_s)]}
        + R*(μ, ν)S(μ, ν)e^{−i2π[μ(x_s − x_r) + ν(y_s − y_r)]}   [15]

The joint power spectrum contains the Fourier transforms of the autocorrelation of the input scene, the autocorrelation of the reference, the correlation of the scene with the reference, and its complex conjugate. The correlation peaks are obtained from the Fourier transform of the joint power spectrum:

C(x′, y′) = R_ss(x′, y′) + R_rr(x′, y′)
          + R_sr(x′ − (x_r − x_s), y′ − (y_r − y_s))
          + R_sr(x′ + (x_r − x_s), y′ + (y_r − y_s))   [16]

where

R_ab(x′, y′) = ∫∫ a(ξ, ζ)b*(ξ − x′, ζ − y′) dξ dζ   [17]

Multiple targets and multiple references

The input scene of the JTC may contain multiple targets. Each target in the scene will correlate with the other targets. The additional correlation peaks can clutter the correlation plane, so some care must be taken when positioning the scene with respect to the reference. Let s(x, y) contain multiple targets s_i(x, y):

s(x, y) = Σ_{i=1}^{N} s_i(x − x_i, y − y_i)   [18]

The joint power spectrum becomes:

E(μ, ν) = |R(μ, ν)|² + Σ_{i=1}^{N} |S_i(μ, ν)|²
        + Σ_{i=1}^{N} Σ_{k=1, k≠i}^{N} S_i(μ, ν)S_k*(μ, ν)e^{−i2π[μ(x_i − x_k) + ν(y_i − y_k)]} + C.C.
        + Σ_{i=1}^{N} R(μ, ν)S_i*(μ, ν)e^{−i2π[μ(x_r − x_i) + ν(y_r − y_i)]} + C.C.   [19]

where C.C. denotes the complex conjugate of the preceding term. The correlation plane will now be:

C(x′, y′) = R_rr(x′, y′) + Σ_{i=1}^{N} R_{s_i s_i}(x′, y′)
          + Σ_{i=1}^{N} Σ_{k=1, k≠i}^{N} R_{s_i s_k}(x′ − (x_i − x_k), y′ − (y_i − y_k))
          + Σ_{i=1}^{N} R_{r s_i}(x′ − (x_r − x_i), y′ − (y_r − y_i))
          + Σ_{i=1}^{N} Σ_{k=1, k≠i}^{N} R_{s_k s_i}(x′ + (x_i − x_k), y′ + (y_i − y_k))
          + Σ_{i=1}^{N} R_{s_i r}(x′ + (x_r − x_i), y′ + (y_r − y_i))   [20]

The second lines of eqns [19] and [20] are the cross-correlation terms between the targets. In order to avoid confusion between reference–target correlations and target–target correlations, special care must be taken in positioning the reference with respect to the scene. All the target–target correlations will be inside a centered box that has twice the length and twice the width of the input scene, so the reference should be placed at a distance of at least twice the length or width of the scene.

Multiple targets in the input scene can be viewed as multiple references, and the correlations of multiple references with multiple targets are all computed simultaneously.
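The three JTC steps can be simulated in one dimension. The sketch below assumes point-like targets at hypothetical positions x_r = 10 and x_s = 40 on a 64-sample input line; transforming the joint power spectrum reveals the cross-correlation peaks at ±(x_s − x_r), alongside the autocorrelation terms at the origin:

```python
import cmath

def dft(x):
    # Naive discrete Fourier transform, standing in for the optical transform.
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

N = 64
plane = [0.0] * N
plane[10] = 1.0   # reference r placed at x_r = 10
plane[40] = 1.0   # scene target s placed at x_s = 40

# Step 2: the joint power spectrum E = |FT(input plane)|^2.
E = [abs(v) ** 2 for v in dft(plane)]

# Step 3: transforming E reveals the correlation plane.
C = [abs(v) for v in dft(E)]

# Autocorrelation terms pile up at the origin ...
assert C[0] == max(C)
# ... and the reference-scene cross-correlation peaks sit at +/-(x_s - x_r) = +/-30.
side = max(range(1, N // 2 + 1), key=lambda n: C[n])
assert side == 30
```

Because the targets here are single impulses, the correlation peaks are delta-like; with extended targets each peak would be the corresponding R_ab term of eqn [16].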
This property has led to applications in multiclass pattern recognition problems and in optical neural networks, where each neuron of a layer is coded as a reference object. This is one of the advantages of the JTC compared to serial correlators, since serial correlators are limited to one filter at a time.

Example

Here is an example of joint transform correlation. The input plane is shown in Figure 8a. The plane is composed of the input scene in the top third of the input plane, and the reference is placed in the center. This input scene contains two targets. The joint power spectrum is shown in Figure 8b with its pixels mapped to a logarithmic intensity scale. The correlation plane is shown in Figure 8c and d with the central peak truncated. The central peak contains the autocorrelation terms R_ss and R_rr. On each side of the autocorrelation peak are the cross-correlations between the different targets of the input scene. The correlation between the reference image and the input scene is located in the top part of the correlation plane. The correlation plane is symmetrical with respect to the origin, so the correlation peaks are reproduced on the bottom half of the output plane.

Joint transform correlator architectures

Different architectures have been used for JTCs. Three setups are presented here; this is not an exhaustive list. The simplest optical setup, shown in Figure 9, is an optoelectronic implementation of a power spectrum machine. A convergent beam optical Fourier transform system is used to produce the Fourier transform of an image displayed on a single spatial light modulator (SLM).

Figure 8 Joint transform correlation example. (a) Input scene; (b) joint power spectrum; (c) correlation plane (top view); (d) correlation plane (mesh view).
A CCD camera is placed in the Fourier plane to measure the joint power spectrum. The joint transform correlation is a two-step process: the input plane containing the scene and the reference is first displayed on the SLM, and the camera simultaneously captures the joint power spectrum. In the second step, the joint power spectrum is displayed on the SLM, so the correlation plane is now on the camera. An optional second camera with intensity filters can be added to this setup. In this case, one camera serves to digitize the joint power spectrum during step 1, and the second camera captures the correlation plane at step 2 and transmits it to a display device. This optional configuration is useful to keep both the correlation plane and the joint power spectrum within the dynamic range of their specific cameras.

The setup shown in Figure 9 has two main disadvantages: the correlation is performed in two steps and requires some processing to construct the input plane, and only half of the spatial light modulator can be used to display the input scene. There are alternative setups that alleviate those drawbacks, but they require the use of two or three SLMs. Two modulators can be used in the input plane to simulate a full modulator for the scene, as shown in Figure 10. The original setup can also be doubled, with one optical Fourier transform system for the joint power spectrum and one for the correlation plane (Figure 11). With three modulators, it is possible to perform the joint transform correlation in a single step with a full modulator for the input scene.

Comparison to serial transform correlators

The joint transform correlator has some advantages over the serial transform correlator. Since the phase of the Fourier transform is destroyed in the JTC, there are far fewer constraints on the design, so it is advantageous to use an optical convergent beam setup to obtain the Fourier transform. As mentioned before, there is no need for a fully complex spatial light modulator in the joint transform correlator.
g(x; b) = ∫_{−∞}^{∞} f(μ; b)h(x + μ; b) dμ   [24]

g(0; b) = ∫_{−∞}^{∞} f(μ; b)h(μ; b) dμ   [25]

Conclusion

It has been estimated that serial coherent optical correlators should be able to attain data rates of up to 10^12 operations per second, using the fastest spatial light modulators available.
INFORMATION PROCESSING / Free-Space Optical Computing 247
Joint transform correlators are somewhat slower, due to the input plane being divided between the object and the reference, to geometrical constraints in the placement of targets, and to the need to use electronics to carry out some of the operations. Until now, speeds have been limited by the data rates at the input, but as the speeds of optically addressed SLMs increase, the problem could be shifted to the output, where the lack of optical high spatial bandwidth threshold devices could cause a bottleneck.

List of Units and Nomenclature

A         Vector or matrix
CCD       Charge coupled device
F(μ, ν)   Fourier transform of f(x, y)
FT        Fourier transform
FT^{−1}   Inverse Fourier transform
H*        Complex conjugate of H
JTC       Joint-transform correlator
POF       Phase-only filter
R_ab      Correlation between functions a and b
(x, y)    Spatial coordinates
λ         Wavelength of light
(μ, ν)    Spatial frequencies
⋆         Correlation
SLM       Spatial light modulator
SONG      Sliced orthogonal nonlinear generalized correlation

See also

Coherence: Coherence and Imaging. Diffraction: Fraunhofer Diffraction; Fresnel Diffraction. Fourier Optics. Geometrical Optics: Aberrations. Holography, Techniques: Computer-Generated Holograms. Imaging: Information Theory in Imaging. Information Processing: Incoherent Analog Optical Processors; Optical Neural Networks.

Further Reading

Arsenault HH, Szoplik T and Macukow B (eds) (1989) Optical Processing and Computing. New York: Academic Press.
Bartelt H (1987) Unconventional correlators. In: Horner JL (ed.) Optical Signal Processing, pp. 97–128. New York: Academic Press.
Garcia-Martinez P, Arsenault HH and Ferreira C (2001) Time sequential nonlinear correlations for optical pattern recognition. Optoelectronic Information Processing: Optics for Information Systems. Proceedings of SPIE CR81-12: 186–207.
Goodman JW (1988) Introduction to Fourier Optics. New York: McGraw-Hill.
Javidi B (1994) Nonlinear joint transform correlators. In: Javidi B and Réfrégier P (eds) Workshop on Optical Pattern Recognition, PM-12. Bellingham, WA: SPIE – The International Society for Optical Engineering.
Javidi B and Horner JL (eds) (1994) Real-Time Optical Information Processing. New York: Academic Press.
Javidi B and Johnson KM (eds) (1996) Optoelectronic Devices and Systems for Processing, CR65. Bellingham, WA: SPIE – The International Society for Optical Engineering.
Lu XL, Yu FTS and Gregory D (1990) Comparison of Vander Lugt and joint transform correlators. Applied Physics B 51: 153–164.
McAulay AD (1991) Optical Computer Architectures. New York: John Wiley & Sons.
Preston K (1972) Coherent Optical Computers. New York: McGraw-Hill.
Vander Lugt AB (1992) Optical Signal Processing. New York: John Wiley & Sons.
Parallel computation is a direct approach to enhancing the speed of digital computations that are otherwise inhibited by the sequential processing bottleneck. Physical limitations – such as limited interconnections, interactions amongst electrons, with the consequent inability to share a data bus simultaneously, and the constant propagation delay of electronic circuits – place an upper bound on the speed of the fastest operation possible with an electronic computer, without a modification to the computing architecture. Optics provides an architectural advantage over electronics, because the most natural way to use optics lends itself to a parallel implementation. Optical signals propagate in parallel, cross each other without interference (whereas charged electrons interact heavily with each other), have the lowest possible propagation delay of all signals, and can provide million-channel free-space interconnections with simple lenses (whereas an unrealistic number of connections would be necessary for electrons). Thus, optics is an excellent candidate for implementing parallel computing machines using 3D space connectivity. While electronic architecture is naturally 2D, optical free space architecture is 3D, exploiting the third dimension of space. The communication bottleneck forces electronic computers to update the computation state space sequentially, whereas the use of optics empowers one to change the entire computation space in parallel.

Historical Perspective

Historically, digital optical computing was proposed as a natural extension of nonlinear optics. Just as transistors, which are nonlinear electronic devices, led to the realization of computing logic, it was envisioned that nonlinear optical effects would give rise to optical transistors, which would lead to the development of optical logic gates. One of the limitations of nonlinear optical device-based computation is the high laser power required to demonstrate any nonlinear optical effect.

One of the effects used to demonstrate nonlinearity is the change of the refractive index with the incident intensity. In other words, most materials behave linearly at low optical energy. The refractive index expressed as n = n_0 + n_2 I is nominally linear because the nonlinear coefficient n_2 is usually very small. In the presence of high power, the n_2 I term is significant enough to cause an optical path change by altering the refractive index. When such a medium is placed inside a Fabry–Perot resonator, it allows one to control the transmission of the filter by changing the incident intensity. Thus, at high power, it shows a nonlinear behavior or bistability. The term optical bistability was coined, meaning the presence of two stable states of output for the same intensity value.

Another development took place at the University of Michigan: a precision optical processor that could process radar signals using optics. This is an example of an analog optical computing processor.

A third area, in between the digital and the analog, is that of discrete level devices, which most often take the form of matrix-based processors. Free space optical matrix-vector multipliers are examples of such processors.

The latest field of optical computing research is in the algorithms and architecture area. The motivation here is to develop new algorithms which take advantage of optics. What is lacking is the presence of appropriate device technology that can implement the algorithm in a given architecture. In terms of practical realization, analog optical computing is the most successful. Adaptive optics processors have found success in astronomy as well as in vision science, and more recently in optical imaging through the atmosphere.

Development of free space optical computing has two main driving forces: the technology (device) base and the algorithm base. In this classification, since the field is in its infancy, both device-based classes and architectural (algorithm) classes are listed side-by-side as separate entities, ignoring the natural overlap that may be present in some instances. For example, nonlinear optical devices could be implemented in a variety of the architectures listed below. The holographic processor is another technology-based class, while some other architectures, such as symbolic substitution, may involve the use of holograms without that excluding them from being classes by themselves. The following list gives a classification of optical free space computing:

1. Digital optical computing:
   (a) Pattern coded computing or cellular logic:
       (i) Optical shadow-casting processor;
       (ii) Symbolic substitution;
       (iii) Theta logic based computation.
   (b) Parallel arithmetic circuits;
   (c) Hybrid optical computing;
   (d) Nonlinear optical device based computing;
   (e) Threshold logic.
2. Discrete optical computing:
   (a) Optical fuzzy logic;
   (b) Optical matrix processors:
       (i) Vector matrix multiplier;
       (ii) Differential equation solver;
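The point made in the Historical Perspective — that n = n_0 + n_2 I is nominally linear because n_2 is tiny — can be made concrete with a small sketch. All numbers below are hypothetical, chosen only for illustration (they are not from the text):

```python
import math

def refractive_index(n0, n2, intensity):
    # n = n0 + n2 * I: nominally linear because n2 is tiny.
    return n0 + n2 * intensity

def extra_phase(n2, intensity, length, wavelength):
    # Additional optical phase accumulated over a length L of the medium
    # due to the intensity-dependent part of the index alone.
    return 2.0 * math.pi * n2 * intensity * length / wavelength

n0, n2 = 1.5, 1e-16          # hypothetical Kerr coefficient, (intensity)^-1 units
low, high = 1.0, 1e15        # arbitrary intensity units

# Low power: the index is indistinguishable from n0 (linear regime).
assert abs(refractive_index(n0, n2, low) - n0) < 1e-9
# High power: the n2*I term produces an appreciable optical path change.
assert refractive_index(n0, n2, high) > n0 + 0.05
# The nonlinear phase scales linearly with intensity.
p1 = extra_phase(n2, high, length=1e-2, wavelength=1e-6)
p2 = extra_phase(n2, 2 * high, length=1e-2, wavelength=1e-6)
assert abs(p2 - 2 * p1) < 1e-6 * p1
```

Inside a Fabry–Perot resonator this intensity-dependent phase is what shifts the cavity on or off resonance, producing the bistable transmission discussed above.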
Although the most successful commercial application of MEMs is its use in consumer electronics, such as

Figure 1 Block diagram of an optical logic gate using a nonlinear optical element.
[Figure: input–output characteristic of the nonlinear element, with the output switching between Off and On states as the input passes the points marked A, B and C.]
A_2 = tA_1/(1 − r²e^{2ikl−αl})   [2]
Figure 4 The input– output relationship of absorptive bistability predicted by eqns [5] and [6].
straight lines represent the left-hand side with increasing values of I_1. It can be readily seen from Figure 5 that, at sufficiently high values of I_1, multiple solutions are possible for eqn [10]. This again gives rise to the bistable behavior.

Nonlinear Interference Filters

The optical bistability of nonlinear interference filters (or etalons) can be used to achieve optical logic gates. The transmission property of these etalons can be modified by another optical beam, known as a bias beam. A typical logic gate and its transmission property are shown in Figure 6. If the bias beam is such that P_switch − (P_a + P_bias) < P_b, then the transmitted beam shown in Figure 6b readily follows an AND gate. The truth table for such a configuration is shown in Figure 6d.

Four-Wave Mixing

Four-wave mixing (FWM) can be used to implement various computing functions. Using the photorefractive effect and FWM in a third-order nonlinear optical material, optical phase conjugate beams can be obtained. A schematic of FWM and phase conjugation is shown in Figure 7. The forward pump beam interferes with the probe beam to produce a grating in the photorefractive material. The backward pump beam is then diffracted to produce a beam which retraces the path of the probe beam but has an opposite phase. It can be seen from Figure 7 that optical phase conjugation can be readily used as an AND logical operator. By using four-wave mixing, an optical counterpart of the transistor has been introduced. FWM has also been used in AND-based optical symbolic substitution (OSS) operations. FWM is also used to produce matched filter-based optical correlators.

Another device showing optical bistability is the self electro-optic effect device (SEED). SEEDs were used to implement optical logic circuits. SEEDs are produced by putting multiple quantum wells (MQW) in the intrinsic region of a p-i-n structure. The MQWs are created by placing alternating thin layers of high (barriers) and low (wells) bandgap materials. An electric field is applied to the SEEDs externally. The absorption coefficient is a function of the applied bias voltage. When a light beam is incident on these devices, it creates a photocurrent, consequently lowering the bias voltage across the MQW. This, in turn, increases the absorption, and when the input optical power is high enough, the system sees a peak absorption and suddenly switches to a lower state, thus showing the bistable behavior. A schematic of a SEED is shown in Figure 8.
A B | Transmitted output
0 0 | 0
0 1 | 0
1 0 | 0
1 1 | 1

Figure 6 A typical optical logic gate configuration using a nonlinear interference filter: (a) schematic, with inputs P_a, P_b and P_bias incident on the filter and transmitted (P_T) and reflected (P_R) outputs; (b) the transmitted power output; (c) reflected power output (both as functions of the input P_a + P_b); and (d) the truth table for AND operation with beams A and B as inputs.
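The switching condition quoted above can be modeled with an idealized threshold: the etalon transmits only when the total incident power reaches P_switch. With a bias chosen so that P_switch − (P_a + P_bias) < P_b holds only when both logic beams are on, the gate computes AND. All power values below are hypothetical:

```python
def transmitted(p_a, p_b, p_bias, p_switch):
    # Idealized nonlinear interference filter: the etalon switches to its
    # transmitting state only when the total incident power reaches P_switch.
    return (p_a + p_b + p_bias) >= p_switch

# Hypothetical powers: one logic beam plus the bias (1.5) stays below the
# switching power (2.4), but two logic beams plus the bias (2.5) exceed it,
# i.e. P_switch - (P_a + P_bias) = 0.9 < P_b = 1.0.
P, P_bias, P_switch = 1.0, 0.5, 2.4
truth = [(a, b, transmitted(a * P, b * P, P_bias, P_switch))
         for a in (0, 1) for b in (0, 1)]
assert truth == [(0, 0, False), (0, 1, False), (1, 0, False), (1, 1, True)]
```

Lowering the bias so that even two beams stay below threshold would turn the gate off entirely; raising it so that a single beam switches the etalon would instead give OR — the bias beam programs the gate.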
is integrated horizontally using a cylindrical lens. This leads to an addition operation on the partial products, resulting in the generation of the output product vector. Mathematically:

    [ a_11 a_12 a_13 a_14 ] [ b_1 ]   [ a_11 b_1 + a_12 b_2 + a_13 b_3 + a_14 b_4 ]
y = [ a_21 a_22 a_23 a_24 ] [ b_2 ] = [ a_21 b_1 + a_22 b_2 + a_23 b_3 + a_24 b_4 ]
    [ a_31 a_32 a_33 a_34 ] [ b_3 ]   [ a_31 b_1 + a_32 b_2 + a_33 b_3 + a_34 b_4 ]
    [ a_41 a_42 a_43 a_44 ] [ b_4 ]   [ a_41 b_1 + a_42 b_2 + a_43 b_3 + a_44 b_4 ]   [12]

The above can be achieved if the vector b is expanded as

[ b_1 b_2 b_3 b_4 ]
[ b_1 b_2 b_3 b_4 ]
[ b_1 b_2 b_3 b_4 ]
[ b_1 b_2 b_3 b_4 ]

and a point-by-point multiplication is performed between the expanded vector and the original matrix. Then the addition is performed by the cylindrical lenses.
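The expand–multiply–integrate scheme just described maps directly onto nested lists. A minimal sketch with a hypothetical 4 × 4 matrix:

```python
A = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12],
     [13, 14, 15, 16]]
b = [1, 0, 2, 1]

# Optical scheme: replicate b across every row (the expanded vector), take a
# point-by-point product with the matrix mask, then let the cylindrical lens
# integrate each row into a single detector value.
expanded = [b for _ in A]                       # each row sees the full vector
pointwise = [[a_ij * b_j for a_ij, b_j in zip(row, b_row)]
             for row, b_row in zip(A, expanded)]
y = [sum(row) for row in pointwise]             # horizontal integration

# Agrees with the direct matrix-vector product of eqn [12].
assert y == [sum(A[i][j] * b[j] for j in range(4)) for i in range(4)]
assert y == [11, 27, 43, 59]
```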
Another simple application of the vector matrix multiplier is in digital logic, in the programmable logic array (PLA). A PLA is an electronic device that consists of a set of AND gates followed by a set of OR gates, which can be organized in AND–OR format suitable for realizing any arbitrary general purpose logic operation. AND logic is performed when light representing a logic level is passed through a transparency representing the other variable; OR is achieved by converging light onto a common detector.

Assume a function

F = A·C + Ā·B·D + A·B̄·C   [13]

where Ā represents the logical inverse of A. Using DeMorgan's theorem of digital logic, this can be expressed as

F = ȳ_1 + ȳ_2 + ȳ_3, where y_1 = Ā + C̄, y_2 = A + B̄ + D̄, y_3 = Ā + B + C̄   [14]

Now the y_i s can be generated by the vector matrix product, as shown below:

[ y_1 ]   [ 0 1 0 0 0 1 0 0 ] [ A  ]
[ y_2 ] = [ 1 0 0 1 0 0 0 1 ] [ Ā  ]
[ y_3 ]   [ 0 1 1 0 0 1 0 0 ] [ B  ]
                              [ B̄  ]
                              [ C  ]
                              [ C̄  ]
                              [ D  ]
                              [ D̄  ]   [15]

Note here that the AND has been converted to OR, which needs to be inverted and then ORed to generate the function F. The setup shown in Figure 11 can be used to accomplish this. In step 1, the y_i s are generated, and then in step 2 they are inverted and summed by a second pass through the same system with a different mask. Other applications of vector–matrix multipliers are in neural networks, where the interconnection weights are represented by the matrix and the inputs are the vector.
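Eqns [13]–[15] can be checked in software. The sketch below assumes the literal pattern implied by the selection matrix of eqn [15] (the overbars are hard to read in the printed equations): the input is encoded as the vector (A, Ā, B, B̄, C, C̄, D, D̄), each row of the mask ORs its selected literals, and the De Morgan identity is verified over all sixteen input combinations:

```python
M = [[0, 1, 0, 0, 0, 1, 0, 0],   # y1 = Abar + Cbar
     [1, 0, 0, 1, 0, 0, 0, 1],   # y2 = A + Bbar + Dbar
     [0, 1, 1, 0, 0, 1, 0, 0]]   # y3 = Abar + B + Cbar

def pla(Av, Bv, Cv, Dv):
    # Input vector (A, Abar, B, Bbar, C, Cbar, D, Dbar); each mask row picks
    # the literals of one sum term, and light summing on a detector is the OR.
    v = [Av, 1 - Av, Bv, 1 - Bv, Cv, 1 - Cv, Dv, 1 - Dv]
    y = [min(1, sum(m * x for m, x in zip(row, v))) for row in M]
    # Second pass: invert each y_i, then OR them to obtain F (eqn [14]).
    return min(1, sum(1 - yi for yi in y))

# De Morgan check against the product-of-literals form of eqn [13].
for A in (0, 1):
    for B in (0, 1):
        for C in (0, 1):
            for D in (0, 1):
                direct = (A & C) | ((1 - A) & B & D) | (A & (1 - B) & C)
                assert pla(A, B, C, D) == direct
```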
Analog Optical Computing

Correlation is a way to detect the presence of a signal in additive noise. Optics offers a fast way of performing correlations. The practical motivation of this area comes from the fact that a lens performs a Fourier transform of a two-dimensional signal at its focal plane. The most famous analog optical computing technique is known as the optical matched filter. The theoretical basis of the optical matched filter is the Fourier theorem of matched filtering, which states that the cross correlation between two signals can be formed by multiplying the Fourier transform of the first signal by the complex conjugate of the Fourier transform of the second signal, followed by a subsequent Fourier transform operation. Mathematically:

Correlation(f, g) = F^{−1}[F{f} · conjugate(F{g})]   [16]

where F represents a Fourier transform operation and F^{−1} represents an inverse transform. Since the lens can easily perform the Fourier transform, and multiplication can be performed by light passing through a medium, the only problem that has to be
function in general. It can be written as

Notice that

OTF(0) = 1   [16]

due to normalization, and

OTF(a) = OTF(−a)   [17]

Because the OTF is an autocorrelation function, it must be a symmetric function. Substitution of eqns [16] and [17] into eqn [15] yields

Figure 1 4f coherent optical processor consists of two Fourier transform lenses.
INFORMATION PROCESSING / Incoherent Analog Optical Processors 259
Thus, the Fourier transform of the object can be visualized in the frequency plane. For example, if the object is a grating, we will see its frequency components in the frequency plane. However, if the coherent illumination (e.g., from a laser) is replaced with incoherent illumination (e.g., from a lightbulb), the pattern of the Fourier transform in the frequency plane disappears. Does this mean that we can no longer perform spatial filtering? No, we can still perform spatial filtering, although the pattern of the Fourier transform is no longer visualized.

Consider that we have a spatial filter in the frequency plane whose physical form can be expressed by the function H(u, v). For illustration, this spatial filter is simply a small one-dimensional window H(u), as shown in Figure 2a. If the 4f optical processor is illuminated with coherent light, the CTF of the system is H(u) itself, as shown in Figure 2b. However, if the 4f optical processor is illuminated with incoherent light, we use the OTF instead of the CTF in the analysis of the linear system. The OTF is the autocorrelation of the CTF, which is graphically depicted in Figure 2c. In this example, since the OTF is real and positive, the MTF is identical to the OTF.

For simplicity, we analyze and show a one-dimensional filter only, in a two-dimensional graph. It certainly can be extended to the two-dimensional filter, and we would need a three-dimensional graph to show the two-dimensional autocorrelation.

It is important to note that, although the CTF is a high pass filter that passes frequencies in the band from u = a to u = a + b, the MTF is a low pass filter that passes frequencies from u = −b to u = b, independent of a. This indicates that an incoherent processor cannot enhance high frequencies; it is always a low pass processor. It is interesting to note that the OTF is independent of the location u = a of the filter, but is determined only by the width Δu = b of the filter.

[Figure 2: (a) the filter window H(u), occupying the band from u = a to u = a + b; (b) the coherent transfer function CTF(u); (c) the resulting MTF(u).]
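The claim that the incoherent passband depends only on the width b, and not on the location a, of the CTF window can be verified with a discrete autocorrelation (pure Python, hypothetical band parameters):

```python
def otf_from_ctf(a, b, width=64):
    # One-dimensional CTF: a unit passband from u = a to u = a + b, sampled
    # on integer frequencies; the OTF is its normalized autocorrelation.
    ctf = [1.0 if a <= u < a + b else 0.0 for u in range(-width, width)]
    n = len(ctf)
    energy = sum(c * c for c in ctf)
    def otf(shift):
        acc = 0.0
        for i in range(n):
            j = i + shift
            if 0 <= j < n:
                acc += ctf[i] * ctf[j]
        return acc / energy
    return otf

b = 8
otf1, otf2 = otf_from_ctf(a=10, b=b), otf_from_ctf(a=30, b=b)

assert otf1(0) == 1.0                       # eqn [16]: normalization
assert otf1(5) == otf1(-5)                  # eqn [17]: symmetry
assert otf1(b) == 0.0 and otf1(-b) == 0.0   # low-pass: support is |u| < b
# The OTF depends on the passband width b, not on its location a.
assert all(otf1(s) == otf2(s) for s in range(-b, b + 1))
```

For this rectangular window the OTF is the familiar triangle (8 − |u|)/8: a pure low-pass response, however far out the coherent passband is moved.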
Accordingly, an aperture made up of randomly
distributed pinholes behaves as a low pass filter in an
u incoherent imaging system. The cut-off frequency of
such a filter can be derived from the diameter of the
a a+b pinhole, which is equivalent to Figure 2b. The
(a)
procedure for using this filter is simple. If a camera
is used to take a picture, the random pinhole filter can
CTF(u) be simply attached to the camera lens. The picture
taken will not consist of high-frequency components,
because the filter acts like a low pass filter. The low
pass filtering can also be done by reducing the
aperture of camera. However, by doing so, the light
entering the camera is also reduced. The use of a
random pinhole filter removes high frequencies
u without reducing the light intensity. When the
random pinhole filter is placed in the frequency
a a+b
(b) plane of a 4f coherent optical processor, it can be used
to reduce the speckle noise. Note that the speckle
noise is the main drawback in coherent optical
MTF(u) processing.
A Computed Tomography
A
The computed tomography or CT using X-ray is
usually considered beyond optical information pro-
Single hole cessing. However, it is interesting to review the
Figure 3 Removal of pixel structure using a single hole in principle of the X-ray CT, since the X-ray source is
frequency plane. Most of the energy is wasted and the output incoherent, and the CT image reconstruction utilizes
intensity is dim. Fourier transformation that is commonly used in
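The statement above that the OTF is the autocorrelation of the CTF, so that a bandpass window filter yields a low pass MTF, can be checked with a small numerical sketch (using NumPy; the grid spacing and the band edges a and b are arbitrary illustrative choices):

```python
import numpy as np

# One-dimensional window filter H(u) passing only the band a <= u <= a + b
# (the CTF of Figure 2a,b), sampled on a frequency grid.
du = 0.01
u = np.arange(-2.0, 2.0, du)
a, b = 0.5, 0.4
ctf = ((u >= a) & (u <= a + b)).astype(float)

# The OTF is the autocorrelation of the CTF; since it is real and positive
# here, the MTF is identical to the OTF.
mtf = np.correlate(ctf, ctf, mode="full") * du
freq = (np.arange(mtf.size) - (ctf.size - 1)) * du  # frequency axis of the MTF

support = freq[mtf > 1e-12]
print(support.min(), support.max())  # approximately -b and +b, independent of a
```

Changing a moves the passband of the CTF but leaves the support of the MTF at approximately [−b, b], in agreement with the discussion above.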
INFORMATION PROCESSING / Incoherent Analog Optical Processors 261
Figure 5 (a) Projected image consists of pixel structure when no phase filter is applied in the frequency plane. (b) Depixelated image is
produced when phase filters are applied in the frequency plane.
\int f(x, y)\,\mathrm{d}s = -\ln\frac{I}{I_0}    [28]

p(x) = \int_{-\infty}^{\infty} f(x, y)\,\mathrm{d}y    [32]
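Equations [28] and [32] can be illustrated with a short numerical sketch (NumPy; the test object, grid spacing, and attenuation values are arbitrary): summing the object along y discretizes the parallel-beam projection of eqn [32], and each detector reading I is converted back to a line integral through −ln(I/I₀) as in eqn [28].

```python
import numpy as np

# Discretized object f(x, y): attenuation coefficients on a grid.
dy = 0.1
f = np.zeros((64, 64))
f[20:40, 24:44] = 0.05  # a uniform absorbing block (arbitrary test object)

# Eqn [32]: the parallel-beam projection p(x) = integral of f(x, y) dy.
p = f.sum(axis=1) * dy

# Eqn [28]: each detector reading I relates to the line integral through
# Beer's law, so integral f ds = -ln(I / I0).
I0 = 1.0
I = I0 * np.exp(-p)
p_recovered = -np.log(I / I0)
print(np.allclose(p, p_recovered))  # True
```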
Figure 8 Rotated X-ray parallel beam produces the rotated projection function p(x′).

Figure 9 F(u, v) computed from a set of rotated projection functions.

See also

Spectroscopy: Fourier Transform Spectroscopy.

Further Reading

Bracewell RN (1965) The Fourier Transform and its Applications. New York: McGraw-Hill.
Hounsfield GN (1973) Computerized transverse axial scanning (tomography): Part I. British Journal of Radiology 46: 1016–1022.
Iizuka K (1985) Engineering Optics. New York: Springer-Verlag.
Jutamulia S, Toyoda S and Ichihashi Y (1995) Removal of pixel structure in liquid crystal projection display. Proceedings of SPIE 2407: 168–176.
Klein MV (1970) Optics. New York: Wiley.
Lohmann AW and Werlich HW (1971) Incoherent matched filtering with Fourier holograms. Applied Optics 10: 670–672.
Reynolds GO, DeVelis JB, Parrent GB Jr and Thompson BJ (1989) The New Physical Optics Notebook: Tutorials in Fourier Optics. Bellingham, WA: SPIE.
Vander Lugt A (1964) Signal detection by complex spatial filtering. IEEE Transactions on Information Theory IT-10: 139–145.
Yang X, Jutamulia S and Li N (1996) Liquid-crystal projection image depixelization by spatial phase scrambling. Applied Optics 35: 4577–4580.
Yu FTS and Wang EY (1973) Speckle reduction in holography by means of random spatial sampling. Applied Optics 12: 1656–1659.

INFORMATION PROCESSING / Optical Bit-Serial Computing 263
Figure 3 XPM-SOA in Sagnac interferometer for optical manipulation of bits at high data rates. (The figure shows XGM-SOA stages with a bandpass filter at λ3, a semiconductor optical amplifier (SOA), a control signal input, a 10/90 directional coupler, logic levels at ports A–F with wavelengths λ1–λ3, and optical bit-serial data input and output.)
(Figure: coupled-cavity structure with Input 1/Output 1 and Input 2/Output 2, shown in two configurations, (a) and (b).)
Heuring VP, Jordan HF and Pratt JP (1992) A bit serial architecture for optical computing. Applied Optics 31(17): 3213–3224.
Hunsperger RG (2002) Integrated Optics, 5th edn. New York: Springer.
Joannopoulos JD, Meade RD and Winn JN (1995) Photonic Crystals. Princeton, NJ: Princeton University Press.
McAulay AD (1991) Optical Computer Architectures. New York: Wiley.
McAulay AD (2004) Novel all-optical flip-flop using semiconductor optical amplifiers in innovating frequency-shifting inverse-threshold pairs. Optical Engineering 43(5): 1115–1120.
Scheuer J and Yariv A (2003) Two-dimensional optical ring resonators based on radial Bragg resonance. Optics Letters 28(17): 1528–1530.
Xu L, Glesk I, Baby V and Prucnal PR (2004) All-optical wavelength conversion using SOA at nearly symmetric position in a fiber-based Sagnac interferometric loop. IEEE Photonics Technology Letters 16(2): 539–541.
Yariv A (1997) Optical Electronics in Modern Communications, 5th edn. New York: Oxford University Press.
electronic continuous-tone, gray-scale image to one containing only binary-valued pixels, for the purpose of displaying, printing, or storing the image. The underlying concept is to provide the viewer of the image with the illusion of viewing a continuous-tone image when, in fact, only black and white pixel values are used in the rendering. This process is particularly important in applications such as laser printing, bilevel displays, xerography and, more recently, facsimile.

There are a number of different methods by which this digital image halftoning can be accomplished, which are generally classified as either point or neighborhood processes. A point process is one which computes the output pixel value based strictly on some characteristic of the corresponding input pixel. In contrast, a neighborhood process computes a single output pixel based on a number of pixels in a neighborhood or region of the input image. Ordered dither, which is considered a point process, produces an output by comparing a single continuous-tone input value against a deterministic periodic array of threshold values. If the value of the input pixel under consideration is greater than the corresponding threshold value, the corresponding output pixel is rendered white. If the intensity is less than the threshold, it is rendered black. Dispersed-dot ordered dither is a subset of ordered dither in which the halftone dots are of a fixed size, while clustered-dot ordered dither uses variable-sized dots and simulates the variable-sized dots of a printer's halftone screen in the rendering. Among the advantages of point-process halftoning in general, and ordered dither halftoning specifically, are simplicity and speed of implementation. The primary disadvantage is that ordered dither produces patterns in the halftoned image which are visually undesirable.

In contrast, halftoning using the error diffusion algorithm, first introduced by Floyd and Steinberg in 1975, employs neighborhood operations and is currently the most popular neighborhood process. In this approach to halftoning, the output pixel value is determined not solely by the value of the input pixel and some deterministic pattern, but instead by a weighted average of values from pixels in a neighborhood surrounding the specific pixel being computed. Here, the error of the quantization process is computed and spatially distributed, or diffused, within a local neighborhood in order to influence future pixel quantization decisions within that neighborhood and thereby improve the overall quality of the halftoned image. In classical unidirectional error diffusion, the image is processed sequentially, proceeding from the upper-left to the lower-right. Starting in the corner of the image, the first pixel is thresholded and the quantizer error is calculated. The error is then diffused to neighboring pixels that have not yet been processed, according to the scaling and interconnect defined by the error diffusion filter. The remainder of the pixels in the image are subsequently processed, with each quantizer input now being the algebraic sum of the corresponding original input pixel intensity and the weighted error from previously processed pixels. While error diffusion produces halftone images of superior quality, the unidirectional processing of the algorithm continues to introduce visual artifacts that are directly attributable to the algorithm itself.

In an effort to overcome the implementation constraints of serial processing and improve overall halftone image quality, a number of different parallel architectures have been investigated, including several based on neural network algorithms. While neural network approaches can provide distinct advantages in improved halftone image quality, they also present challenges to hardware implementations, including speed of convergence and the physical interconnect requirements. These challenges, in addition to the natural compatibility with imaging media, provide the motivation for developing optical image processing architectures for these types of applications.

The Error Diffusion Algorithm

Figure 1 Block diagram of the recursive error diffusion architecture.

Figure 1 shows a block diagram of the error diffusion architecture used in digital image halftoning. The input image is represented as x_{m,n}; w_{m,n} is the impulse response of a 2D causal, unity gain error diffusion filter; and the output quantized image is described by y_{m,n} ∈ {−1, 1}. The quantizer q[u_{m,n}] provides the one-bit thresholding functionality necessary to convert each analog input pixel to a low-resolution digital output pixel. The quantizer error ε_{m,n} is computed as the difference between the output and input of the quantizer and is distributed to adjacent pixels according to the weighting and interconnection specified by the error diffusion filter. The unity gain constraint ensures
268 INFORMATION PROCESSING / Optical Digital Image Processing
that no amplification or attenuation of the error signal occurs during the error diffusion process, and preserves the average intensity of the image.

In this architecture, the error associated with the quantizer decision at spatial coordinates (m, n) is diffused within a local 2D region to influence adjacent quantization decisions. All of the state variables in this architecture are scalar values, since this architecture represents a recursive algorithm in which each pixel in the image is processed sequentially. In the general case, where w_{m,n} is assumed to be a 2D finite impulse response (FIR) filter with coefficients that span a region of support defined by R_{m,n}, the quantizer input state u_{m,n} can be written as

u_{m,n} = x_{m,n} − Σ_{(i,j) ∈ R_{m,n}} w_{i,j} ε_{m−i,n−j}    [1]

Here, w_{i,j} = 0 for i = j = 0, since the error signal is diffused spatially within the region of support R_{m,n} and does not influence the pixel under consideration. In this unidirectional error diffusion, the 2D error diffusion filter is therefore causal. Figure 2 shows the impulse responses of several popular error diffusion filters that have been used in rectangular grid architectures. Figure 2a shows the original error diffusion filter weights developed by Floyd and Steinberg. They argued that the four weights were the smallest number of weights that would produce good halftone images. Figure 2b is a filter with a larger region of support, developed by Jarvis, Judice, and Ninke in 1976. Finally, the filter in Figure 2c, developed by Stucki, contains the same region of support with coefficients that are multiples of 2 for digital compatibility and computational efficiency. Here, '†' represents the pixel being processed and the weights describe the local region over which the error is diffused. The normalization factors ensure that the filter coefficients sum to one and therefore meet the unity gain criterion.

To understand the effect of error diffusion and its impact on the quantizer error, it is instructive to describe the effect of error diffusion mathematically. Consider the relationship between the quantizer error ε_{m,n} ≡ y_{m,n} − u_{m,n} and the overall quantization error ε̂_{m,n} ≡ y_{m,n} − x_{m,n} in the frequency domain. If we assume that the error is uncorrelated with the input and has statistical properties consistent with a white process, z-transform techniques can be applied to show that the feedback architecture provides the following noise-shaping characteristic:

H_ns(z_1, z_2) ≡ Ê(z_1, z_2)/E(z_1, z_2) = 1 − W(z_1, z_2)    [2]

Here, Ê(z_1, z_2), E(z_1, z_2), and W(z_1, z_2) represent the z-transforms of ε̂_{m,n}, ε_{m,n}, and w_{m,n}, respectively. Equation [2] shows that the noise-shaping characteristic of the error diffusion architecture is directly related to the spectral characteristics of W(z_1, z_2). Appropriate selection of W(z_1, z_2) can spectrally shape the quantizer noise in a way that minimizes the effect of the low-resolution quantization process on the overall halftoning process. An error diffusion filter with low-pass spectral characteristics produces an overall noise-shaping characteristic 1 − W(z_1, z_2) which has the desired high-pass frequency characteristics. This noise-shaping function suppresses the quantization noise within those frequencies occupied by the image and spectrally shapes the noise to high frequencies, which are less objectionable to the human visual system. Circular symmetry is another important characteristic of the error diffusion filter, since the human visual system is particularly sensitive to directional artifacts in the image.

In the following qualitative description of digital image halftoning, a 348 × 348 natural image of the Cadet Chapel at West Point was selected as the test image. This image provides a variety of important image characteristics, with regions of uniform gray-scale, edges, and areas with fine detail. This image was scanned from a high-resolution black and white photograph at 150 dots per inch (dpi) and then printed using a 300 dpi laser printer.
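Eqn [1] and the noise shaping of eqn [2] can both be sketched in a few lines of Python (an illustrative implementation, using output levels {0, 1} rather than the {−1, 1} convention above; NumPy):

```python
import numpy as np

# Floyd-Steinberg weights of Figure 2a, indexed by (row, column) offset
# from the pixel being processed.
FS = {(0, 1): 7 / 16, (1, -1): 3 / 16, (1, 0): 5 / 16, (1, 1): 1 / 16}

def error_diffusion(image):
    """Classical unidirectional error diffusion (eqn [1]), scanning from
    the upper-left to the lower-right. Input in [0, 1]; output binary."""
    u = image.astype(float).copy()
    out = np.zeros_like(u)
    h, w = u.shape
    for m in range(h):
        for n in range(w):
            out[m, n] = 1.0 if u[m, n] >= 0.5 else 0.0
            err = u[m, n] - out[m, n]  # quantizer error at (m, n)
            for (di, dj), weight in FS.items():
                if 0 <= m + di < h and 0 <= n + dj < w:
                    u[m + di, n + dj] += weight * err  # diffuse to unprocessed pixels
    return out

# A uniform 50% gray patch halftones to roughly half white pixels,
# since the unity-gain filter preserves the average intensity.
halftone = error_diffusion(np.full((32, 32), 0.5))
print(halftone.mean())

# Noise shaping of eqn [2]: W = 1 at DC, so |1 - W| = 0 there, while at the
# Nyquist corner |1 - W| > 1, pushing quantization noise to high frequencies.
W_dc = sum(FS.values())
W_nyq = sum(c * np.exp(-1j * np.pi * (i + j)) for (i, j), c in FS.items())
print(abs(1 - W_dc), abs(1 - W_nyq))
```

The second print evaluates 1 − W(z_1, z_2) on the unit bicircle at DC and at the Nyquist corner, confirming the high-pass noise-shaping behavior described above.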
(a) Floyd–Steinberg (1/16):

        †   7
    3   5   1

(b) Jarvis, Judice, and Ninke (1/48):

        †   7   5
    3   5   7   5   3
    1   3   5   3   1

(c) Stucki (1/42):

        †   8   4
    2   4   8   4   2
    1   2   4   2   1

Figure 2 Three common error diffusion filters used in rectangular grid architectures. (The '†' represents the origin.)

Figure 3 shows the test image of the Cadet Chapel rendered using dispersed-dot ordered dither, which results in 256 gray-levels at 300 dpi. Figure 4 shows the halftone image of the Cadet Chapel using the error diffusion algorithm and the Floyd–Steinberg filter coefficients shown in Figure 2a. The unidirectionality of the processing and the causality of the diffusion filter result in undesirable visual artifacts in the halftone image. These include directional hysteresis, which is manifested as 'snakes' running from upper-left to lower-right, and transient behavior near boundaries, which appears as 'shadows' below and to the right of sharp-intensity changes. The directional
hysteresis is particularly objectionable in uniform gray intensity areas, such as the cloud structure in the upper-left of Figure 4. Similar artifacts are also present in images halftoned using the error diffusion algorithm and other causal diffusion kernels. The logical conclusion to draw from this limited, qualitative analysis is that if we could diffuse the error symmetrically and simultaneously process the entire image in parallel, we could reduce some of these visual artifacts and thereby improve overall halftone image quality.

Figure 4 Halftone image of the Cadet Chapel produced using the error diffusion algorithm and the Floyd–Steinberg weights.

The Error Diffusion Algorithm and Neural Networks

The popularity of the neural network-based approach to signal and image processing lies in the ability to minimize a particular performance metric associated with a highly nonlinear system of equations. Specifically, the problem of creating a halftone image can be cast in terms of a nonlinear quadratic optimization problem where the performance measure to be minimized is the difference between the original and the halftone images.

Figure 5 Electronic realization of a four-neuron Hopfield-type neural network. (The network consists of inputs x_1–x_4, interconnection weights T_{i,j}, and outputs y_1–y_4.)
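This quadratic formulation can be made concrete with a small numerical sketch (NumPy; the 3 × 3 kernel, the wrap-around indexing, and the simplified sign convention for W are illustrative choices, not the filters used later in this article). It builds a circulant interconnect matrix W, forms A = (I + W)^{-1}, and verifies two relations used in the analysis that follows: AW = I − A, and the fact that the network energy y^T A y − 2 y^T A x + x^T A x equals the weighted squared difference (y − x)^T A (y − x) between the binary halftone y and the original image x.

```python
import numpy as np

N = 8  # N x N image, so the network has N^2 neurons

# Symmetric error diffusion kernel with zero center weight, normalized to
# unity gain (an illustrative kernel).
kernel = np.array([[1.0, 2.0, 1.0],
                   [2.0, 0.0, 2.0],
                   [1.0, 2.0, 1.0]])
kernel /= kernel.sum()

# Circulant N^2 x N^2 interconnect matrix W built from the kernel,
# using wrap-around indexing to keep W circulant.
W = np.zeros((N * N, N * N))
for r in range(N):
    for c in range(N):
        for di in (-1, 0, 1):
            for dj in (-1, 0, 1):
                W[r * N + c, ((r + di) % N) * N + (c + dj) % N] = kernel[di + 1, dj + 1]

A = np.linalg.inv(np.eye(N * N) + W)

# Identity AW = I - A: the network effectively pre-filters the input by A.
print(np.allclose(A @ W, np.eye(N * N) - A))  # True

# The quadratic energy equals a weighted squared original-vs-halftone error.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, N * N)   # continuous-tone input in [-1, 1]
y = rng.choice([-1.0, 1.0], N * N)  # candidate binary halftone
energy = y @ A @ y - 2 * y @ A @ x + x @ A @ x
print(np.isclose(energy, (y - x) @ A @ (y - x)))  # True
```

Minimizing this energy over binary y therefore minimizes a weighted difference between the original and the halftone images, which is exactly the performance measure described above.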
elements, and c is a scaling factor. In equilibrium, eqn [3] implies that

u_i = x_i + Σ_j T_{i,j} I[u_j]    [4]

Hopfield showed that when the matrix of interconnection weights T is symmetric with zero diagonal elements and the high-gain limit of the sigmoid I[·] is used, the stable states of the N functions y_i(t) = I[u_i] are the local minima of the energy function:

E = −(1/2) y^T T y − x^T y    [5]

where y ∈ {−1, 1}^N is the N-vector of quantized states. As a result, if the neural network can be shown to be stable, then as the network converges, the energy function is minimized. In most neural network applications, this energy function is designed to be proportional to a performance metric of interest; therefore, as the network converges, the energy function, and consequently the performance metric, is also minimized.

The Error Diffusion Neural Network

An understanding of the operation and characteristics of the Hopfield-type neural network can be applied to the development and understanding of a 2D extension of the error diffusion algorithm. Figure 6 shows an electronic implementation of a four-neuron error diffusion-type neural network, where the individual neurons are represented as amplifiers and the synapses by the physical connections between the inputs and outputs of the amplifiers.

Figure 6 Four-neuron electronic implementation of the error diffusion neural network.

In equilibrium, the error diffusion neural network can be shown to satisfy

u = W(y − u) + x    [6]

Here, u is the state vector of neuron inputs, W is a matrix containing the interconnect weights, x is the input state vector, and y is the output state vector. For an N × N image, W is an N² × N² sparse, circulant matrix derived from the original error diffusion weights w_{m,n}. If we define the coordinate system such that the central element of the error diffusion kernel is (i, j) = (0, 0), then the matrix W is defined as W(i, j) = −w[(j − i) div N, (j − i) mod N], where:

x div y = ⌊x/y⌋ if xy ≥ 0, or ⌈x/y⌉ if xy < 0    [7]

An equivalence to the Hopfield network can therefore be described by

u = A(Wy + x)    [8]

where A = (I + W)^{−1}. Effectively, the error diffusion network includes a pre-filtering of the input image x by the matrix A while still filtering the output image y, but now with a new matrix, AW. Recognizing that AW = I − A, and adding the arbitrary constant k = y^T y + x^T A x, we can write the energy function of the error diffusion neural network as

Ê(x, y) = y^T A y − 2 y^T A x + x^T A x    [9]

This energy function is a quadratic function, which can be factored into

Ê(x, y) = [B(y − x)]^T [B(y − x)]    [10]

where A = B^T B. From eqn [10] we find that as the error diffusion neural network converges and the energy function is minimized, so too is the error between the output and input images.

If the neurons update independently, the convergence of the error diffusion network is guaranteed if

∀k : [AW]_{k,k} ≥ 0    [11]

We find in practice that, even in a synchronous implementation, the halftoned images converge to a solution which results in significantly improved halftone image quality over other similar halftoning algorithms.

The purpose of the error diffusion filter is to spectrally shape the quantization noise in such a way that the
quantizer error is distributed to higher spatial frequencies, which are less objectionable to the human visual system. In this application, a feedback filter for the error diffusion neural network was designed using conventional 2D filter design techniques, resulting in the 7 × 7 impulse response shown in Figure 7. Figure 8 shows the same image of the Cadet Chapel halftoned using the error diffusion neural network algorithm and the 2D error diffusion filter shown in Figure 7. There is clear improvement in halftone image quality over the image produced using sequential error diffusion in Figure 4. Notice particularly the uniform distribution of pixels in the cloud formation in the upper-left of Figure 8. Also noteworthy is the improvement around the fine-detail portions of the tree branches and next to the vertical edges of the chapel.

0.0003  0.0019  0.0051  0.0068  0.0051  0.0019  0.0003
0.0019  0.0103  0.0248  0.0328  0.0248  0.0103  0.0019
0.0051  0.0248  0.0583  0.0766  0.0583  0.0248  0.0051
0.0068  0.0328  0.0766    †     0.0766  0.0328  0.0068
0.0051  0.0248  0.0583  0.0766  0.0583  0.0248  0.0051
0.0019  0.0103  0.0248  0.0328  0.0248  0.0103  0.0019
0.0003  0.0019  0.0051  0.0068  0.0051  0.0019  0.0003

Figure 7 Impulse response of a 7 × 7 symmetric error diffusion filter. ('†' represents the origin.)

Smart Pixel Technology and Optical Image Processing

The concept underlying smart pixel technology is to integrate electronic processing and individual optical devices on a common chip, to take advantage of the complexity of electronic processing circuits and the speed of optical devices. Arrays of these smart pixels then allow the advantage of the parallelism that optics provides. There are a number of different approaches to smart pixel technology that differ primarily in the kind of opto-electronic devices, the type of electronic circuitry, and the method of integration of the two. Common opto-electronic devices found in smart pixels include light-emitting diodes (LEDs), laser diodes (LDs), vertical cavity surface emitting lasers (VCSELs), multiple quantum well (MQW) modulators, liquid crystal devices (LCDs), and photodetectors (PDs). The most common types of electronic circuitry are silicon-based semiconductors such as complementary metal oxide semiconductors (CMOS) and compound semiconductors such as gallium arsenide (GaAs)-based circuitry.

There are a number of different approaches to smart pixels, which generally differ in the way in which the electronic and optical devices are integrated. Monolithic integration, direct epitaxy, and hybrid integration are the three most common approaches in use today.

Smart pixel technology provides a natural methodology by which to implement optical image processing architectures. The opto-electronic devices provide a natural optical interface, while the electronic circuitry provides the ability to perform either analog or digital computation. It is important to understand that any functionality that can be implemented in electronic circuitry can be integrated into a smart pixel architecture. The only limitations arise from physical space constraints imposed by the integration of the opto-electronic devices with the electronic circuitry. While smart pixels can be fabricated with digital or analog circuitry, the smart pixel architecture described subsequently uses mixed-signal circuitry.
VLSI circuitry, using a hybrid integration approach called flip-chip bonding. This type of smart pixel is commonly referred to as CMOS-SEED smart pixel technology.

To provide an example of the application of smart pixel technology to digital image halftoning, a 5 × 5 CMOS-SEED smart pixel array was designed and fabricated. The CMOS circuitry was produced using a 0.5 μm silicon process, and the SEED modulators were subsequently flip-chip bonded to the silicon circuitry using a hybrid integration technique. The central neuron of this new smart pixel array consists of approximately 160 transistors, while the complete 5 × 5 array accounts for over 3,600 transistors. A total of 50 optical input/output channels are provided in this implementation.

Figure 9 Circuit diagram of a single neuron and a single error weight of the 5 × 5 error diffusion neural network based on a CMOS-SEED-type smart pixel architecture. (Reprinted with permission from Shoop BL, Photonic Analog-to-Digital Conversion, Springer Series in Optical Sciences, Vol. 81 [1], Fig. 8.6, p. 222. Copyright 2001, Springer-Verlag GmbH & Co. KG.)

Figure 9 shows the circuitry associated with a single neuron of the smart pixel error diffusion neural network. All state variables in each circuit of this architecture are represented as currents. Beginning in the upper-left of the circuit and proceeding clockwise, the input optical signal incident on the SEED is continuous in intensity and represents the individual input analog pixel intensity. The input SEED at each neuron converts the optical signal to a photocurrent and, subsequently, current mirrors are used to buffer and amplify the signal. The width-to-length ratios of the metal oxide semiconductor field effect transistors (MOSFETs) used in the CMOS-SEED circuitry provide current gain to amplify the photocurrent. The first circuit produces two output signals: +I_u, which represents the state variable u_{m,n} as the input to the quantizer, and −I_e, which represents the state variable −ε_{m,n} as the input to the feedback differencing node. The function of the quantizer is to provide a smooth, continuous threshold functionality for the neuron that produces the output signal I_out, corresponding to the output state variable y_{m,n}. This second electronic circuit is called a modified wide-range transconductance amplifier and produces a hyperbolic tangent sigmoidal function when operated in the sub-threshold regime. The third circuit takes as its input I_out, produces a replica of the original signal, and drives
the output optical SEED. In this case, the output optical signal is a binary quantity represented as the presence or absence of light. When the SEED is forward-biased, light is generated through electroluminescence. The last circuit, at the bottom of the schematic, implements a portion of the error weighting and distribution function of the error diffusion filter. The individual weights are implemented by scaling the width-to-length ratios of the MOSFETs to achieve the desired weighting coefficients. The neuron-to-neuron interconnections are accomplished using the four metalization layers of the 0.5 μm silicon CMOS process. In this design, the error diffusion filter was limited to a 5 × 5 region of support because of the physical constraints of the circuitry necessary to implement the complete error diffusion neural network in silicon circuitry. The impulse response of the 5 × 5 filter used in this particular smart pixel architecture is shown in Figure 10. The error weighting circuitry at the bottom of Figure 9 represents only the largest weight (0.1124), with interconnects to its four local neighbors (I_outA – I_outD).

0.0023  0.0185  0.0758  0.0185  0.0023
0.0185  0.0254  0.1124  0.0254  0.0185
0.0758  0.1124    †     0.1124  0.0758
0.0185  0.0254  0.1124  0.0254  0.0185
0.0023  0.0185  0.0758  0.0185  0.0023

Figure 10 Error diffusion filter coefficients used in the 5 × 5 CMOS-SEED error diffusion architecture. ('†' represents the origin.)

Figure 11 shows a photomicrograph of a 5 × 5 error diffusion neural network, which was fabricated using CMOS-SEED smart pixels. Figure 12 shows a photomicrograph of a single neuron of the 5 × 5 CMOS-SEED neural network. The rectangular features are the MQW modulators, while the silicon circuits are visible between and beneath the modulators. The MQW modulators are approximately 70 μm × 30 μm and have optical windows which are 18 μm × 18 μm.

Simulations of this network predict individual neuron switching speeds of less than 1 μs, which corresponds to network convergence speeds capable of providing real-time digital image halftoning. Individual component functionality and dynamic operation of the full 5 × 5 neural array were both experimentally characterized. Figure 13 shows CCD images of the operational 5 × 5 CMOS-SEED smart pixel array. The white spots show SEED MQW modulators emitting light. Figure 13a shows the fully functioning array, while Figure 13b shows the 5 × 5 CMOS-SEED neural array under 50% gray-scale input. Here, 50% of the SEED modulators are shown to be in the on-state, demonstrating correct operation of the network and the halftoning architecture.

Both the simulations and the experimental results demonstrate that this approach to a smart pixel implementation of the error diffusion neural network provides sufficient accuracy for the digital halftoning application. The individual neuron switching speeds also demonstrate the capability of this smart pixel hardware implementation to provide real-time halftoning of video images.

Figure 11 Photomicrograph of a smart pixel implementation of a 5 × 5 CMOS-SEED error diffusion neural network for digital image halftoning. (Reprinted with permission from Shoop BL, Photonic Analog-to-Digital Conversion, Springer Series in Optical Sciences, Vol. 81 [1], Fig. 8.5, p. 221. Copyright 2001, Springer-Verlag GmbH & Co. KG.)
Figure 12 Photomicrograph of a single neuron of the 5 × 5 error diffusion neural network. (Reprinted with permission from Shoop BL, Photonic Analog-to-Digital Conversion, Springer Series in Optical Sciences, Vol. 81 [1], Fig. 8.4, p. 220. Copyright 2001, Springer-Verlag GmbH & Co. KG.)
Figure 13 CCD images of the 5 £ 5 CMOS-SEED array. (a) Fully-operational array, and (b) under 50% gray scale illumination.
(Reprinted with permission from Shoop BL, Photonic Analog-to-Digital Conversion, Springer Series in Optical Sciences, Vol. 81 [1],
Figs. 8.19 and 8.20, p. 236. Copyright 2001, Springer-Verlag GmbH & Co. KG.)
Image Processing Extensions

Other image processing functionality is also possible by extending the fundamental concepts of 2D, symmetric error diffusion to other promising image processing applications. One important application includes color halftoning, while other extensions include edge enhancement and feature extraction. In the basic error diffusion neural network, the error diffusion filter was specifically designed to produce visually pleasing halftoned images. Care was taken to ensure that the frequency response of the filter was circularly symmetric and that the cutoff frequency was chosen in such a way as to preserve image content. An analysis of the error diffusion neural network shows that the frequency response of this filter directly shapes the frequency spectrum of the output halftone image and therefore directly controls halftone image content. Other filter designs with different spectral responses can provide other image processing features, such as edge enhancement, which could lead to feature extraction and automatic target recognition applications.

See also

Detection: Smart Pixel Arrays. Information Processing: Optical Neural Networks. Optical Processing Systems.
INFORMATION PROCESSING / Optical Neural Networks 275
to describe continuous sets of events, each of which both causes and is caused by the others.

ANNs themselves have grown much more complicated as well. Starting out as simple, feed-forward, single-layer systems, they rapidly became bidirectional and even multilayered. Higher-order neural networks processed the input signals before inserting them into the neural network. Pulse-coupled neural networks (PCNNs) arose that more closely modeled the pulsations in real brains. Learning was broken into learning with instruction, learning by example, and self-organization, the latter leading to self-organized maps, adaptive resonance theory, and many related subjects. These are but crude pointers to the wealth and variety of concepts now part of the ANN field.

The history of ANNs is both complicated and somewhat sordid. This is not the occasion to retrace the unpleasant things various members of this community have done to others in their quests for stature in the field. One result of that history is a dramatically up-and-down pattern of interest in and support for ANNs. The first boom followed the development of a powerful single-layer neural network, the Perceptron, by its inventor, Rosenblatt. Perceptrons were biologically motivated, fast, and reasonably effective, so interest in the new field of ANNs was high. That interest was destroyed by a book arguing that Perceptrons are only linear discriminants and that most interesting problems are not linearly discriminable. Ironically, the same argument has been made about optical Fourier correlators, and it is true there as well; but optical processing folk largely ignored the problem and kept working on it. Sociologists of science might enjoy asking why reactions differed so much in the two fields. Funding and interest collapsed, and the field went into hibernation.

It was widely recognized that multilayer ANNs would comprise nonlinear discriminants, but training them presented a real problem. Simple reward and punishment of weights according to performance is easy when each output signal comes from a known set of weighted signals that can be rewarded or punished jointly, according to how the output differs from the sought-after output. With a multilayer perceptron, credit assignment for the inner layers is no longer simple. It was not until Werbos invented 'backward error propagation' that a solution to this problem was available and the field started to boom again. Also key to the second boom were enormously popular articles by Hopfield on rather simple auto- and hetero-associative neural networks. Those networks are almost never used today, but they were of great value in restarting the field. In terms of research and applications, we are still in that second boom, which began during the 1980s.

    An ANN is a signal processor comprised of a multiplicity of nodes which receive one or more input signals and generate one or more output signals in a nonlinear fashion, wherein those nodes are connected with one- or two-way signaling channels.

Note also that biological inspiration is no longer a part of the definition. Most ANNs used for association are only vaguely related to their biological counterparts.

Optical Processors

Optics is well suited for massively parallel interconnections, so it seems logical to explore optical ANNs. The first question to address is "Why bother to use optical processors at all?", as electronic digital computers seem to have everything:

• Between the writing of this article (on an electronic computer) and its publication, the speed of electronic computers will almost double and they will become even cheaper and faster.
• They can have arbitrary dynamic range and accuracy.
• Massively improved algorithms also improve their speed. For instance, the wonderful fast Fourier transform (FFT) that opticists have used to simulate optics for decades has been replaced by an even faster "fastest Fourier transform in the West" (FFTW). And for wavelets, the fast wavelet transform (FWT) is faster still.

Optical processors, on the other hand, have severely limited analog accuracy and few analog algorithms or architectures. They are specialized systems (equivalent, in electronics, not to computers but to application-specific integrated circuits (ASICs)). They are usually larger and clumsier than electronic systems and always cost more.

There are several reasons why optics may still have a role in some computations. First, many of the things on which we wish to perform computations are inherently optical. If we can do the computations optically, before detection, we can avoid some of the time, energy, and sensitivity penalties inflicted if we first detect the signals, then process them electronically, and then (sometimes) convert them back to optics. Examples include spectroscopy and optical communication.

Second, pure optics is pure quantum mechanics. Input of data and setup of the apparatus is experiment preparation. Readout of data is classical measurement. Nature does the rest free of charge. There is a unitary transformation from input to output that
requires no human intervention and is, in fact, destroyed by human intervention. Those intermediate (virtual) computations require no energy and no time, beyond the time taken for light to propagate through the system. Speed and power consumption are limited only by the input and output. This applies for everything from Fourier optics to digital optics to logic gates. Thus, optics can have speed and power consumption advantages over electronics.

Third, if quantum computers involving entanglement ever emerge, they may well be optical. Such computers would be used for nondeterministic solutions of hard problems. That is, they would explore all possible paths at once and see which one does best. This is something we can already do electronically or optically for maze running, using classical means rather than quantum entanglement.

Fourth, most optical processors are analog, and analog is sometimes superior to digital. In those cases, we would still have to choose between analog optics and analog electronics, of course.

So, optical processing (like all other processing) is a niche market. But some of the niches may be quite large and important. It is not a universal solution, but neither is digital electronics.

The Major Branches of Optical Processing and Their Corresponding Neural Networks

Every branch of optical processing leads to its own special-purpose optical neural networks, so those branches are a good place to start an article such as this one.

Optical Linear Algebra

Many neural network paradigms use linear algebra followed by a point-by-point nonlinearity in each of multiple stages. The linear operation is a matrix–vector multiplication, a matrix–matrix multiplication, or even a matrix–tensor multiplication. In any case, the key operations are multiplication and addition – operations very amenable to performance with analog optics. With effort, it is possible to combine simple electronics with analog optics operating on non-negative signals (the amount of light emitted from a source, the fraction of light passing through a modulator, etc.) to expand to real or even complex numbers. With even more effort, we can use multiple analog non-negative signals to encode a digital signal, allowing digital optical processors.

Almost all optical neural networks using linear algebra use only real numbers. In most of these, each real number is encoded as a positive number and a negative number. Data are read in parallel using spatial light modulators (SLMs), source arrays, acousto-optic modulators, etc. They are operated upon by other SLMs, acousto-optic systems, and so forth. Finally, they are detected by individual detectors or detector arrays. The required nonlinearities are then applied electronically. In theory, all linear algebraic processes up to computational complexity O(N^4) can be performed with temporal complexity O(N^0) if the extra dimensions of complexity are absorbed in space (up to O(N^2) through 2D arraying) and in fanin/fanout, the ability of multiple beams of light to be operated upon in parallel by the same physical modulator; this has the same complexity as the 2D array of modulators, namely O(N^2). Thus, we can invert a matrix, an order O(N^4) operation for an N × N matrix, in O(N^0) time, that is, independently of the size of N, as long as that matrix is accommodated by the apparatus. This offers a speed electronics cannot match.

Most of the optical neural networks of the mid-1980s had optical vector–matrix multipliers at their heart. Most were a few layers of feed-forward systems, but by the 1990s feedback had been incorporated as well. These are typically N inputs connected to N outputs through an N × N matrix, with N varying from 100 to 1000. With diffuser interconnection and a pair of SLMs, it is possible to connect an N × N input with an N × N output using N^4 arbitrary weights, but that requires integrating over up to N^2 time intervals. Most of these N^4-element interconnection matrices can be approximated well by fewer than N^4 terms using singular value decomposition (SVD) and, as this version of the interconnection uses outer products, that reduces the number of integration times considerably in most cases. We have shown, for instance, that recognizing M target images can be accomplished well with only M terms in the SVD.

Coherent Optical Fourier Transformation

It took opticists several years, during the early 1960s, to realize that if Fourier mathematics was useful in describing optics, optics might be useful in performing Fourier mathematics. The first applications of optical Fourier transforms were in image filtering and particularly in pattern recognition. Such processors are attractive not only for their speed but also for their ability to locate the recognized pattern in 1D, 2D, or even 3D. It was soon obvious that a Fourier optical correlator is a special case of a single-layer perceptron. Of course, it inherits the weaknesses of a single-layer perceptron too. Soon, workers had added
feedback to create a winner-takes-all type of auto-associative optical neural network, a system that associated images with themselves. This proved useful in restoring partial or noisy images. It was not until 2003 that we learned how to make powerful hetero-associative networks with Fourier optical systems. This creates a nonlinear discrimination system that maintains the target location and generalizes well from a few samples of multiple classes to classify new data quite accurately.

for optimum blind segmentation of images. The autowaves can be made to 'leave tracks' so they can be reversed, which allows them to solve maze problems nondeterministically, because they can take all the paths. When the first wave exits, we simply retrace it to find the shortest path through the maze – a prototypical nondeterministic polynomial operation.

Holograms
hologram, one that can change or adapt over time as the input pattern changes. Once a satisfactory photorefractive hologram is formed, it can be stabilized (fixed) if desired. Most optical neural network uses envision a dynamic situation with changing interconnections.

What has not been discussed above is that the index of refraction varies with the cube of the light's electric field vector, so photorefractives provide a convenient way to accomplish such nonlinear operations as winner-takes-all.

Sometimes, we can record holograms with photorefractives that cannot be recorded in conventional materials. To explain this: a hologram is a transducer between two optical wavefronts – usually called the reference and object beams. If the reference beam is incident on the hologram, it is (fully or partially) converted into the object beam. If the precisely reversed (phase-conjugated) reference beam is incident on it, the precisely reversed (phase-conjugated) object beam is derived from it. Likewise, the object wavefront (or its phase conjugate) produces the reference wavefront (or its phase conjugate). The hologram is made by recording the interference pattern between the two wavefronts: the reference and object beams. The ability of two wavefronts to form a temporally stable interference pattern at some point in space is called the mutual coherence of those wavefronts. Usually, both wavefronts are derived from the same laser beam to make achieving high mutual coherence possible. And, of course, both beams must normally be present to form an interference pattern. In some cases of photorefractive holograms, all of those conditions can be violated. The two wavefronts can derive from two sources that differ in wavelength, polarization, etc., and need not be simultaneously present. This phenomenon, called mutual phase conjugation, converts each wavefront into the phase conjugate of the other, except for effects due to any differences in wavelength and polarization. This allows new freedoms that have been exploited in optical neural networks.

Photorefractive materials also tend to be piezoelectric. It has been found that up to about 20 totally independent reflective holograms can be stored in lithium niobate, each with a different applied electric field. Changing the electric field so changes the fringe spacings and indices of refraction that only one of those holograms has any appreciable diffraction efficiency at any one field value. The result is an electrically selectable interconnection pattern. The same hologram can contain the information needed for many neural networks, or for multiple layers of fewer of them. There is no need, in general, to have more than an input layer, an output layer, and two intermediate or hidden layers.

Others have used photorefractive materials to implement learning in neural networks. In biological neural networks the synaptic strengths (the weights between neurons) continue to adapt. With photorefractive holograms, we can achieve this in an optical ANN.

As previously discussed, a key development in the history of ANNs was a way to train multilayer neural networks and thus accomplish nonlinear discrimination. The first, and still most popular, way to do this is called backward error propagation. Several groups have implemented it in photorefractive optical neural networks.

Conclusions

Essentially every major type of neural network has been implemented optically, and every major tool of optics has found its way into these optical neural networks. Yet it remains the case that almost every neural network used for practical applications is electronic. The mere existence proof of a use for optical neural networks may be what the field needs to move forward from the demonstration stage to the application stage. The tools exist, and there are niches where optics seems more appropriate than electronics. Hopefully, the next review of the field will include a number of commercial successes. Nothing spurs science better than the funds that follow successful applications.

List of Units and Nomenclature

Adaptive resonance theory (ART): Another self-organized neural net categorizer, one that works by mutual adjustment between bottom-up and top-down neural connections.

Computational complexity: The measure of how the number of required calculations scales with the problem size, usually denoted by N. The scaling is represented in Big O form as O(f(N)), where f(N) shows how the count scales asymptotically (large N). Problems that scale as O(N^P) are said to be of polynomial complexity. Unfortunately, some of the most important problems have exponential complexity.
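The Optical Linear Algebra scheme described above – signed numbers carried by pairs of non-negative light intensities, a point-by-point nonlinearity applied electronically, and large interconnection matrices compressed to a few SVD terms – can be sketched numerically. This is an illustrative model only, not a description of any particular optical hardware; the function names and the tanh nonlinearity are choices made for the sketch.

```python
import numpy as np

def optical_layer(W, x, nonlinearity=np.tanh):
    """One layer of a modeled optical vector-matrix multiplier.

    Light intensity cannot be negative, so a signed weight matrix is
    carried as two non-negative masks, W = W_pos - W_neg.  The two
    matrix products stand in for the optical passes; the subtraction
    and the point-by-point nonlinearity stand in for the electronics.
    """
    W_pos = np.clip(W, 0.0, None)   # mask transmitting the positive weights
    W_neg = np.clip(-W, 0.0, None)  # mask transmitting the negative weights
    detected = W_pos @ x - W_neg @ x
    return nonlinearity(detected)

def svd_interconnect(W, M):
    """Approximate an interconnection matrix by its M largest SVD terms,
    i.e., by M outer products (cf. recognizing M targets with M terms)."""
    U, s, Vt = np.linalg.svd(W)
    return (U[:, :M] * s[:M]) @ Vt[:M, :]

# Example: a random 4-node layer agrees with the direct electronic result.
rng = np.random.default_rng(1)
W = rng.normal(size=(4, 4))
x = rng.normal(size=4)
assert np.allclose(optical_layer(W, x), np.tanh(W @ x))
```

A rank-M interconnection is reproduced exactly by M outer-product terms; more generally the truncation error is set by the discarded singular values.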
INSTRUMENTATION
Contents
Astronomical Instrumentation
Ellipsometry
Photometry
Scatterometry
Spectrometers
Telescopes
Table 1 Optical interfaces to representative types of telescope and achievable angular resolution
instrument directly at the focus of M1 (prime focus), although this imposes severe restrictions on the mass and size of the instrument. An alternative mounting for instruments is via optical fibers. The input, with any fore-optics required to couple the telescope beam into the fiber, is mounted at the appropriate focus, while the output, possibly several tens of meters distant, is mounted where the instrument is not subjected to variations in the gravity vector.

The useful field of view (FOV) of the telescope is determined by how rapidly the image quality degrades with radius in the field, and by restrictions on the size of the Cassegrain hole and the space set aside for the star trackers required for accurate guiding and the auxiliary optics involved in target acquisition, AO, etc.

Typical telescope parameters are shown in Table 1, of which the last two entries are speculative examples of ELTs. The table also lists the diffraction limit of the telescopes and the limits imposed by seeing, with and without the effects of the telescope structure and enclosure, and approximate limits for performance with AO, which approaches the diffraction limit in the infrared.

fundamental constants on cosmological timescales (~10^10 years). The term 'efficiency' has several components:

(i) Throughput of the optical system, including the efficiency of the optical components (lenses or mirrors), dispersers, filters, and detectors; this includes losses due to transmission/reflection, diffraction, scattering, and vignetting by internal obstructions;
(ii) Noise from all sources: the detector, including time-independent (read-noise) and time-dependent (dark-current) components; thermal emission from the instrument, telescope, enclosure, and sky; straylight (see above); and electronic interference;
(iii) Efficient use of the instrument and telescope, to maximize the fraction of the time spent accumulating useful photons instead of, say, calibrating, reconfiguring the instrument, or acquiring targets;
(iv) Parallelizing the observations, so that the same instrument records data from many objects at the same time. This is discussed further below.
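Components (i) and (ii) combine in the usual CCD signal-to-noise estimate. The sketch below is the standard photon-statistics budget, not tied to any particular instrument; all numerical values in the example are hypothetical.

```python
import math

def snr(obj_rate, sky_rate, dark_rate, read_noise, n_pix, t, n_reads=1):
    """Standard CCD signal-to-noise budget.

    obj_rate:  detected object photoelectrons per second, i.e. after all
               the throughput losses of component (i)
    sky_rate, dark_rate:  background electrons per second, summed over
               the n_pix pixels of the measurement aperture
    read_noise:  electrons rms per pixel per read
    """
    signal = obj_rate * t
    variance = (signal                       # object shot noise
                + sky_rate * t               # sky shot noise
                + dark_rate * t              # dark-current shot noise
                + n_pix * n_reads * read_noise ** 2)  # read noise
    return signal / math.sqrt(variance)

# Illustrative numbers only: a faint source against a bright sky.
print(round(snr(10.0, 50.0, 0.5, 5.0, 20, 600.0), 1))  # prints 31.3
```

In the sky-limited regime the SNR grows only as the square root of the exposure time, which is why the throughput and noise terms above matter so much.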
Table 2 Typical requirements for angular resolution and single-object field of view
[Table body lost in extraction; its columns included the target type, distance, and visible extent, with entries such as an active nucleus and a star cluster (10 pc).]
Field of View and Multiplexing

The FOV of an individual observation must be sufficient to cover the angular extent of the object with the minimum number of separate pointings of the telescope. As well as minimizing the total observation time, this reduces photometric errors caused by changes in the ambient conditions (e.g., flexure, thermal effects, and changes in the sky background). (Photometry is the measurement of the flux density received from an astronomical object, usually done with the aid of an imager employing filters with standardized broad passbands. Spectrophotometry is the same but in the much narrower passbands obtained by dispersing the light.) The last column of Table 2 indicates the required size of FOV. It is also highly desirable to observe many objects simultaneously, since most studies require large, statistically homogeneous samples. The precise requirement depends on the surface number density of potential targets and the sampling strategy adopted to create the sample. Cosmological studies of the large-scale distribution of galaxies and clusters emphasize the size of the field (≥0.5°), since the targets may be sparsely sampled, while studies of clustered objects (galaxies or stars) aim to maximize the surface density of targets (≤1000 arcmin^-2) within a modest FOV (~5′).

Spectral Resolution

The required energy resolution is specified in terms of the spectral resolving power, R ≡ λ/δλ, where δλ is the resolution in wavelength, defined usually in terms of the Rayleigh criterion. One of the main data products required in astrophysics is the velocity along the line of sight (radial velocity), obtained through measurement of the wavelength of a known spectral feature (emission or absorption line). This may be determined to an accuracy of:

    δv = Qc/R    [1]

where Q is the fraction of the width of the spectral resolution element, δλ, to which the centroid can be determined. This is limited either by the SNR of the observation of the line, via Q ≈ 1/SNR, or by the stability of the spectrograph, via:

    Q = (Δx/χ)(dχ/dx)    [2]

where χ is the angular width of the slit measured on the sky, dχ/dx is the image scale at the detector (sky angle per unit detector length), and Δx is the amount of uncorrectable flexure in units of the detector pixel size. This is the part of the motion of the centroid of the line which is not susceptible to modeling with the aid of frequent wavelength calibration exposures. Typically, Δx ≈ 0.2 pixels is achievable (roughly 3 μm at the detector, or 10 μm at the slit for an 8 m telescope), implying Q ≈ 1/30 for a typical dχ/dx = 0.1–0.2″/pixel and χ = 0.5–1″.

Table 3 gives typical requirements for R and the value of Q required to achieve it, given the spectral resolution obtainable for different types of current spectrograph. Extrasolar planet studies require very high stability.

Some spectral observations require very accurate measurements of the shape of the spectral features, to infer the kinematics of ensembles of objects which are individually not resolvable (e.g., stars in distant galaxies, gas in black-hole accretion disks). Other applications require accurate measurements of the relative flux of various spectral features within the same spectrum (e.g., to reveal the abundances of elements or different types of star).

Optical Principles

Imaging

The simplest type of imager consists of a detector, comprising a 2D array of light-sensitive pixels (Table 4), placed at a telescope focal surface without any intervening optics except for the filters required for photometry. Although maximizing throughput, this arrangement removes flexibility in changing the image scale, since the physical size of pixels (typically 15–25 μm) is fixed. Thus the Gemini telescopes
284 INSTRUMENTATION / Astronomical Instrumentation
[Table 4 (detector characteristics) lost in extraction; its columns were wavelength range (μm), material/type, pixel size (μm), format (single and mosaic), read noise (e-/pix, typical and state-of-the-art), dark current (e-/pix/hr), and QE range (%).]
(D_T = 8 m) provide an uncorrected image scale of 25–40 mas/pixel at their sole F/16 focus. Although well suited to high-resolution observations with AO, this is inappropriate for most observations of faint galaxies, where 100–200 mas/pixel is required. This arrangement also provides limited facilities for defining the wavelength range of the observations, since there is no intermediate focal surface at which to place narrowband filters, whose performance is sensitive to field angle and which are, therefore, best placed at an image conjugate. There is also no image of the telescope pupil at which to place a cold stop to reject the external and instrumental thermal radiation which is the dominant noise source at λ > 1.5 μm.

The alternative is to use a focal reducer consisting of a collimator and camera. The auxiliary components (filters, cold stops, Fabry–Perot etalons, etc.) then have an additional image and pupil conjugate available to them, and the image scale may be chosen by varying the ratio of the camera and collimator focal lengths. The extra complexity of optics inevitably reduces throughput but, through the use of modern optical materials, coatings such as Sol-Gel, and the avoidance of internally obstructed reflective optics, losses may be restricted to 10–20% except at wavelengths shorter than ~0.35 μm. An imaging focal reducer is a special case of the generic spectrograph described in the next section.

Spectroscopy

Basic principles

Astronomical spectrographs generally employ plane diffraction gratings. The interference between adjacent ray paths as they propagate through the medium is illustrated in Figure 1. From this, we obtain the grating equation:

    mρλ = n1 sin α + n2 sin β ≡ G    [3]

where α and β are the angles of incidence and diffraction, respectively, ρ = 1/a is the ruling density, λ is the wavelength, and m is the spectral order. The refractive indices of the media in which the incident and diffracted rays propagate are n1 and n2, respectively.

The layout of a generic spectrograph employing a plane diffraction grating in a focal reducer arrangement is shown in Figure 2. The spectrograph re-images the slit onto the detector via a collimator and camera. The disperser is placed in the collimated beam, close to a conjugate of the telescope pupil.
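The grating equation (eqn [3]) can be checked numerically. Below is a minimal sketch, taking both media to be vacuum (n1 = n2 = 1) and using an illustrative 600 line/mm grating; with the sign convention of Figure 1, the diffracted angle comes out negative, as the note under Figure 2 anticipates.

```python
import math

def diffraction_angle(m, rho, wavelength, alpha, n1=1.0, n2=1.0):
    """Solve eqn [3], m*rho*lambda = n1*sin(alpha) + n2*sin(beta), for beta.

    Angles in radians; rho in lines per metre when wavelength is in metres.
    """
    s = (m * rho * wavelength - n1 * math.sin(alpha)) / n2
    if abs(s) > 1.0:
        raise ValueError("order does not propagate")
    return math.asin(s)

def angular_dispersion(m, rho, beta):
    """Eqn [5] in vacuo: d(lambda)/d(beta) = cos(beta) / (m*rho)."""
    return math.cos(beta) / (m * rho)

# Illustrative: 600 lines/mm, first order, 500 nm, alpha = 20 degrees.
beta = diffraction_angle(1, 600e3, 500e-9, math.radians(20.0))
assert beta < 0  # negative, per the sign convention noted under Figure 2
```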
A field lens placed near the focus is often incorporated in the collimator for this purpose.

The geometrical factor, G, defined in eqn [3] is constrained by the basic angle of the spectrograph:

    Ψ = α − β    [4]

which is the (normally fixed) angle between the optical axes of the collimator and camera. Ψ = nπ corresponds to the Littrow configuration (for integer n).

Considering the case of a diffraction grating in vacuo, differentiation of the grating equation with respect to the diffracted angle yields the angular dispersion:

    dλ/dβ = cos β/(mρ)    [5]

from which the linear dispersion is obtained by considering the projection of dβ on the detector, dx:

    dλ/dx = (dλ/dβ)(dβ/dx) = cos β/(mρ f2)    [6]

    R_p = mρW    [7]

    s′ = s(F2/F1)    [8]

where F1 and F2 are the collimator and camera focal ratios, respectively. The spectral resolution of the spectrograph is simply the width of this expressed in

Figure 1 Derivation of the grating equation by consideration of the optical path difference between neighboring ray paths AB and A′B′.
[Figure 2 schematic: light from the telescope (aperture D_T, focal ratio f_T) passes a slit of width s to a collimator (aperture D1, focal length f1), then a grating of width W at basic angle Ψ (incidence and diffraction angles α and β), then a camera (aperture D2, focal length f2) onto the detector.]
Figure 2 Generic spectrograph employing a plane diffraction grating. Note that, according to the sign convention implicit in Figure 1, the diffraction angle, β, is negative in this configuration.
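Equations [8], [6], and [1] chain together in an obvious way; the sketch below wraps them as small functions. The example values at the end are illustrative only, with Q = 1/30 taken from the flexure estimate quoted earlier.

```python
import math

C = 2.99792458e8  # speed of light, m/s

def reimaged_slit_width(s, F1, F2):
    """Eqn [8]: slit width re-imaged onto the detector, s' = s * F2/F1."""
    return s * F2 / F1

def linear_dispersion(m, rho, beta, f2):
    """Eqn [6] in vacuo: d(lambda)/dx = cos(beta) / (m * rho * f2)."""
    return math.cos(beta) / (m * rho * f2)

def velocity_accuracy(Q, R):
    """Eqn [1]: radial-velocity accuracy dv = Q*c/R, in m/s."""
    return Q * C / R

# Illustrative: the flexure-limited Q ~ 1/30 quoted above, at R = 5000,
# corresponds to a radial-velocity accuracy of about 2 km/s.
dv = velocity_accuracy(1.0 / 30.0, 5000.0)
assert 1900.0 < dv < 2100.0
```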
where Ψ is given by eqn [4], using the identity sin x + sin y ≡ 2 sin((x + y)/2) cos((x − y)/2). Thus, the resolving power at blaze is obtained from

Use of volume phase holographic gratings. A newer alternative to surface-relief (SR) dispersers is the volume phase holographic (VPH) grating, in which
the interference condition is provided by a variation of the refractive index of a material, such as dichromated gelatine, which depends harmonically on position inside the grating material as:

    n_g(x, z) = n̄_g + Δn_g cos[2πρ_g(x sin γ + z cos γ)]    [22]

modulation amplitude, respectively (Figure 5). These are described by the same grating equation (eqn [3]) as before, leading to an expression for the resolving power at blaze, where mρλ_B = 2 sin α, identical to that of a blazed plane diffraction grating in the Littrow configuration (eqn [20]):
[Figure 5 diagrams: fringes of constant n_g at slant angle γ within the grating material, with incidence and diffraction angles α and β; the labeled cases include γ = π/2 and 2γ + α.]
Figure 5 Various configurations of volume phase holographic gratings. The blaze condition is when β = α. The inset shows how the blaze condition may be altered by changing the basic angle of the spectrograph.
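The blaze condition quoted above for a VPH grating in the Littrow configuration, mρλ_B = 2 sin α, can be turned around to give the blaze wavelength for a chosen incidence angle, which is how tilting the grating (the inset of Figure 5) tunes the blaze. The parameters below are illustrative only.

```python
import math

def vph_blaze_wavelength(m, rho, alpha):
    """Littrow blaze condition for a VPH grating:
    m * rho * lambda_B = 2 * sin(alpha), solved for lambda_B."""
    return 2.0 * math.sin(alpha) / (m * rho)

# Illustrative: 1000 lines/mm in first order.
lam_15 = vph_blaze_wavelength(1, 1000e3, math.radians(15.0))  # ~518 nm
lam_20 = vph_blaze_wavelength(1, 1000e3, math.radians(20.0))  # ~684 nm
assert lam_20 > lam_15  # steeper incidence moves the blaze to the red
```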
Figure 6 (a) Basic layout of an Echelle grating used in a near-Littrow configuration. (b) Illustrative cross-dispersed spectrogram
showing a simplified layout on the detector. The numbers 10–16 label the different spectral orders. The numbers labeling the vertical
axis are the wavelength (nm) at the lowest end of each complete order. Other wavelengths are labeled for clarity. For simplicity the
orders are shown evenly spaced in cross-dispersion.
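Because an echelle works in high, overlapping orders with mλ roughly constant, each order m covers a free spectral range of about λ/m before the next order takes over, and the cross-disperser separates the orders on the detector. A sketch of the order layout follows; the constant K is an illustrative value chosen only because it roughly reproduces the order labels of Figure 6b, not a quoted instrument parameter.

```python
def echelle_orders(K, m_lo, m_hi):
    """Wavelength coverage of each echelle order.

    With m*lambda roughly equal to a constant K (set by the groove
    geometry), order m runs from about K/m up to K/(m-1), where the
    next order takes over; that span is the free spectral range (FSR).
    """
    return {m: (K / m, K / (m - 1)) for m in range(m_lo, m_hi + 1)}

# K = 5000 nm (illustrative) gives order limits close to Figure 6b's labels.
for m, (lo, hi) in sorted(echelle_orders(5000.0, 10, 16).items()):
    print(f"order {m:2d}: {lo:5.1f}-{hi:5.1f} nm  (FSR ~ {hi - lo:4.1f} nm)")
```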
direct light from the various targets to a remote spectrograph whose input focus consists of one or more continuous pseudoslits. These are made up of the fiber outputs arranged in a line. The continuous nature of the slit means that spectral overlap is avoided without restricting the surface density of addressable targets, although there is a limit imposed by the distance of closest approach of the fibers. The method of deploying the fiber inputs may be a plugplate, consisting of pre-cut holes into which encapsulated fibers are manually plugged, or a pick-and-place robot, which serially positions the fiber inputs at the correct locations on a magnetized field plate. The latter is highly versatile but mechanically complex, with a significant configuration time that may erode the actual on-target integration time.

The two systems have contrasting capabilities (Figure 7). The multislit approach provides generally better SNR, since the light feeds into the spectrograph directly, but is compromised in the addressable surface density of targets by the need to avoid spectral overlap, and in FOV by the difficulty of the wide-field optics. A current example of this type of instrument, GMOS, is shown in Figure 8 and described in Table 5. The multifiber approach is limited in SNR by modal noise in the fibers, the attendant calibration uncertainties, and the lack of contiguous estimates of the sky background. However, it is easier to adapt to large fields, since the spectrograph field can be much smaller than the field over which the fiber inputs are distributed. In summary, multislit systems are best for narrow-but-deep surveys, while multifiber systems excel at wide-but-shallow surveys. Fiber-fed instruments may further prove their worth in ELTs, where technical problems may require that these bulky instruments be mounted off the telescope (see below for further discussion).

Of paramount importance in MOS is the quality of the background subtraction, as discussed above. Traditionally, this requires slits which sample the sky background directly adjacent to the object. An alternative is nod-and-shuffle (va-et-vient), in which nearby blank regions are observed alternately with the main field, using the same slit mask, by moving the telescope (the nod). In the case of CCDs, the photogenerated charge from the interleaved exposures is stored temporarily on the detector by moving it to an unilluminated portion of the device (the shuffle). After many repeats on a timescale less than that of variations in the sky background (a few minutes), the accumulated charge is read out, incurring the read-noise penalty only once. Although requiring an effective doubling of the exposure time and an
Figure 7 Illustration of the difference between the multislit and multifiber approaches to multi-object spectroscopy.
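The read-noise saving from nod-and-shuffle is easy to quantify: reading each of n sub-exposures out separately adds n·RN² to the per-pixel variance, whereas shuffled charge read out once adds only RN². A toy comparison with hypothetical numbers:

```python
import math

def pixel_noise(sky_rate, t_total, read_noise, n_reads):
    """Background-plus-read noise (electrons) in one pixel: shot noise on
    the accumulated sky charge plus read noise for n_reads readouts."""
    return math.sqrt(sky_rate * t_total + n_reads * read_noise ** 2)

# Hypothetical: 1 hour split into 60 sub-exposures, sky 2 e-/s, RN 10 e-.
sky, t, rn, n_sub = 2.0, 3600.0, 10.0, 60
separate_reads = pixel_noise(sky, t, rn, n_sub)   # each sub-exposure read out
nod_and_shuffle = pixel_noise(sky, t, rn, 1)      # shuffled charge, read once
assert nod_and_shuffle < separate_reads
```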
Figure 8 The Gemini Multiobject Spectrograph (GMOS). One of two built for the two Gemini 8-m telescopes by a UK–Canadian
consortium. It includes an integral field capability provided by a fiber-lenslet module built by the University of Durham. The optical
configuration is the same as shown in Figure 2. It is shown mounted on an instrument support unit which includes a mirror to direct light
from the telescope into different instruments. The light path is shown by the red dashed line. The slit masks are cut by a Nd-YAG laser in
3-ply carbon fiber sheets. See Table 5 for the specification.
increase in the size of the detector, this technique allows the length of the slits to be reduced, since no contiguous sky sample is required, thereby greatly increasing the attainable multiplex gain if the number density of potential targets is sufficiently high.

Integral field spectroscopy (IFS)

IFS provides a spectrum of each spatial sample simultaneously within a contiguous FOV. Other approaches provide the same combination of imaging and spectroscopy but require a series of nonsimultaneous observations. Examples include imaging through a number of narrow-passband filters with different effective wavelengths, and spectroscopy with a single slit which is stepped across the object. Other such techniques are Fabry–Perot, Fourier-transform, and Hadamard spectroscopy. Since the temporal variation of the sky background is a major limitation in the astronomy of faint objects, IFS is the preferred technique for terrestrial observations of faint objects, but nonsimultaneous techniques are preferable in certain niche areas, and relatively more important in space, where the sky background is reduced.

The main techniques (Figure 9) are as follows:

(i) Lenslets. The field is subdivided by placing an array of lenslets at the telescope focus (or its conjugate, following re-imaging fore-optics) to form a corresponding array of micropupils. These are re-imaged on the detector by a focal
reducer using a conventional disperser. The spectra are, therefore, arranged in the same pattern as that of the lenslet array. Spectral overlap is reduced by angling the dispersion direction away from the symmetry axes of the lenslet array, and by the fact that the pupil images are smaller than the aperture of the corresponding lenslet.

(ii) Fibers + lenslets. The field is subdivided as in (i), but the pupil images are relayed to a remote spectrograph using optical fibers which reformat the field into linear pseudoslits. This avoids the problem of overlap, allowing a greater length of spectrum than in (i), but is more complex, since the arrays of fibers and lenslets must be precisely co-aligned and are subject to modal noise in the fibers.

(iii) Image slicer. The field is subdivided in only one dimension, by a stack of thin slicing mirrors placed at a conjugate of the telescope focus (Figure 10). Each mirror is angled so as to direct light to its own pupil mirror, which re-images the field to form part of a linear pseudoslit. Thus the slices into which the image is divided are rearranged end-to-end to form a continuous slit at the spectrograph input. Unlike the other techniques, this retains spatial information along the length of each slice. This is dispersed by conventional means via a focal reducer. An additional optic is often required at the image of each slice at the slit, to re-image the micropupil images produced by the slicing mirrors onto a common pupil inside the spectrograph. In

methods, since the one-dimensional slicing produces fewer diffraction losses in the spectrograph than the two-dimensional division used by the others, and minimizes the fraction of the detector surface which must be left unilluminated in order to avoid cross-talk between noncontiguous parts of the field. However, the micromirrors are difficult to make, since they require diamond-turning or grinding in metal or glass with a very fine surface finish (typically with RMS ~1 nm for the optical and ~10 nm in the infrared). This sort of system is, however, well matched to cold temperatures, since the optical surfaces and mounts may be fabricated from the same material (e.g., Al) or from materials with similar thermal properties.

The data are processed into a datacube whose dimensions are given by the two spatial coordinates, x, y, plus wavelength. The datacube may be sliced up in ways analogous to tomography to understand the physical processes operating in the object.

Spectropolarimetry and Polarimetry

Spectropolarimetry and polarimetry are analogous to spectroscopy and imaging, but the polarization state of the light is measured instead of the total intensity. Like IFS, spectropolarimetry is a photon-starved (i.e., limited in SNR by photon noise) area which benefits from the large aperture of current and projected telescopes. The Stokes parameters of interest are I, Q, and U.
principle this is the most efficient of the three V is generally very small for astronomical objects.
INSTRUMENTATION / Astronomical Instrumentation 293
Figure 10 Principle of the advanced image slicer. Only three slices are shown here. Real devices have many more, e.g., the IFU for the Gemini Near-Infrared Spectrograph has 21 to provide a 5″ × 3″ field with 0.15″ × 0.15″ sampling. Reproduced from Content R (1997) A new design for integral field spectroscopy with 8 m telescopes. Proceedings SPIE 2871: 1295–1305.
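The datacube produced by IFS, with two spatial axes plus wavelength, can be sliced exactly as described; a minimal sketch in which the array shape and contents are hypothetical, not taken from any real instrument:

```python
import numpy as np

# Hypothetical IFS datacube: two spatial axes (x, y) plus wavelength.
nx, ny, nw = 15, 10, 2048                 # spaxels and spectral channels
cube = np.random.rand(nx, ny, nw)         # stand-in for reduced IFS data

image = cube[:, :, 1024]                  # monochromatic image at one channel
spectrum = cube[7, 4, :]                  # spectrum of a single spatial sample
narrow_band = cube[:, :, 1000:1050].sum(axis=2)   # narrow-band "tomographic" slice

print(image.shape, spectrum.shape, narrow_band.shape)
```

Summing over a wavelength window, as in the last line, is the simplest example of slicing the cube "tomographically" to isolate one physical process (e.g., emission in a single line).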
From these, the degree and azimuthal angle of linear polarization are obtained as:

p_L = √(Q² + U²)/I,    χ = (1/2) arctan(U/Q)    [26]

The Stokes parameters may be estimated using a modified spectrograph with a rotatable achromatic half-wave retarder, characterized by the angle of rotation about the optical axis, θ, placed before the instrument, and a polarizing beamsplitter which separates the incident beam into orthogonal polarization states (Figure 11). The two states are recorded simultaneously on different regions of the detector. The separation is achieved usually through the angular divergence produced by a Wollaston prism placed before the disperser or through the linear offset produced by a calcite block placed before the slit. The intensity as a function of the waveplate rotation angle, S(θ), is recorded for each wavelength and position in the spectrum or image. The Stokes parameters for each sample are then given by

Q/I = [S(0) − S(π/2)] / [S(0) + S(π/2)] = ⟨ (A₁ − v_AB B₁)/(A₁ + v_AB B₁), (A₂ − v_AB B₂)/(A₂ + v_AB B₂) ⟩

U/I = [S(π/4) − S(3π/4)] / [S(π/4) + S(3π/4)] = ⟨ (C₁ − v_CD D₁)/(C₁ + v_CD D₁), (C₂ − v_CD D₂)/(C₂ + v_CD D₂) ⟩

I = S(0) + S(π/2) = ⟨ (A₁ + v_AB B₁)/h_A1, (A₂ + v_AB B₂)/h_A2 ⟩, etc.    [27]

where A and B are the signals (functions of position and/or λ) recorded with the waveplate at θ = 0 and θ = π/2, and the subscripts 1 and 2 refer to the two polarization states produced by the beamsplitter.
Figure 11 Schematic layout of spectropolarimeter using a polarizing beamsplitter in the collimated beam (e.g., a Wollaston prism
producing e- and o-states) to produce separate spectra for orthogonal polarization states on the detector. Alternative configurations in
which the two polarizations have a linear offset (e.g., a calcite block) may be used in which case the beamsplitter is placed before the slit.
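The reduction of eqns [26] and [27] can be sketched for the idealized single-beam case, with the calibration factors v and h of eqns [27]–[28] set to unity (real reductions average the two beamsplitter states as the angled brackets indicate):

```python
import numpy as np

# Idealized intensities S(psi) behind a polarizer at position angles
# 0, pi/4, pi/2, 3pi/4; calibration factors v and h are taken as unity.
I_true, Q_true, U_true = 1.0, 0.10, 0.05

def S(psi):
    return 0.5 * (I_true + Q_true * np.cos(2 * psi) + U_true * np.sin(2 * psi))

S0, S45, S90, S135 = S(0), S(np.pi / 4), S(np.pi / 2), S(3 * np.pi / 4)

I = S0 + S90                              # eqn [27]
Q = (S0 - S90) / (S0 + S90) * I
U = (S45 - S135) / (S45 + S135) * I

p_L = np.sqrt(Q ** 2 + U ** 2) / I        # degree of polarization, eqn [26]
chi = 0.5 * np.arctan2(U, Q)              # position angle, eqn [26]
print(round(float(p_L), 4), round(float(chi), 4))
```

With the inputs above, the recovered Q and U equal the injected values exactly, illustrating why the ratio form of eqn [27] cancels common-mode gain variations.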
Likewise, C and D are a pair of observations taken with θ = π/4 and θ = 3π/4. The angled brackets indicate a suitable method of averaging the quantities inside. The factors v_AB and v_CD are estimated as

v_AB = √(A₁A₂/(B₁B₂)),    v_CD = √(C₁C₂/(D₁D₂))    [28]

and h_A1 is a single calibration factor for polarization state 1 of observation A, etc. One problem is instrumental polarization caused by oblique reflections from telescope mirrors upstream of the polarizing waveplate, such as the fold mirror external to GMOS (see Figure 8). This must be carefully accounted for by taking observations of stars with accurately known polarization. The dispersing element is also likely to be very strongly dependent on the polarization state of the incident light, with widely different efficiency as a function of wavelength. Although this effect is cancelled out using the procedure described above, it introduces a heavy toll in terms of SNR reduction.

Technology Issues

Use of Optical Fibers in Astronomy

As discussed above, optical fibers are commonly used in astronomy for coupling spectrographs to wide, sparsely sampled FOVs in the case of MOS, or small contiguously sampled FOVs in the case of IFS. For both applications, the most important characteristics are:

(i) Throughput. Near-perfect transmission is required for wavelength intervals of ~1 octave within the overall range of 0.3–5 μm. For MOS, a prime focus fiber feed to an off-telescope spectrograph implies fiber runs of several tens of meters for 8 m telescopes, increasing proportionally for ELTs. For IFS, runs of only ~1 m are required for self-contained par-focal IFUs. For λ < 2.5 μm, fused silica is a suitable material, although it is not currently possible to go below ~0.35 μm, depending on fiber length. Newer techniques and materials may improve the situation. For λ > 2.5 μm, alternative materials are required, such as ZrF₄ and, at longer wavelengths, chalcogenide glasses, but these are much more expensive and difficult to use than fused silica.

(ii) Conservation of Etendue. Fibers suffer from focal ratio degradation, a form of modal diffusion, which results in a speeding up of the output beam with respect to the input. Viewed as an increase in entropy, it results in a loss of information. In practice, this implies either reduced throughput, as the fiber output is vignetted by fixed-size spectrograph optics, or a loss in spectral resolution if the spectrograph is oversized to account for the faster beam from the fibers. This effect may be severe at the slow focal ratios produced by many telescopes (F > 10) but is relatively small for fast beams (F < 5). Thus, its effect may be mitigated by speeding up the input beam by attaching lenslets directly to the fibers, either individually (for MOS) or in a close-packed array (for IFS). If the spectrograph operates in both fiber- and beam-fed modes (e.g., Figure 8), it is also necessary to use lenslets at the slit to convert the beam back to the original speed.

(iii) Efficient coupling to the telescope. This requires that the fibers are relatively thick to match the physical size of the sampling at the telescope focus, which is typically x_s = 0.1″ for IFS and x_s = 2″ for MOS. By conservation of Etendue, the physical size of the fiber aperture is

d_f = x_s D_T F_f    [29]
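Eqn [29] can be evaluated directly once the angular sample size is converted to radians; a small sketch with the units handled explicitly, using the values quoted in the text:

```python
import math

ARCSEC = math.pi / (180 * 3600)        # radians per arcsecond

def fiber_diameter(x_s_arcsec, D_T, F_f):
    """d_f = x_s * D_T * F_f (eqn [29]); x_s in arcsec, D_T in meters,
    result in meters."""
    return x_s_arcsec * ARCSEC * D_T * F_f

# F_f = 4 and an 8 m aperture, with x_s = 0.1 arcsec (IFS) to 2 arcsec (MOS).
d_ifs = fiber_diameter(0.1, 8.0, 4) * 1e6      # micrometers
d_mos = fiber_diameter(2.0, 8.0, 4) * 1e6
print(round(d_ifs, 1), round(d_mos, 1))
```

This reproduces the ~15–300 μm range quoted for an 8 m telescope; scaling D_T to 30 m gives the ~60–1200 μm range for an ELT.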
where F_f is the focal ratio of the light entering the fiber and D_T is the diameter of the telescope aperture. Using the above values with F_f = 4, as required to maintain Etendue, implies 15 < d_f < 300 μm for an 8 m telescope, or 60 < d_f < 1200 μm for a 30 m telescope. Except at the lower end, this implies the use of multimode (step-index) fibers. However, the need in IFS to oversize the fibers to account for manufacturing errors, and current requirements of x_s ≥ 0.2″ to produce sufficient SNR in a single spatial sample, give a practical limit of d_f > 50 μm, which is in the multimode regime.

Recently, the use of photonic crystal fibers in astronomy has been investigated. These, together with other improvements in material technology, may improve the performance of optical fibers, which will be of special relevance to ELTs where, for example, there will be a need to couple very fast beams into fibers.

Cooling for the Near-Infrared

As shown in Figure 12, it is necessary to cool instruments for use in the infrared. This is to prevent thermal emission from the instrument becoming the dominant noise source. The minimum requirement is to place a cold stop at a conjugate of the telescope pupil. However, much of the rest of the instrument requires cooling because of the nonzero emissivity of the optics. The telescope also requires careful design. For example, the Gemini telescopes are optimized for the NIR by undersizing M2 so that the only stray light entering the final focal surface is from the cold night sky.

Irrespective of the thermal background and wavelength, all detectors (see Table 4) require cooling to reduce internally generated noise to acceptable levels. For the CCDs used in the optical, the cryogen is liquid nitrogen, but for the infrared, cooling with liquid helium is required.

Structure

As discussed above, the necessity for instruments to scale in size with the telescope aperture cannot be achieved by simply devising a rigid structure, since the required materials are generally not affordable (instruments on 8 m telescopes cost roughly €/$5–15M, including labor). Furthermore, the stability also depends critically on the mounting of the individual optical elements, where the choice of material is constrained by the need for compliance with the optical materials. The solution adopted for GMOS was to design the collimator and camera mounting so that variations in the gravity vector induce a translation in each component orthogonal to the optical axis without inducing tilts or defocus. Therefore, the only effect of instability is a translation of the image of the slit on the detector. Since the slit area is rigidly attached to the telescope interface, the image of a celestial object does not move with respect
[Figure 12 plot: cumulative thermal background (counts/pixel/s) versus wavelength (1–4 μm) for instrument temperatures of 100–300 K, with H-band sky levels (mean for R = 300 and R = 3000, plus the R = 3000 continuum) and a typical detector dark current for comparison.]
Figure 12 Illustration of the need for cooling in NIR instruments. This is modeled for the case of a spectrograph on an 8 m telescope with 0.2″ × 0.2″ sampling, with both throughput and total emissivity of 50%. The curves indicate the cumulative thermal background for wavelengths shortward of that shown on the scale. For comparison, a typical dark current is shown for NIR detectors (see Table 4). Also shown is the background from the night sky for the case of the H-band, for two values of spectral resolution, R. The mean sky background level is shown for both values of R. For R = 3000, the continuum level found between the strong, narrow OH lines which make up most of the signal is also shown. To observe in the H-band, cooling to −40 °C is required for high spectral resolution, but is unnecessary at low resolution. At longer wavelengths, fully cryogenic cooling with liquid nitrogen is generally required.
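The steep temperature dependence behind Figure 12 comes from the Wien tail of the Planck function; a sketch of the relative thermal photon background (absolute count rates would also need the etendue, emissivity, throughput, and pixel scale, which are omitted here):

```python
import math

H, C, KB = 6.626e-34, 2.998e8, 1.381e-23   # Planck, speed of light, Boltzmann (SI)

def photon_radiance(wavelength, T):
    """Blackbody spectral photon radiance, (2c/lambda^4) / (exp(hc/(lambda k T)) - 1),
    in photons s^-1 m^-3 sr^-1 (per unit wavelength)."""
    x = H * C / (wavelength * KB * T)
    return (2 * C / wavelength ** 4) / math.expm1(x)

# Relative background near 2.2 um for the instrument temperatures of Figure 12.
levels = {T: photon_radiance(2.2e-6, T) for T in (300, 250, 200, 150)}
for T, b in levels.items():
    print(T, f"{b:.3e}")
```

Cooling from 300 K to 150 K suppresses the 2.2 μm thermal background by many orders of magnitude, which is why cold stops and cryostats dominate NIR instrument design.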
to the slit, providing that the telescope correctly tracks the target. The motion of the image of the slit on the detector is then corrected by moving the detector orthogonal to the optical axis to compensate. Taking account of nonrepeatable motion in the structure and the detector mounting, it proved possible to attain the desired stability in open-loop by measuring the flexure experienced with the control system turned off, making a look-up table to predict the required detector position for each attitude setting (elevation and attitude), and applying the correction by counting pulses sent to the stepper motors which control the detector translation.

Some instruments require much greater stability than GMOS does. The flexure control outlined above may be augmented by operation in closed-loop so that nonrepeatable movements may be accounted for. However, this ideally requires a reference light source which illuminates the optics in the same way as the light from the astronomical target and which is recorded by the science detector without adversely impacting the observation. Alternatively, partial feedback may be supplied by repeated metrology of key optical components such as fold-mirrors.

One strategy to measure radial velocities with very small uncertainties (a few meters per second) is to introduce a material (e.g., iodine) into the optical path, which is present throughout the observations. This absorbs light from the observed object's continuum at one very well-defined wavelength. Thus, instrument instability can be removed by measuring the centroid of the desired spectral line in the object relative to the fixed reference produced by the absorption cell. However, flexure must still be carefully controlled to avoid blurring the line during the course of a single exposure.

Mounting of instruments via a fiber-feed remotely from the telescope, at a location where they are not subjected to variations in the gravity vector, is a solution for applications where great stability is required. But, even here, care must be taken with the fore-optics and pickoff system attached to the telescope, and to account for modal noise induced by changes in the fiber configuration as the telescope tracks.

Finally, the instrument structure has other functions. For cooled instruments, a cryostat is required inside which most of the optical components are mounted. For uncooled instruments, careful control of temperature is needed to avoid instrument motion due to thermal expansion. This requires the use of an enclosure which not only blocks out extraneous light, but provides a thermal buffer against changes in the ambient temperature during observations and reduces thermal gradients by forced circulation of air.

See also

Diffraction: Diffraction Gratings. Fiber and Guided Wave Optics: Fabrication of Optical Fiber. Imaging: Adaptive Optics. Instrumentation: Telescopes; Spectrometers. Spectroscopy: Fourier Transform Spectroscopy; Hadamard Spectroscopy and Imaging.

Further Reading

Allington-Smith JR, Murray G, Content C, et al. (2002) Integral field spectroscopy with the Gemini Multiobject Spectrograph. Publications of the Astronomical Society of the Pacific 114: 892.
Bjarklev A, Broeng J and Sanchez Bjarklev AS (2003) Photonic Crystal Fibers. Kluwer.
Bowen IS (1938) The image-slicer: a device for reducing loss of light at slit of stellar spectrograph. Astrophysical Journal 88: 113.
Carrasco E and Parry I (1994) A method for determining the focal ratio degradation of optical fibres for astronomy. Monthly Notices of the Royal Astronomical Society 271: 1.
Courtes G (1982) Instrumentation for astronomy with large optical telescopes, IAU Colloq. 67. In: Humphries CM (ed.) Astrophysics and Space Science Library, vol. 92, p. 123. Dordrecht: Reidel.
Glazebrook K and Bland-Hawthorn J (2001) Microslit nod-shuffle spectroscopy: a technique for achieving very high densities of spectra. Publications of the Astronomical Society of the Pacific 113: 197.
Lee D, Haynes R, Ren D and Allington-Smith JR (2001) Characterisation of microlens arrays for astronomical spectroscopy. Publications of the Astronomical Society of the Pacific 113: 1406.
McLean I (1997) Electronic Imaging in Astronomy. Chichester, UK: John Wiley and Sons Ltd.
Palmer C (2000) Diffraction Grating Handbook, 4th edn. Rochester, NY: Richardson Grating Laboratory.
Ridgway SH and Brault JW (1984) Astronomical Fourier transform spectroscopy revisited. Annual Review of Astronomy and Astrophysics 22.
Saha SK (2002) Modern optical astronomy: technology and impact of interferometry. Reviews of Modern Physics 74: 551.
Schroeder D (2000) Astronomical Optics, 2nd edn. San Diego, CA: Academic Press.
van Breugel W and Bland-Hawthorn J (eds) Imaging the Universe in Three Dimensions, ASP Conference Series, vol. 195.
Weitzel L, Krabbe A, Kroker H, et al. (1996) 3D: The next generation near-infrared imaging spectrometer. Astronomy & Astrophysics Supplement 119: 531.
INSTRUMENTATION / Ellipsometry 297
Ellipsometry
J N Hilfiker, J A Woollam & Co., Inc., Lincoln, NE, USA
J A Woollam, University of Nebraska, Lincoln, NE, USA

© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

Ellipsometry measures a change in polarization as light reflects from or transmits through a material structure. The polarization change is represented as an amplitude ratio, Ψ, and a phase difference, Δ. The measured response is dependent on the optical properties and thickness of each material. Thus, ellipsometry is primarily used to determine film thickness and optical constants. However, it is also applied to the characterization of composition, crystallinity, roughness, doping concentration, and other material properties associated with a change in optical response.

Interest in ellipsometry has grown steadily since the 1960s, as it provided the sensitivity necessary to measure nanometer-scale layers used in microelectronics. Today, the range of applications has spread to basic research in physical sciences, semiconductor, data storage, flat panel display, communication, biosensor, and optical coating industries. This widespread use is due to increased dependence on thin films in many areas and the flexibility of ellipsometry to measure most material types: dielectrics, semiconductors, metals, superconductors, organics, biological coatings, and composites of materials.

This article provides a fundamental description of ellipsometry measurements along with the typical data analysis procedures. The primary applications of ellipsometry are also surveyed.

Light

If the electric field has completely random orientation and phase, the light is considered unpolarized. For ellipsometry, we are interested in the case where the electric field follows a specific path and traces out a distinct shape at any point. This is known as polarized light. When two orthogonal light waves are in-phase, the resulting light will be linearly polarized (Figure 1a). The relative amplitudes determine the resulting orientation. If the orthogonal waves are 90° out-of-phase and equal in amplitude, the resultant light is circularly polarized (Figure 1b). The most general polarization is 'elliptical', which combines orthogonal waves of arbitrary amplitude and phase (Figure 1c). This is how ellipsometry gets its name.

Materials

Two values are used to describe optical properties, which determine how light interacts with a material.
They are generally represented as a complex number. The complex refractive index (ñ) consists of the index (n) and extinction coefficient (k):

ñ = n + ik    [1]

Alternatively, the optical properties can be represented as the complex dielectric function:

ε̃ = ε₁ + iε₂    [2]

with the following relation between conventions:

ε̃ = ñ²    [3]

The index describes the phase velocity for light as it travels through a material compared to the speed of light in vacuum, c:

v = c/n    [4]

Light slows as it enters a material with higher index. Because frequency remains constant, the wavelength will shorten. The extinction coefficient describes the loss of wave energy to the material. It is related to the absorption coefficient, α, as:

α = 4πk/λ    [5]

Light loses intensity in an absorbing material, according to Beer's Law.

Figure 2 Wave travels from air into absorbing Film 1 and then transparent Film 2. The phase velocity and wavelength change in each material depending on index of refraction (Film 1: n = 4; Film 2: n = 2).

The optical constants for TiO2 are shown in Figure 3 from the ultraviolet (UV) to the infrared (IR). The optical constants are wavelength dependent, with absorption (k > 0) occurring in both UV and IR, due to different mechanisms that take energy from the light wave. IR absorption is commonly caused by molecular vibration, phonon vibration, or free carriers. UV absorption is generally due to electronic transitions, where light provides energy to excite an electron to an elevated state. A closer look at the optical constants in Figure 3 shows that the real and imaginary optical constants are not independent. Their shapes are mathematically coupled together through Kramers–Krönig consistency. Further details are covered later in this article.

Interaction Between Light and Materials

Maxwell's equations must remain satisfied when light interacts with a material, which leads to boundary conditions at the interface. Incident light will reflect and refract at the interface, as shown in Figure 4. The angle between the incident ray and sample normal

Figure 4 Light reflects and refracts according to Snell's law.
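The conversions in eqns [1]–[5] are one-liners; a sketch using illustrative values (the n and k below are round numbers, not measured TiO2 data):

```python
import math

def nk_to_eps(n, k):
    """eps = (n + ik)^2 (eqns [1]-[3]): eps1 = n^2 - k^2, eps2 = 2nk."""
    N = complex(n, k)
    eps = N * N
    return eps.real, eps.imag

def absorption_coefficient(k, wavelength):
    """alpha = 4*pi*k / lambda (eqn [5]); result is per unit length of
    whatever unit the wavelength is given in."""
    return 4 * math.pi * k / wavelength

# Illustrative values: n = 2.4, k = 0.1 at a 300 nm wavelength.
eps1, eps2 = nk_to_eps(2.4, 0.1)
alpha = absorption_coefficient(0.1, 300e-9)        # m^-1
print(eps1, eps2, f"{alpha:.3e}")
```

Squaring the complex index makes the coupling between conventions explicit: both ε₁ and ε₂ mix n and k, which is why the two optical constants cannot be fitted independently.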
(φ_i) will be equal to the reflected angle, (φ_r). Light entering the material is refracted to an angle (φ_t) given by:

n₀ sin(φ_i) = n₁ sin(φ_t)    [7]

The same occurs at every interface, where a portion reflects and the remainder transmits at the refracted angle. This is illustrated in Figure 5. The boundary conditions provide different solutions for electric fields parallel and perpendicular to the sample surface. Therefore, light can be separated into orthogonal components with relation to the plane of incidence. Electric fields parallel and perpendicular to the plane of incidence are considered p- and s-polarized, respectively. These two components are independent for isotropic materials and can be calculated separately. Fresnel described the amount of light reflected and transmitted at an interface between materials:

r_s = (E_0r/E_0i)_s = (n_i cos θ_i − n_t cos θ_t)/(n_i cos θ_i + n_t cos θ_t)    [8a]

r_p = (E_0r/E_0i)_p = (n_t cos θ_i − n_i cos θ_t)/(n_i cos θ_t + n_t cos θ_i)    [8b]

t_s = (E_0t/E_0i)_s = 2n_i cos θ_i/(n_i cos θ_i + n_t cos θ_t)    [8c]

t_p = (E_0t/E_0i)_p = 2n_i cos θ_i/(n_i cos θ_t + n_t cos θ_i)    [8d]

Thin film and multilayer structures involve multiple interfaces, with Fresnel reflection and transmission coefficients applicable at each. It is important to track the relative phase of each light component to correctly determine the overall reflected or transmitted beam. For this purpose, we define the film phase thickness as:

β = 2π (t₁/λ) n₁ cos θ₁    [9]

The superposition of multiple light waves introduces interference that is dependent on the relative phase of each light wave. Figure 5 illustrates the combination of light waves in the reflected beam and their corresponding Fresnel calculations.

Ellipsometry Measurements

For ellipsometry, primary interest is measurement of how p- and s-components change upon reflection or transmission relative to each other. In this manner, the reference beam is part of the experiment. A known polarization is reflected or transmitted from the sample and the output polarization is measured. The change in polarization is the ellipsometry measurement, commonly written as:

ρ = tan(Ψ) e^{iΔ}    [10]

Figure 5 Light reflects and refracts at each interface, which leads to multiple beams in a thin film. Interference between beams depends on relative phase and amplitude of the electric fields. Fresnel reflection and transmission coefficients can be used to calculate the response from each contributing beam.
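Eqns [7], [8a]–[8b], and [10] can be combined to predict Ψ and Δ for a single interface; a sketch for an air/silicon-like substrate, where the complex index is an illustrative round number in the ñ = n + ik convention of eqn [1]:

```python
import cmath, math

def fresnel_rs_rp(n_i, n_t, theta_i):
    """Reflection coefficients of eqns [8a] and [8b] for one interface;
    n_t may be complex for an absorbing medium."""
    cos_i = math.cos(theta_i)
    sin_t = n_i * math.sin(theta_i) / n_t          # Snell's law, eqn [7]
    cos_t = cmath.sqrt(1 - sin_t ** 2)
    rs = (n_i * cos_i - n_t * cos_t) / (n_i * cos_i + n_t * cos_t)
    rp = (n_t * cos_i - n_i * cos_t) / (n_i * cos_t + n_t * cos_i)
    return rs, rp

# Air over a silicon-like substrate at 75 deg incidence (illustrative index).
rs, rp = fresnel_rs_rp(1.0, 3.87 + 0.02j, math.radians(75))

rho = rp / rs                                      # eqn [10]: tan(Psi) exp(i Delta)
psi = math.degrees(math.atan(abs(rho)))
delta = math.degrees(cmath.phase(rho))
print(round(psi, 2), round(delta, 1))
```

Since 75° is close to this substrate's Brewster angle, r_p is nearly extinguished: Ψ comes out small (a degree or so) while the phase difference Δ is near 180°, illustrating why oblique incidence gives ellipsometry its sensitivity.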
An example ellipsometry measurement is shown in Figure 6. The incident light is linear with both p- and s-components. The reflected light has undergone amplitude and phase changes for both p- and s-polarized light, and ellipsometry measures their changes.

Figure 6 Typical ellipsometry configuration, where linearly polarized light is reflected from the sample surface and the polarization change is measured to determine the sample response.

The primary methods of measuring ellipsometry data all consist of the following components: light source, polarization generator, sample, polarization analyzer, and detector. The polarization generator and analyzer are constructed of optical components that manipulate the polarization: polarizers, compensators, and phase modulators. Common ellipsometer configurations include rotating analyzer (RAE), rotating polarizer (RPE), rotating compensator (RCE), and phase modulation (PME).

The RAE configuration is shown in Figure 7. A light source produces unpolarized light, which is sent through a polarizer. The polarizer passes light of a preferred electric field orientation. The polarizer axis is oriented between the p- and s-planes, such that both arrive at the sample surface. The linearly polarized light reflects from the sample surface, becoming elliptically polarized, and travels through a continuously rotating polarizer (referred to as the analyzer). The amount of light allowed to pass will depend on the polarizer orientation relative to the electric field 'ellipse' coming from the sample. The detector converts light from photons to electronic signal to determine the reflected polarization. This information is used with the known input polarization to determine the polarization change caused by the sample reflection. This is the ellipsometry measurement of Ψ and Δ.

Figure 7 Rotating analyzer ellipsometer configuration uses a polarizer to define the incoming polarization and a rotating polarizer after the sample to analyze the outgoing light. The detector converts light to a voltage, with the time-dependence leading to measurement of the reflected polarization.

Single Wavelength Ellipsometry (SWE)

SWE uses a single frequency of light to probe the sample. This results in a pair of data, Ψ and Δ, used to determine up to two material properties. The optical design can be simple, low-cost, and accurate. Lasers are an ideal light source with well-known wavelength and significant intensity. The optical elements can be optimized to the single wavelength. However, there is relatively low information content (two measurement values). SWE is excellent for quick measurement of
nominally known films, like SiO2 on Si. Care must be taken when interpreting unknown films, as multiple solutions occur for different film thicknesses. The data from a transparent film will cycle through the same values as the thickness increases. From Figure 5, it can be seen that this is related to interference between the multiple light beams. The refracted light may travel through different thicknesses to emerge in-phase with the first reflection, but is the returning light delayed by one wavelength or by multiple wavelengths? Mathematically, the thickness cycle can be determined as:

Δd = λ / (2 √(ñ₁² − ñ₀² sin²φ))    [11]

This is demonstrated in Figure 8, where the complete thickness cycle is shown for SiO2 on Si at 75° angle of incidence and 500 nm wavelength. A bare silicon substrate would produce data at the position of the star. As the film thickness increases, the data will move around this circle in the direction of the arrow. From eqn [11], the thickness cycle is calculated as 226 nm. Thus the cycle is completed and returns to the star when the film thickness reaches 226 nm. Any data point along the cycle represents a series of thicknesses which depend on how many times the cycle has been completely encircled. For example, the square represents a data point for 50 nm thickness. However, the same data point will represent 276 and 502 nm thicknesses. A second data point at a new wavelength or angle would help determine the correct thickness.

Spectroscopic Ellipsometry (SE)

Spectroscopic measurements solve the 'period' problem discussed for SWE. Data at surrounding wavelengths ensure the correct thickness is determined, even as there remain multiple solutions at one wavelength in the spectrum. This is demonstrated in Figure 9, showing the spectroscopic data for the first three thickness results. The SE data oscillations clearly distinguish each thickness. Thus, a single thickness solution remains to match data at multiple wavelengths.

SE provides additional information content for each new wavelength. While the film thickness will remain constant regardless of wavelength, the optical constants will change across the spectrum. The optical constant shape contains information regarding the material microstructure, composition, and more. Different wavelength regions will provide the best information for each different material property. For this reason, SE systems have been developed to cover very wide spectral regions from the UV to the IR.

Spectroscopic measurements require an additional component, a method of wavelength selection. The most common methods include the use of monochromators, Fourier-transform spectrometers, and gratings or prisms with detection on a diode array. Monochromators are typically slow as they sequentially scan through wavelengths. Spectrometers and diode arrays allow simultaneous measurement at multiple wavelengths. This has become popular with the desire for high-speed SE measurements.

Variable Angle Ellipsometry
Ellipsometry measurements are typically performed
at oblique angles, where the largest changes in
polarization occur. The Brewster angle, defined as:
n1
tanðfB Þ ¼ ½12
n0
gives the angle where reflection of p-polarized light used to monitor numerous material types, including
goes through a minimum. Ellipsometry measure- metals and dielectrics. A promising in situ application
ments are usually near this angle, which can vary is for multilayer optical coatings, where thickness and
from 558 for low-index dielectrics to 758 or 808 for index can be determined in real-time to allow
semiconductors and metals. correction of future layers, to optimize optical
It is also common to measure at multiple angles of incidence in ellipsometry. This allows additional data to be collected for the same material structure under different conditions. The most important change introduced by varying the angle is the change in path length through the film as the light refracts at a different angle. Multiple angles do not always add new information about a material structure, but the extra data build confidence in the final answer.

In Situ

Within the last decade, it has become common to employ optical diagnostics during processing. In situ ellipsometry allows the optical response to be monitored in real time. This has also led to feedback control for many processes. While in situ measurement is generally restricted to a single angle of incidence, there is a distinct advantage to collecting data at different 'times' during processing to get unique 'glimpses' of the sample structure as it changes.

In situ SE is commonly applied to semiconductors. Conventional SE systems measure from the UV to the NIR, which matches the spectral region where semiconductors absorb due to electronic transitions. The shape and position of the absorption can be very sensitive to temperature, composition, crystallinity, and surface quality, which allows SE to closely monitor these properties. In situ SE has also been used to improve process performance.

Data Analysis

Ellipsometry measures changes in polarization. However, it is used to determine the material properties of interest, such as film thickness and optical constants. In the case of a bulk material, the equations derived for a single reflection can be directly inverted to provide the 'pseudo' optical constants from the ellipsometry measurement, ρ:

⟨ε⟩ = sin²(φ)[1 + tan²(φ)((1 − ρ)/(1 + ρ))²]   [13]

This equation assumes there are no surface layers of any type. However, there is typically a surface oxide or roughness on any bulk material, and the direct inversion would incorporate these into the bulk optical constants. The more common procedure used to deduce material properties from ellipsometry measurements follows the flowchart in Figure 10. Regression analysis is required because an exact equation cannot be written. Often the answer is over-determined, with hundreds of experimental data points for a few unknowns. Regression analysis allows all of the measured data to be included when determining the solution.
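The inversion in eqn [13] is easy to sketch numerically. In this minimal example the measured values (Ψ, Δ, and the angle of incidence) are assumptions for illustration and are not taken from the article:

```python
import cmath
import math

def pseudo_dielectric(rho: complex, phi_deg: float) -> complex:
    """Invert a single-reflection measurement (eqn [13]): pseudo-dielectric
    function <eps> from the ellipsometric ratio rho = tan(psi)*exp(i*delta)
    at angle of incidence phi."""
    phi = math.radians(phi_deg)
    return math.sin(phi) ** 2 * (
        1 + math.tan(phi) ** 2 * ((1 - rho) / (1 + rho)) ** 2
    )

# Hypothetical measurement: psi = 10 deg, delta = 170 deg at 75 deg incidence.
rho = math.tan(math.radians(10.0)) * cmath.exp(1j * math.radians(170.0))
eps = pseudo_dielectric(rho, 75.0)
nk = cmath.sqrt(eps)  # pseudo complex refractive index n + ik
```

For a truly film-free surface this returns the bulk dielectric function exactly; any oxide or roughness present is folded into the 'pseudo' values, which is why the result is usually only a starting point for the model-based regression described above.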
Data analysis proceeds as follows. After the sample is measured, a model is constructed to describe the sample. The model is used to calculate the predicted response from Fresnel's equations. Thus, each material must be described with a thickness and optical constants. If these values are not known, an estimate is given for the purpose of the preliminary calculation. The calculated values are compared to the experimental data. Any unknown material properties can then be varied to improve the match between experiment and calculation. The number of unknown properties should not exceed the information content contained in the experimental data. For example, a single-wavelength ellipsometer produces two data points (Ψ, Δ), which allows a maximum of two material properties to be determined. Finding the best match between the model and experiment is typically done through regression. An estimator, like the mean squared error (MSE), is used to quantify the difference between curves. The unknown parameters are allowed to vary until the minimum MSE is reached.

Care must be taken to ensure the best answer is found, corresponding to the lowest MSE. For example, Figure 11a shows the MSE curve versus film thickness for a transparent film on silicon. There are multiple 'local' minima, but the lowest MSE value occurs at a thickness of 749 nm. This corresponds to the correct film thickness. It is possible that the regression algorithm will mistakenly fall into a 'local' minimum, depending on the starting thickness and the structure of the MSE curve. Comparing the results by eye for the lowest MSE and a local minimum easily distinguishes the true minimum (see Figures 11b and c).

Ellipsometry Characterization

The two most common material properties studied by ellipsometry are film thickness and optical constants. In addition, ellipsometry can characterize material properties that affect the optical constants. Crystallinity, composition, resistivity, temperature, and molecular orientation can all affect the optical constants and in turn be measured by ellipsometry. This section details many of the primary applications important to ellipsometry.

Film Thickness

The film thickness is determined by interference between light reflecting from the surface and light traveling through the film. Depending on the relative phase of the rejoining light to the surface reflection, there can be constructive or destructive interference.
Figure 11 (a) MSE curve versus thickness shows the ‘global’ minimum where the best match between model and experiment occurs,
and ‘local’ minima that may be found by the regression algorithm, but do not give the final result. (b) The experimental data and
corresponding curves generated for the model at the 'global' minimum. (c) Similar curves at the 'local' minimum near 0.45 microns thickness are easily distinguished as an incorrect result.
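The thickness regression can be illustrated with a self-contained sketch: synthetic reflectance data for a transparent film are scanned against a thickness grid, and the MSE curve has its global minimum at the true thickness. All values here (indices 1.46 and 3.85, normal incidence, true thickness 749 nm) are assumptions for illustration, not the article's data:

```python
import numpy as np

def film_reflectance(d_nm, wl_nm, n_film=1.46, n_sub=3.85, n_amb=1.0):
    """Normal-incidence intensity reflectance of a single transparent film
    on a substrate (Airy summation of the Fresnel amplitudes)."""
    r01 = (n_amb - n_film) / (n_amb + n_film)
    r12 = (n_film - n_sub) / (n_film + n_sub)
    beta = 2 * np.pi * n_film * d_nm / wl_nm          # one-pass phase thickness
    r = (r01 + r12 * np.exp(-2j * beta)) / (1 + r01 * r12 * np.exp(-2j * beta))
    return np.abs(r) ** 2

wl = np.linspace(400.0, 1000.0, 301)      # probe wavelengths, nm
data = film_reflectance(749.0, wl)        # synthetic 'measurement', true d = 749 nm

grid = np.arange(0.0, 1500.5, 1.0)        # candidate thicknesses, nm
mse = [np.mean((film_reflectance(d, wl) - data) ** 2) for d in grid]
best = grid[int(np.argmin(mse))]          # global minimum of the MSE curve
```

A dense grid (or many regression restarts) is what guards against the local minima discussed above; a gradient-based fit started near a wrong interference order would converge to one of them.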
304 INSTRUMENTATION / Ellipsometry
Figure 12 (a) Reflected intensity and (b) ellipsometric delta for three thin oxides on silicon show the high sensitivity of D to nanometer-
scale films not available from the intensity measurement.
The interference involves both amplitude and phase information. The phase information from Δ is very sensitive to films down to submonolayer thickness. Figure 12 compares reflected intensity and ellipsometry for the same series of thin SiO2 layers on Si. There are large variations in Δ, while the reflectance for each film is nearly the same.

Ellipsometry is typically used for films with thickness ranging from sub-nanometer to a few microns. As films become greater than several tens of microns thick, it becomes increasingly difficult to resolve the interference oscillations, except with longer infrared wavelengths, and other characterization techniques become preferred.

Thickness measurements also require a portion of the light to travel through the entire film and return to the surface. If the material is absorbing, thickness measurements by optical instruments will be limited to thin, semi-opaque layers. This limitation can be circumvented by measuring in a spectral region where there is lower absorption. For example, an organic film may strongly absorb UV and IR light, but remain transparent at mid-visible wavelengths. For metals, which strongly absorb at all wavelengths, the maximum layer for thickness determination is typically ~100 nm.

Optical Constants

Thickness measurements are not independent of the optical constants. The film thickness affects the path length of light traveling through the film, but the index determines the phase velocity and refracted angle. Thus, both contribute to the delay between the surface reflection and light traveling through the film. Both n and k must be known or determined along with the thickness to get the correct results from an optical measurement.

The optical constants for a material will vary for different wavelengths and must be described at all wavelengths probed with the ellipsometer. A table of optical constants can be used to predict the response at each wavelength. However, it is less convenient to adjust unknown optical constants on a wavelength-by-wavelength basis. It is more advantageous to use all wavelengths simultaneously. A dispersion relationship often solves this problem by describing the optical constant shape versus wavelength. The adjustable parameters of the dispersion relationship allow the overall optical constant shape to match the experimental results. This greatly reduces the number of unknown 'free' parameters compared to fitting individual n, k values at every wavelength.

For transparent materials, the index is often described using the Cauchy or Sellmeier relationship. The Cauchy relationship is typically given as:

n(λ) = A + B/λ² + C/λ⁴   [14]

where the three terms are adjusted to match the refractive index for the material. The Sellmeier relationship enforces Kramers–Kronig (KK) consistency, which ensures that the optical dispersion retains a realistic shape. The Cauchy is not constrained by KK consistency and can produce unphysical dispersion. The Sellmeier relationship can be written as:

ε₁ = Aλ²λ₀² / (λ² − λ₀²)   [15]

Absorbing materials will often have a transparent wavelength region that can be modeled with the Cauchy or Sellmeier relationships. However, the absorbing region must account for both real and imaginary optical constants. Many dispersion relationships use oscillator theory to describe the absorption for materials, including the Lorentz, Harmonic, and Gaussian oscillators. They all share similar attributes, where the absorption features are described with an amplitude, broadening, and center energy (related to the frequency of light).
Figure 15 Experimental ellipsometry data and corresponding fit when the model is described as (a) a homogeneous single-layer and
(b) a graded film with index variation through the film. (c) The best model to match subtleties in experimental data allows the index to
increase toward the surface of the film, with a thin roughness layer on the surface.
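Eqns [14] and [15] translate directly into code. The coefficients used here are placeholders of plausible magnitude for a glassy film, not values from the article:

```python
def cauchy_n(wl_um: float, A: float = 1.45, B: float = 0.01, C: float = 0.0) -> float:
    """Cauchy index of eqn [14]; wavelength in micrometers, B in um^2, C in um^4."""
    return A + B / wl_um ** 2 + C / wl_um ** 4

def sellmeier_eps1(wl_um: float, A: float = 1.0, wl0_um: float = 0.1) -> float:
    """Real dielectric function in the single-term Sellmeier form of eqn [15];
    meaningful only away from the resonance wavelength wl0_um."""
    return A * wl_um ** 2 * wl0_um ** 2 / (wl_um ** 2 - wl0_um ** 2)
```

In a fit, the handful of coefficients (A, B, C, or A and λ₀) replace hundreds of independent per-wavelength n values, which is the parameter reduction described above; the Sellmeier pole at λ₀ ties the visible dispersion to a UV resonance, which is how KK consistency is maintained.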
Polarization: Introduction; Matrix Analysis. Semiconductor Physics: Band Structure and Optical Properties. Spectroscopy: Fourier Transform Spectroscopy.

Further Reading

Aspnes DE (1985) The accurate determination of optical properties by ellipsometry. In: Palik ED (ed.) Handbook of Optical Constants of Solids, pp. 89–112. Orlando, FL: Academic Press.
Azzam RMA and Bashara NM (1987) Ellipsometry and Polarized Light. Amsterdam, The Netherlands: Elsevier Science B.V.
Boccara AC, Pickering C and Rivory J (eds) (1993) Spectroscopic Ellipsometry. Amsterdam: Elsevier Publishing.
Collins RW, Aspnes DE and Irene EA (eds) (1998) Proceedings from the Second International Conference on Spectroscopic Ellipsometry. Thin Solid Films, vols. 313–314.
Gottesfeld S, Kim YT and Redondo A (1995) Recent applications of ellipsometry and spectroellipsometry in electrochemical systems. In: Rubinstein I (ed.) Physical Electrochemistry: Principles, Methods, and Applications. New York: Marcel Dekker.
Herman IP (1996) Optical Diagnostics for Thin Film Processing, pp. 425–479. San Diego, CA: Academic Press.
Johs B, Woollam JA, Herzinger CM, et al. (1999) Overview of variable angle spectroscopic ellipsometry (VASE), Part II: Advanced applications. Optical Metrology, vol. CR72, pp. 29–58. Bellingham, WA: SPIE.
Johs B, Hale J, Ianno NJ, et al. (2001) Recent developments in spectroscopic ellipsometry for in situ applications. In: Duparré A and Singh B (eds) Optical Metrology Roadmap for the Semiconductor, Optical, and Data Storage Industries II, vol. 4449, pp. 41–57. Bellingham, WA: SPIE.
Roseler A (1990) Infrared Spectroscopic Ellipsometry. Berlin: Akademie-Verlag.
Rossow U and Richter W (1996) Spectroscopic ellipsometry. In: Bauer G and Richter W (eds) Optical Characterization of Epitaxial Semiconductor Layers, pp. 68–128. Berlin: Springer-Verlag.
Tompkins HG (1993) A User's Guide to Ellipsometry. San Diego, CA: Academic Press.
Tompkins HG and Irene EA (eds) (in press) Handbook of Ellipsometry. New York: William Andrew Publishing.
Tompkins HG and McGahan WA (1999) Spectroscopic Ellipsometry and Reflectometry. New York: John Wiley & Sons, Inc.
Woollam JA (2000) Ellipsometry, variable angle spectroscopic. In: Webster JG (ed.) Wiley Encyclopedia of Electrical and Electronics Engineering, pp. 109–116. New York: John Wiley & Sons.
Woollam JA and Snyder PG (1992) Variable angle spectroscopic ellipsometry. In: Brundle CR, Evans CA and Wilson S (eds) Encyclopedia of Materials Characterization: Surfaces, Interfaces, Thin Films, pp. 401–411. Boston, MA: Butterworth-Heinemann.
Woollam JA, Johs B, Herzinger CM, et al. (1999) Overview of variable angle spectroscopic ellipsometry (VASE), Part I: Basic theory and typical applications. Optical Metrology, vol. CR72, pp. 3–28. Bellingham, WA: SPIE.
Photometry
J Schanda, University of Veszprém, Veszprém, Hungary

© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

Fundamentals of Photometry

Photometry is defined by the International Commission on Illumination (internationally known as CIE from the abbreviation of its French name: Commission Internationale de l'Eclairage) as 'measurement of quantities referring to radiation as evaluated according to a given spectral efficiency function, e.g., V(λ) or V′(λ).' A note to the above definition states that in many languages it is used in a broader sense, covering the science of optical radiation measurement. We will restrict our treatise to the above fundamental meaning of photometry.

Under spectral efficiency function we understand the spectral luminous efficiency function of the human visual system. The internationally agreed symbols are V(λ) for photopic vision (daylight conditions) and V′(λ) for scotopic vision (nighttime conditions). The CIE standardized the V(λ) function in 1924 and the V′(λ) function in 1951; their spectral distributions are shown in Figure 1. The V(λ) function was determined using mainly the so-called flicker photometric technique, where the light of a reference stimulus and of a test stimulus of varying wavelength are alternately shown to the human observer at a frequency at which the observer is already unable to perceive the difference in color (due to the different wavelength of the two radiations), but still perceives a
308 INSTRUMENTATION / Photometry
where

L_x = K_m ∫_{380 nm}^{780 nm} S_{λ,x} V(λ) dλ   [1]
luminance. Looking, however, at the definition of the base unit, it is obvious that from the physical point of view the definition of a quantity corresponding to radiant power, measured in watts, can help to bridge the gap between photometry and radiometry: luminous flux, measured in lumens (lm), with the symbol Φ, is defined by eqn [1], with S_λ(λ) inserted in W m⁻¹. Based on this more practical quantity of luminous flux and its unit, lm, the different quantities used in photometry can be built up as follows:

Luminous flux:   Φ = K_m ∫_0^∞ (dΦ_e(λ)/dλ) V(λ) dλ   [2]

where Φ_e(λ) is the radiant flux measured in W, and Φ_{e,λ} = dΦ_e(λ)/dλ.

Luminous intensity:   I = dΦ/dΩ   [3]

where dΦ is the luminous flux traveling in an elementary solid angle dΩ, assuming a point source (see Figure 2). The unit of luminous intensity is the candela (cd = lm sr⁻¹).

Luminance:   L = ∂²Φ / (∂A cos Θ ∂Ω)   [4]

where ∂A is the area emitting the radiation, and Θ is the angle between the normal of ∂A and the direction of luminance measurement (see Figure 3). The unit of luminance is cd m⁻² (in some older literature called nit).

Illuminance:   E = dΦ/dA   [5]

where dΦ is the luminous flux incident on the dA element of a surface (see Figure 4). The unit of illuminance is the lux (lx = lm m⁻²).

Figure 2 Concept of point source, solid angle, and luminous intensity.

Figure 3 Geometry for the definition of luminance.

Figure 4 Illuminance is the total luminous flux per unit area incident at a point coming from a hemispherical solid angle.

A remark on the use of SI units and related quantities

No prefixes can be added to the SI units, thus irrespective of whether one measures luminance based
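Eqns [2], [3], and [5] can be exercised numerically. The Gaussian stand-in for V(λ) below is a rough assumption for illustration only (the real function is the tabulated CIE curve); K_m = 683 lm W⁻¹ is the standard maximum luminous efficacy:

```python
import math

KM = 683.0  # lm/W: maximum spectral luminous efficacy, at 555 nm

def v_lambda(wl_nm: float) -> float:
    """Crude Gaussian stand-in for CIE V(lambda): peak at 555 nm,
    roughly 80 nm FWHM (assumed shape, not the tabulated function)."""
    sigma = 80.0 / 2.355
    return math.exp(-0.5 * ((wl_nm - 555.0) / sigma) ** 2)

def luminous_flux(spectral_flux_w_per_nm, step: float = 1.0) -> float:
    """Eqn [2] as a Riemann sum over the visible range, 380-780 nm."""
    total = 0.0
    wl = 380.0
    while wl <= 780.0:
        total += spectral_flux_w_per_nm(wl) * v_lambda(wl) * step
        wl += step
    return KM * total

def narrow_band(wl_nm: float) -> float:
    """1 W of radiant flux confined to a 1 nm band at 555 nm."""
    return 1.0 if abs(wl_nm - 555.0) < 0.5 else 0.0

flux = luminous_flux(narrow_band)   # close to Km = 683 lm

# Eqns [3] and [5] for a point source: E = I / r^2
E = 100.0 / 2.0 ** 2                # 100 cd viewed from 2 m -> 25 lx
```

The 1 W source concentrated at the V(λ) peak yields essentially the full 683 lm, while the same watt placed at the edges of the visible range would yield almost nothing; this is the whole point of weighting radiometric quantities by V(λ).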
which may be different from that of the field considered. To build an instrument that measures this quantity is a real challenge for the future.

Advanced Use of Photometry

Based on the above newly defined quantities, several attempts are under way to extend the usefulness of photometry in designing the human visual environment. The eye is an optical system and, as in every such system, the depth of focus and the different aberrations of the system will decrease with decreasing pupil size. Pupil size will decrease with increasing illumination and, in the case of constant luminance, with higher content of short-wavelength radiation. Thus, there exists a school of researchers who advocate that increased blue content in the light has beneficial effects on vision, and that one should extend the classical photopic-based photometry with a scotopic-based one to properly describe the visual effect of lighting.

Other investigations are concerned with visibility at low light levels, the range used in street lighting (from about a few thousandths of a candela per square meter to about a few candelas per square meter; according to one definition, 10⁻³ cd m⁻² to 3 cd m⁻²). In this mesopic range both rods and cones contribute to vision, and this changes with lighting level and direction of view (for foveal vision, i.e., looking straight ahead, photopic photometry seems to hold even at low light levels). For peripheral visual angles, brightness perception and the perception of an object (a signal, sign, or obstacle in a nighttime driving situation) seem to have different spectral responsivities. In driving situations the necessary reaction time of the driver is an important parameter; thus experiments are going on to define a photometric system based on reaction-time investigations.

In indoor situations, apart from the necessary level of illumination, the observed glare is a contributor to whether an environment will be accepted as pleasing or annoying. Illuminating engineering distinguishes between two types of glare: disability glare reduces visibility; discomfort glare is just an annoying experience without influencing short-term task performance. An interesting question is the spectral sensitivity to discomfort glare, as it can influence not only indoor but also outdoor activity. Preliminary experiments seem to show that luminance sensitivity and discomfort glare sensitivity have different spectral distributions; glare sensitivity seems to peak at shorter wavelengths.

The above might be related to a further question, lying already at the boundaries of photometry, but one that has to be considered in photometric design and measurement: the human daily and yearly rhythm (circadian and seasonal rhythm) of human activity coupled to hormone levels. These are influenced by light; for example, production of the hormone melatonin is influenced by light exposure. Physiological investigations showed that melatonin production suppression has a maximum around 460 nm and might be coupled to a radiation-sensitive ganglion cell in the retina. Whether discomfort sensation is mediated via the same neural pathway or via a visual one has not yet been decided. But photometry has to take these also into consideration, and in the future, measurement methods and instruments to determine them will have to be developed.

Advances in Photometric Measurements

Primary Standards

The main concern in photometry is that the uncertainty of photometric measurements is still much higher than that in other branches of physics. This is partly due to the higher uncertainty in radiometry and partly to the increase in uncertainty within the chain of uncertainty propagation from the National Laboratory to the workshop-floor measurement.

In National Standards Laboratories, very sophisticated systems are used to determine the power of the incoming radiation, and then elaborate spectroradiometric techniques are used to evaluate the radiation in the form of light, i.e., perform photometric measurements (e.g., the NIST CIRCUS equipment, where multiple laser sources are used as power sources, with highly sophisticated methods to produce a homogeneous nonpolarized radiation field for the calibration of secondary photometric detectors).

There is, however, also a method to supply end users with absolute detectors for the visible part of the spectrum. Modern high-end Si photovoltaic cells have internal quantum efficiencies in the visible part of the spectrum of over 99.9%. Reflection losses at the silicon surface are minimized by using three or more detectors arranged in a trap configuration, where the light reflected from one detector is fed to the second one, from there to the third one, and eventually to some further ones. In a three-detector configuration, as shown in Figure 6, the light from the third detector is reflected back to the second and from there to the first one. As every detector reflects only a small amount of radiation, by the fifth reflection practically
all the radiation is absorbed and contributes to the electric signal. Such trap detectors have an almost 100% quantum efficiency in the visible part of the spectrum, and can be used as photometric detectors if a well-designed color-correcting filter is applied in front of the detector.

Secondary Type Measurements

In practical photometry the three most important quantities to be measured are the total luminous flux of different lamps, the illuminance in a plane, and the luminance.

Light source measurement

Total luminous flux. The two methods to measure the total luminous flux are to use a goniophotometer or a photometer (Ulbricht) sphere. In goniophotometry, recent years have not brought major breakthroughs; the automation of the systems got better, but the principles are unchanged.

The integrating sphere photometer (a sphere with an inner white diffuse coating, where the lamp is in the middle of the sphere) used to be a simple piece of equipment to compare total luminous flux lamps against flux standards. In recent years a new technique has been developed at NIST (USA). This enables the absolute measurement of luminous flux from an illuminance measurement, the fundamentals of this new arrangement being shown in Figure 7: the test lamp is, as usual, in the middle of the sphere, but now light from an external source is introduced into the sphere. An illuminance meter measures the flux entering from this source. The sphere detector compares the two signals (y_i and y_e). Knowing the absolute characteristics of the sphere (a difficult measurement), one can determine the total luminous flux of the test lamp using the absolute illuminance value. As illuminance is easily determined from luminous intensity (from eqns [3] and [5] one gets, with dΩ = dA/r², that E = I/r², where r is the distance between the source and the illuminated surface, supposed to be perpendicular to the direction to the source), this technique permits us to derive the total luminous flux scale from illuminance or luminous intensity measurement using an integrating sphere.

Luminous intensity of LEDs. Another major breakthrough achieved during the past years was the unified measurement of LED intensity. Light-emitting diodes became, in recent years, important light sources for large-scale signaling and signing, and it is foreseen that they will become important contributors in every field of light production (from car headlamps to general illumination). The most fundamental parameter of the light of an LED is its luminous intensity. The spatial power distribution of LEDs is usually collimated, but LEDs often squint, as seen in Figure 8. In the past, some manufacturers measured the luminous intensity in
the direction of maximum emission, others used the direction of the optical axis for this quantity. The highly collimated character of the radiation made measurements in the far field rather difficult. Therefore, the CIE recommended a new term and measuring geometry: average LED intensity can be measured under two measuring conditions, as shown in Figure 9. The detector has to be set in the direction of the LED mechanical axis (discussions are still going on as to what the reference direction should be with modern surface-mounted LEDs, as with those the mechanical axis is ill-defined; the normal to the base-plane could be a better reference direction). The detector has to have an exactly 1.00 cm² circular aperture, and the distance between this aperture and the tip of the LED is, for Condition A, d = 0.316 m, and for Condition B, d = 0.100 m (these two distances with the 1.00 cm² detector area provide 0.001 sr and 0.01 sr opening angles). Recent international round-robins have shown that, based on the new recommendations, agreement between different laboratories decreased from the 10 to 20% level to 1 to 2%. The remaining difference is mainly due to the fact that LEDs emit in narrow wavelength bands, and the transfer of the calibration value for the 100 mm² detector from a white (CIE Standard Illuminant A color temperature) incandescent lamp to the narrow-band LED emission is still uncertain, mainly due to stray light effects in the spectral responsivity and emission measurement.

Luminance distribution measurement

The human observer sees luminance (and color) differences. Thus for every illuminating engineering design task the luminance distribution in the environment is of utmost importance. Traditionally this was measured using a spot-luminance meter, aiming the device into a few critical directions. The recent development of charge-coupled device (CCD) two-dimensionally sensitive arrays (and of other systems, e.g., MOSFET, CMOS, charge injection device (CID), and charge imaging matrix (CIM); as for the time being the CCD technology provides the best performance, we will refer to two-dimensional electronic image-capture devices as CCD cameras) opened the possibility of using an image-capturing camera for luminance distribution measurements. Such measurements are badly needed in display calibration, near-field photometry (photometry in planes nearer than those in which the inverse square law holds), testing of car headlamp light distribution, glare, and homogeneity studies indoors and outdoors, etc.

Solid-state cameras have the big advantage over older-type vacuum-tube image-capturing devices that the geometric position and alignment of the single pixels are well-defined and stay constant. Nowadays, CCD image detector chips are mass produced, and one can get a variety of such devices and cameras, starting with some small-resolution

Figure 8 Spatial light distribution of an LED: (a) the distribution of a 'squinting' LED in a plane including the optical axis; (b) the light distribution in a plane perpendicular to the optical axis. By permission of the Commission Internationale de l'Eclairage, from the publication "Measurement of LEDs", CIE 127-1997; CIE Publications are obtainable from the CIE Central Bureau: Kegelgasse 27, A-1033 Wien, Austria.

Figure 9 Schematic diagram of the CIE Standard Conditions for the measurement of Average LED Intensity. Distance d = 0.316 m for Condition A and d = 0.100 m for Condition B. By permission of the Commission Internationale de l'Eclairage, from the publication "Measurement of LEDs", CIE 127-1997; CIE Publications are obtainable from the CIE Central Bureau: Kegelgasse 27, A-1033 Wien, Austria.
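The numbers quoted for the two CIE conditions in Figure 9 follow from the small-angle relation Ω ≈ A/d², and an illuminance reading at the aperture converts to average LED intensity through the inverse square law. The helper below is an illustrative sketch, not a CIE-prescribed procedure:

```python
# Solid angles subtended by the 1.00 cm^2 aperture in the two CIE conditions,
# and the average LED intensity recovered from an illuminance reading.
A = 1.00e-4              # detector aperture area, m^2

omega_a = A / 0.316 ** 2  # Condition A, d = 0.316 m -> approximately 0.001 sr
omega_b = A / 0.100 ** 2  # Condition B, d = 0.100 m -> 0.01 sr

def average_led_intensity(E_lux: float, d_m: float) -> float:
    """Average LED intensity (cd) from the illuminance at the aperture, I = E * d^2."""
    return E_lux * d_m ** 2
```

Because a squinting LED is averaged over a fixed cone rather than sampled along one ray, two laboratories using the same condition see the same quantity, which is what brought the inter-laboratory spread down to the percent level.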
(few thousand pixels) devices up to cameras with tens of megapixel resolution. Detectors are now available with internal intensification, enabling measurements down to a few photons per second intensity levels.

Main problems with these detectors are:

• spectral and absolute nonuniformities of the single pixels (see Figure 10, where a spatial homogeneity map of a CCD two-dimensional array detector is shown); the irregular 3% sensitivity change on the surface of the detector is negligible for imaging purposes, but has to be corrected in cases of photometric measurements. Even if the receptor chip had an absolutely homogeneous sensitivity, there would be a drop in response from the middle of the imaging area to the borders: light reaching the edges of the detector arrives at an oblique angle, and this produces a decrease of sensitivity with cos⁴ α, where α is the angle of incidence, measured between the line from the middle of the lens to the given pixel of the detector and the surface normal of the detector.

• aliasing effects, if the pixel resolution is not large enough to show straight lines as such when they are not in line with a pixel row or column.

• nonlinearity and cross-talk among the adjacent pixels. Figure 11 shows the so-called 'inverse gamma' characteristic of a CCD camera. The digital electronic output of the camera shows a Y = E^γ type function, where Y is the output DAC (digital-analog converter) value, E is the irradiance of the pixel, and γ is the exponent (this description comes from the film industry, where the film density depends exponentially on the irradiation); display devices usually have a nonlinear input (DAC value)–output (luminance) characteristic, and the camera inverse gamma value corrects for this output device characteristic. This is, however, not required if the camera is used for photometric measurements; this built-in nonlinearity has to be corrected in the evaluating software if the camera is intended for photometric measurements.

• to be able to perform photometric measurements, the camera has to have a spectral responsivity corresponding to the CIE V(λ) function. Many cameras have built-in filters to do this – eventually also red and blue filters to be able to capture color – but the color correction of cameras, where the correction is made by small adjacent filter chips, is usually very poor. Figure 12 shows the spectral sensitivity curve of a digital photographic camera; the output signal is produced by an internal matrix transformation of the signals produced by adjacent pixels equipped with different color filters. Figure 13 shows the spectral sensitivity of a CCD camera specially designed for photometric measurements. Naturally, meaningful photometric measurements can be made only with such a camera. Very often, however, an approximate luminance distribution is enough, and then a picture captured by a digital photographic camera, plus one luminance measurement of a representative object for absolute calibration, suffices.

Figure 10 Spatial homogeneity of a two-dimensional CCD array.

Figure 11 'Inverse gamma' characteristic of a commercial digital photographic camera; measurement points are shown at different speed settings; the curve is a model function representative of the camera response.

Figure 12 Spectral sensitivity of a commercial digital photographic camera.
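The two software corrections discussed above, undoing the Y = E^γ transfer curve and flat-fielding the per-pixel sensitivity, can be sketched as follows. The γ value and the sensitivity map are assumptions for illustration; a real camera needs a measured transfer curve and a homogeneity map taken on a uniform luminance field:

```python
import numpy as np

GAMMA = 0.45  # assumed camera exponent; real cameras require a measured curve

def linearize(dac, gamma: float = GAMMA):
    """Undo the 'inverse gamma' transfer Y = E**gamma (DAC values scaled to 0..1),
    returning values proportional to the pixel irradiance."""
    return np.asarray(dac, dtype=float) ** (1.0 / gamma)

def flat_field(image, sensitivity_map):
    """Divide out per-pixel sensitivity nonuniformity (e.g., the ~3% variation
    and the cos^4 falloff), using a map recorded on a uniform field."""
    return np.asarray(image, dtype=float) / sensitivity_map

# Round trip: irradiance E -> camera DAC values -> linearized estimate.
E_px = np.array([0.25, 0.5, 1.0])
dac = E_px ** GAMMA
recovered = linearize(dac)
```

One absolute spot-luminance reading of a reference object then scales the linearized, flat-fielded image into cd m⁻², which is the approximate calibration route mentioned at the end of the list above.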
1 cd = 1 lm sr⁻¹

different from that of the field considered.
unit: cd m⁻²
Notes:
1. Radiation at a frequency of 540 × 10¹² Hz has a wavelength in standard air of 555.016 nm.
2. A comparison field may also be used in which the radiation has any relative spectral distribution, if the equivalent luminance of this field is known under the same conditions of measurement.

Far field photometry: Photometry where the inverse square law is valid.

Flicker photometer*: Visual photometer in which the observer sees either an undivided field illuminated successively, or two adjacent fields illuminated alternately, by two sources to be compared, the frequency of alternation being conveniently chosen so that it is above the fusion frequency for colours but below the fusion frequency for brightnesses.

Fovea*: Central part of the retina, thin and depressed, which contains almost exclusively cones and forms the site of most distinct vision.
Note: The fovea subtends an angle of about 0.026 rad (1.5 degrees) in the visual field.

Goniophotometer*: Photometer for measuring the directional light distribution characteristics of sources, luminaires, media or surfaces.

Illuminance*: Quotient of the luminous flux dΦ_v incident on an element of the surface containing the point, by the area dA of that element.
Equivalent definition. Integral, taken over the hemisphere visible from the given point, of the expression L_v cos θ dΩ, where L_v is the luminance at the given point in the various directions of the incident elementary beams of solid angle dΩ, and θ is the angle between any of these beams and the normal to the surface at the given point.

E_v = dΦ_v/dA = ∫_{2π sr} L_v cos θ dΩ

unit: lx = lm m⁻²

Inverse square law: The illumination at a point on a surface varies directly with the luminous intensity of the source, and inversely as the square of the distance between the source and the point, if the source is seen as a point source.

Lumen*: SI unit of luminous flux: luminous flux emitted in unit solid angle (steradian) by a uniform point source having a luminous intensity of 1 candela (9th General Conference on Weights and Measures, 1948).
Equivalent definition. Luminous flux of a beam of monochromatic radiation whose frequency is 540 × 10¹² hertz and whose radiant flux is 1/683 watt.

Luminance*: Quantity defined by the formula

L = ∂²Φ / (∂A cos θ ∂Ω)

where ∂Φ is the luminous flux transmitted by an elementary beam passing through the given point and propagating in the solid angle dΩ containing the given direction; dA is the area of a section of that beam containing the given point; θ is the angle between the normal to that section and the direction of the beam.
unit: cd m⁻²

Luminous intensity*: Quotient of the luminous flux dΦ_v leaving the source and propagated in the element of solid angle dΩ containing the given direction, by the element of solid angle:

I_v = dΦ_v / dΩ

unit: cd = lm sr⁻¹

Lux*: SI unit of illuminance: illuminance produced on a surface of area 1 square meter by a luminous flux of 1 lumen uniformly distributed over that surface.

1 lx = 1 lm m⁻²

Note: Non-metric unit: lumen per square foot (lm ft⁻²) or footcandle (fc) (USA) = 10.764 lx.

Near-field photometry: Photometry made in the vicinity of an extended source, so that the inverse square law is not valid.

Photometer (or Ulbricht) sphere*: A hollow sphere, whitened inside. Owing to the internal reflections in the sphere, the illumination on any part of the sphere's inside surface is proportional to the luminous flux entering the sphere, or produced inside the sphere by a lamp. The illuminance of the internal sphere wall is measured via a small window.

Pixel: The individual picture elements of an image, or elements in a display, that can be addressed individually.

Radiance*: Quantity defined by the formula

L = ∂²Φ / (∂A cos θ ∂Ω)

where ∂Φ is the radiant flux transmitted by an elementary beam passing through the given point and propagating in the solid angle dΩ containing the given direction; dA is the area of a section of that beam containing the given point; θ is the angle between the normal to that section and the direction of the beam.
unit: W m⁻² sr⁻¹
INSTRUMENTATION / Scatterometry 317
Retina*: Membrane situated inside the back of the eye that is sensitive to light stimuli; it contains photoreceptors, the cones and the rods, and nerve cells that interconnect and transmit to the optic nerve the signals resulting from stimulation of the photoreceptors.

Spectral*: An adjective that, when applied to a quantity X pertaining to electromagnetic radiation, indicates:

Total luminous flux: luminous flux of a source emitted into 4π steradians.

Trap detector: Detector array prepared from detectors of high internal quantum efficiency, where the reflected radiation of the first detector is directed to the second one, and so on, so that practically all the radiation is absorbed by one of the detectors (the radiation is "trapped" in the detector system).
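The definitions above combine directly: the inverse square law applied to a source of known luminous intensity gives the illuminance at a surface, with a cos θ factor for oblique incidence that follows from the illuminance definition. A minimal sketch (Python; the helper name is invented for illustration):

```python
import math

def illuminance_lux(intensity_cd: float, distance_m: float,
                    incidence_rad: float = 0.0) -> float:
    """Illuminance E_v (lx) at a point from a point source of luminous
    intensity I_v (cd): inverse square law, with a cos(theta) factor
    for light arriving at angle theta from the surface normal."""
    return intensity_cd * math.cos(incidence_rad) / distance_m ** 2

# A 100 cd point source viewed from 2 m at normal incidence:
E = illuminance_lux(100.0, 2.0)     # 25.0 lx
E_fc = E / 10.764                   # in footcandles, using 1 fc = 10.764 lx
```

Note that the units follow the glossary: cd = lm sr⁻¹, so cd/m² of distance squared yields lm m⁻² = lx.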
Scatterometry

J C Stover, The Scatter Works, Inc., Tucson, AZ, USA

© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

Scattered light is a limiting source of optical noise in many advanced optical systems, but it can also be a sensitive indicator of optical component quality. Consider the simple case of a telescope used to image a dim star against a dark background. If light from a bright source (such as the moon, located well out of the field of view) enters the telescope, it will scatter from the interior walls and the imaging optics themselves. Some of this light eventually reaches the detector and creates a dim background haze that washes out the image of the distant star. A good telescope design accounts for these effects by limiting potential scatter propagation paths and by requiring that critical elements in the optical system meet scatter specifications. This requires both a careful system analysis and a means to quantify the scattering properties of the telescope components. This article discusses modern techniques for quantifying, measuring, and analyzing scattered light, and reviews their development.
Like many scientific advances, moving scatterometry from an art to a reliable metrology was accomplished in a series of small hops (not always in the same direction) rather than a single leap. It started in 1961, when a paper by Hal Bennett and Jim Porteus reported measurements made by gathering most of the light scattered from front-surface mirrors and normalizing this signal by the much larger specular reflection. They defined this ratio as the total integrated scatter (TIS) and, using a scalar diffraction theory result drawn from the radar literature, related it to the surface root mean square (rms) roughness. By the mid-1970s, several angle-resolved scatterometers had been built as research tools in university, government, and industry labs. Unfortunately, instrument operation and data manipulation were generally poor, and meaningful comparison measurements were virtually impossible due to instrument differences, sample contamination, and confusion over what parameters should be compared. Analysis of scatter data, to characterize sample surface roughness, was the subject of many publications. A derivation of what is commonly called 'BRDF' (bidirectional reflectance distribution function) was published by Nicodemus and co-workers at the National Bureau of Standards (now the National Institute of Standards and Technology, or NIST) in 1970, but it did not gain common acceptance as a way to quantify scatter measurements until the late 1980s, when the advent of small computers, combined with inspection requirements for defense-related optics, dramatically stimulated the development of scatter metrology. Commercial laboratory instrumentation became available that could measure and analyze as many as 50 to 100 samples a day, and the number (and sophistication) of measurement facilities increased dramatically. The first ASTM standards were published (TIS in 1987 and BRDF in 1991), but it was still several years before most publications correctly used these quantifying terms. Government defense funding decreased dramatically in the early 1990s, following the end of the Cold War, but the economic advantages of using scatter metrology and analysis for space applications and in the rapidly advancing semiconductor industry continued state-of-the-art advancements.

The following sections detail how scatter is quantified when related to area (roughness) and local (pit/particle) generating sources. Instrumentation and the use of scattering models are also briefly reviewed.

Quantifying Scattered Light

Scatter signals can be easily quantified as scattered light power per unit solid angle (in watts per steradian); however, in order to make the results more meaningful, these signals are usually normalized, in some fashion, by the light incident on the scatter source. The three ways commonly employed to do this are defined below.

If the scattering feature in question is uniformly distributed across the illuminated spot on the sample (such as surface roughness), then it makes sense to normalize the collected scattered power in watts/steradian by the incident power. This simple ratio, which has units of inverse steradians, was commonly referred to as 'the scattering function.' Although this term is occasionally still found in the literature, it has been generally replaced by the closely related BRDF, which is defined by the differential ratio of the sample radiance normalized by its irradiance. After some simplifying assumptions are made, this reduces to the original scattering function with a cosine of the polar scattering angle in the denominator. The BRDF, defined in this manner, has become the standard way to report angle-resolved scatter from features that uniformly fill the illuminated spot. The cosine term results from the fact that NIST used radiometric terms to define BRDF:

  BRDF = (Ps/Ω) / (Pi cos θs)   [1]

The scatter function is often referred to as the 'cosine corrected BRDF' and is simply equal to the BRDF multiplied by the cosine of the polar scattering angle. Figure 1 gives the geometry for the situation, and defines the polar and azimuthal angles (θs and φs), as well as the solid collection angle (Ω). Other common abbreviations are BSDF, for the more generic bidirectional scatter distribution function, and BTDF for quantifying transmissive scatter.

Figure 1 Scatter analysis uses standard spherical coordinates to define terms.
Integration of the scatter signal over much of the scattering hemisphere allows calculation of TIS, as the ratio of the scatter signal to the reflected specular power. This integration is usually carried out experimentally in such a way that both the incident beam and the reflected specular beam are excluded. In the most common TIS situation, the beam is incident at a small angle near surface normal, and the integration is done from small values of θs to almost 90 degrees. If the fraction of light scattered from the specular reflection is small, and if the scatter is caused by surface roughness, then it can be related to the rms surface roughness of the reflecting surface. As a ratio of powers, the TIS is a dimensionless quantity. The normalization is by Pr (instead of Pi) because reductions in scatter caused by low reflectance do not influence the roughness calculation. The pertinent relationship is given below, where σ is the rms roughness and λ is the light wavelength:

  TIS = Ps/Pr ≈ (4πσ/λ)²   [2]

Of course, all scatter measurements are integrations over a detector collection aperture, but the TIS designation is reserved for situations where the aim is to gather as much scattered light as possible, while 'angle resolved' designs are created to gain information from the distribution of the scattered light. Notice that TIS values become very large when measured from a diffuse surface, where the specular reflection is very small. Although TIS can be measured for any surface, the diffuse reflectance (equal to Ps/Pi) would often be more appropriate for diffuse surfaces. The various restrictions associated with relating TIS to rms roughness are detailed below.

Scatter from discrete features, such as particles and pits, which do not completely fill the illuminated spot, must be treated differently. This is because changes in spot size, with no corresponding change in total incident power, will change the incident intensity (watts/unit area) at the feature and thus also change the scatter signal (and BRDF) without any corresponding change in the scattering feature. Clearly this is unacceptable if the object is to characterize the defect with scatter measurements. The solution is to define another quantification term, known as the differential scattering cross-section (DSC), where the normalization is the incident intensity at the feature (the units for DSC are area/steradian). Because the DSC was not defined in terms of radiometric units, the cosine of the polar scattering angle is not in the definition. The same geometrical definitions, found in Figure 1, also apply for the DSC:

  DSC = (Ps/Ω) / Ii   [3]

If the DSC is integrated over the solid angle associated with a collection aperture, then the value has units of area. Because relatively small focused laser beams are often used as a source, the area is most commonly given in micrometers squared. These three scatter parameters, the BRDF, the TIS, and the DSC, are obviously functions of system variables such as geometry, scatter direction (both in and out of the incident plane), incident wavelength, and polarization, as well as feature characteristics. It is the dependence of the scatter signal on these system parameters that makes scatter models useful for optimizing instrument designs. It is their dependence on feature characteristics that makes scatter measurement a useful metrology tool.

A key point needs to be stressed. When applied appropriately, TIS, BRDF, and DSC are absolute terms, not relative terms. The DSC of a 100 nm PSL in a given direction for a given source is a fixed value, which can be repeatedly measured and even accurately calculated from models. The same is true for TIS and BRDF values associated with surface roughness of known statistics. Scatter measuring instruments, such as particle scanners or lab scatterometers, can be calibrated in terms of these quantities. As has already been pointed out, the user of a scanner will almost always be more interested in characterizing defects than in the resulting scatter values, but the underlying instrument calibration can always be expressed in terms of these three quantities. This is true even though designers and users may find it convenient to use other metrics (such as polystyrene latex (PSL) spheres) as a way to relate calibration.
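The TIS and DSC relations (eqns [2] and [3]) translate directly into a few lines of arithmetic. A numerical sketch (Python; the function names are illustrative, not from the article; the smooth-surface restriction on eqn [2] applies as stated in the text):

```python
import math

def tis(p_scatter_w: float, p_specular_w: float) -> float:
    """Total integrated scatter (eqn [2]): scattered over specular power."""
    return p_scatter_w / p_specular_w

def rms_roughness_from_tis(tis_value: float, wavelength_m: float) -> float:
    """Invert TIS ~ (4*pi*sigma/lambda)^2 for the rms roughness sigma;
    valid only for smooth, clean, front-surface reflectors."""
    return wavelength_m * math.sqrt(tis_value) / (4.0 * math.pi)

def dsc(p_scatter_w: float, omega_sr: float,
        i_incident_w_per_m2: float) -> float:
    """Differential scattering cross-section (eqn [3]): (Ps/Omega)/Ii,
    with units of area per steradian."""
    return (p_scatter_w / omega_sr) / i_incident_w_per_m2

# A TIS of 1e-4 measured at 633 nm implies a sub-nanometre surface:
sigma = rms_roughness_from_tis(1.0e-4, 633e-9)    # ~0.5 nm
```

Because the DSC is normalized by incident intensity rather than power, the result is independent of the illuminated spot size, which is the point of the definition.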
moving the receiver at the end of an arm, but it complicates analysis because the incident angle and the observation angle change simultaneously. Another combination is to fix the source and sample together, at constant incident angle, and rotate this unit (about the point of illumination on the sample) so that the scatter pattern moves past a fixed receiver. This has the advantage that a long receiver/sample distance can be used without motorizing a long (heavy) receiver arm. It has the disadvantage that heavy (or multiple) sources are difficult to deal with. Other configurations, with everything fixed, have been designed that employ several receivers to merely sample the BSDF and display a curve fit of the resulting data. This is an economical solution if the BSDF is relatively uniform, without isolated diffraction peaks. The goniometer section of a real instrument, similar to that of Figure 2, is shown in Figure 3.

Figure 2 Basic elements of an incident plane scatterometer are shown.

Figure 3 The author's scatterometer, which is similar to the diagram of Figure 2, is shown. In this case a final focusing lens is introduced to produce a very small illuminated spot on the silicon wafer sample. The white background was introduced to make the instrument easier to view.

Computer control of the measurement is essential to maximize versatility and minimize measurement time. The software required to control the measurement, plus the display and analysis of the data, can be expected to be a significant portion of total instrument development cost. The following reviews typical design features (and issues) associated with the source, sample mount, and receiver components.

The source in Figure 2 is formed by a laser beam that is chopped, spatially filtered, expanded, and finally brought to a focus on the receiver path. The beam is chopped to reduce both optical and electronic noise. This is usually accomplished through the use of lock-in detection in the electronics package, which suppresses all signals except those at the chopping frequency. Low-noise, programmable-gain electronics are essential to reducing system noise. The reference detector is used to allow the computer to ratio out laser power fluctuations and, in some cases, to provide the necessary timing signal to the lock-in electronics. Polarizers, wave plates, and neutral density filters are also commonly placed prior to the spatial filter. The spatial filter removes source scatter from the laser beam and presents a point source, which is imaged by the final focusing element, in this case a mirror, to the detector zero position. Focusing the beam at this location allows near-specular scatter to be more easily measured. Lasers are convenient sources, but are not necessary. Broadband sources are often required to meet a particular application or to simulate the environment where a sample will be used. Monochromators and filters can be used to provide scatterometer sources of arbitrary wavelength. The noise floors with these tunable incoherent sources increase as the spectral bandpass is narrowed, but they have the advantage that the scatter pattern does not contain laser speckle.

The sample mount can be very simple or very complex. In principle, six degrees of mechanical freedom are required to fully adjust the sample. The order in which these stages are mounted affects the ease of use (and cost) of the sample holder. In practice, it often proves convenient to either eliminate, or occasionally duplicate, some of these degrees of freedom. In addition, some of these axes may be motorized to allow the sample area to be raster-scanned, to automate sample alignment, or to measure reference samples. As a general rule, the scatter pattern is insensitive to small changes in incident angle but very sensitive to small angular deviations from specular. Instrumentation should be configured to allow location of the specular reflection (or transmission) very accurately. Receiver designs vary, but changeable entrance apertures, bandpass
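The chopping-plus-lock-in scheme described above can be illustrated with synthetic data: multiplying the detector record by in-phase and quadrature references at the chop frequency and averaging rejects the DC background and most broadband noise. A simplified sketch (Python; all signal values are invented for illustration):

```python
import math
import random

# Synthetic detector record: a weak signal chopped at f_chop, buried in
# broadband noise on top of a large DC background (all values invented).
random.seed(0)
fs, f_chop, n = 10_000.0, 370.0, 100_000   # sample rate (Hz), chop rate (Hz), samples
amplitude = 2.0e-3                          # "true" chopped-signal amplitude
record = [amplitude * math.sin(2 * math.pi * f_chop * k / fs)  # chopped signal
          + 0.5                                                # DC background
          + 0.05 * random.gauss(0.0, 1.0)                      # broadband noise
          for k in range(n)]

# Lock-in demodulation: multiply by quadrature references at the chop
# frequency and average; components away from f_chop average toward zero.
x = 2.0 * sum(v * math.sin(2 * math.pi * f_chop * k / fs)
              for k, v in enumerate(record)) / n
y = 2.0 * sum(v * math.cos(2 * math.pi * f_chop * k / fs)
              for k, v in enumerate(record)) / n
recovered = math.hypot(x, y)               # close to `amplitude`
```

The recovered amplitude is close to the 2 mV "true" value even though the background and noise are orders of magnitude larger, which is why lock-in detection is standard in these instruments.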
… closer than 0.1 degrees from specular can be made in this manner, and it is an excellent way to check incoming optics or freshly coated optics for low scatter.

… should always be given with enough information to determine bandwidth limits.

… well-known grating equations as:

  fx = (sin θs cos φs − sin θi)/λ   and …

Measuring and Analyzing Scatter from Isolated Surface Features

Understanding scatter from discrete surface features has led to big changes in the entertainment business (CDs, DVDs, digitized music and films, etc.), as well as providing an important source of metrology for the semiconductor industry as it develops smaller, faster chips for a variety of modern uses. Thus, just about everybody in the modern world utilizes our understanding of scatter from discrete surface features.

On the metrology side, the roughness signals described in the last section are a serious source of background noise that limits the size of the smallest defects that can be found. Discrete surface features come in a dazzling array of types, sizes, materials, and shapes – and they all scatter differently. A 100 nm silicon particle scatters a lot differently than a 100 nm silicon oxide particle (even if they have the same shape), and a 100 nm diameter surface pit will have a different scatter pattern again. Models describing scatter from a variety of discrete surface features have been developed. Although many of these are kept confidential for competitive reasons, NIST offers some models publicly through a web site.

Confirming a model requires knowing exactly what is scattering the light. In order to accomplish this, depositions of PSLs of known size are made on the surface. Scatter is measured from one or more spheres. A second measurement of background scatter is then subtracted, and the net BRDF is converted to DSC units using the known (measured) illuminated spot size. Measurements of this type have been used to confirm discrete feature scatter models. The model is then used to calculate scatter from different diameters and materials. Combined with the model for surface roughness to evaluate system noise, this capability allows signal-to-noise evaluation of different defect scanner designs.

Practical Industrial Instrumentation

Surface defects and particles smaller than 100 nm are now routinely found on silicon wafers using scatter instruments known as particle scanners. Thousands of these instruments (with price tags approaching a million dollars each, depending on type and use) are in daily use. These instruments are truly amazing; they inspect 200 mm wafers at a rate of one every thirty seconds, reporting feature location, approximate size, and in some cases even type (pit or particle) by analyzing scatter signals that last only about 100 nanoseconds. Similar systems are now starting to be used in the computer disk industry, and flat panel display inspection of surface features will follow.

Scanners are calibrated with PSLs of different sizes. Because it is impossible to identify all defect types, and because diameter has little meaning for irregularly shaped objects, feature 'size' is reported in 'PSL equivalent diameters.' This leads to confusing situations for multiple detector systems, which (without some software help) would report different sizes for real defects that scatter much differently than PSLs. These difficulties have caused industry confusion similar to the 'Gaussian statistics/bandwidth limited' issues encountered before roughness scatter was understood. The publication of international standards relating to scanner calibration has reduced the level of confusion.

Scatter Related Standards

Early scatter related standards for BRDF, TIS, and PSD calculations were written for ASTM (the American Society for Testing and Materials), but as the industrial need for these documents moved to the semiconductor industry, there was pressure to move the documents to SEMI (Semiconductor Equipment and Materials International), which is an international organization. By 2004, the process of rewriting the ASTM documents in SEMI format was well underway in the Silicon Wafer Committee. Topics covered include: surface defect specification (M35), defect capture rate (M50), scanner specifications (M52), scanner calibration (M53), particle deposition testing (M58), BRDF measurement (ME1392), and PSD calculation (MF1811). This body of literature is probably the only place where all aspects of these related problems are brought together.

Conclusion

The bottom line is that scatter measurement and analysis has moved from an art to a working metrology. Dozens of labs around the world can now take the same sample and get about the same measured BRDF from it. Thousands of industrial surface scanners employing scattered light signals are in use every day on every continent. In virtually every
house in the modern world there is at least one entertainment device that depends on scatter signals. In short – scatter works.

See also

Scattering: Scattering Theory.

Further Reading

Bennett HE and Porteus JO (1961) Relation between surface roughness and specular reflectance at normal incidence. Journal of the Optical Society of America 51: 123.
Bennett JM and Mattsson L (1995) Introduction to Surface Roughness and Scattering, 2nd edn. Washington, DC: Optical Society of America.
Cady FM, Bjork DR, Rifkin J and Stover JC (1989) BRDF error analysis. Proceedings of SPIE 1165: 154–164.
Church EL and Zavada JM (1975) Residual surface roughness of diamond-turned optics. Applied Optics 14: 1788.
Doicu A, Eremin Y and Wriedt T (2000) Acoustic & Electromagnetic Scattering Analysis Using Discrete Sources. San Diego, CA: Academic Press.
Gu ZH, Dummer RS, Maradudin AA and McGurn AR (1989) Experimental study of the opposition effect in the scattering of light from a randomly rough metal surface. Applied Optics 28(3): 537.
Nicodemus FE, Richmond JC, Hsia JJ, Ginsberg IW and Limperis T (1977) Geometric Considerations and Nomenclature for Reflectance. NBS Monograph 160. Washington, DC: US Dept. of Commerce.
Scheer CA, Stover JC and Ivakhnenko VI (1998) Comparison of models and measurements of scatter from surface-bound particles. Proceedings of SPIE 3275.
Schiff TF, Stover JC and Bjork DR (1992) Mueller matrix measurements of scattered light. Proceedings of SPIE 1753: 269–277.
Stover JC (1975) Roughness characterization of smooth machined surfaces by light scattering. Applied Optics 14(8): 1796.
Stover JC (1995) Optical Scattering: Measurement and Analysis, 2nd edn. Bellingham, WA: SPIE.
Stover JC (2001) Calibration of particle detection systems. In: Diebold A (ed.) Handbook of Silicon Semiconductor Metrology, chap. 21. New York: Marcel Dekker.
Wolfe WL and Bartell FO (1982) Description and limitations of an automated scatterometer. Proceedings of SPIE 362-30.
Spectrometers

K A More, SenSyTech, Inc. Imaging Group, Ann Arbor, MI, USA

© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

Spectrometers were developed after the discovery that glass prisms disperse light. Later, it was discovered that diffraction from multiple, equally spaced wires or fibers also disperses light. Huygens proposed his wave theory of light, and Fraunhofer developed diffraction theory, which allowed scientific development of diffraction gratings.

These discoveries then led to the development of spectrometers. Light theory was sufficiently developed that spectrometer designs developed over 100 years ago are still being used today. These theories and designs are briefly described, along with comments on how current technology has improved upon them. This is followed by some examples of imaging spectrometers which have wide spectral coverage, from 450 nm to 14 μm, and produce images of more than 256 spectral bands.

The basic elements of a spectroscopic instrument are shown in Figure 1. The source, or more usually an image of the source, fills an entrance slit, and the radiation is collimated by either a lens or a mirror. The radiation is then dispersed, by either a prism or a grating, so that the direction of propagation of the radiation depends upon its wavelength. It is then brought to a focus by a second lens or mirror, and the spectrum consists of a series of monochromatic images of the entrance slit. The focused radiation is detected, either by an image detector such as a photographic plate, or by a flux detector such as a photomultiplier, in which case the area over which the flux is detected is limited by an exit slit. In some cases the radiation is not detected at this stage, but passes through the exit slit to be used in some other optical system. As the exit slit behaves as a monochromatic source, the instrument can be regarded as a wavelength filter and is then referred to as a monochromator.

Prisms

The wavelength dependence of the index of refraction is used in prism spectrometers. Such an optical
Figure 1 The basic elements of a spectroscopic instrument. With permission from Hutley MC (1982) Diffraction Gratings, pp. 57–232. London: Elsevier.

Figure 2 Elementary prism spectrometer schematic. W is the width of the entrance beam; Sp is the length of the prism face; and B is the prism base length. Reproduced with permission from The Infrared Handbook (1985). Ann Arbor, MI: Infrared Information Analysis Center.

Figure 3 Infrared spectrograph of the Littrow-type mount with a rock salt prism. Reproduced with permission from The Infrared Handbook (1985). Ann Arbor, MI: Infrared Information Analysis Center.
It can be shown that:

  dDp/dλ = [B/W][dn/dλ] = [dDp/dn][dn/dλ]   [3]

where

  B = base length of the prism
  W = width of the illumination beam
  n = index of refraction

while

  dx/dλ = F[B/W][dn/dλ]   [4]

One may define the resolving power, RP, of an instrument as the smallest resolvable wavelength difference, according to the Rayleigh criterion, divided into the average wavelength in that spectral region. Thus:

  RP = λ/Δλ = [λ/dDp][dDp/dλ] = [λ/dDp][B/W][dn/dλ]   [5]

… the prism for a double pass through the prism, as shown in Figure 3.

Gratings

Rowland is credited with the development of a grating mount that reduced aberrations in the spectrogram. He found that ruling a grating on a concave surface of radius R, and locating the entrance and exit slits on the same radius, would give the least aberrations, as shown in Figure 4: r = R cos α is the distance to the entrance slit, and r1 = R cos β is the distance to the focal point for the exit slit. Rowland showed that:

  cos α/R − cos²α/r + cos β/R − cos²β/r1 = 0   [8]

One solution to this equation is for the α terms and the β terms to each be zero, so that

  r = R cos α

and

  r1 = R cos β
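The Rowland-circle condition can be checked numerically from eqn [8]: placing the entrance slit at r = R cos α and the exit focus at r1 = R cos β makes the focal expression vanish for any pair of angles. A short sketch (Python; the function names are illustrative, not from the article):

```python
import math

def rowland_distances(R: float, alpha: float, beta: float):
    """Slit and focus distances on the Rowland circle for a concave
    grating of radius R: r = R cos(alpha), r1 = R cos(beta)."""
    return R * math.cos(alpha), R * math.cos(beta)

def focal_residual(R: float, alpha: float, beta: float,
                   r: float, r1: float) -> float:
    """Left-hand side of eqn [8]; zero when both points lie on the circle."""
    return (math.cos(alpha) / R - math.cos(alpha) ** 2 / r
            + math.cos(beta) / R - math.cos(beta) ** 2 / r1)

# Any pair of angles satisfies eqn [8] once r and r1 are on the circle:
r, r1 = rowland_distances(1.0, math.radians(20.0), math.radians(35.0))
residual = focal_residual(1.0, math.radians(20.0), math.radians(35.0), r, r1)
```

Each bracketed pair in eqn [8] cancels separately, which is why the α and β conditions can be satisfied independently.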
… convenient for applications such as the routine analysis of samples of metals and alloys. In this case, slits and detectors are set up to measure the light in various spectral lines, each characteristic of a particular component or trace element.

The oldest concave grating mount is that designed by Rowland himself, and which bears his name (Figure 5b). In this case, the grating and photographic plates are fixed at opposite ends of the diameter of the Rowland circle by a moveable rigid beam. The entrance slit remains fixed above the intersection of two rails at right angles to each other, along which the grating and plate holder (or the exit slit) are free to move. In this way, the entrance slit, grating, and plate holder are constrained always to lie on the Rowland circle, and the mounting has the advantage that the dispersion is linear, which is useful in the accurate determination of wavelengths. Unfortunately, this mounting is rather sensitive to small errors in the position of the entrance slit and in the orthogonality of the rails, and is now very rarely used.

Figure 4 The construction of the Rowland Circle. With permission from Hutley MC (1982) Diffraction Gratings, pp. 57–232. London: Elsevier.

A variation on this mounting was devised by Abney, who again mounted the grating and the plate holder on a rigid bar at opposite ends of the diameter of the Rowland circle (Figure 5c). The entrance slit is mounted on a bar of length equal to the radius of the Rowland circle. In this way, the slit always lies on the Rowland circle, but it had to rotate about its axis in order that the jaws should remain perpendicular to the line from the grating to
Figure 5 Mountings of the concave grating: (a) Paschen–Runge; (b) Rowland; (c) Abney; (d) and (e) Eagle; (f) Wadsworth; (g) Seya–Namioka; (h) Johnson–Onaka. With permission from Hutley MC (1982) Diffraction Gratings, pp. 57–232. London: Elsevier.
the slit. It also had the disadvantage that the source must move with the exit slit, which could be inconvenient.

In the Eagle mounting (Figures 5d and 5e), the angles of incidence and diffraction are made equal, or very nearly so, as with the Littrow mounting for plane gratings. Optically, this system has the advantage that the astigmatism is generally less than that of the Paschen–Runge or Rowland mounting; on the other hand, the dispersion is nonlinear. From a mechanical point of view, it has the disadvantage that it is necessary, with great precision, both to rotate the grating and to move it nearer to the slits in order to scan the spectrum. However, it does have the practical advantage that it is much more compact than other mountings, and this is of particular importance when we bear in mind the need to enclose the instrument in a vacuum tank. Ideally, the entrance slit and exit slit or photographic plate should be superimposed if we are to set α = β. In practice, of course, the two are displaced, either sideways, in the plane of incidence as shown in Figure 5, or out of the plane of incidence, in which case the entrance slit is positioned below the meridional plane and the plate holder just above it, as shown in Figure 5. The out-of-plane configuration is generally used for spectrographs and the in-plane system for monochromators. The penalty incurred in going out of plane is that coma is introduced in the image, and slit curvature becomes more important. This limits the length of the ruling that can effectively be used.

The second well-known solution to the Rowland equation is the Wadsworth mounting (Figure 5f), in which the incident light is collimated, so r is set at infinity and the focal equation reduces to r1 = R cos²β/(cos α + cos β). One feature of this mounting is that the astigmatism is zero when the image is formed at the center of the grating blank, i.e., when β = 0. It is a particularly useful mounting for applications in which the incident light is naturally collimated (for example, in rocket or satellite astronomy, spectroheliography, and in work using synchrotron radiation). However, if the light is not naturally collimated, the Wadsworth mount requires a collimating mirror, so one has to pay the penalty of the extra losses of light at this mirror. The distance from the grating to the image is about half that for a Rowland circle mounting, which makes the instrument more compact and, since the grating subtends approximately four times the solid angle,

A particularly important example of this is the Seya–Namioka mounting (Figure 5g), in which the entrance slit and exit slit are kept fixed and the spectrum is scanned by a simple rotation of the grating. In order to achieve the optimum conditions for this mounting, we set α = θ + w and β = θ − w, where 2w is the angle subtended at the grating by the entrance and exit slits and θ is the angle through which the grating is turned. The amount of defocus is given by:

  F(θ, w, r, r1) = [cos²(θ + w)/r − cos(θ + w)/R] + [cos²(θ − w)/r1 − cos(θ − w)/R]   [9]

and the optimum conditions are those for which F(θ, w, r, r1) remains as small as possible as θ is varied over the required range. Seya set F and three derivatives of F with respect to θ equal to zero for θ = 0 and obtained the result:

  w = sin⁻¹(1/√3) = 35°15′   [10]

and

  r = r1 = R cos w   [11]

which corresponds to the Rowland circle either in zero order or for zero wavelength. In practice, it is usual to modify the angle slightly so that the best focus is achieved in the center of the range of interest rather than at zero wavelength.

The great advantage of the Seya–Namioka mounting is its simplicity. An instrument need consist only of a fixed entrance and exit slit, and a simple rotation of the grating is all that is required to scan the spectrum. It is, in fact, simpler than instruments using plane gratings. Despite the fact that at the ends of the useful wavelength range the resolution is limited by the defect of focus, and the astigmatism is particularly bad, the Seya–Namioka mounting is very widely used, particularly for medium-resolution rather than high-resolution work.

A similar simplicity is a feature of the Johnson–Onaka mounting (Figure 5h). Here again, the entrance and exit slits remain fixed, but the grating is rotated about an axis which is displaced from its center; in this way, it is possible to reduce the change
there is a corresponding increase in the brightness of of focus that occurs in the Seya – Namioka mounting.
the spectral image. The system is set up so that at the center of the desired
Not all concave grating mountings are solutions to wavelength range, the slits and grating lie on the
the Rowland equation. In some cases, other advan- Rowland circle, as shown in Figure 5h. The optimum
tages may compensate for a certain defect of focus. radius of rotation, i.e., the distance GC, was found by
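Seya's optimization can be checked numerically. The sketch below implements eqns [9]–[11] directly (the grating radius R is an arbitrary assumed unit; the sign convention follows the Rowland focal condition used above):

```python
import math

def defocus(u, w, r, r1, R):
    """Defocus term F(u, w, r, r1) of eqn [9] for a grating turned
    through angle u, with half-angle w between the slits (radians)."""
    a, b = u + w, u - w  # angles of incidence and diffraction
    return (math.cos(a)**2 / r - math.cos(a) / R
            + math.cos(b)**2 / r1 - math.cos(b) / R)

R = 1.0                                  # grating radius (arbitrary units)
w = math.asin(1 / math.sqrt(3))          # Seya's optimum, eqn [10]
r = r1 = R * math.cos(w)                 # eqn [11]

print(math.degrees(w))                   # ~35.26 deg = 35 deg 15'
print(defocus(0.0, w, r, r1, R))         # 0 at u = 0 (Rowland circle)
# F stays small as the grating is scanned over +/- 20 deg:
print(max(abs(defocus(math.radians(t), w, r, r1, R)) for t in range(-20, 21)))
```

At u = 0 the defocus vanishes exactly, and over a ±20° scan it remains of order 10⁻³/R, which is the behavior that makes the simple-rotation scan workable.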
INSTRUMENTATION / Spectrometers 329
Figure 7 Hyperspectral Data Cube. Hyperspectral imagers divide the spectrum into many discrete narrow channels. This fine
quantization of spectral information on a pixel by pixel basis enables researchers to discriminate the individual constituents in an area
much more effectively. For example, the broad spectral bands of a multispectral sensor allow the user only to coarsely discriminate
between areas of deciduous and coniferous forest, plowed fields, etc., whereas a hyperspectral imager provides characteristic
signatures which can be correlated with specific spectral templates to help determine the individual constituents and possibly even
reveal details of the natural processes which are affecting them.
motion to form an image. This scanner is called a pushbroom scanner.

One important requirement is that all spectral measurements of a pixel be coregistered. Most airborne imaging spectrometers use a common aperture for a line scanner, or a slit for whisk and pushbroom scanners, so that platform instability from pitch, roll, and yaw and from inability to fly in a straight line does not compromise the coregistration of spectral data on each pixel. The image data may require geometric correction, but the spectral data are not compromised.
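The coregistration bookkeeping that such scanners must guarantee can be illustrated with a toy model of a frame-stepped imager of the linear-variable-filter type described below; the 4-band array and ground-line labels are hypothetical:

```python
# Sketch of wedge-filter frame stepping (hypothetical 4-band, 4-row array):
# detector row i always carries spectral band i; each frame the platform
# advances one ground line, so line g is seen in band i during frame g - i.
N = 4                                   # detector rows = spectral bands
ground = list(range(8))                 # ground-line indices along track
cube = {}                               # (line, band) -> sample
for k in range(-N + 1, len(ground)):    # frame index = line under row 0
    for i in range(N):                  # detector row / band
        g = k + i
        if 0 <= g < len(ground):
            cube[(g, i)] = f"line{g}-band{i}"

# every overflown line ends up sampled in all bands
complete = [g for g in ground if all((g, i) in cube for i in range(N))]
print(complete)
```

The reassembled dictionary is the data cube: one spectral sample per (ground line, band) pair, which is exactly the coregistration property that platform motion can corrupt in flight.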
There is one other type of imaging spectrometer that uses a linear variable filter (also known as a wedge filter) over a 2D array of photodetectors. Each frame images the scene with a different spectral band over each row of pixels. The array is oriented so the rows of spectral bands are perpendicular to the flight track. After the platform motion moves over one ground pixel, the whole array is read out and the frame is shifted one row of pixels, so the second frame adds a second spectral band to each row imaged in the first frame. This frame stepping is carefully timed to the platform velocity and is repeated until each row of pixels is imaged in all spectral bands. This type of imaging spectrometer has been flown in aircraft and satellites. In aircraft, the platform motion corrupts the coregistration of the spectrum in each pixel. Extensive ground processing is required to geometrically correct each frame to improve spectral coregistration. This is a difficult task, and this type of imaging spectrometer has lost favor for airborne use. Stabilized satellites have been a better platform, and the wedge type of imaging spectrometer has been used successfully in space.

As examples of airborne and ground-based imaging spectrometers, a line scanner imaging spectrometer and a ground-based scanned push broom scanner are described below.

MIVIS (Multispectral Infrared and Visible Imaging Spectrometer), a hyperspectral scanner developed by the SenSyTech Imaging Group (formerly Daedalus Enterprises) for the CNR (Consiglio Nazionale delle Ricerche) of Italy, is a line scanner imaging spectrometer. The optical schematic is shown in Figure 8. A rotating 45° mirror scans a line of pixels on the ground. Each pixel is collimated by a parabolic mirror in a Gregorian telescope mount. The collimated beam is reflected by a Pfund assembly to an optical bench that houses four spectrometers. An aperture pinhole in the Pfund assembly defines a common instantaneous field of view for each pixel. The collimated beam is then split by either thin metallic coated or dielectric coated dichroics to the four spectrometers. The thin metallic mirrors reflect long wavelengths and transmit short wavelengths. The multiple dielectric layers cause interference such that light is reflected at short wavelengths and transmitted at long wavelengths. After splitting off four wide bands (visible, near infrared, mid-wave infrared, and thermal infrared), each band is dispersed in its own spectrometer. The wavelength coverage for each spectrometer is based on the wavelength sensitivity of different photodetector arrays. The visible spectrometer uses a silicon photodiode array; the near infrared spectrometer, an InGaAs array; the mid-infrared, an InSb array; and the thermal infrared, an MCT (mercury cadmium telluride) photoconductor array. Note the beam expander in spectrometer 3 for the mid-infrared (IR). Since wavelength resolution increases with the size of the illuminated area of a diffraction grating, a beam expander was needed to meet the specified mid-IR resolution.

The physical characteristics of MIVIS are given in Table 1 and the spectral coverage in Table 2.

Table 1 MIVIS physical properties

                                 Height          Width           Depth(a)        Weight
                                 in     cm       in     cm       in     cm       lbs    kg
Scan head                        26.5   67.0     20.6   52.0     28.1   71.0     220    100
Electronics                      40.0   102.0    19.0   48.3     24.0   61.0
Total system weight (approx.)                                                    460    209

(a) Not including connectors or cable bends.

An example of a push broom imaging spectrometer is the large area fast spectrometer (LAFS), which is a field portable, tripod mounted, imaging spectrometer. LAFS uses a push broom spectrometer with a galvanometer driven mirror in front of the entrance slit. The mirror is stepped to acquire a 2D image × 256 spectral bands data cube. LAFS was developed for the US Marine Corps under the direction of the US Navy Coastal Systems Station of the Dahlgren Division. The technical specifications are shown in Table 3.

Table 3 LAFS technical specifications

Spatial resolution     1.0 mrad IFOV vertical and horizontal (square pixels). Optical lens for 0.5 mrad IFOV
Spatial coverage       15° TFOV horizontal, nominal. A second, 7.5° TFOV, is an option
Focus range            50 to infinity
Spectral resolution    5.0 nm per spectral channel, sampled at 2.5 nm intervals
Spectral range         400–1100 nm
Dynamic range          8 bit (1 part in 256)
Illumination           Solar illumination ranging from 10:00 a.m. to 2:00 p.m. under overcast conditions to full noon sunshine
Acquisition time       1.5 sec for standard illumination. Option for longer times for low light level conditions
Viewfinder             Near real time video view of the scene
Calibrations           Flat fielding to compensate for CCD variations in responsivity. Special calibration
Data displays          Single band imagery, selectable waterfall chart (one spatial line by spectral, as acquired by the CCD). Single pixel spectral display
Data format            Convertible to format compatible with image processors
Data storage           Replaceable hard disk, 14 data cubes/disk

Various concepts for the imaging spectrometer were studied, and the only concept that could meet the above specification was a Littrow configuration grating spectrometer that imaged one line in the scene and dispersed the spectra perpendicular to the line image onto a CCD array. A galvanometer mirror stepped the line image over the scene to generate a spectral image of the scene. A block diagram of LAFS is shown in Figure 9 and a photograph of the prototype in Figure 10. Figure 11 is a photograph of the optical head. Data from a single frame are stored in computer (RAM) memory, then transferred to a replaceable hard disk for bulk storage. Data are collected in a series of 256 CCD frames, and each CCD frame has a line of 256 pixels along one axis and 256 spectral samples along the second axis. The 256 CCD frames constitute a data cube as shown in Figure 12. This does not allow field viewing of an image in a single spectral band. EarthView software is used in the field portable computer
to reorder the data cube into 256 spatial images, one for each spectral band, as shown in Figure 12.

Figure 10 LAFS.

The prototype is packaged in two chassis, a compact optical head and a portable electronics chassis. The size, weight, and power are shown in Table 4.

Table 4 LAFS physical properties

Dimensions and weight   Size (inches)      Weight (lbs)
Optical head            7 × 9 × 8          11
Electronics             12 × 15.5 × 9      30
Battery                 4.5 × 16.5 × 2     7
Total                                      48 + cables

Power: less than 150 W at 12 VDC; battery time: 45 min/charge.

LAFS was designed to also work as an airborne imaging spectrometer. The framing mirror can be locked so it views the scene directly below the aircraft (push broom mode), or the mirror can be programmed to sweep 256 lines cross track to the aircraft flight path (whisk broom mode). In the whisk broom mode, the scan mirror rotation arc can be increased for a wider field of view than in the push broom mode. LAFS illustrates the decrease in size, weight, and power of a push broom imaging spectrometer that results from a 256 × 256 pixel array rather than the single pixel line arrays of a line scanner. The array increases the integration time of a line by a factor of 256, which allows longer detector integration dwell time on each pixel and thus smaller light collecting optics.

List of Units and Nomenclature

Data cube: Multiple images of a scene in many spectral bands. The spectra of each pixel can be obtained by plotting the spectral value of the same pixel in each spectral band image.

Pixel: An image point (or small area) defined by the instantaneous field-of-view of either the optics, entrance aperture or slit, and the detector area, or by some combination of these components. A pixel is the smallest element in a scene that is resolved by the imaging system.

Instantaneous field-of-view (IFOV): The IFOV is used in describing scanners to define the size of the smallest field-of-view (usually in milliradians or microradians) that can be resolved by a scanner system.

Field-of-view: The total size in angular dimensions of a scanner or imager.

Hyperspectral: A data cube with many (>48) spectral bands.

Line scanner: An optical/mechanical system that scans one pixel at a time along a line of a scene.

Whisk broom: An optical/mechanical system that scans two or more lines at a time.

Push broom: An optical system that images a complete line at a time.

See also

Diffraction: Diffraction Gratings; Fraunhofer Diffraction. Fiber Gratings. Geometrical Optics: Prisms. Imaging: Hyperspectral Imaging; Interferometric Imaging. Interferometry: Overview. Modulators: Acousto-Optics. Optical Materials: Color Filters and Absorption Glasses. Semiconductor Materials: Dilute Magnetic Semiconductors; Group IV Semiconductors, Si/SiGe; Large Gap II–VI Semiconductors; Modulation Spectroscopy of Semiconductors and Semiconductor Microstructures. Spectroscopy: Fourier Transform Spectroscopy; Raman Spectroscopy.

Further Reading

Baselow R, Silvergate P, Rappaport W, et al. (1992) HYDICE (Hyperspectral Digital Imagery Collection Experiment) Instrument Design. International Symposium on Spectral Sensing Research.
Bianchi R, Marino CM and Pignatti S. Airborne Hyperspectral Remote Sensing in Italy. Pomezia (Roma), Italy: CNR, Progetto LARA, 00040.
Borough HC, Batterfield OD, Chase RP, Turner RW and Honey FR (1992) A Modular Wide Field of View Airborne Imaging Spectrometer Preliminary Design Concept. International Symposium on Spectral Sensing Research.
Carmer DC, Horvath R and Arnold CB (1992) M7 Multispectral Sensor Testbed Enhancements. International Symposium on Spectral Sensing Research.
Hackwell JA, Warren DW, Bongiovi RP, et al. (1996) LWIR/MWIR Imaging Hyperspectral Sensor for Airborne and Ground-Based Remote Sensing. In: Descour M and Mooney J (eds) Imaging Spectrometry II. Proc. SPIE 2819, pp. 102–107.
Hutley MC (1982) Diffraction Gratings. London: Academic Press.
James JF and Sternberg RS (1969) The Design of Optical Spectrometers. London: Chapman and Hall.
Kulgein NG, Richard SP, Rudolf WP and Winter EM (1992) Airborne Chemical Vapor Detection Experiment. International Symposium on Spectral Sensing Research.
Lucey PG, Williams T, Horton K, Hinck K and Buchney C (1994) SMIFTS: A Cryogenically Cooled, Spatially Modulated, Imaging Fourier Transform Spectrometer for Remote Sensing Applications. International Symposium on Spectral Sensing Research.
Mourales P and Thomas D (1998) Compact Low-Distortion Imaging Spectrometer for Remote Sensing. SPIE Conference on Imaging Spectrometry IV 3438: 31–37.
Pagano TS (1992) Design of the Moderate Resolution Imaging Spectrometer – NADIR (MODIS-N). American Society of Photogrammetry & Remote Sensing, pp. 374–385.
Stanich CG and Osterwisch FG (1994) Advanced Operational Hyperspectral Scanners MIVIS and AHS. First International Airborne Remote Sensing Conference and Exhibition, Strasbourg, France: Environmental Research Institute of Michigan.
Sun X and Baker JJ (2001) A High Definition Hyperspectral Imaging System. Proceedings, Fifth International Airborne Remote Sensing Conference and Exhibition, Veridian.
Torr MR, Baselow RW and Torr DG (1982) Imaging Spectroscopy of the Thermosphere from the Space Shuttle. Applied Optics 21: 4130.
Torr MR, Baselow RW and Mounti J (1983) An imaging spectrometric observatory for spacelab. Astrophysics and Space Science 92: 237.
Vane G, Green RO, Chrien TG, Enmark HT, Hansen EG and Porter WM (1993) The Airborne Visible/Infrared Imaging Spectrometer (AVIRIS). Remote Sensing of the Environment 44: 127–143.
Warren CP, Anderson RD, Velasco A, Lin PP, Lucas JR and Speer BA (2001) High Performance VIS and SWIR Hyperspectral Imagers for Airborne Applications. Fifth International Conference and Exhibition, Veridian.
Wolfe WL (1997) Introduction to Imaging Spectrometers. SPIE Optical Engineering Press.
Wolfe WL and Zisfis GJ (eds) (1985) The Infrared Handbook, revised edition. Washington, DC: Office of Naval Research, Department of the Navy.
Zywicki RW, More KA, Holloway J and Witherspoon N (1994) A Field Portable Hyperspectral Imager: The Large Area Fast Spectrometer. Proceedings of the International Symposium on Spectral Sensing Research.

336 INSTRUMENTATION / Telescopes
Telescopes
M M Roth, Astrophysikalisches Institut Potsdam, Potsdam, Germany

© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

In a broad sense, telescopes are optical instruments which provide an observer with an improved view of a distant object, where improved may be defined in terms of magnification, angular resolution, and light collecting power. Historically, the invention of the telescope was a breakthrough with an enormous impact on the fundamental sciences (physics, astronomy) and, as a direct consequence, on philosophy, but also on other technological, economic, political, and military developments. It seems that the first refractive telescopes were built in the Netherlands near the end of the sixteenth century by Jan Lipperhey, Jakob Metius, and Zacharias Janssen. Based on reports of those first instruments, Galileo Galilei built his first telescope, which was later named after him, and made his famous astronomical observations of sunspots, the phases of Venus, Jupiter's moons, the rings of Saturn, and his discovery of the nature of the Milky Way as an assembly of very many stars. The so-called astronomical telescope was invented by Johannes Kepler, who in 1611 also published the theory of this telescope in Dioptrice. The disturbing effects of spherical and chromatic aberrations were soon realized by astronomers, opticians, and other users, but it was not before Moor Hall (1729) and John Dollond (1758) that at least the latter flaw was corrected by introducing the achromat. Later, the achromat was significantly improved by Peter Dollond, Jesse Ramsden, and Josef Fraunhofer. The catadioptric telescope was probably introduced by Leonard Digges (1571) or Nicolas Zucchius (1608). In 1671, Sir Isaac Newton was the first to use a reflector for astronomical observations. Wilhelm Herschel improved this technique and began in 1766 to build much larger telescopes with mirror diameters up to 1.22 m. When, in the nineteenth century, the conventional metal mirror was replaced by glass, it was Léon Foucault (1819–1868) who applied a silver coating in order to improve the reflectivity of its surface.

Modern refractors can be built with superb apochromatic corrections, and astronomical reflectors have advanced to mirror diameters of up to 8.4 m (monolithic mirror) and 10 m (segmented mirror). While the classical definition of a telescope involves an observer, i.e., the eye, in modern times electronic detectors have often replaced human vision. In the following, we will therefore use a more general definition of a telescope as an optical instrument whose purpose is to image a distant object, either in a real focal plane, or in an afocal projection for observation by eye.

Basic Imaging Theory

In the most simple case, a telescope may be constructed from a single lens or a single mirror,
$$\frac{n'h}{s'} - \frac{nh}{s} = (n' - n)\,\frac{h}{r} \qquad [3]$$

we obtain finally:

$$\frac{n'}{s'} - \frac{n}{s} = \frac{n' - n}{r} \qquad [4]$$

For an object at infinity, where $s = \infty$, $s'$ becomes the focal distance $f'$:

$$f' = r\,\frac{n'}{n' - n} \quad \text{and} \quad \tilde{f} = -r\,\frac{n}{n' - n} \qquad [5]$$

For a thin lens with surface radii $r_1$ and $r_2$, the focal length becomes:

$$f' = \frac{1}{n - 1}\,\frac{r_1 r_2}{r_2 - r_1} \qquad [8]$$

Reflection at a Spherical Surface

Let us now consider the other simple case of imaging an object O into O′ by reflection from a concave spherical surface, as depicted in Figure 2. In the paraxial case, we have:

$$\sigma = y/s, \quad i = \varphi - \sigma, \quad \sigma' = y/s', \quad i' = \varphi - \sigma', \quad \varphi = y/r \qquad [9]$$
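Eqns [4], [5], and [8] are easy to spot-check numerically; the indices and radii below are assumed example values (glass n′ = 1.5 behind air n = 1.0, and a biconvex lens with |r| = 100 mm):

```python
# Verify the single-surface focal distance of eqn [5] against eqn [4]
# with the object at infinity, and the thin-lens focal length of eqn [8].
n, n2, r = 1.0, 1.5, 100.0
f_prime = r * n2 / (n2 - n)                # eqn [5]: 300 mm
s_prime = n2 / ((n2 - n) / r)              # eqn [4] with s -> infinity
print(f_prime, s_prime)                    # both 300 mm

r1, r2 = 100.0, -100.0                     # biconvex lens
f_lens = r1 * r2 / ((n2 - n) * (r2 - r1))  # eqn [8]: 100 mm
print(f_lens)
```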
Figure 1 Refraction at a spherical surface between media with refractive indices n and n′.
$$\frac{1}{s'} - \frac{1}{s} = -\frac{2}{r} \qquad [10]$$

Again, with $s = \infty$, the focal length becomes:

$$f' = -\frac{r}{2} \qquad [11]$$

In what follows, we will now assume that the distance of the object is always large ($\infty$), which, in general, is the situation when a telescope is employed. The image is formed at a surface defined by O′, ideally a focal plane (paraxial case). We will consider deviations from the ideal case below.

Simple Telescopes

As we have observed above, a telescope is in principle nothing but an optical system which images an object at infinity. In the most simple case, a lens with positive power or a concave mirror will satisfy this condition. The principle of a refractor and of a reflecting telescope is shown in Figures 3 and 4. In both cases, parallel light coming from the distant object enters through the entrance pupil, experiences refraction (reflection) at the lens (mirror), respectively, and converges to form a real image in the focal plane. Two point sources, for example two stars in the sky, separated by an angle δ, will be imaged in the focal plane as two spots with separation Δy. When f is the focal length, the two are related by:

$$\tan\delta = \frac{\Delta y}{f} \qquad [12]$$

The so-called plate scale m of the telescope, in units of arcsec/mm, is given by:

$$m = 3600\,\arctan\left(\frac{1}{f}\right) \quad (f\ \text{in mm}) \qquad [13]$$

In the case of the refractor, we have also shown an eyepiece, illustrating that historically the instrument was invented to enhance human vision. The eye is located at the position of the exit pupil, thus receiving parallel light, i.e., observing an object at infinity, however (i) with a flux which is increased by a factor A, given by the ratio of the area of the telescope aperture with diameter D to the area of the eye's pupil d, with A = (D/d)²; and (ii) with a magnification G of the apparent angle between two separate objects at infinity, which is given by the ratio of the focal lengths of the objective and the eyepiece:

$$G = \frac{f_{\mathrm{obj}}}{f_{\mathrm{oc}}} \qquad [14]$$

The magnification can also be expressed as the ratio of the diameters of the entrance pupil and the exit pupil:

$$G = \frac{D_{\mathrm{enpup}}}{D_{\mathrm{expup}}} \qquad [15]$$

The first astronomical telescopes with mirrors were equipped with an eyepiece for visual observation. The example in Figure 4 shows a configuration which is more common nowadays for professional astronomical observations, using a direct imaging detector in the focal plane (photographic plate, electronic camera).
camera).
Aberrations
Reflector Surfaces
Figure 3 Refractor, astronomical telescope for visual observations with eyepiece (Kepler telescope).
conic constant:

$$y^2 - 2Rz + (1 + C)z^2 = 0 \qquad [19]$$

and the following characteristics:

$$f = z + \frac{y}{\tan(2i)}\,, \quad\text{where}\quad \frac{y}{f - z} = \tan(2i) \qquad [20]$$

we obtain finally:

$$f = \frac{R}{2} - \frac{1 + C}{4R}\,y^2 - \frac{(1 + C)(3 + C)}{16R^3}\,y^4 \qquad [27]$$

It is sometimes more convenient to consider the angular spherical aberration, since a telescope naturally measures angles between objects at infinity. To this end, we compare an arbitrary reflector surface according to eqn [19] with a reference paraboloid, which is known to be free of on-axis spherical aberration (Figure 7). For any given ray, we observe how the reflected ray under study is tilted. Again, we can use the slope of the tangent to determine the change of angles $i \to i_p$:

$$A_{\mathrm{ang,sph}} = 2\left(\frac{dz}{dy} - \frac{dz_p}{dy}\right) = 2\,\frac{d}{dy}\,\Delta z \qquad [31]$$

It is, therefore, sufficient to consider the difference in $z$ between the surface under study and the reference paraboloid.
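The comparison against the reference paraboloid can be sketched numerically from eqns [19] and [31]; the radius of curvature and ray height below are assumed values, and the derivative in [31] is taken by central differences:

```python
import math

def sag(y, R, C):
    """Surface sag z(y) solving eqn [19]: y^2 - 2Rz + (1+C)z^2 = 0."""
    if C == -1:                         # paraboloid: z = y^2 / (2R)
        return y * y / (2 * R)
    return (R - math.sqrt(R * R - (1 + C) * y * y)) / (1 + C)

def ang_sph(y, R, C, h=1e-6):
    """Angular spherical aberration, eqn [31], relative to the paraboloid."""
    dz = lambda yy: sag(yy, R, C) - sag(yy, R, -1)
    return 2 * (dz(y + h) - dz(y - h)) / (2 * h)

R = 2000.0                              # assumed radius of curvature (mm)
print(ang_sph(100.0, R, 0.0))           # sphere: nonzero aberration
print(ang_sph(100.0, R, -1.0))          # paraboloid: identically zero
```

The paraboloid (C = −1) returns exactly zero, as it must, while the sphere (C = 0) shows the residual tilt that grows with ray height y.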
$$A_{\mathrm{ang},f} = 3a_1\,\frac{y^2 Q}{R^2} + 2a_2\,\frac{y Q^2}{R} + a_3 Q^3 \qquad [34]$$

$a_1$: coma; $a_2$: astigmatism; $a_3$: distortion.

$$\frac{df'}{dn} = -\frac{1}{(n_e - 1)^2}\,\frac{r_1 r_2}{r_2 - r_1} \qquad [35]$$

With

$$\frac{r_1 r_2}{r_2 - r_1} = f'_e\,(n_e - 1) \qquad [36]$$

we get:

$$df' = -\frac{dn}{n_e - 1}\,f'_e \qquad [37]$$

$$f'_{F'} - f'_{C'} = -\frac{n_{F'} - n_{C'}}{n_e - 1}\,f'_e = -\frac{f'_e}{\nu_e} \qquad [38]$$

$$\nu_e = \frac{n_e - 1}{n_{F'} - n_{C'}} \qquad [39]$$

n     n-th zero of J1     r′D/(2f′λ)
1     3.83                0.61
2     7.02                1.12
3     10.2                1.62
4     13.3                2.12

Figure 10 Different types of refractor objectives, from left to right: classical C achromat, Fraunhofer E achromat with air gap, AS achromat, APQ apochromat.
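Eqns [37]–[39] condense to a few lines; the indices below are assumed BK7-like values at the e, F′, and C′ lines, not data for a specific catalog glass:

```python
# Chromatic focal shift of a single thin lens, eqns [37]-[39].
n_e, n_F, n_C = 1.5187, 1.5243, 1.5163  # assumed BK7-like indices
nu_e = (n_e - 1) / (n_F - n_C)          # Abbe number, eqn [39]
f_e = 1000.0                            # focal length at the e line (mm)
shift = -f_e / nu_e                     # f'_F' - f'_C', eqn [38]
print(round(nu_e, 1), round(shift, 2))  # ~64.8, ~-15.42 mm
```

A 1 m singlet of this glass thus focuses the blue F′ line some 15 mm shorter than the red C′ line, which is the defect the achromat doublet is designed to cancel.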
that it also possesses an air gap, however with an improved chromatic aberration correction in comparison to the Fraunhofer objective. The lenses show a more pronounced curvature and are more difficult to manufacture. The beneficial use of KzF2 glass is an example of the successful research for the development of new optical media, which was pursued near the end of the nineteenth century in a collaboration between Otto Schott and Ernst Abbe. The last example is a triplet, employing ZK2, CaF2, and ZK2. This APQ type of objective is an apochromat, whose chromatic aberration vanishes at three wavelengths. It is peculiar for its use of a crystal (calcium fluoride, CaF2) as the optical medium of the central lens. The manufacture of the CaF2 lens is critical and expensive, but it provides for an excellent correction of the chromatic and spherical aberration of this high quality objective. The triplet is also difficult to manufacture because of the coefficient of thermal expansion, which differs roughly by a factor of two between CaF2 and the glass. In order to avoid rupture of the cemented lens group, some manufacturers have used oil immersion to connect the triplet. Since CaF2 not only has excellent dispersion properties to correct for chromatic aberration, but also exhibits excellent transmission beyond the visual wavelength regime, this type of lens is particularly useful for applications in the blue and UV.

Finally, Figure 11 shows the complete layout of an astrograph, which is a telescope for use with photographic plates. The introduction of this type of instrument meant an important enhancement of astrophysical research by making available a significantly more efficient method, compared to the traditional visual observation of individual stars, one by one. Using photographic plates, it became possible to measure the position and brightness of hundreds and thousands of stars, where previously one could investigate tens. Typical astrographs were built with apertures between 60 and 400 mm, and a field-of-view as large as 15 degrees. Despite the growing importance of space-based observatories for astrometry, some astrographs are still in use today.

Basic Oculars

The ocular plays an important role for the final appearance to the observer, with an effect on magnification, image quality, chromatism, field of view, and other properties of the entire system. From a large variety of different ocular types, let us briefly consider a few simple examples.

The first eyepiece in Figure 12 is the one of the Kepler telescope, which was already introduced in Figure 3. It has a quite simple layout with a single lens of positive power. The lens is located after the telescope focal plane at a distance which is equal to the ocular focal length foc. The telescope and ocular focal planes coincide, and as a result, the outgoing beam is collimated, forming the system exit pupil at a distance foc. This is where ideally the observer's eye is located. The focal plane creates a real image, which is a convenient location for a reticle. The Kepler telescope is, therefore, a good tool for measurement and alignment purposes. We note that the image is inverted, which is relatively unimportant for a measuring telescope and astronomy, but very annoying for terrestrial use.

Reflectors
Our third example is the Gregory reflector, which also is a 2-mirror system, however with a concave elliptical secondary mirror. The correction of this telescope is similar, but somewhat inferior, to the one of the Cassegrain. The use of a concave M2 is advantageous for manufacture and testing. Nevertheless, the excessive length of the system has precluded the use of this layout for large astronomical telescopes, except for stationary beams, for example, in solar observatories.

The last example is the design put forward by Bernhard Schmidt (1879–1935), who wanted to create a telescope with an extremely wide field-of-view. The first telescope of his new type achieved a field-of-view of 16° – something which had not been feasible with previous techniques. The basic idea is to use a fast spherical mirror, and to correct for spherical aberration by means of an aspherical corrector plate in front of the mirror, which also forms the entrance pupil of the telescope. The focal surface is strongly curved, making it necessary to employ a detector with the same radius of curvature (bent photographic plates), or to compensate the curvature by means of a field lens immediately in front of the focus. The presence of the corrector plate and, if used, the field flattener, gives rise to significant chromatism. Nevertheless, the Schmidt telescope was a very successful design and dominated wide-angle survey work in astronomy for many decades.

Applications

General Purpose Telescopes

According to the original meaning of the Greek 'tele-skopein', the common conception of the telescope is the one of an optical instrument which provides an enlarged view of distant objects. In addition, a large telescope aperture, compared to the eye, improves night vision and observations of poorly illuminated scenes. There are numerous applications in nautical, security, military, hunting, and other professional areas. For all of these applications, an intuitive vision is important, essentially ruling out the classical astronomical telescope (Kepler telescope, see Figure 3), which produces an inverted image. Binoculars with Porro prisms are among the most popular instruments with upright vision, presenting a compact outline due to the folded beam geometry (Figure 14), but also single tube telescopes with re-imaging oculars are sometimes used.
Metrology
measuring telescope, such that an image of the collimator mask appears in the telescope focal plane. Also, the telescope is equipped with a mask in its focal plane, usually with a precision scale or another pattern for convenient alignment with the collimator mask. If now the collimator is tilted by a small angle δ against the optical axis of the telescope, the angle of incidence in the measuring telescope changes by this same amount, giving rise to a shift of the image by Δx = f tan δ. Note that the measurement is completely insensitive to parallel shifts between the telescope and the collimator, since it is parallel light which is transmitted between the two instruments.

Another special device, which combines the former two instruments into one, is the autocollimator. The optical principle is shown in Figure 16: in principle, the autocollimator is identical to a measuring telescope, except that a beamsplitter is inserted between the focal plane and the objective. The beamsplitter is used to inject light from the same kind of light source device as in the normal collimator. The axis of the folded light source beam and of the illuminated collimator mask are aligned to match the focal plane mask of the telescope section: the optical axis of the emerging beam is identical to the optical axis of the telescope. Under this condition, the reflected light from a test object will be exactly centered on the telescope focal plane mask when it is perfectly aligned to normal incidence. Normally, a high-quality mirror is attached to the device under study, but sometimes the test object itself possesses a plane surface of optical quality, which can be used for the measurement.

Figure 16 Autocollimator.

Figure 17 shows schematically an example of such a measurement, where a mirror is mounted on a linear stage carriage for the purpose of measuring the flexure of the stage to very high precision. With a typical length of ≈ 1 m, this task is not a trivial problem for a mechanical measurement.
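The collimator-telescope relation Δx = f tan δ inverts directly to an angle measurement; the sketch below uses an assumed 500 mm focal length, and notes the k-fold gain of multiple autocollimation described next:

```python
import math

# A collimator tilt delta shifts the telescope image by dx = f tan(delta);
# inverting the measured shift recovers the angle (f = 500 mm assumed).
f = 500.0
delta = math.radians(0.01)                # 0.01 deg = 36 arcsec tilt
dx = f * math.tan(delta)                  # image shift in mm
measured = math.degrees(math.atan(dx / f)) * 3600
print(round(dx * 1000, 2), "um;", round(measured), "arcsec")

# Multiple autocollimation: after k reflections the deflection is k*delta,
# improving angular resolution k-fold for the same readout precision.
k = 5
print(round(k * measured), "arcsec")
```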
Figure 18 The 80 cm + 60 cm 'Great Refractor' at the Astrophysical Observatory Potsdam. The correction of the 80 cm objective of this telescope led Johannes Hartmann to the development of the 'Hartmann Test', which was the first quantitative method to measure aberrations. This method is still in use today, and has been further developed to become the 'Shack–Hartmann' technique of wavefront sensing. Courtesy of Astrophysikalisches Institut Potsdam.
By measuring with the autocollimator the tilt of the mirror consecutively for a sufficiently large number of points x_i along the rail, we can now plot the slope of the rail at each point x_i as tan δ:

$$\frac{dy}{dx} = \tan\delta \qquad [42]$$

Integrating the series of slope values reproduces the shape of the rail y = f(x).

Other applications include multiple autocollimation, wedge measurements, and right angle measurements.

Multiple autocollimation is achieved by inserting a fixed plane-parallel and semi-transparent mirror between the autocollimator and the mirror under study. Light coming from the test mirror will be partially transmitted through the additional mirror towards the autocollimator, and partially reflected back to the test mirror, and so forth. After k reflections, a deflection of kδ is measured, where δ is the tilt of the test mirror.

The wedge angle α of a plane-parallel plate with refractive index n_w is easily measured by observing the front and back surface reflections from the plate, keeping in mind that the beam bouncing back from the rear will also experience refraction in the plate. If δ is the measured angle difference between the front and rear beams, then α = δ/(2n_w).
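The slope-integration procedure of eq. [42] is easy to automate. The sketch below (function names are illustrative, not from the source) integrates a series of autocollimator tilt readings with the trapezoidal rule to recover the rail profile, and also encodes the wedge-angle relation α = δ/(2n_w):

```python
import math

def rail_profile(x, delta_deg):
    """Reconstruct rail height y(x) from autocollimator tilt readings.

    x         : positions x_i along the rail (meters), increasing
    delta_deg : measured tilt delta at each x_i (degrees)

    Integrates dy/dx = tan(delta) (eq. [42]) with the trapezoidal rule.
    """
    slopes = [math.tan(math.radians(d)) for d in delta_deg]
    y = [0.0]
    for i in range(1, len(x)):
        dx = x[i] - x[i - 1]
        y.append(y[-1] + 0.5 * (slopes[i - 1] + slopes[i]) * dx)
    return y

def wedge_angle(delta, n_w):
    """Wedge angle of a plane-parallel plate: alpha = delta / (2 * n_w)."""
    return delta / (2.0 * n_w)
```

A perfectly straight rail (all tilts zero) reproduces a flat profile, which is a convenient sanity check of the integration.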
348 INSTRUMENTATION / Telescopes
Two surfaces of an object can be tested for orientation at right angles by attaching a mirror to surface 1 and observing the mirror with the fixed autocollimator. In a second step, an auxiliary pentaprism is inserted into the beam, deflecting the light towards surface 2, where the mirror is attached next. If a precision pentaprism is used, any deviation from right angles between the two surfaces is seen as an offset between steps 1 and 2.

These are only a few examples of the very many applications for alignment and angle measurements in optics and precision mechanics. Without going into much detail, we just mention that the theodolite and sextant also make use of the same principle of angular measurement, however without the need of a collimator or autocollimation.

Astronomical Telescopes

Despite its importance as a measurement and testing device for optical technology and manufacture as described in the preceding paragraph, it is the original invention and subsequent improvement of the telescope for astronomy which is probably the most fascinating application of any optical system known to the general public. The historical development of the astronomical telescope spans from the 17th century Galilean refractor with an aperture of just a few centimeters in diameter, over the largest refractors built for the Lick and Yerkes observatories with objectives of up to ≈1 m in diameter (see also Figures 18–20), and the 4 m-class reflectors of the second half of the 20th century, e.g. the Hale Observatory 5 m, the Kitt Peak and Cerro Tololo 4 m, the La Silla 3.6 m, or the La Palma 4.2 m telescopes (to name only a few), to the current state of the art of 8–10 m class telescopes, prominent examples being the two Keck Observatory 10 m telescopes on Mauna Kea (Hawaii), or the ESO Very Large Telescope (Paranal, Chile), consisting of four identical 8.2 m telescopes (Figures 21–24).

This development has always been driven by the discovery of fainter and fainter celestial objects, whose quantitative study is the subject of modern astrophysics, involving many specialized disciplines such as atomic/quantum/nuclear physics, chemistry, general relativity, electrodynamics, hydrodynamics, magneto-hydrodynamics, and so forth. In fact, cosmic objects are often referred to as ‘laboratories’
Figure 21 The Very Large Telescope Observatory (VLT) on Mt Paranal, Chile, with the VLT Interferometer (VLTI). The four large enclosures contain the active-optics-controlled VLT unit telescopes, which are operated either independently of each other, or in parallel to form the VLTI. The structures on the ground are part of the beam combination optics/delay lines. Courtesy of European Southern Observatory.

Figure 23 8.2 m thin primary mirror for the VLT. The VLT design is based on thin meniscus mirrors, made out of ZERODUR, which have low thermal inertia and which are always kept at ambient temperature, thus virtually removing the effect of thermal turbulence (mirror seeing). Seeing is the most limiting factor of ground-based optical astronomy. The optimized design of the VLT telescopes has introduced a significant leap forward to a largely improved image quality from the ground and, consequently, improved sensitivity. The thin mirrors are constantly actively controlled in their mounting cells, using numerous piston actuators on the back surface of the mirror. Another key technology under development is ‘adaptive optics’, a method to compensate atmospheric wavefront deformations in real time, thus obtaining diffraction-limited images. Courtesy of European Southern Observatory.
Telescope (Figure 25), down to milli-arcsec resolution of the Keck and VLT Interferometers. The development of the aforementioned ELTs is thought to be only meaningful if operated in a diffraction-limited mode, using the emerging technique of adaptive optics, which is a method to correct for wavefront distortions caused by atmospheric turbulence (image blur) in real time.

The combination of large optical/infrared telescopes with powerful detectors and focal plane instruments (see Instrumentation: Astronomical Instrumentation) has been a prerequisite to making astrophysics one of the most fascinating disciplines of fundamental sciences.

Figure 24 Top: spin-cast 8.4 m honeycomb mirror #1 for the Large Binocular Telescope, the largest monolithic mirror of optical quality in the world, shown in the process of applying a protective plastic coating to the end-polished surface. Bottom: LBT mirror #1 finally mounted in its mirror cell. Courtesy of Large Binocular Telescope Observatory.

See also

Geometrical Optics: Aberrations. Imaging: Adaptive Optics. Instrumentation: Astronomical Instrumentation.

Further Reading

Bely PY (ed.) (2003) The Design and Construction of Large Optical Telescopes. New York: Springer.
Born M and Wolf E (1999) Principles of Optics, 7th edn. Cambridge: Cambridge University Press.
Haferkorn H (1994) Optik, 3rd edn. Leipzig: Barth Verlagsgesellschaft.
Korsch D (1991) Reflective Optics. San Diego: Academic Press.
Laux U (1999) Astrooptik, 2nd edn. Heidelberg: Verlag Sterne und Weltraum.
McCray WP (2004) Giant Telescopes: Astronomical Ambition and the Promise of Technology. Cambridge: Harvard University Press.
Osterbrock DE (1993) Pauper and Prince: Ritchey, Hale, and Big American Telescopes. Tucson: University of Arizona Press.
Riekher R (1990) Fernrohre und ihre Meister, 2nd edn. Berlin: Verlag Technik GmbH.
Rutten HGJ and van Venrooij MAM (1988) Telescope Optics. Richmond, VA: Willmann-Bell.
Schroeder DJ (1987) Astronomical Optics. San Diego: Academic Press.
Welford WT (1974) Aberrations of the Symmetrical Optical System. London: Academic Press.
Wilson RN (1996) Reflecting Telescope Optics I: Basic Design Theory and its Historical Development. Berlin: Springer.
Wilson RN (1999) Reflecting Telescope Optics II: Manufacture, Testing, Alignment, Modern Techniques. Berlin: Springer.
INTERFEROMETRY / Overview 351
INTERFEROMETRY
Contents
Overview
Gravity Wave Detection
Phase-Measurement Interferometry
White Light Interferometry
Two-Beam Interference
When two quasi-monochromatic waves interfere, the
irradiance of the interference pattern is given by
$$I = I_1 + I_2 + 2\sqrt{I_1 I_2}\,\cos[\phi] \qquad [1]$$
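Eq. [1] can be checked numerically; a minimal sketch (the helper names are illustrative, not from the source):

```python
import math

def two_beam_irradiance(i1, i2, phi):
    """Two-beam interference, eq. [1]: I = I1 + I2 + 2*sqrt(I1*I2)*cos(phi)."""
    return i1 + i2 + 2.0 * math.sqrt(i1 * i2) * math.cos(phi)

def visibility(i1, i2):
    """Fringe visibility (Imax - Imin)/(Imax + Imin) = 2*sqrt(I1*I2)/(I1 + I2)."""
    return 2.0 * math.sqrt(i1 * i2) / (i1 + i2)
```

For equal beam intensities the visibility is unity: the fringes run from twice the sum intensity down to zero.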
(Figure 7). The fringes are fringes of equal thickness. It is sometimes useful to tilt the reference surface a little to get several nearly straight interference fringes. The relationship between surface height error and fringe deviation from straightness is the same as for the Twyman–Green interferometer.

Mach–Zehnder Interferometer

A Mach–Zehnder interferometer is sometimes used to look at samples in transmission. Figure 8 shows a typical Mach–Zehnder interferometer for looking at a sample in transmission.

Murty Lateral Shearing Interferometer

Lateral shearing interferometers compare a wavefront with a shifted version of itself. While there are many different lateral shear interferometers, one that works very well with a nearly collimated laser beam is the Murty shearing interferometer shown in Figure 9. The two laterally sheared beams are produced by reflecting a coherent laser beam off a plane-parallel plate. The Murty interferometer can be used to measure the aberrations in a lens, or to determine whether the beam leaving the lens is collimated. If the beam incident upon the plane-parallel plate is collimated, a single fringe results, while if the beam is not perfectly collimated, straight equi-spaced fringes result, where the number of fringes gives the departure from collimation.
interferometers, one that works very well with a nearly collimated beam is shown in Figure 10. It is essentially a Mach–Zehnder interferometer where, in one arm, the beam is magnified and, in the other arm, the beam is demagnified. Interference fringes result in the region of overlap. The sensitivity of the interferometer depends upon the amount of radial shear: if the two interfering beams are approximately the same size there is little sensitivity, while if the two beams greatly differ in size there is large sensitivity. Sometimes radial shear interferometers are used to test the quality of optical components.

Scatterplate Interferometer

A scatterplate interferometer, invented by Jim Burch in 1953, is one of the cleverest interferometers. Almost any light source can be used and no high-quality optics are required to do precision interferometry. The critical element in the interferometer is the scatterplate, which looks like a piece of ground glass; the important feature is that the scattering points are arranged so the plate has inversion symmetry, as shown in Figure 11. The light that is scattered the first time through the plate, and unscattered the second time through the plate, interferes with the light unscattered the first time and scattered the second time. The resulting interference fringes show errors in the mirror under test. This type of interferometer is insensitive to vibration because it is what is called a common path interferometer, in that the test beam (scattered–unscattered) and the reference beam (unscattered–scattered) travel along almost the same paths in the interferometer.

Smartt Point Diffraction Interferometer

Another common path interferometer used for the testing of optics is the Smartt point diffraction interferometer (PDI), shown in Figure 12. In the PDI the beam of light being measured is focused onto a partially transmitting plate containing a pinhole. The pinhole removes the aberration from the light passing through it (reference beam), while the light passing through the plate (test beam) does not have the aberration removed. The interference of these two beams gives the aberration of the lens under test.
Sagnac Interferometer
The Sagnac interferometer has beams traveling in
opposite directions, as shown in Figure 13. The
interferometer is highly stable and easy to align. If the
interferometer is rotated with an angular velocity
there will be a delay between the transit times of
the clockwise and the counterclockwise beams. For
this reason, the Sagnac interferometer is used in
laser gyros.

Figure 10 Radial shear interferometer.
Fiber Interferometers

Fiber interferometers were first used for rotation sensing by replacing the ring cavity in the Sagnac interferometer with a multiloop made of a single-mode fiber. Fiber-interferometer rotation sensors are attractive because they are small and low-cost.

Since the optical path in a fiber is affected by its temperature and also changes when the fiber is stretched, fiber interferometers can be used as sensors for mechanical strains, temperature, and pressure. They can also be used for the measurement of magnetic fields by bonding the fiber to a magnetostrictive element. Electric fields can be measured by bonding the fiber to a piezoelectric film.

Multiple Beam Interferometers

In general, multiple beam interferometers provide sharper interference fringes than a two-beam interferometer. If light is reflected off a plane-parallel plate there are multiple reflections. If the surfaces of the plane-parallel plate have a low reflectivity, the higher-order reflections have a low intensity, the multiple reflections can be ignored, and the resulting interference fringes have a sinusoidal intensity profile; but if the surfaces have a high reflectivity, the fringes become sharper. Figure 14 shows the profile of the fringes for both the reflected and the transmitted light for different values of surface reflectivity. Multiple beam interference fringes are extremely useful for measuring the spectral content of a source. One such multiple beam interferometer, often used for measuring the spectral distribution of a source, is the Fabry–Perot interferometer, which consists of two plane-parallel plates separated by an air space, as shown in Figure 15.

Phase-Shifting Interferometry

The equation for the intensity resulting from two-beam interference contains three unknowns: the two individual intensities and the phase difference between the two interfering beams. If three or more measurements are made of the intensity of the two-beam interference as the phase difference between the two beams is varied in a known manner, all three unknowns can be determined. This technique, called phase-shifting interferometry, is an extremely powerful tool for measuring phase distributions. Generally the phase difference is changed by 90 degrees between consecutive measurements of the interference intensity, in which case the three intensities can be written as

$$I_a(x,y) = I_1 + I_2 + 2\sqrt{I_1 I_2}\,\cos[\phi(x,y) - 90^\circ]$$
$$I_b(x,y) = I_1 + I_2 + 2\sqrt{I_1 I_2}\,\cos[\phi(x,y)]$$
$$I_c(x,y) = I_1 + I_2 + 2\sqrt{I_1 I_2}\,\cos[\phi(x,y) + 90^\circ]$$

and the phase distribution is given by

$$\phi(x,y) = \arctan\!\left(\frac{I_a(x,y) - I_c(x,y)}{-I_a(x,y) + 2I_b(x,y) - I_c(x,y)}\right)$$

Phase-shifting interferometry has greatly enhanced the use of interferometry in metrology since it provides a fast, low-noise way of getting interferometric data into a computer.

Vibration-Insensitive Phase-Shifting Interferometer

While phase-shifting interferometry has greatly enhanced the use of interferometry in metrology, there are many applications where it cannot be used because of the environment, especially vibration. One recent phase-shifting interferometer that works well in the presence of vibration is shown in Figure 16. In this interferometer, four phase-shifted frames of data are captured simultaneously. The test and reference beams have orthogonal polarization. After the test and reference beams are combined they are passed through a holographic element to produce four identical beams. The four beams are then transmitted through wave retardation plates to cause 0, 90, 180, and 270 degrees of relative phase shift.

See also

Coherence: Speckle and Coherence. Detection: Fiber Sensors. Holography, Techniques: Computer-Generated Holograms; Digital Holography; Holographic Interferometry. Imaging: Adaptive Optics; Interferometric Imaging; Wavefront Sensors and Control (Imaging Through Turbulence). Interferometry: Phase-Measurement Interferometry; White Light Interferometry; Gravity Wave Detection. Microscopy: Interference Microscopy. Tomography: Optical Coherence Tomography.

Further Reading

Candler C (1951) Modern Interferometers. London: Hilger and Watts.
Francon M (1966) Optical Interferometry. New York: Academic Press.
Hariharan P (1991) Selected Papers on Interferometry. SPIE, vol. 28.
Hariharan P (1992) Basics of Interferometry. San Diego: Academic Press.
Hariharan P (2003) Optical Interferometry. San Diego: Academic Press.
Hariharan P and Malacara D (1995) Selected Papers on Interference, Interferometry, and Interferometric Metrology. SPIE, vol. 110.
Malacara D (1990) Selected Papers on Optical Shop Metrology. SPIE, vol. 18.
Malacara D (1992) Optical Shop Testing. New York: Wiley.
Malacara D (1998) Interferogram Analysis for Optical Testing. New York: Marcel Dekker.
Robinson D and Reid G (1993) Interferogram Analysis. Bristol, UK: Institute of Physics.
Steel WH (1985) Interferometry. Cambridge: Cambridge University Press.
Tolansky S (1973) An Introduction to Interferometry. New York: Longman.
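The three-step phase retrieval with 90-degree steps reduces to a single arctangent per pixel; a minimal sketch using `atan2` for quadrant-correct phase (names are illustrative, not from the source):

```python
import math

def three_step_phase(ia, ib, ic):
    """Recover phi from three 90-degree phase-shifted intensities:
    tan(phi) = (Ia - Ic) / (-Ia + 2*Ib - Ic)."""
    return math.atan2(ia - ic, -ia + 2.0 * ib - ic)

# Simulate three frames with a known phase and recover it.
i1, i2, phi_true = 1.0, 0.5, 0.7
amp = 2.0 * math.sqrt(i1 * i2)
ia = i1 + i2 + amp * math.cos(phi_true - math.pi / 2)
ib = i1 + i2 + amp * math.cos(phi_true)
ic = i1 + i2 + amp * math.cos(phi_true + math.pi / 2)
```

Note that the individual intensities I1 and I2 cancel out of both numerator and denominator, which is why the three measurements suffice to isolate the phase.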
INTERFEROMETRY / Gravity Wave Detection 357
1 kHz, but the occurrence of an event such as this will be extremely rare. More events will come from novae in other galaxies, but due to the great distances and the fact that the magnitude of the gravity wave falls off as 1/r, such events will be substantially diminished in magnitude.

A Michelson interferometer, with arms aligned along the x- and y-axes, can measure small phase differences between the light in the two arms. Therefore, this type of interferometer can turn the length variations of the arms produced by a gravity wave into changes in the interference pattern of the light exiting the system. This was the basis of the idea from which modern laser interferometric gravitational radiation detectors have evolved. Imagine a gravity wave of amplitude h is incident on an interferometer. The change in the arm length will be h ~ ΔL/L₀, so in order to optimize the sensitivity it is advantageous to make the interferometer arm length L₀ as large as possible. Detectors coming on-line are attempting to measure distance displacements that are of order ΔL ~ 10⁻¹⁸ m or smaller, less than the size of an atomic nucleus! If the interferometers can detect such distance displacements, and directly detect gravitational radiation, it will be one of the most spectacular accomplishments in experimental physics. The first implementation of a laser interferometer to detect a gravity wave was by Forward, who used earphones to listen to the motion of the interference signal. The engineering of signal extraction for modern interferometers is obviously far more complex.

Numerous collaborations are building and operating advanced interferometers in order to detect gravitational radiation. In the United States there is the Laser Interferometer Gravity Wave Observatory (LIGO), which consists of two 4-km interferometers located in Livingston, Louisiana and Hanford, Washington; in addition, there is a 2-km interferometer within the vacuum system at Hanford. An Italian–French collaboration (VIRGO) has a 3-km interferometer near Pisa, Italy. GEO, a German–British collaboration, is a 600-m detector near Hanover, Germany. TAMA is the Japanese 300-m interferometer in Tokyo. The Australians (AIGO) are constructing a 500-m interferometer in Western Australia. All of the detectors will be attempting to detect gravity waves with frequencies from about 50 Hz to 1 kHz.

As will be described below, there are a number of terrestrial noise sources that will inhibit the performance of the interferometric detectors. The sensitivity of detection increases linearly with interferometer arm length, which implies that there could be advantages to constructing a gravity wave detector in space. This is the goal of the Laser Interferometer Space Antenna (LISA) collaboration. The plan is to deploy three satellites in a heliocentric orbit with a separation of about 5×10⁶ km. The launch of LISA is planned for around 2015, and the detector will be sensitive to gravity waves within the frequency range of 10⁻³ to 10⁻¹ Hz. Due to the extremely long baseline, LISA is not strictly an interferometer, as most light will be lost as the laser beams expand while traveling such a great distance. Instead, the phase of the received light will be detected and used to lock the phase of the light that is re-emitted by another laser.

Joe Weber pioneered the field of gravitational radiation detection in the 1960s. He used a 1400-kg aluminum cylinder as an antenna; gravitational waves would hopefully excite the lowest-order normal mode of the bar. Since the gravity wave is a strain on space-time, experimentalists win by making their detectors longer. This provides a long-term advantage for laser interferometers, which can scale in length relatively easily. However, as of 2002, interferometers and bars (now cooled to 4 K) have comparable sensitivities.

Interferometer Styles

The Michelson interferometer is the tool to be used to detect a gravitational wave. Figure 1 shows a basic system. The beamsplitter and the end mirrors would be suspended by wires, and effectively free to move in the plane of the interferometer. The arms have lengths L1 and L2 that are roughly equal on a kilometer scale. With a laser of power P and wavelength λ incident on the beamsplitter, the light

Figure 1 A basic Michelson interferometer. The photodetector receives light exiting the dark port of the interferometer and hence the signal.
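The relation h ~ ΔL/L₀ quoted above translates a strain amplitude directly into an arm-length change; a one-line sketch (the strain value used is an illustrative assumption, not a figure from the source):

```python
def arm_displacement(h, arm_length_m):
    """Arm-length change produced by a gravity wave of strain amplitude h."""
    return h * arm_length_m

# An assumed strain of 1e-21 on a 4-km LIGO arm gives a displacement
# of order 4e-18 m, consistent with the ~1e-18 m scale quoted in the text.
dL = arm_displacement(1e-21, 4000.0)
```

This is why arm length matters: for fixed strain, a longer arm yields a proportionally larger (and hence more measurable) displacement.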
Figure 4 A power recycled Michelson interferometer with Fabry–Perot cavities in each arm. Normally light would exit the interferometer through the light port and head back to the laser. Installation of the recycling mirror with reflectivity Rr sends the light back into the system. A Fabry–Perot cavity is formed between the recycling mirror and the first mirror (R1) of the arms. For LIGO this strategy will increase the power circulating in the interferometer by a factor of 50.

Figure 5 A signal recycled and power recycled Michelson interferometer with Fabry–Perot cavities in each arm. Normally light containing the gravity wave signal would exit the interferometer through the dark port and head to the photodetector. Installation of the signal recycling mirror with reflectivity Rs sends the light back into the system. The phase of the light acquired from the gravity wave will build up at a particular frequency determined by the reflectivity Rs.
consider one’s ability to measure the relative phase between the light in the two arms. The Heisenberg uncertainty relation for light with phase φ and photon number N is ΔφΔN ~ 1. For a measurement lasting time t using laser power P and frequency f, the photon number is N = Pλt/hc, and with Poisson statistics describing the light, ΔN = √N = √(Pλt/hc). Therefore

$$\Delta\phi\,\Delta N = \frac{2\pi}{\lambda}\,\Delta L\,\sqrt{\frac{P\lambda t}{hc}} = 1$$

implies that

$$\Delta L = \frac{1}{2\pi}\sqrt{\frac{hc\lambda}{Pt}}$$

With more light power the interferometer can measure smaller distance displacements and achieve better sensitivity. LIGO will use about 10 W of laser power, and will eventually work towards 100 W. However, there is a nice trick one can use to produce more light circulating in the interferometer, namely power recycling. Figure 4 displays the power recycling interferometer design. The interferometer operates such that virtually none of the light exits the interferometer dark port, and the bulk of the light returns towards the laser. An additional mirror, Rr in Figure 4, recycles the light. For LIGO, recycling will increase the effective light power by another factor of 50. The higher circulating light power therefore improves the sensitivity.

There is one additional modification to the interferometer system that can further improve the sensitivity, but only at a particular frequency. A further Fabry–Perot system can be made by installing what is called a signal recycling mirror; this would be mirror Rs in Figure 5. Imagine light in arm 1 of the interferometer that acquires phase as the arm expands due to the gravity wave. The traveling gravity wave's oscillation will subsequently cause arm 1 to contract while arm 2 expands. If the light that was in arm 1 could be sent to arm 2 while it is expanding, then the beam would acquire additional phase. This process could be repeated over and over. Mirror Rs serves this purpose, with its reflectivity defining the storage time for light in each interferometer arm. The storage time defined by the cavity formed by the signal recycling mirror, Rs, and the mirror at the front of the interferometer arm cavity, R1, determines the resonance frequency. Signal recycling will give a substantial boost to interferometer sensitivity at a particular frequency, and will eventually be implemented in all the main ground-based interferometric detectors.

The LIGO interferometers are infinitely more complex than the relatively simple systems displayed in the figures of this paper. Figure 6 presents an aerial view of the LIGO site at Hanford, Washington. The magnitude of the 4-km system is apparent.
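The shot-noise limit ΔL = (1/2π)√(hcλ/Pt) derived above can be evaluated directly; a sketch assuming the 1064 nm Nd:YAG wavelength used by LIGO and a 1 s integration (both values are assumptions for illustration):

```python
import math

H_PLANCK = 6.626e-34   # Planck constant, J s
C_LIGHT = 2.998e8      # speed of light, m/s

def shot_noise_dL(power_w, wavelength_m, time_s):
    """Shot-noise-limited displacement sensitivity:
    dL = (1/2pi) * sqrt(h * c * lambda / (P * t))."""
    return math.sqrt(
        H_PLANCK * C_LIGHT * wavelength_m / (power_w * time_s)
    ) / (2.0 * math.pi)

# 10 W of 1064 nm light integrated for 1 s gives dL of order 2e-17 m;
# power recycling and cavity storage push this toward the 1e-18 m goal.
```

The inverse square-root dependence on P is the motivation for the power-recycling trick described in the text.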
or a noise spectral density of

$$\Delta L(f) = \frac{1}{2\pi}\sqrt{\frac{hc\lambda}{P}} \qquad (\text{in units of } \mathrm{m}/\sqrt{\mathrm{Hz}})$$

This assumes the light just travels down the arm and back once. With Fabry–Perot cavities the light is stored, and the typical photon takes many trips back and forth before exiting the system. In order to maximize the light power the end mirrors (R₂ ≈ 1) and

the strain sensitivity is improved to

$$h(f) = \frac{1}{2\pi L_0}\sqrt{\frac{hc\lambda}{P}}$$

where L₀ = 4L/(1 − R₁).

As the frequency of gravity waves increases the detection sensitivity will decrease. If the gravity wave causes the interferometer arm length to increase, then decrease, while the photons are still in the arm cavity, then the phase acquired from the gravity wave will be washed away. This is the reason why interferometer sensitivity decreases as frequency increases, and explains the high-frequency behavior seen in Figure 7. Taking this into account, the strain sensitivity is

$$h(f) = \frac{1}{2\pi L_0}\sqrt{\frac{hc\lambda}{P}}\left[1 + \left(\frac{2\pi L_0 f}{c}\right)^2\right]^{1/2}$$

where L₀ = 4L/(1 − R₁) and f is the frequency of the gravity wave.

If the gravitational wave is to change the interferometer arm length then the mirrors that define the arm must be free to move. In systems like LIGO wires suspend the mirrors; each mirror is like a pendulum. While allowing the mirrors to move under the influence of the gravity wave is a necessary condition, the pendulum itself is the first component of an elaborate vibration isolation system. Seismic noise will be troublesome for the detector at low frequencies. The spectral density of the seismic noise is about (10⁻⁷/f²) m/√Hz for frequencies above 1 Hz. A simple pendulum, by itself, acts as a motion filtering device. Above its resonance frequency the pendulum filters motion with a transfer function like T(f) ∝ (f₀/f)². Detectors such as LIGO will have pendulums with resonant frequencies of about f₀ ≈ 1 Hz, thus providing an isolation of 10⁴ when looking for signals at f = 100 Hz. The various gravity wave detector collaborations have different vibration isolation designs. The mirrors in these interferometers will be suspended in elaborate vibration isolation systems, which may include multiple pendulums and isolation stacks. Seismic noise will be the limiting factor for interferometers seeking to detect gravity waves in the tens of hertz range, as can be seen in the sensitivity curve presented in Figure 7.

Due to the extremely small distance displacements that these systems are trying to detect, it should come as no surprise that thermal noise is a problem. This noise enters through a number of components in the system. The two most serious thermal noise sources are the wires suspending the mirrors in the pendulum, and the mirrors themselves. Consider the wires: there are a number of modes that can oscillate (i.e. violin modes). At temperature T each mode will have energy of k_B T, but distributed over a band of frequencies determined by the quality factor (or Q) of the material. Low-loss (or high-Q) materials work best; for the violin modes of the wires there will be much noise at particular frequencies (in the hundreds of hertz). The mirror is a cylindrical object, which will have normal modes of oscillation that can be thermally excited. The first generation of LIGO will have these masses composed of fused silica, which is typical for optical components. The Qs for the internal modes are greater than 2×10⁶. A switch may eventually be made to sapphire mirrors, which have better thermal properties. The limitation to the interferometers' sensitivity due to the thermal noise internal to the mirrors can be seen in Figure 7, and will be a worrying noise source for the first-generation LIGO in the frequency band around 100–400 Hz.

The frequency noise of the laser can couple into the system to produce length displacement sensitivity noise in the interferometer. With arm lengths of 4 km, it will be impossible to hold the lengths of the two arms absolutely equal. The slightly differing arm spans will mean that the light sent back from each of the two Fabry–Perot cavities will have slightly differing phases. As a consequence, great effort is made to stabilize the frequency of the light entering the interferometer. The LIGO laser can be seen in Figure 8. The primary laser is a Lightwave Model 126 nonplanar ring oscillator. High power is generated from this stabilized laser through the use of optical amplifiers. The beam is sent through four optical amplifiers, and then retro-reflected back

Figure 8 A picture of the Nd:YAG laser and amplifier system that produces 10 W of light for LIGO. Photograph courtesy of Caltech/LIGO.
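The pendulum filtering argument, T(f) ∝ (f₀/f)² above resonance, can be checked numerically; a sketch using the f₀ ≈ 1 Hz value quoted in the text (the function name is illustrative):

```python
def pendulum_isolation(f_signal_hz, f0_hz=1.0):
    """Motion transmission of a simple pendulum well above its
    resonance frequency f0: T(f) = (f0 / f)**2."""
    return (f0_hz / f_signal_hz) ** 2

# At 100 Hz a 1 Hz pendulum transmits only 1e-4 of the ground motion,
# i.e. the isolation factor of 1e4 quoted in the text.
```

Stacking several such stages multiplies their transfer functions, which is why multiple-pendulum suspensions give such steep isolation above a few hertz.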
Hough J and the GEO 600 Team (1997) GEO 600: Current status and some aspects of the design. In: Tsubono K, Fujimoto MK and Kuroda K (eds) Gravitational Wave Detection, pp. 175–182. Tokyo: Universal Academic.
Robertson NA (2000) Laser interferometric gravitational wave detectors. Classical and Quantum Gravity 17: R19–R40.
Saulson PR (1994) Fundamentals of Interferometric Gravitational Wave Detectors. Singapore: World Scientific.
Taylor JH and Weisberg JM (1989) Further experimental tests of relativistic gravity using the binary pulsar PSR 1913+16. Astrophysical Journal 345: 434–450.
Thorne KS (1987) Gravitational radiation. In: Hawking S and Israel W (eds) 300 Years of Gravitation, pp. 330–458. Cambridge, UK: Cambridge University Press.
Tsubono K and the TAMA collaboration (1997) TAMA project: Gravitational wave detection. In: Tsubono K, Fujimoto MK and Kuroda K (eds) Gravitational Wave Detection, pp. 183–191. Tokyo: Universal Academic.
Weiss R (1972) Electromagnetically coupled broadband gravitational antenna. MIT Quarterly Progress Report no. 105.
Phase-Measurement Interferometry

K Creath, Optineering and University of Arizona, Tucson, AZ, USA
J Schmit, Veeco Instruments, Inc., Tucson, AZ, USA

© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

Phase-measurement interferometry is a way to measure information encoded in interference patterns generated by an interferometer. Fringe patterns (fringes) created by interfering beams of light are analyzed to extract quantitative information about an object or phenomenon. These fringes are localized somewhere in space and require a certain degree of spatial and temporal coherence of the source to be visible. Before these techniques were developed in the late 1970s and early 1980s, fringe analysis was done either by estimating fringe deviation and irregularity by eye or by manually digitizing the centers of interference fringes using a graphics tablet. Digital cameras and desktop computers have made it possible to easily obtain quantitative information from fringe patterns. The techniques described in this article are independent of the type of interferometer used. Interferometric techniques using fringe analysis can measure features as small as a micron wide or as large as a few meters. Measurable height ranges vary from a few nanometers up to tens of millimeters, depending upon the interferometric technique employed. Measurement repeatability is very consistent: typically, repeatability of 1/100 of a fringe rms is easily obtainable, while 1/1000 rms is possible. Actual measurement precision depends on what is being measured and how the technique is implemented, while accuracy depends upon comparison to a sanctified standard.

There are many different types of applications for fringe analysis. For example, optical surface quality can be determined using a Twyman–Green or Fizeau interferometer. In addition, wavefront quality measurements of a source or an optical system can be made in transmission, and the index of refraction and homogeneity of optical materials can be mapped out. Many nonoptical surfaces can also be measured. Typically, surface topography information at some specific spatial frequency scale is extracted. These measurements are limited by the resolution of the optical system and the field of view of the imaging system. Lateral and vertical dimensions can also be measured. Applications to nonoptical surfaces include disk and wafer flatness, roughness measurement, and distance and range sensing. Phase measurement techniques used in holographic interferometry, TV holography, speckle interferometry, moiré, grating interferometry, and fringe projection are used for nondestructive testing to measure surface structure as well as displacements due to stress and vibration. This article outlines the basics of phase measurement interferometry (PMI) techniques as well as the types of algorithms used. The bibliography lists a number of references for further reading.

Background

Basic Parts of a Phase-Measuring Interferometer

A phase-measuring interferometer consists of a light source, an illumination system (providing uniform illumination across the test surface), a beamsplitter (usually a cube or pellicle so both beams have the same optical path), a reference surface (which needs to be good because these techniques measure the difference between the reference and test surfaces), a sample
fixture, an imaging system (which images a plane in space where the surface is located onto the camera), a camera (usually a monochrome CCD), an image digitizer/frame grabber, a computer system, software (to control the measurement process and calculate the surface map), and often a spatial or temporal phase shifter to generate multiple interferograms. The surface is imaged onto the camera, and the computer's frame grabber takes frames of fringe data.

The most common commercially available interferometer for optical testing is the Fizeau interferometer (Figure 2). This versatile instrument is very insensitive to vibrations due to the large common path for both interfering wavefronts. The reference surface (facing the test object) is moved by a PZT to provide the phase shift.

Steps of the Measurement Process
To generate a phase map: a sample is placed on a sample fixture and aligned; illumination levels are adjusted; the sample image is focused onto the camera; the fringes are adjusted for maximum contrast; the phase shifter or fringe spacing is adjusted or calibrated as necessary; a number of images are obtained and stored with the appropriate phase differences; and the optical path difference (OPD) is calculated as the modulo 2π phase and then unwrapped at each pixel to determine the phase map.

To make consistent measurements, some interferometers need to be on vibration isolation systems and away from heavy airflows or possible acoustic coupling. It helps to cover any air paths that are longer than a few millimeters. Temperature and humidity also need to be kept consistent. The human operating the interferometer is also a factor in the measurement. Does this person always follow the same procedure? Are the measurements sampled consistently? Are the test surfaces clean? Is the sample aligned the same way, and correctly? Many different factors can affect a measurement. To obtain repeatable measurements it is important to have a consistent procedure and to verify measurement consistency regularly.

Interference microscopes (see Microscopy: Interference Microscopy; Interferometry: White Light Interferometry) are used for looking at surface roughness and small structures (see Figure 3). These instruments can employ Michelson, Mirau, and Linnik interference objectives with a laser or white light source, or Fizeau-type interference objectives that typically use a laser source because of the unequal paths in the arms of the interferometer. The phase shift is accomplished by moving the sample, the reference surface, or parts of the objective relative to the sample. Figure 3 shows a schematic of a Mirau-type interferometric microscope for phase measurement.
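The measurement steps described above can be sketched numerically. The following minimal simulation (the surface shape, visibility, and frame count are invented for illustration, and the four-step formula used here is just one common choice) generates phase-shifted fringe frames, computes the wrapped phase, and unwraps it:

```python
import numpy as np

# Invented test surface: a tilted, gently curved phase map (radians)
ny, nx = 64, 64
y, x = np.mgrid[0:ny, 0:nx]
phi_true = 8 * np.pi * x / nx + 0.02 * (x - nx / 2) ** 2 / nx

I0, gamma = 100.0, 0.8                      # dc irradiance and fringe visibility
shifts = [0, np.pi / 2, np.pi, 3 * np.pi / 2]

# Step 1: record one fringe frame per phase shift
I1, I2, I3, I4 = (I0 * (1 + gamma * np.cos(phi_true + a)) for a in shifts)

# Step 2: wrapped (modulo 2*pi) phase from a four-step formula
phi_wrapped = np.arctan2(I4 - I2, I1 - I3)

# Step 3: unwrap along each row to remove the 2*pi ambiguities
phi_map = np.unwrap(phi_wrapped, axis=1)

print(np.allclose(phi_map, phi_true))       # → True (no noise in this sketch)
```

In a real instrument the frames would come from the camera rather than a formula, and bad (low-modulation) pixels would be masked before unwrapping.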
Figure 4 Interference fringes for coherent and incoherent sources as observed at point (x, y); z = 0 corresponds to equal optical path lengths of the reference and object beams.
sample, tilting a glass plate, moving a diffraction grating, rotating a polarizer, analyzer, or half-wave plate, using a two-frequency (Zeeman) laser source, modulating the source wavelength, and switching an acousto-optic Bragg cell or a magneto-optic/electro-optic cell. While any of these techniques can be used with coherent sources, special considerations need to be made for temporally incoherent sources (see Interferometry: White Light Interferometry). Figure 7 shows how moving a mirror introduces a relative phase shift between object and reference beams in a Twyman–Green interferometer.

Extracting Phase Information

Including the phase shift, the interference equation is written as

I(x, y) = I₀(x, y){1 + γ₀(x, y) cos[φ(x, y) + α(t)]}   [3]

where I(x, y) is the irradiance at a single detector point, I₀(x, y) is the average (dc) irradiance, γ₀(x, y) is the fringe visibility before detection, φ(x, y) is the phase of the wavefront being measured, and α(t) is the phase shift as a function of time.

Since the detector has to integrate for some finite time, the detected irradiance at a single point becomes an integral over the integration time Δ (Figure 8), where the average phase shift for the ith frame of data is α_i:

I_i(x, y) = (1/Δ) ∫_{α_i−Δ/2}^{α_i+Δ/2} I₀(x, y){1 + γ₀(x, y) cos[φ(x, y) + α(t)]} dα(t)   [4]

After integrating over α(t), the irradiance of the detected signal becomes

I_i(x, y) = I₀(x, y){1 + γ₀(x, y) sinc(Δ/2) cos[φ(x, y) + α_i]}   [5]

where sinc(Δ/2) = sin(Δ/2)/(Δ/2), which reduces the detected visibility.

Ramping Versus Stepping

There are two different ways of shifting the phase: the phase shift can either be changed in a constant, linear fashion (ramping) (see Figure 8) or be stepped in increments. Ramping provides a continuous smooth motion without any jerking. This option may be preferred if the motion does not wash out the interference fringes. However, ramping requires good synchronization of the camera/digitizer and the modulator to get the correct OPD (or phase) changes between data frames. Ramping allows faster data taking but requires more sophisticated electronics. In addition, it takes a finite time for a mass to reach linear motion; when ramping, the first frame or two of data usually needs to be discarded because the shift is not correct until the movement is linear.

Figure 5 Variable definitions for interference fringes.

Figure 6 Interference fringe patterns corresponding to different relative phase shifts between test and reference beams.

The major difference between ramping and stepping the phase shift is a reduction in the modulation of the interference fringes after detection (the sinc(Δ/2) term in eqn [5]). When the phase shift is stepped (Δ = 0), the sinc term has a value of one. When the phase shift is ramped (Δ = α) for a phase
shift of α = 90° = π/2, this term has a value of 0.9. Therefore, ramping slightly reduces the detected fringe visibility.

Signal Modulation

Phase measurement techniques assume that the irradiance of the interference fringes at the camera covers as much of the detector's dynamic range as possible, and that the phase of the interference fringe pattern modulates at each individual pixel as the phase between beams is modulated. If the irradiance at a single detector point does not modulate as the relative phase between beams is shifted, the height of the surface cannot be calculated. Besides the obvious reductions in irradiance modulation due to detector sampling and pixel size or a bad detector element, scattered light within the interferometer and defects or dirt on the test object can also reduce signal modulation. Phase measurement techniques are designed to take into account the modulation of the signal at each pixel. If the signal does not modulate enough at a given pixel, then the data at that pixel are considered unusable, flagged as 'bad', and often left blank. Phase values for these points may be interpolated from surrounding pixels if there are sufficient data.

Three-Frame Technique

The simplest method to determine phase uses three frames of data. With three unknowns, three sets of recorded fringe data are needed to reconstruct a wavefront and provide a phase map. Using phase shifts of α_i = π/4, 3π/4, and 5π/4, the three fringe measurements at a single point in the interferogram may be expressed as

I₁ = I₀{1 + γ cos(φ + π/4)} = I₀{1 + γ(√2/2)(cos φ − sin φ)}   [6]

I₂ = I₀{1 + γ cos(φ + 3π/4)} = I₀{1 + γ(√2/2)(−cos φ − sin φ)}   [7]

I₃ = I₀{1 + γ cos(φ + 5π/4)} = I₀{1 + γ(√2/2)(−cos φ + sin φ)}   [8]

Note that the (x, y) dependencies are still implied. The choice of these specific phase-shift values is to make the math simpler. The phase at each detector point is

φ = tan⁻¹[(I₃ − I₂)/(I₁ − I₂)]   [9]

In most fringe analysis techniques we are basically trying to solve for terms such that we end up with the tangent function of the phase. The numerator and denominator shown above are respectively proportional to the sine and cosine of the phase. Note that the dc irradiance and fringe visibility appear in both the numerator and denominator. This means that variations in fringe visibility and average irradiance from pixel to pixel do not affect the results. As long as the fringe visibility and average irradiance at a single pixel are constant from frame to frame, the results will be good. If the different phase shifts are multiplexed onto multiple cameras, the results will depend upon the gain of corresponding pixels.

Bad data points with low signal modulation are found by calculating the fringe visibility at each data point using:

γ = √[(I₃ − I₂)² + (I₁ − I₂)²] / (√2 I₀)   [10]

It is simpler to calculate the ac signal modulation (2I₀γ) and then set a threshold on the modulation at a typical value of about 5–10% of the dynamic range. If the modulation is less than this value at any given data point, the data point is flagged as bad. Bad points are usually caused by noisy pixels and can be due to scratches, pits, dust, and scattered light.

Although three frames of data are enough to determine the phase, this and other three-frame algorithms are very sensitive to systematic errors due to nonsinusoidal fringes or nonlinear detection, phase-shifter miscalibration, vibrations, and noise. In general, the larger the number of frames of data used to determine the phase, the smaller the systematic errors.

Phase Unwrapping

The removal of phase ambiguities is generally called phase unwrapping, and is sometimes known as integrating the phase. The phase ambiguities owing to the modulo 2π arctangent calculation can simply be removed by comparing the phase difference between adjacent pixels. When the phase difference between adjacent pixels is greater than π, a multiple of 2π is added or subtracted to make the difference less than π. For the reliable removal of
discontinuities, the phase must not change by more than π (λ/2 in optical path (OPD)) between adjacent pixels. Figure 9 shows an example of wrapped and unwrapped phase values.

Given a trouble-free wrapped phase map, it is enough to search row by row (or column by column) for phase differences of more than π between neighboring pixels. However, fringe patterns usually are not perfect and are affected by systematic errors. Some of the most frequently occurring error sources are noise, discontinuities in the phase map, violation of the sampling theorem, and invalid data points (e.g., due to holes in the object or low-modulation regions). Some phase maps are not easily unwrapped. In these cases different techniques, such as wave-packet peak sensing in white light interferometry, are used.

Phase Change on Reflection

Phase shifting interferometry measures the phase of reflected light to determine the shape of objects. The reflected wavefront will represent the object surface (within a scaling factor) if the object is made of a single material. If the object is comprised of multiple materials that exhibit different phase changes on reflection, the measured wavefront needs to be corrected for these phase differences (see Interferometry: White Light Interferometry).

Overview of Phase Measurement Algorithms and Techniques

There are literally hundreds of published algorithms and techniques. The optimal algorithm depends on
the application. Most users prefer fast algorithms using a minimal amount of data that are as accurate and repeatable as possible, immune to noise, adaptable, and easy to implement. In practice, there are trade-offs that must be considered when choosing a specific algorithm or technique. This section provides an overview of the types of algorithms to aid the reader in sifting through published algorithms.

Table 1  Sampling function weights for a few selected algorithms

N frames   Phase shift   Coefficients (numerator n_i; denominator d_i)
3          π/2           1, −1, 0;  0, 1, −1
4          π/2           0, −1, 0, 1;  1, 0, −1, 0
5          π/2           0, −2, 0, 2, 0;  1, 0, −2, 0, 1
7          π/3           √3 × [0, 1, 1, 0, −1, −1, 0];  −1, −1, 1, 2, 1, −1, −1
8          π/2           1, 5, −11, −15, 15, 11, −5, −1;  1, −5, −11, 15, 15, −11, −5, 1
12         π/3           √3 × [0, −3, −3, 3, 9, 6, −6, −9, −3, 3, 3, 0];  2, 1, −7, −11, −1, 16, 16, −1, −11, −7, 1, 2

Synchronous Detection

One of the first techniques for temporal phase measurement utilized methods of communication theory to perform synchronous detection. To synchronously detect the phase of a noisy sinusoidal signal, the signal is first correlated (or multiplied) with sinusoidal and cosinusoidal reference signals (signals in quadrature) of the same frequency, and then averaged over many periods of oscillation. This method of synchronous detection as applied to phase measurement can be extracted from the least-squares estimation result when the phase shifts are chosen such that N measurements are equally spaced over one modulation period. With phase shifts α_i such that

α_i = (2π/N) i, with i = 1, …, N   [13]

the phase can be calculated from

φ(x, y) = tan⁻¹[Σ_i I_i(x, y) sin α_i / Σ_i I_i(x, y) cos α_i]   [14]

and the phase is calculated using

φ = tan⁻¹[Σ_i n_i I_i / Σ_i d_i I_i]   [16]

The numerator and denominator of the arctangent argument are both polynomials in the measured irradiances. The numerator is a sum proportional to the sine (imaginary part) and the denominator is a sum proportional to the cosine (real part),

num = k I₀ γ sin φ = Σ_i n_i I_i   [17]

where k is a constant determined by the coefficients.
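The three-frame result (eqn [9]) and the synchronous-detection formula (eqn [14]) can be checked numerically. In the sketch below (test values are arbitrary), note the minus sign applied to the sine-weighted sum: with the cos(φ + α) sign convention of eqn [3], that sum is proportional to −sin φ:

```python
import numpy as np

I0, gamma, phi = 120.0, 0.7, 1.234   # known test values (arbitrary)

def frame(alpha):
    """Irradiance sample for phase shift alpha (eqn [3] at a single instant)."""
    return I0 * (1 + gamma * np.cos(phi + alpha))

# Three-frame technique with shifts pi/4, 3*pi/4, 5*pi/4 (eqns [6]-[9])
I1, I2, I3 = frame(np.pi / 4), frame(3 * np.pi / 4), frame(5 * np.pi / 4)
phi_three = np.arctan2(I3 - I2, I1 - I2)

# Fringe visibility from the same three frames (eqn [10])
gamma_est = np.sqrt((I3 - I2) ** 2 + (I1 - I2) ** 2) / (np.sqrt(2) * I0)

# Synchronous detection with N equally spaced shifts (eqns [13]-[14]);
# with cos(phi + alpha) fringes the sine-weighted sum equals
# -(N/2)*I0*gamma*sin(phi), hence the minus sign below
N = 8
alphas = 2 * np.pi * np.arange(1, N + 1) / N
I_n = frame(alphas)
phi_sync = np.arctan2(-np.sum(I_n * np.sin(alphas)), np.sum(I_n * np.cos(alphas)))

print(np.isclose(phi_three, phi), np.isclose(phi_sync, phi), np.isclose(gamma_est, gamma))
# → True True True
```

Both estimators recover the known phase exactly in this noise-free sketch; their differing error sensitivities only appear once miscalibration or noise is added.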
The carrier frequency (i.e., the number of tilt fringes) is adjusted so that there is an α phase change from the center of one pixel to the next. As long as there is not much aberration (deviation), the phase change from pixel to pixel across the detector array will be approximately constant. When the fringes are set up this way, the phase can be calculated using adjacent pixels (see Figure 12). If one fringe takes up 4 pixels, the phase shift α between pixels will be 90°. An algorithm such as the three-frame, four-frame, or five-frame algorithm can be used with adjacent pixels as the input. Therefore, 3, 4, or 5 pixels in a row will yield a single phase point. Then the analysis window is shifted sideways one pixel and the phase is calculated at the next point. This technique assumes that the dc irradiance and fringe visibility do not change over the few pixels used to calculate each phase value.

Figure 12 Spatial phase shifting. Different relative phase shifts are on adjacent detector elements.

Spatial Multichannel Phase-Shift Techniques

These techniques detect all phase maps simultaneously and multiplex the phase shift using static optical elements. This can be done by using either separate cameras, as illustrated below, or different detector areas to record each of the interferograms used to calculate the phase. The phase is usually calculated using the same techniques that are used for the temporal phase techniques. As an example, a four-channel interferometer can be made using the setup shown in Figure 13 to record four interferograms with 90° phase shifts between them. Camera 1 will yield fringes shifted 180° with respect to camera 2, and cameras 3 and 4 will have phase shifts of 90° and 270°. The optical system may also utilize a holographic optical element to split the beam to multiplex the four phase shifts onto four quadrants of a single camera.

Signal Demodulation Techniques

The task of determining phase can be broadened by looking toward the field of signal processing. For communication via radio, radar, and optical fibers, electrical engineers have developed a number of ways of compressing and encoding a signal, as well as decompressing and decoding it. An interference fringe pattern looks a lot like an AM radio signal; thus, it can be demodulated in similar ways. In recent years many new algorithms have been developed by drawing on techniques from communication theory and applying them to interferogram processing. Many of these use different types of transforms, such as Hilbert transforms for straight fringes or circular transforms for closed fringes.

Extended Range Phase Measurement Techniques

A major limitation of phase measurement techniques is that they cannot determine surface discontinuities larger than λ/4 (λ/2 in optical path (OPD)) between adjacent pixels. One obvious solution is to use longer-wavelength sources in the infrared, where optically rough surfaces look smooth and their shape can be measured. An alternative is two-wavelength interferometry, where two measurements at different wavelengths are taken and the measurable height limitation is instead determined by the equivalent wavelength:

λ_eq = λ₁λ₂ / |λ₁ − λ₂|   [23]
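Eqn [23] is easy to evaluate. As an illustration (the wavelength pair below is chosen arbitrarily):

```python
def equivalent_wavelength(lam1_nm, lam2_nm):
    """Equivalent wavelength of eqn [23] for two-wavelength interferometry."""
    return lam1_nm * lam2_nm / abs(lam1_nm - lam2_nm)

# Hypothetical pair: a HeNe red line and a green line
leq = equivalent_wavelength(633.0, 532.0)
print(round(leq))  # → 3334 (nm), i.e. about 3.3 micrometers
# Height steps up to leq/4 per pixel become measurable, versus
# lambda/4 (roughly 160 nm) for the red wavelength alone.
```

The closer the two wavelengths, the longer the equivalent wavelength and the larger the measurable step, at the cost of amplified phase noise.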
Systematic Errors
Noise Sensitivity
Measurement noise mostly arises from random
fluctuations in the detector readout and electronics.
This noise reduces precision and repeatability. Aver-
aging multiple measurements can reduce effects of
these random fluctuations.
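The benefit of averaging can be sketched with a quick simulation (all values are invented): averaging M independent measurements reduces the rms of random readout noise by roughly √M:

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = 2.0      # rms of the random noise in a single measurement (arbitrary units)
M = 16           # number of averaged measurements

single = rng.normal(0.0, sigma, size=100_000)
averaged = rng.normal(0.0, sigma, size=(M, 100_000)).mean(axis=0)

print(round(float(single.std()), 1))    # ≈ 2.0
print(round(float(averaged.std()), 1))  # ≈ 2.0 / sqrt(16) = 0.5
```

This √M improvement applies only to random, uncorrelated fluctuations; systematic errors such as phase-shifter miscalibration are unaffected by averaging.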
Air turbulence needs to be considered as well. Spatial phase measurement techniques are sensitive to miscalibrated tilt (wrong carrier frequency), unequally spaced fringes, and sampling and windowing in Fourier transform techniques.

Choosing an Algorithm or Technique

Each algorithm and type of measurement technique is sensitive to different types of systematic errors. Choosing a proper algorithm or technique for a particular type of measurement depends on the specific conditions of the test itself and on reducing the systematic errors for that type of measurement. This is the reason that so many algorithms exist. The references in the bibliography will help the reader determine what type of algorithm will work best for a specific application.

Examples of Applications

Phase shifting interferometry can be used for measurements such as hard disk flatness, quality of optical elements, lens curvature, and the dimensions and quality of air-bearing surfaces of magnetic read/write heads, cantilevers, and semiconductor elements. Figure 15 shows results for measurements of a hard disk substrate and a roughness grating.

Figure 15 Measurements of (a) a hard disc substrate with a vertical range of 2.2 micrometers and (b) a binary grating roughness standard.

Conclusions

Phase measurement interferometry techniques have increased measurement range and precision, enabling the production of more complex and more precise components. As work continues on the development of interferometric techniques, phase measurement techniques will continue to become more robust and less sensitive to systematic errors. Anticipated advances will enable measurements of objects that were unimaginable 20 or 30 years ago.

See also

Interferometry: White Light Interferometry. Microscopy: Interference Microscopy.

Further Reading

Bruning JH (1978) Fringe scanning interferometers. In: Malacara D (ed.) Optical Shop Testing, pp. 409–438. New York: Wiley.
Creath K (1988) Phase-measurement interferometry techniques. In: Wolf E (ed.) Progress in Optics, vol. 26, pp. 349–393. Amsterdam: Elsevier.
Creath K (1993) Temporal phase measurement methods. In: Robinson DW and Reid GT (eds) Interferogram Analysis, pp. 94–140. Bristol: IOP Publishing.
Creath K and Schmit J (1996) N-point spatial phase-measurement techniques for non-destructive testing. Optics and Lasers in Engineering 24: 365–379.
Ghiglia DC and Pritt MD (1998) Two-Dimensional Phase Unwrapping. New York: Wiley.
Greivenkamp JE and Bruning JH (1992) Phase shifting interferometry. In: Malacara D (ed.) Optical Shop Testing, 2nd edn, pp. 501–598. New York: Wiley.
Hariharan P (1992) Basics of Interferometry. New York: Academic Press.
Kerr D (ed.) (2004) FASIG special issue on fringe analysis. Optics and Lasers in Engineering 41: 597–686.
Kujawinska M (1993) Spatial phase measurement methods. In: Robinson DW and Reid GT (eds) Interferogram Analysis, pp. 141–193. Bristol: IOP Publishing.
Larkin KG (1996) Efficient nonlinear algorithm for envelope detection in white light interferometry. Journal of the Optical Society of America A 13: 832–842.
Malacara D (1992) Optical Shop Testing, 2nd edn. New York: Wiley.
Malacara D, Servin M and Malacara Z (1998) Interferogram Analysis for Optical Testing. New York: Marcel Dekker.
Robinson DW and Reid GT (1993) Interferogram Analysis: Digital Processing Techniques for Fringe Pattern Measurement. Bristol, UK: IOP Publishing.
Schmit J and Creath K (1996) Window function influence on phase error in phase-shifting algorithms. Applied Optics 35: 5642–5649.
Surrel Y (2000) Fringe analysis. Photo-Mechanics: Topics in Applied Physics 77: 55–102.
INTERFEROMETRY / White Light Interferometry 375
Figure 1 Interference colors in a gasoline spill in a puddle. This picture would closely resemble one from an interference microscope, were it not for the leaf in the top left corner.
Interference colors have been tabulated for many years and for many different purposes. Newton devised his color scale to describe interference fringes for two pieces of glass in contact with each other. Michel-Lévy and Lacroix, in 1889, created a color scale to help recognize different rock-forming minerals. For more information about colors in white light interference, see the sections on White Light Interference and Spectral Interference below.

Michelson Interferometer

A Michelson interferometric setup, shown in Figure 3, is often used to analyze white light interference, but the intensity of the fringes is observed as one mirror is scanned, rather than the color of the fringes that is usually observed.

Two plane mirrors (equivalent to the top and bottom surfaces of a thin film) return the beams to the beamsplitter, which recombines parts of the returned beams and directs them towards the detector where the interference is observed. Beam S1 travels through the parallel plate three times while beam S2 passes through the plate only once, causing the system to be not well compensated (balanced) for dispersion. Thus, for observation of white light interference, an identical plate is placed in the path of beam S2. As one of the plane mirrors moves along the optical axis to change the OPD, the detector collects the irradiance (the intensity). Fringes will be observed if the system is well compensated for dispersion and for optical path lengths. Any spectral changes or changes in optical path lengths in the interferometer affect the shape or position of the fringes, and the interferometer measures these changes.

Common-path interferometers, like the Mach–Zehnder or Jamin interferometers, are naturally compensated interferometers and can be used for measurement of dispersion in gases. Other interferometers, such as the Twyman–Green or Fizeau interferometers, make use of their unequal interferometer arms and their nonlocalized fringes from laser sources for testing different optical elements and systems in reflection and transmission (see Interferometry: Phase Measurement Interferometry).

White Light Interference

A white light source consists of a wide spectrum of wavelengths in the visible spectrum, from about 380 (violet) up to 750 (red) nanometers. However, the principles of WLI described in this article basically apply when any low-coherence source is used. Low coherence refers to a source of not only low temporal but also low spatial coherence; WLI can also be referred to as low-coherence interferometry. We will be concerned mainly with the temporal effect of the source, i.e., the source spectrum.

Since different wavelengths from the source spectrum are mutually incoherent, we will first look at interference between two waves of a selected monochromatic illumination with wave number k = 2π/λ, where λ is the wavelength. The intensity of the interference fringes at point (x, y) (these coordinates are omitted in all equations), as one of the mirrors is scanned along the optical axis z, can be described as

I(k, z) = I₁ + I₂ + 2√(I₁I₂) |g(z)| cos(kz)   [1]

or can be written in the form:

I(k, z) = I₀[1 + (2√(I₁I₂)/(I₁ + I₂)) |g(z)| cos(kz)]   [2]
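Summing the monochromatic fringes of eqn [1] over a broad spectrum shows how the fringes become localized near zero OPD. The sketch below assumes an illustrative Gaussian spectrum and equal-intensity beams (all numbers are invented for the demonstration):

```python
import numpy as np

# Illustrative Gaussian source spectrum centered on 550 nm
lam0, fwhm = 550e-9, 60e-9
lams = np.linspace(lam0 - 2 * fwhm, lam0 + 2 * fwhm, 200)
w = np.exp(-0.5 * ((lams - lam0) / (fwhm / 2.355)) ** 2)
w /= w.sum()

z = np.linspace(-10e-6, 10e-6, 4001)   # mirror scan position (OPD)
ks = 2 * np.pi / lams                  # wave numbers

# Incoherent sum of eqn [1] fringe patterns with I1 = I2 = 1 per wavelength
I = sum(wi * 2 * (1 + np.cos(ki * z)) for wi, ki in zip(w, ks))

def visibility(seg):
    return (seg.max() - seg.min()) / (seg.max() + seg.min())

# Fringe contrast is high near zero OPD and collapses a few microns away
print(visibility(I[np.abs(z) < 1e-6]) > 0.9)          # → True
print(visibility(I[np.abs(z - 8e-6) < 1e-6]) < 0.1)   # → True
```

The resulting coherence envelope is a few microns wide, consistent with the rule of thumb that the coherence length scales as λ²/Δλ.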
Figure 5 Spectrum and interferogram for (a) a tungsten-halogen lamp and (b) a red LED source.
can be expressed in a more general form:

I(z) = I₀[1 + V(z) cos(k₀z + φ₀)]   [5]

Figure 6 shows color fringes for such a case; the dark fringe marks the zero OPD, and this fringe is surrounded by the greenish-blue colors of shorter wavelengths. In contrast, the bright fringe at the maximum envelope position or marking the zero OPD would be surrounded by the reddish colors of longer wavelengths.

In a real system the fringes may be shifted with respect to the envelope by any amount φ₀, and this shift may be due to any number of factors, such as the phase shift on reflection from nondielectric surfaces and dispersion, which we consider next.

Changes in Envelope and Fringes Due to Reflection from Nondielectric Materials

The relative position of the envelope maximum and the fringe position will be different if the beam is reflected from different nondielectric materials. This difference exists because the phase change on reflection from a test surface, such as metals or heavily doped semiconductors, varies with each wavelength. This variance in fringe and peak coherence position may be predicted and corrected for in surface height calculations. The linear dependence of the phase change on reflection versus wave number shifts the location of the coherence envelope peak position of the fringes by different amounts. The constant phase change on reflection and higher-order terms only shift the fringes underneath the coherence envelope. As long as the phase change on reflection has a small dependence on the second and higher orders of the wave number, the shape of the coherence envelope is preserved.

Changes in Envelope and Fringes Due to Dispersion

Dispersion in an interferometer that is not balanced, perhaps because a dispersive element was placed in one arm or the compensating plate has an incorrect thickness, will influence fringe formation. The phase delay between interference patterns for individual wavelengths is proportional to the product of the geometrical path d and the index of refraction, d × n(k). The intensity may be described as:

I(z) = ∫_{k₁}^{k₂} [1 + V(z) cos{kz − kd·n(k)}] dk   [6]
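Eqn [6] can be explored numerically. In the sketch below (the plate thickness and dispersion coefficient are invented, and V(z) is simply taken as 1 across the band), an uncompensated, weakly dispersive plate shifts the fringe packet away from zero mechanical OPD:

```python
import numpy as np

# Band of wave numbers spanning the visible (illustrative)
k = np.linspace(2 * np.pi / 700e-9, 2 * np.pi / 450e-9, 400)
z = np.linspace(-6e-6, 6e-6, 3001)

def packet(d, n_of_k):
    """Fringe packet of eqn [6] for an excess plate of thickness d and index n(k);
    the integral over k is approximated by a plain sum (V(z) = 1 in the band)."""
    phase = np.outer(z, k) - k * d * n_of_k(k)
    return (1 + np.cos(phase)).sum(axis=1)

balanced = packet(0.0, lambda kk: np.full_like(kk, 1.5))            # compensated arms
dispersed = packet(2e-6, lambda kk: 1.5 + 4e-10 * (kk - kk.mean()))  # weak dispersion

# Unbalanced dispersion moves (and chirps) the packet away from z = 0
print(abs(z[np.argmax(balanced)]) < 1e-8)   # → True
print(z[np.argmax(dispersed)] > 2e-6)       # → True
```

The packet peak lands near z = d·n_g, the plate thickness times its group index, illustrating why the envelope peak and the fringes underneath it can shift by different amounts.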
Figure 12 Fringes in quasimonochromatic and white light for an object similar to the one presented in Figure 13.
enable correction for changes in the wave number value k₀, which can be due to such things as changes in the working voltage of the bulb, an introduced higher-order dispersion, or a large tilt of the sample.

Increased Resolution White Light Interferometry

Interferometric methods that employ a monochromatic light source to detect the phase of the fringes

Film Thickness Measurement

White light interferograms that are obtained in an optical profiler can be used to measure transparent film thicknesses, because the character of the interferogram changes with the thickness of the film (Figure 18). Two different techniques, a thin-film or a thick-film technique, are used depending on the range of the film thickness. A thick-film technique is used if two
Carbon Dioxide Laser

C R Chatwin, University of Sussex, Brighton, UK

© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

This article gives a brief history of the development of the laser and goes on to describe the characteristics of the carbon dioxide laser and the molecular dynamics that permit it to operate at comparatively high power and efficiency. It is these commercially attractive features, and its low cost, that have led to its adoption as one of the most popular industrial power beams. This outline also describes the main types of carbon dioxide laser and briefly discusses their characteristics and uses.

Brief History

In 1917 Albert Einstein developed the concept of stimulated emission, which is the phenomenon used in lasers. In 1954 the MASER (Microwave Amplification by Stimulated Emission of Radiation) was the first device to use stimulated emission. In that year Townes and Schawlow suggested that stimulated emission could be used in the infrared and optical portions of the electromagnetic spectrum. The device was originally termed the optical maser, this term being dropped in favor of LASER, standing for Light Amplification by Stimulated Emission of Radiation. Working against the wishes of his manager at Hughes Research Laboratories, the electrical engineer Ted Maiman created the first laser on 16 May 1960. Maiman's flash-lamp-pumped ruby laser produced pulsed red electromagnetic radiation at a wavelength of 694.3 nm. During the most active period of laser systems discovery, Bell Labs made a very
significant contribution. In 1960, Ali Javan, William Bennett and Donald Herriott produced the first helium-neon laser, which was the first continuous-wave (CW) laser, operating at 1.15 μm. In 1961, Boyle and Nelson developed a continuously operating ruby laser and, in 1962, Kumar Patel, Faust, McFarlane and Bennett discovered five noble gas lasers and lasers using oxygen mixtures. In 1964, C.K.N. Patel created the high-power carbon dioxide laser operating at 10.6 μm. In 1964, J.E. Geusic and R.G. Smith produced the first Nd:YAG laser, using neodymium-doped yttrium aluminum garnet crystals and operating at 1.06 μm.

Characteristics

Due to its operation between low-lying vibrational energy states of the CO2 molecule, the CO2 laser has a high quantum efficiency, ~40%, which makes it extremely attractive as a high-power industrial materials-processing laser (1 to 20 kW), where energy and running costs are a major consideration. Due to the requirement for cooling to retain the population inversion, the efficiency of electrical pumping, and optical losses, commercial systems have an overall efficiency of approximately 10%. Whilst this may seem low, for lasers this is still a high efficiency. The CO2 laser is widely used in other fields, for example, surgical applications, remote sensing, and measurement. It emits infrared radiation with a wavelength that can range from 9 μm up to 11 μm. The laser transition may occur on one of two transitions: (00°1) → (10°0), λ = 10.6 μm, or (00°1) → (02°0), λ = 9.6 μm; see Figure 1. The 10.6 μm transition has the maximum probability of oscillation and gives the strongest output; hence, this is the usual wavelength of operation, although for specialist applications the laser can be forced to operate on the 9.6 μm line.

Figure 1 illustrates an energy level diagram with four vibrational energy groupings that include all the significantly populated energy levels. The internal relaxation rates within these groups are considered to be infinitely fast when compared with the rate of energy transfer between these groups. In reality the internal relaxation rates are at least an order of magnitude greater than the rates between groups.

Excitation of the upper laser level is usually provided by an electrical glow discharge. However, gas dynamic lasers have been built where expanding a hot gas through a supersonic nozzle creates the population inversion; this creates a nonequilibrium region in the downstream gas stream with a large population inversion, which produces a very high-power output beam (135 kW, Avco Everett Research Laboratory). For some time the gas dynamic laser was seriously considered for use in the space-based Strategic Defense Initiative (SDI, USA). The gas mixture used in a CO2 laser is usually a mixture of carbon dioxide, nitrogen, and helium. The proportions of these gases vary from one laser system to another; however, a typical mixture is 10% CO2;
Figure 1 Six level model used for the theoretical description of CO2 laser action.
Molecular Dynamics

Direct Excitation and De-excitation

It is usual for excitation to be provided by an electrical glow discharge. The direct excitation of carbon dioxide (CO2) and nitrogen (N2) ground-state molecules proceeds via inelastic collisions with fast electrons. The rates of kinetic energy transfer are α and γ, respectively, and are given by eqns [1] and [2]:

α = F_CO2 × IP / (E_0001 × n0)   (sec⁻¹)   [1]

γ = F_N2 × IP / (E_v=1 × n4)   (sec⁻¹)   [2]

where F_CO2 is the fraction of the input power (IP) coupled into the excitation of the energy level E_0001; n0 is the CO2 ground-level population density; F_N2 is the fraction of the input power (IP) coupled into the excitation of the energy level E_v=1; and n4 is the N2 ground-level population density.

The reverse process occurs when molecules lose energy to the electrons and the electrons gain an equal amount of kinetic energy; the direct de-excitation rates are given by η and β, eqns [3] and [4], respectively:

η = α × exp(E_0001/E_e)   (sec⁻¹)   [3]

β = γ × exp(E_v=1/E_e)   (sec⁻¹)   [4]

where E_e is the average electron energy in the discharge.

Figure 2 Electron energy distribution function.

E_e, F_CO2, and F_N2 are obtained by solution of the Boltzmann transport equation (BTE); the average electron energy can be optimized to maximize the efficiency (F_CO2, F_N2) with which electrical energy is utilized to create a population inversion. Hence, the discharge conditions required to maximize efficiency can be predicted from the transport equation. Figure 2 shows one solution of the BTE for the electron energy distribution function.

Resonant energy transfer between the CO2 (0001) and N2 (v = 1) energy levels (denoted 1 and 5 in Figure 1) proceeds via excited molecules colliding with ground-state molecules. A large percentage of the excitation of the upper laser level takes place via collisions between excited N2 molecules and ground-state CO2 molecules. The generally accepted rate of this energy transfer is given by eqns [5] and [6]:

K51 = 19 000 P_CO2   (sec⁻¹)   [5]

K15 = 19 000 P_N2   (sec⁻¹)   [6]

where P_CO2 and P_N2 are the respective gas partial pressures in Torr.

Hence, CO2 molecules are excited into the upper laser level by both electron impact and impact with excited N2 molecules. The contribution from N2 molecules can be greater than 40% depending on the discharge conditions.
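As a numerical illustration of eqns [1]–[4], the sketch below evaluates the excitation and de-excitation rates for a set of assumed discharge parameters. Every number here is an illustrative placeholder, not a value from the article:

```python
import math

# Assumed, illustrative discharge parameters (not taken from the article):
IP = 2.0e5               # input power coupled into the discharge
F_CO2, F_N2 = 0.5, 0.4   # fractions of IP exciting E_0001 and E_v=1
E_0001 = 0.291           # CO2 (0001) level energy, eV
E_v1 = 0.289             # N2 (v = 1) level energy, eV
E_e = 1.5                # average electron energy, eV (from the BTE solution)
n0 = 1.0e17              # CO2 ground-level population density
n4 = 1.0e17              # N2 ground-level population density

alpha = F_CO2 * IP / (E_0001 * n0)    # eqn [1]: CO2 excitation rate, sec^-1
gamma = F_N2 * IP / (E_v1 * n4)       # eqn [2]: N2 excitation rate, sec^-1
eta = alpha * math.exp(E_0001 / E_e)  # eqn [3]: CO2 de-excitation rate, sec^-1
beta = gamma * math.exp(E_v1 / E_e)   # eqn [4]: N2 de-excitation rate, sec^-1
```

Note that η and β exceed α and γ by the exponential factor, so the net inversion depends on how the discharge conditions set the average electron energy E_e.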
Collision-Induced Vibrational Relaxation of the Upper and Lower Laser Levels

The important vibrational relaxation processes are illustrated by Figure 1 and can be evaluated from eqns [7]–[10], where the subscripts refer to the rates between energy levels 1 and 32; 21 and 31; 22 and 31; and 32 and 0, respectively:

K132 = 367 P_CO2 + 110 P_N2 + 67 P_He   (sec⁻¹)   [7]

K2131 = 6 × 10⁵ P_CO2   (sec⁻¹)   [8]

K2231 = 5.15 × 10⁵ P_CO2   (sec⁻¹)   [9]

…where T is the absolute temperature and n refers to the population density of the gas designated by the subscript. This expression takes account of the different constituent molecular velocity distributions and the different collision cross-sections for CO2 → CO2, N2 → CO2, and He → CO2 type collisions. Equation [13] also takes account of the significant line-broadening effect of helium.

Neglecting the unit change in rotational quantum number, the energy-level degeneracies g1 and g2 may be dropped. n1000 is partitioned such that n1000 = 0.1452 n2, and eqn [12] can be recast as eqn [14]:

g = σ(n1 − 0.1452 n2)   cm⁻¹   [14]
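Equations [5]–[9] and the gain expression [14] are simple enough to evaluate directly. In the sketch below the partial pressures assume the typical 10% CO2 mixture mentioned earlier at roughly 100 Torr total (the N2/He split is an assumption), and the cross-section and population densities are purely illustrative:

```python
# Assumed partial pressures in Torr (10% CO2 mix, ~100 Torr total; the
# N2/He split is an assumption for illustration only):
P_CO2, P_N2, P_He = 10.0, 20.0, 70.0

# Collision-induced vibrational relaxation rates, sec^-1:
K132 = 367.0 * P_CO2 + 110.0 * P_N2 + 67.0 * P_He   # eqn [7]
K2131 = 6.0e5 * P_CO2                               # eqn [8]
K2231 = 5.15e5 * P_CO2                              # eqn [9]

# Resonant transfer rates, eqns [5] and [6], sec^-1:
K51 = 19_000.0 * P_CO2
K15 = 19_000.0 * P_N2

# Small-signal gain, eqn [14]; sigma, n1, n2 are assumed values:
sigma = 1.0e-18          # stimulated-emission cross-section, cm^2 (assumed)
n1 = 2.0e17              # upper-group population density, cm^-3 (assumed)
n2 = 5.0e16              # lower-group population density, cm^-3 (assumed)
g = sigma * (n1 - 0.1452 * n2)   # gain coefficient, cm^-1
```

The linear pressure dependence of the K rates is what makes the gas mixture such an effective control over the pulse dynamics discussed below.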
feed all their energy through the P(22) transition. This model simply assumes constant intensity, basing laser performance on the performance of an average unit volume. By introducing the stimulated-emission term into the molecular rate equations, which describe the rate of transfer of molecules between the various energy levels illustrated in Figure 1, a set of molecular rate eqns [17]–[21] can be written that permit simulation of the performance of a carbon dioxide laser:

dn1/dt = αn0 − ηn1 + K51n5 − K15n1 − Kspn1 − K132[n1 − (n1/n32)e n32] − Ip cg   [17]

dn2/dt = Kspn1 + Ip cg − K2131[n21 − (n21/n31)e n31] − K2231[n22 − (n22/n31)e n31]   [18]

dn3/dt = 2K2131[n21 − (n21/n31)e n31] + 2K2231[n22 − (n22/n31)e n31] + K132[n1 − (n1/n32)e n32] − K320[n32 − (n32/n0)e n0]   [19]

dn5/dt = γn4 − βn5 − K51n5 + K15n1   [20]

dIp/dt = Ip cg − Ip/T0   [21]

The terms in square brackets ensure that the system maintains thermodynamic equilibrium; the subscript 'e' indicates that the population ratios in the square brackets take their thermodynamic-equilibrium values. The set of five simultaneous differential equations can be solved using a Runge–Kutta method. They can provide valuable performance-prediction data that is helpful in optimizing laser design, especially when the laser is operated in the pulsed mode. Figure 3a illustrates some simulation results for the transverse-flow laser shown in Figure 9. The results illustrate the effect of altering the gas mixture and how this can be used to control the gain-switched spike that would result in unwelcome workpiece plasma generation if allowed to become too large. Figure 3b shows an experimental laser output pulse from the high pulse repetition frequency (prf – 5 kHz) laser illustrated in Figure 9. This illustrates that even a quite basic physical model can give a good prediction of laser output performance.

Optical Resonator

Figure 4 shows a simple schematic of an optical resonator. This simple optical system consists of two mirrors (the full reflector is often a water-cooled, gold-coated copper mirror), which are aligned to be orthogonal to the optical axis that runs centrally along the length of the active gain medium in which there is a population inversion. The output coupler is a partial reflector (usually dielectrically coated zinc selenide – ZnSe – that may be edge cooled) so that some of the electromagnetic radiation can escape as an output beam. The ZnSe output coupler has a natural reflectivity of about 17% at each air–solid interface. For high-power lasers (2 kW), 17% is sufficient for laser operation; however, depending on the required laser performance, the inside surface is often given a reflective coating. The reflectivity of the inside face depends on the balance between the gain (eqn [14]), the output power, and the power-stability requirements. The outside face of the partial reflector must be anti-reflection (AR) coated.

Spontaneous emission occurs within the active gain medium and radiates randomly in all directions; a fraction of this spontaneous emission will be in the same direction as the optical axis, perpendicular to the end mirrors, and will also fall into a resonant mode of the optical resonator. Spontaneous-emission photons interact with CO2 molecules in the excited upper laser level, state (0001), which stimulates these molecules to give up a quantum of vibrational energy as photons via the radiative transition (0001) → (1000), λ = 10.6 μm. The radiation given up will have exactly the same phase and direction as the stimulating radiation and thus will be coherent with it. The reverse process of absorption also occurs, but so long as there is a population inversion there will be a net positive output. This process is called light amplification by stimulated emission of radiation (LASER). The mirrors continue to redirect the photons parallel to the optical axis and, so long as the population inversion is not depleted, more and more photons are created by stimulated emission, which comes to dominate over the spontaneous emission that initiates laser action.

Light emitted by lasers contains several optical frequencies, which are a function of the different modes of the optical resonator; these are simply the standing-wave patterns that can exist within the resonator structure. There are two types of …
Figure 3 (a) Predicted output pulses for a transverse-flow CO2 laser for different gas mixtures; (b) experimental output pulse from a transverse-flow CO2 laser.
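The Runge–Kutta approach behind simulations like Figure 3a can be sketched numerically. The fragment below integrates a deliberately reduced model — one upper level, one lower level, and the intracavity intensity, rather than the full six-level set of eqns [17]–[21] — with a classical fourth-order Runge–Kutta step. Every constant is an assumed round number chosen only to give stable, qualitatively sensible behavior:

```python
# Reduced gain-switched rate model (upper level n1, lower level n2, intensity Ip)
# integrated with a classical 4th-order Runge-Kutta step. All constants are
# assumed, illustrative values -- not the coefficients of eqns [17]-[21].

PUMP = 1.0e22      # pumping rate into the upper level, cm^-3 s^-1
K_UP = 5.0e5       # upper-level relaxation rate, s^-1
K_LOW = 1.0e6      # lower-level emptying rate, s^-1
GAIN_C = 2.0e-12   # coupling of inversion to stimulated emission
TAU_C = 1.0e-7     # cavity photon lifetime, s
SEED = 1.0e-3      # spontaneous-emission seeding rate, s^-1

def derivs(state):
    n1, n2, ip = state
    stim = GAIN_C * ip * (n1 - n2)            # stimulated-emission term
    return (PUMP - K_UP * n1 - stim,          # dn1/dt
            stim + K_UP * n1 - K_LOW * n2,    # dn2/dt
            stim - ip / TAU_C + SEED * n1)    # dIp/dt

def rk4_step(state, dt):
    k1 = derivs(state)
    k2 = derivs(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
    k3 = derivs(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
    k4 = derivs(tuple(s + dt * k for s, k in zip(state, k3)))
    return tuple(s + dt * (a + 2 * b + 2 * c + d) / 6.0
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

state = (0.0, 0.0, 0.0)
for _ in range(2000):            # 2 microseconds in 1 ns steps
    state = rk4_step(state, 1.0e-9)
```

In the full model the square-bracket equilibrium terms, the N2 reservoir (n5), and the gas-mixture dependence through the pressure-dependent K rates of eqns [7]–[10] are added in exactly the same way.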
Figure 6 Schematic for sealed-off slab laser and diffusion cooled laser with RF excitation (courtesy of Rofin).
This design of laser results in a device with an extremely small footprint that has low maintenance and running costs. Applications include cutting, welding, and surface engineering.

Fast Transverse Flow Laser

In the fast transverse flow laser (Figure 9a) the gas flow, electrical discharge, and the output beam are at right angles to each other (Figure 9b). The transverse discharge can be high-voltage DC, RF, or pulsed up to 8 kHz (Figure 9c). Very high output power per unit discharge length can be obtained with an optimal total pressure (P) of ~100 Torr; systems are available delivering 10 kW of output power, CW or pulsed (see Figures 3a and b). The increase in total pressure requires a corresponding increase in the gas-discharge electric field, E, as the ratio E/P must remain constant, since this determines the temperature of the discharge electrons, which have an optimum mean value (optimum energy distribution, Figure 2) for efficient excitation of the population inversion. With this high value of electric field, a longitudinal-discharge arrangement is impractical (500 kV for a 1 m discharge length); hence, the discharge is applied perpendicular to the optical axis. Fast transverse flow gas lasers provided the first multikilowatt outputs but tend to be expensive to maintain and operate. In order to obtain a reasonable beam quality, the output coupling is often obtained using a multipass unstable resonator. As the population inversion is available over a wide rectangular cross-section – a disadvantage of this arrangement – beam quality is not as good as that obtainable from fast axial flow designs. For this reason this type of laser is suitable for a wide range of welding and surface-treatment applications.

Transversely Excited Atmospheric (TEA) Pressure

If the gas total pressure is increased above ~100 Torr it is difficult to sustain a stable glow discharge, because above this pressure instabilities degenerate into arcs within the discharge volume. This problem can be overcome by pulsed excitation; with submicrosecond pulse durations, instabilities do not have sufficient time to develop; hence, the gas pressure can be increased above atmospheric pressure and the laser can be operated in a pulsed mode. In a mode-locked format, optical pulses shorter than 1 ns can be produced. This is called a TEA laser and, with a transverse gas flow, is capable of producing short high-power pulses at up to a few kHz repetition frequency. In order to prevent arc formation, TEA lasers usually employ ultraviolet or e-beam preionization of the gas discharge just prior to the main current pulse being applied via a thyratron switch. Output coupling is usually via an unstable resonator. TEA lasers are used for marking, remote sensing, range-finding, and scientific applications.

Conclusions

It is 40 years since Patel operated the first high-power CO2 laser. This led to the first generation of lasers, which were quickly exploited for industrial laser materials processing, medical applications, defense, …
…and scientific research applications; however, the first generation of lasers was quite unreliable and temperamental. After many design iterations, the CO2 laser has now matured into a reliable, stable laser source available in many different geometries and power ranges. The low cost of ownership of the latest generation of CO2 lasers makes them a very attractive commercial proposition for many industrial and scientific applications. Commercial lasers incorporate many novel design features that are beyond the scope of this article and are often peculiar to the particular laser manufacturer. This includes gas additives and catalysts that may be required to stabilize the gas discharge of a particular laser design; it is this optimization of the laser design that has produced such reliable and controllable low-cost performance from the CO2 laser.

Figure 9 (a) Transverse-flow carbon dioxide laser gas recirculator; (b) transverse-flow carbon dioxide electrodes; (c) transverse-flow carbon dioxide gas discharge as seen from the output window.
Further Reading

Hoffman AL and Vlases GC (1972) A simplified model for predicting gain, saturation and pulse length for gas dynamic lasers. IEEE Journal of Quantum Electronics 8(2): 46.
Johnson DC (1971) Excitation of an atmospheric pressure CO2–N2–He laser by capacitor discharges. IEEE Journal of Quantum Electronics QE-7(5): 185.
Koechner W (1988) Solid State Laser Engineering. Berlin: Springer Verlag.
Kogelnik H and Li T (1966) Laser beams and resonators. Applied Optics 5: 1550–1567.
Levine AK and De Maria AJ (1971) Lasers, vol. 3. New York: Marcel Dekker, Chapter 3.
Moeller G and Ridgen JD (1965) Applied Physics Letters 7: 274.
Moore CB, Wood RE, Bei-Lok Hu and Yardley JT (1967) Vibrational energy transfer in CO2 lasers. Journal of Chemical Physics 11: 4222.
Patel CKN (1964) Physical Review Letters 12: 588.
Siegman A (1986) Lasers. Mill Valley, CA: University Science Books.
Smith K and Thomson RM (1978) Computer Modeling of Gas Lasers. New York and London: Plenum Press.
Sobolev NN and Sokovikov VV (1967) CO2 lasers. Soviet Physics USPEKHI 10(2): 153.
Svelto O (1998) Principles of Lasers, 4th edn. New York: Plenum.
Tychinskii VP (1967) Powerful lasers. Soviet Physics USPEKHI 10(2): 131.
Vlases GC and Money WM (1972) Numerical modelling of pulsed electric CO2 lasers. Journal of Applied Physics 43(4): 1840.
Wagner WG, Haus HA and Gustafson KT (1968) High rate optical amplification. IEEE Journal of Quantum Electronics QE-4: 287.
Witteman W (1987) The CO2 Laser. Springer Series in Optical Sciences, vol. 53.
Dye Lasers
F J Duarte, Eastman Kodak Company, New York, NY, USA
A Costela, Consejo Superior de Investigaciones Científicas, Madrid, Spain

© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

Background

Dye lasers are the original tunable lasers. Discovered in the mid-1960s, these tunable sources of coherent radiation span the electromagnetic spectrum from the near-ultraviolet to the near-infrared (Figure 1). Dye lasers spearheaded and sustained the revolution in atomic and molecular spectroscopy and have found use in many and diverse fields, from medical to military applications. In addition to their extraordinary spectral versatility, dye lasers have been shown to oscillate from the femtosecond pulse domain to the continuous wave (cw) regime. For microsecond pulse emission, energies of up to hundreds of joules per pulse have been demonstrated. Further, operation at high pulsed repetition frequencies (prfs), in the multi-kHz regime, has provided average powers at kW levels. This unrivaled operational versatility is summarized in Table 1.

Figure 1 Approximate wavelength span from the various classes of laser dye molecules. Reproduced with permission from Duarte FJ (1995) Tunable Laser Handbook. New York: Academic Press.

Dye lasers are excited by coherent optical energy from an excitation, or pump, laser or by optical energy from specially designed lamps called flashlamps. Recent advances in semiconductor laser
LASERS / Dye Lasers 401
Table 1 (column headers only; data not reproduced): Dye laser class; Spectral coverage; Energy per pulse; Prf; Power.
technology have made it possible to construct very compact all-solid-state excitation sources that, coupled with new solid-state dye laser materials, should bring the opportunity to build compact tunable laser systems for the visible spectrum. Further, direct diode-laser pumping of solid-state dye lasers should prove even more advantageous, enabling the development of fairly inexpensive tunable narrow-linewidth solid-state dye laser systems for spectroscopy and other applications requiring low powers. Work on electrically excited organic gain media might also provide new avenues for further progress.

The literature of dye lasers is very rich and many review articles have been written describing and discussing traditional dye lasers utilizing liquid gain media. In particular, the books Dye Lasers, Dye Laser Principles, High Power Dye Lasers, and Tunable Lasers Handbook provide excellent sources of authoritative and detailed description of the physics and technology involved. In this article we offer only a survey of the operational capabilities of dye lasers using liquid gain media, in order to examine with more attention the field of solid-state dye lasers.

Brief History of Dye Lasers

1965: Quantum theory of dyes is discussed in the context of the maser (R. P. Feynman).
1966: Dye lasers are discovered (P. P. Sorokin and J. R. Lankard; F. P. Schäfer and colleagues).
1967: The flashlamp-pumped dye laser is discovered (P. P. Sorokin and J. R. Lankard; W. Schmidt and F. P. Schäfer).
1967–1968: Solid-state dye lasers are discovered (B. H. Soffer and B. B. McFarland; O. G. Peterson and B. B. Snavely).
1968: Mode-locking, using saturable absorbers, is demonstrated in dye lasers (W. Schmidt and F. P. Schäfer).
1970: The continuous-wave (cw) dye laser is discovered (O. G. Peterson, S. A. Tuccio, and B. B. Snavely).
1971: The distributed feedback dye laser is discovered (H. Kogelnik and C. V. Shank).
1971–1975: Prismatic beam expansion in dye lasers is introduced (S. A. Myers; E. D. Stokes and colleagues; D. C. Hanna and colleagues).
1972: Passive mode-locking is demonstrated in cw dye lasers (E. P. Ippen, C. V. Shank, and A. Dienes).
1972: The first pulsed narrow-linewidth tunable dye laser is introduced (T. W. Hänsch).
1973: Frequency stabilization of cw dye lasers is demonstrated (R. L. Barger, M. S. Sorem, and J. L. Hall).
1976: Colliding-pulse mode-locking is introduced (I. S. Ruddock and D. J. Bradley).
1977–1978: Grazing-incidence grating cavities are introduced (I. Shoshan and colleagues; M. G. Littman and H. J. Metcalf; S. Saikan).
1978–1980: Multiple-prism grating cavities are introduced (T. Kasuya and colleagues; G. Klauminzer; F. J. Duarte and J. A. Piper).
1981: Prism pre-expanded grazing-incidence grating oscillators are introduced (F. J. Duarte and J. A. Piper).
1982: Generalized multiple-prism dispersion theory is introduced (F. J. Duarte and J. A. Piper).
1983: Prismatic negative dispersion for pulse compression is introduced (W. Dietel, J. J. Fontaine, and J.-C. Diels).
1987: Laser pulses as short as six femtoseconds are demonstrated (R. L. Fork, C. H. Brito Cruz, P. C. Becker, and C. V. Shank).
1994: First narrow-linewidth solid-state dye laser oscillator (F. J. Duarte).
1999–2000: Distributed feedback solid-state dye lasers are introduced (Wadsworth and colleagues; Zhu and colleagues).
…second), the dye solution might be static. However, for high-prf operation (a few thousand pulses per second) the dye solution must be flowed at speeds of up to a few meters per second in order to dissipate the heat.

A simple broadband optically pumped dye laser can be constructed using just the pump laser, the active medium, and two mirrors to form a resonator. In order to achieve tunable, narrow-linewidth emission, a more sophisticated resonator must be employed. This is called a dispersive tunable oscillator and is depicted in Figure 3. In a dispersive tunable oscillator the exit side of the cavity comprises a partial reflector, or output coupler, and the other end of the resonator is composed of a multiple-prism grating assembly. It is the dispersive characteristics of this multiple-prism grating assembly and the dimensions of the emission beam produced at the gain medium that determine the tunability and the narrowness, or spectral purity, of the laser emission.

Figure 3 Copper-vapor-laser pumped hybrid multiple-prism near-grazing-incidence (HMPGI) grating dye laser oscillator. Adapted with permission from Duarte FJ and Piper JA (1984) Narrow linewidth high prf copper laser-pumped dye-laser oscillators. Applied Optics 23: 1391–1394.

In order to selectively excite a single vibrational–rotational level of a molecule such as iodine (I2) at room temperature, one needs a laser linewidth of Δν ≈ 1.5 GHz (or Δλ ≈ 0.0017 nm at λ = 590 nm). The hybrid multiple-prism near-grazing-incidence (HMPGI) grating dye laser oscillator illustrated in Figure 4 yields laser linewidths in the 400 MHz ≤ Δν ≤ 650 MHz range at 4–5% conversion efficiencies whilst excited by a copper-vapor laser operating at a prf of 10 kHz. Pulse lengths are ~10 ns at full-width half-maximum (FWHM). The narrow-linewidth emission from these oscillators is said to be single-longitudinal-mode lasing because only one electromagnetic mode is allowed to oscillate.

Figure 4 Flashlamp-pumped multiple-prism grating oscillators. From Duarte FJ, Davenport WE, Ehrlich JJ and Taylor TS (1991) Ruggedized narrow-linewidth dispersive dye laser oscillator. Optics Communications 84: 310–316. Reproduced with permission from Elsevier.

The emission from oscillators of this class can be amplified many times by propagating the tunable narrow-linewidth laser beam through single-pass amplifier dye cells under the excitation of pump lasers. Such amplified laser emission can reach enormous average powers. Indeed, a copper-vapor-laser pumped dye laser system at the Lawrence Livermore National Laboratory (USA), designed for the laser isotope separation program, was reported to
yield average powers in excess of 2.5 kW, at a prf of 13.2 kHz, at a better than 50% conversion efficiency, as reported by Bass and colleagues in 1992. Besides high conversion efficiencies, excitation with copper-vapor lasers, at λ = 510.554 nm, has the advantage of inducing little photodegradation in the active medium, thus allowing very long dye lifetimes.

Flashlamp-Pumped Dye Lasers

Flashlamps utilized in dye laser excitation emit at black-body temperatures in the 20 000 K range, thus yielding intense ultraviolet radiation centered around 200 nm. One further requirement for flashlamps, and their excitation circuits, is to deliver light pulses with a fast rise time. For some flashlamps this rise time can be less than a few nanoseconds.

Flashlamp-pumped dye lasers differ from laser-pumped pulsed dye lasers mainly in the pulse energies and pulse lengths attainable. This means that flashlamp-pumped dye lasers, using relatively large volumes of dye, can yield very large pulse energies. Excitation geometries use either coaxial lamps, with the dye flowing in a quartz cylinder at the center of the lamp, or two or more linear lamps arranged symmetrically around the quartz tube containing the dye solution. Using a relatively weak dye solution of rhodamine 6G (2.2 × 10⁻⁵ M), a coaxial lamp, and an active region defined by a quartz tube 6 cm in diameter and 60 cm in length, Baltakov and colleagues, in 1974, reported energies of 400 J in pulses 25 μs long at FWHM.

Flashlamp-pumped tunable narrow-linewidth dye laser oscillators described by Duarte and colleagues, in 1991, employ a cylindrical active region 6 mm in diameter and 17 cm in length. The dye solution is made of rhodamine 590 at a concentration of 1 × 10⁻⁵ M. This active region is excited by a coaxial flashlamp. Using multiple-prism grating architectures (see Figure 4), these authors achieved a diffraction-limited TEM00 laser beam and laser linewidths of Δν ≈ 300 MHz at pulse energies in the 2–3 mJ range. The laser pulse duration is reported to be Δt ≈ 100 ns. The laser emission from this class of multiple-prism grating oscillator is reported to be extremely stable. The tunable narrow-linewidth emission from these dispersive oscillators is either used directly in spectroscopic or other scientific applications, or is utilized to inject large flashlamp-pumped dye laser amplifiers to obtain multi-joule pulse energies with the laser linewidth characteristics of the oscillator.

Continuous Wave Dye Lasers

CW dye lasers use dye flowing at linear speeds of up to 10 meters per second, which are necessary to remove the excess heat and to quench the triplet states. In the original cavity reported by Peterson and colleagues, in 1970, a beam from an Ar⁺ laser was focused onto an active region contained within the resonator. The resonator comprised dichroic mirrors that transmit the blue-green radiation of the pump laser and reflect the red emission from the dye molecules. Using a pump power of about 1 W, in a TEM00 laser beam, these authors reported a dye laser output of 30 mW. Subsequent designs replaced the dye cell with a dye jet, introduced external mirrors, and integrated dispersive elements in the cavity. Dispersive elements such as prisms and gratings are used to tune the wavelength output of the laser. Frequency-selective elements, such as etalons and other types of interferometers, are used to induce frequency narrowing of the tunable emission. Two typical cw dye laser cavity designs are described by Hollberg, in 1990, and are reproduced here in Figure 5. The first design is a linear three-mirror folded cavity. The second is a figure-eight ring dye laser cavity comprised of mirrors M1, M2, M3, and M4. Linear cavities exhibit the effect of spatial hole burning, which allows the cavity to lase in more than one longitudinal mode. This problem can be overcome in ring cavities (Figure 5b), where the laser emission is in the form of a traveling wave.

Figure 5 CW laser cavities: (a) linear cavity and (b) ring cavity. Adapted from Hollberg LW (1990) CW dye lasers. In: Duarte FJ and Hillman LW (eds) Dye Laser Principles, pp. 185–238. New York: Academic Press. Reproduced with permission from Elsevier.

Two aspects of cw dye lasers are worth emphasizing. One is the availability of relatively high powers in single-longitudinal-mode emission and the other is the
demonstration of very stable laser oscillation. First, Johnston and colleagues, in 1982, reported 5.6 W of stabilized laser output in a single longitudinal mode at 593 nm at a conversion efficiency of 23%. In this work eleven dyes were used to span the spectrum continuously from ~400 nm to ~900 nm. In the area of laser stabilization and ultra-narrow-linewidth oscillation it is worth mentioning the work of Hough and colleagues, in 1984, who achieved laser linewidths of less than 750 Hz employing an external reference cavity.

Ultrashort-Pulse Dye Lasers

…were incorporated in order to introduce negative dispersion, by Dietel and colleagues in 1983, and thus subtract dispersion from the cavity and ultimately provide the compensation needed to produce femtosecond pulses.

The shortest pulse obtained from a dye laser, 6 fs, was reported by Fork and colleagues, in 1987, using extra-cavity compression. In that experiment, a dye laser incorporating CPM and prismatic compensation was used to generate pulses that were amplified by a copper-vapor laser at a prf of 8 kHz. The amplified pulses, of a duration of 50 fs, were then propagated through two grating pairs and a four-prism sequence for further compression.
In this equation …

Figure 7 Generalized multiple-prism arrays in (a) additive and (b) compensating configurations. From Duarte FJ (1990) Narrow-linewidth pulsed dye laser oscillators. In: Duarte FJ and Hillman LW (eds) Dye Laser Principles, pp. 133–183. New York: Academic Press. Reproduced with permission from Elsevier.
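The difference between the additive (a) and compensating (b) arrays of Figure 7 can be made concrete with eqn [8], which holds for Brewster-angle incidence and orthogonal beam exit. The index and dn/dλ below are generic, assumed values, not data for any particular prism material:

```python
# Return-pass dispersion of a multiple-prism beam expander, eqn [8]:
#   grad_lambda(Phi_P) = 2 * sum_m (+/-1) * n_m**(m-1) * dn_m/dlambda
# Assumed material values (roughly fused-silica-like, illustration only):
N_INDEX = 1.46       # refractive index at the working wavelength
DN_DLAMBDA = -4.0e4  # dn/dlambda, m^-1

def prism_dispersion(signs):
    """signs[m-1] is the (+/-1) factor of the m-th prism in eqn [8]."""
    return 2.0 * sum(sign * N_INDEX ** (m - 1) * DN_DLAMBDA
                     for m, sign in enumerate(signs, start=1))

additive = prism_dispersion([+1, +1, +1, +1])       # configuration (a)
compensating = prism_dispersion([+1, -1, +1, -1])   # configuration (b)
```

With identical prisms the alternating signs largely cancel the accumulated dispersion; driving it exactly to zero at a chosen wavelength requires adjusting the individual prism geometries through the general expression, eqn [6].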
Here, k1,m and k2,m represent the physical beam expansion experienced by the incident and the exit beams, respectively.

Equation [2] indicates that ∇λφ2,m, the cumulative dispersion at the mth prism, is a function of the geometry of the mth prism, the position of the light beam relative to this prism, the refractive index of the prism, and the cumulative dispersion up to the previous prism, ∇λφ2,(m−1).

For an array of r identical isosceles, or equilateral, prisms arranged symmetrically in an additive configuration, so that the angles of incidence and emergence are the same, the cumulative dispersion reduces to

∇λφ2,r = r ∇λφ2,1   [5]

Under these circumstances the dispersions add in a simple and straightforward manner. For configurations incorporating right-angle prisms, the dispersions need to be handled mathematically in a more subtle form.

The generalized double-pass, or return-pass, dispersion for multiple-prism beam expanders was introduced by Duarte, in 1985: … Here, M1 and M2 are the beam magnification factors given by

M1 = ∏(m = 1 to r) k1,m   [7a]

M2 = ∏(m = 1 to r) k2,m   [7b]

For a multiple-prism expander designed for an orthogonal beam exit, and Brewster's angle of incidence, eqn [6] reduces to the succinct expression given by Duarte, in 1990:

∇λΦP = 2 Σ(m = 1 to r) (±1)(nm)^(m−1) ∇λnm   [8]

Equation [6] can be used either to quantify the overall dispersion of a given multiple-prism beam expander or to design a prismatic expander yielding zero dispersion, that is, ∇λΦP = 0, at a given wavelength.

Physics and Architecture of Solid-State Dye-Laser Oscillators
…telescope. The linewidth equation including the intracavity beam magnification factor can be written as

Δλ ≈ Δθ (M ∇λΘG)⁻¹   [9]

From this equation it can be deduced that a narrow Δλ is achieved by reducing Δθ and increasing the overall intracavity dispersion (M ∇λΘG). The intracavity dispersion is optimized by expanding the size of the intracavity beam incident on the diffractive surface of the tuning grating until it is totally illuminated.

Hänsch used a two-dimensional astronomical telescope to expand the intracavity beam incident on the diffraction grating. A simpler beam expansion method consists in the use of a single-prism beam expander, as disclosed by several authors (Myers, 1971; Stokes and colleagues, 1972; Hanna and colleagues, 1975). An extension and improvement of this approach was the introduction of multiple-prism beam expanders, as reported by Kasuya and colleagues, in 1978, Klauminzer, in 1978, and Duarte and Piper, in 1980. The main advantages of multiple-prism beam expanders over two-dimensional telescopes are simplicity, compactness, and the fact that the beam expansion is reduced from two dimensions to one dimension. Physically, as explained previously, prismatic beam expanders also introduce a dispersion component that is absent in the case of the astronomical telescope. Advantages of multiple-prism beam expanders over single-prism beam expansion are higher transmission efficiency, lower amplified-spontaneous-emission levels, and the flexibility to either augment or reduce the prismatic dispersion.

In general, for a pulsed multiple-prism grating oscillator, Duarte and Piper, in 1984, showed that the return-pass dispersive linewidth is given by

Δλ = ΔθR (MR ∇λΘG + R ∇λΦP)⁻¹   [10]

At present, very compact and optimized multiple-prism grating tunable laser oscillators are found in two basic cavity architectures. These are the multiple-prism Littrow (MPL) grating laser oscillator, reported by Duarte in 1999 and illustrated in Figure 8, and the hybrid multiple-prism near-grazing-incidence (HMPGI) grating laser oscillator, introduced by Duarte in 1997 and depicted in Figure 9. In early MPL grating oscillators the individual prisms integrating the multiple-prism expander were deployed in an additive configuration, thus adding the cumulative dispersion to that of the grating and contributing to the overall dispersion of the cavity. In subsequent architectures the prisms were deployed in compensating configurations so as to yield zero dispersion and thus allow the tuning characteristics of the cavity to be determined by the grating exclusively. In this approach the principal role of the multiple-prism array is to expand the beam incident on the grating, thus augmenting significantly the overall dispersion of the cavity as described in eqns [9] or [10]. In this regard, it should be mentioned that beam magnification factors of up to 100, and beyond, have been reported in the literature. Using solid-state laser dye gain media, these MPL and HMPGI grating laser oscillators deliver tunable single-longitudinal-mode emission at laser linewidths 350 MHz ≤ Δν ≤ 375 MHz and pulse lengths in the 3–7 ns (FWHM) range. Long-pulse operation of …

Figure 8 MPL grating solid-state dye laser oscillator: optimized architecture. The physical dimensions of the optical components of this cavity are shown to scale. The length of the solid-state dye gain medium, along the optical axis, is 10 mm. Reproduced with permission from Duarte FJ (1999) Multiple-prism grating solid-state dye laser oscillator: optimized architecture. Applied Optics 38: 6347–6349.

…circumstances, eqn [1] predicts broadband emission which, according to the uncertainty principle, in the form of

ΔνΔt ≈ 1   [12]

can lead to very short pulse emission, since Δν and Δλ are related by the identities

Δλ ≈ λ²/Δx   [13]

and

Δν ≈ c/Δx   [14]
this class of multiple-prism grating oscillators
has been reported by Duarte and colleagues, in
Distributed-Feedback Solid-State Dye Lasers
1998. In these experiments laser linewidths of
Dn < 650 MHz were achieved at pulse lengths in Recently, narrow-linewidth laser emission from solid-
excess of 100 ns (FWHM) under flashlamp-pumped state dye lasers has also been obtained using
dye laser excitation. a distributed-feedback (DFB) configuration by
The dispersive cavity architectures described here Wadsworth and colleagues, in 1999, and Zhu and
have been used with a variety of laser gain media in colleagues, in 2000. As is well known, in a DFB laser
the gas, the liquid, and the solid state. Applications to no external cavity is required, the feedback being
tunable semiconductor lasers have also been reported provided by Bragg reflection from a permanently or
by Zorabedian, in 1992, Duarte, in 1993, and Fox dynamically written grating structure within the gain
and colleagues, in 1997. medium as reported by Kogelnik and Shank, in 1971.
Concepts important to MPL and HMPGI grating By using a DFB laser configuration where the
tunable laser oscillators include the emission of a interference of two pump beams induces the required
single-transverse-mode (TEM00) laser beam in a periodic modulation in the gain medium, laser
compact cavity, the use of multiple-prism arrays, the emission with linewidth in the 0.01 – 0.06 nm range
expansion of the intracavity beam incident on the have been reported. Specifically, Wadsworth and
grating, the control of the intracavity dispersion, and colleagues reported a laser linewidth of 12 GHz
the quantification of the overall dispersion of the (0.016 nm at 616 nm) using Perylene Red doped
multiple-prism grating assembly via generalized dis- PMMA at a conversion efficiency of 20%.
persion equations. Sufficiently high intracavity It should also be mentioned that the DFB laser is
dispersion leads to the achievement of return-pass also a dye laser development that has found wide and
dispersive linewidths close to the free-spectral range of extensive applicability in semiconductor lasers
the cavity. Under these circumstances single-longi- employed in the telecommunications industry.
tudinal-mode lasing is readily achieved as a result of
multipass effects.
Laser Dye Survey
A Note on the Cavity Linewidth Equation
Cyanines
So far we have described how the cavity linewidth
equation These are red and near-infrared dyes which present
long conjugated methine chains (!CH ¼ CH!) and are
Dl < Duð7l uÞ21
useful in the spectral range longer than 800 nm,
and the dispersion equations can be applied to where no other dyes compete with them. An
achieve highly coherent, or very narrow-linewidth, important laser dye belonging to this class is
emission. It should be noted that the same physics can 4-dicyanomethylene-2-methyl-6-( p-dimethylamino-
be applied to achieve ultrashort, or femtosecond, styryl)-4H-pyran (DCM, Figure 10a), with laser
laser pulses. It turns out that intracavity prisms can be emission in the 600 – 700 nm range depending on
configured to yield negative dispersion as described the pump and solvent, liquid or solid, used.
by Duarte and Piper, in 1982, Dietel and colleagues,
Xanthenes
in 1983, and Fork and colleagues in 1984. This
negative dispersion can reduce significantly the These dyes have a xanthene ring as the chromophore
overall dispersion of the cavity. Under those and are classified into rhodamines, incorporating
LASERS / Dye Lasers 409
Figure 10 Molecular structures of some common laser dyes: (a) DCM; (b) Rh6G; (c) PM567; (d) Perylene Orange (R ¼ H)
and Perylene Red (R ¼ C4H6O); (e) Coumarin 307 (also known as Coumarin 503); (f) Coumarin 153 (also known as Coumarin 540A).
amino radical substituents, and fluoresceins, with hydroxyl (OH) radical substituents. They are generally very efficient and chemically stable, and their emission covers the wavelength region from 500 to 700 nm.

Rhodamine dyes are the most important group of all laser materials, with rhodamine 6G (Rh6G, Figure 10b), also called rhodamine 590 chloride, being probably the best known of all laser dyes. Rh6G exhibits an absorption peak, in ethanol, at 530 nm and a fluorescence peak at 556 nm, with laser emission typically in the 550–620 nm region. It has been demonstrated to lase efficiently both in liquid and solid solutions.

Pyrromethenes

Pyrromethene.BF2 complexes are a new class of laser dyes synthesized and characterized more recently, as reported by Shah and colleagues, in 1990, Pavlopoulos and colleagues, in 1990, and Boyer and colleagues, in 1993. These laser dyes exhibit reduced triplet–triplet absorption over their fluorescence and lasing spectral region while retaining a high quantum fluorescence yield. Depending on the substituents on the chromophore, these dyes present laser emission over the spectral region from the green/yellow to the red, competing with the rhodamine dyes, and have been demonstrated to lase with good performance when incorporated into solid hosts. Unfortunately, they are relatively unstable because of the aromatic amine groups in their structure, which render them vulnerable to photochemical reactions with oxygen, as indicated by Rahn and colleagues, in 1997. The molecular structure of a representative and well-known dye of this class, the 1,3,5,7,8-pentamethyl-2,6-diethylpyrromethene-difluoroborate complex (pyrromethene 567, PM567), is shown in Figure 10c. Recently, analogs of dye PM567 substituted at position 8 with an acetoxypolymethylene linear chain or a polymerizable methacryloyloxy polymethylene chain have been developed. These new dipyrromethene.BF2 complexes have demonstrated improved efficiency and better photostability than the parent compound, both in liquid and solid state.

Conjugated Hydrocarbons

This class of organic compounds includes perylenes, stilbenes, and p-terphenyl. Perylene and perylimide dyes are large, nonionic, nonpolar molecules (Figure 10d), characterized by their extreme photostability and negligible singlet–triplet transfer, as discussed by Seybold and Wagenblast and colleagues, in 1989, and have high quantum efficiency because of the absence of nonradiative relaxation. The perylene dyes exhibit limited solubility in conventional solvents but dissolve well in acetone, ethyl acetate, and methyl methacrylate, with emission wavelengths in the orange and red spectral regions.

Stilbene dyes are derivatives of acyclic unsaturated hydrocarbons such as ethylene and butadiene, with emission wavelengths in the 400–500 nm range. Most of them end in phenyl radicals. Although they
are chemically stable, their laser performance is inferior to that of coumarins. p-Terphenyl is a valuable and efficient p-oligophenylene dye with emission in the ultraviolet.

Coumarins

Coumarin derivatives are a popular family of laser dyes with emission in the blue-green region of the spectrum. Their structure is based on the coumarin ring (Figure 10e,f), with different substituents that strongly affect its chemical characteristics and allow coverage of an emission spectral range between 420 and 580 nm. Some members of this class rank among the most efficient laser dyes known, but their chemical photobleaching is rapid compared with xanthene and pyrromethene dyes. An additional class of blue-green dyes is the tetramethyl derivatives of coumarin dyes introduced by Chen and colleagues, in 1988. The emission from these dyes spans the spectrum in the 453–588 nm region. These coumarin analogs exhibit improved efficiency, higher solubility, and better lifetime characteristics than the parent compounds.

Azaquinolone Derivatives

The quinolone and azaquinolone derivatives have a structure similar to that of coumarins but with their laser emission range extended toward the blue by 20–30 nm. The quinolone dyes are used when laser emission in the 400–430 nm region is required. Azaquinolone derivatives, such as 7-dimethylamino-1-methyl-4-methoxy-8-azaquinolone-2 (LD 390), exhibit laser action below 400 nm.

Oxadiazole Derivatives

These are compounds which incorporate an oxadiazole ring with aryl radical substituents and which lase in the 330–450 nm spectral region. Dye 2-(4-biphenyl)-5-(4-t-butylphenyl)-1,3,4-oxadiazole (PBD), with laser emission in the 355–390 nm region, belongs to this family.

Solid-State Laser Dye Matrices

Although some first attempts to incorporate dye molecules into solid matrices were already made in the early days of development of dye lasers, it was not until the late 1980s that host materials with the required properties of optical quality and high damage threshold to laser radiation began to be developed. In subsequent years, the synthesis of new high-performance dyes and the implementation of new ways of incorporating the organic molecules into the solid matrix resulted in significant advances towards the development of practical tunable solid-state dye lasers. A recent detailed review of the work done in this field has been given by Costela et al.

Inorganic glasses, transparent polymers, and inorganic–organic hybrid matrices have been successfully used as host matrices for laser dyes. Inorganic glasses offer good thermal and optical properties but present the difficulty that the high melting temperature employed in the traditional process of glass making would destroy the organic dye molecules. It was not until the development of the low-temperature sol-gel technique for the synthesis of glasses that this limitation was overcome. The sol-gel process, based on inorganic polymerization reactions performed at about room temperature and starting with metallo-organic compounds such as alkoxides or salts, as reported by Brinker and Scherrer, in 1989, provides a safe route for the preparation of rigid, transparent, inorganic matrix materials incorporating laser dyes. Disadvantages of these materials are the possible presence of impurities embedded in the matrix, or the occurrence of interactions between the dispersed dye molecules and the inorganic structure (hydrogen bonds, van der Waals forces) which, in certain cases, can have a deleterious effect on the lasing characteristics of the material.

The use of matrices based on polymeric materials to incorporate organic dyes offers some technical and economical advantages. Transparent polymers exhibit high optical homogeneity, which is extremely important for narrow-linewidth oscillators, as explained by Duarte, in 1994, good chemical compatibility with organic dyes, and allow control over structure and chemical composition, making possible the modification, in a controlled way, of relevant properties of these materials such as polarity, free volume, molecular weight, or viscoelasticity. Furthermore, polymeric materials are amenable to inexpensive fabrication techniques, which facilitate miniaturization and the design of integrated optical systems. The dye molecules can be either dissolved in the polymer or linked covalently to the polymeric chains.

In the early investigations on the use of solid polymer matrices for laser dyes, the main problem to be solved was that of the low resistance to laser radiation exhibited by the then existing materials. In the late 1980s and early 1990s, new modified polymeric organic materials began to be developed, with a laser-radiation-damage threshold comparable to that of inorganic glasses and crystals, as reported by Gromov and colleagues, in 1985, and Dyumaev and colleagues, in 1992. This, together with the above-mentioned advantages, made the use of polymers in solid-state dye lasers both attractive and competitive.
An approach that intends to bring about materials in which the advantages of both inorganic glasses and polymers are preserved, but the difficulties are avoided, is that of using inorganic–organic hybrid matrices. These are silicate-based materials, with an inorganic Si–O–Si backbone, prepared from organosilane precursors by sol-gel processing in combination with organic cross-linking of polymerizable monomers, as reported by Novak, in 1993, Sanchez and Ribot, in 1994, and Schubert, in 1995. In one procedure, laser dyes are mixed with organic monomers, which are then incorporated into the porous structure of a sol-gel inorganic matrix by immersing the bulk in the solution containing monomer and catalyst or photoinitiator, as indicated by Reisfeld and Jorgensen, in 1991, and Bosch and colleagues, in 1996. Alternatively, hybrids can be obtained from organically modified silicon alkoxides, producing the so-called ORMOCERS (organically modified ceramics) or ORMOSILS (organically modified silanes), as reported by Reisfeld and Jorgensen, in 1991. An inconvenient aspect of these materials is the appearance of optical inhomogeneities in the medium, due to the difference in refractive index between the organic and inorganic phases, as well as to the difference in density between monomer and polymer, which causes stresses and the formation of optical defects. As a result, spatial inhomogeneities appear in the laser beam, thereby decreasing its quality, as discussed by Duarte, in 1994.

Organic Hosts

The first polymeric materials with improved resistance to damage by laser radiation were obtained by Russian workers, who incorporated rhodamine dyes into modified poly(methyl methacrylate) (MPMMA), obtained by doping PMMA with low-molecular-weight additives. Some years later, in 1995, Maslyukov and colleagues demonstrated lasing efficiencies in the range 40–60% with matrices of MPMMA doped with rhodamine dyes pumped longitudinally at 532 nm, with a useful lifetime (measured as a 50% efficiency drop) of 15 000 pulses at a prf of 3.33 Hz.

In 1995, Costela and colleagues used an approach based on adjusting the viscoelastic properties of the material by modifying the internal plasticization of the polymeric medium by copolymerization with appropriate monomers. Using dye Rh6G dissolved in a copolymer of 2-hydroxyethyl methacrylate (HEMA) and methyl methacrylate (MMA) and transversal pumping at 337 nm, they demonstrated laser action with an efficiency of 21% and a useful lifetime of 4500 pulses (20 GJ/mol in terms of total input energy per mole of dye molecule when the output energy is down to 50% of its initial value). The useful lifetime increased to 12 000 pulses when the Rh6G chromophore was linked covalently to the polymeric chains, as reported by Costela and colleagues, in 1996.

Comparative studies on the laser performance of Rh6G incorporated either in copolymers of HEMA and MMA or in MPMMA were carried out by Giffin and colleagues, in 1999, demonstrating higher efficiency of the MPMMA materials but superior normalized photostability (up to 240 GJ/mol) of the copolymer formulation.

Laser conversion efficiencies in the range 60–70% for longitudinal pumping at 532 nm were reported by Ahmad and colleagues, in 1999, with PM567 dissolved in PMMA modified with 1,4-diazobicyclo[2,2,2]octane (DABCO) or perylene additives. When using DABCO as an additive, a useful lifetime of up to 550 000 pulses, corresponding to a normalized photostability of 270 GJ/mol, was demonstrated at a 2 Hz prf.

Much lower efficiencies and photostabilities have been obtained with dyes emitting in the blue-green spectral region. Costela and colleagues, in 1996, performed some detailed studies on dyes coumarin 540A and coumarin 503 incorporated into methacrylate homopolymers and copolymers, and demonstrated efficiencies of at most 19%, with useful lifetimes of up to 1200 pulses, at a 2 Hz prf, for transversal pumping at 337 nm. In 2000, Somasundaram and Ramalingam obtained useful lifetimes of 5240 and 1120 pulses, respectively, for coumarin 1 and coumarin 490 dyes incorporated into PMMA rods modified with ethyl alcohol, under transversal pumping at 337 nm and a prf of 1 Hz.

A detailed study of photophysical parameters of R6G-doped HEMA-MMA gain media was performed by Holzer and colleagues, in 2000. Among the parameters determined by these authors are quantum yields, lifetimes, absorption cross-sections, and emission cross-sections. A similar study on the photophysical parameters of PM567 and two PM567 analogs incorporated into solid matrices of different acrylic copolymers and in corresponding mimetic liquid solutions was performed by Bergmann and colleagues, in 2001.

Preliminary experiments with rhodamine 6G-doped MPMMA at high prfs were conducted using a frequency-doubled Nd:YAG laser at 5 kHz by Duarte and colleagues, in 1996. In those experiments, involving longitudinal excitation, the pump laser was allowed to fuse a region at the incidence window of the gain medium so that a lens was formed. This lens and the exit window of the gain medium comprised a
short unstable resonator that yielded broadband emission. In 2001, Costela and colleagues demonstrated lasing in rhodamine 6G- and pyrromethene-doped polymer matrices under copper-vapor-laser excitation at prfs in the 1.0–6.2 kHz range. In these experiments, the dye-doped solid-state matrix was rotated at 1200 rpm. Average powers of 290 mW were obtained at a prf of 1 kHz and a conversion efficiency of 37%. For a short period of time, average powers of up to 1 W were recorded at a prf of 6.2 kHz. In subsequent experiments by Abedin and colleagues, the prf was increased to 10 kHz by using as pump laser a diode-pumped, Q-switched, frequency-doubled, solid-state Nd:YLF laser. Initial average powers of 560 mW and 430 mW were obtained for R6G-doped and PM567-doped polymer matrices, respectively. Lasing efficiency was 16% for R6G and the useful lifetime was 6.6 min (or about 4.0 million shots).

Inorganic and Hybrid Hosts

In 1990, Knobbe and colleagues and McKiernan and colleagues, from the University of California at Los Angeles, incorporated different organic dyes, via sol-gel techniques, into silicate (SiO2), aluminosilicate (Al2O3–SiO2), and ORMOSIL host matrices, and investigated the lasing performance of the resulting materials under transversal pumping. Lasing efficiencies of 25% and useful lifetimes of 2700 pulses at 1 Hz repetition rate were obtained from Rh6G incorporated into aluminosilicate gel and ORMOSIL matrices, respectively. When the dye in the ORMOSIL matrices was coumarin 153, the useful lifetime was still 2250 pulses. Further studies by Altman and colleagues, in 1991, demonstrated useful lifetimes of 11 000 pulses at a 30 Hz prf when rhodamine 6G perchlorate was incorporated into ORMOSIL matrices. The useful lifetime increased to 14 500 pulses when the dye used was rhodamine B.

When incorporated into xerogel matrices, pyrromethene dyes lase with efficiencies as high as the highest obtained in organic hosts, but with much higher photostability: a useful lifetime of 500 000 pulses at 20 Hz repetition rate was obtained by Faloss and colleagues, in 1997, with PM597 using a pump energy of 1 mJ. The efficiency of PM597 dropped to 40%, but its useful lifetime increased to over 1 000 000 pulses when oxygen-free samples were prepared, in agreement with previous results that documented the strong dependence of the laser parameters of pyrromethene dyes on the presence of oxygen. A similar effect was observed with perylene dyes: deoxygenated xerogel samples of Perylene Red exhibited a useful lifetime of 250 000 pulses at a prf of 5 Hz and 1 mJ pump energy, to be compared with a useful lifetime of only 10 000 pulses obtained with samples prepared in normal conditions. Perylene Orange incorporated into polycom glass and pumped longitudinally at 532 nm gave an efficiency of 72% and a useful lifetime of 40 000 pulses at a prf of 1 Hz, as reported by Rahn and King, in 1998.

A direct comparison study of the laser performance of Rh6G, PM567, Perylene Red, and Perylene Orange in organic, inorganic, and hybrid hosts was carried out by Rahn and King in 1995. They found that the nonpolar perylene dyes had better performance in partially organic hosts, whereas the ionic rhodamine and pyrromethene dyes performed best in the inorganic sol-gel glass host. The most promising combinations of dye and host for efficiency and photostability were found to be Perylene Orange in polycom glass and Rh6G in sol-gel glass.

An all-solid-state optical configuration, where the pump was a laser-diode-array side-pumped, Q-switched, frequency-doubled Nd:YAG slab laser, was demonstrated by Hu and colleagues, in 1998, with dye DCM incorporated into ORMOSIL matrices. A lasing efficiency of 18% at the wavelength of 621 nm was obtained, and 27 000 pulses were needed for the output energy to decrease by 10% of its initial value at 30 Hz repetition rate. After 50 000 pulses the output energy decreased by 90%, but it could be recovered after waiting for a few minutes without pumping.

Violet and ultraviolet laser dyes have also been incorporated into sol-gel silica matrices and their lasing properties evaluated under transversal pumping by a number of authors. Although reasonable efficiencies have been obtained with some of these laser dyes, the useful lifetimes are well below 1000 pulses.

More recently, Costela and colleagues have demonstrated improved conversion efficiencies in dye-doped inorganic–organic matrices using pyrromethene 567 dye. The solid matrix in this case is poly-trimethylsilyl-methacrylate cross-linked with ethylene glycol and copolymerized with methyl methacrylate. Also, Duarte and James have reported on dye-doped polymer-nanoparticle laser media where the polymer is PMMA and the nanoparticles are made of silica. These authors report TEM00 laser beam emission and improved dn/dT coefficients of the gain media that result in reduced Δθ.

Dye Laser Applications

Dye lasers generated a renaissance in diverse applied fields such as isotope separation, medicine,
photochemistry, remote sensing, and spectroscopy. Dye lasers have also been used in many experiments that have advanced the frontiers of fundamental physics.

Central to most applications of dye lasers is their capability of providing coherent narrow-linewidth radiation that can be tuned continuously from the near ultraviolet, throughout the visible, to the near infrared. These unique coherence and spectral properties have made dye lasers particularly useful in the field of high-resolution atomic and molecular spectroscopy. Dye lasers have been successfully and extensively used in absorption and fluorescence spectroscopy, Raman spectroscopy, selective excitations, and nonlinear spectroscopic techniques. Well documented is their use in photochemistry, that is, the chemistry of optically excited atoms and molecules, and in biology, studying the biochemical reaction kinetics of biological molecules. By using nonlinear optical techniques, such as harmonic, sum-frequency, and difference-frequency generation, the properties of dye lasers can be extended to the ultraviolet.

Medical applications of dye lasers include cancer photodynamic therapy, treatment of vascular lesions, and lithotripsy, as described by Goldman in 1990.

The ability of dye lasers to provide tunable sub-GHz linewidths in the orange-red portion of the spectrum, at kW average powers, made them particularly suited to atomic vapor laser isotope separation of uranium, as reported by Bass and colleagues, in 1992, Singh and colleagues, in 1994, and Nemoto and colleagues, in 1995. In this particular case the isotopic shift between the two uranium isotopes is a few GHz. Using dye lasers yielding Δν ≤ 1 GHz, the 235U can be selectively excited in a multistep process by tuning the dye laser(s) to specific wavelengths, in the orange-red spectral region, compatible with a transition sequence leading to photoionization. Also, high-prf narrow-linewidth dye lasers, as described by Duarte and Piper, in 1984, are ideally suited for guide-star applications in astronomy. This particular application requires a high-average-power diffraction-limited laser beam at λ ≈ 589 nm to propagate for some 95 km, above the surface of the Earth, and illuminate a layer of sodium atoms. Ground detection of the fluorescence from the sodium atoms allows measurements of the turbulence of the atmosphere. These measurements are then used to control the adaptive optics of the telescope, thus compensating for the atmospheric distortions.

A range of industrial applications is also amenable to the wavelength agility and high-average-power output characteristics of dye lasers. Furthermore, tunability coupled with narrow-linewidth emission, at large pulsed energies, makes dye lasers useful in military applications such as directed energy and damage of optical sensors.

The Future of Dye Lasers

Predicting the future is rather risky. In the early 1990s, articles and numerous advertisements in commercial laser magazines predicted the demise, and even the oblivion, of the dye laser within a few years. It did not turn out this way. The high power and wavelength agility available from dye lasers have ensured their continued use as scientific and research tools in physics, chemistry, and medicine. In addition, the advent of narrow-linewidth solid-state dye lasers has sustained a healthy level of research and development in the field. Today, some 20 laboratories around the world are engaged in this activity.

The future of solid-state dye lasers depends largely on synthetic and manufacturing improvements. There is a need for strict control of the conditions of fabrication and a careful purification of all the compounds involved. In the case of polymers, in particular, stringent control of thermal conditions during the polymerization step is obligatory to ensure that adequate optical uniformity of the polymer matrix is achieved, and that intrinsic anisotropy developed during polymerization is minimized. From an engineering perspective, what are needed are dye-doped solid-state media with improved photostability characteristics and better thermal properties, by which is meant better dn/dT factors. To a certain degree, progress in this area has already been reported. As in the case of liquid dye lasers, the lifetime of the material can be improved by using pump lasers at compatible wavelengths. In this regard, direct diode-laser pumping of dye-doped solid-state matrices should lead to very compact, long-lifetime, tunable lasers emitting throughout the visible spectrum. An example of such a device would be a green laser-diode-pumped rhodamine-6G-doped MPMMA tunable laser configured in a multiple-prism grating architecture.

As far as liquid lasers are concerned, there is a need for efficient water-soluble dyes. Successful developments in this area were published in the literature with coumarin-analog dyes by Chen and colleagues, in 1988, and some encouraging results with rhodamine dyes have been reported by Ray and colleagues, in 2002. Efficient water-soluble dyes could lead to significant improvements and simplifications in high-power tunable lasers.
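The linewidth figures quoted throughout the dye-laser discussion can be cross-checked with the relation Δν ≈ cΔλ/λ², which follows directly from the identities [13] and [14]. A minimal numerical sketch (the 12 GHz at 616 nm DFB figure and the 1 GHz isotope-separation linewidth are taken from the text; the exact rounding of the 0.016 nm value in the source is not reproduced):

```python
# Convert between wavelength linewidth (dlam) and frequency linewidth (dnu)
# using dnu = c * dlam / lam**2, the relation implied by eqns [13] and [14].

C = 2.998e8  # speed of light, m/s


def dnu_from_dlam(dlam_m: float, lam_m: float) -> float:
    """Frequency linewidth (Hz) from wavelength linewidth (m) at wavelength lam_m."""
    return C * dlam_m / lam_m**2


def dlam_from_dnu(dnu_hz: float, lam_m: float) -> float:
    """Wavelength linewidth (m) from frequency linewidth (Hz) at wavelength lam_m."""
    return dnu_hz * lam_m**2 / C


# DFB example from the text: 12 GHz at 616 nm.
dlam_nm = dlam_from_dnu(12e9, 616e-9) * 1e9
print(f"12 GHz at 616 nm -> {dlam_nm:.4f} nm")  # ~0.015 nm; the text quotes ~0.016 nm

# Isotope-separation example: dnu = 1 GHz in the orange-red (lam ~ 590 nm assumed).
dlam_pm = dlam_from_dnu(1e9, 590e-9) * 1e12
print(f"1 GHz at 590 nm -> {dlam_pm:.2f} pm")
```

This also makes clear why sub-GHz dye lasers resolve the few-GHz uranium isotopic shift: 1 GHz corresponds to roughly a picometer of wavelength spread in the orange-red.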
See also

Coherent Control: Experimental. Nonlinear Optics, Applications: Pulse Compression via Nonlinear Optics. Quantum Optics: Entanglement and Quantum Information. Relativistic Nonlinear Optics.

Further Reading

Costela A, García-Moreno I and Sastre R (2001) In: Nalwa HS (ed.) Handbook of Advanced Electronic and Photonic Materials and Devices, vol. 7, pp. 161–208. San Diego, CA: Academic Press.
Demtröder W (1995) Laser Spectroscopy, 2nd edn. Berlin: Springer-Verlag.
Diels J-C and Rudolph W (1996) Ultrashort Laser Pulse Phenomena. New York: Academic Press.
Duarte FJ (ed.) (1991) High Power Dye Lasers. Berlin: Springer-Verlag.
Duarte FJ (ed.) (1995) Tunable Lasers Handbook. New York: Academic Press.
Duarte FJ (ed.) (1995) Tunable Laser Applications. New York: Marcel Dekker.
Duarte FJ (2003) Tunable Laser Optics. New York: Elsevier Academic.
Duarte FJ and Hillman LW (eds) (1990) Dye Laser Principles. New York: Academic Press.
Maeda M (1984) Laser Dyes. New York: Academic Press.
Radziemski LJ, Solarz RW and Paisner JA (eds) (1987) Laser Spectroscopy and its Applications. New York: Marcel Dekker.
Schäfer FP (ed.) (1990) Dye Lasers, 3rd edn. Berlin: Springer-Verlag.
Weber MJ (2001) Handbook of Lasers. New York: CRC.

414 LASERS / Edge Emitters
Edge Emitters

J J Coleman, University of Illinois, Urbana, IL, USA

© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

The semiconductor edge emitting diode laser is a critical component in a wide variety of applications, including fiber optics telecommunications, optical data storage, and optical remote sensing. In this section, we describe the basic structures of edge emitting diode lasers and the physical mechanisms for converting electrical current into light. Laser waveguides and cavity resonators are also outlined. The power, efficiency, gain, loss, and threshold characteristics of laser diodes are presented, along with the effects of temperature and modulation. Quantum well lasers are outlined and a description of grating coupled lasers is provided.

Like all forms of laser, the edge emitting diode laser is an oscillator and has three principal components: a mechanism for converting energy into light, a medium that has positive optical gain, and a mechanism for obtaining optical feedback. The basic edge emitting laser diode is shown schematically in Figure 1. Current flow is vertical through a pn junction and light is emitted from the ends. The dimensions of the laser in Figure 1 are distorted, in proportion to typical real laser diodes, to reveal more detail. In practical laser diodes, the width of the stripe contact is actually much smaller than shown and the thickness is smaller still. Typical dimensions are (stripe width × thickness × length) 2 × 100 × 1000 μm.

Figure 1 Schematic drawing of an edge emitting laser diode.

Light emission is generated only within a few micrometers of the surface, which means that 99.98% of the volume of this small device is inactive. Energy conversion in diode lasers is provided by current flowing through a forward-biased pn junction. At the electrical junction, there is a high density of injected electrons and holes which can recombine in a direct energy gap semiconductor material to give an emitted photon. Charge neutrality requires equal numbers of injected electrons and holes. Figure 2 shows simplified energy versus momentum (E–k) diagrams for (a) direct and (b) indirect semiconductors. In a direct energy gap material, such as GaAs, the energy minima have the same momentum vector, so recombination of an electron in the conduction band and a hole in the valence band takes place directly. In an indirect material, such as silicon, momentum conservation requires the participation of a third particle, a phonon. This third-order process is far less efficient than direct recombination and, thus far, too inefficient to support laser action.

Optical gain in direct energy gap materials is obtained when population inversion is reached.
LASERS / Edge Emitters 415
J_o = q n_o L_z / τ_spon   [3]

where q is the electronic charge, n_o the transparency carrier density, L_z the active layer thickness, and τ_spon the spontaneous emission lifetime. Charge neutrality requires that δn = δp. Transparency is the point where these two conditions are just satisfied.

Figure 4 Schematic cross-section of a typical five-layer separate confinement heterostructure (SCH) laser.
E = hc/λ   [4]

where h is Planck's constant and c is the speed of light.
The bandgap energy of the SCH structure is shown
Figure 6 Schematic diagram of a real index guided ridge
in Figure 5. The lowest energy active layer collects
waveguide diode laser.
injected electrons and holes and the carrier confine-
ment provided by the inner barrier layers allows the
high density of carriers necessary for population
inversion and gain. The layers in the structure also
have the refractive index profile shown in Figure 5
which forms a five-layer slab dielectric waveguide.
With appropriate values for thicknesses and indices of
refraction, this structure can be a very strong
fundamental mode waveguide with the field profile
shown. A key parameter of edge emitting diode lasers
is G, the optical confinement factor. This parameter is
the areal overlap of the optical field with the active
layer, shown as dashed lines in the figure and can be as Figure 7 Near field emission pattern, cross-sectional emission
little as a few percent, even in very high performance profiles (solid lines), and refractive index profiles (dashed lines)
lasers. from an index guided diode laser.
g_th = a_i + (1/2L) ln[1/(R1 R2)]   [7]

Γ g_th = a_i + (1/2L) ln[1/(R1 R2)]   [8]

g = b J_o ln(J/J_o)   [9]

J_th = (J_o/η_i) exp{[a_i + (1/2L) ln(1/(R1 R2))]/(Γ b J_o)}   [10]

Figure 9 Emission spectra from a Fabry–Perot edge emitting laser structure at, and above, laser threshold.
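Equations [7]–[10] can be exercised numerically. Every parameter value in the sketch below is an assumed, order-of-magnitude input for a generic quantum well laser, not a value quoted in the text:

```python
import math

# Assumed illustration inputs (not from the text):
L = 0.1          # cavity length, cm (1000 um)
R1 = R2 = 0.32   # as-cleaved facet reflectivities
alpha_i = 10.0   # internal loss a_i, 1/cm
Gamma = 0.02     # optical confinement factor
Jo = 100.0       # transparency current density, A/cm^2
b = 15.0         # gain parameter, chosen so b*Jo ~ 1.5e3 1/cm (hypothetical)
eta_i = 0.9      # internal efficiency

# Mirror loss and threshold condition (eqns [7]-[8]):
mirror_loss = (1.0 / (2.0 * L)) * math.log(1.0 / (R1 * R2))
g_th = (alpha_i + mirror_loss) / Gamma          # required material gain

# Threshold current density (eqn [10]):
J_th = (Jo / eta_i) * math.exp((alpha_i + mirror_loss) / (Gamma * b * Jo))

print(mirror_loss)  # ~11.4 1/cm
print(g_th)         # ~1070 1/cm
print(J_th)         # ~230 A/cm^2, a plausible quantum well threshold
```

Note how the exponential in eqn [10] makes the threshold very sensitive to the confinement factor and the losses.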
η_w = P_o / I   [16]
The rate equation for photons is given by

dφ/dt = (c/n) b (n − n_o) φ + u n/τ_sp − φ/τ_p   [19]

where u is the fraction of the spontaneous emission that couples into the mode (a small number), and τ_p is the photon lifetime (~1 psec) in the cavity. These rate equations can be solved for specific diode laser materials and structures to yield a modulation frequency response. A typical small signal frequency response, at two output power levels, is shown in Figure 11. The response is typically flat to frequencies above 1 GHz, rises to a peak value at some characteristic resonant frequency, and quickly rolls off. The characteristic small signal resonant frequency, ω_r, is given by

ω_r² = (c/n) b φ / τ_p   [20]

where φ is the average photon density. Direct modulation of edge emitting laser diodes is limited by practicality to less than 10 GHz. This is in part because of the limits imposed by the resonant frequency and in part by chirp. Chirp is the frequency modulation that arises indirectly from direct current modulation. Modulation of the current modulates the carrier densities which, in turn, results in modulation of the quasi-Fermi levels. The separation of the quasi-Fermi levels is the emission energy (wavelength). Most higher-speed lasers make use of external modulation schemes.

The discussion thus far has addressed aspects of semiconductor diode edge emitting lasers that are common to all types of diode lasers irrespective of the choice of materials or the details of the active layer structure. The materials of choice for efficient laser devices include a wide variety of III–V binary compounds and ternary or quaternary alloys. The particular choice is based on such issues as emission wavelength, a suitable heterostructure lattice match, and availability of high-quality substrates. For example, common low-power lasers for CD players (λ ≈ 820 nm) are likely to be made on a GaAs substrate using a GaAs–AlxGa1−xAs heterostructure. Lasers for fiberoptic telecommunications systems (λ ≈ 1.3–1.5 µm) are likely to be made on an InP substrate using an InP–InxGa1−xAsyP1−y heterostructure. Red laser diodes (λ ≈ 650 nm) are likely to be made on a GaAs substrate using a GaAs–InxGa1−xAsyP1−y heterostructure. Blue and ultraviolet lasers, a relatively new technology compared to the others, make use of AlxGa1−xN–GaN–InyGa1−yN heterostructures formed on sapphire or silicon carbide substrates.

An important part of many diode laser structures is a strained layer. If a layer is thin enough, the strain arising from a modest amount of lattice mismatch can be accommodated elastically. In thicker layers, lattice mismatch results in an unacceptable number of dislocations that affect quantum efficiency, optical absorption losses, and, ultimately, long-term failure rates. Strained layer GaAs–InxGa1−xAs lasers are a critical component in rare-earth doped fiber amplifiers.

The structure of the active layer in edge emitting laser diodes was originally a simple double heterostructure configuration with a single active layer of 500–1000 Å in thickness. In the 1970s, however, advances in the art of growing semiconductor heterostructure materials led to growth of high-quality quantum well active layers. These structures, having thicknesses that are comparable to the electron wavelength in a semiconductor (~200 Å), revolutionized semiconductor lasers, resulting in much lower threshold current densities, higher efficiencies, and a broader range of available emission wavelengths. The energy band diagram for a quantum well laser active region is shown in Figure 12.
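Equation [20] can be evaluated directly. In the sketch below only the ~1 psec photon lifetime comes from the text; the effective index, differential gain coefficient, and average photon density are assumed illustration values:

```python
import math

c = 3.0e10       # speed of light, cm/s
n = 3.5          # effective index (assumed)
b = 5.0e-16      # differential gain coefficient, cm^2 (assumed)
tau_p = 1.0e-12  # photon lifetime, ~1 psec (from the text)
phi = 3.0e14     # average photon density, 1/cm^3 (assumed)

# Small-signal resonance (eqn [20]): wr^2 = (c/n) * b * phi / tau_p
omega_r = math.sqrt((c / n) * b * phi / tau_p)
f_r_GHz = omega_r / (2.0 * math.pi) / 1.0e9
print(f_r_GHz)  # a few GHz, consistent with direct modulation below ~10 GHz
```

Because ω_r grows only as the square root of the photon density, the resonance moves to higher frequency at higher output power, which is the trend Figure 11 illustrates.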
Figure 11 Direct current modulation frequency response for an edge emitting diode laser at two output power levels.

This structure is a physical realization of the particle-in-a-box problem in elementary quantum mechanics. The quantum well yields quantum states in the conduction band at discrete energy levels, with odd and even electron wavefunctions, and breaks the degeneracy in the valence band, resulting in separate energy states for light holes and heavy holes. The primary transition for recombination is from the n = 1 electron state to the h = 1 heavy hole state and takes place at a higher energy than the bulk bandgap energy. In addition, the density of states for quantum wells becomes a step-like function and optical gain is enhanced. The advantages in practical edge
emitting lasers that are the result of one-dimensional quantization are such that virtually all present commercial laser diodes are quantum well lasers. These kinds of advantages are driving development of laser diodes with additional degrees of quantization, including two dimensions (quantum wires) and three dimensions (quantum dots).

The Fabry–Perot cavity resonator described here is remarkably efficient and relatively easy to fabricate, which makes it the cavity of choice for many applications. There are many applications, however, where the requirements for linewidth of the laser emission or the temperature sensitivity of the emission wavelength make the Fabry–Perot resonator less desirable. An important laser technology that addresses both of these concerns involves the use of a wavelength selective grating as an integral part of the laser cavity. A notable example of a grating coupled laser is the distributed feedback (DFB) laser shown schematically in Figure 13a. This is similar to the SCH cross-section of Figure 4 with the addition of a Bragg grating above the active layer. The period Λ of the grating is chosen to fulfill the Bragg condition for mth-order coupling between forward- and backward-propagating waves, which is

Λ = mλ/2n   [21]

where λ/n is the wavelength in the medium. The index step between the materials on either side of the grating is small and the amount of reflection is also small. Since the grating extends throughout the length of the cavity, however, the overall effective reflectivity can be large. Another variant is the distributed Bragg reflector (DBR) laser resonator, shown in Figure 13b. This DBR laser has a higher index contrast, deeply etched surface grating located at one or both ends of the otherwise conventional SCH laser heterostructure. One of the advantages of this structure is the opportunity to add a contact for tuning the Bragg grating by current injection.

Both the DBR and DFB laser structures are particularly well suited for telecommunications or other applications where a very narrow single laser emission line is required. In addition, the emission wavelength temperature dependence for this type of laser is much smaller, typically 0.5 Å/°C.

See also

Semiconductor Physics: Quantum Wells and GaAs-based Structures.

Further Reading

Agrawal GP (ed.) (1995) Semiconductor Lasers Past, Present, and Future. Woodbury, NY: American Institute of Physics.
Agrawal GP and Dutta NK (1993) Semiconductor Lasers. New York: Van Nostrand Reinhold.
Bhattacharya P (1994) Semiconductor Optoelectronic Devices. Upper Saddle River, NJ: Prentice-Hall.
Botez D and Scifres DR (eds) (1994) Diode Laser Arrays. Cambridge, UK: Cambridge University Press.
Chuang SL (1995) Physics of Optoelectronic Devices. New York: John Wiley & Sons.
LASERS / Excimer Lasers 421
Coleman JJ (1992) Selected Papers on Semiconductor Diode Lasers. Bellingham, WA: SPIE Optical Engineering Press.
Einspruch NG and Frensley WR (eds) (1994) Heterostructures and Quantum Devices. San Diego, CA: Academic Press.
Iga K (1994) Fundamentals of Laser Optics. New York: Plenum Press.
Kapon E (1999) Semiconductor Lasers I and II. San Diego, CA: Academic Press.
Nakamura S and Fasol G (1997) The Blue Laser Diode. Berlin: Springer.
Verdeyen JT (1989) Laser Electronics, 3rd edn. Englewood Cliffs, NJ: Prentice-Hall.
Zory PS Jr (ed.) (1993) Quantum Well Lasers. San Diego, CA: Academic Press.
Excimer Lasers
J J Ewing, Ewing Technology Associates, Inc., Bellevue, WA, USA
© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

We present a summary of the fundamental operating principles of ultraviolet excimer lasers. After a brief discussion of the economics and application motivation, the underlying physics and technology of these devices is described. Key issues and limitations in the scaling of these lasers are presented. We also discuss the technological problems and solutions that have evolved to make these lasers so useful.

UV power and pulse energy, however, are considerably higher. The fundamental properties of these lasers enable the photochemical and photophysical processes used in manufacturing and medicine. Short pulses of UV light can photo-ablate materials, being of use in medicine and micromachining. Short wavelengths enable photochemical processes. The ability to provide a narrow linewidth using appropriate resonators enables precise optical processes, such as lithography. We trace out some of the core underlying properties of these lasers that lead to this utility.

The data in Figure 1 provide a current answer to the question of 'why excimer lasers?' At the dawn of the excimer era, the answer to this question was quite different. In the early 1970s, there were no powerful short wavelength lasers, although IR lasers were being scaled up to significant power levels. However, visible or UV lasers could in principle provide better propagation and focus to a very small spot size at large distances. Solid state lasers of the time offered very limited pulse repetition frequency and low efficiency. Diode pumping of solid state lasers, and diode arrays to excite such lasers, was a far removed development effort. As such, short-wavelength lasers were researched over a wide range of potential media and lasing wavelengths. The excimer concept was one of those candidates.

We provide, in Figure 2, a 'positioning' map showing the range of laser parameters that can be obtained with commercial or developmental excimer lasers. The map has coordinates of pulse energy and pulse rate, with diagonal lines expressing average power. We show on the map where both development goals and current markets lie. Current applications are shown by the boxes in the lower right-hand corner. Excimer lasers have been built with clear apertures in the range of a fraction of 0.1 cm² to 10⁴ cm², with single pulse energies covering a range of 10⁷. Lasers with large apertures require electron beam excitation. Discharge lasers correspond to the more modest pulse energy used for current applications. In general, for energies lower than ~5 J, the excitation method of choice is a self-sustained discharge, clearly providing growth potential for future uses, if required.

The applications envisaged in the early R&D days were in much higher energy uses, such as laser weapons, laser fusion, and laser isotope separation. The perceived need for lasers for isotope separation, laser-induced chemistry, and blue-green laser-based communications from satellites drove research in high pulse rate technology and reliability extension for discharge lasers. This research resulted in prototypes, with typical performance ranges as noted on the positioning map, that well exceed current market requirements. However, the technology and underlying media physics developed in these early years made possible the advances needed to serve the ultimate real market applications.
The very important application of UV lithography requires high pulse rates and high reliability. These two features differentiate the excimer laser used for lithography from the one used in the research laboratory. High pulse rates, over 2000 Hz in current practice, and the cost of stopping a computer chip production line for servicing, drive the reliability requirements toward 10¹⁰ shots per service interval. In contrast, typical laboratory experiments are more often in the range of 100 Hz and, unlike a lithography production line, do not run 24 hours a day, 7 days a week. The other large market currently is in laser correction of myopia and other imperfections in the human eye. For this use the lasers are more akin to the commercial R&D style laser in terms of pulse rate and single pulse energy. Not that many shots are needed to ablate tissue to make the needed correction, and shot life is a straightforward requirement. However, the integration of these lasers into a certified and accepted medical procedure and an overall medical instrument drove the growth of this market.

Figure 2 The typical energy and pulse rate for certain applications and technologies using excimer lasers. For energy under a few J, a discharge laser is used. The earliest excimer discharge lasers were derivatives of CO2 TEA (transversely excited atmospheric pressure) lasers and produced pulse energy in the range of 100 mJ per pulse. Pulse rates of order 30 Hz were all that the early pulsed power technology could provide. Early research focused on generating high energy. Current markets are at modest UV pulse energy and high pulse rates for lithography and low pulse rates for medical applications.

The excimer concept, and the core technology used to excite these lasers, is applicable over a broad range of UV wavelengths, with utility at specific wavelength ranges from 351 nm in the near UV to 157 nm in the vacuum UV (VUV). There were many potential excimer candidates in the early R&D phase (Table 1). Some of these
Table 1

Ar2 (126 nm): Very short emission wavelength, deep in the vacuum UV; inefficient as a laser due to absorption, see Xe2.
Kr2 (146 nm): Broadband emitter and early excimer laser with low laser efficiency. Very efficient converter of electric power into vacuum UV light.
F2 (157 nm): The molecule itself is not an excimer, but uses lower level dissociation; candidate for next generation lithography.
Xe2 (172 nm): The first demonstrated excimer laser. Very efficient formation and emission, but excited state absorption limits laser efficiency. Highly efficient as a lamp.
ArF (193 nm): Workhorse for corneal surgery and lithography.
KrCl (222 nm): Too weak compared to KrF and not as short a wavelength as ArF.
KrF (248 nm): Best intrinsic laser efficiency; numerous early applications but found its major market in lithography. Significant use in other materials processing.
XeI (254 nm): Never a laser, but high formation efficiency and excellent for a lamp. Historically the first rare gas halide excimer whose emission was studied at high pressure.
XeBr (282 nm): First rare gas halide to be shown as a laser; inefficient laser due to excited state absorption, but excellent fluorescent emitter; a choice for lamps.
Br2 (292 nm): Another halogen molecule with excimer-like transitions that has lased but had no practical use.
XeCl (308 nm): Optimum medium for laser discharge excitation.
Hg2 (335 nm): Hg vapor is very efficiently excited in discharges and forms excimers at high pressure. Subject of much early research but, like rare gas excimers such as Xe2, suffers from excited state absorption. Despite the high formation efficiency, this excimer was never made to lase.
I2 (342 nm): Iodine UV molecular emission and lasing was the first of the pure halogen excimer-like lasers demonstrated. This species served as the kinetic prototype for the F2 'honorary' excimer that has important use as a practical VUV source.
XeF (351, 353 nm): This excimer laser transition terminates on a weakly bound lower level that dissociates rapidly, especially when heated. The focus for early defense-related laser development, as this wavelength propagates best of this class in the atmosphere.
XeF (480 nm): This very broadband emission of XeF terminates on a different lower level than the UV band. Not as easily excited in a practical system, so it has not had significant usage.
HgBr (502 nm): A very efficient green laser that has many of the same kinetic features as the rare gas halide excimer lasers, but the need for moderately high temperature to obtain the needed vapor pressure made this laser less attractive in practice.
XeO (540 nm): An excimer-like molecular transition on the side of the auroral lines of the O atom. The optical cross-section is quite low and as a result the laser never matured in practice.
candidates can be highly efficient, >50% relative to deposited energy, at converting electrical power into UV or visible emission, and have found parallel utility as sources for lamps, though with differing power deposition rates and configurations. The key point in the table is that these lasers are powerful and useful sources at very short wavelengths. For lithography, the wavelength of interest has consistently shifted to shorter and shorter wavelengths as the printed feature size decreased. Though numerous candidates were pursued, as shown in Table 1, the most useful lasers are those using rare gas halide molecules. These lasers are augmented by dye laser and Raman shifting to provide a wealth of useful wavelengths in the visible and UV.

All excimer lasers utilize molecular dissociation to remove the lower laser level population. This lower level dissociation is typically from an unbound or very weakly bound ground-state molecule. However, halogen molecules also provide laser action on excimer-like transitions that terminate on a molecular excited state that dissociates quickly. The vacuum UV transition in F2 is the most practical method for making very short wavelength laser light at significant power. Historically, the rare gas dimer molecules, such as Xe2, were the first to show excimer laser action, although self absorption, low gain, and poor optics in the VUV limited their utility. Other excimer-like species, such as metal halides, metal rare gas continua on the edge of metal atom resonance lines, and rare gas oxides, were also studied. The key differentiators of rare gas halides from other excimer-like candidates are strong binding in the excited state, lack of self absorption, and relatively high optical cross-section for the laser transition.
Excimer Laser Fundamentals

Excimer lasers utilize lower-level dissociation to create and sustain a population inversion. For example, in the first excimer laser demonstrated, Xe2, the lasing species is comprised of two atoms that do not form a stable molecule in the ground state. In the lowest energy state, with neither of the atoms electronically excited, the interaction between the atoms in a collision is primarily one of repulsion, save for a very weak van der Waals attraction at larger internuclear separations. This is shown in the potential energy diagram of Figure 3. If one of the atoms is excited, for example by an electric discharge, there can be chemical binding in the electronically excited state of the molecule, relative to the energy of one atom with an electron excited and one atom in the ground state. We use the shorthand notation * to denote an electronically excited atomic or molecular species, e.g., Xe*. The excited dimer, excimer for short, will recombine rapidly at high pressure from atoms into the electronically excited molecule Xe2*. Radiative lifetimes of the order of 10 nsec to 1 µsec are typical before photon emission returns the molecule to the lower level. Collisional quenching often reduces the lifetime below the radiative rate. For Xe2* excimers, the emission is in the vacuum ultraviolet (VUV).

There are a number of variations on this theme, as noted in Table 1. Species other than rare gas pairs can exhibit broadband emission. The excited state binding energy can vary significantly, changing the fraction of excited states that will form molecules. The shape of the lower potential energy curve can vary as well. From a semantic point of view, heteronuclear diatomic molecules, such as XeO, are not excimers, as they are not made of two of the same atoms. However, the more formal name 'exciplex' did not stick in the laser world, 'excimer laser' being preferred. Large binding energy is more efficient for capturing excited atoms into excited excimer-type molecules in the limited time they have before emission. Early measurements of the efficiency of converting deposited electrical energy into fluorescent radiation showed that up to 50% of the energy input could result in VUV light emission. Lasers were not this efficient due to absorption.

The diagram shown in Figure 3 is simplified because it does not show all of the excited states that exist for the excited molecule. There can be closely lying excited states that share the population and radiate at a lower rate, or perhaps absorb to higher levels, also not shown in Figure 3. Such excited state absorption may terminate on states derived from higher atomic excited states, noted as Xe** in the figure, or yield photo-ionization, producing the diatomic molecular ion plus an electron:

Xe2* + hν → Xe2+ + e   [1]

Such excited state absorption limited the efficiency of the VUV rare gas excimers, even though they have very high formation and fluorescence efficiency.

Rare gas halide excimer lasers evolved from 'chemi-luminescence' chemical kinetics studies which were looking at the reactive quenching of rare gas metastable excited states in flowing afterglow experiments. Broadband emission in reactions with halogen molecules was observed in these experiments, via reactions such as:
KrF* formation kinetics via the ion channel:

Ionization: e + Kr → Kr+ + 2e   [5]

Attachment: e + F2 → F− + F   [6]

Ion–ion recombination: Kr+ + F− + M → KrF*(v,J)high + M   [7]

Relaxation: KrF*(v,J)high + M → KrF*(v,J)low + M   [8]

Finally, there is one other formation mechanism, displacement, that is unique to the rare gas halides. In this process, a heavier rare gas will displace a lighter rare gas in an excited state to form a lower-energy excited state:

ArF* + Kr → KrF*(v,J)high + Ar   [9]

In general, the most important excimer lasers tend to have all of these channels contributing to excited state formation for optimized mixtures.

These kinetic formation sequences express some important points regarding rare gas halide lasers. The source of the halogen atoms for the upper laser level disappears via both attachment reactions with electrons and by reaction with excited states. The halogen atoms eventually recombine to make molecules, but at a rate too slow for sufficient continuous wave (cw) laser gain. For excimer-based lamps, however, laser gain is not a criterion and very efficient excimer lamps can be made. Another point to note is that forming the inversion requires transient species (and in some cases halogen fuels) that absorb the laser light. Net gain and extraction efficiency are a trade-off between pumping rate, an increase in which often forms more absorbers, and absorption itself. Table 2 provides a listing of some of the key absorbers. Finally, the rare gas halide excited states can be rapidly quenched in a variety of processes. More often than not, the halogen molecule and the electrons in the discharge react with the excited states of the rare gas halide, removing the contributors to gain. We show, in Table 3, a sampling of some of the types of formation and quenching reactions with their kinetic rate constant parameters. Note that one category of reaction, three-body quenching, is unique to excimer lasers. Note also that the three-body recombination of ions has a rate coefficient that is effectively pressure dependent. At pressures above ~3 atm, the diffusion of ions toward each other is slowed and the effective recombination rate constant decreases.

Though the rare gas halide excited states can be made with near-unity quantum yield from the primary excitation, the intrinsic laser efficiency is not this high. The quantum efficiency is only ~25% (recall that rare gas halides all emit at much longer wavelengths than rare gas excimers); there is inefficient extraction due to losses in the gas and windows. In practice there are also losses in coupling the power into the laser, and wasted excitation during the finite start-up time. The effect of losses is shown in Table 4 and Figure 5, where the pathways are charted and losses identified. In KrF, it takes ~20 eV to make a rare gas ion in an electron beam excited mixture,
Table 3

Formation

Rare gas plus halogen molecule:
  Kr* + F2 → KrF*(v,J) + F; k ≈ 7 × 10⁻¹⁰ cm³/sec; branching ratio to KrF* ≈ 1
  Ar* + F2 → ArF*(v,J) + F; k ≈ 6 × 10⁻¹⁰ cm³/sec; branching ratio to ArF* ≈ 60%
  Xe* + F2 → XeF*(v,J) + F; k ≈ 7 × 10⁻¹⁰ cm³/sec; branching ratio to XeF* ≈ 1
  Xe* + HCl(v = 0) → Xe + H + Cl; k ≈ 6 × 10⁻¹⁰ cm³/sec; branching ratio to XeCl* ≈ 0
  Xe* + HCl(v = 1) → XeCl* + H; k ≈ 2 × 10⁻¹⁰ cm³/sec; branching ratio to XeCl* ≈ 1

Ion–ion recombination:
  Ar+ + F− + Ar → ArF*(v,J); k ≈ 1 × 10⁻²⁵ cm⁶/sec at 1 atm and below; k ≈ 7.5 × 10⁻²⁶ cm⁶/sec at 2 atm. Note that the effective 3-body rate constant decreases further as pressure goes over ~2 atm.
  Kr+ + F− + Kr → KrF*(v,J); k ≈ 7 × 10⁻²⁶ cm⁶/sec at 1 atm and below; rolls over at higher pressure.

Quenching

Halogen molecule:
  XeCl* + HCl → Xe + Cl + HCl; k ≈ 5.6 × 10⁻¹⁰ cm³/sec
  KrF* + F2 → Kr + F + F; k ≈ 6 × 10⁻¹⁰ cm³/sec
  All halogen molecule quenching reactions have very high rate constants.

3-body inert gas:
  ArF* + Ar + Ar → Ar2F* + Ar; k ≈ 4 × 10⁻³¹ cm⁶/sec
  KrF* + Kr + Ar → Kr2F* + Ar; k ≈ 6 × 10⁻³¹ cm⁶/sec
  XeCl* + Xe + Ne → Xe2Cl* + Ne; k ≈ 1 × 10⁻³³ cm⁶/sec (4 × 10⁻³¹ cm⁶/sec with Xe as the third body)
  These reactions yield a triatomic excimer at lower energy that cannot be recycled back to the desired laser states.

Electrons:
  XeCl* + e → Xe + Cl + e; k ≈ 2 × 10⁻⁷ cm³/sec
  In 2-body quenching by electrons, the charge of the electron interacts with the dipole of the rare gas halide ion pair at long range; the above value is typical of others.

2-body inert gas:
  KrF* + Ar → Ar + Kr + F; k ≈ 2 × 10⁻¹² cm³/sec
  XeCl* + Ne → Ne + Xe + Cl; k ≈ 3 × 10⁻¹³ cm³/sec
  2-body quenching by rare gases is typically slow and less important than the reactions noted above.
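As a rough illustration of how the Table 3 rate constants combine, the sketch below estimates an effective KrF* lifetime in an assumed discharge mixture. The mix fractions, radiative lifetime, and electron density are illustrative assumptions, not values from the text; the electron-quenching constant is the XeCl* value, which the table notes is typical of others:

```python
# Effective KrF* decay rate: radiative decay plus Table 3 quenching channels.
ATM = 2.45e19        # molecules/cm^3 per atm near room temperature

n_F2 = 0.005 * ATM   # 0.5% F2 halogen fuel (assumed mix)
n_Kr = 0.05 * ATM    # 5% Kr (assumed mix)
n_Ar = 2.0 * ATM     # ~2 atm Ar buffer (assumed)
n_e = 1.0e14         # discharge electron density, 1/cm^3

rate = 0.0
rate += 1.0 / 7.0e-9            # radiative decay, ~7 ns lifetime (assumed)
rate += 6.0e-10 * n_F2          # KrF* + F2 halogen quenching
rate += 2.0e-12 * n_Ar          # KrF* + Ar two-body quenching
rate += 6.0e-31 * n_Kr * n_Ar   # KrF* + Kr + Ar three-body quenching
rate += 2.0e-7 * n_e            # electron quenching (XeCl* value as proxy)

tau_eff_ns = 1.0e9 / rate
print(tau_eff_ns)  # ~2.7 ns: quenching dominates the assumed radiative lifetime
```

This is the sense in which collisional quenching, rather than radiation, removes most of the excited states in a working mixture.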
somewhat less in an electric discharge. The photon release is ~5 eV; the quantum efficiency is only 25%. Early laser efficiency measurements (using electron beam excitation) showed laser efficiency in the best cases slightly in excess of 10%, relative to the deposited energy. The discrepancy relative to the quantum efficiency is due to losses in the medium. For the sample estimate in Table 4 and Figure 5 we show the approximate density of key species in the absorption chain, estimated with a steady-state approximation of the losses in a KrF laser for an excitation rate of order 1 MW/cc, typical of fast discharge lasers. The net small signal gain is of the order of 5%/cm. The key absorbers are the parent halogen molecule, F2, the fluoride ion F−, and the rare gas excited state atoms. A detailed pulsed kinetics code will provide slightly different results due to transient kinetic effects and the intimate coupling of laser extraction to quenching and absorber dynamics. The ratio of gain to loss is usually in the range of 7 to 20, depending on the actual mixture used and the rare gas halide wavelength. The corresponding extraction efficiency is then limited to the range of ~50%. The small signal gain, g_0, is related to the power deposition
Table 4 Typical ground and excited state densities shown for KrF at P_in ≈ 1 MW/cm³
Figure 5 The major decay paths and absorbers that impact the extraction of excited states of rare gas halide excimers. Formation is via the ion–ion recombination, excited atom reaction with halogens, and displacement reactions, not shown. We note for a specific mixture of KrF the approximate values of the absorption by the transient and other species, along with the decay rates for the major quenching processes. Extraction efficiency greater than 50% is quite rare.
rate per unit volume, P_deposited, by eqn [10]:

P_deposited = g_0 E_p (τ_upper η_pumping η_branching f σ)⁻¹   [10]

E_p is the energy per excitation, ~20 eV or 3.2 × 10⁻¹⁸ J per excitation; f is the fraction of excited molecules in the upper laser level, ~40%, owing to KrF* in higher vibrational states and the slowly radiating 'C' state. The laser cross-section is σ, and the η terms are efficiencies for getting from the primary excitation down to the upper laser level. The upper state lifetime, τ_upper, is the inverse of the sum of the radiative and quenching rates. When the laser cavity flux is sufficiently high, stimulated emission decreases the flow of power into the quenching processes.

The example shown here gives a ratio of net gain to the nonsaturating component of loss of 13/1. Of the excitations that become KrF*, only ~55% or so will be extracted in the best of cases. For other less ideal rare gas halide lasers, the gain to loss ratio, the extraction efficiency, and the intrinsic efficiency are all lower. Practical discharge lasers have lower practical wall plug efficiency due to a variety of factors, discussed below.

Discharge Technology

Practical discharge lasers use fast-pulse discharge excitation. In this scheme, an outgrowth of the CO2 TEA laser, the gas is subjected to a rapid high-voltage pulse from a low-inductance pulse forming line or
LASERS / Excimer Lasers 429
gas becomes conductive, the voltage drops and power leading to electrode sputtering, dust that finds its
is coupled into the medium for a short duration, way to windows, and the need to service the laser
typically less than 50 nsec. The discharge laser also after ,108 pulses. When one considers the efficiency
requires some form of ‘pre-ionization’, usually of the power conditioning system, along with the
provided by sparks or a corona-style discharge on finite kinetics of the laser start-up process in short
the side of the discharge electrodes. The overall pulses, and the mismatch of the laser impedance to
discharge process is intrinsically unstable, and pump the PFN/PFL impedance, lasers that may have 10%
pulse duration needs to be limited to avoid the intrinsic efficiency in the medium in e-beam excitation
discharge becoming an arc. may only have 2% to 3% wall plug efficiency in a
Moreover, problems with impedance matching of practical discharge laser.
the PFN or PFL with the conductive gas lead to both A key aspect of excimer technology is the need for
inefficiency and extra energy that can go into post- fast electrical circuits to drive the power into the
discharge arcs, which both cause component failure discharge. This puts a significant strain on the switch
and produce metal dust which can give rise to that is used to start the process. While spark gaps, or
window losses. A typical discharge laser has an multichannel spark gaps, can handle the required
aperture of a few cm2 and a gain length of order current and current rate of rise, they do not present a
80 cm and produces outputs in the range of 100 mJ to practical alternative for high-pulse rate operation.
1 J. The optical pulse duration tends to be ,20 nsec, Thyratrons and other switching systems, such as solid
though the electrical pulse is longer due to the finite state switches, are more desirable than a spark gap,
time needed for the gain to build up and the but do not offer the desired needed current rate of
stimulated emission process to turn spontaneously rise. To provide the pumping rate of order 1 MW/cm3
emitted photons into photons in the laser cavity. needed for enough gain to turn the laser on before the
The pre-ionizer provides an initial electron density discharge becomes unstable, the current needs to rise
on the order of 108 electrons/cm3. With spark pre- to a value of the order of 20 kA in a time of ,20 nsec;
ionization or corona pre-ionization, it is difficult to the current rate of rise is ,1012 A/sec. This rate of rise
make a uniform discharge over an aperture greater will limit the lifetime of the switch that drives the
than a few cm2. For larger apertures, one uses X-rays power into the laser gas. As a response to this need,
for pre-ionization. X-ray pre-ionized discharge exci- magnetic switching and magnetic compression cir-
mer lasers have been scaled to energy well over 10 J cuits were developed so that the lifetime of the switch
per pulse and can yield pulse durations over 100 nsec. can be radically increased. In the simplest cases, one
The complexity and expense of the X-ray approach, stage of magnetic compression is used to make the
along with the lack of market for lasers in this size current rate of rise to be within the range of a
range, resulted in the X-ray approach remaining in conventional thyratron switch as used in radar
the laboratory. systems. Improved thyratrons and a simple magnetic
Aside from the kinetic issues of the finite time assist also work. For truly long operating lifetimes,
needed for excitation to channel into the excimer multiple stages of magnetic switching and com-
upper levels and the time needed for the laser to ‘start pression are used, and with a transformer one can
up’ from fluorescent photons, these lasers have use a solid state diode as the start switch for the
reduced efficiency due to poor matching of the discharge circuit. A schematic of an excimer pulsed
pump source to the discharge. Ideally, one would power circuit, that uses a magnetic assist with a
like to have a discharge such that when it is thyratron and one stage of magnetic compression, is
conductive the voltage needed to sustain the dis- shown in Figure 6. The magnetic switching approach
charge is less than half of that needed to break the gas can also be used to provide two separate circuits
down. Regrettably for the rare gas halides (and in to gain high efficiency by having one circuit to
marked distinction to the CO2 TEA laser) the voltage break down the gas (the spiker) and a second,
that the discharge sustains in a quasi-stable mode is low-impedance ‘sustainer’ circuit, to provide the
,20% of the breakdown voltage, perhaps even less, main power flow. In this approach, the laser itself
depending on the specific gases and electrode shapes. can act as part of the circuit by holding off
Novel circuits, which separate the breakdown the sustaining voltage for a m-sec or so during
phase from the fully conductive phase, can readily the charge cycle and before the spiker breaks
double the efficiency of an excimer discharge laser, down the gas. A schematic of such a circuit is
though this approach is not necessary in many shown in Figure 7. Using this type of circuit, the
applications. The extra energy that is not dissipated efficiency of a XeCl laser can be over 4% relative to
in the useful laser discharge often results in arcing or the wall plug. A wide variety of technological twists
‘hot’ discharge areas after the main laser pulse, on this approach have been researched.
Applications: Pulse Compression via Nonlinear Optics; Raman Lasers. Scattering: Raman Scattering.

Further Reading

Ballanti S, Di Lazzaro P, Flora F, et al. (1998) Ianus the 3-electrode laser. Applied Physics B 66: 401–406.
Brau CA (1978) Rare gas halogen lasers. In: Rhodes CK (ed.) Excimer Lasers, Topics of Applied Physics, vol. 30. Berlin: Springer-Verlag.
Brau CA and Ewing JJ (1975) Emission spectra of XeBr, XeCl, XeF, and KrF. Journal of Chemical Physics 63: 4640–4647.
Ewing JJ (2000) Excimer laser technology development. IEEE Journal of Selected Topics in Quantum Electronics 6(6): 1061–1071.
Levatter J and Lin SC (1980) Necessary conditions for the homogeneous formation of pulsed avalanche discharges at high pressure. Journal of Applied Physics 51: 210–222.
Long WH, Plummer MJ and Stappaerts E (1983) Efficient discharge pumping of an XeCl laser using a high voltage prepulse. Applied Physics Letters 43: 735–737.
Rhodes CK (ed.) (1979) Excimer Lasers, Topics of Applied Physics, vol. 30. Berlin: Springer-Verlag.
Rokni M, Mangano J, Jacob J and Hsia J (1978) Rare gas fluoride lasers. IEEE Journal of Quantum Electronics QE-14: 464–481.
Smilanski I, Byron S and Burkes T (1982) Electrical excitation of an XeCl laser using magnetic pulse compression. Applied Physics Letters 40: 547–548.
Taylor RS and Leopold KE (1994) Magnetic-spiker excitation of gas discharge lasers. Applied Physics B, Lasers and Optics 59: 479–509.
The extension of the low-gain FEL theory to the general 'electron-tube' theory is important because it led to the development of new radiation schemes and new operating regimes of the optical FEL. This was exploited by physicists in the discipline of accelerator physics and synchrotron radiation, who identified, starting with the theoretical works of C. Pellegrini and R. Bonifacio in the early 1980s, that high-current, high-quality electron beams, attainable with further development of accelerator technology, could make it possible to operate FELs in the high-gain regime, even at short wavelengths (vacuum ultraviolet – VUV and soft X-ray), and that the high-gain FEL theory can be extended to include amplification of the incoherent synchrotron spontaneous emission (shot noise) emitted by the electrons in the undulator. These led to the important development of the 'self (synchrotron) amplified spontaneous emission (SASE) FEL', which promised to be an extremely high brightness radiation source, overcoming the fundamental obstacles of X-ray laser development: lack of mirrors (for oscillators) and lack of high brightness radiation sources (for amplifiers).

A big boost to the development of FEL technology was given during the period of the American 'strategic defense initiative – SDI' (Star Wars) program in the mid-1980s. The FEL was considered one of the main candidates for use in a ground-based or space-based 'directed energy weapon – DEW' that could deliver megawatts of optical power to hit attacking missiles. The program led to heavy involvement of major American defense establishment laboratories (Lawrence Livermore National Lab, Los Alamos National Lab) and contracting companies (TRW, Boeing). Some of the outstanding results of this effort were the demonstration of high-gain operation of an FEL amplifier in the mm-wavelength regime, utilizing an induction linac (Livermore, 1985), and the demonstration of enhanced radiative energy extraction efficiency in an FEL oscillator, using a 'tapered wiggler' in an RF-linac driven FEL oscillator (Los Alamos, 1983). The program was not successful in demonstrating the potential of FELs to operate at the high average power levels needed for DEW applications, but after the cold-war period a small part of the program continues to support research and development of medical FEL applications.

Principles of FEL Operation

Figure 1 displays schematically an FEL oscillator. It is composed of three main parts: an electron accelerator, a magnetic wiggler (or undulator), and an optical resonator.

Without the mirrors, the system is simply a synchrotron undulator radiation source. The electrons in the injected beam oscillate transversely to their propagation direction z, because of the transverse magnetic Lorentz force:

F_⊥ = −e v_z ê_z × B_⊥   [1]

In a planar (linear) wiggler, the magnetic field on axis is approximately sinusoidal:

B_⊥ = B_w ê_y cos k_w z   [2]

In a helical wiggler:

B_⊥ = B_w (ê_y cos k_w z + ê_x sin k_w z)   [3]
Figure 1 Components of a FEL-oscillator. (Reproduced from Benson SV (2003) Free electron lasers push into new frontiers. Optics
and Photonics News 14: 20–25. Illustration by Jaynie Martz.)
LASERS / Free Electron Lasers 433
In either case, if we assume constant (for the planar wiggler – only on the average) axial velocity, then z = v_z t. The frequency of the transverse force and of the mechanical oscillation of the electrons, as viewed transversely in the laboratory frame of reference, is:

ω₀ₛ = k_w v_z = 2π v_z/λ_w   [4]

where λ_w = 2π/k_w is the wiggler period.

The oscillating charge emits an electromagnetic radiation wavepacket. In a reference frame moving with the electrons, the angular radiation pattern looks exactly like dipole radiation, monochromatic in all directions (except for the frequency line broadening due to the finite oscillation time, i.e., the wiggler transit time). In the laboratory reference frame the radiation pattern concentrates in the propagation (+z) direction, and the Doppler up-shifted radiation frequency depends on the observation angle Θ relative to the z-axis:

ω₀ = ω₀ₛ/(1 − β_z cos Θ)   [5]

On axis (Θ = 0), the radiation frequency is:

ω₀ = c k_w β_z/(1 − β_z) = (1 + β_z) β_z γ_z² c k_w ≈ 2γ_z² c k_w   [6]

where β_z ≡ v_z/c and γ_z ≡ (1 − β_z²)^(−1/2) are the normalized axial (average) velocity and the axial Lorentz factor, respectively, and the last part of the equation is valid only in the (common) highly relativistic limit γ_z ≫ 1.

Using the relations β_z² + β_⊥² = β² and β_⊥ = a_w/γ, one can express γ_z:

γ_z = γ/(1 + a_w²/2)^(1/2)   [7]

(this is for a linear wiggler; in the case of a helical wiggler the denominator is (1 + a_w²)^(1/2)), where

γ ≡ (1 − β²)^(−1/2) = 1 + E_k/(mc²) = 1 + E_k[MeV]/0.511   [8]

and a_w (also termed K), 'the wiggler parameter', is the normalized transverse momentum:

a_w = eB_w/(k_w mc) = 0.093 B_w[kGauss] λ_w[cm]   [9]

Typical values of B_w in FEL wigglers (undulators) are of the order of kGauss, and λ_w is of the order of cm, and consequently a_w ~ 1. Considering that electron beam accelerator energies are in the range of MeV to GeV, one can appreciate from eqns [6]–[8] that a significant relativistic Doppler shift factor 2γ_z², in the range of tens to millions, is possible. It, therefore, provides incoherent synchrotron undulator radiation in the frequency range of microwave to hard X-rays.

Synchrotron undulator radiation was studied in 1951 and since then has been a common source of VUV radiation in synchrotron facilities. From the point of view of laser physics theory, this radiation can be viewed as 'spontaneous synchrotron radiation emission', in analogy to spontaneous radiation emission by electrons excited to higher bound-electron quantum levels in atoms or molecules. Alternatively, it can be regarded as the classical shot noise radiation, associated with the current fluctuations of the randomly injected discrete charges comprising the electron beam. Evidently this radiation is incoherent, and the fields it produces average in time to zero, because the wavepackets emitted by the randomly injected electrons interfere at the observation point with random phase. However, their energies sum up and can produce substantial power.

Based on fundamental quantum-electrodynamical principles or Einstein's relations, one would expect that any spontaneous emission scheme can be stimulated. This principle lies behind the concept of the FEL, which is nothing but stimulated undulator synchrotron radiation. By stimulating the electron beam to emit radiation, it is possible, as with any laser, to generate a coherent radiation wave and extract more power from the gain medium, which in this case is an electron beam that carries an immense amount of power. There are two kinds of laser schemes which utilize stimulation of synchrotron undulator radiation:

(i) A laser amplifier. In this case the mirrors in the schematic configuration of Figure 1 are not present, and an external radiation wave at frequencies within the emission range of the undulator is injected at the wiggler entrance. This requires, of course, an appropriate radiation source to be amplified and the availability of sufficiently high gain in the FEL amplifier.

(ii) A laser oscillator. In this case an open cavity (as shown in Figure 1) or another (waveguide) cavity is included in the FEL configuration. As in any laser, the FEL oscillator starts building up its radiation from the spontaneous (synchrotron undulator) radiation, which gets trapped in the resonator and amplified by stimulated emission along the wiggler. If the threshold condition is satisfied (having single-path gain higher than the round-trip losses), the oscillator arrives at saturation and steady state coherent operation after a short transient period of oscillation build-up.
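Eqns [6]–[9] can be combined into a short numerical sketch of the on-axis radiation wavelength. The beam energy, wiggler field, and wiggler period used below are hypothetical but typical values, not taken from the text.

```python
import math

# Sketch of eqns [6]-[9]: on-axis FEL wavelength for a planar wiggler.
# All three input values are assumed, illustrative numbers.

E_k_MeV = 100.0        # assumed beam kinetic energy, MeV
B_w_kG = 3.0           # assumed wiggler field, kGauss
lam_w_cm = 3.0         # assumed wiggler period, cm

gamma = 1 + E_k_MeV / 0.511                  # eqn [8]
a_w = 0.093 * B_w_kG * lam_w_cm              # eqn [9]
gamma_z = gamma / math.sqrt(1 + a_w**2 / 2)  # eqn [7], planar wiggler

# omega_0 ~ 2 gamma_z^2 c k_w  ->  lambda_0 = lambda_w / (2 gamma_z^2), eqn [6]
lam_0 = (lam_w_cm * 1e-2) / (2 * gamma_z**2)  # radiation wavelength, m
print(f"a_w = {a_w:.2f}, gamma = {gamma:.1f}, lambda ~ {lam_0 * 1e9:.0f} nm")
```

With these numbers, a 3 cm period wiggler Doppler-compresses to visible-light wavelengths, illustrating the 'tens to millions' shift factor quoted above.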
ℰ_ki − ℰ_kf = ℏω   [10]

k_i − k_f = q   [11]

q = (ω/c) ê_q   [13]

and eqns [10]–[13] have only one solution, ω = 0, q = 0. This observation is illustrated graphically in the energy–momentum diagram of Figure 2a in the framework of a one-dimensional model. It appears that if both eqns [10] and [11] can be satisfied, then the phase velocity of the emitted radiation wave, v_ph = ω/q (the slope of the chord), will equal the electron wavepacket group velocity v_g = v_z at some intermediate point k* = p*/ℏ:

v_ph = ω/q = (ℰ_ki − ℰ_kf)/[ℏ(k_i − k_f)] = ∂ℰ/∂p|_{p*} = v_g   [14]
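The role of the wiggler in restoring this synchronism can be illustrated numerically: a free-space photon has v_ph = c, which a sub-luminal electron cannot match, but the beat wave formed with the wiggler wavenumber k_w travels at ω/(k_z + k_w) < c and matches the electron velocity exactly when ω satisfies the Doppler relation of eqn [6]. A minimal sketch, with an assumed axial Lorentz factor and wiggler period:

```python
import math

# Sketch: the slow beat wave omega/(k_z + k_w) is synchronous with the
# electrons when omega obeys the Doppler relation of eqn [6].
# gamma_z and lam_w are assumed, illustrative values.

c = 2.998e8
lam_w = 0.03                 # assumed wiggler period, m
gamma_z = 170.0              # assumed axial Lorentz factor
beta_z = math.sqrt(1 - 1 / gamma_z**2)
k_w = 2 * math.pi / lam_w

omega = c * k_w * beta_z / (1 - beta_z)   # eqn [6]
k_z = omega / c                            # free-space radiation wavenumber

v_ph = omega / (k_z + k_w)                 # slow-wave phase velocity
v_z = beta_z * c
print(f"v_ph/c = {v_ph / c:.9f}, v_z/c = {beta_z:.9f}")
```

The two printed velocities agree to machine precision, while the bare photon phase velocity ω/k_z = c does not.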
Figure 4 The figure illustrates that the origin of the difference between the emission and absorption frequencies is the curvature of the energy dispersion line, and that the origin of the homogeneous line broadening is the momentum conservation uncertainty ±π/L in a finite interaction length. (Reproduced with permission from Friedman A, Gover A, Ruschin S, Kurizki G and Yariv A (1988) Spontaneous and stimulated emission from quasi-free electrons. Reviews of Modern Physics 60: 471–535. Copyright (1988) by the American Physical Society.)
ω_e ≈ ω_a ≈ ω₀ = v_z(q_z0 + k_w)   [17]

where

v_z = (1/ℏ) ∂ℰ_kz/∂k_z ;   1/(γ_z² γ m) = (1/ℏ²) ∂²ℰ_kz/∂k_z²

which for q_z0 = (ω/c) cos Θ_q reproduces the classical synchronism condition in eqn [5]. The homogeneous broadening linewidth is found to be:

the magnetic field in eqn [2] (Figure 1):
v_x and the wave electric field E_x reverse sign at each period exactly at the same points (λ_w/4, 3λ_w/4). This situation is depicted in Figure 6, which shows the slippage of the wave crests relative to the electron at five points along one wiggler period. The figure describes the synchronism condition, in which the radiation wave slips one optical period ahead of the electron while the electron goes through one wiggle motion. In all positions along this period, v·E ≥ 0 (in a helical wiggler and a circularly polarized wave this product is constant and positive, v·E > 0, along the entire period). Substituting λ_w = 2π/k_w, this phase synchronism condition may be written as:

ω/v_z = k_z + k_w   [24]

which is the same as eqns [17] and [5].

Figure 6 shows that a single electron (or a bunch of electrons of duration smaller than an optical period) would amplify a co-propagating radiation wave along the entire wiggler, if it satisfies the synchronism condition in eqn [24] and enters the interaction region (z = 0) at the right (decelerating) phase relative to the radiation field. If the electron enters at the opposite phase, it accelerates (on account of the radiation field energy, which is then attenuated by 'stimulated absorption'). Thus, when an electron beam is injected into a wiggler at the synchronism condition with electrons entering at random times, no net amplification or absorption of the wave is expected on the average. Hence, some more elaboration is required in order to understand how stimulated emission gain is possible then.

Before proceeding, it is useful to define the 'pondermotive force' wave. This force originates from the nonlinearity of the Lorentz force equation:

d(γmv)/dt = −e(E + v × B)   [25]

At zero order (in terms of the radiation fields), the only field force on the right-hand side of eqn [25] is due to the strong wiggler field (eqns [2] and [3]), which results in the transverse wiggling velocity (eqn [22] for a linear wiggler). When solving next eqn [25] to first order in terms of the radiation fields:

E_s(r, t) = Re[Ẽ_s e^{i(k_z z − ωt)}]
B_s(r, t) = Re[B̃_s e^{i(k_z z − ωt)}]   [26]

the cross product v × B between the transverse components of the velocity and the magnetic field generates a longitudinal force component:

F_pm(z, t) = Re[ê_z F̃_pm e^{i(k_z + k_w)z − iωt}]   [27]

that varies with the beat wavenumber k_z + k_w at slow phase velocity (v_ph = ω/(k_z + k_w) < c). This slow force-wave is called the pondermotive (PM) wave. Assuming the signal radiation wave in eqn [26] is polarization-matched to the wiggler (linearly polarized or circularly polarized for a linear or helical wiggler, respectively), the PM force amplitude is given by:

|F̃_pm| = e|Ẽ_s⊥| a_w/(γβ_z)   [28]

With large enough k_w, it is always possible to slow down the phase velocity of the pondermotive wave until it is synchronized with the electron velocity:

v_ph = ω/(k_z + k_w) = v_z   [29]

and it can then apply along the interaction length a decelerating axial force that will cause properly phased electrons to transfer energy to the wave on account of their longitudinal kinetic energy.

This observation is of great importance. It reveals that even though the main components of the wiggler and radiation fields are transverse, the interaction is basically longitudinal. This puts the FEL on an equal footing with slow-wave structure devices such as the TWT and the Cerenkov–Smith–Purcell FELs, in which the longitudinal interaction takes place with the longitudinal electric field component of a slow TM radiation mode. The synchronism condition in eqn [29] between the pondermotive wave and the electron, which is identical with the phase-matching condition in eqn [24], is also similar to the synchronism condition between an electron and a slow electromagnetic wave (eqn [14]).

Using the pondermotive wave concept, we can now explain the achievement of gain in the FEL with a random electron beam. Figure 7 illustrates the interaction between the pondermotive wave and electrons distributed randomly, at the entrance (z = 0), within the wave period. Figure 7a shows 'snapshots' of the electrons in one period of the pondermotive wave, λ_pm = 2π/(k_z + k_w), at different points along the wiggler, when it is assumed that the electron beam is perfectly synchronous with the pondermotive wave, v_ph = v₀. As explained before, some electrons are slowed down, acquiring a negative velocity increment Δv. However, for each such electron there is another one, entering the wiggler at an accelerating phase of the wave, acquiring the same positive velocity increment Δv. There is then no net change in the energy of the e-beam or the wave; however, there is clearly an effect of 'velocity-bunching' (modulation), which turns along the wiggler into 'density-bunching' at the same
Figure 7 'Snapshots' of a pondermotive wave period interacting with an initially uniformly distributed electron beam, taken at points along the interaction length 0 < z < L. (a) Exact synchronism in a uniform wiggler (bunching). (b) Energy bunching, density bunching and radiation in the energy buncher (0 < z < L_b), dispersive magnet (L_b < z < L_b + L_d) and radiating wiggler (L_b + L_d < z < L_b + L_d + L_r) sections of an Optical Klystron. (c) Slippage from bunching phase to radiating phase at optimum detuning off synchronism in a uniform wiggler FEL.
frequency ω and wavenumber k_z + k_w as the modulating pondermotive wave. The degree of density-bunching depends on the amplitude of the wave and the interaction length L. In the nonlinear limit, the counter-propagating (in the beam reference frame) velocity-modulated electrons may over-bunch, namely cross over, and debunch again.

Bunching is the principle of classical stimulated emission in electron beam radiation devices. If the e-beam had been prebunched in the first place, we would have injected it at a decelerating phase relative to the wave and obtained net radiation gain right away (super-radiant emission). This is indeed the principle behind the 'Optical Klystron' (OK) demonstrated in Figure 7b. The structure of the OK is described ahead in Figure 19. The electron beam is velocity (energy) modulated by an input electromagnetic radiation wave in the first 'bunching-wiggler section' of length L_b. It then passes through a drift-free 'energy-dispersive magnet section' (chicane) of length L_d, in which the velocity modulation turns into density bunching. The bunched electron beam is then injected back into a second 'radiating-wiggler section', where it co-propagates with the same electromagnetic wave but with a phase advance of π/2 − m2π, m = 1, 2, … (a spatial lag of λ_pm/4 − mλ_pm in real space), which places the entire bunch at a decelerating phase relative to the PM-wave and so amplifies the radiation wave.

The principle of stimulated-emission gain in an FEL, illustrated in Figure 7c, is quite similar. Here the wiggler is uniform along the entire length L, and the displacement of the electron bunches into a decelerating phase position relative to the PM-wave is obtained by injecting the electron beam at a velocity v_z0 slightly higher than the wave's v_ph (velocity detuning). The detailed calculation shows that detuning corresponding to a phase shift of ΔΨ(L) = [(ω/v_z0) − (k_z + k_w)]L = −2.6 (corresponding to a spatial bunch advance of 0.4λ_pm along the wiggler length) provides sufficient synchronism with the PM-wave in the first half of the wiggler to obtain bunching, and sufficient deceleration-phasing of the created bunches in the second part of the wiggler to obtain maximum gain.

Principles of FEL Theory

The 3D radiation field in the interaction region can be expanded in general in terms of a complete set of
free-space or waveguide modes {ℰ_q(x, y), H_q(x, y)}:

E(r, t) = Re[Σ_q C_q(z) ℰ_q(x, y) e^{i(k_zq z − ωt)}]   [30]

The mode amplitudes C_q(z) may grow along the wiggler interaction length 0 < z < L, according to the mode excitation equation:

where χ_p(k_z, ω) is the longitudinal susceptibility of the electron beam 'plasma'. The beam charge density in the FEL may be quite high, and consequently the space-charge field Ẽ_sc, arising from the Poisson eqn [32], may not be negligible. One should then take into consideration that the total longitudinal force F̃_z is composed of both the PM-force of eqn [27] and an arising longitudinal space-charge electric force −eẼ_sc. Thus, one should substitute in eqn [34]:

F̃_z(k_z + k_w, ω) = −e[Ẽ_pm(k_z + k_w, ω) + Ẽ_sc(k_z + k_w, ω)]   [35]

and solve it self-consistently with eqns [32] and [33] to obtain the 'external-force' response relation:

In a helical wiggler, A_JJ ≡ 1. Usually a_w ≪ 1, and therefore A_JJ ≈ 1.

The Pierce Dispersion Equation

The longitudinal plasma response susceptibility function χ_p(k_z, ω) has been calculated in any plasma formulation, including the fluid model, the kinetic model, or
even quantum-mechanical theory. If the electron beam axial velocity spread is small enough (cold beam), then the fluid plasma equations can be used. The small-signal longitudinal force equation derived from eqn [25], together with eqn [33] and the small-signal current modulation expression:

J̃_z ≈ ρ₀ṽ_z + v_z ρ̃   [40]

result in:

χ_p(k_z, ω) = −ε₀ ω′_p²/(ω − k_z v_z)²   [41]

where ω′_p = (e²n₀/(γγ_z² ε₀ m))^{1/2} is the longitudinal plasma frequency, n₀ is the beam electron density (ρ₀ = −en₀), and v_z is the average axial velocity of the beam.

In this 'cold-beam' limit, the FEL dispersion eqn [37] reduces to the well-known 'cubic dispersion equation', derived first by John Pierce in the late 1940s for the TWT:

δk(δk − θ − θ_pr)(δk − θ + θ_pr) = Q   [42]

where δk = k_z − k_zq and θ is the detuning parameter (off the synchronism condition of eqn [24]):

θ ≡ ω/v_z − k_zq − k_w   [43]

θ_p = ω′_p/v_z   [44]

Q = κθ_p²   [45]

Here θ_pr = r_p θ_p, where r_p < 1 is the plasma reduction factor. It results from the reduction of the longitudinal space-charge field Ẽ_sp in a beam of finite radius r_b due to the fringe field effect (r_p → 1 when the beam is wide relative to the longitudinal modulation wavelength: r_b ≫ λ_pm = 2π/(k_zq + k_w)).

The cubic equation has, of course, three solutions δk_j (j = 1, 2, 3), and the general solution for the radiation field amplitude and power is thus:

C_q(z) = Σ_{j=1}^{3} A_j e^{iδk_j z}   [46]

P(z) = |C_q(z)|² P_q   [47]

The coefficients A_j can be determined from three initial conditions of the radiation and e-beam parameters, C_q(0), ṽ(0), ĩ(0), and can be given as a linear combination of them (here ĩ = A_e J̃_z is the longitudinal modulation current):

A_j = A_j^E(ω)C_q(ω, 0) + A_j^v(ω)ṽ(ω, 0) + A_j^i(ω)ĩ(ω, 0)   [48]

Alternatively stated, the exit amplitude of the electromagnetic mode can in general be expressed in terms of the initial conditions:

C_q(L) = H^E(ω)C_q(ω, 0) + H^v(ω)ṽ_q(ω, 0) + H^i(ω)ĩ(ω, 0)   [49]

where

H^{(E,v,i)}(ω) = Σ_{j=1}^{3} A_j^{(E,v,i)}(ω) e^{iδk_j L}   [50]

In the conventional FEL, electrons are injected randomly, and there is no velocity prebunching (ṽ(ω, 0) = 0) or current prebunching (ĩ(ω, 0) = 0) (or, equivalently, ñ(ω, 0) = 0). Consequently, C_q(z) is proportional to C_q(0) and one can define and calculate the FEL small-signal single-path gain parameter:

G(ω) ≡ P(L)/P(0) = |C_q(ω, L)|²/|C_q(ω, 0)|² = |H^E(ω)|²   [51]

The FEL Gain Regimes

At different physically meaningful operating regimes, some parameters in eqn [42] can be neglected relative to others, and simple analytic expressions can be found for δk_j, A_j, and consequently G(ω). It is convenient to normalize the FEL parameters to the wiggler length: θ̄ = θL, θ̄_pr = θ_pr L, Q̄ = QL³. An additional figure-of-merit parameter is the 'thermal' spread parameter:

θ̄_th = (v_zth/v_z)(ω/v_z)L   [52]

where v_zth is the axial velocity spread of the e-beam (in a Gaussian velocity distribution model: f(v_z) = exp[−((v_z − v_z0)/v_zth)²]/(√π v_zth)). The axial velocity spread can result from beam energy spread or from angular spread (finite 'emittance'). It should be small enough so that the general dispersion relation of eqn [37] reduces to eqn [42] (the practical 'cold beam' regime).

Assuming now a conventional FEL (ṽ_z(0) = 0, ĩ(0) = 0), the single-path gain in eqn [51] can be calculated. We present next this gain expression in the different regimes. The maximum values of the gain expression in the different regimes are listed in Table 1.

Low gain

This is the regime where the differential gain in a single path satisfies G − 1 = [P(L) − P(0)]/P(0) ≪ 1. It is not useful for FEL amplifiers, but most FEL oscillators operate in this regime.
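The cubic dispersion relation of eqn [42] is easy to solve numerically. The sketch below, with an illustrative normalized Q̄ and negligible space charge, finds the three roots δk_j and checks that the growing root reproduces the tenuous-beam high-gain growth rate (√3/2)Q^{1/3}; the single-path power gain then scales as (1/9)exp(√3 Q̄^{1/3}), as used later in the SASE expressions.

```python
import numpy as np

# Sketch: roots of the cubic dispersion eqn [42] in normalized units (L = 1).
# Q-bar, theta, and theta_pr below are assumed, illustrative values.

Qbar = 20.0        # assumed normalized gain parameter Q L^3
theta = 0.0        # on synchronism
theta_pr = 0.0     # tenuous beam: negligible space charge

# delta_k (delta_k - theta - theta_pr)(delta_k - theta + theta_pr) = Q
# expands to  delta_k^3 - 2 theta delta_k^2 + (theta^2 - theta_pr^2) delta_k - Q = 0
coeffs = [1.0, -2 * theta, theta**2 - theta_pr**2, -Qbar]
roots = np.roots(coeffs)

# The field goes as exp(i delta_k z), so a root with Im(delta_k) < 0 grows.
growth = max(-roots.imag)                          # amplitude growth rate
expected = (np.sqrt(3) / 2) * Qbar ** (1 / 3)      # from delta_k^3 = Q at theta = 0
print(f"growth rate = {growth:.4f}, (sqrt(3)/2) Q^(1/3) = {expected:.4f}")
```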
Table 1 (as recoverable) Maximum single-path gain P(L)/P(0) in the FEL operating regimes:

II. Collective low-gain: θ̄_pr > Q̄/2, θ̄_th ≪ θ̄_pr;  P(L)/P(0) = 1 + Q̄/(2θ̄_pr)
III. Collective high-gain: θ̄_pr > (Q̄/2)^{1/3}, θ̄_th < √(Q̄/θ̄_pr);  P(L)/P(0) = (1/4) exp(√(2Q̄/θ̄_pr))
V. Warm beam: θ̄_th > θ̄_pr, Q̄^{1/3};  P(L)/P(0) = exp(3Q̄/θ̄_th²)
result in an exponential gain expression:

G(ω) = |A₁/(A₁ + A₂ + A₃)|² e^{−2(Im δk₁)L}   [57]

If we focus on the tenuous-beam strong-coupling (high-gain) regime, θ_pr ≪ |δk|, then the cubic eqn [42] takes the simple form:

δk(δk − θ)² = Γ³   [58]

input radiation signal (C_q(ω, 0) = 0) if the electron beam velocity or current (density) is prebunched. Namely, the injected e-beam has a frequency component ṽ(ω) or ĩ(ω) in the frequency range where the radiation device emits. In the case of pure density bunching (ṽ(ω) = 0), the coherent power emitted is found from eqns [46, 47, 49]:

(P_q)_SR = P_q |H^i(ω)|² |ĩ(ω, 0)|²   [65]

(proportional, as expected, to the modulation current squared) and, in the high-gain amplified superradiance limit (assuming initial partial bunching |ĩ(ω)/I_b| ≪ 1):

P_SR = P_pb |ĩ(ω)/I_b|² [1/(9(ΓL)²)] e^{√3 ΓL} e^{−(ω − ω₀)²/(Δω)²_HG}   [70]

The discussion is now extended to incoherent (or partially coherent) spontaneous emission. Due to its particulate nature, every electron beam has random frequency components in the entire spectrum (shot noise). Consequently, incoherent radiation power is always emitted from an electron beam passing through a wiggler, and its spectral power can be calculated through the relation:

dP_q/dω = P_q |H^i(ω)|² (2/π) ⟨|ĩ(ω)|²⟩/T   [71]

Here ĩ(ω) is the Fourier transform of the current of randomly injected electrons, i(t) = −e Σ_{j=1}^{N_T} δ(t − t_0j), where N_T is the average number of electrons in a time period T; namely, the average (DC) current is I_b = −eN_T/T. For a randomly distributed beam, the shot-noise current is simply ⟨|ĩ(ω)|²⟩/T = eI_b, and therefore the spontaneous emission power of the FEL, which is nothing but the 'synchrotron-undulator radiation', is given by (see eqn [66]):

dP_q/dω = (1/16π) e I_b Z_q (L²/A_em) (a_w/(γβ_z))² sinc²(θL/2)   [72]

If the wiggler is long enough, the spontaneous emission emitted in the first part of the wiggler can be amplified by the rest of it (SASE). In the high-gain limit (see eqn [67]), the amplified spontaneous emission power within the gain bandwidth of eqn [64] is given by:

P_q = P_q e I_b (1/π) ∫₀^∞ |H^i(ω)|² dω = (1/9) P_sh e^{√3 ΓL}   [73]

where P_sh is an 'effective shot-noise input power':

P_sh = (2/√π) [e P_pb/(I₀ (ΓL)²)] (Δω)_HG   [74]

Saturation Regime

The FEL interaction of an electron with a harmonic electromagnetic (EM) wave is essentially described by the longitudinal component of the force in eqn [25], driven by the pondermotive force of eqns [27] and [28]:

d(γ_i m v_zi)/dt = |F̃_pm| cos[ωt − (k_z + k_w)z_i]   [75]

dz_i/dt = v_zi   [76]

As long as the interaction is weak enough (small-signal regime), the change in the electron velocity is negligible – v_zi ≈ v_z0 – and the phase of the force-wave experienced by the electron is linear in time: Ψ_i(t) = [ω − (k_z + k_w)v_z0](t − t_0i) + ωt_0i. Near the synchronism condition θ ≈ 0 (eqn [24]), eqn [75] results in bunching of the beam, because different acceleration/deceleration forces are applied to each electron, depending on its initial phase Ψ_i(0) = ωt_0i (−π < Ψ_i(0) < π) within each optical period 2π/ω (see Figure 7). Taylor expansion of v_zi around v_z0 in eqns [75] and [76], and use of conservation of energy between the e-beam and the radiation field, would lead again to the small-signal gain expression eqn [53] in the low-gain regime.

When the interaction is strong enough (the nonlinear or saturation regime), the electron velocities change enough to invalidate the assumption of linear time dependence of Ψ_i, and the nonlinear set of eqns [75] and [76] needs to be solved exactly. It is convenient to invert the dependence on time, z_i(t) = ∫_{t_0i}^{t} v_zi(t′) dt′, and turn the coordinate z into the independent variable: t_i(z) = ∫₀^z dz′/v_zi(z′) + t_0i. This, and direct differentiation of γ_i(v_zi), reduces eqns [75] and [76] into the well-known pendulum equation:

dθ_i/dz = −K_s² sin Ψ_i   [77]

dΨ_i/dz = θ_i   [78]

where

Ψ_i = ∫₀^z (ω/v_zi(z′) − k_z − k_w) dz′   [79]

θ_i = ω/v_zi − k_z − k_w   [80]

are, respectively, the pondermotive potential phase and the detuning value of electron i at position z, and

K_s = [k a_w a_s A_JJ/(γ₀ γ_z0 β_z0²)]^{1/2}   [81]

is the synchrotron oscillation wavenumber, where a_w is given in eqn [9], a_s = e|Ẽ|/(mcω), and γ₀ = γ(0), γ_z0 = γ_z(0), and β_z0 = β_z(0) are the initial parameters of the assumed cold beam.

The pendulum eqns [77] and [78] can be integrated once, resulting in:

(1/2)θ_i²(z) − K_s² cos Ψ_i(z) = C_i   [82]

and the integration constant is determined for each electron by its detuning and phase relative to the
LASERS / Free Electron Lasers 445
K_s L = π   [87]
Figure 10 'Snapshots' of the (γ – Ψ) phase-space distribution of an initially uniformly distributed cold beam relative to the PM-wave trap at three points along the wiggler. (a) Moderate bunching in the small-signal low-gain regime. (b) Dynamics of electron-beam trapping and synchrotron oscillation at the steady-state saturation stage of an FEL oscillator (K_s L = π).
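The trapping dynamics illustrated in Figure 10 can be reproduced numerically. The following sketch integrates the pendulum equations (eqns [77] and [78]) for a cold, uniformly phased beam up to the condition K_s·L = π of eqn [87]; the value of K_s, the electron count, and the step size are illustrative assumptions, not parameters taken from the text:

```python
import numpy as np

def pendulum_phase_space(Ks=2.0, L=None, n_electrons=200, n_steps=4000):
    """Integrate the FEL pendulum equations (eqns [77]-[78]),
        dtheta_i/dz = -Ks**2 * sin(psi_i),   dpsi_i/dz = theta_i,
    for a cold beam injected uniformly in phase over one optical period."""
    if L is None:
        L = np.pi / Ks          # trapping condition Ks*L = pi of eqn [87]
    psi = np.linspace(-np.pi, np.pi, n_electrons, endpoint=False)  # initial phases
    theta = np.zeros(n_electrons)   # cold beam: all electrons start at synchronism
    dz = L / n_steps
    for _ in range(n_steps):
        # semi-implicit Euler step (symplectic, preserves the trap structure)
        theta -= Ks**2 * np.sin(psi) * dz
        psi += theta * dz
    return psi, theta

Ks = 2.0
psi, theta = pendulum_phase_space(Ks=Ks)
# The once-integrated invariant of eqn [82], C_i = theta^2/2 - Ks^2*cos(psi),
# labels each trajectory; an electron is inside the separatrix if C_i < Ks^2.
C = 0.5 * theta**2 - Ks**2 * np.cos(psi)
trapped_fraction = float(np.mean(C < Ks**2))
```

For a cold beam injected at exact synchronism, nearly the whole beam starts inside the separatrix, while the integration shows the detuning spread (energy modulation) that develops over half a synchrotron swing.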
by eqn [87]. In an oscillator, this is controlled by increasing the output mirror transmission sufficiently, so that the single-path incremental small-signal gain G − 1 will not be much larger than the round-trip loss, and the FEL will not get into deep saturation. When the FEL is over-saturated (K_s L > π), the trapped electrons begin to gain energy as they continue to rotate in their phase-space trajectories beyond the lowest energy point of the trap, and the radiative energy extraction efficiency drops.

A practical estimate for the FEL saturation power emission and radiation extraction efficiency can be derived from the following consideration: the electron beam parts with most of its energy during the interaction with the wave if a significant fraction of the electrons are within the trap and have positive velocity δv_zi relative to the wave velocity v_ph at z = 0, and if at the end of the interaction length (z = L) they complete half a pendulum swing and reverse their velocity relative to the wave, δv_zi(L) ≈ −δv_zi(0). Correspondingly, in the energy phase-space diagram (Figure 10b) the electrons perform half a synchrotron oscillation swing and δγ_i(L) = γ_i(L) − γ_ph = −δγ_i(0). In order to include in this discussion also the FEL amplifier (in the high-gain regime), we note that in this case the phase velocity of the wave v_ph in eqn [84], and correspondingly γ_ph, are modified by the interaction contribution to the radiation wavenumber, k_z = k_z0 + Re(δk), and also the electron detuning parameter (relative to the pondermotive wave) θ_i in eqn [80] differs from the beam detuning parameter θ in eqn [43]: θ_i = θ − Re(δk). Based on these considerations and eqn [85], the maximum energy extraction from the beam in the saturation process is:

Δγ = −2δγ_i(0) = 2β_z0³ γ_z0² γ₀ (Re δk − θ)/k   [88]

where θ is the initial detuning parameter in eqn [43].

In an FEL oscillator, operating in general in the low-gain regime, |Re δk| ≪ |θ|, oscillation will usually start at the resonator mode frequency corresponding to the detuning parameter θ(ω) = −2.6/L, for which the small-signal gain is maximal (see Figure 8). Then the maximum radiation extraction efficiency can be estimated directly from eqn [88]. It is, in the highly relativistic limit (β_z0 ≈ 1):

η_ext = Δγ/γ₀ ≈ 1/(2N_w)   [89]

In an FEL amplifier, in the high-gain regime, Re δk = G/2 ≫ |θ|, and consequently in the same limit:

η_ext ≈ Gλ_w/(4π)   [90]

It may be interpreted that the effective wiggler length for saturation is L_eff = 2π/G.

Equation [90], derived here for a coherent wave, is considered valid for estimating the saturation
efficiency also in SASE-FELs. In this context, it is also called 'the efficiency parameter' ρ.

FEL Radiation Schemes and Technologies

Contrary to conventional atomic and molecular lasers, the FEL operating frequency is not determined by natural discrete quantum energy levels of the lasing matter, but by the synchronism condition of eqn [24], which can be predetermined by the choice of wiggler period λ_w = 2π/k_w, the resonator dispersion characteristics k_zq(ω), and the beam axial velocity v_z. Because the FEL design parameters can be chosen at will, its operating frequency can fit any requirement, and furthermore, it can be tuned over a wide range (primarily by varying v_z). This feature of the FEL led to FEL development efforts in regimes where it is hard to attain high-power tunable conventional lasers or vacuum-tube radiation sources – namely in the sub-mm (far-infrared or THz) regime, and in the VUV down to soft X-ray wavelengths.

In practice, in an attempt to develop short-wavelength FELs, the choice of wiggler period λ_w is limited by an inevitable transverse decay of the magnetic field away from the wiggler magnet surface (a decay range of ≈ k_w⁻¹) dictated by the Maxwell equations. To avoid interception of electron-beam current on the walls or on the wiggler surfaces, typical wiggler periods are made longer than λ_w > 1 cm. FELs (or FEMs – free electron masers) operating in the long-wavelength regime (mm and sub-mm wavelengths) must be based on waveguide resonators to avoid excessive diffraction of the radiation beam along the interaction length (the wiggler). This determines the dispersion relation k_zq(ω) = (ω² − ω_coq²)^(1/2)/c, where ω_coq is the waveguide cutoff frequency of the radiation mode q. The use of this dispersion relation in eqn [24] results in an equation for the FEL synchronism frequency ω₀. Usually the fundamental mode in an overmoded waveguide is used (the waveguide is overmoded because it has to be wide enough to avoid interception of electron-beam current). In this case (ω₀ ≫ ω_co), and certainly in the case of an open resonator (common in FELs operating in the optical regime), k_zq = ω/c, and the synchronism condition in eqn [24] is simplified to the well-known FEL radiation wavelength expression in eqn [6]:

λ = λ_w / [(1 + β_z)β_z γ_z²] ≈ λ_w / (2γ_z²)   [91]

where γ_z and a_w are defined in eqns [7]–[9].

To attain strong interaction, it is desirable to keep the wiggler parameter a_w large (eqn [38]); however, if a_w > 1, this will cause an increase in the operating wavelength according to eqns [7] and [91]. For this reason, and also in order to avoid harmonic frequency emission (in the case of a linear wiggler), a_w < 1 in common FEL designs. Consequently, considering the practical limitations on λ_w, the operating wavelength eqn [91] is determined primarily by the beam relativistic Lorentz factor γ (eqn [8]).

The conclusion is that for a short-wavelength FEL, one should use an electron beam accelerated to high kinetic energy E_k. Also, tuning of the FEL operating wavelength can be done by changing the beam energy. Small-range frequency tuning can also be done by changing the spacing between the magnet poles of a linear wiggler. This varies the magnetic field experienced by the e-beam, and affects the radiation wavelength through the change of a_w (see eqns [7] and [91]).

Figure 11 displays the operating wavelengths of FEL projects all over the world versus their e-beam energy. FELs were operated or planned to operate over a wide range of frequencies, from the microwave to X-ray – eight orders of magnitude. The data points fall on the theoretical FEL radiation curve, eqns [7], [8], and [91].

FEL Accelerator Technologies

The kind of accelerator used is the most important factor in determining the FEL characteristics. Evidently, the higher the acceleration energy, the shorter is the FEL radiation wavelength. However, not only the acceleration beam energy determines the shortest operating wavelength of the FEL, but also the e-beam quality. If the accelerated beam has large energy spread, energy instability, or large emittance (the product of the beam width with its angular spread), then it may have a large axial velocity spread v_zth. At high frequencies, this may push the detuning spread parameter θ_th (eqn [52]) into the warm-beam regime (see Table 1), in which the FEL gain is diminished and FELs are usually not operated.

Other parameters of the accelerator determine different characteristics of the FEL. High current in the electron beam enables higher-gain and higher-power operation. The e-beam pulse shape (or CW) characteristics affect, of course, the emitted radiation waveform, and may also affect the FEL gain and saturation characteristics. The following are the main accelerator technologies used for FEL construction. Their wavelength operating regimes (eqn [91]), determined primarily by their beam acceleration energies, are displayed in Figure 12.

Modulators and pulse-line accelerators
These are usually single-pulse accelerators, based on high-voltage power supplies and fast-discharge stored
Figure 11 Operating wavelengths of FELs around the world vs. their accelerator beam energy. The data points correspond, in ascending order of accelerator energy, to the following experimental facilities: NRL (USA), IAP (Russia), KAERI (Korea), IAP (Russia), JINR/IAP (Russia), INP/IAP (Russia), TAU (Israel), FOM (Netherlands), KEK/JAERI (Japan/Korea), CESTA (France), ENEA (Italy), KAERI-FEL (Korea), LEENA (Japan), ENEA (Italy), FIR FEL (USA), mm FEL (USA), UCSB (USA), ILE/ILT (Japan), MIRFEL (USA), UCLA-Kurchatov (USA/Russia), FIREFLY (GB), JAERI-FEL (Japan), FELIX (Netherlands), RAFEL (USA), ISIR (Japan), UCLA-Kurchatov-LANL (USA/RU), ELSA (France), CLIO (France), SCAFEL (GB), FEL (Germany), BFEL (China), KHI-FEL (Japan), FELI4 (Japan), iFEL1 (Japan), HGHG (USA), FELI (USA), MARKIII (USA), ATF (USA), iFEL2 (Japan), VISA (USA), LEBRA (Japan), OK-4 (USA), UVFEL (USA), iFEL3 (Japan), TTF1 (Germany), NIJI-IV (Japan), APSFEL (USA), FELICITAI (Germany), FERMI (Italy), UVSOR (Japan), Super-ACO (France), TTF2 (Germany), ELETTRA (Italy), Soft X-ray (Germany), SPARX (Italy), LCLS (USA), TESLA (Germany). ×, long wavelengths; *, short wavelengths; W, planned short-wavelength SASE-FELs. Data based in part on H. P. Freund and V. L. Granatstein, Nucl. Instr. and Methods in Phys. Res. A249, 33 (1999), and W. Colson, Proc. of the 24th Int. FEL Conference, Argonne, IL (ed. K. J. Kim, S. V. Milton, E. Gluskin). The data points fall close to the theoretical FEL radiation condition expression [91], drawn for two practical limits of wiggler parameters.
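The radiation-condition curve behind Figure 11 can be sketched numerically. The helper below assumes the common linear-wiggler form γ_z² = γ²/(1 + a_w²/2) of eqn [7]; the wiggler period (3 cm) and a_w = 1 are illustrative values, not data from the figure:

```python
# Sketch of the FEL radiation condition (eqns [7], [8], [91]):
#   lambda ≈ lambda_w / (2 * gamma_z^2),  gamma_z^2 = gamma^2 / (1 + a_w^2/2)
MC2_MEV = 0.511  # electron rest energy [MeV]

def fel_wavelength_m(E_kin_MeV, lambda_w_cm=3.0, a_w=1.0):
    gamma = 1.0 + E_kin_MeV / MC2_MEV            # Lorentz factor (eqn [8])
    gamma_z2 = gamma**2 / (1.0 + 0.5 * a_w**2)   # axial Lorentz factor squared (eqn [7])
    return (lambda_w_cm * 1e-2) / (2.0 * gamma_z2)  # eqn [91], in meters

# Beam energies from ~1 MeV (mm waves) to ~1 GeV (nanometer wavelengths):
for E in (1.0, 10.0, 100.0, 1000.0):
    print(f"{E:8.1f} MeV  ->  lambda = {fel_wavelength_m(E):.2e} m")
```

Sweeping the kinetic energy over three decades reproduces the many-orders-of-magnitude wavelength span of the figure, from millimeter waves down to the nanometer range.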
electric energy systems (e.g., Marx generators), which produce short-pulse (tens of ns) Intense Relativistic Beams (IRB) with energy in the range of hundreds of keV to a few MeV and high instantaneous current (order of kA), using explosive-cathode (plasma field emission) electron guns. FELs (FEMs) based on such accelerators operated mostly in the microwave and mm-wave regimes. Because of their poor beam quality and single-pulse characteristics, these FELs were, in most cases, operated only as Self-Amplified Spontaneous Emission (SASE) sources, producing intense radiation beams of low coherence at instantaneous power levels in the range of 1–100 MW. Because of the high e-beam current and low energy, these FEMs operated mostly in the collective high-gain regime (see Table 1).

Some of the early pioneering work on FEMs was done in the 1970s and 1980s in the US (NRL, Columbia Univ., MIT), Russia (IAP), and France (Ecole Polytechnique), based on this kind of accelerator.

Induction linacs
These too are single-pulse (or low repetition rate) accelerators, based on induction of electromotive potential over an acceleration gap by means of an electric-transformer circuit. They can be cascaded to high energy, and produce short-pulse (tens to hundreds
The FEL small-signal gain must be large enough to build up the radiation field in the resonator from noise to saturation well within the macropulse duration.

RF-LINACs are essential facilities in synchrotron radiation centers, used to inject electron-beam current into the synchrotron storage-ring accelerator from time to time. For this reason, many FELs based on RF-LINACs were developed in synchrotron centers, and provide additional coherent radiation sources to the synchrotron radiation center users. Figure 15 displays FELIX – an RF-LINAC FEL located in one of the most active FEL radiation user-centers, at FOM in the Netherlands.

Storage rings
Storage rings are circular accelerators in which a number of electron (or positron) beam bunches (typically of 50–500 ps pulse duration and hundreds of amperes peak current) are circulated continuously by means of a lattice of bending magnets and quadrupole lenses. Typical energies of storage-ring accelerators are in the hundreds of MeV to GeV range. As the electrons pass through the bending magnets, they lose a small amount of their energy due to emission of synchrotron radiation. This energy is replenished by a small RF acceleration cavity placed in one section of the ring. The electron-beam bunch dimensions, energy spread, and emittance parameters are set in steady state by a balance between the electron oscillations within the ring lattice and radiation damping due to the random synchrotron emission process. This produces a high-quality (small emittance and energy spread) continuous train of electron-beam bunches that can be used to drive an FEL oscillator placed as an insertion device in one of the straight sections of the ring between two bending magnets.

Demonstrations of FEL oscillators operating in a storage ring were first reported by the French (LURE-Orsay) in 1987 (at visible wavelengths) and the Russians (VEPP-Novosibirsk) in 1988 (in the ultraviolet). The short-wavelength operation of storage-ring FELs is facilitated by the high energy, low emittance, and low energy spread parameters of the beam.

Since storage-ring accelerators are at the heart of all synchrotron radiation centers, one could expect that FELs would be abundant in such facilities as insertion devices. There is, however, a problem of interference of the FEL, operating as an insertion device, with the normal operation of the ring itself. The energy spread increase induced in the electron beam during the interaction in a saturated FEL oscillator cannot be controlled by the synchrotron radiation damping process if the FEL operating power is too high.
Figure 15 The FELIX RF-LINAC FEL operating as a radiation user center at FOM, Netherlands. (Courtesy of L. van der Meer, FOM.)
This limits the FEL power to a fraction of the synchrotron radiation power dissipated all around the ring (the 'Renieri limit'). The effect of the FEL on the e-beam quality reduces the lifetime of the electrons in the storage ring, and so disrupts the normal operation of the ring in a synchrotron radiation user facility.

To avoid the interference problems, it is most desirable to operate FELs in a dedicated storage ring. This also provides the option to leave straight sections long enough that long wigglers can provide sufficient gain for FEL oscillation. Figure 16 displays the Duke storage-ring FEL, which is used as a unique radiation user facility, providing intense coherent short-wavelength radiation for applications in medicine, biology, material studies, etc.

Superconducting (SC) RF-LINACs
When the RF cavities of the accelerator are superconducting, there are very low RF power losses on the cavity walls, and it is possible to maintain a continuous acceleration field in the RF accelerator with a moderate-power continuous RF source, which delivers all of its power to the electron-beam kinetic energy. Combining the SC-RF-LINAC technology with an FEL oscillator, pioneered primarily by Stanford University and Thomas Jefferson Lab (TJL) in the US and JAERI Lab in Japan, gave rise to an important scheme of operating such a system in a current-recirculating, energy-retrieval mode.

This scheme revolutionized the development of FELs in the direction of high-power, high-efficiency operation, which is highly desirable primarily for industrial applications (material processing, photochemical production, etc.).

In the recirculating SC-RF-LINAC FEL scheme, the wasted beam emerging from the wiggler, after losing a fraction of only a few percent of its kinetic energy (see eqn [89]), is not dumped into a beam dump, as in normal cavity RF accelerators, but is re-injected, after circulation, into the SC-RF accelerator. The timing of the wasted electron bunches' re-injection is such that they experience a deceleration phase along the entire length of the accelerating cavities. Usually, they are re-injected at the same cell with a fresh new electron bunch injected at an acceleration phase, and thus the accelerated fresh bunch receives its acceleration kinetic energy directly from the wasted beam bunch, which is at the same time decelerated. The decelerated wasted beam bunches are then dumped in the electron-beam dump at much lower energy than without recirculation, at energies limited primarily by the energy spread induced in the beam in the FEL laser-saturation process. This scheme not only increases many-fold the overall energy transformation efficiency from wall plug to radiation, but also solves significant heat dissipation and radioactive activation problems in a high-power FEL design. Figure 17 displays the TJL infrared SC-RF-LINAC FEL oscillator, which demonstrated for the first time
Figure 16 The Duke University Storage Ring FEL operating as a radiation user center in North Carolina, USA. (Rendering: Matthew Busch; courtesy of Glenn Edwards, Duke FEL Lab.)
record high average power levels – nearly 10 kW at optical frequencies (1–14 μm). The facility is in upgrade development stages towards eventual operation at 100 kW in the IR and 1 kW in the UV. It operates in the framework of a laser material-processing consortium and demonstrates important material-processing applications, such as high-rate micromachining of hard materials (ceramics) with picosecond laser pulses.

The e-beam current recirculation scheme of the SC-RF-LINAC FEL has a significant advantage over e-beam recirculation in a storage ring. As in electrostatic accelerators, the electrons entering the wiggler are 'fresh' cold-beam electrons from the injector, and not a wasted beam corrupted by the laser saturation process in a previous circulation through the FEL. This also makes it possible to sustain a high average circulating current despite the disruptive effect of the FEL on the e-beam. This technological development has given rise to a new concept for a radiation-user facility light source, 4GLS (fourth-generation light source), which is presently in a pilot project development stage at Daresbury Lab in the UK (see Figure 18). In such a scheme, IR and UV FEL oscillators and an XUV SASE-FEL can be operated together with synchrotron dipole magnets and wiggler insertion devices without disruptive interference. Such a scheme, if further developed, can give rise to new radiation-user light-source facilities that can provide a wider range of radiation parameters than synchrotron centers of the previous generation.
Figure 17 The Thomas Jefferson Lab recirculating beam-current superconducting LINAC FEL operating as a material-processing FEL user center in Virginia, USA. (Courtesy of S. Benson, Thomas Jefferson Laboratory.)
Figure 18 The Daresbury Fourth Generation Light Source concept (4GLS). The circulating beam-current superconducting LINAC includes a SASE-FEL, bending magnets, and wigglers as insertion devices. (Courtesy of M. Poole, Daresbury Laboratory.)
Figure 19 Schematics of the optical klystron, including an energy-bunching wiggler, a dispersive magnet bunching section, and a radiating wiggler.
synchronism with the beam. Slowing down the PM wave can be done by gradually increasing the wiggler wavenumber k_w(z) (or decreasing its period λ_w(z)), so that eqns [29] or [91] keep being satisfied for a given frequency, even if v_z (or γ_z) goes down.

A more correct description of the nonlinear interaction dynamics of the electron beam in a saturated tapered-wiggler FEL is depicted in Figure 20: the electron-trap synchronism energy γ_ph(z) tapers down (by design) along the wiggler, while the trapped electrons are forced to slow down with it, releasing their excess energy as enhanced radiation. An upper-limit estimate for the extraction efficiency of such a tapered-wiggler FEL would be:

η_ext = [γ_ph(0) − γ_ph(L)] / γ_ph(0)   [99]

and the corresponding radiative power generation would be ΔP = η_ext I_b E_k/e. In practice, the phase-space area of the tapered-wiggler separatrix is reduced due to the tapering, and only a fraction of the electron beam can be trapped, which correspondingly reduces the practical enhancement in radiative extraction efficiency and power.

An alternative wiggler tapering scheme consists of tapering the wiggler field B_w(z) (or the wiggler parameter amplitude a_w(z)). If these are tapered down, the axial velocity and axial energy (eqn [7]) can still be kept constant (and in synchronism with the PM wave) even if the beam energy γ goes down. Thus, in this scheme, the excess radiative energy extracted from the beam comes out of its transverse (wiggling) energy.

Efficiency and power enhancement of FELs by wiggler tapering have been demonstrated experimentally both in FEL amplifiers (first by Livermore, 1985) and in oscillators (first by Los Alamos, 1983). This elegant way to extract more power from the beam still has some limitations. It can operate efficiently only at a specified high radiation power level for which the tapering was designed. In an oscillator, a long enough untapered section must be left to permit sufficient small-signal gain in the early stages of the oscillation build-up process.

FEL Oscillators

Most FEL devices are oscillators. As in any laser, in order to turn the FEL amplification process into an oscillation process, one provides a feedback mechanism by means of an optical resonator. In steady-state saturation, G R_rt = 1, where R_rt is the round-trip reflectivity factor of the resonator and G = P(L)/P(0) is the saturated single-path gain coefficient of the FEL. To attain oscillation, the small-signal (unsaturated) gain, usually given by the small-gain expression in eqn [53], must satisfy the lasing threshold condition G > 1/R_rt, as in any laser.

When steady-state oscillation is attained, the oscillator output power is:

P_out = [T / (1 − R_rt)] ΔP_ext   [100]

where ΔP_ext = η_ext I₀(γ₀ − 1)mc²/e and η_ext is the extraction efficiency, usually given by eqn [89] (low-gain limit).

Usually, FEL oscillators operate in the low-gain regime, in which case 1 − R_rt = L + T ≪ 1 (where L is the resonator internal loss factor). Consequently, P_out ≈ ΔP_ext T/(L + T), which attains a maximum value depending on the saturation level of the oscillator. In the general case, one must solve the nonlinear force equations together with the resonator feedback relations of the oscillating radiation mode, in order to maximize the output power (eqn [100]) or efficiency by choice of the optimal T for a given L.
Figure 20 ‘Snapshots’ of the trap at three locations along a tapered wiggler FEL.
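The upper-limit estimate of eqn [99] for the tapered wiggler of Figure 20 is straightforward to evaluate; the 10% taper depth and 1 A beam current in this sketch are hypothetical:

```python
# Upper-limit tapered-wiggler extraction efficiency, eqn [99], and the
# corresponding radiated power DP = eta_ext * I_b * E_k / e.
def tapered_eta(gamma_ph_0, gamma_ph_L):
    return (gamma_ph_0 - gamma_ph_L) / gamma_ph_0   # eqn [99]

eta = tapered_eta(100.0, 90.0)      # trap energy tapered from gamma = 100 to 90 (assumed)
E_k_eV = (100.0 - 1.0) * 0.511e6    # beam kinetic energy in eV
dP_watts = eta * 1.0 * E_k_eV       # I_b = 1 A; E_k/e in volts gives watts
```

A 10% taper thus raises the ideal extraction efficiency well above the untapered 1/(2N_w) limit, which is the motivation for tapering in high-power designs.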
In an FEL oscillator operating with periodic electron bunches (as in an RF-accelerator-based FEL), the solution for the FEL gain and saturation dynamics requires extension of the single-frequency solution of the electron and electromagnetic field equations to the time domain. In principle, the situation is similar to that of a mode-locked laser, and the steady-state laser pulse-train waveform constitutes a superposition of the resonator longitudinal modes that produces a self-similar pulse shape with the highest gain (best overlap with the e-beam bunch along the interaction length).

Because the e-beam velocity v_z0 is always smaller (in an open resonator) than the group velocity of the circulating radiation wavepacket, the radiation wavepacket slips ahead of the electron bunch by one optical period λ in each wiggling period (slippage effect). This reduces the overlap between the radiation pulse and the e-beam bunch along the wiggler (see Figure 14) and consequently decreases the gain. Fine adjustment of the resonator mirrors (as shown in Figure 14) is needed to attain maximal power and optimal radiation pulse shape. The pulse-slippage gain reduction effect is negligible only if the bunch length is much longer than the slippage length N_w λ, which can be expressed as:

t_p ≫ 2π/Δω_L   [101]

where Δω_L is the synchrotron-undulator radiation frequency bandwidth (eqn [55]). This condition is usually not satisfied in RF-accelerator FELs operating in the IR or at lower frequencies, and the
Figure 21 Anticipated peak brightness of SASE FELs (TTF-DESY, LCLS-SLAC) in comparison to the undulators in present third-generation synchrotron radiation sources. Figure courtesy of DESY, Hamburg, Germany.
slippage-effect gain reduction must then be taken into account.

An FEL operating in the cold-beam regime constitutes a 'homogeneous broadening' gain medium in the sense of conventional laser theory. Consequently, the longitudinal mode competition process that would develop in a CW FEL oscillator leads to single-mode operation and high spectral purity (temporal coherence) of the laser radiation. The minimal (intrinsic) laser linewidth would be determined by an expression analogous to the Schawlow–Townes limit of atomic lasers:

(Δf)_int = (Δf_1/2)² / (I_b/e)   [102]

where Δf_1/2 is the spectral width of the cold resonator mode. Expression [102] predicts extremely narrow linewidth. In practice, CW operation of an FEL has not yet been attained, but Fourier-transform-limited linewidths in the range of Δf/f₀ ≈ 10⁻⁶ were measured in long-pulse electrostatic-accelerator FELs. In an FEL oscillator based on a train of e-beam bunches (e.g., an RF accelerator beam), the linewidth is very wide and is equal to the entire gain bandwidth (eqn [56]) in the slippage-dominated limit, and to the Fourier transform limit Δω ≈ 2π/t_p in the opposite negligible-slippage limit (eqn [101]). Despite this slippage, it was observed in RF-LINAC FELs that the radiation pulses emitted by the FEL oscillator are phase-correlated with each other, and therefore their total temporal coherence length may be as long as the
Figure 22 Phase 1 of the SASE FEL (TTF VUV-FEL1): (a) accelerator layout scheme; (b) general view of the TESLA Test Facility. Figure courtesy of DESY, Hamburg, Germany.
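The narrow-linewidth prediction of eqn [102] can be checked at the order-of-magnitude level; the cold-resonator width and beam current below are assumed values, not measurements from the text:

```python
# Order-of-magnitude evaluation of the intrinsic linewidth formula, eqn [102]:
#   (Delta f)_int = (Delta f_1/2)^2 / (I_b / e)
E_CHARGE = 1.602e-19  # electron charge [C]

def intrinsic_linewidth_hz(df_half_hz, I_b_amp):
    # (Hz)^2 divided by an electron arrival rate (1/s) yields Hz
    return df_half_hz**2 / (I_b_amp / E_CHARGE)

# An assumed 1 MHz cold-cavity linewidth and a 1 A beam:
df_int = intrinsic_linewidth_hz(1e6, 1.0)
```

Even with a modest megahertz-wide cold cavity, the predicted intrinsic linewidth falls far below a microhertz, consistent with the statement that expression [102] predicts extremely narrow linewidth.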
460 LASERS / Metal Vapor Lasers
[Table: metal; principal wavelengths (nm); typical and maximum powers (W); total efficiencies; pulse repetition frequency (kHz); technological development]
Figure 2 Partial energy level scheme for the copper vapor laser.
arrangement. The cylindrical electrodes and silica electrons: HBr þ e2 ! H þ Br2 ; followed by ion
laser end windows are supported by water-cooled end neutralization: Br2 þ Cuþ ! Br þ Cup : As a result
pieces. The laser windows are usually tilted by a few of the lower operating temperature and kinetic
degrees to prevent back reflections into the active advantages of HBr, CuBr lasers are typically twice
medium. The laser head is contained within a water as efficient (2 –3%) as their elemental counterparts.
cooled metal tube to provide a coaxial current return Sealed-off CuBr systems with powers of order
for minimum laser head inductance. Typically a slow 10 – 20 W are commercially produced.
flow (,5 mbar l min21) of neon at a pressure of An alternative technique for reducing the operating
20 – 80 mbar is used as the buffer gas with an temperature of elemental CVLs is to flow a buffer gas
approximately 1% H2 additive to improve the mixture consisting of , 5% HBr in neon at approxi-
afterglow plasma relaxation. The buffer gas provides mately 50 mbar l min21 and allow this to react with
a medium to operate the discharge when the laser solid copper metal placed within the plasma tube
is cold and slows diffusion of copper vapor out of at about 600 8C to produce CuBr vapor in situ.
the ends of the hot plasma tube. Typical copper fill The so-called Cu HyBrID (hydrogen bromide in
times are of order 200 – 2000 hours (for 20 – 200 g discharge) laser has the same advantages as the CuBr
copper load). Sealed-off units with lifetimes of order laser (e.g., up to 3% efficiency) but at the cost of
1000 hours have been in production in Russia for requiring flowing highly toxic HBr in the buffer gas,
many years. a requirement which has so far prevented commer-
During operation waste heat from the repetitively cialization of Cu HyBrID technology.
pulsed discharge heats the alumina tube up to The kinetic advantages of the hydrogen halide in
approximately 1500 8C at which point the vapor the CuBr laser discharge can also be applied to a
pressure of copper is of approximately 0.5 mbar conventional elemental CVL through the addition of
which corresponds to the approximate density small partial pressure of HCl to the buffer gas in
required for maximum laser output power. Typical addition to the 1 –2% H2 additive. HCl is preferred to
warm-up times are therefore relatively long at around HBr as it is less likely to dissociate (the dissociation
one hour to full power. energy of HCl at 0.043 eV is less than HBr at
One way to circumvent the requirement for high 0.722 eV). Such kinetic enhancement leads to a
temperatures required to produce sufficient doubling in average output power, a dramatic
copper density by evaporation of elemental copper increase in beam quality through improved gain
(and hence also reduce warm-up times) is to use a characteristics, and shifts the optimum pulse
copper salt with a low boiling point located in repetition frequency for kinetically enhanced CVLs
one or more side-arms of the laser tube (Figure 4). (KE-CVLs) from 4 – 10 kHz up to 30 – 40 kHz.
Usually copper halides are used as the salt, with the Preferential pumping of the upper laser levels
copper bromide laser being the most successful. For requires an electron temperature in excess of the
the CuBr laser, a temperature of just 600 8C is 2 eV range, hence high voltage (10– 30 kV), high
sufficient to produce the required Cu density by current (hundreds of A), short (75 –150 ns) excitation
dissociation of CuBr vapor in the discharge. With the pulses are required for efficient operation of copper
inclusion of 1– 2% H2 in the neon buffer gas, HBr is vapor lasers. To generate such excitation pulses CVLs
also formed in the CuBr laser, which has the are typically operated with a power supply incorpor-
additional benefit of improving recombination in ating a high-voltage thyratron switch. In the most
the afterglow via dissociative attachment of free basic configuration, the charge-transfer circuit shown
in Figure 5a, a dc high-voltage power supply resonantly charges a storage capacitor (CS, typically a few nF) through a charging inductor LC, a high-voltage diode and bypass inductor LB, up to twice the supply voltage VS in a time of order 100 µs. When the thyratron is triggered, the storage capacitor discharges through the thyratron and the laser head on a time-scale of 100 ns. Note that during the fast discharge phase, the bypass inductor in parallel with the laser head can be considered to be an open circuit. A peaking capacitor CP (<0.5CS) is provided to increase the rate of rise of the voltage pulse across the laser head. Given the relatively high cost of thyratrons, more advanced circuits are now often used to extend the service lifetime of the thyratron to several thousand hours. In the more advanced circuit (Figure 5b), an LC inversion scheme is used in combination with magnetic pulse compression techniques and operates as follows. Storage capacitors CS are resonantly charged in parallel to twice the dc supply voltage VS as before. When the thyratron is switched, the charge on CS1 inverts through the thyratron and transfer inductor LT. This LC inversion drags the voltage on the top of CS2 down to −4VS. When the voltage across the first saturable inductor LS1 reaches a maximum (−4VS), LS1 saturates and allows current to flow from the storage capacitors (now charged in series) to the transfer capacitor (CT = 0.5CS2), thereby transferring the charge from CS1 and CS2 to CT in a time much less than the initial LC inversion time. At the moment when the charge on transfer capacitor CT reaches a maximum (also −4VS), LS2 saturates, and the transfer capacitor is discharged through the laser tube, again with a peaking capacitor to increase the rate of voltage rise. By using magnetic pulse compression the thyratron switched voltage can be reduced by a factor of 4 and the peak current similarly reduced (at the expense of increased current pulse duration), thereby greatly extending the thyratron lifetime. Note that in both circuits a 'magnetic assist' saturable inductor LA is provided in series with the thyratron to delay the current pulse through the thyratron until after the thyratron has reached high conductivity, thereby reducing power deposition in the thyratron.

Copper vapor lasers produce high average powers (2–100 W available commercially, with laboratory devices producing average powers of over 750 W) and have wall-plug efficiencies of approximately 1%. Copper vapor lasers also make excellent amplifiers due to their high gains, and the amplifiers can be chained together to produce average powers of several kW. Typical pulse repetition frequencies range from 4 to 20 kHz, with a maximum reported pulse repetition frequency of 250 kHz. An approximate scaling law states that for an elemental
device with tube diameter D (mm) and length L (m)
the average output power in watts will be of order
D × L. For example, a typical 25 W copper vapor
laser will have a 25 mm diameter by 1 m long laser
tube, and operate at 10 kHz corresponding to 2.5 mJ
pulse energy and 50 kW peak power (50 ns pulse
duration). Copper vapor lasers have very high single-
pass gains (greater than 1000 for a 1 m long tube),
large gain volumes and short gain durations
(20 – 80 ns, sufficient for the intracavity laser light
to make only a few round trips within the optical
resonator). Maximum output power is therefore
usually obtained in a highly ‘multimode’ (spatially
incoherent) beam by using either a fully stable or a
plane–plane resonator with a low-reflectivity output
coupler (usually Fresnel reflection from an uncoated
optic is sufficient). To obtain higher beam quality a
high magnification unstable resonator is required
(Figure 6). Fortunately, copper vapor lasers have
sufficient gain to operate efficiently with unstable
resonators with magnifications (M = R1/R2) up to
100 and beyond. Resonators with such high
magnifications impose very tight geometric constraints on propagation of radiation on repeated round-trips within the resonator, such that after two round-trips the divergence is typically diffraction-limited. Approximately half the stable-resonator output power can therefore be obtained with near diffraction-limited beam quality by using an unstable resonator. Often, a small-scale oscillator is used in conjunction with a single power amplifier to produce high output power with diffraction-limited beam quality. Hyperfine splitting combined with Doppler broadening leads to an inhomogeneous linewidth for the laser transitions of order 8–10 GHz, corresponding to a coherence length of order 3 cm.

Figure 5 Copper vapor laser excitation circuits. (a) Charge transfer circuit; (b) LC inversion circuit with magnetic pulse compression.

464 LASERS / Metal Vapor Lasers

Figure 6 Unstable resonator configuration often used to obtain high beam quality from copper vapor lasers.

The high beam quality and moderate peak power of CVLs allows efficient nonlinear frequency conversion to the UV by second harmonic generation (510.55 nm → 255.3 nm, 578.2 nm → 289.1 nm) and sum frequency generation (510.55 nm + 578.2 nm → 271.3 nm) using β-barium borate, β-BaB2O4 (BBO), as the nonlinear medium. Typically average powers in excess of 1 W can be obtained at any of the three wavelengths from a nominally 20 W CVL. Powers up to 15 W have been obtained at 255 nm from high-power CVL master-oscillator power-amplifier systems using cesium lithium borate, CsLiB6O10 (CLBO), as the nonlinear crystal.

Key applications of CVLs include pumping of dye lasers (principally for laser isotope separation) and pumping Ti:sapphire lasers. Medically, the CVL yellow output is particularly useful for treatment of skin lesions such as port wine stain birth marks. CVLs are also excellent sources of short-pulse stroboscopic illumination for high-speed imaging of fast objects and fluid flows. The high beam quality, visible wavelength and high pulse repetition rate make CVLs very suited to precision laser micromachining of metals, ceramics and other hard materials. More recently, the second harmonic at 255 nm has proved to be an excellent source for writing Bragg gratings in optical fibers.

Afterglow Recombination Metal Vapor Lasers

Recombination of an ionized plasma in the afterglow of a discharge pulse provides a mechanism for achieving a population inversion and hence laser output. The two main afterglow recombination metal vapor lasers are the strontium ion vapor laser (430.5 nm and 416.2 nm) and the calcium ion vapor laser (373.7 nm and 370.6 nm), whose output in the violet and UV spectral regions extends the spectral coverage of pulsed metal vapor lasers to shorter wavelengths. A population inversion is produced by recombination pumping, where doubly ionized Sr (or Ca) recombines to form singly ionized Sr (or Ca) in an excited state: Sr²⁺ + e⁻ + e⁻ → Sr⁺* + e⁻. Note that recombination rates for a doubly ionized species are much faster than for singly ionized species, hence recombination lasers are usually metal ion lasers. Recombination (pumping) rates are also greatest in a cool dense plasma, hence helium is usually used as the buffer gas as it is a light atom and promotes rapid collisional cooling of the electrons. Helium also has a much higher ionization potential than the alkaline-earth metals (including Sr and Ca), which ensures preferential ionization of the metal species (up to 90% may be doubly ionized).

Recombination lasers can operate where the energy level structure of an ion species can be considered to consist of two (or more) groups of closely spaced levels. In the afterglow of a pulsed discharge, electron collisional mixing within each group of levels will maintain each group in thermodynamic equilibrium, yielding a Boltzmann distribution of population within each group. If the difference in energy between the two groups of levels (5 eV ≫ kT) is sufficiently large, then thermal equilibrium between the groups cannot be maintained via collisional processes. Given that recombination yields excited singly ionized species (usually with a flux from higher to lower ion levels), it is possible to achieve a population inversion between the lowest level of the upper group and the higher levels within the lower group. This is the mechanism for inversion in the Sr⁺ (and analogous Ca⁺) ion laser, as indicated in the partial energy level scheme for Sr⁺ shown in Figure 7.

Strontium and calcium ion recombination lasers have similar construction to the copper vapor lasers described above. The lower operating temperatures (500–800 °C) mean that minimal or no thermal insulation is required for self-heated devices.
Figure 7 Partial energy level scheme for the strontium ion laser.
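The two-group Boltzmann argument can be made concrete with a short calculation. The 5 eV inter-group gap is taken from the text; the afterglow electron temperature of about 0.3 eV is an illustrative assumption, not a value from the article.

```python
import math

def boltzmann_ratio(delta_e_ev: float, kt_ev: float) -> float:
    """Equilibrium population ratio exp(-dE/kT) between two levels."""
    return math.exp(-delta_e_ev / kt_ev)

# Levels inside one group are closely spaced, so collisions keep them
# in a Boltzmann distribution with a ratio of order unity:
within_group = boltzmann_ratio(0.1, 0.3)

# Between groups the 5 eV gap dwarfs kT, so collisions cannot maintain
# equilibrium there, and the recombination flux can invert the
# populations across the gap:
between_groups = boltzmann_ratio(5.0, 0.3)
```

With these numbers the within-group ratio is about 0.7, while the between-group ratio is of order 10⁻⁸, illustrating why the two groups decouple thermally.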
For good tube lifetime, BeO plasma tubes are required due to the reactivity of the metal vapors. Usually helium at a pressure of up to one atmosphere is used as the buffer gas. Typical output powers at 5 kHz pulse repetition frequency are of order 1 W (<0.1% wall-plug efficiency) at 430.5 nm from the He–Sr⁺ laser and 0.7 W at 373.7 nm from the He–Ca⁺ laser. For the high specific input power densities (10–15 W cm⁻³) required for efficient lasing, overheating of the laser gas limits aperture scaling beyond 10–15 mm diameter. Slab laser geometries have been used successfully to aperture-scale strontium ion lasers. Scaling of the laser tube length beyond 0.5 m is not practical, as achieving high enough excitation voltages for efficient lasing becomes problematic. Gain in strontium and calcium ion lasers is lower than in resonance-metastable metal vapor lasers such as the CVL, hence the optimum output coupler reflectivity is approximately 70%. The pulse duration is also longer, at around 200–500 ns, resulting in moderate beam quality with plane–plane resonators.

For both strontium and calcium ion lasers the two principal transitions share upper laser levels and hence exhibit gain competition, such that without wavelength-selective cavities usually only the longer wavelength of the pair is produced. With wavelength-selective cavities up to 60% (Sr⁺) and 30% (Ca⁺) of the normal power can be obtained at the shorter wavelength.

Many potential applications exist for strontium and calcium ion lasers, given their ultraviolet (UV)/violet wavelengths. Of particular importance is fluorescence spectroscopy in biology and forensics, treatment of neonatal jaundice, stereolithography, micromachining and exposing photoresists for integrated circuit manufacture. Despite these many potential applications, technical difficulties in power scaling mean that both strontium ion and calcium ion lasers have only limited commercial availability.

Continuous-Wave Metal Ion Lasers

In a discharge-excited noble gas, there can be a large concentration of noble gas atoms in excited metastable states and noble gas ions in their ground states. These species can transfer their energy to a minority metal species (M) via two key processes: either charge transfer (Duffendack reactions) with the noble gas ions (N⁺):

M + N⁺ → M⁺* + N + ΔE

(where M⁺* is an excited metal ion state), or Penning ionization with a noble gas atom in an excited metastable state (N*):

M + N* → M⁺* + N + e⁻
In a continuous discharge in a mixture of a noble gas and a metal vapor, steady generation of excited metal ions via energy transfer processes can lead to a steady-state population inversion on one or more pairs of levels in the metal ion and hence produce cw lasing.

Several hundred metal ion laser transitions have been observed to lase in a host of different metals. The most important such laser is the helium cadmium laser, which has by far the largest market volume by number of unit sales of all the metal vapor lasers. In a helium cadmium laser, Cd vapor is present at a concentration of about 1–2% in a helium buffer gas which is excited by a dc discharge. Excitation to the upper laser levels of Cd (Figure 8) is primarily via Penning ionization collisions with He 2³S₁ metastable atoms produced in the discharge. Excitation via electron-impact excitation from the ion ground state may also play an important role in establishing a population inversion in Cd⁺ lasers. Population inversion can be sustained continuously because the 2P3/2 and 2P1/2 lower laser levels decay via strong resonance transitions to the 2S1/2 Cd⁺ ground state, unlike in the self-terminating metal vapor lasers. Two principal wavelengths can be produced, namely 441.6 nm (blue) and 325.0 nm (UV), with cw powers up to 200 mW and 50 mW respectively available from commercial devices.

Typical laser tube construction (Figure 9) consists of a 1–3 mm diameter discharge channel typically 0.5 m long. A pin anode is used at one end of the laser tube, together with a large-area cold cylindrical cathode located in a side-arm at the other end of the tube. Cadmium is transported to the main discharge tube from a heated Cd reservoir in a side-arm at 250–300 °C. With this source of atoms at the anode end of the laser, a cataphoresis process (in which the positively charged Cd ions are propelled towards the cathode end of the tube by the longitudinal electric field) transports Cd into the discharge channel. Thermal insulation of the discharge tube ensures that it is kept hotter than the Cd reservoir, to prevent condensation of Cd from blocking the tube bore. A large-diameter Cd condensation region is provided at the cathode end of the discharge channel. A large-volume side-arm is also provided to act as a gas ballast for maintaining correct He pressure. As He is lost through sputtering and diffusion through the Pyrex glass tube walls, it is replenished from a high-pressure He reservoir by heating a permeable glass wall separating the reservoir from the ballast chamber, which allows He to diffuse into the ballast chamber. Typical commercial sealed-off Cd lasers have operating lifetimes of several thousand hours. Overall laser construction is not that much more
Figure 8 Partial energy level diagram for helium and cadmium giving HeCd laser transitions.
LASERS / Noble Gas Ion Lasers 467
of neon, argon, krypton, and xenon, spanning the spectrum from ultraviolet to infrared; oscillation was also obtained in the ions of other gases, for example, oxygen, nitrogen, and chlorine. A most complete listing of all wavelengths observed in gaseous ion lasers is given in the Laser Handbook cited in the Further Reading section. Despite the variety of materials and wavelengths demonstrated, however, it is the argon and krypton ion lasers that have received the most development and utilization.

Continuous ion lasers utilize high-current-density gas discharges, typically 50 A or more, and 2–5 mm in diameter. Gas pressures of 0.2 to 0.5 torr result in longitudinal electric fields of a few V/cm of discharge, so that the power dissipated in the discharge is typically 100 to 200 W/cm. Such high power dissipation required major technology advances before long-lived practical lasers became available. Efficiencies have never been high, ranging from 0.01% to 0.2%. A typical modern ion laser may produce 10 W output at 20 kW input power from 440 V three-phase power lines and require 6–8 gallons/minute of cooling water. Smaller, air-cooled ion lasers, requiring 1 kW of input power from 110 V single-phase mains, can produce 10–50 mW output power, albeit at even lower efficiency.

Theory of Operation

The strong blue and green lines of the argon ion laser originate from transitions between the 4p upper levels and 4s lower levels in singly ionized argon, as shown in Figure 1. The 4s levels decay radiatively to the ion ground state. The strongest of these laser lines are listed in Table 1. The notation used for the energy levels is that of the L–S coupling model. The ion ground state electron configuration is 3s²3p⁵ (2Po3/2). The inner ten electrons have the configuration 1s²2s²2p⁶, but this is usually omitted for brevity. The excited states shown in Figure 1 result from coupling a 4p or 4s electron to a 3s²3p⁴ (3P) core; the resulting quantum numbers S (the net spin of the electrons), L (the net angular momentum of the electrons), and J (the angular momentum resulting from coupling S to L) are represented by the symbol 2S+1LJ (with 2S+1 written as a superscript prefix and J as a subscript), where L = 0, 1, 2, … is denoted S, P, D, F, …. The superscript 'o' denotes an odd level, while even levels omit the superscript. Note that some weaker transitions involving levels originating from the 3s²3p⁴ (1D) core configuration also oscillate. Note also that the quantum-mechanical selection rules for the L–S coupling model are not rigorously obeyed, although the stronger laser lines satisfy most or all of these rules. The selection rule |ΔJ| = 1 or 0, but not J = 0 → J = 0, is always obeyed. All transitions shown in Figure 1 and Table 1 belong to the second spectrum of argon, denoted Ar II. Lines originating from transitions in the neutral atom make up the first spectrum, Ar I; lines originating from transitions in doubly ionized argon are denoted Ar III, and so forth for even more highly ionized states.

Much work has been done to determine the mechanisms by which the inverted population is formed in the argon ion laser. Reviews of this extensive research
Figure 1 4p and 4s doublet levels in singly ionized argon, showing the strongest blue and green laser transitions.
Table 1 Ar II laser blue-green wavelengths

Wavelength (nm)   Transition a (upper level → lower level)   Relative strength b (W)
454.505           4p 2Po3/2 → 4s 2P3/2                       0.8
457.935           4p 2So1/2 → 4s 2P1/2                       1.5
460.956           (1D)4p 2Fo7/2 → (1D)4s 2D5/2               –
465.789           4p 2Po1/2 → 4s 2P3/2                       0.8
472.686           4p 2Do3/2 → 4s 2P3/2                       1.3
476.486           4p 2Po3/2 → 4s 2P1/2                       3.0
487.986           4p 2Do5/2 → 4s 2P3/2                       8.0
488.903           4p 2Po1/2 → 4s 2P1/2                       c

(i) by electron collision with the 3p6 neutral ground state atoms. This 'sudden perturbation' process requires at least 37 eV electrons, and also singles out the 4p 2Po3/2 upper level, which implies that only the 476 and 455 nm lines would oscillate. This is the behavior seen in pulsed discharges at very low pressure and very high axial electric field. This pathway probably contributes little to the 4p population under ordinary continuous-wave (cw) operating conditions, however.
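The level assignments in Table 1 can be checked mechanically. A short sketch (wavelengths and labels transcribed from the table) confirms that exactly the two lines near 455 and 476 nm share the 4p 2Po3/2 upper level singled out by the sudden-perturbation process:

```python
# (wavelength nm, upper level, lower level), transcribed from Table 1.
TABLE_1 = [
    (454.505, "4p 2Po3/2", "4s 2P3/2"),
    (457.935, "4p 2So1/2", "4s 2P1/2"),
    (460.956, "(1D)4p 2Fo7/2", "(1D)4s 2D5/2"),
    (465.789, "4p 2Po1/2", "4s 2P3/2"),
    (472.686, "4p 2Do3/2", "4s 2P3/2"),
    (476.486, "4p 2Po3/2", "4s 2P1/2"),
    (487.986, "4p 2Do5/2", "4s 2P3/2"),
    (488.903, "4p 2Po1/2", "4s 2P1/2"),
]

def lines_from_upper(level: str) -> list:
    """Wavelengths of all tabulated lines sharing the given upper level."""
    return [wl for wl, upper, _ in TABLE_1 if upper == level]

# Only these two lines could oscillate if pumping fed 4p 2Po3/2 alone:
favoured = lines_from_upper("4p 2Po3/2")  # [454.505, 476.486]
```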
can occur. In pulsed ion lasers, this is exhibited by the laser pulse terminating before the excitation current pulse ends, which would seem to preclude continuous operation. However, the intense discharge used in continuous operation heats the ions to the order of 2300 K, thus greatly Doppler broadening the absorption linewidth and reducing the magnitude of the absorption. Additionally, the ions are attracted to the discharge tube walls, so that the absorption spectrum is further broadened by the Doppler shift due to their wall-directed velocities. The plasma wall sheath gives about a 20 V drop in potential from the discharge axis to the tube wall, so most ions hit the wall with 20 eV of energy, or about ten times their thermal energy.

A typical cw argon ion laser operates at ten times the gas pressure that is optimum for a pulsed laser, and thus the radiation trapping of the 4s → 3p⁵ transitions is so severe that it may take several milliseconds after discharge initiation for the laser oscillation to begin.

As the discharge current is increased, the intensities of the blue and green lines of Ar II eventually saturate, and then decrease with further current. At these high currents, there is a buildup in the population of doubly ionized atoms, and some lines of Ar III can be made to oscillate with the appropriate ultraviolet mirrors. Table 2 lists the strongest of these lines, those that are available in the largest commercial lasers. Again, there is no quantitative model for the performance in terms of the discharge parameters, but the upper levels are assumed to be populated by processes analogous to those of the Ar II laser. At still higher currents, lines in Ar IV can be made to oscillate as well.

Much less research has been done on neon, krypton, and xenon ion lasers, but it is a good assumption that the population and depopulation processes are the same in these lasers. Table 3 lists both the Kr II and Kr III lines that are available from the largest commercial ion lasers. Oscillation on lines in still-higher ionization states in both krypton and xenon has been observed.

Operating Characteristics

A typical variation of output power with discharge current for an argon ion laser is shown in Figure 3. This particular laser had a 4 mm diameter discharge in a water-cooled silica tube, 71 cm in length, with a 1 kG axial magnetic field. The parameter is the argon pressure in the tube before the discharge was struck. Note that no one curve is exactly quadratic, but that the envelope of the curves at different filling pressures is approximately quadratic. At such high discharge current densities (50 A in the 4 mm tube is approximately 400 A/cm²) there is substantial pumping of gas out of the small-bore discharge region. Indeed, a return path for this pumped gas must be provided from anode to cathode ends of the discharge to keep the discharge from self-extinguishing. The axial electric field in this discharge was 3–5 V/cm, so the input power was of the order of 10 to 20 kW, yielding an efficiency of less than 0.1%, an unfortunate characteristic of all ion lasers.
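The quoted input power and efficiency can be reproduced from the numbers given for the Figure 3 laser (field, length and current from the text; the arithmetic below is just a sketch):

```python
# Input power for the Figure 3 laser: axial field times length gives
# the tube voltage, and voltage times current gives discharge power.
def discharge_power_w(field_v_per_cm: float, length_cm: float,
                      current_a: float) -> float:
    return field_v_per_cm * length_cm * current_a

p_low = discharge_power_w(3.0, 71.0, 50.0)   # lower bound, ~10.7 kW
p_high = discharge_power_w(5.0, 71.0, 50.0)  # upper bound, ~17.8 kW

# 10 W of output from more than 10 kW of input is below 0.1% efficiency:
efficiency = 10.0 / p_low
```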
a All levels are denoted in L–S coupling with the core shown in ( ). Odd parity is denoted by the superscript 'o'.
b Relative strengths are given as power output from a commercial Spectra-Physics model 2080RS ion laser.
Figure 4 A typical commercial low-power, air-cooled argon ion laser. The cathode can (2) is at the right; the beryllia bore (1) and anode (4) are at the left. The cathode current is supplied through ceramic vacuum feed-throughs (3). Mirrors (5) are glass-fritted onto the ends of the vacuum envelope. (Photo courtesy of JDS Uniphase Corp.)
Figure 5 Discharge bore structure of a typical commercial high-power argon ion laser, the Coherent Innova. Copper cups are brazed to the inner wall of a ceramic envelope, which is cooled by liquid flow over its outer surface. A thin tungsten disk with a small hole defining the discharge path is brazed over a larger hole in the bottom of each copper cup. Additional small holes in the copper cup near the ceramic envelope provide a gas return path from cathode to anode. Also shown is the hot oxide-impregnated tungsten cathode. (Photo courtesy of Coherent, Inc.)
an additional 20 eV. When these ions eventually hit the discharge walls near the location of the sheath, they have 40 eV of energy rather than the 20 eV from the normal wall sheath elsewhere in the small-diameter discharge. Sputtering yield (number of sputtered wall atoms per incident ion) is exponentially dependent on ion energy in this low-ion-energy region, so the damage done to the wall in the vicinity of the 'throat' where the double sheath forms may be more than ten times that elsewhere in the small-diameter bore region. This was the downfall of high-power operation of dielectric discharge bores, even BeO; while sputtering was acceptable elsewhere in the discharge, the amount of material removed in a small region near the discharge throat would cause catastrophic bore failure at that point. This localized increase in sputtering in the discharge throat is common to all ion lasers, including modern cooled tungsten disk tubes. Eventually, disks near the throat are eroded to larger diameters, and more material is deposited on the walls nearby. Attempts to minimize localized sputtering by tapering the throat walls or the confining magnetic field have proven unsuccessful; a localized double sheath always forms somewhere in the throat. Because of the asymmetry caused by ion flow, the double sheath is larger in amplitude in the cathode throat than the anode throat, so that the localized wall damage is larger at the cathode end of the discharge than at the anode end.

Another undesirable feature of sputtering is that the sputtered material 'buries' some argon atoms when it is deposited on a wall. This is the basis of the well-known Vac-Ion® vacuum pump. Thus, the operating pressure in a sealed laser discharge tube will drop during the course of operation. In low-power ion lasers, this problem is usually solved by making the gas reservoir volume large enough to satisfy the desired operating life (for example, the large cathode can in Figure 4). In high-power ion lasers, the gas loss would result in unacceptable operating life even with a large reservoir at the fill pressure. Thus, most high-power ion lasers have a gas pressure measurement system and dual-valve arrangement connected to a small high-pressure reservoir to 'burp' gas into the active discharge periodically to keep the pressure within the operating range. Unfortunately, if a well-used ion laser is left inoperative for months, some of the 'buried' argon tends to leak back into the tube, with the gas pressure becoming higher than optimum. The pressure will gradually decrease to its optimum value with further operation of the discharge (in perhaps tens of hours). In the extreme case, the gas pressure can rise far enough so that the gas discharge will not strike, even at the maximum power supply voltage. Such a situation requires an external vacuum pump to remove the excess gas, usually a factory repair.

In addition to simple dc discharges, various other techniques have been used to excite ion lasers, primarily in a search for higher power, improved efficiency and longer operating life. Articles in the Further Reading section give references to these attempts. Radio-frequency excitation at 41 MHz was used in an inductively coupled discharge, with the laser bore and its gas return path forming a rectangular single turn of an air-core transformer. A commercial product using this technique was sold for a few years in the late 1960s. Since this was an 'electrode-less' discharge, ion laser lines in reactive gases such as chlorine could be made to oscillate, as well as the noble gases, without 'poisoning' the hot cathode used in conventional dc discharge lasers. A similar electrode-less discharge was demonstrated as a quasi-cw laser by using iron transformer cores and exciting the discharge with a 2.5 kHz square wave. Various microwave excitation configurations at 2.45 GHz and 9 GHz also resulted in ion laser oscillation. Techniques common to plasma fusion research were also studied. Argon ion laser oscillation was produced in Z-pinch and theta-pinch discharges and also by high-energy (10–45 keV) electron beams. However, none of these latter techniques resulted in a commercial product, and all had efficiencies worse than the simple dc discharge lasers.

It is interesting that the highest-power output demonstrations were made in less than ten years after the discovery. Before 1970, 100 W output on the argon blue-green lines was demonstrated with dc discharges two meters in length. In 1970, a group in the Soviet Union reported 500 W blue-green output from a two-meter discharge with 250 kW of dc power input, an efficiency of 0.2%. Today, the highest output power argon ion laser offered for sale is 50 W output.

Manufacturers

More than 40 companies have manufactured ion lasers for sale over the past four decades. This field has now (2004) narrowed to the following:

Coherent, Inc.  http://www.coherentinc.com
INVERSion Ltd.  http://inversion.iae.nsk.su
JDS Uniphase  http://www.jdsu.com
Laser Physics, Inc.  http://www.laserphysics.com
Laser Technologies GmbH  http://www.lg-lasertechnologies.com
LASOS Lasertechnik GmbH  http://www.LASOS.com
Lexel Laser, Inc.  http://www.lexellaser.com
Melles Griot  http://lasers.mellesgriot.com
Spectra-Physics, Inc.  http://www.spectraphysics.com

Further Reading

Bridges WB (1979) Atomic and ionic gas lasers. In: Marton L and Tang CL (eds) Methods of Experimental Physics, Vol. 15: Quantum Electronics, Part A, pp. 31–166. New York: Academic Press.
Bridges WB (1982) Ionized gas lasers. In: Weber MJ (ed.) Handbook of Laser Science and Technology, Vol. II, Gas Lasers, pp. 171–269. Boca Raton, FL: CRC Press.
LASERS / Optical Fiber Lasers 475
Bridges WB (2000) Ion lasers – the early years. IEEE Journal of Selected Topics in Quantum Electronics 6: 885–898.
Davis CC and King TA (1975) Gaseous ion lasers. In: Goodwin DW (ed.) Advances in Quantum Electronics, vol. 3, pp. 169–469. New York: Academic Press.
Dunn MH and Ross JN (1976) The argon ion laser. In: Sanders JH and Stenholm S (eds) Progress in Quantum Electronics, vol. 4, pp. 233–269. New York: Pergamon.
Weber MJ (2000) Handbook of Laser Wavelengths. Boca Raton, FL: CRC Press.
Table 1 Commonly used pump and signal transitions in rare-earth doped fiber lasers
Rare earth dopant Pump wavelengths [nm] Signal wavelengths [nm] Excited state lifetime [ms] Energy levels
Figure 1 Example of a simple all-fiber optical oscillator. The rare-earth doped fiber provides optical gain, and the Bragg gratings
provide narrowband optical feedback.
inhomogeneous linewidths of several nanometers, and can be optimized to provide around 30 dB gain over an 80 nm bandwidth centered around 1570 nm. The large gain-bandwidth is useful for achieving either wide tunability of the lasing wavelength and/or ultrashort pulse generation. Table 1 summarizes the most commonly used pump and signal transitions for a range of rare-earth dopants in silicate optical fibers. Other desirable properties of silica fiber lasers include high efficiency, excellent thermal dissipation, substantial energy storage, high power capability, and compatibility with a wide range of optical fiber devices and systems. The characteristics of optical fiber lasers can be optimized for some very different applications, ranging from narrow-linewidth and low-noise lasers to broadband pulsed lasers with high energy and/or high peak power. In the following sections the fundamentals of laser theory as applied to fiber lasers will be reviewed, highlighting the reasons underlying such versatility.

Fiber Laser Fundamentals

An example of a simple all-fiber laser is shown schematically in Figure 1. The laser contains the basic elements of optical gain (e.g., a rare-earth doped fiber amplifier) and optical feedback (e.g., a pair of Bragg grating filters). In the laser shown, the two Bragg gratings provide optical feedback only over a narrow range of wavelengths, which further restricts the range of possible lasing wavelengths. One grating must have a reflectivity less than unity to allow a proportion of the lasing signal to be coupled out of the laser cavity. At threshold, the gain in the amplifier exactly compensates for the loss in the output coupler.

The optical pump, typically from a semiconductor laser diode, is absorbed as it propagates along the fiber amplifier, e.g., by rare-earth ion dopants, which are raised to an excited state. The absorbed energy is stored and eventually emitted at a longer wavelength by either spontaneous or stimulated emission. Spontaneous emission produces unwanted noise and is associated with a finite excited-state lifetime, whilst stimulated emission produces optical gain provided the gain medium is 'inverted', i.e., with more active ions in the upper lasing energy level than the lower lasing energy level. In practice some energy is also lost to nonradiative (thermal) emission, e.g., in the transfer of energy between the pump and upper lasing energy levels. In an ideal gain medium the nonradiative transfer rate between the pump and upper lasing energy levels is much larger than the spontaneous or stimulated emission rates, so the inversion may usually be assumed to be independent of nonradiative transfer rates. Energy level diagrams for three- and four-level atoms are shown in Figure 2.

The local optical gain coefficient (i.e., power gain per unit length, in nepers) in a fiber amplifier is proportional to the local inversion, i.e., g(z) ∝ σS ΔN(z), where σS is the emission cross-section at the
(i.e., the dependence of gain on signal power) in fiber amplifiers. More accurate analysis would also take into account factors such as the rare-earth dopant distribution (which may not be uniform), the variation of pump and signal intensities along the fiber, spectral variation in the absorption and emission cross-sections, spatial and spectral hole-burning in the gain medium, degeneracies in the energy levels, excited state absorption and other loss mechanisms, temperature, and spontaneous emission noise.

Continuous Wave Fiber Lasers

The ideal continuous wave (cw) laser converts input pump power with low coherence to a highly coherent optical output signal which is constant in amplitude and wavelength, with the spectral purity of the output (i.e., linewidth) limited only by cavity losses. Practical continuous wave lasers may be characterized by four parameters: the pump power required for the onset of oscillation (i.e., threshold); the conversion efficiency of pump to signal power above threshold; the peak wavelength of the optical output; and the spectral width of the optical output. Continuous wave fiber lasers often have a low threshold, high efficiency, and narrow linewidth relative to other types of laser. Output powers approaching 100 watts have been achieved using specialized techniques.

The threshold power is determined by the total cavity loss and gain efficiency (i.e., gain per unit of absorbed pump power) of the fiber amplifier. For example, for the laser shown in Figure 1 the internal losses are minimal, and the pump power required to reach threshold may be calculated by setting the product of the round-trip small-signal gain and the output mirror reflectivity, G²R, equal to unity. For the laser configuration shown in Figure 1, and using eqns [1] and [2] for a four-level laser, the pump power which must be absorbed in the amplifier for the laser to reach threshold is

  P_th = (hν_P A / σ_S τ) α_cav L_C   [4]

in which L_C is the cavity length, and α_cav = α_int + [1/(2L_C)] ln(1/R) is the total cavity loss per unit length, comprising both internal losses, α_int, and outcoupling losses through a mirror with reflectivity R. For example, if the internal losses were negligible and the transmittance, T = 1 − R, of the output reflector in the laser shown in Figure 1 was 50% (i.e., 3 dB), and the other fiber parameters were the same as above (i.e., G = 0.3 dB/mW), then the laser threshold would be 5 mW. In three-level fiber lasers ground state absorption of the signal is usually the dominant loss mechanism which must be overcome before a net gain is produced, and hence the threshold of three-level lasers is given approximately by eqn [4] with α_int = σ_S N L_A/(2L_C), and increases with amplifier length. In either case the pump power required at the input of the laser to reach threshold, P_P(0), may be approximated using the relation P_Pabs = P_P(0) − P_P(L_A) ≈ P_P(0)[1 − exp(−σ_P N L_A)].

For either three- or four-level lasers pumped above threshold, the inversion of the gain medium remains at its threshold level as the internal round-trip gain must remain unity; however a coherent optical output signal at the peak emission wavelength grows from noise until its amplitude is limited by saturation of the amplifier gain. Provided the output coupling loss is not too large (e.g., T < 60%), such that the total signal intensity is constant along the whole length of the fiber amplifier, it can be shown that changes in pump power above threshold cause a proportional change in the signal power coupled out of the Fabry–Perot cavity in one direction, P⁺_Sout, i.e.

  P⁺_Sout = [(1 − R)/δ₀] (ν_S/ν_P) (P_Pabs − P_th)   [5]

in which δ₀ = 2α_int L + ln(1/R) represents the total round-trip loss at threshold. The relationship between absorbed pump power and output signal power is shown schematically in Figure 3. The constant of proportionality is known as the slope or conversion efficiency, defined as η_S = P_Sout/(P_Pabs − P_th). In both three- and four-level fiber lasers with small intrinsic losses (α_int ≈ 0) and small output coupling (i.e., R ≈ 1), the slope efficiency can approach the intrinsic quantum efficiency, ν_S/ν_P.

Sustained oscillation can only occur if the optical path length around the cavity is an integral number of wavelengths. Wavelengths satisfying this condition are called 'modes' of the cavity, determined by

Figure 3 Representative plot of laser output power versus absorbed pump power.
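The 5 mW threshold quoted above follows from simple dB bookkeeping, and can be checked numerically. A minimal sketch using the stated values (gain efficiency G = 0.3 dB/mW, 50% output coupling, negligible internal loss):

```python
import math

# Threshold condition G^2 * R = 1: the single-pass gain in dB must equal
# half the round-trip outcoupling loss (internal losses assumed negligible).
gain_efficiency = 0.3               # small-signal gain per unit absorbed pump, dB/mW
R = 0.5                             # output reflector reflectivity (T = 1 - R = 50%)

loss_dB = 10 * math.log10(1 / R)          # outcoupling loss per round trip, ~3 dB
P_th = (loss_dB / 2) / gain_efficiency    # absorbed pump power at threshold, mW
print(f"threshold = {P_th:.1f} mW")       # threshold = 5.0 mW
```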
of nonradiative energy transfer between those levels. For example, for an ideal three-level gain medium (e.g., erbium dopant) with no internal cavity losses the maximum available Q-switched pulse energy is half the stored energy, or E_pulse = E/2. It can be shown that the peak of the output pulse power occurs as the inversion decreases through threshold, and may be approximated by dividing the pulse energy by the pulse duration. Even for moderate pulse energies the peak power is potentially very high. For the erbium-doped fiber laser parameters used previously, the peak pulse power would be approximately 1600 W, i.e., sufficient to cause significant spectral broadening of the output wavelength due to stimulated Raman scattering and other nonlinear effects within the optical fiber.

Mode-locked lasers generate optical pulses by forcing many cavity modes to be locked in phase. From Fourier theory, the pulse repetition rate is related to the spacing of the lasing modes, and the more modes that add coherently, the greater the pulse bandwidth and the shorter the transform-limited pulse duration. For example, the relationship between mode spacing and pulse repetition rate, and between the spectral and temporal envelopes of a train of Gaussian pulses, is shown in Figure 6.

Figure 6 Schematic showing the Fourier-transform relationship between an infinite train of Gaussian pulses and the lasing modes in a mode-locked laser. The temporal and spectral widths are measured full width at half the maximum intensity.

In mode-locked lasers a nonlinear or time-varying element must be present which synchronously modulates the optical signal circulating in the cavity in amplitude or phase at a frequency which is a harmonic of the cavity mode spacing. There are two main methods of mode locking:

(i) active mode locking (e.g., by an electro-optic modulator), in which amplitude or phase modulation is driven by an external source; and
(ii) passive mode locking (e.g., by a fast or slow saturable absorber), in which an element with nonlinear gain or loss responds to the optical signal in the laser cavity itself.

In both cases the modulation provided by the mode-locking element distributes energy between neighboring modes, effectively injection-locking them in phase. A variation of mode locking techniques often present in fiber lasers is soliton mode-locking, in which the active or passive modulation is weak, and a balance between nonlinearity and dispersion is the dominant process underlying pulse formation. Hybrid mode-locking techniques can also be used to combine the best aspects of different pulse-generation methods.

Whilst the latter frequency-domain view of mode locking is helpful, nonlinear effects are almost always significant in mode-locked fiber lasers, and hence the linear concept of a mode is not appropriate for detailed analysis. Hence it is usually preferable to describe pulse generation in mode-locked fiber lasers in the time domain; for stable operation the pulse at any point in the cavity must self-replicate after each round-trip, with the net effects of temporal and spectral loss, gain, dispersion and nonlinearity all canceling.

An optical pulse propagating in a fiber laser is usually described by a slowly varying complex envelope, Ψ(z,t), where |Ψ(z,t)|² is the instantaneous power in a frame of reference moving at the pulse group velocity, and arg[Ψ(z,t)] is the instantaneous phase relative to the optical carrier. The various elements in the laser cavity, such as the passive fiber, optical fiber amplifier, filters, and mode-locking devices, each modify the complex pulse envelope as it propagates around the laser cavity.

For example, in the absence of other effects, group velocity dispersion (GVD) causes different wavelengths to travel at different velocities along the fiber, resulting in 'chirping' and temporal broadening of optical pulses by an amount proportional to the distance propagated. The effect can be described mathematically as the accumulation of a phase shift which is quadratic in frequency, i.e., Ψ(z,ω) = Ψ(0,ω) exp[iβ₂ω²z/2]. Taking the derivative with respect to propagation distance, and using ∂/∂t = −iω, the incremental change in the complex envelope with propagation distance is

  ∂Ψ/∂z = −i (β₂/2) ∂²Ψ/∂t²   [7]

If the GVD parameter, β₂, is positive then long wavelengths propagate faster than shorter wavelengths, and the dispersion is called 'normal'. If the reverse is true, the dispersion is 'anomalous'. In standard single-mode step index optical fibers the dispersion passes through zero at wavelengths around 1300 nm, and is anomalous for longer wavelengths. It is often necessary to control the net dispersion in a fiber laser, and this can be done by incorporating specially designed fibers with wavelength-shifted or otherwise modified dispersion characteristics, or by using chirped Bragg grating reflectors.

Whilst silica is not commonly regarded as a nonlinear medium (i.e., one in which the optical properties are a function of the intensity of light), in ultrashort pulsed fiber lasers the peak powers are usually such that nonlinear effects cannot be ignored. In silica the main nonlinearity is associated with the Kerr effect, and is readily observed as an intensity-dependent refractive index, i.e., n = n₀ + n₂I, where the nonlinear index, n₂, is typically about 3 × 10⁻²⁰ m²/W in silica fibers. Other nonlinear effects related to the Kerr nonlinearity which can sometimes be important are cross-phase-modulation and four-wave mixing. In the absence of other effects, the nonlinear index results in self-phase modulation (SPM), or a phase shift in the optical carrier which is proportional to the instantaneous intensity of the signal. Mathematically, Ψ(z,t) = Ψ(0,t) exp[iγ|Ψ(0,t)|²z], in which γ = n₂ω₀/(cA) is the nonlinear coefficient of the fiber, ω₀ is the radian frequency of the optical carrier, and A is the effective core area for the guided mode. The incremental change per unit length in the complex pulse envelope induced by self-phase modulation is then

  ∂Ψ/∂z = iγ|Ψ|²Ψ   [8]

For example, self-phase-modulation is used to realize a nonlinear switching element in the passively mode-locked laser cavity shown schematically in Figure 7. For obvious reasons the laser is called a 'figure-8' laser, though it is actually a ring laser into which a device known as a nonlinear amplifying loop-mirror (NALM) has been incorporated. The NALM acts as a fast saturable absorber, with transmittance dependent on the instantaneous optical intensity. The NALM operates as follows: light propagating through the optical isolator is divided equally into co- and counter-propagating waves in the loop-mirror on the right-hand side. The light propagating in a clockwise direction is amplified immediately, and propagates around the loop with high intensity, whilst the counter-clockwise propagating signal has a low intensity for most of its time in the loop. The difference in intensity means that the co- and counter-propagating waves experience different nonlinear phase shifts. When they are recombined at the coupler, high-intensity light is transmitted back into the ring, whilst low-intensity light is reflected back to the isolator, and lost. In this way the NALM provides both gain and a nonlinear transmittance which depends on the instantaneous power of the input, described by

  T ≈ G {1 − ½ [1 + cos(γL P_in G)]}   [9]

Figure 7 Schematic of a 'figure-8' fiber laser. The laser is passively mode-locked by the nonlinear amplifying loop mirror on the right, which acts as a fast saturable absorber, transmitting and amplifying high-power optical signals whilst reflecting low optical power signals.
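The switching behavior of eqn [9] can be explored with a short numerical sketch; the gain, loop length, and nonlinear coefficient below are illustrative assumptions, not values from the article:

```python
import numpy as np

# NALM transmittance of eqn [9]: T ~ G * (1 - 0.5*(1 + cos(gamma*L*P_in*G))).
G = 2.0        # linear gain of the loop amplifier (assumed)
gamma = 3e-3   # nonlinear coefficient, W^-1 m^-1 (assumed)
L = 100.0      # loop length in metres (assumed)

def nalm_transmittance(P_in):
    """Ratio of transmitted to input power for input power P_in in watts."""
    return G * (1.0 - 0.5 * (1.0 + np.cos(gamma * L * P_in * G)))

T_low = nalm_transmittance(1e-6)                      # low power: T ~ 0, light is reflected
T_peak = nalm_transmittance(np.pi / (gamma * L * G))  # pi differential phase: T = G
print(T_low, T_peak)
```

At low input power the co- and counter-propagating waves acquire nearly equal nonlinear phases, so the loop acts as a mirror; full transmission (with gain) occurs once the differential phase reaches π.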
in which G is the linear gain of the amplifier, L is the loop length, P_in is the input power, and T is the ratio of transmitted power to input power. The nonlinear transmittance of the NALM promotes pulse formation and shortening in the laser cavity, and hence passively mode-locks the laser. The polarization controllers are used to compensate for unwanted birefringence in the optical fiber, and to control the transmittance of the loop-mirror at low input powers. The fiber dispersion and nonlinearity, together with other factors, further influence the pulses generated in the laser, as described in the following paragraphs.

The combined effects of dispersion and self-phase-modulation can give rise to some interesting and useful physical phenomena. In lossless optical fibers, or fiber lasers in which loss is balanced by gain, pulse propagation is described by the nonlinear Schrödinger equation (NLSE), i.e., a nonlinear wave equation including the effects of both group velocity dispersion and intensity-dependent refractive index. The NLSE may be derived by combining eqns [7] and [8], i.e.

  i ∂Ψ/∂z − (β₂/2) ∂²Ψ/∂t² + γ|Ψ|²Ψ = 0   [10]

It can be shown that if the dispersion is anomalous (β₂ < 0) then for a particular shape of pulse envelope the distributed effects of dispersion and nonlinearity can exactly cancel, so ∂Ψ/∂z = 0, and the pulse propagates without changing shape. Such pulses are called fundamental solitons, the solution for which is Ψ(z,t) = Ψ₀ sech(t/T₀) exp[iz/(2L_D)], in which the peak amplitude is Ψ₀ = 1/√(γL_D), T₀ is the pulse width parameter, and L_D = T₀²/|β₂| is known as the dispersion length. For example, in a standard silica fiber at λ = 1.55 μm, β₂ ≈ −20 ps²/km and γ ≈ 10 W⁻¹km⁻¹, hence a soliton with T₀ = 1 ps would have peak power 2 W and energy E₀ = 2|β₂|/(γT₀) = 4 pJ. The energy of a soliton is quantized, in that pulses with initial energy between 0.5 and 1.5 times E₀ can evolve into a fundamental soliton during propagation. Consequently solitons behave somewhat like particles, and are robust even in the presence of significant perturbations. The pulses which form spontaneously in the figure-8 fiber laser have many of the properties of solitons.

Mode-locked fiber lasers, such as that shown in Figure 7, can often be described by a generalized nonlinear Schrödinger equation, in which extra terms are included in eqn [10] to account for the average effects of an excess linear gain or loss, nonlinear gain and loss (e.g., due to a saturable absorber), spectral filtering, temporal modulation, etc. Assuming the effects of the added elements can be regarded as distributed over the length of the cavity rather than lumped, such that the envelope of any pulse generated in the laser does not change shape significantly as it propagates around the cavity, then the master differential equation describing propagation of an optical signal within a fiber laser which is passively mode-locked by a fast saturable absorber can be written

  i ∂U/∂z + (D/2 − iβ) ∂²U/∂t² + (1 − iε)|U|²U = iδU − (ν − iμ)|U|⁴U   [11]

Equation [11] is known as the Ginzburg–Landau equation, which describes propagation of the normalized pulse envelope, U(z,t), as a function of delayed time, t, and normalized distance, z, under the combined influence of group velocity dispersion (D = ±1 for anomalous or normal group velocity dispersion, respectively), dispersion due to band-limiting by amplification in a homogeneously broadened gain medium (β > 0), nonlinear refractive index, nonlinear gain or loss (ε), linear gain (δ), and quintic terms, μ and ν, which if less than zero relate to saturation of the nonlinear gain and refractive index, respectively. Note that if the right-hand side of the equation is set to zero, and ε = β = 0, then the equation reduces to the normalized form of the nonlinear Schrödinger equation. Stable pulses generated within the laser are solutions of the above equation with ∂U/∂z = 0, and hence the effect of all the terms must balance. Consequently it is possible for stable pulses to be generated even with normal dispersion in the cavity, though the pulses are then generally strongly chirped, and have an envelope which is approximately Gaussian shaped rather than the hyperbolic secant shape expected with anomalous dispersion. Stable soliton-like pulses can be found by solving the Ginzburg–Landau equation for specific ranges of parameters. There are many numerical solutions, and some analytical solutions, but in general the dynamics of pulse formation and propagation in passively mode-locked fiber lasers can be quite complex, and for some sets of parameters remarkable and unexpected numerical solutions can be found. One such example which has been observed experimentally is the 'exploding' soliton, which unlike the usual soliton is periodically unstable, as shown in the plot of pulse evolution in Figure 8.

Other Fiber Lasers

The lasers discussed in the previous sections are indicative of the main types of fiber laser in terms of their output characteristics; however it should be
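As a numerical aside, the fundamental-soliton figures quoted earlier for standard fiber at 1.55 μm (β₂ ≈ −20 ps²/km, γ ≈ 10 W⁻¹km⁻¹, T₀ = 1 ps) can be verified directly from the definitions L_D = T₀²/|β₂|, P₀ = 1/(γL_D), and E₀ = 2|β₂|/(γT₀):

```python
# Fundamental soliton parameters for the worked example in the text (SI units).
beta2 = 20e-24 / 1e3        # |beta2| = 20 ps^2/km, converted to s^2/m
gamma = 10.0 / 1e3          # 10 W^-1 km^-1, converted to W^-1 m^-1
T0 = 1e-12                  # pulse width parameter, 1 ps

L_D = T0**2 / beta2             # dispersion length -> 50 m
P0 = 1.0 / (gamma * L_D)        # soliton peak power -> 2 W
E0 = 2 * beta2 / (gamma * T0)   # soliton energy -> 4 pJ
print(L_D, P0, E0)
```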
Further Reading

Agrawal GP (2001) Applications of Nonlinear Fiber Optics. San Diego, CA: Academic Press.
Akhmediev NN and Ankiewicz A (1997) Solitons. London: Chapman and Hall.
Becker PC, Olsson NA and Simpson JR (1999) Erbium-Doped Fiber Amplifiers: Fundamentals and Technology. San Diego, CA: Academic Press.
Desurvire E (1994) Erbium-Doped Fiber Amplifiers: Principles and Applications. New York: Wiley.
Digonnet MJF (ed.) (2001) Rare-Earth-Doped Fiber Lasers and Amplifiers, 2nd edn. New York: Marcel Dekker.
Duling IN (ed.) (1995) Compact Sources of Ultrashort Pulses. Cambridge, UK: Cambridge University Press.
France PW (ed.) (1991) Optical Fibre Lasers and Amplifiers. Glasgow, UK: Blackie.
Jeunhomme LB (1990) Single-Mode Fiber Optics: Principles and Applications, 2nd edn. New York: Dekker.
Kashyap R (1999) Fiber Bragg Gratings. San Diego, CA: Academic Press.
Siegman AE (1986) Lasers. Oxford, UK: Oxford University Press.
Weber MJ (ed.) (1995) CRC Handbook of Laser Science and Technology. Boca Raton, FL: CRC Press.
efficiencies, approaching unity in solution and as high as 60% in the semiconducting solid state, have been realized.

Organic Semiconductor Lasers

While organic dye lasers have been widely used since the mid-1960s, the history of organic semiconductor lasers is much more recent. Other than a few early observations of lasing in molecular crystals in the 1970s by Karl and others, the field of organic semiconductor lasers began in earnest following the related discovery in 1990 by Burroughes et al. of electroluminescence in a conjugated polymer. In 1992, Moses demonstrated lasing for the first time in a solution of the conjugated polymer MEH-PPV (Figure 1b). However, it was another four years – once material synthesis had been sufficiently refined – before stimulated emission and then true laser action were observed in polymers in the semiconducting solid state. In the following year the first demonstrations were made of dye-doped small molecular semiconductor lasers. Since that time, there has been a great growth in interest in the field, with an average of ~30 research papers per year. Further details of the development of the field of organic semiconductor lasers can be found in the Further Reading.

So what are the particular properties of conjugated polymers that have stimulated interest in developing organic semiconductor lasers? Firstly, these materials are broadband visible emitters. Through appropriate chemical design, their emission can be tuned throughout the visible spectrum, from 400 nm to 700 nm (Figure 2). The polymers exhibit large optical gain coefficients, because the optical transitions are dipole-allowed. Compared with small molecular dyes, polymers exhibit a relatively low degree of self-absorption of the emitted light, and much reduced quenching of the luminescence with increased concentration. This means that it is possible to work with much higher chromophore densities than conventional laser dyes, which is advantageous for achieving very low threshold lasing in extremely compact resonators. Furthermore, because these materials are semiconducting, they have the potential for direct electrical excitation. This could lead to semiconductor diode lasers operating throughout the visible spectrum and, in particular, in the blue and green spectral bands where currently there are few commercially viable inorganic semiconductor sources. Finally, these materials are plastics and so are amenable to simple processing, and can be shaped and structured into complex feedback resonators.

Organic Semiconductor Gain Materials

A great advantage of organic laser media is that a very wide range of materials can be readily synthesized, and through minor changes in the molecular structure one can relatively simply tune the emission properties of the material. Many different organic semiconductors have been applied as lasers, though they fall into three major categories. These are conjugated polymers, dye-doped small molecular semiconductors, and molecular single crystals. Conjugated polymers make up the most diverse materials set, most notably including derivatives of poly(paraphenylene-vinylene), poly(paraphenylene), and poly(fluorene) (Figure 1b–d). The second category comprises small molecular semiconductors that are co-evaporated under vacuum with a dopant laser dye. In this blend of materials, the organic semiconductor functions simply as an energy transfer host for the emitting dye molecules. In the third category

Figure 3 (a) Generic energy level structure and optical transitions of a conjugated polymer. S_N refers to the Nth singlet state; T_N refers to the Nth triplet state; ISC: intersystem crossing. (b) Absorption and photoluminescence spectra of the polymer MEH-PPV.
state S₀, so forming a classic 4-level laser system. The optical transition is dipole-allowed and so has a high oscillator strength. As a result, conjugated polymers typically exhibit stimulated emission cross-sections in the range 10⁻¹⁶ cm² to 10⁻¹⁵ cm², while the population inversion lifetime is as short as a few hundred picoseconds.

There are, however, several other transitions that can compete with the stimulated emission process. There can be excited-state absorption from the S₁ singlet state into higher-lying singlet states. Additionally, intersystem crossing from the S₁ state to the lowest excited triplet state T₁ can subsequently lead to excited triplet absorption. Other processes in the polymer can also impede the stimulated emission. Firstly, charged excitations can exhibit absorption bands that overlap with the emission. Secondly, at high excitation densities, exciton–exciton annihilation can significantly deplete the excited states available for stimulated emission. Ultimately, singlet–singlet annihilation is the limiting factor for the maximum excitation density possible, and while polymer films can have a repeat-unit density of ~10²¹ cm⁻³, the excited state density is typically limited to around 10¹⁸ cm⁻³. In many materials this is not restrictive on achieving optical gain, because stimulated emission can occur at densities of 10¹⁷ cm⁻³ and below.

An important factor that distinguishes the photophysics of semiconducting polymers from conventional laser dyes is an internal process of exciton migration between the absorption and emission of a photon. Excitations are confined to short straight sections of the polymer between kinks or twists in the chain. These straight sections, or 'sites', are typically a few repeat units in length, though there is a statistical spread of site lengths in any given material. When formed, excitons tend to hop to the longest, lowest-energy sites before emitting a photon. The exciton migration process has two effects. First, it means the emission takes place from a subset of the chain segments, resulting in a lower degree of inhomogeneous broadening of the emission spectrum (vibronic peaks are better resolved than in absorption; see Figure 3b). Second, the peaks of the absorption and emission spectra tend to be widely separated in wavelength, reducing the level of self-absorption at the lasing wavelength.

As mentioned above, blending materials is a particularly attractive approach towards further reducing self-absorption in organic semiconductor lasers. In such systems, a narrow bandgap dopant chromophore is blended at a concentration of 1–2% in a wider bandgap semiconducting host. Emission from the host is strongly quenched via nonradiative Förster energy transfer to the guest molecule. The blended material exhibits an absorption profile that is dominated by the host absorption, while the emission profile is almost completely that of the guest dopant. By strongly separating the emission peak from the absorption edge of the host, it is possible to achieve a substantially reduced (by perhaps an order of magnitude or more) residual absorption coefficient at the lasing wavelength. In doing so, the combined material acts much more like an ideal 4-level laser medium and can thereby have substantially lower pump thresholds for stimulated emission.

Measuring Gain

The dynamics of the excited state population in semiconducting polymers have been widely studied using femtosecond transient absorption experiments.
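The transient absorption analysis described next boils down to comparing probe absorbance with and without the pump. The sketch below uses synthetic stand-in spectra (a flat ground-state transmission plus a small assumed pump-induced transparency near 560 nm), purely to illustrate how the induced absorbance change ΔA is extracted; none of the numbers are measured values:

```python
import numpy as np

# Synthetic probe-transmission spectra (illustrative, not experimental data).
wavelength = np.linspace(400.0, 700.0, 301)     # probe wavelengths, nm
T_pump_off = np.full_like(wavelength, 0.80)     # probe transmission, pump off
# With the pump on, slightly more probe light is transmitted near the
# (assumed) emission band at 560 nm: stimulated emission, i.e. optical gain.
T_pump_on = T_pump_off * (1.0 + 0.02 * np.exp(-((wavelength - 560.0) / 30.0) ** 2))

# Pump-induced change in absorbance: dA = A_on - A_off = log10(T_off / T_on).
dA = np.log10(T_pump_off / T_pump_on)

# Negative dA marks net stimulated emission; its minimum locates the gain peak.
print(wavelength[np.argmin(dA)])   # 560.0
```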
488 LASERS / Organic Semiconductors and Polymers
This is a pump–probe technique in which the transmission of the broadband probe pulse is compared in the presence and absence of a pump pulse that excites molecules into the first excited state. Such measurements have the time resolution of the pump pulses used and so can typically be ~100 fs. By systematically delaying the arrival of the broadband probe pulse, it is possible to map out the evolution of the gain dynamics across the emission spectrum of the material with very high time resolution.

Figure 4 shows a series of snapshots of the absorption induced by a pumping pulse in a poly(thiophene) derivative. This figure shows the changing absorption of the sample between 200 fs and 400 ps after excitation. Within the absorption band of the material, one finds that the level of absorption is slightly reduced because the pump pulse has partly depleted the ground state. In the emission band there is also a reduction in the absorption that mimics in shape the photoluminescence spectrum of the polymer. This negative absorption – or optical gain – shows direct evidence for stimulated emission through much of the emission band. In the low energy tail of the emission, however, there is an increase in absorption that arises due to the excited-state absorption process mentioned earlier. Clearly, for a good laser material, the spectra of excited-state and charge-induced absorptions should overlap as little as possible with the photoluminescence.

Many other investigations of gain in semiconducting polymer films have used a phenomenon known as spectral line narrowing to explore indirectly the amplifying properties of the medium. The experimental procedure for such studies is as follows. A thin film of the semiconducting material, of typically 100 nm thickness, is excited by a pulsed pump laser focused to a stripe of dimensions ~100 μm wide by a few millimeters in length. A significant fraction of the light emitted by the material is trapped in the film by total internal reflection at the polymer–air and polymer–substrate interfaces, and is thereby waveguided along the length of the excitation stripe. This waveguided spontaneous emission can be amplified by stimulated emission before being emitted out of the edge of the film.

One finds that, above a particular threshold of pump energy, there is a superlinear increase in the output energy with pump intensity. Associated with this is a dramatic narrowing of the broad emission spectrum to a linewidth of typically less than 10 nm, as illustrated in Figure 5. These two features are signatures of gain narrowing via the process of amplified spontaneous emission. The line-narrowed feature usually appears at the first vibronic overtone of the optical transition, where the gain exceeds the losses in the material by the greatest amount. This particular wavelength experiences the greatest amplification and, above a particular pumping density, undergoes a runaway effect in which most of the light is stimulated to emit at that wavelength at the expense of the rest of the spectrum.

In the case of amplified spontaneous emission, or mirrorless lasing as it is also known, the spontaneous emission guided along the length of the stripe acts as
Figure 5 Normalized emission intensity versus wavelength (420–520 nm) at increasing pump power, illustrating spectral narrowing of the emission.
lasers, a second-order (i.e., m ¼ 2) Bragg grating has suppressed by the periodic structure. This leads to a
been favored to provide the feedback for the laser, so photonic stopband within which the emission is
that the first-order scattering provides a surface- reduced. At the edges of this stopband the micro-
emitted output coupling, perpendicular to the plane structure can support standing wave electric field
of the guide. This surface-emitting geometry is patterns of the same spatial period. At these
particularly attractive in polymer lasers because it is wavelengths there can be a strong interaction
relatively difficult to achieve a high optical quality between the electric field and excited states in the
edge to the waveguide. DFB lasers have the advantage gain medium, which leads to low-threshold band
over simple microcavity resonators in that they can edge lasing as observed in the figure.
combine long interaction lengths, limited by the Although DFB resonators have advantages over
scattering length of the Bragg grating, with excellent microcavity devices, the complicated lithographic
wavelength selection, which is controlled by the and etching steps involved in their fabrication can
grating period. detract from the key material advantage of simple
In addition to conventional 1D DFB gratings, a processing. To address this issue, several different
number of more exotic 2D and 3D DFB, and techniques for simply replicating the complicated
photonic crystal resonators have been explored in microstructures have been explored. These include
semiconducting polymers. Even randomly located UV embossing of flexible plastic substrates, and
scattering sites in a polymer film have been used to hot embossing and micromolding of the active
provide a weak closed-loop cavity feedback.
Examples of 2D gratings are given in Figure 7,
which shows atomic force micrographs of an eggbox
structure, formed from two orthogonal gratings, and
a circular DFB grating. Such 2D DFB structures
can significantly improve the operation of surface-
emitting polymer lasers compared with 1D gratings.
Improved feedback yields lower oscillation
thresholds, combined with higher slope efficiencies
and significantly improved beam quality. Indeed, such
surface emitting DFB lasers can produce nearly
diffraction limited laser beams.
Typical operating characteristics of an egg-box
DFB polymer laser are shown in Figure 8. This laser,
which is based on the material MEH-PPV, is
transversely pumped with a pulsed pump laser
focused onto a small region of the waveguide; up to
90% of the incident pump light is absorbed in the
100 nm thick film. The laser has a threshold pump
energy of 4 nJ and an external slope efficiency of
. 7%, and operates on a single frequency close to the
Bragg wavelength of the periodic waveguide. Lasing
does not occur exactly at the Bragg wavelength,
because at this wavelength the propagation of light is
polymer itself. Such techniques can very readily reproduce structures with feature sizes as small as 100 nm and could allow future mass production of photonic microstructures in these materials.

Towards Plastic Diode Lasers

A typical experimental configuration for characterizing optically pumped polymer lasers is shown in Figure 9. The laser is mounted in a vacuum chamber in order to isolate the polymer from oxygen and water. A pulsed visible or ultraviolet pump laser is attenuated and then focused onto the waveguide to transversely pump the polymer laser. The emission is collected using a CCD spectrograph for spectral measurements, and an energy meter and beam profilers for other characteristics. Typical pulsed pump lasers which have been used include nitrogen-laser-pumped dye lasers, and spectral harmonics of regeneratively amplified Ti³⁺:sapphire lasers or flashlamp-pumped Nd³⁺:YAG lasers. Recent advances have shown the thresholds to be low enough to be successfully pumped with a modest Nd³⁺ pulsed microchip laser.

So while optically pumped polymer lasers are now proving to be compact, functional light sources, a major motivation for research in this field remains the prospect of electrically driven plastic diode lasers. Such lasers could be very low-cost, flexible light sources that would access any wavelength throughout the visible spectrum. In particular, there are exciting prospects of producing blue and green diode lasers in spectral regions currently difficult to access with inorganic semiconductors. The lowest threshold densities reported for optically pumped organic semiconductor lasers are of the order of 100 W cm⁻². To achieve similar excitation densities electrically, one requires pump current densities of ~200 A cm⁻². This is substantially more than the few mA cm⁻² commonly used in polymer LEDs for display applications. Presently, such high current densities are infeasible with continuous excitation, but can be exceeded with short current pulses of a few tens of nanoseconds duration.

Electroluminescence in conjugated polymers was first achieved in 1990 by Burroughes and co-workers. Since that time there has been remarkable progress made in the science and technology of organic LEDs. Commercial products are now on the market, and the feasibility of both flexible and full-color light-emitting displays, in which the individual pixels have been ink-jet printed, has been demonstrated in the laboratory. Despite such rapid progress in the field of organic LEDs, there have been no successful demonstrations to date of electrically pumped lasing in any organic semiconductor. In this section we highlight some of the challenges that must be faced if electrically pumped lasing is to be achieved. To understand the problems associated with electrical driving of organic semiconductor lasers, it is useful first to review the generic structure of organic light-emitting diodes.

A generic organic LED structure is shown in Figure 10. The device consists of one or more organic layers sandwiched between two electrical contacts. Usually a transparent anode is used, commonly made from ITO, which allows light to be emitted from the front face of the LED. A low-work-function metal, such as calcium or aluminum, is used as the cathode. The organic part of the device may consist of several layers. These include one or more charge transport layers, plus a light-emitting layer in which the electrons and holes combine to form excitons before emitting light. Typically, the combined thickness of these organic layers is as little as 100 nm. Such thin devices are necessary to keep operating voltages low, to overcome potential barriers for injection, and because charge mobilities in organic semiconductors are very low (10⁻⁵ to 10⁻² cm² V⁻¹ s⁻¹).

The inclusion of electrical contacts in organic semiconductor laser structures can be problematic, since they need to be situated very close to the emitting region. Metallic contacts can quench the luminescence and introduce losses to the waveguide mode, thereby increasing the oscillation thresholds. Furthermore, the ITO anode has a high refractive index and so can reduce the spatial overlap of the

Figure 9 Schematic of a typical experimental configuration for characterizing polymer lasers.
Figure 10 Schematic structure of a generic organic light-emitting diode.
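The ~200 A cm⁻² figure quoted above can be checked with a back-of-the-envelope conversion from optical pump power density to an equivalent injection current density. The pump photon energy below is an assumed illustrative value, and the 25% singlet fraction reflects the spin statistics discussed in this article (up to three-quarters of electrically generated excitons are triplets); this is a rough sketch, not the authors' calculation.

```python
# Order-of-magnitude conversion of the ~100 W/cm^2 optical threshold into an
# equivalent injection current density. Assumes a visible pump photon energy
# of ~2.4 eV (illustrative) and that only ~1 in 4 injected carrier pairs
# forms an emissive singlet exciton.

E_PHOTON_EV = 2.4        # assumed pump photon energy (eV)
SINGLET_FRACTION = 0.25  # spin statistics: ~25% singlets, ~75% triplets

def equivalent_current_density(power_density_w_cm2: float) -> float:
    """A/cm^2 needed to create the same singlet generation rate electrically."""
    # One photon of E eV per elementary charge gives J = P / E_ph[V] in A/cm^2.
    j_photon_equivalent = power_density_w_cm2 / E_PHOTON_EV
    return j_photon_equivalent / SINGLET_FRACTION

j = equivalent_current_density(100.0)
print(f"~{j:.0f} A/cm^2")  # same order as the ~200 A/cm^2 quoted in the text
```

The result (~170 A cm⁻² with these assumptions) shows why display-type drive levels of a few mA cm⁻² fall orders of magnitude short of lasing requirements.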
492 LASERS / Organic Semiconductors and Polymers
guided mode with the active medium. Such issues indicate that the design of organic LEDs needs to be somewhat modified for achieving electrically pumped lasing. In particular, there needs to be a trade-off between the optimum electronic and photonic designs. To address these issues, various device structures have been proposed that distance the optical mode from metallic contacts, while reducing the thickness of the ITO layer to optimize the mode confinement in the active material.

A more demanding obstacle to electrically pumped lasing is the additional excited-state absorption associated with the electrical pumping scheme. As many as three-quarters of the generated excitons in an organic LED are formed in the triplet state, which is nonemissive and can lead to excited-state absorption, albeit principally at lower energies than the lasing wavelength. Additionally, injected charges that have not yet formed excitons exhibit strong absorption, which overlaps significantly with the peak of the material gain. Indeed, this charge-induced absorption has so far prevented the demonstration of injection lasing in conjugated polymers, even in the optimized device structures mentioned above. A possible alternative route for efficient pumping of polymer lasers could be through indirect electrical pumping. For example, an inorganic diode laser or LED, or even an organic LED, could be used as an electrically efficient optical pump for the semiconducting polymer laser. Recent advances in GaN-based LEDs and lasers provide interesting prospects for such alternative systems.

Research on semiconducting polymer lasers continues, both to reduce the excitation densities needed for threshold and to investigate new material systems that could overcome the obstacle of charge-induced absorption. By building on recent developments in resonator geometries, material blends, and compact pumping schemes, semiconducting polymer laser systems, whether optically or electrically pumped, should soon become practical, visible light sources for a range of applications.

List of Units and Nomenclature

Charge mobility [cm² V⁻¹ s⁻¹]
Current density [A cm⁻²]
Energy density [mJ cm⁻²]
Excited state densities [cm⁻³]
Gain coefficient α (exp(αz)) [cm⁻¹]
Intensity/power density [kW cm⁻²]
Optical gain [dB cm⁻¹]
Stimulated emission cross-section [cm²]
Waveguide propagation loss coefficient γ (exp(−γz)) [cm⁻¹]
Wavelength [nm]

See also

Diffractive Systems: Diffractive Laser Resonators. Lasers: Dye Lasers; Planar Waveguide Lasers. Optical Amplifiers: Basic Concepts. Optical Materials: Plastics. Photonic Crystals: Photonic Crystal Lasers, Cavities and Waveguides.

Further Reading

Burroughes JH, Bradley DDC, Brown AR, et al. (1990) Light-emitting diodes based on conjugated polymers. Nature 347: 539–541.
Friend RH, Gymer RW, Holmes AB, et al. (1999) Electroluminescence in conjugated polymers. Nature 397: 121–128.
Karl N (1972) Laser emission from an organic molecular crystal. Physica Status Solidi (a) 13: 651–655.
Kranzelbinder G and Leising G (2000) Organic solid-state lasers. Reports on Progress in Physics 63: 729–762.
McGehee MD and Heeger AJ (2000) Semiconducting (conjugated) polymers as materials for solid-state lasers. Advanced Materials 12: 1655–1668.
Miyata S and Nalwa HS (eds) (1997) Organic Electroluminescent Materials and Devices. Amsterdam: OPA.
Moses D (1992) High quantum efficiency luminescence from a conducting polymer in solution – a novel polymer laser-dye. Applied Physics Letters 60: 3215–3216.
Pope M and Swenberg CE (1999) Electronic Processes in Organic Crystals and Polymers. New York: Oxford University Press.
Samuel IDW (2000) Polymer electronics. Philosophical Transactions of the Royal Society of London A 358: 193–210.
Scherf U, Riechel S, Lemmer U and Mahrt RF (2001) Conjugated polymers: lasing and stimulated emission. Current Opinion in Solid State and Materials Science 5: 143–154.
Tessler N (1999) Lasers based on semiconducting organic materials. Advanced Materials 11: 363–370.
Tessler N, Denton GJ and Friend RH (1996) Lasing from conjugated-polymer microcavities. Nature 382: 695–697.
LASERS / Planar Waveguide Lasers 493
Figure 1 Range of different waveguide designs, with the core represented by the darker regions: (a) optical fiber; (b) slab waveguide; (c) channel waveguide; (d) embedded channel waveguides with a layer of the overcladding; (e) rib waveguide; (f) inverted rib; (g) ion diffused; and (h) strip loaded, comprising a strip of a different, higher-index material (that serves to confine light) on a core layer.
with each other spatially and temporally). This has come to represent the signature output of a laser.

Planar Waveguide Lasers (PWL)

By way of an example, let us consider a solid-state, rare-earth-doped laser, such as an Nd-doped YAG laser. All rare-earth ions are able to fluoresce, some (e.g., Nd) more efficiently than others, and hence can be light emitters. These ions have to be dissolved (doped) into a solid host, say, a glass or a single crystal such as YAG (yttrium aluminum garnet). The excitation of these ions is done optically with light of a suitably higher energy (or lower wavelength). When this ‘pump’ light is made incident on a rod of the rare-earth-doped material, it is absorbed by the rare-earth ions and used to reach an excited electronic state. Amongst all the possible higher-energy states, only certain ones are sustainable, because the electron resides in such a state for an acceptably long period of time (called the lifetime). Hence, population inversion can be more readily achieved in these longer-lifetime states, and stimulated processes such as amplification and lasing are readily possible. The laser light generated in the doped rod is emitted across its entire cross-section.

Planar waveguide lasers are designed to miniaturize the kinds of stand-alone solid-state lasers described above. In principle, a PWL is a laser in which the light-emitting resonant cavity is a planar waveguide. This topic is a relatively recent development that offers some unique advantages. These can be broadly classified into two categories: one relating to making efficient high-power lasers, and the other to the integration of lasers with other optical components.

An example of the first type is an Nd-YAG laser configured into a planar geometry so that the laser light is contained within the waveguide. One design that has evolved to achieve this well is shown in Figure 3. The light is guided by the waveguiding structure set up by the sapphire planes. Since both the excitation (or pump) and the lasing modes are well confined, high modal overlap is achieved. These PWLs can be edge-pumped or face-pumped, and high powers (of the order of watts) can be achieved.

In the above design, the cavity is defined by two parallel reflectors bracketing the light-generating medium, and is known as a Fabry–Perot etalon. The distance between the two mirrors sets up a wavelength-filtering mechanism whereby only resonant wavelengths are sustained. These are related to the intra-mirror distance by the simple equation shown below:

Lc = aλ/(2n)   [3]

where Lc is the length of the cavity, a is an integer, n is the refractive index of the medium, and λ is the wavelength in vacuum. As the cavity length increases, the number of possible resonant wavelengths increases and the spacing between wavelengths is reduced. These are called the longitudinal modes (Figure 4a). Likewise, lateral modes (Figure 4b) can be generated in the cavity as well, tending to make the operation of the laser multimoded, as shown in Figure 5. This is characteristic of a Fabry–Perot laser, and can be overcome by embellishing the basic etalon design.

As stated above, another big advantage is that the platform for fabricating PWLs can be chosen to be the same as that used to make PLCs. The waveguide comprising the laser can then be continued further and configured into other desirable components, such as splitters, couplers, etc. Thus the laser does not have to be separately coupled to the waveguide. This is a major advantage, as this connecting process is time-consuming and incurs loss of light. In the field of photonics, the ability to integrate active components with passive ones on the same chip is considered to be the first step toward photonic chips, analogous to the integrated-circuit chip that has revolutionized our world. This is the driver behind continued research and development of active elements, such as PWLs, that can be coupled with other actives, such as thin-film electro-optic modulators or acousto-optic filters.
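Equation [3] can be turned around to list the wavelengths a given cavity supports (λ = 2nLc/a for integer a); near a wavelength λ, the longitudinal-mode spacing is approximately λ²/(2nLc). A short sketch, with all numerical values assumed purely for illustration:

```python
# Resonant wavelengths of a Fabry-Perot cavity, from eq. [3]: Lc = a*lambda/(2n),
# i.e. lambda_a = 2*n*Lc/a for integer a. Illustrative values: a 1 cm cavity of
# refractive index 1.5, examined near 1064 nm.

def resonant_wavelengths_nm(cavity_len_nm: float, n: float,
                            center_nm: float, count: int = 3):
    """Return `count` resonant wavelengths closest to center_nm."""
    a_center = round(2.0 * n * cavity_len_nm / center_nm)  # nearest mode integer
    return [2.0 * n * cavity_len_nm / a
            for a in range(a_center - count // 2, a_center + count // 2 + 1)]

Lc_nm = 1e7  # a 1 cm cavity, expressed in nm
modes = resonant_wavelengths_nm(Lc_nm, n=1.5, center_nm=1064.0)
fsr = 1064.0 ** 2 / (2 * 1.5 * Lc_nm)  # approximate mode spacing (nm)
print([f"{m:.4f}" for m in modes], f"spacing ~ {fsr * 1000:.1f} pm")
```

The sub-nanometer mode spacing of a centimeter-long cavity illustrates why many longitudinal modes can fall within the gain spectrum (Figure 5), making the basic etalon multimoded.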
Figure 4 Generation of longitudinal (a) and lateral (b) modes in a Fabry–Perot laser.
Figure 5 The output of a Fabry–Perot laser: (b) is governed by the number of modes within the gain spectrum of the lasing medium, shown by the dotted lines in (a). The spacing between the modes can be altered by changing the length of the cavity. The smallest mode in the output spectrum (b) needs to have enough gain to overcome the lasing threshold.
useful gain of the system. Ensuring high spatial dispersion of the Er atoms is critical to minimize this effect. Special glass compositions, such as phosphate glasses, are used here to ensure that the Er is not phase-separated or clustered. These glasses also provide a high induced cross-section for gain. Glasses with low phonon energy are also desirable, to minimize phonon-mediated nonradiative decays in the rare earths. Light-emitting semiconductors and polymers containing fluorescent dyes have also been used to make PWLs.

Now that we have gained an overall perspective of PWLs, we can discuss different manifestations. The following discussion uses an Er-doped system as a model, since there is a substantial body of work on this system.

Distributed Bragg Reflector (DBR) PWL

The gratings can be formed by exploiting the photosensitivity of glasses, using 193 nm or 244 nm UV light to periodically alter the refractive index of a glass waveguide. Photosensitivity occurs by a combination of molecular changes (such as the formation of oxygen-deficiency sites or color centers) as well as changes in the material strain in the region exposed to the UV light (Figure 7c). Alternatively, gratings can also be formed by etching periodic grooves in the waveguide. These are called corrugated or surface-relief gratings (Figure 7a and b).

The back Grating 1 has a high reflectance of almost 100%, whereas Grating 2 is a weaker grating. The reflectance strength of Grating 2 is determined by taking into account the gain coefficient of the material and the output power required from the laser. The reflection intensity can be altered according to the following equation:

R = tanh²(πLG Δneff η(V)/λB)   [5]
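Equation [5] has the standard coupled-mode-theory form R = tanh²(κL). Reading its symbols in that standard way — LG as the grating length, Δneff as the effective-index modulation, η(V) as the overlap of the guided mode with the grating, and λB as the Bragg wavelength (identifications assumed here, since the defining text falls outside this excerpt) — it can be evaluated as follows, with purely illustrative numbers:

```python
import math

# Peak reflectance of a uniform Bragg grating, eq. [5]:
#   R = tanh^2(pi * L_G * dn_eff * eta / lambda_B)
# Symbol meanings follow standard coupled-mode theory (assumed here): L_G is
# the grating length, dn_eff the effective-index modulation, eta the overlap
# of the guided mode with the grating, and lambda_B the Bragg wavelength.

def grating_reflectance(L_g_m: float, dn_eff: float, eta: float,
                        lambda_b_m: float) -> float:
    kappa_L = math.pi * L_g_m * dn_eff * eta / lambda_b_m
    return math.tanh(kappa_L) ** 2

# Illustrative numbers: a 5 mm grating, index modulation 2e-4, 80% overlap,
# Bragg wavelength 1535 nm (in the Er gain band):
R = grating_reflectance(5e-3, 2e-4, 0.8, 1535e-9)
print(f"R ~ {R:.2f}")
```

With these assumed values the grating reflects most, but not all, of the light — the kind of partial reflectance wanted for the weaker output Grating 2, and tunable through the grating length or UV-induced index modulation.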
Figure 7 DBR planar waveguide laser. The PWLs shown in (a) and (b) have sinusoidal and trapezoidal surface-relief gratings
respectively etched in their cores. An in-line grating of the kind seen in (c) could be fabricated using the photosensitivity of the glass.
The top cladding is shown only in (a).
Figure 8 The reflection spectrum of the back (1) and the front (2) gratings. As the intensity of the reflected light is made to increase, so does the linewidth. Hence, the narrower spectrum of the partially reflecting Grating 2 tends to dictate the operating wavelength range of a DBR.
Figure 10 Shifting the front and back gratings is one means of controlling the width of the overlap region, shown in gray. In this way, single-mode operation can be achieved even for a longer cavity with many modes.
The gratings are often written holographically, where the wafer is exposed to an interference pattern caused by two coherent light beams. The wafer underneath can be sensitized with a photosensitive glass layer or a film of photoresist. It is not efficient to do multiple holographic exposures to individually change each of the gratings in the array. One way of doing this with a single holographic exposure is by constructing the waveguides at different relative angles. Then the component of the index modulation, resolved along the axis of the waveguide, would vary for each waveguide depending on the relative waveguide angle (Figure 15); this would lead to differences in n. Alternatively, the waveguide array can be made of the same height but of different widths. This is easily done, since the waveguide height is usually deposited uniformly but the width is a function of the mask layout. Here the waveguide and the grating are orthogonal to each other, and the variation in wavelengths is achieved by the fact that the effective propagation constant for each of the waveguides is different.

Arrays of PWLs have been made on a silicon substrate using deposition of an Er-containing glass that is subsequently patterned into waveguide, and ultimately laser, arrays. Another approach has been to start with a phosphate glass as the host and the substrate. A phosphate glass is a superior host for rare-earth ions such as Er, since the sensitization efficiency is nearly unity and much higher doping levels are possible before concentration-quenching effects set in. Waveguides are created by ion exchange, by immersion in a hot bath containing suitable ions (Na for Li, for instance). When starting with a uniformly doped glass, one issue in the assimilation of the laser array with other components is the fact that the Er-doped region does not start and end with the waveguide laser section. Recently, the Er has been deposited selectively in the laser waveguide in phosphate glass using methods such as ion implantation, bringing the concept of integration closer to reality.

Pumping

One of the key issues here is the optical pumping of the rare-earth PWLs to generate acceptably high output power. Individual diode lasers, or an array of diode lasers all on a single chip (called a diode bar), have been utilized as pumps. Both single-mode and multimode diode lasers are available and have been used. There are pros and cons for either choice: whereas single-mode lasers are less powerful (in the range of a few hundred mW), it is difficult to convert the multimoded output of their counterparts to useful power. The choice is determined by the waveguide design, which needs to be optimized for high overlap of the lasing mode with the pump. Carefully tapered waveguides can be used to distribute the pump power from a diode bar to the various lasers in an array without significant loss.
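The angled-waveguide scheme described above can be quantified: a waveguide tilted by θ with respect to a single set of grating fringes of period Λ sees an effective period Λ/cos θ along its axis, so each array element lases at a slightly different Bragg wavelength λ = 2·neff·Λ/cos θ. A sketch, in which the period, index, and angles are assumed purely for illustration:

```python
import math

# Effective Bragg wavelength for a waveguide tilted by theta relative to a
# single set of holographically written grating fringes of period Lambda:
#   Lambda_eff = Lambda / cos(theta)  ->  lambda_B = 2 * n_eff * Lambda_eff
# Standard grating geometry; the specific numbers below are illustrative.

def bragg_nm(period_nm: float, n_eff: float, theta_deg: float) -> float:
    lam_eff = period_nm / math.cos(math.radians(theta_deg))
    return 2.0 * n_eff * lam_eff

# A 530 nm fringe period and n_eff = 1.46 give a laser near 1548 nm at
# theta = 0; tilting successive waveguides by a few degrees steps the
# wavelength across the array:
for theta in (0.0, 2.0, 4.0, 6.0):
    print(f"theta = {theta:3.1f} deg -> lambda = {bragg_nm(530.0, 1.46, theta):.2f} nm")
```

A few degrees of tilt shifts the wavelength by a few nanometers, which is the right scale for spacing the channels of a laser array from one exposure.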
PWL Stability

Stable operation of a PWL is one of the big challenges. In DBR PWLs, it is seen that there is a pump-power threshold only below which single-moded operation can be achieved. Additionally, any PWL die needs to be made immune to ambient temperature changes. These changes can affect many of the critical output parameters of the PWL: wavelength, linewidth, power, and, in the case of an array, the wavelength separation. This occurs primarily due to the thermal shift in the grating pitch, as well as the thermo-optic effect. It is overcome by mounting the laser die onto a heat-spreader chip made from a conducting material such as Cu or BeO, followed by a thermoelectric cooler. Ironically, the laser's output wavelength and linewidth are fine-tuned and ‘fixed’ using temperature to overcome any drifts due to processing errors or other reasons.

Reliability is another important issue that needs to be addressed, especially for telecommunication applications, where these types of components have to pass high quality standards such as the Telcordia qualification in the USA.

Figure 15 Two different means of making a planar waveguide laser array with a single holographic exposure to write the gratings. The darker rectangular blocks represent the core channels or ribs. In (a) the width, and hence the effective index, of the waveguide is varied, whereas the relative angle between the waveguides is altered in (b). The lines represent the direction of the gratings, which is orthogonal only to the leftmost waveguide in (b).
Semiconductor Lasers
S W Koch, Philipps-University, Marburg, Germany
M R Hofmann, Ruhr-Universität Bochum, Bochum, Germany
© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

This article discusses the principle of operation and the basic material-engineering aspects of the design of semiconductor lasers. The physics of the gain medium is analyzed in detail, and the role of many-body effects, due to the Coulomb interaction of the carriers, in the optical properties of the active semiconductor material is demonstrated. The two main types of laser resonators for diode lasers, Fabry–Perot resonators and vertical-cavity structures, are described. Finally, the dynamics of semiconductor lasers, including the principles of short-pulse generation and nonequilibrium gain dynamics, is addressed.

Basic Principles

The optical emission from semiconductor lasers arises from the radiative recombination of charge-carrier pairs, i.e., electrons and holes, in the active area of the device. The conduction-band electron fills the valence-band hole, simultaneously transferring the energy difference to the light field. This is called radiative recombination. Hence, the frequency of the laser light is mainly determined by the bandgap of the semiconductor laser material. In order to achieve lasing, one needs carrier inversion, i.e., a sufficient number of electrons in the conduction band so that the probability of light absorption is lower than the probability of emission.

It has been shown that a large variety of semiconductor materials is suited for semiconductor lasers. The most stringent requirement for the material is that it has a direct bandgap. In other words, the maximum of the valence band and the minimum of the conduction band have to be at the same position in k-space (in the center of the Brillouin zone). The elemental semiconductors silicon and germanium do not fulfill this condition and therefore cannot be used for semiconductor lasers.

In order to achieve carrier inversion, it is necessary to pump the semiconductor laser, i.e., to excite a sufficient number of electrons from the valence band into the conduction band. Even though this pumping in semiconductors, as in other solid-state laser materials, can be done optically, the large technological impact of semiconductor lasers is at least in part due to the possibility of electrical pumping with a few tens of milliamperes at low voltages.

The original version of such a semiconductor laser device is shown in Figure 1. The laser structure consists of a GaAs-based p-n junction which is biased in the forward direction. Electrons are injected from the n region and holes from the p region, respectively. Due to the diode characteristics of the p-n junction, semiconductor lasers are often also called laser diodes. In the device shown in Figure 1, the laser mirrors are formed by the cleaved facets of the semiconductor crystal. Because of the high refractive index of GaAs, the semiconductor–air interface provides a reflectivity of about 32%, which is sufficient for laser emission.

However, this early type of laser, the so-called homojunction device, was not very efficient, because of the low carrier density in the active area and because of the weak overlap between the inverted region and the optical mode. Considerable improvement towards room-temperature continuous-wave (cw) operation was achieved by the introduction of the so-called double heterostructure. Its principle is shown in Figure 2.

The basic idea behind the double-heterostructure laser is that the active material (e.g., GaAs) is embedded in another semiconductor with a slightly larger bandgap (e.g., (AlGa)As). This causes a potential well in which electrons and holes are confined, in order to achieve high carrier densities in the active area. Moreover, the smaller refractive index of the surrounding high-bandgap material helps to guide the optical field within the region of highest inversion. Additional ridge-shaped lateral structuring is frequently used to form a complete optical waveguide.

Figure 1 Principle of operation of a semiconductor laser.
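The ~32% facet reflectivity quoted above follows from the normal-incidence Fresnel formula for the semiconductor–air interface; a minimal sketch, assuming a GaAs refractive index of about 3.6 near the lasing wavelength:

```python
# Normal-incidence Fresnel reflectance of a cleaved facet:
#   R = ((n1 - n2) / (n1 + n2))^2
# With an assumed GaAs index of ~3.6 against air (n = 1), the facet
# reflects roughly a third of the incident light.

def fresnel_reflectance(n1: float, n2: float = 1.0) -> float:
    return ((n1 - n2) / (n1 + n2)) ** 2

R = fresnel_reflectance(3.6)
print(f"R ~ {R:.2%}")  # close to the ~32% quoted for GaAs facets
```

This is why cleaved facets alone, with no deposited mirror coatings, already provide enough feedback for laser emission in GaAs devices.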
the emission by fabricating semiconductor alloys as, e.g., (GaIn)As or (AlGa)As. Presently, a large variety of different materials is commercially available covering, for example, the red ((AlGaIn)P), the near-infrared ((AlGa)As, (GaIn)As), the telecom ((GaIn)(AsP)), and even the blue–ultraviolet range ((GaIn)N).

Great care has to be taken that the crystal used for a semiconductor laser is of almost perfect quality. Dislocations in the crystal, for example, have to be avoided, as well as any kinds of defects, since they act as nonradiative recombination centers. Nonradiative recombination competes with the radiative recombination that is essential for the laser process. Nonradiative recombination effects include Auger recombination and recombination via defects. Auger recombination is the process where an electron and a hole recombine and completely transfer the energy difference to another carrier (electron or hole), which is excited to a higher state. This is an intrinsic effect that depends mainly on the band structure and the relevant values of the transition probabilities (transition matrix elements). The probability for Auger recombination is temperature- and carrier-density-dependent and can be minimized, e.g., by properly cooling the device and by keeping the carrier densities as low as possible.

Nonradiative recombination via defects is an extrinsic effect that depends on the material quality and can thus be minimized by optimized growth. However, trying to avoid dislocations limits the choice of materials for laser structures severely. The growth processes require ‘host’ crystals, so-called substrates, on which the real structure can be grown. GaAs, InP, SiC, and sapphire wafers are available in sufficient quality as standard substrates for optoelectronic applications. The epitaxially grown material adapts to the crystal structure of the substrate, which implies that the lattice constants of the substrate and structure material should be very similar. If the lattice constants do not agree, i.e., if there is a structural mismatch, a certain amount of strain develops, which increases with increasing lattice mismatch. Too much strain leads to the formation of dislocations, which are detrimental to laser applications. Control of the dislocation density is currently one of the critical issues for the wide-gap III–V materials, such as GaN, because no substrate is available that matches the GaN lattice constant. However, a small amount of strain is often even desirable as long as it does not cause dislocations. Strain influences the band structure of the material in a predictable way and can therefore be used to tailor the laser emission properties.

Thus laser materials for a certain desired wavelength range have to fulfill two important requirements: first, the bandgap of the material has to fit the desired photon energy; and second, and more crucial, an appropriate substrate with the right lattice constant has to be available. This second condition excludes many material compositions from laser applications. In many cases, both conditions can only be simultaneously met with quaternary semiconductor alloys (for example (GaIn)(AsP) on InP substrates).

Physics of the Gain Medium

From a material physics point of view, the important structural aspects of a certain laser material are summarized in the band structure of the gain medium, i.e., in the quantum mechanically allowed energy states of the material electrons. The calculation of the band structure of a particular gain medium is therefore a crucial aspect of semiconductor laser modelling. Additionally, however, one has to understand the relevant properties of the excited electron–hole plasma in the active region of the laser.

In its ground state, i.e., without excitation and at very low temperatures, a perfect semiconductor is an insulator where no electrons are in the conduction-band states. The presence of a population inversion, i.e., occupied conduction-band and empty valence-band states in a certain range of frequencies, is, however, a requirement for obtaining optical gain and light amplification from a semiconductor. Sufficiently strong pumping leads to the generation of an electron–hole plasma, where the term ‘electron’ specifically refers to conduction-band electrons and ‘holes’ to the missing electrons in the originally full valence bands. Holes are also quasiparticles, just as the electrons. Electrons in a solid are quasiparticles with an effective mass that is considerably different from the free-electron mass in vacuum and that is determined by the band structure, i.e., the interaction with all the ions of the solid. The properties of the missing electrons, such as a positive charge for the missing negative electron charge, a spin opposite to that of the missing electron, etc., are attributed to the holes. As a consequence of their electrical charges, the excited electrons and holes interact via the Coulomb potential, which is repulsive for equal charges (electron–electron, hole–hole) and attractive for opposite charges (electron–hole). Furthermore, electrons and holes obey Fermi–Dirac statistics and the Pauli exclusion principle that forbids two fermions from occupying the same quantum state. The consequences of the Pauli principle are often referred to as ‘band-filling effects’ in the physics of semiconductor lasers.
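The Fermi–Dirac statistics invoked above determine how the bands fill: states well below the quasi-chemical potential are nearly fully occupied, states well above are nearly empty, with a transition region a few kT wide. A minimal sketch (the chemical potential and energies below are illustrative, not taken from the article):

```python
import math

K_B_EV = 8.617e-5  # Boltzmann constant in eV/K

def fermi_dirac(energy_ev: float, mu_ev: float, temp_k: float) -> float:
    """Occupation probability of a state at energy_ev for chemical potential mu_ev."""
    return 1.0 / (1.0 + math.exp((energy_ev - mu_ev) / (K_B_EV * temp_k)))

# Band filling at room temperature: the occupation drops from ~1 to ~0 over
# a region of a few kT (~25 meV per kT at 300 K) around the quasi-chemical
# potential. mu is an assumed value, measured from the band edge.
mu = 0.05  # assumed electron quasi-chemical potential (eV)
for e in (0.0, 0.05, 0.10, 0.20):
    print(f"E = {e:.2f} eV -> f = {fermi_dirac(e, mu, 300.0):.3f}")
```

This simple occupation function is the "band-filling" ingredient of the free-carrier theory described next; the many-body Coulomb effects discussed below modify, but do not replace, it.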
The theory therefore has to account not only for the band structure but also for this band filling, i.e., the detailed population of the electron and hole states. Taking into account only the band structure and band filling leads to a so-called free-carrier theory. But the electron–hole plasma in a semiconductor gain medium is an interacting many-body system, because of the carrier interactions and the exclusion principle. In a many-body system, the properties of any one carrier are influenced by all the other carriers in their respective quantum states. Generally, advanced many-body techniques are needed to systematically analyze such an interacting electron–hole plasma and to compute the resulting optical properties, such as gain, absorption, or refractive index, all of which change in a characteristic way with the changing electron–hole density.

It is often helpful to investigate the characteristic time-scales and the most direct consequences of the various interaction effects, even though the consequences of the different many-body effects are in general not additive. The fastest interaction, under typical laser conditions, is the carrier–carrier Coulomb scattering, which acts on scales of femto- to picoseconds to establish Fermi–Dirac distributions of the carriers within their bands. These distributions are often referred to as ‘quasi-equilibrium distributions’, since they describe electrons and holes in the conduction and valence bands which may be characterized by an ‘electronic temperature’ or ‘plasma temperature.’ The plasma temperature is a measure of the mean kinetic energy of the respective carrier ensemble and relaxes towards the lattice temperature as a consequence of electron–phonon (= quantized lattice vibration) coupling, i.e., the carriers can emit or absorb phonons. The coupling to longitudinal optical phonons in polar materials is especially strong, with scattering times in the picosecond range. The corresponding times for the electron–acoustic-phonon scattering are in the nanosecond range. Generally, the carrier–phonon interaction provides an exchange of kinetic energy of the carriers with the crystal lattice, in addition to the relaxation of the initial nonequilibrium distribution. Deviations between plasma and lattice temperature may occur, especially when lasers are subject to strong pumping or rapid dynamic changes.

The Coulomb interaction not only causes rapid carrier scattering but also directly influences the laser gain spectrum. Gain, i.e., negative absorption, occurs in a spectral range limited from below by the effective (or renormalized) semiconductor bandgap and from above by the electron–hole chemical potential. Here, the chemical potential is the energy at which neither absorption nor amplification occurs, i.e., where the semiconductor medium is effectively transparent.

The effective bandgap energy for elevated carrier densities is significantly below the fundamental absorption edge of an unexcited semiconductor. This ‘bandgap renormalization’ is a consequence of the fact that the Pauli exclusion principle reduces the repulsive Coulomb interaction between carriers of equal charge. Furthermore, the Coulomb screening, i.e., the density-dependent weakening of the effective Coulomb interaction potential, contributes to the reduction of the bandgap.

Another many-body effect originates in the Coulomb attraction between an electron and a hole, and leads to an excitonic or Coulomb enhancement of the interband transition probability. All these effects contribute to the detailed characteristics of the optical spectra of a semiconductor gain medium. Figure 3 shows an example of calculated spectra in comparison to experimental data.

The example demonstrates that the microscopic theory correctly accounts for the changes in the gain spectrum with changing carrier density by means of a consistent treatment of band structure, band filling, and many-body Coulomb effects.

The Resonator

In addition to a gain medium, a laser needs a positive feedback process for the amplified stimulated emission. Hence, the inverted medium has to be
in the spectral region of population inversion, in other
words, where the net probability for a test photon to
be absorbed is lower than the probability to be
amplified by stimulated electron – hole recombina- Figure 3 Comparison of experimental and theoretical gain
tion. Energetically, the gain region is bounded from spectra of a (GaIn)(Nas)/GaAs laser.
506 LASERS / Semiconductor Lasers
embedded into a resonator. One of the conceptually simplest versions of such a resonator is a so-called Fabry–Perot resonator, which consists of two parallel plane mirrors. A double heterostructure Fabry–Perot laser is shown in Figure 2. Such Fabry–Perot resonators are very easy to fabricate for semiconductor lasers: the semiconductor crystal is cleaved perpendicular to the optical waveguide. The cleaved edges provide a reflectivity of about 32% due to the high refractive index of the semiconductor. This is sufficient to provide enough feedback for laser operation.

However, despite its conceptual simplicity, the cleaved-edge Fabry–Perot concept also introduces a number of problems. First, the light output occurs on two sides of the laser, whereas for most applications all the output power should be emitted out of one facet only. For practical applications, this problem can be solved by depositing coatings onto the cleaved facets of the structure. For example, deposition of a 10% reflectivity coating on one facet and of a 90% reflectivity coating on the other facet concentrates the laser emission almost completely to the facet of lower reflectivity. Antireflection coatings with reflectivities below 10^-4 are used when laser diodes are coupled to external cavities, e.g., in tunable lasers.

A second problem with Fabry–Perot lasers is that the emission wavelength is often not well defined. A Fabry–Perot resonator of length L (optical length nL) has transmission maxima – so-called modes – at all multiples of c/2nL, where c is the speed of light and n the refractive index. That means that for typical devices with a length of 500 μm, hundreds of modes are present even within the limited gain bandwidth of a semiconductor. These modes often compete with each other, resulting in multimode emission which may become particularly complex and uncontrollable when the laser is modulated for high-speed operation.

This problem can be overcome in so-called distributed feedback (DFB) lasers. In these devices, the refractive index is periodically modulated along the waveguide. In a simplified picture, this periodic pattern causes multiple reflections at different locations within the waveguide. For certain wavelengths, these multiple reflections interfere constructively to provide enough 'distributed' feedback for laser operation. With the DFB concept, single-mode emission can be realized even for high-speed modulation of the lasers. Alternatively, the region of distributed feedback can be separated from the region of amplification in so-called distributed Bragg reflector (DBR) lasers. These are devices with multiple sections where one section is responsible for the optical gain and another section (the DBR section) is responsible for the optical feedback.

So far, we have discussed so-called edge-emitting lasers, where the light emission is in the plane of the active area of the device. The lengths of these lasers are typically hundreds of microns and thus correspond to a few hundreds of optical wavelengths. This causes one of the major disadvantages of edge emitters – the multimode emission. Another disadvantage is the poor elliptic beam profile due to diffraction at the small rectangular output aperture. Finally, the fabrication procedure of edge emitters is complex since it requires multiple steps of cleaving before it is even possible to test the laser.

A novel alternative semiconductor laser concept was realized in the early 1990s: the vertical-cavity surface-emitting laser (VCSEL). A typical VCSEL structure is shown in Figure 4. In VCSELs, the light emission is parallel to the growth direction, i.e., perpendicular to the epitaxial layers of the structure. The cavity length of a VCSEL is on the order of a few (1–5) multiples of half the wavelength of the emitted light in the material. These small dimensions imply that the interaction length between the active medium and the light field in the resonator is very short. Therefore, the mirrors of a VCSEL have to be of very good quality; reflectivities of more than 99% are required in most configurations.

Very high reflectivities can be achieved with dielectric multilayer structures, so-called Bragg reflectors. Bragg reflectors consist of a series of pairs of layers of different refractive indices, where each layer has the thickness of a quarter of the emission wavelength. Most preferably, Bragg reflectors are epitaxially grown together with the active area of the device in only one growth process.

Figure 4 Vertical-cavity surface-emitting laser (VCSEL).

Small VCSEL resonators with Bragg reflectors at both ends of the cavity are called microcavities. They are designed in such a way that only one longitudinal mode is present in the region of the semiconductor gain spectrum. Accordingly, VCSELs are well suited to provide single-mode characteristics. Moreover, a round aperture for light emission can be realized with a diameter of a few micrometers so that
diffraction is weak and the beam profile is almost perfect.

VCSELs can be tested on-wafer without further processing and can be arranged in two-dimensional arrays. A major disadvantage of VCSELs is their temperature dependence. The bandgap of the active material and the cavity mode depend on temperature in such a way that the cavity mode shifts away from the region of maximum gain when the temperature is varied. This severely limits the temperature range of operation for VCSELs.

Semiconductor Laser Dynamics

The dynamical behavior of semiconductor lasers is rather complex because of the coupled electron–hole and light dynamics in the active material. Like other laser materials, semiconductor lasers exhibit a typical threshold behavior when the pump intensity is increased. For weak pumping, the emission is low and spectrally broad; the device operates below threshold and shows (amplified) spontaneous emission. At threshold, the optical gain compensates the losses in the resonator (i.e., internal losses and mirror losses) and lasing action starts. Ideally, the output power increases linearly with the injection current above threshold. The derivative of this curve is proportional to the differential quantum efficiency. The differential quantum efficiency can reach values close to unity, and even the overall efficiency of the diode laser, i.e., the conversion rate from electrical power into optical power, can be as high as 60%, which is far higher than for any other laser material.

Ideally, the laser emission is single mode above threshold. However, as discussed above, edge-emitting diode lasers tend to emit on multiple longitudinal modes, and the competition of these modes can lead to complex dynamical behavior which is significantly enhanced as soon as the laser receives small amounts of optical feedback. Semiconductor lasers are extremely sensitive to feedback and tend to show dynamically unstable, often even chaotic, behavior under certain feedback conditions.

Like other lasers, semiconductor lasers respond with relaxation oscillations when they are suddenly switched on or disturbed in stationary operation. Relaxation oscillations are the damped periodic response of the total carrier density and the optical output intensity as a dynamically coupled system. Both are important for many aspects of semiconductor laser dynamics. The relaxation oscillation frequency determines the maximum modulation frequency of diode lasers. This frequency is strongly governed by the differential gain in the active material, i.e., by the variation of the optical gain with carrier density.

An important application of relaxation oscillations is short-pulse generation by gain switching of laser diodes. The idea of gain switching is to inject a short and intense current pulse or an optical pump pulse into the laser structure. This pulse initiates relaxation oscillations, but it is so short that already the second oscillation maximum is not amplified. As a consequence, only one intense pulse is emitted. The shortest pulse widths achieved with this scheme so far are in the few-picosecond range. For short-pulse generation, a further complexity of the diode laser comes into play. The emitted pulses are observed to be strongly chirped, i.e., the center frequency changes during the pulse. This complex self-phase-modulation is a result of the strong coupling of the real and the imaginary parts of the susceptibility in the semiconductor. If the gain (which is basically given by the imaginary part of the susceptibility) changes during a pulse emission, the refractive index (basically the real part of the susceptibility) also changes considerably. This leads to a self-phase-modulation if a strong optical pulse is emitted. This strong phase–amplitude coupling is also responsible for the high feedback sensitivity of diode lasers, for complex spatio-temporal dynamics (e.g., self-focusing, filamentation), and for a considerable broadening of the emission linewidth. The phase–amplitude coupling is often quantified with the so-called linewidth enhancement or α factor. This parameter, however, is a function of carrier density, photon wavelength, and temperature.

Diode lasers are also suited for the generation or amplification of extremely short pulses with durations below 1 picosecond. The concept used for such extremely short pulse generation is called mode-locking. The idea of mode-locking is to synchronize the longitudinal modes of the laser so that their superposition is a sequence of short pulses. The most successful concept to achieve mode-locking of diode lasers is to introduce a saturable absorber into the laser cavity and to additionally modulate the injection current synchronously to the cavity roundtrip frequency. A saturable absorber exhibits nonlinear absorption characteristics: the absorption is high for weak intensities and weak for high intensities, so that high peak-power pulses are supported. Such absorbers can be integrated as a separate section into multiple-section diode lasers, which have already been used successfully for subpicosecond pulse generation. Furthermore, electro-absorption modulators, passive waveguide sections, and tunable DBR segments for wavelength tuning can also be integrated into such multiple-section devices.
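The threshold and relaxation-oscillation behavior described above can be illustrated with the standard textbook single-mode rate-equation pair for carrier density N and photon density S. The sketch below is illustrative only: the parameter values are round hypothetical numbers, not data for any specific device discussed here.

```python
import numpy as np

# Textbook single-mode diode-laser rate equations (illustrative sketch):
#   dN/dt = P - N/tau_n - g0*(N - N_tr)*S
#   dS/dt = Gamma*g0*(N - N_tr)*S - S/tau_p + beta*N/tau_n
tau_n = 2e-9     # carrier lifetime (s)           (assumed)
tau_p = 2e-12    # photon lifetime (s)            (assumed)
g0 = 1e-12       # differential gain coefficient  (assumed)
N_tr = 1e24      # transparency carrier density   (assumed)
Gamma = 0.3      # optical confinement factor     (assumed)
beta = 1e-4      # spontaneous-emission factor    (assumed)
P = 2e33         # pump rate, about 1.5x threshold for these numbers

# Threshold: modal gain equals cavity loss, Gamma*g0*(N_th - N_tr) = 1/tau_p
N_th = N_tr + 1.0 / (Gamma * g0 * tau_p)
S_ss = (P - N_th / tau_n) / (g0 * (N_th - N_tr))  # steady-state photon density

# Forward-Euler switch-on transient
dt, steps = 1e-13, 100_000
N = S = 0.0
trace = np.empty(steps)
for i in range(steps):
    G = g0 * (N - N_tr)
    N, S = (N + dt * (P - N / tau_n - G * S),
            S + dt * (Gamma * G * S - S / tau_p + beta * N / tau_n))
    trace[i] = S

# The first relaxation spike overshoots the steady state, then damps out,
# while the carrier density clamps near its threshold value:
overshoot = trace.max() / S_ss
final_error = abs(trace[-1] - S_ss) / S_ss
```

Increasing `P` raises `S_ss` and, with it, the relaxation-oscillation frequency, which is the qualitative link to the modulation bandwidth mentioned above.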
508 LASERS / Up-Conversion Lasers
Up-Conversion Lasers
A Brenier, University of Lyon, Villeurbanne, France

© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

Under photoluminescence excitation, a luminescent center located inside a transparent material usually emits at a longer wavelength than the excitation. This is because of the existence of de-excitation losses inside the material. Thus, efficient down-conversion lasers can be built, for example the commercial Nd3+:Y3Al5O12 laser, which works at 1064 nm under usual pumping near 800 nm. More surprisingly, it has been demonstrated that fluorescence can also be emitted at a shorter wavelength than the excitation. This is possible if the losses are reduced and overcome by so-called 'up-conversion mechanisms'. Generally speaking, a laser emitting at a shorter wavelength than the excitation is classified as an up-conversion laser.

These lasers are able to generate coherent light in the red, green, blue, and UV spectral ranges. A point of practical importance is that some of them can be pumped in the near-infrared range (around 800 nm or 980 nm) by commercially available laser diodes. They can be based on all-solid-state technology and share its advantages: compactness of devices, low maintenance, and efficiency. They are needed for a variety of applications: color displays, high-density optical data storage, laser printing, biotechnology, submarine communications, gas-laser replacement, etc.

Because the emitted photons have higher energy than the excitation ones, more than one pump photon is needed to generate a single laser photon, so the physical processes involved in the up-conversion laser are essentially nonlinear. Two kinds of laser devices can be considered. In the first one, the excitation of the luminescent center migrates up through the electronic levels and reaches the initial laser level. This step is due to some luminescence mechanisms. Then, in a second step, lasing at a short wavelength occurs through a transition towards the final laser level. In the second kind of device, the pump excitation activates a down-conversion lasing, but
the laser wave is maintained inside the cavity, cannot escape, and is up-converted to short wavelengths by the second-order nonlinear susceptibility of the crystal host. This requires bifunctional nonlinear optical and laser crystals. The device is thus a self-frequency doubling laser or a self-sum frequency mixing laser. If the nonlinearity is not an intrinsic property of the laser crystal but is due to a second nonlinear optical crystal located inside or outside the cavity, the device operates in intracavity or extracavity up conversion.

Up-Conversion from Luminescence Mechanisms

The numerous 2S+1LJ energy levels (4f^n configuration) of the trivalent rare earth ions, inserted in crystals or glasses, offer many possibilities for up conversion, particularly from long-lived intermediate levels that can be populated with infrared pumping. Crystal-field induced transitions between them are comprised of sharp lines, because the 4f^n electrons are protected from external electric fields by the 5s2 5p6 electrons. Their fluorescence spectra can exhibit cross-sections high enough (10^-19 cm2) to lead to laser operation in the visible range. For up-conversion purposes, the most popular ions are Er3+, Pr3+, Tm3+, Ho3+, and Nd3+. Up conversion in Er3+ was used for the first time in 1959, to detect infrared radiation.

Energy Transfers

Two ions in close proximity interact electrostatically (Coulomb interaction) and can exchange energy. The ion which gives energy is called the sensitizer S, or donor, and the energy-receiving ion is the activator A, or acceptor. Two examples discussed below are shown in Figure 1. The most typical case of such nonradiative energy transfer in trivalent rare earth ions in insulating materials is due to the electric dipole–dipole interaction. For the latter, the transition probability W (s^-1) takes the form:

  W = [3ħ^4 c^4 / (4π n^4 R^6 τ_rad)] Q_A ∫ f_A(E) f_S(E) / E^4 dE   [1]

where R is the S–A distance, E is the photon energy of the transition operated by each ion, n the refractive index, and Q_A is the integrated absorption cross-section of the A transition:

  Q_A = ∫ σ_A(E) dE;   f_A = σ_A / Q_A   [2]

In eqn [1], τ_rad is the radiative emission probability of S for the involved transition, calculated from the emission probability A_S:

  1/τ_rad = ∫ A_S(E) dE;   f_S = τ_rad A_S   [3]

The integral in eqn [1] is called the overlap integral and shows that the transfer is efficient if the luminescence spectrum of S is resonant with the absorption spectrum of A (corresponding to the S and A transitions involved in the nonradiative transfer). The transitions are often not in perfect resonance but remain efficient enough to be used in practice. The energy mismatch is then compensated by phonons, and the energy transfer is said to be phonon-assisted.

The energy transfer visualized in Figure 1a is an up-conversion cross-relaxation. It is often met in the rare earth ions cited above when they are sufficiently concentrated in the material. The S and A ions can be of the same or of different chemical species. Successive energy transfers can promote the activator from its ground state towards higher energy levels, up to the one from which lasing can occur. This is the so-called 'Addition de Photons par Transferts d'Energie' (addition of photons by energy transfers), sometimes abbreviated 'APTE'. Yb3+ is the ion most used as sensitizer and leads to the following transfers in examples of Er3+ or Ho3+ co-doping:

  2F5/2(Yb) + 4I15/2(Er) → 2F7/2(Yb) + 4I11/2(Er)
  2F5/2(Yb) + 4I11/2(Er) → 2F7/2(Yb) + 4F7/2(Er)   [4]

  2F5/2(Yb) + 5I8(Ho) → 2F7/2(Yb) + 5I6(Ho)
  2F5/2(Yb) + 5I6(Ho) → 2F7/2(Yb) + 5S2(Ho)   [5]
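To make eqn [1] concrete, the overlap integral can be evaluated numerically for model line shapes. The sketch below (with made-up Gaussian spectra and round parameter values, not data for a real ion pair) also checks the characteristic R^-6 distance dependence of the dipole–dipole rate:

```python
import numpy as np

hbar = 1.0546e-34   # J s
c = 2.998e8         # m/s

# Hypothetical normalized line shapes f_S (emission of S) and f_A
# (absorption of A): unit-area Gaussians in photon energy E.
E = np.linspace(1.5e-19, 2.5e-19, 20001)   # photon energy grid (J)
dE = E[1] - E[0]

def gauss(E, E0, width):
    g = np.exp(-0.5 * ((E - E0) / width) ** 2)
    return g / (g.sum() * dE)              # normalize to unit area

f_S = gauss(E, 1.95e-19, 5e-21)            # sensitizer emission (assumed)
f_A = gauss(E, 2.00e-19, 5e-21)            # activator absorption (assumed)

def W(R, n=1.5, tau_rad=1e-3, Q_A=1e-24):
    """Dipole-dipole transfer rate of eqn [1] (SI units; R in meters)."""
    overlap = np.sum(f_S * f_A / E ** 4) * dE   # the overlap integral
    return (3 * hbar**4 * c**4 / (4 * np.pi * n**4 * R**6 * tau_rad)
            * Q_A * overlap)

# Halving the S-A distance increases the rate by 2^6 = 64:
ratio = W(1e-9) / W(2e-9)
```

Detuning the two Gaussians further apart shrinks the overlap integral, which is the numerical counterpart of the resonance condition discussed above.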
The energy transfer shown in Figure 1b is a down-conversion cross-relaxation. It is the inverse of the previous one. At first sight it is, of course, undesirable in an up-conversion laser. However, we notice that one ion in the upper level will lead to two ions in the intermediate level. This fact can be exploited under special conditions (see below) and be beneficial for up conversion.

The dynamical study of the energy transfers, starting from the microscopic point of view contained in eqn [1], is very complicated because of the random locations of the ions and because of other processes, such as energy migration among sensitizers and back transfers. A popular method leading to useful predictions in practice is the rate-equation analysis dealing with average parameters. The time evolution of the population densities n_i of the various levels i is described by first-order coupled equations, solved with adequate initial conditions and including the pumping rate. As an example, the rate equations for levels 1 and 2 in Figure 1a take the form:

  dn1/dt = −n1/τ1 − 2k n1² + (β21/τ2) n2 + W   [6]

  dn2/dt = −n2/τ2 + k n1²   [7]

  n0 + n1 + n2 = 1   [8]

where τ1,2 are the lifetimes of levels 1 and 2, k is the up-conversion transfer rate, β21 is the branching ratio for the 2 → 1 transition, and W is the pump rate (not shown in Figure 1).

Sequential Two-Photon Absorption

Pumping the ground state of a luminescent ion (Figure 2a) with a wave at angular frequency ω leads, in a first step, to population with density n1 of an excited state. Because the rare earth ions have numerous energy levels, it is possible for the pump to also be resonant with the transition from level 1 towards a higher energy level 2, and so to be further absorbed. Such a process, leading to an up-conversion population n2, is a sequential two-photon absorption. If there is no 1 → 2 transition resonant at the frequency ω, it is possible to use a second pump wave at a different angular frequency ω′ resonant with the 1 → 2 transition (Figure 2b). Of course, it is advantageous in practice when only a single pump beam is required for up conversion.

Let us show the relevant terms of the rate-equation analysis describing the process in Figure 2a:

  dn1/dt = −n1/τ1 + Iσ0 n0 − Iσ1 n1 + (β21/τ2) n2   [9]

  dn2/dt = −n2/τ2 + Iσ1 n1   [10]

  n0 + n1 + n2 = 1   [11]

where I is the photon flux of the pump (photons s^-1 cm^-2) and σ0 and σ1 are the absorption cross-sections (cm2) of the 0 → 1 and 1 → 2 transitions, respectively. The n2 population density shows a nonlinear I dependence, most often close to I².

Photon Avalanche

In rare earth doped materials, an up-conversion emission intense enough for visible lasing can result from a more complex mechanism. In this case, absorption of the pump is not resonant with a transition from the ground state (weak ground-state absorption); on the contrary, the pump photon energy matches the gap between two excited states (high excited-state absorption). These levels are labeled 1 and 2 in Figure 3. Level 1 generally has a rather long lifetime in order to accumulate
population, and level 2 is the initial up-conversion laser level. When the pump light is turned on, the populations of levels 1 and 2 first grow slowly, and then can accelerate exponentially. This is generally referred to as 'photon avalanche'. The term 'looping mechanism' also applies, because we can see in Figure 3 that the excited-state absorption 1 → 2 is followed by a cross-relaxation energy transfer (quantum efficiency up to two) that returns the system back to level 1. So after one loop, the n1 population density has been amplified, and it is further increased after successive loops. This pumping scheme was discovered in 1979, with a LaCl3:Pr3+ crystal used as an infrared photon counter.

The rate-equation analysis applied to the three-level system in Figure 3 leads to the following time evolutions:

  dn1/dt = −n1/τ1 + Iσ0 n0 − Iσ1 n1 + (β21/τ2) n2 + 2k n0 n2   [12]

  dn2/dt = −n2/τ2 + Iσ1 n1 − k n0 n2   [13]

  n0 + n1 + n2 = 1   [14]

… Iσ1 in the excited state, that is to say if:

  Iσ1 > (k + 1/τ2)(1/τ1) / [k − (1 − β21)/τ2]   [16]

So the right-hand side of eqn [16] appears to be a pump threshold separating two dynamical regimes when the condition in eqn [15] is satisfied, as the insert in Figure 3 shows.

Another way to distinguish between the two regimes is given by linearization of the system of eqns [12]–[14] (n0 ≈ 1). The description will then only be valid close to t = 0. We know from standard mathematical methods that the solutions of eqns [12]–[14] are then a combination of exp(at) terms. If the threshold condition in eqn [16] is satisfied, a parameter a in the combination becomes positive, so the population of level 2 increases exponentially (photon avalanche) at early times and the dynamics of the system is unstable.

The physical meaning of eqns [15] and [16] is clarified by introducing the yield η_CR of the cross-relaxation process:
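The linearization argument can be sketched numerically. The fragment below uses hypothetical parameter values (not values from this article), builds the Jacobian of eqns [12]–[13] around n0 ≈ 1 with the weak ground-state absorption neglected, and checks that the largest eigenvalue changes sign at the pump level given by eqn [16]:

```python
import numpy as np

# Hypothetical parameters (illustrative only): lifetimes tau1, tau2 (s),
# cross-relaxation rate k (s^-1), branching ratio b21 (dimensionless).
tau1, tau2, k, b21 = 1e-3, 1e-4, 3e4, 0.5

def max_growth_rate(I_sigma1):
    """Largest eigenvalue (s^-1) of eqns [12]-[13] linearized around
    n0 ~ 1, neglecting the weak ground-state absorption term I*sigma0."""
    M = np.array([[-(1 / tau1 + I_sigma1), b21 / tau2 + 2 * k],
                  [I_sigma1,              -(1 / tau2 + k)]])
    return np.linalg.eigvals(M).real.max()

# Pump threshold as in eqn [16]; it requires k > (1 - b21)/tau2,
# which appears to be the role of the condition in eqn [15].
thr = (k + 1 / tau2) * (1 / tau1) / (k - (1 - b21) / tau2)

below = max_growth_rate(0.5 * thr)   # negative: excited populations decay
above = max_growth_rate(2.0 * thr)   # positive: exponential avalanche growth
```

At exactly `thr` the largest eigenvalue vanishes, which is the marginal case separating the two regimes described above.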
Table 1 Relevant terms for frequency up-conversion

  Optical process                               Term                       New wave
  Linear
    Refractive index, absorption,               χ(1)(−ω; ω)
    stimulated emission
  Second order
    Second harmonic generation                  χ(2)(−ω3; ω1, ω1)          ω3 = 2ω1
    Sum-frequency mixing                        χ(2)(−ω3; ω1, ±ω2)         ω3 = ω1 + ω2

… impose that the frequency conversion is efficient in practice only if the following phase-matching condition is satisfied:

  ω1 n1↑(θ,φ) + ω2 n2↑(θ,φ) = ω3 n3↓(θ,φ)   [24]

where ni↑ and ni↓ are respectively the upper and lower refractive indices in the direction of propagation (θ,φ) (the ni in eqn [23] are the same with a simplified notation). Equation [24] is restricted here, for simplicity, to the collinear type I case (waves 1 and 2 have the same polarization) and is the expression of photon momentum conservation. It is usually achieved from the birefringence of the crystals. When this is not possible, another technique can be used, namely quasi-phase matching, which involves periodically reversing the sign of the nonlinear optical coefficient.

In bifunctional crystals, the laser effect and the χ(2) nonlinear interaction occur simultaneously inside the same crystal. In the case of the self-frequency doubling laser, the angular frequency ω1 = ω2 in eqn [24] is that of the fundamental laser wave. In the case of the self-sum frequency mixing laser, ω1 corresponds to the fundamental laser wave and ω2 to the pump wave. As we can see from eqns [22]–[24], the polarization of the laser emission is crucial in order for the device to work. So the laser stimulated emission cross-section σ for propagation in the phase-matching direction (θ,φ), in the adequate polarization, has to be evaluated. This can be performed with the formula:

  σ(θ,φ) = σX σY σZ / √(σY² σZ² eX² + σX² σZ² eY² + σX² σY² eZ²)   [25]

Up-conversion lasers from energy transfers have the advantage of using a single pump laser. The latter first populates a long-lived intermediate level. A typical example, based on a LiYF4:Er3+ crystal, is provided by Figure 7. The pump at 802 nm from a GaAlAs diode laser, or at 969 nm, efficiently populates the 4I11/2 level, which has an 8 ms lifetime. The green laser, at 551 nm, corresponds to the 4S3/2 → 4I15/2 transition, and it can be shown that energy transfer is an essential part of the mechanism by stopping the pump laser abruptly: green lasing continues to occur for several hundred microseconds because the 4S3/2 level is still fed (an excited-state
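Equation [25] can be sketched directly in code. In the snippet below, the principal cross-section values are made up, and reading (eX, eY, eZ) as the components of the unit polarization vector in the crystal frame is an assumption; with that reading, the formula correctly reduces to the principal cross-sections in the limiting cases:

```python
import math

def sigma_eff(sx, sy, sz, e):
    """Effective stimulated-emission cross-section, eqn [25], for a unit
    polarization vector e = (eX, eY, eZ) in the crystal frame (assumed)."""
    ex, ey, ez = e
    num = sx * sy * sz
    den = math.sqrt((sy * sz * ex) ** 2
                    + (sx * sz * ey) ** 2
                    + (sx * sy * ez) ** 2)
    return num / den

# Hypothetical principal cross-sections (cm^2):
sx, sy, sz = 2e-19, 1e-19, 0.5e-19

# Polarization along a principal axis recovers that axis' cross-section:
sX = sigma_eff(sx, sy, sz, (1.0, 0.0, 0.0))                    # equals sx
# An intermediate orientation lies between the two extreme values:
s45 = sigma_eff(sx, sy, sz, (math.sqrt(0.5), math.sqrt(0.5), 0.0))
```

Dividing numerator and denominator shows eqn [25] is the weighted mean 1/σ² = eX²/σX² + eY²/σY² + eZ²/σZ², which makes the limiting behavior evident.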
… two-photon absorption mechanism is represented in Figure 8. The material is LaF3:Nd3+, and violet lasing at 380 nm corresponds to the 4D3/2 → 4I11/2 transition. Transition from the ground state up to the 4F5/2 level is first obtained with an infrared beam at 790 nm in order to feed the 4F3/2 intermediate level after a fast nonradiative de-excitation. The 4F3/2 level has a rather long lifetime (700 μs), so it accumulates enough population that a second, yellow pumping at 591 nm (Figure 8a) can efficiently populate the 4D3/2 initial laser level. With 110 mW of infrared pump and 300 mW of yellow pump, 12 mW of violet output were obtained at a temperature of 20 K.

A simpler device is obtained if a single pump beam is used. This is the case in Figure 8b, where a yellow pump at 578 nm provides both ground-state and excited-state absorption. This doubly resonant scheme is less efficient in LaF3:Nd3+ than the previous one, but in Figure 9 we show another example, LiYF4:Er3+, working at room temperature. The lasing 4S3/2 → 4I15/2 transition at 551 nm delivers 20 mW upon 1600 mW pumping at 974 nm.

Figure 7 Energy level diagram of LiYF4:Er3+. The dashed lines indicate the energy transfers responsible for the green and blue up-conversion lasing. Reproduced with permission of American Institute of Physics from Macfarlane RM, Tong F, Silversmith AJ and Lenth W (1988) Violet CW neodymium upconversion lasers. Applied Physics Letters 52(16): 1300.

Figure 8 Simplified energy level diagram of LaF3:Nd3+ and up-conversion lasing from two-photon absorption. (a) Two different pump wavelengths; (b) doubly resonant single pump wavelength. Reproduced with permission of American Institute of Physics from Lenth W, Silversmith AJ and Macfarlane RM (1988) Green infrared pumped erbium up conversion lasers. In: Tam AC, Gole JL and Stwalley WC (eds) Advances in Laser Science – III. AIP Conference Proceedings No. 172: 8–12.

Figure 9 Energy level diagram of LiYF4:Er3+ (levels n0–n3; τ1 = 10 ms, τ2 = 4.3 ms, τ3 = 0.37 ms; pump λp = 974 nm) and up-conversion lasing from two-photon absorption (doubly resonant). Reproduced with permission of Institute of Physics Publishing from Huber G (1999) Visible cw solid-state lasers. Advances in Lasers and Applications. Bristol and Philadelphia: Scottish Universities Summer School in Physics & Institute of Physics Publishing, 19.

Table 2 Main room temperature up-conversion lasers operating from energy transfers

  Laser material    Laser wavelength (nm)   Pump wavelength (nm)   Output power

Table 3 Main room temperature up-conversion lasers operating from two-photon absorption

  Laser material    Laser wavelength (nm)   Pump wavelength (nm)   Output power
  Y3Al5O12:Er 1%    561                     647 + 810
  LiYF4:Er 1%       551                     647 + 810              0.95 mJ
  LiYF4:Er 1%       551                     810                    40 mW
  LiYF4:Er 1%       551                     974                    45 mW
  LiYF4:Er          551                     974                    20 mW
  KYF4:Er 1%        562                     647 + 810              0.95 mJ
  LiLuF4:Er         552                     974                    70 mW
  LiLuF4:Er         552                     970                    213 mW
  LiYF4:Tm 1%       453–450                 781 + 649              0.2 mJ
  Fiber:Tm          480                     1120                   57 mW
  Fiber:Tm          480                     1100–1180              45 mW
  Fiber:Tm          455                     645 + 1064             3 mW
  Fiber:Tm          803–816                 1064                   1.2 W
  Fiber:Tm          482                     1123                   120 mW
  Fiber:Yb:Tm       650                     1120
  Fiber:Ho          540–553                 643                    38 mW
  Fiber:Ho          544–549                 645                    20 mW
  Fiber:Er          548                     800                    15 mW
  Fiber:Er          546                     801                    23 mW
  Fiber:Er          544                     971                    12 mW
  Fiber:Er          543                     800
  Fiber:Pr          635                     1010 + 835             180 mW
  Fiber:Pr          605                     1010 + 835             30 mW
  Fiber:Pr          635                     1020 + 840             54 mW
  Fiber:Pr          520                     1020 + 840             20 mW
  Fiber:Pr          491                     1020 + 840             7 mW
  Fiber:Nd          381                     590                    74 mW
  Fiber:Nd          412                     590                    500 mW

The simplified Er3+ level scheme of Figure 9 contains four main levels with population densities n0 = [4I15/2], n1 = [4I13/2], n2 = [4I11/2], n3 = [4S3/2]. The rate equations of populations of this system can
be written and solved because all the implied materials. As an illustration of this mechanism, let
parameters (lifetimes, branching ratios, ground, and us consider the case of the continuous wave Pr3þ –
excited states absorption cross-sections) have been measured in this material. Predictive models of the up-conversion green laser are successful, matching experimental measurements, and confirm that the doubly resonant mechanism is realistic at low doping concentration.

Table 3 summarizes the performance of the main up-conversion lasers based on sequential photon absorption and working at room temperature.

Photon Avalanche

Photon avalanche up-conversion has been observed mainly in Nd3+-, Tm3+-, Er3+-, and Pr3+-doped materials. A notable example is a Yb3+, Pr3+-codoped fluorozirconate fiber laser (3000 ppm Pr, 20 000 ppm Yb). With two Ti:sapphire pump lasers operating at 852 and 826 nm (Figure 10), an output power at 635 nm as high as 1.02 W was obtained for an incident pump power of 5.51 W (19% slope efficiency).

This remarkable result can be explained quantitatively by modeling the photon avalanche with a five-level diagram, restricted in Figure 11 to a single pump laser at 850 nm. The laser emission corresponds to the 3P0 → 3F2 transition. The five relevant levels have the population densities n0 = [3H4(Pr)], n1 = [1G4(Pr)], n2 = [3P0(Pr)], n3 = [2F7/2(Yb)], and n4 = [2F5/2(Yb)]. The two components of the mechanism are:

The rate equation analysis predicts that the avalanche threshold occurs at about 1 W of pump power, and that the long fluorescence rise time is somewhat shortened at higher pump power, as observed experimentally.

Figure 10 Experimental set-up for the Yb–Pr-doped fiber up-conversion laser. M1: dichroic mirror; C1, C2: lenses; M2: dichroic input mirror. Reproduced with permission of the Optical Society of America from Sandrock T, Scheife H, Heumann E and Huber G (1997) High power continuous wave up-conversion fiber laser at room temperature. Optics Letters 22(11): 809.

Figure 11 Energy level diagram of the fluorozirconate:Yb3+:Pr3+ fiber. The dashed lines indicate the energy transfers. Reproduced with permission of the Optical Society of America from Sandrock T, Scheife H, Heumann E and Huber G (1997) High power continuous wave up-conversion fiber laser at room temperature. Optics Letters 22(11): 809.
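The rate-equation picture behind these predictions can be sketched numerically. The toy model below is a deliberately simplified three-level caricature of the avalanche (weak ground-state absorption seeding, strong excited-state absorption, and a cross-relaxation step that doubles the intermediate-level population); all rate constants are illustrative assumptions, not the published Pr/Yb fiber parameters.

```python
# Minimal sketch of a photon-avalanche rate-equation model.
# The rate constants are illustrative only (NOT the Pr/Yb fiber values).
def simulate(pump, t_end=40.0, dt=1e-3):
    # n0: ground state, n1: intermediate level, n2: upper (emitting) level
    n0, n1, n2 = 1.0, 0.0, 0.0
    R0, R1 = 1e-3, 5.0      # weak ground-state vs strong excited-state absorption
    tau1, tau2 = 1.0, 0.5   # level lifetimes
    W = 20.0                # cross-relaxation n2 + n0 -> 2 n1 (the avalanche step)
    for _ in range(int(t_end / dt)):
        gsa = R0 * pump * n0          # weak seeding of n1
        esa = R1 * pump * n1          # pump-driven promotion n1 -> n2
        cr = W * n2 * n0              # each event returns TWO ions to n1
        dn0 = -gsa + n1 / tau1 - cr
        dn1 = gsa - n1 / tau1 - esa + n2 / tau2 + 2.0 * cr
        dn2 = esa - n2 / tau2 - cr
        n0 += dt * dn0
        n1 += dt * dn1
        n2 += dt * dn2
    return n0, n1, n2

below = simulate(0.1)[2]   # pump below the avalanche threshold
above = simulate(1.0)[2]   # pump above threshold: upper level fills dramatically
print(below, above)
```

Raising the pump through the threshold region switches the upper-level population from essentially empty to heavily populated, mimicking the sharp avalanche threshold that the full five-level analysis predicts for the fiber laser.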
LASERS / Up-Conversion Lasers 517
Table 4 Main room temperature up-conversion lasers operating from photon avalanche

Laser material | Laser wavelength (nm) | Pump wavelength (nm) | Output power

2F5/2 → 2F7/2   [31]

Channels in eqns [29] and [31] operate around a 1060 nm wavelength and the channel in eqn [30] near 1338 nm.

The Self-Frequency Doubling Laser

A typical laser scheme is shown in Figure 12. The laser beam cannot escape from the cavity, and dichroic mirrors are used: the input mirror has high transmission at the pump wavelength and is highly reflective at the laser and second-harmonic wavelengths, while the output mirror is highly reflective at the laser wavelength and has high transmission for the second harmonic. Channels in eqns [29] and [31] generate green light near 530 nm and the channel in eqn [30] generates red light near 669 nm.
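The green and red wavelengths quoted for these channels follow directly from frequency doubling, which halves the vacuum wavelength; a quick numerical check:

```python
# Second-harmonic generation doubles the optical frequency,
# i.e. halves the vacuum wavelength.
def shg_wavelength_nm(fundamental_nm):
    return fundamental_nm / 2.0

print(shg_wavelength_nm(1060))  # 530.0 nm: green, channels [29] and [31]
print(shg_wavelength_nm(1338))  # 669.0 nm: red, channel [30]
```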
The most exploited bi-functional crystal is YAl3(BO3)4:Nd3+, well known as NYAB. It is a negative uniaxial trigonal crystal (extraordinary index lower than the ordinary one) and its laser emission is easily observed in ordinary polarization with a high cross-section: 2 × 10⁻¹⁹ cm² at 1063 nm. Type I phase matching occurs at a polar angle θ = 30.7° and 26.8° for the channels in eqns [29] and [30], respectively. The effective nonlinear optical coefficient deff is close to 1.4 pm V⁻¹ at azimuthal angle φ = 0°.

The Yb3+ ion has some advantages over the Nd3+ ion due to its very simple energy level scheme: there is no excited state absorption at the laser wavelength, no up-conversion losses, no concentration quenching, and no absorption in the green. The small Stokes shift between pump and laser emission reduces the thermal loading of the material during laser operation. A self-doubling laser based on a YAl3(BO3)4:Yb3+ crystal has produced 1.1 W of green power upon 11 W diode pumping.

The drawback of YAl3(BO3)4 is that it is rather difficult to grow because it is not congruent. This difficulty was overcome by the discovery, at the Ecole Nationale Supérieure de Chimie de Paris, of the rare-earth calcium oxoborate CaGd4(BO3)3O, which can be grown to a large size by the Czochralski method. It is a monoclinic biaxial crystal and, doped with Nd3+, it was first exploited (channel in eqn [29]) in the XY principal plane at θ = 90°, φ = 46° (deff = 0.5 pm V⁻¹). It was soon recognized that the optimum phase-matching direction (deff = 1.68 pm V⁻¹) occurs out of the principal planes, in the direction θ = 66.8°, φ = 132.6°, where high green power can be obtained: 225 mW under 1.56 W pump.

Table 5 summarizes the main self-frequency doubling results obtained with the channel in eqn [29].

The Self-Sum Frequency Mixing Laser

The example in Figure 13 gives the main features of such a laser. The input mirror has high transmission at the pump wavelength and both mirrors are highly reflective at the laser wavelength. The output mirror has high transmission at the sum-frequency mixing wavelength.

The laser wave at angular frequency ω1 in Table 1 corresponds to the channel in eqn [29] or [30] in the case of Nd3+. The wave at angular frequency ω2 in Table 1 has two different roles: first it excites the Nd3+ laser center, and secondly its nonabsorbed part is up-converted by the second-order nonlinear process. ω2 (corresponding to the wavelength λ2) is then chosen to match the main Nd3+ absorption lines:

4I9/2 → 4G5/2–2G7/2 (λ2 = 590 nm)
4I9/2 → 4F7/2–4S3/2 (λ2 = 750 nm)   [32]
4I9/2 → 4F5/2–2H9/2 (λ2 = 800 nm)
4I9/2 → 4F3/2 (λ2 = 880 nm)

Figure 13 Scheme of a self-sum frequency mixing laser. Note that in reality the three beams are superimposed.

See also
Materials Characterization Techniques: χ(2). Nonlinear Optics, Applications: Phase Matching. Nonlinear Optics, Basics: χ(2) – Harmonic Generation.

Table 6 Main results of Nd3+ self-sum frequency mixing lasers based on eqn [29]
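In self-sum frequency mixing the photon energies of the two intracavity waves add, so the generated wavelength obeys 1/λ3 = 1/λ1 + 1/λ2. A quick check, taking λ1 ≈ 1060 nm (the channel in eqn [29]) together with two of the λ2 values quoted for the Nd3+ absorption lines:

```python
# Sum-frequency mixing: photon energies add, so inverse wavelengths add.
def sfm_wavelength_nm(lambda1_nm, lambda2_nm):
    return 1.0 / (1.0 / lambda1_nm + 1.0 / lambda2_nm)

print(sfm_wavelength_nm(1060, 750))  # ~439 nm (blue)
print(sfm_wavelength_nm(1060, 880))  # ~481 nm (blue-green)
```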
2003. The proceedings of the 1996 AURICOM and subsequent NLMI conferences have been published by the SPIE. We are saddened to note that Professor Libenson died in February 2004.

The Boulder Damage Symposium grew out of a one-day Symposium convened in Boulder in 1969 under the auspices of the ASTM Subcommittee on Lasers, to establish standards for laser materials. In their summary of the first Symposium, the conference organizers remarked:

The entire question of the nature of the standards remains to be addressed. Should they take the form of energy density at which catastrophic damage is likely to occur, or should they be specified in terms of a mean number of shots at a given level of energy density before certain degradation of performance is measured? The latter would seem to express the kind of information that the buyer of laser materials would find most useful. Once a standard is agreed upon, a meaningful test configuration must be established. It is likely that this will be a well-controlled oscillator amplifier chain with the test specimen an active element of the system, but this is not certain. Another point to be determined is that of the accuracy to which its specifications should be written. This depends in part on the kind of quality control the producers of laser glass believe is feasible.

The questions raised in this first Symposium remain at the core of laser damage studies today. How do we define damage: in terms of the physical effect on the material, or in terms of the degradation of performance of the system? How do we measure damage thresholds or values? How do we specify the properties and preparation of the sample, including the relevant experimental and material variables? How can we ensure that damage values are reproducible, unless we make the measurements in a very well controlled and characterized system? With what precision should damage levels be stated, in order to be useful? Extensive research was required to answer these questions and to elucidate the fundamental nature of damage mechanisms. Many of the results of this research are documented in the 35-year record of the Boulder Damage Symposium and the corresponding Russian-based conferences.

From the beginning of this area of research, investigators devoted a significant effort to establishing the 'first causes' of damage phenomena. However, they quickly realized that in most cases, damage occurred at surfaces and material interfaces, or was mediated by impurities or inclusions in bulk materials. As material fabrication processes improved, and higher-quality materials became available, observed damage thresholds approached the intrinsic limits of the pure material. This was particularly true of silicate glass, metal reflectors, and in some cases, polymers. In fact, silicate glass has become one of the purest noncrystalline materials available in bulk quantities. However, for practical purposes, most laser systems are limited by damage at surfaces, coatings, or impurities. Impurities aggregate at grain boundaries in crystalline materials, and coalesce into inclusions in glasses, resulting in absorption, scattering, and field enhancement.

The coherent nature of laser light makes any scattering phenomenon potentially damaging, because the scattered light interferes constructively with the laser beam, leading to the formation of intensity hot spots. Every interface in an optical system is potentially a source of constructive interference between the reflected and incident waves. Multilayer dielectric coatings are particularly vulnerable to damage, due to the presence of residual stress as well as field enhancement from Fresnel reflection. This has led to attempts to mitigate the problem by use of specific designs, such as avoiding AR coatings by using Brewster-angle surfaces, or use of non-quarter-wave layers to move the peak field away from the vicinity of boundaries where impurities tend to collect. Other damage-sensitive design considerations can be found in the literature. Multiple reflections are the basis of photonic crystal optics, and applications of these devices to high-power-density systems will undoubtedly be limited by field-enhanced damage phenomena.

In CW operation, of course, thermal effects dominate. They may be irreversible, creating permanent damage sites in the medium, or reversible, such as color centers that can be bleached out. As laser pulse durations vary from nanosecond to picosecond and femtosecond durations, the nature of the dominant damage mechanisms changes as well. As a rule of thumb, the scaling of damage thresholds as t^1/2, where t is the pulse duration, was established empirically, and was found to be valid over a wide range of pulse durations. This scaling implies the presence of diffusion effects, either of heat or of electrons, in the vicinity of a damage center. As might be expected, this scaling does not apply for very short pulses (femtoseconds or less), where the extremely high field values can lead to instantaneous breakdown of the material.

Whether the damage is due to field enhancement or is impurity-mediated, it is difficult to establish a reproducible value for a damage threshold, unless the sample is macroscopic in extent, and the sampling laser beam is reproducibly of high spatial and temporal quality, preferably truly single-mode. Damage measurements on microscopic areas are unlikely to be representative of the entire sample. Damage will occur first at a point where the field is strongest or the material is weakest, and a sufficient
LASER-INDUCED DAMAGE OF OPTICAL MATERIALS 521
area must be tested to ensure that it includes such sites. Also, unless the beam quality of the laser used for testing is well controlled and well characterized, it will not be possible to reproduce the test results. It is equally important to characterize the samples, including their method of preparation, cleaning, and test environment. For damage research, as opposed to simple threshold testing, postmortem examination is also required, to elucidate the mechanism of the observed damage effect. However, we now have several techniques, such as photothermal deflection, that can provide useful pre-catastrophic indicators of the onset of laser damage, and thus are appropriate for quality control.

Professor Roger Wood has published a series of reviews and textbooks on laser damage in optical materials, which provide an overview of the field, with appropriate emphasis on measurement techniques, characterization of the test setup and sample, and implications for system design. He has also edited a collection of seminal publications in the field, published by the SPIE in 1990.

It is gratifying to those working in this field to observe how much of the research carried out to elucidate and, if possible, avoid the deleterious effects of laser damage has been applied to the constructive purpose of laser-assisted material processing and manufacturing. The introduction of low-cost, reliable femtosecond lasers has stimulated activity in this field, since it enables the experimenter to control with exquisite precision the power and energy deposited on a material. With femtosecond pulses, one can ablate material from the surface of an object while minimizing any collateral damage to the interior material. Surface layers of dirt can be removed from priceless art works such as oil paintings, without concern for damage to the object itself. Objects with arbitrary shape can be built up from metal particles, using laser additive manufacturing techniques. Thin films and photonic crystal structures can be created using laser-assisted deposition. Laser irradiation provides a powerful tool for surface cleaning, conditioning, and annealing. Laser interaction with materials and laser-assisted manufacturing techniques comprise a vital area of research in the materials science community.

In both the US and Russia, the study of the interaction of intense coherent light with optical materials began with a disarmingly simple objective: to understand the fundamental mechanisms of interaction. In the review and summary of the 1969 ASTM Symposium, the editors wrote:

In view of the number of problems remaining to be resolved, it is suggested that another Symposium on laser damage be held in 1970. Hopefully, at that time a better understanding of the nature of damage in laser glass will have been obtained. Higher threshold values will have been reproducibly achieved, and some agreement can be arrived at regarding useful and realistic standards.

Thirty-five years later, damage levels in optical materials have been significantly improved, and realistic and useful standards have been developed under the auspices of the International Organization for Standardization (ISO), and are now accepted globally. However, as the operating regimes of lasers have moved to femtosecond temporal scales and micrometer dimensions, new phenomena have arisen that require further research. An entirely new technology of laser-enabled manufacturing and materials conditioning has developed, based on the same physics as laser damage. Combining the disciplines of materials science and applied optics, the interaction of intense light with materials remains a scientifically challenging and economically important field of research.

See also
Optical Coatings: Laser Damage in Thin Film Coatings.

Further Reading
Bonch-Bruevich AM, Konov VI and Libenson MN (eds) (1990) Optical Radiation Interaction with Matter. SPIE Proceedings, vol. 1440.
Glass AJ and Guenther AH (eds) (1969) Damage in Laser Glass Symposium: Review and Summary. ASTM Special Technical Publication 469.
ISO Online Catalogue, Section 31.260, http://www.iso.ch/iso/en/CatalogueListPage.CatalogueList
Laser-Induced Damage in Optical Materials, Collected Papers 1969–1998 (CD-ROM), SPIE, Bellingham, Washington (1999).
Laser-Induced Damage in Optical Materials, Collected Papers 1999–2003 (CD-ROM), SPIE, Bellingham, Washington (2004) (To be published).
Libenson MN (ed.) (1997) Non-Resonant Laser-Matter Interaction (NLMI-9). SPIE Proceedings, vol. 3093.
Libenson MN (ed.) (2001) Non-Resonant Laser-Matter Interaction (NLMI-10). SPIE Proceedings, vol. 4423.
Libenson MN (ed.) (2004) Non-Resonant Laser-Matter Interaction (NLMI-11). SPIE Proceedings, vol. 5506 (To be published).
Wood RM (ed.) (1986) Laser Damage in Optical Materials. Bristol, UK: Adam Hilger.
Wood RM (ed.) (1990) Selected Papers on Laser Damage in Optical Material. SPIE Milestone Series, vol. MS24. Bellingham, WA: SPIE Optical Engineering Press.
Wood RM (ed.) (2003) Laser-Induced Damage in Optical Materials. Bristol, UK: Institute of Physics.
Wood RM (ed.) (2003) The Power- and Energy-Handling Capability of Optical Materials, Components, and Systems. SPIE Press Tutorial Texts in Optical Engineering, vol. TT60.
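The empirical t^1/2 scaling of damage thresholds discussed in this article reduces to a one-line rule of thumb. In the sketch below, the reference fluence and pulse duration are illustrative assumptions, not measured values, and the rule is known to fail for femtosecond pulses:

```python
import math

# Empirical pulse-duration scaling of laser damage thresholds:
# F_th(t) ~ F_ref * sqrt(t / t_ref), valid roughly from nanoseconds
# down to tens of picoseconds (it breaks down for femtosecond pulses).
def threshold_fluence(pulse_s, f_ref_j_cm2=10.0, t_ref_s=1e-9):
    # f_ref_j_cm2 and t_ref_s are illustrative reference values, not data
    return f_ref_j_cm2 * math.sqrt(pulse_s / t_ref_s)

# A 100x shorter pulse lowers the fluence threshold by a factor of 10:
print(threshold_fluence(1e-9))   # 10.0
print(threshold_fluence(1e-11))  # 1.0
```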
522 LIGHT EMITTING DIODES
Introduction

Short Historic Overview

During the second half of the nineteenth century, electric lighting became possible. In 1879, Edison demonstrated his incandescent lamp, and although some electric arc lamps competed with the incandescent lamp for a few years, the first 50 years of electrical lighting were dominated by the incandescent lamp. Gas discharge lamps (both low- and high-pressure lamps) became more widely distributed only by the end of the third decade of the twentieth century, and the fluorescent lamp, which still dominates our offices and many other lighting applications today, was generally accepted only after the Second World War.

The early incandescent lamps had an efficacy of a few lm/W, but by 1920 this could be increased to 10–15 lm/W. Gas discharge lamps started with approximately 30 lm/W, but by 1960 their efficacy increased to about 70 lm/W. Fluorescent lamps still provide only 70 to 100 lm/W; high-pressure lamp efficacy could be increased above this value. The different lamp types can be used, however, in different applications, depending on the size and the unit power of the lamp type.

Nonincandescent light generation in solid-state materials dates back to the 1920s: a faint glow of a SiC crystal was observed when current passed through it at a point contact. In the 1930s, electroluminescence of ZnS powder layers was detected (the Destriau effect). By the mid-1960s, it became clear that with the then-available technology, good luminous efficiency could be expected only in III–V compounds (GaAs, GaP, and their ternary and quaternary compounds using Al, In, and N). Pure GaAs has a bandgap in the infrared (IR); thus one can build good IR emitters using this material, but no visible-light emitter. However, GaAs proved to be an excellent coherent light-emitting material and enabled the construction of IR-emitting semiconductor lasers.

The first visible light-emitting diodes were fabricated using a GaAsP alloy. Figure 1 shows how the efficacy of III–V compound LEDs was increased by introducing new compositions and, later, new technologies and better light extraction techniques. In the figure we also show the efficacy values of average incandescent, high-pressure Hg, and fluorescent lamps. The first LEDs emitted in the red part of the spectrum, and only by 1975 was the green LED invented. By 1985, practically any color between red and green had become possible, and the efficacy of these lamps reached a level where, for monochromatic light applications, their efficacy could compete favorably with filtered incandescent light. LEDs emit in a relatively narrow wavelength band (20–30 nm bandwidth), and if one has to filter such a narrow band from the continuous spectrum of an incandescent lamp, the loss becomes considerable. LEDs need only a few volts to function, and it became general practice to build LEDs with approximately a 20 mA current load. These LEDs became very popular for interior signaling (on instruments, household appliances, and in similar applications). The goal was, however, to be able to produce white light, but the blue-emitting LED was still missing. Only after 1990 could the problem of producing p–n junctions from GaN, which has a bandgap large enough to emit blue light, be solved. Different ternary and quaternary compounds, as shown in Figure 1, enabled efficacies comparable to incandescent lamp efficacy, and the race to produce white light began.

Figure 1 Increase of luminous efficacy of LEDs during the past 40 years. Courtesy of Lumileds Lighting.

Physical Fundamentals

Figure 2 shows a forward-biased p–n junction. As injected free electrons and holes are driven towards the junction, they can recombine radiatively, emitting a photon. The probability of photon emission versus nonradiative recombination will depend on the band
Figure 2 Forward-biased p–n junction; light generation via recombination of free charge carriers.

Figure 4 Relative spectral power distribution of some LED structures.
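The band structure determines the emission color: a photon emitted across a bandgap Eg has wavelength λ = hc/Eg, i.e. λ(nm) ≈ 1239.84/Eg(eV). A quick converter (the bandgap values in the comments are nominal room-temperature figures):

```python
# Photon energy E (eV) and vacuum wavelength (nm) are related by
# lambda = h*c/E, i.e. approximately 1239.84 / E.
def bandgap_to_wavelength_nm(e_gap_ev):
    return 1239.84 / e_gap_ev

print(round(bandgap_to_wavelength_nm(1.42)))  # GaAs (~1.42 eV): ~873 nm, infrared
print(round(bandgap_to_wavelength_nm(3.4)))   # GaN (~3.4 eV): ~365 nm, enabling blue/UV emitters
```

This is why pure GaAs makes a good IR emitter but no visible-light emitter, and why GaN, with its much larger bandgap, was the missing ingredient for blue LEDs.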
List of Units and Nomenclature

Chromatic adaptation: Adaptation by stimuli in which the dominant effect is that of different relative spectral distributions.

Chromaticity: Partial description of the color stimulus, describing only chromatic aspects and not the 'intensity' (the luminance) of the stimulus. In light sources, chromaticity is an important descriptor, as the level of illumination can be changed by changing the distance between the source and the illuminated object. The luminance of the object will depend on the illumination and the reflection (transmission) characteristics of the object.

Color: In color science one distinguishes between color (perception), a characteristic of visual perception that can be described by attributes of hue, brightness (or lightness), and colorfulness (or saturation or chroma), and color stimulus, a specification of the optical stimulus in terms of operationally defined values, such as three tristimulus values, or luminance and chromaticity (or dominant wavelength and saturation).

Color rendering: Measure of the degree to which the psychophysical color of an object illuminated by the test illuminant conforms to that of the same object illuminated by the reference illuminant, suitable allowance having been made for the state of chromatic adaptation.

Correlated color temperature: The temperature of the Planckian radiator whose perceived color most closely resembles that of a given stimulus at the same brightness and under specified viewing conditions.

Efficacy (luminous efficacy of a source): Quotient of the luminous flux emitted and the power consumed by the source.

Luminous efficiency: Ratio of radiant flux weighted according to V(λ), the spectral luminous efficiency of the human eye, to the corresponding radiant flux.

Spectral luminous efficiency (of the human observer): Ratio of the radiant flux at wavelength λm to that at wavelength λ, such that both radiations produce equally intense luminous sensations under specified photometric conditions, with λm chosen so that the maximum value of this ratio is equal to 1.

See also
Incoherent Sources: Lamps. Lasers: Semiconductor Lasers.

Further Reading
Fukuda M (1991) Reliability and Degradation of Semiconductor Lasers and LEDs. Boston, MA: Artech House.
Rudden MN and Wilson J (1993) Elements of Solid State Physics. Chichester, UK: Wiley.
Sze SM (1985) Semiconductor Devices: Physics and Technology. New York: Wiley.
Wenckebach WT (1999) Essentials of Semiconductor Physics. Chichester, UK: Wiley.
Žukauskas A, Shur MS and Gaska R (2002) Introduction to Solid-State Lighting. New York: Wiley.
M

MAGNETO-OPTICS

Contents

strong modification of the spin splittings of the energy bands with respect to nonmagnetic materials. Thus the Faraday effect or CARS are particularly well-suited methods for the investigation of such systems.

As examples of optical pumping and ODMR, results obtained with III–V semiconductors will be presented, which show the most important applications of th

coefficient can be split into an amplitude and a phase factor:

r̃± = r± exp(iψ±)   [5]

By comparing eqns [4] and [5], the phase shifts during reflection can be expressed in terms of the refractive index: