Physics in Proportion
Mark A. Peterson
© 2005 M.A. Peterson
Contents
1 What is physics?
1.1 Proportionality
1.2 Two Kinds of Physics
1.3 Learning Physics
1.4 A Capsule History of Physics
2 Mathematical Tools
2.1 Proportion
2.2 Units
2.3 Data: Straight Line Plots
2.4 Uncertainty in Data
2.5 Dimension
2.6 Position, Time, and (constant) Velocity
2.7 The speed of light, and SI units
2.8 Dimension and Scaling
2.9 Power Laws, and the Logarithm
2.10 Numbers in Geometry are Ratios
2.11 The Trigonometric Functions
2.12 Angular Measures
2.13 Trigonometric functions of special angles
2.14 Small angle approximations
3 Geometrical Optics
3.1 Angular Size
3.2 The Eye
3.3 Binocular Vision and Parallax
3.4 Wide Open Pupils
3.5 The Lens of the Eye
3.6 Refraction
3.7 Focal Length
3.8 Interpreting Relationships
3.9 The focal length of the eye
3.10 Virtual Images
3.11 Thin Lenses
3.12 Object and Image
3.13 Optical Systems
3.13.1 The Magnifying Glass
3.13.2 The Microscope
3.13.3 Two lenses together
3.13.4 The Astronomical Telescope
3.13.5 Galilean Telescope
3.14 Mirrors
3.15 Spherical Aberrations
3.16 Reflection and Refraction
3.17 Fermat's Principle
3.18 Wavefronts: A Dual Theory of Light
4 Time and Oscillation
4.1 Angular Clocks
4.1.1 The Solar Clock
4.1.2 The Sidereal Clock
4.1.3 Solar vs. Sidereal
4.1.4 Aside on Kepler's Laws
4.2 Atomic Clocks
4.3 GPS: Global Positioning System
4.4 Longitude
4.5 The Moons of Jupiter
4.6 Period, Frequency and Amplitude
4.7 Velocity in Orbit, Projected
4.8 Pendulums
4.8.1 The Period of a Pendulum
4.9 The Binomial Approximation for Perturbations
4.10 Pendulums and the Rotation of the Earth
4.11 Simple Harmonic Oscillators
4.12 Exponential Decay
4.13 Dating by Radioactive Decay
5 Mass, Weight, and Equilibrium
5.1 Archimedes
5.2 Torque and Force
5.3 Spring Forces: Hooke's Law
5.4 Weight and Mass
5.5 Springs in Parallel and Series
5.6 Newton's Third Law
5.7 Young's Modulus
5.8 The Force Between Atoms
6 Mechanical Energy and Motion
6.1 Gravitational Potential Energy
6.2 Spring Potential Energy
6.3 The Potential Energy of a Pendulum
6.4 Falling, and Kinetic Energy
6.5 Velocity v in falling
6.6 Universal Gravitation
6.7 Energy of an Oscillator
6.8 Oscillators Losing Energy
6.9 A Chemical Bond
7 Vector Quantities
7.1 Projectile Motion
7.2 Vector Addition
7.3 Velocity and Speed
7.4 Galilean Relativity
7.5 Falling and Relativity
7.6 Falling and Impulse
7.7 More on Projectile Motion
7.8 Impulse and Conservation of Momentum
7.9 Impulse and Circular Motion
8 Density and Fluids
8.1 Mass Density
8.2 Archimedes' Principle
8.3 Galileo's Balance
8.4 Galileo's Proof of Archimedes' Principle
8.5 Buoyancy and Pressure
8.6 More on Hydrostatic Pressure
8.7 Atmospheric Pressure
8.8 The Barometer
8.9 Bernoulli's Principle
8.10 Applications of Bernoulli's Principle
8.10.1 Force of the wind
8.10.2 Flow Past an Airfoil
8.11 Flow in Pipes
8.11.1 Venturi Flow Meter
8.11.2 Poiseuille Flow
8.11.3 Current Density
8.12 Shear Stress and Viscosity
8.13 Stokes Flow
8.14 Poiseuille Flow Revisited
8.15 The Reynolds Number
8.16 Resistance in Series and Parallel
8.17 The Human Circulatory System
8.18 A Fractal Model of Circulation
9 Temperature, Heat, and Internal Energy
9.1 Temperature
9.2 Thermometers
9.3 The Gas Thermometer
9.4 Avogadro's Hypothesis
9.5 Heat Capacity
9.6 Molar Heat Capacities
9.7 Statistical Model for Molar Heat Capacity
9.8 Phase Transitions
9.9 Entropy
10 Thermodynamics
10.1 Work
10.2 PV Work
10.3 Various Processes
10.3.1 Adiabatic Process: Q = 0
10.3.2 Isothermal Processes, ΔT = 0
10.3.3 A Constant Pressure Process
10.3.4 Reversible and Irreversible Processes
10.4 Heat Engines
10.4.1 The Carnot Cycle
10.4.2 Refrigerators, Heat Pumps
10.5 Life at Fixed Temperature
10.6 Life at Fixed Temperature and Pressure
11 Statistical Physics
11.1 Ideal Solutions as Ideal Gases
11.2 Statistical Mechanics
11.3 Randomness
11.4 Brownian Motion
12 Waves in One Dimension
12.1 Standing Waves on a String
12.2 Standing Sound Waves in a Pipe
12.3 Wave Speed
12.4 The Speed of Sound in Air
12.5 Sinusoidal Travelling Waves
12.5.1 Doppler Effect with Moving Receiver
12.5.2 Doppler Effect with Moving Source
12.6 Superposition, and the Beat Frequency
12.7 Reflection of Waves
12.8 Reflection and Standing Waves
12.9 Energy Current on a String
12.10 Energy Current Density
12.11 Energy Current Density in Sound
12.12 Energy Current Density in Light
12.13 The Inverse Square Law for Intensity
12.14 Time Averaging: Mean Square Intensity
13 Waves in Two and Three Dimensions
13.1 The Huyghens Construction
13.2 Young's Experiment
13.3 Single Slit Diffraction
13.4 Waves in Refraction
13.5 Interference Colors
13.6 X-Ray Crystallography
13.7 The Electromagnetic Spectrum
13.8 Young's Experiment
14 Electric Charge and Potential
14.1 Static Electricity
14.2 Capacitance
14.2.1 Electrostatic Energy U_C
14.2.2 Electrostatic Potential Difference
14.2.3 Capacitors in Series and Parallel
14.3 Units and Values
14.4 Parallel Plate Capacitor
14.4.1 The Parallel Plate Formula for Capacitance
14.4.2 The Parallel Plate Formula for Electric Field
14.4.3 The Parallel Plate Formula for Potential
14.4.4 Motion Between Parallel Capacitor Plates
14.4.5 Oscilloscope and CRT
14.5 Aside on Coulomb's Law
14.6 Franklin's Bells
15 Electric Current
15.1 Potential V and Current I
15.2 Ohm's Law
15.3 Microscopic Form of Ohm's Law
15.4 Dissipation of Energy in Resistors
15.5 Resistors in Series and Parallel
15.6 Discharging a Capacitor
16 Bioelectricity, Electrochemistry
16.1 Excitable Membranes
16.2 Nerve Axons: Hodgkin-Huxley Theory
16.3 Galvani's Frogs and Volta's Piles
16.4 The Daniell Cell
16.5 Cathode and Anode
16.6 Half Cells
16.7 Using Batteries
16.7.1 Batteries in Series
16.7.2 Fuel Cells and Electrolysis
16.7.3 Sir Humphry Davy
16.7.4 The Telegraph
17 Magnetism
17.1 Magnetic Field Lines
17.2 The Magnetic Force on a Moving Charge
17.3 Mass Spectrometer
17.4 Spiralling Along the Field Lines
17.5 The Magnetic Force on a Current I
17.6 Generation of Electric Current
17.7 Faraday's Law and Relativity
17.8 Generating Alternating Current
17.9 The Magnetic Field due to a Current
17.10 Faraday's Law, Lenz's Law, and Self-Induction
17.11 Mutual Inductance and Transformers
17.12 Two Magnetisms?
18 Electromagnetic Waves and Resonance
18.1 Plane Waves and Polarization
18.2 Polarization in Nature
18.3 Scattering and the Index of Refraction
18.4 Light Transmission in Gases
18.5 Why is the Sky Blue?
18.6 Producing EM Waves
18.7 Hertzian Waves
18.8 Resonant Absorption and Emission
18.9 The Blackbody Spectrum
19 Quantum Mechanics
19.1 Quanta
19.2 Einstein and the Heat Capacity of Solids
19.3 Photons
19.4 The Hydrogen Atom
19.5 De Broglie Waves
19.6 The Heisenberg Uncertainty Principle
20 Nuclear Processes
20.1 E = Mc^2
20.2 Atomic Mass
20.3 Beta Decay
20.4 Alpha Decay
20.5 Radiation and the Body
20.5.1 General Considerations
20.5.2 Gamma Rays and X Rays
20.5.3 Beta Radiation
20.5.4 Alpha Radiation
20.6 Neutrons and Fission
20.7 Nuclear Reactors and Artificial Isotopes
20.8 Artificial Isotopes
Foreword
Everyone knows that physics is a rich subject, and most people are quick to
express their interest in it. Physics is the foundation of the natural sciences,
but it is also central to intellectual history. It has inspired new methods
and approaches in the social sciences and in finance. It has created new
mathematics. Through interpretations of what it has to say about our place
in the natural world, it touches even the humanities and religion.
Physicists themselves, however, take a much more restricted view, at
least in introductory textbooks. After a few generalities, they get right to
work on the motion of hypothetical point particles, and this may go on for
most of a semester, or even a year. A puzzled student learns that what is
most interesting in physics is still some distance away, and will come with
time. Those who go on in physics ultimately see the sense in this, but the
great majority of students are left with an introduction that never actually
goes anywhere. Worse, they may have come to see physics as a collection of
mathematical formulae to learn for professional school entrance exams, and
otherwise to forget. This is a sad transformation.
Perhaps the first course with calculus should get right to work on Newton's
description of the world, although it wouldn't hurt to explain a little
more clearly the logic of this approach. The first course without calculus,
though, does not have a clear rationale. Without calculus one cannot even
state Newton's second law. Yet a first physics course that is not founded on
Newton's mechanics is almost unthinkable. The non-calculus course has a
problem.
A look at history suggests another way to organize physics, one that might
be more meaningful to students, and might even be a truer introduction to
physics. One sees, in history, the importance of proportional reasoning from
the physics of the Greeks down to the present. This is a more elementary
level of mathematics than the calculus course assumes, but it does not
misrepresent the field. Quite the contrary, the idea of proportion in Nature is in
some ways the essence of physics.
Putting the emphasis on proportion, and related ideas like dimensional
analysis, leads to an ordering of topics that is quite different from the
standard text, at least initially. It essentially amounts to replacing the Newtonian
picture of point particles by continuum objects of geometry. The theory of
geometrical optics comes first, an intriguing blend of mathematical theory
and familiar experience, pervaded by the notion of proportionality. We take
a cue from history in choosing topics and ways of thinking that people
understood early, with simple mathematical arguments. Densities of various kinds
play a central role. Fluid phenomena are more prominent in this approach
than they would be in a standard text. The historically recent formalism
of vectors is de-emphasized and replaced, where necessary, by geometrical
arguments about horizontal and vertical projections, a way of thinking that
was never problematic in the past.
History also furnishes, frequently, a story line for topics in physics.
Without being strictly chronological, we still find that attention to the story
often organizes the subject in an interesting and engaging way. This method
frequently changes the order of topics. In electrostatics, for example, it is
natural to start with the idea of capacitance rather than Coulomb's Law.
The reader familiar with the standard text will see many examples of such
changes.
We have unashamedly tried to make this text a suitable preparation for
the physics portion of the MCAT exam. It seems to us that the MCAT,
without requiring calculus, is in fact a rather interesting test of physics, and
frequently probes real comprehension, and not just ability to calculate or
to memorize. Attention to history, to real phenomena, and to continuum
descriptions, it seems to us, is the most useful introduction to physics for
this exam, and beyond that, for integration into a liberal arts education.
Chapter 1
What is physics?
Most people, including most physicists, have trouble saying what physics is.
Dictionaries are not very satisfactory either. They tend to define physics
using the word "energy," for example, a word that has acquired its precise
meaning (the one intended here) only gradually, within physics. This says,
in effect, that you will know what physics is after you have studied it. But
really, it helps to know what a subject is before you study it.
A look at history gives a surprisingly simple answer: physics is the
mathematical theory of Nature. In the context of the present this is hard to see,
because now every natural science is mathematical, not just physics. When
you think about it, though, you realize that the other natural sciences are
defined by their subject matter. Biology studies the world of living things.
Chemistry studies the world of substances. Geology studies the world
itself. But if these sciences have become mathematical, it is because they have
imported ideas and methods from physics.
The influence of physics has transformed all the natural sciences, and
that is a very good reason for studying physics: all sciences now are part
physics. Physics takes all Nature as its subject matter, and in particular
Nature's mathematical structure. This is its defining characteristic. Thus
part of what physics studies is not Nature at all, but mathematical models
of Nature. Perhaps that is why it is hard to say what physics is.
It may be obvious now that Nature is mathematical, but until recently
it was universally held that the world is not mathematical, that it is too
complex and chaotic for that. Galileo's opponents, for example, ridiculed
the idea that mathematics had anything to tell us about what the world is
really like. Gradually, pioneers like Galileo began to discern simplicities in
the complexity, things so simple that we can describe them with mathematics
after all.
Physics is the mathematical theory of Nature, but it is not mathematics.
It is about real phenomena, in a way that mathematics itself is not. Here
too history can guide us. Where do mathematical theories of Nature come
from? What phenomena do they describe, and how does the description
work? We will pursue these questions by keeping real phenomena at the
center of attention, and as mathematics enters the picture, we will take time
to interpret it and dig out its meaning. This skill is probably the most useful
thing you can gain by studying physics.
1.1 Proportionality
Physics is the mathematical theory of Nature. Very often the mathematics
in question will be quite simple, just a statement of proportionality. It is
an oversimplification, but a useful one, to say that physics is the study of
proportionalities in Nature, both obvious ones and mysterious ones.
Proportionality is the thread that will run through this book: it is the mathematics
of our mathematical theory.
Proportionality is a relationship. Here is an example using the notions
of mass and weight. The words are common English words, and may even
seem to mean the same thing, but in physics one makes a careful distinction.
What is true is that mass m and weight W are proportional:
$$W \propto m \tag{1.1}$$
or, introducing the constant of proportionality g, which is called the
acceleration due to gravity,
$$W = mg \tag{1.2}$$
We want to be able to look at an expression like that and read its meaning.
One thing it says is that W and m are similar somehow. If m should double,
then W would also double: a more massive thing weighs more, by the same
factor. Note that we don't need the numerical value of the constant g to see
this. That value is irrelevant to the relationship we are getting at. That is
why g is actually omitted in one way of saying it, Eq (1.1). Are mass and
weight really just the same thing, going under different names, with a sort of
conversion factor g to switch from one name to the other? In that case it
would seem redundant to have two names for what is really one concept. It
must be that weight and mass really are conceptually different, in some way
that we haven't been told. Then the proportionality between two different
things becomes slightly surprising. It says we can measure mass by measuring
weight. In a kind of twist, it will turn out that the constant g actually is
not strictly constant, but depends on where you are, so it is a constant only
if you stay at one definite place. Thus another way to read Eq (1.2) is that
weight W is proportional to g at fixed m: now m becomes the constant of
proportionality, and we can measure g by measuring W. All these meanings
are contained in the simple statement above.
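The two readings of Eq (1.2) can be checked with a few lines of arithmetic. What follows is only a minimal Python sketch, not something from the text itself: the function name weight and the 5 kg test mass are our own illustrative choices, and g = 9.8 m/s^2 is assumed simply as a typical value on Earth.

```python
def weight(mass, g):
    """Return the weight W = m * g (newtons if mass is in kg and g in m/s^2)."""
    return mass * g

g_earth = 9.8  # assumed typical value on Earth, in m/s^2
m = 5.0        # hypothetical mass in kg, chosen only for illustration

# Reading 1: at fixed g, W is proportional to m.
# Doubling the mass doubles the weight; g cancels out of the ratio.
ratio_mass_doubled = weight(2 * m, g_earth) / weight(m, g_earth)

# Reading 2: at fixed m, W is proportional to g.
# Halving g halves the weight; now m cancels out of the ratio.
ratio_g_halved = weight(m, g_earth / 2) / weight(m, g_earth)

print(ratio_mass_doubled)
print(ratio_g_halved)
```

Neither ratio depends on the particular numbers chosen for m or g, which is the point: the relationship, not the values, carries the meaning.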
Statements like Eq (1.2) are sometimes called formulas, with the
expectation that we will put numbers in. We could put numbers in, of course,
but that is not really what the statement is about, nor what it is for. It is
really getting at the relationship between weight and mass, which is both
simple and subtle. Standardized tests of physics, like the MCAT, to take an
important example, will test your ability to read statements like Eq (1.2).
Your ability to put numbers into it is of no interest to anyone. Even if putting
in numbers does turn out to be part of the question, it will not be the
important part. Students who do not know about reading relationships will
waste their time memorizing formulas, hoping that when the time comes
they will choose the right one and put in the right numbers. This is a losing
strategy. Forget numbers! Concentrate on the relationships. That is what
you will be tested on. Learn to read the meaning. That is what this book
aims to help you do.
It is true that one should work towards a sense for magnitudes, a kind
of physical common sense. For this purpose it is important to think about
numbers after all, and to see how they are related in formulas, but these will
be typical values, and can be very approximate. This may seem a mysterious
remark, but we will be doing many rough estimates using approximate values
to get a feeling for magnitudes as we go.
Let us look at a typical MCAT question alluding to Eq (1.2): Gravity on
the Moon is only 1/6 that on Earth. A 6 kg mass on Earth is taken to the
Moon. What is its mass on the Moon?
(A) 1 kg
(B) 9.8 kg
(C) 6 kg
(D) 1.63 kg
The answer is (C). Answers (B) and (D) might appeal to someone who
knows the typical numerical value $g = 9.8\ \mathrm{m/s^2}$ on Earth, or who might
compute g/6. These values are irrelevant, because of their units if nothing
else. Answer (A) will appeal to someone who thinks we must divide by 6
because gravity is weaker on the Moon. The correct answer, that the mass
m in Eq (1.2) is an invariant quantity, goes to the real meaning of Eq (1.2).
Weight is something we are intuitively familiar with, but closely coupled to
it is the much simpler, invariant concept of mass. How did people figure out
that there exists a simple quantity, mass, related to weight by the
proportionality in Eq (1.2)? And what is mass? Eq (1.2) expresses a mathematical
relationship, but its real meaning is physical, not mathematical.
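Written out as ratios, the reasoning behind answer (C) looks like this. The worked form is ours rather than the text's, and the subscripts Earth and Moon are labels we introduce for illustration:

```latex
\[
\frac{W_{\mathrm{Moon}}}{W_{\mathrm{Earth}}}
  = \frac{m\, g_{\mathrm{Moon}}}{m\, g_{\mathrm{Earth}}}
  = \frac{g_{\mathrm{Moon}}}{g_{\mathrm{Earth}}}
  = \frac{1}{6},
\qquad
m_{\mathrm{Moon}} = m_{\mathrm{Earth}} = 6\ \mathrm{kg}.
\]
```

The factor 1/6 belongs to g, and therefore to the weight W; the mass m is the same on both sides and simply cancels.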
Our simple example question has led us up to the edge of a real mystery.
Mass, as simple and fundamental as it is, is not well understood. The masses
of the elementary particles, for example, are a continuing puzzle. For our
purposes we can just notice that this question about mass clearly alludes to
Eq (1.2), but not as a formula. You don't answer the question by putting
numbers into Eq (1.2). You answer it by knowing what Eq (1.2) is getting
at: the surprising concept of mass.
Looking at Eq (1.2) in this way, one realizes that there is actually a story
concealed in what looks like a trivial formula. The story would start with the
sense perception of weight, something everyone has a feeling for, and would
end with the abstract notion of mass. Understanding this story is in many
ways the important thing. It would be enough to answer the MCAT question,
for instance. Although we will not always take an historical approach, it will
be a help to our study of physics to know something about how it developed.
We therefore include a capsule history at the end of this chapter.
It would be very pleasant to learn physics through stories. The history of
physics is in fact full of stories. For each one you have to learn the physics to
understand the story, so this could even be a kind of method for learning. If it
is a story of a discovery, one cannot help imagining how the discovery actually
appeared to the discoverers. In the end, the mathematical representation of
the idea will be abstract, but attention to the story means it can never
be merely abstract. We will use this method occasionally, and even broaden
the definition of physics a little bit, to include its stories, and its connections
to history, the arts, and technology.
1.2 Two Kinds of Physics
Another slightly confusing thing about physics as a discipline is that there
are two almost contradictory ways that it uses mathematics. In one way
of looking at it, physics is endlessly mathematical: it has actually inspired
new kinds of mathematics, and continues to do so even today. This kind of
physics is associated with higher and higher mathematics, beginning with the
invention of calculus in the 17th century and continuing on to developments
that are not easy to describe in ordinary language, but are occasional grist for
popularizers talking about such things as curved space or uncertainty
principles. From this point of view, you can't have too much mathematics when
you are doing physics. The more the better. Experiments have frequently
confirmed the results of theories that could never have been conceived
without higher mathematics. This in itself is a very intriguing fact, and argues
that Nature is deeply and richly mathematical, beyond anything we could
have expected.
The way this very mathematical theory is actually used in practice,
however, has a very different flavor. Far from solving complex equations exactly,
most physicists, and others who use physics, work with rough estimates and
quick approximations. They have in mind mental pictures or models that
they apply to this or that situation. The models are essentially statements
of proportionalities, with pictures that go along with them to make them
easier to think about and visualize. The models are grounded in the ex-
act theories of physics, but their usefulness comes not from their exactness
but from their ability to approximate situations that are too complicated to
model exactly, which is to say, most situations. Physicists pride themselves
on being able to do useful computations "on the back of an envelope." Clearly
whatever they do on the back of an envelope can't be very much! But if it
cuts to the essential idea of what is going on, identifying an appropriate
model, and drawing some simple conclusion from it, then it is physics of the
highest order.
The two kinds of physics reflect two kinds of simplicity, since physics
always aims to bring out what is simple. Most people would say that the
mathematical theories and the elaborate experiments of physics are far from
simple, but in some ways this misses the essential point. A mathematical
theory, being unambiguous and completely defined, is in some sense very
simple. And an experiment that cools a sample to near absolute zero (to
get rid of thermal fluctuations), pumps down a vacuum around it (to get
rid of extraneous, perturbing matter), mounts everything on shock absorbers
(to eliminate unwanted vibrations), etc., looks, and is, elaborate, but at its
heart it creates a region of almost unnatural simplicity, in order to discern
the simple law that governs what is left. The complicated mathematics,
the complicated apparatus, are all meant to isolate something that is simple
enough to be comprehensible.
On the other hand, what may seem simple in a superficial sense, like
what we see when we walk down the street, is, from the point of view of
physics, fantastically, hopelessly complicated. The only way physics can say
anything about it is to take a "back of the envelope" approach: ignore
the true complexity, and just model the main things, roughly. This remark
applies to any situation that is not a precision experiment in physics. To
apply physics broadly one must be willing to approximate, literally throwing
away the precision that physics claims to have.
Professional physicists learn the first kind of physics first: the highly
mathematical kind. That is the tradition. They view the second kind of
physics, the "back of the envelope" kind, as a very sophisticated
understanding that they acquire only late in their training. We will be aiming,
however, at the second kind of physics. We will not spend much class time
developing a mathematical formalism. Rather we will often be thinking about
real situations and trying to understand them in terms of models that are
mathematical, but not overly so. This approach covers a lot of real physics,
and could be interpreted as aiming at higher sophistication in place of higher
mathematics. It means we can feel free to consider complicated, even
impossibly complicated, problems, not with the intent of solving them, but just
with the intent of understanding them quantitatively, roughly.
In practice, physics provides models of what is proportional to what.
Knowing what the models are, and what the proportionalities are, is real
physics for most practical purposes. The designers of the MCAT exam very
sensibly look for this kind of understanding, a broad and useful knowledge,
not mathematically deep. That is the subject of this book.
1.3 Learning Physics
There is a whole branch of physics (PER: Physics Education Research) de-
voted to how people learn physics. One intriguing conclusion of PER is that
people have innate ideas about how the world works that they have to give
up in order to learn physics. That just makes the point once again that the
simple models of modern physics and the simplicity of everyday things
are not the same. It is sometimes said that students begin as Aristotelians.
The reference is to Aristotle's physics, a 2000-year aberration, a dominant
idea from Roman times down to the time of Galileo. Aristotle's physics
claims to be common sense about the world, proved by elementary logic and
observation.
History alone, even without PER, would tell us that Aristotle's physics
must have had a powerful appeal to have lasted so long. It is noteworthy that
it is scrupulously non-mathematical, and also that it ultimately produced
nothing of value. Despite its apparent basis in observation and logic, nothing
in Aristotelian physics survived. It was simply and thoroughly wrong. And
yet it is apparently the intuitive idea of the world that we all unconsciously
begin with! This puts us on notice that physics, despite its emphasis on
simplicity, is psychologically not so simple.
To learn physics, we must sometimes unlearn what seems right, and accept
a subtly different idea in its place. This may take some conscious effort
to do. The conscious effort takes the form of recalling and recognizing
mathematical models, and using them.
The problem of how an object moves, for example, is modeled by Newton's
mechanics. You must translate the situation, as given, into the terms of the
model. In this case, you think of the object as made of little point masses,
with initial positions and velocities. Then you compute what the model
says about how those positions and velocities change in time. This may
involve geometry, algebra, or arithmetic. It might all be done in a picture
or diagram. There may be quick shortcuts; knowing about these would be
part of knowing the mathematical model. If it is very clear what to do, but
merely tedious, you could get a computer to help. In the end you translate
the computation back into a statement about the actual object. In summary,
you substitute the model for any intuitive ideas you may have, and you see
what the model says. Lastly, you think about the result, and try to see how
it makes sense. Ultimately you would like your intuition to agree with the
model, but that takes time to learn.
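
As a rough illustration of the procedure just described (translate the
situation into the model, compute, translate back), the sketch below treats a
dropped object as a single point mass falling under constant gravity.
Everything specific in it is an assumption made for illustration and does not
come from the text: the function names step and simulate_drop, the value
g = 9.8 m/s^2, the neglect of air resistance, and the choice of time step.

```python
def step(position, velocity, acceleration, dt):
    """Advance one point mass by a time dt, assuming constant acceleration."""
    new_position = position + velocity * dt + 0.5 * acceleration * dt ** 2
    new_velocity = velocity + acceleration * dt
    return new_position, new_velocity


def simulate_drop(height, g=9.8, dt=0.1, steps=10):
    """Model a dropped object as one point mass released from rest.

    Assumed for illustration: g = 9.8 m/s^2 and no air resistance.
    Returns the height (m) and vertical velocity (m/s) after steps * dt seconds.
    """
    y, v = height, 0.0  # initial position and velocity of the point mass
    for _ in range(steps):
        y, v = step(y, v, -g, dt)  # gravity points down, hence the minus sign
    return y, v


# Translate the situation into the model, compute, then translate back
# into a statement about the actual object.
y, v = simulate_drop(height=20.0)
print(f"After 1 s the object is about {y:.1f} m up, moving at {v:.1f} m/s.")
```

The last step in the text, thinking about whether the result makes sense, is
not something the code can do: a negative velocity here simply means the
object is moving downward.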
Whether we can really change our Aristotelian intuition is an interesting
question. When physicists talk about their physical intuition, it may be,
in the end, that they have simply learned to consult the mathematical model
more quickly and easily, through practice, which is not quite what we usually
mean by intuition. For some models, like quantum mechanics, it may fairly
be said that nobody claims to have made it intuitive.
On the other hand, physics is not a system of thought that forces you
to think in a certain way, and denies you any creativity. Historically it is
clear that what we now call physics provided, as people said, a new way to
philosophize. They meant using mathematics to understand the world, and
called it Natural Philosophy. It is true that learning these methods requires
discipline, but in the end you have more to think with, and more to think
about: new tools. No one is telling you what to think. But you will want to
use these tools effectively.
The usual first course in physics starts with the ideas of Newton. PER
first became interesting to physics teachers as it documented just how
unsuccessful the first course often is. A great deal of effort has gone into
improving the success rate of the introductory course, in the sense of raising
test scores on PER tests. Another possible interpretation, though, is that
perhaps Newton is not the best place to begin. Starting with Newton is
especially problematic if one does not use calculus, since that was Newton's
key innovation. Newton's mechanics is the beginning of that very fascinating
interaction between physics and higher mathematics that we have just said
we would not emphasize. Acquaintance with the history of science suggests,
rather, that the foundational ideas of physics are simpler and go back much
further. Newton famously said, "If I have seen farther, it is because I have
stood on the shoulders of giants." For us, starting the study of physics, it
might be worthwhile to ask who those giants were, and what Newton meant
by standing on their shoulders. Perhaps we could stand on their shoulders
too!
This book takes that suggestion seriously. All physical concepts have their
roots in simple, concrete phenomena. We have alluded to the familiar notion
of weight, for example. There may even be an intuitive, commonsensical way
to understand the phenomenon, as in Aristotle's principle that all earthy
matter seeks to go toward the center of the Earth. Why is this not enough?
What prompted anyone to take the next step? In this example, Archimedes'
Law of the Lever, showing that there is something profoundly mathematical
about weight, played a role that is too important to ignore. History can tell
us, if we pay attention to it, which ideas are really important, and worth our
time.
History also suggests that, in physics, patient contemplation of simple
things is rewarded. While other sciences seem to progress by accretion,
adding more facts and growing larger and more complex, physics, to an
amazing degree, returns again and again to its oldest problems and princi-
ples, with new eyes and deeper appreciation. In terms of learning physics,
it means we can afford to learn simple things, not hurriedly, but thoroughly.
You might think that we have already said more than enough about the sim-
ple relation W = mg, for example. Many people in 1910 would have agreed
with you, just as Einstein was reinterpreting this relation in an astonishing
way, soon to become his General Relativity Theory, a completely new and
unexpected theory of gravity.
We close this introduction with a capsule history of physics. It is the
story in outline, to be reconsidered in detail later.
1.4 A Capsule History of Physics
Physics begins with geometry, and geometry begins with Euclid. [1] Strangely
enough, we do not know very much about how this happened. The Hellenistic
Greek civilization that produced geometry and physics was centered in
Alexandria, Egypt, and, amazingly, no history of Alexandria survives.
We know a lot about Athens in its Golden Age, although that was much
earlier. We know Socrates, for example, better than we know the celebrities
in today's newspapers, even though he died in 399 B.C. But of Euclid,
who probably wrote around 300 B.C. in Alexandria, we know truly nothing,
except for his surviving works, chiefly the Elements. This work was, and
remains, the first resource of theoretical physics.

[1] I am much indebted to Lucio Russo's The Forgotten Revolution for the point of view
expressed here about classical civilization.
Euclidean geometry is still a subject everyone studies: it is the theory of
points, lines, and circles, starting from a few basic postulates about them.
The truth of the theorems of geometry is not in doubt, because all of these
theorems are proved, that is, justified by arguments that go back to the
postulates. This in itself is not physics, because the points, lines, and circles
are purely theoretical objects, not real objects. They obey the postulates by
definition.
It becomes theoretical physics, however, if we set up some correspondence
between real objects and these theoretical objects, if, for example, we suggest
that rays of light in Nature behave like straight lines in geometry. Suddenly
we have a wealth of predictions about how real light should behave, and the
ability to theorize about optical configurations that have never been built
before.
The Alexandrian Greeks invented this conception: a mathematical struc-
ture and a correspondence to Nature. That is what physics is. In the
work of Archimedes we have other examples of such structures and such
correspondences. Nearly two thousand years later Galileo Galilei, through
study of Archimedes, came to understand his methods and to pick up where
Archimedes had left off. Galileo discovered his own examples of such
correspondences. In just a few decades more, physics had reached a recognizably
modern form.
It is an intriguing question why there is a 2000 year gap in this story.
The answer is almost scary, like science fiction. The centers of Greek science
were conquered by the Romans between about 212 B.C. and 46 B.C. With
the arrival of the Romans, original work in the sciences came to an end.
The tradition was kept alive through reworking of some of the old texts
and commentaries, but active understanding and research stopped. The
Romans themselves had no clue. In the hundreds of years that they were
in possession of Greek cities, they never even translated Euclid into Latin,
for example. That was left to the Europeans of the Middle Ages, eager to
recover the knowledge of the classical past, and they translated, at least
initially, from Arabic. Why there was not an Arabic language counterpart
of Galileo is a particularly intriguing question, with repercussions that we
are still feeling. Islamic science inherited Greek science weirdly mixed with
later Roman influence, religion, and astrology. In the West, where the story
is better documented, one can see that it took about 500 years to separate
the Greek substance from the Roman nonsense. Perhaps that is a clue.
What the early translators found was fragmentary. It was clear, from
citations in works that had survived to works otherwise unknown, that much
classical knowledge was gone forever. Euclid wrote at least four books on
optics, for example, but only one of them survives, and that one is probably
the least interesting from the point of view of physics: it is a geometrical
theory of human vision, and addresses questions that we would probably not
put rst, such as why distant objects appear indistinct and why they are
eventually lost to sight if they are far enough away.
Some brilliant works of Archimedes survived, including his derivation of
the Law of the Lever, and his theory of buoyancy, Archimedes' Principle.
These are results that are still as fresh as when they were first discovered:
they are still a part of modern physics.
One of the last original Greek works from the Eastern Mediterranean,
written just before the Romans arrived, was Apollonius' study of the geometry
of the conic sections, the curves that you get when you cut a cone with
a plane, including the parabola and the ellipse. We don't know why he was
studying this, but it is startling that both these curves were discovered in the
early 1600s to be the paths of objects in nature, moving under the influence
of gravity. That key discovery was only possible because these curves were
known from the work of Apollonius, some 1700 years earlier. In fact all the
works mentioned here were enormously influential in the Renaissance, the
"rebirth," of physics.
Two lessons emerge from looking at this early history of physics. First,
it isn't enough to know just the results, the answers. The Romans were
fond of curious lore, and they knew the Greek results in a sense, but without
understanding them. Their misconceptions are sometimes even comical.
Vitruvius, a 1st century Roman author, is aware of Eratosthenes' determination
of the radius of the Earth, for example, although he has no idea how it was
done. As he describes it, it emerges that he thinks the Earth is a flat disk,
with ourselves at the center of it, and that it is the radius of this disk that
has been measured. Roman science, such as it was, is basically the same as
the later medieval bestiaries and fabulous travelogues. As far as science goes,
the Dark Ages began with the rise of Rome, not with its fall. The Romans
had no interest in the intellectual process behind the Greek scientific results.
They were only interested in the results. But to have a real understanding,
you need to know the process by which the answers were found. Science
declined in the Roman period because it wasn't practiced, only copied, with
growing incomprehension.
Second, even though classical knowledge in the custody of the Romans
was almost completely lost, it was never created again independently. Rather
the surviving bits of Greek science were carefully worked over and became
the nucleus of the new physics. This is really quite amazing. You would
think that physics, being the mathematical description of the elementary
processes all around us, would have been discovered many times over, and in
many places. This did not happen, though. It was discovered only once, by
the Hellenistic Greeks, and we are all their students.
By the mid-1600s Greek physics, together with new results of Galileo,
Kepler, and others, strongly hinted at a larger mathematical theory of Nature.
Optics in particular developed rapidly. Then in 1687 Isaac Newton threw
open the door to modern physics, in effect, with a new theory of the motion of
objects, and a new mathematics to implement it. With his calculus an infinity
of curves became available to model Nature, not just the straight line, circle,
etc. of classical geometry. All motion became the subject of a mathematical
theory, and in particular the motion of the planets, the most famous puzzle
in all of mathematical science, was solved and understood. It must have
seemed to people living then that human reason was on the threshold of
unlimited accomplishment. Writings from this period, the Enlightenment,
are wonderfully optimistic in this sense, and documents like the Constitution
of the United States of America embody the courage to attack old problems,
like governance, with the power of human rationality and a confident new
spirit.
Newton's theory of the universe was the paradigm for understanding all
physical phenomena for over 200 years. The picture is basically very simple.
The universe consists of small massy particles. These particles would move
in straight lines at constant speed in the absence of external influences. This
is quite a non-obvious assumption, because we virtually never see things
moving in straight lines at constant speed. The reason is that all particles
are subject to external influences: the effects of the other particles! Each
particle exerts a force on each other particle, causing its motion to depart
from simple uniform motion in a complicated way. The result is the world
we see.
The force in this description includes the universal gravitational force,
by which each particle attracts each other particle, a force that Newton
described with mathematical precision. It has the intriguing property of
getting weaker with the inverse square of the distance between particles (a
mysterious proportionality). There are also other forces, like the forces that
hold particles together to make solid objects, and forces of contact by which
one object resists being penetrated by another one, forces that Newton did
not claim to understand in any deep way. These forces too could be treated
within his theory in a practical, approximate way.
Through the 1700s and 1800s this picture was used to model the be-
havior of solids, liquids, and gases, now thought of as consisting of massy
particles (not necessarily atoms, but not inconsistent with atoms either). In
1785 Coulomb discovered that the electric force, familiar to everyone nowa-
days as the force of static cling in polymer fabrics, is also an inverse square
force between particles, much like gravity, but stronger, sometimes attractive
but sometimes also repelling. Experimental research in electrical phenomena
revealed that electric currents exert forces on each other, related somehow
to the mystery of magnetism. Accidental discoveries revealed that electricity
plays a mysterious but crucial role in living systems. Mary Shelley's
Frankenstein, published in 1818, is a window into the strangeness of this new
knowledge.
An interesting controversy developed in the early 1800s over the nature
of light. Newton had believed that light, like everything else in his mechanics,
consists of particles, but increasingly experiments, notably those of Thomas
Young (who also contributed to the decryption of the Rosetta Stone), sug-
gested that light is a wave, with a very small wavelength. There was already
a Newtonian theory of sound waves, described initially by Newton himself:
the particles of the air, or of a solid or liquid, move in undulatory ways in
a sound wave. Thus a Newtonian theory of light waves was also possible, as
the undulations of the particles of some medium. This is a bit peculiar, since
light, unlike sound, travels through a vacuum, but Newtonian ideas were so
firmly established that this seemed to prove that the vacuum is really a kind
of tenuous but rigid material!
In this context the vacuum was called the ether, taking its name from
ancient speculations about the material of the heavens, a fifth element.
James Clerk Maxwell's theory of the motions of this material, the stresses and
strains in it, became a unified theory of electricity and magnetism, including
light waves, to his immense astonishment and satisfaction. This superlatively
beautiful theory is still our theory of electricity and magnetism, although we
no longer believe it describes a material ether. The theory survives, while
the mechanical model that inspired it has fallen away. Maxwell's theory, as
we now understand it, is a break from the Newtonian model of the universe:
it includes things which are not massy particles, namely the electric and
magnetic fields, permeating space. Light is then undulations of these fields,
and hence a new kind of non-mechanical wave.
As Newton's ideas were developed in the 19th century, and reformulated
in various ways, a new quantity, energy, became increasingly important.
Energy is a truly mysterious thing, and it is difficult to recognize it as a single,
unified concept, because it is so changeable: it takes so many forms. With
this new concept, though, one has a somewhat different picture of the world,
as an arena in which energy is always being passed around. The forces that
were so important in Newton's original description are now just the way
particles pass energy among each other. But energy can also go from the
particles to the electric and magnetic fields, and we don't say that the
particles are exerting forces on the fields. So the Newtonian picture, containing
just particles and their mutual forces on each other, becomes a subset of a
bigger picture.
Energy also flows without any forces being involved from hot to cold bodies,
a mysterious process involving the concept of temperature that is outside
the Newtonian mechanical picture. Such energy can then be transferred to
other bodies through a force, in the good old Newtonian way. This kind of
energy flow, involving both heat and mechanical force, is the physicist's view
of heat engines, like steam engines, that burn fuel to do mechanical work,
and the basis of the branch of physics called thermodynamics.
One very successful attempt to bring thermodynamics back within the
Newtonian framework is called statistical mechanics, and is still a lively
research area. It basically assumes that the mysterious flow of heat, the
non-mechanical transfer of energy, really is mechanical, but at a microscopic
level, where the motions of particles are essentially random, and must be
treated statistically. The phenomenon of Brownian motion, first noticed by
the botanist Robert Brown as the apparently ceaseless random motion of
pollen grains in water, was recognized only in the 20th century by Einstein
as a visible manifestation of this random microscopic mechanical energy.
Brownian motion is an exception, in that it shows what is going on at
very small length scales while still being large enough to see in a microscope.
Much of what came to occupy physicists in the 20th century is too small to
see in any conventional sense, but could still be seen in a different sense: the
things themselves are not seen, but their vibrations are seen very easily!
This was first done in identifying the chemical elements in compounds
through the flame test. A bit of unknown material might produce a green
flame, indicating the presence of copper, when held over a Bunsen burner.
Robert Bunsen and Gustav Kirchhoff at the University of Heidelberg refined
this method by looking at the spectrum of the green light with a prism,
and found that typically only certain well defined colors (or wavelengths,
or frequencies, which are equivalent ways to say it) were present in the
emitted light: in the case of copper, for instance, not just green, but a very
precise, definite green, with a definite frequency, and perhaps other definite
frequencies at lower intensity, which were only noticeable when the light was
dispersed through the prism. This method of looking at the spectrum of
flames led to the rapid development of the field of spectroscopy, and large
catalogues of the natural frequencies of the chemical elements and their
compounds.
These frequencies were useful to know for identification purposes, even
while their origin was completely mysterious. A striking example is provided
by astronomy. Around 1850 Auguste Comte, the philosopher of Positivism,
had given an example of something we could never know, even in principle:
the nature of the material that makes up the stars. Just a few years later,
Bunsen and Kirchhoff analyzed starlight by the methods of the new
spectroscopy, and found exactly the frequencies that correspond to the lighter
chemical elements on Earth. One unfamiliar family of frequencies, found in
sunlight, was hypothesized to indicate a new, unknown chemical element,
and was named helium from the Greek for sun. Helium was later discovered
on Earth as well, in natural gas. Thus spectroscopy provided a kind of win-
dow into the submicroscopic world, and suggested that for practical purposes
we could envision that world as made of oscillators with mysterious, definite
frequencies. It is the oscillators that we see.
The microscopic nature of these oscillators became clear only gradually,
and by a circuitous route. First, experiments in Crookes tubes, glass tubes
with a good vacuum inside and electrodes leading to the outside where they
could be connected to a high voltage source, showed glowing cathode rays.
The setup is rather like what we now use for fluorescent lighting, except
that in a Crookes tube the vacuum is better and you use direct current,
not alternating current. The cathode rays turned out to be a stream of
negatively charged particles, accelerated by the applied high voltage, striking
the residual gas atoms left in the tube and making them oscillate (emit light
at their characteristic frequencies). These electrical particles were named
electrons, and proved to be one of the fundamental constituents of matter,
drawn out into space in the Crookes tube, where they could be studied apart
from the complications of solid materials.
Second, the Crookes tubes at high enough voltage produced X-rays (Roent-
gen, 1895), so named because their nature was initially mysterious, much
more penetrating than cathode rays, capable of going right out through the
glass, and through solid, opaque material as well. Wilhelm Roentgen won
the rst Nobel Prize in Physics, in 1901, for this marvellous discovery.
Third, at almost the same time as this discovery, natural radioactivity
was discovered in certain minerals (Becquerel, 1896). It included penetrat-
ing radiation, like X-rays, but with no need for a high voltage source, and
also ionizing radiation, not so penetrating, which turned out to be high en-
ergy charged particles, like cathode rays, but both positively and negatively
charged. These phenomena were clues to the microscopic nature of matter in
themselves, and they also provided tools for studying it, since even without
knowing their nature precisely, beams of these radiations could be directed
at thin material targets. How would they be deflected? Such experiments
are in principle easy to do.
The most important result of such scattering experiments was the dis-
covery by Geiger and Marsden (1911) in the laboratory of Ernest Rutherford
that the positive charge in matter is concentrated in very small, massive
nuclei, while the light electrons spread out to fill up most of the space.
Radioactivity tells us that matter is made of charged particles. Thus
the oscillators of spectroscopy must somehow represent oscillations of these
charges. It would make sense to try to imagine configurations of charges
that would oscillate at the observed frequencies, but such attempts never
succeeded. The actual structure is completely different. Rather the observed
frequencies correspond to energies of the charge configurations, not
frequencies! The reason these two things could be confused is a truly mysterious
proportionality between energy and frequency in light.
This proposal of light quanta, or photons, from Albert Einstein won the
Nobel prize many years later, in 1921, but when it was proposed in 1905
it seemed crazy even to Einstein's admirers, as we know from letters. It
was the key, though. The frequencies of the oscillators, as inferred from
spectroscopy, actually are telling us not about frequencies, but about the
differences in energy levels of the microscopic entities of matter. These
energies can be computed from amazingly straightforward models, resembling
Newtonian models (atoms somewhat like the solar system, for example),
with Coulomb's inverse square force providing the interaction, and
everything moving in a space where Euclidean geometry still describes the basic
spatial relationships. In the new setting, called quantum mechanics, only
certain energies are allowed. Classical analogues can still guide us at the
scale of atoms: this is fantastic luck. It means we have a theory, quantum
theory, of atoms, molecules, solid state electronics, etc., that is the basis
for much of modern technology.
Einstein is best known popularly for his Theory of Relativity. This
quintessentially 20th century theory illustrates the unity of physics across
the centuries, because it takes up again a question that had been discussed
in antiquity, namely how we can tell if we are moving. Einstein's short answer
is that we can't tell. We can only tell if we are moving relative to something
else. Thus the idea of being "at rest" has no absolute meaning. This question
arose in antiquity and in Galileo's time with reference to the motion of
the Earth (why don't we notice it if we are spinning along at hundreds of
miles per hour?) and in Einstein's new formulation it had still other startling
consequences that we leave for Chapter 20.
Despite the continued success of Euclidean geometry even at the atomic
length scale, the limitations of Euclidean geometry had begun to be found.
A consequence of Einstein's 1905 relativity theory, pointed out three years
later by Hermann Minkowski, is that Euclidean space would henceforth have
to be augmented with a fourth dimension, time, and might better be called
spacetime. This spacetime is not Euclidean, in the sense that the natural
notion of length is quite different from length in familiar geometry: so different
in fact that it can be positive, negative, or zero. (Euclidean length is always
positive.) A very daring extension of this idea, general relativity, suggests
that spacetime is distorted by energy, and that the apparently curved paths
followed by particles coasting along under the influence of gravity, like home
runs and planetary orbits, are actually the straight lines of the spacetime.
This non-Euclidean theory of gravity has been confirmed in a number of
crucial experiments.
One of the outstanding problems of modern physics is to reconcile the
non-Euclidean geometry of gravitation theory with quantum mechanics. The
ideas proposed in connection with this problem are even more daring exten-
sions of geometry, although there is no consensus on what the outcome will
be.
Chapter 2
Mathematical Tools
This chapter is all about the mathematics of proportionality, in various
guises, with a few related ideas on the side. It starts with the idea of propor-
tionality in arithmetic, and moves to the idea of proportionality in geometry.
These are things you already know, so the point isn't to learn them as if for
the first time, but rather to aim at a more sophisticated understanding of
them.
On the arithmetic side we count the straight line graphs we get when
we graph proportionality relationships. A related idea is graphing data and
finding that they fall on a straight line, always a good method for analyzing
data if you can use it. The idea of detecting a power law in data by graphing
the logarithms may be new to you. It is a very important notion! Changing
units is an essential use of proportion, as important as it is familiar.
On the geometric side we recall that similar triangles are proportional to
each other, and we especially recall the ratios that define the trigonometric
functions like sine and cosine. There is one thing here that might be new,
the small angle approximation for trigonometric functions.
We begin with something you learned in grade school, but try to dress
it up in an interesting way, to get a feeling for the (rather low) level of
mathematics in the early Renaissance, when the story of modern physics
begins.
2.1 Proportion
If you had been born into a 15th century Italian merchant's family, you
would have been sent to an abacus school to learn the mathematics you
would need for life. This would have been basic arithmetic, followed by the
culmination of your mathematical education, the Rule of Three. Apart from
some practical geometry, this rule was basically all that higher mathematics
had to offer you, but it was surprisingly effective. With the Rule of Three the
great fortunes of Islamic civilization and Renaissance Europe were founded.
And with the Rule of Three physics was reborn. We still use it, although not
by that name.
The Rule of Three is an algorithm for solving problems like "If 3 pounds
of wool cost 5 ducats, how much do 7 pounds cost?" As the merchants saw
it, you are given three numbers, and you are to find the fourth. The rule says
that you multiply two of the numbers, and divide by the third. There were
ways to identify which number was which, in any such problem. To look at
one of the old abacus school textbooks is to realize that you could learn how
to solve such problems with almost no understanding, just by rote.
Algebra was invented in part to make it clearer what was going on in
this problem, because there is an unknown here, and you are to find it. A
graph also helps to make the meaning clear, but this innovation came later.
In various forms, we will be meeting this problem again and again, so let us
start with it.
The unspoken assumption behind the Rule of Three (using the problem
about ducats and pounds as an example) is that ducats D and pounds P
are proportional. That is, if you want twice as many pounds, it will cost you
twice as many ducats, or
$D \propto P$ (2.1)
It is exactly the same thing to say there is some constant of proportionality
k, such that
D = kP (2.2)
This assumption has nothing to do with the particular numbers given in
the statement of the problem. It is more abstract. D and P are variables,
related by proportionality. We even have a diagrammatic representation of
this proportionality relationship, Fig 2.1, in which all possible pairs (P, D)
Fig. 2.1: The proportionality between ducats D and pounds P is represented in a graph. (Vertical axis D, horizontal axis P, each running from 0 to 10; the unknown value is marked x.)
are visualized as a straight line through the origin. No particular values have
any special importance. It is a picture of the relationship. In the figure,
however, we have singled out with dashed lines the values that were given in
the problem, P = 3, D = 5, and P = 7, along with x, the value we were to
find. Notice that this graphical solution to the problem does not even use
arithmetic!
The constant of proportionality k in Eq (2.2) is the slope of this line,
namely k = D/P for any point (P, D) on the line. As you go over by P
you go up by D, so slope is a measure of the steepness of the line. But we
are given a point on the line, namely (3, 5). Starting at 0, we go over 3
and up 5. Thus we are given k = 5/3 (ducats/pound), the price per pound.
Knowing k we can find D for any P. In particular if P = 7 pounds, then
D = kP = (5 ducats/3 pounds) × (7 pounds) = 35/3 ducats. As the Rule of
Three said, the answer is found by multiplying 5 and 7, and dividing by 3.
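The same calculation can be written as a few lines of Python. This is only a sketch: the function name rule_of_three is our own invention, and exact fractions are used so that 35/3 is not rounded.

```python
from fractions import Fraction

def rule_of_three(known_p, known_d, new_p):
    """Return the D that goes with new_p, assuming D = k * P."""
    # k is the constant of proportionality (here, the price per pound).
    k = Fraction(known_d) / Fraction(known_p)
    return k * new_p

cost = rule_of_three(3, 5, 7)   # 3 pounds cost 5 ducats; what do 7 pounds cost?
print(cost)                     # 35/3
```

Note that the function never asks what the numbers mean. All of the content is in the assumption D = kP, which is the point made in the text.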
Doing out the arithmetic, though, even in so simple a problem, obscures
the basic idea, which is not really about numbers. The real idea is given
by the relationship between the variables in Eq (2.1) or (2.2), or Fig 2.1.
Actually putting in the numbers is not very illuminating. Pity the poor
schoolchildren of the Renaissance who did put the numbers into problem after
problem of this type, but never learned to read a relationship like Eq (2.2),
or a graph like Fig 2.1, summarizing the whole idea!
2.2 Units
What we were really saying in the previous section is that money M is
proportional to wool W, i.e.
$M \propto W$ (2.3)
These quantities are not numbers, though! Wool is a real thing, and money
is a real thing. They are represented by numbers only if we measure them,
and to measure them we must choose units, like pounds for wool and ducats
for money. The units are arbitrary choices, but the relationship Eq (2.3)
is true, quite apart from these choices. Thus proportionality is more than
just a numerical relationship. A better way to represent the problem in a
graph is Fig 2.2, which is almost the same as Fig 2.1, but represents the
real quantities on the axes, and indicates the units in which they have been
measured. If we changed units, the numbers would change, but the meaning
would be exactly the same.
To convert from one unit to another, you use the Rule of Three once
again, because the measures of things in different units are proportional. If 3
ducats is the same quantity as 4 florins, then what is 9 ducats? You multiply
9 ducats by the constant of proportionality k = 4/3 (florins/ducat) to get
9 × 4/3 = 12 florins. A peculiar thing about this conversion is that, since
4 florins = 3 ducats, k, their ratio, is 1! Thus, when we convert units, we
are just multiplying by the number 1, in a peculiar form. For the purpose
of arithmetic it looks like the number 4/3, but because it has units, and is
really 4/3 (florins/ducat), it is the number 1. It would be a serious mistake
to omit writing the units! When we convert, we are expressing the same real
quantity of money, 9 ducats: the quantity does not change, although the
number changes from 9 to 12 (florins).
Converting units kept a lot of Renaissance accountants busy, and it has
kept scientists busy too. Gradually, over the centuries, units have been stan-
dardized and the process of conversion has, for the most part, been made as
simple as is reasonably possible, but there is no escaping the frequent need
Fig. 2.2: Money in this transaction is proportional to wool, but the numbers depend on the units you choose. (Vertical axis: Money (Ducats); horizontal axis: Wool (Pounds); each running from 0 to 10.)
to do it. It is a skill that we, like Renaissance schoolchildren, should simply
acquire, and be able to use accurately.
Since centimeters and meters are related by 100 cm = 1 m, for example,
the number 1 in the form
$1 = \frac{100\ \mathrm{cm}}{1\ \mathrm{m}}$ (2.4)
can be used to convert from meters to centimeters. If you are 1.76 m tall,
you are
$1.76\ \mathrm{m} = 1.76\ \mathrm{m} \left( \frac{100\ \mathrm{cm}}{1\ \mathrm{m}} \right) = 176\ \mathrm{cm}$ (2.5)
i.e. 176 cm tall. Notice how the unit "meters" cancels in numerator and
denominator in the middle step, leaving centimeters. We can also say the
conversion factor is $10^{2}$ cm/m, using power of 10 notation, or $10^{-2}$ m/cm,
the form we would use to convert from centimeters to meters. The whole
point of the metric system is that these changes should just require shifting
the decimal point. To change to a non-metric unit of length like inches, we
could use 2.54 cm = 1 in, so that your height in inches is
$176\ \mathrm{cm} = 176\ \mathrm{cm} \left( \frac{1\ \mathrm{in}}{2.54\ \mathrm{cm}} \right) = 69.3\ \mathrm{in}$ (2.6)
In short, to convert units, simply multiply by 1 in the appropriate form, and
take time to write everything out carefully. Even within the metric system
it would be easy to move the decimal point the wrong way. With the above
method, if you take the time to use it, it is almost impossible to make this
mistake.
Here is a unit conversion that many people get wrong. Let's make sure
we get it right! If a certain area is 2 square meters, what is it in square
centimeters? We do it out carefully:
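$2\ \mathrm{m}^2 = 2\ \mathrm{m}^2 \left( \frac{100\ \mathrm{cm}}{1\ \mathrm{m}} \right) \left( \frac{100\ \mathrm{cm}}{1\ \mathrm{m}} \right) = 2 \times 10^{4}\ \mathrm{cm}^2$ (2.7)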
Just notice that to convert meters squared we need the conversion factor
between meters and centimeters twice.
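The conversions in Eqs (2.5) to (2.7) can be checked with a short Python sketch. The names convert, CM_PER_M and CM_PER_IN are our own, chosen for illustration. The power argument records how many times the conversion factor is needed: once for a length, twice for an area.

```python
CM_PER_M = 100.0    # 100 cm = 1 m, the "number 1" of Eq (2.4)
CM_PER_IN = 2.54    # 2.54 cm = 1 in

def convert(value, factor, power=1):
    """Multiply value by factor**power (power=2 for areas, 3 for volumes)."""
    return value * factor ** power

height_cm = convert(1.76, CM_PER_M)              # meters -> centimeters
height_in = convert(height_cm, 1 / CM_PER_IN)    # centimeters -> inches
area_cm2 = convert(2.0, CM_PER_M, power=2)       # square meters -> square centimeters

print(round(height_cm), round(height_in, 1), area_cm2)
```

The code carries only the numbers. Keeping track of which unit cancels which is still up to us, which is why writing the units out by hand remains worth the trouble.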
In the metric system every basic unit can be used with prefixes to derive
other units, related to the basic unit by a power of ten. Below we give most
of these prefixes, together with their combinations with the meter.
Prefix    Effect on -meter    Full name             Abbreviation
Mega-     $10^{6}$ m          Megameter(?)          Mm
kilo-     $10^{3}$ m          kilometer             km
centi-    $10^{-2}$ m         centimeter            cm
milli-    $10^{-3}$ m         millimeter            mm
micro-    $10^{-6}$ m         micrometer, micron    µm
nano-     $10^{-9}$ m         nanometer             nm
          $10^{-10}$ m        Ångström              Å
pico-     $10^{-12}$ m        picometer             pm
The Ångström is not really a metric unit, despite being related to the meter
by a power of 10, but it is a convenient length, being about the size of an
atom. It is no longer used, but one sees it in older literature. I have never
seen the Megameter actually used. One says "thousands of kilometers."
2.3 Data: Straight Line Plots
In Figure 2.2 it was assumed that the two quantities, wool and money, were
proportional. The straight line in that figure is a theoretical line. You could
imagine testing this theory by actually observing transactions in the market-
place. Every time someone bought wool, you would record the amount P of
wool, and the price D. Then you would plot these data as points in a graph
in the (P, D) plane. Each transaction is a point. It might look something like
Fig 2.3. In this made-up example we see that the data do indeed lie along
Fig. 2.3: A scatter plot of transactions in the wool market. The data suggest proportionality, as indicated by the dashed line. (Vertical axis: D (Ducats); horizontal axis: P (Pounds); each running from 0 to 10.)
a straight line indicating proportionality. Thus, we could conclude, it is
established by experiment that $P \propto D$. Not only that, but we could introduce
the constant of proportionality k, and say D = kP, where the value of k can
be read from the data as the slope of the line that fits, indicated as a dashed
line in Fig 2.3. This k would be an empirically determined price-per-pound.
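One common way to read k from scattered data, instead of drawing the line by eye, is a least-squares fit of the model D = kP. The sketch below assumes that method, which the text does not prescribe, and the transactions listed are invented purely for illustration.

```python
def slope_through_origin(pounds, ducats):
    """Least-squares slope k for the model D = k * P (line through the origin)."""
    numerator = sum(p * d for p, d in zip(pounds, ducats))
    denominator = sum(p * p for p in pounds)
    return numerator / denominator

# Hypothetical market transactions (made up, like the data in Fig 2.3).
pounds = [1.0, 2.0, 3.0, 4.5, 6.0]
ducats = [1.8, 3.2, 5.1, 7.4, 10.1]

k = slope_through_origin(pounds, ducats)
print(f"empirical price per pound: {k:.2f} ducats/pound")
```

If the data lay exactly on a line through the origin, this would return the slope of that line. With scatter, it returns a compromise value.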
This way of handling data is surprisingly useful and effective. You can
tell in a graph if the points seem to lie along a straight line. As in this
fictitious example, you wouldn't expect empirical data to line up perfectly,
but the linear trend is obvious. To take a more physical example, we know
that when you dive down under water, the pressure P increases with depth
D. Suppose we have some kind of pressure gauge and we measure actual
pressure vs depth data. It might look like Fig 2.4. You can see in the figure
that gauge pressure and depth are proportional, even though the units are
Fig. 2.4: The data indicate gauge pressure is proportional to depth. (Vertical axis: Gauge Pressure P (arbitrary units); horizontal axis: Depth D (arbitrary units).)
not indicated. They are unnecessary to make this point. If you wanted to
know the constant of proportionality for later use, though, you would need
to know the units.
Like all proportionality relationships, the line in Fig 2.4 goes through the
origin. That is, when the depth D is zero, the pressure P is zero. Is this really
true, though? When we say D = 0, we mean the surface of the water, before
we start to go down. Is the pressure zero there? That is what our gauge said,
but in fact, it must be measuring pressure with reference to the pressure at
the surface, or, to put it more simply, the change in pressure from the pressure
at the surface. A closely related concept is the absolute pressure, and if we
were to measure absolute pressure somehow, we would find the pressure at
the surface is not zero, but a positive value called atmospheric pressure.
A plot of absolute pressure P vs depth D might look like Fig 2.5. This is
just like Fig 2.4, except that now all the pressure values have atmospheric
pressure included. In particular, at D = 0, we can read off that the pressure
is not zero, but a positive value, which must therefore be the atmospheric
pressure. It is still true that the change in pressure $\Delta P$ is proportional to
the change in depth $\Delta D$, measuring from the surface.
This use of the Greek letter $\Delta$ (delta) to indicate a change in a quantity
takes getting used to. Here P means absolute pressure, but $\Delta P$ means the
Fig. 2.5: The absolute pressure is a linear function of depth. The change in pressure $\Delta P$ is proportional to the change in depth $\Delta D$. (Vertical axis: Absolute Pressure P (atm), marked 1 to 3; horizontal axis: Depth D (meters), marked 10 to 30.)
change in P from some other point. In Fig 2.5 it is indicated that the change
is measured from the surface, where D = 0. Thus $\Delta P = P - P_{\mathrm{atm}} = P_{\mathrm{gauge}}$
in this case, and $\Delta D = D - 0 = D$. In particular, it is crucial in using
this notation to be clear that we are not multiplying by something called
$\Delta$! Rather, we are finding the change from some other (D, P) point, which
must be understood if the notation is to make sense. In Fig 2.5 we mean the
change from $(0, P_{\mathrm{atm}})$.
We can express this linear relationship between P and D by
$P = P_{\mathrm{atm}} + kD$ (2.8)
This is the slope-intercept form of the equation of a straight line. When
D = 0, $P = P_{\mathrm{atm}} = 1$ atm, and $P_{\mathrm{atm}}$ is therefore the intercept on the P axis.
The slope k of the line is
$k = \frac{\Delta P}{\Delta D} \approx 0.1$ atm/m (2.9)
Thus k is also the constant of proportionality relating gauge pressure to
depth. This is only saying that the slopes of the lines in Fig 2.4 and 2.5 are
the same. They are really the same line, with one of them moved up by $P_{\mathrm{atm}}$.
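As a small numerical illustration of Eqs (2.8) and (2.9), the sketch below tabulates absolute and gauge pressure at a few depths. The function names are ours, and the slope is the approximate value 0.1 atm/m quoted above, so the numbers are rough.

```python
P_ATM = 1.0   # intercept of Eq (2.8): atmospheric pressure, in atm
K = 0.1       # slope from Eq (2.9), in atm per meter (approximate)

def absolute_pressure(depth_m):
    """Eq (2.8): a straight line with intercept P_ATM and slope K."""
    return P_ATM + K * depth_m

def gauge_pressure(depth_m):
    """Change in pressure measured from the surface, Delta P = P - P_atm."""
    return absolute_pressure(depth_m) - P_ATM

for depth in (0, 10, 20, 30):
    print(depth, absolute_pressure(depth), gauge_pressure(depth))
```

The two columns of pressures differ by the constant P_ATM at every depth, which is the statement that the two lines have the same slope.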
2.4 Uncertainty in Data
The data in Figs 2.4 and 2.5 scatter about the straight line, and don't lie
precisely on it. Such scatter is a property of all measurement. It is often
called measurement error, but this term suggests that there is some kind
of mistake. A better name is measurement uncertainty. This suggests,
correctly, that uncertainty is an inherent property of measured values, and
not a mistake. Scientists must simply learn to deal with it.
Measurement uncertainty is probably the reason Aristotelian philosophers
believed mathematics could not describe our world. After all, if all measurement
is uncertain (and it is!) then how can you trust it? This is a deep
question.
The most important method we have to deal with measurement uncer-
tainty is to keep track of its estimated size through the measurement process,
and to indicate the size of the uncertainty when we talk about the data. How
good are the data? This is a crucial piece of information, which should al-
ways be included. In Figs 2.4 and 2.5, although the data are made up, we
could imagine that the size of the Xs representing the data points is meant
to indicate the uncertainty in each point. An important convention in quoting
numerical data is to give only the significant digits, with the last digit
uncertain. This is a quick and easy way to indicate uncertainty.
Let us think how the measurement of pressure vs depth might have been
made, and estimate the uncertainty. A pressure gauge is lowered down on
a rope. When we have paid out 100 meters of rope, we will say the depth
is D = 100 m. What is uncertain about this? Well, the rope might not
be hanging straight down. Perhaps we are on a boat that is being moved a
bit by the waves and wind, or perhaps there are currents down there which
carry the rope somewhat sideways. We know that D = 100 m is about right,
but we also know that these effects could make a difference of 1 meter or so.
We should probably say D = 99 ± 1 m (since these effects tend to make the
depth D less than the rope paid out). If we know something about how the
pressure P is measured, we could also estimate how uncertain P is. In this
way we keep track of the uncertainty. As you do an experiment, you may
find checks of consistency. The size of the scatter, that is, the size of the
uncertainty, is determined along with everything else. You may even find
ways to beat the uncertainty down, but never to eliminate it altogether!
A good experimentalist always thinks about the measurement and the
uncertainty at the same time, not because the uncertainty invalidates the
measurement, but just the reverse: because knowing the uncertainty makes
the measurement meaningful. If we can't estimate the uncertainty, a measured
number is meaningless.
Galileo, at the very beginning of modern physics, somehow had a very
good feeling for this subtlety of the measurement process. Without it he could
not have been so sure (and he was sure) that mathematical relationships
are verified in Nature. The conceptual difficulty in making this step can
hardly be overstated. It is part of Galileo's genius that he was able to hold
together in his mind the perfection of a simple mathematical theory and the
messiness of real data, and to believe that they somehow corresponded. He
put it like this:
Just as the merchant who wants his calculations to deal with
sugar, silk, and wool must discount the boxes, bales, and other
packings, so the mathematical scientist, when he wants to
recognize in the concrete the effects which he has proved in
the abstract, must deduct the material hindrances, and if he is
able to do so, I assure you that things are in no less agreement
than arithmetical computations. The errors, then, lie not in
the abstractness or concreteness, not in geometry or physics,
but in a calculator who does not know how to make a true
accounting.
For Galileo, dealing with error meant recognizing that data come to us with
some extra wrapping! That's not a bad picture to keep in mind.
Galileo seems to say that our mathematical theories are exact, and that
it is only "material hindrances" that make this difficult to see. There are, in
fact, a few systems that are so simple that we can (now) make very accurate
predictions and also make very accurate measurements, and in these few cases
theory and experiment agree, so far as anyone can tell, exactly. These simple
systems include the motions of planets and satellites, moving in a vacuum,
governed by the gravitational force; and the natural frequencies of atoms in
a vacuum, governed by quantum mechanics. One should also mention the
behavior of a single electron in a magnetic field.
Apart from these simple systems, though, material hindrances play a
much bigger role. Physics still has something to say about situations that are
not artificial physics experiments, but it is rough and approximate. Textbook
problems are often worded as if all quantities were known exactly. In most
settings, though, the mathematical model that we call physics is an idealiza-
tion, deliberately leaving things out to keep the model simple. The things
we leave out, if they are truly not very important, may manifest themselves
as small systematic discrepancies in what we expect and what we actually
observe. They may also manifest themselves as a small apparent randomness
around an average value.
Making detailed, precise predictions from physical theories usually re-
quires higher mathematics. That is not what this book is about. Using
physics in a rough and ready way, though, does not at all require higher math-
ematics, but only a knowledge of proportion, and this is the way physics is
actually used most often in practice: as a method of approximate prediction.
2.5 Dimension
No matter how we change the units, a length remains a length. Any length
is said to have dimension [L]. This is a rather peculiar use of the word
"dimension." It simply means that the quantity in question is a length,
and could therefore be measured in meters or inches, or any other unit of
length, without committing us to a particular choice. The quantity D in the
preceding section had dimension [L], for instance.
An area, for example the base × height of a rectangle, being the product
of two lengths, is said to have dimension $[L^2]$: dimension obeys the rules of
algebra when you multiply. A product carries the dimensions of its factors.
This dimension tells us that the units of area could be $\mathrm{m}^2$, or $\mathrm{in}^2$.
A volume, being the product of three lengths, say length × width ×
height, has dimension $[L^3]$, and could be measured in units $\mathrm{m}^3$ or $\mathrm{cm}^3$ (cubic
centimeters, or cc's, as they say).
A conversion factor like (100 cm)/(1 m) in Eq (2.4) is a ratio of two
lengths, and hence dimensionless. Such things are pure numbers. A conversion
factor, specifically, is always the ratio of a quantity to itself (in different
units), so it is always the pure number 1.
There are only two other familiar quantities in physics that behave like
length in this way, namely time, which carries the dimension we denote [T],
and mass, which carries the dimension we denote [M]. Products of physical
quantities that carry these dimensions continue to carry them. Dimension is
a property that persists in statements of proportionality. Therefore the arith-
metic of physical quantities is not just the ordinary arithmetic of numbers:
all physical quantities carry with them their dimension.
It is a mysterious thing that most of the familiar physical quantities, like
force, velocity, energy, stress, diffusion constant, momentum, viscosity, etc.
are all products of powers of just three kinds of things, [M], [L], and [T]. (We
will later introduce one more, electric charge [Q].) This confirms, perhaps
more clearly than anything else, that physics is about proportionality. All
physical quantities are, dimensionally speaking, products of just a few types
of quantities, and products are proportional to their factors.
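The statement that dimension obeys the rules of algebra can be made concrete in a few lines of Python. This is a sketch of our own devising, not a standard library: a dimension is stored as the three exponents of [M], [L], and [T], and multiplying quantities adds the exponents.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Dim:
    """Exponents of mass, length, and time in a dimension [M^a L^b T^c]."""
    M: int = 0
    L: int = 0
    T: int = 0

    def __mul__(self, other):
        # A product carries the dimensions of its factors: exponents add.
        return Dim(self.M + other.M, self.L + other.L, self.T + other.T)

    def __truediv__(self, other):
        # A ratio subtracts exponents.
        return Dim(self.M - other.M, self.L - other.L, self.T - other.T)

MASS, LENGTH, TIME = Dim(M=1), Dim(L=1), Dim(T=1)

area = LENGTH * LENGTH                    # [L^2]
volume = area * LENGTH                    # [L^3]
velocity = LENGTH / TIME                  # [L T^-1]
force = MASS * LENGTH / (TIME * TIME)     # [M L T^-2]
conversion_factor = LENGTH / LENGTH       # dimensionless, a pure number

print(area, volume, velocity, force, conversion_factor)
```

A conversion factor comes out with all exponents zero, which is the code's way of saying it is a pure number.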
2.6 Position, Time, and (constant) Velocity
If a moving object moves a distance D that is proportional to time t, then
we say
$D \propto t$ (2.10)
or, equivalently
D = vt (2.11)
for some constant v. Here we are describing a relationship between D and
t. D and t are variables, not numbers. As time t increases, the distance D
travelled increases proportionately. The constant of proportionality is the
speed v. The letter is chosen to suggest velocity, a near synonym for
speed. A graph of D vs. t would be a straight line, like Fig 2.2, but with
the axes labelled distance and time instead of money and wool, measured
in appropriate units. The line would have slope v.
Since D is a length, it has dimension [L], and t, being a time, has dimension
[T]. This means v must have dimension $[v] = [LT^{-1}]$. Suitable units for
it would be meters/second or miles/hour, etc. Here v is just a number, with
units, corresponding to the constant speed of the object.
A generalization of this relationship is the position x of something that
moves with constant velocity v, where x is the usual coordinate along a line
or axis. If we say
x = vt (2.12)
we are saying much the same thing as before. The difference is that all
quantities now could be either positive or negative. Distance is considered to
be intrinsically positive, as when we say that the distance between any two
positions $x_1$ and $x_2$ is $|x_2 - x_1|$. Negative distance doesn't make sense, but
negative position does make sense: it is to the negative side of x = 0. The
word speed is also taken to be intrinsically positive, but now we are calling
v velocity, and it can be either positive or negative, depending on which
way the object is travelling. Speed is the absolute value |v|, always positive,
whether v is positive or negative. Finally, t no longer means elapsed time,
which would always be positive, but rather time on a clock, which is negative
if it comes before the time which is arbitrarily called 0.
A further generalization of this relationship is the linear function
$x = x_0 + vt$ (2.13)
which still describes something moving at constant velocity v, only now not
starting at x = 0 at the time t = 0, but rather at $x = x_0$. This could describe
something moving just like x = vt, but with a head start $x_0$. Its position
differs from x = vt by $x_0$ at all times, as if the two were running along keeping
a constant distance between them. The graph of these two relationships is
Fig 2.6. The linear relationship $x = x_0 + vt$ is in slope-intercept form because
the two constants, v and $x_0$, are the slope and intercept respectively in the
graph.
We recall the definition of the slope of a line, given any two points on
it, $(t_1, x_1)$ and $(t_2, x_2)$. In this example we know the equation of the line,
Eq (2.13), and so we know the slope, but if we didn't know it, we could
reconstruct it from the two points, using the definition
$\text{slope} = \frac{x_2 - x_1}{t_2 - t_1} = \frac{\Delta x}{\Delta t}$ (2.14)
Fig. 2.6: The graph of $x = x_0 + vt$ (solid) and $x = vt$ (dashed). The slope of the lines, v, is constant and the same for both. (Vertical axis x, with $x_0$ marked; horizontal axis t.)
We have used the $\Delta$-notation to indicate the change in position and the
change in time. Physically it is obvious that this combination must give v,
because it is a displacement ($\Delta x$) divided by the time necessary to travel it
($\Delta t$), and that is just what we mean by velocity.
The following problem comes up frequently in various disguises: suppose
that at time t = 0, A is at position $P_A$ and B is at position $P_B$. A and
B move with constant velocities $v_A$ and $v_B$ respectively. When and where
do they meet? The problem can be pictured (and solved) graphically as in
Fig 2.7. It is clear that there is typically just one time and place where they
will meet. If, however, we have the very special case $v_A = v_B$, in which
they are moving with the same velocity, then the lines will be parallel (unlike
what is shown) and they won't meet. It is clear that in this case they will
keep a constant distance, as in Fig 2.6. We can look for these features in
an algebraic solution. The statement of the problem tells us that $x_A$, the
position of A, and $x_B$, the position of B, are given, for any time t, by
$x_A = P_A + v_A t$ (2.15)
$x_B = P_B + v_B t$ (2.16)
Fig. 2.7: Two objects A (red) and B (blue) move along the x axis with constant velocity and meet at a definite time $t_m$ and position $x_m$. (Vertical axis x, with $P_A$, $P_B$, and $x_m$ marked; horizontal axis t, with $t_m$ marked.)
The condition that they meet is $x_A = x_B$, and using the expressions above,
we find that this is a condition on the time t. It is satisfied at $t = t_m$, where
$t_m = \frac{P_A - P_B}{v_B - v_A}$ (2.17)
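For readers who like to check a formula numerically, here is a minimal Python sketch of Eq (2.17). The function name meeting and the sample numbers are our own. The meeting position is found by substituting the meeting time back into Eq (2.15), and the parallel case of equal velocities is reported as no meeting.

```python
def meeting(p_a, v_a, p_b, v_b):
    """Return (t_m, x_m) for x_A = P_A + v_A*t and x_B = P_B + v_B*t.

    Returns None when the velocities are equal (parallel lines, no single
    meeting point).
    """
    if v_a == v_b:
        return None
    t_m = (p_a - p_b) / (v_b - v_a)   # Eq (2.17)
    x_m = p_a + v_a * t_m             # substitute t_m into Eq (2.15)
    return t_m, x_m

# A starts ahead at 10 with velocity 1; B starts at 0 with velocity 3.
print(meeting(p_a=10.0, v_a=1.0, p_b=0.0, v_b=3.0))   # (5.0, 15.0)
```

As a check, B's position at t = 5 is 0 + 3 × 5 = 15, the same as A's, as it must be.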
Even if we are given a problem of this kind with numbers, it is better to
introduce algebraic symbols and do it algebraically (you can put the numbers
in at the end, if necessary). One benefit of doing it algebraically is that you
don't find yourself doing numerical arithmetic at every step, but only at the
end. The main benefit, though, is that a result like Eq (2.17) contains a
lot of useful information that is lost when you substitute numbers. This
information, if you take the time to read it, is useful as a check, and it can
also suggest other ways of looking at the problem. Let us take the time to
read and interpret Eq (2.17).
The first thing to do is check the dimensions, for consistency. On the
left hand side we have $t_m$, which is dimensionally [T]. On the right hand
side we have [L] in the numerator, and $[LT^{-1}]$ in the denominator. Does
the ratio have dimension [T]? It ought to! (It does: $[L]/[LT^{-1}] = [T]$.)
Then we check common sense.
Fig 2.7 shows what it looks like if $P_A > P_B$ and $v_B > v_A$. In that case
$t_m > 0$, and we see that Eq (2.17) gives a positive meeting time. If we had
had $v_B < v_A$, it is easy to see in Fig 2.7 that the slope of the blue line
would be less than the slope of the red line, and the meeting time $t_m$ would
be negative. But that is also what Eq (2.17) says, because in this case the
denominator would switch signs and become negative, while the numerator
would still be positive. Finally, as we imagine $v_B \to v_A$, i.e., the velocities
becoming the same, we know that A and B don't meet, because the lines
in the graph would be parallel, and that is also what Eq (2.17) says: as the
denominator goes to 0, $t_m \to \infty$. That is how Eq (2.17) tells us there is no
meeting in this case.
It is always a good idea to check special cases where the result is obvious.
For example, if $P_A = P_B$ then A and B are together at t = 0, and hence it
is obvious that $t_m = 0$ in this case, just what Eq (2.17) says. Also, in the
limiting case that one of $v_A$ or $v_B$ is very large (the limiting case of one of
them going to infinity), then we would have $t_m \to 0$. That too is common
sense: the gap between A and B is closed in almost no time.
There is yet another way to read Eq (2.17). The numerator $P_A - P_B$ is
the initial relative position of A with respect to B, i.e., the displacement from
$P_B$ which ends at $P_A$. The denominator $v_B - v_A$ is the relative velocity of B
with respect to A. Their ratio is just the time it takes to cover the relative
displacement $P_A - P_B$ at the relative velocity. Amazingly, this illustrates the
principle of relativity: we could imagine how the motion of A and B looks to
someone who is moving along at speed $v_A$, i.e., the same speed as A. To him,
A appears not to be moving. In fact all the velocities are changed, according
to this observer, in having $v_A$ subtracted off, so that what was $v_A$ becomes
$v_A - v_A = 0$ and $v_B$ becomes $v_B - v_A$. Now how long does it take A and B
to come together? Well, A sits still, at an initial distance $P_A - P_B$ from B,
and B approaches at velocity $v_B - v_A$. Clearly the time necessary is given by
Eq (2.17). By shifting the point of view to the moving perspective, we have
made the problem easier.
It is always worth inspecting a result for the way it depends on the vari-
ables in the problem. But that is only possible if they are left as variables! It
would be a good exercise to solve for the position $x_m$ where A and B meet in
this problem and try to interpret the result, just as we interpreted $t_m$ above.
Motion at constant speed is rather special, and most things don't move
with constant speed, at least not for long, so as a model of real motions
constant speed motion, $D \propto t$ and its various generalizations, is still not very
general. There is an interesting physical example, however, of something that
does turn out to move at constant speed, namely light. We say a little about
this in the next section.
2.7 The speed of light, and SI units
The speed of light in vacuum is a constant of Nature, and is given its own
symbol c. The distance D travelled by light in time t is D = ct where
$c \approx 3 \times 10^{8}$ m/s. This is such a high speed that for everyday purposes,
not involving clever instrumentation, the speed of light might as well be
infinite. Nonetheless, over the centuries, beginning around 1675, people have
succeeded in measuring the speed of light with ever increasing accuracy.
In recent years, though, something peculiar has happened. Einstein's
Special Relativity Theory asserts that c is the same for everyone, a true constant
of Nature. This theory is now unconditionally accepted. Meanwhile
the definition of our standard unit of length, the meter, has been changed
several times since its adoption in the French Revolution. Its most recent
change was in 1983, when the roles of c and the standard meter were reversed.
The speed of light is now precisely $2.99792458 \times 10^{8}$ m/s, and is now
understood to be the definition of the meter. The meter is, by definition, the
unit of length such that in 1 second light in vacuum travels $2.99792458 \times 10^{8}$
meters. The definition of the second is, as we shall see, on very firm footing,
so the meter is now just as firm. The value for c was chosen to agree with
the previous definition of the meter, to the accuracy attainable, at the time
of the changeover.
By international agreement the units most often used in physics are the
Système International units (SI units), based on a standard unit for [L] (the
meter, abbreviated m), [T] (the second, abbreviated s), and [M] (the kilo-
gram, abbreviated kg). We will, for the most part, use SI units in this book.
Very occasionally we will not write the units with a measured number, but
simply say "SI." This means you have to look at the dimension of the quantity
and construct the SI unit from that. If the dimension is $[MLT^{-2}]$, for
example, then the SI unit is kg·m/s$^2$. Usually, though, we will write out
units in full.
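The recipe for constructing an SI unit from a dimension is mechanical enough to write as a short Python sketch. The function si_unit and its plain-text output format are our own choices, made only for illustration: each of [M], [L], [T] is replaced by kg, m, s, raised to the corresponding power.

```python
def si_unit(m=0, l=0, t=0):
    """Build an SI unit string from the exponents of [M], [L], and [T]."""
    parts = []
    for symbol, power in (("kg", m), ("m", l), ("s", t)):
        if power == 1:
            parts.append(symbol)
        elif power != 0:
            parts.append(f"{symbol}^{power}")
    # A dimensionless quantity has no unit; we write it as the pure number 1.
    return "*".join(parts) if parts else "1"

print(si_unit(m=1, l=1, t=-2))   # kg*m*s^-2
```

The printed form kg*m*s^-2 is the same unit as kg·m/s$^2$, written with a negative exponent instead of a division.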
2.8 Dimension and Scaling
In Galileo's last book, Two New Sciences (1638), he makes a very interesting
use of the notion of dimension. His idea is that you either know or you can
guess, for some physical quantities, that they are proportional to volume.
Such a quantity would grow like volume if it were scaled up. Similarly,
quantities proportional to area would grow like area if they were scaled up.
Here is the argument Galileo makes. If you were twice as big, meaning
twice as tall, twice as wide, etc., then you would weigh eight times as much,
because weight is proportional to volume, and volume $[L^3]$ is proportional to
the cube of the linear dimension. On the other hand, what about the strength
of your bones and muscles? Galileo argues that strength is proportional to the
cross-sectional area of the bone (or muscle), and if the bone is twice as thick,
then its cross-section is four times as big, because area $[L^2]$ goes as the square
of the linear dimension. Thus you would be eight times heavier, but only four
times stronger: you would be effectively weaker, relative to your weight, by a
factor of two. Galileo's argument implies that if you simply scale something
up indefinitely, eventually it would be unable to support its own weight and
would collapse, as its weight became too much for its strength. To take an
example from architecture, we note that this had actually begun to happen
in 16th century Europe. Cities vied with each other to build the biggest and
most impressive cathedrals. Eventually there were some disasters (Beauvais,
for instance), and the largest planned cathedrals were never finished.
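The arithmetic of this argument can be tabulated in a few lines of Python. This is only a sketch of the scaling rule as stated above (weight as the cube, strength as the square of the linear scale factor); the function name `scale_up` is an assumption made for the example.

```python
def scale_up(factor):
    """Return (weight, strength, strength/weight) relative to the original size."""
    weight = factor ** 3    # weight goes as volume, [L^3]
    strength = factor ** 2  # strength goes as cross-sectional area, [L^2]
    return weight, strength, strength / weight


for factor in (1, 2, 4, 10):
    weight, strength, relative = scale_up(factor)
    print(f"scale x{factor}: weight x{weight}, strength x{strength}, "
          f"strength relative to weight x{relative}")
```

The last column falls off as 1/factor, which is the sense in which a scaled-up structure is "effectively weaker."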
Galileo applied these ideas to animals. Many mammals have essentially
the same body plan, but they differ very much in size. In order that the
large ones not be too weak, in the sense described above, their bones have
to be thicker than you would expect from simple scaling. If the length of
the bone goes up by a factor of 2, for example, the thickness should go up
not by a factor of 2 but by a factor of $\sqrt{8} = 2^{3/2} \approx 2.8$. Thus the bone
would look thicker and clumsier. (Galileo's own drawing exaggerates the
effect to make the point.) We would certainly not expect elephant bones to
be simply delicate mouse bones scaled up! The argument is so appealing and
so persuasive that it has only quite recently been carefully checked. Real
animals turn out to be a bit more complicated than Galileo said. But the
idea has been very influential.
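The factor $2^{3/2}$ can be recovered in two lines. As a sketch, write $\lambda$ for the factor by which lengths are scaled and $\mu$ for the factor by which the bone thickness is scaled (both symbols are introduced here only for illustration), and assume, following Galileo, that weight goes as volume and strength as cross-sectional area:

```latex
\begin{align*}
\text{weight} \propto \lambda^{3}, \qquad \text{strength} \propto \mu^{2}
  \quad &\Longrightarrow \quad \mu^{2} = \lambda^{3}, \quad \mu = \lambda^{3/2}, \\
\lambda = 2 \quad &\Longrightarrow \quad \mu = 2^{3/2} = \sqrt{8} \approx 2.8 .
\end{align*}
```

Under these assumptions the thickness must grow faster than the length for strength to keep pace with weight.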
The background to this story is amusing. In 1588, some fifty years before
he wrote Two New Sciences, Galileo was an unemployed dropout medical
student with a keen interest in geometry. Through his own abilities, and
perhaps family connections, he managed to be a serious candidate for the
open chair of mathematics at the University of Pisa (which belonged to
Florence). In what may have been, in effect, a job interview, he was invited to
address the Florentine Academy, and to lecture on mathematics, something
virtually none of the members knew anything about. He chose to expound
the geometry of Dante's Inferno, a work they all knew intimately, and did so
in very entertaining fashion. He got the job! But in the process he committed
a blunder that he must have realized soon after. The Inferno, as he describes
it, is an enormous region within the Earth, capped by a layer of the Earth's
crust some 400 miles thick. He explicitly says that it would not be in danger
of caving in because it is geometrically similar to a large dome in architecture
(like the famous Brunelleschi dome on Florence's own cathedral), but scaled
up. This is just the situation that Two New Sciences addresses: scaling up
makes the dome weaker, something he hadn't realized at the time, and in
fact makes it obvious that the Inferno would collapse after all! Thus Galileo
must have lived with a painful dilemma all of his life: his marvellous insight
into scaling and proportionality was something he had to keep to himself. To
reveal it would have been an embarrassment not so much to himself, since
after all, he would be the one to correct the error, but really to Florence and
the Florentine Academy, for reasons that are too political to go into here.
It also might have been dangerous to point out that the Inferno couldn't
actually exist, because the literal existence of a Hell within the Earth was
Catholic dogma. His solution was to keep quiet, but to publish at the end of
his life, when nothing more could happen to him.
2.9 Power Laws, and the Logarithm
Galileo guessed that strength should be proportional to $L^2$, where L is a
typical length in an animal or a structure. Is that true, though? And what
exactly is meant by "strength" here? It was never defined.
Suppose we make some operational definition of the strength S of a beam,
say the weight necessary to break it when it is supported in a certain way, etc.
Then we could actually measure strength S vs. length L for various beams,
all scale models of each other. In this way we could determine experimentally
if Galileo was right about $S \propto L^2$. The best way to check this experimentally
would be to graph the measured values of S vs. $L^2$. We know how it should
look if the two are proportional: the points should lie on a straight line
through the origin. Suppose we try this, though, and it doesn't work: they
don't lie on a straight line through the origin. Galileo's idea is so plausible
that it occurs to us: maybe he just guessed the wrong power in the scaling
law. Perhaps it is
$$ S \propto L^{\alpha} \tag{2.18} $$
for some other power $\alpha$, different from 2, perhaps not even an integer. Now
we seem to have a tedious problem. We could guess values of $\alpha$, one after
the other, and for each one graph S vs. $L^{\alpha}$ to see if we have proportionality
(by looking for a straight line through the origin, as before). This trial and
error process could go on for a long time. There is actually a much better
way to detect such a power law, though. We should graph log S vs. log L,
where log is a logarithm. We'll see in a moment how this works.
First, though, recall the most important property of logarithms: if A > 0
and B > 0, then
$$ \log(AB) = \log(A) + \log(B) \tag{2.19} $$
That is, the logarithm of a product is the sum of the logarithms. It follows
that
$$ \log(A^2) = \log(A \cdot A) = \log(A) + \log(A) = 2\log(A) \tag{2.20} $$
$$ \log(A^3) = \log(A^2 \cdot A) = 2\log(A) + \log(A) = 3\log(A) \tag{2.21} $$
$$ \ldots \tag{2.22} $$
and in general
$$ \log(A^{\alpha}) = \alpha \log(A) \tag{2.23} $$
Now we return to our power law hypothesis,
$$ S = kL^{\alpha} \tag{2.24} $$
which we have rewritten, introducing a constant of proportionality k
(unknown). If this power law were true, it would also be true, by taking the
logarithm on both sides, that
$$ \log(S) = \log(k) + \alpha \log(L) \tag{2.25} $$
This says that if we graph log(S) vs. log(L) we should get a straight line
of slope $\alpha$ and intercept log(k), because it is a linear relationship in
slope-intercept form. If we just graph the data this way, we check all possible
power laws in one graph! A straight line indicates a power law, and in that
case the slope tells us the power. The intercept is perhaps not so interesting,
but it still tells us the constant of proportionality in the power law, if we
want to know it.
Since this procedure may not be very familiar, we illustrate it with made-
up data in the table below:
S       L
1.6     1
9.1     2
89.4    5
506     10
A plot of log(S) vs. log(L) shows that the data points lie along a straight line:
thus the relationship between S and L is a power law. Picking two points
on the line we find that the slope, and hence the power, is about 2.5. If you
try checking this exercise in detail, you may find yourself wondering which
logarithm we used. For any positive number except 1 (let us say 10, for
concreteness) there is a corresponding logarithm, called logarithm base 10,
abbreviated $\log_{10}$. For Fig. 2.8 we used the logarithm base $e \approx 2.71828\ldots$, also
called the natural logarithm, abbreviated ln. It makes no difference for this
purpose, though, which logarithm you pick, as long as it is the same choice on
both axes. The reason is that logarithms to different bases are proportional! So
a different choice of logarithm is like changing the units in the same way on
both axes. In particular, the slope will come out the same for any choice.
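For readers who would rather compute than plot, here is a minimal Python sketch of the same check on the made-up data. It fits a straight line to the (log L, log S) points by ordinary least squares, which is a swapped-in substitute for "picking two points on the line"; the function name `loglog_fit` is an assumption made for this example.

```python
import math

# The made-up data from the table in the text.
L_values = [1, 2, 5, 10]
S_values = [1.6, 9.1, 89.4, 506]


def loglog_fit(xs, ys, log=math.log):
    """Least-squares line through the points (log x, log y).

    Returns (slope, intercept); the slope estimates the power in y = k x^slope.
    """
    u = [log(x) for x in xs]
    v = [log(y) for y in ys]
    n = len(u)
    u_mean = sum(u) / n
    v_mean = sum(v) / n
    slope = (sum((a - u_mean) * (b - v_mean) for a, b in zip(u, v))
             / sum((a - u_mean) ** 2 for a in u))
    intercept = v_mean - slope * u_mean
    return slope, intercept


# Natural logarithm on both axes.
slope_ln, intercept_ln = loglog_fit(L_values, S_values)
# Base-10 logarithm on both axes: the slope should be unchanged.
slope_log10, intercept_log10 = loglog_fit(L_values, S_values, log=math.log10)

# The intercept is log(k), so k is recovered by undoing the logarithm.
k = math.exp(intercept_ln)

print(f"slope with ln:    {slope_ln:.3f}")
print(f"slope with log10: {slope_log10:.3f}")
print(f"constant of proportionality k: {k:.2f}")
```

Both slopes come out close to 2.5, and they agree with each other, as the argument about proportional logarithms says they must.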
Fig. 2.8: A log-log plot of the data in the preceding table, with ln(S) on the vertical axis
and ln(L) on the horizontal axis. The data appear linear with slope about 2.5, indicating
$S \propto L^{2.5}$.
2.10 Numbers in Geometry are Ratios
Euclidean geometry does contain numbers, but these numbers are ratios of
lengths, or of areas, and hence dimensionless. The numbers of geometry are
pure numbers. There is no need for units.
The most famous ratio in all geometry is $\pi = C/D$, the ratio of a circle's
circumference C to its diameter D. $\pi$ cannot be written exactly as a decimal fraction,
so it is a rather awkward number for arithmetic. A good approximation is
3.14159265... For almost any practical purpose one could use fewer decimal
places. The diameter D of a circle is 2R where R is the radius, so
$$ C = \pi D = 2\pi R \tag{2.26} $$
A circle of radius 1 is called a unit circle, and its circumference is $2\pi$.
The definition of $\pi$ uses the notion of geometric similarity, that is, that
all circles are similar: they differ only in being scaled up or down. In similar
figures, the ratios of corresponding lengths are the same, because they are
scaled up or down together. A circle of radius R is similar to a circle of radius
1, and its circumference $2\pi R$ is just $2\pi$ scaled up by the same factor R.
There are familiar formulae that we learn from geometry that seem to
involve dimensional quantities, for example
$$ \text{Area of circle} = \pi R^2 \tag{2.27} $$
$$ \text{Surface area of sphere} = 4\pi R^2 \tag{2.28} $$
$$ \text{Volume of sphere} = \tfrac{4}{3}\pi R^3 \tag{2.29} $$
In these formulae the areas are clearly of dimension $[L^2]$ and the volume of
dimension $[L^3]$, as they should be. That is not how Archimedes stated them,
though, when he proved them. Rather he said that the ratio of the circle's
area to the square on its radius is $\pi$. On his tomb was engraved the sphere
inscribed in a cylinder, a wordless reference to his beautiful result above on
the volume of the sphere: what he actually said was that the ratio of this
volume to the volume of the cylinder is $2/3$.
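A short Python sketch makes the point that these ratios are pure numbers: whatever radius is chosen, the ratio comes out the same. The function names are assumptions made for this example, and the cylinder is taken to have the same radius as the sphere and height equal to the sphere's diameter.

```python
import math


def circle_area(radius):
    return math.pi * radius ** 2


def sphere_volume(radius):
    return 4.0 / 3.0 * math.pi * radius ** 3


def cylinder_volume(radius, height):
    return math.pi * radius ** 2 * height


for radius in (0.5, 1.0, 7.3):  # arbitrary radii, in any unit of length
    # Ratio of the circle's area to the square on its radius.
    area_ratio = circle_area(radius) / radius ** 2
    # Ratio of the sphere to the cylinder that just contains it.
    volume_ratio = sphere_volume(radius) / cylinder_volume(radius, 2 * radius)
    print(f"R = {radius}: area ratio = {area_ratio:.6f}, "
          f"volume ratio = {volume_ratio:.6f}")
```

The units of length cancel in both ratios, which is why no unit needs to be specified for the radius.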
Here is another example: a theorem of Euclid (Elements I, Proposition
32) says that the sum of the interior angles of any triangle is 180°. At least
that is how we say it now. Where we say 180°, however, Euclid says "two
right angles." That is, the ratio of the sum of the interior angles of a triangle
to the right angle is 2. And a right angle is not just a shorthand for 90°, but
rather it is the angle you get where two perpendicular lines cross, and makes
no reference to any number whatever. The degree, as a measure of angle, is a
very old unit, going back to the Babylonians, and was almost certainly known
to Euclid, but he didn't use it. Thus the famous theorem about the angles of
a triangle is another illustration that numbers in geometry are ratios. Except
for the ratio 2, numbers here are irrelevant. The idea is best expressed by
the picture that accompanies Euclid's proof, Fig. 2.9.
2.11 The Trigonometric Functions
Similar triangles differ from each other only by an overall scale factor. It is
easy to recognize when triangles are similar because scaling leaves the angles
the same, and only changes the overall size. Two triangles are similar if
and only if they have the same angles. The ratio of corresponding sides is
the same for both of them, and thus depends only on these angles, and not
on which triangle with these angles one is thinking about. For example in
Fig. 2.10 we have a/b = A/B, since these are ratios of corresponding sides in
similar triangles.

Fig. 2.9: Euclid's Prop. I.32. The three interior angles of the triangle ABC all appear at point
C, where they clearly fit together to make two right angles (one straight angle). CD is the
line through C parallel to AB. That the two extra angles which appear at C really are the
same as the corresponding angles of the triangle follows from properties of parallel lines.
For example, the angle of the triangle at B and the angle between CB and CD are alternate
interior angles and hence equal.
This connection between proportionality and similar triangles leads to
graphical methods for solving problems. If you look back at Fig 2.1, you can
see it in perhaps a new way: our first solved problem used similar triangles,
although we called it a graph, so you probably didn't notice the triangles.
By far the most important use of this idea is in similar right triangles.
Since in these triangles one angle is a right angle, the sum of the other two
angles is also a right angle, so that the sum of all three angles together is
two right angles. The two acute angles that add to a right angle are called
complementary. The shape of the triangle is thus specified completely by giving just one
acute angle. The other two angles are the complement of that one and a
right angle. The ratios of sides in a right triangle depend only on this one
angle, and are extremely important geometric ratios to know about. In old
days they were carefully tabulated in books, and now they are available
on calculators. You put in the angle and get out the ratio. Angles are
frequently given Greek letter names, a continuous tradition from Hellenistic
times, like $\theta$ (pronounced theta) in Fig. 2.11, which shows how the ratios of
sides are named. Given an angle $\theta$, one can ask for $\sin(\theta)$, $\cos(\theta)$, or $\tan(\theta)$
(pronounced sine of theta, cosine of theta, tangent of theta). The reciprocals
of these are also possible ratios, but are less commonly used. In Fig. 2.11 you
can see that if we use the complementary angle, the sides we call adjacent
and opposite would be switched (but the hypotenuse is of course still the
hypotenuse). Thus the sine of an angle is the cosine of its complement, and
the cosine of an angle is the sine of its complement. The tangent of an angle
is the reciprocal of the tangent of its complement, since switching b/a gives
a/b. This reciprocal a/b is called the cotangent of $\theta$, but we will not use it.

Fig. 2.10: The triangles are similar, so the ratios of corresponding sides are equal:
a/b = A/B.
The sine and cosine are related through a very useful identity. Referring
to Fig. 2.11, and recalling the Pythagorean theorem $a^2 + b^2 = c^2$, we divide
both sides by $c^2$ and have
$$ \cos^2\theta + \sin^2\theta = 1 \tag{2.30} $$
Fig. 2.11: The trigonometric ratios for a right triangle with legs a (adjacent to $\theta$) and
b (opposite $\theta$) and hypotenuse c: $\sin(\theta) = b/c$, $\cos(\theta) = a/c$, $\tan(\theta) = b/a$. Remember the
mnemonic SOH-CAH-TOA: Sine is Opposite over Hypotenuse, Cosine is Adjacent over
Hypotenuse, Tangent is Opposite over Adjacent.

Throughout the Renaissance these functions of an angle provided a living
to mathematicians and military engineers. These methods could be used,
for example, to determine the height of a cliff or a tower without climbing
it. To determine the height b of the tower in Fig. 2.11, you would pace off
the distance a to the tower, and also, by sighting, determine the angle $\theta$.
Then since $\tan(\theta) = b/a$, you compute the height as $b = a\tan(\theta)$. To use the
method, of course, you need some way to compute $\tan(\theta)$ from the measured
$\theta$. That was where the expertise came in. Galileo spent a good part of
his early career developing streamlined methods to make measurements and
computations of this sort, using a Military Compass of his own invention.
His students were young noblemen who might need this kind of knowledge
in military campaigns. His thorough knowledge of such geometric methods
was the perfect preparation for him to interpret what he saw through the
telescope, when in 1609 his life was changed by this invention, and he turned
it on the moon and Jupiter. He used the idea of similar triangles to measure
the height, not of a tower on earth, but of mountains on the moon!
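Today the "expertise" is a library call. The following Python sketch carries out the tower computation $b = a\tan(\theta)$; the function name `tower_height` and the sample numbers are hypothetical, chosen only for illustration. Python's `math.tan` expects its angle in radians, so a sighting in degrees is converted first with `math.radians`.

```python
import math


def tower_height(distance, elevation_deg):
    """Height b of a tower from the paced-off distance a and the sighted angle.

    distance      -- horizontal distance a to the tower (any unit of length)
    elevation_deg -- angle theta of the top above the horizontal, in degrees
    """
    theta = math.radians(elevation_deg)  # math.tan works in radians
    return distance * math.tan(theta)


# Hypothetical sighting: 100 paces from the tower, top seen at 45 degrees.
# tan(45 degrees) = 1, so the height should equal the distance.
print(tower_height(100.0, 45.0))
```

The result is in the same unit as the distance, since $\tan(\theta)$ is a pure ratio.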
2.12 Angular Measures
The two sectors of a circle in Fig. 2.12 differ only by scaling: that is,
although one of them is bigger, they have the same angle $\theta$. Thus they are
geometrically similar, and the ratio of the arclength to the radius is the same
in both: s/r = S/R. This is much like the observation in Fig. 2.10, except
that here one of the lengths is measured along the curved arc of the circle.

Fig. 2.12: The two sectors are similar. The ratio s/r = S/R is the radian measure of $\theta$.

The ratio S/R determines the angle $\theta$, and is called the radian measure of $\theta$.
For many purposes this way of measuring angle is simpler and more practical
than the more familiar degree measure. In any case, both measures are in
common use, and we have to be ready to use either one.
In Fig. 2.13 we see the angle 90° with an arc added to make it a quarter
sector of a circle. Since the circumference of the circle would be $2\pi R$, one
quarter of that, the length of the arc, would be $\pi R/2$. Thus the radian
measure of this angle is $\pi/2$.

Fig. 2.13: The radian measure of 90° is $\pi/2$.

By the same argument, 1/8 of the circle is the angle 45° or $\pi/4$ radians,
60° is 1/6 of the circle or $\pi/3$ radians, and 30° is 1/12 of the circle or $\pi/6$
radians. It is worth practicing the conversions for these special angles so that
you can do them almost without thinking.
Since conversion of units is so important, let us work it out carefully one
more time. We choose some convenient angle for which we know the measure
in both systems: say the quarter circle, which is 90° or $\pi/2$ radians. Then
the ratio of these two angles, which is just the ratio of an angle to itself, is
the number 1, only written in a very peculiar way:
$$ 1 = \frac{\pi/2 \text{ radians}}{90^\circ} = \frac{\pi \text{ radians}}{180^\circ} \tag{2.31} $$
It looks as if it might have been better to say that the half circle is 180° or
$\pi$ radians: this is how most people remember it. In any case, we now can
multiply any angle given in degrees by the number 1 in the above form to
convert it to radians. Let us check it on one of the special angles we already
know: we will find the radian measure of 60°,
$$ 60^\circ = 60^\circ \left( \frac{\pi \text{ radians}}{180^\circ} \right) = \frac{\pi}{3} \text{ radians}. \tag{2.32} $$
The units, radians and degrees, follow the rules of algebra in this
computation. In particular, degrees occur in both numerator and denominator, and
hence cancel, leaving just radians in the result. The units tell us we have the
number 1 in the right form to make the conversion. Let us try this on an
angle we haven't converted yet, 1 radian. What is that in degrees? For this
purpose we need the reciprocal of the conversion factor (which is still just
the number 1, being the ratio of one and the same angle to itself),
$$ 1 \text{ radian} = (1 \text{ radian}) \left( \frac{180^\circ}{\pi \text{ radians}} \right) \approx 57.3^\circ. \tag{2.33} $$
By being careful to write the units along with every number, we can be
sure of making the conversion correctly. In this example, radians cancel in
numerator and denominator, leaving a result with the units degrees.
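The same bookkeeping can be written as a pair of one-line Python functions. This is a sketch in which the names `RADIANS_PER_DEGREE`, `to_radians`, and `to_degrees` are assumptions made for the example; the constant plays the role of "the number 1 written in a peculiar way."

```python
import math

# The conversion factor (pi radians) / (180 degrees) from the text.
RADIANS_PER_DEGREE = math.pi / 180


def to_radians(angle_deg):
    """Multiply by (pi radians / 180 degrees): the degrees cancel."""
    return angle_deg * RADIANS_PER_DEGREE


def to_degrees(angle_rad):
    """Multiply by the reciprocal factor: the radians cancel."""
    return angle_rad / RADIANS_PER_DEGREE


print(to_radians(60), math.pi / 3)  # the two numbers should agree
print(to_degrees(1))                # about 57.3 degrees
```

Python's standard library provides `math.radians` and `math.degrees`, which do the same job; the explicit versions above are only meant to mirror the unit-cancelling argument.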
In terms of radian measure, the angles of a triangle add up to $\pi$. In an
equilateral triangle, for example, three equal angles add up to $\pi$, so each is
$\pi/3$.
Fig. 2.14: Special triangles, with the sides of unit length labelled 1.
2.13 Trigonometric functions of special angles
Use Fig. 2.14 to show that
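$$ \sin(\pi/6) = \cos(\pi/3) = 1/2 \tag{2.34} $$
$$ \sin(\pi/3) = \cos(\pi/6) = \sqrt{3}/2 \tag{2.35} $$
$$ \sin(\pi/4) = \cos(\pi/4) = \sqrt{2}/2 \tag{2.36} $$
$$ \tan(\pi/6) = 1/\sqrt{3} \tag{2.37} $$
$$ \tan(\pi/3) = \sqrt{3} \tag{2.38} $$
$$ \tan(\pi/4) = 1 \tag{2.39} $$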
These special angles are frequently useful in practical problems.
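These values are easy to confirm numerically, which is no substitute for deriving them from the triangles but is a useful check on a derivation. The Python sketch below compares the exact forms with the library functions; the names `SPECIAL_ANGLES` and `check_special_angles` are assumptions made for this example.

```python
import math

# angle name -> (angle in radians, exact sine, exact cosine, exact tangent)
SPECIAL_ANGLES = {
    "pi/6": (math.pi / 6, 1 / 2, math.sqrt(3) / 2, 1 / math.sqrt(3)),
    "pi/4": (math.pi / 4, math.sqrt(2) / 2, math.sqrt(2) / 2, 1.0),
    "pi/3": (math.pi / 3, math.sqrt(3) / 2, 1 / 2, math.sqrt(3)),
}


def check_special_angles(tolerance=1e-12):
    """Return True if every tabulated value agrees with math.sin/cos/tan."""
    for theta, sin_exact, cos_exact, tan_exact in SPECIAL_ANGLES.values():
        if abs(math.sin(theta) - sin_exact) > tolerance:
            return False
        if abs(math.cos(theta) - cos_exact) > tolerance:
            return False
        if abs(math.tan(theta) - tan_exact) > tolerance:
            return False
    return True


print(check_special_angles())
```

A small tolerance is needed because floating-point arithmetic reproduces values such as $\sqrt{3}/2$ only to about sixteen digits.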
2.14 Small angle approximations
If $\theta$ is a small angle, then the small angle approximations apply:
$$ \sin\theta \approx \theta \tag{2.40} $$
$$ \tan\theta \approx \theta \tag{2.41} $$
$$ \cos\theta \approx 1 \tag{2.42} $$
Here $\approx$ means approximately equal. For example, if we treat $\pi/6$ as small, then we
would find $\sin(\pi/6) \approx \pi/6 = 0.5236$, which is not so different from the exact
value $\sin(\pi/6) = 0.5000$. Of course this approximation only works if we use
radian measure! In units of degrees this angle is 30, which is not at all a good
approximation to 0.5000. In fact thirty degrees, or $\pi/6$ radians, is not even
a particularly small angle, which is why the approximation is not especially
accurate even when we do it right. The meaning of "small" here is "much
less than 1 radian." The best way to get a feeling for it is perhaps to compute
$\sin\theta$ for various $\theta$ less than 0.1 or so. You will find that the approximation
is excellent for angles as small as this, and the smaller the angle, the better.
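The experiment suggested here takes only a few lines of Python. This sketch prints $\theta$, $\sin\theta$, and $\tan\theta$ side by side, together with the relative error of replacing $\sin\theta$ by $\theta$; the function name `relative_error` and the particular sample angles are assumptions made for the example.

```python
import math


def relative_error(theta):
    """Relative error of the approximation sin(theta) ~ theta (theta in radians)."""
    return (theta - math.sin(theta)) / math.sin(theta)


print("theta      sin(theta)  tan(theta)  relative error")
for theta in (math.pi / 6, 0.1, 0.05, 0.01):
    print(f"{theta:.6f}   {math.sin(theta):.6f}    {math.tan(theta):.6f}    "
          f"{relative_error(theta):.2e}")
```

For $\pi/6$ the error is several percent, while for angles of 0.1 radian and below it is a small fraction of a percent and shrinks rapidly as the angle decreases.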
Fig. 2.15: The small angle approximation is the statement that the lengths AC, AD, and
BD are all approximately equal. In the picture it is hard to see the difference, so it is clear
that they are approximately equal. Which of these is $\sin\theta$, which is $\tan\theta$, and which is
$\theta$ itself?
We can understand the small angle approximation in a picture, as seen
in Fig. 2.15. That figure also alerts us to two more special angles, namely 0
and $\pi/2$, the limiting case of complementary angles as the triangle
degenerates to a line. In this limit the small angle approximation is exact, and we
have special values like those in the previous section, but perhaps even more
important,
$$ \sin(0) = \cos(\pi/2) = 0 \tag{2.43} $$
$$ \cos(0) = \sin(\pi/2) = 1 \tag{2.44} $$
$$ \tan(0) = 0 \tag{2.45} $$
To see this, just imagine the triangle in Fig. 2.15 collapsing to a line.
The small angle approximation, in case the angle is small but not zero, is
a good example of accepting a little messiness (it is not quite exact) for the
sake of simplicity (it is easy). Quick, what is the tangent of 0.01 radians?
It is typical of physics to work with models that are believed to be exact,
and then to cut corners to get quick, informative answers. It is precisely
because the underlying theory is exact that we can use approximations with
confidence: we can't be very far off! And if necessary, of course, we can
always calculate more exactly. We will use the small angle approximation a
lot in the next chapter.
Problems
Many of these problems use terms and concepts not defined yet. We put
them here to point out that there are things one can say without knowing
any details, just using the notions of proportionality and dimension.
Proportionality
For each of problems 2.1-2.5 include both a graphical solution, like Fig 2.1,
and a numerical solution.
2.1 In an electromagnet, the magnetic field B is proportional to current I.
If B is $5 \times 10^{-2}$ Tesla when I is 3 Amperes, what current do you need for a
field of 0.2 Tesla?
2.2 In a circuit, the current I is proportional to applied voltage V . Suppose
that when V = 0.4 Volts, I = 5 milliamperes. What current will you have if
V is 1.1 Volts?
2.3 The energy E collected by a solar cell is proportional to time t. If it
collects 900 Joules in 5 seconds, how long will it take to collect 3000 Joules?
2.4 Pressure P in a liquid is proportional to depth d. If the pressure is
$1.2 \times 10^5$ Pascals at a depth d = 80 cm, what will be the pressure at a depth
of 2 meters?
2.5 Electrical resistance R in a wire is proportional to length L. If a 2.4
m wire has a resistance of 0.2 Ohms, how long should a wire be to have a
resistance of 1 Ohm?
Dimensions and Units
In the following problems, understand x to be a length [L], t to be a time [T],
and m to be a mass [M]. For each new symbol, determine its dimension and
its SI unit. Each problem depends on the one before it, so work carefully!
2.6 x = vt, so v has dimension ... and SI unit ...
2.7 v = gt, so g has dimension ... and SI unit ...
2.8 F = mg so F has dimension ... and SI unit ...
2.9 $Px^2 = F$, so P has dimension ... and SI unit ...
2.10 P = S/t so S has dimension ... and SI unit ...
Conversion of Units
Another system of units, similar to the SI system, is the cgs system. In the
cgs system, the units for [L], [M], and [T] are the centimeter, the gram, and
the second respectively (hence cgs). One must frequently figure out how to
make these conversions. Give the unit in each system, and the conversion
factor for converting in each direction, for the following physical quantities.
2.11 Area.
2.12 Volume.
2.13 Force. The dimension of force is $[MLT^{-2}]$. The SI unit is called the
Newton, and the cgs unit is called the dyne, so this problem asks for the
conversion factor for Newtons to dynes, and vice versa.
2.14 Energy. The dimension of energy is $[ML^2T^{-2}]$. The SI unit is called
the Joule, and the cgs unit is called the erg, so this problem asks for the
conversion factor for Joules to ergs, and vice versa.
2.15 Mass density. The dimension of mass density is $[ML^{-3}]$.
2.16 Viscosity. The dimension of viscosity is $[ML^{-1}T^{-1}]$. The SI unit of
viscosity is the Pascal-second, and the cgs unit is the Poise, so figure out how
to convert Pascal-seconds to Poises, and vice versa. The viscosity of water is
about 1 centipoise. What is that in Pascal-seconds?
Scaling
2.17 In the 19th century, the body mass index (BMI), a kind of numerical
statistic associated with the human body, was defined as $\mathrm{BMI} = M/h^2$,
where M is the mass and h is the height in SI units. One sometimes hears
that a healthy BMI is between 20 and 25.
(a) Using Galileo's ideas, say how the BMI should scale with linear size.
What does that suggest about the usefulness of the BMI in identifying well
proportioned bodies? Write a few sentences about this.
(b) A young boy had the following BMIs at different ages:
Age    4      9      13
BMI    17.8   21.0   25.1
People who knew him thought he was always a bit overweight, but his BMI
changed a surprising amount. What is going on? Hint: use your answer from
(a).
2.18 A water filtering system uses a fixed volume V of a material that
absorbs water contaminants onto its surface. The rate at which it removes
contaminants is proportional to the surface area of the material, so we would
like to make this area as large as possible.
(a) If the material were made into a sphere, it would have minimal surface
area. What is this area? How does it scale with volume V ?
(b) Suppose instead that the material is made into N identical smaller
spheres, each with volume V/N. What is the surface area of one such sphere?
What is the total surface area of all N spheres? In particular, how does the
total surface area depend on N? Why should the material be in the form of
a fine powder?
Position, Time and Velocity
2.19 The most famous paradox of Zeno says that Achilles, who is a fast
runner, cannot catch a slow moving tortoise if the tortoise has a head start.
The argument goes like this. Suppose Achilles can run 10 m/s and the
tortoise only 1 m/s, but the tortoise has a 10 m head start. Then it takes
1 s for Achilles to cover this 10 m, but in that time, the tortoise has moved
on another 1 m. In the time 0.1 s it takes Achilles to cover this distance, the
tortoise has moved on yet again, by 0.1 m. In the time it takes Achilles to
cover this distance, the tortoise has moved on ... You get the idea. It looks
as if Achilles will never catch the tortoise!!
(a) Represent the positions of Achilles, $x_A$, and the tortoise, $x_T$, as linear
functions of time, as in Section 2.6.
(b) Solve for the time $t_m$ at which Achilles catches the tortoise (by this
method it looks as if he does catch him).
(c) Express $t_m$ as a decimal fraction, and point out its relationship to
Zeno's paradox.
2.20 Two trains are heading towards each other on the same track.
(a) How much time is there to avert the trainwreck? They are 60 miles
apart, one going 40 mph and the other 20 mph. Do the problem two ways, a
hard way and an easy way. Notice that the two speeds are of course positive,
because speed is always positive, but the trains are travelling in opposite
directions, so the velocities must have opposite signs.
(b) If you didn't make a graph in part (a) showing what happens,
modelled on Fig. 2.7, make one now.
(c) Where does the wreck occur, if it does?
2.21 Carry out the project suggested near the end of Section 2.6: that is,
find $x_m$ algebraically in terms of $P_A$, $P_B$, $v_A$, and $v_B$, and simplify it as much
as possible. Interpret its common sense meaning in words.
Power Laws
Test the data below by making a graph to see if there is a power law
relationship $y = Cx^{\alpha}$ for some constant C. If there is a power law, determine
the power $\alpha$.
2.22
x 2 5 10 20
y 56.6 89.4 126.5 178.9
2.23
x 2 5 10 20
y 0.283 1.12 3.16 8.94
2.24
x 2 5 10 20
y 1.8682 1.4192 1.1527 0.9363
2.25 In The Almagest, Ptolemy's system of the world, hundreds of stars
are listed and their brightnesses given as visual magnitude 1, 2, 3, 4, 5, or 6.
Here 1 is the brightest and 6 is the dimmest. The number of stars in each
category is given in the table below. It looks as if Ptolemy was not interested
in listing all the dim stars in magnitude 5 and 6, since there are many more
than the ones he names. The brighter stars, though, are fairly represented in
the table. The number clearly grows with visual magnitude for magnitudes
1-4. Is it a power law? Show a graph that helps decide this question.
Magnitude 1 2 3 4 5 6
Number of Stars 15 45 206 476 217 49
Ratios in Geometry
2.26 The meter, still the SI unit of length, was originally defined so that
the distance from the North Pole to the Equator along the longitude line
through Paris should be $10^7$ meters.
(a) What is the radius of the Earth in meters, assuming the Earth is a
sphere?
(b) What are the advantages and disadvantages of defining the length
standard in this way?
2.27 In the 3rd century B.C.E. Eratosthenes measured the size of the earth
by comparing astronomical sightings with ground based measurements. The
story as it comes down to us (probably oversimplified) is that the sun at
noon on midsummer's day shines directly down into the wells in Syene, on
the Nile. Therefore the sun is directly overhead. But in Alexandria, some
500 miles to the north, the sun is not overhead, but rather 7½° south of the
overhead position.
(a) Explain how the shadow of a vertical stick can be used to measure
the sun's angle at Alexandria.
(b) Explain the use of Fig 2.16 in the argument, and use it to estimate
the size of the Earth from the Alexandria data.
2.28 A tower is 200 m away from you, over level ground. The top of the
tower appears 20° above the horizontal from where you stand. How high
is the tower? How might your own height affect your answer? Include an
informative sketch that shows the idea of the computation.
2.29 A square is constructed on the hypotenuse of a right triangle with a
30° acute angle in it. What is the ratio of the area of the square to the area
of the triangle?
Fig. 2.16: Noon rays of sunlight at Alexandria (A) and Syene (S). The angle is somewhat
exaggerated here. It is actually only 7½°. The Earth's equator is shown as a dashed line.

2.30 On Archimedes' tomb was engraved a sphere inscribed in a cylinder.
The cylinder was a right cylinder over a circular base, like a tin can, and to
say the sphere was inscribed means that the cylinder was just big enough
to contain the sphere, touching it at the top and bottom, and around the
equator. Sketch this famous figure, and show that Archimedes' own formulae
imply that the sphere has 2/3 the volume of the cylinder.
2.31 Explain how the Pythagoras theorem and the special triangles in
Fig. 2.14 imply the values of the trigonometric functions for $\pi/6$, $\pi/4$, and
$\pi/3$. Which of these trigonometric values, would you say, are the easiest to
remember?
Chapter 3
Geometrical Optics
Around the year 1420 a few painters in Italy began using geometry to plan
and design their paintings. The result was an artistic revolution, called per-
spective painting. They had discovered, or re-discovered, a mathematical
description of how we see, and in particular why things appear large or small
to us. What they realized (and they had Euclid's Optics as an authority for
it) is that the size we see is really angular size.
This idea is a bit tricky, because seeing has an important psychological
component too. Something large appears small if it is far away, to be sure,
but if we know it is large, we make a mental adjustment. The most dramatic
example of this phenomenon must be the appearance of the full moon. On the
horizon at nightfall it looks enormous, but later, high in the sky at midnight,
it looks much smaller. And yet its angular size never changes! (It is hard to
believe this until you have measured it yourself. As you read this chapter,
think of a way to measure the angular size of the moon, and try it out on
the next full moon.) Why the moon seems to change size is still a subject
for debate. Perhaps when we see the moon on the far horizon, where large
things should look small, being so far away, and even at that great distance
the moon doesn't look small, we interpret it as meaning that the moon must
be huge. Later, when it is high in the sky, and we have no cues about its
great distance, we have no reason to believe it is large, and thus it looks
smaller. (This is not the only theory.)
The example of the full moon is the exception, though. What the Renaissance
painters discovered is that on the whole, we do see by geometry. The
experimental proof was in their paintings. Constructed by geometry, they
looked real. One of the first examples, the Trinity fresco of Masaccio in
Santa Maria Novella in Florence, painted around 1425, can still be seen today.
This fresco astounded Masaccio's contemporaries: they knew it was painted on
a flat wall, but, as they said, it was like looking through the wall, so
realistic was the illusion of depth and distance.
3.1 Angular Size
The secret to painting realistically is to make everything in the painting have
the same angular size it would have if the scene were real. So what exactly
is angular size? Well, the angular size of the moon is about one half degree,
or about 0.0087 radian. This means the angle formed at the eye by rays of
light coming from the extreme edges of the moon is 0.0087 radian. The situation
is illustrated in Fig. 3.1 for the angular size $\theta$ of any sphere. The angle
$\theta$ at the eye is also called the angle subtended by the sphere. In the small
angle approximation it is $\theta = 2R/D$, where R is the radius of the sphere and D is
the distance to the sphere. More generally, for any object, not necessarily a
sphere, the angle subtended at the eye, in small angle approximation, is the
height H of the object divided by its distance D, as in Fig. 3.2. This theory
assumes that light travels in straight lines, and that we can use Euclidean
geometry!
Notice how the small angle approximation was used in Figs. 3.1 and 3.2
to get the main idea into a simple form. The angular size of something is just
its linear size divided by its distance, 2R/D in the case of a sphere of radius
R, and H/D in the case of something of height H. It would work the same
way for an object of width W: it would subtend angle W/D at the distance
D. This makes angular size a truly simple concept. The formula is not quite
exact, however! The actual angular size of the moon is
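
$$\theta_{\text{moon}} = 2\sin^{-1}\!\left(\frac{R_{\text{moon}}}{D_{\text{moon}}}\right) \approx \frac{2R_{\text{moon}}}{D_{\text{moon}}} \qquad (3.1)$$
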
Is it worth using the exact formula to figure out the angular size of something,
including the symbol $\sin^{-1}$ that you might not even understand? For many
purposes, no. We will usually assume we are describing something that is
far enough away so that the small angle approximation applies. Thus we will
often say $=$ (equals) where strictly speaking we should say $\approx$ (approximately
equals).

Fig. 3.1: The angle $\theta$ subtended at the eye E by a sphere of radius R at a distance EO = D
can be computed by studying this figure. In small angle approximation it is
$\theta = 2(\theta/2) \approx 2\sin(\theta/2) = 2R/D$, just the diameter of the sphere divided by its distance.

Fig. 3.2: The angle $\theta$ subtended at the eye E by an object of height H at a distance D is
$\theta = 2(\theta/2) \approx 2\tan(\theta/2) = 2(H/2)/D = H/D$.

Now suppose you want to represent the full moon realistically in a painting.
How should you paint it? Well, it should have angular size 0.0087 radian to
the viewer. That means it should be painted with a height H, or width W
(or just call it diameter, since the moon is round) such that H/D = 0.0087,
where D is the distance of the viewer. This points out a problem with
perspective painting. The whole construction must assume that the viewer is at
some definite distance D. In most locations where you would put a painting,
though, there is no way to force the viewer to stand in a particular place.
Fortunately this turns out not to be a critical consideration. The
psychological aspects of seeing make us quite tolerant about being in the wrong
place: the realism effect still works. So just assume a reasonable distance for
D, like 2 meters. Then the diameter of the moon in the painting should be
$2 \times 0.0087 = 0.0174$ m = 1.74 cm.
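
To see how little is lost in the small angle approximation, and to repeat the painter's arithmetic, here is a minimal Python sketch. The lunar radius and distance are assumed round values that do not appear in the text, and the function names are ours, chosen only for illustration.

```python
import math

# Assumed round values, not taken from the text: mean lunar radius and
# mean Earth-moon distance, both in kilometers.
R_MOON_KM = 1737.0
D_MOON_KM = 384_400.0


def angular_size_exact(radius, distance):
    """Exact angle subtended by a sphere, as in Eq. (3.1): 2*asin(R/D)."""
    return 2.0 * math.asin(radius / distance)


def angular_size_small_angle(radius, distance):
    """Small angle approximation: diameter divided by distance."""
    return 2.0 * radius / distance


def painted_size(angle_rad, viewing_distance):
    """Size on the canvas that subtends angle_rad at the viewer: H = angle * D."""
    return angle_rad * viewing_distance


theta_exact = angular_size_exact(R_MOON_KM, D_MOON_KM)
theta_approx = angular_size_small_angle(R_MOON_KM, D_MOON_KM)
relative_error = abs(theta_exact - theta_approx) / theta_exact

print(f"exact:       {theta_exact:.7f} rad ({math.degrees(theta_exact):.3f} deg)")
print(f"small angle: {theta_approx:.7f} rad")
print(f"relative difference: {relative_error:.2e}")

# The worked example from the text: 0.0087 rad seen from 2 m.
moon_on_canvas_m = painted_size(0.0087, 2.0)
print(f"painted moon diameter: {100 * moon_on_canvas_m:.2f} cm")
```

With these assumed values the two formulas agree to a few parts in a million, which is why we rarely bother with the exact one.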
Of course most paintings don't try to represent things life-size. More
often the whole scene is scaled so that things only keep the same relative
proportion to each other, but appear, say, 2/3 the size they actually would.
Again, we adjust to this psychologically, and we still say the painting looks
realistic. We even seem to have a particular fondness for miniature paintings.
Perhaps it is a kind of nonverbal joke that things look both real and not real
(because they are small). Our way of seeing is all about proportions. That
is the secret of perspective painting.
Suppose a scene contains two human figures, each about the same height
H in reality, but at different distances, $D_1$ and $D_2$. Then in the real scene
they have angular sizes $\theta_1 = H/D_1$ and $\theta_2 = H/D_2$. Their relative angular
sizes are therefore $\theta_1/\theta_2 = D_2/D_1$ (the common factor H cancels out in the
ratio). Note common sense: the nearer figure subtends the bigger angle.
This same proportion must be observed in the painting, where the figures
will be painted with heights $H_1$ and $H_2$, and viewed at distance D, so that
they have angular sizes $H_1/D$ and $H_2/D$. The ratio of these is $H_1/H_2$, and
it must agree with $D_2/D_1$ as found above: therefore

$$\frac{H_1}{H_2} = \frac{D_2}{D_1} \qquad (3.2)$$

For instance, if the second figure is twice as far away in the real scene, so
that $D_2/D_1 = 2$, then the height of the second figure in the painting must
be only half that of the first figure in the painting, so that $H_1/H_2 = 2$. The
second figure looks smaller in the real scene because it is farther away, and
it looks smaller in the painting because it is painted smaller (but it is not
farther away). The relative angular size, however, is the same in both cases.
The angle H/D gets smaller as D gets larger (at fixed H), but it also gets
smaller as H gets smaller (at fixed D). That is the phenomenon that is being
exploited here. Everyone understands this qualitatively, but we are giving
a genuinely mathematical description. This extra step requires an effort, of
course, and the masters of Renaissance painting took this idea much farther
than we are taking it here. In doing so they were laying the foundation for
the rebirth of theoretical physics.
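
The proportion between painted heights can be tried out numerically. The sketch below is only an illustration: the scene (two figures 1.7 m tall at 5 m and 10 m, a viewer 2 m from the canvas) is invented, and `painted_height` is a name of our own.

```python
def painted_height(real_height, real_distance, viewing_distance, scale=1.0):
    """Height to paint a figure so that it subtends `scale` times its real angle.

    The real angular size is real_height / real_distance (small angle
    approximation); on the canvas that angle needs height = angle * viewing_distance.
    """
    angular_size = real_height / real_distance
    return scale * angular_size * viewing_distance


# Hypothetical scene, not from the text.
H = 1.7              # real height of both figures, meters
D1, D2 = 5.0, 10.0   # real distances of the two figures, meters
D_VIEW = 2.0         # assumed distance of the viewer from the canvas, meters

h1 = painted_height(H, D1, D_VIEW)
h2 = painted_height(H, D2, D_VIEW)
print(f"life-size angles: H1 = {h1:.2f} m, H2 = {h2:.2f} m, H1/H2 = {h1 / h2:.2f}")

# Shrinking the whole scene to 2/3 size changes both heights but not their ratio.
h1_small = painted_height(H, D1, D_VIEW, scale=2 / 3)
h2_small = painted_height(H, D2, D_VIEW, scale=2 / 3)
print(f"2/3 scale:        H1 = {h1_small:.2f} m, H2 = {h2_small:.2f} m, "
      f"H1/H2 = {h1_small / h2_small:.2f}")
```

The ratio of the painted heights comes out equal to $D_2/D_1$ whatever scale or viewing distance is assumed, which is the content of the proportion derived above.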
3.2 The Eye
In Renaissance theories of how we see, the eye was just a point, like the point
E in Fig. 3.1. To understand optics, though, it is essential to know how the
eye works in more detail than that! In Fig. 3.3 the point E has been replaced
by a schematic diagram of the eye, with its own extended structure, looking
at an object of height H at a distance D, so that the object has an angular
size $\theta = H/D$. We suppose rays of light from the extremities of the object
enter the pupil of the eye and hit the retina as shown. This amounts to a
theory of how the eye works. WARNING: there are significant problems with
this theory already! Try to spot them; we will improve the theory as we go.
The image occupies the region $H'$ on the retina. How big is the image
$H'$? Using the small angle approximation, and assuming that the angle $\theta$
outside the eye is also the angle inside the eye, as it would be if the rays were
straight lines, we have

$$\theta = \frac{H'}{D'} \qquad (3.3)$$

where $D' \approx 2.5$ cm is the distance from the pupil of the eye to the retina,
i.e., just the diameter of the eye. Therefore

$$H' = D'\theta \qquad (3.4)$$

This is the kind of statement that we want to be able to read with
understanding. It says that the size of the image ($H'$) is proportional to the
angular size ($\theta$) of the object, with a constant of proportionality $D'$, the
diameter of the eye. This is what we have been claiming right along: what we
see ($H'$) is proportional to angular size ($\theta$), i.e., $H' \propto \theta$. It isn't a statement
about numbers: it is a statement about a relationship, in this case just the
simple relationship of proportionality. Perceived size and angular size are
proportional. This is a mathematical theory.

Fig. 3.3: A first attempt at a theory of the eye. WARNING: this is a useful start, but even
the physics is not this simple, not to mention the eye!

Cameras work the same way: the images that form on the film, or on
the detector array of a digital camera, also have a linear size proportional to
the angular size of the object being photographed. Of course the eye is not
a simple camera, and we don't literally sense the pattern on the retina (in
particular we don't see the images upside down, as they actually form!). But
these images are the raw material of seeing, and their sizes reflect angular
sizes. That information is largely retained in the mental processing of vision.
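
To get a feeling for the sizes involved, the proportionality between image size and angular size can be evaluated directly. This is a sketch of the simple straight-ray model only, using the 2.5 cm eye diameter quoted above; the person of height 1.7 m at 10 m is an invented example.

```python
EYE_DIAMETER_CM = 2.5  # distance from pupil to retina, the value quoted in the text


def angular_size(height, distance):
    """Small angle approximation: theta = H / D (same units for H and D)."""
    return height / distance


def retinal_image_size(theta, eye_diameter_cm=EYE_DIAMETER_CM):
    """Image size on the retina in centimeters: eye diameter times theta."""
    return eye_diameter_cm * theta


# The full moon, angular size 0.0087 rad.
moon_image_cm = retinal_image_size(0.0087)

# Hypothetical example: a person 1.7 m tall seen from 10 m.
person_image_cm = retinal_image_size(angular_size(1.7, 10.0))

print(f"moon image:   {10 * moon_image_cm:.2f} mm")
print(f"person image: {10 * person_image_cm:.2f} mm")
```

Doubling the angular size doubles the image, and nothing else about the object matters in this model: that is what proportionality means here.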
3.3 Binocular Vision and Parallax
Another way we use our ability to measure angle with our eyes is to measure
distance. When we fix attention on a nearby point P with both eyes, the eyes
individually see the point P in slightly different directions. This difference,
the parallax angle, is a measure of how close P is, as you can see in Fig.
3.4. In fact, if P is a distance D away, then the parallax angle is $\theta = H/D$
where H is the distance between the eyes, about 10 cm. Since H is fixed
by our anatomy, a measure of $\theta$ is a measure of D (or, you might say, of
1/D). How do the eyes measure $\theta$? Fig. 3.4 points out that if the eyes look
straight ahead, then the two images of P are at points on the retina that
don't correspond. In principle the brain could process the information in
this form, but actually, in this configuration we see a double image and we
don't get very good distance information. When we bring the two images
together, by crossing our eyes, muscles turn our eyes toward each other until
the point P that we are paying attention to is imaged at corresponding points
on the two retinas. How far we have to turn our eyes is a measure of $\theta$. We
sense it by sensing the strain in the muscles that move the eyes, and the
eyes provide the feedback so that we know when we have turned far enough.
This complicated set of sensations is integrated in the brain into a sense of
distance.

Fig. 3.4: Two eyes, separated by H, look at a point P at distance D. They do not agree
on the angular position of P, differing by the angle $\theta$, called the parallax angle. This
difference shows up as an offset on the retina. The top eye does not image P at the point
A, as the bottom eye does, but at a different point B. This offset is thus a measure of the
angle $\theta$. Since $\theta = H/D$ and H is the fixed distance between the eyes, the visual system
has measured the distance to P, namely $D = H/\theta$.

For distant objects we don't have to use these muscles: the parallax
angle is essentially zero, and both eyes can look in the same direction. That
is, the light arriving from a distant point comes along parallel lines. Starlight
is certainly like this, for example, because stars are very distant, but even
something 10 meters away has such a small parallax that our visual system
can't measure it. Measuring distance by measuring parallax doesn't work
for D much much bigger than H, because then all you can say is that
$\theta = H/D \approx 0$, and if all you can say is that $\theta \approx 0$, then all you can say is that D
is essentially infinite. Here "infinite" just means much bigger than H, i.e.
much bigger than 10 cm. We can measure the distance to objects within a
couple of meters by this method, but for objects farther away we must use
other cues.
In other situations where parallax is used for measuring distance, not
unconsciously with the eyes, but by sighting and triangulation, the way
surveyors do it, H is called the baseline for the measurement. What we are
pointing out here is that if you can't measure the parallax angle very
accurately, then you can't determine distances D that are much bigger than the
length of the baseline H. Surveyors work hard to measure $\theta$ accurately, and
so extend the usefulness of the method to larger D. Astronomers do too.
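
The way parallax fails at large distance can be made quantitative. Since $D = H/\theta$, a small error $\delta\theta$ in the angle produces a fractional error $\delta D/D = \delta\theta/\theta = D\,\delta\theta/H$ in the distance. The sketch below uses the 10 cm baseline from the text; the angular error of 0.005 radian is a made-up number chosen only to show the trend, not a measured property of the eye.

```python
def distance_from_parallax(baseline, parallax_angle):
    """Invert theta = H / D to get D = H / theta (small angle approximation)."""
    return baseline / parallax_angle


def fractional_distance_error(baseline, distance, angle_error):
    """Fractional error in D caused by a small error in theta.

    From D = H / theta it follows that |dD| / D = d_theta / theta.
    """
    theta = baseline / distance
    return angle_error / theta


BASELINE_M = 0.10        # separation of the eyes used in the text
ANGLE_ERROR_RAD = 0.005  # hypothetical error in sensing the angle

for distance_m in (0.5, 2.0, 10.0):
    theta = BASELINE_M / distance_m
    error = fractional_distance_error(BASELINE_M, distance_m, ANGLE_ERROR_RAD)
    print(f"D = {distance_m:5.1f} m   theta = {theta:.3f} rad   "
          f"fractional error in D = {error:.0%}")
```

For a fixed angular error the fractional error in D grows in proportion to D/H, so the method degrades steadily once D is many baselines long. A longer baseline or a more accurate angle measurement pushes the useful range outward.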
3.4 Wide Open Pupils
Up until now, we have treated the pupils of the eye as if they were very small,
so that only one well defined ray comes through, intersecting the retina at a
well defined point. A very small pupil is sometimes called a "pinhole," and
a pinhole camera can be built that actually works with a pinhole in place of
a lens, just as these diagrams suggest. If our eyes really were like this, no
one would need glasses, certainly an advantage. In bright sunlight you may
see more clearly, even if you normally need glasses, just because your pupils
stop down and are more like pinholes.
The disadvantage of pinholes for eyes is that, under normal lighting
conditions, very little light would come through them. Of course under these
conditions our pupils open up to let in more light, but that means the
imaging system has to be more complicated than in Fig. 3.3. We propose an
improved model for the eye in Fig. 3.5, where all the rays from a distant
star are imaged at the same point, so that the star is seen as a nice sharp
point and not as a smudge. As we pointed out in the last section, rays of
light from a distant point are parallel. But then, since the rays intersect
in a single point on the retina, they cannot remain parallel: they must change
direction as they enter the eye. The surface of the cornea of the eye, the
outer transparent layer where light enters, has a precisely determined shape
to be sure that this focussing happens correctly. We have not tried to show
the details of this shape in the figure.

Fig. 3.5: Eye receiving light from a distant star. WARNING: this is still very schematic.

If the cornea has the wrong shape, an extremely common condition, then
the result may be more like Fig. 3.6, where the image of the star is spread
over a noticeable area on the retina. This eye is myopic, near-sighted. As
we will see, if the object were brought nearby, then the image would move
back toward the retina, and for a near enough object it would be imaged on
the retina. This is the experience of near-sighted people, who can at least
see nearby things clearly. A far-sighted eye has the opposite problem: the
image is behind the retina, and bringing the object closer only causes the
image to move even further back, making the focussing discrepancy worse.
Laser surgery on the cornea can now change the shape of the cornea and
correct these focussing errors. We will look at the geometry of this situation
in more detail later.

Fig. 3.6: A myopic eye.

Fig. 3.7: The good eye of Fig. 3.5 without an adapting lens. As the object point P moves
closer, the image moves back, off the retina.

3.5 The Lens of the Eye
By the remarks of the preceding section, the good eye in Fig. 3.5 should have
trouble seeing nearby objects! It can see a star as a sharp point, to be sure,
but if an object comes close, like the point P in Fig. 3.7, the image moves
back and forms behind the retina. As you can tell in Fig. 3.7, the retina itself
gets a spread out, blurry image of P. Our eyes have an adaptive feature to
help us see nearby objects clearly, the lens, a transparent component of the
eye connected to muscles that can change its shape. The cornea has a fixed
shape, and that is the problem. The lens, inside, can change its shape and
thus change the direction of the rays, making them focus on the retina as
shown schematically in Fig. 3.8. We do not attempt to show the lens or
the actual path of the light inside the eye, but the rays cannot be simple
straight lines. They must make additional deviations, just as they do at the
cornea in entering the eye. These additional deviations are small corrections
to bring the light to a good focus on the retina. We can sense the muscular
strain in the lens muscles when we focus on something nearby, and this is
another visual cue we have about distance. Of course there is a limit to how
much correction the lens can apply. If you try to look at a point too close to
your eye, you will not be able to see it clearly. It is a natural consequence of
aging that the lens becomes less capable of applying this correction, so older
folks have to get more elaborate external corrective lenses to make up for the
failing internal one.

Fig. 3.8: The good eye of Fig. 3.5 with a lens (only the effect of the lens is shown).

If you take the lens off a camera, the camera is useless: no image forms.
But if your lenses were surgically removed, you could still see (sort of). The
reason is that the analogue of the camera lens is really the cornea, the outer
surface of the eye, and you would still have that. That is the main component
of our image-forming system. The shape of the cornea wouldn't be right,
because the construction of the eye assumes the lens will be in the system, but
you could wear glasses to correct for its absence. Our "lens" is an auxiliary
part of the system that cameras don't have. By the way, how do cameras
solve the problem that the lens solves for the eye?
3.6 Refraction
Image formation in Fig. 3.5 depends on the bending of the rays at the air-
cornea interface. This is the phenomenon of refraction. Now we look at
refraction in more detail. The basic phenomenon is shown in Fig. 3.9. A light
ray passes from one transparent material into another: it makes the angle
$\theta$ on one side and the angle $\theta'$ on the other side, measured from the normal
direction. ("Normal" means perpendicular to the interface.) The two angles
are different, but they are connected by a simple rule called Snell's Law,
discovered by the Dutch mathematician Willebrord Snell in the early 1600s.

Fig. 3.9: Refraction at a horizontal interface. A light ray changes direction as it goes from
the top medium, characterized by index of refraction $n$, to the bottom medium,
characterized by index of refraction $n'$. The two direction angles are measured from the normal
direction, perpendicular to the interface, the dashed vertical line. The picture would look
exactly the same if the ray were going the other direction, from bottom to top.

Snell must have carefully measured the angles $\theta$ and $\theta'$ on both sides, and
then made various guesses about the data that he collected. The table below
shows what he might have measured at an air-water interface (angles in
radians), and how he might have made his discovery.

$\theta$    $\theta'$   $\theta/\theta'$   $\sin\theta$   $\sin\theta'$   $\sin\theta/\sin\theta'$
0.300       0.224       1.34               0.296          0.222           1.33
0.600       0.439       1.37               0.565          0.425           1.33
0.900       0.630       1.43               0.783          0.589           1.33
1.200       0.777       1.55               0.932          0.701           1.33

As the angle $\theta$ in the first column of the table increases, the second angle
$\theta'$ also increases. You might guess that there is a simple proportionality between
the two angles, and in fact the third column shows that they are roughly
proportional, but the ratio isn't quite constant, and to be proportional, they
would have to be related by a constant. You consider the sines of the angles
in the fourth and fifth columns, instead of the angles themselves, and aha!
The sines are proportional. Their ratio is a constant, which turns out to be
1.33. This number describes refraction at an air-water interface! It is called
the relative index of refraction of the interface.
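
Here is a minimal Python sketch of the same exercise. It does not use real measurements: it assumes the relative index 1.33 from the start, computes the second angle from the first, and then forms the two ratios, so it only shows what data obeying the sine rule would look like. The function names are ours.

```python
import math

N_RELATIVE = 1.33  # relative index of the air-water interface, as in the text


def refracted_angle(theta, n_relative=N_RELATIVE):
    """Angle in the water (radians) for a ray arriving at angle theta in air."""
    return math.asin(math.sin(theta) / n_relative)


def snell_table(angles):
    """One row per incident angle: both angles, their ratio, both sines, their ratio."""
    rows = []
    for theta in angles:
        theta_p = refracted_angle(theta)
        rows.append((
            theta,
            theta_p,
            theta / theta_p,
            math.sin(theta),
            math.sin(theta_p),
            math.sin(theta) / math.sin(theta_p),
        ))
    return rows


rows = snell_table([0.3, 0.6, 0.9, 1.2])
print("theta  theta' ratio  sin    sin'   ratio")
for row in rows:
    print("  ".join(f"{value:.3f}" for value in row))
```

The ratio of the angles drifts upward as the angle grows, while the ratio of the sines stays fixed at 1.33, which is the pattern the table is meant to reveal.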
More generally, Snell's law says that this works for any interface: the sines
of the angles are proportional in refraction, and the relative index depends
on which materials you use. The simplest way to write the law is

$$n\sin\theta = n'\sin\theta' \qquad (3.5)$$

where $n$ and $n'$ are the indices of refraction of the two materials individually.
Solving for $\sin\theta/\sin\theta' = n'/n$ we see that the relative index of refraction
is the ratio of the indices of refraction of the two materials. Thus every
transparent material is characterized by its index of refraction, and interfaces
between one material and another are characterized by their ratio.
The table below shows typical values of the index of refraction for a
few materials. Since the experiment only measures the ratios, we get to
choose some special transparent medium and give it the index 1 by definition.
Initially air was given this special status. But as the theory developed to
describe more phenomena than just refraction, it became clear that a better
choice would be to give the vacuum the index n = 1 by definition, and then,
by measurement, air under normal room conditions has index slightly greater
than 1, as shown in the table. For most purposes you can take the index of
refraction of air to be 1.

Medium    n
Air       1.00028
Ice       1.31
Water     1.33
Glass     1.5-2.0

As the table implies, there are many kinds of glass, and they differ quite
a lot in their index of refraction. Also, when water freezes, its index of
refraction changes slightly. In fact the index of refraction for a given material
depends on temperature, and also on color(!), so the numbers in the table
are merely representative. The index of refraction turns out to be a very
interesting property, and summarizes in an average way quite a lot about
how the atoms of a material interact with light. That is not at all obvious
just from the phenomenon of refraction alone. For hundreds of years this
index was measured for all kinds of materials under all kinds of conditions
without any deep knowledge of why it had the values it did. The experimental
value was enough to characterize a material for optical purposes, and so it
was worth measuring, even if it was, in a deeper sense, a mystery.
In fact, Snell's Law is a good example of a mysterious proportionality
in Nature, with the generality that is typical of physics. It is an idea that
is useful across the sciences. Biologists are aware of the index of refraction
in microscopy, and geologists use it to identify minerals. Chemists use it
to characterize compounds. For physicists it is just one manifestation of a
richly detailed theory of matter that is now quite well understood. This old
idea from 17th century Holland is now woven into a much larger theoretical
framework of atoms, molecules, crystals, and electric fields. But even without
knowing any of this, we will find Snell's Law has interesting things to say
about how the eye works.
3.7 Focal Length
Our theory of how the eye works in Fig. 3.5 is not much of a theory: it looks
more like a cartoon. This figure has more content, though, than you might
think, if we just use geometry and Snell's Law to dig that content out. We
redraw the essential idea in Fig. 3.10, putting the star on the horizontal axis,
for simplicity, and only drawing two parallel rays. We assume the cornea has
a spherical shape, of radius R, and that the rays intersect behind the cornea
a distance f, the focal length, the distance the light travels to reach a focus.

Fig. 3.10: Two parallel rays, DB and EC, are refracted at a spherical interface of radius
R and are brought to a focus at the point A, a distance f behind the interface.

This will be a more elaborate argument than anything we have done up
to now. The steps are not difficult individually, however; try to check each
step of the argument, and then see how it all fits together. We start with
Snell's Law in small angle approximation: in that case Eq (3.5) becomes

$$n\theta = n'\theta' \qquad (3.6)$$

This says the two angles in refraction, $\theta$ and $\theta'$, are simply proportional, and
since in that case $\theta' = (n/n')\theta$, or equivalently $\theta/\theta' = n'/n$, the ratio of
the angles is just the relative index of refraction. You can check this in the
first table in Section 3.6, third column, even though the angles aren't
particularly small. For an angle as small as 0.3 radian the approximation is quite
good, and it just gets better for smaller angles (not shown in the table). Thus
for small angles, Snell's Law is particularly simple.
In Eq (3.6), with reference to Fig. 3.10, the index $n$ refers to the medium
on the right and the index $n'$ refers to the medium on the left. If the sphere
is an eyeball, then it is air on the right and eyeball on the left (or vitreous
humor, the fluid inside the eye). If $n$ is the index of refraction of air, we
would have $n \approx 1$, but let us keep calling it $n$ for generality. Then at the end,
just by taking n = 1.33 instead of 1, we can also tell what happens when we
open our eyes under water, at a water-cornea interface. To begin checking
the argument, locate the angles $\theta$ and $\theta'$ in Fig. 3.10 and be sure you see how
they really are the usual angles in refraction.
Now we read off some geometrical relationships from Fig. 3.10, writing $\phi$
for the angle at A between the refracted ray BA and the axis AC.

$$OB = R \qquad (3.7)$$
$$AC = f \qquad (3.8)$$
$$BC = R\sin\theta = f\tan\phi \qquad (3.9)$$
$$\phi = \theta - \theta' = \left(1 - \frac{n}{n'}\right)\theta = \left(\frac{n'-n}{n'}\right)\theta \qquad (3.10)$$

It takes a little work to find all these relationships in the figure: think
of it as a puzzle and try to solve the puzzle. The last one, Eq (3.10), uses
Snell's Law, in small angle approximation, Eq (3.6).
Now, since the angles $\theta$ and $\phi$ are small, Eq (3.9) simplifies, by the small
angle approximation, to $R\theta = f\phi$, or

$$f = \frac{\theta}{\phi}\,R = \left(\frac{n'}{n'-n}\right)R \qquad (3.11)$$

This is a result! It predicts where the rays focus, i.e. the distance f, just
knowing the shape of the cornea R and the indices of refraction $n$ and $n'$.
That is all we need in order to evaluate the right hand side.
There is a very interesting and peculiar thing about Eq (3.11). The final
result, on the far right, does not have $\theta$ in it. The angle $\theta$ cancels out in
the ratio $\theta/\phi$, which occurs in an intermediate step of the calculation. This
means that all the parallel rays like BD that we didn't show in the figure
determine the same focal length f, i.e. they all intersect ray EA at the same
point A. Each such ray has its own $\theta$, as you can verify by sketching in a
typical parallel ray between the two that are already there, for example. The
argument above leads to the same focal length f for all such rays, since the
final result doesn't depend on $\theta$. Thus the rays must behave like the rays in
Fig. 3.5, all coming together to a single focal point.
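
The claim that every parallel ray comes to the same focus holds only in the small angle approximation, and it is instructive to see how good that is. The sketch below traces a single ray through a spherical surface using Snell's Law without approximation and compares the crossing point with Eq (3.11). The sphere of radius 1 and the indices 1 and 4/3 are example values, and the function names are ours.

```python
import math


def focal_length_paraxial(R, n, n_prime):
    """Eq (3.11): f = n' R / (n' - n), valid for small angles."""
    return n_prime * R / (n_prime - n)


def focal_length_exact(h, R, n, n_prime):
    """Distance behind the front of the sphere at which one ray crosses the axis.

    The ray arrives parallel to the axis at height h and is refracted once,
    with Snell's Law applied exactly at the point where it meets the sphere.
    """
    theta = math.asin(h / R)                                # angle of incidence
    theta_prime = math.asin(n * math.sin(theta) / n_prime)  # angle of refraction
    deviation = theta - theta_prime        # angle between refracted ray and axis
    sag = R * (1.0 - math.cos(theta))      # how far behind the front the ray hits
    return sag + h / math.tan(deviation)


R, N, N_PRIME = 1.0, 1.0, 4.0 / 3.0  # example values: air outside, water-like inside
f_small_angle = focal_length_paraxial(R, N, N_PRIME)
print(f"Eq (3.11): f = {f_small_angle:.4f} R")
for h in (0.01, 0.05, 0.1, 0.3):
    print(f"h = {h:4.2f} R   exact crossing at {focal_length_exact(h, R, N, N_PRIME):.4f} R")
```

Rays close to the axis cross it almost exactly where Eq (3.11) says, while rays farther out cross slightly closer to the surface. The single focal point is therefore a property of the small angle approximation, not of the sphere itself.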
3.8 Interpreting Relationships
It is easy to make a mistake in a complicated argument like this, so it is
essential, when we finally get to a result like the one in Eq (3.11), to check
common sense, by interpreting the meaning.
The first thing is to check dimensions: f on the left is a length (dimension
[L]), so the right side must also be a length. We see that it is: R is a length
(dimension [L]), and the ratio of indices of refraction is a pure number. So
our result passes the dimensional check, that both sides of the equation must
have the same dimension. We also see something that we might even have
anticipated, if we had been clever: the focal length f is a multiple of the size
of the sphere. It had to come out that way! The focal length is a length,
after all, and the only length in the problem, the way it comes to us in Fig.
3.10, is the radius of the sphere, R. What length could be the result of a
calculation using properties of the sphere? Only some multiple of R. This
observation is surprisingly useful: knowing just the dimensions of what you
are looking for (f in this case), think what could even conceivably be the
result. Often, as here, there is basically only one possibility. This argument
is called "dimensional analysis," and we will certainly see it again.
Now we consider special or limiting cases. This is another crucial thing to
do with any expression like Eq (3.11). What would happen if $n = n'$? This
would mean that $\theta = \theta'$, so that there really isn't any refraction after all:
the rays go right through the interface without changing direction. Clearly
in that case they wouldn't come to a focus, because they are all parallel. And
that is just what Eq (3.11) says: as $n' \to n$, the denominator goes to zero,
and $f \to \infty$. The focal point recedes away to infinity, which is a way of saying
the rays are parallel. Finally what would happen if $n' \gg n$? This means an
enormous index of refraction in the left hand medium, which means $\theta' \approx 0$,
i.e., the rays on the left would be essentially along the normal direction.
Thus all rays would converge to the point O, which is at the distance R from
the interface, being the center of the sphere. And again, that is just what
Eq (3.11) says: if $n' \gg n$, then we can ignore $n$ in the denominator, and the
fraction in parentheses is essentially 1. In this case, Eq (3.11) says f = R,
which we just saw was correct.
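
Both limits can be read off at once by writing Eq (3.11) in terms of the single ratio $n/n'$. This is only a restatement of the argument above, with the first correction for large $n'$ taken from the ordinary geometric series:

$$f = \left(\frac{n'}{n'-n}\right)R = \frac{R}{1 - n/n'}
\qquad\Longrightarrow\qquad
\begin{cases}
f \to \infty, & n/n' \to 1 \quad \text{(no refraction)},\\[4pt]
f \approx R\left(1 + \dfrac{n}{n'}\right) \to R, & n/n' \ll 1 .
\end{cases}$$

Written this way it is also clear that f depends on the two indices only through their ratio, as it must, since refraction only ever measures the relative index.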
Taking some time to check a result like this, to interpret it, is a very
important part of learning physics. It is how you begin to learn to read
relationships like Eq (3.11). And let us emphasize again that what we are
talking about is relationships. If we had done all this with specific choices of
numbers, we would just get some number as the result, and there would be
no way to see if it made sense or not. It wouldn't mean anything, or rather it
would mean something too restricted and specific to be of any interest. But a
relationship like Eq (3.11) is full of meanings that we still haven't extracted,
even though we have begun to see what it means.
3.9 The focal length of the eye
We know the focal length f of the eye: it is the diameter of the eye, because
the focal point is on the retina. Thus if the eyeball is really a sphere of radius
R, then f = 2R. Comparing with Eq (3.11), we deduce that the factor in
parentheses must have the value 2, i.e. $n'/(n' - n) = 2$. Since n = 1 in air,
we can solve for $n'$, the index of refraction of the vitreous humor, and we find
$n' = 2$ (check that this is the solution). This is quite remarkable: we predict
what must be inside the eye, almost by pure thought!
As soon as we do this, though, we realize it can't be true. The index
$n' = 2$ is just too big. It is true that very dense glass might have an index
close to n = 2, and diamond has an index n = 2.42, but these values are
unusual. The vitreous humor undoubtedly has a high water content, and
water has quite a small index, just 1.33, so it seems highly improbable that
Nature could somehow add something to water that would bring the index
up to the value 2. In fact, if we put in $n' = 4/3$, the approximate value
for water, and n = 1, for air, which is surely closer to the truth, we find
$n'/(n' - n) = 4$, so that $f \approx 4R$. And yet f is the diameter of the eyeball!
What is going on??
The only possible resolution of this puzzle is that the eyeball is not a
sphere, and in particular the radius of curvature R of the cornea must be
considerably smaller than half the diameter of the eye, smaller by roughly
a factor of 2. The eye must look more like Fig. 3.11, with a highly curved
cornea superimposed on a basically spherical eyeball of larger radius. The
focal length is determined by the cornea alone, so now we see how it could be
that $f \approx 4R$: the R in Fig. 3.10 is not the radius of the eyeball, it is the radius
of curvature of the cornea. We even realize that this shape is a familiar fact
about the eye. It is the reason that contact lenses don't float freely over the
whole surface of the eyeball, but are confined to just the cornea, where their
shape is molded to fit.
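
The arithmetic of this section is easy to reproduce. The sketch below inverts Eq (3.11) in the two ways used above, assuming the 2.5 cm eye diameter quoted in Section 3.2 and the water-like index 4/3; the function names are ours, and the resulting corneal radius is an estimate from this simple model, not a measured value.

```python
def focal_ratio(n, n_prime):
    """f / R from Eq (3.11): n' / (n' - n)."""
    return n_prime / (n_prime - n)


def index_for_focal_ratio(ratio, n=1.0):
    """Solve n' / (n' - n) = ratio for n' (needs ratio > 1)."""
    return ratio * n / (ratio - 1.0)


def cornea_radius(f, n, n_prime):
    """Invert Eq (3.11) for R, given the focal length f."""
    return f / focal_ratio(n, n_prime)


N_AIR = 1.0
N_WATERLIKE = 4.0 / 3.0
EYE_DIAMETER_CM = 2.5  # focal length of the eye: the focus lies on the retina

print(f"index needed for f = 2R:   n' = {index_for_focal_ratio(2.0, N_AIR):.2f}")
print(f"f/R for n' = 4/3 in air:   {focal_ratio(N_AIR, N_WATERLIKE):.2f}")
print(f"implied corneal radius:    "
      f"{cornea_radius(EYE_DIAMETER_CM, N_AIR, N_WATERLIKE):.3f} cm")
```

With these inputs the corneal radius comes out near 0.6 cm, about half the 1.25 cm radius of the eyeball, which is the factor of 2 mentioned above.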
Fig. 3.11: For reasons given in the text, the radius of curvature R of the cornea must be
smaller than the radius f/2 of the eyeball. (The dotted circle is included in this figure
to help visualize the radius of curvature R of the cornea, and does not correspond to any
structure in the interior of the eye.)

We noted earlier that we could also use Eq (3.11) to think about how we
see when we open our eyes under water. In this case n = 4/3, the value for
water. If also $n' = 4/3$, we would have zero in the denominator, and f would
be infinite, a case we already ran into as a special case in the previous section.
This corresponds to no refraction at the interface, and no image formation.
This is not what happens, though. We can see under water, just not very
clearly. It must be that $n'$ for the vitreous humor is appreciably larger than
4/3, the value for pure water. This is to be expected on other grounds as
well. Adding solutes to water invariably raises the index of refraction. Thus
$n' > n$, and that is why we can see under water! If you imagine $n$ increasing
from 1 (the value for air) in Eq (3.11), the denominator of the fraction in
parentheses would get smaller, and the fraction itself would get bigger. This
means f would get bigger, so the focal point would move back, off the retina:
we are effectively farsighted under water. If you don't know what it is like to
be farsighted, just open your eyes under water. Of course if you wear goggles
or a diving mask, you can suddenly see sharply again under water. Why?
Because now you have an air-cornea interface, not a water-cornea interface,
and the relative index of refraction at the interface is just what it should be
for good vision.
3.10 Virtual Images
The curvature of the cornea is essential for the formation of the image on
the retina. That is clear in Eq. (3.11), which can even describe what would
happen if the cornea were flat. Since $R$ can be anything in this relationship,
suppose $R$ is very large. A flat interface is the limiting case as $R$ goes to
infinity: a small piece of an enormous sphere looks flat. But if we let $R \to \infty$
in Eq. (3.11), then also $f \to \infty$, that is, the image goes to infinity, which is to
say it doesn't form at any finite place. So a flat interface wouldn't produce
an image. That is why our eyes have a curved surface, and why a camera
has a curved lens.

You might very plausibly think, therefore, that a flat interface cannot
form an image. (Isn't that what we just said?) But, in a certain amusing
sense, that is not true! We will show not only that a flat interface does
form an image, but that you are even very familiar with this phenomenon.
We hasten to add that this paradox is possible only because we are going
to redefine the word "image" slightly: the image formed will be a virtual
image. The example we will consider is what you see when you look down
into shallow water, through the flat air-water interface. You see a virtual
image of the rocks, shells, etc. on the bottom.
The geometry of the situation is shown in Fig. 3.12.

Fig. 3.12: Light from a point $P$ under water, as it emerges into air, seems to come from
the point $P'$. Thus $P$ appears to be at the depth $D'$ instead of the true depth $D$.

Once again we inspect the figure and read off geometric information:
$$ BP = D \tag{3.12} $$
$$ BP' = D' \tag{3.13} $$
$$ AB = D\tan\theta = D'\tan\theta' \tag{3.14} $$
We imagine looking straight down into the water, so that the rays our eyes
receive are nearly vertical, that is, the angles $\theta$ and $\theta'$ are small. (They are
not drawn small in the figure, in order to spread things out so that you can
see the geometrical relationships, but now we specialize to the case of small
$\theta$.) In this case we can use the small angle approximation in Snell's Law,
$n\theta \approx n'\theta'$, and find
$$ \frac{D'}{D} = \frac{\tan\theta}{\tan\theta'} \approx \frac{n'}{n} \tag{3.15} $$
We see that the ratio of depths $D'/D$ does not depend on $\theta$, i.e., it does not
depend on which ray we choose. It only depends on the two media (air and
water) through their indices of refraction. All the rays intersect at the same
depth $D'$, as the rays on the right suggest (as long as the angles are small).
That means the emerging light from $P$ seems to come from a different point,
$P'$. Our visual system can estimate distance from the geometry of the rays
we receive, as we have already noticed. Hence water looks shallower than
it actually is! The visual evidence about where the bottom is comes to us
from the rays we actually receive, which seem to emanate from $P'$, not the rays in
the water that emanate from $P$.
The point $P'$ is called a virtual image of the point $P$. When we look
into water, we are really looking at this virtual image. In the case of water
and air we have $n' = 1$ and $n = 4/3$. Thus the depth we see, $D'$, is only
3/4 of the true depth. When you look for yourself (a still swimming pool is
an excellent place to see this effect) you may have the impression that the
effect is even more extreme than this: the pool looks quite shallow. This is
because you are probably looking down at an angle, and not straight down.
The geometry is more complicated to work out in this case, although the idea
is exactly the same: every point on the bottom produces a virtual image that
you see, and the virtual image is even shallower in the general case than in
the straight-down case.
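
The claim that the apparent depth is 3/4 of the true depth for nearly vertical rays, and smaller for oblique ones, can be checked numerically. The following is a minimal sketch, assuming only Snell's Law in its exact form and the geometry of Fig. 3.12; the function name `apparent_depth` and the sample angles are illustrative choices, not taken from the text.

```python
import math

def apparent_depth(true_depth, theta_water, n_water=4/3, n_air=1.0):
    """Depth at which the emerging ray, extended backwards, crosses the
    vertical line through P (the depth D' of Fig. 3.12).

    theta_water is the angle of the ray from the vertical inside the water,
    in radians. It must be nonzero and below the critical angle.
    """
    sin_air = n_water * math.sin(theta_water) / n_air  # exact Snell's Law
    if sin_air >= 1.0:
        raise ValueError("ray is at or beyond the critical angle")
    theta_air = math.asin(sin_air)
    ab = true_depth * math.tan(theta_water)  # AB = D tan(theta), Eq. (3.14)
    return ab / math.tan(theta_air)          # D' = AB / tan(theta')

# Apparent depth of a point 1 unit down, for rays at several angles.
for degrees in (1, 10, 30, 45):
    d_prime = apparent_depth(1.0, math.radians(degrees))
    print(f"theta = {degrees:2d} deg  ->  D'/D = {d_prime:.4f}")
```

For small angles the ratio comes out very close to 0.75, in agreement with Eq. (3.15), and it decreases steadily as the ray in the water tilts further from the vertical.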
A special limiting case of this last effect is easy to see. Because the index
of refraction of water is greater than that of air, light rays bend away from
the vertical as they emerge into air. That is clear in Fig. 3.12. For a large
enough angle, the ray in air will have bent away from the vertical by $\pi/2$,
that is, it will just skim the surface, as shown in Fig. 3.13. According to
Snell's Law, if $\theta' = \pi/2$, so that $\sin\theta' = 1$, the corresponding angle in water,
which we call $\theta_c$ for critical angle, obeys
$$ n' = n\sin\theta_c \tag{3.16} $$
If $n' = 1$, for air, and $n = 4/3$, for water, we have
$$ \theta_c = \sin^{-1}(3/4) = 0.8481 \text{ rad}. $$
This is about $48.6^\circ$. (You might very well wonder what happens to rays from
$P$ at larger angles than this! We'll return to this question.) Meanwhile just
notice that if you look into the water at a grazing angle, along the ray in
Fig. 3.13, you see the virtual image $P'$ at the surface of the water! This
observation confirms, in a limiting case, our impression that when we look
into water at an angle, the water looks really shallow. It is just the virtual
image that we are looking at.

Fig. 3.13: A ray from $P$ at the critical angle $\theta_c$ emerges in air to skim the surface. A nearby
ray from $P$ would intersect the horizontal at the virtual image $P'$. This is essentially where
the point $P$ would appear to be, to someone looking along the water at a very shallow angle.
This phenomenon of the critical angle is something to keep in mind whenever
you think about a light ray going from high to low index of refraction.
The critical angle $\theta_c$ is always determined by the picture in Fig. 3.13 and
the corresponding relationship Eq. (3.16). If we think of light rays going the
other direction, in Figs. 3.12 and 3.13, and perhaps coming to the eye of a
fish at the point $P$, we notice that the fish sees the whole upper world of the
air confined to a cone of half-angle $\theta_c$ about the vertical, a phenomenon sometimes called
Snell's window. You can look through Snell's window yourself if you can
swim down to a reasonable depth: turn over on your back, look up, and you
will see the surface above you illuminated in a bright circle out as far as the
critical angle. Then it goes dark.
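
As a quick numerical check of Eq. (3.16), the sketch below computes the critical angle for water and air and, as an illustration, the radius of the bright circle seen from a given depth. The helper names and the 2 m depth are hypothetical, and the radius formula (depth times $\tan\theta_c$) is simply the geometry of a cone whose axis is vertical.

```python
import math

def critical_angle(n_dense, n_rare=1.0):
    """Critical angle in radians from Eq. (3.16): n' = n sin(theta_c).

    n_dense is the index of the medium the ray starts in (n in the text),
    n_rare the index of the medium it emerges into (n' in the text).
    """
    if n_rare >= n_dense:
        raise ValueError("a critical angle exists only going from high to low index")
    return math.asin(n_rare / n_dense)

def snells_window_radius(depth, n_dense=4/3, n_rare=1.0):
    """Radius of the bright circle on the surface, seen from the given depth."""
    return depth * math.tan(critical_angle(n_dense, n_rare))

theta_c = critical_angle(4/3)
print(round(theta_c, 4), "rad")                # 0.8481 rad
print(round(math.degrees(theta_c), 1), "deg")  # 48.6 deg
print(round(snells_window_radius(2.0), 2), "m radius at 2 m depth")
```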
3.11 Thin Lenses
In thinking about the eye, we have encountered many of the basic ideas of
geometrical optics. We have kept details about the real eye to a minimum,
mentioning them only when physical principle required it. It is interesting
to notice in this special example how physics and biology differ in their
emphases. Biologists treat the eye with less explicit emphasis on geometry.
Physicists have little interest in most of the anatomical structure of the eye,
and treat the essential parts as geometrically as possible. It should be clear
that these approaches complement each other, and that each has useful things
to say.
A glass lens, like a simple camera lens or a magnifying glass, must be
much simpler to understand than the eye, so simple that a biologist wouldn't
even be interested. This is something we ought to be able to understand
pretty completely. In principle we could trace rays through any lens, of any
shape, using Snell's Law in its exact form at interfaces, and in this way
we could learn exactly what any lens does. High quality lenses and lens
systems are designed by this process. As usual, though, physicists use quick
approximations to get at the essential properties of common lens shapes. We
will assume that the interfaces are spherical surfaces, each characterized by
its own radius of curvature $R$, just as we have done above, and we will also
assume that the lenses are thin, so that we can say that a lens is located
at some definite position, without distinguishing between its front and back
surfaces, even though, strictly speaking, they are at slightly different places.
We will also assume that light rays make small angles with the normal at
the lens surfaces, so that we can use the small angle approximation to describe
refraction. This is just what we have been doing right along. The only thing
new here is the requirement that the lens should be thin.

Fig. 3.14: The focal length of a plano-convex lens
A thin lens is characterized by one number, its focal length $f$, as illustrated
in Fig. 3.14: the distance behind the lens at which an image forms of
a point at infinity. It is precisely where the film ought to be located in a
camera to take a sharp picture of a star. One also speaks of the focal plane,
the plane located a distance $f$ behind the lens. In a camera, the film actually
lies in the focal plane when the camera is focused on infinity, and an image
of the night sky would show sharp star images at many places. For a plano-convex
lens like the one in Fig. 3.14 we can use geometry and Snell's Law
in the small angle approximation to find the focal length from the radius of
curvature $R$ and the index of refraction, much as we did in Section 3.7. It
turns out to be
$$ f = \frac{nR}{n' - n} \tag{3.17} $$
where $n = 1$ refers to the air around the lens, and $n'$ refers to the material
of the lens. We would use $n = 4/3$ if the lens were in water. Try checking
the common sense of Eq. (3.17), using the same ideas that we used to check
Eq. (3.11). Is it dimensionally correct, for example?
Here is an odd thing, which turns out to be generally true: the focal
length of this lens is the same if the light comes in from the left and focuses
on the right, as it is if the light comes through from the right and focuses on
the left, although the ray tracing argument is different in the two cases. It
is a good puzzle to try to do the geometry and find $f$ both ways. It comes
out the same either way, just Eq. (3.17).

The image formed in the focal plane is called a real image, in contrast to
the virtual image that we saw in the last section. When we say "real image,"
we are emphasizing that it would actually appear, visibly, on a screen or on
film placed in the focal plane. The image on a movie screen is a real image.
Similarly, the image that forms on the retina is a real image.

Fig. 3.15: Light from a distant point diverges after going through a plano-concave lens as
if coming from a virtual image a distance $f$ behind the lens.
A plano-concave lens creates a virtual image of a distant star as shown
in Fig. 3.15. The parallel rays from a distant point diverge after passing
through the concave lens, as if they were coming from a point behind the
lens. The distance to this point might again be called $f$, the focal length
of the (concave) lens. The same kind of geometrical argument, using Snell's
Law in small angle approximation, leads once again to the same formula for
$f$, Eq. (3.17)! This is really a surprise. It is as if convex and concave lenses
were somehow the same, mathematically, although they seem to behave so
differently. Of course they are not really the same: one is convex, the other
concave, one forms a real image, the other a virtual image. In particular the
foci are on opposite sides of the lens.

The following sign convention turns out to be a way to put all this together.
We think of the light flowing through the lens. If the focus occurs
downstream from the lens (to the left of the lens in our case, as in Fig.
3.14), corresponding to a real image, then we call $f$ positive. But if the focus
occurs upstream from the lens (to the right of the lens as we have drawn
it in Fig. 3.15), corresponding to a virtual image, then we call $f$ negative.
Similarly, if the spherical surface is convex, like a sphere seen from the outside,
we call the radius of curvature $R$ positive, but if the spherical surface
is concave, like a sphere seen from the inside, we call $R$ negative. Now everything
works! The same formula, Eq. (3.17), describes both plano-convex
and plano-concave lenses, but for the concave lens $R$ is negative, and so the
formula makes $f$ negative, telling us that the lens forms a virtual image,
upstream from the lens, as in Fig. 3.15. The concavity is represented by the
negative sign of $R$.
There is a possibility that we haven't considered in Eq. (3.17). We have
always assumed that $n' - n$ is positive, since we think of $n'$ as referring to
a glass lens, and $n$ referring to air, or perhaps water, and then $n' > n$. But
suppose $n' < n$, as would be the case for a lens-shaped air bubble in glass.
Then if $R > 0$, corresponding to a convex air lens in glass, we would find
$f < 0$ in Eq. (3.17), since $n' - n < 0$, so that we predict a diverging lens and
a virtual image, in spite of the convex shape. Is this what actually happens?
Yes! The sign of $f$ does tell us how the lens behaves, even in cases that we
hadn't explicitly intended. And the (negative) focal length $f$ is still given
correctly by Eq. (3.17)! The sign conventions tell us how to choose the sign
of $R$ and how to interpret the sign of $f$.
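
The sign conventions can be exercised with a few lines of code. This is only a sketch of Eq. (3.17): the function name is made up for illustration, and the glass index 1.5 and the radius of 10 cm are assumed sample values, not numbers from the text.

```python
import math

def plano_lens_focal_length(R, n_lens, n_medium=1.0):
    """Focal length from Eq. (3.17): f = n R / (n' - n).

    R        signed radius of curvature (positive convex, negative concave)
    n_lens   index of refraction n' of the lens material
    n_medium index of refraction n of the surrounding medium
    """
    if n_lens == n_medium:
        return math.inf  # no refraction at the interface, no focus
    return n_medium * R / (n_lens - n_medium)

GLASS = 1.5  # assumed index for illustration

print(plano_lens_focal_length(10.0, GLASS))        # convex glass lens in air: f > 0
print(plano_lens_focal_length(-10.0, GLASS))       # concave glass lens in air: f < 0
print(plano_lens_focal_length(10.0, GLASS, 4/3))   # same convex lens under water: longer f
print(plano_lens_focal_length(10.0, 1.0, GLASS))   # convex air bubble in glass: f < 0
```

With these sample values the convex lens in air has $f = 20$ cm, the same lens under water has a focal length four times longer, and the convex air bubble comes out diverging, as the text predicts.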
3.12 Object and Image
We have seen that if the object is at infinity, then the image is at the focal
length (i.e., in the focal plane). This is really the operational definition of
the focal plane, and also a way to compute $f$ from geometry. The concept is
illustrated in Figs. 3.14 and 3.15.

But what if the object is not at infinity? There is a simple relationship
between the object position $o$, the image position $i$, and the focal length $f$,
called the thin lens equation
$$ \frac{1}{i} + \frac{1}{o} = \frac{1}{f} \tag{3.18} $$
Here $i$ and $o$ are measured from the position of the lens, and there are sign
conventions: $o$ is positive if it is upstream from the lens and negative if it
is downstream. On the other hand $i$ is positive if it is downstream from the
lens and negative if it is upstream. These conventions are chosen so that the
simplest situation, with an object upstream forming an image downstream,
corresponds to both $o$ positive and $i$ positive.

Fig. 3.16: An object of height $h$ at $o$ produces an inverted real image of height $h'$ at $i$. The
focal length of the lens is $f$.

As you might expect, Eq. (3.18) is a consequence of geometry, and not
difficult to prove. We will go through it at the end of this section. Meanwhile,
let us do the more important and interesting job of interpreting the meaning
of Eq. (3.18). First of all, check dimensions: $i$, $o$, and $f$ are all lengths, so
they carry dimension $[L]$. Therefore each term has dimension $[L^{-1}]$, and thus
the equation makes dimensional sense. All terms have the same dimension.
Next, check special cases: if the object is at infinity, like a star, then $1/o = 0$
and Eq. (3.18) is just $1/i = 1/f$, that is, $i = f$, or to put it into words, the
image is at the focal length. That is correct, of course! And if $f$ happens
to be negative, because the lens is concave, then $i$ is negative, that is, it is
upstream from the lens, and must be a virtual image (Fig. 3.15 again).

We can check, in Fig. 3.16, something we only asserted before. When the
object comes in from infinity to some nearer position, the image moves back
from the focal plane, farther away from the lens. That is clear in Fig. 3.16,
and it is also clear in Eq. (3.18). If $f$ is positive, and $1/o$ increases from zero,
then $1/i$ has to decrease in order that $1/o$ and $1/i$ continue to add up to
the same positive value $1/f$. That means $i$ increases: the image moves back.
Now let us see in detail what Fig. 3.16 means, and how it leads to the
thin lens equation, Eq. (3.18). The diagram actually shows how the point $A$
leads to its image $D$ by following just two special rays from $A$ to see where
they intersect. Other rays would also intersect there, but it is not as easy to
describe them. The first special ray is the one that goes through the middle
of the lens, which we show as a straight, undeviated line $AOD$. It is straight
because the normal directions to the two surfaces of the lens are parallel at
the middle (they are both horizontal), so that whatever deviation happens at
the front surface is undone at the back surface. Strictly speaking there should
be a little jog through the lens, but the lens is thin, so we ignore that. Also
the normal isn't quite horizontal if we enter just above the middle, but in the
small angle approximation we don't enter very far above the middle, so we
ignore that little effect too. Thus we have one special ray, straight through
the middle, $AOD$. The second special ray is $AE$. It goes horizontally from
$A$ to the lens. We know what happens on the other side: it goes through the
focal point $F$ at distance $f$ behind the lens (compare Fig. 3.14). So with
very little effort we draw those two rays, and where they intersect at $D$ is
the image of $A$.
Now we dig out the geometrical information in the diagram. From the
similar triangles $OCD \sim OAB$ we see
$$ \frac{i}{h'} = \frac{o}{h} \tag{3.19} $$
This tells us that the image is a kind of magnified version of the object, with
$h'/h = i/o$. Then from the similar triangles $DCF \sim OFE$ we have
$$ \frac{i - f}{h'} = \frac{f}{h} \tag{3.20} $$
The thin lens equation Eq. (3.18) follows from Eqs. (3.19) and (3.20) by
algebra. [A quick way: divide the left side of Eq. (3.20) by the left side of
Eq. (3.19), and the right side of Eq. (3.20) by the right side of Eq. (3.19).
These are equals divided by equals, so they are equal.] Since the thin lens
equation only involves the distance $o$, and not anything else about the object,
all points of the object are imaged in the plane at position $i$, not just the
point $A$ at $D$. In particular, the image of the point $B$ is $C$.
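
The "quick way" mentioned in brackets can be written out step by step. The following is a sketch of that algebra, using nothing beyond Eqs. (3.19) and (3.20):

```latex
\begin{align*}
\frac{(i-f)/h'}{i/h'} &= \frac{f/h}{o/h}
  && \text{Eq. (3.20) divided by Eq. (3.19), side by side} \\
1 - \frac{f}{i} &= \frac{f}{o}
  && \text{the heights } h \text{ and } h' \text{ cancel} \\
\frac{1}{f} - \frac{1}{i} &= \frac{1}{o}
  && \text{divide through by } f \\
\frac{1}{i} + \frac{1}{o} &= \frac{1}{f}
  && \text{which is Eq. (3.18)}
\end{align*}
```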
Eq. (3.19), which is easy to see in Fig. 3.16, is used, with a sign convention,
to express the magnification by a single lens,
$$ \text{Single Lens Magnification} = -\frac{i}{o} \tag{3.21} $$
The minus sign is the sign convention. With this convention, the magnification
is negative when the image is inverted and positive when it is erect.
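
To see Eqs. (3.18) and (3.21) at work, here is a minimal sketch that solves the thin lens equation for the image position and evaluates the single lens magnification. The function names and the focal length of 10 (in arbitrary length units) are illustrative assumptions.

```python
import math

def image_distance(o, f):
    """Solve the thin lens equation 1/i + 1/o = 1/f for i.

    o may be math.inf for an object at infinity. The result is math.inf
    when the image itself forms at infinity (object in the focal plane).
    """
    inv_o = 0.0 if math.isinf(o) else 1.0 / o
    inv_i = 1.0 / f - inv_o
    return math.inf if inv_i == 0 else 1.0 / inv_i

def single_lens_magnification(o, f):
    """Eq. (3.21): magnification = -i/o (negative means inverted)."""
    return -image_distance(o, f) / o

f = 10.0  # assumed focal length of a convex lens
for o in (math.inf, 100.0, 30.0, 20.0, 15.0):
    i = image_distance(o, f)
    print(f"o = {o:6}  i = {i:7.3f}  magnification = {single_lens_magnification(o, f):+.3f}")
```

As the object comes in from infinity, the printed image distance grows from $f$ outward, and the magnification stays negative, consistent with an inverted real image.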
If you hold up a convex lens at a distance from your eye and look through
it, you see an inverted version of the scene in front of you. This is because
you are actually looking at the inverted real image at the position $i$, in front
of the lens. The rays don't stop at $i$, of course, they continue on, and what
gets to your eye comes from the real image as if there were really something
there. If you move closer to it, eventually you get too close to focus properly,
and it blurs. This inverted real image is also what slide and movie projectors
produce at the screen. In this case the magnification is enormous. How is
that achieved? And why don't you see the image upside-down?

We haven't considered the other signs possible for $i$, $o$, and $f$, but the
thin lens equation holds for all possibilities. We will run into them in the
next section on systems of lenses.
3.13 Optical Systems
At the end of the last section we described what you see when you look
through a convex lens held up at a distance from your eye. You see an
inverted real image in front of the lens. That means that the image on your
retina is really the image of an image. The convex lens forms a real image
somewhere in space, and then that image becomes the object that you look
at. That is how optical systems of lenses work: each lens forms an image
that then becomes the object for the next lens as the light flows through the
system. It is like input-output systems strung together, with the output of
one being the input for the next. In this section we will consider a number
of optical systems. Remember: the eye is part of the system!
3.13.1 The Magnifying Glass
When you use a magnifying glass, you hold a convex lens close to the thing
you are magnifying, closer than the focal length $f$, in fact. The result is a
virtual image behind the lens, instead of a real image in front of it. This
follows immediately from the thin lens equation Eq. (3.18) if we solve for $i$:
$$ i = \frac{of}{o - f} \tag{3.22} $$
If $0 < o < f$, then $i$ is negative, i.e., the image is virtual. The situation is
illustrated in Fig. 3.17. Again we can use two special rays to locate the image
of the point $A$, the ray $AE$ (which, extended, goes through the focal point $F$)
and the central ray $AO$. After passing through the lens the ray $AE$ becomes
horizontal, by definition of the focal point, and extending these special rays
backwards, they seem to come from the point $C$, which is therefore the virtual
image of $A$.

Fig. 3.17: An object of size $h$ at $o$, closer to a convex lens than the focal length $f$, forms a
magnified virtual image of size $h'$ at $i$.

When we look through a magnifying glass, we see the virtual image, with
its magnified height $h'$, instead of the real object, with its height $h$. It is not
clear in Fig. 3.17, however, that the virtual image will really look bigger,
because it is also farther away. What we see, after all, is angular size. So
Fig. 3.17, although it is suggestive, does not provide a completely satisfactory
explanation for how a magnifying glass works. We should really think about
the angular size of what we see, and for that we must introduce the distance
$d$ from our eye to the magnifying glass. Then without the magnifying glass
we see an object of height $h$ at distance $d + o$ with angular size
$$ \theta = \frac{h}{d + o} \tag{3.23} $$
and with the magnifying glass we see the virtual image of height $h'$ at distance
$d + |i|$ with angular size
$$ \theta_M = \frac{h'}{d + |i|} \tag{3.24} $$
We must be careful to use $|i|$, the magnitude of $i$, instead of $i$ itself, because
$i$ itself is negative, but it is the magnitude $|i|$ that says how far it is from the
lens. Of course, since we know $i$ is negative, we can just change the sign to
get $|i| = of/(f - o)$, from Eq. (3.22). The ratio of the angular size with the
lens to the angular size without the lens is the angular magnification:
$$ \frac{\theta_M}{\theta} = \frac{h'(d + o)}{h(d + |i|)} = \frac{\dfrac{1}{d} + \dfrac{1}{o}}{\dfrac{1}{d} + \dfrac{1}{o} - \dfrac{1}{f}} \tag{3.25} $$
The rather complicated expression on the far right of Eq. (3.25) follows from
some algebra, using Eq. (3.22) to eliminate $i$ and the magnification relation
between $h$ and $h'$, namely $h'/h = |i|/o$ from Fig. 3.17.

As always, when we encounter a complicated result, arrived at by a long
argument that might have introduced some mistakes, or might have failed
to include an essential feature, we try to check the relationship for common
sense. First dimensions: the ratio on the left hand side is of course dimensionless,
so the right hand side should be as well. But each term in the
numerator of the final expression is $[L^{-1}]$, and each term in the denominator
is as well, so the ratio is dimensionless.

Next, we notice that if $f > 0$, as it is for a convex lens, then the denominator
is less than the numerator, because it is the same except with $1/f$
subtracted, and therefore the angular magnification is a number greater than
1, corresponding to actual magnification. This is, in a sense, what we wanted
to be sure of. We also notice that if $f < 0$, as it is for a concave lens, then
subtracting $1/f$ really means adding something: the denominator is greater
than the numerator, and the magnification is less than 1. The concave lens
makes things look smaller. This too is a familiar fact of experience. If $f$ is
infinite, as it is for a flat piece of glass, the object looks just the same whether
the glass is there or not, because the $1/f$ in the denominator is zero, and the
angular magnification is 1.
For a typical hand-held magnifying glass, with $f \approx 20$ cm, say, we might
have $o = f/2$ and $d = f$, leading to an angular magnification of 3/2. This
seems about right. By moving $o$ towards $f$, i.e., moving the object towards
the focal plane 20 cm behind the lens, we could boost the angular magnification
to 2, but that's it.
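
These numbers are easy to reproduce. The sketch below evaluates Eq. (3.25) directly and also rebuilds the same ratio from the intermediate quantities $|i|$, $h'$, $\theta$ and $\theta_M$ of Eqs. (3.22) to (3.24), as a consistency check on the algebra. The function names are hypothetical; the sample values $f = 20$ cm, $o = f/2$, $d = f$ are the ones used in the text.

```python
def angular_magnification(o, f, d):
    """Right-hand side of Eq. (3.25)."""
    return (1/d + 1/o) / (1/d + 1/o - 1/f)

def angular_magnification_from_geometry(o, f, d, h=1.0):
    """Same ratio built from Eqs. (3.22)-(3.24); valid for 0 < o < f."""
    i_abs = o * f / (f - o)          # |i| from Eq. (3.22)
    h_image = h * i_abs / o          # h'/h = |i|/o
    theta = h / (d + o)              # Eq. (3.23), without the lens
    theta_m = h_image / (d + i_abs)  # Eq. (3.24), with the lens
    return theta_m / theta

f = 20.0  # cm
print(angular_magnification(f / 2, f, f))                # o = f/2, d = f
print(angular_magnification_from_geometry(f / 2, f, f))  # same thing, the long way
print(angular_magnification(0.999 * f, f, f))            # o close to f
print(angular_magnification(f / 2, -f, f))               # concave lens, less than 1
```

The first two lines agree at 3/2, the third approaches 2, and the concave lens gives a value below 1, matching the common-sense checks above.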
As $o \to f$, the virtual image goes to infinity, where it is comfortable
for the eye to look at, and still a nice angular size, because it grows larger
as it moves away. In this case, $1/o$ and $1/f$ cancel in the denominator of
Eq. (3.25), and the angular magnification simplifies to
$$ \frac{\theta_M}{\theta} = 1 + \frac{d}{f} \qquad (\text{if } o = f) \tag{3.26} $$
Now suppose we bring our eye right up to the magnifying glass, i.e., let
$d \to 0$. Then the angular magnification would be 1. "What good is that?" you
may ask. The virtual image at infinity, with the lens, has the same angular
size as the object a distance $f$ from our eyes would have without the lens,
if we could see it! The point is, though, for short enough focal length $f$, we
couldn't see an object there. It would be too close to our eyes for our internal
lens to accommodate and bring to a focus on our retinas. But we can see
the virtual image at infinity produced by the lens, with a relaxed eye, and it
is just like being a distance $f$ away from the object, which is to say, really
close. In effect we are using the magnifying glass to supplement our internal
lens, which is not what we do when we use a magnifying glass casually. This
is what the eyepiece of a microscope does: our eye is right at the lens, the
lens has a very short focal length, and we are looking at something in the
focal plane, much too close to see ordinarily, but comfortable to see with the
lens.
Most textbooks use the term angular magnification to describe this special
application of the magnifying glass, and they compare the angular size you
could achieve by viewing an object at distance $f$ (with the lens, of course)
with the angular size you would have to settle for at distance $d_{\min}$, the
minimum distance at which you can focus on things, conventionally estimated
at 25 cm, but varying from person to person. This comparison says you could
make things look bigger by the factor $d_{\min}/f$ with the lens, by reducing the
distance to the object, but of course you don't just introduce the lens, you
also physically move the object, and that is where the angular magnification
comes from.

Microscope eyepieces are described by their magnification in this second
sense. A $10\times$ eyepiece magnifies things 10 times, meaning it has a focal
length $f = 2.5$ cm. At the distance 2.5 cm, things look ten times bigger
than they do at the distance 25 cm, where you would otherwise have to view
them.
This is a good place to point out the perils of learning results without
thinking about where they come from. Suppose you conscientiously learn
that a magnifying glass produces an angular magnification $d_{\min}/f$, where
$d_{\min} = 25$ cm is the near point of the eye and $f$ is the focal length of the
lens. Then someone hands you a convex lens with $f = 50$ cm. You go to
the formula, and find an angular magnification of 25/50 = 1/2, that is, the
lens should make things look smaller. But when you try it, you find that
this convex lens makes things look bigger, just like every other magnifying
glass! Where did you go wrong?
3.13.2 The Microscope
We have already said, in effect, how a microscope works: just put together
what we already know about how a convex lens forms images, both real and
virtual. A microscope consists of two lenses, in principle. The first lens,
the objective, forms a highly magnified, inverted real image. Since the
magnification is $-i_1/o_1$, we want $i_1$ to be much larger than $o_1$. Therefore, by
the thin lens equation Eq. (3.18), $o_1$ must be just slightly larger than $f_1$, the
focal length of the objective lens. Then the image forms at $i_1$, a position downstream
from the objective by a large multiple of $f_1$, and to keep the microscope a
reasonable size, this means that $f_1$ must be small, i.e., the radius of curvature
of the objective must be small. Then, as we have just described above, the
second lens, the eyepiece, functions as a magnifying glass to inspect this
inverted real image by forming a virtual image of it at infinity that the eye
can see. The magnification of the result, over what you could see without
the microscope, is the product of the magnifications of the two parts of the
system separately, $(-i_1/o_1)(d_{\min}/f_2)$. The distance between the two lenses
is $i_1 + f_2$ as we have described it, but $f_2$ is small, to get good magnification
from the eyepiece, so the distance between the lenses is essentially just $i_1$,
and that is also the length $L$ of the microscope barrel that holds the lenses
in place at either end. Also, as we have already noted, $o_1$ is essentially $f_1$,
the focal length of the objective. Therefore the angular magnification of a
microscope is essentially
$$ \text{Angular Magnification} = -\frac{L\,d_{\min}}{f_1 f_2} \tag{3.27} $$
Let us interpret this result: what does it really mean? First, the minus sign
reminds us that the image is inverted. We check dimensions: the ratio of
lengths on the right is dimensionless: good. We get better magnification by
choosing smaller focal lengths, i.e., more highly curved lenses. We get better
magnification by making the microscope long. Why is that? Because the
farther away the real image gets from the objective, the bigger it gets. (Of
course that also makes the instrument bigger and clumsier: maybe better
to keep it compact and work on making good lenses.) And finally, oddly,
we seem to get better angular magnification if $d_{\min}$ is larger! What is that
about? Well, that part is just common sense: microscopes are even more
helpful to people who can't hold things close to see them!
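
To get a feeling for how good the approximation in Eq. (3.27) is, one can compare it with the unapproximated product $(-i_1/o_1)(d_{\min}/f_2)$. The sketch below does this for made-up sample values (an objective with $f_1 = 0.4$ cm, an object at $o_1 = 0.41$ cm, and a $10\times$ eyepiece with $f_2 = 2.5$ cm); none of these numbers come from the text, and the function names are hypothetical.

```python
def microscope_magnification_product(o1, f1, f2, d_min=25.0):
    """Product of the two stages: (-i1/o1) * (d_min/f2)."""
    i1 = o1 * f1 / (o1 - f1)  # thin lens equation solved for i, Eq. (3.22)
    return -(i1 / o1) * (d_min / f2)

def microscope_magnification_approx(L, f1, f2, d_min=25.0):
    """Eq. (3.27): -L d_min / (f1 f2), with L the barrel length."""
    return -L * d_min / (f1 * f2)

f1, f2, o1 = 0.4, 2.5, 0.41   # cm, assumed sample values
i1 = o1 * f1 / (o1 - f1)      # where the objective forms its real image
print("i1 =", round(i1, 2), "cm")
print("product of stages:", round(microscope_magnification_product(o1, f1, f2), 1))
print("Eq. (3.27), L = i1:", round(microscope_magnification_approx(i1, f1, f2), 1))
```

For these values the two results differ by a few percent, which is the price of replacing $o_1$ by $f_1$ and the lens separation by $i_1$.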
In good microscopes the objective is still a single lens, as we have described
it, but the eyepiece is often a little optical system in itself. The
reason is that the designers are correcting for chromatic aberration. Without
this, objects seen in the microscope seem to have a colored halo. It all
goes back to the index of refraction $n'$ of the glass. Unfortunately the index
of refraction depends slightly on the color of the light, so that according to
formulae like Eq. (3.17), the focal length is not the same for all colors. If you
get a nice sharp image for one color, nearby colors are blurry, and you see
them all together. In the next section we will see how a little system could
be achromatic, i.e., not subject to this problem, even while it is made from
glass that does have this problem.
3.13.3 Two lenses together
A general thin convex lens that is curved on both sides can be made by gluing
together two plano-convex lenses, which are curved on only one side, along
their flat sides. This means a general lens might be thought of as a system
of two lenses (we might think of them as half-lenses), an amusing example
of a two lens system, because both lenses are in the same place. We can find
the focal length $f$ of the resulting system from the focal lengths $f_1$ and $f_2$ of
the two plano-convex lenses separately by computing the image of an object
at infinity. We do this one lens at a time. Since the object is at infinity,
$1/o = 0$. Thus the first lens forms an image at $i_1 = f_1$. This image is the
object for the next lens. We therefore have $o_2 = -i_1 = -f_1$. The minus sign
is because $i$ and $o$ have opposite sign conventions: where $i$ is positive, $o$ is
negative. Then from the thin lens equation Eq. (3.18) for the second lens,
$$ \frac{1}{i_2} = \frac{1}{f_2} - \frac{1}{o_2} \tag{3.28} $$
but $i_2$ is just $f$, the focal length of the system, because it is where the image
of the distant object is. Using $o_2 = -f_1$ we find
$$ \frac{1}{f} = \frac{1}{f_1} + \frac{1}{f_2} \tag{3.29} $$
a nice result on how lenses combine. (Caution: it was essential that they
were at the same place! More general configurations of two lenses are not
this simple.)
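
The one-lens-at-a-time argument translates directly into code. The following sketch follows the image of a point at infinity through two thin lenses at the same place and compares the result with Eq. (3.29); the function names and the sample focal lengths of 20 and 30 are illustrative assumptions.

```python
import math

def image_distance(o, f):
    """Thin lens equation, Eq. (3.18), solved for i (o may be math.inf)."""
    inv_o = 0.0 if math.isinf(o) else 1.0 / o
    inv_i = 1.0 / f - inv_o
    return math.inf if inv_i == 0 else 1.0 / inv_i

def combined_focal_length(f1, f2):
    """Focal length of two thin lenses in contact, found by ray following."""
    i1 = image_distance(math.inf, f1)  # first lens images the distant point at f1
    o2 = -i1                           # that image is a (downstream) object for lens 2
    return image_distance(o2, f2)      # i2 is the focal length of the pair

f1, f2 = 20.0, 30.0
print(combined_focal_length(f1, f2))   # step-by-step result
print(1.0 / (1.0 / f1 + 1.0 / f2))     # Eq. (3.29) for comparison
```

A convex and a concave half-lens of equal strength (for example 20 and -20) give an infinite focal length in this sketch, as one would expect for a flat slab.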
If we put together two plano-convex lenses of index of refraction $n'$ and
radii of curvature $R_1$ and $R_2$ (not necessarily the same) to make a lens with
these radii of curvature characterizing the two faces, then by Eq. (3.17) and
Eq. (3.29), the focal length of the complete lens (in a surrounding medium
with index of refraction $n$) is given by
$$ \frac{1}{f} = \frac{n' - n}{n}\left(\frac{1}{R_1} + \frac{1}{R_2}\right) \tag{3.30} $$
This is called the lensmaker's equation, because it tells you how to make a
lens of a desired focal length. As before, the sign convention is that $R$ is
positive for a convex face and negative for a concave face. It works for both.
That is, we could also use half-lenses that are plano-concave, with negative
$R$. (Caution: some books use opposite sign conventions for the two faces, so
that $R$ is positive for a convex face on one side, but positive for a concave
face on the other. This seems to me unnecessary, and extremely confusing!)
Can you check the common sense of the lensmaker's equation? What are
some of the meanings hidden in this relationship?
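
One way to start on these checks is numerically. The sketch below evaluates Eq. (3.30) with the sign convention of the text and tries a few cases; a flat face is represented by an infinite radius. The function name, the glass index 1.5 and the 10 cm radii are assumed values chosen for illustration.

```python
import math

def lensmaker_focal_length(R1, R2, n_lens, n_medium=1.0):
    """Eq. (3.30): 1/f = ((n' - n)/n) (1/R1 + 1/R2).

    R1, R2 are positive for convex faces, negative for concave faces,
    and math.inf for a flat face.
    """
    inv_f = (n_lens - n_medium) / n_medium * (1.0 / R1 + 1.0 / R2)
    return math.inf if inv_f == 0 else 1.0 / inv_f

GLASS = 1.5  # assumed index for illustration

print(lensmaker_focal_length(10.0, math.inf, GLASS))   # plano-convex: recovers Eq. (3.17)
print(lensmaker_focal_length(10.0, 10.0, GLASS))       # symmetric double convex: half as long
print(lensmaker_focal_length(-10.0, -10.0, GLASS))     # double concave: negative f
print(lensmaker_focal_length(10.0, -10.0, GLASS))      # equal and opposite curvatures: no focus
print(lensmaker_focal_length(10.0, 10.0, GLASS, 4/3))  # same double convex lens under water
```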
Finally, nothing says we have to use the same glass for the two half-lenses.
If the glasses are different, the lensmaker's equation becomes
$$ \frac{1}{f} = \frac{n_1 - n}{nR_1} + \frac{n_2 - n}{nR_2} \tag{3.31} $$
The indices of refraction $n_1$ and $n_2$ may depend on color, but if $n_1$ increases
when $n_2$ decreases, the effect might cancel out in the sum, so that $f$ doesn't
change. We just have to find special glass with the right chromatic property.
The result is an achromatic lens system. (Like every good idea, there must
have been a fortune in this for somebody, probably many fortunes.)
3.13.4 The Astronomical Telescope
A simple configuration of two convex lenses, with focal lengths $f_1$ and $f_2$,
makes a telescope for looking at distant things, hence the name astronomical
telescope. We follow the light through the system, denoting the first lens (the
objective lens of the telescope) by the subscript 1. Since $o_1 = \infty$, we have
$1/o_1 = 0$, so that $i_1 = f_1$. Therefore the image of the first (objective) lens is
a distance $f_1$ downstream from the first lens. Now we want $i_2 = \infty$, because
the image at $i_2$ is the object for the eye to look at. Thus, by the thin lens
equation for the second lens, $o_2 = f_2$. That is, the object for the second lens
must be a distance $f_2$ upstream from the second lens. That object, recall, is
just the image formed by the first lens, $f_1$ downstream from the first lens. So
the length of the telescope is $f_1 + f_2$, and the two lenses must be mounted as
shown in Fig. 3.18. One should imagine an eye (not shown) looking through
the telescope, of course. The angular magnification can be read off the figure.
Fig. 3.18: This telescope consists of two lenses separated by $f_1 + f_2$. The angular
magnification is, in magnitude, $\theta_2/\theta_1 = f_1/f_2$, or $-f_1/f_2$ if we use the sign
convention that an inverted image gets a minus sign. The lines AB and BC through the
middle of the lenses to the focal plane show how the angles are related. The red lines
represent actual light rays going through the telescope from a distant object, like a star.
Since
$$BF = f_1\theta_1 = f_2\theta_2 \qquad (3.32)$$
in small angle approximation, the angular magnification is just $-f_1/f_2$, the
minus sign indicating that the image is inverted. If you switch the roles of
the lenses, by looking through the telescope the wrong way, it makes things
look smaller instead of bigger. To make a telescope with high magnification,
you should have a long focal length objective and a short focal length eyepiece.
For reasons we will explore in more detail later, the best telescope
for many astronomical purposes is not necessarily the one with the highest
magnification. In particular, high magnification spreads the image out and
makes it dimmer. Thus it might appear bigger, but also harder to see.
3.13.5 Galilean Telescope
When Galileo made his telescopes, beginning in the summer of 1609, he
had virtually no idea how lenses worked. His description of his trial and
error method suggests that he failed to find what we call the astronomical
telescope, and instead found a configuration that we now call the Galilean
telescope. Its main virtue is that the image is right side up. This was no
doubt an important consideration when he tried to convince others that the
image seen through it was indeed a faithful representation of reality.
Fig. 3.19: Galilean telescope: the caption to Fig. 3.18 applies verbatim, with the
understanding that $f_2 < 0$.
From the mathematical point of view, the astronomical telescope and the
Galilean telescope are the same, except that the Galilean telescope uses a
diverging lens for the eyepiece, so that $f_2 < 0$. As before the length of the
telescope is $f_1 + f_2$, but since $f_2 < 0$, this length is less than $f_1$. The angular
magnification is as before $-f_1/f_2$, but this is actually positive now, indicating
an image right side up.
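The two telescope designs can be compared with a few lines of Python. This is
a minimal sketch, assuming the length $f_1 + f_2$ and the magnification
$-f_1/f_2$ stated above; the function name and the focal lengths (in cm) are
hypothetical values chosen for illustration.

```python
def telescope(f_objective, f_eyepiece):
    """Return (length, angular magnification) of a two-lens telescope.

    Length is f1 + f2 and magnification is -f1/f2; a negative
    magnification means an inverted image.
    """
    length = f_objective + f_eyepiece
    magnification = -f_objective / f_eyepiece
    return length, magnification

astronomical = telescope(100.0, 2.0)   # converging eyepiece
backwards = telescope(2.0, 100.0)      # same telescope, looked through the wrong way
galilean = telescope(100.0, -5.0)      # diverging eyepiece, f2 < 0

print(astronomical, backwards, galilean)
```

With these numbers the astronomical telescope magnifies 50 times with an
inverted image, the reversed one shrinks the image, and the Galilean
telescope is shorter than its objective focal length and gives an upright
image.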
3.14 Mirrors
Reflection in a mirror can also be described very simply with geometrical
optics. The simple law of reflection, known in Hellenistic times, is that the
incident ray and the reflected ray make equal angles with the normal at a
reflecting surface,
$$\theta_i = \theta_r \qquad (3.33)$$
illustrated in Fig 3.20. This law, together with the concepts of real image,
virtual image, etc. that we have already encountered, amounts to a theory
of mirrors.

Fig. 3.20: The law of reflection: $\theta_i = \theta_r$

Fig 3.21 shows that light originating at a point A and reflecting from a
mirror seems to come from a point B, located symmetrically opposite A on
the other side of the mirror. When we look into a mirror, we are therefore
seeing a virtual image.

Fig. 3.21: Incident and reflected rays make equal angles with the normal to the mirror CD.
As a result, reflected light from A seems to come from B.

Spherical mirrors (reflecting spherical surfaces) obey relationships much
like those for lenses. They are characterized by a focal length, for example.
The focal length of a spherical mirror is the distance $f$ in front of the mirror
at which a distant object forms an image. The concept is illustrated in Fig
3.22 for a concave mirror, showing the real image in front of the mirror.
A convex spherical mirror, on the other hand, produces a virtual image
of a distant point, and the location of that virtual image, a distance $R/2$
behind the mirror, defines the focal length in the convex case. By analogy
with lenses, we introduce a sign convention and call the focal length negative
in this case. The formula
$$f = R/2 \qquad (3.34)$$
holds for all mirrors if we consider the radius of curvature $R$ negative for
convex mirrors and positive for concave mirrors. $R$ is infinite for flat mirrors.
This simple, purely geometrical relationship replaces Eq (3.30), the
lensmaker's equation, for lenses.
Exactly the same kinds of geometrical arguments that we used for lenses
lead to the law of image formation for spherical mirrors, and it is the same
as for lenses! We recall the relationship here:
$$\frac{1}{i} + \frac{1}{o} = \frac{1}{f} \qquad (3.35)$$
where $f$ is understood to be given by Eq (3.34). The sign convention is that
both $i$ and $o$ are positive in front of the mirror, corresponding to the simplest
case, illustrated in Fig 3.23.

Fig. 3.22: The focal length of a concave mirror is $R/2$, where $R$ is the radius of curvature.
Here O is the center of the spherical surface AB of radius $R$. The red line DAF is a
typical light ray, coming in horizontally and going through the focal point F.

Fig. 3.23: A concave mirror of radius $R$ forms a real image of height $h'$ at $i$ of an object
of height $h$ at $o$. Here O is the center of curvature for the mirror, and $f = R/2$.

The two special rays that are used to find the image in Fig 3.23 are the
ray through O, which reflects straight back, and the horizontal ray, which
reflects through the focal point F. The image is where they intersect. Other
rays leaving the object would also intersect at the image point, but they are
more difficult to describe. From similar triangles in the figure one can deduce
Eq (3.35) in just the same way that we deduced Eq (3.18) for thin lenses.
Just as for a single lens, the magnification of the image is $h'/h = i/o$ in
magnitude, or
$$\text{Mirror magnification} = -\frac{i}{o} \qquad (3.36)$$
using the sign convention that makes the magnification negative for an inverted
image. Unlike the case for lenses, this relationship is not obvious in
the figure. It follows from similar triangles and some algebra.
A concave mirror, having f > 0, forms a real inverted image in front of
the mirror, since i > 0 for o > f, by Eq (3.35). A convex mirror, having
f < 0, forms a virtual upright image behind the mirror, since i < 0 in this
case, again by Eq (3.35). You can see both of these images by looking into
the two sides of a spoon.
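The spoon experiment can be imitated numerically. The sketch below is an
illustration only, combining Eqs (3.34), (3.35) and (3.36); the function
name and the sample distances (in cm) are assumptions made for the example.

```python
import math

def mirror_image(o, R):
    """Image distance and magnification for a spherical mirror.

    R > 0 for a concave mirror, R < 0 for a convex mirror, and
    R = math.inf for a flat mirror. Assumes o != f, so that the
    image is at a finite distance.
    """
    f = R / 2.0                    # Eq (3.34)
    i = 1.0 / (1.0 / f - 1.0 / o)  # Eq (3.35) solved for i
    m = -i / o                     # Eq (3.36)
    return i, m

concave = mirror_image(30.0, 20.0)    # object outside the focal length
convex = mirror_image(30.0, -20.0)
flat = mirror_image(30.0, math.inf)

print(concave, convex, flat)
```

For these values the concave mirror gives i > 0 with negative magnification
(real, inverted), the convex mirror gives i < 0 with positive magnification
(virtual, upright), and the flat mirror puts a same-size virtual image as far
behind the mirror as the object is in front of it.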
A Newtonian reflecting telescope uses a concave mirror in place of an
objective lens to form a real image, which is then inspected with an eyepiece.
The main advantage is that with a mirror in place of a lens, there is no
chromatic aberration, because the light does not go through an objective lens
with its complicated index of refraction. Also, it may be easier to fabricate a
good mirror surface than to make glass that is flawless not just on its surfaces
but also in its interior.
3.15 Spherical Aberrations
The theory that light is described by rays that are straight lines in Euclidean
geometry is at least a candidate for an exact theory of light, but we have
been treating it approximately, using the small angle approximation. Now we
look, briefly, at how the exact theory differs from our approximate treatment
in the case of a spherical mirror.

Fig. 3.24: The image of a distant point formed by a spherical mirror. The cusp shape
AFB is called a light caustic.

In Fig 3.24 we draw rays reflecting from a spherical mirror, like the typical
ray in Fig 3.22, but more of them. The focal point of the mirror is shown
with a black dot labeled F. The mirror is taken large enough that for some
rays the angle $\theta$ (referring to Fig 3.22) is not small. We notice that contrary
to the small angle approximation, the rays do not accurately intersect at F.
The ones with small $\theta$, close to the symmetry axis, do intersect there, but
the rays farther from the axis, with large $\theta$, are noticeably off. The way the
rays intersect suggests a cusp-like arrowhead figure, with the focal point F
at the tip. You may even have noticed bright reflections with this shape,
in the bottom of a coffee cup, for example, if its shiny interior surface acts
like a round mirror. These shapes are called light caustics, and arise as
imperfect focal points, like the one here. It looks as if geometrical optics
could also describe light caustics, somehow, when the optics are a little bit
off. We do not pursue this very interesting idea here, but it has been the
starting point for some fascinating applied mathematics.
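How far off are the outer rays? A short Python sketch can put numbers on it.
The geometry used here is an assumption spelled out for the example, not a
formula from the text: a ray parallel to the axis at height h strikes a
concave mirror of radius R at a point P, the angle of incidence $\theta$
satisfies sin $\theta$ = h/R, and the triangle formed by the center of
curvature O, the point P, and the point where the reflected ray crosses the
axis is isosceles with base angles $\theta$. The function name is
hypothetical.

```python
import math

def axis_crossing(h, R):
    """Distance from the mirror to the point where a reflected ray crosses the axis.

    The incoming ray is parallel to the axis at height h (with h well below R).
    The isosceles triangle described above puts the crossing point a distance
    R / (2 cos(theta)) from the center of curvature O, which lies a distance R
    in front of the mirror.
    """
    theta = math.asin(h / R)
    return R - R / (2.0 * math.cos(theta))

R = 1.0
for h in (0.01, 0.1, 0.3, 0.5):
    print(h, axis_crossing(h, R))  # approaches R/2 = 0.5 only for small h
```

Rays near the axis cross very close to R/2, in agreement with Eq (3.34),
while rays far from the axis cross closer to the mirror, which is the
spreading seen in Fig 3.24.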
The problem we have noticed here, called spherical aberration, is a serious
issue for instrument makers. The unfortunate truth is that spherical optical
components are not quite the right shape. Spherical lenses suffer from the
same problem as spherical mirrors. If they are small enough, the discrepancy
may not matter, but if they are so large that the small angle approximation
is no longer very good, then they should be made in a better shape, not a
spherical shape. For a mirror, that better shape turns out to be a parabola,
and one sometimes hears about parabolic mirrors. Shapes different from
spheres are invariably more expensive to manufacture, so high quality optics
is not cheap. One motivation for using large optical elements, even though
it requires more careful attention to shape, is that large elements allow more
light through, just as our eyes admit more light when our pupils open wider.
More light means brighter images.
An ingenious idea for making a parabolic mirror cheaply, for use as a
telescope, is to rotate a circular pool of mercury. The shiny liquid naturally
assumes a parabolic profile when it rotates, and you can even control the focal
length (which is just half the radius of curvature at the center) by controlling
the rotation speed. Unfortunately, since the mirror must be horizontal, such
a telescope can only point vertically. That is probably its main disadvantage,
although keeping the surface smooth and free of vibrations might also be a
technical challenge. In principle such a telescope could be looking straight
up, waiting for interesting objects to pass into its narrow field of view, or it
could look at a steerable plane mirror that directs light from other directions
straight down.
3.16 Reflection and Refraction
Up to now we have treated refraction at an interface and reflection at a mirror
as if they were two different things. In fact, though, when light refracts at
an interface, only some of it is refracted; the rest is reflected. That is, every
such interface is also a kind of mirror. You know of course that you can see
your reflection in transparent glass.
Interfaces at which both reflection and refraction occur are often called
dielectric interfaces, to distinguish them from metal surfaces. To emphasize
their reflective property, we could even call such an interface a dielectric
mirror. There is nothing new to say about the geometrical optics of a
dielectric mirror, though; it is just like any other mirror.
There is still an important question to consider, however: what fraction of
the light is reflected and what fraction is transmitted at a dielectric interface?
This question presupposes a way of measuring how much light we have,
a measure of brightness. The best measure of brightness makes use of the
notion of energy, and what we really mean by the question is, what fraction
of the energy is reflected and what fraction is transmitted? We can't make
this question more precise yet, but we will describe the answer anyway.
Reflection and refraction occur because of a difference in the index of
refraction, a mismatch. For normal incidence, if the relative index of
refraction at the interface is $n$, the reflected fraction, or reflection coefficient,
is
$$R = \left(\frac{n - 1}{n + 1}\right)^2 \qquad (3.37)$$
Note that if the two media had the same index of refraction, the relative
index of refraction (their quotient) would be 1, and the reflection coefficient
would be $R = 0$, i.e., no reflection. This is sometimes used as a quick way
to measure the index of refraction of an unknown glass or mineral. Just
immerse it in a series of clear oils with different indices of refraction: if you
can't see it in the oil, then there must be an index match! Seeing it, after
all, means we see the light reflected from the interface, but in a matched oil,
there is no reflected light.
Let us do a numerical example. If window glass has index $n = 1.5$ and
air has index 1, then the relative index of refraction is $n = 1.5$, and $R =
1/25 = 4/100$. In other words, only 4% of the incident energy is reflected
at the air-glass interface. If no energy is absorbed by the glass, then 96%
is transmitted through the interface. Of course there is a glass-air interface
at the back of the window. You should verify that $R$ is the same whether
we go from air into glass or from glass into air. That means once again 4%
of the energy is reflected at this second interface. The energy incident there
is not all the energy initially incident, only 96% of it, but that is almost all
of it. Therefore roughly 8% of the originally incident energy is reflected, and
92% transmitted. Specialty glass can have an index of refraction as high as
$n = 2$. For such glass $R = 1/9$, meaning over 10% of the incident energy
is reflected at each interface. That may not seem like much more than the
usual 4%, but it is close to three times higher reflectivity. Architects like
such glass: it makes their buildings more opaque, seen from outside, and not
like fishbowls. The higher reflectivity is very noticeable when you come up
to a glass door of such glass.
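The arithmetic of this example is easy to reproduce. The following Python
sketch evaluates Eq (3.37) for the cases just discussed; the function and
variable names are assumptions made for illustration, and the window
estimate ignores absorption and repeated reflections between the two faces,
as the text does.

```python
def reflection_coefficient(n_rel):
    """Eq (3.37): reflected energy fraction at normal incidence."""
    return ((n_rel - 1.0) / (n_rel + 1.0)) ** 2

R_in = reflection_coefficient(1.5)         # air into glass
R_out = reflection_coefficient(1.0 / 1.5)  # glass into air, same value
# Fraction transmitted through both faces of the window.
T_window = (1.0 - R_in) * (1.0 - R_out)
R_dense = reflection_coefficient(2.0)      # specialty glass, n = 2

print(R_in, R_out, T_window, R_dense)
```

The transmitted fraction comes out a little above 92%, consistent with the
rough estimate above, and replacing n by 1/n leaves R unchanged.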
The formula above, Eq (3.37), is for the reflection coefficient at normal
incidence. For light incident at any other angle, $R$ is larger than this. We
have met a case where $R$ actually becomes 1, that is, all the light is reflected!
This is for the case of light incident on a dielectric interface at an angle
$\theta \ge \theta_c$, where $\theta_c$ is the critical angle. Recall from the discussion around
Fig 3.13 that the critical angle is the incident angle at which the refracted
ray just barely gets into the second medium, by skimming along the surface.
For an incident angle even larger, it can't get into the other medium; it is
all reflected. This phenomenon is called total internal reflection. A clever
application of total internal reflection is to make excellent mirrors with just
glass, and no silver or other metal to coat them. Consider the simple prism
in Fig 3.25. The critical angle at the air-glass interface, if $n = 1.5$ for glass, is
$\theta_c = \sin^{-1}(2/3) = 0.7297$ radians $= 41.8^\circ$. Light incident normally at the left
face is totally reflected because its angle of incidence on the long face is $45^\circ$,
which is larger than $\theta_c$. If you look through a prism, the totally reflecting
face looks silvery, and it is hard to resist turning it over to be sure that it
actually isn't.

Fig. 3.25: Total internal reflection in a glass prism
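The prism numbers can be checked in a few lines of Python. This is a minimal
sketch, assuming the critical angle is given by the inverse sine of 1/n for
light travelling inside the medium of higher index; the function name is
hypothetical.

```python
import math

def critical_angle(n_rel):
    """Critical angle in radians for relative index n_rel > 1.

    Applies to light inside the medium of higher index.
    """
    return math.asin(1.0 / n_rel)

theta_c = critical_angle(1.5)         # glass to air
prism_incidence = math.radians(45.0)  # angle of incidence on the long face

print(theta_c, math.degrees(theta_c))  # about 0.7297 rad, about 41.8 degrees
print(prism_incidence > theta_c)       # True: total internal reflection
```

Since the margin between 45 degrees and the critical angle is only a few
degrees, a glass of noticeably lower index would not work in this prism.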
3.17 Fermat's Principle
There is an old idea, going back to Socrates, that what Nature does must be
somehow for the best. In Plato's dialogue Phaedo, Socrates even expresses
his disgust for the physical ideas of Anaxagoras, because they don't tell him
what he really wants to know, namely how it is that what Nature does is
for the best. In modern physics, however, this idea has become absolutely
fundamental. Almost all physical theories can be formulated in a simple way,
expressing that what Nature actually chooses to do is somehow best. Such
formulations are called variational principles.
Perhaps the first use of a variational principle was a new formulation of
geometrical optics by Pierre Fermat in the 17th century, now called Fermat's
Principle. Geometrical optics follows from three basic rules: light travels in
straight lines in a homogeneous medium, it reflects at an interface according
to the law of reflection Eq (3.33), and it refracts at an interface according to
Snell's Law Eq (3.5). There seems to be no particular reason that light should
do this and not behave in some totally different way. It seems arbitrary. But
these three rules, in turn, all follow from one single idea, Fermat's Principle,
which almost seems to explain what is going on: a light ray between two
points takes the path of shortest time. One could argue (and one did) that
God would not waste time with His light, and that is why light behaves
the way it does! To put it in a more neutral way, light seems to optimize
the travel time in getting from one place to another, and this implies all of
geometrical optics, including extensions of the theory that we could not have
treated before. This example illustrates very neatly the simplicity and power
of variational principles.
Behind Fermat's principle lies the assumption that light travels with some
definite speed in each medium. With this assumption it is clear how Fermat's
principle implies that light travels in straight lines: the way to get from A
to B in minimum time at fixed speed is to take the shortest path, and that
of course is a straight line.
It is not so clear how Fermat's principle implies the Law of Reflection, but
a clever geometrical argument explains this. Given two points A and D in
a single homogeneous medium, and a nearby mirror, we ask for the shortest
path from A to D via the mirror. The path should go from A to some point
C on the mirror by a straight line, of course; any other path would take
more time, unnecessarily. And then it should go from C on the mirror to
D by another straight line. Thus the only freedom we have in searching for
the shortest path is the freedom to choose the point C on the mirror. The
situation is illustrated in Fig 3.26. We introduce
the point B symmetrically opposite A on the other side of the mirror. The
mirror is the plane bisecting the line segment AB. Thus the triangle ACB
is isosceles for any choice of C on the mirror, and hence the distance ACD
is the same as the distance BCD for any choice of the point C. We need
only choose C to minimize BCD, but that is easy: BCD must be a straight
line, and that implies the law of reflection. In this way of looking at the
rays reflected from a mirror, it is obvious why there is a virtual image of
A at B. Every ray from A that reflects from the mirror continues along
the direction that comes from B. This is easier to see via the variational
argument, perhaps, than it was in Fig 3.21.

Fig. 3.26: What choice for C minimizes the travel time along ACD? Hint: not the one
shown!

How does Fermat's principle imply Snell's Law? Now we ask for the
shortest time path from a point A in one medium to a point B in another
medium. You might think the ray should just be the straight line from A to
B, as before, but if B is in a slow medium, and A is in a fast medium, the
shortest time path from A goes to a point on the interface relatively close to
B, insofar as that is possible, and then travels a shorter distance in the slow
medium. Actually solving for the shortest time path requires calculus, so we
just give the result: if the light speed in a medium with index of refraction $n$
is $c/n$, where $c$ is the speed of light in vacuum, then the shortest time path
obeys Snell's law. The bending of the ray toward the normal as it enters a
medium of higher index of refraction is just its way of reducing the distance
it must travel in the slow medium.
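Without calculus, the shortest time path can still be found numerically and
compared with Snell's law. The Python sketch below is an illustration under
stated assumptions: the interface is the line y = 0, the point A is at (0, a)
in a medium of index n1, the point B is at (d, -b) in a medium of index n2,
and the light speed in each medium is c/n. The function names, the
coordinates, and the ternary search used for the minimization are all choices
made for the example.

```python
import math

def optical_path(x, a, b, d, n1, n2):
    """Travel time multiplied by c for the path A -> (x, 0) -> B."""
    return n1 * math.hypot(x, a) + n2 * math.hypot(d - x, b)

def fastest_crossing(a, b, d, n1, n2, steps=200):
    """Ternary search for the crossing point x that minimizes the travel time."""
    lo, hi = 0.0, d
    for _ in range(steps):
        x1 = lo + (hi - lo) / 3.0
        x2 = hi - (hi - lo) / 3.0
        if optical_path(x1, a, b, d, n1, n2) < optical_path(x2, a, b, d, n1, n2):
            hi = x2
        else:
            lo = x1
    return 0.5 * (lo + hi)

# Illustrative geometry: A in air (n1 = 1), B in glass (n2 = 1.5).
a, b, d = 1.0, 1.0, 1.0
n1, n2 = 1.0, 1.5

x = fastest_crossing(a, b, d, n1, n2)
sin1 = x / math.hypot(x, a)            # sine of the angle of incidence
sin2 = (d - x) / math.hypot(d - x, b)  # sine of the angle of refraction

print(x, n1 * sin1, n2 * sin2)  # the last two numbers should agree
```

The crossing point found this way lies closer to B than to A, and the
products n1 sin(theta1) and n2 sin(theta2) agree to within the accuracy of
the search, which is what Snell's law requires.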
Snell's Law only follows from Fermat's Principle if the light speed is $c/n$,
where $n$ is the index of refraction, and naturally one wonders if it is really true
that light travels at a rate proportional to $n^{-1}$ in transparent materials. The
answer is yes. Fermat could not have known that, of course. Experimental
confirmation came only centuries later.
Now we can consider how light travels through complicated materials,
like inhomogeneous glass, where the index of refraction may not be constant,
but instead changes gradually with position. This could happen if the glass
was not well mixed, so that the concentrations of important constituents are
high at some places and low at others. Fermat's Principle tells us that the
actual path taken by light rays will avoid the regions of high n, because these
take longer time to get through, and favor the regions of low n. The actual
rays will be curves.
A familiar example of this is the water mirages one sees on highways on
hot summer days. In this case, light is travelling through air that is heated
from below by the ground. The index of refraction of air is very close to
n = 1, of course, the value for vacuum, but the amount by which it is greater
than 1 is essentially proportional to the air's density. You could imagine
that each air molecule contributes to the index of refraction, and the more
molecules there are, the larger n is. The hot air near the ground expands
and is less dense, so there are fewer molecules, and n is less. The cool air
above is more dense, so there are more molecules, and n is more. Thus light
coming from the sky near the horizon does not come straight to your eye,
but follows a curved path that comes near the ground, favoring the region of
low n, and it may even appear that the sky is reflected in the highway ahead
of you. It looks like a reflection from water: the road looks wet.
3.18 Wavefronts: A Dual Theory of Light
In this final section we look at our results in geometrical optics in a dual
way, adding wavefronts to our diagrams for light rays. The wavefronts are
surfaces that are perpendicular to the rays. We only draw one example,
since this is only meant to be a vague hint, suggesting a completely different
picture of what light is.
Starting from the rays we can draw the wavefronts by just drawing curves
perpendicular to the rays. Corresponding to rays leaving the point A, for
example, we get spherical wavefronts centered on A, shown in blue in Fig 3.27.
The effect of the lens is to convert these wavefronts to new spherical
wavefronts centered on B. The formation of the image can then be thought of as
these new spherical wavefronts advancing toward B, carrying their energy,
until the energy is concentrated at B. In some ways this is a more satisfying
picture, since it was never really clear in geometrical optics why a place
where rays cross should correspond to bright light, but in the wave picture
we seem to see why that would be true: the wave is concentrated there.

Fig. 3.27: A convex lens forms a real image at B of an object at A. The rays show how
one pictures this in geometrical optics, and the wavefronts give an alternative, equivalent
picture.

The thin lens equation gets an interesting new interpretation in this picture.
We write it as
$$\frac{1}{i} = -\frac{1}{o} + \frac{1}{f} \qquad (3.38)$$
Here $o$ refers to the distance from the lens to A, and $i$ refers to the distance
from the lens to B. Now think of each term in the thin lens equation as
a curvature. The term $1/o$ is the curvature of a sphere of radius $o$, i.e.,
the curvature of the wavefront from the object A at the position of the
lens. Similarly the term $1/i$ is the curvature of a sphere of radius $i$, i.e., the
curvature at the position of the lens of a wavefront centered at B. What
the thin lens equation seems to say is that the lens of focal length $f$ adds
curvature $1/f$ to the curvature of a wavefront that reaches it. The resulting,
now differently curved, wavefront then goes on to form an image determined
by its new curvature. There is a sign convention to watch out for: curving
one way is positive, the other way is negative.
We once derived the focal length of a combination of two thin lenses put
together at the same place, having individual focal lengths $f_1$ and $f_2$. The
result for the new focal length $f$ was
$$\frac{1}{f} = \frac{1}{f_1} + \frac{1}{f_2} \qquad (3.39)$$
In terms of curvatures, this is a triviality! Each lens adds its characteristic
contribution to the total curvature; that is all it says.
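The curvature bookkeeping can be written out as a small Python sketch. It is
only an illustration: the function name and the sample distances (in cm) are
assumptions, as is the sign choice that a diverging wavefront arriving from
an object at distance o has curvature -1/o, so that a converging wavefront
has positive curvature.

```python
import math

def image_distance(o, focal_lengths):
    """Follow wavefront curvature through thin lenses at the same place.

    The wavefront from an object at distance o arrives with curvature -1/o.
    Each lens of focal length f adds 1/f. The image forms at the distance
    i = 1 / (final curvature). Assumes the final curvature is not zero.
    """
    curvature = -1.0 / o
    for f in focal_lengths:
        curvature += 1.0 / f
    return 1.0 / curvature

single = image_distance(30.0, [10.0])      # Eq (3.38) with one lens
pair = image_distance(60.0, [20.0, 30.0])  # two lenses, curvatures simply add
equivalent = image_distance(60.0, [12.0])  # one lens with 1/f = 1/20 + 1/30

print(single, pair, equivalent)
```

The two-lens result and the single equivalent lens give the same image
distance, which is the content of Eq (3.39) in this picture.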
This excursion into wavefronts was not meant to be a complete explanation
of anything, just a provocative first look at an alternative theory. We
will actually return to the wave theory of light. For now let us just ask
ourselves if geometrical optics tells us anything about what light really is. Does
light really consist of rays obeying Snell's Law, and so forth? The success
of the theory might tempt us to say yes, there really are light rays, but the
sudden suggestion of a different picture that looks as if it would lead to all
the same results should make us cautious about ascribing reality to our
constructions. This is an example of the wave-particle duality that so intrigued
the discoverers of quantum mechanics, and continues to be a subject of
fascination. In this context we could ask, is light a ray or a wave? The most
cautious answer is that our theories tell us nothing about what light really
is. Rather they are just correspondences between mathematical structures
like geometry and real things like light, that are successful in ordering our
understanding about what happens around us. Such correspondences are all
physics has to offer.
Problems
Not every problem is like one discussed in the text. Be ready to make sketches
and interpret the geometry of the situation. Where an estimate is called for,
use reasonable numbers, and make your choices clear.
Angular Size
3.1 Estimate the angular size, both height and width, of someone you see
standing across the street.
3.2 A high-tech camera is said to have such good resolution of detail that
it can see a dime at a distance of a mile. Clearly this description is talking
about an angle. What angle is it?
3.3 The Moon's angular size as seen from Earth is about $\frac{1}{2}^\circ$. If the Moon
is 60 Earth radii distant from Earth, what is its actual size?
3.4 The Sun's angular size as seen from Earth is almost exactly the same
as the Moon's.
(a) How can that be, if the Sun is actually much bigger?
(b) If the Sun's radius is 100 times that of the Earth, how far away is it?
(c) We know the angular size of the Sun and Moon are the same because
the Moon just covers the Sun in a total solar eclipse. Actually, though, it
sometimes fails to cover, leaving a bright ring of Sun showing around it, even
at the moment of perfect alignment (a so-called annular eclipse). Why is
this?
3.5 The Greek philosopher Anaxagoras was accused of blasphemy when he
suggested that the Sun was not a god, but rather a hot rock, as big as the
Peloponnesus. How far away did he (implicitly) think the Sun was?
3.6 If the Moon is 60 Earth radii from the Earth, what is the parallax angle
relating two observers, of whom one sees the Moon setting on the western
horizon and the other sees the Moon rising on the eastern horizon? What is
the angular size of the Earth as seen from the Moon? What is the relation
between these two questions?
3.7 A surveying crew makes observations from a north-south baseline of
length 100 m. A tall tree is due east, as seen from one end of the line, but it
is $1^\circ$ north of east as seen from the other end.
(a) How far away is the tree?
(b) How different would the answer be if we replace $1^\circ$ by $0.9^\circ$ (a change
of 10%), realizing our angle measurement is uncertain by about this much?
3.8 An old argument against the motion of the Earth is that if the Earth
moved around the Sun, then the stars would show parallax: the nearest ones
would seem to move back and forth against the background of the more
distant ones. But this is not observed. At least it wasn't until the 19th
century, when at last parallax was observed in a star. Even in the closest
stars, the parallax is only about 1 second of arc (where 60 seconds is a minute
and 60 minutes is $1^\circ$). This is a very small angle, and to detect such a small
motion is technically a challenge. Roughly how far away are the nearest
stars, according to this observation? Give the distance in astronomical units
(AU), the distance from the Earth to the Sun. (Hint: the baseline for this
parallax measurement is 2 AU: why?)
Eyes
3.9 A camera focused on infinity, i.e., set to capture a nice sharp point
image of a star, will not be able to take a sharp picture of something close
up. Explain why with a diagram, and say how cameras are constructed to
solve this problem.
Snell's Law
3.10 Suppose a ray of light is incident on an interface at angle $\pi/3$ from
the normal and exits at angle $\pi/6$ from the normal. What is the relative
index of refraction at the interface? Which medium has the greater index of
refraction?
3.11 Suppose a ray of light goes through a flat plate of glass, but not normal
to the plane of the glass (i.e., at an angle).
(a) Sketch the path of the ray in case there is air on both sides of the
glass.
(b) Sketch the path of the ray in case there is air on one side and water
on the other.
(c) Show that in both cases the direction of the ray when it exits the glass
is the same as if the glass had not been there. What does this imply about
what you see through a pane of glass?
3.12 Fig 3.28 shows a light ray going through a prism in a symmetrical
way, arranged to make the deviation of the ray from its initial direction a
minimum.

Fig. 3.28: In this sketch, a light ray goes through a prism in a peculiarly symmetrical way,
entering and exiting at the same angle $\theta_1$. At this angle its deviation by the prism away
from its original direction is a minimum (non-obvious fact).

(a) Make a careful drawing to show that the deviation at the first interface
is $\theta_1 - \theta_2$. Since the deviation is the same at both interfaces, the total
deviation is $2(\theta_1 - \theta_2)$.
(b) Use Euclidean geometry to show that $\theta_2 = \alpha/2$.
(c) Use Snell's Law to show that the correct angle $\theta_1$ for a minimum
deviation ray obeys $\sin\theta_1 = n\sin(\alpha/2)$, and find $\theta_1$ for an ice prism with
$n = 1.31$ and $\alpha = 60^\circ$, an angle that actually occurs in ice crystals in the
atmosphere.
(d) Thus compute the angle of minimum deviation for a $60^\circ$ ice prism.
Could this have anything to do with the halo around the moon, a ring that
sometimes is seen at an angular distance of $22^\circ$ from the moon?
3.13 In Fig 3.29, show that the relative index of refraction is the ratio
n = AB : CD (3.40)
This is how Snell originally expressed his law.
Fig. 3.29: A light ray is refracted at an interface. The ratio of AB to CD is the relative
index of refraction, a characteristic of the interface: that is a geometric way to look at
Snell's Law.
Focal Length of the Eye
3.14 Fish eyes are essentially spherical, unlike ours. Discuss the problem
of the focal length of the fish eye, following the discussion in Section 3.9, not
forgetting that, of course, fish eyes must work in water. How can we be sure
that there is some essential structure inside the fish eye which is not part of
the model in that Section? What could it be?
Virtual Images
3.15 Suppose an optometrist looks, with unaided eye, into the relaxed eye
of a patient, and sees the (suitably illuminated) structures on the patient's
retina. Just like someone looking into water, the optometrist is really looking
at a virtual image. Where is that virtual image located? (This question
requires us to follow certain rays back to see where they appear to emanate
from.) Sketch a diagram as part of your answer.
Thin Lenses
3.16 (a) How is Eq (3.17) different from Eq (3.11)? What factor relates
the two expressions for focal length? What is physically different in the two
configurations that are being described?
(b) By pointing out what happens at the plane interface of the plano-
convex lens, explain the factor you noticed in (a).
3.17 Say in words what Eq (3.17) means, and point out things about it
that make common sense, including dimension, limiting cases, etc.
Object and Image
3.18 Take the object position o to be positive (upstream from the lens)
and nd the image position i in case f > 0 (convex lens). Do this for several
representative values of o: f/4, f/2, f, 2f, 10f. Summarize in words what this
says about looking at something through a convex lens.
3.19 Take the object position o to be positive (upstream from the lens) and
find the image position i in case f < 0 (concave lens). Do this for several
representative values of o: -f/4, -f/2, -f, -2f, -10f. (Note that these are positive
values for o!) Summarize in words what this says about looking at something
through a concave lens.
3.20 Describe how a slide projector forms a real image on a screen, and
propose realistic values for i, o, and f. What is the magnification in your
proposal (including the usual sign convention)? Include a diagram.
3.21 A burning glass (convex lens) has a focal length of 15 cm. How big
is the focussed image of the sun that it forms on a sheet of paper? (Hint:
consider rays through the center of the lens to locate the image.)
Optical Systems
3.22 The magnification of a convex lens of focal length f is often said to
be $d_{\min}/f$, where $d_{\min}$ is the near point of the eye, typically 25 cm. A 50
cm focal length lens, however, actually magnifies things. The magnification
is not 25/50 = 1/2. What is going on?
3.23 From the verbal description in subsection 3.13.2 make an informative
diagram showing how a microscope works.
3.24 Use the lensmaker's equation to design a reasonably thin lens with
f = 2.5 cm. (It could be a microscope eyepiece.) Make an accurate sketch
of your design, and give all necessary specifications.
3.25 What is the focal length of a system of two convex lenses, of individual
focal lengths $f_1$ and $f_2$, if they are separated by a distance d? Measure the
length from the second lens, as you follow the path of the light rays downstream.
Check the dimensions of your result, and verify common sense. In
particular check the special cases $d = 0$ and $d = (f_1 + f_2)$, and comment.
3.26 Design an astronomical telescope that is small enough to carry around
easily and that magnifies by a factor of -8. Make an accurate sketch and give
all relevant specifications.
3.27 As in the problem above, design a Galilean telescope that is small
enough to carry around easily and that magnifies by a factor of 8. Make an
accurate sketch and give all relevant specifications.
Mirrors
3.28 Use Euclidean geometry to show that the law of reflection implies the
picture in Fig 3.21.
3.29 When you look into a shiny sphere, like a Christmas ornament, you
see yourself and the objects around you reflected. Use the relation between
o, i, and R (object position, image position, and radius of the sphere) to
locate the image for various object positions o: R/2, R, 2R, 10R. Be careful
with sign conventions! Here R is the radius of the sphere, which can only be
positive, but what is the sign convention for finding the focal length f for
the sphere? And is o really positive here?
3.30 A limiting case of the spherical mirror with radius R is the flat mirror.
Discuss Eqs (3.34) and (3.35) in this limiting case. Do they make sense?
Explain what these relations mean.
3.31 Prove Eq (3.36) from the geometry of image formation in the case of
a concave mirror forming a real image (as in Fig 3.23).
Reflection and Refraction
3.32 One design for binoculars has, for each eye, an objective lens, an
eyepiece lens, and two internal prisms. If each of these optical components
is made of crown glass, with n = 1.5, and each has an entrance surface
and an exit surface, and the components are not coated with any special
anti-reflecting surface layer, how much energy is lost by reflection?
3.33 It is possible to delay one optical pulse with respect to another by
sending it through a fiber optic cable. The extra travel time for the pulse
through the cable is the delay, and it can be controlled by choosing the length
of the fiber. Suppose we want a delay of 1 $\mu$s, and the fiber is made of glass
with n = 1.5. How long should the fiber be?
Chapter 4
Time and Oscillation
Time is one of those things that seems elementary and obvious until you try
to say what it is. No other science claims time itself as its subject: this topic
is pure physics. Insights into time and its measurement are useful across
the sciences, from potassium-argon dating of ancient minerals in geology, to
femtosecond laser pulses in the study of reaction kinetics in chemistry. But
what is time, really?
Albert Einstein startled the world of physics in 1905 with his first paper on
Special Relativity Theory. You might expect that this would be an impossibly
difficult paper to read, but consider the following passage, taken from it
verbatim: "If I say that the train arrives here at 7 o'clock, that means,
more or less, the pointing of the small hand of my watch to 7 and the
arrival of the train are simultaneous events." This seems way too simple!
Why is there a sentence like this in a deep scientific paper? Einstein is
emphasizing that what we mean by time depends on what we use for clocks.
Previously physics had assumed that time exists independently, and clocks
simply measure it. Einstein is pointing out that clocks define time, which
has no other meaning or reality. Do you accept that? It seems a bit strange,
but modern physics has had to adopt this notion of time. Einstein's idea is
now part of the bedrock of physics. Time is defined by clocks, and not the
other way around.
So what is a clock? That will be our topic. At a minimum, we can
say, a clock should have some measurable property that we will take to be
proportional to time. Then by measuring that property, we are measuring
time.
An old metaphor for time is a stream or a river, flowing smoothly along.
Taking this metaphor literally, we can collect flowing water, somehow, and
the amount of water collected is a measure of the elapsed time: we have
invented a water clock. An hour glass with flowing sand is a variation on
this idea. We will not spend any time on these clocks, since they turn out
to be surprisingly complex systems, for all their apparent simplicity. The
consequence is that they are not very reliable clocks. On the other hand,
just because of their complexity, fluid flow and granular flow turn out to be
challenging and interesting phenomena to study in their own right. For now
we turn to simpler clocks.
4.1 Angular Clocks
Another metaphor for time that can be made into a clock is a turning wheel.
Nature even provides such a clock in the apparent turning of the heavens,
carrying the sun, moon, planets, and stars across the sky, from East to West,
as if they were on a great spherical dome. Thus the metaphor isn't just a
metaphor, it is, in a sense, real. When you look south, where the Sun, Moon,
and zodiacal stars appear from typical latitudes in the northern hemisphere,
you see this daily rotation going in the direction we call clockwise. Of course
now we say that the Earth is the wheel, not the heavens, and the apparent
clockwise motion of the heavens is really due to the Earth turning the other
way, counterclockwise (as seen from above the northern hemisphere). But
mechanical clocks were already being built in the late middle ages, before the
motion of the Earth was understood, as analogue devices to mimic the heav-
enly motions. Since the heavens turn clockwise, clocks also turn clockwise, a
convention that has never changed.
If you use a turning wheel for a clock, the quantity you measure is $\theta$, the
angle through which it has turned. Since we are assuming it is a clock, $\theta$ is
proportional to t, time. That is
$\theta = \omega t$ (4.1)
The constant of proportionality $\omega$ (omega) is called the angular speed.
Using the abbreviation [T] for the dimension of time, we see that $\omega$ has
dimensions [T$^{-1}$], since t has dimension [T] and $\theta$ is dimensionless. Typical
units for $\omega$ would therefore be radians per second, degrees per hour, cycles
per minute, etc. (Recall that angles are dimensionless, but still are measured
in some dimensionless unit, like radians or degrees. A cycle is another unit
of angle, one full turn of the wheel, the same as $2\pi$ radians or 360 degrees.)
It is ambiguous, and potentially confusing, but if the unit for $\omega$ is radians
per second, it is frequently expressed without saying radians explicitly. That
is, 50 s$^{-1}$ means 50 radians per second. The default unit for angle is radians.
Needless to say, wall clocks with a second hand, minute hand, and hour
hand, are angular clocks. We read an angle and interpret it as a time. In
such a clock the second hand moves with $\omega$ = 1 cycle/minute, and the other
hands just keep track of how many cycles the second hand has made.
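As a small illustration of Eq (4.1) and of converting between units of angular speed, here is a minimal Python sketch. The function names are our own, chosen for illustration; the only physics assumed is that one cycle equals $2\pi$ radians.

```python
import math

RADIANS_PER_CYCLE = 2 * math.pi  # one full turn of the wheel

def cycles_per_minute_to_rad_per_s(omega_cpm):
    """Convert an angular speed from cycles/minute to radians/second."""
    return omega_cpm * RADIANS_PER_CYCLE / 60.0

def elapsed_time(theta, omega):
    """Read an angular clock: invert theta = omega * t (consistent units)."""
    return theta / omega

# The second hand of a wall clock: 1 cycle/minute, about 0.105 rad/s.
second_hand = cycles_per_minute_to_rad_per_s(1.0)

# A second hand that has swept a quarter turn (pi/2 radians) marks 15 s.
quarter_turn_seconds = elapsed_time(math.pi / 2, second_hand)
```

Reading the clock is nothing more than dividing the measured angle by the angular speed, provided both are expressed in the same angle unit.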
4.1.1 The Solar Clock
In a certain sense, which turns out to be a bit subtle, the angular position
of the Sun is our most basic clock. Hours, minutes, and seconds are defined
as fractions of the day, and one solar day is the time it takes the Sun to go
from one noon to the next. Everyone knows this, right? This has been the
meaning of time for most of human history, defined by the Sun as a clock.
The Sun's angular position can even be measured quickly and easily with a
sundial. We interpret the measured angle as a time by using Eq (4.1), with
$\omega_{\rm Sun}$ = 1 cycle/day.
Here is an example. Suppose we measure angle from the zenith, so that
$\theta = 0$ corresponds to noon. Then when $\theta = 15^\circ$, the elapsed time t, by
definition, is such that
$15^\circ = \omega t = \left(\frac{1\ \text{cycle}}{\text{day}}\right)\left(\frac{360^\circ}{\text{cycle}}\right) t = \frac{360^\circ}{\text{day}}\, t$ (4.2)
so that $t = (15^\circ/360^\circ)$ day = 1/24 day = 1 hour. That is, it is 1 PM. It would
have been simpler to notice that $\omega_{\rm Sun} = 15^\circ$/hour. Of course the unit ruled
on a sundial is hours, not degrees, so that you don't even have to make this
conversion of units, which we only do here as a reminder.
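The sundial reading above can be written as a one-line conversion. This is only a sketch of Eq (4.2); the function name is hypothetical, and the angle is assumed to be measured from the zenith in degrees, as in the example.

```python
DEGREES_PER_HOUR = 360.0 / 24.0  # omega_Sun = 1 cycle/day = 15 degrees/hour

def hours_after_noon(theta_degrees):
    """Solar time elapsed since noon for a Sun angle measured from the zenith."""
    return theta_degrees / DEGREES_PER_HOUR

t = hours_after_noon(15.0)  # 1.0 hour after noon, i.e. 1 PM
```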
Sun time is close to being the time we use in everyday life, but it is not
quite right. Now we define time with a different clock, and we adjust it to
make it agree with sun time on average. Why would we complicate things
like that?
4.1.2 The Sidereal Clock
Any star could also dene a clock, and hence a time, by its angular position,
which changes over the course of a night, much the way the Sun's angular
position changes over the course of a day. This time is called sidereal time,
and we would read the sidereal clock by reading the angular position of the
star and using Eq (4.1), just as we did in the example above. Of course we
would have to know the constant of proportionality $\omega_{\rm Sidereal}$ for this clock,
which turns out to be slightly greater than 1 cycle per day! The reason is that
the stars, in their apparent East to West motion, gradually gain on the sun,
or to put it another way, the sun slips back toward the East, losing ground
among the stars, slipping back once around the Zodiac in a year (this is the
definition of the year). So the sidereal clock turns faster.
4.1.3 Solar vs. Sidereal
To relate these two clocks, we need to know how many days there are in a
year: 365.2422, approximately. This is the result of an observational program
which has been carried out over literally thousands of years, as one civilization
after another tried to bring its calendar into synchrony with the seasons. We
can take this number as well established. That the extra fractional part of
this number, namely 0.2422, is about 1/4 = 0.25 means that we put one
extra day into the calendar every four years. That the fractional part is
really closer to $1/4 - 3/400 = 0.2425$ means that three times every 400 years
we don't put in the extra day. As you see, this is still not quite the right
correction. Since $0.2422 = 1/4 - 3/400 - 3/10000$, we should also refrain
from adding the extra day an additional three times every 10,000 years, but
there is no agreement about exactly when to do that, and maybe we won't
have to worry about it.
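The calendar arithmetic can be checked with exact fractions. The sketch below assumes the usual Gregorian rule (century years are leap years only when divisible by 400), which is one way of skipping the extra day three times every 400 years; the final step is the further correction suggested in the text, not part of any agreed calendar.

```python
from fractions import Fraction

def is_leap_year(year):
    """Gregorian rule: every 4th year, except century years not divisible by 400."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

# Count leap days over one full 400-year cycle (any 400 consecutive years will do).
leap_days = sum(is_leap_year(y) for y in range(2000, 2400))

# Average length of the calendar year: 365 + 97/400 = 365.2425 days.
mean_year = 365 + Fraction(leap_days, 400)

# Dropping three more leap days every 10,000 years gives 365.2422 days.
corrected = mean_year - Fraction(3, 10000)
```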
Now we can relate the solar and the sidereal clocks. Using the conversion
factor 1 = 365.2422 days/year, we find that $\omega_{\rm Sun}$ = 365.2422 cycles/year.
But $\omega_{\rm Sidereal}$ = 366.2422 cycles/year, because by definition, the year ends
when the stars exactly lap the Sun, like a runner who catches her opponent
on a racetrack by gaining an entire lap. Thus
$\frac{\omega_{\rm Sidereal}}{\omega_{\rm Sun}} = \frac{366.2422}{365.2422} = 1.00274$ (4.3)
This is the factor by which the sidereal clock runs faster than the solar clock,
it appears.
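A short numerical check of Eq (4.3) follows. It also converts the ratio into the length of one sidereal cycle in hours of mean solar time; that last step is our own illustration and assumes both clocks run at their average rates.

```python
DAYS_PER_YEAR = 365.2422                  # solar days per year, from the text

omega_sun = DAYS_PER_YEAR                 # cycles/year
omega_sidereal = DAYS_PER_YEAR + 1.0      # the stars lap the Sun once per year
ratio = omega_sidereal / omega_sun        # Eq (4.3), about 1.00274

# One sidereal cycle expressed in hours of mean solar time.
sidereal_day_hours = 24.0 / ratio
shortfall_minutes = (24.0 - sidereal_day_hours) * 60.0   # a little under 4 minutes
```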
If this were the whole story, it would just mean that we have two different
clocks, provided to us by Nature, that run at slightly different rates. It would
be like having two different units to measure time, the solar day, and the
sidereal day, with the solar day being slightly longer. (Here by "sidereal
day" we mean the time for a star to go from one zenith, or transit, to the
next, to complete one cycle.) We would read the sidereal clock in sidereal
days, we would read the solar clock in solar days, but we could easily convert
units, so both would tell us the time, and they would agree about what time
means.
The actual story is more subtle, though. The angular speeds $\omega_{\rm Sun}$ and
$\omega_{\rm Sidereal}$ must be understood as average speeds, averaged over the entire year.
The measurement of the length of the year, on which all of this is based, is
just a measurement of the total accumulated angle in each clock over the
year, and says nothing about whether the clocks ran at constant rate. For
a single clock this is not even a question, as we could define the meaning of
time by insisting that $\omega$ is constant. But with two clocks we can ask whether
they keep the same time, and the answer in this case is that they do not. The
solar clock sometimes speeds up and sometimes slows down with respect to
the sidereal clock, or, alternatively, the sidereal clock sometimes slows down
and sometimes speeds up with respect to the solar clock. What we can be
sure of is that they do not determine the same time, even if we correct for
the difference in average angular speeds.
A quick way to be sure of this is to consider the average angular speeds
over just half the year instead of the whole year. In going from the vernal
equinox to the autumnal equinox, the Sun goes exactly halfway around the
Zodiac, so this is half a year. How many solar days are in half a year? Well,
the day on which the Sun reaches the equinox is a little bit variable, since the
year is not an integer number of days: this moment, which is announced in
the newspapers each Spring and Fall, indicating the onset of a new season,
is about six hours later each year until leap year jumps it back by 24 hours.
But the autumnal equinox is always around Sept. 22, and the vernal equinox
is always around March 21. So you can just count the days to find out how
many days there are in half a year. I get 180 days from Sept. 22 to March
21, and 185 days from March 21 to Sept. 22. So in addition to running
slower than the sidereal clock on average, the solar clock runs particularly
slowly in the winter, and speeds up a bit in the summer, or perhaps it runs at
constant rate, and it is the sidereal clock which speeds up in the winter and
slows down in the summer. In any case, if we choose one over the other, we
are making a choice about what time is, and there is no purely logical way
to decide. This dilemma arises whenever we have two candidates for clocks
that do not agree.
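The day count is easy to reproduce. In the sketch below the calendar years are arbitrary (chosen so that no leap day falls in the interval), and the equinox dates are the nominal ones quoted above. The two ratios at the end are our own illustration: since the Sun goes halfway around the Zodiac in each half year, the stars gain half a cycle on it in each half.

```python
from datetime import date

# Nominal equinox dates; the years are assumptions made for illustration.
autumn_2021 = date(2021, 9, 22)
spring_2022 = date(2022, 3, 21)
autumn_2022 = date(2022, 9, 22)

winter_half = (spring_2022 - autumn_2021).days   # Sept. 22 to March 21
summer_half = (autumn_2022 - spring_2022).days   # March 21 to Sept. 22

# Sidereal cycles per solar day, averaged over each half year separately.
winter_ratio = (winter_half + 0.5) / winter_half
summer_ratio = (summer_half + 0.5) / summer_half
```

The winter ratio comes out larger than the summer ratio, which is the numerical form of the statement that the two clocks do not keep the same time even after correcting for their average rates.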
In choosing between these two clocks we cannot use logic, because they
are logically equivalent. We use physics. That is, we have a way of physically
understanding the motions of the clocks, and this understanding includes an
explanation for why one of them is not keeping good time and the other one
is. Of course time, which we are defining here, is part of the theories that we
use to make the decision! So in the end, time is defined in such a way that it
becomes part of a coherent system to make mathematical sense of the world.
But it is a choice. It is subject to international agreement, for example.
Here is how we understand the two clocks, solar and sidereal. The sidereal
clock uses the stars as a reference, and we assume the stars are essentially
stationary from the point of view of the Earth, because of their great distance
D. Their positions are angular positions on the dome of the sky, and even
if they should move by a distance H, the angular size of that displacement
H/D is essentially zero, because D is essentially infinite. The Earth may
move by some relatively small distance H, but again the parallax angle H/D
is essentially zero. Thus the stars provide a stationary reference for observers
on Earth. Their apparent motion from East to West, the motion of the clock,
is entirely due to the rotation of the Earth. But our understanding of rigid
body rotation says that a rotating rigid sphere, isolated from any outside
influence, rotates at constant angular speed forever. This is even true if the
sphere is revolving around the Sun in a gravitationally bound orbit. We are
clearly calling on more physics here than you could be expected to know, but
these conclusions follow from Newton's theory of motion. We therefore think
we understand why the sidereal clock is a good clock, and should be taken
seriously. The angular speed $\omega_{\rm Sidereal}$ is the angular speed of the turning
Earth, and physics strongly suggests that it should be essentially constant.
Fig. 4.1: The Earth turns (counterclockwise) through more than one full rotation between
one noon and the next because it has also advanced in its orbit. The extra angle that it
must turn is the angular distance that it advances. (Figure labels: the point A on the
Earth at Day 0 and at Day 1, and the Sun.)
On the other hand we also have good reason to think the solar clock
would not be a good clock. Fig 4.1 shows how far the Earth must turn
between one noon and the next at some fixed point A on the Earth's surface.
The Earth has to make more than one complete rotation because while it
is rotating it is also revolving in its orbit around the Sun. That is why
the solar day is a little longer than the sidereal day. The extra angle the
Earth must turn is essentially the angle through which it moves along its
orbit. Fig. 4.1 is oversimplified, however. In the first place, we have made
the distance the Earth moves along its orbit larger than it really is, just to
make its effect on the solar day big enough to see easily. More important,
as we describe in a little more detail in the next section, the Earth's orbit
is slightly elliptical, and the Earth travels faster through some parts of the
orbit than others (faster during the Northern Hemisphere winter, in fact).
This is the important thing: even while it rotates at a constant rate, the
Earth speeds up and slows down in its orbit. When it moves fast (in winter)
it must turn farther in a solar day, and this takes more time. Thus the solar
day is of variable length. There is even another complication, coming from
the tipping of the rotation axis with respect to the orbital plane, but we have
seen enough to know that the solar clock is variable and complicated.
Thus until recently (1967, to be exact) our civilization used the sidereal
clock, scaled by the factor in Eq (4.3) to give mean solar time. With this
definition, the sun at noon (by the clock) would be sometimes ahead of the
zenith and sometimes behind. The amount by which the solar clock is ahead
of or behind clock time through the year is graphed in Fig 4.2, what is
sometimes called the Equation of Time.
Fig. 4.2: The Equation of Time, the amount T (in minutes, plotted from -15 to 15) by which
a sundial is ahead of or behind mean solar time, as a function of day N in the year, with
N = 1 corresponding to Jan. 1.
4.1.4 Aside on Kepler's Laws
The properties of planetary orbits alluded to in the previous section were
discovered in the early 1600s by Johannes Kepler, and were published by him
between 1609 and 1619. The three Kepler laws were shown by Newton in 1687
to follow from his Law of Universal Gravitation, and in this way the problem
of planetary motion, at least in its simplest form, was solved once and for all.
The Kepler laws are
I. The orbit of a planet is an ellipse with the Sun at one focus (see Fig 4.3).
II. The planet moves in such a way that the line joining it to the Sun
sweeps out equal areas in equal times (see Fig 4.4).
III. The period T and the semimajor axis a of its elliptical orbit are such
that $T^2 \propto a^3$.
Fig. 4.3: An ellipse is characterized by two lengths, the semimajor axis a and the semiminor
axis b. The ellipse has two foci, one indicated by the point S above and another symmetrically
located to the left of the center O. The distance OS is $\sqrt{a^2 - b^2}$, as indicated in
the figure. A special case of the ellipse is the circle, for which a = b is the radius. In this
case the foci and the center coincide. If the ellipse is a planetary orbit, the sun is at S.
Fig. 4.4: Kepler's 2nd Law: In an elliptical planetary orbit, equal areas are swept out in
equal times. Since the two shaded regions are equal in area, the times to sweep them out
were equal, which means the planet moved faster along the arc AB, where it was near the
Sun, than along CD, where it was far away.
The Earth is a planet, and the Earth's period T is 1 year. Until 1976
its semimajor axis a was the astronomical unit (AU) by definition, so the
constant of proportionality in Kepler's Third Law was exactly 1 yr$^2$/(AU)$^3$.
In 1976 the definition of the AU was changed slightly! Now 1 AU is the
average distance of the Earth from the Sun, in a certain sense. Since the
Earth's elliptical orbit is almost circular, though, i.e., $a \approx b$, referring to
Fig 4.3, the difference between a and 1 AU is very small, and to excellent
approximation we can still take the constant to be 1 yr$^2$/(AU)$^3$.
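With the constant taken to be 1 yr$^2$/(AU)$^3$, Kepler's Third Law becomes a simple rule for computing with years and astronomical units. The following is a sketch under that approximation; the function names and the 4 AU example are hypothetical, chosen only because the arithmetic is easy to check by hand.

```python
def orbital_period_years(a_au):
    """Kepler's Third Law with constant 1 yr^2/AU^3: T^2 = a^3, so T = a^(3/2)."""
    return a_au ** 1.5

def semimajor_axis_au(period_years):
    """The same law inverted: a = T^(2/3)."""
    return period_years ** (2.0 / 3.0)

earth = orbital_period_years(1.0)     # 1 year, by construction
example = orbital_period_years(4.0)   # a body at 4 AU: T = 4^(3/2) = 8 years
```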
4.2 Atomic Clocks
In the 20th century new clocks were developed (atomic clocks), using prin-
ciples of quantum mechanics, that promised very high accuracy. Like the
rigid, spherical Earth turning in isolation, these clocks used an essentially
simple, isolated timekeeper (cesium atoms at low density in a near-vacuum,
and more recently rubidium atoms). Atomic clocks, however, do not agree
perfectly with the sidereal clock. This poses the question all over again:
which clock do you believe? It comes back to questions of physics. Are there
reasons why the sidereal clock might be irregular, and in particular might
be slowing down, as the comparison with atomic clocks suggests? Well, yes:
the Earth is not quite rigid, is not quite spherical, and is not quite isolated.
This complicates its rotation, and suggests ways it could slow down. On the
other hand there is no plausible reason why cesium atoms should be grad-
ually speeding up. So it seems that we should trust the atomic clocks, and
even use them to learn things about the dynamics of the Earth, by studying
how it fails to be a perfect clock.
In 1967 Jocelyn Bell and Anthony Hewish, radio astronomers, detected
what came to be known as pulsars, astronomical sources of regular pulsed
radio signals. For a brief time there was even speculation that these were
signals from an alien civilization (the first name for pulsars was LGM: Little
Green Men) but soon it became clear that the pulses were extremely regular,
and could not contain a message. Their regularity suggested that they must
come from rigid rotating objects, essentially angular clocks, with the nice
feature that on each rotation they send us a pulse, like the light flashes from a
lighthouse as it rotates. If this is the case (and it is still our model for pulsars)
they might be fantastically good clocks, because whatever they are, they must
be much more massive than the Earth, and so it would be much harder to
slow them down or speed them up. Perhaps they really turn at constant
angular speed. In the comparison with atomic clocks, though, pulsars lost
out. Within a few years it was found that at least some pulsars slow down
slightly with respect to atomic clocks, and hence we must conclude that they
really do slow down in reality. This says something about the environment
in which they are spinning, perhaps, or about their dynamics (starquakes
have been observed in pulsars, sudden changes in shape that affect their
angular speed; at least that is the most plausible interpretation). Thus
pulsars did not become our time standard, but if they had been discovered
50 years earlier, they probably would have.
Finally, the atomic clocks do not agree among themselves! They are ba-
sically all the same construction, so how do we choose? Once again physics
has something to say. Einstein's Special Relativity Theory actually predicts
that they shouldn't all run at the same rate, although they should all run
at constant rates. Part of reading the clock is making these relativistic cor-
rections. There is a correction for latitude: clocks nearer the North Pole are
moving slower, as the Earth turns, than clocks nearer the Equator, and this
turns out to affect the clock rate. It is now a well understood effect. Also the
altitude of the clock makes a difference, because according to Einstein's General
Relativity Theory, the gravitational field affects the clock rate, and the
gravitational field is weaker at higher altitude. This is also well understood.
These corrections, which everyone agrees should be made, bring the clocks
to near agreement. If there were some systematic discrepancy left among
the clocks, it would indicate some new physics phenomenon that we don't
yet understand, but there doesn't seem to be such a thing in the data. The
remaining discrepancy looks random. Therefore the time shown by all the
clocks in the network, which consists of over fifty clocks, at widely separated
locations, is simply averaged. The result is: time!
4.3 GPS: Global Positioning System
The basic idea for GPS comes from Euclid Book I, Proposition 22, which
says that you can construct a triangle knowing the lengths of its sides. Call
the sides $D_1$, $D_2$, and $D_3$. As Fig 4.5 shows, we can lay out the length $D_1$,
with endpoints A and B. Then if the point P is known to be a distance
$D_2$ from A and a distance $D_3$ from B, we swing an arc with radius $D_2$ and
center A and an arc with radius $D_3$ and center B. Where these intersect is
the point P, the third vertex of the triangle.
Fig. 4.5: A triangle is determined by the lengths of its sides, the principle of the Global
Positioning System. (Figure labels: side $D_1$ joining A and B; sides $D_2$ and $D_3$ meeting at P.)
In GPS the points A and B are two satellites, whose positions are known.
The point P is a lost hiker trying to figure out her position. That position
would be determined if she could just find her distances $D_2$ and $D_3$ from the
two satellites. The system accomplishes this by very accurate timing! Signal
pulses are sent out from the satellites and travel with the speed of light. They
are eventually received at P, and the delay due to travel time is a measure
of the distance the signal travelled. This goes to the very notion of speed:
the distance the signal travels is proportional to the time, and the constant
of proportionality is the speed, denoted c in the case of the speed of light.
Thus
$D_2 = c t_2$ (4.4)
$D_3 = c t_3$ (4.5)
where $t_2$ and $t_3$ are the two measured delays, and c is the known speed of
light. The longer the delay, the farther away she must be. She switches on the
receiver. It measures the delays, computes $D_2$ and $D_3$ using the known value
of c, and then does the construction of Fig 4.5 to determine her position.
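The two-arc construction of Fig 4.5 can be sketched numerically. This is only an illustration of the plane geometry, not of how an actual receiver works; the function name and the coordinate convention are our own. Note that two arcs generally meet in two points, mirror images of each other across the line AB, so the sketch returns both candidates.

```python
import math

C = 3.0e8  # speed of light in m/s (rounded value)

def locate(ax, ay, bx, by, t2, t3):
    """Return both points P at distance c*t2 from A and c*t3 from B (plane geometry)."""
    d2, d3 = C * t2, C * t3                      # Eqs (4.4) and (4.5)
    d1 = math.hypot(bx - ax, by - ay)            # length of the baseline AB
    # Distance from A, measured along AB, to the foot of the perpendicular from P.
    along = (d1**2 + d2**2 - d3**2) / (2 * d1)
    h_sq = d2**2 - along**2                      # squared height of P above AB
    if h_sq < 0:
        raise ValueError("arcs do not intersect: delays are inconsistent")
    h = math.sqrt(h_sq)
    ux, uy = (bx - ax) / d1, (by - ay) / d1      # unit vector from A to B
    fx, fy = ax + along * ux, ay + along * uy    # foot of the perpendicular
    return (fx - h * uy, fy + h * ux), (fx + h * uy, fy - h * ux)

# A 3-4-5 triangle: A at the origin, B five meters away along the x axis.
p, q = locate(0.0, 0.0, 5.0, 0.0, 3.0 / C, 4.0 / C)
```

Choosing between the two candidates requires some extra information, for example knowing on which side of the line AB the receiver lies.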
The above is the basic idea. Actually implementing it calls for a lot of
cleverness, and the system is still rapidly developing and improving, but it
is basically just Euclidean geometry and accurate timing. One problem that
might occur to you is that our world is three dimensional, and the above
construction seems to take place in just a plane (two dimensions). That is a
fair criticism: in fact, there have to be three satellites, and three delays $t_2$,
$t_3$, and $t_4$ to locate the receiver P in three dimensions. The picture is still
basically the same, but instead of two circles of radius $D_2$ and $D_3$ intersecting
to determine P, we would have three spheres of radius $D_2 = ct_2$, $D_3 = ct_3$,
and $D_4 = ct_4$ intersecting to determine P.
We would like the GPS to locate things within a few meters. How good
must the timing be? To answer this, we must finally take account of what
the speed of light actually is, numerically: about $3 \times 10^8$ m/s. So the time
to go 3 meters is
$t = \frac{3\ \text{meters}}{3 \times 10^8\ \text{meters/sec}} = 10^{-8}$ s (4.6)
That is 0.01 microsecond, or 10 nanoseconds, a very short time! Any error in
timing by just that small amount produces an error of 3 meters in position.
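The proportionality between timing error and position error can be written out directly. This is a minimal sketch using the rounded value of c; the function names are ours.

```python
C = 3.0e8  # speed of light in m/s (rounded value)

def timing_for(position_error_m):
    """Timing precision (seconds) needed for a given position error (meters)."""
    return position_error_m / C

def position_error(timing_error_s):
    """Position error (meters) produced by a given timing error (seconds)."""
    return C * timing_error_s

needed = timing_for(3.0)                  # 1e-8 s = 10 nanoseconds, as in Eq (4.6)
one_microsecond = position_error(1e-6)    # a 1 microsecond error is about 300 meters
```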
The conceptually simplest way of measuring a delay like $t_2$ from satellite
#2 would be for satellite #2 to emit a pulse at a predetermined time that
is known to everyone. Then the receiver at P notes the time the pulse
is received, which is a later time, of course, and notes the difference from
the known time it was sent. This is conceptually simple, but technically
much too hard: everyone would have to be carrying atomic clocks to make
a comparison like that with a precision of 10 nanoseconds! Instead, the
receiver just compares arrival times of signals from different satellites, but
does not assume that it knows the real time. To make up for this missing
datum, it needs to use a signal from a fourth satellite. Over a short time,
like a second, it can make meaningful comparisons with a precision of 10
nanoseconds, but over long times, not being an atomic clock, it goes out of
synchrony with standard time. With four satellites all sending pulses close
together, it gets three accurate time differences, enough to locate P in three
dimensional space. The receiver acts more like a stopwatch than a clock,
so that it only has to be a good timer for a brief interval, not stable and
accurate over a long time. You could think of the first pulse as starting the
stopwatch, and the other three pulses then get measured with respect to the
first one.
Let us consider two more technical points about the GPS that the engineers
have had to deal with. First, the signals from the satellite do not propagate
through a vacuum: eventually they come through the atmosphere, with
its index of refraction n. In fact n is not even constant, but depends on the
properties of the air. In the ionosphere, a region in the upper atmosphere, n
is anomalously large. And since the light speed is really c/n, and not c, as
we have been assuming, we are likely to overestimate the distances unless we
correct for the characteristics of the atmosphere. Therefore a mathematical
model of the index of refraction of the atmosphere is part of the GPS, built
into the receivers!
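To see the size of this effect, here is a rough sketch that treats the medium as uniform with a single index n. The value n = 1.0003 and the 1 microsecond of travel time are assumptions made purely for illustration; a real atmospheric model is far more detailed.

```python
C = 3.0e8  # vacuum speed of light in m/s (rounded value)

def distance_from_delay(delay_s, n=1.0):
    """Distance travelled during delay_s seconds in a uniform medium of index n."""
    return (C / n) * delay_s

# Hypothetical numbers: 1 microsecond of travel time spent in air,
# with n = 1.0003 assumed for illustration only.
naive = distance_from_delay(1e-6)               # pretends the path is vacuum
corrected = distance_from_delay(1e-6, 1.0003)   # allows for the slower signal
overestimate_m = naive - corrected              # roughly 0.09 meters
```

Even this small index, applied to a short stretch of the path, shifts the distance by centimeters, which matters when the goal is an accuracy of a few meters.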
Finally, do not forget that we are assuming we know exactly where the
satellites are, probably within a few centimeters. How is this possible? Well,
Newton's laws of motion can predict pretty accurately where they must be,
and signals sent between the satellites of the system and ground stations can
monitor any drift away from the prediction and update the system with true
positions. The network of atomic clocks governs the satellite pulses, since it
is essential that they be properly synchronized.
Taken all together, the GPS is a truly amazing invention. As a resource
for economic efficiency, not to mention saving lives, its value is almost impossible
to exaggerate. Yet the GPS makes essential use of physics that was
until recently considered almost impossibly abstract, and without practical
importance, like relativity theory. It also makes essential use of Euclidean
geometry, the very beginning of physics. It should be seen not as an achieve-
ment of our time alone, but as a culmination of a very long project, over
most of human history.
4.4 Longitude
An older version of the GPS problem (the problem of determining location)
is to determine longitude. This problem too requires Euclidean geometry,
good clocks, and celestial references, so it is a kind of precursor to the GPS,
before the days of artificial satellites and atomic clocks.
The basic idea is shown in Fig 4.6, showing the Earth from above the
North Pole, so that the angle $\theta$ in the figure is longitude.

Fig. 4.6: The Earth is viewed from above the North Pole. A star at essentially infinite
distance (note parallel rays) is seen at its zenith in Greenwich G, and at the same moment
at an angle $\theta$ (measured from the local zenith) at A. The longitude at A is therefore $\theta$.

Light from a star is observed at a location A on Earth. The star appears at an
angle $\theta$ from its zenith at A. Now suppose that at this very moment that same star is seen
at the zenith over G, Greenwich Observatory (the zero of longitude). Then
geometry says that the longitude angle at A is $\theta$. Another way to make the
observation is to time the star from the moment of its zenith over Greenwich
to the moment of its zenith over location A. Since the sidereal clock (the
rotating Earth) moves at 1.0027 cycle/day, you could convert this time into
the angle moved.
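The conversion from elapsed time to angle can be sketched in a few lines of Python. This is only an illustration: the function name and the one-hour delay are assumptions chosen for the example, and the only number taken from the text is the sidereal rate of 1.0027 cycles per day.

```python
SIDEREAL_CYCLES_PER_DAY = 1.0027   # rate of the rotating Earth, from the text
SECONDS_PER_DAY = 86400.0          # one ordinary (solar) day

def angle_from_zenith_delay(delay_seconds):
    """Angle in degrees that the Earth turns during delay_seconds of clock time."""
    cycles = SIDEREAL_CYCLES_PER_DAY * delay_seconds / SECONDS_PER_DAY
    return 360.0 * cycles

# Hypothetical observation: the star reaches the zenith at A one hour
# after it was at the zenith over Greenwich.
print(f"{angle_from_zenith_delay(3600.0):.2f} degrees")   # about 15.04 degrees
```

One hour of delay corresponds to slightly more than 15 degrees, because the Earth turns through a little more than one full cycle per day relative to the stars.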
This makes it seem as if it would be easy to determine longitude. You
just observe any star that happens to be at the zenith over Greenwich right
now, and its angular departure from the zenith, as seen from where you are,
is your longitude. The hard part is in the phrase "right now." How would
you know that a given star just happens to be at the zenith at Greenwich
right now? You could have a table of stars with the times of their zeniths at
Greenwich (Greenwich time). But you would still need a clock that told you
Greenwich time. Then you would be all set. You observe a star at the time
given for it in the table according to the Greenwich clock. To do it, though,
you need that clock. If the clock is wrong, then you make the observation at
the wrong time, the star has moved, and you get the wrong angle.
Nowadays anyone can make a rough measurement of longitude. You could
call a friend in London on the telephone and ask where the sun is in the sky.
At the same time you note where the sun is in your sky. (You could each
use sundials to measure the angle.) The difference is your longitude. Or you
could note the Sun's position at some convenient time by your watch at your
home on the East coast, fly from the East coast to the West coast of the US,
and when you get there check the Sun's position a second time against your
watch. You see that the Sun is not as far along in its daily motion as the watch
says it should be. The discrepancy is the difference in longitude between the
East coast and the West coast. You have to trust that your watch is still
keeping good East coast time, but even the cheapest watch can do that now.
If you are satisfied with a very crude measurement, the assignment of time
zones is a rough indication of longitude, so you wouldn't have to measure
anything, just compare local clocks on the West coast with your East coast
watch: 3 hours discrepancy means 45° in longitude. (Reality check: the
longitude difference between Boston and San Francisco is about 51°, but
when we use time zones we can only get an answer in integer multiples of
hours, i.e. 15°. With that understanding, that the value obtained from time
zones could be off by as much as 15°, these two values agree.)
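The time-zone estimate is a one-line proportionality, sketched here in Python. The function name is an illustrative assumption; the 3 hour discrepancy and the 51° figure are the ones quoted above.

```python
DEGREES_PER_HOUR = 360.0 / 24.0   # the Sun appears to circle the sky once a day

def longitude_difference(hours):
    """Longitude difference in degrees for a given difference in local solar time."""
    return DEGREES_PER_HOUR * hours

estimate = longitude_difference(3)     # East coast vs. West coast watch
actual = 51.0                          # Boston to San Francisco, from the text
resolution = longitude_difference(1)   # time zones only resolve whole hours

print(estimate)                                # 45.0
print(abs(actual - estimate) <= resolution)    # True: consistent within 15 degrees
```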

It requires an effort to imagine a time when none of this was possible:
no nearly instantaneous communication to distant places, and no clocks that
could be transported while accurately keeping the time of their original lo-
cation. Without one or other of these things, the measurement of longitude
is nearly impossible. In the late middle ages in Europe it was known that
Japan was about 10000 miles to the East, but what is that in longitude? It
turns out that Tokyo is about 140° east of London, not even half way around
the world. But Columbus apparently convinced himself that it was closer
to 300°, or most of the way around, so that Japan was out in the Atlantic
somewhere. To believe this he had to believe that the spherical Earth is
only about half the size it actually is. This is to say that ignorance on the
question of longitude was profound. The Hellenistic Greeks had known the
true size of the Earth, but 1700 years later no one knew.
In addition to Eratosthenes' accurate determination of the size of the
Earth (using latitude, much easier), the Hellenistic Greeks had made some
measurements of longitude by the following method. It occasionally hap-
pened that a lunar eclipse was observed at cities widely separated from each
other and people noted the time (probably not very accurately, but at least
approximately). The eclipse functioned retrospectively like a phone call, an
instant communication: both cities were seeing the same thing at the same
moment, but the local times were different. The difference in local times is
precisely a measurement of longitude, as we have seen. With patience, as
lunar eclipses conveniently happened, it would be possible to build up an
accurate map of the Earth this way. But it is not the kind of method one
can call upon as needed, to determine the longitude of a ship, for example,
for purposes of navigation.
The determination of longitude by navigators seems to require a good
clock, one that can run on board a ship and keep Greenwich time to better
than 1 second for a whole voyage. Eventually such clocks were built,
and given the Greek name chronometers. Their invention, in England,
contributed significantly to British naval power. This is a wonderful story,
memorably told by Dava Sobel in her well known book Longitude, but it doesn't
have much more to teach us about physics. We will tell instead about an
attempt that failed, but an ingenious one, and quite instructive.
4.5 The Moons of Jupiter
In January 1610 Galileo, making the first telescopic observations of Jupiter,
noticed four bright "little stars," as he called them, that seemed to accompany
Jupiter. They seemed to oscillate from one side of the planet to the other
as he observed them over successive nights and weeks, getting sometimes a
little ahead of the planet and sometimes a little behind. It took him a few
days to realize what he was seeing. The little stars were actually moons
of Jupiter, making circular orbits around the planet, but he was seeing the
circles from the side. Schematically the situation was like Fig 4.7, where we
imagine Galileo looking in along the x-axis and seeing only the y-coordinate
of each moon.
Each moon is, in effect, a little angular clock. The angle of the moon in
its orbit is called its phase, perhaps by analogy with our terrestrial moon.

Fig. 4.7: A moon M in circular orbit of radius 1 is observed from the side. The observer
sees only the projection of the orbit on the y-axis. If $\theta$ is the phase of the moon in its
orbit, the observer sees $y = \sin\theta$. This is the observed displacement of the moon M from
Jupiter J, sometimes positive, sometimes negative.

When $\theta = 0$ the moon is seen against Jupiter, but when $\theta = \pi/2$ the moon
is at maximum displacement from Jupiter. It goes back to zero displacement
at $\theta = \pi$, and reaches maximum displacement on the other side of Jupiter
when $\theta = 3\pi/2$. This makes the clock a little tricky to read. You don't really
see the phase angle $\theta$, you only see the projection onto the y-axis, which
is $\sin\theta$. Galileo had to learn to look at the displacement of the moon and
realize what it meant in terms of angle $\theta$. That is what Fig 4.7 shows how
to do.
If you only see the clock from the side, you actually can't tell the angle
unambiguously. We have shown the clock with phase $\pi/4$, with M in the
first quadrant, but there is another location, in the second quadrant, namely
$\theta = 3\pi/4$, where it has the same y-displacement from Jupiter. This is just a
little complication to be aware of.
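The ambiguity can be made concrete with a short Python sketch. The helper name is hypothetical; it simply lists the two phases in one cycle that give the same sideways displacement of a moon on an orbit of radius 1.

```python
import math

def candidate_phases(y):
    """Return the two phases in [0, 2*pi) whose sine equals y (for -1 <= y <= 1)."""
    first = math.asin(y) % (2 * math.pi)
    second = (math.pi - math.asin(y)) % (2 * math.pi)
    return first, second

# The displacement seen when the true phase is pi/4 ...
y_seen = math.sin(math.pi / 4)
a, b = candidate_phases(y_seen)

# ... is also produced by the phase 3*pi/4 in the second quadrant.
print(a / math.pi, b / math.pi)   # roughly 0.25 and 0.75
```

Watching whether the displacement is growing or shrinking on later nights is one way to decide between the two candidates.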
In calling this displacement $\sin\theta$, we are actually extending the definition
of $\sin\theta$ beyond its original definition, as a ratio of sides in a right triangle.
That definition would apply to the $\theta$ in Fig 4.7, which is in the first quadrant,
and we have even indicated the appropriate right triangle, but for angles
larger than $\pi/2$ that definition wouldn't make sense. The extension of the
definition to any angle is shown in Fig 4.8. With this definition, which clearly
agrees with the old definition for angles in the first quadrant, the sine and
cosine can be either positive or negative, depending on the angle. You can
see why the sine and cosine are sometimes called the circular functions.

Fig. 4.8: The cosine and sine of an angle $\theta$ are defined to be the x and y coordinates of
the point on the unit circle at angle $\theta$, measured counterclockwise from the x-axis. For the
angle shown here the cosine is negative and the sine is positive.

The tangent is now defined more generally by
$$ \tan\theta = \frac{\sin\theta}{\cos\theta} \tag{4.7} $$
We notice how this picture includes some familiar facts about the sine and
cosine, for example $\cos(0) = 1$ and $\sin(0) = 0$, but also some perhaps
unfamiliar facts, like $\cos(\pi) = -1$ and $\sin(\pi) = 0$. The identity
$\cos^2\theta + \sin^2\theta = 1$
that we pointed out in Eq (2.30) still holds, because the point $(\cos\theta, \sin\theta)$
is a point on the circle of radius 1, whatever quadrant it may be in.
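A few lines of Python can be used to check these statements numerically. The function name and the four sample angles are illustrative choices, one angle per quadrant.

```python
import math

def unit_circle_point(theta):
    """Return (cos theta, sin theta), the point on the unit circle at angle theta."""
    return math.cos(theta), math.sin(theta)

for degrees in (45, 135, 225, 315):   # one sample angle in each quadrant
    x, y = unit_circle_point(math.radians(degrees))
    signs = ("+" if x > 0 else "-", "+" if y > 0 else "-")
    # The last column is cos^2 + sin^2, which stays equal to 1 in every quadrant.
    print(degrees, signs, round(x * x + y * y, 12))
```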
Each moon of Jupiter is essentially an angular clock, and its phase angle
obeys $\theta = \omega t$ for some angular speed $\omega$. Thus $\theta$ is proportional to time, and
the $\theta$ axis in the graph in Fig. 4.7 could be considered a kind of time. Galileo's
observations of the displacement of the moon from Jupiter, if plotted versus
time, would look like that graph, but wouldn't stop after one cycle. In fact,
he went on taking data on the moons for years, in order to establish their $\omega$'s
as accurately as possible, to be able to read time from the clock. The moons
were fascinating in themselves, but he also had a practical motivation: he
saw the moons as the solution to the longitude problem! Here was a clock
that everyone could see and agree on! It could play the role that Greenwich
time was to play later. Comparing local time to Jupiter-moon-time would be
a determination of longitude. You wouldn't need to carry a clock with you,
because the clock was already there, wherever you went, visible in the sky.
Sadly, this idea never really worked in the way Galileo had hoped. He
perhaps underestimated the difficulty most people had even seeing the moons
of Jupiter, much less making an accurate determination of their positions.
And of course one can only make the measurement at night, in good weather,
during those hours when Jupiter is in a good position to view. It was not a
general solution to the longitude problem. But the thoroughly documented
orbits of the moons of Jupiter did lead to a wonderful discovery around
1675, long after the death of Galileo, one that would have delighted him.
The moons occasionally enter the shadow of Jupiter, and later re-emerge, in
Jovian lunar eclipses. These occur at precisely identifiable times, because the
moon rather suddenly disappears! In observing these the Danish astronomer
Ole Roemer noticed that the eclipses occur early when Jupiter appears far
from the Sun in the sky (in opposition), and they occur late when Jupiter
appears near the Sun in the sky (in conjunction). He realized that it wasn't
the moons speeding up and slowing down, to get to their eclipses early or
late. Rather, the time for the light to travel to inform us was different in the
two cases, taking longer when Jupiter is farther away. He had determined
that light travels with finite speed! From the amount of the time delay t and
the very roughly known size of the solar system D, he could even estimate the
speed of light c from D = ct. Much later still, good terrestrial determinations
of the speed of light c and the time delay t would determine the size of the
solar system D to high accuracy.
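The arithmetic of D = ct can be tried with round modern numbers. Both inputs below are assumptions for illustration, not Roemer's own data: the extra distance is taken to be the diameter of the Earth's orbit, about 3.0 x 10^11 m, and the accumulated delay about 1000 s.

```python
def speed_from_delay(distance_m, delay_s):
    """Speed of light estimated from D = c t."""
    return distance_m / delay_s

def distance_from_delay(speed_m_per_s, delay_s):
    """The same relation read the other way: D from a known c and a measured t."""
    return speed_m_per_s * delay_s

D = 3.0e11   # m, assumed: diameter of the Earth's orbit (about 2 astronomical units)
t = 1.0e3    # s, assumed: extra light travel time, roughly 17 minutes

c = speed_from_delay(D, t)
print(f"c is roughly {c:.1e} m/s")
```

With a poorly known D the estimate of c is correspondingly rough, which is why the relation was later used in the other direction, as the text describes.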
4.6 Period, Frequency and Amplitude
Fig 4.9 shows how the four moons discovered by Galileo behave in time. As
in the previous section these are displacements ahead of and behind Jupiter
as a function of time, and they are sine curves. The curves all start at a
hypothetical time t = 0 when all the moons have phase angle zero. Because
their $\omega$'s are different, they rapidly get out of phase, that is, their phase
angles are soon all different. In this context $\omega$ is called angular frequency
instead of angular speed. It is angular speed if you are looking at the circular
orbit face on, but it is angular frequency if you are looking from the side, as
we imagine here. The units are typically radians/sec, often denoted simply
$\mathrm{s}^{-1}$.
If we call the displacement y, the equation for such a curve is
$$ y = A\sin(\omega t) \tag{4.8} $$
The constant A is the amplitude, and it corresponds to the maximum
displacement in the graph. Thus it is the same as the radius of the moon's
orbit, as we saw in Fig 4.7. The smallest orbit has radius 1 here, since the
smallest amplitude is 1. (This just means we are measuring the other radii
in units of that one.) The largest amplitude is about 4.5, which means that
one of the moons is 4.5 times as far from Jupiter as the closest one. The sine
function itself oscillates between $-1$ and $1$, but when you multiply by some
other amplitude A, it oscillates between $-A$ and $A$.
The period is the time it takes the moon to complete one cycle. If we call
it T and measure the angle in radians, we must require $\omega T = 2\pi$, because
the circular function $\sin\theta$ just repeats its values again after $\theta$ reaches $2\pi$,
having completed a full circle. Therefore the period T is related to angular
frequency by
$$ T = \frac{2\pi}{\omega} \tag{4.9} $$
Note how the dimension works. The period has dimension [T], and angular
frequency has dimension $[T^{-1}]$.

Fig. 4.9: The apparent displacements of the moons of Jupiter away from the planet, as
functions of time, are sinusoidal curves. The amplitude of each curve is the radius of the
corresponding moon's orbit. The displacements are scaled so that the smallest orbit has
radius 1. It is interesting to notice that the moons which are farther from Jupiter take
longer to complete one cycle than the ones which are closer. (Horizontal axis: time in
days, 0 to 30. Vertical axis: apparent displacement from Jupiter, -5 to 5.)

Finally, the frequency is
$$ f = 1/T \tag{4.10} $$
with typical unit cycles/sec, now called Hertz (abbreviated Hz). The 1 in
the definition of f is really 1 cycle, and the period T is typically given in
seconds, the time for one cycle. It is always a possible source of confusion
that people may say "frequency" when they mean $\omega$, angular frequency.
These two frequencies differ by a factor $2\pi$, since $\omega = 2\pi f$. This is the same
factor $2\pi$ that occurs in the conversion factor $1 = 2\pi$ radians/cycle. It is
always a good idea to say "angular frequency" when you are talking about
$\omega$ if there is any possibility of confusion.
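The relations among T, f and $\omega$ are easy to mix up, so here is a small Python sketch of the conversions in Eqs (4.9) and (4.10). The function names and the 2 second period are illustrative assumptions.

```python
import math

def frequency(period):
    """f = 1/T, in cycles per second (Hz) when T is in seconds."""
    return 1.0 / period

def angular_frequency(period):
    """omega = 2*pi/T, in radians per second when T is in seconds."""
    return 2.0 * math.pi / period

T = 2.0                         # s, a hypothetical period
f = frequency(T)                # 0.5 Hz
omega = angular_frequency(T)    # pi radians per second

print(f, omega, omega / f)      # the last number is the factor 2*pi
```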
Finally, we can notice that for these moons the large amplitude curves
are also the long period curves, and the small amplitude curves have shorter
periods. This means that the inner moons get around the cycle faster, as
you would expect. In fact the moons' periods T and orbital radii a obey
Kepler's Third Law, $T^2 \propto a^3$! Galileo had all the data he needed to verify
this mysterious fact, but alas he never paid much attention to Kepler. He
apparently never read Kepler's books, and he never noticed the power law
relationship, which would have delighted him.
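The power law can be checked directly. The sketch below uses approximate modern values for the periods and orbital radii of the four moons; these numbers are assumptions brought in for illustration and are not taken from the text or from Galileo's records.

```python
# Approximate modern values (assumed): name, period T in days,
# orbital radius a in thousands of kilometers.
MOONS = [
    ("Io", 1.769, 421.7),
    ("Europa", 3.551, 671.0),
    ("Ganymede", 7.155, 1070.4),
    ("Callisto", 16.689, 1882.7),
]

def kepler_ratio(period, radius):
    """T^2 / a^3, which Kepler's Third Law says is the same for every moon."""
    return period ** 2 / radius ** 3

ratios = [kepler_ratio(T, a) for _, T, a in MOONS]
for (name, _, _), ratio in zip(MOONS, ratios):
    print(f"{name:9s} T^2/a^3 = {ratio:.4e}")

# Fractional spread of the ratio among the four moons.
spread = max(ratios) / min(ratios) - 1
print(f"spread: {spread:.2%}")
```

The ratio of the largest to the smallest radius in this table is about 4.5, in line with the amplitudes described for Fig 4.9.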
4.7 Velocity in Orbit, Projected
We continue with the topic of how circular motion looks when viewed from
the side, i.e., sinusoidal oscillation, now considering speed, or velocity. A
moon in a circular orbit of radius R with a period T goes around the entire
circumference of the orbit $2\pi R$ in time T, so its constant speed is
$$ v = \frac{2\pi R}{T} = R\omega \tag{4.11} $$
Notice how the units work: the dimensionless unit "radians" has disappeared!
R has dimension [L], $\omega$ has dimension $[T^{-1}]$, and v has dimension $[LT^{-1}]$,
with units perhaps $\mathrm{m\,s^{-1}}$. But the derivation requires that $\omega$ be in radians
per second.
The velocity $v = R\omega$ can be pictured as in Fig 4.10. The moon moves
along the circular orbit, and its velocity at any moment is tangent to the
circle. Looking from the side, however, we see only $v_y$, the projection of this
velocity on the y axis, which is
$$ v_y = R\omega\cos\theta \tag{4.12} $$
Thus the projected velocity oscillates. The projected velocity is first positive,
then negative, then positive again as the moon orbits.

Fig. 4.10: The projection on the y axis of the velocity $R\omega$ of an orbiting moon is $R\omega\cos\theta$.

The words "speed" and "velocity" are nearly interchangeable in everyday
use, but in physics "speed" is usually just a nonnegative number, while
"velocity" may be either positive or negative, as here, with the sign indicating
direction. More generally velocity may be a vector, an arrow, like that in
Fig 4.10, giving not just the speed, but the direction in space.
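Eq (4.12) can be checked numerically by comparing it with the rate at which the projected position $y = R\sin(\omega t)$ changes over a very short time interval. The values of R, $\omega$ and the sample times below are arbitrary assumptions for the sketch.

```python
import math

R = 1.0       # orbit radius (hypothetical units)
OMEGA = 2.0   # angular frequency in radians per unit time (hypothetical)

def y(t):
    """Projected position of the moon, seen from the side."""
    return R * math.sin(OMEGA * t)

def v_y(t):
    """Projected velocity according to Eq (4.12), with theta = omega * t."""
    return R * OMEGA * math.cos(OMEGA * t)

def slope_of_y(t, dt=1e-6):
    """Change in y divided by change in t, over a short interval centered on t."""
    return (y(t + dt) - y(t - dt)) / (2 * dt)

for t in (0.0, 0.3, 1.0, 2.0):
    print(t, round(v_y(t), 6), round(slope_of_y(t), 6))
```

The two columns agree, and they change sign as the moon passes its maximum displacement, which is the oscillation of the projected velocity described above.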
4.8 Pendulums
The kind of time dependence seen in Fig 4.9 is surprisingly common in Na-
ture, and not just because one might be looking at circular orbits from the
side. Many things oscillate in a sinusoidal way even if there is no circle in-
volved. The displacement of a pendulum as it swings left and right, like
the one in Fig 4.11, is essentially sinusoidal in time, with a definite angular
frequency $\omega$. This makes a pendulum a clock, and one reads the clock by
reading the phase, i.e., the completely fictitious angle it would have if the
back and forth motion of the pendulum bob were circular motion, but seen
from the side. This phase angle is not to be confused with the real angle
the pendulum makes with the vertical direction, which corresponds to
displacement, not phase!

Fig. 4.11: A pendulum of length L

Since we have always called the phase angle $\theta$, let us
emphasize that the displacement angle of the pendulum is a different thing
by giving it a different Greek letter, $\phi$. We could even let $\phi$ be the oscillating
variable and write
$$ \phi = \phi_0 \sin(\omega t) \tag{4.13} $$
For small swings of a pendulum, this is a pretty accurate description of how
the pendulum angle $\phi$ behaves in time. The constant $\phi_0$ is the amplitude,
the maximum angle the pendulum makes with the vertical. The pendulum
swings between $-\phi_0$ and $\phi_0$. The period T of the pendulum is the time for
it to swing through a complete cycle. One can measure time in units of T
by just counting the periods: the pendulum is a kind of clock. The angular
frequency, if you should need it, is $\omega = 2\pi/T$, by Eq (4.9).
A pendulum makes a very good clock, in fact. One can test this by
comparing pendulums with each other. They all keep quite consistent time,
even if they have different periods, at least if the amplitude of their swings
is small. That is, they are all described by Eq (4.13), with different $\omega$'s and
(small) $\phi_0$'s. Galileo was one of the first to notice this, and he used his own
pendulum clocks in experiments. It seems quite amazing that such a simple
clock had not been noticed and perfected long before! But in fact a scramble
for rights to this invention ensued. Galileo in his old age hoped his son
Vincentio might somehow get the rights to it, but the Dutch mathematician
and physicist Christian Huyghens is usually credited with doing the most to
put the pendulum clock into general use. Huyghens did considerable work
on the problem of making a pendulum that always had the same period
T, no matter whether it oscillated with a small amplitude $\phi_0$ or a large
one. The dependence of period on amplitude is slight, and Galileo, who is
usually a very good observer, apparently didn't even notice it, but in fact
as the amplitude gets larger, the period gets slightly longer. Thus, for good
consistent timekeeping, the amplitude of the swing should be small.
4.8.1 The Period of a Pendulum
One of the most remarkable things about a simple pendulum clock is that it
doesn't matter what you make it out of. A little weight on the end of a string
of length L makes a pendulum, but it is completely irrelevant what the little
weight is. The period T of the pendulum depends only on the length L. If you
make a pendulum that is 1 meter in length, for example, it will have a period
of almost exactly 2 seconds. Try it! This means you could be cast away on a
desert island (with a meter stick) and still construct a simple pendulum clock
that keeps known time. If you like your eggs cooked for exactly three minutes,
for example, you could do it. Of course you could also calibrate a pendulum
clock by comparing it with an astronomical clock. Early pendulum clocks
were calibrated by counting swings for an entire sidereal day, determined by
the successive transits of a star!
Longer pendulums have longer periods, in a very precise way. This rela-
tionship was noticed in the middle 17th century. It can be established with
no other clocks than the pendulums themselves. Just let two pendulums of
different lengths run at the same time, and count how many swings each
makes in some definite time, for example in the time that the shortest
pendulum makes 100 swings. Measure length in units of the shortest pendulum.
Then you might collect the following data:
L    # of swings
1    100
2    71
3    58
4    50
Notice that when the pendulum is made 4 times longer, the period becomes
2 times longer, so it only makes half as many swings. We could also
say the frequency is only 1/2 what it was. More generally, the relationship
is
$$ T \propto \sqrt{L} \tag{4.14} $$
That is, the period is proportional to the square root of the length. This is
a clear example of a mysterious proportionality in Nature, pure physics. We
try different ways of expressing it. It would be the same thing, for example,
to say
$$ \omega \propto L^{-1/2} \tag{4.15} $$
That is, the angular frequency is proportional to the reciprocal of the square
root of the length ($1/\sqrt{L}$). We could replace $\omega$ (angular frequency) by f
(frequency) in the statement above, because $\omega$ and f are proportional to
each other, and hence both are proportional to $L^{-1/2}$, although with different
constants of proportionality. We could also square both sides to get rid of
the square root and say
$$ \omega^2 \propto L^{-1} \tag{4.16} $$
These are all equivalent statements.
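The swing counts in the table follow from the proportionality of frequency to $L^{-1/2}$. A short Python sketch, with a hypothetical function name, reproduces them:

```python
import math

def swings(length, reference_swings=100):
    """Swings completed while the pendulum of length 1 makes reference_swings.

    Uses frequency proportional to length ** (-1/2), with length measured
    in units of the shortest pendulum.
    """
    return reference_swings / math.sqrt(length)

for L in (1, 2, 3, 4):
    print(L, round(swings(L)))   # prints 100, 71, 58, 50 for L = 1, 2, 3, 4
```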
Writing the last statement of proportionality as an equality introduces a
constant of proportionality g
$$ \omega^2 = \frac{g}{L} \tag{4.17} $$
or equivalently, for the period of a pendulum,
$$ T = 2\pi\sqrt{\frac{L}{g}} \tag{4.18} $$
Let us check dimensions in Eq (4.17). The left side has dimensions $[T^{-2}]$ and
L is a length. Therefore the dimensions of g, which we may denote [g], are
$[g] = [LT^{-2}]$. This looks a little bit like a speed, but speed has dimensions
$[LT^{-1}]$: notice the difference! The units of g identify it as an acceleration,
and in fact g is called the acceleration due to gravity. It occurs in physics
wherever the Earth's gravitational field plays a role. In this case, for example,
it is obvious that the reason the pendulum swings at all is that gravity pulls
it down. An amusing limiting case of Eq (4.18) is the case $g \to 0$. Then
$T \to \infty$. What is this telling us?
From the experimental observation that L = 1 meter gives $T \approx 2$ seconds,
we find
$$ g \approx 10\ \mathrm{m/s^2} \tag{4.19} $$
A more precise measurement of T gives the better experimental
determination $g = 9.8\ \mathrm{m/s^2}$ to two significant digits. You will seldom see g quoted to
higher accuracy than two digits, because, although g is a constant at any
given location, it actually varies over the surface of the Earth. This
variation affects the third digit. Thus gravity is not quite constant as you travel,
although its variation is subtle. Gravity is less at high altitude, but even at
sea level it varies with latitude.
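Solving Eq (4.18) for g gives $g = 4\pi^2 L/T^2$, and the numbers quoted above can be checked in Python. The function names are illustrative assumptions.

```python
import math

def g_from_pendulum(length_m, period_s):
    """Solve T = 2*pi*sqrt(L/g) for g."""
    return 4.0 * math.pi ** 2 * length_m / period_s ** 2

def period(length_m, g=9.8):
    """Period of a simple pendulum from Eq (4.18), for small swings."""
    return 2.0 * math.pi * math.sqrt(length_m / g)

# L = 1 m with T = 2 s gives g = pi^2, a little under 10 m/s^2.
print(round(g_from_pendulum(1.0, 2.0), 2))

# Conversely, with g = 9.8 m/s^2 a 1 m pendulum has a period very close to 2 s.
print(round(period(1.0), 3))
```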
The variation of g was discovered in just the way you might expect. As
soon as Huyghens and others had managed to build good pendulum clocks,
they were optimistically taken on board ships as possible chronometers to
determine longitude. There were even some early successes: in 1664 a certain
Captain Holmes, carrying one of the first of these clocks, and running short
of water, correctly determined that he was closer to the Cape Verde Islands
off Africa than he was to the Caribbean, and turned East rather than West.
But it was soon clear that these clocks could not give correct longitude, and
one reason was that they didn't run at a constant rate. The period T of a
pendulum is given by Eq (4.18), but g depends on where you are.
Several expeditions investigated this effect, and it turns out that g
depends on latitude. A typical result, quoted here from Newton's Principia
Mathematica, was that of Edmund Halley (of Halley's Comet), in 1677, who
found that a pendulum clock that had a period of exactly 2 seconds in Lon-
don, by comparison with the sidereal clock, had to be readjusted at the island
of St. Helena. The pendulum had to be made shorter by 1/8 inch to keep
the same period, a larger adjustment even than the clockmaker had allowed
for. That can only mean that g at St. Helena is considerably smaller than
g in London, smaller, in fact, by the same factor as L is smaller, since g/L
is kept constant. We will look at the details of this in the next section, with
quick approximate methods for handling such issues.
By the time Newton was writing about this, in the 1680s, he understood
exactly what was going on, which is more than we can fully explain here, but
we can outline it. The main reason that g is observed to be smaller at more
equatorial latitudes is the rotation of the Earth. This effect alone, though,
which Newton computed, is not enough to explain the size of the observed
effect. The Earth must also be oblate, bulging a bit at the Equator, which
also has the effect of lowering g there. In fact, one expects the Earth to bulge
like that if it is rotating, and by just the amount that the data required!
Finally, Newton could not rule out that the pendulums had been warmer in
the tropics than in London, so that L had become larger because of thermal
expansion, and therefore had to be shortened, another contribution in the
same sense as what is observed. He estimated this effect quantitatively and
showed that it was small compared to the other effects, but not negligible,
and he kept this small correction, which made the agreement with theory
almost exact.
Thus pendulum measurements, combined with Newton's new theory of
motion, had demonstrated conclusively that the Earth rotates (still
controversial at the time), and had revealed a subtle departure of the Earth's shape
from a sphere. The pendulum, despite its simplicity, is a very delicate and
precise instrument. You can measure its period with high precision if you are
just willing to take enough time and count enough swings, and the period
tells you about the interesting quantity g.
4.9 The Binomial Approximation for Perturbations
A small change in g causes a small change in T, the period of a pendulum.
Physicists have quick and easy methods for estimating such effects. Any
physicist would look at Eq (4.18) and realize that if g were 1% smaller, then
T would be about 0.5% larger, or more generally, whatever the fractional
change in g, the fractional change in T would be about half of that, and in
the other direction. How does one see this?
It all goes back to the binomial theorem, which gives a formula for ex-
panding the nth power of a binomial. We only need it in a very special case,
in which the binomial is (1 + x), and |x| << 1 is small. In this case, which
we could call a "small x approximation,"
$$ (1 + x)^n \approx 1 + nx \qquad (|x| \ll 1) \tag{4.20} $$
It is as if the exponent n slid down and multiplied x. We will call this
the binomial approximation. This is an extremely useful approximation, just
like the small angle approximation, and should be memorized! Fortunately
its form makes it easy to remember.
As a numerical example, $(1.01)^2 \approx 1 + 2(0.01) = 1.02$. The exact value is
1.0201, different by only one part in $10^4$. The approximation is good because
x here is 0.01, which is much less than 1. The binomial approximation
works just as well if n is not an integer. For example
$\sqrt{1.01} = (1.01)^{0.5} \approx 1 + (0.5)(0.01) = 1.005$.
The exact value is 1.004987..., differing by less than
2 parts in $10^5$. For x as small as 0.01 you can use the binomial approximation
with confidence.
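These numerical examples are easy to reproduce. The Python sketch below (the function name is an assumption) compares the exact power with the binomial approximation for x = 0.01 and a few exponents, including the negative exponent that is needed for the pendulum.

```python
def binomial_approx(x, n):
    """Small-x approximation to (1 + x) ** n from Eq (4.20)."""
    return 1.0 + n * x

x = 0.01
for n in (2, 0.5, -0.5):
    exact = (1.0 + x) ** n
    approx = binomial_approx(x, n)
    print(f"n = {n:>4}: exact = {exact:.6f}, approx = {approx:.6f}, "
          f"difference = {abs(exact - approx):.1e}")
```

Making x ten times larger makes the error roughly a hundred times larger, so the approximation should only be trusted when |x| is much less than 1.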
How do we use the binomial approximation to see how g affects T for a
pendulum? We write Eq (4.18) as
$$ T = 2\pi L^{1/2} g^{-1/2} \tag{4.21} $$
Now if g changes to $g'$ (i.e., some other value), then T changes to $T'$, where
$$ T' = 2\pi L^{1/2} (g')^{-1/2} \tag{4.22} $$
We introduce the notation $g' = g + \Delta g$, where $\Delta g$ ("delta g") is the little
change in g, equal to the difference $g' - g$. The symbol $\Delta$ here means "change
in," a kind of peculiar concept. It does not mean we are multiplying g by
something! We know from observations that g does change a little bit when
you change latitude, so $\Delta g$ is just a name for this change. It is one single
quantity, not a product! Depending on how g changes, $\Delta g$ could be either
positive or negative. Similarly we shall write $T' = T + \Delta T$, where $\Delta T$ is the
change in the period. Then Eq (4.22) becomes
$$ T + \Delta T = 2\pi L^{1/2} (g + \Delta g)^{-1/2} = 2\pi L^{1/2} g^{-1/2} \left(1 + \frac{\Delta g}{g}\right)^{-1/2} \tag{4.23} $$
$$ \approx T\left(1 - \frac{1}{2}\frac{\Delta g}{g}\right) \tag{4.24} $$
We used the binomial approximation and Eq (4.21) in the last step. Finally,
by algebra, this reduces to
$$ \frac{\Delta T}{T} \approx -\frac{1}{2}\frac{\Delta g}{g} \tag{4.25} $$
which is just what we said at the beginning. If $\Delta g/g = -0.01$, for example,
corresponding to g becoming 1% less, then $\Delta T/T \approx 0.005$, corresponding to
T becoming 0.5% more.
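The 1% example can be checked against the exact pendulum formula. In this Python sketch the length of 1 m and the starting value g = 9.8 m/s^2 are assumptions; any values would give the same fractional change.

```python
import math

def period(length_m, g):
    """Period of a simple pendulum, Eq (4.18)."""
    return 2.0 * math.pi * math.sqrt(length_m / g)

L = 1.0                  # m, assumed
g = 9.8                  # m/s^2, assumed
dg_over_g = -0.01        # g becomes 1% smaller

T_old = period(L, g)
T_new = period(L, g * (1.0 + dg_over_g))

exact_change = (T_new - T_old) / T_old   # fractional change from the full formula
approx_change = -0.5 * dg_over_g         # Eq (4.25)

print(f"exact:  {exact_change:.6f}")
print(f"approx: {approx_change:.6f}")
```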
Exactly the same argument, just changing the letters, shows that if
$y = Ax^n$, where A is a constant, then a change $\Delta x$ in x implies a change $\Delta y$ in
y given by
$$ \frac{\Delta y}{y} \approx n\frac{\Delta x}{x} \tag{4.26} $$
This is the most useful form of the binomial approximation. In the case
above, n was $-1/2$. We can use this idea to see how changing the length L
of a pendulum by a small amount $\Delta L$ affects the period T:
$$ \frac{\Delta T}{T} \approx \frac{1}{2}\frac{\Delta L}{L} \tag{4.27} $$
because T is proportional to the 1/2 power of L.
In the voyages described by Newton, both quantities changed, because
the ships went to locations where g was different, and then the pendulum length L
was adjusted to keep T the same. The cumulative effect of the two changes
is
$$ \frac{\Delta T}{T} \approx -\frac{1}{2}\frac{\Delta g}{g} + \frac{1}{2}\frac{\Delta L}{L} \tag{4.28} $$
and if $\Delta T = 0$, because the adjustment in L was made to cancel the change
in g, we find
$$ \frac{\Delta L}{L} = \frac{\Delta g}{g} \tag{4.29} $$
The argument above gives this only approximately, but it is actually exactly
true, because it is really g/L that is being kept constant by way of this
adjustment, so that L must be shortened by exactly the same factor that g is
made smaller. If g is smaller by 1%, through being multiplied by 0.99, then
L is also.
Halley's expedition, to take one typical example, found that L had to be
shortened by 1/8 inch in a pendulum that had a 2 second period (so $L \approx 3$
feet long). Thus we find $\Delta L/L \approx 1/300$ in magnitude. The typical change in g is therefore
about one part in 300, or a few tenths of a percent. Why does g change like
this?
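The one part in 300 can be checked with a slightly more careful length. This Python sketch assumes g = 9.8 m/s^2 for the London value and uses the standard conversion 1 inch = 0.0254 m; the function name is hypothetical.

```python
import math

INCH = 0.0254   # meters per inch

def pendulum_length(period_s, g=9.8):
    """Length of a simple pendulum with the given period, from Eq (4.18)."""
    return g * period_s ** 2 / (4.0 * math.pi ** 2)

L = pendulum_length(2.0)       # a little under 1 m, i.e. a bit more than 3 feet
shortening = (1.0 / 8.0) * INCH

fraction = shortening / L      # size of Delta L / L, and hence of Delta g / g
print(f"L = {L / INCH:.1f} inches")
print(f"one part in {1.0 / fraction:.0f}")
```

Using 3 feet exactly for L, as in the rough estimate above, gives one part in 288, so either way the change is about one part in 300.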
4.10 Pendulums and the Rotation of the Earth
Whatever the cause of $\Delta g$, it must lead to something with the right
dimension, namely $[LT^{-2}]$, an acceleration. What is there about the Earth that
has dimensions of [T], or [L]? Well, the Earth is a sphere, so the only length
that characterizes it is its radius R. (You might object, why not its diameter,
2R? That would be fine too. We are just estimating magnitudes, by looking
for things with the right dimensions. Pure numbers like 2 cannot be ruled in
or ruled out by these arguments.) For something with the dimension [T] we
have its daily period, the day, and also its annual period, the year. We might
also think of the lunar period, the month, as possibly having something to
do with $\Delta g$. Equivalently we could consider the angular speed $\omega$ in each case,
having dimension $[T^{-1}]$.
Now to make $\Delta g$ with its dimension $[LT^{-2}]$ from these quantities, we can
only use the combination
$$ \Delta g = C\omega^2 R \tag{4.30} $$
where C is a pure number. Typically the pure numbers that arise in
theoretical physics are around 1. Nature seems not to use pure numbers that
are very large or very small, on the whole. That is a slightly mysterious
fact. Nature seems to like numbers like $\pi$ and 2, that are "of order 1," as
physicists say, like the numbers that come from geometry as typical ratios.
So we assume that C is not as big as 100 or as small as 0.01, but has the
order of magnitude 1.
Let us now estimate $\delta g$ from the expression above. Since for the Earth
$R \approx 6 \times 10^6$ m, and the angular rotation speed is
$\omega \approx 2\pi/86400 \approx 7 \times 10^{-5}$ s$^{-1}$, we find
$\omega^2 R \approx 3 \times 10^{-2}$ m/s$^2$ as an estimate for the typical size of $\delta g$
we should expect if the rotation of the Earth has something to do with it.
Since $g \approx 10$ m/s$^2$, this is a change of about 0.3%, just what was found by
Halley's expedition! Even without a detailed theory of how the rotation of
the Earth affects pendulums, we can see that there probably exists such an
effect. Dimensional analysis alone has told us.
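The arithmetic of this estimate can be checked in a few lines. The sketch below assumes C = 1 in Eq (4.30) and uses the rounded values from the text; the variable names are illustrative only.

```python
import math

R_EARTH_M = 6e6      # Earth's radius in meters, rounded as in the text
DAY_S = 86400        # one rotation period in seconds
G_SURFACE = 10.0     # m/s^2, rounded as in the text

omega = 2 * math.pi / DAY_S          # angular speed of rotation, s^-1
delta_g = omega**2 * R_EARTH_M       # Eq (4.30) with C = 1 (an assumption)
fraction = delta_g / G_SURFACE       # relative size of the effect

print(f"omega     = {omega:.1e} s^-1")
print(f"omega^2 R = {delta_g:.1e} m/s^2")
print(f"delta_g/g = {fraction:.2%}")
```

The result is a fraction of a percent, the same order as the 1/300 found from the pendulum data.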
Let us also estimate, in the same way, how much the Earth bulges at
the Equator because of its rotation. This would be a change in radius $\delta R$
of the equatorial circle. $\delta R$ should grow with $\omega$, because if $\omega$ were
zero, the bulge would also be zero. Now $\omega$ has dimension $[T^{-1}]$, and to get a
length [L], we have to cancel the time dimension. The simplest combination
using the quantities we have been talking about that grows with $\omega$ and has
dimension [L] is
$$\delta R = C \left( \frac{\omega^2 R}{g} \right) R \tag{4.31}$$
This expression has common sense features which make it plausible. If gravity
were stronger, g would be bigger, and since g is in the denominator, the bulge
would be less. That makes sense. Stronger gravity would hold things in place
better. Also, because of R in the numerator, the bulge would be bigger if
the Earth were bigger, but that is just part of scaling things up. Since we
have already evaluated the expression in parentheses, and found that it is
about 0.003, we can see that the bulge will be of the order of a few tenths of
a percent of the radius R, which would be hard to see, as a visible bulging
of the Earth, but still amounts to $0.003 \times 6 \times 10^6$ m $\approx 20$ km, which is more
than the rise of the highest mountains.
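As a rough check of the bulge estimate, the sketch below evaluates Eq (4.31) with C = 1 (an assumption) and the ratio 0.003 quoted in the text. The height of Mount Everest is not from the text and is included only for comparison.

```python
R_EARTH_M = 6e6          # Earth's radius in meters, rounded as in the text
RATIO = 0.003            # omega^2 R / g, the value quoted in the text
C = 1.0                  # pure number assumed to be of order 1
EVEREST_M = 8849         # height above sea level, added here for comparison

bulge_m = C * RATIO * R_EARTH_M      # Eq (4.31)

print(f"estimated bulge: {bulge_m / 1000:.0f} km")
print(f"bulge / Everest: {bulge_m / EVEREST_M:.1f}")
```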
How much would g change between the bottom of a mountain and the
top? Here we would not expect the rotation of the Earth to have anything
to do with the eect. Presumably gravity would be weaker at the top of
the mountain whether the Earth were rotating or not. The eect should be
roughly proportional to the height of the mountain H, which is a length.
To get something with the dimensions of g, we must use g itself, the only
quantity left with the dimension [T] in it:
g = C
H
R
g (4.32)
For a 6 km mountain, which is high, but realistic, $H/R \approx 10^{-3}$. Thus
if g = 9.80 m/s$^2$ at the bottom of the mountain, we shouldn't be surprised
if it is only 9.79 m/s$^2$ at the top, a change of about a part in $10^3$. This is
not a real theory, of course, because we don't know how to compute $\delta g$ from
anything more fundamental, but it is an estimate of how big an effect there
could be, using dimensional analysis. It turns out that there is an altitude
effect of roughly this size. (Records set at the Mexico City Olympics are
given a special notation in some accounts, because of the high altitude and
the slightly smaller g. Presumably if g is less, you could jump higher, farther,
etc. How much higher? How much farther?)
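The altitude estimate of Eq (4.32) can be written as a small function. This is a sketch with C = 1 assumed; the function name `g_at_height` is introduced here only for illustration.

```python
def g_at_height(g_bottom, height_m, radius_m=6e6, c=1.0):
    """Estimate g at a given height from Eq (4.32), taking delta_g = c*(H/R)*g."""
    delta_g = c * (height_m / radius_m) * g_bottom
    return g_bottom - delta_g   # gravity is assumed weaker at the top

g_top = g_at_height(9.80, 6000.0)   # the 6 km mountain of the text
print(f"g at the top: {g_top:.2f} m/s^2")
```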
Perhaps more surprisingly, g varies in an unpredictable way if we can
measure it to more than three decimal places. Geologists use gravimeters,
sensitive instruments for measuring g, to see these local variations in g. Local
dense deposits in the Earth can show up as local regions of larger g, and
similarly, regions of lower density can show up as local regions of smaller
g. Commercial gravimeters measure g using a unit which is about 1 part
per million, so the variations that actually are seen are typically 10 to 100
parts per million. A gravimeter is like a crude eye, looking into the Earth,
but without any ability to focus, able to report only that there is something
interesting nearby or there isn't.
4.11 Simple Harmonic Oscillators
Any quantity x that oscillates in a sinusoidal way is called a simple harmonic
oscillator. The word "harmonic" indicates that it has a well defined angular
frequency $\omega$, so that the time dependence of x is
$$x = A\sin(\omega t) \tag{4.33}$$
for some constant amplitude A and angular frequency $\omega$. There is one slight
generalization we should also consider. In Eq (4.33) we notice that the phase
$\theta = \omega t$ is 0 at t = 0, but in general the zero of time could correspond to any
phase in the oscillation. Thus the most general simple harmonic oscillation
is
$$x = A\sin(\omega t + \phi) \tag{4.34}$$
The extra phase angle $\phi$ is called the phase shift. If $\phi = \pi/2$, for example,
then the oscillation starts at its maximum value A at time t = 0, instead
of starting at 0. For this particular phase shift $\pi/2$, the resulting oscillation
could be written more simply as $A\cos(\omega t)$, i.e.,
$$A\sin(\omega t + \pi/2) = A\cos(\omega t) \tag{4.35}$$
The cosine function is just the sine function with a phase shift $\pi/2$. If we
graph both the sine and the cosine on the same axes, as in Fig 4.12, we see
that the cosine is literally the sine shifted by $\pi/2$. Another special phase
shift is $\pi$. The effect of a phase shift $\pi$ is
$$A\sin(\omega t + \pi) = -A\sin(\omega t) \tag{4.36}$$
Perhaps you can see in Fig 4.12 that shifting the sine curve by twice $\pi/2$,
which is the same as shifting the cosine curve by $\pi/2$, gives the sine curve
upside-down. This phase shift of $\pi$ turns out to be a very important idea
later on when we look at how waves interfere.
Fig. 4.12: The sine and cosine curves are the same except for a phase shift of $\pi/2$. The
formula is $\sin(\theta + \pi/2) = \cos(\theta)$. Adding $\pi/2$ to the argument of the sine shifts the sine
curve (solid) to the left by $\pi/2$, as indicated by the arrow, where it coincides with the
cosine curve (dashed). (Horizontal axis: angle, from 0 to 10; vertical axis: from $-1$ to 1.)
The velocity of a simple harmonic oscillator is
$$v = \omega A\cos(\omega t + \phi) \tag{4.37}$$
by the argument of Section 4.7. Thus the oscillating velocity is $\pi/2$ out of
phase with the oscillating position. The velocity is maximal (either positive
or negative) when the position of the oscillator is zero, and the velocity is zero
when the position of the oscillator is maximal (either positive or negative).
This last statement just says that the oscillator stops at the extreme limit of
each swing: perhaps that is obvious.
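The phase relations in Eqs (4.34) to (4.37) are easy to confirm numerically. The sketch below uses arbitrary illustrative values for the amplitude, angular frequency, and sample time; none of them come from the text.

```python
import math

def position(t, A=1.0, omega=2.0, phi=0.0):
    """Position of a simple harmonic oscillator, Eq (4.34)."""
    return A * math.sin(omega * t + phi)

def velocity(t, A=1.0, omega=2.0, phi=0.0):
    """Velocity of the same oscillator, Eq (4.37)."""
    return omega * A * math.cos(omega * t + phi)

t = 0.37                                   # arbitrary sample time
shifted = position(t, phi=math.pi / 2)     # sine with phase shift pi/2
cosine = math.cos(2.0 * t)                 # A cos(omega t) with A = 1, omega = 2
flipped = position(t, phi=math.pi)         # sine with phase shift pi

t_extreme = math.pi / 4                    # omega * t = pi/2: top of the swing
print(shifted - cosine)                    # difference is zero up to rounding
print(flipped + position(t))               # sum is zero up to rounding
print(position(t_extreme), velocity(t_extreme))
```

At the extreme of the swing the position equals the amplitude while the velocity vanishes (up to rounding), as the paragraph above describes.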
The world is full of real oscillators, meaning things with a sinusoidal time
dependence. All of them are potentially clocks. Old-fashioned watches have
a little wheel that oscillates back and forth, driven by a coiled watch spring.
Modern quartz watches have a crystal that oscillates at a very high frequency,
so fast, and with such a small amplitude, that you couldn't see it with your
eye. And for many purposes, atoms behave like oscillators, in the sense of
having built-in definite frequencies.
Physicists have a kind of metaphor for all simple harmonic oscillators: the
mass-on-a-spring, shown in Fig 4.13. The idea is that if the mass moves left,
the spring is compressed, and if the mass moves right, the spring is stretched.
Either way, the spring forces the mass back the other direction. The spring
is said to exert a restoring force on the mass, tending to restore it to that
special place where the spring is neither compressed nor stretched, its relaxed
position. The mass always overshoots, however, first on one side, then on
the other, and hence it oscillates. An analysis of the mass on a spring using
Newton's laws of motion predicts that the angular frequency will be
$$\omega = \sqrt{\frac{k}{m}} \tag{4.38}$$
where m is the mass and k is a constant representing the stiffness of the
spring. The period of the oscillator is then
$$T = 2\pi\sqrt{\frac{m}{k}} \tag{4.39}$$
Fig. 4.13: A mass m on a spring with spring constant k.
Even if the oscillator is more complicated than this, or even looks nothing at
all like this, physicists keep this simple picture in mind, because, in a sense,
all oscillators are alike, abstractly, so one might as well have one standard
picture of an oscillator. We have not talked about either force or mass, so
this is just a look ahead, but everyone has an intuitive idea of what is meant
by mass. It is worth noticing in these expressions that if m gets larger for
fixed k, then $\omega$ goes down and T goes up: the more massive oscillator is more
sluggish. In fact $T \propto \sqrt{m}$ for fixed k. Also, if k gets larger, i.e., the spring
gets "springier," then $\omega$ goes up: springier means faster oscillation,
$\omega \propto \sqrt{k}$ for fixed m.
Let us see how this works in the case of the one oscillator we know, the
pendulum. The pendulum bob must have some mass m, and it is on a string
of length L. The frequency $\omega$ is such that
$$\omega^2 = \frac{g}{L} = \frac{k}{m} \tag{4.40}$$
On the right hand side we are insisting that we want to regard the pendulum
as a mass m on a spring k, even though there is no actual spring. We see
that we can do it if we just understand the spring constant to be
$$k = \frac{mg}{L} \tag{4.41}$$
In this case it is potentially confusing to regard the pendulum as a mass on
a spring, because what we find is that the spring constant k is proportional
to m. In the ratio k/m = g/L, the mass m cancels, so that the period of
a pendulum is actually independent of m. Unless you remember that when
you change m in a pendulum you also are changing k, you might erroneously
expect the pendulum to run slower if it is more massive: not true! For this
oscillator there is no way to change m without also changing k in a way that
exactly compensates. Fig 4.13 suggests that the spring and the mass are
independent things, but for the pendulum the mass m, through its weight,
is the spring, somehow. It is a characteristic property of gravity to pull
harder on larger masses, just compensating for their greater sluggishness (or
inertia, to use the technical word). We take up the topic of mass and weight
in Chapter 5.
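The cancellation of m can be seen numerically by feeding the effective spring constant of Eq (4.41) into the period formula of Eq (4.39). In this sketch the masses, the length, and the value g = 9.8 m/s^2 are illustrative assumptions.

```python
import math

def spring_period(m, k):
    """Period of a mass m on a spring of stiffness k, Eq (4.39)."""
    return 2 * math.pi * math.sqrt(m / k)

def pendulum_k(m, L, g=9.8):
    """Effective spring constant of a pendulum, Eq (4.41)."""
    return m * g / L

L = 1.0  # string length in meters (illustrative)
periods = [spring_period(m, pendulum_k(m, L)) for m in (0.1, 1.0, 10.0)]
print(periods)  # the same period for every mass

# For a real spring with fixed k, quadrupling the mass doubles the period.
print(spring_period(4.0, 1.0) / spring_period(1.0, 1.0))
```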
4.12 Exponential Decay
So far we have described oscillation as if it kept the same amplitude A forever,
and never ran down. In fact, though, most oscillators, like pendulums, do
run down. A good model for this is to assume that on each successive swing
its amplitude is some fixed fraction r of the previous amplitude. Thus r is
the ratio of successive amplitudes, assumed constant (and less than one). If
the oscillator starts from rest with displacement $A_0$, then on the next swing
it will only reach the displacement $A_1 = rA_0$. On the second swing it reaches
displacement $A_2 = rA_1 = r^2 A_0$, and so forth. In general the nth swing will
reach displacement
$$A_n = r^n A_0 \tag{4.42}$$
This is a discrete example of exponential decay. It is called exponential
because the variable n, which is the number of oscillations, and therefore
proportional to time, occurs in the exponent. Whether a given oscillator
really behaves like this as it runs down is, of course, a question for experiment
to decide, but it is always a reasonable guess.
If you did this experiment on a real oscillator, and noted $A_n$ for n =
0, 1, 2, ..., how could you tell if it was exponential decay? The naive method
is to look at successive ratios, like $A_1/A_0$, $A_2/A_1$, etc. to see if they are all
the same r. Because of the unavoidable little inaccuracies of the measurement
process, though, the ratios will certainly not all be the same, even if this is
exponential decay. A much better method is to re-express Eq (4.42) by taking
a logarithm:
$$\log(A_n) = n\log(r) + \log(A_0) \tag{4.43}$$
This says that in exponential decay, $\log(A_n)$ is a linear function of n. Thus
we just plot $\log(A_n)$ vs. n and see if it is a straight line. If it is, then the
slope is log(r), and thus we determine r (even if the data scatter a bit about
the line, so that individual ratios are not constant).
This method illustrates a very useful idea in data analysis generally. If
you suspect some relationship, like Eq (4.42) above, an excellent test is to
plot data in such a way that they will lie along a straight line if the suspicion
is correct. The reason is that you can always recognize a straight line! Other
curves are much less obvious to identify. We have seen this idea before, in
Sections 2.3 and 2.9. In the case of the exponential relationship in Eq (4.42),
the logarithm produces the linear relationship with n in Eq (4.43), bringing
n down out of the exponential. The idea is very much like that in Section
2.9, where we had a similar test for a power law, using a log-log plot. The
test for an exponential decay is sometimes called a "semilog plot," because we
only take the logarithm of $A_n$, but not n.
We illustrate the method below, using the fake data in the table.
n    A_n     ln(A_n)
0    2.092    0.738
1    1.674    0.515
2    1.298    0.261
3    1.065    0.063
4    0.913   -0.091
5    0.747   -0.292
The data in the semilog plot in Fig 4.14 lie approximately on the straight
line
$$\ln(A_n) \approx 0.711 - 0.205\,n \tag{4.44}$$
We chose the natural logarithm here. Any other choice of base for the
logarithm would just change the scale on the vertical axis, leaving the picture
the same. The inverse of the natural logarithm is the exponential function
(base e). Applying this function to both sides we find $A_n$ itself in a standard
exponential form,
$$A_n \approx e^{0.711 - 0.205n} = e^{0.711}\left(e^{-0.205}\right)^n \tag{4.45}$$
Fig. 4.14: Plot of $\ln(A_n)$ vs n from the accompanying table. (Horizontal axis: n, from 0
to 5; vertical axis: $\ln A_n$, from $-0.5$ to 1.0.)
In the terms that we began with,
$$A_0 \approx e^{0.711} \approx 2.04 \tag{4.46}$$
$$r \approx e^{-0.205} \approx 0.815 \tag{4.47}$$
That is, the oscillator starts with an amplitude that is about 2, and each
successive swing is only about 81% the displacement of the previous one.
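The straight line in Eq (4.44) can be reproduced from the table with an ordinary least-squares fit. The text does not say how its line was obtained, so least squares is an assumption here; the sketch uses only the standard library.

```python
import math

amplitudes = [2.092, 1.674, 1.298, 1.065, 0.913, 0.747]  # A_n from the table
ns = list(range(len(amplitudes)))                        # n = 0, 1, ..., 5
logs = [math.log(a) for a in amplitudes]                 # ln(A_n)

# Ordinary least-squares straight line through the points (n, ln A_n).
n_mean = sum(ns) / len(ns)
y_mean = sum(logs) / len(logs)
slope = (sum((n - n_mean) * (y - y_mean) for n, y in zip(ns, logs))
         / sum((n - n_mean) ** 2 for n in ns))
intercept = y_mean - slope * n_mean

A0 = math.exp(intercept)   # starting amplitude, Eq (4.46)
r = math.exp(slope)        # ratio of successive amplitudes, Eq (4.47)

print(f"ln(A_n) = {intercept:.3f} + ({slope:.3f}) n")
print(f"A_0 = {A0:.2f}, r = {r:.3f}")
```

The fitted slope and intercept agree with Eq (4.44) to three decimal places, even though the individual ratios $A_{n+1}/A_n$ in the table scatter between about 0.78 and 0.86.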
If we think of steady oscillation as being circular motion seen from the
side, then decaying oscillation can be thought of as spiral motion seen from
the side. The exponential decay of the oscillation is what turns the circle
into a spiral, as the amplitude shrinks in time. The equation of such an
exponentially decaying oscillation, looking in from the side, is
$$y = Ae^{-\gamma t}\sin\omega t \tag{4.48}$$
as shown in Fig 4.15. The radius A of the circle, which had been constant
before, is now exponentially decaying, and is replaced by $Ae^{-\gamma t}$, where $\gamma$ is
a constant, called the decay rate. If the decay rate happens to be 0, then the
exponential factor is $e^0 = 1$ and we are back to the previous case, of simple
harmonic motion. But if $\gamma > 0$, then when the time increases by one period
T, the amplitude is multiplied by $r = e^{-\gamma T}$. This is essentially the case we
thought about earlier, where on each swing the amplitude is cut down by
some factor r.
Fig. 4.15: A spiral in toward the origin, as in graph (a), looks like an oscillation decaying
in time, as in graph (b), when observed from the side. The decaying oscillation in (b)
has the equation $y = Ae^{-\gamma t}\sin\omega t$, where $\omega$ is the angular frequency, and $\gamma$ is the decay
rate. The exponentially decaying amplitude $Ae^{-\gamma t}$ is indicated with dashed lines. Compare
Fig 4.7. (Panel (a): y vs x; panel (b): y vs t.)
The decay constant $\gamma$ has a simple interpretation. It has dimension $[T^{-1}]$,
like $\omega$, since $\gamma t$, like $\omega t$, must be dimensionless. Thus
$$\tau = \frac{1}{\gamma} \tag{4.49}$$
is a time, called the decay time. It is the time it takes for the amplitude to
fall to $1/e \approx 0.368$ of its initial value, that is, for the oscillation to die down
very noticeably. Notice that it is not the time for the oscillation to stop. In
fact, the oscillation never stops in this model, it just gets smaller and smaller,
and the time for it to get very noticeably smaller is about $\tau$.
Since there are now two times describing the oscillator, T the period, and
$\tau$ the decay time, there is a natural dimensionless ratio
$$Q = \frac{\tau}{T} \tag{4.50}$$
called the quality of the oscillator. (Engineers just refer to the Q of the
oscillator.) It is the decay time measured in periods, which is simply the
number of oscillations the oscillator makes before it has essentially run down.
The oscillator in Fig 4.15 looks as if it has a Q of about 3. This is a very low
quality oscillator! It hardly oscillates at all before it has essentially quit. A
church bell, on the other hand, may ring for many seconds, at a frequency
of hundreds of Hz, so it could easily have a Q of 1000 or more. When the
Apollo 12 mission left the Moon in November, 1969, it deliberately crashed
the landing module Intrepid into the lunar surface to generate oscillations
to be detected by seismometers left behind. The results were amazing and
puzzling: the Moon rang like a bell! Its Q was estimated at 4000-6000.
Earthquakes generate oscillations of the Earth, but Q down here is more like
200. Think about that next time you feel a tremor.
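The relations among the decay time, the period, Q, and the per-swing ratio r can be collected in a short sketch. The bell's frequency and decay time below are hypothetical values chosen only to match the phrases "hundreds of Hz" and "many seconds"; they are not measurements.

```python
import math

def quality(tau, period):
    """Q of an oscillator: decay time measured in periods, Eq (4.50)."""
    return tau / period

def ratio_per_swing(q):
    """Amplitude ratio r = exp(-gamma*T) = exp(-T/tau) = exp(-1/Q)."""
    return math.exp(-1.0 / q)

# Hypothetical church bell: 400 Hz tone that decays with tau = 3 s.
bell_q = quality(tau=3.0, period=1.0 / 400.0)
print(f"bell Q ~ {bell_q:.0f}")

# A low-quality oscillator with Q = 3 loses a lot on every swing.
r_low = ratio_per_swing(3.0)
print(f"Q = 3 gives r = {r_low:.3f}")
```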
Exponential decay can also be used as a clock. It isn't as convenient as
an oscillation that just ticks along forever, but if you know the decay time
$\tau$, and if you can monitor the decay, then you have a measure of time. If for
example, you wait long enough for the amplitude to fall to 1/e of its initial
value, then you must have waited for one decay time $\tau$. If you wait for it to
fall by another factor 1/e, so that it is now less than the initial amplitude
by the factor $1/e^2$, then you must have waited $2\tau$. In that sense, exponential
decay is a clock: the logarithm of the amplitude is a linear function of time.
The best name for it might be a logarithmic clock. The logarithm of the
decaying quantity is interpreted as the time.
The simplest way to read a logarithmic clock is to use the notion of
half-life, denoted $T_{1/2}$. The half life is a time, not too different from the decay
time, actually, but simpler to think about. It is the time for the exponentially
decaying quantity to fall to half its original value. Since the decay time is
the time for the exponentially decaying quantity to fall to 1/e of its original
value, these ideas are essentially the same, but 1/2 is a little more familiar
than 1/e. Here is how to use it. Suppose something is decaying exponentially,
and it has been reduced to 1/4 its original value. How much time has passed?
Answer: two half lives, because it has been reduced by (1/2) and then by
(1/2) again, for a total reduction of $(1/2)^2 = 1/4$. If it were reduced to 1/8
of its original value, three half lives must have passed, because $1/8 = (1/2)^3$.
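Reading the logarithmic clock amounts to taking a base-2 logarithm. The sketch below is a minimal illustration; the function name `half_lives_elapsed` is introduced here and is not from the text.

```python
import math

def half_lives_elapsed(fraction_remaining):
    """Number of half lives needed to fall to the given fraction of the original value."""
    return math.log2(1.0 / fraction_remaining)

two = half_lives_elapsed(1 / 4)     # reduced to 1/4
three = half_lives_elapsed(1 / 8)   # reduced to 1/8
print(two, three)
```

The same function handles fractions that are not powers of 1/2, where counting halvings by hand no longer works.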
4.13 Dating by Radioactive Decay
Natural radioactivity gives us a number of logarithmic clocks. The most
familiar of these is probably Carbon 14, a radioactive isotope of carbon with
a half life $T_{1/2} \approx 5700$ years. By measuring the amount of C$^{14}$ remaining in
a sample, we can determine how old the sample is (lots of details to fill in
here!)
As we just mentioned, the half life is not a very new concept: it is
essentially just the decay time $\tau$, and simpler to think about. In any exponential
decay, there is some quantity that decays in time t by the factor $e^{-\gamma t}$. The
half life $T_{1/2}$ is just the time at which this factor is 1/2, the time for the
quantity to fall to 1/2 its original value. Thus
$$e^{-\gamma T_{1/2}} = \frac{1}{2} \tag{4.51}$$
and hence, taking the natural logarithm on each side,
$$\gamma T_{1/2} = \ln 2 \tag{4.52}$$
so that
$$T_{1/2} = \tau \ln 2 \tag{4.53}$$
using $\gamma = 1/\tau$. The proportionality factor between half life $T_{1/2}$ and decay
time $\tau$ is $\ln(2) \approx 0.693$, which is smaller than one, so that the half life is
a bit less than the decay time. That makes sense because in a half life the
quantity falls to 0.5 times its original value, but in a decay time it falls to
0.368 times its original value, which takes a little longer.
Thus, in the case of C$^{14}$, the decay time is $\tau = T_{1/2}/\ln 2 = 8200$ years.
That means the decay rate is $\gamma = 1/\tau = 1.2 \times 10^{-4}$ yr$^{-1}$ $= 3.8 \times 10^{-12}$ s$^{-1}$.
The number of decays in a short time is proportional to the decay rate, as
the name suggests, but also to the number of C$^{14}$ nuclei, since each one of
them is a candidate to decay. The decay rate is really telling us what fraction
of the nuclei decay per second, per year, etc., depending on units. If we have
a sample of $N_0$ nuclei, then decay events happen at the rate $\gamma N_0$, called the
activity, proportional both to $\gamma$ and to $N_0$. The SI unit of activity is the
Becquerel (Bq), 1 decay per second. In one year, using $\gamma$ as computed above,
the fraction $1.2 \times 10^{-4}$ or 0.012% of the C$^{14}$ nuclei in our bodies, or in any
other sample, undergo radioactive decay. So how radioactive does that make
us?
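The conversion from half life to decay time and decay rate, Eqs (4.51) to (4.53), can be checked directly. The sketch assumes a year of 365.25 days for the unit conversion; that value is not stated in the text.

```python
import math

T_HALF_YR = 5700.0                    # half life of C-14 in years, from the text
SECONDS_PER_YEAR = 365.25 * 86400     # assumed length of a year in seconds

tau_yr = T_HALF_YR / math.log(2)      # decay time, from Eq (4.53)
gamma_per_yr = 1.0 / tau_yr           # decay rate per year, Eq (4.49)
gamma_per_s = gamma_per_yr / SECONDS_PER_YEAR

# Consistency check of Eq (4.51): after one half life the factor is 1/2.
factor_after_half_life = math.exp(-gamma_per_yr * T_HALF_YR)

print(f"tau   = {tau_yr:.0f} yr")
print(f"gamma = {gamma_per_yr:.2e} per yr = {gamma_per_s:.2e} per s")
print(f"factor after one half life = {factor_after_half_life:.3f}")
```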
The natural abundance of C$^{14}$ is about 1 part in $10^{12}$, that is, 1 atom
in every $10^{12}$ atoms of carbon is the radioactive isotope. Since the atomic
weight of carbon is about 12, one mole of carbon atoms has a mass 12 grams,
so in 1 kg of carbon, there are about 80 moles, or $5 \times 10^{25}$ atoms, of which
$N_0 = 5 \times 10^{13}$ are C$^{14}$. Using the decay rate $\gamma$ found above, we find the
activity will be about $\gamma N_0 \approx 200$ decays per second (200 Bq). We are
pretty hot! (Note: if the notions of atomic weight and mole are unfamiliar
to you, just accept the result. We will deal with these issues more carefully
in Chapter 9, and we will say more about radioactivity in Chapter 20.) Now
suppose we had a 1 kg sample of carbon and it was only giving us 100 decays
per second instead of 200: having only half its original activity, it must have
only half of its original C$^{14}$ atoms, and hence it must be 5700 years old. (In
practice one uses much smaller samples, and the number of decays observed
per second is proportionately less. We are skipping over all the practical
details of how you would actually monitor a sample for its activity.)
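The activity estimate and the dating step can be put together as follows. This is a sketch of the idealized argument only: Avogadro's number and the length of the year are standard values not given in the text, and the function names are illustrative.

```python
import math

AVOGADRO = 6.022e23                   # atoms per mole (standard value)
MOLAR_MASS_C_G = 12.0                 # grams per mole of carbon, as in the text
C14_FRACTION = 1e-12                  # natural abundance of C-14, as in the text
T_HALF_YR = 5700.0                    # half life of C-14 in years, from the text
SECONDS_PER_YEAR = 365.25 * 86400     # assumed length of a year in seconds

def c14_activity_bq(mass_carbon_g):
    """Activity gamma * N_0 of the C-14 in a fresh carbon sample, in decays per second."""
    n_c14 = mass_carbon_g / MOLAR_MASS_C_G * AVOGADRO * C14_FRACTION
    gamma_per_s = math.log(2) / (T_HALF_YR * SECONDS_PER_YEAR)
    return gamma_per_s * n_c14

def age_years(activity_now, activity_fresh):
    """Age implied by the drop in activity, counted in half lives."""
    return T_HALF_YR * math.log2(activity_fresh / activity_now)

fresh = c14_activity_bq(1000.0)       # 1 kg of carbon
age = age_years(100.0, 200.0)         # activity has fallen to half

print(f"activity of 1 kg of fresh carbon: {fresh:.0f} Bq")
print(f"age of the sample with half the activity: {age:.0f} years")
```

Because `age_years` takes a logarithm, it also covers samples whose activity is not an exact power of 1/2 of the fresh value.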
The above describes the basic idea, but there are many details still to fill
in. Perhaps the most pressing one is, if C$^{14}$ is decaying away at this rate,
why is there any of it around? The answer is that it is continually being
produced by cosmic rays, in a process by which Nitrogen-14, the principal
constituent of the atmosphere, is converted to C$^{14}$. When C$^{14}$ decays, it goes
back to being N$^{14}$. Atmospheric C$^{14}$ reacts with oxygen in the atmosphere
to create a radioactive form of carbon dioxide that is taken up by plants,
which are eaten by herbivores, which are eaten by carnivores. In this way
the atmospheric C$^{14}$ is exchanged among all living things, and its abundance
agrees with its abundance in the atmosphere, which, as we said above, is
about 1 part in $10^{12}$. After a plant or animal dies, though, it stops this
exchange process of respiring and eating. Whatever C$^{14}$ it has is all that it
is ever going to have, and that starts to decay away. This is why formerly
living things can be dated.
The first tests of this idea looked at samples of known age, for example
wood from the sarcophagus of an Egyptian pharaoh of a historically datable
dynasty. The most useful comparison has been with dendrochronology, the
dating of wood samples by tree rings. Visible patterns of more and less
growth year by year in tree rings, due to random annual variations in weather,
have been pieced together to give precisely datable samples going back in
some cases over 10,000 years. The basic result is that radiocarbon dating
works, but there are small corrections necessary for precise work, and in
some cases large corrections. Underwater life, for example, is not exposed
directly to the atmosphere, but it is exposed to dissolved carbonate ions
which, coming from ancient rocks, have no C$^{14}$. Thus underwater specimens
have less C$^{14}$ than expected and without a correction would appear older than
they really are. Occasional episodes of volcanism can temporarily inject a lot
of ancient CO$_2$ into the atmosphere, none of it radioactive: this dilutes C$^{14}$
and so makes everything alive at the time appear slightly older than it is, in
the sense that it has less C$^{14}$ than it should. The Age of Industrialization,
starting around 1800, did the same thing, in burning fossil fuels, so C$^{14}$
became relatively less abundant than it was before. On the other hand,
nuclear weapons tests in the atmosphere between 1945 and 1963 produced a
lot of C$^{14}$, so abundances now are unnaturally high. The study of this pulse
of C$^{14}$ has revealed how carbon moves through the biosphere, a small benefit
from an otherwise regrettable episode.
Problems
Angular Clocks
4.1 What is the angular velocity of the minute hand and the hour hand of
a clock? Give the answer in several different units.
4.2 The summer and winter solstices occur around June 21 and December
21, with a variability of about 18 hours (which sometimes changes the date).
Find the number of solar days between the solstices, and say what it means
for the speed of the earth in its orbit. Why is this result so different from
the number of solar days between the equinoxes?
Global Positioning
4.3 In Fig 4.5 the satellites at A and B are separated by 500 km, and the
delays registered by the receiver at P are $t_2 = 1$ ms and $t_3 = 1$ ms respectively
(i.e., both are 1 ms). Where is the point P? (Assume $c = 3 \times 10^8$ m/s, and
ignore the complexities of the atmosphere.)
Longitude
4.4 If in making a longitude measurement your clock is off by 1 second from
GMT, how great an error do you make in the determination? By how great
a distance do you err if you are on the equator? If you are at $45^\circ$ N latitude?
If you are at the North Pole? Make your reasoning clear.
4.5 Suppose that in making a measurement of local time, for the purpose
of comparing with GMT, and hence determining longitude, you measure the
position of a known star, but get the position wrong by 1 minute of arc. How
far off will the longitude determination be? Roughly how great a distance
does that correspond to? Make your reasoning clear.
Circular Functions
4.6 Use Fig 4.8 to find the cosine, sine, and tangent of the angles $3\pi/4$, $2\pi/3$,
$5\pi/4$, $5\pi/3$, and $8\pi/3$, and sketch a circle with these positions indicated.
4.7 Sketch the graph, on the same set of axes, for the functions 2 sin t
and sin 2t.
4.8 Investigate in Fig 4.9 whether the radii R and the periods T of the
moons of Jupiter obey a power law, i.e., whether $R \propto T^{\alpha}$ for some $\alpha$.
Velocity in Orbit
4.9 Estimate from Fig 4.9 the speeds of the moons of Jupiter in units of
the speed of the innermost moon. Make it clear how you are doing this.
4.10 Copy from Fig 4.9 the graph of the observed position $R\sin\theta$ of the
outermost moon of Jupiter over time, and on the same time axis graph the
projected component of velocity $\omega R\cos\theta$. Check that the projected velocity
really is zero at the time when the moon stops at its extreme elongation
from Jupiter and turns back the other direction.
Pendulum
4.11 Suppose we set out to make a pendulum that takes exactly 1 second
on each swing, so that its period is T = 2 s.
(a) Describe how to check by counting swings from one transit of a star
to the transit on the following night that the period is exactly 2 seconds.
(b) How long will this pendulum be, in meters?
(c) It was once proposed to make this length the standard unit of length,
on the grounds that it could be easily reproduced anywhere. Would this
actually work?
4.12 Suppose you take a pendulum clock, which is running accurately at
your old location, to a new location where g, the acceleration due to gravity
is less by 0.1%. How many minutes will the clock gain or lose in a day at the
new location? How could you adjust the length of the pendulum to make
the clock run accurately again?
4.13 Suppose your legs are 3 ft long, and your stride (length of one step)
is also 3 ft.
(a) If your legs swing like pendulums of length L = 3 ft, with no lag
between one swing and the next, how fast do you walk, in miles per hour?
(b) In fact, it is not reasonable to expect a 3 ft leg to swing like an L = 3
ft pendulum, because what we mean by that is a mass M on the end of a 3 ft
string. But in a leg, there is mass all the way along the length. Only a little
bit of the mass is at a distance 3 ft from the pivot. If the mass is uniformly
distributed along the length L, then it would swing like a pendulum of length
$\frac{2}{3}L$, a result that requires calculus. What would be the effect on your answer
in (a) if we put in this more realistic length?
(c) In fact, the mass of the leg is not uniformly distributed along the
length, but is concentrated in the upper part of the leg. This would produce a
further correction in the direction of the correction from (a) to (b). Comment
on the common sense of these results: are the walking rates reasonable?
Binomial Approximation
Use the binomial approximation to get quick answers to the following
questions.
4.14 If the angular speed $\omega$ of a wheel goes down by 1%, what happens to
the period T?
4.15 If the radius R of a sphere becomes 1% greater, what happens to the
volume V ?
4.16 If the radius R of a sphere becomes 1% smaller, what happens to the
surface area A?
4.17 If the surface area S of a sphere becomes 1% larger, what happens to
the volume V ?
Simple Harmonic Oscillators
4.18 Two oscillators have the same amplitude A and frequency $\omega$, but they
oscillate with a constant phase difference $\phi$. Graph their motion as a function
of time, on the same set of axes. Also write formulas for the functions of
time that give their positions.
4.19 Two oscillators have the same amplitude A, but their frequencies are
in the ratio 2:1. Each starts initially at position 0, at t = 0. Write formulas
for functions that give their positions as a function of time, and sketch a
graph of their motion on one set of axes.
4.20 Two oscillators have the same amplitude A and frequency $\omega$, but they
differ in phase by $\pi/2$. Graph their motion as a function of time on one set
of axes.
4.21 An oscillator can be thought of as a mass m on a spring of stiffness k.
How does the frequency of the oscillator change if the mass is doubled but
the spring stays the same? Sketch a graph of the oscillation for the original
m and also for the doubled mass 2m, on the same time axis. (Label the
graphs!)
4.22 An oscillator can be thought of as a mass m on a spring of stiffness
k. How does the frequency of the oscillator change if the spring stiffness k
doubles? Sketch a graph of the oscillation for the original k and also for the
doubled stiffness 2k, on the same time axis. (Label the graphs.)
4.23 A mass of 100 kg is oscillating with a period of 10 s. What must be
the stiffness k of the "spring" that causes it to oscillate? (We use quotation
marks, because it might not be a real spring: the mass might be hanging on
a long rope and swinging as a pendulum, for example.)
4.24 A 3 kg mass oscillates 5 times per second from A to B and back again.
What must be the stiffness k of the "spring" that causes it to oscillate?
Exponential Decay
4.25 Show by graphical means that the following data for successive
amplitudes $A_n$ of a pendulum running down are consistent with exponential decay,
and in particular use the slope of your graph to find the factor r relating each
amplitude to the next. Check common sense against the original data for
$A_n$.
n    A_n     ln(A_n)
0    4.29     1.46
1    2.87     1.06
2    1.88     0.634
3    1.29     0.251
4    0.923   -0.080
5    0.645   -0.439
4.26 1 $\mu$g of Strontium-90 is about $6.7 \times 10^{15}$ atoms. The half-life of
Strontium-90 is about 28 years. Using the ideas of Section 4.13, find the
activity of a 1 $\mu$g sample in decays per second. Make your reasoning clear.
Chapter 5
Mass, Weight, and Equilibrium
The theory of the balance of forces and torques in equilibrium, what we now
call statics, was fully worked out in antiquity. Nowadays the notion of force
seems more complicated, occurring, as it does, not only in statics but also
in dynamics, the theory of motion. The earlier theory took force as a simple
concept, one that we are all intuitively clear about, with weight being the
simplest force of all. We will follow this tradition and assume that force
requires no deep explanation. Archimedes' Law of the Lever can be used to
give an operational definition of static force.
5.1 Archimedes
In one of the most beautiful works of Hellenistic physics, Archimedes proved
his famous Law of the Lever. It is illustrated in Fig 5.1. It says that weights
in proportion p : q balance where the beam connecting them is divided in
proportion q : p. Nowadays this result is just a special case of Newton's
theory, and would be called the balance of torques. It is worth looking at
Archimedes' argument in isolation, though. Historically, Archimedes' results
stood alone for centuries, as one of very few indications that Nature admits
a mathematical theory.
The book in question is On the Equilibrium of Plane Figures. It deals with
the kind of situation illustrated in Fig 5.1, weights on possibly unequal arms
of a balance apparatus. The beam of the apparatus itself has no weight (an
interesting abstraction already). The form of the book is much like the form
of Euclid's Elements. It begins with postulates, and the rest is deduction
from the postulates. We describe just a little of the book.
Fig. 5.1: The Law of the Lever: Weights in proportion p : q balance where the beam is
divided in proportion q : p. The balance point is indicated here by the triangular fulcrum.
Postulate I says that equal masses at equal distances from the fulcrum
balance. It might seem that this is obviously true, but when you imagine
trying to verify it, you realize there is a problem. Suppose you take weights
that you think are equal at equal distances from the fulcrum, and they do
not balance. What do you conclude? That Postulate I is false, or that the
weights weren't equal after all? You can't tell! That is why Postulate I is
a postulate. Its status has nothing to do with empirical verification. It is
completely abstract. The equal weights it talks about are theoretical objects.
There is so far no way to know when weights are equal! Postulate I is true
within the theory, by definition. That is what it means to be a postulate.
Postulate III says that if weights balance, and you remove something
from one side, then the weight goes down on the other side. From just these
two postulates Archimedes proves his Proposition I: if two weights balance
at equal distances, then they are equal (giving, finally, a way to tell that
they are equal). Here is the proof: Suppose two weights that balance at
equal distances are not equal. Then we can remove weight from the heavier
one until they are equal. But in removing weight from one side of weights
that balance, we know the other side will go down, by Postulate III. This
contradicts Postulate I, since the now equal weights at equal distances do
not balance. Thus our supposition that the weights were not equal must be
wrong, and therefore they are equal!
The weights in this theory are suspended from their centers (or centers
of gravity), the place where each balances individually. It is assumed that
the weights keep the same value even if you change their shape. This is the
key to proving the Law of the Lever. For the example in Fig 5.1, where the
masses are in proportion 3 : 2, divide the mass 3 into 2 × 3 = 6 equal parts
and the mass 2 into 2 × 2 = 4 equal parts, so that all 10 parts are equal. Now
reshape them, keeping the centers of the two masses in the same place, and
re-assemble them into a single rod of length 10, where the beam had length
5. It will look as shown in Fig 5.2. The balance point is now the center of
the rod, by symmetry, and that is just the point determined by the Law of
the Lever.
Fig. 5.2: Archimedes' proof of the Law of the Lever, using the example of Fig 5.1. The
original weights of 3 and 2 have been reshaped into a rod. Each mass has contributed
segments to the rod, and the centers of each mass are in the original places. A dotted line
shows where the left mass ends and the right mass begins. The balance will now obviously
be at the midpoint of the rod, just the location determined by the Law of the Lever.
In fact, if the left mass were p and the right mass were q, where p and q
are integers, then take a unit of length such that the beam has length p + q,
and form the rod out of 2p + 2q equal segments, of total length 2p + 2q.
Its midpoint can be found by starting from either end and counting p + q
segments. But starting from the left you count p segments to the center of
the left mass, and then q segments further on. And starting from the right
you count q segments to the center of the right mass, and then p segments
further on. And in this way you see that the original beam is divided as q : p.
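The counting argument can be checked mechanically. The following is a minimal Python sketch, not part of Archimedes' text; the function name `lever_division` and the choice of the rod's left end as the origin are assumptions made for illustration.

```python
from fractions import Fraction


def lever_division(p, q):
    """Return (left_arm, right_arm) of the beam from the rod construction.

    The left mass p is reshaped into 2p unit segments and the right mass q
    into 2q unit segments, each block staying centered where its mass hung.
    """
    rod_length = 2 * p + 2 * q
    left_center = p                      # center of the left block, from the left end
    right_center = rod_length - q        # center of the right block, from the left end
    midpoint = Fraction(rod_length, 2)   # balance point of the uniform rod, by symmetry
    left_arm = midpoint - left_center    # fulcrum distance from the left mass
    right_arm = right_center - midpoint  # fulcrum distance from the right mass
    return left_arm, right_arm


left_arm, right_arm = lever_division(3, 2)
print(left_arm, right_arm)
```

For the example of Fig 5.1 the arms come out as 2 and 3, so the beam of length 5 is divided as q : p.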
This result means you can, in principle, easily measure one weight $W_1$ in
units of another weight $W_2$, because that is just the proportion $W_1 : W_2$, and
$$\frac{W_1}{W_2} = \frac{L_2}{L_1}, \qquad (5.1)$$
where $L_1$ and $L_2$ are the distances along the beam from each weight to
the fulcrum when you get the weights to balance. The ratio of the weights is
represented visibly on the beam as the ratio of the two arms. The two lengths
$L_1$ and $L_2$ are called the lever arms for the weights $W_1$ and $W_2$ respectively.
In addition to being a measuring instrument, the lever is a practical
device for balancing a large weight by a small weight. A small weight $W_2$
can be leveraged by the ratio $L_2/L_1$ to balance a large weight $W_1$, to lift
it, for example. The ratio $L_2/L_1$ (a pure number) is sometimes called the
mechanical advantage of the arrangement. Archimedes' result is what is
needed to design practical devices to handle such jobs.
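As a small illustration of Eq (5.1) used as a design rule, here is a hedged Python sketch. The function names and the sample numbers are invented for this example and do not come from the text.

```python
def balancing_weight(w1, l1, l2):
    """Weight W2 that balances W1, from Eq (5.1): W1 / W2 = L2 / L1."""
    if l1 <= 0 or l2 <= 0:
        raise ValueError("lever arms must be positive")
    return w1 * l1 / l2


def mechanical_advantage(l1, l2):
    """The pure number L2 / L1 by which the small weight is leveraged."""
    return l2 / l1


# Illustrative numbers (assumed): a load of 100 units on a 0.5 m arm,
# balanced from the end of a 2.0 m arm.
w2 = balancing_weight(100.0, 0.5, 2.0)
print(w2, mechanical_advantage(0.5, 2.0))
```

With these numbers a weight of 25 units balances the load of 100 units, a mechanical advantage of 4.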
The theory is still a bit abstract. In real applications the beam itself has
weight, for example. In fact, though, it is easy to include the weight of the
beam. You find the actual balance by finding the point A where $W_1$ and $W_2$
would balance by themselves, as we have done, and then, regarding the beam
as a weight $W_3$, with its center at B, the balance of the entire assembly will
be at the point that divides the segment AB in proportion $W_3 : (W_1 + W_2)$.
That is, you just use the Law of the Lever again.
Archimedes is mentioned in two places in Roman histories, once in the
biography of the Roman general Marcellus in Plutarch's Lives, and again in
Polybius' history of the Punic Wars. These accounts, which are rather similar,
were written well over a hundred years after the death of Archimedes,
when Roman rule and Roman ways were firmly established. Plutarch emphasizes
Archimedes' impracticality, a common Roman complaint about Greek
intellectuals. He seems to express a Roman frustration that the Greeks produced
theoretical works and not blueprints for useful devices. The Romans
seem never to have appreciated how practical and useful the Greek abstractions
were to anyone who understood them. These same accounts also contradict
themselves, in a way, because the reason they talk about Archimedes
at all is their interest in the war machines he devised at the siege of Syracuse.
According to Plutarch it was largely Archimedes' ingenuity that kept
the Romans out. When the city finally fell, Archimedes was murdered by a
soldier.
5.2 Torque and Force
Nowadays we rearrange Eq (5.1), multiplying both sides by $W_2 L_1$, and write
the balance condition as
$$W_1 L_1 = W_2 L_2 \qquad (5.2)$$
We read it as saying that the torque $W_1 L_1$ due to $W_1$, which tends to rotate
the beam in the conventional positive direction (counterclockwise), is
equal to the torque $W_2 L_2$ due to $W_2$, which tends to rotate the beam in the
conventional negative direction (clockwise). Here torque is the product of
the weight and the distance, the distance being the distance out from the
fulcrum, also called the lever arm. Since these torques balance, the beam
balances. We can also write it as
$$W_1 L_1 - W_2 L_2 = 0 \qquad (5.3)$$
We read this as saying that the total torque about the fulcrum is zero. We
have to put in each torque with its appropriate sign, saying which direction
it tends to twist the beam.
The condition of balance is called equilibrium, a Latin word that refers
precisely to this balance of torques, literally "equal-balance." The notion
of equilibrium has since been generalized to denote any condition of
balance or stationarity.
Weight is the prototype for the more general notion called force. Weight
is a force, but not every force is a weight. Archimedes' lever provides a way
to compare weights with other, more general, forces. For example, when
there is a weight $W_1$ at one end of the lever, and you balance it by pushing
down on the other end with your hand, you can read off from the position of
the fulcrum what force you are exerting. It is the same force as the weight
$W_2$ that would balance $W_1$, namely $W_2 = (L_1/L_2) W_1$. The two forces are
related by the mechanical advantage. We represent the situation somewhat
abstractly in Fig 5.3. The beam is shown subject to forces $W_1$ and $W_2$ at the
two ends A and B. We do not say whether these forces are weights or some
other force. We also show an upward force F exerted by the fulcrum, which
supports the beam.
Fig. 5.3: Abstract representation for the situation in Fig 5.1. (Labels in the figure:
forces $W_1$ and $W_2$ at the ends A and B, lever arms $L_1$ and $L_2$ measured from O, and
the fulcrum force F at O.)
There are actually two conditions of balance,
$$W_1 L_1 - W_2 L_2 = 0 \qquad (5.4)$$
$$F - W_1 - W_2 = 0 \qquad (5.5)$$
The first of these is the balance of torques. The force F does not appear in
Eq (5.4) because it exerts no torque about O: its lever arm is zero. (To be
careful, we should emphasize that we are balancing torques about the point
O: the lever arms $L_1$ and $L_2$ are measured from that point.) The second
equation, Eq (5.5), is the balance of vertical forces, expressing the fact that
the fulcrum, by pushing up, balances the forces down which would otherwise
cause the beam to fall. We call forces up "positive" and forces down
"negative," an arbitrary sign convention. Notice that if we used the opposite
sign convention and called down "positive" and up "negative," we would still
get the same second equation, just multiplied through by $-1$. Either way,
Eq (5.5) says $F = W_1 + W_2$. That is, the fulcrum force F balances the full
weight (or whatever $W_1$ and $W_2$ are).
It is interesting to notice that the torques balance about any point, not
just O. Suppose we compute torques about the point A, meaning we measure
lever arms from A. Then the force $W_1$ exerts no torque, because its lever arm
is zero. The balance of torques, using the same sign convention for positive
and negative torque, is now
$$0 = F L_1 - W_2 (L_1 + L_2) \qquad (5.6)$$
But putting in $F = W_1 + W_2$, from Eq (5.5), we find after cancellation of the
term $W_2 L_1$ that what is left in Eq (5.6) is exactly Eq (5.4), the Law of the
Lever. Alternatively, since the Law of the Lever must hold about both point
O and point A, Eq (5.5) is not really a new, second condition. Using two
statements of the Law of the Lever, Eq (5.4) and Eq (5.6), we find Eq (5.5)
by subtracting:
$$F L_1 - W_2 (L_1 + L_2) = 0 \qquad (5.7)$$
$$W_1 L_1 - W_2 L_2 = 0 \qquad (5.8)$$
$$(F - W_1 - W_2) L_1 = 0 \quad \text{by subtraction} \qquad (5.9)$$
In the last equation, since $L_1 \neq 0$, it must be that $F - W_1 - W_2 = 0$, just
Eq (5.5). This kind of internal consistency is necessary for the theory to
make sense.
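The claim that the torques balance about any point can be tried numerically. The sketch below is an illustration only: the helper `net_torque`, the coordinate system (O at the origin, positions increasing to the right, upward forces positive), and the sample values taken from the example of Fig 5.1 are assumptions of this example.

```python
def net_torque(forces, pivot):
    """Total torque about `pivot`, counterclockwise positive.

    `forces` is a list of (position, force) pairs along a horizontal beam,
    with position increasing to the right and upward forces positive.
    """
    return sum((x - pivot) * f for x, f in forces)


# The example of Fig 5.1: W1 = 3 at A, W2 = 2 at B, fulcrum O at the origin.
W1, W2, L1, L2 = 3.0, 2.0, 2.0, 3.0
F = W1 + W2                                  # balance of forces, Eq (5.5)
forces = [(-L1, -W1), (0.0, F), (L2, -W2)]   # weights pull down, fulcrum pushes up

for pivot in (0.0, -L1, L2, 7.5):            # O, A, B, and a point off the beam
    print(pivot, net_torque(forces, pivot))
```

Each printed torque vanishes, because moving the pivot changes the total torque by the pivot shift times the net force, and the net force is zero by Eq (5.5).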
The diagram in Fig 5.3 is an example of a free body diagram, an essential
tool in reasoning correctly about forces and torques. It strips away all detail
from the situation and keeps only the minimum information about the shape
of the body together with the forces that act on that body. Let us do an even
simpler free body diagram, one for just the weight $W_1$, shown in Fig 5.4. In
this case both forces act along a line through the center, so the lever arms
are zero (measured from this point), and the torques are zero, and hence
automatically balance. The only condition is that the force up should balance
the force down. The force down is of course the weight $W_1$, but the weight
is hanging on a string, which exerts a force $T_1$ up (the force due to a string
is called tension, hence the letter T). The condition of balance is
$$T_1 - W_1 = 0 \qquad (5.10)$$
that is, $T_1 = W_1$: the tension exactly balances the weight. Similarly, a weight
$W_2$ would hang in equilibrium on a string with tension $T_2 = W_2$.
Fig. 5.4: Free body diagram for a weight $W_1$ on a string with tension T
Now we return to Fig 5.3. If this is a free body diagram for the beam
alone, which is represented by a single horizontal line, then the forces exerted
on it should be the force F due to the fulcrum, and forces at either end due
to the strings. That is, we imagine the weights $W_1$ and $W_2$ are hanging on
strings from the beam, but are not considered part of the beam. The beam
itself is only contacted by the strings, which pull on it with their tension
forces $T_1$ at the left end, and $T_2$ at the right end. As we have just seen,
however, from looking at free body diagrams for the weights, these tension
forces are equal to the weights, and since the strings are below the beam (and
strings can only pull, not push), the forces due to the strings on the beam are
down. (We are assuming the strings themselves have negligible weight!) Thus
the free body diagram in Fig 5.3 is correct, but perhaps slightly confusing. It
seems to suggest that there are weight forces acting on the beam at the ends,
but really the weight forces act on different objects, the weights themselves.
The forces in Fig 5.3 are tension forces, due to strings attached to the beam,
and just happen to be equal to the weights.
Forces and torques are best understood from free body diagrams. The
word "free" means we choose an object and draw it free of all the other
things around it, in isolation. Then we add the forces that act on the chosen
object due to other objects. Anything that touches our chosen object can
in principle exert a force, and so should be represented by a force in the
diagram. That is why there are three forces in Fig 5.3: the chosen object is
the beam, and the things that touch it are the two strings and the fulcrum.
In Fig 5.4 the chosen object is a weight, and it is touched only by one other
object, a string, from above. That accounts for the force $T_1$ in the diagram.
The weight force $W_1$ is not due to anything that touches the weight, but
rather to the Earth, which mysteriously exerts a force at a distance without
touching: thus all objects have an extra force on them, not due to something
touching them, but rather due to the Earth, pulling them down.
Let us go back to Fig 5.3 and see what difference it makes to include the
weight $W_3$ of the beam, which we have neglected so far. This force acts at
the center of the beam, which is at the position $(L_2 - L_1)/2$ to the right of
the point O (this is its lever arm). The balance of torques says
$$W_1 L_1 - W_2 L_2 - \left(\frac{L_2 - L_1}{2}\right) W_3 = 0 \qquad (5.11)$$
and thus, by algebra, in order to balance, the beam should be divided in the
ratio
$$\frac{L_1}{L_2} = \frac{W_2 + W_3/2}{W_1 + W_3/2} \qquad (5.12)$$
We check common sense. First, both sides of this relation are dimensionless.
Second, if $W_3$ is negligible, the right hand side is just $W_2/W_1$, the usual Law
of the Lever. As another limiting case, suppose $W_3$ is much bigger than the
weights $W_1$ and $W_2$, as if these were perhaps just mosquitoes that had landed
on the (heavy) beam. Ignoring $W_1$ and $W_2$ we have $L_1/L_2 = 1$, that is, the
beam balances in the middle, which is clearly correct, because the mosquitoes
are too small to affect the balance very much. These common sense checks strongly
suggest that the expression is correct. If $W_3$ is neither very large nor very
small, the balance position will be somewhere between the two limiting cases
that we just checked.
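The same checks can be run numerically. The following Python sketch is offered as an illustration under stated assumptions: the function names are invented here, the sample weights and beam length are arbitrary, and `two_step_l1` encodes the two-step recipe of Section 5.1 (find the point A for $W_1$ and $W_2$ alone, then divide AB in proportion $W_3 : (W_1 + W_2)$).

```python
def arm_ratio(w1, w2, w3=0.0):
    """L1 / L2 at balance, from Eq (5.12); w3 is the weight of the beam."""
    return (w2 + w3 / 2) / (w1 + w3 / 2)


def residual_torque(w1, w2, w3, l1, l2):
    """Left-hand side of Eq (5.11); zero when the beam balances."""
    return w1 * l1 - w2 * l2 - (l2 - l1) / 2 * w3


def two_step_l1(w1, w2, w3, length):
    """L1 found by applying the Law of the Lever twice, as in Section 5.1."""
    a = length * w2 / (w1 + w2)               # point A: balance of W1 and W2 alone
    b = length / 2                            # point B: center of the beam
    return a + (b - a) * w3 / (w1 + w2 + w3)  # divide AB as W3 : (W1 + W2)


w1, w2, w3, length = 3.0, 2.0, 1.0, 5.0       # illustrative values (assumed)
r = arm_ratio(w1, w2, w3)
l1 = length * r / (1 + r)                     # split the beam so that L1 / L2 = r
l2 = length - l1
print(l1, l2, residual_torque(w1, w2, w3, l1, l2))
print(two_step_l1(w1, w2, w3, length))        # should agree with l1
```

The two routes to $L_1$ agree, and setting `w3` to zero or making it very large reproduces the two limiting cases discussed above.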
5.3 Spring Forces: Hooke's Law
In Fig 5.4 the upward force $T_1$ of a string under tension could also have
been provided by a spring, with the weight hanging on the spring, instead
of a string. In terms of the balance of forces, the diagram would have been
the same. What is meant by a spring is something that noticeably deforms
(stretches, in this case) when it exerts a force.
Denote by x the extension of the end of the spring from its relaxed position.
Then we can ask how much the spring stretches for a given weight W
when it is in equilibrium. It seems clear that if the weight were more, then x
would also be more. In fact, for most springs, which means most deformable
materials, the amount of extension is proportional to the weight:
$$x \propto W \qquad (5.13)$$
an observed fact that is called Hooke's Law, after Robert Hooke, a contemporary
of Isaac Newton. We could also express this as an equality by
introducing a constant of proportionality
$$W = kx \qquad (5.14)$$
where k is a constant corresponding, in a precise way, to the stiffness of the
spring. If k is large, the spring is stiff, in the sense that W must be large to
get even a small deformation x.
The final form of Hooke's Law, encoding the above information in the
most useful way, says that the force $F_S$ exerted by a spring is
$$F_S = -kx \qquad (5.15)$$
The force exerted by a spring is proportional to the extension x, and it is in
the opposite direction to the deformation. That is the meaning of the minus
sign. If the spring is stretched downward by a weight, then it pulls upward
on the weight. If the spring is underneath and is compressed by the weight,
then the deformation x is still downward, and the force on the weight is still
upward, opposing gravity, just as in Fig 5.4, again. Hooke's Law for most
springs describes both compression and stretching, with the same k. The
Hooke's Law force Eq (5.15) is also called a restoring force, since it opposes
any displacement away from the rest position. The larger the deformation
of the spring, in either direction, the harder it pushes back, and the force is
only zero when the spring has its relaxed length.
The constant k in Hooke's Law is the same spring constant that appears
in Eqs (4.38) and (4.39), for the angular frequency and period of a spring
oscillator. In that setting, a stiff spring, with large k, has a high frequency and
a short period. In what follows, though, we think just about the equilibrium
of a spring. We will return to the spring as an oscillator in Chapter 6.
A spring of known spring constant k is a convenient scale for weighing
things. It can be calibrated using weights that have already been measured
and checked with a balance. It would have marks showing how far it stretches
for various weights, the distances being proportional to the weights, if it
really obeys Hooke's Law. Thereafter one can use it as a scale in place of the
balance. It works because you can see by the deformation of the spring how
much force the spring exerts, and hence the weight that it is supporting.
The two systems for weighing, the spring scale and the balance, seem
interchangeable, and of course both are used in practice. The remarkable thing
is that they measure different things! Only one of them actually measures
weight. That is a subtlety that we take up in the next section.
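Calibrating a spring scale amounts to fitting the constant k in $W = kx$. The sketch below shows one way this might be done, a least-squares fit of a line through the origin. The calibration readings are invented for illustration, and the function names are assumptions of this example, not anything prescribed by the text.

```python
def fit_spring_constant(weights, extensions):
    """Least-squares k for W = k x, with the line forced through the origin."""
    numerator = sum(w * x for w, x in zip(weights, extensions))
    denominator = sum(x * x for x in extensions)
    return numerator / denominator


def weight_from_extension(k, x):
    """Read a weight off the calibrated scale using Hooke's Law, W = k x."""
    return k * x


# Invented calibration data: known weights in N, measured extensions in m.
known_weights = [1.0, 2.0, 3.0]
measured_extensions = [0.02, 0.04, 0.06]

k = fit_spring_constant(known_weights, measured_extensions)
print(k)                                   # spring constant in N/m
print(weight_from_extension(k, 0.05))      # weight that stretches the spring 5 cm
```

With real readings the points would scatter about the line, and the size of the scatter would show how well the spring obeys Hooke's Law.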
5.4 Weight and Mass
A sensitive spring scale can be transported to different latitudes and altitudes,
along with some test weight W, just to check its behavior. If you should do
this, you would find that the test object does not keep the same weight!
According to the spring scale, W is less in equatorial regions, and it is less
at high altitude. If we didn't have some reasonable explanation for what
is happening, we might think that the spring is somehow affected by being
in different places, but that doesn't seem very plausible. Suppose, on the
contrary, that the spring constant k is the same everywhere, i.e., the spring
is not affected. Then we have to believe the spring scale. If it says the weight
is different, it really must be different.
Weight is the force due to gravity, of course, and we already know that
g, the measure of the Earth's gravity, varies from place to place. This is the
simple explanation for why the weight of an object really does vary. Weight
is proportional to g, and g varies. That is,
$$W = mg \qquad (5.16)$$
where g is the acceleration due to gravity, and m is a constant of
proportionality, called the mass. In ordinary speech we use "mass" and "weight"
quite interchangeably, but in physics we make a careful distinction. Mass
and weight don't even have the same units. Mass is measured in grams,
or, in the International System of Units (SI), kilograms, abbreviated
kg. The dimension associated with mass is [M]. From Eq (5.16) we
see that weight, and more generally force, has dimension $[MLT^{-2}]$, recalling
that $[g] = [LT^{-2}]$. Thus the SI unit for weight, or force, is the kg·m/s$^2$,
also called the Newton, abbreviated N. A spring scale measures weight, and
so should be calibrated in Newtons. In the US the spring scale might be
calibrated in pounds, because the pound is also a unit of weight, and hence
force.
A balance, on the other hand, if it is really used for measuring, comes
with standard weights, labelled by their mass in grams. When we balance
an unknown $W_2$ against a standard weight $W_1$, we know
$$\frac{L_1}{L_2} = \frac{W_2}{W_1} = \frac{m_2 g}{m_1 g} = \frac{m_2}{m_1} \qquad (5.17)$$
That is, the balance determines mass, not weight! Whatever g is at the place
where the measurement is made, it cancels out. Its value is irrelevant. The
masses do not change as they are transported, and if they balance in one
location, they balance in all locations. This is why chemists use balances,
not spring scales, to measure quantities of materials. 100 grams of a reagent
has the same meaning everywhere, but 1 Newton weight of reagent has a
slightly variable meaning, depending on location. The two quantities are
roughly the same however, because the weight of m = 100 grams = 0.1 kg
is about $0.1 \times 9.8 = 0.98$ N $\approx 1$ N, using the typical value $g = 9.8$ m/s$^2$ in
$W = mg$.
In the spring scale, the spring force kx and the weight mg balance, i.e.
$$x = \frac{mg}{k} \qquad (5.18)$$
where x is the extension of the spring. It is this extension that you read on
the scale. When the scale is used for weighing, we know k is constant, so
this says that x is proportional to mg, i.e., we read the weight mg. But we
can also keep both k and m constant, and then we read this as saying that
x is proportional to g. This is how gravimeters are constructed. They are
just sensitive spring scales, with a fixed mass m to weigh. We read the local
value of g.
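The contrast between the two instruments can be made concrete with a few lines of Python. This is a sketch under assumptions: the spring constant, the masses, and the two values of g (rough figures for equatorial and polar regions) are chosen only for illustration, and the function names are invented here.

```python
def spring_extension(m, g, k):
    """Extension x = m g / k read on a spring scale, Eq (5.18)."""
    return m * g / k


def balance_arm_ratio(m1, m2, g):
    """L1 / L2 for a balance, Eq (5.17); g appears top and bottom and cancels."""
    return (m2 * g) / (m1 * g)


def g_from_gravimeter(x, m, k):
    """Local g inferred from the extension of a spring of known k and fixed m."""
    return k * x / m


K = 50.0   # N/m, assumed spring constant
M = 0.1    # kg, test mass
for g in (9.78, 9.83):   # rough values near the equator and near the poles
    x = spring_extension(M, g, K)
    print(g, x, balance_arm_ratio(0.1, 0.2, g), g_from_gravimeter(x, M, K))
```

The extension changes with g while the arm ratio of the balance stays at 2, which is the sense in which the spring scale measures weight and the balance measures mass.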
We have seen how the SI unit of [T], the second, is defined, and how the
SI unit of [L], the meter, is defined. How is the SI unit of [M], the kilogram,
defined? Remarkably, it is by an arbitrary choice! When the metric system
was invented, during the French Revolution, it was intended that the gram be
the mass of 1 cubic centimeter of water, and this is still approximately true,
but for precise work this is apparently not good enough to be a definition.
The kilogram is now, by definition, the mass of a platinum-iridium cylinder
kept at the Bureau International des Poids et Mesures (BIPM) in Sèvres,
France in a vacuum, to prevent, to the extent possible, chemical changes on
its surface which might incorporate extra mass from the atmosphere. There
are copies of it, which balance with it, to the precision attainable, in other
countries. The weight of the standard kilogram in Newtons would be found
by multiplying by the local value of g.
We are more accustomed to measuring weight in pounds. The definition
of the pound (lb) now depends on the SI system! By definition
$$1\ \text{lb} = 4.448221615\ \text{N} \qquad (5.19)$$
The British system, which is of course very ancient, did not distinguish
between weight and mass. Rather confusingly, one sometimes hears mass
expressed in pounds (or pound mass, a more careful way to say it). One pound
mass is the mass that weighs a pound. This convention relies on the choice
of an arbitrary representative value for g, namely 9.80665 m/s$^2$, and with
this conversion factor 1 pound mass is 0.45359237 kg, or 453.59237 g. We
will only use these conversions to re-express forces approximately in pounds,
in order to get a more intuitive idea of them. It will be quite sufficient to
know 1 kg $\approx$ 1/0.454 $\approx$ 2.2 pound mass for this purpose: i.e., on the surface
of the Earth, 1 kg weighs about 2.2 pounds.
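The conversion factors quoted above fit together as follows. This short Python sketch only multiplies out the numbers given in the text; the constant and function names are assumptions of this example.

```python
G_STANDARD = 9.80665        # m/s^2, the representative value of g quoted above
LB_MASS_KG = 0.45359237     # kg in one pound mass
LB_FORCE_N = LB_MASS_KG * G_STANDARD   # Newtons in one pound, compare Eq (5.19)


def weight_newtons(mass_kg, g=G_STANDARD):
    """Weight W = m g in Newtons, Eq (5.16)."""
    return mass_kg * g


def newtons_to_pounds(force_n):
    """Re-express a force in pounds."""
    return force_n / LB_FORCE_N


print(LB_FORCE_N)                                  # compare with Eq (5.19)
print(newtons_to_pounds(weight_newtons(1.0)))      # about 2.2 pounds for 1 kg
```

The product of the two conversion factors reproduces the value in Eq (5.19) to the digits shown there.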
5.5 Springs in Parallel and Series
We continue using free body diagrams to understand the notion of force in
equilibrium. Suppose a weight $W = mg$ is supported by two springs, with
spring constants $k_1$ and $k_2$, as in Fig 5.5, and imagine that each spring is
stretched by the same amount x. Let $L_1$ and $L_2$ be the distances from the
center of the weight at which the springs are attached, as indicated in the
free body diagram.
Fig. 5.5: A weight is supported on two unequal springs, each stretched by an amount x.
The situation is summarized in the free body diagram on the right.
Then in equilibrium the balance of torques about the
center and the balance of forces require
$$-k_1 x L_1 + k_2 x L_2 = 0 \qquad (5.20)$$
$$k_1 x + k_2 x - W = 0 \qquad (5.21)$$
Notice that the weight W does not contribute to the torque about the center,
because its lever arm about the center is zero. The first relation says
$L_2/L_1 = k_1/k_2$, which just says where to attach the springs so that the springs will
stretch by the same x when the weight is suspended. If this condition were not
observed, the springs would have to stretch unequal amounts in equilibrium
to balance the torques, and the weight would hang crookedly. The second
relation says
$$W = (k_1 + k_2) x \qquad (5.22)$$
This says that the springs are stretched by an amount x proportional to the
weight W, with the constant of proportionality $k_1 + k_2$. It is as if the two
springs were effectively a single spring with an effective spring constant
$$k_{\text{eff}} = k_1 + k_2 \qquad (5.23)$$
In particular, if we have two springs k, then $k_{\text{eff}} = 2k$. This arrangement
of springs is called springs in parallel, and what we have found is that the
result is an effectively stronger spring. Each spring only has to support a
fraction of the total weight W. Spring number 1 exerts a force
$$F_1 = k_1 x = \left(\frac{k_1}{k_1 + k_2}\right) W \qquad (5.24)$$
taking x from Eq (5.22). The fraction in parentheses is clearly less than 1,
so $F_1$ is less than the total weight W, and similarly for the force $F_2$ exerted
by the other spring. Notice that the stronger spring bears proportionately
more weight in this configuration. The two springs divide up the load.
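The parallel case can be summarized in a few lines of Python. This is a minimal sketch; the function names and the sample values of $k_1$, $k_2$ and W are assumptions made for illustration.

```python
def parallel_k(k1, k2):
    """Effective spring constant of two springs in parallel, Eq (5.23)."""
    return k1 + k2


def parallel_loads(k1, k2, w):
    """Forces (F1, F2) carried by each spring when both stretch by the same x."""
    x = w / (k1 + k2)          # common extension, from Eq (5.22)
    return k1 * x, k2 * x      # Eq (5.24) and its counterpart for spring 2


def attachment_ratio(k1, k2):
    """L2 / L1 = k1 / k2, the attachment condition that follows from Eq (5.20)."""
    return k1 / k2


k1, k2, w = 300.0, 100.0, 40.0   # illustrative values in N/m and N (assumed)
f1, f2 = parallel_loads(k1, k2, w)
print(parallel_k(k1, k2), f1, f2, attachment_ratio(k1, k2))
```

The stiffer spring carries three quarters of the load here, and it must be attached three times closer to the center of the weight than the softer one.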
Now consider a different configuration of two springs, springs in series, as
shown in Fig 5.6. A weight is suspended from the two springs. The free body
diagram for the weight alone has two forces, the force $k_1 x_1$ from the spring
which touches it from above, and the weight W, due to the Earth. Here $x_1$
is the extension of the first spring. Note that the second spring does not
touch the weight, and therefore does not contribute a force to the free body
diagram for the weight. The second free body diagram is for the compound
object consisting of the weight together with the first spring. This object is
touched by the second spring, from above, which exerts a force $k_2 x_2$, where
$x_2$ is the extension of the second spring. It has the same weight W, since we
assume the weight of the first spring is negligible.
In equilibrium the forces must balance, so we deduce
$$W = k_1 x_1 = k_2 x_2 \qquad (5.25)$$
Fig. 5.6: Two springs in series support a weight W. The free body diagrams are for the
weight W and for the compound object consisting of the weight W together with the first
(massless) spring.
Notice that in series each spring supports the full weight W, unlike the case
in parallel, where the weight was divided between the springs. The series
arrangement is in effect a single compound spring with an effective spring
constant $k_{\text{eff}}$. The extension of the compound spring is the sum of the
extensions of the individual springs,
$$x = x_1 + x_2 = \frac{W}{k_1} + \frac{W}{k_2} = \frac{W}{k_{\text{eff}}} \qquad (5.26)$$
and thus
$$\frac{1}{k_{\text{eff}}} = \frac{1}{k_1} + \frac{1}{k_2} \qquad (5.27)$$
quite a different $k_{\text{eff}}$ from the parallel case. If we have two springs k, then
in series $k_{\text{eff}} = k/2$. The series spring is less stiff than either of the springs
individually. By algebra,
$$k_{\text{eff}} = \left(\frac{k_1}{k_1 + k_2}\right) k_2 \qquad (5.28)$$
Since the expression in parentheses is less than 1, $k_{\text{eff}}$ is less than $k_2$.
Exchanging the suffixes 1 and 2 we see that $k_{\text{eff}}$ is also less than $k_1$.
To sum up, two springs in parallel make a stronger spring, as in Eq (5.23),
and two springs in series make a weaker spring, as in Eq (5.27). We will
see analogues of these two expressions in many other situations where two
components can combine in parallel or series.
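The series rule can be checked against the free body argument in the same spirit. In this hedged Python sketch the function names and numerical values are invented for illustration; only Eqs (5.25) to (5.28) come from the text.

```python
def series_k(k1, k2):
    """Effective spring constant of two springs in series, Eq (5.27)."""
    return 1.0 / (1.0 / k1 + 1.0 / k2)


def series_extensions(k1, k2, w):
    """Extensions (x1, x2): each spring carries the full weight, Eq (5.25)."""
    return w / k1, w / k2


k1, k2, w = 300.0, 100.0, 30.0   # illustrative values in N/m and N (assumed)
x1, x2 = series_extensions(k1, k2, w)
k_eff = series_k(k1, k2)

print(k_eff, k1 * k2 / (k1 + k2))   # Eq (5.27) against the form in Eq (5.28)
print(x1 + x2, w / k_eff)           # total extension, Eq (5.26)
```

Here the effective constant comes out below both 300 and 100, and the total extension is the sum of the individual extensions.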
5.6 Newton's Third Law
Our treatment of springs in series depended upon a rather non-obvious choice,
illustrated in Fig 5.6 at the far right, the choice of "weight+spring" in the
rightmost free body diagram. With that choice the argument was easy. But
suppose we had not thought to make that choice. Would we have been stuck?
The answer is no. Any choice of body for the free body diagram will work,
but there is one additional thing we have to know about such diagrams and
the forces in them. This additional thing is Newton's Third Law. It says
that if A exerts a force on B, then B exerts a force on A that is equal in
magnitude and opposite in direction. In symbols we could say $F_{AB} = -F_{BA}$.
We illustrate how to use Newton's Third Law by returning to the problem
of springs in series in Fig 5.7 but making a different choice. As in Fig 5.6
we will keep the free body diagram for the weight W, which tells us that
$W = k_1 x_1$ in equilibrium, but now for a second diagram we take as the free
body the first spring. Its free body diagram is shown rightmost in Fig 5.7.
When we think about the first spring as a free body, we have to think what
forces are exerted on it by other bodies. The Earth can exert a force even
without touching it, of course, but we are assuming the weight of the spring is
negligible, so its small weight does not appear in the diagram. Now we think
what objects touch this spring. The second spring touches it from above, and
pulls up with the force $k_2 x_2$. Also the weight W touches it from below. What
is the force on the first spring due to the weight? That is where Newton's
Third Law comes in. We already know that the force on the weight due to
the first spring is $k_1 x_1$, as shown in the free body diagram for the weight (the
middle figure). That tells us what the force is due to the weight on the first
spring: it is $k_1 x_1$ again, but down, not up. These two $k_1 x_1$ forces are the pair
of forces described by Newton's Third Law. For every force on a free body,
there is a paired force, pointing the other direction, on some other body,
the body responsible for the force in the first place. The two $k_1 x_1$ forces in
Fig 5.7 are a good example of a Newton's Third Law pair. Since the first
spring is just suspended in equilibrium, the balance of forces on it, reading
from the diagram, says $k_1 x_1 = k_2 x_2$, in agreement with Eq (5.25). The rest
of the analysis of springs-in-series goes just as before.
Fig. 5.7: Another way to think about springs in series
Let us think what else Newton's Third Law says about this situation.
There is a force W (down) on the weight, due to the Earth. That means
there is a force $-W$ on the Earth, due to the weight (the minus sign meaning
opposite, or in this case up). That is not something we would notice,
but it means weight is really a mutual attraction between the Earth and
individual masses. Newton's Third Law says things attract each other
gravitationally. It is a hint of Newton's universal gravitation, the theory that
everything attracts everything.
Finally, where is the Newton's Third Law pair for the force $k_2 x_2$ in Fig 5.7?
Well, $k_2 x_2$ is a force up, due to spring #2 on spring #1. We have not drawn
a free body diagram for spring #2, but if we did, it would have a force $k_2 x_2$
down, due to spring #1, exactly the paired force associated to the force on
spring #1 due to spring #2.
If you have followed this discussion, you know how to use Newton's Third
Law. It reminds us that every force in a diagram is the force exerted by some
other body, because there must be another body where the paired force occurs.
To give just one more example, the second spring is perhaps supported by a
hook. In order to balance the force $k_2 x_2$ down, exerted by the first spring,
there must be a force $k_2 x_2$ up, exerted by the hook on the second spring.
Therefore there is also a force $k_2 x_2$ down exerted by the second spring on the
hook, etc.
Newton's Third Law is sometimes described as "the law of action and
reaction." This is not a very useful formulation. It is not even clear what it
means. Is the force $F_{AB}$ on A due to B the action, and the force $F_{BA}$ on B
due to A the reaction? Does one of them cause the other? Is one of them
primary and the other secondary? The honest answer is no. They occur as
a pair. That is all Newton's Third Law says.
5.7 Young's Modulus
A solid slab of material, like steel, obeys Hooke's Law in the sense that if you
try to stretch it, by pulling it from both sides with a force F, it will stretch by
a small amount x proportional to F. A steel wire is a convenient geometry
to try this out. If you hang a weight from a wire, it will stretch, by an amount
proportional to the weight. It is, in effect, a spring.
Of course the geometry of the material makes a difference. A thick slab
might deform only imperceptibly, while a long thin wire might stretch very
noticeably. That is, if we write Hooke's Law as $F = k_{\text{eff}}\, x$, where $k_{\text{eff}}$ is
the spring constant, we know that $k_{\text{eff}}$ will be much greater for the slab than
for the wire, even though both are made of steel. Thus $k_{\text{eff}}$ is not a material
property of steel, but rather it depends on things like the thickness and the
length of the steel.
There is, however, a kind of spring constant k that is a material property,
namely the k that describes the "spring" connecting two neighboring atoms
in the material. We are assuming that applying a force to try to separate
these two atoms would produce a displacement proportional to the force, i.e.,
that Hooke's Law also applies at the atomic scale. This amounts to a model
of a solid in which the atoms are masses and they are connected in a regular
structure by springs. Each spring has a length which is the interatomic
distance $\ell_{\text{atom}}$, whatever that may be. When we stretch the
material, we are really stretching all these tiny springs.
Imagine the interatomic springs in a steel wire that are oriented along
the long direction of the wire. There would also be springs perpendicular
to these, connecting atoms across the width of the wire, but they would not
be stretched by a weight hanging on the wire, so we ignore them. In any
cross-section of the wire the number of lengthwise springs is proportional
to the cross-sectional area A. When the wire supports a weight, all these
springs are stretched in parallel, so their effective strength adds up:
$k_{\text{eff}}$ for the wire is proportional to A. But the number of
lengthwise springs in a wire of length L is proportional to L. When the wire
supports a weight, all these springs are stretched in series, so their
effective strength is less: the effective k of the wire is proportional to
1/L. Thus
$$k_{\text{eff}} \propto \frac{A}{L} \tag{5.29}$$
or, putting in the constant of proportionality,
$$k_{\text{eff}} = \frac{AY}{L} \tag{5.30}$$
where Y is called Young's modulus. Thus when we stretch the wire with a
force F, the amount of stretching x obeys
$$F = \frac{AY}{L}\,x \tag{5.31}$$
This relation is usually re-arranged to read
$$\frac{F}{A} = Y\,\frac{x}{L} \tag{5.32}$$
and interpreted as follows. The left hand side, F/A, a force per unit area,
is called the stress. Its SI unit is N/m$^2$, also called the Pascal,
abbreviated Pa. (In Chapter 8 we will meet the notion of pressure, which is
also a stress, also measured in Pascals.) On the right hand side we have the
dimensionless combination x/L, called the dimensionless strain (or just
strain). It is the fractional change in length. What this argument about
microscopic springs really says is that stress should be proportional to
strain. The constant of proportionality is Young's modulus Y, a material
property. Notice that Y has the dimension of stress. Eq (5.32) is, of course,
an experimentally testable proposition, and for small enough strain it is
found to be true, as if there really are little springs!
Here is a numerical example. Young's modulus for steel is about
$2\times10^{11}$ Pa. A cylindrical steel wire with radius 1 mm and length 10 m
has $k_{\text{eff}} = 2\times10^{11}\times\pi\times10^{-6}/10 \approx
6\times10^{4}$ N/m. This means a 10 kg mass (weighing 98 N) will stretch the
wire about $x \approx 1.6\times10^{-3}$ m, or between 1 and 2 millimeters.
The strain is $x/L \approx 1.6\times10^{-4}$, and the stress is
$Y x/L \approx 3.2\times10^{7}$ Pa.
This value is actually getting to be a bit large! The yield stress of steel is
about $2.5\times10^{8}$ Pa. This is the applied stress that would make the
material begin to deform permanently, not in a springy way. In our example the
applied stress is less than the yield stress, but not by much. The breaking
stress is about $4\times10^{8}$ Pa. We are about an order of magnitude less
than the breaking stress in the example. Realistically, though, imperfections
in an actual wire may make it weaker than our estimate in practice, and more
liable to break. Engineers always design structures with a safety factor, so
that applied stresses will be well below the yield stress and breaking stress.
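The arithmetic of this example can be reproduced in a few lines. The following is only a sketch: the function name `wire_response` and its argument names are illustrative choices, not part of the text, and the material constants are the approximate values quoted above for steel.

```python
import math

def wire_response(young_modulus, radius, length, force):
    """Return (k_eff, stretch, strain, stress) in SI units for a cylindrical wire."""
    area = math.pi * radius**2                # cross-sectional area A
    k_eff = young_modulus * area / length     # Eq (5.30): k_eff = AY/L
    stretch = force / k_eff                   # Hooke's Law: F = k_eff * x
    strain = stretch / length                 # fractional change in length
    stress = force / area                     # force per unit area, in Pa
    return k_eff, stretch, strain, stress

# Approximate values quoted in the text for steel
Y_STEEL = 2e11          # Young's modulus, Pa
YIELD_STRESS = 2.5e8    # Pa
BREAKING_STRESS = 4e8   # Pa

# 1 mm radius, 10 m long, supporting a 10 kg mass (98 N)
k_eff, x, strain, stress = wire_response(Y_STEEL, radius=1e-3, length=10.0,
                                         force=10 * 9.8)
print(f"k_eff  = {k_eff:.2e} N/m")
print(f"x      = {x:.2e} m")
print(f"strain = {strain:.2e}")
print(f"stress = {stress:.2e} Pa")
print(f"yield stress / applied stress    = {YIELD_STRESS / stress:.1f}")
print(f"breaking stress / applied stress = {BREAKING_STRESS / stress:.1f}")
```

The two printed ratios show how much margin the example leaves before the yield stress and the breaking stress. Changing the radius is instructive, since the stress scales as one over the radius squared.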
5.8 The Force Between Atoms
If we think about the little springs in more detail, we can relate Young's
modulus to the microscopic strength k of an interatomic spring. Let
$\ell_{\text{atom}}$ be the interatomic spacing, and consider again a
cylindrical wire. Then in the cross-sectional area A, each lengthwise spring
occupies area $\ell_{\text{atom}}^2$, and hence there are
$A/\ell_{\text{atom}}^2$ springs in a cross-section. This makes a kind of
compound spring of strength $kA/\ell_{\text{atom}}^2$, where k is the
interatomic spring constant. (We are just recapitulating the reasoning above
that said the effective spring constant is proportional to A, only this time
keeping the constant of proportionality.) Now these effective springs, each
with spring constant $kA/\ell_{\text{atom}}^2$, are in series when we put them
together to make a wire of length L. The number of them is
$L/\ell_{\text{atom}}$, and hence the effective spring constant is found by
dividing,
$$k_{\text{eff}} = \frac{kA}{L\,\ell_{\text{atom}}} \tag{5.33}$$
Comparing with Eq (5.30), we see that Young's modulus is related to the
interatomic spring constant k by
$$Y = k/\ell_{\text{atom}} \tag{5.34}$$
(check dimensions!) If we use $Y = 2\times10^{11}$ Pa, for steel, and estimate
$\ell_{\text{atom}} = 5\times10^{-10}$ m for the interatomic spacing (5 Å),
then we find the interatomic spring constant is
$$k = \ell_{\text{atom}}\,Y = (5\times10^{-10}\ \text{m})(2\times10^{11}\ \text{Pa}) = 100\ \text{N/m} \tag{5.35}$$
a very peculiar fact! After all, the Newton and the meter are chosen to
be convenient units at our human scale. One doesn't expect a constant
associated with the atomic scale to have an everyday value in SI units. This
accident certainly makes it easy to remember, though.
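The dimension check suggested after Eq (5.34) can be written out explicitly. This is just a sketch in SI units, using only the fact that a spring constant is measured in N/m and a length in m:

```latex
\begin{align*}
[Y] &= \frac{[k]}{[\ell_{\text{atom}}]}
     = \frac{\mathrm{N/m}}{\mathrm{m}}
     = \mathrm{N/m^2} = \mathrm{Pa}, \\
[k] &= [\ell_{\text{atom}}]\,[Y]
     = \mathrm{m}\cdot\mathrm{N/m^2}
     = \mathrm{N/m}.
\end{align*}
```

Both sides of Eq (5.34) therefore carry the dimension of stress, as they must.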
Of course the little spring with this k cannot stretch very far. In the
numerical example of the previous section we found that a strain of
$1.6\times10^{-4}$, corresponding to a stress $3.2\times10^{7}$ Pa on the
wire, was already getting close to the yield stress, the breakdown of Hooke's
Law. If we extrapolate to the breaking stress, $4\times10^{8}$ Pa, we find
that the strain $2\times10^{-3}$ is definitely too large. This corresponds to
a stretching of the little atomic spring by
$x/\ell_{\text{atom}} = 2\times10^{-3}$, or $x \approx 10^{-12}$ m, using
$\ell_{\text{atom}} \approx 5$ Å, as before. The corresponding force, to
stretch the atomic spring too far, is
$$F = kx = 100\ \text{pN} \qquad \text{(Force to break atomic spring)} \tag{5.36}$$
i.e., about 100 piconewtons.
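The chain of estimates leading to Eq (5.36) can be followed step by step in code. This is a minimal sketch: the function name `atomic_spring` is hypothetical, and the inputs are the rough values for steel used in the text, so the output is an order-of-magnitude estimate only.

```python
import math

def atomic_spring(young_modulus, spacing, breaking_stress):
    """Return (k, breaking_strain, stretch, force) for one interatomic spring, SI units."""
    k = spacing * young_modulus                        # Eq (5.34): Y = k / l_atom
    breaking_strain = breaking_stress / young_modulus  # stress = Y * strain
    stretch = breaking_strain * spacing                # x = strain * l_atom
    force = k * stretch                                # Hooke's Law for one spring
    return k, breaking_strain, stretch, force

# Rough values for steel quoted in the text
k, strain_b, x_b, F_b = atomic_spring(young_modulus=2e11,
                                      spacing=5e-10,
                                      breaking_stress=4e8)
print(f"k               = {k:.0f} N/m")
print(f"breaking strain = {strain_b:.1e}")
print(f"stretch         = {x_b:.1e} m")
print(f"breaking force  = {F_b / 1e-12:.0f} pN")
```

Because every step is a simple proportion, doubling the assumed spacing doubles k and the stretch, and so quadruples the estimated force.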
Interestingly, a similar value, tens of piconewtons, is found in biophysical
settings, as the typical force necessary to separate two atoms or molecules
which are adhering without having formed a chemical bond. Living cells,
when they crawl over a surface, form molecular adhesions with their
substrate, and the force necessary to break these adhesions is typically tens
of pN. The force necessary to pull membrane proteins out of membranes is
similar. Clever methods to pull on the ends of single RNA molecules can
detect the breaking of adhesions between one part of the RNA molecule and
another, to pull it out straight. Again the force necessary is tens of
piconewtons. The molecules in these experiments are certainly not the
constituents of steel, but the strength of adhesions is about the same. We get
the impression that non-specific adhesions between atoms can be broken with a
force of some tens of piconewtons, at least in order of magnitude.
This observation is especially interesting because we don't have a good
theory for it. Our understanding of how individual atoms interact is quantum
mechanics, and quantum mechanics uses the notion of energy, not force.
We understand covalent chemical bonds better than we understand these
somewhat ill-characterized situations of atoms which adhere simply because
they are close to each other. Even in the steel, the breaking strength is
determined not by the perfect crystalline structure but by defects in this
structure, grain boundaries between crystals, for example. Perfect crystals
could survive much higher strains than $2\times10^{-3}$, the breaking value we
estimated for steel, and 100 piconewtons is certainly not enough to separate
the atoms in a molecule; rather it can separate two nearby molecules from each
other, leaving the molecules intact. This scale, in between quantum and
classical physics, is still a research area.
Problems
Law of the Lever
5.1 Use Archimedes' result to show that if a weight q balances a weight p
at the two ends of a lever of length L, then the balance point is a distance
pL/(p + q) from q and a distance qL/(p + q) from p. Sketch how this looks,
taking p = 3 and q = 4.
5.2 Explain how a bottle opener works (the kind that pries off a bottle top)
in terms of forces and torques. Make a free body diagram for the bottle
opener, and give numerical estimates for the forces, assuming equilibrium.
5.3 Describe the forces on the lower jaw when you bite something. Make
a sketch of a free body diagram for the lower jaw bone and give numerical
estimates of the forces, assuming equilibrium. Distinguish the force down
due to the upper jaw at the hinge where the two bones meet, and the force
up of the muscle that pulls the jaw closed.
5.4 When you row, you pull one end of the oar, the water pushes the other
end, and at the oarlock the boat pushes the oar. Make a free body diagram
for the oar, and give numerical estimates for the forces, assuming equilibrium.
Why is the boat pushed forward?
5.5 Suppose you hold a 20 lb. weight in your right hand, with your elbow at
your side and your forearm extended horizontally. Sketch a free body diagram
for the forearm, and estimate the forces, assuming equilibrium. Distinguish
the force down of the humerus (upper arm bone) at the joint, and the force
up of the biceps muscle.
5.6 Estimate the forces on a diving board when a diver walks out to the end
of it. Make a free body diagram of the board, of course! Note that a diving
board typically is fastened at the back and also is contacted by a support
underneath, closer to the back than the front, which may even be adjustable
(in position).
5.7 A seesaw is pivoted at the middle. How can three children, weighing 40
lbs., 60 lbs., and 80 lbs., distribute themselves on the seesaw so that there is
a child at each end and the seesaw balances? Find all possible ways.
Newton's Third Law
5.8 In Fig 5.8 three masses are shown hanging in equilibrium one below the
other on strings. Find the tensions in the strings by more than one argument.
Hooke's Law
5.9 A 100 g mass is suspended on a spring, and the spring stretches by 5
cm.
(a) Find the spring constant k in N/m, making clear your assumptions.
(b) Assuming that this k is the same one that governs the frequency of
small oscillations about equilibrium, find the angular frequency $\omega$.
5.10 A 3 kg mass is suspended on a spring, and the spring stretches by 20
cm.
(a) Find the spring constant k in N/m, making clear your assumptions.
(b) Assuming that this k is the same one that governs the frequency of
small oscillations about equilibrium, find the angular frequency $\omega$.
5.11 A 2 kg mass is suspended on a spring and stretches it a certain amount.
An additional 1 kg mass is added, and the spring stretches an additional 8
cm. Find k, the spring constant of the spring, making clear your assumptions.
Fig. 5.8: Three masses $m_1$, $m_2$, $m_3$ hang one below the other on strings
5.12 (a) Show that 3 springs in parallel, each with spring constant k, act
together like a spring with spring constant 3k.
(b) Show that N springs in parallel, each with spring constant k, act
together like a spring with spring constant Nk.
5.13 (a) Show that 3 springs in series, each with spring constant k, act
together like a spring with spring constant k/3.
(b) Show that N springs in series, each with spring constant k, act to-
gether like a spring with spring constant k/N.
Weight and Mass
5.14 Show how the definition of the pound mass follows from the definition
of the pound and the choice of a standard representative value of g. (Note
that the representative value of g is a kind of fiction, since at any location
g has a definite value that is not a matter of choice, and will not be equal
to the representative value.)
Young's Modulus
5.15 Young's modulus for copper is about $1.2\times10^{11}$ Pa, and the yield
stress is about $1\times10^{8}$ Pa. Suppose you have a 1 m length of copper
wire, with diameter 0.5 mm. What is the heaviest weight you can suspend on it
without exceeding the yield stress? How much will the wire stretch?
Atomic Forces
5.16 Use the ideas of Section 5.8 to estimate the force to break an atomic
spring as $S_b\,\ell_{\text{atom}}^2$, where $S_b$ is the breaking stress for
the bulk material. A more direct argument to this result is also possible; if
you can simplify and shorten the argument, by all means do so. Use the data
for steel to evaluate this expression and confirm the estimate we gave of the
breaking strength: 100 pN.
Chapter 6
Mechanical Energy and Motion
In this chapter we introduce the idea of energy. In particular we see how
the notion of energy can replace the notion of force in equilibrium. Instead
of saying that forces and torques balance, we say that a certain energy is
minimized. We also see how to describe certain kinds of motions, like falling
and oscillation, using the notion of energy. This is a hint of a very general
trend in physics: the replacement of force methods with energy methods.
In quantum mechanics one hardly runs into the notion of force, but the idea
of energy is everywhere.
6.1 Gravitational Potential Energy
Everyone knows that things tend to fall down, and to find as low a place as
possible. This statement about equilibrium was one of the basic foundations
of Aristotle's physics. Aristotle's physics has nothing to teach us now about
physics, but it does represent what seemed like common sense for roughly two
thousand years. It confirms that we all intuitively understand the tendency
of things to move downward.
How does Nature choose what should go at the bottom, in case there
should be a choice about it? In Aristotle's physics the answer was that Earth,
understood here as one of the four elements, should have the lowest place.
Next would be Water, another element, so that the equilibrium arrangement
is Earth on the bottom and Water just above that. In the Aristotelian scheme
most materials are compounded of four elements, including Air and Fire, so
admixtures of these could make wood, for example, less liable to sink than
pure water, with the result that wood floats. In fact pure Air and Fire want
to go up, not down, in Aristotle's physics, perhaps to explain why the air
stays overhead and does not fall to the ground.
In the writings of Galileo we can see how a careful observer and
experimenter, who was himself educated as an Aristotelian, finally convinced
himself that Air does not in fact tend to go up, but has weight, just as
Water does, and tends to go down, as all matter does. Galileo eventually even
weighed a quantity of Air, and describes how he did it in his last book Two
New Sciences.
In wrestling with these ideas Galileo interpreted Archimedes' Law of the
Lever in a startling new way. This new interpretation answers the question
of what should go down in case there is a choice. We can recognize in
this argument the beginnings of the idea of gravitational potential energy.
Needless to say, Galileo's formulation was mathematical, in contrast to the
wholly non-mathematical formulations of Aristotle.
Fig. 6.1: If the lever rotates, one mass rises and the other falls
Galileo's idea refers to a lever like that in Fig 6.1 and asks what happens if
the lever should rotate about the fulcrum. Clearly one mass goes up and the
other goes down, but at different speeds, because the two arms are different.
The longer arm corresponds to the greater speed. The balance condition says
that a small mass with a long lever arm balances a large mass with a small
lever arm, but Galileo read this as saying that a small mass with a large
speed balances a large mass with a small speed, replacing lever arm with
speed (they are proportional, after all, in a rotation; recall Eq (4.11)).
Nowadays we look at how far the masses rise or fall in the rotation (this is
also proportional to speed). Galileo's idea then becomes: a large mass rising
a small amount balances a small mass sinking a large amount. This idea is
captured in the idea of gravitational potential energy $U_g$. For a single
mass m, the gravitational potential energy is
$$U_g = mgh \tag{6.1}$$
where h is the height of the center of the mass above some definite level.
$U_g$ is a quantity that contains both the amount of mass m, and the height
h, but it is only the product that matters, so that large m can compensate
for small h, or vice versa. For a system of masses, $U_g$ is just the sum of
the energies of the components separately. In particular, for a system of two
masses, $U_g = m_1 g h_1 + m_2 g h_2$.
In the case of the lever in Fig 6.1, the change in $U_g$ when the lever
rotates is
$$\Delta U_g = m_1 g\,\Delta h_1 + m_2 g\,\Delta h_2 = (-m_1 g L_1 + m_2 g L_2)\sin(\Delta\theta) \tag{6.2}$$
What does this mean? $U_g$ for the system changes because the heights change.
These changes in height are $\Delta h_1$ and $\Delta h_2$, one negative and
one positive, and are both of the form $L\sin(\Delta\theta)$. We see that if
the lever was in balance, that is, if $m_1 g L_1 = m_2 g L_2$, then
$\Delta U_g = 0$, that is, $U_g$ does not change when the lever rotates. But
if $m_1$ is larger than it should be to balance, we know that $m_1$ will go
down, that is, the lever will rotate in the direction shown. Since
$m_1 g L_1 > m_2 g L_2$ in this case, $\Delta U_g < 0$, that is, $U_g$
decreases. On the other hand, if $m_2$ is too large, $m_2$ goes down, so that
$\Delta\theta < 0$ (the opposite of what is shown). Since
$m_1 g L_1 < m_2 g L_2$ in this case, and $\sin(\Delta\theta) < 0$, we find
$\Delta U_g < 0$ in this case too. To summarize, if $U_g$ can decrease, it
does so. That explains why the lever tips the way it does. If it cannot
decrease, then it is in balance.
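The sign argument for the lever can be checked numerically. The sketch below assumes the convention described in the text, namely that a positive rotation angle is the one that lowers $m_1$; the function name `delta_U_lever` and the sample masses and arm lengths are illustrative only.

```python
import math

def delta_U_lever(m1, L1, m2, L2, dtheta, g=9.8):
    """Change in gravitational potential energy when the lever turns by dtheta.

    Assumed convention: dtheta > 0 lowers m1 (height change -L1 sin dtheta)
    and raises m2 (height change +L2 sin dtheta).
    """
    return (-m1 * g * L1 + m2 * g * L2) * math.sin(dtheta)

small = 0.01  # a small rotation, in radians

# Balanced lever (m1*L1 == m2*L2): no change either way
print(delta_U_lever(3.0, 4.0, 4.0, 3.0, +small))
print(delta_U_lever(3.0, 4.0, 4.0, 3.0, -small))

# m1 too heavy: energy drops when m1 goes down (dtheta > 0)
print(delta_U_lever(4.0, 4.0, 4.0, 3.0, +small))

# m2 too heavy: energy drops when m2 goes down (dtheta < 0)
print(delta_U_lever(3.0, 4.0, 5.0, 3.0, -small))
```

In both unbalanced cases the rotation that actually occurs is the one for which the printed change is negative.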
Let us look at this in a less formal way. The lever has the choice of tipping
one way or the other. Either way, the gravitational potential energy of one
of the masses goes down, and that of the other mass goes up. What the lever
actually does is tip to the side where the lowering of the potential energy
of one mass more than compensates for the raising of the potential energy
of the other, so that the net effect is to lower the gravitational potential
energy of the whole system. The behavior of the system is predicted by the
principle that it seeks to lower its gravitational potential energy. We notice
that this also explains why things fall: they are lowering their gravitational
potential energy.
This idea can even be formulated as a variational principle: the
equilibrium of a system of masses subject to gravity is the configuration that
minimizes the gravitational potential energy. Otherwise, if the system can
lower its gravitational potential energy, it will move to do so, and it is not in
equilibrium. (Note: this variational principle turns out to be a very good way
to think about equilibrium, but in more complex situations it must include
also other kinds of energy, not just gravitational potential energy.)
Let us use this variational principle to find the equilibrium of two masses
$m_1$ and $m_2$ mounted on the rim of a wheel of radius R free to rotate, as
in Fig 6.2. This turns out to be a kind of generalization of the lever. The
strategy will be to find how much the gravitational potential energy changes
if the wheel rotates by a small amount $\Delta\theta$. If, by rotating, the
energy could be lowered, we know that this is not equilibrium. We will work in
small angle approximation, because we only want to know if the energy could be
lowered by even a small amount. In a rotation, each mass moves a distance
$R\,\Delta\theta$ along the circumference (this is just the definition of
radian measure). In small angle approximation, this is the same as moving
along the tangent line. The change in height is not this entire distance
$R\,\Delta\theta$, but only the projection on the vertical direction, which
brings in a factor $\sin\theta$. Thus
$$\Delta U_g = m_1 g\,\Delta h_1 + m_2 g\,\Delta h_2 = (m_1 g R\sin\theta_1 - m_2 g R\sin\theta_2)\,\Delta\theta \tag{6.3}$$
Note that one mass goes down and one goes up. If the quantity in parentheses
is different from zero, then $\Delta\theta$ can be chosen to make $U_g$
decrease, that is, the energy can be lowered by turning one way or the other.
But if the quantity in parentheses is zero, then the energy cannot be lowered.
The energy must be at a minimum. This is the equilibrium condition. We write
it out below:
$$m_1 g R\sin\theta_1 = m_2 g R\sin\theta_2 \tag{6.4}$$
Fig. 6.2: A wheel of radius R with two masses $m_1$ and $m_2$ mounted on the
rim is free to rotate and find its equilibrium. The lever arm for computing
the torque about the hub of the wheel is $L_1 = R\sin\theta_1$ for $m_1$ and
$L_2 = R\sin\theta_2$ for $m_2$.
Referring to Fig 6.2, we see that the equilibrium condition can also be written
as
$$m_1 g L_1 = m_2 g L_2 \tag{6.5}$$
where $L_1$ and $L_2$ are the displacements of the masses from the center,
projected onto the horizontal line. This is just Archimedes' Law of the Lever!
We see that the lever arm in a situation like this is not the entire distance
of the mass from the fulcrum, or pivot, which is R for both masses, but
just its projection onto the horizontal, or its horizontal component. Here
$L_1 = R\sin\theta_1$ and $L_2 = R\sin\theta_2$, so it is the sine of the
angle that accomplishes this projection. We used this relation from
trigonometry in arriving at Eq (6.5).
We now have two characterizations of equilibrium. We can balance forces
and torques, or we can minimize potential energy.
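As an illustration of the wheel's equilibrium condition, the sketch below solves $m_1 \sin\theta_1 = m_2 \sin\theta_2$ for the second angle and confirms that the two lever-arm products agree. It assumes, as in the caption of Fig 6.2, that each angle is measured so that the lever arm is $R\sin\theta$; the function names and the sample numbers are hypothetical.

```python
import math

def lever_arm(R, theta):
    """Horizontal projection of the radius: L = R sin(theta), theta in radians."""
    return R * math.sin(theta)

def balancing_angle(m1, theta1, m2):
    """Solve m1 sin(theta1) = m2 sin(theta2) for theta2 (g and R cancel).

    Returns the principal value from asin; pi minus this value gives the
    same lever arm and so satisfies the same condition.
    """
    s = m1 * math.sin(theta1) / m2
    if abs(s) > 1:
        raise ValueError("m2 is too light to balance m1 at this angle")
    return math.asin(s)

R = 0.5                      # wheel radius in meters (illustrative)
m1, m2 = 1.0, 2.0            # masses in kg (illustrative)
theta1 = math.radians(90)    # m1 level with the hub, maximum lever arm

theta2 = balancing_angle(m1, theta1, m2)
print(f"theta2 = {math.degrees(theta2):.1f} degrees")
print(f"m1*L1 = {m1 * lever_arm(R, theta1):.3f}")
print(f"m2*L2 = {m2 * lever_arm(R, theta2):.3f}")
```

The heavier mass must sit closer to the vertical line through the hub, exactly as a heavier weight sits closer to the fulcrum of a lever.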
6.2 Spring Potential Energy
In the last section we saw that the balance of torques in a lever could be
replaced by a different condition, namely that the gravitational potential
energy be a minimum. This seems like a completely different way of looking
at the lever, but it leads to the same equilibrium condition. In this section
we do the same thing for the equilibrium of a weight hanging on a spring.
There too the equilibrium can be described as the minimum of a potential
energy, but now we must introduce a new term, the potential energy of the
spring.
The spring potential energy $U_s$ should be a minimum when the spring
itself is in equilibrium, and that is when x, its extension away from its
relaxed length, is zero. The simplest expression that is a minimum at x = 0 is
$x^2$. This is positive for all $x \neq 0$, and thus clearly a minimum at
x = 0. The spring potential energy $U_s$ is just proportional to this, and
therefore quadratic in extension x. As we shall see, the constant of
proportionality has to be k/2, where k is the usual spring constant. That is,
$$U_s = \frac{1}{2}kx^2 \tag{6.6}$$
Now we consider the potential energy of a mass hanging on a spring,
including both the gravitational potential energy and the spring potential
energy. The mass and spring hang vertically, along the conventional y axis.
Therefore let y be our name for the vertical position of the mass, measured
from the position the weight would have if the spring were not stretched.
Thus y is both the height of the mass and the deformation of the spring. If y
is positive, the mass is above the relaxed position of the spring, so the spring
is compressed. If y is negative, the mass is below the relaxed position of the
spring, and the spring is stretched. The total potential energy is
$$U = U_g + U_s = mgy + \frac{1}{2}ky^2 \tag{6.7}$$
Now we ask for what value of y the potential energy is a minimum. The
minimum is easy to find by a trick from algebra called "completing the
square." The purpose of completing the square is to represent a quadratic
expression like Eq (6.7) as a constant plus a square, i.e., by algebra,
$$U = \frac{1}{2}k\left(y + \frac{mg}{k}\right)^2 - \frac{m^2 g^2}{2k} \tag{6.8}$$
The coordinate y occurs only in the squared expression, and that term is of
course never negative. Its smallest value is zero, and that occurs when
$$y = -\frac{mg}{k} \tag{6.9}$$
This is exactly the equilibrium position. The negative value means the spring
is stretched downward, as we know it will be, and the amount of the extension
agrees with what we found in Eq (5.18) from the balance of forces. Thus it
seems that we can either say the forces balance in equilibrium, or we can
say the potential energy is minimized in equilibrium. These formulations are
equivalent.
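The claim that the minimum of the total potential energy sits at y = -mg/k can be tested without any algebra by simply scanning values of y. The following sketch does this by brute force on a grid and also checks that the completed-square form agrees with Eq (6.7); the mass and spring constant are hypothetical values chosen for illustration.

```python
def total_potential(y, m, k, g=9.8):
    """Eq (6.7): gravitational plus spring potential energy."""
    return m * g * y + 0.5 * k * y**2

def completed_square(y, m, k, g=9.8):
    """Eq (6.8): the same energy written as a square plus a constant."""
    return 0.5 * k * (y + m * g / k)**2 - (m * g)**2 / (2 * k)

m, k = 0.5, 49.0   # kg and N/m (illustrative values)

# Scan y from -0.3 m to +0.1 m in steps of 0.1 mm and pick the lowest energy
ys = [-0.3 + 1e-4 * i for i in range(4001)]
y_min = min(ys, key=lambda y: total_potential(y, m, k))

print(f"grid minimum at y = {y_min:.4f} m")
print(f"-mg/k             = {-m * 9.8 / k:.4f} m")
print(f"largest difference between the two forms: "
      f"{max(abs(total_potential(y, m, k) - completed_square(y, m, k)) for y in ys):.2e}")
```

The grid search knows nothing about completing the square, so its agreement with Eq (6.9) is an independent check, accurate to the grid spacing.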
6.3 The Potential Energy of a Pendulum
A pendulum bob of mass m may swing for a while, but eventually it "runs
down" and reaches equilibrium at the minimum of gravitational potential
energy, hanging motionless straight down. We will call this lowest possible
position y = 0, and we will measure height on the y axis from this point.
Non-equilibrium positions of the pendulum correspond to y > 0, and
gravitational potential energy mgy > 0.
There is another way to look at the energy of the pendulum, using x, the
horizontal coordinate of the pendulum bob, instead of y, the vertical
coordinate, since they are not independent of each other. The relation between
x and y is shown in Fig 6.3. From the Pythagorean Theorem,
$$L^2 = x^2 + (L - y)^2 \tag{6.10}$$
and solving for y in terms of x we have
$$y = L - (L^2 - x^2)^{1/2} = L - L\left(1 - \frac{x^2}{L^2}\right)^{1/2} \approx L - L\left(1 - \frac{x^2}{2L^2}\right) = \frac{x^2}{2L} \tag{6.11}$$
where we used the binomial approximation Eq (4.20), assuming $|x/L| \ll 1$
(small displacement of the pendulum). This says the graph of y is
approximately a parabola for small |x|. In fact, of course, the graph of y is
a circle, not a parabola, so the restriction to small |x| is necessary.
Fig. 6.3: A pendulum's position can be described by x and y coordinates. Here
the origin of coordinates is taken at the equilibrium position.
Since x and y are related in this way for the pendulum, we can express the
potential energy $U_g$ as
$$U_g = mgy \approx \frac{1}{2}\left(\frac{mg}{L}\right)x^2 \tag{6.12}$$
The gravitational potential of the pendulum looks just like the potential
energy of a spring if you write it in terms of the horizontal displacement x
instead of the vertical displacement y, because it is quadratic in x. The
effective spring constant $k_{\text{eff}}$ of the pendulum is the expression in
parentheses, $k_{\text{eff}} = mg/L$. Remarkably, we have already seen this in
a different way: from the behavior of the pendulum as a harmonic oscillator, we
found that it has this spring constant in Section 4.11.
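How small does |x| have to be for the parabola to stand in for the circle? A short numerical comparison gives a feel for it. This is a sketch only; the function names and the chosen displacements are illustrative.

```python
import math

def height_exact(x, L):
    """Exact height of the bob above its lowest point, from Eq (6.10)."""
    return L - math.sqrt(L * L - x * x)

def height_parabola(x, L):
    """Small-displacement approximation y = x^2 / (2L), Eq (6.11)."""
    return x * x / (2 * L)

L = 1.0  # pendulum length in meters (illustrative)
for x in (0.05, 0.1, 0.2, 0.5):
    exact = height_exact(x, L)
    approx = height_parabola(x, L)
    rel_err = (exact - approx) / exact
    print(f"x/L = {x / L:4.2f}: exact = {exact:.5f} m, "
          f"parabola = {approx:.5f} m, relative error = {rel_err:.1%}")
```

The parabola always lies slightly below the circle, and the relative error grows roughly as the square of x/L, which is why the spring analogy is restricted to small swings.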
Thus the pendulum's potential energy, which is really gravitational
potential energy, looks like a spring's potential energy when it is expressed
in terms of the horizontal coordinate. The pendulum is a kind of "gravity
spring."
6.4 Falling, and Kinetic Energy
When a mass m falls, it loses gravitational energy $U_g = mgh$, because its
height h decreases. On the other hand, it also gains what we might informally
call "oomph." Galileo explicitly wonders about this in his writings, citing the
example of pile drivers: the greater the height h from which the pile driver
falls, the better it is at knocking the pile into the ground. It has more
"oomph."
Another example that intrigued Galileo is the behavior of a pendulum
when you block the string. Quite remarkably, if the pendulum swings down
from a height h above the minimum, then it rises to the same height h on
the other side, even if you block the string, as illustrated in Fig. 6.4. It
still has just enough "oomph" to get up to the original level.
Fig. 6.4: A blocked pendulum still rises to the same level. Here the pendulum
starts from rest at A, the string is blocked at P, and the pendulum rises to B.
We now understand "oomph" as a form of energy, called kinetic energy,
frequently notated K. In the examples above, K increases as $U_g$ decreases
in such a way that the total energy E, meaning kinetic energy plus potential
energy, is constant, i.e.
$$E = U_g + K = \text{constant} \tag{6.13}$$
In the case of the blocked pendulum, the mass m regains exactly its original
height at the moment that it stops, because then K is zero and all the energy
is once again in the form $U_g$. If mgh has the same value as originally, then
the height h must also be the same. Galileo knew other examples of things
that behaved like this too. A smooth ball, like a billiard ball, rolling down a
hard, smooth ramp, rolls up again to the same height if it is guided smoothly
onto another ramp sloping upward. Its energy becomes kinetic as it loses
potential, and then it becomes potential as it loses kinetic, but the total is
always the same.
The simple relationship in Eq (6.13) can be pictured in a graph, as in
Fig 6.5. Although the relationship is simple, the graph is rather abstract. The
independent variable is the height h, but it is represented on the horizontal
axis. The gravitational energy U
g
is proportional to h, so its graph is a
straight line through the origin. The constant total energy E is the same no
matter what height h the mass m has, so its graph is a horizontal line at that
constant value. The difference between these two lines is the kinetic energy
K, so what the graph really shows, rather indirectly, is K at any h. Even
more indirectly this says how fast m is moving at any h. In particular, m
comes to rest at h = H, where K = 0. This height is called a turning point,
because something thrown upward with energy E would reach that height
(and no farther) before turning to fall back.
Fig. 6.5: The gravitational potential energy $U_g = mgh$ is shown as a
function of height h. The difference between $U_g$ and the constant level E is
the kinetic energy K. At h = H, the value of K is zero, so at that height the
mass m would come to a stop (turning point).
In Newton's theory the kinetic energy of a mass m has a simple expression
$$K = \frac{1}{2}mv^2 \tag{6.14}$$
where v is the speed. Clearly K is zero if v is zero, at a turning point, for
example, and K and v increase together, so K has the general characteristics
we would expect from the examples. It tells us how much "oomph" m has, due to
its motion. This form could have been guessed by dimensional analysis! If we
look at the dimension of $U_g$, recalling $[g] = [LT^{-2}]$, we find
$[U_g] = [ML^2T^{-2}]$. If we ask for something with this dimension involving
v, with its dimension $[LT^{-1}]$, we see that the only possible combination
is $mv^2$. Of course dimensional analysis can't tell us that there is also a
factor 1/2.
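The dimensional argument can be spelled out by matching exponents. As a sketch, assume the kinetic energy is a pure power law $m^a v^b$ (this form is an assumption made for the purpose of the argument), and require it to have the dimension of $U_g$:

```latex
\begin{align*}
[m^a v^b] &= [M]^a\,[L\,T^{-1}]^b = [M^a\,L^b\,T^{-b}], \\
[U_g]     &= [M\,L^2\,T^{-2}], \\
\text{so}\quad a &= 1, \qquad b = 2, \qquad\text{and the power of } T \text{ is then } -b = -2 .
\end{align*}
```

There are three conditions (on M, L, and T) for only two unknowns, and they are mutually consistent, which is why $mv^2$ is the unique candidate of this form. Any dimensionless numerical factor, such as 1/2, is invisible to this argument.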
With this expression for K, Newton's theory says that $E = U_g + K$
is constant under certain conditions, especially the condition that friction
should be negligible. Let us assume that friction is negligible and see what
we can learn from E = constant. Suppose a mass m falls from a height H,
being initially at rest. This situation corresponds exactly to Fig 6.5. What
is the speed of m when it hits the ground? If we measure height h from the
ground, then h = H initially, and h = 0 at the end. Also v = 0 initially
(m starts at rest) and we don't know v at the end. The statement that
$E = U_g + K$ is constant then says that
$$E_i = mgH + 0 = E_f = 0 + \frac{1}{2}mv_f^2 \tag{6.15}$$
where subscripts i and f mean initial and final. Solving for $v_f$ we have
$$v_f = \sqrt{2gH} \tag{6.16}$$
A quick dimensional check shows that this expression is dimensionally
consistent. (Always check dimensions!) The expression also passes a common
sense check, because if m falls from a higher H it will be going faster:
clearly correct. A possibly surprising thing is that m has cancelled out of the
expression, and in the absence of friction everything should fall at the same
speed, independent of mass. This familiar fact is easy to check by dropping
things together: they do fall together. Finally, try a numerical common sense
check: what would $v_f$ be for something that falls off a 10 m house? Taking
$g \approx 10$ m/s$^2$, we find $v_f \approx 14$ m/s. Does that seem right? In
units that may be more familiar, this is about 30 miles per hour. It seems
reasonable.
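The numerical check is easy to repeat for other heights. The sketch below evaluates Eq (6.16) and converts to miles per hour using the exact definition 1 mph = 0.44704 m/s; the function name `impact_speed` is an illustrative choice.

```python
import math

MS_PER_MPH = 0.44704  # meters per second in one mile per hour (exact by definition)

def impact_speed(H, g=9.8):
    """Speed after falling from rest through height H, neglecting friction: Eq (6.16)."""
    return math.sqrt(2 * g * H)

# The example from the text: a 10 m fall, with g rounded to 10 m/s^2
v_f = impact_speed(10.0, g=10.0)
print(f"v_f = {v_f:.1f} m/s = {v_f / MS_PER_MPH:.0f} mph")

# The mass never enters: only the height matters
for H in (1.0, 10.0, 100.0):
    print(f"H = {H:6.1f} m  ->  v_f = {impact_speed(H):.1f} m/s")
```

Because $v_f$ grows only as the square root of H, a hundredfold increase in height gives just a tenfold increase in speed.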
Turning the problem around, suppose we throw something vertically
upward at speed $v_i$. How high will it go? The energy calculation is exactly
the same as before, except that we switch the indices i and f. Now it starts
with kinetic energy and ends with potential energy. Solving for the height H in
Eq (6.16) we have
$$H = \frac{v_i^2}{2g} \tag{6.17}$$
Do the dimensions agree on both sides? Does the expression make sense?
By the same computation as the one in the previous paragraph, we see that
if we can throw something like a baseball upward at 30 miles per hour, it
will go about 10 meters high. Most people could probably do that, but not
much more. If g were less, on the other hand, we could do a lot better.
On the moon we could throw 6 times higher, because g is only one sixth its
terrestrial value, and g is in the denominator of Eq (6.17). (Reasoning like
this from proportionality is much quicker and easier than putting in a lot of
numerical values and doing arithmetic.)
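For readers who do want to see the arithmetic, here is a small Python sketch of Eq (6.17). The function name `throw_height` and the throwing speed of 14 m/s (roughly 30 mph) are assumptions chosen for illustration, with g rounded to 10 m/s² on Earth and one sixth of that on the Moon.

```python
def throw_height(speed_mps, g=10.0):
    """Maximum height in m reached by something thrown straight up.

    Implements Eq (6.17), H = v_i^2 / (2 g), ignoring friction.
    """
    return speed_mps ** 2 / (2.0 * g)


h_earth = throw_height(14.0)               # about 30 mph, on Earth
h_moon = throw_height(14.0, g=10.0 / 6.0)  # same throw, lunar gravity
ratio = h_moon / h_earth                   # the factor of 6 from proportionality
print(h_earth, h_moon, ratio)
```

The ratio does not depend on the throwing speed at all, which is the point of the proportional argument.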
Caution: some students seeing the example above generalize too quickly
and imagine there must be some principle like "kinetic energy equals potential
energy." Nothing like that is true. If the total energy is zero, as sometimes
happens, then it is true that kinetic energy is the negative of potential
energy. But that is obvious: they have to add to zero. More generally
all you can say, given that energy is conserved, is that kinetic energy +
potential energy is constant. That constant is the total energy. Another
way to say it is that potential energy lost is kinetic energy gained (so that
the sum stays constant). Maybe that is what they mean to say.
6.5 Velocity v in falling
There is a slight awkwardness about these computations, because we are as-
suming that it makes sense to talk about speed v even though v is continually
changing. Before, when we talked about v, it was in the context of constant
speed (like the speed of light). There v was the constant of proportionality
in case $D \propto t$. But if v isn't constant, it certainly isn't a constant of
proportionality, so what is it? Most people feel that they understand what speed
v means even if v is changing. It is what the speedometer on a car shows,
for example. Still, a careful definition requires calculus, and we have made
a conscious choice not to use calculus. That means certain topics are off
limits. We have run into one of those limits here: we can compute v for an
object that falls a distance H, using the method of energy (we have found
$v \propto \sqrt{H}$, in fact, in Eq (6.16)), but we can't say precisely what v means! (A
similar comment would have been in order after Eq (4.37).)
Most introductory physics texts spend quite a lot of time on this topic,
and do far more with it. In that sense they are paying more attention to the
history of physics than we do, because this topic really was very important
historically, and led quite directly to the invention of calculus and Newton's
mechanics. If these connections intrigue you, by all means learn calculus and
do this right. You will find that the notion of "instantaneous v" is identical
with the notion of "derivative" in calculus.
Without calculus we can only make assertions, and we will keep these to a
minimum. Galileo says that for a long time he thought that the velocity of a
falling mass should be proportional to the height H that it falls, but Galileo
frequently meant "oomph" when he said "velocity," that is, kinetic energy K,
and it is actually true that $K \propto H$ for an object that falls from rest, as in
Eq (6.15). He describes it as one of the great discoveries of his life when he
finally realized that in this case
$$v \propto t \tag{6.18}$$
where t is the time the mass falls from rest. That is, for a falling body, v is
proportional to time, not space. As an equality,
$$v = gt \tag{6.19}$$
The constant of proportionality is g, the acceleration due to gravity (finally
seen here in the context that gives it its name). Since $v \propto \sqrt{H}$, Eq (6.16),
and $v \propto t$, Eq (6.19), we have also
$$\sqrt{H} \propto t \tag{6.20}$$
If falling were motion at constant speed, we would have $H \propto t$, but instead
we have Eq (6.20). Squaring both sides and keeping track of the constant of
proportionality from Eqs (6.16) and (6.19), we find more precisely
$$H = \frac{1}{2} g t^2 \tag{6.21}$$
for the height H fallen in time t, starting from rest. This famous result is due
to Galileo, except that Galileo didn't pay much attention to the numerical
value of g. What seemed significant to him was $H \propto t^2$ in the case of free
fall from rest. He often expressed it in the way that he apparently discovered
it experimentally. If we put in t = 0, 1, 2, 3, 4, ... we find H = 0, 1, 4, 9, 16, ...
in some units. Then in each unit of time we have the differences $\Delta H$ =
1, 3, 5, 7, .... This progression of the odd integers, for the $\Delta H$ fallen in each
successive second, delighted him. It seemed to show very clearly how m was
picking up speed at a constant rate in time, faster in each successive second.
The accelerated fall of an object dropped from rest is shown in Fig 6.6.
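Galileo's progression of odd integers is easy to reproduce. The sketch below evaluates Eq (6.21) at whole units of time. Choosing g = 2 in arbitrary units, so that H = t², is our own convenience and not a physical value.

```python
def distance_fallen(t, g=2.0):
    """Total distance fallen from rest after time t, Eq (6.21): H = g t^2 / 2.

    With g = 2 (arbitrary units) this is simply H = t^2.
    """
    return 0.5 * g * t ** 2


times = [0, 1, 2, 3, 4]
heights = [distance_fallen(t) for t in times]
# Distance covered during each successive unit of time.
steps = [later - earlier for earlier, later in zip(heights, heights[1:])]
print(heights)
print(steps)
```

The list `steps` holds the distances fallen in the first, second, third and fourth unit of time, the quantities drawn in Fig 6.6.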
6.6 Universal Gravitation
Newton's universal gravitational force law says that a particle of mass $m_1$
attracts a particle of mass $m_2$ with a force that falls off like the inverse square
of the distance r between them, i.e.
$$F_{grav} = \frac{G m_1 m_2}{r^2} \tag{6.22}$$
Fig. 6.6: An object falling from rest is shown at equal intervals of time. The distance fallen
in each unit of time (1, 3, 5, 7) is given in units of the first distance. This picture expresses $H \propto t^2$,
where H is total distance fallen, and t is time.
Here G is a constant of Nature, called Newton's gravitational constant.
Remarkably, even large objects attract each other this way if they are
symmetrical spheres. In this case the r in Eq (6.22) is the center-to-center distance
between the spheres. This extended case covers the case of the (spherical)
Sun and planets.
Associated with the gravitational force law is a gravitational potential
energy
$$U_G = -\frac{G m_1 m_2}{r} \tag{6.23}$$
(Check dimensions!) We can use $U_G$ just as we have used other potential
energies, to see how speed changes in falling.
First, though, we should clear up something that might be bothering you.
Didn't we already have an expression for the gravitational force on a mass
$m_2$, namely its weight $m_2 g$? And didn't we already have an expression for its
gravitational potential energy, namely $m_2 g h$? And aren't these expressions
quite different from what we are saying now, in this section? Remarkably,
no! For particles near the surface of the (spherical) Earth, these expressions
agree, if $m_1$ is the mass of the Earth $M_E$, and r is the radius of the Earth
$R_E$. That is, it must be that
$$g = \frac{G M_E}{R_E^2} \tag{6.24}$$
In this case Eq (6.22) just says $F_{grav} = m_2 g$, and also, by the binomial
approximation Eq (4.20)
$$\frac{\Delta U_G}{U_G} \approx -\frac{\Delta r}{r} \tag{6.25}$$
which means for $r = R_E$
$$\Delta U_G = \frac{G M_E m_2}{R_E^2} \Delta r = m_2 g \Delta r \tag{6.26}$$
using Eq (6.24). Since $\Delta r$, the change in distance from the center of the
Earth, is just another name for change in height $\Delta h$, these new expressions
are really the old expressions for objects near the surface of the Earth!
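A quick numerical check of Eqs (6.24) and (6.26) may be reassuring. The sketch below uses commonly quoted reference values for G, the mass of the Earth and the radius of the Earth. These numbers are not given in the text, so treat them as assumptions, along with the illustrative mass of 1 kg and height change of 10 m.

```python
G = 6.674e-11    # Newton's gravitational constant, m^3 kg^-1 s^-2 (reference value)
M_E = 5.972e24   # mass of the Earth, kg (reference value)
R_E = 6.371e6    # mean radius of the Earth, m (reference value)


def potential_energy(m, r):
    """Gravitational potential energy of mass m at distance r from Earth's center, Eq (6.23)."""
    return -G * M_E * m / r


g = G * M_E / R_E ** 2   # Eq (6.24)

m2 = 1.0   # kg, illustrative
dr = 10.0  # m, a small change in height near the surface
exact = potential_energy(m2, R_E + dr) - potential_energy(m2, R_E)
approx = m2 * g * dr     # Eq (6.26), the familiar m g h
print(g, exact, approx)
```

The two energy changes agree to within a few parts in a million, because the height change is tiny compared with the radius of the Earth.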
Graphing the potential energy $U_G$ in case $m_1$ is the Earth, in Fig 6.7,
we see how an object dropped from a great height would pick up kinetic
energy K as it fell, assuming no friction. The fall ends at $r = R_E$, of course,
when the dropped object actually hits. Interpreting this graph is just like
interpreting the other energy graphs we have seen. Notice that the total
energy E in this example is negative, meaning there is a turning point.
The case of a planet, of mass m, in circular orbit about the sun, of mass
M, can also be summarized in terms of energy, but now the distance r of
the planet does not change, because r is the constant radius of the circle.
The planet has constant kinetic energy K, but this refers to its speed in its
circular orbit. It turns out that for a circular orbit at radius r the kinetic
energy is
$$K = -\frac{U_G}{2} = \frac{GMm}{2r} \quad \text{(kinetic energy in circular orbit)} \tag{6.27}$$
just half the potential energy, in magnitude. Therefore the total energy is
$$E = K + U_G = \frac{U_G}{2} = -\frac{GMm}{2r} \tag{6.28}$$
Fig. 6.7: A mass m falls from rest at height H above the center of a sphere of mass M
and radius R. Its total energy is E, and the energy graph (a plot of $U_{grav} = -GMm/r$ versus r) shows how its kinetic energy K
increases as it approaches the surface at R. This could represent an object falling (without
friction) from a great height above the Earth. Compare Fig 6.5, which shows only a small
region near r = R, appropriate for heights $H \ll R$ (i.e., close to the Earth's surface).
The same considerations apply, with the masses and r changed, to the Moon
in circular orbit about the Earth, to artificial Earth satellites, and to the
moon systems of other planets. This simple relation between K, $U_G$ and E,
involving just a factor of 2, is a result from Newtonian mechanics called the
Virial Theorem. It holds in a more general form for any bound orbit in a
potential proportional to 1/r. Since the electrostatic force between electrical
charges, like the proton and the electron, also corresponds to a 1/r potential,
the Virial Theorem reappears in the Bohr model of the hydrogen atom!
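The factor-of-2 relations in Eqs (6.27) and (6.28) can be tabulated for a concrete case. The sketch below assumes a hypothetical 1000 kg satellite in a circular orbit of radius 6771 km (about 400 km above the Earth's surface) and uses reference values for G and the mass of the Earth that are not given in the text.

```python
import math

G = 6.674e-11    # m^3 kg^-1 s^-2 (reference value)
M_E = 5.972e24   # mass of the Earth, kg (reference value)


def circular_orbit_energies(m, r, M=M_E):
    """Return (K, U, E, v) for mass m in circular orbit of radius r about mass M."""
    U = -G * M * m / r           # Eq (6.23)
    K = -U / 2.0                 # Eq (6.27), the Virial Theorem
    E = K + U                    # Eq (6.28), equal to U / 2
    v = math.sqrt(2.0 * K / m)   # speed, from K = m v^2 / 2
    return K, U, E, v


K, U, E, v = circular_orbit_energies(m=1000.0, r=6.771e6)
print(K, U, E, v)
```

The total energy comes out negative, as it must for a bound orbit, and equal in magnitude to the kinetic energy.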
6.7 Energy of an Oscillator
The spring potential energy $U_s = (1/2)kx^2$, where x is the displacement of a
mass m, is characteristic of a simple harmonic oscillator with spring constant
k. It has a minimum at x = 0, but if the oscillator is not in equilibrium at
x = 0, then it must be oscillating with angular frequency $\omega = \sqrt{k/m}$ between
$x = -A$ and $x = A$, for some amplitude A, like a pendulum that is disturbed.
A graph of $U_s$ vs. x, shown in Fig 6.8, is a good way to organize what is
happening.

Fig. 6.8: The potential energy $U_s = (1/2)kx^2$ of an oscillator. If the oscillator has
amplitude A, its total energy is $E = (1/2)kA^2$, corresponding to $U_s$ at the turning points $\pm A$.
The kinetic energy K is the difference between E and $U_s$.

This is just like Fig 6.5, but with $U_s$ replacing $U_g$. Now the
constant energy is
$$E = U_s + K = \text{constant} \tag{6.29}$$
The constant total energy E is graphed as a horizontal line. As in Fig 6.5,
the graph shows indirectly, through the kinetic energy K, the speed of the
oscillator at every point of its oscillation. At the turning points of the
oscillations, $x = \pm A$, the spring potential energy takes the value $E = (1/2)kA^2$,
and the kinetic energy K = 0. At other places in the oscillation K > 0,
meaning the oscillator is in motion. K is maximal at x = 0, where $U_s$ is
minimal.
We actually know how x and v behave in time for a simple harmonic
oscillator. From Eqs (4.34) and (4.37) we have
$$x = A\sin(\omega t + \phi) \tag{6.30}$$
$$v = A\omega\cos(\omega t + \phi) \tag{6.31}$$
Therefore the energies, as a function of time, are
$$U_s = \frac{1}{2} k x^2 = \frac{1}{2} k A^2 \sin^2(\omega t + \phi) \tag{6.32}$$
$$K = \frac{1}{2} m v^2 = \frac{1}{2} m \omega^2 A^2 \cos^2(\omega t + \phi) \tag{6.33}$$
Since $\omega^2 = k/m$ and $\sin^2 + \cos^2 = 1$, we find
$$U_s + K = \frac{1}{2} k A^2 \tag{6.34}$$
and this is clearly a constant in time, the total energy E. It coincides with
$U_s$ at the turning points, when K = 0 and all the energy is potential energy.
Both $U_s$ and K oscillate in time, but in such a way that their sum is
constant. The energy gets passed back and forth from kinetic to potential
and back to kinetic, forever, as the oscillator oscillates, as shown in Fig 6.9.
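The constancy of $U_s + K$ can also be checked numerically. In the sketch below the values of k, m, A and the phase are arbitrary illustrative choices, not taken from the text. The code evaluates Eqs (6.30) through (6.33) at a series of times and adds the two energies.

```python
import math

k = 4.0     # spring constant (illustrative)
m = 1.0     # mass (illustrative)
A = 0.5     # amplitude (illustrative)
phi = 0.3   # phase (illustrative)
omega = math.sqrt(k / m)   # angular frequency


def energies(t):
    """Return (U_s, K) at time t for the simple harmonic oscillator."""
    x = A * math.sin(omega * t + phi)            # Eq (6.30)
    v = A * omega * math.cos(omega * t + phi)    # Eq (6.31)
    return 0.5 * k * x ** 2, 0.5 * m * v ** 2    # Eqs (6.32) and (6.33)


E_expected = 0.5 * k * A ** 2                    # Eq (6.34)
totals = [sum(energies(0.1 * n)) for n in range(50)]
print(E_expected, min(totals), max(totals))
```

The smallest and largest totals differ from $(1/2)kA^2$ only by rounding error, even though each of the two energies separately swings between zero and E.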
6.8 Oscillators Losing Energy
When a simple harmonic oscillator is not in equilibrium, it has energy E
greater than the minimum of $U_s$. It oscillates between turning points $x = \pm A$.
To reach equilibrium it must somehow lose this excess energy, represented in
Fig 6.8 by the level E.
Oscillators left to themselves do run down, like a pendulum given a push
and then left alone. Oscillators do lose energy somehow. In Newton's theory,
this can only be due to friction forces. In the case of a pendulum, there
might be air friction. If the pendulum is a mass on a string, there might
be rubbing of the fibers of the string against each other as the pendulum
swings. Such friction has the effect of dissipating energy E, i.e., causing E to
go down. One can picture this in Fig 6.8 as the level E literally going slowly
down: the turning points $x = \pm A$ approach x = 0 from either side. This
corresponds to what we see, as the amplitude of the swing becomes gradually
less. Eventually the oscillator reaches equilibrium, E = 0, and A = 0. A
good oscillator may go through thousands of oscillations as it runs down,
however. A (theoretical) ideal oscillator would lose no energy and would
oscillate forever at the fixed amplitude A, never reaching equilibrium.

Fig. 6.9: The potential energy $U_s = kx^2/2$ and the kinetic energy $K = mv^2/2$ for an
oscillator, plotted against time t over one period T, add to the constant E at every time t.
This same picture is used in quantum mechanics. The difference is that
the oscillator cannot lose its energy E gradually, so one cannot think of
the horizontal line moving gradually down. Instead, the energy E changes in
"quantum jumps," the same definite amount of energy each time. What we call
the "equilibrium" here is called the "ground state" in quantum mechanics, but
it is still the same thing: the state of lowest energy. The quantum oscillator
reaches the ground state by losing quanta of energy in discrete jumps.
In the 1900 paper that was the beginning of theoretical quantum mechanics,
Max Planck proposed this mechanism for the way a simple harmonic
oscillator loses energy, with the further stipulation that the quantum jump in
energy would be proportional to frequency, $\Delta E \propto \omega$. Putting in a constant
of proportionality, we now say
$$\Delta E = \hbar\omega \tag{6.35}$$
where $\hbar = h/2\pi$ (pronounced "h-bar"), and h is called Planck's constant. It
turns out to have the fantastically small value of about $10^{-34}$ kg m$^2$ s$^{-1}$. In our
everyday macroscopic world we don't notice energy losses on this tiny scale.
So as far as we can tell, an ordinary pendulum, with $\omega \approx 1$ s$^{-1}$, loses
energy smoothly and gradually, not in jumps. At the scale of molecules,
though, frequencies may be much higher, easily of the order $\omega \approx 10^{14}$ s$^{-1}$.
Then $\Delta E \approx 10^{-20}$ kg m$^2$ s$^{-2}$, and in a molecule a change in energy
of $10^{-20}$ kg m$^2$ s$^{-2}$ might be very noticeable. At the level of molecules the
discreteness of E is crucial to understanding energy transfer.

Fig. 6.10: A quantum oscillator loses energy in discrete jumps of $\hbar\omega$.
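The two orders of magnitude quoted above follow directly from Eq (6.35). The short Python sketch below uses the reference value $\hbar \approx 1.0546 \times 10^{-34}$ J s, which is slightly more precise than the rounded value in the text. The names `HBAR` and `quantum_jump` are illustrative.

```python
HBAR = 1.0546e-34   # reduced Planck constant, J s = kg m^2 s^-1 (reference value)


def quantum_jump(omega):
    """Energy of one quantum jump in joules, Eq (6.35), for angular frequency omega in s^-1."""
    return HBAR * omega


dE_pendulum = quantum_jump(1.0)      # ordinary pendulum, omega about 1 s^-1
dE_molecule = quantum_jump(1.0e14)   # molecular vibration, omega about 1e14 s^-1
print(dE_pendulum, dE_molecule)
```

Note the units: multiplying kg m² s⁻¹ by s⁻¹ gives kg m² s⁻², which is the joule, as an energy should be.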
The quantum mechanical picture is dominated by the notion of energy.
Fig 6.10 is the right way to think of a simple harmonic oscillator in the context
of quantum mechanics. The oscillator can only have one of the discrete set of
energies indicated in the figure, separated by $\hbar\omega$, where $\omega$ is the frequency of
the oscillator. That is why the oscillator can only lose energy in this discrete
amount (or "quantum").
The idea of a friction force does not come into this picture at all. It is
a question of current research to understand friction forces in terms of the
more fundamental quantum description.
6.9 A Chemical Bond
Fig. 6.11: The potential energy U of two atoms is shown as a function of their separation
x, with energy levels marked for the bound states and for an unbound state. If the two atoms have energy E as shown, then they could separate, but if they lose
enough energy, they will be in a bound state, bound to each other at roughly the separation
where U has its minimum.
To see how these ideas are used in quantum chemistry, imagine that the
interaction of two atoms is described by a potential energy U(x), where x is
their separation, as shown in Fig 6.11.
If the total energy of the system is E, then the difference between E and
U(x) is the kinetic energy of the system, as a function of separation. The
unbound state has an energy E so large that K > 0 for large x. Thus there
is enough energy for the system to separate into two atoms arbitrarily far
apart. The atoms are not bound to each other. If, however, the system
should lose energy and make a transition to one of the bound states, the
atoms could not separate. Their motion in the classical picture would have
turning points where K = 0. They would be oscillating in the "potential
well," that region around the minimum of U. The potential well is not a
perfect parabola, but it would still be a reasonable approximation to model
this situation as a quantum oscillator. The allowed energy levels would not
be exactly evenly spaced, though, and there would be only a finite number
of them.
We have not said how one would know U(x) for the two atoms. Finding
and improving this kind of description for atoms and molecules is still a
research area. The meaning of U(x) is quite straightforward to interpret if
we happen to know what it looks like as a graph, as in this case. In this
example there were bound states because of the minimum in U(x), giving
the possibility of turning points. It may also happen that U does not have a
minimum. In this case there are no bound states, and a molecule could not
form.
Problems
Gravitational Potential Energy
6.1 (a) How does the sinking of a stone in water illustrate the tendency of
systems to minimize their gravitational potential energy? Remember that it
is not just the stone that is moving, it is also the water.
(b) How does the rising of a bubble in water illustrate the tendency of
systems to minimize their gravitational potential energy?
6.2 (a) Suppose two equal weights W are at equal distances L from the
center of a balance, but the (massless) beam that they are connected to is
not horizontal. Rather it has turned on its fulcrum and is inclined from the
horizontal by an angle $\theta$. Use the ideas and idealizations of Section 6.1 to
see if, starting from this position, the gravitational potential energy could be
lowered by a further rotation of the beam through a small angle $\Delta\theta$. Which
way will the beam rotate? A picture will help. (Of course if the gravitational
potential energy cannot be lowered, then the system is in equilibrium, also a
theoretical possibility.)
Spring Potential Energy
6.3 (a) In the example of the mass hanging on a spring, with the potential
energy given in Eq (6.7), there is a given mass m, a spring constant k and of
course the acceleration due to gravity g. Find a combination of these three
quantities that has the dimension of length. This is a natural length for the
problem, that simply comes along with the problem somehow. Call it $y_N$,
where the N stands for "natural."
(b) Similarly, find a combination of m, k and g that has the dimension of
energy. This is a natural energy in the problem, call it $U_N$.
(c) $y_N$ and $U_N$ would make good units of length and energy for this
problem. Find the equilibrium y in units of $y_N$ and the equilibrium energy
U in units of $U_N$.
(d) Graph the potential energy U given in Eq (6.7) as a function of y.
Pendulum Potential Energy
6.4 Explain in words, and using a picture, why a pendulum, meaning a
mass m on a string of length L, is effectively a "less stiff spring" if L is
longer.
Falling
In the problems below, assume energy is conserved (i.e., ignore friction), and
explain your reasoning in detail.
6.5 (a) If in falling 10 m from rest an object reaches a speed 30 mph, what
speed will it reach in falling 20 m? Use proportional reasoning, but note that
speed is not proportional to distance fallen.
(b) If in falling 10 m from rest an object gains kinetic energy 17 J, what
kinetic energy will it gain in falling 20 m?
(c) If the objects in (a) and (b) are one and the same, what is the mass
m of the object?
(d) What is the value of g at this location?
6.6 Use conservation of energy to find the speed v of an object that falls
a distance H from rest, but unlike the treatment in Section 6.4, choose the
h-coordinate so that h = 0 at the starting point. Since also v = 0 at the
starting point, the total energy must be E = 0. Nonetheless, the result for v
after falling a distance H must be the same as before, since this is a physical
fact, and has nothing to do with how we choose to measure height. Make a
clear and careful argument, and include a graph like Fig 6.5, with appropriate
changes.
6.7 The profile of a sledding hill is shown in Fig 6.12. Assuming friction is
negligible, how fast will a sled be moving on the flat at the bottom if it starts
from rest at the top? Note that the figure can also be considered a graph of
gravitational potential energy $U_g$ vs horizontal position.
Fig. 6.12: A sledding hill is 20 m high. What will be the speed of a sled at the bottom?
6.8 Suppose a pendulum of length L = 10 m and mass m = 2.4 kg is pulled
back to an angle $5^\circ$ from the vertical and released from rest.
(a) Taking the zero of height to be where m hangs at equilibrium, what
is its potential energy as it is released? How different would this value be if
you used the approximation in Eq (6.12)?
(b) With what speed does the mass m swing through its lowest point?
6.9 The height H you can throw depends on the speed v you can throw
(upward) and the local value of g. Without referring to the text, but only
using dimensional analysis, show that if you could throw twice as fast, you
could throw four times as high.
6.10 (a) Show how Eq (6.21) follows from Eq (6.16) and Eq (6.19).
(b) If you drop a stone into a well and hear a splash 1 s later, the water
is about 16 feet down. Suppose you drop in a stone and hear a splash 2 s
later. How far down is the water?
6.11 (a) A stone of mass m is dropped from a high bridge, 20 m above the
surface of a river. Ignoring air friction, how fast is the stone moving when it
hits the water?
(b) Suppose someone had thrown the stone straight down from the bridge
at an initial speed 10 m/s. In this case how fast is the stone moving when it
hits the water?
(c) Give a common sense argument explaining why the answers to (a)
and (b) do not differ by 10 m/s.
6.12 Two masses fall from the same height H to level ground at height 0,
but the first mass simply drops down starting from rest, while the second is
projected horizontally at high speed. Which hits the ground first, and why?
Universal Gravitation
6.13 In this problem, assume the Earth is at rest in space, and not in a
complicated solar system! Ignore air friction too.
(a) How fast would something hit the Earth if it fell in from a very great
distance?
(b) How fast would you have to shoot something straight up to give it
enough energy to escape the Earth (i.e., not run into a turning point)?
6.14 (a) In 1798 Henry Cavendish, in a delicate and beautiful experiment,
measured Newton's gravitational constant G. Cavendish actually measured
the tiny force of gravitational attraction between two lead spheres a known
distance apart! Explain how this determines G.
(b) Explain how knowing G, and of course knowing the acceleration g due
to gravity at the Earth's surface, and the radius of the Earth, $R_E$, Cavendish
was able to deduce the mass of the Earth.
(c) Check the accepted numerical values for these things: does everything
agree?
6.15 (a) Use the Virial Theorem to determine the speed of a mass m in
circular orbit about a much greater mass M at radius r.
(b) How long will it take m to get once around the orbit? This is, of
course, the period T. You should find $T^2 \propto r^3$, Kepler's 3rd Law.
(c) We know T and r for the Earth in its orbit around the Sun. Explain
how your result in (b) determines the mass of the Sun in terms of things we
know about the Earth's orbit, like its period (1 year) and its radius (1 AU).
Evaluate the mass of the Sun.
Energy of an Oscillator
6.16 (a) If the amplitude of a harmonic oscillator is A, then it oscillates
from $-A$ to A, a distance of 2A, in half a period. Dividing distance by time,
what is the average speed of the oscillator?
(b) What is the maximum speed of the oscillator in its oscillation?
(c) The ratio of the answers in (a) and (b) is the average speed expressed
in units of the maximum speed, a pure number. Find it, and verify that it
is less than one, as common sense suggests, and the same for all harmonic
oscillators.
6.17 In Section 4.12 we described how an oscillator might swing to an
amplitude A always smaller than the previous amplitude by the same factor
r < 1, thus running down in a manner that is called exponential decay. Show
that in such an oscillator the energy also decays exponentially, being less on
each swing by a constant factor (but not the factor r that describes the decay
of the amplitude).
Chapter 7
Vector Quantities
In the previous chapter we saw how things fall when they fall straight down.
A more complicated kind of falling is projectile motion, the motion of
something flying through the air, like a ball when you throw it. Galileo discovered
how to describe both kinds of falling, and showed that projectile motion is
just a simple generalization of vertical falling. Projectile motion requires two
dimensions to describe. There is motion both vertically and horizontally. It
was Galileo's great insight that these two motions take place independently.
That turned out to be the very simple answer to an old question: what is the
path of missiles through the air? It is really quite amazing that this answer
wasn't discovered earlier. Galileo himself seems to be surprised that he was
the first to understand it.
More generally, the problem of projectile motion is a good example for
thinking about motion in two dimensions.
7.1 Projectile Motion
The trajectory of an object that is thrown horizontally in the gravitational
field is very simple if you look at the horizontal components and vertical
components separately. The amazing fact is that the horizontal motion and
the vertical motion are completely independent! While the object moves
horizontally at constant speed, it falls vertically in just the same way that it
would drop straight down. If we introduce horizontal and vertical coordinates
(x, y), with y increasing down, the motion is
$$x = v_{0x} t \tag{7.1}$$
$$y = \frac{1}{2} g t^2 \tag{7.2}$$
where $v_{0x}$, the horizontal component of velocity, is constant. This motion,
shown in Fig 7.1, traces out a parabola, a curve known to the Hellenistic
Greeks.

Fig. 7.1: An object thrown horizontally falls vertically as if it were simply dropping down,
while it moves horizontally equal distances in equal times. The object's position is shown
here at equal intervals of time (total distances fallen 0, 1, 4, 9, 16, with successive
differences 1, 3, 5, 7). The points lie on a parabola.
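Eqs (7.1) and (7.2) can be tabulated to see the parabola emerge. In the sketch below the horizontal speed of 3 m/s and the rounded value g = 10 m/s² are illustrative assumptions. Eliminating t between the two equations gives $y = g x^2 / (2 v_{0x}^2)$, which the code compares with each tabulated point.

```python
def position(t, v0x, g=10.0):
    """Return (x, y) at time t for a horizontal throw, with y increasing downward."""
    x = v0x * t            # Eq (7.1): constant horizontal speed
    y = 0.5 * g * t ** 2   # Eq (7.2): vertical fall from rest
    return x, y


v0x = 3.0   # horizontal speed in m/s (illustrative)
g = 10.0    # m/s^2, rounded
points = [position(t, v0x, g) for t in range(5)]

# Each point should satisfy the parabola y = g x^2 / (2 v0x^2).
on_parabola = all(abs(y - g * x ** 2 / (2.0 * v0x ** 2)) < 1e-9 for x, y in points)
print(points, on_parabola)
```

The x values are equally spaced while the y values grow as 0, 1, 4, 9, 16 times the first drop of 5 m, just as in Fig 7.1.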
When we say x and y in Eqs (7.1) and (7.2), or in Fig 7.1, we mean the
coordinates of a particle in the x-y plane. We can also consider the pair
(x, y) to be the two components of the displacement vector, meaning the
displacement of the particle from the origin. Here x and y could be either
positive or negative, just as in the case of 1-dimensional displacements.
7.2 Vector Addition
The first mention of the idea of combining two motions along two different
directions seems to be in the Questions on Mechanics attributed to Aristotle
(but actually not by him, on stylistic grounds). Sometimes the author is
called "pseudo-Aristotle." In any case, this little book of problems raises the
question of something that moves along a line while the line itself is moving.
The picture is in Fig 7.2. An object moves along the line from A to B, but
as it moves, the line itself moves up. By the time the object reaches B, the
line has moved up to coincide with the line CD, so the object reaches the
point D. It has actually moved along the diagonal AD.

Fig. 7.2: While an object moves rightward from A to B along the line AB, the line itself
moves upward to position CD, so that when the object reaches B, the point B itself has
reached D. The effect is that the object moved from A to D, along the diagonal. Halfway
through the motion, the line is halfway up, and the object is at point E.
Nowadays we would express this idea with vectors (arrows) for the
displacements, as in Fig 7.3. The idea is the same as in Fig 7.2, but perhaps
easier to interpret. A displacement horizontally along the vector $\vec{r}_1$ combines
with the displacement vertically along vector $\vec{r}_2$ to give the net displacement
along the diagonal, called the vector sum of the two displacements. The
vector sum results from putting the two addends tail-to-head. When we look
at it this way, we say displacement is a vector. This implies that we can add
displacements as vectors to get net displacement.

Fig. 7.3: The vector addition of displacements in Fig. 7.2. The displacement rightward
along $\vec{r}_1$ combines with the displacement upward along $\vec{r}_2$ to give a net displacement which
is diagonally up. This displacement is the vector sum $\vec{r}_1 + \vec{r}_2$, the cumulative effect of the
two displacements.
A similar thing can be said about velocity, which can also be regarded
as a vector. In Fig 7.2 there was a horizontal motion, that is, a horizontal
velocity $\vec{v}_1$ that combined with a vertical motion, that is, a vertical velocity
$\vec{v}_2$, to produce the actual net velocity along the diagonal $\vec{v}_1 + \vec{v}_2$. The picture
in Fig 7.4 shows how these two velocities combine.
The length of the velocity vector is what we mean by speed. It is just a
number, and in fact a non-negative number (with units). We may say $v_1$ or
$|\vec{v}_1|$ for this length. But if we mean the vector, we will always put an arrow
over the top, as in $\vec{v}_1$. It is a very good idea to do this when you are writing
out problems, too, so that your notation tells you which items are vectors
and which are ordinary numbers (also called scalars). The velocity vector is
not a number, or scalar. Velocities add (in the sense of vectors), but speeds
do not add. In Fig 7.4 we could get the speed of the composite motion from
$v_1$ and $v_2$ by the Pythagoras theorem, but if the vectors $\vec{v}_1$ and $\vec{v}_2$ are not
perpendicular, then this isn't true either. In fact, the pseudo-Aristotle book
seems particularly surprised to notice that two velocities corresponding to
high speeds can add as vectors to make a low speed, as in Fig 7.5.

Fig. 7.4: A horizontal velocity $\vec{v}_1$ and a vertical velocity $\vec{v}_2$ combine to give the net velocity
$\vec{v}_1 + \vec{v}_2$, the vector sum, in a diagonal direction.

Fig. 7.5: If two velocities $\vec{v}_1$ and $\vec{v}_2$ are somewhat anti-aligned, their vector sum $\vec{v}_1 + \vec{v}_2$ may be shorter than
either of them. This corresponds to adding two velocities and finding a new velocity with
smaller speed than either summand.
The extreme case of anti-alignment occurs when two vectors point in
exactly opposite directions. In this case vector addition is more like
subtraction! Since motion along a line is one dimensional, we wouldn't actually
need to use vectors. We could just express displacements x and velocities v as
numbers (either positive or negative along the line, depending on direction).
This is what we have been doing in previous chapters, where motion was
along just one direction, for the most part. Alternatively, we could say that
when we allow position x and velocity v to be either positive or negative,
indicating direction, we are actually using vectors, but because they always
point along the same line, we don't need to remind ourselves that they are
vectors by using the special vector notation (arrow over the top). That is
just for two and three dimensions.
As a physical realization of the pseudo-Aristotle idea, we could think of
someone walking on the deck of a ship, while the ship itself is moving. Here
the moving ship replaces the moving line. Someone watching the motion
from a stationary position on shore would see the velocity of the person as
the vector sum of the walker's velocity with respect to the deck plus the
ship's velocity. The idea of velocities cancelling is easy to picture in this
case. Imagine that someone walks toward the stern of the ship at the same
velocity that the ship moves forward. Then, to someone on shore, the walker
seems to be walking in place, as if on a treadmill. The two velocities add to
zero.
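The ship example can be written out with components. In the sketch below each velocity is a pair (forward, sideways) in m/s. The particular numbers, and the helper names `add` and `speed`, are illustrative assumptions rather than anything from the text.

```python
import math


def add(v1, v2):
    """Vector sum of two velocities given as (x, y) component pairs."""
    return (v1[0] + v2[0], v1[1] + v2[1])


def speed(v):
    """Speed is the length of the velocity vector."""
    return math.hypot(v[0], v[1])


ship = (5.0, 0.0)                  # ship moving forward at 5 m/s
walker_toward_stern = (-5.0, 0.0)  # walker's velocity relative to the deck
seen_from_shore = add(ship, walker_toward_stern)

walker_across = (0.0, 1.5)         # walking across the deck instead
diagonal = add(ship, walker_across)
print(seen_from_shore, speed(diagonal))
```

In the first case the velocities cancel and the walker appears to walk in place. In the second case the speed seen from shore is less than the sum of the two speeds, because speeds do not add.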
7.3 Velocity and Speed
As we said above, speed is the length of the velocity vector. The scalar
quantity speed carries less information than the velocity vector, because all
information about the direction of the vector has been lost. Still, sometimes
all one wants to know is the speed. In case a velocity is known as the vector
sum of two perpendicular vectors, as in Fig 7.4, one can find the speed using
the Pythagoras theorem. From Fig 7.4, we see that the velocity $\vec{v}_1 + \vec{v}_2$ is the
hypotenuse of a right triangle with sides (lengths) $v_1$ and $v_2$. This means,
from geometry, that the speed squared, $|\vec{v}_1 + \vec{v}_2|^2$, is $v_1^2 + v_2^2$.
In case $\vec{v}_1$ and $\vec{v}_2$ are not perpendicular, though, as in
Fig 7.5, we would have to use a generalization of the Pythagoras theorem
called the Law of Cosines. In this generality the speed squared is
$$|\vec{v}_1 + \vec{v}_2|^2 = v_1^2 + v_2^2 - 2 v_1 v_2 \cos(\theta_{12}) , \tag{7.3}$$
where $\theta_{12}$ is the angle between the vectors $\vec{v}_1$ and
$\vec{v}_2$ when you place them tail-to-head. If these two vectors are
perpendicular, as in Fig 7.4, then $\theta_{12} = \pi/2$, and thus
$\cos\theta_{12} = 0$, so that Eq. 7.3 is just the Pythagoras theorem. But
in Fig 7.5, there is an acute angle between the vectors, and thus the speed
is less than it would be if they were perpendicular, because the last term in
Eq. 7.3 represents something subtracted off.
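As a quick numerical check of Eq. 7.3, the following Python sketch compares the perpendicular case with an acute tail-to-head angle. The function name `speed_of_sum` and the sample speeds 3 and 4 are illustrative choices, not taken from the text.

```python
import math

def speed_of_sum(v1, v2, theta12):
    """Speed |v1 + v2| from the Law of Cosines (Eq. 7.3).

    v1, v2  : speeds (lengths) of the two velocity vectors
    theta12 : angle between them when placed tail-to-head, in radians
    """
    return math.sqrt(v1**2 + v2**2 - 2.0 * v1 * v2 * math.cos(theta12))

# Perpendicular vectors: the cosine term vanishes (Pythagoras theorem).
perpendicular = speed_of_sum(3.0, 4.0, math.pi / 2)

# Acute tail-to-head angle: something is subtracted, so the speed is smaller.
acute = speed_of_sum(3.0, 4.0, math.pi / 3)
```

With these sample numbers the perpendicular case gives the familiar 3-4-5 triangle, and the acute case gives a smaller speed, as the text argues.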
Vectors are frequently expressed in terms of their projections onto standard
axes. Usually the x-axis is taken horizontal, pointing to the right, and
the y-axis as vertical, pointing up. A vector with projections $v_x$ and
$v_y$ onto these axes can be called $\vec{v} = (v_x, v_y)$. The projections
can be either positive or negative, indicating the directions. The two
components together tell everything about the vector.
In three dimensions we would have a third axis, the z-axis, and a projection
$v_z$ onto that axis. Then the velocity vector, in terms of these projections,
or components, would be $\vec{v} = (v_x, v_y, v_z)$. The speed squared in
this case, the most general case, is
$$|\vec{v}|^2 = v_x^2 + v_y^2 + v_z^2 , \tag{7.4}$$
by the Pythagoras theorem.
Kinetic energy K of a mass m depends only on the speed $|\vec{v}|$, and not on
all the details of the vector $\vec{v}$. In the most general case it is therefore
$$K = \frac{m|\vec{v}|^2}{2} = \frac{m}{2}\left(v_x^2 + v_y^2 + v_z^2\right) . \tag{7.5}$$
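Eqs. 7.4 and 7.5 translate directly into a few lines of Python. This is only a sketch: the function names and the sample mass and velocity components are assumptions chosen for illustration.

```python
import math

def speed(vx, vy, vz=0.0):
    """Length of the velocity vector from its components (Eq. 7.4)."""
    return math.sqrt(vx**2 + vy**2 + vz**2)

def kinetic_energy(m, vx, vy, vz=0.0):
    """Kinetic energy from mass and velocity components (Eq. 7.5)."""
    return 0.5 * m * (vx**2 + vy**2 + vz**2)

# Hypothetical example: a 2 kg mass with velocity (1, 2, 2) m/s.
example_speed = speed(1.0, 2.0, 2.0)
example_energy = kinetic_energy(2.0, 1.0, 2.0, 2.0)

# Reversing the direction of the velocity changes the vector but not K.
reversed_energy = kinetic_energy(2.0, -1.0, -2.0, -2.0)
```

The last line illustrates the point of the text: kinetic energy depends only on the speed, so flipping the direction of the velocity leaves it unchanged.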
7.4 Galilean Relativity
It is a fascinating and very physical fact that when you are on a moving ship,
or a moving train, or a moving plane, so long as the motion is in a straight
line at constant velocity, you do not feel that you are moving. If you drop
something, for example, it seems to fall straight down, just as if you were
at rest. Galileo noticed this in connection with the problem of whether the
Earth moves or is at rest. On the basis of examples like moving ships, Galileo
argued that we could be moving and yet not feel it, or even have any way
to demonstrate that we are moving. The statement that we cannot tell by
experiment whether we are moving smoothly along or not is the statement of
the Principle of Relativity. It says that we can tell if we are moving relative
to something else, but we cannot tell if we are moving in any absolute sense.
In fact, it has no operational meaning to say we are moving in an absolute
sense.
The effect of changing point of view to another point of view, moving with
respect to the first point of view, is very simple in Galileo's picture of it. All
velocities simply get a certain constant velocity added on. This constant
velocity just expresses the relative velocity of the two points of view. In
Fig 7.4, for example, the velocity $\vec{v}_1$ might be the velocity of a sailor
walking across the deck. With reference to the ship, this is the sailor's
velocity. But if the ship is moving with velocity $\vec{v}_2$ with respect to
the shore, then an observer on shore gets the velocities with reference to the
shore by adding on $\vec{v}_2$ to all velocities with reference to the ship.
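The change of point of view described here is just component-wise vector addition. A minimal Python sketch follows, with made-up velocities in m/s; the function name `to_shore_frame` is hypothetical.

```python
def to_shore_frame(v_rel_ship, v_ship):
    """Velocity seen from shore: velocity relative to the ship plus the ship's velocity."""
    return tuple(a + b for a, b in zip(v_rel_ship, v_ship))

ship = (2.0, 0.0)                      # ship moves forward (x direction)

# A sailor walking across the deck (y direction), as in the Fig 7.4 example.
sailor_on_deck = (0.0, 1.5)
sailor_from_shore = to_shore_frame(sailor_on_deck, ship)

# Someone walking toward the stern at the ship's own speed.
walker_on_deck = (-2.0, 0.0)
walker_from_shore = to_shore_frame(walker_on_deck, ship)
```

In the second case the two velocities add to zero, so from shore the walker appears to be walking in place.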
7.5 Falling and Relativity
Simple falling, as shown in Fig 6.6, and projectile motion, as shown in Fig 7.1,
seem to be rather different things, but in the Theory of Relativity they are
the same! The first shows something dropping straight down, and the second
shows something that has been thrown horizontally to the right. How could
these be the same?
In Fig 6.6, something that is initially at rest is released, and in equal
units of time drops straight down, falling distances 1, 3, 5, ... in successive
units of time. But who is to say that it was initially at rest? From the point
of view of someone moving smoothly to the left, this object, even before it
was dropped, is moving smoothly to the right. When it is dropped, it will
continue to move smoothly to the right, from this point of view, because this
is really just the observation of Fig 6.6 from the point of view of someone
moving to the left. The result is Fig 7.1!
The Principle of Relativity says that both these points of view are equally
valid. Einstein frequently used this example in explaining his Relativity The-
ory to popular audiences. As he put it, suppose someone drops a small stone
from a railway carriage, while it is moving smoothly at constant velocity.
With respect to the railway carriage, the stone falls straight down, as in
Fig 6.6. But someone outside the train on the embankment who observes
the misdeed sees the stone falling as in Fig 7.1, because with reference to
the embankment the stone is moving along with the velocity of the railway
carriage even before it is dropped. Now, Einstein asks, what is its actual
path in space, a straight line or a parabola? He argues that this question
makes no sense. Objects do not move with reference to space, he says, but
only with reference to other objects. Thus you seem to get different answers
depending on what reference objects you use, the carriage or the embank-
ment, but since they all describe the same physical thing, namely the falling
stone, they all must be consistent. The Theory of Relativity is about how to
reconcile apparently different points of view, and also about how to use the
possibility of switching points of view to get new insight. As we will see in
Chapter 20, the famous formula $E = mc^2$ follows from an argument of this
type.
For our purposes, the only difference between one point of view and another
is that all velocities change in a simple way when we change points
of view. Namely, they all appear to have a certain constant vector velocity
added (in the sense of adding vectors). This new velocity is just the one that
relates the two different points of view. In going from Fig 6.6 to Fig 7.1, a
horizontal velocity is added, corresponding to the horizontal velocity of the
railway carriage.
We could imagine viewing Fig 6.6 from other points of view as well, say
from the point of view of someone moving smoothly up. Then before being
released, the object in Fig 6.6 is moving smoothly down, and it continues to
do this after being released, while the falling motion is superimposed on it.
This leads to the more general equation for motion in the y direction (with
positive y down)
$$y = v_{0y}\,t + \frac{1}{2} g t^2 \tag{7.6}$$
where $v_{0y}$ is the additional constant velocity due to the change in point of
view. This generalizes Eq (7.2) to the case of something that is not thrown
horizontally, but is projected downwards. Taking $v_{0y}$ negative, we get the
case of something projected upwards. Comparing with Eq (6.19) we see that
the y-component of velocity in the new point of view would not be just gt
but
$$v_y = v_{0y} + gt \tag{7.7}$$
The constant velocity $v_{0y}$ has been added to express how the object moves
from the new point of view.
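The change in point of view can be checked numerically. In the sketch below, an observer moves smoothly upward at an assumed speed u, and the falling object's position relative to that observer is compared with Eq (7.6), with positive y down. The value of g and all function names are assumptions made for this illustration.

```python
G = 9.8  # m/s^2, assumed value of the acceleration due to gravity

def y_simple_fall(t):
    """Position of an object released from rest (positive y down)."""
    return 0.5 * G * t**2

def observer_position(t, u):
    """Observer moving smoothly up at speed u; up is the negative y direction."""
    return -u * t

def y_seen_by_observer(t, u):
    """Position of the falling object relative to the moving observer."""
    return y_simple_fall(t) - observer_position(t, u)

def y_eq_7_6(t, v0y):
    """Eq (7.6): falling with an added constant velocity v0y."""
    return v0y * t + 0.5 * G * t**2

def vy_eq_7_7(t, v0y):
    """Eq (7.7): the y-component of velocity in the new point of view."""
    return v0y + G * t

u = 2.0                          # hypothetical observer speed, m/s
times = [0.0, 0.5, 1.0, 1.5]
seen = [y_seen_by_observer(t, u) for t in times]
predicted = [y_eq_7_6(t, u) for t in times]
```

The two lists agree, which is just the statement that the moving observer sees simple falling with the constant velocity $v_{0y} = u$ superimposed on it.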
The Principle of Relativity says that anything that happens, like simple
falling, is really just one representative of a whole family of things that could
happen. One gets other members of the family by imagining how the first
one would look if it were viewed from a point of view moving at constant
velocity with respect to the initial point of view. Since all such points of view
are equally good, all members of the family are things that would actually
happen. Fig 6.6 and Fig 7.1 are members of the same family, related just by
a shift in point of view. That is the sense in which they are really the same.
7.6 Falling and Impulse
In one dimension, a good way to think of how a (constant) force F changes
velocity v of a mass m over some time $\Delta t$ is to compute the impulse
$I = F\,\Delta t$. This quantity takes into account not just the force F but
also the time $\Delta t$ over which it acts. Both are important in determining
how it changes velocity. One can think of impulse I as being a kind of kick
administered to a mass m. The last thing to know is that the response to this
kick is determined by m, the mass. More massive things (having more inertia m)
respond less, with the result
$$\Delta v = \frac{I}{m} = \frac{F\,\Delta t}{m} \tag{7.8}$$
for the change in velocity $\Delta v$. Since both force F and velocity v have a
direction, the kick is in the direction of F, and the change in velocity
$\Delta v$ is in that same direction.
In the case of gravity, the force on m is F = W = mg down (just the
weight). Thus in a time $\Delta t$, the impulse is $I = mg\,\Delta t$. Dividing
by m, the change in velocity is $\Delta v = I/m = g\,\Delta t$ (down). This is
just the same as Eq (6.19) (where $\Delta t$, the elapsed time, is simply called
t). The response $\Delta v$ is the same for all masses m because the force W,
and hence the impulse I, is proportional to m, but then to find the response we
divide by m.
More generally, in two dimensions, force $\vec{F}$ and velocity $\vec{v}$ are
vectors, and thus impulse $\vec{I}$ is also a vector. Mass m is a scalar, but the
acceleration due to gravity $\vec{g}$ is a vector. The vectors all have a
direction. Velocity certainly has a direction, and force does also. Weight
$\vec{W} = m\vec{g}$, the force due to gravity, is down, for example, so
$\vec{g}$ is a vector down. If we take the down direction to be positive, then
we could say
$$\vec{W} = (0, mg) \tag{7.9}$$
expressing the vector in terms of its horizontal and vertical components. The
impulse due to this force over time $\Delta t$ is then
$$\vec{I} = \vec{W}\,\Delta t = (0, mg\,\Delta t) \tag{7.10}$$
and the change in velocity is $\vec{I}/m$, that is,
$$\Delta\vec{v} = (0, g\,\Delta t) . \tag{7.11}$$
That is, the y-component of velocity changes in time, but the x-component
does not change, since there is no x-component of force. If you compare
the beginning of this section with the end, you will see that the two parts
are basically saying the same thing twice. The first part treats the falling
problem as 1-dimensional, considering only the vertical direction. The second
part, when we switch to vectors, keeps track of two dimensions, and in
particular points out that the horizontal component of velocity is constant
(zero change).
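The impulse argument of this section can be sketched in Python. The code below applies the vector form of the impulse law to the weight of several masses and shows that the resulting change in velocity is the same for all of them. The value of g, the masses, and the function names are illustrative assumptions.

```python
G = 9.8  # m/s^2, assumed; down is taken as the positive y direction

def weight(m, g=G):
    """Weight vector (horizontal, vertical) of mass m, as in Eq (7.9)."""
    return (0.0, m * g)

def delta_v(force, dt, m):
    """Change in velocity: impulse (force times dt) divided by mass, per component."""
    return tuple(f * dt / m for f in force)

dt = 2.0  # seconds
changes = [delta_v(weight(m), dt, m) for m in (0.1, 1.0, 10.0)]
```

Each entry of `changes` is (0, g dt) up to rounding: the horizontal component does not change, and the mass cancels out of the vertical component.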
7.7 More on Projectile Motion
If, instead of taking the down direction to be positive, we take the up direction
to be positive, then the projection of the force due to gravity,
$\vec{W} = m\vec{g}$, on this axis, being down, is $-mg$, and the formulae of
projectile motion look slightly different, g being replaced by $-g$. We collect
these formulae here, in slightly more general form,
$$v_x = v_{0x} \quad \text{(constant)} \tag{7.12}$$
$$v_y = v_{0y} - gt \tag{7.13}$$
$$x = x_0 + v_{0x} t \tag{7.14}$$
$$y = y_0 + v_{0y} t - gt^2/2 \tag{7.15}$$
and say what they mean. The horizontal component of velocity $v_x$ is constant.
The subscript 0, as in $v_{0x}$, will always indicate a constant quantity. In
$v_y$, for instance, $v_{0y}$ is the value of $v_y$ at the initial time t = 0,
just some number, indicating by its sign whether the projectile was going up or
down at that moment. The velocity $v_y$ itself is not constant, but continually
decreasing due to the force of gravity, according to the impulse theory of the
preceding section. We see this in the term $-gt$ in the formula for $v_y$. The
x coordinate changes linearly in time, and its value at t = 0 is the quantity
$x_0$. Finally the y coordinate shows the usual falling behavior, but starting
at $y_0$ at time t = 0, not necessarily at y = 0.
If we know where the projectile starts, i.e., the position $(x_0, y_0)$, and its
initial velocity $(v_{0x}, v_{0y})$, then Eqs (7.12)-(7.15) tell exactly how the
projectile moves. Frequently, though, these things will not be given explicitly.
Someone may ask how high a projectile rises above its starting point, for
example. This is asking for the difference $y - y_0$ when y has its maximum
value. It would be enough to take $y_0 = 0$ and simply find the maximum value
of y. Or one may ask for the distance a projectile goes from its starting point
before hitting the ground again. In this case one may take $x_0 = 0$ and find x
at the time the projectile hits. Or one may be told the initial speed $v_0$ of
the projectile and the angle $\theta$ above the horizontal with which it is
projected. In this case one must find the components of the initial velocity
$\vec{v}_0$ from a picture.
Fig. 7.6: The components of the initial velocity vector are found by trigonometry.
Since the speed $v_0 = |\vec{v}_0|$ is the length of the hypotenuse of the right
triangle in Fig 7.6, we have
$$v_{0x} = v_0 \cos\theta \tag{7.16}$$
$$v_{0y} = v_0 \sin\theta \tag{7.17}$$
Now we may ask how long a time a projectile rises before beginning to
fall back. This is just the time from t = 0, when it is launched, to the time
when $v_y = 0$. This occurs (solving for t in Eq (7.13)) at time
$t = v_{0y}/g$. Thus it rises for a time $v_{0y}/g$. We check dimensions and
common sense: it is dimensionally a time, it increases with $v_{0y}$ and
decreases with g.
How high does a projectile go? This is just $y - y_0$ at time $v_{0y}/g$,
namely $v_{0y}^2/(2g)$, as we also found in Eq (6.17), where the problem was
considered in one dimension. We could also do this problem using energy
methods. The constant energy is
$$E = mgy + (m/2)(v_{0x}^2 + v_y^2) \tag{7.18}$$
Initially $y = y_0$ and $v_y = v_{0y}$. At the highest point y is unknown and
$v_y = 0$. Thus
$$mgy_0 + (m/2)(v_{0x}^2 + v_{0y}^2) = mgy + (m/2)v_{0x}^2 \tag{7.19}$$
Solving for $y - y_0$ we find once again $v_{0y}^2/(2g)$.
How far does a projectile go before hitting the ground again, if it is
launched over level ground? The time to fall back to earth is the same as
the time to rise, so the total time is now $2v_{0y}/g$, and the horizontal
distance is $x - x_0 = 2v_{0x}v_{0y}/g$. Again we check dimensions and common
sense, as well as some special cases. Suppose $v_{0y} = 0$, corresponding to a
horizontal launch. Does it make sense that the total distance is then zero?
Suppose $v_{0x} = 0$. Does this case make sense?
Many projectile problems are variations on these.
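The formulae of this section can be collected into a small Python sketch (up positive). The launch speed, launch angle, value of g, and function names are all assumptions chosen for illustration; the code simply evaluates Eqs (7.12)-(7.17) and the derived results for rise time, maximum height, and range over level ground.

```python
import math

G = 9.8  # m/s^2, assumed value of the acceleration due to gravity

def launch_components(v0, theta):
    """Components of the initial velocity, Eqs (7.16) and (7.17); theta in radians."""
    return v0 * math.cos(theta), v0 * math.sin(theta)

def position(t, x0, y0, v0x, v0y, g=G):
    """Position at time t, Eqs (7.14) and (7.15)."""
    x = x0 + v0x * t
    y = y0 + v0y * t - 0.5 * g * t**2
    return x, y

def time_to_rise(v0y, g=G):
    """Time at which v_y = 0 in Eq (7.13)."""
    return v0y / g

def max_height(v0y, g=G):
    """Rise above the starting point, y - y0 at the top."""
    return v0y**2 / (2.0 * g)

def level_range(v0x, v0y, g=G):
    """Horizontal distance x - x0 covered before returning to launch height."""
    return 2.0 * v0x * v0y / g

# Hypothetical launch: 20 m/s at 30 degrees above the horizontal.
v0x, v0y = launch_components(20.0, math.pi / 6)
t_top = time_to_rise(v0y)
x_top, y_top = position(t_top, 0.0, 0.0, v0x, v0y)
x_land, y_land = position(2.0 * t_top, 0.0, 0.0, v0x, v0y)
```

Evaluating the position at the rise time reproduces the maximum height, and evaluating it at twice the rise time gives y = 0 and the range, which is a consistency check on the formulae.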
7.8 Impulse and Conservation of Momentum
The impulse law Eq (7.8) for the way velocity changes due to a force is the
closest we will come to stating Newton's 2nd Law of motion, the fundamental
law of Newtonian mechanics. To repeat, it says that velocity changes in
response to a force acting through time, and the response of a mass m is
inversely proportional to mass, i.e., larger masses respond with less change
in their velocity. This law has a particularly simple meaning if it describes
two masses $m_1$ and $m_2$ exerting forces on each other, with all other forces,
due to other masses, negligible. In this case the only force on $m_1$ is
$\vec{F}_{12}$, the force on 1 by 2, and the only force on $m_2$ is
$\vec{F}_{21}$, the force on 2 by 1. Furthermore, by Newton's 3rd Law,
$\vec{F}_{12} = -\vec{F}_{21}$. We are now representing the forces as vectors.
The impulse law says that the change in velocity of the masses will be
$$\Delta\vec{v}_1 = \frac{\vec{F}_{12}\,\Delta t}{m_1} \tag{7.20}$$
$$\Delta\vec{v}_2 = \frac{\vec{F}_{21}\,\Delta t}{m_2} \tag{7.21}$$
and therefore, multiplying in the first equation by $m_1$ and in the second by
$m_2$, we see
$$m_1\,\Delta\vec{v}_1 = -m_2\,\Delta\vec{v}_2 \tag{7.22}$$
The quantity $m\vec{v}$, called momentum, has appeared as a consequence of seeing
how the two masses interact. What one loses in $m\vec{v}$ the other gains. Thus
the total momentum, defined as
$$\text{Total momentum} = m_1\vec{v}_1 + m_2\vec{v}_2 \tag{7.23}$$
doesn't change. This is called the law of conservation of momentum. Notice
that momentum is a vector. For a single mass it has the direction of the
velocity vector of that mass, and for a system of masses it is a vector sum.
We should notice all the conditions that apply to our derivation of this
law. In the first place, the impulse method as we have given it is only for
a constant force. Second, the interaction we described was between two
particles only, with no forces from any other particles. (This might be a
good description in a collision where two things exert much larger forces on
each other than anything else does.) It is possible to relax these conditions
considerably, and conservation of momentum holds much more generally than
this short description would suggest.
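The derivation can be illustrated with a short Python sketch: two masses exert equal and opposite constant forces on each other for a time dt, each velocity changes according to the impulse law, and the total momentum comes out unchanged. The masses, velocities, force, and function names are hypothetical values chosen for the example.

```python
def kick(m, v, force, dt):
    """New velocity after a constant force acts for time dt (impulse law, per component)."""
    return tuple(vi + fi * dt / m for vi, fi in zip(v, force))

def total_momentum(masses, velocities):
    """Vector sum m1*v1 + m2*v2 + ... as in Eq (7.23)."""
    n = len(velocities[0])
    return tuple(sum(m * v[i] for m, v in zip(masses, velocities)) for i in range(n))

m1, m2 = 2.0, 0.5                 # kg
v1, v2 = (1.0, 0.0), (-3.0, 2.0)  # m/s
f12 = (-4.0, 1.0)                 # force on 1 by 2, in newtons
f21 = tuple(-f for f in f12)      # Newton's 3rd Law: force on 2 by 1
dt = 0.25                         # seconds

v1_new = kick(m1, v1, f12, dt)
v2_new = kick(m2, v2, f21, dt)

p_before = total_momentum((m1, m2), (v1, v2))
p_after = total_momentum((m1, m2), (v1_new, v2_new))
```

Both individual velocities change, but `p_before` and `p_after` agree component by component, as conservation of momentum requires.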
7.9 Impulse and Circular Motion
When something of mass m moves at constant speed in a circle, $\vec{v}$, its
velocity vector, is always changing direction (turning). Even though the speed
is constant, the vector is changing, and the change $\Delta\vec{v}$ in a short
time $\Delta t$ is just the impulse $\vec{F}\,\Delta t$ divided by the mass m.
It takes a little thought to make this intuitive, but the force $\vec{F}$ and
the resulting impulse, as well as the change $\Delta\vec{v}$, are in this case
perpendicular to the velocity, towards the center of the circle. The velocity
$\vec{v}$ itself is along the circle. Figure 7.7 shows how a change toward the
center can turn $\vec{v}$ in the way that it actually does turn.
If we think just about the magnitudes, the speed v is $\omega R$, where
$\omega$ is the angular velocity and R is the radius of the circle (recall
Eq 4.11). In time $\Delta t$ the mass m moves through angle $\omega\,\Delta t$,
but from Fig 7.7, in small angle approximation, this angle is also
$(\Delta v)/v$. Thus
$$\frac{\Delta v}{v} = \omega\,\Delta t = \frac{v}{R}\,\Delta t \tag{7.24}$$
Fig. 7.7: When an object moves in a circle of radius R at constant speed, its velocity is
continually changing in the centripetal direction, toward the center. The vector addition,
adding $\vec{v} + \Delta\vec{v}$ to produce the later $\vec{v}$, is shown at the right. Strictly
speaking, the figure must be understood in small angle approximation only.
and therefore
$$\Delta v = \frac{v^2}{R}\,\Delta t \tag{7.25}$$
We see from the right side of Eq 7.25 that the impulse over this short time
must be $(mv^2/R)\,\Delta t$ and hence that the force causing the mass m to move
in a circle is centripetal, with magnitude
$$|\vec{F}_{\text{centripetal}}| = \frac{mv^2}{R} \tag{7.26}$$
Note that this says nothing about what the force is that makes m move in
a circle. It only says that, whatever it is, it must have this magnitude to be
consistent with the observed v and R. To make use of this idea, we should
think of examples where we know something else about the force.
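The small angle argument leading to Eq 7.25 can be tested numerically. The Python sketch below writes down the velocity vector of uniform circular motion at two nearby times, subtracts, and compares the size of the change with $(v^2/R)\,\Delta t$. The speed, radius, time step, and the choice of starting point on the circle are arbitrary assumptions.

```python
import math

def velocity_on_circle(v, R, t):
    """Velocity of a mass circling the origin counterclockwise at speed v.

    The mass is assumed to start at position (R, 0) at t = 0.
    """
    omega = v / R                      # angular velocity, since v = omega * R
    return (-v * math.sin(omega * t), v * math.cos(omega * t))

v, R, dt = 3.0, 2.0, 1.0e-4            # m/s, m, s (hypothetical values)

v_now = velocity_on_circle(v, R, 0.0)
v_later = velocity_on_circle(v, R, dt)

dv = (v_later[0] - v_now[0], v_later[1] - v_now[1])
dv_magnitude = math.hypot(dv[0], dv[1])
predicted = v**2 / R * dt              # Eq 7.25
relative_error = abs(dv_magnitude - predicted) / predicted
```

At t = 0 the mass sits at (R, 0), so the center lies in the negative x direction. The computed change `dv` points almost exactly that way, and its magnitude matches the prediction to within the accuracy of the small angle approximation.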
Here is a famous example that makes use of these ideas of centripetal
force and circular motion. A planet with mass $m_P$ in circular orbit of radius
R about the Sun is subject to the attractive gravitational force
$Gm_P m_S/R^2$, where $m_S$ is the mass of the Sun. Since this is just the
centripetal force that causes it to move in that orbit, it must be that the
speed v of the planet in its orbit is such that the magnitude comes out right,
namely
$$\frac{Gm_P m_S}{R^2} = \frac{m_P v^2}{R} \tag{7.27}$$
Multiplying through by R/2, we can express this relationship in terms of
energies,
$$\frac{Gm_P m_S}{2R} = \frac{1}{2} m_P v^2 . \tag{7.28}$$
That is, the kinetic energy of the planet is just half its potential energy
$Gm_P m_S/R$, in magnitude. We had met this way of looking at it as the
Virial Theorem, Eq 6.27. Thus, solving for v in either of the above equations,
we have two equivalent ways to know the speed of planets in their orbits.
(This relationship implies Kepler's Third Law, relating the period T to the
radius R of the orbit, Problem 7.12.)
Problems
Projectile Motion
7.1 A marble is batted horizontally off a table at a speed of 3 m/s. The
table is 1 m high. How far from the table will the marble hit the floor?
7.2 For a fireworks display, the rockets are to be fired at an angle of 70°
above the horizontal, and they should reach a height of 100 meters. What
should be the speed of the rockets as they are launched? (Assume that they
are given this speed almost instantaneously, and just coast upward after
that).
7.3 (a) With what speed should a projectile be launched if it is to carry
1 km over level ground? Assume that it is launched at an angle $\theta = \pi/4$.
(b) How high will this projectile go?
7.4 A projectile is launched at an angle $\theta = \pi/3$ above the horizontal. It
takes 2 seconds to reach its zenith.
(a) How high does it go?
(b) How far does it go horizontally before hitting the ground?
7.5 A 1 kg mass is thrown upward at an angle $\theta = \pi/4$ to the horizontal
with a speed of 10 m/s. What is the minimum kinetic energy K that it has
in its flight?
Velocity and Speed
7.6 Fig 7.8 shows mutually perpendicular x, y, and z axes, and a vector
$\vec{v}$, with its projections $v_x$, $v_y$, and $v_z$ onto the axes. Use the
figure to justify Eq (7.4) for the length of the vector $\vec{v}$.
Fig. 7.8: The velocity vector $\vec{v}$ is projected onto three mutually perpendicular axes.
7.7 (a) Use the Law of Cosines in case $\theta = 0$ to find the speed
corresponding to the sum of two collinear velocities $\vec{v}_1$ and
$\vec{v}_2$. Also draw the corresponding picture.
(b) Repeat in case $\theta = \pi$.
Impulse
7.8 The gravitational force is down, so the impulse it gives to any mass m
is also down, and hence it changes only the vertical component of velocity
$v_y$, leaving the horizontal component $v_x$ alone. Suppose the velocity vector
of a mass m has vertical component $v_{0y}$ and horizontal component $v_{0x}$
initially, and let the gravitational force mg act for a time $\Delta t$,
changing the vertical component.
(a) Find the impulse delivered to m in time $\Delta t$.
(b) Find $v_y$ after the time $\Delta t$.
(c) Find $v_x$ after the time $\Delta t$.
(d) Check common sense: are your answers correct in the special case
$\Delta t = 0$?
(e) Initially the kinetic energy was $K = \frac{m}{2}(v_{0x}^2 + v_{0y}^2)$. What
is the kinetic energy after the time $\Delta t$?
(f) How much does K change in the time $\Delta t$? Can you interpret this result?
7.9 (a) A superball, of mass 0.2 kg, is dropped from a height of 1 m. With
what speed does it hit the floor?
(b) Suppose it rebounds upward with the same speed. What is the change
in velocity? (hint: not zero!)
(c) Suppose the collision with the floor lasts for the short time $\Delta t = 0.01$ s.
What force must have been acting over this time to account for the change
in velocity? Note: this force has nothing to do with the ball's weight!
Momentum and Relativity
7.10 Two equal masses m colliding and bouncing away from the collision
might behave as shown in Fig 7.9. Since one mass has velocity $v$ and the
other has velocity $-v$, the total momentum is $mv + m(-v) = 0$. If the only
forces on these masses are the forces $F_{12}$ and $F_{21} = -F_{12}$ that they
exert on each other, the total momentum should be conserved in the collision,
and we see that it is: after the collision the momentum is $m(-v) + mv = 0$,
the same as before.
Now imagine how this collision would appear to an observer moving with
velocity v, i.e., with the mass on the left initially, who therefore sees this mass
initially at rest, and not moving. How would the other mass look, and how
would the situation look after the collision, to an observer who always moves
smoothly to the right with speed v? Draw pictures showing the collision from
this point of view. Show that momentum is also conserved according to this
observer (as it must be, according to the Principle of Relativity).
[Figure: before the collision, the two masses m have velocities v and -v; after it, -v and v.]
Fig. 7.9: Two equal masses approach each other with equal speed, collide, and bounce away
with the same speeds. How would this collision look from a moving point of view?
7.11 The previous problem described an elastic collision, so-called because
the kinetic energy is the same before and after (no energy lost). Consider the
situation of a perfectly inelastic collision shown in Fig 7.10. Answer the questions
of the previous problem in this case, and also show that although the
kinetic energy before and after the collision is different for the two observers,
the change in kinetic energy (the energy lost in the collision) is something
they agree on.
[Figure: before the collision, the two masses m have velocities v and -v; after it, they are stuck together.]
Fig. 7.10: In a perfectly inelastic collision, the two masses shown colliding stick together.
Clearly the kinetic energy of the system goes down. How would this collision look from a
moving point of view?
Impulse and Circular Motion
7.12 (a) Kepler's Third Law says that the period T and radius R of planetary
orbits about the Sun are related by $T^2 = kR^3$ for some constant k. Show
that Kepler's Third Law follows from Newton's theory of universal gravitation,
and determine the constant k in terms of G, Newton's gravitational
constant, and $M_S$, the mass of the Sun.
(b) Use the fact that the Earth, with an orbital radius $R \approx 1.5 \times 10^{11}$ m,
orbits the Sun in 1 year to determine the mass of the Sun. You will need to
know $G \approx 6.67 \times 10^{-11}$ N m$^2$/kg$^2$.
7.13 (a) Suppose you whirl a mass m in a horizontal circle on the end
of a 50 cm cord. If m = 0.1 kg and makes 1 revolution per second, what
is the horizontal component of tension in the cord? (Only the horizontal
component is centripetal).
(b) The cord in part (a) pulls along its length, which means the tension
force also has an upward, or vertical, component. What is the vertical com-
ponent of tension in the cord? (Hint: something is supporting the mass,
keeping it from falling.)
(c) What angle does the cord make with the vertical as the mass m whirls
around?
Chapter 8
Density and Fluids
King Hiero of Syracuse commissioned a crown of gold, but when it was deliv-
ered he suspected the goldsmith of cheating him. The crown had the correct
weight of the gold he had given, but what if the goldsmith had kept some of
the gold and made up the weight with less precious silver? He told his friend
Archimedes of his suspicions. The most famous story of Archimedes tells
how he detected the forgery of the crown. (This is the one where he jumps
out of the bath shouting "Eureka!" and runs naked through the streets.)
The 1st century Roman author Vitruvius tells this story. According to him,
Archimedes' great insight came when he stepped into the bath and noticed
how the displaced water overflowed, giving a way to measure volume, and
hence density. If the crown were not pure gold, it would not have the density
of gold, and thus the forgery could be proven.
8.1 Mass Density
Weight is proportional to volume for a pure substance, like gold. If you have
twice as much gold (by volume), it weighs twice as much. Since weight is
also proportional to the more fundamental quantity mass, we can also say
mass M is proportional to volume V , that is,
$$M \propto V \tag{8.1}$$
The constant of proportionality is called mass density, and is often given the
Greek letter $\rho$ (rho), so that
$$M = \rho V \tag{8.2}$$
The dimension of $\rho$ is $[\rho] = [ML^{-3}]$. A mass density has to be
multiplied by a volume to become a mass.
Mass density is an example of a material property. It is a characteristic
of the material, and can be measured and tabulated for future use. It is
common to give this property in the cgs unit g/cm$^3$. In SI units (kg/m$^3$)
the value is larger by a factor 1000. Below are the densities of a few familiar
materials:
Substance    Mass Density (g/cm$^3$)
Water         1.0
Aluminum      2.7
Iron          7.9
Copper        9.0
Silver       10.5
Gold         19.3
Mercury      13.5
Lead         11.4
The least imaginative way to measure the density of a substance is to
take a known mass M, to measure its volume V , perhaps by measuring
the volume of water it displaces, and then to take the ratio $\rho = M/V$.
According to Vitruvius, this is what Archimedes did. Later readers have
suspected that in this version the story has been dumbed down to the level
of Roman comprehension. One internal piece of evidence: Vitruvius actually
does not even seem to understand the concept of ratio. Rather, according to
him, Archimedes took a lump of gold of the same weight as the crown, and
compared the water it displaced with the water the crown displaced. The
crown displaced more water, being less dense than gold.
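The comparison Vitruvius describes can be sketched with the densities tabulated above. The Python code below compares the water displaced by a lump of pure gold with that displaced by a crown of the same mass in which some gold has been replaced by silver. The crown mass, the 30 percent silver fraction, and the assumption that the volumes of the two metals simply add are all hypothetical, made only for illustration.

```python
RHO_GOLD = 19.3     # g/cm^3, from the table of densities
RHO_SILVER = 10.5   # g/cm^3, from the table of densities

def displaced_volume(mass_gold, mass_silver):
    """Volume in cm^3 of an object made of the given masses (grams) of gold and silver.

    Assumes the volumes of the two metals simply add.
    """
    return mass_gold / RHO_GOLD + mass_silver / RHO_SILVER

crown_mass = 1000.0  # grams, hypothetical

pure_gold_volume = displaced_volume(crown_mass, 0.0)
adulterated_volume = displaced_volume(0.7 * crown_mass, 0.3 * crown_mass)

# Mean density of the adulterated crown, as the ratio M/V.
mean_density = crown_mass / adulterated_volume
```

The adulterated crown displaces more water than the lump of gold of the same mass, and its mean density falls between those of silver and gold.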
Galileo noticed that the "rather childish" reasoning in Vitruvius' story
was not really worthy of Archimedes ("that godlike man"), especially when
one considers that Archimedes' own theories suggest a much more sensitive
method. In an unpublished essay written at about the age of 20, Galileo
suggests that what Archimedes actually did was much more interesting than the
Vitruvius story. To understand the idea, we have to know about Archimedes'
Principle.
8.2 Archimedes' Principle
If a substance is less dense than water, it floats, and if it is more dense,
it sinks. This is just one consequence of Archimedes' Principle, which says
that in equilibrium a substance immersed (or partially immersed) in a fluid
is buoyed up by a force equal to the weight of the fluid it displaces.
This insight is simple, general, applicable, and useful. You would think
that something proved thousands of years ago could be improved on now,
but really, Archimedes' Principle is perfect as it is. There is nothing to
add. This is all the more amazing when you look at Archimedes' proof,
which is so simple that you almost think there must be something wrong
with it. Somewhat surprisingly, the fluid is taken to have a spherical surface,
with the center of the Earth as center. In most applications we would be
looking at such a small volume of water that its surface would look flat, but
Archimedes is correct that strictly speaking the equilibrium surface is curved.
This assumption does not play an essential role in the argument, but it is
necessary to know about it in order to understand Archimedes' diagrams.
Fig. 8.1: An object floating on the left side displaces a volume of water equal to the
symmetrically located volume on the right. Since the system is in equilibrium, the two
weights must be the same. The volume ABCD might be very small, despite appearances,
just water in a pail, for example.
Archimedes' only postulate in his book On Floating Bodies is that "in
a fluid that part which is thrust the less is driven along by that which is
thrust the more." Thus in equilibrium the thrust must balance. In Fig 8.1
the only difference between the left side and the right side is that a certain
volume of water on the right has been replaced by an object on the left which
also protrudes out of the water. These must weigh the same amount, or one
side will be "thrust the more." Thus the weight of the water displaced is the
same as the weight of the object, and since the object is in equilibrium, it is
buoyed up by a force equal to its weight. Thus the object is buoyed up by
the weight of the water displaced.
This law of buoyancy is also shown to hold in more general situations.
Suppose you push the floating object under water and hold it there with a
force F. The thrust on the left will be the weight of the object $W_o$ plus your
force F. It is balanced by the weight $W_w$ of the symmetrically located volume
of water, corresponding to the water displaced by the object, which is now
more than before, since the whole object is immersed. Since $F + W_o = W_w$
in equilibrium, you must push down with a force
$$F = W_w - W_o \tag{8.3}$$
Similarly, if an object does not float, but sinks, you could keep it from
sinking by exerting a force F up such that $F + W_w = W_o$, and therefore
$$F = W_o - W_w \tag{8.4}$$
You don't need to support the whole weight $W_o$ with your force F, but only
the excess of that weight over the weight of the displaced water. It is as if
the object had lost as much weight as the weight of the displaced water.
In both these examples the weight $W_w$ of the water displaced acts like a
force up, a buoyant force. In a free body diagram for an immersed object we
should include it as a force $F_b$ (b for buoyant) due to the adjoining fluid, as
in Fig 8.2. From the diagram we can read off the net force on the object,
meaning the sum of both forces. If the object has volume V and density
$\rho_o$, then its mass is $\rho_o V$, and its weight is $W_o = \rho_o V g$.
Suppose it is completely immersed. Then the net force is
$$F_{\text{net}} = W_o + F_b = W_o - W_w = (\rho_o - \rho_w) V g \tag{8.5}$$
where $\rho_w$ is the density of water. Note that the volume of displaced water is
also V if the object is completely immersed, so the weight of the displaced
water is $\rho_w V g$. The object is in equilibrium if these forces balance, i.e.,
$F_{\text{net}} = 0$, which happens if $\rho_o = \rho_w$. If $\rho_o > \rho_w$,
then $F_{\text{net}} > 0$ and the object sinks. If $\rho_o < \rho_w$, then
$F_{\text{net}} < 0$ and the object rises. (We implicitly took
the down direction to be positive when we represented weight by a positive
quantity and represented $F_b$ by the negative of a weight.) The behavior is
entirely controlled by the relative densities.
Fig. 8.2: An immersed object is subject to two forces, its weight $W_o$, and the buoyant force
$F_b$ due to the adjoining fluid, equal in magnitude to the weight of the fluid displaced. It is
called $-W_w$ here, since weight is a force down, but the buoyancy force is up.
The ratio $\rho_o/\rho_w$ is called the specific gravity of the object. It is just
density measured in units of the density of water. Specific gravity greater
than 1 implies the object will sink, and less than 1 implies the object will
float. The values in the table of the previous section are specific gravities,
since they are given in units where $\rho_w = 1$.
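Eq (8.5) and the specific gravity criterion are easy to try out numerically. The Python sketch below uses SI units, with down taken as positive as in the text. The value of g, the object volume, the density of the light object, and the function names are assumptions for illustration; the iron density is the tabulated value converted to kg/m$^3$.

```python
G = 9.8             # m/s^2, assumed value of the acceleration due to gravity
RHO_WATER = 1000.0  # kg/m^3, i.e. 1.0 g/cm^3

def net_force_down(rho_object, volume, rho_fluid=RHO_WATER, g=G):
    """Net force on a completely immersed object, Eq (8.5); positive means down."""
    return (rho_object - rho_fluid) * volume * g

def specific_gravity(rho_object, rho_fluid=RHO_WATER):
    """Density measured in units of the density of water."""
    return rho_object / rho_fluid

def behavior(rho_object, rho_fluid=RHO_WATER):
    """Whether a completely immersed object sinks, rises, or stays in equilibrium."""
    sg = specific_gravity(rho_object, rho_fluid)
    if sg > 1.0:
        return "sinks"
    if sg < 1.0:
        return "rises"
    return "equilibrium"

volume = 1.0e-3         # m^3 (one liter), hypothetical
rho_iron = 7900.0       # kg/m^3, from the table (7.9 g/cm^3)
rho_light = 600.0       # kg/m^3, hypothetical light material

force_on_iron = net_force_down(rho_iron, volume)
force_on_light = net_force_down(rho_light, volume)
```

The sign of the net force carries the same information as the specific gravity: positive (down) for the iron object, negative (up) for the light one.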
The weight $W_o$ in Fig 8.2 is shown acting at the center of gravity of the
object (we are assuming that the object is the same density everywhere, so
that it would balance at its geometrical center). The buoyant force $F_b$ is
shown acting at what would be the center of gravity of the fluid volume,
if the object were not there. The fluid is homogeneous, so this is the same
geometrical center. That is, both forces act at the same place, and the torques
about this point automatically balance. There is no unbalanced torque, and
hence no tendency for the object to turn, even if it sinks or is buoyed up. In
principle, though, these two forces act at two different points, and it is only
the simplicity of this object that makes the two points coincide in this case.
=1 =2
F
b
W
o
Fig. 8.3: A symmetrically shaped object is denser at one end than at the other. When
it is immersed there is an unbalanced torque about the center, because the buoyant force
operates where the (homogeneous) uid volume would balance if the object werent there,
namely at the center. (The density is given in units of the left hand density. The actual
value is irrelevant.)
In Fig 8.3 we imagine an immersed object that has a symmetrical shape
but an asymmetrical distribution of mass (denser on the right than on the
left). Again the weight acts at the center of gravity, where the object would
balance, and the buoyant force acts where the corresponding fluid volume
would balance, but now the two points are different, and there is a net
torque. Whether this object is buoyant and rises, or is denser and sinks, it
will twist in the clockwise direction.
There would be no unbalanced torque in Fig 8.3 if the object were aligned
vertically instead of horizontally. This is true whether the center of gravity
is directly above the geometrical center or below it. In the first case, though,
the equilibrium position is unstable, because as soon as the object tips a little
bit, as in Fig 8.4 on the left, there is an unbalanced torque that tends to turn
it in the direction it has tipped. Any accidental tipping gets amplified. If the
center of gravity is below the geometrical center, as in Fig 8.4 on the right,
the unbalanced torque tends to restore the (rotational) equilibrium. This
is the stable alignment. One could imagine the object seeking to minimize
its gravitational potential energy. If the denser part is on top, it can lower
its potential energy by turning over. If the denser part is on the bottom, it
tends to stay there.
Fig. 8.4: The object in Fig 8.3 is imagined to be immersed at an angle. If the denser part
is above, the unbalanced torque tends to turn the object over. If the denser part is below,
the unbalanced torque tends to keep it below. Thus the left side is close to the unstable
vertical alignment, and the right side is close to the stable vertical alignment.
One obvious application of these ideas is to the stability of ships. The
buoyant force that holds them up is applied at the center of the displaced
fluid, which is somewhere below the water line. In a careless design, the
center of gravity of the ship might very well be above this, especially if the
ship has an elaborate superstructure. Such a ship would be unstable. To
solve this problem, old sailing ships carried a permanent load of rocks in
their hold for ballast. This brought the center of gravity low enough for
stability. Modern racing yachts have a heavy keel for the same reason.
A familiar example of this phenomenon is a floating cube. Since the cube
protrudes out of the water, it is clear that a symmetrically floating cube
would have its center of gravity above the point where the buoyant force acts.
This alignment is unstable. As a result, the cube tips and floats at an odd
angle. We will follow historical treatments in talking about rectangular solids
floating in a geometrically simple way, as if they were somehow forbidden to
tip, but when you think about it, you realize that tipping makes the problem
much more complicated!
8.3 Galileo's Balance
Fig. 8.5: Galileo's balance: the specific gravity of the crown is $L/x$. (Details of the argument
are given in the text.) The crown in the figure is apparently only a little bit denser than
aluminum!
Galileo argued that Archimedes would have tested the crown of Hiero using a
combination of Archimedes' Principle and the Law of the Lever, as shown in
Fig 8.5. The measurement occurs in two steps. First the crown, with weight
$W_o$, hanging at a distance $L'$ from the fulcrum, is balanced by a counterweight
$W$ at the distance $L$ from the fulcrum.
The balance of torques tells us
$$W_o L' = W L \tag{8.6}$$
Now a pail of water is brought up and the crown is immersed. In effect the
crown loses weight $W_w$, the weight of the displaced water. The counterweight
is moved in by a distance $x$ so that the crown once again balances. The
balance condition is now
$$(W_o - W_w) L' = W (L - x) \tag{8.7}$$
Subtracting Eq (8.7) from (8.6) we have
$$W_w L' = W x \tag{8.8}$$
By Eq (8.6), we have $L' = L(W/W_o)$. Putting this expression for $L'$ into
Eq (8.8), and dividing both sides by $WL$, we have
$$\frac{W_w}{W_o} = \frac{x}{L} \tag{8.9}$$
Since $W_w/W_o = \rho_w/\rho_o$, the specific gravity $\rho_o/\rho_w$ is
$$\frac{\rho_o}{\rho_w} = \frac{L}{x} \tag{8.10}$$
The specific gravity of the crown, which is essentially the unknown density
of the crown, is represented visually by the position $x$ of the counterweight.
In Fig 8.5 the crown has specific gravity a little more than 3, and is certainly
not gold!
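As a small illustration, Eq (8.10) can be turned into a one-line calculation. The sketch below is in Python. The function name and the sample lengths are hypothetical, chosen only to mimic the situation in Fig 8.5.

```python
def specific_gravity_from_balance(L, x):
    """Specific gravity rho_o / rho_w = L / x, Eq (8.10).

    L: distance of the counterweight from the fulcrum with the crown in air.
    x: distance the counterweight is moved in once the crown is immersed.
    Both lengths must be in the same unit, and 0 < x <= L for an object
    that does not float.
    """
    if x <= 0 or x > L:
        raise ValueError("expected 0 < x <= L")
    return L / x


# Hypothetical reading: L = 30 cm and the counterweight moved in by 10 cm.
print(specific_gravity_from_balance(30.0, 10.0))
```

The denser the crown, the smaller the shift $x$. A gold crown, with specific gravity near 19, would move the counterweight only about one nineteenth of $L$.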
For some reason Galileo never published this elegant idea. He was, in
fact, surprisingly secretive, for a man who later became such a public figure.
Looking back, though, we can see in this youthful essay that Archimedes
had, so to speak, come to life again in the work of Galileo.
8.4 Galileo's Proof of Archimedes' Principle
Galileo may have been unsatisfied with Archimedes' argument for Archimedes'
Principle, even while he was convinced that it must be true. He gave another
proof. The idea, translated into modern terms, is that a floating body
in equilibrium, being free to move in any way, actually moves to minimize
the gravitational potential energy (of the whole system). The situation is
pictured in Fig 8.6. The object has height $h$, cross-sectional area $A$, and
therefore volume $V = Ah$, mass density $\rho_o$, and therefore mass $\rho_o A h$. It has
sunk a depth $D$ into the water. As before, we compute the change in gravitational
potential energy in case the object moves. If this can be negative, then
we know it is not at the equilibrium, because it can move to lower its energy.
Fig. 8.6: How much does the gravitational potential energy change if an object moves
vertically by the amount $\Delta y$? The object has height $h$, cross-sectional area $A$, and it has
sunk a depth $D$ into the fluid.
The effect of moving the object down a distance $\Delta y$ in terms of energy is
the same as taking a slice of small thickness $\Delta y$ off the top and putting it
on the bottom, which would lower the energy of the object by an amount
$\rho_o A \Delta y\, g h$. At the same time, though, one must take a slice of water of
the same dimensions and move it up to the surface (where it spreads out;
the level of the water goes up slightly). This raises the potential energy of
the water by $\rho_w A \Delta y\, g D$. The change in the gravitational energy $U_g$ of the
system is therefore
$$\Delta U_g = (\rho_w D - \rho_o h) A \Delta y\, g \tag{8.11}$$
If the quantity in parentheses is not zero, then one can choose the sign of $\Delta y$ to make
$\Delta U_g < 0$, so that the potential energy can be lowered. Thus the equilibrium
condition is $\rho_w D - \rho_o h = 0$, or
$$D = \frac{\rho_o h}{\rho_w} \tag{8.12}$$
This tells us the depth of the object in its equilibrium position. We check
the weight of the displaced water in equilibrium: its volume is $AD$, so its
weight is, using the equilibrium $D$ from Eq (8.12),
$$A D \rho_w g = A \left( \frac{\rho_o h}{\rho_w} \right) \rho_w g = A h \rho_o g \tag{8.13}$$
This is exactly the weight of the object. The condition that the gravitational
energy should be minimized has thus led back to Archimedes' Principle.
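The energy argument can be checked numerically. The following Python sketch evaluates the energy change of Eq (8.11) and the equilibrium depth of Eq (8.12). The function names, g = 9.8 m/s^2, and the ice-like sample block are assumptions made only for this illustration.

```python
def energy_change(dy, D, rho_object, h, area, rho_fluid=1000.0, g=9.8):
    """Change in gravitational potential energy, Eq (8.11).

    dy is a small downward displacement of the object (m), D its current
    depth of immersion (m), h its height (m), area its cross-section (m^2).
    """
    return (rho_fluid * D - rho_object * h) * area * dy * g


def equilibrium_depth(rho_object, h, rho_fluid=1000.0):
    """Depth D at which the energy change vanishes, Eq (8.12)."""
    if rho_object > rho_fluid:
        raise ValueError("object is denser than the fluid and does not float")
    return rho_object * h / rho_fluid


# Hypothetical block: density 900 kg/m^3, height 0.10 m, cross-section 0.5 m^2.
D = equilibrium_depth(900.0, 0.10)
displaced_weight = 0.5 * D * 1000.0 * 9.8   # A D rho_w g
object_weight = 0.5 * 0.10 * 900.0 * 9.8    # A h rho_o g
print(D, displaced_weight, object_weight)
```

If the block sits too high (D smaller than the equilibrium value), a downward step gives a negative energy change, so the block can lower the energy by sinking further. At the equilibrium depth the two weights printed above agree, as in Eq (8.13).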
8.5 Buoyancy and Pressure
The buoyancy force $F_b$ on an object is surely due to the adjoining water,
but Archimedes' proof gives no clue how the water actually exerts this force.
The energy method does not even mention force. We now understand the
force exerted by a fluid in equilibrium in terms of the concept of pressure.
Pressure $P$ is a force per unit area, with dimension $[P] = [ML^{-1}T^{-2}]$. The
SI unit of pressure is the Newton per meter squared (N/m$^2$), also called the
Pascal, abbreviated Pa. Thus if the pressure is 10 Pa, the force on 1 square
meter would be 10 Newtons, and the force on 0.1 square meter would be 1
N. A pressure has to be multiplied by an area to be a force, so it is a kind
of force density. Such a thing is also called a stress. Pressure is called a
normal stress, because the force due to pressure acts normally to the surface
(i.e., in the normal direction, perpendicular to the surface).
Fig. 8.7: The fluid pushes inward on an object of height $h$ and cross-sectional area $A$,
having sunk to a depth $D$. The net force due to the fluid is a force $PA$ up, where $P$ is the
pressure at depth $D$. The horizontal forces balance.
Let us see how this force supports a buoyant object, like the one in Fig 8.6.
We redraw it, showing the normal forces due to pressure in Fig 8.7. On every
part of the surface contacted by the fluid there is a force due to pressure $P$,
directed normally into the surface. The horizontal forces balance, but on
the bottom surface there is a force inward, which is to say up, equal to $PA$,
where $P$ is the pressure at the depth $D$ on the surface $A$. This must be the
buoyant force! We already know its magnitude is the weight $W_w = \rho_w A D g$
of the water displaced, by Archimedes' Principle. Thus the pressure at depth
$D$ is
$$P = \frac{W_w}{A} = \frac{\rho_w A D g}{A} = \rho_w D g \tag{8.14}$$
We notice that this equilibrium pressure, or hydrostatic pressure, as it is also
called, is proportional to depth $D$. This makes sense: as you go down, the
pressure goes up. $P$ is also proportional to $\rho$, in case it is some fluid different
from water. The pressure in liquid mercury, for example, would be about 13
times greater than in water at the same depth.
This same hydrostatic pressure accounts for why the fluid is at rest in
equilibrium. The volume of fluid above an area $A$ is supported by the force
$PA$ (up). That is, the object in Fig 8.7 could be removed, and fluid allowed
into its place. That fluid becomes, in effect, the object, and is buoyed up in
exactly the same way, by the same hydrostatic pressure. Since that volume
would weigh much more if the fluid were mercury, the pressure has to be
correspondingly more in mercury than in water. This gives a very simple
interpretation of hydrostatic pressure: at any depth $D$, it is whatever is
necessary to hold up the weight of the fluid above. For a fluid of density $\rho_f$
the pressure at depth $D$ is
$$P = \rho_f D g \tag{8.15}$$
Fig. 8.8: An object completely immersed in a fluid is buoyed up by the weight of the water
displaced, because the pressure below is greater than the pressure above.
If we know how pressure translates into force on a surface, Eq (8.15)
implies Archimedes' Principle. In Fig 8.8 we consider an object completely
immersed. The horizontal pressure forces balance, but the pressure on the
bottom (depth $D_2$) is greater than the pressure on the top (depth $D_1$). Using
the hydrostatic pressure from Eq (8.15) we have the buoyant force
$$F_b = \rho_f D_2 g A - \rho_f D_1 g A = \rho_f h g A = \rho_f V g \tag{8.16}$$
using $V = hA$ for the volume of the object. Note that $\rho_f$ is the density of the
fluid! Thus, in magnitude, $F_b$ is the weight of the fluid displaced. It doesn't
matter whether the object is shallow or deep, whether the pressures are large
or small. All that matters is the difference of the pressure on the top and
bottom, and this difference is always the same for this object, leading to a
buoyant force that is the same at any depth. Intuitively you might think
that at great depth the enormous pressure on the top would drive the object
down more, but this is not true: the even more enormous pressure on the
bottom more than makes up for this effect.
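The claim that the buoyant force does not depend on depth is easy to test numerically. The Python sketch below computes the force from the pressures on the top and bottom faces, following Eqs (8.15) and (8.16). The function names, g = 9.8 m/s^2, and the sample dimensions are assumptions for illustration only.

```python
def hydrostatic_pressure(depth, rho_fluid=1000.0, g=9.8):
    """Hydrostatic pressure (Pa) at a given depth (m), Eq (8.15)."""
    return rho_fluid * depth * g


def buoyant_force(top_depth, h, area, rho_fluid=1000.0, g=9.8):
    """Upward force (N) on a block of height h and cross-section area.

    The force is the pressure on the bottom face minus the pressure on the
    top face, times the area, as in Eq (8.16).
    """
    p_top = hydrostatic_pressure(top_depth, rho_fluid, g)
    p_bottom = hydrostatic_pressure(top_depth + h, rho_fluid, g)
    return (p_bottom - p_top) * area


# Hypothetical block: h = 0.2 m, A = 0.5 m^2, top face at 1 m and at 1000 m.
shallow = buoyant_force(1.0, 0.2, 0.5)
deep = buoyant_force(1000.0, 0.2, 0.5)
print(shallow, deep)
```

Both pressures are about a thousand times larger in the deep case, but their difference, and hence the force, is the same up to rounding.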
It might occur to you that at high pressure the object could be compressed,
and have a smaller volume. Then it would displace less fluid and
the buoyant force would be less. This is true! We have ignored compressibility
in the above discussion, but a clever toy called a Cartesian diver exploits
this effect. The object traps an air bubble, and this bubble is quite compressible.
The bubble is created to make the object just barely buoyant,
so that it floats, but if you can increase the pressure on the whole system,
the bubble is compressed, the object becomes more dense than water, and
it sinks. Releasing the pressure allows the bubble to expand and displace more
water, and the object floats again.
8.6 More on Hydrostatic Pressure
The fluid in Fig 8.9 is in equilibrium, although it would be possible to get
confused, wondering how the small weight of water on the left can balance
the great weight of water on the right. The answer is that the equilibrium
condition for a fluid is a condition on pressure, given in Eq (8.15), and not
a condition on weight. It is true that the force on the floor of the right-hand
compartment is much greater than the force on the floor of the left-hand
compartment, but that is because its area is greater: the pressure is the
same. In fact, if the pressure were to be greater on the floor at the right,
the fluid in the connecting tube would be thrust the more from the right,
and would move from right to left. In equilibrium, however, the fluid is not
moving, and hence the pressure must not change as one moves horizontally.
The pressure depends only on depth.
Fig. 8.9: The fluid is in equilibrium, despite the greater weight on the right.
A more surprising application of the hydrostatic pressure in Eq (8.15) is
the siphon, shown in Fig 8.10. If we think about the hydrostatic pressure in
the top vessel, we measure depth from the top surface. Continuing into the
tube we have a region of negative depth, where we are above the top surface,
before following the tube down to the second vessel, where the hydrostatic
pressure of Eq (8.15) is positive again (we will revise this picture slightly
in the next section, but not in a way to change the analysis). This fairly
large positive pressure is the hydrostatic pressure we would find if the tube
were closed off at the bottom. If we think of the hydrostatic pressure in the
bottom vessel, however, we find a lesser value, because we measure from its
top surface, which is lower. In particular, the hydrostatic pressure in the
bottom vessel just outside the mouth of the tube, in case the tube is closed off,
is less than the pressure in
the tube. If the tube is now slightly opened, so that the equilibrium is only
slightly disturbed, the higher pressure inside drives fluid out into the lower
vessel, and the fluid slowly transfers from the upper to the lower vessel.
Fig. 8.10: The fluid is not in hydrostatic equilibrium, but flows in the direction that it is
thrust the more. If it were in equilibrium, as it would be with the bottom of the tube
closed off, the pressure in the tube would be higher than the pressure just outside the tube.
8.7 Atmospheric Pressure
It is easy to understand that deep-sea creatures live in a world of high pressure,
but it was only in the 17th century that it was understood that we too live in a
deep sea: a sea of air. We too are subject to a hydrostatic pressure, sufficient
to hold up the column of air over any area down here on the ground. That
pressure, atmospheric pressure, is roughly $P_{\text{atm}} \approx 10^5$ N/m$^2$. Thus on every
square centimeter of surface there is a force inward, due to the atmosphere, of
about 10 Newtons, about the weight of 1 kg, more than 2 pounds. Thus there
is a force distributed on our bodies of many thousands of pounds, tending to crush
us. Why don't we feel it? One answer is that the fluid in our tissues is all at
this same pressure, so there is no particular danger of actually being crushed.
Rather we are in hydrostatic equilibrium with this ambient pressure. It is not
of any use to us to be aware of it, and we have not evolved any sensory organs
to detect it. We do, of course, detect sudden deviations from equilibrium.
Atmospheric pressure, so easy to forget, should really have been included
in our hydrostatic pressure result, Eq (8.15). If we think about how hydrostatic
pressure supports a column of water, we should realize that the
pressure force $PA$ on the bottom of the column must balance not only the
weight of the water, but also the force $P_{\text{atm}} A$ on the top. That means the
hydrostatic pressure at depth $D$ is actually
$$P = P_{\text{atm}} + \rho D g \tag{8.17}$$
This correction to Eq (8.15) does not change any of our previous examples
in any significant way, though. The reason is that it was always a difference
of pressures that was the important thing. In understanding buoyancy, for
example, we looked at the difference in pressures between the top and bottom
of an object: when we take the difference, the constant $P_{\text{atm}}$ cancels, leaving
us with the same expression that we would have without it.
In fact, it is hard to think of an everyday example where the actual
value of the pressure matters. Usually all that matters is the difference of
the pressure from $P_{\text{atm}}$, which is the term we have been emphasizing. This
difference is often called gauge pressure, since it is what a typical pressure
gauge measures, by comparing the pressure (in a tire, say) with the ambient
atmospheric pressure. If your tire pressure is 26 pounds per square inch (psi),
according to a tire gauge, that is above and beyond the atmospheric pressure
of about 14 psi. It would be technically correct, but not a good idea, to insist
to your mechanic that your tire pressure is really 40 psi. This could only be
confusing. On the other hand, in the context of the ideal gas law, which we
will meet soon, you must use the true pressure of 40 psi!
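The bookkeeping between gauge pressure and true pressure is just an addition, as the following Python sketch shows. The function names are hypothetical, and the rounded value of 14 psi for atmospheric pressure is the one used in the tire example.

```python
P_ATM_PSI = 14.0  # rough atmospheric pressure in psi, as quoted in the text


def absolute_from_gauge(gauge, p_atm=P_ATM_PSI):
    """True (absolute) pressure from a gauge reading, same units as p_atm."""
    return gauge + p_atm


def gauge_from_absolute(absolute, p_atm=P_ATM_PSI):
    """Gauge reading corresponding to a true (absolute) pressure."""
    return absolute - p_atm


# The tire example: a gauge reading of 26 psi.
print(absolute_from_gauge(26.0))
```

The gauge value is the one to quote to a mechanic, and the absolute value is the one that belongs in the ideal gas law.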
One place where the difference between Eq (8.15) and (8.17) might appear
significant is in our discussion of the siphon. When the depth $D$ becomes
negative (above the surface of the top vessel), Eq (8.15) gives a negative
pressure $P$, but Eq (8.17) gives a positive pressure. In the end it is only a
difference of pressures that is relevant, and $P_{\text{atm}}$ cancels out, but one might
worry that perhaps negative pressure doesn't make any sense. We will see,
however, that true pressure can be negative, even if it isn't actually negative
in this case.
8.8 The Barometer
Fig. 8.11: A tube evacuated at the top indicates the ambient pressure by the height to which
a fluid rises: a barometer.
Evangelista Torricelli, a student of Galileo, showed how to measure $P_{\text{atm}}$,
essentially by measuring the gauge pressure of the vacuum (the true pressure
is 0). His barometer is shown schematically in Fig 8.11. In equilibrium the
pressure at the level of the fluid in the reservoir is $P_{\text{atm}}$, and the pressure at
the top of the tube (a vacuum) is 0. Thus, from the equilibrium of the fluid
column,
$$P_{\text{atm}} = \rho g H \tag{8.18}$$
where $H$ is the height of the fluid column (what we would call depth $D$,
measuring from the top of the column to the level of the reservoir). This
fluid column weighs exactly as much as a column of the entire atmosphere of
the same horizontal cross-section. The equilibrium condition determines $H$:
$$H = \frac{P_{\text{atm}}}{\rho g} \tag{8.19}$$
We see that $H \propto P_{\text{atm}}$, so a measurement of $H$ is a measurement of $P_{\text{atm}}$ in
some units. Torricelli measured $H$ and found that $P_{\text{atm}}$ varies a little bit from
day to day! Barometric pressure is a very familiar part of our weather reports
now, but at the time this discovery must have been completely unexpected.
Let us estimate the size of a water barometer. Putting everything in SI
units, we have $P_{\text{atm}} \approx 10^5$ N/m$^2$, $g \approx 10$ m/s$^2$, and $\rho \approx 10^3$ kg/m$^3$, giving
$H \approx 10$ meters. This is rather tall! Torricelli's barometer was a glass tube
(so that he and his neighbors could see the water level) sticking out through
a hole in the roof of his house. A much more practical device uses mercury.
Since the density of mercury is over 13 times that of water, and $H \propto 1/\rho$, a
mercury barometer is shorter by a factor of 13, and can sit comfortably on a
table. It is about 0.76 meters tall. In the US the height is still given in inches
of mercury, and is about 30 inches, but varies with the weather, of course. In
the eye of a hurricane it can be 29 inches. Yet another unit of pressure is the
atmosphere. One atmosphere is, by definition, 760 millimeters of mercury,
referring to the above barometer design. The millimeter of mercury, as a unit
of pressure, is also called the Torr, for Torricelli.
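The two barometer heights follow directly from Eq (8.19). In the Python sketch below, the first call uses the rounded values of the estimate above. The second uses slightly more precise inputs (P_atm = 1.013 x 10^5 Pa, mercury density 13.6 x 10^3 kg/m^3, g = 9.8 m/s^2), which are assumptions supplied here and not values taken from the text.

```python
def barometer_height(p_atm=1.0e5, rho=1.0e3, g=10.0):
    """Height H (m) of the fluid column in a barometer, Eq (8.19)."""
    return p_atm / (rho * g)


# Water barometer with the rounded values of the estimate.
water = barometer_height()

# Mercury barometer with assumed, slightly more precise inputs.
mercury = barometer_height(p_atm=1.013e5, rho=13.6e3, g=9.8)

print(water, mercury)
```

Since $H \propto 1/\rho$, replacing water by a fluid about 13.6 times denser shortens the column by the same factor, from roughly 10 meters to roughly three quarters of a meter.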
Aristotle believed that Nature abhors a vacuum, and that what is going
on here is that the fluid is being sucked upward by the vacuum, and is trying
to fill it. This is so intuitively appealing that one must make a conscious
effort to think about it in terms of pressure, a pressure that we are intuitively
unaware of. The vacuum does nothing to suck the fluid upward. Rather the
atmospheric pressure outside pushes the fluid into the tube. This is how
you drink through a straw too, contrary to intuition! Basically, there is no
such thing as suction in these examples. Try to imagine that! What we call
suction here is really just the creation of a pressure less than atmospheric
pressure.
There is such a thing as suction in the sense that it is possible to create
negative pressure. This would be tension in a fluid. There is still a pressure
force, but it points the other direction, pulling normally on a surface, not
pushing. The best familiar example is tall trees. Atmospheric pressure can
only lift the sap in trees to a height of 10 m, as we have just seen, but many
trees are taller than that. The reason is that there is a mechanism in the
crown of the tree (evaporation) for continually removing fluid, like a pump,
putting the fluid beneath under enough tension to literally suck the sap up!
8.9 Bernoulli's Principle
The Renaissance engineers knew how to build ornamental fountains for their
aristocratic patrons. In particular, they knew that the reservoir that feeds
the fountain must be as high as you want the fountain to rise. This sounds
very much like conservation of energy. If we think of following a small volume
$\Delta V$ of water through the system, we can imagine it starting at the top of
the reservoir, at height $H$, where it has initial gravitational potential energy
$U_g = \rho \Delta V g H$. The water is drawn off through a pipe at the bottom, and
the reservoir is fed by a stream at the top, so that it is always full. Our
volume $\Delta V$ gradually sinks lower in the reservoir, approaching the pipe, but
not moving quickly at all. Thus it loses its potential energy, but it doesn't
gain kinetic energy. Then, as it goes into the pipe, it picks up speed $v$,
meaning kinetic energy $K = (1/2)\rho \Delta V v^2$. If the pipe is horizontal, this
speed is just enough to bring it back to height $H$ in the fountain, so it must
be that $K$ is just $U_g$, as if energy were conserved. And yet we know that
when the water lost potential energy, it didn't gain kinetic energy $K$ right
away. There was an intermediate time when it was at rest at the bottom
of the reservoir. When it finally gained kinetic energy $K$, it was because it
moved from the high pressure of the tank to the low pressure of the pipe:
the unbalanced pressure accelerated it through the pipe. This tells us how
it managed to remember its original height: it loses $U_g$ but goes to higher
pressure $P$. Then it moves to lower pressure $P$ but gains $K$. It is trading
off these quantities among each other, but something is staying the same, so
that in the end, when it is all $U_g$ again, it is the same as before, that is, the
same height $H$.
The above discussion is not a proof of anything, but rather an example
of a situation governed by Bernoulli's Principle, a rather sophisticated result
of the theory of fluid motion. It says that in a steady flow, if there is no
appreciable friction, then
$$P + \rho g h + \frac{1}{2} \rho v^2 = \text{constant} \tag{8.20}$$
along a streamline, that is, this quantity is constant as you follow some $\Delta V$ along, $h$
being its height at any time, and $v$ being its speed. The quantity $\Delta V$ has
disappeared from the statement: the three terms each have the dimension of
energy density, not energy. If you multiply each term by a volume $\Delta V$, then
it has the dimension of energy. It is rather a surprise to see $P$, which we
think of as a force per unit area, a kind of surface force density, appearing as
an energy per unit volume, a volume energy density, but dimensionally these
are the same.
In the example of the fountain, the energy density is initially $\rho g H$. As we
follow our $\Delta V$ to the bottom it loses this energy density, but the pressure $P$
increases, and sure enough: at the bottom, $P$ has increased by the hydrostatic
value $\rho g H$. Thus Bernoulli's principle includes the notion of hydrostatic
pressure, in the case where the kinetic term is zero. Then, in our example, $P$ essentially goes to $P_{\text{atm}}$ in
the pipe, and the kinetic term $(1/2)\rho v^2$ takes the value $\rho g H$, just the energy density necessary to give
$\Delta V$ the energy $\rho \Delta V g H$, the amount it must have to reach a turning point
at height $H$. Each term in the statement of Bernoulli's Principle, Eq (8.20),
is the important one at some time in this process. The conserved quantity
moves around among all three terms.
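For the fountain, setting the kinetic term equal to the hydrostatic term, $(1/2)\rho v^2 = \rho g H$, gives $v = \sqrt{2gH}$, with the density cancelling. A short Python sketch of this arithmetic follows. The function names, the rounded g = 10 m/s^2, and the 5 meter reservoir are assumptions for illustration.

```python
import math


def fountain_exit_speed(H, g=10.0):
    """Speed (m/s) in the pipe for a reservoir of height H (m).

    From (1/2) rho v^2 = rho g H in Eq (8.20); the density cancels.
    """
    return math.sqrt(2.0 * g * H)


def fountain_rise_height(v, g=10.0):
    """Height (m) reached by water leaving the pipe at speed v (m/s)."""
    return v * v / (2.0 * g)


# Hypothetical reservoir 5 m high.
v = fountain_exit_speed(5.0)
print(v, fountain_rise_height(v))
```

The second function simply inverts the first, which is the statement that the water, without friction, returns to the height of the reservoir.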
8.10 Applications of Bernoulli's Principle
8.10.1 Force of the wind
How hard does the wind push on you? Enough to lean into? Enough to
knock you off your feet? Of course it matters how fast it is blowing, and
whether you stand facing it or sideways.
We will model this situation by assuming Bernoulli's Principle applies,
since air in motion does not seem to be slowed by friction. We imagine a
region $\Delta V$ in the air moving horizontally with speed $v$ until it encounters an
obstacle (like you!) and is brought to rest. What does Bernoulli's Principle
say? Since the height $h$ doesn't change, the term $\rho g h$ is constant, and can
be considered part of the constant on the right hand side of Eq (8.20). Far
away from you, $P = P_{\text{atm}}$, and the kinetic energy density is $(1/2)\rho v^2$. Thus,
when the air is brought to rest, the pressure is
$$P = P_{\text{atm}} + \frac{1}{2} \rho v^2 \tag{8.21}$$
and the extra force on you, apart from the force $P_{\text{atm}} A$ which is always there,
is
$$F_{\text{wind}} = (P - P_{\text{atm}}) A = \frac{1}{2} \rho v^2 A \tag{8.22}$$
To estimate its value we need to know the mass density $\rho$ of the air,
and we need to assume a wind speed $v$ and an area $A$ that you present to
the wind. Notice $F_{\text{wind}} \propto v^2$, so that when the wind speed doubles, the
force quadruples! A very rough estimate for $\rho$ comes from the barometer.
We noticed that a column of water 10 meters high weighs the same as a
column of atmosphere of the same cross-sectional area $A$. If we take the
atmosphere to be 10 km high, then the column of atmosphere has 1000 times
more volume, and yet it weighs the same. Therefore its density must be
1000 times less than that of water, 1000 kg/m$^3$, so that $\rho \approx 1$ kg/m$^3$ for
air. Despite the crudity of this estimate, the result is about right, and the
argument is a kind of mnemonic. The area $A$ you present to the wind might
be $A \approx 0.3$ m$^2$, and the wind might be blowing at $v \approx 10$ m/s, a bit more
than 20 mph. Then the force of the wind would be $F \approx 15$ N, a bit more
than 3 pounds force, not very much, but definitely noticeable. A hurricane
force wind of 4 times this speed would exert 16 times more force, more than
50 pounds: you could certainly lean into it.
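The wind estimate can be reproduced with a few lines of Python. This sketch just evaluates Eq (8.22) with the rough numbers quoted above. The function name is hypothetical.

```python
def wind_force(v, area, rho=1.0):
    """Force (N) of a wind of speed v (m/s) on an area (m^2), Eq (8.22).

    rho is the density of air in kg/m^3; the rough value 1 kg/m^3 is the
    barometer-based estimate used in the text.
    """
    return 0.5 * rho * v**2 * area


breeze = wind_force(10.0, 0.3)      # v = 10 m/s, A = 0.3 m^2
hurricane = wind_force(40.0, 0.3)   # four times the speed
print(breeze, hurricane, hurricane / breeze)
```

The ratio of the two forces is 16, the square of the ratio of the speeds, which is the content of $F_{\text{wind}} \propto v^2$.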
These rough estimates seem to agree with common experience, and suggest
that we really do understand essentially what is going on here: the wind
exerts a pressure force on us, because the pressure on the windward side of
an obstacle, where the air is brought to rest, is higher. We must also assume
that the pressure on our back, the leeward side, is just $P_{\text{atm}}$, because
in the shelter of the obstacle there are no streamlines terminating; rather
the wind streams past us, leaving a sheltered region roughly in hydrostatic
equilibrium. It is then the difference between the pressure on front and back
that we experience as the force of the wind, as in Eq (8.22).
8.10.2 Flow Past an Airfoil
The most spectacular and counterintuitive consequence of Bernoulli's Principle
is the lift on an airplane wing, or airfoil. As sketched in Fig 8.12, flow
past the airfoil may result in different speeds above and below. The quantity
$P + (1/2)\rho v^2$ is constant along the flow lines, and has the same value
on each flow line, namely $P_{\text{atm}} + (1/2)\rho v_0^2$, its value before it encounters the
airfoil. If $v_1 > v_2$, then necessarily $P_1 < P_2$, and the pressure is greater on
the underside of the wing. Thus the pressure force $P_2 A$ up is greater than
the pressure force $P_1 A$ down, and the net force due to the air on the wing is
up. That is lift!
Fig. 8.12: Steady flow past an airfoil, shown here in cross section, can lead to different
speeds above and below, and hence, by Bernoulli's principle, different pressures. (Labels in
the figure: $v_0$, $P_{\text{atm}}$ for the flow far from the airfoil; $v_1$, $P_1$ above it; $v_2$, $P_2$ below it.)
In Fig 8.12 the wing is imagined to be stationary while the air streams
past it to the left, as in a wind tunnel. According to the principle of relativity,
however, all that matters here is that the air is moving relative to the wing.
It would be the same if the air were stationary and the wing moved to the
right, as in flight, the view we would have of this phenomenon if we were to
move uniformly to the left. In that view the air on top of the wing might
be approximately stationary, but the air under the wing would be to some
extent moved along by the wing and compressed, making the pressure higher.
It is amazing that aircraft are able to stay aloft. We should not leave
this example without a numerical estimate, just to be sure that we have
basically understood it. Suppose that $v_0 \approx 200$ m/s (about 450 mph), and
that $v_1 \approx 210$ m/s, $v_2 \approx 190$ m/s, so that the difference in the two speeds is
10% of the average speed. We take $\rho \approx 1$ kg/m$^3$ for air. Then the pressure
imbalance is
$$P_2 - P_1 = \frac{1}{2} \rho (v_1^2 - v_2^2) \approx 4000 \text{ N/m}^2 \tag{8.23}$$
The weight of 100 passengers, if each, together with luggage, has a mass
100 kg, is about $10^5$ N (taking $g \approx 10$ m/s$^2$). The wing area $A$ necessary
for $(P_2 - P_1) A \approx 10^5$ N is 25 m$^2$. The load would be more than just the
passengers, of course, since the plane itself weighs a lot, so let us say the
wing area should be several times this. It is still in the realm of plausibility.
Jetliner wings are roughly this size. This estimate also lets us know why
planes have to fly fast. If $\Delta P \propto v_0^2$, as is suggested here, then cutting
airspeed by a factor of 2 cuts lift by a frightening factor of 4.
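The lift estimate can be repeated in Python. The sketch below evaluates Eq (8.23) and the wing area needed to carry a given load. The function names are hypothetical, and the numbers are the rough ones of the estimate above.

```python
def pressure_imbalance(v_above, v_below, rho=1.0):
    """P_2 - P_1 (Pa) across the wing, Eq (8.23).

    v_above and v_below are the flow speeds (m/s) over and under the wing;
    rho is the air density in kg/m^3.
    """
    return 0.5 * rho * (v_above**2 - v_below**2)


def wing_area_for_load(load, delta_p):
    """Wing area (m^2) for which delta_p * A equals the load (N)."""
    return load / delta_p


delta_p = pressure_imbalance(210.0, 190.0)
passengers = 100 * 100.0 * 10.0   # 100 passengers, 100 kg each, g = 10 m/s^2
print(delta_p, wing_area_for_load(passengers, delta_p))

# Halving all speeds cuts the pressure imbalance by a factor of 4.
print(delta_p / pressure_imbalance(105.0, 95.0))
```

The last line illustrates the scaling $\Delta P \propto v_0^2$: with every speed halved, the same wing supports only a quarter of the load.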
8.11 Flow in Pipes
Liquids are essentially incompressible in flow (i.e., the density is constant),
and this has an interesting consequence for flow in pipes. If we consider any
section of pipe with fluid filling it and flowing through it, the amount of fluid
entering this section must just equal the amount leaving it, since there is no
space for any extra fluid to go. We picture steady flow through a constriction
in Fig 8.13, paying special attention to the section between the heavy vertical
lines. In a time $\Delta t$ a volume $\Delta V$ enters, of length $v_1 \Delta t$ and cross-section $A_1$,
i.e.,
$$\Delta V = A_1 v_1 \Delta t \tag{8.24}$$
and the same volume $\Delta V$, of length $v_2 \Delta t$ and cross-section $A_2$, leaves, where
$v_1$ and $v_2$ are the speeds at the entrance and exit. Since these two volumes
are equal,
$$A_1 v_1 = A_2 v_2 = I = \text{constant} \tag{8.25}$$
This tells us how speed changes with the cross-section of the pipe. Equivalently
$$v \propto \frac{1}{A} \tag{8.26}$$
As the area $A$ goes down, the speed $v$ goes up. The constant of proportionality
is $I$. A nozzle like that in Fig 8.13, constricting the cross-sectional area,
may be explicitly intended to give a large exit speed. You have probably
done this yourself with a garden hose, constricting the exit with your thumb.
Fig. 8.13: Flow in a pipe with changing diameter: the volume entering this segment in a
short time $\Delta t$, namely $A_1 v_1 \Delta t$, must equal the volume exiting, $A_2 v_2 \Delta t$.
The volume $\Delta V$ that flows through any cross-section $A$ of the pipe is proportional
to $\Delta t$, according to Eq (8.24). The constant of proportionality is
$I = Av$, the volumetric flow rate, also called current, the same everywhere
along the pipe. (Check that $[I] = [L^3 T^{-1}]$, and could have units liters/second,
gallons/hour, etc.) You could collect the fluid coming out of the pipe, and the
volume collected would increase with time at the constant rate $I$. Knowing
$A$ for the pipe, and measuring volume flow rate $I = Av$ by collecting fluid,
you could find the speed $v = I/A$ by dividing.
If we multiply the volume ow rate I = Av by the mass density we have
vA, the mass current. Multiplying by essentially converts volume to mass.
The dimension of mass current is [MT
1
], with SI unit kg/s. Its signicance
is the rate at which mass passes through the area A. In particular, it is the
rate at which mass accumulates if you collect it at the end of the pipe.
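The relations of this section, Eq (8.25) together with the volume current $I = Av$ and the mass current $\rho v A$, can be tried out numerically. The following is only a minimal sketch: the function names and the garden-hose numbers (1 cm radius, 1 m/s, thumb leaving a quarter of the opening) are assumptions made for illustration, not values from the text.

```python
import math

def exit_speed(v_in, area_in, area_out):
    """Speed after a change of cross-section, from A1*v1 = A2*v2 (Eq 8.25)."""
    return v_in * area_in / area_out

def volume_current(area, speed):
    """Volumetric flow rate I = A*v, in m^3/s for SI inputs."""
    return area * speed

def mass_current(density, area, speed):
    """Mass current rho*v*A, in kg/s for SI inputs."""
    return density * area * speed

# Assumed example: hose of radius 1 cm carrying water at 1 m/s,
# with a thumb leaving one quarter of the opening free.
a_hose = math.pi * 0.01**2
a_exit = a_hose / 4
v_exit = exit_speed(1.0, a_hose, a_exit)

print("exit speed (m/s):", v_exit)
print("volume current (m^3/s):", volume_current(a_hose, 1.0))
print("mass current (kg/s):", mass_current(1000.0, a_hose, 1.0))
```

Cutting the area to a quarter multiplies the speed by four, while the current $I$ is the same at both cross-sections.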
8.11.1 Venturi Flow Meter
If friction at the walls and in the interior of the flow is unimportant, the
steady flow in a pipe obeys Bernoulli's Principle. That means we can also
say how the pressure changes in the pipe. Taking the pipe horizontal, so that
the height $h$ is a constant, we have
$$P_1 + \frac{1}{2}\rho v_1^2 = P_2 + \frac{1}{2}\rho v_2^2 \tag{8.27}$$
in Fig 8.13. One could measure the pressure difference in the tube
$$\Delta P = P_2 - P_1 = \frac{1}{2}\rho\,(v_1^2 - v_2^2) = \frac{1}{2}\rho v_1^2 \left( 1 - \frac{A_1^2}{A_2^2} \right) \tag{8.28}$$
using Eq (8.25). Since $\Delta P \propto v_1^2$, with a known constant of proportionality,
this measurement of $\Delta P$ actually measures $v_1$, the flow speed into the
constriction. Such a device is called a Venturi flow meter. It could be used to
measure wind speed, for example, a case where the volumetric method would
not be very appropriate.
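To see how a Venturi meter turns a pressure reading into a speed, Eq (8.28) can be solved for $v_1$. The sketch below does this under the frictionless assumption of this section; the function name and the sample numbers (air of density about 1.2 kg/m$^3$, a throat of half the inlet area, a 90 Pa pressure difference) are assumed for illustration only.

```python
import math

def venturi_inlet_speed(delta_p, density, area_in, area_throat):
    """Invert Eq (8.28) for the inlet speed v1.

    delta_p is P2 - P1 (throat minus inlet); it is negative when the
    throat is narrower than the inlet.
    """
    if area_in == area_throat:
        raise ValueError("no constriction: delta_p carries no information")
    geometry = 1.0 - (area_in / area_throat) ** 2
    v1_squared = 2.0 * delta_p / (density * geometry)
    if v1_squared < 0:
        raise ValueError("sign of delta_p is inconsistent with the area ratio")
    return math.sqrt(v1_squared)

# Assumed example: air, throat half the inlet area, throat pressure
# 90 Pa below the inlet pressure.
v1 = venturi_inlet_speed(-90.0, 1.2, 2.0, 1.0)
print("inlet speed (m/s):", v1)
```

Only the ratio of the two areas matters, so the areas may be given in any common unit.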
8.11.2 Poiseuille Flow
The flow speed $v = I/A$, found from the current $I$, should be understood
as a kind of average $v$. In Fig 8.13 we drew the fluid motion as if $v$ were
constant over the cross-section $A$. Such a flow is called plug flow, because
the fluid moves like a solid plug in sections where $A$ is constant. But it may
very well happen that the flow is faster in the center of the pipe and slower
near the pipe wall. In that case $v = I/A$ is really an average speed. The flow
rate $I$ says nothing about how the flow is distributed across the cross-section.
Fig 8.14 shows a pattern of flow in a pipe called Poiseuille flow. The flow
speed is zero on the wall of the pipe and has a maximum in the middle. In
this case it turns out that the volumetric method gives $v = I/A = v_{max}/2$.
Like all averages, $v$ is somewhere between the extreme values, 0 and $v_{max}$,
but it is just a coincidence that $v$ is exactly the arithmetic mean of the two.
If you measured $v$ by the volumetric method, you might be surprised to find
that some impurity introduced into the pipe actually appears at the other
end in only half the time you expected.
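The claim that the average speed is $v_{max}/2$ can be checked by adding up the current through thin rings of the cross-section. The sketch below assumes the standard parabolic profile $v(r) = v_{max}(1 - r^2/R^2)$ for a round pipe of radius $R$; that explicit formula, the function name, and the number of rings are assumptions of this example.

```python
import math

def mean_speed_parabolic(v_max, radius, n=100_000):
    """Area-average of v(r) = v_max * (1 - r^2/R^2) over a disk of radius R.

    Midpoint rule over n thin rings, each of area 2*pi*r*dr.
    """
    dr = radius / n
    current = 0.0
    for k in range(n):
        r = (k + 0.5) * dr
        speed = v_max * (1.0 - (r / radius) ** 2)
        current += speed * 2.0 * math.pi * r * dr
    return current / (math.pi * radius**2)

ratio = mean_speed_parabolic(1.0, 1.0)
print("mean speed / v_max:", ratio)
```

The ratio comes out very close to one half, and it does not depend on the radius or on $v_{max}$.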
8.11.3 Current Density
We began this section with a method for finding the velocity of the flow $v$ from
the volume current $I$, namely $v = I/A$. Now we turn that around. We think
of $v$ as a kind of density of current, that must be multiplied by a geometrical
factor $A$ (the area of the pipe) to be a current $I = vA$. In the section above,
on Poiseuille flow, we even saw that the current density $v$ varies over the
cross-sectional area $A$, so some parts of $A$ have more current going through
than other parts. The center has the most current going through, in the
sense that if we took a little area $\Delta A$ located on the centerline of the pipe,
it would have more current $v \Delta A$ through it than the same $\Delta A$ would have
if it were located near the wall of the pipe (where $v$ is less). The dimension
of volume current density is volume per unit time per unit area. That turns
out to be just $[L/T]$: velocity!

Fig. 8.14: Poiseuille flow has a parabolic flow profile, that is, the volume flowing through
the cross-section $A$ in time $t$ is bounded by a parabola. The volumetric flow rate for this
flow turns out to be $I = A v_{max}/2$.

This notion of current density is surprisingly useful and important. We
will run into it wherever there is some current spread out over an area. Solar
energy is best described as an energy current density, for example, because
there energy is arriving at some rate from the Sun, but it is spread out
over the area that collects it. Electric current in a wire can be described
just like volume current in a pipe: the electric current density might look
like Poiseuille flow, for example, highest in the middle. Or maybe electric
current is like plug flow. Or maybe it is completely different, with most of
the current density on the surface of the wire and not much in the middle:
these would be different current densities, even with the same current. How
the current is actually distributed is, of course, an experimental question.
Fig. 8.15: Flow over a plane solid surface is a simple example of shear flow. The volume
that flows through the area of height $D$ in time $t$ is shown in blue. The rate of shear
strain is $\dot\gamma = v_{max}/D$. The flow could be created by a horizontal surface contacting the
fluid from above, moving at speed $v_{max}$, dragging the top layer of fluid along with it.
8.12 Shear Stress and Viscosity
In the Poiseuille flow of Fig 8.14 layers of fluid slide on other layers, since
they are not all moving together at the same speed. This sliding of fluid
layers on each other is called shear flow. The simplest geometry for shear
flow is shown in Fig 8.15, where horizontal layers slide on each other. In this
example the speed of flow (in the $x$ direction) is proportional to the height
$y$ above the solid floor,
$$v_x \propto y \tag{8.29}$$
The subscript $x$ on the speed $v$ indicates that it is in the $x$ direction. The
constant of proportionality is called the rate of shear strain, or more informally
shear rate, often denoted $\dot\gamma$ (gamma dot). Thus
$$v_x = \dot\gamma\, y \tag{8.30}$$
Note the dimension $[\dot\gamma] = [T^{-1}]$, like an angular speed. This quantity is a
measure of how fast the fluid is being sheared, that is, how fast it is being
distorted. As we shall see, fluids resist being sheared fast: the faster they
are sheared, the more they resist. If you do it slowly, they resist less. In the
limit as you do it very slowly, they don't resist at all, and in this respect are
different from solids, which resist even static distortion (Hooke's Law).
The shear flow in Fig 8.15 might be a model for the flow at the bottom of
a river. The fluid doesn't move on the river bed, but as you go up, into the
river itself, the fluid speed increases, shearing the fluid. In this situation the
fluid exerts a force on the river bed in the $x$ direction, as if it were tending
to drag the solid surface along. By Newton's Third Law, the river bed must
exert a force on the fluid to hold it back (this is the "resistance" of the
fluid to shearing, alluded to above). The drag force on the river bed is best
described by a stress, i.e., a force per unit area, because it is distributed over
the surface: every little area on the bottom feels a drag force proportional
to its area. Similarly, if there is a solid surface on the top, dragging the
fluid along at speed $v_{max}$, it exerts a force on the fluid in the $x$ direction and
the fluid exerts a force on the surface in the $-x$ direction. This shear stress
is like pressure, in that it is a stress, but pressure is a normal stress, and
shear stress is a tangential stress. The corresponding force is tangential to
the surface, not normal to it. Recall that the SI unit of stress is the Pascal
or N/m$^2$, force per unit area. Multiplying by an area in m$^2$, you get a force
in Newtons.
This tangential shear stress is frequently given the name $\sigma_{xy}$ (sigma).
For normal fluids the shear stress is proportional to the rate of shear strain
$\dot\gamma$, i.e.,
$$\sigma_{xy} \propto \dot\gamma \tag{8.31}$$
This makes precise what we meant by the fluid resisting more if it is sheared
faster. At higher shear rate, the stress is proportionately higher. The
constant of proportionality is called viscosity, given the symbol $\eta$ (eta). Thus
$$\sigma_{xy} = \eta\, \dot\gamma \tag{8.32}$$
Viscosity is a material property of the fluid. It determines how much
tangential force the fluid exerts on a surface in a given shear flow, or, by Newton's
Third Law, how hard you have to push a fluid tangentially to shear it at a
given rate. Note the dimension of viscosity: $[\eta] = [M L^{-1} T^{-1}]$. The SI unit
of viscosity is the kg/(m·s) or Pa·s.
Let us do a numerical example to make this idea concrete. The viscosity
of water is about $10^{-3}$ Pa·s. This rather small value suggests that water
does not exert much tangential force in shear flow at our human scale, and
that turns out to be true. In fact water is a good lubricant, and a smooth
wet floor can be dangerously slippery. We will do a rough estimate of the
force on a shoe moving along the floor at $v_{max} = 1$ m/s in case there is a
thin water layer. Suppose the layer is $D = 10^{-3}$ m thick (1 mm). Then,
referring to Fig 8.15, and imagining the shoe sole as the solid surface at the
top of the fluid region, separated from the floor below by the fluid layer, we
see that $\dot\gamma = v_{max}/D \approx 10^{3}$ s$^{-1}$. Multiplying by $\eta$ for water, we find the
tangential shear stress $\sigma \approx 1$ Pa, according to Eq (8.32). On a shoe of area
$A \approx 300$ cm$^2$ $\approx 3 \times 10^{-2}$ m$^2$, we have a force of only $\sigma A \approx 3 \times 10^{-2}$ N. It
is no wonder the floor is slippery! We are accustomed to push tangentially
with a force of many pounds against the floor when we walk. If there is a
water layer, a much smaller force creates a shear flow like the one in Fig 8.15,
and we slip. If the water layer is thinner than we estimated, the rate of shear
strain would be proportionately larger. If for example it were only $10^{-6}$ m
thick (1 micron), then the force we estimated would be 1000 times larger, 30
N, and that's still only a few pounds.
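The wet-floor estimate is short enough to redo by machine, which makes it easy to try other film thicknesses. This sketch uses only the numbers quoted above; the function and constant names are made up for the example.

```python
def shear_force(viscosity, speed, gap, area):
    """Tangential force on a plate sliding over a thin fluid film.

    Uses gamma_dot = speed / gap and sigma = viscosity * gamma_dot (Eq 8.32).
    """
    shear_rate = speed / gap          # s^-1
    stress = viscosity * shear_rate   # Pa
    return stress * area              # N

ETA_WATER = 1e-3   # Pa*s, value quoted in the text
SHOE_AREA = 3e-2   # m^2, i.e. 300 cm^2

f_1mm = shear_force(ETA_WATER, 1.0, 1e-3, SHOE_AREA)   # 1 mm water layer
f_1um = shear_force(ETA_WATER, 1.0, 1e-6, SHOE_AREA)   # 1 micron water layer
print("force with 1 mm film (N):", f_1mm)
print("force with 1 micron film (N):", f_1um)
```

The force scales as $1/D$, so the thousandfold thinner film gives a thousandfold larger force.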
A similar dangerous effect is aquaplaning in automobiles: if you drive at
high speed into a flooded layer on the highway, the tires may lose contact with
the road surface, and the customary friction with the roadway is replaced by
the shear stress of the water, which is essentially zero. Tires have tread
patterns mainly to give water a channel to get out from under the tire.
The lesson of this computation is that friction in water, at least at our
length scale, is really quite small. It is tempting to ascribe the resistance of
the water that we feel in swimming, for example, to friction, and hence to
viscosity, but that is not really where it comes from. That force of resistance
is almost entirely due to pressure (normal stress) not shear stress (tangential
stress). Bernoulli's Principle gives a better way to think about it, as in the
problem of the force of the wind in Section 8.10.1. Replace the wind by
water.
8.13 Stokes Flow
At the small length scale of one-celled organisms, viscous resistance in swim-
ming is dominant. If we are swimming and take a stroke or give a kick,
we can coast along through the water, especially if we hold a streamlined
position. But if a paramecium stops moving its cilia it comes to a dead stop
immediately. It does not coast at all. Streamlining would be no advantage
to a paramecium or any other small creature.
The viscous drag force $F$ on a small sphere of radius $R$ moving at speed
$v$ through a fluid of viscosity $\eta$ was calculated in the mid 19th century by
George Stokes, who found
$$F = 6\pi\eta R v \tag{8.33}$$
in the direction opposite to $v$ (i.e., tending to slow the sphere down). This
is the force which in fact does slow a small sphere down in extremely short
time. If, however, there are other forces on the sphere, like gravity and the
buoyant force, then they will balance the Stokes force and the sphere will
move with constant speed $v$, proportional to $1/\eta$. This suggests a way to
measure viscosity: observe the slow sinking through a fluid of a small sphere
of known material and size. The speed indirectly tells you the viscosity: the
smaller the speed, the greater the viscosity.
The viscous force on a small object moving through a fluid is surprisingly
insensitive to shape. A flat disk of radius $R$ is quite a different shape from a
round sphere, but the force on it is about the same as the force on the sphere,
and it doesn't matter much whether it is moving face first or edge first!
All that matters is its typical linear dimension, the radius $R$ in both these
examples.
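The sinking-sphere method for measuring viscosity can be sketched in a few lines. The balance used here, Stokes force equal to weight minus buoyant force, $6\pi\eta R v = \frac{4}{3}\pi R^3 (\rho_s - \rho_f) g$, follows from the paragraph above, but the symbols $\rho_s$ and $\rho_f$ (sphere and fluid densities), the function names, and the sample bead (radius 10 microns, density 2500 kg/m$^3$, sinking in water) are assumptions of this example.

```python
import math

G = 9.8  # m/s^2

def net_weight(radius, rho_sphere, rho_fluid):
    """Weight minus buoyant force on a sphere, in newtons."""
    return (4.0 / 3.0) * math.pi * radius**3 * (rho_sphere - rho_fluid) * G

def sinking_speed(radius, rho_sphere, rho_fluid, viscosity):
    """Steady speed at which the Stokes force (Eq 8.33) balances the net weight."""
    return net_weight(radius, rho_sphere, rho_fluid) / (6.0 * math.pi * viscosity * radius)

def viscosity_from_sinking(radius, rho_sphere, rho_fluid, speed):
    """Viscosity inferred from an observed steady sinking speed."""
    return net_weight(radius, rho_sphere, rho_fluid) / (6.0 * math.pi * radius * speed)

# Assumed example: a 10 micron bead of density 2500 kg/m^3 sinking in water.
v = sinking_speed(1e-5, 2500.0, 1000.0, 1e-3)
eta_back = viscosity_from_sinking(1e-5, 2500.0, 1000.0, v)
print("sinking speed (m/s):", v)
print("viscosity recovered (Pa*s):", eta_back)
```

Doubling the viscosity halves the sinking speed, which is the proportionality to $1/\eta$ noted above. The sketch is only meaningful for spheres small and slow enough that the Stokes force applies.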
8.14 Poiseuille Flow Revisited
Bernoulli's Principle does not apply to Poiseuille flow. Since Poiseuille
flow is a shear flow, there must be shear stress, which is a kind of friction.
In the steady flow of Fig 8.14 the cross-section $A$ is constant, so also $v$ is
constant, but contrary to what Bernoulli's Principle would predict, $P$ is not
constant. Rather there is a pressure difference between one end of the pipe
and the other, and this is what pushes the fluid through the pipe, in spite of
friction. (Without friction, the fluid would just coast through at constant $v$,
and the pressure $P$ would be constant. Bernoulli's Principle describes only
this frictionless case.)
Flow in a large pipe, at moderate speed, like the flows in household
plumbing, obeys Bernoulli's Principle to good approximation. We don't
need pumps to push water through our horizontal pipes. It pretty much
coasts through. Because the viscosity of water is so small, the friction is
not very important. Also, in our larger arteries, the pressure of the blood
is related to the pressure at the heart by Bernoulli's Principle. That is why
it makes sense to measure it in the arm. In large blood vessels, friction is
unimportant. In a narrow pipe at low speed, however, like capillary flow in
the cardiovascular system, Bernoulli's Principle certainly does not hold, even
approximately, and an appreciable pressure drop is necessary to push the
flow through the capillaries, just the pressure difference between the arterial
side and the venous side of our circulatory system. So when does Bernoulli's
Principle apply? And when is friction important? Apparently we cannot
simply say that water has small viscosity and forget about it at this smaller
scale.
We can understand this situation with the help of dimensional analysis.
A key quantity to think about is the pressure drop $\Delta P$ from one end of a
pipe to the other in Poiseuille flow, pushing a steady current $I$ through the
pipe. Let the pipe have length $\ell$ and radius $R$. How does $\Delta P$ depend on
these quantities? Well, in terms of dimensions, $\Delta P$ is a stress, i.e., $[\Delta P] =
[M L^{-1} T^{-2}]$. It seems reasonable that $\Delta P \propto \ell$, because if we follow the pipe
with another identical one, making a pipe twice as long, we would need the
same pressure drop again to drive $I$ through the second pipe. That is, if $\ell$
became twice as long, $\Delta P$ would be twice as big too. $\Delta P \propto I$, because higher
pressure would push more current through (by making the fluid flow faster).
The dimension of current is $[I] = [L^3 T^{-1}]$. In order to get the dimension $[M]$,
we must also introduce the viscosity $[\eta] = [M L^{-1} T^{-1}]$, but that makes sense,
because if the viscosity were higher, like in thick oil or syrup, we would need
more pressure to create the same current $I$. Then in order for the dimensions
to agree we must have, for some dimensionless constant $C$,
$$\Delta P = C \left( \frac{\eta\, \ell\, I}{R^4} \right) \tag{8.34}$$
The factor $R^4$ is dimensionally necessary, $R$ being the only other length in
the problem. It also makes sense: if the pipe is wider, the pressure to drive
current $I$ would be less, dramatically less, in fact. If the pipe is twice as wide,
$\Delta P$ is less by a factor 16. For Poiseuille flow, the constant $C$ turns out to
be $8/\pi \approx 2.5$, but the actual value is not so important. We just note that
it is of order 1. By this dimensional argument we know something about
$\Delta P$.
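The bookkeeping behind the $R^4$ in Eq (8.34) can be made mechanical by writing each dimension as a triple of exponents of $(M, L, T)$. The representation and the function names below are assumptions of this sketch; the dimensions themselves are the ones stated in the text.

```python
# Dimensions as exponent triples (M, L, T).
PRESSURE = (1, -1, -2)
VISCOSITY = (1, -1, -1)
LENGTH = (0, 1, 0)
CURRENT = (0, 3, -1)

def combine(*factors):
    """Dimension of a product, given (dimension, power) pairs."""
    return tuple(sum(dim[i] * power for dim, power in factors) for i in range(3))

def radius_exponent():
    """Power n of R that makes eta * l * I * R^n a pressure."""
    partial = combine((VISCOSITY, 1), (LENGTH, 1), (CURRENT, 1))
    # M and T already match a pressure; only the length exponent is left to fix.
    assert partial[0] == PRESSURE[0] and partial[2] == PRESSURE[2]
    return PRESSURE[1] - partial[1]

n = radius_exponent()
print("exponent of R:", n)
print("check:", combine((VISCOSITY, 1), (LENGTH, 1), (CURRENT, 1), (LENGTH, n)) == PRESSURE)
```

With $\eta$, $\ell$ and $I$ each entering to the first power, the mass and time exponents already agree, and the leftover length exponent forces the factor $R^{-4}$.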
Now is $\Delta P$ small or not? Remember that Bernoulli's Principle predicts
$\Delta P = 0$ for steady flow in a horizontal pipe, since it says
$$P + \frac{1}{2}\rho v^2 = \text{constant} \tag{8.35}$$
and the second term is constant. Bernoulli's Principle is approximately true
if the change in $P$, that is, $\Delta P$, is much less than the (truly constant) second
term $(1/2)\rho v^2$. We therefore find the ratio and ask if it is much less than 1:
$$\frac{\Delta P}{\frac{1}{2}\rho v^2} = \frac{2C\pi\, \eta\, \ell}{\rho R^2 v} \tag{8.36}$$
Here we used $\Delta P$ from Eq (8.34) and $I = \pi R^2 v$ to express the current $I$ in
terms of speed $v$. Is this ratio small? We notice that the length $\ell$ of the pipe
is in the numerator. This means that for a long enough pipe, the ratio is as
large as we want, and Bernoulli's Principle certainly fails. That is, friction is
important in a long enough pipe, but that is just common sense. Of course
the fluid eventually loses energy in that case. But let us take a short pipe,
with length some small multiple of $R$, say $\ell = cR$, with $c$ a pure number
of order 1. Then, ignoring all the dimensionless constants, and just keeping
the dimensional factors,
$$\frac{\Delta P}{\frac{1}{2}\rho v^2} \sim \frac{\eta}{\rho R v} \tag{8.37}$$
Is this small? We can estimate it for flow in the aorta, where we have argued
Bernoulli's Principle holds. The viscosity of blood is perhaps 5 times that of
water, so $\eta \approx 5 \times 10^{-3}$ Pa·s. Also $\rho$ for blood is essentially the same as water, $10^3$
kg/m$^3$, $R \approx 10^{-2}$ m, and $v \approx 1$ m/s. The right side of Eq (8.37) is then the
dimensionless number $5 \times 10^{-4}$. This is small and we see why Bernoulli's
Principle holds to good approximation.
On the other hand, we can do the same estimate for flow in a capillary.
Then $R \approx 5 \times 10^{-6}$ m and $v \approx 10^{-3}$ m/s, and we find the right side
of Eq (8.37) is the dimensionless number 1000, very large compared to 1.
Bernoulli's Principle doesn't hold here!
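The two estimates above are the same one-line computation with different inputs, so they can be sketched as a single function. The names are invented for the example; the numbers are the ones quoted in the text.

```python
def friction_ratio(viscosity, density, radius, speed):
    """Right side of Eq (8.37): eta / (rho * R * v), a pure number."""
    return viscosity / (density * radius * speed)

ETA_BLOOD = 5e-3   # Pa*s
RHO_BLOOD = 1e3    # kg/m^3

aorta = friction_ratio(ETA_BLOOD, RHO_BLOOD, 1e-2, 1.0)
capillary = friction_ratio(ETA_BLOOD, RHO_BLOOD, 5e-6, 1e-3)
print("aorta:", aorta)
print("capillary:", capillary)
```

The same fluid gives a ratio far below 1 in the aorta and far above 1 in a capillary, purely because $R$ and $v$ are so different.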
8.15 The Reynolds Number
The number we were computing at the end of the last section is the reciprocal
of a very famous quantity, the Reynolds number. We define it here,
$$Re = \frac{\rho R v}{\eta} \tag{8.38}$$
where $\eta$ and $\rho$ are the viscosity and mass density of a fluid, $R$ is a typical
length characterizing the flow, like the radius of the pipe, and $v$ is a typical
speed in the flow. Taking reciprocals of the quantities we computed in the
last section, we note that $Re \approx 2000$ in the aorta and $Re \approx 0.001$ in the
capillaries. If $Re \gg 1$, then we expect Bernoulli's Principle to hold, and
friction to be unimportant. On the other hand, if $Re \ll 1$, then friction
dominates the flow. This just restates the result of the last section in terms
of $Re$ instead of $1/Re$.
Now it is easy to see why the viscosity of water is not very important at
the human scale of $R \approx 1$ m and $v \approx 1$ m/s: we find $Re \approx 10^{6}$, which is
much greater than 1. On the other hand, for single celled organisms, with size
perhaps $10^{-5}$ m, accustomed to move at $10^{-5}$ m/s, we have $Re \approx 10^{-4}$. This
is much less than 1, so viscosity dominates their world. In fact, whenever
you look at things moving in a microscope, you are looking at a situation
dominated by viscosity, with small Reynolds number. The Reynolds number
helps us decide for a given flow whether viscosity is important or not. If
$Re$ is less than one, viscosity is very important. If $Re$ is greater than 100,
viscosity is not important. Ah, if only life were that simple!
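The four Reynolds numbers mentioned so far can be tabulated with one small function. This is a sketch using the values from the text; the function names, the dictionary labels, and the wording of the three-way classification are assumptions of the example, with the thresholds 1 and 100 taken from the rule of thumb above.

```python
def reynolds_number(density, length, speed, viscosity):
    """Re = rho * R * v / eta (Eq 8.38)."""
    return density * length * speed / viscosity

def regime(re_value):
    """Rule of thumb from the text: below 1, above 100, or in between."""
    if re_value < 1:
        return "viscosity very important"
    if re_value > 100:
        return "viscosity not important"
    return "in between"

RHO_WATER, ETA_WATER = 1e3, 1e-3   # kg/m^3, Pa*s
RHO_BLOOD, ETA_BLOOD = 1e3, 5e-3   # kg/m^3, Pa*s

cases = {
    "human scale": reynolds_number(RHO_WATER, 1.0, 1.0, ETA_WATER),
    "single cell": reynolds_number(RHO_WATER, 1e-5, 1e-5, ETA_WATER),
    "aorta": reynolds_number(RHO_BLOOD, 1e-2, 1.0, ETA_BLOOD),
    "capillary": reynolds_number(RHO_BLOOD, 5e-6, 1e-3, ETA_BLOOD),
}
for name, value in cases.items():
    print(f"{name}: Re ~ {value:.0e} ({regime(value)})")
```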
As $Re$ gets larger, viscosity should become less and less important, and
Bernoulli's Principle should become more and more accurate. But as Osborne
Reynolds showed in the 1880s, in careful experiments involving pipe flow,
something amazing happens when $Re$ becomes larger than 2000 or so: the
flow suddenly makes a transition to a new kind of flow, turbulence. This
transition to turbulence happens in a pipe at high enough $Re$, which means
large enough radius $R$, large enough speed $v$, small enough viscosity $\eta$, or
any combination of these that makes $Re$ sufficiently large. In turbulent flows,
Bernoulli's Principle fails for a new reason: the new complicated flow includes
shear flow at all length scales, including the very small scales where viscosity
is important. That is, the radius $R$ of the pipe is not the only thing setting
the length scale. The flow itself can set a new length scale!
We still have the proportionality relationship
$$\Delta P \propto I \tag{8.39}$$
between pressure drop $\Delta P$ and current $I$ in the pipe, as in Eq (8.34), argued
there by dimensional analysis, but now the pure number $C$ in that relationship
need not be "of order 1." It might involve the large, dimensionless
Reynolds number. We can still write this proportionality relationship as
$$\Delta P = rI \tag{8.40}$$
but here $r$ is a resistance that we cannot actually calculate, or even estimate,
from first principles. It is called a resistance, because for fixed pressure drop
$\Delta P$, larger $r$ means smaller $I$ (their product is fixed). In turbulence $r$ is
unexpectedly large, and this represents a departure from Bernoulli's Principle
(which, you recall, says $\Delta P = 0$ for horizontal flow through a cylindrical pipe,
i.e., $r = 0$).
Turbulent flow is not well understood theoretically, even though most
familiar flows are turbulent. This means we are surrounded by fluid flow
phenomena that we don't fully understand. Just to give one example, the
resistance $r$ of a pipe to turbulent flow depends in a mysterious way on
the roughness of the interior wall, something that is completely irrelevant in
Poiseuille flow. In turbulent flow the motion is chaotic, even if there is some
overall average speed of the fluid as a whole. Within the turbulent flow the
whole fluid is rapidly mixed, and there is no such thing as a smooth flow
line. The average flow of a turbulent fluid down a pipe looks like plug flow,
but within this average flow the fluid is doing incredibly complicated things.
In the flow past an airfoil in Fig 8.12, the flow lines behind the airfoil would
be impossible to draw. (Leonardo da Vinci attempted careful drawings of
turbulent flow in rivers: it is interesting to see how he visualized it, very
swirly!) There is no fully reliable mathematical model of turbulent flow, so
this represents a problem physics has identified, but not yet solved. In the
absence of a good mathematical theory, designers and engineers must build
expensive prototypes and test them in real flows, wind tunnels, etc. One
extremely useful consequence of the theory as we have given it is that you
don't have to use full-scale prototypes for such tests. You can make little
scale models and adjust $v$, $\eta$ and $\rho$ to make the Reynolds number the one
you are interested in. The resulting scaled flow should model the real flow.
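The scale-model idea amounts to solving Eq (8.38) for the speed that gives the model the same $Re$ as the real flow. The sketch below is illustrative only: the function names, the ratio parameters, and the one-tenth-scale example are assumptions, not a procedure described in the text.

```python
def reynolds_number(density, length, speed, viscosity):
    """Re = rho * R * v / eta (Eq 8.38)."""
    return density * length * speed / viscosity

def model_speed(full_length, full_speed, model_length,
                density_ratio=1.0, viscosity_ratio=1.0):
    """Speed at which a scale model has the same Re as the full-size flow.

    density_ratio and viscosity_ratio are (model fluid) / (full-size fluid);
    both are 1 when the model is tested in the same fluid.
    """
    return full_speed * (full_length / model_length) * viscosity_ratio / density_ratio

# Assumed example: a one-tenth scale model tested in the same fluid.
v_model = model_speed(full_length=10.0, full_speed=2.0, model_length=1.0)
print("model speed (m/s):", v_model)
```

In the same fluid a model ten times smaller must be run ten times faster, which is one reason a change of fluid (different $\eta$ and $\rho$) can be convenient.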
One practical consequence of turbulence, as we have already noted, is
increased resistance $r$ to flow in pipes or channels, as in Eq (8.40). Flow in
fire hoses is turbulent, and the pressure drop down the hose due to turbulence
means that the water that emerges out the nozzle does not have all the
kinetic energy $K$ that it could, and hence it cannot go as high as it should.
For unknown reasons, if a small amount of polymer is dissolved in the water,
the turbulence is inhibited, and the turbulent resistance $r$ is less, meaning
the water can now reach higher, say to the sixth floor of a burning building.
This example hints at how useful it would be to understand turbulence
better. Another practical consequence of turbulence is mixing. In industrial
processes one frequently wants to mix two solutions. This happens very fast
if the mixture is made turbulent. Nowadays there is a lot of work on
microfluidic devices, essentially networks of very small pipes, with perhaps
some moving parts, maybe etched on a silicon chip. One problem in these
devices is getting two fluids to mix: being small, these devices operate at
low Reynolds number, so the flows are never turbulent. In the absence of
turbulence, our usual method for mixing things doesn't work.
8.16 Resistance in Series and Parallel
The relationship in Eq (8.40), expressing the proportionality between pressure
difference and volume current in a single pipe, can be extended to networks
of pipes. In Fig 8.16 we imagine two pipes in parallel, with resistances
$r_1$ and $r_2$, connecting fluid at pressure $P_2$ to pressure $P_1$. Because of the
pressure difference $\Delta P = P_2 - P_1$, there will be a flow in each pipe, and the
total current flowing will be
$$I = I_1 + I_2 = \Delta P/r_1 + \Delta P/r_2 = \Delta P \left( \frac{1}{r_1} + \frac{1}{r_2} \right) \tag{8.41}$$
Since for the system as a whole we would define the resistance $r$ by $\Delta P = rI$,
the total resistance $r$ is given by
$$\frac{1}{r} = \frac{1}{r_1} + \frac{1}{r_2} \tag{8.42}$$
A simpler way to say this is to define the conductance $g = 1/r$ where $I =
g\,\Delta P$, and to note that in parallel the conductances add, since each pipe
conducts current.
Fig. 8.16: Two pipes in parallel, with resistances $r_1$ and $r_2$, connect a reservoir at pressure
$P_2$ with a reservoir at pressure $P_1$. The total current $I$ through the system is shared between
the two pipes. Since the fluid is incompressible, $I = I_1 + I_2$.
When one pipe follows another, the pipes are said to be in series, as in
Fig 8.17. Taking the resistances of the individual pipes to be $r_1$ and $r_2$, we
see that the pressure drop in the first pipe, $r_1 I$, plus the pressure drop in the
second pipe, $r_2 I$, must be the total pressure drop $\Delta P = P_2 - P_1$. Thus
$$\Delta P = I(r_1 + r_2) \tag{8.43}$$
so that the resistance $r$ of the series combination of pipes is just
$$r = r_1 + r_2 \tag{8.44}$$
Thus in series, the resistances add.

Fig. 8.17: Two pipes in series, with resistances $r_1$ and $r_2$, connect reservoirs at pressures
$P_2$ and $P_1$. Each pipe carries current $I$. The pressure drops continuously as you move
from left to right through the pipe, and in particular the junction between the two pipes
(dotted line) is at some intermediate pressure.

When we study electrical current we will see exactly these same relationships.
The role of pressure difference will be taken over by voltage difference
(potential difference), and the role of volume current will be taken over by
electric current. The role of pipes (with their resistance) will be taken over
by resistors.
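The series and parallel rules of this section can be sketched as two small functions, which then combine freely for larger networks. The function names and the sample resistances (2 and 3, in arbitrary units) are made up for the example.

```python
def series(*resistances):
    """Pipes in series: resistances add (Eq 8.44)."""
    return sum(resistances)

def parallel(*resistances):
    """Pipes in parallel: conductances 1/r add (Eq 8.42)."""
    return 1.0 / sum(1.0 / r for r in resistances)

def current(delta_p, resistance):
    """Volume current from Delta P = r * I (Eq 8.40)."""
    return delta_p / resistance

r1, r2 = 2.0, 3.0   # assumed values, arbitrary units
print("series:", series(r1, r2))
print("parallel:", parallel(r1, r2))
# In parallel the branch currents add up to the total current (Eq 8.41).
print("total:", current(1.0, parallel(r1, r2)),
      "branches:", current(1.0, r1) + current(1.0, r2))
```

The parallel combination always has less resistance than either pipe alone, since opening a second path can only help the flow.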
8.17 The Human Circulatory System
Blood flows through arteries and veins the way water flows through pipes.
Physics ought to have something to say about this. The branching network of
the arteries is like a network of resistances. Physiologists have distinguished
various levels in this branching scheme, beginning with the aorta, the single
large artery from the left ventricle of the heart, then the large arteries, the
small arteries, the arterioles, and the capillaries. The walls of the arteries
include smooth muscle. The arterioles, in particular, are under involuntary
control of the nervous system, and can open up or constrict. Flow in the
arterioles is Poiseuille flow, so the resistance depends sensitively on the
radius $R$ of the arteriole (like $1/R^4$). Thus the body can selectively control
resistance.
From the capillaries blood flows into venules (i.e., small veins), then veins,
and back to the right atrium of the heart. The veins are more passive, and
many contain valves to keep blood from flowing backward. The volume of
the venous side is greater than that of the arterial side. In fact, almost 80%
of the blood at any moment is in the veins!
The output of the right ventricle into the pulmonary artery circulates
blood to the lungs. Again there is a branched network down to the pulmonary
capillaries, where the blood is oxygenated before going back to the left atrium
of the heart.
One remarkable observation follows from the incompressibility of the
blood. The volume flow into any part of the system must, on average, equal
the volume flow out, as in Eq (8.25). Thus the flow into the systemic arteries
from the heart must equal the flow out through the capillaries. The heart
pumps about $I = 6$ liters/minute of blood, i.e., $I = 10^{-4}$ m$^3$/s, into the aorta,
with radius $r_0 = 10^{-2}$ m, and hence area $A_0 = 3 \times 10^{-4}$ m$^2$. Thus the average
velocity of flow in the aorta is $u_0 = I/A_0 = 0.3$ m/s. On the other hand, the
velocity of flow in the capillaries is $u_c \approx 10^{-3}$ m/s, and the radius of a
capillary is about $r_c \approx 3 \times 10^{-6}$ m, and hence area $A_c = 3 \times 10^{-11}$ m$^2$. The total
area of the capillaries is $A = N_c A_c$, where $N_c$ is the number of capillaries in
the body. We can compute $N_c$, because $N_c A_c u_c = I = 10^{-4}$ m$^3$/s implies
$$N_c \approx \frac{10^{-4}\ \text{m}^3/\text{s}}{10^{-3}\ \text{m/s} \times 3 \times 10^{-11}\ \text{m}^2} \approx 3 \times 10^{9}\ \text{capillaries}, \tag{8.45}$$
a number that would be hard to obtain any other way.
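The capillary count is a direct application of Eq (8.25), and it is easy to redo without rounding the areas. This sketch uses the inputs quoted above; the variable names are invented for the example.

```python
import math

I_HEART = 6e-3 / 60.0   # 6 liters/minute expressed in m^3/s
R_AORTA = 1e-2          # m
R_CAP = 3e-6            # m
U_CAP = 1e-3            # m/s

area_aorta = math.pi * R_AORTA**2
u_aorta = I_HEART / area_aorta                 # average speed in the aorta
area_cap = math.pi * R_CAP**2
n_capillaries = I_HEART / (U_CAP * area_cap)   # from N_c * A_c * u_c = I

print(f"aorta speed: {u_aorta:.2f} m/s")
print(f"number of capillaries: {n_capillaries:.1e}")
```

Keeping the factor $\pi$ unrounded shifts the results slightly, but the speed stays near 0.3 m/s and the count stays at a few times $10^9$, which is all an estimate of this kind can claim.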
When you have your blood pressure taken, the two numbers, say 130/80,
are the systolic and diastolic pressures, measured in mm of Hg (gauge
pressures). The beating heart creates pulses of higher pressure (here 130) on top
of a resting pressure (here 80). On the venous side the pressure is not much
more than atmospheric pressure (gauge pressure 0). Thus the pressure drop
from the heart through the capillaries (the pressure that drives the flow)
is about $\Delta P = 100$ mm Hg $\approx 1.3 \times 10^{4}$ Pa. The measurement of blood
pressure reminds us that Bernoulli's principle, $P + \rho g h + \frac{1}{2}\rho u^2 = \text{constant}$, holds
through the large arteries. Furthermore, as we will see, the average blood
speed $u$ stays quite constant through this part of the system. Therefore, if
you measure the pressure $P$ in a large artery at the height of the heart (call
this $h = 0$), then you are essentially measuring pressure $P$ at the heart. At a
different height $h$, the measurement would differ from the value at the heart
by the hydrostatic pressure $\rho g h$. Clinicians usually measure blood pressure
in the upper arm, at mid-chest height.
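Knowing $\Delta P$ and $I$ for the flow, we can find the resistance $R_A$ for the
arterial system,
$$R_A = \frac{\Delta P}{I} \approx \frac{1.3 \times 10^{4}\ \text{Pa}}{10^{-4}\ \text{m}^3/\text{s}} \approx 1.3 \times 10^{8}\ \text{Pa·s/m}^3 \tag{8.46}$$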
Is this resistance perhaps due to the capillaries, which are so narrow? The
resistance of a single capillary can be computed from what we know of
Poiseuille flow, using Eq (8.34),
$$R_c = \frac{8 \eta\, \ell_c}{\pi r_c^4} \approx \frac{8\,(5 \times 10^{-3}\ \text{Pa·s})(3 \times 10^{-4}\ \text{m})}{\pi\,(3 \times 10^{-6}\ \text{m})^4} \approx 4 \times 10^{16}\ \text{Pa·s/m}^3 \tag{8.47}$$
We have used values for blood viscosity $\eta \approx 5 \times 10^{-3}$ Pa·s, the length of
a capillary $\ell_c \approx 3 \times 10^{-4}$ m, and the radius of a capillary $r_c \approx 3 \times 10^{-6}$ m.
Now the capillaries taken all together are $N_c$ such resistances in parallel, so
by Eq (8.42), the resistance of all the capillaries is
$$R = \frac{R_c}{N_c} \approx \frac{4 \times 10^{16}}{3 \times 10^{9}} \approx 10^{7}\ \text{Pa·s/m}^3 \tag{8.48}$$
The uncertainty in this computation is rather large, but in comparing
Eq (8.46) with Eq (8.48), it does seem that there is appreciably more
resistance in the arterial system than just the resistance of the capillaries. That is
consistent with the observation that in the arterioles there is a smooth
muscle mechanism to change the resistance. That wouldn't make much sense
if the arteriole resistance were negligible! We will see in the next section a
model in which the resistance is rather uniform through the whole system,
in a sense to be explained.
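The comparison between the whole arterial resistance and the capillaries alone can be redone in a few lines. This sketch takes the inputs from the text, plus the conversion 1 mm Hg ≈ 133.3 Pa; the variable names are assumptions of the example, and given the rough inputs only the orders of magnitude mean anything.

```python
import math

MMHG = 133.3                       # Pa per mm Hg
delta_p = 100 * MMHG               # driving pressure, Pa
heart_current = 1e-4               # m^3/s
r_arterial = delta_p / heart_current              # as in Eq (8.46)

eta, length_c, radius_c = 5e-3, 3e-4, 3e-6        # Pa*s, m, m
# Poiseuille resistance of one capillary: Eq (8.34) with C = 8/pi.
r_one_capillary = 8 * eta * length_c / (math.pi * radius_c**4)

n_capillaries = 3e9
r_all_capillaries = r_one_capillary / n_capillaries   # equal resistances in parallel

print(f"arterial system:  {r_arterial:.1e} Pa*s/m^3")
print(f"one capillary:    {r_one_capillary:.1e} Pa*s/m^3")
print(f"all capillaries:  {r_all_capillaries:.1e} Pa*s/m^3")
```

The capillaries together come out roughly an order of magnitude below the total, which is the point of the comparison.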
When you exercise hard, your muscles need more oxygen, and hence more
blood. The arterioles supplying these muscles open wider. That creates a
problem, though, as diagrammed in Fig 8.18. In the branched network, the
only relevant branch to look at is the one that separates arteries serving the
exercising muscles from everything else. Suppose $R_1$ is this exercising part,
and let $I_1$ and $I_2$ be the currents through $R_1$ and $R_2$. When its arterioles
open wider, $R_1$ decreases. The problem is that the pressure drop across $R_1$
is the same as the pressure drop across $R_2$. Thus $I_1 R_1 = I_2 R_2$, and therefore
$$\frac{I_2}{I_1} = \frac{R_1}{R_2} \tag{8.49}$$
When $R_1$ goes down, $I_2$ goes down relative to $I_1$, and since these two together
sum to the total current output $I$ of the heart, i.e.,
$$I_1 + I_2 = I \tag{8.50}$$
it means that the muscles have "stolen" blood from the rest of the body
(including the brain). Now that $I_2$ is a smaller fraction of the total $I$, the
only way to bring $I_2$ back to its proper value is to increase $I$, the total output
of the heart. On each stroke the heart empties (most of) the left ventricle,
a fixed volume, so the only way to increase $I$ is for the heart to make more
strokes, i.e., to beat faster. And of course that is what happens when we
exercise!

Fig. 8.18: Flow through the systemic arteries is indicated schematically, with hydrodynamic
resistance, i.e., subnetworks of arteries, indicated by the resistor symbol. The resistance $R_1$
is the subnetwork serving an exercising muscle. It is in parallel with another subnetwork
$R_2$ which has only normal demand. When $R_1$ goes down, the current $I$ supplied by the
heart goes preferentially to $R_1$, starving $R_2$.

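The blood-stealing argument of Eqs (8.49) and (8.50) can be followed with numbers. In this sketch the resistances and currents are in arbitrary units, and the particular values (equal branches at rest, $R_1$ dropping to a quarter during exercise) are assumptions chosen only to make the effect visible.

```python
def branch_currents(total_current, r1, r2):
    """Split a total current between two parallel branches.

    Uses equal pressure drops, I1*r1 = I2*r2 (Eq 8.49), and I1 + I2 = I (Eq 8.50).
    """
    i1 = total_current * r2 / (r1 + r2)
    i2 = total_current * r1 / (r1 + r2)
    return i1, i2

def total_needed(i2_target, r1, r2):
    """Total current the heart must supply to hold I2 at a target value."""
    return i2_target * (r1 + r2) / r1

# At rest (assumed): equal resistances share the current equally.
i1_rest, i2_rest = branch_currents(1.0, 1.0, 1.0)
# During exercise (assumed): R1 drops to a quarter, total output unchanged.
i1_ex, i2_ex = branch_currents(1.0, 0.25, 1.0)
# Output needed to restore I2 to its resting value.
i_required = total_needed(i2_rest, 0.25, 1.0)

print("rest:", i1_rest, i2_rest)
print("exercise, same output:", i1_ex, i2_ex)
print("output needed to restore I2:", i_required)
```

With these numbers the rest of the body drops from half of the output to a fifth of it, and the heart has to deliver two and a half times as much to make up the difference.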
We know that the cross-sectional area of the arterial system goes up even
as the individual vessels get smaller, because the blood slows down, reaching
a speed of only $u_c \approx 1$ mm/s in the capillaries. The way the cross-sectional
area goes up, though, is interesting. It is shown in Fig 8.19. That diagram
is not so easy to interpret, but it appears that the area stays quite constant
until the vessels are about 200 $\mu$m in diameter. In the large vessels, then, the
average blood speed does not slow down. Then the area begins to increase
(to a total area larger than indicated there, in vessels that are smaller; a
little extrapolation is necessary), and the blood slows down. That is, there
seem to be two regimes, a constant area regime, and a growing area regime,
with the crossover at diameter $\approx 200$ $\mu$m. That is also roughly where the
crossover occurs from a Bernoulli's Principle regime, in large vessels, to a
Poiseuille flow regime in small vessels. This observation will be part of the
fractal model of circulation in the next section.

Fig. 8.19: The cross-sectional area of the arterial system as a function of the diameter
of vessels. The increase in area at small vessel size implies the slowing down of the flow.
Axes: vessel diameter ($\mu$m), from 1 to 10000; total cross-sectional area (m$^2 \times 10^{-4}$), from 1 to 100.
(Redrawn from The Mechanics of the Circulation, C.G. Caro, et al., Oxford University
Press, 1978, Fig. 12.4.)

There is one more thing we should say about the circulation in the large
vessels, where Bernoulli's Principle applies. This is also the regime where
you can feel your pulse, and where you can measure the systolic and
diastolic pressures, i.e., where the pressure oscillates appreciably. The large
vessels actually bulge when the systolic pressure pulse occurs, and this bulge
propagates along the arteries like a wave. (Sometimes you can even see this
in arteries near the skin.) A peculiar thing can happen to a wave when it
runs into some change in its medium: it can reflect. This could happen where
the arteries branch into smaller arteries, for example, since the branch
represents a kind of interruption. A reflected systolic wave would represent blood
flowing the wrong way(!) or really, it would subtract from blood flowing the
right way. It would be a kind of design flaw in the system. Thus one should
expect that branching is designed to minimize reflection of systolic waves in
the large vessel regime. This is a physical insight into the problem which is
also part of the fractal model of the next section.
8.18 A Fractal Model of Circulation
Warm-blooded animals maintain a constant temperature, typically above the
ambient temperature, despite losing energy by flow of heat to the outside.
Since this heat loss is through the surface, it goes as the square of the linear
size, or as the 2/3 power of the volume (or mass). This led, in the 19th
century, to the expectation that the nutritional requirements B of animals
would grow with mass like

$$B \propto M^{2/3} \qquad \text{Bergmann's Law} \qquad (8.51)$$

That B grows more slowly than M, as Bergmann's Law implies, is a familiar
fact. Large animals need proportionately less food than small ones. Humans
eat roughly 1/50 their weight each day, but small mice may eat half their
weight each day! Since it is not difficult to determine the food requirements
of animals as a function of their mass, there is now a lot of data on this
question, and the result is surprising. The actual empirical law is

$$B \propto M^{3/4} \qquad (8.52)$$

and it is accurately obeyed not just by mammals but also by cold-blooded
animals, and even such diverse life forms as insects, plants and bacteria! Clearly
there is some other need being fueled by food intake, and it is more
demanding than balancing heat flow through the surface, since it grows slightly faster
than that particular need, which is still there, for some animals.
In 1997 Geoffrey West, James Brown, and Brian Enquist (WBE)
suggested that this law, which we will call the 3/4 law, was a consequence of the
need to supply three-dimensional bodies through one-dimensional networks,
for which the human circulatory system is a convenient example, duplicated
in one way or another in virtually every living thing. In geometrical terms,
they were suggesting that the circulatory system is a fractal, a set that does
not have the dimension that naively it should have. We begin by explaining
this idea briefly.
The dimension of a set S describes how it scales. A clever way to measure
this is to think of covering S with spheres of diameter $\epsilon_1$. Let the number
necessary be $N_1$. Now take smaller spheres, of diameter $\epsilon_2$, and let the
number necessary to cover be $N_2$, etc. In this way you find N as a function
of sphere diameter $\epsilon$. Now you ask how N scales with $\epsilon$ by graphing ln N vs.
ln $\epsilon$. If this is a straight line with slope $-\delta$, then the fractal dimension is $\delta$.
(The minus sign is because as the size of spheres goes down, the number of
them goes up.) This means $N \propto \epsilon^{-\delta}$, and so N, the measure of S, scales
with the exponent $\delta$. It is clear for geometrically simple sets like the line
segment AB or the square surface ABCD in Fig 8.20, that $\delta$ agrees with the
usual notion of dimension.
Fig. 8.20: Using spheres with $2^{-1} = 1/2$ the diameter, it takes $2^1$ times as many to cover
the line AB, and $2^2 = 4$ times as many to cover the square ABCD, because these objects
have dimension 1 and 2 respectively.
This definition of dimension turns up some sets that behave very
peculiarly! One of the easiest examples to see clearly is the Koch snowflake,
shown in Fig 8.21. Let the length of the initial line segment in the
construction (not shown) be 1. Then using spheres of diameter $\epsilon = 1, 1/3, 1/3^2, 1/3^3, \ldots$,
we need $N = 1, 4, 4^2, 4^3, \ldots$ to cover the snowflake. Graphing N
vs. $\epsilon$ on log-log paper gives a slope of $-\ln 4/\ln 3 \approx -1.26$. Thus the Koch
snowflake, although it seems to be made out of one-dimensional segments,
has fractal dimension 1.26, which is greater than 1.

Fig. 8.21: The Koch snowflake has triangular excursions 1/3 of the way along each side! It
can be built up by starting with a straight segment (not shown), (a) introducing a triangular
excursion, then (b) introducing triangular excursions into each of its 4 sides, then (c) doing
the same for each of the resulting 16 sides, etc. Its fractal dimension is $\ln 4/\ln 3 \approx 1.26$,
and not 1, as one might naively expect.
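The log-log slope for the Koch construction can be reproduced numerically. The following sketch takes the covering counts quoted in the text (N = 4^k spheres of diameter 3^-k) and fits a straight line to ln N against the logarithm of the diameter. The helper least_squares_slope is an illustrative name, not something from the text.

```python
import math

# Covering data for the Koch construction, as quoted in the text:
# spheres of diameter 3**-k require N = 4**k of them.
levels = range(0, 8)
log_diameter = [math.log(3.0 ** -k) for k in levels]
log_count = [math.log(4.0 ** k) for k in levels]


def least_squares_slope(xs, ys):
    """Slope of the best-fit straight line through the points (xs, ys)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


slope = least_squares_slope(log_diameter, log_count)
dimension = -slope  # the fractal dimension is minus the slope

print(f"slope = {slope:.4f}, fractal dimension = {dimension:.4f}")
```

Because these points lie exactly on a line, the fit simply returns ln 4 / ln 3. For measured covering data the same fit would give an estimate of the dimension.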
The WBE theory assumes that the arterial system (or its analogue in
plants) is a fractal branching network of essentially one-dimensional tubes
that has fractal dimension 3. The network starts with the aorta, of length
$\ell_0$. This then branches into n large arteries, where n is a fixed parameter,
characterizing the network. For the sake of concreteness, let us suppose
n = 3, but continue to call it n. Each of the n arteries of level 1, of length
$\ell_1$, branches into n smaller arteries. (There are therefore $n^2$ of these smaller
arteries.) These, in turn, after length $\ell_2$, each branch into n still smaller
arteries, etc. The number of arteries at each level is $1, n, n^2, n^3, \ldots$. In
general, at the kth level there are $n^k$ arteries. Eventually, after N branchings,
we get to the capillaries, so

$$n^N = N_c \qquad (8.53)$$

where $N_c$ is the number of capillaries. If we try to fit this scheme to the
human circulatory system, knowing $N_c \approx 3 \times 10^9$, and taking n = 3 for the
number of branches at each branching, we find N = 20, since $3^{20} \approx 3.5 \times 10^9$.
The fractal branching scheme is shown in Fig 8.22. What is not shown
there is that each level, down to the capillaries, covers the same region of
space. In fact, even the tissues that make up the walls of the larger arteries
are nourished by capillaries! This suggests that the fractal dimension of the
network is 3, so that for each k we can cover the body by spheres of size $\ell_k$
with a number that goes as $\ell_k^{-3}$. But the number of these spheres is $n^k$. Thus

$$n^k \propto \ell_k^{-3} \qquad (8.54)$$

or

$$\ell_k \propto n^{-k/3} \qquad (8.55)$$

Putting in the constant of proportionality, we have

$$\ell_k = \ell_0 n^{-k/3} \qquad (8.56)$$

At each level, the length of the artery goes down by the same factor, $n^{-1/3}$.
This, as we have seen, follows from the first assumption of the WBE theory,
that the arterial system is a fractal of dimension 3.

Fig. 8.22: The branched arterial network is shown pulled out nearly straight. At each
branching, the $n^k$ tubes at level k, of length $\ell_k$, each produce n new tubes, each of length
$\ell_{k+1}$. The capillaries are shown greatly magnified!
Let us try this on the human example, using n = 3. If the aorta has
length $\ell_0 = 0.5$ m, the capillaries, at level k = N = 20, would have length
$\ell_c \approx 0.5 \times 3^{-20/3} \approx 3 \times 10^{-4}$ m. This is about right!
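The two numerical estimates made so far, the number of branching levels and the capillary length, take only a few lines to reproduce. The sketch below uses the values quoted in the text (n = 3, about 3 × 10^9 capillaries, a 0.5 m aorta); the variable names are illustrative only.

```python
import math

n = 3          # branches per branching (the text's choice)
n_cap = 3e9    # approximate number of capillaries, N_c
l0 = 0.5       # aorta length in metres

# Eq (8.53): n**N = N_c, so N = ln(N_c) / ln(n), rounded to a whole level.
levels_exact = math.log(n_cap) / math.log(n)
N = round(levels_exact)

# Eq (8.56): the length shrinks by n**(-1/3) at every level.
l_cap = l0 * n ** (-N / 3)

print(f"N = {levels_exact:.2f}, rounded to {N}")
print(f"capillary length = {l_cap:.2e} m")
```

The unrounded level count comes out just under 20, which is why N = 20 overshoots the capillary count slightly.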
The second assumption of the WBE theory is that the radii of the arteries
are designed so as to minimize the energy required to pump blood through,
consistent with getting blood down to the capillaries. The third and last
assumption of the theory is that the capillaries are essentially the same in all
organisms, designed for the efficient exchange of dissolved substances with
surrounding tissue.
In plants there is a very simple rule about the radii of branches of the
arterial system: the cross-sectional area doesn't change when branching
occurs, because it is always the same pipes (xylem) as you go up the stem
(the aorta), only various pipes are redirected into various branches, without
changing their areas. The same rule, that cross-sectional area stays constant,
applies to humans (and all animals with beating hearts) in the upper part of
the arterial system, where the flow obeys Bernoulli's Principle. We have seen
this in Fig 8.19. Amazingly, this follows from the principle of minimizing the
energy required to pump the blood. The only resistance in the upper part
of the circulation comes from reflection of the systolic wave at places where
branching occurs, and the condition for no reflection is exactly that the area
stay the same at the next level! Thus in both plants and animals, although
for different reasons, we have areas staying the same from one level to the
next,

$$\pi r_k^2 = n \pi r_{k+1}^2 \qquad \text{larger arteries} \qquad (8.57)$$

so that

$$r_{k+1} = r_k n^{-1/2}$$

Thus

$$r_k = r_0 n^{-k/2}$$

and the radius becomes less by the same factor, $n^{-1/2}$, at each branching.
Each new, smaller artery has cross-section less by the factor $n^{-1}$, but there
are n of them, so the total cross-section stays the same.
In animals this pattern holds only until the flow becomes Poiseuille, which
in humans occurs when $r_k \approx 100$ μm. Taking n = 3 and $r_0 = 10^{-2}$ m (the
radius of the aorta), this implies k = 8, since $10^{-2} \times 3^{-8/2} \approx 1.2 \times 10^{-4}$ m
= 120 μm. Thus in humans there are 12 more branchings (for a total of 20) to
get down to the capillaries.
We already know the lengths $\ell_k$ and the number of arteries $n^k$ for each
level of this part of the system. To minimize its Poiseuille flow resistance
turns out to require

$$r_{k+1} = r_k n^{-1/3} \qquad \text{smaller arteries} \qquad (8.60)$$

a result known since the 1920's as Murray's Law. Now the area does
not stay the same at branches. Rather, if $A_k$ is the area at level k,

$$A_{k+1} = n \pi r_{k+1}^2 = n^{1/3} \pi r_k^2 = n^{1/3} A_k \qquad (8.61)$$

The area grows with the factor $n^{1/3}$ at each branching. Since the diameter of
vessels goes down by the factor $n^{-1/3}$ at each branching, we can say $A_k \propto r_k^{-1}$
in this regime, and that is clearly visible in Fig 8.19! It shows up as the slope
−1 in the log-log plot for the smaller vessels. It is easy to check that the
resistance of each level in this part of the system is the same, and since these
resistances are in series, they simply add. Compare Eqs (8.46) and (8.48).
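The claim that every level has the same resistance under Murray's Law can be checked directly. The sketch below assumes the standard Poiseuille result that a single tube's resistance is proportional to its length divided by the fourth power of its radius; constant factors are dropped, and level 0 here means the first level of the small-vessel regime rather than the aorta. The function names are illustrative.

```python
def level_resistance(k, n=3, l0=1.0, r0=1.0):
    """Relative Poiseuille resistance of level k in the small-vessel regime.

    One tube has resistance proportional to length / radius**4
    (constant factors dropped); the n**k tubes of a level are in parallel.
    """
    length = l0 * n ** (-k / 3)   # Eq (8.56): lengths shrink by n**(-1/3)
    radius = r0 * n ** (-k / 3)   # Eq (8.60): radii shrink by n**(-1/3)
    single_tube = length / radius ** 4
    return single_tube / n ** k


def level_area(k, n=3, r0=1.0):
    """Total cross-sectional area of level k, in units of pi * r0**2."""
    radius = r0 * n ** (-k / 3)
    return n ** k * radius ** 2


for k in range(0, 13, 4):
    print(f"level {k:2d}: resistance = {level_resistance(k):.6f}, "
          f"area = {level_area(k):.3f}")
```

The resistance column stays fixed while the area grows by the factor $n^{1/3}$ per level, as in Eq (8.61).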
If we ask how small the capillaries are according to this rule, continuing
our computation with the data for humans, we should start with $r_8 = 120$ μm,
computed above according to the rule for large arteries, and then follow 12
more branchings by the new rule for small arteries, finding
$r_c = r_8 \times 3^{-12/3} = 120/81 = 1.5$ μm. Again, this is about right! This fractal
scheme seems uncannily accurate.
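The radius estimate chains the two branching rules together: constant area for the first 8 levels, then Murray's Law for the remaining 12. A short sketch of that arithmetic, using the human values quoted in the text (n = 3, aorta radius $10^{-2}$ m), follows; the function name radius and the constant names are illustrative.

```python
n = 3
R_AORTA = 1e-2   # aorta radius in metres
K_CROSS = 8      # level where the flow becomes Poiseuille
N_TOTAL = 20     # level of the capillaries


def radius(k):
    """Vessel radius at level k, using both branching rules in turn."""
    if k <= K_CROSS:
        # Constant-area rule: radius shrinks by n**(-1/2) per level.
        return R_AORTA * n ** (-k / 2)
    r_cross = R_AORTA * n ** (-K_CROSS / 2)
    # Murray's Law: radius shrinks by n**(-1/3) per level.
    return r_cross * n ** (-(k - K_CROSS) / 3)


r8 = radius(K_CROSS)
r_cap = radius(N_TOTAL)

print(f"r_8 = {r8 * 1e6:.0f} micrometres")
print(f"r_c = {r_cap * 1e6:.2f} micrometres")
```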
We really shouldn't be working our way down to the capillaries to see
what the theory says about them, because the third assumption of the
theory, mentioned above, is that the capillaries are always the same, and
conceptually should be the starting point. We should work our way up from
them, expressing other things in terms of them. We illustrate this principle
by deriving the 3/4 law, $B \propto M^{3/4}$. We will assume the constant-area rule
holds all the way down to the capillaries. This is true for plants, but not
for animals. It turns out that for animals one can replace "capillaries" by
"the smallest vessels for which the constant-area rule holds" (100 μm radius),
so the derivation is essentially the same.
The nutritional requirement of any organism is proportional to the volume
rate of flow from the heart $Q_0$, which is the rate at which the organism is
supplied with nutrient,

$$B \propto Q_0 = \pi r_0^2 u_0 = n^N \pi r_c^2 u_c \qquad (8.62)$$

Here $u_0$ is the mean speed of flow in the aorta, $u_c$ is the mean speed in
the capillaries, etc. The point is that having expressed it in terms of the
constants $u_c$ and $r_c$ describing the capillaries, the only thing that varies from
one organism to another is the factor $n^N$, i.e.,

$$B \propto n^N \qquad (8.63)$$

Similarly the mass M of the organism is proportional to the total volume $V_b$
of the blood, and this in turn is proportional to the volume of the capillaries,

$$M \propto V_b = \pi r_0^2 \ell_0 \, (1 + n^{-1/3} + n^{-2/3} + n^{-3/3} + n^{-4/3} + \ldots) \qquad (8.64)$$
$$\approx C \pi r_0^2 \ell_0 \qquad (8.65)$$
$$\approx C \pi (n^{N/2} r_c)^2 (\ell_c n^{N/3}) \qquad (8.66)$$
$$\approx C \pi r_c^2 \ell_c n^{4N/3} \qquad (8.67)$$
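Here C is a number of order 1, the sum of the geometric series,
$C \approx 1/(1 - n^{-1/3})$: about 3 if n = 3 and about 5 if n = 2. Thus,
after expressing everything in terms of capillary quantities, we have

$$M \propto n^{4N/3} \qquad (8.68)$$

Comparing Eq (8.63) and (8.68), $B \propto n^N = (n^{4N/3})^{3/4} \propto M^{3/4}$, and we
have the 3/4 law!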
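The scaling argument can be tested numerically by building the constant-area network explicitly for two different sizes and measuring the exponent. In the sketch below the capillary radius, length, and flow speed are all set to 1 in arbitrary units, since only their constancy matters; the function name flow_and_volume is illustrative.

```python
import math


def flow_and_volume(N, n=3, r_c=1.0, l_c=1.0, u_c=1.0):
    """Return (Q0, blood volume) for a constant-area network of N branchings.

    Capillary radius, length, and flow speed are held fixed (arbitrary
    units); everything else is expressed in terms of them.
    """
    q0 = n ** N * math.pi * r_c ** 2 * u_c    # Eq (8.62)
    r0 = r_c * n ** (N / 2)                   # constant-area rule, upward
    l0 = l_c * n ** (N / 3)                   # Eq (8.56), upward
    volume = sum(
        n ** k * math.pi * (r0 * n ** (-k / 2)) ** 2 * l0 * n ** (-k / 3)
        for k in range(N + 1)
    )
    return q0, volume


q_small, v_small = flow_and_volume(10)
q_large, v_large = flow_and_volume(30)

# Exponent p in Q0 ~ V**p, read off from the two sizes.
exponent = math.log(q_large / q_small) / math.log(v_large / v_small)

# The bracket in Eq (8.64) summed as an infinite geometric series.
c_factor = 1.0 / (1.0 - 3 ** (-1 / 3))

print(f"exponent = {exponent:.4f}")
print(f"C for n = 3: {c_factor:.2f}")
```

The exponent comes out very close to 3/4. The small deviation arises because the series in Eq (8.64) is finite for any real network, so C is not quite the same for the two sizes.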
Problems
Density
8.1 Give a careful argument for converting mass density in cgs units to SI
units. Why do you think cgs units are often used for mass density in spite
of the trend to SI?
8.2 The acceleration due to gravity g is proportional to the mass of the
Earth $M_E$. If the Earth were more massive, g would be larger than it is. In
Newton's theory of universal gravitation, in fact,

$$g = \frac{G M_E}{R_E^2} \qquad (8.69)$$

where $R_E$ is the radius of the Earth, and G is Newton's gravitational
constant, a constant of Nature. Newton's constant was first measured by the
gifted English experimentalist Henry Cavendish in 1798. It is still not known
to high accuracy, but it is about $6.67 \times 10^{-11}$ in SI units. From these data,
determine the average mass density of the Earth. Is your answer plausible?
8.3 Find your own volume, what you might call your "personal space,"
assuming that your density is that of water.
Archimedes' Principle
8.4 When people say "this is just the tip of the iceberg," they are referring
to the fact that most of an iceberg's volume is under water, and the tip that
is visible is only a very small fraction of the iceberg. Look up the data you
need and determine what fraction it is. Be clear what data you are using,
and make your reasoning clear.
8.5 Name some familiar metals that will float in mercury, and find two
metals that will sink in mercury.
8.6 (a) Draw a diagram like Archimedes' own diagram in Fig 8.1 to illustrate
why, if you push a buoyant object of weight $W_o$ down into water with a force
F, to completely submerge it, the equilibrium condition is $F + W_o = W_w$,
where $W_w$ is the weight of the water displaced. Give the argument in words
that should accompany the diagram.
(b) Similarly, draw a diagram like Archimedes' own diagram in Fig 8.1
to illustrate why, if you support a submerged dense object with a force F
(upward), the equilibrium condition is $F + W_w = W_o$. Give the argument
in words.
8.7 (a) Two liquids, with densities $\rho_1 < \rho_2$, immiscible in each other, are
allowed to come to equilibrium in a container. Use minimization of energy
to argue that the liquid with density $\rho_2$ will be on the bottom, and the
interface between the two liquids will be a horizontal plane.
(b) A rectangular solid, with intermediate density $\rho_0$, i.e., satisfying
$\rho_1 < \rho_0 < \rho_2$, is placed into the container. Show that it can float (fully submerged)
at the interface between the two liquids, and if its height is h, then it sinks
a distance D into the denser liquid given by

$$D = \left( \frac{\rho_0 - \rho_1}{\rho_2 - \rho_1} \right) h \qquad (8.70)$$

(c) Show that the above expression makes sense in various ways, including
limiting cases.
8.8 (a) When you step on a scale, what you read is actually less than your
true weight! Estimate the size of this effect, realizing that you are immersed
in a fluid, the atmosphere, with a mass density roughly 1 kg/m$^3$.
(b) Could scales be engineered to correct for this effect and read true
weight? What would be required?
8.9 The hot air in a hot air balloon may have a density only $0.9\rho_a$, where
$\rho_a \approx 1$ kg/m$^3$ is the density of the ambient air. What must be the volume
of a balloon that could lift your mass $m \approx 50$ kg? Assume the skin of the
balloon, supports, basket, etc. taken all together weigh as much as you do.
Ignore your own buoyancy and the buoyancy of the solid parts of the balloon.
Pressure
8.10 Sketch a graph of the pressure as a function of depth in the two-liquid
system described in Problem 8.7. Explain in words why your graph looks
the way it does.
8.11 (a) With a force F you support a rectangular object of mass M by
holding it in one hand. Draw a free body diagram for this situation and nd
the force F.
(b) The rectangular object is actually a small aquarium, filled with water
of mass M (the glass bottom and sides of the aquarium are so thin that they
have negligible mass). Draw a free body diagram for just the glass of the
aquarium, including, of course, the pressure force PA on the bottom of the
aquarium due to the water, since in this way of thinking of it the water is
now an external object that contacts the glass. Show that you deduce the
same force F as in part (a).
8.12 DANGER: DO NOT TRY THIS. A simple idea for providing yourself
air under water is a hollow tube extending to the surface to breathe through.
Why is this a bad idea for all but the shallowest descents? Or to put it
another way, why are snorkels short?
Bernoulli's Principle
8.13 (a) An above-ground water tank of height H develops a hole near the
bottom. If it is full of water, what is the speed of the water exiting the tank
through the hole? Give an expression showing the proportionalities involved,
and evaluate your expression in case H = 5 m.
(b) A bathysphere lowered into the deepest trench in the ocean, a depth
of about 10 km, develops a leak. What is the speed of the water coming in
through the leak? (Compare with the speed of sound in air, about 340 m/s.)
8.14 Use Bernoulli's Principle to estimate the horizontal force on you if you
are standing waist-deep in a stream flowing at 1 m/s. Is the result plausible?
8.15 Draw a picture of a siphon connected to an overhead reservoir, label
the picture with appropriate dimensions (give letter names), and give an
expression for the speed with which the fluid should exit the siphon if there
are no frictional losses in the system. Does it matter how the entrance of the
siphon is positioned in the reservoir?
8.16 Water comes out of a tap of area $A_0$ with speed $v_0$ (straight down).
(a) Use Bernoulli's Principle to show that when the water has fallen a
distance h, it has speed $v = \sqrt{v_0^2 + 2gh}$.
(b) Show the same thing using conservation of energy for a small part of
the water, of mass m.
(c) Since the water speeds up as it falls, and yet the volume current
$I = A_0 v_0 = Av$ is constant, it must be that the area A of the flowing stream
of water becomes smaller as v gets larger. Find A as a function of distance
fallen h, and graph it. Also look carefully at water coming out of a tap!
Shear Stress and Viscosity
8.17 (a) Following the arguments in Section 8.12, explain why shear rate
and shear viscosity have the dimensions they do.
(b) Suppose that (in a horizontal pipe) volume current I, with dimension
$[L^3/T]$, and pressure drop $\Delta P$ (down the pipe) are proportional, as in

$$\Delta P = rI \qquad (8.71)$$

Find the dimension of the pipe resistance r.
(c) Suppose resistance r in a pipe is proportional to the viscosity $\eta$ of the
fluid and to the length $\ell$ of the pipe. These factors alone don't have the right
dimensions to be resistance. What is the dimension of the missing factor?
Stokes Flow
8.18 When a small sphere of radius R sinks in a fluid of viscosity $\eta$, the
gravitational force, the buoyant force, and the Stokes drag force exactly
balance.
(a) Make a free body diagram for the sphere, showing the three forces.
(b) Find the speed v of the sphere in terms of its density $\rho$ from the
condition that the forces should balance, i.e., that the net force should be
zero.
(c) Verify explicitly that your result has the dimension of a velocity.
Pressure, Current, and Resistance: $\Delta P = Ir$
The problems in this section are analogous to DC circuit problems in
electronics. The pressure difference $\Delta P$ from one end of a pipe to another is
analogous to the voltage difference V that might be applied by a battery
across a resistor. The resulting volume current I is analogous to the
electrical current I. The pipe resistance r is analogous to electrical resistance R.
Because of this analogy we indicate pipes by the electrical resistance
symbol, like a wiring diagram, and give values in SI units without naming them
explicitly.
8.19 In Fig 8.23 the points A and B are connected in parallel by 3 pipes,
with resistances 100, 200, and 500 (SI) respectively.
(a) If the pressure drop from A to B is P = 10 (SI), what is the SI
current through each pipe individually?
(b) What is the total current through all 3 pipes together?
(c) What is the effective resistance of the network, relating the pressure
drop to the total current? How does it compare to the individual pipe
resistances?
8.20 In Fig 8.24 there is a pressure drop of 30 (SI) from A to B, and the
resistances are as indicated.
Fig. 8.23: Three pipes connect A to B, with individual SI resistances indicated (100, 200, and 500).
Fig. 8.24: Pipes connect A to B, with individual SI resistances indicated (100, 200, and 300). The point C is
at the junction of one pipe with another.
(a) What is the pressure drop from A to C? What is the pressure drop
from C to B? (Check: they must add to the full 30 from A to B.)
(b) What is the total current through the network?
(c) What is the effective resistance of the network?
8.21 Assuming Bernoulli's principle holds in the large arteries, what blood
pressure would you measure in the carotid artery 30 cm above the heart in
a subject whose blood pressure taken in the standard way was 130/80?
Chapter 9
Temperature, Heat, and Internal Energy
When Fire was thought to be one of the four elements, a popular idea due
to Plato was that the atoms of Fire were tiny tetrahedra, like the one in
Fig 9.1. This seems a peculiar idea now, but it once made a kind of intuitive
sense. The points of the tetrahedron, being sharp, could cause injury in the
form of burns. Galileo suggested that these pointed tetrahedra could act like
little knives to melt metals, by cutting a solid into so many small particles
that it would flow like a fluid.

Fig. 9.1: According to Plato, the atoms of Fire are tetrahedra.
The theory of the four elements, with atoms in the shape of the Platonic
solids, was already quite archaic in Galileo's time. In this theory all materials
are a mixture of Fire, Earth, Water, and Air, and burning a material just
means liberating the Fire that is in it. In the 18th century, advances in
chemistry began to reveal what the real chemical elements are, but the old
idea nonetheless persisted in a slightly different form. Now heat was imagined
to be a fluid, although perhaps not a chemical substance, that flowed out
of materials when they burned, and warmed things by flowing into them.
This fluid was called phlogiston. In many ways it was just the old element
Fire. The 18th century chemists were frustrated in not being able to purify
this substance, the way they could purify other things. They could make
phlogiston move from one body to another, in what they (and we) call the
flow of heat, but they could never isolate it. That is, they could never get
heat by itself, not associated with another substance.
This ancient idea was finally laid to rest by the American Benjamin
Thompson, who did his scientific work in Europe, having left the Colonies in
1776 as a Tory. He had a distinguished career in England and later Bavaria,
where he acquired the title Count Rumford, and among other duties was
responsible for overseeing the manufacture of cannons. These cannons began
the process as solid cylinders, which were then bored by turning them against
an abrasive drill to hollow them out. They became very hot and could be
continually cooled with water, which itself became hot, and so on. Rumford
pointed out that if this were the flow of some substance from the cannon
to the water, then the cannon seemed to have an infinite supply of it. On
the other hand, the substance never appeared except when the cannon was
in motion, being turned. Stop the turning and the phlogiston soon stopped
coming out. And no matter how much phlogiston a cannon lost, it was in
no way a different material after it cooled off. All this strongly suggested to
Rumford's mind that phlogiston was not a substance at all, but rather a kind
of motion, and that the motion was being supplied by the turning apparatus.
The motion could communicate itself from one substance to another at a level
too small to see. It could never be isolated, because motion is always motion
of some thing. This is essentially our modern idea of what the flow of heat
is: the flow of internal energy from hot to cold bodies. The only difference
is that the word "motion" has been replaced by the words "internal energy."
In modern statistical mechanics, some or all of the internal energy would be
the random kinetic energy K of atoms, that is, motion.
Rumford's insight became precise and quantitative in 1843 in the work of
James Prescott Joule, for whom the SI unit of energy is named. Joule used
a weight W of initial height h to power a rotary paddle wheel device that
churned water as the weight descended. In this way he took the potential
energy $U_g = Wh$ of the weight, and transferred it all to the water. The
water was thermally insulated from the room in which the experiment was
done, and Joule found that the result of giving the water energy in the known
amount $\Delta E = U_g$ was to raise its temperature T by a definite small amount
$\Delta T$. It didn't matter whether the energy was transferred quickly or slowly
(within practical limits), and if twice as much energy was transferred, the
temperature change was twice as much, indicating a proportionality

$$\Delta E \propto \Delta T \qquad (9.1)$$

Knowing this proportionality he could interpret a change in temperature $\Delta T$
as a change $\Delta E$ in the internal energy of the water. With this insight the
notion of energy finally took on the central importance it has for us today.
This idea is now called the Law of Conservation of Energy, or the First Law
of Thermodynamics. It says that energy, if we just keep track of it in all its
forms, including potential energy, kinetic energy, and internal energy, never
increases or decreases, but simply moves around. In Joule's experiment it
went from being potential energy of a weight to being internal energy of
the water. As the water gradually equilibrated with the room, despite the
thermal insulation around it, we would say that the energy slowly became
internal energy of the entire room and its contents, diffusing into everything.
This observation is a hint of the Second Law of Thermodynamics, which says
that the flow of heat, meaning the transfer of internal energy because of a
temperature difference, has a peculiar quality, tending to make energy more
diffuse, and never to concentrate it.
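To get a feeling for how small the temperature change in a Joule-type experiment is, one can put numbers into the proportionality (9.1). The sketch below is an illustration under stated assumptions: the constant of proportionality is taken to be the mass of the water times the standard specific heat of liquid water, about 4186 J/(kg K), a value not given in this section, and the weight, drop height, and water mass are made-up round numbers rather than Joule's own.

```python
G = 9.81            # m/s^2, acceleration due to gravity
C_WATER = 4186.0    # J/(kg K), standard specific heat of liquid water
                    # (an assumed value, not taken from this section)


def temperature_rise(weight_mass_kg, drop_height_m, water_mass_kg):
    """Temperature change of the water when all of U_g = W h goes into it."""
    energy = weight_mass_kg * G * drop_height_m   # U_g = W h, with W = m g
    return energy / (water_mass_kg * C_WATER)


# Made-up illustrative values: a 10 kg weight falling 2 m, churning 1 kg of water.
delta_t = temperature_rise(10.0, 2.0, 1.0)
delta_t_double = temperature_rise(10.0, 4.0, 1.0)

print(f"temperature rise: {delta_t:.4f} K")
print(f"with twice the energy: {delta_t_double:.4f} K")
```

The rise is only a few hundredths of a degree, which suggests why careful thermometry and good insulation mattered so much in the experiment.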
Is this diffusion of energy the motion of some thing? It is tempting to say
so, but that really takes us back to the phlogiston theory. Energy is certainly
not a material thing, nor is it simply motion. It is a thing in the sense that
it is conserved, and hence has a kind of integrity, but that is all one can say.
9.1 Temperature
In the account of Joule's experiment we took for granted a notion of
temperature. Of course by Joule's time there were thermometers that he used to
measure the temperature change $\Delta T$ in the churned water (he used mercury
thermometers). What is temperature, though?
The most important idea in the theory of temperature is the concept of
thermal equilibrium. This is the notion that when two systems are able to
exchange internal energy through the flow of heat, they will do so until an
equilibrium is established, and after that no further change occurs. Once they
have reached this equilibrium, we say that they are at the same temperature.
It is postulated that if A is in equilibrium with B and B is in equilibrium with
C, then A is in equilibrium with C. You might say, this is obvious! If A is the
same temperature as B and B is the same temperature as C, then of course
A is the same temperature as C! That would be missing the meaning of this
idea, though, because you are already thinking of temperature as a number,
measured by a thermometer. The statement about the mutual equilibrium of
A, B, and C is a statement about how things actually interact and behave,
and has nothing to do with thermometers. Conceivably A and C brought into
contact would start exchanging internal energy through the flow of heat, even
if the pairs A and B, and B and C, would not. Then it would be impossible
to characterize temperature by a number. This hypothetical case seems never
to occur, however, for any systems A, B, and C, and therefore we can assign
a single number, temperature, to their equilibrium. This is sometimes called
The Zeroth Law of Thermodynamics, to make the point that the other laws
rely upon it, and also that this one apparently wasn't recognized as a logical
necessity until later. It is a bit subtle! It helps, though, to have the notion
of thermal equilibrium firmly in mind as we think about temperature.
You are probably sitting in a room where nothing too dramatic has
happened for a while. The objects in the room have had plenty of time to
exchange internal energy through the flow of heat, and they have probably
reached a mutual equilibrium. That means they are all at the same
temperature (by the definition of temperature). Try touching a few things around
you. I find that some things feel noticeably cooler than others. Ceramic and
metal feel colder than wood and fabric. But they are not! They are all at
the same temperature! What is going on?
We do not actually sense temperature. What we sense is the flow of
heat, driven by temperature difference. Since you are warmer than the other
things in the room (and not in thermal equilibrium), you exchange internal
energy with them when you come in contact: internal energy flows from you
to the things you touch. The temperature difference is always the same, no
matter what you choose, but the flow of internal energy is not always the
same. Apparently there is an energy current J driven by the temperature
difference $\Delta T$, but the current depends on what you are touching. This is
so much like the flow of a fluid driven by a pressure difference that we can
model it in the same way as Eq (8.40), with a thermal resistance r, so that

$$\Delta T = rJ \qquad (9.2)$$

Here J, the energy current, would have SI units Joules/sec (also called
Watts). This relation is usually expressed in terms of the thermal
conductance $\kappa = 1/r$ (kappa), the reciprocal of thermal resistance, as

$$J = \kappa \Delta T \qquad (9.3)$$

This says that an energy current J arises proportional to temperature
difference $\Delta T$, with the constant of proportionality $\kappa$ depending on the
situation. It is just a rough idea here, but it is phenomenologically true for
small temperature differences $\Delta T$. This is sometimes called Newton's Law
of Cooling.
What we sense as hot or cold seems to be J. This is an energy current,
representing energy coming in or going out at some rate. It does seem as if
we should be aware of this, for reasons of basic survival, especially a very
rapid flow of energy (which we would sense as very hot or very cold). Nature
has equipped us with the means to detect these internal energy flows. If $\Delta T$
is fixed in Eq (9.3), then the different currents J that we feel correspond to
different thermal conductances $\kappa$. Since metals are good conductors of heat,
they feel cool at room temperature (big $\kappa$, big current out), while wood,
which is not a good conductor of heat, does not (small $\kappa$, small current out).
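Eq (9.3) makes the metal-versus-wood comparison easy to quantify once conductances are chosen. The numbers in the sketch below are made-up placeholders, picked only so that the metal contact conducts far better than the wood contact; they are not measured values, and the function name energy_current is illustrative.

```python
def energy_current(conductance_w_per_k, delta_t_k):
    """Energy current J = kappa * delta_T of Eq (9.3), in watts."""
    return conductance_w_per_k * delta_t_k


# Hypothetical hand-to-object temperature difference, the same for both objects.
DELTA_T = 15.0   # K

# Made-up contact conductances in W/K, for illustration only.
conductances = {"metal": 2.0, "wood": 0.05}

currents = {
    name: energy_current(kappa, DELTA_T)
    for name, kappa in conductances.items()
}

for name, current in currents.items():
    print(f"{name}: J = {current:.2f} W")
```

With the temperature difference held fixed, the ratio of the two currents is just the ratio of the conductances, which is the sense in which we feel conductance rather than temperature.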
The energy flow J in response to a temperature difference $\Delta T$ is always
from the hotter body to the cooler body. Since we haven't even said what
hotter and cooler mean, this is really a definition of those terms! The flow
J is always in the direction to move the system toward thermal equilibrium.
Thus it can never happen that A is hotter than B, B is hotter than C, and
C is hotter than A, because then the energy would flow around cyclically,
never reaching equilibrium, contrary to the Zeroth Law. Rather the flow J
tends to raise the temperature of the cooler body and lower the temperature
of the hotter body, until they are the same. Of course we know examples
of this, but here the idea is asserted as a general law of Nature, applying to
everything.
The energy flow J must arise in response to a temperature difference
$\Delta T$ if there is any physical process whatsoever that could move energy from
the hotter system to the cooler one. One may try to make the thermal
conductance $\kappa$ very small, as in thermos bottles that use silvered interiors
and vacuum between glass layers, etc., to try to thwart the transfer of energy
from inside to outside, but Nature will find a way. The hot coffee does
eventually cool off. An interesting case is the Sun. It is much hotter than
we are, but it is separated from us by millions of miles of vacuum. Does this
vacuum insulate us? Not at all! There is a mechanism for transferring energy
through the vacuum, namely sunlight, and hence there is an energy current J
from the Sun to us, solar energy. A solar collector of 1 square meter, designed
to absorb this energy, may receive an energy current of almost 1000 J/s (a
kilowatt!). This transfer is in the direction that would eventually bring us to
thermal equilibrium (warming us up and cooling the Sun), but since the Sun
itself is far from equilibrium, burning nuclear fuel, this thermal equilibrium
is far in the future. Before the source of the Sun's energy was known, it was
a mystery how it could have avoided cooling off in the geologic time that was
the (more or less) known age of the Earth.
These considerations of thermal equilibrium bring us to a rather peculiar
view of temperature: temperature is a measure of the tendency of a system
to give up its internal energy through the flow of heat. When two systems
are in contact, and one of them has a greater tendency than the other to give
up internal energy, so that heat flows from it to the other one, then we say
it is at a higher temperature. We do not say that it has more or less energy,
and we cannot even speak of the "phlogiston" it may contain, since we do not
believe in that, only that it has a greater tendency to give up internal energy,
for whatever reason. And heat has nothing to do with temperature, but only
with temperature difference. Heat may flow at low temperature just as well
as at high temperature. Heat has nothing to do with "hot"!
9.2 Thermometers
Every material property (density or viscosity, for example) depends on
temperature. That makes almost everything a possible thermometer. Every
property responds to changes in temperature. We ourselves are crude but
sensitive thermometers. If our core body temperature changes by just a
few degrees Fahrenheit, we die! Most material properties don't change so
dramatically in just a few degrees Fahrenheit, which means inert matter is,
on the whole, much less sensitive to temperature than living systems are.
When we construct useful thermometers, we have to look for rather small
changes in material properties.
The familiar mercury thermometer uses the relative change in the
density of mercury with temperature as a measure of temperature. What you
actually see is the small change in volume $\Delta V$ of a fixed mass M. Since mass
density $\rho = MV^{-1}$, with exponent $-1$ on V, a small change in mass density
$\Delta\rho$ is related to the corresponding small change in volume $\Delta V$ by
$$\frac{\Delta\rho}{\rho} = -\frac{\Delta V}{V} \tag{9.4}$$
recalling the argument leading to Eq (4.26). The minus sign makes sense,
because if the volume goes up, the density goes down, i.e., the changes are in the
opposite direction. Our familiar Celsius and Fahrenheit temperature scales
define change in temperature $\Delta T$ to be, at least approximately, proportional
to this quantity, i.e.
$$\frac{\Delta V}{V} \propto \Delta T \tag{9.5}$$
This idea only defines the change in temperature $\Delta T$, so we could still choose
T arbitrarily at one particular V, and then measure changes from that. The
constant of proportionality is also a choice, determining what we mean by the
unit of temperature (the degree). The Fahrenheit and Celsius temperature
scales differ precisely in these choices. In the Celsius temperature scale the
value T = 0 is assigned to the freezing point of water, while in the Fahrenheit
scale T = 0 was originally assigned (very arbitrarily!) to a particularly cold
winter day. Also 1 degree on the Celsius scale is 1.8 degrees on the Fahrenheit
scale. This means that Celsius temperature C and Fahrenheit temperature
F are related by
$$F = \frac{9}{5}\,C + 32 \tag{9.6}$$
since C = 0 and F = 32 are the same temperature (freezing point of water).
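Eq (9.6) and its inverse can be written as a pair of small Python functions. This is only a sketch; the function names are our own.

```python
def celsius_to_fahrenheit(c: float) -> float:
    """Eq (9.6): F = (9/5) C + 32."""
    return 9.0 / 5.0 * c + 32.0


def fahrenheit_to_celsius(f: float) -> float:
    """Eq (9.6) solved for C."""
    return (f - 32.0) * 5.0 / 9.0


# Freezing and boiling points of water, and the one temperature
# at which the two scales read the same number.
print(celsius_to_fahrenheit(0.0))
print(celsius_to_fahrenheit(100.0))
print(fahrenheit_to_celsius(-40.0))
```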
In the Celsius temperature system, using mercury in the thermometer,
Eq (9.5) becomes
$$\frac{\Delta V}{V} = \beta\,\Delta T \tag{9.7}$$
with $\beta = 1.8 \times 10^{-4}/^\circ\mathrm{C}$. The constant of proportionality $\beta$ (beta) is the
thermal coefficient of volume expansion of mercury in inverse degrees Celsius.
Since $1 = 1\,^\circ\mathrm{C} / 1.8\,^\circ\mathrm{F}$, we can also say $\beta = 1.0 \times 10^{-4}/^\circ\mathrm{F}$.
The mercury thermometer no longer defines the Celsius system. The
actual definition is different, and by that definition, Eq (9.7) is only
approximate. The reason is that the relative volume changes of different materials
as they are warmed up are only approximately proportional to each other,
but not exactly. This means each material would give a slightly different
notion of temperature, if it were used to define temperature. Conceivably
we could just choose some material, like mercury, and use it, but we would
know that it wasn't a very fundamental quantity, that it was a bit arbitrary,
and depended on the peculiarities of mercury. There is actually a much
better thermometer available, one that is independent of the properties of any
material. The temperature defined in this way is called absolute temperature.
We will see it in the next section.
Before we leave the mercury thermometer, let us see how it is that we
can measure the very tiny change in relative volume, just 1 part in $10^4$, that
corresponds to 1 degree. The secret is to rearrange Eq (9.7) to read
$$\Delta V = \beta V\,\Delta T \tag{9.8}$$
Since $\Delta V \propto V$, we can make $\Delta V$ bigger by making V very large. In the
mercury thermometer, this large V is in a sort of reservoir bulb at one end.
The change $\Delta V$ is confined to a cylindrical volume with very small
cross-sectional area A, so that the change shows up as a change in length $\Delta\ell$, i.e.,
$\Delta V = A\,\Delta\ell$. Then the change you actually read is
$$\Delta\ell = \frac{\beta V\,\Delta T}{A} \tag{9.9}$$
By making V big and A small, you can amplify a small change $\Delta T$ to be
a large change $\Delta\ell$, despite the small coefficient $\beta$.
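A quick numerical sketch of Eq (9.9) shows how effective this amplification is. The bulb volume and bore radius below are assumed, plausible-looking values, not the dimensions of any particular thermometer; only the coefficient for mercury is taken from Eq (9.7).

```python
import math

BETA_MERCURY = 1.8e-4  # 1/degC, volume expansion coefficient from Eq (9.7)


def column_rise_cm(bulb_volume_cm3: float, bore_radius_cm: float,
                   delta_t_c: float, beta: float = BETA_MERCURY) -> float:
    """Eq (9.9): change in column length = beta * V * delta_T / A."""
    area_cm2 = math.pi * bore_radius_cm ** 2  # cross section of the bore
    return beta * bulb_volume_cm3 * delta_t_c / area_cm2


# Assumed geometry: 1 cm^3 bulb, bore of radius 0.1 mm = 0.01 cm.
rise = column_rise_cm(bulb_volume_cm3=1.0, bore_radius_cm=0.01, delta_t_c=1.0)
print(f"rise per degree Celsius: {rise:.2f} cm")
```

With these assumed dimensions a relative volume change of about 2 parts in 10,000 becomes a movement of roughly half a centimeter per degree, which is easy to read by eye.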
Another practical consideration in a real mercury thermometer is that the
glass containing the mercury also expands when the temperature goes up
(although its coefficient of thermal expansion is much less). If the cross-sectional
area A of the cylindrical volume gets larger, because of this expansion of the
glass, then the length $\Delta\ell$ that we read will not be as large for a given $\Delta V$. A
properly calibrated thermometer will of course take all this into account in
the way its temperature scale is marked off. One thing you may have noticed
is that the mercury in a thermometer may first go down before it goes up
to a new, higher temperature. The reason is that the glass is warmed first,
and it expands. Only later does the energy current J reach the mercury and
cause it to expand.
9.3 The Gas Thermometer
The thermal properties of solids and liquids are peculiar to each substance,
but gases are all very much alike. This is the basis of a thermometer that is
independent of substance.
A fixed quantity of gas in thermal equilibrium at a fixed temperature has
the property that its pressure P and its volume V obey the simple relation
$$PV = \text{constant} \tag{9.10}$$
This means that if the pressure goes up, the volume goes down, in just such
a way that their product is always the same, and this is true for all gases
(at least as long as they are not too dense: there may be small deviations
from this relationship at high pressure). If the temperature of the thermal
equilibrium is raised, then the gas expands (at constant pressure P), so the
new value of PV is larger, but once again it is constant at this new value
when you change P, since V changes in a corresponding way, as long as the
new thermal equilibrium is maintained. Since this product PV depends on
temperature, it suggests defining a temperature T by
$$PV \propto T \tag{9.11}$$
where the constant of proportionality is still to be chosen. For a fixed quantity
of gas, the choice of constant c in
$$PV = cT \tag{9.12}$$
determines the size of the absolute degree, and this is customarily chosen to
be the same size as the Celsius degree. That completely determines T, which
is called the Kelvin temperature.
[Figure 9.2: plot of PV (vertical axis) against temperature. The horizontal axis is marked in T (Celsius) at -273, 0, and 100, and in T (Kelvin) at 0, 273, and 373.]
Fig. 9.2: Extrapolating from measurements of PV at two known Celsius temperatures
locates the absolute zero on the Celsius scale. We imagine doing this with two different gas
thermometers. The lines are graphs showing proportionality of PV to a new temperature
scale (Kelvin) with its zero at -273 °C, as in Eq (9.12). A measurement of PV can
thus be interpreted as a Kelvin temperature, although the constant of proportionality c in
Eq (9.12) is different for each thermometer.
Unlike the other temperature scales, the absolute (Kelvin) temperature
scale has a definite zero, called absolute zero, which is not a matter of choice.
It is the temperature at which PV would go to zero! One can determine it
by measuring PV at 100 °C (boiling point of water), then at 0 °C (freezing
point of water), where the value is less, and finally extrapolating to the
Celsius temperature at which it would be zero, as in Fig 9.2. The result is
that the absolute zero occurs at about -273.15 °C, which is called 0 K, the
K standing for the Kelvin temperature scale, after William Thomson, Lord
Kelvin. Thus Kelvin (absolute) temperature K and Celsius temperature C
are related by
$$K = 273.15 + C \tag{9.13}$$
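The extrapolation of Fig 9.2 is just the zero crossing of a straight line through two points, and can be sketched in a few lines of Python. The PV readings below are hypothetical numbers invented for the illustration (the second thermometer simply holds twice as much gas as the first); they are not data from the text.

```python
def absolute_zero_celsius(pv_at_0c: float, pv_at_100c: float) -> float:
    """Extend the line through (0, pv_at_0c) and (100, pv_at_100c)
    down to the Celsius temperature at which PV would be zero."""
    slope = (pv_at_100c - pv_at_0c) / 100.0  # change in PV per Celsius degree
    return -pv_at_0c / slope


# Hypothetical PV readings in joules (illustrative only).
small = absolute_zero_celsius(2271.0, 3102.4)
large = absolute_zero_celsius(4542.0, 6204.8)
print(f"small thermometer: {small:.1f} C, large thermometer: {large:.1f} C")
```

The two thermometers have different slopes, that is, different constants c in Eq (9.12), but their lines cross zero at the same Celsius temperature.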
9.4 Avogadro's Hypothesis
In Fig 9.2 we imagine using two gas thermometers to locate the absolute zero,
one with twice the volume V of the other at the same pressure P. Thus the
constant of proportionality c in Eq (9.12) is twice as big for one thermometer
as for the other. In 1811, Amedeo Avogadro suggested that the volume V
of a gas (at constant P), and hence the constant c, is proportional to N,
the number of molecules of the gas. This was long before the very existence
of molecules was established! This idea was suggested by observations like
what happens when water is decomposed into hydrogen and oxygen. There is
twice as much hydrogen as oxygen (by volume). Suppose each water molecule
consists of two hydrogens and an oxygen (as we now know it does). Then
the 2 : 1 ratio of hydrogen to oxygen molecules obtained by electrolysis would
show up as a 2 : 1 ratio of hydrogen to oxygen volumes. The two gas
thermometers in Fig 9.2 could have been a hydrogen thermometer and an
oxygen thermometer, using the hydrogen (upper line) and oxygen (lower
line) from decomposing some fixed quantity of water. Accepting Avogadro's
hypothesis, we say that the reason that c is twice as big for the upper line
in the graph is that the hydrogen has twice the number N of molecules.
Thus for each thermometer $c \propto N$, where N is the number of molecules, and
writing it with a constant of proportionality, we have $c = kN$, where k is a
constant, the same for all gas thermometers, i.e., all gas samples, independent
of amount and substance. This k is now called Boltzmann's constant, and
is often written $k_B$ to distinguish it from all the other uses of the letter k.
Thus we write Eq (9.12) as
$$PV = N k_B T \tag{9.14}$$
where N is the number of gas molecules. Eq (9.14) is called the ideal gas
law.
When Avogadro made his suggestion, he had no idea how large N might
be in a typical gas thermometer, or how to determine it: it would certainly
be a very large number. Since P, V, and T would be of order 1 in sensible,
laboratory units, $k_B$ must be a very tiny number in those units. We now
know
$$k_B \approx 1.38 \times 10^{-23}\ \mathrm{J/K} \tag{9.15}$$
The units here are Joules per Kelvin, "Kelvin" meaning degrees on the Kelvin
scale.
The physical chemists of Avogadro's day made a different use of his
hypothesis. Suppose you take two equal volumes (at the same T and P) of two
different (pure) gases, and one of them weighs more. How can that be? Since
they each have the same number of molecules, by Avogadro's hypothesis, it
can only be that the molecules of one gas weigh more than the molecules of
the other, and you determine their ratio when you weigh them, even without
knowing the absolute number of molecules. In the case of hydrogen and
oxygen, for example, a volume of oxygen weighs 16 times more than the same
volume of hydrogen (at the same T and P). It had already been noticed that
hydrogen seemed to be the lightest gas, so it would make sense, perhaps, to
make it the unit of weight for this purpose: molecular weight. This is
essentially what we do to this day, only we now (essentially) use the hydrogen
atom as the unit, and we know that the hydrogen molecule is $\mathrm{H_2}$, so the
hydrogen molecule has molecular weight 2, more or less by definition, and
then the oxygen molecule $\mathrm{O_2}$ has molecular weight 16 times more, namely
32.
Turning this observation around, suppose you had 2 grams of hydrogen
gas and 32 grams of oxygen gas (just naming their molecular weights in
each case). Then you would have the same number of hydrogen molecules
as oxygen molecules. The actual number would perhaps be unknown, but
whatever it is, it is the same for both. This number is called Avogadro's
number $N_A$, and Avogadro's number of the molecules of any gas is called 1
mole of that gas, a word from the German. In practice, this means that a
mass equal (in grams) to its molecular weight is 1 mole, for any gas. This
definition does not require that we be able to count molecules, or even that
we know the value of $N_A$. (Note: the term "molecular weight" is unfortunate:
it is the mass of a mole, in grams, not the weight.)
As late as 1900 Avogadro's number was not known very accurately, and
a few people even doubted the very existence of molecules! We now know
$$N_A \approx 6 \times 10^{23} \tag{9.16}$$
If a gas consists of N molecules, then there are $n = N/N_A$ moles of the gas.
Thus we can rewrite the ideal gas law Eq (9.14) as
$$PV = nRT \tag{9.17}$$
where n is the number of moles of a gas, and
$$R = N_A k_B \approx 8.314\ \mathrm{J/(K\ mole)} \tag{9.18}$$
is the gas constant. The gas constant R can be determined experimentally
from measurements on any gas with known molecular weight, without
knowing $N_A$, so the version of the ideal gas law in Eq (9.17) was useful long before
the version in Eq (9.14).
Here is a numerical example. Suppose we have a 1 liter container of some
gas at atmospheric pressure and at room temperature 20 °C. What does it
weigh? The given information would only be enough to find how many moles
we have, but not to find the weight. To do that we would have to know the
molecular weight, to find the mass, and the local value of g, to find the
weight. From the ideal gas law the number of moles n is
$$n = \frac{PV}{RT} = \frac{(10^{5}\ \mathrm{N/m^2})(10^{-3}\ \mathrm{m^3})}{(8.3\ \mathrm{J/(K\ mole)})(293\ \mathrm{K})} = 0.041\ \mathrm{mole} \tag{9.19}$$
(Notice how the units work.) This is as much as we can say without more
information. If we happen to know that the gas is nitrogen ($\mathrm{N_2}$), with
molecular weight MW = 28 g/mole, then the mass is $m = n\,\mathrm{MW} \approx 1$ g. The
weight would be $mg \approx (10^{-3}\ \mathrm{kg})(10\ \mathrm{m/s^2}) = 0.01$ N. Since the air we live in
is mostly $\mathrm{N_2}$, at atmospheric pressure, and roughly room temperature, just
like this sample, it would be the same computation to find the density of our
air, which is therefore about 1 g per liter, or 1 kg/m$^3$, the same estimate
we arrived at once before, in section 8.10.1. The method in this section,
however, is in principle very precise, since V, P, and T can all be measured
very accurately, and the average molecular weight of air is also known very
accurately.
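The same computation, carried through with less rounding, can be sketched in Python. The value g = 9.8 m/s^2 is an assumed typical value, and the function name is ours.

```python
R = 8.314  # gas constant, J/(K mole), Eq (9.18)


def moles_of_gas(pressure_pa: float, volume_m3: float, temp_k: float) -> float:
    """Ideal gas law, Eq (9.17), solved for n = PV / (RT)."""
    return pressure_pa * volume_m3 / (R * temp_k)


n = moles_of_gas(pressure_pa=1.0e5, volume_m3=1.0e-3, temp_k=293.0)
mass_g = n * 28.0                   # nitrogen, MW = 28 g/mole
weight_n = (mass_g / 1000.0) * 9.8  # assumed g = 9.8 m/s^2

print(f"n = {n:.3f} mole, mass = {mass_g:.2f} g, weight = {weight_n:.4f} N")
```

Keeping more digits gives a mass slightly above 1 g, consistent with the round-number estimate of about 1 g per liter.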
9.5 Heat Capacity
Even before the flow of heat was well understood, it was possible to quantify
it. The amount of heat that raises 1 gram of water by 1 °C was called 1
calorie, abbreviated "cal." When Joule found that the mechanical energy
4.18 J dissipated in water also raises 1 gram of water by 1 °C, it became
clear that the calorie is just another unit of energy, and that
$$1\ \mathrm{cal} = 4.18\ \mathrm{J} \tag{9.20}$$
Although the calorie is not the SI unit of energy, it is still in wide use
because it is so convenient. Conversion to SI units uses the relation above, of
course. The continued use of the calorie poses a subtle pitfall for the unwary,
because it suggests that there is something distinct, called "heat," that is
measured in calories, as opposed to other kinds of energy. No no no! That
would be phlogiston, and we no longer believe in it. Energy that flows into
a system as heat can come out as energy in some other form, because in
the end it is just energy, and energy is not trapped in any particular form.
And energy added mechanically, as in Joule's experiment, can raise the
temperature just as well as the flow of heat can.
There is a different unit of energy, also called the Calorie, or more
properly the "big calorie," that is 1 kilocalorie, or 1000 calories, abbreviated with
a capital C as "Cal."
$$1\ \mathrm{Cal} = 1000\ \mathrm{cal} \tag{9.21}$$
The energy content of food is always given in big calories, so to convert to
Joules, you must first convert to calories by multiplying by 1 = 1000 cal/Cal,
then multiply by 1 = 4.18 J/cal.
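The two-step conversion can be written out as a short Python sketch. The 250 Cal snack is a made-up example value.

```python
CAL_PER_BIG_CAL = 1000.0  # Eq (9.21)
JOULES_PER_CAL = 4.18     # Eq (9.20)


def food_calories_to_joules(big_calories: float) -> float:
    """Multiply by 1 = 1000 cal/Cal, then by 1 = 4.18 J/cal."""
    return big_calories * CAL_PER_BIG_CAL * JOULES_PER_CAL


snack_joules = food_calories_to_joules(250.0)  # hypothetical 250 Cal snack
print(f"{snack_joules:.3g} J")
```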
Any energy can be conveniently measured if it can be converted into the
internal energy of water, because it will show up as a rise in temperature.
Suppose the temperature of 1 kg of water is raised by 2 K. (Since the Kelvin
degree and the Celsius degree are the same, we might as well use Kelvins.)
Then each gram of water is warmer by 2 K, and since there are 1000 grams,
the energy added to the water was 2000 cal, from whatever source. We are
using Eq (9.1), which said
$$\Delta E \propto \Delta T \tag{9.22}$$
but now we know the constant of proportionality:
$$\Delta E = C\,\Delta T \tag{9.23}$$
The constant C, called the heat capacity, is 1000 cal/K in the example above.
Putting in $\Delta T = 2$ K, we find $\Delta E = 2000$ cal. Clearly, though, the heat
capacity C is proportional to the mass M of the water, i.e.
$$C \propto M \tag{9.24}$$
To heat twice as much water by the same 2 K would take twice as much
energy. Thus it makes more sense to think of heat capacity as
$$C = Mc \tag{9.25}$$
where the constant of proportionality c has units cal/(K g), and is called the
specific heat. It is really the heat capacity per gram. For water, we know
$c_{\mathrm{water}} = 1$ cal/(K g), by the definition of the calorie. For any substance the
specific heat c is a material property, and can be looked up in handbooks.
Suppose a lump of metal at 30 °C is dropped (carefully) into 1 kg of
water at 20 °C. Since the two are at different temperatures, and they are in
contact, they will exchange internal energy and come to thermal equilibrium.
Suppose the temperature of the equilibrium is 22 °C. Then, as we have just
noticed, this means that $\Delta E = 2000$ cal for the water, and since energy is
conserved, this means that the metal lost 2000 cal, i.e., $\Delta E = -2000$ cal for
the metal. (Experimentally, one must carefully insulate the water so that
all heat flow is internal to the system, and no heat flows out to the room.
This would include even the containing vessel, a bit of an idealization.) Since
$\Delta T = -8$ K for the metal, its heat capacity is $C = \Delta E/\Delta T = 250$ cal/K.
This by itself does not tell us very much, but if we weigh the metal and
determine that its mass is M = 2700 g, then we find the specific heat of the
metal is c = C/M = 0.093 cal/(K g). This is close to the specific heat of zinc
or copper. Perhaps the metal is brass, which is an alloy of the two.
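The bookkeeping in this example can be laid out step by step as a Python sketch. The function name is our own, and the calculation assumes, as the text does, that no heat escapes to the room or to the vessel.

```python
C_WATER = 1.0  # specific heat of water, cal/(K g), by definition of the calorie


def specific_heat_of_sample(sample_mass_g: float, sample_temp_c: float,
                            water_mass_g: float, water_temp_c: float,
                            final_temp_c: float) -> float:
    """Infer the sample's specific heat from energy conservation."""
    # Energy gained by the water: delta_E = M * c * delta_T.
    de_water = water_mass_g * C_WATER * (final_temp_c - water_temp_c)
    # The sample loses exactly what the water gains.
    de_sample = -de_water
    heat_capacity = de_sample / (final_temp_c - sample_temp_c)  # cal/K
    return heat_capacity / sample_mass_g                        # cal/(K g)


c_metal = specific_heat_of_sample(sample_mass_g=2700.0, sample_temp_c=30.0,
                                  water_mass_g=1000.0, water_temp_c=20.0,
                                  final_temp_c=22.0)
print(f"c = {c_metal:.3f} cal/(K g)")
```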
The principle in the example above is that energy is conserved, so that the
energy that moves into the water must be the same as the energy that moves
out of the metal. Here is another example of that type. Suppose 100 g of
copper at 30 °C is dropped into 1 kg of water at 20 °C. The heat capacity of
the water is $C_w = 1000$ cal/K, and the heat capacity of the copper, using the
specific heat of copper $c_{cu} = 0.092$ cal/(K g), is $C_{cu} = (100\ \mathrm{g})(0.092\ \mathrm{cal/(K\ g)}) = 9.2$ cal/K.
The total change in energy is zero, since the energy only moves
from one to the other, so
$$0 = \Delta E = C_w\,\Delta T_w + C_{cu}\,\Delta T_{cu} \tag{9.26}$$
and thus
$$\frac{\Delta T_{cu}}{\Delta T_w} = -\frac{C_w}{C_{cu}} \tag{9.27}$$
This says the temperature changes are in the inverse ratio of the heat
capacities. The minus sign means one temperature goes down while the other one
goes up (we knew that). The large heat capacity changes less in
temperature, while the small heat capacity changes more. In fact, in this example, the
right hand side is very large, more than 100, because $C_w$ is so much greater
than $C_{cu}$. That means that almost all the temperature change is the copper
cooling off. The temperature change of the water is less than 1/100 that of
the copper, a very slight warming up. A large quantity of water, like the
one in this example, is sometimes called a heat bath, because it maintains
a nearly constant temperature due to its large heat capacity, even while it
exchanges energy with smaller systems. By analogy, any large heat capacity,
like a large enough block of copper, could be a heat bath, controlling the
temperature of an experiment.
In the example above we can solve for the final equilibrium temperature,
which we already know will be approximately the original temperature of the
water, 20 °C. Let $T_w$, $T_{cu}$, and $T_e$ be the initial temperatures of the water
and the copper, and the final equilibrium temperature. Then Eq (9.26) says
$$0 = (T_e - T_w)\,C_w + (T_e - T_{cu})\,C_{cu} \tag{9.28}$$
(We wrote the temperature changes as the final temperature minus the initial
temperature in each case.) Then, solving algebraically for $T_e$, we find
$$T_e = \frac{T_w C_w + T_{cu} C_{cu}}{C_w + C_{cu}} \tag{9.29}$$
This is a weighted average of the initial temperatures, so it is somewhere in
the middle. The weights (multiplying $T_w$ and $T_{cu}$) are the heat capacities
($C_w$ and $C_{cu}$), and we have already noticed that $T_w$ is heavily weighted, since
$C_w$ is much larger than $C_{cu}$. That is why the average comes out to be very
close to $T_w$. In fact, evaluating the expression, we have
$T_e = (20 \times 1000 + 30 \times 9.2)/(1000 + 9.2) = 20.09$ °C. The change in temperature of the copper,
-9.91 °C, is more than 100 times the change in temperature of the water,
0.09 °C.
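Eq (9.29) is a weighted average, and it extends naturally to any number of bodies. The Python sketch below uses the numbers of this example; generalizing to a list of bodies is our own small extension, not something the text requires.

```python
def equilibrium_temperature(temps: list[float], capacities: list[float]) -> float:
    """Eq (9.29): average of the initial temperatures, weighted by heat capacity."""
    total_capacity = sum(capacities)
    return sum(t * c for t, c in zip(temps, capacities)) / total_capacity


T_W, T_CU = 20.0, 30.0    # initial temperatures, degrees Celsius
C_W, C_CU = 1000.0, 9.2   # heat capacities, cal/K

t_e = equilibrium_temperature([T_W, T_CU], [C_W, C_CU])
ratio = (t_e - T_CU) / (t_e - T_W)  # left side of Eq (9.27)

print(f"T_e = {t_e:.2f} C, ratio of temperature changes = {ratio:.1f}")
```

The ratio of the two temperature changes reproduces the right side of Eq (9.27), the negative ratio of the water's heat capacity to the copper's.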
9.6 Molar Heat Capacities
Specific heat, or heat capacity per gram, is the practical measure of heat
capacity, but the more fundamental measure is heat capacity per mole. We
convert specific heat c to molar specific heat $c_M$ simply by multiplying by
the molecular weight MW (i.e., grams/mole). Right away we find something
quite amazing: many, if not most, solids have the same molar specific heat,
around 6 cal/(K mole) $\approx 3R$, where R is the gas constant, Eq (9.18). (Note
that the gas constant R does have the right units to be a molar specific heat.)
This peculiar fact, called the Law of Dulong and Petit, is illustrated in the
table below:
Material    c (cal/(K g))    MW (g/mole)    c_M (cal/(K mole))
Al          0.216            27.0           5.83
Cu          0.092            63.5           5.85
Fe          0.107            55.85          5.98
Au          0.031            197            6.09
Pb          0.031            207            6.40
What does it mean? Let us interpret this by idealizing it slightly, taking the
molar heat capacity to be $c_M = 3R$, the same for all solids.
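The last column of the table can be recomputed from the first two, and compared with 3R, in a few lines of Python. Converting R to calories with 1 cal = 4.18 J is our own step. The recomputed products differ from the tabulated values in the last digit or two, presumably because the tabulated specific heats are rounded.

```python
R_CAL = 8.314 / 4.18  # gas constant in cal/(K mole), using 1 cal = 4.18 J

# (specific heat in cal/(K g), molecular weight in g/mole), from the table
SOLIDS = {
    "Al": (0.216, 27.0),
    "Cu": (0.092, 63.5),
    "Fe": (0.107, 55.85),
    "Au": (0.031, 197.0),
    "Pb": (0.031, 207.0),
}

# Molar specific heat c_M = c * MW for each material.
molar = {name: c * mw for name, (c, mw) in SOLIDS.items()}

for name, c_m in molar.items():
    print(f"{name}: c_M = {c_m:.2f} cal/(K mole) = {c_m / R_CAL:.2f} R")
```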
Imagine we have two solids, X and Y, in thermal equilibrium, and suppose
that X consists of $n_X$ moles and Y consists of $n_Y$ moles. Then their heat
capacities are $C_X = n_X c_M$ and $C_Y = n_Y c_M$. If we add energy $\Delta E$ to the
system, then some of it, $\Delta E_X$, goes to X and the rest, $\Delta E_Y$, goes to Y.
But since they are in thermal equilibrium, the temperature change $\Delta T$ is the
same for both. Thus $\Delta E_X = n_X c_M\,\Delta T$ and $\Delta E_Y = n_Y c_M\,\Delta T$, so
$$\frac{\Delta E_X}{\Delta E_Y} = \frac{n_X}{n_Y} = \frac{N_X}{N_Y} \tag{9.30}$$
That is, each solid gets energy in proportion to the number of moles it
contains, which is proportional to the number of molecules it contains ($N_X$ and
$N_Y$). For example, if X has twice as many molecules as Y, then X gets twice
as much energy as Y. On average, then, each molecule gets the same energy!
This is called equipartition of energy, and gives a very simple picture of what
is happening microscopically. The energy spreads out equally over all the
molecules.
The molar heat capacities of gases are different from those of solids, but
again, they show a surprising simplicity. For noble gases, like helium and
argon, in a container of definite, fixed volume, the molar heat capacity is
just half that of solids, namely $\frac{3}{2}R$. This means that if 1 mole of helium is
in equilibrium with 1 mole of aluminum, then when you add a little energy
$\Delta E$, twice as much goes to the aluminum as to the helium. Each aluminum
atom gets twice as much energy, on average, as a helium atom, when the
system reaches its equilibrium temperature. Why? This does not seem like
equipartition. Why do the helium atoms not get their share? The main
constituents of the air, $\mathrm{N_2}$ and $\mathrm{O_2}$, each have molar heat capacity about $\frac{5}{2}R$
in a container of fixed volume. This is almost as large as that of a solid, 3R,
so each molecule of the air would get almost as much energy as each molecule
of a solid with which it is in equilibrium.
In the next section we give a simple statistical model of what is happening
here.
9.7 Statistical Model for Molar Heat Capacity
Statistical mechanics takes the view that internal energy is just the familiar
kinds of mechanical energy, kinetic energy and potential energy, but at a
microscopic scale that we can't see, and continually exchanged among the
molecules in a random fashion. Thus the internal energy of a gas would just
be the kinetic energy of the molecules. Their interaction with their container
would be collisions at the wall, where they might pick up some energy or lose
some energy, but statistically they would have a definite average energy over
time. The molecules might also collide with each other, redistributing the
energy randomly over all the molecules of the gas.
The kinetic energy of a single molecule of mass m is
$$K = \frac{1}{2}\,m\,(v_x^2 + v_y^2 + v_z^2) \tag{9.31}$$
This is just Eq (6.14), but $v^2$, the speed squared, is the sum of three terms,
referring to the components of velocity along x, y, and z directions. For a
real molecule in a real gas these terms will be rapidly changing in time, but
the statistical theory gives a very simple rule for their average at temperature
T, denoted $\langle\ \rangle$:
$$\left\langle \frac{m v_x^2}{2} \right\rangle = \left\langle \frac{m v_y^2}{2} \right\rangle = \left\langle \frac{m v_z^2}{2} \right\rangle = \frac{1}{2}\,k_B T \tag{9.32}$$
where $k_B$ is Boltzmann's constant. Thus the average kinetic energy $\langle K \rangle$
of one molecule is the sum of three equal terms, giving
$$\langle K \rangle = \frac{3}{2}\,k_B T \tag{9.33}$$
and the internal energy E of 1 mole of a gas is just the total energy of
Avogadro's number of such molecules,
$$E = \frac{3}{2}\,k_B T N_A = \frac{3}{2}\,RT \tag{9.34}$$
where R is the gas constant. This gives a simple relationship between the
internal energy E and the temperature T of a gas. A change of temperature
by $\Delta T$ implies a change of energy $\Delta E = \frac{3}{2}R\,\Delta T$, and the molar heat
capacity, the constant of proportionality in this relationship, is $\frac{3}{2}R$, just what is
observed for the inert gases.
The number 3 in the molar heat capacity $\frac{3}{2}R$ comes from the three
quadratic terms in the kinetic energy $\langle K \rangle$ of a molecule. Those, in turn,
were there because the molecule is free to move in any of three dimensions
of space. We say the molecule has 3 degrees of freedom, each represented by
a quadratic term in the energy. Now we ask ourselves how it can be that
the air has molar heat capacity $\frac{5}{2}R$, not $\frac{3}{2}R$. Could there be 5 degrees of
freedom for air molecules instead of 3? Yes! The molecules of the air are
$\mathrm{N_2}$ and $\mathrm{O_2}$, each molecule containing two atoms. Each atom has its own
kinetic energy, which would be 3 degrees of freedom for each, 6 degrees of
freedom in all, except that along the line connecting them they must have the
same component of velocity, because they stay together, so that leaves only 5
degrees of freedom. The two new degrees of freedom correspond to rotation
of the molecule. Thus the molar heat capacity is $\frac{5}{2}R$, and a measurement of
heat capacity, which is a laboratory scale measurement, is really telling us
something about the geometry of the molecules, and that they are rotating!
The molecules of a solid cannot translate along independently, like the
molecules of a gas, but they can still have kinetic energy K if they are
oscillating in place about their (mechanical) equilibrium positions. They
must, in effect, be on springs that provide a restoring force and always push
them back where they belong in the solid. A simple model of this situation
says that each molecule is a simple harmonic oscillator with energy
$$K + U_S = \frac{1}{2}\,m\,(v_x^2 + v_y^2 + v_z^2) + \frac{1}{2}\,k\,(x^2 + y^2 + z^2) \tag{9.35}$$
Note there are 3 degrees of freedom for velocity, because the molecule can
be moving in any direction, and 3 degrees of freedom for position, because
the molecule can be displaced in any direction. With 6 degrees of freedom in
all, each represented by a quadratic term in the energy, the average energy
at temperature T of a molecule is
$$\langle K + U_S \rangle = 6 \times \frac{1}{2}\,k_B T = 3 k_B T \tag{9.36}$$
and the molar internal energy is
$$E = N_A \langle K + U_S \rangle = 3 N_A k_B T = 3RT \tag{9.37}$$
corresponding to molar heat capacity 3R.
Now we can see why the molecules of a solid get more energy than the
molecules of an inert gas when the two substances are in equilibrium. The
solid has more places to put the energy, having both kinetic and potential
energy. The gas molecules have only kinetic energy. There is an equipartition
of energy, but it is an equipartition among the degrees of freedom, not among
the molecules. Each degree of freedom gets energy $\frac{1}{2}k_B T$ at temperature T.
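The counting rule can be summarized in one line: each quadratic degree of freedom contributes R/2 to the molar heat capacity. Here is a minimal Python sketch of that rule; the function name is our own, and the three cases are the ones discussed in this section.

```python
R = 8.314  # gas constant, J/(K mole), Eq (9.18)


def molar_heat_capacity(degrees_of_freedom: int) -> float:
    """Equipartition: each quadratic degree of freedom contributes R/2."""
    return 0.5 * degrees_of_freedom * R


c_noble_gas = molar_heat_capacity(3)  # translation only: (3/2) R
c_diatomic = molar_heat_capacity(5)   # translation plus two rotations: (5/2) R
c_solid = molar_heat_capacity(6)      # 3 velocity terms + 3 position terms: 3R

print(f"{c_noble_gas:.2f}, {c_diatomic:.2f}, {c_solid:.2f} J/(K mole)")
```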
It is instructive to look at water in this connection. We know the specific
heat of water is 1 cal/(K g), by definition of the calorie, and the molecular
weight of $\mathrm{H_2O}$ is 1 + 1 + 16 = 18 g/mole. Thus the molar specific heat of
water is 18 cal/(K mole), or about 9R. This is an enormous molar specific
heat! It suggests that the water molecule, in the liquid state, has 18 degrees
of freedom. What could they all be? It is true that the molecule consists of
3 atoms, but it is as if they could all move independently as oscillators, in
every direction, as if the molecule didn't constrain them at all. It suggests
that the water molecule, as a constituent of the liquid state, is remarkably
dissociated. Most molecules are quite rigid, like $\mathrm{N_2}$ and $\mathrm{O_2}$ in the atmosphere,
which behave like rigid rotators, with one degree of freedom suppressed. The
water molecule by itself is rigid like any other small molecule, but in the liquid
state it seems to lose this rigidity. The anomalously large heat capacity of
water is still not really understood. The problem hinted at here is sometimes
called the problem of the structure of water.
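The arithmetic behind the figure of 18 degrees of freedom can be checked with a short Python sketch. The conversion of R to calories uses 1 cal = 4.18 J, and the variable names are ours.

```python
R_CAL = 8.314 / 4.18  # gas constant in cal/(K mole), using 1 cal = 4.18 J

c_water = 1.0    # specific heat of water, cal/(K g)
mw_water = 18.0  # molecular weight of H2O, g/mole

c_molar = c_water * mw_water  # molar specific heat, cal/(K mole)
in_units_of_r = c_molar / R_CAL
# Each degree of freedom contributes R/2, so count them in halves of R.
implied_degrees_of_freedom = 2.0 * in_units_of_r

print(f"{in_units_of_r:.2f} R, about {implied_degrees_of_freedom:.0f} degrees of freedom")
```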
9.8 Phase Transitions
We defined the heat capacity to be the constant of proportionality in Eq (9.1),
$\Delta E \propto \Delta T$. But suppose we added a little energy $\Delta E$ to a substance,
intending to measure the corresponding $\Delta T$, and we found $\Delta T = 0$: we would
conclude that the heat capacity C was infinite! Such a system would be a
perfect heat bath, capable of absorbing energy and not changing its
temperature T at all.
That sounds like some kind of idealization, but it is actually completely
commonplace. A mixture of ice and water in equilibrium at 0 °C is such a
thing. This is the temperature at which ice and water coexist. When you add
energy $\Delta E$, it doesn't go into raising the temperature. Rather it goes into
converting some ice into liquid water, at the same temperature. This melting
transition requires energy. If we try to warm the water, by putting it in
contact with something warmer and allowing heat to flow in, but keeping the
ice and water well stirred and in equilibrium with each other, then heat flows
at 0 °C from the water to the ice, melting the ice. Similarly, if you remove
energy from the ice water mixture, by bringing it into contact with something
colder, for example, its temperature does not go down. Rather, as heat flows
out of the system, liquid water converts to ice at the same temperature. This
is low temperature flow of heat without a change in temperature: a concept
sufficiently different from ordinary language to require some thought.
A transition from one state to another, differing in molar internal energy
at the transition temperature, is called a first order phase transition. The
internal energy difference between the two phases is called the latent heat. In
the case of a solid/liquid transition, it is called the latent heat of fusion
(from the Latin word for melting). The latent heat of fusion for water is
80 cal/g, or 1440 cal/mole. The liquid/gas transition is another first order
phase transition. The latent heat of vaporization for water is 540 cal/g, or
9720 cal/mole. These are surprisingly large energies. The 80 calories necessary
to melt a gram of ice at 0°C would have warmed 1 gram of melt water almost to
boiling. Where does all this energy go?
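The per-gram and per-mole latent heats quoted above are related by the molecular weight of water, and the comparison with warming liquid water follows from its specific heat of 1 cal/(K·g). A minimal Python sketch of that bookkeeping (the variable names are illustrative only):

```python
MW_WATER = 18.0              # g/mole
LATENT_FUSION = 80.0         # cal/g
LATENT_VAPORIZATION = 540.0  # cal/g
SPECIFIC_HEAT_WATER = 1.0    # cal/(K*g), by definition of the calorie

# Per-mole latent heats: multiply the per-gram value by grams per mole.
fusion_per_mole = LATENT_FUSION * MW_WATER              # cal/mole
vaporization_per_mole = LATENT_VAPORIZATION * MW_WATER  # cal/mole

# Temperature rise the heat of fusion of 1 g would produce in 1 g of liquid water.
equivalent_warming = LATENT_FUSION / SPECIFIC_HEAT_WATER  # kelvin

print(fusion_per_mole, vaporization_per_mole, equivalent_warming)
```

An 80 K rise starting from 0°C ends at 80°C, which is the sense in which the melt water would have been warmed "almost to boiling".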
Fig 6.11 illustrates how a water molecule might be bound to a neighboring
molecule, at a fairly well defined distance, in the solid, ice. Its interaction
with its neighbor is described by a potential energy, and it is in a low energy
state, nearly the minimum of that potential energy. Without acquiring more
energy, it can only oscillate by a small amount, between turning points that
are very close together. But with more energy, it reaches the higher states,
with larger amplitude oscillations, and with even more energy it can break
free altogether. Thus it is a plausible model that the energy we add to melt
a solid goes into exciting the molecules to higher oscillator states until the
bound structure can no longer hold together. The melting transition is not
well understood, however.
In the vaporization transition, the molecules must end up with the average
kinetic energy of a gas at the transition temperature. By the equipartition
argument, though, they already have this energy! Thus the internal energy
needed to vaporize a liquid must be used to separate molecules from each
other, much like in the melting transition. That the latent heat of vaporiza-
tion is much larger than the latent heat of fusion tells us that the molecules
are energetically bound to their neighbors in the fluid almost as strongly as
they are in the solid.
A technique called differential scanning calorimetry (DSC) looks for phase
transitions systematically in samples of newly synthesized compounds. It
starts at a low temperature, adds a little energy $\Delta E$, and monitors the
corresponding change in temperature $\Delta T$. The ratio $\Delta E/\Delta T$
is the heat capacity C. It does this again and again, gradually raising the
temperature and keeping track of C. If we come to a phase transition, the ratio
C is suddenly very large, because the energy doesn't go into raising the
temperature, but rather into latent heat. Thus in the scan, which might be
automated and simply produce a graph of C at each temperature, there would be
spikes at the transition temperatures. You might think this technique would be
quite superfluous, because surely one can look at a sample and see if it has
melted or vaporized! The point is, though, these aren't the only phase
transitions. There are whole classes of compounds, including biologically
interesting compounds like lipids, that show transitions between liquid
crystalline phases. These are different liquid phases, so they might look just
the same to the casual eye, but the heat capacity shows that phase transitions
occur at special transition temperatures. Naturally one wonders what is
happening microscopically! Typically the molecules are partially ordering in
some way, but not becoming as perfectly ordered as they are in a crystal. In a
material that forms a nematic liquid crystalline phase, for example, the
molecules are elongated, and quite rigid. In the normal liquid phase they are
densely packed together, but completely disordered in orientation. At the
transition to the liquid crystalline phase (going down in temperature) they
acquire a statistical tendency to point in the same direction, even though they
still slide past each other like the molecules in any liquid. In the DSC
measurement, $\Delta E$ added at the transition temperature goes to break the
orientational order, not to raise the temperature. The name "liquid crystal"
very neatly captures the idea of partial ordering within a fluid phase.
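The scanning procedure can be illustrated with a toy model in Python. This is only a sketch under stated assumptions, not a description of a real instrument: the sample is hypothetical, with the same constant heat capacity on both sides of a single transition, and all numbers and function names are invented for illustration.

```python
def temperature(energy_added, c, latent, t_start, t_transition):
    """Temperature of the toy sample after a total energy_added has gone in."""
    e_to_transition = c * (t_transition - t_start)
    if energy_added <= e_to_transition:
        return t_start + energy_added / c
    if energy_added <= e_to_transition + latent:
        return t_transition  # energy goes into latent heat, T does not change
    return t_transition + (energy_added - e_to_transition - latent) / c

def dsc_scan(c, latent, t_start, t_transition, delta_e, n_steps):
    """Add delta_e repeatedly; record (T, apparent C = delta_e / delta_T)."""
    readings = []
    for k in range(n_steps):
        t0 = temperature(k * delta_e, c, latent, t_start, t_transition)
        t1 = temperature((k + 1) * delta_e, c, latent, t_start, t_transition)
        delta_t = t1 - t0
        apparent_c = delta_e / delta_t if delta_t > 0 else float("inf")
        readings.append((t0, apparent_c))
    return readings

# Hypothetical sample: C = 1 (energy unit per degree), latent heat 80, transition at 20.
scan = dsc_scan(c=1.0, latent=80.0, t_start=0.0, t_transition=20.0,
                delta_e=4.0, n_steps=40)
spike_steps = [t for t, c_app in scan if c_app == float("inf")]
print(len(spike_steps), "steps show no temperature change, all at T =", spike_steps[0])
```

In this idealized scan the apparent heat capacity is literally infinite at the transition; a real scan would show a tall but finite spike.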
9.9 Entropy
The use of words like "order" and "break the orientational order" at the end
of the last section actually has a precise technical meaning in the concept of
entropy. The entropy S of a system is a definite, numerical measure of its
disorder. When heat flows into a system, its entropy increases, as you might
suppose. In the differential scanning calorimeter, we did not say how the
energy $\Delta E$ was added to the sample, but suppose it were added as heat. To
emphasize the way the energy was added, as a flow of heat and not in some
other way, $\Delta E$ is called $\Delta Q$ in such a case. This should not be
read as a "change in Q" however, as there is no quantity Q to change. Rather it
is a change in E, the energy, by the method of heat flow.
Keeping the meaning of $\Delta Q$ firmly in mind, we can now define how much
the disorder S increases when heat flows into a system at temperature T.
Amazingly, the symbol $\Delta S$ really means the change in a well defined
quantity S. It is

$$\Delta S = \frac{\Delta Q}{T} \qquad (9.38)$$

By keeping track of the entropy changes $\Delta S$ in the quantity S, starting
from some convenient state, one can know the entropy S of a system in any
equilibrium state. The discovery that there is such a quantity S, defined
by Eq (9.38), was one of the great surprises of 19th century physics. Its
interpretation as molecular disorder came even later. It still seems a peculiar
and elusive concept.
Only in the 20th century did it become clear that S = 0 at T = 0,
that is, the disorder is zero at absolute zero. This is the Third Law of
Thermodynamics. Of course the interpretation of S as disorder makes this
very plausible. Knowing S at T = 0, Eq (9.38) tells us how to find it at
any other temperature, because as we add heat (and therefore disorder), the
sample warms up.
The simplest place to understand Eq (9.38) is precisely at a first order
phase transition, because in that situation, as we add energy in the form of
heat, the temperature T doesn't change. That means we can say exactly
how much a mole of water increases its disorder when it melts: it is just the
latent heat of fusion divided by the transition temperature,

$$(\Delta S)_{\rm fusion} = \frac{(\Delta Q)_{\rm fusion}}{T_{\rm melt}} = \frac{1440\ \mbox{cal/mole}}{273\ \mbox{K}} = 5.27\ \mbox{cal/(K·mole)} = 2.65\,R \qquad (9.39)$$

Notice that molar entropy can be expressed as a multiple of the gas constant
R, which has the same units. Does the disorder increase by more than this
or less when water boils? Do other substances behave similarly?
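The numbers in Eq (9.39) can be reproduced in a few lines of Python. The gas constant in calories, about 1.987 cal/(K·mole), is a standard value assumed here, and the function name is illustrative.

```python
R_CAL = 1.987  # cal/(K*mole); standard value, assumed (not quoted in the text)

def entropy_change(heat, temperature_kelvin):
    """Delta S = Delta Q / T for heat added at a fixed temperature, Eq (9.38)."""
    return heat / temperature_kelvin

# Melting one mole of ice: latent heat 1440 cal/mole at 273 K.
ds_fusion = entropy_change(1440.0, 273.0)  # cal/(K*mole)
ds_fusion_in_R = ds_fusion / R_CAL         # dimensionless multiple of R

print(round(ds_fusion, 2), round(ds_fusion_in_R, 2))
```

The same function applies to any first order transition once its latent heat and transition temperature are known.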
Problems
Thermometers
9.1 Most materials expand as they are heated and contract as they are
cooled, but a notable exception to this behavior is water near its freezing
point.
(a) What familiar phenomenon, observable on lakes in cold climates in
the winter, makes it certain that water does not contract when it freezes, like
most materials, but rather expands? Explain carefully.
(b) This expansion of ice just continues a trend that is also visible in
the liquid state: water expands as it is cooled in the narrow temperature
range from 4° to 0° (Celsius). Above 4° water expands as it is heated, like
most other materials. Sketch a qualitative graph (no need to look up real
data) of the volume of a fixed mass of water between 0° and 10°, labelling
the axes. Describe how to use the volume of the water as a thermometer
in this temperature range, and what difficulty you might have with such a
thermometer.
9.2 (a) Design a mercury thermometer, small enough to fit most of it under
a human tongue, such that 1°F corresponds to a change in length of 1 cm
in the visible mercury column.
(b) The volume expansion coefficient of alcohol is $1.1 \times 10^{-4}$/°C. How
would your thermometer perform if alcohol were substituted for mercury?
Gas Thermometers
9.3 (a) What is the molecular weight of air, if it is 80% nitrogen (molecular
weight 28) and 20% oxygen (molecular weight 32)? Explain your reasoning
clearly.
(b) Find the temperature if a mass of 5 g of air occupies 2000 cm$^3$ at
atmospheric pressure $10^5$ Pa.
(c) The computation in (b) assumes that the air is a gas. Check consistency:
how does the temperature you found compare with the liquefaction
temperatures of oxygen and nitrogen?
9.4 We have used the convenient approximation 1 kg/m$^3$ for the density of
air under normal conditions. Describe how to determine this density more
accurately, and use this method to find a better value.
9.5 A thin, tough beach ball of diameter 40 cm is inflated to a pressure
of 5 atmospheres with air at room temperature. Before it was blown up it
weighed 1 N. What does it weigh after it is blown up? Include a free body
diagram, clearly labelled. (Don't neglect to consider buoyancy.)
9.6 One frequently hears that 1 mole of gas at standard temperature and
pressure occupies a volume 22.4 liters. Explain why this is true.
Heat Capacity
9.7 (a) 50 g of copper at 100°C and 50 g of lead at 0°C are brought together
and heat flows from the copper to the lead until they reach the same
temperature. Assume both metals obey the Law of Dulong and Petit. The
molecular weight of copper is about 64 and of lead is about 207. Find the
equilibrium temperature.
(b) Why is the resulting temperature not 50°C?
(c) More generally, if equal masses m of two different materials, substance
A with molecular weight (MW)$_A$ at temperature $T_A$ and substance B with
molecular weight (MW)$_B$ at temperature $T_B$, exchange heat until they come
to equilibrium, what will be the final temperature (assuming the Law of
Dulong and Petit)?
9.8 By what temperature should a lake warm up in one day if the main
energy input is solar energy, with an energy current density of 1 kW/m$^2$?
Assume all this energy is captured by the lake, and ignore other energy inputs
and losses. Since the lake is horizontal, and not oriented perpendicular to
the Sun's rays, we are overestimating the energy current density on the lake
surface, so assume the day is effectively only 4 hours long to correct for this.
Also assume the water is mixed to a depth of 2 m, but the water below does
not get any of the energy input.
9.9 Can you warm up a nail appreciably by hitting it with a hammer?
Estimate the kinetic energy of a hammer and assume it is all delivered to a
5 g nail made of steel. (You can take the molecular weight to be about 60 and
assume the Law of Dulong and Petit.) What is the change in temperature
per blow?
Chapter 10
Thermodynamics
We can warm ourselves at a fire: that's been obvious since we lived in caves.
But we can also get a fire to do things for us: that is a recent discovery.
Hero of Alexandria, an author about whom we know virtually nothing, and
who might have lived any time between the 2nd century B.C.E. and the 3rd
century C.E., describes toy-like devices that operated by steam power. This
is a hint that the Alexandrian Greeks might have been on the verge of their
own industrial revolution before the Romans got there. We will never know
about this. It seems unlikely they would have been satisfied with just toys if
they had had more time. When our industrial revolution began, in the 18th
century, people quickly tried to improve the early engines, to make them
more powerful and efficient. In doing this they were investigating, without
even fully realizing it, fundamental questions about the flow of energy.
In Section 9.9 we emphasized that energy might be transferred to an
object as the flow of heat, and gave this particular kind of transfer the name
$\Delta Q$. We could also transfer the same energy to a similar object by
lifting it up higher: this gives it additional gravitational potential energy.
These two ways of transferring energy are clearly quite different though. In
particular the first one increases the entropy S of the object (its internal
disorder), while the second one doesn't. If the second object were to convert
its extra gravitational potential energy to internal energy, perhaps by falling
onto the floor, then it might indirectly end up in the same state as the first
one. The first one could not so spontaneously use its extra energy to join the
second one on its high shelf, however.
A transfer of energy that is not the flow of heat is, roughly speaking,
said to be mechanical work. Mechanical work involves just a few degrees
of freedom: when we lift something up, for example, we are operating with
only one degree of freedom, the height. The flow of heat, on the other
hand, spreads energy over Avogadro's number of degrees of freedom. This
distinction, involving many degrees of freedom or few, seems to be crucial.
10.1 Work
There is a simple interpretation of how a mass m at height h got its
mechanical energy $U_g = mgh$. Someone put it there, and in doing so gave it
that energy. Similarly, if a spring has energy $\frac{1}{2}kx^2$, it is because
someone has compressed or stretched it by a distance x, and has thereby given
it that energy. In both cases, we envision a process, of lifting, or
compressing, that transfers energy to the mass m, or the spring.
To lift something you need to support its weight mg as you move it up
through a distance h. The energy you give to m is just the product of the
force mg you exert (up, to support it) and the displacement h (also up, since
you are lifting it). This is a recipe for nding gravitational potential energy
mgh, as the work you would do to place m at height h, where
Your Work on m = Force up × Displacement up (10.1)
The mass has energy $U_g = mgh$ because you did that much work on it. At
the same time, it did negative work on you! It pushes down with force mg
on your hand, while your hand moves up a distance h. Since the force in
this case is opposite to the displacement, the work is negative. You have lost
energy mgh while the mass m has gained energy mgh. It is much less obvious
that you have lost energy, but m has clearly gained energy, and that energy
must have come from somewhere. The whole process is really a transfer of
energy from you to m.
When things push on each other, they are generally doing work on each
other, either positive or negative, and hence transferring energy. When you
compress a spring by a distance x, you must push as hard as the spring
pushes on you, that is, with the force F = kx. The work you do as you
push through distance x is just the potential energy the spring acquires,
namely $U_s = \frac{1}{2}kx^2$. You can think of the work you do as energy
stored in the spring. That energy has been transferred from you to the spring,
so you have lost that much energy: the spring did negative work on you, because
it pushed one way, but the displacement was the other way. The factor
$\frac{1}{2}$, which might seem surprising, is explained to some extent in
Fig 10.1. The rule Work = Force × Displacement is the area under the graph of
force vs. displacement if the force is constant, and that turns out to be the
appropriate generalization in case the force changes with displacement, like
the spring force F = kx, which is proportional to displacement. We emphasize
that the work done on a spring in this sense exactly agrees with the potential
energy $U_s$ of the spring.

[Figure: two graphs of force vs. displacement. In the first, a constant force
mg acts over a displacement y up to h, enclosing area mgh. In the second, the
force rises to kx over a displacement x, enclosing area $kx^2/2$.]
Fig. 10.1: The work done in lifting a weight or compressing a spring is the area under the
graph of force vs. displacement. This generalizes the simple rule Force × Displacement.
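The "area under the graph" rule can be tried numerically. The following Python sketch approximates the area with thin trapezoids and compares it with mgh and with $\frac{1}{2}kx^2$; the particular values of m, h, k and x are hypothetical, chosen only for illustration.

```python
def work_area(force, x_start, x_end, n=1000):
    """Approximate the area under force(x) between x_start and x_end (trapezoids)."""
    dx = (x_end - x_start) / n
    total = 0.0
    for i in range(n):
        x0 = x_start + i * dx
        x1 = x0 + dx
        total += 0.5 * (force(x0) + force(x1)) * dx
    return total

# Hypothetical values for illustration only.
m, g, h = 2.0, 9.8, 1.5  # kg, m/s^2, m
k, x = 200.0, 0.1        # N/m, m

lift_work = work_area(lambda y: m * g, 0.0, h)    # constant force: a rectangle
spring_work = work_area(lambda s: k * s, 0.0, x)  # force grows with s: a triangle

print(lift_work, m * g * h)
print(spring_work, 0.5 * k * x**2)
```

For the constant force the area is a rectangle, and for the spring it is a triangle, which is where the factor of one half comes from.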
10.2 PV Work
In the context of thermodynamics the most important work done by a gas
is the work it does in expanding or contracting its volume. In steam engines
the gas in question would be the steam, and its expansion is the whole point
of the engine: that is where it does something useful. The gas is at a pressure
P, and hence exerts a normal force F = PA on every area A of the containing
vessel. To change the volume, some part of the area A must move normally
through a displacement $\Delta\ell$. Then the work done would be
$PA\,\Delta\ell = P\Delta V$, recognizing the change in volume
$\Delta V = A\,\Delta\ell$. The simplest geometry for this is a piston, as in
Fig 10.2. There is only a single degree of freedom, the position of the piston
in the cylinder.

[Figure: gas in a cylinder pushing on a piston with force PA; the piston
displacement is labelled $\ell$.]
Fig. 10.2: The gas does work $P\Delta V$ on the piston when it expands by $\Delta V = A\,\Delta\ell$. The
piston does work $-P\Delta V$ on the gas. If P changes appreciably during the process, $P\Delta V$
should be understood as the area under the P vs. V graph.

As we are imagining it, the gas does positive work on the piston, because it
pushes the piston in the same direction as the displacement. Notice, however,
that the piston does negative work on the gas, because the piston pushes
inward, by Newton's Third Law, while the displacement that we have drawn
is outward, opposite in direction. This means that the energy of the gas in
the cylinder goes down. That energy has been transferred to the piston in
the positive work that the gas did.
In case the pressure P changes as the volume changes, the meaning of
$P\Delta V$ is the area under the graph of P vs. V, just like the work in other
situations (see Fig 10.1).
10.3 Various Processes
We will consider various processes involving the gas in the piston in Fig 10.2.
The gas is really a kind of metaphor for any system that can do work, have
work done on it, receive heat, or lose heat, which means any macroscopic
system at all. We are really thinking more abstractly than it may appear
about how energy moves around, because the details of the system will play
no essential role. At the same time, gas in a cylinder is a very concrete and
practical example of what the theory means.
The basic observation is that in a process in which the gas does work
$P\Delta V$, and receives a flow of heat $\Delta Q$, its internal energy
changes by

$$\Delta E = \Delta Q - P\Delta V \qquad (10.2)$$

The minus sign is because when the gas expands ($\Delta V > 0$), it does work on
the piston, but the piston does negative work on it. Both $\Delta E$ and
$\Delta V$ refer to the gas.
10.3.1 Adiabatic Process: $\Delta Q = 0$
As we noticed in Eq (9.3), heat flows in response to a temperature difference,
but the constant of proportionality, the conductance, may be large or small.
There is nothing to prevent the conductance from being very small, and nothing
to prevent our doing the process quickly, so that there is little time for heat
to flow. Under these conditions it is a reasonable model to assume
$\Delta Q = 0$. We must still carry out the process smoothly, so that the
different parts of the gas stay in thermal equilibrium with each other. Such a
process is called adiabatic.
Under these conditions, $\Delta E = -P\Delta V$. The internal energy of the gas
goes down as the gas expands, and the temperature T also goes down. Since
PV goes down, and V goes up, P certainly goes down. Thus the work $P\Delta V$
must be understood as the area under the P vs. V graph, as in Fig 10.3.
The work done on the piston comes from the internal energy of the gas. This
example clearly shows that internal energy, spread over Avogadro's number
of degrees of freedom, can become energy associated with a single degree
of freedom. The motion of the piston could be used to raise a weight, for
example.
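[Figure: graph of P vs. V, with a solid adiabatic curve crossing two dotted
isotherms labelled $T_{\rm high}$ and $T_{\rm low}$.]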
Fig. 10.3: The work done in an adiabatic expansion is the area under the P vs. V (solid)
curve. The dotted curves are PV = constant, that is, isotherms corresponding to a high
temperature and a low temperature.
If the gas is compressed adiabatically by the piston, then $\Delta V < 0$. In
this case $\Delta E = -P\Delta V > 0$, and the internal energy of the gas
increases, by the amount of work done on it by the piston. Its temperature goes
up. You have probably noticed this effect in using a hand bicycle pump.
In the adiabatic process we are considering (note: the process must be
done reversibly, as we clarify below), $\Delta S = 0$, by Eq (9.38), since
$\Delta Q = 0$. That is, the entropy, or disorder of the gas, doesn't change.
This seems surprising, since the temperature T goes down in the expansion.
Isn't lower temperature associated with lower disorder? After all, S = 0 at
T = 0. It is true that the gas has more orderly motions at lower temperature if
everything else stays the same, but here, as T goes down, V, the volume, goes
up. Now the gas is more disordered in the sense that it has more room V to
spread out. What we see in the adiabatic process is a tradeoff between order in
space, enforced by the walls of the piston, and orderly motions, associated
with temperature. As the gas is compressed adiabatically, it becomes more
ordered in space (more confined), but less ordered in its motions, because
it is at higher temperature. As it expands, it becomes more ordered in
its motions, but less ordered in space (less confined). These two contrary
tendencies exactly cancel in the measure of disorder called entropy. The
adiabatic process is a constant entropy process, isentropic: $\Delta S = 0$.
10.3.2 Isothermal Processes, $\Delta T = 0$
An isothermal process is one that takes place at constant temperature T.
This could happen if the gas is in thermal contact with a heat bath, that is,
a large heat capacity at the temperature T. For a gas obeying the gas law
Eq (9.17), $\Delta T = 0$ means $\Delta E = 0$, that is, the internal energy E
is also constant. This is especially easy to see in case $E \propto T$, as in
the statistical theory of Eq (9.34), but it is true for any ideal gas. In a
process like this

$$0 = \Delta E = \Delta Q - P\Delta V \qquad (10.3)$$

The work done by the gas is again the area under the P vs. V curve,
Fig 10.4. In this case, if the gas does positive work $P\Delta V$, the energy
does not come from its internal energy. Rather its internal energy stays the
same, and the energy comes from heat flowing into the gas, which means, from the
heat bath! We could think of it as an expansion like the adiabatic expansion,
except that now the thermal conductance to the outside is large, so that
instead of cooling as it expands, the gas receives heat to keep its temperature
constant. We could imagine it cooling just slightly as it expands, but as this
happens, the temperature difference with the heat bath drives a flow of heat
from the heat bath to warm the gas back up. The high thermal conductance
between the gas and the heat bath means they can never differ significantly
in temperature. The pressure P in the gas goes down as it expands, since V
goes up, and PV = constant.

[Figure: graph of P vs. V showing the curve T = constant.]
Fig. 10.4: The work done in an isothermal process is the area under the P vs. V curve.
In this case the curve is the isotherm PV = constant.
Similarly, if we do work on the gas, compressing it isothermally, the in-
ternal energy does not go up. Rather the slight temperature increase that
we might initially create drives a heat flow to the bath. The work we do on
the gas goes into the heat bath as internal energy. The pressure in the gas
goes up as the volume goes down, since PV = constant.
We can also follow the change in entropy $\Delta S$ in an isothermal process.
By Eq (9.38), when the gas expands and energy flows in as heat $\Delta Q$, the
entropy of the gas increases by $\Delta S = \Delta Q/T$. This might be a
surprise: the temperature stays the same but the disorder increases. Why? The
answer is that the volume has increased in the expansion, so the gas is more
disordered in space, spreading out more. This is an effect that we already
noticed in the adiabatic expansion, but there it was in connection with a
contrary effect as the temperature went down, and here we see it by itself. The
$\Delta Q$ entering the gas at temperature T is exactly the $\Delta Q$ that
leaves the heat bath at temperature T, so the entropy of the heat bath goes
down by the same amount that the entropy of the gas goes up, and the change in
entropy for the entire system of heat bath plus gas is zero.
The actual value of the work done in the isothermal expansion of an ideal
gas is surprisingly important. As we already said, it is the area under the P
vs. V curve in Fig 10.4, and it can be found by calculus. It is

$$P\Delta V = nRT \ln\left(\frac{V_f}{V_i}\right) \qquad (10.4)$$

where $P\Delta V$ is our notation for the work done, $V_f$ and $V_i$ are the
final volume and the initial volume respectively, and ln is the natural
logarithm, a function available on calculators. This is also $\Delta Q$, since
$\Delta E = 0$ in this process, and hence the isothermal change in entropy is

$$\Delta S = \frac{\Delta Q}{T} = nR \ln\left(\frac{V_f}{V_i}\right) = nR \ln\left(\frac{P_i}{P_f}\right) \qquad (10.5)$$

The form involving the initial pressure $P_i$ and the final pressure $P_f$
follows because PV = constant in an isothermal expansion. We see that S
increases in the expansion because the logarithm is an increasing function of
its argument: larger final volume $V_f$ means larger $\Delta S$.
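Readers without calculus can still check the isothermal work formula numerically, by adding up the areas of many thin strips under the isotherm P = nRT/V. The Python sketch below does this for a hypothetical case (one mole at 300 K doubling its volume); the value R = 8.314 J/(K·mole) is the standard one, assumed here, and the function names are illustrative.

```python
import math

R = 8.314  # J/(K*mole); standard value, assumed

def isothermal_work(n, T, v_i, v_f):
    """Work done by the gas according to Eq (10.4)."""
    return n * R * T * math.log(v_f / v_i)

def strip_area_work(n, T, v_i, v_f, steps=20000):
    """Area under P = nRT/V, summed over thin strips (midpoint rule)."""
    dv = (v_f - v_i) / steps
    total = 0.0
    for i in range(steps):
        v_mid = v_i + (i + 0.5) * dv
        total += (n * R * T / v_mid) * dv
    return total

# Hypothetical example: 1 mole at 300 K doubling its volume.
n, T, v_i, v_f = 1.0, 300.0, 1.0, 2.0
work = isothermal_work(n, T, v_i, v_f)  # joules
area = strip_area_work(n, T, v_i, v_f)  # should agree closely with work
delta_s = work / T                      # Eq (10.5), since Delta Q equals the work

print(round(work, 1), round(area, 1), round(delta_s, 3))
```

Only the ratio of the volumes matters, so the units chosen for V drop out of the result.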
We can understand this result without calculus using dimensional analysis.
The work done is an energy, so it must be proportional to RT on
dimensional grounds. It must also be proportional to n, the number of moles
of gas, because the heat necessary to maintain temperature T is proportional
to the number of molecules: this energy gets equipartitioned over all degrees
of freedom. Since T is constant, the only variable is V, and on dimensional
grounds the result can depend only on the dimensionless ratio $V_f/V_i$, i.e.,
the work done must be $nRT\,f(V_f/V_i)$, where f is some unknown function. But
the work done in expanding from $V_A$ to $V_B$, plus the work done in expanding
from $V_B$ to $V_C$, where these are any volumes, must be the same as the work
done in expanding from $V_A$ to $V_C$, i.e., the function f must have the
property $f(V_C/V_A) = f(V_B/V_A) + f(V_C/V_B)$, or more simply
$f(xy) = f(x) + f(y)$. You may recall that the logarithm has this property. In
fact it is the only function that does. This argument does not determine a
dimensionless constant multiplying everything, but this is the same as saying
that we have not determined what the base should be for the logarithm. It turns
out to be the natural logarithm (base e). We will return to this result on the
entropy change associated with a change in volume or density at the end of
this chapter.
10.3.3 A Constant Pressure Process
Suppose we have a mole of helium gas in the cylinder of Fig 10.2, and we
want to add enough energy, via heat flow $\Delta Q$, to raise its temperature
some fixed amount $\Delta T$. How much heat do we need? Well,
$\Delta E \propto \Delta T$, where E is the internal energy of the helium, and
the constant of proportionality is the molar heat capacity $C_M$. We even know
$C_M = \frac{3}{2}R$ for helium, as discussed in the text around Eq (9.34).
Thus you might very justifiably say that the heat required will be
$\Delta Q = \frac{3}{2}R\Delta T$, a little more than 12 J to raise the
temperature 1 K. But now we say we will do this keeping the helium
at constant pressure P, by maintaining a constant force on the piston from
the outside. "What difference could that make?" you might ask. Well, since
$PV \propto T$ for the helium, and T is increasing, and P is constant, we see
that V must increase, that is, the gas must expand (at constant pressure). In
doing so, it does work $P\Delta V$ on the piston, and this represents a loss of
internal energy to the gas. In fact, since
$\Delta E = \Delta Q - P\Delta V = \frac{3}{2}R\Delta T$, we find

$$\Delta Q = \frac{3}{2}R\Delta T + P\Delta V \qquad (10.6)$$

so that the heat $\Delta Q$ required is more than you thought! In fact, since
$P\Delta V = R\Delta T$, by the ideal gas law Eq (9.17), we find
$\Delta Q = \frac{5}{2}R\Delta T$, as if the molar heat capacity $C_M$ were
$\frac{5}{2}R$ and not $\frac{3}{2}R$. We need more than 20 J to
raise the temperature by 1 K. The reason is that not all the heat we add goes
into internal energy of the gas, and hence into increasing the temperature.
Some of it comes out again as work done in expansion of the gas. That is
why we have to put in some extra heat.
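The two figures quoted above ("a little more than 12 J" and "more than 20 J") follow directly from the energy bookkeeping, as this Python sketch shows. It assumes the standard value R = 8.314 J/(K·mole); the function names are illustrative.

```python
R = 8.314  # J/(K*mole); standard value, assumed

def heat_constant_volume(n_moles, delta_t):
    """All the heat goes into internal energy: Delta Q = (3/2) n R Delta T."""
    return 1.5 * n_moles * R * delta_t

def heat_constant_pressure(n_moles, delta_t):
    """Heat = internal energy change + expansion work, as in Eq (10.6)."""
    internal_energy_change = 1.5 * n_moles * R * delta_t
    expansion_work = n_moles * R * delta_t  # P Delta V = n R Delta T at constant P
    return internal_energy_change + expansion_work

q_v = heat_constant_volume(1.0, 1.0)    # one mole of helium, 1 K
q_p = heat_constant_pressure(1.0, 1.0)

print(round(q_v, 2), round(q_p, 2), round(q_p - q_v, 3))
```

The difference between the two answers is exactly the expansion work, R per mole per kelvin.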
To make sense of this somewhat confusing situation, we should realize that
the original question "how much heat do you need to raise the temperature
by $\Delta T$?" was not precise enough. We have to say exactly what the process
is. If it is a constant volume process, then we should use the heat capacity
we already knew about, $C_V = \frac{3}{2}R$, where the subscript V means that
this is the heat capacity at constant volume. In this process, the pressure goes
up as the temperature goes up, and all the added heat goes into raising
the temperature. But if the process is at constant pressure, then we must
use $C_P = C_V + R$, the heat capacity at constant pressure. If we keep track
of all the energy, including the work done in expansion in this process, the
confusion goes away.
In principle $C_V$ and $C_P$ are different for solids and liquids as well. In
fact, though, as we have seen, solids and liquids typically expand very little
as they are heated, only a part in $10^4$ or so, per degree. Thus the work
done in expansion is tiny. And since this work done is just the difference
$C_P\Delta T - C_V\Delta T$, we see that $C_P \approx C_V$. What makes gases
different is that they expand a lot at constant pressure when they are heated.
10.3.4 Reversible and Irreversible Processes
The adiabatic and isothermal processes, as we have described them, are
reversible. If we compress a gas adiabatically, the work we do is stored in the
internal energy of the gas, the temperature goes up, and the pressure goes
up. Allowing the gas to expand again, we get back the work we invested,
as the gas expands and cools back to its initial temperature. Both processes
are represented in Fig 10.3, differing only in the direction we move along one
and the same P vs. V curve (the solid one in the figure).
The same remarks apply to the isothermal process, using Fig 10.4. We
can do work on a gas to compress it isothermally, and the pressure goes up.
The energy is stored in the heat bath(!) Allowing the gas to expand again
isothermally, we get that work back. It is interesting to notice that this
process includes a reversible flow of heat, between the gas and the heat bath
at the same temperature.
The prototype irreversible process is the spontaneous flow of heat from
hot to cold, i.e. between two genuinely different temperatures. Once this
happens, there is no way simply to reverse it. It is true that there exist
devices which act as refrigerators, moving energy from cold to hot, leaving
the cold even colder, but this is not a simple reversal of the spontaneous flow
of heat. Refrigerators are machines that require their own flows of energy.
When heat $\Delta Q$ flows from a hotter body, at temperature $T_h$, to a
colder body, at temperature $T_c < T_h$, the total entropy of the two bodies,
taken together, increases. By Eq (9.38), the entropy of the hotter body changes
by $\Delta S_h = -\Delta Q/T_h$. This is a negative change (decrease) in
entropy, because heat is flowing out. The entropy of the colder body changes by
$\Delta S_c = \Delta Q/T_c$. This is a positive change in entropy, because heat
is flowing in. The total change in entropy is

$$\Delta S = \Delta S_h + \Delta S_c = \Delta Q\left(\frac{1}{T_c} - \frac{1}{T_h}\right) > 0 \qquad (10.7)$$

It is positive because $T_c < T_h$, so that $1/T_c > 1/T_h$.
Since heat flows spontaneously from hot to cold, but not from cold to
hot, entropy can increase in such a process, but not decrease.
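A small Python sketch makes the sign argument concrete. The numbers (100 J flowing from a body at 400 K to one at 300 K) are hypothetical, and the function name is illustrative.

```python
def total_entropy_change(heat, t_hot, t_cold):
    """Total entropy change when `heat` flows from t_hot to t_cold (kelvin)."""
    ds_hot = -heat / t_hot   # heat leaves the hotter body
    ds_cold = heat / t_cold  # the same heat enters the colder body
    return ds_hot + ds_cold

# Hypothetical example: 100 J flows from a body at 400 K to one at 300 K.
ds_forward = total_entropy_change(100.0, 400.0, 300.0)

# A negative heat describes the reverse flow, from cold to hot. It would lower
# the total entropy, which is why it does not happen spontaneously.
ds_reverse = total_entropy_change(-100.0, 400.0, 300.0)

print(round(ds_forward, 4), round(ds_reverse, 4))
```

If the two temperatures are equal the total change is zero, which corresponds to the reversible heat flow of the isothermal process.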
Finally, processes with friction are irreversible. The prototype is Joule's
experiment with the paddlewheel and the water. The paddlewheel does work
$F\Delta x$ on the water, by pushing it with some force F through some
displacement $\Delta x$. The water is churned around, and is not in thermal
equilibrium as this happens, but when it finally reaches equilibrium, it is in
the same state as could have been attained by the addition of heat $\Delta Q$
equal to the work done on the water. Thus its entropy has increased, by
Eq (9.38). (Notice that heat was not in fact added. The entropy of the final
state can only be calculated, however, by thinking of some other process,
involving the addition of heat, that would get to that state.) Once the water
has settled down, at a higher temperature, and a higher entropy, it is clear
that the reverse process, in which the water pushes the paddlewheel back to its
initial position, will not occur.
Let us think about the effect of friction between the piston and the cylinder
in the adiabatic expansion of a gas, Fig 10.3. The gas does work $P\Delta V$
on the piston, as illustrated there, but because of friction, not all the force
PA is available to, say, raise a weight. Some of the force PA is balanced
(opposed) by the friction force f on the piston due to the cylinder. The
remaining force PA − f is available to do work on the weight, and thus the
work done is only $(PA - f)\Delta\ell$. The remaining energy taken from the gas,
$f\Delta\ell$, which we could call the work done by the friction force f,
eventually brings the piston and cylinder to a state that could have been
achieved by adding heat $\Delta Q = f\Delta\ell$ (even though that is not in
fact how it happened). The entropy of the cylinder and piston has therefore
increased. The entropy of the gas has not changed. The process cannot be simply
reversed, because the energy stored in the raised weight is not enough to
compress the gas back to its original volume.
In general, in an energy transfer with friction, system A does work $W$ on
system B, but the change in the (few) mechanical degrees of freedom of B
leaves it in a position only to do (in return) a smaller amount of work $W' < W$
on system A. The energy which is not represented in the mechanical degrees
of freedom of B is thermalized, that is, spread over Avogadro's number of
degrees of freedom of B, as if it had been transferred as heat.
One possible model for how the entropy increases in the piston and cylinder
with friction (system B) is to think microscopically about how friction is
caused. There could be little rough spots on the moving surfaces that catch,
stick, and then slip. In the process, little vibrations are excited in certain
degrees of freedom of the solid. When this happens, the energy is surely not
equipartitioned, so the system is not in thermal equilibrium. The degrees of
freedom that have extra energy are, in effect, hotter. As thermal equilibrium
is attained, heat flows from the hot degrees of freedom to the cold ones, and
this flow of energy from hot to cold is irreversible, and increases entropy.
How such friction processes actually operate is still a research topic, as is the
description of systems with many degrees of freedom not in equilibrium.
To summarize, in irreversible processes, $\Delta S > 0$: that is, total entropy
increases. In reversible processes, total entropy does not change, i.e., $\Delta S = 0$.
The statement that $\Delta S \geq 0$ in every process that actually occurs, whether
reversible or irreversible, is the Second Law of Thermodynamics.
10.4 Heat Engines
The steam engine was the inspiration for the most brilliant and beautiful
conception in thermodynamics, the ideal engine of Sadi Carnot. Carnot
published this idea at the age of 28 in 1824, with the title Reflections on the
Motive Power of Heat. He died only a few years later, of cholera. (To be
fair, the first really clear statement of what Carnot had glimpsed was due to
Rudolf Clausius some 20 years later.)
Carnot drew on an analogy between steam power and hydropower. In the
case of hydropower, you have water at a height $H_1$, and there is also a lower
level available, at height $H_2 < H_1$. You exploit the difference in heights when
you let the water fall from $H_1$ to $H_2$, doing work along the way (turning a
turbine, for example). Note that it does no good to have water at $H_1$ if there
is not a lower level for it to move to. The analogy with steam power is the
following: in a steam engine (or any heat engine) you have a material at high
temperature $T_1$, and you also have a lower temperature $T_2$ available, and
heat will spontaneously flow from one to the other. Along the way you can
divert some of it into useful work. Note that no heat flows if you do not have
the lower temperature $T_2$. The difference in temperature drives the flow of
heat, just as the difference in gravitational potential energy drives the flow
of water.
It makes no sense in the context of hydropower to let water fall freely,
without putting in some kind of turbine that extracts useful work. Similarly
it makes no sense in the context of heat engines to let heat flow irreversibly
from $T_1$ to $T_2$. Rather, the heat should flow through a mechanism that
extracts work. That is what a heat engine is. Carnot solved the problem of
making the best possible mechanism, in the sense of getting the most work for
a given flow of heat. Carnot's insight is that the best you can do is to make
the engine reversible. And amazingly, any reversible heat engine operating
between $T_1$ and $T_2$ is equivalent to any other. They all do exactly the same
thing, regardless of their construction, whether they are steam engines or
devices of the distant future, using materials and methods yet undreamed
of!
The proof of this amazing result is simple. It is basically the picture
in Fig 10.5. What is diagrammed there is two reversible engines operating
between the high temperature $T_1$ and the low temperature $T_2$. On the left,
heat flows from $T_1$ to $T_2$ and some of it is diverted into work $W = Q_1 - Q_2$.
The surprise comes on the right: the output work of the first engine becomes
the input of the second engine, operating in reverse! Since this engine is
reversed, it takes in heat from the low temperature $T_2$ and exhausts it at
the high temperature $T_1$ (it is operating as a refrigerator). The two engines
together become a kind of compound engine. What is its effect on the two
heat baths?

Fig. 10.5: Reversible heat engine 1 on the left takes in heat $Q_1$ at $T_1$, exhausts heat $Q_2$ at
$T_2$, and performs work $W = Q_1 - Q_2$ on reversible heat engine 2 on the right, which takes
in heat $Q_2'$ at $T_2$ and exhausts heat $Q_1'$ at $T_1$. The argument in the text shows $Q_1' = Q_1$
and $Q_2' = Q_2$, no matter how the engines are made.

The heat $Q_1'$ exhausted at $T_1$ by engine 2 cannot be greater than the heat
$Q_1$ taken in by engine 1, since then the net effect would be that heat has been
tricked into flowing from cold to hot. But neither can the heat $Q_1'$ exhausted
at $T_1$ by engine 2 be less than the heat $Q_1$ taken in by engine 1, because the
engines are reversible, and reversing both of them we would again find heat
spontaneously flowing from cold to hot. Since $Q_1' > Q_1$ is impossible, and
$Q_1' < Q_1$ is impossible, it can only be that $Q_1' = Q_1$, i.e., the heat exhausted
by engine 2 is the same as that taken in by engine 1. Thus if the engines
were disconnected from each other and made to run in the same direction,
they would both take in the same heat at $T_1$, perform the same work, and
exhaust the same heat at $T_2$.
We can go further and say exactly how efficient this reversible engine is,
defining efficiency $e$ as
$$e = \frac{W}{Q_1} \tag{10.8}$$
i.e., the fraction of the high temperature heat $Q_1$ that we can turn into work
$W$. For the reversible engine the value is called the Carnot efficiency, the
highest efficiency attainable by an engine operating between temperatures
$T_1$ and $T_2$. Irreversible engines are less efficient than this: for example, zero
efficiency corresponds to $W = 0$, which means $Q_1 = Q_2$, and the heat simply
flows irreversibly from $T_1$ to $T_2$. We have computed the change in entropy
$\Delta S$ in this case, in Section 10.3.4, and we recall that it is positive. Of
course $\Delta S > 0$ is the characteristic feature of an irreversible process. As we
imagine making a heat engine more efficient for fixed $Q_1$, but still irreversible,
$W$ increases as $Q_2$ decreases, but $Q_2$ is still large enough that the entropy
change is positive,
$$\Delta S = -\frac{Q_1}{T_1} + \frac{Q_2}{T_2} > 0 \tag{10.9}$$
The limiting case, the best engine, is the case $\Delta S = 0$, the reversible engine,
with $Q_2$ as small as it can be. To get still more work, i.e., to be even more
efficient, the engine would have to decrease total entropy, something that
never happens. (Such an engine, powering a reversible refrigerator, would
make heat spontaneously flow from cold to hot.)
For the reversible engine, the total change in entropy of the two heat
baths is
$$\Delta S = -\frac{Q_1}{T_1} + \frac{Q_2}{T_2} = 0 \tag{10.10}$$
and therefore
$$\frac{Q_2}{Q_1} = \frac{T_2}{T_1} \tag{10.11}$$
The Carnot efficiency $e$ is therefore, using $W = Q_1 - Q_2$,
$$e = \frac{W}{Q_1} = \frac{Q_1 - Q_2}{Q_1} = 1 - \frac{Q_2}{Q_1} = 1 - \frac{T_2}{T_1} \tag{10.12}$$
an astonishingly simple result. The efficiency depends only on the absolute
temperatures. If, for example, we can use steam at 500 K, with heat flowing
to room temperature 300 K, then the optimal efficiency, the Carnot efficiency,
is $1 - 300/500 = 2/5 = 0.40$, or 40%. More than half the energy available
from the fuel that heats the high temperature bath is exhausted as waste
heat $Q_2$ at room temperature, doing no good to anyone. If we could use
hotter steam, we could do better! Of course higher temperatures may be
impractical for other reasons, but Carnot's insight tells us what we must do
to have even a hope of high efficiency.
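The steam example above can be checked with a few lines of Python. This is only a sketch: the function names are invented for illustration, and the value $Q_1 = 1000$ J is an arbitrary choice. The second function evaluates the bath entropy change of Eq (10.9) for an engine that turns work $W$ out of heat $Q_1$, so one can see it fall to zero exactly at the Carnot efficiency and go negative (forbidden) beyond it.

```python
def carnot_efficiency(t_hot, t_cold):
    """Eq (10.12): e = 1 - T2/T1, with absolute temperatures in kelvin."""
    if not 0 < t_cold <= t_hot:
        raise ValueError("need 0 < t_cold <= t_hot (kelvin)")
    return 1.0 - t_cold / t_hot


def bath_entropy_change(q1, w, t_hot, t_cold):
    """Eq (10.9): total entropy change of the two baths when an engine takes
    in q1 at t_hot, does work w, and exhausts q2 = q1 - w at t_cold."""
    q2 = q1 - w
    return -q1 / t_hot + q2 / t_cold


e = carnot_efficiency(500.0, 300.0)
print("Carnot efficiency:", e)

# Hypothetical engine taking in 1000 J per cycle, with increasing work output.
for w in (0.0, 200.0, 400.0, 500.0):
    print(w, bath_entropy_change(1000.0, w, 500.0, 300.0))
```

With these numbers the entropy change is positive for $W < 400$ J, zero at $W = 400$ J (the reversible engine), and negative for $W = 500$ J, which no real engine can achieve.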
10.4.1 The Carnot Cycle
The heat engines in Carnot's argument are perhaps a little too abstract!
What makes us think that such machines could be built, even in principle?
In this section we describe an actual device that would operate as a reversible
engine (to the extent that we can minimize friction, of course). This makes
it clear that the Carnot efficiency can be approximated in real devices.
We imagine using the cylinder and piston of Fig 10.2. The machine
will run in a cycle, taking in some fixed amount of heat $Q_1$, doing some
corresponding work $W$, exhausting the rest as heat $Q_2$, and repeating, over
and over. The first step, taking in heat $Q_1$ reversibly, is just an isothermal
expansion of the gas at temperature $T_1$, in the course of which an equal
amount of work, namely $Q_1$, is done on the piston. Since this is more work
than we ultimately get out, some of that energy will later be put back into
the gas to complete the cycle. In the next step we must get reversibly down
to the low temperature $T_2$. This can be done with an adiabatic expansion:
more work is done, and the gas cools to $T_2$. Now heat $Q_2$ must be exhausted
isothermally and reversibly: this requires an input of work equal to $Q_2$.
Finally the gas must be heated reversibly back to the high temperature $T_1$:
this can be done by compressing it adiabatically (which again requires that
we do work on the gas, giving energy back). Since each step is reversible, as
discussed in Section 10.3, this Carnot cycle must have the Carnot efficiency of
an optimal engine. Actually carrying out these steps might be cumbersome,
requiring thermal insulation for the adiabatic steps, and then thermal contact
with different heat reservoirs for the isothermal steps. It is not proposed as
a practical engine, but rather as a proof of existence, a proof of concept.
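The four steps just described can be followed numerically. The Python sketch below assumes the working substance is an ideal gas with constant $\gamma = C_P/C_V$ (the monatomic value 5/3 is used as a default), and it uses the standard ideal-gas results that isothermal work is $nRT\ln(V_f/V_i)$ and that $TV^{\gamma-1}$ is constant in a reversible adiabatic step. The function name, the amounts, and the volumes are illustrative assumptions, not taken from the text.

```python
import math

R = 8.314  # gas constant, J/(mol K)


def carnot_cycle(n, t_hot, t_cold, v_a, v_b, gamma=5.0 / 3.0):
    """Return (q1, q2, w_net) for one Carnot cycle of n moles of ideal gas.

    v_a -> v_b is the isothermal expansion at t_hot. All work terms below
    are work done BY the gas (negative means work done on the gas).
    """
    cv = R / (gamma - 1.0)  # molar heat capacity at constant volume
    # Reversible adiabat: T V^(gamma-1) = const, so volumes stretch by this factor.
    stretch = (t_hot / t_cold) ** (1.0 / (gamma - 1.0))
    v_c = v_b * stretch  # volume after the adiabatic expansion
    v_d = v_a * stretch  # volume before the adiabatic compression
    w_iso_hot = n * R * t_hot * math.log(v_b / v_a)    # step 1: equals Q1 taken in
    w_adi_down = n * cv * (t_hot - t_cold)             # step 2: gas cools to t_cold
    w_iso_cold = n * R * t_cold * math.log(v_d / v_c)  # step 3: negative, equals -Q2
    w_adi_up = -n * cv * (t_hot - t_cold)              # step 4: gas heated back to t_hot
    q1 = w_iso_hot
    q2 = -w_iso_cold
    w_net = w_iso_hot + w_adi_down + w_iso_cold + w_adi_up
    return q1, q2, w_net


# Hypothetical cycle: 1 mole, 500 K and 300 K baths, volume doubling at 500 K.
q1, q2, w_net = carnot_cycle(n=1.0, t_hot=500.0, t_cold=300.0, v_a=1.0e-3, v_b=2.0e-3)
print("efficiency from the cycle:", w_net / q1)
print("Carnot efficiency 1 - T2/T1:", 1.0 - 300.0 / 500.0)
```

The two adiabatic steps cancel each other in the net work, so the efficiency is set entirely by the two isothermal steps, in agreement with Eq (10.12).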
10.4.2 Refrigerators, Heat Pumps
In examining Carnot's idea, we noticed that a heat engine running backward
is a refrigerator. Mechanical work is used to move heat from cold to hot.
We can see how that might work in practice by following the Carnot cycle
in reverse. At low temperature the gas expands isothermally; this is where
heat flows reversibly into the gas from the low temperature bath. Now the
gas is isolated thermally and adiabatically compressed, by an input of work,
to raise its temperature. The hot gas can now give up its heat reversibly as it
is compressed still more, in thermal contact with the high temperature bath.
The gas is now isolated again and allowed to expand and cool adiabatically to
the low temperature, where the cycle begins again with isothermal expansion.
Metaphorically, it is like bailing out a basement. You must do work to raise
water from the basement, so that you can dump it out at a higher level, just
as you must do work to raise the temperature of the gas, so that you can
dump heat to a higher temperature bath. Work changes the temperature,
just as work changes the height. For a given input of work $W$, the best
refrigerator would remove the most heat $Q_2$ at temperature $T_2$. How much
would that be? (See the problem section.)
In refrigeration it is the low temperature end that is of interest. We want
to remove energy from it, even though there is no colder place for heat to
flow. The same machine accomplishes a different task if we focus on the high
temperature end. The refrigerator dumps energy into the high temperature
bath, in effect trying to warm it. In this application it is called a heat
pump, transferring energy from a most unlikely place, the low temperature
bath. An application of heat pumps that is actually used is the heating of
buildings. The low temperature bath is the wintry outdoors, and the heat
pump runs as a refrigerator, trying to cool the outdoors and pumping the
heat indoors. Again, it is a nice problem to see how much heat $Q_1$ you can
get for a given expenditure of work $W$. With a reversible heat pump we can
certainly anticipate $Q_1 > W$, since a not very smart (but widely used) way
to heat space is just to dissipate $W$ in friction, getting the same energy $W$ as
if it had been a flow of heat. Surely a clever mechanism can do better than
that!
10.5 Life at Fixed Temperature
Living systems are vitally concerned with energy transfer. That makes it
seem as if thermodynamics ought to be relevant to them. On the other
hand, living systems never extract work from two temperature baths, and
a cylinder of compressed gas also does not seem a very useful model for a
living thing. How, if at all, does thermodynamics apply to life?
The thermodynamic concept which emerges as crucial for this purpose
is entropy. The concept of entropy $S$, as in Eq (9.38), was actually discovered
through Carnot's argument about heat engines. But as a measure of a
system's disorder it has much more general applicability. Knowing about entropy
we can understand, in very general terms, how living systems succeed
in building structure.
Organisms live at one temperature $T$. Thus reversible transfers of energy
involving a system A obey
$$\Delta E = Q - W = T\Delta S - W \tag{10.13}$$
Here $E$ is the internal energy of A, $Q$ is heat flowing into A, and $W$ is
work done by A on some other system B, hence a loss of energy to A. We
used Eq (9.38) to replace $Q$ with $T\Delta S$ for a reversible process.
Now living systems need subsystems like A which are capable of doing
work $W$ on other subsystems. Remember that work is an investment of
energy in just a few degrees of freedom, not spread over Avogadro's number
of degrees of freedom. That is how structure can be built. So one should
see the term $-W$ above as "structure," at least conceptually. The more
negative it is, the more work subsystem A has done, and the more it has
contributed to structure. We highlight this term by rearranging Eq (10.13),
$$-W = \Delta E - T\Delta S \tag{10.14}$$
So how can A make its contribution? We see two ways. First, A may have a
lot of internal energy $E$ initially, and it may give it up in doing work, so that
$\Delta E$ is negative. That is not such a surprising thing. If it has energy, it can
transfer its energy to something else. But second, A may have small entropy
$S$, that is, it may be quite ordered. In that case, it can increase its entropy,
even without changing its energy, and in the process $-T\Delta S$ contributes to
$-W$: structure. The entropy of A goes up, but if A is building a structure
in subsystem B, the entropy of B may go down. Only the total change
in entropy is guaranteed not to decrease. In this case a disordering of A
can create order in B. The energy comes from the constant temperature
environment! This is really quite far from what intuition would suggest.
The combination of energy and entropy that occurs so naturally above is
called the Helmholtz free energy
$$F = E - TS \tag{10.15}$$
Work done by a system at fixed temperature reduces its Helmholtz free energy,
and work done on a system at fixed temperature increases its Helmholtz
free energy. A subsystem A with large Helmholtz free energy is able to do
work to create structure in another subsystem B because this work is really
a transfer of Helmholtz free energy from A to B, and thus might be
either an increase in the internal energy $E$ of B (not so interesting), or a
decrease in the entropy $S$ of B (interesting)! Either process would increase
the Helmholtz free energy of B. (Note that Helmholtz free energy is not
conserved: it spontaneously decreases in irreversible processes. So we can
think of "transferring" it with the attendant possibility of "spilling" some of
it.)
Thus living systems need access to subsystems of high Helmholtz free
energy, i.e., food, oxygen, etc. They use this free energy to create order,
like partitioning ions on one side of a membrane. That free energy can then
be used to create high free energy molecules like ATP. It is all transfer of
Helmholtz free energy. That is what thermodynamics has to say.
10.6 Life at Fixed Temperature and Pressure
Organisms typically live not only at fixed $T$ but also at fixed $P$. This means
some of the work done by a subsystem A might simply go into expansion, that
is, into $P\Delta V$ work, not usually associated with useful structure. To highlight
the interesting work done, it is useful to introduce a slight generalization of
the Helmholtz free energy called the Gibbs free energy,
$$G = E - TS + PV \tag{10.16}$$
In any process that ends up at the same $T$ and $P$, the change in $G$ is
$$\Delta G = \Delta E - T\Delta S + P\Delta V = Q - W - T\Delta S + P\Delta V \tag{10.17}$$
In a reversible process we find, using Eq (9.38),
$$-\Delta G = W - P\Delta V \tag{10.18}$$
which is just the "interesting" work done (i.e., with the "uninteresting" $P\Delta V$
subtracted off). That is, a subsystem A that does interesting work does so
precisely by giving up Gibbs free energy ($\Delta G$ is negative, so that $-\Delta G$,
the interesting work, will be positive). When a system reaches its minimum
$G$, by doing all the work it can, its usefulness is over. Since in irreversible
processes $G$ goes down more than it needs to for the given work done, this
amounts to a kind of spilling of free energy, and as usual the most efficient
use of energy is in reversible processes.
In constant $P$ processes for which $\Delta V$ is zero, the change in Gibbs free
energy and the change in Helmholtz free energy agree. Otherwise one should
simply remember that $\Delta E$ contains the term $-P\Delta V$, which may or may not
be worth splitting off as "interesting" or "not interesting." This observation
is essentially the same as in Section 10.3.3.
Problems
Work
10.1 (a) A 600 g spring with k = 200 N/m is compressed by 30 cm and
released. It oscillates and eventually dies down, having dissipated internally
the energy it was given. Assume negligible flow of heat to the surroundings.
By how much does its temperature rise? (Assume a molecular weight of 60 and
the Law of Dulong and Petit.)
(b) Repeat in case the compression was 60 cm.
10.2 (a) A hammer and a nail, each made of steel, fall to the floor from
a height of 3 m. The hammer is 1 kg and the nail is 5 g. How much does
the temperature of each one rise? Assume no friction with the air and no
transfer of heat to the surroundings. Also take the molecular weight to be
60, and assume the Law of Dulong and Petit.
(b) What is an easy way to see that the temperature changes in (a) are
the same?
10.3 How much $P\Delta V$ work is done per mole by a substance that converts
from solid to gas at 0 °C and atmospheric pressure? Ignore the volume of
the solid.
10.4 Suppose a pressure difference $\Delta P = P_{\mathrm{high}} - P_{\mathrm{low}}$ between the ends of
a pipe drives a volume current $I$ according to Eq (8.40), $\Delta P = rI$. Here $r$
is the resistance of the pipe. In a short time $\Delta t$, during which a small volume
$\Delta V = I\Delta t$ moves into the pipe at the upstream end and the same volume
moves out at the downstream end, the work done on the fluid in the pipe is
$W = P_{\mathrm{high}}\Delta V - P_{\mathrm{low}}\Delta V = (\Delta P)I\Delta t$. Thus the work done is proportional to
time $\Delta t$, and the rate at which work is done (called power, SI unit Watts) is
$(\Delta P)I$.
(a) Verify that the SI unit of $(\Delta P)I$ is J/s, or Watts (W).
(b) Show that the rate at which work is done can also be expressed as
$I^2 r$ or $(\Delta P)^2/r$, where $r$ is the pipe resistance.
(c) Use the estimates from Section 8.17 to find the rate at which work is
done (by the heart) to force blood through the human circulatory system.
(d) Compare your result in part (c) with the rate of energy input available
from food, about 2000 Cal/day (express in SI). Is the result reasonable?
10.5 (a) In a reversible constant temperature expansion of a gas, what happens
to its internal energy? Its entropy? Its pressure? Give an explanation
in words in each case.
(b) In a reversible adiabatic expansion of a gas, what happens to its
internal energy? Its entropy? Its pressure? As in (a) give an explanation in
words.
(c) Suppose we are sloppy, and the gas expands adiabatically, but irreversibly.
What happens to its internal energy? Its entropy? Its pressure?
Justify in words!
10.6 The best refrigerator, running in a cycle and moving heat from low
temperature $T_2$ to high temperature $T_1$ with input of work $W$, would extract
the most heat $Q_2$ from the low temperature reservoir for given $W$. Describe
this best possible refrigerator, and find how much heat $Q_2$ it can extract.
Chapter 11
Statistical Physics
Since even a very small sample of matter is made of many atoms or molecules,
it is natural to think of using statistics, that is, averages, to describe some
of its properties. We have already mentioned the statistical theory of molar
specific heat in Section 9.6. In this chapter we look at some other uses of
statistical ideas in physics, including the simple but effective experimental
technique of averaging repeated measurements to reduce error due to randomness
in the measurement process.
11.1 Ideal Solutions as Ideal Gases
The solute molecules in a solution are much like the molecules of a gas. They
are free to move around randomly within certain confines, and the associated
disorder (entropy) is just like the corresponding disorder in a gas. They strike
the walls of their container and thus exert a pressure just like the molecules
of a gas. Albert Einstein was one of the first to make this point.
The meaning of "ideal" in an ideal solution is that the solute molecules do
not interact with each other, the same as the meaning of "ideal" in an ideal
gas. This will be the case if the density of solute molecules (concentration)
is not too large. The pressure $\Pi$ of the solute molecules, called osmotic
pressure, is related to concentration $c$ by, essentially, the ideal gas law,
$$\Pi = cRT \tag{11.1}$$
where $R$ is the gas constant and $T$ is absolute temperature. In SI units,
$c = n/V$ would be expressed in moles/m$^3$, where $n$ is the number of moles of
solute molecules in volume $V$. The ideal gas law in this context is sometimes
called the van 't Hoff relation. If we measure concentration $C = N/V$ in
molecules/m$^3$ with $N$ the number of molecules in volume $V$, so that $C = N_A c$,
recalling the Boltzmann constant $k_B = R/N_A$, we could also write the van 't
Hoff relation as
$$\Pi = C k_B T \tag{11.2}$$
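To get a feeling for the size of osmotic pressures, here is a small Python sketch that evaluates both forms of the relation, Eqs (11.1) and (11.2). The concentration 150 mol/m$^3$ (0.15 mol/L) and the temperature 310 K are illustrative assumptions chosen to be roughly physiological; they are not values given in the text, and the function names are invented for the example.

```python
R = 8.314462618      # gas constant, J/(mol K)
K_B = 1.380649e-23   # Boltzmann constant, J/K
N_A = 6.02214076e23  # Avogadro's number, 1/mol


def osmotic_pressure_molar(c, temperature):
    """Eq (11.1): Pi = c R T, with c in mol/m^3 and T in K. Returns Pa."""
    return c * R * temperature


def osmotic_pressure_number(big_c, temperature):
    """Eq (11.2): Pi = C k_B T, with C in molecules/m^3. Returns Pa."""
    return big_c * K_B * temperature


c = 150.0  # mol/m^3, an assumed illustrative concentration
T = 310.0  # K, roughly body temperature
pi_molar = osmotic_pressure_molar(c, T)
pi_number = osmotic_pressure_number(N_A * c, T)
print(pi_molar, pi_number)  # both in pascals
```

The two forms agree, as they must, and the result is a few times atmospheric pressure (about $10^5$ Pa), which gives some idea of why an unbalanced osmotic pressure matters to a cell.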
One very interesting analogy with the ideal gas is the case of a solution
with a semi-permeable membrane boundary. Imagine that the solute is in
aqueous solution, and that the container is permeable to water but not to
the solute molecules (this is the meaning of semi-permeable). It might be a
red blood cell, for example, with Na$^+$ and Cl$^-$ ions inside. These ions exert
an osmotic pressure on the cell membrane. If there is the same concentration
outside, then there is a pressure outside to balance the pressure inside (a so-called
isotonic medium), but if it is pure water outside, the semi-permeable
membrane admits water, diluting the solution inside, the cell swells up, and
the osmotic pressure of Na$^+$, say, does isothermal work
$$P\Delta V = N k_B T \ln\left(\frac{C_i}{C_f}\right) \tag{11.3}$$
where $C_i$ and $C_f$ are the initial and final concentrations of Na$^+$, and similarly
for other solute ions, by exactly the arguments that led to Eq (10.4). In fact,
the osmotic pressure in this case is enough to swell the membrane into a
sphere and rupture it, so that $C_f$ would have to refer to some moment before
this happens.
The solute molecules in a solution are different from the molecules of an
ideal gas in that they interact with solvent molecules. There is no analog
of the solvent for a gas. This means that in a solvent, call it solvent 1,
there is some average energy of interaction $E_1$ per solute molecule, and in
a second solvent there could be a different average energy of interaction $E_2$.
Suppose the two solvents are separated by a semi-permeable membrane which
is impermeable to the solvent molecules, but allows the solute molecules to
go through. How will they partition themselves? One might think that
if $E_1 < E_2$ then any molecules in solvent 2 have excess energy that they
could give up by going to solvent 1, thus lowering their energy until they
can lose no more, finding themselves then at equilibrium. This, however,
would be forgetting what equilibrium is for a system with many degrees of
freedom. In fact, at constant temperature, the system minimizes not its
energy, but its Helmholtz free energy at equilibrium, and this involves not
just the energy, but also the entropy. By the argument that led to Eq (10.5)
we know that changing the concentration from $C_1$ to $C_2$ at temperature $T$
implies an entropy change of
$$\Delta S = N k_B \ln\left(\frac{C_1}{C_2}\right) \tag{11.4}$$
Thus, dividing by $N$ and recalling Eq (10.15), we find the change in Helmholtz
free energy per molecule in going from solvent 1 to solvent 2 is
$$\Delta F = E_2 - E_1 - k_B T \ln\left(\frac{C_1}{C_2}\right) \tag{11.5}$$
If this is different from zero, then solute molecules can lower the free energy
of the system, i.e., produce $\Delta F < 0$, by going from one solvent to the other.
At equilibrium no further lowering of $F$ is possible, and thus the expression
above must be zero, so that the equilibrium concentrations obey
$$k_B T \ln\left(\frac{C_1}{C_2}\right) = E_2 - E_1 \tag{11.6}$$
We check common sense: if $E_1 < E_2$, so that we might naively expect all
solute molecules to go to solvent 1, we find $E_2 - E_1 > 0$, so that $C_1 > C_2$.
That is, the solute molecules do tend to go to the lower energy environment,
but not entirely. The entropy cost of putting all the solute molecules in one
solvent is too high (the system is too ordered), and the actual equilibrium is
a kind of compromise.
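The compromise can be made concrete with a short Python sketch based on Eqs (11.5) and (11.6). The energy difference used here, $E_2 - E_1 = 2k_BT$, is a purely hypothetical value chosen for illustration, as are the function names.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def concentration_ratio(e1, e2, temperature):
    """Equilibrium C1/C2 from Eq (11.6): k_B T ln(C1/C2) = E2 - E1."""
    return math.exp((e2 - e1) / (K_B * temperature))


def free_energy_change(e1, e2, c1, c2, temperature):
    """Eq (11.5): change in Helmholtz free energy per molecule moved
    from solvent 1 (concentration c1) to solvent 2 (concentration c2)."""
    return e2 - e1 - K_B * temperature * math.log(c1 / c2)


T = 300.0           # K
e1 = 0.0            # J, energy per solute molecule in solvent 1 (reference)
e2 = 2.0 * K_B * T  # J, hypothetical: solvent 2 costs 2 k_B T more per molecule

ratio = concentration_ratio(e1, e2, T)
print("equilibrium C1/C2:", ratio)
print("Delta F at equilibrium:", free_energy_change(e1, e2, ratio, 1.0, T))
print("Delta F at equal concentrations:", free_energy_change(e1, e2, 1.0, 1.0, T))
```

Even with solvent 1 favored by $2k_BT$ per molecule, the equilibrium ratio is only about $e^2 \approx 7.4$, so a substantial fraction of the solute remains in the higher energy solvent. At equal concentrations $\Delta F$ is positive for a move from 1 to 2, so molecules flow the other way until the ratio is reached.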
The strangest thing about the entropy term, perhaps, is that the entropy
change per molecule,
$$\Delta S = k_B \ln\left(\frac{C_1}{C_2}\right) \tag{11.7}$$
depends on the concentrations $C_1$ and $C_2$. It is as if the entropy change for
adding a molecule to a solution at concentration $C$ were
$$\Delta S = -k_B \ln C \tag{11.8}$$
Then removing a molecule from concentration $C_1$ and adding one to concentration
$C_2$ gives Eq (11.7).
The work you can get from the system per molecule depends on the way
the molecules are apportioned, even though they do not interact with each
other! In particular they do not exert forces on each other. The contribution
of the entropy to the free energy is a little bit like the effect of a repulsive
force between the molecules proportional to temperature, but that is not at
all what is actually going on. There is no force between the molecules in
this model of an ideal solution, but they behave a bit as if there were. The
statistical tendency of the molecules to spread out suggests a repulsive force
that makes them do that, even though there is no such thing.
11.2 Statistical Mechanics
The observations of the previous section have far-reaching application throughout
biology, chemistry, and physics more generally. If we solve for the ratio
of concentrations in Eq (11.6) we find
$$\frac{C_1}{C_2} = e^{-\Delta E/k_B T} \tag{11.9}$$
where $\Delta E = E_1 - E_2$ is the difference in energy, per solute molecule, between
one environment and the other. Now we stop and interpret this expression.
The concentrations can be thought of as proportional to the probabilities
that a molecule in thermal equilibrium will occupy space in one solution
or the other, as if it were a matter of chance where a molecule finds itself.
Higher probability in one region would mean higher concentration in that
region. Thus we can think of Eq (11.9) as telling us the relative probability
of being in region 1 or region 2. It is as if the probability $P$ of being in a
given volume in a region depended on the energy $E$ the molecule would have
there according to
$$P \propto e^{-E/k_B T} \tag{11.10}$$
Then
$$\frac{C_1}{C_2} = \frac{P_1}{P_2} = \frac{e^{-E_1/k_B T}}{e^{-E_2/k_B T}} = e^{-(E_1 - E_2)/k_B T} \tag{11.11}$$
correctly reproduces Eq (11.6). This description of the behavior of molecules
in terms of probabilities given by Eq (11.10) is called equilibrium statistical
mechanics, and Eq (11.10) is called the Maxwell-Boltzmann distribution.
In deriving Eq (11.10) we thought about a molecule that could be on one
side or the other of a semipermeable membrane, but the result turns out
to be much more general than that. Eq (11.10) describes the equilibrium
probability distribution of molecules that have available to them any states
at all characterized by energy $E$. For example, molecules in the air have
available to them states at different heights, and these are characterized by
gravitational potential energy $mgh$, where $m$ is the mass of the molecule. If
we take $h = 0$ to be ground level, and assume the atmosphere is in thermal
equilibrium at temperature $T$, the probability of a molecule's being at height
$h$ is less than the probability of being at $h = 0$ by the factor $e^{-mgh/k_B T}$.
Thus the density of the atmosphere, and hence the pressure, is also less by
this factor. This prediction, that the pressure falls off exponentially with
height, is called the Law of Atmospheres. It isn't quite right, because
the atmosphere is not really in thermal equilibrium, but it is a reasonable
approximation. This is just one example of a quick insight from statistical
mechanics. It tells us why the atmosphere doesn't fall to the ground!
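The Law of Atmospheres is easy to explore numerically. The Python sketch below evaluates the factor $e^{-mgh/k_BT}$ and the height $k_BT/mg$ at which it has fallen to $1/e$. It assumes, purely for illustration, that the air is nitrogen (N$_2$, about 28 atomic mass units) at a uniform 300 K; neither value comes from the text, and the function names are invented for the example.

```python
import math

K_B = 1.380649e-23    # Boltzmann constant, J/K
G = 9.81              # gravitational acceleration, m/s^2
AMU = 1.66053907e-27  # atomic mass unit, kg


def boltzmann_height_factor(mass, height, temperature):
    """Factor e^{-mgh/(k_B T)} by which density and pressure are reduced at height h."""
    return math.exp(-mass * G * height / (K_B * temperature))


def scale_height(mass, temperature):
    """Height k_B T / (m g) at which the factor has dropped to 1/e."""
    return K_B * temperature / (mass * G)


m_n2 = 28.0 * AMU  # approximate mass of one N2 molecule (assumed composition)
T = 300.0          # K, assumed uniform, which the real atmosphere is not

print("scale height (m):", scale_height(m_n2, T))
print("factor at 8848 m:", boltzmann_height_factor(m_n2, 8848.0, T))
```

The scale height comes out at roughly 9 km, so on this simple model the pressure at the top of a mountain nearly 9 km high would be a bit more than a third of its sea level value.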
11.3 Randomness
A most intriguing thing about measurement uncertainty is that it often obeys
a mathematical law, in spite of being uncertain. The law is statistical, a
statement about average values. Suppose we measure something over and
over, getting the values $X_1, X_2, \ldots, X_N$. Even though we meant to be
measuring the same thing again and again, the values are all different. In
a situation like this, where we really don't know the true value, because of
measurement uncertainty, we usually find an average value for this quantity
$X$, denoted with angle brackets $\langle\ \rangle$,
$$\langle X \rangle = \frac{X_1 + X_2 + \cdots + X_N}{N} \tag{11.12}$$
i.e., we add up all the values, and divide by the number of values, the usual
sense of average, or mean. This is the sample mean of $X$, because it has been
obtained by sampling, i.e., repeated measurements.
Now we try to model this uncertainty mathematically by assuming that
for each trial the measured values were a little bit off from the true value
$X_T$, namely
$$X_i = X_T + e_i \quad \text{where } i = 1, 2, \ldots, N \tag{11.13}$$
Here $e_i = X_i - X_T$ is the little random error in the $i$th measurement. Then
the sample mean of all the $X_i$ is (more or less repeating ourselves)
$$\langle X \rangle = \frac{(X_T + e_1) + (X_T + e_2) + \cdots + (X_T + e_N)}{N} \tag{11.14}$$
$$= X_T + \frac{e_1 + e_2 + \cdots + e_N}{N} \tag{11.15}$$
Thus the sample mean is the true value $X_T$, with a little bit of wrapping
added, just as Galileo said. In what follows we will call the second term in
(11.15) the error.
Now why would we go to the trouble of making many measurements and
averaging? The result $\langle X \rangle$ still has error, much the way each individual
measurement $X_i$ had its error $e_i$. What has been gained? Averaging is only
worth doing if the error in $\langle X \rangle$ is somehow less than it is in an individual
measurement $X_i$, that is, if the error has gone down. This could happen
if the individual $e_i$'s were sometimes positive and sometimes negative, so
that in adding them up there is considerable cancellation. The whole point
of averaging is that we assume that this will happen, and that the sample
mean $\langle X \rangle$ will be less uncertain than any individual $X_i$.
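A quick computer experiment makes this assumption plausible. The Python sketch below simulates many repetitions of the whole procedure and compares the typical (root mean square) error of a single measurement with that of a sample mean. The Gaussian error model, the true value 10, the error size 1, and the choice of 25 measurements per average are all assumptions made for the illustration; the text does not specify any of them.

```python
import math
import random


def rms_errors(true_value, sigma, n_per_mean, n_trials, seed=0):
    """Return (RMS error of one measurement, RMS error of the sample mean).

    Each simulated measurement is true_value plus a Gaussian random error of
    standard deviation sigma (an assumed error model).
    """
    rng = random.Random(seed)  # fixed seed so the experiment is repeatable
    sq_single = 0.0
    sq_mean = 0.0
    for _ in range(n_trials):
        xs = [true_value + rng.gauss(0.0, sigma) for _ in range(n_per_mean)]
        sample_mean = sum(xs) / n_per_mean
        sq_single += (xs[0] - true_value) ** 2      # squared error of one measurement
        sq_mean += (sample_mean - true_value) ** 2  # squared error of the average
    return math.sqrt(sq_single / n_trials), math.sqrt(sq_mean / n_trials)


rms_single, rms_mean = rms_errors(true_value=10.0, sigma=1.0, n_per_mean=25, n_trials=4000)
print("typical error of one measurement:", rms_single)
print("typical error of the sample mean:", rms_mean)
```

In this simulation the error of the average is several times smaller than the error of a single measurement, which is the gain that makes averaging worthwhile.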
This situation is complicated, and hard to think about. To simplify it
we introduce a mathematical operation called expected value, denoted $E$.
Expected value is like an average, but it is not found by sampling, adding, and
then dividing by the number of samples. It is a purely theoretical operation, a
kind of ideal average, simpler than averaging. It is a mathematical model of
averaging, intended to model the unattainable case of sampling and averaging
an infinite number of times. Instead of doing that, which would be impossible,
we just use $E$. Here is what $E$ gives for error terms involving $e_1$ and $e_2$:
$$E[e_1] = E[e_2] = 0 \tag{11.16}$$
$$E[e_1^2] = E[e_2^2] = \sigma^2$$
$$E[e_1 e_2] = 0$$
In particular, the expected value of $e_1$ is zero, because it is sometimes positive
and sometimes negative, and the expected value of $e_1^2$, which is always
11.3. RANDOMNESS 367
positive, and so cant average to zero, is a number
2
which tells us the ex-
pected size of the error, just the quantity a good experimentalist is always
keeping in mind. Thus
2
is a denite number that we know, at least ap-
proximately, if we are doing our job. The typical size of e
1
itself would be
the square root of
2
, namely (sigma), called the root mean square value
of the error e
1
. The product e
1
e
2
has expected value 0 because it is positive
if e
1
and e
2
have the same sign, and negative if they have dierent sign, and
neither possibility is favored, so that the product might well average, in an
ideal sense, to 0. Finally, we imagine that all the uncertainties behave in the
same way, because the measurement process is the same each time. Thus
E[e
2
5
] =
2
, E[e
7
] = 0, E[e
3
e
7
] = 0, etc.
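The rules in Eq (11.16) can be made plausible with a small numerical experiment. The sketch below is only an illustration, not part of the model itself: it replaces the ideal operation $E$ by an ordinary average over a large but finite number of trials, and it assumes that the errors are independent Gaussian random numbers with an arbitrarily chosen $\sigma = 0.5$. The function name and parameters are made up for this example.

```python
import random

def estimate_expected_values(sigma=0.5, trials=200_000, seed=1):
    """Approximate E[e1], E[e1^2] and E[e1*e2] by plain averaging.

    Assumption (not from the text): e1 and e2 are independent Gaussian
    random numbers with mean 0 and standard deviation sigma.
    """
    rng = random.Random(seed)
    sum_e1 = 0.0
    sum_e1_sq = 0.0
    sum_e1_e2 = 0.0
    for _ in range(trials):
        e1 = rng.gauss(0.0, sigma)
        e2 = rng.gauss(0.0, sigma)
        sum_e1 += e1
        sum_e1_sq += e1 * e1
        sum_e1_e2 += e1 * e2
    return sum_e1 / trials, sum_e1_sq / trials, sum_e1_e2 / trials

mean_e1, mean_e1_sq, mean_e1_e2 = estimate_expected_values()
print(f"average of e1    = {mean_e1:+.4f}  (ideal value 0)")
print(f"average of e1^2  = {mean_e1_sq:.4f}  (ideal value sigma^2 = 0.25)")
print(f"average of e1*e2 = {mean_e1_e2:+.4f}  (ideal value 0)")
```

The finite averages come out close to, but not exactly equal to, the ideal values. That small leftover discrepancy is what $E$ idealizes away.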
With the operation $E$ we can assess the error term in the sample mean
$\langle X \rangle$. Applying it to the error in Eq (11.15) we find
$$E\left[\frac{e_1 + e_2 + \cdots + e_N}{N}\right] = \frac{E[e_1] + E[e_2] + \cdots + E[e_N]}{N} = 0 \tag{11.17}$$
This does not mean that the error term is zero, or that the sample mean
$\langle X \rangle$ will give us the true value $X_T$, but rather that the
uncertainty term,
or error term, is as often positive as negative, so that an ideal average of it
would be zero. That is what E computes. (Note a property of E, coming
from the idea of average: the average of a sum is the sum of the averages,
and constants like 1/N can multiply outside the average.)
To find the typical size of the error, we should really look at the
expected square error, averaging quantities that are positive, so that
cancellation doesn't conceal what is going on. Thus we compute
$$E\left[\left(\frac{e_1 + e_2 + \cdots + e_N}{N}\right)^2\right] \tag{11.18}$$
Squaring a sum like that is a bit of a mess, so we do it below in case N = 3.
It will be clear how to generalize to any N.
$$\left(\frac{e_1 + e_2 + e_3}{3}\right)^2 = \frac{e_1^2 + e_1 e_2 + e_1 e_3 + e_2 e_1 + e_2^2 + e_2 e_3 + e_3 e_1 + e_3 e_2 + e_3^2}{9} \tag{11.19}$$
Now apply $E$. Terms like $E[e_1^2]$ give $\sigma^2$, and terms like
$E[e_1 e_2]$ give zero, according to Eq (11.16). There are only 3 nonzero
terms, so
$$E\left[\left(\frac{e_1 + e_2 + e_3}{3}\right)^2\right] = \frac{3\sigma^2}{9} = \frac{\sigma^2}{3} \tag{11.20}$$
More generally, in a sample mean with $N$ samples, there would be $N$
nonzero terms, and the expectation value of the square of the error term would
be
$$E\left[\left(\frac{e_1 + e_2 + \cdots + e_N}{N}\right)^2\right] = \frac{N\sigma^2}{N^2} = \frac{\sigma^2}{N} \tag{11.21}$$
Then the root mean square error in $\langle X \rangle$ is the square root of
this, namely $\sigma/\sqrt{N}$, and as $N$ gets large, this does indeed get
smaller, because $\sqrt{N}$ is in the denominator. This is why it is worth it
to average many samples to get a sample mean. The error does indeed go down.
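The $\sigma/\sqrt{N}$ result can be checked by brute force. The following sketch repeats a pretend experiment many times, each time averaging $N$ simulated measurements, and compares the root mean square error of the sample mean with $\sigma/\sqrt{N}$. The true value 10, the choice $\sigma = 1$, the Gaussian errors, and the function name are all assumptions made for illustration.

```python
import math
import random

def rms_error_of_mean(n, sigma=1.0, x_true=10.0, experiments=5000, seed=2):
    """RMS error of the sample mean of n measurements.

    Assumption (not from the text): each measurement is x_true plus an
    independent Gaussian error with standard deviation sigma.
    """
    rng = random.Random(seed)
    total_sq_error = 0.0
    for _ in range(experiments):
        # One experiment: take n measurements and average them.
        measurements = [x_true + rng.gauss(0.0, sigma) for _ in range(n)]
        sample_mean = sum(measurements) / n
        total_sq_error += (sample_mean - x_true) ** 2
    return math.sqrt(total_sq_error / experiments)

for n in (1, 4, 16, 64):
    simulated = rms_error_of_mean(n)
    predicted = 1.0 / math.sqrt(n)  # sigma / sqrt(N) with sigma = 1
    print(f"N = {n:2d}: simulated {simulated:.3f}, predicted {predicted:.3f}")
```

Quadrupling the number of measurements should roughly halve the error in the mean, which is why improving precision by averaging alone becomes expensive quickly.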
11.4 Brownian Motion
Brownian motion is a very concrete example of adding random quantities
together. Since it is described by exactly the mathematical model of the
preceding section, it is possible to say a word about it here. Recall from
Section 1.4 that Brownian motion is the random jittering motion of a small
particle suspended in a liquid. We model the situation by a series of random
steps. Let us just think about the motion projected along one direction,
so that it is one-dimensional: for example, we think about how the particle is
going left and right in the microscope field, ignoring how it may also be going
up and down. The description would be the same in the other direction.
We call the random steps $e_1$, $e_2$, etc., and imagine that the expected
values are those in (11.16). In particular, since $E[e_1] = 0$, the step is as
likely to go left as right. If we call the position of the Brownian particle
$X$, then starting at $X = 0$ and adding up steps, after $N$ steps we have
$$X = e_1 + e_2 + \cdots + e_N \tag{11.22}$$
Applying the expected value $E$ we find, by the same arguments as before,
$$E[X] = 0 \tag{11.23}$$
$$E[X^2] = N\sigma^2 \tag{11.24}$$
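For one very simple kind of step these two results can be verified exactly, with no sampling at all. The sketch below assumes (this is an illustrative choice, not something required by the model) that every step is either $+s$ or $-s$ with equal likelihood, so that $\sigma = s$. It lists every possible sequence of $N$ steps and takes the ideal average over all of them, which is what $E$ means in this case.

```python
from itertools import product

def exact_moments(n_steps, step=1.0):
    """Return (E[X], E[X^2]) for a walk of n_steps steps.

    Assumption (not from the text): each step is +step or -step, equally
    likely, so all 2**n_steps step sequences have the same weight.
    """
    sequences = list(product((+step, -step), repeat=n_steps))
    positions = [sum(sequence) for sequence in sequences]
    mean_x = sum(positions) / len(positions)
    mean_x_sq = sum(x * x for x in positions) / len(positions)
    return mean_x, mean_x_sq

for n in (1, 2, 5, 10):
    mean_x, mean_x_sq = exact_moments(n)
    print(f"N = {n:2d}: E[X] = {mean_x:.1f}, E[X^2] = {mean_x_sq:.1f}")
```

With `step=1.0` the second number printed on each line is just $N$, in agreement with Eq (11.24) for $\sigma = 1$.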
If each step takes some average time $\tau$ (tau), then the total time to take
$N$ steps is $t = N\tau$, so that $N = t/\tau$. The sample mean should approach
the expected value if we take enough samples, so we expect an experiment
to find
$$\langle X^2 \rangle = \frac{t\sigma^2}{\tau} = 2Dt \tag{11.25}$$
where $2D = \sigma^2/\tau$ is a constant. That is, the mean squared
displacement is proportional to time. This peculiar proportionality law is
actually obeyed by Brownian particles, making it clear that their motion really
should be considered random. From a plot of $\langle X^2 \rangle$ vs $t$ you
could determine the unknown constant $D$ (called the diffusion constant) from
the slope, which is $2D$.
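The procedure of reading $D$ off a plot can be tried out on simulated data. The sketch below follows many imaginary particles, records the average of $X^2$ after each step, fits a straight line through the origin, and takes half the slope as the estimate of $D$. The step time, the step size, the Gaussian steps, and the function names are hypothetical choices for this illustration, not measured values.

```python
import random

def mean_square_displacement(n_steps, sigma=1.0, particles=2000, seed=3):
    """Return a list whose k-th entry is the average of X^2 after k steps.

    Assumption (not from the text): each step is an independent Gaussian
    random number with standard deviation sigma.
    """
    rng = random.Random(seed)
    totals = [0.0] * (n_steps + 1)
    for _ in range(particles):
        x = 0.0  # every particle starts at X = 0
        for k in range(1, n_steps + 1):
            x += rng.gauss(0.0, sigma)
            totals[k] += x * x
    return [total / particles for total in totals]

def slope_through_origin(ts, ys):
    """Least-squares slope of the line y = slope * t through the origin."""
    return sum(t * y for t, y in zip(ts, ys)) / sum(t * t for t in ts)

tau = 0.01    # hypothetical time per step, in seconds
sigma = 1.0   # hypothetical RMS step size, in micrometers
n_steps = 200

msd = mean_square_displacement(n_steps, sigma=sigma)
times = [k * tau for k in range(n_steps + 1)]
d_estimate = slope_through_origin(times, msd) / 2  # slope of <X^2> vs t is 2D
d_model = sigma**2 / (2 * tau)                     # from 2D = sigma^2 / tau

print(f"D from the slope:    {d_estimate:.1f} micrometers^2 per second")
print(f"D from sigma^2/2tau: {d_model:.1f} micrometers^2 per second")
```

The two numbers agree only approximately, because a finite number of particles gives a sample mean rather than the ideal expected value.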
Fig. 11.1: A typical Brownian path, a series of random steps
Problems
11.1 (a) Use the Maxwell-Boltzmann distribution to find the scale height
of the atmosphere, assuming thermal equilibrium at temperature
$T = 0\,^\circ\mathrm{C}$. This is the height at which the pressure is lower
than $P_{\rm atm}$ by the factor $1/e$. Is the value reasonable?
(b) Say in words why the atmosphere doesn't simply fall to the ground
because of its weight.
11.2 Molecules of NaCl in water can dissociate into Na$^+$ and Cl$^-$. As a
result, all three species will be present, with concentrations we denote by
[NaCl], [Na$^+$], and [Cl$^-$]. The two atoms in the molecule have one energy,
$E_1$, when they are bound together as NaCl in an aqueous environment, and
a different energy $E_2$ when they are dissociated in the aqueous environment.
In going from bound to dissociated, the energy change is
$\Delta E = E_2 - E_1$.
(a) Use Eq (11.8) to find the change in entropy in going from bound to
dissociated.
(b) In thermal equilibrium, the concentrations must be such that
$\Delta F = \Delta E - T\Delta S$ is zero, since otherwise the system could
lower its free energy. What does this imply about the equilibrium
concentrations?
(c) $\Delta E$ is quite negative in this situation, i.e., the molecule lowers
its energy by dissociating. What does this imply about the equilibrium
concentrations?
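Useful Values
We collect here useful natural constants and values.

Symbol            Meaning                           Value                                                                   First Appearance
AU                astronomical unit                 $1.50 \times 10^{11}\ \mathrm{m}$                                       4.1.4
$B_{\rm Earth}$   Earth's magnetic field            $0.5\ \mathrm{G} = 5 \times 10^{-5}\ \mathrm{T}$                        17.2
$c$               speed of light                    $3.00 \times 10^{8}\ \mathrm{m/s}$                                      2.7
$c_{\rm water}$   specific heat of water            $1\ \mathrm{cal/(g\,K)} = 1\ \mathrm{Cal/(kg\,K)}$                      9.5
cal               calorie                           $4.18\ \mathrm{J}$                                                      9.5
$e$               elementary charge                 $1.60 \times 10^{-19}\ \mathrm{C}$                                      14.3
eV                electron volt                     $1.60 \times 10^{-19}\ \mathrm{J}$                                      14.3
$\epsilon_0$      permittivity of vacuum            $8.854 \times 10^{-12}\ \mathrm{F/m}$                                   14.4.1
$\eta_{\rm water}$  viscosity of water              $10^{-3}\ \mathrm{Pa\,s}$                                               8.12
$G$               Newton's gravitational constant   $6.67 \times 10^{-11}\ \mathrm{N\,m^2/kg^2}$                            6.6
$g$               acceleration due to gravity       $9.8\ \mathrm{m/s^2}$                                                   1.1
$h$               Planck's constant                 $6.63 \times 10^{-34}\ \mathrm{J\,s}$                                   6.7
$\hbar$           h-bar                             $h/2\pi = 1.05 \times 10^{-34}\ \mathrm{J\,s}$                          6.7
in                inch                              $2.54\ \mathrm{cm}$
$j_{\rm solar}$   solar energy current density      $1\ \mathrm{kW/m^2}$                                                    12.10
$k$               Coulomb's k                       $9.0 \times 10^{9}\ \mathrm{N\,m^2/C^2}$                                14.5
$k_B$             Boltzmann's constant              $R/N_A = 1.38 \times 10^{-23}\ \mathrm{J/K}$                            9.4
lb                pound weight                      $4.45\ \mathrm{N}$                                                      5.4
mi                mile                              $1.6\ \mathrm{km}$
$m_e$             mass of electron                  $0.000549\ \mathrm{u} = 9.11 \times 10^{-31}\ \mathrm{kg}$              20.2
$m_p$             mass of proton                    $1.00726\ \mathrm{u} = 1.67 \times 10^{-27}\ \mathrm{kg}$               20.2
$m_n$             mass of neutron                   $1.008665\ \mathrm{u} = 1.67 \times 10^{-27}\ \mathrm{kg}$              20.2
$\mu_0$           permeability of vacuum            $4\pi \times 10^{-7}\ \mathrm{H/m}$                                     17.9
$N_A$             Avogadro's number                 $6.022 \times 10^{23}$                                                  9.4
$P_{\rm atm}$     atmospheric pressure              $10^{5}\ \mathrm{N/m^2} = 760\ \mathrm{mm\ Hg}$                         8.7
$R$               gas constant                      $8.31\ \mathrm{J/(K\,mole)} = 2\ \mathrm{cal/(K\,mole)}$                9.4
$R_E$             radius of the Earth               $6.37 \times 10^{6}\ \mathrm{m}$                                        4.10
$\rho_{\rm water}$  density of water                $1\ \mathrm{g/cm^3} = 10^{3}\ \mathrm{kg/m^3}$                          8.1
$\sigma$          Stefan-Boltzmann constant         $5.67 \times 10^{-8}\ \mathrm{W/(m^2\,K^4)}$                            18.9
u                 atomic mass unit                  $1.66 \times 10^{-27}\ \mathrm{kg} = 931.44\ \mathrm{MeV}/c^2$          20.2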