
THE AMERICAN MATHEMATICAL MONTHLY

VOLUME 124, NO. 2 FEBRUARY 2017

The Image of a Square 99


Annalisa Crannell, Marc Frantz, and Fumiko Futamura

Experimental Math for Math Monthly Problems 116


Allen Stenger

Characterizing Additive Systems 132


Michael Maltenfort

Rotating Multiple Sets of Labeled Points to Bring Them


Into Close Coincidence: A Generalized Wahba Problem 149
Bisharah Libbus, Gordon Simons, and Yi-Ching Yao

NOTES
Generalized Infinite Products for Powers of e^{1/k} 161
Scott Ginebaugh

Sums of Quadratic Residues and Nonresidues 166


Christian Aebi and Grant Cairns

A Note on Average of Roots of Unity 170


Chatchawan Panraksa and Pornrat Ruengrot

Maximal Area of Equilateral Small Polygons 175


Charles Audet

PROBLEMS AND SOLUTIONS 179

REVIEWS
The Philosophies of Mathematics 188
by Alan Baker
MATHBIT
169, Another Proof That There Are Infinitely Many Primes

An Official Publication of the Mathematical Association of America



EDITOR
Susan Jane Colley, Oberlin College

NOTES EDITOR
Vadim Ponomarenko, San Diego State University

REVIEWS EDITOR
Jason Rosenhouse, James Madison University

PROBLEM SECTION EDITORS
Gerald A. Edgar, Ohio State University
Daniel H. Ullman, George Washington University
Douglas B. West, Zhejiang Normal University and University of Illinois

ASSOCIATE EDITORS
David Aldous, University of California, Berkeley
Elizabeth S. Allman, University of Alaska Fairbanks
David H. Bailey, University of California, Davis
Scott T. Chapman, Sam Houston State University
Allan Donsig, University of Nebraska-Lincoln
Michael Dorff, Brigham Young University
John Ewing, Math for America
Stephan Ramon Garcia, Pomona College
Luis David Garcia Puente, Sam Houston State University
Sidney Graham, Central Michigan University
J. Roberto Hasfura-Buenaga, Trinity University
Michael Henle, Oberlin College
Tara Holm, Cornell University
Lea Jenkins, Clemson University
Gary Kennedy, Ohio State University, Mansfield
Chawne Kimber, Lafayette College
Daniel Krashen, University of Georgia
Jeffrey Lawson, Western Carolina University
Susan Loepp, Williams College
Jeffrey Nunemacher, Ohio Wesleyan University
Bruce P. Palka, National Science Foundation
Paul Pollack, University of Georgia
Adriana Salerno, Bates College
Edward Scheinerman, Johns Hopkins University
Anne V. Shepler, University of North Texas
Frank Sottile, Texas A&M University
Susan G. Staples, Texas Christian University
Sergei Tabachnikov, Pennsylvania State University
Daniel Velleman, Amherst College
Cynthia Vinzant, North Carolina State University
Steven H. Weintraub, Lehigh University
Kevin Woods, Oberlin College

MANAGING EDITOR
Bonnie K. Ponce

ELECTRONIC PRODUCTION AND PUBLISHING MANAGER
Beverly Joy Ruedi
NOTICE TO AUTHORS

The MONTHLY publishes articles, as well as notes and other features, about mathematics
and the profession. Its readers span a broad spectrum of mathematical interests, and
include professional mathematicians as well as students of mathematics at all collegiate
levels. Authors are invited to submit articles and notes that bring interesting
mathematical ideas to a wide audience of MONTHLY readers.

The MONTHLY’s readers expect a high standard of exposition; they expect articles to
inform, stimulate, challenge, enlighten, and even entertain. MONTHLY articles are meant
to be read, enjoyed, and discussed, rather than just archived. Articles may be
expositions of old or new results, historical or biographical essays, speculations or
definitive treatments, broad developments, or explorations of a single application.
Novelty and generality are far less important than clarity of exposition and broad
appeal. Appropriate figures, diagrams, and photographs are encouraged.

Notes are short, sharply focused, and possibly informal. They are often gems that
provide a new proof of an old theorem, a novel presentation of a familiar theme, or a
lively discussion of a single issue.

Submission of articles, notes, and filler pieces is required via the MONTHLY’s Editorial
Manager System. Initial submissions in pdf or LaTeX form can be sent to Editor Susan
Jane Colley at www.editorialmanager.com/monthly.

The Editorial Manager System will cue the author for all required information concerning
the paper. The MONTHLY has instituted a double-blind refereeing policy. Manuscripts that
contain the author’s names will be returned. Questions concerning submission of papers
can be addressed to the Editor-Elect at monthly@maa.org. Authors who use LaTeX can find
our article/note template at www.maa.org/monthly.html. This template requires the style
file maa-monthly.sty, which can also be downloaded from the same webpage. A formatting
document for MONTHLY references can be found there too.

Letters to the Editor on any topic are invited. Comments, criticisms, and suggestions
for making the MONTHLY more lively, entertaining, and informative can be forwarded to
the Editor at monthly@maa.org.

The online MONTHLY archive at www.jstor.org is a valuable resource for both authors and
readers; it may be searched online in a variety of ways for any specified keyword(s).
MAA members whose institutions do not provide JSTOR access may obtain individual access
for a modest annual fee; call 800-331-1622 for more information.

See the MONTHLY section of MAA Online for current information such as contents of issues
and descriptive summaries of forthcoming articles: www.maa.org/monthly.html.

Proposed problems and solutions may be submitted to Problem Editor Daniel Ullman online
via https://americanmathematicalmonthly.submittable.com/submit.

Questions but not submissions may be addressed to monthlyproblems@maa.org.

Advertising correspondence should be sent to:
MAA Advertising
1529 Eighteenth St. NW
Washington DC 20036
Phone: (202) 319-8461
E-mail: advertising@maa.org

Further advertising information can be found online at www.maa.org.

Change of address, missing issue inquiries, and other subscription correspondence can
be sent to:
maaservice@maa.org.
or
The MAA Customer Service Center
P.O. Box 91112
Washington, DC 20090-1112
(800) 331-1622
(301) 617-7800

Recent copies of the MONTHLY are available for purchase through the MAA Service Center
at the address above.

Microfilm Editions are available at: University Microfilms International, Serial Bid
coordinator, 300 North Zeeb Road, Ann Arbor, MI 48106.

The AMERICAN MATHEMATICAL MONTHLY (ISSN 0002-9890) is published monthly except bimonthly
June-July and August-September by the Mathematical Association of America at 1529
Eighteenth Street, NW, Washington, DC 20036 and Lancaster, PA, and copyrighted by the
Mathematical Association of America (Incorporated), 2017, including rights to this
journal issue as a whole and, except where otherwise noted, rights to each individual
contribution. Permission to make copies of individual articles, in paper or electronic
form, including posting on personal and class web pages, for educational and scientific
use is granted without fee provided that copies are not made or distributed for profit
or commercial advantage and that copies bear the following copyright notice: [Copyright
2017 Mathematical Association of America. All rights reserved.] Abstracting, with
credit, is permitted. To copy otherwise, or to republish, requires specific permission
of the MAA’s Director of Publications and possibly a fee. Periodicals postage paid at
Washington, DC, and additional mailing offices. Postmaster: Send address changes to the
American Mathematical Monthly, Membership/Subscription Department, MAA, 1529 Eighteenth
Street, NW, Washington, DC 20036-1385.

The Image of a Square
Annalisa Crannell, Marc Frantz, and Fumiko Futamura

Abstract. Every quadrangle is the perspective image of a square. We illustrate this statement
by using perspective art techniques and by analogy to the visualization of conic sections.
We also give examples of how understanding perspective images of squares can be applied
fruitfully in the areas of photogrammetry (determining true relative sizes of real-world objects
from a photograph) and linear algebra (more specifically, in the decomposition of projective
transformations).

1. INTRODUCTION. What looks like a square? Which geometric shapes are the
images of squares? Brook Taylor—of Taylor Series fame—illustrated literally the cen-
trality of squares to perspective artists in the first drawing of his New Principles of
Linear Perspective, published in 1719 [17]. Taylor was both an accomplished mathe-
matician and a skillful landscape painter. New Principles brought Taylor’s interest in
mathematics and drawing together, noting in the preface that the subject of perspec-
tive “. . . has still been left in so low a degree of Perfection, as it is found to be, in the
Books that have been hitherto wrote upon it.” His book introduced, among other things,
the usefulness of a “vanishing line” (a generalization of the more familiar “vanishing
point”), and stirred a revival of interest in the mathematics of perspective in Europe [1].
Figure 1 demonstrates Taylor’s setup illustrating that the trapezoid abcd is the per-
spective image of the square ABCD. A question Taylor could have asked himself (but
apparently never did) is, how much could we deform abcd and still be able to make
the same claim? Could a, b, c, d be the vertices of a diamond? What about a kite?
Could they be the vertices of a nonconvex shape such as a Penrose dart? The answer
is surprising: Every quadrangle is the perspective image of a square.
The goal of this paper is to provide some visually compelling insight into the cor-
respondence between squares and their many images—a visual insight that incorpo-
rates not only familiar images of direct perspective such as Taylor’s, but also allows
for somewhat more complicated interpretations of perspective (such as the discon-
nected pools of light in Figure 2). Along the way, we draw analogies between our
main theorem and images of conic sections.
In addition to giving the theorem a robust visual interpretation, we will also give
examples of how understanding perspective images of squares can be applied fruit-
fully in the areas of photogrammetry (determining true relative sizes of real-world
objects from a photograph) and linear algebra (more specifically, in the decomposition
of projective transformations). We’ll use these fields to dig into some of the “why we
care” aspect of this subject.
But first, we will need some definitions and background.

http://dx.doi.org/10.4169/amer.math.monthly.124.2.99
MSC: Primary 51N15

February 2017] THE IMAGE OF A SQUARE 99

Figure 1. The first figure of Brook Taylor’s New Principles of Linear Perspective [17]. Here ABCD depicts a
square, and abcd depicts the image of a square (used with the permission of the Max Planck Institute for the
History of Science).

2. A LAMP AND SOME TERMINOLOGY. In this section we introduce some terminology
motivated by lamplight on a wall. Figure 2 shows a lamp with a box-shaped lampshade
(a rectangular parallelepiped) with square horizontal cross sections and one face
parallel to a wall. The light bulb O, which we idealize as a point, is at the center of
the shade, and through the square openings at the top and bottom of the lampshade, the
bulb projects two trapezoidal pools of light on the wall. At first glance these two
pools of light don’t seem too remarkable, possibly because we think of the bulb
projecting the upper square opening ABCD of the lampshade onto the wall and ceiling,
and the lower opening onto the wall and floor. However, there is a more interesting way
of looking at this image.
It is possible to regard the square ABCD as being completely projected onto the
wall, which we think of as an infinite plane π. The image of any point, say A, is the
unique point on the wall collinear with A and the center of projection O. Following
the dashed “connector” OA, we see that the image of A is A_π, and similarly the image
of B is B_π. The image of C is a little different, because O lies between C and its image
C_π. The same applies to D and D_π. Notice how crucial the bottom of the lampshade is
in physically realizing the complete projection. The result is that the interior of ABCD
is projected to two separate, unbounded pools of light.
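
The projection just described can be checked numerically. The following is only a sketch under assumed coordinates that do not appear in the article: the wall π is taken to be the plane y = 0, the bulb O sits one unit from it, and the top opening ABCD is a unit square one unit above the bulb. The function name `perspectivity` and the parameter t are illustrative choices.

```python
import numpy as np

def perspectivity(center, point, normal, offset):
    """Image of `point` on the plane normal . x = offset, projected from `center`.

    Returns (image, t) with image = center + t * (point - center), or
    (None, None) when the connector is parallel to the plane, so that the
    image is a point at infinity.
    """
    center, point, normal = (np.asarray(v, dtype=float) for v in (center, point, normal))
    direction = point - center
    denom = normal @ direction
    if abs(denom) < 1e-12:
        return None, None
    t = (offset - normal @ center) / denom
    return center + t * direction, t

# Hypothetical coordinates (not from the article): wall = plane y = 0,
# bulb O one unit from the wall, top of the shade a unit square at z = 1.
O = (0.0, 1.0, 0.0)
wall_normal, wall_offset = (0.0, 1.0, 0.0), 0.0
square = {"A": (0.5, 0.5, 1.0), "B": (-0.5, 0.5, 1.0),
          "C": (-0.5, 1.5, 1.0), "D": (0.5, 1.5, 1.0)}

images, params = {}, {}
for name, X in square.items():
    images[name], params[name] = perspectivity(O, X, wall_normal, wall_offset)
    # t > 0: the image lies on the ray from O through X.
    # t < 0: O lies between X and its image.
    where = "ray from O through X" if params[name] > 0 else "O between X and image"
    print(name, images[name], where)

# The midpoint of AD is level with O in the direction of the wall normal,
# so its connector is parallel to the wall and its image is at infinity.
print(perspectivity(O, (0.5, 1.0, 1.0), wall_normal, wall_offset))
```

With these coordinates the images are A_π = (1, 0, 2), B_π = (−1, 0, 2), C_π = (1, 0, −2), and D_π = (−1, 0, −2): A and B land in the upper pool of light, while C and D, for which t is negative, land in the lower one.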
The unboundedness prompts a second look, revealing something we have glossed
over. The midpoints of AD and BC on the top of the lampshade are marked in white.
A line from O to either of these points is parallel to the plane π; hence the images
of these points are in some sense infinitely distant. If we were to be really precise
about it (which we won’t), we would need the complete formalism of spaces like E^2
and E^3, called extended Euclidean space (see [6, p. 60–62] or [14, p. 84] for a formal
definition). For the purposes of this paper, it suffices to think of these spaces as the
union of all points in R^2 or R^3 (called ordinary points) together with additional points
(called points at infinity) such that every set of parallel lines meet at exactly one point at
infinity and every set of parallel planes meet at exactly one line at infinity. In particular,
the images of the midpoints of AD and BC are points at infinity belonging to the
(extended) plane π. (Geometers will be familiar with a similar space, real projective
space, or PR^3; working in E^3 instead of PR^3 allows us to use the standard metric
properties of R^n such as distance between ordinary points and angles between ordinary
lines.)

Figure 2. A wall-mounted lamp with a square lampshade (a rectangular parallelepiped with square ends).

What we would really like to say is that the quadrangle A_πB_πC_πD_π is the image
of the square ABCD. However, the term “quadrangle” often refers to figures having
line segments as edges (see, for example, [18]). If we trace in order the points A_π, B_π,
C_π, D_π in Figure 2, we trace a “bow tie” like that in Figure 3(a). However, we should
not regard that figure as the image of the square ABCD; for example, our previous
remarks show that the interior of segment AD does not map to the interior of A_πD_π,
and the interior of BC does not map to the interior of B_πC_π. To address this issue, we
will modify our usual geometric definitions somewhat. Let us agree that, given four
coplanar points A, B, C, D, no three of which are collinear, “the quadrangle ABCD”
refers to the points A, B, C, and D, called vertices, and the infinite lines AB, BC,
CD, and DA, called edges. The same goes for the quadrangle A_πB_πC_πD_π. It will also
be useful to refer to the diagonals of ABCD, namely the infinite lines AC and BD.
Thus, the notation determines which of the six lines associated with the quadrangle
ABCD are to be considered edges and which diagonals. In Figure 3(b) the edges of the
quadrangles ABCD and A_πB_πC_πD_π are drawn with solid lines, and the diagonals with
dashed lines.
In fact, we regard the edges and diagonals not just as infinite lines but as extended
lines, meaning that each contains a point at infinity. In a natural way, a quadrangle
ABCD is a parallelogram if AB ∥ CD (that is, AB is parallel to CD, meaning that they
meet at a point at infinity) and AD ∥ BC. A parallelogram is a rectangle if adjacent
edges (that is, edges with a common vertex) are perpendicular, say AB ⊥ BC, and a
rectangle is a square if its diagonals are perpendicular; that is, if AC ⊥ BD. Therefore
the lamp in Figure 2 projects a square ABCD like that in Figure 3(b) to a quadrangle
A_πB_πC_πD_π like the one next to it. The edges AD and BC, being parallel, have a
common point at infinity, and as we will show, that point projects to the center of the
“×” in Figure 3(b), that is, the intersection A_πD_π · B_πC_π.

Figure 3. We choose the two figures in (b) to represent quadrangles ABCD and A_πB_πC_πD_π, rather than those
in (a). The extended solid lines are edges and the dashed lines are diagonals.

Before discussing that point at infinity, we add a few more auxiliary parts to our
concept of a quadrangle. To motivate the choice of terminology, Figure 4 portrays a
quadrangle ABCD as the top of a box drawn in perspective. The box could be, say,
an office building seen from an airplane, with a horizon line v seen in the distance. In
imitation of Taylor’s perspective drawing terminology [1, p. 8], we define the principal
vanishing points of ABCD to be the intersections V = AD · BC and V′ = AB · CD of
nonadjacent edges of the quadrangle. These points determine the line v, which we will
call the vanishing line of ABCD. The points W = v · AC and W′ = v · BD are the
vanishing points of the diagonals.

Figure 4. Auxiliary parts of a quadrangle, portrayed as the top of a building with a distant horizon.

The adjective “vanishing” is typically applied to points or lines when thinking of
them as images of other points or lines that are infinitely distant. This adjective is fairly
suggestive in Figure 4 if we think of an infinitely distant horizon, but what about the
unmarked vanishing point V_π = A_πD_π · B_πC_π in Figure 3(b)? Recall that in Figure 2
we located the image of A by extending the dashed connector OA until it met the wall π
at A_π, and so on. Now AD and BC, being parallel in space, meet at an infinitely distant
vanishing point V. To aim a connector from O in the direction of V, we must aim it
parallel to AD and BC, hence this connector is represented by the black mounting post
of the lamp, which meets π at the wall mount, represented by a little black disk. The
disk represents the center of the × of Figure 3(b), that is, the point V_π = A_πD_π · B_πC_π
on the wall that is the image of the infinitely distant point V. Although the lamp may
seem unremarkable at first, each of its parts (the bulb, the shade, the post, and the
wall mount) has an interesting interpretation.
A few more remarks before we present the main theorem. We have seen that the
light rays of the light bulb O define a bijective mapping from the plane of ABCD
to the plane of the wall. Given distinct, extended planes σ and π, and a point O not on
either of them, we call f_O : σ → π a perspectivity with center O if for each point
X ∈ σ, the point X_π = f_O(X) is the point of π collinear with O and X. Likewise,
f_O^{-1} : π → σ is also a perspectivity with center O. The intersection line σ · π is a
line of fixed points, called the axis of f_O. Given a set S in σ, we call f_O(S) the
perspective image of S (under f_O). Since the latter set is also the perspective image
of the former under f_O^{-1}, the two sets are said to be in perspective. In particular, we
say that a quadrangle (in one plane) is the perspective image of a square (in another)
if a perspectivity maps the vertices and edges of the square to the vertices and edges,
respectively, of the quadrangle. We note also that points at infinity are sometimes called
directions, because every line through such a point runs in the same (parallel) direction.
We will use the fact that two such points represent perpendicular directions when every
line containing one point is perpendicular to every line containing the other.
To assist in reading and following the notation, we have adapted Andersen’s
mnemonic for choosing variable names [1]: π and σ for planes, with σ containing
the square, a for the axis between π and σ, v for vanishing lines, and V, V′, W,
W′ for vanishing points.

3. MAIN THEOREM. The theorem that every quadrangle is the perspective image
of a square is both known and unfamiliar. It is known in the sense that the theorem
appears in various papers and books in the mathematical literature (Dörrie, for
example, called this theorem one of the “true jewels of mathematical miniature work”
and used a diagram of the proof to adorn the cover of his book [4]). But proofs have
tended to fall into one of two camps. The proofs that appeal to visual perspective
restrict themselves implicitly to the case of convex polygons (most notably, see [5]);
approaches that allow for more general configurations (for example [4], [9], and [19])
often discard the visual interpretation, although even in those cases the accompanying
diagrams show the usual convex setup. Perhaps this disconnect between “visualiz-
ing” and “proving” explains why the theorem is also unfamiliar; it seems to appear
infrequently in modern projective geometry texts, and sometimes its appearance in the
literature even comes as a conjecture or a puzzle (see the concluding question of [13]
and the contest at [12]).
Since versions of our main theorem are proved elsewhere, we give an informal
proof, concentrating on the “generic” case of a quadrangle ABCD in which no two
edges are parallel, so that the vanishing points V, V′, W, W′ are all ordinary.

Theorem 1. Every quadrangle is the perspective image of a square.

Proof. Let ABCD be a quadrangle in a plane π, with vanishing points V, V′, W,
W′ and vanishing line v (see Figure 5). Let a be an ordinary line in π parallel to
v, and let σ be another plane containing a. Let σ′ be the plane containing v that is
parallel to σ, and let O be one of the intersection points of the circles in σ′ with
diameters VV′ and WW′. We claim that the perspectivity f_O : π → σ maps ABCD
to a square A_σB_σC_σD_σ, where A_σ = f_O(A), B_σ = f_O(B), etc. Since σ and σ′ are
parallel, f_O maps V, V′, W, W′ to respective points V_σ, V′_σ, W_σ, W′_σ at infinity. This
tells us that the lines A_σD_σV_σ and B_σC_σV_σ are parallel; similarly the lines A_σB_σV′_σ
and C_σD_σV′_σ are parallel. Thus the image quadrangle A_σB_σC_σD_σ is a parallelogram.
By Thales’ theorem for triangles inscribed in a semicircle, OV ⊥ OV′, so V_σ and V′_σ
are perpendicular directions and thus A_σB_σC_σD_σ is a rectangle. Similarly, W_σ and
W′_σ are perpendicular directions, so A_σB_σC_σD_σ has perpendicular diagonals and is
therefore a square.

Figure 5. Illustration of the proof of Theorem 1.
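
The construction in the proof of Theorem 1 can be tested numerically on a generic quadrangle. The sketch below makes several assumptions that are not in the article: π is the plane z = 0 of R^3, the plane σ′ through v is chosen perpendicular to π (any plane through v would do), σ is the parallel plane at distance `shift` from σ′, and the function names and the sample quadrangle are hypothetical.

```python
import numpy as np

def line(p, q):
    """Homogeneous coordinates of the line through the 2D points p and q."""
    return np.cross([p[0], p[1], 1.0], [q[0], q[1], 1.0])

def meet(l, m):
    """Ordinary intersection point of two lines (assumed not parallel)."""
    x = np.cross(l, m)
    return x[:2] / x[2]

def square_image(A, B, C, D, shift=1.0):
    """Center O and the images of A, B, C, D under f_O : pi -> sigma."""
    A, B, C, D = (np.asarray(P, dtype=float) for P in (A, B, C, D))
    V, Vp = meet(line(A, D), line(B, C)), meet(line(A, B), line(C, D))
    v = line(V, Vp)                                   # vanishing line
    W, Wp = meet(v, line(A, C)), meet(v, line(B, D))

    u = (Vp - V) / np.linalg.norm(Vp - V)             # unit direction of v
    n = np.array([-u[1], u[0]])                       # unit normal to v inside pi
    s = lambda P: (P - V) @ u                         # coordinate along v

    # Circles in sigma' (coordinates: s along v, height z above pi)
    # with diameters VV' and WW'; O is one of their intersection points.
    m1, r1 = (s(V) + s(Vp)) / 2, abs(s(Vp) - s(V)) / 2
    m2, r2 = (s(W) + s(Wp)) / 2, abs(s(Wp) - s(W)) / 2
    s0 = (r1**2 - r2**2 - m1**2 + m2**2) / (2 * (m2 - m1))
    z0 = np.sqrt(r1**2 - (s0 - m1)**2)
    O = np.append(V + s0 * u, z0)

    def image(P):
        # The connector O + t (P - O) meets sigma when t = shift / (n . (P - V)).
        t = shift / ((P - V) @ n)
        return O + t * (np.append(P, 0.0) - O)

    return O, [image(P) for P in (A, B, C, D)]

def is_square(P, tol=1e-9):
    """Four equal sides and one right angle."""
    A, B, C, D = P
    sides = [np.linalg.norm(Q - R) for Q, R in ((A, B), (B, C), (C, D), (D, A))]
    return bool(np.allclose(sides, sides[0]) and abs((B - A) @ (D - A)) < tol)

O, img = square_image((0, 0), (4, 0), (3, 2), (1, 3))
print(O, is_square(img))
```

The check relies only on the two facts used in the proof: O sees VV′ and WW′ at right angles, and σ is parallel to the plane through O and v.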

The method of the proof is easily adapted to the case of the bow tie quadrangle
A_πB_πC_πD_π of Figure 3(b), which resulted from the lamp projection. In Figure 6 the
plane π of A_πB_πC_πD_π lies horizontally, with one of the vanishing points of the
quadrangle given by V_π = A_πD_π · B_πC_π, in analogy to Figure 4. Where are the other three
vanishing points? With regard to V′_π = A_πB_π · C_πD_π, we have A_πB_π ∥ C_πD_π, hence
V′_π is the point at infinity (that is, the direction) parallel to A_πB_π and C_πD_π. We
therefore draw the vanishing line v_π through V_π parallel to C_πD_π as shown, and locate
W_π = v_π · A_πC_π and W′_π = v_π · B_πD_π. To locate the center of projection O as in the
proof, let σ′ be the plane through v_π perpendicular (for convenience) to π, and draw
a semicircle in σ′ with diameter W_πW′_π. Since V′_π is at infinity, there is no semicircle
in σ′ with diameter V_πV′_π, but if we imagine V′_π as an ordinary point on v_π to the left
(say) of V_π, and then move V′_π farther and farther to the left, a semicircle connecting
the two points stays anchored at V_π and locally looks more and more like a ray
perpendicular to v_π at V_π. This turns out to be the correct approach; as shown in Figure 6, O
is the intersection of the perpendicular to v_π at V_π with the semicircle having diameter
W_πW′_π.
The other parts of the proof can be done analogously to the proof of Theorem 1, as
in Figure 7. Observe that the location of O is visually consistent with the location of
the light bulb in Figure 2.

Figure 6. The construction of O in the case of the bow tie quadrangle A_πB_πC_πD_π.

Figure 7. Completing the square for the bow tie quadrangle A_πB_πC_πD_π. If we stood above the lamp in
Figure 2 and looked down, ABCD would be the top of the lampshade and A_πB_πC_πD_π its image on the wall.

4. SQUARE CONICS. Because we generalized the concepts of vanishing points
and vanishing lines of arbitrary quadrangles, we can now draw squares as perspective
images of some rather unusual quadrangles. However, there are better ways to see
how squares can be perspective with unexpected shapes. Figure 8 shows a variation on
the lamp idea; it’s a floor lamp with a box-shaped shade, but in this case the nearby
wall plane π is slanted, as might happen in an attic room. The square opening ABCD
of the lampshade projects to a quadrangle A_πB_πC_πD_π that includes a dart shape,
diagrammed in the inset at the upper right.
Our two lamp examples are reminiscent of a common example of conic sections,
namely the light patterns cast by a lamp with a circular cylindrical shade. With the
bulb at the center of the shade, the light streams out in a double-napped circular cone,
and the sections of the cone of light by walls, floor, and ceiling are conic sections. If
instead of a circular cone like x^2 + y^2 = z^2 we consider a surface of the form

|x| + |y| = |z|,

we get a “square cone” like that in Figure 9, comprised of a pair of pyramids with
square horizontal cross sections. That is, we get the kind of volume illuminated by the
square lampshades in our examples, and each section of such a cone is the perspective
image of a square. We think of the vertex O = (0, 0, 0) as the center of perspective,
and choose a square, horizontal slice ABCD as the square of interest. In Figure 9 the
plane π meets only the lower pyramid, the intersection being a convex quadrangle
A_πB_πC_πD_π. Indeed, the idea of looking at the intersection of a pyramid with a plane
was the basis for a proof that Emch published in this journal in 1917 illustrating, in his
words, “the importance of perspective as an introduction to projective geometry” [5].


Figure 8. A floor lamp with a square lampshade casts light on an attic wall.

But for more interesting quadrangles, we need more than just one of the pyramids;
we need the full cone. In Figure 10 we see two views of a situation in which the
plane π tilts so that it meets all four faces of the upper pyramid, and just two faces of
the lower pyramid. The intersection of the plane and the cone is a dart-type quadrangle
A_πB_πC_πD_π, like that created by the lamp in Figure 8. The formula for the square cone
is easy to work with, and we encourage readers to use a graphing program to explore
the interesting variety of “square conics” analogous to circles, ellipses, parabolas, and
hyperbolas. All of them are images of a square!
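
As one way to take up that invitation, here is a minimal sketch that computes the vertices of a square conic. It assumes the square of interest is the slice z = 1 of the cone, with the vertex labels chosen arbitrarily, and it describes the cutting plane as n · x = c with c ≠ 0; the function name and the classification by counting vertices imaged on the lower pyramid are illustrative choices, not taken from the article.

```python
import numpy as np

# Vertices of the horizontal slice z = 1 of the square cone |x| + |y| = |z|.
SQUARE = {"A": (1, 0, 1), "B": (0, 1, 1), "C": (-1, 0, 1), "D": (0, -1, 1)}

def square_conic(normal, offset):
    """Images of A, B, C, D on the plane normal . x = offset, seen from O = (0, 0, 0).

    Each image is t * P with t = offset / (normal . P). The sign of t says
    whether the image lies on the upper (t > 0) or lower (t < 0) pyramid.
    """
    normal = np.asarray(normal, dtype=float)
    images, lower = {}, 0
    for name, P in SQUARE.items():
        P = np.asarray(P, dtype=float)
        denom = normal @ P
        if abs(denom) < 1e-12:
            raise ValueError(f"the image of {name} is a point at infinity")
        t = offset / denom
        images[name] = t * P
        lower += int(t < 0)
    # All four vertices on one pyramid: convex. One vertex split off: dart.
    # Two (necessarily adjacent) vertices split off: bow tie.
    kind = {0: "convex", 4: "convex", 1: "dart", 3: "dart", 2: "bow tie"}[lower]
    return images, kind

for n, c in [((-0.3, -0.2, 1.0), -2.0), ((0.0, 1.5, 1.0), 1.0), ((1.0, 1.0, 0.0), 1.0)]:
    images, kind = square_conic(n, c)
    print(kind, {k: np.round(p, 3) for k, p in images.items()})
```

The three sample planes give, in order, a convex quadrangle lying entirely on the lower pyramid, a dart with three vertices on the upper pyramid and one on the lower, and a bow tie of the kind produced by the wall lamp.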

Figure 9. Plane section of a square cone.

Figure 10. Two views of another square conic.

5. IMITATING THE MASTERS. Our work so far has given the impression that a
square and its quadrangle image always (or often) lie in separate planes. But a result
similar to Theorem 1 holds even when we restrict all objects to a single plane. The
proof of Theorem 1 leads to the solution of a same-plane drawing problem that is
essentially the reverse of a type investigated by Renaissance masters such as Leone
Battista Alberti (1404–1472) and Piero della Francesca (1415–1492), as well as
mathematician Brook Taylor. In Figure 11, which is essentially a partial version of
Figure 5, we think of the planes σ and π as the front and top, respectively, of a
box (rectangular parallelepiped). The problem posed by the masters involved a regular
polygon or circle drawn undistorted on the front of a box, and they sought to
reproduce a copy of that object seen obliquely in perspective on the top of the box.
The solutions generally resulted in a figure on the top that was the perspective image
of the one on the front (see, for example, [11, pp. 186–189]). The proofs given here
show that the reverse can be done, that is, start with a quadrangle ABCD in the plane π
(a square distorted by perspective) and draw a square A_σB_σC_σD_σ in the plane σ that is
the perspective image of ABCD under some perspectivity from π to σ.


Figure 11 illustrates how to locate the image A_σ of A using a method that is
essentially the reverse of one presented by Brook Taylor [17, Figure 16]; the other points are
located the same way. To locate A_σ, we extend the line AD until it meets the vanishing
line v at V and the axis a at X. We then draw ℓ through X so that the lines ℓ and
OV are parallel; we claim ℓ is the image of the line AD. Why is this? Recall that X,
lying on the axis, is a fixed point of the perspectivity; moreover the point at infinity on
A_σD_σ is the image V_σ of V, which has the same direction as OV (see footnote 1). Hence,
A_σD_σ = ℓ. To locate A_σ on ℓ, we extend the connector OA until it meets ℓ at A_σ. The
other points B_σ, C_σ, and D_σ are located similarly.

Figure 11. We reverse Taylor’s method for constructing the perspective image of a square.

Perspective collineations. The diagram of Figure 11 actually illustrates two types of
mappings. First, there is the perspectivity f_O between the planes π and σ in
three-dimensional space. In reality, of course, the diagram exists entirely in the plane of the
page, hence there is also the plane-to-itself mapping F_O that, figuratively speaking,
maps the ink dot labeled A to the ink dot labeled A_σ, and so on. This one-to-one
mapping of the plane to itself is called a perspective collineation; F_O has a center O, an
axis a (a line of fixed points), and it maps points to points, lines to lines, and preserves
intersections. From this it follows that F_O maps quadrangles to quadrangles. When the
center does not lie on the axis, a perspective collineation is called a homology, and it
is completely determined by specifying its center O, its axis a, and two corresponding
points X and F_O(X) collinear with O (see, for example, [3, p. 53]).

Corollary 1. Any quadrangle is the image under a perspective collineation of a square
in the same plane.

1. A fine point needed in locating A_σ in Figure 11 is that we also consider the planes σ′ and σ to
be parallel in space to the plane of the page, on which the whole configuration is projected. Thus
parallel lines in the separate planes are drawn parallel in the figure.

Proof. With just a minor difference, the proof imitates that of Theorem 1. Let ABCD
be a quadrangle in a plane π with vanishing points V, V′, W, W′ and vanishing line
v. Choose O on the intersection of the circles with diameters VV′ and WW′. Let a be
a line in π parallel to v, and let F_O be the perspective collineation with center O and
axis a such that V_π := F_O(V) is the point at infinity on OV. Since v and a are parallel,
they have a common point P at infinity, and P_π := F_O(P) = P since P lies on a. But
F_O(V) is also a point at infinity, hence F_O maps v and the points V, V′, W, and W′ to
infinity as in the proof of Theorem 1. By the same reasoning as in that earlier proof,
A_πB_πC_πD_π is a square, where A_π = F_O(A), B_π = F_O(B), etc.

As we will see in the next two sections, this corollary inspired by Renaissance artists
leads to intriguing modern applications.

6. PROJECTIVE COLLINEATIONS AND APPEARANCE. We turn now to
exploring one of the useful applications of Corollary 1: a decomposition technique.
Decompositions of projective collineations are useful in a variety of modern appli-
cations, including the teaching and practice of computer vision, the analysis of
video sequences, computer animation, augmented reality, and many other areas. (For
examples of the aforementioned see [8], [10], [15], and [20], respectively.)
In the realm of linear algebra, linear transformations are more general than elemen-
tary transformations (these latter include horizontal and vertical shears, horizontal and
vertical compressions/expansions, and reflections). In the same way, in the realm of
projective geometry, projective collineations (that is, transformations of the plane that
take lines to lines) are more general than perspective collineations. Figure 12 shows a
drawing H0 of a house (a set of 11 points A0 , B0 , . . . connected by line segments), and
the image H1 of that drawing under a projective collineation. The collineation is not
perspective, because the connectors A0 A1 , B0 B1 , and C0 C1 of points and their images
are not concurrent at a center of projection.

Figure 12. The figures H0 and H1 are not in perspective.

Decomposing a transformation is often helpful in understanding its properties.
Introductory linear algebra texts often mention the fact that an invertible linear trans-
formation of the real plane R^2 can be decomposed into the product (composition)
of simpler elementary transformations (see, for example, [2, p. 454]). Similarly, a
projective collineation is a product of perspective collineations.
But do the images we get from a product of elementary or perspective transforma-
tions look different from the images of their more elementary counterparts? The figure
H1 looks rather warped; is it possible that a single perspective collineation can map
a house that looks like H0 —that is, a figure similar to H0 —to the strange-looking
set H1 ? The answer is yes. The proof depends on the fundamental theorem of pro-
jective geometry, which says that a projective collineation is completely determined
by specifying the images of four points in the extended plane, no three of which are
collinear. We also use the fact that a similarity of the plane—an affine transformation
that preserves angles—is a projective collineation.

Corollary 2. A projective collineation is the product of a perspective collineation and
a similarity.

Proof. Let G be a projective collineation of the extended plane, let A0 B0 C0 D0 be a
square, and let the quadrangle A1 B1 C1 D1 be the image of A0 B0 C0 D0 under G, with
A1 = G(A0 ), etc. By Corollary 1, there exists a square A2 B2 C2 D2 and a perspective
collineation FO with center O that maps A2 B2 C2 D2 to A1 B1 C1 D1 . Clearly there exists
a similarity S that maps the square A0 B0 C0 D0 to the square A2 B2 C2 D2 , hence FO ◦ S
maps A0 B0 C0 D0 to A1 B1 C1 D1 . It follows from the fundamental theorem of projective
geometry that G = FO ◦ S.
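
The fundamental theorem used in this proof can be made concrete with a short Python sketch that computes the projective collineation taking four given points to four given images (no three collinear in either set). The construction through a "frame matrix" and all names here are our own illustrative choices, not taken from the paper; exact rational arithmetic is used so the check is free of rounding.

```python
from fractions import Fraction

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def frame_matrix(pts):
    """Columns of a 3x3 matrix sending e1, e2, e3, (1,1,1) to the four points."""
    p1, p2, p3, p4 = [(Fraction(x), Fraction(y), Fraction(1)) for x, y in pts]
    det = dot(p1, cross(p2, p3))          # nonzero when p1, p2, p3 are not collinear
    # Cramer's rule for p4 = l1*p1 + l2*p2 + l3*p3
    l1 = dot(p4, cross(p2, p3)) / det
    l2 = dot(p1, cross(p4, p3)) / det
    l3 = dot(p1, cross(p2, p4)) / det
    return [tuple(l1 * c for c in p1),
            tuple(l2 * c for c in p2),
            tuple(l3 * c for c in p3)]

def collineation(src, dst):
    """Return the map G with G(src[i]) = dst[i] for the four point pairs."""
    A, B = frame_matrix(src), frame_matrix(dst)
    # Rows of adj(A); the overall scale is irrelevant in homogeneous coordinates.
    rows = (cross(A[1], A[2]), cross(A[2], A[0]), cross(A[0], A[1]))
    def G(point):
        x = (Fraction(point[0]), Fraction(point[1]), Fraction(1))
        y = [dot(r, x) for r in rows]
        X, Y, W = (sum(y[k] * B[k][i] for k in range(3)) for i in range(3))
        return (X / W, Y / W)             # W == 0 would mean an image at infinity
    return G

# Assumed sample data: a unit square and an arbitrary quadrangle.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
quad = [(0, 0), (2, 0), (3, 2), (0, 1)]
G = collineation(square, quad)
```

Because a collineation preserves intersections, the center of the square (where its diagonals meet) must go to the intersection of the diagonals of the quadrangle, which for this sample data is (6/7, 4/7).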

Corollary 3. If the center of the perspective collineation given by Corollary 2 is an
ordinary point, we can write the projective collineation as the product of a perspective
collineation and an isometry.

Proof. If the center O is an ordinary point, then by contracting the square A2 B2 C2 D2
toward O or by expanding the square away from O along the lines of perspectivity, we
can construct FO such that A2 B2 C2 D2 is the same size as A0 B0 C0 D0 . It follows that S
can be chosen to be an isometry.

Although the statements of Corollaries 2 and 3 ought to be well known, we have not
been able to find these elsewhere in the literature. Rather, a common computer vision
technique (see [8], [16]) decomposes a projective collineation into three components
as the composition of an orientation-preserving similarity, an affine transformation,
and a perspective collineation.
Figure 13 illustrates Corollary 2, which says that the projective collineation of Fig-
ure 12 is a product FO ◦ S. The figure H2 in Figure 13 is the image of H0 under the
similarity S, and the point O is the center of the perspective collineation FO that maps
H2 to H1 . We have drawn all the connectors to show that they indeed meet at O.
Observe that in this case we could contract H2 toward O and then reflect it in O, to
get a figure congruent (not just similar) to H0 as in Corollary 3.

7. ANALYZING A PHOTOGRAPH. The previous section described an application
for decomposition of the functions that give us images; this section describes an appli-
cation for analyzing the images themselves. Photogrammetry is the science of deduc-
ing three-dimensional information from photographs. One of the common applica-
tions of this science is auto accident reconstruction, in which photogrammetry allows
practitioners to estimate skid mark lengths from photographs (see, for example, [7]).

Figure 13. A decomposition of the projective collineation in Figure 12. The figure H0 is mapped to H2 by a
similarity, and H2 is mapped to H1 by a perspective collineation with center O.

Photogrammetry software uses the mathematics of perspective to transform an oblique
view of skid marks in a photograph into a bird’s-eye view suitable for analysis.
Corollary 1, as we show here, can be very helpful in similar problems. For example,
Figure 14 is a mock-up of a cropped photograph of a hallway with a couple of flyers
affixed to a wall at haphazard angles. Suppose we know that the flyers, designed to
advertise a Mathematics of Origami festival, are squares. The light gray flyer appears
smaller than the dark one, but is it really? We can use Corollary 1 to answer this
question.

Figure 14. Photograph of two square flyers on a wall. Which is larger?

In Figure 15 we use the procedure in the proof of Corollary 1 as follows. We locate
the vanishing points V and V′ of the edges of the quadrangle representing the dark
flyer. Similarly, we locate the vanishing points W and W′ of the diagonals of that
quadrangle (not shown to avoid clutter). We then choose a center of perspective O
from the intersection of the circles with diameters VV′ and WW′. We also locate the
vanishing points U, U′ of the light gray flyer. Observing that the vanishing points lie
on a vertical vanishing line v (consistent with vanishing point information from the
other objects in the hallway), we choose for the axis a the vertical line representing the
corner of the hallway. Using the center O and axis a, we construct the square images
of the flyers under the perspective collineation FO given by Corollary 1.

Figure 15. Solution of the photo problem.

Now obviously the resulting images are the correct shape—they’re square, as guar-
anteed by Corollary 1—but are they the correct relative sizes? To see that they are,
imagine that the actual receding wall is covered by a square grid whose lines are par-
allel to the edges of the dark square, as suggested in Figure 16. Of course, we see
the receding wall at an oblique angle, so those grid squares don’t appear square; they
are just more quadrangles in the plane of the photograph. Each such quadrangle has
the same principal vanishing points V, V′ as the dark flyer; likewise, the diagonals of
these quadrangles have the same vanishing points as the diagonals of the dark flyer
(not shown in the figure). Thus FO maps these quadrangles to squares aligned with the
square image of the dark square, resulting in a square grid, as shown to the right of a.
In fact, the image is a faithful, undistorted reproduction of the hallway wall, reversed
from left to right. Since the perspective collineation FO preserves points, lines, and
intersections, it maps any object on the oblique grid so that its image has the corre-
sponding intersection points with the square grid, hence the relative sizes of the dark
square and the light square in Figure 15 are exactly as shown. From this figure, we see
that the light flyer is actually larger than the dark one, which answers our question.
Figure 16. The perspective grid is mapped to a square grid.

We can be even more specific; a standard room is 8 feet high. Measuring carefully
shows us that the light square in Figure 15 is 1/4 as long as the vertical corner edge
of the hallway, so that square is two feet on each side, whereas the near dark square is
3/4 the size of the light square, or 18 inches on a side. That is, with largely geometric
techniques (as opposed to numerical or computational techniques), we can discover
photogrammetric information from a photograph or perspective drawing.
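
Written out, the scale arithmetic in this example is short. It takes the 8-foot room height and the measured ratios 1/4 and 3/4 from the text as given.

```latex
\[
  \tfrac{1}{4} \times 8\ \text{ft} = 2\ \text{ft},
  \qquad
  \tfrac{3}{4} \times 2\ \text{ft} = 1.5\ \text{ft} = 18\ \text{in}.
\]
```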

Brook Taylor’s view. Observe that in using Corollary 1, we treated the objects in the
photograph as figures all in the same plane. That is, we solved the problem by con-
structing a perspective collineation—a map from the plane to itself. But alternatively,
we could have considered this as a problem involving several planes, using Theorem 1
to construct a perspectivity from the plane of the receding wall to the plane of the adja-
cent wall facing us in the photograph. In that case, however, we must bring in some
notions we didn’t treat explicitly in our proof of Theorem 1.
For example, although our proof of Theorem 1 was confined to the situation where
all vanishing points of the given quadrangle are ordinary, the vanishing points of the
flyers on the receding wall lie at infinity, since the edges of the flyers are actually
parallel. Consequently, their common vanishing line, whose image in the photograph
is the line v in Figure 15, lies at infinity also. We can see the line in the diagram, but
it’s infinitely far away in space. To choose a plane σ′ through this line parallel to the
wall plane σ facing us, we must therefore choose the so-called plane at infinity, the
union of all the points and lines at infinity. Thus the center of perspective O, which lies in
σ′, is also a point at infinity. In other words, the dashed connectors emanating radially
from O are actually parallel in space to one another—they merely appear to converge
because we see them in perspective, like sunbeams—and the associated perspectivity
with center O is a parallel projection from the receding wall to the wall facing us.
In fact, it can be shown that these rays are parallel in space to the floor and ceiling,
and pierce each wall at a 45° angle. This map is therefore an isometry that causes a
reflected image of the receding wall to appear on the wall facing us, as though one wall
were folded at the corner onto the other.
We mention the notion of folding as an isometry because Taylor himself described
the same phenomenon in his construction (the reverse of ours, starting with the square
and obtaining the quadrangle) as “turning” the object plane “till it co-incides with the
Picture,” and concluded [17, p. 22]:

I have observed that the Shapes of the Representations of Figures on a Plane don’t at all depend
upon the Angle the Picture makes with that Plane.

Our investigation into the perspective images of squares began with Taylor’s three-
dimensional interpretation, and then used those results to move our investigations into
two-dimensional applications. Taylor’s observation—that a same-plane square must
necessarily be the same size as a different-plane square that has the same quadrangle
image—allows us to come full circle.
Or, perhaps we should say, it allows us to come full square.

ACKNOWLEDGMENT. The authors wish to thank the referees, whose careful and critical readings of our
earlier drafts were invaluable to us in our revisions. This work was supported by NSF TUES Grant DUE-
1140135.

REFERENCES

1. K. Andersen, Brook Taylor’s Work on Linear Perspective. Springer-Verlag, New York, 1992.
2. H. Anton, Elementary Linear Algebra. Seventh ed. John Wiley & Sons, New York, 1994.
3. H. S. M. Coxeter, Projective Geometry. Second ed. Springer, New York, 2003.
4. H. Dörrie, 100 Great Problems of Elementary Mathematics. Dover, New York, 1965.
5. A. Emch, A problem in perspective, Amer. Math. Monthly 24 (1917) 379–382,
http://dx.doi.org/10.2307/2973980.
6. H. Eves, A Survey of Geometry. Revised ed. Allyn and Bacon, Boston, 1972.
7. M. Frantz, A car crash solved—with a Swiss army knife, Math. Mag. 84 (2011) 327–338,
http://dx.doi.org/10.4169/math.mag.84.5.327.
8. R. Hartley, A. Zisserman, Multiple View Geometry in Computer Vision. Second ed. Cambridge Univ.
Press, New York, 2003.
9. J. L. S. Hatton, The Principles of Projective Geometry Applied to the Straight Line and Conic. Cambridge
Univ. Press, London, 1913.
10. E. Michaelsen, U. Stilla, Pose estimation from airborne video sequences using a structural approach for
the construction of homographs and fundamental matrices, Lecture Notes in Computer Science Vol. 3138.
Ed. by A. Fred, T. Caelli, R. P. W. Duin, A. Campilho, and D. de Ridder, Springer, Berlin, 2004. 486–494,
http://dx.doi.org/10.1007/978-3-540-27868-9_52.
11. D. Pedoe, Geometry and the Visual Arts. Dover, New York, 2011.
12. Problem 2013-2c, Newsletter of the Delft Institute of Applied Mathematics (December 2013) 293,
http://www.nieuwarchief.nl/home/problems/pdf/uitwerking-2013-2.pdf.
13. W. H. Richardson, Projection of a quadrangle into a parallelogram, Amer. Math. Monthly 73 (1966) 644–
645, http://dx.doi.org/10.2307/2314807.
14. D. Row, T. J. Reid, Geometry, Perspective Drawing, and Mechanisms. World Scientific Publishing,
Hackensack, NJ, 2012.
15. K. Shoemake, T. Duff, Matrix animation and polar decomposition, Proc. Conf. Graphics Interface 92
(1992) 258–264.
16. M. Sonka, V. Hlavac, R. Boyle, Image Processing, Analysis and Machine Vision. Cengage Learning,
Boston, 2014.
17. B. Taylor, New Principles of Linear Perspective: or the Art of Designing on a Plane the Representations
of all sorts of Objects, in a more General and Simple Method than has been done before. London, 1719,
http://echo.mpiwg-berlin.mpg.de/MPIWG:C0RQ3H5B.
18. E. W. Weisstein, Quadrangle—From MathWorld, A Wolfram Web Resource,
http://mathworld.wolfram.com/Quadrangle.html.
19. C. R. Wylie, Jr., Introduction to Projective Geometry. Dover, New York, 1970.
20. X. Zhang, Projection matrix decomposition in AR—A study with Access3D, in Mixed and Augmented
Reality, 2004, Third IEEE and ACM International Symposium on ISMAR 2004. 258–259,
http://dx.doi.org/10.1109/ISMAR.2004.48.

ANNALISA CRANNELL wishes she had had a course on projective geometry at some point in her life.
Nonetheless, she enjoys sharing the subject with her colleagues in the mathematical community and with her
Franklin & Marshall students.
Franklin & Marshall College, Lancaster PA 17604
annalisa.crannell@fandm.edu

MARC FRANTZ, a former painter, is a research associate in mathematics at Indiana University. He loves the
visual approach to mathematics, especially links between mathematics and art.
Indiana University, Bloomington, IN 47405
mfrantz@indiana.edu

FUMIKO FUTAMURA is an associate professor of mathematics at Southwestern University. She currently
enjoys spending her time painting portraits, creating a projective geometry and perspective drawing course and
textbook with her coauthors, and finger painting with her daughter.
Southwestern University, Georgetown, TX 78626
futamurf@southwestern.edu


Experimental Math for Math Monthly
Problems
Allen Stenger

Abstract. Experimental mathematics is a newly developed approach to discovering mathe-
matical truths through the use of computers. In this paper, we look at how these techniques
can be applied to help solve six problems that have appeared in the Problems section of the
MONTHLY. The paper has examples of constant recognition, sequence recognition, and inte-
ger relation detection.

Experimental mathematics is a newly developed approach to discovering mathematical
truths through the use of computers. Mathematicians have always calculated as part of
their search for new facts. The computer makes this easier and has extended our range,
but there are also new computer methods that are qualitatively different from what has
gone before.
“Experimental math” is a very broad term. Borwein and Devlin state in [10] that
“experimental mathematics is really an approach to mathematical discovery” (p. 115)
and “Experimental mathematics is the use of a computer to run computations—
sometimes no more than trial-and-error tests—to look for patterns, to identify partic-
ular numbers and sequences, to gather evidence in support of specific mathematical
assertions that may themselves arise by computational means, including search.” (p. 1)
Experimental math is thus primarily heuristic; it guides us to an expression, but we
still have to prove it.
In this paper, we will look at three particular computer methods that are important
in mathematics research and illustrate their use on six problems that have appeared in
the Problems section of THE AMERICAN MATHEMATICAL MONTHLY.
Experimental math has been very successful in mathematical research, but there
are a couple of reasons why it might be even more successful in helping to solve
MONTHLY problems. One reason is that MONTHLY problems tend to have short, neat
answers (typical published solutions run about half a page to a page), so these meth-
ods, which lead directly to a final answer, might be an important shortcut in solving
the problem. Another reason is that MONTHLY problems are always presented out of
context so that we do not know where the problem came from or (usually) why it is
interesting. The lookup methods are especially useful here because they do not require
any context. The lack of context is more of a challenge for integer relation detection,
as we will see in our example in Section 7.
The best place to start learning about experimental mathematics is the brief but
wide-ranging survey and introduction [10]. The two-volume set [6, 7] contains an enor-
mous number of worked examples and exercises from a wide variety of mathematics.
The book [3] contains many lengthy and very difficult examples. The website [2] is
a collection of much useful information and links to other sites. There is a research
journal, Experimental Mathematics, published by Taylor & Francis.
For this paper, we will use Mathematica to perform the multiple-precision calcula-
tions needed, although any computer-algebra system or high-precision package would
work. The calculations and timings shown in this paper were performed using Mathe-
matica 10.0.2.0 on a 2.8 GHz Macintosh iMac computer.

http://dx.doi.org/10.4169/amer.math.monthly.124.2.116
MSC: Primary 40-04, Secondary 05-04, 97D50

1. THE METHODS. In this paper, we will look at three particular methods or tech-
niques that are commonly used in experimental math and see examples of applying
each to MONTHLY problems.
Other computer methods that are useful for many MONTHLY problems, but not cov-
ered in this paper, are the mechanical summation methods of Gosper, Wilf, Zeilberger,
et al. These are especially useful for problems involving sums with binomial coeffi-
cients. They are illustrated in a very illuminating and entertaining article [21] published
earlier in this MONTHLY. These methods are often included as part of experimental
math, but they produce both the final answer and the proof, so they are not heuristic in
the same way that methods we consider here are.

Constant recognition. The Inverse Symbolic Calculator Plus (ISC+) [8] is an online
service that attempts to identify a constant, given a good numerical approximation to
the constant. According to its website, ISC+ “uses a combination of lookup tables and
integer relation algorithms in order to associate a closed form representation” with the
given approximation. It is used to identify values that come up in research, such as
definite integrals or infinite series, by calculating them to a high precision (the rule
of thumb is that 15 digits are needed) and asking ISC+ for a closed-form candidate.
Such problems are very common in the MONTHLY Problems section, and the value
can often be discovered by this method. Even with computers, it is sometimes difficult
to calculate a value to 15 digits, and we will see examples of this in this paper.
Even in the old days, we might have attempted to guess the value of a series by
adding up several dozen terms. If we got a sum of 3.14159, we would probably guess
that the series summed to π and attempt to prove this using known facts about π,
including other series whose value included π. With computers we can get more digits;
if the answer was 3.1415926535897932385, we would be even more confident that
the answer was π and would be willing to work harder to prove this. Plugging in our
20-digit π suspect into ISC+ indeed produces π.
The ISC+ table is enormous, and the lookup method almost always produces a can-
didate if you have enough digits. However, guessing a constant from a high-precision
approximation is far from infallible. We like mathematical problems to have neat
answers. For example, to 30 digits, we have

\[ e^{\pi\sqrt{163}} = 262537412640768743.999999999999. \]

Anyone looking at the right-hand side would guess that it represents an integer, but to
35 digits, we have

\[ e^{\pi\sqrt{163}} = 262537412640768743.99999999999925007. \]
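
This near miss can be reproduced without Mathematica. The sketch below uses Python's standard `decimal` module at 60 significant digits; the helper `pi_decimal` follows the series given in the recipes section of the `decimal` documentation, and the variable names are our own.

```python
from decimal import Decimal, getcontext

def pi_decimal():
    """pi to the current context precision (series from the decimal docs recipes)."""
    getcontext().prec += 2            # extra digits for intermediate steps
    lasts, t, s, n, na, d, da = 0, Decimal(3), 3, 1, 0, 0, 24
    while s != lasts:
        lasts = s
        n, na = n + na, na + 8
        d, da = d + da, da + 32
        t = (t * n) / d
        s += t
    getcontext().prec -= 2
    return +s                         # unary plus rounds to the context precision

getcontext().prec = 60
near_integer = (pi_decimal() * Decimal(163).sqrt()).exp()
print(near_integer)
```

The printed value begins 262537412640768743.99999999999925007, so the first twelve decimals are all 9s, yet the number is not an integer.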

Another example (from [4, pp. 498–503]) is the integral

\[ \int_0^{\infty} \cos(2x) \prod_{n=1}^{\infty} \cos\!\left(\frac{x}{n}\right) dx. \]

To 42 digits, this agrees with π/8, but in fact, it is not π/8. A collection of even more
spectacular examples of misleading near matches is in [9].


Sequence recognition. The On-Line Encyclopedia of Integer Sequences (OEIS) [22]
is a large searchable table of integer sequences. It is used by calculating several terms
of the sequence of interest and then using the table to see if it is a known sequence.
This technique is especially valuable for combinatorial problems, where it is often
easy to count the objects for small sizes but difficult to work out the general case. (The
table lookup is not inherently a computer technique, as the Encyclopedia started as a
collection of file cards in 1964 and became a print book in 1973, but the computer has
extended its reach and made it easier to use.)
Let’s suppose you become interested in the problem of how many slices a pizza
can be cut into using n straight cuts, and you don’t realize the problem has already
been solved. After some experimentation, you decide the maximum number of pieces
for one through four cuts is 2, 4, 7, 11. You can then try looking up this sequence in
OEIS. Its top hit is the sequence A000124, which is described as the central polygonal
numbers but also as the maximal number of pieces formed when slicing a pancake with
n cuts, so you realize your problem has already been solved. Better yet, the OEIS gives
you the formula for this number of pieces, n(n + 1)/2 + 1, and numerous references
where you can find proofs and further information.
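
As a small check of the formula quoted from OEIS, this Python sketch evaluates n(n + 1)/2 + 1 and compares it with the experimentally found counts 2, 4, 7, 11. The function name is hypothetical.

```python
def max_pieces(n):
    """Maximal number of pieces from n straight cuts: n(n + 1)/2 + 1."""
    return n * (n + 1) // 2 + 1

# The values for one through four cuts, as in the text.
first_terms = [max_pieces(n) for n in range(1, 5)]
print(first_terms)
```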

Integer relation detection. The third method is integer relation detection, in which
we seek to express a given constant as a rational linear combination of known con-
stants. An ancient example is the greatest common divisor of two integers, which we
know can be expressed as such a combination: gcd(a, b) = ax + by for some integers
x, y.
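
The gcd example can be made explicit with the extended Euclidean algorithm, which returns the integers x and y along with the gcd. This is a standard textbook routine, sketched here in Python with names of our own choosing.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with g = gcd(a, b) and g == a*x + b*y."""
    x0, y0, x1, y1 = 1, 0, 0, 1
    # Invariant: a == a_orig*x0 + b_orig*y0 and b == a_orig*x1 + b_orig*y1.
    while b:
        q, a, b = a // b, b, a % b
        x0, x1 = x1, x0 - q * x1
        y0, y1 = y1, y0 - q * y1
    return a, x0, y0

g, x, y = extended_gcd(240, 46)
print(g, x, y)
```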
The general integer relation detection problem is: Given a set of n numbers c_k,
attempt to find an integer linear combination of them that is very nearly 0; that is,
find integers a_k, not all zero, such that $\sum_{k=1}^{n} a_k c_k \approx 0$. If successful, and the combination is exactly
0, this means that any of the c_k that have a nonzero coefficient can be expressed as
a rational linear combination of the others. Ferguson and Bailey’s PSLQ algorithm
[15] and the Lenstra–Lenstra–Lovász (LLL) lattice reduction algorithm [19] are two
well-known integer relation detection algorithms. Mathematica’s solver is the function
FindIntegerNullVector; the Mathematica documentation does not reveal which
algorithm this uses.
Another (slightly disguised) example of integer relation detection is the question
of whether a given number x is an algebraic number (that is, it is a zero of a polyno-
mial with integer coefficients). We can recast this question as: For some n is there an
integer relation between the numbers 1, x, . . . , x^n? In other words, are there integers
a_0, . . . , a_n, not all zero, such that a_n x^n + · · · + a_0 = 0? If we could show that a mystery number
was a zero of a particular polynomial, we would then know a lot about it, even if we
could not get an explicit representation. Take the simple example that we are given a
number x that is approximately

x ≈ 3.146264369941972342329135.

Is x algebraic? (Clearly, the right-hand side is algebraic because it is rational, but the
question is really whether x is the root of a polynomial with small coefficients.) This
can be answered using FindIntegerNullVector and a suitable number of powers of
x (say n = 10). Mathematica also has a function RootApproximant specifically for
answering whether a number is algebraic, and it says that x satisfies

\[ x^4 - 10x^2 + 1 = 0. \]

For most purposes, this would be almost as good as an explicit form. In this example,
because the equation is so simple, we can in fact find the explicit form. A few
keystrokes in Mathematica give the roots, and looking at their numerical values
shows $x = \sqrt{2} + \sqrt{3}$.
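
For readers who want to confirm the identification by hand, the following short derivation (ours, not part of the original computation) shows that $\sqrt{2} + \sqrt{3}$ does satisfy the quartic.

```latex
\[
  x = \sqrt{2} + \sqrt{3}
  \;\Longrightarrow\;
  x^2 = 5 + 2\sqrt{6}
  \;\Longrightarrow\;
  (x^2 - 5)^2 = 24
  \;\Longrightarrow\;
  x^4 - 10x^2 + 1 = 0 .
\]
```

The same squaring argument applies to all four sign choices $\pm\sqrt{2} \pm \sqrt{3}$, which are the four roots of the quartic.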
One early and spectacular example of integer relation detection is the Bailey–
Borwein–Plouffe formula for π in base 16 (see, for example, [10, Chapter 2]):

\[ \pi = \sum_{i=0}^{\infty} \frac{1}{16^i} \left( \frac{4}{8i+1} - \frac{2}{8i+4} - \frac{1}{8i+5} - \frac{1}{8i+6} \right). \]

This formula allows the calculation of any base-16 digit of π with a moderate amount
of effort and without calculating the preceding digits. The formula was hard to dis-
cover but once discovered can be proved easily using only calculus. More examples of
integer relation detection are in an article [4] published earlier in this MONTHLY.
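
A quick numerical sanity check of the formula is easy in ordinary double precision. The Python sketch below sums the first few terms and compares the result with `math.pi`; it only checks the series and is not the digit-extraction procedure itself. The function name is hypothetical.

```python
import math

def bbp_partial_sum(terms):
    """Sum of the first `terms` terms of the Bailey-Borwein-Plouffe series."""
    total = 0.0
    for i in range(terms):
        total += (4 / (8 * i + 1) - 2 / (8 * i + 4)
                  - 1 / (8 * i + 5) - 1 / (8 * i + 6)) / 16 ** i
    return total

approx = bbp_partial_sum(12)
print(approx, math.pi)
```

Each term contributes roughly one more hexadecimal digit, so a dozen terms already exhaust double precision.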

2. A RAPIDLY CONVERGING SUM. We start with an easy example. MONTHLY
problem 11853 [23] asks for the value of

\[ \sum_{n=1}^{\infty} \frac{1}{\sinh 2^n}. \]

This series converges extremely rapidly, so it is easy to get a good numerical approx-
imation: The first five terms give about 28 digits of accuracy. Mathematica gives to
15 digits that

\[ K = \sum_{n=1}^{5} \frac{1}{\sinh 2^n} = 0.313035285499331. \]

The ISC+ “standard lookup” does not identify this constant, but the “advanced lookup”
yields the transformed value 1/(1 + K) = tanh(1), in other words, K = 1/tanh 1 − 1.
We are prompted to conjecture

\[ \sum_{n=1}^{\infty} \frac{1}{\sinh 2^n} = \frac{1}{\tanh 1} - 1, \tag{1} \]

which is plausible because of the hyperbolic functions on both sides and checks out
numerically: If we sum the first 10 terms, the two sides agree to about 900 decimals.
This is strong evidence but not a proof; we still use traditional hand methods to get a
proof.
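
The 15-digit agreement can be checked in plain double precision, for instance with the Python sketch below (names are ours). It cannot reproduce the 900-decimal comparison, which needs a multiple-precision package, and `math.sinh` overflows for arguments of 2^10 and beyond, so only a few terms are used.

```python
import math

def sinh_series(terms):
    """Partial sum of 1/sinh(2^n) for n = 1..terms (keep terms <= 9 to avoid overflow)."""
    return sum(1 / math.sinh(2 ** n) for n in range(1, terms + 1))

K = sinh_series(5)
conjectured = 1 / math.tanh(1) - 1
print(K, conjectured)
```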
Because the hyperbolic functions have expressions in terms of the exponential func-
tion, we might try expanding both sides of (1) as power series in e^{−1} and see if they
match. We have on the left, using the geometric series, that

\[ \sum_{n=1}^{\infty} \frac{1}{\sinh 2^n} = \sum_{n=1}^{\infty} \frac{2}{\exp(2^n) - \exp(-2^n)} = 2 \sum_{n=1}^{\infty} \frac{\exp(-2^n)}{1 - \exp(-2 \cdot 2^n)} = 2 \sum_{n=1}^{\infty} \sum_{m=0}^{\infty} \exp(-2^n (2m+1)). \]

Some thought shows that this double series can be rearranged to $2 \sum_{k=1}^{\infty} e^{-2k}$: Each
positive integer can be written in exactly one way as the product of a power of 2 and
an odd integer, so the expression 2^n (2m + 1) = 2 · 2^{n−1} (2m + 1) in the sum takes on
each even positive integer value exactly once. Meanwhile, expanding the right-hand
side of (1) using the geometric series gives

\[ \frac{1}{\tanh 1} - 1 = \frac{e^{1} + e^{-1}}{e^{1} - e^{-1}} - 1 = \frac{2e^{-1}}{e^{1} - e^{-1}} = \frac{2e^{-2}}{1 - e^{-2}} = 2 \sum_{k=1}^{\infty} e^{-2k}, \]

so the two sides are equal, and our numerically inspired conjecture is proved.

3. A NUMBER-THEORETIC DETERMINANT. Let’s try a discrete problem
which does not require any high-precision calculation. MONTHLY problem 11179 [5]
asks: For positive integers i and j, let

\[ m_{ij} = \begin{cases} -1 & \text{if } j \mid (i+1), \\ 0 & \text{if } j \nmid (i+1), \end{cases} \]

and when n ≥ 2 let Mn be the (n − 1) × (n − 1) matrix with (i, j)-entry mij. Evaluate
det Mn. (For integers a, b the notation a | b means that a divides b, that is, b/a is an
integer.)
For example, for n = 6 we have the 5 × 5 matrix

\[ M_6 = \begin{pmatrix} -1 & -1 & 0 & 0 & 0 \\ -1 & 0 & -1 & 0 & 0 \\ -1 & -1 & 0 & -1 & 0 \\ -1 & 0 & 0 & 0 & -1 \\ -1 & -1 & -1 & 0 & 0 \end{pmatrix}. \tag{2} \]

We work out the first few terms as examples and get that for n = 2 through n = 25
the values of det Mn are

−1, −1, 0, −1, 1, −1, 0, 0, 1, −1, 0, −1, 1, 1, 0, −1, 0, −1, 0, 1, 1, −1, 0, 0.

A number theorist might recognize this sequence, but anyone can ask the OEIS about
it. One of the OEIS hints is “enter about 6 terms, starting with the second term,” so we
ask about the subsequence −1, 0, −1, 1, −1, 0. OEIS immediately replies with 1,399
matches, of which the one rated most relevant is its sequence A008683, the Möbius
function μ(n). This sequence in fact matches all 24 of our calculated values, so we
conjecture det Mn = μ(n).
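
The experiment described here is easy to repeat. The Python sketch below builds Mn from the definition, computes the determinant exactly with rational Gaussian elimination, and compares it with a trial-division Möbius function. All function names are our own, and the check is numerical evidence for n up to 25, not a proof.

```python
from fractions import Fraction

def M(n):
    """The (n-1) x (n-1) matrix with (i, j)-entry -1 if j | (i+1), else 0 (1-based)."""
    return [[-1 if (i + 1) % j == 0 else 0 for j in range(1, n)]
            for i in range(1, n)]

def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    size, result = len(a), Fraction(1)
    for col in range(size):
        pivot = next((r for r in range(col, size) if a[r][col] != 0), None)
        if pivot is None:
            return 0                       # a zero column: singular matrix
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result               # a row swap flips the sign
        result *= a[col][col]
        for r in range(col + 1, size):
            factor = a[r][col] / a[col][col]
            for c in range(col, size):
                a[r][c] -= factor * a[col][c]
    return int(result)

def mobius(n):
    """Moebius function mu(n) by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0                   # a squared prime factor
            result = -result
        p += 1
    return -result if n > 1 else result

dets = [det(M(n)) for n in range(2, 26)]
print(dets)
```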
The matrices Mn have an obvious recursive structure in the sense that mij does
not depend on n, and so the upper left (k − 1) × (k − 1) submatrix is always Mk.
The determinants have a further recursive structure: If we expand by minors along
the bottom row, the minor determinant for column j is ± det Mj. This is because, in
forming the minor, the Mj at the upper left is preserved while the −1 terms in the
superdiagonal slide into the diagonal. For example, the minor for column 3 and row
5 in (2) is

\[ \begin{vmatrix} -1 & -1 & 0 & 0 \\ -1 & 0 & 0 & 0 \\ -1 & -1 & -1 & 0 \\ -1 & 0 & 0 & -1 \end{vmatrix}. \]

In this example, M3 is in the upper left, and the matrix in the lower right has all −1
along the diagonal and all 0 above the diagonal.
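The minor thus has determinant (−1)^{n−1−j} det Mj, since the lower right block is an
(n − 1 − j) × (n − 1 − j) triangular matrix with −1 along its diagonal. If we make the convention that
det M1 = 1 for the empty matrix M1, this evaluation is still true for j = 1. Therefore,
expanding det Mn by minors along the bottom row (row n − 1, whose entry in column j
is −1 when j | n and 0 otherwise) gives us a recurrence: We have for n > 1 that

\[ \det M_n = \sum_{j<n,\ j \mid n} (-1)(-1)^{(n-1)+j}(-1)^{n-1-j} \det M_j = - \sum_{j<n,\ j \mid n} \det M_j. \]

This rearranges as

\[ \sum_{j \mid n} \det M_j = 0. \]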

The Möbius function also has a recursive structure. It satisfies a recurrence

\[ \sum_{d \mid n} \mu(d) = \begin{cases} 1, & n = 1; \\ 0, & n > 1. \end{cases} \]

(This is the first formula in the OEIS entry A008683.) This is the same recurrence
satisfied by det Mn , and det Mn and μ(n) have the same starting value of 1, so we have
by induction that det Mn = μ(n).

4. A PARAMETRIC SERIES DEFINED BY RECURRENCE. We look at a more
difficult series that depends on a parameter and whose terms are given by a recurrence
rather than explicitly. MONTHLY problem 11604 [13] asks: Given 0 ≤ a ≤ 2, let an
be the sequence defined by a1 = a and

\[ a_{n+1} = 2^n - \sqrt{2^n (2^n - a_n)} \quad \text{for } n \ge 1. \tag{3} \]

Find $\sum_{n=1}^{\infty} a_n^2$.
The sequence depends on the parameter $a$, so we are being asked for a function and
not a single number, but we will try to work out the value for particular values of $a$
and then try to guess the general result. Try the endpoints first: The case $a = 0$ is easy
but uninformative (all terms are 0). For the other endpoint, $a = 2$, the first few terms
$a_n$ are

    2.0, 2.0, 1.17157, 0.608964, 0.307436, 0.154089, 0.0770908, 0.0385512,

and we see that each term is roughly half the preceding term. In the sum $\sum a_n^2$, each
term is about 1/4 the previous term, so to get 15 decimals, 50 terms should be plenty.
To 25 decimals, Mathematica gives

\[
\sum_{n=1}^{50} a_n^2 = 9.869604401089358618834491,
\]

which does not look like anything in particular, but ISC+ immediately identifies it as
$\pi^2$. We do not know how $\pi$ got into a problem with only square roots, but we press
on. Trying some additional values, we get

    $a$    $\sum_{n=1}^{50} a_n^2$        ISC+ identifies as
    0      0                              0
    1/2    0.346622711232150957648277     $\pi^2/9 - 3/4$
    1      1.467401100272339654708623     $\pi^2/4 - 1$
    2      9.869604401089358618834491     $\pi^2$

We are not asked anything about the individual values $a_n$, and even if we had some
information, it might not help with the value of the sum. But we will make a detour
and see if $a_n$ has any interesting properties. We suspect from the $a = 2$ example above
that $2^n a_n$ goes to a nonzero finite limit. We make a wild guess that the value for $n = 50$
gives a result close to the true limit, and calculate some examples:

    $a$    $2^{50} a_{50} \approx \lim_{n\to\infty} 2^n a_n$    ISC+ identifies as
    0      0                                                    0
    1/2    1.096622711232150957648277                           $\pi^2/9$
    1      2.467401100272339654708623                           $\pi^2/4$
    2      9.869604401089358618834491                           $\pi^2$

Surprisingly, the same $\pi^2$ values turn up! We still do not know where the $\pi^2$ comes
from, but comparing the tables, we conjecture that

\[
\sum_{n=1}^{\infty} a_n^2 = \lim_{n\to\infty} 2^n a_n + \text{simple function of } a.
\]

In fact, this is easy to prove now that we have thought of it. Rearrange, square, and
rearrange the recurrence (3) to get

\[
a_{n+1}^2 = 2^{n+1} a_{n+1} - 2^n a_n .
\]

When this is summed, the right-hand side telescopes, and we get

\[
\sum_{n=1}^{\infty} a_n^2 = a_1^2 + \sum_{n=1}^{\infty} a_{n+1}^2 = a_1^2 + \lim_{n\to\infty} 2^n a_n - 2a_1 = \lim_{n\to\infty} 2^n a_n + a^2 - 2a,
\]

so the “simple function” is $a^2 - 2a$, and this gives the right answer for the four
examples we tried. (We are assuming temporarily that $\lim 2^n a_n$ exists; this will be
proved later when we evaluate it.)

So we have reduced the sum problem to an asymptotic problem for the general term,
which should be easier. The recurrence (3) has a lot of $2^n$ in it, and it should be easier
to think about if we reparameterize to get rid of them. If we define $b_n = a_n/2^n$, we get

\[
2^{n+1} b_{n+1} = 2^n - \sqrt{2^n (2^n - 2^n b_n)} = 2^n - 2^n \sqrt{1 - b_n},
\]

which rearranges to

\[
b_{n+1} = \frac{1 - \sqrt{1 - b_n}}{2} \quad \text{with } b_1 = a/2, \text{ so } 0 \le b_1 \le 1. \tag{4}
\]

This is much simpler: Not only is the $2^n$ gone, but each $b$ value depends only on
the previous value and not on $n$. That is, it is an iteration, $b_{n+1} = f(b_n)$ where
$f(x) = \tfrac12 \left( 1 - \sqrt{1 - x} \right)$. This is attractive not only because it is simpler but also
because there is a systematic (but complicated) theory to get asymptotic values of
sequences defined by an iteration (see [14, Chapter 8]).

In our case, rather than apply the systematic theory, we will make an observation
that leads to a quick solution. Remembering the $\pi^2$ and the square roots, we might be
reminded of the half-angle formulas for the trigonometric functions, of which the most
common are

\[
\cos \frac{\theta}{2} = \sqrt{\frac{1 + \cos\theta}{2}} \quad \text{and} \quad \sin \frac{\theta}{2} = \sqrt{\frac{1 - \cos\theta}{2}} .
\]

Neither of these has exactly the same form as our recurrence, but if we square the
second one, we can get a half-angle formula for $\sin^2$ that does have the right format,
namely

\[
\sin^2 \frac{\theta}{2} = \frac{1 - \cos\theta}{2} = \frac{1 - \sqrt{1 - \sin^2\theta}}{2} .
\]

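Therefore, we define

\[
\theta_1 = \arcsin \sqrt{b_1} \quad \text{and} \quad \theta_{n+1} = \tfrac12 \theta_n
\]

so that $b_n = \sin^2 \theta_n$ is the solution of the recurrence (4), and

\[
a_n = 2^n \sin^2 \frac{\theta_1}{2^{n-1}}
\]

is the solution to the recurrence (3). Then using $\lim_{x\to\infty} x^2 \sin^2(c/x) = c^2$ we calculate

\[
\lim_{n\to\infty} 2^n a_n = \lim_{n\to\infty} 2^{2n} \sin^2 \frac{\theta_1}{2^{n-1}} = 4\theta_1^2 = 4 \arcsin^2 \sqrt{\frac{a}{2}} .
\]

The final formula is then

\[
\sum_{n=1}^{\infty} a_n^2 = 4 \arcsin^2 \sqrt{\frac{a}{2}} + a^2 - 2a,
\]

which matches the calculated values.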
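
The agreement between the recurrence and the closed form can be reproduced with a short Python sketch. It uses the standard `decimal` module with an assumed working precision of 80 digits, because the subtraction in recurrence (3) cancels many leading digits; the function names are illustrative and this is a numerical check, not a proof.

```python
from decimal import Decimal, getcontext
import math

getcontext().prec = 80  # generous precision: recurrence (3) cancels many digits

def partial_sum(a, terms=50):
    """Sum of a_n^2 for n = 1..terms, with a_1 = a and recurrence (3)."""
    a_n = Decimal(a)
    total = Decimal(0)
    for n in range(1, terms + 1):
        total += a_n * a_n
        p = Decimal(2) ** n
        a_n = p - (p * (p - a_n)).sqrt()   # a_{n+1} = 2^n - sqrt(2^n (2^n - a_n))
    return total

def closed_form(a):
    """Value 4*arcsin^2(sqrt(a/2)) + a^2 - 2a, in double precision."""
    return 4 * math.asin(math.sqrt(a / 2)) ** 2 + a * a - 2 * a

for a in (0, 0.5, 1, 2):
    print(a, partial_sum(a), closed_form(a))
```

The closed form is evaluated only in double precision, so the comparison is meaningful to roughly 15 digits, which is the same rule of thumb used for the lookups.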

5. A STIRLING SERIES. Even with today’s fast computers, it is often difficult to
get enough digits to feed to the lookup program (recall that our rule of thumb is that
we need 15 digits). Traditional methods of numerical analysis are still very useful,
particularly methods for transforming series and integrals and methods for accelerating
convergence of series. In this example, we will approximate a slowly convergent series
with a combination of other series that converge just as slowly but for which we know
the sum explicitly. This method is sometimes called Kummer’s transformation of series
(see, for example, [17, p. 247]).

MONTHLY problem 10832 [18] asks for an explicit form for the sum

\[
\sum_{k=1}^{\infty} \left( \frac{k^k}{k!\,e^k} - \frac{1}{\sqrt{2\pi k}} \right). \tag{5}
\]

Today, Mathematica can identify the sum immediately and directly, but back in 2000,
when this problem was posed, Mathematica was not as smart. Let’s see how experimental
math can help us identify the sum.

The sum converges slowly (the general term is about $1/k^{3/2}$), so brute force does not
work. There is a very precise asymptotic formula (an extension of Stirling’s formula;
see, for example, [24, p. 140, formula 5.11.1]) for $\ln k!$, which begins

\[
\ln k! = \left( k + \tfrac12 \right) \ln k - k + \tfrac12 \ln(2\pi) + \frac{1}{12k} - \frac{1}{360k^3} + \frac{1}{1260k^5} - \cdots .
\]

We therefore have

\[
\begin{aligned}
\frac{k^k}{k!\,e^k} &= \frac{1}{\sqrt{2\pi k}} \exp\left( -\frac{1}{12k} + \frac{1}{360k^3} - \frac{1}{1260k^5} + \cdots \right) \\
&= \frac{1}{\sqrt{2\pi k}} \left( 1 - \frac{1}{12k} + \frac{1}{288k^2} + \frac{139}{51840k^3} - \frac{571}{2488320k^4} + \cdots \right).
\end{aligned}
\]

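The first term of this will cancel with the other term in the sum (5), and we can use
as many of the remaining terms as we think is useful. Using three terms reduces the
required numerical work to a reasonable level for a computer. Write

\[
a_k = -\frac{1}{12k} + \frac{1}{288k^2} + \frac{139}{51840k^3}
\]

so that we have

\begin{align*}
\sum_{k=1}^{\infty} \left( \frac{k^k}{k!\,e^k} - \frac{1}{\sqrt{2\pi k}} \right)
&= \sum_{k=1}^{\infty} \left( \frac{k^k}{k!\,e^k} - \frac{1}{\sqrt{2\pi k}} (1 + a_k) \right) + \frac{1}{\sqrt{2\pi}} \sum_{k=1}^{\infty} \frac{a_k}{\sqrt{k}} \\
&= \sum_{k=1}^{\infty} \left( \frac{k^k}{k!\,e^k} - \frac{1}{\sqrt{2\pi k}} (1 + a_k) \right) \tag{6} \\
&\quad + \frac{1}{\sqrt{2\pi}} \left( -\frac{1}{12} \zeta(\tfrac32) + \frac{1}{288} \zeta(\tfrac52) + \frac{139}{51840} \zeta(\tfrac72) \right), \tag{7}
\end{align*}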

where $\zeta$ is the Riemann zeta function. The zeta series converge slowly too, but a lot is
known about them and how to calculate them more quickly, and we can let Mathematica
figure them for us. To 25 decimals, we get

\[
(7) = -0.08378540362877196918178047.
\]

We estimate (6) by truncating it at some point and summing numerically. The
general term is about the first omitted term from the asymptotic expansion, i.e.,
$571/(\sqrt{2\pi k} \cdot 2488320 k^4)$, so truncating the series at $N$ introduces an error of about

\[
\frac{571}{2488320 \sqrt{2\pi}} \sum_{k=N}^{\infty} \frac{1}{k^{9/2}} \approx \frac{571}{(7/2)\, 2488320 \sqrt{2\pi}} \cdot \frac{1}{N^{7/2}} < 10^{-4} \frac{1}{N^{7/2}} .
\]

Therefore, if we take $N = 10^4$, we will get about 18 decimals of accuracy. (Remember
that we are doing heuristics. If we misestimate the error, there is no logical problem,
but we may misidentify the number.) To 25 digits, Mathematica gives the truncated
sum as −0.0002841050988840270181249700. This takes about 30 seconds, which is
reasonable; if it had taken too long, we could have sped up the convergence some more
by using more terms in the asymptotic expansion and fewer terms in the numerical part.
The whole sum is therefore approximately

\[
(5) \approx -0.0840695087276559961999.
\]

The ISC+ identifies this as

\[
-\frac{2}{3} - \frac{\zeta(\tfrac12)}{\sqrt{2\pi}} = -0.08406950872765599646148950,
\]

which matches to 18 decimals.
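
The same splitting into (6) and (7) can be tried in ordinary double precision. The Python sketch below is an illustration under stated assumptions: it truncates (6) at $N = 200$ instead of $10^4$, and it evaluates the zeta values with a short Euler-Maclaurin formula (a swapped-in technique, since the text lets Mathematica compute them). With these choices one should expect agreement to roughly ten digits, not the 18 obtained in the text.

```python
import math

def zeta_em(s, M=1000):
    """Euler-Maclaurin estimate of zeta(s) for real s != 1 (double precision)."""
    head = math.fsum(k ** -s for k in range(1, M))
    tail = (M ** (1 - s) / (s - 1) + 0.5 * M ** -s
            + s * M ** (-s - 1) / 12
            - s * (s + 1) * (s + 2) * M ** (-s - 3) / 720)
    return head + tail

def a_corr(k):
    """The three correction terms a_k taken from the asymptotic expansion."""
    return -1 / (12 * k) + 1 / (288 * k ** 2) + 139 / (51840 * k ** 3)

def stirling_ratio(k):
    """k^k / (k! e^k), computed through logarithms to avoid overflow."""
    return math.exp(k * math.log(k) - k - math.lgamma(k + 1))

def kummer_estimate(N=200):
    """Estimate of the sum (5): truncated part (6) plus the zeta part (7)."""
    numeric = math.fsum(
        stirling_ratio(k) - (1 + a_corr(k)) / math.sqrt(2 * math.pi * k)
        for k in range(1, N + 1))
    known = (-zeta_em(1.5) / 12 + zeta_em(2.5) / 288
             + 139 * zeta_em(3.5) / 51840) / math.sqrt(2 * math.pi)
    return numeric + known

candidate = -2 / 3 - zeta_em(0.5) / math.sqrt(2 * math.pi)
print(kummer_estimate(), candidate)
```

The Euler-Maclaurin formula is also valid at $s = 1/2$, where the defining series diverges, which is why the same helper can evaluate the conjectured closed form.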

This result is plausible and encouraging; looking at the second term in the original
problem, we have that the $1/\sqrt{2\pi}$ matches, and the $1/\sqrt{k}$ “sort of” matches the
$\zeta(\tfrac12)$, although we know the series does not converge and is not really $\zeta(\tfrac12)$. Having
detected the zeta function, we might look through books and find expressions such as

\[
\zeta(s) = s \int_0^{\infty} \frac{\lfloor x \rfloor - x}{x^{s+1}} \, dx \quad (0 < \operatorname{Re} s < 1),
\]

which is one way to analytically continue the zeta function to the left of the line
$\operatorname{Re} s = 1$ (see, for example, [26], p. 14). For $s = \tfrac12$, we rearrange and evaluate this
to get

\[
\lim_{n\to\infty} \left( \sum_{k=1}^{n} \frac{1}{\sqrt{k}} - 2\sqrt{n} \right) = \zeta(\tfrac12).
\]

This explains the second part of the answer, so we would now need to show

\[
\lim_{n\to\infty} \left( \sum_{k=1}^{n} \frac{k^k}{k!\,e^k} - \frac{2}{\sqrt{2\pi}} \sqrt{n} \right) = -\frac{2}{3}.
\]

The sum here is also a limiting case of a known function, in this case the Lambert
$W$-function (see, for example, [24], section 4.13, p. 111), whose power series expansion
is

\[
W(z) = \sum_{k=1}^{\infty} (-1)^{k-1} \frac{k^{k-1}}{k!} z^k .
\]

Formally, we want to study the derivative at z = −1/e, but this point is on the circle
of convergence and the series does not converge there, so we have to work inside the
circle and take a limit. This can be done by appealing to properties of this function;
the details are in the published solution [18]. Another experimental math treatment of
this problem is in [10, pp. 81–85].


6. AN ALTERNATING SUM OF SQUARES OF ALTERNATING SUMS. This
example also converges extremely slowly, and even after a standard transformation to
speed up the convergence, it is still too slow to be useful. We will add a heuristic trick
to reduce the work to a manageable level.

MONTHLY problem 11682 [16] asks for a closed form for

\[
\sum_{n=0}^{\infty} (-1)^n \left( \sum_{k=1}^{\infty} \frac{(-1)^{k-1}}{n+k} \right)^2 . \tag{8}
\]

This is an intimidating-looking problem, and even getting a numerical estimate is
challenging. The inner series is the tail of $\ln 2 = \sum (-1)^{k-1}/k$, which converges extremely
slowly. The tail is about $\pm 1/(2n)$, so the outer sum converges slowly, too.

We can speed up the convergence of the inner sum by Euler’s transformation (see,
for example, [17, p. 244]). Write $\Delta a_n = a_{n+1} - a_n$ for the forward difference operator
and $\Delta^k a_n$ for the composition of this operator $k$ times. Euler’s transformation states

\[
\sum_{k=0}^{\infty} (-1)^k a_k = \sum_{n=0}^{\infty} \frac{(-1)^n \Delta^n a_0}{2^{n+1}} .
\]

Our particular example is worked out in [17, p. 246, Example 1], where we find

\[
\sum_{k=1}^{\infty} \frac{(-1)^{k-1}}{n+k} = \sum_{k=0}^{\infty} \frac{1}{2^{k+1} (n+k+1) \binom{n+k}{k}} . \tag{9}
\]

The right-hand side converges quickly, and to get 15 decimals, we only need about 50
terms.

The squared inner sum is about $1/(4n^2)$, and the outer sum is an alternating series, so we
would need about $10^7$ or $10^8$ terms to get 15 decimals, and each of those has an inner
sum of 50 terms. That is a lot of terms, and we need a better way.

We will attempt to get a good value with much less work by using the following
observation. We know that the partial sums of an alternating series lie alternately above
and below the series value (and that the error is less than the first omitted term).
Empirically, it is further true that for series with slowly and smoothly decreasing terms, the
series value is almost exactly halfway between two successive partial sums (or what is
the same, the series value is almost exactly the partial sum plus half the first omitted
term). To take a simple example, $\ln 2 = 0.693147$. The first 100 terms of the series
$\ln 2 = \sum_{k=1}^{\infty} (-1)^{k-1}/k$ give a poor approximation of 0.688172, but adding half the
next term gives the much better approximation 0.693123. (This heuristic observation
has been worked out in more generality and detail as the method of “repeated averaging”;
see, for example, [11, p. 72] and [12, p. 278].)
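
The $\ln 2$ example is easy to reproduce. The following Python sketch (function name illustrative) sums the first 100 terms of the alternating harmonic series and then adds half of the first omitted term, as described above.

```python
import math

def alternating_ln2(terms):
    """Partial sum of 1 - 1/2 + 1/3 - ... with the given number of terms."""
    return math.fsum((-1) ** (k - 1) / k for k in range(1, terms + 1))

partial = alternating_ln2(100)
next_term = (-1) ** 100 / 101            # first omitted term, k = 101
improved = partial + 0.5 * next_term

print(f"{partial:.6f}")      # 0.688172
print(f"{improved:.6f}")     # 0.693123
print(f"{math.log(2):.6f}")  # 0.693147
```

The correction costs one extra term and moves the error from about $5 \times 10^{-3}$ to about $2.5 \times 10^{-5}$.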

Our method is to truncate the outer sum of (8) after 100,000 terms, and estimate
each term (and the first omitted term) using Euler’s transformation (9) with 50 terms.
That is, write

\[
d_n = \sum_{k=0}^{50} \frac{1}{2^{k+1} (n+k+1) \binom{n+k}{k}},
\]

and sum all the included terms and add half the next term, giving to 25 digits

\[
\begin{aligned}
(8) &\approx \sum_{n=0}^{10^5 - 1} (-1)^n d_n^2 + \frac{1}{2} d_{100000}^2 \\
&= 0.411233516699556597589303 + 0.000000000012499875000313 \\
&= 0.411233516712056472589616.
\end{aligned}
\]

This takes about 40 seconds in Mathematica, which is reasonable. The ISC+ (with
advanced lookup) identifies this as

\[
\frac{\pi^2}{24} = 0.4112335167120566091181038,
\]

which matches the calculated value to 15 decimals.
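
A scaled-down version of this computation runs quickly in Python. The sketch below is an assumption-laden illustration: it works in double precision, truncates the outer sum after 2000 terms instead of 100,000, and builds $d_n$ from the ratio of consecutive terms, $(k+1)/(2(n+k+2))$, which follows from the form of (9). With only 2000 outer terms, agreement with $\pi^2/24$ to about ten decimals is the most one should expect.

```python
import math

def inner_sum(n, terms=51):
    """Euler-transformed inner sum (9), truncated to k = 0..terms-1."""
    term = 1.0 / (2 * (n + 1))                 # k = 0 term
    total = 0.0
    for k in range(terms):
        total += term
        term *= (k + 1) / (2 * (n + k + 2))    # ratio of consecutive terms
    return total

def estimate(outer_terms):
    """Truncated outer sum of (8) plus half of the first omitted term."""
    partial = math.fsum((-1) ** n * inner_sum(n) ** 2
                        for n in range(outer_terms))
    first_omitted = (-1) ** outer_terms * inner_sum(outer_terms) ** 2
    return partial + 0.5 * first_omitted

print(estimate(2000), math.pi ** 2 / 24)
```

For $n = 0$ the inner sum is the full series for $\ln 2$, which gives a quick sanity check on the term ratio.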


Where does $\pi^2/24$ come from? The $\pi^2$ makes us think of $\sum_{n=1}^{\infty} \frac{1}{n^2} = \zeta(2) = \frac{\pi^2}{6}$,
especially because of the terms in the outer sum being very nearly $1/(4n^2)$. However,
naively applying this estimate to the sum gives
$\sum_{n=1}^{\infty} (-1)^{n-1} \frac{1}{4n^2} = \frac18 \zeta(2) = \frac{\pi^2}{48}$, only
half the calculated value, and it is not clear how $\zeta(2)$ might be generated.

However, thinking about the double (or triple) series and rummaging through zeta
function lore might make us think of Tom Apostol’s evaluation [1] of $\zeta(2)$ using the
double integral

\[
\zeta(2) = \int_0^1 \int_0^1 \frac{1}{1 - xy} \, dx \, dy.
\]

(This method appeared earlier as an exercise in LeVeque [20, Section 6-10, exercise 6,
p. 122], and later Apostol independently rediscovered it and popularized it.) Apostol
then used an extremely clever change of variables to evaluate the integral. It is easy to
turn our sum into a double integral, too, and it looks a little like Apostol’s:

\[
\begin{aligned}
\sum_{n=0}^{\infty} (-1)^n \left( \sum_{k=1}^{\infty} \frac{(-1)^{k-1}}{n+k} \right)^2
&= \sum_{n=0}^{\infty} (-1)^n \left( \int_0^1 \sum_{k=1}^{\infty} (-1)^{k-1} x^{n+k-1} \, dx \right)^2 \\
&= \sum_{n=0}^{\infty} (-1)^n \left( \int_0^1 \frac{x^n}{1+x} \, dx \right)^2 \\
&= \sum_{n=0}^{\infty} \int_0^1 \int_0^1 \frac{(-1)^n x^n y^n}{(1+x)(1+y)} \, dx \, dy \\
&= \int_0^1 \int_0^1 \frac{1}{(1+x)(1+y)(1+xy)} \, dx \, dy.
\end{aligned}
\tag{10}
\]

Somewhat miraculously, Mathematica knows the value of this integral: $\pi^2/24$, which
confirms our guess. If we trust Mathematica, our job is done!

If we do not trust Mathematica that much, we can work the integral by hand. Mathematica
can help with this too because it also knows the value of the indefinite integral:

\[
\begin{aligned}
\iint \frac{1}{(1+x)(1+y)(1+xy)} \, dx \, dy
= \frac{1}{2} \Bigl( & -\operatorname{Li}_2\!\left( \frac{xy+1}{1-y} \right) + \operatorname{Li}_2\!\left( \frac{xy+1}{y+1} \right) - \ln\!\left( \frac{(x+1)y}{y-1} \right) \ln(xy+1) \\
& + \ln\!\left( -\frac{(x-1)y}{y+1} \right) \ln(xy+1) + 2 \tanh^{-1}(x) \ln(y+1) \Bigr),
\end{aligned}
\]

where we need the dilogarithm function

\[
\operatorname{Li}_2(x) = \sum_{n=1}^{\infty} x^n/n^2 = -\int_0^x \frac{\ln(1-t)}{t} \, dt. \tag{11}
\]

We can verify the indefinite integral by hand by differentiating, but it is easier to work
forward now that we have the hint of using $\operatorname{Li}_2$. We expand the integrand of (10) in
partial fractions twice to get

\[
\begin{aligned}
\int_0^1 \int_0^1 \frac{1}{(1+x)(1+y)(1+xy)} \, dx \, dy
&= \int_0^1 \frac{1}{1-x^2} \int_0^1 \left( \frac{1}{1+y} - \frac{x}{1+xy} \right) dy \, dx \\
&= \int_0^1 \frac{1}{1-x^2} (\ln 2 - \ln(1+x)) \, dx \\
&= \frac{1}{2} \int_0^1 \frac{1}{1+x} (\ln 2 - \ln(1+x)) \, dx + \frac{1}{2} \int_0^1 \frac{1}{1-x} (\ln 2 - \ln(1+x)) \, dx.
\end{aligned}
\]

The first integral is easily evaluated as $\tfrac12 \ln^2 2$. To evaluate the second integral, we make
the change of variables $x = 1 - 2t$ to get

\[
\begin{aligned}
\int_0^1 \frac{1}{1-x} (\ln 2 - \ln(1+x)) \, dx &= \int_{1/2}^{0} \frac{1}{2t} (-\ln(1-t)) \, (-2 \, dt) \\
&= \operatorname{Li}_2(\tfrac12) - \operatorname{Li}_2(0) = \frac{\pi^2}{12} - \frac{1}{2} \ln^2 2,
\end{aligned}
\]

where we have used the value $\operatorname{Li}_2(0) = 0$ from the definition (11), and the value
$\operatorname{Li}_2(\tfrac12) = \pi^2/12 - \tfrac12 \ln^2 2$ that comes from setting $x = \tfrac12$ in the functional equation
(see [24, p. 611, formula 25.12.6]):

\[
\operatorname{Li}_2(x) + \operatorname{Li}_2(1-x) = \frac{1}{6} \pi^2 - (\ln x)(\ln(1-x)) \quad \text{for } 0 < x < 1.
\]

Combining this with the first integral, the $\ln^2 2$ terms cancel and we are left with
$(10) = \pi^2/24$.
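
Both ingredients of this hand calculation can be checked numerically. The Python sketch below is a rough double-precision check, not part of the proof: it integrates the single-variable integrand $(\ln 2 - \ln(1+x))/(1 - x^2)$ with a composite midpoint rule (chosen here because it never evaluates the integrand at $x = 1$, where the formula is 0/0), and it sums the series (11) at $x = \tfrac12$. The function names and the number of subintervals are illustrative choices.

```python
import math

def integrand(x):
    """(ln 2 - ln(1+x)) / (1 - x^2), the integrand after integrating out y."""
    return math.log(2 / (1 + x)) / (1 - x * x)

def midpoint_rule(f, a, b, n):
    """Composite midpoint rule; never evaluates f at the endpoints."""
    h = (b - a) / n
    return h * math.fsum(f(a + (i + 0.5) * h) for i in range(n))

def li2_series(x, terms=200):
    """Dilogarithm from its power series (11), adequate for |x| <= 1/2."""
    return math.fsum(x ** n / n ** 2 for n in range(1, terms + 1))

integral = midpoint_rule(integrand, 0.0, 1.0, 20000)
print(integral, math.pi ** 2 / 24)
print(li2_series(0.5), math.pi ** 2 / 12 - 0.5 * math.log(2) ** 2)
```

The singularity at $x = 1$ is removable (the integrand tends to $1/4$ there), so the midpoint rule converges at its usual second-order rate.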

7. A RAPIDLY CONVERGING PRODUCT. MONTHLY problem 11677 [25] asks
for an evaluation of

\[
P = \prod_{n=1}^{\infty} \left( 1 + 2 e^{-n\pi\sqrt{3}} \cosh\left( n\pi/\sqrt{3} \right) \right).
\]

Just as in Section 2, this expression converges very rapidly, and we only need a few
terms to get a good approximation. If we write $a_n = e^{-n\pi\sqrt{3}} \cosh\left( n\pi/\sqrt{3} \right)$ (so that we
seek $P = \prod_{1}^{\infty} (1 + 2a_n)$), then

\[
\ln a_n \approx -n\pi \left( \sqrt{3} - 1/\sqrt{3} \right) \approx -3.6276n \approx -1.57545n \ln 10,
\]

so we get about 1.5 significant digits for each term we take in the product.
Taking the first 15 terms and calculating to 25 digits, we get

\[
P \approx 1.028032541689576770462884.
\]

But now we hit a snag: We ask ISC+ about this, and it says it found nothing, both in the
standard lookup and the advanced lookup. (We asked on March 18, 2016; the database
is updated continually, and ISC+ may someday be able to identify this constant.)
Because the item we seek is a product, we wonder if we would have better luck
working with its logarithm, $\ln P = \sum_{n=1}^{\infty} \ln(1 + 2a_n)$. Taking the first 15 terms of this
and calculating to 25 digits, we get

\[
\ln P \approx 0.02764682187200888558353500.
\]

This has the same problem, though: ISC+ cannot find it.

The ISC+ lookups almost always work for MONTHLY problems, perhaps because
those usually have neat answers, but this is an exception, and we look at other
methods to identify the number. Testing whether it is an algebraic number using
RootApproximant does not produce any useful answers. It does misidentify the
25-digit version of $\ln P$ as

\[
\frac{5657351 - \sqrt{29079344023205}}{9578834},
\]

which agrees to 24 digits but is not correct. We will try integer relation detection.

There are two challenges to using integer relation detection. The first is that often
the desired number must be calculated to a very high precision, sometimes to hundreds
of digits. For our example, this is not much of a problem because the product
converges so rapidly. The other problem is guessing which constants should go into a
linear combination to get the desired number. These guesses are based on experience
and similar expressions for which we know the constants. In MONTHLY problems we
are not given the context, and we may not have any experience with the particular
expressions, so guessing the constants may be especially challenging.

We are going to work with $\ln P$ again. We do not have much idea what constants to
use, but we will guess that we should include the constants that appear explicitly in the
product, namely $\pi$, $\sqrt{3}$, and $\pi\sqrt{3}$ and their logarithms, $\ln \pi$ and $\ln 3$. (Do not use both
$\pi\sqrt{3}$ and $\pi/\sqrt{3}$ because one is a rational multiple of the other, and do not use both $\ln 3$
and $\ln \sqrt{3}$, for the same reason.) A good rule of thumb when looking for a logarithm
is to throw in the logs of small primes because expressions often have small integer
factors in addition to the transcendental factors. We add ln 2, ln 5, ln 7, and ln 11 to
the mix. We will also bump up the precision of our approximation by calculating ln P
with 100 terms and 100 digits of precision.
Somewhat miraculously, this very loose procedure produces a neat answer when
FindIntegerNullVector tells us that (within the precision of the calculations)

\[
36 \cdot \ln P - 2 \cdot \pi\sqrt{3} + 9 \cdot \ln 3 = 0,
\]

in other words

\[
P = \frac{e^{\pi\sqrt{3}/18}}{\sqrt[4]{3}} . \tag{12}
\]

We test this against the product with 200 terms and find they agree to about 317 digits,
so we conjecture that this is the correct value of the product.
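
The conjectured relation is easy to test in double precision. The Python sketch below does not perform integer relation detection (the text uses Mathematica’s FindIntegerNullVector for that); it only evaluates the 15-term product and checks that the reported relation and the closed form (12) hold to machine accuracy. The function names are illustrative.

```python
import math

SQRT3 = math.sqrt(3)

def a(n):
    """a_n = exp(-n*pi*sqrt(3)) * cosh(n*pi/sqrt(3))."""
    return math.exp(-n * math.pi * SQRT3) * math.cosh(n * math.pi / SQRT3)

def product(terms=15):
    """Partial product of (1 + 2 a_n) for n = 1..terms."""
    result = 1.0
    for n in range(1, terms + 1):
        result *= 1 + 2 * a(n)
    return result

P = product()
closed_form = math.exp(math.pi * SQRT3 / 18) / 3 ** 0.25
relation = 36 * math.log(P) - 2 * math.pi * SQRT3 + 9 * math.log(3)
print(P, closed_form, relation)
```

A residual near $10^{-15}$ in `relation` is consistent with the identity but says nothing beyond about 15 digits; the 317-digit comparison in the text is the real evidence.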

Unfortunately, the explicit answer does not seem to point to any method of proof.
One oddity that might catch our eye is the 18th root; that is, both the product and the final
answer involve $\exp(\pi\sqrt{3})$, but in the final answer it appears to the 1/18
power. If we know a lot about special functions, this might remind us of the modular
functions and in particular of the Dedekind eta function, which includes a 1/12 power
and which appears in a discriminant formula to the 24th power:

\[
\eta(\tau) = e^{\pi i \tau/12} \prod_{n=1}^{\infty} \left( 1 - e^{2\pi i n \tau} \right), \quad \operatorname{Im} \tau > 0.
\]

This turns out to be the key observation, as it is possible to express the given product
in terms of a ratio of eta function values, and a functional equation allows us to express
the ratio as $1/\sqrt[4]{3}$. The complete solution is in [25].

ACKNOWLEDGMENT. Many thanks to the referees for their thorough reviews and helpful comments.

REFERENCES

1. T. M. Apostol, A proof that Euler missed: Evaluating ζ (2) the easy way, Math. Intelligencer 5 no. 3
(1983) 59–60, http://dx.doi.org/10.1007/BF03026576.
2. D. H. Bailey, J. M. Borwein, Experimental Math Website, http://experimentalmath.info/.
3. D. H. Bailey, J. M. Borwein, N. J. Calkin, R. Girgensohn, D. R. Luke, V. H. Moll, Experimental
Mathematics in Action. A K Peters, Wellesley, MA, 2007.
4. D. H. Bailey, J. M. Borwein, V. Kapoor, E. W. Weisstein, Ten problems in experimental mathematics,
Amer. Math. Monthly, 113 (2006) 481–509, http://dx.doi.org/10.2307/27641975.
5. D. Beckwith, L. Zhou, A determinant by Möbius inversion: 11179. Amer. Math. Monthly, 114 (2007)
550, http://www.jstor.org/stable/27642263.
6. J. M. Borwein, D. H. Bailey, Mathematics by Experiment: Plausible Reasoning in the 21st Century.
Second ed. A K Peters, Wellesley, MA, 2008.
7. J. M. Borwein, D. H. Bailey, R. Girgensohn, Experimentation in Mathematics: Computational Paths to
Discovery. A K Peters, Natick, MA, 2004.
8. P. Borwein, J. Borwein, S. Plouffe, Inverse Symbolic Calculator Plus at University of Newcastle
(Australia), maintained at the University of Newcastle, https://isc.carma.newcastle.edu.au/.
9. J. M. Borwein, P. B. Borwein, Strange series and high precision fraud. Amer. Math. Monthly, 99 (1992)
622–640, http://dx.doi.org/10.2307/2324993.
10. J. M. Borwein, K. Devlin, The Computer as Crucible: An Introduction to Experimental Mathematics.
A K Peters, Wellesley, MA, 2009.
11. G. Dahlquist, Å. Björck, Numerical Methods. Dover Publications, Mineola, NY, 2003.
12. G. Dahlquist, Å. Björck, Numerical Methods in Scientific Computing. Vol. I. Society for Industrial and
Applied Mathematics, Philadelphia, 2008.
13. P. P. Dályay, H. Widmer, Evaluate a series: 11604, Amer. Math. Monthly, 120 (2013) 476,
http://dx.doi.org/10.4169/amer.math.monthly.120.05.469.
14. N. G. de Bruijn, Asymptotic Methods in Analysis. Corrected reprint of the Third (1970) ed.
Dover Publications, New York, 1981.
15. H. R. P. Ferguson, D. H. Bailey, S. Arno, Analysis of PSLQ, an integer relation finding algorithm. Math.
Comp., 68 (1999) 351–369, http://dx.doi.org/10.1090/S0025-5718-99-00995-3.
16. O. Furdui, B. Bradie, An alternating sum of squares of alternating sums: 11682, Amer. Math. Monthly,
122 (2015) 78–79, http://dx.doi.org/10.4169/amer.math.monthly.122.01.75.
17. K. Knopp, Theory and Application of Infinite Series. Second ed. Dover Publications, New York, 1990.
18. D. E. Knuth, C. C. Rousseau, A Stirling series: 10832, Amer. Math. Monthly, 108 (2001) 877–878,
http://dx.doi.org/10.2307/2695574.
19. A. K. Lenstra, H. W. Lenstra, Jr., L. Lovász, Factoring polynomials with rational coefficients, Math. Ann.,
261 (1982) 515–534, http://dx.doi.org/10.1007/BF01457454.
20. W. J. LeVeque, Topics in Number Theory. Vol. I. Addison-Wesley Publishing Co., Reading, MA, 1956.
21. I. Nemes, M. Petkovšek, H. Wilf, D. Zeilberger, How to do MONTHLY problems with your computer,
Amer. Math. Monthly, 104 (1997) 505–519, http://dx.doi.org/10.2307/2975078.
22. OEIS Foundation Inc., The On-Line Encyclopedia of Integer Sequences (2011), http://oeis.org/.
23. H. Ohtsuka, Problem proposed: 11853, Amer. Math. Monthly, 122 (2015) 700,
http://dx.doi.org/10.4169/amer.math.monthly.122.7.700.
24. NIST Handbook of Mathematical Functions, Eds. F. W. J. Olver, D. W. Lozier, R. F. Boisvert, C. W. Clark.
Cambridge Univ. Press, New York, 2010. Also online at http://dlmf.nist.gov.
25. A. Stadler, R. Boukharfane, Dedekind η function disguised: 11677. Amer. Math. Monthly, 121 (2014)
951–952, http://dx.doi.org/10.4169/amer.math.monthly.121.10.946.
26. E. C. Titchmarsh, The Theory of the Riemann Zeta Function. Second ed. Ed. and with a preface by
D. R. Heath-Brown. The Clarendon Press, Oxford Univ. Press, New York, 1986.

ALLEN STENGER is a retired software developer and current math hobbyist. He received a B.S. in
mathematics from Emory and an M.A. in mathematics from Penn State. He is an editor of the Missouri Journal
of Mathematical Sciences. His mathematical interests are number theory and classical analysis. This paper is
expanded from a talk he gave at the 2013 MAA Southwestern Section meeting in Socorro, New Mexico.
2892 95th St., Boulder, CO 80301-4916
StenBiz@gmail.com



Characterizing Additive Systems
Michael Maltenfort

Abstract. An additive system is a collection of sets that gives a unique way to represent either
all nonnegative integers, or all nonnegative integers up to some maximum. A structure theorem
of de Bruijn gives a certain form for an additive system of infinite size. This form is not, in
general, unique. We improve de Bruijn’s theorem by finding a unique form for an additive
system of arbitrary size. Our proof gives a concrete construction that allows us to test easily
whether a collection of sets is an additive system. We also show how to determine most of the
structure of an additive system if we are only given its union.

1. INTRODUCTION. An additive system of infinite size is a mathematical object
that gives, for every nonnegative integer n, a unique set of nonnegative integers that
add to n. In 1956, N. G. de Bruijn published a structure theorem [2] that said that every
additive system can be written in a certain form. However, this form usually is not
unique. In this MONTHLY, Mel Nathanson recently gave a new proof of the theorem
[4]. Our first main result is to strengthen the theorem by showing that an additive
system can be written uniquely in a form that is only a bit different. Before we can
discuss our other results, though, we must say more about what an additive system is.

In his article, Nathanson points out that his results use nothing harder than long
division. With this in mind, we begin with a detailed introduction that we hope is
readable by the widest possible audience. We outline our important ideas and results while
postponing until later the rigorous definitions and proofs. Throughout this introduction,
“number” always means a whole number that is not negative.

Consider the infinite collection of sets from Figure 1. We are interested in adding
finitely many numbers from those listed, taking no more than one from the same set.
For example, 2 + 40 = 42, or 6 + 400 + 9000 = 9406. On the other hand, we do not
consider 4 + 8 + 700 = 712, since 4 and 8 are in the same set, but we can get the sum
712 as 2 + 10 + 700. Selecting no numbers is permitted and gives a sum of zero.

1, 2, 3, 4, 5, 6, 7, 8, 9
10, 20, 30, 40, 50, 60, 70, 80, 90
100, 200, 300, 400, 500, 600, 700, 800, 900
1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000

Figure 1. An additive system built on decimal numbers.

The key point is that any number can be written as this type of a sum, and this can
be done in one and only one way. This is the mathematical object we study in this
paper.

Definition. An additive system of infinite size is a collection of sets such that every
number can be written, in one and only one way, as a sum of numbers from the collection,
with no two numbers selected from the same set.
http://dx.doi.org/10.4169/amer.math.monthly.124.2.132
MSC: Primary 11B13, Secondary 11A63

An important variation is when the sums only go up to a certain maximum. In the
next definition, note that we are looking at a finite collection of finite sets.

Definition. An additive system of size z is a collection of sets such that every number
less than z can be written, in one and only one way, as a sum of numbers from the
collection, with no two numbers selected from the same set. We also require that no
number greater than or equal to z can be written as such a sum.

In Figure 1, each selected number determines one nonzero digit of the sum, which is
why the collection is an additive system (of infinite size). If instead we limit ourselves
to the first three sets from Figure 1, this would be an additive system of size 1000
because the sums we get go from 0 to 999. (We justify 1000 as the size, rather than
999, by noting that the numbers from 0 to 999 are the 1000 smallest numbers.)
Each set from Figure 1 consists of all positive numbers with a single designated
nonzero digit. The collection is an additive system because every number can be written
uniquely in the decimal (base 10) system. A binary (base 2) variation would be the
collection of the sets {1}, {2}, {4}, {8}, {16}, {32}, . . . . This is another additive system
of infinite size. If we took, say, only the first four of these sets, we would instead get
an additive system of size 16.
Now consider the collection of five sets from Figure 2. It may not be as obvious
that this is another additive system of infinite size, because it is something of a hybrid
between base 5 and binary. Things become clearer if we consider a cashier using U.S.
currency. The cashier has five drawers of coins, each of which has an endless supply
of a certain type of coin. The first drawer contains pennies (each worth 1 cent),
the next contains nickels (each worth 5 cents), and the other three drawers contain
quarters, half-dollars, and dollar coins (each worth 25 cents, 50 cents, and 100 cents,
respectively). If the cashier wishes to get a specified amount of money as efficiently
as possible, the value of the pennies is an element of the first set, {1, 2, 3, 4}, the value
of the nickels is in the second set, {5, 10, 15, 20}, and so on. (We measure value in
cents here.) For example, to make $27.43, the cashier uses $27 in dollar coins, $0.25
in quarters, $0.15 in nickels, and $0.03 in pennies. In terms of our additive system,
2743 = 3 + 15 + 25 + 2700. This is why the collection from Figure 2 is another additive
system of infinite size. Taking only the first three sets would give an additive
system of size 50 because our cashier could uniquely create any value from 0 to 49.

1, 2, 3, 4
5, 10, 15, 20
25
50
100, 200, 300, 400, 500, . . .
Figure 2. An additive system built on U.S. coins.

Exercise. Consider taking the cashier above and using dimes (each worth 10 cents)
instead of half-dollars. The values used by each coin type, in increasing value, would
be according to the collection from Figure 3. Show that this collection is not an additive
system by showing that 26 can be written in two ways.
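
For a finite collection of finite sets, the definition of an additive system of size z can be tested by brute force. The Python sketch below is an illustration with hypothetical function names: it counts, for every reachable sum, the number of ways to form it using at most one element from each set. The infinite dollar-coin set is left out of the dimes collection, which is a truncation made here so that the computation is finite.

```python
from collections import Counter

def representation_counts(sets):
    """Map each reachable sum to the number of ways to form it,
    choosing at most one element from each set."""
    counts = Counter({0: 1})            # the empty selection gives 0
    for s in sets:
        updated = Counter(counts)       # option: choose nothing from s
        for total, ways in counts.items():
            for x in s:
                updated[total + x] += ways
        counts = updated
    return counts

def additive_system_size(sets):
    """Return z if the finite collection is an additive system of size z, else None."""
    counts = representation_counts(sets)
    z = len(counts)
    # Exactly the sums 0..z-1 must occur, each in exactly one way.
    if all(counts.get(v) == 1 for v in range(z)):
        return z
    return None

decimal_sets = [[d * 10 ** p for d in range(1, 10)] for p in range(3)]
coins = [[1, 2, 3, 4], [5, 10, 15, 20], [25]]
with_dimes = [[1, 2, 3, 4], [5], [10, 20], [25, 50, 75]]

print(additive_system_size(decimal_sets))      # 1000
print(additive_system_size(coins))             # 50
print(additive_system_size(with_dimes))        # None
print(representation_counts(with_dimes)[26])   # 2
```

The two representations of 26 found in the last line are 1 + 25 and 1 + 5 + 20.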

1, 2, 3, 4
5
10, 20
25, 50, 75
100, 200, 300, 400, 500, . . .
Figure 3. Not an additive system (because we included dimes).

In Figures 1 and 2, each set consists of consecutive positive multiples of its smallest
element. Furthermore, while the smallest element of the first set is 1, the smallest
element of every other set is the first missing multiple not found in the previous set
of the list. What we are saying, more or less, is that these additive systems generalize
place values. Not all additive systems follow this pattern, but those that do are called
British number systems. (Recall that our full definitions will come later.) The collection
from Figure 3 is not a British number system since 25 is not the first multiple of 10 that
fails to be in {10, 20}. Of course, the collection from Figure 3 is not even an additive
system. There do exist additive systems that are not British number systems. Let us
look at a couple of examples.
Figure 4 involves some unusual notation. The idea here is that we have two sets. The
first consists of all positive numbers with all “even” digits zero. The second consists of
all positive numbers with all “odd” digits zero. More precisely, the numbers indicated
by · · · 0 □ 0 □ 0 □ 0 □ are those whose 10s digit, 1000s digit, 100 000s digit, etc. are
all zero; each “□” can be replaced by any digit. So this set contains 408 and 901 and
20 407. With · · · □ 0 □ 0 □ 0 □ 0, on the other hand, we have numbers in which the
1s digit, the 100s digit, the 10 000s digit, etc. are all zero. So 50 and 8090 are in the
second set, as is 100 020. We allow any □ to be replaced by zero, as long as they are
not all replaced by zero, since we specifically exclude zero from each of the two sets.
(We instead could have described the first set as consisting of the “full word” nonzero
integers that are “compatible” with the “partial word” · · · 0 □ 0 □ 0 □ 0 □; the second set
is based on the partial word · · · □ 0 □ 0 □ 0 □ 0. See, e.g., [1] for a complete explanation
of this terminology and its notation, which is close to the notation we use.)

· · · 0 □ 0 □ 0 □ 0 □ 0 □
· · · □ 0 □ 0 □ 0 □ 0 □ 0
Figure 4. An additive system that is not a British number system; either even or odd digits are zero.

Although 4567 is in neither set, we can write it as a sum with one number from each
set: 4567 = 507 + 4060, since 507 is in the first set and 4060 in the second. Similarly,
7 672 091 = 7 070 001 + 602 090. These are the only ways to write this type of sum
for 4567 or 7 672 091. So, reasoning as we did for Figure 1, this is an additive system
of infinite size, because the digits tell us the one and only one way to write a sum
for a given number. The collection is not a British number system because neither set
consists only of consecutive multiples of its smallest element. Also note that neither
of the sets from Figure 4 is, by itself, an additive system.
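The digit argument above can be checked mechanically. The following Python sketch is an illustration only; the function name `split_alternating_digits` is ours and does not appear in the paper. It sends the 1s, 100s, 10 000s, ... digits of a number to the first set of Figure 4 and the remaining digits to the second set.

```python
def split_alternating_digits(n):
    """Split n into (a, b) with a + b == n.

    a keeps the 1s, 100s, 10 000s, ... digits of n (first set of Figure 4);
    b keeps the 10s, 1000s, ... digits (second set).
    A value of 0 means that no element of that set is used.
    """
    a = b = 0
    place = 1
    goes_to_first = True
    while n > 0:
        n, digit = divmod(n, 10)
        if goes_to_first:
            a += digit * place
        else:
            b += digit * place
        goes_to_first = not goes_to_first
        place *= 10
    return a, b

print(split_alternating_digits(4567))     # (507, 4060)
print(split_alternating_digits(7672091))  # (7070001, 602090)
```

Because each digit has exactly one possible destination, the split is forced, which is the uniqueness claimed in the text.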
The collection from Figure 5 is another example. Reasoning as with the collec-
tion from Figure 4, this collection is not a British number system. To see that it is
an additive system, compare it to the additive system from Figure 2. What if the
cashier of that example has only three drawers? The first drawer has pennies and half-
dollars, the second has nickels, and the third has quarters and dollar coins. When the
cashier continues to produce specified amounts of money as efficiently as possible,
each set from Figure 5 gives the value that can come out of one drawer. For example,
whereas with Figure 2, 794 is the sum 4 + 15 + 25 + 50 + 700, with Figure 5, we
have 794 = 54 + 15 + 725. That is, whereas the cashier originally created $7.94 with
$0.04, $0.15, $0.25, $0.50, and $7.00 from the five drawers, now from the three draw-
ers $0.54, $0.15, and $7.25 are used.

1, 2, 3, 4, 50, 51, 52, 53, 54
5, 10, 15, 20
25, 100, 125, 200, 225, 300, 325, 400, . . .
Figure 5. A mixed quotient of the additive system from Figure 2.

We can always form a new additive system from another by “associating” some
sets together. This is how we created Figure 5 from Figure 2. To be more specific,
associating sets means replacing them with a single set. We create the new set by
taking the positive sums that we can get by using only the replaced sets. For example,
the first set from Figure 5 consists of the sums using the first set (pennies) and fourth
set (half-dollars) from Figure 2. Figure 4 is another example of an association. We
created it from Figure 1 by associating every other set together. In general, when we
associate sets together, we call the new additive system a quotient.
Some quotients are more useful than others, and this turns out to be a crucial obser-
vation. If we begin with a British number system, then it is not so useful to associate
sets that are adjacent. For example, if from Figure 2 we associated quarters and half-
dollars, two adjacent denominations, then the sets {25} and {50} would be replaced by
{25, 50, 75}. The new quotient is not especially interesting because the new collection
is just another British number system, namely the one in which our cashier has no
half-dollars. The useful quotients are those in which we only associate sets that are not
adjacent, and we define a term for this. We say such a quotient is mixed. For example,
Figures 4 and 5 are mixed quotients of the additive systems from Figures 1 and 2,
respectively.
We can now state our fundamental structure theorem.

Theorem 1. Every additive system can be uniquely written as a mixed quotient.

As mentioned above, this is similar to a theorem of de Bruijn [2], but ours goes
further because de Bruijn’s form is not unique. We get uniqueness by using slightly
different definitions. As in Figure 2, our British number systems can include an infi-
nite set, provided that it is the final one. On the other hand, in de Bruijn’s definition,
every British number system is a collection of finite sets, such as in Figure 1. Our
uniqueness also depends on limiting our quotients to ones that are mixed, something
that was unnecessary for de Bruijn’s result. Also note that de Bruijn’s theorem is only
for additive systems of infinite size. For additive systems of finite size, Theorem 1
is essentially present in Munagi’s work [3], which, in turn, depends on de Bruijn’s
theorem.
After giving his key lemma, de Bruijn states, “The theorem easily follows by
repeated applications of the following lemma.” More work is needed, however, to
prove the theorem from this lemma, and this is what Nathanson recently provided
in [4]. Nathanson’s proof, however, may be hard to apply to a particular additive
system, such as the collection from Figure 5. We avoid this problem by giving our
construction first, on its own. Then we prove that we have constructed the unique form
for the original additive system. It is also worth noting that our proof is shorter than
Nathanson’s.



We can illustrate the ideas of our construction by considering Figure 5. We saw
above that it is a mixed quotient of the additive system from Figure 2. By Theorem 1,
this is the only such way we can get Figure 5. Starting from Figure 5, let us show how
our construction finds the collection from Figure 2.
In Figure 5, one set contains 1, 2, 3, and 4, but not 5. This means {1, 2, 3, 4} is
the first set we create. Now look for 5, the next number that would have appeared in
this set, and its multiples. Another set from Figure 5 contains 5, 10, 15, and 20, but
not 25. This means {5, 10, 15, 20} is the second set. The next multiple of 5 missing
from this set is 25. Looking in Figure 5 for 25 and its multiples, we find a set with
25 but not 50, so {25} is the next desired set. We then continue, looking next for
50 and its multiples. This gives the British number system we need, i.e., the one from
Figure 2. The association we need for our mixed quotient is then not hard to determine:
for example, {1, 2, 3, 4} and {50} are associated because they are both contained in a
single set from Figure 5. The difficult part of Theorem 1, which we tackle later, is
showing that the constructed mixed quotient is actually equal to the original additive
system.
This construction also can be applied to an arbitrary collection of disjoint sets of
positive numbers. Is an arbitrary collection an additive system? The answer is “yes”
if and only if the constructed additive system equals the original collection. This is
another significant step forward, as we believe such a test is new. It is far more
efficient than checking the definition of an additive system directly.
Following our structure theorem, we provide several equivalent ways to recognize
whether the sets of an additive system can be reordered so that it is a British number
system. Then we move to our final result, which is rather surprising: most of the struc-
ture of an additive system can be determined from only the union of its sets. Here is a
limited version of what we prove.

Theorem 2. Suppose two additive systems have equal unions. If either system has
infinite size or if the two systems have equal size, then the additive systems are equal.

The hypotheses for Theorem 2 may seem a bit arbitrary, and indeed, we will later
weaken them. To see that different additive systems can have the same union, consider
the following.

Exercise. Find two distinct British number systems with union {1, 2, 3, 4, 5, 10, 15}.
Why does this not contradict Theorem 2?

As we will see in Section 6, given an additive system, or even just its union, there is
an easy way to determine if another additive system has the same union. It turns out,
however, that there never will be a third additive system with that union.
Before we begin the next section, let us discuss a couple of ways in which our
approach diverges from the work that others have done. First, our definition of an
additive system (that shortly we make more precise) is actually new, because our col-
lections contain only positive numbers. In earlier definitions, such sets always contain
zero. Rather than considering the sums as we described them above, those definitions
consider taking exactly one element from each set, with finitely many nonzero. Our
equivalent definition omits all zeros because we find it easier to discuss a sum t + u
rather than something like t + u + 0 + 0 + 0 + 0 + · · · , where we must include a zero
for each set we are not using.
Second, we should mention our choices for terminology. Unfortunately, the vocabulary
already used in this subject has been inconsistent. Of the terms we use, some
already exist and others we invented. Our goal has been to select terms to be both
appropriately descriptive and concise.
Our use of additive system comes from Nathanson [4], who also uses unique rep-
resentation system. Such a collection is called a number system by de Bruijn [2] and
a complementing system of subsets by Munagi [3]. Munagi uses the term usual to
describe the additive systems that de Bruijn and Nathanson call British number sys-
tems. As mentioned above, our British number systems are related but somewhat dif-
ferent. Finally, our quotient is what de Bruijn and Munagi call a degeneration, but
Nathanson calls a contraction. We regret introducing yet another term, but feel that
the benefits are worth it. It is common for equivalence relations (on the sets of a col-
lection, in this case) to give rise to a quotient, one for each equivalence class. That
is exactly what happens here. For readers who know about such things, our defini-
tion indeed meets the category-theoretic requirements of a quotient, provided that we
define a morphism between two additive systems to be a map between the collections
in which each set of one collection goes to a set that contains it.
We now start again from the beginning, adding in the details we have omitted so
far.

2. ADDITIVE SYSTEMS AND BRITISH NUMBER SYSTEMS. Throughout
this paper, every number and variable is a nonnegative integer or, when explicitly
permitted, $\infty$ (i.e., $+\infty$).

Definition. Suppose $\mathcal{A}$ is a collection of disjoint sets, each of which is a nonempty
set of positive integers. We define an $\mathcal{A}$-sampling to be a finite partial transversal of $\mathcal{A}$
(i.e., a finite set that contains at most one element from each set of $\mathcal{A}$) and an $\mathcal{A}$-sum
to be the sum of the elements of an $\mathcal{A}$-sampling. By convention, the empty set, which
is always an $\mathcal{A}$-sampling, has sum zero. Let $z$ be a positive integer or $\infty$. We say $\mathcal{A}$ is
an additive system that has size $z$ if the following hold.
• The $\mathcal{A}$-sums are exactly the nonnegative integers less than $z$ (i.e., the first $z$
nonnegative integers, if $z < \infty$).
• Each $\mathcal{A}$-sum is the sum of only one $\mathcal{A}$-sampling.

This definition does not distinguish between ordered collections, e.g., $(A_i)_{0 \le i < I}$,
and unordered collections, e.g., $\{A_s\}_{s \in S}$.
The empty collection is the unique additive system of size 1. The reader may wish to
verify that there are unique additive systems of size 2 and 3, but two additive systems
of size 4.
We can rephrase this definition as follows. A is an additive system of size z if and
only if the operation of summation is a bijective map from the set of A-samplings to
the set of nonnegative integers less than z.
We write |A| for the size of an additive system A.
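For small finite collections the definition can be tested by brute force. The Python sketch below is only an illustration under stated assumptions: the function names are ours, and the input is assumed to be a finite collection of disjoint finite sets of positive integers. It lists the sums of all samplings and checks that they are exactly 0, 1, ..., z − 1 with no repeats.

```python
from itertools import product

def sampling_sums(collection):
    """Sorted sums of all samplings of a finite collection of finite sets.

    Choosing 0 from a set stands for "use no element of this set".
    Assumes the sets are disjoint sets of positive integers.
    """
    choices = [[0] + sorted(s) for s in collection]
    return sorted(sum(pick) for pick in product(*choices))

def additive_system_size(collection):
    """Return the size z if the collection is an additive system, else None."""
    sums = sampling_sums(collection)
    return len(sums) if sums == list(range(len(sums))) else None

print(additive_system_size([]))             # 1
print(additive_system_size([{1}, {2}]))     # 4
print(additive_system_size([{1, 2, 3}]))    # 4
print(additive_system_size([{1, 2}, {4}]))  # None
```

The two collections of size 4 shown here are the ones the reader is invited to find above. The cost of this check grows with the number of samplings, so it is practical only for small examples.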
Here is one way in which we can create an additive system from a smaller one.

Proposition 3. Suppose $\mathcal{A}$ is an additive system with $|\mathcal{A}| = z < \infty$, and suppose
$z' = \infty$ or $z' = sz$ for an integer $s \ge 2$. Let $B$ be the set of positive multiples of $z$ that are
less than $z'$, and let $\mathcal{A}' = \mathcal{A} \cup \{B\}$. Then we have the following.
1. $\mathcal{A}'$ is an additive system of size $z'$.
2. If $\sigma$ is an $\mathcal{A}'$-sampling with sum $n \ge z$, then $\max \sigma$ is in $B$ and specifically is the
largest element of $\cup\mathcal{A}'$ that is less than or equal to $n$.



Proof. For $0 \le t < z$, let $\sigma_t$ be the unique $\mathcal{A}$-sampling with sum $t$. Of course, each
$\sigma_t$ is also an $\mathcal{A}'$-sampling. What are the other $\mathcal{A}'$-samplings, and what are their sums?
The sums of $\mathcal{A}'$-samplings of the form $\sigma_t \cup \{z\}$ are $z, z + 1, \ldots, 2z - 1$; the sums of
$\sigma_t \cup \{2z\}$ are $2z, 2z + 1, \ldots, 3z - 1$; and so on. Both parts of our proposition then
follow immediately.

Here is what happens if we begin with the empty collection and use Proposition 3
I times.

Definition. Let $\mathcal{B} = (B_i)_{0 \le i < I}$ be an ordered collection of $I$ nonempty sets, where
$0 \le I \le \infty$. We say $\mathcal{B}$ is a British number system if all of the following hold:
• For all $0 \le i < I$, each $B_i$ consists of consecutive positive multiples of $\min B_i$.
• $\min B_0 = 1$.
• For all $1 \le i < I$, $\min B_i$ is the smallest positive multiple of $\min B_{i-1}$ that is not in
$B_{i-1}$.

In a British number system $\mathcal{B} = (B_i)_{0 \le i < I}$, the set $B_{i-1}$ is always finite for $1 \le i < I$,
because otherwise $\min B_i$ could not satisfy the final condition above. For $I$ finite,
however, it is possible to have $B_{I-1}$ infinite, as in Figure 2.
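The three conditions translate directly into a check. The Python sketch below is a hedged illustration: the name `is_british` is ours, and it handles only a finite ordered list of finite sets, so the case of an infinite final set is outside its scope.

```python
def is_british(ordered_sets):
    """Check the definition for a finite ordered list of finite sets."""
    expected_min = 1                          # min B_0 must be 1
    for s in ordered_sets:
        elems = sorted(s)
        if not elems or elems[0] != expected_min:
            return False
        b = elems[0]
        # B_i must be exactly b, 2b, ..., kb (consecutive multiples of its minimum)
        if elems != [b * k for k in range(1, len(elems) + 1)]:
            return False
        expected_min = b * (len(elems) + 1)   # first multiple of b missing from B_i
    return True

print(is_british([{1, 2, 3, 4}, {5, 10, 15, 20}, {25}, {50}]))  # True
print(is_british([{1, 2, 3, 4}, {5, 10, 15, 20}, {50}]))        # False
print(is_british([{1, 3}]))                                     # False
```

The second example fails because the first multiple of 5 missing from {5, 10, 15, 20} is 25, not 50.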

Proposition 4. Let $\mathcal{B} = (B_i)_{0 \le i < I}$ be a British number system. Define $b_I$ to be $\infty$ if
$\cup\mathcal{B}$ is infinite, 1 if $\mathcal{B}$ is the empty collection, and otherwise the smallest positive
multiple of $\min B_{I-1}$ that is not in $B_{I-1}$.
1. Then $\mathcal{B}$ is an additive system, and $|\mathcal{B}| = b_I$.
2. Furthermore, if $\sigma$ is a nonempty $\mathcal{B}$-sampling with sum $n$, then $\max \sigma$ is the
largest element of $\cup\mathcal{B}$ that is less than or equal to $n$.

Note that $|\mathcal{B}| = \infty$ if and only if either $I = \infty$, or $I < \infty$ and $B_{I-1}$ is an infinite set.
It also turns out that British number systems are characterized by (2) of Proposition 4,
up to ordering. See Theorem 8.

Proof. Define $b_i = \min B_i$ for $0 \le i < I$. Note that we always have $b_0 = 1$, and that
if $I = \infty$, then $\lim_{i \to \infty} b_i = \infty = b_I$.
For $0 \le j \le I$ and $j < \infty$, define $\mathcal{B}^j = (B_i)_{0 \le i < j}$. Then $\mathcal{B}^j$ is an additive system
of size $b_j$. This is trivial for $j = 0$, and then follows by induction using Proposition 3.
This completes (1) if $I < \infty$.
For the rest of (1) and also for (2), consider a nonnegative integer $n < b_I$. Fix $j$ as
small as possible such that $n < b_j$. Notice that $j < \infty$. Since the $\mathcal{B}$-samplings with
sum $n$ are exactly the $\mathcal{B}^j$-samplings with sum $n$, exactly one $\mathcal{B}$-sampling $\sigma$ has sum
$n$. This finishes (1) if $I = \infty$. Finally, for (2), suppose $\sigma$ is nonempty, so that $j \ge 1$
and $n \ge b_{j-1}$. Apply Proposition 3 with $\mathcal{A} = \mathcal{B}^{j-1}$ and $B = B_{j-1}$ to get that $\max \sigma$
is the largest element of $\cup\mathcal{B}^j$ that is less than or equal to $n$. Since every element of
$\cup\mathcal{B} \setminus \cup\mathcal{B}^j$ is larger than $n$, $\max \sigma$ is also the largest element of $\cup\mathcal{B}$ that is less than
or equal to $n$.

Given a British number system $\mathcal{B}$ and a positive integer $n < |\mathcal{B}|$, the following
“greedy” algorithm allows us to find the $\mathcal{B}$-sampling that has sum $n$. Say the desired
$\mathcal{B}$-sampling is $\{s_1, s_2, \ldots, s_t\}$ with $s_1 > s_2 > \cdots > s_t$. By (2) of Proposition 4, $s_1$ is
the largest element of $\cup\mathcal{B}$ that is less than or equal to $n$. Having found $s_1, \ldots, s_i$, either
$s_1 + \cdots + s_i = n$, in which case $i = t$, or we apply the same result to see that $s_{i+1}$ is
the largest element of $\cup\mathcal{B}$ that is less than or equal to $n - (s_1 + \cdots + s_i)$.
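The greedy procedure just described can be sketched in a few lines of Python. This is an illustration rather than part of the paper: the name `greedy_sampling` is ours, the union is passed as a finite set, and the dollar set of Figure 2 is cut off at 900 (an assumption made only to keep the example finite).

```python
from bisect import bisect_right

def greedy_sampling(union, n):
    """Return the sampling with sum n, largest element first.

    `union` is the finite set of all elements of a British number system,
    and n is assumed to be smaller than the size of that system.
    """
    elems = sorted(union)
    sampling = []
    while n > 0:
        idx = bisect_right(elems, n)   # elems[idx - 1] is the largest element <= n
        s = elems[idx - 1]
        sampling.append(s)
        n -= s
    return sampling

# Figure 2 as described by the cashier example, with dollars truncated at 900.
figure2_union = {1, 2, 3, 4, 5, 10, 15, 20, 25, 50} | set(range(100, 1000, 100))
print(greedy_sampling(figure2_union, 794))   # [700, 50, 25, 15, 4]
```

The output matches the decomposition 794 = 4 + 15 + 25 + 50 + 700 from the introduction.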

3. QUOTIENTS. As we recall from the introduction, we can use quotients to create
additive systems that are not British number systems.

Definition. Let $\mathcal{A}$ be an additive system and $\sim$ an equivalence relation on $\mathcal{A}$. For any
set $A$ in $\mathcal{A}$, we write $[A]$ for its equivalence class and $\langle A \rangle$ for the set of nonzero $[A]$-sums.
Since $A_1 \sim A_2$ if and only if $\langle A_1 \rangle = \langle A_2 \rangle$, there is one distinct $\langle A \rangle$ for each
equivalence class of $\mathcal{A}$. We define $\mathcal{A}/{\sim}$, the quotient of $\mathcal{A}$ by $\sim$, to be the collection of
all the distinct $\langle A \rangle$. (The equivalence relation will be clear from the context. If it were
necessary to emphasize it, one could use $[A]_\sim$ and $\langle A \rangle_\sim$.)

Proposition 5. Given an equivalence relation on an additive system, its quotient is an
additive system of the same size.

Proof. Let $\sim$ be an equivalence relation on an additive system $\mathcal{A}$. The $\mathcal{A}$-samplings
and the $(\mathcal{A}/{\sim})$-samplings correspond one-to-one. It may be easier to illustrate
this first with an example, so let $\mathcal{A}$ and $\mathcal{A}/{\sim}$ be the additive systems from Figures
2 and 5, respectively. As in the introduction, 794 is the sum of the $\mathcal{A}$-sampling
$\{4, 15, 25, 50, 700\}$ and also the sum of the $(\mathcal{A}/{\sim})$-sampling $\{54, 15, 725\}$. Why?
Because we take 4 and 50 and replace them with 54, since in the quotient we count
pennies with half-dollars, and similarly for 25 and 700. In general, beginning with an
$\mathcal{A}$-sampling, group together elements of the sampling that are in sets equivalent under
$\sim$. Then replace each group by the sum of the group’s elements. We let the reader
verify that this is a one-to-one correspondence. Since the sums of corresponding
samplings are equal, the proposition follows.
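To make the definition of a quotient concrete, here is a small Python sketch. It is an illustration under stated assumptions: the function names are ours, an equivalence relation is represented by a list of index classes, and the dollar set of Figure 2 is truncated at 300 to keep everything finite.

```python
from itertools import product

def nonzero_sums(sets):
    """All nonzero sums that use at most one element from each of the given sets."""
    choices = [[0] + sorted(s) for s in sets]
    return {sum(pick) for pick in product(*choices)} - {0}

def quotient(collection, classes):
    """Replace each equivalence class (a list of indices) by its set of nonzero sums."""
    return [nonzero_sums([collection[i] for i in cls]) for cls in classes]

# Figure 2 (dollars truncated at 300): pennies, nickels, quarters, half-dollars, dollars.
figure2 = [{1, 2, 3, 4}, {5, 10, 15, 20}, {25}, {50}, {100, 200, 300}]
# Associate pennies with half-dollars, and quarters with dollars.
figure5 = quotient(figure2, [[0, 3], [1], [2, 4]])
for s in figure5:
    print(sorted(s))
# [1, 2, 3, 4, 50, 51, 52, 53, 54]
# [5, 10, 15, 20]
# [25, 100, 125, 200, 225, 300, 325]
```

Up to the truncation, the three printed sets are the sets of Figure 5.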

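Note that if $\mathcal{C}$ is the quotient $\mathcal{A}/{\sim}$, then $\sim$ can be recovered from $\mathcal{A}$ and $\mathcal{C}$. To
see this, let us show that $A_1 \sim A_2$ if and only if there exists a set of $\mathcal{C}$ that contains
$A_1 \cup A_2$. This follows from three facts. First, $A_1 \sim A_2$ is equivalent to $\langle A_1 \rangle = \langle A_2 \rangle$.
Second, $A_i \subseteq \langle A_i \rangle$, and finally, $\langle A_i \rangle$ is the only set of $\mathcal{C}$ that contains $A_i$ (since sets of
$\mathcal{C}$ are disjoint by Proposition 5).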

Definition. Suppose $\mathcal{B} = (B_i)_{0 \le i < I}$ is a British number system. We say an equivalence
relation $\sim$ on $\mathcal{B}$ is mixing if adjacent sets are not equivalent, i.e., $B_i \sim B_j$
implies $j \ne i \pm 1$. A mixed quotient is an additive system written as $\mathcal{B}/{\sim}$, where $\mathcal{B}$
is a British number system and $\sim$ is a mixing equivalence relation.

If $\mathcal{A}$ equals the mixed quotient $\mathcal{B}/{\sim}$, then $\sim$ is uniquely determined by $\mathcal{A}$ and $\mathcal{B}$,
as we showed above for arbitrary quotients.

4. THE PROOF OF THE STRUCTURE THEOREM. In this section we prove
Theorem 1.
Although we wish to write an additive system A uniquely as a mixed quotient
B/ ∼, the construction of neither B nor ∼ depends on A being an additive system. So
to begin, let A be any collection of disjoint sets of positive integers.
From $\mathcal{A}$ we define a British number system $\mathcal{B} = (B_i)_{0 \le i < I}$ and a mixing equivalence
relation $\sim$ on $\mathcal{B}$. As we construct $\mathcal{B}$, $b_j$ will be the size of the British number
system $(B_i)_{0 \le i < j}$ and will also be $\min B_j$, should it turn out that $j < I$. Define $b_0 = 1$.
Suppose we have defined $b_0, b_1, \ldots, b_j$ and $B_0, B_1, \ldots, B_{j-1}$. If $b_j$ is in no set of
$\mathcal{A}$, then we define $I = j$ and our construction of $\mathcal{B}$ is complete. Otherwise, let $A_j$
be the unique set of $\mathcal{A}$ containing $b_j$. If all positive multiples of $b_j$ are in $A_j$, then let
$I = j + 1$ and $B_j$ be the set of all positive multiples of $b_j$; again our construction of $\mathcal{B}$
is complete. Otherwise let $b_{j+1}$ be the smallest positive multiple of $b_j$ that is not in $A_j$,
let $B_j$ be the positive multiples of $b_j$ that are less than $b_{j+1}$, and repeat this procedure.
If the process never stops, let $I = \infty$.
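For finite collections, the construction above can be sketched as follows. This Python code is an illustration only: the name `construct_british` is ours, and the input is assumed to be a finite list of disjoint finite sets of positive integers, so the two infinite cases of the construction never arise.

```python
def construct_british(collection):
    """Return (blocks, size): blocks[j] = (B_j, index of A_j), and size = b_I."""
    collection = [set(s) for s in collection]
    blocks = []
    b = 1                                      # b_0 = 1
    while True:
        # Find the unique set A_j containing b_j, if there is one.
        owner = next((k for k, s in enumerate(collection) if b in s), None)
        if owner is None:                      # b_j is in no set, so I = j
            return blocks, b
        block, multiple = [], b
        while multiple in collection[owner]:   # consecutive multiples of b_j in A_j
            block.append(multiple)
            multiple += b
        blocks.append((block, owner))
        b = multiple                           # b_{j+1}

# Figure 5 with its infinite set truncated (an assumption to keep the input finite).
figure5 = [{1, 2, 3, 4, 50, 51, 52, 53, 54},
           {5, 10, 15, 20},
           {25, 100, 125, 200, 225, 300, 325}]
blocks, size = construct_british(figure5)
print(blocks)
# [([1, 2, 3, 4], 0), ([5, 10, 15, 20], 1), ([25], 2), ([50], 0), ([100, 200, 300], 2)]
print(size)   # 400
```

The recovered blocks are the sets of Figure 2 (truncated), and the recorded indices show which blocks come from the same set of Figure 5.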
It is easily checked that this construction produces a unique British number system
$\mathcal{B} = (B_i)_{0 \le i < I}$ and unique (not necessarily distinct) sets $A_i$ in $\mathcal{A}$, $0 \le i < I$, such that
the following two conditions hold.
1. For $0 \le i < I$, $B_i$ is the largest set of consecutive multiples of $\min B_i$, beginning
with $\min B_i$, that is contained in $A_i$.
2. $|\mathcal{B}|$ is not an element of any set of $\mathcal{A}$.
Note that when $\mathcal{B} = (B_i)_{0 \le i < I}$ and the sets $A_i$ satisfy the conditions above, then for all
$1 \le i < I$, we have $A_{i-1} \ne A_i$, because otherwise $B_{i-1} \cup \{b_i\}$ would be a set of consecutive
multiples of $b_{i-1}$ that begins with $b_{i-1}$ and is contained in $A_{i-1}$, contradicting
(1) for $B_{i-1}$.
For the uniqueness in our theorem, suppose $\mathcal{A}$ equals a mixed quotient $\mathcal{B}/{\sim}$. If we
define $A_i$ to be $\langle B_i \rangle$, then $\mathcal{B}$ and the $A_i$ meet the conditions above, so $\mathcal{B}$ is uniquely
determined. The uniqueness of $\sim$ then follows from our observation at the end of
Section 3.
We now return to $\mathcal{A}$ being an arbitrary collection of disjoint sets of positive integers.
Let $\mathcal{B} = (B_i)_{0 \le i < I}$ and the sets $A_i$ satisfy (1) and (2) above. Define $B_i \sim B_j$ by $A_i = A_j$.
Clearly $\sim$ is an equivalence relation on $\mathcal{B}$, and by the remark following (2), it is
mixing. To complete our theorem, we prove

$\mathcal{A}$ is an additive system if and only if $\mathcal{A} = \mathcal{B}/{\sim}$.

This statement, once proved, also allows us to determine whether the arbitrary col-
lection A is an additive system.  
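The resulting test can be sketched in Python for finite collections. This is our illustration, not the paper's code: the name `is_additive_system` is hypothetical, and the input is assumed to be a finite list of finite sets of positive integers. The sketch runs the construction, groups the blocks $B_j$ by the set $A_j$ they came from, forms the quotient, and compares it with the original collection.

```python
from itertools import product

def is_additive_system(collection):
    """Test a finite collection by comparing it with its constructed mixed quotient."""
    collection = [set(s) for s in collection]
    groups = {}                                # index of A_j -> list of blocks B_j
    b = 1
    while True:
        owner = next((k for k, s in enumerate(collection) if b in s), None)
        if owner is None:
            break
        block, multiple = [], b
        while multiple in collection[owner]:
            block.append(multiple)
            multiple += b
        groups.setdefault(owner, []).append(block)
        b = multiple
    # Every set of the collection must arise from at least one block.
    if set(groups) != set(range(len(collection))):
        return False
    # Each set must equal the nonzero sums of the blocks associated with it.
    for k, blocks in groups.items():
        sums = {sum(pick) for pick in product(*[[0] + blk for blk in blocks])} - {0}
        if sums != collection[k]:
            return False
    return True

figure5 = [{1, 2, 3, 4, 50, 51, 52, 53, 54},
           {5, 10, 15, 20},
           {25, 100, 125, 200, 225, 300, 325}]
print(is_additive_system(figure5))           # True
print(is_additive_system([{1, 2}, {4}]))     # False
```

Here Figure 5 is truncated so that its third set is finite, which is an assumption made for the example.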
One direction is trivial, so assume $\mathcal{A}$ is an additive system. Since $\langle B_i \rangle = \langle B_j \rangle$ if and
only if $A_i = A_j$, $\langle B_i \rangle \mapsto A_i$ gives a well-defined, one-to-one map from $\mathcal{B}/{\sim}$ to $\mathcal{A}$. We
must show that this map is onto and that $\langle B_i \rangle = A_i$ for all $i$. Being additive systems,
$\mathcal{A}$ and $\mathcal{B}/{\sim}$ are each collections of disjoint nonempty sets of positive integers. Thus,
it is enough to show that the following holds for all positive integers $n$.

Either there exists $0 \le i < I$ such that $n$ is in both $A_i$ and $\langle B_i \rangle$, or $n$ is in no set
of $\mathcal{A}$ and in no set of $\mathcal{B}/{\sim}$.

Let N be a positive integer, and assume this statement for all positive integers n < N .
One consequence of this inductive hypothesis is that if we have a set of positive inte-
gers less than N , then the set is an A-sampling if and only if it is a (B/ ∼)-sampling.
In order to reduce to N < |B/ ∼|, we show that if N = |B/ ∼|, then |A| =
|B/ ∼|. This implies our induction statement for all n ≥ N , since no such n is in any
set of A or of B/ ∼.
So suppose N = |B/ ∼|. Every positive integer n < N is a (B/ ∼)-sum, and thus,
by our induction hypothesis, also an A-sum, so |A| ≥ |B/ ∼|. If |A| > |B/ ∼| = N ,
then N is the sum of an A-sampling σ , which leads to the following contradiction.
By (2), $N$ is not in any set of $\mathcal{A}$ (since $|\mathcal{B}| = |\mathcal{B}/{\sim}|$). Thus, $\sigma$ contains at least
two elements. But then every element of $\sigma$ would be less than $N$, so our induction
hypothesis would imply that $\sigma$ is a $(\mathcal{B}/{\sim})$-sampling, contradicting $N = |\mathcal{B}/{\sim}|$. So
we assume $N < |\mathcal{B}/{\sim}|$.
If $N$ is an element of $\cup\mathcal{B}$, say $N \in B_i$, then $N$ is both in $A_i$, by (1), and also in $\langle B_i \rangle$,
by the definition of a quotient. Since this is what we need to show, we also assume $N$
is not in $\cup\mathcal{B}$.
Let $\sigma_N$ be the $(\mathcal{B}/{\sim})$-sampling with sum $N$. If $\sigma_N$ has more than one element,
then, as above, by our induction hypothesis, $\sigma_N$ also is an $\mathcal{A}$-sampling. In this case, $N$
is in no set of $\mathcal{A}$ or of $\mathcal{B}/{\sim}$. Thus we have reduced to our final case: $N$ is not in $\cup\mathcal{B}$
and $\sigma_N = \{N\}$ is the $(\mathcal{B}/{\sim})$-sampling with sum $N$; say $N$ is an element of some set
$\langle B_j \rangle$ of $\mathcal{B}/{\sim}$. We must show that $N$ is in $A_j$.

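Lemma 6. Let $\mathcal{B}/{\sim}$ be a mixed quotient; write $\mathcal{B} = (B_i)_{0 \le i < I}$. If $N$ is not in $\cup\mathcal{B}$
and there exists $j$ such that $N$ is in $\langle B_j \rangle$, then there exist positive integers $s$, $t$, and $u$
such that
1. $s < t < u < N$,
2. $s + N = t + u$,
3. $s$ and $u$ are in $\langle B_j \rangle$, and
4. there exists $k$ such that $t \in \langle B_k \rangle$ and $\langle B_k \rangle \ne \langle B_j \rangle$.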

Apply Lemma 6 to $N$. By our inductive hypothesis, $s$ and $u$ are in $A_j$ and $t$
is in $A_k$. Since $\langle B_k \rangle \ne \langle B_j \rangle$, we know $A_k \ne A_j$. Thus $\{t, u\}$ is an $\mathcal{A}$-sampling, so
$N < t + u < |\mathcal{A}|$.
Being less than |A|, N is the sum of an A-sampling. This A-sampling must be
σ N = {N }. Otherwise, by our inductive hypothesis, we would have a second (B/ ∼)-
sampling with sum N . So N is in some set of A. This set must be A j , because otherwise
{s, N } and {t, u} would be distinct A-samplings with the same sum.
We now only need to prove Lemma 6. For $0 \le i < I$, as usual let $b_i = \min B_i$.
By the definition of a quotient, $N$ is the sum of a $[B_j]$-sampling $\omega$. (Recall that
$[B_j]$ is the equivalence class of $B_j$.) Since $N$ is not in $\cup\mathcal{B}$, $\omega$ contains at least two
elements. Let $r$ be any element of $\omega$ that is not $\max \omega$, say $r \in B_i$, for some $B_i \sim B_j$;
let $u = N - r$. Because $u$ is the sum of the nonempty $[B_j]$-sampling $\omega \setminus \{r\}$, we have
that $u$ is in $\langle B_j \rangle$. Since $\sim$ is a mixing equivalence, $\langle B_{i+1} \rangle \ne \langle B_i \rangle = \langle B_j \rangle$. By our choice
of $r$ and again using that $\sim$ is a mixing equivalence, $u \ge \max(\omega \setminus \{r\}) \ge b_{i+2}$.
Let $t = b_{i+1}$ and $s = b_{i+1} - r = t + u - N$. Then $s < t < b_{i+2} \le u < N$. Since
we have $t \in B_{i+1}$, we only need to show that $s$ is in $\langle B_j \rangle$. In fact, $r \in B_i$ implies
$s = b_{i+1} - r \in B_i$, since $B_i$ is the set of positive multiples of $b_i$ that are less than $b_{i+1}$.
This is what we need, since $B_i \subseteq \langle B_i \rangle = \langle B_j \rangle$.

5. CHARACTERIZING BRITISH NUMBER SYSTEMS. In this section we
develop a better understanding of the properties of our British number systems. This
work will also help us prove Theorem 2.

Definition. Let $n$ be a positive integer and $\mathcal{C}$ be a collection of sets of positive integers.
We define $\mathcal{C}_{<n}$, the truncation of $\mathcal{C}$ at $n$, to be the collection of all nonempty sets of the
form $C \cap \{m : m < n\}$, where $C$ is a set of $\mathcal{C}$.

If $\mathcal{B}$ is a British number system and $N \in \cup\mathcal{B}$, then $\mathcal{B}_{<N}$ also satisfies the definition
of a British number system. By Proposition 4, $|\mathcal{B}_{<N}| = N$. This is a special case of
(1) implying (2) in the following result.



Proposition 7. Let $\mathcal{A}$ be an additive system and $N$ be an element of $\cup\mathcal{A}$. Then the
following are equivalent.
1. If we write $\mathcal{A} = \mathcal{B}/{\sim}$ as a mixed quotient, then $N$ is in $\cup\mathcal{B}$.
2. $\mathcal{A}_{<N}$ is an additive system of size $N$.
3. $\mathcal{A}_{<N}$ is an additive system.
4. Every $\mathcal{A}_{<N}$-sum is less than $N$.

Remark. If we wanted to extend the above proposition to arbitrary truncations, we
could use the following fact. Suppose $\mathcal{A}$ is any collection of nonempty sets of positive
integers. Then every truncation of $\mathcal{A}$ either is $\mathcal{A}$ or is $\mathcal{A}_{<N}$ for some $N$ in $\cup\mathcal{A}$. Indeed,
for any positive integer $n$, $\mathcal{A}_{<n} = \mathcal{A}$ if every element of $\cup\mathcal{A}$ is less than $n$, and otherwise
$\mathcal{A}_{<n} = \mathcal{A}_{<N}$, where $N$ is the smallest element of $\cup\mathcal{A}$ such that $n \le N$.

Proof. Since $\mathcal{A}$ is an additive system, every nonnegative integer less than $N$ is the sum
of a unique $\mathcal{A}$-sampling, which a priori is an $\mathcal{A}_{<N}$-sampling. Furthermore, since $\{N\}$
is the only $\mathcal{A}$-sampling with sum $N$, we see that $N$ is not an $\mathcal{A}_{<N}$-sum. Thus $\mathcal{A}_{<N}$
is an additive system if and only if all $\mathcal{A}_{<N}$-sums are less than $N$, and in this case
$|\mathcal{A}_{<N}| = N$, so (2), (3), and (4) are equivalent.
Write $\mathcal{A} = \mathcal{B}/{\sim}$ as a mixed quotient. In general, a $\mathcal{B}_{<N}$-sum may not be an $\mathcal{A}_{<N}$-sum.
For example, consider the $\mathcal{B}_{<52}$-sum 54 when $\mathcal{B}$ and $\mathcal{A}$ are the collections from
Figures 2 and 5, respectively. However, every $\mathcal{A}_{<N}$-sum is always a $\mathcal{B}_{<N}$-sum by the
definition of a quotient. Thus, if (1) holds, then an $\mathcal{A}_{<N}$-sum is less than $N$ because, as
mentioned after the definition of truncation, $\mathcal{B}_{<N}$ is a British number system of size
$N$. This gives us (4).
On the other hand, if (1) fails, then by Lemma 6, we can find an $\mathcal{A}_{<N}$-sampling
$\{t, u\}$ such that $N < t + u$, which means that (4) fails.

We can now characterize which additive systems are British number systems.

Theorem 8. For an additive system $\mathcal{A}$, the following are equivalent.
1. We can order the sets of $\mathcal{A}$ so that it is a British number system.
2. Every truncation of $\mathcal{A}$ is an additive system.
3. $\mathcal{A}$ is totally ordered by $<$, where we write $A < A'$ if every element of $A$ is less
than every element of $A'$.
4. For every $\mathcal{A}$-sampling $\sigma$ that is nonempty, $\max \sigma$ is the largest element of $\cup\mathcal{A}$
that is less than or equal to the sum of $\sigma$.

As we discussed at the end of Section 2, for a British number system, property
(4) above allows us to use a greedy algorithm to determine the sampling for a given
sum. As we now see from the fact that (4) implies (1), this algorithm must fail for
some sum in any additive system that is not a British number system. For example, in
Figure 4, 265 = 205 + 60, even though a greedy algorithm instead would have chosen
209 as the largest element of ∪A that is less than 265. For an additive system that is
not a British number system, we could use Theorem 1 to help us find the sampling
{205, 60}, but there is another way that involves somewhat less computation. We must
postpone this technique until we have some tools from the next section.
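The failure of the greedy algorithm on Figure 4 can be reproduced with a short Python sketch. The helper names below are ours and purely illustrative; membership in the two sets of Figure 4 is decided from the decimal digits, as described in the introduction.

```python
def digits_zero(n, start):
    """True if the digits of n at positions start, start + 2, ... are all zero.

    Position 0 is the 1s digit, position 1 the 10s digit, and so on.
    """
    n //= 10 ** start
    while n > 0:
        if n % 10 != 0:
            return False
        n //= 100
    return True

def which_set(n):
    """Return 1 or 2 for the set of Figure 4 that contains n, or None."""
    if n > 0 and digits_zero(n, 1):   # 10s, 1000s, ... digits are zero
        return 1
    if n > 0 and digits_zero(n, 0):   # 1s, 100s, ... digits are zero
        return 2
    return None

def greedy(n):
    """Repeatedly take the largest element of the union that is <= the remainder."""
    picks = []
    while n > 0:
        m = n
        while which_set(m) is None:
            m -= 1
        picks.append(m)
        n -= m
    return picks

print(greedy(265))                          # [209, 50, 6]
print([which_set(m) for m in greedy(265)])  # [1, 2, 1]
```

The greedy choice uses the first set twice (209 and 6), so it is not a sampling, whereas the valid sampling is {205, 60}.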

Proof. Write $\mathcal{A} = \mathcal{B}/{\sim}$ as a mixed quotient. Then (1) holds if and only if $\sim$ is trivial
in the sense that $\mathcal{A} = \mathcal{B}$, except for ordering. Thus (1) is equivalent to (2) by Proposition 7
and the remark that follows it.
Trivially (1) implies (3), and we proved that (1) implies (4) in Proposition 4.
Suppose (1) does not hold. We could use Lemma 6 to show that (3) and (4) do not
hold, but we prefer to prove this directly. Write $\mathcal{B} = (B_i)_{0 \le i < I}$, and define $b_i = \min B_i$.
There exist $j < k$ such that $B_j \sim B_k$; since $\sim$ is a mixing equivalence, $j + 1 < k$. Then
$b_j < b_{j+1} < b_k$ with $b_j$ and $b_k$ in $\langle B_j \rangle$ and $b_{j+1} \in \langle B_{j+1} \rangle \ne \langle B_j \rangle$, so (3) does not hold.
Finally, consider the $\mathcal{A}$-sampling $\sigma = \{b_{j+1}, b_k\}$. We have that $b_j + b_k$ is an element
of $\langle B_j \rangle$ that is greater than $\max \sigma$ and less than the sum of $\sigma$, so (4) also does not
hold.

6. UNIONS OF ADDITIVE SYSTEMS. We still must prove Theorem 2, and we
also will improve on it. To begin, we define terminology for a situation in which we
have two distinct additive systems with equal unions.

Definition. A collection is reducible if it can be written as $\mathcal{A} \cup \{\{z, 2z, \ldots, (s - 1)z\}, \{sz\}\}$,
where $s \ge 2$ and $z \ge 1$ are integers and $\mathcal{A}$ is an additive system of size $z$. The
reduction of such a collection is $\mathcal{A} \cup \{\{z, 2z, \ldots, sz\}\}$.

Although not part of the definition, a reducible collection is an additive system,
as is its reduction. Specifically, in this definition, $\mathcal{A} \cup \{\{z, 2z, \ldots, (s - 1)z\}, \{sz\}\}$
and $\mathcal{A} \cup \{\{z, 2z, \ldots, sz\}\}$ are additive systems of respective sizes $2sz$ and $(s + 1)z$,
by applying Proposition 3 (twice for the first collection). Since $s \ge 2$, these additive
systems have different sizes. Because their unions are equal, we would expect this
from Theorem 2. We can now state a refinement of that theorem.

Theorem 9. If the unions of two additive systems are equal, then either the additive
systems are equal or one is the reduction of the other.

As in our proof of Theorem 1, here we begin with a rather general situation. From an
arbitrary set of positive integers $U$, we construct a collection $\mathcal{C}_U$ of disjoint nonempty
sets such that $\cup\mathcal{C}_U = U$. Later we restrict $U$ to be the union of some additive system.
If $U$ is the empty set, let $\mathcal{C}_U$ be the empty collection. If $U$ is nonempty, but does not
contain both 1 and 2, let $\mathcal{C}_U = \{U\}$. Except for these special cases, i.e., if $U$ contains
both 1 and 2, our construction has three parts. First we define a subset $T_U \subseteq U$. Then
we define a function $f_U$ from $U$ to the set of nonnegative integers. Finally we define
the collection $\mathcal{C}_U$.
Our goal for $T_U$ is for it to contain 1 and 2 and also satisfy the following. Given
consecutive elements $d^- < d$ of $T_U$, define $c = d - d^-$.
i. If $c + d \notin U$ and $2d \notin U$, then $d$ is the largest element of $T_U$.
ii. If $c + d \notin U$ and $2d \in U$, then $2d$ is the smallest element of $T_U$ that is larger
than $d$.
iii. If $c + d \in U$, then $c + d$ is the smallest element of $T_U$ that is larger than $d$.
Beginning with the set $\{1, 2\}$, these conditions can be used to progressively determine
the elements of $T_U$ in ascending order, so it is clear that there exists a unique set of
positive integers $T_U$ that contains 1 and 2 and satisfies these conditions. Since here we
only consider sets $U$ that contain 1 and 2, we have $T_U \subseteq U$.
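Conditions (i)–(iii) amount to a simple loop. The Python sketch below is an illustration under stated assumptions: the name `build_T` is ours, and $U$ is taken to be a finite set containing 1 and 2. The example uses $U = \{1, 2, 4, 5\}$, the union of the collection $\{1, 4, 5\}, \{2\}$ obtained from the British number system $\{1\}, \{2\}, \{4\}$ by associating $\{1\}$ with $\{4\}$.

```python
def build_T(U):
    """Return T_U as an ascending list, for a finite set U containing 1 and 2."""
    U = set(U)
    if not {1, 2} <= U:
        raise ValueError("this sketch assumes U contains both 1 and 2")
    T = [1, 2]
    while True:
        d_prev, d = T[-2], T[-1]
        c = d - d_prev
        if c + d in U:        # condition iii
            T.append(c + d)
        elif 2 * d in U:      # condition ii
            T.append(2 * d)
        else:                 # condition i: d is the largest element of T_U
            return T

print(build_T({1, 2, 4, 5}))   # [1, 2, 4]
```

Since every appended element lies in $U$ and exceeds the previous one, the loop stops whenever $U$ is finite.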

Exercise. Find $T_U$ when $U$ is the union of the collection from Figure 4; repeat using
Figure 5. Can you guess what $T_U$ is when $U$ is the union of an additive system? (Hint:
An alternate construction of $T_U$ begins with $\{1\}$. If the maximum of the current set is
$b$, we append to the set as many consecutive multiples of $b$, beginning with $2b$, that
are in $U$. This is similar to part of the construction in the proof of Theorem 1.)

We next define the function $f_U$ from $U$ to the set of nonnegative integers. For $u$ in
$U$, let $t$ be the greatest element of $T_U$ such that $t \le u$. (Since 1 is in $T_U$, $t$ exists.) Then
let $f_U(u)$ be the smallest positive integer $r$ such that $r + t$ is in $U$; if no such positive
integer exists, set $f_U(u) = 0$. Note that $f_U(u) = 0$ if and only if $\max U = u \in T_U$. In
particular, $f_U(u) = 0$ for at most one $u \in U$.

Exercise. Using the exercise above, find $f_U$ when $U$ is the union of the collection
from Figure 4; repeat using Figure 5. Speculate on a general result for $f_U$ when $U$ is
the union of an additive system.

Finally, let $\mathcal{C}_U$ be the collection of all nonempty sets of the form $f_U^{-1}(\{n\})$, where
$n$ is a nonnegative integer. Clearly the sets of $\mathcal{C}_U$ are disjoint and $U = \cup\mathcal{C}_U$.
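Putting the three parts together, the following Python sketch computes $\mathcal{C}_U$ for a finite set $U$. It is an illustration only: the name `build_C` is ours, and the code assumes $U$ is a finite set of positive integers. The sets of $\mathcal{C}_U$ are returned in the order in which their values of $f_U$ are first encountered.

```python
from bisect import bisect_right

def build_C(U):
    """Return the collection C_U (a list of sets) for a finite set U."""
    U = sorted(set(U))
    if not U:
        return []                         # empty set: empty collection
    if not (1 in U and 2 in U):
        return [set(U)]                   # special case: C_U = {U}
    members = set(U)
    # Part 1: T_U, built from conditions (i)-(iii).
    T = [1, 2]
    while True:
        c, d = T[-1] - T[-2], T[-1]
        if c + d in members:
            T.append(c + d)
        elif 2 * d in members:
            T.append(2 * d)
        else:
            break
    # Parts 2 and 3: group the elements of U by the value of f_U.
    classes = {}
    for u in U:
        t = T[bisect_right(T, u) - 1]     # greatest element of T_U with t <= u
        larger = [x for x in U if x > t]
        r = larger[0] - t if larger else 0   # f_U(u)
        classes.setdefault(r, set()).add(u)
    return list(classes.values())

print(build_C({1, 2, 4, 5}))   # [{1, 4, 5}, {2}]
```

For this $U$, which is the union of the additive system $\{1, 4, 5\}, \{2\}$, the construction returns that same system.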
Here is what we can say about $\mathcal{C}_U$ when $U$ is the union of an additive system.
Theorems 2 and 9 follow directly from this.

Theorem 10. Let $U$ be the union of an additive system $\mathcal{A}$.
1. Then $\mathcal{C}_U$ is an additive system.
2. Suppose there exist integers $s \ge 2$ and $z \ge 1$ such that the $s$ largest elements of
$U$ are $z, 2z, \ldots, sz$. Then
(a) $\mathcal{C}_U$ is reducible, and
(b) $\mathcal{A}$ is equal to either $\mathcal{C}_U$ or the reduction of $\mathcal{C}_U$.
3. If there are no integers $s$ and $z$ as in (2), then $\mathcal{A}$ is equal to $\mathcal{C}_U$.

Here is how to determine whether there exist s and z as in (2). A necessary con-
dition is that U has finitely many elements, but at least two. In that case, the only
possible value of z is the difference between the two largest elements of U ; s times
that difference must equal the largest element of U . In particular, at most one pair
(s, z) satisfies (2).
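
The test just described is short enough to write out. This is a sketch under the assumption that U is given as a finite set of positive integers; the name `find_s_z` is hypothetical.

```python
def find_s_z(U):
    """Return (s, z) with s >= 2 and z >= 1 such that the s largest elements
    of U are z, 2z, ..., sz, or None if no such pair exists."""
    elems = sorted(set(U))
    if len(elems) < 2:
        return None
    z = elems[-1] - elems[-2]            # the only candidate for z
    s, remainder = divmod(elems[-1], z)  # s * z must equal max U
    if remainder != 0 or s < 2 or s > len(elems):
        return None
    if elems[-s:] == [z * i for i in range(1, s + 1)]:
        return (s, z)
    return None

print(find_s_z({1, 2, 3, 6, 9}))
print(find_s_z({1, 2, 4, 5}))
```

For {1, 2, 3, 6, 9} the candidate is z = 9 − 6 = 3 with s = 3, and the three largest elements are indeed 3, 6, 9. For {1, 2, 4, 5} the candidate z = 1 would require s = 5, but U has only four elements, so no pair exists.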
Here are the intermediate results we need to prove Theorem 10.

Proposition 11. Suppose ∼ is a mixing equivalence on a British number system
$B = (B_i)_{0 \le i < I}$. Let U = ∪ (B/ ∼), and suppose 1 and 2 are in U .
1. $T_U = \cup B$.
2. If u lies in a set S of B/ ∼ and $f_U(u) > 0$, then $f_U(u) = \min S$.
3. If $f_U$ is always positive, then $C_U = B/\sim$.
4. If $f_U(u) = 0$, then

$B/\sim \; = (B/\sim)_{<\min B_{I-1}} \cup \{B_{I-1}\}$ (with I < ∞), and

$C_U = (B/\sim)_{<u} \cup \{\{u\}\}$.

For these lemmas, assume B = (Bi )0≤i<I is a British number system.

Lemma 12. Suppose m ∈ Bi for some 0 ≤ i < I , and bi = min Bi . Then either
bi + m = |B| and m is the largest element of ∪B, or bi + m is the smallest element
of ∪B that is larger than m.

Lemma 13. Consider B/ ∼, where ∼ is any equivalence relation on B. Suppose u lies in
a set S of B/ ∼ and that t is the largest element of ∪B such that t ≤ u. Then t ∈ S.

Lemma 12 follows directly from the definition of a British number system. For
Lemma 13, let σ be the B-sampling with sum u. By the definition of a quotient, every
element of σ is in S, and by (2) in Proposition 4, max σ = t. Now we can prove
Proposition 11.
Proof. Let $T = T_U$, $f = f_U$, and $b_i = \min B_i$ for all 0 ≤ i < I .
For (1), first notice that for any additive system A, |A| ≥ 3 if and only if ∪A contains
1 and 2. Since U contains 1 and 2, we have that |B| = |B/ ∼| ≥ 3, so ∪B also
contains 1 and 2. Therefore, we can show that T = ∪B by showing that ∪B satisfies
the three defining conditions of T . So let $d^- < d$ be consecutive elements of ∪B and
$c = d - d^-$. Denote by $d^+$ the smallest element of ∪B that is larger than d, if it exists.
Say $d^- \in B_i$. By Lemma 12, $c = b_i$. This also means $c \le d^- < d$, so that
c + d < 2d.
Now, $d \in B_i$ or $d = b_{i+1}$. Consider first $d \in B_i$. Again applying Lemma 12, either
c + d = |B| and $d^+$ does not exist, or $d^+ = c + d \in \cup B \subseteq U$. Thus, either the first
or third of the defining properties of T applies.
For $d = b_{i+1}$, once again we apply Lemma 12. Either 2d = |B| and $d^+$ does not
exist, or $d^+ = 2d \in \cup B \subseteq U$. So either the first or second of the defining properties
of T applies, provided that we can show $c + d \notin U$. Since ∼ is a mixing equivalence,
$B_i$ and $B_{i+1}$ are not equivalent, so $\{b_i, b_{i+1}\}$ is a (B/ ∼)-sampling, and thus
$b_i + b_{i+1} = c + d \notin U$. This completes (1).
For (2), let u lie in the set S of B/ ∼ that arises from the equivalence class $[B_j]$ of
$B_j$, and let t be the largest element of ∪B such that t ≤ u. By (1) and the definition
of f , f (u) = f (t). Since t ∈ S by Lemma 13, we can replace u by t and assume
u ∈ ∪B and t = u. Specifically, u is in a set of B that is equivalent to $B_j$;
replacing j if necessary, we may assume $u \in B_j$.
Assume now that f (u) > 0. We claim that f (u) ≥ min S. Since f (u) > 0, it
suffices to consider 0 < n < min S and show that $n + u \notin U$. Indeed, if σ is the
(B/ ∼)-sampling with sum n, then no element of σ is in S since n < min S.
Thus σ ∪ {u} is the unique (B/ ∼)-sampling with sum n + u. Since σ ∪ {u} has more
than one element, n + u is not in U = ∪ (B/ ∼). For (2), we therefore only need to
show that f (u) ≤ min S.
Let $j_0$ be the smallest nonnegative integer such that $B_{j_0} \sim B_j$; then
$\min S = b_{j_0}$. If $j_0 < j$, then $\{b_{j_0}, u\}$ is a $[B_j]$-sampling, so
$b_{j_0} + u \in S \subseteq U$ and thus $f(u) \le b_{j_0} = \min S$. So we may also assume
$j_0 = j$. We thus need to show $f(u) \le b_j$.
By Lemma 12, since $u \in B_j$, either $b_j + u = |B|$ or $b_j + u \in \cup B$. If
$b_j + u \in \cup B \subseteq U$, then $f(u) \le b_j$, as desired. Finally, if $b_j + u = |B|$, then
since every element of U is less than |B/ ∼| = |B|, we have $f(u) < b_j$, completing (2).
(Actually, this final case cannot occur, since we already showed
$f(u) \ge \min S = b_{j_0} = b_j$.)
Now consider u 1 and u 2 in U such that fU (u 1 ) and fU (u 2 ) are positive. By (2), u 1
and u 2 are in the same set of B/ ∼ if and only if fU (u 1 ) = fU (u 2 ). (Distinct sets of
B/ ∼ are disjoint, and thus have different minimum elements.) We then get (3) from
the definition of CU .
For (4), we have that u = max U is the unique element on which fU is zero. This
implies that I < ∞ and that CU has the desired form, by using the same reasoning as
in (3). To get the desired form for B/ ∼, it is enough to prove these two claims:
• $B_{I-1}$ is equivalent to no other set of B, and
• if z ∈ U and $z \ge b_{I-1}$, then $z \in B_{I-1}$.

If $B_{I-1}$ were equivalent to another set of B, then there would exist j < I − 1 such
that $B_j \sim B_{I-1}$. This would imply that $b_j + u$ lies in a set of B/ ∼, and hence in U ,
contradicting that u is the maximum of U .
Finally, suppose z ∈ U with $z \ge b_{I-1}$. Let t be the largest element of ∪B such
that t ≤ z. Since $b_{I-1} \in \cup B$, $t \ge b_{I-1}$. Now the sets of B are totally ordered (see (3)
of Theorem 8) and $B_{I-1}$ is its biggest set, so we must have $t \in B_{I-1}$. Finally, since
$B_{I-1}$ is equivalent to no other set of B and is therefore itself a set of B/ ∼,
Lemma 13 shows that $z \in B_{I-1}$.

In Proposition 11, notice that (3) and the final part of (4) imply that CU and B/ ∼
are equal, except possibly for the membership of elements with fU (u) = 0, and there
is at most one such element.
We can now prove Theorem 10.

Proof. Let C = CU . If U does not contain both 1 and 2, then it is easy to see that A
must be the empty collection or {{1}}. In either case, there do not exist integers s and
z as in (2), and the special cases of our construction show that C = A. So we may
assume that U contains 1 and 2.
Consider the following conditions.
i. A ≠ C.
ii. There exist integers s ≥ 2 and z ≥ 1 such that the s largest elements of U are
z, 2z, . . . , sz.
iii. C is reducible. (In particular, C is an additive system.)
iv. A is the reduction of C.
We show that (i) implies that (ii), (iii), and (iv) hold. This establishes all of Theorem 10
except for the first part of (2); for that we must separately prove that (ii) implies (iii).
Using Theorem 1, write A as the mixed quotient B/ ∼ so we can apply Proposi-
tion 11. Say B = (Bi )0≤i<I , and let bi = min Bi for all 0 ≤ i < I .
Suppose A ≠ C. By (3) of Proposition 11, fU cannot be always positive, so fU is
zero on max U = max ∪B. Since U has a maximum element, I < ∞.
Let $z = b_{I-1}$ and let s ≥ 1 be the number of elements in $B_{I-1}$, so that
$B_{I-1} = \{z, 2z, \ldots, sz\}$ and sz = max U . By (4) of Proposition 11,

$A = A_{<z} \cup \{\{z, 2z, \ldots, sz\}\}$ and

$C = A_{<sz} \cup \{\{sz\}\}$.

If s were equal to 1, then we would have A = C, and thus s ≥ 2. In particular, the first
equation above now gives us (ii). Substituting the first equation above into the second,
we get

$C = A_{<z} \cup \{\{z, 2z, \ldots, (s-1)z\}, \{sz\}\}$.

This gives us (iii) and (iv), since Proposition 7 implies that $A_{<z}$ is an additive system
of size z.
Now assume (ii). Recalling from (1) of Proposition 11 that TU = ∪B, let us show
that z and 2z are in TU .
Let t be the largest element of TU that is strictly less than 2z; say t ∈ Bi . Since
TU ⊆ U , t ≤ z. We claim that

bi + t ≥ 2z, with equality implying that 2z ∈ TU .

Indeed, let us consider the two possible conclusions when applying Lemma 12 to t. We
could have that bi + t = |B| = |B/ ∼| > max U ≥ 2z or that bi + t is the smallest
element of ∪B that is larger than t. In the latter case, bi + t ∈ TU , so bi + t ≥ 2z, with
equality giving us that 2z ∈ ∪B = TU .
Using the inequality that appears in our claim, we can conclude that bi ≤ t ≤ z ≤
(bi + t) /2, so we must have bi = t = z. In particular, z ∈ TU , and applying the second
part of our claim, 2z ∈ TU .
We now apply the definitions of TU , fU , and C = CU to conclude the following.
First, z, 2z, . . . , sz are all in TU . Second, fU (u) = 0 exactly for u = sz, and fU (u) =
z if and only if u ∈ {z, 2z, . . . , (s − 1) z}. (For “only if,” we use that u ≥ fU (u), which
follows from (2) of Proposition 11.) Third, {sz} and {z, 2z, . . . , (s − 1) z} are sets of
C. Thus C = C<z ∪ {{z, 2z, . . . , (s − 1) z} , {sz}}. By Proposition 7, C<z is an additive
system of size z, so C is reducible. This completes our proof.

Recall that the construction in the proof of Theorem 1 allows us to test an arbitrary
collection to see if it is an additive system. We can now perform a similar test with
unions. Given an arbitrary set U of positive integers, construct CU . Then U is the
union of an additive system if and only if CU is an additive system, something we can
determine by applying our earlier test. Indeed, if U is the union of an additive system,
then CU is an additive system by Theorem 10. Conversely, if CU is an additive system,
then U is the union of the additive system CU itself, since U = ∪CU holds for arbitrary
U.
Let us pick up one additional earlier thread. At the end of Section 2, we saw that if
we are given a British number system B and a positive integer n < |B|, we can find
the B-sampling with sum n by a greedy algorithm. That is, we successively choose
the largest possible elements from the additive system’s union in a way so that we get
partial sums that do not exceed n. We also saw in Theorem 8 that for any additive
system that is not a British number system, this greedy algorithm must fail for some n.
What if we begin with n and an additive system that may not be a British number
system? What if we are only given the union U of such an additive system? Let us
show how to find the sampling σ with sum n, assuming that such a sampling exists,
i.e., n is smaller than the size of the additive system. To avoid trivialities, assume U
contains 1 and 2.
First construct TU , or at least construct those elements of TU that are less than
or equal to n. Then use the same greedy algorithm with TU to find σ0 . (So σ0 is
the sampling of the underlying British number system, i.e., the one whose mixed
quotient is the additive system with which we began.) To determine σ , apply fU to
each element of σ0 . Whenever fU fails to be one-to-one on σ0 , replace elements by
their sum. For example, if U is the union of the sets from Figure 5 and n = 373,
then σ0 = {3, 20, 50, 300}. Since the values of fU on these elements are 1, 5, 1,
and 25, respectively, σ = {53, 20, 300}. We leave it to the reader to use Proposition 11,
especially (2), to justify this algorithm. Special care is needed if there is
u ∈ σ0 with fU (u) = 0. In this case, in the notation of Proposition 11, u lies in
$B_{I-1}$, which is itself a set of B/ ∼, so u is in a different set of B/ ∼ from all other
elements of σ0 .
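
The procedure above can be sketched in a few lines. This is an illustration under stated assumptions, not the author's implementation: U is taken to be finite, TU is built by the alternate construction from the earlier hint, and the names `build_T`, `f_U`, and `sampling` are hypothetical. Since the sets of Figure 5 are not reproduced here, the demonstration uses the small union U = {1, 2, 4, 5} of the collection {{1, 4, 5}, {2}}, which represents 0 through 7.

```python
from collections import defaultdict

def build_T(U):
    """T_U via consecutive multiples of the current maximum (assumes 1 in U)."""
    U = set(U)
    T = [1]
    while True:
        b, m = T[-1], 2
        while m * b in U:
            T.append(m * b)
            m += 1
        if m == 2:
            return T

def f_U(U, T, u):
    """Smallest positive r with r + t in U (t = greatest element of T with t <= u), else 0."""
    t = max(x for x in T if x <= u)
    gaps = [v - t for v in set(U) if v > t]
    return min(gaps) if gaps else 0

def sampling(U, n):
    """Greedy sampling sigma_0 from T_U, then merge elements sharing an f_U value."""
    T = build_T(U)
    sigma0, rest = [], n
    for t in reversed(T):            # largest elements first
        if t <= rest:
            sigma0.append(t)
            rest -= t
    if rest != 0:
        raise ValueError("n is not a sum of distinct elements of T_U")
    merged = defaultdict(int)
    for t in sigma0:
        merged[f_U(U, T, t)] += t    # elements with equal f_U are replaced by their sum
    return sorted(merged.values())

print(sampling({1, 2, 4, 5}, 7))     # sigma_0 = {4, 2, 1}; 4 and 1 share f_U = 1
```

Here the greedy step gives σ0 = {4, 2, 1}, the values of fU are 1, 2, and 1, and merging 4 with 1 gives σ = {5, 2}. An element with fU equal to 0, if present, is the only one with that value, so the merge leaves it alone.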

7. ADDITIONAL DIRECTIONS. There are a couple of ways in which one could
generalize additive systems. If positive rational numbers were allowed in the sets of a
collection, we could extend the idea of a British number system, or its mixed quotient,
so that its sums are all nonnegative rational numbers. (Once again, this would generalize
the notion of numbers in base b.) Do these mixed quotients characterize some
generalization of an additive system? What happens if we allow convergent infinite
sums, so that we can get all nonnegative real numbers?
Alternatively, what if we take an ordinary additive system (i.e., one whose sets
contain only positive integers) but consider “sums” of samplings that are no longer
required to be finite? That is, we would take formal sums, with no more than one
element added from a given set of the additive system. For example, if p is a
prime and we begin with the British number system $B = (B_i)_{0 \le i < \infty}$ defined by
$B_i = \{p^i, 2p^i, \ldots, (p-1)p^i\}$, then the formal sums would give us the p-adic
integers. We leave it to others to determine how these extensions might be used.

ACKNOWLEDGMENT. Many thanks to two referees who offered many detailed suggestions.

REFERENCES

1. F. Blanchet-Sadri, Algorithmic combinatorics on partial words, Int. J. Found. Comput. Sci. 23 (2012)
1189–1206, http://dx.doi.org/10.1142/S0129054112400473.
2. N. G. de Bruijn, On number systems, Nieuw Arch. Wisk. 4 no. 3 (1956) 15–17.
3. A. O. Munagi, k-Complementing subsets of nonnegative integers, Int. J. Math. Math. Sci. 2005 no. 2
(2005) 215–224, http://dx.doi.org/10.1155/IJMMS.2005.215.
4. M. B. Nathanson, Additive systems and a theorem of de Bruijn, Amer. Math. Monthly 121 (2014) 5–17,
http://dx.doi.org/10.4169/amer.math.monthly.121.01.005.

MICHAEL MALTENFORT received his Ph.D. from the University of Chicago in 1997. After twelve years
at Truman College, one of the City Colleges of Chicago, he moved to Northwestern University, where he has
been a Lecturer and College Adviser since 2013. He has loved square dancing since 1985 and has been a square
dance caller since 2002.
Department of Mathematics, Northwestern University, Evanston, IL 60208
malt@northwestern.edu

100 Years Ago This Month in The American Mathematical Monthly


Edited by Vadim Ponomarenko
The thirty-eighth meeting of the Chicago Section of the American Mathematical
Society was held at the University of Chicago on December 22 and 23, 1916, at
which eleven research papers were presented by representatives of the following uni-
versities: Chicago, Illinois, Minnesota, Nebraska, Purdue and Wisconsin, and Rose
Polytechnical Institute. The members present enjoyed an informal dinner together at
the Quadrangle Club on Friday evening. Among topics informally discussed were
(1) The contract between the Mathematical Association of America and the Annals
of Mathematics for the enlargement of that journal and the publication in it of exposi-
tory and historical articles; (2) the need in this country of careful consideration of the
whole question of the history of mathematics and of combined and systematic effort
in developing investigation in this line; and (3) the desirability of holding ourselves
in readiness to assist the publishers of the Revue Semestrielle and the Fortschritte in
case it becomes necessary, on account of the war conditions, in order to continue the
publications.
—Excerpted from “Notes and News” 24 (1917) 96–100.

Rotating Multiple Sets of Labeled Points to
Bring Them Into Close Coincidence:
A Generalized Wahba Problem
Bisharah Libbus, Gordon Simons, and Yi-Ching Yao

Abstract. While attempting to better understand the 3-dimensional structure of the mam-
malian nucleus as well as a rigid-body kinematics application, the authors encountered a
naturally arising generalized version of the Wahba (1965) problem concerned with bring-
ing multiple sets of labeled points into close coincidence after making appropriate rotations
of these sets of labeled points. Our solution to this generalized problem entails the develop-
ment of a computer algorithm, described and analyzed herein, that generalizes and utilizes an
analytic formula, derived by Grace Wahba (1965), for determining space satellite attitudes,
that task being to find a suitable rotation that brings one set of m labeled points into close
coincidence, in a least-squares sense, with a second set of m labeled points.

1. INTRODUCTION. Given k + 1 sets $S_0, S_1, \ldots, S_k$, each consisting of m
n-dimensional labeled points (k ≥ 1, m ≥ 2, n ≥ 2), the task is to independently rotate
each of the latter k sets so as to bring all of the k + 1 sets into close coincidence in a
least-squares sense. If we denote the points in $S_j$ by $\{a_{j\ell},\ \ell = 1, \ldots, m\}$, j = 0, . . . , k,
then the task is to find k rotation matrices $M_1, \ldots, M_k$ that simultaneously minimize
the (weighted) loss function

\[
S(M_1, \ldots, M_k) = \sum_{0 \le i < j \le k} \sum_{\ell=1}^{m} w_{ij} \, \|M_i a_{i\ell} - M_j a_{j\ell}\|^2, \tag{1}
\]

where $M_0 = I_n$ (the n × n identity matrix), and where ||v|| denotes the Euclidean norm
of vector v.
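
To make the loss function (1) concrete, here is a small NumPy sketch. The data layout is an assumption made for illustration: each point set is stored as an n × m array whose columns are the labeled points, the rotations are passed as a list whose first entry is the identity, and the weights form a (k + 1) × (k + 1) array. The name `gwp_loss` is hypothetical.

```python
import numpy as np

def gwp_loss(A, M, w):
    """Loss (1): sum over pairs i < j of w[i][j] times the squared distance
    between the rotated point sets M[i] @ A[i] and M[j] @ A[j]."""
    total = 0.0
    for i in range(len(A)):
        for j in range(i + 1, len(A)):
            diff = M[i] @ A[i] - M[j] @ A[j]
            total += w[i][j] * np.sum(diff ** 2)
    return total

# Two planar point sets (k = 1, n = 2, m = 3); the second is the first rotated back by 90 degrees.
R = np.array([[0.0, -1.0], [1.0, 0.0]])
A0 = np.array([[1.0, 0.0, 2.0], [0.0, 1.0, 1.0]])
A1 = R.T @ A0
w = np.ones((2, 2))

print(gwp_loss([A0, A1], [np.eye(2), R], w))          # rotating A1 by R restores A0
print(gwp_loss([A0, A1], [np.eye(2), np.eye(2)], w))  # no rotation applied
```

With the correct rotation the two sets coincide and the loss vanishes; with no rotation the loss is the squared Frobenius distance between the two arrays.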
When k = 1, this is known as “the Wahba problem,” thus explaining why we
refer to the problem of minimizing (1) as “the generalized Wahba problem” (GWP).
Grace Wahba [17], as a graduate student, using nothing more than linear algebra and
some clever reasoning, obtained an explicit formula for the rotation matrix $M_1$, while
addressing a compelling need by space scientists, in 1965, to estimate satellite attitudes:
given two sets of m labeled n-dimensional points $\{a_1, \ldots, a_m\}$ and $\{b_1, \ldots, b_m\}$,
find a rotation matrix M that brings the second set into the best least-squares coincidence
with the first, i.e., find a rotation matrix M that minimizes Wahba’s (unweighted)
loss function $S(M) = \sum_{\ell=1}^{m} \|a_\ell - M b_\ell\|^2$. See Figure 1 with m = 4 and n = 3, where
the unit vectors $a_\ell$ ($\ell$ = 1, 2, 3, 4) are representations, in the satellite reference frame,
of the directions of four observed objects, and the unit vectors $b_\ell$ ($\ell$ = 1, 2, 3, 4) are
representations of the corresponding observations in a known reference frame.
See [4] for an elegant analytic solution for the optimizer M in the Wahba problem,
the direct use of which can sometimes make an accurate computation of M difficult.
Markley [8] provides a computationally more accurate approach based on a singular
value decomposition of an n × n matrix with n = 3 (in the context of satellite attitude
estimation). (For an essentially trivial description of the required computations, see
http://en.wikipedia.org/wiki/Wahba%27s_problem.) Note that the optimization problem
and related computational methods require no restriction on the vectors $a_\ell$ and $b_\ell$
while they are unit vectors in the context of satellite attitude estimation.

Figure 1. Known (left) and satellite (right) reference frames.

http://dx.doi.org/10.4169/amer.math.monthly.124.2.149
MSC: Primary 65F30, Secondary 65K05; 15B10

Finally, as a segue into applications, we mention that the Wahba problem was
extended in the robotics literature (cf. Horn [5] and Umeyama [15]) to include a translation
vector in addition to the rotation matrix, which can be easily reduced to the original
Wahba problem. In our formulation of the GWP, we may also include translation
vectors along with the rotation matrices, one for each set of labeled points. With minor
modifications, our algorithmic approach (to be described and discussed in Section 3)
can also be used to deal with this extended version.
The GWP opens up applications to rigid-body kinematics, with “landmarks.” For
instance, imagine the complicated wrist motion of a baseball pitcher while deliver-
ing a curveball to an awaiting batter. The set of landmarks could be chosen in close
proximity to the largest carpal (wrist) bone, “capitate.” But, in order to secure more
accurate calculations, a better choice would appear to be to choose a landmark in close
proximity to each of the eight carpal bones, thereby providing a broader base, the only
disadvantage being that this collection of landmarks, jointly, is not fully rigid, albeit
nearly so. Further, imagine, by some means that we are able to measure, with good
precision, the 3-dimensional locations of these eight landmarks at consecutive times
t0 < t1 < · · · < tk , for k ≥ 1. The computational task is to describe, as accurately as
possible (in a least-squares sense), the components of linear and rotational motions of
the pitcher’s wrist over each of the time intervals [t0 , t1 ], [t1 , t2 ], . . . , [tk−1 , tk ]. This
task can easily be formulated as an extended version of the GWP. See Spoor and
Veldpaus [14] and Veldpaus et al. [16] for variant versions of the special case k = 1
modeling (nearly) rigid-body motion.
A simplified rigid-body example for k = 2, m = 4, and n = 3 appears in Figure 2,
with a regular tetrahedron, and with landmarks shown in red, green, yellow and blue
(labeled 1, 2, 3, and 4), located at the four vertices. For additional simplification we are
suppressing any discussion of linear translations (i.e., we assume that the center of the
tetrahedron, shown in black in the figure, is fixed). Thus, there are two rotation matrices
to be computed, M1 and M2 , describing, respectively, the rotations from the first to
second and from the first to third “snapshots” (appearing in Figure 2) of the tetrahedron
in rotational motion. Now, if the 3-dimensional data describing the locations of all of
the vertices were error free, then the minimum value of $S(M_1, M_2)$ would necessarily
be equal to 0, and the minimizer would identify precisely accurate rotation matrices
$M_1$ and $M_2$. But measurement errors are to be expected, and the resulting ambiguity
in the data can be resolved by minimizing $S(M_1, M_2)$ (that is, by seeking a best
least-squares fit of the data, subject to the requirement that $M_1$ and $M_2$ are 3 × 3 rotation
matrices).

Figure 2. Rotations of a regular tetrahedron.

For rigid-body applications, it makes sense to adjust the weights $w_{ij}$ in (1)
appropriately to reflect the fact that one is dealing with a sequence of contiguous,
time-ordered rotational motions, such as by using larger weights when j − i is small.
Cellular nuclei of eukaryotic organisms, where chromosomes reside, provide the
setting for an entirely different application of the GWP. The focus of attention is on the
ends of chromosomes, called telomeres. While it is known that telomeres are anchored
to the nuclear envelope (cf. Alsheimer et al. [1], Crabbe et al. [2], and Moens et al.
[9]), so as to facilitate the required motion of chromosomes within the cellular nucleus
during the various phases of cellular activity, the anchoring details are not very well
understood. However, specific proteins are being identified as playing key roles in the
association of telomeres with the nuclear envelope (cf. Hou et al. [6], Kind and van
Steensel [7], Postberg et al. [10], and Schmidt et al. [12]). It is tempting to suspect,
but presently it cannot be ascertained, whether the arrangement of anchoring points
is unique, with each telomere occupying a fixed attachment location to the nuclear
envelope relative to all other telomeres, an arrangement that is common to all nuclei
of the given type. We shall refer to this suspected arrangement as the “fixed anchoring
points” (FAP) hypothesis. To be specific, assume that we are observing k + 1 cellu-
lar nuclei. Now, if it is possible to independently rotate the latter k nuclei, together
with their telomere attachments, so as to bring their corresponding telomere locations
into close coincidence with the corresponding telomere locations in the first cellular
nucleus, this would provide strong evidence in support of the FAP hypothesis. To be
even more specific, we might compute a function like S(M1 , . . . , Mk ) in (1), and if the
minimum possible value of this function is larger than some specified threshold value,
then we might reasonably view this as providing a sound statistical basis for rejecting
the FAP hypothesis.
As a practical matter, it is not presently possible to compute the 3-dimensional
locations of telomeres within their cellular nuclei. So this appealing approach toward
testing the FAP hypothesis is not presently feasible to implement. However, one of
the authors of this paper has experimentally secured 2-dimensional projections of the
missing 3-dimensional data on telomere locations, and, with this incomplete data,
the current authors have been able to convincingly reject the alternative hypothesis that
the attachment points of telomeres to the nuclear envelope occur randomly. Unfortu-
nately, this provides scant evidence for the validity of the FAP hypothesis.



Besides the GWP discussed here, we should mention that other kinds of generaliza-
tions of the Wahba problem have been discussed in the literature (cf. Shuster [13] and
Psiaki [11]).
Before explaining our successful algorithmic solutions to the GWP, we should point
out that attempts by us to directly minimize S(M1 , . . . , Mk ) analytically, when k ≥ 2,
have met with failure, a task that appears to be impossible, in general. Something more
than analytical reasoning is needed. But, interestingly, Wahba’s elegant analytic solu-
tion available for k = 1 does play a decisive role in the execution of the algorithmic
solutions for general k. For the task of minimizing the sum in (1) over all possible
rotation matrices M1 , . . . , Mk can be accomplished algorithmically via repeated suit-
ably designed applications of the analytic methodology used for minimizing S(M)
over M (the methodology for the case k = 1). The basic idea, in each step, is to hold
k − 1 of the M’s fixed and perform a minimization with respect to the remaining M,
doing this repeatedly while cycling through all of the M’s. Our experience with the
algorithmic approach has been that we routinely obtain a rapid convergence to the
desired minimum, obtaining, in the process, very accurate limiting values for the rota-
tion matrices M1 , . . . , Mk . In Section 3 we describe a couple of specific algorithms for
cycling through the M’s, and establish a convergence result for the second of these.
While neither of the two algorithms is guaranteed to solve the minimization problem,
again our experience has been that convergence to a global minimum routinely occurs
whenever one repeatedly visits all the M’s, even if this cycling is performed randomly.
In short, we have found that our algorithmic approach is efficacious and robust with
respect to cycling variants.
However, as we discuss in Section 3, there are, apparently, rare, exceptional cases
of data for which the convergence of S(M1 , . . . , Mk ) can be to something other than
the desired global minimum, depending on the starting values of the rotation matrices
M1 , . . . , Mk and the cycling methodology employed. A global minimum can always
be found in these cases, but the task is more challenging.
The rest of this paper is organized as follows. As a preliminary to the description
and discussion of our algorithmic approach, we present in Section 2 the solution to the
Wahba problem via singular value decomposition. In Section 3 we present a couple
of algorithms and investigate the convergence issue along with some discussion of
numerical results based on extensive simulation studies. Section 4 contains concluding
remarks. Proofs of three technical lemmas are relegated to an appendix.

2. SOLUTION TO THE WAHBA PROBLEM. In this section we present the
solution to the Wahba problem of minimizing $S(M) = \sum_{\ell=1}^{m} \|a_\ell - M b_\ell\|^2$ over
$M \in SO(n)$ via singular value decomposition, where $SO(n)$ denotes the group of all
n × n rotation matrices (orthogonal matrices whose determinants are equal to 1). This
is an extension of Markley’s [8] approach to general n. See also de Ruiter and Forbes
[3] for discussions of different approaches. To solve the Wahba problem, we need the
following lemmas, whose proofs are relegated to the appendix.

Lemma 1. For a diagonal matrix $D = \mathrm{diag}(\delta_1, \ldots, \delta_n)$ with
$\delta_1 \ge \delta_2 \ge \cdots \ge \delta_{n-1} \ge |\delta_n|$, we have
$\max_{G \in SO(n)} \mathrm{tr}(GD) = \sum_{i=1}^{n} \delta_i$, i.e., the maximum value is attained when
G is the identity matrix.

Lemma 2. For a diagonal matrix $D = \mathrm{diag}(\delta_1, \ldots, \delta_n)$ with $\delta_1 \ge \cdots \ge \delta_n \ge 0$,
the maximum value of $\mathrm{tr}(GD)$ over all orthogonal matrices G with determinant −1
equals $\sum_{i=1}^{n-1} \delta_i - \delta_n$, i.e., the maximum value is attained when $G = J_{(-1)}$, the n × n
identity matrix with its last diagonal element replaced by −1.

Since $S(M) = \sum_{\ell=1}^{m} (\|a_\ell\|^2 + \|b_\ell\|^2) - 2 \sum_{\ell=1}^{m} a_\ell^T M b_\ell$, it is apparent that
minimizing S(M) is equivalent to maximizing

\[
\tilde{S}(M) = \sum_{\ell=1}^{m} a_\ell^T M b_\ell = \mathrm{tr}(A^T M B), \tag{2}
\]

where the notations $(\cdot)^T$ and $\mathrm{tr}(\cdot)$ denote a matrix transpose and a matrix trace,
respectively, and where $A = (a_1, \ldots, a_m)$ and $B = (b_1, \ldots, b_m)$ are n × m matrices. Given
any n × n (nonnegative-definite) diagonal matrix $D = \mathrm{diag}(\delta_1, \ldots, \delta_n)$ with
$\delta_1 \ge \delta_2 \ge \cdots \ge \delta_n \ge 0$, it follows from Lemma 1 that $\mathrm{tr}(GD) = \sum_{i=1}^{n} G_{ii}\delta_i$ is
maximized over all rotation matrices $G = (G_{ij}) \in SO(n)$ when $G = J_{(1)}$ ($= I_n$), the n × n
identity matrix (which attains the maximum value $\sum_{i=1}^{n} \delta_i$). Moreover, by Lemma 2,
$\mathrm{tr}(GD)$ is maximized over all orthogonal matrices G whose determinants are equal to
−1 when $G = J_{(-1)}$, the n × n identity matrix with its last diagonal element replaced
by −1 (which attains the maximum value $\sum_{i=1}^{n-1} \delta_i - \delta_n$). Now, let $U D V^T$ be a singular
value decomposition of the matrix product $A B^T$, where the diagonal elements of the
diagonal matrix D satisfy $\delta_1 \ge \delta_2 \ge \cdots \ge \delta_n \ge 0$ and where U and V are appropriately
chosen orthogonal matrices of dimension n × n. Observe, for any rotation matrix
M, that

\[
\begin{aligned}
\tilde{S}(M) &= \mathrm{tr}(A^T M B) = \mathrm{tr}((A^T M B)^T) = \mathrm{tr}(B^T M^T A) = \mathrm{tr}(M^T (A B^T)) \\
&= \mathrm{tr}(M^T (U D V^T)) = \mathrm{tr}((V^T M^T U) D) = \mathrm{tr}(G D),
\end{aligned}
\]

where $G = V^T M^T U$ is orthogonal with determinant $\det(G) = \det(U)\det(V)$. As M
ranges over all rotation matrices, G ranges over all orthogonal matrices with determinant
$\det(U)\det(V)$. It follows that $\tilde{S}(M)$ is maximized, and S(M) is minimized, over
all rotation matrices $M \in SO(n)$ when $M = U J_{(\det(U)\det(V))} V^T$. This is the solution to
Wahba’s (unweighted) problem.
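
The closed form $M = U J V^T$ translates directly into a few lines of NumPy. The sketch below is illustrative only: the function name `wahba_rotation` and the convention that points are stored as columns of n × m arrays are assumptions, and `numpy.linalg.svd` is used because it returns singular values in descending order, matching the ordering required of D.

```python
import numpy as np

def wahba_rotation(A, B):
    """Rotation M in SO(n) minimizing sum_l ||a_l - M b_l||^2, where a_l and b_l
    are the columns of A and B; equivalently, M maximizes tr(A^T M B)."""
    U, _, Vt = np.linalg.svd(A @ B.T)        # A B^T = U D V^T
    J = np.eye(A.shape[0])
    # Last diagonal entry is det(U) det(V), which is +1 or -1.
    J[-1, -1] = 1.0 if np.linalg.det(U) * np.linalg.det(Vt) > 0 else -1.0
    return U @ J @ Vt

# Check on noise-free data: A is an exact rotation of B.
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
if np.linalg.det(Q) < 0:
    Q[:, 0] *= -1.0                          # force det(Q) = +1
B = rng.standard_normal((3, 4))
A = Q @ B
M = wahba_rotation(A, B)
print(np.allclose(M, Q), np.linalg.det(M))
```

The correction matrix J matters only when det(U)det(V) = −1; without it the product $U V^T$ would be an orthogonal matrix with determinant −1 rather than a rotation.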

3. ALGORITHMS AND CONVERGENCE RESULTS FOR THE GENERALIZED
WAHBA PROBLEM. We now consider the generalized Wahba problem of
minimizing $S(M_1, \ldots, M_k)$ in (1) over $(M_1, \ldots, M_k) \in SO(n) \times \cdots \times SO(n)$.
Letting $w_{ij} = w_{ji}$ for i > j, we may rewrite (1) as

\[
S(M_1, \ldots, M_k) = \sum_{0 \le i < j \le k} \sum_{\ell=1}^{m} w_{ij} \left( \|a_{i\ell}\|^2 + \|a_{j\ell}\|^2 \right)
 - \sum_{0 \le i \ne j \le k} \sum_{\ell=1}^{m} w_{ij} \, (M_i a_{i\ell})^T (M_j a_{j\ell}).
\]

Hence, for each fixed j ∈ {1, . . . , k},

 
m
S(M1 , . . . , Mk ) = −2 wi j (Mi ai )T (M j a j ) + S(− j)
i∈{0,...,k}\{ j} =1

= −2 tr (B Tj M j A j ) + S(− j) ,

where A j = (a j1 , . . . , a jm
) (matrix of dimension n × m), the th column of the n × m
dimensional matrix B j is i∈{0,...,k}\{ j} wi j Mi ai ( = 1, . . . , m), and S(− j) is a sum of

February 2017] A GENERALIZED WAHBA PROBLEM 153


Table 1. An example for A2 with k = 5 :
4 5 2 2 2 2 4 4 4 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
5 2 1 4 4 4 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5
2 4 4 5 5 5 1 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
3 3 5 1 3 3 3 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
1 5 3 3 1 2 2 2 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4

terms that do not involve $M_j$. It follows that minimizing $S(M_1, \ldots, M_k)$ over $M_j$ with
the other $M_i$’s fixed is equivalent to maximizing $\mathrm{tr}(B_j^T M_j A_j)$ over $M_j$ (cf. (2) with $A$
and $B$ replaced by $B_j$ and $A_j$, respectively), and can be readily solved using any available
algorithm (e.g., Markley’s singular value decomposition method in Section 2). As
described for the fixed index $j \in \{1, \ldots, k\}$, we shall refer to this approach toward
reducing the size of $S(M_1, \ldots, M_k)$ as a $j$-step, applied to a general current state
$(M_1, \ldots, M_k)$. Our algorithm for minimizing $S(M_1, \ldots, M_k)$ now takes shape: (i) start
with an arbitrary state (configuration) $(M_1^0, \ldots, M_k^0) \in SO(n) \times \cdots \times SO(n)$, called
the seed; then (ii) update this state through a sequence of $j$-steps, updating one rotation
matrix at a time, cycling through the possible choices for $j$ in some prescribed manner.
What we will call algorithm $A_1$ uses the trivial cycling strategy: cycling through the
indices $\{1, \ldots, k\}$ repeatedly, starting with the index 1. What we will call algorithm $A_2$
cycles through these indices by choosing at each step a “best possible $j$-step,” i.e., one
that reduces the current value of $S(M_1, \ldots, M_k)$ as much as possible. More precisely, if
the current (best possible) $j$-step is for $j = j' \in \{1, \ldots, k\}$, then to determine the next
(best possible) $j$-step, we need to compare all the $j$-steps with $j \in \{1, \ldots, k\} \setminus \{j'\}$
and choose one that yields the smallest (updated) value of $S(M_1, \ldots, M_k)$. It follows
that the two algorithms $A_1$ and $A_2$ coincide for $k = 2$, while algorithm $A_2$ is more
time consuming for $k > 2$.
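The $j$-step and the two cycling strategies can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors’ implementation. It assumes the objective has the form $S = \sum_{0 \le i < j \le k} w_{ij} \sum_\ell |M_i a_{i\ell} - M_j a_{j\ell}|^2$ with $M_0$ fixed at the identity and symmetric weights (equation (1) is not reproduced in this part of the text). The names `best_rotation`, `objective`, `j_step`, `algorithm_a1`, and `algorithm_a2` are hypothetical.

```python
import numpy as np

def best_rotation(C):
    """Return the M in SO(n) maximizing tr(M^T C), using the SVD of C."""
    U, _, Vt = np.linalg.svd(C)
    D = np.eye(C.shape[0])
    D[-1, -1] = np.sign(np.linalg.det(U @ Vt))  # forces det(M) = +1
    return U @ D @ Vt

def objective(M, A, w):
    """Assumed objective: weighted squared distances between rotated point sets."""
    K = len(A)
    return sum(w[i, j] * np.sum((M[i] @ A[i] - M[j] @ A[j]) ** 2)
               for i in range(K) for j in range(i + 1, K))

def j_step(M, A, w, j):
    """Re-optimize M[j] with every other rotation held fixed."""
    B = sum(w[i, j] * (M[i] @ A[i]) for i in range(len(A)) if i != j)
    updated = list(M)
    # tr(B^T M_j A_j) = tr(M_j^T (B A_j^T)), so the SVD is applied to B A_j^T.
    updated[j] = best_rotation(B @ A[j].T)
    return updated

def algorithm_a1(A, w, sweeps=50):
    """Trivial cycling: j = 1, ..., k repeatedly, from the identity seed."""
    M = [np.eye(A[0].shape[0]) for _ in A]  # M[0] is never updated
    history = [objective(M, A, w)]
    for _ in range(sweeps):
        for j in range(1, len(A)):
            M = j_step(M, A, w, j)
            history.append(objective(M, A, w))
    return M, history

def algorithm_a2(A, w, steps=100):
    """Greedy cycling (needs k >= 2): take the j-step, j != previous j, with smallest S."""
    M = [np.eye(A[0].shape[0]) for _ in A]
    history, last = [objective(M, A, w)], None
    for _ in range(steps):
        candidates = [(objective(j_step(M, A, w, j), A, w), j)
                      for j in range(1, len(A)) if j != last]
        value, last = min(candidates)
        M = j_step(M, A, w, last)
        history.append(value)
    return M, history
```

Here `A` is a list of $k+1$ arrays of shape $n \times m$ (index 0 is the reference set) and `w` is a symmetric $(k+1) \times (k+1)$ weight array. Under the assumed form of $S$, each $j$-step is an exact minimization over one rotation, so the recorded values of $S$ should never increase.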
For algorithm A2 , unlike for algorithm A1 , the frequency distribution of j-steps per-
formed ( j = 1, . . . , k) could become significantly uneven when k ≥ 3. But empirical
evidence indicates otherwise. What we observe is that, after a few j-steps, the pattern
of j values chosen by the algorithm starts to repeat according to some permutation of
the integers 1, . . . , k, continuing in this way until the current values of S(M1 , . . . , Mk )
cease to change (apart from round-off errors). Effectively, convergence has occurred.
Table 1 is a typical example for k = 5, with the values of j broken up into 40 ver-
tical blocks of 5, describing a total of 200 j-steps. It can be seen that the process of
repetition of the permutation (5, 3, 2, 4, 1) begins with the 42nd j-step.
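The claim about Table 1 can be checked mechanically. The sketch below transcribes the five rows of the table (assuming, as the text states, 40 entries per row), reads the $j$-steps down each vertical block in turn, and finds the first step from which the sequence has period 5. The helper name `periodic_from` is hypothetical.

```python
# Rows of Table 1, transcribed under the assumption that each row has 40 entries.
rows = [
    [4, 5, 2, 2, 2, 2, 4, 4, 4] + [1] * 31,
    [5, 2, 1, 4, 4, 4] + [5] * 34,
    [2, 4, 4, 5, 5, 5, 1] + [3] * 33,
    [3, 3, 5, 1, 3, 3, 3, 1] + [2] * 32,
    [1, 5, 3, 3, 1, 2, 2, 2] + [4] * 32,
]

# Reading down each vertical block in turn gives the sequence of 200 j-steps.
steps = [rows[r][c] for c in range(40) for r in range(5)]

def periodic_from(seq, period):
    """Smallest 1-based index s with seq[t] == seq[t + period] for all t >= s."""
    start = len(seq) - period  # 0-based; the condition is vacuous from here on
    while start > 0 and seq[start - 1] == seq[start - 1 + period]:
        start -= 1
    return start + 1

first = periodic_from(steps, 5)
print(first, steps[first - 1:first + 4])  # prints: 42 [5, 3, 2, 4, 1]
```

The same transcription also confirms that no index is chosen twice in a row, as the definition of algorithm $A_2$ requires.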
Whatever the cycling strategy used, we shall let $(M_1^r, \ldots, M_k^r)$ denote the state
(rotation matrix configuration) at the end of the $r$th step, $r = 1, 2, \ldots$. Clearly
$S(M_1^r, \ldots, M_k^r)$ decreases in $r$. So two natural questions arise: (1) As $r \to \infty$, does
$(M_1^r, \ldots, M_k^r)$ converge? and (2) Does

$$\lim_{r \to \infty} S(M_1^r, \ldots, M_k^r) = \min_{M_1, \ldots, M_k} S(M_1, \ldots, M_k)$$

hold? Since these remain open questions for algorithm $A_1$, we shall focus our attention
on algorithm $A_2$, addressing them from a theoretical standpoint as well as we presently
can.
An illustrative example for the algorithms Ai , for i = 1, 2, clarifies what can go
wrong. For k = 2, m = 4, and n = 3, we repeatedly computed the minimizing config-
urations (M1∗ , M2∗ ) for randomly generated 3 × 4 data A j , for j = 0, 1, 2, twice with
the same data, starting each trial with a different set of seeds, and checking for possible
disagreement. For simplicity we set all of the weights wi j in (1) equal to 1. (Note that

154 
c THE MATHEMATICAL ASSOCIATION OF AMERICA [Monthly 124
Table 2. An example for A1 = A2 with k = 2: two distinct seed-dependent limits for the same data.

Data (each block is a 3 × 4 matrix):

           A0            |            A1             |            A2
  0.56  0.42  0.99  0.62 | −0.09 −0.36 −0.45 −0.90 |  0.00  0.83 −0.40 −0.40
  0.82  0.91 −0.09  0.79 |  0.72 −0.59 −0.06 −0.36 | −0.91  0.11 −0.54  0.44
  0.13  0.00 −0.07 −0.01 | −0.69 −0.72  0.89 −0.23 |  0.40  0.55 −0.74 −0.80

First trial (identity seeds):

       M1 seed        |       M2 seed        |       M1 limit       |       M2 limit
  1.000  0.000  0.000 |  1.000  0.000  0.000 | −0.670 −0.529  0.520 |  0.049 −0.080 −0.996
  0.000  1.000  0.000 |  0.000  1.000  0.000 | −0.427 −0.298 −0.854 |  0.829  0.560 −0.004
  0.000  0.000  1.000 |  0.000  0.000  1.000 |  0.607 −0.795 −0.026 |  0.557 −0.825  0.094

Second trial (randomly generated seeds):

       M1 seed        |       M2 seed        |       M1 limit       |       M2 limit
  0.465 −0.774 −0.429 | −0.861 −0.435  0.263 | −0.900  0.229  0.370 | −0.762 −0.512 −0.396
  0.491  0.629 −0.603 |  0.251 −0.814 −0.525 | −0.326  0.209 −0.922 | −0.139 −0.468  0.873
  0.736  0.070  0.673 |  0.442 −0.386  0.810 | −0.289 −0.951 −0.114 | −0.632  0.720  0.285

the two algorithms are the same for $k = 2$.) On the 302nd repetition of this process, we
finally encountered a case of disagreement, as shown in Table 2. The pair of seeds used
in the second trial are randomly generated rotation matrices. As the table shows, these
give rise to a pair of limiting rotation matrices that differ from those resulting from
the simple pair of seeds $(I_3, I_3)$ used for the first trial, where $I_3$ is the $3 \times 3$ identity
matrix. The limiting values of $S(M_1^r, M_2^r)$ (as $r \to \infty$) for this example are 12.81672
and 12.52939, respectively (with an approximate ratio of 1.02). Extensive empirical
studies of this sort, conducted by the authors with randomly generated data and trial
seeds, have never produced more than two different limiting configurations $(M_1^*, M_2^*)$
when $k = 2$, $m = 4$, and $n = 3$. So we are confident that the smaller value 12.52939,
for this example, corresponds to a genuine global minimum for the sum in (1).
What the two limiting configurations appearing in Table 2 have in common is
important to note. They both have the appearance of being a global-minimum con-
figuration in that no further improvement (reduction of S(M1∗ , M2∗ )) is possible by the
application of an additional j-step. But, of course, one truly is and the other is not
a global-minimum configuration. In what follows, we will describe both configura-
tions as “stationary configurations.” This is an important concept for us to consider
at this point because our methodology naturally leads to the discovery of stationary
configurations that might or might not be global-minimum configurations. Whether a
stationary configuration is truly a global-minimum configuration depends on the seed
and the cycling strategy chosen.
A configuration $(M_1^*, \ldots, M_k^*)$ is said to be stationary in $SO(n) \times \cdots \times SO(n)$
(with respect to the sets of points $\{a_{j\ell}\}$ and the weights $w_{ij}$) if for each $j = 1, \ldots, k$,

$$S(M_1^*, \ldots, M_k^*) \le S(M_1^*, \ldots, M_{j-1}^*, M_j, M_{j+1}^*, \ldots, M_k^*) \quad \text{for all } M_j \ne M_j^*. \tag{3}$$

A configuration $(M_1^*, \ldots, M_k^*)$ is said to be strictly stationary if the inequality (3)
is strict for each $j = 1, \ldots, k$. Note that $(M_1^*, \ldots, M_k^*)$ is a (strictly) stationary
configuration if and only if for each $j = 1, \ldots, k$, $M_j^*$ is the (unique) minimizer
of $S(M_1^*, \ldots, M_{j-1}^*, M_j, M_{j+1}^*, \ldots, M_k^*)$ over $M_j$. Note also that if $(M_1^*, \ldots, M_k^*)$
minimizes $S(M_1, \ldots, M_k)$, then it is a stationary configuration. To address the convergence
issues, we need to impose a metric on the (compact) space $SO(n) \times \cdots \times SO(n)$.
For convenience, we adopt the norm $|M| := \max_{1 \le i, j \le n} |M_{ij}|$ and the metric
$d((M_1, \ldots, M_k), (M_1', \ldots, M_k')) := \max\{|M_j - M_j'| : j = 1, \ldots, k\}$.
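As a small illustration of the norm and metric just defined, the following sketch computes both for configurations stored as lists of NumPy arrays. The function names `entry_norm` and `config_distance` are hypothetical.

```python
import numpy as np

def entry_norm(M):
    """|M|: the largest absolute value of an entry of M."""
    return float(np.max(np.abs(M)))

def config_distance(P, Q):
    """d(P, Q): the largest entry norm of a componentwise difference."""
    return max(entry_norm(Mp - Mq) for Mp, Mq in zip(P, Q))
```

Two configurations are within distance $\varepsilon$ of each other exactly when every entry of every component differs by less than $\varepsilon$.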



Theorem 1. Let $\mathcal{P} = \{P_1, \ldots, P_\nu\}$ be the set of all stationary configurations which
is assumed to be finite with cardinality $\nu$. Further assume that for each pair $P_i$ and
$P_j$ with $i \ne j$, either $S(P_i) \ne S(P_j)$ or one of $P_i$ and $P_j$ is strictly stationary. Then
the sequence of configurations $Q^r := (M_1^r, \ldots, M_k^r)$ generated by algorithm $A_2$ converges
to a stationary configuration (which may depend on the initial configuration
$(M_1^0, \ldots, M_k^0)$).

To prove Theorem 1, we need the following lemma whose proof is relegated to the
appendix.

Lemma 3. Suppose that a subsequence $\{Q^{r_\ell}\}$ converges to $Q' = (M_1', \ldots, M_k')$. Then
$Q'$ is stationary and $S(Q') = \lim_{r \to \infty} S(Q^r)$.

Proof of Theorem 1. To prove that $Q^r = (M_1^r, \ldots, M_k^r)$ converges, it suffices by
compactness of $SO(n) \times \cdots \times SO(n)$ to show that any two convergent subsequences
$\{Q^{r_\ell}\}$ and $\{Q^{r_u}\}$ with respective limits $Q' = (M_1', \ldots, M_k')$ and $Q'' = (M_1'', \ldots, M_k'')$
satisfy $Q' = Q''$. By Lemma 3, $Q' = P_i$ and $Q'' = P_j$ for some $i$ and $j$, and $S(Q') =
S(Q'') = c := \lim_{r \to \infty} S(Q^r)$. Suppose $i \ne j$, i.e., $Q' = P_i \ne P_j = Q''$. We will show
that this leads to a contradiction. Since $S(P_i) = c = S(P_j)$, we have by the assumption
of the theorem that one of $P_i$ and $P_j$ is strictly stationary. Without loss of generality,
assume $i = 1$, $j = 2$, and $P_1$ is strictly stationary. Let $\mathcal{P}' = \{P_h \in \mathcal{P} : S(P_h) = c\}$.
Without loss of generality, further assume $\mathcal{P}' = \{P_1, P_2, \ldots, P_{\nu'}\}$ where $2 \le \nu' \le \nu$.
For each $h = 2, \ldots, \nu'$, $P_1$ and $P_h$ differ in at least two components since $P_1$ is strictly
stationary and $S(P_1) = S(P_h)$. Write $P_h = (M_1^{(h)}, \ldots, M_k^{(h)})$, for $h = 1, \ldots, \nu'$. (Note
that $(M_1', \ldots, M_k') = Q' = P_1 = (M_1^{(1)}, \ldots, M_k^{(1)})$ and $(M_1'', \ldots, M_k'') = Q'' = P_2 =
(M_1^{(2)}, \ldots, M_k^{(2)})$.) Let $\varepsilon > 0$ be the smallest one among all nonzero values of
$|M_i^{(1)} - M_i^{(h)}|$, for $i = 1, \ldots, k$ and $h = 2, \ldots, \nu'$. Consider the neighborhood

$$R_h := \{(O_1, \ldots, O_k) \in SO(n) \times \cdots \times SO(n) : d((O_1, \ldots, O_k), (M_1^{(h)}, \ldots, M_k^{(h)})) < \varepsilon/2\},$$

for $h = 1, \ldots, \nu'$. Clearly $R_1 \cap R_h = \emptyset$, for $h = 2, \ldots, \nu'$. Furthermore, since $P_1$ and
$P_h$ differ in at least two components for $h = 2, \ldots, \nu'$, every configuration in $R_1$ must
differ in two or more components from any configuration in $R_2 \cup \cdots \cup R_{\nu'}$. By Lemma
3, every convergent subsequence of $\{Q^r\}$ converges to some $P_h \in \mathcal{P}'$. It follows that
for some $K > 0$,

$$Q^r \in R_1 \cup R_2 \cup \cdots \cup R_{\nu'} \quad \text{for all } r > K \tag{4}$$

(since otherwise there would be a convergent subsequence with a limit $\notin R_1 \cup R_2 \cup
\cdots \cup R_{\nu'}$). Since $Q^{r_\ell}$ converges to $Q' = P_1$, there is an $\ell'$ such that $r_{\ell'} > K$ and
$Q^{r_{\ell'}} \in R_1$. We claim that $(M_1^r, \ldots, M_k^r) \in R_1$ for all $r \ge r_{\ell'} > K$, which contradicts
$(M_1^{r_u}, \ldots, M_k^{r_u}) \to P_2$. To establish the claim, we proceed by induction. The claim
holds for $r = r_{\ell'}$. Suppose $Q^r \in R_1$ for $r \ge r_{\ell'} > K$. We need to show that $Q^{r+1} \in R_1$.
Note by (4) that $Q^{r+1} \in R_1 \cup R_2 \cup \cdots \cup R_{\nu'}$. Also by the definition of $A_2$, $Q^r$ and
$Q^{r+1}$ differ only in one component. As noted earlier, every configuration in $R_1$ must
differ in two or more components from any configuration in $R_2 \cup \cdots \cup R_{\nu'}$, implying
that $Q^{r+1} \in R_1$. This completes the proof.

While for technical reasons we can only address the convergence issue for algorithm
$A_2$ (instead of $A_1$) as in Theorem 1, this result provides theoretical support of
convergence for $A_1$ as the two algorithms are close cousins. (Recall that $A_1 = A_2$ for
$k = 2$.) Indeed, we have performed extensive simulation studies with $n = 3$, and for
most of the simulation studies we have used algorithm $A_1$ (instead of $A_2$), and always
observed apparent convergence of $(M_1^r, \ldots, M_k^r)$ as $r$ gets large. It should be remarked
that if the sequence of $(M_1^r, \ldots, M_k^r)$ generated by algorithm $A_1$ converges, then the
limiting configuration is necessarily stationary.
Theorem 1 assumes that the number of stationary configurations is finite. To get
some idea of how many stationary configurations there can be in the worst possible
data situation, we carried out extensive simulation studies for m = 4, n = 3, and
k = 2, . . . , 10, and found that the maximum numbers of stationary configurations
are 2, 2, 3, 3, 4, 4, 4, 4, 5, respectively. This is of practical importance when one is
worried that the stationary configuration found is not a global-minimum configuration.
One can simply use algorithm $A_1$ repeatedly with randomly generated seeds. For
$k = 5$, $m = 4$, and $n = 3$ as an example, one is very likely to end up with the same
stationary configuration over and over again, because for most data sets there is only one
stationary configuration (which is necessarily the global-minimum configuration). But if one finds
a second stationary configuration, then the configuration that yields the larger value
of $S(M_1, \ldots, M_5)$ can be discarded. Continuing, no new stationary configuration is
likely to be found, but if a third stationary configuration is encountered, one can again
discard the configuration corresponding to the larger value of $S(M_1, \ldots, M_5)$. At this
point, one can continue with randomly generated seeds, but the counts listed above indicate
that, based on an enormous number of examples we have checked, the maximum number of
stationary configurations for $k = 5$ is 3, and new ones will not be found by continuing.
So as a practical matter, one is bound by persistence (even in the worst possible data
situation) to find the global minimum one seeks. Lest it seem to the reader that this
process, as outlined (to make certain that the true global minimum is found), will be
time consuming, the actual computational time on a PC will be a few minutes at most,
and probably considerably less, simply because algorithm A1 converges so rapidly.
We concede that the above discussion is based solely on empirical evidence without
rigorous theoretical justification.
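The multi-start strategy described above can be sketched as follows. This is an illustration only, assuming unit weights, $M_0$ fixed at the identity, and an objective of the form $S = \sum_{0 \le i < j \le k} \sum_\ell |M_i a_{i\ell} - M_j a_{j\ell}|^2$ (an assumption, since equation (1) is not reproduced here). Random seeds are drawn via a QR factorization, which is one common way to generate rotation matrices and not necessarily the authors’ choice. All function names are hypothetical.

```python
import numpy as np

def random_rotation(n, rng):
    """Random rotation from the QR factorization of a Gaussian matrix."""
    Q, R = np.linalg.qr(rng.standard_normal((n, n)))
    Q = Q * np.sign(np.diag(R))   # normalize the column signs
    if np.linalg.det(Q) < 0:      # reflect one column to land in SO(n)
        Q[:, 0] = -Q[:, 0]
    return Q

def best_rotation(C):
    """Return the M in SO(n) maximizing tr(M^T C), using the SVD of C."""
    U, _, Vt = np.linalg.svd(C)
    D = np.eye(C.shape[0])
    D[-1, -1] = np.sign(np.linalg.det(U @ Vt))
    return U @ D @ Vt

def objective(M, A):
    """Assumed objective with unit weights."""
    K = len(A)
    return sum(np.sum((M[i] @ A[i] - M[j] @ A[j]) ** 2)
               for i in range(K) for j in range(i + 1, K))

def run_a1(A, seeds, sweeps=200):
    """Algorithm A1 from the given seeds for M_1, ..., M_k."""
    M = [np.eye(A[0].shape[0])] + list(seeds)
    for _ in range(sweeps):
        for j in range(1, len(A)):
            B = sum(M[i] @ A[i] for i in range(len(A)) if i != j)
            M[j] = best_rotation(B @ A[j].T)
    return objective(M, A), M

def multistart(A, trials, rng):
    """Run A1 from several random seeds; keep the run with the smallest S."""
    n, k = A[0].shape[0], len(A) - 1
    results = [run_a1(A, [random_rotation(n, rng) for _ in range(k)])
               for _ in range(trials)]
    return min(results, key=lambda res: res[0]), results
```

Comparing the limiting values of $S$ across runs (up to round-off) gives a rough count of the distinct stationary configurations encountered for a given data set.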
Due to the possible presence of multiple stationary configurations, one can never
know for sure if the limiting (stationary) configuration of (M1r , . . . , Mkr ) corresponds
to the global minimum of S(M1 , . . . , Mk ). However, it appears to us that this issue is
likely to be insignificant in practice for the following reasons:
• for the vast majority of the simulated data sets, there appears to be only one
stationary configuration which would necessarily correspond to the global minimum
of S(M1 , . . . , Mk );
• in the rare cases when multiple stationary configurations arise, evidence suggests
that the $k + 1$ sets of labeled points in the corresponding data set $\{a_{j\ell}, \ell = 1, \ldots, m\}$,
for $j = 0, \ldots, k$, cannot be brought into very close coincidence by properly chosen
rotation matrices $M_1, \ldots, M_k$, which, of course, is the objective. In view of the
rather large size of the global minimum for the example described in Table 2, an
inability to obtain a close coincidence of the corresponding labeled points is evident;
• for these exceptional simulated cases of multiple stationary configurations, we
have observed that the largest of the S(M1∗ , . . . , Mk∗ ) values is only a few percent-
age points larger than the smallest, namely the one corresponding to the global
minimum of S(M1 , . . . , Mk ); cf. the ratio of 1.02 for the example described in
Table 2.



4. CONCLUSION. The well-known Wahba problem is to find an optimal rotation
that brings one set of labeled points into close coincidence, in a least-squares sense,
with a second set of labeled points, which was solved by Wahba [17] analytically.
Later several effective algorithms were proposed to obtain the solution numerically. In
this paper we formulated a generalized version of the Wahba problem which is to find
optimal rotations that bring multiple sets of labeled points into close coincidence in a
weighted least-squares sense. While there appears to be no analytic solution for this
generalized optimization problem, we proposed a couple of algorithms (A1 and A2 )
to solve it numerically. The basic idea, in each step, is to reduce the problem to one
that is equivalent to the original Wahba problem and so can be readily solved. While
there is no guarantee for the algorithms to find the optimal rotations, we established
some convergence results and carried out extensive simulation studies in support of
the algorithms. Finally, we note that the Wahba problem was extended in the robotics
literature (cf. Horn [5] and Umeyama [15]) to include a translation vector in addition to
the rotation matrix, which can be easily reduced to the original Wahba problem. In our
formulation, we may also include translation vectors along with the rotation matrices,
one for each set of labeled points. With minor modifications, the algorithms A1 and
A2 can also be used to deal with this extended version.

5. APPENDIX. In the appendix we prove Lemmas 1, 2, and 3.

Proof of Lemma 1. The case $\delta_n \ge 0$ is trivial. We now assume $\delta_n < 0$. Since the
function

$$F(\delta_1, \ldots, \delta_n) := \max_{G \in SO(n)} \mathrm{tr}(GD)$$

is continuous in $\delta_1, \ldots, \delta_n$, it suffices to prove $F(\delta_1, \ldots, \delta_n) = \sum_{i=1}^n \delta_i$ for
$\delta_1 > \cdots > \delta_{n-1} > -\delta_n > 0$. Let $M \in SO(n)$ maximize $\mathrm{tr}(GD)$ over $G \in SO(n)$, i.e.,

$$F(\delta_1, \ldots, \delta_n) = \mathrm{tr}(MD) = \sum_{i=1}^n M_{ii} \delta_i. \tag{5}$$

We claim that $M_{ij} = 0$ for $i \ne j$, from which it follows easily that $M_{ii} = 1$ for all $i$
and $F(\delta_1, \ldots, \delta_n) = \sum_{i=1}^n \delta_i$.
It remains to establish the claim. For each pair $(i, j)$ with $i \ne j$, let $R_{ij}(\theta)$ be the
identity matrix with the elements at the four locations $(i, i)$, $(i, j)$, $(j, i)$, and $(j, j)$
replaced by $\cos\theta$, $-\sin\theta$, $\sin\theta$, and $\cos\theta$, respectively. Since $R_{ij}(\theta) \in SO(n)$, we
have $M R_{ij}(\theta) \in SO(n)$, and

$$f_{ij}(\theta) := \mathrm{tr}(M R_{ij}(\theta) D) = \delta_i (M_{ii} \cos\theta + M_{ij} \sin\theta) + \delta_j (M_{jj} \cos\theta - M_{ji} \sin\theta) + \sum_{h \in \{1, \ldots, n\} \setminus \{i, j\}} \delta_h M_{hh}.$$

As $M$ maximizes $\mathrm{tr}(GD)$ over $G \in SO(n)$, $f_{ij}(\theta)$ attains the maximum value at $\theta = 0$,
implying that

$$0 = \frac{d}{d\theta} f_{ij}(\theta)\Big|_{\theta = 0} = \delta_i M_{ij} - \delta_j M_{ji}, \quad \text{i.e.,} \quad \delta_i M_{ij} = \delta_j M_{ji}. \tag{6}$$

By (6) with $i = 1$, we have

$$\delta_1^2 = \delta_1^2 \sum_{j=1}^n M_{1j}^2 = \sum_{j=1}^n \delta_j^2 M_{j1}^2,$$

which together with $\delta_1^2 > \delta_j^2$ for $j \ne 1$ and $\sum_{j=1}^n M_{j1}^2 = 1$ implies that $M_{11}^2 = 1$, which
in turn implies that $M_{1j} = M_{j1} = 0$ for $j \ne 1$. Applying (6) repeatedly shows that
$M_{ij} = 0$ for $i \ne j$. The proof is complete.
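The conclusion of this proof can be checked numerically. The sketch below assumes, as the proof suggests, that $D = \mathrm{diag}(\delta_1, \ldots, \delta_n)$ with $\delta_1 > \cdots > \delta_{n-1} > -\delta_n > 0$ (the statement of Lemma 1 itself is not reproduced in this part of the text). It builds random rotations as products of the plane rotations $R_{ij}(\theta)$ used in the proof and confirms that none of them gives a trace above $\sum_i \delta_i$. The function names and the sample values of $\delta$ are illustrative choices.

```python
import numpy as np

def plane_rotation(n, i, j, theta):
    """R_ij(theta): identity except for a 2x2 rotation in coordinates i and j."""
    R = np.eye(n)
    R[i, i] = R[j, j] = np.cos(theta)
    R[i, j] = -np.sin(theta)
    R[j, i] = np.sin(theta)
    return R

def random_rotation(n, rng, factors=20):
    """Product of randomly chosen plane rotations (an element of SO(n))."""
    G = np.eye(n)
    for _ in range(factors):
        i, j = rng.choice(n, size=2, replace=False)
        G = G @ plane_rotation(n, i, j, rng.uniform(0.0, 2.0 * np.pi))
    return G

delta = np.array([3.0, 2.0, -1.0])  # delta_1 > delta_2 > -delta_3 > 0
D = np.diag(delta)
rng = np.random.default_rng(1)
traces = [np.trace(random_rotation(3, rng) @ D) for _ in range(2000)]
print(max(traces) <= delta.sum() + 1e-9)  # the identity attains delta.sum()
```

A random search of this kind cannot prove the lemma, but a sampled trace above $\sum_i \delta_i$ would indicate an error in the setup.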

Proof of Lemma 2. Note that $\mathrm{tr}(GD) = \mathrm{tr}(G J(-1) J(-1) D) = \mathrm{tr}(G' D')$, where
$G' = G J(-1)$ and $D' = J(-1) D = \mathrm{diag}(\delta_1, \ldots, \delta_{n-1}, -\delta_n)$. As $G$ ranges over all
orthogonal matrices with determinant $-1$, $G'$ ranges over all rotation matrices. The
desired result now follows from Lemma 1.

Proof of Lemma 3. Since $S(Q^r)$ is decreasing in $r$, we have

$$c := \lim_{r \to \infty} S(Q^r) = \lim_{\ell \to \infty} S(Q^{r_\ell}) = S(Q').$$

To show that $Q'$ is stationary, suppose to the contrary that for some $1 \le j \le k$ and
some $M_j^* \in SO(n)$,

$$c^* := S(M_1', \ldots, M_{j-1}', M_j^*, M_{j+1}', \ldots, M_k') < S(M_1', \ldots, M_k') = S(Q') = c.$$

With $\varepsilon = c - c^* > 0$, a standard continuity argument shows that there exists a $\delta > 0$
such that $S(M_1, \ldots, M_{j-1}, M_j^*, M_{j+1}, \ldots, M_k) < c - \varepsilon/2$ whenever $|M_i - M_i'| < \delta$
for all $i \in \{1, \ldots, k\} \setminus \{j\}$. Since $Q^{r_\ell} \to Q'$, there is an $\ell'$ such that $d(Q^{r_{\ell'}}, Q') < \delta$,
i.e., $|M_i^{r_{\ell'}} - M_i'| < \delta$ for all $1 \le i \le k$. By the definition of algorithm $A_2$,

$$S(Q^{r_{\ell'}+1}) \le \min_{M_j} S(M_1^{r_{\ell'}}, \ldots, M_{j-1}^{r_{\ell'}}, M_j, M_{j+1}^{r_{\ell'}}, \ldots, M_k^{r_{\ell'}}) \le S(M_1^{r_{\ell'}}, \ldots, M_{j-1}^{r_{\ell'}}, M_j^*, M_{j+1}^{r_{\ell'}}, \ldots, M_k^{r_{\ell'}}) < c - \varepsilon/2 < c,$$

which contradicts the fact that $S(Q^r)$ monotonically decreases to $c$, completing the
proof.

REFERENCES

1. M. Alsheimer, E. von Glasenapp, R. Hock, R. Benavente, Architecture of the nuclear periphery of rat
pachytene spermatocytes: Distribution of nuclear envelope proteins in relation to synaptonemal complex
attachment sites, Mol. Biol. Cell 10 (1999) 1235–1245.
2. L. Crabbe, A. J. Cesare, J. M. Kasuboske, J. A. Fitzpatrick, J. Karlseder, Human telomeres are tethered
to the nuclear envelope during postmitotic nuclear assembly, Cell Rep. 2 (2012) 1521–1529.
3. A. H. J. de Ruiter, J. R. Forbes, On the solution of Wahba’s problem on SO(n), J. Astronaut. Sci. 60 (2013)
1–31.
4. J. L. Farrell, J. C. Stuelpnagel, Problem 65-1: A least squares estimate of spacecraft attitude, SIAM Rev.
8 (1966) 384–386.



5. B. K. P. Horn, Closed-form solution of absolute orientation using unit quaternions, J. Opt. Soc. Amer. A
4 (1987) 629–642.
6. H. Hou, Z. Zhou, Y. Wang, J. Wang, S. P. Kallgren, T. Kurchuk, E. A. Miller, F. Chang, S. Jia, Csi1 links
centromeres to the nuclear envelope for centromere clustering, J. Cell Biol. 199 (2012) 735–744.
7. J. Kind, B. van Steensel, Stochastic genome-nuclear lamina interactions, Nucleus 5 (2014) 124–130.
8. F. L. Markley, Attitude determination using vector observations and the singular value decomposition, J.
Astronaut. Sci. 36 (1988) 245–258.
9. P. B. Moens, C. Heyting, A. J. J. Dietrich, W. van Raamsdonk, Q. Chen, Synaptonemal complex antigen
location and conservation, J. Cell Biol. 105 (1987) 93–103.
10. J. Postberg, S. A. Juranek, S. Feiler, H. Kortwig, F. Jonsson, H. J. Lipps, Association of the telomere-
telomere-binding protein complex of hypotrichous ciliates with the nuclear matrix and dissociation dur-
ing replication, J. Cell Sci. 114 (2001) 1861–1866.
11. M. L. Psiaki, Generalized Wahba problems for spinning spacecraft attitude and rate determination, J.
Astronaut. Sci. 57 (2010) 73–92.
12. J. Schmidt, R. Benavente, D. Hdzic, C. Hoog, C. L. Stewart, M. Alsheimer, Transmembrane protein Sun2
is involved in tethering mammalian meiotic telomeres to the nuclear envelope, Proc. Natl. Acad. Sci. USA
104 (2007) 7426–7431.
13. M. D. Shuster, The generalized Wahba problem, J. Astronaut. Sci. 54 (2006) 245–259.
14. C. W. Spoor, F. E. Veldpaus, Rigid body motion calculated from spatial co-ordinates of markers, J.
Biomech. 13 (1980) 391–393.
15. S. Umeyama, Least-squares estimation of transformation parameters between two point patterns, IEEE
Trans. Pattern Anal. Mach. Intell. 13 (1991) 376–380.
16. F. E. Veldpaus, H. J. Woltring, L. J. M. G. Dortmans, A least-squares algorithm for the equiform trans-
formation from spatial marker co-ordinates, J. Biomech. 21 (1988) 45–54.
17. G. Wahba, Problem 65-1: A least squares estimate of spacecraft attitude, SIAM Rev. 7 (1965) 409.

BISHARAH LIBBUS is a retired geneticist. He received his Master’s degree in biology from the American
University of Beirut, Lebanon (AUB), and his Ph.D. in genetics from the University of Missouri, Columbia. He
was a postdoc at the Johns Hopkins University before holding various faculty positions at Haigazian College
and AUB in Beirut, the University of Vermont, and a senior visiting scientist at the National Institutes of
Health. His interest in chromosome order grew out of his study of mammalian meiosis and male chromosome
organization.
401 Ironwoods Dr., Chapel Hill, NC 27516
blibbus@gmail.com

GORDON SIMONS is a retired professor of statistics from the University of North Carolina, where he taught
statistics and probability for 38 years. He received undergraduate and Master’s degrees in mathematics and a
Ph.D. in statistics, all from the University of Minnesota, and then was a postdoc at Stanford University for
two years before coming to the University of North Carolina in 1968 to teach, and to direct the education and
research of graduate students in statistics. He chaired the Department of Statistics for seven years during his
tenure at UNC.
Department of Statistics and Operations Research, University of North Carolina, Chapel Hill NC 27599-3260
gsimons@live.unc.edu

YI-CHING YAO received an undergraduate degree in electrical engineering from National Taiwan University
in 1976 and a Ph.D. in mathematics (specialized in statistics) from MIT in 1982. He was with the Department
of Statistics, Colorado State University during 1983–1995. Since 1995, he has been a research fellow at the
Institute of Statistical Science, Academia Sinica, Taiwan. His research interests are in the areas of applied
probability and statistics. He served on the editorial board for the journals of the Annals of Statistics, Bernoulli,
Sankhyā, and Statistica Sinica.
Institute of Statistical Science, Academia Sinica, Taipei 115, Taiwan, ROC
yao@stat.sinica.edu.tw

NOTES
Edited by Vadim Ponomarenko

Generalized Infinite Products for Powers of $e^{1/k}$
Scott Ginebaugh

Abstract. Catalan and Pippenger discovered infinite products for various values related to $e$.
Based on these, infinite products for $e^{1/2}$ and $e^{2/3}$ were found by Sondow and Yi [3], who
furthermore conjectured that the products could be generalized for a power of $e^{1/k}$. The
discoveries in this paper result from attempting to prove this conjecture. By generalizing Sondow
and Yi’s products, infinite products for powers of $e^{1/k}$ are found.

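1. INTRODUCTION. In 1873, Catalan [1] found an infinite product for $e$:

$$e = \frac{2}{1} \left(\frac{4}{3}\right)^{1/2} \left(\frac{6 \cdot 8}{5 \cdot 7}\right)^{1/4} \left(\frac{10 \cdot 12 \cdot 14 \cdot 16}{9 \cdot 11 \cdot 13 \cdot 15}\right)^{1/8} \cdots. \tag{1}$$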

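Pippenger [2] published a similar product in 1980:

$$\frac{e}{2} = \left(\frac{2}{1}\right)^{1/2} \left(\frac{2}{3} \cdot \frac{4}{3}\right)^{1/4} \left(\frac{4}{5} \cdot \frac{6}{5} \cdot \frac{6}{7} \cdot \frac{8}{7}\right)^{1/8} \left(\frac{8}{9} \cdot \frac{10}{9} \cdot \frac{10}{11} \cdot \frac{12}{11} \cdot \frac{12}{13} \cdot \frac{14}{13} \cdot \frac{14}{15} \cdot \frac{16}{15}\right)^{1/16} \cdots. \tag{2}$$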

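Based on Catalan and Pippenger’s products, Sondow and Yi [3] discovered the following
formulas:

$$\frac{\sqrt{e}}{2} = \left(\frac{2}{3}\right)^{1/2} \left(\frac{6 \cdot 6}{5 \cdot 7}\right)^{1/4} \left(\frac{10 \cdot 10 \cdot 14 \cdot 14}{9 \cdot 11 \cdot 13 \cdot 15}\right)^{1/8} \cdots, \tag{3}$$

$$\frac{e^{2/3}}{\sqrt{3}} = \left(\frac{3}{2}\right)^{1/3} \left(\frac{3 \cdot 6 \cdot 6 \cdot 9}{4 \cdot 5 \cdot 7 \cdot 8}\right)^{1/9} \left(\frac{9 \cdot 12 \cdot 12 \cdot 15 \cdot 15 \cdot 18 \cdot 18 \cdot 21 \cdot 21 \cdot 24 \cdot 24 \cdot 27}{10 \cdot 11 \cdot 13 \cdot 14 \cdot 16 \cdot 17 \cdot 19 \cdot 20 \cdot 22 \cdot 23 \cdot 25 \cdot 26}\right)^{1/27} \cdots. \tag{4}$$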

Products (2), (3), and (4) are similar in that they were proved by calculating the $n$th
partial product and applying Stirling’s asymptotic formula

$$N! \sim \sqrt{2\pi N}\,(N/e)^N \quad (N \to \infty).$$

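Using a similar method, we found generalizations of the infinite products (3) and
(4). These products were found based on the fact that (3) can be rewritten as

$$\frac{\sqrt{e}}{2} = \left(\frac{2}{3}\right)^{1/2} \left(\frac{2^2 \cdot 3^2}{5 \cdot 7}\right)^{1/4} \left(\frac{2^4 (5 \cdot 7)^2}{9 \cdot 11 \cdot 13 \cdot 15}\right)^{1/8} \cdots \tag{5}$$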
http://dx.doi.org/10.4169/amer.math.monthly.124.2.161
MSC: Primary 11Y60, Secondary 40A20

February 2017] NOTES 161


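and, as Sondow and Yi pointed out, (4) is equivalent to

$$\frac{e^{2/3}}{\sqrt{3}} = \left(\frac{3}{2}\right)^{1/3} \left(\frac{3^5\, 2^2}{4 \cdot 5 \cdot 7 \cdot 8}\right)^{1/9} \left(\frac{3^{17} (2 \cdot 4 \cdot 5 \cdot 7 \cdot 8)^2}{10 \cdot 11 \cdot 13 \cdot 14 \cdot 16 \cdot 17 \cdot 19 \cdot 20 \cdot 22 \cdot 23 \cdot 25 \cdot 26}\right)^{1/27} \cdots. \tag{6}$$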

2. THEOREMS.

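Theorem 1. For any integer $k \ge 2$, the following infinite product holds:

$$\frac{e^{(k-1)/k}}{k^{1/(k-1)}} = \left(\frac{k^{k-2}}{(k-1)!}\right)^{1/k} \prod_{n=2}^{\infty} \left(\frac{k^{k^n - k^{n-1} - 1}\, \left((k^{n-1})!\right)^{k+1}}{(k^n)!\, \left((k^{n-2})!\right)^k}\right)^{1/k^n}. \tag{7}$$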

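Using elementary algebra, the formula in Theorem 1 can be rewritten as

$$\frac{e^{(k-1)/k}}{k^{1/(k-1)}} = \left(\frac{k^{k-2}}{(k-1)!}\right)^{1/k} \prod_{n=2}^{\infty} \left(\frac{k^{k^n - k^{n-1} - 1}\, \left((k^{n-1})! / \left((k^{n-2})!\, k^{k^{n-2}}\right)\right)^{k-1}}{(k^n)!\,(k^{n-2})! / \left(\left((k^{n-1})!\right)^2 k^{k^{n-1} - k^{n-2}}\right)}\right)^{1/k^n}. \tag{8}$$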

In this form, Sondow and Yi’s formula (6) is the special case $k = 3$.
Equation (8) can be written in a form similar to (4), where for any $k \ge 3$ the first
factor is

$$\left(\frac{k}{2} \cdot \frac{k}{3} \cdots \frac{k}{k-1}\right)^{1/k} \tag{9}$$

and the $n$th factor for any $n \ge 2$ is

$$\left(\frac{k^{n-1}}{k^{n-1}+1} \cdot \frac{k^{n-1}+k}{k^{n-1}+2} \cdots \frac{k^{n-1}+k}{k^{n-1}+k-1} \cdot \frac{k^{n-1}+k}{k^{n-1}+k+1} \cdot \frac{k^{n-1}+2k}{k^{n-1}+k+2} \cdots \frac{k^{n-1}+2k}{k^{n-1}+2k-1} \cdot \frac{k^{n-1}+2k}{k^{n-1}+2k+1} \cdots \frac{k^n}{k^n-1}\right)^{1/k^n}. \tag{10}$$

Thus, Sondow and Yi’s formula (4) is the special case k = 3 of (9) and (10).

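Example 1. Letting $k = 5$ in (9) and (10) yields

$$\frac{e^{4/5}}{5^{1/4}} = \left(\frac{5}{2} \cdot \frac{5}{3} \cdot \frac{5}{4}\right)^{1/5} \left(\frac{5}{6} \cdot \frac{10}{7} \cdot \frac{10}{8} \cdot \frac{10}{9} \cdot \frac{10}{11} \cdot \frac{15}{12} \cdot \frac{15}{13} \cdot \frac{15}{14} \cdot \frac{15}{16} \cdot \frac{20}{17} \cdot \frac{20}{18} \cdot \frac{20}{19} \cdot \frac{20}{21} \cdot \frac{25}{22} \cdot \frac{25}{23} \cdot \frac{25}{24}\right)^{1/25}$$
$$\cdot \left(\frac{25}{26} \cdot \frac{30}{27} \cdot \frac{30}{28} \cdot \frac{30}{29} \cdot \frac{30}{31} \cdot \frac{35}{32} \cdots \frac{120}{117} \cdot \frac{120}{118} \cdot \frac{120}{119} \cdot \frac{120}{121} \cdot \frac{125}{122} \cdot \frac{125}{123} \cdot \frac{125}{124}\right)^{1/125} \cdots.$$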

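When written in the form of (8), this is

$$\frac{e^{4/5}}{5^{1/4}} = \left(\frac{5^3}{2 \cdot 3 \cdot 4}\right)^{1/5} \left(\frac{5^{19} (2 \cdot 3 \cdot 4)^4}{6 \cdot 7 \cdot 8 \cdot 9 \cdot 11 \cdot 12 \cdots 19 \cdot 21 \cdot 22 \cdot 23 \cdot 24}\right)^{1/25}$$
$$\cdot \left(\frac{5^{99} (2 \cdot 3 \cdot 4 \cdot 6 \cdot 7 \cdot 8 \cdot 9 \cdot 11 \cdot 12 \cdots 19 \cdot 21 \cdot 22 \cdot 23 \cdot 24)^4}{26 \cdot 27 \cdot 28 \cdot 29 \cdot 31 \cdot 32 \cdots 119 \cdot 121 \cdot 122 \cdot 123 \cdot 124}\right)^{1/125} \cdots.$$

Note that (7) also works when $k = 2$, which provides the product

$$\frac{\sqrt{e}}{2} = \left(\frac{2}{3}\right)^{1/4} \left(\frac{4 \cdot 6}{5 \cdot 7}\right)^{1/8} \left(\frac{8 \cdot 10 \cdot 12 \cdot 14}{9 \cdot 11 \cdot 13 \cdot 15}\right)^{1/16} \left(\frac{16 \cdot 18 \cdot 20 \cdot 22 \cdot 24 \cdot 26 \cdot 28 \cdot 30}{17 \cdot 19 \cdot 21 \cdot 23 \cdot 25 \cdot 27 \cdot 29 \cdot 31}\right)^{1/32} \cdots.$$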
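Product (7) can be checked numerically for a few values of $k$. The sketch below sums the logarithms of the factors, using `math.lgamma` for the logarithms of the factorials, and compares the result with $e^{(k-1)/k} / k^{1/(k-1)}$. The function names and the numbers of factors used are illustrative choices.

```python
from math import exp, lgamma, log

def log_factorial(m):
    """log(m!) computed through the log-gamma function."""
    return lgamma(m + 1)

def theorem1_partial_product(k, terms):
    """Product of the first `terms` factors of (7), computed via logarithms."""
    total = ((k - 2) * log(k) - log_factorial(k - 1)) / k  # first factor
    for n in range(2, terms + 1):
        log_base = ((k**n - k**(n - 1) - 1) * log(k)
                    + (k + 1) * log_factorial(k**(n - 1))
                    - log_factorial(k**n)
                    - k * log_factorial(k**(n - 2)))
        total += log_base / k**n
    return exp(total)

def theorem1_limit(k):
    """Left-hand side of (7)."""
    return exp((k - 1) / k) / k ** (1 / (k - 1))

for k, terms in [(2, 20), (3, 12), (5, 9)]:
    print(k, theorem1_partial_product(k, terms), theorem1_limit(k))
```

In each case the two printed values should agree to several decimal places, and the agreement improves as more factors are included.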

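Theorem 2. For any integer $k \ge 2$, the following infinite product holds:

$$\frac{e^{(k-1)^2/k}}{k^{k-1}} = \prod_{n=1}^{\infty} \left(\frac{k^{(k-1)^2 k^{n-1}}\, (a_{n-1})^k}{a_n}\right)^{1/k^n}, \tag{11}$$

where

$$a_0 = 1, \qquad a_n = \frac{(k^{n+1})!\,(k^{n-1})!}{\left((k^n)!\right)^2 k^{(k-1)k^{n-1}}} \quad \text{for } n \ge 1.$$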
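A numerical check of (11) follows the same pattern as for Theorem 1. In this sketch, `log_a` returns $\log a_n$ from the definition above, and the partial products are compared with $e^{(k-1)^2/k} / k^{k-1}$. The function names and the numbers of factors used are illustrative choices.

```python
from math import exp, lgamma, log

def log_a(k, n):
    """log of a_n from Theorem 2, with a_0 = 1."""
    if n == 0:
        return 0.0
    return (lgamma(k**(n + 1) + 1) + lgamma(k**(n - 1) + 1)
            - 2 * lgamma(k**n + 1) - (k - 1) * k**(n - 1) * log(k))

def theorem2_partial_product(k, terms):
    """Product of the first `terms` factors of (11), computed via logarithms."""
    total = 0.0
    for n in range(1, terms + 1):
        log_base = ((k - 1)**2 * k**(n - 1) * log(k)
                    + k * log_a(k, n - 1) - log_a(k, n))
        total += log_base / k**n
    return exp(total)

def theorem2_limit(k):
    """Left-hand side of (11)."""
    return exp((k - 1)**2 / k) / k**(k - 1)

for k, terms in [(2, 12), (3, 8), (5, 6)]:
    print(k, theorem2_partial_product(k, terms), theorem2_limit(k))
```

For $k = 3$ the definition gives $a_1 = 9!/(3!^2 \cdot 3^2) = 1120 = 4 \cdot 5 \cdot 7 \cdot 8$, the product of the integers between 3 and 9 that are not multiples of 3.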

Theorem 2 can be written in a form similar to (3), where the first factor is

$$\left(\frac{k}{k+1} \cdots \frac{k}{2k-1} \cdot \frac{k}{2k+1} \cdots \frac{k}{k^2-1}\right)^{1/k} \tag{12}$$

and the $n$th factor for any $n \ge 2$ is

$$\left(\frac{N_{a_n}}{D_{a_n}}\right)^{1/k^n}, \tag{13}$$

where

$$N_{a_n} = (k^n + k)^k (k^n + 2k)^k \cdots (2k^n - k)^k (2k^n + k)^k \cdots (k^{n+1} - k)^k$$

and

$$D_{a_n} = (k^n + 1)(k^n + 2) \cdots (k^n + k - 1)(k^n + k + 1)(k^n + k + 2) \cdots (k^n + 2k - 1)(k^n + 2k + 1) \cdots (k^{n+1} - 1).$$

Equation (13) becomes much simpler when expanded. The denominator contains all
integers from $k^n$ to $k^{n+1}$ excluding multiples of $k$. The numerator contains every multiple
of $k$ from $k^n$ to $k^{n+1}$ that is not a multiple of $k^2$, each repeated $k$ times. This is easily
seen in an example.



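Example 2. Letting $k = 3$ in (12) and (13) yields

$$\frac{e^{4/3}}{9} = \left(\frac{3}{4} \cdot \frac{3}{5} \cdot \frac{3}{7} \cdot \frac{3}{8}\right)^{1/3} \left(\frac{12}{10} \cdot \frac{12}{11} \cdot \frac{12}{13} \cdot \frac{15}{14} \cdot \frac{15}{16} \cdot \frac{15}{17} \cdot \frac{21}{19} \cdot \frac{21}{20} \cdot \frac{21}{22} \cdot \frac{24}{23} \cdot \frac{24}{25} \cdot \frac{24}{26}\right)^{1/9}$$
$$\cdot \left(\frac{30}{28} \cdot \frac{30}{29} \cdot \frac{30}{31} \cdot \frac{33}{32} \cdots \frac{75}{76} \cdot \frac{78}{77} \cdot \frac{78}{79} \cdot \frac{78}{80}\right)^{1/27} \cdots$$

which can also be written in the form of (11) as

$$\frac{e^{4/3}}{9} = \left(\frac{3^4}{4 \cdot 5 \cdot 7 \cdot 8}\right)^{1/3} \left(\frac{3^{12} (4 \cdot 5 \cdot 7 \cdot 8)^3}{10 \cdot 11 \cdot 13 \cdot 14 \cdots 22 \cdot 23 \cdot 25 \cdot 26}\right)^{1/9} \left(\frac{3^{36} (10 \cdot 11 \cdot 13 \cdot 14 \cdots 22 \cdot 23 \cdot 25 \cdot 26)^3}{28 \cdot 29 \cdot 31 \cdot 32 \cdots 76 \cdot 77 \cdot 79 \cdot 80}\right)^{1/27} \cdots.$$

Furthermore, we can see that Sondow and Yi’s formula (3) is a specific case of this
generalized form where $k = 2$.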

3. PROOFS.

Proof of Theorem 1. From (7) we can see that cancellations occur when computing
the partial products, which are

$$\left(\frac{k^{k-2}\,k}{k!}\right)^{1/k},\quad \left(\frac{k^{2k^2-2k-1}\,k!}{(k^2)!}\right)^{1/k^2},\quad \left(\frac{k^{3k^3-3k^2-k-1}\,(k^2)!}{(k^3)!}\right)^{1/k^3},\quad \left(\frac{k^{4k^4-4k^3-k^2-k-1}\,(k^3)!}{(k^4)!}\right)^{1/k^4},\ \ldots$$

for $n = 1, 2, 3, 4, \ldots$. By induction, we can see that the exponent of $k$ in the numerator
of the $n$th partial product is given by the formula

$$E_n := k^{n-1} + \sum_{v=1}^{n} \left(k^n - k^{n-1} - k^{n-v}\right) = \frac{n k^{n+1} - 2n k^n + (n-1) k^{n-1} + 1}{k-1}. \tag{14}$$

Thus the $n$th partial product is

$$\left(\frac{k^{E_n}\,(k^{n-1})!}{(k^n)!}\right)^{1/k^n}.$$
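The form of the $n$th partial product can be confirmed with exact rational arithmetic. The sketch below raises the $n$th partial product of (7) to the power $k^n$, which makes it a rational number, and compares it with $k^{E_n} (k^{n-1})! / (k^n)!$. The closed form used for $E_n$ is $\left(n k^{n+1} - 2n k^n + (n-1) k^{n-1} + 1\right)/(k-1)$, which is the expression consistent with the partial products listed in the proof. The function names are hypothetical.

```python
from fractions import Fraction
from math import factorial

def base(k, n):
    """Base of the nth factor in (7), before taking the k^n-th root."""
    if n == 1:
        return Fraction(k ** (k - 2), factorial(k - 1))
    num = k ** (k**n - k**(n - 1) - 1) * factorial(k**(n - 1)) ** (k + 1)
    den = factorial(k**n) * factorial(k**(n - 2)) ** k
    return Fraction(num, den)

def partial_power(k, n):
    """(nth partial product of (7)) raised to the power k^n, as an exact rational."""
    result = Fraction(1)
    for m in range(1, n + 1):
        result *= base(k, m) ** (k ** (n - m))
    return result

def exponent(k, n):
    """Closed form for E_n (always an integer)."""
    return (n * k**(n + 1) - 2 * n * k**n + (n - 1) * k**(n - 1) + 1) // (k - 1)

def closed_form(k, n):
    """k^{E_n} (k^{n-1})! / (k^n)! as an exact rational."""
    return Fraction(k ** exponent(k, n) * factorial(k**(n - 1)), factorial(k**n))

print(all(partial_power(k, n) == closed_form(k, n)
          for k in (2, 3, 4) for n in (1, 2, 3, 4)))
```

For example, `exponent(k, 2)` equals $2k^2 - 2k - 1$ and `exponent(k, 3)` equals $3k^3 - 3k^2 - k - 1$, matching the exponents shown in the list of partial products.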

By applying Stirling’s formula, we get

$$\left(\frac{k^{E_n} \sqrt{2\pi k^{n-1}}\, (k^{n-1}/e)^{k^{n-1}}}{\sqrt{2\pi k^n}\, (k^n/e)^{k^n}}\right)^{1/k^n} = \frac{k^{E_n/k^n}\, k^{(n-1)/k}}{k^{1/(2k^n)}\, k^n}\, e^{(k-1)/k}.$$

Replacing $E_n$ with (14) and combining like terms give us

$$k^{\frac{3/2 - k/2}{k^n (k-1)} - \frac{1}{k-1}}\, e^{(k-1)/k}.$$

Therefore, as $n \to \infty$, this tends to

$$\frac{e^{(k-1)/k}}{k^{1/(k-1)}}.$$

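Proof of Theorem 2. From (11), notice that cancellations occur when computing the
partial products, which are

$$\left(\frac{k^{(k-1)^2}}{(k^2)! / \left((k!)^2 k^{k-1}\right)}\right)^{1/k},\quad \left(\frac{k^{2k(k-1)^2}}{(k^3)!\,k! / \left(((k^2)!)^2 k^{(k-1)k}\right)}\right)^{1/k^2},\quad \left(\frac{k^{3k^2(k-1)^2}}{(k^4)!\,(k^2)! / \left(((k^3)!)^2 k^{(k-1)k^2}\right)}\right)^{1/k^3},$$
$$\ldots,\quad \left(\frac{k^{n k^{n-1} (k-1)^2}}{(k^{n+1})!\,(k^{n-1})! / \left(((k^n)!)^2 k^{(k-1)k^{n-1}}\right)}\right)^{1/k^n},\ \ldots.$$

The $n$th partial product can be rewritten as

$$\left(k^{\left(n(k-1)^2 + k - 1\right) k^{n-1}}\, \frac{\left((k^n)!\right)^2}{(k^{n+1})!\,(k^{n-1})!}\right)^{1/k^n}.$$

By applying Stirling’s formula to each factorial, we get $e^{(k-1)^2/k} / k^{k-1}$ in the limit as
$n \to \infty$. Thus the proof of Theorem 2 is complete.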

ACKNOWLEDGMENT. The author is grateful to Pei-yong Wang for advising the research and the writing
of this paper.

REFERENCES

1. E. Catalan, Sur la constante d’Euler et la fonction de Binet, C. R. Acad. Sci. Paris Sér. I Math. 77 (1873)
198–201.
2. N. Pippenger, An infinite product for e, Amer. Math. Monthly 87 (1980) 391.

3. J. Sondow, H. Yi, New Wallis- and Catalan-type infinite products for π, e and $\sqrt{2 + \sqrt{2}}$, Amer. Math.
Monthly 117 (2010) 912–917.

Department of Mathematics, Wayne State University, Detroit 48202


scott.ginebaugh@wayne.edu



Sums of Quadratic Residues and Nonresidues
Christian Aebi and Grant Cairns

Abstract. It is well known that when a prime p is congruent to 1 modulo 4, the sum of the
quadratic residues equals the sum of the quadratic nonresidues. In this note, we give ele-
mentary proofs of V.-A. Lebesgue’s analogous results for the case where p is congruent to 3
modulo 4.

1. INTRODUCTION. It is well known that when a prime $p$ is congruent to 1 modulo 4,
the sum of the quadratic residues equals the sum of the quadratic nonresidues.
Indeed, because $-1$ is a quadratic residue, the quadratic residues occur in pairs $x$ and
$p - x$, thus giving $(p-1)/4$ pairs each of whose sum is $p$. So the quadratic residues
sum to $p(p-1)/4$, as do the quadratic nonresidues [5, Problem 3.1.15].
For the rest of this note, the prime $p$ is congruent to 3 modulo 4. Identify $\mathbb{Z}_p$ with
the set $\{0, 1, \ldots, p-1\}$, and let $\mathbb{Z}_p^l = \{i \in \mathbb{Z}_p : 1 \le i < p/2\}$ and $\mathbb{Z}_p^u = \{i \in \mathbb{Z}_p :
i > p/2\}$. Let $Q$ denote the set of quadratic residues of $\mathbb{Z}_p$, and write $Q^l = Q \cap \mathbb{Z}_p^l$
and $Q^u = Q \cap \mathbb{Z}_p^u$. Similarly, let $N$ denote the set of quadratic nonresidues of $\mathbb{Z}_p$, and
write $N^l = N \cap \mathbb{Z}_p^l$ and $N^u = N \cap \mathbb{Z}_p^u$. For a set $A \subset \mathbb{Z}_p$, let us write $\sum A$ for the sum
in $\mathbb{Z}$ of the elements of $A$.
The main purpose of this note is to give an elementary direct proof of the following
result.

Theorem (V.-A. Lebesgue).

(a) If $p \equiv 7 \pmod 8$, then $\sum Q^l = \sum N^l$.
(b) If $p \equiv 3 \pmod 8$, then $\sum Q + \sum Q^l = \sum N + \sum N^l$.
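Both parts of the theorem are easy to test by brute force for small primes. The following sketch builds $Q$ and $N$ directly from the squares modulo $p$ and checks the stated identity. The function names and the bound 500 are illustrative choices.

```python
def is_prime(p):
    """Trial-division primality test, adequate for small p."""
    return p > 1 and all(p % d for d in range(2, int(p ** 0.5) + 1))

def residue_sets(p):
    """Nonzero quadratic residues Q and nonresidues N modulo the prime p."""
    Q = {(x * x) % p for x in range(1, p)}
    N = set(range(1, p)) - Q
    return Q, N

def lebesgue_holds(p):
    """Check part (a) or part (b) of the theorem for a prime p = 3 (mod 4)."""
    Q, N = residue_sets(p)
    lower_sum = lambda A: sum(a for a in A if 2 * a < p)  # sum over A^l
    if p % 8 == 7:
        return lower_sum(Q) == lower_sum(N)
    if p % 8 == 3:
        return sum(Q) + lower_sum(Q) == sum(N) + lower_sum(N)
    raise ValueError("p must be congruent to 3 modulo 4")

primes = [p for p in range(3, 500) if is_prime(p) and p % 4 == 3]
print(all(lebesgue_holds(p) for p in primes))
```

For instance, with $p = 11$ one has $Q = \{1, 3, 4, 5, 9\}$ and $N = \{2, 6, 7, 8, 10\}$, so $\sum Q + \sum Q^l = 22 + 13 = 35 = 33 + 2 = \sum N + \sum N^l$.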

This theorem was established by V.-A. Lebesgue [6, p. 144 (16)] using trigonometrical
series that were well known to Gauss [3, Article 362]. As we show in the final
section of this note, the result may also be deduced from standard formulas for the
class number of the quadratic field $\mathbb{Q}[\sqrt{-p}]$. Part (a) is obtained in this way in [1,
Cor. 13.4].
We mention in passing that Dirichlet proved that, when $p$ is congruent to 3 modulo
4 and $p > 3$, there are more quadratic residues than nonresidues in $\mathbb{Z}_p^l$. It would be
interesting to have a simple direct proof of this fact; see [8]. The arguments in this
note do not seem to furnish such a proof.

2. SIMPLE DIRECT PROOF. Let $n$ denote the number of quadratic nonresidues
that are less than $p/2$. Note that $-1$ is a quadratic nonresidue since $p \equiv 3 \pmod 4$;
see [7, Chap. 24]. Hence, for each quadratic nonresidue $q$, the element $p - q$ is a
quadratic residue. In particular, $Q^u$ has $n$ elements.

Lemma.

(a) If $p \equiv 7 \pmod 8$, then $\sum Q = np$.
(b) If $p \equiv 3 \pmod 8$, then $3 \sum Q = np + \binom{p}{2}$.

http://dx.doi.org/10.4169/amer.math.monthly.124.2.166
MSC: Primary 11A15

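Proof. Let $\sigma$ be the doubling function $x \mapsto 2x$ on $\mathbb{Z}$, and consider the function $\bar\sigma$
induced on $\mathbb{Z}_p$ by $\sigma$. Notice that if $x \in Q^l$, then $\bar\sigma(x) = \sigma(x) = 2x$, and if $x \in Q^u$,
then $\bar\sigma(x) = \sigma(x) - p = 2x - p$.
(a) When $p \equiv 7 \pmod 8$, the function $\bar\sigma$ preserves $Q$. Now, since $Q^u$ has $n$
elements,
$$\sum Q = \sum \bar\sigma(Q) = \sum \sigma(Q^l) + \sum \sigma(Q^u) - np = \sum \sigma(Q) - np,$$
that is, $\sum Q = 2\sum Q - np$, giving $\sum Q = np$, as required.
(b) When $p \equiv 3 \pmod 8$, the function $\bar\sigma$ sends quadratic residues to quadratic
nonresidues. Now, since $Q^u$ has $n$ elements, and $\sum \mathbb{Z}_p = \binom{p}{2}$,
$$\sum Q = \binom{p}{2} - \sum \bar\sigma(Q) = \binom{p}{2} - \Bigl(\sum \sigma(Q^l) + \sum \sigma(Q^u) - np\Bigr) = \binom{p}{2} - \sum \sigma(Q) + np.$$
Thus, $\sum Q = \binom{p}{2} - 2\sum Q + np$, giving $3\sum Q = np + \binom{p}{2}$, as required.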
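The lemma can be spot-checked by direct computation. The following is a minimal sketch, assuming only the definitions above; the function name `lemma_holds` and the list of test primes are illustrative choices, not part of the note.

```python
def lemma_holds(p):
    """Check the lemma for a prime p with p % 4 == 3."""
    Q = {x * x % p for x in range(1, p)}      # quadratic residues mod p
    N = set(range(1, p)) - Q                  # quadratic nonresidues mod p
    n = sum(1 for q in N if 2 * q < p)        # nonresidues below p/2
    binom_p_2 = p * (p - 1) // 2              # sum of all elements of Z_p
    if p % 8 == 7:
        return sum(Q) == n * p                # part (a)
    return 3 * sum(Q) == n * p + binom_p_2    # part (b)

# Primes congruent to 3 modulo 4 (hypothetical test set).
primes = [3, 7, 11, 19, 23, 31, 43, 47, 59, 67, 71, 79, 83]
results = {p: lemma_holds(p) for p in primes}
```

For example, for $p = 11$ the residues are 1, 3, 4, 5, 9 with sum 22, and $n = 1$, so $3 \cdot 22 = 11 + 55$.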

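Proof of the Theorem. Note that
$$np = \Bigl(np - \sum N^l\Bigr) + \sum N^l = \sum Q^u + \sum N^l = \sum Q + \sum N^l - \sum Q^l. \tag{1}$$
Thus, when $p \equiv 7 \pmod 8$, since $np = \sum Q$ by the lemma, we have $\sum N^l = \sum Q^l$.
Similarly, if $p \equiv 3 \pmod 8$, then, by the lemma and equation (1), we obtain
$$3\sum Q = np + \binom{p}{2} \implies 3\sum Q = \sum Q + \sum N^l - \sum Q^l + \binom{p}{2}$$
$$\implies 2\sum Q = \sum N^l - \sum Q^l + \sum Q + \sum N$$
$$\implies \sum Q - \sum N = \sum N^l - \sum Q^l$$
$$\implies \sum Q + \sum Q^l = \sum N + \sum N^l.$$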
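A brute-force check of the theorem itself is equally short. This sketch assumes the sets $Q$, $N$, $Q^l$, $N^l$ as defined in the introduction; `lebesgue_holds` is a hypothetical helper name.

```python
def lebesgue_holds(p):
    """Check Lebesgue's theorem for a prime p with p % 4 == 3."""
    Q = {x * x % p for x in range(1, p)}           # quadratic residues
    N = set(range(1, p)) - Q                       # quadratic nonresidues
    sum_Ql = sum(q for q in Q if 2 * q < p)        # residues in the lower half
    sum_Nl = sum(q for q in N if 2 * q < p)        # nonresidues in the lower half
    if p % 8 == 7:
        return sum_Ql == sum_Nl                    # part (a)
    return sum(Q) + sum_Ql == sum(N) + sum_Nl      # part (b)

checked = [p for p in (3, 7, 11, 19, 23, 31, 43, 47, 59, 67) if lebesgue_holds(p)]
```

For $p = 7$ both lower-half sums equal 3 (residues 1, 2 and nonresidue 3); for $p = 11$ both sides of (b) equal 35.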

3. PROOF USING CLASS NUMBER FORMULAS. Suppose the prime $p$ is congruent
to 3 modulo 4 and greater than 3. Let $h$ denote the class number of the quadratic
field $\mathbb{Q}[\sqrt{-p}]$. Jacobi's original formula for $h$ is as follows; see [2, Chap. 6 (19)].
$$h = -\frac{1}{p}\sum_{i=1}^{p-1} i\left(\frac{i}{p}\right),$$
where $\left(\frac{i}{p}\right)$ denotes the Legendre symbol. Using our notations, the preceding may be
written
$$h = \frac{1}{p}\Bigl(\sum N - \sum Q\Bigr). \tag{2}$$



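Another standard formula is as follows; see [4, Theorems 8.1.6 and 8.1.9].
$$h = \frac{1}{2 - \left(\frac{2}{p}\right)} \sum_{i=1}^{(p-1)/2} \left(\frac{i}{p}\right).$$
Thus,
$$h = (r - n)/\delta, \tag{3}$$
where $r$ (respectively, $n$) is the number of quadratic residues (respectively, quadratic
nonresidues) that are less than $p/2$, and
$$\delta = \begin{cases} 1 & \text{if } p \equiv 7 \pmod 8, \\ 3 & \text{if } p \equiv 3 \pmod 8. \end{cases}$$
From the combination of (2) and (3), we obtain
$$p(r - n) = \delta\Bigl(\sum N - \sum Q\Bigr). \tag{4}$$
When $p \equiv 7 \pmod 8$, equation (4) gives
$$p(r - n) = \sum N - \sum Q \implies p\Bigl(\frac{p-1}{2} - 2n\Bigr) = \binom{p}{2} - 2\sum Q \implies \sum Q = pn.$$
When $p \equiv 3 \pmod 8$, equation (4) gives
$$p(r - n) = 3\Bigl(\sum N - \sum Q\Bigr) \implies p\Bigl(\frac{p-1}{2} - 2n\Bigr) = 3\binom{p}{2} - 6\sum Q \implies 3\sum Q = pn + \binom{p}{2}.$$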

This establishes the lemma of the previous section. The theorem then follows as before.
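The agreement of formulas (2) and (3) can be observed numerically. The sketch below evaluates both expressions for $h$ with exact rational arithmetic; the function name `class_number_two_ways` is an illustrative assumption.

```python
from fractions import Fraction

def class_number_two_ways(p):
    """Return h computed from formula (2) and from formula (3), for a prime p > 3, p % 4 == 3."""
    Q = {x * x % p for x in range(1, p)}
    N = set(range(1, p)) - Q
    h_from_sums = Fraction(sum(N) - sum(Q), p)       # formula (2)
    r = sum(1 for q in Q if 2 * q < p)               # residues below p/2
    n = sum(1 for q in N if 2 * q < p)               # nonresidues below p/2
    delta = 1 if p % 8 == 7 else 3
    h_from_counts = Fraction(r - n, delta)           # formula (3)
    return h_from_sums, h_from_counts

table = {p: class_number_two_ways(p) for p in (7, 11, 19, 23, 31, 43, 47)}
```

For $p = 23$, the residues sum to 92 and the nonresidues to 161, so formula (2) gives $69/23 = 3$; there are 7 residues and 4 nonresidues below $23/2$, so formula (3) also gives 3.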

REFERENCES

1. B. C. Berndt, Classical theorems on quadratic residues, Enseign. Math. 22 (1976) 261–304.


2. H. Davenport, Multiplicative Number Theory. Third ed. Graduate Texts in Mathematics, Vol. 74, Springer-
Verlag, New York, 2000, http://dx.doi.org/10.1007/978-1-4757-5927-3.
3. C. F. Gauss, Recherches Arithmétiques. Chez Courcier, Paris, 1807.
4. F. Halter-Koch, Quadratic Irrationals: An Introduction to Classical Number Theory. CRC Press, Boca
Raton, FL, 2013, http://dx.doi.org/10.1201/b14968.
5. I. Niven, H. Zuckerman, H. Montgomery, An Introduction to the Theory of Numbers. Fifth ed. Wiley,
New York, 1991.
6. V.-A. Lebesgue, Démonstration de quelques théorèmes relatifs aux résidus et aux non-résidus quadra-
tiques, Journal de mathématiques pures et appliquées 1re série 7 (1842) 137–159.
7. J. H. Silverman, A Friendly Introduction to Number Theory. Fourth ed. Prentice Hall, New York, 2012.
8. A. L. Whiteman, Theorems on quadratic residues, Math. Mag. 23 (1949) 71–74.

Collège Calvin, Geneva, Switzerland 1211
christian.aebi@edu.ge.ch

Department of Mathematics, La Trobe University, Melbourne, Australia 3086


G.Cairns@latrobe.edu.au

Another Proof That There Are Infinitely Many Primes


There are many proofs that there are infinitely many primes. An argument by Chaitin
[1] shows, roughly speaking, that if there were only finitely many primes, then there
would not be enough prime factorizations to represent large integers. In this note we
give a short explicit argument based on this idea. We show that if there were only
$k$ primes, then the integers from 1 to $k^{3k}$ would have fewer than $k^{3k}$ different prime
factorizations.
Suppose there are only $k$ primes
$$p_1 = 2,\quad p_2 = 3,\quad p_3 = 5,\quad \ldots,\quad p_k.$$
Since 2, 3, 5, 7, and 11 are prime, we have $k > 4$. We will use “lg” to denote the base
2 logarithm. Note that we have the crude bounds $k > \lg k$ and $k \lg k > 1$.
Let $N = k^{3k}$. Given any positive integer $n \le N$, we can write
$$n = 2^{a_1} 3^{a_2} 5^{a_3} \cdots p_k^{a_k},$$
and for each $j$, we have
$$2^{a_j} \le p_j^{a_j} \le n \le N,$$
implying $a_j \le \lg N$. So $a_j$ is between 0 and $\lg N$, so there are at most $1 + \lg N$
possibilities for each $a_j$. We then observe
$$1 + \lg N = 1 + 3k \lg k < 4k \lg k < k \cdot k \cdot k = k^3,$$
so there are strictly fewer than $k^3$ possibilities for each $a_j$, and hence fewer than $(k^3)^k$
possibilities for the tuple $(a_1, \ldots, a_k)$. That is, there are fewer than $k^{3k}$ possibilities
for the prime factorization of $n$, so it is not possible to construct prime factorizations
for all positive integers $n \le k^{3k}$.
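The counting bound can be illustrated with exact integer arithmetic. In this sketch, `exponent_tuples_bound` is a hypothetical helper; it uses the fact that an integer exponent $a_j$ with $0 \le a_j \le \lg N$ has at most $\lfloor \lg N \rfloor + 1$ possible values.

```python
def exponent_tuples_bound(k):
    """Return (upper bound on the number of exponent tuples, N) for N = k**(3k)."""
    N = k ** (3 * k)
    # For a positive integer N, N.bit_length() equals floor(lg N) + 1,
    # the number of integers a with 0 <= a <= lg N.
    choices_per_exponent = N.bit_length()
    return choices_per_exponent ** k, N

# With only k primes, the tuple count stays below N for every k > 4 tested here.
shortfalls = {k: exponent_tuples_bound(k) for k in range(5, 40)}
```

For $k = 5$, $N = 5^{15} = 30517578125$ has 35 binary digits, and $35^5 = 52521875$ is far smaller than $N$.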

—Submitted by Idris Mercer, Florida International University

REFERENCE

1. G. J. Chaitin, Toward a mathematical definition of life, in The Maximum Entropy Formalism, Ed.
R. D. Levine, M. Tribus, MIT Press, Cambridge, 1979. 477–498.

http://dx.doi.org/10.4169/amer.math.monthly.124.2.169
MSC: Primary 11A41



A Note on Average of Roots of Unity
Chatchawan Panraksa and Pornrat Ruengrot

Abstract. We consider the problem of characterizing all functions f defined on the set of inte-
gers modulo n with the property that an average of some nth roots of unity determined by f
is always an algebraic integer. Examples of such functions with this property are linear func-
tions. We show that, when n is a prime number, the converse also holds. That is, any function
with this property is representable by a linear polynomial. Finally, we give an application of
the main result to the problem of determining self-perfect isometries for the cyclic group of
prime order p.

1. INTRODUCTION. Let $n$ be a positive integer. Denote by $\mathbb{Z}_n = \{0, 1, \ldots, n-1\}$
the ring of integers modulo $n$. Let $\omega = e^{2\pi i/n}$ be a primitive $n$th root of unity. In this
work we consider the following problem.

Problem 1. Suppose $f : \mathbb{Z}_n \to \mathbb{Z}_n$ is a function such that the average
$$\mu_f^{a,b} = \frac{1}{n}\sum_{x=0}^{n-1} \omega^{a f(x) + bx} \ \text{ is an algebraic integer for every } a, b \in \mathbb{Z}_n. \tag{1}$$
What can be said about the function $f$?

When $n$ is a power of a prime, such a problem is related to the problem of finding
self-perfect isometries (as defined in [1]) for cyclic $p$-groups, since the condition (1) is
the integrality condition for perfect characters. More explanations on this relationship
are given in the last section.

2. PRELIMINARIES. We shall use the symbol (=) to denote ordinary equality. The
symbol (≡) will be used to denote congruence (mod n) (i.e., equality in Zn ).

Definition. We say that a function $f : \mathbb{Z}_n \to \mathbb{Z}_n$ is a polynomial function if there
exists a polynomial $F \in \mathbb{Z}_n[X]$ such that
$$f(x) \equiv F(x) \quad \text{for all } x = 0, 1, \ldots, n-1.$$

First we show that any function representable by a linear polynomial function sat-
isfies condition (1).

Lemma 2. Suppose that $f : \mathbb{Z}_n \to \mathbb{Z}_n$ is a linear polynomial function. Then $\mu_f^{a,b}$ is
an algebraic integer for every $a, b \in \mathbb{Z}_n$.

Proof. If $f$ is given by a linear polynomial, then so is $a f(x) + bx$ for any $a, b$. Suppose
$a f(x) + bx \equiv \alpha x + \beta$ for some $\alpha, \beta \in \mathbb{Z}_n$. If $\alpha \equiv 0$, then
$$\mu_f^{a,b} = \frac{1}{n}\sum_{x=0}^{n-1} \omega^{\beta} = \omega^{\beta},$$
http://dx.doi.org/10.4169/amer.math.monthly.124.2.170
MSC: Primary 11R04

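which is an algebraic integer. If $\alpha \not\equiv 0$, then $\omega^{\alpha} \ne 1$. Consequently,
$$\mu_f^{a,b} = \frac{1}{n}\sum_{x=0}^{n-1} \omega^{\alpha x + \beta} = \frac{\omega^{\beta}}{n} \cdot \frac{1 - \omega^{\alpha n}}{1 - \omega^{\alpha}} = 0,$$
which is also an algebraic integer.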
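As a floating-point sanity check of Lemma 2 (an illustration only, not a substitute for the proof), one can evaluate $\mu_f^{a,b}$ for a linear $f$ and confirm that its modulus is numerically 0 or 1. The helper names `mu` and `check_linear` and the tolerance are assumptions made for this sketch.

```python
import cmath

def mu(f, a, b, n):
    """Average of the n-th roots of unity w**(a*f(x) + b*x) over x = 0..n-1."""
    w = cmath.exp(2j * cmath.pi / n)
    return sum(w ** ((a * f(x) + b * x) % n) for x in range(n)) / n

def check_linear(alpha, beta, n, tol=1e-9):
    """For f(x) = alpha*x + beta mod n, test that every average is 0 or has modulus 1."""
    f = lambda x: (alpha * x + beta) % n
    for a in range(n):
        for b in range(n):
            modulus = abs(mu(f, a, b, n))
            if min(modulus, abs(modulus - 1)) > tol:
                return False
    return True
```

The check works for composite $n$ as well, since the geometric-sum argument in the proof does not use primality.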

The next result shows that it is not necessary to check (1) for all pairs (a, b).

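Proposition 3. Let $f : \mathbb{Z}_n \to \mathbb{Z}_n$ be a function. Suppose that $\mu_f^{a,b}$ is an algebraic
integer. Then $\mu_f^{ka,kb}$ is also an algebraic integer for any $k$ relatively prime to $n$.

Proof. Since $k$ and $n$ are coprime, we can define an automorphism $\sigma_k \in \mathrm{Aut}(\mathbb{Q}(\omega)/\mathbb{Q})$
by $\sigma_k(\omega) = \omega^k$. Then
$$\mu_f^{ka,kb} = \frac{1}{n}\sum_{x=0}^{n-1} \omega^{k(a f(x) + bx)} = \frac{1}{n}\sum_{x=0}^{n-1} \sigma_k(\omega)^{a f(x) + bx} = \sigma_k\bigl(\mu_f^{a,b}\bigr).$$
Since $\mu_f^{a,b}$ is an algebraic integer and $\sigma_k \in \mathrm{Aut}(\mathbb{Q}(\omega)/\mathbb{Q})$, it follows that
$\mu_f^{ka,kb} = \sigma_k\bigl(\mu_f^{a,b}\bigr)$ is also an algebraic integer.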

Finally, we give a necessary and sufficient condition for the average of roots of unity
to be an algebraic integer. This is a standard result in algebraic number theory.

Lemma 4. Let $\omega_1, \ldots, \omega_n$ be roots of unity. Their average is an algebraic integer if
and only if either $\omega_1 + \cdots + \omega_n = 0$ or $\omega_1 = \cdots = \omega_n$.

Proof. The sufficiency is clear. Let $\mu$ denote the average of $\omega_1, \ldots, \omega_n$ and assume
that it is an algebraic integer. By the triangle inequality, $|\mu| \le 1$ with equality if and
only if $\omega_1 = \cdots = \omega_n$. Moreover, $|\mu'| \le 1$ for all algebraic conjugates $\mu'$ of $\mu$. If not
all $\omega_i$'s are equal, then $|\mu| < 1$. As a result, we also have $|\alpha| < 1$, where $\alpha$ is the
product of all algebraic conjugates of $\mu$. But $\alpha$ is an integer, which implies that $\alpha$ must
be 0. It follows that $\mu = 0$.

3. THE CASE WHERE n IS A PRIME. Let $n = p$ be a prime number. Our main
result is to show that any function $f : \mathbb{Z}_p \to \mathbb{Z}_p$ satisfying (1) must be representable
by a linear polynomial. To study a function $f : \mathbb{Z}_p \to \mathbb{Z}_p$, it suffices to study polynomials
of degree at most $p - 1$. The following lemma was proved (for a general finite
field) by Dickson [2].

Lemma 5. For any function $f : \mathbb{Z}_p \to \mathbb{Z}_p$, there exists a unique polynomial
$F \in \mathbb{Z}_p[X]$ of degree at most $p - 1$ such that
$$f(x) \equiv F(x) \pmod p \quad \text{for all } x = 0, 1, \ldots, p-1.$$

Henceforth, we shall identify a function $f : \mathbb{Z}_p \to \mathbb{Z}_p$ with its corresponding
polynomial of degree at most $p - 1$. For the proof of our main result, we will need
to consider the following set.



Definition. For a polynomial $f$ in $\mathbb{Z}_p[X]$, define
$$W_f = \bigl\{\lambda \in \mathbb{Z}_p : x \mapsto (f(x) + \lambda x) \text{ is a permutation on } \mathbb{Z}_p\bigr\}.$$

Lemma 6 (Stothers [3, Theorem 2]). Let $f$ be a polynomial in $\mathbb{Z}_p[X]$ of degree at
most $p - 1$. If $|W_f| > (p - 3)/2$, then $\deg(f) \le 1$.

We now state our main result.

Theorem 7. Let $f : \mathbb{Z}_p \to \mathbb{Z}_p$ be a function. Suppose that the average
$$\mu_f^{1,b} = \frac{1}{p}\sum_{x=0}^{p-1} \omega^{f(x) + bx}$$
is an algebraic integer for every $b \in \mathbb{Z}_p$. Then $f$ is a linear polynomial function.

Proof. By Lemma 4, we have that, for each $b$, either $f(x) + bx$ is constant modulo
$p$ or $\mu_f^{1,b} = 0$. Suppose there is $b_0 \in \mathbb{Z}_p$ such that $f(x) + b_0 x$ is constant modulo $p$.
Then it is clear that $f$ is of the form $f(x) \equiv \alpha x + \beta$ for all $x \in \mathbb{Z}_p$.
If there is no $b$ with $f(x) + bx$ constant modulo $p$, then $\sum_{x=0}^{p-1} \omega^{f(x)+bx} = 0$ for
all $b$. Since the minimal polynomial of $\omega$ is $1 + X + \cdots + X^{p-1}$, we must have that
$x \mapsto f(x) + bx$ is a permutation modulo $p$, for all $b$. This means that the set $W_f$ has
cardinality $p$. Hence, by Lemma 6, $f$ is a linear polynomial.
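For very small primes, Theorem 7 can be confirmed by exhaustive search. The sketch below relies on the reduction used in the proof: for prime $p$, the average $\mu_f^{1,b}$ is an algebraic integer exactly when $x \mapsto f(x) + bx$ is constant or a permutation modulo $p$. The function names are hypothetical, and the search is only feasible for tiny $p$ because it enumerates all $p^p$ functions.

```python
from itertools import product

def satisfies_condition(f, p):
    """f is a tuple (f(0), ..., f(p-1)); test the hypothesis of Theorem 7 exactly."""
    for b in range(p):
        values = {(f[x] + b * x) % p for x in range(p)}
        if len(values) not in (1, p):      # neither constant nor a permutation
            return False
    return True

def is_linear(f, p):
    """Test whether f(x) = alpha*x + beta mod p for some alpha, beta."""
    alpha = (f[1] - f[0]) % p
    return all(f[x] == (alpha * x + f[0]) % p for x in range(p))

def solutions(p):
    """All functions Z_p -> Z_p satisfying the hypothesis of Theorem 7."""
    return [f for f in product(range(p), repeat=p) if satisfies_condition(f, p)]
```

For $p = 5$ the search visits 3125 functions and, consistent with the theorem, keeps only the $p^2 = 25$ linear ones.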

Corollary 8. Let $f : \mathbb{Z}_p \to \mathbb{Z}_p$ be a function. Suppose that the average
$$\mu_f^{a,b} = \frac{1}{p}\sum_{x=0}^{p-1} \omega^{a f(x) + bx}$$
is an algebraic integer for every $a, b \in \mathbb{Z}_p$. Then $f$ is a linear polynomial function.

Proof. Take a = 1 and apply Theorem 7.

We have seen (in Lemma 2) that if a function $f : \mathbb{Z}_n \to \mathbb{Z}_n$ is representable by a
linear polynomial, then it satisfies (1). Corollary 8 implies that the converse also holds
for $n$ prime. It is natural to make the following conjecture.

Conjecture 9. Suppose $f : \mathbb{Z}_n \to \mathbb{Z}_n$ is a function such that the average $\mu_f^{a,b}$
is an algebraic integer for every $a, b \in \mathbb{Z}_n$. Then there exist $\alpha, \beta \in \mathbb{Z}_n$ such that
$f(x) \equiv \alpha x + \beta$ for all $x$.

Remark. It is possible that a function $f$ may have a nonlinear form and still satisfy (1).
The conjecture says that such a function should be representable by a linear polynomial
in $\mathbb{Z}_n[X]$.
For example, when $n = 6$, it can be checked that $f(x) \equiv x^3 + x$ satisfies (1).
However, $f$ is representable by a linear polynomial, since $x^3 + x \equiv 2x$ for all $x \in \mathbb{Z}_6$.
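The congruence in the example is quick to confirm by enumeration. In this small sketch, `cubic_agrees_with_2x` is an illustrative name.

```python
def cubic_agrees_with_2x(n):
    """Test whether x**3 + x and 2*x agree modulo n for every x in Z_n."""
    return all((x ** 3 + x - 2 * x) % n == 0 for x in range(n))

agrees_mod_6 = cubic_agrees_with_2x(6)
agrees_mod_5 = cubic_agrees_with_2x(5)
```

The agreement modulo 6 reflects the fact that $x^3 - x = (x-1)x(x+1)$ is a product of three consecutive integers. Modulo 5 it fails, for instance at $x = 2$, where $x^3 + x = 10 \equiv 0$ but $2x = 4$.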

4. CONNECTION TO PERFECT ISOMETRIES. Corollary 8 has an application
to the representation theory of finite groups, especially to the problem of finding
self-perfect isometries for the cyclic group C p of prime order p. For the purpose of
illustrations, we will define perfect isometries specifically for this special case. Inter-
ested readers are referred to [1] for the definition of perfect isometries for general
blocks of finite groups.
Throughout this section, let $G = C_p$. Denote by $R(G)$ the free abelian group generated
by $\mathrm{Irr}(G)$, the set of all irreducible complex characters of $G$. We will regard
$R(G)$ as lying in $CF(G)$, the space of complex-valued class functions of $G$ (footnote 1).
Let $I : R(G) \to R(G)$ be a linear map. Define a generalized character $\mu_I$ of
$G \times G$ by
$$\mu_I(g, h) = \sum_{\chi \in \mathrm{Irr}(G)} I(\chi)(g)\,\chi(h), \quad \text{for all } g, h \in G.$$

Definition. (see Definition 1.1 in [1]) An isometry $I : R(G) \to R(G)$ is said to be
a perfect isometry if $\mu_I$ satisfies the following two conditions.
(i) (Integrality) For all $g, h \in G$, the number $\mu_I(g, h)/p$ is an algebraic integer.
(ii) (Separation) If $\mu_I(g, h) \ne 0$, then both $g$ and $h$ are the identity element or both
are not.

Let $\omega = e^{2\pi i/p}$. Suppose that $G$ is generated by an element $u \in G$. For
$x = 0, 1, \ldots, p-1$, let $\chi_x$ be the irreducible complex character of $G$ such that
$$\chi_x(u^a) = \omega^{ax}, \quad a = 0, 1, \ldots, p-1.$$
In particular, $\chi_0$ is the trivial character.
Any bijection $f$ on the set $\{0, 1, \ldots, p-1\}$ gives rise to an isometry $I_f : R(G) \to
R(G)$ defined (on the basis) by
$$I_f(\chi_x) = \chi_{f(x)}, \quad x = 0, 1, \ldots, p-1.$$

Proposition 10. An isometry I f is perfect if and only if f is a linear bijection.

Proof. Since every element in $G$ is of the form $u^a$ for some $a$, we have
$$\mu_I(g, h) = \mu_I(u^a, u^b) = \sum_{x=0}^{p-1} I(\chi_x)(u^a)\,\chi_x(u^b) = \sum_{x=0}^{p-1} \omega^{a f(x) + bx}.$$

Thus, we see that the condition in (1) is precisely the requirement that μ I satisfies the
integrality condition. This is the only condition to consider, as the separation condition
is satisfied for any bijection f .
If f is a linear bijection, then by Lemma 2, the integrality condition is satisfied.
Thus, I f is a perfect isometry.
Conversely, if I f is a perfect isometry, then μ I (u a , u b )/ p is an algebraic integer for
all a, b. It follows from Corollary 8 that f must be linear.

Footnote 1. This is an inner product space with the standard inner product of group characters.

Remark. The following actions on $\mathrm{Irr}(G)$ are well known to give bijections on the
set.
• (Multiplication by a linear character) For a fixed $\chi_\beta \in \mathrm{Irr}(G)$, multiplication by
$\chi_\beta$ gives a bijection
$$I_\beta : \mathrm{Irr}(G) \to \mathrm{Irr}(G), \quad I_\beta(\chi_x)(g) = (\chi_\beta \chi_x)(g) = \chi_{\beta + x}(g).$$
• (Automorphism action) For a fixed $\alpha \in \{1, 2, \ldots, p-1\}$, the automorphism
$g \mapsto g^\alpha$ induces a bijection
$$I_\alpha : \mathrm{Irr}(G) \to \mathrm{Irr}(G), \quad I_\alpha(\chi_x)(g) = \chi_x(g^\alpha) = \chi_{\alpha x}(g).$$

Proposition 10 implies that, for an isometry induced by a bijection on $\mathrm{Irr}(G)$ to be a
perfect isometry, it must be a composition of the above two types of isometries.

ACKNOWLEDGMENT. The second author is supported by MUIC Seed Grant Research 017/2015. We wish
to express our gratitude toward the anonymous referee and the editor for helpful comments and suggestions
leading to improvements of this work.

REFERENCES

1. M. Broué, Isométries parfaites, types de blocs, catégories dérivées, Astérisque 181 no. 182 (1990) 61–192.
2. L. R. Dickson, The analytic representation of substitutions on a power of a prime number of letters with a
discussion of the linear group, Ann. Math. 11 no. 1/6 (1896) 65–120.
3. W. W. Stothers, On permutation polynomials whose difference is linear, Glasgow Math. J. 32 no. 2 (1990)
165–171.

Mahidol University International College


999 Phutthamonthon 4 Road, Salaya,
Nakhonpathom, Thailand 73170
chatchawan.pan@mahidol.edu
pornrat.rue@mahidol.edu

Maximal Area of Equilateral Small Polygons
Charles Audet

Abstract. We show that among all equilateral polygons with a given number of sides and the
same diameter, the regular polygon has the maximal area.

1. INTRODUCTION. Let $n \ge 3$ be a fixed integer. We answer the following question:
Which $n$-sided equilateral small polygon has the greatest area? The term small
was coined by Graham [6] and signifies that the diameter of the polygon is equal to
one, i.e., that the largest Euclidean distance between pairs of its vertices is one. The
main result of the present paper is that the regular polygon is the equilateral small
$n$-sided polygon with the greatest area.
One might be tempted to think that this is an obvious result, since it is well
known [5] that among all n-sided polygons sharing the same perimeter, the regular
one has maximal area. However, for related extremal problems, the optimal polygons
are not necessarily regular. When n is an odd number, the n-sided small polygon with
maximal area is the regular polygon [10]. This result holds over all small polygons,
including nonconvex and nonequilateral ones. But when the number of sides is even,
the n-sided small polygon with maximal area is not the regular one. This follows
from the observation that when n ≥ 6 is an even number, the area of the small regular
(n − 1)-sided polygon is larger than that of the small regular n-sided polygon [2]. The
optimal figures for n varying from 6 to 12 are given in [4, 6, 7]; lower and upper
bounds on the area and perimeter are presented in [8, 9] for larger values of n.
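The ranking observation quoted from [2] can be illustrated numerically. The sketch below assumes two elementary facts that are not stated in this note: for even $n$ the longest diagonal of a regular $n$-gon is a diameter of its circumcircle, and for odd $n$ the longest diagonal has length $2R\cos(\pi/(2n))$, where $R$ is the circumradius. The function name `small_regular_area` is hypothetical.

```python
from math import sin, cos, pi

def small_regular_area(n):
    """Area of the regular n-gon scaled so that its diameter equals 1."""
    if n % 2 == 0:
        radius = 0.5                              # longest diagonal = 2R
    else:
        radius = 1 / (2 * cos(pi / (2 * n)))      # longest diagonal = 2R cos(pi/(2n))
    return n / 2 * radius ** 2 * sin(2 * pi / n)

# Pairs (area of the (n-1)-gon, area of the n-gon) for even n from 6 to 20.
pairs = {n: (small_regular_area(n - 1), small_regular_area(n)) for n in range(6, 21, 2)}
```

For even $n$ the formula reduces to $\frac{n}{8}\sin(2\pi/n)$, the expression used later in the proof of Lemma 1. Under these assumptions the small regular pentagon has area about 0.657, against about 0.650 for the small regular hexagon.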
Similar questions are raised in the context of finding the small polygon with
maximal perimeter. There is a small equilateral hexagon whose perimeter is 3.5%
larger than that of the regular small hexagon [11], and there is a nonequilateral small
hexagon [6] whose area and perimeter are 3.9% and 3.3% larger, respectively, than that
of the regular small hexagon. Figure 1 illustrates these three small hexagons, where
the line segments represent unit-length diagonals. Equilateral and nonequilateral small
octagons with maximal perimeter are presented in [3] and in [1].

Figure 1. Three small hexagons.

http://dx.doi.org/10.4169/amer.math.monthly.124.2.175
MSC: Primary 51M16, Secondary 97G40



The present note studies the question of finding the small equilateral polygon with
greatest area. The main result is shown in the next section, based on two preliminary
lemmas.

2. THE LARGEST SMALL EQUILATERAL POLYGON. The following notation
is adopted: The area and perimeter of a polygon $\mathcal{P}$ are denoted $A(\mathcal{P})$ and $P(\mathcal{P})$,
respectively. $A_n^{\mathrm{reg}}$, $P_n^{\mathrm{reg}}$, and $c_n^{\mathrm{reg}}$ are the area, perimeter, and side length of the small
regular $n$-sided polygon.
The main result is shown by contradiction, by developing a connection between the
area of a polygon and that of another with twice as many vertices. The next lemma
states this connection for regular polygons with an even number of sides.

Lemma 1. If $n \ge 4$ is an even number, then
$$A_{2n}^{\mathrm{reg}} = A_n^{\mathrm{reg}} + \frac{n c_n^{\mathrm{reg}}}{4}\left(1 - \sqrt{1 - (c_n^{\mathrm{reg}})^2}\right).$$

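Proof. When $n$ is even, the side length and area of the small regular $n$-sided polygon
satisfy $c_n^{\mathrm{reg}} = \sin(\frac{\pi}{n})$ and $A_n^{\mathrm{reg}} = \frac{n}{8}\sin(\frac{2\pi}{n})$. It follows that
$$A_{2n}^{\mathrm{reg}} - A_n^{\mathrm{reg}} = \frac{n}{4}\left(\sin\Bigl(\frac{\pi}{n}\Bigr) - \frac{1}{2}\sin\Bigl(\frac{2\pi}{n}\Bigr)\right)$$
$$= \frac{n}{4}\left(\sin\Bigl(\frac{\pi}{n}\Bigr) - \sin\Bigl(\frac{\pi}{n}\Bigr)\cos\Bigl(\frac{\pi}{n}\Bigr)\right)$$
$$= \frac{n}{4}\left(c_n^{\mathrm{reg}} - c_n^{\mathrm{reg}}\sqrt{1 - (c_n^{\mathrm{reg}})^2}\right),$$
and the result follows.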
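Lemma 1 is easy to confirm in floating point using the two formulas from its proof. The helper names in this sketch are illustrative.

```python
from math import sin, pi, sqrt, isclose

def area_reg(n):
    """Area of the small regular n-gon, valid for even n."""
    return n / 8 * sin(2 * pi / n)

def side_reg(n):
    """Side length of the small regular n-gon, valid for even n."""
    return sin(pi / n)

def lemma1_rhs(n):
    """Right-hand side of Lemma 1 for even n >= 4."""
    c = side_reg(n)
    return area_reg(n) + n * c / 4 * (1 - sqrt(1 - c * c))

# Compare A_{2n} with the right-hand side for even n from 4 to 40.
checks = [isclose(area_reg(2 * n), lemma1_rhs(n), rel_tol=1e-12) for n in range(4, 42, 2)]
```

For $n = 4$, the small square has area $1/2$ and the small regular octagon has area $\sin(\pi/4) \approx 0.7071$, which matches $0.5 + 0.7071 \cdot (1 - 0.7071)$.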

We now show that if the area of a small polygon exceeds that of the regular one,
then there exists another polygon with twice as many vertices with an area exceeding
that of the regular one by the same amount.

Lemma 2. Let $n \ge 4$ be an even number. If $\mathcal{P}$ is a small equilateral $n$-sided polygon
such that $A(\mathcal{P}) \ge A_n^{\mathrm{reg}} + \delta$ for some $\delta > 0$, then there exists a small equilateral $2n$-sided
polygon $\mathcal{Q}$ such that $A(\mathcal{Q}) \ge A_{2n}^{\mathrm{reg}} + \delta$.

Proof. Let $\mathcal{P}$ be a small equilateral $n$-sided polygon such that $\delta := A(\mathcal{P}) - A_n^{\mathrm{reg}}
> 0$, where $n \ge 4$ is an even number. Define $c = \frac{1}{n}P(\mathcal{P})$ to be the length of the equilateral
sides of the polygon $\mathcal{P}$.
The equilateral polygon $\mathcal{Q}$ is constructed from $\mathcal{P}$ by adding one vertex near the
center of each side of $\mathcal{P}$. Each added vertex is at distance $h$ away from the center of
one side of $\mathcal{P}$, where $h > 0$ is taken to be as large as possible so that the diameter of
$\mathcal{Q}$ remains equal to one. The initial polygon $\mathcal{P}$ is represented in Figure 2 by the full
lines, the added vertices are depicted by white circles, and $\mathcal{Q}$ is delimited by the dotted
lines.
The areas of the two polygons are related as follows:
$$A(\mathcal{Q}) = A(\mathcal{P}) + \frac{nch}{2} = A_n^{\mathrm{reg}} + \frac{nch}{2} + \delta. \tag{1}$$

Figure 2. The dotted equilateral polygon Q is obtained by adding n vertices at the same distance h from each
side of the equilateral n-sided polygon P .

We next compute a valid lower bound $\underline{h}$ on the distance $h$ from each added vertex
to the polygon $\mathcal{P}$. The left part of Figure 3 illustrates two opposite sides of the
$n$-sided polygon, together with the added vertices labeled $A$ and $B$. By construction,
the distance between them satisfies $|AB| \le 1$. The value of $h$ diminishes when these
sides are moved away from each other. The minimal value of $h$ occurs when both sides
are parallel, and when the two pairs of vertices are at unit distance, as illustrated in the
right part of Figure 3. The distance $\underline{h}$ from the added vertices $A'$ and $B'$ to the
polygon satisfies
$$h \ge \underline{h} := \frac{1}{2}\left(1 - \sqrt{1 - c^2}\right).$$

Figure 3. The distance from an added vertex to the polygon is larger when the corresponding sides are not
parallel to each other: $h \ge \underline{h}$.

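Recall that the maximal area enclosed by an $n$-sided polygon of a given perimeter
is achieved by the regular polygon [5]. This implies that $P(\mathcal{P}) > P_n^{\mathrm{reg}}$ and $c > c_n^{\mathrm{reg}}$
because $A(\mathcal{P}) > A_n^{\mathrm{reg}}$. From equation (1), Lemma 1, and using the fact that the function
$c\left(1 - \sqrt{1 - c^2}\right)$ is monotone increasing, we see that
$$A(\mathcal{Q}) \ge A_n^{\mathrm{reg}} + \frac{nc\underline{h}}{2} + \delta$$
$$= A_n^{\mathrm{reg}} + \frac{nc}{4}\left(1 - \sqrt{1 - c^2}\right) + \delta$$
$$\ge A_n^{\mathrm{reg}} + \frac{n c_n^{\mathrm{reg}}}{4}\left(1 - \sqrt{1 - (c_n^{\mathrm{reg}})^2}\right) + \delta$$
$$= A_{2n}^{\mathrm{reg}} + \delta.$$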

Repeated applications of this last lemma allow us to prove the main result.

Theorem 3. For any integer n ≥ 3, the small n-sided equilateral polygon with the
greatest area is the regular small polygon.

Proof. The largest small polygon is the regular one when $n$ is odd [10]. Suppose by
contradiction that there exists a small equilateral $n$-sided polygon $\mathcal{P}$ (with $n$ even) for
which $A(\mathcal{P}) = A_n^{\mathrm{reg}} + \delta$ for some scalar $\delta > 0$.
Applying Lemma 2 repeatedly yields a sequence of small equilateral polygons with
$2n, 4n, 8n, \ldots$ sides, each with an area exceeding that of the small regular polygon by
the value $\delta$. However,
$$\lim_{m \to \infty} A_{2m}^{\mathrm{reg}} + \delta = \lim_{m \to \infty} \frac{m}{4}\sin\Bigl(\frac{\pi}{m}\Bigr) + \delta = \frac{\pi}{4} + \delta$$
implies a contradiction: there exists a small equilateral polygon whose area exceeds $\frac{\pi}{4}$,
the area of the circle with unit diameter.
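The limit used in the proof can be observed numerically. This short sketch evaluates $A_{2m}^{\mathrm{reg}} = \frac{m}{4}\sin(\pi/m)$ for growing $m$ and records its gap to $\pi/4$; the name `area_reg_2m` is an illustrative choice.

```python
from math import sin, pi

def area_reg_2m(m):
    """Area of the small regular polygon with 2m sides."""
    return m / 4 * sin(pi / m)

# Gap between the disk area pi/4 and the polygon area, for m = 10, 100, ..., 100000.
gaps = [pi / 4 - area_reg_2m(10 ** k) for k in range(1, 6)]
```

Each gap is positive and shrinks roughly like $1/m^2$, so any fixed excess $\delta > 0$ is eventually larger than the gap, which is the contradiction exploited above.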

ACKNOWLEDGMENT. This work was supported by NSERC grant 2015-05311.

REFERENCES

1. C. Audet, P. Hansen, F. Messine, The small octagon with longest perimeter, J. Combin. Theory Ser. A 114
(2007) 135–150.
2. C. Audet, P. Hansen, F. Messine, Ranking small regular polygons by area and by perimeter, J. Appl. Ind. Math 3 (2009) 21–27.
Original Russian text: Diskretnyi Analiz i Issledovanie Operatsii, 15 (2008) 65–73.
3. C. Audet, P. Hansen, F. Messine, S. Perron, The minimum diameter octagon with unit-length sides:
Vincze’s wife’s octagon is suboptimal, J. Combin. Theory Ser. A 108 (2004) 63–75.
4. C. Audet, P. Hansen, F. Messine, J. Xiong, The largest small octagon, J. Combin. Theory Ser. A 98 (2002)
46–59.
5. V. Blåsjö, The isoperimetric problem, Amer. Math. Monthly 112 (2005) 526–566.
6. R. L. Graham, The largest small hexagon, J. Combin. Theory Ser. A 18 (1975) 165–170.
7. D. Henrion, F. Messine, Finding largest small polygons with GloptiPoly, J. Global Optim. 56 (2013)
1017–1028.
8. M. J. Mossinghoff, A $1 problem, Amer. Math. Monthly 113 (2006) 385–402.
9. M. J. Mossinghoff, An isodiametric problem for equilateral polygons, Contemp. Math. 457 (2008) 237–252.
10. K. Reinhardt, Extremale polygone gegebenen durchmessers, Jahresber. Deutsch. Math. Verein 31 (1922)
251–270.
11. N.K. Tamvakis, On the perimeter and the area of the convex polygon of a given diameter, Bull. Greek
Math. Soc. 28 (1987) 115–132.

GERAD and Département de Mathématiques et de Génie Industriel, École Polytechnique de Montréal,
Montréal, Québec, Canada.
Charles.Audet@gerad.ca
http://www.gerad.ca/Charles.Audet

PROBLEMS AND SOLUTIONS
Edited by Gerald A. Edgar, Daniel H. Ullman, Douglas B. West
with the collaboration of Paul Bracken, Ezra A. Brown, Daniel Cranston, Zachary Franco,
Christian Friesen, László Lipták, Rick Luttmann, Frank B. Miles, Leonard Smiley, Kenneth
Stolarsky, Richard Stong, Walter Stromquist, Daniel Velleman, and Fuzhen Zhang.

Proposed problems should be submitted online at
http://www.americanmathematicalmonthly.submittable.com/submit.
Proposed solutions to the problems below should be submitted on or before June
30, 2017 via the same link. More detailed instructions are available online. Proposed
problems must not be under consideration concurrently at any other journal nor be
posted to the internet before the deadline date for solutions. An asterisk (*) after the
number of a problem or a part of a problem indicates that no solution is currently
available.

PROBLEMS
11957. Proposed by Éric Pité, Paris, France. Let $m$ and $n$ be two integers with $n \ge m \ge 2$.
Let $S(n, m)$ be the Stirling number of the second kind, i.e., the number of ways to partition
a set of $n$ objects into $m$ nonempty subsets. Show that
$$n^m S(n, m) \ge m^n \binom{n}{m}.$$

11958. Proposed by Kent Holing, Trondheim, Norway.


(a) Find a condition on the side lengths a, b, and c of a triangle that holds if and only if the
nine-point center lies on the circumcircle.
(b) Characterize the triangles whose nine-point center lies on the circumcircle and whose
incenter lies on the Euler line.
11959. Proposed by Donald Knuth, Stanford University, Stanford, CA. Prove that, for any
$n$-by-$n$ matrix $A$ with $(i, j)$-entry $a_{i,j}$ and any $t_1, \ldots, t_n$, the permanent of $A$ is
$$\frac{1}{2^n}\sum \prod_{i=1}^{n} \sigma_i\Bigl(t_i + \sum_{j=1}^{n} \sigma_j a_{i,j}\Bigr),$$
where the outer sum is over all $2^n$ choices of $(\sigma_1, \ldots, \sigma_n) \in \{1, -1\}^n$.
11960. Proposed by Ulrich Abel, Technische Hochschule Mittelhessen, Friedberg,
Germany. Let $m$ and $n$ be natural numbers, and, for $i \in \{1, \ldots, m\}$, let $a_i$ be a real number
with $0 \le a_i \le 1$. Define
$$f(x) = \frac{1}{x^2}\left(\sum_{i=1}^{m} (1 + a_i x)^{mn} - m\prod_{i=1}^{m} (1 + a_i x)^{n}\right).$$
Let $k$ be a nonnegative integer, and write $f^{(k)}$ for the $k$th derivative of $f$. Show that
$f^{(k)}(-1) \ge 0$.
http://dx.doi.org/10.4169/amer.math.monthly.124.2.179



11961. Proposed by Mihaela Berindeanu, Bucharest, Romania. Evaluate
$$\int_0^{\pi/2} \frac{\sin x}{1 + \sqrt{\sin(2x)}}\,dx.$$

11962. Proposed by Elton Hsu, Northwestern University, Evanston, IL. Let $\{X_n\}_{n \ge 1}$ be
a sequence of independent and identically distributed random variables each taking the
values $\pm 1$ with probability 1/2. Find the distribution of the random variable
$$\sqrt{\frac{1}{2} + \frac{X_1}{2}\sqrt{\frac{1}{2} + \frac{X_2}{2}\sqrt{\frac{1}{2} + \cdots}}}\,.$$

11963. Proposed by Gheorghe Alexe and George-Florin Serban, Braila, Romania. Let
$a_1, \ldots, a_n$ be positive real numbers with $\prod_{k=1}^{n} a_k = 1$. Show that
$$\sum_{i=1}^{n} \frac{(a_i + a_{i+1})^4}{a_i^2 - a_i a_{i+1} + a_{i+1}^2} \ge 12n,$$
where $a_{n+1} = a_1$.

SOLUTIONS

An Inequality with Squared Tangents


11778 [2014, 456]. Proposed by Li Zhou, Polk State College, Winter Haven, FL.
Let $x, y, z$ be positive real numbers such that $x + y + z = \pi/2$. Let $f(x, y, z) =
1/(\tan^2 x + 4\tan^2 y + 9\tan^2 z)$. Prove that
$$f(x, y, z) + f(y, z, x) + f(z, x, y) \le \frac{9}{14}\left(\tan^2 x + \tan^2 y + \tan^2 z\right).$$

Solution by Vazgen Mikayelyan, Department of Mathematics and Mechanics, Yerevan
State University, Yerevan, Armenia. Letting $a = \tan x$, $b = \tan y$, and $c = \tan z$, we have
$a, b, c > 0$, since $0 < x, y, z < \pi/2$, and $ab + bc + ca = 1$ since
$$a = \tan x = \cot(y + z) = \frac{1 - \tan y \tan z}{\tan y + \tan z} = \frac{1 - bc}{b + c}.$$
By the Cauchy–Schwarz inequality,
$$3(a^2 + 4b^2 + 9c^2) = \frac{3a^2b^2}{b^2} + \frac{11a^2b^2}{a^2} + \frac{b^2c^2}{c^2} + \frac{13b^2c^2}{b^2} + \frac{14c^2a^2}{a^2}$$
$$= \frac{(3ab)^2}{3b^2} + \frac{(11ab)^2}{11a^2} + \frac{(bc)^2}{c^2} + \frac{(13bc)^2}{13b^2} + \frac{(14ca)^2}{14a^2}$$
$$\ge \frac{(14ab + 14bc + 14ca)^2}{3b^2 + 11a^2 + c^2 + 13b^2 + 14a^2} = \frac{14^2}{25a^2 + 16b^2 + c^2}.$$
Hence,
$$f(x, y, z) = \frac{1}{a^2 + 4b^2 + 9c^2} \le \frac{3(25a^2 + 16b^2 + c^2)}{14^2}.$$

Adding this and the analogous inequalities obtained by cycling the variables, we obtain
$$f(x, y, z) + f(y, z, x) + f(z, x, y) \le \frac{3(42a^2 + 42b^2 + 42c^2)}{14^2} = \frac{9(a^2 + b^2 + c^2)}{14},$$
which is the desired inequality.
Editorial comment. Paolo Perfetti proved the stronger result
$$f(x, y, z) + f(y, z, x) + f(z, x, y) \le \frac{9}{14} \le \frac{9}{14}(\tan^2 x + \tan^2 y + \tan^2 z).$$

Also solved by A. Ali (India), S. Baek (Korea), R. Bagby, P. P. Dályay (Hungary), O. Geupel (Germany),
P. Perfetti (Italy), R. Stong, R. Tauraso (Italy), and the proposer.
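The inequality of Problem 11778 can be probed numerically on random admissible triples. The sketch below is a sanity check under stated assumptions, not a proof; the sampling scheme, the seed, and the names `both_sides` and `random_angles` are illustrative choices.

```python
import math
import random

def f(x, y, z):
    return 1 / (math.tan(x) ** 2 + 4 * math.tan(y) ** 2 + 9 * math.tan(z) ** 2)

def both_sides(x, y, z):
    """Return (left side, right side) of the inequality."""
    lhs = f(x, y, z) + f(y, z, x) + f(z, x, y)
    rhs = 9 / 14 * (math.tan(x) ** 2 + math.tan(y) ** 2 + math.tan(z) ** 2)
    return lhs, rhs

def random_angles(rng):
    """Positive x, y, z with x + y + z = pi/2, bounded away from the degenerate cases."""
    weights = [rng.uniform(0.05, 1.0) for _ in range(3)]
    total = sum(weights)
    return [math.pi / 2 * w / total for w in weights]

rng = random.Random(0)
samples = [both_sides(*random_angles(rng)) for _ in range(10_000)]
worst_gap = min(rhs - lhs for lhs, rhs in samples)
```

At $x = y = z = \pi/6$ both sides equal $9/14$, so the bound is attained there.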

Concyclic or Collinear
11779 [2014, 456]. Proposed by Michel Bataille, Rouen, France.
Let $M$, $A$, $B$, $C$, and $D$ be distinct points (in any order) on a circle $\Gamma$ with center $O$.
Let the medians through $M$ of triangles $MAB$ and $MCD$ cross lines $AB$ and $CD$ at
$P$ and $Q$, respectively, and meet $\Gamma$ again at $E$ and $F$, respectively. Let $K$ be the
intersection of $AF$ with $DE$, and let $L$ be the intersection of $BF$ with $CE$. Let $U$
and $V$ be the orthogonal projections of $C$ onto $MA$ and $D$ onto $MB$, respectively,
and assume $U \ne A$ and $V \ne B$. Prove that $A$, $B$, $U$, and $V$ are concyclic if and
only if $O$, $K$, and $L$ are collinear.

[Figure: the circle with center O and the labeled points M, A, B, C, D, E, F, K, L, P, Q, U, V.]

Solution by Richard Stong, Center for Communications Research, San Diego, CA. The
problem is not quite correct. We must also assume that $E$ and $F$ do not coincide, hence
$K$ and $L$ do not coincide. (If $K$ and $L$ coincide, then $O, K, L$ are clearly collinear, but
$A, U, B, V$ need not be concyclic.)
Let $R$ be the radius of $\Gamma$, let $N$ be the point where lines $AC$ and $BD$ intersect, and let $X$ and
$Y$ be the reflections of $O$ across lines $AC$ and $BD$, respectively. The claim is the equivalence
of (1) $A, B, U, V$ are concyclic, and (2) $O, K, L$ are collinear. We show that each of these
is equivalent to (3) $M$ is equidistant from $X$ and $Y$.
(1) $\iff$ (3). Note that $A, B, U, V$ are concyclic if and only if the powers from $M$ are
equal: $|MA| \cdot |MU| = |MB| \cdot |MV|$. From trigonometry and the extended law of sines,
$$|MB| \cdot |MV| = 4R^2 \sin\bigl(\tfrac{1}{2}\angle MOB\bigr) \sin\bigl(\tfrac{1}{2}\angle MOD\bigr) \cos\bigl(\tfrac{1}{2}\angle BOD\bigr).$$
From the law of cosines applied to $\triangle MOY$, we find that $|MY|^2$ equals
$$R^2 + 4R^2 \cos^2\bigl(\tfrac{1}{2}\angle BOD\bigr) - 4R^2 \cos\bigl(\tfrac{1}{2}\angle BOD\bigr) \cos\bigl(\tfrac{1}{2}(\angle MOB + \angle MOD)\bigr)$$
$$= R^2 + 8R^2 \sin\bigl(\tfrac{1}{2}\angle MOB\bigr) \sin\bigl(\tfrac{1}{2}\angle MOD\bigr) \cos\bigl(\tfrac{1}{2}\angle BOD\bigr)$$
$$= R^2 + 2|MB| \cdot |MV|,$$



and analogously, $|MX|^2 = R^2 + 2|MA| \cdot |MU|$. Thus, $|MX| = |MY|$ if and only if $|MA| \cdot
|MU| = |MB| \cdot |MV|$.
(2) $\iff$ (3). Lay down complex coordinates with $\Gamma$ equal to the unit circle. If a point
$Z$ with coordinate $z$ is on the unit circle and point $W$ with coordinate $w$ is any other point
in the complex plane, then the second intersection of line $WZ$ with the unit circle has
coordinate $(z - w)/(\bar{w}z - 1)$. Hence, letting lower case letters denote coordinates of the
points with the corresponding upper case letter, we compute
$$e = \frac{ab(a + b - 2m)}{2ab - am - bm} \quad \text{and} \quad f = \frac{cd(c + d - 2m)}{2cd - cm - dm}.$$
By Pascal's theorem applied to the hexagon $CEDBFA$, we see that $K$, $L$, and $N$ all lie on
the line
$$(ce + db + fa - ed - bf - ac)z + (abdf + acef + bcde - abcf - acde - bdef)\bar{z}$$
$$= ce(b + f) + db(a + c) + fa(e + d) - ed(f + a) - bf(c + e) - ac(b + d).$$
If $K \ne L$, then this is the unique line through $K$ and $L$. Hence, $O$, $K$, and $L$ are collinear
if and only if
$$ce(b + f) + db(a + c) + fa(e + d) - ed(f + a) - bf(c + e) - ac(b + d) = 0.$$
Plugging in the formulas for $e$ and $f$ above and factoring out $(a - b)(c - d)$, this becomes
$$m\left(\frac{1}{a} + \frac{1}{c} - \frac{1}{b} - \frac{1}{d}\right) + (a + c - b - d)\bar{m} = \frac{(ab - cd)(ad - bc)}{abcd}.$$
Since $x = a + c$ and $y = b + d$, this is the equation of the line perpendicular to $XY$ through
the midpoint $(a + b + c + d)/2$ of $XY$. Hence, $O$, $K$, and $L$ are collinear if and only if
$|MX| = |MY|$.
Also solved by R. Chapman (U. K.), J.-P. Grivaux (France), C. R. Pranesachar (India), and the proposer.
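The second-intersection formula and the resulting expression for $e$ can be checked with complex floating-point arithmetic. In this sketch the sample angles and the names `second_intersection` and `unit` are arbitrary illustrative choices; the point $W$ is taken to be the midpoint $P$ of $AB$, as in the solution.

```python
import cmath

def second_intersection(z, w):
    """Second point where the line through z (on the unit circle) and w meets the circle."""
    w = complex(w)
    return (z - w) / (w.conjugate() * z - 1)

def unit(theta):
    """Point of the unit circle with argument theta."""
    return cmath.exp(1j * theta)

a, b, m = unit(0.3), unit(2.1), unit(4.0)
e_formula = a * b * (a + b - 2 * m) / (2 * a * b - a * m - b * m)
e_direct = second_intersection(m, (a + b) / 2)   # median from M through the midpoint of AB
```

As a hand check, the line through $z = i$ and $w = 1/2$ meets the unit circle again at $0.8 - 0.6i$, which is what the formula returns.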

Altitudes of a Tetrahedron
11783 [2014, 549] and 11797 [2014, 738]. Proposed by Zhang Yun, Xi'an City, Shaanxi,
China. Given a tetrahedron, let $r$ denote the radius of its inscribed sphere. For $1 \le k \le 4$,
let $h_k$ denote the distance from the $k$th vertex to the plane of the opposite face. Prove that
$$\sum_{k=1}^{4} \frac{h_k - r}{h_k + r} \ge \frac{12}{5}.$$

Solution by Roberto Tauraso, Dipartimento di Matematica, Università di Roma “Tor Vergata,”
Rome, Italy. The volume of the tetrahedron is given by
\[
\frac{h_k A_k}{3} = \frac{rS}{3},
\]
where $A_k$ is the area of the face opposite the $k$th vertex and $S = \sum_{k=1}^{4} A_k$ is the surface
area of the tetrahedron. Hence, $h_k = r/t_k$, where $t_k = A_k/S$. Since $0 < t_k < 1$ and the
function $f(t) = (1 - t)/(1 + t)$ is convex on $[0, +\infty)$, we have
\[
\sum_{k=1}^{4} \frac{h_k - r}{h_k + r} = \sum_{k=1}^{4} f(t_k) \ge 4 f\!\left(\frac14 \sum_{k=1}^{4} t_k\right) = 4 f\!\left(\frac14\right) = \frac{12}{5}.
\]
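As an informal numerical check of the inequality (a sketch only, not part of the published solution), one can sample random tetrahedra, compute $h_k$ and $r$ from the volume identities $V = h_k A_k/3 = rS/3$, and look at the smallest value of the sum. The helper names, the sampling box $[-1,1]^3$, the rejection threshold, and the seed are all illustrative assumptions.

```python
import math
import random

def sub(p, q):
    return (p[0] - q[0], p[1] - q[1], p[2] - q[2])

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]

def volume(verts):
    a, b, c, d = verts
    return abs(dot(sub(b, a), cross(sub(c, a), sub(d, a)))) / 6

def altitude_sum(verts):
    """Sum over k of (h_k - r)/(h_k + r) for a nondegenerate tetrahedron."""
    vol = volume(verts)
    areas = []
    for k in range(4):
        p, q, s = [verts[i] for i in range(4) if i != k]  # face opposite vertex k
        n = cross(sub(q, p), sub(s, p))
        areas.append(0.5 * math.sqrt(dot(n, n)))
    r = 3 * vol / sum(areas)                      # from V = r S / 3
    heights = [3 * vol / area for area in areas]  # from V = h_k A_k / 3
    return sum((h - r) / (h + r) for h in heights)

def random_tetrahedron(rng):
    """Sample four points in [-1, 1]^3, rejecting nearly flat configurations."""
    while True:
        verts = [tuple(rng.uniform(-1, 1) for _ in range(3)) for _ in range(4)]
        if volume(verts) > 1e-6:
            return verts

rng = random.Random(0)
smallest = min(altitude_sum(random_tetrahedron(rng)) for _ in range(2000))
regular = altitude_sum([(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)])
print(smallest, regular)
```

The regular tetrahedron has equal face areas, so it should sit at the bound $12/5$, while every sampled tetrahedron should give a value at least that large up to rounding.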

Editorial comment. The problem was inadvertently repeated as Problems 11783 and 11797.
Several solvers noted that equality holds if and only if the face areas are equal. This
does not require, however, that the tetrahedron be regular. Some solvers noted that the
n-dimensional analogue of the inequality holds with lower bound n(n + 1)/(n + 2).
Also solved by A. Ali (India), S. Baek (Korea), R. Bagby, M. Bataille (France), D. M. Bătinetu-Giurgiu &
T. Zvonaru (Romania), I. Borosh, R. Boukharfane (Morocco), R. Chapman (U. K.), N. Curwen (U. K.),
P. P. Dályay (Hungary), M. Dincă (Romania), D. Fleischman, H. S. Geun (Korea), O. Geupel (Germany),
M. Goldenberg & M. Kaplan, J. G. Heuver (Canada), S. Hitotumatu (Japan), E. J. Ionaşcu, Y. J. Ionin,
B. Karaivanov (U.S.A.) & T. S. Vassilev (Canada), O. Kouba (Syria), D. Lee (Korea), O. P. Lossers (Nether-
lands), V. Mikayelyan (Armenia), R. Nandan, Y. Oh (Korea), P. Perfetti (Italy), I. Pinelis, C. R. Pranesachar
(India), Y. Shim (Korea), J. C. Smith, R. Stong, T. Viteam (India), M. Vowe (Switzerland), T. Zvonaru &
N. Stanciu (Romania), GCHQ Problem Solving Group (U. K.), Missouri State University Problem Solving
Group, University of Louisiana at Lafayette Math Club, and the proposer.

Circles around an Equilateral Triangle


11784 [2014, 549]. Proposed by Abdurrahim Yilmuz, Middle East Technical University,
Ankara, Turkey. Let $ABC$ be an equilateral triangle with center $O$ and circumradius $r$.
Given $R > r$, let $\rho$ be a circle about $O$ of radius $R$. All points named “$P$” are on $\rho$.
(a) Prove that $|PA|^2 + |PB|^2 + |PC|^2 = 3(R^2 + r^2)$.
(b) Prove that $\min_{P\in\rho} |PA|\,|PB|\,|PC| = R^3 - r^3$ and $\max_{P\in\rho} |PA|\,|PB|\,|PC| = R^3 + r^3$.
(c) Prove that the area of a triangle with sides of length $|PA|$, $|PB|$, and $|PC|$ is $\frac{\sqrt3}{4}(R^2 - r^2)$.
(d) Prove that if $H$, $K$, and $L$ are the respective projections of $P$ onto $AB$, $AC$, and $BC$,
then the area of triangle $HKL$ is $\frac{3\sqrt3}{16}(R^2 - r^2)$.
(e) With the same notation, prove that $|HK|^2 + |KL|^2 + |HL|^2 = \frac94(R^2 + r^2)$.
Solution by TCDmath Problem Group, Trinity College, Dublin, Ireland.
(a) We represent the points by complex numbers: $A = r$, $B = r\omega$, $C = r\omega^2$, $O = 0$,
$P = z$, where $r > 0$ and $\omega = e^{2\pi i/3}$. We compute
\[
\begin{aligned}
|PA|^2 &= (z - r)(\overline{z} - r) = |z|^2 - r(z + \overline{z}) + r^2, \\
|PB|^2 &= (z - r\omega)(\overline{z} - r\omega^2) = |z|^2 - r(\overline{z}\omega + z\omega^2) + r^2, \text{ and} \\
|PC|^2 &= (z - r\omega^2)(\overline{z} - r\omega) = |z|^2 - r(\overline{z}\omega^2 + z\omega) + r^2.
\end{aligned}
\]
Summing these equations and using $1 + \omega + \omega^2 = 0$ together with $|z| = R$ yields
\[
|PA|^2 + |PB|^2 + |PC|^2 = 3(R^2 + r^2).
\]

(b) We have
\[
|PA|\,|PB|\,|PC| = \left|(z - r)(z - r\omega)(z - r\omega^2)\right| = \left|z^3 - r^3\right|.
\]
This formula takes its maximum value $R^3 + r^3$ when $z^3 = -R^3$, that is, when $z = Re^{i\theta}$
with $3\theta \equiv \pi \pmod{2\pi}$, or equivalently when $P$ lies on one of the altitudes of the triangle on the
opposite side to the vertex. It takes its minimum value $R^3 - r^3$ when $z^3 = R^3$, that is, when $P$
lies on one of the three altitudes of the triangle on the same side as the vertex.
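(c) Heron’s formula for the area $\Delta$ of a triangle with sides $a$, $b$, $c$ is
\[
\begin{aligned}
16\Delta^2 &= 2\left(a^2b^2 + b^2c^2 + c^2a^2\right) - \left(a^4 + b^4 + c^4\right) \\
&= \left(a^2 + b^2 + c^2\right)^2 - 2\left(a^4 + b^4 + c^4\right).
\end{aligned}
\]
In our case, we have $(|PA|^2 + |PB|^2 + |PC|^2)^2 = 9(R^2 + r^2)^2$ and
\[
\begin{aligned}
|PA|^4 + |PB|^4 + |PC|^4 &= \left(R^2 + r^2 - r(z + \overline{z})\right)^2 \\
&\quad + \left(R^2 + r^2 - r(\overline{z}\omega + z\omega^2)\right)^2 + \left(R^2 + r^2 - r(\overline{z}\omega^2 + z\omega)\right)^2 \\
&= 3\left(R^2 + r^2\right)^2 - 2\left(R^2 + r^2\right) r \left(z + \overline{z} + \overline{z}\omega + z\omega^2 + \overline{z}\omega^2 + z\omega\right) \\
&\quad + r^2\left((z + \overline{z})^2 + (\overline{z}\omega + z\omega^2)^2 + (\overline{z}\omega^2 + z\omega)^2\right) \\
&= 3\left(R^2 + r^2\right)^2 + 6r^2 z\overline{z} = 3\left(R^2 + r^2\right)^2 + 6R^2r^2.
\end{aligned}
\]
Thus, $16\Delta^2 = 3(R^2 + r^2)^2 - 12R^2r^2 = 3(R^2 - r^2)^2$. Hence, $\Delta = \frac{\sqrt3}{4}(R^2 - r^2)$.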
(d) In coordinates $(x, y)$, line $BC$ has equation $x = -r/2$, and the projection $PL$ has equation
$y = b$, where $b$ is the imaginary part of $z$. Thus,
\[
L = \frac12(-r + z - \overline{z}).
\]
To determine $H$, we rotate the plane with the transformation $z \mapsto \omega z$ and then rotate back
with $z \mapsto \omega^2 z$. This yields
\[
H = \frac{\omega^2}{2}(-r + \omega z - \omega^2\overline{z}) = \frac12(-\omega^2 r + z - \omega\overline{z}).
\]
Similarly,
\[
K = \frac12(-\omega r + z - \omega^2\overline{z}).
\]
Thus, using $|1 - \omega| = |1 - \omega^2| = |\omega - \omega^2| = \sqrt3$, we have
\[
|HK| = \frac{\sqrt3}{2}|\overline{z} - r|, \quad |KL| = \frac{\sqrt3}{2}|\overline{z} - \omega r|, \quad\text{and}\quad |LH| = \frac{\sqrt3}{2}|\overline{z} - \omega^2 r|.
\]
Replacing $\overline{z}$ with $z$ in these expressions leaves the triple $\{|HK|, |KL|, |LH|\}$ unchanged.
The triangle $HKL$ is therefore similar to the triangle in part (c) with coefficient of similarity
equal to $\sqrt3/2$. It follows that the area of $HKL$ is $3/4$ times the area of the earlier triangle,
that is, $\frac{3\sqrt3}{16}(R^2 - r^2)$.
(e) Similarly, comparing with the earlier triangle, $|HK|^2 + |KL|^2 + |HL|^2 = \frac94(R^2 + r^2)$.
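The five identities can be spot-checked numerically. The sketch below is not part of the published solution; it assumes the sample values $R = 2$, $r = 1$, computes the feet of the perpendiculars directly rather than through the rotation argument, and uses hypothetical helper names (`heron`, `foot`, `triangle_quantities`).

```python
import cmath
import math

def heron(a, b, c):
    """Area of a triangle from its side lengths (clamped at 0 against rounding)."""
    s = (a + b + c) / 2
    return math.sqrt(max(s * (s - a) * (s - b) * (s - c), 0.0))

def foot(p, u, v):
    """Orthogonal projection of the point p onto the line through u and v."""
    d = v - u
    t = ((p - u) * d.conjugate()).real / abs(d) ** 2
    return u + t * d

def triangle_quantities(R, r, theta):
    """Computed left-hand sides of parts (a)-(e) at P = R e^{i theta}."""
    w = cmath.exp(2j * math.pi / 3)
    A, B, C = complex(r), r * w, r * w * w
    P = R * cmath.exp(1j * theta)
    pa, pb, pc = abs(P - A), abs(P - B), abs(P - C)
    H, K, L = foot(P, A, B), foot(P, A, C), foot(P, B, C)
    hk, kl, lh = abs(H - K), abs(K - L), abs(L - H)
    return {
        "a": pa**2 + pb**2 + pc**2,
        "b": pa * pb * pc,
        "c": heron(pa, pb, pc),
        "d": heron(hk, kl, lh),
        "e": hk**2 + kl**2 + lh**2,
    }

R, r = 2.0, 1.0
claimed = {
    "a": 3 * (R**2 + r**2),
    "c": math.sqrt(3) / 4 * (R**2 - r**2),
    "d": 3 * math.sqrt(3) / 16 * (R**2 - r**2),
    "e": 9 / 4 * (R**2 + r**2),
}
for theta in (0.0, 0.4, 1.3, 2.9):
    got = triangle_quantities(R, r, theta)
    errors = {k: abs(got[k] - claimed[k]) for k in claimed}
    print(theta, max(errors.values()), got["b"])
```

For part (b), the product printed in the last column should stay between $R^3 - r^3$ and $R^3 + r^3$, attaining the lower value at $\theta = 0$.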
Editorial comment. The Editors regret that the “16” in the denominator of part (d) was
misprinted as “116.”
All parts also solved by R. Bagby, M. Bataille (France), R. Boukharfane (Morocco), R. Chapman (U. K.),
N. Curwen (U. K.), P. P. Dályay (Hungary), A. Ercan (Turkey), M. Goldenberg & M. Kaplan, J.-P. Grivaux
(France), A. Habil (Syria), E. J. Ionaşcu, Y. J. Ionin, B. Karaivanov, O. Kouba (Syria), M. D. Meyerson,
J. Minkus, C. R. Pranesachar (India), J. C. Smith, R. Stong, T. Zvonaru & N. Stanciu (Romania), GCHQ
Problem Solving Group (U. K.), and the proposer. Some but not all parts solved by A. Ali (India), R. B. Cam-
pos (Spain), D. Fleischman, O. Geupel (Germany), P. Nüesch (Switzerland), J. Schlosberg, C. R. Selvaraj &
S. Selvaraj, T. Viteam (India), and Z. Vörös (Hungary).

A Limit of a Ratio of Logarithms


11786 [2014, 550]. Proposed by George Stoica, University of New Brunswick, Saint John,
Canada. Let $x_1, x_2, \ldots$ be a sequence of positive numbers such that $\lim_{n\to\infty} x_n = 0$ and
$\lim_{n\to\infty} \frac{\log x_n}{x_1 + \cdots + x_n}$ is a negative number. Prove that $\lim_{n\to\infty} \frac{\log x_n}{\log n} = -1$.

Solution by Lixing Han, University of Michigan, Flint, MI. Suppose that
\[
\lim_{n\to\infty} \frac{\log x_n}{x_1 + \cdots + x_n} = \beta < 0.
\]
Then by the Stolz–Cesàro theorem, we have
\[
\lim_{n\to\infty} \frac{\log(x_{n+1}/x_n)}{x_{n+1}} = \lim_{n\to\infty} \frac{\log x_{n+1} - \log x_n}{x_{n+1}} = \lim_{n\to\infty} \frac{\log x_n}{x_1 + \cdots + x_n} = \beta. \tag{1}
\]

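This implies that $x_{n+1} < x_n$ for large $n$ since $x_n > 0$ and $\lim_{n\to\infty} x_n = 0$ by assumption.
Since $x_n$ goes to zero as $n \to \infty$, (1) implies $\lim_{n\to\infty} \beta x_{n+1} = \lim_{n\to\infty} \log(x_{n+1}/x_n) = 0$.
It follows that
\[
\lim_{n\to\infty} \frac{x_{n+1}}{x_n} = 1. \tag{2}
\]
By the mean value theorem, there exists $\zeta_n$ such that
\[
\log x_{n+1} - \log x_n = \frac{1}{\zeta_n}(x_{n+1} - x_n),
\]
where $\zeta_n$ is between $x_n$ and $x_{n+1}$. Thus, for large $n$, $x_{n+1}/x_n < \zeta_n/x_n < 1$. From (2), we
conclude $\lim_{n\to\infty} \zeta_n/x_n = 1$. Using this result and (1) again,
\[
\lim_{n\to\infty} \frac{\log x_{n+1} - \log x_n}{x_{n+1}} = \lim_{n\to\infty} \frac{1}{\zeta_n}\cdot\frac{x_{n+1} - x_n}{x_{n+1}} = \lim_{n\to\infty} \frac{x_n}{\zeta_n}\cdot\frac{x_{n+1} - x_n}{x_n x_{n+1}} = \beta.
\]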
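This implies that for any $\varepsilon \in (0, |\beta|)$, there exists a positive integer $N$ such that
\[
\beta - \varepsilon < \frac{1}{x_n} - \frac{1}{x_{n+1}} < \beta + \varepsilon
\]
for all $n \ge N$. Summing these inequalities from $n = N$ to $N + m - 1$, we obtain
\[
(\beta - \varepsilon)m < \frac{1}{x_N} - \frac{1}{x_{N+m}} < (\beta + \varepsilon)m.
\]
Dividing by $N + m$ and taking the limit as $m \to \infty$,
\[
\beta - \varepsilon \le \lim_{m\to\infty}\left(\frac{1}{(N+m)x_N} - \frac{1}{(N+m)x_{N+m}}\right) \le \beta + \varepsilon.
\]
Since $\lim_{m\to\infty} \frac{1}{(N+m)x_N} = 0$,
\[
\frac{1}{\beta + \varepsilon} \le -\lim_{m\to\infty} (N+m)x_{N+m} = -\lim_{n\to\infty} n x_n \le \frac{1}{\beta - \varepsilon}.
\]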
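Let $\varepsilon$ approach 0 to obtain $\lim_{n\to\infty} n x_n = -1/\beta > 0$. Thus,
\[
\lim_{n\to\infty} \log(n x_n) = \lim_{n\to\infty} (\log n + \log x_n) = \log\left(-\frac1\beta\right).
\]
Since $\log n \to \infty$, it follows that
\[
\lim_{n\to\infty} \frac{\log n + \log x_n}{\log n} = \lim_{n\to\infty} \frac{\log(-\frac1\beta)}{\log n} = 0.
\]
Hence,
\[
\lim_{n\to\infty}\left(1 + \frac{\log x_n}{\log n}\right) = 0,
\]
so $\lim_{n\to\infty} \log x_n/\log n = -1$.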
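To see the statement in action, here is a small numerical illustration. It is a sketch under an assumed example sequence, $x_n = 2/n$, which is not taken from the problem; for it the hypothesis holds with $\beta = -1/2$, and the intermediate limit $n x_n \to -1/\beta = 2$ is visible directly. The function name `diagnostics` and the checkpoints are likewise illustrative.

```python
import math

def diagnostics(x, checkpoints):
    """For a positive sequence x(n), tabulate the ratios from the problem at each checkpoint."""
    rows, partial_sum = [], 0.0
    targets = set(checkpoints)
    for n in range(1, max(checkpoints) + 1):
        xn = x(n)
        partial_sum += xn
        if n in targets:
            rows.append((n,
                         math.log(xn) / partial_sum,    # tends to beta
                         n * xn,                        # tends to -1/beta
                         math.log(xn) / math.log(n)))   # tends to -1
    return rows

rows = diagnostics(lambda n: 2.0 / n, [10**2, 10**4, 10**6])
for n, beta_est, n_xn, ratio in rows:
    print(f"n={n:>8}  log x_n / S_n = {beta_est:+.4f}  "
          f"n x_n = {n_xn:.4f}  log x_n / log n = {ratio:+.4f}")
```

Both logarithmic ratios converge only at a rate of order $1/\log n$, so the printed values drift slowly toward $-1/2$ and $-1$ rather than matching them closely.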

Also solved by P. Bracken, P. P. Dályay (Hungary), P. J. Fitzsimmons, E. J. Ionaşcu, O. Kouba (Syria),
J. H. Lindsey II, O. P. Lossers (Netherlands), M. Omarjee (France), T. Persson & M. P. Sundqvist (Sweden),
I. Pinelis, J. C. Smith, A. Stenger, R. Stong, and the proposer.

Sum of Medians of a Triangle


11790 [2014, 648]. Proposed by Arkady Alt, San Jose, CA, and Konstantin Knop, St. Petersburg,
Russia. Given a triangle with semiperimeter $s$, inradius $r$, and medians of length $m_a$,
$m_b$, and $m_c$, prove that $m_a + m_b + m_c \le 2s - 3(2\sqrt3 - 3)r$.
Solution by James Christopher Smith, Knoxville, TN. Write $R$ for the circumradius. We use
two inequalities. The first is
\[
(m_a + m_b + m_c)^2 \le 4s^2 - 16Rr + 5r^2,
\]
due to Xiao-Guang Chu and Xue-Zhi Yang. (See J. Liu, “On an inequality for the medians
of a triangle,” Journal of Science and Arts, 19 (2012) 127–136.) The second is
\[
s \le (3\sqrt3 - 4)r + 2R,
\]
known as Blundon’s inequality. (See problem E1935, this Monthly, 73 (1966) 1122.)
Write $u = 2\sqrt3 - 3$. From Blundon’s inequality,
\[
\begin{aligned}
(2s - 3ur)^2 &= 4s^2 - 12sur + 9u^2r^2 \\
&\ge 4s^2 - 12ur\left((3\sqrt3 - 4)r + 2R\right) + 9u^2r^2 \\
&= 4s^2 - 24uRr + \left(9u^2 - 12u(3\sqrt3 - 4)\right)r^2 \\
&= 4s^2 - 16Rr + (16 - 24u)Rr + 3u(7 - 6\sqrt3)r^2.
\end{aligned}
\]
Next, since $16 - 24u = 88 - 48\sqrt3 > 0$, we use Euler’s inequality $R \ge 2r$ to get
\[
\begin{aligned}
(2s - 3ur)^2 &\ge 4s^2 - 16Rr + (16 - 24u)2r^2 + 3u(7 - 6\sqrt3)r^2 \\
&= 4s^2 - 16Rr + 5r^2,
\end{aligned}
\]
which is greater than or equal to $(m_a + m_b + m_c)^2$ by the Chu–Yang inequality.
Taking square roots, which is legitimate because $2s - 3ur > 0$ (recall $s \ge 3\sqrt3\,r$), completes the proof.
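A quick numerical sanity check of the final inequality follows. This is a sketch rather than part of the solution: the function names, the choice of sampling triangle vertices uniformly from the unit square, and the seed are assumptions made for illustration. The medians are computed from the standard formula $m_a = \tfrac12\sqrt{2b^2 + 2c^2 - a^2}$.

```python
import math
import random

def median_gap(a, b, c):
    """Return 2s - 3(2*sqrt(3) - 3) r - (m_a + m_b + m_c) for side lengths a, b, c."""
    s = (a + b + c) / 2
    area = math.sqrt(max(s * (s - a) * (s - b) * (s - c), 0.0))  # Heron's formula
    r = area / s
    medians = [
        0.5 * math.sqrt(max(2 * y * y + 2 * z * z - x * x, 0.0))
        for x, y, z in ((a, b, c), (b, c, a), (c, a, b))
    ]
    return 2 * s - 3 * (2 * math.sqrt(3) - 3) * r - sum(medians)

def random_sides(rng):
    """Side lengths of a triangle with vertices sampled uniformly from the unit square."""
    pts = [(rng.random(), rng.random()) for _ in range(3)]
    return [math.dist(pts[i], pts[(i + 1) % 3]) for i in range(3)]

rng = random.Random(1)
smallest_gap = min(median_gap(*random_sides(rng)) for _ in range(5000))
equilateral_gap = median_gap(1.0, 1.0, 1.0)
print(smallest_gap, equilateral_gap)
```

The gap should be nonnegative for every sampled triangle, up to rounding, and should vanish for the equilateral triangle, where Blundon's and Euler's inequalities both hold with equality.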


Also solved by R. Boukharfane (Canada), O. Geupel (Germany), O. Kouba (Syria), R. Tauraso (Italy), M. Vowe
(Switzerland), and T. Zvonaru & N. Stanciu (Romania).

A Middle Subspace
11792 [2014, 648]. Proposed by Stephen Scheinberg, Corona del Mar, CA. Show that every
infinite-dimensional Banach space contains a closed subspace of infinite dimension and
infinite codimension.
Solution by University of Louisiana at Lafayette Math Club, Lafayette, LA. Let $V$ be an
infinite-dimensional normed vector space (we do not require completeness). We construct a
sequence of linearly independent vectors $v_0, v_1, \ldots$ in $V$ and a sequence of bounded linear
functionals $\lambda_0, \lambda_1, \ldots$ such that $\lambda_i(v_j) = \delta_{i,j}$ for all nonnegative integers $i$ and $j$. Choose
a nonzero $v_0 \in V$. By the Hahn–Banach theorem, there is a bounded linear functional $\lambda_0$
on $V$ with $\lambda_0(v_0) = 1$. Suppose that nonzero vectors $v_0, \ldots, v_k \in V$ and bounded linear
functionals $\lambda_0, \ldots, \lambda_k$ have been defined such that $\lambda_i(v_j) = \delta_{i,j}$ for $i, j \in \{0, \ldots, k\}$. The
vector subspace $\bigcap_{i=0}^{k} \ker\lambda_i$ has infinite dimension since it has finite codimension in $V$,
which is infinite-dimensional. In particular, there exists nonzero $v_{k+1} \in \bigcap_{i=0}^{k} \ker\lambda_i$. The
functional $\lambda_{k+1}$ may be defined on the span of $v_0, \ldots, v_{k+1}$ by $\lambda_{k+1}(v_j) = 0$ for $0 \le j \le k$
and $\lambda_{k+1}(v_{k+1}) = 1$ and then extended by the Hahn–Banach theorem to a bounded linear
functional on $V$. The vectors $v_0, \ldots, v_{k+1}$ are linearly independent since applying $\lambda_j$ to
$\sum c_i v_i = 0$ shows that $c_j = 0$. Continuing in this way, we construct the desired sequence $v_0, v_1, \ldots$.
Let $W$ be the closure of the linear span of $\{v_0, v_2, v_4, \ldots\}$. The subspace $W$ has infinite
dimension, since the $v_i$ are linearly independent. We claim also that $W$ has infinite
codimension, that is, that $V/W$ is infinite-dimensional. We prove this by showing that
the cosets $v_1 + W, v_3 + W, v_5 + W, \ldots$ are linearly independent. Suppose otherwise, that
there is some $n$ and there are some scalars $\alpha_1, \ldots, \alpha_n \in \mathbb{R}$ with at least one of them nonzero
such that $\sum_{i=1}^{n} \alpha_i v_{2i-1} \in W$. Say $\alpha_j \ne 0$. Since $\lambda_{2j-1}(v_i) = 0$ for even $i$, the linear functional
$\lambda_{2j-1}$ vanishes on their linear span and therefore on the closure $W$. This contradicts
\[
\lambda_{2j-1}\left(\sum_{i=1}^{n} \alpha_i v_{2i-1}\right) = \alpha_j \ne 0.
\]
Thus, $W$ has infinite codimension.


Also solved by R. Chapman (U. K.), N. Eldredge, M. González & Á. Plaza (Spain), J. P. Grivaux (France),
P. Perfetti (Italy), R. Tauraso (Italy), and the proposer.

Sums of Unit Vectors


11825 [2015, 284]. Proposed by Marian Dinca, Vahalia University of Târgoviste, Bucharest,
Romania, and Sorin Radulescu, Institute of Mathematical Statistics and Applied Mathematics,
Bucharest, Romania. Let $E$ be a normed linear space. Given $x_1, \ldots, x_n \in E$ (with
$n \ge 2$) such that $\|x_k\| = 1$ for $1 \le k \le n$ and the origin of $E$ is in the convex hull of
$\{x_1, \ldots, x_n\}$, prove that $\|x_1 + \cdots + x_n\| \le n - 2$.
Solution by Edward Schmeichel, San José State University, San José, CA. Since the origin
is in the convex hull of $\{x_1, \ldots, x_n\}$, there are nonnegative real numbers $t_k$ for $1 \le k \le n$
with $\sum_{k=1}^{n} t_k = 1$ and $\sum_{k=1}^{n} t_k x_k = 0$. Since
\[
t_k = \|t_k x_k\| = \Bigl\| -\sum_{j \ne k} t_j x_j \Bigr\| \le \sum_{j \ne k} t_j = 1 - t_k,
\]
we see that $1 - 2t_k \ge 0$. Thus,
\[
\|x_1 + \cdots + x_n\| = \Bigl\| \sum_{k=1}^{n} (1 - 2t_k) x_k \Bigr\| \le \sum_{k=1}^{n} (1 - 2t_k) = n - 2.
\]
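The bound can be tested empirically in the plane under several norms. The sketch below is an illustration only, with assumed names (`NORMS`, `origin_in_hull`, `max_excess`) and an assumed sampling scheme. It uses the fact that, for nonzero planar vectors, the origin lies in the convex hull exactly when no open half-plane contains all of them, that is, when no gap between consecutive directions exceeds $\pi$; this condition depends only on directions, so it is unaffected by rescaling each vector to unit length in the chosen norm.

```python
import math
import random

NORMS = {
    "l1": lambda v: abs(v[0]) + abs(v[1]),
    "l2": lambda v: math.hypot(v[0], v[1]),
    "linf": lambda v: max(abs(v[0]), abs(v[1])),
}

def origin_in_hull(angles):
    """True when no open half-plane contains all the given directions."""
    a = sorted(angles)
    gaps = [a[i + 1] - a[i] for i in range(len(a) - 1)]
    gaps.append(a[0] + 2 * math.pi - a[-1])   # wrap-around gap
    return max(gaps) <= math.pi

def max_excess(norm, n, trials, rng):
    """Largest observed ||x_1 + ... + x_n|| - (n - 2) over admissible samples."""
    worst = -math.inf
    accepted = 0
    while accepted < trials:
        angles = [rng.uniform(0, 2 * math.pi) for _ in range(n)]
        if not origin_in_hull(angles):
            continue
        accepted += 1
        total_x = total_y = 0.0
        for t in angles:
            d = (math.cos(t), math.sin(t))
            scale = norm(d)   # rescale the direction to a unit vector in this norm
            total_x += d[0] / scale
            total_y += d[1] / scale
        worst = max(worst, norm((total_x, total_y)) - (n - 2))
    return worst

rng = random.Random(2)
for name, norm in NORMS.items():
    for n in (3, 4, 6):
        print(name, n, max_excess(norm, n, 2000, rng))
```

Every printed excess should be nonpositive up to rounding, in line with the inequality.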

Editorial comment. This inequality seems to have first appeared in M. S. Klamkin and
D. J. Newman, An inequality for the sums of unit vectors, Univ. Beo. Publ. Elek. Fac., Ser.
Mat. i. Fiz. 338–352 (1971) 47–48. A more accessible reference is G. D. Chakerian and
M. S. Klamkin, Inequalities for Sums of Distances, this Monthly 80 (1973) 1009–1017.
Also solved by M. Aassila (France), U. Abel (Germany), K. F. Andersen (Canada), R. Bagby, E. Bojaxhiu
(Albania) & E. Hysnelaj (Australia), R. Boukharfane (France), F. Brulois, P. Budney, S. Byrd & R. Nichols,
N. Caro (Brazil), R. Chapman (U. K.), W. J. Cowieson, P. P. Dályay (Hungary), P. J. Fitzsimmons, N. Grivaux
(France), E. A. Herman, Y. J. Ionin, E. G. Katsoulis, J. H. Lindsey II, O. P. Lossers (Netherlands), V. Muragan
& A. Vinoth (India), M. Omarjee (France), M. A. Prasad (India), R. Stong, R. Tauraso (Italy), J. Van Hamme
(Belgium), J. Zacharias, R. Zarnowki, New York Math Circle, and the proposers.

REVIEWS
Edited by Jason Rosenhouse
Department of Mathematics and Statistics, James Madison University, Harrisonburg,
VA 22807

The Philosophies of Mathematics


Alan Baker

What does it mean to assert a mathematical claim, for example that there is a prime
between 5 and 10? If the claim is true, then what makes it true? And how do we come
to know it in the first place? It is apparently basic questions such as these that drive
the field of philosophy of mathematics. That these questions arise for even the most
elementary mathematical propositions makes the philosophical project to elucidate
the nature of mathematics accessible to nonspecialists. It also makes it frustratingly
inconclusive.
Before delving into contemporary philosophy of mathematics, let us begin by cast-
ing a glance back one hundred years to the early part of the twentieth century. At this
time, philosophers of mathematics were focused on the following question:
(i) Are the central claims of our core mathematical theories true? If so, what makes
them true? [foundation]
Interestingly, the project of addressing the foundation question was taken up not just
by philosophers, but also by a number of prominent mathematicians. Many philoso-
phers view this period as the “golden age” of philosophy of mathematics. Although
the foundation of mathematics went through a series of crises during this time, the
issues being addressed were of interest to the wider mathematical community. The
“Big Four” philosophical views on the nature of mathematics that emerged during this
period were logicism, intuitionism, formalism, and platonism.
According to logicism, the truths of mathematics are ultimately truths of logic. Once
appropriate definitions of the basic terms are given, statements such as “2 + 2 = 4”
can be seen to be true solely by virtue of the meanings of the expressions involved,
as is sometimes the case with nonmathematical claims, such as “All bachelors are
unmarried.” Logicism began with the work of the German philosopher Gottlob Frege
in the late nineteenth century, was taken up by Bertrand Russell in the early twentieth
century, and culminated with the massive (and massively impenetrable) three-volume
work, Principia Mathematica, published between 1910 and 1913 by Russell and Whitehead.
According to intuitionism, which was championed by the Dutch mathematicians
Brouwer and Heyting, and which took its inspiration from the philosophy of Immanuel
Kant, mathematical entities such as numbers are created by the mental acts of mathe-
maticians. Intuitionism is a form of antirealism, since it denies that there is a preexist-
ing universe of mathematical entities waiting to be described. However, mathematical
claims can be true if they are proved in the right way. In particular, proofs of the
existence of some particular mathematical entity must proceed by giving an explicit
“recipe” for constructing the given entity.
http://dx.doi.org/10.4169/amer.math.monthly.124.2.188
According to formalism, mathematical claims are meaningless strings of symbols
that are manipulated according to explicitly stated formal rules. This is a more radical
form of antirealism, since it denies that mathematical claims are even true (or false
either)! The most famous defender of formalism was David Hilbert. Hilbert was not a
formalist about the whole of mathematics, only the part that deals with infinite totalities.
For Hilbert, a mathematical claim such as, “There is a prime number between 10
and 20” can be expressed as a finite string of claims about finite numbers (i.e., “Either
10 is prime or 11 is prime or . . . or 20 is prime.”) and thus is meaningful. By contrast, a
claim such as, “There is no largest prime number,” cannot be expressed as a finite string
of claims about finite numbers, and hence, on Hilbert’s view, is not capable of being
true or false. For Hilbert, infinitary statements are merely vehicles for moving between
(meaningful) finitary claims. The use of infinitary claims is permissible provided that
we can show that their use never results in inconsistency.
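Hilbert's example of a finitary claim can be made concrete: the statement about primes between 10 and 20 unpacks into eleven individually checkable claims joined by "or." The short sketch below is only an illustration of that unpacking, with an assumed helper `is_prime`; it is not drawn from Hilbert or from the books under review.

```python
def is_prime(k: int) -> bool:
    """Trial division; adequate for the small range used here."""
    if k < 2:
        return False
    return all(k % d for d in range(2, int(k ** 0.5) + 1))

# "Either 10 is prime or 11 is prime or ... or 20 is prime."
disjuncts = [is_prime(k) for k in range(10, 21)]
claim = any(disjuncts)

# The finite search also produces explicit witnesses for the existence claim.
witnesses = [k for k in range(10, 21) if is_prime(k)]
print(claim, witnesses)
```

No analogous finite loop settles "There is no largest prime number," which is the contrast the paragraph draws.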
Famously, all three of these philosophical views of mathematics ran into technical
difficulties. Frege’s version of logicism was dealt a fatal blow by Russell’s paradox.
On Frege’s view, every property determines a set of things that have that property.
Russell’s paradox asks about the property, which we shall denote by p, of not being
self-membered. Is the set S determined by p a member of itself? If so, then it does
have the determining property p, implying it is not a member of itself. But if it is not
a member of itself, then it has property p, implying that it is a member of itself! Rus-
sell’s own response to this paradox was to build a version of logicism that separates
objects and sets made up of those objects into different levels. This paved the way
to the development of modern set theory. While set theory seems to make an excel-
lent foundation for the rest of mathematics, it does not vindicate logicism because set
theory is not logic.
The main technical problem with intuitionism is that it requires an underlying logic
that rejects the Law of the Excluded Middle. This is the principle that, for any state-
ment p, either p or not- p is true. If p is a mathematical existence claim, then one way
to prove p, in classical mathematics, is to show that the assumption that not- p leads
to a contradiction. The intuitionist rejects this form of reductio ad absurdum proof,
since what the intuitionist requires in order to establish p is the construction of a par-
ticular example that fits the existence claim. For example, if p is the claim that there
exist irrational numbers which, when raised to irrational powers, are rational, then the
intuitionist will demand at least one specific example of such a number. This feature
of intuitionism is not contradictory, as was the case with Frege’s logicism, but it does
conflict with mainstream mathematical practice. David Hilbert famously complained
that “taking the Principle of the Excluded Middle from the mathematician . . . is the
same as . . . prohibiting the boxer the use of his fists.”
Hilbert’s own preferred philosophy of mathematics, formalism, ran into its own
roadblock in the formidable shape of Gödel’s celebrated incompleteness theorems. A
corollary of these theorems is that a consistent system strong enough for arithmetic
cannot be used to prove its own consistency. This means that the finitary part of mathematics
cannot be relied upon to guarantee the consistency of the infinitary (and, for
Hilbert, meaningless) parts.
There was also a fourth philosophical view floating around in the early twentieth
century, one with much older roots and with some well-known proponents, including
such luminaries as G. H. Hardy and Kurt Gödel. Here is a characteristic passage from
Hardy’s A Mathematician’s Apology:

I believe that mathematical reality lies outside us, that our function is to discover
or observe it, and that the theorems which we prove, and which we describe
grandiloquently as our “creations,” are simply our notes of our observations.
This view has been held, in one form or another, by many philosophers of high
reputation from Plato onwards, and I shall use the language which is natural to a
man who holds it.

While not a view about the foundations of mathematics per se, platonism is in an
important sense the most straightforwardly realist position of all: mathematicians are
exploring and describing an abstract landscape that exists independently of us. While
not subject to the technical problems that afflicted the preceding three views, platonism
runs into severe difficulties answering a second philosophical question:
(ii) How do we come to know the truth of the central claims of our core mathematical
theories? [knowledge]
Note that Hardy’s talk of “observations” in the above passage is at best metaphorical.
No mathematician has ever literally observed a mathematical object.
What does philosophy of mathematics look like today? Fast-forwarding a hundred
years from the foundational controversies of the early twentieth century, we can see
successors of each of the “Big Four” philosophical views on the nature of mathematics.
In addition to a shift from the foundation question to the knowledge question, a third
question has also come to increasing prominence in contemporary debates:
(iii) What explains the usefulness of mathematics in science, and its applicability more
generally to the world? [application]
In the remainder of this review, I shall briefly outline the four most prominent current
philosophies of mathematics, and suggest in each case a book that explores the given
position in more detail.
First up is neologicism. For decades after Frege’s logicism was torpedoed by Rus-
sell’s paradox, it was assumed that this dealt a fatal blow to logicism more generally.
It was not until the early 1980’s that philosophers noticed a relatively straightforward
way to salvage the core aspects of Frege’s approach while avoiding Russell’s para-
dox. In his original work, Frege proposes an axiom that he calls “Basic Law V.” One
implication of Basic Law V is that for every property there is a set of objects that fall
under that property. Frege uses Basic Law V to prove a key foundational result, that the
number of Fs is equal to the number of Gs if and only if the F-objects can be put into
one-to-one correspondence with the G-objects. This latter result has come to be known
as “Hume’s Principle.” Basic Law V is what gives rise to Russell’s paradox. (Consider
the property of not being self-membered. According to Basic Law V, there is a set S
of objects with this property. Is S a member of itself? Either answer leads to
contradiction.) However, Basic Law V plays very little role in Frege’s system other than
to prove Hume’s principle, and Hume’s principle itself does not fall prey to Russell’s
paradox. Neologicism proposes to jettison Basic Law V and instead use Hume’s prin-
ciple, plus logic, as the foundation of arithmetic. Hume’s principle is one example of a
family of principles known as abstraction principles. Another goal of neologicism is to
find a way of distinguishing “good” (i.e., consistent) abstraction principles from “bad”
(i.e., inconsistent) abstraction principles and to find foundational abstraction principles
for other areas of mathematics. (An example from geometry is the principle that the
direction of line M is equal to the direction of line N if and only if M is parallel to N .)
An excellent book-length summary of neologicism, including both the philosophical
motivations and the technical results, is Fixing Frege by John Burgess [1].
The second of our four contemporary philosophies of mathematics is structural-
ism. While not a direct successor of intuitionism (in the way that neologicism is a
direct successor of logicism), structuralism shares with the older intuitionist position
a down-playing of the status of mathematical objects as mind-independent entities.
For the structuralist, mathematics is about structure, not objects. Indeed, any collec-
tion of objects with the right structure can serve to instantiate a given mathematical
theory. Take the natural numbers, for example. According to structuralism, numbers
are simply places in the natural number structure. There is no independent object that
is the number 17. Structuralism helps to address all three of the issues that preoccupy
philosophers of mathematics. It helps with the knowledge question, since knowledge
of structures seems more tractable than knowledge of abstract objects. It helps with
the applicability question, since structures are by their nature realizable in multiple
ways by different physical phenomena. And it fits well with the way that mathemati-
cians talk about mathematics, and with research into structure-centered foundations
for mathematics such as category theory. Philosopher Stewart Shapiro has done a lot
to articulate and defend structuralism and his book Philosophy of Mathematics: Struc-
ture and Ontology [4] is an excellent overview of this position.
Formalism not only falls foul of Gödel’s results, it also flies in the face of math-
ematical practice. When mathematicians describe themselves as formalists, they tend
to use this label merely to emphasize their view of the importance of rigor and proof.
Few actually believe—as formalism dictates—that mathematical claims are meaning-
less strings of symbols. The next philosophical position I shall discuss is fictional-
ism. According to the fictionalist, core mathematical claims are meaningful, but they
are false. A claim such as “7 is a prime number” is akin to a claim about a fictional
character, such as “Sherlock Holmes is a pipe-smoking detective.” Each is an accept-
able claim to make, in the right context, yet each is, strictly speaking, false. Sherlock
Holmes does not exist, and nor does the number 7. On the fictionalist view, what math-
ematicians are doing is setting out fictional scenarios and then exploring their conse-
quences. Thus, for example, the story of arithmetic might begin: “Once upon a time
there was a number, 0, that was the successor of no number, and it had a successor, 1,
and . . . .” Fictionalism does well on the knowledge question: we make up our mathe-
matical fictions, so there is no problem explaining how we know about what happens
in them. More of a problem is the applicability question. The Sherlock Holmes stories
may be entertaining, but they are not particularly useful. What makes our mathemati-
cal fictions so invaluable for theorizing about the physical world? Mary Leng develops
and defends a broadly fictionalist position in her book Mathematics and Reality [3].
This brings us to the fourth and final philosophical position, known as indispens-
abilist platonism (IP). The strategy underlying IP is to use the applicability of math-
ematics to answer the knowledge question for platonism. We begin by noting that
science makes reference to a variety of theoretical entities such as electrons, genes,
and black holes. We believe in the existence of these entities because they are part of
our best scientific theories. But science also refers to a variety of mathematical entities
such as numbers, sets, and functions. Moreover, these mathematical entities are seem-
ingly indispensable to science: we do not know how to formulate our theories without
them. IP argues that this provides sufficient grounds for believing in the existence of
numbers, sets, and functions. In brief, we ought to believe in the literal truth of math-
ematics because we believe our best scientific theories and we need mathematics for
our best scientific theories. The pros and cons of indispensabilist platonism have been
much discussed over the past two decades. Mark Colyvan’s book, The Indispensability
of Mathematics [2] provides a nice overview of this position.
Philosophers of mathematics have traditionally focused their attention on a very nar-
row selection of core areas of mathematics such as arithmetic, geometry, and set the-
ory. This has changed over the past several decades, with philosophers now routinely
bringing in a more diverse array of examples from fields such as topology, group theory,
linear algebra, and knot theory. This broadening of perspective has gone hand
in hand with a greater sensitivity to the details of actual mathematical practice, as
opposed to some philosophical idealization of mathematics. It is ironic, therefore, that
these developments have occurred just as philosophy of mathematics has fallen off
most mathematicians’ radar. I end with one final book recommendation, this time for a
more general overview of philosophy of mathematics for the mathematically informed
nonspecialist: Stewart Shapiro’s book Thinking About Mathematics: The Philosophy of
Mathematics [5]. Promoting ongoing dialogue between mathematicians and philoso-
phers of mathematics is surely beneficial to both fields.

REFERENCES

1. J. Burgess, Fixing Frege. Princeton Univ. Press, Princeton, 2005.
2. M. Colyvan, The Indispensability of Mathematics. Oxford Univ. Press, New York, 2003.
3. M. Leng, Mathematics and Reality. Oxford Univ. Press, New York, 2013.
4. S. Shapiro, Philosophy of Mathematics: Structure and Ontology. Oxford Univ. Press, New York, 1997.
5. ———, Thinking About Mathematics: The Philosophy of Mathematics. Oxford Univ. Press, New York,
2000.

Swarthmore College, Swarthmore, PA 19081


abaker1@swarthmore.edu
